Skip to content

Simple multithreaded script to download Google Photos to a PC hard drive in a windows environment

License

Notifications You must be signed in to change notification settings

healdgar/GooglePhotosDownloader

Repository files navigation

Here is a generated README.md file for the Google Photos Downloader code (by Claude.AI):

Google Photos Downloader

This is a Python script to download photos and videos from Google Photos between a given date range and save them to a local folder.

Features

  • Downloads media between a specified start and end date
  • Saves photos and videos to a local folder organized by year and month
  • Automatically handles authentication using OAuth
  • Multi-threaded for faster downloading
  • Maintains history in a JSON index to avoid re-downloading
  • Resumes interrupted downloads
  • Handles filepath conflicts during organization
  • Ratelimits requests to avoid Google API limits
  • Provides statistics on the download results
  • Compatible with saving to a Windows OneDrive folder, even if the files are set to "free up space" and are in the cloud.

Roadmap

  • Selection and implementaiton of a NoSQL database instead of JSON to improve performance for large video collections and enable some local search and reporting.

Usage

The script requires Python 3 and the google-api-python-client library.

pip install google-api-python-client

To run the script:

python google_photos_downloader.py --start_date YYYY-MM-DD --end_date YYYY-MM-DD --backup_path /path/to/local/folder

Required arguments:

  • --start_date: The start date for photos to download in YYYY-MM-DD format.
  • --end_date: The end date for photos to download in YYYY-MM-DD format.
  • --backup_path: The local folder path to save the downloaded photos.

Optional arguments:

  • --num_workers: Number of threads to use for parallel downloading. Default 5.
  • --refresh_index: Fetch a fresh index of photos from Google Photos. Useful if your local history is out of date.
  • --stats_only: Only show statistics on current status, don't download anything.
  • --download_missing: Re-download any files missing from the local folder.
  • --auth: Re-authenticate with Google Photos and refresh OAuth token.
  • --validate_only: Validate all files in the index exist locally.

The first time you run it, the script will open a browser to authenticate with Google Photos and obtain the required OAuth credentials. This is saved locally so you only have to do it once.

Downloaded photos are saved in your local backup_path folder organized by year and month subfolders.

The script maintains a history of previously downloaded photos in DownloadItems.json to avoid re-downloading. It compares this index against your local files on each run.

Notes

  • Date filters and filenames use the local timezone, but the metadata index from Google Photos API uses UTC dates. The script handles conversion between timezones.
  • A ratelimiter is used to avoid hitting Google API limits. You can adjust the rate and capacity if needed.
  • Stats are calculated based on status fields in the index like downloaded and file sizes on disk.
  • Re-run it periodically to download any new photos, or re-download missing files.
  • Use --refresh_index if your local index history becomes outdated.
  • Photos and videos with matching filenames are deduplicated.

License

This script is released under the MIT License. Feel free to modify and reuse it.

alternative Readme.md version (by ChatGPT 4 Code Interpreter)

Let's first review the code in the file google_photos_downloader.py that you've provided. Once I understand the structure and functionality of your application, I'll be able to generate a helpful README.md file.

The provided Python script is an application that downloads Google Photos between a given date range. This script also validates a local repository for extraneous, missing, or misnamed files based on the Google Photos index.

Here's an outline of the script:

  1. GooglePhotosDownloader class: This is the main class that encapsulates the functionality of the script.

    • __init__: Initializes the class with arguments such as the start and end date, backup path, and number of worker threads.
    • authenticate: Authenticates the user with Google using OAuth2.0.
    • get_all_media_items: Fetches all media items from Google Photos API and updates the class's media items dictionary.
    • scandisk_and_get_filepaths_and_filenames: Scans the backup directory for all files and returns their file paths and filenames.
    • validate_repository: Validates the local backup repository against the Google Photos index.
    • download_image: Downloads a single image item from Google Photos.
    • download_photos: Downloads all photos and videos in the all_media_items list.
    • report_stats: Reports the status of all items in the index.
    • save_index_to_file: Saves the index to a JSON file.
  2. if __name__ == "__main__": This is the main function that runs when the script is executed. It parses command-line arguments and calls the appropriate functions of the GooglePhotosDownloader class.

With this information, we can create a README file. The file will explain what the script does, how to run it, and its dependencies. Let's generate a sample README.md file.

README.md

Google Photos Downloader

This Python script is designed to download Google Photos between a specified date range. It also validates a local repository for extraneous, missing, or misnamed files based on the Google Photos index.

Dependencies

The script uses the following Python libraries:

  • os
  • json
  • argparse
  • datetime
  • logging
  • traceback
  • requests
  • ssl
  • googleapiclient
  • tzlocal
  • ratelimiter
  • concurrent.futures

Please make sure to install these dependencies before running the script. You can install them with pip:

pip install google-auth google-auth-httplib2 google-auth-oauthlib google-api-python-client tzlocal ratelimiter

How to Run

The script is executed from the command line and accepts several arguments:

  • --start_date: The start date in the format YYYY-MM-DD. This argument is required.
  • --end_date: The end date in the format YYYY-MM-DD. This argument is required.
  • --backup_path: The path to the folder where you want to save the backup. This argument is required.
  • --num_workers: The number of worker threads for downloading images. This argument is optional, with a default value of 5.
  • --refresh_index: If provided, the index will be fetched from the server.
  • --stats_only: If provided, only the status of items in the index will be reported.
  • --download_missing: If provided, all missing items will be downloaded.
  • --auth: If provided, OAuth re-authentication and token refresh will be performed.
  • --validate_only: If provided, the index will be validated by checking all filepaths.

Here's an example of how to run the script:

python google_photos_downloader.py --start_date 2023-01-01 --end_date 2023-01-31 --backup_path /path/to/your/backup --num_workers 3

Output

The script will download all Google Photos between the specified date range to the provided backup path. It will also create a log file (google_photos_downloader.log) in the backup path.

Please note that the script uses Google's OAuth 2.0 system for authentication. You'll need to have your OAuth credentials set up and available for the script to access your Google Photos.

Contributing

If you have suggestions for improving this script, please open an issue or submit a pull request.

About

Simple multithreaded script to download Google Photos to a PC hard drive in a windows environment

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages