Working with large files

If you've tried to use GitHub for storing large files, then you've probably realized there are hard limits for file and repository sizes. While there are options for externally hosting and embedding large files (like AWS S3, Google Drive, Dropbox, etc.), the preferred method is to use Git Large File Storage (LFS). Git LFS is easy to implement and doesn't require you to work outside your usual Git workflow to handle and upload large files to your GitHub repositories.

With Git LFS, you can use GitHub to store files up to 2 GB.
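
To give a sense of how little the workflow changes, a minimal session might look like the following sketch (the `*.tif` pattern, file name, and branch name are only examples; substitute your own):

```sh
# One-time setup: install the Git LFS hooks for your account
git lfs install

# Tell LFS which files to manage (example pattern; use your own file types)
git lfs track "*.tif"

# The tracking rules are written to .gitattributes, which must be committed
git add .gitattributes

# From here on, the usual Git workflow applies unchanged
git add orthophoto.tif
git commit -m "Add orthophoto"
git push origin main
```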

Why use Git LFS?

Every time you create a commit, Git stores a snapshot of your repository's files. Modified files are saved as new objects, while every previous version remains in the commit history. If you're modifying large binary files (like images, datasets, or videos), you can quickly approach storage limits, since every version of every file is preserved. When someone clones your repository, they in turn have to download every version of every file that was ever changed, which can take a long time and continues to slow down pushing and fetching.
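
You can observe this growth yourself: committing two versions of the same large binary roughly doubles the space used by Git's object store, since binary files rarely delta-compress well. A sketch (the file name is hypothetical):

```sh
# Commit the first version of a large binary file
git add dem.tif
git commit -m "Add DEM"
git count-objects -vH   # note the reported object store size

# Modify the file, then commit it again
git add dem.tif
git commit -m "Update DEM"
git count-objects -vH   # the size grows by roughly the full file size again
```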

Git LFS works around this by replacing large files with small pointer files in your repository and storing the actual file contents on a separate LFS server. When you modify a large binary file, Git commits an updated pointer, and only the current contents of the large file are uploaded, not its entire version history. This means your GitHub repository stays small, and only the latest version of each large file is downloaded when the repository is cloned.
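
For reference, the pointer that Git actually stores for an LFS-tracked file is just a few lines of text like the following (the hash and size here are illustrative); you can generate one for any file with `git lfs pointer --file=<path>`:

```
version https://git-lfs.github.com/spec/v1
oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393
size 12345
```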

If your book chapter includes files larger than 15 MB, you should install Git LFS to track these files.
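
One way to find the files that exceed that threshold before you commit them is with `find`, then register the matching file types with `git lfs track` (the extension shown is only an example of a common geospatial format):

```sh
# List working-tree files larger than 15 MB, skipping Git's own objects
find . -path ./.git -prune -o -type f -size +15M -print

# Track each matching file type; run with no arguments to list active patterns
git lfs track "*.gpkg"
git lfs track
```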

More information about installing and using Git LFS can be found here: https://github.com/ubc-library-rc/using-git-lfs.