8.5. Walk-through: Git LFS as a special remote on GitHub

Some repository hosting services provide for-pay support for large files, and can thus be used as special remotes as well. GitHub and GitLab for example support Git Large File Storage (Git LFS) for managing data files using Git. A free GitHub subscription allows up to 1GB of free storage and up to 1GB of bandwidth monthly. As such, it might be sufficient for some use cases, and could be configured quite easily.

In order to store annexed dataset contents on GitHub, we need first to create a repository on GitHub:

$ datalad create-sibling-github test-github-lfs
.: github(-) [https://github.com/yarikoptic/test-github-lfs.git (git)]
'https://github.com/yarikoptic/test-github-lfs.git' configured as sibling 'github' for <Dataset path=/tmp/test-github-lfs>

and then initialize a special remote of type git-lfs, pointing to the same GitHub repository:

$ git annex initremote github-lfs type=git-lfs url=https://github.com/yarikoptic/test-github-lfs encryption=none embedcreds=no

If you would like to compress data in Git LFS, you need to take a detour via encryption during git annex initremote – this has compression as a convenient side effect. Here is an example initialization:

$ git annex initremote --force github-lfs type=git-lfs url=https://github.com/yarikoptic/test-github-lfs encryption=shared

With this single step it becomes possible to transfer contents to GitHub:

$ git annex copy --to=github-lfs file.dat
copy file.dat (to github-lfs...)
ok
(recording state in git...)

and the entire dataset to the same GitHub repository:

$ datalad push --to=github
[INFO   ] Publishing <Dataset path=/tmp/test-github-lfs> to github
publish(ok): . (dataset) [pushed to github: ['[new branch]', '[new branch]']]

Because the special remote URL coincides with the regular remote URL on GitHub, siblings enable (as shown in the section Walk-through: Dropbox as a special remote) will not even be necessary when datalad is installed from GitHub.

No drop from LFS

Unfortunately, it is impossible to drop contents from Git LFS: help.github.com/en/github/managing-large-files