Tell me what you are and I'll tell you where to start

The DataLad Handbook has grown into an extensive document. Depending on your use case for DataLad, you may not want to read all of the content there is. This section tries to be your guide.


If you can identify with one of the user types listed below, check out the recommended sections.

1 Independent but intrigued

“I don’t have a particular use case for DataLad (yet), but I want to know what it is all about”.

Start by reading the section What you really need to know for a high-level overview of DataLad’s functionality, and continue with the section A brief overview of DataLad for a short introduction to the fundamental principles of the software. Afterwards, you may want to skim through the different use cases to see whether one catches your attention. Finally, the DataLad cheat sheet can give you a concrete overview of the commands.

2 Eager to try all of the things!

“I so want to try out all of DataLad!”

Awesome, and welcome! The complete Basics part of the book was written for you. If you read it from start to end, you will become a DataLad expert. Don’t forget to install DataLad first, though. And if the Basics are not enough, continue right into the Use cases afterwards.

3 Seeking help

“I ran into a problem and hoped the book could help”.

The section How to get help gives a general overview of what to do when you encounter a problem. If you’re dealing with file system operations, Miscellaneous file system operations could be a helpful resource, and for all things configuration, the chapter Tuning datasets to your needs is the place to go. If you are confused by symlinks or “permission denied” errors in your dataset, check out the section Data integrity for some basics on git-annex. The “Quick search” bar in the sidebar can also help you navigate to relevant sections, and the index at the end of the book shows where each command mentioned in the handbook is introduced.

If you’re seeking help with large datasets, you might want to take a look at the use case Scaling up: Managing 80TB and 15 million files from the HCP release.

If none of this helps, don’t hesitate to file an issue or post your question. We are happy to help, and appreciate bug reports!

4 The impatient

“I need to get going. FAST.”

Umm, sure. In principle, you can jump through the Basics and check out precisely the sections you need, even though not everything will become clear that way. It’s best to keep the DataLad cheat sheet nearby.

You want to know how to set up and share an analysis with DataLad? Reading the chapters DataLad datasets, Under the hood: git-annex, Make the most out of datasets, and Third party infrastructure should work for you.

You want to use DataLad as a backup or dataset storage solution? Go to the section Remote indexed archives for dataset storage and backup and the use case Building a scalable data storage for scientific computing.

5 The data publisher

“I have a large amount of data that I want to publish, and thought DataLad would be a potential solution.”

If you’re not yet familiar with DataLad’s concept of a dataset, quickly read through the chapter DataLad datasets. Reading Under the hood: git-annex is also a good idea, as it covers the basics of how large files in datasets are handled. Afterwards, jump to the chapter Third party infrastructure. Depending on the amount of data, it may make sense to read about an example of a large dataset (80TB/15 million files) in the use case Scaling up: Managing 80TB and 15 million files from the HCP release. You can also read about the possibility of a Remote Indexed Archive (RIA) store in the section Remote indexed archives for dataset storage and backup and the use case Building a scalable data storage for scientific computing.

6 The advanced user

“Don’t bore me with all the introductory stuff…”

You already have plenty of DataLad experience and want to learn about its advanced aspects? The handbook can show you a few of those! The section Configurations to go shows how to write or distribute run-procedures. The section DataLad’s result hooks introduces the hook feature of DataLad. Another section shows how to use DataLad’s rclone helper for special remotes. The section Remote indexed archives for dataset storage and backup introduces the concept of a Remote Indexed Archive (RIA) store. Still not enough? We’re happy to consider your feature request for new handbook content, and equally your pull request with your own addition or use case.

7 Teacher

“I came here to teach!”

Awesome! There are instructions in the section Teaching with the DataLad Handbook, and the companion repository contains slides, code casts, and tools for teaching.