The DataLad Handbook

Virtual directory tree of a nested DataLad dataset

Welcome to the DataLad handbook!

This handbook is a living resource about why and – more importantly – how to use DataLad. It aims to provide novices and advanced users of all backgrounds with both the basics of DataLad and start-to-end use cases of specific applications. If you want to get hands-on experience and learn DataLad, the Basics part of this book will teach you. If you want to know what is possible, the use cases will show you. And if you want to help others get started with DataLad, the companion repository provides free and open-source teaching material tailored to the handbook.

Before you read on, please note that the handbook is based on DataLad version 0.12. If you do not currently have DataLad 0.12 or higher installed, the section Installation and configuration will set you up with what you need. If you are new here, please start at the beginning of the handbook.
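To check whether an existing installation meets the 0.12 requirement, something along the following lines may help. This is only a minimal sketch: it assumes that `datalad --version` prints a string containing a dotted version number (such as `datalad 0.12.0`), and the helper names `parse_version`, `meets_minimum`, and `installed_datalad_version` are hypothetical, made up for this illustration.

```python
import re
import shutil
import subprocess

# Minimum version the handbook is based on (major, minor).
MIN_VERSION = (0, 12)


def parse_version(text):
    """Extract (major, minor) from a string such as 'datalad 0.12.0'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(text, minimum=MIN_VERSION):
    """Return True if the version in `text` is at least `minimum`."""
    return parse_version(text) >= minimum


def installed_datalad_version():
    """Return the output of `datalad --version`, or None if not installed."""
    if shutil.which("datalad") is None:
        return None
    result = subprocess.run(
        ["datalad", "--version"], capture_output=True, text=True, check=False
    )
    # Assumption: the version is printed on stdout; fall back to stderr.
    return result.stdout.strip() or result.stderr.strip() or None


if __name__ == "__main__":
    reported = installed_datalad_version()
    if reported is None:
        print("DataLad not found; see the section Installation and configuration.")
    elif meets_minimum(reported):
        print(f"OK: {reported}")
    else:
        print(f"Too old: {reported}; see the section Installation and configuration.")
```

Running the script prints whether the installed DataLad is recent enough, or points to the installation section if it is missing or too old.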


The handbook is currently in beta stage. If you would be willing to provide feedback on its contents, please get in touch.

Basics 1 – DataLad datasets

Basics 2 – Datalad, Run!

How DataLad records provenance of dataset modifications

Basics 3 – Under the hood: git-annex

A closer look at how and why things work

Basics 4 – Collaboration

Basics 5 – Tuning datasets to your needs

Various types and methods for dataset configurations

Basics 6 – Make the most out of datasets

Basics 7 – One step further

Basics 8 – Help yourself

Basics 9 – Third party infrastructure

Leverage third party services to share datasets

Basics 10 – Further options

Small pieces of advice and helpful additional options