1.2. DataLad extensions

The commands DataLad provides cover a broad range of domain-agnostic use cases. However, there are extension packages that can add (domain-specific) functionality and new commands. Table 1.1 lists a number of available extensions.

Such extensions are shipped as separate Python packages and are not included in DataLad itself. Users who need a particular extension can install the extension package either on top of an existing DataLad installation or on its own; in the latter case, the extension pulls in DataLad core automatically, so there is no need to install DataLad explicitly beforehand or alongside it. Installation uses standard Python package managers such as pip, and no additional setup is required beyond installing the package.
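The automatic installation of DataLad core works because an extension package declares DataLad as a dependency in its package metadata. The following is a minimal sketch for inspecting this from Python, using only the standard library's importlib.metadata. The helper names requirement_name and depends_on are hypothetical and not part of DataLad, and datalad-container is used purely as an example package that may or may not be installed in your environment.

```python
import re
from importlib.metadata import PackageNotFoundError, requires


def requirement_name(requirement):
    """Return the bare, lower-cased project name from a requirement string."""
    # Hypothetical helper: keeps only the leading name, dropping version
    # specifiers, extras, and environment markers.
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
    return match.group(0).lower() if match else ""


def depends_on(package, dependency="datalad"):
    """Report whether an installed package declares the given dependency.

    Returns None when the package is not installed in this environment.
    Conditional (marker-guarded) requirements are counted as well.
    """
    try:
        declared = requires(package) or []  # requires() may return None
    except PackageNotFoundError:
        return None
    return any(requirement_name(r) == dependency for r in declared)


if __name__ == "__main__":
    # "datalad-container" is only an example; substitute any extension name.
    print(depends_on("datalad-container"))
```

If the extension is not installed, the sketch returns None rather than raising an error, so it can be run safely before installing anything.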

The DataLad extensions listed here vary in maturity. Check out their documentation and the sections or chapters associated with an extension to find out more about them.

Table 1.1 Selection of available DataLad extensions. A more up-to-date list can be found on PyPI.




Equips DataLad’s run/rerun functionality with the ability to transparently execute commands in containerized computational environments. The section Computational reproducibility with software containers and the use case An automatically and computationally reproducible neuroimaging analysis from scratch demonstrate how this extension can be used.


One of the initial goals behind DataLad was to provide access to already existing data resources. With the crawl-init/crawl commands, this extension automates the creation of DataLad datasets from resources available online and efficiently keeps them up to date. The majority of datasets in the DataLad superdataset /// on datasets.datalad.org are created and updated using this extension.


Metadata extraction support for a range of standards common to neuroimaging data. The use case An automatically and computationally reproducible neuroimaging analysis from scratch demonstrates how this extension can be used.


A neuroimaging-specific extension that allows reproducible DICOM-to-BIDS conversion of (f)MRI data. The chapter … introduces this extension.


Equips DataLad with an alternative command suite and advanced tooling for metadata handling (extraction, aggregation, reporting).


Equips DataLad with a set of commands to track XNAT projects. An alternative, more basic method to retrieve data from an XNAT server is outlined in section Configure custom data access.


Equips DataLad with a set of commands to obtain and monitor imaging data releases of the UKBiobank. An introduction can be found in chapter


Enhances DataLad with the ability for remote execution via the job scheduler HTCondor.


Enables DataLad to push to and pull from all third-party providers without native Git support that are supported by rclone.


Enables DataLad to interface and work with the Open Science Framework. Use it to publish your dataset’s data to an OSF project, thus utilizing the OSF for dataset storage and sharing.

To install a DataLad extension, use

$ pip install <extension-name>

such as in

$ pip install datalad-container

Afterwards, the new DataLad functionality the extension provides is readily available.
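To confirm that an installation succeeded, one option is to list the DataLad-related packages present in the current Python environment. The sketch below uses the standard library's importlib.metadata and assumes that extension packages follow the common naming convention of starting with "datalad"; the function name installed_datalad_packages is hypothetical and not part of DataLad.

```python
from importlib.metadata import distributions


def installed_datalad_packages(prefix="datalad"):
    """Map installed distribution names starting with `prefix` to versions."""
    found = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue  # skip distributions with missing or broken metadata
        # Treat "datalad_foo" and "datalad-foo" as the same spelling.
        normalized = name.lower().replace("_", "-")
        if normalized.startswith(prefix):
            found[name] = dist.version
    return found


if __name__ == "__main__":
    # Prints one "name version" line per match; prints nothing if none exist.
    for name, version in sorted(installed_datalad_packages().items()):
        print(f"{name} {version}")
```

Run it with the same Python interpreter that pip installed into; otherwise a freshly installed extension will not appear in the output.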

Some extensions may also be available from the software distribution (e.g., NeuroDebian or conda) that you used to install DataLad itself. Visit github.com/datalad/datalad-extensions/ to review available versions and their status.