2. DataLad’s extensions

The commands DataLad provides cover a broad range of domain-agnostic use cases. Extension packages can add further, often domain-specific, functionality and new commands.

Such extensions are shipped as separate Python packages and are not included in DataLad itself. Users who need a particular extension install its package, either on top of an existing DataLad installation or on its own. In the latter case, the extension pulls in DataLad core automatically, so there is no need to install DataLad explicitly beforehand or alongside it. Installation is done with standard Python package managers such as pip, and no additional setup is required beyond installing the package.
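The automatic installation of DataLad core works through the dependencies an extension package declares. As a minimal sketch, the following Python snippet inspects the metadata of an installed distribution and reports whether it lists datalad among its requirements. The helper names requirement_name and depends_on_datalad are hypothetical and exist only for this illustration; the snippet relies solely on the standard library and assumes the extension declares its dependency under the project name datalad.

```python
import re
from importlib.metadata import PackageNotFoundError, requires


def requirement_name(requirement):
    """Extract the normalized project name from a requirement string.

    For example, 'datalad>=0.18' yields 'datalad'.
    """
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
    if match is None:
        return ""
    # Normalize runs of '-', '_' and '.' to a single '-' and lowercase the name.
    return re.sub(r"[-_.]+", "-", match.group(0)).lower()


def depends_on_datalad(distribution):
    """Return True or False for an installed distribution, None if it is missing.

    Assumption: the dependency is declared under the project name 'datalad'.
    """
    try:
        # requires() returns None when a package declares no requirements.
        declared = requires(distribution) or []
    except PackageNotFoundError:
        return None
    return any(requirement_name(req) == "datalad" for req in declared)


if __name__ == "__main__":
    # 'datalad-container' is the extension used as an example in this section.
    print(depends_on_datalad("datalad-container"))
```

A result of None means the named package is not installed in the current Python environment, so nothing can be said about its dependencies.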

Among others (a full list can be found on PyPI), the following DataLad extensions are available:

Extension name

Description

DataLad Container

Equips DataLad’s run/rerun functionality with the ability to transparently execute commands in containerized computational environments. The section Computational reproducibility with software containers and the use case An automatically reproducible analysis of public neuroimaging data demonstrate how this extension can be used.

DataLad Crawler

One of the initial goals behind DataLad was to provide access to already existing data resources. With its crawl-init and crawl commands, this extension makes it possible to automate the creation of DataLad datasets from resources available online and to keep them up to date efficiently. The majority of datasets in the DataLad superdataset /// on datasets.datalad.org are created and updated using this extension.

Todo

contribute a section or a demo, e.g. based on existing one

DataLad Neuroimaging

Metadata extraction support for a range of standards common to neuroimaging data. The use case An automatically reproducible analysis of public neuroimaging data demonstrates how this extension can be used.

DataLad Metalad

Equips DataLad with an alternative command suite and advanced tooling for metadata handling (extraction, aggregation, reporting).

Todo

once section on metadata is done, link it here

To install a DataLad extension, use

$ pip install <extension-name>

for example

$ pip install datalad-container

Afterwards, the new DataLad functionality the extension provides is readily available.
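To check which extensions are present in the current Python environment, one option is to look at package entry points. The following is a minimal sketch, not an official DataLad interface: it assumes that extensions register themselves under the entry-point group datalad.extensions, and the function name installed_datalad_extensions is hypothetical. It uses only the standard library.

```python
from importlib.metadata import entry_points


def installed_datalad_extensions(group="datalad.extensions"):
    """Return {entry point name: target} for the given entry-point group.

    Assumption: DataLad extensions register under 'datalad.extensions'.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10 and later offer a selection interface.
        selected = eps.select(group=group)
    else:
        # Python 3.8 and 3.9 return a dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


if __name__ == "__main__":
    found = installed_datalad_extensions()
    if not found:
        print("No DataLad extensions found in this environment.")
    for name, target in sorted(found.items()):
        print(f"{name}: {target}")
```

Run before and after a pip install of an extension, the listing should gain an entry if the assumption about the entry-point group holds for that extension.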

Some extensions may also be available from the software distribution (e.g., NeuroDebian or conda) you used to install DataLad itself. Visit github.com/datalad/datalad-extensions/ to review available versions and their status.