DataLad is a tool for the joint management of code, data, and their relationship, built on top of the version control systems Git and git-annex. It adapts principles of open-source software development and distribution to address challenges of data management, data sharing, and digital provenance capture.



What DataLad can do for you

DataLad is a Python-based tool for the joint management of code, data, and their relationship, built on top of a versatile system for data logistics (git-annex) and the most popular distributed version control system (Git). It adapts principles of open-source software development and distribution to address the technical challenges of data management, data sharing, and digital provenance collection across the life cycle of digital objects.

DataLad aims to make data management as easy as managing code. It streamlines procedures to consume, publish, and update data of any size or type, and to link datasets as precisely versioned, lightweight dependencies. DataLad helps to make science more reproducible and FAIR. It can capture complete and actionable process provenance of data transformations to enable automatic re-computation. The DataLad project delivers a completely open, pioneering platform for flexible decentralized research data management (RDM). It features a Python and a command-line interface, a dedicated graphical user interface, and an extensible architecture. It does not depend on any centralized services, but facilitates interoperability with a plurality of existing tools and services. To maximize its utility and target audience, DataLad is available for all major operating systems and can be integrated into established workflows and environments with minimal friction.
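The workflow described above (create a dataset, save changes, record a data transformation, re-compute it) can be sketched from the command-line side. The following Python sketch only assembles `datalad` command lines and prints them; it executes nothing. The dataset name, script, and file paths are hypothetical, and `run_command` is an illustrative helper, not part of DataLad.

```python
import shlex


def run_command(cmd, message, inputs=(), outputs=()):
    """Assemble a `datalad run` call that records `cmd` with its inputs and outputs."""
    args = ["datalad", "run", "-m", message]
    for path in inputs:
        args += ["--input", path]
    for path in outputs:
        args += ["--output", path]
    return args + [cmd]


# Hypothetical workflow; every command after `create` is meant to be
# issued from inside the newly created dataset directory.
workflow = [
    ["datalad", "create", "my-analysis"],
    ["datalad", "save", "-m", "Add analysis script"],
    run_command(
        "python code/analyze.py",
        "Compute summary statistics",
        inputs=["data/raw.csv"],
        outputs=["results/summary.csv"],
    ),
    # Re-execute the most recently recorded command.
    ["datalad", "rerun"],
]

for args in workflow:
    # Print a shell-quoted command line instead of executing it.
    print(shlex.join(args))
```

Declaring inputs and outputs is what makes the recorded step actionable: DataLad can retrieve the inputs before execution and knows which files the command is expected to modify.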

Participating organisations

Forschungszentrum Jülich
Dartmouth College
The University of Texas at Austin
University of California, Berkeley
Stanford University
Potsdam Institute for Climate Impact Research
Université Catholique de Louvain
University of Tübingen
Otto-von-Guericke University Magdeburg



Datalad. Tracks your data just like git tracks your code.
Ted Satterthwaite
Doing a PhD equals working your way through many tutorials that can be quite boring. The @datalad handbook, however, is really enjoyable! Keeps my motivation for reproducible research up.
Jasmin Stein
With the upcoming requirement of funding agencies for FAIR data management in Canada, we’ve started helping other neuroimaging centres in Montreal to transition to @datalad. Thank you, Datalad
The Courtois Project on Neuronal Modelling
Yesterday was the 1st time I used @datalad and I feel ashamed for not having looked into it earlier! Very beautiful, sophisticated and absolutely necessary piece of software!
Shreyas Fadnavis
The @datalad folks are doing God's work!
Maurizio Sicorello
One of my favorite tools! Thank you, @datalad :)
Matteo Visconti di Oleggio Castello
@datalad is such a terrific tool for managing the evolution of your research project (code, data and beyond) in a transparent, reproducible and shareable way!
Lennart Wittkuhn
Likely one of the most impactful data sharing tools in the past few years. Go @datalad !
Tristan Glatard
This is a fantastic tool for reproducible research which solves several issues and has IMO not enough attention so far.
Konrad Förstner
Datalad can really help you simplify research data management and provides access to many data resources. like many tools it does have a learning curve to get comfortable. so spend the time using it and spend the time reporting issues.
Satrajit Ghosh
Great tool to help you become the Marie Kondo of data and digital life!
Sofie Valk
The @datalad project doesn't receive nearly enough kudos, so here is an official endorsement tweet from yours truly \o/ #fromtartodatalad #justuseit
Datalad is awesome. The ability to easily maintain shared file trees with optional data-files across machines (though just the tip the iceberg of datalad's functionality) makes life SO much better :)
Eshin Jolly
Creating @datalad datasets and analyzing them with "datalad run"... I'm going mad with power!!!
Samuel Nastase
I want to talk about one of the best tools we use here at TIES: @datalad [...] It has allowed us to centralize data management in ways that previously have been difficult in academia.
Patrick Anker


Michael Hanke
Lead developer
Research Center Jülich
Yaroslav Halchenko
Benjamin Poldrack
Adina Svenja Wagner
Matteo Visconti di Oleggio Castello
Jason Gors
Alexander Q Waite
Kyle Meyer
John T. Wodder II
Michael Burgardt
Taylor Olson
Chris Lamb
Torsten Stoeter

Related tools

DataLad Container extension


This DataLad extension package equips DataLad's run/rerun functionality with the ability to transparently execute commands in containerized computational environments. On re-run, DataLad will automatically obtain any required container at the correct version prior to execution.
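As an illustration, the sketch below assembles the two commands typically involved: registering a container image in a dataset and running a command inside it. Nothing is executed; the container name, image URL, and paths are hypothetical placeholders, and `containers_run` is an illustrative helper, not part of the extension.

```python
import shlex


def containers_run(name, cmd, message, inputs=(), outputs=()):
    """Assemble a `datalad containers-run` call for a registered container."""
    args = ["datalad", "containers-run", "-n", name, "-m", message]
    for path in inputs:
        args += ["--input", path]
    for path in outputs:
        args += ["--output", path]
    return args + [cmd]


# Hypothetical container name and image location.
register = [
    "datalad", "containers-add",
    "--url", "https://example.com/images/analysis.sif",
    "analysis-env",
]
execute = containers_run(
    "analysis-env",
    "python code/analyze.py",
    "Run analysis in container",
    inputs=["data/raw.csv"],
    outputs=["results/summary.csv"],
)

for args in (register, execute):
    # Print a shell-quoted command line instead of executing it.
    print(shlex.join(args))
```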




JTrack is an ecosystem of software and mobile applications for digital biomarker collection and remote assessment. JTrack is designed to collect health-related information from participants' smartphones. It also provides a dashboard for clinicians and administrators to manage studies, users, and data.
