DataLad is a tool for the joint management of code, data, and their relationship, built on top of the version control systems Git and git-annex. It adapts principles of open-source software development and distribution to address challenges of data management, data sharing, and digital provenance capture.

Updated 6 days ago
110 51

UltraMassExplorer (UME)


Natural organic matter is the most complex chemical mixture on our planet. UME is an open-access, browser-based tool that allows efficient, interactive, transparent, and reproducible exploration and evaluation of ultrahigh-resolution mass spectra of complex organic matter.

Updated 1 month ago
32 3



This AiiDA plugin provides high-throughput automation and FAIR data management for the Jülich KKR codes.

Updated 2 months ago
17 7



RiboDetector is designed to rapidly and accurately detect rRNA sequences from metagenomic, metatranscriptomic, and ncRNA sequencing data. It has been optimized for use with both CPUs and GPUs. It outperforms existing software by delivering 10-50x faster runtime and ~10x fewer false classifications.

Updated 2 months ago
8 4



Kadi4Mat is open-source software for managing research data. It supports close cooperation between experimenters, theorists, and simulators, especially in the field of materials science.

Updated 4 months ago
5 1



The guidance system HELIPORT aims to make the entire life cycle of a scientific project comply with the FAIR principles. In particular, our data management solution covers the stages from data generation to the publication of primary research data, workflows, and results.

Updated 5 months ago
4 9



AiiDA plugin for FAIR high-throughput spin-dynamics simulations with the Spirit code.

Updated 2 months ago
4 5

CICMoD - A Climate Index Collection based on Model Data


The software provides a consistent and comprehensive collection of climate indices typically used to describe Earth System dynamics and serves as a new benchmark data set. It allows users to develop new machine learning methods and to compare their results to existing methods in an objective way.

Updated 2 months ago
1 5



dCache is a system for storing and retrieving huge amounts of scientific data, distributed among a large number of heterogeneous server nodes, under a single virtual filesystem tree with a variety of standard access methods including NFSv4.1 (pNFS), FTP, WebDAV, and xroot.

Updated 2 months ago
1 4



anndata is a Python package for handling annotated data matrices in memory and on disk, positioned between pandas and xarray. anndata offers a broad range of computationally efficient features including, among others, sparse data support, lazy operations, and a PyTorch interface.

Updated 2 weeks ago



Alpaca (Automated Lightweight Provenance Capture) captures provenance during the execution of Python scripts that process data and stores the information using a data model based on the W3C PROV standard.

Updated 1 month ago



Base-repo is a generic, domain-agnostic research data repository suitable to store and manage all kinds of research data. Its content is organized as DataResources, which consist of descriptive DataCite-compliant Metadata and one or more data elements, either stored as files or linked by reference.

Updated 6 months ago