SpatialData: an open and interoperable data framework for spatial omics.
Introduction
The SpatialData framework provides an open and interoperable solution for storing, representing, processing, and visualizing spatial omics data.
It provides both methods that support and simplify data analysis for researchers and APIs for other developers, so that they can build their software libraries faster and more reliably on top of the SpatialData infrastructure.
- Built on established technologies for both on-disk (language-agnostic) and in-memory (Python) data representation.
- Extends the OME-NGFF specification, leading to a language-agnostic storage format. This promotes interoperability across different programming languages (Python, R, JavaScript) and with the standard geospatial tech stack in Python.
- A collaborative effort by numerous institutions, with dozens of contributors and hundreds of adopters.
- An integral part of the scverse ecosystem, a consortium of open-source methods for single-cell bioinformatics.
- Provides a foundational library for spatial omics data, prioritising long-term support and interoperability within the Python single-cell and spatial ecosystems.
Use cases
The following use cases are enabled by the SpatialData framework:
- Ingesting spatial omics datasets: The framework can ingest datasets from various commercial vendors such as 10x Genomics, Bruker, and Vizgen, and represent them in an interoperable file format.
- Representing massive datasets: The file format allows for efficient representation of large datasets, combining 2D/3D images, segmentation masks, geometries, point locations, and annotations.
- Spatial alignment: It enables alignment of multimodal, serial, or multi-field datasets. It also supports datasets with multiple samples.
- Data operations: The technology allows for operations like querying data within specific regions, transferring annotations and metrics across modalities, preparing tiled data for deep learning use cases, and more.
- Interactive visualization: It facilitates interactive visualization of large datasets using tools like Matplotlib, Datashader, and Napari, and allows for interactive annotations.
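The alignment and region-query use cases above can be illustrated with a minimal, library-independent sketch. This is not the spatialdata API: the helper names (`apply_affine`, `points_in_box`), the pixel size, and the sample coordinates are all hypothetical, chosen only to show the idea of mapping point locations into a shared coordinate system and then selecting the ones that fall inside a rectangular region.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# 2x3 affine matrix, row-major: ((a, b, tx), (c, d, ty))
Affine = Tuple[Tuple[float, float, float], Tuple[float, float, float]]


def apply_affine(points: List[Point], matrix: Affine) -> List[Point]:
    """Map (x, y) points into a shared coordinate system (hypothetical helper)."""
    (a, b, tx), (c, d, ty) = matrix
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


def points_in_box(points: List[Point], min_xy: Point, max_xy: Point) -> List[Point]:
    """Keep points inside an axis-aligned box, bounds inclusive (hypothetical helper)."""
    (xmin, ymin), (xmax, ymax) = min_xy, max_xy
    return [(x, y) for x, y in points if xmin <= x <= xmax and ymin <= y <= ymax]


# Hypothetical point locations, in pixel units of one image.
pixels = [(10.0, 20.0), (100.0, 200.0), (400.0, 50.0)]

# Assumed transformation: 0.5 units per pixel plus an offset of (5, 5).
to_global = ((0.5, 0.0, 5.0), (0.0, 0.5, 5.0))

# Align first, then query in the shared coordinate system.
aligned = apply_affine(pixels, to_global)
selected = points_in_box(aligned, (0.0, 0.0), (60.0, 110.0))

print(aligned)
print(selected)
```

The point of the sketch is the ordering: elements from different modalities are first brought into a common coordinate system, and queries are then expressed in that system rather than in the pixel space of any one image.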
Examples of publications and methods using the software:
Some popular Python libraries that build upon the SpatialData framework:
Some popular methods that are interoperable with the SpatialData file format:
- Vitessce: integrative visualization of multimodal and spatially resolved single-cell data (Nature Methods, GitHub)
- WebAtlas: pipeline for integrated single-cell and spatial transcriptomic data (Nature Methods, GitHub)
Governance and sponsors
The spatialdata project uses a consensus-based governance model and is fiscally sponsored by NumFOCUS. Consider making a tax-deductible donation to help the project pay for developer time, professional services, travel, workshops, and a variety of other needs.
The spatialdata project has also received support from the Chan Zuckerberg Initiative.
Getting started
Please refer to the documentation. In particular:
Also, see the links below to learn more about other packages in the SpatialData ecosystem.
Installation
Check out the docs for more complete installation instructions. To get started with the "batteries included" installation, you can install via pip:
pip install "spatialdata[extra]"
or via conda:
mamba install -c conda-forge spatialdata napari-spatialdata spatialdata-io spatialdata-plot
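After installing, one quick way to see which of the packages named above are present in the current environment is to query their installed distributions. The following is a small sketch using only the Python standard library; the function name `installed_versions` is hypothetical, and the package list simply mirrors the conda command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional, Sequence

# Distribution names taken from the conda install command above.
PACKAGES = ("spatialdata", "napari-spatialdata", "spatialdata-io", "spatialdata-plot")


def installed_versions(packages: Sequence[str] = PACKAGES) -> Dict[str, Optional[str]]:
    """Return {distribution name: version string, or None if not installed}."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions()
for name, ver in report.items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```

This only checks that the distributions are installed; it does not import them, so it will not surface import-time problems.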
Limitations
- The code is tested only manually on Windows. The framework is developed on Linux, macOS, and Windows machines, but automated tests currently run only on Linux and macOS.
Contact
To get involved in the discussion, or if you need help to get started, you are welcome to use the following options.
Finally, and especially relevant for developers who are building a library upon spatialdata, please follow this channel for:
- Announcements on new features and important changes (Zulip).
Citation
Marconato, L., Palla, G., Yamauchi, K.A. et al. SpatialData: an open and universal data framework for spatial omics. Nat Methods (2024). https://doi.org/10.1038/s41592-024-02212-x
Further reading
A universal framework for spatial biology. EMBL Communications (2024). https://www.embl.org/news/science-technology/a-universal-framework-for-spatial-biology/