Kaapana is an open-source toolkit for state-of-the-art platform provisioning in the field of medical data analysis. The applications comprise AI-based workflows and federated learning scenarios with a focus on radiological and radiotherapeutic imaging.
Obtaining the large amounts of medical data needed to develop and train modern machine learning methods is extremely challenging and often fails in a multi-center setting, e.g. due to technical, organizational and legal hurdles. A federated approach, in which the data remains under the authority of the individual institutions and is only processed on-site, is a promising way to overcome these difficulties.
Following this federated concept, the goal of Kaapana is to provide a framework and a set of tools for sharing data processing algorithms, for standardized workflow design and execution, and for performing distributed method development. This facilitates compliant data analysis and enables researchers and clinicians to perform large-scale multi-center studies.
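The federated concept can be illustrated with a minimal sketch in plain Python. It is not Kaapana's API or its actual federated learning implementation: the function names (`local_update`, `federated_average`, `train`), the toy linear model and the learning rate are all assumptions chosen for illustration. The point it shows is that each site computes a model update on its own data, and only the model weights, never the data, are sent to the coordinator for aggregation.

```python
from typing import List, Tuple

Weights = List[float]
Sample = Tuple[float, float]  # (x, y) pair; stands in for on-site data


def local_update(weights: Weights, site_data: List[Sample], lr: float = 0.05) -> Weights:
    """Hypothetical on-site step: one gradient step of a least-squares
    fit y ~ w0 + w1 * x, computed only from this site's own data."""
    w0, w1 = weights
    n = len(site_data)
    grad0 = 2.0 / n * sum((w0 + w1 * x - y) for x, y in site_data)
    grad1 = 2.0 / n * sum((w0 + w1 * x - y) * x for x, y in site_data)
    return [w0 - lr * grad0, w1 - lr * grad1]


def federated_average(updates: List[Weights], sizes: List[int]) -> Weights:
    """Hypothetical coordinator step: average the site models,
    weighted by the number of samples each site holds."""
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[i] * s for u, s in zip(updates, sizes)) / total for i in range(dim)]


def train(sites: List[List[Sample]], rounds: int = 1000) -> Weights:
    """Run several federated rounds; raw samples never leave `sites`."""
    weights = [0.0, 0.0]
    for _ in range(rounds):
        # Each site receives the current model and returns only new weights.
        updates = [local_update(weights, data) for data in sites]
        weights = federated_average(updates, [len(data) for data in sites])
    return weights


if __name__ == "__main__":
    # Two toy "sites" whose data follow y = 1 + 2x.
    site_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
    site_b = [(3.0, 7.0), (4.0, 9.0)]
    print(train([site_a, site_b]))
```

Weighting the average by sample count keeps a small site from having the same influence as a large one; real deployments typically add secure transport, authentication and more elaborate aggregation on top of this basic loop.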
By adhering to established standards and by adopting widely used open technologies for private cloud development and containerized data processing, Kaapana integrates seamlessly with the existing clinical IT infrastructure, such as the Picture Archiving and Communication System (PACS), and ensures modularity and easy extensibility.
Core components of Kaapana:
- Workflow management: Large-scale image processing with state-of-the-art deep learning algorithms, such as nnU-Net image segmentation and TotalSegmentator
- Datasets: Exploration, visualization and curation of medical images
- Extensions: Simple integration of new, customized algorithms and applications into the framework
- Storage: An integrated PACS for images and MinIO for other types of data
- System monitoring: Extensive resource and system monitoring for administrators
- User management: Simple user management via Keycloak
Core technologies used in Kaapana:
- Kubernetes: Container orchestration system
- Airflow: Workflow management system enabling complex and flexible data processing workflows
- OpenSearch: Search engine for DICOM metadata-based searches
- dcm4chee: Open-source PACS serving as the central DICOM data storage
- Prometheus: Collecting metrics for system monitoring
- Grafana: Visualization for monitoring metrics
- Keycloak: User authentication
Currently, Kaapana is used in several projects in which a Kaapana-based platform is deployed at multiple clinical sites with the objective of distributed radiological image analysis and quantification. These projects include RACOON, initiated by NUM with all 37 German university clinics participating; the Joint Imaging Platform (JIP), initiated by the German Cancer Consortium (DKTK) with 11 university clinics participating; and DART, initiated by Cancer Core Europe with 7 cancer research centers participating.