Tree Crown Segmentation and Analysis in Remote Sensing Imagery with PyTorch
DeepTrees is an end-to-end library for tree crown semantic and instance segmentation, as well as analysis, in remote sensing imagery. It provides a modular and flexible framework based on PyTorch for training, active learning, and deploying deep learning models for tree crown semantic and instance segmentation. The library is designed to be easy to use and extendable, with a focus on reproducibility and scalability. It includes a variety of pre-trained models, datasets, and tree allometric metrics to help you understand tree crown dynamics.
Read more about this work and find tutorials on: https://deeptrees.de. The DeepTrees project is funded by the Helmholtz Centre for Environmental Research -- UFZ, in collaboration with Helmholtz AI.
To install the package, clone the repository and install the dependencies.
git clone https://codebase.helmholtz.cloud/taimur.khan/DeepTrees.git
cd DeepTrees
pip install -r requirements.txt
or from the GitLab package registry:
pip install deeptrees --index-url https://codebase.helmholtz.cloud/api/v4/projects/13888/packages/pypi/simple
or from PyPI:
pip install deeptrees
Note: DeepTrees uses Python libraries that depend on GDAL. Make sure to have GDAL>=3.9.2 installed on your system, e.g. via conda:
conda install -c conda-forge gdal==3.9.2
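To check that the GDAL Python bindings are available in your environment, a quick sanity check like the following can help (this assumes the conda `gdal` package installed the `osgeo` bindings):

```python
# sanity check: confirm the GDAL Python bindings are installed and recent enough
from osgeo import gdal

print(gdal.__version__)  # should print 3.9.2 or newer
```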
You can view the documentation page on: https://treecrowndelineation-ai-consultants-dkrz-35d16e8c9ecc31028ba160.pages.hzdr.de/
This library is documented using Sphinx. To build the documentation, run the following commands.
sphinx-apidoc -o docs/source deeptrees
cd docs
make html
This will create the documentation in the `docs/build` directory. Open the `index.html` file in your browser to view the documentation.
This software uses Hydra for configuration management. The configuration YAML files are stored in the `config` directory. The configuration schema can be found in `config/schema.yaml`. A list of Hydra configuration options can be found in `/docs/prediction_config.md`.
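As an illustration, a config can also be composed and inspected programmatically with Hydra's compose API. This is a minimal sketch, assuming the `config` directory sits next to your script and contains `train_halle.yaml`:

```python
# minimal sketch: compose and print a DeepTrees config with Hydra's compose API
from hydra import compose, initialize
from omegaconf import OmegaConf

with initialize(config_path="config", version_base=None):
    cfg = compose(config_name="train_halle")
    print(OmegaConf.to_yaml(cfg))  # inspect the fully resolved configuration
```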
DeepTrees provides a set of pretrained models for tree crown segmentation. Currently the following models are available:
| Author | Description | Model Weights |
|---|---|---|
| Freudenberg et al., 2022 | Tree crown delineation model based on a U-Net with ResNet18 backbone. Trained on 89 images sampled randomly within Germany. Set of 5 model weights from 5-fold cross-validation. | k=0, 1, 2, 3 (default), 4 |
Note: We are in the process of adding more pretrained models.
Note: Like all AI systems, these pretrained models can make mistakes. Validate predictions, especially in critical applications. Be aware that performance may degrade significantly on data that differs from the training set (e.g., different seasons, regions, or image qualities).
Download the pretrained models as follows:
from deeptrees.pretrained import freudenberg2022
freudenberg2022(
    filename="name_your_file",  # name of the file to save the model weights to
    k=0,                        # which of the five cross-validation folds to download (0-4)
    return_dict=True            # return the PyTorch model weights as a state dict
)
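As a rough follow-up sketch, and assuming that `return_dict=True` returns a PyTorch state dict as described above, the downloaded weights can be inspected before loading them into a model (the filename below is only an example):

```python
# sketch: download fold k=3 (the default) and inspect the returned state dict
from deeptrees.pretrained import freudenberg2022

state_dict = freudenberg2022(filename="freudenberg2022_k3.pt", k=3, return_dict=True)
print(len(state_dict), "weight tensors")
print(list(state_dict.keys())[:5])  # first few parameter names of the U-Net/ResNet18 model
```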
DeepTrees also provides a labelled DOP dataset with 20 cm DOP rasters and corresponding polygon labels as `.shp` files. The dataset is available for download:
from deeptrees.datasets.halleDOP20 import load_tiles, load_labels
load_tiles(zip_filename="path/to/tiles.zip")    # path where the tile archive will be saved
load_labels(zip_filename="path/to/labels.zip")  # path where the label archive will be saved
Note: We are in the process of adding more datasets and updating the current datasets.
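Once downloaded, the archives can be extracted and inspected with standard geospatial tooling. The following is a minimal sketch using `zipfile`, `rasterio`, and `geopandas` (illustrative choices, not DeepTrees requirements); the tile and label file names are examples taken from the directory layout shown in the training section below:

```python
# minimal sketch: extract the downloaded archives and open one tile/label pair
import zipfile

import geopandas as gpd
import rasterio

zipfile.ZipFile("path/to/tiles.zip").extractall("tiles")
zipfile.ZipFile("path/to/labels.zip").extractall("labels")

with rasterio.open("tiles/tile_0_0.tif") as src:      # 20 cm DOP raster tile
    print(src.count, src.width, src.height)           # bands, width, height
labels = gpd.read_file("labels/label_tile_0_0.shp")   # tree crown polygons
print(len(labels), "polygons")
```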
Predict tree crown polygons for a list of images. The configuration file at `config_path` controls the pretrained model, output paths, and postprocessing options.
from deeptrees import predict
predict(image_path=["list of image_paths"], config_path="config_path")
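For example (the tile paths are illustrative, and the config name is a hypothetical placeholder; see `/docs/prediction_config.md` for the available options):

```python
# illustrative call: predict tree crown polygons for two tiles
from deeptrees import predict

predict(
    image_path=["tiles/tile_0_0.tif", "tiles/tile_0_1.tif"],
    config_path="config/predict.yaml",  # hypothetical config name
)
```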
DeepTrees calculates pixel-wise entropy maps for each input image. The entropy maps can be used to select the most informative tiles for training. They are stored in the `entropy` directory.
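One simple, illustrative selection strategy is to rank tiles by their mean pixel entropy and label the highest-scoring ones first. The sketch below assumes the entropy maps are written as single-band GeoTIFFs into the `entropy` directory:

```python
# sketch of an active-learning selection step: rank tiles by mean entropy
from pathlib import Path

import numpy as np
import rasterio

scores = {}
for path in Path("entropy").glob("*.tif"):
    with rasterio.open(path) as src:
        scores[path.name] = float(np.nanmean(src.read(1)))

# tiles with the highest mean entropy are the most informative labeling candidates
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:10]:
    print(f"{score:.3f}  {name}")
```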
To train the model, you need to have the labeled tiles in the `tiles` and `labels` directories. The unlabeled tiles go into `pool_tiles`. Your polygon labels need to be in ESRI shapefile format.
Adapt your own config file based on the defaults in `train_halle.yaml` as needed. For an example of a derived config for fine-tuning, see `finetune_halle.yaml`.
Run the script like this:
python scripts/train.py # this is the default config that trains from scratch
python scripts/train.py --config-name=finetune_halle # finetune with pretrained model
python scripts/train.py --config-name=yourconfig # with your own config
To re-generate the ground truth for training, make sure to pass the label directory in `data.ground_truth_labels`. To turn it off, pass `data.ground_truth_labels=null`.
You can overwrite individual parameters on the command line, e.g.
python scripts/train.py trainer.fast_dev_run=True
To resume training from a checkpoint, take care to pass the Hydra arguments in quotes to avoid the shell intercepting the string (the pretrained model name contains `=`):
python scripts/train.py 'model.pretrained_model="Unet-resnet18_epochs=209_lr=0.0001_width=224_bs=32_divby=255_custom_color_augs_k=0_jitted.pt"'
Before you embark on training, sync the `tiles` and `labels` folders with the labeled tiles. The unlabeled tiles go into `pool_tiles`.
|-- tiles
| |-- tile_0_0.tif
| |-- tile_0_1.tif
| |-- ...
|-- labels
| |-- label_tile_0_0.shp
| |-- label_tile_0_1.shp
| |-- ...
|-- pool_tiles
| |-- tile_4_7.tif
| |-- tile_4_8.tif
| |-- ...
Create the new empty directories:
|-- masks
|-- outlines
|-- dist_trafo
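These directories can also be created directly from Python, for example:

```python
# create the additional empty directories expected by the training pipeline
import os

for d in ("masks", "outlines", "dist_trafo"):
    os.makedirs(d, exist_ok=True)
```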
We use the following classes for training:
0 = tree
1 = cluster of trees
2 = unsure
3 = dead trees (not yet added)
However, you can adjust classes as needed in your own training workflow.
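For reference, the default mapping can be written down as a simple dictionary (the names are those listed above):

```python
# class indices used for training; dead trees are not yet included in the released labels
CLASS_NAMES = {
    0: "tree",
    1: "cluster of trees",
    2: "unsure",
    3: "dead trees",  # not yet added
}
```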
By default, MLflow logs are created during training.
Run the inference script with the corresponding config file. Adjust as needed.
python scripts/test.py --config-name=inference_halle
This repository has automatic semantic versioning enabled. To create new releases, we need to merge into the default `main` branch.
Semantic Versioning, or SemVer, is a versioning standard for software (see the SemVer website). Given a version number MAJOR.MINOR.PATCH, increment the MAJOR version when you make incompatible API changes, the MINOR version when you add functionality in a backwards-compatible manner, and the PATCH version when you make backwards-compatible bug fixes.
See the SemVer rules and all possible commit prefixes in the .releaserc.json file.
| Prefix | Explanation | Example |
|---|---|---|
| feat | A new feature was implemented as part of the commit, so the MINOR part of the version will be increased once this is merged to the main branch | feat: model training updated |
| fix | A bug was fixed, so the PATCH part of the version will be increased once this is merged to the main branch | fix: fix a bug that causes the user to not be properly informed when a job finishes |
The implementation is based on https://mobiuscode.dev/posts/Automatic-Semantic-Versioning-for-GitLab-Projects/
This repository is licensed under the MIT License. For more information, see the LICENSE.md file.
@article{khan2025torchtrees,
author = {Taimur Khan and Caroline Arnold and Harsh Grover},
title = {DeepTrees: Tree Crown Segmentation and Analysis in Remote Sensing Imagery with PyTorch},
journal = {arXiv},
year = {2025},
archivePrefix = {arXiv},
eprint = {XXXXX.YYYYY},
primaryClass = {cs.CV}
}