Code

All public repositories are available on GitHub.

LiDAR & Spaceborne Data Infrastructure

gediDB: Toolbox for ingesting, processing and serving GEDI L2A-B and L4A-C LiDAR data at scale. Uses TileDB for scalable storage and efficient queries over the global GEDI archive. Published in the Journal of Open Source Software (2025).
GitHub | paper

icesat2DB: Python package for streamlining the processing, analysis and management of ICESat-2 ATL08 canopy height data, with TileDB integration for scalable storage and querying.
GitHub | docs

alsDB: Python package for ingesting, storing and processing Airborne Laser Scanning (ALS) LiDAR point clouds at scale.
GitHub | docs

Earth Observation Data Infrastructure

EOForestSTAC: STAC-based unified metadata and cloud-native access layer for heterogeneous forest Earth observation (EO) datasets. Enables interoperable, analysis-ready access to multi-source forest EO data hosted on GFZ S3/Ceph infrastructure.
GitHub | catalog

Forest Age & Structure Modelling

forest-age-upscale: Python package for training machine learning models and performing global upscaling of forest age estimates from inventory plots, biomass and remote sensing observations. Underpins the GAMI dataset.
GitHub

structshift: Python toolbox for analysing structural shifts in forest canopy properties from multi-temporal LiDAR and EO data.
GitHub

Deep Learning for the Earth System

Emulating Ecological Memory with Recurrent Neural Networks: Recurrent neural network emulator demonstrating that memory effects in Earth system models can be learned from data. Published as a book chapter in Deep Learning for the Earth Sciences (Wiley, 2021).
GitHub | paper