Research & Projects

AI-Compressors for Federated Geospatial Analytics

Embed2Scale aims to research the potential of representation learning for federated geospatial applications across high-performance computing infrastructure. Self-supervised learning (cf. foundation models and large language models) transforms data such as a 1000x1000 = 1 million pixel image into a feature vector of, e.g., 1000 floating point values.
In a collaboration of the German Aerospace Center, IBM Research, Oxford University, Juelich Supercomputing Center, the European Union Satellite Center, Zurich University, Muenster University, and Sinergise/Planet, we explore the potential for compression by self-supervised learning on geospatial data [DOI: 10.1109/MGRS.2022.3198244] across data centers.
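
As a back-of-the-envelope illustration of the compression example above, the following sketch compares the storage of a raw image with that of its feature vector. The storage sizes are assumptions not stated in the text (a single-band image with 1 byte per pixel and 4-byte float32 feature values), and the function name `compression_ratio` is purely illustrative.

```python
# Rough compression ratio of an image versus its feature vector.
# Assumptions (not from the project description): single-band image,
# 1 byte per pixel, 4 bytes (float32) per feature value.

def compression_ratio(height, width, bands, bytes_per_pixel,
                      n_features, bytes_per_feature):
    """Return raw image size divided by feature vector size (both in bytes)."""
    raw_bytes = height * width * bands * bytes_per_pixel
    embedding_bytes = n_features * bytes_per_feature
    return raw_bytes / embedding_bytes

# 1000x1000 pixel image mapped to a vector of 1000 floating point values
ratio = compression_ratio(1000, 1000, 1, 1, 1000, 4)
print(ratio)  # 250.0
```

Counted in values rather than bytes, the same example reduces 1 million pixel values to 1000 features, a factor of 1000. Multi-spectral imagery with more bands or higher bit depth would shift the byte ratio accordingly.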


schematic workflow of distributed feature vector sharing across (geospatial) data centers

Evolution of Land Cover Monitoring with Artificial Intelligence

The broad spectrum of European satellite missions under the umbrella of the Copernicus Programme and the success of deep learning in computer vision call for a test of multi-sensor data fusion with artificial neural networks. As part of the EvoLand consortium with Vito, GAF AG, CESBIO/CNES, CLS, the German Aerospace Center, Joanneum Research, and Sinergise/Planet, we explore the potential and limitations of land use mapping with deep learning for real-world industry use cases. 


semantic segmentation (right, forest in green) of rural scene (left)

AutoGeoLabel since 2021, research theme

Open-Source, Auto-Generated Labels for Machine Learning in Urban Spaces

How may we exploit high-quality remote sensing data (such as LiDAR) and incomplete crowdsourced annotations (such as OpenStreetMap) to automatically label geospatial, multi-spectral imagery? To answer that question, we study noise-robust training of deep neural networks for semantic segmentation of geospatial (urban) scenes, e.g.:

LiDAR point cloud (top left) in Williamsburg, New York City with corresponding map below (green vegetation, yellow roads, red buildings)

SSL4EO since 2021, research theme

Multi-Modal Data Fusion with Self-Supervised Deep Learning

While remote sensing streams petabytes of optical, radar, and LiDAR (laser) data for various Earth observation (EO) applications, human labeling and efficient multi-modal data fusion to train deep neural networks pose major challenges. Together with my students, I research self-supervised learning (SSL) methodologies for representation learning (feature vectors) of remote sensing information, e.g.:

illustration of diversity in the SSL4EO-S12 benchmark dataset spatio-temporally aligning Sentinel-1 and Sentinel-2 imagery

AI4Archaeology since 2019, research theme

Uncovering Ancient Treasures with Large-Scale Data Mining

Remote sensing, such as aerial imaging and airborne LiDAR surveys, provides a versatile tool to scan large areas for ancient artifacts. However, the low signal-to-noise ratio (artifact erosion) and the small number of available labels (scarcity of artifacts) pose major technical challenges. In close collaboration with IBM Research, I develop machine learning pipelines to guide archaeologists in their field work, e.g.:

pottery fragments of the Nasca culture, Peru

AI4GreenSpaces since 2021, research theme

Biomass Mapping at all Scales

Carbon sequestration through trees is a natural approach to removing carbon dioxide from the atmosphere, with additional benefits for local climate zones and biodiversity. I collaborate on projects that estimate biomass from remote sensing data at various scales, e.g.:

3D point cloud of LiDAR data covering a patch of 20m x 40m of forest (green) in uneven terrain (brown)

IBM PAIRS 2015-2021, corporate research

Petabyte-Scale Geospatial Analytics: Platform & Applications

Complex geospatial analytics requires scalable infrastructure and distributed software to index, process, and fuse spatio-temporal data. I contributed the following innovations:

Using open-source technologies such as Apache HBase, Hadoop, and Spark, I co-developed, tested, and applied a platform for Earth observation data science, cf.:

raster (top) to vector (bottom) data fusion by grid indexing (middle) for large-scale geo-data-format queries
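
The general idea of fusing raster and vector data through a shared grid index can be sketched in a few lines. This is a toy illustration and not the IBM PAIRS implementation: the function names, the square cells of 1 degree, and the sample coordinates and values are all hypothetical. Raster pixels and vector points are each mapped to the integer key of their grid cell, so the join reduces to a key lookup.

```python
import math
from collections import defaultdict

def cell_key(lon, lat, cell_size):
    """Map a coordinate to the integer (column, row) of its grid cell."""
    return (math.floor(lon / cell_size), math.floor(lat / cell_size))

def index_raster(pixels, cell_size):
    """Group raster values by grid cell; pixels holds (lon, lat, value) tuples."""
    index = defaultdict(list)
    for lon, lat, value in pixels:
        index[cell_key(lon, lat, cell_size)].append(value)
    return index

def join_points(points, raster_index, cell_size):
    """Attach the mean raster value of the enclosing cell to each named point."""
    joined = {}
    for name, lon, lat in points:
        values = raster_index.get(cell_key(lon, lat, cell_size))
        joined[name] = sum(values) / len(values) if values else None
    return joined

# Hypothetical sample data: three raster pixels and two vector points
pixels = [(10.2, 50.7, 12.0), (10.8, 50.1, 14.0), (11.3, 50.5, 20.0)]
points = [("station_a", 10.5, 50.5), ("station_b", 12.5, 50.5)]

raster_index = index_raster(pixels, cell_size=1.0)
result = join_points(points, raster_index, cell_size=1.0)
print(result)  # {'station_a': 13.0, 'station_b': None}
```

Because both data types share the same cell keys, such a lookup can in principle be distributed across a key-value store. How the actual platform lays out its keys is not described here.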