Benedikt Blumenstiel - Researcher at IBM Research, AI for Climate Impact, Zurich
https://blumenstiel.github.io/
Title: TerraMind: A Foundation Model with Multimodal Understanding
Abstract:
Meet TerraMind, our first generative, multimodal foundation model for Earth observation (EO), built by IBM and ESA. TerraMind advances geospatial data understanding, introduces new capabilities such as Thinking-in-Modalities (TiM), and significantly outperforms existing models on the community-standard PANGAEA benchmark. TerraMind is pretrained on dual-scale representations that combine token-level and pixel-level data across modalities. At the token level, it encodes high-level contextual information to learn cross-modal relationships. At the pixel level, it leverages fine-grained representations to capture critical spatial nuances. Because it learns correlations between modalities, the model can natively generate any pretraining modality from the others and supports chained generation, which keeps outputs consistent across modalities. In the keynote, we will cover the technology behind this foundation model and explore EO applications enabled by TerraMind. For more information, visit https://ibm.github.io/terramind/.
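
The chained generation mentioned in the abstract can be pictured with a small toy sketch in Python. It is not TerraMind's API: `toy_generate`, `chained_generation`, and the modality names `optical`, `radar`, and `land_cover` are hypothetical placeholders chosen for illustration. The sketch only shows the control flow, assuming each generated modality is fed back in as conditioning for the next one.

```python
from typing import Callable, Dict, List

# Hypothetical signature: (available modalities, target name) -> generated output.
Generator = Callable[[Dict[str, str], str], str]


def toy_generate(inputs: Dict[str, str], target: str) -> str:
    """Stand-in for a model call; records which modalities conditioned the output."""
    sources = "+".join(sorted(inputs))
    return f"{target}<-({sources})"


def chained_generation(
    inputs: Dict[str, str],
    targets: List[str],
    generate: Generator = toy_generate,
) -> Dict[str, str]:
    """Generate targets one by one, conditioning each on everything produced so far."""
    available = dict(inputs)  # copy so the caller's dictionary is not modified
    for target in targets:
        available[target] = generate(available, target)
    return available


if __name__ == "__main__":
    # Placeholder input: a single "optical" modality, then two chained targets.
    result = chained_generation({"optical": "tile"}, ["radar", "land_cover"])
    for name, value in result.items():
        print(name, "=", value)
```

In this toy version the second target is conditioned on both the original input and the first generated modality, which is the property that would let a chain stay consistent across modalities.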