Numerous strides have been made at the intersection of computer vision, machine learning, and remote sensing. Although remotely sensed data play a critical role in a wide array of applications such as environmental monitoring, climate science, and urban modeling, they present unique challenges for scalable interpretation. In recent years, foundation models have emerged as a powerful framework that can be adapted to a variety of downstream vision tasks. In remote sensing, prior work has focused on task-specific models optimized for particular applications and downstream tasks (e.g., land-cover mapping, target recognition, and object detection from specific sensors). There is significant and growing interest in developing and deploying task-agnostic, generalized large vision and vision-language models that can be tailored to a variety of downstream remote sensing tasks.
This workshop will feature keynotes and presentations at the cutting edge of foundation models and large vision models for remote sensing. It will bring together researchers working on foundation and large vision models with those working on geospatial image analysis to address the nuances of applying such emergent models to remotely sensed imagery (e.g., a multitude of sensors with different sensing characteristics and specifications; diverse imaging modalities, ranging from passive optical multispectral/hyperspectral to active imaging such as SAR and LiDAR; limited ground-reference data). Our emphasis will range from large vision and foundation models that are showing promise in the computer vision community to foundation models pre-trained on large quantities of earth-observation imagery. The workshop will provide a venue for the community to present work that pushes the envelope on adapting these models for effective inference from multi-sensor, multi-temporal, multi-scale earth observation imagery.
We invite authors to submit high-quality papers at the intersection of emerging vision models and remote sensing. Submitted manuscripts will be peer-reviewed and refereed for originality, presentation, empirical results, and overall quality. In addition to papers focused on algorithmic novelty, we also encourage papers that demonstrate effective deployment of recent architectures in compelling geospatial imaging applications.
Topics of interest include (but are not limited to):
Foundation Models, Large Vision Language Models and Large Multi-Modal Models in Remote Sensing
Discriminative and Generative Models
Training of Large Vision Models (e.g. masked image modeling, new datasets, and benchmarks)
Deploying Large Vision Models for downstream tasks (e.g. segmentation, classification, regression, object detection, counting, change detection, etc.)
Adaptation strategies, prompt tuning and visual instruction tuning
Few-shot and continual learning
Open-set recognition and classification
Applications to multi-sensor and multi-temporal datasets
Paper Submission: All submissions will be handled electronically through Microsoft CMT.
Paper Format: Papers are limited to 8 pages (additional pages allowed for references) and will follow the CVPR conference format. Authors should follow the Author Guidelines and use the CVPR 2025 Author Kit available here. Accepted research papers that are presented at the conference will be included in the CVPR 2025 Workshop proceedings.
CMT Link for Paper Submission: https://cmt3.research.microsoft.com/MORSE2025/
Saurabh Prasad
University of Houston
Jocelyn Chanussot
INRIA
Begüm Demir
Technische Universität Berlin
Biplab Banerjee
Indian Institute of Technology, Bombay
Danfeng Hong
Chinese Academy of Sciences
Deadline for Paper Submissions: March 9, 2025* (extended from March 5)
Paper Decisions Announced: March 31, 2025
Camera Ready Paper Submission: April 7, 2025
Workshop Date: June 12, 2025
Prof. Salman Khan
Mohamed Bin Zayed University of Artificial Intelligence
Title of Talk: Towards scaling remote sensing analysis via Dialogue-centric agents
Salman Khan is an associate professor at Mohamed Bin Zayed University of AI (MBZUAI) and an honorary faculty member at the Australian National University (ANU). He works on multimodal learning algorithms for a range of applications, including earth observation and climate science.
Dr. Johannes Jakubik
IBM
Title of Talk: Large-scale generative multimodality for Earth observation
Dr. Jakubik is a Staff Research Scientist on the AI for Climate Impact team at IBM Research Europe. He leads research on pretraining and scaling multimodal AI foundation models for Earth observation, as well as on developing AI foundation models for weather and climate assessment in collaboration with NASA, ESA, and partners within the EU Horizon program. His work on large-scale deep learning for Earth observation received the NASA Marshall Space Flight Center Honor Award and several IBM accomplishment awards, and was featured in various national and international media.
8.05am - 8.15am: Welcome Remarks
8.15am - 9.15am: Keynote (Salman Khan)
9.15am - 10.15am: Keynote (Johannes Jakubik)
10.15am - 10.45am: Coffee Break
10.45am - 12.15pm: Poster Session (Assigned Poster Boards: #127 - #139)
Posters (Accepted Papers):
Christel Chappuis*; Gencer Sümbül; Syrielle Montariol; Sylvain Lobry; Devis Tuia, "PAN-RSVQA: Vision Foundation Models as Pseudo-ANnotators for Remote Sensing Visual Question Answering"
Darryl Hanan; John Cooper; Dylan White; Tim Doster; Henry Kvinge; Yijing Watkins, "Foundation Models for Remote Sensing: An Analysis of MLLMs for Object Localization"
Martina Pastorino; Michael Alibani; Nicola Acito; Gabriele Moser, "Deep Diffusion Models and Unsupervised Hyperspectral Unmixing for Realistic Abundance Map Synthesis"
Anan Yaghmour; Melba Crawford; Saurabh Prasad, "A Sensor Agnostic Domain Generalization Framework for Leveraging Geospatial Foundation Models: Enhancing Semantic Segmentation via Synergistic Pseudo-Labeling and Generative Learning"
Clément Barbier; Baptiste Abeloss; Stéphane Herbin, "Bridging the Modality Gap: Training-free Adaptation of Vision-Language Models for Remote Sensing via Visual Prototypes"
Chenyu Li; Zhaojie Pan; Danfeng Hong, "Dynamic State-Control Modeling for Generalized Remote Sensing Image Super-Resolution"
Paul Borne--Pons; Mikolaj Czerkawski; Rosalie Martin; Romain Rouffet, "MESA: Text-Driven Terrain Generation Using Latent Diffusion and Global Copernicus Data"
Miguel Espinosa; Valerio Marsocci; Yuru Jia; Elliot J. Crowley; Mikolaj Czerkawski, "COP-GT: Unified Generative Modelling of COPernicus Imagery Thumbnails"
Pierre Adorni; Minh-Tan Pham; Stéphane May; Sébastien Lefèvre, "Towards Efficient Benchmarking of Foundation Models in Remote Sensing: A Capabilities Encoding Approach"
Thomas Kerdreux; Alexandre Tuel; Alexis Mouche; Quentin Febvre; Bertrand Chapron, "Efficient Self-Supervised Learning for Earth Observation via Dynamic Dataset Curation" (virtual participation, link to recorded presentation: https://www.youtube.com/watch?v=HFbTsqUVA3g)
Invited Posters:
Syed Roshaan Ali Shah, Muhammad Akhtar Munir, Muhammad Sohail Danish, Syed Waqas Zamir, Fahad Shahbaz Khan, Salman Khan, "Evaluating the Diversity and Representativeness of Large-Scale and General-Purpose AI4EO Datasets" (Poster)