Numerous strides have been made at the intersection of computer vision, machine learning, and remote sensing. Although remotely sensed data play a critical role in a wide array of applications, such as environmental monitoring, climate science, and urban modeling, they present unique challenges for scalable interpretation. In recent years, foundation models have emerged as a powerful framework that can be adapted to a variety of downstream vision tasks. In remote sensing, prior work has focused on task-specific models optimized for particular applications (e.g. land-cover mapping, target recognition, or object detection from specific sensors). There is now significant and growing interest in developing and deploying task-agnostic, generalized large vision and vision-language models that can be tailored to a variety of downstream remote sensing tasks.
This workshop will feature keynotes and presentations at the cutting edge of foundation models and large vision models for remote sensing. It will bring together researchers working on foundation and large vision models with those working on geospatial image analysis to address the nuances of applying such emergent models to remotely sensed imagery: a multitude of sensors with different sensing characteristics and specifications; diverse imaging modalities, ranging from passive optical multi- and hyperspectral to active imaging such as SAR and LiDAR; and limited ground-reference data. Our emphasis will range from large vision and foundation models that are showing promise in the computer vision community to foundation models pre-trained on large quantities of earth-observation imagery. The workshop will provide a venue for the community to present work that pushes the envelope on adapting these models for effective inference on multi-sensor, multi-temporal, multi-scale earth observation imagery.
We invite authors to submit high-quality papers at the intersection of emerging vision models and remote sensing. Submitted manuscripts will be peer-reviewed and evaluated for originality, presentation, empirical results, and overall quality. In addition to papers focused on algorithmic novelty, we encourage papers that demonstrate effective deployment of recent architectures in compelling geospatial imaging applications.
Topics of interest include (but are not limited to):
Foundation Models, Large Vision Language Models and Large Multi-Modal Models in Remote Sensing
Discriminative and Generative Models
Training of Large Vision Models (e.g. masked image modeling, new datasets, and benchmarks)
Deploying Large Vision Models for Downstream Tasks (e.g. segmentation, classification, regression, object detection, counting, and change detection)
Adaptation Strategies, Prompt Tuning, and Visual Instruction Tuning
Few-Shot and Continual Learning
Open-Set Recognition and Classification
Applications to Multi-Sensor and Multi-Temporal Datasets
Agentic AI for Spatial Reasoning
Paper Submission: All submissions will be handled electronically through Microsoft CMT.
Paper Format: We welcome the following types of contributions:
Regular Research Papers: Manuscripts are limited to 8 pages (with additional pages permitted for references) and must follow the CVPR conference format. Authors should consult the Author Guidelines and use the CVPR 2026 Author Kit (linked here). Accepted papers presented at the workshop will be included in the CVPR 2026 Workshop Proceedings.
Extended Abstracts: We also welcome extended abstracts describing emerging or ongoing work in the thematic areas listed above. Extended abstracts must be submitted via the same CMT link. Accepted abstracts will be presented as posters at the workshop but will not be included in the workshop proceedings. Although there is no strict formatting requirement for extended abstracts, we request that they be limited to two single-column pages.
Please be sure to select the appropriate track when submitting in the CMT system.
Submission is now closed. The workshop program is available here.
Manuscripts should be submitted via Microsoft CMT: https://cmt3.research.microsoft.com/MORSE2026/
The Microsoft CMT service was used for managing the peer-reviewing process for this conference. This service was provided for free by Microsoft and they bore all expenses, including costs for Azure cloud services as well as for software development and support.
Saurabh Prasad
University of Houston
Jocelyn Chanussot
INRIA
Begüm Demir
Technische Universität Berlin
Biplab Banerjee
Indian Institute of Technology, Bombay
Danfeng Hong
Southeast University
Deadline for Archival Regular Paper Submissions: March 8, 2026 (09:00 AM PDT; extended from March 5, 2026)
Decisions Announced (Archival Regular Paper Track): March 31, 2026
Camera Ready Paper Submission: April 9, 2026
Deadline for Extended Abstract Submissions: April 5, 2026
Decisions Announced (Extended Abstracts Track): April 10, 2026
Workshop Date: June 3/4, 2026
David Rolnick
David Rolnick is an Assistant Professor and Canada CIFAR AI Chair in the School of Computer Science at McGill University and at Mila – Quebec AI Institute. His group’s work focuses on innovations in machine learning driven by problems in climate change, encompassing areas such as biodiversity monitoring, land use classification, climate model emulation, and materials discovery. He also serves as Co-founder and Chair of Climate Change AI, Scientific Co-director of Sustainability in the Digital Age, and co-lead of the NSF-NSERC Global Center on AI and Biodiversity Change (ABC).
Sylvain Lobry
Sylvain Lobry is an assistant professor (Maître de conférences) in Computer Science. He conducts research with the SIP team at the LIPADE laboratory and teaches at the UFR de Mathématiques et Informatique at Université Paris Cité. Previously, he was a postdoctoral researcher in the Laboratory of Geo-information Science and Remote Sensing at Wageningen University & Research. He earned his PhD in image processing from Télécom Paris in 2017; his thesis work, carried out in collaboration with CNES, received the best PhD award from the Fondation Mines-Télécom.
Abhijit Mahalanobis
Abhijit Mahalanobis is an associate professor of electrical and computer engineering at the University of Arizona. His primary research areas are video and image processing for target detection and recognition, and computational imaging, in which he has over 190 journal and conference publications. Mahalanobis also holds six patents, co-authored a book on pattern recognition, contributed several book chapters, and edited special issues of several journals. He completed his BS degree with honors at the University of California, Santa Barbara in 1984, then joined Carnegie Mellon University, where he received his MS and PhD degrees in 1985 and 1987, respectively. Before joining the University of Arizona, Mahalanobis was an associate professor at the Center for Research in Computer Vision at the University of Central Florida and a senior fellow at Lockheed Martin in Orlando. He has also worked at Raytheon and was a faculty member at the University of Maryland.
Mikhail Klassen
Mikhail Klassen is a Senior AI Engineer at Planet, where he develops planetary-scale solutions using geospatial foundation models and generative AI. As a core member of Planet’s AI research team, his work focuses on the development of novel capabilities for monitoring global change using computer vision and LLMs. Prior to Planet, Mikhail was the co-founder and CTO of Paladin AI, a venture-backed aerospace startup acquired in 2023. He holds a PhD in Computational Astrophysics from McMaster University, with a research background spanning star formation, nuclear fusion, and gravitational wave detection. Mikhail is also the co-author of Mining the Social Web (O’Reilly Media) and an advisor to several AI startups.
Change Detection, Temporal Modeling, and Forecasting
[P]. M. I. J. Putra, A. Gustini, M. C. Haris, E. Irwansyah, R. A. Abdurrahman, S. Supriatna, and V. Alexander, “VP-GIS: Zero-Shot Spatiotemporal Land Cover Change Detection via Physical-Guided Visual Prompting.”
[P]. S. Saha, “Open-Vocabulary Change Detection via Synthetic Examples and Grounded Visual Models.”
[P]. F. Fallah, C.-Y. Hsu, W. Li, A. Liljedahl, and Y. Yang, “Asynchronous Remote Sensing Time-Series Fusion for Cloud Removal and Anytime Reconstruction.”
[A]. B. Rolih, M. Fučka, F. Wolf, and L. Čehovin Zajc, “Generative Remote Sensing Change Detection Using Latent Rectified Flow.”
[A]. B. Rolih, M. Fučka, F. Wolf, and L. Čehovin Zajc, “Gaussian Latent Perturbations for Unsupervised Remote Sensing Change Detection.”
[A]. R. Kazoom, Y. Gigi, G. Leifman, T. Shekel, and G. Beryozkin, “Ranking the Changes: Reinforced Best-of-N Ranking with Retrieval-Augmented Vision-Language Models for Semantic Change Captioning.”
[A]. R. Kazoom, G. Leifman, and G. Beryozkin, “FM-ChangeNet: Flow Matching Network for Change Detection.”
[A]. C. Rampersad, “EarthDaily FM: A Change Detection and Forecasting Foundation Model for Daily Global Multi-Modal Imagery.”
[A]. J. Guerrero-Viu, A. López-Cifuentes, and J. I. Bravo Pérez-Villar, “Temporal Sensitivity of Tessera Embeddings for Land-Cover Classification.”
Foundation Models, Vision-Language Models, and Agents
[A]. M. Czerkawski, “GLUE: Multi-lingual Querying of Earth from Space by Learning from the Ground Up.”
[P]. R. Faulkenberry and S. Prasad, “DINO Soars: DINOv3 for Open-Vocabulary Semantic Segmentation of Remote Sensing Imagery.”
[A]. M. Anderson, M. Klassen, A. Hoover, and K. Cahoy, “A Multi-Agent Feedback System for Detecting and Describing News Events in Satellite Imagery.”
[P]. C. Wu, C. Li, and D. Hong, “VegSAM: Vegetation-aware Adapter for Segment Anything Model in Urban Tree Segmentation.”
[A]. S.-C. Lin, J.-X. Jian, Y. Chu, W.-C. Sun, and F.-Y. Lin, “Grounding Vision-Language Models for Zero-Shot Anomaly Detection in Real-World Remote Sensing.”
[P]. M. Hasan, M. A. Hossain, S. V. Roy, S. Bhowmik, A. V. Patel, M. Singha, S. Chaudhuri, M. Haris, and B. Banerjee, “GeoMeld: Toward Semantically Grounded Foundation Models for Remote Sensing.”
[A]. M. Czerkawski, “BetaEarth: Emulating Closed-Source Earth Observation Models Through Their Public Embeddings.”
Spatial Reasoning, Grounding, and Urban/3D Understanding
[P]. T. Shinde, “Geometry-aware Coresets for Efficient Spatial Reasoning with Foundation Models in Low-Resource Remote Sensing.”
[P]. P. Shah, G. Sethi, and A. Gandhe, “Cluster-Guided Refinement and Ensemble Voting for Robust Visual Grounding in Remote Sensing.”
[A]. Q. Wu, K. Gao, D. A. Clausi, J. Li, and Y. Chen, “Preliminary Results for BuildingTwin: Geometrically Grounded Building-Centric 3D Reconstruction for Urban Digital Twins.”
Representation Learning, Generative Modeling, and Modality-Specific Methods
[P]. D. Doutsas and B. Figliuzzi, “Effective-dimension control of β in Dirichlet β-VAE for Blind Hyperspectral Unmixing.”
[A]. F. Wolf, B. Rolih, and L. Čehovin Zajc, “Brewing Stronger Features: Dual-Teacher Distillation for Multispectral Earth Observation.”
[A]. N. Munia, J. Zhu, and A.-A.-Z. Imran, “LiDAR-Conditioned Graph Diffusion for Karst Conduit Network Generation.”
Amartya Ray; Amey Sunil Kulkarni; Claudia Paris; Dipesh Tamboli; Emanuele Dalsasso; Gabriele Moser; János Horváth; Ksenia Bittner; Md Aminur Hossain; Moloud Abdar; Nandini Saini; Ramesh Nair; Ryan Faulkenberry; Sadia Hussain; Samrat Mukherjee; Shivam Pande; Sudipan Saha