Abstract:
Recent advances in Earth observation and machine learning enable inference about the Earth system at planetary scale. However, real-world applications are constrained by sparse ground truth, heterogeneous sensing conditions, and domain shift across regions. Addressing these challenges requires learning representations that generalize across geography and time, as well as enabling statistically valid inference from machine learning-derived Earth observation products. I will present two case studies, one on learning invariant features for crop type mapping and one on using machine learning-derived Earth observation maps to enable statistically valid downstream inference. Together, these examples demonstrate how combining machine learning with principled use of Earth observation modalities can yield scalable, reliable insights about human and environmental systems.
Bio:
Sherrie Wang is an Assistant Professor at MIT in the Department of Mechanical Engineering and the Institute of Data, Systems, and Society. Her research spans Earth observation data, machine learning, and statistical inference, with the goal of enabling reliable understanding of land and atmospheric systems at scale. Her work spans developing and evaluating geospatial data products, designing machine learning algorithms that generalize under data scarcity and domain shift, and performing downstream inference with principled uncertainty quantification. A central theme of her work is understanding how different sensing modalities, such as satellite imagery and LiDAR, interact with learning algorithms to produce representations that transfer across geographic and temporal scales. Her research supports applications in agriculture, greenhouse gas monitoring, and localized weather inference, particularly in settings where ground-based measurements are limited.
Summary:
Availability and spatiotemporal resolution of satellite observations, both public and commercial, have grown significantly over the past few decades
Can see major global dynamics
Economic activity (from nighttime lights)
Natural disasters
Agriculture
Forests
ML for remote sensing has exploded over the past decade
Earth Intelligence Lab @ MIT
Algorithms: label-scarce settings, benchmarks, algorithms tailored to unique properties of remote sensing data
Data Products: global scale, uncertainty quantification, accurate novel data sources
Causal Inference and Forecasting: assess impacts, scenario-sensitive
Representation Learning for Satellite Imagery
Tile2Vec: https://arxiv.org/abs/1805.02855
Ensure embeddings of nearby tiles in satellite images are closer to each other than embeddings of more distant tiles
Resulting embeddings outperform dedicated end-to-end supervised models trained on explicit labels
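The "nearby tiles closer than distant tiles" objective can be sketched as a hinge triplet loss. This is a minimal NumPy illustration, not the paper's implementation: the function name `triplet_loss`, the Euclidean distance, the margin value, and the random stand-in embeddings are all assumptions made for the example.

```python
import numpy as np

def triplet_loss(anchor, neighbor, distant, margin=1.0):
    """Hinge triplet loss on batches of embeddings with shape (batch, dim).

    Penalizes triplets where the anchor is not at least `margin` closer
    to its spatial neighbor than to a distant tile.
    """
    d_near = np.linalg.norm(anchor - neighbor, axis=1)
    d_far = np.linalg.norm(anchor - distant, axis=1)
    return np.maximum(0.0, d_near - d_far + margin).mean()

# Stand-in embeddings (hypothetical): neighbors are small perturbations of
# the anchor, distant tiles are large perturbations.
rng = np.random.default_rng(0)
anchor = rng.normal(size=(4, 8))
neighbor = anchor + 0.01 * rng.normal(size=(4, 8))
distant = anchor + 10.0 * rng.normal(size=(4, 8))

good = triplet_loss(anchor, neighbor, distant)  # ordering respected
bad = triplet_loss(anchor, distant, neighbor)   # ordering violated
```

In a real setup the embeddings would come from a trainable encoder and the loss would be minimized by gradient descent; here the two calls only show that the loss is low when the spatial ordering is respected and high when it is violated.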
LLMs for geospatial tasks
Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data: https://arxiv.org/abs/2401.17600 (2024)
LLMs are good at captioning images and recognizing landmarks
Fail to count items in images or create bounding boxes around specific regions within images
Localized off-grid weather forecasting
Global models don’t forecast local conditions (e.g. wind patterns) well because they miss local effects such as surface friction from buildings
High-resolution satellite data resolves this kind of local detail
Idea: combine gridded forecasts with satellite images
Model: train transformer on images + HRRR weather forecast https://rapidrefresh.noaa.gov/hrrr
Local Off-Grid Weather Forecasting with Multi-Modal Earth Observation Data: https://arxiv.org/abs/2410.12938
Global map of sugarcane
Mapping sugarcane globally at 10 m resolution using Global Ecosystem Dynamics Investigation (GEDI) and Sentinel-2: https://essd.copernicus.org/articles/16/4931/2024/
Idea: some crops (sugarcane, corn) are much taller than others, so use spaceborne LiDAR (from the GEDI instrument) to map them
Focus here on sugarcane due to its relatively longer growing season (another available feature)
GEDI has sparse spatial coverage, so it is used to identify a sample of locations where sugarcane is grown
These locations then serve as training labels for a vision model that predicts sugarcane on a uniform spatial grid using optical satellite imagery
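The two-stage idea (sparse height-based labels, then dense prediction from imagery) can be sketched on toy data. Everything specific here is an assumption for illustration: the 2 m height cutoff, the two-band features, and the nearest-centroid classifier, which stands in for the vision model described in the talk.

```python
import numpy as np

HEIGHT_THRESHOLD_M = 2.0  # hypothetical cutoff separating tall crops from short ones

def labels_from_heights(heights_m, threshold_m=HEIGHT_THRESHOLD_M):
    """Turn sparse LiDAR height samples into binary tall-crop labels."""
    return (np.asarray(heights_m, dtype=float) > threshold_m).astype(int)

def fit_centroids(features, labels):
    """Mean feature vector per class (row 0: short, row 1: tall)."""
    return np.stack([features[labels == c].mean(axis=0) for c in (0, 1)])

def predict_grid(grid_features, centroids):
    """Label every pixel of an (H, W, bands) grid by its nearest class centroid."""
    dists = np.linalg.norm(grid_features[..., None, :] - centroids, axis=-1)
    return dists.argmin(axis=-1)

# Sparse footprints: LiDAR height plus imagery features at the same locations.
heights = [0.5, 3.2, 4.0, 0.8]
point_features = np.array([[0.1, 0.2], [0.8, 0.9], [0.9, 0.8], [0.2, 0.1]])
point_labels = labels_from_heights(heights)

# Dense prediction on a uniform 2 x 2 grid that has imagery but no LiDAR.
centroids = fit_centroids(point_features, point_labels)
grid = np.array([[[0.1, 0.1], [0.9, 0.9]],
                 [[0.8, 0.7], [0.2, 0.3]]])
grid_pred = predict_grid(grid, centroids)
```

The point of the sketch is the data flow: LiDAR is needed only where footprints exist, while the imagery-based model covers every grid cell.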
What does the Clean Water Act regulate?
Machine learning predicts which rivers, streams, and wetlands the Clean Water Act regulates: https://www.science.org/doi/10.1126/science.adi3794
The water bodies covered by the act change over time
ML model maps the water bodies covered by various versions of the Act
Global crop type mapping
To predict crop types we typically need labeled data
Challenge: we don’t have labels for most of the world
Can we create crop type models that apply across geographies, making it possible to use labels from some regions to make predictions for other regions?
Temporal multi-spectral features
Traditional: harmonic regression to fill gaps in imagery, 1D median features to combine spectral data over time
Idea: 2D median features that combine spectra over time
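For the traditional baseline, harmonic regression fits a constant plus sine and cosine terms to the cloud-free observations of a band and then evaluates the fitted curve on a regular time grid. The sketch below is a generic least-squares version; the annual period, the single harmonic, the function names, and the synthetic observation dates are assumptions, not details from the talk.

```python
import numpy as np

def harmonic_design(t_days, period=365.0, n_harmonics=1):
    """Design matrix with columns [1, cos(k*w*t), sin(k*w*t), ...]."""
    t = np.asarray(t_days, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k * t / period
        cols += [np.cos(w), np.sin(w)]
    return np.column_stack(cols)

def fit_harmonic(t_days, values, period=365.0, n_harmonics=1):
    """Least-squares harmonic coefficients from irregular observations."""
    X = harmonic_design(t_days, period, n_harmonics)
    coef, *_ = np.linalg.lstsq(X, np.asarray(values, dtype=float), rcond=None)
    return coef

def predict_harmonic(coef, t_days, period=365.0):
    """Evaluate the fitted curve at arbitrary days, including gaps."""
    n_harmonics = (len(coef) - 1) // 2
    return harmonic_design(t_days, period, n_harmonics) @ coef

# Irregular, gappy observation dates (day of year) for one band at one pixel.
t_obs = np.array([5, 20, 60, 95, 150, 200, 210, 290, 340], dtype=float)
w_obs = 2.0 * np.pi * t_obs / 365.0
y_obs = 0.4 + 0.3 * np.cos(w_obs) + 0.1 * np.sin(w_obs)  # synthetic, noise-free

coef = fit_harmonic(t_obs, y_obs)
filled = predict_harmonic(coef, np.arange(0, 365, 5))  # regular 5-day series
```

The fitted coefficients themselves are often used as compact temporal features, in addition to the gap-filled series.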
CropGlobe dataset: crops in US, Argentina, France, UK, China, Australia
300k samples
Uses Sentinel-2 data from 2023 for analysis, plus hyperspectral data from the NASA EMIT sensor (https://earth.jpl.nasa.gov/emit/) for a subset of points
Model: CropNet
Lightweight 9-layer CNN + 2 downsampling stages + spatial dropout for improved robustness
2M parameters
Incorporates time shifts, time-scale warping, and magnitude warping of input features to robustly capture temporal dynamics
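One common way to apply time shifts, time-scale warping, and magnitude warping is as random augmentations of each input time series. The sketch below follows that reading, which is an assumption; the function names, the parameter ranges, and the use of a constant gain as a simplified form of magnitude warping are illustrative and not taken from CropNet.

```python
import numpy as np

def augment_series(series, shift=0.0, time_scale=1.0, magnitude=1.0):
    """Warp a (T, bands) time series.

    shift:      delay in time steps (positive moves the signal later)
    time_scale: > 1 stretches the season about the series midpoint
    magnitude:  constant gain applied to all values (simplified warping)
    """
    series = np.asarray(series, dtype=float)
    n_steps = series.shape[0]
    t = np.arange(n_steps, dtype=float)
    center = (n_steps - 1) / 2.0
    # Source time sampled for each output step, clipped to the valid range.
    src = np.clip((t - center) / time_scale + center - shift, 0, n_steps - 1)
    warped = np.column_stack(
        [np.interp(src, t, series[:, b]) for b in range(series.shape[1])]
    )
    return magnitude * warped

def random_augment(series, rng, max_shift=3.0, max_log_scale=0.2, max_log_gain=0.1):
    """Draw random warp parameters (hypothetical ranges) and apply them."""
    shift = rng.uniform(-max_shift, max_shift)
    time_scale = np.exp(rng.uniform(-max_log_scale, max_log_scale))
    magnitude = np.exp(rng.uniform(-max_log_gain, max_log_gain))
    return augment_series(series, shift, time_scale, magnitude)

ramp = np.arange(10.0).reshape(-1, 1)        # one band, 10 time steps
delayed = augment_series(ramp, shift=2.0)     # signal arrives 2 steps later
augmented = random_augment(ramp, np.random.default_rng(0))
```

Training on such warped copies encourages a model to tolerate regional differences in planting dates, season length, and reflectance magnitude.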
Use of 2D median features improved accuracy for cross-regional predictions, and hyperspectral data provided some further lift
Insight: satellite-only features can be quite effective at cross-region prediction of crop type, suggesting these models have substantial untapped predictive potential
Observation: prediction for each crop type depends on the same features across the world, though the timing changes
Invariant Features for Global Crop Type Classification: https://arxiv.org/abs/2509.03497