Abstract:
While AI has been disrupting conventional weather forecasting, we are only beginning to witness the impact of AI on long-term climate simulations. The fidelity and reliability of climate models have been limited by computing capabilities. These limitations lead to inaccurate representations of key processes such as convection, clouds, and mixing, or restrict the ensemble size of climate predictions. They are therefore a significant hurdle to improving climate simulations and their predictions.
Here, I will discuss a new generation of climate models with AI representations of unresolved ocean physics, learned from high-fidelity simulations, and their impact on reducing biases in climate simulations. The simulations are performed with operational ocean model components. I will further demonstrate the potential of AI to accelerate climate predictions and increase their reliability through the generation of fully AI-driven emulators, which can reproduce decades of climate model output in seconds with high accuracy.
Bio:
Laure Zanna is a physical oceanographer and climate physicist in the Department of Mathematics at the Courant Institute and the Center for Data Science, NYU. She holds the Joseph B. Keller and Herbert B. Keller Professorship in Applied Mathematics. Her research focuses on understanding, simulating and predicting the role of the ocean in climate on local and global scales. She combines theory, numerical simulations, statistics, and machine learning to tackle a wide range of problems in fluid dynamics and climate, including turbulence, multiscale modeling, ocean heat and carbon uptake, and sea level. Since 2020, she has led M²LInES, an international collaboration sponsored by Schmidt Sciences dedicated to improving climate models using scientific machine learning. In 2020, Prof Zanna received the Nicholas P. Fofonoff Award from the American Meteorological Society “for exceptional creativity in the development and application of new concepts in ocean and climate dynamics”, and was the 2022 WHOI Geophysical Fluid Dynamics principal lecturer.
Summary:
Focus: global ocean simulation using AI methods
AI weather/climate modeling is growing rapidly
AI models provide high skill and performance
The ocean is a critical system that evolves a lot more slowly than the atmosphere
Covers 70% of Earth's surface
About 4 km deep
Top 3 m contains as much heat as the whole atmosphere (heat battery)
Top 50 m contains as much carbon as the atmosphere
Large currents induce atmospheric currents
Scales span from minutes to decades
Traditional modeling approach
Newton’s laws of motion
GFDL CM2.6 ocean simulation
Very expensive, can resolve down to 10 km
Critical for prediction and evaluation of counter-factuals
Latest models
OM4 - Ocean+Sea ice: 12 simulated years per wall-clock day at 25 km resolution on 4671 cores
OM4 - Ocean+Atmosphere+Land+Sea ice: 16 simulated years per wall-clock day at 25 km resolution on 5535 cores
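To make the throughput figures above easier to compare, the following minimal sketch converts them into core-hours per simulated year. The function name is illustrative, and the calculation assumes every core is busy for the whole run.

```python
def core_hours_per_simulated_year(years_per_day: float, cores: int) -> float:
    """Core-hours consumed to advance the model by one simulated year."""
    wallclock_hours_per_year = 24.0 / years_per_day
    return cores * wallclock_hours_per_year


# Figures taken from the two OM4 configurations listed above
ocean_ice = core_hours_per_simulated_year(12, 4671)
coupled = core_hours_per_simulated_year(16, 5535)

print(f"ocean + sea ice: {ocean_ice:,.1f} core-hours per simulated year")
print(f"fully coupled:   {coupled:,.1f} core-hours per simulated year")
```

On these figures, one simulated century of the ocean and sea-ice configuration costs roughly 0.9 million core-hours, which is why large ensembles and scenario exploration are hard to afford.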
We want
Good/accurate simulations so you learn new insights
Fast enough to enable exploration
Generalize/are robust to new scenarios
Simulations are divided into grid and sub-grid models
Space broken up into a discrete grid
Coarser phenomena simulated directly on the grid
Finer-grained phenomena are represented by approximate models (parameterizations) that do not directly capture the physics but are fast
Empirical approximations
Have free parameters that must be tuned to reduce prediction error
Driven by external forcings (e.g. wind, radiation, CO2 concentrations)
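The grid/sub-grid split can be illustrated with a toy one-dimensional example. This is a sketch under simple assumptions, not the scheme used in any particular ocean model: the fine grid is averaged onto a coarser one with a box filter, and the part of a flux that the coarse grid cannot see is the mean of the product minus the product of the means. The function names and the sample values are hypothetical.

```python
def coarse_grain(field, factor):
    """Box-average a 1-D list of fine-grid values onto a grid `factor` times coarser."""
    if len(field) % factor:
        raise ValueError("field length must be a multiple of factor")
    return [sum(field[i:i + factor]) / factor
            for i in range(0, len(field), factor)]


def subgrid_flux(u, c, factor):
    """Unresolved flux per coarse cell: mean(u*c) - mean(u)*mean(c)."""
    uc_bar = coarse_grain([ui * ci for ui, ci in zip(u, c)], factor)
    u_bar = coarse_grain(u, factor)
    c_bar = coarse_grain(c, factor)
    return [uc - ub * cb for uc, ub, cb in zip(uc_bar, u_bar, c_bar)]


# Toy fine-grid velocity u and tracer c (made-up values):
# the first coarse cell contains small-scale fluctuations, the second is uniform.
u = [1.0, -1.0, 1.0, -1.0, 2.0, 2.0, 2.0, 2.0]
c = [1.0, -1.0, 1.0, -1.0, 3.0, 3.0, 3.0, 3.0]

flux = subgrid_flux(u, c, factor=4)
print(flux)
```

The first coarse cell has zero mean velocity and zero mean tracer, yet a non-zero sub-grid flux, because the fluctuations inside the cell are correlated. The uniform second cell has none. A sub-grid model has to estimate this missing term from the coarse fields alone.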
Simulations are imprecise and predictions differ from observations in many details
Example: Gulf Stream is too warm because its transport tends to be too slow
Idea: use ML to learn simulation components instead of relying on tuned empirical approximations
Train a neural network to model sub-grid dynamics
Loss: error in predicting the full time series of the climate system
Not enough physical observation data, especially in the ocean
As such, we use outputs of high-resolution numerical models as the training set
Key idea: neural networks work much better if the data is normalized to fit into a small range
Normalization is dynamic relative to the surrounding state of the climate
First run a base coarse model to establish normal range of values in a given context
Then use that to normalize the values for the neural model
Resulting model is more accurate than the base numerical simulation
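One way to picture the state-dependent normalization described above is sketched below. The notes do not specify the actual scheme, so everything here is an assumption: the scale is taken to be the root-mean-square magnitude of a small stencil of coarse-model values, inputs are divided by it, and the prediction is multiplied back into physical units. `toy_model` is a hypothetical stand-in for a trained network.

```python
import math


def local_scale(values, eps=1e-12):
    """Root-mean-square magnitude of a stencil of coarse-model values."""
    return math.sqrt(sum(v * v for v in values) / len(values)) + eps


def normalized_predict(stencil, model):
    """Rescale inputs to order one, apply the model, then restore physical units."""
    scale = local_scale(stencil)
    scaled_inputs = [v / scale for v in stencil]
    return model(scaled_inputs) * scale


def toy_model(x):
    """Hypothetical stand-in for a trained network: a simple curvature estimate."""
    return x[1] - 0.5 * (x[0] + x[2])


# The same flow pattern in a quiet region and in an energetic one (made-up values)
weak = normalized_predict([0.01, 0.03, 0.01], toy_model)
strong = normalized_predict([10.0, 30.0, 10.0], toy_model)
print(weak, strong)
```

In both regimes the model receives the same order-one inputs, so it does not have to learn separately across a thousandfold range of magnitudes. The local scale carries that information instead.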
Above uses AI for sub-grid models. Can we create neural surrogates of the entire simulation?
Samudra
Trained on ocean time series of GFDL OM4, using historical atmospheric forcings
25km, 60 years
Coarse grained to 100km, 5 day averages, 19 vertical levels
Input: current state and time derivative
Output: next 2 time steps
Architecture: Autoregressive emulator, ConvNext Unet, 135M parameters
Other architectural choices don’t make much of a difference
Resulting model captures the spatial and temporal structure of the OM4 simulation
Computational cost is orders of magnitude lower than numerical simulation
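The autoregressive setup described above (state and time derivative in, next two time steps out) can be sketched as a rollout loop. This is a minimal illustration, not Samudra's code: states are plain numbers rather than gridded fields, `toy_step` is a hypothetical stand-in for the network, and forming the new tendency as a finite difference of the two predicted states is an assumption.

```python
def rollout(step_fn, state, tendency, forcings):
    """Autoregressive rollout in which each call to step_fn yields two new states.

    step_fn(state, tendency, forcing) -> (next_state, state_after_next)
    """
    trajectory = []
    for forcing in forcings:
        s1, s2 = step_fn(state, tendency, forcing)
        trajectory.extend([s1, s2])
        # Feed the prediction back in: newest state plus an assumed
        # finite-difference tendency between the two predicted states.
        state, tendency = s2, s2 - s1
    return trajectory


def toy_step(state, tendency, forcing):
    """Hypothetical stand-in for the emulator: linear extrapolation plus forcing."""
    s1 = state + tendency + forcing
    s2 = s1 + tendency + forcing
    return s1, s2


traj = rollout(toy_step, state=0.0, tendency=1.0, forcings=[0.0, 0.0, 0.0])
print(traj)
```

Because every prediction becomes the next input, small errors can accumulate over a long rollout, which is why the stability of such emulators over decades is a central concern.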
Ongoing work: SamudrACE, which revisits the training procedure for coupled ocean and atmosphere models
Previously: the atmospheric model was driven by ocean forcings and the ocean model by atmospheric forcings
Now: ocean and atmosphere models are trained separately but explicitly coupled
Result is
Separate ocean and atmosphere models that are coupled together
More stable than training a unified ocean/atmosphere model because of the very different time scales of the two systems
Some ocean trends are still poorly represented by the SamudrACE model over long time periods
Tradeoff between stability of model dynamics and sensitivity to perturbation (capturing real variability)
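The idea of coupling two separately built emulators that run on very different time scales can be sketched as follows. This is an illustration under stated assumptions, not SamudrACE's implementation: the exchanged fields (a sea surface temperature and an air temperature), the number of atmosphere sub-steps per ocean step, the time-averaging of the atmospheric field, and both toy emulators are hypothetical.

```python
def run_coupled(ocean_step, atmos_step, sst, air_temp,
                n_ocean_steps, atmos_substeps):
    """Advance two separate emulators while exchanging boundary fields.

    The atmosphere takes `atmos_substeps` short steps per ocean step, seeing a
    fixed sea surface temperature; the ocean then steps once, forced by the
    mean air temperature over that window.
    """
    history = []
    for _ in range(n_ocean_steps):
        air_sum = 0.0
        for _ in range(atmos_substeps):
            air_temp = atmos_step(air_temp, sst)
            air_sum += air_temp
        sst = ocean_step(sst, air_sum / atmos_substeps)
        history.append((sst, air_temp))
    return history


def toy_atmos(air_temp, sst):
    """Hypothetical fast component: relaxes quickly toward the ocean surface."""
    return air_temp + 0.5 * (sst - air_temp)


def toy_ocean(sst, mean_air_temp):
    """Hypothetical slow component: relaxes slowly toward the mean air temperature."""
    return sst + 0.1 * (mean_air_temp - sst)


history = run_coupled(toy_ocean, toy_atmos, sst=20.0, air_temp=10.0,
                      n_ocean_steps=50, atmos_substeps=20)
final_sst, final_air = history[-1]
print(final_sst, final_air)
```

In this toy setting the fast component adjusts almost completely within a single ocean step, while the slow component barely moves. A coupled emulator has to remain stable across this gap in time scales.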
Access at: https://m2lines.github.io