Research


Mitigating the curse of small ensembles in weather forecasting

Technical name: Probit-space Ensemble Size Expansion (PESE; pronounced "peace")
Link to publication
Link to Python code with simple example
Link to slides from a recent talk on PESE-GC

To improve weather forecasts, ensemble data assimilation (EnsDA) requires estimates of forecast statistics (e.g., the forecast mean and variance). These estimates are obtained by running multiple computer weather forecast models (a.k.a. an ensemble of computer forecasts). However, because computer weather models require lots of computing power to run (more than 10 MacBook Pros per model), we are usually forced to estimate forecast statistics using fewer than 100 models (i.e., fewer than 100 samples). As such, our estimates of forecast statistics contain sampling errors, which limit the accuracy of EnsDA.
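To see why small ensembles are a problem, here is a toy illustration (my own sketch, not taken from the publication): estimating the forecast standard deviation from a hypothetical Gaussian temperature forecast with a 50-member ensemble versus a 5000-member ensemble. The numbers (280 K mean, 2 K spread) are made up for illustration.

```python
import numpy as np

# Illustrative sketch (not from the publication): estimate forecast
# statistics from small vs. large ensembles and compare how noisy
# the estimates are across many repetitions.
rng = np.random.default_rng(42)

true_mean, true_std = 280.0, 2.0   # hypothetical temperature forecast (K)

def ensemble_std(n_members):
    """Estimate the forecast std from n_members samples of the 'true' PDF."""
    ens = rng.normal(true_mean, true_std, size=n_members)
    return ens.std(ddof=1)

# With ~50 members (a typical operational size), the std estimate wanders;
# with 5000 members it settles close to the true value of 2.0 K.
small = [ensemble_std(50) for _ in range(200)]
large = [ensemble_std(5000) for _ in range(200)]
print("spread of 50-member estimates:  ", np.std(small))
print("spread of 5000-member estimates:", np.std(large))
```

The spread of the 50-member estimates is roughly ten times larger, which is the sampling error that limits EnsDA when only a few dozen forecasts are affordable.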

I am working to weaken this limitation by combining human knowledge of forecast statistics with a pre-existing ensemble of computer forecasts to generate additional forecasts (a.k.a. "virtual forecasts" or "virtual members"). This approach is statistical, requires far less computing than running additional computer models, and is highly flexible (i.e., it is not limited to multivariate Gaussian distributions). Preliminary tests with a toy weather model (the Lorenz 96 "wave-on-a-ring" model) indicate that this approach amplifies the corrective power of EnsDA.
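The flavor of the virtual-member idea can be sketched in a few lines. The sketch below is a grossly simplified, Gaussian-only stand-in: it fits a multivariate Gaussian to an existing toy ensemble and draws extra members from it. The actual PESE-GC method instead works in probit space so that non-Gaussian forecast distributions can be respected (see the linked paper and code for the real algorithm).

```python
import numpy as np

# Grossly simplified, Gaussian-only sketch of generating "virtual members".
# NOT the actual PESE-GC algorithm, which operates in probit space and
# supports non-Gaussian marginals -- see the linked publication/code.
rng = np.random.default_rng(0)

def expand_ensemble(ens, n_virtual):
    """ens: (n_members, n_vars) array of existing forecasts.
    Returns n_virtual extra members drawn from a Gaussian fitted to
    the existing ensemble's mean and covariance."""
    mean = ens.mean(axis=0)
    cov = np.cov(ens, rowvar=False)
    return rng.multivariate_normal(mean, cov, size=n_virtual)

# Toy 3-variable, 20-member ensemble with some cross-variable correlation,
# expanded to 200 virtual members at negligible cost.
ens = rng.normal(size=(20, 3)) @ np.array([[1.0, 0.5, 0.0],
                                           [0.0, 1.0, 0.3],
                                           [0.0, 0.0, 1.0]])
virtual = expand_ensemble(ens, 200)
print(virtual.shape)  # (200, 3)
```

Drawing 200 statistical samples takes a tiny fraction of the cost of running 200 extra forecast models, which is the point of the approach.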


Improving satellite data assimilation through efficiently handling clear-sky and cloudy-sky forecasts separately 

Technical name: Bi-Gaussian Ensemble Kalman Filter (recent paper here)

Current ensemble satellite data assimilation methods assume that a mixture of clear-sky and cloudy-sky ensemble forecasts can be treated as a single group. However, these two types of forecasts behave differently. For instance, clear-sky forecasts never predict heavy downpours, but cloudy-sky forecasts can. These differences suggest that handling cloud-free and cloudy ensemble forecasts in separate groups will improve data assimilation outcomes.
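A toy example makes the two-group intuition concrete. Everything below is my own hypothetical illustration (the variable names, threshold, and linear cloud-to-brightness-temperature relationship are invented for this sketch, not taken from the published method): splitting members by a cloud-water threshold reveals that the two groups have clearly different simulated satellite brightness temperatures.

```python
import numpy as np

# Hypothetical illustration (not the published Bi-Gaussian EnKF): split
# ensemble members into clear-sky and cloudy-sky groups using an assumed
# cloud-water threshold, then compare per-group statistics.
rng = np.random.default_rng(1)

# Invented toy relationship: more cloud water -> colder infrared
# brightness temperature, plus some noise.
cloud_water = rng.exponential(0.05, size=40)              # 40 members
brightness_temp = 290.0 - 400.0 * cloud_water + rng.normal(0.0, 1.0, 40)

CLOUD_THRESHOLD = 0.05   # assumed threshold separating clear from cloudy
cloudy = cloud_water > CLOUD_THRESHOLD
clear = ~cloudy

# The two groups have distinct statistics, so a single Gaussian fit to
# the combined ensemble misrepresents both.
print("clear-sky mean BT: ", brightness_temp[clear].mean())
print("cloudy-sky mean BT:", brightness_temp[cloudy].mean())
```

A single-group method would lump these two populations into one mean and variance, blurring exactly the clear/cloudy distinction that the observations carry.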

Previously proposed methods for handling groups of forecasts either require an unrealistic amount of computing resources or are difficult and labor-intensive to implement and maintain. I have recently developed a method that avoids the unrealistic computational cost and is much easier to implement and maintain.

I have recently published results from virtual-reality tests (i.e., tests with synthetic observations) using a realistic computer weather model (WRF). These tests indicate that my method outperforms a state-of-the-art data assimilation method in converting synthetic satellite infrared images into forecast corrections (see figure).

Tests with real-world satellite infrared images are ongoing.


Turning hi-res satellite observations into tropical thunderstorm datasets

Tropical Mesoscale Convective System Reanalysis (TMeCSR; "tea-mixer")
Link to TMeCSR v1 data

In tropical weather systems, small-scale features often affect large-scale weather. Weather data capture snapshots of the weather, much like a digital camera. However, if the resolution of the "digital camera" is not high enough, the small-scale features cannot be seen! In other words, if we want to study the connection between small- and large-scale weather, we need a high-resolution "digital camera".

Current datasets usually lack the resolution needed to examine the connection between tropical convection and large-scale weather. I have recently created a publicly available 4D high-resolution tropical thunderstorm dataset (TMeCSR; 9-km grid spacing). I built it by combining high-resolution satellite infrared imagery with in-situ observations and high-resolution weather models. Furthermore, the TMeCSR captures tropical thunderstorm systems better than the global gold standard (see figure).

Right now, I am collaborating with Xingchao Chen and Chin-Hsuan Peng at The Pennsylvania State University to create a higher-resolution version of this dataset.