DM Workshop Co-Chairs
Shuyi Chen (shuyic@alumni.cmu.edu), Carnegie Mellon University
Feng Qiu (fqiu@anl.gov), Argonne National Laboratory
Hieu Pham (hieu.pham@uah.edu), The University of Alabama in Huntsville
Shouyi Wang (shouyiw@uta.edu), University of Texas at Arlington
Alinson Santos Xavier (axavier@anl.gov), Argonne National Laboratory
Woody Zhu (shixianz@andrew.cmu.edu), Carnegie Mellon University
Problem
Extreme weather events—such as thunderstorms, high winds, and heat waves—are among the leading causes of large-scale power outages in the United States. In recent years, these events have grown in both frequency and severity, placing increasing strain on electric utilities and emergency responders. Accurate short-term outage forecasting is essential for improving grid resilience, enabling proactive crew deployment, optimizing resource allocation, and ultimately minimizing societal and economic impacts.
Despite its importance, short-term outage prediction remains a challenging problem. Outage patterns are highly irregular, often featuring long periods of no events interspersed with sudden spikes during severe weather. The challenge is compounded by the high-frequency nature of the data, which captures rapid changes in both outages and weather but also increases noise and variability. Predictive performance is further limited by the scarcity of publicly available outage datasets paired with such fine-grained, high-resolution weather data.
To address this challenge, we have curated a high-resolution, hourly dataset of county-level outage counts and 108 weather features for Michigan (April–July 2023). This period features strong weather–outage correlation, making it ideal for benchmarking predictive models.
The goal of this competition is to develop models that predict short-term outages during extreme weather events, using historical outage and weather features, under realistic deployment constraints where future weather data during the test period will not be available.
Timeline
The competition will run until September 26th, 2025. During the training phase, contestants should use the provided training data to develop their models. The testing phase will be conducted on a holdout dataset for which only the competition committee knows the true values. The testing period will be Sept. 19 – Sept. 26, and the holdout datasets (containing only the test-period timestamps and counties) will be released before September 11th; participants should use them as templates for their submissions. The leaderboard, posted on the Workshop’s website, will be updated twice, with the final ranking update on Saturday, September 27th. We will invite the top four competitors to the INFORMS Data Mining Workshop to present their solutions, based solely on the final ranking.
Registration will close Friday, September 5 (AOE).
Data and Evaluation
The dataset contains hourly county-level outage counts and weather variables for all 83 counties in Michigan (FIPS codes 26001–26165). The training period is from 2023-04-01 00:00 to 2023-06-30 00:00 (data source: poweroutage.us).
The dataset contains 108 weather variables, covering temperature, humidity, wind, precipitation, severe weather, and other atmospheric and land-surface conditions. These include the following groups (a short data-loading sketch follows the list):
Temperature & Humidity – near-surface temperature and moisture conditions:
t2m (2m temperature), d2m (2m dewpoint), sh2 (specific humidity), r2 (relative humidity), t, r, r_1
Pressure & Geopotential Heights – atmospheric pressure and heights at multiple levels:
pres, pres_1, pres_2, sp (surface pressure), pt (potential temperature), gh, gh_1 … gh_7 (geopotential heights at various levels)
Wind & Turbulence – horizontal and vertical wind components, gusts, and turbulence measures:
u, v, u10, v10, gust, max_10si, vucsh, vvcsh, ustm, vstm, wz, wz_1
Severe Weather & Instability – parameters linked to storms, convection, and hail:
cape, cape_1 (convective available potential energy), cin (convective inhibition), lftx, lftx4 (lifted indices), hail, hail_1, hail_2, ltng (lightning)
Clouds & Radiation – cloud coverage and surface radiation fluxes:
tcc, tcc_1 (total cloud cover), hcc (high cloud cover), mcc (medium cloud cover), lcc (low cloud cover), sdswrf (surface downward shortwave radiation), sdlwrf (surface downward longwave radiation), suswrf (surface upward shortwave), sulwrf (surface upward longwave)
Precipitation & Hydrology – rain, snow, and related surface hydrology:
prate (precipitation rate), tp (total precipitation), crain (convective rain), cfrzr (freezing rain), cicep (ice pellets), csnow (snow), cpofp (probability of frozen precipitation), bgrun (baseflow runoff), ssrun (surface runoff)
Other Environmental & Land Surface Variables – additional atmospheric, oceanic, and land surface parameters:
mslma (mean sea-level pressure anomaly), pwat (precipitable water), refc (composite reflectivity), refd, refd_1 (radar reflectivity at specific levels), aod (aerosol optical depth), veg (vegetation fraction), lai (leaf area index), vgtyp (vegetation type), orog (orography), vis (visibility), blh (boundary layer height), fsr (forecast surface roughness), gflux (ground heat flux), as well as various technical and diagnostic fields including veril, tcolw (total column water), tcoli (total column ice), plpl (pressure of the level from which a parcel was lifted), mstav, sdwe, sdwe_1, and layth.
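To help you get oriented in the data, here is a minimal loading sketch in Python, assuming the training data are shipped as a single NetCDF file; the file name "train.nc" and the dimension names are assumptions, so adjust them to match the released files.

    import xarray as xr

    # Open the (hypothetically named) training file.
    ds = xr.open_dataset("train.nc")

    # Print every variable with its dimensions to see what is available.
    for name, var in ds.data_vars.items():
        print(name, var.dims, var.shape)

    # Pull a few of the weather features described above into a flat table,
    # e.g. 2m temperature, wind gust, and CAPE.
    subset = ds[["t2m", "gust", "cape"]].to_dataframe().reset_index()
    print(subset.head())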
Participants will build models to forecast outage counts at two horizons:
24-hour horizon – day-ahead forecast starting June 30, 2023 at 01:00, ending on July 1, 2023 at 00:00.
48-hour horizon – two-day forecast starting June 30, 2023 at 01:00, ending on July 2, 2023 at 00:00.
The final ranking will be based on the average rank across the two horizons using Root Mean Squared Error (RMSE) as the metric.
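For concreteness, the metric can be computed as in the minimal sketch below, assuming predictions and ground truth are aligned arrays over all (hour, county) cells of a horizon; the provided evaluation code remains authoritative.

    import numpy as np

    def rmse(y_true, y_pred):
        """Root Mean Squared Error over all (hour, county) cells of a horizon."""
        y_true = np.asarray(y_true, dtype=float)
        y_pred = np.asarray(y_pred, dtype=float)
        return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

    # Teams are ranked by RMSE on each horizon separately; the final score
    # is the average of a team's 24-hour rank and 48-hour rank.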
To help participants get started, we have provided a sample demonstration using a simple seq2seq model for outage forecasting. This example shows how to load the provided NetCDF dataset, fit a model, and generate predictions. We have also included evaluation code for the 24-hour and 48-hour forecast horizons. Please note that test_24h_demo.nc and test_48h_demo.nc in the data files are sample datasets only, not the actual test sets.
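If you would rather start from something even simpler than the seq2seq demo, the following sketch fills a submission template with a naive persistence forecast, repeating each county's last observed outage count over the horizon. The file, variable, and column names here are assumptions for illustration; match them to the released demo files and templates.

    import pandas as pd
    import xarray as xr

    # Hypothetical file name; adjust to the released data.
    train = xr.open_dataset("train.nc")
    # Template assumed to have columns: timestamp, fips, outage_count.
    template = pd.read_csv("24hr_horizon_template.csv")

    # Persistence baseline: the last observed outage count for each county
    # becomes the prediction for every hour of the horizon.
    last_obs = train["outages"].isel(time=-1).to_pandas()  # "outages" assumed;
                                                           # indexed by county
    template["outage_count"] = template["fips"].map(last_obs)
    template.to_csv("YourTeamNumber_24hr_horizon.csv", index=False)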
Submission
Each team must submit its predictions on the test data through a Google form. Closer to the testing phase, we will provide the test set as well as a template for predictions.
We will release ranking results on September 20th and September 27th. The final ranking is based on the results released on Sept. 27th, using your team's best submission, and will determine the finalists.
One member of each team must be identified as the primary contact and provide their email address in the submission form.
Written Report
All teams invited as finalists are required to submit a report of at most six pages summarizing their methodology, due October 15th, 2025 (Anywhere on Earth). Winners will be chosen by a panel of judges based on the presentations and the reports.
Prize
The finalists will be chosen directly from the numerical results, using the average ranking across the two forecast horizons. Each finalist team will receive one complimentary registration code for the workshop but will still need to pay for the main INFORMS conference registration (if attending). Each of the four finalist teams will receive a monetary prize. The final selection of winners will be based on the quality of the presentation and the written methodology, as judged by our panel.
First Place Prize: 1000 USD
Second Place Prize: 500 USD
Third Place Prize: 250 USD
Fourth Place Prize: 125 USD
Please use the following link to register for the competition: HERE. After registration, you will be emailed a link to access the training data.
Registration is closed.
Submission Form: HERE (you will need a Google account to submit; if you do not have one, you may instead email your submission to hieu.pham@uah.edu). Please submit your predictions for each phase by Sept. 19 and Sept. 26 (AOE), respectively.
Submission Templates (Test Set): 24-hour horizon and 48-hour horizon. Please fill in and submit both templates with your forecasts.
Please use the provided submission templates for your predictions; both must be uploaded to be considered for each week's leaderboard. Additionally, please name your submission files "YourTeamNumber_XXhr_horizon.csv" (e.g., "2025_24hr_horizon.csv" or "2025_48hr_horizon.csv"). For your team number, we have concatenated the first four letters of your team leader's first and last names (e.g., Steven Smith = StevSmit). Please use this table to identify your team number; if you do not see your team number, please email us.
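As a quick sanity check of the naming convention, this snippet derives the team number and both file names (the team leader's name is a placeholder):

    # Derive the team number from the leader's name, e.g. Steven Smith -> StevSmit.
    first, last = "Steven", "Smith"  # replace with your team leader's name
    team_number = first[:4] + last[:4]
    file_names = [f"{team_number}_{h}hr_horizon.csv" for h in (24, 48)]
    print(file_names)  # ['StevSmit_24hr_horizon.csv', 'StevSmit_48hr_horizon.csv']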
The rankings will be updated on Sept. 20th and Sept. 27th; please submit your predictions by the day before each update (AOE). The final ranking is based on the results released on Sept. 27th and will determine the finalists. We will use the best result from your two submissions for the final leaderboard. After the final rankings, the top four teams will be contacted with further instructions.
Leaderboard Ranking
September 20
September 27