Abstract: Power grids must be managed at greater speed, scale, and physical fidelity to enable decarbonization and improve resilience. Methods from machine learning (ML) have the potential to play an important role by providing fast, scalable approximations for foundational power system optimization problems, but often struggle to enforce hard physical constraints, maintain robustness in the face of unexpected inputs, or generalize across different grids. To address the issue of physical constraints, we present three different approaches — optimization-in-the-loop ML, learned approximate projection, and a safe exploration approach for reinforcement learning — that represent different tradeoffs in terms of runtime and provable guarantees. To foster robustness and generalization, we present two new ML benchmarks: (a) PF∆, focused on fast approximations to power grid simulation problems that can generalize across topologies and cope with near-infeasible instances, and (b) RL2Grid, focused on reinforcement learning for the combinatorial problem of topology optimization.
Bio: Priya Donti is an Assistant Professor and the Silverman (1968) Family Career Development Professor at MIT EECS and LIDS. Her research focuses on safe and robust machine learning for high-renewables power grids. Priya is also a co-founder and Chair of Climate Change AI, a global nonprofit initiative to catalyze impactful work at the intersection of climate change and machine learning. Priya received her Ph.D. in Computer Science and Public Policy from Carnegie Mellon University. She was recognized as part of the MIT Technology Review’s 2021 list of 35 Innovators Under 35, Vox’s 2023 Future Perfect 50, and the 2025 TIME100 AI list, and is a recipient of the Schmidt Sciences AI2050 Early Career Fellowship, the ACM SIGEnergy Doctoral Dissertation Award, the Siebel Scholarship, the U.S. Department of Energy Computational Science Graduate Fellowship, and best paper honorable mentions at ICML and ACM e-Energy.
Summary:
Power: societally-critical system
Critical service
Emits greenhouse gases, need to decarbonize
Complex to operate effectively
Example problem: AC optimal power flow
Goals: minimize power costs, meet demand, satisfy grid & operational constraints
Minimize cost subject to a system of constraints
Min/max
Power flow (conservation of flow over all edges and nodes in graph)
Quadratic relationships: non-convex and NP-hard
Power wholesale prices are based on this problem
1k-14k variables at the lower end of complexity but must be solved in milliseconds for online control
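A minimal sketch of where the quadratic relationships come from, using the standard AC power balance S_i = V_i * conj(sum_k Y_ik V_k) on a toy two-bus network. The line impedance, the voltages, and the function names are illustrative assumptions, not values from the talk.

```python
import cmath

# Toy 2-bus network; the line impedance is a made-up illustrative value.
z_line = complex(0.01, 0.1)      # hypothetical series impedance (per unit)
y = 1 / z_line                   # series admittance of the single line

# Bus admittance matrix for one line between bus 0 and bus 1.
Y = [[y, -y],
     [-y, y]]

def power_injections(V):
    """Complex power injected at each bus: S_i = V_i * conj(sum_k Y_ik V_k)."""
    S = []
    for i in range(len(V)):
        current = sum(Y[i][k] * V[k] for k in range(len(V)))
        S.append(V[i] * current.conjugate())
    return S

def mismatch(V, S_specified):
    """Power balance residual; a valid operating point drives this to zero."""
    return [calc - spec for calc, spec in zip(power_injections(V), S_specified)]

# Hypothetical bus voltages given as (magnitude, angle in radians).
V = [cmath.rect(1.0, 0.0), cmath.rect(0.98, -0.05)]
S = power_injections(V)
```

Because each injection multiplies a voltage by a linear combination of voltages, doubling every voltage quadruples every injection. That product of unknowns is what makes the constraint set non-convex.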
Example Approach: Amortized optimization via ML
Must be
Safe: respect hardware constraints
Robust to variety of inputs
Generalizable to many settings
Multi-agent: cooperate with other controllers
Human-in-the-loop
Focus on safety
Enforcing constraints via optimization-in-the-loop ML
Enforce via differentiable “last layer” procedure
Explicit vs implicit layers
Explicit:
Document the goal as a function of the model’s input/internal state to produce output
Differentiate with respect to this function
Implicit:
In many use-cases (e.g. power flow), we can’t separate inputs from outputs
e.g. we have the prior state and some of the next state and we iteratively refine what else must happen to make the state transition this way
Differentiate with respect to the output/next state solution that is produced iteratively
Implicit function theorem says: can do this without propagating gradients through iterative refinement process
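A minimal sketch of the implicit-layer idea on a scalar fixed point z = tanh(W*z + x). The forward pass iterates to convergence; the gradient dz/dx is then computed from the converged solution alone via the implicit function theorem, dz/dx = (df/dx) / (1 - df/dz), without differentiating through the iterations. The function f, the weight W, and the function names are assumptions chosen for illustration.

```python
import math

W = 0.5  # hypothetical weight; |W| < 1 makes the iteration a contraction

def solve_fixed_point(x, iters=200):
    """Forward pass: iterate z <- tanh(W*z + x) until it settles."""
    z = 0.0
    for _ in range(iters):
        z = math.tanh(W * z + x)
    return z

def implicit_grad(x):
    """dz*/dx from the implicit function theorem, using only the solution z*."""
    z = solve_fixed_point(x)
    slope = 1.0 - z * z      # tanh'(u) at the solution, since tanh(u) = z there
    df_dz = W * slope        # partial of f(z, x) = tanh(W*z + x) w.r.t. z
    df_dx = slope            # partial of f w.r.t. x
    return df_dx / (1.0 - df_dz)
```

A central finite difference on solve_fixed_point gives a quick check that the two gradients agree.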
Optimization-in-the-loop ML for AC Optimal Power Flow
Apply optimization, ensuring constraints are followed and costs are minimized
Train neural network to compute constraints
Completion/correction approach:
Train neural networks on a subset of the problem, take their output (not guaranteed to fully respect the wider problem) and add a full AC power flow solver given these inputs to correct it
Fails if the neural network produces an infeasible solution that cannot be refined/completed
Unified approach:
Neural network outputs a fully valid design that is not optimal (unconstrained optimization/feasibility seeking)
This is used as a warm start for a minimization algorithm to find an optimal solution
Failure mode: neural network’s design is inherently more costly than the global optimum, so the optimizer settles into a local optimum cost
Iterative procedure can sometimes diverge
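A toy sketch contrasting the two approaches above. A single equality constraint x0^2 + x1^2 = 1 stands in for the power flow equations. Completion keeps the predicted x0 and solves for x1, which fails when no such x1 exists; feasibility seeking instead runs unconstrained gradient descent on the squared violation, starting from the prediction. The constraint, the step size, and the "prediction" are all illustrative assumptions, not the method presented in the talk.

```python
import math

def violation(x):
    """Toy equality constraint g(x) = x0^2 + x1^2 - 1 (stand-in for power flow)."""
    return x[0] ** 2 + x[1] ** 2 - 1.0

def complete(x0):
    """Completion: keep the predicted x0 and solve the constraint for x1."""
    remainder = 1.0 - x0 ** 2
    if remainder < 0.0:
        return None          # no real x1 exists: the prediction cannot be completed
    return [x0, math.sqrt(remainder)]

def seek_feasibility(x, step=0.05, iters=200):
    """Gradient descent on g(x)^2, started from the predicted point."""
    x = list(x)
    for _ in range(iters):
        g = violation(x)
        # The derivative of g^2 with respect to x_i is 2 * g * (2 * x_i).
        x = [xi - step * 4.0 * g * xi for xi in x]
    return x

prediction = [1.5, 0.5]      # hypothetical network output, infeasible as given
completed = complete(prediction[0])
repaired = seek_feasibility(prediction)
```

Here completed is None because x0 = 1.5 leaves no valid x1, while repaired lands on the constraint. With a larger step size the same loop can overshoot, which is one way an iterative procedure can diverge.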
Approximating AC Power Flow: 118-bus transmission test case
FSNet approach produces optimal results (relative to IPOPT), 10x faster than IPOPT
(300x speedup in latest unpublished work for 1000-bus grids)
Baseline neural network approach does not match IPOPT's optimal results, but is 500x faster than IPOPT
Also better solutions on synthetic cases than traditional optimization solvers
Intuition: neural solver is tuned to a particular distribution of power grids and conditions
Is unusually effective in these regions of the problem space
Implication: neural optimizers will not generalize as well
Differentiable projection onto feasible actions
Building heating design
Embedding an optimization-based or physics-based enforcer is a general approach
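A minimal sketch of a differentiable projection, assuming the feasible actions form a single halfspace a . x <= b (for instance, a hypothetical cap on total heating power). The projection has a closed form, and its Jacobian is piecewise constant, so gradients can pass through it during training. The constraint, the numbers, and the function names are illustrative assumptions.

```python
def project_onto_halfspace(x, a, b):
    """Euclidean projection of x onto the set {y : a . y <= b}."""
    excess = sum(ai * xi for ai, xi in zip(a, x)) - b
    if excess <= 0.0:
        return list(x)                     # already feasible: identity map
    scale = excess / sum(ai * ai for ai in a)
    return [xi - scale * ai for ai, xi in zip(a, x)]

def projection_jacobian(x, a, b):
    """Jacobian of the projection: identity if feasible, else I - a a^T / |a|^2."""
    n = len(x)
    active = sum(ai * xi for ai, xi in zip(a, x)) > b
    norm_sq = sum(ai * ai for ai in a)
    return [[(1.0 if i == j else 0.0) - (a[i] * a[j] / norm_sq if active else 0.0)
             for j in range(n)] for i in range(n)]

a, b = [1.0, 1.0], 1.0                     # hypothetical limit: x0 + x1 <= 1
raw_action = [3.0, 4.0]                    # hypothetical unconstrained network output
safe_action = project_onto_halfspace(raw_action, a, b)
```

For general constraint sets there is no closed form, and the projection itself becomes an optimization problem solved inside the layer.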
Learning approximate projections: optimization that is mostly correct
Output of neural network is encoded into a latent vector via an auto-encoder
If the vector is not feasible, can push it towards correct region
Train a critic to differentiate between feasible and infeasible points
Observation: encoding turns complex constraint shapes into simple shapes in latent space
Safety-encouraging exploration in Reinforcement Learning
If we see an agent doing unsafe things, we want to up-sample those situations so they appear more often during training
If we see the agent do something bad, save the state into a “retraining area” of starting states
Enable higher-safety RL training
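A minimal sketch of the "retraining area" idea: states where unsafe behavior was observed are saved, and some fraction of later episodes start from one of them instead of the usual initial state. The class name, capacity, and replay probability are hypothetical choices, not details from the talk.

```python
import random
from collections import deque

class UnsafeStartBuffer:
    """Stores states where unsafe behavior was seen, for reuse as episode starts."""

    def __init__(self, capacity=1000, replay_prob=0.5, seed=0):
        self.states = deque(maxlen=capacity)   # oldest entries drop out when full
        self.replay_prob = replay_prob
        self.rng = random.Random(seed)

    def record(self, state):
        """Save a state in which the agent did something unsafe."""
        self.states.append(state)

    def sample_start(self, default_start):
        """Return a saved unsafe state with probability replay_prob, else a normal start."""
        if self.states and self.rng.random() < self.replay_prob:
            return self.states[self.rng.randrange(len(self.states))]
        return default_start()
```

In a training loop, record would be called whenever a safety check fails, and sample_start would replace the environment's usual reset.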
Observation: we don’t actually know if a given policy is safe in the real world
Use reachability-based formal verification to identify implications of policy
Benchmarks:
PF∆ (PF-Delta): ML Benchmark for power flow
How good are ML surrogates for power flow?
Are they robust to near-infeasible scenarios?
How do they generalize?
RL2Grid: Benchmark for Topology Optimization
Discrete actions for optimizations
65k configurations for a single substation on the 118-bus system
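A rough sketch of how a count of this size can arise, assuming each element connected at a substation can be assigned to one of two busbars, and ignoring symmetry and validity rules. The element count of 16 is an assumption picked to show the order of magnitude; it is not a figure from the talk.

```python
from itertools import product

def count_bus_assignments(n_elements, n_busbars=2):
    """Number of ways to assign each element to a busbar (no validity filtering)."""
    return n_busbars ** n_elements

def enumerate_bus_assignments(n_elements, n_busbars=2):
    """Explicit enumeration; only practical for small element counts."""
    return list(product(range(1, n_busbars + 1), repeat=n_elements))

n_configs = count_bus_assignments(16)      # hypothetical 16-element substation
```

The count doubles with every added element, which is why topology actions are handled as a combinatorial, discrete action space.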