Safe, Robust, and Generalizable Machine Learning for Power Systems

Abstract: Power grids must be managed at greater speed, scale, and physical fidelity to enable decarbonization and improve resilience. Methods from machine learning (ML) have the potential to play an important role by providing fast, scalable approximations for foundational power system optimization problems, but often struggle to enforce hard physical constraints, maintain robustness in the face of unexpected inputs, or generalize across different grids. To address the issue of physical constraints, we present three different approaches — optimization-in-the-loop ML, learned approximate projection, and a safe exploration approach for reinforcement learning — that represent different tradeoffs in terms of runtime and provable guarantees. To foster robustness and generalization, we present two new ML benchmarks: (a) PF∆, focused on fast approximations to power grid simulation problems that can generalize across topologies and cope with near-infeasible instances, and (b) RL2Grid, focused on reinforcement learning for the combinatorial problem of topology optimization.

Bio: Priya Donti is an Assistant Professor and the Silverman (1968) Family Career Development Professor at MIT EECS and LIDS. Her research focuses on safe and robust machine learning for high-renewables power grids. Priya is also a co-founder and Chair of Climate Change AI, a global nonprofit initiative to catalyze impactful work at the intersection of climate change and machine learning. Priya received her Ph.D. in Computer Science and Public Policy from Carnegie Mellon University. She was recognized as part of the MIT Technology Review’s 2021 list of 35 Innovators Under 35, Vox’s 2023 Future Perfect 50, and the 2025 TIME100 AI list, and is a recipient of the Schmidt Sciences AI2050 Early Career Fellowship, the ACM SIGEnergy Doctoral Dissertation Award, the Siebel Scholarship, the U.S. Department of Energy Computational Science Graduate Fellowship, and best paper honorable mentions at ICML and ACM e-Energy.

Summary:

Power: societally-critical system
- Critical service
- Emits greenhouse gases, need to decarbonize
- Complex to operate effectively
Example problem: AC optimal power flow
- Goals: minimize power costs, meet demand, satisfy grid & operational constraints
- Minimize cost subject to a system of constraints
  - Min/max bounds
  - Power flow (conservation of flow over all edges and nodes in graph)
  - Quadratic relationships: non-convex and NP-hard
- Power wholesale prices are based on this problem
- 1k-14k variables at the lower end of complexity but must be solved in milliseconds for online control
- Example Approach: Amortized optimization via ML
- Must be
  - Safe: respect hardware constraints
  - Robust to variety of inputs
  - Generalizable to many settings
  - Multi-agent: cooperate with other controllers
  - Human-in-the-loop
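For reference, the AC optimal power flow problem outlined above is commonly written in a form like the following. This is a sketch of the standard textbook formulation, not notation taken from the talk. The symbols are assumptions introduced here: $\mathcal{N}$ is the set of buses, $\mathcal{G}$ the set of buses with generators, $p^g_i, q^g_i$ the active and reactive generation, $p^d_i, q^d_i$ the demand, $v_i$ the complex bus voltage, $Y$ the bus admittance matrix, and $c_i$ a generation cost function. Line flow limits are omitted for brevity.

```latex
\begin{aligned}
\min_{p^g,\, q^g,\, v} \quad & \sum_{i \in \mathcal{G}} c_i(p^g_i) \\
\text{s.t.} \quad
& (p^g_i - p^d_i) + \mathrm{j}\,(q^g_i - q^d_i) = v_i \sum_{k \in \mathcal{N}} Y_{ik}^{*}\, v_k^{*}
  && \forall i \in \mathcal{N} \\
& p^{\min}_i \le p^g_i \le p^{\max}_i, \quad q^{\min}_i \le q^g_i \le q^{\max}_i
  && \forall i \in \mathcal{G} \\
& v^{\min}_i \le |v_i| \le v^{\max}_i
  && \forall i \in \mathcal{N}
\end{aligned}
```

Here $p^g_i = q^g_i = 0$ at buses without generators. The power balance equality is where the quadratic relationships enter: each term multiplies two voltage variables, which is what makes the feasible set non-convex.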
Focus on safety
Enforcing constraints via optimization-in-the-loop ML
- Enforce via differentiable “last layer” procedure
- Explicit vs implicit layers
  - Explicit:
    - Express the layer as an explicit function of the model’s input/internal state that produces the output
    - Differentiate through this function directly
  - Implicit:
    - In many use-cases (e.g. power flow), we can’t separate inputs from outputs
    - e.g. we know the prior state and part of the next state, and we iteratively refine the remaining unknowns until the state transition is consistent
    - Differentiate through the output/next-state solution that is produced iteratively
    - The implicit function theorem lets us do this without propagating gradients through the iterative refinement process
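The implicit-layer idea above can be illustrated with a minimal sketch. It assumes a toy fixed-point layer z = tanh(W z + x), which is a stand-in chosen here for illustration and is not the power flow solver from the talk. The names `solve_fixed_point` and `implicit_jacobian` are hypothetical. The Jacobian is computed from the converged solution alone and then checked against finite differences.

```python
import numpy as np

def solve_fixed_point(W, x, tol=1e-12, max_iter=1000):
    """Iterate z <- tanh(W z + x) until the update is smaller than tol."""
    z = np.zeros_like(x)
    for _ in range(max_iter):
        z_next = np.tanh(W @ z + x)
        if np.max(np.abs(z_next - z)) < tol:
            return z_next
        z = z_next
    return z

def implicit_jacobian(W, x, z_star):
    """dz*/dx via the implicit function theorem, using only the solution z*.

    Differentiating z* = tanh(W z* + x) gives (I - D W) dz* = D dx,
    where D = diag(1 - z*^2) is the derivative of tanh at the solution.
    """
    D = np.diag(1.0 - z_star**2)
    return np.linalg.solve(np.eye(len(x)) - D @ W, D)

rng = np.random.default_rng(0)
n = 4
W = 0.1 * rng.standard_normal((n, n))  # small weights keep the iteration contractive
x = rng.standard_normal(n)

z_star = solve_fixed_point(W, x)
J = implicit_jacobian(W, x, z_star)  # no backpropagation through the loop

# Finite-difference check: re-solve the fixed point for perturbed inputs.
eps = 1e-6
J_fd = np.zeros((n, n))
for k in range(n):
    dx = np.zeros(n)
    dx[k] = eps
    J_fd[:, k] = (solve_fixed_point(W, x + dx) - solve_fixed_point(W, x - dx)) / (2 * eps)

print(np.max(np.abs(J - J_fd)))
```

The cost of the gradient is one linear solve at the solution, regardless of how many iterations the forward solve needed.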
Optimization-in-the-loop ML for AC Optimal Power Flow
- Apply optimization, ensuring constraints are followed and costs are minimized
- Train neural network to compute constraints
- Completion/correction approach:
  - Train neural networks on a subset of the problem, take their output (not guaranteed to fully respect the wider problem), then run a full AC power flow solver on that output to complete and correct it
  - Fails if the neural network produces an infeasible solution that cannot be refined/completed
- Unified approach:
  - Neural network outputs a fully valid design that is not optimal (unconstrained optimization/feasibility seeking)
  - This is used as a warm start for a minimization algorithm to find an optimal solution
  - Failure mode: neural network’s design is inherently more costly than the global optimum, so the optimizer settles into a local optimum cost
- Iterative procedure can sometimes diverge
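The completion step and its failure mode can be sketched on a toy problem. The sketch below assumes a single scalar equality constraint h(u, y) = y^2 + y - u = 0, which is a stand-in chosen here and not the AC power flow equations. The value `u` plays the role of a hypothetical network prediction for a subset of the variables, and `complete` is a hypothetical helper that solves for the remaining variable `y` with Newton's method.

```python
def complete(u, y0=0.0, tol=1e-10, max_iter=50):
    """Solve h(u, y) = y**2 + y - u = 0 for y with Newton's method.

    Returns (y, converged). `u` stands in for a network prediction.
    """
    y = y0
    for _ in range(max_iter):
        h = y**2 + y - u
        if abs(h) < tol:
            return y, True
        dh_dy = 2.0 * y + 1.0
        if abs(dh_dy) < 1e-12:
            return y, False  # singular Jacobian, cannot take a Newton step
        y = y - h / dh_dy
    return y, False  # iteration budget exhausted without convergence

# A prediction that can be completed: y**2 + y - 2 = 0 has the root y = 1.
y_ok, converged_ok = complete(2.0)

# A prediction that cannot be completed: y**2 + y + 1 = 0 has no real root,
# so the Newton iteration never converges.
y_bad, converged_bad = complete(-1.0)

print(y_ok, converged_ok, converged_bad)
```

In the second call the iteration oscillates instead of converging, which mirrors the failure case where the predicted partial solution admits no feasible completion.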
Approximating AC Power Flow: 118-bus transmission test case
- FSNet approach produces optimal results (relative to IPOPT), 10x faster than IPOPT
  - (300x speedup in latest unpublished work for 1000-bus grids)
- Baseline neural network approach does not produce optimal results, but is 500x faster than IPOPT
Also better solutions on synthetic cases than traditional optimization solvers
- Intuition: neural solver is tuned to a particular distribution of power grids and conditions
- Is unusually effective in these regions of the problem space
- Implication: neural optimizers will not generalize as well
Differentiable projection onto feasible actions
- Building heating design
- Embedding an optimization-based or physics-based enforcer is a general approach
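A minimal sketch of a differentiable projection, assuming the feasible set is a single halfspace {y : a·y <= b}. That set is chosen here because its Euclidean projection has a closed form; the constraint sets in the talk's applications are more complex. The function names are hypothetical.

```python
import numpy as np

def project_halfspace(x, a, b):
    """Euclidean projection of x onto the halfspace {y : a @ y <= b}."""
    violation = a @ x - b
    if violation <= 0.0:
        return x.copy()  # already feasible, projection is the identity
    return x - (violation / (a @ a)) * a

def projection_jacobian(x, a, b):
    """Jacobian of the projection with respect to x."""
    n = len(x)
    if a @ x - b <= 0.0:
        return np.eye(n)
    # When the constraint is active, the component of x along a is removed.
    return np.eye(n) - np.outer(a, a) / (a @ a)

a = np.array([1.0, 1.0])
b = 1.0
x = np.array([2.0, 2.0])  # infeasible proposed action: a @ x = 4 > 1

x_proj = project_halfspace(x, a, b)
J = projection_jacobian(x, a, b)
print(x_proj, a @ x_proj)
print(J)
```

Because the Jacobian is available, the projection can sit at the end of a network and still pass gradients back to the layers that produced `x`.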
Learning approximate projections: optimization that is mostly correct
- Output of neural network is encoded into a latent vector via an auto-encoder
- If the vector is not feasible, can push it towards correct region
- Train a critic to differentiate between feasible and infeasible points
- Observation: encoding turns complex constraint shapes into simple shapes in latent space
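The correction step described above might look roughly like the following sketch. Everything here is an assumption made for illustration: the trained critic is replaced by a hand-written function whose feasible region is the unit ball in latent space (a "simple shape"), and `correct_latent` is a hypothetical name. A real system would use a learned encoder and critic.

```python
import numpy as np

def toy_critic(z):
    """Stand-in for a trained critic: positive means infeasible."""
    return float(z @ z - 1.0)

def toy_critic_grad(z):
    """Gradient of the toy critic with respect to the latent vector."""
    return 2.0 * z

def correct_latent(z, critic, critic_grad, step=0.1, max_steps=200):
    """Push a latent vector toward the feasible region by descending the critic."""
    z = z.copy()
    for _ in range(max_steps):
        if critic(z) <= 0.0:
            break  # the critic judges this point feasible
        z = z - step * critic_grad(z)
    return z

z0 = np.array([3.0, 4.0])  # latent code judged infeasible by the toy critic
z_fixed = correct_latent(z0, toy_critic, toy_critic_grad)
print(z_fixed, toy_critic(z_fixed))
```

The result is only as trustworthy as the critic, which is why this family of methods yields solutions that are mostly, rather than provably, feasible.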
Safety-encouraging exploration in Reinforcement Learning
- If we see an agent doing unsafe things, we want to up-sample it so we see it more during training
- If we see the agent do something bad, save the state into a “retraining area” of starting states
- Enable higher-safety RL training
- Observation: we don’t actually know if a given policy is safe in the real world
  - Use reachability-based formal verification to identify implications of policy
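The "retraining area" idea can be sketched as a buffer of states where unsafe behavior was observed, from which some training episodes are restarted. This is a minimal illustration under assumptions made here: the class name, the capacity, and the replay probability `p_replay` are all hypothetical, and states are represented as plain tuples.

```python
import random

class UnsafeStateBuffer:
    """Stores states where unsafe behavior was seen, for use as start states."""

    def __init__(self, capacity=1000, seed=0):
        self.capacity = capacity
        self.states = []
        self.rng = random.Random(seed)

    def add(self, state):
        """Record a state; drop the oldest entry once the buffer is full."""
        if len(self.states) >= self.capacity:
            self.states.pop(0)
        self.states.append(state)

    def sample_start(self, default_start, p_replay=0.5):
        """With probability p_replay, restart from a stored unsafe state."""
        if self.states and self.rng.random() < p_replay:
            return self.rng.choice(self.states)
        return default_start

buffer = UnsafeStateBuffer(capacity=2)
default = (0.0, 0.0)
start_when_empty = buffer.sample_start(default)  # nothing stored yet

for s in [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]:
    buffer.add(s)  # the first state is evicted when the third arrives

start_replayed = buffer.sample_start(default, p_replay=1.0)
print(start_when_empty, start_replayed, buffer.states)
```

Raising `p_replay` up-samples the situations where the agent previously misbehaved, so they appear more often during training.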
Benchmarks:
- PF∆ (PF-Delta): ML Benchmark for power flow
  - https://arxiv.org/html/2510.22048v1
  - How good are ML surrogates for power flow?
  - Are they robust to near-infeasible scenarios?
  - How do they generalize?
- RL2Grid: Benchmark for Topology Optimization
  - https://arxiv.org/abs/2503.23101
  - Discrete actions for optimizations
  - 65k configurations for a single substation on the 118-bus system