Combining Simulators and Machine Learning in High Energy Physics
Abstract:
High energy physics explores nature at the extremes with the aim of understanding its fundamental laws. For instance, the Large Hadron Collider at CERN produces the highest energy particle collisions ever achieved, which are examined with massive detectors to study fundamental particles and their interactions. The complexity of the collisions and detectors necessitates the use of high-fidelity simulators for inferential tasks. However, these simulators are highly resource intensive and have intractable likelihoods. To overcome these challenges, this talk will discuss methods that combine machine learning and simulators, often dubbed simulation-based inference, to aid and improve data analysis in High Energy Physics.
Bio:
Dr. Michael Kagan is a Lead Staff Scientist at SLAC National Accelerator Laboratory. He is an experimental high energy physicist working on the Large Hadron Collider at CERN and on exploring the interface of physics and Machine Learning. He obtained his Ph.D. in Physics from Harvard University in 2012, and his B.A. in Physics and Mathematics from the University of Michigan in 2006. Dr. Kagan was awarded the SLAC Panofsky Fellowship in 2012, and the Department of Energy Early Career Award in 2018.
Summary:
Analysis goals
Fundamental physics parameter inference
Hypothesis testing
Compare measured distributions to theory
Tuning parameters of simulations
Accelerator/detector design
Data:
Collisions from the CERN ATLAS detector
Measure:
Energy, Momentum
Particle Type: Muon, Bottom Quark, Electron, Neutrino
Measures both individual events and aggregates (e.g. total charge left in the detector over a time block)
Data generation process:
Standard model of particle physics + its parameters
O(10) Fundamental particles
O(100) Particle interactions
O(10^8) Detector elements
Fundamentally stochastic
Even though the physical model is computable, the full distribution of possible events is intractable to compute (the space of random outcomes is enormous)
Goal: invert data generation process from data back to parameters
Collect probability distribution of event counts detected by detector
Reduce 100M events -> 1 inferred parameter
Done by simulating the physical process
Simulation models the process from parameters to a distribution over outcomes
Given data with the observed distribution of outcomes
Invert the simulation (many techniques exist) to update the probability distribution over possible simulator parameter values via Bayes' theorem
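One of the simplest likelihood-free inversion techniques is rejection Approximate Bayesian Computation: sample parameters from a prior, run the simulator forward, and keep only parameters whose simulated output matches the data. The sketch below is a toy illustration with a hypothetical Poisson "simulator" and made-up numbers, not the neural methods discussed in the talk:

```python
import numpy as np

# Toy "simulator": given a physics parameter theta (a true event rate),
# produce a stochastic observation -- a Poisson event count. The likelihood
# is trivial here, but we pretend it is intractable and use only forward
# simulation, as in real HEP analyses.
rng = np.random.default_rng(0)

def simulate(theta, rng):
    return rng.poisson(theta)

# "Observed" data: a count generated at an unknown true parameter.
theta_true = 42.0
observed = simulate(theta_true, np.random.default_rng(1))

# Rejection ABC: draw parameters from a flat prior, run the simulator,
# and keep parameters whose simulated output lands close to the data.
prior_draws = rng.uniform(0.0, 100.0, size=200_000)
accepted = [t for t in prior_draws if abs(simulate(t, rng) - observed) <= 2]

# The accepted draws approximate the Bayesian posterior over theta.
posterior_mean = float(np.mean(accepted))
print(f"observed count: {observed}, posterior mean: {posterior_mean:.1f}")
```

The posterior mean lands near the observed count, recovering the parameter region consistent with the data without ever evaluating a likelihood.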
Their contribution is a way to invert simulation using neural nets
Use Deep Generative Models to train a surrogate simulator
These neural nets are differentiable even though the simulation is not, so can use gradient descent to tune simulation parameters to minimize prediction error
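The surrogate idea above can be sketched end to end: sample a non-differentiable simulator, fit a differentiable surrogate to the samples, then run gradient descent on the parameter through the surrogate. For brevity, a quadratic polynomial stands in for the deep generative model; the simulator, target, and all numbers are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stochastic, non-differentiable "simulator": maps a parameter theta
# to a noisy outcome. We never differentiate through this function.
def simulate(theta):
    return np.sin(theta) + 0.5 * theta + 0.05 * rng.standard_normal()

# Step 1: run the simulator many times to collect training data.
thetas = rng.uniform(-2.0, 2.0, size=500)
outcomes = np.array([simulate(t) for t in thetas])

# Step 2: fit a differentiable surrogate to the samples. A quadratic
# least-squares fit stands in for the deep generative model.
c2, c1, c0 = np.polyfit(thetas, outcomes, deg=2)

def surrogate(theta):
    return c2 * theta**2 + c1 * theta + c0

def surrogate_grad(theta):
    return 2 * c2 * theta + c1

# Step 3: gradient descent on theta to hit a target outcome, using the
# surrogate's gradient (the simulator itself provides no gradient).
target = 1.0
theta = -1.5
for _ in range(500):
    residual = surrogate(theta) - target
    theta -= 0.1 * 2 * residual * surrogate_grad(theta)

print(f"tuned theta = {theta:.3f}, surrogate output = {surrogate(theta):.3f}")
```

A neural surrogate plays the same role: it is trained on simulator samples, and its gradients drive the parameter tuning.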
Application: Detector Design Optimization:
Optimize design of magnetic shield of the SHiP muon detector
Goal: tune parameters of magnet to achieve a given target distribution of muons (they need to arrive at detector in an adequately tight cluster)
They have a non-differentiable simulation of the accelerator and detector
They invert it by
Running the simulator many times, and
Training a differentiable GAN on the data to approximate it
GAN predicts the final location of a muon given its initial state
GAN is differentiable and can be inverted using gradient descent
The optimization process moves across the device’s parameter space
GAN is only valid in a limited region of the simulation’s parameter space
So they iteratively re-train the generative model on different regions of parameter space
They discovered that the retraining step was much cheaper than running the expensive simulation enough times to do this without a differentiable approximation
They compared against alternative inversion methods; this method is consistently faster
The prediction error is unbiased (observed empirically, not proven)
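The iterative re-training loop can be sketched as: fit a surrogate only in a small window around the current design, take a gradient step through it, then re-fit around the new design. A local linear fit stands in for the GAN here, and the toy objective (a quadratic "muon spread" with its minimum at theta = 3) is entirely hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Non-differentiable simulator: spread of muon positions as a function
# of a shield design parameter theta (toy stand-in, minimum at theta = 3).
def simulate(theta):
    return (theta - 3.0) ** 2 + 1.0 + 0.01 * rng.standard_normal()

theta = 0.0  # initial design
for step in range(20):
    # Re-train the local surrogate: sample the simulator only in a small
    # window around the current design, where a simple fit (here a line,
    # standing in for the GAN) is trustworthy.
    local = theta + rng.uniform(-0.5, 0.5, size=50)
    outs = np.array([simulate(t) for t in local])
    slope, intercept = np.polyfit(local, outs, deg=1)

    # Gradient step through the surrogate to shrink the muon spread.
    theta -= 0.5 * slope

print(f"optimized theta = {theta:.2f}")
```

Each surrogate is only trusted near where it was trained, mirroring the point that the GAN is valid in a limited region and must be re-trained as the optimizer moves through the design space.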
Application: deconvolving measurements to remove detector effects that introduce measurement noise
Automatic Differentiation:
Neural approximation is challenging because it is hard to get neural nets to relearn known physics
Enforcing physical invariants recovers some of this, but much work is still wasted
Different approach: write the simulation to be differentiable to begin with
MadJax: differentiable Matrix Elements with JAX
They hijacked an automatic differentiator for Fortran and forced it to generate JAX code
Produced a differentiable particle scattering simulation
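The core mechanism behind tools like JAX (and hence MadJax) is automatic differentiation: every elementary operation carries its derivative along with its value. The minimal forward-mode sketch below uses dual numbers and a toy 1 + cos^2(theta) "matrix element" (a hypothetical stand-in, not MadJax's actual expressions):

```python
import math

# Minimal forward-mode automatic differentiation via dual numbers:
# each Dual carries a value and its derivative through every operation.
class Dual:
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)
    __rmul__ = __mul__

def cos(x):
    # Chain rule: d/dx cos(x) = -sin(x) * dx
    return Dual(math.cos(x.val), -math.sin(x.val) * x.dot)

# Toy "matrix element": an angular distribution ~ 1 + cos^2(theta).
def amplitude(theta):
    c = cos(theta)
    return 1 + c * c

theta = Dual(0.5, 1.0)  # seed derivative d(theta)/d(theta) = 1
out = amplitude(theta)
print(f"value = {out.val:.4f}, d/dtheta = {out.dot:.4f}")
# Analytic check: d/dtheta [1 + cos^2(theta)] = -sin(2*theta)
```

Generating simulation code in a framework with this property built in is what makes the scattering simulation differentiable end to end.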