Combining Simulators and Machine Learning in High Energy Physics
Abstract:
High energy physics explores nature at the extremes with the aim of understanding its fundamental laws. For instance, the Large Hadron Collider at CERN produces the highest energy particle collisions ever achieved, which are examined with massive detectors to study fundamental particles and their interactions. The complexity of the collisions and detectors necessitates the use of high-fidelity simulators for inferential tasks. However, these simulators are highly resource intensive and have intractable likelihoods. To overcome these challenges, this talk will discuss methods that combine machine learning and simulators, often dubbed simulation-based inference, to aid and improve data analysis in High Energy Physics.
Bio:
Dr. Michael Kagan is a Lead Staff Scientist at SLAC National Accelerator Laboratory. He is an experimental high energy physicist working on the Large Hadron Collider at CERN and on exploring the interface of physics and Machine Learning. He obtained his Ph.D. in Physics from Harvard University in 2012, and his B.A. in Physics and Mathematics from the University of Michigan in 2006. Dr. Kagan was awarded the SLAC Panofsky Fellowship in 2012, and the Department of Energy Early Career Award in 2018.
Summary:
Analysis goals
Fundamental physics parameter inference
Hypothesis testing
Compare measured distributions to theory
Tuning parameters of simulations
Accelerator/detector design
Data:
Collisions from the CERN ATLAS detector
Measure:
Energy, Momentum
Particle Type: Muon, Bottom Quark, Electron, Neutrino
Measures both individual events and aggregates (e.g. total charge left in the detector over a time block)
Data generation process:
Standard model of particle physics + its parameters
O(10) Fundamental particles
O(100) Particle interactions
O(10^8) Detector elements
Fundamentally stochastic
Even though the physical model is computable, the full distribution of possible events is intractable to compute (the space of random outcomes is enormous)
Goal: invert data generation process from data back to parameters
Collect probability distribution of event counts detected by detector
Reduce 100M events -> 1 inferred parameter
Done by simulating the physical process
Simulation models the process from parameters to a distribution over outcomes
Given data with the observed distribution of outcomes
Invert the simulation (many techniques exist) to update the probability distribution over possible simulator parameter values via Bayes' theorem
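One of the simplest likelihood-free inversion techniques is rejection Approximate Bayesian Computation: sample parameters from a prior, run the simulator forward, and keep only parameters whose simulated output matches the data. The sketch below is a toy illustration with a hypothetical Poisson "simulator" and made-up numbers, not the neural methods discussed in the talk:

```python
import numpy as np

# Toy "simulator": given a physics parameter theta (a true event rate),
# produce a stochastic observation -- a Poisson event count. The likelihood
# is trivial here, but we pretend it is intractable and use only forward
# simulation, as in real HEP analyses.
rng = np.random.default_rng(0)

def simulate(theta, rng):
    return rng.poisson(theta)

# "Observed" data: a count generated at an unknown true parameter.
theta_true = 42.0
observed = simulate(theta_true, np.random.default_rng(1))

# Rejection ABC: draw parameters from a flat prior, run the simulator,
# and keep parameters whose simulated output lands close to the data.
prior_draws = rng.uniform(0.0, 100.0, size=200_000)
accepted = [t for t in prior_draws if abs(simulate(t, rng) - observed) <= 2]

# The accepted draws approximate the Bayesian posterior over theta.
posterior_mean = float(np.mean(accepted))
print(f"observed count: {observed}, posterior mean: {posterior_mean:.1f}")
```

The posterior mean lands near the observed count, recovering the parameter region consistent with the data without ever evaluating a likelihood.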
Their contribution is a way to invert simulation using neural nets
Use Deep Generative Models to train a surrogate simulator
These neural nets are differentiable even though the simulation is not, so can use gradient descent to tune simulation parameters to minimize prediction error
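The surrogate idea above can be sketched end to end: sample a non-differentiable simulator, fit a differentiable surrogate to the samples, then run gradient descent on the parameter through the surrogate. For brevity, a quadratic polynomial stands in for the deep generative model; the simulator, target, and all numbers are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stochastic, non-differentiable "simulator": maps a parameter theta
# to a noisy outcome. We never differentiate through this function.
def simulate(theta):
    return np.sin(theta) + 0.5 * theta + 0.05 * rng.standard_normal()

# Step 1: run the simulator many times to collect training data.
thetas = rng.uniform(-2.0, 2.0, size=500)
outcomes = np.array([simulate(t) for t in thetas])

# Step 2: fit a differentiable surrogate to the samples. A quadratic
# least-squares fit stands in for the deep generative model.
c2, c1, c0 = np.polyfit(thetas, outcomes, deg=2)

def surrogate(theta):
    return c2 * theta**2 + c1 * theta + c0

def surrogate_grad(theta):
    return 2 * c2 * theta + c1

# Step 3: gradient descent on theta to hit a target outcome, using the
# surrogate's gradient (the simulator itself provides no gradient).
target = 1.0
theta = -1.5
for _ in range(500):
    residual = surrogate(theta) - target
    theta -= 0.1 * 2 * residual * surrogate_grad(theta)

print(f"tuned theta = {theta:.3f}, surrogate output = {surrogate(theta):.3f}")
```

A neural surrogate plays the same role: it is trained on simulator samples, and its gradients drive the parameter tuning.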
Application: Detector Design Optimization:
Optimize design of magnetic shield of the SHiP muon detector
Goal: tune parameters of magnet to achieve a given target distribution of muons (they need to arrive at detector in an adequately tight cluster)
They have a non-differentiable simulation of the accelerator and detector
They invert it by
Running the simulator many times, and
Training a differentiable GAN on the data to approximate it
GAN predicts the final location of a muon given its initial state
GAN is differentiable and can be inverted using gradient descent
The optimization process moves across the device’s parameter space
GAN is only valid in a limited region of the simulation’s parameter space
So they iteratively re-train the generative model on different regions of parameter space
They discovered that the retraining step was much cheaper than running the expensive simulation enough times to do this without a differentiable approximation
They compared against alternative inversion methods; this method is consistently faster
The prediction error is unbiased (observed empirically, not proven)
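The iterative re-training loop can be sketched as: fit a surrogate only in a small window around the current design, take a gradient step through it, then re-fit around the new design. A local linear fit stands in for the GAN here, and the toy objective (a quadratic "muon spread" with its minimum at theta = 3) is entirely hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Non-differentiable simulator: spread of muon positions as a function
# of a shield design parameter theta (toy stand-in, minimum at theta = 3).
def simulate(theta):
    return (theta - 3.0) ** 2 + 1.0 + 0.01 * rng.standard_normal()

theta = 0.0  # initial design
for step in range(20):
    # Re-train the local surrogate: sample the simulator only in a small
    # window around the current design, where a simple fit (here a line,
    # standing in for the GAN) is trustworthy.
    local = theta + rng.uniform(-0.5, 0.5, size=50)
    outs = np.array([simulate(t) for t in local])
    slope, intercept = np.polyfit(local, outs, deg=1)

    # Gradient step through the surrogate to shrink the muon spread.
    theta -= 0.5 * slope

print(f"optimized theta = {theta:.2f}")
```

Each surrogate is only trusted near where it was trained, mirroring the point that the GAN is valid in a limited region and must be re-trained as the optimizer moves through the design space.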
Application: deconvolving measurements to remove detector effects that introduce measurement noise
Automatic Differentiation:
Neural approximation is challenging because it is hard to get neural nets to relearn known physics
Enforcing physical invariants recovers some of this, but much work is still wasted
Different approach: write the simulation to be differentiable to begin with
MadJax: differentiable Matrix Elements with JAX
They hijacked an automatic differentiator for Fortran and forced it to generate JAX code
Produced a differentiable particle scattering simulation
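The core mechanism behind tools like JAX (and hence MadJax) is automatic differentiation: every elementary operation carries its derivative along with its value. The minimal forward-mode sketch below uses dual numbers and a toy 1 + cos^2(theta) "matrix element" (a hypothetical stand-in, not MadJax's actual expressions):

```python
import math

# Minimal forward-mode automatic differentiation via dual numbers:
# each Dual carries a value and its derivative through every operation.
class Dual:
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)
    __rmul__ = __mul__

def cos(x):
    # Chain rule: d/dx cos(x) = -sin(x) * dx
    return Dual(math.cos(x.val), -math.sin(x.val) * x.dot)

# Toy "matrix element": an angular distribution ~ 1 + cos^2(theta).
def amplitude(theta):
    c = cos(theta)
    return 1 + c * c

theta = Dual(0.5, 1.0)  # seed derivative d(theta)/d(theta) = 1
out = amplitude(theta)
print(f"value = {out.val:.4f}, d/dtheta = {out.dot:.4f}")
# Analytic check: d/dtheta [1 + cos^2(theta)] = -sin(2*theta)
```

Generating simulation code in a framework with this property built in is what makes the scattering simulation differentiable end to end.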