Modeling particle accelerators with large-scale Particle-In-Cell codes
Abstract:
Particle accelerators have applications in many fields, including discovery science, medicine, industry and national security. The design of particle accelerators, as well as R&D in novel acceleration techniques, requires advanced computational models. These models are often based on the Particle-In-Cell algorithm, which is highly scalable but can become numerically costly for certain types of accelerators. We will present the BLAST toolkit, a collection of open-source high-performance modeling tools for particle accelerators. In particular, we will discuss how these algorithms are ported to massively-parallel computing architectures and the world's largest supercomputers, and how ML techniques can be leveraged to improve modeling workflows.
Bios:
Jean-Luc Vay obtained his Master’s degree and Ph.D. at the University of Paris (France). He is now a senior scientist and the head of the Accelerator Modeling Program in the Accelerator Technology and Applied Physics Division at Lawrence Berkeley National Laboratory. He also leads the multi-institution DOE Exascale Computing Project application WarpX and the DOE SciDAC Collaboration for Advanced Modeling of Particle Accelerators (CAMPA). His research focuses on the development of algorithms and codes, and their use for the modeling of various particle beams, accelerators, and plasma applications. He is a Fellow of the American Physical Society, and the recipient of the 2013 US Particle Accelerator School Prize for Achievement in Accelerator Physics & Technology and the 2014 NERSC Award for Innovative Use of High-Performance Computing.
Remi Lehe is a Research Scientist at LBNL, where his work focuses on simulations of plasma-based accelerators and research on advanced Particle-In-Cell algorithms. He is also a core developer for the open-source codes WarpX and FBPIC. In addition, Dr. Lehe has recently been involved in several research projects involving artificial intelligence, and he is the lead instructor for a new course on machine learning created in 2021 at the U.S. Particle Accelerator School. Remi Lehe obtained his Master’s degree in Physics at the Ecole Normale Superieure (France) and his Ph.D. at the Ecole Polytechnique (France). His work has been recognized by the John Dawson Prize of the Laser-Plasma Accelerator Workshop in 2015, the APS Metropolis prize in 2016, and the LBNL ATAP SPOT award in 2023.
Axel Huebl is a computational laser-plasma physicist working on Exascale simulations. As a scientist at Berkeley Lab, he leads the software architecture of the Beam, Plasma & Accelerator Simulation Toolkit (BLAST). In 2019, he completed his PhD with highest distinction at TU Dresden (Germany) and received awards for his pioneering work on PIConGPU (Gordon Bell Finalist @ SC13; ACM/IEEE George Michael Memorial Fellowship @ SC16; FoMICS PhD prize @ PASC17; IEEE-NPSS PAST award 2022). He is an avid advocate for open science and leads the open particle-mesh data project (openPMD) for self-describing, scalable I/O and data science.
All three presenters and their coauthors were awarded the 2022 ACM Gordon Bell Prize for outstanding achievement in high-performance computing, running the BLAST code WarpX on the world's largest supercomputers, including Frontier, the first reported Exascale machine.
Summary:
Focus: modeling the dynamics of particle accelerators
BLAST: Beam, Plasma & Accelerator Simulation Toolkit: https://blast.lbl.gov
Types of accelerators:
Synchrotron:
Circular accelerator where particles follow the same path
Hardware is reused for many paths
Particles lose energy to radiation along the path
Linear accelerator:
Single path
Needs to be long, since particles gain all their energy in a single pass
Both require a sequence of magnets and/or accelerating structures
Plasma accelerators
Plasma, "the 4th state of matter": separated nuclei and electrons
Lasers displace the light electrons in the plasma, while the much heavier ions barely move
Shoot in the direction of the particle path
Dynamically creates moving bubbles with electric fields in the areas where electrons were pushed away
Bubbles have a large electric field that accelerates particles
Particles move in the bubble and get accelerated forward
Particles move faster than the bubble; once they reach its front they stop gaining energy, so they need a new bubble (a new stage)
Simulation
Must model bubbles and their electromagnetic fields
Motion of particles in the field
Motion of plasma, bubbles and how lasers (also electromagnetic fields) drive them
Particle-in-cell (PIC) algorithm
Electric and magnetic fields: Maxwell's equations
Beam and plasma particles: relativistic equations of motion (particles can move close to the speed of light)
Modeled as explicit time step algorithm
Electromagnetic fields are affected by
Plasma particles
Laser
Accelerated particles
Stages of each iteration:
Particle push
Charge/current deposition
Field solver
Field gathering
High computational complexity: micron-scale wavelength means that time/space resolution needs to be very fine since light covers this distance very quickly (sub-femtosecond time resolution).
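The four stages of each iteration can be sketched with a toy one-dimensional electrostatic PIC step. This is only an illustration of the loop structure: WarpX itself solves the full electromagnetic Maxwell equations, and units, particle weighting, and the leapfrog details are simplified here.

```python
# Toy 1D electrostatic PIC step (illustration of the loop stages only;
# not the WarpX electromagnetic solver).
import numpy as np

def pic_step(x, v, q_m, grid_n, length, dt):
    dx = length / grid_n

    # Charge deposition: nearest-grid-point weighting onto the mesh,
    # with a uniform neutralizing background subtracted.
    cells = np.floor(x / dx).astype(int) % grid_n
    rho = np.zeros(grid_n)
    np.add.at(rho, cells, 1.0 / dx)
    rho -= rho.mean()

    # Field solve: periodic Poisson equation via FFT, then E = -dphi/dx.
    k = 2 * np.pi * np.fft.fftfreq(grid_n, d=dx)
    rho_k = np.fft.fft(rho)
    phi_k = np.zeros_like(rho_k)
    phi_k[1:] = rho_k[1:] / k[1:] ** 2
    E = np.fft.ifft(-1j * k * phi_k).real

    # Field gathering: interpolate E back to each particle's cell.
    E_p = E[cells]

    # Particle push: advance velocities, then positions (periodic wrap).
    v = v + q_m * E_p * dt
    x = (x + v * dt) % length
    return x, v
```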
Algorithmic improvements: reduced geometry, mesh refinement, Lorentz transform
HPC computing: parallelism, GPUs
HPC strategy is based on shared multi-physics frameworks and support libraries
Focus: WarpX code
GPU-accelerated PIC code for Exascale
Explicit forward iteration in time: Push particle->deposit currents->solve fields->gather fields
Geometries:
3D cartesian grid is the standard
Can use cylindrical for radially-symmetric phenomena (2D, so cheaper and often a good approximation)
Techniques:
Multi-node parallelization: MPI
On-node parallelization: OpenMP
Scalable, Synchronized I/O for data analysis
https://github.com/ECP-WarpX/WarpX
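The multi-node MPI parallelization above rests on spatial domain decomposition: each rank owns a chunk of the grid plus "ghost" cells copied from its neighbors. A conceptual in-process sketch (real WarpX uses AMReX and actual MPI messages; here the "ranks" are just array slices):

```python
# Conceptual sketch of domain decomposition with ghost-cell exchange.
# No real MPI: ranks are simulated as slices of one periodic 1D field.
import numpy as np

def decompose(field, n_ranks, n_ghost=1):
    """Split a periodic 1D field into per-rank chunks with ghost cells."""
    chunks = np.array_split(field, n_ranks)
    padded = []
    for r, chunk in enumerate(chunks):
        left = chunks[(r - 1) % n_ranks][-n_ghost:]
        right = chunks[(r + 1) % n_ranks][:n_ghost]
        padded.append(np.concatenate([left, chunk, right]))
    return padded

def exchange_halos(padded, n_ghost=1):
    """Refresh ghost cells from neighbors (the MPI send/recv analogue)."""
    n = len(padded)
    interiors = [p[n_ghost:-n_ghost] for p in padded]
    return [np.concatenate([interiors[(r - 1) % n][-n_ghost:],
                            interiors[r],
                            interiors[(r + 1) % n][:n_ghost]])
            for r in range(n)]
```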
HPC challenges
HPC hardware challenges:
1970-2005: Could improve chip performance by increasing the clock rate
2005-Present: Frequency has stabilized at a few GHz, so more performance is provided by putting more compute cores onto a single chip
There are many computer architectures for arranging these cores (CPUs, GPUs, FPGAs, etc.)
Parallelizing simulation codes
Break up the simulation problem in space, have each core simulate a small region of space
Give CPUs and GPUs different computations that are better suited for each type of processor
Codes based on sparse matrices are especially challenging because the highest performance processors are very regular and work best with dense data structures
Software architecture hierarchy:
Scripting & Language bindings (to Python)
Applications & Physics modules
IO, Math, Containers and Algorithms
Parallelism:
Message Passing
In-node Acceleration (hopefully: write code once, recompile to target new architecture)
AMReX library:
C++
Domain decomposition and MPI communication
There are many ways to express data structure iteration, transformation, decomposition
AMReX has a portable layer for expressing parallel nesting and blocking (parallel for, hierarchical parallelism, locality specification)
Can compile to many on-node parallelization libraries/languages
Working on standardizing this interface
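The portable layer can be pictured as a single parallel-for interface with swappable backends, so the loop body is written once and retargeted at build time. A toy Python analogue (AMReX's real ParallelFor is C++ and compiles to CUDA/HIP/SYCL/OpenMP; the backend names below are invented for illustration):

```python
# Toy portability layer: one parallel_for interface, several backends.
# This only illustrates the idea of writing the loop body once.
from concurrent.futures import ThreadPoolExecutor

def parallel_for(n, body, backend="serial"):
    """Apply body(i) for i in range(n) using the selected backend."""
    if backend == "serial":
        for i in range(n):
            body(i)
    elif backend == "threads":
        with ThreadPoolExecutor() as pool:
            list(pool.map(body, range(n)))
    else:
        raise ValueError(f"unknown backend {backend!r}")
```

The same loop body runs unchanged on either backend; only the dispatch string changes.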
Scaling
Weak scaling: run larger problems that model a larger area on more cores
100% -> 75% efficiency while scaling to full size of supercomputer
Can do 500x larger problems in 2023 than could in 2019
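Weak-scaling efficiency compares the runtime at scale against the single-node baseline while the work per node stays fixed; the runtimes below are made-up numbers chosen to match the ~75% figure above.

```python
# Weak scaling: work per node is fixed, so ideally the runtime stays
# flat as nodes are added; efficiency is baseline time over scaled time.
def weak_scaling_efficiency(t_baseline, t_scaled):
    return t_baseline / t_scaled

# e.g. a step taking 10 s on one node and 13.3 s on the full machine
# corresponds to ~75% weak-scaling efficiency (illustrative numbers).
eff = weak_scaling_efficiency(10.0, 13.3)
```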
Grid management:
Dynamics are much more dense/complex in some parts of space than in others
Z-order space filling curve
Block structured mesh refinement
Find adjacent blocks of similar size, assign to different cores/GPUs
Particles move across blocks and cores all the time, need to communicate them across network
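The Z-order (Morton) curve mentioned above interleaves the bits of a block's grid indices into one key; sorting blocks by this key keeps spatial neighbors close in the 1D ordering used to assign blocks to ranks. A minimal 2D version:

```python
def morton2d(x, y, bits=16):
    """Z-order key for 2D cell indices: interleave the bits of x and y.
    Nearby (x, y) pairs tend to get nearby keys, which preserves
    locality when blocks are distributed along the sorted key order."""
    key = 0
    for bit in range(bits):
        key |= ((x >> bit) & 1) << (2 * bit)       # x bits -> even slots
        key |= ((y >> bit) & 1) << (2 * bit + 1)   # y bits -> odd slots
    return key
```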
Surrogate modeling of simulation
Fast ML-based model that approximates the main simulation
Less accurate - much faster
Can either:
Make predictions more quickly, OR
Run large ensembles of simulations to explore parameter space, do Uncertainty Quantification
Example use-case: modeling an accelerator beamline
Train ML surrogate on specific range of parameter space
Goal is to replace everything with ML, if possible
Dataset: state of fields and particles before and after the accelerating stage, not each time step
Currently do not impose physics-based constraints (e.g. energy conservation)
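A minimal stand-in for this surrogate training step, fitting a simple model on synthetic stage-in/stage-out pairs. The scalar "injection energy" input and quadratic fit are invented for illustration; the actual surrogates are neural networks trained on full particle and field states.

```python
# Minimal surrogate sketch: fit a quadratic mapping from one made-up
# stage input to one output, trained on synthetic before/after pairs.
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.uniform(0.5, 1.5, 200)     # hypothetical stage inputs
y_train = 2.0 * x_train - 0.3 * x_train**2 \
          + 0.01 * rng.normal(size=200)  # hypothetical stage outputs

coeffs = np.polyfit(x_train, y_train, deg=2)   # "train" the surrogate
surrogate = np.poly1d(coeffs)

# Fast prediction in place of a full PIC simulation of the stage:
pred = surrogate(1.0)
```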
Example use-case: surrogate-based optimization
Want to optimize the parameters of the laser shot that make the accelerator work best
Parameter space: laser energy, gas density, laser chirp, …
Expensive to sample parameter space by running the full simulations, so need to run it as few times as possible
Learn surrogate model during the course of the sampling process
Use the surrogate to choose the next configuration point on which to run the full physics model
Surrogate gets better over the course of the algorithm, more training samples in the more promising region of the parameter space
Bayesian optimization: Surrogate model is a Gaussian Process Model
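A minimal version of this loop: a Gaussian-process surrogate with an RBF kernel suggests the next point via an upper-confidence-bound rule, and a cheap stand-in function plays the role of the expensive simulation. All settings (kernel length scale, kappa, the 1D "laser setting") are illustrative, not the production setup.

```python
# Toy Bayesian optimization with a GP surrogate (RBF kernel) and an
# upper-confidence-bound acquisition rule.
import numpy as np

def expensive_sim(x):            # stand-in for a full PIC run
    return -(x - 0.7) ** 2       # best "beam quality" at x = 0.7

def gp_posterior(X, y, Xq, length=0.2, noise=1e-5):
    """GP posterior mean/variance at query points Xq given samples X, y."""
    def k(a, b):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)
    K_inv = np.linalg.inv(k(X, X) + noise * np.eye(len(X)))
    Ks = k(X, Xq)
    mean = Ks.T @ K_inv @ y
    var = 1.0 - np.sum(Ks * (K_inv @ Ks), axis=0)
    return mean, np.maximum(var, 0.0)

def bayes_opt(n_iters=15, kappa=2.0):
    Xq = np.linspace(0.0, 1.0, 101)   # candidate "laser settings"
    X = np.array([0.0, 1.0])          # initial expensive samples
    y = expensive_sim(X)
    for _ in range(n_iters):
        mean, var = gp_posterior(X, y, Xq)
        # Pick the point with the best optimistic estimate: high
        # predicted value OR high uncertainty (exploration).
        x_next = Xq[np.argmax(mean + kappa * np.sqrt(var))]
        X = np.append(X, x_next)
        y = np.append(y, expensive_sim(x_next))
    return X[np.argmax(y)]
```

Later iterations concentrate samples where the surrogate predicts good performance, mirroring the "more training samples in the promising region" behavior described above.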
Multi-fidelity Bayesian optimization
BLAST includes simulations at different levels of physics approximation and compute costs
Can run different variants at different cost/accuracy points to iterate faster based on real accuracy needs of the search process
This converges much faster in time (many more fast samples)
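A loose sketch of the multi-fidelity idea: screen many candidates with a cheap, biased low-fidelity model, then spend the expensive high-fidelity model only on a shortlist. (Real multi-fidelity Bayesian optimization couples the fidelities inside one surrogate model; both "fidelity" functions below are invented stand-ins.)

```python
# Screen with a cheap model, verify the shortlist with the expensive one.
import numpy as np

def high_fidelity(x):               # stand-in: full 3D PIC run
    return -(x - 0.7) ** 2

def low_fidelity(x):                # stand-in: cheap approximate run,
    return -(x - 0.65) ** 2 + 0.05  # biased but correlated with the above

def multi_fidelity_search(n_cheap=200, n_expensive=5, seed=0):
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(0.0, 1.0, n_cheap)
    # Rank all candidates with the cheap model ...
    order = np.argsort(low_fidelity(candidates))
    shortlist = candidates[order][-n_expensive:]
    # ... then spend the expensive model only on the best few.
    return shortlist[np.argmax(high_fidelity(shortlist))]
```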
Future: astrophysical plasmas, plasma confinement fusion devices