Studying the Higgs boson properties
In 2020, I joined the CMS experiment as a PhD student, with the main objective of analyzing data produced by the world's largest particle accelerator: the Large Hadron Collider. My thesis, entitled "Search for CP Violation in the Tau Yukawa Coupling", focused on studying the fundamental properties of the Higgs boson and investigating its potential role in the mechanisms that led to the disappearance of antimatter in the early stages of the universe.
When a Higgs boson is eventually created by smashing together millions of bunches of protons, it quickly decays into other fundamental particles, which are then identified by the CMS detector. The main challenge for physicists therefore lies in isolating the few collisions of interest from a huge background of other events whose particles leave a similar experimental signature.
For this purpose, machine learning and deep learning tools are widely used in particle physics, in particular for particle identification. My thesis, which focused on the Higgs boson decaying into a pair of tau leptons, relied mainly on the DeepTau algorithm to identify events in which two of these particles are involved. At a later stage, a boosted decision tree was also used to categorize events and distinguish those involving a Higgs boson from background events involving particles with similar decay products.
Decay chain of a Higgs boson into a pair of tau leptons, which subsequently decay into neutrinos, quarks and a lighter lepton.
After being produced, the tau lepton travels a short distance away from the vertex where the Higgs boson originated and in turn decays into an invisible neutrino and lighter decay products, which can be electrons, muons or quarks. The DeepTau neural network is therefore used to distinguish the final-state products of a tau lepton decay from prompt particles originating from the primary vertex, where the proton collision happened. The network combines dense layers and convolutional layers to generate four output classes from kinematic information about the various final-state particles and "pictures" of the energy deposits in the detector layers.
Architecture of the DeepTau neural network, from the original publication
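To make the idea of combining a convolutional branch with a dense branch more concrete, here is a toy forward pass in plain Python. It is only a sketch under assumptions: the layer sizes, the random weights, the input values and the four class labels are illustrative choices made for this example, not the DeepTau configuration, which is far larger and is trained on simulated events.

```python
import math
import random

# Illustrative labels only (assumption): the text just mentions four output classes.
CLASSES = ("electron", "muon", "jet", "tau")


def conv2d_valid(image, kernel):
    """Slide a small kernel over a 2D grid (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer: one row of weights per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_params(n_kin, grid_size, kernel_size, n_hidden, seed=0):
    """Random, untrained weights; sizes are hypothetical."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    n_conv = (grid_size - kernel_size + 1) ** 2
    return {
        "kernel": matrix(kernel_size, kernel_size),
        "w_kin": matrix(n_hidden, n_kin),
        "b_kin": [0.0] * n_hidden,
        "w_out": matrix(len(CLASSES), n_conv + n_hidden),
        "b_out": [0.0] * len(CLASSES),
    }


def toy_tau_classifier(kinematics, energy_grid, params):
    # Convolutional branch: reads the "picture" of energy deposits.
    feature_map = conv2d_valid(energy_grid, params["kernel"])
    conv_features = relu([v for row in feature_map for v in row])
    # Dense branch: reads the kinematic variables.
    dense_features = relu(dense(kinematics, params["w_kin"], params["b_kin"]))
    # Merge both branches and map them to one probability per class.
    merged = conv_features + dense_features
    return softmax(dense(merged, params["w_out"], params["b_out"]))


# Hypothetical inputs: three scaled kinematic variables and a 5x5 energy grid.
kinematics = [0.8, -0.3, 1.2]
energy_grid = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.6, 0.9, 0.2, 0.0],
    [0.0, 0.8, 2.5, 0.7, 0.1],
    [0.0, 0.3, 0.6, 0.2, 0.0],
    [0.0, 0.0, 0.1, 0.0, 0.0],
]
params = make_params(n_kin=3, grid_size=5, kernel_size=3, n_hidden=4)
probabilities = toy_tau_classifier(kinematics, energy_grid, params)

for label, p in zip(CLASSES, probabilities):
    print(f"{label}: {p:.3f}")
```

Because the weights are random, the printed probabilities carry no physical meaning; the point is only the data flow, in which both branches end up in a single vector that a final layer turns into class probabilities.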
A boosted decision tree was trained with XGBoost to sort events into three categories: one dedicated to signal events (Higgs boson) and two to background events (Z boson events and events with fake tau leptons from hadronic jets). The model is trained on simulated events from each category, using different kinematic variables of the final-state particles as inputs. This procedure improves the sensitivity of the measurement of physics observables and gives a better description of the background contribution during the statistical inference step of the analysis.
Distribution of the expected number of events in the signal category as a function of the BDT output score.
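The three-way categorization and the resulting score distribution can be illustrated without the trained model. The sketch below assumes that the classifier returns one raw score per category, that these scores are turned into probabilities with a softmax, and that each event is assigned to the category with the highest probability. These are assumptions made for illustration: the category labels, the raw scores and the binning are hypothetical and hard-coded, whereas in the analysis the scores would come from the trained XGBoost model.

```python
import math

# Hypothetical labels for the one signal and two background categories.
CATEGORIES = ("higgs", "z_boson", "jet_fakes")


def softmax(raw_scores):
    shift = max(raw_scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in raw_scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorize(raw_scores):
    """Return the winning category and its probability (assumed argmax rule)."""
    probs = softmax(raw_scores)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return CATEGORIES[best], probs[best]


def histogram(scores, n_bins, low, high):
    """Count scores in equal-width bins; the top edge falls in the last bin."""
    counts = [0] * n_bins
    width = (high - low) / n_bins
    for s in scores:
        index = min(int((s - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


# Made-up raw scores for four events, one value per category.
raw_event_scores = [
    [2.0, 0.5, -1.0],
    [0.1, 1.8, 0.3],
    [-0.5, 0.2, 1.5],
    [1.2, 1.0, -0.2],
]

assignments = [categorize(scores) for scores in raw_event_scores]

# Keep only events that land in the signal category and bin their scores.
# With three classes the winning probability is always at least 1/3.
signal_scores = [score for category, score in assignments if category == "higgs"]
signal_histogram = histogram(signal_scores, n_bins=4, low=1 / 3, high=1.0)

for (category, score), raw in zip(assignments, raw_event_scores):
    print(f"{raw} -> {category} (score {score:.3f})")
print("signal-category histogram:", signal_histogram)
```

Binning the score of the winning category, as done in the last step, gives the kind of distribution that is then compared between signal and background expectations.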