Advanced Machine Learning for Physics (PhD 2024)
Course Information and Syllabus
Contacts: Stefano Giagu (stefano.giagu [at] uniroma1.it) and Andrea Ciardiello (andrea.ciardiello [at] gmail.com)
Program:
The general objective of the course is to become familiar with advanced deep learning techniques based on differentiable neural network models with different learning paradigms, to acquire skills in modelling complex problems through deep learning techniques, and to be able to apply them in different contexts in physics and in basic and applied scientific research.
Topics covered include: general overview of differentiable artificial neural networks and use of the PyTorch library for ANN design, training and testing. Basic architectures: MLP, convolutional neural networks, neural networks for sequence analysis (RNN, LSTM/GRU). Bayesian-NN. Attention, Self-Attention, Transformers and Vision Transformers. Models for object detection and semantic segmentation and applications. Graph Neural Networks and Geometric Deep Learning. Generative models based on VAE, GAN, autoregressive models, invertible networks, diffusion models, normalising flows, and generative GNNs. Advanced learning techniques: transfer learning, domain adaptation, adversarial learning, self-supervised and contrastive learning, model distillation. Explainable and interpretable AI. Quantum Machine Learning on near-term quantum devices.
Approximately 50% of the course consists of traditional classroom lectures supplemented by slides, aimed at providing advanced knowledge of Deep Learning techniques. The remaining 50% consists of hands-on computational sessions that provide some of the practical skills necessary to autonomously develop and implement advanced Deep Learning models for solving various problems in physics and in scientific research in general.
Indispensable prerequisites: basic concepts in machine learning, Python programming, standard Python libraries (numpy, pandas, matplotlib, torch/pytorch)
a basic python course on YT (many others available on the web): https://youtu.be/_uQrJ0TkZlc
tutorial on numpy, matplotlib, pandas: https://jakevdp.github.io/PythonDataScienceHandbook/
basic concepts of ML: Introduction + Part I (ch. 5: ML basics) of the book I. Goodfellow et al.: https://www.deeplearningbook.org/
tutorials on pytorch web site: https://pytorch.org/
an introductory course on pytorch on YT (many others available on the web): https://youtu.be/c36lUUr864M
Depending on the requirements of their specific PhD programme, each student can decide how many lectures/hands-on sessions to attend to fulfil the required hours: 20h, 40h or 60h (60h corresponds to the whole course).
Discussion group (telegram group):
Calendar: (in preparation)
Aula Marcello Conversi: Department of Physics, G. Marconi building (1st floor)
Aula Giustina Baroni: Department of Physics, E. Fermi building (2nd floor)
labSS: Department of Physics, G. Marconi building (1st floor)
Students' e-mail addresses and data for the CINECA HPC system accounts
enter the requested data in the Google spreadsheet available here (by the end of March 2024)
Bibliography/References and detailed topics treated during lectures, slides, notebooks, etc.
Given the highly dynamic nature of the topics covered in the course, there is no single reference text. During the course the sources will be indicated and provided from time to time in the form of scientific and technical articles and book chapters.
Some classic readings on Deep Learning based on differentiable neural networks:
DL: I. Goodfellow, Y. Bengio, A. Courville: Deep Learning, MIT Press (https://www.deeplearningbook.org/)
PB: P. Baldi, Deep Learning in Science, Cambridge University Press
DL2: C. Bishop, Deep Learning, Springer
GRL: W. L. Hamilton, Graph Representation Learning Book, McGill Uni press (https://www.cs.mcgill.ca/~wlh/grl_book/files/GRL_Book.pdf)
lecture 1 - 27.2.2024 (slides, recording) h17:00-19:00
course information
ANN 101: (DL ch 6 (6.1, 6.2, 6.3, 6.4, 6.5), BIS ch 5 (5.1, 5.2, 5.3, 5.5), DUDA ch 6 (6.1, 6.2, 6.3, 6.4, 6.5, 6.8))
artificial neuron model and MLPs
activation functions for hidden and output layers
training of an ANN
loss functions
SGD, momentum, learning rate, variable lr and optimizers
learning curves, bias-variance tradeoff and double descent in DNN
regularisation
dropout
early stopping
noise injection
data augmentation
weight regularisation L1, L2, L1+L2
a simple example of a shallow MLP implemented in pytorch
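The items of lecture 1 (MLP, loss function, SGD with momentum, dropout, L2 weight regularisation) can be put together in a minimal sketch such as the one below. It is not the notebook used in class: the toy regression task (y = x^2), the layer sizes and the hyperparameters are assumptions chosen only for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)  # reproducible toy run

# Toy regression data (an assumption for illustration): y = x^2 on [-1, 1]
x = torch.linspace(-1.0, 1.0, 64).unsqueeze(1)
y = x ** 2

# Shallow MLP: a single hidden layer, with dropout as a regulariser
model = nn.Sequential(
    nn.Linear(1, 16),
    nn.Tanh(),
    nn.Dropout(p=0.1),
    nn.Linear(16, 1),
)
loss_fn = nn.MSELoss()
# SGD with momentum; weight_decay adds an L2 penalty on the parameters
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.05, momentum=0.9, weight_decay=1e-4
)


def evaluate() -> float:
    """Mean squared error on the toy data with dropout disabled."""
    model.eval()
    with torch.no_grad():
        return loss_fn(model(x), y).item()


initial_loss = evaluate()
for step in range(500):
    model.train()                # enable dropout
    optimizer.zero_grad()        # clear gradients from the previous step
    loss = loss_fn(model(x), y)  # forward pass (full batch)
    loss.backward()              # back-propagation
    optimizer.step()             # parameter update
final_loss = evaluate()

print(f"loss before training: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Switching between `model.train()` and `model.eval()` matters here because dropout is active only in training mode.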
lecture 2 - 28.2.2024 (slides, recording) h16:00-18:00
Convolutional-NN (DL ch 9 (9.1, 9.2, 9.3, 9.4, 9.7, 9.8, 9.9, 9.10, 9.11))
image representation and input properties of a CNN (symmetry, translation invariance, self-similarity, compositionality, locality) and learned convolutional filters (DL ch 9 (9.1,9.2, 9.4))
local receptive field
convolution and shared weights
pooling layers
CNN architectures: LeNet, AlexNet, VGG, Inception, ResNet, DenseNet, ...
Analysis of sequences: task definition and problems (DL ch 10 (intro, 10.2))
Elementary RNN cell: structure and operating principle
StackedRNN, Bidirectional RNN, Encoder-Decoder RNN (seq2seq)
Back-propagation through time
Long-term correlations and the vanishing and exploding gradient problems: Gated Cell and Long Short Term Memory RNN
LSTM: description of operations (note by C.Olah and for details DL ch 10 (10.10))
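For reference, the LSTM operations listed above can be summarised in one common formulation, given below. The symbol names (W, U, b for input weights, recurrent weights and biases) are a notational assumption and may differ from those used in the slides.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) && \text{(input gate)} \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) && \text{(output gate)} \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)} \\
h_t &= o_t \odot \tanh\left(c_t\right) && \text{(hidden state)}
\end{aligned}
```

The additive update of the cell state c_t is what lets gradients propagate over many time steps without vanishing as quickly as in an elementary RNN cell.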
Hands-on 1 - 5.3.2024 (notebook, recording) h 14:00-16:00
hands-on: pytorch library usage, an example of auto-grad based optimisation, implementation of a simple CNN to solve a classification task
lecture 3 - 6.3.2024 (slides, recording) h 16:00-18:00
learning methods:
learning paradigms recall (supervised, unsupervised, reinforcement learning)
semi-supervised learning
self-supervised learning
contrastive learning, simCLR (DL2 6.3.5)
non-contrastive learning, Barlow twins (https://arxiv.org/pdf/2103.03230.pdf)
deep residual learning, denoisers based on residual learning
transfer learning and domain adaptation (DL2 6.3.4)
knowledge transfer based on knowledge distillation (https://arxiv.org/abs/1503.02531)
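As an illustration of knowledge distillation, the sketch below computes a soft-target loss between teacher and student logits in plain Python. It uses a commonly adopted form (KL divergence between temperature-softened distributions, scaled by T^2); the function names and the example logits are hypothetical, and a practical setup would typically combine this term with the usual hard-label loss.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a flatter output."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student prediction
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl


# Hypothetical logits for a 3-class problem
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 0.2]
print(distillation_loss(student, teacher, temperature=2.0))
```

The T^2 factor keeps the gradient magnitude of the soft-target term comparable when the temperature is changed.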
hands-on 2 - 12.3.2024 (notebook (recording not available)) h: 14:00-16:00
hands-on: use of skip connection and denoising CNN (DnCNN) exercise description & assignment
lecture 4 - 13.3.2024 (slides, recording) h: 16:00-18:00
neural architectures for object detection and segmentation (DL2: 10.4, 10.5)
semantic segmentation, downsampling-upsampling (arXiv:1411.4038, arXiv:1505.04366)
object detection, IoU, anchor boxes, non-max suppression
region proposals, R-CNN, Fast and Faster R-CNN (arXiv:1311.2524, arXiv:1506.01497)
YOLO and SSD models (arXiv:1506.02640, arXiv:1512.02325)
Instance segmentation, Mask R-CNN (arXiv:1703.06870)
Pose
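Two of the building blocks of lecture 4, intersection over union (IoU) and non-max suppression, can be sketched in a few lines of plain Python. The box format (x1, y1, x2, y2), the function names and the threshold value are assumptions made for this example; detection libraries provide optimised versions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # keep box i only if it does not overlap strongly with any kept box
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))  # indices of the boxes that survive
```

In this example the second box overlaps the first with IoU = 81/119, above the 0.5 threshold, so it is suppressed.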
hands-on 3 - 19.3.2024 (notebook, recording) h: 14:00-16:00
hands-on: DnCNN solution + SSL Barlow Twins exercise assignment
hands-on 4 - 20.3.2024 (notebook, recording) h: 16:00-18:00
hands-on: use of the MONAI library for semantic segmentation of medical images
hands-on 5 - 27.3.2024 (notebook, slides, recording) h:16:00-18:00
Barlow Twins problem solution + introduction to implementation of Bayesian NN
lecture 5 - 3.4.2024 (slides, recording) h:16:00-18:00
uncertainty quantification in ANN
IID and OOD data
types of uncertainty: approximation, epistemic and aleatoric uncertainty
calibration error and ECE (https://arxiv.org/pdf/1706.04599.pdf)
ensemble methods:
deep ensembles (https://arxiv.org/pdf/1612.01474.pdf)
MC dropout (https://arxiv.org/pdf/1506.02142.pdf)
Bayesian-NN (https://arxiv.org/pdf/2007.06823.pdf)
Pyro library (https://github.com/pyro-ppl/pyro): example implementation of an MCMC-based BNN for a simple regression task (see DL ch 17 for a simple introduction to MCMC)
conformal predictions (https://arxiv.org/pdf/2107.07511.pdf)
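The expected calibration error (ECE) mentioned above compares, bin by bin, the average confidence of the predictions with their observed accuracy. The sketch below is a plain-Python illustration with equal-width bins; the function name, the bin-edge convention and the example numbers are assumptions, not taken from the lecture material.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width confidence bins.

    confidences: predicted probability of the chosen class, in [0, 1]
    correct:     1 if the prediction was right, 0 otherwise
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # bin m covers [m/n_bins, (m+1)/n_bins); conf == 1.0 goes to the last bin
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        # weight the |accuracy - confidence| gap by the fraction of samples in the bin
        ece += len(members) / n * abs(accuracy - avg_conf)
    return ece


# Four predictions made with 90% confidence, three of them correct
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

A perfectly calibrated model has ECE equal to zero; in the example the model is over-confident by 0.15.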
lecture 6 - 9.4.2024 (slides, recording) h:14:00-16:00
Graph Neural Networks (DL2 ch 13, GRL sections 5 and 6, PyTorch Geometric web site)
introduction
graphs and representation
permutation equivariance
graph convolutions and message passing
basic GCN layer
self-loop GCN
graph attention networks
solutions to the over-smoothing problem in GNNs
normalization
GNN python libraries
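A single graph-convolution layer with self-loops and symmetric degree normalisation, H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W), can be sketched in plain Python as follows. This is an illustration written with nested lists for small graphs, with hypothetical function names; it is not the PyTorch Geometric implementation used in the hands-on sessions.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adjacency, features, weight):
    """One GCN layer: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adjacency)
    # add self-loops so that each node also keeps its own features
    a_hat = [
        [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    deg = [sum(row) for row in a_hat]
    # symmetric normalisation of the messages by the node degrees
    norm = [
        [a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
        for i in range(n)
    ]
    out = matmul(matmul(norm, features), weight)  # aggregate, then transform
    return [[max(0.0, v) for v in row] for row in out]  # ReLU


# Path graph 0 - 1 - 2 with two features per node and an identity weight matrix
A = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
print(gcn_layer(A, H, W))
```

Relabelling the nodes permutes the rows of the output in the same way, which is the permutation equivariance property listed above.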
hands-on 6 - 10.4.2024 (notebook, recording) h: 16:00-18:00
Graph Neural Networks and Pytorch Geometric
hands-on 7 - 16.4.2024 (notebook, recording) h: 14:00-16:00
PointCloud classification with GNNs
lecture 7 - 17.4.2024 (slides, recording) h: 16:00-18:00
attention mechanism
the RNNSearch encoder-decoder model (arXiv:1409.0473)
attention and the Nadaraya-Watson kernel estimator
attention layers vs fully connected layers
transformer architecture (arXiv:1706.03762)
word embedding (brief overview)
(masked) multi head (self) attention based on scaled dot product
layer normalization
positional embedding
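The scaled dot-product attention at the core of the transformer, softmax(Q K^T / sqrt(d_k)) V, can be sketched for a single head in plain Python as below. The function names and the `causal` flag (used here to illustrate masking) are assumptions of this example; the hands-on sessions use PyTorch tensors instead of nested lists.

```python
import math


def softmax(scores):
    """Numerically stable softmax; entries equal to -inf get zero weight."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values, causal=False):
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        # similarity between query i and every key, scaled by sqrt(d_k)
        scores = [
            sum(qc * kc for qc, kc in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        if causal:
            # masked attention: position i cannot attend to positions j > i
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        weights = softmax(scores)
        # output i is the attention-weighted average of the value vectors
        outputs.append(
            [sum(w * v[c] for w, v in zip(weights, values)) for c in range(len(values[0]))]
        )
    return outputs


Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[0.0, 2.0], [4.0, 6.0]]
print(scaled_dot_product_attention(Q, K, V))
print(scaled_dot_product_attention(Q, K, V, causal=True))
```

Multi-head attention applies this operation in parallel to several learned linear projections of the queries, keys and values and concatenates the results.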
hands-on 8 - 23.4.2024 (notebook, recording) h: 14:00-16:00
implementation of attention layers and of the transformer architecture
hands-on 9 - 24.4.2024 (slides, notebook, recording) h: 16:00-18:00
modern evolutions of transformer architectures: BERT/GPT/GPT2/GPT3/...
vision transformer (arXiv:2010.11929)
multimodal transformers
assignment of project #4 (ViT)
lecture 8 - 30.4.2024 (slides, recording) h: 14:00-16:00
AI explainability and interpretability
hands-on 10 - 7.5.2024 (notebook, recording) h: 14:00-16:00
xAI examples + PointNet++ solution
hands-on 11 - 8.5.2024 (notebook, recording) h: 16:00-18:00
ViT solution
lecture 9 - 14.5.2024 h:14:00-16:00
Q&A with the instructors
lecture 10 - 15.5.2024 (slides, recording) h: 16:00-18:00
AutoEncoders (DL ch 14, DL2 19.1)
under-complete auto-encoders
linear AE and PCA
over-complete AEs
denoising AE
sparse AE
contractive AE
AE for self-supervised anomaly detection
examples of implementation in pytorch
lecture 11 - 21.5.2024 (slides, recording) h: 14:00-16:00
seminar from Sergio Orlandini on HPC resources and AI parallel programming at CINECA - part 1
lecture 12 - 22.5.2024 (slides, recording) h: 16:00-18:00
generative DL (DL ch. 20, DL2 ch 11.3, 19.2)
autoregressive models
latent variable models
VAE, ELBO theorem
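For reference, the evidence lower bound (ELBO) used to train a VAE follows from the decomposition below, written in the standard notation (encoder q_phi(z|x), decoder p_theta(x|z), prior p(z)); the notation in the slides may differ.

```latex
\begin{aligned}
\log p_\theta(x)
  &= \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log \frac{p_\theta(x, z)}{q_\phi(z \mid x)}\right]
   + D_{\mathrm{KL}}\!\left(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\right) \\
  &\geq \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
   - D_{\mathrm{KL}}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right)
   \;=\; \mathrm{ELBO}(\theta, \phi; x)
\end{aligned}
```

The inequality holds because the KL divergence between the approximate and the true posterior is non-negative; the bound is tight when the two coincide. The first term of the ELBO is a reconstruction term, the second regularises the encoder towards the prior.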
lecture 13 - 29.5.2024 (slides, recording) h: 16:00-18:00
generative DL part 2 (DL ch. 20, DL2 ch. 17, 18, 20)
GANs
flow model: normalising flow models
deep diffusion probabilistic models
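The three families of generative models in this lecture can be summarised by their defining relations, given here in commonly used notation (generator G, discriminator D, invertible map f, noise schedule beta_t); these symbols are assumptions and may not match the slides.

```latex
% GAN: minimax game between generator G and discriminator D
\min_G \max_D \;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\!\left[\log\left(1 - D(G(z))\right)\right]

% Normalising flow: change of variables for an invertible map z = f(x)
\log p_X(x) = \log p_Z\!\left(f(x)\right)
  + \log\left|\det \frac{\partial f(x)}{\partial x}\right|

% Diffusion model: Gaussian forward (noising) process and its closed form
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\, x_{t-1},\ \beta_t I\right),
\qquad
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\, x_0,\ (1-\bar{\alpha}_t) I\right),
\quad \bar{\alpha}_t = \prod_{s=1}^{t} (1-\beta_s)
```

Flows give an exact likelihood at the price of requiring invertible layers with a tractable Jacobian determinant, while diffusion models learn to reverse the fixed noising process step by step.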
lecture 14 - 4.6.2024 (slides, recording) h: 14:00-16:00
introduction to Quantum Computation and QML (https://cernbox.cern.ch/s/mygF5uKp2D8Pp7g)
hands-on 12 - 5.6.2024 (notebook, recording) h: 16:00-18:00
normalizing flow + deep diffusion exercise assignment
hands-on 13 - 11.6.2024 (notebook, recording) h: 14:00-16:00
parametric quantum circuits for classification with pennylane
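The idea of a parametric quantum circuit used as a classifier can be illustrated with a one-qubit toy model simulated in plain Python. This is a stand-in written for this page, not the PennyLane API used in the hands-on session: the circuit (data-encoding rotation RY(x) followed by a trainable rotation RY(theta), with the expectation value of Z as output), the toy dataset and the hyperparameters are all assumptions.

```python
import math


def ry(angle, state):
    """Apply the rotation RY(angle) to a single-qubit state with real amplitudes."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    a0, a1 = state
    return (c * a0 - s * a1, s * a0 + c * a1)


def circuit(x, theta):
    """Encode x with RY(x), apply the trainable RY(theta), return <Z>."""
    state = (1.0, 0.0)        # start in |0>
    state = ry(x, state)      # data encoding
    state = ry(theta, state)  # trainable layer
    a0, a1 = state
    return a0 * a0 - a1 * a1  # expectation value of Pauli Z


def parameter_shift_grad(x, theta):
    """Derivative of <Z> with respect to theta via the parameter-shift rule."""
    shift = math.pi / 2
    return 0.5 * (circuit(x, theta + shift) - circuit(x, theta - shift))


# Toy dataset (hypothetical): small angles have label +1, angles near pi have label -1
data = [(0.1, 1.0), (0.3, 1.0), (2.8, -1.0), (3.0, -1.0)]


def mse(theta):
    return sum((circuit(x, theta) - y) ** 2 for x, y in data) / len(data)


theta = 1.0
initial_loss = mse(theta)
for _ in range(100):
    # chain rule: dL/dtheta = mean of 2 (f - y) df/dtheta
    grad = sum(
        2 * (circuit(x, theta) - y) * parameter_shift_grad(x, theta) for x, y in data
    ) / len(data)
    theta -= 0.2 * grad  # plain gradient descent
final_loss = mse(theta)

print(f"theta = {theta:.3f}, loss: {initial_loss:.4f} -> {final_loss:.4f}")
```

The sign of the circuit output is used as the predicted class. The parameter-shift rule is what makes such gradients measurable on hardware, since it only needs two additional circuit evaluations per parameter.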
lecture 15 - 12.6.2024 (slides, recording) h: 16:00-18:00
seminar from Sergio Orlandini on HPC resources and AI parallel programming at CINECA - part 2