This reading group examines the interplay between the theoretical foundations of deep learning and the practical challenge of making machine learning efficient. On the theory side, we study mathematical insights into optimization, generalization, architectures, and training dynamics, exploring how these principles explain why modern networks learn and generalize. On the efficiency side, we investigate how such insights inform resource-aware and scalable ML, including algorithmic improvements, architecture design, and hardware-aware training. By linking rigorous theory with practical efficiency, the group aims to understand not only why deep learning works, but also how to make it faster, cheaper, and more sustainable in real-world applications, all while keeping pace with cutting-edge research.
Topics include, but are not limited to:
Optimization & Training Dynamics
Gradient-based methods: SGD vs. full-batch GD, adaptive optimizers, Hessian-based methods
Implicit regularization, convergence, and stationary points (local minima, saddles, valleys, mode connectivity)
Training trajectories: smoothness, loss landscape geometry, and overparameterization effects
Training-efficiency benchmarks; distributed and collaborative learning
Generalization & Robustness
Double descent, benign overfitting, PAC-Bayes bounds, information-theoretic perspectives
The neural tangent kernel (NTK) regime vs. finite-width networks, memorization, stability
Invariance, equivariance, data augmentation, robustness, multi-task and continual learning
Test-time adaptation, reconfiguration, and transfer learning
Architectures & Model Efficiency
Initialization, lottery tickets, subnetwork discovery, and sparsity
Efficient network architectures, neural architecture search (NAS), meta-learning
Multi-modal models, mixtures of experts (MoEs), model merging, ensembling
Scaling laws and their effect on efficiency and generalization
Emerging Paradigms & Practical Efficiency
Low-resource and low-data machine learning
On-device learning, edge computing, hardware-aware training
Resource-efficient ML paradigms: energy-efficient training, inference optimization
Linking theoretical insights to the design of scalable, sustainable ML systems
This reading group was formed in late 2025 by merging two ELLIS reading groups that had been running independently since 2022: the ELLIS Reading Group on Mathematics of Deep Learning and the Efficient ML reading group. The archive of the Mathematics of Deep Learning group's talks from 2022–2025 remains available on its original site.