This reading group examines the interplay between the theoretical foundations of deep learning and the practical challenge of making machine learning efficient. On the theory side, we study mathematical insights into optimization, generalization, architectures, and training dynamics, exploring how these principles explain why modern networks learn and generalize. On the efficiency side, we investigate how such insights inform resource-aware and scalable ML, including algorithmic improvements, architecture design, and hardware-aware training. By linking rigorous theory with practical efficiency, the group aims to understand not only why deep learning works, but also how to make it faster, cheaper, and more sustainable in real-world applications, all while keeping pace with cutting-edge research.
Topics include, but are not limited to:
Optimization & Training Dynamics
Gradient-based methods: SGD vs. full-batch GD, adaptive optimizers, Hessian-based methods
Implicit regularization, convergence, and stationary points (local minima, saddles, valleys, mode connectivity)
Training trajectories: smoothness, loss landscape geometry, and overparameterization effects
Training-efficiency benchmarks; distributed and collaborative learning
Generalization & Robustness
Double descent, benign overfitting, PAC-Bayes bounds, information-theoretic perspectives
The neural tangent kernel (NTK) regime vs. finite-width networks, memorization, stability
Invariance, equivariance, data augmentation, robustness, multi-task and continual learning
Test-time adaptation, reconfiguration, and transfer learning
Architectures & Model Efficiency
Initialization, lottery tickets, subnetwork discovery, and sparsity
Efficient network architectures, neural architecture search (NAS), meta-learning
Multi-modal models, mixtures of experts (MoEs), model merging, ensembling
Scaling laws and their effect on efficiency and generalization
Emerging Paradigms & Practical Efficiency
Low-resource and low-data machine learning
On-device learning, edge computing, hardware-aware training
Resource-efficient ML paradigms: energy-efficient training, inference optimization
Linking theoretical insights to the design of scalable, sustainable ML systems
This reading group was formed in late 2025 by merging two ELLIS reading groups that had been running independently since 2022: the ELLIS Reading Group on Mathematics of Deep Learning and the Efficient ML reading group. The archive of the Mathematics of Deep Learning group's talks from 2022–2025 remains available on its original site.