Titles and Abstracts:
Title: Accelerated Gradient Methods Through Variable and Operator Splitting (Long Chen)
Abstract: This talk introduces a unified framework for accelerated gradient methods through variable and operator splitting (VOS). The operator splitting decouples the optimization process into simpler subproblems, and, more importantly, the variable splitting leads to acceleration. The key contributions include the development of strong Lyapunov functions to analyze stability and convergence rates, as well as advanced discretization techniques such as Accelerated Over-Relaxation (AOR) and extrapolation by predictor-corrector methods (EPC). For the convex case, we introduce a dynamically updated parameter and a perturbed VOS flow. The framework effectively handles a wide range of optimization problems, including convex optimization, composite convex optimization, and saddle-point systems with bilinear coupling.
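To fix ideas, here is a minimal sketch of a variable-splitting accelerated flow of the kind this framework studies, written for a $\mu$-strongly convex objective $f$ (the notation is ours; the talk's precise VOS system may differ):
$$ x' = y - x, \qquad y' = x - y - \frac{1}{\mu}\nabla f(x). $$
A direct computation shows that the Lyapunov function $\mathcal{E}(x,y) = f(x) - f(x^\star) + \frac{\mu}{2}\|y - x^\star\|^2$ satisfies $\mathcal{E}' \le -\mathcal{E}$ along trajectories, which is the kind of strong Lyapunov property that yields accelerated rates after a suitable discretization.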
Title: Basics of Probability Theory and Generative Modelling (Jonathan W. Siegel)
Abstract: In this talk, we will review the basics of probability theory and discuss two general approaches for modelling probability distributions using neural networks.
Title: Invertible Flow Generative Models (Jonathan W. Siegel)
Abstract: In this talk, we describe generative modelling using invertible flow networks.
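For orientation, the central identity behind invertible flow models is the change-of-variables formula: if $f$ is an invertible network mapping data $x$ to a latent variable $z = f(x)$ with a tractable base density $p_Z$, then
$$ \log p_X(x) = \log p_Z(f(x)) + \log\left|\det \frac{\partial f}{\partial x}(x)\right|, $$
and training maximizes this log-likelihood over the network parameters.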
Title: Generative Adversarial Networks (part 1) (Jonathan W. Siegel)
Abstract: In this talk, we introduce and analyze the original generative adversarial network (GAN).
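For reference, the original GAN of Goodfellow et al. trains a generator $G$ and a discriminator $D$ via the minimax problem
$$ \min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))], $$
whose inner maximizer is $D^\star(x) = p_{\mathrm{data}}(x) / (p_{\mathrm{data}}(x) + p_G(x))$; at the optimal discriminator, the generator minimizes the Jensen-Shannon divergence between $p_{\mathrm{data}}$ and $p_G$ (up to constants).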
Title: Generative Adversarial Networks (part 2) (Jonathan W. Siegel)
Abstract: In this talk, we study GAN variants that are based on different adversarially defined metrics on probability distributions.
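One representative example of such a metric (the one used by the Wasserstein GAN; the talk may treat others as well) is the Wasserstein-1 distance in its Kantorovich-Rubinstein dual form,
$$ W_1(p, q) = \sup_{\|D\|_{\mathrm{Lip}} \le 1} \; \mathbb{E}_{x \sim p}[D(x)] - \mathbb{E}_{x \sim q}[D(x)], $$
where the discriminator $D$ ranges over 1-Lipschitz functions.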
Title: Energy-based Generative Modelling (Jonathan W. Siegel)
Abstract: In this talk, we give a numerical example of invertible flow generative models. We also introduce and discuss energy-based methods for generative modelling using neural networks.
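As a concrete illustration of how one samples from an energy-based model, here is a minimal Python sketch of unadjusted Langevin dynamics on a toy double-well energy; the energy and all parameter choices are illustrative stand-ins for a learned network, not material from the talk:

    import numpy as np

    rng = np.random.default_rng(0)

    def energy_grad(x):
        # Gradient of the toy double-well energy E(x) = (x^2 - 1)^2,
        # standing in for the gradient of a learned neural energy.
        return 4.0 * x * (x**2 - 1.0)

    # Unadjusted Langevin dynamics: x <- x - eta * grad E(x) + sqrt(2 eta) * noise.
    eta, n_steps = 1e-3, 5000
    x = rng.normal(size=1000)  # 1000 parallel chains, standard normal init
    for _ in range(n_steps):
        x = x - eta * energy_grad(x) + np.sqrt(2.0 * eta) * rng.normal(size=x.shape)

    # Samples should concentrate near the two modes at x = -1 and x = +1.
    print("mean of |x| after sampling:", np.abs(x).mean())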
Title: Optimal Error Estimates for Shallow Neural Networks Part 1 & 2 (Tong Mao)
Abstract: We establish a novel integral representation for Sobolev spaces, showing that every function in $H^{\frac{d+2k+1}{2}}(\Omega)$ can be expressed as an $L^2$-weighted integral of ReLU$^k$ ridge functions over the unit sphere. This result mirrors the known representation of Barron spaces and highlights a fundamental connection between Sobolev regularity and neural network representations. Moreover, we prove that linearized shallow networks, constructed by fixing the inner parameters and optimizing only the linear coefficients, achieve the optimal approximation rate $O(n^{-\frac{1}{2}-\frac{2k+1}{2d}})$ in Sobolev spaces.
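Schematically (in our notation, with $\tilde{x} = (x, 1)$ absorbing the bias), such a representation reads
$$ f(x) = \int_{\mathbb{S}^d} a(\omega)\, \big(\omega \cdot \tilde{x}\big)_+^k \, d\omega, \qquad a \in L^2(\mathbb{S}^d), $$
so a shallow ReLU$^k$ network can be viewed as a quadrature of this integral, and the linearized networks above correspond to fixing the quadrature nodes $\omega_i$ and optimizing the weights.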
Title: Optimal Error Estimates for Deep Neural Networks without the Curse of Dimensionality (Yahong Yang)
Abstract: In this class, we will discuss deep neural network approximation in Korobov spaces, which provides a way to overcome the curse of dimensionality. We will first introduce two kinds of optimal approximation rates for Korobov spaces: one is the optimal rate in the continuous approximation setting, and the other is a nearly optimal rate in terms of deep neural network complexity. We will then consider how energy-based sparse grids and symmetry assumptions can be incorporated to further overcome the curse of dimensionality.
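For reference, one common definition of the Korobov space with mixed smoothness of order two is
$$ X^{2,p}(\Omega) = \{ f \in L^p(\Omega) : D^{\alpha} f \in L^p(\Omega) \ \text{for all } |\alpha|_\infty \le 2 \}, $$
i.e., all mixed partial derivatives of order at most two in each coordinate direction are $p$-integrable; it is exactly this bounded mixed smoothness that sparse-grid constructions exploit.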
Title: Overcoming Training Challenges in Neural Network-Based PDE Solvers: Theory, Algorithms, and Applications (Chuqi Chen)
Abstract: Neural network-based PDE solvers have shown strong expressive capability, yet their practical performance is often limited by poor training behavior: slow convergence, instability, and persistent training error. In this talk, we focus on understanding and reducing training error in neural PDE solvers from both theoretical and algorithmic perspectives. We begin by characterizing training difficulty through a Neural Tangent Kernel (NTK)-based analysis, which provides insight into how architecture, discretization, and problem structure affect optimization dynamics. Building on these insights, we review and unify strategies that improve trainability, including neural network parameter initialization, architecture design, loss function formulation, and training methodologies.
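As a small illustration of the kind of NTK-based diagnostic this line of work builds on, here is a hedged Python sketch (using PyTorch; the architecture, sample points, and the use of the condition number as a difficulty proxy are our illustrative choices, not the speaker's code) that assembles an empirical NTK Gram matrix for a tiny network and inspects its spectrum:

    import torch

    # Tiny 1D model; real PDE solvers would use deeper networks and PDE residual losses.
    net = torch.nn.Sequential(
        torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
    )
    params = list(net.parameters())

    def jacobian_row(x):
        # Gradient of the scalar output u(x) with respect to all parameters, flattened.
        out = net(x.view(1, 1)).squeeze()
        grads = torch.autograd.grad(out, params)
        return torch.cat([g.reshape(-1) for g in grads])

    xs = torch.linspace(-1.0, 1.0, 20)              # 20 collocation points
    J = torch.stack([jacobian_row(x) for x in xs])  # (20, n_params) Jacobian
    K = J @ J.T                                     # empirical NTK Gram matrix

    # An ill-conditioned NTK signals slow, uneven convergence of gradient descent.
    eigs = torch.linalg.eigvalsh(K)
    print("NTK condition number:", (eigs[-1] / eigs[0]).item())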