Applied and Computational Mathematics Seminar
Department of Mathematics and Statistics
Date and time: Sep 05 at 2:00 pm (Parker 328)
Title: Exploiting Low-Dimensional Data Structures and Understanding Neural Scaling Laws of Transformers
Abstract: When training deep neural networks, a model’s generalization error is often observed to follow a power scaling law that depends on the model size and the data size. A prominent example is transformer-based large language models (LLMs), where networks with billions of parameters are trained on trillions of tokens. A question of theoretical interest for LLMs is why transformer scaling laws emerge. In this talk, we exploit low-dimensional structures in language datasets by estimating their intrinsic dimension, and we establish statistical estimation and mathematical approximation theories for transformers to predict the scaling laws. This perspective shows that transformer scaling laws can be explained in a manner consistent with the underlying data geometry. We further validate our theory with empirical observations of LLMs and find strong agreement between the observed empirical scaling laws and our theoretical predictions. Finally, we turn to in-context learning, analyzing its scaling behavior by uncovering a connection between the attention mechanism in transformers and classical kernel methods in machine learning.
Date and time: Sep 19 at 2:00 pm (Parker 328)
Title: On some foundational issues in feedback control
Abstract: The remarkable success of closed-loop control in mitigating the effect of uncertainty on a system’s performance has undoubtedly enabled much of the technological world around us. Indeed, feedback regulation can be found “under the hood” in the functioning of engines, the workings of biological organisms, interplanetary navigation, GPS tracking, robotics and more. While the mitigation of uncertainty has been at the heart of control theory since its inception, explicit control of uncertainty is a relatively recent development that has garnered much attention. In this setting, a main object of study is the Liouville (continuity) equation, the PDE governing the evolution of the probability distribution of the state of a dynamical system. While it was widely believed that the basic question of controllability of the Liouville equation had been resolved, it escaped the community’s attention for almost two decades that early investigations on the subject fell short of providing a satisfactory answer, even for linear systems. In this talk, we revisit and address this topic and develop a theory for Collective Steering, the endeavor to shepherd an ensemble of dynamical systems between desired configurations using a common feedback law. Our investigation sheds light on a topological obstruction at the heart of the issue that limits the ability to design feedback control laws that are globally continuous with respect to the specifications. Along the way, we touch upon an elegant geometric framework at the intersection of optimal transport, geometric hydrodynamics, and quantum mechanics.
Date and time: Sep 26 at 2:00 pm (Parker 328)
Title: Information Gamma Calculus: Convexity Analysis for Stochastic Differential Equations
Abstract: We study Lyapunov convergence analysis for degenerate and non-reversible stochastic differential equations (SDEs). We apply the Lyapunov method to the Fokker–Planck equation, in which the Lyapunov functional is chosen as a weighted relative Fisher information functional. We derive a structure condition and formulate the Lyapunov constant explicitly. Given a positive Lyapunov constant, we prove exponential convergence of the probability density function towards its invariant distribution in the L1 norm. Several examples are presented: underdamped Langevin dynamics with variable diffusion matrices, quantum SDEs in Lie groups (Heisenberg group, displacement group, and Martinet sub-Riemannian structure), three oscillator chain models with nearest-neighbor couplings, and underdamped mean field Langevin dynamics (weakly self-consistent Vlasov–Fokker–Planck equations). If time permits, some extensions to time-inhomogeneous SDEs will be discussed.
Date and time: Oct 17 at 2:00 pm (Parker 328)
Title: Nonlocal Boundary Value Problems with Local Boundary Conditions
Abstract: We state and analyze nonlocal problems with classically defined, local boundary conditions. The model takes its horizon parameter to be spatially dependent, vanishing near the boundary of the domain. We establish a Green's identity for the nonlocal operator that recovers the classical boundary integral, which permits the use of variational techniques. Using this identity, we show the existence of weak solutions, as well as their variational convergence to classical counterparts as the bulk horizon parameter uniformly converges to zero. In certain circumstances, global regularity of solutions can be established, resulting in improved modes and rates of variational convergence. Generalizations of these results pertaining to models in continuum mechanics and Laplacian learning will also be presented.
Date and time: Nov 07 at 2:00 pm (Parker 328)
Title: Floquet Hamiltonians - Spectrum and Dynamics
Abstract: The last couple of decades have witnessed tremendous experimental progress in the study of "Floquet media," crystalline materials whose properties are changed by applying a time-periodic parametric forcing. The theory of Floquet media has so far been restricted to discrete models, which are often heuristic and approximate. Understanding these materials from their underlying PDE models, such as the Schrödinger equation, remains an open problem.
Specifically, semi-metals such as graphene are known to transform into "Floquet Insulators" under such periodic driving. While this phenomenon is traditionally modeled by a spectral gap, no such gaps are conjectured to form in PDE models. How do we reconcile these seemingly contradictory statements? We prove the existence of an "effective gap", a novel and physically relevant notion that generalizes a (proper) spectral gap. Adopting a broader perspective, we will then survey newer results on dispersion and spectral near-invariance in bulk Floquet materials.