Kinetic theory describes the interaction of many particles, and it naturally finds use in machine learning. In particular, some machine learning algorithms rely heavily on interactions among particles, and kinetic theory provides a natural venue for validating these algorithms.
01
People hold different views of data science. Whichever perspective researchers take, it is clear that the basic numerical object is shifting from vectors in Euclidean space to probability measures, which live on an infinite-dimensional manifold. We study different approaches to reconstructing a probability measure from data.
Plots: the reconstructed solutions when the objective function is the entropy or the second moment (under-determined setting).
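For orientation, a standard way to pose such problems (a generic sketch, not the precise formulations of the papers below) is to minimize an energy E over probability measures and follow its Wasserstein gradient flow:

\[
\min_{\rho \in \mathcal{P}(\mathbb{R}^d)} E(\rho), \qquad \partial_t \rho = \nabla \cdot \Big( \rho \, \nabla \tfrac{\delta E}{\delta \rho} \Big).
\]

For the entropy \(E(\rho)=\int \rho\log\rho\,dx\) this flow is the heat equation \(\partial_t\rho=\Delta\rho\); for the second moment \(E(\rho)=\tfrac12\int |x|^2\rho\,dx\) it is the transport equation \(\partial_t\rho=\nabla\cdot(\rho x)\), which drives mass toward the origin.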
Differential-Equation Constrained Optimization With Stochasticity, Qin Li, Li Wang, Yunan Yang, SIAM-UQ, 12(2)
Accelerating optimization over probability measure space, S. Chen, Q. Li, O. Tse and S. Wright, JMLR, accepted, https://www.jmlr.org/papers/volume26/23-1288/23-1288.pdf
A Good Score Does not Lead to A Good Generative Model, S. Li, S. Chen and Q. Li, rejected, https://arxiv.org/abs/2401.04856
Forward Euler for Wasserstein Gradient Flows: Breakdown and Regularization, submitted, Yewei Xu, Qin Li, https://arxiv.org/abs/2509.13260
Inverse Problems Over Probability Measure Space, submitted, Qin Li, Maria Oprea, Li Wang, Yunan Yang, https://arxiv.org/abs/2504.18999
Least-Squares Problem Over Probability Measure Space, notes, Qin Li, Li Wang, Yunan Yang, https://arxiv.org/abs/2501.09097
02
Plots on the right: top left: the medium (in an elliptic equation); top right: the D-optimal objective (determinant of the Hessian); bottom left: the gradient-flow pattern; bottom right: the identified optimal distribution of source and receiver locations.
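For background on the D-optimal objective in the plot (stated here in a generic linearized form as an illustration, not the exact setting of the papers below): a design is a probability measure \(\mu\) over candidate source/receiver configurations \(\theta\), and D-optimality maximizes the log-determinant of the resulting information matrix,

\[
\max_{\mu \in \mathcal{P}(\Theta)} \; \log\det\Big( \int_{\Theta} A(\theta)^{\top} \Gamma^{-1} A(\theta)\, d\mu(\theta) \Big),
\]

where \(A(\theta)\) is the linearized forward map at configuration \(\theta\) and \(\Gamma\) the noise covariance; a gradient flow of this objective over \(\mu\) produces the optimal distribution of sources and receivers.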
Unique reconstruction for discretized inverse problems: a random sketching approach, Ruhui Jin, Qin Li, Anjali Nair, Samuel Stechmann, Inverse Problems, 41(4)
Optimal experimental design via gradient flow, R. Jin, M. Guerra, Q. Li, S. Wright, submitted, https://arxiv.org/abs/2401.07806
Continuous nonlinear adaptive experimental design with gradient flow, Ruhui Jin, Qin Li, Stephen O. Mussmann, Stephen J. Wright, submitted, https://arxiv.org/abs/2411.14332
Data selection: at the interface of PDE-based inverse problem and randomized linear algebra, Kathrin Hellmuth, Ruhui Jin, Qin Li, Stephen J. Wright, invited, https://arxiv.org/abs/2510.01567
Structured Random Sketching for PDE Inverse Problems, K. Chen, K. Newton, Q. Li and S. Wright, SIAM-Matrix, 41(4)
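As a toy illustration of the random-sketching idea that runs through several of the papers above (a plain Gaussian sketch of an overdetermined least-squares problem; the structured, PDE-aware sketches in the papers are more specialized):

import numpy as np

rng = np.random.default_rng(0)

# Overdetermined least-squares problem: recover x from A x ~ b.
m, n, k = 2000, 50, 200                      # sketch m equations down to k << m
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
b = A @ x_true + 0.01 * rng.standard_normal(m)

# Plain Gaussian sketch: k random linear combinations of the equations.
S = rng.standard_normal((k, m)) / np.sqrt(k)
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)

print(np.linalg.norm(x_sketch - x_true) / np.linalg.norm(x_true))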
03
Particles in an interacting particle system exchange information. The hope is that the knowledge gathered by each particle can be combined to collectively sense the correct target distribution.
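A minimal sketch of one interacting-particle update in this spirit: a basic (deterministic, unperturbed) Ensemble Kalman Inversion step for a generic forward map G. The function name and this simple variant are illustrative; the papers below analyze the mean-field limits and convergence of such schemes.

import numpy as np

def eki_step(U, G, y, Gamma):
    # U: (J, d) ensemble of parameter particles; G: forward map R^d -> R^m
    # y: (m,) observed data; Gamma: (m, m) observation-noise covariance
    GU = np.array([G(u) for u in U])           # evaluate the forward map on each particle
    dU = U - U.mean(axis=0)
    dG = GU - GU.mean(axis=0)
    Cug = dU.T @ dG / len(U)                   # (d, m) parameter-output cross-covariance
    Cgg = dG.T @ dG / len(U)                   # (m, m) output covariance
    K = np.linalg.solve(Cgg + Gamma, Cug.T).T  # Kalman-type gain, (d, m)
    return U + (y - GU) @ K.T                  # every particle moves toward the data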
Constrained Ensemble Langevin Monte Carlo. Z. Ding, and Q. Li. Foundations of Data Science, 4(1), 2022
Ensemble Kalman Inversion: mean-field limit and convergence analysis. Z. Ding, and Q. Li. Statistics and Computing, 31(9), 2021
Ensemble Kalman Sampling: mean-field limit and convergence analysis. Z. Ding and Q. Li. SIAM J. Math. Anal. 53(2), 2021
Ensemble Kalman Inversion for nonlinear problems: weights, consistency, and variance bounds. Z. Ding, Q. Li and J. Lu, Foundations of Data Science, 3(3), 2021
Bayesian sampling using interacting particles, S. Chen, Z. Ding and Q. Li, Active Particles, Vol. 4
Swarm-Based Gradient Descent Meets Simulated Annealing, Zhiyan Ding, Martin Guerra, Qin Li and Eitan Tadmor, SIAM-Numerical Analysis, 62(6)
04
Langevin Monte Carlo, the discrete-time version of Langevin dynamics, mimics the dynamics of the Fokker-Planck equation, a classical model in kinetic theory. Can the theory of kinetic equations help shape the numerical analysis of these MCMC methods?
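Concretely (a standard identity, stated here for orientation rather than in the generality of the papers below): the overdamped Langevin dynamics targeting \(\rho_\infty \propto e^{-U}\),

\[
dX_t = -\nabla U(X_t)\, dt + \sqrt{2}\, dB_t,
\]

has a law \(\rho_t\) that evolves by the Fokker-Planck equation

\[
\partial_t \rho = \nabla\cdot(\rho\,\nabla U) + \Delta\rho,
\]

and Langevin Monte Carlo is its Euler-Maruyama discretization

\[
x_{k+1} = x_k - h\,\nabla U(x_k) + \sqrt{2h}\,\xi_k, \qquad \xi_k \sim \mathcal{N}(0, I).
\]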
Random coordinate underdamped Langevin Monte Carlo, Z. Ding, Q. Li, J. Lu and S. Wright. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:2701-2709, 2021
Variance reduction for Langevin Monte Carlo in high dimensional sampling problems. Z. Ding and Q. Li. Advances in Neural Information Processing Systems, 33, 2020
Langevin Monte Carlo: random coordinate descent and variance reduction. Z. Ding and Q. Li. J. Mach. Learn. Res., 22, 2021
Random coordinate Langevin Monte Carlo. Z. Ding, Q. Li, J. Lu and S. Wright, COLT, 134, 1-28, 2021
05
The training process can be well represented as a gradient flow of interacting neurons via a mean-field argument.
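In the simplest two-layer setting, stated only to illustrate the mean-field viewpoint (the papers below study the deep ResNet analogue), the network averages over a distribution \(\mu\) of neurons and training corresponds to a Wasserstein gradient flow of the risk over \(\mu\):

\[
f_\mu(x) = \int \sigma(x;\theta)\, d\mu(\theta), \qquad R(\mu) = \mathbb{E}_{(x,y)}\big[\ell(f_\mu(x), y)\big], \qquad \partial_t \mu = \nabla_\theta \cdot \Big( \mu\, \nabla_\theta \tfrac{\delta R}{\delta \mu} \Big).
\]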
Overparameterization of deep ResNet: zero loss and mean-field analysis, S. Chen, Z. Ding, Q. Li and S. Wright. J. Mach. Learn. Res., 23, 2022
On the Global Convergence of Gradient Descent for multi-layer ResNets in the mean-field regime. S. Chen, Z. Ding, Q. Li and S. Wright. first version, arXiv: 2007.14209, 2021
Correcting auto-differentiation in neural-ODE training. Y. Xu, S. Chen, Q. Li, S. Wright, submitted, 2023, https://arxiv.org/abs/2306.02192