Program

Invited Speakers

Surya Ganguli (Stanford University), "Weaving together machine learning, theoretical physics, and neuroscience" [slides]

Abstract: An exciting area of intellectual activity in this century may well revolve around a synthesis of machine learning, theoretical physics, and neuroscience. The unification of these fields will likely enable us to exploit the power of complex systems analysis, developed in theoretical physics and applied mathematics, to elucidate the design principles governing neural systems, both biological and artificial, and to deploy these principles to develop better algorithms in machine learning. We will give several vignettes in this direction, including: (1) determining the best optimization problem to solve in order to perform regression in high dimensions; (2) finding exact solutions to the dynamics of generalization error in deep linear networks; (3) developing interpretable machine learning to derive and understand state-of-the-art models of the retina; (4) analyzing and explaining the origins of hexagonal firing patterns in recurrent neural networks trained to path-integrate; (5) understanding the geometry and dynamics of high-dimensional optimization in the classical limit of dissipative many-body quantum optimizers.
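
As a rough, unofficial illustration of vignette (2), the sketch below (plain NumPy; all dimensions, hyperparameters, and the teacher model are invented for the demo, and this is not the speaker's code) trains a two-layer linear network on a noisy linear teacher by gradient descent and tracks how training and test error evolve over time.

```python
# Illustrative sketch only (not the speaker's code): gradient-descent training of a
# two-layer *linear* network on a noisy linear teacher, tracking train vs. test error,
# in the spirit of vignette (2). All dimensions and hyperparameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 30, 30, 10
n_train, n_test = 100, 1000
lr, steps, noise = 0.05, 3000, 0.1

# Noisy linear teacher: y = W_bar x + noise
W_bar = rng.normal(size=(d_out, d_in)) / np.sqrt(d_in)
X_tr = rng.normal(size=(d_in, n_train))
Y_tr = W_bar @ X_tr + noise * rng.normal(size=(d_out, n_train))
X_te = rng.normal(size=(d_in, n_test))
Y_te = W_bar @ X_te                              # clean test targets

# Student: two-layer linear network y_hat = W2 W1 x, small random initialization
W1 = 1e-2 * rng.normal(size=(d_hidden, d_in))
W2 = 1e-2 * rng.normal(size=(d_out, d_hidden))

for t in range(steps):
    E = W2 @ W1 @ X_tr - Y_tr                    # training residual
    gW2 = E @ (W1 @ X_tr).T / n_train            # gradients of 0.5 * mean squared error
    gW1 = W2.T @ E @ X_tr.T / n_train
    W2 -= lr * gW2
    W1 -= lr * gW1
    if t % 300 == 0:
        train_err = np.mean(E ** 2)
        test_err = np.mean((W2 @ W1 @ X_te - Y_te) ** 2)
        print(f"step {t:5d}  train {train_err:.4f}  test {test_err:.4f}")
```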


References:

  • M. Advani and S. Ganguli, Statistical mechanics of optimal convex inference in high dimensions, Physical Review X, 6, 031034, 2016.

  • M. Advani and S. Ganguli, An equivalence between high dimensional Bayes optimal inference and M-estimation, NeurIPS, 2016.

  • A.K. Lampinen and S. Ganguli, An analytic theory of generalization dynamics and transfer learning in deep linear networks, International Conference on Learning Representations (ICLR), 2019.

  • H. Tanaka, A. Nayebi, N. Maheswaranathan, L.M. McIntosh, S. Baccus, S. Ganguli, From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction, NeurIPS, 2019.

  • S. Deny, J. Lindsey, S. Ganguli, S. Ocko, The emergence of multiple retinal cell types through efficient coding of natural movies, NeurIPS, 2018.

  • B. Sorscher, G. Mel, S. Ganguli, S. Ocko, A unified theory for the origin of grid cells through the lens of pattern formation, NeurIPS, 2019.

  • Y. Bahri, J. Kadmon, J. Pennington, S. Schoenholz, J. Sohl-Dickstein, and S. Ganguli, Statistical mechanics of deep learning, Annual Review of Condensed Matter Physics, 2020.

  • Y. Yamamoto, T. Leleu, S. Ganguli and H. Mabuchi, Coherent Ising Machines: quantum optics and neural network perspectives, Applied Physics Letters, 2020.

  • B.P. Marsh, Y. Guo, R.M. Kroeze, S. Gopalakrishnan, S. Ganguli, J. Keeling, B.L. Lev, Enhancing associative memory recall and storage capacity using confocal cavity QED, https://arxiv.org/abs/2009.01227


Bio: Surya Ganguli triple-majored in physics, mathematics, and EECS at MIT, completed a PhD in string theory at Berkeley, and did a postdoc in theoretical neuroscience at UCSF. He is now an Associate Professor of Applied Physics at Stanford, where he leads the Neural Dynamics and Computation Lab. His research spans the fields of neuroscience, machine learning, and physics, focusing on understanding and improving how both biological and artificial neural networks learn striking emergent computations. He has been awarded a Swartz Fellowship in computational neuroscience, a Burroughs Wellcome Career Award, a Terman Award, a NeurIPS Outstanding Paper Award, a Sloan Fellowship, a James S. McDonnell Foundation Scholar Award in human cognition, a McKnight Scholar Award in neuroscience, a Simons Investigator Award in the mathematical modeling of living systems, and an NSF CAREER Award.

Ben Adcock (Simon Fraser University), "Deep learning for scientific computing: (closing) the gap between theory and practice"

Abstract: Deep learning is starting to be increasingly used for challenging problems in scientific computing. Theoretically, such efforts are supported by a large and growing body of literature on the existence of deep neural networks with favourable approximation properties. Yet, these results often say very little about practical performance in terms of the traditional pillars of numerical analysis: accuracy, stability, sampling complexity and computational cost. In this talk, I will focus on two distinct problems in scientific computing to which deep learning is being actively applied: high-dimensional function approximation and inverse problems for imaging. In each case, I will first highlight several limitations of current approaches in terms of stability, unpredictable generalization and/or the gap between existence theory and practical performance. Then, I will showcase recent theoretical contributions that show that deep neural networks matching the performance of best-in-class schemes can be computed in both settings. This highlights the potential of deep neural networks, and sheds light on how to achieve robust, reliable and overall improved practical performance.
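
To make the first setting concrete, here is a small, self-contained sketch (not taken from the referenced papers; the target function, architecture, and training settings are assumptions made for the illustration): a one-hidden-layer ReLU network is trained by gradient descent on a limited number of samples of a smooth ten-dimensional function, and its error is then measured on fresh samples.

```python
# Illustrative sketch: approximating a smooth high-dimensional function from limited
# samples with a small ReLU network. All choices below are assumptions for the demo.
import numpy as np

rng = np.random.default_rng(1)
d, n_train, n_test, width = 10, 200, 2000, 100

def f(X):
    # hypothetical smooth target on [-1, 1]^d
    return np.exp(-np.mean(X ** 2, axis=1, keepdims=True))

X_tr = rng.uniform(-1, 1, size=(n_train, d)); y_tr = f(X_tr)
X_te = rng.uniform(-1, 1, size=(n_test, d));  y_te = f(X_te)

# One-hidden-layer ReLU network trained by full-batch gradient descent (manual backprop)
W1 = rng.normal(size=(d, width)) / np.sqrt(d);     b1 = np.zeros(width)
W2 = rng.normal(size=(width, 1)) / np.sqrt(width); b2 = np.zeros(1)
lr, steps = 0.05, 4000

for step in range(steps):
    H = np.maximum(X_tr @ W1 + b1, 0.0)            # hidden activations
    err = (H @ W2 + b2) - y_tr                     # residual, shape (n_train, 1)
    gW2 = H.T @ err / n_train;   gb2 = err.mean(axis=0)
    dH = (err @ W2.T) * (H > 0)                    # backprop through the ReLU
    gW1 = X_tr.T @ dH / n_train; gb1 = dH.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2; W1 -= lr * gW1; b1 -= lr * gb1

H_te = np.maximum(X_te @ W1 + b1, 0.0)
rel_err = np.linalg.norm(H_te @ W2 + b2 - y_te) / np.linalg.norm(y_te)
print(f"relative test L2 error: {rel_err:.3e}")
```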


References:

B. Adcock, S. Brugiapaglia, N. Dexter & S. Moraga, Deep neural networks are effective at learning high-dimensional Hilbert-valued functions from limited data, MSML21 (in revision), 2021.

B. Adcock & N. Dexter, The gap between theory and practice in function approximation with deep neural networks, SIAM J. Math. Data Sci. (to appear), 2021.

B. Adcock & A. C. Hansen. Compressive Imaging: Structure, Sampling, Learning, CUP (in press), 2021.

V. Antun, F. Renna, C. Poon, B. Adcock & A. C. Hansen, On instabilities of deep learning in image reconstruction and the potential costs of AI, Proc. Natl. Acad. Sci. USA 117(48):30088–30095, 2020.


Bio: Ben Adcock is Associate Professor of Mathematics at Simon Fraser University. He received the CAIMS/PIMS Early Career Award (2017), an Alfred P. Sloan Research Fellowship (2015) and a Leslie Fox Prize in Numerical Analysis (2011). He has published over 50 peer-reviewed journal articles, with his work featuring in outlets such as SIAM Review, Proceedings of the National Academy of Sciences, Foundations of Computational Mathematics and SIAM News. His research interests include numerical analysis, mathematics of data science, machine learning, approximation theory and computational harmonic analysis.

Animashree Anandkumar (Caltech/NVIDIA), "Neural operator: A new paradigm for learning PDEs"

Abstract: Partial differential equations (PDEs) lay the foundation for modeling a wide variety of scientific phenomena. Traditional solvers tend to be slow when high-fidelity solutions are needed. We introduce the neural operator, a data-driven approach that aims to learn the solution operator of PDEs directly. Unlike neural networks, which learn mappings between finite-dimensional spaces, the neural operator learns operators between infinite-dimensional function spaces. This makes the neural operator independent of the resolution and grid of the training data and allows for zero-shot generalization to higher-resolution evaluations. We find that the neural operator is able to solve the Navier-Stokes equation in the turbulent regime with a 1000x speedup compared to traditional solvers.
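
A minimal sketch of the underlying mechanism (an informal illustration, not the released neural-operator code; the mode count and weights are invented): a single 1-D spectral-convolution layer applies learned weights to a fixed number of Fourier modes, so the same layer can be evaluated on input functions sampled at any resolution.

```python
# Illustrative 1-D spectral convolution, the kind of building block used in Fourier
# neural operators. Weights act on Fourier modes rather than grid points, so the layer
# is independent of the sampling resolution. All values here are assumptions for the demo.
import numpy as np

rng = np.random.default_rng(0)
n_modes = 16                                                   # retained Fourier modes
W = rng.normal(size=n_modes) + 1j * rng.normal(size=n_modes)   # "learned" spectral weights

def spectral_conv(u):
    """Apply the layer to a function sampled on a uniform periodic 1-D grid of any size."""
    u_hat = np.fft.rfft(u)                               # to Fourier space
    out_hat = np.zeros_like(u_hat)
    k = min(n_modes, u_hat.size)
    out_hat[:k] = W[:k] * u_hat[:k]                      # weight the lowest modes
    return np.fft.irfft(out_hat, n=u.size)               # back to physical space

# The same weights can be evaluated on coarse and fine grids; the outputs agree at
# shared physical points, which is the idea behind zero-shot super-resolution.
for n in (64, 256):
    x = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    print(f"n={n:4d}  output at x=0: {spectral_conv(np.sin(3 * x))[0]:+.6f}")
```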


Bio: Anima Anandkumar is a Bren Professor at Caltech and Director of ML Research at NVIDIA. She was previously a Principal Scientist at Amazon Web Services. She has received several honors, including an Alfred P. Sloan Fellowship, an NSF CAREER Award, Young Investigator Awards from the DoD, and Faculty Fellowships from Microsoft, Google, Facebook, and Adobe. She is part of the World Economic Forum's Expert Network. She is passionate about designing principled AI algorithms and applying them in interdisciplinary applications. Her research focus is on unsupervised AI, optimization, and tensor methods.

Nathan Kutz (University of Washington), "Targeted use of deep learning for physics-informed model discovery"

Abstract: Machine learning and artificial intelligence algorithms are now being used to automate the discovery of governing physical equations and coordinate systems from measurement data alone. However, positing a universal physical law from data is challenging: (i) an appropriate coordinate system must also be identified, and (ii) an accompanying discrepancy model must simultaneously be proposed to account for the inevitable mismatch between theory and measurements. Using a combination of deep learning and sparse regression, specifically the sparse identification of nonlinear dynamics (SINDy) algorithm, we show how a robust mathematical infrastructure can be formulated for simultaneously learning physics models and their coordinate systems. This can be done with limited data and sensors. We demonstrate the methods on a diverse set of examples, showing how data can be maximally exploited for scientific and engineering applications.
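
As a rough illustration of the sparse-regression step (not the speaker's implementation; the test system, candidate library, and threshold are assumptions made for this sketch), sequential thresholded least squares over a library of candidate terms can recover the governing equations of a simulated damped oscillator:

```python
# Illustrative SINDy-style sketch: recover x' = y, y' = -x - 0.1 y from simulated data
# by sparse regression over a library of candidate terms. All settings are assumptions.
import numpy as np

dt, T = 0.001, 20.0
t = np.arange(0.0, T, dt)
X = np.zeros((t.size, 2)); X[0] = [2.0, 0.0]
for i in range(t.size - 1):                              # forward-Euler simulation
    x, y = X[i]
    X[i + 1] = X[i] + dt * np.array([y, -x - 0.1 * y])

dXdt = np.gradient(X, dt, axis=0)                        # numerical time derivatives

x, y = X[:, 0], X[:, 1]                                  # candidate library Theta
Theta = np.column_stack([np.ones_like(x), x, y, x ** 2, x * y, y ** 2])
names = ["1", "x", "y", "x^2", "x*y", "y^2"]

# Sequential thresholded least squares
Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]
for _ in range(10):
    Xi[np.abs(Xi) < 0.05] = 0.0                          # drop small coefficients
    for j in range(2):                                   # refit each equation on active terms
        active = np.abs(Xi[:, j]) > 0
        if active.any():
            Xi[active, j] = np.linalg.lstsq(Theta[:, active], dXdt[:, j], rcond=None)[0]

for j, lhs in enumerate(["x'", "y'"]):
    terms = [f"{Xi[k, j]:+.3f}*{names[k]}" for k in range(len(names)) if Xi[k, j] != 0]
    print(lhs, "=", " ".join(terms))
```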


Bio: Nathan Kutz is the Yasuko Endo and Robert Bolles Professor of Applied Mathematics at the University of Washington, having served as chair of the department from 2007 to 2015. He received the BS degree in physics and mathematics from the University of Washington in 1990 and the PhD in applied mathematics from Northwestern University in 1994. He was a postdoc in the applied and computational mathematics program at Princeton University before taking his faculty position. He has a wide range of interests, from neuroscience to fluid dynamics, where he integrates machine learning with dynamical systems and control.

Jan S Hesthaven (EPFL), "Nonintrusive reduced order models using physics informed neural networks" [slides]

Abstract: The development of reduced order models for complex applications, offering the promise for rapid and accurate evaluation of the output of complex models under parameterized variation, remains a very active research area. Applications are found in problems which require many evaluations, sampled over a potentially large parameter space, such as in optimization, control, uncertainty quantification, and in applications where a near real-time response is needed.


However, many challenges remain unresolved to secure the flexibility, robustness, and efficiency needed for general large-scale applications, in particular for nonlinear and/or time-dependent problems.


After giving a brief general introduction to projection based reduced order models, we discuss the use of artificial feedforward neural networks to enable the development of fast and accurate nonintrusive models for complex problems. We demonstrate that this approach offers substantial flexibility and robustness for general nonlinear problems and enables the development of fast reduced order models for complex applications.
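
As a rough, self-contained illustration of the nonintrusive idea (not the speaker's code; a simple polynomial fit stands in here for the feedforward neural network discussed in the talk, and the parameterized solution is invented for the sketch): snapshots of a parameterized solution give a POD basis via the SVD, the reduced coefficients are regressed against the parameter, and the resulting reduced order model is then evaluated at an unseen parameter value.

```python
# Illustrative nonintrusive reduced-order model: POD basis from snapshots plus a regression
# from the parameter to the POD coefficients (a polynomial fit stands in for the
# feedforward network of the talk). The "full-order" solution below is a made-up example.
import numpy as np

x = np.linspace(0.0, 1.0, 200)
def full_order(mu):
    return np.exp(-mu * x) * np.sin(np.pi * x)

mus_train = np.linspace(0.5, 3.0, 15)
S = np.column_stack([full_order(m) for m in mus_train])  # snapshot matrix (n_x, n_snap)

U, s, _ = np.linalg.svd(S, full_matrices=False)          # POD via the SVD
r = 4                                                    # retained modes (illustrative)
V = U[:, :r]                                             # POD basis
A = V.T @ S                                              # reduced training coefficients

# Nonintrusive map mu -> reduced coefficients (polyfit in place of a neural network)
coef_models = [np.polyfit(mus_train, A[i], deg=4) for i in range(r)]

def rom(mu):
    a = np.array([np.polyval(c, mu) for c in coef_models])
    return V @ a

mu_test = 1.7                                            # unseen parameter value
err = np.linalg.norm(rom(mu_test) - full_order(mu_test)) / np.linalg.norm(full_order(mu_test))
print(f"relative ROM error at mu = {mu_test}: {err:.2e}")
```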

In the second part of the talk, we discuss how to use residual-based neural networks, in which knowledge of the governing equations is built into the network, and show that this has advantages both for training and for the overall accuracy of the model.


Time permitting, we finally discuss the use of reduced order models in the context of prediction, i.e. to estimate solutions in regions of the parameter space beyond that of the initial training. With an emphasis on the Mori-Zwanzig formulation for time-dependent problems, we discuss how to accurately account for the effect of the unresolved and truncated scales on the long-term dynamics and show that accounting for these through a memory term significantly improves the predictive accuracy of the reduced order model.


Bio: After receiving his PhD in 1995 from the Technical University of Denmark, Professor Hesthaven joined Brown University, USA, where he became Professor of Applied Mathematics in 2005. In 2013 he joined EPFL as Chair of Computational Mathematics and Simulation Science, and from 2017 to 2020 he served as Dean of the School of Basic Sciences. Since 2021, he has served as Provost of EPFL.

His research interests focus on the development, analysis, and application of high-order accurate methods for the solution of complex time-dependent problems, often requiring high-performance computing. A particular focus of his research has been the development of computational methods for linear and non-linear wave problems, with recent emphasis on combining traditional methods with machine learning and neural networks, with broad applications including structural health monitoring.

He has received several awards for both his research and his teaching, and has published 4 monographs and more than 160 research papers. He is on the editorial board of 8 journals and serves as Editor-in-Chief of SIAM J. Scientific Computing.

Accepted Papers

The accepted papers can be found in the Proceedings.

The final versions of the extended abstracts and short papers will be published in the open-access CEUR Workshop Proceedings (http://ceur-ws.org/).

Schedule

AAAIMLPS2021schedule_draft.pdf