Talks of Fall 2022

4:00 - 5:00 pm, Sept 7, 2022 (EST), Yue Yu, Lehigh University

Title: Learning Nonlocal Operators for Heterogeneous Material Modeling

Video Slides

Abstract: Constitutive modeling based on continuum mechanics theory has been a classical approach for modeling the mechanical responses of materials. However, when constitutive laws are unknown, or when defects and/or high degrees of heterogeneity are present, these classical models may become inaccurate. In this talk we propose data-driven models that directly utilize high-fidelity simulations and experimental measurements of displacement fields to predict a material's response. Two nonlocal operator regression approaches will be presented, which yield homogenized and heterogeneous surrogates for material modeling, respectively.


In the first approach, we treat nonlocal models as coarse-grained, homogenized models that accurately capture the system's global behavior, and we learn optimal kernel functions directly from data. Combining machine learning identifiability theory, known physics, and nonlocal theory guarantees that the resulting model is mathematically well-posed, physically consistent, and convergent as the data resolution increases. As an application, we employ this "linear operator regression" technique to model wave propagation in heterogeneous bars.
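As a toy illustration of the operator-regression idea (a hedged sketch, not the speaker's actual implementation), the following numpy example learns the kernel of a discretized nonlocal operator by least squares on a 1D periodic grid; the horizon, kernel values, and synthetic data are all invented for the demo:

```python
import numpy as np

rng = np.random.default_rng(0)
n, delta = 64, 5            # grid points; nonlocal horizon in grid cells (hypothetical)

# True symmetric kernel K_1..K_delta of the discretized nonlocal operator
#   (L u)_i = sum_{k=1}^{delta} K_k * [(u_{i+k} - u_i) + (u_{i-k} - u_i)]   (periodic)
K_true = np.exp(-np.arange(1, delta + 1))

def apply_nonlocal(u, K):
    f = np.zeros_like(u)
    for k, Kk in enumerate(K, start=1):
        f += Kk * (np.roll(u, k) + np.roll(u, -k) - 2 * u)
    return f

# Synthetic training data: random fields u and their images f = L u
U = rng.standard_normal((20, n))
F = np.stack([apply_nonlocal(u, K_true) for u in U])

# f is linear in the kernel values, so kernel learning is a least-squares problem;
# the feature column for K_k is (u_{i+k} - u_i) + (u_{i-k} - u_i)
A = np.stack([(np.roll(U, k, axis=1) + np.roll(U, -k, axis=1) - 2 * U).ravel()
              for k in range(1, delta + 1)], axis=1)
K_learned, *_ = np.linalg.lstsq(A, F.ravel(), rcond=None)

print(np.max(np.abs(K_learned - K_true)))   # recovers the kernel up to round-off
```

On clean synthetic data the kernel is recovered exactly; the well-posedness and convergence guarantees discussed in the talk concern the harder setting where the data are noisy and the resolution varies.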


In the second approach, we extend linear operator regression to nonlinear models by combining it with neural networks, and model the heterogeneous material response as a mapping between loading conditions and the resultant displacement fields. To this end, we develop deep integral neural operator architectures, which are resolution-independent and naturally embed the material's micromechanical properties and defects in the integrand. As an application, we learn material models directly from digital image correlation (DIC) displacement tracking measurements, and show that the learned model substantially outperforms conventional constitutive models.
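A minimal sketch of why integral-operator layers are resolution-independent, under strong simplifying assumptions (a fixed analytic kernel standing in for a learned one, uniform 1D grids, plain averaging as quadrature):

```python
import numpy as np

def integral_layer(x, v, kappa):
    """One integral-operator layer: out(x_i) = (1/n) * sum_j kappa(x_i, x_j) * v(x_j).

    Discretizes out(x) = integral of kappa(x, y) v(y) dy on whatever grid the data
    live on, so the same layer can be queried at any resolution.
    """
    Kmat = kappa(x[:, None], x[None, :])   # n x n kernel matrix
    return Kmat @ v / len(x)

# Stand-ins for learned components (hypothetical; in a real operator network
# kappa would be a small trainable network)
kappa = lambda x, y: np.exp(-(x - y) ** 2)
v = lambda x: np.sin(2 * np.pi * x)

x_coarse = np.linspace(0.0, 1.0, 50)
x_fine = np.linspace(0.0, 1.0, 400)
out_coarse = integral_layer(x_coarse, v(x_coarse), kappa)
out_fine = integral_layer(x_fine, v(x_fine), kappa)

# Resolution independence: the two evaluations of (K v)(0) agree up to quadrature error
print(abs(out_coarse[0] - out_fine[0]))
```

The same kernel evaluated on a 50-point and a 400-point grid produces consistent outputs, which is the sense in which such architectures are independent of the data resolution.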

Bio: Yue Yu received her B.S. from Peking University in 2008, and her Ph.D. from Brown University in 2014. She was a postdoctoral fellow at Harvard University after graduation, and then joined Lehigh University as an assistant professor of applied mathematics; she was promoted to associate professor in 2019. Her research lies in the area of applied and computational mathematics, with recent projects focusing on nonlocal problems and scientific machine learning. She has received an NSF CAREER award and an AFOSR Young Investigator Program (YIP) award.


4:00 - 5:00 pm, Sept 14, 2022 (EST), Florian Schaefer, Georgia Tech

Title: Inference, Computation, and Games

Video Slides

Abstract: In this talk, we develop algorithms for numerical computation, based on ideas from competitive games and statistical inference. In the first part, we propose competitive gradient descent (CGD) as a natural generalization of gradient descent to saddle point problems and general sum games. Whereas gradient descent minimizes a local linear approximation at each step, CGD uses the Nash equilibrium of a local bilinear approximation. Explicitly accounting for agent-interaction significantly improves the convergence properties, as demonstrated in applications to GANs, reinforcement learning, computer graphics, and physics-informed neural networks. In the second part, we show that the conditional near-independence properties of smooth Gaussian processes imply the near-sparsity of Cholesky factors of their dense covariance matrices. We use this insight to derive simple, fast solvers with state-of-the-art complexity vs. accuracy guarantees for general elliptic differential and integral equations. Our methods come with rigorous error estimates, are easy to parallelize, and show good performance in practice.
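A hedged numpy sketch of the CGD update for the special case of a zero-sum bilinear game f(x, y) = xᵀBy, where the local bilinear model is exact and simultaneous gradient descent/ascent fails to converge; the matrix B and step size are illustrative choices, not from the talk:

```python
import numpy as np

# CGD for the zero-sum bilinear game min_x max_y f(x, y) = x^T B y.
# Each step is the Nash equilibrium of a regularized local bilinear model:
#   dx = -eta * (I + eta^2 B B^T)^{-1} (grad_x f + eta * B    grad_y f)
#   dy = +eta * (I + eta^2 B^T B)^{-1} (grad_y f - eta * B^T  grad_x f)
# (Simultaneous gradient descent/ascent spirals outward on this game.)
rng = np.random.default_rng(0)
B = np.diag([1.0, 1.5, 2.0])            # illustrative, well-conditioned choice
d, eta = 3, 0.2
x, y = rng.standard_normal(d), rng.standard_normal(d)

for _ in range(1000):
    gx, gy = B @ y, B.T @ x             # grad_x f and grad_y f
    dx = -eta * np.linalg.solve(np.eye(d) + eta**2 * B @ B.T, gx + eta * B @ gy)
    dy = eta * np.linalg.solve(np.eye(d) + eta**2 * B.T @ B, gy - eta * B.T @ gx)
    x, y = x + dx, y + dy

print(np.linalg.norm(x), np.linalg.norm(y))   # both shrink toward the equilibrium (0, 0)
```

Accounting for the opponent's response through the mixed second-derivative terms is what turns the outward spiral of naive gradient dynamics into a contraction here.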


Bio: Florian Schäfer is an assistant professor in the School of Computational Science and Engineering at Georgia Tech. Before joining Georgia Tech, he received his PhD in applied and computational mathematics at Caltech, working with Houman Owhadi. Before that, he received Bachelor’s and Master’s degrees in Mathematics at the University of Bonn. His research interests lie at the interface of numerical computation, statistical inference, and competitive games.


4:00 - 5:00 pm, Sept 21, 2022 (EST), Ju Sun, University of Minnesota

Title: Deep Image Prior (and Its Cousin) for Inverse Problems: the Untold Stories

Video Slides

Abstract: Deep image prior (DIP) parametrizes visual objects as the outputs of deep neural networks (DNNs); its cousin, neural implicit representation (NIR), directly parametrizes visual objects as DNNs. These stunningly simple ideas, when integrated into natural optimization formulations for visual inverse problems, have matched or even beaten state-of-the-art methods on numerous visual reconstruction tasks, even though they are not driven by massive amounts of training data. Despite the remarkable successes, the overparametrized DNNs used are typically powerful enough to also fit the noise besides the desired visual content (i.e., overfitting), and the fitting process can take up to tens of minutes on sophisticated GPU cards to converge to a reasonable solution. In this talk, I'll describe our recent efforts to combat these practicality issues around DIP and NIR, and how careful calibration of DIP models (or variants) can help to solve challenging visual reconstruction problems, such as blind image deblurring and phase retrieval, in unprecedented regimes.
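To make the overfitting point concrete, here is a hedged stand-in experiment (my own toy construction, not from the talk): an overparameterized random-feature model, playing the role of the DNN in DIP, is fit by gradient descent to a noisy 1D signal. It eventually interpolates the noisy data, so its distance to the clean signal ends up at exactly the noise level, which is why early stopping matters:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 40, 400                          # 40 noisy measurements, 400 parameters: overparameterized

t = np.linspace(0.0, 1.0, m)
clean = np.sin(2 * np.pi * t)           # the "visual content" we would like to recover
noisy = clean + 0.3 * rng.standard_normal(m)

# Overparameterized random-feature model (a hypothetical stand-in for the DNN in DIP)
W = rng.standard_normal((m, n)) / np.sqrt(n)
theta = np.zeros(n)
lr = 20.0
for _ in range(500):
    theta -= lr * W.T @ (W @ theta - noisy) / m   # GD on 0.5/m * ||W theta - noisy||^2

fit = W @ theta
print(np.linalg.norm(fit - noisy))      # ~0: the model has also fit the noise
print(np.linalg.norm(fit - clean))      # ~||noise||: reconstruction error stalls at the noise level
```

Monitoring the second quantity along the optimization path is what early-stopping criteria, such as those in the papers below, try to emulate without access to the clean signal.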

Joint work with my PhD students: Taihui Li, Hengkang Wang, Zhong Zhuang, Hengyue Liang, Le Peng, and Tiancong Chen. Related papers:

● Early Stopping for Deep Image Prior https://arxiv.org/abs/2112.06074

● Self-Validation: Early Stopping for Single-Instance Deep Generative Priors https://arxiv.org/abs/2110.12271

● Blind Image Deblurring with Unknown Kernel Size and Substantial Noise https://arxiv.org/abs/2208.09483

Bio: Ju Sun is an assistant professor in the Department of Computer Science & Engineering at the University of Minnesota, Twin Cities. His research interests span computer vision, machine learning, numerical optimization, data science, computational imaging, and healthcare. His recent efforts focus on the foundations and computation of deep learning, and on applying deep learning to tackle challenging science, engineering, and medical problems. Before this, he was a postdoctoral scholar at Stanford University (2016-2019), obtained his Ph.D. from the Department of Electrical Engineering at Columbia University (2011-2016), and received his B.Eng. in Computer Engineering (with a minor in Mathematics) from the National University of Singapore (2004-2008). He won the best student paper award at SPARS'15, received an honorable mention for his doctoral thesis in the New World Mathematics Awards (NWMA) 2017, and was selected for the AAAI New Faculty Highlights program in 2021.

4:00 - 5:00 pm, Sept 28, 2022 (EST), Jack Umenberger, MIT

Title: Shortest Paths in Graphs of Convex Sets, with Applications to Control and Motion Planning

Video Slides

Abstract: Given a graph, the shortest-path problem requires finding a sequence of edges of minimum cost connecting a source vertex to a target vertex. In this talk we introduce a generalization of this classical problem in which the position of each vertex in the graph is a continuous decision variable, constrained to lie in a corresponding convex set, and the cost of an edge is a convex function of the positions of the vertices it connects. Problems of this form arise naturally in motion planning of autonomous vehicles, robot navigation, and optimal control of hybrid dynamical systems. The price of such wide applicability is the complexity of this problem, which is readily seen to be NP-hard. We present a solution approach based on a strong mixed-integer convex formulation of the problem, which makes use of perspective functions. In the aforementioned applications, this formulation often has a very tight convex relaxation, making it possible to efficiently find globally optimal paths in large graphs and in high-dimensional spaces.


Bio: Jack is a postdoctoral associate in Russ Tedrake's Robot Locomotion Group at the Massachusetts Institute of Technology. He received his PhD in Engineering and B.E. in Mechatronics from The University of Sydney, Australia, in 2018 and 2013, respectively, and was a postdoctoral fellow in the Division of Systems and Control at Uppsala University, Sweden, from 2017 to 2019. His research interests revolve around the application of optimization to data-driven modeling and control of dynamical systems.

4:00 - 5:00 pm, Oct 05, 2022 (EST), Rui Gao, University of Texas at Austin

Title: Wasserstein distributionally robust optimization: computation, regularization and statistics

Slides

Abstract: Wasserstein distributionally robust optimization is an emerging paradigm for decision-making under uncertainty, which involves a minimax formulation that hedges against distributional uncertainty based on the Wasserstein distance. In this talk, we will discuss three aspects of this problem: (i) computationally tractable reformulations; (ii) its connection with variation regularization, such as Lipschitz regularization and gradient regularization; and (iii) non-asymptotic performance guarantees based on transport-information inequalities.
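A toy numerical check of the regularization connection in (ii), under strong simplifying assumptions of my own (a linear loss and a per-sample perturbation budget, i.e., an ∞-Wasserstein ball): the worst-case expected loss equals the empirical loss plus the radius times the Lipschitz constant of the loss.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.array([1.5, -2.0])           # linear loss ell(xi) = theta . xi; Lipschitz const |theta|_2
xs = rng.standard_normal((100, 2))      # empirical samples
eps = 0.3                               # per-sample transport budget (infinity-Wasserstein radius)

lip = np.linalg.norm(theta)
empirical = (xs @ theta).mean()

# Closed form: sup_{|d| <= eps} theta . (x + d) = theta . x + eps * |theta|, per sample
robust_closed = empirical + eps * lip

# The maximizing perturbation moves every sample by eps along theta / |theta|
worst = ((xs + eps * theta / lip) @ theta).mean()

# Sanity check: no other feasible perturbation does better
dirs = rng.standard_normal((50, 2))
dirs = eps * dirs / np.linalg.norm(dirs, axis=1, keepdims=True)
cand = max(((xs + d) @ theta).mean() for d in dirs)

print(robust_closed, worst, cand)       # robust loss = empirical loss + eps * Lipschitz constant
```

This is the simplest instance of the variation-regularization equivalence; the talk's results cover far more general losses and the standard (non-∞) Wasserstein ball.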

Bio: Rui Gao is an Assistant Professor in the Department of Information, Risk, and Operations Management at the University of Texas at Austin. His main research studies data-driven decision-making under uncertainty and prescriptive data analytics, and has been recognized by several INFORMS paper competition awards. He currently serves as an Associate Editor for Mathematical Programming. He received a Ph.D. in Operations Research from Georgia Institute of Technology in 2018, and a B.Sc. in Mathematics and Applied Mathematics from Xi'an Jiaotong University in 2013.

4:00 - 5:00 pm, Oct 19, 2022 (EST), Katy Craig, UC Santa Barbara

Title: Graph Clustering Dynamics: From Spectral to Mean Shift

Video Slides

Abstract: Clustering algorithms based on mean shift or spectral methods on graphs are ubiquitous in data analysis. In practice, however, these two types of algorithms are treated as conceptually disjoint: mean shift clusters based on the density of a dataset, while spectral methods allow for clustering based on geometry. In joint work with Nicolás García Trillos and Dejan Slepčev, we define a new notion of Fokker-Planck equation on a graph and use it to introduce an algorithm that interpolates between mean shift and spectral approaches, enabling clustering based on both the density and the geometry of a dataset. We illustrate the benefits of this approach in numerical examples and contrast it with Coifman and Lafon's well-known method of diffusion maps, which can also be thought of as a Fokker-Planck equation on a graph, though one that degenerates in the zero-diffusion limit.


Bio: Katy Craig is an assistant professor at UC Santa Barbara, specializing in partial differential equations and optimal transport. She received her Ph.D. from Rutgers University in 2014, after which she spent one year at UCLA as an NSF Mathematical Sciences Postdoctoral Fellow and one year at UCSB as a UC President's Postdoctoral Fellow. In January 2022, she was awarded an NSF CAREER grant to support her work on optimal transport and machine learning.

4:00 - 5:00 pm, Oct 26, 2022 (EST), Fatma Kilinc-Karzan, CMU

Title: Using exactness guarantees to design faster algorithms for a class of semidefinite programs

Meeting Link: https://rensselaer.webex.com/meet/xuy21

Abstract: Semidefinite programs (SDPs) have been used as a tractable relaxation for many NP-hard problems that naturally arise in operations research, engineering, and computer science. The SDP relaxation is obtained by first reformulating the problem in a lifted space with an additional rank constraint and then dropping the rank constraint. In this talk, we will first study the SDP relaxation for general quadratically constrained quadratic programs, present various exactness concepts related to the SDP relaxation, and discuss conditions guaranteeing such SDP exactness. In particular, this will allow us to identify structural properties of these problems that admit equivalent tractable SDP reformulations. Despite the well-established strength of SDP relaxations, the task of solving an SDP is still considered impractical, especially in modern large-data settings, and precludes their widespread adoption in practice. In the second part of this talk, we will review how we can effectively exploit the exactness properties of SDPs to design storage-optimal accelerated first-order methods (which achieve even linear convergence rates for certain problems). This is joint work with Alex Wang.

Bio: Fatma Kılınç-Karzan is an Associate Professor of Operations Research at Tepper School of Business, Carnegie Mellon University. She holds a courtesy appointment at the Department of Computer Science as well. She completed her PhD at Georgia Institute of Technology in 2011. Her research interests are on foundational theory and algorithms for convex optimization and structured nonconvex optimization, and their applications in optimization under uncertainty, machine learning and business analytics. Her work was the recipient of several best paper awards, including 2015 INFORMS Optimization Society Prize for Young Researchers and 2014 INFORMS JFIG Best Paper Award. Her research has been supported by generous grants from NSF, ONR, and AFOSR, including an NSF CAREER Award. She is serving on the Mathematical Optimization Society Council, INFORMS Computing Society, and on the editorial board of several journals including Operations Research, Mathematical Programming, SIAM Journal on Optimization, INFORMS Journal on Computing, and Optimization Methods and Software.

4:00 - 5:00 pm, Nov 2, 2022 (EST), Chao Ma, Stanford University

Title: Implicit biases of optimization algorithms for neural networks: static and dynamic perspectives

Video Slides

Abstract:

Modern neural networks are usually over-parameterized: the number of parameters exceeds the number of training data points. In this case the loss function tends to have many (or even infinitely many) global minima, which imposes on optimization algorithms the additional challenge of minima selection, beyond convergence. Specifically, when training a neural network, the algorithm not only has to find a global minimum, but also needs to select minima that generalize well from among many bad ones. In this talk, I will share a series of works studying the mechanisms by which optimization algorithms select global minima. First, using a linear stability theory, we show that stochastic gradient descent (SGD) favors flat and uniform global minima. Then, we build a theoretical connection between flatness and generalization performance based on a special structure of neural networks. Next, we study the dynamics of global minima selection, the process by which an optimizer leaves bad minima for good ones, in two settings. For a manifold of minima around which the loss function grows quadratically, we derive effective exploration dynamics on the manifold for SGD and Adam, using a quasistatic approach. For a manifold of minima around which the loss function grows subquadratically, we study the behavior and effective dynamics of GD, which also explains the edge-of-stability phenomenon.
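The linear-stability mechanism can already be seen for plain GD on a one-dimensional quadratic with sharpness lam (a hedged textbook sketch, not the talk's SGD analysis): the iteration x ← (1 − eta·lam)·x is stable if and only if eta·lam < 2, so a large learning rate can only settle at minima whose sharpness is below 2/eta, a bias toward flat minima.

```python
import numpy as np

def gd_on_quadratic(lam, eta, steps=100, x0=1.0):
    """GD on loss(x) = 0.5 * lam * x**2; the update map is x <- (1 - eta * lam) * x."""
    x = x0
    for _ in range(steps):
        x -= eta * lam * x
    return x

eta = 0.1                               # stability requires sharpness lam < 2 / eta = 20
print(abs(gd_on_quadratic(lam=10.0, eta=eta)))   # flat minimum (lam < 20): converges
print(abs(gd_on_quadratic(lam=30.0, eta=eta)))   # sharp minimum (lam > 20): blows up
```

The talk's results extend this one-dimensional picture to the stochastic and non-quadratic settings where the selection effect actually operates.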


Short Bio:

Chao is a Szegő Assistant Professor in the Department of Mathematics at Stanford University. His research interests lie in the theory and application of machine learning, with a focus on theoretically understanding the optimization behavior of deep neural networks and its connection with generalization. Before joining Stanford, Chao obtained his Ph.D. from the Program in Applied and Computational Mathematics (PACM) at Princeton University in 2020, under the supervision of Professor Weinan E. In 2016, he received his bachelor's degree from the School of Mathematical Sciences at Peking University.


4:00 - 5:00 pm, Nov 09, 2022 (EST), Paul Grigas, UC Berkeley

Title: Offline and Online Learning and Decision-Making in the Predict-then-Optimize Setting

Video Slides

Abstract: In the predict-then-optimize setting, the parameters of an optimization task are predicted based on contextual features, and it is desirable to leverage the structure of the underlying optimization task when training a machine learning model. A natural loss function in this setting is based on considering the cost of the decisions induced by the predicted parameters, in contrast to standard measures of prediction error. Since directly optimizing this loss function is computationally challenging, we propose the use of a novel convex surrogate loss function, called the “Smart Predict-then-Optimize+ (SPO+)” loss function. In the offline learning situation, we prove that the SPO+ loss function is statistically consistent and develop corresponding quantitative risk bounds under mild conditions. We then consider an online variant of our setting with resource constraints, where a decision-maker first predicts a reward vector and resource consumption matrix based on a given context vector and then makes a decision. We prove regret bounds that are sublinear with rate depending on the corresponding offline risk bounds of the surrogate loss used to learn the prediction model. We also conduct numerical experiments to empirically demonstrate the strength of our proposed SPO-type methods in the online setting. This talk is based on a series of papers jointly with Othman El Balghiti, Adam Elmachtoub, Ambuj Tewari, and Heyuan Liu.
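A minimal sketch of the SPO+ loss for what is arguably the simplest decision set, picking one of d items at minimum cost (the feasible region is the set of standard basis vectors); this toy instance is my own illustration rather than an example from the papers:

```python
import numpy as np

def spo_plus(c_hat, c):
    """SPO+ surrogate loss for S = {e_1, ..., e_d} (choose the minimum-cost item):
    ell(c_hat, c) = max_{w in S} (c - 2 c_hat)^T w + 2 c_hat^T w*(c) - c^T w*(c)."""
    i_star = np.argmin(c)               # w*(c): the optimal decision under the true cost
    return np.max(c - 2 * c_hat) + 2 * c_hat[i_star] - c[i_star]

def spo(c_hat, c):
    """True decision regret: cost of acting on the prediction minus the optimal cost."""
    return c[np.argmin(c_hat)] - c[np.argmin(c)]

rng = np.random.default_rng(0)
c = np.array([3.0, 1.0, 2.0])
c_hat = rng.standard_normal(3)
print(spo_plus(c, c))                   # 0.0: a perfect prediction incurs no loss
print(spo_plus(c_hat, c) >= spo(c_hat, c) >= 0.0)   # True: SPO+ upper-bounds the regret
```

Unlike the regret itself, which is piecewise constant in the prediction, the SPO+ surrogate is convex in c_hat, which is what makes training tractable.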


Bio: Paul Grigas is an assistant professor of Industrial Engineering and Operations Research at the University of California, Berkeley. Paul’s research interests are in large-scale optimization, statistical machine learning, and data-driven decision making. He is also broadly interested in the applications of data analytics, and he has worked on applications in online advertising. Paul’s research is funded by the National Science Foundation including an NSF CRII Award. Paul was awarded 1st place in the 2020 INFORMS Junior Faculty Interest Group (JFIG) Paper Competition, the 2015 INFORMS Optimization Society Student Paper Prize, and an NSF Graduate Research Fellowship. He received his B.S. in Operations Research and Information Engineering (ORIE) from Cornell University in 2011, and his Ph.D. in Operations Research from MIT in 2016.


4:00 - 5:00 pm, Nov 16, 2022 (EST), Molei Tao, Georgia Tech

Title: Machine learning meets dynamics: understanding large learning rates, & variational Stiefel optimization

Video Slides

Abstract: The interaction between machine learning and dynamics can lead to both new methodology for dynamics and deepened understanding and/or efficacious algorithms for machine learning. This talk will focus on the latter.

Specifically, in half of the talk, I will describe some of the nontrivial (and pleasant) effects of large learning rates, which are often used in the practical training of machine learning models but lie beyond traditional optimization theory. More precisely, I will first show how large learning rates can lead to quantitative escapes from local minima via chaos, which is an alternative mechanism to the commonly known noisy escapes due to stochastic gradients. I will then report how large learning rates provably bias toward flatter minimizers, which arguably generalize better.

In the other half, I will report the construction of momentum-accelerated algorithms that optimize functions defined on Riemannian manifolds, focusing on a particular case known as the Stiefel manifold. The treatment will be based on the design of continuous- and discrete-time dynamics. Two practical applications will also be described: (1) we markedly improved the performance of a trained-from-scratch Vision Transformer by appropriately wiring orthogonality into its self-attention mechanism, and (2) our optimizer also makes the useful notion of Projection Robust Wasserstein Distance for high-dim. optimal transport even more effective.
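As a hedged illustration of optimization on the Stiefel manifold St(n, p) = {X : XᵀX = I}, the sketch below runs plain Riemannian gradient ascent with a QR retraction (no momentum, so deliberately simpler than the accelerated methods in the talk) to maximize tr(XᵀAX), whose optimum is the sum of the p largest eigenvalues of A; the matrix and step size are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 2
A = np.diag([6.0, 5.0, 4.0, 3.0, 2.0, 1.0])    # known spectrum, to check the answer

# Random starting point on St(n, p): a matrix with orthonormal columns
X, _ = np.linalg.qr(rng.standard_normal((n, p)))

eta = 0.05
for _ in range(2000):
    G = 2 * A @ X                               # Euclidean gradient of tr(X^T A X)
    xi = G - X @ (X.T @ G + G.T @ X) / 2        # project G onto the tangent space at X
    X, _ = np.linalg.qr(X + eta * xi)           # QR retraction back onto the manifold

print(np.trace(X.T @ A @ X))                    # approaches 11.0 = 6 + 5, the top-2 eigenvalue sum
```

The projection-then-retract pattern is the generic recipe for Riemannian first-order methods; the talk's contribution is adding momentum to such dynamics in a principled, variational way.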



Bio: Molei Tao received his B.S. in Math & Physics in 2006 from Tsinghua University, China, and his Ph.D. in Control & Dynamical Systems with a minor in Physics in 2011 from Caltech. Afterwards, he worked as a postdoc in Computing & Mathematical Sciences at Caltech from 2011 to 2012, and then as a Courant Instructor at NYU from 2012 to 2014. Since 2014, he has been an assistant, and then associate, professor in the School of Mathematics and the Machine Learning Center at Georgia Tech. He is a recipient of the W.P. Carey Ph.D. Prize in Applied Mathematics (2011), an American Control Conference Best Student Paper Finalist distinction (2013), the NSF CAREER Award (2019), an AISTATS best paper award (2020), an IEEE EFTF-IFCS Best Student Paper Finalist distinction (2021), and the Cullen-Peck Scholar Award (2022).