# Spring 2020

# Statistics, Optimization and Machine Learning Seminar

### Quick info

To sign up for the mailing list and receive announcements about talks (you do not need to be enrolled in the seminar), visit our **google groups website** and add yourself to the group

The seminar is organized by Professors Rafael Frongillo (CS Dept) and Stephen Becker (Applied Math Dept)

### Meeting time/location

When: Tuesdays, 3:30 to 4:30 PM in the Newton Lab (ECCR 257)

Where: Newton Lab (ECCR 257) in the Classroom Wing of the Engineering Center on the CU Boulder main campus (for information on finding Newton Lab, see **visiting the applied math department**). [Update: partway through the 2020 spring semester, we moved to zoom-based talks due to COVID-19]

### The seminar is open to the public!

You do not have to be enrolled in the class ... but *if you are enrolled in the class*, for a passing grade, you must:

- attend all the talks (missing an occasional talk is permissible if you have a valid reason, so email the instructors), and
- give a ~20 min presentation at some point in the semester, either on your own research, or presenting a paper

## List of Talks

(all times/locations Tues 3:30 PM at Newton Lab, ECCR 257, unless otherwise indicated; after March 10, we switched to all zoom-based seminars due to COVID-19)

- Jan 14, organizational meeting, no talk
- Jan 21, Zhihui Zhu (University of Denver, Electrical & Computer Engineering), title "Provable Nonsmooth Nonconvex Approaches for Low-Dimensional Models".
- slides (compressed), and Zhihui Zhu's research website

- Jan 28, Stephen Becker (University of Colorado Boulder, Applied Math), title "Stochastic Subspace Descent: Stochastic gradient-free optimization, with applications to PDE-constrained optimization"
- Feb 4, No speaker
- Feb 11, Ken McMillan (principal researcher at Microsoft Research in Redmond), title "What is the Use of a Formal Specification?"
- part of CS colloquium. Location: DLC 170

- Feb 18, Jamie Sikora (Postdoctoral Researcher, Perimeter Institute), title "Quantum resources: What are they and how much are they worth?"
- part of CS colloquium. Location: DLC 170

- Feb 25, Anindya De (U Penn, CS Dept), title "Testing noisy linear functions for sparsity"
- March 3, Danna Gurari (UT Austin, School of Information), title "Designing Computer Vision Algorithms to Support Real Users and Recognize Multiple Perspectives"
- part of CS colloquium. Location: DLC 170

~~March 10, Aram Harrow (MIT), as part of CS theory seminar~~. Cancelled due to coronavirus- March 17, Bhaswati Ganguli (Professor, Stat Dept, University of Calcutta), title "Renewable energy modeling: experiences from India"
- lecture recorded via zoom; click here to watch (must sign in with colorado.edu account; if you don't have one and want access, contact one of the organizers)

- March 31, Aurelién Lucchi (ETH Zurich), title "An Optimization Perspective on Deep Learning" (this was joint with the CS colloquium), via zoom
~~April 7, Morteza Lahijanian (CU Ann and H. J. Smead Aerospace Engineering Sciences Dept.)~~Cancelled due to coronavirus, postponed until fall 2020- as a replacement talk, we joined with the CS colloquium to hear Ramya Vinayak (U. Washington) speak about "Learning from Societal Data: Theory and Practice", via zoom

~~April 14, Albert Berahas (Lehigh University).~~Cancelled due to coronavirus, postponed until fall 2020- April 21,
*student presentations (Curry Buscher on "Reconciling modern machine-learning practice and the classical bias–variance trade-off" and Jack Kawell on "Scalable Multi-Task Imitation Learning with Autonomous Improvement")*, via zoom - April 28, Pratyush Tiwary (U Maryland College Park), via zoom (email Stephen or Raf for zoom link), title "From atoms to emergent dynamics (with help from statistical physics and artificial intelligence)"

## Abstracts

### January 21

**Title:** Provable Nonsmooth Nonconvex Approaches for Low-Dimensional Models

**Speaker: **Zhihui Zhu (University of Denver, Electrical & Computer Engineering)

**Abstract: **

As technological advances in fields such as the Internet, medicine, finance, and remote sensing have produced larger and more complex data sets, we are faced with the challenge of efficiently and effectively extracting meaningful information from large-scale and high-dimensional signals and data. Many modern approaches to addressing this challenge naturally involve nonconvex optimization formulations. Although in theory finding a local minimizer for a general nonconvex problem could be computationally hard, recent progress has shown that many practical (smooth) nonconvex problems obey benign geometric properties and can be efficiently solved to global solutions.

In this talk, I will extend this powerful geometric analysis to robust low-dimensional models where the data or measurements are corrupted by outliers taking arbitrary values. We consider nonsmooth nonconvex formulations of the problems, in which we employ an L1-loss function to robustify the solution against outliers. We characterize a sufficiently large basin of attraction around the global minima, enabling us to develop subgradient-based optimization algorithms that can rapidly converge to a global minimum with a data-driven initialization. I will also talk about our very recent work for general nonsmooth optimization on the Stiefel manifold which appears widely in engineering. I will discuss the efficiency of this approach in the context of robust subspace recovery, robust low-rank matrix recovery, and orthogonal dictionary learning.

- slides (compressed), and Zhihui Zhu's research website

**Speaker Bio**: Zhihui Zhu received the Ph.D. degree in electrical engineering from Colorado School of Mines, Golden, CO, in 2017, and was a Postdoctoral Fellow in the Mathematical Institute for Data Science at the Johns Hopkins University, Baltimore, MD, in 2018-2019. He is an Assistant Professor with the Department of Electrical and Computer Engineering, University of Denver, CO. His research interests include the areas of data science, machine learning, signal processing, and optimization. His current research largely focuses on the theory and applications of nonconvex optimization and low-dimensional models in large-scale machine learning and signal processing problems.

### January 28

**Title**: Stochastic Subspace Descent: Stochastic gradient-free optimization, with applications to PDE-constrained optimization

**Speaker**: Stephen Becker (University of Colorado Boulder, Applied Math Dept)

**Abstract**:

We describe and analyze a family of algorithms that generalize block-coordinate descent, where we assume one can take directional derivatives (for low-precision optimization, this can be approximated with finite differences, hence this is similar to a 0th order method). The method generalizes randomized block coordinate descent. We prove almost-sure convergence of the algorithm at a linear rate (under strong convexity) and convergence (with convexity). Furthermore, we analyze a variant similar to SVRG but that does not require the finite-sum structure in the objective, and for isotropic random sampling, we use Johnson-Lindenstrauss style arguments to provide non-asymptotic, probabilistic convergence results. Numerical examples are provided for selecting landmark points in Gaussian process regression, and in PDE-constrained optimization (shape optimization). This is joint work with Luis Tenorio and David Kozak from the Colorado School of Mines Applied Math & Stat Dept, and Alireza Doostan from CU Boulder's Smead Aerospace Engineering Dept.

### February 11

**Title**: What is the Use of a Formal Specification?

**Speaker**: Ken McMillan (principal researcher at Microsoft Research in Redmond)

(part of CS colloquium series; in DLC 170)

**Abstract**:

A formal specification is a mathematical description of a designed artifact. As such, it may serve a variety of functions. For example, it may serve as a thinking tool for understanding a design, a contract between designers, a lemma in a formal proof, or a tool for testing or simulation. The form, content and development of a specification flow from its function.

To illustrate this point, we will consider a formal specification of QUIC, a new secure Internet transport protocol. QUIC is intended as a replacement for the TLS/TCP stack and will be the basis of HTTP/3, the next official version of the World-Wide Web protocol. It already carries a substantial fraction of Internet traffic and its usage is growing rapidly. This should be concerning, as QUIC is complex, and errors in its implementation may have serious security consequences.

Our formal specification of QUIC has two primary and interacting functions. The first is to create automated randomized conformance testers using a methodology called compositional specification-based testing. This style of testing revealed numerous errors in implementations of QIUC, including some interesting security vulnerabilities. The second function of our specification is to capture the protocol knowledge inherent in the protocol implementations, in order to resolve ambiguities in the standards documents (called RFC's). Compositional testing also supports this function.

We will discuss how the function of the QUIC specification dictates its structure, style and language, and how its content differs markedly from that of the RFC's. We will also observe that a formal specification need not be complete or perfect to serve a useful purpose. The QUIC experience shows that a modest effort in formalization can pay significant dividends.

This talk describes joint work with Lenore Zuck, University of Illinois at Chicago.

### February 18

**Title**: Quantum resources: What are they and how much are they worth?

**Speaker**: Jamie Sikora (Postdoctoral Researcher, Perimeter Institute)

(part of CS colloquium series; in DLC 170)

**Abstract**:

In this talk, I will discuss several natural quantum problems and, in particular, how the problems change as the quantum resources change. I will show how to take an economics perspective to assign a "shadow price" to each quantum resource. To do this, I will use optimization theory and show that shadow prices are often given "for free" if you know where to look for them. No knowledge about economics, optimization theory, or quantum theory is needed for this talk. This is joint work with Gary Au (University of Saskatchewan).

**Bio**: Jamie Sikora received his Ph.D. from the Institute for Quantum Computing and the University of Waterloo in Ontario, Canada. He is currently a postdoctoral researcher at the Perimeter Institute and his current research focuses on applying optimization techniques to gain a better understanding of our quantum world.

### February 25

**Title**: Testing noisy linear functions for sparsity

**Speaker**: Anindya De (U Penn, CS Dept)

**Abstract**:

Consider the following basic problem in sparse linear regression -- an algorithm gets labeled samples of the form (x, <w.x> + \eps) where w is an unknown n-dimensional vector, x is drawn from a background n-dimensional distribution D, and \eps is some independent noise. Given the promise that w is k-sparse, the breakthrough work of Candes, Romberg and Tao shows that w can be recovered with samples and time which scales as O(k log n). This should be contrasted with general linear regression where O(n) samples are information theoretically necessary.

We look at this problem from the vantage point of property testing, and study the complexity of determining whether the unknown vector w is k-sparse versus "far" from k-sparse. We show that this decision problem can be solved with a number of samples which is independent of n as long as the background distribution D is i.i.d. and its components are not Gaussian. We further show that weakening any of the conditions in this result necessarily makes the complexity scale logarithmically in n.

Joint work with Xue Chen (Northwestern) and Rocco Servedio (Columbia).

### March 2

**Title**: Designing Computer Vision Algorithms to Support Real Users and Recognize Multiple Perspectives

**Speaker**: Danna Gurari, Assistant Professor, School of Information, UT Austin

(part of CS colloquium series; in DLC 170)

**Abstract**:

Computer vision systems are transforming our daily lives. Already, such computers that can ‘see’ are guiding self-driving vehicles, assisting medical professionals in diagnosing diseases, allowing us to deposit checks from our mobile phones, and empowering people with visual impairments to independently overcome daily challenges (e.g., reading restaurant menus). However, the current paradigm for designing human-like intelligence in algorithms has limitations. Currently, algorithms are primarily designed to analyze visual information in constrained environments, with the status quo being to return a single response for each task. These limitations translate to algorithms regularly performing poorly in real-world situations and offering a narrow, one-size-fits-all perspective. In this talk, I will discuss my work that addresses these limitations towards building computing systems that enable and accelerate the analysis of visual information (i.e., images and videos). This will include: (1) creating large-scale datasets that represent real use cases; (2) designing efficient hybrid algorithm-crowd partnerships to compensate for algorithm shortcomings; (3) posing new algorithms for understanding and anticipating divergent responses from a crowd; and (4) introducing new methods that can deliver a diversity of results.

### March 10

cancelled due to coronavirus

### March 17

**Title**: Renewable energy modeling: experiences from India.

**Speaker**: Bhaswati Ganguli, Professor, Stat Dept, University of Calcutta

**Abstract**:

The context for this talk is provided by the LISA 2020 project which aims to boost the application on data science across participating statistical laboratories. For India, the mandate of the project is to focus on applications to the field of renewable energy. The talk will present four aspects of this, namely, forecasting of solar radiation, quantification of the effect of ambient air pollution on photovoltaic panels, the relationship between financial inclusion and the success of electrification projects and the use of text data to study investment trends in the sector.

### March 28

**Title**: From atoms to emergent dynamics (with help from statistical physics and artificial intelligence)

Special note: talk is via zoom. Contact Stephen or Raf for zoom link

**Speaker**: Pratyush Tiwary (University of Maryland, Department of Chemistry & Biochemistry and Institute for Physical Science and Technology); website and

https://sites.google.com/site/pratyushtiwary Twitter: @tiwarylab

**Abstract**:

The ability to rapidly learn from high-dimensional data to make reliable predictions about the future of a given system is crucial in many contexts. This could be a fly avoiding predators, or the retina processing terabytes of data almost instantaneously to guide complex human actions. In this work we draw parallels between such tasks, and the efficient sampling of complex molecules with hundreds of thousands of atoms. Such sampling is critical for predictive computer simulations in condensed matter physics and biophysics. Specifically, we use ideas from statistical physics, artificial intelligence (AI) and information theory, including the Maximum Caliber approach [1], Predictive Information Bottleneck (PIB) [2], Shannon's rate distortion theory [3] and Long-short term memory (LSTM) networks, re-formulating them for the sampling of molecular structure and dynamics, especially when plagued with rare events. We demonstrate our methods on different test-pieces primarily in biophysics. We also discuss some open problems in the application of AI approaches to molecular simulations - for instance dealing with spurious solutions related to non-convex objective function in AI.

1. Tiwary and Berne, Proc. Natl. Acad. Sci. 2016

2. Wang, Ribeiro and Tiwary, Nature Commun. 2019

3. Ravindra, Smith and Tiwary, Mol. Sys. Des. Eng. 2020

**Bio**:

Pratyush Tiwary is an Assistant Professor at the University of Maryland, College Park, holding joint positions in the Department of Chemistry and Biochemistry and the Institute for Physical Science and Technology. He received his PhD and MS in Materials Science from Caltech, working with Axel van de Walle, and finished his undergraduate degree in Metallurgical Engineering at the Indian Institute of Technology, Banaras Hindu University, Varanasi. Prior to starting at Maryland, Prof. Tiwary was a postdoc in the Department of Chemistry at Columbia University working with Bruce Berne, and at the Department of Chemistry & Applied Biosciences at ETH Zurich work with Michele Parrinello.