Location: Smith Hall 205
Title: Correcting Convexity Bias
Abstract: We consider the problem of estimating a function or functional of an unknown input when only noisy observations of the input are available. When the function is convex (or concave) near the unknown input, the naive estimator often incurs a significant bias. We propose new bootstrap-based estimators to reduce this convexity bias. Theoretical analyses show that the proposed methods strictly reduce the expected estimation error under mild conditions. They can serve as off-the-shelf tools for a wide range of problems, including optimization problems with random objective functions or constraints, functionals of probability distributions such as the entropy and the Wasserstein distance, and matrix functions such as inversion.
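A minimal sketch of the underlying idea, for readers unfamiliar with bootstrap bias correction: the plug-in estimator f(x̄) of f(μ) is biased upward when f is convex (Jensen's inequality), and resampling gives an estimate of that bias. This is the textbook correction, not necessarily the exact estimator proposed in the talk; all names and parameters here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def debiased(x, f, n_boot=2000):
    """Plug-in estimate f(mean(x)) minus a bootstrap estimate of its convexity bias."""
    naive = f(x.mean())
    # Resample with replacement and re-evaluate the plug-in estimator
    # (f is assumed vectorized over numpy arrays).
    idx = rng.integers(0, len(x), size=(n_boot, len(x)))
    boot = f(x[idx].mean(axis=1)).mean()
    # bias ~= E*[f(xbar*)] - f(xbar), so the corrected estimate is 2*naive - boot.
    return 2.0 * naive - boot

# Toy check with a convex f: the true value is f(mu) = exp(0) = 1.
x = rng.normal(loc=0.0, scale=1.0, size=50)
print("naive:", np.exp(x.mean()), " bias-corrected:", debiased(x, np.exp))
```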
Location: Smith Hall 205
Title: Solving non-linear PDEs with Gaussian Processes
Abstract: In this talk I present a simple, rigorous, and interpretable framework for solving nonlinear PDEs based on Gaussian processes. The proposed approach provides a natural generalization of kernel methods to nonlinear PDEs, has guaranteed convergence, and inherits the state-of-the-art computational complexity of linear solvers for dense kernel matrices. I will outline our approach by focusing on an example nonlinear elliptic PDE, followed by further numerical examples.
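To make the kernel viewpoint concrete, here is a small collocation sketch for a 1-D nonlinear elliptic problem -u'' + u^3 = f: the solution is expanded in a Gaussian kernel basis and the PDE is enforced at collocation points. This is a simple Kansa-style stand-in for the talk's GP formulation (which adds the probabilistic interpretation and convergence guarantees); the kernel, length scale, and grid are illustrative.

```python
import numpy as np
from scipy.optimize import fsolve

# Gaussian kernel and its second derivative in the first argument.
ell = 0.2
k   = lambda x, y: np.exp(-(x - y)**2 / (2 * ell**2))
kxx = lambda x, y: ((x - y)**2 / ell**4 - 1 / ell**2) * k(x, y)

# Collocation points on [0,1]; manufactured solution u(x) = sin(pi x),
# so that -u'' + u^3 = f_rhs with Dirichlet conditions u(0) = u(1) = 0.
xs = np.linspace(0.0, 1.0, 25)
u_true = np.sin(np.pi * xs)
f_rhs = np.pi**2 * u_true + u_true**3

X, Y = np.meshgrid(xs, xs, indexing="ij")
K, Kxx = k(X, Y), kxx(X, Y)      # u = K @ c and u'' = Kxx @ c in the kernel basis

def residual(c):
    u, uxx = K @ c, Kxx @ c
    r = -uxx + u**3 - f_rhs       # PDE residual at the collocation points
    r[0], r[-1] = u[0], u[-1]     # boundary conditions replace the endpoint residuals
    return r

c = fsolve(residual, np.zeros_like(xs))
print("max error:", np.abs(K @ c - u_true).max())
```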
Location: Smith Hall 205
Title: Stokes waves in conformal plane: the Hamiltonian variables and instabilities
Abstract: The Stokes wave is a water wave that travels over the free surface of water without changing shape. When the time-varying fluid domain is mapped to a fixed geometry, such as a periodic strip in the lower half-plane, the equation for the Stokes wave becomes a nonlinear integro-differential ODE whose solutions can be found numerically to arbitrary precision. The spectral stability of Stokes waves is studied by linearizing the equations of motion for the free surface around a Stokes wave and studying the spectrum of the associated Fourier-Floquet-Hill (FFH) eigenvalue problem. We developed a novel approach to studying this spectrum by combining conformal Hamiltonian canonical variables with the FFH technique, built into a matrix-free Krylov-Schur eigenvalue solver. We report new results for the Benjamin-Feir instability as well as the high-frequency and localized (superharmonic) instabilities of waves close to the limiting Stokes wave.
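The matrix-free ingredient can be illustrated with SciPy's ARPACK wrapper (an implicitly restarted Arnoldi method, a close relative of Krylov-Schur): the eigensolver only ever asks for the action of the operator on a vector. The operator below is a toy spectral one, not the actual linearized Stokes-wave problem.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigs

n = 1024
ik = 1j * np.fft.fftfreq(n, d=1.0 / n)   # spectral wavenumbers (times i)

def matvec(v):
    # Apply a (toy) linearized operator spectrally, never forming the matrix:
    # L v = 1.5 v_x + v_xxx on a 2*pi-periodic grid.
    vh = np.fft.fft(v)
    return np.fft.ifft(1.5 * ik * vh + ik**3 * vh)

L = LinearOperator((n, n), matvec=matvec, dtype=complex)
# The Arnoldi / Krylov-Schur family needs only these matvec applications.
vals = eigs(L, k=6, which="LM", return_eigenvectors=False)
print(np.sort_complex(vals))
```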
Location: Smith Hall 205
Title: Developing high order, efficient, and stable time-evolution methods using a time-filtering approach
Abstract: Time-stepping methods are critical to the stability, accuracy, and efficiency of the numerical solution of partial differential equations. In many legacy codes, well-tested low-order time-stepping modules are difficult to change; however, their accuracy and efficiency properties may form a bottleneck. Time filtering has been used to enhance the order of accuracy (as well as other properties) of time-stepping methods in legacy codes. In this talk I will describe our recent work on time-filtering methods for the Navier-Stokes equations as well as other applications. A rigorous development of such methods requires an understanding of the effect of the modification of inputs and outputs on the accuracy, efficiency, and stability of the time-evolution method. In this talk, we show that time-filtering a given method can be seen as equivalent to generating a new general linear method (GLM). We use this GLM approach to develop an optimization routine that enabled us to find new time-filtering methods with high order and good linear stability properties. In addition, understanding the dynamics of the errors allows us to combine the time-filtering GLM methods with the error-inhibiting approach to produce a third-order A-stable method based on alternating time-filtering of the implicit Euler method. I will present our new methods and show their performance on sample problems.
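As a small illustration of the mechanism (using the known three-point filter of Guzel and Layton rather than the new methods of the talk): one post-step filter applied after each implicit Euler step raises the observed order from one to two on a linear test problem.

```python
import numpy as np

lam = -2.0                                   # test problem y' = lam * y, y(0) = 1
y_exact = lambda t: np.exp(lam * t)

def solve(h, T, filtered):
    n = int(round(T / h))
    y = np.empty(n + 1)
    y[0] = 1.0
    for i in range(n):
        y_new = y[i] / (1.0 - h * lam)       # implicit Euler step (linear, closed form)
        if filtered and i >= 1:
            # Three-point time filter: a post-step correction that lifts
            # implicit Euler from first order to second order.
            y_new -= (y_new - 2.0 * y[i] + y[i - 1]) / 3.0
        y[i + 1] = y_new
    return abs(y[-1] - y_exact(T))

for filtered in (False, True):
    e1, e2 = solve(0.02, 1.0, filtered), solve(0.01, 1.0, filtered)
    label = "filtered" if filtered else "plain"
    print(f"{label:8s} observed order: {np.log2(e1 / e2):.2f}")
```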
Location: Smith Hall 205
Title: Curse of dimensionality and PCA: 20 years on spiked covariance matrix model
Abstract: This is a survey talk, aimed mainly at random-matrix non-experts and graduate students. High-dimensional data analysis has become one of the central topics in modern statistics and computer science. In this area, the dimension of the data usually diverges with, or is even larger than, the sample size. Consequently, classical estimation, inference, and decision theory, which assume fixed dimensionality, usually lose their validity. The main technical reason is that standard concentration results, like the law of large numbers and the central limit theorem, usually fail without substantial modification. To address these issues, random matrix theory has emerged as a particularly useful framework and tool. In this talk, I will explain the curse of dimensionality using principal component analysis (PCA), and survey the existing results on the famous and simple spiked model. This model was proposed by Iain Johnstone in 2000, and it has taken more than 20 years to even partially understand it. Open questions will also be discussed.
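The spiked model's phase transition is easy to see numerically. The sketch below draws data whose population covariance is the identity except for one eigenvalue (the "spike"), once below and once above the critical threshold 1 + sqrt(p/n), and prints the top sample eigenvalue; the parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 400, 800                      # dimension comparable to sample size
gamma = p / n                        # aspect ratio; BBP threshold is 1 + sqrt(gamma)

for spike in (1.2, 4.0):             # one value below, one above the threshold
    scale = np.ones(p)
    scale[0] = np.sqrt(spike)        # population covariance: identity plus one spike
    X = rng.normal(size=(n, p)) * scale
    top = np.linalg.eigvalsh(X.T @ X / n)[-1]
    # Below the threshold the top sample eigenvalue sits at the Marchenko-Pastur
    # edge (1 + sqrt(gamma))^2; above it, near spike * (1 + gamma / (spike - 1)).
    print(f"spike={spike}: top sample eigenvalue = {top:.3f}, "
          f"MP edge = {(1 + np.sqrt(gamma))**2:.3f}")
```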
Location: Smith Hall 205
Title: The Physics of Data, or the Entropy Theory of Information
Abstract: In classical applied mathematics, the concepts of "observables" and "measurements" play very limited roles; they are the focus of statistics. Counting ad infinitum is the holographic observable to an ergodic dynamics with finite states under independent repeated sampling. Entropy provides the infinitesimal probability for an observed frequency ν w.r.t. a probability prior p. Following Callen's thermodynamic postulate and through the Legendre-Fenchel transform, without help from mechanics, we show that an internal energy μ emerges; it provides a linear representation of real-valued observables with full or partial information. Gibbs' fundamental thermodynamic relation and theory of ensembles follow mathematically. μ is to ν what ω is to t in Fourier analysis.
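In standard notation (the speaker's conventions may differ), the two relations the abstract alludes to are Sanov's large-deviation formula for the observed frequency and the Legendre-Fenchel conjugate that produces the linear representation:

```latex
% Sanov: the probability of observing the empirical frequency \nu under
% i.i.d. sampling from the prior p decays with the relative entropy,
\Pr\{\hat\nu_N \approx \nu\} \asymp e^{-N H(\nu\,\|\,p)},
\qquad
H(\nu\,\|\,p) = \sum_i \nu_i \log\frac{\nu_i}{p_i};
% Legendre-Fenchel: the conjugate variable \mu gives a linear representation
% of observables through the free-energy (cumulant generating) function,
\psi(\mu) = \sup_{\nu}\bigl\{\langle\mu,\nu\rangle - H(\nu\,\|\,p)\bigr\}
          = \log\sum_i p_i\, e^{\mu_i}.
```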
Location: Smith Hall 205
Title: Capturing Multiscale Physics using High Fidelity Continuum Plasma Models
Abstract: Plasmas are composed of a large number of charged and neutral particles that interact through electromagnetic forces and collisions over a wide range of temporal and spatial scales to produce collective dynamics. The discrete nature of the plasma particles, their large number, and the wide range of interaction scales make a many-body treatment computationally prohibitive. A statistical treatment evolves the probability density function for each particle species in phase space (x,v) and produces the more manageable but six-dimensional kinetic model, which is typically considered to provide the highest physical fidelity. Assumptions related to thermodynamic relaxation reduce the kinetic model through velocity moments, retaining more limited information about the distribution function at each position (x). The moment models, e.g. the 5N, 10N, and 13N models, are three-dimensional and thereby offer substantial computational acceleration compared to the kinetic model. The validity of these assumptions, and of the resulting moment models, depends on local plasma parameters such as collisionality, charge neutrality, and magnetization; such considerations are necessary when selecting the appropriate model for a simulation. Expressing the continuum plasma models in a consistent formulation enables the use of high-order representations, such as finite element methods, and simplifies hybridization. Solutions using the discontinuous Galerkin method will be described, where the governing equations are expressed in balance-law form, an approximate Riemann solver is developed for evaluating inter-element numerical fluxes, and Runge-Kutta methods perform the temporal advance. The computational methods are applied to the GEM collisionless magnetic reconnection problem and the magnetized Kelvin-Helmholtz instability. Spatially localized deviations away from thermodynamic equilibrium are investigated, as well as the resulting agreement and disagreement of global solution metrics between plasma models of differing fidelity. The results suggest opportunities to hybridize by applying the simplest locally valid plasma model and developing strategies to couple the models across subdomain boundaries.
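For reference, the velocity-moment reduction described above, written in one standard form (conventions and normalizations vary):

```latex
% Velocity moments of the kinetic distribution f_s(x, v, t) for species s:
n_s = \int f_s \, d^3v,                            % number density
n_s \mathbf{u}_s = \int \mathbf{v}\, f_s \, d^3v,  % bulk velocity
\mathbb{P}_s = m_s \int (\mathbf{v} - \mathbf{u}_s)(\mathbf{v} - \mathbf{u}_s)\, f_s \, d^3v.  % pressure tensor
% Each moment equation involves the next higher moment, so a closure
% assumption (e.g. near-thermodynamic relaxation) truncates the hierarchy.
```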
Location: Smith Hall 205
Title: Double dipping: problems and solutions, with application to single-cell RNA-sequencing data
Abstract: In contemporary applications, it is common to collect very large data sets with the vaguely defined goal of hypothesis generation. Once a data set is used to generate a hypothesis, we might wish to test that hypothesis on the same set of data. However, this type of "double dipping" violates a cardinal rule of statistical hypothesis testing: namely, that we must decide what hypothesis to test before looking at the data. When this rule is violated, standard statistical hypothesis tests (such as t-tests and z-tests) fail to control the selective Type I error, that is, the probability of rejecting the null hypothesis, given that the null hypothesis holds and that we decided to test it.
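The phenomenon is easy to reproduce. In the sketch below, pure noise is split into two "clusters" using the data itself, and a standard two-sample t-test on the same data rejects the (true) null far more than 5% of the time. This mirrors toy examples from the selective-inference literature rather than the speaker's specific procedure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, n, rejections = 2000, 100, 0

for _ in range(n_sims):
    x = rng.normal(size=n)                 # pure noise: no real groups exist
    # "Hypothesis generation": split into two groups using the data itself
    # (a crude one-dimensional clustering at the sample mean).
    g1, g2 = x[x > x.mean()], x[x <= x.mean()]
    # "Hypothesis testing" on the same data with a standard two-sample t-test.
    rejections += stats.ttest_ind(g1, g2).pvalue < 0.05

# A valid test would reject about 5% of the time; double dipping rejects
# essentially always.
print("selective Type I error rate:", rejections / n_sims)
```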
Location: Smith Hall 205
Title: Transforming meshes, or, algebraic topology for fun and profit
Abstract: Generating meshes is one of the key ingredients in solving partial differential equations on irregular domains using the finite element or finite volume method. All algorithms for generating, optimizing, and simplifying unstructured meshes are based on applying a sequence of elementary topological transformations. The vocabulary of available transformations dictates how well the algorithm works. In most meshing tools, however, this vocabulary is quite limited because implementing and debugging the transformations is... not enjoyable, in any respect. In this talk, I'll describe a heartbreakingly elegant* way to implement these transformation kernels. The method uses some ideas from algebraic topology but the only prerequisite is linear algebra.
*Rotten fruit will be provided for the audience in the event that the method should prove insufficiently elegant.
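As a toy illustration of the boundary-matrix idea (a sketch only, not the talk's actual transformation kernels): a mesh can be stored as the signed incidence matrices of its chain complex, the identity "boundary of boundary is zero" is a one-line sanity check, and an edge flip amounts to rewriting a few columns while preserving that identity.

```python
import numpy as np

# A quad {0,1,2,3} triangulated along the diagonal (0,2). d1 maps edges to
# vertices (-1 at tail, +1 at head); d2 maps triangles to signed edge loops.
# Edge columns: e0=(0,1), e1=(1,2), e2=(0,2), e3=(2,3), e4=(3,0).
d1 = np.array([[-1,  0, -1,  0,  1],
               [ 1, -1,  0,  0,  0],
               [ 0,  1,  1, -1,  0],
               [ 0,  0,  0,  1, -1]])
d2 = np.array([[ 1,  0],          # T0 = (0,1,2) = e0 + e1 - e2
               [ 1,  0],          # T1 = (0,2,3) = e2 + e3 + e4
               [-1,  1],
               [ 0,  1],
               [ 0,  1]])
assert not (d1 @ d2).any()        # chain-complex identity: boundary of boundary = 0

# Edge flip: replace the diagonal (0,2) by (1,3). Only one column of d1 and
# the two triangle columns of d2 change, and the identity d1 @ d2 = 0 pins
# down the valid new triangles (recoverable by a null-space computation).
d1[:, 2] = [0, -1, 0, 1]          # e2 becomes the edge (1,3)
d2 = np.array([[ 1,  0],          # T0' = (0,1,3) = e0 + e2 + e4
               [ 0,  1],          # T1' = (1,2,3) = e1 - e2 + e3
               [ 1, -1],
               [ 0,  1],
               [ 1,  0]])
assert not (d1 @ d2).any()        # still a valid complex after the flip
print("edge flip preserved the chain-complex identity")
```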