Fall 2017

Short list of speakers (see further below for full abstracts):

  • Aug 29, Welcome Meeting
  • Sept 5, Manuel Lladser (CU-Boulder), Approximation of Markovian functionals
  • Sept 12, Dan Larremore (CU-Boulder), A Physical Model for Efficient Ranking in Networks
  • Sept 19, Keith Lindsay (UCAR), A Newton-Krylov Solver for Fast Spin-up of Online Ocean Tracers
  • Sept 26, Peter Wills (CU-Boulder), Anomaly Detection on Graphs using the Resistance Metric
  • Sept 26, Kathleen Finlinson (CU-Boulder), Tunability of Neural Networks
  • Oct 3, Wen Zhou (CSU-Fort Collins), A Nonparametric Procedure to Detect Spurious Discoveries with Sparse Signals
  • Oct 10, Jason Dou (CU-Boulder), A Least Squares Monte Carlo Approach to Appointment Scheduling under Patient Cancellation and No-Show Behavior
  • Oct 10, Fan You (CU-Boulder), An Approximate Dynamic Programming Approach to a Rolling-Horizon Appointment Scheduling Problem
  • Oct 17, Luis Tenorio (School of Mines), Randomization Methods for Large Linear Least-squares and Inverse Problems
  • Oct 24, TBD
  • Oct 31, Carlos Martins-Filho (CU-Boulder), Estimation of a Partially Linear Regression in Triangular Systems
  • Nov 7, Richard Clancy (CU-Boulder), Lazy PCA: Even Faster SVD Decomposition Yet Without Agonizing Pain
  • Nov 14, Gabriel Ortiz-Pena (CU-Boulder), Understanding Black-box Predictions via Influence Functions
  • Nov 21, No Class (Thanksgiving Break)
  • Nov 28, Yu Du (UC-Denver), Selective Linearization for Multi-block Statistical Learning Problems
  • Dec 5, No Seminar
  • Dec 12, McKell Carter (CU-Boulder), Information Processing Models of Neuroimaging Data

For students enrolled: paper signup (Google Sheets)

Abstracts

SPEAKER: Manuel Lladser, Associate Professor of Applied Mathematics at the University of Colorado-Boulder

TITLE: Approximation of Markovian functionals

TIME: 3:30 PM, Tuesday, 5 September 2017

PLACE: Newton Lab

ABSTRACT

In this presentation, we will discuss work in progress on approximating the distribution of a so-called linear functional of the path of a Markov chain. Archetypical examples are sojourn times, such as those encountered in genomic sequence analyses, telecommunication protocols, and inventory models. We will see how one can use low-rank matrices to approximate the distribution of such functionals in the L1 norm (as opposed to the traditional approach based on the L2 norm). The technical motivation is that the L1 norm is, up to a constant factor, equal to the "total variation distance," which has a more practical probabilistic interpretation and is the standard metric used to analyze Markovian processes. This work is in collaboration with Dr. Barrera from Universidad Adolfo Ibáñez in Chile.

SPEAKER: Daniel Larremore, Assistant Professor, BioFrontiers Institute & Department of Computer Science, University of Colorado-Boulder

TITLE: A Physical Model for Efficient Ranking in Networks

TIME: 3:30 PM, Tuesday, 12 September 2017

PLACE: Newton Lab

ABSTRACT

We present a principled model and algorithm to infer a hierarchical ranking of nodes in directed networks. Unlike other methods such as minimum violation ranking, it assigns real-valued scores to nodes rather than simply ordinal ranks, and it formalizes the assumption that interactions are more likely to occur between individuals with similar ranks. It provides a natural framework for a statistical significance test for distinguishing when the inferred hierarchy is due to the network topology or is instead due to random chance, and it can be used to perform inference tasks such as predicting the existence or direction of edges. The ranking is inferred by solving a linear system of equations, which is sparse if the network is; thus the resulting algorithm is extremely efficient and scalable. We illustrate these findings by analyzing real and synthetic data and show that our method outperforms others, in both speed and accuracy, in recovering the underlying ranks and predicting edge directions. This work is a collaboration with Caterina De Bacco and Cris Moore.
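The core computation described above, inferring real-valued ranks from a sparse linear system, can be sketched as follows. This is a minimal illustration of the spring-energy idea (each directed edge i → j pulls rank s_i above s_j by one unit), not the speakers' exact method; the regularization parameter `alpha` is an assumption added to pin down the system's translation invariance.

```python
import numpy as np

def spring_ranks(A, alpha=1e-8):
    """Ranks minimizing sum_ij A_ij * (s_i - s_j - 1)^2, via a linear solve.

    A is a (possibly weighted) directed adjacency matrix; alpha is a small
    assumed ridge term that fixes the otherwise-free overall shift.
    """
    n = A.shape[0]
    d_out = A.sum(axis=1)   # weighted out-degrees
    d_in = A.sum(axis=0)    # weighted in-degrees
    # Stationarity condition: (D_out + D_in - A - A^T) s = d_out - d_in
    M = np.diag(d_out + d_in) - A - A.T + alpha * np.eye(n)
    return np.linalg.solve(M, d_out - d_in)

# Toy check: 0 beats 1, 1 beats 2, so ranks should decrease down the chain.
A = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [0., 0., 0.]])
s = spring_ranks(A)
```

For a sparse network the matrix M inherits the sparsity, which is what makes the linear-solve formulation scalable in practice.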

SPEAKER: Keith Lindsay, Climate and Global Dynamics Laboratory, University Corporation for Atmospheric Research (UCAR)

TITLE: A Newton-Krylov Solver for Fast Spin-up of Online Ocean Tracers

TIME: 3:30 PM, Tuesday, 19 September 2017

PLACE: Newton Lab

ABSTRACT

A challenge that arises when simulating tracers in an ocean model is spinning up the tracers to be in balance with the model's circulation. This spin-up is desirable for clean comparison of the modeled solution to observations, such as nutrient distributions, and for initializing transient experiments, such as those done with coupled climate carbon models (e.g., bomb radiocarbon). Two aspects of the challenge are the long time scales of ocean ventilation and the short time scales of processes in the upper ocean. We present results here that demonstrate the successful application of a Newton-Krylov based solver to efficiently spin up tracers in online ocean tracer simulations.
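The spin-up idea can be sketched in miniature: a spun-up tracer is a fixed point x* of the one-cycle propagator Φ (one model year of circulation plus sources and sinks), found as a root of F(x) = Φ(x) − x with a Newton-Krylov solver. The toy operator `mix` and source term below are invented stand-ins for the actual ocean model.

```python
import numpy as np
from scipy.optimize import newton_krylov

rng = np.random.default_rng(0)
n = 50
# Toy "transport" operator with spectral radius below one, plus toy sources;
# both are assumptions standing in for a real ocean circulation model.
mix = np.eye(n) * 0.9 + rng.random((n, n)) * 0.1 / n
source = rng.random(n)

def one_cycle(x):
    """One model 'year' of tracer evolution (the propagator Phi)."""
    return mix @ x + 0.1 * source

def residual(x):
    """Zero exactly at the spun-up (equilibrium) tracer state."""
    return one_cycle(x) - x

# Newton-Krylov finds the fixed point without ever forming a Jacobian,
# which is the appeal for high-dimensional ocean state vectors.
x_star = newton_krylov(residual, np.zeros(n), f_tol=1e-10)
```

The alternative, direct time-stepping to equilibrium, would require simulating thousands of model years because of the long ventilation time scales the abstract mentions.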

SPEAKER: Peter Wills, PhD Student, Department of Applied Mathematics, University of Colorado Boulder

TITLE: Anomaly Detection on Graphs using the Resistance Metric

TIME: 3:30 PM, Tuesday, 26 September 2017

PLACE: Newton Lab

ABSTRACT

In the era of big data, algorithms are needed to analyze graphical data (consisting of objects and relationships) that is dynamic (changing in time). In particular, one might wish to know when a change made to a graph significantly affects the flow of information on the graph. We propose an algorithm based on the graph resistance, which is a topologically sensitive measure of distance between nodes on a graph. We provide quantitative metrics and experimental data indicating the utility of this approach.
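The graph resistance mentioned above can be computed from the Moore-Penrose pseudoinverse of the graph Laplacian: R_ij = L⁺_ii + L⁺_jj − 2 L⁺_ij. The sketch below shows only this distance computation, not the talk's anomaly-detection algorithm built on top of it.

```python
import numpy as np

def resistance_matrix(A):
    """Effective resistance between all node pairs of an undirected graph.

    Treats each edge as a unit resistor; nodes joined by many short paths
    have small resistance, which is why the metric is topologically
    sensitive to changes in information flow.
    """
    L = np.diag(A.sum(axis=1)) - A           # graph Laplacian
    Lp = np.linalg.pinv(L)                   # pseudoinverse (L is singular)
    d = np.diag(Lp)
    return d[:, None] + d[None, :] - 2 * Lp  # R_ij = Lp_ii + Lp_jj - 2 Lp_ij

# Sanity check on a triangle: between any two nodes there are parallel
# paths of resistance 1 and 2, so the effective resistance is 2/3.
A = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])
R = resistance_matrix(A)
```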

SPEAKER: Kathleen Finlinson, PhD Student, Department of Applied Mathematics, University of Colorado Boulder

TITLE: Tunability of Neural Networks

TIME: 3:30 PM, Tuesday, 26 September 2017

PLACE: Newton Lab

ABSTRACT

TBD

SPEAKER: Wen Zhou, Assistant Professor of Statistics at Colorado State University - Fort Collins

TITLE: A Nonparametric Procedure to Detect Spurious Discoveries with Sparse Signals

TIME: 3:30 PM, Tuesday, 3 October 2017

PLACE: Newton Lab

ABSTRACT

Identifying a subset of response-associated covariates from a large number of candidates has become a fundamental tool for scientific discovery in many fields, particularly in biology: differential expression analysis in genomics, genome-wide association studies (GWAS) in genetics, critical transcription factor identification in the Encyclopedia of DNA Elements (ENCODE) project, and more. However, given the high dimensionality and the sparsity of signals in data from such studies, spurious discoveries can easily arise. In addition, the ubiquity of data with mixed types, along with sophisticated dependence structures, greatly limits the applicability of traditional goodness-of-fit based procedures. In this paper, we introduce a statistical measure of the goodness of spurious fit based on the maximum rank correlation between predictors and responses. The proposed statistic imposes no assumptions on the data types or underlying models and can be regarded as a generalization of the maximum spurious correlation for linear models. We derive the asymptotic distribution of this spurious goodness of fit under very mild assumptions on the associations among predictors and responses. The asymptotic distribution depends on the sample size, the ambient dimension, the number of predictors under study, and the covariance information. We propose a multiplier bootstrap procedure to estimate this distribution and use it as a benchmark to guard against spurious discoveries. The procedure is also applied to variable selection problems for high-dimensional generalized regressions. While the theory and method are illustrated by numerical studies, we apply our method to both GWAS and ENCODE studies to demonstrate that the proposed measure provides a statistical verification of detected biomarkers in practice and reveals the necessity of a two-stage, or even multi-stage, statistical approach for general genomic and genetic research.
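The statistic at the heart of the abstract, the maximum rank correlation between a response and each of many candidate predictors, can be sketched as follows. For the null benchmark we substitute a simple permutation of the response, a deliberate simplification of the paper's multiplier bootstrap; the dimensions and the 95% level are illustrative assumptions.

```python
import numpy as np
from scipy.stats import spearmanr

def max_rank_corr(X, y):
    """Max absolute Spearman rank correlation over all candidate predictors."""
    return max(abs(spearmanr(X[:, j], y)[0]) for j in range(X.shape[1]))

rng = np.random.default_rng(1)
n, p = 100, 50
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)   # independent of X: any apparent fit is spurious

t_obs = max_rank_corr(X, y)

# Null distribution of the spurious fit via response permutation
# (the paper uses a multiplier bootstrap instead).
null = [max_rank_corr(X, rng.permutation(y)) for _ in range(100)]
threshold = np.quantile(null, 0.95)  # benchmark against spurious discoveries
```

A "discovery" whose maximum rank correlation falls below this benchmark is indistinguishable from noise, which is the guard-rail the abstract describes.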

SPEAKER: Jason Dou, PhD Student in Operations Management, University of Colorado Boulder

TITLE: A Least Squares Monte Carlo Approach to Appointment Scheduling under Patient Cancellation and No-Show Behavior

TIME: 3:30 PM, Tuesday, 10 October 2017

PLACE: Newton Lab

ABSTRACT

Patient cancellation and no-show behavior is a major challenge in appointment scheduling. We introduce a stochastic dynamic programming model for appointment scheduling, develop a least squares Monte Carlo approach to tackle the problem, and show promising performance via extensive numerical experiments.

SPEAKER: Fan You, PhD Student in Operations Management, University of Colorado Boulder

TITLE: An Approximate Dynamic Programming Approach to a Rolling-Horizon Appointment Scheduling Problem

TIME: 3:30 PM, Tuesday, 10 October 2017

PLACE: Newton Lab

ABSTRACT

We consider a rolling-horizon appointment scheduling problem with multiple patient classes. The problem is formulated as an infinite-horizon discounted-cost Markov decision process. We consider affine and finite-horizon approximations and show that they admit compact representations and can be efficiently solved as small-scale linear programs. A numerical study illustrates the performance of heuristic control policies based on the approximations.

SPEAKER: Luis Tenorio, Associate Professor of Applied Mathematics and Statistics, Colorado School of Mines

TITLE: Randomization Methods for Large Linear Least-squares and Inverse Problems

TIME: 3:30 PM, Tuesday, 17 October 2017

PLACE: Newton Lab

ABSTRACT

I will consider randomized versions of stochastic Newton and stochastic quasi-Newton methods that can be used to solve large linear least-squares and inverse problems where the large data sets present a significant computational burden (e.g., the size may exceed computer memory or data are collected in real time). In the proposed framework, stochasticity is introduced in two different ways as a means to overcome these computational limitations. The randomized recursion defines quasimartingales with provable convergence properties.
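One simple instance of randomization in this setting is "sketch-and-solve" for an overdetermined least-squares problem min ||Ax − b||: a random matrix S compresses the rows so the reduced problem fits in memory. The talk's stochastic (quasi-)Newton recursions are more elaborate; this sketch, with assumed problem sizes, only illustrates the role the random compression plays.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, k = 10000, 10, 200   # many rows, few columns, modest sketch size

# Synthetic least-squares problem with a known solution and small noise.
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
b = A @ x_true + 0.01 * rng.standard_normal(m)

# Gaussian sketch: S has k << m rows, so S@A and S@b are small enough to
# solve directly even when A itself is too large to process at once.
S = rng.standard_normal((k, m)) / np.sqrt(k)
x_sk, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
```

Because a Gaussian sketch approximately preserves the column geometry of A, the compressed solution stays close to the full least-squares solution with high probability.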

SPEAKER: Carlos Martins-Filho, Professor of Economics, University of Colorado Boulder

TITLE: Estimation of a Partially Linear Regression in Triangular Systems

TIME: 3:30 PM, Tuesday, 31 October 2017

PLACE: Newton Lab

ABSTRACT

We propose kernel-based estimators for the components of a partially linear regression in a triangular system where endogenous regressors appear both in the linear and nonparametric components of the regression. Compared with other estimators currently available in the literature, e.g. the sieve estimators proposed in Ai and Chen (2003) or Otsu (2011), our estimators have explicit functional form and are much easier to implement. They rely on a set of assumptions introduced by Newey et al. (1999) that characterize what has become known as the "control function" approach for endogeneity in regression. We explore conditional moment restrictions that make this model suitable for additive regression estimation as in Kim et al. (1999) and Manzan and Zerom (2005). We establish consistency and square-root n asymptotic normality of the estimator for the parameters in the linear component of the model, give a uniform rate of convergence, and establish the asymptotic normality for the estimator of the nonparametric component. In addition, for statistical inference, a consistent estimator for the covariance of the limiting distribution of the parametric estimator is provided.
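The control function idea underlying these estimators can be illustrated in a fully parametric toy version: regress the endogenous regressor on the instrument, then include the first-stage residual as an extra control, so that the remaining variation in the regressor is exogenous. The talk's kernel-based estimators are far more general; the simulated design and coefficients below are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 20000
z = rng.standard_normal(n)                        # instrument
u = rng.standard_normal(n)                        # unobserved confounder
x = z + u + 0.5 * rng.standard_normal(n)          # endogenous regressor
y = 2.0 * x + u + rng.standard_normal(n)          # structural eq., beta = 2

# Naive OLS of y on x is biased upward: x is correlated with u.
naive = np.polyfit(x, y, 1)[0]

# Stage 1: regress x on the instrument; keep the residual v as the control.
g = np.polyfit(z, x, 1)
v = x - np.polyval(g, z)

# Stage 2: regress y on x and the control function v; conditioning on v
# absorbs the endogenous variation, so the coefficient on x is consistent.
Xmat = np.column_stack([x, v, np.ones(n)])
beta = np.linalg.lstsq(Xmat, y, rcond=None)[0]
```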

PRESENTER: Richard Clancy, PhD Student, Department of Applied Mathematics, University of Colorado Boulder

TITLE: Lazy PCA: Even Faster SVD Decomposition Yet Without Agonizing Pain

TIME: 3:30 PM, Tuesday, 7 November 2017

PLACE: Newton Lab

PRESENTER: Gabriel Ortiz-Pena, PhD Student, Department of Astrophysical and Planetary Sciences, University of Colorado Boulder

TITLE: Understanding Black-box Predictions via Influence Functions

TIME: 3:30 PM, Tuesday, 14 November 2017

PLACE: Newton Lab

SPEAKER: Yu Du, Assistant Professor of Business Analytics, University of Colorado Denver

TITLE: Selective Linearization for Multi-block Statistical Learning Problems

TIME: 3:30 PM, Tuesday, 28 November 2017

PLACE: Newton Lab

ABSTRACT

We consider the problem of minimizing a sum of several convex non-smooth functions. In this talk, we introduce a new algorithm, called selective linearization, which iteratively linearizes all but one of the functions and employs simple proximal steps. The algorithm is a form of multiple operator splitting in which the order of processing partial functions is not fixed, but rather determined in the course of calculations. It is one of the first operator-splitting methods that is globally convergent for an arbitrary number of operators without artificial duplication of variables. The algorithm is a multi-block extension of the alternating linearization (ALIN) method for solving structured non-smooth convex optimization problems.

Global convergence is proved and estimates of the convergence rate are derived. Specifically, under a strong convexity condition, the number of iterations needed to achieve solution accuracy ε is of order O(ln(1/ε)/ε). Our convergence rate analysis technique can also be used to derive the rate of convergence of the classical bundle ALIN method, for which no convergence rate estimate was previously available.

We report results of extensive comparison experiments on statistical learning problems such as the large-scale fused lasso regularization problem, the overlapping group lasso problem, and the regularized support vector machine problem. The numerical results demonstrate the efficacy and accuracy of the method.
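The "linearize all but one function, then take a proximal step" pattern can be shown in a minimal two-block toy: minimize f(x) + g(x) with f(x) = ½||Ax − b||² and g(x) = λ||x||₁, repeatedly linearizing f at the current point and solving the proximal subproblem in g (soft-thresholding). The actual selective linearization method chooses which block to linearize adaptively and handles many blocks; this fixed two-block scheme, with assumed problem sizes, only illustrates the basic structure.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, lam = 40, 20, 0.1
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

def objective(x):
    return 0.5 * np.linalg.norm(A @ x - b)**2 + lam * np.abs(x).sum()

step = 1.0 / np.linalg.norm(A, 2)**2   # 1/L for the smooth block f
x = np.zeros(n)
for _ in range(500):
    # Linearize f at x (keep only its gradient)...
    grad = A.T @ (A @ x - b)
    z = x - step * grad
    # ...and solve the proximal subproblem in g exactly (soft-threshold).
    x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
```

Because each subproblem keeps one function exact and replaces the others by cheap linear models, every iteration stays simple even when the full sum is badly non-smooth.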

SPEAKER: McKell Carter, Assistant Professor of Psychology and Neuroscience, University of Colorado Boulder

TITLE: Information Processing Models of Neuroimaging Data

TIME: 3:30 PM, Tuesday, 12 December 2017

PLACE: Newton Lab

ABSTRACT

The interpretation of neuroimaging data presents unique challenges for machine learning. A wide variety of machine learning techniques have been applied to the problem of decoding mental states from neuroimaging data. These approaches have been limited by the difficulty of relating high-dimensional data sets to complex stimuli that can be interpreted from multiple viewpoints and in multiple contexts. Rather than attempting to directly decode experienced mental states, we take a cognitive science approach to the analysis of fMRI data: we seek to trace the transformation of information throughout the brain using constraints from psychological studies. I will give a brief overview of work in my lab and describe two ongoing projects. In the first, we seek to overcome biasing factors like arousal and engagement, which are often confounded with the psychological process of interest, using a mixture-model approach. We constructed a Student's t-distribution mixture model based on Scheer's optimization, adding Brent's method to estimate the degrees of freedom of the t-distributions. This approach shows promise for identifying noise components and providing better specificity in interpreting the computational processes localized to a particular part of the brain. In the second, we sought to characterize processing-stream characteristics like those found in the visual hierarchy. We utilized nonparametric higher-order characteristics of hitting-time distributions from an open-source functional magnetic resonance imaging dataset to identify brain networks that may contain isolated information processing hierarchies. These measures differ between resting and task imaging data, providing support for the detected changes in information processing streams. The same metric differs between neurotypical participants and those with schizophrenia, providing a potential clinical application of the measure. In sum, we utilized a cognitive science approach to constrain model selection in the characterization of neuroimaging data, with promising results.