# Schedule & Speaker Abstracts

## Workshop schedule

Invited speaker abstracts and bios are available at the end.

*Location: Palais des Congrès de Montréal, Montréal, Canada - Room 517 D*

### Session 1:

8:25-8:30 **Introductions**

8:30-9:00 **Invited Talk**

Sinead Williamson "*Random clique covers for graphs with local density and global sparsity*"

9:00-9:30 **Invited Talk**

Benjamin Bloem-Reddy "*Left-neutrality: an old friend in the mirror*"

9:30-9:45 **Contributed Talk**

Leo Duan "*Distribution-based Clustering using Characteristic Function*"

9:45-10:15 **Poster Spotlights**

**Adam Farooq - Adaptive principal component analysis**

**An Bian - Optimal DR-Submodular Maximization and Applications to Provable Mean Field Inference**

**Ari Pakman - Amortized Bayesian inference for clustering models**

**Carl Henrik Ek - Sequence Alignment with Dirichlet Process Mixtures**

**Feras Saad - Goodness-of-Fit Tests for High-Dimensional Discrete Distributions with Application to Convergence Diagnostics in Bayesian Nonparametric Inference**

**Iñigo Urteaga - Nonparametric Gaussian mixture models for the multi-armed contextual bandit**

**Kai Xu - Amortized Inference for Latent Feature Models Using Variational Russian Roulette**

**Kelvin Hsu - Bayesian Learning of Conditional Kernel Means for Automatic Likelihood-Free Inference**

**Linfeng Liu - Non-Parametric Variational Inference for Gaussian Processes with Graph Convolutional Networks**

**Lorenzo Masoero - Posterior representations of hierarchical completely random measures in trait allocation models**

**Melanie F. Pradier - Hierarchical Stick-breaking Feature Paintbox**

**Michael Riis Andersen - A non-parametric probabilistic model for monotonic functions**

**Sandhya Prabhakaran - A Bayesian Model for Overlapping Clusters from Distance Data**

**Wil Ward - Variational Bridge Constructs for Approximate Gaussian Process Regression**

**Yiming Sun - Kernel Distillation for Fast Gaussian Processes Prediction**

10:15-11:00 **Poster Session** + Coffee Break (at 10:30)

### Session 2:

11:00-11:30 **Invited Talk**

Allison Chaney "*Nonparametric Deconvolution Models*"

11:30-11:45 **Contributed Talk**

Miriam Shiffman "*Reconstructing probabilistic trees of cellular differentiation from single-cell RNA-seq data*"

11:45-12:30 **Poster Session**

### Session 3: Lunch Tutorial

*(lunch provided at 12 pm, first come, first served; sponsored by Google)*

12:30 **TensorFlow Probability Software Tutorial** (led by Matt Hoffman)

Uses TensorFlow and TensorFlow Probability.

### Session 4:

2:00-2:30 **Invited Talk**

Joseph Futoma "*Learning to Detect Sepsis with a Multi-output Gaussian Process RNN Classifier (in the Real World!)*"

2:30-3:00 **Software Panel** (moderated by Mike Hughes)

- David Duvenaud
- Matt Hoffman
- Ben Letham
- Dustin Tran
- Aki Vehtari

3:00-3:30 **Poster Session** + Coffee Break

### Session 5:

3:30-4:00 **Invited Talk**

Hamsa Balakrishnan "*Modeling the Fuel Consumption of Aircrafts*"

4:00-4:15 **Contributed Talk**

Irineo Cabreros "*A nonparametric Bayes approach to hypothesis testing*"

4:15-4:30 **Contributed Talk**

Runjing Liu "*Evaluating Sensitivity to the Stick Breaking Prior in Bayesian Nonparametrics*"

4:30-5:30 **Research Panel** (moderated by Erik Sudderth)

- Barbara Engelhardt
- Tom Griffiths
- Isabel Valera
- Hanna Wallach
- Sinead Williamson

5:30-6:30 **Poster Session** & Individual Discussion

## Invited Speakers

**Hamsa Balakrishnan** (MIT) - *Modeling the Fuel Consumption of Aircrafts*

Abstract: Fuel consumption is a major contributor to airline operating costs as well as aircraft emissions. Aviation emissions inventories depend on the fuel burn and emissions contributions of individual flights, which are not publicly reported by the airlines. By contrast, flight trajectories can be easily observed using surveillance systems. The fuel burn profiles of flights also show considerable variability in real-world operations, which needs to be reflected in the estimates. In this talk, I describe our efforts to develop models that predict the fuel consumption and takeoff weight of a flight from its trajectory, using Gaussian Process Regression. These models are found to be significantly better than state-of-the-practice aircraft performance models, with a more than 60% improvement in fuel burn estimation and a 75% improvement in takeoff weight estimation.

(Joint work with Dr. Yashovardhan Chati)

Bio: Hamsa Balakrishnan is an Associate Professor of Aeronautics and Astronautics at the Massachusetts Institute of Technology (MIT). She is the Associate Department Head of Aeronautics and Astronautics, and the Director of Transportation@MIT. She received her PhD from Stanford University, and a B.Tech. from the Indian Institute of Technology Madras. Her current research interests are in the design, analysis, and implementation of control and optimization algorithms for large-scale cyber-physical systems, with an emphasis on air transportation.

**Benjamin Bloem-Reddy** (Oxford) - *Left-neutrality: an old friend in the mirror*

Abstract: Right-neutrality is a useful dependence structure for priors on random distribution functions; it has been used extensively in Bayesian nonparametric models of survival data in large part because it is conjugate to censored observations. Its usefulness stems from an independent increments property that also gives rise to the stick-breaking construction of the Dirichlet Process. Left-neutrality is the mirror image of right-neutrality, but does not seem as useful; it seems to be hardly mentioned in the literature after Doksum defined it in 1974. However, like stepping through Alice's Looking Glass, left-neutrality takes on a life of its own (and things get a little weird) when used to construct priors on sequences of random probability distributions. To illustrate, I will review some of my recent work in which neutral-to-the-left processes arise as conjugate priors in preferential attachment models of network data. I will also discuss some ongoing projects that indicate the potential for wider applicability of left-neutrality.

Bio: Ben Bloem-Reddy is a postdoctoral researcher in the Department of Statistics at the University of Oxford. His research interests include probabilistic and statistical analysis of discrete data such as networks, probabilistic symmetry, and, more recently, the statistical foundations of deep learning. He obtained a PhD in Statistics from Columbia University, an MS in Physics from Northwestern University, and a BS in Physics from Stanford University.

**Allison Chaney** (Princeton) - *Nonparametric Deconvolution Models*

Abstract: We consider the problem of modeling collections of convolved data points; specifically, each observation is composed of particles that originate from diverse factors. This talk will describe nonparametric deconvolution models (NDMs), a family of Bayesian nonparametric models for these data. The objective of this work is to create a general family of models to learn 1) the features of global factors shared among all observations as well as the number and global proportions of these factors; 2) for each observation, the proportion (or membership) of particles that belong to each factor; and 3) the features of observation-specific (or local) factors for each observation. While the first two objectives are fulfilled by existing models, the final objective is unique to our model family. This framework will allow us to ask scientific questions about observations whose local factors deviate from their corresponding global factors (e.g., anomalous voting behavior or cancerous cells).

Bio: Allison Chaney is an IC Postdoctoral Research Fellow at Princeton University, working with Barbara Engelhardt and Brandon Stewart. She also received her Ph.D. in Computer Science at Princeton, under the advisement of David Blei, and holds a B.A. in Computer Science and a B.S. in Engineering from Swarthmore College. In addition to research internships at Microsoft Research and Hunch/eBay, she has previously worked for Pixar Animation Studios and the Yorba Foundation. Her research focuses on developing scalable and interpretable machine learning methods to identify influences on human behavior. This summer she will start as faculty at Duke's Fuqua School of Business in the Marketing area.

**Joseph Futoma** (Harvard) - *Learning to Detect Sepsis with a Multi-output Gaussian Process RNN Classifier (in the Real World!)*

Abstract: Sepsis is a poorly understood and potentially life-threatening complication that can occur as a result of infection. Early detection and treatment improve patient outcomes, and as such it poses an important challenge in medicine. In this work, we develop a flexible classifier that leverages streaming lab results, vitals, and medications to predict sepsis before it occurs. We model patient clinical time series with multi-output Gaussian processes, maintaining uncertainty about the physiological state of a patient while also imputing missing values. Latent function values from the Gaussian process are then fed into a deep recurrent neural network to classify patient encounters as septic or not, and the overall model is trained end-to-end using back-propagation. We train and validate our model on a large retrospective dataset of 18 months of heterogeneous inpatient stays from the Duke University Health System, and develop a new “real-time” validation scheme for simulating the performance of our model as it will actually be used. We conclude by showing how this model is saving lives as a part of SepsisWatch, an application currently being used at Duke Hospital to screen, monitor, and coordinate treatment of septic patients.

Bio: Joseph Futoma is a CRCS Postdoctoral Research Fellow at Harvard University, working with Finale Doshi-Velez. He received his Ph.D. and M.S. from Duke University in Statistical Science under the advisement of Katherine Heller, where he was an NDSEG Fellow. He also holds an A.B. in mathematics from Dartmouth College. His main research focus is at the intersection of machine learning and healthcare, and spans areas such as Gaussian processes, reinforcement learning/sequential decision making, and survival analysis. He is especially interested in practical concerns associated with deployment of machine learning into real clinical settings.

**Sinead Williamson** (UT Austin) - *Random clique covers for graphs with local density and global sparsity*

Abstract: Large real-world graphs tend to be sparse, but they often contain densely connected subgraphs and exhibit high clustering coefficients. While recent random graph models can capture this sparsity, they ignore the local density. We show that models based on random edge clique covers can capture both global sparsity and local density, and are an appropriate modeling tool for many real-world graphs. Joint work with Mauricio Tec.

Bio: Sinead Williamson is an assistant professor in statistics at the University of Texas at Austin, and a research scientist at Amazon. Her research interests include scalable Bayesian inference and Bayesian nonparametrics. Before joining UT Austin, Sinead obtained her PhD from the University of Cambridge working with Zoubin Ghahramani, and spent two years as a postdoc at Carnegie Mellon University working with Eric Xing.