UPDATE: Summer 2018, we are migrating to the new google sites version of this webpage, https://sites.google.com/colorado.edu/statoptml/ Reminder: to sign up for the mailing list (you do not need to be enrolled in the seminar), visit our google groups website.
## Abstracts## SPEAKER: Farhad Pourkamali-Anaraki Postdoctoral Research Associate of Applied Mathematics at the University of Colorado-Boulder
TITLE: New Directions in Randomized Dimension Reduction for Modern Data Analysis With the growing scale and complexity of datasets in scientific disciplines, traditional data analysis methods are no longer practical to extract meaningful information and patterns. The need to process large-scale datasets by memory and computation efficient algorithms arises in all fields of science and engineering. Randomization and probabilistic techniques have become fundamental tools in modern data science and machine learning for analyzing large-scale datasets. In particular, randomized dimensionality reduction techniques are effective in modern data settings since they provide a non-adaptive data-independent mapping of high-dimensional datasets into a lower dimensional space. However, such methods require strong theoretical understanding to ensure that the key properties of original data are preserved. In this talk, we will focus on two important topics in modern data analysis: (1) K-means clustering and (2) low-rank approximation of kernel matrices for analyzing datasets with highly complex and nonlinear structures. Specifically, we will present a randomized algorithm for K-means clustering in the one-pass streaming setting that does not require incoherence or distributional assumptions on the data. Moreover, we will talk about the Nystrom method for generating low-rank approximations of kernel matrices that arise in many machine learning problems. We will discuss how randomized dimensionality reduction techniques allow us to obtain highly accurate and efficient low-rank approximations compared to other state-of-the-art Nystrom methods. ## SPEAKER: David Kozak PhD student at Colorado School of Mines with Luis Tenorio
TITLE: Presenting Mokhtari and Ribeiro's paper "Global Convergence of Online Limited Memory BFGS" See the paper: http://www.jmlr.org/papers/volume16/mokhtari15a/mokhtari15a.pdf ## SPEAKER: Dan Zhang Associate Professor of Operations Management, Leeds School of Business, CU Boulder
TITLE: Some Recent Results on Linear Programming Based Approximate Dynamic Programming ABSTRACT The linear programming based approximate dynamic programming has received considerable attention in the recent literature. In this approach, high dimensional dynamic programs are solved approximately as large-scale linear programs to tackle the curse of dimensionality. The linear programming formulations are called approximate linear programs (ALPs) and typically have a large number of decision variables and constraints. A major challenge of the approach therefore lies in efficient solution of the ALPs. In this talk, I report some recent applications and theoretical results in this area of research. Example applications include network revenue management, medical appointment scheduling, and queueing control. I will conclude with discussions on research directions and potential applications in other application areas. The talk is based on a series of papers. A recent relevant publication http://danzhang.com/papers/RMReduction_Final.pdf ## SPEAKER: Peter Shaffery PhD student in Applied Mathematics, CU Boulder, with Vanja DukicTITLE: Presenting Simmons, Nelson, and Simonsohn's 2011 article "False Positive Psychology" (http://journals.sagepub.com/doi/abs/10.1177/0956797611417632) ABSTRACT Here's a quick blog post that treads some similar ground, people may want to look at that ahead of time if they're curious (http://blogs.plos.org/absolutely-maybe/2016/04/25/5-tips-for-avoiding-p-value-potholes/) Abstract of Simmons et al: In this article, we accomplish two things. First, we show that despite empirical psychologists’ nominal endorsement of a low rate of false-positive findings (≤ .05), flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates. In many cases, a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not. We present computer simulations and a pair of actual experiments that demonstrate how unacceptably easy it is to accumulate (and report) statistically significant evidence for a false hypothesis. Second, we suggest a simple, low-cost, and straightforwardly effective disclosure-based solution to this problem. The solution involves six concrete requirements for authors and four guidelines for reviewers, all of which impose a minimal burden on the publication process ## SPEAKER: Genevieve Patterson Microsoft Research New EnglandTITLE: Uncommon Sense: Using Neural Networks for Exploration and Creativity ABSTRACT In this talk, I investigate new ways for humans and ML systems to collaborate. I am interested in problems where pure automation is not the goal. For example, deep networks can automatically detect the presence of cancer in a mammogram but can’t tell us what avenues we should pursue in cancer research. I discuss two recent projects where state-of-the-art AI fails users. I present methods for identifying the shortcomings of canonical ML solutions and changing optimized objectives to suit user needs. My first project will demo a system for ML supported rotoscoping, an expensive post-processing technique commonly used in filmmaking. My second project explores collaboration between DNNs and radiologists to generate explanations of neural network diagnoses. I'll conclude with a quick introduction to my current work using deep learning to help people write better jokes. Through these projects, I hope to demonstrate novel interactions between ML and humans. BIO Geneviève Patterson studies interactive Computer Vision systems. She is a postdoctoral researcher at Microsoft Research New England. Her interests include visual phenomena discovery, crowd-driven dataset annotation, fine-grained object recognition, low-shot learning, and domain adaptation. Her work has appeared at several top tier computer vision venues including CVPR and ECCV, as well as HCI venues CHI and HCOMP, where her paper introducing a system for crowdsourced one-shot object detection was finalist for Best Paper in 2015. She received her PhD from Brown University in 2016 under the direction James Hays (now of Georgia Tech). ## SPEAKER: Michael Hughes Postdoctoral Fellow, Harvard UniversityTITLE: Discovering Disease Subtypes that Improve Treatment Predictions: Interpretable Machine Learning for Personalized Medicine ABSTRACT For complex diseases like depression, choosing a successful treatment from several possible drugs remains a trial-and-error process in current clinical practice. By applying statistical machine learning to the electronic health records of thousands of patients, can we discover subtypes of disease which both improve population-wide understanding and improve patient-specific drug recommendations? One popular approach is to represent noisy, high-dimensional health records as mixtures of low-dimensional subtypes via a probabilistic topic model. I will introduce this common dimensionality reduction method and explain how off-the-shelf topic models are misspecified for downstream prediction tasks across many domains from text analysis to healthcare. To overcome these poor predictions, I will introduce a new framework -- prediction-constrained training -- which learns interpretable topic models that offer competitive drug recommendations. I will also discuss open challenges in using machine learning to improve clinical decision-making. BIO Michael C. Hughes ("Mike") is currently a postdoctoral fellow in computer science at Harvard University, where he develops new machine learning methods for healthcare applications with Prof. Finale Doshi-Velez. His current research focus is helping clinicians understand and treat complex diseases like depression by training statistical models from big, messy electronic health record datasets. Other research interests include Bayesian data analysis, optimization algorithms, and any machine learning applications that advance medicine and the sciences. He completed a Ph.D. in the Department of Computer Science at Brown University in May 2016, advised by Prof. Erik Sudderth.## SPEAKER: Antonio Blanca Postdoctoral Fellow, Georgia Institute of TechnologyTITLE: Efficient Sampling for Probabilistic Models ABSTRACT A variety of tasks in science and engineering require drawing random samples from a probability distribution. As a result, Markov chain Monte Carlo (MCMC) methods, which provide powerful algorithms to sample from complex probability distributions, have found important applications in computer science, statistics, computational biology, physics, and many other fields. Phase transitions, on the other hand, mark abrupt changes in properties of probability distributions due to a small change of a parameter. In this talk, we explore connections between phase transitions and the efficiency of MCMC algorithms in probabilistic models of major interest. First, we consider the Ising model, which is among the most important and classical of all Markov random fields with vast applications in statistical mechanics, computer vision, and machine learning. We also consider the closely related random-cluster model, a random network model that has unified the study of random graphs, spin systems, and electrical networks. This talk is based on joint works with Pietro Caputo, Alistair Sinclair, and Eric Vigoda. BIO ## Antonio Blanca is a Postdoctoral Fellow in the Algorithms and Randomness Center at Georgia Tech. Previously, he completed his Ph.D. studies at UC Berkeley in 2016. His research concerns the computational problems that arise in the study of probabilistic models. He is particularly interested in Markov chain Monte Carlo methods, structure learning, the design and analysis of randomized algorithms, and in the effects of phase transitions in computation.## SPEAKER: Mark Bun Postdoctoral Researcher, Princeton UniversityTITLE: Finding Structure in the Landscape of Differential Privacy ABSTRACT
Differential privacy offers a mathematical framework for balancing two goals: obtaining useful information about sensitive data, and protecting individual-level privacy. Discovering the limitations of differential privacy yields insights as to what analyses are incompatible with privacy and why. These insights further aid the quest to discover optimal privacy-preserving algorithms. In this talk, I will give examples of how both follow from new understandings of the structure of differential privacy. I will first describe negative results for private data analysis via a connection to cryptographic objects called fingerprinting codes. These results show that an (asymptotically) optimal way to solve natural high-dimensional tasks is to decompose them into many simpler tasks. In the second part of the talk, I will discuss concentrated differential privacy, a framework which enables more accurate analyses by precisely capturing how simpler tasks compose. BIO: Mark Bun is a postdoctoral researcher in the Computer Science Department at Princeton University. He is broadly interested in theoretical computer science, and his research focuses on understanding foundational problems in data privacy through the lens of computational complexity theory. He completed his Ph.D. at Harvard in 2016, where he was advised by Salil Vadhan and supported by an NDSEG Research Fellowship. ## SPEAKER: Ali Mousavi PhD Candidate, Rice UniversityTITLE: Data-Driven Computational Sensing ABSTRACT Great progress has been made on sensing, perception, and signal processing over the last decades through the design of algorithms matched to the underlying physics and statistics of the task at hand. However, a host of difficult problems remain where the physics-based approach comes up short; for example, unrealistic image models stunt the performance of MRI and other computational imaging systems. Fortunately, the big data age has enabled the development of new kinds of machine learning algorithms that augment our understanding of the physics with models learned from large amounts of training data. In this talk, I will overview three increasingly integrated physics+data algorithms for solving the kinds of inverse problems encountered in computational sensing. At the lowest level, data can be used to automatically tune the parameters of an optimization algorithm (e.g., the regularization parameter in Lasso/ tuning parameters in AMP); improving its inferential and computational performance. At the next level, data can be used to learn a more realistic signal model that boosts the performance of an iterative recovery algorithm (e.g., convergence time, error). At the highest level, data can be used to train a deep network to encapsulate the complete underlying physics of the sensing problem (i.e., not just the signal model but also the forward model that maps signals into measurements). As we will see, moving up the physics+data hierarchy increasingly exploits training data and boosts performance accordingly. I will illustrate with computational sensing applications in compressive image recovery. BIO: Ali Mousavi is a Ph.D. candidate in the Department of Electrical and Computer Engineering at Rice University. He received B.Sc. degree in Electrical Engineering from Sharif University of Technology in 2011 and the M.Sc. degree in Electrical and Computer Engineering from Rice University in 2014. In summer 2010, he was a visiting researcher in the Laboratory of Information and Communication Systems at École Polytechnique Fédérale de Lausanne (EPFL). His research interests include machine learning and statistical signal processing, with an emphasis on applications of data science in signal processing. He has received the Rice University George R. Brown School of Engineering Fellowship, Ken Kennedy Institute Schlumberger Graduate Fellowship, and the Society of Iranian-American Women for Education Fellowship. ## SPEAKER: Jean-Gabriel Young PhD Candidate, Universite LavalTITLE: Network archeology: phase transition in the recoverability of network history ABSTRACT Network growth processes can be seen as generative models for the structure and history of complex networks. We show that this point of view leads to a natural inference task, that of reconstructing the past states of a growing network, given its current structure—a difficult permutation inference problem. Using a generalization of the preferential attachment model as the generative model, we introduce an importance sampling algorithm that allows us to estimate the distribution of past states, conditioned on the observed structure. We report a phase transition of the accuracy of this algorithm when it is applied to artificial networks generated by the model itself. This transition appears to be information-theoretic in nature, since simpler methods based on network properties also undergo a sharp drop in quality at the same point. Despite the existence of a phase transition, we find that non-trivial inference is possible in a large portion of the parameter space, including a region where many real networks are found. This implies that our method can be used to infer the past states of networks with no known temporal information. Our work therefore opens the way to many important applications, such as ordering past events in networks, or generating plausible past states for real networks that are well modeled by preferential attachment. Link to preprint. ## SPEAKER: Nathaniel Mathews PhD Candidate, University of Colorado Boulder (Applied Math)TITLE: discussion of paper "Constrained Global Optimization of Expensive Black Box Functions Using Radial Basis Functions" ABSTRACT The authors (Regis et al.) propose an iterative response-surface model for optimization which is well suited to nonlinear constraints on nonconvex objectives, and is meant to allow relatively fast optimization for high-dimensional problems. We will discuss this paper. References: "Constrained Global Optimization of Expensive Black Box Functions Using Radial Basis Functions" by Regis, Rommel G; Shoemaker, Christine A. Journal of Global Optimization; Dordrecht Vol. 31, Iss. 1, (Jan 2005): 153-171. https://search.proquest.com/docview/194645990 For a larger context, see Gutmann 2001 https://search.proquest.com/docview/194642145 And an extension, by the same author: Regis 2011: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.711.8008&rep=rep1&type=pdf ## SPEAKER: Tracy Babb PhD Candidate, University of Colorado Boulder (Applied Math)TITLE: discussion of paper "Practical sketching algorithms for low-rank matrix approximation" ABSTRACT We will present the following paper: "Practical sketching algorithms for low-rank matrix approximation" by J. A. Tropp, A. Yurtsever, M. Udell, and V. Cevher http://users.cms.caltech.edu/~jtropp/papers/TYUC17-Practical-Sketching-SIMAX.pdf The paper extends previous work on the randomized SVD by Halko, Martinsson and Tropp. ## SPEAKER: Anna Broido PhD Candidate, University of Colorado Boulder (Applied Math)TITLE: Scale-free networks are rare ABSTRACT A central claim in modern network science is that real-world networks are typically "scale free," meaning that the degrees of their nodes follow a power-law distribution. Over the past 20 years, a large literature has explored the consequences of this pattern on network vulnerability, controllability, synchronization, and the dynamics of social or biological spreading processes. However, empirical evidence for the ubiquity of scale-free networks is derived from a relatively small number of real-world examples. In this talk I will describe (i) a set of statistical evaluations for assessing the relative degree of scale-free structure in a network, and (ii) the results of applying these tools to a large corpus of nearly 1000 network data sets drawn social, biological, technological, transportation, and informational domains. Across domains, we find that scale-free networks are rare: only 4% exhibit the strongest-possible structure, while 52% exhibit the weakest-possible structure. Furthermore, we find that social networks typically exhibit weaker scale-free patterns, while a handful of technological and biological networks may exhibit stronger scale-free structure. These results indicate that networks are far more structurally diverse than previously appreciated, and highlight productive directions for future model development. The preprint is: Scale-free networks are rare, by Anna D. Broido, Aaron Clauset (Jan 2018, arXiv) |