Spring 2019

Quick info:

  • To sign up for the mailing list (you do not need to be enrolled in the seminar), visit our Google Groups website.
  • The seminar meets Tuesdays, 3:30 to 4:30 PM, in the Newton Lab (ECCR 257) in the Classroom Wing of the Engineering Center on the CU Boulder main campus.

List of Talks

  • Jan 15, Lior Horesh (IBM Research), "Don't go with the flow – A new tensor algebra for Neural Networks"
  • Feb 5
    • Osman Malik, "Fast Randomized Matrix and Tensor Interpolative Decomposition Using CountSketch"
    • Ann-Casey Hughes presents "Learning the Grammar of Dance" (Stuart and Bradley, ICML 1998)
  • Feb 12, Svenja Knappe (CU-Boulder Assoc. Research Prof. in Mech. Eng.), "Magnetic field imaging with optically-pumped magnetometers"
  • Feb 19, CS Colloquium: Yixuan (Sharon) Li (Research Scientist, Facebook AI), "Learning Generalizable and Reliable Neural Machines in an Open World"
  • Feb 26, Phil Kragel (CU-Boulder in ICS), "Detecting emotional situations using convolutional neural networks and distributed models of human brain activity"
  • Mar 5, Shuang Li (Colorado School of Mines), "Optimization for High-dimensional Analysis and Estimation"
  • Mar 12, Anshumali Shrivastava (Rice Univ, Asst. Prof. in CS), "Hashing Algorithms for Extreme Scale Machine Learning"
  • Mar 19, Sophie Giffard-Roisin (CU-Boulder in CS), "Hurricane forecasting using fused deep learning"
  • Mar 26, No talk (spring break)
  • Apr 2 (paper discussion)
  • Apr 9
    • Michael Ramsey, "Behavioral assessment of hearing in 2- to 7-year-old children: Evaluation of a two-interval, observer-based procedure using conditioned play-based responses"
    • Alec Dunton presents "Multi-fidelity optimization via surrogate modeling" (Forrester, Sobester, and Keane)
  • Apr 16, John Pearson (Duke Univ, Asst Prof in Biostatistics and Bioinformatics), "Modeling real behavior in two-person differential games"
  • Apr 23, Sidney D'Mello (CU Assoc. Prof. in ICS), "Project Tesserae: Longitudinal Multimodal Modeling of Individuals in Naturalistic Contexts"
  • Apr 30, Sam Paskewitz presents "Bayesian Online Changepoint Detection" (Adams and MacKay, 2007)

Abstracts

Jan 15

Title: Don't go with the flow – A new tensor algebra for Neural Networks

Speaker: Lior Horesh (IBM Research)

Abstract: Multi-dimensional information often involves multi-dimensional correlations that may remain latent by virtue of traditional matrix-based learning algorithms. In this study, we propose a tensor neural network framework that offers an exciting new paradigm for supervised machine learning. The tensor neural network structure is based upon the t-product (Kilmer and Martin, 2011), an algebraic formulation to multiply tensors via circulant convolution which inherits mimetic matrix properties. We demonstrate that our tensor neural network architecture is a natural high-dimensional extension to conventional neural networks. Then, we expand upon the interpretation of deep neural networks as discretizations of nonlinear differential equations (Haber and Ruthotto, 2017) to construct intrinsically stable tensor neural network architectures. We illustrate the advantages of stability and demonstrate the potential of tensor neural networks with numerical experiments on the MNIST dataset.

Speaker Bio: Lior Horesh is the Manager of the Mathematics of AI group at the IBM TJ Watson Research Center as well as an IBM Master Inventor. Dr. Horesh also holds an Adjunct Associate Professor position in the Computer Science Department of Columbia University, teaching graduate-level Advanced Machine Learning and Quantum Computing Theory and Practice courses. His expertise lies in large-scale modeling, inverse problems, tensor algebra, experimental design and quantum computing. His recent research focuses on the interplay between first-principles and data-driven methods.
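
Illustration (not from the talk): the t-product named in the abstract multiplies two third-order tensors by circularly convolving their frontal slices. The sketch below is a minimal pure-Python version of that definition, assuming tensors are stored as lists of frontal slices; the function names `matmul` and `t_product` are hypothetical, and practical implementations usually work in the Fourier domain instead.

```python
def matmul(X, Y):
    """Ordinary matrix product of two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def t_product(A, B):
    """t-product of A (n1 x n2 x n3) with B (n2 x n4 x n3).

    Assumed storage: A[k] is the k-th frontal slice, an n1 x n2 matrix.
    Slice k of the result is the circular convolution over slices:
        C[k] = sum_j A[j] @ B[(k - j) mod n3]
    """
    n3 = len(A)
    if len(B) != n3:
        raise ValueError("A and B need the same number of frontal slices")
    rows, cols = len(A[0]), len(B[0][0])
    C = []
    for k in range(n3):
        acc = [[0.0] * cols for _ in range(rows)]
        for j in range(n3):
            P = matmul(A[j], B[(k - j) % n3])
            acc = [[a + p for a, p in zip(ra, rp)] for ra, rp in zip(acc, P)]
        C.append(acc)
    return C


# With 1 x 1 slices the t-product reduces to circular convolution of vectors.
A = [[[1]], [[2]], [[3]]]
B = [[[4]], [[5]], [[6]]]
C = t_product(A, B)
```

With a single frontal slice (n3 = 1) the function reduces to the ordinary matrix product, which is one way to see the "natural extension" of matrix algebra that the abstract refers to.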

Feb 5

Title: Fast Randomized Matrix and Tensor Interpolative Decomposition Using CountSketch

Speaker: Osman Malik

Abstract: In this talk I will present our recently developed fast randomized algorithm for matrix interpolative decomposition. If time permits, I will also say a few words about how our method can be applied to the tensor interpolative decomposition problem. Our preprint paper is available at https://arxiv.org/abs/1901.10559
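
Illustration (not from the talk): CountSketch, the sketching operator named in the title, sends each row of a matrix to one randomly chosen bucket with a random sign, so the sketch costs one pass over the data. The code below is a minimal pure-Python sketch of that operator only; the function name `countsketch` and its parameters are hypothetical, and it does not implement the interpolative decomposition from the preprint.

```python
import random


def countsketch(A, k, seed=0):
    """Compress the m x n matrix A (nested lists) to k rows.

    Row i of A is multiplied by a random sign and added to a random
    bucket row of the output. Returns the sketch together with the
    bucket and sign assignments so the map can be inspected or reused.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    buckets = [rng.randrange(k) for _ in range(m)]
    signs = [rng.choice((-1, 1)) for _ in range(m)]
    SA = [[0.0] * n for _ in range(k)]
    for i, row in enumerate(A):
        target = SA[buckets[i]]
        for j, value in enumerate(row):
            target[j] += signs[i] * value
    return SA, buckets, signs


# Hypothetical example: sketch a 6 x 2 matrix down to 3 rows.
A = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]]
SA, buckets, signs = countsketch(A, k=3, seed=1)
```

A randomized decomposition would then work on the small matrix `SA` in place of `A`; the preprint linked above describes the authors' actual algorithm.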


Title: Discussion of the paper “Learning the Grammar of Dance” by Joshua M. Stuart and Elizabeth Bradley, Dept Computer Science at CU

Presenter: Ann-Casey Hughes

Paper Abstract: A common task required of a dancer or athlete is to move from one prescribed body posture to another in a manner that is consistent with a specific style. One can automate this task, for the purpose of computer animations, using simple machine-learning and search techniques. In particular, we find kinesiologically and stylistically consistent interpolation sequences between pairs of body postures using graph-theoretic methods to learn the “grammar” of joint movement in a given corpus and then applying memory-bounded A* search to the resulting transition graphs—using an influence diagram that captures the topology of the human body in order to reduce the search space. Paper link: https://www.cs.colorado.edu/~lizb/papers/icml98.pdf

Feb 12

Title: Magnetic field imaging with optically-pumped magnetometers

Speaker: Svenja Knappe

Abstract: I present our ongoing effort in developing imaging systems with microfabricated optically-pumped magnetometers (mOPMs). By use of microfabrication technologies and simplification of optical setups, we aim to develop manufacturable sensors of small size and low power. Our zero-field mOPMs require a shielded environment but reach high sensitivities of less than 10 fT/√Hz. One target application lies in the field of non-magnetic brain imaging, specifically magnetoencephalography (MEG). The attraction of using these sensors for non-invasive brain imaging comes from the possibility of placing them directly on the scalp of the patient, very close to the brain sources. We have built several multi-channel test systems to validate the prediction of very high signal-to-noise ratios in standard MEG paradigms.

Bio: Svenja Knappe received her Ph.D. in physics from the University of Bonn, Germany in 2001 with a thesis on miniature atomic magnetometers and atomic clocks based on coherent-population trapping. For 16 years, she worked at the National Institute of Standards and Technology (NIST) in Boulder CO, developing chip-scale atomic sensors. She is now an Associate Research Professor at the University of Colorado and her research interests include microfabricated atomic sensors. She is also a co-founder of FieldLine Inc.

Feb 19

Computer Science Colloquium

Title: Learning Generalizable and Reliable Neural Machines in an Open World

Speaker: Yixuan Sharon Li (Research Scientist, Facebook AI)

Abstract: The past several years have seen a growing demand for building intelligent machines that can learn from and generalize to complex data. However, despite tremendous performance improvements, high-capacity models such as deep neural networks still struggle to generalize to the diverse world. In this talk, I will present my recent work addressing challenges on learning more generalizable and reliable visual representations. This requires machine learning models to not only classify data accurately from known classes and distributions, but also to develop awareness against abnormal examples in open environments. To this end, I will first talk about works on how we can improve generalization by leveraging multiple learning agents. Then, I will discuss an approach that effectively detects anomalies from outside the training distribution. Finally, I will share ongoing efforts on learning highly generalizable representations without strong human supervision, followed by discussion of future directions towards a minimally supervised, continuous learning paradigm.

Feb 26

Title: Detecting emotional situations using convolutional neural networks and distributed models of human brain activity

Speaker: Philip Kragel

Abstract: Emotions are thought to be canonical responses to situations ancestrally linked to survival or the well-being of an organism. Although sensory elements do not fully determine the nature of emotional responses, they should be sufficient to convey the schema or situation that an organism must respond to. However, few computationally explicit models describe how combinations of stimulus features come to evoke different types of emotional responses, and further, it is not clear that activity in sensory (e.g., visual) cortex contains distinct codes for multiple classes of emotional responses in a rich way. In this talk I will present research 1) developing convolutional neural networks that identify different kinds of emotional situations, and 2) using human neuroimaging to understand how model representations of these situations are characterized by distributed patterns of activity in the human visual system. I will conclude by discussing future directions using machine learning approaches to understand emotional phenomena. Preprint: https://doi.org/10.1101/470237

Bio: Philip Kragel is a postdoctoral associate in the Cognitive and Affective Neuroscience Lab at the University of Colorado Boulder, where he has worked since 2015. He completed his Ph.D. in Psychology and Neuroscience, in addition to a master’s in engineering management, and a bachelor’s degree in biomedical engineering at Duke University. His research integrates cognitive neuroscience and machine learning approaches to understand the mind, with a focus on emotion and related affective phenomena.

Mar 5

Title: Optimization for High-dimensional Analysis and Estimation

Speaker: Shuang Li

Abstract: High-dimensional signal analysis and estimation appear in many signal processing applications, including modal analysis and parameter estimation in spectrally sparse signals. The underlying low-dimensional structure in these high-dimensional signals inspires us to develop optimization-based techniques and theoretical guarantees for the above fundamental problems in signal processing. In many applications, high-dimensional signals have a concise representation: a linear combination of a small number of atoms in a dictionary with elements drawn from the signal space. In compressive sensing, L1-minimization is a widely used framework to find the sparse representations of a signal. It has recently been shown that atomic norm minimization, which is a generalization of L1-minimization, is an efficient and powerful way to exactly recover unobserved time-domain samples and identify unknown frequencies in signals having sparse frequency spectra, namely, to find a concise representation for spectrally sparse signals. This new technique works on a continuous dictionary and can completely avoid the effects of basis mismatch, which can plague conventional grid-based compressive sensing techniques. The objective of this presentation is to analyze and estimate the high-dimensional signals, or the parameters contained in these signals, with optimization-based techniques.

Bio: Shuang Li received the B. Eng. degree in communications engineering from Zhejiang University of Technology, Hangzhou, China, in 2013. She is currently working toward the Ph.D. degree at the Colorado School of Mines, Golden, CO, USA. Her research interests include compressive sensing, modal analysis, blind inverse problems, and convex and non-convex optimization for signal processing and machine learning. Homepage: http://inside.mines.edu/~shuangli/

Mar 12

Title: Hashing Algorithms for Extreme Scale Machine Learning

Speaker: Anshumali Shrivastava

Abstract: In this talk, I will discuss some of my recent and surprising findings on the use of hashing algorithms for large-scale estimations. Locality Sensitive Hashing (LSH) is a hugely popular algorithm for sub-linear near neighbor search. However, it turns out that fundamentally LSH is a constant time (amortized) adaptive sampler from which efficient near-neighbor search is one of the many possibilities. Our observation adds another feather in the cap for LSH. LSH offers a unique capability to do smart sampling and statistical estimations at the cost of a few hash lookups. Our observation bridges data structures (probabilistic hash tables) with efficient unbiased statistical estimations. I will demonstrate how this dynamic and efficient sampling breaks the computational barriers in adaptive estimations, where it is possible that we pay roughly the cost of uniform sampling but get the benefits of adaptive sampling. We will demonstrate the power of one simple idea for three favorite problems: 1) Partition function estimation for large NLP models such as word2vec, 2) Adaptive Gradient Estimations for efficient SGD, 3) Sub-Linear Deep Learning with Huge Parameter Space.

I will show the power of these randomized algorithms by introducing the SLIDE system. SLIDE is an auspicious illustration of the power of smart randomized algorithms over CPUs in outperforming the best available GPU with an optimized implementation for training large neural networks. Our evaluations on large industry-scale datasets, with some large fully connected architectures, show that training with SLIDE on a 44 core CPU is more than 2.7 times (2 hours vs. 5.5 hours) faster than the same network trained using TensorFlow on Tesla V100 at any given accuracy level.

Finally, if time permits, I will show how following the same estimation view of LSH we can recover some very surprising estimates which can break the linear memory barrier in near neighbor search.

Bio: Anshumali Shrivastava is an assistant professor in the computer science department at Rice University. His broad research interests include randomized algorithms for large-scale machine learning. In 2018, Science News named him one of the Top-10 scientists under 40 to watch. He is a recipient of a National Science Foundation CAREER Award, a Young Investigator Award from the Air Force Office of Scientific Research, and a machine learning research award from Amazon. His research on hashing inner products won the Best Paper Award at NIPS 2014, while his work on representing graphs got the Best Paper Award at IEEE/ACM ASONAM 2014. Anshumali finished his Ph.D. in 2015 from Cornell University.
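
Illustration (not from the talk): the "LSH as a sampler" view in the abstract can be seen with a small hash table. The sketch below uses signed random projections (SimHash) as an assumed choice of hash family: items whose direction is close to the query's are more likely to share its bucket, so a single lookup returns a sample biased toward similar items. The function names and parameters are hypothetical, and this is not the SLIDE system.

```python
import random


def make_simhash(dim, n_bits, seed=0):
    """Return a hash function built from n_bits random hyperplanes."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
              for _ in range(n_bits)]

    def simhash(x):
        # One bit per hyperplane: which side of the plane x falls on.
        return tuple(
            1 if sum(p * v for p, v in zip(plane, x)) >= 0 else 0
            for plane in planes
        )

    return simhash


def build_table(vectors, simhash):
    """Index vectors by hash code; each bucket holds item indices."""
    table = {}
    for idx, v in enumerate(vectors):
        table.setdefault(simhash(v), []).append(idx)
    return table


def sample_similar(query, table, simhash):
    """One hash lookup: indices of the items sharing the query's bucket."""
    return table.get(simhash(query), [])


# Hypothetical data: two vectors pointing the same way and one opposite.
simhash = make_simhash(dim=3, n_bits=8, seed=0)
vectors = [[1.0, 2.0, -0.5], [2.0, 4.0, -1.0], [-1.0, -2.0, 0.5]]
table = build_table(vectors, simhash)
neighbours = sample_similar([1.0, 2.0, -0.5], table, simhash)
```

For this hash family, two vectors at angle θ agree on each bit with probability 1 − θ/π, so the chance of landing in the same bucket grows with cosine similarity. That monotone relationship is what makes a bucket lookup behave like an adaptive sample.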

Mar 19

Title: Hurricane forecasting using fused deep learning

Speaker: Sophie Giffard-Roisin

Abstract: The forecast of hurricane trajectories is crucial for the protection of people and property, but machine learning techniques have been scarce for this so far. I will present a method that we developed recently, a fusion of neural networks, that is able to combine past trajectory data and reanalysis atmospheric images (wind and pressure 3D fields). Our network is trained to estimate the longitude and latitude displacement of hurricanes and depressions from a large database from both hemispheres (more than 3000 storms since 1979, sampled at a 6 hour frequency). Finally, I will give an overview of the hackathon that I organized on a very close topic at the Climate Informatics Workshop in September.

Bio: Sophie Giffard-Roisin is a post-doctoral researcher at the computer science department of CU Boulder. After a PhD in machine learning for medical imaging at Inria (Sophia-Antipolis, France), she did her post-doc with Claire Monteleoni on machine learning for climate and weather applications, and worked on hurricane forecasting and avalanche detection. She also organized the hackathon of the 2018 Climate Informatics Workshop (Boulder). Personal webpage: http://sophiegif.github.io/

Apr 2

Paper Title: "Discontinuous vs. Smooth Regression" (Muller and Stadtmuller, 1999)

Presenter: David Gunderman

Summary: This paper presents a method for determining from (noisy) measurements whether an underlying regression function is continuous or discontinuous. I will present the method and an application of the method to some data on children's growth.

Link: https://rmgsc.cr.usgs.gov/outgoing/threshold_articles/Muller_Stadtmuller1999.pdf


Paper Title: "Breathing oscillations in a global simulation of a thin accretion disc" (Mishra, Kluźniak, and Fragile, 2018)

Presenter: Sananda Banerjee

Paper Abstract: We study the oscillations in an axisymmetric, viscous, radiative, general relativistic hydrodynamical simulation of a geometrically thin disc around a non-rotating, 6.62 M⊙ black hole. The numerical set-up is initialized with a Novikov–Thorne, gas pressure-dominated accretion disc, with an initial mass accretion rate of ṁ = 0.01 L_Edd/c^2 (where L_Edd is the Eddington luminosity and c is the speed of light). Viscosity is treated with the α-prescription. The simulation was evolved for about 1000 Keplerian orbital periods at three Schwarzschild radii (ISCO radius). Power density spectra of the radial and vertical fluid velocity components, the total (gas + radiation) mid-plane pressure, and the vertical component of radiative flux from the photosphere, all reveal strong power at the local breathing oscillation frequency. The first, second, and third harmonics of the breathing oscillation are also clearly seen in the data. We quantify the properties of these oscillations by extracting eigenfunctions of the radial and vertical velocity components and total pressure. This confirms that these oscillations are associated with breathing motion.

Link: https://academic.oup.com/mnras/article/483/4/4811/5211081

Apr 9

Title: Behavioral assessment of hearing in 2- to 7-year-old children: Evaluation of a two-interval, observer-based procedure using conditioned play-based responses

Speaker: Michael Ramsey

Abstract: It is challenging to collect reliable behavioral data from toddlers and preschoolers. Consequently, there are significant gaps in our understanding of how auditory development unfolds during this time period. My colleague, Angela Bonino, has designed a new experimental method for improved data collection that appears to be promising for collecting more, and more reliable, data. For this talk, I will focus mainly on the data visualization and statistical methods used to present and analyze the data.


Paper Title: "Multi-fidelity optimization via surrogate modeling" (Forrester, Sobester, and Keane)

Presenter: Alec Dunton

Paper Abstract: This paper demonstrates the application of correlated Gaussian process based approximations to optimization where multiple levels of analysis are available, using an extension to the geostatistical method of co-kriging. An exchange algorithm is used to choose which points of the search space to sample within each level of analysis. The derivation of the co-kriging equations is presented in an intuitive manner, along with a new variance estimator to account for varying degrees of computational ‘noise’ in the multiple levels of analysis. A multi-fidelity wing optimization is used to demonstrate the methodology.

Link: https://eprints.soton.ac.uk/64698/1/RSPA20071900.pdf

Apr 16

Title: Modeling real behavior in two-person differential games

Speaker: John Pearson

Abstract: In the behavioral sciences, games and game theory have long been the tools of choice for studying strategic behavior. However, the most commonly studied games involve only small numbers of discrete choices and well-defined rounds, while real-world strategic behaviors are continuous and extended in time. Differential game theory attempts to model these phenomena, but the theoretical and empirical properties of differential games are comparatively poorly understood. In this talk, I'll present recent work from my lab modeling the empirical behavior of real agents (humans and monkeys) in one such game. By combining approaches from control theory and physics with scalable Bayesian inference, we are able to fit generative models that not only produce realistic new examples of behavior but also decompose players' strategies into scientifically meaningful components.

Bio: John Pearson is Assistant Professor in the Department of Biostatistics and Bioinformatics at Duke, with appointments in Electrical and Computer Engineering, Neurobiology, and Psychology & Neuroscience. He did his PhD in theoretical physics, working on string theory and quantum gravity, before switching to neuroscience, where he taught monkeys to gamble, had patients play video games during brain surgery, and modeled jury decision-making. His lab works at the intersection of machine learning and neuroscience, from social decisions in humans to information processing in the retina.

Apr 23

Title: Project Tesserae: Longitudinal Multimodal Modeling of Individuals in Naturalistic Contexts

Speaker: Sidney K. D’Mello

Abstract: I will describe our team’s efforts on a two-year Intelligence Advanced Research Projects Activity (IARPA) program called MOSAIC - Multimodal Objective Sensing to Assess Individuals with Context. The program’s ambitious aims are to “advance multimodal sensing to measure personnel and their environment unobtrusively, passively, and persistently both at work and outside of work, reduce the time and manpower required to process and integrate such data, and construct personalized and adaptive assessments of an individual that are accurate throughout the individual’s career.”

The premise of our team (called Project Tesserae) was to fuse information from low-cost mobile devices which individuals already use in their daily lives with accurate and robust machine learning techniques to develop generalizable models of psychological, health, and job performance measures. Towards this end, we conducted a year-long study of over 750 working professionals from across the US to explore the extent to which wearables, smartphones, Bluetooth beacons, social media, and other sensing streams can offer insights into individuals embedded in their social contexts. In this talk, I will present an overview of the Project Tesserae study design, offer insight on lessons learned, share experiences on various sensing technologies, and describe several early insights gained.

Bio: Sidney D’Mello (PhD in Computer Science) is an Associate Professor in the Institute of Cognitive Science and Department of Computer Science at the University of Colorado Boulder. He is interested in the dynamic interplay between cognition and emotion while individuals and groups engage in complex real-world tasks. He applies insights gleaned from this basic research program to develop intelligent technologies that help people achieve to their fullest potential by coordinating what they think and feel with what they know and do. D’Mello has co-edited seven books and published over 250 journal papers, book chapters, and conference proceedings (13 of these have received awards). His work has been funded by numerous grants and he serves(d) as associate editor for four journals, on the editorial boards for six others, and has played leadership roles in several professional organizations.

https://sites.google.com/site/sidneydmello/

Apr 30

Paper Title: "Bayesian Online Changepoint Detection" (Adams and MacKay, 2007)

Presenter: Sam Paskewitz

Paper Abstract: Changepoints are abrupt variations in the generative parameters of a data sequence. Online detection of changepoints is useful in modelling and prediction of time series in application areas such as finance, biometrics, and robotics. While frequentist methods have yielded online filtering and prediction techniques, most Bayesian papers have focused on the retrospective segmentation problem. Here we examine the case where the model parameters before and after the changepoint are independent and we derive an online algorithm for exact inference of the most recent changepoint. We compute the probability distribution of the length of the current "run" or time since the last changepoint, using a simple message-passing algorithm. Our implementation is highly modular so that the algorithm may be applied to a variety of types of data. We illustrate this modularity by demonstrating the algorithm on three different real-world data sets.

Link: https://arxiv.org/abs/0710.3742
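
Illustration (not the authors' code): the run-length recursion described in the paper abstract can be sketched in a few lines. The version below assumes Gaussian observations with known noise variance, a conjugate Normal prior on the mean, and a constant hazard rate. All of these are choices made here for illustration, and the function names and default values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a Normal(mean, var) distribution at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def run_length_posterior(data, hazard=0.05, mu0=0.0, var0=100.0, noise_var=1.0):
    """Return P(run length = r | all data) for r = 0 .. len(data).

    Assumptions (illustrative only): Gaussian data with known variance
    noise_var, prior Normal(mu0, var0) on each segment mean, and a
    constant probability `hazard` of a changepoint at every step.
    """
    probs = [1.0]                      # before any data the run length is 0
    means, variances = [mu0], [var0]   # posterior on the mean, per run length
    for x in data:
        # Predictive probability of x under each current run length.
        pred = [gaussian_pdf(x, m, v + noise_var)
                for m, v in zip(means, variances)]
        # Message passing: either the run grows by one or it resets to 0.
        growth = [p * q * (1.0 - hazard) for p, q in zip(probs, pred)]
        changepoint = sum(p * q * hazard for p, q in zip(probs, pred))
        probs = [changepoint] + growth
        total = sum(probs)
        probs = [p / total for p in probs]
        # Conjugate update of the mean's posterior for every grown run.
        new_vars = [1.0 / (1.0 / v + 1.0 / noise_var) for v in variances]
        new_means = [nv * (m / v + x / noise_var)
                     for nv, m, v in zip(new_vars, means, variances)]
        means = [mu0] + new_means
        variances = [var0] + new_vars
    return probs


# Hypothetical series: 20 points near 0 followed by 20 points near 10.
series = [0.1, -0.1] * 10 + [10.1, 9.9] * 10
posterior = run_length_posterior(series)
most_likely_run = max(range(len(posterior)), key=posterior.__getitem__)
```

Swapping the Gaussian predictive and update steps for another conjugate model leaves the recursion itself unchanged, which is the modularity the abstract points to.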