Winter 2022

Friday, June 17, 2022

Yashar Hezaveh

Recording: here


Exploring the Universe with Machine Learning

Astrophysics is being revolutionized by a new generation of telescopes and sky surveys that produce monumental volumes of invaluable data to answer some of the most intriguing questions about the birth and the evolution of our Universe. Surveys like the SKA will produce data on the scale of petabits per second – more than the global internet data rate today. The world's largest digital camera, at the Rubin Observatory, with 3.2 gigapixels, will scan the entire sky every 3 nights, producing a time series of billions of astrophysical sources. These data will help us discover the nature of dark matter and dark energy, two unknown components that constitute 95% of the energy content of the Universe.

Machine learning has been shown to be indispensable for the analysis of these data. In this talk, I will give a broad overview of the state of the field and the numerical methods commonly used in astrophysics. I will then focus on a project to use machine learning and statistical methods to infer the distribution of dark matter in distant galaxies. I will conclude by sharing a number of ongoing projects and opportunities for MLxAstro collaborations.
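The abstract does not specify the speaker's method, but purely as an illustration of the kind of pipeline behind "using machine learning to infer the distribution of dark matter," here is a minimal sketch of a network trained on simulated survey images to regress physical parameters. All names, shapes, and the choice of a small CNN are assumptions, not the speaker's implementation.

```python
# Illustrative sketch only (not the speaker's code): a small CNN that regresses
# physical parameters (e.g., parameters of a dark-matter mass distribution)
# from simulated telescope images. Shapes and layer sizes are assumptions.
import torch
import torch.nn as nn

class ParamRegressor(nn.Module):
    def __init__(self, n_params: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, n_params)

    def forward(self, img):
        # img: (batch, 1, H, W) simulated survey cutouts
        return self.head(self.features(img).flatten(1))

model = ParamRegressor()
fake_batch = torch.randn(8, 1, 64, 64)        # stand-in for simulated images
pred = model(fake_batch)                      # (8, 3) predicted parameters
loss = nn.functional.mse_loss(pred, torch.zeros_like(pred))  # train against simulator truths
```

In practice such networks are trained on large suites of simulations and combined with statistical (e.g., Bayesian) machinery to turn point predictions into posteriors; the sketch above shows only the supervised regression core.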




Friday, June 3, 2022

Yatin Nandwani

Recording: here


Deep learning with symbolic constraints

Recently, many training techniques and neural models have been proposed to learn and implicitly represent both known and unknown symbolic constraints. In this work, we first explore whether neural models can be better trained using known symbolic domain knowledge expressed as constraints on the output space. To this end, we propose a primal-dual formulation for deep learning with constraints. We then shift our attention to neural models that learn unknown symbolic constraints from input/output pairs of a combinatorial puzzle, such as Sudoku. We identify a couple of potential issues in such models and propose appropriate solutions. First, we identify the issue of solution multiplicity (one input having many correct solutions) when training neural models and propose appropriate loss functions to address it. Next, we observe that existing architectures, such as SATNet and the message-passing-based RRN, fail to generalize across output spaces of different sizes (they cannot solve 16 x 16 Sudoku after training only on 9 x 9 Sudoku). In response, we design two neural architectures for output-space invariance in combinatorial problems.
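As a rough illustration of the primal-dual idea (a sketch under my own assumptions, not the authors' implementation), the primal step descends on the task loss plus a multiplier-weighted constraint violation, while the dual step raises the multiplier wherever the constraint is still violated. The function `constraint_violation` below is a hypothetical placeholder for a differentiable relaxation of the symbolic constraint.

```python
# Minimal primal-dual training step sketch (assumptions throughout).
import torch

def primal_dual_step(model, optimizer, x, y, lam, constraint_violation, dual_lr=0.01):
    # --- primal update: descend on task loss + lambda * constraint violation ---
    optimizer.zero_grad()
    logits = model(x)
    task_loss = torch.nn.functional.cross_entropy(logits, y)
    violation = constraint_violation(logits)   # >= 0, equals 0 when satisfied
    (task_loss + lam * violation).backward()
    optimizer.step()
    # --- dual update: ascend on the multiplier, projected to stay >= 0 ---
    with torch.no_grad():
        lam = torch.clamp(lam + dual_lr * violation.detach(), min=0.0)
    return lam

# Usage sketch: lam = torch.tensor(0.0), then call primal_dual_step(...) each batch.
```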



Friday, May 27, 2022

Benjamin Sanchez-Lengeling

Recording: here


Graph Neural Networks for Olfaction and Interpretability

Olfaction is our sense of the chemical world. Why does vanilla bean smell "vanilla", or why does grass smell "green"? There are many biological, behavioural, cultural and physical components to this question. We look at it from the angle of single molecules: predicting the relationship between a molecule's structure and its odor, which remains a difficult, decades-old task. This problem is an important challenge in chemistry, impacting human nutrition, the manufacture of synthetic fragrance, the environment, and sensory neuroscience. We are attempting to understand olfaction with Graph Neural Networks (GNNs). This talk has three interconnected parts that lie at the heart of my research: 1) GNNs as a tool to learn representations of graph-structured data, 2) modelling molecules with GNNs to build an odor representation, 3) graph data and GNNs as a testbed for interpretability techniques, which we ultimately want to use for scientific discovery.
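For readers unfamiliar with GNNs on molecules, here is a minimal sketch of a single message-passing layer: atoms are nodes with feature vectors, bonds are edges, each layer mixes an atom's features with its neighbours', and pooling gives one vector per molecule. Layer sizes and the mean-pooling readout are illustrative assumptions, not the architecture used in the talk.

```python
# Sketch of one GNN message-passing layer over a molecular graph (assumptions only).
import torch
import torch.nn as nn

class MessagePassingLayer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.update = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())

    def forward(self, node_feats, adj):
        # node_feats: (n_atoms, dim); adj: (n_atoms, n_atoms) 0/1 bond matrix
        messages = adj @ node_feats                   # sum of neighbour features
        return self.update(torch.cat([node_feats, messages], dim=-1))

# Toy "molecule": 4 atoms in a chain, 8-dimensional atom features.
adj = torch.tensor([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=torch.float)
h = torch.randn(4, 8)
layer = MessagePassingLayer(8)
h = layer(h, adj)
molecule_embedding = h.mean(dim=0)                    # pooled per-molecule (odor) representation
```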



Friday, May 20, 2022

Andrea Tacchetti

Recording: here


The Good Shepherd: Machine Learning for Mechanism Design

From recommender systems to traffic routing, machine learning systems are mediating an ever-growing number of economic and social interactions among individuals, firms, and organizations, slowly becoming a cornerstone of modern institutions. This talk will focus on constructing agents (“mechanisms”) that successfully mediate economic interactions among participants in the presence of strategic behavior and information asymmetry, and in pursuit of complex group-wide metrics. In particular, the research presented will focus on the multiagent and value-alignment challenges arising in this context: how do we construct mechanisms that keep up with adaptive participants, shepherd their learning towards desirable outcomes, and ensure that the participants’ own aspirations for group-wide goals are represented in the policy of the mediator?



Friday, May 6, 2022

Igor Mordatch

Recording: here


Reinforcement Learning via Sequence and Energy-Based Modeling

Can standard sequence modeling frameworks train effective policies for reinforcement learning (RL)? Doing so would allow drawing upon the simplicity and scalability of the Transformer architecture, and the associated advances and infrastructure investments in language modeling such as GPT-x and BERT. I will present our work investigating this by casting the problem of RL as optimality-conditioned sequence modeling. Despite its simplicity, such an approach is surprisingly competitive with current model-free offline RL baselines. However, the robustness of such an approach remains a challenge in robotics applications. In the second part of the talk, I will discuss the ways in which implicit, energy-based models can address this challenge, particularly with respect to approximating complex, potentially discontinuous and multi-valued functions. Robots with such implicit policies can learn complex and remarkably subtle behaviors on contact-rich tasks from human demonstrations, including tasks with high combinatorial complexity and tasks requiring 1 mm precision.
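To make "optimality-conditioned sequence modeling" concrete, here is a rough sketch of the return-conditioned setup popularized by this line of work: each timestep is embedded as (return-to-go, state, action) tokens, a causal Transformer reads the interleaved sequence, and actions are predicted from the state-token outputs; conditioning on a high target return at test time asks the model for near-optimal behaviour. Dimensions, layer counts, and the omission of positional/timestep embeddings are my own simplifications, not the speaker's model.

```python
# Sketch of a return-conditioned sequence policy (assumptions throughout).
import torch
import torch.nn as nn

class ReturnConditionedPolicy(nn.Module):
    def __init__(self, state_dim, act_dim, d_model=64):
        super().__init__()
        self.embed_rtg = nn.Linear(1, d_model)
        self.embed_state = nn.Linear(state_dim, d_model)
        self.embed_action = nn.Linear(act_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.predict_action = nn.Linear(d_model, act_dim)

    def forward(self, rtg, states, actions):
        # rtg: (B, T, 1), states: (B, T, state_dim), actions: (B, T, act_dim)
        B, T, _ = states.shape
        # Interleave tokens per timestep as (return-to-go, state, action).
        tokens = torch.stack(
            [self.embed_rtg(rtg), self.embed_state(states), self.embed_action(actions)],
            dim=2,
        ).reshape(B, 3 * T, -1)
        # Causal mask so each token attends only to the past.
        mask = torch.triu(torch.ones(3 * T, 3 * T, dtype=torch.bool), diagonal=1)
        h = self.transformer(tokens, mask=mask)
        return self.predict_action(h[:, 1::3])   # predict a_t from the s_t token
```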



Friday, March 18, 2022

Alan Cowen

Recording: here


How To Build Technology With Empathy: Addressing the Need for Psychologically Valid Data

From digital assistants that can sense and soothe your frustrations to photo apps that can identify a warm smile, empathic AI will be central to a future in which our everyday interactions with technology ultimately serve our emotional well-being. But researchers and developers are missing the right training data to make this human-centric future for empathic AI a reality. Large-scale, globally diverse data with a multitude of emotional expressions and contexts is essential to paint a scientifically valid picture of human emotion. I will discuss three principles for gathering this kind of rich, globally diverse, psychologically valid emotion data at scale, and explain how we implemented them at Hume AI to gather 3 million self-report and perceptual judgments of 1.5 million human emotional behaviors. Using Hume’s models as an example, I will showcase how ML models trained on this data can infer human emotional behavior with more accuracy and nuance than was previously possible. Finally, I will summarize ethical guidelines for deploying these powerful new empathic technologies at scale.



Friday, February 25, 2022

Vish Sivakumar & Michael Mandel

Recording: here


Non-invasive neural interfaces and challenges

Non-invasive neural interfaces have the potential to transform human-computer interaction by providing users with low-friction, information-rich, always-available inputs. Reality Labs at Meta is developing such an interface for the control of augmented reality devices, based on electromyographic (EMG) signals captured at the wrist. Machine learning is crucial to unlocking the full potential of these signals and interactions, and this talk will present several specific problems and the machine learning approaches that have advanced us towards the ultimate goal of effortless and joyful interfaces. We will provide the necessary neuroscientific background to understand these signals, describe supervised approaches to biomimetic control, especially for generating text, detail several approaches to enabling generalization across users and sessions, and discuss unsupervised approaches to extending the bandwidth of the human-machine interface using these signals.



Friday, February 18, 2022

Aishwarya Agrawal

Recording: here


Are current vision-language models learning to solve the task or merely learning to solve the dataset?

Over the past few years, vision-language models have led to significant improvements on various tasks such as image-caption retrieval, image captioning, and visual question answering, even surpassing human performance on some tasks (such as visual question answering). But are current models really better than humans at answering questions about images? Are current vision-language models really learning to solve the task, or merely learning to solve the dataset? In this talk, I will present a few case studies spanning different tasks and models that try to answer this question via careful and systematic evaluations.



Friday, February 4, 2022

Marc Bellemare

Recording: here


Distributional reinforcement learning: A richer model of agent-environment interactions

Few decisions are made with full certainty of their consequences. In reinforcement learning, this principle is instantiated by modelling the sum of rewards obtained (the return) as a random quantity. Consequently, having a complete picture of reinforcement learning requires understanding how an agent's choices affect the distribution of possible returns. Based on our upcoming book (MIT Press), this talk gives a snapshot of the current state of distributional reinforcement learning, including: a characterization of the random return by means of the distributional Bellman equation, dynamic programming algorithms for computing approximations to the return distribution, and a small sample of the ways in which distributional predictions can be used to make better decisions.
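For concreteness, the distributional Bellman equation referenced above can be written as follows, in the notation standard in the distributional RL literature; the exact presentation in the book may differ.

```latex
% The random return Z^{\pi}(s,a) is equal *in distribution* to the immediate
% reward plus the discounted random return from the next state-action pair,
% sampled under the transition dynamics P and the policy \pi.
Z^{\pi}(s, a) \stackrel{D}{=} R(s, a) + \gamma \, Z^{\pi}(S', A'),
\qquad S' \sim P(\cdot \mid s, a), \quad A' \sim \pi(\cdot \mid S')
```

Taking expectations of both sides recovers the familiar Bellman equation for the value function, which is why the distributional view strictly generalizes the standard one.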



Friday, January 14, 2022

Utku Evci

Recording: here


Beyond Static Network Architectures

Going beyond static architectures and using dynamically (1) trained, (2) executed or (3) adapted architectures has been shown to provide faster optimization, better scaling and more effective generalization. In this talk I will give a short overview of these results and share some of our recent work on dynamic training and adaptation of neural networks. On the dynamic training front, I plan to discuss our work on (a) training sparse neural networks and (b) growing neural networks, both of which use gradients as the guiding signal to update architectures during training. I will conclude with our recent work on (c) transfer learning, in which we propose to utilize a pretrained network head2toe by selecting features from all intermediate activations and show that this approach matches fine tuning performance.