Astrophysics is being revolutionized by a new generation of telescopes and sky surveys that produce monumental volumes of invaluable data to answer some of the most intriguing questions about the birth and evolution of our Universe. Surveys like the SKA will produce data on the scale of petabits per second – more than today's global internet data rate. The world's largest digital camera, at the Rubin Observatory, with 3.2 gigapixels, will scan the entire sky every three nights, producing a time series of billions of astrophysical sources. These data will help us discover the nature of dark matter and dark energy, the two unknown components that constitute 95% of the energy content of the Universe.
Machine learning has been shown to be indispensable for the analysis of these data. In this talk, I will give a broad overview of the state of the field and the numerical methods commonly used in astrophysics. I will then focus on a project to use machine learning and statistical methods to infer the distribution of dark matter in distant galaxies. I will conclude by sharing a number of ongoing projects and opportunities for MLxAstro collaborations.
Recently, many training techniques and neural models have been proposed to learn and implicitly represent symbolic constraints, both known and unknown. In this work, we first explore whether neural models can be trained more effectively using known symbolic domain knowledge expressed as constraints on the output space. To this end, we propose a primal-dual formulation for deep learning with constraints. We then shift our attention to neural models that learn unknown symbolic constraints from input/output pairs of a combinatorial puzzle, such as Sudoku. We identify two potential issues in such models and propose appropriate solutions. First, we identify the issue of solution multiplicity (one input having many correct solutions) while training neural models and propose appropriate loss functions to address it. Next, we observe that existing architectures, such as SATNet and the message-passing-based RRN, fail to generalize across the output space of variables (e.g., they cannot solve 16×16 Sudoku after training only on 9×9 Sudoku). In response, we design two neural architectures for output-space invariance in combinatorial problems.
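As a rough illustration of the generic primal-dual recipe (not the exact formulation proposed in this work), the sketch below adds a multiplier-weighted constraint-violation term to the task loss, descending in the network parameters while ascending in the Lagrange multiplier. The model, toy batch, and `violation` measure are all illustrative placeholders.

```python
import torch

model = torch.nn.Linear(10, 5)
opt_theta = torch.optim.SGD(model.parameters(), lr=1e-2)
lam = torch.zeros(1, requires_grad=True)                  # Lagrange multiplier
opt_lam = torch.optim.SGD([lam], lr=1e-2, maximize=True)  # gradient *ascent*

def violation(outputs):
    # Placeholder constraint: the class-0 logit should not exceed the
    # class-1 logit; the hinge is zero when the constraint is satisfied.
    return torch.relu(outputs[:, 0] - outputs[:, 1]).mean()

x, y = torch.randn(32, 10), torch.randint(0, 5, (32,))    # toy batch
for step in range(100):
    outputs = model(x)
    # Lagrangian: task loss plus multiplier-weighted constraint violation.
    lagrangian = torch.nn.functional.cross_entropy(outputs, y) \
                 + lam * violation(outputs)
    opt_theta.zero_grad(); opt_lam.zero_grad()
    lagrangian.backward()
    opt_theta.step()                       # primal step: descend in theta
    opt_lam.step()                         # dual step: ascend in lambda
    with torch.no_grad():
        lam.clamp_(min=0.0)                # multiplier stays nonnegative
```

If the constraint is persistently violated, the multiplier grows and the penalty bites harder; once it is satisfied, the violation term vanishes and the multiplier stops rising.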
Olfaction is our sense of the chemical world. Why does vanilla bean smell "vanilla", and why does grass smell "green"? There are many biological, behavioural, cultural and physical components to this question. We look at it from the angle of single molecules: predicting the relationship between a molecule's structure and its odor, which remains a difficult, decades-old task. This problem is an important challenge in chemistry, impacting human nutrition, the manufacture of synthetic fragrance, the environment, and sensory neuroscience. We are attempting to understand olfaction with Graph Neural Networks (GNNs). This talk has three interconnected parts that lie at the heart of my research: 1) GNNs as a tool to learn representations on graph-structured data, 2) modelling molecules with GNNs to build an odor representation, and 3) graph data and GNNs as a testbed for interpretability techniques, which we ultimately want to use for scientific discovery.
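To make the first ingredient concrete, here is a minimal message-passing layer in the spirit of the GNNs described above. It is a generic sketch, not the talk's actual model; the toy "molecule", feature dimension, and mean-pooling readout are placeholders.

```python
import torch

class MessagePassingLayer(torch.nn.Module):
    # One round of neighborhood aggregation: each atom's vector is updated
    # from the sum of its neighbors' transformed features.
    def __init__(self, dim):
        super().__init__()
        self.msg = torch.nn.Linear(dim, dim)
        self.upd = torch.nn.Linear(2 * dim, dim)

    def forward(self, h, adj):
        # h: [num_atoms, dim] node features; adj: [num_atoms, num_atoms] 0/1
        m = adj @ self.msg(h)              # aggregate messages from neighbors
        return torch.relu(self.upd(torch.cat([h, m], dim=-1)))

# Toy molecule: three atoms in a chain, with random 8-dim atom features.
h = torch.randn(3, 8)
adj = torch.tensor([[0., 1., 0.],
                    [1., 0., 1.],
                    [0., 1., 0.]])
layer = MessagePassingLayer(8)
graph_embedding = layer(h, adj).mean(dim=0)   # pool atoms into one odor vector
```

Stacking several such layers lets information flow along bonds, and the pooled graph embedding is what a downstream head would map to odor descriptors.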
From recommender systems to traffic routing, machine learning systems are mediating an ever-growing number of economic and social interactions among individuals, firms, and organizations, slowly becoming a cornerstone of modern institutions. This talk will focus on constructing agents (“mechanisms”) that successfully mediate economic interactions among participants in the presence of strategic behavior and information asymmetry, and in pursuit of complex group-wide metrics. In particular, the research presented will focus on the multiagent and value-alignment challenges arising in this context: how do we construct mechanisms that keep up with adaptive participants, shepherd their learning towards desirable outcomes, and ensure that the participants’ own aspirations for group-wide goals are represented in the policy of the mediator?
Can standard sequence modeling frameworks train effective policies for reinforcement learning (RL)? Doing so would allow drawing upon the simplicity and scalability of the Transformer architecture, and the associated advances and infrastructure investments in language modeling such as GPT-x and BERT. I will present our work investigating this by casting the problem of RL as optimality-conditioned sequence modeling. Despite its simplicity, such an approach is surprisingly competitive with current model-free offline RL baselines. However, the robustness of such an approach remains a challenge in robotics applications. In the second part of the talk, I will discuss the ways in which implicit, energy-based models can address it, particularly with respect to approximating complex, potentially discontinuous and multi-valued functions. Robots with such implicit policies can learn complex and remarkably subtle behaviors on contact-rich tasks from human demonstrations, including tasks with high combinatorial complexity and tasks requiring 1 mm precision.
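For concreteness, the sketch below shows the basic shape of optimality-conditioned sequence modeling in the spirit of this line of work: interleave return-to-go, state, and action tokens, and ask a causally masked Transformer to predict actions. The dimensions, the tiny two-layer backbone, and the regression loss are assumptions for illustration, not the presented system (positional embeddings and other details are omitted).

```python
import torch

class ReturnConditionedPolicy(torch.nn.Module):
    # Interleave (return-to-go, state, action) tokens per timestep and train
    # a causally masked Transformer to predict the action at each step.
    def __init__(self, state_dim, act_dim, dim=64):
        super().__init__()
        self.embed_rtg = torch.nn.Linear(1, dim)
        self.embed_state = torch.nn.Linear(state_dim, dim)
        self.embed_action = torch.nn.Linear(act_dim, dim)
        layer = torch.nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.backbone = torch.nn.TransformerEncoder(layer, num_layers=2)
        self.head = torch.nn.Linear(dim, act_dim)

    def forward(self, rtg, states, actions):
        # rtg: [B, T, 1], states: [B, T, state_dim], actions: [B, T, act_dim]
        B, T, _ = states.shape
        tokens = torch.stack([self.embed_rtg(rtg),
                              self.embed_state(states),
                              self.embed_action(actions)], dim=2)
        tokens = tokens.reshape(B, 3 * T, -1)     # ..., R_t, s_t, a_t, ...
        # Causal mask: -inf above the diagonal blocks attention to the future.
        causal = torch.triu(torch.full((3 * T, 3 * T), float("-inf")), diagonal=1)
        hidden = self.backbone(tokens, mask=causal)
        return self.head(hidden[:, 1::3])         # predict a_t from the s_t slot

# Toy usage: condition on target returns and regress onto dataset actions.
policy = ReturnConditionedPolicy(state_dim=17, act_dim=6)
rtg = torch.randn(8, 20, 1)
states, actions = torch.randn(8, 20, 17), torch.randn(8, 20, 6)
loss = ((policy(rtg, states, actions) - actions) ** 2).mean()
```

At deployment time, one conditions on a high desired return-to-go and decodes actions autoregressively, which is what turns a sequence model into a policy.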
From digital assistants that can sense and soothe your frustrations to photo apps that can identify a warm smile, empathic AI will be central to a future in which our everyday interactions with technology ultimately serve our emotional well-being. But researchers and developers are missing the right training data to make this human-centric future for empathic AI a reality. Large-scale, globally diverse data with a multitude of emotional expressions and contexts is essential to paint a scientifically valid picture of human emotion. I will discuss three principles for gathering this kind of rich, globally diverse, psychologically valid emotion data at scale, and explain how we implemented them at Hume AI to gather 3 million self-report and perceptual judgments of 1.5 million human emotional behaviors. Using Hume’s models as an example, I will showcase how ML models trained on this data can infer human emotional behavior with more accuracy and nuance than was previously possible. Finally, I will summarize ethical guidelines for deploying these powerful new empathic technologies at scale.
Non-invasive neural interfaces have the potential to transform human-computer interaction by providing users with low-friction, information-rich, always-available inputs. Reality Labs at Meta is developing such an interface for the control of augmented reality devices, based on electromyographic (EMG) signals captured at the wrist. Machine learning is crucial to unlocking the full potential of these signals and interactions, and this talk will present several specific problems and the machine learning approaches that have advanced us towards the ultimate goal of effortless and joyful interfaces. We will provide the necessary neuroscientific background to understand these signals, describe supervised approaches to biomimetic control, especially for generating text, detail several approaches to enabling generalization across users and sessions, and discuss unsupervised approaches to extending the bandwidth of the human-machine interface using these signals.
Over the past few years, vision-language models have led to significant improvements on various tasks such as image-caption retrieval, image captioning, and visual question answering, even surpassing human performance on some tasks (such as visual question answering). But are current models really better than humans at answering questions about images? Are current vision-language models really learning to solve the task, or merely learning to solve the dataset? In this talk, I will present a few case studies spanning different tasks and models that try to answer this question via careful and systematic evaluations.
Few decisions are made with full certainty of their consequences. In reinforcement learning, this principle is instantiated by modelling the sum of rewards obtained (the return) as a random quantity. Consequently, having a complete picture of reinforcement learning requires understanding how an agent's choices affect the distribution of possible returns. Based on our upcoming book (MIT Press), this talk gives a snapshot of the current state of distributional reinforcement learning, including: a characterization of the random return by means of the distributional Bellman equation, dynamic programming algorithms for computing approximations to the return distribution, and a small sample of the ways in which distributional predictions can be used to make better decisions.
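In its standard form (using notation close to the literature's), the distributional Bellman equation mentioned above states that the random return from a state equals the immediate reward plus the discounted random return from the successor state, with equality holding in distribution:

```latex
% Distributional Bellman equation: equality in distribution between the
% random return at x and the reward plus discounted return at the successor.
G^{\pi}(x) \overset{D}{=} R(x, A) + \gamma\, G^{\pi}(X'),
\qquad A \sim \pi(\cdot \mid x), \quad X' \sim P(\cdot \mid x, A).
```

Taking expectations on both sides recovers the ordinary Bellman equation for the value function, which is why distributional RL strictly generalizes the classical picture.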
Going beyond static architectures and using dynamically (1) trained, (2) executed, or (3) adapted architectures has been shown to provide faster optimization, better scaling, and more effective generalization. In this talk I will give a short overview of these results and share some of our recent work on dynamic training and adaptation of neural networks. On the dynamic training front, I plan to discuss our work on (a) training sparse neural networks and (b) growing neural networks, both of which use gradients as the guiding signal to update architectures during training. I will conclude with our recent work on (c) transfer learning, in which we propose Head2Toe, a method that utilizes a pretrained network by selecting features from all of its intermediate activations, and show that this approach matches fine-tuning performance.
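As a rough sketch of the Head2Toe idea (not the paper's exact pipeline): freeze a pretrained backbone, pool and concatenate activations from every intermediate block, and fit a linear head on top. The actual method additionally selects a sparse subset of these features, e.g. with a group-lasso-style penalty, which is omitted here; the torchvision ResNet-18 and the 10-class head are illustrative assumptions.

```python
import torch
import torchvision

# Frozen pretrained backbone; only the linear head will be trained.
backbone = torchvision.models.resnet18(weights="IMAGENET1K_V1").eval()
for p in backbone.parameters():
    p.requires_grad_(False)

feats = []
def hook(_module, _inp, out):
    # Average-pool each block's activation map into a flat feature vector.
    feats.append(torch.nn.functional.adaptive_avg_pool2d(out, 1).flatten(1))

# Tap every residual stage, not just the final layer ("head to toe").
for block in [backbone.layer1, backbone.layer2, backbone.layer3, backbone.layer4]:
    block.register_forward_hook(hook)

x = torch.randn(8, 3, 224, 224)                 # toy batch of target images
feats.clear()
backbone(x)
features = torch.cat(feats, dim=1)              # concatenated intermediate features
head = torch.nn.Linear(features.shape[1], 10)   # linear head for 10 target classes
logits = head(features)                         # train `head` as usual
```

The appeal is that only the small head is optimized on the target task, yet features from early and middle layers remain available when the final-layer representation transfers poorly.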