Winter 2021

Friday, June 4, 2021

Jeff Clune

Recording: here


AI-Generating Algorithms, an Alternate Paradigm for Producing General AI, and an Example in this Direction: Learning to Continually Learn

A clear trend in machine learning is that hand-designed pipelines are replaced by higher-performing learned pipelines once sufficient compute and data are available. I argue that this trend will apply to machine learning itself, and thus that the fastest path to truly powerful AI is to create AI-generating algorithms (AI-GAs) that generate their own architectures, learning algorithms, and training environments. This paradigm includes an all-in bet on meta-learning. After introducing these ideas, the talk focuses on one example of this paradigm: Learning to Continually Learn. Catastrophic forgetting is a longstanding Achilles' heel of machine learning, wherein ML systems learn new tasks by overwriting their knowledge of how to solve previous tasks. To produce agents that can continually learn, we must prevent catastrophic forgetting. I will describe A Neuromodulated Meta-Learning algorithm (ANML), which uses meta-learning to mitigate catastrophic forgetting, producing state-of-the-art results.
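
A minimal sketch of the neuromodulatory gating at the heart of ANML (an assumption-level illustration in PyTorch, not the authors' exact architecture): a neuromodulatory network conditions on the input and emits a sigmoid gate that multiplicatively masks the prediction network's activations, so the meta-learned gate can decide which activations, and hence which gradients, each task touches.

    import torch
    import torch.nn as nn

    class GatedNet(nn.Module):
        def __init__(self, in_dim=784, hid=256, out_dim=10):
            super().__init__()
            self.nm = nn.Sequential(nn.Linear(in_dim, hid), nn.Sigmoid())  # gate in [0, 1]
            self.features = nn.Linear(in_dim, hid)
            self.head = nn.Linear(hid, out_dim)

        def forward(self, x):
            h = torch.relu(self.features(x))
            g = self.nm(x)            # per-activation gate from the neuromodulatory net
            return self.head(h * g)   # gated activations; the gate also masks gradients

    model = GatedNet()
    logits = model(torch.randn(4, 784))

Because the gate multiplies the activations, it scales their gradients as well, which is what lets meta-learning shield old knowledge from updates on new tasks.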



Friday, May 14, 2021

Kaleem Siddiqi

Recording: here


Learning Representations for Biological Structures: Insights from the Heart and the Brain

Biological structures are not aligned to grids, but exhibit regularities of their own. Drawing on insights from the mammalian heart and brain, I will demonstrate how minimal surfaces, minimal paths, moving frames, and diffusion geometry can be used to understand geometric structure and relate it to function. I will also describe new findings obtained by applying computer vision methods at the millimeter, micron, and nanometer scales. In the outer wall of the heart we have reconstructed an entirely new system of myofibers, which runs longitudinally from apex to base; we hypothesize that these fibers play an important role in electrical conduction. Ultrastructure analysis of astrocytes, the most complex cells in the mammalian brain, which form a network that co-exists with the one formed by neurons, reveals new insights into their form and how it facilitates their function.



Friday, Apr 30, 2021

Geoff Hinton

Recording: here


How to represent part-whole hierarchies in a neural net

I will present a single idea about representation which allows advances made by several different groups to be combined into an imaginary system called GLOM. The advances include transformers, neural fields, contrastive representation learning, distillation and capsules. GLOM answers the question: How can a neural network with a fixed architecture parse an image into a part-whole hierarchy which has a different structure for each image? The idea is simply to use islands of identical vectors to represent the nodes in the parse tree. The talk will discuss the many ramifications of this idea. If GLOM can be made to work, it should significantly improve the interpretability of the representations produced by transformer-like systems when applied to vision or language.
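A toy sketch of the "islands of identical vectors" idea (my illustration of the abstract's claim, not Hinton's specification): each image location keeps an embedding, and an attention-weighted averaging step pulls similar embeddings together until contiguous groups of locations share nearly the same vector; those islands play the role of nodes in the parse tree.

    import torch

    def island_step(h, temp=0.1):
        # h: (locations, dim) embeddings, one per image location at one level
        sim = h @ h.t() / temp               # pairwise similarity between locations
        attn = torch.softmax(sim, dim=-1)    # each location attends to similar ones
        return 0.5 * h + 0.5 * attn @ h      # move halfway toward the weighted mean

    h = torch.randn(16, 8)
    for _ in range(10):
        h = island_step(h)
    # After iteration, rows of h collapse into a few near-identical vectors (islands).
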

Friday, Apr 16, 2021

Sarath Chandar

Recording: here


Towards Lifelong Learning Systems

One of the grand challenges of Artificial Intelligence is to design artificial agents that can achieve human-level general intelligence. While deep learning has demonstrated super-human performance in several applications, current machine learning (ML) systems are highly specific to the task they were trained on and cannot generalize when faced with sequences of other tasks. Furthermore, even when faced with the original task, these systems cannot learn after deployment. Lifelong learning is a paradigm in ML where systems learn continuously over a sequence of tasks. Such systems, if realizable, can transfer knowledge between tasks, develop better priors for future tasks, and hence be as sample-efficient as humans in learning. In this talk, I will discuss the core challenges in designing lifelong learning systems, which include catastrophic forgetting and capacity saturation. Then, I will introduce new lifelong learning benchmarks inspired by realistic scenarios. Finally, I will explain the shortcomings of existing replay-based algorithms for lifelong learning and introduce a new class of optimizers for lifelong learning. Our proposed optimizers are complementary to existing solutions and, when combined with any of them, result in even less catastrophic forgetting. Throughout the talk, I will also cover applications of lifelong learning in computer vision, natural language processing, and reinforcement learning. Towards the end, I will talk about the open questions in lifelong learning and promising future research directions.
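
For context, a minimal sketch of the replay baseline the talk builds on (an illustrative textbook version, not the speaker's proposed optimizer): keep a small reservoir of past examples and mix them into each gradient step so that updates on a new task do not overwrite old-task knowledge.

    import random

    buffer, capacity, seen = [], 500, 0

    def reservoir_add(example):
        # Reservoir sampling keeps a uniform sample over everything seen so far.
        global seen
        seen += 1
        if len(buffer) < capacity:
            buffer.append(example)
        else:
            j = random.randrange(seen)
            if j < capacity:
                buffer[j] = example

    def replay_batch(k=32):
        return random.sample(buffer, min(k, len(buffer)))
    # Each training step: loss = loss(new_task_batch) + loss(replay_batch()).
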

Friday, Apr 9, 2021

Anna Huang

Recording: here


Interactive Generative Models for Music

How can we turn listeners into creators? How can we support novices in composing their own musical adventure? Advances in generative modeling have opened up exciting possibilities in this space. To leverage these models in enabling new musical interactions, we need to go beyond optimizing for machine learning (ML) objectives such as realism of generated samples, and consider human objectives in human-computer interaction (HCI) such as user control, creative agency, and sense of authorship. Instead of optimizing for a task, we need to support a process: the creative process of exploration, prototyping, and iteration.


First, I'll describe how we designed a generative model with human objectives in mind, allowing users to interact with the model in an iterative workflow. To enable flexible infilling and rewriting, we use OrderlessNADE (Uria 2014) to model all possible orderings of filling in a score, and independent blocked Gibbs sampling (Yao 2014) as the generation procedure. The result is Coconet, the ML model that powered Google's first AI Doodle, the Bach Doodle, which in two days harmonized more than 55 million melodies from users around the world.
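
A minimal sketch of independent blocked Gibbs infilling as described in the abstract (the real Coconet uses a trained orderless model; `model` here is a hypothetical stand-in returning per-cell pitch logits, and the annealing schedule is an assumption):

    import numpy as np

    def blocked_gibbs(model, score, mask_rate=0.5, steps=100, n_pitches=46):
        # score: (voices, timesteps) integer pitch grid; model(score, mask) -> logits
        for step in range(steps):
            rate = mask_rate * (1 - step / steps)          # anneal toward smaller blocks
            mask = np.random.rand(*score.shape) < max(rate, 1.0 / score.size)
            logits = model(score, mask)                    # (voices, timesteps, n_pitches)
            probs = np.exp(logits - logits.max(-1, keepdims=True))
            probs /= probs.sum(-1, keepdims=True)
            for v, t in zip(*np.nonzero(mask)):            # resample masked cells independently
                score[v, t] = np.random.choice(n_pitches, p=probs[v, t])
        return score

Because the model was trained on all orderings of filling in a score, it can rewrite any masked block, which is what makes the iterative, user-driven workflow possible.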


Next, I'll talk about Music Transformer to discuss ML challenges in music modeling. Because music relies heavily on relative timing at the local level and on repetition at the global level to build structure, we had to augment and scale up how self-attention processes relational information. The result is a model that can generate music that sounds compelling and coherent across multiple time scales, from the 10-millisecond scale of expressive timing in performance to the minute scale.
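
A minimal sketch of relative self-attention (my simplified rendering of the idea; the paper's memory-efficient "skewing" trick is omitted): the attention logits get an extra term that depends on the distance t - t' between positions, letting the model reason about relative timing directly.

    import torch

    def relative_attention(q, k, v, rel_emb):
        # q, k, v: (seq, dim); rel_emb: (2 * seq - 1, dim), one embedding per offset
        seq, dim = q.shape
        logits = q @ k.t()                                    # content-content term
        offsets = torch.arange(seq)[:, None] - torch.arange(seq)[None, :]
        rel = rel_emb[offsets + seq - 1]                      # (seq, seq, dim)
        logits = logits + torch.einsum('qd,qkd->qk', q, rel)  # content-position term
        return torch.softmax(logits / dim ** 0.5, dim=-1) @ v

    seq, dim = 8, 16
    out = relative_attention(torch.randn(seq, dim), torch.randn(seq, dim),
                             torch.randn(seq, dim), torch.randn(2 * seq - 1, dim))
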

Can we enable novices to steer models such as Music Transformer to create something that feels personal? How can we control these models in a way that aligns with how we experience the ups, downs, and turns in music? I'll talk about some future directions in how we might design creative environments that enable workflows and interactions that are "helpful for people, and helpful for models". I'll also briefly touch on the international AI Song Contest and how we're building a feedback loop between researchers and musicians on designing ML models and tools.

Friday, Feb 19, 2021

Vineeth N Balasubramanian

Recording: here


Explaining Neural Networks: A Causal Perspective

As deep neural network models get absorbed into real-world applications each day, there is an impending need to explain the decisions of these models. This talk will begin with an introduction to the need for explaining neural network models, summarize existing efforts in this regard, and present a few of our own efforts in this direction. In particular, while existing methods for neural network attributions (for explanations) are largely statistical, we propose a new attribution method for neural networks developed from first principles of causality (to the best of our knowledge, the first such method). The neural network architecture is viewed as a Structural Causal Model, and a methodology to compute the causal effect of each feature on the output is presented. With reasonable assumptions on the causal structure of the input data, we propose algorithms to efficiently compute the causal effects, as well as scale the approach to data with large dimensionality. We also show how this method can be used for recurrent neural networks. We report experimental results on both simulated and real datasets showcasing the promise and usefulness of the proposed algorithm. This work was presented as a Long Oral at ICML 2019.
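
A minimal sketch of interventional attribution in the spirit of the talk (my simplification of the average-causal-effect idea, not the paper's exact estimator): treat the network as the causal mechanism, intervene by clamping one input feature to a value alpha, and measure the shift in the expected output.

    import torch

    def average_causal_effect(net, X, feature, alpha):
        # X: (n, d) input samples; implements do(x_feature = alpha)
        X_do = X.clone()
        X_do[:, feature] = alpha
        with torch.no_grad():
            baseline = net(X).mean()         # E[y] without the intervention
            intervened = net(X_do).mean()    # E[y | do(x_feature = alpha)]
        return (intervened - baseline).item()

Sweeping alpha over a feature's range traces out how the output responds to interventions on that feature alone, which is the causal analogue of a statistical attribution.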

Friday, Feb 12, 2021

Yoshua Bengio

Recording: here

Slides : here

Human-inspired inductive biases for causal reasoning and out-of-distribution generalization

Humans are very good at out-of-distribution generalization (at least compared to current AI systems), and it would be valuable to understand some of the inductive biases they may exploit, and to test these theories by evaluating how they can be translated into successful ML architectures, training frameworks, and experiments. Natural language and experimental results in cognitive science and neuroscience provide a wealth of clues about the System 2 part of how humans understand the world and reason about it. In this talk, I will discuss several of these hypothesized inductive biases, many of which exploit notions in causality and connect the discovery of abstractions in representation learning (the perception and interpretation part) and in reinforcement learning (the abstract actions). Systematic generalization is hypothesized to arise from an efficient factorization of knowledge into recomposable pieces corresponding to reusable factors (in a directed factor graph formulation). This is related to, yet different in many ways from, symbolic AI (and this can be seen in the errors and limitations of reasoning in humans, as well as in our ability to learn to do this at scale, with distributed representations and efficient search). Sparsity of the causal graph and locality of interventions -- which can be observed in the structure of sentences -- have the potential to considerably reduce the computational complexity of both inference (including planning) and learning, which may be a reason why evolution incorporated this "consciousness" prior. Although this talk rests on a series of recent papers on these topics (e.g., on learning causal and/or modular structure with deep learning), much of it will be forward-facing and suggest open research questions in the hope of stimulating novel investigations and collaborations. A recent review of many of these points can be found at https://arxiv.org/abs/2011.15091.
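
A toy sketch of the sparsity-and-locality point (my illustration of the abstract's claim, not a method from the talk): in a sparse causal graph, an intervention on one variable only requires recomputing its descendants, so inference after a local change stays cheap.

    children = {'a': ['b'], 'b': ['c'], 'c': [], 'd': []}  # sparse DAG over 4 variables

    def affected_by(do_var):
        # Collect all descendants of the intervened node; nothing else changes.
        out, stack = set(), [do_var]
        while stack:
            v = stack.pop()
            for c in children[v]:
                if c not in out:
                    out.add(c)
                    stack.append(c)
        return out

    print(affected_by('b'))  # {'c'}: only descendants of the intervened node change
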

Friday, Jan 15, 2021

Marc Bellemare

Recording: here

Autonomous navigation of stratospheric balloons using reinforcement learning

Efficiently navigating a superpressure balloon in the stratosphere requires the integration of a multitude of cues, such as wind speed and solar elevation, and the process is complicated by forecast errors and sparse wind measurements. Coupled with the need to make decisions in real time, these factors rule out the use of conventional control techniques. This talk describes the use of reinforcement learning to create a high-performing flight controller for Loon superpressure balloons. Our algorithm uses data augmentation and a self-correcting design to overcome the key technical challenge of reinforcement learning from imperfect data, which has proved to be a major obstacle to its application to physical systems. We deployed our controller to station Loon balloons at multiple locations across the globe, including a 39-day controlled experiment over the Pacific Ocean. Analyses show that the controller outperforms Loon’s previous algorithm and is robust to the natural diversity in stratospheric winds. These results demonstrate that reinforcement learning is an effective solution to real-world autonomous control problems in which neither conventional methods nor human intervention suffice, offering clues about what may be needed to create artificially intelligent agents that continuously interact with real, dynamic environments.
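
A toy sketch of the station-keeping objective (the 50 km radius and the decay rate are my assumptions for illustration, not the paper's exact reward): the agent earns full reward while the balloon stays within a target radius of its station, with reward decaying as it drifts farther away; its only actions are to ascend, descend, or hold altitude to ride winds blowing in different directions.

    def station_reward(distance_km, radius_km=50.0):
        # Full reward on station; exponentially decaying reward off station.
        if distance_km <= radius_km:
            return 1.0
        return 0.5 ** ((distance_km - radius_km) / 100.0)  # halves every 100 km

    print(station_reward(30.0), station_reward(150.0))
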

Friday, Jan 8, 2021

Brian Christian

Recording: here

The alignment problem: Machine learning and human values

With the incredible growth of machine learning over recent years has come an increasing concern about whether ML systems' objectives truly capture their human designers' intent: the so-called "alignment problem." Over the last five years, these questions of both ethics and safety have moved from the margins of the field to become arguably its most central concerns. The result is something of a movement: a vibrant, multifaceted, interdisciplinary effort that is producing some of the most exciting research happening today. Brian Christian, visiting scholar at UC Berkeley and author of the acclaimed bestsellers The Most Human Human and Algorithms to Live By, will survey this landscape of recent progress and the frontier of open questions that remain.