Summer / Été 2019

Friday, August 30, 2019

Aaditya Ramdas (CMU)

Recording:

https://bluejeans.com/s/P_jt8


Quantiles for bandits (and RL?)

The question of “safe” exploration has received a lot of attention recently in RL, and more broadly there has been a move away from average performance to tail performance. One way to achieve this is to move away from maximizing expected rewards to instead maximize other quantiles of the reward distribution. One distinct advantage of quantiles is the following: while means may not exist or may be infinite, especially when rewards are heavy-tailed, quantiles are always well defined and exist for any distribution. In this talk, we will gently introduce quantiles, examine some of their advantages, and demonstrate that it is possible to estimate quantiles of a distribution sequentially over time. We then apply our results to the problem of selecting an arm with the best quantile in a multi-armed bandit framework, proving a state-of-the-art sample complexity bound for a novel allocation strategy. Simulations demonstrate that our method stops with fewer samples than existing methods by a factor of five to fifty. A central open problem is extending our ideas to tabular MDPs and related “distributional” RL settings. This is joint work with Steve Howard, a PhD student at UC Berkeley.
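The contrast the abstract draws between means and quantiles is easy to see numerically. The sketch below (illustrative only, not the speakers' method) samples rewards from a standard Cauchy distribution, whose mean does not exist: the sample mean never settles down, while sample quantiles converge to fixed population values.

```python
import numpy as np

# For heavy-tailed rewards (standard Cauchy), the mean is undefined, so the
# sample mean is unstable across sample sizes and runs, while sample
# quantiles converge: the median to 0 and the 0.9-quantile to
# tan(0.4*pi) ≈ 3.08.
rng = np.random.default_rng(0)

for n in (1_000, 100_000):
    rewards = rng.standard_cauchy(n)
    print(f"n={n:>7}  sample mean={np.mean(rewards):8.2f}  "
          f"median={np.quantile(rewards, 0.5):6.2f}  "
          f"q90={np.quantile(rewards, 0.9):6.2f}")
```

The quantile columns stabilize as n grows; the mean column can swing arbitrarily between runs, which is exactly why quantile-based objectives are attractive for heavy-tailed bandit problems.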


Friday, August 23, 2019

Laurent Dinh (Google Brain Montreal)

Recording:

https://bluejeans.com/s/DcZJu/


A primer on normalizing flows

Normalizing flows are a flexible family of probability distributions that can serve as generative models for a variety of data modalities. Because flows can be expressed as compositions of expressive functions, they have successfully harnessed recent advances in deep learning. An ongoing challenge in developing these methods is the definition of expressive yet tractable building blocks. In this talk, I will introduce the fundamentals and describe recent work (including my own) on this topic.
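The "tractable building blocks" the abstract mentions come down to the change-of-variables rule: a flow's log-density at a point is the base log-density of its preimage plus the accumulated log |det Jacobian| of the inverse maps. Here is a minimal one-dimensional sketch with toy affine layers (illustrative, not any specific published architecture).

```python
import numpy as np

# Change-of-variables sketch: the flow applies invertible layers to a
# standard-normal base variable; to score a data point x, we invert the
# layers in reverse order, accumulating the log |det Jacobian| of each
# inverse, then add the base log-density of the recovered latent.

def affine_inverse(x, scale, shift):
    """Invert y = scale*x + shift; log|det J| of the inverse is -log|scale|."""
    return (x - shift) / scale, -np.log(np.abs(scale))

def flow_log_prob(x, layers):
    z, log_det = x, 0.0
    for scale, shift in reversed(layers):   # undo layers in reverse order
        z, ld = affine_inverse(z, scale, shift)
        log_det += ld
    base_log_prob = -0.5 * (z**2 + np.log(2 * np.pi))  # standard normal base
    return base_log_prob + log_det

layers = [(2.0, 1.0), (0.5, -3.0)]  # two toy invertible layers
print(flow_log_prob(np.array(0.0), layers))  # ≈ -4.044
```

Real flows replace the scalar affine maps with expressive invertible networks (e.g., coupling layers) whose Jacobian determinants remain cheap to compute, which is precisely the design tension the talk addresses.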

Friday, August 16, 2019

Andre Barreto (DeepMind London)

Recording:

TBA


Efficient Reinforcement Learning with Generalized Policy Updates

The combination of reinforcement learning with deep learning has led to several successful applications in recent years. However, the amount of data needed by learning systems of this type still precludes their widespread use. In this talk I will describe a divide-and-conquer approach to reinforcement learning with the potential to make it more data-efficient. The basic premise is that complex decision problems can be naturally decomposed into multiple tasks that unfold in sequence or in parallel. By associating each task with a reward function, we can naturally incorporate this problem decomposition into the standard reinforcement learning formalism. The specific way I propose to do so is through a generalization of two fundamental operations in reinforcement learning, policy improvement and policy evaluation. The generalized versions of these operations, jointly referred to as "generalized policy updates", extend their standard counterparts from single to multiple tasks. This allows one to leverage the solution of some tasks to speed up the solution of other tasks. I will describe how exactly this can be done and provide some examples illustrating the use of the proposed approach in practice.

Friday, August 9, 2019

Leila Wehbe (CMU)

Recording:

https://bluejeans.com/s/G0m22/


Using insights from the human brain to interpret and improve NLP models

There has been much progress in neural network models for NLP that are able to produce embeddings for individual words as well as text sequences. This has allowed us to investigate the brain representation of natural text and begin to unravel the mechanisms the brain uses to make sense of language. At the same time, the brain, being the only processing system we have that actually produces and understands language, could be a valuable source of insight on how to build useful representations of language. In this talk I will describe recent work on identifying brain regions that process language meaning at different lengths of context. Based on these results, I will also describe recent work that employs brain activity recordings from subjects reading natural text to interpret and improve the representations learned by recent NLP algorithms such as ELMo or BERT. We show that modifications to BERT's architecture that make its representations more predictive of brain activity also improve BERT's performance on a series of NLP tasks. Under this perspective, the cognitive neuroscience of language and NLP can evolve in a symbiotic partnership where progress in one field can illuminate the path for the other.

Friday, July 19, 2019

Paul Cisek (UdeM)

Recording:

https://bluejeans.com/s/6cezw/

Rethinking behavior from an evolutionary perspective

In both AI research and psychological theory, the human brain is usually thought of as an information processing system that encodes and manipulates representations of knowledge to produce plans of action. This view leads to a decomposition of behavior into putative functions such as object recognition, memory, decision-making, action planning, etc., inspiring the search for the neural correlates of these functions and attempts to simulate them in computational models. However, empirical neurophysiological data does not support many of the predictions of these classic subdivisions, consistently showing divergence and broad distribution of putatively unified functions, “mixed representations” that combine sensory, motor, and cognitive variables in single neurons, and a general incompatibility with the conceptual subdivisions posited by psychological theories. In this talk, I will explore the possibility of resynthesizing a different set of functional subdivisions, guided by the growing body of data on the evolutionary process that produced the human brain. The main part of my talk will summarize, in chronological order, the sequence of innovations that appeared in nervous systems along the lineage that leads from the earliest multicellular animals to humans. Along the way, functional subdivisions and elaborations will be introduced in parallel with the neural specializations that made them possible, gradually building up an alternative conceptual taxonomy of brain functions. These functions emphasize mechanisms for real-time interaction with the world, rather than for building explicit knowledge of the world, and the relevant representations emphasize pragmatic outcomes rather than decoding accuracy, mixing variables in just the way seen in real neural data. 
I will argue that this alternative taxonomy better delineates the real functional pieces into which the human brain is organized, offers a more natural mapping to real neural structures, and provides a better set of constraints for constructing artificial systems aimed at replicating brain function.

Friday, July 12, 2019

Ravindran Balaraman (IIT Madras)

Recording:

https://bluejeans.com/s/KKsu@/

Reinforcement Learning at Work

In this talk, I will describe a few innovative applications of reinforcement learning to real-world problems. In the first part of the talk, I will describe a network discovery problem inspired by the requirement of information dissemination in real-life social networks. In this work, we propose a reinforcement learning framework for network discovery that automatically learns useful node and graph representations that encode important structural properties of the network. At training time, the method identifies portions of the network such that the nodes selected from this sampled subgraph can effectively influence nodes in the complete network. We experiment with real-world social networks from four different domains and show that the policies learned by our RL agent provide a 10-36% improvement over the current state-of-the-art method. If time permits, in the second part of the talk I will briefly describe a multi-product inventory control problem for a retail chain. We model this as a distributed control problem with very high dimensional state and action spaces. We look at a variant of A3C that scales well to such problems and compare the performance to vanilla A3C and other control approaches.

Friday, July 5, 2019

Ankesh Anand and Evan Racah (Mila)

Recording:

https://bluejeans.com/s/KmDjE/


Unsupervised State Representation Learning in Atari

State representation learning, or the ability to capture latent generative factors of an environment, is crucial for building intelligent agents that can perform a wide variety of tasks. Learning such representations without supervision from rewards is a challenging open problem. We introduce a method that learns state representations by maximizing mutual information across spatially and temporally distinct features of a neural encoder of the observations. We also introduce a new benchmark based on Atari 2600 games where we evaluate representations based on how well they capture the ground truth state variables. We believe this new framework for evaluating representation learning models will be crucial for future representation learning research. Finally, we compare our technique with other state-of-the-art generative and contrastive representation learning methods.
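Maximizing mutual information across features of nearby observations is typically done with a contrastive lower bound. The sketch below is a generic InfoNCE-style objective of that kind (an illustrative stand-in, not the authors' exact loss): features at time t are scored against features at t+1, with the matching pair in the batch as the positive and all other pairs as negatives.

```python
import numpy as np

# InfoNCE-style contrastive objective: each anchor should score its paired
# positive (the diagonal of the similarity matrix) above all in-batch
# negatives; minimizing this loss maximizes a mutual-information lower bound.

def info_nce(anchors, positives):
    """anchors, positives: (batch, dim) feature arrays from paired frames."""
    scores = anchors @ positives.T                  # (batch, batch) similarities
    scores -= scores.max(axis=1, keepdims=True)     # numerical stability
    log_softmax = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_softmax))           # positives on the diagonal

rng = np.random.default_rng(0)
z_t = rng.normal(size=(8, 16))
z_next = z_t + 0.1 * rng.normal(size=(8, 16))  # temporally adjacent features
print(info_nce(z_t, z_next))       # low loss: each anchor matches its pair
print(info_nce(z_t, z_next[::-1])) # high loss: pairs deliberately mismatched
```

Training an encoder to drive this loss down forces temporally (or spatially) related observations to share predictable features, which is the sense in which such methods capture latent generative factors without reward supervision.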


Friday, June 28, 2019

Audrey Durand (Mila)

Recording:

https://bluejeans.com/s/Rn5F3/

Bandits in the wild

The multi-armed bandit setting is a well-known environment for studying the exploration-exploitation trade-off in reinforcement learning (RL). As the simplest instance of an RL problem, it has attracted the attention of many researchers seeking strong and elegant theoretical guarantees on agent performance. However, these guarantees often depend on firm assumptions which might not hold in applied settings, leaving it unclear whether the associated approaches could solve real-world problems or not. In this talk, I present two successful applications of bandit algorithms for interactive learning in medical research: (1) contextual bandits applied to adaptive trials for evaluating cancer treatments in mice and (2) multi-objective kernelized bandits applied to online tuning of high-resolution microscopy devices for neuroscience. I discuss the constraints and challenges faced when deploying these algorithms in real-world settings, and how they motivated new theoretical work, based on more realistic assumptions.

Friday, June 21, 2019

Guy Wolf (UdeM+IVADO)

Recording:

https://bluejeans.com/s/G9OO@/

Geometry-based Data Exploration

High-throughput data collection technologies are becoming increasingly common in many fields, especially in biomedical applications involving single cell data (e.g., scRNA-seq and CyTOF). These introduce a rising need for exploratory analysis to reveal and understand hidden structure in the collected (high-dimensional) Big Data. A crucial aspect in such analysis is the separation of intrinsic data geometry from data distribution, as (a) the latter is typically biased by collection artifacts and data availability, and (b) rare subpopulations and sparse transitions between meta-stable states are often of great interest in biomedical data analysis. In this talk, I will show several tools that leverage manifold learning, graph signal processing, and harmonic analysis for biomedical (in particular, genomic/proteomic) data exploration, with emphasis on visualization, data generation/augmentation, and nonlinear feature extraction. A common thread in the presented tools is the construction of a data-driven diffusion geometry that both captures intrinsic structure in data and provides a generalization of Fourier harmonics on it. These, in turn, are used to process data features along the data geometry for denoising and generative purposes. Finally, I will relate this approach to the recently-proposed geometric scattering transform that generalizes Mallat's scattering to non-Euclidean domains, and provides a mathematical framework for theoretical understanding of the emerging field of geometric deep learning.

Friday, May 31, 2019

Liam Li (CMU)

Recording:

https://bluejeans.com/s/tYeib/

Random Search and Reproducibility for Neural Architecture Search

Neural architecture search (NAS) is a promising research direction that has the potential to replace expert-designed networks with learned, task-specific architectures. In this talk, I present our recent work on grounding the empirical results in this field, building off the following observations: (i) NAS is a specialized hyperparameter optimization problem; and (ii) random search is a competitive baseline for hyperparameter optimization. Leveraging these observations, we evaluate simple baselines using random search on two standard NAS benchmarks---PTB and CIFAR-10. Our results show that random search with efficient evaluation strategies is a competitive NAS method for both benchmarks. Finally, we explore the existing reproducibility issues of published NAS results. We note the lack of source material needed to exactly reproduce these results, and further discuss the robustness of published results given the various sources of variability in NAS experimental setups.
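The "random search is a competitive baseline" observation is simple to state in code: sample configurations uniformly from the search space, evaluate each, and keep the best. The search space and scoring function below are made up for illustration; a real NAS run would train and validate a network per configuration.

```python
import random

# Random-search baseline for hyperparameter/architecture search: uniform
# sampling from the search space, greedy tracking of the best trial so far.

SEARCH_SPACE = {
    "n_layers": [1, 2, 3, 4],
    "width": [32, 64, 128, 256],
    "activation": ["relu", "tanh", "gelu"],
}

def sample_config(rng):
    return {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}

def evaluate(cfg):
    # Toy stand-in for "train the architecture, return validation accuracy".
    return cfg["n_layers"] * 0.1 + (cfg["width"] / 256) * 0.5

def random_search(n_trials, seed=0):
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = sample_config(rng)
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

print(random_search(50))
```

Because every trial is independent, this baseline is trivially parallel and exactly reproducible from the seed, which is part of what makes it a useful yardstick for the reproducibility concerns the talk raises.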

Some progress on understanding the benefits of distributional reinforcement learning - Marc G. Bellemare

In this talk I will review what we now know about distributional reinforcement learning and how its full benefits are only obtained when combined with nonlinear representations such as deep networks. I will discuss how trying to understand the good empirical performance of distributional RL has led us to all kinds of exciting results regarding representation learning for RL, and in particular a formulation of optimal representation learning based on the geometric notion of a value function polytope.

Model-based RL via Meta-Model-Free RL - Pieter Abbeel

Model-free RL has seen great asymptotic successes, but sample complexity tends to be high. Model-based RL carries the promise of better sample efficiency, and indeed has shown more data-efficient learning, but tends to fall well short of model-free RL in terms of asymptotic performance. In this presentation I will describe a new approach to model-based RL that brings in ideas from domain randomization and meta-model-free RL, resulting in the best of both worlds: fast learning and great asymptotic performance. Our method is evaluated on several MuJoCo environments (PR2 reacher, swimmer, hopper, ant, walker) and is able to learn Lego-block placement on a real robot in 10 minutes.

Friday, May 17, 2019

Dan Sodickson (NYU)

Recording:

https://bluejeans.com/s/8PDu7/

Machine Learning and Medicine: How AI will change the way we see patients, and the way we see ourselves

Just as astronomy constitutes the exploration of outer space, advances in medicine may be seen as an ever-deeper exploration of inner space. This talk will explore how that inwardly-focused exploration may be transformed in the age of machine learning. I will begin by attempting to convey a view of artificial intelligence from the vantage point of medicine: what physicians tend to feel about AI, what they tend to know, and what they generally do not know. I will briefly cite examples of productive (and less productive) emerging uses of AI in medicine. I will then focus on medical imaging in particular, and will summarize some of the goals, early outcomes, challenges, and future directions of the fastMRI collaboration between NYU School of Medicine and Facebook AI Research, in which deep learning is being used to accelerate MRI beyond previous limits. In addition to being of great value to patients, physicians, and healthcare systems, acceleration serves as a potent enabler of several emerging trends that promise to reshape the future of biomedical imaging, including a move from carefully-staged snapshots to continuous streaming, and a move from imitating the eye to emulating the brain. In this context, I will explore how AI may change not only the analysis of images and other sensor data streams, but also the design and use of future imaging devices. I will conclude with a few speculations about potential changes in the way we interact with the medical system and even how we perceive the world around us. Throughout, I will attempt to highlight areas in which data scientists can add value to the day-to-day practice of medical imaging, to the improvement of human health, and to the ongoing exploration of inner space.