SUMMER / ÉTÉ 2020

Friday, Aug 28, 2020

Mengye Ren (Vector Institute)

Recording: here

Towards continual and compositional few-shot learning

Few-shot learning has recently emerged as a popular area of research towards building more flexible machine learning programs that can adapt at test time. However, it now faces two major criticisms. First, the “k-shot n-way” episodic structure is still far from modelling the incremental knowledge acquisition of an agent in a natural environment. Second, there has been limited progress towards modelling compositional understanding of novel objects, whereas features obtained from regular classification tasks already perform very well. In this talk I will introduce two recent advances, one addressing each of these challenges.
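For readers unfamiliar with the episodic setup the abstract criticises, the sketch below shows how a “k-shot n-way” episode is typically sampled from a labelled dataset: k labelled support examples for each of n classes, plus held-out query examples the model must classify. The dataset layout and function names are illustrative assumptions, not code from the talk.

```python
import random
from collections import defaultdict

def sample_episode(dataset, n_way=5, k_shot=1, n_query=5):
    """Sample one k-shot n-way episode from a labelled dataset.

    dataset: iterable of (example, label) pairs.
    Returns (support, query) lists of (example, episode_label) pairs.
    """
    by_class = defaultdict(list)
    for x, y in dataset:
        by_class[y].append(x)

    classes = random.sample(list(by_class), n_way)   # pick n novel classes
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        examples = random.sample(by_class[cls], k_shot + n_query)
        support += [(x, episode_label) for x in examples[:k_shot]]
        query += [(x, episode_label) for x in examples[k_shot:]]
    return support, query

# Toy usage: 20 classes with 10 examples each.
toy = [(f"img_{c}_{i}", c) for c in range(20) for i in range(10)]
support, query = sample_episode(toy, n_way=5, k_shot=1, n_query=5)
```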

Friday, Aug 21, 2020

Jason Hartford (UBC)

Recording: here

Valid Causal Inference with (Some) Invalid Instrumental Variables

Instrumental variable (IV) methods provide a powerful approach to estimating causal effects: they are robust to unobserved confounders and they can be combined with deep networks for flexible nonlinear causal effect estimation. But a key challenge when applying them is the reliance on untestable "exclusion" assumptions. In this talk, I will discuss recent work where we showed how to perform consistent IV estimation despite some violations of these assumptions. In particular, we show that when one has multiple candidate instruments, only a majority of these candidates---or, more generally, the modal candidate-response relationship---needs to be valid to estimate the causal effect. Our approach, ModeIV, uses an estimate of the modal prediction from an ensemble of instrumental variable estimators. The technique is simple to apply and is "black-box" in the sense that it may be used with any instrumental variable estimator as long as the treatment effect is identified for each valid instrument independently.

The talk assumes no background in causal inference: I will give an introduction to causal inference with instrumental variables and briefly discuss DeepIV, our approach for using deep networks for causal effect estimation when one has access to instrumental variables, before presenting ModeIV in detail.
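The ensemble-and-mode idea described above can be sketched in a few lines. This is a minimal illustration assuming one already-fitted IV estimator per candidate instrument, each exposed as a callable; the mode estimator below (midpoint of the narrowest interval containing half the predictions) is a stand-in, and the actual ModeIV implementation may differ.

```python
import numpy as np

def modal_value(values, window_frac=0.5):
    """Crude mode estimate for a 1-D sample: midpoint of the narrowest
    interval containing `window_frac` of the points (a 'shorth'-style
    estimate; the estimator used in the paper may differ)."""
    v = np.sort(np.asarray(values, dtype=float))
    m = max(2, int(np.ceil(window_frac * len(v))))
    widths = v[m - 1:] - v[: len(v) - m + 1]
    i = int(np.argmin(widths))
    return 0.5 * (v[i] + v[i + m - 1])

def mode_iv_predict(iv_estimators, x):
    """Ensemble prediction: one fitted IV estimator per candidate
    instrument; return the modal prediction across the ensemble."""
    preds = [est(x) for est in iv_estimators]
    return modal_value(preds)
```

If a majority of the candidate instruments are valid, their estimators agree on the causal effect and the modal prediction recovers it, while estimators built on invalid instruments fall outside the densest cluster and are effectively ignored.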

Friday, Aug 14, 2020

Ida Momennejad (MSR)

Recording: here

Multi-scale predictive representations in prefrontal and hippocampal hierarchies

Memory and planning rely on learning the structure of relationships among experiences. A century after ‘latent learning’ experiments summarized by Tolman, the larger puzzle of cognitive maps remains elusive: how does the brain learn compact representations of these structures to guide flexible behavior? I use reinforcement learning (RL) to study how humans learn and generalize compact representations of structures in memory and planning. I show behavioral, fMRI, and electrophysiology evidence for structure learning and transfer with multi-scale successor representations updated via offline replay. Recently we’ve shown evidence for multi-scale predictive representations in prefrontal and hippocampal hierarchies. I will briefly mention the significance of this approach for computational psychiatry, and for future studies on the entanglement of learning and memory with exploration and planning.
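As background for the successor-representation (SR) machinery mentioned above, here is a minimal tabular sketch, not from the talk, of a TD(0) update of SR matrices at several discount factors, one simple way to realise the "multi-scale" idea; offline replay would amount to re-running such updates on stored transitions.

```python
import numpy as np

def td_update_sr(M, s, s_next, gamma, alpha=0.1):
    """One TD(0) update of a successor representation matrix M,
    where M[s, s'] estimates the expected discounted future occupancy
    of s' when starting from s."""
    target = np.eye(M.shape[0])[s] + gamma * M[s_next]
    M[s] += alpha * (target - M[s])
    return M

# Multi-scale SRs: one matrix per predictive horizon (discount factor).
n_states = 6
gammas = [0.3, 0.7, 0.95]                                # short to long horizons
Ms = {g: np.zeros((n_states, n_states)) for g in gammas}

transitions = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]   # toy trajectory
for s, s_next in transitions:                            # online pass; replay would revisit these
    for g in gammas:
        td_update_sr(Ms[g], s, s_next, g)
```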

Friday, July 24, 2020

Brady Neal (Mila)

Recording: here

A Brief Introduction to Causal Inference

What motivates causal inference? If correlation doesn’t imply causation, then what does? Why are randomized experiments so key? If you can’t do a randomized experiment, then how can you do causal inference? In this talk, you can expect to learn the links between all of these questions, where you can learn more, and how you can get involved in research in causal inference.

Friday, July 10, 2020

Vikram Voleti (Mila)

Recording: here

A Brief Tutorial on Neural Ordinary Differential Equations

Neural ODEs are increasingly used to model data by combining differential equations with gradient-based training. In this talk, we will cover the fundamentals of Neural ODEs and the core concepts that make them a useful tool for machine learning. We shall then look at the three primary applications proposed in the original Neural ODE paper. Finally, we shall look at some of the more recent research in this area.
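As a warm-up for the tutorial, the sketch below parameterises the derivative of a hidden state with a tiny network and integrates it forward with fixed-step Euler updates. This is only the forward computation under simplifying assumptions; the original Neural ODE paper uses adaptive solvers and the adjoint method to compute gradients.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 2)), np.zeros(16)   # small MLP defining the dynamics
W2, b2 = rng.normal(size=(2, 16)), np.zeros(2)

def f(z, t):
    """Learned dynamics dz/dt = f_theta(z, t)."""
    h = np.tanh(W1 @ z + b1)
    return W2 @ h + b2

def odeint_euler(f, z0, t0=0.0, t1=1.0, steps=100):
    """Fixed-step Euler integration of dz/dt = f(z, t) from t0 to t1,
    standing in for the adaptive solvers used in practice."""
    z, t = np.array(z0, dtype=float), t0
    dt = (t1 - t0) / steps
    for _ in range(steps):
        z = z + dt * f(z, t)
        t += dt
    return z

z1 = odeint_euler(f, z0=[1.0, -0.5])   # hidden state at "depth" t = 1
```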

Friday, July 3rd, 2020

Petar Veličković (DeepMind)

Recording: here

Algorithmic Inductive Biases

Inductive biases, broadly speaking, encourage learning algorithms to prioritise solutions with certain properties. Here we focus specifically on methods that incorporate structural assumptions directly into the architecture or algorithm. This can be seen as a “meet‐in‐the‐middle” approach, combining aspects of classical symbolic artificial intelligence with modern deep architectures for representation learning.

While it is now well-known that directly encoding these structural inductive biases yields models that are more data-efficient or generalisable, the question of which structural inductive biases are relevant for which applications and setups remains open. Rigorously specifying when an inductive bias is appropriate is still a 'silent art', often reducing to ad-hoc tricks that then propagate across several research lines.

In order to provide more palpable intuition, in this talk I will propose the lens of algorithmic reasoning tasks as a possible way to quantify the relevance of an inductive bias. The algorithmic execution setup implies that we know _exactly_ how our data and outputs are produced, with no noise involved. Hence, it is possible to make a clear argument for why a particular bias might be appropriate, and the setup offers unique opportunities for credit-assignment studies.

Surveying increasingly specialised reasoning procedures (from simple reachability queries, through dynamic programming, all the way to dynamic path reasoning in trees), I will highlight the tangible performance gains that we have been able to recover by applying appropriate inductive biases, accompanied by clear arguments for why they were applied.
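To illustrate why algorithmic tasks give complete control over how the data and outputs are produced, here is a sketch of generating ground-truth supervision for the simplest task mentioned, reachability, on random graphs. The graph distribution and helper names are assumptions for illustration, and the benchmarks used in the talk may differ.

```python
import random
from collections import deque

def random_graph(n, p=0.15, seed=0):
    """Erdos-Renyi-style undirected graph as an adjacency list."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def reachability_labels(adj, source=0):
    """BFS from `source`; label each node 1 if reachable, else 0.
    Because the generating algorithm is known exactly, every
    intermediate step can also be used as supervision."""
    reached = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u] - reached:
            reached.add(v)
            queue.append(v)
    return [int(v in reached) for v in adj]

adj = random_graph(20)
labels = reachability_labels(adj)   # targets for a graph network to imitate
```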

Friday, June 19, 2020

Liam Paull (Mila)

Recording: here

Challenges for efficiently deploying robots in unstructured environments

Historically, robotics has over-promised and under-delivered. We are seeing this cycle repeat again with the autonomous car. In this talk, I will give an overview of the components typically built for deploying mobile robot systems. I will try to motivate why it is so challenging to reliably deploy robots in open-world settings.

Subsequently, I will introduce our (relatively) newly formed robotics lab at UdeM/Mila, whose main focus is on addressing these issues. I will discuss recent work along two major themes that are particularly relevant to machine learning: learning perceptual representations and learning in simulation. The ability to backpropagate error signals through computational graphs makes it possible to learn parameters that would otherwise have to be hand-tuned. Applying this to robotics perception tasks is challenging since many of the components have long-term dependencies and are inherently non-differentiable. I will describe some recent work that aims to address this problem for representation learning. Simulators can also make learning on robots more efficient, but this approach incurs problems associated with transfer learning and distributional shift. I will describe several recent approaches to formalize and address this issue.

Finally, I will also briefly describe our AI Driving Olympics project in connection to the problem of robotics benchmarking and "sim2real" transfer.

Friday, June 12, 2020

Timothy O'Donnell (McGill)

Recording: here

Compositionality in language

One of the most celebrated aspects of natural language is its capacity for generalization: language allows us to express and comprehend vast numbers of novel thoughts and ideas. This capacity for generalization is made possible because the linguistic system is compositional. The meaning of utterances is built from the meaning of parts such as words and morphemes. In this talk, I will give an overview of the traditional model of compositionality from linguistics and its relation to AI, discussing how compositionality may be better viewed as an inductive bias than as an architectural property of a system. I will then discuss several recent studies that address various aspects of the problem of building compositional models of language in AI.


Friday, May 22, 2020

Ioannis Mitliagkas (Mila)

Recording: here

Adversarial formulations, robust learning and generalization: some recent and ongoing work

Modern machine learning can involve big, over-parametrized models whose capacity goes beyond what was considered necessary in traditional ML analysis. Some of those models, like generative adversarial networks (GANs), introduce a different paradigm by using multiple, competing objective functions: they are described as games. Compared to single-objective optimization, game dynamics are more complex and less well understood. Similar dynamics also appear in formulations of robust learning and domain generalization. This intersection of overparametrization, robust learning, and adversarial dynamics presents some exciting new questions on numerics and statistical learning. In this talk, I will give an overview of recent work performed in this area by my group and collaborators, outline some ongoing projects and summarize interesting questions worth exploring.
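To make the contrast with single-objective optimization concrete, the toy example below (not from the talk) runs simultaneous gradient descent-ascent on the bilinear game min_x max_y x*y. The unique equilibrium is the origin, yet the iterates spiral away from it, a classic illustration of how game dynamics can fail where ordinary minimization would not.

```python
import numpy as np

def simultaneous_gda(x0=1.0, y0=1.0, lr=0.1, steps=50):
    """Simultaneous gradient descent-ascent on f(x, y) = x * y:
    x descends its gradient while y ascends, both using the current
    iterate. The distance to the equilibrium (0, 0) grows each step."""
    x, y = x0, y0
    traj = [(x, y)]
    for _ in range(steps):
        gx, gy = y, x                     # df/dx = y, df/dy = x
        x, y = x - lr * gx, y + lr * gy   # descend in x, ascend in y
        traj.append((x, y))
    return np.array(traj)

traj = simultaneous_gda()
print(np.linalg.norm(traj[0]), np.linalg.norm(traj[-1]))  # distance to equilibrium grows
```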


Friday, May 15, 2020

Jian Tang (Mila)

Recording: here

Graph Representation Learning: Algorithms and Applications

Graphs, a general data structure for capturing interconnected objects, are ubiquitous in a variety of disciplines and domains, ranging from computational social science, recommender systems and bioinformatics to chemistry. Recently, there has been growing interest in the machine learning community in developing deep learning architectures for graph-structured data. In this talk, I will give a high-level overview of the research in my group on graph representation learning, including: (1) unsupervised graph representation learning and visualization (WWW’15, WWW’16, WWW’19, ICLR’19); (2) towards combining traditional statistical relational learning and graph neural networks (ICML’19, NeurIPS’19); (3) graph representation learning for drug discovery (ICLR’20).
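For readers new to the area, the sketch below shows a single mean-aggregation message-passing layer, the basic building block of the graph neural networks referred to above. It is generic background written in NumPy, not code from any of the cited papers.

```python
import numpy as np

def message_passing_layer(H, A, W_self, W_neigh):
    """One graph neural network layer with mean aggregation.

    H       : (n_nodes, d_in) node features
    A       : (n_nodes, n_nodes) adjacency matrix (0/1, no self-loops)
    W_self  : (d_in, d_out) weights for each node's own features
    W_neigh : (d_in, d_out) weights for the aggregated neighbours
    """
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1)          # avoid divide-by-zero
    neigh_mean = (A @ H) / deg                                  # mean over neighbours
    return np.maximum(0.0, H @ W_self + neigh_mean @ W_neigh)   # ReLU

# Toy usage: a 4-node path graph with random features.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = rng.normal(size=(4, 8))
H1 = message_passing_layer(H, A, rng.normal(size=(8, 16)), rng.normal(size=(8, 16)))
```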


Friday, May 8, 2020

Yoshua Bengio (Mila)

Recording: here

Empowering Citizens against Covid-19 with an ML-Based and Decentralized Risk Awareness App

What is contact tracing? How can it help to substantially bring down the reproduction number (the number of newly infected individuals per infected person)? How can automated contact tracing complement existing manual contact tracing? How can ML-based risk estimation generalize contact tracing, moving away from a binary decision about a contact to graded predictions that capture all the clues about being contagious, and how could it enable fitting powerful epidemiological models to the data collected on phones?

What are the privacy, human rights, dignity and democracy concerns around digital tracing? How can we deploy decentralized apps with the strongest possible privacy guarantees, saving lives by greatly reducing the reproduction number of the virus while making sure that neither governments nor other users can access a user's infection status or personal data? How do we create trust in both directions and empower citizens with the information needed to act responsibly to protect their community, instead of relying on the authority of the government and the threat of social punishment?

How does the machine learning problem become harder when much of the information is inaccessible or blurred in order to achieve differential privacy and to avoid a central repository tracking people's detailed movements and whom they met when? What machine learning techniques appear most promising to jointly train an inference machine that predicts contagiousness in the past and the present, and at the same time train a highly structured epidemiological model, a generative engine for running what-if policy scenarios, to help public health take the difficult decisions ahead using scientific evidence as well as the data collected in a privacy-first way? How do we set up a form of non-profit data trust that protects citizens and avoids conflicts of interest, keeping the collected data at arm's length from governments yet providing them with the information they need for policy decisions and for managing the public health challenges? Many questions, and hopefully some early answers.