Older Talks

Friday, September 7, 2018

Geoff Gordon (MSR Montreal)

Recording: https://bluejeans.com/s/Gwjyq

Slides: Not yet available

Neural Networks and Bayes Rule

Relational or structured reasoning is an important current research challenge. The classical approach to this challenge is a templated graphical model: highly expressive, with well-founded semantics, but (at least naively) difficult to scale up, and difficult to combine with the most effective supervised learning methods. More recently, researchers have designed many different deep network architectures for structured reasoning problems, with almost the flip set of advantages and disadvantages. Can we get the best of both worlds? That is, can we design deep nets that look more like graphical models, or graphical models that look more like deep nets, so that we get a framework that is both practical and "semantic"? This talk will take a look at some progress toward such a hybrid framework.


Friday, August 24, 2018

Gaël Varoquaux (INRIA)

Recording: https://bluejeans.com/s/dU@5N

Slides: Not available

Simple representations for learning: factorizations and similarities

Real-life data seldom comes in the ideal form for statistical learning. This talk will focus on high-dimensional problems for signals and discrete entities: when dealing with many, correlated signals or entities, it is useful to extract representations that capture these correlations. Matrix factorization models provide simple but powerful representations. They are used for recommender systems across discrete entities such as users and products, or to learn good dictionaries to represent images. However, they entail large computing costs on very high-dimensional data, such as databases with many products or high-resolution images. I will present an algorithm to factorize huge matrices based on stochastic subsampling that gives up to 10-fold speed-ups [1]. With discrete entities, the explosion of dimensionality may be due to variations in how a smaller number of categories are represented. Such a problem of "dirty categories" is typical of uncurated data sources. I will discuss how encoding this data based on similarities recovers a useful category structure with no preprocessing [2]. I will show how it interpolates between one-hot encoding and techniques used in character-level natural language processing.

[1] A. Mensch, J. Mairal, B. Thirion, G. Varoquaux. Stochastic subsampling for factorizing huge matrices. IEEE Transactions on Signal Processing 66 (1), 113-128.
[2] P. Cerda, G. Varoquaux, B. Kégl. Similarity encoding for learning with dirty categorical variables. Machine Learning (2018): 1-18.
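As a rough, hedged sketch of the similarity-encoding idea in [2]: each raw category string can be compared to a set of reference categories through character n-gram similarity, which reduces to one-hot encoding for exact matches and behaves more like a character-level representation otherwise. The helper names, the 3-gram Jaccard similarity, and the example strings below are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal sketch of similarity encoding for "dirty" categories: each raw
# string is encoded by its character n-gram similarity to reference
# categories, instead of a strict one-hot code. Helper names are
# illustrative, not the API of any particular library.
import numpy as np

def ngrams(s, n=3):
    """Set of character n-grams of a string (padded with spaces)."""
    s = f" {s.lower()} "
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def ngram_similarity(a, b, n=3):
    """Jaccard similarity between the n-gram sets of two strings."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    return len(ga & gb) / max(len(ga | gb), 1)

def similarity_encode(values, references, n=3):
    """Encode each raw category by its similarity to every reference category.

    With exact matches only, this reduces to one-hot encoding; with raw
    similarities it behaves like a character-level representation.
    """
    return np.array([[ngram_similarity(v, r, n) for r in references]
                     for v in values])

# Misspelled / inconsistent entries still land close to the right category.
refs = ["senior accountant", "police officer", "school teacher"]
dirty = ["Senior Acountant", "Police Oficer II", "schoolteacher"]
print(similarity_encode(dirty, refs).round(2))
```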


Friday, August 3, 2018

Petar Veličković (University of Cambridge)

Recording: https://bluejeans.com/s/Cuvv

Slides: Not available

Keeping our graphs attentive

A multitude of important real-world datasets (especially in biology) come together with some form of graph structure: social networks, citation networks, protein-protein interactions, brain connectome data, etc. Extending neural networks to properly handle this kind of data is therefore a very important direction for machine learning research, but one that has received comparatively little attention until very recently. Attentional mechanisms represent a very promising direction for extending the established convolutional operator on images to arbitrary graphs, as they satisfy many of the desirable properties of a convolutional operator. In this talk, I will focus on my work on Graph Attention Networks (GATs), where these theoretical properties have been further validated by solid results on transductive as well as inductive node classification benchmarks. I will also outline some of the earlier efforts towards deploying attention-style operators on graph structures, as well as very exciting recent work that expands on GATs and deploys them in more general circumstances (such as EAGCN, DeepInf, and applications to the Travelling Salesman Problem). Time permitting, I will also present some of the relevant graph-based work in the computational biology and medical imaging domains that I have been involved in.
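For readers unfamiliar with GATs, below is a minimal single-head sketch of the graph attention operator: attention logits computed from a shared linear map and a learnable vector, a softmax restricted to each node's neighbourhood, and a weighted sum of neighbour features. The NumPy formulation, shapes, and toy graph are illustrative simplifications; the full method uses multi-head attention and learned parameters.

```python
# Single-head graph attention layer, NumPy sketch.
import numpy as np

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def gat_layer(H, A, W, a):
    """H: (N, F) node features; A: (N, N) adjacency with self-loops;
    W: (F, Fp) shared weights; a: (2*Fp,) attention vector."""
    Z = H @ W                                        # transformed features, (N, Fp)
    Fp = Z.shape[1]
    # e[i, j] = LeakyReLU(a^T [z_i || z_j]) decomposes into two dot products
    e = leaky_relu((Z @ a[:Fp])[:, None] + (Z @ a[Fp:])[None, :])
    e = np.where(A > 0, e, -1e9)                     # keep only existing edges
    alpha_ij = np.exp(e - e.max(axis=1, keepdims=True))
    alpha_ij /= alpha_ij.sum(axis=1, keepdims=True)  # softmax over neighbours
    return alpha_ij @ Z                              # attention-weighted mix

# Toy example: 4 nodes, 3 input features, 2 output features.
rng = np.random.default_rng(0)
H = rng.normal(size=(4, 3))
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]])
print(gat_layer(H, A, rng.normal(size=(3, 2)), rng.normal(size=(4,))))
```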


Friday, June 29, 2018

Anne Churchland (Cold Spring Harbor Laboratory)

Recording: https://bluejeans.com/s/oIPlg/

Slides: Not available

Spontaneous movements dominate cortical activity during sensory-guided decision making

Animals continually produce a wide array of spontaneous and learned movements and undergo rapid internal state transitions. Most work in neuroscience ignores this “internal backdrop” and instead focuses on neural activity aligned to task-imposed variables such as sensory stimuli. We sought to understand the joint effects of internal backdrop vs. task-imposed variables. We measured neural activity using calcium imaging via a widefield macroscope during decision-making. Surprisingly, the impact of the internal backdrop dwarfed task-imposed sensory and cognitive signals. This dominance was comparable in novice and expert decision-makers and was even stronger in single neuron measurements from frontal cortex. These results highlight spontaneous and learned movements as the main determinant of large-scale cortical activity. By leveraging a wide array of animal movements, our model offers a very general method for separating the impact of internal backdrop from task-imposed neural activity.
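The abstract does not spell out the model, so the following is only a hedged sketch of one generic way such a decomposition could be set up: a linear encoding model fit on movement ("internal backdrop") and task regressors, with each block's unique contribution measured as the drop in explained variance when that block is removed. All variable names and the simulated data are made up for illustration and may not match the talk's actual analysis.

```python
# Variance partitioning between movement and task regressors, toy sketch.
import numpy as np
from numpy.linalg import lstsq

def r_squared(y, y_hat):
    ss_res = ((y - y_hat) ** 2).sum()
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot

def explained_variance(X, y):
    beta, *_ = lstsq(X, y, rcond=None)
    return r_squared(y, X @ beta)

rng = np.random.default_rng(1)
T = 2000
movement = rng.normal(size=(T, 5))     # e.g. whisking, pupil, limb motion
task = rng.normal(size=(T, 3))         # e.g. stimulus, choice, reward
# Simulated neuron whose activity is mostly movement-driven.
y = movement @ rng.normal(size=5) * 2.0 + task @ rng.normal(size=3) * 0.3 \
    + rng.normal(size=T)

full = explained_variance(np.hstack([movement, task]), y)
# "Unique" contribution of a block = drop in R^2 when that block is removed.
unique_movement = full - explained_variance(task, y)
unique_task = full - explained_variance(movement, y)
print(f"full R^2={full:.2f}, unique movement={unique_movement:.2f}, "
      f"unique task={unique_task:.2f}")
```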

Friday, May 11, 2018

Martin Gilbert (IVADO)

Recording: https://bluejeans.com/s/vMEi4

Slides: Not available

Intro to Ethics 2

In this presentation, I will introduce the three main families of normative moral theories: consequentialism, deontology, and virtue ethics.

Thursday, April 26, 2018

Masashi Sugiyama (RIKEN + University of Tokyo)

Recording: https://bluejeans.com/s/E4PWm

Slides: Not available

Machine learning from weak supervision - Towards accurate classification with low labeling costs

Recent advances in machine learning with big labeled data allow us to achieve human-level performance in various tasks such as speech recognition, image understanding, and natural language translation. On the other hand, there are still many application domains where human labor is involved in the data acquisition process, and thus the use of massive labeled data is prohibitive. In this talk, I will introduce our recent advances in classification techniques from weak supervision, including classification from positive and unlabeled data, a novel approach to semi-supervised classification, classification from positive-confidence data, and classification from complementary labels.
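As a hedged illustration of the positive-unlabeled (PU) setting mentioned above: when the class prior is known, the risk on the unseen negative class can be estimated from unlabeled data by subtracting the positive contribution, with a clip at zero so sampling noise cannot drive the estimate negative. The loss function, names, and toy scores below are assumptions for illustration; the talk's estimators may differ in detail.

```python
# Non-negative-style PU risk estimate, toy sketch.
import numpy as np

def sigmoid_loss(scores, label):
    """Smooth surrogate for the 0-1 loss; label is +1 or -1."""
    return 1.0 / (1.0 + np.exp(label * scores))

def nonnegative_pu_risk(scores_pos, scores_unl, prior):
    """scores_pos: classifier scores on positive examples,
    scores_unl: scores on unlabeled examples, prior: P(y = +1)."""
    risk_pos = sigmoid_loss(scores_pos, +1).mean()          # positives as +1
    risk_pos_as_neg = sigmoid_loss(scores_pos, -1).mean()
    risk_unl_as_neg = sigmoid_loss(scores_unl, -1).mean()
    # Negative-class risk inferred from unlabeled data minus the positive
    # part, clipped at 0 to keep the estimate non-negative.
    risk_neg = max(0.0, risk_unl_as_neg - prior * risk_pos_as_neg)
    return prior * risk_pos + risk_neg

rng = np.random.default_rng(2)
scores_pos = rng.normal(1.0, 1.0, size=500)    # scores on labeled positives
scores_unl = rng.normal(-0.3, 1.5, size=2000)  # scores on unlabeled data
print(nonnegative_pu_risk(scores_pos, scores_unl, prior=0.4))
```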

Friday, April 13, 2018

Guillaume Lajoie (DMS-UdeM)

Recording: https://bluejeans.com/s/b7lnV

Slides: Not available

Dynamics of high-dimensional recurrent networks: how chaos shapes computation in biological and artificial neural networks

Networks of neurons, either biological or artificial, are called recurrent if their connections are distributed and contain feedback loops. Such networks can perform remarkably complex computations, as evidenced by their ubiquity throughout the brain and ever-increasing use in machine learning. They are, however, notoriously hard to control and their dynamics are generally poorly understood, especially in the presence of external forcing. This is because recurrent networks are typically chaotic systems, meaning they have rich and sensitive dynamics leading to variable responses to inputs. How does this chaos manifest in the neural code of the brain? How might we tame sensitivity to exploit complexity when training artificial recurrent networks for machine learning?

Understanding how the dynamics of large driven networks shape their capacity to encode and process information presents a sizeable challenge. In this talk, I will discuss the use of Random Dynamical Systems Theory as a framework to study information processing in high-dimensional, signal-driven networks. I will present an overview of recent results linking chaotic attractors to entropy production, dimensionality and input discrimination of dynamical observables. I will outline insights this theory provides on how cortex performs complex computations using sparsely connected inhibitory and excitatory neurons, as well as implications for gradient-based optimization methods for artificial networks.