Towards AGI:
Scaling, Alignment & Emergent Behaviors in Neural Nets
IFT 6760A Winter 2023, Université de Montréal / Mila - Quebec AI Institute
see the links
Class 1: Mon, Jan 9, 2023 4:30pm-6:30pm EST
Part 1: Overlaps with the 9am-5pm workshop at Mila (Agora) on Reality+Virtual Worlds and the Problems of Philosophy (recordings of the workshop are here)
____________________________________________________________________________________________________________
Part 2. After the workshop, the class continues from 5:30 to 6:30pm (Auditorium 2 at Mila; streaming link).
Lecturer: Irina Rish
Topic: Brief intro & overview of the class (video)
Class 2 (Thu, Jan 12, 2023)
Class 3 (Thu, Jan 19, 2023)
Lecturer: Irina Rish
Topic: Neural Scaling Laws: History and Overview (slides, video)
Papers: The Bitter Lesson, GPT-3 paper: Language Models are Few-Shot Learners, Scaling Laws for Neural Language Models, Training Compute-Optimal Large Language Models (summary/blog: New Scaling Laws for LLMs)
Class 4 (Mon, Jan 23, 2023)
Part 1: Lecturer: Irina Rish (video - part 1)
Overview of Neural Scaling Laws workshops
Reading group: Scaling Laws & Emergent Behaviors
Brief discussion on AI Alignment: AI and the paperclip problem, AI alignment research links, Unsolved Problems in ML Safety, Concrete Problems in AI Safety
Brief discussion on objective ethics: Derek Parfit, On What Matters (vol 1, vol 2, vol 3). See also Why Anything? Why This? Reasons and Persons
____________________________________________________________________________________________________________________
Part 2: Lecturer: Irina Rish
Topic: Introduction to Continual Learning (slides, video - part 2)
Papers: Continual T0, Effect of scale on catastrophic forgetting in neural networks (summary: Effects of Model and Prior Learning Scale on Catastrophic Forgetting), Foundational Models for Continual Learning: An Empirical Study of Latent Replay
Class 5 (Thu, Jan 26, 2023)
Lecturer: Irina Rish
Topic: Phase Transitions in AI (and Emergent Behaviors in Large-Scale models) (slides, video)
The Law of Robustness by Sebastien Bubeck
Papers: Hard and Easy Distributions of SAT Problems, Every Monotone Graph Property Has a Sharp Threshold, Approximability of probability distributions, Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets, GPT-3 paper: Language Models are Few-Shot Learners, A universal law of robustness via isoperimetry.
Class 6 (Mon, Jan 30, 2023)
Part 1: Lecturer: Irina Rish
Topic: ChatGPT, Claude, and open-source projects (video - part 1)
Papers: EpochAI Scaling Laws Literature review and A database of papers on scaling laws
____________________________________________________________________________________________________________________
Part 2: Lecturer: Irina Rish
Topic: Multimodal Models (slides, video - part 2)
Papers: OpenAI blog on CLIP, OpenAI blog on DALL-E, MAGMA – Multimodal Augmentation of Generative Models through Adapter-based Finetuning, Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks, Perceiver: General Perception with Iterative Attention (video: youtube summary)
Class 7 (Thu, Feb 2, 2023)
Part 1: Lecturers: Kyle Roth, Alex Fulleringer
Topic: Unsolved Problems in ML Safety (slides, video - part 1)
____________________________________________________________________________________________________________________
Part 2: Lecturers: Mohammad Samsami, Sonia Joseph
Topic: A Generalist Agent (Gato) (slides, video - part 2)
Class 8 (Mon, Feb 6, 2023)
Part 1: Lecturer: Georges Bélanger
Topic: Training language models to follow instructions with human feedback (slides, video)
______________________________________________________________________________________________________
Part 2: Open ChatGPT and Open ChatMAGMA efforts and Call for Participation
Open ChatGPT: join OpenAssistant discord (aka Open ChatGPT) https://open-assistant.io/
Yannic Kilcher: OpenAssistant - ChatGPT's Open Alternative (We need your help!)
Christoph Schuhmann (LAION) Collecting Data for GPT-NeoX Finetuning
Open ChatMAGMA:
Related events this week:
Scaling & Emergent Phenomena Reading Group (Tuesday Feb 7, 11:00am EST)
Tutorial and Q&A on Phase Transitions by Guillaume Dumas (video)
Extended version of the talk given at the 2nd Workshop on Neural Scaling Laws (video - part 2)
____________________________________________________________________________________________________________
Language Grounding Reading Group (Thursday, Feb 9th, 12pm EST)
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models (presented by Oscar Mañas).
Where (in-person): F01, 2nd floor, 6666, Mila
Where (online): BlueJeans https://bluejeans.com/578480945/7427
Class 9 (Thu, Feb 9, 2023)
Part 1: Lecturers: Hamza Abdelhedi, Jordan O'Byrne
Topic: Optimal input representation in neural systems at the edge of chaos (slides, video - part 1)
Background: Reservoir computing, Echo state network, Next generation reservoir computing, Echo State Neural Machine Translation
Fri Feb 10: conversation on emergence, intelligence and transhumanism with Prof. Michael Levin and Machine Learning Street Talk
____________________________________________________________________________________________________________
Part 2: Lecturers: Maryam Valipour, Farzad Salajegheh
Topic: data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language (slides, video - part 2)
Class 10 (Mon, Feb 13, 2023)
Part 1: Lecturers: Chuanrui Wang, Huiyu Cai
Topic: Double Descent in Neural Net Training (slides, video - part 1)
Papers: Deep Double Descent: Where Bigger Models and More Data Hurt
Reconciling modern machine learning practice and the bias-variance trade-off
Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime
Rethinking Bias-Variance Trade-off for Generalization of Neural Networks
______________________________________________________________________________________________________
Part 2: Lecturers: Praneet, Alekhya
Topic: Learning to continually learn (ANML) (slides, video - part 2)
Class 11 (Thu, Feb 16, 2023)
Part 1: Lecturers: Vitaly Kondulukov, Kaushik Moudgalya
Topic: Perceiver: General Perception with Iterative Attention (slides, video - part 1)
____________________________________________________________________________________________________________
Part 2: Lecturers: Alireza Razaghi, Léo Gagnon
Topic: MusicLM: Generating Music From Text (slides, video - part 2)
Class 12 (Mon, Feb 20, 2023)
Part 1: Lecturers: Diganta Misra, Sparsha Mishra
Topic: Scaling Language-Image Pre-training via Masking (slides, video - part 1)
______________________________________________________________________________________________________
Part 2: Lecturer: Alexandre Larouche
Topic: Scaling laws for single-agent reinforcement learning (slides, video - part 2)
Class 13 (Thu, Feb 23, 2023)
Part 1: Lecturer: Artem Zholus
Topic: Emerging Properties in Self-Supervised Vision Transformers (slides, video - part 1)
____________________________________________________________________________________________________________
Part 2: Lecturers: Charles-Étienne Joseph, Pulkit
Topic: Chain of Thought Prompting Elicits Reasoning in Large Language Models (slides, video - part 2)
Class 14 (Mon, March 6, 2023)
Part 1: Lecturers: Zuobai Zhang, Jiarui Lu
Topic: Improving language models by retrieving from trillions of tokens (slides, video - part 1)
______________________________________________________________________________________________________
Part 2: Lecturer: Ethan Caballero
Topic: Broken Neural Scaling Laws (slides, video - part 2)
Class 15 (11am-noon, Tue, March 7, 2023)
Lecturers: Kyle Roth, Alex Fulleringer
Topic: Artificial Intelligence, Values, and Alignment (slides)
Class 16 (Thu, March 9, 2023)
Part 1: Lecturers: Maryam Valipour, Farzad Salajegheh
Topic: A ConvNet for the 2020s (slides, video - part 1)
____________________________________________________________________________________________________________
Part 2: Lecturers: Rohan Banerjee, Selim Gilon
Topic: Masked Autoencoders Are Scalable Vision Learners (slides, video - part 2)
Class 17 (Mon, March 13, 2023)
Part 1: Lecturers: Sandeep Kumar, Vamsikrishna Chemudupati
Topic: wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations (slides, video)
Class 18 (11am-noon, Tue, March 14, 2023)
Lecturer: Niki Howe
Topic: Adversarial Policies Beat Superhuman Go AIs (slides, video)
Class 19 (Thu, March 16, 2023)
Part 1: Lecturers: Alireza Razaghi, Nizar Islah
Topic: Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts (slides, video - part 1)
____________________________________________________________________________________________________________
Part 2: Lecturers: Mohammad Samsami, Sonia Joseph
Topic: Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small (slides, video - part 2)
Class 20 (Mon, March 20, 2023)
Part 1: Lecturers: Niki Howe, Ethan Caballero
Topic: Scaling Laws for Reward Model Overoptimization (slides, video - part 1)
______________________________________________________________________________________________________
Part 2: Lecturers: Nizar Islah, Diganta Misra
Topic: Patching open-vocabulary models by interpolating weights (slides, video - part 2)
Class 21 (Thu, March 23, 2023)
Part 1: Lecturers: Rohan Banerjee, Selim Gilon
Topic: Effect of Pre-Training Scale on Intra- and Inter-Domain Full and Few-Shot Transfer Learning for Natural and Medical X-Ray Chest Images (slides, video - part 1)
____________________________________________________________________________________________________________
Part 2: Lecturers: Xinyi HE, Johan S
Topic: Constitutional AI: Harmlessness from AI Feedback (slides, video - part 2)
Class 22 (Mon, March 27, 2023)
Part 1: Lecturers: Gwen Legate, Albert Orozco Camacho
Topic: Wide Neural Networks Forget Less Catastrophically (slides, video - part 1)
______________________________________________________________________________________________________
Part 2: Lecturers: Zuobai Zhang, Jiarui Lu
Topic: Retrieval-Augmented Multimodal Language Modeling (slides, video - part 2)
Class 23 (10am-noon, Tue, March 28, 2023)
Part 1: 10am - 11am Lecturers: Christoph Schuhmann (LAION), Irina Rish (LAION)
Topic: Informal overview of some recent LAION projects (video, part 1)
____________________________________________________________________________________________________________
Part 2: 11am-noon Lecturers: Sarthak Mittal, Sangnie Bhardwaj
Topic: Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture (slides, video - part 2)
This class is combined with the Scaling & Emergent Phenomena Reading Group
Class 24 (Thu, March 30, 2023)
Part 1: Lecturers: Chuanrui Wang, Huiyu Cai
Topic: Evolutionary-scale prediction of atomic-level protein structure with a language model (slides, video - part 1)
____________________________________________________________________________________________________________
Part 2: Lecturer: Sparsha Mishra
Topic: Artificial muses: Generative Artificial Intelligence Chatbots Have Risen to Human-Level Creativity (slides, video - part 2)
Class 25 (Mon, April 3, 2023)
Part 1: Lecturers: Gwen Legate, Albert Orozco Camacho
Topic: muNet: Evolving Pretrained Deep Neural Networks into Scalable Auto-tuning Multitask Systems (slides, video - part 1)
______________________________________________________________________________________________________
Part 2: Lecturers: Arkil Patel, Tejas Vaidhya
Topic: What Can Transformers Learn In-Context? A Case Study of Simple Function Classes (slides, video - part 2)
Class 26 (11am-noon, Tue, April 4, 2023)
Lecturers: Amin Bonyad and Maxime Gevers
Topic: UniT: Multimodal Multitask Learning with a Unified Transformer (slides, video)
Class 27 (Thu, April 6, 2023)
Part 1: Lecturer: Léo Gagnon
Topic: Data Distributional Properties Drive Emergent In-Context Learning in Transformers (slides, video - part 1)
_______________________________________________________________________________________________________
Part 2: Lecturer: Artem Zholus
Topic: Visual Prompt Tuning (slides, video - part 2)
Class 28 (11am-noon, Tue, April 11, 2023)
Lecturers: Vamsikrishna Chemudupati and Sandeep Kumar
Topic: Robust Speech Recognition via Large-Scale Weak Supervision (slides, video)
This class is combined with the Scaling & Emergent Phenomena Reading Group
Class 29 (Thu, April 13, 2023)
Part 1: Lecturers: Johan S and Xinyi HE
Topic: Collaborating with language models for embodied reasoning (slides, video - part 1)
_______________________________________________________________________________________________________
Part 2: Lecturers: Arkil Patel and Tejas Vaidhya
Topic: LoRA: Low-Rank Adaptation of Large Language Models (slides, video - part 2)
Class 30 (Mon, April 17, 2023)
Part 1: Lecturers: Pulkit and Charles-Étienne Joseph
Topic: Self-Refine: Iterative Refinement with Self-Feedback (slides, video - part 1)
______________________________________________________________________________________________________
Part 2: Lecturers: Vitaly and Kaushik
Topic: A Watermark for Large Language Models (slides, video - part 2)
Class 31 (Thu, April 20, 2023)
Part 1: Lecturer: Pascal Tikeng
Topic: Grokking and Epoch-wise double descent ([1], [2], [3]) (slides, video - part 1)
_______________________________________________________________________________________________________
Part 2: Lecturer: Ali Touil
Topic: Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents (slides, video - part 2)
Class 32 (Mon, April 24, 2023)
Part 1: Lecturers: Sangnie Bhardwaj and Sarthak Mittal
Topic: BayesFlow: Learning complex stochastic models with invertible neural networks (slides, video)
Class 33 (Thu, April 27, 2023)
Part 1: Lecturer: Amin Bonyad Khalaj
Topic: Real or Fake Text?: Investigating Human Ability to Detect Boundaries Between Human-Written and Machine-Generated Text (slides, video)
_______________________________________________________________________________________________________
Part 2: Lecturers: Hamza Abdelhedi and Jordan O'Byrne
Topic: Hopfield Networks is All You Need (slides, video - part 2)
Final Project Presentations (Mon, May 01, 2023)
Part 1: Lecturer: Léo Gagnon
Title : In-context learning of causal adaptation strategies
Slides: here
Video : here
_______________________________________________________________________________________________________
Part 2: Lecturers: Farzad Salajegheh, Maryam Valipour, Alireza Razzaghi
Title : Deformable gate in CNNs
Slides: here
Video : here
_____________________________________________________________________________________________________
Part 3: Lecturer: Niki Howe
Title : Scaling scaling laws with board games
Slides: here
Video : here
_______________________________________________________________________________________________________
Part 4: Lecturers: Huiyu Cai, Jiarui Lu, Zuobai Zhang
Title : Exploring scaling behavior for protein language models
Slides: here
Video : here
Final Project Presentations (Tue, May 02, 2023)
Part 1: Lecturer: Amin Bonyad Khalaj
Title : Toxic Comment Classification with Deep Learning
Slides: here
Video : here
_______________________________________________________________________________________________________
Part 2: Lecturers: Nizar Islah, Diganta Misra
Title : Continual/Lifelong Sparse Mixture-of-Experts
Slides: here
Video : here
Final Project Presentations (Thu, May 04, 2023)
Part 1: Lecturer: Johan Samir Obando Ceron
Title : Unlocking the Neural Capacity for Sample Efficient Reinforcement Learning
Slides: here
Video : here
_______________________________________________________________________________________________________
Part 2: Lecturers: Kyle Roth, Georges Bélanger, Alex Fulleringer
Title : "Fine-Tuning" a frozen LLM by teaching it to record information to external tools
Slides: here
Video : here
_______________________________________________________________________________________________________
Part 3: Lecturers: Alexandre Larouche, Ali Touil
Title : Scaling laws for Cooperation in Multi-Agent RL
Slides: here
Video : here
_______________________________________________________________________________________________________
Part 4: Lecturers: Vamsikrishna Chemudupati, Sandeep Kumar
Title : Investigating Slimmable networks for transformer-based models in ASR
Slides: here
Video : here
_______________________________________________________________________________________________________
Part 5: Lecturers: Gwen Legate, Charles-Étienne Joseph, Albert Orozco Camacho
Title : Scaling Federated Learning
Slides: here
Video : here
_______________________________________________________________________________________________________
Part 6: Lecturers: Hamza Abdelhedi, Jordan O'Byrne
Title : Brain similarity metrics
Slides: here
Video : here
Final Project Presentations (May 08, 2023)
Part 1: Lecturers: Sarthak Mittal and Sangnie Bhardwaj
Title : ControlNet
Slides: here
Video : here
_______________________________________________________________________________________________________
Part 2: Lecturers: Pulkit Madan, Kaushik Moudgalya and Vitaly Kondulukov
Title : Cluster Prompt (Formerly SCIPS)
Slides: here
Video : here
_______________________________________________________________________________________________________
Part 3: Lecturers: Rohan Banerjee and Selim Gilon
Title : Towards foundational models for medical image segmentation
Slides: here
Video : here
_______________________________________________________________________________________________________
Part 4: Lecturer: Xinyi HE
Title : Continual Learning with Hypernetworks
Slides: here
Video : here
_______________________________________________________________________________________________________
Part 5: Lecturers: Sonia Joseph, Mohammad Reza Samsami and Artem Zholus
Title : Reverse-Engineering OpenAI’s VPT
Slides: here
Video : here
_______________________________________________________________________________________________________
Part 6: Lecturers: Sparsha Mishra and Diganta Misra
Title : Language guided minimal finetuning of large vision encoders
Slides: here
Video : here
Final Project Presentations (May 09, 2023)
Part 1: Lecturers: Alekhya Dronavalli, Praneet Suresh
Title : Impact of Scale in Multi-task Variational Information Bottleneck
Slides: here
Video : here
_______________________________________________________________________________________________________
Part 2: Lecturers: Tejas Vaidhya, Arkil Patel
Title : Pruned LLMs
Slides: here
Video : here
Class 34 (May 09, 2023)
Part 1: Lecturers: Alekhya Dronavalli, Praneet Suresh
Title : Contrastive Syn-to-Real Generalization
Slides: here
Video : here