Paper presentations and projects: schedule & sign up sheet
Deadlines:
Oct 6, 2025 - Project proposal submission due
Dec 5, 2025, at 11:59 PM - Final project submission due
Poster session: Monday, November 24, 2025, 2 pm - 5 pm
Lecturer: Irina Rish
Topic: Intro and Overview: A brief history of AI at Scale (slides, video)
Papers: The Bitter Lesson, GPT-3 paper: Language Models are Few-Shot Learners
Topic: Intro and Overview: Continual Learning at Scale (slides, video)
Lecturer: Irina Rish
Topic: Overview of Papers to Present and Some Project Topics (video - part 1, video - part 2)
Class materials: some of the previous Topics & Papers (focus on: Continual Learning at Scale; Alignment and Safety; Emergence, Phase Transitions and Stat Physics of ML), some previous large-scale projects, and Towards Time-Series Foundation Models
Topic: Scaling Laws for Neural Language Models (slides, video)
Also covered: Training Compute-Optimal Large Language Models (Chinchilla explained: video), Emergent Abilities of Large Language Models, and Are Emergent Abilities of Large Language Models a Mirage? Additional materials: Neural Scaling Laws and GPT-3 (video); a nice overview of the history of scaling laws: Scaling Laws for LLMs: from GPT-3 to o3
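As a quick reference for the scaling-law readings above, the Chinchilla paper (Training Compute-Optimal Large Language Models) fits a parametric loss in the number of parameters $N$ and training tokens $D$; the sketch below states the functional form only, without the paper's fitted constants:

```latex
% Chinchilla-style parametric loss:
%   E      - irreducible loss of the data distribution
%   A, B   - fitted coefficients; \alpha, \beta - fitted exponents
\[
  L(N, D) \;=\; E \;+\; \frac{A}{N^{\alpha}} \;+\; \frac{B}{D^{\beta}}
\]
% Minimizing L subject to a fixed compute budget C \approx 6ND gives
% power-law optima in C:
\[
  N_{\mathrm{opt}} \propto C^{\,a}, \qquad
  D_{\mathrm{opt}} \propto C^{\,b}, \qquad
  a = \frac{\beta}{\alpha+\beta}, \quad
  b = \frac{\alpha}{\alpha+\beta}
\]
```

Because the fitted exponents $\alpha$ and $\beta$ come out roughly equal, $a \approx b \approx 0.5$: the compute-optimal recipe scales model size and training tokens together, which is the paper's headline correction to earlier scaling-law prescriptions.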
Topic: An Empirical Model of Large-Batch Training (slides, video)
Topic: DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning (slides, video - part 2, starting at 01:04:29)
Topic: LoRA: Low-Rank Adaptation of Large Language Models (slides, video)
Topic: Exploring Sparse Adapters for Scalable Merging of Parameter Efficient Experts (slides, video)
Topic: Persona Vectors: Monitoring and Controlling Character Traits in Language Models (slides, video)
Topic: VinePPO: Refining Credit Assignment in RL Training of LLMs (slides, video)
Topic: Effect of scale on catastrophic forgetting in neural networks (slides, video)
Topic: VGGT: Visual Geometry Grounded Transformer (slides, video)
Topic: Scaling Laws for Transfer (slides, video)
Topic: On the Biology of a Large Language Model (slides, video)
Topic: The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (slides, video)
Topic: Simple and Scalable Strategies to Continually Pre-train Large Language Models (slides, video)
Topic: The Platonic Representation Hypothesis (slides, video)
Topic: The Ultra-Scale Playbook: Training LLMs on GPU Clusters (slides, video)
Topic: Evaluating Large Language Models Trained on Code (slides, video)
Topic: Hierarchical Reasoning Model (slides, video)
Topic: Training Compute-Optimal Protein Language Models (slides, video)
Topic: K2-Think: A Parameter-Efficient Reasoning System (slides, video)
Topic: Muon is Scalable for LLM Training (slides, video - part 2, from 1:51:54)
Topic: Scaling Laws For Dense Retrieval (slides, video)
Topic: Alignment faking in large language models (slides, video)
Topic: Subliminal Learning: Language models transmit behavioral traits via hidden signals in data (slides, video)
Topic: On the Theoretical Limitations of Embedding-Based Retrieval (slides, video)
Topic: The Superposition of Diffusion Models Using the Itô Density Estimator (slides, video)
Topic: xLSTM: Extended Long Short-Term Memory (slides, video)
Topic: Why Language Models Hallucinate (slides, video)
Topic: Large Language Diffusion Models (slides, video)
Topics: The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search & AlphaEvolve: A coding agent for scientific and algorithmic discovery (slides, video)
Topic: Less is More: Recursive Reasoning with Tiny Networks (slides, video)
Topic: Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws (slides, video)
Topic: Surprising Effectiveness of Pretraining Ternary Language Models at Scale (slides, video)
Topic: DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers (slides, video)
Topic: LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning (slides, video)
Topic: Distillation Scaling Laws (slides, video)
Topic: Scaling Rectified Flow Transformers for High-Resolution Image Synthesis (slides, video)
Topic: Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis (slides, video)
Topic: Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach (slides, video)