Towards AGI
  • Home
  • Schedule
  • Topics&Papers
    • Adversarial Robustness
    • Alignment and Safety
    • CompPsych-FoMo
    • Compression and Fast Inference
    • Continual Learning at Scale
    • Emergence & Phase Transitions in ML
    • Foundation Models
    • Generalization (iid and ood)
    • High Performance Computing
    • Knowledge Fusion
    • Neural Scaling Laws
    • Out-of-Distribution Generalization
    • Scaling Laws in Nature
    • State Space Models
    • Time Series Foundation Models

Schedule

Topics & Papers  

Paper presentations and projects: schedule & sign up sheet


Deadlines:
Oct 6 - project proposals due

TBA - final project submission due

Class 1 (Wed, Sept 3, 3:30-5:30pm)

Lecturer: Irina Rish 

Topic: Intro and Overview: A brief history of AI at Scale  (slides, video)

Papers:  The Bitter Lesson,   GPT-3 paper: Language Models are Few-Shot Learners

Class 2  (Mon, Sept 8, 3:30-5:30pm)

Lecturer: Irina Rish 

Topic:  Intro and Overview: Continual Learning at Scale (slides, video)

Class 3 (Wed, Sept 10, 3:30-5:30pm)

Lecturer: Irina Rish 

Topic: Overview of Papers to Present and Some Project Topics (video-part1, video-part2)

Class materials: some of the previous Topics & Papers (focus on: Continual Learning at Scale, Alignment and Safety, Emergence, Phase Transitions and Stat Physics of ML), some previous large-scale projects, Towards Time Series Foundation Models

Class 4  (Mon, Sept 15, 3:30-5:30pm)

Part 1: Lecturers: Alireza Dehghanpour Farashah and Aditi Khandelwal

Topic: Scaling Laws for Neural Language Models (slides, video)

Also covered: Training Compute-Optimal Large Language Models (Chinchilla Explained: video), Emergent Abilities of Large Language Models, Are emergent abilities of LLMs a Mirage?

Additional materials: Neural Scaling Laws and GPT-3 (video); a nice overview of the history of scaling laws: Scaling Laws for LLMs: from GPT-3 to o3

Part 2: Lecturer: Hiroki Naganuma

Topic:   An Empirical Model of Large-Batch Training (slides, video)

Class 5  (Wed, Sept 17, 3:30-5:30pm)

Part 1: Lecturer: Alex Coventry

Topic:   Planning with Reasoning using Vision Language World Model (video)

Part 2: Lecturer:  Samin Yeasar Arnob 

Topic:   DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning  (slides, video - part 2, starting at  01:04:29)

Class 6  (Mon, Sept 22, 3:30-5:30pm)

Part 1: Lecturers: Youssef Briki & Salman Hussain Ali

Topic:   LoRA: Low-Rank Adaptation of Large Language Models  (slides, video)

Part 2: Lecturer: Samin Yeasar Arnob

Topic:   Exploring Sparse Adapters for Scalable Merging of Parameter Efficient Experts (slides, video)

Class 7  (Wed, Sept 24, 3:30-5:30pm)

Part 1: Lecturers: Niklas Herbster & Martin Zborowski

Topic:   Persona Vectors: Monitoring and Controlling Character Traits in Language Models  (slides, video)

Part 2: Lecturer: Alex Coventry

Topic:   VinePPO: Refining Credit Assignment in RL Training of LLMs (slides, video)

Class 8  (Mon, Sept 29, 3:30-5:30pm)

Lecturers: Matthew Wiens & Arthur Toulouse

Topic:   Effect of scale on catastrophic forgetting in neural networks  (slides, video)

Class 9  (Wed, Oct 1, 3:30-5:30pm)

Part 1: Lecturer: Brandon Leblanc

Topic:   VGGT: Visual Geometry Grounded Transformer  (slides, video)

Part 2: Lecturers: David Guzmán & Zihan Wang

Topic:   Scaling Laws for Transfer (slides, video)

Class 10  (Mon, Oct 6, 3:30-5:30pm)

Part 1: Lecturers: Niklas Herbster & Martin Zborowski

Topic:   On the Biology of a Large Language Model  (slides, video)

Part 2: Lecturers: Simon-Olivier Duguay & Maximilien Le Clei

Topic:   The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (slides, video)

Class 11  (Wed, Oct 8, 3:30-5:30pm)

Part 1: Lecturers: Aidan Li & Ouakib Amine

Topic:   Simple and Scalable Strategies to Continually Pre-train Large Language Models  (slides, video)

Part 2: Lecturers: Matthew Wiens & Arthur Toulouse

Topic:   The Platonic Representation Hypothesis (slides, video)

Class 12  (Mon, Oct 13, 3:30-5:30pm)

No class due to Thanksgiving.

Class 13  (Wed, Oct 15, 3:30-5:30pm)

Part 1: Lecturers: Aidan Li & Youssef Briki

Topic:   The Ultra-Scale Playbook: Training LLMs on GPU Clusters  (slides, video)

Part 2: Lecturers: Étienne Mitchell-Bouchard & Olivier Déry-Prévost

Topic:   Evaluating Large Language Models Trained on Code (slides, video)

Class 14  (Mon, Oct 20, 3:30-5:30pm)

Part 1: Lecturers: Simon Roy & Samuel Barbeau

Topic:   Hierarchical Reasoning Model  (slides, video)

Part 2: Lecturers: Julia Kuhn & Syrine Matoussi

Topic:   Training Compute-Optimal Protein Language Models (slides, video)

Class 15  (Wed, Oct 22, 3:30-5:30pm)

Part 1: Lecturer: Zafir Khalid

Topic:   K2-Think: A Parameter-Efficient Reasoning System  (slides, video)

Part 2: Lecturers: Hiroki Naganuma & Ouakib Amine

Topic:   Muon is Scalable for LLM Training (slides, video - part 2, from 1:51:54)

Class 16  (Mon, Oct 27, 3:30-5:30pm)

Part 1: Lecturers: Yuchen Hui & Yehao Yan

Topic:   Scaling Laws For Dense Retrieval  (slides, video)

Part 2: Lecturers: Alireza Dehghanpour Farashah & Aditi Khandelwal

Topic:   Alignment faking in large language models (slides, video)

Class 17  (Wed, Oct 29, 3:30-5:30pm)

Part 1: Lecturers: Zafir Khalid & Onur Kocer

Topic:   Subliminal Learning: Language models transmit behavioral traits via hidden signals in data  (slides, video)

Part 2: Lecturers: Yuchen Hui & Yehao Yan

Topic:   On the Theoretical Limitations of Embedding-Based Retrieval (slides, video)

Course Project Presentations: Poster Session TBA
