Paper presentations and projects: schedule & sign-up sheet
Deadlines:
February - project proposals due
TBA - final project submission due
Lecturer: Irina Rish
Topic: Intro and Overview: A brief history of AI at Scale (slides)
Papers: The Bitter Lesson, GPT-3 paper: Language Models are Few-Shot Learners
Topic: Intro and Overview: Continual Learning at Scale (slides, video)
Topic: Overview of Emergence/Grokking, Large-Scale Projects, Time-Series FoMo (video)
Time-Series Foundation Models
AI@Scale Workshop (including tutorial on HPC and distributed LLM training)
Topic: Overview of Projects on Psych Eval of LLMs (video posted on discord)
Additional reading: Computational Psychology & Foundation Models papers and ComPsy FoMo Workshop
Part 1: Lecturer: Daria Yasafova
Topic: Scaling Laws for Neural Language Models (slides, video)
Additional video: Neural Scaling Laws and GPT-3
Topic: Training Compute-Optimal Large Language Models (Chinchilla Explained: video)
Additional reading: Scaling Laws for LLMs: from GPT-3 to o3
Part 1: Lecturer: Prateek Humane
Topic: DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning (slides, video)
Additional reading:
Open-R1: a fully open reproduction of DeepSeek-R1
DeepSeek-R1 explained (Medium article)
Topic: Scale Alone Does not Improve Mechanistic Interpretability in Vision Models (slides, video)
Part 2: Lecturer: Daria Yasafova
Topic: Language Models are Few-Shot Learners (slides, video - part 2)
Additional reading: Scaling Laws for LLMs: from GPT-3 to o3
Part 1: Lecturer: Prateek Humane
Topic: Alignment faking in large language models (slides, video)
Part 2: Lecturers: Edward Habelrih and Frederic Jarjour
Topic: Effect of scale on catastrophic forgetting in neural networks (slides, video - part 2)
Topic: Loss of plasticity in deep continual learning (slides, video)
Part 2: Lecturer: Jama Hussein Mohamud
Topic: Scaling Laws for Transfer (slides, video - part 2)
Part 1: Lecturers: Shruti Bibra and Yousef Kotp
Topic: When Do We Not Need Larger Vision Models? (slides, video)
Part 2: Lecturers: Jiadi Yu and Mingze Li
Topic: Zero-Shot Text-to-Image Generation (slides, video - part 2)
Part 1: Lecturers: Wenhao Xu and Huan Zhang
Topic: PaLM: Scaling Language Modeling with Pathways (slides, video)
Part 2: Lecturers: Sungjae Cho and Anirudh Jamkhandi
Topic: 1. Scaling laws in the mammalian neocortex: does form provide clues to function?
2. A Connectomic Hypothesis for the Hominization of the Brain (slides, video - part 2)
Part 1: Lecturers: Ivan Anokhin and Shahrad Mohammadzadeh
Topic: RLHF: How to Learn from Human Feedback with Reinforcement Learning (slides, video)
Part 2: Lecturers: Rishika Bhagwatkar and Aditya Sharma
Topic: RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning (slides, video - part 2)
Part 1: Lecturers: Sepehr Babapour and Alireza Dizaji
Topic: Representation Projection Invariance Mitigates Representation Collapse (slides, video)
Part 2: Lecturer: Anas El Houssaini
Topic: Open X-Embodiment: Robotic Learning Datasets and RT-X Models (slides, video - part 2)
Part 1: Lecturers: Azalée Robitaille and Artur Kuramshin
Topic: Data scaling laws in imitation learning for robotic manipulation (slides, video)
Part 2: Lecturers: Manping Li and Yan Zhang
Topic: Simple and Scalable Strategies to Continually Pre-train Large Language Models (slides, video - part 2)
Part 1: Lecturer: Roger Creus
Topic: Voyager: An Open-Ended Embodied Agent with Large Language Models (slides, video)
Part 2: Lecturers: Wenhao Xu and Huan Zhang
Topic: Towards Understanding Sycophancy in Language Models (slides, video - part 2)
Part 1: Lecturers: Ivan Anokhin and Navid Hassan Zadeh
Topic: Investigating Continual Pretraining in Large Language Models: Insights and Implications (slides, video)
Part 2: Lecturers: Yorguin Jose Mantilla Ramos and Yousef Kotp
Topic: Progress measures for grokking via mechanistic interpretability (slides, video - part 2)
Part 1: Lecturers: Kun Ni and Yuxing Tian
Topic: DeepSeek LLM: Scaling Open-Source Language Models with Longtermism (slides, video)
Part 2: Lecturer: Jama Hussein Mohamud
Topic: Emergent Abilities of Large Language Models (slides, video - part 2)
Part 1: Lecturers: Jia Ao Sun and Hao Yu
Topic: Constitutional AI: Harmlessness from AI Feedback (slides, video)
Part 2: Lecturers: Sungjae Cho and Anirudh Jamkhandi
Topic: Brain-inspired replay for continual learning with artificial neural networks (slides, video - part 2)
Part 1: Lecturers: Rishika Bhagwatkar and Aditya Sharma
Topic: Byte Latent Transformer: Patches Scale Better Than Tokens (slides, video)
Part 1: Lecturers: Azalée Robitaille and Artur Kuramshin
Topic: Robotic Control via Embodied Chain-of-Thought Reasoning (slides, video)
Part 2: Lecturers: Edward Habelrih and Frederic Jarjour
Topic: Large Language Models can Strategically Deceive their Users when Put Under Pressure (slides, video - part 2)
Part 1: Lecturers: Kun Ni and Zibo Shang
Topic: Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (slides, video)
Part 2: Lecturers: Anirudh Buvanesh and Ankur Sikarwar
Topic: The Platonic Representation Hypothesis (slides, video - part 2)
Part 1: Lecturers: Juan David Guerra and Mauricio Rivera
Topic: Rho-1: Not All Tokens Are What You Need (slides, video)
Part 2: Lecturer: Sepehr Babapour
Topic: Leveraging clinical data across healthcare institutions for continual learning of predictive risk models (slides, video - part 2)
Part 1: Lecturers: Anirudh Buvanesh and Ayush Agrawal
Topic: Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs (slides, video)
Part 2: Lecturers: Alexandrine Fortier and Mikhail Kamara
Topic: BrainWash: A Poisoning Attack to Forget in Continual Learning (slides, video - part 2)
Part 1: Lecturers: Jiadi Yu and Mingze Li
Topic: s1: Simple test-time scaling (slides, video)
Part 2: Lecturers: Manping Li and Yan Zhang
Topic: Learning Transferable Visual Models From Natural Language Supervision (slides, video - part 2)
Part 1: Lecturers: Samin Mahdipour and Mariem ben Slimen
Topic: Toward Next-Generation Artificial Intelligence: Catalyzing the NeuroAI Revolution (slides, video)
Part 1: Lecturers: Ankur Sikarwar and Ayush Agrawal
Topic: Genie: Generative Interactive Environments (slides, video)
Part 2: Lecturers: Michael Xi and Morteza Mahdiani
Topic: Scaling laws for decoding images from brain activity (slides, video - part 2)
Part 1: Lecturers: Alexandrine Fortier and Mikhail Kamara
Topic: Deception abilities emerged in large language models (slides, video - part 2)
Part 1: Lecturers: Mauricio Rivera and Juan David Guerra
Topic: Dynamic Neural Regeneration (slides, video)
Part 2: Lecturers: Navid Hassan Zadeh and Shahrad Mohammadzadeh
Topic: Chronos: Learning the Language of Time Series (slides, video - part 2)
Part 1: Lecturers: Yuxing Tian and Zibo Shang
Topic: Reasoning Models Don’t Always Say What They Think (slides, video - part 2)
Part 2: Lecturers: Artiom Matvei and William Chidiac
Topic: Mixtures of Experts Unlock Parameter Scaling for Deep RL (slides, video - part 2)
Part 1: Lecturers: Yajie Luo and Yihong Wu
Topic: SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training (slides, video)
Part 2: Lecturer: Roger Creus
Topic: Mastering Board Games by External and Internal Planning with Language Models (slides, video - part 2)