Workshop Schedule
July 28th, 2023
08:30 - 09:00 Arrival - coffee/tea
09:30 - 10:00 Irina Rish (University of Montreal/Mila): An Overview video
10:00 - 10:30 Pascal Jr. Tikeng Notsawo (University of Montreal/Mila): Is grokking predictable? (remote talk) video
paper: Predicting Grokking Long Before it Happens: A look into the loss landscape of models which grok
10:30 - 11:00 Eric Michaud (MIT): The Quantization Model of Neural Scaling (remote talk) video
paper: The Quantization Model of Neural Scaling
interview: Interview with Eric on the AI Alignment Forum
11:00 - 11:30 Elvis Dohmatob (Meta AI): Three Theoretical Insights for Robustness: The Bad, The Bad, and The Ugly! (remote talk) video
11:30 - 11:40 coffee break
11:40 - 12:10 Neel Nanda (Google DeepMind): Progress measures for grokking via mechanistic interpretability (remote talk) video
12:10 - 12:30 discussion
12:30 - 14:00 lunch (on your own)
14:15 - 14:45 Andrey Gromov (University of Maryland): A solvable model for grokking modular arithmetic (remote talk) video
paper: Grokking modular arithmetic
14:50 - 15:20 Rylan Schaeffer (Stanford): Are emergent abilities of Large Language Models a mirage? video
paper: Are Emergent Abilities of Large Language Models a Mirage?
15:20 - 15:40 discussion
15:40 - 16:00 coffee break
16:00 - 16:30 Ziming Liu (MIT/IAIFI): Intelligence from hunger video
16:30 - 17:00 Hugo Cui (EPFL): Error scaling laws in kernel learning video
17:00 - 17:20 Kshitij Gupta (University of Montreal/Mila): AGI Collective video
17:20 - 17:40 discussion