Scaling Laws & Emergent Behaviors
Mon @ 2:00 PM EST (Mila Calendar) | Topics & Papers
Relevant papers on Phase Transitions in ML
Winter/Spring 2024
2pm-3pm, Mon Feb 5, 2024
2pm-3pm, Mon Jan 29, 2024
Jacob Steinhardt's blog post: Future ML Systems Will Be Qualitatively Different
Fall 2023
3pm-4pm, Mon, Nov 6, 2023
Speaker: Pascal Jr. Tikeng Notsawo (University of Montreal/Mila)
Talk: Is grokking predictable? (video)
Paper: Predicting Grokking Long Before it Happens: A look into the loss landscape of models which grok
3pm-4pm, Mon, Oct 23, 2023
Open discussion. Working document
3pm-4pm, Mon, Oct 9, 2023
Speaker: Andrey Gromov (University of Maryland)
Talk: A solvable model for grokking modular arithmetic
Paper: Grokking modular arithmetic
3pm-4pm, Mon, Oct 2, 2023
Open discussion. Working document
Summer 2023
Winter/Spring 2023
Class 18 (11am-noon, Tue, March 14, 2023)
Lecturer: Niki Howe
Topic: Adversarial Policies Beat Superhuman Go AIs (slides, video)
Class 15 (11am-noon, Tue, March 7, 2023)
Lecturers: Kyle Roth, Alex Fulleringer
Topic: Artificial Intelligence, Values, and Alignment (slides)
Date/time: Tue Feb 2, 11:00 am EST
Topic: Open discussion on chatbot behaviors and the history of emergence and transitions in AI systems, covering Jacob Steinhardt's blog posts Future ML Systems Will Be Qualitatively Different and Emergent Deception and Emergent Optimization, as well as Broken Neural Scaling Laws and Emergent Abilities of Large Language Models.
Speaker: Irina Rish (video)
Topic: Tutorial and Q&A on Phase Transitions (video, Feb 2023)
Date/time: Tue Feb 7, 11:00 am EST
Extended version of the talk given at the 2nd Workshop on Neural Scaling Laws
Speaker: Guillaume Dumas
Papers mentioned: Problems in Physics with Many Scales of Length, Quantifying causal emergence shows that macro can beat micro, How critical is brain criticality?, Multilevel development of cognitive abilities in an artificial neural network, Why Deep Learning Works II: the Renormalization Group
Related: Scaling course, Class 5
Topic: Phase Transitions in AI (and Emergent Behaviors in Large-Scale models) (slides, video)
Papers on "phase transitions" in AI: Hard and Easy Distributions of SAT Problems, Every Monotone Graph Property Has a Sharp Threshold, Approximability of probability distributions, Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets, GPT-3 paper: Language Models are Few-Shot Learners, A universal law of robustness via isoperimetry.
Topic: Planning paper discussions for the winter trimester
Date/time: Tue Jan 24, 11:00 am EST
Speaker: Irina Rish