Scaling Laws & Emergent Behaviors
Call-in link Mon @ 2:00 PM EST (Mila Calendar) Topics & Papers
Relevant papers on Emergence, Phase Transitions and Stat Physics of ML
Chapter 2 overview: Phase Transitions In Machine Learning (book)
Jacob Steinhardt's blog: Future ML Systems Will Be Qualitatively Different - LessWrong
Exploring Neuron Interactions and Emergence in LLMs: From the Multifractal Analysis Perspective
The role of long-term power-law memory in controlling large-scale dynamical networks
Speakers: Paul Bogdan's group
A perspective on flocking in DNNs
Speakers: Andrei Mircea and Ekaterina Lobacheva
Gradient dissent in language model pretraining slides / video
An overview of ongoing work by several groups
Speakers: Paul Bogdan, Pascal Notsawo, Darshil Doshi
Speaker: Parviz
Talk: Grokking as Compression
Paper: Grokking as Compression
YouTube: Grokking as Compression: A Nonlinear Complexity Perspective
Discussion, continued:
Paper to discuss:
Related work/papers mentioned in today's discussion:
A Mathematical Framework for Transformer Circuits
Chinchilla scaling laws etc: Go smol or go home, Training Compute-Optimal Large Language Models, chinchilla's wild implications — LessWrong, Chinchilla Explained: video
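For quick reference when reading the Chinchilla material above, here is a minimal Python sketch of the parametric loss fit L(N, D) = E + A/N^alpha + B/D^beta from Training Compute-Optimal Large Language Models; the constants are approximately the coefficients reported in the paper's fit, and the helper name and example model sizes are illustrative only, not an official implementation.

```python
# Minimal sketch of the Chinchilla parametric loss fit,
#   L(N, D) = E + A / N**alpha + B / D**beta,
# using roughly the coefficients reported in
# "Training Compute-Optimal Large Language Models".
# Constants and the helper below are illustrative, not an official implementation.

E, A, B = 1.69, 406.4, 410.7
alpha, beta = 0.34, 0.28

def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    """Predicted pretraining loss for N parameters trained on D tokens."""
    return E + A / n_params**alpha + B / n_tokens**beta

# Example: a 70B-parameter model on 1.4T tokens (the Chinchilla setting)
# vs. a 280B-parameter model on 300B tokens (a Gopher-like allocation),
# two allocations of comparable compute C ~ 6 * N * D.
print(chinchilla_loss(70e9, 1.4e12))   # smaller model, more tokens
print(chinchilla_loss(280e9, 300e9))   # larger model, fewer tokens
```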
Speaker: Pascal Jr. Tikeng Notsawo (University of Montreal/Mila)
Talk: Is grokking predictable? video
Paper: Predicting Grokking Long Before it Happens: A look into the loss landscape of models which grok
Open discussion. Working document
Speaker: Andrey Gromov (University of Maryland)
Talk: A solvable model for grokking modular arithmetic
Paper: Grokking modular arithmetic
Open discussion. Working document
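As background for the modular-arithmetic grokking setups in the talks above, here is a minimal sketch of the standard (a, b) -> (a + b) mod p dataset with a random train/test split; the prime p = 97 and the 50% train fraction are illustrative defaults, not the exact configuration used in either paper.

```python
import numpy as np

# Illustrative sketch of the modular-addition dataset commonly used in
# grokking experiments: learn (a, b) -> (a + b) mod p from a subset of all pairs.
# p = 97 and the 50% train fraction are example values, not a specific paper's setup.

def make_modular_addition_data(p: int = 97, train_frac: float = 0.5, seed: int = 0):
    rng = np.random.default_rng(seed)
    a, b = np.meshgrid(np.arange(p), np.arange(p), indexing="ij")
    pairs = np.stack([a.ravel(), b.ravel()], axis=1)   # all p*p input pairs
    labels = (pairs[:, 0] + pairs[:, 1]) % p            # targets (a + b) mod p
    perm = rng.permutation(len(pairs))
    n_train = int(train_frac * len(pairs))
    train_idx, test_idx = perm[:n_train], perm[n_train:]
    return (pairs[train_idx], labels[train_idx]), (pairs[test_idx], labels[test_idx])

(train_x, train_y), (test_x, test_y) = make_modular_addition_data()
print(train_x.shape, test_x.shape)  # (4704, 2) (4705, 2) for p = 97, 50% split
```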
Lecturer: Niki Howe
Topic: Adversarial Policies Beat Superhuman Go AIs (slides, video)
Lecturers: Kyle Roth, Alex Fulleringer
Topic: Artificial Intelligence, Values, and Alignment (slides)
Date/time: Tue Feb 2 11:00 am EST
Topic: open discussion on chatbot behaviors; history of emergence and transitions in AI systems; Jacob Steinhardt's blog posts Future ML Systems Will Be Qualitatively Different and Emergent Deception and Emergent Optimization; as well as Broken Neural Scaling Laws and Emergent Abilities of Large Language Models.
Speaker: Irina Rish (video)
Topic: Tutorial and Q&A on Phase Transitions (video Feb 2023)
Date/time: Tue Feb 7, 11:00 am EST
Extended version of the talk given at the 2nd Workshop on Neural Scaling Laws
Speaker: Guillaume Dumas
Papers mentioned: Problems in Physics with Many Scales of Length, Quantifying causal emergence shows that macro can beat micro, How critical is brain criticality?, Multilevel development of cognitive abilities in an artificial neural network, Why Deep Learning Works II: the Renormalization Group
Related: Scaling course, Class 5
Topic: Phase Transitions in AI (and Emergent Behaviors in Large-Scale models) (slides, video)
Papers on "phase transitions" in AI: Hard and Easy Distributions of SAT Problems, Every Monotone Graph Property Has a Sharp Threshold, Approximability of probability distributions, Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets, GPT-3 paper: Language Models are Few-Shot Learners, A universal law of robustness via isoperimetry.
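As a toy illustration of the "sharp threshold" phenomenon behind several of these papers, the sketch below estimates the fraction of satisfiable random 3-SAT formulas as the clause-to-variable ratio crosses the empirically observed threshold near 4.27; the brute-force solver, instance sizes, and trial counts are illustrative choices kept small enough to run quickly, and finite-size effects smooth the transition at this tiny scale.

```python
import itertools, random

# Illustrative sketch of the random 3-SAT satisfiability phase transition
# discussed in "Hard and Easy Distributions of SAT Problems": as the
# clause-to-variable ratio alpha = m/n grows past roughly 4.27, the fraction
# of satisfiable random formulas drops sharply. Brute force, so n is kept tiny,
# which smooths the transition relative to large-n behavior.

def random_3sat(n_vars: int, n_clauses: int, rng: random.Random):
    """Each clause uses 3 distinct variables; each literal is negated with prob 1/2."""
    return [
        [(v + 1) * rng.choice([-1, 1]) for v in rng.sample(range(n_vars), 3)]
        for _ in range(n_clauses)
    ]

def is_satisfiable(formula, n_vars: int) -> bool:
    """Exhaustively check all 2**n_vars assignments."""
    for bits in itertools.product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause) for clause in formula):
            return True
    return False

rng = random.Random(0)
n_vars, n_trials = 10, 30
for alpha in (2.0, 4.0, 4.3, 6.0):
    n_clauses = int(alpha * n_vars)
    sat = sum(
        is_satisfiable(random_3sat(n_vars, n_clauses, rng), n_vars)
        for _ in range(n_trials)
    )
    print(f"alpha={alpha:.1f}  satisfiable fraction ~= {sat / n_trials:.2f}")
```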
Topic: Planning paper discussions for winter trimester
Date/time: Tue Jan 24, 11:00 am EST
Speaker: Irina Rish