Workshop Schedule 

Fri Dec 2, 2022


8:00 - 8:40 - coffee/breakfast

8:50 - 9:10 - Irina Rish (Mila/UdeM): Overview video/slides

9:10 - 9:30 - Felix Stollenwerk (AI Sweden): Scaling GPT-SW3

9:30 - 9:50 - Evi Gogoulou (RISE/KTH): Continual Learning in the presence of language shift video/slides

9:50 - 10:20 - coffee (during the talks :)

10:20 - 10:40 - Ankesh Anand (DeepMind): Large-scale Pretrained Models and RL video/slides

10:40 - 11:10 - David Krueger (U of Cambridge): A discussion on AI Safety and Alignment video/slides

11:10 - 11:30 - Leo Gao (OpenAI): Scaling Laws for Reward Model Overoptimization slides

11:30 - 11:40 - coffee break/poster setup

11:50 - 12:10 - Eric Michaud (MIT): Neural Scaling Exponents Beyond the Manifold Dimension video/slides

12:10 - 12:40 - Ziming Liu (MIT): Omnigrok: Grokking Beyond Algorithmic Data video/slides

12:40 - 1:00 - Ethan Caballero (Mila/McGill): Broken Neural Scaling Laws video/slides

1:00 - 2:00 - lunch / poster session

2:35 - 3:00 - Jenia Jitsev (Juelich Supercomputing Center, ELLIS, LAION): Supercomputers, supercommunities --> superintelligence? Kind of. video/slides

3:00 - 3:30 - Christoph Schuhmann (LAION): Together we are strong: Web scale data sets and foundation models through community organizing video/slides

3:30 - 3:45 - Robert Kaczmarczyk (TU Munich, LAION): LAION-5B and beyond - datasets, models, and...? video/slides

3:45 - 4:10 - Julien Launay (lighton.ai): HQ data need not apply: training LLMs with web data only video/slides

4:10 - 4:30 - Thomas Wolf (Hugging Face): Collaboratively training a large multilingual language model video/slides

4:30 - 5:00  -  coffee break/posters

5:00 - 5:30 - Jonas Andrulis and Robert Baldock (Aleph Alpha): Don't be dense: Sparse results and manifesto video/slides

5:30 - 5:50 - Joel Hestness (Cerebras): Toward Sparsity Scaling Laws: Needs and Opportunities video/slides

5:50 - 6:00 - Michael Trazzi: Are AI Researchers Aligned About AGI Alignment? video/slides

6:00 - 6:30 - panel:

    joining remotely: Jared Kaplan (Anthropic/JHU), Surya Ganguli (Stanford), David Krueger (U of Cambridge)

    in person: Ethan Caballero (Mila/McGill), Jenia Jitsev (Juelich/LAION), Jonas Andrulis (Aleph Alpha), Joel Hestness (Cerebras)

7:00 - 10:00 - party

Posters:

Robert Baldock (Aleph Alpha): Pruned Models Win The New Hardware Lottery: Efficient Inference Of Pruned Large Language Models With Graphcore's IPU

Ethan Caballero (Mila/McGill/UdeM): Broken Neural Scaling Laws

Muawiz Chaudhary (Mila)

Wei Deng (ML Research, Morgan Stanley): A Conditional Schrödinger Bridge Method for Probabilistic Time Series Imputation

António Góis (Mila)

Timothee Lesort (Mila)

TBA