Neural Scaling Laws
Surveys:
EpochAI Scaling Laws Literature Review and a database of papers on scaling laws
History of Scaling Laws:
Learning Curves: Asymptotic Values and Rate of Convergence (Cortes et al., 1994)
Broken Neural Scaling Laws (BNSL):
Multiply broken power-law densities as survival functions
2024
Random matrix methods for high-dimensional machine learning models
2023
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
Scaling Data-Constrained Language Models
An Information-Theoretic Analysis of Compute-Optimal Neural Scaling Laws
Scaling laws for single-agent reinforcement learning
Scaling Laws for Generative Mixed-Modal Language Models
Training Trajectories of Language Models Across Scales
2022
Holistic Evaluation of Language Models (HELM) - leaderboard
Reproducible scaling laws for contrastive language-image learning (LAION CLIP)
Scaling Laws Beyond Backpropagation
Beyond neural scaling laws: beating power law scaling via data pruning
Training compute-optimal large language models ("Chinchilla")
What Language Model to Train if You Have One Million GPU Hours?
Unified Scaling Laws for Routed Language Models - Scaling laws for mixture-of-experts (MoE) models
Scaling Scaling Laws with Board Games - Scaling laws for AlphaZero on Hex
Scaling Laws for Neural Language Models (Kaplan et al., 2020; the original, widely cited scaling laws paper)
A Neural Scaling Law from the Dimension of the Data Manifold
A constructive prediction of the generalization error across scales
Jonathan Rosenfeld's PhD thesis on Scaling Laws for Deep Learning
Scaling and Reinforcement Learning
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
Training language models to follow instructions with human feedback
Offline Pre-trained Multi-Agent Decision Transformer