As the current LLM research community pushes towards massive scale, this workshop takes a complementary approach by focusing on the under-explored small-scale regime, where scale encompasses compute, data, and model size. We ask: to what extent is scale necessary, and how far can we push toward smaller settings while maintaining competitive performance and enabling scientifically meaningful discoveries? Recent work demonstrates a breadth of opportunities at this scale, including:
analyses of empirically successful methods and emergent phenomena during training;
minimalistic replications of modern pipelines;
investigation of model internals;
diagnoses and mitigations of failure modes; and
algorithmic innovations across training and inference.
This workshop aims to highlight the methods and opportunities enabled by small-scale experimentation, and to foster discussion on how such approaches can broaden participation and accelerate progress in LLM research.
We invite submissions that use small-scale experiments to probe the potential and limits of this regime for algorithmic innovation and scientific understanding. Submissions need not improve state-of-the-art performance. Topics of interest include, but are not limited to:
LLM Training: model architectures, optimizers, training dynamics, emergent abilities, pre-training and post-training strategies, data curation and selection.
Inference-Time Methods: prompt optimization, self-improvement, test-time scaling, test-time training.
Evaluation and Benchmarks: stress tests, synthetic tasks, benchmarks (static and dynamic), evaluation protocols and metrics.
Interpretability and Safety: mechanistic interpretability, safety and alignment, robustness, steering and monitoring.
Survey or position papers on methods, opportunities, and limitations at small scale are also welcome.
Paper & notebook submission deadline: June 30th, 2026 (AoE).
Review period: July 1st - July 15th, 2026 (4:59pm PDT / 11:59pm UTC).
Notification date: July 24th, 2026.
Submissions are accepted in two tracks.
Small-Scale Frontier Track: This track is intended for work advancing the small-scale frontier along any dimension, including data efficiency, compute efficiency, or model size. The main experimental results must be demonstrated on models with at most 3B parameters, with a soft training budget cap of 10^20 FLOPs for any single run. As a point of reference, under a standard GRPO setup, this budget is roughly comparable to training a 3B-parameter model for 10 epochs on 5,000 questions, using 4 generations per question with each generation capped at 4096 tokens.
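For authors preparing a FLOP estimate, the reference point above can be turned into a back-of-the-envelope calculation. The sketch below is illustrative only and is not the organizers' accounting, which the call does not spell out. It takes the cap as 10^20 FLOPs and assumes the common approximations of about 2 FLOPs per parameter per token for a forward pass (sampling) and about 6 FLOPs per parameter per token for a forward plus backward pass (training). It also assumes every generation reaches the token cap and ignores prompt tokens, reference-model passes, and repeated gradient steps. The names `GRPORun`, `estimate_flops`, and `FLOP_CAP` are hypothetical.

```python
from dataclasses import dataclass

# Soft per-run cap and model-size limit, as read from the call (assumed 10^20).
FLOP_CAP = 10**20
MAX_PARAMS = 3_000_000_000


@dataclass(frozen=True)
class GRPORun:
    """Hypothetical description of one GRPO training run."""
    n_params: int
    epochs: int
    questions: int
    generations_per_question: int
    max_tokens_per_generation: int

    def generated_tokens(self) -> int:
        # Upper bound: assumes every generation uses its full token budget.
        return (
            self.epochs
            * self.questions
            * self.generations_per_question
            * self.max_tokens_per_generation
        )


def estimate_flops(run: GRPORun) -> int:
    """Rough FLOP estimate: sampling (~2*N*D) plus training (~6*N*D).

    Assumption: prompt tokens, reference-model forward passes, and
    multiple optimizer steps per batch are all ignored.
    """
    tokens = run.generated_tokens()
    sampling = 2 * run.n_params * tokens   # forward passes to generate
    training = 6 * run.n_params * tokens   # forward + backward on the samples
    return sampling + training


# The reference setup quoted in the track description.
reference = GRPORun(
    n_params=3_000_000_000,
    epochs=10,
    questions=5_000,
    generations_per_question=4,
    max_tokens_per_generation=4096,
)

total = estimate_flops(reference)
print(f"tokens generated: {reference.generated_tokens():,}")
print(f"estimated FLOPs:  {total:.2e}")
print(f"within soft cap:  {total <= FLOP_CAP}")
```

Under these simplifications the reference setup comes out at roughly 2 x 10^19 FLOPs, which is within an order of magnitude of the cap. Terms omitted here would push the figure upward, so a reported budget should state which costs it includes.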
To support reproducibility, authors will be required to report estimated FLOP budgets for their experiments and are strongly encouraged to release relevant training artifacts, including:
training logs,
intermediate checkpoints,
evaluation artifacts (datasets and metrics), and
detailed analyses of model behavior.
Submissions are limited to 4 pages in the main body, with unlimited supplementary material. Authors may additionally submit a ZIP file containing supporting artifacts.
Free-Tier Colab Track: This track is intended for work that can be reproduced within the constraints of a free-tier Google Colab notebook (≤1 GPU, ≤12 hours runtime, ≤500 GB storage).
Submitted notebooks should be clearly documented and runnable on free-tier Colab. Authors should provide environment specifications where necessary. Example notebooks can be found at our GitHub link.
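One way to provide environment specifications is to have the notebook's first cell record them. The following is a minimal sketch using only the Python standard library; it is not a format required by the workshop. The function name `environment_report` and the default package list are illustrative assumptions, so authors would substitute the packages their notebook actually depends on.

```python
import platform
import shutil
import sys
from importlib import metadata


def environment_report(packages=("torch", "transformers", "numpy")):
    """Collect basic environment details for a reproducibility note.

    The default package names are placeholders (an assumption); pass the
    notebook's real dependencies instead.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this runtime

    # Disk usage of the filesystem holding the current working directory.
    disk = shutil.disk_usage(".")
    gib = 1024**3

    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "executable": sys.executable,
        "disk_total_gb": round(disk.total / gib, 1),
        "disk_free_gb": round(disk.free / gib, 1),
        "packages": versions,
    }


report = environment_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

Printing the report at the top of the notebook lets reviewers compare their runtime against the one the results were produced on. GPU details are not covered here and would need to be recorded separately.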
Submissions should emphasize methodological clarity, reproducibility, and scientific insight under tight computational constraints. A PDF write-up (maximum 2 pages) is encouraged but not required.
The reviewing process will be double-blind and all submissions must be anonymized. Please do not include author names, affiliations, acknowledgements, or any other identifying information in your submission. Submissions and reviews will not be made public.
All submissions must be made through the OpenReview site.
Style Files: Please find the relevant files for submission here: https://drive.google.com/drive/folders/1SyfMQYB_B1mgu5bavRCuGrto8kgkOWAh
Note: If you are creating a new OpenReview profile, we strongly recommend using your institutional email address. Profiles created without an institutional email may require a moderation process, which can take up to two weeks.
Dual submission: This workshop is non-archival and will not have official proceedings. Work submitted here may also be submitted to other venues. We welcome ongoing and unpublished work, including papers that are under review at the time of submission. We do not accept submissions that have already been accepted for publication at other venues with archival proceedings, with the sole exception of COLM 2026 main conference papers.
Please refer to the FAQs for commonly asked questions about submissions.