Combining Theory and Benchmarks: Towards A Virtuous Cycle to Understand and Guarantee Foundation Model Performance
ICML 2026
Seoul, South Korea
July 10 or 11, 2026
Call For Papers
We invite original research papers, position papers, and survey articles that advance research at the intersection of theory and benchmarking. Original research papers should present significant new findings, innovative methodologies, or theoretical insights. Position papers should provide thought-provoking perspectives on emerging trends and challenges in the field. Survey articles should offer comprehensive overviews of specific topics, map the current research landscape, and propose future directions. All submissions must align with the workshop’s theme and should stimulate discussion among participants.
Authors may submit tiny papers (up to 2 pages), short papers (up to 4 pages), or long papers (up to 8 pages). Tiny papers are intended to encourage submissions from early-career researchers and to provide a venue for timely intermediate research milestones that may not be presentable elsewhere.
Topics of interest include, but are not limited to:
Quantification of capabilities across levels: How can we move from scores to formal, quantitative guarantees on performance across levels?
data-centric quantification (e.g., which data subsets yield the tightest bounds or most informative coverage metrics)
instance-wise uncertainty bounds (e.g., conformal prediction, PAC-Bayes, calibrated error bars, and worst-case analysis)
class- or task-level generalization guarantees
quantitative analyses of how architectural or inductive biases affect reliability and failure modes
Foundations of generalization and composition: Which mathematical frameworks can explain when and why models generalize? How can benchmarks be designed so that we can test hypotheses derived from theory?
theoretical frameworks for transfer and compositional generalization
scaling laws and emergent behavior
meta-learning and multi-task generalization theory
composition of uncertainty across modules
benchmark design with provable properties such as tunable difficulty, decomposability, or built-in “hardness parameters” that enable controlled extrapolation
Reliable and structured empirical evaluation: How should benchmarks be constructed to evaluate reasoning, robustness under distribution shift, and calibrated uncertainty?
robustness and domain shift benchmarks
empirical uncertainty quantification and calibration (ensembles, Bayesian, or conformal methods)
fine-grained failure analyses
standardized ablation and reproducibility protocols
evaluation for multimodal, continual, or compositional settings
studies addressing bias, fairness, and safety evaluation under shift or adversarial prompting
Submission Guidelines
Tiny papers: Maximum 2 pages (excluding references, appendices, or other supplementary material)
Short papers: Maximum 4 pages (excluding references, appendices, or other supplementary material)
Long papers: Maximum 8 pages (excluding references, appendices, or other supplementary material)
Submission Link: https://openreview.net/group?id=ICML.cc/2026/Workshop/CTB
Submissions should be formatted using the ICML 2026 template according to the author instructions. Submissions will be made through OpenReview (see submission link above) and all submitting authors are required to have an OpenReview profile. Submissions should follow the same author guidelines as the main conference, including ethical conduct for peer review. All submissions are considered non-archival, and will be published on OpenReview and the workshop website. We will consider submissions that have undergone peer review or are currently under review at other venues. All submissions must be anonymous to accommodate the double-blind review process.
All accepted papers will be presented as posters during the workshop, and at least one author of each paper must attend in person. Additionally, a small number of outstanding papers will be invited for oral presentations. To encourage submissions, we plan to announce multiple best paper awards for each type of contribution, with associated prizes pending sponsorship.
Important Dates
Call for Papers Released: 20 March 2026
Submission Deadline: 24 April 2026
Review Period: 24 April 2026 - 13 May 2026
Notification Deadline: 15 May 2026
Camera-Ready Deadline (available on ICML Website): 29 May 2026
Workshop Dates: 10 or 11 July 2026
Useful Links
ICML 2026 Workshops: https://icml.cc/Conferences/2026/CallForWorkshops