The I Can’t Believe It’s Not Better (ICBINB) initiative is excited to announce its upcoming workshop at ICLR 2026 in Rio de Janeiro (Brazil), dedicated to negative results and rigorous evidence of limitations of LLMs that still need to be overcome. We invite researchers and industry professionals to submit papers on negative results and unexpected challenges encountered in developing, aligning, scaling, evaluating, and deploying these systems. The primary goal is to create a platform for open, honest discussion about the hurdles and roadblocks in building reliable, efficient, and safe LLM systems. We believe that sharing these experiences is crucial for the field: it prevents teams from retracing unproductive paths, strengthens our understanding of failure modes and boundary conditions, and fosters a culture of transparency and learning.
To this aim, we invite submissions that
showcase and investigate important limitations of current LLMs. This may include evaluations of pitfalls in common approaches to alignment, reasoning, etc., as well as evaluations of real-world (especially safety-critical) applications.
attempt promising ideas to overcome common challenges but fall short of the expected gains, accompanied by analyses that clarify failure modes and boundary conditions.
Specifically, the submitted papers should contain the following:
Problem. A problem in a clearly specified domain/setting (e.g., clinical decision-making tasks, long-horizon tool use, code agents), with assumptions, target metrics, and desired improvements precisely stated.
Proposed solution. A solution to this type of problem as proposed in prior literature, including its core mechanism, required preconditions, and the hypotheses under which it is expected to work.
Observed outcome. A concise description of the negative or null outcome, including what failed to improve (and by how much), instability or regressions observed, and quantitative evidence (metrics, error bars, compute budget, data/seed details).
Reason for failure. An investigation of (and ideally an answer to) why it did not work as promised by the literature: e.g., dataset artifacts/leakage, mis-specified objectives (reward hacking), shortcut cues, distribution shift, optimization or scaling constraints (compute, memory, context length), fragile tool-use/memory orchestration in agents, or evaluation mismatches; supported by diagnostics/ablations and ending with boundary conditions and actionable takeaways.
Around these "negative results", a non-exhaustive, topic-wise list of LLM research problems includes (but is not limited to): (i). Reasoning. Works that reveal brittle logic, shallow or non-transferable chains of thought, limited systematic generalization, or domain-specific reasoning failures. (ii). Alignment. Misalignment between user intent and model behavior; failures in safety tuning, adversarial robustness, or goal preservation (including post-deployment drift). (iii). Efficiency and Scaling. Limitations in training, inference, and fine-tuning under realistic compute/latency/memory constraints, with particular emphasis on energy use and sustainability. (iv). Agents. Challenges in multi-step planning, tool use/selection, memory, self-monitoring/reflection, or stability of open-ended agentic systems. (v). Hallucinations. Studies of factual inaccuracies, fabricated/phantom citations, and calibration of model confidence/trust, alongside leak-resistant evaluation and mitigation. (vi). Other. Any well-supported finding that challenges prevailing assumptions, exposes boundary conditions, or provides constructive negative results.
Besides these points, papers will be assessed on
Clarity of writing.
Rigor and transparency in the scientific methodologies employed.
Novelty and significance of insights.
Quality of discussion of limitations.
Reproducibility of results.
Reviewers will nominate papers for the spotlight and contributed talks, as well as two awards: the "Entropic Award" for the most surprising negative result, and the "Didactic Award" for the most well-explained and pedagogical paper.
The archival status of this workshop is still being clarified and will be announced on our website closer to the submission deadline. In any case, authors will have the option to place their paper in a non-archival track, allowing them to share preliminary findings that will later undergo full review at another venue. Proceedings of our last workshop can be found here.
Note from ICLR:
Since 2025, ICLR has discontinued the separate “Tiny Papers” track, and is instead requiring each workshop to accept short (3–5 pages in ICLR format, exact page length to be determined by each workshop) paper submissions, with an eye towards inclusion; see https://iclr.cc/Conferences/2025/CallForTinyPapers for a history of the ICLR tiny papers initiative. Authors of these papers will be earmarked for potential funding from ICLR, but need to submit a separate application for Financial Assistance that evaluates their eligibility. This application for Financial Assistance to attend ICLR 2026 will become available on https://iclr.cc/Conferences/2026/ at the beginning of February and close early March.
For submissions, please use the style files, which can be found here.
We use OpenReview to host submissions, and the review process will be double-blind.
Submissions should be no more than 4 pages long (excluding references), and authors should consider the following:
Authors may include unlimited appendices, but reviewers will not be required to take them into account in their assessment of the submission.
We ask authors to disclose any use of Large Language Models (LLMs) in a paragraph describing their role in the work. We also recommend including an ethics statement and a reproducibility statement. The details of these statements can be found on the ICLR website, and none of them count toward the page limit.
We welcome first-time authors to submit to this workshop. The workshop will be run in person.
Additionally, we welcome contributions of tiny papers to our workshop. These papers follow the same structure and formatting instructions as full workshop submissions, but with at most 2 pages of main text. They are not required to contain all four elements mentioned above, but should at least highlight a challenge around LLMs and describe the (negative) outcome.
Important Dates:
Paper Submission Deadline - January 31st, 2026 (OpenReview Submission link can be found here)
Notification of Acceptance/Rejection - March 1st, 2026
Camera-ready & poster submission - March 8th, 2026
In-person Workshop - April 26th or 27th, 2026 (TBA)
Updates will be posted on this website. For any questions, please reach out to us at cant.believe.it.is.not.better@gmail.com.
We look forward to your submission!