Machine learning is grappling with a fundamental challenge: bias. It appears in many forms — class imbalance, spurious correlations, algorithmic unfairness, and dataset shift — and is often tackled in isolation.
This workshop breaks down those silos. We are bringing together researchers from the fairness, robustness, and generalisation communities to build a unified theoretical framework for understanding and mitigating learning biases.
In this event, we will dig into the technical challenges, foster interdisciplinary dialogue, and promote collaborative problem-solving. Whether you're a seasoned professor, an industry researcher, or a PhD student, if you're working on making ML safer, more reliable, and more efficient, this workshop is for you.
Advancing theoretical understanding of how diverse data imbalances shape learning.
Fostering interdisciplinary dialogue among researchers in fairness, robustness, and dataset shift to build a common vocabulary.
Promoting principled approaches over narrow heuristics.
Emphasising data distributions as the central unifying factor behind ML pathologies.
Questions during the talks? Ask them here!
08:45 – Opening Remarks
09:00 – Fanny Yang "Robust Generalization Under Misspecified Robust Risk and the Role of Limited Target Data" [slides 🎞️]
09:45 – Contributed Talk by Jean-Michel Loubes "When majority rules, minority loses" [slides 🎞️]
10:30 – Break & Poster Session
11:00 – Aasa Feragen "AI bias - it's harder than you think" [slides 🎞️]
11:45 – Contributed Talk by Anissa Alloula "Representation Invariance and Allocation" [slides 🎞️]
12:30 – Lunch Break & Poster Session
13:30 – Emanuele Francazi "Bias Before Learning: How Initialization Design Shapes Bias and Efficiency" [slides 🎞️]
14:15 – Levent Sagun "Concepts, Proxies, and the Performance of Deduction"
15:00 – Break & Poster Session
15:30 – Shai Ben-David "On potential ethical harms inflicted by common types of bias in training data" [slides 🎞️]
16:15 – Poster Session
On potential ethical harms inflicted by common types of bias in training data
Several common types of bias in training data can be viewed as the result of some "censorship" applied to the data that the learner has access to. I consider several types of such "bias through censorship" that occur in real-life scenarios:
Learning when only positive points (but not all of them) are labeled in the training data (a.k.a. PU learning),
The so-called "apple tasting" setup, in which the learner can access only the labels of points that were previously predicted (perhaps wrongly) to be positive,
Limiting the representation of data.
I will examine some social and ethical costs of deploying classifiers trained on such biased data for decisions that impact people.
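To make the "apple tasting" censorship above concrete, here is a minimal simulation sketch; the toy data-generating process and the noisy screening rule are illustrative assumptions on my part, not material from the talk. The learner only ever observes the labels of points it predicted to be positive, so the labeled sample it accumulates is censored and skewed relative to the population.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D population: the true label is positive exactly when x > 0.
n = 10_000
x = rng.normal(size=n)
y = (x > 0).astype(int)

# A noisy screening rule that (perhaps wrongly) flags some points as positive.
predicted_positive = (x + rng.normal(scale=0.8, size=n)) > 0.5

# Apple-tasting censorship: only the labels of points predicted positive are observed.
observed_y = y[predicted_positive]

print(f"Positive rate in the population:      {y.mean():.2f}")
print(f"Positive rate in the observed sample: {observed_y.mean():.2f}")
print(f"Fraction of labels never observed:    {1 - predicted_positive.mean():.2f}")
```

A classifier retrained on the observed sample sees a heavily inflated positive rate and receives no feedback at all on the points that were screened out, which is the kind of skew whose downstream costs the talk discusses.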
AI bias - it's harder than you think
While reality keeps presenting examples of how AI can perpetuate or even create bias and discrimination, very few people, even among AI researchers, realize how diverse the underlying mechanisms are. The narrative is often oversimplified along the lines of "AI developers were not thinking and collected biased data, giving biased AI". While this definitely happens, it is far from the only way that bias enters AI models. In this talk, we will go through some thought-provoking stories about non-trivial AI bias and discuss open statistical problems related to detecting and mitigating it.
Bias Before Learning: How Initialization Design Shapes Bias and Efficiency
Bias in deep learning is often attributed to data or learning algorithms, but deep neural networks can already exhibit predictive skew before they see any data. In this talk, I focus on a previously underexplored source of bias that arises purely from architectural and initialization choices, and that remains visible even in the most symmetric setting—perfectly balanced labels and identically distributed classes. I introduce Initial Guessing Bias (IGB), a phenomenon in which, at initialization, most data points are systematically assigned to a single class. Using the IGB framework, we show how practical design choices—activation functions, normalization placement, pooling, and depth—shape this initial skew.
By connecting IGB to mean-field analyses of gradient stability, we obtain a unified theory in which bias emergence and trainability arise from the same underlying regimes. This perspective shows, first, that the most trainable operating region of a network starts with a surprisingly strong but transient predictive bias. It also reveals that not all initial biases behave the same: when strong skew is paired with vanishing gradients, it tends to persist much longer, leading to qualitatively different learning dynamics.
More broadly, these results illustrate how learning biases and efficiency are tied to the same underlying mechanisms. This unified perspective suggests practical levers—at the level of initialization and architectural design—for steering deep networks toward more controlled and interpretable learning dynamics.
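As a rough illustration of how initial guessing bias can be observed, the sketch below (my own illustrative code, not material from the talk) initializes a deep ReLU MLP with standard PyTorch defaults and measures what fraction of perfectly symmetric, identically distributed inputs it assigns to its most-predicted class before any training.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Perfectly symmetric setting: identically distributed inputs and no training,
# so any predictive skew comes from architecture and initialization alone.
n_points, dim, n_classes = 10_000, 100, 2
x = torch.randn(n_points, dim)

def majority_class_fraction(model: nn.Module) -> float:
    """Fraction of points assigned to the single most-predicted class."""
    with torch.no_grad():
        preds = model(x).argmax(dim=1)
    counts = torch.bincount(preds, minlength=n_classes).float()
    return (counts.max() / n_points).item()

# A deep ReLU MLP at initialization.
depth, width = 20, 128
layers = [nn.Linear(dim, width), nn.ReLU()]
for _ in range(depth - 1):
    layers += [nn.Linear(width, width), nn.ReLU()]
layers.append(nn.Linear(width, n_classes))
mlp = nn.Sequential(*layers)

# An unbiased predictor would give roughly 0.5; values close to 1.0 mean that
# most points are guessed into a single class before learning starts.
print(f"Majority-class fraction at initialization: {majority_class_fraction(mlp):.2f}")
```

Varying the activation, depth, normalization placement, or pooling in such a probe is one simple way to explore how the design choices discussed in the talk shape the initial skew.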
Concepts, Proxies, and the Performance of Deduction
Can AI truly do science, or does it merely reproduce the surface form of scientific reasoning? This talk argues that many failures in modern ML arise from a fundamental confusion between concepts and the statistical proxies that models actually learn. By tracing the gap between the Task (deduction, conceptual inference) and the Model (inductive pattern-matching), I show how proxy-based learning leads to conceptual collapse, spurious correlations, and social biases. I conclude by arguing that situated knowledge (rather than universalist abstractions) is necessary to evaluate and govern models whose outputs perform deduction without engaging in it.
Robust Generalization Under Misspecified Robust Risk and the Role of Limited Target Data
In practice, the data seen at test time typically differs from the data we can collect at training time. At a high level, there are two fundamentally different paradigms for obtaining a model that still performs well: one uses assumptions on the possible target shift, the other relies on (limited) data access from the target distribution. The former is typically addressed by domain and robust generalization methods, which are trained to perform well under the worst case within a specified set of expected distributions. However, this set of distributions is rarely fully known. To move towards more realistic settings, we first relax the common assumption that the robust risk is identifiable, which makes the problem even more challenging. At the same time, we show that even a little access to target data can already significantly mitigate this difficulty, and, conversely, that assumptions on the target shifts can improve upon pure transfer learning.
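For readers less familiar with the worst-case formulation mentioned above, here is a generic sketch of robust (worst-case) risk minimization over a finite set of candidate target distributions; the toy model, data, and group construction are my own assumptions and this is not the speaker's method.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy linear model and two candidate target distributions ("groups") encoding
# our assumptions about possible target shift.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.BCEWithLogitsLoss()

def make_group(shift: float, n: int = 256):
    x = torch.randn(n, 10) + shift
    y = (x.sum(dim=1) > shift * 10).float().unsqueeze(1)
    return x, y

groups = [make_group(0.0), make_group(1.0)]

for step in range(100):
    # Robust risk: evaluate every candidate distribution and take a gradient
    # step on the worst (maximum) per-group loss.
    group_losses = torch.stack([loss_fn(model(x), y) for x, y in groups])
    worst_case_loss = group_losses.max()

    optimizer.zero_grad()
    worst_case_loss.backward()
    optimizer.step()

print("Per-group losses:", [f"{loss.item():.3f}" for loss in group_losses])
```

In the setting the abstract describes, the set of candidate distributions is itself uncertain, and a handful of labeled samples from the actual target distribution can be used alongside, or instead of, such a worst-case objective.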
All submissions are available on the OpenReview website.
The 26 accepted contributions are listed below:
Augmented Lagrangian Langevin Monte Carlo for Fair Inference — Ananyapam De, Benjamin Säfken
Your Model Is Not Neutral—It's Just Well-Socialized — Ananyapam De, Benjamin Säfken
Representation Invariance and Allocation: When Subgroup Balance Matters — Anissa Alloula, Charles Jones, Zuzanna Wakefield-Skórniewska, Francesco Quinzan, Bartlomiej Papiez
On Fair and Balanced Matching in Bipartite Graphs — Beloslava Malakova, Alicja Gwiazda, Teodora Todorova
Calibrated Surrogate Losses for Robust Classification with a Reject Option — Boris Ndjia
Erase to Adapt: Random Erasing Surprisingly Enables Stable Continual Test-Time Learning — Chandler Timm Cagmat Doloriel
When Majority Rules, Minority Loses — François Bachoc, Jerome Bolte, Ryan Boustany, Jean-Michel Loubes
Mitigating Spurious Correlations in Patch-Wise Tumor Classification on High-Resolution Multimodal Images — Ihab Asaad, Maha Shadaydeh, Joachim Denzler
Do Visual Bias Mitigation Methods Generalize? A Preliminary Study Across Domains and Modalities — Ioannis Sarridis, Christos Koutlis, Symeon Papadopoulos, Christos Diou
Computing Strategic Responses to Non-Linear Classifiers — Jack Geary, Boyan Gao, Henry Gouk
Intersectional Fairness Score: The Overlooked but Far-Reaching Choice of Aggregation Design — Jeanne Monnier, Thomas George
Robust Canonicalization through Bootstrapped Data Re-Alignment — Johann Schmidt, Sebastian Stober
CausalFairness: An Open Source Python Library for Causal Fairness Analysis — Kriti Mahajan
What Do LLMs Understand About International Trade? Introducing TradeGov Dataset for International Trade Q&A Evaluation — Kriti Mahajan
Addressing Label Distribution Skew in Federated Learning with Per-Class Expert Models — Larissa Reichart, Ali Burak Ünal, Mete Akgün
Unsupervised Multi-Source Federated Domain Adaptation under Domain Diversity through Group-Wise Discrepancy Minimization — Larissa Reichart, Cem Ata Baykara, Ali Burak Ünal, Harlin Lee, Mete Akgün
Uncovering Implicit Bias in LLM Mathematical Reasoning with Concept Learning — Leroy Z. Wang
Optimal Transport under Group Fairness Constraints — Linus Bleistein, Mathieu Dagréou, Francisco Andrade, Thomas Boudou, Aurélien Bellet
Red Teaming Multimodal Language Models: Evaluating Harm Across Prompt Modalities and Models — Madison Van Doren, Casey Ford
On the Influence of SGD Hyperparameters on Robustness to Spurious Correlations — Mahdi Ghaznavi, Hesam Asadollahzadeh
CogniBias: A Benchmark for Cognitive Biases in AI–Human Dialogue — Om Dabral, Mridul Maheshwari, Sanyam Kathed, Hith Rahil Nidhan, Hardik Sharma, Abhinav Upadhyay, Bagesh Kumar, Rajkumar Saini
MADGen: Minority Attribute Discovery in Text-to-Image Generative Models — Silpa Vadakkeeveetil Sreelatha, Dan Wang, Serge Belongie, Muhammad Awais, Anjan Dutta
When Are Learning Biases Equivalent? A Unifying Framework for Fairness, Robustness, and Distribution Shift — Sushant Mehta
When Non-Commutativity Breeds Unfairness: A Geometric–Algebraic View of Uncertainty in VAEs — Tahereh Dehdarirad, Gabriel Eilertsen, Michael Felsberg
The Role of Outcome Imbalance in Fairness Over Time — Tereza Blazkova
SATA-Bench: Select All That Apply Benchmark for Multiple Choice Questions — Weijie Xu, Shixian Cui, Xi Fang, Chi Xue, Stephanie Eckman, Chandan K. Reddy
We invite researchers to submit papers exploring the themes of our workshop, with a special focus on contributions that promote a unified understanding of learning biases. We are especially excited to receive interdisciplinary and theoretical work that draws and explores connections between different subfields of machine learning, aiming to address fundamental questions such as:
Under which conditions are different mechanisms that resemble class imbalance quantitatively equivalent?
Can different sources of bias be controlled in such a way that they mitigate one another?
How can a unified understanding of these biases lead to the development of more intrinsically fair and robust machine learning systems?
The scope of the workshop includes, but is not limited to, the following themes:
Class and subpopulation imbalance
Spurious correlations and shortcut learning
Dataset shift and out-of-distribution generalisation
Algorithmic bias and fairness in machine learning
Biases emerging from model initialisation or architectural design
We invite submissions of both regular papers (up to 5 pages) and tiny papers (up to 2 pages).
All page limits exclude references and supplementary material.
To ensure fairness, our review process is double-blind. Submissions must be fully anonymised, with no author names or affiliations appearing in the paper. Please avoid any self-identifying statements or links. Papers that are not properly anonymised will be desk-rejected without review.
All submissions must be formatted using the official NeurIPS LaTeX style files.
All accepted papers will be presented during our poster sessions. A select few will be chosen for short oral presentations in addition to their poster.
Submissions are managed through OpenReview.
Here are the key deadlines for the workshop. Please note that all deadlines are Anywhere on Earth (AoE).
Paper Submission Open: September 15, 2025
Paper Submission Deadline: October 10, 2025
Paper Acceptance Notification: October 31, 2025
Q: My contribution has been accepted to the Workshop. What format should the poster be?
Posters should be in A1 portrait format.
Q: My contribution has been accepted to the Workshop. Where should I hang my poster?
Use the hangers located outside the Auditorium. The spaces reserved for our workshop are numbers 59-84; please choose any available spot.
Q: Will this workshop have proceedings?
No, this workshop will have no official proceedings. This means that you are free to publish a revised or extended version of your work at a future archival conference or journal. Submitting to our workshop does not preclude you from submitting elsewhere.
Q: Will the accepted papers be publicly available?
Yes, accepted papers and their reviews will be made publicly available on the OpenReview page for the workshop. This provides a lasting record of the work presented and the discussions that took place.
Q: Why should I submit a contribution if it does not count as a publication?
The primary purpose is to gather feedback on your work from the community. In particular, it allows you to present early-stage ideas, preliminary results, or ongoing projects; receive constructive input from attendees; and refine your work before developing it into a full paper for an archival journal or conference.
Q: What is the purpose of a "tiny paper"? What kind of work is suitable for this format?
Tiny papers are intended for showcasing preliminary results, novel ideas, or position statements that can be communicated concisely. They are a great way to get feedback on early-stage research or to highlight a specific, focused contribution that may not require a full-length paper.
Q: What is your policy on dual submissions?
We welcome submissions of work that is currently under review at other venues. We believe in the open exchange of ideas and want to provide a platform for feedback on ongoing research.
Q: Can I submit a paper that is already on arXiv?
Yes, absolutely. The existence of a pre-print on services like arXiv will not be considered a violation of our double-blind review policy.
Q: Do the page limits (5 pages for regular, 2 pages for tiny) include references and supplementary material?
No. The page limits apply only to the main content of the paper. You may have an unlimited number of pages for references and for any supplementary material or appendices.
Q: The LaTeX template contains a lengthy checklist. Is it required for workshop submissions?
No, you are not required to include the checklist with your submission. The checklist is a specific requirement for the main NeurIPS conference and is not necessary for our workshop.