The 5th Workshop on Uncertainty Reasoning and Quantification in Decision Making
(held in conjunction with ACM SIGKDD 2026)
August 9, 2026, Jeju, Korea
Deep neural networks (DNNs) have received tremendous attention and achieved great success in various applications, such as image and video analysis, natural language processing, recommendation systems, and drug discovery. However, inherent uncertainties arising from different root causes remain a serious hurdle to finding robust and trustworthy DNN-based solutions for real-world problems. Failing to account for such uncertainties may lead to unnecessary risk. For example, a self-driving car can misclassify a human on the road, and a deep learning-based medical assistant may misdiagnose cancer as a benign tumor. Uncertainty has therefore attracted growing attention from academia and industry, particularly for decision-making problems such as autonomous driving and diagnosis systems. The resulting wave of research at the intersection of uncertainty reasoning and quantification in data mining and machine learning has also influenced other fields, including computer vision, natural language processing, reinforcement learning, and social science.
Important Dates
The following are the proposed important dates for the workshop. All deadlines are at 11:59 pm Pacific Time.
Paper Submission: May 15th, 2026 (extended from April 30th, 2026)
Paper Notification: June 4th, 2026
Workshop Date: August 9th, 2026, Morning
Topics of Interest
This workshop will provide a premier platform for researchers and industry practitioners from different backgrounds to exchange ideas on opportunities, challenges, and cutting-edge techniques in uncertainty reasoning and quantification. We encourage submissions at various stages of progress, such as new results, visions, techniques, innovative application papers, and progress reports, on topics that include, but are not limited to, the following broad categories:
Uncertainty quantification in foundation models
Uncertainty reasoning in foundation models
Decision-making with foundation models
Uncertainty quantification in classification and regression
Out-of-distribution detection
Conditional reasoning with uncertainty
Quantification of multidimensional uncertainty
Sequential uncertainty estimation
Interpretation of uncertainty
Uncertainty-aware deep reinforcement learning
Decision-making with uncertainty
The workshop has a particular focus on, but is not limited to, the following application domains:
Application of an autonomous system
Application of uncertainty methods in a large-scale dataset
Computer vision (uncertainty in face recognition, object relation)
Natural language processing (language uncertainty, sentence uncertainty)
Reinforcement learning (uncertainty-aware offline reinforcement learning, exploration vs. exploitation)
Application of uncertainty methods in foundation models
Submission Guidelines
Submissions are limited to a total of 5 pages, including all content and references. There is no page limit for supplemental materials. All submissions must be in PDF format and use the ACM Conference Proceedings template (two-column format). One recommended setting for the LaTeX file of an anonymous manuscript is: \documentclass[sigconf, anonymous, review]{acmart}. Template guidelines are here: https://www.acm.org/publications/proceedings-template.
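For authors starting from scratch, the recommended setting above might be used in a manuscript skeleton like the following. This is only a minimal sketch: the title, author details, and the bibliography file name references.bib are placeholders assumed for illustration, and the official ACM template remains the authoritative reference.

```latex
% Minimal sketch of an anonymous submission using the acmart class.
% All names below (title, author, references.bib) are placeholders.
\documentclass[sigconf, anonymous, review]{acmart}

\begin{document}

\title{Your Paper Title}

% With the "anonymous" option, acmart hides author details in the PDF.
\author{First Author}
\affiliation{%
  \institution{Your Institution}
  \country{Your Country}}

% In acmart, the abstract is placed before \maketitle.
\begin{abstract}
A short summary of the paper.
\end{abstract}

\maketitle

\section{Introduction}
Main text goes here.

\bibliographystyle{ACM-Reference-Format}
\bibliography{references} % assumes a file named references.bib

\end{document}
```

The anonymous option suppresses author names and affiliations in the compiled PDF, in line with the double-blind policy, and the review option adds line numbers for reviewers.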
Following the KDD conference submission policy, reviews are double-blind, and author names and affiliations should NOT be listed. Submitted papers will be assessed based on their novelty, technical quality, potential impact, and clarity of writing. For papers that rely heavily on empirical evaluations, the experimental methods and results should be clear, well-executed, and repeatable. Authors are strongly encouraged to make data and code publicly available whenever possible.
Submit your paper through the UDM workshop CMT submission site: https://cmt3.research.microsoft.com/UDM2026/
The Microsoft CMT service was used for managing the peer-reviewing process for this conference. This service was provided for free by Microsoft and they bore all expenses, including costs for Azure cloud services as well as for software development and support.
Paper Acceptance
Accepted workshop papers will be posted on the workshop website, but will NOT be included in the official KDD proceedings.
Upon notification, we ask that authors of accepted works deanonymize their papers, make any final changes, and then submit a camera-ready version to the CMT submission site. The workshop website will then be updated with links to accepted papers. Note that accepted works will not be formally published. This means that:
Authors can retain full copyright of their works.
Work contained in accepted papers is not precluded from being published in other research venues.
Submitted papers are allowed to have significant overlap with previously published or currently submitted work (in this case, please indicate overlapping works).
Any questions may be directed to the email address: chen_zhao@baylor.edu
Attendance
For each accepted paper, at least one author must attend the conference and present the paper.
Huaming Chen, The University of Sydney, Australia
Huaming Chen is an AI scientist and researcher with research interests in trustworthy machine learning and its applications. He is currently a Senior Lecturer with the School of Electrical and Computer Engineering, The University of Sydney, Sydney, NSW, Australia. His research has been published in leading AI conferences, including ICLR, ICML, KDD, WWW, ECML/PKDD, and AAAI. His work has received numerous awards, including the Best Paper Award in the Research Track at ECML/PKDD 2025, of which he was the sole recipient. He actively serves on organising and program committees, and as an associate editor and reviewer, for top international journals and conferences, such as ICLR, ICML, KDD, IJCAI, ACM MM, ICSE, FSE, AISTATS, and UAI.
Xiaofeng Gao, Shanghai Jiao Tong University, China
Dr. Xiaofeng Gao is a tenure-track full professor at the School of Computer Science, Shanghai Jiao Tong University. She received her B.S. in Information and Computational Science from Nankai University, M.S. in Operations Research and Control Theory from Tsinghua University, and Ph.D. in Computer Science from The University of Texas at Dallas. Her research focuses on data engineering and combinatorial optimization. She has published over 400 peer-reviewed papers in leading journals, including IEEE TKDE, IEEE TMC, ACM/IEEE TON, IEEE TC, IEEE TPDS, and IEEE TKDD, as well as top conferences such as SIGKDD, SIGIR, ICDE, VLDB, ICDM, WWW, NeurIPS, IJCAI, AAAI, ICML, with 12 Best Paper Awards, including WASA 2025, ADMA 2023, APWEB-WAIM 2022, DASFAA 2017, and ICPADS 2016.
Dr. Gao is a recipient of the National Young Talent Program and serves as Vice Head of the CCF Technical Committee on Distributed Computing and Systems (CCF DCS). She has served as Program Chair for ICDM 2026, COCOON 2024, ISCO 2018, and COCOA 2017, and as General Chair for CSoNet 2022.
Chang-Tien Lu, Virginia Tech, USA
Dr. Chang-Tien Lu is a professor in the Department of Computer Science, curriculum lead in the Institute for Advanced Computing, and associate director of the Sanghani Center for AI and Data Analytics at Virginia Tech. Dr. Lu’s research interests include spatial informatics, urban computing, artificial intelligence, and intelligent transportation systems. He has published over 250 articles in top-rated journals and conference proceedings, and his research has been supported by NSF, NIH, DoD, DoE, IARPA, and DOT. He is an ACM Distinguished Scientist and IEEE Fellow.
Dr. Lu currently serves as an associate editor of ACM Transactions on Spatial Algorithms and Systems, Data & Knowledge Engineering, IEEE Transactions on Big Data, and GeoInformatica. He regularly serves on conference organizing and program committees, including as Program Chair of IEEE ICTAI in 2006, and General Chair of the ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems in 2009, 2020, and 2021; the International Symposium on Spatial and Temporal Databases (SSTD) in 2017; IEEE Big Data in 2024; and IEEE ICDM in 2025. He also served as Secretary (2008–2011) and Vice Chair (2011–2014) of ACM SIGSPATIAL, playing a pivotal role in advancing the field and the broader computing research community.
Posters:
Neural Routed Boosting: Robust Decision-Making under Heteroscedastic Noise
Puspak Chakraborty, Arun Rajkumar
Ensemble methods struggle with heteroscedastic label uncertainty, limiting their reliability in downstream decision-making tasks. Traditional algorithms like AdaBoost overfit to corrupted instances, while robust variants that rely on global conditional risk estimation often fail to capture spatially varying uncertainty. We propose Neural Routed Boosting, a novel framework for conditional reasoning and robust classification under localized uncertainty. Neural Routed Boosting utilizes a lightweight neural network to capture spatial variations in the data. This network partitions the input space into geometrically coherent regions. It then routes samples to local experts. This structural approach isolates high-uncertainty subspaces, preventing them from corrupting the decision boundaries of reliable regions. We prove that this composite architecture of a neural router and region-specific experts constitutes a valid weak learner, guaranteeing training error convergence. Evaluations on synthetic and real-world datasets demonstrate that our spatial approach to uncertainty quantification outperforms traditional and robust baselines in complex environments.
Oral Presentations:
From Topology To Trajectory: LLM-Driven World Models For Supply Chain Resilience
Jia Luo
Semiconductor supply chains face unprecedented resilience challenges amidst global geopolitical turbulence. Conventional Large Language Model (LLM) planners, when confronting such non-stationary “Policy Black Swan” events, frequently suffer from Decision Paralysis or a severe Grounding Gap due to the absence of physical environmental modeling. This paper introduces ReflectiChain, a cognitive agentic framework tailored for resilient macroeconomic supply chain planning. The core innovation lies in the integration of Latent Trajectory Rehearsal powered by a generative world model, which couples reflection-in-action (System 2 deliberation) with delayed reflection-on-action. Furthermore, we leverage a Retrospective Agentic RL mechanism to enable autonomous policy evolution during the deployment phase (test-time). Evaluations conducted on our high-fidelity benchmark, Semi-Sim, demonstrate that under extreme scenarios such as export bans and material shortages, Silicon Janus achieves a 250% improvement in average step rewards over the strongest LLM baselines. It successfully restores the Operability Ratio (OR) from a deficient 13.3% to over 88.5% while ensuring robust gradient convergence. Ablation studies further underscore that the synergy between physical grounding constraints and double-loop learning is fundamental to bridging the gap between semantic reasoning and physical reality for long-horizon strategic planning.
DBE-Net: A Dual-dimensional Band Encoder Network for Unsupervised Anomaly Detection in Self-Piercing Riveting
Chen Hu, Yujie Wan, Yang Luo, Xiaofeng Gao
Self-Piercing Riveting (SPR) is critical to ensuring the structural safety of New Energy Vehicle (NEV) body-in-white assembly. However, conventional anomaly detection approaches typically ignore inter-series process dependencies and fail to extract subtle defect features hidden in strong global trends. To address these issues, this paper proposes DBE-Net, an unsupervised anomaly detection framework tailored for SPR quality monitoring. It combines Induced Set Attention Block (ISAB) to capture process-level correlations across sequences, Variational Mode Decomposition (VMD) to separate multi-frequency signal components, and Deep SVDD for compact one-class normality modeling. Extensive experiments on the industrial SPRAD dataset demonstrate that DBE-Net outperforms state-of-the-art methods in AUC, precision, recall, and F1-score, accurately identifying micro-defects while suppressing high-frequency noise. This work provides a reliable, high-precision intelligent detection solution for real-world NEV manufacturing quality assurance.
WiSDoM: Frugal Winner Selection by Design of Matchups
Saranath P
Identifying the best of 𝑁 items from noisy pairwise comparisons is a core primitive of modern machine-learning pipelines (RLHF annotation, LLM-judge arenas, policy selection), where each comparison is expensive and the available budget is a small multiple of 𝑁. Under the Bradley–Terry–Luce model, we recast winner identification as a winner-focused experimental design problem: pick a distribution over pairs that minimizes the worst-case posterior variance of the winner’s score gap against the top challengers. The design is convex but only useful with a reliable warm start, so we pair it with a best-of-𝑡 single-elimination bracket using Elo scoring to secure winner coverage. The resulting algorithm, WiSDoM, recovers the top item effectively under shoestring budgets and outperforms prior budgeted BTL winner-identification baselines across synthetic and real-world experiments.
Less Can Be Safer: Tail-Risk Reduction via Capacity-Constrained Policies in Industrial Reinforcement Learning
Minu Baek, Gihun Gil, Yeojin Jang, Byounghoon Son, Beomdo Park, Minsung Jung, Junseong Park, Hyeonseok Jang, Sangkeum Lee
Average reward can hide unacceptable rare failures in safety-critical industrial control. We present a workshop study of capacity-constrained reinforcement learning for papermaking control, using a data-validated digital twin built from 11 months of industrial operation and 687 production lots. The policy class uses shallow variational quantum circuit features as a deliberately small representation inside PPO; the workshop claim is not quantum computational advantage, but tail-risk reduction under process-derived feedback. Across 600 evaluation episodes on 120 lots and five seeds, the shallowest policy reduces failure rate to 0.33% compared with 6.0% for a classical PPO baseline, improves 5th-percentile safe-zone compliance from 13.6% to 55.5%, and reduces variance by 2.44x while mean performance is not significantly different. We introduce the Proactive Safety Index as a descriptive signal for anticipatory speed reduction. The evidence is limited to in-distribution digital-twin simulation. We submit this result as a case study in uncertainty-aware decision making where safety depends on rare-event behavior rather than mean performance.
Travel-Oriented Reasoning Large Language Model via Domain-Specific Knowledge Graphs
Vignesh Ram Nithin Kappagantula, Shayan Hassantabar, Samuel Simpson, Golnaz Moallem
Large language models (LLMs) demonstrate broad reasoning abilities but struggle with accuracy and reliability in specialized domains such as travel, where reasoning depends on precise definitions, rules, and expert-defined conceptual frameworks, and where confident but unfounded outputs arise from a reasoning failure in which the model has not internalized the underlying domain graph rather than from missing domain knowledge alone. We propose a modular pipeline for building a travel-domain reasoning LLM grounded in an expert-designed knowledge graph (KG). Our pipeline integrates a travel KG that encodes domain entities and their relationships, a bottom-up construction procedure that walks the KG to produce multi-hop question answer (QA) pairs, a supervised fine-tuning stage that embeds the domain knowledge into a reasoning-capable LLM using the generated QA pairs as auditable reasoning traces, and a travel-domain benchmark dataset that measures the fine-tuned model's accuracy and calibration. We evaluate our approach using Qwen3-4B with LoRA adaptation. Our reasoning model achieves an 82.4% exact match on the benchmark. This performance significantly outperforms the pretrained Qwen3-4B baseline at 22.4%. A calibration analysis decomposes the residual 17.57% of errors into two distinct failure modes: an over-confident multi-label decoder that predicts both correct answers plus one spurious option on most dual-answer mistakes, and a smaller reasoning failure on single-answer questions where the supporting facts are present in the KG but the model fails to reconstruct the correct multi-hop path. This split confirms that explicit KG-grounded reasoning substantially improves the accuracy and uncertainty interpretation of LLMs in specialized domains, and isolates per-option calibration and trace-length-aware decoding as the next axes of improvement.
Claim-Level Confidence Calibration for Reliable Decision Making with Large Language Models
Toghrul Abbasli
Large Language Models (LLMs) increasingly support decision-making in high-stakes domains, but they often hallucinate and express confidence that is misaligned with factual correctness. Response-level confidence is a coarse signal: a single generation can mix correct and incorrect statements, so a single number is not actionable for users who must accept, reject, or verify individual pieces of information. We study claim-level confidence calibration as a decision-relevant uncertainty signal: each response is decomposed into atomic, verifiable claims, and each claim is assigned a calibrated confidence using inference-time signals from consistency across samples and self-verification. Our framework operates in closed-box settings (no logits, no fine-tuning) and applies post-hoc calibration directly at the claim level, enabling selective intervention such as evidence retrieval or human review for low-confidence claims. Across TriviaQA and TruthfulQA we evaluate seven baselines on six recent models (Llama-3.1, Mistral, Qwen2.5, DeepSeek-R1, GPT-4, GPT-4o), and show that claim-level decomposition combined with post-hoc calibration reduces expected calibration error on factual questions while exposing failure modes on adversarial false-premise questions where decision-makers most need reliable uncertainty estimates.
Optimization-based Online Conformal Prediction for Multi-step Forecasting
Ruipu Li, Daniel Menacho, Alexander Rodríguez
Conformal prediction (CP) is well-suited for uncertainty quantification in time series forecasting due to its distribution-free coverage guarantees. However, existing multi-step methods often struggle to balance coverage validity with efficiency: they either calibrate horizons independently—ignoring temporal correlations—or enforce strict simultaneous coverage, resulting in overly conservative intervals. In this work, we propose O2CP: Optimization-based Online Conformal Prediction, a unified framework for online conformal prediction that explicitly models multi-step error dependencies without sacrificing long-term marginal coverage guarantees. We first prove that standard online conformal updates maintain validity as long as calibration parameters remain within a defined "safe" region. Leveraging this theoretical insight, we introduce a two-layer architecture: an outer layer that defines admissible parameter sets to ensure validity, and an inner layer that performs constrained optimization to model joint error distributions and minimize horizon-wide objectives. To make this computationally feasible, we develop a lightweight sampling strategy that estimates joint distributions without requiring large calibration sets. Extensive experiments on real-world datasets—including autonomous driving, climate forecasting, and public health—demonstrate that O2CP consistently outperforms state-of-the-art baselines, achieving target coverage with significantly sharper prediction intervals and reduced regret over long horizons.
GLU: Global-Local Uncertainty Quantification in LLMs
Johanne Medina, Tianyi Zhou, Keivin Isufaj, Aristides Gionis, Sanjay Chawla
Large language models hallucinate confidently, making uncertainty quantification (UQ) essential for reliable deployment. Existing methods rely predominantly on token-level signals, leaving the geometric structure of intermediate hidden states underused. In this paper, we take the geometric complexity of hidden-state matrices as a measure of the global uncertainty of LLMs, while treating token-level uncertainty estimation as a local metric. We show that hidden-state geometric entropy (global uncertainty) and token-level entropy (local uncertainty) are statistically near-orthogonal, capturing distinct failure regimes for reliability prediction. In particular, global geometry recovers the confident-but-wrong failure mode that local signals systematically miss. Building on this, we propose Global-Local Uncertainty (GLU), an unsupervised, single-pass score that fuses the two signals via a multiplicative gate. Across three model families and six benchmarks, GLU matches or outperforms all unsupervised baselines while being a single forward pass, length-normalized, and architecture-agnostic.
NG-MMoE: Improving Multi-Task Mixture-of-Experts via Learnable Stochastic Gating
Connor Lee, Carissa Lee
Multi-gate Mixture-of-Experts (MMoE) has demonstrated strong performance in multi-task learning by learning task-specific gating over a shared pool of expert networks. However, the deterministic softmax gating in MMoE does not model uncertainty in the routing decision, causing it to collapse to suboptimal expert assignments, particularly when tasks are weakly correlated or training labels are sparse. We propose Noisy-Gate MMoE (NG-MMoE), which augments each per-task gate with input-conditioned learnable Gaussian noise during training. The noise acts as a principled uncertainty-aware exploration mechanism over the expert routing space, smoothing the loss landscape and preventing premature gate collapse. We provide theoretical motivation grounded in stochastic optimization, information theory, and the exploration-exploitation trade-off. Experiments on UCI Census-income and MovieLens 100k/1M show that NG-MMoE (i) achieves lower average loss and variance than MMoE and (ii) produces higher-entropy gate distributions, confirming better expert utilization under routing uncertainty.
Ram Prasad Nethi (Amazon Web Services)
Venkata Ratna Kumar Bonagiri (Macys)
Ankur Gupta (LinkedIn)
Ankur Bhatnagar (Macys)
Shatrughna Upadhyay (Intuit Inc)
Shahazad Qurashi (Jazan University)
Zixuan Wang (TikTok)
Xinyu Wu (Baylor University)
Denglin Jiang (Bloomberg)
Xiang Fang (Baylor University)
Zhuosheng Liu (UC Davis)
Tingshuo Miao (Baylor University)