Uncertainty Quantification
for Text Classification
Outline
Introduction
Text classification
Why do we need uncertainty
Where does uncertainty come from: aleatoric vs epistemic
How can we use uncertainty in text classification
Approaches
Virtual Ensemble in GBDT
Bayesian Deep Learning
Deep Ensemble
Monte-Carlo Dropout
Bayes by Backprop
Generalization to Epistemic Neural Networks
Evidential Deep Learning
The Dirichlet distribution
Prior Networks
Posterior Networks
Distance Awareness
Spectral-normalized Neural Gaussian Process
Deep Deterministic Uncertainty
Evaluation
Test Scenarios
In-domain calibration
Cross-domain robustness
Novel class detection
Performance Metrics
ECE
NLL and Brier Score
AUROC/AUPR and FAR90
Practical Recommendations
Recent Developments
Multi-Label OOD Detection
Uncertainty with (Large) Language Models
Language models know what they know
Uncertainty interpretation of text classifiers built on LLMs
Uncertainty estimation in text generation
Calibration of language models
Calibration for in-context learning
Authors
Dell Zhang, Thomson Reuters Labs, London, UK
Murat Sensoy, Amazon Alexa AI, London, UK
Masoud Makrehchi, Thomson Reuters Labs, Toronto, Canada
Bilyana Taneva-Popova, Thomson Reuters Labs, Zug, Switzerland
Lin Gui, King’s College London, London, UK
Yulan He, King’s College London & Alan Turing Institute, London, UK
Slides
Available upon request.
Data
Code
The following tutorials and notebooks provide example implementations of deep learning uncertainty methods. The methods are implemented in TensorFlow and inspired by official TF tutorials and other public libraries (e.g., Google's Uncertainty Baselines).
SNGP model using toy 2D data (i.e., scikit-learn's two-moons dataset). See this TF tutorial.
BERT-SNGP model, where SNGP is built on top of a BERT encoder to improve the model's ability in detecting out-of-scope queries.
The CLINC150 dataset is used. See this Colab notebook, which closely follows the TF tutorial and adds further uncertainty quantification metrics for out-of-domain detection as well as in-domain evaluation.
Deterministic TextCNN model, inspired by Google's Uncertainty Baselines library (see here). We train a single TextCNN model on the CLINC150 dataset as well as a Deep Ensemble of TextCNN models, and evaluate their OOD detection performance (see this Colab notebook).
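To illustrate what the Deep Ensemble notebook evaluates, here is a minimal, framework-free sketch (not the notebook's actual TensorFlow code; all names and the toy logits are hypothetical) of the core idea: average the softmax outputs of several independently trained members, then use predictive entropy as an OOD score.

```python
# Hypothetical sketch of deep-ensemble averaging and entropy-based OOD
# scoring; real notebooks use trained TextCNN members, not toy logits.
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def ensemble_predict(logits_per_member):
    """Average the softmax outputs of all ensemble members."""
    probs = [softmax(l) for l in logits_per_member]
    n = len(probs)
    return [sum(p[k] for p in probs) / n for k in range(len(probs[0]))]

def predictive_entropy(probs):
    """Higher entropy -> more uncertain -> more likely out-of-domain."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Members that agree -> confident, low-entropy prediction (in-domain).
agree = [[4.0, 0.0, 0.0], [3.5, 0.2, 0.1], [4.2, -0.1, 0.0]]
# Members that disagree -> near-uniform average, high entropy (OOD flag).
disagree = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]

p_in = ensemble_predict(agree)
p_out = ensemble_predict(disagree)
assert predictive_entropy(p_in) < predictive_entropy(p_out)
```

Thresholding this entropy score is one common way to decide whether a query is out-of-scope, which is the test scenario used with CLINC150 above.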
Other related materials:
Jordy Van Landeghem's uncertainty-bench repository on GitHub.
Liudmila Prokhorenkova's tutorial on uncertainty estimation with CatBoost, with an accompanying Jupyter notebook.
Murat Sensoy's original Colab notebook on Evidential Neural Network.
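The evidential notebook above builds on the Dirichlet view of classification. A minimal sketch (hypothetical function names; not the notebook's code) of the quantities involved: non-negative per-class evidence e_k yields Dirichlet parameters alpha_k = e_k + 1, belief masses b_k = e_k / S with S = sum of alpha, and a leftover uncertainty mass u = K / S, so that the beliefs and u sum to one.

```python
# Hypothetical sketch of Dirichlet-based uncertainty in Evidential
# Deep Learning: evidence -> Dirichlet parameters -> belief + uncertainty.
def dirichlet_uncertainty(evidence):
    K = len(evidence)
    alpha = [e + 1.0 for e in evidence]        # Dirichlet parameters
    S = sum(alpha)                             # Dirichlet strength
    belief = [(a - 1.0) / S for a in alpha]    # per-class belief mass
    u = K / S                                  # leftover uncertainty mass
    prob = [a / S for a in alpha]              # expected class probabilities
    return belief, u, prob

# Strong evidence for class 0 -> small uncertainty mass.
_, u_conf, _ = dirichlet_uncertainty([20.0, 0.0, 0.0])
# No evidence at all -> maximal uncertainty (u = 1).
_, u_unk, _ = dirichlet_uncertainty([0.0, 0.0, 0.0])
assert u_unk == 1.0 and u_conf < u_unk
```

In practice a neural network outputs the evidence vector, and u serves as a single scalar uncertainty score for detecting novel classes.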
Thursday, 6th April 2023. Half-Day (Afternoon).
Swift Suite, Radisson Blu Royal Hotel, Golden Lane, Dublin 2, D08 VRR7, Ireland
1:30pm-3:00pm Session 1/2
3:00pm-3:30pm Coffee Break
3:30pm-5:00pm Session 2/2
Sunday, 23rd July 2023. Full-Day (Online Only).
Time Zone: CST (GMT+8)
10:00PM-11:30PM Session 1/4 DZ
11:30PM-12:00AM Short Break
12:00AM-01:30AM Session 2/4 MS
01:30AM-02:30AM Long Break
02:30AM-04:00AM Session 3/4 DZ
04:00AM-04:30AM Short Break
04:30AM-06:00AM Session 4/4 YH+LG
[NOTICE] Due to an unexpected technical problem with the EventX platform, the live streaming of our SIGIR'23 Virtual Tutorial has been cancelled. We sincerely apologize for the inconvenience caused. The tutorial slides and recorded video clips will be provided to the conference participants later.