Niko Brummer's home page
Aliases to this page: tinyurl.com/nbrummer, niko.brummer.googlepages.com
Fully Bayesian stuff
Here is a very simple, but practical, fully Bayesian, generative, multiclass pattern classifier: derivation, MATLAB implementation.
Some (in-progress) notes analysing various aspects of the problem of forensic likelihood-ratio calibration, with the aim of working towards more Bayesian solutions:
Integrating out model parameters in generative and discriminative classifiers. [PDF].
What is the ‘relevant population’ in Bayesian forensic inference? Available: arxiv.org/abs/1403.6008.
Tutorial for Bayesian forensic likelihood ratio. Available: arxiv.org/abs/1304.3589.
Fully Bayesian Score Calibration assuming Gaussian Distributions. [PDF]
See also 'Bayesian PLDA' below.
"Fully Bayesian Forensic LR: Extending the paradigm shift", presentation at the NFI, Netherlands, October 2011. [pdf]
Niko Brummer and Albert Swart, 'Bayesian calibration for forensic evidence reporting', submitted to Interspeech 2014. Available: arxiv.org/abs/1403.5997.
Text-dependent speaker verification
The terminology in this field can be confusing. Here is a proposal for how to name the various possible outcomes when verifying combinations of speakers and phrases: tinyurl.com/TextDepedentSV.
BOSARIS Toolkit
This is the successor to the FoCal Toolkit. The BOSARIS Toolkit provides MATLAB code for calibrating, fusing and evaluating scores from (automatic) binary classifiers. It was developed to provide solutions for automatic speaker recognition, but we envision that much of the code will have wider applicability to other biometric and/or forensic problems where the calibration of likelihood-ratios is of interest.
The BOSARIS Toolkit User Guide: Theory, Algorithms and Code for Binary Classifier Score Processing.
The code and user manual are here.
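To illustrate why calibrated likelihood-ratios are useful, here is a minimal Python sketch (not taken from the MATLAB toolkit) of a minimum-expected-cost Bayes decision made from a calibrated log-likelihood-ratio. The function names are hypothetical, and the sketch assumes natural-log LLRs and the usual prior/cost parametrization of a detection cost function.

```python
import math

def effective_prior(p_target, c_miss=1.0, c_fa=1.0):
    """Fold the miss and false-alarm costs into a single effective target prior."""
    num = p_target * c_miss
    return num / (num + (1.0 - p_target) * c_fa)

def bayes_threshold(p_target, c_miss=1.0, c_fa=1.0):
    """LLR threshold for minimum expected cost: minus the log-odds of the effective prior."""
    p = effective_prior(p_target, c_miss, c_fa)
    return -math.log(p / (1.0 - p))

def decide(llr, p_target, c_miss=1.0, c_fa=1.0):
    """Accept the target hypothesis when the calibrated LLR exceeds the threshold."""
    return llr > bayes_threshold(p_target, c_miss, c_fa)

if __name__ == "__main__":
    # With a 1% target prior and equal costs the threshold is log(99).
    print(bayes_threshold(0.01))
    print(decide(5.0, 0.01), decide(4.0, 0.01))
```

If the scores are well calibrated, the same LLR can be reused for any prior and cost setting simply by moving this threshold.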
Selected Papers
Niko Brummer, "Application-Independent Evaluation of Speaker Detection", Odyssey 2004.
Niko Brummer and Johan du Preez, "Application Independent Evaluation of Speaker Detection", Computer Speech and Language, 2006.
Niko Brummer and David van Leeuwen, "On calibration of language recognition scores", Odyssey 2006.
David van Leeuwen and Niko Brummer, "Channel-dependent GMM and Multi-class Logistic Regression", Odyssey 2006.
Niko Brummer, Lukas Burget, et al., "Fusion of Heterogeneous Speaker Recognition Systems in the STBU Submission for the NIST Speaker Recognition Evaluation 2006", IEEE TASLP, vol. 15, no. 7, Sept. 2007.
Niko Brummer et al., "Discriminative Acoustic Language Recognition via Channel-Compensated GMM Statistics", Interspeech 2009. [Paper: pdf][Presentation: pdf]
Niko Brummer and Edward de Villiers, "The Speaker Partitioning Problem", Odyssey 2010. PDF.
Jesus Villalba and Niko Brummer, "Towards Fully Bayesian Speaker Recognition: Integrating Out the Between-Speaker Covariance", accepted for Interspeech 2011. PDF.
Niko Brummer and Edward de Villiers, "The BOSARIS Toolkit: Theory, Algorithms and Code for Surviving the New DCF", NIST SRE'11 Analysis Workshop, Atlanta, December 2011. [Paper: pdf][Presentation: pdf]
David van Leeuwen and Niko Brummer, "The distribution of calibrated likelihood-ratios in speaker recognition", Interspeech 2013. http://arxiv.org/abs/1304.1199
Niko Brummer and George Doddington, "Likelihood-ratio calibration using prior-weighted proper scoring rules", Interspeech 2013. http://arxiv.org/abs/1307.7981
Niko Brummer and Daniel Garcia-Romero, "Generative Modelling for Unsupervised Score Calibration", accepted for ICASSP 2014. http://arxiv.org/abs/1311.0707. Here are some additional notes about computing the Hessian for the Laplace approximation: The EM algorithm and the Laplace Approximation.
Niko Brümmer, Albert Swart and David van Leeuwen, "A comparison of linear and non-linear calibrations for speaker recognition", Odyssey 2014, available: http://arxiv.org/abs/1402.2447. Presentation slides.
Niko Brummer and Albert Swart, 'Bayesian calibration for forensic evidence reporting', accepted for Interspeech 2014. Available: arxiv.org/abs/1403.5997.
Anya Silnova, Niko Brummer, Johan Rohdin, Themos Stafylakis, Lukas Burget, "Probabilistic embeddings for speaker diarization", Odyssey 2020. Available: https://arxiv.org/abs/2004.04096. Awarded: Jack Godfrey Best Student Paper.
Luciana Ferrer, Mitchell McLaren, Niko Brummer, "A Speaker Verification Backend with Robust Performance across Conditions", 2021, https://arxiv.org/abs/2102.01760.
Ph.D.
Niko Brummer, Measuring, refining and calibrating speaker and language information extracted from speech, Ph.D. dissertation, University of Stellenbosch, December 2010.
Ph.D. Oral defense presentation: defence.pdf.
(Also online: here or here.)
Book chapter
David van Leeuwen and Niko Brümmer, An Introduction to Application-Independent Evaluation of Speaker Recognition Systems, in Speaker Classification I: Fundamentals, Features, and Methods, Christian Müller (Ed.), Springer 2007.
Invited Talks
"Calibration of Binary (Speaker Recognition) and Multiclass (Language Recognition) Statistical Pattern Recognizers", at ATVS UAM, 2008. [slides].
"Calibration of Likelihood-Ratios in Automatic Speaker Recognition: Applicability to other Forensic Technologies", presented at BBfor2 Workshop, IDIAP, Martigny, December 2011. [pdf]
"The Role of Proper Scoring Rules in Training and Evaluating Probabilistic Speaker and Language Recognizers", Odyssey 2012, 25-28 June, Singapore. See abstract and details at: http://www.odyssey2012.org/plenary.html. Presentation is here. Video here.
"Binary and Multiclass Calibration in Speaker and Language Recognition", ASRU 2013, Olomouc. Slides: tinyurl.com/BrummerASRU13. Video: http://www.superlectures.com/asru2013.
"Bayesian Calibration for Forensic Speaker Recognition", HLT Winter School, CSIR, Pretoria, July 2014. PDF Slides.
"Bayesian Calibration for Forensic Evidence Reporting", Keynote, ICFIS 2014, Leiden, Netherlands, August 2014. Slides [PDF]: tinyurl.com/BrummerICFIS14.
Notes
This note interprets the classical i-vector recipe in terms of mean-field variational Bayes. This interpretation also works for the newer phonetic i-vector extractors. It also suggests new ways to use VB to calibrate the GMM or phonetic state posteriors to fit the extractor model better.
Some notes written during the BOSARIS workshop:
Calculus of likelihood ratios: PDF.
Bayesian PLDA: My original notes, Jesus's implementation, and our Interspeech 2011 paper.
Some things which are good to know when computing first and second order partial derivatives for large-scale numerical optimization: PDF.
THE EM ALGORITHM AND MINIMUM DIVERGENCE:
General theory: PDF.
Applied to JFA-style GMM modeling: EM4JFA.PDF.
Applied to PLDA:
EM for Probabilistic LDA (PLDA).
EM for simplified PLDA (SPLDA).
Applied to Heavy-tailed PLDA: VBEM and MINDIV
The PAV Algorithm optimizes binary proper scoring rules.
Incomplete technical report, describing a precursor to the two-covariance and PLDA speaker recognition models: Farewell SVM: Bayes Factor Speaker Detection in Supervector Space, 2006.
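As an illustration of the PAV item above, here is a minimal Python sketch of the pool-adjacent-violators algorithm, used as isotonic regression of 0/1 labels on sorted scores. It is not the code from the note. The function names are hypothetical, and for simplicity tied scores are ordered arbitrarily rather than pooled.

```python
def pav(y):
    """Pool Adjacent Violators: least-squares non-decreasing fit to the sequence y."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in y:
        blocks.append([float(v), 1])
        # Merge while the mean of the previous block exceeds the mean of the last block.
        # The comparison s1/n1 > s2/n2 is done by cross-multiplication.
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = []
    for s, n in blocks:
        fitted.extend([s / n] * n)  # every member of a block gets the block mean
    return fitted

def pav_calibrate(scores, labels):
    """Map each score to a posterior estimate that is monotone in the score.

    labels are 1 for target trials and 0 for non-target trials.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    fitted = pav([labels[i] for i in order])
    posteriors = [0.0] * len(scores)
    for rank, i in enumerate(order):
        posteriors[i] = fitted[rank]
    return posteriors

if __name__ == "__main__":
    print(pav([0, 1, 0, 1]))
    print(pav_calibrate([0.9, 0.1, 0.5, 0.4], [1, 0, 0, 1]))
```

The fitted values are empirical posteriors under the training proportion of targets. Converting them to likelihood-ratios requires dividing out the corresponding prior odds, which this sketch leaves out.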
Software
FoCal Toolkit: MATLAB code for Evaluation, Fusion and Calibration of Statistical Pattern Recognizers. Includes tools for logistic regression, Cllr and APE-curves.
Some new tools for ROCCH-DET curves are here: http://focaltoolkit.googlepages.com/rocch.
Also see: http://sites.google.com/site/bosaristoolkit.
Albayzin Toolkit: https://sites.google.com/site/albayzin2012lre/.
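The Cllr measure mentioned in the FoCal Toolkit entry above can be sketched in a few lines of Python. This is an illustrative re-implementation of the standard definition, not the toolkit's MATLAB code. The function names are hypothetical, and the inputs are assumed to be natural-log likelihood-ratios for target and non-target trials.

```python
import math

def _softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def cllr(tar_llrs, non_llrs):
    """Cllr in bits: average logarithmic cost of the LLRs at a target prior of 0.5.

    Targets are penalized by log2(1 + 1/LR) and non-targets by log2(1 + LR).
    """
    c_tar = sum(_softplus(-llr) for llr in tar_llrs) / len(tar_llrs)
    c_non = sum(_softplus(llr) for llr in non_llrs) / len(non_llrs)
    return 0.5 * (c_tar + c_non) / math.log(2.0)

if __name__ == "__main__":
    # An uninformative system that always outputs LLR = 0 scores exactly 1 bit.
    print(cllr([0.0, 0.0], [0.0]))
    # Confident and correct LLRs give a value close to 0.
    print(cllr([10.0], [-10.0]))
```

Values above 1 bit indicate LLRs that are worse calibrated than a system that outputs no information at all.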
Odyssey conferences
Details and links to archived proceedings of the Odyssey Speaker and Language Recognition Workshop series are available at: www.speakerodyssey.com.
The last Odyssey was in Joensuu, Finland, June 2014: cs.uef.fi/odyssey2014/. For online talks and proceedings, see: http://www.superlectures.com/odyssey2014/.
The next Odyssey will be in Bilbao, Spain, June 2016.
Fun
Simple lunar lander game, constructed with GeoGebra.
Flying dragon yoga sequence: tinyurl.com/flyingdragonsequence.
Google speech to text: tinyurl.com/googlestt