jfocal
FoCal II: Toolkit for Fusion and Calibration of Scores in Multi-Class Pattern Recognition Problems
Version 0: Calibration tool as a Java executable
Introduction
The Java version of this toolkit has been discontinued. There is a new MATLAB toolkit with wider capabilities. See http://niko.brummer.googlepages.com/FoCal
The rest of this page refers to the discontinued Java version.
This is a tool to analyze (and improve) the calibration of multi-class pattern recognition confidence scores. Scores in the following formats can be processed:
Posterior probability: P(class | data)
Likelihood: P(data | class)
This extends the capabilities of the FoCal Toolkit, which handles only two-class recognition problems, to the multi-class case.
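The two score formats are related by Bayes' rule: P(class | data) is proportional to P(data | class) times the prior P(class). The following is a minimal Python sketch of that conversion. It is not part of the toolkit; the function name `loglik_to_posterior` is hypothetical, and it assumes natural-log likelihoods and strictly positive priors.

```python
import math

def loglik_to_posterior(log_likelihoods, prior):
    """Bayes' rule: P(class | data) is proportional to P(data | class) * P(class).

    Hypothetical helper, not part of the toolkit. Assumes natural logs and
    strictly positive prior probabilities.
    """
    # Add the log-prior to each log-likelihood to get the log of the joint.
    log_joint = [ll + math.log(p) for ll, p in zip(log_likelihoods, prior)]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_joint)
    unnormalized = [math.exp(v - peak) for v in log_joint]
    total = sum(unnormalized)
    # Normalize so the posterior sums to one over the classes.
    return [u / total for u in unnormalized]

# Three classes with a flat prior: the ranking of the classes is unchanged.
posterior = loglik_to_posterior([-1.0, -2.0, -3.0], [1/3, 1/3, 1/3])
```

With a flat prior the posterior is just the normalized likelihood, so the class ranking is the same in both formats.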
This version of the toolkit:
Calculates error-rates for supervised scores.
Does refinement/calibration analysis of supervised scores.
Can learn and apply calibration transformations to improve recognition performance of multi-class scores.
Is available in the form of a (platform-independent) Java executable. (Sorry, no source code.)
Has some MATLAB code to produce synthetic scores to experiment with the toolkit.
Has no fusion capabilities yet.
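To give a feel for what an error-rate on supervised scores means, here is a minimal Python sketch. It is not the toolkit's code: the function name `error_rate` is hypothetical, and it assumes the decision is simply the highest-scoring class, which for likelihood scores corresponds to a flat prior. The tool's own definition may differ, so see the user manual.

```python
def error_rate(labels, score_rows):
    """Fraction of trials whose highest-scoring class differs from the label.

    Hypothetical helper, not part of the toolkit. Assumes maximum-score
    decisions and class labels numbered 1..M.
    """
    errors = 0
    for label, scores in zip(labels, score_rows):
        # Classes are numbered 1..M, so shift the 0-based argmax by one.
        decided = max(range(len(scores)), key=lambda i: scores[i]) + 1
        if decided != label:
            errors += 1
    return errors / len(labels)

# Four supervised trials over three classes; only the last one is misclassified.
labels = [1, 2, 3, 1]
score_rows = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
    [0.3, 0.5, 0.2],
]
rate = error_rate(labels, score_rows)
```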
Download
This toolkit is freely available for research purposes. At present, the Java source code is not made available. Although the Java class files are not obfuscated, please do not disassemble the Java code and use it for other purposes. The author cannot support this source code for purposes other than the functions of this toolkit.
First of all RTFM: Here is a nice user manual and tutorial that explains everything.
All the real work is done by a single Java executable, VectorCal.jar. Make sure you have a suitable version of the Java VM installed on your machine: this code will not run on Java versions earlier than 1.4.x, and I tested it with Java versions 1.4.2 and 1.5.0. (If you need to install Java, go to java.sun.com/downloads, under Java Standard Edition (Java SE), select J2SE 1.4.2 or J2SE 1.5.0 and then download and install the Java Runtime Environment, J2SE JRE.) Once Java is installed, make sure VectorCal.jar is in the current directory and launch it as shown directly below. This invocation should display some help text to get you started. In an MS-Windows DOS box, in Linux (and hopefully in some other environments), type at the command line:
java -jar VectorCal.jar -help
Here are some synthetic score files to serve as examples to be processed with this tool: examples.zip. These are text files, so you can look inside them. Some are in log-likelihood score format, others are in posterior score format. Each row is an independent recognition trial. For supervised data, the first column is a label in the range 1..M, where M is the number of classes. For unsupervised data, all labels should be 0. The label column is followed by M columns of scores, one per class. If the very first label (top left in the file) has the special value of -1, then the rest of the first row specifies a prior probability distribution and all subsequent rows are in posterior probability format. If the first label is >=0, then there is no prior and scores are in log-likelihood format.
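If you want to inspect such files outside the tool, the layout above can be read with a few lines of code. This is a minimal Python sketch, not part of the toolkit: the function name `read_scores` is hypothetical, and it assumes the columns are separated by whitespace, which you should check against the example files.

```python
import io

def read_scores(stream):
    """Parse a score file laid out as described above.

    Hypothetical helper, not part of the toolkit. Assumes whitespace-separated
    columns. Returns (prior, labels, scores); prior is None when the file has
    no prior row, i.e. when the scores are in log-likelihood format.
    """
    prior = None
    labels, scores = [], []
    first = True
    for line in stream:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        label = int(float(fields[0]))
        values = [float(x) for x in fields[1:]]
        if first and label == -1:
            # A leading label of -1 marks the first row as the prior;
            # all following rows are then posterior probabilities.
            prior = values
            first = False
            continue
        first = False
        labels.append(label)  # 1..M if supervised, 0 if unsupervised
        scores.append(values)
    return prior, labels, scores

# A tiny two-class posterior-format file: prior row, then two supervised trials.
sample = "-1 0.5 0.5\n1 0.9 0.1\n2 0.2 0.8\n"
prior, labels, scores = read_scores(io.StringIO(sample))
```

To read a real file, pass an open file object instead of the `io.StringIO` wrapper.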
Here are two MS-Windows batch files which run the above example score files through the tool: batch_files.zip. You should be able to easily create equivalents for other environments.
Finally, here are some MATLAB functions and scripts which were used to make the synthetic data: matlab.zip. They allow control of the number of classes, the refinement and the calibration. Have a look at this code to better understand the data. The above examples were made by running the script "slow_textformat_example.m".
Documentation
Here is a user manual and tutorial: FocalII.pdf.
For the theory of what is going on here, see my Odyssey'06 paper.
Feedback
Email feedback is welcome.
- Niko Brummer