Learning a Discrete Distribution (Infer.NET)

posted 2 Feb 2013, 12:20 by Diego Tosato
Suppose the discrete distribution to be learned has 100 categories (classes). We also have a small initial training set of 50 examples that we can use to initialise the distribution.
public Variable<int> category;
public VariableArray<int> x;
int nCategory = 100;

Here is the (straightforward) probabilistic graphical model (PGM) I wrote to learn the discrete distribution. The training set is stored in 'data' and contains integers in [0, 99].

In particular, I used the following training set: [10, 5, 20, 10, 10, 5, 20, 10, 5, 30, 10, 5, 20, 10, 10, 5, 20, 10, 5, 30, 10, 5, 20, 10, 10, 5, 20, 10, 5, 30, 10, 5, 20, 10, 10, 5, 20, 10, 5, 30, 10, 5, 20, 10, 10, 5, 20, 10, 5, 30].

// pre-processing: count how often each category occurs in the training data
double[] probs = new double[nCategory];
for (int i = 0; i < nCategory; i++) probs[i] = 0;
for (int i = 0; i < data.Length; i++) probs[data[i]]++;
// add a small constant so that no category gets exactly zero probability
Vector probsVec = Vector.FromArray(probs) + 0.000001;
// normalise so that the probabilities sum to one
Vector normaliser = Vector.Zero(nCategory);
normaliser.SetAllElementsTo(probsVec.Sum());
probsVec = probsVec / normaliser;

// model
category = Variable.Discrete(probsVec).Named("category");
Range dataRange = new Range(data.Length).Named("n");
VariableArray<int> x = Variable.Array<int>(dataRange).Named("x");
x[dataRange] = Variable.DiscreteUniform(category).ForEach(dataRange);
x.ObservedValue = data;


As you can see, I initialised 'category' using the probabilities obtained from the training data.
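
As a rough worked example of what this initialisation produces for the training set listed above: write c_k for the number of times category k occurs in 'data', p_k for the resulting entry of 'probsVec', and ε for the smoothing constant 0.000001 from the code. These symbols are notation introduced here for illustration only. In the 50 examples the value 10 occurs 20 times, 5 occurs 15 times, 20 occurs 10 times and 30 occurs 5 times, so the arithmetic should come out approximately as follows:

```latex
\begin{align*}
p_k &= \frac{c_k + \varepsilon}{\sum_{j=0}^{99} (c_j + \varepsilon)},
      \qquad \varepsilon = 10^{-6} \\
c_{10} &= 20, \quad c_{5} = 15, \quad c_{20} = 10, \quad c_{30} = 5,
      \quad c_k = 0 \ \text{otherwise} \\
\sum_{j=0}^{99} (c_j + \varepsilon) &= 50 + 100 \cdot 10^{-6} = 50.0001 \\
p_{10} &\approx 0.4, \quad p_{5} \approx 0.3, \quad p_{20} \approx 0.2,
      \quad p_{30} \approx 0.1, \quad p_k \approx 2 \times 10^{-8} \ \text{otherwise}
\end{align*}
```

In other words, almost all of the initial probability mass sits on the four categories seen in training, while each of the 96 unseen categories keeps a tiny but non-zero probability.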

Finally, I run the inference engine and sample from the learned distribution.

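// engine is assumed to be a default Infer.NET inference engine
InferenceEngine engine = new InferenceEngine();
Discrete posterior = engine.Infer<Discrete>(category);
int sample = posterior.Sample();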

http://social.microsoft.com/Forums/en-US/infer.net/thread/995f22b7-abcd-4748-9a68-b33d50fe0aee
