LAML
LAML (Linear Algebra and Machine Learning) is a stand-alone, pure-Java library for linear algebra and machine learning. The goal is to provide efficient and easy-to-use linear algebra and machine learning libraries. The two are built together because fast implementations of machine learning methods require full control of the underlying data structures for matrices and vectors. Additionally, LAML provides many commonly used matrix functions with the same signatures as MATLAB's, so it can also be used to manually convert MATLAB code to Java code (see Tutorial-LAML.pdf).
The built-in linear algebra library supports well-designed dense and sparse matrices and vectors. Sparse matrices are designed and implemented with the standard compressed sparse column (CSC) and compressed sparse row (CSR) formats. Matrix multiplication is competitive with, or even faster than, well-known Java linear algebra libraries such as Commons-Math and Colt. Unlike other Java linear algebra libraries, the built-in linear algebra library in LAML gives users full control of sparse matrices and vectors (e.g., the internal arrays of sparse matrices), which is crucial for efficient high-level implementations.
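For example, a sparse matrix can be assembled entry by entry and its internal arrays read back directly. The sketch below is a minimal illustration; the SparseMatrix constructor and setEntry appear in the Usage section, while the accessor names for the internal CSC arrays (getPr, getIr, getJc) are assumptions to verify against the la.matrix Javadoc.
SparseMatrix S = new SparseMatrix(3, 3);
S.setEntry(0, 0, 2.0);
S.setEntry(2, 1, -1.0);
// Assumed accessors for the internal CSC arrays:
double[] pr = S.getPr(); // non-zero values
int[] ir = S.getIr();    // row indices of the non-zero values
int[] jc = S.getJc();    // column pointers
printMatrix(full(S));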
A carefully designed linear algebra library is the basis of the machine learning library, which aims to provide fast implementations of mature machine learning methods. For instance, the LinearBinarySVM class re-implements the popular liblinear in Java. For the heart_scale data with C = 1.0 and eps = 1e-2, the average running time is 0.04 seconds on an Intel(R) Core(TM) i7 CPU M620 @ 2.67GHz with 4.00GB memory running 64-bit Windows 7, even a little faster than liblinear, which takes 0.06 seconds on average with the same parameters.
JML vs. LAML:
LAML is much faster than JML (more than 3 times faster) for two implementation reasons. First, LAML allows full control of dense and sparse matrices and vectors. Second, LAML extensively uses in-place matrix and vector operations, thereby avoiding excessive memory allocation and garbage collection.
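A minimal sketch of the in-place style (the method names plusAssign and timesAssign are assumptions based on the ml.utils package; verify against the Javadoc):
Matrix A = rand(3, 3);
Matrix res = zeros(3, 3);
plusAssign(res, A);    // res = res + A, without allocating a new matrix
timesAssign(res, 0.5); // res = 0.5 * res, in place
disp(res);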
JML relies on a third-party linear algebra library, Apache Commons-Math. Sparse matrices and vectors have been deprecated in Commons-Math 3.0+ and will ultimately be removed, whereas LAML has its own built-in linear algebra library.
Like JML, LAML provides many commonly used matrix functions with the same signatures as MATLAB's, so it can also be used to manually convert MATLAB code to Java code (see the sketch below).
In short, JML has been replaced by LAML.
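For example, the MATLAB statements A = hilb(4, 4); B = A * A'; disp(B); translate almost one-to-one (every function below appears in the Usage section):
// MATLAB: A = hilb(4, 4); B = A * A'; disp(B); fprintf('rank(B): %d\n', rank(B));
Matrix A = hilb(4, 4);
Matrix B = A.mtimes(A.transpose());
disp(B);
fprintf("rank(B): %d%n", rank(B));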
SourceForge:
https://sourceforge.net/projects/lamal
Github:
https://github.com/MingjieQian/LAML
Documentation:
For more details about the LAML API, please refer to the online Javadoc.
Features:
Stand-alone Java library, completely cross-platform
Built-in Linear Algebra (LA) library
Full control of matrices and vectors
Many general-purpose optimization algorithms
Fast implementation of Machine Learning (ML) methods
Matrix functions with almost the same signatures as MATLAB's
Well-documented source code and a friendly API, very easy to use
Packages:
la.decomposition
LU, QR, eigenvalue decomposition, and SVD
la.matrix
Sparse and dense matrix implementation
la.vector
Sparse and dense vector implementation
la.io
Functions for saving and loading a matrix, a vector, or a data set
ml.utils
Efficient functions for array and matrix operations
ml.classification
Linear SVM, linear multi-class SVM, regularized logistic regression, maximum entropy modeling, and AdaBoost
ml.clustering
K-means, L1NMF, NMF, and spectral clustering
ml.optimization
L-BFGS, BoundConstrainedPLBFGS, NonnegativePLBFGS, Projection, ProximalMapping, ShrinkageOperator, accelerated proximal gradient, accelerated gradient descent, general quadratic programming, nonlinear conjugate gradient, L-BFGS on simplex, quadratic programming with bound constraints, and the primal-dual interior-point method
ml.sequence
Hidden Markov Models (HMM) and Conditional Random Fields (CRF)
ml.kernel
Commonly used kernel functions ('linear' | 'poly' | 'rbf' | 'cosine')
ml.manifold
Commonly used manifold learning functions such as computing adjacency matrix, Laplacian matrix, and local
learning regularization matrix
ml.subspace
PCA, kernel PCA, Multi-dimensional Scaling (MDS), Isomap, and Locally Linear Embedding (LLE)
ml.regression
LASSO and linear regression
ml.random
Multivariate Gaussian distribution
ml.recovery
Matrix completion and robust PCA
ml.topics
LDA
ml.graph
Minimum spanning tree using Prim's algorithm, shortest path using Dijkstra's algorithm, topological order,
all-pairs shortest path using Floyd-Warshall algorithm, Huffman codes, and maximum flow using Ford-Fulkerson algorithm.
ml.recommendation
Factorization machines and structured sparse regression (STSR)
...
Usage:
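The snippets below omit the import preamble for brevity. A typical header looks like the following; the utility class names (Matlab, Printer, Time) are inferred from the ml.utils package described above and should be verified against the Javadoc.
import la.matrix.*;
import la.vector.*;
import la.decomposition.*;
import static ml.utils.Matlab.*;
import static ml.utils.Printer.*;
import static ml.utils.Time.*;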
# Eigenvalue Decomposition (For Real Symmetric Matrices)
int m = 4;
int n = 4;
Matrix A = hilb(m, n);
fprintf("A:%n");
disp(A);
long start = System.currentTimeMillis();
Matrix[] VD = EigenValueDecomposition.decompose(A);
System.out.format("Elapsed time: %.4f seconds.%n", (System.currentTimeMillis() - start) / 1000.0);
fprintf("*****************************************%n");
Matrix V = VD[0];
Matrix D = VD[1];
fprintf("V:%n");
printMatrix(V);
fprintf("D:%n");
printMatrix(D);
fprintf("VDV':%n");
disp(V.mtimes(D).mtimes(V.transpose()));
fprintf("A:%n");
printMatrix(A);
fprintf("V'V:%n");
printMatrix(V.transpose().mtimes(V));
# Output
A:
1 0.5000 0.3333 0.2500
0.5000 0.3333 0.2500 0.2000
0.3333 0.2500 0.2000 0.1667
0.2500 0.2000 0.1667 0.1429
Elapsed time: 0.0100 seconds.
*****************************************
V:
0.7926 0.5821 -0.1792 -0.0292
0.4519 -0.3705 0.7419 0.3287
0.3224 -0.5096 -0.1002 -0.7914
0.2522 -0.5140 -0.6383 0.5146
D:
1.5002
0.1691
0.0067
0.0001
VDV':
1 0.5000 0.3333 0.2500
0.5000 0.3333 0.2500 0.2000
0.3333 0.2500 0.2000 0.1667
0.2500 0.2000 0.1667 0.1429
A:
1 0.5000 0.3333 0.2500
0.5000 0.3333 0.2500 0.2000
0.3333 0.2500 0.2000 0.1667
0.2500 0.2000 0.1667 0.1429
V'V:
1.0000 -0.0000 -0.0000 -0.0000
-0.0000 1.0000 0.0000 0
-0.0000 0.0000 1.0000 -0.0000
-0.0000 0 -0.0000 1.0000
# -------------------------------------------------------------------------- #
# LU Decomposition
double[][] data = new double[][] {
{1, -2, 3},
{2, -5, 12},
{0, 2, -10}
};
Matrix A = new DenseMatrix(data);
fprintf("A:%n");
printMatrix(A);
Matrix[] LUP = LUDecomposition.decompose(A);
Matrix L = LUP[0];
Matrix U = LUP[1];
Matrix P = LUP[2];
fprintf("L:%n");
printMatrix(L);
fprintf("U:%n");
printMatrix(U);
fprintf("P:%n");
printMatrix(P);
fprintf("PA:%n");
printMatrix(P.mtimes(A));
fprintf("LU:%n");
printMatrix(L.mtimes(U));
long start = System.currentTimeMillis();
LUDecomposition LUDecomp = new LUDecomposition(A);
Vector b = new DenseVector(new double[] {2, 3, 4});
Vector x = LUDecomp.solve(b);
fprintf("Solution for Ax = b:%n");
printVector(x);
fprintf("b = %n");
printVector(b);
fprintf("Ax = %n");
printVector(A.operate(x));
fprintf("A^{-1}:%n");
printMatrix(LUDecomp.inverse());
fprintf("det(A) = %.2f%n", LUDecomp.det());
System.out.format("Elapsed time: %.2f seconds.%n", (System.currentTimeMillis() - start) / 1000F);
fprintf("**********************************%n");
A = sparse(A);
fprintf("A:%n");
printMatrix(A);
LUP = LUDecomposition.decompose(A);
L = LUP[0];
U = LUP[1];
P = LUP[2];
fprintf("L:%n");
printMatrix(L);
fprintf("U:%n");
printMatrix(U);
fprintf("P:%n");
printMatrix(P);
fprintf("PA:%n");
printMatrix(P.mtimes(A));
fprintf("LU:%n");
printMatrix(L.mtimes(U));
start = System.currentTimeMillis();
LUDecomp = new LUDecomposition(sparse(A));
b = new DenseVector(new double[] {2, 3, 4});
x = LUDecomp.solve(b);
fprintf("Solution for Ax = b:%n");
printVector(x);
fprintf("Ax = %n");
printVector(A.operate(x));
fprintf("b = %n");
printVector(b);
Matrix B = new DenseMatrix(new double[][] {
{2, 4},
{3, 3},
{4, 2} }
);
Matrix X = LUDecomp.solve(B);
fprintf("Solution for AX = B:%n");
printMatrix(X);
fprintf("AX = %n");
printMatrix(A.mtimes(X));
fprintf("B = %n");
printMatrix(B);
fprintf("A^{-1}:%n");
printMatrix(LUDecomp.inverse());
fprintf("det(A) = %.2f%n", LUDecomp.det());
System.out.format("Elapsed time: %.2f seconds.%n", (System.currentTimeMillis() - start) / 1000F);
# Output
A:
1 -2 3
2 -5 12
0 2 -10
L:
1 0 0
0 1 0
0.5000 0.2500 1
U:
2 -5 12
0 2 -10
0 0 -0.5000
P:
0 1 0
0 0 1
1 0 0
PA:
2 -5 12
0 2 -10
1 -2 3
LU:
2 -5 12
0 2 -10
1 -2 3
Solution for Ax = b:
13
7
1
b =
2
3
4
Ax =
2
3
4
A^{-1}:
-13 7 4.5000
-10 5 3
-2 1 0.5000
det(A) = -2.00
Elapsed time: 0.02 seconds.
**********************************
A:
1 -2 3
2 -5 12
2 -10
L:
1
1
0.5000 0.2500 1
U:
2 -5 12
2 -10
-0.5000
P:
1
1
1
PA:
2 -5 12
2 -10
1 -2 3
LU:
2 -5 12
2 -10
1 -2 3
Solution for Ax = b:
13
7
1
Ax =
2
3
4
b =
2
3
4
Solution for AX = B:
13 -22
7 -19
1 -4
AX =
2 4
3 3
4 2
B =
2 4
3 3
4 2
A^{-1}:
-13 7 4.5000
-10 5 3
-2 1 0.5000
det(A) = -2.00
Elapsed time: 0.02 seconds.
# -------------------------------------------------------------------------- #
# QR Decomposition
int m = 4;
int n = 3;
Matrix A = hilb(m, n);
fprintf("When A is full:%n");
fprintf("A:%n");
printMatrix(A);
long start = System.currentTimeMillis();
Matrix[] QRP = QRDecomposition.decompose(A);
Matrix Q = QRP[0];
Matrix R = QRP[1];
Matrix P = QRP[2];
fprintf("Q:%n");
printMatrix(Q);
fprintf("R:%n");
printMatrix(R);
fprintf("P:%n");
printMatrix(P);
fprintf("AP:%n");
printMatrix(A.mtimes(P));
fprintf("QR:%n");
printMatrix(Q.mtimes(R));
fprintf("Q'Q:%n");
printMatrix(Q.transpose().mtimes(Q));
System.out.format("Elapsed time: %.2f seconds.%n", (System.currentTimeMillis() - start) / 1000F);
fprintf("**********************************%n");
// fprintf("|AP - QR| = ");
A = sparse(hilb(m, n));
fprintf("When A is sparse:%n");
fprintf("A:%n");
printMatrix(A);
start = System.currentTimeMillis();
QRP = QRDecomposition.decompose(A);
Q = QRP[0];
R = QRP[1];
P = QRP[2];
fprintf("Q:%n");
printMatrix(Q);
fprintf("R:%n");
printMatrix(R);
fprintf("P:%n");
printMatrix(P);
fprintf("AP:%n");
printMatrix(A.mtimes(P));
fprintf("QR:%n");
printMatrix(Q.mtimes(R));
fprintf("Q'Q:%n");
printMatrix(Q.transpose().mtimes(Q));
System.out.format("Elapsed time: %.2f seconds.%n", (System.currentTimeMillis() - start) / 1000F);
QRDecomposition QRDecomp = new QRDecomposition(A);
Vector b = new DenseVector(new double[] {2, 3, 4, 9});
Vector x = QRDecomp.solve(b);
fprintf("Solution for Ax = b:%n");
printVector(x);
fprintf("b = %n");
printVector(b);
fprintf("Ax = %n");
printVector(A.operate(x));
# Output
When A is full:
A:
1 0.5000 0.3333
0.5000 0.3333 0.2500
0.3333 0.2500 0.2000
0.2500 0.2000 0.1667
Q:
-0.8381 0.5144 -0.1796 -0.0263
-0.4191 -0.4052 0.7487 0.3157
-0.2794 -0.5351 -0.1132 -0.7892
-0.2095 -0.5338 -0.6280 0.5261
R:
-1.1932 -0.4749 -0.6705
0 -0.1258 -0.1184
0 0 0.0059
0 0 0
P:
1
1
1
AP:
1 0.3333 0.5000
0.5000 0.2500 0.3333
0.3333 0.2000 0.2500
0.2500 0.1667 0.2000
QR:
1.0000 0.3333 0.5000
0.5000 0.2500 0.3333
0.3333 0.2000 0.2500
0.2500 0.1667 0.2000
Q'Q:
1.0000 0.0000 -0.0000 0.0000
0.0000 1 -0.0000 0
-0.0000 -0.0000 1.0000 0
0.0000 0 0 1
Elapsed time: 0.05 seconds.
**********************************
When A is sparse:
A:
1 0.5000 0.3333
0.5000 0.3333 0.2500
0.3333 0.2500 0.2000
0.2500 0.2000 0.1667
Q:
-0.8381 0.5144 -0.1796 -0.0263
-0.4191 -0.4052 0.7487 0.3157
-0.2794 -0.5351 -0.1132 -0.7892
-0.2095 -0.5338 -0.6280 0.5261
R:
-1.1932 -0.4749 -0.6705
-0.1258 -0.1184
0.0059
P:
1
1
1
AP:
1 0.3333 0.5000
0.5000 0.2500 0.3333
0.3333 0.2000 0.2500
0.2500 0.1667 0.2000
QR:
1.0000 0.3333 0.5000
0.5000 0.2500 0.3333
0.3333 0.2000 0.2500
0.2500 0.1667 0.2000
Q'Q:
1.0000 0.0000 -0.0000 0.0000
0.0000 1 -0.0000 0
-0.0000 -0.0000 1.0000 0
0.0000 0 0 1
Elapsed time: 0.04 seconds.
Solution for Ax = b:
117.2346
-719.5017
733.7439
b =
2
3
4
9
Ax =
2.0651
2.2194
5.9516
7.6990
# -------------------------------------------------------------------------- #
# Singular Value Decomposition
Matrix A = new DenseMatrix(new double[][] { {1d, 2d}, {2d, 0d}, {1d, 7d}});
long start = System.currentTimeMillis();
boolean computeUV = true;
Matrix[] USV = SingularValueDecomposition.decompose(A, computeUV);
System.out.format("Elapsed time: %.4f seconds.%n", (System.currentTimeMillis() - start) / 1000.0);
fprintf("*****************************************%n");
Matrix U = USV[0];
Matrix S = USV[1];
Matrix V = USV[2];
if (computeUV) {
fprintf("USV':%n");
disp(U.mtimes(S).mtimes(V.transpose()));
fprintf("A:%n");
printMatrix(A);
fprintf("U'U:%n");
printMatrix(U.transpose().mtimes((U)));
fprintf("V'V:%n");
printMatrix(V.transpose().mtimes((V)));
fprintf("U:%n");
printMatrix(U);
fprintf("V:%n");
printMatrix(V);
}
fprintf("S:%n");
printMatrix(S);
fprintf("rank(A): %d%n", rank(A));
# Output
Elapsed time: 0.0100 seconds.
*****************************************
USV':
1.0000 2.0000
2.0000 -0.0000
1.0000 7.0000
A:
1 2
2 0
1 7
U'U:
1.0000 0.0000 0.0000
0.0000 1.0000 0.0000
0.0000 0.0000 1.0000
V'V:
1.0000 -0.0000
-0.0000 1.0000
U:
-0.2906 -0.2976 -0.9094
-0.0492 -0.9445 0.3248
-0.9556 0.1391 0.2598
V:
-0.1819 -0.9833
-0.9833 0.1819
S:
7.3935
2.0822
rank(A): 2
# -------------------------------------------------------------------------- #
# Linear Binary SVM
double C = 1.0;
double eps = 1e-4;
Classifier linearBinarySVM = new LinearBinarySVM(C, eps);
int[] pred_labels = null;
double[][] data = {
{3.5, 4.4, 1.3, 2.3},
{5.3, 2.2, 0.5, 4.5},
{0.2, 0.3, 4.1, -3.1},
{-1.2, 0.4, 3.2, 1.6}
};
int[] labels = new int[] {1, 1, -1, -1};
linearBinarySVM.feedData(data);
linearBinarySVM.feedLabels(labels);
linearBinarySVM.train();
fprintf("W:%n");
printMatrix(linearBinarySVM.W);
fprintf("b:%n");
printVector(linearBinarySVM.b);
pred_labels = linearBinarySVM.predict(data);
getAccuracy(pred_labels, labels);
# Output
W:
0.2143
0.1312
-0.2407
0.0150
b:
-0.0490
Accuracy: 100.00%
# -------------------------------------------------------------------------- #
# Linear Multi-class SVM
double C = 1.0;
double eps = 1e-4;
Classifier linearMCSVM = new LinearMCSVM(C, eps);
double[][] data = {
{3.5, 4.4, 1.3, 2.3},
{5.3, 2.2, 0.5, 4.5},
{0.2, 0.3, 4.1, -3.1},
{-1.2, 0.4, 3.2, 1.6}
};
int[] labels = new int[] {1, 2, 3, 4};
linearMCSVM.feedData(data);
linearMCSVM.feedLabels(labels);
linearMCSVM.train();
fprintf("W:%n");
printMatrix(linearMCSVM.W);
fprintf("b:%n");
printVector(linearMCSVM.b);
int[] pred_labels = linearMCSVM.predict(data);
getAccuracy(pred_labels, labels);
# Output
..
W:
-0.0482 0.1314 0.0630 -0.1462
0.2825 -0.2241 -0.0053 -0.0531
-0.0819 -0.0836 0.0599 0.1056
-0.0838 0.1711 -0.2301 0.1428
b:
-0.0248 -0.0089 0.0043 0.0295
Accuracy: 100.00%
# -------------------------------------------------------------------------- #
# Multi-class Logistic Regression with Multiple Choices of Regularization
double[][] data = {
{3.5, 5.3, 0.2, -1.2},
{4.4, 2.2, 0.3, 0.4},
{1.3, 0.5, 4.1, 3.2}
};
int[] labels = new int[] {1, 2, 3};
/*
* Regularization type.
* 0: No regularization
* 1: L1 regularization
* 2: L2^2 regularization
* 3: L2 regularization
* 4: Infinity norm regularization
*/
int regularizationType = 1;
double lambda = 0.1;
Classifier logReg = new LogisticRegression(regularizationType, lambda);
logReg.epsilon = 1e-5;
logReg.feedData(data);
logReg.feedLabels(labels);
// Get elapsed time in seconds
tic();
logReg.train();
fprintf("Elapsed time: %.3f seconds.%n", toc());
fprintf("W:%n");
printMatrix(logReg.W);
fprintf("b:%n");
printVector(logReg.b);
double[][] dataTest = data;
fprintf("Ground truth:%n");
printMatrix(logReg.Y);
fprintf("Predicted probability matrix:%n");
Matrix Prob_pred = logReg.predictLabelScoreMatrix(dataTest);
disp(Prob_pred);
fprintf("Predicted label matrix:%n");
Matrix Y_pred = logReg.predictLabelMatrix(dataTest);
printMatrix(Y_pred);
int[] pred_labels = logReg.predict(dataTest);
getAccuracy(pred_labels, labels);
# Output
# Without regularization
L-BFGS converges with norm(Grad) 0.000006
Elapsed time: 0.060 seconds.
W:
-1.8522 3.1339 -1.2817
3.4138 -1.7282 -1.6856
-1.2455 -1.2630 2.5084
-2.8311 0.5558 2.2753
b:
-0.3616 0.2534 0.1082
Ground truth:
1
1
1
Predicted probability matrix:
1.0000 0.0000 0.0000
0.0000 1.0000 0.0000
0.0000 0.0000 1.0000
Predicted label matrix:
1
1
1
Accuracy: 100.00%
# L1-norm regularization
Accelerated proximal gradient method converges with norm(G_Y_k) 0.000004
Elapsed time: 0.430 seconds.
W:
0 0.8678 0
0.9579 0 0
0 0 0.9894
0 0 0
b:
0 0 0
Ground truth:
1
1
1
Predicted probability matrix:
0.8790 0.1143 0.0067
0.1493 0.8263 0.0244
0.0258 0.0494 0.9247
Predicted label matrix:
1
1
1
Accuracy: 100.00%
# squared Frobenius norm regularization
Accelerated proximal gradient method converges with norm(G_Y_k) 0.000009
Elapsed time: 0.242 seconds.
W:
-0.2817 0.5368 -0.2551
0.5298 -0.3269 -0.2029
-0.1181 -0.3004 0.4185
-0.3814 0.0449 0.3364
b:
-0.0417 0.0266 0.0151
Ground truth:
1
1
1
Predicted probability matrix:
0.8872 0.1028 0.0100
0.1241 0.8299 0.0461
0.0137 0.0514 0.9349
Predicted label matrix:
1
1
1
Accuracy: 100.00%
# Frobenius norm regularization
Accelerated proximal gradient method converges with norm(G_Y_k) 0.000008
Elapsed time: 0.358 seconds.
W:
-0.4333 0.7669 -0.3336
0.7402 -0.4956 -0.2447
-0.1262 -0.4227 0.5489
-0.5135 0.0810 0.4326
b:
-0.0587 0.0375 0.0212
Ground truth:
1
1
1
Predicted probability matrix:
0.9509 0.0461 0.0029
0.0558 0.9249 0.0192
0.0039 0.0221 0.9740
Predicted label matrix:
1
1
1
Accuracy: 100.00%
# Infinity norm regularization
Accelerated proximal gradient method converges with norm(G_Y_k) 0.000010
Elapsed time: 0.631 seconds.
W:
-0.7162 0.7162 -0.7162
0.7162 -0.7162 -0.7162
-0.7162 -0.7162 0.7162
-0.7162 0.7162 0.7162
b:
-0.6125 -0.3405 0.5728
Ground truth:
1
1
1
Predicted probability matrix:
0.9821 0.0175 0.0004
0.0179 0.9752 0.0068
0.0000 0.0072 0.9928
Predicted label matrix:
1
1
1
Accuracy: 100.00%
# -------------------------------------------------------------------------- #
# Maximum Entropy Modeling
long start = System.currentTimeMillis();
/*
* a 3D {@code double} array, where data[n][i][k]
* is the i-th feature value on the k-th class
* for the n-th sample
*/
double[][][] data = new double[][][] {
{{1, 0, 0}, {2, 1, -1}, {0, 1, 2}, {-1, 2, 1}},
{{0, 2, 0}, {1, 0, -1}, {0, 1, 1}, {-1, 3, 0.5}},
{{0, 0, 0.8}, {2, 1, -1}, {1, 3, 0}, {-0.5, -1, 2}},
{{0.5, 0, 0}, {1, 1, -1}, {0, 0.5, 1.5}, {-2, 1.5, 1}},
};
/*double [][] labels = new double[][] {
{1, 0, 0},
{0, 1, 0},
{0, 0, 1},
{1, 0, 0}
};*/
int[] labels = new int[] {1, 2, 3, 1};
MaxEnt maxEnt = new MaxEnt();
maxEnt.feedData(data);
maxEnt.feedLabels(labels);
maxEnt.train();
double elapsedTime = (System.currentTimeMillis() - start) / 1000d;
System.out.format("Elapsed time: %.3f seconds\n", elapsedTime);
fprintf("MaxEnt parameters:\n");
display(maxEnt.W);
String modelFilePath = "MaxEnt-Model.dat";
maxEnt.saveModel(modelFilePath);
maxEnt = new MaxEnt();
maxEnt.loadModel(modelFilePath);
fprintf("Predicted probability matrix:\n");
display(maxEnt.predictLabelScoreMatrix(data));
fprintf("Predicted label matrix:\n");
display(full(maxEnt.predictLabelMatrix(data)));
fprintf("Predicted labels:\n");
display(maxEnt.predict(data));
# Output
L-BFGS converges with norm(Grad) 0.000072
Elapsed time: 0.060 seconds
MaxEnt parameters:
7.4697
-1.1159
-2.8338
-1.0426
Model saved.
Loading model...
Model loaded.
Predicted probability matrix:
1.0000 0.0000 0.0000
0.0001 0.9999 0.0000
0.0001 0.0000 0.9999
0.9997 0.0002 0.0001
Predicted label matrix:
1 0 0
0 1 0
0 0 1
1 0 0
Predicted labels:
1 2 3 1
# -------------------------------------------------------------------------- #
# AdaBoost
double[][] data = { {3.5, 4.4, 1.3},
{5.3, 2.2, 0.5},
{0.2, 0.3, 4.1},
{5.3, 2.2, -1.5},
{-1.2, 0.4, 3.2} };
int[] labels = {1, 1, -1, -1, -1};
Matrix X = new DenseMatrix(data);
int T = 10;
Classifier[] weakClassifiers = new Classifier[T];
double epsilon = 1e-5; // convergence tolerance for each weak learner (assumed value)
for (int t = 0; t < T; t++) {
weakClassifiers[t] = new LogisticRegression(epsilon);
}
Classifier adaBoost = new AdaBoost(weakClassifiers);
adaBoost.feedData(X);
adaBoost.feedLabels(labels);
tic();
adaBoost.train();
System.out.format("Elapsed time: %.2f seconds.%n", toc());
Matrix Xt = X.copy();
display(full(adaBoost.predictLabelMatrix(Xt)));
display(adaBoost.predict(Xt));
double accuracy = Classifier.getAccuracy(labels, adaBoost.predict(Xt));
fprintf("Accuracy for AdaBoost with logistic regression: %.2f%%\n", 100 * accuracy);
# Output
Elapsed time: 0.18 seconds.
1 0
1 0
0 1
0 1
0 1
1 1 -1 -1 -1
Accuracy: 100.00%
# -------------------------------------------------------------------------- #
# K-means
double[][] data = {
{3.5, 5.3, 0.2, -1.2},
{4.4, 2.2, 0.3, 0.4},
{1.3, 0.5, 4.1, 3.2}
};
KMeansOptions options = new KMeansOptions();
options.nClus = 2;
options.verbose = true;
options.maxIter = 100;
KMeans KMeans = new KMeans(options);
KMeans.feedData(data);
// KMeans.initialize(null);
Matrix initializer = null;
initializer = new SparseMatrix(3, 2);
initializer.setEntry(0, 0, 1);
initializer.setEntry(1, 1, 1);
initializer.setEntry(2, 0, 1);
KMeans.clustering(initializer); // Use null for random initialization
System.out.println("Indicator Matrix:");
printMatrix(full(KMeans.getIndicatorMatrix()));
# Output
Iter 1: mse = 9.534 (0.000 secs)
KMeans complete.
Indicator Matrix:
0 1
0 1
1 0
# -------------------------------------------------------------------------- #
# NMF
double[][] data = {
{3.5, 4.4, 1.3},
{5.3, 2.2, 0.5},
{0.2, 0.3, 4.1},
{1.2, 0.4, 3.2}
};
KMeansOptions options = new KMeansOptions();
options.nClus = 2;
options.verbose = true;
options.maxIter = 100;
KMeans KMeans = new KMeans(options);
KMeans.feedData(data);
KMeans.initialize(null);
KMeans.clustering();
Matrix G0 = KMeans.getIndicatorMatrix();
NMFOptions NMFOptions = new NMFOptions();
NMFOptions.nClus = 2;
NMFOptions.maxIter = 50;
NMFOptions.verbose = true;
NMFOptions.calc_OV = false;
NMFOptions.epsilon = 1e-5;
Clustering NMF = new NMF(NMFOptions);
NMF.feedData(data);
// NMF.initialize(null);
NMF.clustering(G0); // If null, KMeans will be used for initialization
System.out.println("Basis Matrix:");
printMatrix(full(NMF.getCenters()));
System.out.println("Indicator Matrix:");
printMatrix(full(NMF.getIndicatorMatrix()));
# Output
Iter 1: mse = 13.060 (0.000 secs)
Iter 2: mse = 2.875 (0.000 secs)
KMeans complete.
Iteration 10, delta G: 0.001012
Converge successfully!
Basis Matrix:
5.0577 3.6774 0.5237
0.2326 0.3013 4.4328
Indicator Matrix:
0.8543 0.2004
0.8928 0
0.0000 0.9251
0.1520 0.7014
# -------------------------------------------------------------------------- #
# L1NMF
String dataMatrixFilePath = "CNN - DocTermCount.txt";
tic();
Matrix X = loadMatrixFromDocTermCountFile(dataMatrixFilePath);
X = Matlab.getTFIDF(X);
X = Matlab.normalizeByColumns(X);
X = X.transpose();
KMeansOptions kMeansOptions = new KMeansOptions();
kMeansOptions.nClus = 10;
kMeansOptions.maxIter = 50;
kMeansOptions.verbose = true;
KMeans KMeans = new KMeans(kMeansOptions);
KMeans.feedData(X);
// KMeans.initialize(null);
KMeans.clustering();
Matrix G0 = KMeans.getIndicatorMatrix();
// Optionally, load a fixed initial indicator matrix instead of the KMeans result:
// G0 = loadDenseMatrix("G0.txt");
L1NMFOptions L1NMFOptions = new L1NMFOptions();
L1NMFOptions.nClus = 10;
L1NMFOptions.gamma = 0.0001;
L1NMFOptions.mu = 0.1;
L1NMFOptions.maxIter = 50;
L1NMFOptions.verbose = true;
L1NMFOptions.calc_OV = false;
L1NMFOptions.epsilon = 1e-5;
Clustering L1NMF = new L1NMF(L1NMFOptions);
L1NMF.feedData(X);
// L1NMF.initialize(G0);
L1NMF.clustering(G0); // Use null for random initialization
System.out.format("Elapsed time: %.3f seconds\n", toc());
# Output
Iter 1: mse = 1.524 (0.030 secs)
Iter 2: mse = 0.816 (0.030 secs)
Iter 3: mse = 0.806 (0.030 secs)
Iter 4: mse = 0.805 (0.040 secs)
KMeans complete.
Iteration 10, delta G: 0.046591
Iteration 20, delta G: 0.047140
Iteration 30, delta G: 0.020651
Iteration 40, delta G: 0.010017
Iteration 50, delta G: 0.007973
Maximal iterations
Elapsed time: 3.933 seconds
# -------------------------------------------------------------------------- #
# Spectral Clustering
tic();
int nClus = 2;
boolean verbose = false;
int maxIter = 100;
String graphType = "nn";
double graphParam = 2;
String graphDistanceFunction = "euclidean";
String graphWeightType = "heat";
double graphWeightParam = 1;
ClusteringOptions options = new SpectralClusteringOptions(
nClus,
verbose,
maxIter,
graphType,
graphParam,
graphDistanceFunction,
graphWeightType,
graphWeightParam);
Clustering spectralClustering = new SpectralClustering(options);
double[][] data = {
{3.5, 5.3, 0.2, -1.2},
{4.4, 2.2, 0.3, 0.4},
{1.3, 0.5, 4.1, 3.2}
};
spectralClustering.feedData(data);
spectralClustering.clustering(null);
display(full(spectralClustering.getIndicatorMatrix()));
System.out.format("Elapsed time: %.3f seconds\n", toc());
# Output
Computing directed adjacency graph...
Creating the adjacency matrix. Nearest neighbors, N = 2.
KMeans complete.
Spectral clustering complete.
0 1
0 1
1 0
Elapsed time: 0.070 seconds
# -------------------------------------------------------------------------- #
# Computing Adjacency Matrix, Graph Laplacian and Local Learning Regularization
String filePath = "CNN - DocTermCount.txt";
Matrix X = loadMatrixFromDocTermCountFile(filePath);
int NSample = Math.min(20, X.getColumnDimension());
X = X.getSubMatrix(0, X.getRowDimension() - 1, 0, NSample - 1);
System.out.println(String.format("%d samples loaded", X.getColumnDimension()));
X = X.transpose();
GraphOptions options = new GraphOptions();
options.graphType = "nn";
String type = options.graphType;
double NN = options.graphParam;
System.out.println(String.format("Graph type: %s with NN: %d", type, (int)NN));
// Parameter setting for text data
options.kernelType = "cosine";
options.graphDistanceFunction = "cosine";
// Parameter setting for image data
/*options.kernelType = "rbf";
options.graphDistanceFunction = "euclidean";*/
options.graphNormalize = true;
options.graphWeightType = "heat";
boolean show = true;
// Test adjacency function - pass
tic();
String DISTANCEFUNCTION = options.graphDistanceFunction;
Matrix A = adjacency(X, type, NN, DISTANCEFUNCTION);
System.out.format("Elapsed time: %.2f seconds.%n", toc());
String adjacencyFilePath = "adjacency.txt";
saveMatrix(adjacencyFilePath, A);
if (show)
disp(A.getSubMatrix(0, 4, 0, 4));
// Test laplacian function - pass
tic();
Matrix L = laplacian(X, type, options);
System.out.format("Elapsed time: %.2f seconds.%n", toc());
String LaplacianFilePath = "Laplacian.txt";
saveMatrix(LaplacianFilePath, L);
if (show)
disp(L.getSubMatrix(0, 4, 0, 4));
// Test local learning regularization - pass
NN = options.graphParam;
String DISTFUNC = options.graphDistanceFunction;
String KernelType = options.kernelType;
double KernelParam = options.kernelParam;
double lambda = 0.001;
tic();
Matrix LLR_text = calcLLR(X, NN, DISTFUNC, KernelType, KernelParam, lambda);
System.out.format("Elapsed time: %.2f seconds.%n", toc());
String LLRFilePath = "localLearningRegularization.txt";
saveMatrix(LLRFilePath, LLR_text);
if (show)
display(LLR_text.getSubMatrix(0, 4, 0, 4));
# Output
20 samples loaded
Graph type: nn with NN: 6
Computing directed adjacency graph...
Creating the adjacency matrix. Nearest neighbors, N = 6.
Elapsed time: 0.28 seconds.
Data matrix file written: adjacency.txt
(2, 1) 0.8162
(3, 1) 0.8841
(1, 2) 0.8162
(3, 2) 0.9041
(4, 2) 0.9074
(1, 3) 0.8841
(2, 3) 0.9041
(2, 4) 0.9074
(5, 4) 0.9558
(4, 5) 0.9558
Computing Graph Laplacian...
Computing directed adjacency graph...
Creating the adjacency matrix. Nearest neighbors, N = 6.
Elapsed time: 0.22 seconds.
Data matrix file written: Laplacian.txt
(1, 1) 1
(2, 1) -0.1522
(3, 1) -0.1118
(1, 2) -0.1522
(2, 2) 1
(3, 2) -0.1276
(4, 2) -0.1521
(1, 3) -0.1118
(2, 3) -0.1276
(3, 3) 1
(2, 4) -0.1521
(4, 4) 1
(5, 4) -0.1499
(4, 5) -0.1499
(5, 5) 1
Computing directed adjacency graph...
Creating the adjacency matrix. Nearest neighbors, N = 6.
Elapsed time: 0.26 seconds.
Data matrix file written: localLearningRegularization.txt
(1, 1) 1.0562
(2, 1) -0.2926
(3, 1) -0.1244
(4, 1) 0.0132
(1, 2) -0.2926
(2, 2) 1.0353
(3, 2) -0.1167
(4, 2) -0.1202
(5, 2) 0.0026
(1, 3) -0.1244
(2, 3) -0.1167
(3, 3) 1.0305
(4, 3) 0.0045
(1, 4) 0.0132
(2, 4) -0.1202
(3, 4) 0.0045
(4, 4) 1.0583
(5, 4) -0.0052
(2, 5) 0.0026
(4, 5) -0.0052
(5, 5) 1.0042
# -------------------------------------------------------------------------- #
# Matrix Completion
int m = 6;
int r = 1;
int p = (int) Math.round(m * m * 0.3);
Matrix L = randn(m, r);
Matrix R = randn(m, r);
Matrix A_star = mtimes(L, R.transpose());
int[] indices = randperm(m * m);
minusAssign(indices, 1);
indices = linearIndexing(indices, colon(0, p - 1));
Matrix Omega = zeros(size(A_star));
linearIndexingAssignment(Omega, indices, 1);
Matrix D = zeros(size(A_star));
linearIndexingAssignment(D, indices, linearIndexing(A_star, indices));
Matrix E_star = D.minus(A_star);
logicalIndexingAssignment(E_star, Omega, 0);
// Run matrix completion
MatrixCompletion matrixCompletion = new MatrixCompletion();
matrixCompletion.feedData(D);
matrixCompletion.feedIndices(Omega);
tic();
matrixCompletion.run();
fprintf("Elapsed time: %.2f seconds.%n", toc());
// Output
Matrix A_hat = matrixCompletion.GetLowRankEstimation();
fprintf("A*:\n");
disp(A_star, 4);
fprintf("A^:\n");
disp(A_hat, 4);
fprintf("D:\n");
disp(D, 4);
fprintf("rank(A*): %d\n", rank(A_star));
fprintf("rank(A^): %d\n", rank(A_hat));
fprintf("||A* - A^||_F: %.4f\n", norm(A_star.minus(A_hat), "fro"));
# Output
Elapsed time: 0.07 seconds.
A*:
1.1683 -0.4309 1.7763 -0.3742 -0.4096 -0.6490
-2.9765 1.0978 -4.5257 0.9533 1.0437 1.6536
-0.4342 0.1601 -0.6602 0.1391 0.1522 0.2412
1.6696 -0.6158 2.5386 -0.5347 -0.5854 -0.9276
-0.6291 0.2320 -0.9566 0.2015 0.2206 0.3495
0.4162 -0.1535 0.6329 -0.1333 -0.1459 -0.2312
A^:
1.1683 -0.4309 -0.0000 -0.3742 -0.4096 -0.6490
-2.9765 1.0978 0.0000 0.9533 1.0437 1.6536
-0.4342 0.1601 0.0000 0.1391 0.1522 0.2412
1.6696 -0.6158 -0.0000 -0.5347 -0.5854 -0.9276
-0.6291 0.2320 0.0000 0.2015 0.2206 0.3495
0.4162 -0.1535 -0.0000 -0.1333 -0.1459 -0.2312
D:
0 0 0 0 0 -0.6490
-2.9765 0 0 0 1.0437 1.6536
0 0 0 0.1391 0.1522 0
1.6696 -0.6158 0 -0.5347 0 0
0 0 0 0 0.2206 0
0 0 0 -0.1333 0 0
rank(A*): 1
rank(A^): 1
||A* - A^||_F: 5.6420
# -------------------------------------------------------------------------- #
# Robust PCA
int m = 8;
int r = m / 4;
Matrix L = randn(m, r);
Matrix R = randn(m, r);
Matrix A_star = mtimes(L, R.transpose());
Matrix E_star = zeros(size(A_star));
int[] indices = randperm(m * m);
int nz = m * m / 20;
int[] nz_indices = new int[nz];
for (int i = 0; i < nz; i++) {
nz_indices[i] = indices[i] - 1;
}
Matrix E_vec = vec(E_star);
setSubMatrix(E_vec, nz_indices, new int[] {0}, (minus(rand(nz, 1), 0.5).times(100)));
E_star = reshape(E_vec, size(E_star));
// Input
Matrix D = A_star.plus(E_star);
double lambda = 1 * Math.pow(m, -0.5);
// Run Robust PCA
RobustPCA robustPCA = new RobustPCA(lambda);
robustPCA.feedData(D);
tic();
robustPCA.run();
fprintf("Elapsed time: %.2f seconds.%n", toc());
// Output
Matrix A_hat = robustPCA.GetLowRankEstimation();
Matrix E_hat = robustPCA.GetErrorMatrix();
fprintf("A*:\n");
disp(A_star, 4);
fprintf("A^:\n");
disp(A_hat, 4);
fprintf("E*:\n");
disp(E_star, 4);
fprintf("E^:\n");
disp(E_hat, 4);
fprintf("rank(A*): %d\n", rank(A_star));
fprintf("rank(A^): %d\n", rank(A_hat));
fprintf("||A* - A^||_F: %.4f\n", norm(A_star.minus(A_hat), "fro"));
fprintf("||E* - E^||_F: %.4f\n", norm(E_star.minus(E_hat), "fro"));
# Output
Elapsed time: 0.05 seconds.
A*:
0.4542 1.4239 -0.7884 0.9424 -1.9555 0.3413 -0.9643 1.9974
3.2620 -0.5927 -1.7637 0.6375 1.4598 -1.2037 1.9060 -1.8263
-1.9222 -0.4894 1.3415 -0.8509 0.3416 0.4260 -0.4386 -0.1774
-1.3192 0.7056 0.5454 0.0062 -1.2580 0.6442 -1.1511 1.4349
-2.2590 0.3961 1.2266 -0.4496 -0.9903 0.8288 -1.3082 1.2432
3.2288 1.0411 -2.3324 1.5534 -0.8878 -0.6416 0.5579 0.6255
1.7459 -0.1703 -0.9969 0.4244 0.5708 -0.5946 0.9002 -0.7579
1.0295 -1.0952 -0.2294 -0.3135 1.7622 -0.6867 1.3429 -1.9339
A^:
0.4296 1.4239 -0.7884 0.9422 -1.9322 0.3282 -0.9459 1.9804
3.1412 -0.5927 -1.7637 0.6375 1.4596 -1.2037 1.9060 -1.8263
-1.8337 -0.4894 1.3369 -0.8509 0.3325 0.4261 -0.4385 -0.1774
-1.3192 -0.4311 0.5454 -0.1480 0.2049 0.6442 -0.7934 0.4356
-2.1752 0.3960 1.2266 -0.4496 -0.9903 0.8287 -1.3082 1.2432
2.6885 1.0320 -1.9370 1.2690 -0.8878 -0.6340 0.5579 0.5161
1.6787 -0.1704 -0.9966 0.4247 0.5725 -0.5947 0.9002 -0.7579
1.0103 -1.0939 -0.2342 -0.3135 1.7505 -0.6867 1.3428 -1.9330
E*:
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 -20.5412 0 0 34.4119 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 20.5941 0 0 0
0 0 0 0 0 0 0 0
E^:
0.0245 0 0 0.0002 -0.0233 0.0131 -0.0184 0.0170
0.1208 0 0 0 0.0001 -0.0001 0.0000 0
-0.0885 0 0.0046 -0.0000 0.0091 -0.0001 -0.0001 0
0 -19.4046 0 0.1542 32.9490 0 -0.3577 0.9993
-0.0839 0.0001 0.0000 0 0 0.0000 0.0000 0.0000
0.5403 0.0091 -0.3953 0.2844 0 -0.0076 0 0.1094
0.0672 0.0001 -0.0004 -0.0003 20.5923 0.0000 0 0
0.0193 -0.0014 0.0048 0 0.0117 0 0.0001 -0.0009
rank(A*): 2
rank(A^): 4
||A* - A^||_F: 2.2716
||E* - E^||_F: 2.2716
# -------------------------------------------------------------------------- #
# LASSO
double[][] data = {{1, 2, 3, 2},
{4, 2, 3, 6},
{5, 1, 2, 1}};
double[][] depVars = {{3, 2},
{2, 3},
{1, 4}};
Options options = new Options();
options.maxIter = 600;
options.lambda = 0.05;
options.verbose = false;
options.calc_OV = false;
options.epsilon = 1e-5;
Regression LASSO = new LASSO(options);
LASSO.feedData(data);
LASSO.feedDependentVariables(depVars);
tic();
LASSO.train();
fprintf("Elapsed time: %.3f seconds\n\n", toc());
fprintf("Projection matrix:\n");
display(LASSO.W);
Matrix Yt = LASSO.predict(data);
fprintf("Predicted dependent variables:\n");
display(Yt);
# Output
Elapsed time: 0.060 seconds
Projection matrix:
-0.2295 0.5994
0 0
1.1058 0.5858
-0.0631 -0.1893
Predicted dependent variables:
2.9618 1.9782
2.0209 3.0191
1.0009 3.9791
# -------------------------------------------------------------------------- #
# Linear Regression
double[][] data = {
{1, 2, 3, 2},
{4, 2, 3, 6},
{5, 1, 4, 1}
};
double[][] depVars = {
{3, 2},
{2, 3},
{1, 4}
};
Options options = new Options();
options.maxIter = 600;
options.lambda = 0.1;
options.verbose = false;
options.calc_OV = false;
options.epsilon = 1e-5;
Regression LR = new LinearRegression(options);
LR.feedData(data);
LR.feedDependentVariables(depVars);
tic();
LR.train();
fprintf("Elapsed time: %.3f seconds\n\n", toc());
fprintf("Projection matrix:\n");
display(LR.W);
fprintf("Bias vector:\n");
display(((LinearRegression)LR).B);
Matrix Yt = LR.predict(data);
fprintf("Predicted dependent variables:\n");
display(Yt);
# Output
Elapsed time: 0.025 seconds
Projection matrix:
-0.4700 0.4049
0.5621 0.1216
0.6117 0.4454
0.1163 -0.0513
Bias vector:
0.2345
0.1131
Predicted dependent variables:
2.9561 1.9949
2.0111 3.0043
1.0093 3.9895
# -------------------------------------------------------------------------- #
# Basic Conditional Random Field (CRF)
// Number of data sequences
int D = 1000;
// Minimal length for the randomly generated data sequences
int n_min = 4;
// Maximal length for the randomly generated data sequences
int n_max = 6;
// Number of feature functions
int d = 10;
// Number of states
int N = 2;
// Sparseness for the feature matrices
double sparseness = 0.2;
// Randomly generate labeled sequential data for CRF
Object[] dataSequences = CRF.generateDataSequences(D, n_min, n_max, d, N, sparseness);
Matrix[][][] Fs = (Matrix[][][]) dataSequences[0];
int[][] Ys = (int[][]) dataSequences[1];
// Train a CRF model for the randomly generated sequential data with labels
double epsilon = 1e-4;
CRF CRF = new CRF(epsilon);
CRF.feedData(Fs);
CRF.feedLabels(Ys);
CRF.train();
// Save the CRF model
String modelFilePath = "CRF-Model.dat";
CRF.saveModel(modelFilePath);
fprintf("CRF Parameters:\n");
display(CRF.W);
// Prediction
CRF = new CRF();
CRF.loadModel(modelFilePath);
int ID = new Random().nextInt(D);
int[] Yt = Ys[ID];
Matrix[][] Fst = Fs[ID];
fprintf("True label sequence:\n");
display(Yt);
fprintf("Predicted label sequence:\n");
display(CRF.predict(Fst));
# Output
Initial ofv: 46.6525
Iter 1, ofv: 45.9559, norm(Grad): 0.475827
Iter 2, ofv: 33.8123, norm(Grad): 0.472433
Iter 3, ofv: 25.6766, norm(Grad): 0.335332
Objective function value doesn't decrease, iteration stopped!
Iter 4, ofv: 25.6766, norm(Grad): 0.0901032
Model saved.
CRF Parameters:
-5.7255
6.0560
5.8877
-7.7452
11.2735
-4.5474
-0.6763
3.6997
-4.3791
0.6701
Loading model...
Model loaded.
True label sequence:
1 0 1 0
Predicted label sequence:
P*(YPred|x) = 0.624515
1 1 1 0
# -------------------------------------------------------------------------- #
# Hidden Markov Model (HMM)
int numStates = 3;
int numObservations = 2;
double epsilon = 1e-8;
int maxIter = 10;
double[] pi = new double[] {0.33, 0.33, 0.34};
double[][] A = new double[][] {
{0.5, 0.3, 0.2},
{0.3, 0.5, 0.2},
{0.2, 0.4, 0.4}
};
double[][] B = new double[][] {
{0.7, 0.3},
{0.5, 0.5},
{0.4, 0.6}
};
// Generate the data sequences for training
int D = 10000;
int T_min = 5;
int T_max = 10;
int[][][] data = HMM.generateDataSequences(D, T_min, T_max, pi, A, B);
int[][] Os = data[0];
int[][] Qs = data[1];
boolean trainHMM = true;
if (trainHMM){
HMM HMM = new HMM(numStates, numObservations, epsilon, maxIter);
HMM.feedData(Os);
HMM.feedLabels(Qs);
HMM.train();
fprintf("True Model Parameters: \n");
fprintf("Initial State Distribution: \n");
display(pi);
fprintf("State Transition Probability Matrix: \n");
display(A);
fprintf("Observation Probability Matrix: \n");
display(B);
fprintf("Trained Model Parameters: \n");
fprintf("Initial State Distribution: \n");
display(HMM.pi);
fprintf("State Transition Probability Matrix: \n");
display(HMM.A);
fprintf("Observation Probability Matrix: \n");
display(HMM.B);
String HMMModelFilePath = "HMMModel.dat";
HMM.saveModel(HMMModelFilePath);
}
// Predict the single best state path
int ID = new Random().nextInt(D);
int[] O = Os[ID];
HMM HMMt = new HMM();
HMMt.loadModel("HMMModel.dat");
int[] Q = HMMt.predict(O);
fprintf("Observation sequence: \n");
HMMt.showObservationSequence(O);
fprintf("True state sequence: \n");
HMMt.showStateSequence(Qs[ID]);
fprintf("Predicted state sequence: \n");
HMMt.showStateSequence(Q);
double p = HMMt.evaluate(O);
System.out.format("P(O|Theta) = %f\n", p);
# Output
Iter: 1, log[P(O|Theta)]: -51857.728975
Iter: 2, log[P(O|Theta)]: -51857.679272
Iter: 3, log[P(O|Theta)]: -51857.633951
Iter: 4, log[P(O|Theta)]: -51857.592261
Iter: 5, log[P(O|Theta)]: -51857.553794
Iter: 6, log[P(O|Theta)]: -51857.518197
Iter: 7, log[P(O|Theta)]: -51857.485156
Iter: 8, log[P(O|Theta)]: -51857.454400
Iter: 9, log[P(O|Theta)]: -51857.425690
Iter: 10, log[P(O|Theta)]: -51857.398814
True Model Parameters:
Initial State Distribution:
0.3300
0.3300
0.3400
State Transition Probability Matrix:
0.5000 0.3000 0.2000
0.3000 0.5000 0.2000
0.2000 0.4000 0.4000
Observation Probability Matrix:
0.7000 0.3000
0.5000 0.5000
0.4000 0.6000
Trained Model Parameters:
Initial State Distribution:
0.3309
0.3269
0.3422
State Transition Probability Matrix:
0.5000 0.3067 0.1934
0.2959 0.5031 0.2010
0.2037 0.3940 0.4023
Observation Probability Matrix:
0.7030 0.2970
0.4957 0.5043
0.3956 0.6044
Model saved.
Loading model...
Model loaded.
Observation sequence:
1 0 1 0 1 0 0
True state sequence:
1 1 1 1 1 0 2
Predicted state sequence:
1 1 1 1 1 0 0
P(O|Theta) = 0.007996
# -------------------------------------------------------------------------- #
# Isomap
double[][] data = {{0, 2, 3, 4}, {2, 0, 4, 5}, {3, 4.1, 5, 6}, {2, 7, 1, 6}};
Matrix X = new DenseMatrix(data);
X = X.transpose();
int K = 3;
int r = 3;
Matrix R = Isomap.run(X, K, r);
disp("Original Data:");
disp(X);
disp("Reduced Data:");
disp(R);
# Output
Computing directed adjacency graph...
Creating the adjacency matrix. Nearest neighbors, N = 3.
Original Data:
0 2 3 2
2 0 4.1000 7
3 4 5 1
4 5 6 6
Reduced Data:
2.2473 2.5240 -0.7279
-3.4592 2.1429 0.4972
2.9742 -1.3689 0.7943
-1.7622 -3.2981 -0.5637
# -------------------------------------------------------------------------- #
# Kernel PCA
double[][] data = {
{0, 2, 3, 4},
{2, 0, 4, 5},
{3, 4.1, 5, 6},
{2, 7, 1, 6}
};
Matrix X = new DenseMatrix(data).transpose();
int r = 3;
Matrix R = KernelPCA.run(X, r);
disp("Original Data:");
disp(X);
disp("Reduced Data:");
disp(R);
# Output
Original Data:
0 2 3 2
2 0 4.1000 7
3 4 5 1
4 5 6 6
Reduced Data:
0.5011 0.0000 0.7063
-0.4987 -0.7080 -0.0001
0.4990 0.0018 -0.7078
-0.5013 0.7062 0.0017
# -------------------------------------------------------------------------- #
# Locally Linear Embedding (LLE)
double[][] data = {
{0, 2, 3, 4},
{2, 0, 4, 5},
{3, 4.1, 5, 6},
{2, 7, 1, 6}
};
Matrix X = new DenseMatrix(data).transpose();
int K = 3;
int r = 3;
Matrix R = LLE.run(X, K, r);
disp("Original Data:");
disp(X);
disp("Reduced Data:");
disp(R);
# Output
Computing directed adjacency graph...
Creating the adjacency matrix. Nearest neighbors, N = 3.
Original Data:
0 2 3 2
2 0 4.1000 7
3 4 5 1
4 5 6 6
Reduced Data:
-1.0360 -0.8728 1.0793
1.0282 -1.1127 -0.8395
-0.9630 0.8727 -1.1450
0.9707 1.1128 0.9051
# -------------------------------------------------------------------------- #
# Multi-dimensional Scaling (MDS)
double[][] data = {
{0, 2, 3, 4},
{2, 0, 4, 5},
{3, 4.1, 5, 6},
{2, 7, 1, 6}
};
Matrix O = new DenseMatrix(data).transpose();
Matrix D = l2Distance(O, O);
Matrix X = MDS.run(D, 3);
disp("Reduced X:");
disp(X);
# Output
Reduced X:
2.2473 2.5240 -0.7279
-3.4592 2.1429 0.4972
2.9742 -1.3689 0.7943
-1.7622 -3.2981 -0.5637
# -------------------------------------------------------------------------- #
# Principal Component Analysis (PCA)
double[][] data = {
{0, 2, 3, 4},
{2, 0, 4, 5},
{3, 4.1, 5, 6},
{2, 7, 1, 6}
};
Matrix X = new DenseMatrix(data).transpose();
int r = 3;
Matrix R = PCA.run(X, r);
disp("Original Data:");
disp(X);
disp("Reduced Data:");
disp(R);
# Output
Original Data:
0 2 3 2
2 0 4.1000 7
3 4 5 1
4 5 6 6
Reduced Data:
-2.2473 2.5240 0.7279
3.4592 2.1429 -0.4972
-2.9742 -1.3689 -0.7943
1.7622 -3.2981 0.5637
# -------------------------------------------------------------------------- #
# General Quadratic Programming
/*
 * min (1/2) x' * Q * x + c' * x
 * s.t. A * x = b
 *      B * x <= d
 */
/*
* Number of unknown variables
*/
int n = 5;
/*
* Number of inequality constraints
*/
int m = 6;
/*
* Number of equality constraints
*/
int p = 3;
Matrix x = rand(n, n);
Matrix Q = x.mtimes(x.transpose()).plus(times(rand(1), eye(n)));
Matrix c = rand(n, 1);
double HasEquality = 1;
Matrix A = times(HasEquality, rand(p, n));
x = rand(n, 1);
Matrix b = A.mtimes(x);
Matrix B = rand(m, n);
double rho = -2;
Matrix d = plus(B.mtimes(x), times(rho, ones(m, 1)));
QPSolution solution = GeneralQP.solve(Q, c, A, b, B, d);
# Output
Phase I:
Terminate successfully.
x_opt:
640.1439 6991.6488 -2975.2211 -6851.4134 -6405.0501
s_opt:
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
lambda for the inequalities s_i >= 0:
1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
B * x - d:
-2123.5480 -2842.1394 -4496.4653 -1640.3370 -3613.3843 -14582.9937
lambda for the inequalities fi(x) <= s_i:
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
nu for the equalities A * x = b:
-0.0000 -0.0000 -0.0000
residual: 1.07222e-11
A * x - b:
0.0000 0.0000 -0.0000
norm(A * x - b, "fro"): 0.000000
fval_opt: 3.65174e-11
The problem is feasible.
Computation time: 0.590000 seconds
halt execution temporarily in 1 seconds...
Phase II:
Terminate successfully.
residual: 9.83544e-12
Optimal objective function value: 173.050
Optimizer:
1.4528 7.8845 -3.0459 -7.3020 -6.5998
B * x - d:
-0.0000 -1.4506 -3.0662 -0.0000 -2.0318 -14.1904
lambda:
52.4785 0.0000 0.0000 137.3581 0.0000 0.0000
nu:
-19.7244 -38.6926 -118.0042
norm(A * x - b, "fro"): 0.000000
Computation time: 0.061000 seconds
# -------------------------------------------------------------------------- #
# Quadratic Programming with Bound Constraints
/*
 * min (1/2) x' * Q * x + c' * x
 * s.t. l <= x <= u
 */
int n = 5;
Matrix x = rand(n);
Matrix Q = minus(x.mtimes(x.transpose()), times(rand(1).getEntry(0, 0), eye(n)));
Matrix c = plus(-2, times(2, rand(n, 1)));
double l = 0;
double u = 1;
double epsilon = 1e-6;
QPSolution S = QPWithBoundConstraints.solve(Q, c, l, u, epsilon);
disp("Q:");
disp(Q);
disp("c:");
disp(c);
fprintf("Optimum: %g\n", S.optimum);
fprintf("Optimizer:\n");
display(S.optimizer.transpose());
# Output
Initial ofv: -0.678917
Iter 1, ofv: -1.28937, norm(PGrad): 3.08886
PLBFGS converges with norm(PGrad) 0.000000
Q:
1.6699 1.4719 0.9729 0.9943 1.6691
1.4719 0.5587 0.5777 0.6273 1.1480
0.9729 0.5777 1.4139 0.6709 1.0215
0.9943 0.6273 0.6709 0.4050 0.6021
1.6691 1.1480 1.0215 0.6021 1.0643
c:
-0.3002
-1.7387
-1.7054
-1.8646
-1.9046
Optimum: -1.28937
Optimizer:
0 0.0469 0 0.7148 0
**************************************************
For other general-purpose optimization algorithms such as L-BFGS and accelerated proximal gradient, please refer to the documentation.
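As a pointer, the ml.optimization solvers are typically driven by an iterative, flags-based calling pattern. The sketch below minimizes f(W) = ||W - B||_F^2 / 2 with L-BFGS; the signature LBFGS.run(Grad, fval, epsilon, W) and the meaning of the returned flags are assumptions to verify against the Javadoc.
// Minimize f(W) = ||W - B||_F^2 / 2; the gradient is W - B.
Matrix B = rand(3, 3);
Matrix W = zeros(3, 3);
Matrix G = W.minus(B);
double fval = Math.pow(norm(W.minus(B), "fro"), 2) / 2;
boolean[] flags = null;
while (true) {
    flags = LBFGS.run(G, fval, 1e-6, W); // flags[0]: converged; flags[1]: new gradient needed (assumed semantics)
    if (flags[0])
        break;
    fval = Math.pow(norm(W.minus(B), "fro"), 2) / 2;
    if (flags[1])
        G = W.minus(B);
}
disp(W); // W should approach B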
# -------------------------------------------------------------------------- #
# LDA
int[][] documents = { {1, 4, 3, 2, 3, 1, 4, 3, 2, 3, 1, 4, 3, 2, 3, 6},
{2, 2, 4, 2, 4, 2, 2, 2, 2, 4, 2, 2},
{1, 6, 5, 6, 0, 1, 6, 5, 6, 0, 1, 6, 5, 6, 0, 0},
{5, 6, 6, 2, 3, 3, 6, 5, 6, 2, 2, 6, 5, 6, 6, 6, 0},
{2, 2, 4, 4, 4, 4, 1, 5, 5, 5, 5, 5, 5, 1, 1, 1, 1, 0},
{5, 4, 2, 3, 4, 5, 6, 6, 5, 4, 3, 2} };
LDAOptions LDAOptions = new LDAOptions();
LDAOptions.nTopic = 2;
LDAOptions.iterations = 5000;
LDAOptions.burnIn = 1500;
LDAOptions.thinInterval = 200;
LDAOptions.sampleLag = 10;
LDAOptions.alpha = 2;
LDAOptions.beta = 0.5;
LDA LDA = new LDA(LDAOptions);
LDA.readCorpus(documents);
LDA.train();
fprintf("Topic--term associations: \n");
display(LDA.topicMatrix);
fprintf("Document--topic associations: \n");
display(LDA.indicatorMatrix);
# Output
Topic--term associations:
0.1258 0.0176
0.1531 0.0846
0.0327 0.3830
0.0418 0.1835
0.0360 0.2514
0.2713 0.0505
0.3393 0.0294
Document--topic associations:
0.2559 0.7441
0.1427 0.8573
0.8573 0.1427
0.6804 0.3196
0.5491 0.4509
0.4420 0.5580
# -------------------------------------------------------------------------- #
# Factorization Machines (FM)
String trainFilePath = "Train.txt";
String testFilePath = "Test.txt";
String outputFilePath = "FM-YijPredOnTest.txt";
// Load training data
int idxStart = 0;
FM.feedTrainingData(trainFilePath, idxStart);
// Initialization (the hyperparameter values below are illustrative assumptions)
int k = 8; // dimension of the latent factors
int maxIter = 100; // maximum number of iterations
double lambda = 0.01; // regularization parameter
FM.allocateResource(k);
FM.feedParams(maxIter, lambda);
FM.initialize();
// Train FM model parameters by training data
FM.train();
// Prediction: generate and save FM-YijPredOnTest.txt
DataSet testData = FM.loadData(testFilePath, idxStart);
double[] Yij_pred = FM.predict(testData.X);
ml.utils.IO.save(Yij_pred, outputFilePath);
# -------------------------------------------------------------------------- #
# STructured Sparse Regression (STSR)
double lambda = 0.01;
double nu = 0.00001;
int maxIter = 30;
// LIBSVM format
String trainFilePath = "Train.txt";
String testFilePath = "Test.txt";
String outputFilePath = "STSR-YijPredOnTest.txt";
// Load training data
int idxStart = 0;
STSR.feedTrainingData(trainFilePath, idxStart);
// Build tree structured pair groups
/*
* featureSize format:
* User[\t]383
* Item[\t]1175
* Event[\t]1
*/
String featureSizeFilePath = "FeatureSize.txt";
/*
* Each line is a group feature index pair (idx1, idx2) separated by
* a tab character, e.g.
* (157, 158)
* (157, 236)
* (24, 157)[\t](157, 158)[\t](157, 236)
*/
String userFeatureGroupListFilePath = "UserTreeStructuredPairGroupList.txt";
String itemFeatureGroupListFilePath = "ItemTreeStructuredPairGroupList.txt";
STSR.buildTreeStructuredPairGroupList(
featureSizeFilePath,
userFeatureGroupListFilePath,
itemFeatureGroupListFilePath
);
// Initialization
STSR.allocateResource();
STSR.feedParams(maxIter, lambda, nu);
STSR.initialize();
// Train STSR model parameters by training data
STSR.train();
// Prediction: generate and save STSR-YijPredOnTest.txt
DataSet testData = STSR.loadData(testFilePath, idxStart);
double[] Yij_pred = STSR.predict(testData.X);
ml.utils.IO.save(Yij_pred, outputFilePath);
# -------------------------------------------------------------------------- #
-----------------------------------
Version: 1.6.5
Author: Mingjie Qian
Date: April 25th, 2016