LAML
LAML (Linear Algebra and Machine Learning) is a stand-alone, pure-Java library for linear algebra and machine learning. The goal is to provide efficient and easy-to-use linear algebra and machine learning libraries. The two are built together because fast implementations of machine learning methods require full control of the underlying data structures for matrices and vectors. Additionally, LAML provides many commonly used matrix functions with the same signatures as MATLAB's, so it can also be used to manually convert MATLAB code to Java code (see Tutorial-LAML.pdf).
The built-in linear algebra library supports well-designed dense and sparse matrices and vectors. Sparse matrices are designed and implemented with the standard compressed sparse column (CSC) and compressed sparse row (CSR) formats. Matrix multiplication is competitive with, or even faster than, well-known Java linear algebra libraries such as Commons-Math and Colt. Unlike other Java linear algebra libraries, the built-in linear algebra library in LAML gives users full control of sparse matrices and vectors (e.g., the internal arrays of sparse matrices), which is crucial for efficient high-level implementations.
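For example, a sparse matrix can be assembled entry by entry and its internal arrays read back directly. The sketch below is a minimal illustration; the SparseMatrix constructor and setEntry appear in the Usage section, while the accessor names for the internal CSC arrays (getPr, getIr, getJc) are assumptions to verify against the la.matrix Javadoc.
SparseMatrix S = new SparseMatrix(3, 3);
S.setEntry(0, 0, 2.0);
S.setEntry(2, 1, -1.0);
// Assumed accessors for the internal CSC arrays:
double[] pr = S.getPr(); // non-zero values
int[] ir = S.getIr();    // row indices of the non-zero values
int[] jc = S.getJc();    // column pointers
printMatrix(full(S));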
A carefully designed linear algebra library is the basis of the machine learning library, which aims to provide fast implementations of mature machine learning methods. For instance, the LinearBinarySVM class re-implements the popular liblinear in Java. For the heart_scale data with C = 1.0 and eps = 1e-2, the average running time is 0.04 seconds on an Intel(R) Core(TM) i7 CPU M620 @ 2.67GHz with 4.00GB memory running 64-bit Windows 7, even a little faster than liblinear, which takes 0.06 seconds on average with the same parameters.
JML vs. LAML:
LAML is much faster than JML (more than 3 times faster) for two implementation reasons. First, LAML allows full control of dense and sparse matrices and vectors. Second, LAML extensively uses in-place matrix and vector operations, thereby avoiding excessive memory allocation and garbage collection.
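A minimal sketch of the in-place style (the method names plusAssign and timesAssign are assumptions based on the ml.utils package; verify against the Javadoc):
Matrix A = rand(3, 3);
Matrix res = zeros(3, 3);
plusAssign(res, A);    // res = res + A, without allocating a new matrix
timesAssign(res, 0.5); // res = 0.5 * res, in place
disp(res);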
JML relies on a third-party linear algebra library, Apache Commons-Math. Sparse matrices and vectors have been deprecated in Commons-Math 3.0+ and will ultimately be removed, whereas LAML has its own built-in linear algebra library.
Like JML, LAML provides many commonly used matrix functions with the same signatures as MATLAB's, so it can also be used to manually convert MATLAB code to Java code (see the sketch below).
In short, JML has been replaced by LAML.
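For example, the MATLAB statements A = hilb(4, 4); B = A * A'; disp(B); translate almost one-to-one (every function below appears in the Usage section):
// MATLAB: A = hilb(4, 4); B = A * A'; disp(B); fprintf('rank(B): %d\n', rank(B));
Matrix A = hilb(4, 4);
Matrix B = A.mtimes(A.transpose());
disp(B);
fprintf("rank(B): %d%n", rank(B));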
SourceForge:
https://sourceforge.net/projects/lamal
Github:
https://github.com/MingjieQian/LAML
Documentation:
For more details about the LAML API, please refer to the online Javadoc.
Features:
Stand-alone Java library, completely cross-platform
Built-in Linear Algebra (LA) library
Full control of matrices and vectors
Many general-purpose optimization algorithms
Fast implementation of Machine Learning (ML) methods
Matrix functions with almost the same signatures as MATLAB's
Well-documented source code and a friendly API, very easy to use
Packages:
la.decomposition
LU, QR, eigenvalue decomposition, and SVD
la.matrix
Sparse and dense matrix implementation
la.vector
Sparse and dense vector implementation
la.io
Functions for saving and loading a matrix, a vector, or a data set
ml.utils
Efficient functions for array and matrix operations
ml.classification
Linear SVM, linear multi-class SVM, regularized logistic regression, maximum entropy modeling, and AdaBoost
ml.clustering
K-means, L1NMF, NMF, and spectral clustering
ml.optimization
L-BFGS, BoundConstrainedPLBFGS, NonnegativePLBFGS, Projection, ProximalMapping, ShrinkageOperator, accelerated proximal gradient, accelerated gradient descent, general quadratic programming, nonlinear conjugate gradient, L-BFGS on simplex, quadratic programming with bound constraints, and the primal-dual interior-point method
ml.sequence
Hidden Markov Models (HMM) and Conditional Random Fields (CRF)
ml.kernel
Commonly used kernel functions ('linear' | 'poly' | 'rbf' | 'cosine')
ml.manifold
Commonly used manifold learning functions such as computing adjacency matrix, Laplacian matrix, and local
learning regularization matrix
ml.subspace
PCA, kernel PCA, Multi-dimensional Scaling (MDS), Isomap, and Locally Linear Embedding (LLE)
ml.regression
LASSO and linear regression
ml.random
Multivariate Gaussian distribution
ml.recovery
Matrix completion and robust PCA
ml.topics
LDA
ml.graph
Minimum spanning tree using Prim's algorithm, shortest path using Dijkstra's algorithm, topological order,
all-pairs shortest path using Floyd-Warshall algorithm, Huffman codes, and maximum flow using Ford-Fulkerson algorithm.
ml.recommendation
Factorization machines and structured sparse regression (STSR)
...
Usage:
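The snippets below omit the import preamble for brevity. A typical header looks like the following; the utility class names (Matlab, Printer, Time) are inferred from the ml.utils package described above and should be verified against the Javadoc.
import la.matrix.*;
import la.vector.*;
import la.decomposition.*;
import static ml.utils.Matlab.*;
import static ml.utils.Printer.*;
import static ml.utils.Time.*;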
# Eigenvalue Decomposition (For Real Symmetric Matrices)
int m = 4;
int n = 4;
Matrix A = hilb(m, n);
fprintf("A:%n");
disp(A);
long start = System.currentTimeMillis();
Matrix[] VD = EigenValueDecomposition.decompose(A);
System.out.format("Elapsed time: %.4f seconds.%n", (System.currentTimeMillis() - start) / 1000.0);
fprintf("*****************************************%n");
Matrix V = VD[0];
Matrix D = VD[1];
fprintf("V:%n");
printMatrix(V);
fprintf("D:%n");
printMatrix(D);
fprintf("VDV':%n");
disp(V.mtimes(D).mtimes(V.transpose()));
fprintf("A:%n");
printMatrix(A);
fprintf("V'V:%n");
printMatrix(V.transpose().mtimes(V));
# Output
A:
1 0.5000 0.3333 0.2500
0.5000 0.3333 0.2500 0.2000
0.3333 0.2500 0.2000 0.1667
0.2500 0.2000 0.1667 0.1429
Elapsed time: 0.0100 seconds.
*****************************************
V:
0.7926 0.5821 -0.1792 -0.0292
0.4519 -0.3705 0.7419 0.3287
0.3224 -0.5096 -0.1002 -0.7914
0.2522 -0.5140 -0.6383 0.5146
D:
1.5002
0.1691
0.0067
0.0001
VDV':
1 0.5000 0.3333 0.2500
0.5000 0.3333 0.2500 0.2000
0.3333 0.2500 0.2000 0.1667
0.2500 0.2000 0.1667 0.1429
A:
1 0.5000 0.3333 0.2500
0.5000 0.3333 0.2500 0.2000
0.3333 0.2500 0.2000 0.1667
0.2500 0.2000 0.1667 0.1429
V'V:
1.0000 -0.0000 -0.0000 -0.0000
-0.0000 1.0000 0.0000 0
-0.0000 0.0000 1.0000 -0.0000
-0.0000 0 -0.0000 1.0000
# -------------------------------------------------------------------------- #
# LU Decomposition
double[][] data = new double[][] {
{1, -2, 3},
{2, -5, 12},
{0, 2, -10}
};
Matrix A = new DenseMatrix(data);
fprintf("A:%n");
printMatrix(A);
Matrix[] LUP = LUDecomposition.decompose(A);
Matrix L = LUP[0];
Matrix U = LUP[1];
Matrix P = LUP[2];
fprintf("L:%n");
printMatrix(L);
fprintf("U:%n");
printMatrix(U);
fprintf("P:%n");
printMatrix(P);
fprintf("PA:%n");
printMatrix(P.mtimes(A));
fprintf("LU:%n");
printMatrix(L.mtimes(U));
long start = System.currentTimeMillis();
LUDecomposition LUDecomp = new LUDecomposition(A);
Vector b = new DenseVector(new double[] {2, 3, 4});
Vector x = LUDecomp.solve(b);
fprintf("Solution for Ax = b:%n");
printVector(x);
fprintf("b = %n");
printVector(b);
fprintf("Ax = %n");
printVector(A.operate(x));
fprintf("A^{-1}:%n");
printMatrix(LUDecomp.inverse());
fprintf("det(A) = %.2f%n", LUDecomp.det());
System.out.format("Elapsed time: %.2f seconds.%n", (System.currentTimeMillis() - start) / 1000F);
fprintf("**********************************%n");
A = sparse(A);
fprintf("A:%n");
printMatrix(A);
LUP = LUDecomposition.decompose(A);
L = LUP[0];
U = LUP[1];
P = LUP[2];
fprintf("L:%n");
printMatrix(L);
fprintf("U:%n");
printMatrix(U);
fprintf("P:%n");
printMatrix(P);
fprintf("PA:%n");
printMatrix(P.mtimes(A));
fprintf("LU:%n");
printMatrix(L.mtimes(U));
start = System.currentTimeMillis();
LUDecomp = new LUDecomposition(sparse(A));
b = new DenseVector(new double[] {2, 3, 4});
x = LUDecomp.solve(b);
fprintf("Solution for Ax = b:%n");
printVector(x);
fprintf("Ax = %n");
printVector(A.operate(x));
fprintf("b = %n");
printVector(b);
Matrix B = new DenseMatrix(new double[][] {
{2, 4},
{3, 3},
{4, 2} }
);
Matrix X = LUDecomp.solve(B);
fprintf("Solution for AX = B:%n");
printMatrix(X);
fprintf("AX = %n");
printMatrix(A.mtimes(X));
fprintf("B = %n");
printMatrix(B);
fprintf("A^{-1}:%n");
printMatrix(LUDecomp.inverse());
fprintf("det(A) = %.2f%n", LUDecomp.det());
System.out.format("Elapsed time: %.2f seconds.%n", (System.currentTimeMillis() - start) / 1000F);
# Output
A:
1 -2 3
2 -5 12
0 2 -10
L:
1 0 0
0 1 0
0.5000 0.2500 1
U:
2 -5 12
0 2 -10
0 0 -0.5000
P:
0 1 0
0 0 1
1 0 0
PA:
2 -5 12
0 2 -10
1 -2 3
LU:
2 -5 12
0 2 -10
1 -2 3
Solution for Ax = b:
13
7
1
b =
2
3
4
Ax =
2
3
4
A^{-1}:
-13 7 4.5000
-10 5 3
-2 1 0.5000
det(A) = -2.00
Elapsed time: 0.02 seconds.
**********************************
A:
1 -2 3
2 -5 12
2 -10
L:
1
1
0.5000 0.2500 1
U:
2 -5 12
2 -10
-0.5000
P:
1
1
1
PA:
2 -5 12
2 -10
1 -2 3
LU:
2 -5 12
2 -10
1 -2 3
Solution for Ax = b:
13
7
1
Ax =
2
3
4
b =
2
3
4
Solution for AX = B:
13 -22
7 -19
1 -4
AX =
2 4
3 3
4 2
B =
2 4
3 3
4 2
A^{-1}:
-13 7 4.5000
-10 5 3
-2 1 0.5000
det(A) = -2.00
Elapsed time: 0.02 seconds.
# -------------------------------------------------------------------------- #
# QR Decomposition
int m = 4;
int n = 3;
Matrix A = hilb(m, n);
fprintf("When A is full:%n");
fprintf("A:%n");
printMatrix(A);
long start = System.currentTimeMillis();
Matrix[] QRP = QRDecomposition.decompose(A);
Matrix Q = QRP[0];
Matrix R = QRP[1];
Matrix P = QRP[2];
fprintf("Q:%n");
printMatrix(Q);
fprintf("R:%n");
printMatrix(R);
fprintf("P:%n");
printMatrix(P);
fprintf("AP:%n");
printMatrix(A.mtimes(P));
fprintf("QR:%n");
printMatrix(Q.mtimes(R));
fprintf("Q'Q:%n");
printMatrix(Q.transpose().mtimes(Q));
System.out.format("Elapsed time: %.2f seconds.%n", (System.currentTimeMillis() - start) / 1000F);
fprintf("**********************************%n");
// fprintf("|AP - QR| = ");
A = sparse(hilb(m, n));
fprintf("When A is sparse:%n");
fprintf("A:%n");
printMatrix(A);
start = System.currentTimeMillis();
QRP = QRDecomposition.decompose(A);
Q = QRP[0];
R = QRP[1];
P = QRP[2];
fprintf("Q:%n");
printMatrix(Q);
fprintf("R:%n");
printMatrix(R);
fprintf("P:%n");
printMatrix(P);
fprintf("AP:%n");
printMatrix(A.mtimes(P));
fprintf("QR:%n");
printMatrix(Q.mtimes(R));
fprintf("Q'Q:%n");
printMatrix(Q.transpose().mtimes(Q));
System.out.format("Elapsed time: %.2f seconds.%n", (System.currentTimeMillis() - start) / 1000F);
QRDecomposition QRDecomp = new QRDecomposition(A);
Vector b = new DenseVector(new double[] {2, 3, 4, 9});
Vector x = QRDecomp.solve(b);
fprintf("Solution for Ax = b:%n");
printVector(x);
fprintf("b = %n");
printVector(b);
fprintf("Ax = %n");
printVector(A.operate(x));
# Output
When A is full:
A:
1 0.5000 0.3333
0.5000 0.3333 0.2500
0.3333 0.2500 0.2000
0.2500 0.2000 0.1667
Q:
-0.8381 0.5144 -0.1796 -0.0263
-0.4191 -0.4052 0.7487 0.3157
-0.2794 -0.5351 -0.1132 -0.7892
-0.2095 -0.5338 -0.6280 0.5261
R:
-1.1932 -0.4749 -0.6705
0 -0.1258 -0.1184
0 0 0.0059
0 0 0
P:
1
1
1
AP:
1 0.3333 0.5000
0.5000 0.2500 0.3333
0.3333 0.2000 0.2500
0.2500 0.1667 0.2000
QR:
1.0000 0.3333 0.5000
0.5000 0.2500 0.3333
0.3333 0.2000 0.2500
0.2500 0.1667 0.2000
Q'Q:
1.0000 0.0000 -0.0000 0.0000
0.0000 1 -0.0000 0
-0.0000 -0.0000 1.0000 0
0.0000 0 0 1
Elapsed time: 0.05 seconds.
**********************************
When A is sparse:
A:
1 0.5000 0.3333
0.5000 0.3333 0.2500
0.3333 0.2500 0.2000
0.2500 0.2000 0.1667
Q:
-0.8381 0.5144 -0.1796 -0.0263
-0.4191 -0.4052 0.7487 0.3157
-0.2794 -0.5351 -0.1132 -0.7892
-0.2095 -0.5338 -0.6280 0.5261
R:
-1.1932 -0.4749 -0.6705
-0.1258 -0.1184
0.0059
P:
1
1
1
AP:
1 0.3333 0.5000
0.5000 0.2500 0.3333
0.3333 0.2000 0.2500
0.2500 0.1667 0.2000
QR:
1.0000 0.3333 0.5000
0.5000 0.2500 0.3333
0.3333 0.2000 0.2500
0.2500 0.1667 0.2000
Q'Q:
1.0000 0.0000 -0.0000 0.0000
0.0000 1 -0.0000 0
-0.0000 -0.0000 1.0000 0
0.0000 0 0 1
Elapsed time: 0.04 seconds.
Solution for Ax = b:
117.2346
-719.5017
733.7439
b =
2
3
4
9
Ax =
2.0651
2.2194
5.9516
7.6990
# -------------------------------------------------------------------------- #
# Singular Value Decomposition
Matrix A = new DenseMatrix(new double[][] { {1d, 2d}, {2d, 0d}, {1d, 7d}});
long start = System.currentTimeMillis();
boolean computeUV = true;
Matrix[] USV = SingularValueDecomposition.decompose(A, computeUV);
System.out.format("Elapsed time: %.4f seconds.%n", (System.currentTimeMillis() - start) / 1000.0);
fprintf("*****************************************%n");
Matrix U = USV[0];
Matrix S = USV[1];
Matrix V = USV[2];
if (computeUV) {
fprintf("USV':%n");
disp(U.mtimes(S).mtimes(V.transpose()));
fprintf("A:%n");
printMatrix(A);
fprintf("U'U:%n");
printMatrix(U.transpose().mtimes((U)));
fprintf("V'V:%n");
printMatrix(V.transpose().mtimes((V)));
fprintf("U:%n");
printMatrix(U);
fprintf("V:%n");
printMatrix(V);
}
fprintf("S:%n");
printMatrix(S);
fprintf("rank(A): %d%n", rank(A));
# Output
Elapsed time: 0.0100 seconds.
*****************************************
USV':
1.0000 2.0000
2.0000 -0.0000
1.0000 7.0000
A:
1 2
2 0
1 7
U'U:
1.0000 0.0000 0.0000
0.0000 1.0000 0.0000
0.0000 0.0000 1.0000
V'V:
1.0000 -0.0000
-0.0000 1.0000
U:
-0.2906 -0.2976 -0.9094
-0.0492 -0.9445 0.3248
-0.9556 0.1391 0.2598
V:
-0.1819 -0.9833
-0.9833 0.1819
S:
7.3935
2.0822
rank(A): 2
# -------------------------------------------------------------------------- #
# Linear Binary SVM
double C = 1.0;
double eps = 1e-4;
Classifier linearBinarySVM = new LinearBinarySVM(C, eps);
int[] pred_labels = null;
double[][] data = {
{3.5, 4.4, 1.3, 2.3},
{5.3, 2.2, 0.5, 4.5},
{0.2, 0.3, 4.1, -3.1},
{-1.2, 0.4, 3.2, 1.6}
};
int[] labels = new int[] {1, 1, -1, -1};
linearBinarySVM.feedData(data);
linearBinarySVM.feedLabels(labels);
linearBinarySVM.train();
fprintf("W:%n");
printMatrix(linearBinarySVM.W);
fprintf("b:%n");
printVector(linearBinarySVM.b);
pred_labels = linearBinarySVM.predict(data);
getAccuracy(pred_labels, labels);
# Output
W:
0.2143
0.1312
-0.2407
0.0150
b:
-0.0490
Accuracy: 100.00%
# -------------------------------------------------------------------------- #
# Linear Multi-class SVM
double C = 1.0;
double eps = 1e-4;
Classifier linearMCSVM = new LinearMCSVM(C, eps);
double[][] data = {
{3.5, 4.4, 1.3, 2.3},
{5.3, 2.2, 0.5, 4.5},
{0.2, 0.3, 4.1, -3.1},
{-1.2, 0.4, 3.2, 1.6}
};
int[] labels = new int[] {1, 2, 3, 4};
linearMCSVM.feedData(data);
linearMCSVM.feedLabels(labels);
linearMCSVM.train();
fprintf("W:%n");
printMatrix(linearMCSVM.W);
fprintf("b:%n");
printVector(linearMCSVM.b);
int[] pred_labels = linearMCSVM.predict(data);
getAccuracy(pred_labels, labels);
# Output
..
W:
-0.0482 0.1314 0.0630 -0.1462
0.2825 -0.2241 -0.0053 -0.0531
-0.0819 -0.0836 0.0599 0.1056
-0.0838 0.1711 -0.2301 0.1428
b:
-0.0248 -0.0089 0.0043 0.0295
Accuracy: 100.00%
# -------------------------------------------------------------------------- #
# Multi-class Logistic Regression with Multiple Choices of Regularization
double[][] data = {
{3.5, 5.3, 0.2, -1.2},
{4.4, 2.2, 0.3, 0.4},
{1.3, 0.5, 4.1, 3.2}
};
int[] labels = new int[] {1, 2, 3};
/*
* Regularization type.
* 0: No regularization
* 1: L1 regularization
* 2: L2^2 regularization
* 3: L2 regularization
* 4: Infinity norm regularization
*/
int regularizationType = 1;
double lambda = 0.1;
Classifier logReg = new LogisticRegression(regularizationType, lambda);
logReg.epsilon = 1e-5;
logReg.feedData(data);
logReg.feedLabels(labels);
// Get elapsed time in seconds
tic();
logReg.train();
fprintf("Elapsed time: %.3f seconds.%n", toc());
fprintf("W:%n");
printMatrix(logReg.W);
fprintf("b:%n");
printVector(logReg.b);
double[][] dataTest = data;
fprintf("Ground truth:%n");
printMatrix(logReg.Y);
fprintf("Predicted probability matrix:%n");
Matrix Prob_pred = logReg.predictLabelScoreMatrix(dataTest);
disp(Prob_pred);
fprintf("Predicted label matrix:%n");
Matrix Y_pred = logReg.predictLabelMatrix(dataTest);
printMatrix(Y_pred);
int[] pred_labels = logReg.predict(dataTest);
getAccuracy(pred_labels, labels);
# Output
# Without regularization
L-BFGS converges with norm(Grad) 0.000006
Elapsed time: 0.060 seconds.
W:
-1.8522 3.1339 -1.2817
3.4138 -1.7282 -1.6856
-1.2455 -1.2630 2.5084
-2.8311 0.5558 2.2753
b:
-0.3616 0.2534 0.1082
Ground truth:
1
1
1
Predicted probability matrix:
1.0000 0.0000 0.0000
0.0000 1.0000 0.0000
0.0000 0.0000 1.0000
Predicted label matrix:
1
1
1
Accuracy: 100.00%
# L1-norm regularization
Accelerated proximal gradient method converges with norm(G_Y_k) 0.000004
Elapsed time: 0.430 seconds.
W:
0 0.8678 0
0.9579 0 0
0 0 0.9894
0 0 0
b:
0 0 0
Ground truth:
1
1
1
Predicted probability matrix:
0.8790 0.1143 0.0067
0.1493 0.8263 0.0244
0.0258 0.0494 0.9247
Predicted label matrix:
1
1
1
Accuracy: 100.00%
# squared Frobenius norm regularization
Accelerated proximal gradient method converges with norm(G_Y_k) 0.000009
Elapsed time: 0.242 seconds.
W:
-0.2817 0.5368 -0.2551
0.5298 -0.3269 -0.2029
-0.1181 -0.3004 0.4185
-0.3814 0.0449 0.3364
b:
-0.0417 0.0266 0.0151
Ground truth:
1
1
1
Predicted probability matrix:
0.8872 0.1028 0.0100
0.1241 0.8299 0.0461
0.0137 0.0514 0.9349
Predicted label matrix:
1
1
1
Accuracy: 100.00%
# Frobenius norm regularization
Accelerated proximal gradient method converges with norm(G_Y_k) 0.000008
Elapsed time: 0.358 seconds.
W:
-0.4333 0.7669 -0.3336
0.7402 -0.4956 -0.2447
-0.1262 -0.4227 0.5489
-0.5135 0.0810 0.4326
b:
-0.0587 0.0375 0.0212
Ground truth:
1
1
1
Predicted probability matrix:
0.9509 0.0461 0.0029
0.0558 0.9249 0.0192
0.0039 0.0221 0.9740
Predicted label matrix:
1
1
1
Accuracy: 100.00%
# Infinity norm regularization
Accelerated proximal gradient method converges with norm(G_Y_k) 0.000010
Elapsed time: 0.631 seconds.
W:
-0.7162 0.7162 -0.7162
0.7162 -0.7162 -0.7162
-0.7162 -0.7162 0.7162
-0.7162 0.7162 0.7162
b:
-0.6125 -0.3405 0.5728
Ground truth:
1
1
1
Predicted probability matrix:
0.9821 0.0175 0.0004
0.0179 0.9752 0.0068
0.0000 0.0072 0.9928
Predicted label matrix:
1
1
1
Accuracy: 100.00%
# -------------------------------------------------------------------------- #
# Maximum Entropy Modeling
long start = System.currentTimeMillis();
/*
* a 3D {@code double} array, where data[n][i][k]
* is the i-th feature value on the k-th class
* for the n-th sample
*/
double[][][] data = new double[][][] {
{{1, 0, 0}, {2, 1, -1}, {0, 1, 2}, {-1, 2, 1}},
{{0, 2, 0}, {1, 0, -1}, {0, 1, 1}, {-1, 3, 0.5}},
{{0, 0, 0.8}, {2, 1, -1}, {1, 3, 0}, {-0.5, -1, 2}},
{{0.5, 0, 0}, {1, 1, -1}, {0, 0.5, 1.5}, {-2, 1.5, 1}},
};
/*double [][] labels = new double[][] {
{1, 0, 0},
{0, 1, 0},
{0, 0, 1},
{1, 0, 0}
};*/
int[] labels = new int[] {1, 2, 3, 1};
MaxEnt maxEnt = new MaxEnt();
maxEnt.feedData(data);
maxEnt.feedLabels(labels);
maxEnt.train();
double elapsedTime = (System.currentTimeMillis() - start) / 1000d;
System.out.format("Elapsed time: %.3f seconds\n", elapsedTime);
fprintf("MaxEnt parameters:\n");
display(maxEnt.W);
String modelFilePath = "MaxEnt-Model.dat";
maxEnt.saveModel(modelFilePath);
maxEnt = new MaxEnt();
maxEnt.loadModel(modelFilePath);
fprintf("Predicted probability matrix:\n");
display(maxEnt.predictLabelScoreMatrix(data));
fprintf("Predicted label matrix:\n");
display(full(maxEnt.predictLabelMatrix(data)));
fprintf("Predicted labels:\n");
display(maxEnt.predict(data));
# Output
L-BFGS converges with norm(Grad) 0.000072
Elapsed time: 0.060 seconds
MaxEnt parameters:
7.4697
-1.1159
-2.8338
-1.0426
Model saved.
Loading model...
Model loaded.
Predicted probability matrix:
1.0000 0.0000 0.0000
0.0001 0.9999 0.0000
0.0001 0.0000 0.9999
0.9997 0.0002 0.0001
Predicted label matrix:
1 0 0
0 1 0
0 0 1
1 0 0
Predicted labels:
1 2 3 1
# -------------------------------------------------------------------------- #
# AdaBoost
double[][] data = { {3.5, 4.4, 1.3},
{5.3, 2.2, 0.5},
{0.2, 0.3, 4.1},
{5.3, 2.2, -1.5},
{-1.2, 0.4, 3.2} };
int[] labels = {1, 1, -1, -1, -1};
Matrix X = new DenseMatrix(data);
int T = 10;
Classifier[] weakClassifiers = new Classifier[T];
double epsilon = 1e-5; // convergence tolerance for each weak learner (assumed value)
for (int t = 0; t < T; t++) {
weakClassifiers[t] = new LogisticRegression(epsilon);
}
Classifier adaBoost = new AdaBoost(weakClassifiers);
adaBoost.feedData(X);
adaBoost.feedLabels(labels);
tic();
adaBoost.train();
System.out.format("Elapsed time: %.2f seconds.%n", toc());
Matrix Xt = X.copy();
display(full(adaBoost.predictLabelMatrix(Xt)));
display(adaBoost.predict(Xt));
double accuracy = Classifier.getAccuracy(labels, adaBoost.predict(Xt));
fprintf("Accuracy for AdaBoost with logistic regression: %.2f%%\n", 100 * accuracy);
# Output
Elapsed time: 0.18 seconds.
1 0
1 0
0 1
0 1
0 1
1 1 -1 -1 -1
Accuracy: 100.00%
# -------------------------------------------------------------------------- #
# K-means
double[][] data = {
{3.5, 5.3, 0.2, -1.2},
{4.4, 2.2, 0.3, 0.4},
{1.3, 0.5, 4.1, 3.2}
};
KMeansOptions options = new KMeansOptions();
options.nClus = 2;
options.verbose = true;
options.maxIter = 100;
KMeans KMeans = new KMeans(options);
KMeans.feedData(data);
// KMeans.initialize(null);
Matrix initializer = null;
initializer = new SparseMatrix(3, 2);
initializer.setEntry(0, 0, 1);
initializer.setEntry(1, 1, 1);
initializer.setEntry(2, 0, 1);
KMeans.clustering(initializer); // Use null for random initialization
System.out.println("Indicator Matrix:");
printMatrix(full(KMeans.getIndicatorMatrix()));
# Output
Iter 1: mse = 9.534 (0.000 secs)
KMeans complete.
Indicator Matrix:
0 1
0 1
1 0
# -------------------------------------------------------------------------- #
# NMF
double[][] data = {
{3.5, 4.4, 1.3},
{5.3, 2.2, 0.5},
{0.2, 0.3, 4.1},
{1.2, 0.4, 3.2}
};
KMeansOptions options = new KMeansOptions();
options.nClus = 2;
options.verbose = true;
options.maxIter = 100;
KMeans KMeans = new KMeans(options);
KMeans.feedData(data);
KMeans.initialize(null);
KMeans.clustering();
Matrix G0 = KMeans.getIndicatorMatrix();
NMFOptions NMFOptions = new NMFOptions();
NMFOptions.nClus = 2;
NMFOptions.maxIter = 50;
NMFOptions.verbose = true;
NMFOptions.calc_OV = false;
NMFOptions.epsilon = 1e-5;
Clustering NMF = new NMF(NMFOptions);
NMF.feedData(data);
// NMF.initialize(null);
NMF.clustering(G0); // If null, KMeans will be used for initialization
System.out.println("Basis Matrix:");
printMatrix(full(NMF.getCenters()));
System.out.println("Indicator Matrix:");
printMatrix(full(NMF.getIndicatorMatrix()));
# Output
Iter 1: mse = 13.060 (0.000 secs)
Iter 2: mse = 2.875 (0.000 secs)
KMeans complete.
Iteration 10, delta G: 0.001012
Converge successfully!
Basis Matrix:
5.0577 3.6774 0.5237
0.2326 0.3013 4.4328
Indicator Matrix:
0.8543 0.2004
0.8928 0
0.0000 0.9251
0.1520 0.7014
# -------------------------------------------------------------------------- #
# L1NMF
String dataMatrixFilePath = "CNN - DocTermCount.txt";
tic();
Matrix X = loadMatrixFromDocTermCountFile(dataMatrixFilePath);
X = Matlab.getTFIDF(X);
X = Matlab.normalizeByColumns(X);
X = X.transpose();
KMeansOptions kMeansOptions = new KMeansOptions();
kMeansOptions.nClus = 10;
kMeansOptions.maxIter = 50;
kMeansOptions.verbose = true;
KMeans KMeans = new KMeans(kMeansOptions);
KMeans.feedData(X);
// KMeans.initialize(null);
KMeans.clustering();
Matrix G0 = KMeans.getIndicatorMatrix();
// Optionally, load a fixed initial indicator matrix instead of the KMeans result:
// G0 = loadDenseMatrix("G0.txt");
L1NMFOptions L1NMFOptions = new L1NMFOptions();
L1NMFOptions.nClus = 10;
L1NMFOptions.gamma = 0.0001;
L1NMFOptions.mu = 0.1;
L1NMFOptions.maxIter = 50;
L1NMFOptions.verbose = true;
L1NMFOptions.calc_OV = false;
L1NMFOptions.epsilon = 1e-5;
Clustering L1NMF = new L1NMF(L1NMFOptions);
L1NMF.feedData(X);
// L1NMF.initialize(G0);
L1NMF.clustering(G0); // Use null for random initialization
System.out.format("Elapsed time: %.3f seconds\n", toc());
# Output
Iter 1: mse = 1.524 (0.030 secs)
Iter 2: mse = 0.816 (0.030 secs)
Iter 3: mse = 0.806 (0.030 secs)
Iter 4: mse = 0.805 (0.040 secs)
KMeans complete.
Iteration 10, delta G: 0.046591
Iteration 20, delta G: 0.047140
Iteration 30, delta G: 0.020651
Iteration 40, delta G: 0.010017
Iteration 50, delta G: 0.007973
Maximal iterations
Elapsed time: 3.933 seconds
# -------------------------------------------------------------------------- #
# Spectral Clustering
tic();
int nClus = 2;
boolean verbose = false;
int maxIter = 100;
String graphType = "nn";
double graphParam = 2;
String graphDistanceFunction = "euclidean";
String graphWeightType = "heat";
double graphWeightParam = 1;
ClusteringOptions options = new SpectralClusteringOptions(
nClus,
verbose,
maxIter,
graphType,
graphParam,
graphDistanceFunction,
graphWeightType,
graphWeightParam);
Clustering spectralClustering = new SpectralClustering(options);
double[][] data = {
{3.5, 5.3, 0.2, -1.2},
{4.4, 2.2, 0.3, 0.4},
{1.3, 0.5, 4.1, 3.2}
};
spectralClustering.feedData(data);
spectralClustering.clustering(null);
display(full(spectralClustering.getIndicatorMatrix()));
System.out.format("Elapsed time: %.3f seconds\n", toc());
# Output
Computing directed adjacency graph...
Creating the adjacency matrix. Nearest neighbors, N = 2.
KMeans complete.
Spectral clustering complete.
0 1
0 1
1 0
Elapsed time: 0.070 seconds
# -------------------------------------------------------------------------- #
# Computing Adjacency Matrix, Graph Laplacian and Local Learning Regularization
String filePath = "CNN - DocTermCount.txt";
Matrix X = loadMatrixFromDocTermCountFile(filePath);
int NSample = Math.min(20, X.getColumnDimension());
X = X.getSubMatrix(0, X.getRowDimension() - 1, 0, NSample - 1);
System.out.println(String.format("%d samples loaded", X.getColumnDimension()));
X = X.transpose();
GraphOptions options = new GraphOptions();
options.graphType = "nn";
String type = options.graphType;
double NN = options.graphParam;
System.out.println(String.format("Graph type: %s with NN: %d", type, (int)NN));
// Parameter setting for text data
options.kernelType = "cosine";
options.graphDistanceFunction = "cosine";
// Parameter setting for image data
/*options.kernelType = "rbf";
options.graphDistanceFunction = "euclidean";*/
options.graphNormalize = true;
options.graphWeightType = "heat";
boolean show = true;
// Test adjacency function - pass
tic();
String DISTANCEFUNCTION = options.graphDistanceFunction;
Matrix A = adjacency(X, type, NN, DISTANCEFUNCTION);
System.out.format("Elapsed time: %.2f seconds.%n", toc());
String adjacencyFilePath = "adjacency.txt";
saveMatrix(adjacencyFilePath, A);
if (show)
disp(A.getSubMatrix(0, 4, 0, 4));
// Test laplacian function - pass
tic();
Matrix L = laplacian(X, type, options);
System.out.format("Elapsed time: %.2f seconds.%n", toc());
String LaplacianFilePath = "Laplacian.txt";
saveMatrix(LaplacianFilePath, L);
if (show)
disp(L.getSubMatrix(0, 4, 0, 4));
// Test local learning regularization - pass
NN = options.graphParam;
String DISTFUNC = options.graphDistanceFunction;
String KernelType = options.kernelType;
double KernelParam = options.kernelParam;
double lambda = 0.001;
tic();
Matrix LLR_text = calcLLR(X, NN, DISTFUNC, KernelType, KernelParam, lambda);
System.out.format("Elapsed time: %.2f seconds.%n", toc());
String LLRFilePath = "localLearningRegularization.txt";
saveMatrix(LLRFilePath, LLR_text);
if (show)
display(LLR_text.getSubMatrix(0, 4, 0, 4));
# Output
20 samples loaded
Graph type: nn with NN: 6
Computing directed adjacency graph...
Creating the adjacency matrix. Nearest neighbors, N = 6.
Elapsed time: 0.28 seconds.
Data matrix file written: adjacency.txt
(2, 1) 0.8162
(3, 1) 0.8841
(1, 2) 0.8162
(3, 2) 0.9041
(4, 2) 0.9074
(1, 3) 0.8841
(2, 3) 0.9041
(2, 4) 0.9074
(5, 4) 0.9558
(4, 5) 0.9558
Computing Graph Laplacian...
Computing directed adjacency graph...
Creating the adjacency matrix. Nearest neighbors, N = 6.
Elapsed time: 0.22 seconds.
Data matrix file written: Laplacian.txt
(1, 1) 1
(2, 1) -0.1522
(3, 1) -0.1118
(1, 2) -0.1522
(2, 2) 1
(3, 2) -0.1276
(4, 2) -0.1521
(1, 3) -0.1118
(2, 3) -0.1276
(3, 3) 1
(2, 4) -0.1521
(4, 4) 1
(5, 4) -0.1499
(4, 5) -0.1499
(5, 5) 1
Computing directed adjacency graph...
Creating the adjacency matrix. Nearest neighbors, N = 6.
Elapsed time: 0.26 seconds.
Data matrix file written: localLearningRegularization.txt
(1, 1) 1.0562
(2, 1) -0.2926
(3, 1) -0.1244
(4, 1) 0.0132
(1, 2) -0.2926
(2, 2) 1.0353
(3, 2) -0.1167
(4, 2) -0.1202
(5, 2) 0.0026
(1, 3) -0.1244
(2, 3) -0.1167
(3, 3) 1.0305
(4, 3) 0.0045
(1, 4) 0.0132
(2, 4) -0.1202
(3, 4) 0.0045
(4, 4) 1.0583
(5, 4) -0.0052
(2, 5) 0.0026
(4, 5) -0.0052
(5, 5) 1.0042
# -------------------------------------------------------------------------- #
# Matrix Completion
int m = 6;
int r = 1;
int p = (int) Math.round(m * m * 0.3);
Matrix L = randn(m, r);
Matrix R = randn(m, r);
Matrix A_star = mtimes(L, R.transpose());
int[] indices = randperm(m * m);
minusAssign(indices, 1);
indices = linearIndexing(indices, colon(0, p - 1));
Matrix Omega = zeros(size(A_star));
linearIndexingAssignment(Omega, indices, 1);
Matrix D = zeros(size(A_star));
linearIndexingAssignment(D, indices, linearIndexing(A_star, indices));
Matrix E_star = D.minus(A_star);
logicalIndexingAssignment(E_star, Omega, 0);
// Run matrix completion
MatrixCompletion matrixCompletion = new MatrixCompletion();
matrixCompletion.feedData(D);
matrixCompletion.feedIndices(Omega);
tic();
matrixCompletion.run();
fprintf("Elapsed time: %.2f seconds.%n", toc());
// Output
Matrix A_hat = matrixCompletion.GetLowRankEstimation();
fprintf("A*:\n");
disp(A_star, 4);
fprintf("A^:\n");
disp(A_hat, 4);
fprintf("D:\n");
disp(D, 4);
fprintf("rank(A*): %d\n", rank(A_star));
fprintf("rank(A^): %d\n", rank(A_hat));
fprintf("||A* - A^||_F: %.4f\n", norm(A_star.minus(A_hat), "fro"));
# Output
Elapsed time: 0.07 seconds.
A*:
1.1683 -0.4309 1.7763 -0.3742 -0.4096 -0.6490
-2.9765 1.0978 -4.5257 0.9533 1.0437 1.6536
-0.4342 0.1601 -0.6602 0.1391 0.1522 0.2412
1.6696 -0.6158 2.5386 -0.5347 -0.5854 -0.9276
-0.6291 0.2320 -0.9566 0.2015 0.2206 0.3495
0.4162 -0.1535 0.6329 -0.1333 -0.1459 -0.2312
A^:
1.1683 -0.4309 -0.0000 -0.3742 -0.4096 -0.6490
-2.9765 1.0978 0.0000 0.9533 1.0437 1.6536
-0.4342 0.1601 0.0000 0.1391 0.1522 0.2412
1.6696 -0.6158 -0.0000 -0.5347 -0.5854 -0.9276
-0.6291 0.2320 0.0000 0.2015 0.2206 0.3495
0.4162 -0.1535 -0.0000 -0.1333 -0.1459 -0.2312
D:
0 0 0 0 0 -0.6490
-2.9765 0 0 0 1.0437 1.6536
0 0 0 0.1391 0.1522 0
1.6696 -0.6158 0 -0.5347 0 0
0 0 0 0 0.2206 0
0 0 0 -0.1333 0 0
rank(A*): 1
rank(A^): 1
||A* - A^||_F: 5.6420
# -------------------------------------------------------------------------- #
# Robust PCA
int m = 8;
int r = m / 4;
Matrix L = randn(m, r);
Matrix R = randn(m, r);
Matrix A_star = mtimes(L, R.transpose());
Matrix E_star = zeros(size(A_star));
int[] indices = randperm(m * m);
int nz = m * m / 20;
int[] nz_indices = new int[nz];
for (int i = 0; i < nz; i++) {
nz_indices[i] = indices[i] - 1;
}
Matrix E_vec = vec(E_star);
setSubMatrix(E_vec, nz_indices, new int[] {0}, (minus(rand(nz, 1), 0.5).times(100)));
E_star = reshape(E_vec, size(E_star));
// Input
Matrix D = A_star.plus(E_star);
double lambda = 1 * Math.pow(m, -0.5);
// Run Robust PCA
RobustPCA robustPCA = new RobustPCA(lambda);
robustPCA.feedData(D);
tic();
robustPCA.run();
fprintf("Elapsed time: %.2f seconds.%n", toc());
// Output
Matrix A_hat = robustPCA.GetLowRankEstimation();
Matrix E_hat = robustPCA.GetErrorMatrix();
fprintf("A*:\n");
disp(A_star, 4);
fprintf("A^:\n");
disp(A_hat, 4);
fprintf("E*:\n");
disp(E_star, 4);
fprintf("E^:\n");
disp(E_hat, 4);
fprintf("rank(A*): %d\n", rank(A_star));
fprintf("rank(A^): %d\n", rank(A_hat));
fprintf("||A* - A^||_F: %.4f\n", norm(A_star.minus(A_hat), "fro"));
fprintf("||E* - E^||_F: %.4f\n", norm(E_star.minus(E_hat), "fro"));
# Output
Elapsed time: 0.05 seconds.
A*:
0.4542 1.4239 -0.7884 0.9424 -1.9555 0.3413 -0.9643 1.9974
3.2620 -0.5927 -1.7637 0.6375 1.4598 -1.2037 1.9060 -1.8263
-1.9222 -0.4894 1.3415 -0.8509 0.3416 0.4260 -0.4386 -0.1774
-1.3192 0.7056 0.5454 0.0062 -1.2580 0.6442 -1.1511 1.4349
-2.2590 0.3961 1.2266 -0.4496 -0.9903 0.8288 -1.3082 1.2432
3.2288 1.0411 -2.3324 1.5534 -0.8878 -0.6416 0.5579 0.6255
1.7459 -0.1703 -0.9969 0.4244 0.5708 -0.5946 0.9002 -0.7579
1.0295 -1.0952 -0.2294 -0.3135 1.7622 -0.6867 1.3429 -1.9339
A^:
0.4296 1.4239 -0.7884 0.9422 -1.9322 0.3282 -0.9459 1.9804
3.1412 -0.5927 -1.7637 0.6375 1.4596 -1.2037 1.9060 -1.8263
-1.8337 -0.4894 1.3369 -0.8509 0.3325 0.4261 -0.4385 -0.1774
-1.3192 -0.4311 0.5454 -0.1480 0.2049 0.6442 -0.7934 0.4356
-2.1752 0.3960 1.2266 -0.4496 -0.9903 0.8287 -1.3082 1.2432
2.6885 1.0320 -1.9370 1.2690 -0.8878 -0.6340 0.5579 0.5161
1.6787 -0.1704 -0.9966 0.4247 0.5725 -0.5947 0.9002 -0.7579
1.0103 -1.0939 -0.2342 -0.3135 1.7505 -0.6867 1.3428 -1.9330
E*:
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 -20.5412 0 0 34.4119 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 20.5941 0 0 0
0 0 0 0 0 0 0 0
E^:
0.0245 0 0 0.0002 -0.0233 0.0131 -0.0184 0.0170
0.1208 0 0 0 0.0001 -0.0001 0.0000 0
-0.0885 0 0.0046 -0.0000 0.0091 -0.0001 -0.0001 0
0 -19.4046 0 0.1542 32.9490 0 -0.3577 0.9993
-0.0839 0.0001 0.0000 0 0 0.0000 0.0000 0.0000
0.5403 0.0091 -0.3953 0.2844 0 -0.0076 0 0.1094
0.0672 0.0001 -0.0004 -0.0003 20.5923 0.0000 0 0
0.0193 -0.0014 0.0048 0 0.0117 0 0.0001 -0.0009
rank(A*): 2
rank(A^): 4
||A* - A^||_F: 2.2716
||E* - E^||_F: 2.2716
# -------------------------------------------------------------------------- #
# LASSO
double[][] data = {{1, 2, 3, 2},
{4, 2, 3, 6},
{5, 1, 2, 1}};
double[][] depVars = {{3, 2},
{2, 3},
{1, 4}};
Options options = new Options();
options.maxIter = 600;
options.lambda = 0.05;
options.verbose = false;
options.calc_OV = false;
options.epsilon = 1e-5;
Regression LASSO = new LASSO(options);
LASSO.feedData(data);
LASSO.feedDependentVariables(depVars);
tic();
LASSO.train();
fprintf("Elapsed time: %.3f seconds\n\n", toc());
fprintf("Projection matrix:\n");
display(LASSO.W);
Matrix Yt = LASSO.predict(data);
fprintf("Predicted dependent variables:\n");
display(Yt);
# Output
Elapsed time: 0.060 seconds
Projection matrix:
-0.2295 0.5994
0 0
1.1058 0.5858
-0.0631 -0.1893
Predicted dependent variables:
2.9618 1.9782
2.0209 3.0191
1.0009 3.9791
# -------------------------------------------------------------------------- #
# Linear Regression
double[][] data = {
{1, 2, 3, 2},
{4, 2, 3, 6},
{5, 1, 4, 1}
};
double[][] depVars = {
{3, 2},
{2, 3},
{1, 4}
};
Options options = new Options();
options.maxIter = 600;
options.lambda = 0.1;
options.verbose = false;
options.calc_OV = false;
options.epsilon = 1e-5;
Regression LR = new LinearRegression(options);
LR.feedData(data);
LR.feedDependentVariables(depVars);
tic();
LR.train();
fprintf("Elapsed time: %.3f seconds\n\n", toc());
fprintf("Projection matrix:\n");
display(LR.W);
fprintf("Bias vector:\n");
display(((LinearRegression)LR).B);
Matrix Yt = LR.predict(data);
fprintf("Predicted dependent variables:\n");
display(Yt);
# Output
Elapsed time: 0.025 seconds
Projection matrix:
-0.4700 0.4049
0.5621 0.1216
0.6117 0.4454
0.1163 -0.0513
Bias vector:
0.2345
0.1131
Predicted dependent variables:
2.9561 1.9949
2.0111 3.0043
1.0093 3.9895
# -------------------------------------------------------------------------- #
# Basic Conditional Random Field (CRF)
// Number of data sequences
int D = 1000;
// Minimal length for the randomly generated data sequences
int n_min = 4;
// Maximal length for the randomly generated data sequences
int n_max = 6;
// Number of feature functions
int d = 10;
// Number of states
int N = 2;
// Sparseness for the feature matrices
double sparseness = 0.2;
// Randomly generate labeled sequential data for CRF
Object[] dataSequences = CRF.generateDataSequences(D, n_min, n_max, d, N, sparseness);
Matrix[][][] Fs = (Matrix[][][]) dataSequences[0];
int[][] Ys = (int[][]) dataSequences[1];
// Train a CRF model for the randomly generated sequential data with labels
double epsilon = 1e-4;
CRF CRF = new CRF(epsilon);
CRF.feedData(Fs);
CRF.feedLabels(Ys);
CRF.train();
// Save the CRF model
String modelFilePath = "CRF-Model.dat";
CRF.saveModel(modelFilePath);
fprintf("CRF Parameters:\n");
display(CRF.W);
// Prediction
CRF = new CRF();
CRF.loadModel(modelFilePath);
int ID = new Random().nextInt(D);
int[] Yt = Ys[ID];
Matrix[][] Fst = Fs[ID];
fprintf("True label sequence:\n");
display(Yt);
fprintf("Predicted label sequence:\n");
display(CRF.predict(Fst));
# Output
Initial ofv: 46.6525
Iter 1, ofv: 45.9559, norm(Grad): 0.475827
Iter 2, ofv: 33.8123, norm(Grad): 0.472433
Iter 3, ofv: 25.6766, norm(Grad): 0.335332
Objective function value doesn't decrease, iteration stopped!
Iter 4, ofv: 25.6766, norm(Grad): 0.0901032
Model saved.
CRF Parameters:
-5.7255
6.0560
5.8877
-7.7452
11.2735
-4.5474
-0.6763
3.6997
-4.3791
0.6701
Loading model...
Model loaded.
True label sequence:
1 0 1 0
Predicted label sequence:
P*(YPred|x) = 0.624515
1 1 1 0
# -------------------------------------------------------------------------- #
# Hidden Markov Model (HMM)
int numStates = 3;
int numObservations = 2;
double epsilon = 1e-8;
int maxIter = 10;
double[] pi = new double[] {0.33, 0.33, 0.34};
double[][] A = new double[][] {
{0.5, 0.3, 0.2},
{0.3, 0.5, 0.2},
{0.2, 0.4, 0.4}
};
double[][] B = new double[][] {
{0.7, 0.3},
{0.5, 0.5},
{0.4, 0.6}
};
// Generate the data sequences for training
int D = 10000;
int T_min = 5;
int T_max = 10;
int[][][] data = HMM.generateDataSequences(D, T_min, T_max, pi, A, B);
int[][] Os = data[0];
int[][] Qs = data[1];
boolean trainHMM = true;
if (trainHMM){
HMM HMM = new HMM(numStates, numObservations, epsilon, maxIter);
HMM.feedData(Os);
HMM.feedLabels(Qs);
HMM.train();
fprintf("True Model Parameters: \n");
fprintf("Initial State Distribution: \n");
display(pi);
fprintf("State Transition Probability Matrix: \n");
display(A);
fprintf("Observation Probability Matrix: \n");
display(B);
fprintf("Trained Model Parameters: \n");
fprintf("Initial State Distribution: \n");
display(HMM.pi);
fprintf("State Transition Probability Matrix: \n");
display(HMM.A);
fprintf("Observation Probability Matrix: \n");
display(HMM.B);
String HMMModelFilePath = "HMMModel.dat";
HMM.saveModel(HMMModelFilePath);
}
// Predict the single best state path
int ID = new Random().nextInt(D);
int[] O = Os[ID];
HMM HMMt = new HMM();
HMMt.loadModel("HMMModel.dat");
int[] Q = HMMt.predict(O);
fprintf("Observation sequence: \n");
HMMt.showObservationSequence(O);
fprintf("True state sequence: \n");
HMMt.showStateSequence(Qs[ID]);
fprintf("Predicted state sequence: \n");
HMMt.showStateSequence(Q);
double p = HMMt.evaluate(O);
System.out.format("P(O|Theta) = %f\n", p);
# Output
Iter: 1, log[P(O|Theta)]: -51857.728975
Iter: 2, log[P(O|Theta)]: -51857.679272
Iter: 3, log[P(O|Theta)]: -51857.633951
Iter: 4, log[P(O|Theta)]: -51857.592261
Iter: 5, log[P(O|Theta)]: -51857.553794
Iter: 6, log[P(O|Theta)]: -51857.518197
Iter: 7, log[P(O|Theta)]: -51857.485156
Iter: 8, log[P(O|Theta)]: -51857.454400
Iter: 9, log[P(O|Theta)]: -51857.425690
Iter: 10, log[P(O|Theta)]: -51857.398814
True Model Parameters:
Initial State Distribution:
0.3300
0.3300
0.3400
State Transition Probability Matrix:
0.5000 0.3000 0.2000
0.3000 0.5000 0.2000
0.2000 0.4000 0.4000
Observation Probability Matrix:
0.7000 0.3000
0.5000 0.5000
0.4000 0.6000
Trained Model Parameters:
Initial State Distribution:
0.3309
0.3269
0.3422
State Transition Probability Matrix:
0.5000 0.3067 0.1934
0.2959 0.5031 0.2010
0.2037 0.3940 0.4023
Observation Probability Matrix:
0.7030 0.2970
0.4957 0.5043
0.3956 0.6044
Model saved.
Loading model...
Model loaded.
Observation sequence:
1 0 1 0 1 0 0
True state sequence:
1 1 1 1 1 0 2
Predicted state sequence:
1 1 1 1 1 0 0
P(O|Theta) = 0.007996
# -------------------------------------------------------------------------- #
# Isomap
double[][] data = {{0, 2, 3, 4}, {2, 0, 4, 5}, {3, 4.1, 5, 6}, {2, 7, 1, 6}};
Matrix X = new DenseMatrix(data);
X = X.transpose();
int K = 3;
int r = 3;
Matrix R = Isomap.run(X, K, r);
disp("Original Data:");
disp(X);
disp("Reduced Data:");
disp(R);
# Output
Computing directed adjacency graph...
Creating the adjacency matrix. Nearest neighbors, N = 3.
Original Data:
0 2 3 2
2 0 4.1000 7
3 4 5 1
4 5 6 6
Reduced Data:
2.2473 2.5240 -0.7279
-3.4592 2.1429 0.4972
2.9742 -1.3689 0.7943
-1.7622 -3.2981 -0.5637
# -------------------------------------------------------------------------- #
# Kernel PCA
double[][] data = {
{0, 2, 3, 4},
{2, 0, 4, 5},
{3, 4.1, 5, 6},
{2, 7, 1, 6}
};
Matrix X = new DenseMatrix(data).transpose();
int r = 3;
Matrix R = KernelPCA.run(X, r);
disp("Original Data:");
disp(X);
disp("Reduced Data:");
disp(R);
# Output
Original Data:
0 2 3 2
2 0 4.1000 7
3 4 5 1
4 5 6 6
Reduced Data:
0.5011 0.0000 0.7063
-0.4987 -0.7080 -0.0001
0.4990 0.0018 -0.7078
-0.5013 0.7062 0.0017
# -------------------------------------------------------------------------- #
# Locally Linear Embedding (LLE)
double[][] data = {
{0, 2, 3, 4},
{2, 0, 4, 5},
{3, 4.1, 5, 6},
{2, 7, 1, 6}
};
Matrix X = new DenseMatrix(data).transpose();
int K = 3;
int r = 3;
Matrix R = LLE.run(X, K, r);
disp("Original Data:");
disp(X);
disp("Reduced Data:");
disp(R);
# Output
Computing directed adjacency graph...
Creating the adjacency matrix. Nearest neighbors, N = 3.
Original Data:
0 2 3 2
2 0 4.1000 7
3 4 5 1
4 5 6 6
Reduced Data:
-1.0360 -0.8728 1.0793
1.0282 -1.1127 -0.8395
-0.9630 0.8727 -1.1450
0.9707 1.1128 0.9051
# -------------------------------------------------------------------------- #
# Multi-dimensional Scaling (MDS)
double[][] data = {
{0, 2, 3, 4},
{2, 0, 4, 5},
{3, 4.1, 5, 6},
{2, 7, 1, 6}
};
Matrix O = new DenseMatrix(data).transpose();
Matrix D = l2Distance(O, O);
Matrix X = MDS.run(D, 3);
disp("Reduced X:");
disp(X);
# Output
Reduced X:
2.2473 2.5240 -0.7279
-3.4592 2.1429 0.4972
2.9742 -1.3689 0.7943
-1.7622 -3.2981 -0.5637
# -------------------------------------------------------------------------- #
# Principal Component Analysis (PCA)
double[][] data = {
{0, 2, 3, 4},
{2, 0, 4, 5},
{3, 4.1, 5, 6},
{2, 7, 1, 6}
};
Matrix X = new DenseMatrix(data).transpose();
int r = 3;
Matrix R = PCA.run(X, r);
disp("Original Data:");
disp(X);
disp("Reduced Data:");
disp(R);
# Output
Original Data:
0 2 3 2
2 0 4.1000 7
3 4 5 1
4 5 6 6
Reduced Data:
-2.2473 2.5240 0.7279
3.4592 2.1429 -0.4972
-2.9742 -1.3689 -0.7943
1.7622 -3.2981 0.5637
# -------------------------------------------------------------------------- #
# General Quadratic Programming
/*
 * min (1/2) x' * Q * x + c' * x
 * s.t. A * x = b
 *      B * x <= d
 */
/*
* Number of unknown variables
*/
int n = 5;
/*
* Number of inequality constraints
*/
int m = 6;
/*
* Number of equality constraints
*/
int p = 3;
Matrix x = rand(n, n);
Matrix Q = x.mtimes(x.transpose()).plus(times(rand(1), eye(n)));
Matrix c = rand(n, 1);
double HasEquality = 1;
Matrix A = times(HasEquality, rand(p, n));
x = rand(n, 1);
Matrix b = A.mtimes(x);
Matrix B = rand(m, n);
double rho = -2;
Matrix d = plus(B.mtimes(x), times(rho, ones(m, 1)));
QPSolution solution = GeneralQP.solve(Q, c, A, b, B, d);
# Output
Phase I:
Terminate successfully.
x_opt:
640.1439 6991.6488 -2975.2211 -6851.4134 -6405.0501
s_opt:
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
lambda for the inequalities s_i >= 0:
1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
B * x - d:
-2123.5480 -2842.1394 -4496.4653 -1640.3370 -3613.3843 -14582.9937
lambda for the inequalities fi(x) <= s_i:
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
nu for the equalities A * x = b:
-0.0000 -0.0000 -0.0000
residual: 1.07222e-11
A * x - b:
0.0000 0.0000 -0.0000
norm(A * x - b, "fro"): 0.000000
fval_opt: 3.65174e-11
The problem is feasible.
Computation time: 0.590000 seconds
halt execution temporarily in 1 seconds...
Phase II:
Terminate successfully.
residual: 9.83544e-12
Optimal objective function value: 173.050
Optimizer:
1.4528 7.8845 -3.0459 -7.3020 -6.5998
B * x - d:
-0.0000 -1.4506 -3.0662 -0.0000 -2.0318 -14.1904
lambda:
52.4785 0.0000 0.0000 137.3581 0.0000 0.0000
nu:
-19.7244 -38.6926 -118.0042
norm(A * x - b, "fro"): 0.000000
Computation time: 0.061000 seconds
# -------------------------------------------------------------------------- #
# Quadratic Programming with Bound Constraints
/*
 * min (1/2) x' * Q * x + c' * x
 * s.t. l <= x <= u
 */
int n = 5;
Matrix x = rand(n);
Matrix Q = minus(x.mtimes(x.transpose()), times(rand(1).getEntry(0, 0), eye(n)));
Matrix c = plus(-2, times(2, rand(n, 1)));
double l = 0;
double u = 1;
double epsilon = 1e-6;
QPSolution S = QPWithBoundConstraints.solve(Q, c, l, u, epsilon);
disp("Q:");
disp(Q);
disp("c:");
disp(c);
fprintf("Optimum: %g\n", S.optimum);
fprintf("Optimizer:\n");
display(S.optimizer.transpose());
# Output
Initial ofv: -0.678917
Iter 1, ofv: -1.28937, norm(PGrad): 3.08886
PLBFGS converges with norm(PGrad) 0.000000
Q:
1.6699 1.4719 0.9729 0.9943 1.6691
1.4719 0.5587 0.5777 0.6273 1.1480
0.9729 0.5777 1.4139 0.6709 1.0215
0.9943 0.6273 0.6709 0.4050 0.6021
1.6691 1.1480 1.0215 0.6021 1.0643
c:
-0.3002
-1.7387
-1.7054
-1.8646
-1.9046
Optimum: -1.28937
Optimizer:
0 0.0469 0 0.7148 0
**************************************************
For other general-purpose optimization algorithms such as L-BFGS and accelerated proximal gradient, please refer to the documentation.
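As a pointer, the ml.optimization solvers are typically driven by an iterative, flags-based calling pattern. The sketch below minimizes f(W) = ||W - B||_F^2 / 2 with L-BFGS; the signature LBFGS.run(Grad, fval, epsilon, W) and the meaning of the returned flags are assumptions to verify against the Javadoc.
// Minimize f(W) = ||W - B||_F^2 / 2; the gradient is W - B.
Matrix B = rand(3, 3);
Matrix W = zeros(3, 3);
Matrix G = W.minus(B);
double fval = Math.pow(norm(W.minus(B), "fro"), 2) / 2;
boolean[] flags = null;
while (true) {
    flags = LBFGS.run(G, fval, 1e-6, W); // flags[0]: converged; flags[1]: new gradient needed (assumed semantics)
    if (flags[0])
        break;
    fval = Math.pow(norm(W.minus(B), "fro"), 2) / 2;
    if (flags[1])
        G = W.minus(B);
}
disp(W); // W should approach B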
# -------------------------------------------------------------------------- #
# LDA
int[][] documents = { {1, 4, 3, 2, 3, 1, 4, 3, 2, 3, 1, 4, 3, 2, 3, 6},
{2, 2, 4, 2, 4, 2, 2, 2, 2, 4, 2, 2},
{1, 6, 5, 6, 0, 1, 6, 5, 6, 0, 1, 6, 5, 6, 0, 0},
{5, 6, 6, 2, 3, 3, 6, 5, 6, 2, 2, 6, 5, 6, 6, 6, 0},
{2, 2, 4, 4, 4, 4, 1, 5, 5, 5, 5, 5, 5, 1, 1, 1, 1, 0},
{5, 4, 2, 3, 4, 5, 6, 6, 5, 4, 3, 2} };
LDAOptions LDAOptions = new LDAOptions();
LDAOptions.nTopic = 2;
LDAOptions.iterations = 5000;
LDAOptions.burnIn = 1500;
LDAOptions.thinInterval = 200;
LDAOptions.sampleLag = 10;
LDAOptions.alpha = 2;
LDAOptions.beta = 0.5;
LDA LDA = new LDA(LDAOptions);
LDA.readCorpus(documents);
LDA.train();
fprintf("Topic--term associations: \n");
display(LDA.topicMatrix);
fprintf("Document--topic associations: \n");
display(LDA.indicatorMatrix);
# Output
Topic--term associations:
0.1258 0.0176
0.1531 0.0846
0.0327 0.3830
0.0418 0.1835
0.0360 0.2514
0.2713 0.0505
0.3393 0.0294
Document--topic associations:
0.2559 0.7441
0.1427 0.8573
0.8573 0.1427
0.6804 0.3196
0.5491 0.4509
0.4420 0.5580
# -------------------------------------------------------------------------- #
# Factorization Machines (FM)
String trainFilePath = "Train.txt";
String testFilePath = "Test.txt";
String outputFilePath = "FM-YijPredOnTest.txt";
// Load training data
int idxStart = 0;
FM.feedTrainingData(trainFilePath, idxStart);
// Initialization (the hyperparameter values below are illustrative assumptions)
int k = 8; // dimension of the latent factors
int maxIter = 100; // maximum number of iterations
double lambda = 0.01; // regularization parameter
FM.allocateResource(k);
FM.feedParams(maxIter, lambda);
FM.initialize();
// Train FM model parameters by training data
FM.train();
// Prediction: generate and save FM-YijPredOnTest.txt
DataSet testData = FM.loadData(testFilePath, idxStart);
double[] Yij_pred = FM.predict(testData.X);
ml.utils.IO.save(Yij_pred, outputFilePath);
# -------------------------------------------------------------------------- #
# STructured Sparse Regression (STSR)
double lambda = 0.01;
double nu = 0.00001;
int maxIter = 30;
// LIBSVM format
String trainFilePath = "Train.txt";
String testFilePath = "Test.txt";
String outputFilePath = "STSR-YijPredOnTest.txt";
// Load training data
int idxStart = 0;
STSR.feedTrainingData(trainFilePath, idxStart);
// Build tree structured pair groups
/*
* featureSize format:
* User[\t]383
* Item[\t]1175
* Event[\t]1
*/
String featureSizeFilePath = "FeatureSize.txt";
/*
* Each line is a group feature index pair (idx1, idx2) separated by
* a tab character, e.g.
* (157, 158)
* (157, 236)
* (24, 157)[\t](157, 158)[\t](157, 236)
*/
String userFeatureGroupListFilePath = "UserTreeStructuredPairGroupList.txt";
String itemFeatureGroupListFilePath = "ItemTreeStructuredPairGroupList.txt";
STSR.buildTreeStructuredPairGroupList(
featureSizeFilePath,
userFeatureGroupListFilePath,
itemFeatureGroupListFilePath
);
// Initialization
STSR.allocateResource();
STSR.feedParams(maxIter, lambda, nu);
STSR.initialize();
// Train STSR model parameters by training data
STSR.train();
// Prediction: generate and save STSR-YijPredOnTest.txt
DataSet testData = STSR.loadData(testFilePath, idxStart);
double[] Yij_pred = STSR.predict(testData.X);
ml.utils.IO.save(Yij_pred, outputFilePath);
# -------------------------------------------------------------------------- #
-----------------------------------
Version: 1.6.5
Author: Mingjie Qian
Date: April 25th, 2016