TopicMiner

TopicMiner is a new Java library for machine learning, although its initial focus was text mining.

Features:

It provides a general framework for users to implement machine learning methods themselves.

tm.clustering implements clustering-related models.

tm.classification implements classification-related methods.

tm.topics implements topic modeling and topic mining methods.

tm.data provides functions for reading and writing dense and sparse matrices in a form that MATLAB can load easily.
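As a rough illustration of what a MATLAB-loadable sparse matrix can look like, the sketch below serializes the nonzeros of a matrix as "row column value" triplets with 1-based indices, which MATLAB can read with load followed by spconvert. This is an assumption made for illustration only: the file format that tm.data actually uses is not described here, and the class and method names below are hypothetical and not part of TopicMiner.

```java
/** Illustrative sketch only; not the tm.data API or its file format. */
final class SparseTripletSketch {

    /**
     * Returns the nonzeros of a as "row col value" lines (1-based indices).
     * Assumes a is a non-empty rectangular array.
     */
    static String toTriplets(double[][] a) {
        int rows = a.length;
        int cols = a[0].length;
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                if (a[i][j] != 0.0) {
                    sb.append(i + 1).append(' ')
                      .append(j + 1).append(' ')
                      .append(a[i][j]).append('\n');
                }
            }
        }
        // spconvert infers the matrix size from the largest indices, so an
        // explicit "rows cols 0" line records the size when the last cell is zero.
        if (a[rows - 1][cols - 1] == 0.0) {
            sb.append(rows).append(' ').append(cols).append(" 0\n");
        }
        return sb.toString();
    }
}
```

Writing the returned string to a text file and running S = spconvert(load('file.txt')) in MATLAB should then rebuild the sparse matrix.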

tm.kernel computes the kernel matrix between two matrices; the kernel types currently implemented are linear, poly, rbf, and cosine.
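For readers unfamiliar with these four kernels, here is a minimal plain-Java sketch of what a kernel matrix between two data matrices contains. It is not the tm.kernel API: the class name, method names, the row-per-sample layout, and the parameters gamma, c, and degree are all assumptions chosen for illustration.

```java
import java.util.function.ToDoubleBiFunction;

/** Illustrative sketch only; not the tm.kernel API. Rows are samples. */
final class KernelSketch {

    /** Inner product of two vectors of equal length. */
    static double dot(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) {
            s += a[i] * b[i];
        }
        return s;
    }

    /** K[i][j] = k(x_i, y_j) for every row x_i of x and row y_j of y. */
    static double[][] gram(double[][] x, double[][] y,
                           ToDoubleBiFunction<double[], double[]> k) {
        double[][] out = new double[x.length][y.length];
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < y.length; j++) {
                out[i][j] = k.applyAsDouble(x[i], y[j]);
            }
        }
        return out;
    }

    /** Linear kernel: <a, b>. */
    static double[][] linear(double[][] x, double[][] y) {
        return gram(x, y, KernelSketch::dot);
    }

    /** Polynomial kernel: (<a, b> + c)^degree. */
    static double[][] poly(double[][] x, double[][] y, double c, int degree) {
        return gram(x, y, (a, b) -> Math.pow(dot(a, b) + c, degree));
    }

    /** RBF kernel: exp(-gamma * ||a - b||^2). */
    static double[][] rbf(double[][] x, double[][] y, double gamma) {
        return gram(x, y, (a, b) -> {
            // ||a - b||^2 = <a, a> - 2 <a, b> + <b, b>
            double d2 = dot(a, a) - 2.0 * dot(a, b) + dot(b, b);
            return Math.exp(-gamma * d2);
        });
    }

    /** Cosine kernel: <a, b> / (||a|| ||b||); NaN if either row is all zeros. */
    static double[][] cosine(double[][] x, double[][] y) {
        return gram(x, y, (a, b) -> dot(a, b) / Math.sqrt(dot(a, a) * dot(b, b)));
    }
}
```

For an n-by-d matrix x and an m-by-d matrix y, each method returns an n-by-m kernel matrix.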

tm.manifold implements functions related to manifold learning, i.e., computing the adjacency matrix and the Laplacian matrix. This subpackage is very useful for semi-supervised learning.
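To make the two objects concrete, the following plain-Java sketch builds a binary epsilon-neighborhood adjacency matrix W and the unnormalized graph Laplacian L = D - W, where D is the diagonal matrix of row sums of W. It is only one common construction and is not the tm.manifold API; the class name, method names, and the eps parameter are hypothetical, and the library may use a different graph (for example k-nearest neighbors) or a normalized Laplacian.

```java
/** Illustrative sketch only; not the tm.manifold API. Rows of x are samples. */
final class LaplacianSketch {

    /** W[i][j] = 1 if i != j and ||x_i - x_j|| <= eps, otherwise 0. */
    static double[][] adjacency(double[][] x, double eps) {
        int n = x.length;
        double[][] w = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                double d2 = 0.0;
                for (int k = 0; k < x[i].length; k++) {
                    double diff = x[i][k] - x[j][k];
                    d2 += diff * diff;
                }
                if (d2 <= eps * eps) {
                    // The graph is undirected, so W is symmetric.
                    w[i][j] = 1.0;
                    w[j][i] = 1.0;
                }
            }
        }
        return w;
    }

    /** Unnormalized Laplacian L = D - W, with D = diag(row sums of W). */
    static double[][] laplacian(double[][] w) {
        int n = w.length;
        double[][] l = new double[n][n];
        for (int i = 0; i < n; i++) {
            double degree = 0.0;
            for (int j = 0; j < n; j++) {
                degree += w[i][j];
                l[i][j] = -w[i][j];
            }
            l[i][i] += degree;
        }
        return l;
    }
}
```

Every row of the resulting L sums to zero, which is a quick sanity check on any implementation.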

tm.matlab implements some frequently used MATLAB matrix functions such as sort, sum, max, min, kron, vec, repmat, reshape, and colon.
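As a reminder of what three of these functions compute, here is a plain-Java sketch of kron, repmat, and vec on double[][] arrays. It only mirrors the MATLAB semantics; the signatures and matrix types of the actual tm.matlab functions are not shown here, so the class and method signatures below are assumptions.

```java
/** Illustrative sketch only; not the tm.matlab API. */
final class MatlabSketch {

    /** kron(A, B): block matrix whose (i, j) block is a[i][j] * B. */
    static double[][] kron(double[][] a, double[][] b) {
        int ar = a.length, ac = a[0].length;
        int br = b.length, bc = b[0].length;
        double[][] out = new double[ar * br][ac * bc];
        for (int i = 0; i < ar; i++) {
            for (int j = 0; j < ac; j++) {
                for (int p = 0; p < br; p++) {
                    for (int q = 0; q < bc; q++) {
                        out[i * br + p][j * bc + q] = a[i][j] * b[p][q];
                    }
                }
            }
        }
        return out;
    }

    /** repmat(A, m, n): tile A m times vertically and n times horizontally. */
    static double[][] repmat(double[][] a, int m, int n) {
        int r = a.length, c = a[0].length;
        double[][] out = new double[r * m][c * n];
        for (int i = 0; i < r * m; i++) {
            for (int j = 0; j < c * n; j++) {
                out[i][j] = a[i % r][j % c];
            }
        }
        return out;
    }

    /** vec(A): stack the columns of A into one vector (column-major order). */
    static double[] vec(double[][] a) {
        int r = a.length, c = a[0].length;
        double[] out = new double[r * c];
        for (int j = 0; j < c; j++) {
            for (int i = 0; i < r; i++) {
                out[j * r + i] = a[i][j];
            }
        }
        return out;
    }
}
```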

Note that a key advantage of this library is that the tm.matlab functions make it very convenient to translate a MATLAB implementation into a Java implementation.

The current version implements logistic regression, KMeans, NMF, L1NMF, and LSI as examples of how to implement machine learning methods with this general framework. The SVM package LIBLINEAR is also incorporated. I will try to add more important methods such as MaxEnt, spectral clustering, and LDA to this package if I get the time :)

I hope this library helps engineers and researchers shorten their development cycle.

Documentation:

For more details about the meaning of member variables and how to use the member functions of TopicMiner, please refer to the online documentation.

Dependencies:

TopicMiner depends on the Apache Commons Math library and LIBLINEAR.

I chose Commons Math because it supports both dense and sparse matrices, although it is not the fastest of the Java-based linear algebra libraries. I also built a similar library called JML, which uses jblas for basic matrix operations.

Download:

TopicMiner.zip

-----------------------------------

Author: Mingjie Qian

Version: 1.0

Date: March 9th, 2012