Toolkits

libLAML: A stand-alone C++ library for linear algebra and machine learning. libLAML compiles under MinGW and on Linux and Mac OS X. The goal is to provide efficient, easy-to-use linear algebra and machine learning libraries in C++. libLAML can also be used to port MATLAB code by hand to efficient C++ code.

LAML: A stand-alone, pure Java library for linear algebra and machine learning. The goal is to provide efficient, easy-to-use linear algebra and machine learning libraries. The two are built together because fast implementations of machine learning methods require full control over the basic data structures for matrices and vectors.

JML: A new pure Java library for machine learning that is very easy to use. Besides implementing a number of important machine learning methods and general-purpose optimization algorithms, the library provides a general framework for building machine learning tools with minimal effort when converting code from MATLAB to Java.

eLibrary (electric library): A Java application for searching files and folders in an OS file system. It differs from general OS file search engines in two ways: the indexing setup is personalized, so users can choose which directories to add to or remove from an existing index, and it can suggest queries, much like Google's "Did you mean" feature. Customized indexing and query suggestion greatly improve search speed and make the user experience more comfortable. eLibrary can also extract text content from many widely used file types, such as pdf, doc, ppt, and mp3, to improve index quality.
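
One common way to produce "Did you mean"-style suggestions is to compare the query against indexed terms by edit distance. The sketch below illustrates that general idea only; it is not eLibrary's actual implementation, and the class `QuerySuggester`, its methods, and the `maxDistance` parameter are hypothetical names introduced for illustration.

```java
import java.util.List;

// Hypothetical helper (not part of eLibrary): suggests the closest indexed term for a query.
class QuerySuggester {

    // Levenshtein distance computed with dynamic programming over two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the vocabulary term nearest to the query (case-insensitive),
    // or null when no term lies within maxDistance edits.
    static String suggest(String query, List<String> vocabulary, int maxDistance) {
        String best = null;
        int bestDistance = maxDistance + 1;
        for (String term : vocabulary) {
            int d = editDistance(query.toLowerCase(), term.toLowerCase());
            if (d < bestDistance) {
                bestDistance = d;
                best = term;
            }
        }
        return best;
    }
}
```

Under these assumptions, a call such as `QuerySuggester.suggest("machne", vocabulary, 2)` would return an indexed term like `machine` when one exists. A linear scan is adequate for a small vocabulary; a larger index would likely need a more selective candidate lookup.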

RSSFeedCrawler-Python: A Python crawler for multiple RSS feed sites. Both text and images can be scraped via HTML parsing.

RSSFeedCrawler: A Java crawler for multiple RSS feed sites. Both text and images can be scraped via HTML parsing.

JMLBLAS: A Java library for machine learning. It is very fast on moderate-scale data because it uses jblas.

TextProcessor: A Java text processing toolkit that provides frequently used text processing functions such as stemming, stop-word removal, term vocabulary generation, and computation of the term-document frequency matrix. Basic topic mining models such as LDA and sparse NMF are also supported.
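
To make the preprocessing steps concrete, here is a minimal sketch of stop-word removal, vocabulary generation, and a term-document frequency matrix in plain Java. It does not use TextProcessor's API: the class `TermDocMatrix`, its methods, and the tiny stop-word list are assumptions made for illustration, and stemming is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration, not TextProcessor's actual API.
class TermDocMatrix {

    // Illustrative stop-word list; a real toolkit would ship a much larger one.
    static final Set<String> STOP_WORDS = new HashSet<>(
            Arrays.asList("a", "an", "the", "is", "of", "and", "to", "in"));

    // Lowercases the text, splits on non-alphanumeric runs, and drops stop-words.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Collects the distinct terms of all documents in alphabetical order.
    static List<String> buildVocabulary(List<String> docs) {
        TreeSet<String> vocab = new TreeSet<>();
        for (String doc : docs) {
            vocab.addAll(tokenize(doc));
        }
        return new ArrayList<>(vocab);
    }

    // Builds a matrix whose entry [i][j] counts occurrences of term i in document j.
    static int[][] build(List<String> docs, List<String> vocab) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < vocab.size(); i++) {
            index.put(vocab.get(i), i);
        }
        int[][] matrix = new int[vocab.size()][docs.size()];
        for (int j = 0; j < docs.size(); j++) {
            for (String term : tokenize(docs.get(j))) {
                Integer i = index.get(term);
                if (i != null) {
                    matrix[i][j]++;
                }
            }
        }
        return matrix;
    }
}
```

Rows correspond to terms and columns to documents. A dense `int[][]` is used only to keep the sketch short; for a large corpus a sparse representation would be the more natural choice, and the resulting matrix is the kind of input that topic models such as LDA or NMF consume.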