Software

- Sparse Kernel Clustering for Large Number of Clusters
  - MATLAB implementation of the sparse kernel k-means algorithm, using the RBF kernel.
  - Usage: Unzip the downloaded file, unzip flann-*-src.zip into the resulting directory, and run main_sparse_kernel_kmeans.m. Provide the following parameters:
  - Buffer size parameters: Maximum buffer size m, Initial buffer size l
  - Neighborhood size for constructing the sparse kernel: p
  - RBF kernel width: lambda (Change the kernel definition in lines 36 and 71 if a different kernel needs to be used. Also change the flann input parameters in lines 34 and 69.)
  - Batch size: batchSize
  - Number of clusters: k
  - Eigenvector re-orthogonalization interval (determines how often the updated eigenvectors are orthogonalized): reorth_count
  - Parameter for lazy clustering (determines how often the points added to the buffer are clustered): reclustercount
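To illustrate what the neighborhood size p and the RBF kernel width control, here is a minimal Python sketch of a sparse RBF kernel in which each point keeps kernel values only for its p nearest neighbors. It is not the MATLAB code: the parameterization exp(-||x - y||^2 / (2 * width^2)) is an assumption (the page does not state how lambda enters the kernel), the neighbor search is brute force rather than FLANN, and the names `rbf_kernel`, `sparse_kernel_rows`, and `width` are hypothetical.

```python
import math

def rbf_kernel(x, y, width):
    """RBF kernel; the exp(-d^2 / (2 * width^2)) form is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * width ** 2))

def sparse_kernel_rows(points, p, width):
    """For each point, keep kernel values only for its p nearest neighbors.

    Brute-force search; the original package uses FLANN for this step.
    """
    rows = []
    for i, x in enumerate(points):
        # Squared distances to every other point, nearest first.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(x, y)), j)
            for j, y in enumerate(points) if j != i
        )
        rows.append({j: rbf_kernel(x, points[j], width) for _, j in dists[:p]})
    return rows

points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
rows = sparse_kernel_rows(points, p=1, width=1.0)
```

Each entry of `rows` is a dictionary mapping a neighbor index to a kernel value, so memory grows with p times the number of points rather than with its square.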
- Approximate Stream Kernel Clustering
  - MATLAB implementation of the stream kernel k-means algorithm, using the RBF kernel.
  - Usage: Unzip the downloaded file, and run the file: main_stream_kernel_kmeans.m. Provide the following parameters:
  - Buffer size parameters: Maximum buffer size m, Initial buffer size l
  - RBF kernel width: lambda (Change the kernel definition in lines 29 and 59 if a different kernel needs to be used)
  - Batch size: batchSize
  - Concept drift parameters: recency_threshold (determines how fast the concept changes) and recency_factor (rate of decay of the clusters - best value: 0.1)
  - Number of clusters: k
  - Eigenvector re-orthogonalization interval (determines how often the updated eigenvectors are orthogonalized): reorth_count
  - Parameter for lazy clustering (determines how often the points added to the buffer are clustered): reclustercount
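The role of reorth_count can be pictured with a short Python sketch: eigenvectors that are updated incrementally drift away from orthogonality, so they are re-orthogonalized every so many updates. The sketch uses modified Gram-Schmidt, which is an assumption; the page does not say which orthogonalization the MATLAB code uses, and the function names and the `update_index` counter are hypothetical.

```python
import math

def gram_schmidt(vectors):
    """Return an orthonormal basis for the span (modified Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            # Remove the component of w along each basis vector found so far.
            proj = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > 1e-12:  # skip numerically dependent vectors
            basis.append([wi / norm for wi in w])
    return basis

def maybe_reorthogonalize(vectors, update_index, reorth_count):
    """Orthogonalize only on every reorth_count-th update."""
    if update_index % reorth_count == 0:
        return gram_schmidt(vectors)
    return vectors

# Two vectors that have drifted slightly away from orthogonality.
drifted = [[1.0, 0.0, 0.0], [0.1, 1.0, 0.0]]
fixed = maybe_reorthogonalize(drifted, update_index=10, reorth_count=5)
dot = sum(a * b for a, b in zip(fixed[0], fixed[1]))
```

A smaller interval keeps the eigenvectors closer to orthogonal at the cost of more work per batch.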
- Kernel k-means based on Random Fourier Features
  - This is a MATLAB implementation of the fast kernel clustering algorithm proposed in the ICDM 2012 paper "Efficient Kernel Clustering Using Random Fourier Features".
  - Code is available for download here.
  - The .m file has a simple example demonstrating the working of the algorithm.
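As a rough illustration of the idea behind random Fourier features (not the downloadable MATLAB code), the Python sketch below builds a random feature map whose inner products approximate an RBF kernel. It follows the standard construction with paired cosine and sine features; the kernel form exp(-||x - y||^2 / (2 * width^2)) and the names `make_rff_map`, `num_features`, and `width` are assumptions made for this example. Ordinary k-means could then be run on the mapped points.

```python
import math
import random

def make_rff_map(dim, num_features, width, seed=0):
    """Build a feature map z with z(x) . z(y) approximately equal to the RBF kernel."""
    rng = random.Random(seed)
    # Frequencies drawn from a Gaussian with standard deviation 1 / width.
    freqs = [[rng.gauss(0.0, 1.0 / width) for _ in range(dim)]
             for _ in range(num_features)]
    scale = 1.0 / math.sqrt(num_features)

    def feature_map(x):
        feats = []
        for w in freqs:
            proj = sum(wi * xi for wi, xi in zip(w, x))
            feats.append(scale * math.cos(proj))
            feats.append(scale * math.sin(proj))
        return feats

    return feature_map

x, y = (0.3, -0.2), (1.0, 0.5)
z = make_rff_map(dim=2, num_features=2000, width=1.0)
zx, zy = z(x), z(y)

approx = sum(a * b for a, b in zip(zx, zy))
exact = math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / 2.0)
```

The approximation error shrinks as `num_features` grows, which is the trade-off between accuracy and the cost of the linear k-means step.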
- Approximate Kernel k-means
  - This is a MATLAB implementation of the approximate kernel k-means algorithm for large scale clustering.
  - Details about this algorithm are available in the KDD 2011 paper "Approximate Kernel k-means: solution to Large Scale Kernel Clustering".
  - Code is available for download here.
  - Usage: Unzip the downloaded file. The zipped file includes two files: approx_kkmeans.m and example.m. While the former is the core implementation of the approximate kernel k-means algorithm, the latter gives an example of how the algorithm is invoked on a simple 2-D dataset.
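For readers unfamiliar with kernel k-means on large data, the Python sketch below shows one simple way to cut its cost: cluster only a small sample with kernel k-means, then assign every point to the nearest resulting center in feature space. This is a simplified illustration and not the algorithm in approx_kkmeans.m, whose interface is not reproduced here; the function names, the seeding rule, and the RBF form are all assumptions.

```python
import math

def rbf(x, y, width=1.0):
    """RBF kernel; the parameterization is an assumption."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (2.0 * width ** 2))

def kernel_dist(x, members, kernel):
    """Squared feature-space distance from x to the mean of `members`."""
    n = len(members)
    cross = sum(kernel(x, s) for s in members) / n
    within = sum(kernel(s, t) for s in members for t in members) / (n * n)
    return kernel(x, x) - 2.0 * cross + within

def assign(x, clusters, kernel):
    """Index of the nearest non-empty cluster."""
    return min((c for c in range(len(clusters)) if clusters[c]),
               key=lambda c: kernel_dist(x, clusters[c], kernel))

def sampled_kernel_kmeans(points, sample_idx, k, kernel, iters=10):
    sample = [points[i] for i in sample_idx]
    # Seed with the first k sampled points (a hypothetical, simple choice).
    labels = [max(range(k), key=lambda c: kernel(s, sample[c])) for s in sample]
    for _ in range(iters):
        clusters = [[s for s, l in zip(sample, labels) if l == c] for c in range(k)]
        new_labels = [assign(s, clusters, kernel) for s in sample]
        if new_labels == labels:
            break  # converged on the sample
        labels = new_labels
    clusters = [[s for s, l in zip(sample, labels) if l == c] for c in range(k)]
    # Assign every point, sampled or not, to its nearest center.
    return [assign(x, clusters, kernel) for x in points]

points = [(0.0, 0.0), (0.2, 0.1), (0.1, -0.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
labels = sampled_kernel_kmeans(points, sample_idx=[0, 3, 1, 4], k=2, kernel=rbf)
# labels == [0, 0, 0, 1, 1, 1]
```

Only kernel values between points and the sample are ever computed, so the full kernel matrix over all points is never formed.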
- Incremental Topic and User Modeling
  - Java code for topic modeling on growing documents and for modeling users based on their posts to the Edmodo portal (binaries available on request).
- SVM integrated with two-level k-means
  - This is an implementation of the two-level k-means algorithm for fast linear classification. It is based on the SVM^light implementation of SVM by Thorsten Joachims.
  - Code is available for download here.
  - Details about the algorithm can be found in the paper "Two-level k-means clustering algorithm for k–t relationship establishment and linear-time classification".
