Projects

Graph Laplacian and Spectral Embedding

The Laplacian Eigenmap, proposed by Belkin and Niyogi in 2003, is a delicate learning map that encodes various kinds of geometric information about the data cloud. We are formulating "rules" to help data scientists choose a proper target dimension and interpret the resulting graph.

We are also interested in possible applications. In particular, we expect that the Laplacian Eigenmap can be used to solve certain problems in graph theory.
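For readers unfamiliar with the method, here is a minimal NumPy sketch of a Laplacian Eigenmap. It is an illustration under stated assumptions, not our research code: the function name laplacian_eigenmap, the symmetrized k-nearest-neighbor graph, the heat-kernel weights exp(-|x_i - x_j|^2 / t), and the parameter defaults are all choices made for this example. The test data are evenly spaced points on a circle sitting in R^3.

```python
import numpy as np

def laplacian_eigenmap(X, n_neighbors=10, t=1.0, dim=2):
    """Embed the rows of X into `dim` dimensions (illustrative sketch)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # k-nearest-neighbor graph; column 0 of the argsort is the point itself.
    nbrs = np.argsort(sq, axis=1)[:, 1:n_neighbors + 1]
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), n_neighbors), nbrs.ravel()] = True
    A = A | A.T                                  # symmetrize the graph
    W = np.where(A, np.exp(-sq / t), 0.0)        # heat-kernel weights (assumed)
    deg = W.sum(axis=1)
    L = np.diag(deg) - W                         # unnormalized graph Laplacian
    # Solve L y = lambda * D y through the symmetric normalized Laplacian.
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(L_sym)         # eigenvalues in ascending order
    # Skip the trivial eigenvector (eigenvalue 0) and map back to y.
    Y = d_inv_sqrt[:, None] * evecs[:, 1:dim + 1]
    return Y, evals[1:dim + 1]

# Evenly spaced points on a unit circle embedded in R^3.
n = 200
phi = 2 * np.pi * np.arange(n) / n
X = np.column_stack([np.cos(phi), np.sin(phi), np.zeros(n)])

Y, evals = laplacian_eigenmap(X, n_neighbors=10, t=1.0, dim=2)
radii = np.linalg.norm(Y, axis=1)
print("smallest nontrivial eigenvalues:", evals)
print("relative spread of embedded radii:", np.std(radii) / np.mean(radii))
```

In this symmetric example the two-dimensional embedding is again a circle, so the target dimension 2 captures the loop structure. For real data the choice of n_neighbors, t and dim is exactly the kind of decision the "rules" above are meant to guide.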

Recovery Problem of Embeddings

Under certain natural conditions, PCA and classical MDS can recover the full geometric information of the pre-image. In general, it is still unclear how curvature affects the embedding procedure or, conversely, how to extract geometric information from the embedding algorithm.

We also study the geometric aspects of manifold learning algorithms more broadly.

(The picture shows an MDS recovery of the north pole and a circle of latitude. The recovery is distorted because the geodesic distance is larger than the straight-line distance.)
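The effect described in the caption can be reproduced with a short NumPy sketch. The setup below is an assumption made for illustration (a unit sphere, a circle at colatitude pi/3, 60 evenly spaced points, and a hypothetical helper classical_mds); it is not the code that produced the picture. It runs classical MDS twice, once on geodesic (great-circle) distances and once on straight-line (chordal) distances, and compares the radius of the recovered circle around the recovered pole.

```python
import numpy as np

def classical_mds(D, dim=2):
    """Classical MDS: double-center squared distances, keep the top eigenpairs."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # Gram-like matrix
    evals, evecs = np.linalg.eigh(B)             # ascending order
    idx = np.argsort(evals)[::-1][:dim]          # indices of the largest eigenvalues
    lam = np.clip(evals[idx], 0.0, None)         # guard against negative values
    return evecs[:, idx] * np.sqrt(lam)

theta = np.pi / 3                                # colatitude of the circle (assumed)
n_circle = 60
phi = 2 * np.pi * np.arange(n_circle) / n_circle
circle = np.column_stack([
    np.sin(theta) * np.cos(phi),
    np.sin(theta) * np.sin(phi),
    np.full(n_circle, np.cos(theta)),
])
X = np.vstack([[0.0, 0.0, 1.0], circle])         # row 0 is the north pole

# Geodesic distances on the unit sphere are angles between position vectors.
D_geo = np.arccos(np.clip(X @ X.T, -1.0, 1.0))
np.fill_diagonal(D_geo, 0.0)
# Straight-line (chordal) distances in R^3.
D_chord = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

Y_geo = classical_mds(D_geo, dim=2)
Y_chord = classical_mds(D_chord, dim=2)

# Mean distance from the embedded pole to the embedded circle points.
r_geo = np.mean(np.linalg.norm(Y_geo[1:] - Y_geo[0], axis=1))
r_chord = np.mean(np.linalg.norm(Y_chord[1:] - Y_chord[0], axis=1))
print("true circle radius sin(theta):", np.sin(theta))
print("recovered radius, chordal input:", r_chord)
print("recovered radius, geodesic input:", r_geo)
```

With chordal input the planar embedding reproduces the true circle radius sin(theta), while the geodesic input inflates it, which is one simple way to see curvature leaving a trace in the embedding.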

Consistency of Weighted Laplacians

The graph Laplacian has been shown to be a well-behaved operator that is analogous to the Laplace-Beltrami operator on smooth manifolds. Although there has been some theoretical analysis of the convergence of the graph Laplacian, a thorough understanding has not yet been achieved.
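A small numerical experiment illustrates the analogy in the simplest possible setting. The sketch below is only a sanity check under assumptions chosen for this example: evenly spaced (not random) points on the unit circle, a Gaussian kernel exp(-|x - y|^2 / (4t)), a random-walk normalization, and the sign convention in which the Laplace-Beltrami operator on the circle is the second derivative in the angle. It is not a statement about the weighted Laplacians studied in this project.

```python
import numpy as np

n = 1000                     # number of sample points (assumed)
t = 1e-3                     # kernel bandwidth parameter (assumed)
phi = 2 * np.pi * np.arange(n) / n
X = np.column_stack([np.cos(phi), np.sin(phi)])

# Gaussian kernel weights built from Euclidean distances in the ambient plane.
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
W = np.exp(-sq / (4 * t))

# Random-walk normalization: every row of P sums to one.
P = W / W.sum(axis=1, keepdims=True)

# Test function f(phi) = cos(phi); its second derivative is -cos(phi).
f = np.cos(phi)
Lf = (P @ f - f) / t         # graph Laplacian applied to f, rescaled by 1/t
exact = -np.cos(phi)

max_err = np.max(np.abs(Lf - exact))
print("max |L_graph f - Laplace-Beltrami f| =", max_err)
```

The printed error is small for this bandwidth. How such errors behave for random samples, non-uniform densities, and other weightings is the kind of question the consistency analysis addresses.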

Geometry in Statistical Learning

We study geometric constraints in regression problems. Nonparametric regression on a sphere and on complicated domains are straightforward applications.

Geostatistics and Clustering

Clustering is a powerful tool for understanding a data cloud, while geostatistics models data structure from an alternative viewpoint. The two disciplines interact in several ways; here are two obvious examples. Spatial domain partitioning, based on local geographic phenomena, requires both clustering algorithms and geostatistical thinking. Outlier detection in scalar or vector fields likewise draws on geostatistical modeling and clustering techniques.