Foundations of Manifold Learning
This course webpage has not been maintained since the end of the 107-2 semester (Spring 2019), and most links to the lecture notes have been removed.
Manifold Learning
Broadly speaking, manifold learning (ML) encompasses all machine-learning theory connected to geometry [1]. One of its main branches is low-dimensional embedding, whose goals are dimensionality reduction, data visualization, and clustering; this branch is closely related to multivariate analysis in statistics [2]. The most basic statistical dimensionality-reduction method is principal component analysis (PCA), covered in most multivariate analysis textbooks or in the monograph [3]. Another method, commonly used for "relational" data, is multidimensional scaling (MDS) [4]. Both methods try to reproduce the original data configuration as faithfully as possible in a prescribed low-dimensional Euclidean space.
In 2000, two new ML algorithms appeared in the same issue of Science: Isomap [5] and Locally Linear Embedding (LLE) [6]. They were followed by the Laplacian eigenmap (LE) [7], the diffusion map (DM) [8], t-SNE [9], and others. The key difference between these methods and the earlier ones is the assumption that the data carry a manifold structure; they therefore use nonlinear embeddings to study the shape of the data, that is, to "learn the manifold". See the introductory articles [10, 11, 12] and the extensive bibliography of [12].
The aim of this course is to present the mathematical principles behind these methods, along with their theoretical difficulties and open problems, in preparation for further study.
Students are assumed to have taken calculus and linear algebra.
[1] Manifold Learning Theory and Applications, Yunqian Ma and Yun Fu (Eds.).
[2] Modern Multivariate Statistical Techniques: Regression, Classification, and Manifold Learning, Alan J. Izenman; especially Chapters 7, 13, and 16.
[3] Principal Component Analysis, I. T. Jolliffe.
[4] Modern Multidimensional Scaling: Theory and Applications, I. Borg and P. J. F. Groenen, 2005; or Chapter 13 of [2].
[5] http://web.mit.edu/cocosci/isomap/isomap.html
[6] https://cs.nyu.edu/~roweis/lle/
[7] Laplacian Eigenmaps for Dimensionality Reduction and Data Representation, M. Belkin and P. Niyogi.
[8] Diffusion Maps, Spectral Clustering and Eigenfunctions of Fokker-Planck Operators, B. Nadler, S. Lafon, and R. Coifman.
[9] Visualizing Data using t-SNE, L. van der Maaten and G. Hinton.
[10] An Introduction to Diffusion Maps, J. de la Porte, B. M. Herbst, W. Hereman, and S. J. van der Walt.
[11] Spectral Methods for Dimensionality Reduction, L. Saul, K. Weinberger, F. Sha, J. Ham, and D. Lee.
[12] Dimensionality Reduction: A Comparative Review, L. van der Maaten, E. Postma, and J. van den Herik.
Course References
In addition to the original papers, students may mainly consult the aforementioned
[2] Modern Multivariate Statistical Techniques: Regression, Classification, and Manifold Learning, Alan J. Izenman.
as well as
Geometric Structure of High-Dimensional Data and Dimensionality Reduction, Jianzhong Wang, 2012.
For matrix computation and Matlab, see
Matrix Algebra Useful for Statistics, Shayle R. Searle and André I. Khuri, 2017.
Course Content
The course covers three themes: mathematical derivations of manifold learning algorithms, geometry, and integrated exploration.
Mathematical derivations of manifold learning algorithms: an introduction to the various manifold learning methods, both linear and nonlinear.
Geometry: dimension, embeddings, Riemannian manifolds, geodesics, curvature, the Laplace-Beltrami operator, the first variation, and harmonic coordinates.
Integrated exploration: writing simple Matlab programs, weighing the strengths and weaknesses of the algorithms, and using geometric knowledge to design new algorithms.
Spending the first three or four weeks on prerequisites before introducing any manifold learning method would likely be dull, so the plan is to interleave the two: each time a manifold learning algorithm is presented, we examine the mathematics it requires. The actual progress is recorded below:
Week 1: Introduction to manifold learning
What is manifold learning? Methods and problems
Matrix computation, Gram matrix and centering matrix
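As a small illustration of this week's material, here is a minimal sketch in Python/NumPy (the in-class exercises used Matlab; the random data here are made up) of the centering matrix H = I - (1/n)11^T and the Gram matrix:

import numpy as np

n, p = 6, 3
rng = np.random.default_rng(0)
X = rng.normal(size=(n, p))                # n sample points in R^p, one per row

H = np.eye(n) - np.ones((n, n)) / n        # centering matrix H = I - (1/n) 11^T
Xc = H @ X                                 # centered data: column means become 0
G = Xc @ Xc.T                              # Gram matrix of the centered data

print(np.allclose(Xc.mean(axis=0), 0))     # True: the columns are centered
print(np.allclose(G, H @ (X @ X.T) @ H))   # True: G equals the doubly centered X X^T

The identity G = H (X X^T) H is the double-centering computation that reappears in classical MDS in Week 5.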
Week 2: PCA
Statistics: variance, covariance, and their matrix expressions
PCA - looking for maximum variance
the maximum is attained at the largest eigenvalue of the covariance matrix (a numerical check follows below)
[理2004] Matlab: matrix operations
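A minimal numerical check (toy random data, made up for illustration) of the maximum-variance characterization: the direction of maximum sample variance is the top eigenvector of the covariance matrix, and the variance attained is the largest eigenvalue.

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
Xc = X - X.mean(axis=0)                    # center the data
S = Xc.T @ Xc / (len(X) - 1)               # sample covariance matrix

vals, vecs = np.linalg.eigh(S)             # eigenvalues in ascending order
v1 = vecs[:, -1]                           # top eigenvector: max-variance direction

proj_var = np.var(Xc @ v1, ddof=1)         # variance of the 1-D projection
print(np.allclose(proj_var, vals[-1]))     # True: it equals the largest eigenvalue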
Week 3: PCA
PCA: an example [ppt]
PCA - looking for minimum energy
Matrix decomposition and matrix calculus (an SVD check follows below)
[理2004] Matlab: PCA
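The minimum-energy view can be checked against the SVD: by the Eckart-Young theorem, the best rank-k approximation of the centered data keeps the top k singular values, and the discarded energy is the sum of the remaining squared singular values. A sketch on made-up data:

import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 4))
Xc = X - X.mean(axis=0)

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 2
Xk = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]    # best rank-k approximation

err = np.linalg.norm(Xc - Xk, 'fro') ** 2  # reconstruction energy
print(np.allclose(err, np.sum(s[k:] ** 2)))             # True (Eckart-Young)

S = Xc.T @ Xc / (len(X) - 1)
vals, vecs = np.linalg.eigh(S)
print(np.allclose(np.abs(Vt[0]), np.abs(vecs[:, -1])))  # True: same direction up to sign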
Week 4: Surfaces in R^3
Gauss map and the second fundamental form
Gaussian curvature (a symbolic check follows below)
Riemannian metric
[理2004] Matlab: better PCA; function (by Yen)
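For a surface given as a graph z = f(x, y), the Gaussian curvature has the closed form K = (f_xx f_yy - f_xy^2) / (1 + f_x^2 + f_y^2)^2. A short SymPy sketch (an illustration, not class code) checks this on a hemisphere of radius r, where K should be the constant 1/r^2:

import sympy as sp

x, y, r = sp.symbols('x y r', positive=True)
f = sp.sqrt(r**2 - x**2 - y**2)            # upper hemisphere as a graph z = f(x, y)

fx, fy = sp.diff(f, x), sp.diff(f, y)
fxx, fyy = sp.diff(f, x, 2), sp.diff(f, y, 2)
fxy = sp.diff(f, x, y)

K = (fxx * fyy - fxy**2) / (1 + fx**2 + fy**2)**2
print(sp.simplify(K))                      # r**(-2), i.e. constant curvature 1/r^2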
Week 5: MDS
Classical MDS
Euclidean MDS is PCA (a sketch follows below)
the non-Euclidean MDS problem
The concept of manifold
[理2004] Matlab: import data (swiss roll); call function (HLLE)
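A sketch of classical MDS, and of this week's statement "Euclidean MDS is PCA": double-center the squared distance matrix, eigendecompose, and scale the top eigenvectors by the square roots of the eigenvalues. On genuinely Euclidean input this reproduces the PCA scores up to column signs. Data and dimensions are made up:

import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(3)
X = rng.normal(size=(50, 3))

n = len(X)
D2 = cdist(X, X) ** 2                      # squared Euclidean distances
H = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * H @ D2 @ H                      # double centering: B = (HX)(HX)^T

vals, vecs = np.linalg.eigh(B)
idx = np.argsort(vals)[::-1]               # eigenvalues in descending order
vals, vecs = vals[idx], vecs[:, idx]

k = 2
Y = vecs[:, :k] * np.sqrt(vals[:k])        # classical MDS coordinates in R^k

Xc = X - X.mean(axis=0)                    # compare with the PCA scores
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
P = Xc @ Vt[:k].T
print(np.allclose(np.abs(Y), np.abs(P)))   # True: MDS = PCA up to sign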
Week 6: Metric and isometry
metric MDS
Isomap, REE, MVU (SDE) (an Isomap sketch follows below)
[理2004] Matlab: Picture compression by PCA (by Yen)
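A bare-bones Isomap sketch (assuming the kNN graph is connected; the released Isomap code in [5] handles more general cases): build a kNN graph, estimate geodesic distances by graph shortest paths, then run classical MDS on them. The neighbor count and target dimension are arbitrary illustrative choices:

import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse.csgraph import shortest_path

def isomap(X, n_neighbors=8, k=2):
    # kNN graph with Euclidean edge weights; np.inf marks non-edges
    D = cdist(X, X)
    n = len(X)
    W = np.full((n, n), np.inf)
    nn = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]   # skip self at column 0
    for i in range(n):
        W[i, nn[i]] = D[i, nn[i]]
    W = np.minimum(W, W.T)                 # symmetrize the graph
    G = shortest_path(W, method='D')       # Dijkstra: geodesic distance estimates
    H = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * H @ (G ** 2) @ H            # classical MDS on the geodesics
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:k]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

Run on swiss-roll samples (as in the Week 5 Matlab session), this should flatten the roll into the plane.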
Week 7: Computation of curvatures
curves and the turning angle (a discrete check follows below)
surfaces and Euler characteristic
second fundamental form and deviation
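A discrete check of the turning angle: for a simple closed polygon traversed counterclockwise, the exterior angles sum to 2*pi, the polygonal counterpart of the total curvature of a simple closed curve. A sketch on a regular 12-gon:

import numpy as np

t = np.linspace(0, 2 * np.pi, 12, endpoint=False)
P = np.c_[np.cos(t), np.sin(t)]                    # vertices, counterclockwise

edges = np.roll(P, -1, axis=0) - P                 # edge vectors P[i+1] - P[i]
angles = np.arctan2(edges[:, 1], edges[:, 0])      # heading of each edge
turn = np.diff(np.append(angles, angles[0]))       # exterior (turning) angles
turn = (turn + np.pi) % (2 * np.pi) - np.pi        # wrap into [-pi, pi)
print(turn.sum() / np.pi)                          # 2.0: total turning is 2*pi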
Week 8 : Locally Linear Embedding
the LLE algorithm (a sketch follows below)
matrix calculus
Connection and Christoffel symbols
[理2004] Matlab: use the spectrum to check that the dimension of two coplanar circles in R^3 is 2; scree plot (by Yen)
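A compact sketch of the LLE algorithm (the neighbor count and regularization constant are arbitrary; the released code in [6] is more careful): solve each local constrained least-squares problem for the reconstruction weights, then take the bottom nontrivial eigenvectors of (I - W)^T (I - W).

import numpy as np
from scipy.spatial.distance import cdist

def lle(X, n_neighbors=10, k=2, reg=1e-3):
    n = len(X)
    nn = np.argsort(cdist(X, X), axis=1)[:, 1:n_neighbors + 1]
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nn[i]] - X[i]                      # neighbors in a local frame
        C = Z @ Z.T                              # local Gram matrix
        C += reg * np.trace(C) * np.eye(len(C))  # regularize: C may be singular
        w = np.linalg.solve(C, np.ones(len(C)))
        W[i, nn[i]] = w / w.sum()                # enforce the sum-to-one constraint
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, 1:k + 1]                      # skip the constant bottom eigenvector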
Week 9: Midterm Exam
Written exam, 1.5 hr
Selection of final presentation topics
Week 10(4/22): Geometric Laplacian
Levi-Civita connection
Laplace-Beltrami operator
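In local coordinates x^1, ..., x^d with metric g = (g_{ij}), the Laplace-Beltrami operator has the divergence form (sign conventions differ between authors; with this one it reduces to the ordinary Laplacian when g is the identity):

\Delta_g f = \frac{1}{\sqrt{\det g}} \, \partial_i \left( \sqrt{\det g} \, g^{ij} \, \partial_j f \right), \qquad (g^{ij}) = (g_{ij})^{-1},

with summation over repeated indices.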
Week 11(4/29): Prof. Lin's "Graph Laplacian"
graph Laplacian
graph connectivity
graph partition
spectral clustering (a small sketch follows the references below)
References
A Tutorial on Spectral Clustering, U. von Luxburg, Statistics and Computing.
Fiedler’s Theory of Spectral Graph Partitioning, B. Slininger.
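A minimal sketch of spectral bipartitioning in the spirit of the von Luxburg tutorial: form the unnormalized Laplacian L = D - A and split the vertices by the signs of the Fiedler vector. The toy graph, two 4-cliques joined by a single edge, is made up for illustration:

import numpy as np

A = np.zeros((8, 8))                       # adjacency of two 4-cliques...
A[:4, :4] = 1
A[4:, 4:] = 1
np.fill_diagonal(A, 0)
A[3, 4] = A[4, 3] = 1                      # ...joined by one bridge edge

D = np.diag(A.sum(axis=1))
L = D - A                                  # unnormalized graph Laplacian

vals, vecs = np.linalg.eigh(L)
fiedler = vecs[:, 1]                       # eigenvector of the 2nd smallest eigenvalue
print(np.sign(fiedler))                    # the signs separate the two cliques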
Week 12(5/6): Discussions on Final Presentations (individual presentation)
Consistency problem & metric design
Overview of the workshop "Geometry of Big Data"
[理2004] Matlab: MDS
Week 13(5/13): Discretization of Laplacian
the first variation of energy
difference quotient - weighted Laplacian
heat kernel - point cloud Laplacian (a sketch follows below)
[理2004] Python demonstration of the graph Laplacian material (by J. Lin)
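A sketch of the heat-kernel (point cloud) Laplacian on points sampled from the unit circle: with weights exp(-|x_i - x_j|^2 / (4t)) and random-walk normalization P, the quotient (Pf - f)/t approximates the second derivative of f along the circle. The sample size and bandwidth t are arbitrary illustrative choices:

import numpy as np

n, t = 400, 1e-3
theta = np.linspace(0, 2 * np.pi, n, endpoint=False)
X = np.c_[np.cos(theta), np.sin(theta)]    # points on the unit circle
f = np.cos(theta)                          # test function with f'' = -cos(theta)

D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # squared pairwise distances
W = np.exp(-D2 / (4 * t))                  # heat kernel weights
P = W / W.sum(axis=1, keepdims=True)       # random-walk normalization
Lf = (P @ f - f) / t                       # point cloud Laplacian applied to f

print(np.max(np.abs(Lf + np.cos(theta))))  # small: Lf approximates f'' = -cos(theta)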
Week 14(5/20): Hessian LLE
examples of charts
local tangent space & PCA/SVD (see the sketch after the references below)
Taylor & Hessian
Hessian LLE
theoretical proof: normal coordinates vs tangent coordinates
References
Robust Hessian Locally Linear Embedding Techniques for High-Dimensional Data, Xianglei Xing, Sidan Du, and Kejun Wang
An improved local tangent space alignment method for manifold learning, Peng Zhang, Hong Qiao, and Bo Zhang (modifies how the center of the local tangent space is chosen)
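A sketch of the first step shared by Hessian LLE and LTSA: estimate the tangent space at a point by PCA/SVD of the centered neighbor displacements. The helper name local_tangent_space and the planar toy data are made up for illustration:

import numpy as np

def local_tangent_space(X, i, n_neighbors=12, d=2):
    dist = np.linalg.norm(X - X[i], axis=1)
    nn = np.argsort(dist)[1:n_neighbors + 1]       # nearest neighbors, excluding X[i]
    Z = X[nn] - X[nn].mean(axis=0)                 # center the neighborhood
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Vt[:d].T                                # columns span the estimated tangent space

rng = np.random.default_rng(4)
X = np.c_[rng.normal(size=(100, 2)), np.zeros(100)]   # points on the plane z = 0 in R^3
T = local_tangent_space(X, 0)
print(np.allclose(T[2, :], 0))                     # True: the normal direction is excluded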
Week 15: Replaced by Dr. Yi-Sheng Wang's "Topological Data Analysis" on 5/24 (Friday)
Week 16, 17 (6/3, 6/10): Student presentations
[1] 魏丞偉 MLLE: Modified Locally Linear Embedding Using Multiple Weights, Zhenyue Zhang and Jing Wang.
[2] 林鋐儒 Global versus local methods in nonlinear dimensionality reduction, or Sparse multidimensional scaling using landmark points, both by Vin de Silva and Joshua B. Tenenbaum.
[3] 陳俊宇 MVU/SDE Algorithms for manifold learning
[4] 葉永傑 Generalized Non-metric Multidimensional Scaling, S. Agarwal, J. Wills, L. Cayton, G. Lanckriet, D. Kriegman, S. Belongie (semidefinite programming); or see Robust Euclidean Embedding, L. Cayton and S. Dasgupta (subgradient algorithm).
[5] 陳湘元 Visualizing Data using t-SNE, by van der Maaten and G. Hinton.
[6] 陳彥伍 Locally Linear Embedding (LLE) for MRI based Alzheimer’s Disease Classification, X. Liu, D. Tosun, M. W. Weiner, N. Schuff, and ADNI.
[7] 蔡承翰 Diffusion Maps, Spectral Clustering and Eigenfunctions of Fokker-Planck Operators, by Nadler, Lafon and Coifman.
Week 18(6/17):
[8] 閻紹華 Perron-Frobenius Theorem in Matrix Analysis
[9] 翁偉倫 Principal Manifolds and Nonlinear Dimensionality Reduction via Tangent Space Alignment, by Zhenyue Zhang and Hongyuan Zha.
[10] 蔡秉承 Taking all positive eigenvectors is suboptimal in classical MDS, J. Tsang and R. Pereira.
Other possible topics:
[11] HLLE: explain the Donoho-Grimes output and compare it with Maaten's.
[12] Shamap: Shape-based Manifold Learning, Fenglei Fan, Hongming Shan and Ge Wang.
[13] dimension detection
[14] LLE with distance matrix
[15] Non-linear Dimensionality Reduction: Riemannian Metric Estimation and the Problem of Geometric Recovery, D. Perrault-Joncas and M. Meila
[16] Riemannian Manifold Learning for Nonlinear Dimensionality Reduction, Tony Lin, Hongbin Zha, and Sang Uk Lee
[17] Geometric Network Comparisons, Dena Asta and Cosma R. Shalizi
[18] A remark on global positioning from local distances, A. Singer
[19] Polynomial PCA and kernel PCA
[20] MDS, MVU in view of kernel
[21] Matlab: HLLE - compare the Donoho-Grimes code used earlier with Maaten's implementation