Shaohua Li




6.2017 - Now:      Scientist at the Institute of High Performance Computing, A*STAR, Singapore.

2.2016 - 5.2017:   Research Fellow at NExT center, National University of Singapore (NUS).

8.2009 - 1.2016:   PhD student at School of Computer Engineering, Nanyang Technological University (NTU). 

                         Supervisor: Chunyan Miao. Co-supervisor: Gao Cong.

9.2012 - 12.2012: Research Intern at Data Mining and Business Analysis Technology Group, NEC Japan

                         Mentor: Ryohei Fujimaki.

2002    - 2005:     M.Sc. in CS, Institute of Software, Chinese Academy of Sciences (CAS). 

                         Supervisor: Jian Zhang

1997    - 2002:     B.S. in Math (Special Class for the Gifted Youth, SCGY), University of Science and Technology of China (USTC).


Check Shaohua Li's profile on LinkedIn for a complete list.


Bayesian Methods and Deep Learning, and their applications to Natural Language Processing and Computer Vision.


        Reviewer for ICML 2018, ACL 2017 and 2018, MM 2017, NIPS 2017.

        09 Jun 2017                Gave a talk about Topic Embedding & Laplacian-steered Neural Style Transfer to Jiashi Feng's team, NUS.

        16 Oct 2015                Gave a talk "Word Embedding Methods, Generative Word Embedding and Extension" to Wei Lu's team, SUTD.


"Laplacian-Steered Neural Style Transfer", Shaohua Li, Xinxing Xu, Liqiang Nie and Tat-Seng Chua. Accepted by the ACM Multimedia Conference (MM) 2017.   [PDF]   [Code]   [Slides]

"Dirichlet-vMF Mixture Model". arXiv:1702.07495 [cs.CL], 2017.

"Document Visualization using Topic Clouds". arXiv:1702.01520 [cs.IR], 2017.

"Detecting Functional Modules of the Brain using Eigenvalue Decomposition of Laplacian", Xiuchao Sui, Shaohua Li, and Jagath C Rajapakse. Accepted by the International Symposium on Biomedical Imaging (ISBI) 2017.   [PDF]

"Generative Topic Embedding: a Continuous Representation of Documents", Shaohua Li, Tat-Seng Chua, Jun Zhu and Chunyan Miao. In the Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL) 2016 (oral), pp. 666-675.   [PDF] (extended with proofs)   [Code]    [Slides]

"PSDVec: a Toolbox for Incremental and Scalable Word Embedding", Shaohua Li, Jun Zhu and Chunyan Miao. Accepted by Neurocomputing (2016).    [PDF]   [Code&Data]

"Locality Regularized Sparse Subspace Clustering with Application to Cortex Parcellation on Resting fMRI", Xiuchao Sui, Shaohua Li and Jagath C Rajapakse. In the Proceedings of the International Symposium on Biomedical Imaging (ISBI) 2016.   [PDF]

"Mobile App Tagging", Ning Chen, Steven C.H. Hoi, Shaohua Li and Xiaokui Xiao. In the Proceedings of the ACM International Conference on Web Search and Data Mining (WSDM) 2016.   [PDF]

"A Generative Word Embedding Model and its Low Rank Positive Semidefinite Solution", Shaohua Li, Jun Zhu, Chunyan Miao. In the Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) 2015, pp. 1599–1609.   [PDF]   [Code&Data]   [Poster]

"On the Equivalence of Factorized Information Criterion Regularization and the Chinese Restaurant Process Prior", Shaohua Li. arXiv:1506.09068 [stat.ML], 2015.

"Factorized Asymptotic Bayesian Inference for Factorial Hidden Markov Models", Shaohua Li, Ryohei Fujimaki, Chunyan Miao. arXiv:1506.07959 [stat.ML], 2015.

"Sparse Canonical Correlation Analysis Reveals Correlated Patterns of Gray Matter Loss and White Matter Impairment in Alzheimer's Disease", Xiuchao Sui, Shaohua Li, Chunshui Yu, Tianzi Jiang. In the Proceedings of the International Symposium on Biomedical Imaging (ISBI) 2015.   [PDF]

"SimApp: A Framework for Detecting Similar Mobile Applications by Online Kernel Learning", Ning Chen, Steven C.H. Hoi, Shaohua Li and Xiaokui Xiao. In the Proceedings of the ACM International Conference on Web Search and Data Mining (WSDM) 2015.   [PDF]

"Factorial hidden Markov models estimation device, method, and program", Ryohei Fujimaki, Shaohua Li. US Patent US20140343903 A1, 2014.   [Details]

"Author Name Disambiguation using a New Categorical Distribution Similarity", Shaohua Li, Gao Cong, Chunyan Miao. In the Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD) 2012, Bristol, UK.   [PDF]    [Slides]    [Code&Data]    [Extended Version]

"A k-NN Method for Large Scale Hierarchical Text Classification at LSHTC3", Xiaogang Han, Shaohua Li, Zhiqi Shen. ECML-PKDD 2012 PASCAL Workshop on Large-Scale Hierarchical Classification, Bristol, UK.   [PDF]


Statistical Modeling - Basics (in Chinese) 2017
Statistical Modeling - Latent Variable Models (in Chinese) 2017
TopicVec (Generative Topic Embedding): A Hybrid Model of Word Embedding and Latent Dirichlet Allocation 2016
Principles and Applications of Word Embedding (in Chinese)   [Lecture Video] 2016
Word Embedding Methods, Generative Word Embedding and Extension 2015
Introduction to Conjugate Gradient Methods (in Chinese) 2014
Introduction to Sparse Topic Modeling 2013
Monte Carlo Methods - A very brief overview 2012
Extreme Learning Machine -- Overview and Comments 2012
Introduction to Perl (in Chinese) 2008


Generative Topic Embedding: TopicVec (in Python)

Generative Word Embedding: PSDVec (in Python)

Categorical Sampling Likelihood Ratio (A novel and general categorical set similarity): [Python Package]   [Perl Package]

(Accidental reinvention of a 1980 paper) Robust Linear Regression using the EM algorithm (in R)

Runner-up in D-Crypt Career Fair 2013 Challenge in Software Hacking and Reverse Engineering

Misc Writings:

Deep Learning Reading List (with reviews and comments)
Entropy, KL Divergence and Mutual Information
ICML 2014 Running Account
A List of People in Machine Learning (incomplete)
Memoirs of my internship in Japan (in Chinese)
Papers to Read