I'm a PhD student in Machine Learning, working with Claire Monteleoni in the Department of Computer Science at The George Washington University, where I also earned my B.S. from the Department of Mathematics.

For more information, please see my CV.
If you have any questions or suggestions, please contact me at tangch 'at' gwu 'dot' edu.

My current research lies mainly at the intersection of unsupervised learning and stochastic optimization. This intersection is challenging because unsupervised problems are usually nonconvex; the same difficulty arises in most deep learning problems (supervised and unsupervised) as well.

Examples of topics that interest me include:
  • Understanding the empirical success of heuristics for unsupervised learning, e.g., clustering heuristics like Lloyd's algorithm, under reasonable assumptions on the data.
  • Scaling up unsupervised learning problems via stochastic optimization methods.
  • Exploring the role of unsupervised learning methods in modern machine learning, e.g., their application to deep learning, semi-supervised learning, and active learning, and vice versa.
I am also always eager to explore new territory, which internships have allowed me to do: I studied how parameter tuning affects the generalization power of random forests during a research internship at the University of Tübingen (with the Theory of Machine Learning group), and I explored the rules underlying gene expression in single-cell RNA sequences (much higher-resolution data enabled by recently available technology) using deep learning techniques during an internship at Microsoft Research Cambridge, UK (with the Machine Intelligence and Perception group).
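To give a concrete flavor of the stochastic-optimization side of this work, here is a minimal sketch of mini-batch k-means, a stochastic variant of Lloyd's algorithm; the function name, parameters, and learning-rate schedule here are illustrative simplifications, not the exact algorithms analyzed in the papers below.

```python
import numpy as np

def minibatch_kmeans(X, k, batch_size=32, n_iters=300, seed=0):
    """Mini-batch k-means: a stochastic variant of Lloyd's algorithm.

    Each iteration samples a mini-batch, assigns each sampled point to
    its nearest center, and moves that center toward the point with a
    per-center step size 1 / (number of points assigned to it so far).
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()  # random init
    counts = np.zeros(k)  # points assigned to each center so far
    for _ in range(n_iters):
        batch = X[rng.choice(len(X), size=batch_size)]
        # squared distances from each batch point to each center
        d = ((batch[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d.argmin(axis=1)  # nearest-center assignment
        for j, x in zip(labels, batch):
            counts[j] += 1
            eta = 1.0 / counts[j]  # decaying per-center step size
            centers[j] = (1 - eta) * centers[j] + eta * x
    return centers
```

Unlike batch Lloyd's algorithm, which recomputes all cluster means from the full dataset each round, this update touches only a small sample per iteration, which is what makes the stochastic version attractive at scale.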


ReLU-activated auto-encoders learn sparsely used dictionaries with weight-normalized SGD
Cheng Tang, Claire Monteleoni, in preparation, 2018.

When do random forests fail?
Cheng Tang, Damien Garreau, Ulrike von Luxburg, accepted at NIPS, 2018 [NIPS version].

Convergence rate of stochastic k-means
Cheng Tang, Claire Monteleoni, 2017. [Long version] [Code for experiments]
  • A shorter version appears in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), 2017. [AISTATS version]  

On Lloyd's algorithm: new theoretical insights for clustering in practice
Cheng Tang, Claire Monteleoni, in Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS), 2016. 
[AISTATS version]

Scalable constant k-means approximation via heuristics on well-clusterable data
Cheng Tang, Claire Monteleoni, NIPS workshop on Learning faster with easy data II, 2015.

Detecting extreme events from climate time-series via topic modeling
Cheng Tang, Claire Monteleoni, 
in Machine Learning and Data Mining Approaches to Climate Science: Proceedings of the 4th International Workshop on Climate Informatics. Lakshmanan, V., Gilleland, E., McGovern, A., Tingley, M. (Eds.), Springer, 2015 [publisher link].
  • The work was featured as “Can Topic Modeling Shed Light on Climate Extremes?” in IEEE Computing in Science and Engineering (CISE) Magazine, Special Issue on Computing & Climate. Vol. 17, no. 6, pp. 43–52, Nov./Dec. 2015.

Seasonal prediction using unsupervised feature learning and regression
Mahesh Mohan, Cheng Tang, Claire Monteleoni, Tim DeSole, Ben Cash,
in Proceedings of the 5th International Workshop on Climate Informatics, 2015, [link to paper].

Scaling up Lloyd's algorithm: stochastic and parallel block-wise optimization perspectives
Cheng Tang, Claire Monteleoni, 7th NIPS workshop on optimization for machine learning (OPT2014), 2014.

On the Convergence Rate of Stochastic Gradient Descent for Strongly Convex Functions
Cheng Tang, Claire Monteleoni, International Workshop on Advances in Regularization, Optimization, Kernel methods and Support vector machines (ROKS), 2013, [publisher link].