Syllabus
The class will review the key assumptions and ideas that make semi-supervised and unsupervised learning possible, including
- Low-density separation assumptions
- Clustering assumptions
- Generative modeling assumptions
- Smoothness assumptions
- Manifold assumptions
- Existence of different 'views' on the data
and their implementation in learning algorithms:
- expectation maximization (EM) and its generalizations (e.g., for Bayesian methods)
- co-training and multi-view training
- bootstrapping techniques
- graph-based algorithms (e.g., label propagation)
- transductive SVMs
- ...
as well as their applications to interesting problems in natural language processing:
- induction of semantic representations
- grammar induction (syntax)
- topic modeling
- ...
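To give a flavor of the graph-based algorithms listed above, here is a minimal label-propagation sketch on a toy five-node chain graph; the graph, the clamped-iteration variant, and all names are illustrative examples, not course material:

```python
import numpy as np

# Toy graph: five nodes on a chain, 0 -- 1 -- 2 -- 3 -- 4.
W = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    W[i, j] = W[j, i] = 1.0

# Row-normalize the adjacency matrix to get the propagation matrix P.
P = W / W.sum(axis=1, keepdims=True)

# Per-node class distributions; nodes 0 and 4 are labeled, the rest unknown.
Y = np.full((5, 2), 0.5)
Y[0] = [1.0, 0.0]   # node 0: class 0
Y[4] = [0.0, 1.0]   # node 4: class 1

# Propagate label mass along edges, re-clamping the labeled nodes each round.
for _ in range(100):
    Y = P @ Y
    Y[0] = [1.0, 0.0]
    Y[4] = [0.0, 1.0]

pred = Y.argmax(axis=1)
print(pred)  # [0 0 0 1 1] -- node 2 sits exactly halfway between the seeds
```

Clamping the labeled nodes while iterating drives the unlabeled nodes toward a smooth interpolation of the seed labels over the graph, which is the core idea behind label propagation.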
We will also look into related problems and techniques:
- latent variable models and partially labeled settings
- multi-task learning
- learning from feedback instead of full supervision
- domain shift and domain adaptation techniques