Advanced Topics in Machine Learning
Caltech, Spring 2022
Topic: Representation Learning for Science
Representation learning transforms data into representations (also called embeddings, encodings, or features) from which it is easier to extract useful information. Recently, these methods have facilitated progress in a variety of fields, such as medicinal chemistry, ecology, protein synthesis, fluid mechanics, sports analytics, and animal behavior analysis. Here, we will cover a range of methods (autoencoders, graph embedding techniques, symbolic representations, and self-supervised learning) and help students make connections to applications in science.
The goal of this course is for students to be able to:
Recognize existing representation learning methods and potential application areas. (Focus of lectures)
Effectively apply existing representation learning methods in a pre-defined setting. (Focus of assignments)
Formulate challenges in scientific data analysis as appropriate computer science questions. (Focus of project proposal)
Explore new techniques and develop methods that work on real-world data from scientific applications. (Focus of final project)
Websites and readings from past years of CS159 are available [here].
Generative Models
Auto-Encoding Variational Bayes (VAE), which has many variations (a minimal sketch follows this group)
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets
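To ground the autoencoder side of the syllabus, here is a minimal VAE sketch in PyTorch. It is not code from any of the papers above; the layer sizes and the Bernoulli reconstruction loss are illustrative assumptions. The key ideas are that the encoder predicts a Gaussian posterior, the reparameterization trick keeps sampling differentiable, and the training loss is the negative ELBO.

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    """Minimal VAE: encoder outputs a Gaussian posterior, decoder reconstructs."""
    def __init__(self, x_dim=784, z_dim=8, h_dim=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)       # posterior mean
        self.logvar = nn.Linear(h_dim, z_dim)   # posterior log-variance
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I),
        # so gradients flow through mu and logvar despite the sampling step.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

def neg_elbo(x, x_hat, mu, logvar):
    # Reconstruction term plus KL(q(z|x) || N(0, I)), both summed over the batch.
    recon = nn.functional.binary_cross_entropy(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```

The learned mean vector mu is what typically serves as the "representation" in downstream scientific analyses.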
"Textbooks"
Graph Representation Learning, Will Hamilton (2020)
Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges, Bronstein et al. (2021)
Spectral Methods / Random Walks
(Blog) An Overview of k-way Spectral Clustering, Yeh (2021)
Partitioning Well-Clustered Graphs: Spectral Clustering Works!, Peng et al. (2015)
node2vec: Scalable Feature Learning for Networks, Grover and Leskovec (2016)
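As a concrete anchor for these spectral readings, below is a small sketch in plain NumPy (illustrative only, with a hand-built toy graph) that embeds nodes using the bottom eigenvectors of the normalized graph Laplacian, the object underlying k-way spectral clustering.

```python
import numpy as np

def spectral_embedding(A, k):
    """Embed nodes of an undirected graph (adjacency matrix A) into R^k
    using the bottom nontrivial eigenvectors of the normalized Laplacian."""
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))  # guard isolated nodes
    L = np.eye(len(A)) - D_inv_sqrt @ A @ D_inv_sqrt           # normalized Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)                       # ascending eigenvalues
    return eigvecs[:, 1:k + 1]  # skip the trivial constant eigenvector

# Toy graph: two triangles joined by a single edge; the two communities
# separate cleanly along the first embedding dimension.
A = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)
print(spectral_embedding(A, 2))
```

Running k-means on the embedded rows recovers the two clusters; random-walk methods like node2vec instead learn embeddings by predicting walk co-occurrences.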
Graph Neural Networks
(Blog) A Gentle Introduction to Graph Neural Networks, Sanchez-Lengeling et al. (2021)
(Blog) Understanding Convolutions on Graphs, Daigavane et al. (2021)
Message-Passing: Neural Message Passing for Quantum Chemistry, Gilmer et al. (2017)
GCN: Semi-Supervised Classification with Graph Convolutional Networks, Kipf and Welling (2017)
GIN: How Powerful are Graph Neural Networks?, Xu et al. (2018)
GAT: Graph Attention Networks, Veličković et al. (2018)
GATv2: How Attentive are Graph Attention Networks?, Brody et al. (2022)
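The message-passing readings above share one core operation: normalize the adjacency, aggregate neighbor features, then apply a learned transform. Here is a minimal sketch of the GCN propagation rule from Kipf and Welling (2017), written with dense matrices for clarity; practical implementations use sparse operations.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph convolution (Kipf & Welling 2017):
    H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, A, H):
        # A: [n, n] float adjacency matrix; H: [n, in_dim] node features.
        A_hat = A + torch.eye(A.shape[0])         # add self-loops
        d = A_hat.sum(dim=1)
        D_inv_sqrt = torch.diag(d.pow(-0.5))
        A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
        return torch.relu(A_norm @ self.lin(H))   # aggregate, then transform
```

GIN swaps this fixed averaging for a sum plus an MLP, and GAT/GATv2 replace the fixed normalization with learned attention weights over neighbors.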
Self-Supervised Learning
Unsupervised Representation Learning by Predicting Image Rotations
A Simple Framework for Contrastive Learning of Visual Representations
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Shuffle and Learn: Unsupervised Learning using Temporal Order Verification
Unsupervised Learning of Object Landmarks through Conditional Image Generation
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning
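Several of these papers, most directly SimCLR, are built around a contrastive objective. Below is a sketch of the NT-Xent loss in PyTorch; the batch layout and temperature value are illustrative assumptions, not the papers' exact code.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """SimCLR's NT-Xent loss. z1, z2: [N, d] embeddings of two augmented
    views of the same batch; row i of z1 and row i of z2 are positives,
    and the remaining 2N - 2 samples serve as negatives."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # [2N, d], unit norm
    sim = z @ z.T / tau                                 # temperature-scaled cosine sims
    n = z1.shape[0]
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool), float("-inf"))  # drop self-pairs
    # The positive for sample i is its other view at index (i + n) mod 2N.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)
```

BYOL is notable precisely because it drops the explicit negatives this loss relies on, using a momentum target network instead.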
Symbolic Representations
Neurosymbolic Programming (review paper)
Learning Neurosymbolic Generative Models via Program Synthesis
Learning Differentiable Programs with Admissible Neural Heuristics
Synthesizing Programs for Images using Reinforced Adversarial Learning
DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning
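To convey the flavor of program synthesis behind these readings, here is a toy sketch: it enumerates compositions from a tiny hand-written DSL and keeps the program that best fits the data. The primitive set and brute-force search are illustrative stand-ins for the learned, neurally guided searches the papers above actually use.

```python
import itertools

# A tiny DSL of univariate primitives; programs are compositions of these.
PRIMITIVES = {
    "x":   lambda x: x,
    "x+1": lambda x: x + 1,
    "2*x": lambda x: 2 * x,
    "x*x": lambda x: x * x,
}

def compose(f, g):
    return lambda x: f(g(x))

def best_program(xs, ys, depth=2):
    """Brute-force search: try every composition of `depth` primitives and
    return the one with the lowest squared error on the data."""
    best, best_err = None, float("inf")
    for names in itertools.product(PRIMITIVES, repeat=depth):
        prog = PRIMITIVES[names[0]]
        for name in names[1:]:
            prog = compose(prog, PRIMITIVES[name])  # apply later names first
        err = sum((prog(x) - y) ** 2 for x, y in zip(xs, ys))
        if err < best_err:
            best, best_err = " o ".join(names), err
    return best, best_err

# Data generated by f(x) = (x + 1)^2; the search recovers "x*x o x+1".
xs = [0, 1, 2, 3]
ys = [(x + 1) ** 2 for x in xs]
print(best_program(xs, ys))
```

The papers above differ mainly in how they tame this combinatorial search: learned heuristics, adversarial training signals, or DreamCoder's wake-sleep library learning.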
Vision and Language
(language) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
(language) ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
(vision & language) Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
(vision & language) Learning Transferable Visual Models From Natural Language Supervision
(vision & language) Zero-Shot Text-to-Image Generation
(vision) Benchmarking Representation Learning for Natural World Image Collections
(vision) Transfusion: Understanding Transfer Learning for Medical Imaging
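The two vision-and-language pretraining papers above (CLIP, "Learning Transferable Visual Models From Natural Language Supervision", and the noisy-text ALIGN paper) train with a symmetric image-text contrastive objective. A minimal sketch of that loss in PyTorch follows; the temperature value and batch convention are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def clip_style_loss(img_emb, txt_emb, tau=0.07):
    """Symmetric contrastive loss over a batch of N matched image/text pairs:
    row i of img_emb and row i of txt_emb should be more similar to each
    other than to any mismatched pairing in the batch."""
    img = F.normalize(img_emb, dim=1)
    txt = F.normalize(txt_emb, dim=1)
    logits = img @ txt.T / tau                # [N, N] similarity matrix
    targets = torch.arange(img.shape[0])      # diagonal entries are the positives
    # Average the image-to-text (rows) and text-to-image (columns) losses.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.T, targets))
```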
Behavior Analysis
Task Programming: Learning Data Efficient Behavior Representations
Composing graphical models with neural networks for structured representations and fast inference
VAE-SNE: a deep generative model for simultaneous dimensionality reduction and clustering
Interpreting Expert Annotation Differences in Animal Behavior
Chemistry
ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction
Molecular Contrastive Learning of Representations via Graph Neural Networks
3D Molecular Representations Based on the Wave Transform for Convolutional Neural Networks
OrbNet: Deep Learning for Quantum Chemistry Using Symmetry-Adapted Atomic-Orbital Features
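A common first step across these molecular papers is converting a molecule into a graph a GNN can consume. Here is a small sketch using RDKit (assumed installed; the three-value atom feature vector is an illustrative choice, and the papers above use much richer featurizations):

```python
import numpy as np
from rdkit import Chem

def mol_to_graph(smiles):
    """Turn a SMILES string into (atom_features, adjacency_matrix),
    the typical input pair for a molecular GNN."""
    mol = Chem.MolFromSmiles(smiles)
    # Simple per-atom features: [atomic number, degree, aromatic flag].
    feats = np.array([[a.GetAtomicNum(), a.GetDegree(), int(a.GetIsAromatic())]
                      for a in mol.GetAtoms()], dtype=float)
    adj = Chem.GetAdjacencyMatrix(mol).astype(float)
    return feats, adj

feats, adj = mol_to_graph("c1ccccc1O")  # phenol: 7 heavy atoms
print(feats.shape, adj.shape)           # (7, 3) (7, 7)
```

Sequence-based approaches like ChemBERTa skip the graph entirely and pretrain directly on SMILES strings.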
Dynamics Modeling & Physics