Advanced Topics in Machine Learning
Caltech, Spring 2022
Topic: Representation Learning for Science
Representation learning transforms data into representations (also called embeddings, encodings, or features) from which it is easier to extract useful information. Recently, these methods have facilitated progress in a variety of fields, such as medicinal chemistry, ecology, protein synthesis, fluid mechanics, sports analytics, and animal behavior analysis. Here, we will cover a range of methods (autoencoders, graph embedding techniques, symbolic representations, and self-supervised learning) and help students make connections to applications in science.
The goal of this course is for students to be able to:
Recognize existing representation learning methods and potential application areas. (Focus of lectures)
Effectively apply existing representation learning methods in a pre-defined setting. (Focus of assignments)
Formulate challenges in scientific data analysis as appropriate computer science questions. (Focus of project proposal)
Explore new techniques and develop methods that work on real-world data from scientific applications. (Focus of final project)
Websites and readings from past years of CS159 are available [here].
Generative Models
Auto-Encoding Variational Bayes (VAE), which has many variations (a minimal sketch follows this group)
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets
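To ground the autoencoder side of the syllabus, here is a minimal VAE sketch in PyTorch. It is not code from any of the papers above; the layer sizes and the Bernoulli reconstruction loss are illustrative assumptions. The key ideas are that the encoder predicts a Gaussian posterior, the reparameterization trick keeps sampling differentiable, and the training loss is the negative ELBO.

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    """Minimal VAE: encoder outputs a Gaussian posterior, decoder reconstructs."""
    def __init__(self, x_dim=784, z_dim=8, h_dim=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)       # posterior mean
        self.logvar = nn.Linear(h_dim, z_dim)   # posterior log-variance
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I),
        # so gradients flow through mu and logvar despite the sampling step.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

def neg_elbo(x, x_hat, mu, logvar):
    # Reconstruction term plus KL(q(z|x) || N(0, I)), both summed over the batch.
    recon = nn.functional.binary_cross_entropy(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```

The learned mean vector mu is what typically serves as the "representation" in downstream scientific analyses.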
"Textbooks"
Graph Representation Learning, Will Hamilton (2020)
Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges, Bronstein et al. (2021)
Spectral Methods / Random Walks
(Blog) An Overview of k-way Spectral Clustering, Yeh (2021)
Partitioning Well-Clustered Graphs: Spectral Clustering Works!, Peng et al. (2015)
node2vec: Scalable Feature Learning for Networks, Grover and Leskovec (2016)
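As a concrete anchor for these spectral readings, below is a small sketch in plain NumPy (illustrative only, with a hand-built toy graph) that embeds nodes using the bottom eigenvectors of the normalized graph Laplacian, the object underlying k-way spectral clustering.

```python
import numpy as np

def spectral_embedding(A, k):
    """Embed nodes of an undirected graph (adjacency matrix A) into R^k
    using the bottom nontrivial eigenvectors of the normalized Laplacian."""
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))  # guard isolated nodes
    L = np.eye(len(A)) - D_inv_sqrt @ A @ D_inv_sqrt           # normalized Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)                       # ascending eigenvalues
    return eigvecs[:, 1:k + 1]  # skip the trivial constant eigenvector

# Toy graph: two triangles joined by a single edge; the two communities
# separate cleanly along the first embedding dimension.
A = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)
print(spectral_embedding(A, 2))
```

Running k-means on the embedded rows recovers the two clusters; random-walk methods like node2vec instead learn embeddings by predicting walk co-occurrences.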
Graph Neural Networks
(Blog) A Gentle Introduction to Graph Neural Networks, Sanchez-Lengeling et al. (2021)
(Blog) Understanding Convolutions on Graphs, Daigavane et al. (2021)
Message-Passing: Neural Message Passing for Quantum Chemistry, Gilmer et al. (2017)
GCN: Semi-Supervised Classification with Graph Convolutional Networks, Kipf and Welling (2017)
GIN: How Powerful are Graph Neural Networks?, Xu et al. (2018)
GAT: Graph Attention Networks, Veličković et al. (2018)
GATv2: How Attentive are Graph Attention Networks?, Brody et al. (2022)
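The message-passing readings above share one core operation: normalize the adjacency, aggregate neighbor features, then apply a learned transform. Here is a minimal sketch of the GCN propagation rule from Kipf and Welling (2017), written with dense matrices for clarity; practical implementations use sparse operations.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph convolution (Kipf & Welling 2017):
    H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, A, H):
        # A: [n, n] float adjacency matrix; H: [n, in_dim] node features.
        A_hat = A + torch.eye(A.shape[0])         # add self-loops
        d = A_hat.sum(dim=1)
        D_inv_sqrt = torch.diag(d.pow(-0.5))
        A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
        return torch.relu(A_norm @ self.lin(H))   # aggregate, then transform
```

GIN swaps this fixed averaging for a sum plus an MLP, and GAT/GATv2 replace the fixed normalization with learned attention weights over neighbors.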
Self-Supervised Learning
Unsupervised Representation Learning by Predicting Image Rotations
A Simple Framework for Contrastive Learning of Visual Representations
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Shuffle and Learn: Unsupervised Learning using Temporal Order Verification
Unsupervised Learning of Object Landmarks through Conditional Image Generation
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning
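Several of these papers, most directly SimCLR, are built around a contrastive objective. Below is a sketch of the NT-Xent loss in PyTorch; the batch layout and temperature value are illustrative assumptions, not the papers' exact code.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """SimCLR's NT-Xent loss. z1, z2: [N, d] embeddings of two augmented
    views of the same batch; row i of z1 and row i of z2 are positives,
    and the remaining 2N - 2 samples serve as negatives."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # [2N, d], unit norm
    sim = z @ z.T / tau                                 # temperature-scaled cosine sims
    n = z1.shape[0]
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool), float("-inf"))  # drop self-pairs
    # The positive for sample i is its other view at index (i + n) mod 2N.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)
```

BYOL is notable precisely because it drops the explicit negatives this loss relies on, using a momentum target network instead.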
Symbolic Representations
Neurosymbolic Programming (review paper)
Learning Neurosymbolic Generative Models via Program Synthesis
Learning Differentiable Programs with Admissible Neural Heuristics
Synthesizing Programs for Images using Reinforced Adversarial Learning
DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning
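To convey the flavor of program synthesis behind these readings, here is a toy sketch: it enumerates compositions from a tiny hand-written DSL and keeps the program that best fits the data. The primitive set and brute-force search are illustrative stand-ins for the learned, neurally guided searches the papers above actually use.

```python
import itertools

# A tiny DSL of univariate primitives; programs are compositions of these.
PRIMITIVES = {
    "x":   lambda x: x,
    "x+1": lambda x: x + 1,
    "2*x": lambda x: 2 * x,
    "x*x": lambda x: x * x,
}

def compose(f, g):
    return lambda x: f(g(x))

def best_program(xs, ys, depth=2):
    """Brute-force search: try every composition of `depth` primitives and
    return the one with the lowest squared error on the data."""
    best, best_err = None, float("inf")
    for names in itertools.product(PRIMITIVES, repeat=depth):
        prog = PRIMITIVES[names[0]]
        for name in names[1:]:
            prog = compose(prog, PRIMITIVES[name])  # apply later names first
        err = sum((prog(x) - y) ** 2 for x, y in zip(xs, ys))
        if err < best_err:
            best, best_err = " o ".join(names), err
    return best, best_err

# Data generated by f(x) = (x + 1)^2; the search recovers "x*x o x+1".
xs = [0, 1, 2, 3]
ys = [(x + 1) ** 2 for x in xs]
print(best_program(xs, ys))
```

The papers above differ mainly in how they tame this combinatorial search: learned heuristics, adversarial training signals, or DreamCoder's wake-sleep library learning.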
Vision and Language
(language) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
(language) ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
(vision & language) Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
(vision & language) Learning Transferable Visual Models From Natural Language Supervision
(vision & language) Zero-Shot Text-to-Image Generation
(vision) Benchmarking Representation Learning for Natural World Image Collections
(vision) Transfusion: Understanding Transfer Learning for Medical Imaging
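The two vision-and-language pretraining papers above (CLIP, "Learning Transferable Visual Models From Natural Language Supervision", and the noisy-text ALIGN paper) train with a symmetric image-text contrastive objective. A minimal sketch of that loss in PyTorch follows; the temperature value and batch convention are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def clip_style_loss(img_emb, txt_emb, tau=0.07):
    """Symmetric contrastive loss over a batch of N matched image/text pairs:
    row i of img_emb and row i of txt_emb should be more similar to each
    other than to any mismatched pairing in the batch."""
    img = F.normalize(img_emb, dim=1)
    txt = F.normalize(txt_emb, dim=1)
    logits = img @ txt.T / tau                # [N, N] similarity matrix
    targets = torch.arange(img.shape[0])      # diagonal entries are the positives
    # Average the image-to-text (rows) and text-to-image (columns) losses.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.T, targets))
```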
Behavior Analysis
Task Programming: Learning Data Efficient Behavior Representations
Composing graphical models with neural networks for structured representations and fast inference
VAE-SNE: a deep generative model for simultaneous dimensionality reduction and clustering
Interpreting Expert Annotation Differences in Animal Behavior
Chemistry
ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction
Molecular Contrastive Learning of Representations via Graph Neural Networks
3D Molecular Representations Based on the Wave Transform for Convolutional Neural Networks
OrbNet: Deep Learning for Quantum Chemistry Using Symmetry-Adapted Atomic-Orbital Features
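A common first step across these molecular papers is converting a molecule into a graph a GNN can consume. Here is a small sketch using RDKit (assumed installed; the three-value atom feature vector is an illustrative choice, and the papers above use much richer featurizations):

```python
import numpy as np
from rdkit import Chem

def mol_to_graph(smiles):
    """Turn a SMILES string into (atom_features, adjacency_matrix),
    the typical input pair for a molecular GNN."""
    mol = Chem.MolFromSmiles(smiles)
    # Simple per-atom features: [atomic number, degree, aromatic flag].
    feats = np.array([[a.GetAtomicNum(), a.GetDegree(), int(a.GetIsAromatic())]
                      for a in mol.GetAtoms()], dtype=float)
    adj = Chem.GetAdjacencyMatrix(mol).astype(float)
    return feats, adj

feats, adj = mol_to_graph("c1ccccc1O")  # phenol: 7 heavy atoms
print(feats.shape, adj.shape)           # (7, 3) (7, 7)
```

Sequence-based approaches like ChemBERTa skip the graph entirely and pretrain directly on SMILES strings.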
Dynamics Modeling & Physics