CS 2823r

Advanced Topics in Modern Machine Learning

This course explores active research in machine learning and AI, and topics may include probabilistic machine learning, generative AI models and their evaluation, causality, and applications. Students will gain broad exposure to these areas through recent papers, reflecting instructor interests. The course involves student-led weekly discussions of chosen papers, emphasizing motivation, context, and innovation. For every topic, an instructor will give a lecture on the background, facilitate group discussions and guide presentations. Students will also complete an in-depth course project on a chosen subject.

Course Instructors

Zi Wang, Alexander D'Amour, David Belanger, Jasper Snoek, Liliya Lavitas

Course correspondence will be through Canvas.

Location & Time

Friday 9:45-12:30, SEC (Allston)

Grading

Class participation - 40%
Class presentations - 20%
Project proposal - 10%
Project presentation - 10%
Project report and code - 20%

In-class discussions

Each class meeting will be three hours of in-depth discussion on a specific topic. Two students will present papers each week, and each student is expected to facilitate a discussion 1-2 times per semester. The presenters for each week are expected to coordinate with each other and with the course instructors in advance to divide up the assigned papers and any additional background material that will need to be discussed.

Discussions will center around:

Understanding the strengths and weaknesses of these methods.
Understanding the relationships between these methods, and with previous approaches.
Extensions or applications of these methods.
Experiments that might better illuminate their properties.
The ultimate goal is for these discussions to reveal gaps in current research, generate new ideas, and ideally lead to novel research directions.

Final Project

Students can work on projects individually or in pairs. The goal of the projects is to allow students to dive more deeply into one of the topics of the course. The project can be an extension of existing work, a novel application of existing methods, an exploration of a new research idea, or a non-trivial implementation and experimentation using existing methods. The grade will depend on the ideas, how well you present them in the report, how clearly you position your work relative to existing literature, how illuminating your experiments are, and how well-supported your conclusions are.

Each group of students will write a short (around 2 pages) research project proposal, which ideally will be structured similarly to a standard paper. It should include a description of a minimum viable project, some nice-to-haves if time allows, and a short review of related work. You don't have to do what your project proposal says; the point of the proposal is mainly to have a plan and to make it easy for us to give you feedback.

Towards the end of the course, everyone will present their project in a short presentation.

At the end of the class you'll hand in a project report (around 4 to 8 pages), prepared in the format of a machine learning conference paper such as NeurIPS or ICML. Note that we do not expect the report to be a completed research paper of that caliber, but we hope some projects will be a first step in that direction.

Statement of Interest and Qualifications

In the case of oversubscription, we will prioritize graduate students whose research interests will benefit from the material, but we are open to other qualified and interested candidates. Because this is an advanced course in which we will discuss recent literature at a technical level, we expect a significant background in math, statistics, and machine learning.

Please submit your statement using this form. Update on Sep 3: Students who are approved to enroll in this course have either been contacted or had a petition approved in my.harvard. We have a waitlist, so please stay tuned.

Prerequisites / recommended preparation

One of the following courses:

CS 1810 (link)
COMPSCI 1820 (link)
CS/Stat 184(0) (link)
MIT 6.390 (link)

Collaboration policy

For course presentations, each presentation can be prepared independently, but we encourage the week's presenters to coordinate to reduce redundancy and overlap (e.g., both presenters giving similar introductions to a topic). We expect each student to contribute substantially to the week's presentations. For the project, if students choose to work together, we will ask for a statement detailing the individual contributions of each student.

Additional rules

In-person attendance is expected.
One absence is permitted per semester, unless otherwise discussed with the instructors.
Reading is expected to be done in advance, not during course time.

Tentative List of Papers

Note: this is a representative list of papers and is subject to change.

Sign-up spreadsheet here.

Topic 1: Probabilistic ML and Decision-making

Active Learning

Paper 1: Active Learning Literature Survey

Bayesian Optimization

Agents

ML-Guided Design

Topic 2: Causality

Paper 1: "Deep Proxy Causal Learning and its Application to Confounded Bandit Policy Evaluation" by Xu et al., 2021

Topic 3: LLM Pre-training and Evaluations

Foundational & Benchmark-Defining Papers

Papers on Specific Capabilities & Challenges

LLM Scaling Laws

Paper 1: Scaling Laws for Neural Language Models
Paper 2: Training Compute-Optimal Large Language Models

Final Projects

Date TBD: Final project posters due (so that they can be printed before the poster session)

Dec 9: Final project poster session. Time TBD.

Dec 17: Final projects due

Lecture Slides

The slides will be updated every week.

Harvard CS2823r Sep 5 2025

Final Project Guidelines

A project can contribute in any of these areas (or a combination of them):

Methods: systematic assessment of the strengths and weaknesses of a collection of novel or existing methods when applied to real or synthetic data.
Applications: use of machine learning to help solve a real-world problem.
Theory: formal statements concerning guarantees about machine learning problems or methods.
Exposition: presentation of a unified framework covering a set of existing theory or methods. The goal is to help provide accessible educational content to others as well as identify opportunities for development of novel methods and theory.
Software: development of machine learning tools that are fast, general-purpose, and well-tested.

When evaluating your projects, we will be focusing on the following criteria:

Are your technical statements precise and correct?
Did you properly cite related work and explain the background concepts?
Given your specific machine learning background, did the work stretch you outside of your comfort zone?
Is your write-up well-written and was your presentation engaging? If your project is software-based, was your code high-quality and reusable?
If working as part of a team, did you collaborate effectively?

Note that projects do not need to focus specifically on the subjects discussed in class, though they should be relevant to recent advances in machine learning.

Some example projects

Implement a few MCMC methods and compare their performance on a variety of low-dimensional inference problems as well as for Bayesian deep nets. Here, it may be best to use synthetic data, so that properties of the problem can be tuned.
Apply Thompson sampling or GP-based Bayesian optimization to a black-box optimization problem in biology. In the interest of time/money, it could be run on a software-defined fitness function, rather than by doing actual experiments.
Derive regret bounds for various bandit algorithms.
Write a tutorial covering the breadth of recent advancements in variational inference.
Implement your own autodiff library from scratch, or contribute new features or example applications to jax. For example, you could develop a user-friendly library for MCMC in jax. This blog post is an excellent example of such a project.