Stanford CS 329M Lectures (September - December 2024):
[Week 01] Sept. 23, 25: Lectures 1 & 2
[Week 02] Sept. 30, Oct. 2: Lectures 3 & 4
[Week 03] Oct. 7, 9: Lectures 5 & 6
[Week 04] Oct. 14, 16: Lectures 7 & 8
[Week 05] Oct. 21, 23: Lecture Canceled, Student Checkpoint Presentations
[Week 06] Oct. 28, 30: Lectures 9 & 10
[Week 07] Nov. 4, 6: Lectures 11 & 12
[Week 08] Nov. 11, 13: Lectures 13 & 14
[Week 09] Nov. 18, 20: Lectures 15 & 16
[Week 10] Nov. 25, 27: Autumn Break (No classes)
[Week 11] Dec. 2, 4: Invited Guest Lectures Day, Student Presentations
Course Description
The field of machine programming (MP) is concerned with the automation of software development. Given recent advances in software algorithms, hardware efficiency and capacity, and the ever-increasing availability of code data, it is now possible to train machines to help develop software. In this course, we teach students how to build real-world MP systems. We begin with a high-level overview of the field, including an abbreviated analysis of the state of the art (e.g., Merly Mentor). Next, we discuss the foundations of MP and the key areas for innovation, some of which are unique to MP. We close with a discussion of the current limitations and future directions of MP. This course includes a nine-week hands-on project, in which students (as individuals or in small groups) will create their own MP system and demonstrate it to the class.
Although techniques for training machines on non-programming tasks (e.g., natural language processing and computer vision) overlap with those used in MP, teaching machines to perform programming-specific tasks is unique in at least two respects. First, certain techniques are more (or less) effective for MP, such as using self-supervision to learn from large corpora of unlabeled open-source code. Second, software reasoning is fundamentally multi-dimensional; that is, there are multiple distinct ways to learn from software (e.g., static analysis, dynamic analysis, input/output specifications, program state reinforced-convergence, and hardware telemetric data). In this course, we discuss each of these techniques (and others) and how they can be effectively applied to MP systems.
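As a small illustration of what "multiple ways to learn from software" can mean in practice, the sketch below extracts three different views of the same function: a static view (from its syntax tree), a dynamic view (from one traced execution), and an input/output view (from example pairs). It is not taken from the course materials; the sample function `clamp` and the helper names `static_view`, `dynamic_view`, and `io_view` are hypothetical and chosen only for illustration.

```python
import ast
import sys

# Hypothetical sample program used only for this illustration.
SOURCE = '''
def clamp(x, lo, hi):
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x
'''


def static_view(source):
    """Static analysis: inspect the syntax tree without running the code."""
    nodes = list(ast.walk(ast.parse(source)))
    return {
        "functions": [n.name for n in nodes if isinstance(n, ast.FunctionDef)],
        "branches": sum(isinstance(n, ast.If) for n in nodes),
        "returns": sum(isinstance(n, ast.Return) for n in nodes),
    }


def dynamic_view(source, func_name, args):
    """Dynamic analysis: run the code once and record which lines execute."""
    namespace = {}
    exec(compile(source, "<sample>", "exec"), namespace)
    executed = []

    def tracer(frame, event, arg):
        # Record only line events that occur inside the function of interest.
        if event == "line" and frame.f_code.co_name == func_name:
            executed.append(frame.f_lineno)
        return tracer

    sys.settrace(tracer)
    try:
        result = namespace[func_name](*args)
    finally:
        sys.settrace(None)
    return result, executed


def io_view(source, func_name, examples):
    """Input/output specification: check the code against example pairs."""
    namespace = {}
    exec(compile(source, "<sample>", "exec"), namespace)
    func = namespace[func_name]
    return all(func(*inputs) == expected for inputs, expected in examples)


if __name__ == "__main__":
    print(static_view(SOURCE))
    print(dynamic_view(SOURCE, "clamp", (5, 0, 3)))
    print(io_view(SOURCE, "clamp", [((5, 0, 3), 3), ((-1, 0, 3), 0)]))
```

Each view exposes information the others do not: the static view sees every branch but not which one runs, the dynamic view sees only the path taken for one input, and the input/output view ignores the implementation entirely.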
This course is primarily intended for Stanford MS and PhD students. However, advanced (senior-level) and committed undergraduates have successfully completed this course. Hard-working and disciplined students have historically done well in CS 329M even if they are relatively new to ML, PL, SE, and systems.