Stanford CS228T: Probabilistic graphical models - advanced methods (Spring 2012)


Announcements

  • At long last, we have gathered all of the programming assignments into one place and made them available to the public. We addressed most of the points of confusion that students raised, and (slightly) improved the robustness of the testing scripts. The file pas.pdf contains instructions for all of the programming assignments; most of the details are relegated to the comments and the provided code itself. Many of the assignments require pmtk3 (note that you may need to change 'startup.m' for every assignment that uses pmtk3 so that it can find your installation; a sketch of such an edit appears after this list). Some assignments also make use of various functions from the Matlab toolboxes, all of which are available to Stanford students on corn. We suspect that all assignments could be run on Octave with only minor modifications, but we have not tried to do so. To assist people who want to try the assignments, we have made the code available as part of a repository on GitHub so that people can easily discuss any issues they encounter.

    The programming assignments were all developed specifically for this year's offering of the course, by Kevin Murphy, Daniel Selsam, and Sanjeev Satheesh.

  • The TA, Daniel Selsam, will be giving the final in-class lecture on Bayesian nonparametrics. We will cover Dirichlet processes (DPs) in great detail, hierarchical Dirichlet processes (HDPs) in moderate detail, and will briefly survey some applications and generalizations, including HDP-LDA, infinite state hidden Markov models (HDP-HMMs) and hierarchical Pitman-Yor processes (HPYs). Some of this material will be necessary to complete the final assignment, and almost all of it will be necessary to complete the extra credit problems.

  • The TA, Daniel Selsam, will give a tutorial on advanced convex optimization techniques on Tuesday, April 24 from 7 PM to 9 PM in Gates B12. We will cover subgradient methods, cutting plane methods, and dual decomposition. This material will come up throughout the course, specifically in the problem of finding the most likely (MAP) assignment in Markov random fields, and in solving structural SVMs in which the constraint set is too large to enumerate (a minimal subgradient-method sketch appears after this list). This tutorial is optional, but recommended.
  • The TA, Daniel Selsam, will give a tutorial on convex optimization on Tuesday, April 17th from 7:00 PM to 8:30 PM in Gates B12. It is strongly recommended that those who did not take EE 364A attend this session. There is more material to cover than can possibly be absorbed in such a short amount of time, so it is also recommended that you start learning this material ahead of time. See Professor Boyd's website for resources.
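To make the pmtk3 setup concrete, here is a minimal sketch of the kind of startup.m edit mentioned in the first announcement. The path below is a placeholder, not an actual location, and the init-script name is only an assumption about your copy of pmtk3:

    % startup.m -- minimal sketch; '/path/to/pmtk3' is a placeholder that
    % you should point at your own pmtk3 installation.
    pmtkDir = '/path/to/pmtk3';
    addpath(genpath(pmtkDir));   % put all pmtk3 folders on the Matlab path
    % If your copy ships an init script (e.g. initPmtk3.m), running that
    % instead may set things up more precisely:
    % run(fullfile(pmtkDir, 'initPmtk3.m'));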
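As a small taste of the convex optimization tutorial, here is a minimal subgradient-method sketch in Matlab for minimizing a piecewise-linear function; all of the data and constants are made up for illustration:

    % Subgradient method sketch: minimize f(x) = max_i (A(i,:)*x + b(i)).
    % With random data of this size, f is bounded below with high probability.
    rng(0); m = 20; n = 5;
    A = randn(m, n); b = randn(m, 1);
    x = zeros(n, 1); fbest = inf; xbest = x;
    for k = 1:500
        [fx, i] = max(A*x + b);        % f(x) and the active piece at x
        if fx < fbest, fbest = fx; xbest = x; end
        g = A(i, :)';                  % a subgradient of f at x
        x = x - (1/k) * g;             % diminishing step; no descent guarantee
    end
    fprintf('best value found after 500 steps: %.4f\n', fbest);

Because individual iterates need not decrease f, the method tracks the best point seen so far; this is the standard fix for subgradient methods.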

Overview

This is an advanced class on probabilistic modeling, inference and learning.

Pre-requisites: CS228 (graphical models).

Recommended prior classes: CS229 (machine learning), EE364a (convex optimization 1).


Instructor: Prof Kevin Murphy.

We meet Fridays, 10:00am - 12:00pm, April 6 - June 1, in building 160 (Wallenberg, on the main quad), room 124.

My office hours are Fridays, 9:00-9:45am, in Gates 142 (new!).

The TA is Daniel Selsam (cs228t.qa@gmail.com).

TA office hours: Mondays 7-9pm and Thursdays 4-6pm, in the basement of Huang, in the main room by the CME department.



Please sign up for the Google Groups email list for important class-related announcements.

Use this instead of the Courseware forum, which will not be monitored.

 

The class will primarily involve a lot of reading and challenging weekly written homeworks;
you should decide how much detail to go into for the readings based on your interests and on the homework questions.
There will be a few programming exercises, but not as many as in CS228.
Weekly lectures will provide a brief summary of that week's reading,
some ungraded in-class quiz questions, and a discussion of the solutions to the homework you just turned in.

 

Reading material comes from 3 sources:

1. Selected chapters from Kevin Murphy's draft textbook (mandatory). This should be purchased from the Stanford bookstore (for $45).

2. Koller & Friedman textbook (mandatory).

3. Recent papers: links will be added online.



Lecture / reading schedule

Reading: KM refers to Kevin Murphy's book; KF refers to Koller & Friedman.
The KM book is not yet published, so it likely contains typos. Please
enter any typos you find in this Google Docs form.
KM chapters that are not in the course reader
will be posted on Stanford's Courseware File Download site (SUNet ID required).
The links below to KM chapters will not work unless you are already logged in with your SUNet ID.

We cover a lot of material in this class, so it's sometimes hard to see the big picture. Fortunately, Neal Parikh, the TA for this class in 2011,
made a nice set of notes summarizing the entire course, available here.

You should skim the reading material before class, and come with questions you want answered. After class, you should read the material in more detail.
You should decide which sections to focus on, based on (1) what you need to know to do the homeworks; (2) what you want to know for your own
research; (3) what wasn't clear from class.


Lecture 1 (April 6): Online / sequential inference
   Reading: Kalman filtering (KM 18, KF 14.2), particle filtering (KM 23.5-23.6, KF 15.3.3), Bayesian online ADF for binary logistic regression (Zoeter07). You may find parts of KM chap 8 on online learning (Courseware link, public link) to be useful background for the homework. Optional: KM chap 4 on multivariate Gaussians will be useful for understanding the derivation of the Kalman filter (Courseware link, public link); papers on online ADF for probit classifiers (Opper96) and multi-class classifiers (Zhang10); papers on decentralized PF (de Freitas) and one-pass SMC (Balakrishnan and Madigan).
   Homework: HW1 (the importance sampling question moved to HW2; a minimal importance-sampling sketch appears after this schedule)
   Notes: Slides

Lecture 2 (April 13): MCMC
   Reading: MC (KM 23), MCMC (KM 24, KF 12). Optional: adaptive MCMC tutorial (Andrieu 2008); Iain Murray's thesis, chap 2; auxiliary variable tutorial (Higdon98).
   Homework: HW2
   Notes: Slides

Lecture 3 (April 20): Variational inference 1
   Reading: Mean field / variational Bayes (KM 21, KF 8, KF 19.2.2, KF 19.3.3), online EM papers (Cappe 10, Liang 09).
   Homework: HW3, Solutions (requires SUNet ID)
   Notes: Slides

Lecture 4 (April 27): Variational inference 2
   Reading: Belief propagation (KM 22, KF 11), new version of KM 22.5 on EP. Optional: EP tutorials (Minka 05, Seeger 05); Wainwright and Jordan tutorial ("The Monster").
   Homework: HW4 (starter code is in Courseware)
   Notes: Slides

Lecture 5 (May 4): MAP state estimation
   Reading: MAP estimation (KM 22, KF 13), new version of KM 22.6, dual decomposition tutorial (Sontag and Jaakkola). Optional: Komodakis 11; Boyd book, ch 5.
   Homework: HW5
   Notes: Slides

Lecture 6 (May 11): MRFs, CRFs, and max-margin models
   Reading: MRFs (KM 19, KF 20.1-20.6), max-margin methods (Yuille 2012), structural SVMs (Tsochantaridis 05, Yu & Joachims 09, Mark Schmidt review, cutting-plane training of SSVMs). Optional: SVM review (Ng, CS229).
   Homework: HW6
   Notes: Slides

Lecture 7 (May 18): Structure learning
   Reading: Structure learning (KM 26, KF 18, KF 19.4, KF 20.7), Heckerman '96 tutorial.
   Homework: HW7 code, HW7 theory
   Notes: No slides

Lecture 8 (May 25): Latent variable models
   Reading: Topic models (KM 27), scalable inference in LVMs. Optional: NIPS'11 tutorial slides by Ahmed & Smola on the same topic.
   Homework: HW8 code, HW8 theory
   Notes: Guest lecture by Amr Ahmed

Lecture 9 (June 1): Bayesian nonparametrics
   Reading: Dirichlet process mixture models (KM 25.2), Blei tutorial, IBP paper (Griffiths and Ghahramani, 2011) [note: this is only part of the IBP paper; you may not consult the rest of the paper until the last problem set is submitted], Variational Inference for DPMM, Dirichlet Processes, Hierarchical Dirichlet Process. Optional: split-merge for DPMM (Jain and Neal).
   Homework: HW9 code, HW9 theory (due back Sun June 10th)
   Notes: Slides. The TA (Daniel Selsam) will give a tutorial on nonparametric Bayes.
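Since importance sampling comes up in HW1/HW2, here is a minimal Matlab sketch of the idea; the target, proposal, and numbers are illustrative only (normpdf and normcdf are in the Statistics Toolbox, available on corn):

    % Importance sampling sketch: estimate the tail probability P(X > 3)
    % for X ~ N(0,1) by sampling from a proposal q = N(4,1) and reweighting.
    rng(0); N = 1e5;
    x = 4 + randn(N, 1);                        % draws from q = N(4,1)
    w = normpdf(x, 0, 1) ./ normpdf(x, 4, 1);   % weights w = p(x) / q(x)
    est = mean(w .* (x > 3));                   % estimate of E_p[1{X > 3}]
    fprintf('IS estimate: %.3e, exact: %.3e\n', est, 1 - normcdf(3));

Sampling directly from N(0,1) would almost never land in the tail, whereas the shifted proposal hits it often and the weights correct for the mismatch.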



Grading policy

9 problem sets. Each problem set is worth 11% of your grade.

Homework 1 is issued on Friday, April 6th and is due Thursday, April 12th before midnight.
Send it by email to cs228t.qa@gmail.com, with HW#X in the subject line.
Similarly, homework 2 is issued Friday, April 13th and is due Thursday, April 19th.
The final homework will be due June 7th (during finals week); this is in lieu of a final exam.
No late days are allowed (except for documented medical emergencies).

Please send one PDF file with your solutions.
Math should be typeset using LaTeX; if you can't do this for some reason, please write neatly and scan it in.

There will be a few programming questions, but not as many as in CS228.
Coding questions should be solved in Matlab. Send all of your code as a single zip file.
Usually you will be asked to generate some kind of numbers or plots; please include such results in your PDF file,
along with the theoretical exercises, and indicate the name of the script used to generate each figure in its caption
(as I do in my textbook). In total, the TA should receive just 2 files (a PDF and a zip), and he should easily be able to figure out how to run
your code to reproduce the figures.
 

Students may discuss and work on problems in groups (in fact, we strongly recommend this) but must write up their own solutions. When writing up the solutions, students should list the names of the people with whom they discussed the assignment, and should not use notes from group work. If we start receiving solution sets that are too similar, we may reconsider this policy. As we often reuse problem set questions from previous years, we expect students not to copy, refer to, or look at the solutions when preparing their answers; it will be considered an honor code violation to intentionally refer to previous years' solutions. The purpose of problem sets in this class is to help you think about the material, not just to give us the right answers. Moreover, some of the problems are taken from recently published papers, and students are expected to come up with the answers on their own rather than extracting them from published solutions. Therefore, please restrict your attention to the references provided on the webpage when solving the problem sets. If you do happen to use other material, it must be acknowledged clearly with a citation in the submitted solution.