CSE 512 Machine Learning
Spring 2025, SUNY Korea
Instructor: Byungkon Kang
Classes: M,W 10:30 - 11:50 @A114
Office hours: T,Th 14:00 - 16:00. No need to make appointments for office-hour visits.
Prerequisite: Graduate-level CSE major or permission from instructor
This is a course designed to introduce graduate-level concepts in machine learning with a focus on conducting and understanding research.
By the end of the semester, students are expected to understand how vanilla ML algorithms work (their basic mathematical models, the kinds of problems they can solve, and so on), as well as some of the more recent core deep learning concepts. Students will also gain hands-on experience in machine learning research by conducting their own.
Reading good research papers is an absolute requirement for doing good research. That is why the second half of the semester is structured as a series of seminar sessions. You will be asked to choose a set of papers relevant to your research and lead a discussion session on them. Even though only one person will present at a time, everyone is expected to have read the paper and to participate in the discussion.
Experience in the following is highly recommended:
Linear algebra
Probability/statistics
Optimization
Reading published research articles
There is no required textbook for this course. Instead, I will draw materials from the following list of books.
Richard O. Duda, Peter E. Hart, David G. Stork. "Pattern Classification". Wiley-Interscience, 2000. ISBN: 0471056693. (Good beginner's book)
Christopher M. Bishop, "Pattern Recognition and Machine Learning". Springer, 2011. ISBN: 0387310738. (Good intermediate material)
Kevin P. Murphy, "Machine Learning: A Probabilistic Perspective". The MIT Press, 2012. ISBN: 0262018020. (Better suited for advanced readers)
Ian Goodfellow, Yoshua Bengio, and Aaron Courville, "Deep Learning". The MIT Press, 2016. ISBN: 0262035618. (An HTML version can be found here)
It would be a good idea to purchase at least one reference textbook if you're serious about doing research in machine learning, but please do not acquire them illegally.
Some of the topics will be based on published papers, in which case a link to those papers will be given in the schedule below.
The following schedule is subject to change, so please check back regularly. Dates are in "Month/Day" format.
To access the lecture slides, you'll need to be logged in with your stonybrook.edu account.
Week 1: Introduction + Preliminary math (2/24), Maximum likelihood estimation (2/26)
Week 2: No class (3/3), MLE cont. + Logistic regression (3/5)
Week 3: Bayesian parameter estimation + regularization (3/10), Sampling + density estimation (3/12)
Week 4: Nearest neighbor methods (3/17), Neural networks (3/19)
Week 5: Deep learning (3/24, 3/26)
Week 6: Transformers & Word embedding (Ref: Original paper, Tutorial) (3/31, 4/2) (Proposal due)
Week 7: Reinforcement learning + Policy gradient (4/7, 4/9)
Week 8: Deep generative models (GAN, VAE, FBM) (4/14, 4/16)
Week 9: PyTorch hands-on (sample code: train.py, model.py) (4/21, 4/23)
Week 10: Paper presentation - Sign up here (4/28, 4/30)
Week 11: No class (5/5), Paper presentation (5/7) (Progress report due)
Week 12: Paper presentation (5/12, 5/14)
Week 13: Paper presentation (5/19, 5/21)
Week 14: Paper presentation (5/26, 5/28)
Week 15: Paper presentation (5/30 - Follows a Monday schedule. Room change: B103), Paper presentation (6/2)
Final report due: June 11th
Scribe (15%): Scribes are written summaries of class lectures. Each student should submit at least five (5) scribes on topics (not lectures!) of his/her choice. These are due a week after the end of the corresponding lecture. Do *not* blindly copy the contents of the lecture slides or other references. Doing so will result in a score of 0.
Paper presentation and reading assignments (15%): These will happen in the second half of the semester, where you will choose papers relevant to your project and give presentations. The number of presentations will be adjusted according to the class size. Each presentation will last about 30 minutes, including Q&A. Everyone in the class, including me, should read (hence the 'reading assignment') the paper prior to the class and participate in discussion.
The paper you wish to present must be endorsed by me at least two days before the presentation date.
Project proposal (20%)
Mid-term report (20%)
Final project report and presentation (30%)
All assignments except pop quizzes should be submitted to Brightspace.
Attendance is not explicitly part of your final grade, but missing more than 20% of the classes will automatically result in a grade of 'F'.
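As a rough illustration of how the weights above combine, here is a small Python sketch. The function names, the 0-100 scale for each component, and the class counts in the usage note are assumptions made for the example; they are not part of the official grading procedure.

```python
# Weights taken from the grading breakdown above (percent of the final score).
WEIGHTS = {
    "scribe": 15,
    "paper_presentation": 15,
    "project_proposal": 20,
    "midterm_report": 20,
    "final_project": 30,
}


def weighted_score(scores):
    """Combine per-component scores (assumed 0-100 each) into a 0-100 total."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def fails_on_attendance(missed, total_classes):
    """Return True when more than 20% of the classes were missed."""
    # Integer form of missed / total_classes > 0.20, avoiding float rounding.
    return missed * 5 > total_classes


# The five components should account for the whole grade.
assert sum(WEIGHTS.values()) == 100
```

For instance, under a hypothetical count of 25 class meetings, `fails_on_attendance(5, 25)` is false (exactly 20% missed), while `fails_on_attendance(6, 25)` is true.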
A final project, and its by-product assignments, will be an integral part of this course. Graduate students will be expected to learn what it means to conduct research in machine learning.
During the first half of the semester, students should actively discuss with the instructor to choose a suitable research topic to work on. The end result of this discussion should be formalized in the form of a proposal (or an extended abstract) which includes the following:
Problem definition
Data acquisition and preparation strategy
Proposed solution (can change over time)
Schedule
Expected outcome
Around a month before the end of the semester, a progress report should be submitted. The report follows the same format as the proposal, with the addition of any ongoing developments and preliminary results. Finally, the final report is due on the date of the final and should be delivered in the form of a conference paper. The required format should follow ICML's: https://icml.cc/Conferences/2023/StyleAuthorInstructions. The length should be at least 5 to 6 pages, double column.
You should pursue your academic goals in an honest way that does not put you at an unfair advantage over other students. You are responsible for all work you submit, and representing others' work as your own is always wrong. Faculty are required to report any suspected instance of academic dishonesty to the school. Regarding your homework, you are encouraged to discuss it with others, but you should write your own code. For more information, please refer to http://www.stonybrook.edu/commcms/academic_integrity/index.html.
If you have a physical, psychological, medical or learning disability that may impact your course work, please let the instructor know. Reasonable accommodation will be provided if necessary and appropriate. All information and documentation are confidential.
Stony Brook University expects students to respect the rights, privileges, and property of other people. Faculty are required to report to the Office of Judicial Affairs any disruptive behavior that interrupts their ability to teach, compromises the safety of the learning environment, or inhibits students' ability to learn. Faculty in the HSC Schools and the School of Medicine are required to follow their school-specific procedures.