Class time: Monday, 1PM - 2:45PM
Classroom: Knowlton Hall 195
Course website: https://sites.google.com/view/osu-cse-5539-sp25-zhu
Instructor: Prof. Zhihui Zhu
Email: zhu.3440@osu.edu
Office hours: Monday 2:45PM-4PM
Office: DL 583
Course abstract: In the past decade, deep learning has demonstrated unprecedented performance across many different domains in engineering and science, ranging from traditional AI fields such as computer vision, natural language processing, and gaming, to health care, finance, and so on. However, we still have a limited understanding of this success, and deep neural networks (DNNs) are often designed by trial and error and then trained and used as black boxes. This course will introduce the basic ingredients of DNNs, overview recent work on the models and theory of deep learning, sample important applications, and discuss some open questions. The format of the class will be a mix of lectures and research paper presentations. Students who participate in this class are expected to be self-motivated graduate or senior undergraduate students.
Students are expected to have a background in linear algebra, multivariate calculus, probability, statistics, and Python. Students are also expected to have taken courses in artificial intelligence/machine learning (3521/6521, 5523, or 5526).
Course credits: 2 units
Pre-requisites:
Required background:
§ Linear algebra: Math 2568, 2174, 4568, or 5520H
§ Artificial intelligence: 3521, 5521, or 5243
§ Statistics and probability: 5522, Stat 3460, or 3470
§ Machine learning: 5523 or Neural Networks: 5526
Students in the class are expected to have a decent degree of mathematical sophistication and to be familiar with linear algebra, multivariate calculus, probability, and statistics. Students are also expected to have knowledge of programming, algorithm design, and data structures.
Programming: students are expected to know or self-learn deep learning software (e.g., TensorFlow and PyTorch).
Review materials are available for: linear algebra, probability, Python-1, Python-2, Python-3
Textbook (optional)
Kevin P. Murphy, Machine Learning: A Probabilistic Perspective. MIT Press, 2012. (Free online version, Amazon)
Hastie, Tibshirani, and Friedman, The Elements of Statistical Learning, 2009. (Free online version, Amazon)
Hsuan-Tien Lin, Malik Magdon-Ismail, and Yaser Abu-Mostafa, Learning From Data, AMLBook, 2012. (Amazon)
Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning. MIT Press, 2016. (Amazon)
Useful reference:
Kaare Brandt Petersen and Michael Syskind Pedersen, The Matrix Cookbook
Grading (tentative):
Participation: 5%
Includes attendance, asking questions, discussion
Paper presentation: 40%
The presentation will be graded on effort and on clarity in presenting the ideas of the papers. See the syllabus for detailed rubrics.
Final project (1-2 people): 55%
Proposal (10%): 5% presentation, 5% report
Final (45%): 22.5% presentation, 22.5% report
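As a quick sanity check on the tentative breakdown above, the weights can be combined as in the following minimal sketch. The component names and the course_score helper are hypothetical and only illustrate the arithmetic; this is not an official grade calculator, and the weights may change.

```python
# Weights copied from the tentative grading breakdown (fractions of the final grade).
# The dictionary keys are illustrative names, not official labels.
WEIGHTS = {
    "participation": 0.05,
    "paper_presentation": 0.40,
    "proposal_presentation": 0.05,
    "proposal_report": 0.05,
    "final_presentation": 0.225,
    "final_report": 0.225,
}

# The component weights should add up to 100%.
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9


def course_score(scores):
    """Return the weighted course score (0-100) from per-component scores (0-100)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
```

For instance, a score of 100 on every component except an 80 on the paper presentation gives 92 under these weights, since the paper presentation carries 40% of the grade.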
Announcements, communications, and discussions:
We will post regular announcements on Carmen Canvas.
We will use Piazza for discussions. If you have questions about the course materials or policy, please post them on Piazza. I will also monitor these discussions and answer as appropriate, but students should be active and feel free to use the forums to have group discussions as well.
Please use email to contact the instructor only for urgent or personal issues. Any emails sent to the instructor should include the tag "[OSU-CSE-5539]" in the subject line. (This ensures we can filter and prioritize your messages.) We reserve the right to forward any questions (and their answers) to the entire class if they should prove relevant. Please indicate if you wish to be anonymized (i.e., have your name removed) in this case.
For much of the semester, each class will involve the presentation and discussion of recent important papers. The objective of the course is to instill a holistic view of the latest developments in deep learning, and help the participants understand their broad implications.
Presenters
Each paper will be presented by a group of students each with an assigned “role”. This role defines the lens through which they read the paper and determines what they prepare for the group in-class discussion. Here are the roles we will experiment with:
Stakeholder: Act as if you’re an author of this paper. Describe the motivation, problem definition, method, and experimental findings of the paper. (time budget: 15 minutes)
Scientific Reviewer: Act like you’re a reviewer of this work. Be critical of the work, though not necessarily negative. You can follow the guidelines for NeurIPS reviewers (under “Review Content”), taking note of the example reviews included therein. (time budget: 10 minutes)
Archaeologist: Determine where this paper sits in the context of previous and subsequent work. Find and report on one prior paper that substantially influenced the current paper and one newer paper that cites the current paper. (time budget: 10 minutes)
Visionary: Propose an imaginary follow-up research project or a new application, one that is not merely based on the current paper but is only possible because of its existence and success. (time budget: 10 minutes)
The presentation of each role will be done individually or in a group of two, depending on the complexity of the paper. In the case of a group presentation, the presenters may decide how to divide the work among themselves, but the split should be roughly equal.
Who presents what role and when? In a given class session, two papers centered around a theme will be discussed. Each student will be given a random role (determined at least 10 days before the presentation). Each role, irrespective of how many students are assigned to it, should stay within its specified time budget. You’re encouraged to have slides for your role, though they are not mandatory. If you use slides, I recommend no more than 7-10 to make sure you stay within the time budget.
What slides? To minimize time spent context switching or fighting with screen sharing/projector dongles, we will have a shared pool of slides. Each role group is encouraged to title its slides with “[role]: [student name]” (as in “Stakeholder: Jane, John”) so that the slides can be quickly identified during the session.
Non-Presenters
If you aren’t in the presenting group during a given class period: Come up with one question or discussion point about the paper (either something you’re confused about or something you’d like to hear discussed more). Submit this question to Canvas (a submission link will be provided before the class).
During the class: While only a subset of the class will participate in presenting a paper, the rest of the class is expected to come to class ready to participate in the discussions.