🍁 2025 Fall
⏰ Tuesday/Thursday, 4:00-5:50 p.m.
📍 SAL 101
Complete videos for this class will be available on Brightspace for enrolled students.
Contact: Students should ask all course-related questions in the Ed forum, where you will also find announcements. You will find the course Ed on the course Brightspace page. For external enquiries, emergencies, or personal matters that you don't wish to put in a private Ed post, you can email us at csci544-25f@googlegroups.com with 'CSCI544' in your subject line. Please send all emails to this mailing list -- do not email the instructors directly. We will try to respond within 48-72 hours.
Instructor: Jieyu Zhao
Office Hour: 30 minutes after class
TA: Rebecca Dorn
Office Hour: Thursdays 9:30-10:30 AM, Zoom Link Here
TA: Thomas Reeves
Office Hour: Wednesdays 8:30-9:30 AM, Zoom Link Here
TA: Sahana Ramnath
Office Hour: Mondays 11AM-12 noon, Zoom link here
TA: Muzi Tao
Office Hour: Tuesday 3:00-4:00PM, Zoom Link Here
TA: Ziyi Liu
Office Hour: Friday 2:00-3:00PM, Zoom Link Here
Check Brightspace.
This course covers both fundamental and cutting-edge topics in Natural Language Processing (NLP), with a focus on language models. NLP has been revolutionized by large-scale language models, which achieve state-of-the-art performance across a wide variety of tasks. This course will cover the fundamentals of language modeling and related topics in natural language processing, deep learning, and machine learning. Students will gain familiarity with the capabilities of large language models and get hands-on experience building and evaluating small-scale language models. The class will also explore the real-world consequences of deploying language models, such as the ethical issues and harms associated with them.
[Tentatively for now]
Calendar and prespecified syllabus are subject to change. More details, e.g., reading materials and additional resources, will be added as the semester continues. All work (except the project final report) is due on the specified date by 11:59 pm PT.
There will be four components to course grades:
Homeworks (20%).
10% X 2: There will be two coding homework assignments based on the topics of the class.
Quizzes (10%).
5% X 2: Multiple-Choice Questions and Short Answers. Missed quizzes will receive a zero grade, and there will be no make-up quizzes.
Class Projects (50%).
Each student will do a group class project based on the topics covered in the class. Students will propose their own project, do the research and build a proof-of-concept, create a video demonstration of the proof-of-concept, and present the project in their report.
Proposal: 5%
Status Reports: 10%
Project Presentation: 15%
Final Write-up: 20%
Exams (20%)
Midterm (20%): The midterm exam will contain a mixture of multiple-choice and long-form questions, covering about the first half of the material covered in the class.
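As a quick check on how the components above combine, here is a minimal sketch of the weighted course score. The weights are taken from the breakdown above; the function name `course_score` and the example scores are hypothetical and are not part of the official grading procedure.

```python
# Component weights in percent, as listed in the grade breakdown above.
WEIGHTS = {"homework": 20, "quizzes": 10, "project": 50, "midterm": 20}


def course_score(scores):
    """Combine per-component scores (each 0-100) into a weighted course score.

    `scores` maps each component name in WEIGHTS to a percentage.
    This is an illustrative helper, not the official grade calculation.
    """
    if sum(WEIGHTS.values()) != 100:
        raise ValueError("component weights must sum to 100")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical example scores for one student.
example = {"homework": 90, "quizzes": 80, "project": 85, "midterm": 70}
print(course_score(example))  # 82.5
```

Because the project carries half of the total weight, a change in the project score moves the course score more than the same change in any other component.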
Questions about the grading of homework and quizzes can be directed to the TAs within two weeks of the grading date (the date the grades are released). Grades will be available within 2-2.5 weeks after submission.
All written assignments related to the final project should use the standard *ACL paper submission template.
Students are allowed a maximum of 4 late days in total across all assignments (but NOT the quiz sheets). You may use up to 2 late days per assignment. Using one late day on a project assignment costs every teammate one late day. Partial late days are not permitted. For every late day beyond the allowed late days, the student or team will lose 20% of the grade for that assignment.
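The late-day arithmetic can be sketched as follows. This is one possible reading of the policy, not an official calculator: it assumes that days beyond the 2-day per-assignment cap count as extra days even if you still have late days left, and that the penalty is capped at 100%. The function name `late_penalty` and its parameters are hypothetical. For anything ambiguous, ask on Ed.

```python
def late_penalty(days_late, late_days_remaining,
                 per_assignment_cap=2, penalty_percent_per_day=20):
    """Return (late_days_used, penalty_percent) for one assignment.

    days_late: whole days the submission is late (partial days are not permitted).
    late_days_remaining: late days still available out of the 4 allowed in total.
    Assumption: days past the per-assignment cap are penalized, and the
    penalty never exceeds 100 percent.
    """
    if days_late < 0 or late_days_remaining < 0:
        raise ValueError("day counts must be non-negative")
    # Late days that can be applied without penalty to this assignment.
    used = min(days_late, per_assignment_cap, late_days_remaining)
    # Every remaining late day costs a fixed share of the assignment grade.
    extra = days_late - used
    penalty = min(100, extra * penalty_percent_per_day)
    return used, penalty


print(late_penalty(1, 4))  # (1, 0)
print(late_penalty(3, 4))  # (2, 20)
print(late_penalty(2, 1))  # (1, 20)
```

Under these assumptions, submitting 3 days late with all 4 late days available uses 2 late days and costs 20% of that assignment's grade.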
Note: Please familiarize yourself with the academic policies and read the note about student well-being.
Project proposal (5%).
Student teams should submit a ~1-page proposal (using the *CL paper submission template) for their project. The proposal should:
state and motivate the problem by providing a problem or task definition (preferably with example inputs and expected outputs),
situate the problem within related work (this might help you find sources of data for training a model for your task),
Related work: for publications, start by looking in the ACL Anthology
References do not count towards page limit, but please follow the correct format
state a hypothesis to be verified and how to verify it (evaluation framework), and
provide a brief description of the approach to be followed to verify the hypothesis (such as proposed models and baselines).
We highly encourage students to work towards a problem involving predictive models, hence it’s worth thinking about the five key ingredients of supervised learning: data, model, loss function, optimization algorithm and inference / evaluation.
Project progress report (10%).
Student teams should submit a ~3-page progress report (using the *CL paper submission template) for their project. This report should:
once again describe the project’s goals (it is okay if this has changed slightly since the proposal, based on the feedback),
contain all details on the dataset (your dataset should mostly be collected by this time),
contain some initial results (think of these as motivating results), and
outline a concrete plan of what will be done before the final report.
While the initial results might be inconclusive, you are expected to have made non-trivial progress by this point. The project proposal may be extended for this report. Please take into consideration the earlier feedback you received, and address those inline (you may highlight these in a different text color if you wish to draw the grader’s attention).
Project final presentation (15%).
Each team will prepare a 5 minute presentation, followed by 1-2 minutes of Q/A.
You may choose one representative to present for your team, but you must use one slide to clearly state which team member is responsible for which part.
Points will be deducted if the time limit (only for the 5-min presentation) is violated, so please practice timing your talk. We will be very strict about this.
Each project presentation should describe
the underlying motivation of the project,
the research questions answered in the project,
the proposed methods,
their findings so far,
the contribution of each member, and
address audience questions.
All members of the team are expected to identify the central points of the research, and present that research to the class, as well as answer questions from the instructor, TAs and fellow students.
If you are in the audience, you could participate in asking questions - bonus points will be awarded to folks who ask insightful questions (and clearly announce their name before asking a question).
Each team will prepare slides (via Google slides) and add the link here by 11:59 PM the day before the presentation. Failure to share slides on time will cause a loss of grade.
Project final report (20%).
Student teams should submit a ~6-8 page final report (again using the *CL paper submission template) detailing all aspects of their project. The report should be structured like a conference paper (similar to the papers that students read and presented in class), including
an abstract,
an introduction to their problem and method,
related work, highlighting the similarities and differences to their own work,
a description of the method used to address the problem,
the experiments and results, and
a discussion of the results, outlining future work possibilities.
A tech report format is discouraged. Parts of the proposal and progress report may be reused for the final report. Negative results will not be penalized, but should be accompanied by a detailed analysis of why the proposed method did not work as anticipated. You may include an appendix at the very end. References and the appendix do not count towards the main report page limit (i.e., the full document can exceed 8 pages). You MUST submit all your code as a zip file as a final deliverable (points will be deducted if we do not receive it on time). PLAGIARISM will be strictly penalized.
The following texts are useful, but none are required. All of them can be read free online.
Dan Jurafsky and James H. Martin. Speech and Language Processing (2024 pre-release)
Jacob Eisenstein. Natural Language Processing
Yoav Goldberg. A Primer on Neural Network Models for Natural Language Processing
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning
Delip Rao and Brian McMahan. Natural Language Processing with PyTorch (requires Stanford login).
Lewis Tunstall, Leandro von Werra, and Thomas Wolf. Natural Language Processing with Transformers
If you have no background in neural networks but would like to take the course anyway, you might well find one of these books helpful to give you more background:
Michael A. Nielsen. Neural Networks and Deep Learning
Eugene Charniak. Introduction to Deep Learning