Welcome to CS 222!
Paper signup link (Due Friday Jan. 12th 11:59pm)
Class Overview
Students will gain a deep understanding of modern approaches to Natural Language Processing (NLP) from a research perspective. This course, a research-focused version of CS 173, prepares students for research roles that incorporate core NLP components, or for positions such as NLP research engineer.
Students who successfully complete this course will be able to:
Read and understand NLP research papers.
Reproduce the results presented in NLP papers.
Research and apply current NLP modeling techniques to solve novel problems.
This newly designed graduate-level NLP course examines key research papers published after 2022 in the research areas listed below (most involving LLMs). It condenses the fundamental NLP material covered in CS 173 into two weeks, then focuses on recent papers that explore the frontiers of NLP research.
Topics Covered (subject to change)
Week 1 (Jan 8) : NLP Tasks and Neural Models (Transformers, BERT, T5, GPT Families)
Week 2 (Jan 15): NLP Core Techniques (Fine-tuning, Pre-training, RLHF)
Week 3 (Jan 22): NLP Security
Week 4 (Jan 29): Large Language Models (LLM) Analysis
Week 5 (Feb 5): LLM Reasoning and Grounding
Week 6 (Feb 12): Multi-Modal Models
Week 7 (Feb 19): Project Proposal Week
Week 8: RAG/LLM agents
Week 9: LLM for Coding
Week 10: Bias and Efficiency
Course Details
Lectures: MWF 10:00 AM - 10:50 AM, Spieth Hall | Room 2200
Prerequisites: CS 171 or CS 172 or CS 173; CS 224 or CS 228 or CS 229
Instructor: Yue Dong
TA: Yu Fu (yfu093@ucr.edu) | Office hour: Wednesday 1-2 pm
Office hours: Tu/Th 12:00 - 1:00 pm (Zoom)
Format: The course consists of three 50-minute lectures and 3 hours of individual study per week. Starting in week 3, lectures include a concept overview from the instructor, followed by student presentations and Q&A sessions. Students are highly encouraged to use office hours to discuss papers and projects.
Grading
Take-Home Entrance Exam / Homework 1 (Week 4) — 5%: assesses foundational knowledge necessary for the course.
Paper Presentation — 15%: evaluates understanding of research papers and presentation of key contributions and methodologies.
Weekly Research Participation — 20%: students are required to read at least one research paper per week and submit questions related to it. During lecture, they should be prepared to orally answer a randomly selected question from the pool of submitted questions for that paper.
Midterm (Week 6) — 20%: assesses knowledge of the core NLP concepts covered in weeks 1-6.
Final Project (Week 11) — 40%: assesses research capabilities, requiring strong coding skills and analytical effort.
Standard +/- Scale: 90% or higher is the cutoff for an A- (similarly for B, C, and D); 87% or higher is the cutoff for a B+ (and so on). 59% or lower is an F. See the grading section for more details.
Textbook
Required Textbook: Dan Jurafsky and James H. Martin, Speech and Language Processing, 3rd ed. (draft freely available online at web.stanford.edu/~jurafsky/slp3/)
Additional Recommended Reference:
C.D. Manning & H. Schütze, Foundations of Statistical Natural Language Processing
Office hours
Instructor Office Hours: MF 11:00 am - 11:50 am MRB 4135
Additional Support: Academic Resources Center (ARC), 156 Surge, http://www.arc.ucr.edu
About me
Hi, I am Yue Dong, an assistant professor of CSE at UCR. My research interests include natural language processing, machine learning, and artificial intelligence. I lead the Natural Language Processing group at UCR, which develops natural language understanding and generation systems that are controllable, trustworthy, and efficient.