CS 173: Introduction to Natural Language Processing, Spring 2024
Welcome to CS 173!
We will use Piazza for questions and discussions in addition to office hours; questions the TA cannot resolve will be escalated to the instructor. Class page: https://piazza.com/ucr/spring2024/cs_173_001_24s/info
Direct emails/messages to the instructor should be limited to (1) medical or (2) grade issues.
For other questions and concerns, visit during office hours. Key announcements will be on eLearn/Canvas with email alerts.
Class Overview
Students will gain an overview of modern approaches to Natural Language Processing (NLP). Students will learn the theory and algorithms of NLP through applications such as part-of-speech tagging, parsing, named entity recognition, coreference resolution, sentiment analysis, and machine translation.
Students who successfully complete this course will be able to:
Implement algorithms for basic language models
Install, configure and run sophisticated NLP toolkits
Identify which NLP tasks apply to given real world problems involving unstructured text data
Apply standard modelling techniques to a given NLP task
Research and apply current NLP modelling techniques to solve novel problems
Course Details
Prerequisites: CS 150
Format: The course consists of two 80-minute lectures and one 50-minute discussion per week. Students are strongly encouraged to use office hours for help with homework and exam review.
Instructor: Dr. Yue Dong (please communicate via eLearn or office hours, not email)
Teaching Assistant: Rishi Sangireddy <rishi.sangireddy@email.ucr.edu>
Lectures: TTh 9:30am - 10:50am, WCH room 143
Discussion: W 11:00am - 11:50am, Student Success Center, room 121
Preliminary Schedule
Act 1: Preliminaries
Weeks 1-4:
Introduction & Regular Expressions (ch 2)
N-gram Language Models (ch 3)
Naive Bayes (ch 4)
Assignment (week 3)
Midterm 1 (April 23rd, 2024)
Act 2: Modeling Techniques
Weeks 4-7:
Logistic Regression (ch 5)
Vector Semantics (ch 6)
Hidden Markov Models (Appendix A)
Assignments (weeks 5, 7, 9)
Midterm 2 (May 16th, 2024)
Act 3: Linguistics & Applications
Weeks 8-10:
Part-of-Speech Tagging (ch 8)
Entity and Relation Extraction (ch 18)
Question Answering, Information Retrieval (ch 14)
Chatbots and Dialogue Systems (ch 15)
FINAL EXAM
Homeworks and Tutorials
Study problems and the problems for the tutorials can be found on the Tutorials page.
You are expected to come to the tutorial class each week and make a good-faith attempt to work on the problems in your group.
Four assignments, submitted on eLearn, will be due on Sunday evenings in weeks 3, 5, 7, and 9.
Grading
Homeworks — 30% — if you do well on homeworks, you will do well in this course
Midterms (two) — 30% — if you complete homeworks successfully, you should excel on midterms
Final — 20% — if you excel on midterms you should excel on final, which is comprehensive
Quizzes, Attendance, Participation, Office Hours, and other activities — 20% — instructor's discretion
Standard +/- Scale: the cutoff for an A- is 90% (and similarly 80%, 70%, and 60% for B-, C-, and D-); the cutoff for a B+ is 87% (and so on for C+ and D+). A score of 59% or less is an F.
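As a rough illustration of how the weights and the scale above combine, here is a minimal Python sketch. It is not an official grade calculator. The category weights and the A- and B+ cutoffs come from this page; the cutoffs for plain A, B, C, and D (93, 83, 73, 63), the absence of an A+, treating anything below 60% as an F, and all function and variable names are assumptions made for the example.

```python
# Category weights in percent, taken from the Grading section.
WEIGHTS = {
    "homeworks": 30,
    "midterms": 30,       # both midterms together; use their average percentage
    "final": 20,
    "participation": 20,  # quizzes, attendance, participation, office hours, etc.
}

# (cutoff, letter) pairs, highest first.
# Stated on this page: A- at 90 and B+ at 87, "similarly" and "and so on" for B, C, D.
# ASSUMED for illustration: 93/83/73/63 for plain A/B/C/D, and F below 60.
CUTOFFS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def course_percentage(scores):
    """Weighted average of per-category percentages (each on a 0-100 scale)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(percentage):
    """Return the first letter whose cutoff the percentage meets, else F."""
    for cutoff, letter in CUTOFFS:
        if percentage >= cutoff:
            return letter
    return "F"


# Hypothetical student: 92% homeworks, 84% midterm average, 78% final, 100% participation.
example = {"homeworks": 92, "midterms": 84, "final": 78, "participation": 100}
total = course_percentage(example)
print(f"{total:.1f}% -> {letter_grade(total)}")  # prints "88.4% -> B+"
```

Because the last 20% is at the instructor's discretion, treat any number from a sketch like this as an estimate only.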
Textbook
Required Textbook: Dan Jurafsky and James Martin, Speech and Language Processing, 3rd ed. draft (free online at web.stanford.edu/~jurafsky/slp3/)
Additional Recommended References:
C.D. Manning & H. Schuetze, Foundations of Statistical Natural Language Processing
Office hours
Instructor Office Hours: Tuesdays 11am-12pm and Thursdays 12pm-1pm, MRB 4135
TA Office Hours: Mondays and Fridays over Zoom, Rishi Sangireddy, https://ucr.zoom.us/j/8896652238
Additional Support: Academic Resources Center (ARC), 156 Surge, http://www.arc.ucr.edu
About me
Hi, I am Yue Dong, an assistant professor of CSE at UCR. My research interests include natural language processing, machine learning, and artificial intelligence. I lead the Natural Language Processing group at UCR, which develops natural language understanding and generation systems that are controllable, trustworthy, and efficient.