DS651 Introduction to Speech & Natural Language Processing
Class Timing: Wednesday 6.30-7.30 PM at https://zoom.us/meeting/register/GOG2nzEHSOyH2CaBOdh7Kw
Credit Structure: 1-0-0-0-1
This course gives a high-level introduction to Speech and NLP, focusing on applications, pipelines, and intuition. It covers how machines process speech and text, and how systems like voice assistants and chatbots work. The course emphasizes big-picture understanding with small hands-on examples, without going deep into theory.
Unit 1: What is SNLP (2 Hours)
Speech Processing vs NLP, Real-world systems, End-to-end pipelines, Applications
Unit 2: Basic Speech Processing (4 Hours)
Speech Production, Spoken Language Processing, Speech Acquisition, Feature Extraction, Neural Speech Representation
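As a small hands-on illustration of the feature extraction topic in this unit, the sketch below splits a signal into short overlapping frames and computes two classic per-frame features, short-time energy and zero-crossing rate. It is only a sketch under stated assumptions: the 440 Hz synthetic tone stands in for recorded speech, the 8 kHz sample rate and the 25 ms / 10 ms frame and hop sizes are illustrative choices, and the function names are hypothetical rather than taken from the course material.

```python
import math


def frame_signal(samples, frame_len, hop_len):
    """Split samples into overlapping frames; an incomplete final frame is dropped."""
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop_len)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Assumption: a synthetic 440 Hz tone sampled at 8 kHz replaces real speech.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(800)]

# Assumption: 25 ms frames (200 samples) with a 10 ms hop (80 samples).
frames = frame_signal(tone, frame_len=200, hop_len=80)
features = [(short_time_energy(f), zero_crossing_rate(f)) for f in frames]

for index, (energy, zcr) in enumerate(features):
    print(f"frame {index}: energy={energy:.3f} zcr={zcr:.3f}")
```

Real speech front ends usually go further (windowing, a Fourier transform, mel filterbanks), but the framing step shown here is the common starting point.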
Unit 3: Basic Text Processing (3 Hours)
Tokenization, Stopwords, Stemming vs Lemmatization, N-grams, Bag of Words, TF-IDF, Word2Vec, GloVe
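To give a flavour of the text processing topics in this unit, the following minimal sketch implements tokenization, n-grams, bag of words, and TF-IDF in plain Python. It makes several simplifying assumptions: the tokenizer is a plain regular expression, TF-IDF uses the unsmoothed form tf × log(N / df) (libraries often use smoothed variants), and the two-sentence corpus and all function names are hypothetical examples.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-token tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(tokens):
    """Map each token to its count, ignoring word order."""
    return Counter(tokens)


def tf_idf(docs):
    """Compute unsmoothed TF-IDF weights for a list of token lists.

    tf = count / document length, idf = log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Assumption: a toy two-document corpus for illustration only.
corpus = ["The cat sat on the mat.", "The dog sat on the log."]
docs = [tokenize(text) for text in corpus]

print(ngrams(docs[0], 2))
print(bag_of_words(docs[0]))
print(tf_idf(docs)[0])
```

Words that occur in every document, such as "the" here, receive a weight of zero, while words unique to one document receive a positive weight. This is the intuition behind using TF-IDF to down-weight uninformative terms.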
Unit 4: Speech Applications (2 Hours)
ASR pipeline, Intro to Whisper / wav2vec, TTS pipeline
Unit 5: NLP Applications (3 Hours)
Text classification, Sentiment analysis, POS tagging, Named Entity Recognition (NER), Dependency parsing, Semantic ambiguity, LLMs, Challenges
Lecture 1: Information Layers of SNLP
Lecture 2: SNLP: Real-world systems, End-to-end pipelines, Applications
Lecture 3: Speech Production
Lecture 4: Spoken Language Processing, Speech acquisition
Lecture 5: Speech Feature Extraction
Lecture 6: Neural Speech Representation
Lecture 7: Lexical Processing in NLP: Regular Expressions, Tokenization, Stemming, Lemmatization
Lecture 8: Lexical Processing Applications
Lecture 9: N-grams, Bag of Words, TF-IDF, Word Embeddings
Lecture 10: ASR pipeline, Intro to Whisper / wav2vec
Lecture 11: TTS pipeline
Lecture 12: Text classification, Sentiment analysis
Lecture 13: POS tagging, Named Entity Recognition (NER), Dependency parsing
Lecture 14: Semantic ambiguity, LLMs, Challenges
"Speech and Language Processing" by Daniel Jurafsky and James H. Martin, Prentice Hall, 2024.
"Springer Handbook of Speech Processing" by Jacob Benesty, M. Mohan Sondhi, Yiteng Arden Huang, 2008.
"Natural Language Understanding" by James Allen, Benjamin/Cummings Publishing Company, 1987.
"Foundations of Statistical Natural Language Processing" by Christopher D. Manning and Hinrich Schütze, MIT Press, 1999.
"A Primer on Neural Network Models for Natural Language Processing" by Yoav Goldberg, Online.
"Natural Language Processing with Python" by Steven Bird, Ewan Klein, Edward Loper, O'Reilly Media, Inc., 2009.
2 Theoretical Assignments (20%)
12 Quizzes (36%)
1 End Term (20%)
Classroom Notes (10%)
Attendance (14% or 7%)