
EM602c Natural Language Text Processing

        Preamble
           
           



        Course Contents
            Introduction to Natural Language Text Processing, Applications and Issues in NL text processing,
            Regular Expressions, Word Tokenization, Word Normalization and Stemming, Sentence Segmentation.
            Probabilistic Language Models: theory and methods for statistical NLP. Bi-grams, tri-grams and N-grams, estimating N-gram probabilities.
            Text Classification: Naïve Bayes, Multinomial Naïve Bayes.
            Part-of-speech tagging (POS tagging), Information Extraction, NER.
            Trigram Hidden Markov Model for parameter estimation, Viterbi Algorithm.
            Natural Language Parsing, Probabilistic CFGs, Parsing with PCFGs, Estimating model parameters, CKY parsing algorithm, Issues with PCFGs, Lexicalized PCFGs.
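
As a small illustration of the language-modeling topics listed above (estimating N-gram probabilities, and handling unseen N-grams with add-one smoothing), the following is a minimal sketch in plain Python. It is not taken from the course material: the function names `train_bigram_counts` and `bigram_prob`, the `<s>`/`</s>` padding symbols, and the toy corpus are all assumptions made for the example.

```python
from collections import Counter


def train_bigram_counts(sentences):
    """Count bigram histories and bigrams over tokenized sentences.

    Hypothetical helper for illustration; each sentence is a list of tokens.
    """
    history_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        vocab.update(padded[1:])  # every symbol that can be predicted
        for prev, word in zip(padded, padded[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return history_counts, bigram_counts, vocab


def bigram_prob(prev, word, history_counts, bigram_counts, vocab, k=0.0):
    """Estimate P(word | prev).

    k = 0 gives the maximum-likelihood estimate;
    k = 1 gives add-one (Laplace) smoothing.
    """
    numerator = bigram_counts[(prev, word)] + k
    denominator = history_counts[prev] + k * len(vocab)
    return numerator / denominator if denominator else 0.0


# Toy corpus, invented for this sketch.
corpus = [["i", "like", "nlp"], ["i", "like", "parsing"], ["you", "like", "nlp"]]
hist, bi, vocab = train_bigram_counts(corpus)

mle = bigram_prob("like", "nlp", hist, bi, vocab)            # 2 of 3 "like" bigrams
smoothed = bigram_prob("like", "nlp", hist, bi, vocab, k=1)  # (2 + 1) / (3 + 6)
unseen = bigram_prob("i", "nlp", hist, bi, vocab, k=1)       # (0 + 1) / (2 + 6)
print(mle, smoothed, unseen)
```

With `k=0` an unseen bigram such as ("i", "nlp") would get probability zero; add-one smoothing moves some probability mass to it at the cost of lowering the estimates for observed bigrams.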




        Course Slides and other Reading material

 Topics    Slides    Readings/Notes/Programming illustrations

Introduction to NLP, NLP Tasks, Applications
 Slides#1

Basic Text Processing, Regular Expressions, Word Tokenization, Word Normalization, Stemming, Sentence Segmentation
 Slides#2

Language Modeling, Markov assumptions, N-grams,
Estimating N-gram probabilities, MLE,
Dealing with zeros, generalization, Back-off and interpolation
 Slides#3
 Slides#4
 1. Notes by Prof. Michael Collins on Language Modeling

Text classification, examples, Naive Bayes Learning,
Parameter Estimation, Laplace (add-one) smoothing,
Text classification evaluation, Practical issues
 Slides#5
 Slides#6
 Slides#7
 1. A programming example for Sentiment Analysis (in Python using NLTK) with Positive Review Data and Negative Review Data

Tagging problem, POS Tagging, NER, Generative models, Trigram Hidden Markov Model for parameter estimation, Dealing with low-frequency words, Viterbi Algorithm
 Slides#8
 1. Notes by Prof. Michael Collins on Tagging Problems

Natural Language Parsing, A simple CFG for the English language, Ambiguity, Probabilistic CFGs,
Parsing with PCFGs, Estimating model parameters, CKY parsing algorithm, Example,
Issues with PCFGs, Lexicalized PCFGs
 Slides#9
 Slides#10
 Slides#11
 1. Notes by Prof. Michael Collins on PCFGs and Lexicalized PCFGs







         
Course Project 
Link for example projects [Here]


Useful Links

1. NLP course at Johns Hopkins University
2. Project Ideas from projects [Link-1] [Link-2]
3. NLP course at UMass [Link-2]



Evaluation Criteria
Quizzes 20%
End Sem 50%
Project 30%