Introduction (05/01/23)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/lconkvxgoj25hj
Regular Expressions and Morphology (09/01/23)
Regular expressions, finite-state automata, morphology, the Porter stemmer, edit distance (see the sketch below)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/lconl69ajiq5o2
Study materials: Speech and Language Processing, Jurafsky & Martin
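A minimal Python sketch of the edit-distance dynamic program covered in this lecture. Unit insert/delete/substitute costs and the example words are assumptions for illustration (Jurafsky & Martin also discuss a variant with substitution cost 2):

```python
def edit_distance(source: str, target: str) -> int:
    m, n = len(source), len(target)
    # dp[i][j] = edit distance between source[:i] and target[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i          # i deletions
    for j in range(n + 1):
        dp[0][j] = j          # j insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if source[i - 1] == target[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + sub)  # substitution / match
    return dp[m][n]

print(edit_distance("intention", "execution"))  # 5 with unit costs
```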
N-gram Language Models - Part 1 (12/01/23)
N-gram models, their limitations, evaluation, derivation of perplexity and its interpretation (linking it with entropy; see the sketch below)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/lcszvb8xabxmb
Study materials: Speech and Language Processing, Jurafsky & Martin
Explaining Perplexity: https://towardsdatascience.com/perplexity-in-language-models-87a196019a94
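A sketch tying the perplexity derivation to code: perplexity is the exponentiated average negative log-probability the model assigns to the test tokens, PP = exp(-(1/N) * sum(log P(w_i | w_{i-1}))). The toy corpus and the unsmoothed bigram estimates are assumptions for illustration:

```python
import math
from collections import Counter

train = "the cat sat on the mat the cat ate".split()
bigrams = Counter(zip(train, train[1:]))   # count(prev, word)
unigrams = Counter(train)                  # count(prev)

def bigram_prob(prev, word):
    return bigrams[(prev, word)] / unigrams[prev]   # MLE estimate

test = "the cat sat".split()
log_prob = sum(math.log(bigram_prob(p, w)) for p, w in zip(test, test[1:]))
N = len(test) - 1                          # number of bigram predictions
print(math.exp(-log_prob / N))             # perplexity, ~1.73 here
```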
N-gram Language Models - Part 2 (16/01/23)
Derivation of perplexity and its interpretation (linking it with the branching factor), different smoothing techniques (see the add-k sketch below)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/lcynmy2xmp532z
Study materials: Speech and Language Processing, Jurafsky & Martin
Explaining Perplexity: https://towardsdatascience.com/perplexity-in-language-models-87a196019a94
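A sketch of one of the smoothing techniques covered, add-k (Laplace when k = 1): P(w | prev) = (count(prev, w) + k) / (count(prev) + k * |V|). The toy counts are assumptions:

```python
from collections import Counter

bigrams = Counter({("the", "cat"): 2, ("the", "mat"): 1})
unigrams = Counter({"the": 3, "cat": 2, "mat": 1})
V = len(unigrams)                       # vocabulary size

def add_k_prob(prev, word, k=1.0):
    return (bigrams[(prev, word)] + k) / (unigrams[prev] + k * V)

print(add_k_prob("the", "cat"))  # seen bigram: 0.5
print(add_k_prob("the", "dog"))  # unseen bigram still gets non-zero mass
```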
Parts-of-Speech Tagging (19/01/23)
Introduction to POS tagging, open and closed classes, unsupervised tagging, supervised tagging (HMM)
Slides: Intro to PoS https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/ld2zo6c7gg834d
Slides: HMM https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/ld31jjyxwdh5bz
Study materials: Speech and Language Processing, Jurafsky & Martin
Parts-of-Speech Tagging - Part 2 (24/01/23)
Viterbi algorithm, learning HMMs, MaxEnt models, MEMMs (see the Viterbi sketch below)
Study materials: Speech and Language Processing, Jurafsky & Martin (Chapter 6)
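A sketch of the Viterbi algorithm for HMM decoding; the toy transition (A), emission (B), and initial (pi) probabilities are assumptions, not from the course:

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """obs: observation indices; pi: (S,); A: (S,S); B: (S,V)."""
    S, T = A.shape[0], len(obs)
    delta = np.zeros((T, S))           # best path probability ending in state s
    psi = np.zeros((T, S), dtype=int)  # backpointers
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        # scores[i, j] = delta[t-1, i] * A[i, j] * B[j, obs[t]]
        scores = delta[t - 1][:, None] * A * B[:, obs[t]][None, :]
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0)
    path = [int(delta[-1].argmax())]   # best final state
    for t in range(T - 1, 0, -1):      # follow backpointers
        path.append(int(psi[t][path[-1]]))
    return path[::-1]

# toy 2-state, 3-symbol HMM (all numbers assumed)
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
print(viterbi([0, 1, 2], pi, A, B))    # most likely state sequence
```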
Parts-of-Speech Tagging - Part 3 (28/01/23)
Multinomial logistic regression, MaxEnt model, generative vs discriminative models (see the sketch below)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/ldfxkl5j2fs2js
Study materials: Speech and Language Processing, Jurafsky & Martin (Chapter 6)
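A sketch of multinomial logistic regression trained by gradient descent on the softmax cross-entropy loss; the toy features, labels, learning rate, and iteration count are assumptions:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # feature vectors
y = np.array([0, 1, 2])                              # class labels
W = np.zeros((2, 3))                                 # weights: features x classes

for _ in range(500):
    P = softmax(X @ W)             # predicted class distributions
    Y = np.eye(3)[y]               # one-hot targets
    grad = X.T @ (P - Y) / len(X)  # gradient of cross-entropy loss
    W -= 0.5 * grad                # gradient descent step

print(softmax(X @ W).argmax(axis=1))  # expected: [0 1 2] after training
```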
Parsing (28/01/23)
Introduction to statistical parsing, Constituency vs dependency parsing
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/ldfxl2wgd5r2xz
Study materials: Speech and Language Processing, Jurafsky & Martin
Parsing - Part 2 (30/01/23)
CFGs, PCFGs, the CKY algorithm, evaluation of parsing (see the sketch below)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/ldieaebfwd14zs
Study materials: Speech and Language Processing, Jurafsky & Martin
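A sketch of CKY recognition for a grammar in Chomsky normal form; the toy grammar and lexicon are assumptions (a PCFG version would additionally track rule probabilities and backpointers):

```python
from itertools import product

grammar = {            # binary rules A -> B C, inverted for lookup
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
lexicon = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "saw": {"V"}}

def cky(words):
    n = len(words)
    # table[i][j] = set of nonterminals deriving words[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        table[i][i + 1] = set(lexicon[w])
    for span in range(2, n + 1):            # widen spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):       # try every split point
                for B, C in product(table[i][k], table[k][j]):
                    table[i][j] |= grammar.get((B, C), set())
    return "S" in table[0][n]

print(cky("the dog saw the cat".split()))  # True
```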
Text Classification (02/02/23)
Introduction to text classification, the Naive Bayes algorithm, evaluation (see the sketch below)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/ldmzwbg5wq53ed
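A sketch of multinomial Naive Bayes with add-1 smoothing; the tiny training set is made up for illustration:

```python
import math
from collections import Counter, defaultdict

docs = [("great fun film", "pos"), ("boring slow film", "neg"),
        ("fun and great", "pos"), ("slow and boring", "neg")]

class_counts = Counter(label for _, label in docs)
word_counts = defaultdict(Counter)
for text, label in docs:
    word_counts[label].update(text.split())
vocab = {w for text, _ in docs for w in text.split()}

def predict(text):
    scores = {}
    for c in class_counts:
        total = sum(word_counts[c].values())
        score = math.log(class_counts[c] / len(docs))  # log prior
        for w in text.split():
            if w in vocab:  # ignore words never seen in training
                score += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        scores[c] = score
    return max(scores, key=scores.get)

print(predict("a fun film"))  # expected: pos
```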
Minor 1 (06/02/23 -- 09/02/23)
Lexical Similarity (13/02/23)
Words and senses, WordNet, computing with a thesaurus (see the sketch below)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/le2sgixk37y73x
Study materials: Speech and Language Processing, Jurafsky & Martin
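A sketch of thesaurus-based similarity with NLTK's WordNet interface; NLTK is a standard library choice, not necessarily the one used in class:

```python
import nltk
# nltk.download("wordnet")  # one-time download, uncomment if needed
from nltk.corpus import wordnet as wn

dog = wn.synsets("dog", pos=wn.NOUN)[0]   # first (most common) sense
cat = wn.synsets("cat", pos=wn.NOUN)[0]
car = wn.synsets("car", pos=wn.NOUN)[0]

# Path similarity: based on the shortest hypernym-path between senses.
print(dog.path_similarity(cat))  # relatively high: close in the noun hierarchy
print(dog.path_similarity(car))  # lower: only distantly related
```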
Distributional Similarity (16/02/23; 20/02/23)
Vector space model, PMI, MI, TF-IDF, considering syntax, evaluation (see the PMI sketch below)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/le6nlx0p135j8
Study materials: Speech and Language Processing, Jurafsky & Martin
PMI: https://en.wikipedia.org/wiki/Pointwise_mutual_information; MI: https://en.wikipedia.org/wiki/Mutual_information
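A sketch of pointwise mutual information straight from its definition, PMI(x, y) = log2( P(x, y) / (P(x) P(y)) ); the counts are assumed toy values:

```python
import math

def pmi(count_xy, count_x, count_y, total):
    p_xy = count_xy / total
    p_x, p_y = count_x / total, count_y / total
    return math.log2(p_xy / (p_x * p_y))  # >0: co-occur more than chance

# e.g. a collocation like "new york" co-occurring more often than chance:
print(pmi(count_xy=50, count_x=200, count_y=100, total=10000))  # ~4.64
```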
Word Representation (20/02/23; 27/02/23)
Issues with lexicon-based and one-hot vectors, Word2Vec and its derivation, GloVe and its derivation, evaluation, bias (see the sketch below)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/lebijp59sz31py
Study materials: Word2Vec paper: https://arxiv.org/abs/1301.3781; GloVe: https://nlp.stanford.edu/pubs/glove.pdf; Evaluation of embeddings: https://aclanthology.org/D15-1036/
Relevant blogs: Word2Vec: https://jalammar.github.io/illustrated-word2vec/; GloVe: https://towardsdatascience.com/light-on-math-ml-intuitive-guide-to-understanding-glove-embeddings-b13b4f19c010
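A sketch of training a skip-gram Word2Vec model with gensim (the library choice is an assumption; the course does not prescribe one), on a toy two-sentence corpus:

```python
from gensim.models import Word2Vec

sentences = [["the", "cat", "sat", "on", "the", "mat"],
             ["the", "dog", "sat", "on", "the", "rug"]]
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1,
                 sg=1,        # sg=1 selects skip-gram (sg=0 is CBOW)
                 epochs=50)   # many epochs only because the corpus is tiny

print(model.wv["cat"].shape)         # (50,) dense embedding
print(model.wv.most_similar("cat"))  # nearest neighbours by cosine similarity
```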
Introduction to Deep Learning (23/02/23)
Perceptrons, the XOR problem, feedforward neural networks, backpropagation in neural networks, applications of neural networks in NLP, intro to CNNs for NLP (see the XOR sketch below)
Slides for Deep learning: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/lefugtkllp5mz
Slides for CNNs: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/lefugpho55n40
Additional Readings: CS231n notes on network architectures, CS231n notes on backprop
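A sketch of a one-hidden-layer network solving XOR with hand-written backpropagation; layer sizes, learning rate, iteration count, and seed are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)     # hidden layer
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)     # output layer
sigmoid = lambda z: 1 / (1 + np.exp(-z))

for _ in range(5000):
    h = np.tanh(X @ W1 + b1)           # forward pass
    p = sigmoid(h @ W2 + b2)
    dz2 = p - y                        # grad of BCE loss wrt output logits
    dW2, db2 = h.T @ dz2, dz2.sum(0)   # backprop into output layer
    dh = dz2 @ W2.T * (1 - h ** 2)     # backprop through tanh
    dW1, db1 = X.T @ dh, dh.sum(0)
    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= 0.1 * grad            # gradient descent step

print(p.round().ravel())               # expected: [0. 1. 1. 0.]
```

A single perceptron cannot fit this data (XOR is not linearly separable); the hidden layer is what makes it solvable.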
Recurrent Neural Networks (27/02/23; 02/03/23)
Fixed-window model, intro to RNNs, derivation of backpropagation through time, applications of RNNs (see the sketch below)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/lelh6qh7j907l6
Reading materials: http://karpathy.github.io/2015/05/21/rnn-effectiveness/; https://www.deeplearningbook.org/contents/rnn.html
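A sketch of a vanilla (Elman) RNN forward pass, one hidden-state update per token; dimensions and the random parameters are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 4, 8
W_xh = rng.normal(scale=0.1, size=(d_in, d_h))  # input-to-hidden weights
W_hh = rng.normal(scale=0.1, size=(d_h, d_h))   # hidden-to-hidden (recurrent)
b_h = np.zeros(d_h)

def rnn_forward(xs):
    h = np.zeros(d_h)                 # initial hidden state
    states = []
    for x in xs:                      # same weights reused at every step
        h = np.tanh(x @ W_xh + h @ W_hh + b_h)
        states.append(h)
    return states                     # hidden state at every time step

seq = rng.normal(size=(5, d_in))      # a length-5 sequence of input vectors
states = rnn_forward(seq)
print(len(states), states[-1].shape)  # 5 (8,)
```

Backpropagation through time differentiates a loss on these states through the repeated W_hh multiplications, which is exactly where vanishing/exploding gradients come from (next lecture).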
Midsem break (06/03/23 -- 10/03/23)
Recurrent Neural Networks - Part 2 (02/03/23; 13/03/23)
Vanishing and exploding gradients, LSTMs and GRUs, BiLSTMs and stacked LSTMs (see the LSTM-cell sketch below)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/lepibwp74jd2rz
Reading materials: https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1224/readings/cs224n-2019-notes05-LM_RNN.pdf; http://colah.github.io/posts/2015-08-Understanding-LSTMs; Proof of vanishing gradient problem: https://arxiv.org/pdf/1211.5063.pdf
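A sketch of a single LSTM cell step, showing the gating that makes the cell-state update additive (the "gradient highway" that mitigates vanishing gradients). The stacked-weight layout and the random toy parameters are assumptions:

```python
import numpy as np

def lstm_step(x, h, c, W, b):
    """One LSTM step. W: (d_in + d_h, 4*d_h) stacked weights for the 4 gates."""
    z = np.concatenate([x, h]) @ W + b
    d = h.shape[0]
    sigmoid = lambda v: 1 / (1 + np.exp(-v))
    f = sigmoid(z[:d])         # forget gate: how much old cell state to keep
    i = sigmoid(z[d:2*d])      # input gate: how much new content to write
    o = sigmoid(z[2*d:3*d])    # output gate: how much cell state to expose
    g = np.tanh(z[3*d:])       # candidate cell update
    c_new = f * c + i * g      # additive update: gradients flow through c
    h_new = o * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(0)
d_in, d_h = 4, 3
W = rng.normal(scale=0.1, size=(d_in + d_h, 4 * d_h))
b = np.zeros(4 * d_h)
h, c = np.zeros(d_h), np.zeros(d_h)
h, c = lstm_step(rng.normal(size=d_in), h, c, W, b)
print(h.shape, c.shape)  # (3,) (3,)
```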
Sequence-to-Sequence Models and Attention (20/03/23)
Introduction to seq2seq models, beam search, the maths behind attention (see the sketch below)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/lffguuqbda06fk
Reading materials: Original seq2seq NMT paper: https://arxiv.org/pdf/1409.3215.pdf; Bahdanau et al., ICLR 2015 (the paper that introduced attention): https://arxiv.org/pdf/1409.0473.pdf; Nice blog on Seq2Seq and attention: https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/
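A sketch of the attention computation in a seq2seq decoder step, using simple dot-product scoring (Bahdanau et al. use an additive MLP scorer instead); shapes and the random encoder states are assumptions:

```python
import numpy as np

def attend(decoder_state, encoder_states):
    scores = encoder_states @ decoder_state  # (T,) alignment scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # softmax over source positions
    context = weights @ encoder_states       # (d,) weighted sum of states
    return context, weights

enc = np.random.default_rng(0).normal(size=(6, 16))  # 6 encoder states
ctx, w = attend(enc[3] + 0.1, enc)                   # query close to state 3
print(w.argmax())  # mass concentrates on the most similar source state (3)
```

The context vector is recomputed at every decoder step, which is what frees the model from squeezing the whole source sentence into one fixed vector.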
Minor 2 (23/03/23 -- 26/03/23)
Self-Attention and Transformers (27/03/23)
Intro to self-attention, attention vs self-attention, introduction to Transformers (see the sketch below)
Reading materials: Original Transformer paper: https://arxiv.org/abs/1706.03762; Blog illustrating Transformers: https://jalammar.github.io/illustrated-transformer
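A sketch of single-head scaled dot-product self-attention from "Attention Is All You Need": Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. The random projection matrices stand in for learned weights:

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # every position attends to every position
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V               # (T, d_v) contextualized vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 32))                     # 5 token embeddings
W_q, W_k, W_v = (rng.normal(scale=0.1, size=(32, 16)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)    # (5, 16)
```

Unlike the seq2seq attention above, queries, keys, and values all come from the same sequence, so each token is re-represented in terms of every other token in one parallel step.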
Unfolding Transformers (29/03/23)
Encoder, decoder, positional encoding, layer normalization (see the positional-encoding sketch below)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/lfsf1uwdxje1sz
Reading materials: Blog illustrating Transformers: https://jalammar.github.io/illustrated-transformer; Positional encoding: https://machinelearningmastery.com/a-gentle-introduction-to-positional-encoding-in-transformer-models-part-1; Layer normalization: https://www.pinecone.io/learn/batch-layer-normalization
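A sketch of the sinusoidal positional encoding defined in the Transformer paper: PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)):

```python
import numpy as np

def positional_encoding(max_len, d_model):
    pos = np.arange(max_len)[:, None]          # (max_len, 1) positions
    i = np.arange(0, d_model, 2)[None, :]      # (1, d_model/2) dimension index
    angles = pos / np.power(10000, i / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)               # even dimensions
    pe[:, 1::2] = np.cos(angles)               # odd dimensions
    return pe

print(positional_encoding(50, 512).shape)  # (50, 512), added to token embeddings
```

Since self-attention is order-invariant, these position-dependent offsets are what tell the model where each token sits in the sequence.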
Model Pretraining & Transfer Learning (03/04/23; 06/04/23)
Byte-pair encoding (see the sketch below), ELMo, Transformers: encoder (BERT), decoder (GPT), encoder-decoder (T5)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/lfzc6rah8ff5dg
Reading materials: BERT: https://arxiv.org/pdf/1810.04805.pdf; ELMo: https://arxiv.org/pdf/1802.05365.pdf; Blog on Transfer learning: https://www.ruder.io/state-of-transfer-learning-in-nlp; Limits of Transfer Learning: https://arxiv.org/pdf/1910.10683.pdf; Visual illustration: http://jalammar.github.io/illustrated-bert/
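A sketch of learning byte-pair-encoding merges by repeatedly merging the most frequent adjacent symbol pair; the toy word list is an assumption, and real BPE implementations work on a word-frequency dictionary with end-of-word markers:

```python
from collections import Counter

def merge_pair(t, pair):
    out, i = [], 0
    while i < len(t):
        if i + 1 < len(t) and (t[i], t[i + 1]) == pair:
            out.append(t[i] + t[i + 1]); i += 2   # fuse the pair
        else:
            out.append(t[i]); i += 1
    return out

def learn_bpe(words, num_merges):
    tokens = [list(w) for w in words]             # start from characters
    merges = []
    for _ in range(num_merges):
        pairs = Counter(p for t in tokens for p in zip(t, t[1:]))
        if not pairs:
            break
        best = max(pairs, key=pairs.get)          # most frequent adjacent pair
        merges.append(best)
        tokens = [merge_pair(t, best) for t in tokens]
    return merges, tokens

merges, tokens = learn_bpe(["low", "lower", "lowest", "newest"], 3)
print(merges)  # learned merge rules, applied in order at tokenization time
print(tokens)  # corpus after the merges
```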
Text-to-Text Transfer and T5 (09/04/23)
The T5 model and the C4 dataset, understanding T5's text-to-text functionality (see the usage sketch below)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/lg9j0f6naeq5sz
Reading materials: T5 paper: https://arxiv.org/abs/1910.10683
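A sketch of T5's text-to-text interface via HuggingFace Transformers: every task, here translation, is named in the input string itself. The t5-small checkpoint and the library choice are assumptions (requires the sentencepiece package):

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# The task prefix tells T5 what to do; the output is always text too.
inputs = tokenizer("translate English to German: The house is wonderful.",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```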
Guest Lecture by Adobe Delhi (13/04/23) -- Topic: NLP Research in Enterprise
Prompt-based Learning and In-context Learning (17/04/23; 20/04/23)
Prompt-based learning, manual prompt engineering, prompt tuning, in-context learning (see the sketch below)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/lgjim83jmy61u3
Reading materials: Survey paper: https://arxiv.org/pdf/2107.13586.pdf; Power of prompting: https://arxiv.org/pdf/1904.09751.pdf; Prefix-tuning: https://aclanthology.org/2021.acl-long.353.pdf; Explaining in-context learning: https://arxiv.org/abs/2111.02080
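A sketch of constructing a few-shot prompt for in-context learning: the demonstrations act as the "training set" inside the prompt, and no model parameters are updated. The reviews, labels, and template are made up for illustration:

```python
# Few-shot sentiment classification as pure prompt construction.
demonstrations = [
    ("The movie was fantastic!", "positive"),
    ("I wasted two hours of my life.", "negative"),
]
query = "An absolute delight from start to finish."

prompt = "\n".join(f"Review: {text}\nSentiment: {label}"
                   for text, label in demonstrations)
prompt += f"\nReview: {query}\nSentiment:"
print(prompt)  # send this string to an LM and read the label off its continuation
```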
Retrieval-augmented Language Models (20/04/23; 24/04/23)
Retrieval-augmented models -- REALM, nearest-neighbour machine translation (see the retrieval sketch below)
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/lgnwi1mhuua65j
Reading materials: REALM paper: https://arxiv.org/pdf/2002.08909.pdf (on HuggingFace: https://huggingface.co/docs/transformers/model_doc/realm); Surveys: https://arxiv.org/abs/2302.07842; https://arxiv.org/abs/2202.01110; Other readings: https://arxiv.org/pdf/2010.00710.pdf
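A sketch of the nearest-neighbour retrieval step shared by REALM-style models and kNN-MT: embed the query and score it against a datastore by inner product. The random "datastore" stands in for learned document/context embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
datastore = rng.normal(size=(1000, 128))  # 1000 stored embeddings (assumed)
datastore /= np.linalg.norm(datastore, axis=1, keepdims=True)

def retrieve(query, k=5):
    query = query / np.linalg.norm(query)
    scores = datastore @ query            # cosine similarity (unit vectors)
    top = np.argsort(-scores)[:k]         # indices of the k nearest neighbours
    return top, scores[top]

ids, sims = retrieve(rng.normal(size=128))
print(ids, sims.round(3))  # retrieved items then condition the LM's prediction
```

At scale, the exhaustive dot product is replaced by an approximate nearest-neighbour index, but the interface is the same.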
Guest Lecture by Prof. Graham Neubig, CMU (20/04/23) -- Title: Is My NLP Model Working? The Answer is Harder Than You Think
Multilingual LMs (24/04/23)
Multilingual datasets, benchmarks and models, adapter models, cross-lingual transfer, back-translation
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/lgthmkzqcdx6f6
Reading materials: The state of multilingual AI by Sebastian Ruder: https://www.ruder.io/state-of-multilingual-ai/; Survey: https://arxiv.org/pdf/2107.00676.pdf; Adapters -- Blog: https://medium.com/dair-ai/adapters-a-compact-and-extensible-transfer-learning-method-for-nlp-6d18c2399f62 and paper: https://arxiv.org/pdf/1902.00751.pdf
Bias, Fairness and Other Ethical Aspects + Conclusion (26/04/23)
Different types of biases, bias mitigation, and current issues in NLP
Slides: https://piazza.com/class_profile/get_resource/lc8oxdp2tw11hp/lgxuvfcznbh50v
Reading materials: A course on Computational Ethics for NLP (http://demo.clab.cs.cmu.edu/ethical_nlp2020/#syllabus); Blog: https://huggingface.co/blog/evaluating-llm-bias; Cognitive bias in NLP: https://arxiv.org/abs/2304.01358