One of the recent developments with the greatest impact on natural language processing applications is the way words are represented. Learning these representations not only avoids laborious, application-dependent feature engineering, but also allows for a smooth integration with the deep learning frameworks that have been revolutionizing the field since 2014.
In this tutorial we will study these word representations in depth. A particular emphasis will be placed on "understanding by doing": the attendants will implement the algorithms presented, mostly from scratch and without the use of high-level toolkits, in order to gain a better understanding of how these algorithms work.
Session 1: Introduction & Supervised Learning
A basic introduction to supervised learning, covering loss functions and optimization with stochastic gradient descent (SGD). This first session will also include a very high-level introduction to natural language processing applications.
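As a taste of what will be implemented, here is a minimal sketch of SGD on a logistic-regression loss. The toy data and variable names are purely illustrative, not part of the course material:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy binary-classification data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                 # 100 examples, 5 features
y = (X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) > 0).astype(float)

w = np.zeros(5)       # model parameters
learning_rate = 0.1

# One pass of SGD: update on one example at a time.
for x_i, y_i in zip(X, y):
    p = sigmoid(x_i @ w)              # predicted probability of class 1
    grad = (p - y_i) * x_i            # gradient of the log-loss for this example
    w -= learning_rate * grad         # SGD step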
Session 2: Hands-on
The attendants will derive their own loss function and implement SGD. This will then be applied to a simplified version of Named Entity Recognition (NER). For this, please install NLTK and download its data before coming:
>>> import nltk
>>> nltk.download()  # opens the NLTK data downloader
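For orientation, the following hypothetical sketch shows how NER data can be loaded from NLTK and reduced to a per-token classification problem. It assumes the CoNLL-2002 Spanish corpus (an assumption; the template may use different data and features):

from nltk.corpus import conll2002   # requires nltk.download('conll2002')

# Illustrative simplification: label each token as entity (1) vs. non-entity (0).
train_sents = conll2002.iob_sents('esp.train')
tokens, labels = [], []
for sent in train_sents[:100]:                 # small slice, for illustration
    for word, pos, iob in sent:
        tokens.append(word)
        labels.append(0 if iob == 'O' else 1)  # 1 = token is part of a named entity
print(len(tokens), sum(labels))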
Get the template here.
A possible solution: classification and NER
Session 3: Word Embeddings
The third session will focus on word-embeddings, covering the theory behind them and the most popular implementations.
Session 4: Hands-on
Implementing skip-gram with negative sampling (SGNS) from scratch. If time allows, also positive pointwise mutual information (PPMI). A minimal sketch of the SGNS update appears after the links below.
Get the template here.
Download tokenized Spanish text here.
The derivation is here.
A possible solution here.
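For reference, a minimal sketch of a single skip-gram negative-sampling update is shown below. Vocabulary size, dimensions, and variable names are illustrative; the template and derivation linked above define the actual setup:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative sizes; not taken from the course material.
vocab_size, dim, num_neg = 1000, 50, 5
rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(vocab_size, dim))   # center-word vectors
W_out = rng.normal(scale=0.1, size=(vocab_size, dim))  # context-word vectors
lr = 0.025

def sgns_update(center, context, negatives):
    # One SGD step on the SGNS loss for a (center, context) pair.
    v = W_in[center]
    u_pos = W_out[context]
    g_pos = sigmoid(v @ u_pos) - 1.0        # push the positive pair towards 1
    grad_v = g_pos * u_pos
    W_out[context] -= lr * g_pos * v
    for neg in negatives:                   # push sampled negatives towards 0
        u_neg = W_out[neg]
        g_neg = sigmoid(v @ u_neg)
        grad_v += g_neg * u_neg
        W_out[neg] -= lr * g_neg * v
    W_in[center] -= lr * grad_v

# Example call with hypothetical word indices.
sgns_update(center=3, context=17, negatives=rng.integers(0, vocab_size, num_neg))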
Session 5: Advanced Topics
Relevant papers from recent academic conferences on multilingual embeddings, biases in word embeddings, fine-tuning, etc.