Vector Space Models of Meaning

Mehrnoosh Sadrzadeh and Martha Lewis

ESSLLI 2019, Riga, Latvia

We also have a workshop at ESSLLI: https://sites.google.com/view/semspace2019/home

Slides and exercises will become available throughout the week!

Summary

Vector space models of meaning are based on Harris and Firth's distributional hypothesis at the word level and Frege's principle of compositionality at the phrase and sentence level. These models represent word and phrase/sentence meanings by vector embeddings. Vectors are populated either by extracting co-occurrence frequencies from corpora or by training deep neural network algorithms on corpus data. They have been applied to tasks such as similarity, sense disambiguation, entailment, and sentiment analysis, both at the word level and at the phrase and sentence level. In this course, we study these models from theoretical and applied points of view and practise constructing them. We review the basics behind word-level distributional vectors and show how they are built using co-occurrence matrices and neural word embeddings. We present different normalisation functions and show which ones perform best when computing degrees of similarity. At the phrase/sentence level, we go through the simple and type-driven compositional operators on word vectors and tensors. We also cover some of the deep neural network architectures trained to learn them. Finally, we go through the datasets developed to evaluate these models and discuss how the models perform on the tasks these datasets cover.
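
To make the word-level construction concrete, here is a minimal sketch of building a word-word co-occurrence matrix and computing cosine similarity between the resulting vectors. The toy corpus, the window size of 1, and all variable names are illustrative assumptions, not material from the course.

```python
import numpy as np

# Toy corpus (an assumption for illustration only).
corpus = [
    "dogs chase cats",
    "cats chase mice",
    "dogs eat food",
]
window = 1  # assumed symmetric context window size

# Build the vocabulary and an empty co-occurrence matrix.
tokenised = [sentence.split() for sentence in corpus]
vocab = sorted({w for sent in tokenised for w in sent})
index = {w: i for i, w in enumerate(vocab)}
counts = np.zeros((len(vocab), len(vocab)))

# Count context words within the window around each target word.
for sent in tokenised:
    for i, word in enumerate(sent):
        lo, hi = max(0, i - window), min(len(sent), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[index[word], index[sent[j]]] += 1

def cosine(u, v):
    """Cosine similarity between two co-occurrence vectors."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# 'dogs' and 'cats' share the context 'chase', so they come out similar.
print(cosine(counts[index["dogs"]], counts[index["cats"]]))
```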
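One widely used normalisation function in this literature is positive pointwise mutual information (PPMI), which reweights raw counts by how much more often two words co-occur than chance predicts. The sketch below assumes a `counts` matrix like the one built above; treating PPMI as the representative scheme here is our choice of example, not a claim about which function the course ranks best.

```python
import numpy as np

def ppmi(counts, eps=1e-12):
    """Rescale raw co-occurrence counts to PPMI weights."""
    total = counts.sum()
    p_xy = counts / total                            # joint probabilities
    p_x = counts.sum(axis=1, keepdims=True) / total  # row marginals
    p_y = counts.sum(axis=0, keepdims=True) / total  # column marginals
    pmi = np.log((p_xy + eps) / (p_x * p_y + eps))   # eps guards against log(0)
    return np.maximum(pmi, 0)                        # keep only positive associations
```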
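For composition, a rough sketch of the two families mentioned above: the simple operators add or pointwise-multiply word vectors, while type-driven operators represent functional words (here an adjective) as higher-order tensors, e.g. a matrix acting linearly on a noun vector. The vectors and the adjective matrix below are random placeholders standing in for learned embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
noun = rng.random(4)    # placeholder vector, e.g. for 'dog'
adj_vec = rng.random(4) # placeholder vector, e.g. for 'black'

# Simple compositional operators (vector addition and Hadamard product).
additive = noun + adj_vec
multiplicative = noun * adj_vec

# Type-driven composition: the adjective as a linear map on noun space.
adj_matrix = rng.random((4, 4))  # placeholder matrix for 'black'
type_driven = adj_matrix @ noun  # adjective-noun composition
```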