NEW: Slides online!

Geometrical models of natural language semantics, also known as distributional models or semantic spaces, have become omnipresent in contemporary computational linguistics and neighboring fields, like cognitive science and corpus linguistics. Since their earliest use in information retrieval, they have found widespread application, thanks to their accuracy, scalability, and cognitive plausibility. Different types of distributional models have been introduced (from the relatively simple bag-of-words, document-based and syntax-based techniques to the statistically more advanced topic models), and in this wide variety of incarnations, semantic spaces have been successful in virtually all subfields of lexical semantics, and beyond. Their direct applications include the construction of lexical taxonomies, word sense discrimination and disambiguation, cognitive modeling, and textual entailment. Moreover, other areas of NLP, like parsing or machine translation, have found they can indirectly benefit from the ability of semantic spaces to generalize from a limited training set to unseen, but semantically similar, words. Yet, if the growth of semantic spaces is to continue, a number of issues have to be addressed, including the differentiation between several types of semantic relationships, and the modeling of semantic composition.

One of the most important challenges that we see in distributional semantics is fragmentation with regard to data sets, methods and evaluation metrics, which makes it difficult to compare studies and achieve scientific progress. The goal of GEMS 2011 is to address this fragmentation by bringing together people from various NLP backgrounds with an interest in semantic spaces, and by focusing on shared evaluation.

GEMS will be co-located with EMNLP 2011 in Edinburgh, Scotland, and will take place on July 31st, 2011.

GEMS has been endorsed by the ACL SIGSEM and ACL SIGLEX interest groups.

Invited speaker: Mirella Lapata