Empirical approaches to the study and the acquisition of concepts

I have been interested in the use of commonsense knowledge in support of natural language processing ever since my undergraduate work on Sowa's Conceptual Graphs and my later work on the TRAINS project. However, until I started working with Renata Vieira on large-scale anaphora resolution, and in particular on bridging references, I had only worked with hand-coded knowledge bases.

Automatically acquired commonsense knowledge for anaphora resolution

So-called bridging references are a class of anaphoric expressions whose interpretation often requires lexical and commonsense knowledge. For instance, in the following example from the GNOME corpus (reported by Poesio et al, 2004), `the handles' in 1b is a bridging reference to the `egg vases' in 1a; interpreting it requires knowing that vases may have handles.

1a. These `egg vases' are of exceptional quality.

1b. Basketwork bases support egg-shaped bodies, and bundles of straw form the handles.

My former student Renata Vieira and I first produced a classification of bridging references and tested whether lexical sources such as WordNet provided sufficient information, with mediocre results (Poesio, Vieira, and Teufel, 1997; Vieira & Poesio, 2000). In subsequent work, we started investigating whether the necessary lexical and commonsense knowledge could be acquired in a psychologically plausible way from corpora, using so-called `semantic space' or `distributional' models (Deerwester et al, 1990; Lund and Burgess, 1995; Schuetze, 1996; Landauer et al, 1998). In these models, the contexts in which a word occurs are used to build a vector representation of that word that encodes co-occurrence information. In our initial work (Poesio, Schulte im Walde, and Brew, 1998) we used a variant of the Hyperspace Analogue to Language (HAL) model (Lund and Burgess, 1995) to automatically extract lexical models that allowed us to predict synonymy with an accuracy similar to that obtained with WordNet. Later (Poesio, Ishikawa, Schulte im Walde, and Vieira, 2002; Poesio et al, 2004), we started using the semi-supervised relation extraction methods pioneered by Marti Hearst (1998) to automatically extract from corpora information about the parts of objects (such as the relation between vases and handles). We could then use this information to resolve associative bridging references involving mereological knowledge, like the one shown in 1b.
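
As a rough illustration of what such a vector representation looks like, the following is a minimal sketch of a window-based co-occurrence model with cosine similarity. It is not the HAL variant used in the work above (HAL, for instance, weights co-occurrences by distance); the function names, the window size, and the toy corpus are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus, invented purely for illustration.
corpus = [
    "the vase has a handle".split(),
    "the jug has a handle".split(),
    "the dog chased a cat".split(),
]
vectors = cooccurrence_vectors(corpus)

# Words that occur in similar contexts end up with similar vectors.
print(cosine(vectors["vase"], vectors["jug"]))
print(cosine(vectors["vase"], vectors["dog"]))
```

In this toy corpus `vase` and `jug` share all of their contexts, so they come out as more similar to each other than `vase` is to `dog`. Real semantic space models are built from far larger corpora and usually reweight the raw counts.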

Automatically identifying attributes using supervised methods

The leading distributional models all extract either purely collocational information, or collocational information mediated by syntactic information (Grefenstette, 1994; Lin, 1998; Curran and Moens, 2002). Our work on semi-supervised extraction of mereological relations led us to wonder whether semantic space models in which the vectors encoded relational information would also achieve better performance than traditional distributional models at the tasks in which such models are usually evaluated (e.g., clustering).

I explored this question in work with my former PhD student Abdulrahman Almuhareb. We first showed that semi-supervised techniques for the extraction of attributes and values using only very simple patterns over words achieved higher performance than purely collocational models, with much more compact vectors (Almuhareb and Poesio, 2004). We then evaluated the use of a dependency parser to support attribute extraction, finding that this did not improve results but required much smaller corpora (methods based on word patterns require Web-size corpora, which is cognitively implausible) (Almuhareb and Poesio, 2005a; Almuhareb and Poesio, 2005b). Finally, we developed a supervised approach, in which methods like those used in relation extraction were applied to extract from corpora the relations that we identified as typical of conceptual attributes. This identification was based on a classification of attributes derived in part from Pustejovsky's Qualia theory (Pustejovsky, 2005) and in part from Guarino's theory of attributes (Guarino, 2000) (Poesio and Almuhareb, 2005c). This research on attributes and values is summarized in (Poesio and Almuhareb, 2008). The work subsequently continued as a collaboration between myself, my then PhD student Eduard Barbu, and Claudio Giuliano and Lorenza Romano from Fondazione Bruno Kessler in Trento. In this work we used the classification schema for attributes in feature norms from Wu and Barsalou (2009) (the same schema used by McRae et al, 2005 to classify their feature norms), together with state-of-the-art supervised relation extraction methods based on SVMs (Poesio et al, 2008).
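
To give a concrete feel for what a `very simple pattern over words' can look like, here is a minimal sketch that collects candidate attributes of a concept with a single regular expression. The pattern shape ("the A of the C is/was"), the function name, and the sample text are assumptions chosen for illustration; they are not a specification of the patterns used in the papers cited above.

```python
import re
from collections import Counter


def extract_attributes(text, concept):
    """Count candidate attributes A matched by 'the A of the <concept> is|was'.

    The pattern is a hypothetical example of a word-level extraction pattern.
    """
    pattern = re.compile(
        r"\bthe\s+(\w+)\s+of\s+the\s+" + re.escape(concept) + r"\s+(?:is|was)\b",
        re.IGNORECASE,
    )
    return Counter(match.group(1).lower() for match in pattern.finditer(text))


# Sample text, invented purely for illustration.
text = (
    "The colour of the car is red. The price of the car was high. "
    "The colour of the car is what sold it. The driver of the bus was late."
)
attrs = extract_attributes(text, "car")
print(attrs.most_common())
```

The resulting counts can serve as the dimensions of an attribute-based vector for the concept. Because only words that fill the attribute slot are counted, such a vector is much smaller than one recording every collocate.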

Also with Abdulrahman, I developed one of the first methods for word-sense discrimination within semantic space models. The MSDA algorithm (Almuhareb and Poesio, 2006) borrows ideas from Schuetze's and Pedersen's work, but makes crucial use of the attributes of concepts, rather than all collocations, to discriminate between senses.

Unsupervised and semi-supervised methods

In addition to the supervised methods, we also explored the use of semi-supervised and unsupervised methods. In collaboration with Marco Baroni, Eduard Barbu, and Brian Murphy, we developed the STRUDEL model for the acquisition of distributional models, whose technique of counting patterns instead of occurrences has proven very effective (Baroni et al, 2010). With Eduard Barbu, we focused on the question of whether extracting from Wikipedia instead of from the Web or other corpora would improve the quality of the extracted ontology (Barbu and Poesio, 2009).
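
The difference between counting patterns and counting occurrences can be sketched as follows. This is only a toy illustration of the general idea, not the STRUDEL implementation; the observation format, the pattern strings, and the function names are all hypothetical.

```python
from collections import Counter, defaultdict


def count_occurrences(observations):
    """Score each (concept, property) pair by how many times it was observed."""
    return Counter((concept, prop) for concept, prop, _ in observations)


def count_pattern_types(observations):
    """Score each (concept, property) pair by the number of distinct patterns linking it."""
    patterns = defaultdict(set)
    for concept, prop, pattern in observations:
        patterns[(concept, prop)].add(pattern)
    return {pair: len(seen) for pair, seen in patterns.items()}


# Hypothetical observations: (concept, property, connecting pattern).
observations = (
    [("vase", "museum", "C in the P")] * 5
    + [
        ("vase", "handle", "C with P"),
        ("vase", "handle", "P of the C"),
        ("vase", "handle", "C has P"),
    ]
)

by_occurrence = count_occurrences(observations)
by_pattern = count_pattern_types(observations)
print(by_occurrence[("vase", "museum")], by_occurrence[("vase", "handle")])
print(by_pattern[("vase", "museum")], by_pattern[("vase", "handle")])
```

In this toy data `museum` co-occurs with `vase` more often, but always through the same pattern, whereas `handle` is linked to `vase` by three different patterns. Counting pattern types therefore ranks `handle` higher, while counting raw occurrences ranks `museum` higher.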

Main publications

  • Marco Baroni, Brian Murphy, Eduard Barbu, and Massimo Poesio, 2010. Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science. 34(2), 222-254. (pdf)
  • Barbu, Eduard and Massimo Poesio, 2009. Unsupervised Knowledge Extraction of Taxonomies of Concepts from Wikipedia. In Proc. RANLP, Borovets. (pdf)
  • Massimo Poesio and Abdulrahman Almuhareb, 2008. Extracting Concept Descriptions from the Web: The Importance of Attributes and Values. In P. Buitelaar and P. Cimiano (eds), Ontology Learning and Population: Bridging the Gap between Text and Knowledge. IOS Press, Amsterdam, 29-44. (pdf)
  • Poesio, Massimo, Eduard Barbu, Claudio Giuliano and Lorenza Romano, 2008. Supervised relation extraction for ontology learning from text based on a cognitively plausible model of relations. In Proc. of ECAI Workshop on Ontology Learning and Population, Patras. (pdf)
  • Barbu, E. and Massimo Poesio, 2008. A Comparison of Feature Norms and WordNet. In Proc. of The Global WordNet Conference, Szeged, Hungary. (pdf)
  • Almuhareb, A. and M. Poesio, 2006. MSDA: A Word Sense Discrimination Algorithm. Proc. of ECAI, Riva del Garda, August. (pdf)
  • Poesio, M. and A. Almuhareb, 2005c. Identifying Concept Attributes Using a Classifier. In T. Baldwin, A. Korhonen and A. Villavicencio (eds), Proc. of ACL Workshop on Deep Lexical Semantics, Ann Arbor, Michigan, June. (pdf)
  • Almuhareb, A. and M. Poesio, 2005b. Concept Learning and Categorization from the Web. Proc. of Annual Meeting of the Cognitive Science Society (poster), Stresa, July. (pdf)
  • Almuhareb, A. and M. Poesio, 2005a. Finding Concept Attributes in the Web using a parser. Proc. of the Corpus Linguistics Conference, Birmingham, July. (pdf)
  • Abdulrahman Almuhareb and Massimo Poesio, 2004. Attribute-based and value-based clustering: an evaluation. Proc. of EMNLP, Barcelona, July. (pdf)
  • Massimo Poesio, Rahul Mehta, Axel Maroudas and Janet Hitzeman, 2004. Learning to resolve bridging references, Proc. of ACL, Barcelona, July. (pdf)
  • Massimo Poesio, Tomonori Ishikawa, Sabine Schulte im Walde and Renata Vieira, 2002. Acquiring lexical knowledge for anaphora resolution. Proc. of LREC, Las Palmas, May. (pdf)
  • Renata Vieira and Massimo Poesio, 2000. An Empirically-Based System for Processing Definite Descriptions. Computational Linguistics, v. 26, n.4, 539-593. (pdf)
  • Massimo Poesio and Renata Vieira, 1998. A Corpus-based Investigation of Definite Description Use, Computational Linguistics, v. 24, n.2, 183-216. (pdf)
  • Massimo Poesio, Sabine Schulte im Walde and Chris Brew, 1998. Lexical clustering and definite description interpretation. Proc. of the AAAI Spring Symposium on Learning for Discourse, Stanford, CA, March, 82-89. AAAI. (pdf)
  • Massimo Poesio, Renata Vieira, and Simone Teufel, 1997. Resolving Bridging Descriptions in Unrestricted Text. Proc. ACL-97 Workshop on Operational Factors in Practical, Robust, Anaphora Resolution For Unrestricted Texts. ACL, Madrid, 7-11 July, pages 1-6. (pdf)