Anaphora is the linguistic device of referring back to previously introduced entities with expressions such as pronouns, as in
1. The city council refused the women a permit because they feared violence.
Anaphora resolution has been one of my main areas of research in the past twenty years or so; the handbook that I co-edited (Poesio, Stuckardt and Versley, 2016) summarizes the state of the field. Over the years I have worked on the linguistics of anaphora (e.g., the problem of telescoping); on anaphora resolution (or coreference) both from a scientific and a technological perspective; on the creation of anaphorically annotated resources such as ARRAU and GNOME; and on the use of anaphora resolution in applications such as summarization and information extraction.
One of the main stumbling blocks for research in semantics and discourse has been the lack of semantically annotated resources of a size and quality comparable to that of the Penn Treebank. As a result, in order to support the research on anaphora discussed below (and in particular our work on salience, bridging references, reference to abstract objects, and disagreements on anaphora), I have become involved in a number of projects concerned with the creation of such resources: the MATE project, in which general-purpose anaphoric annotation guidelines were designed; the GNOME project, focused on annotation to support research on salience and bridging; the ARRAU project on abstract object anaphora and disagreements on anaphora; the Phrase Detectives project; and the ongoing DALI project. See the page on annotation for more details.
An issue that became apparent from the very first data collections I was involved with (the TRAINS corpus, the Vieira-Poesio corpus), and that then became the focus of several subsequent projects, is that of disagreements in the interpretation of anaphoric expressions (Poesio and Reyle, 2001; Poesio, Reyle, and Stevenson, 2007; Poesio et al, 2006). There are many anaphoric expressions on whose interpretation people disagree without finding them ambiguous. One example is the pronoun IT in the following request:
2. Can you kindly hook up engine E3 to the boxcar at Elmira and send IT to Corning as soon as possible please?
These examples led us to formulate the Justified Sloppiness Hypothesis to explain the felicity of such cases. In subsequent work we identified a number of other cases of 'felicitous ambiguity', and carried out numerous empirical investigations of the problem, in particular in the ARRAU project, the Phrase Detectives Game-With-A-Purpose for anaphoric annotation (Poesio et al, 2013), and now in the DALI project (see the page on ambiguity for more discussion).
With regard to anaphora resolution, I have been especially interested in the effect of salience on anaphora resolution (Poesio et al, 2004; Poesio et al, 2006; Karamanis et al, 2009), the use of lexical and commonsense knowledge in the interpretation of so-called bridging references (Poesio et al, 1997; Poesio and Vieira, 1998; Poesio, Schulte im Walde and Brew, 1998; Vieira and Poesio, 2000; Poesio et al, 2002; Poesio et al, 2004; see also the next section), and the treatment of non-anaphoric definites (see the section on the Linguistics of Anaphora).
My work with Janet Hitzeman, Rosemary Stevenson, and other collaborators (in particular Hua Cheng, Renate Henschel, and Nikiforos Karamanis) on the role of salience in anaphoric interpretation started with a careful analysis of the claims of theories of the local focus, such as Sidner's theory (Sidner, 1979) or Centering theory (Grosz, Joshi and Weinstein, 1995; Walker, 1998), in the light of empirical evidence. According to such theories, at each stage in the interpretation of a discourse certain entities are more `important' or `salient' than others, and this affects the interpretation of anaphoric expressions, including pronouns and demonstratives. Consider for instance the contrast pointed out by Gundel between 3a and 3b. Most theories of the local focus predict the apple to be `more salient' than the napkin; this would explain why it is the preferred interpretation of the pronoun it in 3a. By contrast, demonstratives are generally used to refer to entities that are not in the local focus; hence the preferred interpretation of that in 3b is the napkin.
3.a Put the apple on the napkin and then move it to the side.
3.b Put the apple on the napkin and then move that to the side.
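The contrast between 3a and 3b can be made concrete with a small sketch. It is only a toy illustration, not the model used in any of the papers cited here: it assumes that entities are ranked by grammatical function (one of several ranking options discussed in the Centering literature), and the names `Entity`, `FUNCTION_RANK`, `rank_by_salience` and `resolve` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical ranking of grammatical functions, most salient first.
# This ordering is an assumption made for illustration only.
FUNCTION_RANK = {"subject": 0, "object": 1, "oblique": 2}


@dataclass(frozen=True)
class Entity:
    name: str
    function: str  # grammatical function in the preceding clause


def rank_by_salience(entities):
    """Order entities from most to least salient by grammatical function."""
    return sorted(entities, key=lambda e: FUNCTION_RANK[e.function])


def resolve(expression, entities):
    """Pick a preferred antecedent for the pronoun 'it' or the demonstrative 'that'.

    The pronoun prefers the most salient entity; the demonstrative prefers
    the most salient entity that is not the top-ranked one.
    """
    ranked = rank_by_salience(entities)
    if expression == "it":
        return ranked[0].name
    if expression == "that":
        # Fall back to the top entity when there is no alternative.
        return ranked[1].name if len(ranked) > 1 else ranked[0].name
    raise ValueError(f"unsupported expression: {expression}")


# Entities introduced by the first clause of example 3:
# "Put the apple on the napkin"
clause_entities = [Entity("the napkin", "oblique"), Entity("the apple", "object")]

print(resolve("it", clause_entities))    # the apple
print(resolve("that", clause_entities))  # the napkin
```

Swapping in a different ranking (for example, linear position in the clause) only requires changing the sort key, which is the kind of variation in how local focus theories can be instantiated.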
Our studies involved a combination of traditional behavioral methods from psycholinguistics, corpus analysis, and computational modelling. Some of the questions we studied using standard psychological techniques include the effect of rhetorical structure on pronoun interpretation, and especially the interaction between animacy and thematic roles in determining the salience of a discourse entity (Pearson, Stevenson and Poesio, 2001). This work has been continued in recent years by my current student Kevin Glover, who, in his PhD (Glover, 2015), proposes a novel model of animacy and a method for estimating the animacy of noun phrases from corpora, which he then deploys in a variety of applications ranging from predicting pronominalization to clinical applications. In parallel with this behavioral work, we used annotated corpora (in particular, the GNOME corpus - see below) to evaluate how the claims of local focus theories such as Centering are affected by different ways of setting their `parameters': for example, whether position in the sentence or grammatical function is used to rank discourse entities, or how frequently the local focus is updated (after every clause or after every sentence) (Poesio et al, 2004).
These analyses were subsequently extended in several directions. In collaboration with Natalia Modjeska, I used the computational models of the local focus to investigate the claims mentioned above about the relation between the local focus and the use of demonstrative NPs in English (Poesio et al, 2004). In collaboration with Barbara Di Eugenio and several students, including Amrita Patel, I tested several models of the global focus proposed in the literature on Centering (Poesio et al, 2006). Finally, in collaboration with Janet Hitzeman, I looked at the relation between local focus and global focus by studying the interpretation of so-called long-distance pronouns (Hitzeman and Poesio, 1998).
The work with Janet Hitzeman and Rosemary Stevenson on psychologically plausible models of anaphora resolution had obvious applications in the area of natural language generation - the development of systems capable of producing natural language that people find easy to process. This observation led to a joint project on language generation in which Janet, Rosemary and I collaborated with Donia Scott (then at the University of Brighton) and Barbara Di Eugenio from the University of Illinois at Chicago. The GNOME project, which ran from 1998 to 2000, was an EPSRC-funded project whose goal was to apply results from psychological research and from corpus analysis to develop and evaluate algorithms for generating nominal expressions.
In GNOME we developed both hand-crafted and statistical models of the processes involved in discourse entity realization, including statistical models of the choice of NP type (Poesio and Henschel, 1999; Poesio et al, 1999; Poesio, 2000) and NP modification (Cheng, Poesio, Henschel, and Mellish, 2001). We were particularly interested in pronominalization (Henschel, Cheng and Poesio, 2000).
More recently, I collaborated with Nikiforos Karamanis, Chris Mellish and Jon Oberlander on developing statistical models of other types of generation, including text structuring (Karamanis et al, 2009).
Definite descriptions like the handles are a particularly interesting class of anaphoric expressions in that, unlike pronouns, they often require lexical and commonsense knowledge to be interpreted. For instance, in the following example from the GNOME corpus (reported by Poesio et al, 2004), `the handles' in 4b is an associative reference to the `egg vases' in 4a.
4a. These `egg vases' are of exceptional quality.
4b. Basketwork bases support egg-shaped bodies, and bundles of straw form the handles.
Hence, studying how such anaphoric references are interpreted may lead to insights into the interface between grammar and commonsense knowledge. Together with my former student Renata Vieira, I first produced a classification of bridging references and tested whether lexical sources such as WordNet provided sufficient information, with mediocre results (Poesio, Vieira, and Teufel, 1997; Vieira and Poesio, 2000). In subsequent work, we started investigating whether the necessary lexical and commonsense knowledge could be acquired in a psychologically plausible way from corpora, using so-called `semantic space' or `distributional' models (Poesio, Schulte im Walde, and Brew, 1998; Poesio, Ishikawa, Schulte im Walde, and Vieira, 2002; Poesio et al, 2004). This work led me to initiate a new line of research on such models for the acquisition of lexical and commonsense knowledge from corpora.
In recent years, I have started investigating the use of Wikipedia to resolve encyclopedic definite descriptions such as the composer in Bach ... the composer, in collaboration with Olga Uryupina (Uryupina et al, 2012).
From a technological perspective, I've been especially interested in the use of anaphora resolution for summarization. In collaboration with Josef Steinberger and Mijail Kabadjov, I managed to show that using anaphora resolution can result in significant improvements in the quality of summaries as measured in terms of scores such as ROUGE (Steinberger et al, 2007).
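For readers unfamiliar with ROUGE, the idea behind its simplest variant can be sketched as follows. This is a simplified unigram-recall computation written for illustration only: it assumes lowercased whitespace tokenization with no stemming or stopword handling, it is not the official ROUGE toolkit, and it is not the evaluation setup of the work cited above. The function name `rouge_1_recall` and the example sentences are hypothetical.

```python
from collections import Counter


def rouge_1_recall(candidate: str, reference: str) -> float:
    """Fraction of reference unigrams that also occur in the candidate summary.

    Each reference token can be matched at most as many times as it
    appears in the candidate (clipped counts).
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand_counts[token]) for token, count in ref_counts.items())
    return overlap / total


# Made-up sentences, used only to exercise the function.
reference_summary = "the council refused the permit"
candidate_summary = "the council refused it"

print(rouge_1_recall(candidate_summary, reference_summary))  # 0.6
```

In this toy case the unresolved pronoun in the candidate contributes nothing to the overlap, which gives an intuition for why resolving anaphoric expressions can affect overlap-based scores.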
More recently, I started investigating the use of anaphora resolution to summarize online forums, in the SENSEI project.
Over the years I co-led the development of a number of anaphora resolution systems: first the Vieira-Poesio definite description resolver, then two of the first off-the-shelf toolkits for anaphora resolution, GUITAR (Poesio and Kabadjov, 2004) and more recently BART (Versley et al, 2008; Broscheit et al, 2010; Uryupina et al, 2011, 2012).
The aspects of the linguistics of anaphora that I have investigated include the phenomenon of telescoping (Poesio and Zucchi, 1990); weak definites (Poesio, 1994); and, more generally, whether familiarity is really the foundation of definiteness (Poesio, 1999, 2001).
`Weak definites' are definite descriptions that seem to have `lost' their presuppositional status (Poesio, 1994). This initial work was based on the assumption that the defining characteristic of definites is familiarity, as suggested, e.g., by Heim (1982). More recently, I have been led to reconsider this assumption, primarily because of my work with Renata Vieira on analyzing the uses of definite descriptions in corpora. We found that more than half of definite descriptions are not `familiar' in an obvious sense. As a result, I began considering theories based on the competing hypothesis that what characterizes definites is uniqueness. Specifically, I concentrated on Loebner's (1987) theory. Using the GNOME corpus, we compared the familiarity-based account with Loebner's; preliminary results of this work are discussed in (Poesio, 2001).
I also developed a theory of definite description interpretation, which made use of research on the formal semantics of discourse and on defeasible reasoning (Poesio, 1992; Poesio, 1993b; Poesio, 1994b; Poesio and Rieser, 2011).