Surface Realisation
Surface realisation maps structured (including linearised structured) meaning representations to surface word strings, usually individual sentences. In the period up to 2011, many different surface realisers were developed. Symbolic realisers dominated at first, but after 2000 statistical surface realisers became the more common choice. A significant subset of statistical realisation work (Langkilde, 2002; Callaway, 2003; Nakanishi et al., 2005; Zhong and Stent, 2005; Cahill and van Genabith, 2006; White and Rajkumar, 2009) reported results for regenerating the Penn Treebank (PTB) (Marcus et al., 1995). The basic approach in all this work was to remove information from the Penn Treebank parses (the word strings themselves as well as some of the parse information), and then to convert the resulting underspecified representations into inputs for a surface realiser, which aimed to reproduce the original treebank sentence.
While publications reporting this type of work referred to each other and (more or less tentatively) compared BLEU scores, the results were not in fact directly comparable, because the input representations automatically derived from the Penn Treebank annotations differed. In particular, the extent to which they were underspecified varied from one system to the next. The aim in developing the 2011 Surface Realisation Shared Task (SR'11) was to make it possible, for the first time, to directly compare different, independently developed surface realisers by developing a ‘common-ground’ input representation from which all participating systems could generate realisations. In fact, two different input representations were created, one shallow and one deep, in order to enable more teams to participate.
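The underspecification step described above, and the way its extent can vary between systems, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy dependency parse is hand-written rather than taken from the Penn Treebank, the `Token` fields and the `underspecify` function are hypothetical, and the two output variants are not the actual SR'11 shallow and deep formats.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int     # 1-based position in the original sentence
    form: str    # inflected word form as it appears in the text
    lemma: str   # base form
    head: int    # idx of the head token, 0 for the root
    deprel: str  # dependency relation label


# Toy, hand-written parse (hypothetical; not Penn Treebank data).
SENTENCE = [
    Token(1, "The", "the", 2, "det"),
    Token(2, "dogs", "dog", 3, "nsubj"),
    Token(3, "chased", "chase", 0, "root"),
    Token(4, "a", "a", 5, "det"),
    Token(5, "cat", "cat", 3, "obj"),
]


def underspecify(tokens, keep_deprel=True, seed=0):
    """Drop word forms and word order; optionally drop relation labels too."""
    rng = random.Random(seed)
    order = list(tokens)
    rng.shuffle(order)  # scramble so node ids no longer reveal word order
    new_id = {t.idx: i for i, t in enumerate(order, start=1)}
    new_id[0] = 0  # the root marker stays 0
    nodes = []
    for t in order:
        node = {"id": new_id[t.idx], "lemma": t.lemma, "head": new_id[t.head]}
        if keep_deprel:
            node["deprel"] = t.deprel
        nodes.append(node)
    return nodes


# Two inputs derived from the same parse, underspecified to different degrees.
richer = underspecify(SENTENCE, keep_deprel=True)
poorer = underspecify(SENTENCE, keep_deprel=False)

# The string a realiser would be asked to reproduce from either input.
reference = " ".join(t.form for t in SENTENCE)
```

A realiser given `poorer` has to recover strictly more information than one given `richer`, which is why scores obtained against the same `reference` from differently underspecified inputs are not directly comparable.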
With the advent of neural NLG techniques, there has been renewed interest in standard data sets and shared tasks. New work is being reported for SR'11 (e.g. Puduppully et al., 2017; Marcheggiani and Perez-Beltrachini, 2018), and a new, multilingual SR task (SR'18, SR'19) has been introduced.