Discourse Relation Parsing and Treebanking (DISRPT)
7th Workshop on Rhetorical Structure Theory and Related Formalisms
In conjunction with the 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019)
June 6, 2019
The study of coherence relations in frameworks such as RST (Mann & Thompson 1988), SDRT (Asher & Lascarides 2003) and PDTB (Miltsakaki et al. 2004) has experienced a revival in the last few years, in English and many other languages (Matthiessen & Teruya 2015; Maziero et al. 2015; da Cunha 2016; Iruskieta et al. 2016; Zeldes 2016, 2017). Multiple sites are now actively engaged in the development of discourse parsers (Feng & Hirst 2014; Joty et al. 2015; Surdeanu et al. 2015; Xue et al. 2016; Braud et al. 2017), as a goal in itself, but also for applications such as sentiment analysis, argumentation mining, summarization, question answering, or machine translation evaluation (Benamara et al. 2017; Gerani et al. to appear; Durrett et al. 2016; Peldszus & Stede 2016; Scarton et al. 2016; Schouten & Frasincar 2016; among many others). At the same time, evaluation of results in discourse parsing has proven complicated (see Morey et al. 2017), and progress in integrating results across discourse treebanking frameworks has been slow.
DISRPT 2019 follows a series of biennial events on discourse relation studies, which were initially focused especially on RST, first in Brazil (2007, 2009, 2011, 2013) as part of Brazilian NLP conferences, and then in Spain in 2015 and in 2017, as part of the Spanish NLP conference (https://sites.google.com/site/workshoprst2015/) and INLG 2017 (https://sites.google.com/site/workshoprst2017/). The 2019 workshop aims to broaden the scope of discussion to include participants and program committee members from different discourse theories (especially, but not limited to, RST, SDRT and PDTB). We are interested in applied papers with a computational orientation, resource papers and work on discourse parsing, as well as papers that advance the field with novel theoretical contributions and promote cross-framework fertilization. A major theme and a related shared task on discourse unit segmentation across formalisms (see below) will aim to promote convergence of resources and a joint evaluation of discourse parsing approaches.
We invite submissions on the following and related topics, handling any language(s), and especially under-represented ones:
- Discourse relations (issues in segmentation, relation inventory, cognitive status of relations)
- Discourse parsing in any formalism, including shallow and deep discourse parsing
- Relation signaling (connectives and any other signals) and annotation
- Applications of coherence relations in NLP
The invited speaker for the workshop will be Bonnie Webber (Institute for Language, Cognition and Computation, University of Edinburgh) - title: Discourse (2009-2019): Recent successes, future challenges.
This workshop introduces the first iteration of a cross-formalism shared task on discourse unit segmentation. Since all major discourse parsing frameworks require a segmentation of texts into non-overlapping, though possibly discontinuous, segments, learning segmentations for and from diverse resources is a promising area for converging methods and insights. We will provide training, development and test datasets from all available languages in RST, SDRT and PDTB, using a uniform format. Because different corpora, languages and frameworks use different guidelines for segmentation, the shared task will promote the design of flexible methods for dealing with various guidelines, and will help to push forward the discussion of converging standards for discourse units. For datasets that have treebanks, we will evaluate in two different scenarios: with and without gold syntax.
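As an illustration only, segmentation output of this kind could be scored by comparing predicted and gold segment boundaries token by token. The sketch below is not the official shared task scorer or data format: it assumes a hypothetical encoding with one 0/1 label per token, where 1 marks a token that begins a new segment, and it reports precision, recall and F1 on those boundary labels. The function name `boundary_prf` is likewise an assumption made for this example, and discontinuous segments would need a richer encoding than this one.

```python
from typing import Sequence, Tuple


def boundary_prf(gold: Sequence[int], pred: Sequence[int]) -> Tuple[float, float, float]:
    """Score predicted segment boundaries against gold boundaries.

    Assumed (hypothetical) encoding: one label per token,
    1 = token begins a new segment, 0 = token continues the current segment.
    Returns (precision, recall, f1) over the boundary (1) labels.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must cover the same tokens")

    # Count boundary decisions token by token.
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)

    # Guard against division by zero when no boundaries are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


# Toy example: 7 tokens, three gold segments, one spurious and one missed boundary.
gold_labels = [1, 0, 0, 1, 0, 1, 0]
pred_labels = [1, 0, 1, 1, 0, 0, 0]
precision, recall, f1 = boundary_prf(gold_labels, pred_labels)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Scoring only the boundary labels, rather than accuracy over all tokens, keeps the measure from being inflated by the many tokens that do not begin a segment.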
Workshop organizers:
- Amir Zeldes (Georgetown University, Washington, DC, USA)
- Debopam Das (Humboldt University of Berlin; University of Potsdam, Germany)
- Erick Galani Maziero (Universidade Federal de Lavras, Brazil)
- Juliano Desiderato Antonio (Universidade Estadual de Maringá, Brazil)
- Mikel Iruskieta (University of the Basque Country, Spain)
Program committee:
- Stergos Afantenos, IRIT - Université Paul Sabatier, France
- Farah Benamara, IRIT - Université Paul Sabatier, France
- Eduard Hovy, Carnegie Mellon University, USA
- Irene Castellon, Universitat de Barcelona, Spain
- Christian Chiarcos, Johann Wolfgang Goethe Universität Frankfurt, Germany
- Maria Beatriz Nascimento Decat, Universidade Federal de Minas Gerais, Brazil
- Iria da Cunha, Universidad Nacional de Educación a Distancia, Spain
- Barbara Di Eugenio, University of Illinois at Chicago, USA
- Arantza Diaz de Ilarraza, University of the Basque Country, Spain
- Flavius Frasincar, Erasmus University Rotterdam, Netherlands
- Maria Eduarda Giering, Universidade do Vale do Rio dos Sinos, Brazil
- Nancy Green, University of North Carolina, USA
- Graeme Hirst, University of Toronto, Canada
- Kerstin Kunz, Universität Heidelberg, Germany
- Ekaterina Lapshinova-Koltunski, Universität des Saarlandes, Germany
- Jiri Mirovsky, Charles University, Czech Republic
- Anna Nedoluzhko, Charles University, Czech Republic
- Thiago Pardo, Universidade de São Paulo, Brazil
- Lucie Polakova, Charles University, Czech Republic
- Gisela Redeker, University of Groningen, Netherlands
- Hannah Rohde, University of Edinburgh, UK
- Gerardo Sierra, Universidad Nacional Autónoma de México, Mexico
- Christian Stab, Technische Universität Darmstadt, Germany
- Manfred Stede, Universität Potsdam, Germany
- Mihai Surdeanu, University of Arizona, USA
- Maite Taboada, Simon Fraser University, Canada
- Juan-Manuel Torres, Laboratoire Informatique d'Avignon, France
- Nianwen Xue, Brandeis University, USA
References:
Asher, Nicholas & Alex Lascarides. 2003. Logics of Conversation. Cambridge: Cambridge University Press.
Benamara, Farah, Maite Taboada & Yannick Mathieu. 2017. Evaluative language beyond bags of words: Linguistic insights and computational applications. Computational Linguistics 43(1), 201–264.
Braud, Chloé, Maximin Coavoux & Anders Søgaard. 2017. Cross-lingual RST discourse parsing. Proceedings of EACL 2017. Valencia, Spain, 292–304.
da Cunha, Iria. 2016. Towards discourse parsing in Spanish. Papers presented at TextLink - Structuring Discourse in Multilingual Europe - Second Action Conference. Budapest, Hungary.
Durrett, Greg, Taylor Berg-Kirkpatrick & Dan Klein. 2016. Learning-based single-document summarization with compression and anaphoricity constraints. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Berlin, Germany, 1998–2008.
Feng, Vanessa Wei & Graeme Hirst. 2014. A linear-time bottom-up discourse parser with constraints and post-editing. Proceedings of ACL 2014. Baltimore, MD, 511–521.
Gerani, Shima, Giuseppe Carenini & Raymond Ng. to appear. Modeling content and structure for abstractive review summarization. Computer Speech and Language.
Iruskieta, Mikel, Gorka Labaka & Juliano Desiderato Antonio. 2016. Detecting the central units in two different genres and languages: A preliminary study of Brazilian Portuguese and Basque texts. Procesamiento del Lenguaje Natural 56, 65–72.
Joty, Shafiq, Giuseppe Carenini & Raymond Ng. 2015. CODRA: A novel discriminative framework for rhetorical analysis. Computational Linguistics 41(3), 385–435.
Mann, William C. & Sandra A. Thompson. 1988. Rhetorical Structure Theory: Toward a functional theory of text organization. Text - Interdisciplinary Journal for the Study of Discourse 8(3), 243–281.
Matthiessen, Christian M.I.M. & Kazuhiro Teruya. 2015. Grammatical realizations of rhetorical relations in different registers. Word 61(3), 232–281.
Maziero, Erick G., Graeme Hirst & Thiago A. S. Pardo. 2015. Semi-supervised never-ending learning in rhetorical relation identification. Proceedings of Recent Advances in Natural Language Processing, Hissar, Bulgaria.
Miltsakaki, Eleni, Rashmi Prasad, Aravind K. Joshi & Bonnie L. Webber. 2004. The Penn Discourse Treebank. Proceedings of LREC 2004. Lisbon, Portugal.
Morey, Mathieu, Philippe Muller & Nicholas Asher. 2017. How much progress have we made on RST discourse parsing? A replication study of recent results on the RST-DT. Proceedings of EMNLP 2017. Copenhagen, Denmark, 1319–1324.
Peldszus, Andreas & Manfred Stede. 2016. Rhetorical structure and argumentation structure in monologue text. Proceedings of the 3rd Workshop on Argument Mining, ACL. Berlin, Germany, 103–112.
Riccardi, Giuseppe, Frederic Bechet, Morena Danieli, Benoit Favre, Robert Gaizauskas, Udo Kruschwitz & Massimo Poesio. 2015. The SENSEI Project: Making sense of human conversations. In J. F. Quesada, F. J. Martín Mateos & T. López-Soto (eds.), Future and Emergent Trends in Language Technology. Proceedings of the First International FETLT Workshop. Berlin: Springer, 10–33.
Scarton, Carolina, Daniel Beck, Kashif Shah, Karin Sim Smith & Lucia Specia. 2016. Word embeddings and discourse information for Machine Translation Quality Estimation. Proceedings of the First Conference on Machine Translation, ACL. Berlin, Germany, 831–837.
Schouten, Kim & Flavius Frasincar. 2016. COMMIT at SemEval-2016 Task 5: Sentiment analysis with Rhetorical Structure Theory. Proceedings of SemEval-2016. San Diego, CA, 356–360.
Surdeanu, Mihai, Thomas Hicks & Marco Valenzuela-Escárcega. 2015. Two practical Rhetorical Structure Theory parsers. Proceedings of NAACL 2015. Denver, CO, 1–5.
Xue, Nianwen, Hwee Tou Ng, Sameer Pradhan, Attapol T. Rutherford, Bonnie Webber, Chuan Wang & Hongmin Wang. 2016. CoNLL 2016 Shared Task on multilingual shallow discourse parsing. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Berlin, Germany, 1–19.
Zeldes, Amir. 2016. rstWeb: A browser-based annotation interface for Rhetorical Structure Theory and discourse relations. Proceedings of the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2016) System Demonstrations. San Diego, CA, 1–5.
Zeldes, Amir. 2017. The GUM corpus: Creating multilayer resources in the classroom. Language Resources and Evaluation 51(3), 581–612.