Workshop on Monolingual
Text-To-Text Generation
Co-located with ACL-HLT 2011 in Portland, OR, 24 June 2011

This workshop is endorsed by SIGGEN
This workshop has now concluded.  Thank you to all for making it such a successful event.

Please join the group on text-to-text generation and share your thoughts about possible shared tasks and data.

There is also now a website for text-to-text generation to keep track of future workshops and new data sets.  Please visit: https://sites.google.com/site/t2tgen.

Call for papers

The ability to perform monolingual text-to-text generation is an important step in solving many natural language processing problems. For example, when generating novel text at the sentence level, abstractive summarization systems may need to compress sentences or fuse multiple sentences together; the evaluation of translation systems may require additional paraphrases to use as reference gold standards; and answers to questions may need to be generated automatically from extracted sentences.

The community of researchers examining monolingual text-to-text generation has grown steadily in recent years, introducing the need for a focused venue to communicate results in this area.  As tools and approaches are developed, it is important that our community shares its experiences and its resources.

This workshop will solicit work describing the use of data-oriented text-to-text generation methods, where the generation process begins with some source text as input. As such, it complements existing events such as GenChal'11 at ENLG 2011, which will have a focus on data-to-text surface realisation methods.

This year, the workshop will focus on work describing the generation of novel sentences, with preference given to submissions that describe how the proposed text-to-text generation model has an impact on content selection and/or issues of grammaticality at the sentence level. Submissions may describe work in progress, resources, or position papers, as well as traditional unpublished work.

Suggested topics for this workshop include (but are not limited to):

  • Sentence fusion
  • Sentence compression
  • Sentence-level paraphrase generation
  • Answer generation for questions
  • Sentence simplification
  • Evaluating novel sentence generation
  • Semantics and world knowledge for sentence generation
  • Content planning issues in text-to-text generation
  • Incorporating user preferences for text-to-text generation
  • Discourse-level constraints for novel sentence generation
  • Descriptions of new monolingual text-to-text generation problems
  • Applications of monolingual text-to-text generation
List of useful resources.

Submit a paper

We will be accepting both short papers (up to four (4) pages of content, with two (2) additional pages of references) and long papers (up to eight (8) pages of content, with two (2) additional pages of references). Submission requirements are identical to those of the main conference. For further information on the submission guidelines see: http://www.acl2011.org/call.shtml#submission .

Please submit your paper using the START V2 system. Note that short papers can be submitted on April 9 (one day after the ACL short paper notification) but the abstracts must be emailed to the organizers before April 1.

Important dates

  • Dec 18   Workshop CFP
  • Apr 01   Abstract due date
  • Apr 09   Full submission due date
  • Apr 28   Notification of acceptance
  • May 06   Camera-ready deadline
  • Jun 24   Workshop

Committees

Program committee:

  • Anja Belz
  • Bernd Bohnet
  • Aoife Cahill
  • Chris Callison-Burch
  • Robert Dale
  • Mark Dras
  • Michel Galley
  • Kevin Knight
  • Emiel Krahmer
  • Mirella Lapata
  • Nitin Madnani
  • Erwin Marsi
  • Kathleen McKeown
  • Ryan McDonald
  • Cécile Paris
  • Michael Strube
  • Michael White
  • David Zajic
Steering committee:
  • Chris Callison-Burch
  • Mirella Lapata
  • Erwin Marsi

Organizers & contact information

Useful resources

Accepted papers

Framework for Abstractive Summarization using Text-to-Text Generation
Pierre-Etienne Genest and Guy Lapalme

Learning to Fuse Disparate Sentences
Micha Elsner and Deepak Santhanam

An Unsupervised Alignment Algorithm for Text Simplification Corpus Construction
Stefan Bott and Horacio Saggion

Text specificity and impact on quality of news summaries
Annie Louis and Ani Nenkova

Comparing Phrase-based and Syntax-based Paraphrase Generation
Sander Wubben, Erwin Marsi, Antal van den Bosch and Emiel Krahmer

Web-based validation for contextual targeted paraphrasing
Houda Bouamor, Aurélien Max, Gabriel Illouz and Anne Vilnat

Paraphrastic Sentence Compression with a Character-based Metric: Tightening without Deletion
Courtney Napoles, Chris Callison-Burch, Juri Ganitkevitch and Benjamin Van Durme

Evaluating sentence compression: Pitfalls and suggested remedies
Courtney Napoles, Chris Callison-Burch and Benjamin Van Durme

Creating Disjunctive Logical Forms from Aligned Sentences for Grammar-Based Paraphrase Generation
Scott Martin and Michael White

Learning to Simplify Sentences Using Wikipedia
Will Coster and David Kauchak

Towards Strict Sentence Intersection: Decoding and Evaluation Strategies
Kapil Thadani and Kathleen McKeown

Invited talk

Statistical Perspectives on Text-to-Text Generation

Speaker: Noah Smith, CMU

Statistical techniques are now the dominant approach for NLP problems like translation and parsing, where data occur as a by-product of human activities (parallel corpora from international organizations) or can be obtained by expert annotation efforts. Statistical models for text-to-text problems must deal with scenarios where data are less "natural," less static, and generally smaller. I'll present and attempt to synthesize examples from my group's efforts in this area, drawing from examples in machine translation, question answering, question generation, paraphrase, and summarization.

Slides.

Workshop schedule

9:00 - 10:30

  • Learning to Simplify Sentences Using Wikipedia
    Will Coster and David Kauchak, (9:00)
  • Web-based Validation for Contextual Targeted Paraphrasing
    Houda Bouamor, Aurélien Max, Gabriel Illouz and Anne Vilnat, (9:30)
  • An Unsupervised Alignment Algorithm for Text Simplification Corpus Construction
    Stefan Bott and Horacio Saggion, (10:00)
  • Comparing Phrase-based and Syntax-based Paraphrase Generation
    Sander Wubben, Erwin Marsi, Antal van den Bosch and Emiel Krahmer (10:15)

Morning break (10:30 - 11:00)

11:00 - 12:30

Invited talk (Noah Smith) and discussion

Lunch (12:30 - 14:00)

14:00 - 15:30

  • Text Specificity and Impact on Quality of News Summaries
    Annie Louis and Ani Nenkova, (14:00)
  • Towards Strict Sentence Intersection: Decoding and Evaluation Strategies
    Kapil Thadani and Kathleen McKeown, (14:30)
  • Learning to Fuse Disparate Sentences
    Micha Elsner and Deepak Santhanam, (15:00)

Afternoon break (15:30 - 16:00)

16:00 - 17:30

  • Framework for Abstractive Summarization using Text-to-Text Generation
    Pierre-Etienne Genest and Guy Lapalme, (16:00)
  • Creating Disjunctive Logical Forms from Aligned Sentences for Grammar-Based Paraphrase Generation
    Scott Martin and Michael White, (16:30)
  • Paraphrastic Sentence Compression with a Character-based Metric: Tightening without Deletion
    Courtney Napoles, Chris Callison-Burch, Juri Ganitkevitch and Benjamin Van Durme, (17:00)
  • Evaluating Sentence Compression: Pitfalls and Suggested Remedies
    Courtney Napoles, Benjamin Van Durme and Chris Callison-Burch, (17:15)