The goal is to develop systems that can insert useful content into an existing sentence. We scope this task to generating post-modifiers for entities.
The input is a sentence mentioning a news event about an entity, together with a list of knowledge base entries involving that entity. The entity and the slot where the post-modifier needs to be inserted will be annotated as part of the input.
The output is an appropriate post-modifier phrase that can be inserted into the sentence (immediately after the entity).
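As a rough illustration of the input and output described above, one instance could be held in a small record like the one below. This is only a sketch: the class name, the field names, and the (relation, value) shape of the knowledge base entries are assumptions made for illustration, not the released data format.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class PostModifierInstance:
    """One task instance. All field names here are hypothetical."""

    sentence: str   # sentence mentioning a news event about the entity
    entity: str     # surface form of the entity as it appears in the sentence
    kb_id: str      # knowledge base identifier of the entity
    # Knowledge base entries involving the entity; the (relation, value)
    # pair layout is an assumption, not the actual dataset schema.
    kb_entries: List[Tuple[str, str]] = field(default_factory=list)
    # Gold post-modifier; None when it has to be generated.
    post_modifier: Optional[str] = None


instance = PostModifierInstance(
    sentence=(
        "Barack Obama hailed the MeToo movement as a critical grassroots "
        "effort that needs the support of the society at large."
    ),
    entity="Barack Obama",
    kb_id="FB123412",
)
```

A system would read `sentence`, `entity`, and `kb_entries`, and fill in `post_modifier`.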
Example 1
Input:
- Sentence: Barack Obama hailed the MeToo movement as a critical grassroots effort that needs the support of the society at large.
- Entity: Barack Obama
- KBID: FB123412
Output:
a father of two girls
The output will be inserted as a clause right after the entity mention in the input sentence. So the above output is to be interpreted as below:
Barack Obama, a father of two girls, hailed the MeToo movement as a critical grassroots effort that needs the support of the society at large.
Example 2
Input:
- Sentence: Barack Obama criticized the 5-4 supreme court decision and said that this unravels the Title IX protections in an unprecedented judicial overreach.
- Entity: Barack Obama
- KBID: FB123412
Output:
the 44th president of the US
The output will be inserted as a clause right after the entity mention in the input sentence. So the above output is to be interpreted as below:
Barack Obama, the 44th president of the US, criticized the 5-4 supreme court decision and said that this unravels the Title IX protections in an unprecedented judicial overreach.
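The insertion convention used in both examples can be sketched as follows. This is a minimal illustration, not an official script: the function name `insert_post_modifier` is hypothetical, and it assumes the slot is immediately after the first occurrence of the entity string, whereas the task input carries an explicit slot annotation.

```python
def insert_post_modifier(sentence: str, entity: str, post_modifier: str) -> str:
    """Insert post_modifier, set off by commas, right after the entity mention.

    Assumption: the slot follows the first occurrence of `entity`.
    """
    start = sentence.find(entity)
    if start == -1:
        raise ValueError(f"entity {entity!r} not found in sentence")
    end = start + len(entity)
    rest = sentence[end:]
    # Skip the closing comma when the entity ends the sentence or is
    # already followed by punctuation, to avoid sequences such as ",.".
    closing = "" if (not rest or rest[0] in ".,;:!?") else ","
    return f"{sentence[:end]}, {post_modifier}{closing}{rest}"


example = insert_post_modifier(
    "Barack Obama hailed the MeToo movement as a critical grassroots "
    "effort that needs the support of the society at large.",
    "Barack Obama",
    "a father of two girls",
)
print(example)
```

Applied to Example 1, this reproduces the interpreted sentence shown there.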
The evaluation measures how well the generated post-modifier matches the original post-modifier and whether the generated text fits with the rest of the sentence. We will use the following measures:
[We will release a script soon.]
The dataset consists of 30,691 sentences extracted from the Gigaword and CNN-DailyMail collections. Additional details below.
[TODO filter out clearly unnecessary fields]
We have built a baseline: a seq2seq model with attention, implemented using OpenNMT-py (https://github.com/OpenNMT/OpenNMT-py).
Model details: