Post-modifier Generation
The goal is to develop systems that can insert useful content into an existing sentence. We scope this task to generating post-modifiers for entities.
Task Definition
The input is a sentence mentioning a news event about an entity and a list of knowledge base entries involving the entity. The entity and the slot where the post-modifier is needed are annotated as part of the input.
The output is an appropriate post-modifier phrase that can be inserted into the sentence (immediately after the entity).
Example 1
Input:
- Sentence: Barack Obama hailed the MeToo movement as a critical grassroots effort that needs the support of the society at large.
- Entity: Barack Obama
- KBID: FB123412
Output:
a father of two girls
The output will be inserted as a clause right after the entity mention in the input sentence. So the above output is to be interpreted as below:
Barack Obama, a father of two girls, hailed the MeToo movement as a critical grassroots effort that needs the support of the society at large.
Example 2
Input:
- Sentence: Barack Obama criticized the 5-4 supreme court decision and said that this unravels the Title IX protections in an unprecedented judicial overreach.
- Entity: Barack Obama
- KBID: FB123412
Output:
the 44th president of the US
The output will be inserted as a clause right after the entity mention in the input sentence. So the above output is to be interpreted as below:
Barack Obama, the 44th president of the US, criticized the 5-4 supreme court decision and said that this unravels the Title IX protections in an unprecedented judicial overreach.
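The insertion step described above can be sketched as a small helper that places the generated phrase as an appositive clause immediately after the first entity mention (the function name and comma-based formatting are illustrative assumptions, not part of any released tooling):

```python
def insert_post_modifier(sentence: str, entity: str, post_modifier: str) -> str:
    """Insert a post-modifier as an appositive clause immediately
    after the first mention of the entity in the sentence."""
    idx = sentence.find(entity)
    if idx == -1:
        raise ValueError(f"entity {entity!r} not found in sentence")
    end = idx + len(entity)
    # Wrap the post-modifier in commas and splice it back in.
    return f"{sentence[:end]}, {post_modifier},{sentence[end:]}"
```

For instance, `insert_post_modifier("Barack Obama hailed the movement.", "Barack Obama", "a father of two girls")` yields the comma-separated appositive shown in Example 1.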
Evaluation
The evaluation measures how well the generated post-modifier matches the original post-modifier and whether the generated text fits with the rest of the sentence. We will use the following measures:
- KB coverage
- General: # of KB claims covered by the generated post-modifier (PM)
- Gold PM: # of relevant KB claims covered by the PM (where relevant = set of claims that have overlap with gold PM)
- Exact match
- BLEU + METEOR
- Word embedding based similarity measure (cosine of averaged word vectors)
- Coherence of the modifier in the local context (language modeling and/or similarity with the surrounding context; details TBD)
[We will release a script soon.]
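Ahead of the script release, the word-embedding similarity measure can be sketched as follows: average the word vectors of each phrase and take the cosine of the two averages. The toy embedding table in the usage note is a stand-in; in practice pretrained vectors (e.g., GloVe) would be used.

```python
import numpy as np

def avg_vector(tokens, embeddings):
    """Average the word vectors of the tokens found in the embedding table."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return None
    return np.mean(vecs, axis=0)

def embedding_similarity(generated, reference, embeddings):
    """Cosine similarity between the averaged word vectors of two phrases."""
    g = avg_vector(generated.lower().split(), embeddings)
    r = avg_vector(reference.lower().split(), embeddings)
    if g is None or r is None:
        return 0.0  # no in-vocabulary tokens on one side
    return float(np.dot(g, r) / (np.linalg.norm(g) * np.linalg.norm(r)))
```

With a toy table such as `{"president": np.array([1.0, 0.0]), "us": np.array([0.0, 1.0])}`, identical phrases score 1.0 and phrases with orthogonal averaged vectors score 0.0.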
Dataset:
The dataset consists of 30,691 sentences extracted from the Gigaword and CNN-DailyMail collections. Additional details below.
[TODO filter out clearly unnecessary fields]
Baseline Models:
We have built a baseline: a seq2seq model with attention, implemented with OpenNMT-py (https://github.com/OpenNMT/OpenNMT-py).
Model details:
- Single 2-layer biLSTM that encodes the sentence + claims.
- How is the input encoded?
- The sentence comes first, followed by the claims with special tokens marking each relation and its values.
- <rel> ... </rel> wraps a relation, <value> ... </value> wraps its values, and <and> separates multiple values.
- Example: Last night German magazine Der Spiegel quoted aides to Herman Van Rompuy . <rel> member of political party </rel> <value> Christian Democratic and Flemish <and> European People's Party </value> <rel> occupation </rel> <value> politician <and> economist </value>
- What is the attention doing?
- Standard multiplicative attention that produces weights over the encoder hidden states (http://opennmt.net/OpenNMT-py/onmt.modules.html#attention).
- The decoder generates only the post-modifier, using beam search (beam size = 5).
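The input linearization above (sentence first, then special-token-marked claims) can be sketched as a small formatting function; the function name and the `(relation, values)` tuple layout are assumptions for illustration:

```python
def linearize(sentence, claims):
    """Concatenate the sentence with its KB claims, marking each claim
    with <rel> ... </rel> and <value> ... </value>, and joining multiple
    values of one relation with <and>."""
    parts = [sentence]
    for relation, values in claims:
        parts.append(f"<rel> {relation} </rel>")
        parts.append(f"<value> {' <and> '.join(values)} </value>")
    return " ".join(parts)
```

Applied to the Van Rompuy example with claims `[("member of political party", [...]), ("occupation", ["politician", "economist"])]`, this reproduces the encoded string shown above.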
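The multiplicative attention used by the baseline can be sketched in NumPy as Luong-style "general" attention: score each encoder state against the current decoder state through a learned matrix, softmax the scores, and take the weighted sum as the context vector. This is a minimal sketch of the mechanism, not the OpenNMT-py implementation itself:

```python
import numpy as np

def multiplicative_attention(decoder_state, encoder_states, W):
    """Luong-style multiplicative attention.
    decoder_state: (h_dec,), encoder_states: (T, h_enc), W: (h_enc, h_dec).
    score_i = h_i^T W s; weights = softmax(scores); context = weights @ H."""
    scores = encoder_states @ (W @ decoder_state)    # (T,)
    scores -= scores.max()                           # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over time steps
    context = weights @ encoder_states               # (h_enc,)
    return weights, context
```

The weights sum to one over the encoder time steps, and the context vector is fed to the decoder when predicting the next post-modifier token.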