Unexpected news events such as natural disasters represent a unique information access problem where the performance of traditional approaches deteriorates. For example, immediately after an event, the corpus may be sparsely populated with relevant content. Even when, after a few hours, relevant content is available, it is often inaccurate or highly redundant. At the same time, crisis events demonstrate a scenario where users urgently need information, especially if they are directly affected by the event.
The goal of this track is to develop systems which allow users to efficiently monitor the information associated with an event over time. Specifically, we are interested in developing systems which
For each event, a system will traverse the input stream of documents from the event onset time, t0, until some fixed period afterward, tT. Throughout this simulation, the system will emit short timestamped text summaries whenever appropriate. At the end of the simulation, the system will have produced a list of tuples, (ti, ui), where ti is the timestamp of the ith text update, ui. The content of ui can be either extracted from documents in the stream or generated by the system.
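The simulation described above can be sketched roughly as follows. This is only an illustrative sketch, not the track's reference implementation: the `Document` type, the `simulate` function, the inclusive treatment of the window end, and the word-novelty rule used to decide when to emit an update are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Document:
    """Hypothetical stream item: an arrival time and its sentences."""
    timestamp: float
    sentences: List[str]


def simulate(stream: Iterable[Document], t0: float, tT: float,
             min_new_words: int = 3) -> List[Tuple[float, str]]:
    """Traverse the stream from t0 to tT and return (timestamp, update) tuples.

    The emission rule is an assumption: a sentence is emitted when it
    contains at least `min_new_words` words not seen in earlier updates.
    """
    seen_words = set()
    output = []
    # Process documents in arrival order, as a live system would.
    for doc in sorted(stream, key=lambda d: d.timestamp):
        # Ignore anything outside the simulation window [t0, tT].
        if doc.timestamp < t0 or doc.timestamp > tT:
            continue
        for sentence in doc.sentences:
            words = set(sentence.lower().split())
            if len(words - seen_words) >= min_new_words:
                # Extractive update, timestamped with the document's time.
                output.append((doc.timestamp, sentence))
                seen_words |= words
    return output
```

Here every update is extracted verbatim from the stream; a system that generates its own text would replace the inner loop while still returning the same list of tuples.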
In order to evaluate a system's simulation output, O, we need a set of all possible relevant sub-events, each annotated with the time at which it occurred. This set can be derived by retrospective analysis of the event using some manual editorial process. This is the approach taken in the GALE distillation and NTCIR 1CLICK evaluations. In addition to purely manual evaluation, we will consider a semi-automatic nugget-based evaluation, in which deciding which nuggets (updates) match which system outputs or documents can be done mostly automatically. System performance will be aggregated over a set of events, each with a separate system output and relevant set of sub-events. We currently plan to use the Wikipedia edit history for Current Events, since it clearly defines a target audience and provides a great deal of manual extraction and summarization work that is updated frequently as current events unfold and news becomes available. Assessors will remove unnecessary components and aid in fact extraction based upon the edit history.
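To make the nugget-matching idea concrete, a minimal sketch follows. It does not reproduce the track's actual matching procedure or metrics: the token-overlap rule, the 0.5 threshold, and the function names `match_nuggets`, `nugget_recall`, and `mean_latency` are assumptions chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Both system updates and nuggets are represented as (timestamp, text).
Timestamped = Tuple[float, str]


def tokens(text: str) -> set:
    """Lowercased whitespace tokens; a deliberately crude assumption."""
    return set(text.lower().split())


def match_nuggets(updates: List[Timestamped], nuggets: List[Timestamped],
                  threshold: float = 0.5) -> Dict[int, float]:
    """Map each matched nugget index to the time of its earliest matching update.

    An update matches a nugget when it covers at least `threshold`
    of the nugget's tokens (hypothetical matching rule).
    """
    matches = {}
    ordered = sorted(updates)  # earliest updates first
    for j, (_, nugget_text) in enumerate(nuggets):
        nugget_tokens = tokens(nugget_text)
        if not nugget_tokens:
            continue
        for t, u in ordered:
            overlap = len(nugget_tokens & tokens(u)) / len(nugget_tokens)
            if overlap >= threshold:
                matches[j] = t
                break
    return matches


def nugget_recall(matches: Dict[int, float], nuggets: List[Timestamped]) -> float:
    """Fraction of nuggets covered by at least one update."""
    return len(matches) / len(nuggets) if nuggets else 0.0


def mean_latency(matches: Dict[int, float],
                 nuggets: List[Timestamped]) -> Optional[float]:
    """Average delay between a nugget's time and its first matching update."""
    if not matches:
        return None
    return sum(matches[j] - nuggets[j][0] for j in matches) / len(matches)
```

Scores computed this way for a single event could then be averaged over the full set of events, with assessors reviewing any matches the automatic step gets wrong.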