This is the first pilot year of the ShARe/CLEF eHealth Evaluation Lab, a shared task focused on natural language processing (NLP) and information retrieval (IR) for clinical care. The goal is to present the community with highly relevant eHealth tasks that will be further defined through the pilot edition. In the subsequent shared task edition (ShARe/CLEF eHealth 2014 Shared Task), the tasks will be expanded with additional data and layers of annotations.
Obtaining the data requires several preparatory steps; please see the Datasets tab to learn how to obtain the data.
The vision is two-fold: (1) to develop tasks that potentially impact patient understanding of medical information and (2) to provide the community with an increasingly sophisticated dataset of clinical narrative to advance the state of the art in Natural Language Processing, Information Extraction and Information Retrieval in healthcare. The tasks and annotations are aligned with those in the general NLP domain (such as CoNLL shared tasks and TREC evaluations). As such, an effort is made to stay close to community-adopted conventions and standards while still capturing the uniqueness of the clinical narrative.
Tasks 1 and 2 involve annotation of entities in a set of narrative clinical reports; Task 3 involves retrieval of web pages based on queries generated when reading the clinical reports.
The tasks will operate by distributing the training materials to registered task participants. Training and testing materials include:
Task 1
- Training: 200 clinical reports with standoff annotations of disorder mention spans and UMLS concept unique identifiers (CUIs)
- Test: 100 clinical reports
Task 2
- Training: 200 clinical reports with standoff annotations of acronym/abbreviation mention spans and UMLS CUIs
- Test: 100 clinical reports with standoff annotations of acronym/abbreviation mention spans
Task 3
- Training: a collection of medically related web documents, 5 development queries, and a result set of web documents
- Test: 50 test queries
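For readers unfamiliar with the term, "standoff" annotations are stored separately from the report text and point into it by character offsets. The Python sketch below only illustrates that idea. The class name, field names, sample sentence, and CUI values are hypothetical placeholders and do not reflect the actual ShARe file format, which is described in the materials on the Datasets tab.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandoffAnnotation:
    """One mention, kept apart from the report text it points into.

    Hypothetical structure for illustration; not the ShARe file format.
    """
    start: int  # character offset of the first character of the mention
    end: int    # character offset one past the last character of the mention
    cui: str    # UMLS concept unique identifier (placeholder values below)


def mention_text(report: str, ann: StandoffAnnotation) -> str:
    """Recover the annotated span from the unmodified report text."""
    return report[ann.start:ann.end]


# Invented sample sentence; real reports are obtained via the Datasets tab.
report = "Patient denies chest pain but reports shortness of breath."

# Placeholder CUIs, not real UMLS identifiers.
annotations = [
    StandoffAnnotation(start=15, end=25, cui="C0000001"),
    StandoffAnnotation(start=38, end=57, cui="C0000002"),
]

for ann in annotations:
    print(ann.cui, repr(mention_text(report, ann)))
```

Because the report text is never modified, several annotation layers (for example disorder mentions and acronym/abbreviation mentions) can refer to the same document independently.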
Participants will have approximately one month to explore the training materials and develop automated techniques, after which test materials for the task will be released. After the test materials are released, no further technique development should take place. Evaluation results for the participant submissions will be distributed to the participants before the CLEF eHealth Workshop. See the Timeline page for specific dates.
This shared task has been supported in part by
- the Shared Annotated Resources (ShARe) project funded by the United States National Institutes of Health (R01GM090187). The annotation guidelines and schema for the ShARe/CLEF eHealth 2013 shared task are available to the research community. When using the annotation guidelines in your own research and publications, please cite:
- The ShARe Schema for the Syntactic and Semantic Annotation of Clinical Texts. Noémie Elhadad, Wendy Chapman, Tim O’Gorman, Martha Palmer, Guergana Savova. Under Review.
- the CLEF Initiative (Conference and Labs of the Evaluation Forum, formerly known as Cross-Language Evaluation Forum)
- NICTA, funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy and the Australian Research Council through the ICT Centre of Excellence program.
- the Khresmoi project, funded by the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 257528.
- the Office of the National Coordinator for Health Information Technology, SHARP 90TR0002.
- VA CHIR