Chemotherapy Treatment Timelines Extraction
from the Clinical Narrative:
Resources
This section links to resources that may be useful to participants in the shared task.
Baseline System
The organizers provide a Docker version of a baseline system for participants to use in the shared task. It is an end-to-end system and can therefore be used directly as a baseline for Subtask 2. To use it as a baseline for Subtask 1, the system needs to be modified to take gold events and time expressions as its starting input.
The baseline system and its details are available here: End-to-end Baseline System
Clinical TempEval Tasks
Other Useful Resources
THYME annotation guidelines for pairwise temporal relations (2014). The addition of NOTED-ON and the refinement of the 2014 annotation guidelines are described in Lin, Wright-Bettner et al., 2020.
Apache Clinical Text Analysis and Knowledge Extraction System (cTAKES) is a natural language processing system for extracting information from clinical free text in electronic medical records. Source download
Cancer Deep Phenotype Extraction from Electronic Medical Records (DeepPhe) is a natural language processing system for extracting cancer phenotypes from clinical records
DeepPhe-CR (DeepPhe for Cancer Registries) is a version of DeepPhe specifically tailored to the needs of cancer registry abstraction
EntityBERT: BERT-based Models Pretrained on MIMIC-III with or without Entity-centric Masking Strategy for the Clinical Domain -- the foundation model the shared task organizers use in their system. Paper describing the model
TimeNorm: Text to time expression with the neural parser -- a character-based recurrent neural network for finding and normalizing time expressions
xmltodict: a library available on PyPI for transforming XML documents into Python dictionary format
Tutorial:
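import json
import xmltodict

xml_path = "<xml path>"  # replace with the path to the input XML file
json_path = "<path to save json>"  # replace with the output JSON path

with open(xml_path, "r") as fp:
    obj = xmltodict.parse(fp.read())  # parse the XML into a nested dict
with open(json_path, "w") as fp:
    json.dump(obj, fp, indent=2)  # write the dict out as indented JSON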
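If adding a dependency is not an option, a similar XML-to-JSON conversion can be sketched with Python's standard library alone. This is a minimal sketch, not xmltodict's behavior: it uses xml.etree.ElementTree, its output layout differs from what xmltodict produces, and the element names in the sample document (data, entity, type, id) are hypothetical and do not reflect the shared task's actual schema.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample document; not the shared task's actual schema.
SAMPLE_XML = """<data>
  <entity id="1"><type>EVENT</type></entity>
  <entity id="2"><type>TIMEX3</type></entity>
</data>"""


def element_to_dict(elem):
    """Convert an Element into a plain dict of tag, attributes, text, and children."""
    node = {"tag": elem.tag}
    if elem.attrib:
        node["attributes"] = dict(elem.attrib)
    text = (elem.text or "").strip()
    if text:
        node["text"] = text
    children = [element_to_dict(child) for child in elem]
    if children:
        # Children are always kept in a list, even when there is only one.
        node["children"] = children
    return node


def xml_to_json(xml_string, indent=2):
    """Parse an XML string and return its JSON serialization."""
    root = ET.fromstring(xml_string)
    return json.dumps(element_to_dict(root), indent=indent)


if __name__ == "__main__":
    print(xml_to_json(SAMPLE_XML))
```

Keeping child elements in a list regardless of how many there are means downstream code can iterate over them without first checking whether a given element occurred once or several times.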