Data:

Interpretable dataset

The core of the data is based on a real doctor's diagnostic process.

This task provides participants with two types of data.

1. Chest radiography (image)
2. Report (text)
    1. Interpretable report of diagnosis process (for input)
    2. Interpretive report (for evaluation)

All report data were generated with GPT-4 from the answers to questions asked according to the diagnostic procedure. The interpretable report of diagnosis process consists of information from QA (question and answer) 1-4. The interpretive report consists of information from QA 1-5. The information covered by each QA is described below.
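
As a rough illustration of how the two reports relate to the QA steps, the sketch below models one sample in Python. The class name `Sample`, its field names, and the helper `qa_subset` are hypothetical and are not part of the released data format. The only facts taken from the description are that the input report draws on QA 1-4 and the evaluation report on QA 1-5.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable

# QA numbers feeding each report, as stated in the description.
INPUT_REPORT_QA = (1, 2, 3, 4)           # interpretable report of diagnosis process
EVALUATION_REPORT_QA = (1, 2, 3, 4, 5)   # interpretive report


@dataclass
class Sample:
    """Hypothetical container for one sample (all field names are assumptions)."""
    image_path: str                                            # chest radiography image
    qa_answers: Dict[int, str] = field(default_factory=dict)   # QA number -> answer text

    def qa_subset(self, qa_numbers: Iterable[int]) -> Dict[int, str]:
        """Return the answers for the requested QA numbers, in the order given."""
        return {n: self.qa_answers[n] for n in qa_numbers if n in self.qa_answers}


sample = Sample(
    image_path="example.jpg",                                  # placeholder path
    qa_answers={n: f"answer to Q{n}" for n in range(1, 6)},    # placeholder text
)
input_info = sample.qa_subset(INPUT_REPORT_QA)
evaluation_info = sample.qa_subset(EVALUATION_REPORT_QA)
```

In this sketch the evaluation report differs from the input report only by the Q5 information, which mirrors the QA 1-4 versus QA 1-5 split described above.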

Interpretable dataset annotation procedure

This dataset includes anatomical information and causative outcomes of lesions based on text from the licensing test.

The data were annotated in QA form, capturing the doctor's interpretation at each stage of the diagnostic procedure. The annotation procedure is shown in the picture above. It consists of five questions that mimic the process a real doctor follows when making a diagnosis from a radiograph.

Step 1 - Q1: Form a first impression of the image for the assumed disease.
Step 2 - Q2: Track the anatomical location of the abnormal finding.
Step 3 - Q3: Track the thoracic spine level of the abnormal finding.
Step 4 - Q4: Make a diagnosis from the image.
Step 5 - Q5: Confirm the checklist of causes for the diagnosis.
Step 6 - Q4: Correct the diagnosis after confirming the cause. (The checklist has 28 items per diagnosis.)
Step 7 - Review: Cross-review (compare the MIMIC-CXR report with the annotation).
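
The procedure can also be written down as an ordered table, which makes it easy to see that Q4 is visited twice (once to make the diagnosis and once to correct it). The following is a minimal sketch. The names `Step`, `ANNOTATION_STEPS`, and `steps_for_question` are hypothetical, and the descriptions are paraphrased from the steps listed here rather than taken from any released schema.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    number: int
    question: Optional[str]   # None for the review step, which has no question
    description: str


# Ordered annotation procedure, paraphrased from the step list.
ANNOTATION_STEPS = [
    Step(1, "Q1", "first impression of the image for the assumed disease"),
    Step(2, "Q2", "anatomical location of the abnormal finding"),
    Step(3, "Q3", "thoracic spine level of the abnormal finding"),
    Step(4, "Q4", "diagnosis from the image"),
    Step(5, "Q5", "checklist confirmation of the cause of the diagnosis"),
    Step(6, "Q4", "corrected diagnosis after cause confirmation"),
    Step(7, None, "cross-review against the MIMIC-CXR report"),
]


def steps_for_question(question: str) -> List[int]:
    """Return the step numbers at which a given question is answered or revised."""
    return [s.number for s in ANNOTATION_STEPS if s.question == question]


q4_steps = steps_for_question("Q4")
```

Here `q4_steps` holds the two steps that touch the diagnosis, so a consumer of the annotations would presumably treat the later one as the final answer; that reading is an assumption, not something the description states.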

Comparison with related datasets

No related dataset contains the causes and consequences of lesions.
This dataset was created by augmenting MIMIC-CXR.