Dataset

Hidden-RAD2 provides task-specific datasets built from chest X-ray cases in MIMIC-CXR and, potentially, IU-Xray or other suitable sources.

The source dataset may differ across releases or cases. Participants must comply with the access, licensing, and citation requirements of each source dataset.

Data Repositories

• Task 1: https://github.com/hidden-rad2/Task1

• Task 2: https://github.com/hidden-rad2/Task2

• Hidden-RAD2 organization: https://github.com/hidden-rad2

Task-specific files, schemas, release notes, and submission instructions will be published in the corresponding repositories.

Task 1 Dataset

Task 1 is designed to reproduce the structured reasoning process used by radiologists when interpreting chest X-ray images.

Each training case contains a chest X-ray image and a radiologist-validated structured interpretation consisting of:

• A1 — Initial impressions

• A2 — Thoracic level or lung zone

• A3 — Anatomical location

• A4 — Final impressions

• A5 — ABCDE-based confirmation checklist

For the test data, participants are given chest X-ray images and must generate the corresponding A1–A5 structured interpretation.
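
As an illustration of the A1–A5 structure, a submission pipeline might hold each interpretation in a small record and check that no component is left empty before writing output. The sketch below is not the official schema: the class name, field names, and the use of plain strings are all assumptions, and the actual file format is whatever the Task 1 repository publishes.

```python
from dataclasses import dataclass, fields


@dataclass
class Task1Interpretation:
    """Hypothetical container for one A1-A5 structured interpretation.

    Field names and string types are assumptions, not the released schema.
    """
    a1_initial_impressions: str
    a2_level_or_zone: str          # thoracic level or lung zone
    a3_anatomical_location: str
    a4_final_impressions: str
    a5_abcde_checklist: str        # ABCDE-based confirmation checklist


def missing_components(interp: Task1Interpretation) -> list:
    """Return the names of components that are empty or whitespace-only."""
    return [f.name for f in fields(interp) if not getattr(interp, f.name).strip()]
```

Running a check like `missing_components` on every generated case is one way to catch incomplete A1–A5 outputs before submission.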

Cases may be obtained from MIMIC-CXR, IU-Xray, or other suitable chest X-ray datasets. The source dataset and applicable access requirements will be specified for each release.

For further information, see the Task 1 Definition page.

Task 2 Dataset

Each Task 2 case includes:

• a causal exploration section to be verified;

• the original radiology report; and

• the corresponding chest X-ray image.

Each case starts from a clinically validated gold causal explanation. Controlled hallucinations are introduced into selected explanations to create invalid examples; the remaining explanations are kept valid, with no inserted errors. Participants must therefore not assume that every causal explanation is invalid.

Task 2 labels include:

• validity;

• error spans;

• hallucination types; and

• corrected causal explanations.
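
The four label types listed above can be pictured as one record per case. The following is a minimal sketch under stated assumptions: the class and field names are hypothetical, error spans are assumed to be character offsets `(start, end)` into the explanation text, and the rule that a valid explanation carries no error spans is inferred from the description above rather than taken from a published schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Task2Label:
    """Hypothetical label record; the released schema may differ."""
    is_valid: bool
    # Assumed representation: half-open character offsets into the explanation.
    error_spans: List[Tuple[int, int]] = field(default_factory=list)
    hallucination_types: List[str] = field(default_factory=list)
    corrected_explanation: Optional[str] = None


def check_label(label: Task2Label, explanation: str) -> List[str]:
    """Return a list of consistency problems (empty if none are found)."""
    problems = []
    if label.is_valid and label.error_spans:
        problems.append("valid explanation has error spans")
    if not label.is_valid and not label.error_spans:
        problems.append("invalid explanation has no error spans")
    for start, end in label.error_spans:
        # Each span must lie inside the explanation text and be non-empty.
        if not (0 <= start < end <= len(explanation)):
            problems.append(f"span ({start}, {end}) out of range")
    return problems
```

A check of this kind applies equally to system predictions, which, like the gold labels, should not mark every explanation as invalid.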

All authorized participants receive the report and image resources available for each case. Participants may freely decide whether a submitted system uses the report only, the image only, or both.

Each submitted run must identify the evidence actually used:

• R — Report-only

• I — Image-only

• RI — Report-and-Image

These evidence-use categories describe the inputs used by a submitted run. They are not separate subtasks, and participants do not need to select a category in advance.
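
Because the category is determined by what a run actually used, it can be derived from the run's inputs rather than chosen up front. A minimal sketch, assuming a hypothetical helper name and two boolean flags that are not part of any official submission format:

```python
def evidence_category(uses_report: bool, uses_image: bool) -> str:
    """Map the inputs a run actually used to its evidence-use code.

    Returns "R" (report-only), "I" (image-only), or "RI" (report-and-image).
    The function name and boolean flags are illustrative assumptions.
    """
    if uses_report and uses_image:
        return "RI"
    if uses_report:
        return "R"
    if uses_image:
        return "I"
    raise ValueError("a run must use the report, the image, or both")
```

For example, `evidence_category(True, False)` returns `"R"` for a run that reads only the report.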

For further information, see the Task 2 Definition page.

Annotation and Quality Control

Task 1 structured interpretations, Task 2 gold explanations, and controlled hallucinations are reviewed using structured radiology guidelines.

Quality control examines:

• clinical validity;

• consistency with the available evidence;

• anatomical and diagnostic accuracy;

• completeness of supporting evidence;

• appropriateness of the hallucination type;

• accuracy and safety of the correction; and

• whether the correction introduces any new unsupported claims.

Data Access and Licensing

Access and licensing requirements vary by source dataset. Participants using restricted data must obtain the required authorization.

For details, see the Data Access and Licensing page for MIMIC-CXR.

Staged Dataset Release Schedule

• Release 1 — July 13, 2026

• Release 2 — July 20, 2026

• Release 3 — July 25, 2026

• Formal-run test data — July 30, 2026

Task-specific files, release notes, and schedule changes will be announced through the corresponding GitHub repository.
