Multi-rater datasets
The multi-rater datasets were collected to measure interrater variability among pathologists, to evaluate the accuracy of non-pathologists, and to measure the bias introduced by showing algorithm-generated suggestions during annotation. The same fields of view (FOVs) were annotated in each multi-rater dataset. Scroll down to download.
Experimental setup
Participants annotated independently of each other. For the evaluation set, participants were shown higher-quality suggestions; for the bootstrap control, they were shown lower-quality suggestions. For the unbiased control, participants were not shown any suggestions.
Label & truth inference process
A constrained clustering process was used to obtain the potential nuclear locations from the multi-rater annotations. An Expectation-Maximization statistical framework was then used to aggregate the participants' opinions about each nucleus, taking participant reliability into account. The aggregate of non-pathologist opinions is called the inferred NP-label; the aggregate of pathologist opinions is called the inferred P-truth.
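To make the aggregation step concrete, here is a minimal sketch of reliability-weighted label inference with Expectation-Maximization. It is not the implementation used to produce these datasets. It assumes a simplified "one-coin" model in which each participant has a single accuracy value, and it assumes the clustering step has already matched every annotation to a nucleus. The function name `aggregate_votes`, the participant IDs, the nucleus IDs, and the class names are all hypothetical.

```python
import math
from collections import defaultdict


def aggregate_votes(votes, classes, n_iter=20):
    """Infer one label per nucleus from several raters (illustrative sketch).

    votes:   dict nucleus_id -> dict rater_id -> label (hypothetical layout)
    classes: list of at least two possible labels
    Returns (inferred label per nucleus, estimated reliability per rater).
    """
    k = len(classes)

    # Initialise the posteriors with normalised vote counts (soft majority vote).
    post = {}
    for item, by_rater in votes.items():
        counts = {c: 0.0 for c in classes}
        for label in by_rater.values():
            counts[label] += 1.0
        total = sum(counts.values())
        post[item] = {c: counts[c] / total for c in classes}

    reliability = {}
    for _ in range(n_iter):
        # M-step: a rater's reliability is the expected fraction of their
        # votes that agree with the current belief about the true label.
        agree = defaultdict(float)
        seen = defaultdict(int)
        for item, by_rater in votes.items():
            for rater, label in by_rater.items():
                agree[rater] += post[item][label]
                seen[rater] += 1
        # Clip away from 0 and 1 so the logarithms below stay finite.
        reliability = {
            r: min(max(agree[r] / seen[r], 1e-3), 1 - 1e-3) for r in seen
        }
        prior = {c: sum(p[c] for p in post.values()) / len(post) for c in classes}

        # E-step: recompute each nucleus's posterior over classes, weighting
        # every vote by the reliability of the rater who cast it.
        for item, by_rater in votes.items():
            logp = {}
            for c in classes:
                lp = math.log(max(prior[c], 1e-9))
                for rater, label in by_rater.items():
                    r = reliability[rater]
                    lp += math.log(r if label == c else (1 - r) / (k - 1))
                logp[c] = lp
            top = max(logp.values())
            unnorm = {c: math.exp(logp[c] - top) for c in classes}
            z = sum(unnorm.values())
            post[item] = {c: unnorm[c] / z for c in classes}

    inferred = {item: max(p, key=p.get) for item, p in post.items()}
    return inferred, reliability


# Toy example: rater "C" disagrees with "A" and "B" on two of four nuclei.
votes = {
    "n1": {"A": "tumor", "B": "tumor", "C": "stroma"},
    "n2": {"A": "stroma", "B": "stroma", "C": "tumor"},
    "n3": {"A": "tumor", "B": "tumor", "C": "tumor"},
    "n4": {"A": "stroma", "B": "stroma", "C": "stroma"},
}
inferred, reliability = aggregate_votes(votes, ["tumor", "stroma"])
print(inferred)
print(reliability)
```

In this toy example the two raters who agree with each other end up with a higher estimated reliability than the dissenting rater, so their votes carry more weight than they would under a plain majority vote. A fuller model could estimate a per-class confusion matrix for each participant instead of a single accuracy value.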
Evaluation dataset
> Click here to download the raw data (each annotator independently).
> Click here to download the inferred NP-labels.
> Click here to download the inferred P-truth.
40,028 annotations | 1,358 unique nuclei | 530 boundaries
Bootstrap control dataset
> Click here to download the raw data (each annotator independently).
> Click here to download the inferred NP-labels.
> Click here to download the inferred P-truth.
19,881 annotations | 1,349 unique nuclei | 148 boundaries
Unbiased control dataset
> Click here to download the raw data (each annotator independently).
> Click here to download the inferred NP-labels.
> Click here to download the inferred P-truth.
37,434 annotations | 1,569 unique nuclei | 0 boundaries*
* By definition, we did not show participants any algorithmic suggestions in this control experiment. However, we did ask one practicing pathologist (SP.3) to manually trace all boundaries. All nuclear boundaries in FOVs prefixed by "SP.3_#_U-control_#_" are manually traced (1,223 boundaries).