Overview of image annotation levels used in digital pathology. Taken from Lee et al. 2021 Front. Artif. Intell.
Building a dataset for training machine learning models can be arduous. The time required to manually produce accurate "ground truth" labels increases drastically as the annotations become more fine-grained. Consider the time it would take to add a single tag to a whole slide image compared with how long it would take to segment and label each tumor in that slide (this is what we did to train GLASS-AI). Building a machine learning model using individual cell annotations would require annotating every cell in a single image, a nearly Sisyphean task. Several programs can segment individual cells automatically, but each of those thousands of tiny shapes still needs to be labeled.
Example images from an H&E-stained lung tumor and two adjacent slides stained for different molecular markers (top). Computational cell segmentation and signal extraction for data set generation (bottom).
While manually labeling cells by type is a daunting task, we routinely identify cell types in the lab using immunohistochemistry (IHC) for cell-specific markers. We reasoned that we could extract the signal from these stained slides to compute labels for the individual cells. We used IHC-stained slides that were digitally aligned to an adjacent hematoxylin & eosin (H&E)-stained slide to segment and label cells within the H&E-stained slide. This approach can also be applied to serially stained sections to increase cell-to-cell registration accuracy.
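To make the idea of computing labels from stain signal concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the published pipeline: it assumes the cells have already been segmented into an integer mask, that the marker signal has already been extracted and aligned to the same pixel grid, and that a cell can be called marker-positive by thresholding its mean signal. The function name `label_cells_from_marker`, the toy arrays, and the threshold of 0.5 are all hypothetical.

```python
import numpy as np


def label_cells_from_marker(cell_mask, marker_signal, threshold):
    """Assign a positive/negative marker label to each segmented cell.

    cell_mask     : 2D integer array; 0 is background, k > 0 is the ID of cell k.
    marker_signal : 2D float array of the same shape holding the aligned
                    stain intensity (assumed already extracted from the IHC slide).
    threshold     : mean intensity at or above which a cell is called positive
                    (an assumed, user-chosen cutoff).
    Returns a dict mapping cell ID -> True (positive) or False (negative).
    """
    if cell_mask.shape != marker_signal.shape:
        raise ValueError("mask and signal must be aligned to the same grid")

    ids = cell_mask.ravel()
    n_ids = int(ids.max()) + 1

    # Sum of signal and number of pixels for every cell ID in one pass.
    signal_sums = np.bincount(ids, weights=marker_signal.ravel(), minlength=n_ids)
    pixel_counts = np.bincount(ids, minlength=n_ids)

    labels = {}
    for cell_id in range(1, n_ids):  # skip 0, the background
        if pixel_counts[cell_id] == 0:
            continue  # ID not present in the mask
        mean_signal = signal_sums[cell_id] / pixel_counts[cell_id]
        labels[cell_id] = bool(mean_signal >= threshold)
    return labels


# Toy example: two 2x2 "cells" on a 4x4 grid.
example_mask = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 2],
    [0, 0, 2, 2],
])
example_signal = np.array([
    [0.9, 0.8, 0.0, 0.0],
    [0.7, 0.8, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.2],
    [0.0, 0.0, 0.1, 0.0],
])
example_labels = label_cells_from_marker(example_mask, example_signal, threshold=0.5)
print(example_labels)  # {1: True, 2: False}
```

In this toy case, cell 1 has a mean signal of 0.8 and is labeled positive, while cell 2 has a mean of 0.1 and is labeled negative. With several markers, the same function could be run once per aligned stain to build a multi-class label for each cell.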
An example machine learning model trained using automated cell-level annotations. Taken from Lee et al. 2021 Front. Artif. Intell.
Using the generated cell-level labels, we can train a machine learning model to segment and classify individual cells in tissue slides. The ability to generate training data for both abundant and rare cell types in a section is a tremendous advantage when training new models.
A general description of this approach was published in Lee et al. 2021 Front. Artif. Intell.