Cell Detection & Counting

Goal

We need to detect and count cells from 500+ images (~2,000 x 2,000 pixels each) of immunofluorescent tissue.

Intro

Some preliminary work on detecting cells in images of immunoreactive tissue showed us that we can get good detections by training a network, such as YOLOv5, with a small dataset, namely a single large image of ~10,000 x 10,000 pixels at ~1 µm/pixel resolution. Since we are starting this project without any manual cell markings, we thought we could accelerate the acquisition of cell positions by obtaining rough cell detections with a network trained on a small dataset, manually correcting the detections, and retraining the model with the corrected detections. Further, given that the network can learn from a small dataset, we may only have to retrain with manual corrections of a subset of the 500+ images. Then, if the retrained network performs well on the next subset of images, we can deploy it on the remaining dataset.

Training Data

We used a total of 130 labeled images of fluorescent tissue, each stained for ChAT, DBH, or pERK.

Pre-Process

Since we are interested in capturing cells regardless of their color, we converted all images (height x width x RGB) to grayscale (height x width). Here we considered a few grayscaling methods: averaging across the RGB dimension, selecting the visually-relevant RGB channel, and obtaining the maximum pixel intensity across the RGB dimension.

Figure: a visually green example image converted to grayscale by averaging across channels, by selecting the green channel, and by taking the maximum intensity across channels.

The images in our dataset are pseudocolored from an original signal. The color itself is not a feature we care about, so we can convert the images to grayscale, which preserves signal intensity. Averaging across the RGB dimension is an option, but it results in a dim grayscale image: when the image is predominantly green, the blue and red channels are mostly zeros (black), so the green signal is averaged with low intensities and the overall signal is reduced. The other two options are using the visually-relevant color channel or taking the maximum intensity across color channels. Both produce the same grayscale image, which tells us that the 'green' channel in the images above contains the maximum intensity among the three color channels, and either method is correct. However, since we have a large number of images in different colors, we opted for the method that extracts the largest pixel intensity across the red, green, and blue channels of each image. Selecting the maximum value gives the same result as extracting the visually-relevant channel, and we don't have to specify which channel to extract for every image in our dataset. This will also be useful during deployment, because we can apply the same grayscale transformation to all images regardless of color. Looking ahead, this method should also work when detecting cells in a merged image where all RGB channels matter rather than a single one.
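A minimal sketch of this conversion, assuming images load as standard RGB files (the helper and file names below are our own illustration, not the project's code):

```python
# Collapse a pseudocolored RGB image to grayscale by taking the maximum
# pixel intensity across the R, G, and B channels.
import numpy as np
from PIL import Image

def to_grayscale_max(path: str) -> np.ndarray:
    rgb = np.asarray(Image.open(path).convert("RGB"))  # height x width x 3
    return rgb.max(axis=2).astype(np.uint8)            # height x width

# The same call works for green, red, or merged images.
gray = to_grayscale_max("section_042_ChAT.png")        # hypothetical file name
Image.fromarray(gray).save("section_042_ChAT_gray.png")
```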

Data Split for Training

We randomly sampled non-overlapping 256 x 256 image patches from all 130 images with a training-to-validation ratio of 2:1. This resulted in a training set of 12,179 image patches and a validation set of 5,985 image patches.

Example of sampling scheme per image. Darker image patches were used for training and lighter patches were used for validation.
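A minimal sketch of this sampling scheme, assuming a grayscale image array and a simple random shuffle of the non-overlapping patch grid (the helper name and seed are illustrative):

```python
import random
import numpy as np

def sample_patches(gray: np.ndarray, size: int = 256, seed: int = 0):
    h, w = gray.shape
    # Top-left corners of every non-overlapping patch that fits in the image.
    corners = [(r, c) for r in range(0, h - size + 1, size)
                      for c in range(0, w - size + 1, size)]
    random.Random(seed).shuffle(corners)
    n_train = round(len(corners) * 2 / 3)  # training-to-validation ratio of 2:1
    train = [gray[r:r + size, c:c + size] for r, c in corners[:n_train]]
    val = [gray[r:r + size, c:c + size] for r, c in corners[n_train:]]
    return train, val

# Example with a stand-in 2,000 x 2,000 image.
train, val = sample_patches(np.zeros((2000, 2000), dtype=np.uint8))
```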

Augmentation

To make the most of the 130 labeled images, we can generate additional training samples by applying different transformations to copies of the dataset. In this case, we altered the brightness.
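A sketch of the brightness augmentation on a grayscale patch; the scaling range below is an assumption, not the value used in the project:

```python
import numpy as np

def adjust_brightness(patch: np.ndarray, factor: float) -> np.ndarray:
    # Scale pixel intensities by `factor` and clip back to the uint8 range.
    return np.clip(patch.astype(np.float32) * factor, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(256, 256), dtype=np.uint8)  # stand-in patch
brighter = adjust_brightness(patch, rng.uniform(0.7, 1.3))     # assumed range
```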

We also kept the default augmentation transformations of the YOLOv5 framework:

hsv_h: 0.015

hsv_s: 0.7

hsv_v: 0.4

degrees: 0.0

translate: 0.1

scale: 0.5

shear: 0.0

perspective: 0.0

flipud: 0.0

fliplr: 0.5

mosaic: 1.0

mixup: 0.0

copy_paste: 0.0

Balancing

The number of negative image patches (empty or containing only fibers) greatly exceeds the number of positive patches (11,554 negative vs. 625 positive), which could bias the network toward outputting empty predictions.

Here we leverage the augmentation step and generate new image patches in proportions that yield a final training set with similar numbers of negative and positive patches. Our final training set contains 11,554 negative patches and 13,083 positive patches.
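As a rough back-of-the-envelope check of the balancing arithmetic (the exact oversampling used in the project differs slightly, since the reported final set has 13,083 positives):

```python
# How many augmented copies of each positive patch are needed so positives
# roughly match the 11,554 negatives, using the counts from the text.
import math

n_negative = 11_554   # empty or fiber-only patches
n_positive = 625      # patches containing cells

copies_per_positive = math.ceil(n_negative / n_positive) - 1
total_positive = n_positive * (copies_per_positive + 1)
print(copies_per_positive, total_positive)  # 18 extra copies -> 11,875 positives
```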

Vector Graphic Cell Positions to YOLOv5 Labels

The manual annotations of cell positions were exported from Adobe Illustrator in SVG format. We processed these in Python to generate the ground-truth label format required by the YOLOv5 framework. Code available here.
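The linked script is the authoritative version; the sketch below only illustrates the idea, assuming cell positions were exported as SVG circle elements and each cell gets a fixed-size box (the box size and file names are assumptions):

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"
BOX_PX = 24  # assumed bounding-box size around each marked cell, in pixels

def svg_to_yolo(svg_path: str, img_w: int, img_h: int, out_path: str) -> None:
    root = ET.parse(svg_path).getroot()
    lines = []
    for circle in root.iter(f"{SVG_NS}circle"):
        cx, cy = float(circle.get("cx")), float(circle.get("cy"))
        # YOLOv5 labels: "class x_center y_center width height", normalized to [0, 1].
        lines.append(f"0 {cx / img_w:.6f} {cy / img_h:.6f} "
                     f"{BOX_PX / img_w:.6f} {BOX_PX / img_h:.6f}")
    with open(out_path, "w") as f:
        f.write("\n".join(lines))

svg_to_yolo("section_042_ChAT.svg", img_w=2000, img_h=2000,
            out_path="section_042_ChAT.txt")  # hypothetical file names
```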

Training

Hyperparameters

Model: YOLOv5 (small) by Ultralytics

lr0: 0.01

lrf: 0.1

momentum: 0.937

weight_decay: 0.0005

warmup_epochs: 3.0

warmup_momentum: 0.8

warmup_bias_lr: 0.1

box: 0.05

cls: 0.5

cls_pw: 1.0

obj: 1.0

obj_pw: 1.0

iou_t: 0.2

anchor_t: 4.0

fl_gamma: 0.0
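For reference, YOLOv5 reads both the augmentation settings listed earlier and the training hyperparameters above from a single hyperparameter YAML passed to train.py. A sketch of assembling and using such a file (image size, batch size, and epoch count are assumptions, not the project's settings):

```python
import yaml

hyp = {
    "lr0": 0.01, "lrf": 0.1, "momentum": 0.937, "weight_decay": 0.0005,
    "warmup_epochs": 3.0, "warmup_momentum": 0.8, "warmup_bias_lr": 0.1,
    "box": 0.05, "cls": 0.5, "cls_pw": 1.0, "obj": 1.0, "obj_pw": 1.0,
    "iou_t": 0.2, "anchor_t": 4.0, "fl_gamma": 0.0,
    # augmentation settings from the earlier list
    "hsv_h": 0.015, "hsv_s": 0.7, "hsv_v": 0.4, "degrees": 0.0,
    "translate": 0.1, "scale": 0.5, "shear": 0.0, "perspective": 0.0,
    "flipud": 0.0, "fliplr": 0.5, "mosaic": 1.0, "mixup": 0.0, "copy_paste": 0.0,
}

with open("hyp.cells.yaml", "w") as f:  # hypothetical file name
    yaml.safe_dump(hyp, f)

# Then, inside the Ultralytics YOLOv5 repository (example settings):
# python train.py --img 256 --batch 64 --epochs 100 \
#     --data cells.yaml --weights yolov5s.pt --hyp hyp.cells.yaml
```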



Results

We monitored mean average precision (mAP) during training to assess learning. The model reached a mAP of 0.6621 in 37 steps, which is comparable to the scores reported by Ultralytics.

Deployment

Now that we have a trained network, we can use it to extract preliminary cell counts from all the images. For ease of use, we released the model in a web app with a graphical user interface. The web app can be downloaded from GitHub and installed on a personal computer; the code can be found here.
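Outside the web app, the trained weights can also be used directly from Python. A minimal sketch, assuming the weights file is available locally and images are tiled into 256 x 256 patches as during training (file names, the confidence threshold, and the simple tiling are our assumptions):

```python
import numpy as np
import torch
from PIL import Image

# Load the trained YOLOv5 weights through torch.hub ("best.pt" is a stand-in).
model = torch.hub.load("ultralytics/yolov5", "custom", path="best.pt")
model.conf = 0.25  # assumed confidence threshold

rgb = np.asarray(Image.open("section_042_ChAT.png").convert("RGB"))
gray = rgb.max(axis=2).astype(np.uint8)  # same max-intensity grayscale as training

size, detections = 256, 0
for r in range(0, gray.shape[0], size):
    for c in range(0, gray.shape[1], size):
        tile = gray[r:r + size, c:c + size]
        results = model(np.stack([tile] * 3, axis=-1))  # replicate to 3 channels
        detections += len(results.xyxy[0])              # boxes for this tile

print(f"Estimated cell count: {detections}")
```

Note that cells straddling tile borders may be counted twice with this naive tiling; it is only meant to illustrate how the model can be queried programmatically.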

Tabulating Counts

Need to include:

  • Nomenclature of Adobe Illustrator (AI) files for automated extraction of brain regions and cell positions

  • Adobe Illustrator script files

  • Python program to tabulate data