Shape Prior to Post-Process Model Predictions

Goal

Here we present a method to isolate regions in model predictions that meet certain criteria drawn from prior information: the region's size and the percent occupancy of intense pixels within it.

Intro

Our work on varying the input size of MA-Net showed that increasing the input size from 512^2 to 2048^2 pixels reduces the model's false-positive rate. The larger input effectively gives the model more spatial context during prediction. On our brain tissue dataset, the model is better able to differentiate tissues that have similar features but different neighboring structures. Despite this improvement, however, there are still regions MA-Net wrongly classifies as the target tissue.

Here we apply a post-processing step that isolates the square region of a model prediction that best matches the corresponding region on a reference brain atlas. The atlas gives us the expected dimensions of each brain region and its occupancy within its bounding box. Once the best-matching square region is determined, we use it to mask the model prediction and obtain a segmentation.

We then assess the improvement in the intersection-over-union (IoU) metric of the post-processed segmentations and discuss the reliability of the method.

Prior Information

The labeled data we used to train MA-Net was annotated by human experts who referenced Brain Maps 4.0 (BM4), a standardized rat brain atlas. In addition to delineating brain regions, the annotators assigned each image in our dataset its corresponding atlas level. We continue to use BM4 as the reference for setting the criteria for the best-matching region.

BM4 is a series of 73 vector graphic files, each representing one coronal section of the rat brain. In any given file, vector lines and polygons establish coordinate grids and delineate the brain regions at that level. We can use these elements to measure brain regions and so construct our region-size criterion and our class-to-background (occupancy) proportion.

Measurement Conversions

BM4 measurements are in units of points, which we treat as pixels. Moreover, the atlas is at a different scale than the images in our dataset:

  • A pixel length in the atlas equals ~17 micrometers*

  • A pixel length in our dataset equals 1 micrometer

Therefore, to determine the expected region size in an image of our dataset, we first convert pixel measurements from the atlas to micrometers and then back to pixels at the scale of our dataset. Combining it all in one equation, we get

(x1 px) • (17 µm/px) / (1 µm/px) = x2 px

where x1 and x2 are the lengths of a measurement on the atlas and on a dataset image, respectively.
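
To make the conversion concrete, here is a minimal Python sketch of the scaling step; the constants are the values stated above, and the function name is our own:

```python
# Scale constants taken from the text above.
ATLAS_UM_PER_PX = 17.0     # ~17 µm per atlas pixel (averaged grid spacing)
DATASET_UM_PER_PX = 1.0    # 1 µm per dataset-image pixel

def atlas_px_to_dataset_px(x1_px: float) -> float:
    """Atlas pixels -> micrometers -> dataset pixels (x2)."""
    return x1_px * ATLAS_UM_PER_PX / DATASET_UM_PER_PX

# For the per-axis conversions used in the Final Measurements section,
# replace ATLAS_UM_PER_PX with 1000/54.5 (height) or 1000/63.1 (width).
```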

Lastly, we obtained the occupancy measure of a region by creating a square with sides equal to the region's largest axis (height or width) and dividing the number of brain-region pixels by the total number of pixels in the square. Occupancies are reported as percentages.

*We obtained the value of 17 µm/pixel by averaging several grid spaces. The stereotaxic grid of BM4 is rectilinear rather than regular: grid cells are not square, and cell heights and widths vary from row to row and column to column. On average, a one-millimeter grid space spans 54.5 pixels in height and 63.1 pixels in width, so 1000 µm / ((54.5 px + 63.1 px)/2) = 17 µm/px. The grid is identical across all levels of the atlas, with some exceptions near its edges. We used the stereotaxic grid for this method; the atlas's physical coordinate grid is also rectilinear.
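
For reference, here is a sketch of the occupancy computation described above, assuming the atlas region has already been rasterized into a binary NumPy mask (the helper name is our own):

```python
import numpy as np

def occupancy_percent(region_mask: np.ndarray) -> float:
    """Occupancy of a region within a square whose side equals its largest axis."""
    rows, cols = np.nonzero(region_mask)
    height = rows.max() - rows.min() + 1          # bounding-box height
    width = cols.max() - cols.min() + 1           # bounding-box width
    side = max(height, width)                     # square side = largest axis
    return 100.0 * region_mask.sum() / side**2    # region pixels / square pixels
```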

Final Measurements

We obtained dimensions and occupancy for levels 21 and 25 of the atlas, which correspond to the two test images in our dataset. Here we report only the results for the Fornix.

Level 21

    • Atlas Height: 53.0184 px

      • Metric Equivalent:
        53.0184px • (1000µm/54.5px) = 972.8µm

    • Atlas Width: 62.309 px

      • Metric Equivalent:
        62.309px • (1000µm/63.1px) = 987.5µm

    • Occupancy: 37.52%

Level 25

    • Atlas Height: 20.3934 px

      • Metric Equivalent:
        20.3934px • (1000µm/54.5px) = 373.7µm

    • Atlas Width: 15.4565 px

      • Metric Equivalent:
        15.4565px • (1000µm/63.1px) = 244.9µm

    • Occupancy: 59.44%

Method

Now that we have some regional information from the reference atlas, we can impose these expectations on the model predictions of the Fornix for levels 21 and 25.

Determine Square Region Dimensions

As in the occupancy calculation, we set the expected region to be a square with sides equal to the largest axis of the reference brain region. Following this procedure, we obtain a region of (988µm)^2 for level 21 and (374µm)^2 for level 25.

Now we need to find a way to compute the occupancy for every possible square region in an image.

Integral Image

A brute-force approach to finding a region that satisfies certain criteria is to compute the similarity metric for the square window at the top-left corner of the image, shift the window by one pixel, compute the metric again, and repeat across the whole image. This sliding-window approach is inefficient because consecutive windows overlap heavily, so the same additions are performed over and over. To avoid this repeated work, we take advantage of the fact that the sum of any rectangular region can be obtained from just four values of the integral image of the original image, which is much faster.

Given an image I of size m by n, we compute its integral image S of size (m+1) by (n+1), where S[r,c] is the sum of all pixels of I above and to the left of position (r,c) (Viola & Jones, 2001). The sum over any window I[r1:r2, c1:c2] is then S[r2,c2] - S[r1,c2] - S[r2,c1] + S[r1,c1].
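
A minimal NumPy sketch of these two ideas, the padded integral image and the four-value window sum (the function names are our own):

```python
import numpy as np

def integral_image(image: np.ndarray) -> np.ndarray:
    """(m+1) x (n+1) integral image with a zero first row and column."""
    S = np.zeros((image.shape[0] + 1, image.shape[1] + 1), dtype=np.float64)
    S[1:, 1:] = image.cumsum(axis=0).cumsum(axis=1)
    return S

def window_sum(S: np.ndarray, r1: int, c1: int, r2: int, c2: int) -> float:
    """Sum of image[r1:r2, c1:c2] using four values of the integral image."""
    return S[r2, c2] - S[r1, c2] - S[r2, c1] + S[r1, c1]
```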

Occupancy Metric

With the integral image, we can find the sum inside any square window and divide it by the number of pixels in that window. Assuming the model predictions are close approximations to binary masks, this ratio gives an occupancy measure we can compare to the expected occupancy of the brain region. The best region is then the window in the model prediction whose occupancy is closest to the expected value.
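
Putting this together, here is a sketch of the exhaustive search over all square windows, folded into one vectorized function so it stands alone; expected_occupancy is a fraction (e.g. 0.3752 for 37.52%), and the prediction map is assumed to lie in [0, 1]:

```python
import numpy as np

def best_matching_square(prediction: np.ndarray, side: int, expected_occupancy: float):
    """Top-left corner of the square window whose occupancy is closest to the expected value."""
    S = np.zeros((prediction.shape[0] + 1, prediction.shape[1] + 1))
    S[1:, 1:] = prediction.cumsum(axis=0).cumsum(axis=1)
    # Window sums for every valid top-left corner, from four shifted views of S.
    sums = S[side:, side:] - S[:-side, side:] - S[side:, :-side] + S[:-side, :-side]
    occupancy = sums / side**2
    r, c = np.unravel_index(np.abs(occupancy - expected_occupancy).argmin(), occupancy.shape)
    return int(r), int(c)
```

For level 21, for example, this would be called with side=988 and expected_occupancy=0.3752.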

Masking & Thresholding

The final step involves masking out everything outside the best square region and then thresholding the post-processed model prediction to obtain the final segmentation.
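
A sketch of this step, assuming (r, c) is the top-left corner returned by the search above and a 0.5 threshold, as used in the results below:

```python
import numpy as np

def mask_and_threshold(prediction: np.ndarray, r: int, c: int, side: int,
                       threshold: float = 0.5) -> np.ndarray:
    """Zero everything outside the best square region, then binarize the prediction."""
    masked = np.zeros_like(prediction)
    masked[r:r + side, c:c + side] = prediction[r:r + side, c:c + side]
    return masked >= threshold
```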

Results

Here we show the best square regions detected for both test images. We also compare the intersection-over-union metric before and after post-processing is applied.
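
The IoU values reported below can be computed with a simple helper like this (our own sketch for two binary masks):

```python
import numpy as np

def iou(pred_mask: np.ndarray, gt_mask: np.ndarray) -> float:
    """Intersection over union of two binary masks."""
    intersection = np.logical_and(pred_mask, gt_mask).sum()
    union = np.logical_or(pred_mask, gt_mask).sum()
    return float(intersection) / float(union) if union > 0 else 0.0
```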

Best Matching Regions

Since we are looking for a region of (988µm)^2 for level 21 and (374µm)^2 for level 25, we convert these to pixel-equivalent measurements using the dataset scale of 1 µm/px. Thus, we used a region of (988px)^2 for level 21 and (374px)^2 for level 25. The figures below illustrate the best-matching region for each test image, corresponding to levels 21 and 25.

Level 21

IoU w Threshold @ 0.5 before post-processing = 0.37

Level 25

IoU w Threshold @ 0.5 before post-processing = 0.49

Masking & Segmentation

With the best matching regions found, we can mask out anything outside each region.

Level 21


Level 25


Segmentation

At this point we can threshold the post-processed model prediction to get the final segmentation mask.

Level 21

IoU w Threshold @ 0.5 after post-processing = 0.54

Level 25

IoU w Threshold @ 0.5 after post-processing = 0.75

Does the detected region agree with the ground truth label?

As an additional observation, the best-matching regions from the model predictions align well with the ground-truth labels.

Level 21


Level 25


Discussion

This method substantially improves the intersection-over-union metric for both test images. The combination of reducing false positives with larger MA-Net input sizes and extracting expected predictions with region-based filtering has yielded by far the best results we have obtained. Nonetheless, the assumptions behind these methods deserve discussion. MA-Net itself (e.g., model size) is discussed in the MA-Net experiment section, so here we focus on the assumptions we made about MA-Net outputs when applying region filtering, as well as assumptions about the filtering itself.

The first assumption is that, for any new experimental image, the user knows the corresponding atlas level so that the appropriate region size and occupancy can be determined. Due to many factors, such as the plane of sectioning of the brain tissue, finding the corresponding level may not be trivial. However, region filtering is a modular step, so a pipeline could be developed that uses the idea without explicitly specifying the level, or even the brain region.

Moreover, we assume the predictions of MA-Net are good enough that measures like region size and occupancy can locate the correct region. It would be easy to fool the region-filtering algorithm with predictions that make no sense given the data. In other words, the model must produce predictions that are close to the final brain delineation for region filtering to work. Here, "close to final" means having recall and precision high enough that the occupancy measure serves as a good criterion for matching regions. Since we have not attempted this with other brain regions, it is possible that MA-Net, as currently configured, does not produce "close to final" delineations for them.

Lastly, our choice of a square region, as opposed to a region with the same proportions as the brain region's bounding box, allows detection of the region in more orientations than the one established in the atlas. Using the longer axis of the brain region to define the square's sides also prevents overcropping of the brain region if it is rotated. A possible downside is that, by preventing overcropping, we may admit more false positives. However, because the model predictions are more precise near the brain region, false positives are not a major issue, and further areal filtering can remove debris or extract the largest connected component inside the best-matching region, as sketched below.
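
As an illustration of that last point, here is a minimal sketch of largest-connected-component filtering, assuming SciPy is available (this is not part of the current pipeline):

```python
import numpy as np
from scipy import ndimage

def largest_component(segmentation: np.ndarray) -> np.ndarray:
    """Keep only the largest connected component of a binary segmentation."""
    labels, n = ndimage.label(segmentation)
    if n == 0:
        return segmentation
    counts = np.bincount(labels.ravel())
    counts[0] = 0                     # ignore the background label
    return labels == counts.argmax()
```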