non-maximum supression

Non-Maximum Suppression NMS

In computer vision, there are usually multiple rectangles detected to enclose an object, so there is a need to combine those similar rectangles into one.

The basic idea is simple. If those rectangles overlap too much (e.g. iou > 50%) then consider them as the same rectangle.

The algorithm is as follows.

Starting from the rectangle with the highest score, if any rectangle has an overlap > threshold, then consider it as the same as the former.

From those rectangles with overlap < threshold, repeat the process.

pseudo code:

input: rectangles, scores, threshold

output: {}

while rectangles not empty

representative = the rectangle with the highest score

        output appends representative

for any other rectangle

if other overlaps representative > threshold

        remove other from rectangles


Implementation for merging rectangles

def nms(rectangles, scores, threshold):

    '''

         rectangles = [[x1, y1, x2, y2 ]], shape n x 4

         x1,y1 is the top left vertex and x2,y2 is the bottom right vertex

         scores are the confidences of the rectangles

    '''

    x1 = rectangles[:, 0]

    y1 = rectangles[:, 1]

    x2 = rectangles[:, 2]

    y2 = rectangles[:, 3]

    

    areas = (x2 - x1 + 1) * (y2 - y1 + 1) #if x1 x2 the same pixel then the length is 1

    index_score_desc = scores.argsort()[::-1] # [::-1] reverses the order

    

    keep = []

    while index_score_desc.size > 0:

        i = index_score_desc[0] #pick the index of the largest score

        keep.append(i)

        max_x1 = np.maximum(x1[i], x1[index_score_desc[1:]]) #minimum of x1 and x2

        max_y1 = np.maximum(y1[i], y1[index_score_desc[1:]])

        min_x2 = np.minimum(x2[i], x2[index_score_desc[1:]])

        min_y2 = np.minimum(y2[i], y2[index_score_desc[1:]])

        

        intersection = np.maximum(0, min_x2 - max_x1 + 1) * np.maximum(0, min_y2 - max_y1 + 1)

        iou = intersection / (areas[i] + areas[index_score_desc[1:]] - intersection)

        #note the iou is same size as index_score_desc[1:]

        #and the index of iou + 1 correspond to the index of index_score_desc

        

        index_low_iou = np.where(iou < threshold)[0] #the row index of iou whose value is less than threshold

        index_low_iou += 1;  #the iou index + 1 is the corresponding index in index_score_desc as the first element (i) is excluded

        

        index_score_desc = index_score_desc[index_low_iou] #keep only the low iou in the list

    

    return keep

The algorithm works for polygons too.

def nms_poly(polys, scores, threshold):

    '''

       non-maximum suppression

       start from highest score descendly

       merge ploys overlap with IOU > threshold

       polys = [[x1, y1, x2, y2, ... x4, y4]]  shape [n, 8], the 4 vertices are top left, top right, bottom right and bottom left

       

    '''

    polygons = np.array([Polygon(poly) for poly in polys.reshape(-1, 4, 2)])

    areas = np.array([p.area for p in polygons])

    index_score_desc = scores.argsort()[::-1] # [::-1] reverses the order

    keep = []

    while index_score_desc.size > 0:

        i = index_score_desc[0] #pick the index of the largest score

        keep.append(i)

        p1 = polygons[i]

        p2s = [p2 for p2 in polygons[index_score_desc[1:]]]

        intersection = np.array([p1.intersection(p2).area for p2 in p2s])

        iou = intersection / (areas[i] + areas[index_score_desc[1:]] - intersection)

        

        index_low_iou = np.where(iou < threshold)[0]

        index_low_iou += 1;

        

        index_score_desc = index_score_desc[index_low_iou]

        

    return polys[keep]