Introduction

Local invariant descriptors (e.g., [27, 26, 10, 39, 37]) are image statistics computed at each pixel that describe the pixel's neighborhood in a way that is invariant to geometric and photometric nuisances. They are typically computed by aggregating smoothed oriented gradients within a neighborhood of the pixel. These descriptors are important for characterizing local textural properties because a texture consists of small tokens, called textons [20], which may vary by small geometric and photometric nuisances but are otherwise stationary. Careful construction of these descriptors is crucial, since they are central to low-level segmentation, which in turn supports higher-level tasks such as object detection and segmentation. Existing local invariant descriptors aggregate oriented gradients over predefined pixel neighborhoods, which may contain image data from different textured regions, particularly near texture boundaries. This leads to ambiguity when grouping descriptors near those boundaries.

Figure 1. [Left]: Descriptors that aggregate local image data across boundaries of textured regions lead to segmentation errors. The problem is exacerbated as the texton size increases. [Right]: Segmentation by Shape-Tailored Descriptors (our method).

Such ambiguity can lead to segmentation errors when descriptors are grouped to form a segmentation. The problem is exacerbated when the textons are large, since the descriptor neighborhood must then be chosen large enough to fully capture the texton data (see Fig. 1). Ideally, one would construct local descriptors that aggregate oriented gradients only from within textured regions.

However, the segmentation is not known a priori. Thus, it is necessary to solve for the local descriptors and the regions of the segmentation jointly. In this paper, we address this joint problem in two steps. First, we construct novel dense local invariant descriptors, called Shape-Tailored Local Descriptors (STLD). These descriptors are formed from shape-dependent scale spaces of oriented gradients, which are solutions of Poisson-like partial differential equations (PDEs). Importantly, these scale spaces are defined within a region of arbitrary shape and do not aggregate data from outside the region of interest. Second, we incorporate Shape-Tailored Descriptors into the Mumford-Shah energy [29] as an example of an energy based on these descriptors. Optimizing this energy jointly estimates the Shape-Tailored Descriptors and their support region, which forms the segmentation.
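To make the idea of a shape-dependent scale space concrete, the following is a minimal sketch and not the implementation used in this paper. It assumes that the Poisson-like PDE takes the screened form u - alpha * Laplacian(u) = channel inside the region, with zero flux across the region boundary; that form, the Jacobi solver, the raw-intensity channel (in place of oriented gradients), and the names `neighbor_sum`, `region_smooth`, `alpha`, and `iters` are all illustrative assumptions.

```python
import numpy as np

def neighbor_sum(a):
    """Sum of the 4-connected neighbors of each pixel (zeros beyond the image)."""
    p = np.pad(a, 1)  # constant zero padding
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]

def region_smooth(channel, region, alpha=10.0, iters=500):
    """Jacobi iterations for u - alpha * Laplacian(u) = channel inside `region`,
    with zero flux across the region boundary (assumed PDE form)."""
    m = region.astype(float)
    n_inside = neighbor_sum(m)   # number of in-region neighbors per pixel
    u = channel * m              # u is kept at zero outside the region
    for _ in range(iters):
        # Only in-region neighbors contribute, because u is zero elsewhere.
        u = m * (channel + alpha * neighbor_sum(u)) / (1.0 + alpha * n_inside)
    return u

# Toy image: two constant "textures" (values 1 and 5) split down the middle.
image = np.ones((32, 32))
image[:, 16:] = 5.0
left = np.zeros((32, 32), dtype=bool)
left[:, :16] = True

u_full = region_smooth(image, np.ones_like(left))  # region-agnostic smoothing
u_left = region_smooth(image, left)                # smoothing tailored to `left`

print(u_full[16, 15])  # pulled toward 5 by data from the other region
print(u_left[16, 15])  # stays at 1: nothing outside `left` is aggregated
```

In this toy setting, the region-agnostic solution mixes the two textures at the pixel adjacent to the boundary, whereas the region-restricted solution is unaffected by data outside the region. This is the property that motivates shape-tailored descriptors.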

Contributions: 1. Our main contribution is to define new dense local descriptors using shape-dependent scale spaces of oriented gradients. 2. We show that, for texture segmentation, our new descriptors give more accurate segmentations than their non-shape-dependent counterparts. 3. We apply our descriptors to disocclusion detection [43] in object tracking, improving on the state of the art.
