In general, image inpainting methods can be divided into two broad categories: non-learning-based methods and learning-based methods. Learning-based methods usually require an existing dataset of images on which a specific deep neural network is trained; the network then learns to generate plausible content for the damaged/empty regions of a given image. Non-learning-based methods, on the other hand, rely on the self-similarity (autocorrelation) of the natural image itself: they exploit the intact information outside the damaged/empty region within the same image to perform the repair. Prominent examples include methods based on patch synthesis, partial differential equation (PDE) propagation, and interpolation.
Interpolation-based methods
Shih et al. proposed a multi-resolution method based on pixel interpolation. In their method, the damaged image was segmented into blocks, and the segmentation was repeated until the variance of each sub-block fell below a pre-determined threshold. Each damaged pixel was then repaired using the mean value of its current sub-block or of the sub-block at the previous level. Another interpolation-based inpainting scheme was proposed by Shih et al., which evaluated the neighborhood of each pixel to be inpainted and selected the size of the reference window used to compute an interpolated color. However, these methods may produce blurring artifacts when damaged pixels lie close to image edges.
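To make the multi-resolution idea concrete, the following sketch (not the authors' exact algorithm; the variance threshold, minimum block size, and helper name are illustrative) recursively splits a grayscale NumPy image into sub-blocks and fills damaged pixels with the local mean, falling back to the previous level's mean when a sub-block contains no intact pixels:

```python
import numpy as np

def inpaint_block(img, mask, y0, y1, x0, x1,
                  var_thresh=100.0, min_size=2, parent_mean=None):
    # img: 2D float array (modified in place); mask: True where damaged.
    block = img[y0:y1, x0:x1]          # view into the image
    hole = mask[y0:y1, x0:x1]
    intact = block[~hole]
    if intact.size == 0:
        if parent_mean is not None:
            block[hole] = parent_mean  # fall back to the previous level
        return
    mean = intact.mean()
    homogeneous = intact.var() < var_thresh
    too_small = (y1 - y0) <= min_size or (x1 - x0) <= min_size
    if homogeneous or too_small:
        block[hole] = mean             # repair with the sub-block mean
        return
    # Variance still too high: split into four sub-blocks and recurse.
    ym, xm = (y0 + y1) // 2, (x0 + x1) // 2
    for ys, ye in ((y0, ym), (ym, y1)):
        for xs, xe in ((x0, xm), (xm, x1)):
            inpaint_block(img, mask, ys, ye, xs, xe,
                          var_thresh, min_size, mean)
```

Because each split halves the block, smooth regions are filled at a coarse level while textured regions are subdivided further, which is what keeps the interpolation from averaging across strong intensity changes.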
Partial Differential Equation-based methods
Such methods are motivated by extensive work on variational and partial differential equation (PDE) methods in image processing. Bertalmio et al. established a mathematical inpainting model by borrowing ideas from classical fluid dynamics. By iteratively solving a numerical discretization of a PDE, intact information from neighboring areas is smoothly propagated into the damaged region along the isophote (constant-intensity) direction. However, PDE-based methods cannot handle large damaged areas very well and often incur heavy computational complexity due to their high-order mathematical models.
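The iterative propagation principle can be illustrated with a deliberately simplified sketch: instead of Bertalmio et al.'s higher-order transport equation, which moves smoothness estimates along the isophote direction, the toy version below just diffuses boundary values into the hole with repeated explicit heat-equation steps (function name and iteration count are illustrative):

```python
import numpy as np

def diffuse_inpaint(img, mask, iters=500):
    # img: 2D float array; mask: True where damaged.
    u = img.astype(float).copy()
    u[mask] = u[~mask].mean()          # crude initialization of the hole
    for _ in range(iters):
        # 4-neighbour average = one explicit step of the heat equation
        # (np.roll wraps at the border; fine for holes away from edges)
        avg = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                      np.roll(u, 1, 1) + np.roll(u, -1, 1))
        u[mask] = avg[mask]            # update damaged pixels only
    return u
```

The many iterations needed for values to travel even a few pixels into the hole hint at why PDE-based methods become expensive and overly smooth for large damaged areas.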
Patch-based methods
These methods employ the strategy of texture synthesis to repair larger damaged regions. They iterate through a search space, looking for patches whose appearance is similar to the known surroundings of the missing region according to some hand-crafted distance metric, and use the best matches to fill the holes. The search space can be either internal or external to the given image. These methods achieve better results for large damaged regions than the PDE-based methods, although they may leave artificial seams between patches. PatchMatch, introduced in 2009, was one such approach based on patches of low-level features; it produces particularly convincing results for image backgrounds and repetitive textures.
Structural image editing using PatchMatch. Left to right: (a) the original image; (b) a hole is marked (magenta) and line constraints (red/green/blue) are used to improve the continuity of the roofline; (c) the hole is filled in; (d) user-supplied line constraints for retargeting; (e) retargeting using constraints eliminates two columns automatically; and (f) user translates the roof upward using reshuffling.
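At the heart of every patch-based method is a nearest-patch query. The sketch below shows a naive exhaustive version using a sum-of-squared-differences (SSD) distance; the key contribution of PatchMatch was to replace exactly this quadratic scan with randomized sampling plus propagation of good matches between neighboring patches (names and the patch size are illustrative):

```python
import numpy as np

def best_patch_ssd(img, mask, ty, tx, psize=7):
    # Find the fully intact patch most similar (SSD over known pixels)
    # to the psize x psize patch whose top-left corner is (ty, tx).
    target = img[ty:ty + psize, tx:tx + psize]
    known = ~mask[ty:ty + psize, tx:tx + psize]   # pixels we can compare
    best, best_dist = None, np.inf
    H, W = img.shape
    for y in range(H - psize + 1):
        for x in range(W - psize + 1):
            if mask[y:y + psize, x:x + psize].any():
                continue                          # candidate must be intact
            cand = img[y:y + psize, x:x + psize]
            d = ((cand - target)[known] ** 2).sum()
            if d < best_dist:
                best, best_dist = (y, x), d
    return best   # top-left corner of the closest source patch
```

Copying pixels from the returned source patch into the hole, patch by patch, is what yields sharp textures, and mismatched patch borders are what produce the artificial seams mentioned above.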
In recent years, many works have applied convolutional neural networks (CNNs) to image inpainting, such as Iizuka et al. and Pathak et al. In these works, training samples are used to train a deep CNN to estimate the pixel intensities of the damaged region in the input image. Benefiting from large-scale training data, these learning-based methods can produce semantically plausible inpainting results. However, existing CNN-based inpainting methods usually complete the damaged region by propagating convolutional features through fully connected layers, which sometimes leaves the inpainted results blurry and lacking fine texture details. Another powerful family of learning-based inpainting methods builds on deep generative models, introducing an adversarial loss to improve the visual quality of the inpainted results. Generally speaking, learning-based methods achieve superior global semantic structure compared with non-learning-based methods, while non-learning-based methods can recover fine local textures with far less computational complexity.
Qualitative illustration of the task in Pathak et al. Given an image with a missing region (a), a human artist has no trouble inpainting it (b). Automatic inpainting using their context encoder trained with an L2 reconstruction loss is shown in (c), and using both L2 and adversarial losses in (d).
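The interplay of the two losses can be sketched as follows in PyTorch; generator, discriminator, and the loss weighting are hypothetical placeholders, loosely following the context-encoder recipe rather than any paper's exact objective:

```python
import torch
import torch.nn.functional as F

def inpainting_losses(generator, discriminator, img, mask, adv_weight=0.001):
    # img: (B, C, H, W) tensor; mask: same shape, 1.0 inside the hole.
    corrupted = img * (1 - mask)        # zero out the damaged region
    pred = generator(corrupted)         # hypothetical completion network
    # Reconstruction: L2 error, penalized only inside the hole
    rec_loss = F.mse_loss(pred * mask, img * mask)
    # Adversarial: the completed image should fool the discriminator
    completed = corrupted + pred * mask
    fake_score = discriminator(completed)
    adv_loss = F.binary_cross_entropy_with_logits(
        fake_score, torch.ones_like(fake_score))
    return rec_loss + adv_weight * adv_loss
```

The L2 term alone averages over all plausible completions and therefore blurs, as in panel (c); the adversarial term penalizes that blur because a discriminator can easily spot it, pushing the result toward the sharper output of panel (d).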