Masking Results
Fig. 10: Masking results of various methods (panels: Original Image; Edge Detection and Opening/Closing; Strong Edge Detection; Generator). The purpose of the masks is to isolate the raindrops from the rest of the image. As shown, only the final method (Generator) successfully does so for the provided image.
Comparing our three methods of raindrop mask creation in Fig. 10, the generator clearly detects raindrops far more reliably than the classical approaches. The likely reason is our incorrect assumption that raindrops are significantly sharper than the background: while they do have a somewhat different texture, the difference is not strong enough to distinguish them from the background. We therefore conclude, after trying far more variations than listed here, that a composition of kernels and fixed functions is not sufficient to identify raindrops; learning is necessary because raindrop characteristics are so variable.
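For reference, the kind of classical pipeline we mean is sketched below: Canny edge detection followed by morphological closing and opening in OpenCV. The function name, thresholds, and kernel size are illustrative assumptions rather than our exact settings; in practice such a pipeline also fires on strong background edges, which is exactly the failure mode discussed above.

    import cv2
    import numpy as np

    def classical_raindrop_mask(image_bgr, low=100, high=200, kernel_size=5):
        # Illustrative classical masking attempt: detect strong edges, then
        # close them to fill raindrop outlines and open to drop thin noise.
        # Thresholds and kernel size are hypothetical, not our exact values.
        gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
        edges = cv2.Canny(gray, low, high)
        kernel = np.ones((kernel_size, kernel_size), np.uint8)
        closed = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)
        opened = cv2.morphologyEx(closed, cv2.MORPH_OPEN, kernel)
        return opened  # also picks up background edges, hence the failure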
Inpaint Model Results
Evaluation Metrics
PSNR: computes the Peak Signal-to-Noise Ratio between a clean image and a noisy image; the higher it is, the better the quality of our generated derained images.
SSIM: computes the structural similarity between two images by measuring changes in luminance, contrast, and structure. A higher SSIM score indicates better quality of the generated derained images.
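As a concrete illustration, both metrics can be computed with scikit-image; the sketch below assumes uint8 RGB arrays and a recent scikit-image version (for the channel_axis argument), and the helper name is ours rather than part of our pipeline.

    from skimage.metrics import peak_signal_noise_ratio, structural_similarity

    def evaluate_pair(derained, ground_truth):
        # PSNR and SSIM between a derained image and its ground truth.
        # Both inputs are assumed to be uint8 RGB arrays of the same shape.
        psnr = peak_signal_noise_ratio(ground_truth, derained, data_range=255)
        ssim = structural_similarity(ground_truth, derained,
                                     channel_axis=-1, data_range=255)
        return psnr, ssim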
Quantitative Evaluation Results
The table shows the quantitative evaluation results on the Raindrop test set. Our model's derained results are clearly better than the input rain images, although they are still not comparable to our baseline model; we discuss the possible reasons in detail in the Analysis section. We can also see that our model improves substantially over the pretrained inpainting model.
Qualitative Evaluation Results
Fig. 11: Edge Model results. From left to right: (1) input images with raindrops; (2) input masked-out grayscale images; (3) edge map of the background region of the input images; (4) predicted edge map in the masked region after the Edge Generator; (5) composite edge map of (3) and (4); (6) ground-truth images without raindrops.
Fig. 12: Inpaint Model results. From left to right: (1) input images with raindrops; (2) input masked-out images; (3) Edge Model prediction; (4) Inpaint Model prediction; (5) combination of the inpaint prediction in the masked-out areas with the rest of the input image; (6) ground-truth images without raindrops.
Fig. 11 shows the results of our Edge Model. Comparing (3), the edge map of the background region of the input image, with (4), the edge map predicted in the masked region by the Edge Generator, the original edges in (3) contain many small circles corresponding to raindrops, while the predicted edges in (4) contain far fewer. This indicates that our Edge Model successfully learns to predict edges that follow the original object contours without raindrops, and that the predicted edges provide useful guidance for the Inpaint Model. Fig. 12 shows our final results after inpainting; as column (5) shows, our outputs have significantly fewer raindrops. The results are best when the raindrops are small and distinct, as in the fourth and sixth rows of Fig. 12, where our model removes nearly all of the raindrops present. Our model performs less well when the rain is less distinct, as in the first row, but even there it still visibly reduces the effect of the raindrops.
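The composite steps in column (5) of Fig. 11 and Fig. 12 amount to a per-pixel merge of the model prediction inside the mask with the known content outside it. A minimal sketch, assuming a binary mask that is 1 in the raindrop regions (the names are ours, not EdgeConnect's):

    import numpy as np

    def composite(prediction, original, mask):
        # Keep original pixels where the mask is 0 (background) and take the
        # model's prediction where the mask is 1 (masked-out raindrop regions).
        # Assumes prediction, original, and mask broadcast to the same shape.
        return prediction * mask + original * (1 - mask)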
Analysis
As we can see, our model's results are better than both the input rain images and the results of directly applying the pretrained model. This means that our method (changing the inpaint model's target to the images without rain while the input images contain rain) helps remove raindrops by inpainting the masked areas to resemble the ground-truth images. The qualitative results show that our method successfully removes a large number of raindrops. Since there is no prior work on removing raindrops with inpainting models, we believe our model serves as a validation of the idea, and this direction can be promising with further improvements. At the same time, the result images from our method still contain some raindrops and tend to be blurry. There are several possible reasons for the weaker performance of our model compared with our baseline:
The architecture may not be suitable for inpainting target images that differ from the input images. In this project, we also tried training the EdgeConnect model without changing the ground-truth label to the images without rain (i.e., the model inpaints the original masked images, raindrops included). We observed that this variant converges faster and generates sharper results than our final method. This may indicate that our inpaint architecture has trouble learning to fill in content that does not correspond to the input images. One possible solution is to replace the inpaint architecture with a conditional GAN that explicitly learns the input-output correspondence.
The mask quality varies from image to image. As discussed above, our model works better when the raindrops are small, clear dots, and it fails to remove large, blurry raindrops. As we can see from the masks, the pretrained attention model has trouble identifying blurry raindrops, and one main drawback of our model is that the final result depends directly on mask quality, since the masks are used directly as further input.
The size of our dataset is too small. It contains only 860 training images, which may not provide enough information to appropriately learn to mask and inpaint realistic derained scenes.
It is hard to inpaint the images because of the varied shapes and types of raindrops. Given this variability, the masks may cover regions that are either too large or too small, and the resulting loss of information (or retention of unnecessary information) makes inpainting difficult.