SIGGRAPH 2019

Handheld Multi-frame Super-resolution

We present a multi-frame super-resolution algorithm that supplants the need for demosaicing in a camera pipeline by merging a burst of raw images. The figure above compares our method with one that first merges the same-color channels across frames and then demosaics the result (top). By contrast, our method (bottom) creates the full RGB image directly from a burst of raw images. This burst was captured with a hand-held mobile phone and processed on the device. Note in the third (red) inset that the demosaiced result exhibits aliasing (Moiré), while our result takes advantage of this aliasing, which changes on every frame in the burst, to produce a merged result in which the aliasing is gone and the cloth texture becomes visible.

"Handheld Multi-Frame Super-Resolution", B. Wronski, I. Garcia-Dorado, M. Ernst, D. Kelly, M. Krainin, C.K. Liang, M. Levoy, and P. Milanfar, * in ACM Transactions on Graphics, Vol. 38, No. 4, Article 28, July 2019 (SIGGRAPH 2019)

* Authors are affiliated with Google Research, 1600 Amphitheatre Parkway, Mountain View, CA, 94043

Abstract

Compared to DSLR cameras, smartphone cameras have smaller sensors, which limits their spatial resolution; smaller apertures, which limits their light-gathering ability; and smaller pixels, which reduces their signal-to-noise ratio. The use of color filter arrays (CFAs) requires demosaicing, which further degrades resolution. In this paper, we supplant the use of traditional demosaicing in single-frame and burst photography pipelines with a multi-frame super-resolution algorithm that creates a complete RGB image directly from a burst of CFA raw images. We harness natural hand tremor, typical in handheld photography, to acquire a burst of raw frames with small offsets. These frames are then aligned and merged to form a single image with red, green, and blue values at every pixel site. This approach, which includes no explicit demosaicing step, serves to both increase image resolution and boost signal-to-noise ratio. Our algorithm is robust to challenging scene conditions: local motion, occlusion, or scene changes. It runs at 100 milliseconds per 12-megapixel RAW input burst frame on mass-produced mobile phones. Specifically, the algorithm is the basis of the Super-Res Zoom feature, as well as the default merge method in Night Sight mode (whether zooming or not) on Google’s flagship phone.
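To make the idea of merging CFA samples without a demosaicing step concrete, here is a minimal sketch in Python with NumPy. It is not the algorithm from the paper: it assumes an RGGB Bayer layout, takes the per-frame sub-pixel offsets as given rather than estimating them, uses a plain isotropic Gaussian weight, and has no handling of local motion or occlusion. The names `bayer_channel_index`, `merge_burst`, `sigma`, and `radius` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def bayer_channel_index(h, w):
    """Channel index per pixel (0=R, 1=G, 2=B) for an assumed RGGB mosaic."""
    idx = np.ones((h, w), dtype=np.intp)  # green by default
    idx[0::2, 0::2] = 0                   # red at even rows, even columns
    idx[1::2, 1::2] = 2                   # blue at odd rows, odd columns
    return idx


def merge_burst(frames, offsets, sigma=0.7, radius=1):
    """Merge Bayer raw frames into an (h, w, 3) RGB image.

    frames  : list of (h, w) float arrays, all with the same RGGB layout.
    offsets : list of (dy, dx); the sample at (y, x) in a frame is assumed
              to lie at (y + dy, x + dx) in the reference frame.
    Each raw sample is splatted onto nearby output pixels of its own color
    channel with a Gaussian weight; the result is the weighted average.
    """
    h, w = frames[0].shape
    chan = bayer_channel_index(h, w)
    acc = np.zeros((h, w, 3))   # weighted sum of samples per channel
    wsum = np.zeros((h, w, 3))  # sum of weights per channel
    ys, xs = np.mgrid[0:h, 0:w]

    for frame, (dy, dx) in zip(frames, offsets):
        py, px = ys + dy, xs + dx              # sample positions (reference)
        cy = np.rint(py).astype(np.intp)       # nearest output pixel
        cx = np.rint(px).astype(np.intp)
        for oy in range(-radius, radius + 1):
            for ox in range(-radius, radius + 1):
                ty, tx = cy + oy, cx + ox      # target output pixel
                valid = (ty >= 0) & (ty < h) & (tx >= 0) & (tx < w)
                d2 = (ty - py) ** 2 + (tx - px) ** 2
                wgt = np.exp(-d2 / (2.0 * sigma ** 2))
                target = (ty[valid], tx[valid], chan[valid])
                np.add.at(acc, target, (wgt * frame)[valid])
                np.add.at(wsum, target, wgt[valid])

    # Guard against division by zero where a channel received no samples.
    return acc / np.maximum(wsum, 1e-12)
```

Because the frames are offset from one another, red, green, and blue samples from different frames land near each output pixel, so every pixel site receives all three colors without interpolating a single mosaic. A real pipeline would also need to estimate the offsets and reject misaligned samples.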

An earlier Google Research blog post by B. Wronski and P. Milanfar describes this technology.

Paper

High res (21MB) Low res (9MB)

Supplementary Material

High res (23MB) Low res (5MB)

Acknowledgements

We gratefully acknowledge current and former colleagues from collaborating teams across Google including: Haomiao Jiang, Jiawen Chen, Yael Pritch, James Chen, Sung-Fang Tsai, Daniel Vlasic, Pascal Getreuer, Dillon Sharlet, Ce Liu, Bill Freeman, Lun-Cheng Chu, Michael Milne, and Andrew Radin. Integration of our algorithm with the Google Camera App as Super-Res Zoom and Night Sight mode was facilitated with generous help from the Android camera team. We also thank the anonymous reviewers for valuable feedback that has improved our manuscript.