• July 2021 -- We've launched our latest denoising and blur reduction models in the Google Photos editor. Read the Blog post. To learn more about the technology described in that article, check out our papers below.

  • July 2021 -- We'll have 5 papers in ICCV 2021

    • "Learning to Resize Images for Computer Vision Tasks": Front-end resizers in deep networks are simple, fixed linear filters. They are typically an afterthought, but they shouldn't be: deep computer vision models can benefit greatly from replacing these fixed linear resizers with well-designed, learned, nonlinear resizers. The resizer is trained jointly with the baseline model's loss. Because no pixel or perceptual loss is imposed on the resizer, its output is optimized for the task rather than for visual quality. The structure of the learned resizer matters; it is not simply a matter of adding more generic convolutional layers to the baseline model. Our work shows that a generically deeper model can be improved upon with a well-designed, task-optimized, front-end processor.
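The architecture idea can be sketched in NumPy: a fixed bilinear skip path plus a small learned residual branch whose weights would be trained with the downstream task loss. This is a minimal illustrative sketch, not the paper's model; `learned_resizer` and its kernel list are hypothetical names.

```python
import numpy as np

def conv2d(img, kernel):
    """'Same' 2-D cross-correlation with zero padding (loop version, for clarity)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def bilinear_resize(img, out_h, out_w):
    """Fixed linear resizer: plain bilinear interpolation."""
    h, w = img.shape
    ys = np.linspace(0, h - 1, out_h)
    xs = np.linspace(0, w - 1, out_w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]; wx = (xs - x0)[None, :]
    top = img[np.ix_(y0, x0)] * (1 - wx) + img[np.ix_(y0, x1)] * wx
    bot = img[np.ix_(y1, x0)] * (1 - wx) + img[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy

def learned_resizer(img, out_h, out_w, conv_weights):
    """Nonlinear resizer sketch: bilinear skip path + learned residual branch.
    conv_weights would be trained jointly with the baseline model's task loss."""
    base = bilinear_resize(img, out_h, out_w)
    feat = base
    for k in conv_weights:                       # tiny residual CNN on the resized grid
        feat = np.maximum(conv2d(feat, k), 0.0)  # ReLU nonlinearity
    return base + feat                           # skip keeps the bilinear output reachable
```

With zero residual weights the resizer reduces exactly to bilinear resizing, which is why joint training can only improve on (or match) the fixed linear baseline.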

    • "Learning to Reduce Defocus Blur by Realistically Modeling Dual-Pixel Data": We propose a procedure to generate realistic Dual-Pixel (DP) data synthetically. Our synthesis approach mimics the optical image formation found on DP sensors and can be applied to virtual scenes rendered with standard computer software. Leveraging these realistic synthetic DP images, we introduce a new recurrent convolutional network (RCN) architecture that can improve deblurring results and is suitable for use with single-frame and multi-frame data captured by DP sensors. Finally, we show that our synthetic DP data is useful for training DNN models targeting video deblurring applications where access to DP data remains challenging.
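A toy version of the key optical idea: each dual-pixel view sees defocus through one half of the lens aperture, so synthetic left/right views can be produced by blurring a sharp image with the left and right halves of the full defocus kernel. This is a hypothetical sketch, not our synthesis pipeline; real DP formation also depends on depth, the sign of defocus, and sensor optics.

```python
import numpy as np

def conv2d(img, kernel):
    """'Same' 2-D cross-correlation with zero padding."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def disk_kernel(radius):
    """Circular-aperture defocus PSF (normalized disk)."""
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    k = ((x ** 2 + y ** 2) <= radius ** 2).astype(float)
    return k / k.sum()

def synth_dual_pixel(sharp, radius):
    """Split the defocus PSF into left/right aperture halves (toy model).
    The shared center column is divided equally so the halves sum to the full PSF."""
    k = disk_kernel(radius)
    c = k.shape[1] // 2
    k_left, k_right = k.copy(), k.copy()
    k_left[:, c + 1:] = 0.0
    k_right[:, :c] = 0.0
    k_left[:, c] *= 0.5
    k_right[:, c] *= 0.5
    return conv2d(sharp, k_left), conv2d(sharp, k_right)
```

By linearity of convolution, the two half-aperture views sum exactly to the full defocus blur, mirroring how the two photodiodes under each microlens split the incoming light.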

    • "Multi-scale Transformer for Image Quality Assessment": We develop a multi-scale Transformer for IQA to process native-resolution images with varying sizes and aspect ratios. With a multi-scale image representation, our proposed method can capture image quality at different granularities. Furthermore, a novel hash-based 2D spatial embedding and a scale embedding are proposed to support positional embedding in the multi-scale representation. Experimental results verify that our method achieves state-of-the-art performance on multiple large-scale IQA datasets.
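The indexing behind a hash-based 2D spatial embedding can be sketched as follows: patch positions from an image of any size are hashed onto one fixed G×G table of learnable embeddings, so a single table serves all resolutions and aspect ratios. A minimal sketch of the hashing idea, with hypothetical names, not the paper's code.

```python
import numpy as np

def spatial_embedding_ids(n_rows, n_cols, grid_size):
    """Hash each position (i, j) of an n_rows x n_cols patch grid onto a fixed
    grid_size x grid_size table, returning flat indices into that table."""
    ti = np.arange(n_rows) * grid_size // n_rows   # row bucket in [0, grid_size)
    tj = np.arange(n_cols) * grid_size // n_cols   # column bucket in [0, grid_size)
    return ti[:, None] * grid_size + tj[None, :]   # flat index into (grid_size**2, d) table

# Usage: one learnable table serves any input size or aspect ratio.
rng = np.random.default_rng(0)
emb_table = rng.normal(size=(10 * 10, 16))   # G=10 grid of 16-dim embeddings
ids = spatial_embedding_ids(7, 13, 10)       # 7x13 patch grid, odd aspect ratio
pos_emb = emb_table[ids]                     # (7, 13, 16) positional embeddings
```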

    • "COMISR: Compression-Informed Video Super-Resolution": We propose a new compression-informed video super-resolution model. The proposed model consists of three modules for video super-resolution: bi-directional recurrent warping, detail-preserving flow estimation, and Laplacian enhancement. All three modules are used to deal with compression properties such as the location of intra-frames in the input and smoothness in the output frames. We show that our method not only recovers high-resolution content on uncompressed frames from widely used benchmark datasets, but also achieves state-of-the-art performance.
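The intent of a Laplacian enhancement step can be illustrated with a classic unsharp-mask-style operation: subtract a scaled Laplacian from the frame to boost the high frequencies that compression smooths away. This is a toy sketch of the idea, not the paper's module; the 4-neighbor kernel and `alpha` are illustrative choices.

```python
import numpy as np

LAPLACIAN = np.array([[0.0,  1.0, 0.0],
                      [1.0, -4.0, 1.0],
                      [0.0,  1.0, 0.0]])

def conv2d(img, kernel):
    """'Same' 2-D cross-correlation with zero padding."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def laplacian_enhance(frame, alpha=0.2):
    """Sharpen by subtracting a scaled Laplacian: flat regions pass through
    unchanged (their Laplacian is ~0), while edges are amplified."""
    return frame - alpha * conv2d(frame, LAPLACIAN)
```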

    • "Patch Craft: Video Denoising by Deep Modeling and Patch Matching": This work proposes a novel approach for leveraging self-similarity in the context of video denoising, while still relying on a regular convolutional architecture. We introduce the concept of patch-craft frames -- artificial frames that are similar to the real ones, built by tiling matched patches. Our algorithm augments video sequences with patch-craft frames and feeds them to a CNN. We demonstrate the substantial boost in denoising performance obtained with the proposed approach.
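A toy construction of a patch-craft frame: tile the target frame with the best-matching patch drawn from a reference frame. The paper searches a spatio-temporal neighborhood; this sketch uses exhaustive L2 search over a single reference frame and non-overlapping tiles, and the function name is hypothetical.

```python
import numpy as np

def patch_craft_frame(target, reference, p=4):
    """Build an artificial frame by replacing each non-overlapping p x p patch of
    `target` with its nearest-neighbor patch from `reference` (exhaustive search).
    Assumes target dimensions are divisible by p for a full tiling."""
    h, w = target.shape
    out = np.zeros_like(target)
    # Gather all candidate patches from the reference frame.
    cands = np.stack([reference[i:i + p, j:j + p]
                      for i in range(h - p + 1) for j in range(w - p + 1)])
    for i in range(0, h - p + 1, p):
        for j in range(0, w - p + 1, p):
            q = target[i:i + p, j:j + p]
            d = ((cands - q) ** 2).sum(axis=(1, 2))   # L2 distance to every candidate
            out[i:i + p, j:j + p] = cands[int(np.argmin(d))]
    return out
```

If the reference is the target itself, every patch matches itself exactly and the crafted frame reproduces the original; with noisy neighboring frames as references, the tiled matches provide the self-similarity signal the CNN can exploit.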

  • June 2020 -- We have 3 papers in CVPR 2020 -- two regular conference papers and one workshop paper. Summaries and links to relevant material are below.

    • "GIFnets: Differentiable GIF Encoding Framework": We introduce, to our knowledge, the first differentiable GIF encoding pipeline. It includes three novel neural networks: PaletteNet, DitherNet, and BandingNet. Each provides an important functionality within the GIF encoding pipeline. PaletteNet predicts a near-optimal color palette given an input image. DitherNet manipulates the input image to reduce color banding artifacts and provides an alternative to traditional dithering. Finally, BandingNet is designed to detect color banding, and provides a new perceptual loss specifically for GIF images.
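One way to see what "differentiable" buys here: hard nearest-palette-color quantization has zero gradient almost everywhere, but a temperature-controlled softmax over color distances gives a soft palette projection with usable gradients. This is a generic sketch of that relaxation, assuming nothing about the networks' internals; `soft_quantize` and `tau` are illustrative names.

```python
import numpy as np

def soft_quantize(pixels, palette, tau=0.1):
    """Soft palette projection: each pixel becomes a softmax-weighted mix of
    palette colors. As tau -> 0 this approaches hard nearest-color quantization,
    while remaining differentiable in both pixels and palette for tau > 0."""
    # pixels: (N, 3) RGB values, palette: (K, 3) palette colors
    d2 = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(-1)  # (N, K) distances
    logits = -d2 / tau
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)             # (N, K) soft assignments
    return w @ palette                            # (N, 3) quantized colors
```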

    • "Distortion Agnostic Deep Watermarking": We develop a framework for distortion-agnostic watermarking, where the image distortion is not explicitly modeled during training. Instead, the robustness of our system comes from two sources: adversarial training and channel coding. Compared to training on a fixed set of distortions and noise levels, our method achieves comparable or better results on distortions available during training, and better performance overall on unknown distortions.
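The channel-coding half of the robustness story can be illustrated with the simplest possible code: repeat each message bit before embedding, then majority-vote on decode, so the watermark survives a bounded number of bit flips from unknown distortions. The paper's coding is stronger; this is only a toy stand-in.

```python
import numpy as np

def encode(bits, r=5):
    """Repetition-code each watermark bit r times before embedding."""
    return np.repeat(np.asarray(bits), r)

def decode(coded, r=5):
    """Majority vote over each group of r received (possibly corrupted) bits."""
    groups = np.asarray(coded).reshape(-1, r)
    return (groups.sum(axis=1) > r // 2).astype(int)
```

With r = 5, up to 2 flips per message bit are corrected, at a 5x cost in payload capacity -- the usual redundancy/robustness trade-off.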

    • "LIDIA: Lightweight Learned Image Denoising with Instance Adaptation": We use a combination of supervised and unsupervised training: the first stage yields a general denoiser, and the second performs instance adaptation. LIDIA produces near state-of-the-art quality while using far fewer parameters than the leading methods.
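The instance-adaptation idea, in its smallest possible form: fine-tune the deployed denoiser on pairs synthesized from the test image itself, by re-noising the noisy input and treating the input as the pseudo-clean target. Below this is done for a one-parameter linear "denoiser" whose gain has a closed form; entirely illustrative and not LIDIA's actual adaptation procedure.

```python
import numpy as np

def instance_adapt_gain(noisy, sigma, rng):
    """Second-stage adaptation sketch: re-noise the noisy test image y to get a
    pair (y + n, y), then fit a scalar gain g minimizing ||g*(y + n) - y||^2.
    Stand-in for fine-tuning a network on image-specific pairs (hypothetical)."""
    renoised = noisy + rng.normal(0.0, sigma, noisy.shape)
    # Closed-form least-squares solution: g = <y+n, y> / <y+n, y+n>
    g = float((renoised * noisy).sum() / (renoised * renoised).sum())
    return g   # a new noisy input x would be denoised as g * x
```

The fitted gain shrinks toward 0 as the injected noise level grows, which is the qualitative behavior an image-adapted denoiser should show.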

  • July 2019 -- Our paper on Handheld Multi-frame Super-resolution was presented at SIGGRAPH 2019. You can find our paper, supplementary material and a short video describing the work at the project website. This technology powers the Super-Res Zoom and Night Sight (merge) features on Pixel phones.