CPSC 8810: ML-based Image Synthesis

Paper list

Cutting-Edge GANs

- [ProgressiveGAN] Progressive Growing of GANs for Improved Quality, Stability, and Variation, Karras et al, ICLR 2018
- [StyleGAN] A Style-Based Generator Architecture for Generative Adversarial Networks. Karras et al. CVPR 2019
- [StyleGAN2] Analyzing and Improving the Image Quality of StyleGAN, Karras et al., 2020
- [StyleGAN3] Alias-Free Generative Adversarial Networks, Karras et al., 2021
- [StyleGAN-T] StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis. Sauer et al., 2023
- [BigGAN] Large Scale GAN Training for High Fidelity Natural Image Synthesis, Brock et al., ICLR 2019
- [VQGAN] Taming Transformers for High-Resolution Image Synthesis. Esser et al. CVPR 2021

Cutting-Edge Diffusion Models

- [DDPM] Denoising Diffusion Probabilistic Models, Ho et al., 2020
- [DDIM] Denoising Diffusion Implicit Models, Song et al., 2021
- [SDE] Score-Based Generative Modeling through Stochastic Differential Equations, Song et al., ICLR 2021
- [GuidedDiffusion] Diffusion Models Beat GANs on Image Synthesis. Dhariwal et al. NeurIPS 2021
- [StableDiffusion] High-Resolution Image Synthesis with Latent Diffusion Models. Rombach et al. CVPR 2022

Other Generative Models

- [PixelRNN] Pixel Recurrent Neural Networks, Oord et al, 2016
- [PixelCNN] Conditional Image Generation with PixelCNN Decoders, Oord et al, 2016
- [RealNVP] Density estimation using Real NVP, Dinh et al, ICLR 2017
- [Glow] Glow: Generative flow with invertible 1x1 convolutions, Kingma et al., 2018
- [VQ-VAE] Neural Discrete Representation Learning. Oord et al. NeurIPS 2017
- [VQ-VAE-2] Generating Diverse High-Fidelity Images with VQ-VAE-2. Razavi et al., 2019
- [MaskGIT] MaskGIT: Masked Generative Image Transformer, Chang et al., 2022
- [LlamaGen] Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation, Sun et al., 2024

Image-to-Image Translation

- [Pix2Pix] Image-to-Image Translation with Conditional Adversarial Networks. Isola et al. CVPR 2017
- [UNIT] Unsupervised Image-to-Image Translation Networks, Liu et al., NeurIPS 2017
- [CycleGAN] Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. Zhu et al. ICCV 2017
- [StarGAN v2] StarGAN v2: Diverse Image Synthesis for Multiple Domains. Choi et al, CVPR 2020
- [MUNIT] Multimodal Unsupervised Image-to-Image Translation, Huang et al., ECCV 2018
- [CUT] Contrastive learning for unpaired image-to-image translation, Park et al., ECCV 2020

Text-to-Image Synthesis

- [StackGAN] StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks. Zhang et al., ICCV 2017
- [CLIP] Learning Transferable Visual Models From Natural Language Supervision. Radford et al. ICML 2021
- [StyleCLIP] StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery. Patashnik et al. ICCV 2021
- [GLIDE] GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models. Nichol et al. ICML 2022
- [DALL·E] Zero-Shot Text-to-Image Generation. Ramesh et al. ICML 2021
- [DALL·E 2] Hierarchical Text-Conditional Image Generation with CLIP Latents. Ramesh et al. arXiv 2022
- [Google Imagen] Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding. Saharia et al. arXiv 2022

Image Style Transfer

- Image style transfer using convolutional neural networks, Gatys et al., CVPR 2016.
- [Perceptual Loss] Perceptual Losses for Real-Time Style Transfer and Super-Resolution, Johnson et al., ECCV 2016.
- [AdaIN] Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization. Huang et al. ICCV 2017
- [ArtFlow] ArtFlow: Unbiased Image Style Transfer via Reversible Neural Flows. An et al, CVPR 2021
- [PhotoST] Deep photo style transfer. Luan et al, CVPR 2017
- [PhotoWCT] A closed-form solution to photorealistic image stylization. Li et al, ECCV 2018

Image Editing

- [PatchMatch] PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing, Barnes et al, SIGGRAPH 2009
- [Swapping Autoencoder] Swapping Autoencoder for Deep Image Manipulation. Park et al. NeurIPS 2020
- [GauGAN] Semantic Image Synthesis with Spatially-Adaptive Normalization. Park et al. CVPR 2019
- [Image2StyleGAN] Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?, Abdal et al, ICCV 2019
- [Image2StyleGAN++] Image2StyleGAN++: How to Edit the Embedded Images?, Abdal et al, CVPR 2020
- [SDEdit] SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations, Meng et al., ICLR 2022

Video Synthesis

- [vid2vid] Video-to-Video Synthesis, Wang et al., NeurIPS 2018
- Everybody Dance Now. Chan et al. ICCV 2019
- [MoCoGAN] MoCoGAN: Decomposing Motion and Content for Video Generation. Tulyakov et al. CVPR 2018
- [Make-A-Video] Make-a-video: Text-to-video generation without text-video data. Singer et al, 2022
- [Imagen Video] Imagen Video: High Definition Video Generation with Diffusion Models, Ho et al, 2022
- [Sora] Video generation models as world simulators. Brooks et al., 2024.

Image Restoration

- Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising. Zhang et al, TIP 2017
- [DIP] Deep Image Prior, Ulyanov et al., CVPR 2018.
- [FFDNet] FFDNet: Toward a Fast and Flexible Solution for CNN based Image Denoising. Zhang et al, TIP 2018
- [SRGAN] Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. Ledig et al, CVPR 2017
- [SN-PatchGAN] Free-Form Image Inpainting with Gated Convolution. Yu et al, ICCV 2019
- [DarkChannel] Single image haze removal using dark channel prior. He et al, CVPR 2009

3D-aware Synthesis

- [NeRF] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis, Mildenhall et al, ECCV 2020
- [InstantNGP] Instant Neural Graphics Primitives with a Multiresolution Hash Encoding. Müller et al., SIGGRAPH 2022
- [3DGS] 3D Gaussian Splatting for Real-Time Radiance Field Rendering. Kerbl et al., SIGGRAPH 2023
- [PointNet] PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. Qi et al, CVPR 2017
- [pi-GAN] pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis, Chan et al, CVPR 2021
- [StyleNeRF] StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis, Gu et al, arXiv 2021
- [3DGP] 3D Generation on ImageNet. Skorokhodov et al, ICLR 2023
- [DeepSDF] DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation. Park et al, CVPR 2019

Face and Pose Modeling

- [GANimation] GANimation: Anatomically-aware Facial Animation from a Single Image, Pumarola et al, ECCV 2018
- [StyleRig] StyleRig: Rigging StyleGAN for 3D Control over Portrait Images, Tewari et al, CVPR 2020
- Deep Video Portraits, Kim et al, SIGGRAPH 2018
- Text-based Editing of Talking-head Video, Fried et al., SIGGRAPH 2019
- [SMPL] SMPL: A Skinned Multi-Person Linear Model. Loper et al. SIGGRAPH Asia 2015
- [HMR] End-to-end Recovery of Human Shape and Pose. Kanazawa et al. CVPR, 2018
- [PIFu] PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization. Saito et al. ICCV 2019

Data Augmentation

- [DAGAN] Data Augmentation Generative Adversarial Networks. Antoniou et al. arXiv 2017
- [DatasetGAN] DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort. Zhang et al, CVPR 2021
- [AutoAugment] AutoAugment: Learning Augmentation Strategies from Data. Cubuk et al, CVPR 2019
- [RandAugment] RandAugment: Practical Automated Data Augmentation with a Reduced Search Space. Cubuk et al, CVPR Workshop 2020
- [Albumentations] Albumentations: fast and flexible image augmentations. Buslaev et al, MDPI Information 2020
- [Mixup] Mixup: Beyond empirical risk minimization. Zhang et al, ICLR 2018

Interpretable Generative Models

- [InfoGAN] InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets. Chen et al, NeurIPS 2016
- [GANSpace] GANSpace: Discovering Interpretable GAN Controls. Harkonen et al. NeurIPS 2020
- [GAN Dissection] GAN Dissection: Visualizing and Understanding Generative Adversarial Networks. Bau et al. ICLR 2019
- [InterFaceGAN] Interpreting the Latent Space of GANs for Semantic Face Editing. Shen et al, CVPR 2020
- Closed-Form Factorization of Latent Semantics in GANs. Shen et al, CVPR 2021

Image Forensics

- [FaceForensics++] FaceForensics++: Learning to Detect Manipulated Facial Images. Rössler et al, ICCV 2019
- [Celeb-DF] Celeb-DF: A Large-scale Challenging Dataset for DeepFake Forensics. Li et al, CVPR 2020
- DeepFake Detection by Analyzing Convolutional Traces. Guarnera et al. arXiv 2020
- CNN-generated images are surprisingly easy to spot... for now. Wang et al. CVPR 2020
- Extracting Training Data from Diffusion Models, Carlini et al., USENIX 2023
