Generative adversarial nets

Out-of-Class DSP Tool: GAN, and Its Loss Functions

For generative adversarial nets, we implemented the work of Zhentan Zheng and Jianyi Liu, Patch Permutation GAN (P2-GAN). The code was originally written in TensorFlow v1; we updated it to be compatible with TensorFlow v2 while learning more about its core algorithms.

Fig 1. How GAN works (Google Developers)

A GAN is a generative model for unsupervised learning, built from two competing networks:


The generator learns to generate plausible data.

The discriminator learns to distinguish the generator's fake data from real data.


The generator tries to fool the discriminator by generating data similar to those in the training set.

The discriminator tries not to be fooled, learning to identify fake data among real data.

(from Google Developers)


The two networks train simultaneously, each learning by competing against the other in this game.
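The two competing objectives above can be sketched numerically. The following is a minimal illustration (not the P2-GAN code), assuming sigmoid discriminator outputs and the common non-saturating generator loss:

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-8):
    # The discriminator wants d_real -> 1 and d_fake -> 0:
    # L_D = -E[log D(x)] - E[log(1 - D(G(z)))]
    return -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))

def generator_loss(d_fake, eps=1e-8):
    # Non-saturating form: the generator wants d_fake -> 1:
    # L_G = -E[log D(G(z))]
    return -np.mean(np.log(d_fake + eps))

# A confident discriminator (real -> 1, fake -> 0) has near-zero loss;
# a fooled discriminator (fake -> 1) makes the generator's loss small.
confident = discriminator_loss(np.array([0.99]), np.array([0.01]))
fooled_g = generator_loss(np.array([0.99]))
```

As training alternates between the two losses, each player's improvement raises the other's loss, which is the competition described above.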


After implementing the NST method for style transfer, we moved on to using a GAN for the same task. A well-trained GAN has better representation ability. This part of the project is based on the paper "P2-GAN: EFFICIENT STYLE TRANSFER USING SINGLE STYLE IMAGE".

What's Special about Patch Permutation GAN (P2-GAN)?

Fig 2. Reorganize “Starry Night” from randomly cropped patches (Zheng & Liu, 2020) 

Using this approach, we can feed in just one style image and still get good results. The style image is randomly cropped into patches, which are reorganized into a new image. After K patch permutations, we have a style training set of size K. This is much more efficient than collecting a whole set of style images to feed into the GAN.
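The patch-permutation step can be sketched as follows. This is an illustrative NumPy version, not the authors' implementation; the patch size, grid size, and stand-in "style image" are all assumptions:

```python
import numpy as np

def patch_permute(style_img, patch=4, grid=3, rng=None):
    """Build one patch-permuted training image by tiling randomly
    cropped patches from a single style image."""
    rng = rng or np.random.default_rng(0)
    h, w = style_img.shape[:2]
    out = np.zeros((grid * patch, grid * patch) + style_img.shape[2:],
                   dtype=style_img.dtype)
    for i in range(grid):
        for j in range(grid):
            # Crop a random patch from anywhere in the style image...
            y = rng.integers(0, h - patch + 1)
            x = rng.integers(0, w - patch + 1)
            # ...and place it at grid cell (i, j) of the new image.
            out[i*patch:(i+1)*patch, j*patch:(j+1)*patch] = \
                style_img[y:y+patch, x:x+patch]
    return out

style = np.arange(100.0).reshape(10, 10)   # stand-in "style image"
permuted = patch_permute(style, patch=4, grid=3)
```

Calling `patch_permute` K times with different random seeds yields the style training set of size K described above.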

Patch Discriminator

We then need to limit the convolutional computation of each layer to a single inner patch. Applying this rule to the patch discriminator Dp ensures that there is no content overlap, preventing kernels from stretching across different patches.

Applying an L-layer CNN structure to Patch GANs:

With patch size n x n, the convolutional kernel at each convolution layer l has size kl x kl and stride sl.
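One way to read this constraint is that the receptive field of a discriminator unit, computed from the kernel sizes kl and strides sl, must fit inside one n x n patch. A small sketch of that check (the layer configuration and n = 8 are assumed values for illustration, not numbers from the paper):

```python
def receptive_field(kernels, strides):
    """Receptive field of one output unit of an L-layer CNN with
    kernel sizes k_l and strides s_l (standard recurrence)."""
    r, jump = 1, 1
    for k, s in zip(kernels, strides):
        r += (k - 1) * jump   # each layer widens the field by (k-1)*jump
        jump *= s             # stride compounds the step between inputs
    return r

# For the kernels to stay inside one n x n patch, the receptive
# field must not exceed the patch size n (assumed n = 8 here).
n = 8
rf = receptive_field(kernels=[3, 3], strides=[2, 2])
fits = rf <= n
```

If `fits` is false, a unit in the last layer sees pixels from more than one patch, which is exactly the overlap the rule above forbids.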

Fig 3. Patch discriminator after applying the rule (Zheng & Liu, 2020)



Loss Functions

Adversarial Loss 

Dp, the patch discriminator, takes patch-permuted images as its real examples and processes natural images in the same way as conventional CNN discriminators.

The generator, G, learns stroke style from the single style image. The adversarial loss function is then defined in terms of Dp and G.
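Written out in the standard GAN form (a sketch based on the description above, where y denotes a patch-permuted style image and x a content image):

```latex
\mathcal{L}_{adv} = \mathbb{E}_{y}\big[\log D_p(y)\big]
                  + \mathbb{E}_{x}\big[\log\big(1 - D_p(G(x))\big)\big]
```

G is trained to minimize this objective while Dp is trained to maximize it.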

Content Loss 

The content loss calculates the normalized squared distance between the feature maps of the original image and the transferred image.
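In the usual perceptual-loss form (a sketch; here φ denotes a feature map of shape C x H x W taken from a fixed encoder network):

```latex
\mathcal{L}_{content} = \frac{1}{C\,H\,W}\,
    \big\lVert \phi(x) - \phi\big(G(x)\big) \big\rVert_2^2
```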

Final Function 
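Combining the two terms gives the overall objective (a sketch; λ is a trade-off weight balancing stroke style against content preservation):

```latex
\min_{G}\,\max_{D_p}\;\; \mathcal{L}_{adv} + \lambda\,\mathcal{L}_{content}
```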

Results

Page written by: Diwen Zhu. Results section written by: Jason Ning