Neural Style Transfer

Figure 1: Starry Night Style Transfer [1]

Goals For the Project

The goal of this project is to transform an image from the ArtBench-10 dataset into another style category of the same dataset. We want to preserve as much of the original image's content as possible while changing its style.

Our project is based on the article "A Neural Algorithm of Artistic Style", and we were provided with a starter code file from "Problem Set 6: Image Synthesis" of EECS 442/504.

System Overview

Style transfer can be achieved by extracting features and styles from layers of a Convolutional Neural Network (CNN). To generate the synthesized image, we start with a white noise image and perform gradient descent to find an image that best represents the content of the original image and the style of the target image. The key building blocks are the content loss, the Gram matrix, and the style loss.
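As a compact summary of that optimization, the objective can be written in the notation of the cited paper "A Neural Algorithm of Artistic Style". Here p is the content image, a is the style image, x is the generated image, and α and β are weighting factors; the specific weights used in this project are not stated on this page.

```latex
\mathcal{L}_{\text{total}}(\vec{p}, \vec{a}, \vec{x})
  = \alpha \, \mathcal{L}_{\text{content}}(\vec{p}, \vec{x})
  + \beta \, \mathcal{L}_{\text{style}}(\vec{a}, \vec{x})
```

Gradient descent updates x, not the network weights, so the pre-trained CNN stays fixed throughout.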

The CNN model used here is a pre-trained version of SqueezeNet.

Figure 2: Neural style transfer system diagram, v7Labs [2]

Figure 3: Feature Map Demonstration [3]

Feature Map

Each convolutional layer of a CNN produces a set of feature maps. The first few layers capture small image features such as edges, while the deeper layers capture more global features such as object parts.

If we strategically choose the feature map layer, we can obtain information about the subject/content in the image.


Content Loss

Given two feature maps, one from the content image and the other from the generated image, we can use the equation on the right to calculate the content loss.

This equation resembles a Mean Square Error (MSE) function that calculates the squared ℓ2 distance between two convolutional feature maps.
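The following is a minimal sketch of such a loss, assuming the unnormalized sum-of-squared-differences form with an optional weight. Plain Python lists stand in for CNN feature tensors, and the names content_loss and weight are illustrative rather than taken from the starter code.

```python
def content_loss(content_features, generated_features, weight=1.0):
    """Weighted squared L2 distance between two feature maps.

    Each feature map is a list of channels; each channel is a flat list of
    activations (the spatial positions are already flattened).
    """
    total = 0.0
    for content_channel, generated_channel in zip(content_features, generated_features):
        for c, g in zip(content_channel, generated_channel):
            total += (g - c) ** 2
    return weight * total


# Two toy feature maps with 2 channels and 2 spatial positions each.
content = [[1.0, 2.0], [0.0, 1.0]]
generated = [[1.0, 0.0], [2.0, 1.0]]
loss = content_loss(content, generated)
```

The loss is zero only when the two feature maps are identical, and it grows as the generated image's features drift away from the content image's features.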

If we then perform gradient descent to minimize the content loss, we preserve as much content from the input image as possible.
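To make the minimization step concrete, here is a toy sketch under a strong simplifying assumption: the "feature extractor" is the identity, so the features are the pixel values themselves and the gradient can be written by hand. In the actual project the features come from the CNN and the gradient is obtained by backpropagation; the learning rate and step count below are arbitrary illustrative values.

```python
import random


def content_loss(target, image):
    """Squared L2 distance between target features and image features."""
    return sum((x - t) ** 2 for x, t in zip(image, target))


def content_loss_gradient(target, image):
    """Gradient of the loss with respect to each image value: 2 * (x - t)."""
    return [2.0 * (x - t) for x, t in zip(image, target)]


random.seed(0)
target = [0.2, 0.8, 0.5, 0.1]               # stand-in for the content features
image = [random.random() for _ in target]   # start from a noise image
initial_loss = content_loss(target, image)

learning_rate = 0.1
for _ in range(100):
    grad = content_loss_gradient(target, image)
    # Move every value a small step against its gradient.
    image = [x - learning_rate * g for x, g in zip(image, grad)]

final_loss = content_loss(target, image)
```

Each step shrinks the gap between the image and the target, so the noise image gradually converges to the content.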

Example: Too much content loss

The example below shows what happens when the content loss is left too high. Too little of the original content remains in the final transferred image for a viewer to tell what the content image depicted.

Figure 4: Gram Matrix Visualized [4]

Gram Matrix

We can also analyze the style of an image using the feature maps of a CNN. Think of the "texture" of an image as how often some specific colors and edges appear together. 

How can we find correlations? As we learned in class, the dot product measures the similarity between two vectors. If two features often appear together, their dot product should be a relatively large number.

We can apply the same idea to a whole matrix, which gives us the Gram matrix. Again, we start by extracting the feature maps of the desired image from the CNN. Since the feature maps have three dimensions (height, width, and channel), we can simplify later operations by first flattening the height and width into a single dimension. We then multiply the flattened feature map by its transpose, which measures the similarity between every pair of its rows.
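The flatten-then-multiply procedure can be sketched as follows, assuming a channel-first layout (C x H x W) so that each row of the flattened map is one channel. The helper names flatten_spatial and gram_matrix are illustrative, and nested lists stand in for real tensors.

```python
def flatten_spatial(feature_map):
    """Turn a C x H x W nested list into C rows of H * W values."""
    return [[value for row in channel for value in row] for channel in feature_map]


def gram_matrix(features):
    """Multiply the flattened C x (H*W) map by its transpose, giving C x C.

    Entry (i, j) is the dot product of channel i with channel j.
    """
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) for row_j in features]
        for row_i in features
    ]


# Three toy channels on a 2 x 2 grid.
feature_map = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [1.0, 0.0]],  # fires at the same positions as channel 0
    [[0.0, 1.0], [0.0, 1.0]],  # never fires together with channel 0
]
flat = flatten_spatial(feature_map)
gram = gram_matrix(flat)
```

Channels 0 and 1 respond at the same positions, so their entry in the Gram matrix is large, while channels 0 and 2 never respond together, so their entry is zero.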




Example: Gram Matrix is Symmetric (DSP TOOL USED HERE)

As we learned in class, a matrix multiplied by its transpose is a symmetric matrix. We have plotted some Gram matrices, and it is apparent that each one is symmetric about its main diagonal.
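The same property can be checked numerically as well as visually. The sketch below builds a Gram matrix from random stand-in features (not real CNN activations) and compares each entry with its mirror across the diagonal; is_symmetric is a hypothetical helper written for this illustration.

```python
import random


def gram_matrix(features):
    """Product of a C x N matrix with its transpose, giving C x C."""
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) for row_j in features]
        for row_i in features
    ]


def is_symmetric(matrix, tol=1e-9):
    """True if every entry (i, j) matches entry (j, i) within tol."""
    n = len(matrix)
    return all(
        abs(matrix[i][j] - matrix[j][i]) <= tol
        for i in range(n)
        for j in range(i + 1, n)
    )


random.seed(0)
# Five random "channels" with sixteen flattened positions each.
features = [[random.random() for _ in range(16)] for _ in range(5)]
gram = gram_matrix(features)
symmetric = is_symmetric(gram)
```

Symmetry follows because entry (i, j) is the dot product of rows i and j, and the dot product does not depend on the order of its arguments.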



Style Loss

The style loss is very similar to the content loss in that it uses a Mean Square Error to measure the distance between the generated image and the input style image. This time, the difference is measured between the style image's Gram matrix and the generated image's Gram matrix. We can also add a weight parameter that lets us easily tune how much style is preserved in the final generated image.

As with the content loss, we can perform gradient descent on the style loss to preserve as much style as possible.
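A minimal sketch of this computation for a single layer is shown below. It assumes the feature maps are already flattened to one row per channel, and it uses an unnormalized sum of squared differences scaled by a weight; the exact normalization is an assumption, and style_loss and weight are illustrative names.

```python
def gram_matrix(features):
    """Product of a C x N matrix with its transpose, giving C x C."""
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) for row_j in features]
        for row_i in features
    ]


def style_loss(style_features, generated_features, weight=1.0):
    """Weighted squared distance between the two Gram matrices."""
    style_gram = gram_matrix(style_features)
    generated_gram = gram_matrix(generated_features)
    total = 0.0
    for style_row, generated_row in zip(style_gram, generated_gram):
        for s, g in zip(style_row, generated_row):
            total += (g - s) ** 2
    return weight * total


# Toy flattened feature maps: 2 channels, 2 spatial positions.
style = [[1.0, 0.0], [0.0, 1.0]]
generated = [[1.0, 1.0], [0.0, 0.0]]
loss = style_loss(style, generated, weight=0.5)
```

Raising the weight makes the optimizer care more about matching the style image's Gram matrix, and lowering it leaves more room for the content term.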




Results achieved 

Art Style --> Art Style

art nouveau

"Grey Day Laurentians" by A Y Jackson, 1928

expressionism

"Caltagirone all’alba" by Antonietta Raphael, 1951

Final image

Preserved the buildings, mountains/sunset. Transferred mostly the color theme.

ukiyo-e

Artwork by Utagawa Hiroshige II, 1619

baroque

"View of Dordrecht" by Aelbert Cuyp, 1655

Final image

Ships and their reflections preserved. Sky and water changed.

Real World Photo --> Art Style

ukiyo-e

The Great Wave off Kanagawa



Photograph

A Person in Red Surfing

Final Image

Successfully preserved the red/green clothing of the person and changed the color of the waves.

Oil paint

"RAIN'S RUSTLE IN THE PARK" by Leonid Afremov

Photograph

Bird on a branch

Final Image

Stunning result

Chinese Ink Paint

"雪绕羣峰" (Snow surrounding the peaks), unknown author

Photograph

Mountains with birds

Final Image

Birds and mountains preserved. Notice that the top part of the mountain became darker, which reflects the style of Chinese ink paintings.

Page written by: Jason Ning