Figure 1: Starry Night Style Transfer [1]
The goal of this project is to transform an image from the ArtBench-10 dataset into another style category of the dataset. We want to preserve as much of the original image's content as possible while changing its style.
Our project is based on the article "A Neural Algorithm of Artistic Style", and we were provided with a starter code file from "Problem Set 6: Image Synthesis" of EECS 442/504.
Style transfer can be achieved by extracting content features and style information from the layers of a Convolutional Neural Network (CNN). To generate the synthesized image, we start with a white noise image and perform gradient descent to find an image that matches the content features of the content image and the style of the style image. The key components are the content loss, the Gram matrix, and the style loss.
The CNN model used here is a pre-trained version of SqueezeNet.
Figure 2: Neural style transfer system diagram, v7Labs [2]
Figure 3: Feature Map Demonstration [3]
Convolution is performed in every layer of a CNN. Small image features such as edges are captured in the first few layers, while more global features such as object parts are captured in the deeper layers.
If we strategically choose the feature map layer, we can obtain information about the subject/content in the image.
Given two feature maps, one from the content image and one from the generated image, we can calculate the content loss between them.
The content loss equation resembles a Mean Squared Error (MSE) function: it computes the squared ℓ2 distance between the two convolutional feature maps.
If we then perform gradient descent to minimize the content loss, the generated image preserves as much content from the input image as possible.
The example below shows what happens when too much content is lost: in the final transferred image, too little of the original content remains for a viewer to tell what the content image depicted.
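As a rough illustration of the content loss and of minimizing it by gradient descent, here is a minimal NumPy sketch. It is not the project's starter code: the function names, the `content_weight` parameter, and the random arrays standing in for CNN feature maps are all assumptions made for this example. For simplicity it descends directly on the feature values, whereas the real pipeline updates the image pixels by backpropagating through the CNN.

```python
import numpy as np

def content_loss(content_weight, generated_feats, content_feats):
    """Weighted squared L2 distance between two feature maps of equal shape."""
    diff = generated_feats - content_feats
    return content_weight * np.sum(diff ** 2)

def content_loss_grad(content_weight, generated_feats, content_feats):
    """Gradient of content_loss with respect to generated_feats."""
    return 2.0 * content_weight * (generated_feats - content_feats)

# Random arrays stand in for CNN feature maps of shape (channels, height, width).
rng = np.random.default_rng(0)
content_feats = rng.standard_normal((4, 8, 8))
generated_feats = rng.standard_normal((4, 8, 8))  # starts out as noise

initial_loss = content_loss(1.0, generated_feats, content_feats)

# Plain gradient descent: each step pulls the generated features
# toward the content features.
learning_rate = 0.1
for _ in range(50):
    grad = content_loss_grad(1.0, generated_feats, content_feats)
    generated_feats = generated_feats - learning_rate * grad

final_loss = content_loss(1.0, generated_feats, content_feats)
print(f"content loss: {initial_loss:.3f} -> {final_loss:.3e}")
```

The loss shrinks toward zero as the generated features approach the content features, which is the sense in which minimizing the content loss preserves content.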
We can also analyze the style of an image using the feature maps of a CNN. Think of the "texture" of an image as how often some specific colors and edges appear together.
How can we find correlations? As we learned in class, the dot product measures the similarity between two vectors. If two features often appear together, then their dot product should be a relatively large number.
We can extend this idea to a whole matrix, called the Gram matrix. Again, we start by extracting the feature maps of the desired image from the CNN. Since the feature maps have three dimensions (height, width, and channel), we can simplify later operations by first flattening the two spatial dimensions (height and width) into one. We then multiply the flattened feature map by its transpose to measure the similarity between its rows.
As we learned in class, a matrix multiplied by its transpose is a symmetric matrix. We have plotted some Gram matrices, and it is apparent that each one is symmetric about its main diagonal.
The style loss is very similar to the content loss in that it uses a Mean Squared Error to measure the distance between the generated image and the input style image. This time, the difference is measured between the style image's Gram matrix and the generated image's Gram matrix. We can also add a weight parameter to easily tune how much style is preserved in the final generated image.
As with the content loss, we can perform gradient descent on the style loss so that the generated image preserves as much of the style as possible.
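To make the dot-product intuition concrete, here is a small NumPy sketch. The three vectors are made-up responses of hypothetical feature channels at six image locations; they are not taken from the project.

```python
import numpy as np

# Hypothetical responses of three feature channels at six image locations.
edges_a = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0])
edges_b = np.array([0.9, 0.1, 1.0, 0.0, 0.8, 0.0])  # active where edges_a is active
blobs = np.array([0.0, 1.0, 0.0, 0.9, 0.0, 1.0])    # active where edges_a is silent

together = np.dot(edges_a, edges_b)  # features that co-occur
apart = np.dot(edges_a, blobs)       # features that never co-occur

print("co-occurring features:", together)
print("non-overlapping features:", apart)
```

The first dot product is large because both channels respond at the same locations, while the second is zero because the two channels never respond together.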
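The Gram matrix computation described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the starter code: the function name, the `(channels, height, width)` layout, and the optional normalization by the number of elements are assumptions, and a random array stands in for a real CNN feature map.

```python
import numpy as np

def gram_matrix(feats, normalize=True):
    """Compute the Gram matrix of a feature map.

    feats: array of shape (channels, height, width).
    Returns an array of shape (channels, channels) whose entry (i, j)
    is the dot product between flattened channels i and j.
    """
    c, h, w = feats.shape
    flat = feats.reshape(c, h * w)   # flatten height and width into one axis
    gram = flat @ flat.T             # similarity between every pair of channels
    if normalize:
        gram = gram / (c * h * w)    # assumed normalization, keeps values comparable
    return gram

rng = np.random.default_rng(0)
feats = rng.standard_normal((3, 5, 5))  # stand-in for a CNN feature map
gram = gram_matrix(feats)

is_symmetric = np.allclose(gram, gram.T)
print(gram.shape, "symmetric:", is_symmetric)
```

Entry (i, j) and entry (j, i) are the same dot product taken in a different order, which is why the plotted matrices mirror across the diagonal.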
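A minimal NumPy sketch of a single-layer style loss follows. The names `style_loss` and `style_weight`, the normalization inside `gram_matrix`, and the random arrays used in place of CNN features are assumptions for illustration only. The last part of the example shuffles the spatial positions of a feature map to show that the Gram matrix, and therefore the style loss, ignores where features occur and only records which features occur together.

```python
import numpy as np

def gram_matrix(feats):
    """Gram matrix of a (channels, height, width) feature map, normalized by size."""
    c, h, w = feats.shape
    flat = feats.reshape(c, h * w)
    return (flat @ flat.T) / (c * h * w)

def style_loss(style_weight, generated_feats, style_feats):
    """Weighted squared distance between the Gram matrices of two feature maps."""
    diff = gram_matrix(generated_feats) - gram_matrix(style_feats)
    return style_weight * np.sum(diff ** 2)

rng = np.random.default_rng(0)
style_feats = rng.standard_normal((4, 6, 6))      # stand-in for style image features
generated_feats = rng.standard_normal((4, 6, 6))  # stand-in for generated image features

base_loss = style_loss(1.0, generated_feats, style_feats)
doubled_loss = style_loss(2.0, generated_feats, style_feats)  # weight scales the loss

# Shuffle spatial positions: the layout changes, the channel statistics do not.
c, h, w = style_feats.shape
perm = rng.permutation(h * w)
shuffled_feats = style_feats.reshape(c, h * w)[:, perm].reshape(c, h, w)
shuffled_loss = style_loss(1.0, shuffled_feats, style_feats)

print("style loss:", base_loss)
print("style loss with doubled weight:", doubled_loss)
print("style loss after spatial shuffle:", shuffled_loss)
```

The shuffled feature map has a style loss of essentially zero against the original even though its spatial layout is completely different, which is why the style loss transfers texture and color statistics rather than object placement.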
"Grey Day Laurentians" by A Y Jackson, 1928
"Caltagirone all’alba" by Antonietta Raphael, 1951
Preserved the buildings, mountains/sunset. Transferred mostly the color theme.
Artwork by Utagawa Hiroshige II, 1619
"View of Dordrecht" by Aelbert Cuyp, 1655
Ships and their reflections preserved. Sky and water changed.
The Great Wave off Kanagawa
A Person in Red Surfing
Successfully preserved the red/green clothing of the person and changed the color of the waves.
"RAIN'S RUSTLE IN THE PARK" by Leonid Afremov
Bird on a branch
Stunning result
"雪绕羣峰" (Snow surrounding the peaks), unknown author
Mountains with birds
Birds and mountains preserved. Notice that the top part of the mountain became darker; this reflects the style of Chinese ink paintings.
Page written by: Jason Ning