What is Neural Style Transfer?
Per TensorFlow’s website, Neural Style Transfer is an optimization technique used to take two images—a content image and a style reference image (such as an artwork by a famous painter)—and blend them together so the output image looks like the content image, but “painted” in the style of the style reference image. In essence, Neural Style Transfer (NST) is a technique that allows us to generate an image with the same “content” as a base image, but with the “style” of our chosen picture.[1]
NST algorithms are characterized by their use of deep neural networks for image transformation. Common uses of NST include creating artificial artwork from photographs, for example by transferring the appearance of famous paintings to user-supplied photographs. Several notable mobile apps use NST techniques for this purpose, including DeepArt and Prisma.[2]
NST was first published in the paper "A Neural Algorithm of Artistic Style" by Leon Gatys et al., originally released on arXiv in 2015,[3] and subsequently accepted by the peer-reviewed CVPR conference in 2016.[4]
The original paper used a VGG-19 architecture[5] that had been pre-trained to perform object recognition using the ImageNet dataset.
Algorithm
In our algorithm we work with two individual images to produce one combined output image. We will refer to the first image as the content image (A) and the second as the style image (B). We combine these two images to generate an output image (C).
In this project, we used the VGG-19 neural network architecture. This approach focuses on minimizing the error function

L_total(A, B, C) = α · L_content(A, C) + β · L_style(B, C)

Here, α (alpha) and β (beta) are the weighting factors for content and style reconstruction, respectively. By minimizing this function, our output image (C) is produced in such a way that its content is similar to image (A) and its artistic style to image (B).
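As a sketch of how this objective can be computed, the PyTorch snippet below implements the content term as a mean-squared error between feature maps and the style term as a mean-squared error between Gram matrices, which is the standard formulation from Gatys et al.; the function names and default weights here are our own illustration, not code from the original paper.

```python
import torch
import torch.nn.functional as F

def gram_matrix(features):
    # features: (channels, height, width) activation map from one CNN layer.
    # The Gram matrix captures correlations between feature channels,
    # which is what the style loss compares.
    c, h, w = features.size()
    flat = features.view(c, h * w)
    return (flat @ flat.t()) / (c * h * w)

def total_loss(content_feat, style_feats, out_content_feat, out_style_feats,
               alpha=1.0, beta=1e3):
    # alpha * L_content + beta * L_style, as in the error function above.
    l_content = F.mse_loss(out_content_feat, content_feat)
    l_style = sum(F.mse_loss(gram_matrix(o), gram_matrix(s))
                  for o, s in zip(out_style_feats, style_feats))
    return alpha * l_content + beta * l_style
```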
In our implementation, we visualize the information that the neural network has "learned" at each layer, reconstructing the image from the activations up to that layer for each photo.
The deeper layers blur the pixels more, and by the final layer the stylized subject has expanded to fill all regions of the image.
We reconstruct the input image from the layers ‘conv1_1’ through ‘conv5_1’ of the original VGG network.
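As an illustrative sketch, the snippet below collects activations at those five layers using torchvision's pre-trained VGG-19; the integer indices are the positions of ‘conv1_1’ through ‘conv5_1’ in torchvision's vgg19().features stack, and get_features is a name of our own.

```python
from torchvision import models

# Pre-trained VGG-19; the weights stay frozen because NST optimizes
# the output image, never the network itself.
vgg = models.vgg19(pretrained=True).features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)

# Positions of conv1_1 ... conv5_1 inside torchvision's VGG-19 feature stack.
LAYERS = {0: 'conv1_1', 5: 'conv2_1', 10: 'conv3_1', 19: 'conv4_1', 28: 'conv5_1'}

def get_features(image):
    # image: a (1, 3, H, W) tensor, normalized with the ImageNet statistics.
    feats = {}
    x = image
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in LAYERS:
            feats[LAYERS[i]] = x
    return feats
```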
Implementation
There were multiple libraries we needed for our program to run: PyTorch, Pillow, OpenCV, and NumPy. The one we used most during the project was PyTorch, an open-source machine learning library based on the Torch library and used for deep learning applications on both GPUs and CPUs. We felt it was the best choice to build on, since PyTorch is Python-based and Python is the language we were most comfortable with.
PyTorch builds deep learning applications on top of dynamic computation graphs that can be altered during runtime. This increased our productivity on the project, since PyTorch can also speed up the optimization considerably by running it on GPUs.
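For instance, choosing the device in PyTorch is a one-line check. The snippet below is a minimal sketch in which the image tensor is a random placeholder and the optimizer settings are illustrative (the original paper used L-BFGS rather than Adam).

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Every tensor and module in the optimization must live on the same device.
# In NST the output image itself is the parameter being optimized.
image = torch.rand(1, 3, 512, 512, device=device, requires_grad=True)
optimizer = torch.optim.Adam([image], lr=0.01)
```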
Deep learning methods, especially convolutional neural networks (CNNs), have become a major technique for a variety of image processing tasks. Two techniques for controlling color in style transfer are color histogram matching and luminance-only transfer. Luminance-only transfer exploits the fact that visual perception is far more sensitive to changes in luminance than to changes in the color channels, so the style is transferred on the luminance channel alone. The other technique, color histogram matching, transforms the style image's colors to match the colors of the content image before the transfer. The color matching method is limited by how well the colors of the content image can be transferred onto the style image. Luminance-only transfer, in contrast, preserves the colors of the content image perfectly, but dependencies between the luminance and color channels are lost in the output image.
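As a concrete sketch of the luminance-only idea, the Pillow snippet below shows the recombination step, assuming the style transfer itself has already produced a stylized image of the same size as the content image; recombine_luminance is an illustrative name of our own.

```python
from PIL import Image

def recombine_luminance(stylized, content):
    # Keep the stylized luminance channel (Y) but restore the content
    # image's color channels (Cb, Cr), so the output keeps the content's
    # colors. Both inputs are PIL images of the same size.
    y, _, _ = stylized.convert('YCbCr').split()
    _, cb, cr = content.convert('YCbCr').split()
    return Image.merge('YCbCr', (y, cb, cr)).convert('RGB')
```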
Results
These images were produced by combining the two source images below through an NST method, using the VGG-19 architecture for our processing. In both images on the right, we used luminance-only transfer. We ran the transfer for a total of 4000 iterations to analyze the difference between iteration counts. The first image shown is after 1000 iterations; the second is after 4000 iterations.
Color histogram matching is limited by how well the colors of the content image can be transferred onto the style image. The color distribution often cannot be matched perfectly, leading to a mismatch between the colors of the output image and those of the content image.
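One common way to implement the color-matching step, shown as a sketch below, is a linear transform that shifts the style image's pixels so their mean and covariance match the content image's; this NumPy implementation is our own illustration, not code from the original work.

```python
import numpy as np

def match_colors(style, content):
    # style, content: (H, W, 3) float arrays with values in [0, 1].
    s = style.reshape(-1, 3)
    c = content.reshape(-1, 3)
    mu_s, mu_c = s.mean(axis=0), c.mean(axis=0)
    cov_s = np.cov(s, rowvar=False) + 1e-5 * np.eye(3)
    cov_c = np.cov(c, rowvar=False) + 1e-5 * np.eye(3)

    def sqrtm(m):
        # Matrix square root of a symmetric PSD matrix via eigendecomposition.
        vals, vecs = np.linalg.eigh(m)
        return vecs @ np.diag(np.sqrt(np.clip(vals, 0, None))) @ vecs.T

    # x' = A (x - mu_s) + mu_c with A = cov_c^(1/2) cov_s^(-1/2), so the
    # transformed pixels have the content image's mean and covariance.
    A = sqrtm(cov_c) @ np.linalg.inv(sqrtm(cov_s))
    matched = (s - mu_s) @ A.T + mu_c
    return matched.reshape(style.shape).clip(0.0, 1.0)
```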
Luminance-only transfer preserves the colors of the content image, but dependencies between the luminance and color channels are lost in the output image. This is particularly apparent for styles with prominent brushstrokes: it can produce multi-colored brushstrokes with colors that could not be made in real paintings.
Source images: content image (A), style image (B)
Outputs: after 1000, 2000, 3000, and 4000 iterations
Luminance Only Transfer
Source images: content image (A), style image (B)
Output
Color Histogram Matching
Source images: content image (A), style image (B)
Output
Conclusion
One issue we encountered was getting our outputs to run on a GPU. Originally we ran our code on CPUs, and we hit many errors that prevented our images from being produced. The main difference between GPUs and CPUs is that GPUs devote proportionally more transistors to arithmetic logic units and fewer to caches and flow control than CPUs do.
Deep learning was new to both of us, so we learned a lot from this project. We think NST will continue to be used and commercialized in the future; it is already applied in video games, artwork, videos, virtual reality, and more, and the technique could potentially be used in the Metaverse and other virtual worlds. Image style transfer is a growing topic of research that aims to boost the artistic capabilities of everyone.
References and links
1. "Neural style transfer". TensorFlow tutorials. https://www.tensorflow.org/tutorials/generative/style_transfer
3. Gatys, Leon A.; Ecker, Alexander S.; Bethge, Matthias (26 August 2015). "A Neural Algorithm of Artistic Style". arXiv:1508.06576 [cs.CV].
4. Gatys, Leon A.; Ecker, Alexander S.; Bethge, Matthias (2016). "Image Style Transfer Using Convolutional Neural Networks". cv-foundation.org. pp. 2414–2423. Retrieved 13 February 2019.
5. "Very Deep Convolutional Networks for Large-Scale Visual Recognition". robots.ox.ac.uk. 2014. Retrieved 13 February 2019.
Project Proposal:
https://docs.google.com/document/d/1ML1rHAs5uvyeD6uXudH-ga3VoqOQalukEPmcR2dXyZU/edit
Midterm Project Report:
https://docs.google.com/document/d/1iKKkH7GwStqjpJkloIo5suH6ZrcOny_0-91ZpWQro2Q/edit
Project Presentation:
GitHub:
https://github.com/annabelleshultz/Using-Color-Optimization-with-Neural-Style-Transfer