Improving Output Image Quality - jcjohnson/neural-style GitHub Wiki


The style images were collected from here: https://commons.wikimedia.org/wiki/Category:Gigapixel_images_from_the_Google_Art_Project

Each image is 200MB or larger, so downloading them directly into the Neural-Style directory is advised:

wget https://upload.wikimedia.org/wikipedia/commons/0/0b/Sandro_Botticelli_-_La_nascita_di_Venere_-_Google_Art_Project_-_edited.jpg

wget https://upload.wikimedia.org/wikipedia/commons/e/ed/Canaletto_-_Bucentaur%27s_return_to_the_pier_by_the_Palazzo_Ducale_-_Google_Art_Project.jpg

Because of the Russian characters in its file name, the third style image has to be downloaded manually, without wget: Александр Андреевич Иванов - Явление Христа народу (Явление Мессии) - Google Art Project.jpg


There are several ways to improve the quality of the final output image. The most important is getting your parameters right, but you can improve the quality further using the methods outlined below:

  1. Repeating your step(s) with the output image as your new content image:

-content_image <output_image>

  • Top left first run, top right second run, bottom left third run, and bottom right fourth run.

  • The exact same parameters were used for each "run". Each "run" was composed of 7 multiscale/multires.sh steps starting at 640px and ending at 1920px.

  • The example image is composed of 224x224 crops from the center of each output image. They were then resized so that the differences were more easily visible. The non-resized crop set can be found here: https://i.imgur.com/wcb8aCl.png

Direct link: https://i.imgur.com/oK16RYA.png
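
The repeat-run idea from step 1 can be sketched as a small shell loop. This is only a minimal sketch, not the script used to produce the example image: `run_pass` and `repeat_runs` are hypothetical helper names, the file names are placeholders, and each "run" is reduced to a single neural_style.lua call rather than the 7 multires steps described above. By default the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch: feed each run's output back in as the next run's content image.
# run_pass and repeat_runs are hypothetical helpers; file names are placeholders.

# One pass: content image, style image, output image.
# The ( ... ) body runs in a subshell, so the variables stay local.
run_pass() (
  content=$1 style=$2 output=$3
  set -- th neural_style.lua \
    -content_image "$content" \
    -style_image "$style" \
    -output_image "$output"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"      # dry run: only print the command
  else
    "$@"           # real run: execute it
  fi
)

# Repeat the pass N times with identical parameters, chaining the images.
repeat_runs() (
  content=$1 style=$2 runs=$3
  i=1
  while [ "$i" -le "$runs" ]; do
    output="run_${i}.png"
    run_pass "$content" "$style" "$output" || exit 1
    content=$output   # the output becomes the new content image
    i=$((i + 1))
  done
)

# Four runs, as in the example image (top left through bottom right).
repeat_runs content.jpg style.jpg 4
```

Setting `DRY_RUN=0` executes the commands instead of printing them. Any other parameters you normally use would go into the command inside `run_pass`, so that every run uses exactly the same settings.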


This method of enhancing the output quality also works with the -init_image parameter instead of the -content_image parameter, and can be exploited by stringing multiple multiscale/multires.sh scripts together. It seems to work with almost any combination of style and content images, as long as the style image is decent in terms of quality/resolution.

In terms of image size, this method can be visualized as walking up a flight of stairs and then falling off the top. After you fall, you start walking up an identical set of stairs in front of you. In this analogy, the top of the stairs is the maximum image size your hardware allows, the bottom of the stairs is your starting size, and the individual steps are the different image sizes, with larger image sizes represented by higher steps.
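
The staircase can also be written out as a list of image sizes. The following is a sketch under an assumption: the text above gives only the 640px start, the 1920px end, and the count of 7 steps, so the evenly spaced sizes in between are illustrative and not the values actually used. `staircase` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print the image size for every step of every run.
# Arguments: start size, end size, steps per run (at least 2), number of runs.
# staircase is a hypothetical helper; even spacing is an assumption.
staircase() (
  start=$1 end=$2 steps=$3 runs=$4
  run=1
  while [ "$run" -le "$runs" ]; do
    i=0
    while [ "$i" -lt "$steps" ]; do
      # Evenly spaced sizes; integer division rounds down.
      size=$((start + i * (end - start) / (steps - 1)))
      echo "run $run step $((i + 1)): ${size}px"
      i=$((i + 1))
    done
    run=$((run + 1))   # "fall off the top" and start the next staircase
  done
)

# Two runs of 7 steps each, from 640px up to 1920px.
staircase 640 1920 7 2
```

Each run climbs from the starting size to the maximum size, and the next run begins again at the starting size.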