Usage and Results - jcjohnson/neural-style GitHub Wiki
Welcome
This page explains most of the `-[options]`: what they change and how that affects the resulting imagery. It also includes a few examples you can use.
###NIN and VGG-19
At the moment only two models have been thoroughly tested with neural-style: VGG-19 and NIN. VGG-19 is installed by default if you followed the installation instructions. The NIN model files have to be downloaded and moved to /neural-style-master/models beforehand, as per the instructions on the main GitHub page.
To use the NIN model you have to add these parameters to your command:
-model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -content_layers relu0,relu3,relu7,relu12 -style_layers relu0,relu3,relu7,relu12
(Assuming you put the downloaded NIN model files into your /neural-style-master/models folder)
Note that `-content_layers` and `-style_layers` are not optional: always specify them when using the NIN model, or any model other than VGG-19.
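Since the NIN flags above have to accompany every NIN run, one way to avoid retyping them is to keep them in a shell variable. The following is only a sketch: `NIN_FLAGS` and `nin_command` are hypothetical names that are not part of neural-style, and the paths assume the model files are in the models folder as described above. The helper prints the command rather than running it, so it can be checked first.

```shell
# Flags the NIN model needs, kept in one variable so they are typed only once.
# Assumption: the NIN model files were placed in models/ as described above.
NIN_FLAGS="-model_file models/nin_imagenet_conv.caffemodel \
-proto_file models/train_val.prototxt \
-content_layers relu0,relu3,relu7,relu12 \
-style_layers relu0,relu3,relu7,relu12"

# Hypothetical helper: prints the full command instead of running it.
# $1 = style image, $2 = content image, $3 = output image
nin_command() {
    echo "th neural_style.lua -style_image $1 -content_image $2 -output_image $3 $NIN_FLAGS"
}
```

Calling `nin_command style.jpg content.jpg out.png` prints the command for inspection; piping that output to `sh` would run it. File names containing spaces would need extra quoting.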
###The difference, NIN or VGG-19
VGG-19 is easy to use and gives the best results without changing or adding any parameters, but it uses far more resources: most people will only reach about 700px with 4GB VRAM or 8GB RAM.
NIN uses far fewer resources and can reach up to 1200px with 4GB VRAM or 8GB RAM at the same speed. However, to get good results you will have to add or change many parameters.
Alternative untested models can also be used, but this is heavily discouraged for new users. More info is available here: Using Other Neural Models
###Parameters and options
If you decide to use VGG-19, the following command will probably suffice for 90% of the images you plan to process. This doesn't mean that you cannot also use the parameters described later.
VGG-19
th neural_style.lua -style_image [styleimage.jpg] -content_image [contentimage.jpg] -output_image [how_your_resulting_image_will_be_called.png] -backend [your preferred backend] -image_size [length of resulting image in pixels]
As stated before, [length of resulting image in pixels] will have to be rather low to avoid errors. If you're not sure you have enough horsepower, start with 400. If that goes well, go up 100px at a time until you start receiving errors. If it crashes at 400, go down 50px at a time until you no longer receive an error. The working size can be as low as 40 pixels; if that's the case, using the NIN model is recommended. Also keep in mind that memory use does not scale linearly with the side length, so don't expect double the amount of (V)RAM to give you double the pixel length.
For some people, `-optimizer adam` allowed them to add a few pixels to their maximum; the same goes for `-backend cudnn`.
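The step-up/step-down search described above can be sketched as a small shell function. This is a sketch under assumptions, not part of neural-style: `find_max_size` is a hypothetical helper, and it relies on the command it is given exiting with a non-zero status when a size fails (for example on an out-of-memory error), which should be verified on your own setup.

```shell
# Hypothetical helper: find_max_size CMD [ARGS...]
# CMD is any command that accepts an image size as its last argument and
# exits non-zero when that size fails. CMD should keep its own output quiet,
# because the size found is printed on standard output.
find_max_size() {
    size=400
    if "$@" "$size"; then
        # 400 works: climb in 100px steps until the first failure.
        while "$@" $((size + 100)); do
            size=$((size + 100))
        done
    else
        # 400 fails: drop in 50px steps until a run succeeds.
        size=$((size - 50))
        while [ "$size" -gt 0 ] && ! "$@" "$size"; do
            size=$((size - 50))
        done
    fi
    # Prints the largest size that worked, or 0 if nothing did.
    echo "$size"
}

# Example wrapper (style.jpg and content.jpg are placeholder file names):
# try_size() { th neural_style.lua -style_image style.jpg -content_image content.jpg -image_size "$1" >/dev/null 2>&1; }
# find_max_size try_size
```

Each trial is a full run, so the search can take a while. Lowering the iteration count for the trials is one way to shorten it.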
NIN
After you have installed the NIN model, it can be used by adding the following parameters (these must be added after `th neural_style.lua`; the same goes for all other options on this page):
-model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -content_layers relu0,relu3,relu7,relu12 -style_layers relu0,relu3,relu7,relu12
(Assuming you put the downloaded NIN model files into your /neural-style-master/models folder)
Note that `-content_layers` and `-style_layers` are not optional and should always be used when utilizing the NIN model.
Useful options when using NIN
As you may have noticed, NIN gets rather strange results from time to time, which is mostly undesired. Luckily, there are a few options we can use to improve the NIN images. These can also be used with VGG-19.
-pooling [avg or max]
Selects the type of pooling layer the network uses. When not specified it defaults to max. When you use `-pooling avg` the resulting image will represent the content image more. The effect is somewhat arbitrary, so test for yourself.
-tv_weight [1>number>0]
This is the weight of the total-variation (TV) smoothing applied to the image, which shows up as fuzziness/blurriness. When not specified it defaults to 0.001. If your results seem too fuzzy or blurry, try setting it lower, like 0.0001. If you see random lines or loads of artifacts, set it higher, like 0.01. Keep it between 0 and 0.1 because it's rather sensitive.
-num_iterations [number]
Believe it or not, when using NIN, more iterations do not mean better results. Most of the time the best results can be found between 30 and 400 iterations. Limiting the iterations will save time and resources. When the option is not specified, the number of iterations is 1000, so setting `-num_iterations 500` might be a lifesaver. If you truly believe that more iterations give better results, no one is stopping you from setting it to 4000.
-save_iter [number]
This option specifies how often the program saves an intermediate image. Without the option it defaults to 100, which creates all those images in your folder. When looking for the best iteration, you're best off setting it to 1/10 of `-num_iterations [number]` so you can see which iterations were best.
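The 1/10 rule of thumb above is simple arithmetic, sketched here as a shell function. `iteration_flags` is a hypothetical helper, not a neural-style feature. It clamps the save interval to at least 1 on the assumption that an interval of 0 switches intermediate saving off.

```shell
# Hypothetical helper: given a total iteration count, print matching
# -num_iterations and -save_iter flags with the save interval at 1/10.
iteration_flags() {
    total=$1
    every=$((total / 10))
    # Assumption: 0 would disable intermediate saves, so never go below 1.
    if [ "$every" -lt 1 ]; then
        every=1
    fi
    echo "-num_iterations $total -save_iter $every"
}
```

For example, `iteration_flags 300` prints `-num_iterations 300 -save_iter 30`, which can be pasted into a command or substituted with `$(iteration_flags 300)`.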
-init [image or random]
Defines whether your starting image will be noise or the original content image. `-init image` will result in images that are more true to the original, while `-init random` will look more like your style image deformed so it looks like the content image. Without the option it defaults to random.
-content_weight [number] and -style_weight [number]
As specified in the README.md, these specify how strongly the content and style are present in the results. Content weight is best kept below 20, and style weight can range from 100 to 5000. It differs for each image.
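Because the best weights differ for each image, it can help to try a small grid of values and compare the outputs. The sketch below only prints the commands. `weight_sweep` is a hypothetical helper, and the weight values and output file names are sample choices within the ranges mentioned above, not recommendations from the author.

```shell
# Hypothetical helper: prints one command per content/style weight pair.
# $1 = style image, $2 = content image (supply your own file names).
# The weights are sample values inside the ranges suggested above.
weight_sweep() {
    for cw in 2 5 10; do
        for sw in 100 550 1000 5000; do
            echo "th neural_style.lua -style_image $1 -content_image $2 -content_weight $cw -style_weight $sw -output_image out_c${cw}_s${sw}.png"
        done
    done
}
```

Running `weight_sweep style.jpg content.jpg` lists twelve commands; piping the output to `sh` would run them one after another. When using NIN, the model and layer flags still have to be added to each command.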
NIN Example
th neural_style.lua -style_image style17.jpg -content_image img17.png -output_image MASSE/profile17.png -model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -gpu 0 -num_iterations 300 -content_layers relu0,relu3,relu7,relu12 -content_weight 2 -style_weight 550 -style_layers relu0,relu3,relu7,relu12 -image_size 1433 -optimizer lbfgs -print_iter 25 -save_iter 20
Results in this, made with an R9 390 (8GB VRAM)