ESRGAN - HerbFargus/Wikis GitHub Wiki

ESRGAN uses AI to upscale images to a higher resolution while retaining, and often improving, the existing detail. The results depend on the model you use.

This is a rough guide to upscaling an old cartoon with ESRGAN:

Installation:

This roughly follows the guide here:

https://upscale.wiki/wiki/ESRGAN_Installation_Guide_for_Windows

Pre-Requisites:

Nvidia GPU, Nvidia GPU Drivers, Python, Pip, Torch

Nvidia CUDA Toolkit:

https://developer.nvidia.com/cuda-downloads

Nvidia Studio Drivers:

https://www.nvidia.com/Download/index.aspx?lang=en-us

Python Installation:

https://www.python.org/downloads/windows/

Pip Installation:

https://bootstrap.pypa.io/get-pip.py

Git Installation:

https://git-scm.com/download/win

Torch Installation:

pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio===0.8.1 -f https://download.pytorch.org/whl/torch_stable.html

OpenCV:

pip install opencv-python
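
To confirm that the Torch and OpenCV installs above actually worked, a small check like the following may help. It is only a sketch: the function name `check_packages` is made up for this guide, and it only reports whether the packages import and whether Torch can see a CUDA device.

```python
import importlib.util


def check_packages():
    """Return a dict saying whether torch and cv2 import and whether CUDA is visible."""
    report = {}
    for name in ("torch", "cv2"):
        # find_spec returns None when the package is not installed.
        report[name] = importlib.util.find_spec(name) is not None

    report["cuda"] = False
    if report["torch"]:
        try:
            import torch
            # True only if a CUDA-capable GPU and a working driver are visible.
            report["cuda"] = bool(torch.cuda.is_available())
        except Exception:
            # Installed but failing to load (for example a broken DLL).
            report["torch"] = False
    return report


if __name__ == "__main__":
    for key, ok in check_packages().items():
        print(f"{key}: {'OK' if ok else 'missing'}")
```

If `cuda` shows as missing while `torch` is OK, the usual suspects are the GPU driver or a CPU-only Torch build.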

Cupscale:

https://github.com/n00mkrad/cupscale/releases

FF-Utils:

https://github.com/n00mkrad/ff-utils/releases

Models:

DigitalFrames 2.5

https://icedrive.net/1/b46ZwVaqfh

FFMPEG:

https://ffmpeg.org/download.html#build-windows

Workflow:

You can clip a small part of the video for testing first (this example keeps the video stream and the third audio stream):

ffmpeg -i input.mp4 -map 0:0 -map 0:3 -ss 00:01:30 -to 00:01:40 -c:v copy -c:a copy output.mp4
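
The right `-map` indexes depend on how the streams are laid out in your file. As a rough sketch, ffprobe (shipped with ffmpeg) can list them, and a little Python can turn "first video stream plus Nth audio stream" into `-map` arguments. The helper names `probe_streams` and `map_args` are invented for this example, and ffprobe is assumed to be on the PATH.

```python
import json
import subprocess


def probe_streams(path):
    """Run ffprobe (must be on PATH) and return its stream list as dicts."""
    cmd = [
        "ffprobe", "-v", "error",
        "-show_entries", "stream=index,codec_type,codec_name",
        "-of", "json", path,
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return json.loads(out)["streams"]


def map_args(streams, audio_number):
    """Build -map arguments for the first video stream and the Nth audio stream (1-based)."""
    video = [s["index"] for s in streams if s["codec_type"] == "video"]
    audio = [s["index"] for s in streams if s["codec_type"] == "audio"]
    return ["-map", f"0:{video[0]}", "-map", f"0:{audio[audio_number - 1]}"]
```

For a file with one video stream followed by three audio streams, `map_args(probe_streams("input.mp4"), 3)` gives the same `-map 0:0 -map 0:3` pair used in the command above.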

Split the video into frames using FF-Utils. If you need extra modifications such as deinterlacing, frame rate conversion, or scaling, you can preprocess the video manually first with a command like this:

ffmpeg -i input.mkv -map 0:0 -map 0:3 -vf "yadif=0, fps=24000/1001, scale=640:480" -c:v libx264 -b:v 8M -c:a copy output.mkv
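
The command above writes a processed video rather than individual frames. If you'd rather skip FF-Utils and dump PNG frames directly, something like this sketch should do it. It only builds the ffmpeg argument list; the function name `extract_frames_cmd`, the `frames` folder, and the `%08d.png` naming are assumptions for this example, not what FF-Utils does internally. It also shows where 23.976 comes from: it is 24000/1001 rounded.

```python
from fractions import Fraction
from pathlib import Path


def extract_frames_cmd(src, out_dir, fps=Fraction(24000, 1001), size=(640, 480)):
    """Return an ffmpeg argument list that writes numbered PNG frames."""
    # Same filter chain as the manual command: deinterlace, set fps, scale.
    filters = f"yadif=0,fps={fps.numerator}/{fps.denominator},scale={size[0]}:{size[1]}"
    pattern = str(Path(out_dir) / "%08d.png")  # 00000001.png, 00000002.png, ...
    return ["ffmpeg", "-i", str(src), "-vf", filters, pattern]


if __name__ == "__main__":
    print(float(Fraction(24000, 1001)))  # roughly 23.976
    print(extract_frames_cmd("input.mkv", "frames"))
```

Create the output folder first, then pass the list to `subprocess.run(cmd, check=True)` to actually run it.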

Run Cupscale to upscale the extracted frames

Combine the frames back together with Cupscale, or use a custom ffmpeg command for added options

Here's an example of a manual ffmpeg command (a bit janky on Windows):

ffmpeg -r 23.976 -f image2 -pattern_type glob -i '??/*.png' -vcodec libx265 -crf 16 -pix_fmt yuv420p output.mkv
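
Part of the jankiness is that `-pattern_type glob` only works if ffmpeg was compiled with globbing support, which Windows builds often are not. One possible workaround, sketched here, is to do the globbing in Python and copy the matches into a single folder with sequential names that the plain sequence pattern can read. The function name `renumber_frames` and the output folder are made up for this example.

```python
import glob
import os
import shutil


def renumber_frames(pattern, out_dir, digits=8):
    """Copy files matching a glob pattern to out_dir as 00000001.png, 00000002.png, ..."""
    os.makedirs(out_dir, exist_ok=True)
    files = sorted(glob.glob(pattern))  # lexicographic order, like a shell glob
    for number, src in enumerate(files, start=1):
        ext = os.path.splitext(src)[1]
        shutil.copy2(src, os.path.join(out_dir, f"{number:0{digits}d}{ext}"))
    return len(files)
```

After something like `renumber_frames("??/*.png", "sequence")`, ffmpeg can read the copies with `-i "sequence/%08d.png"`. Copying doubles the disk usage, so this is only reasonable for short clips or when space isn't a concern.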

This is a similar Windows version:

ffmpeg -f image2 -framerate 23.976 -pattern_type sequence -i "%08d-4x_DigitalFrames_2.5.png" -vcodec libx264 -crf 16 -pix_fmt yuv420p imagetest.mkv
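
The sequence pattern expects consecutively numbered files, so if an upscale run skipped or failed on a frame, the encode will most likely end early at the gap. A quick check before encoding can save a wasted run. This is just a sketch: the function name `missing_frames` is made up, and the suffix is assumed to match the file names in the command above.

```python
import os
import re


def missing_frames(folder, suffix="-4x_DigitalFrames_2.5.png", digits=8):
    """Return the frame numbers absent between the first and last frame found."""
    name_re = re.compile(rf"^(\d{{{digits}}}){re.escape(suffix)}$")
    found = sorted(
        int(m.group(1))
        for m in (name_re.match(n) for n in os.listdir(folder))
        if m
    )
    if not found:
        return []
    present = set(found)
    return [n for n in range(found[0], found[-1] + 1) if n not in present]
```

An empty list means the numbering has no holes between the first and last frame. Anything else lists the frames to re-run through Cupscale.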

Or a much higher quality version:

ffmpeg -r 23.976 -f image2 -s 2560x1920 -i "%08d-4x_DigitalFrames_2.5.png" -c:v libx264 -preset veryslow -crf 13 -pix_fmt yuv420p -profile:v high -level:v 4.1 -refs 4 -bf 3 qualitytest.mkv

Convert to 1080p (1440x1080 for 4:3 content):

ffmpeg -i input.mkv -s 1440x1080 -c:a copy output.mkv
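
The 1440x1080 figure is just the 4x upscale (2560x1920) shrunk to 1080 lines at the same 4:3 aspect ratio. For sources with a different shape, the arithmetic can be sketched like this. The function name `fit_to_height` is made up for this example.

```python
def fit_to_height(width, height, target_height=1080):
    """Scale (width, height) to target_height, keeping the aspect ratio.

    The width is rounded to the nearest even number because yuv420p
    needs even dimensions.
    """
    scaled = width * target_height / height
    even_width = int(round(scaled / 2)) * 2
    return even_width, target_height


if __name__ == "__main__":
    w, h = fit_to_height(2560, 1920)
    print(f"-s {w}x{h}")
```

The result drops straight into the `-s` option of the command above.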

You can combine the audio back in without re-encoding (e.g. this takes the video from the first file and the audio from the second):

ffmpeg -i input_0.mp4 -i input_1.mp4 -c copy -map 0:0 -map 1:1 -shortest out.mp4

Other notes:

Convert the container from mp4 back to mkv:

ffmpeg -i input.mp4 -vcodec copy -acodec copy output.mkv