ESRGAN - HerbFargus/Wikis GitHub Wiki
ESRGAN uses AI to upscale images to a higher resolution while retaining, and ideally improving, the existing detail. Results depend on the model you use.
This is a rough guide to upscaling an old cartoon using ESRGAN:
Installation:
This roughly follows the guide here:
https://upscale.wiki/wiki/ESRGAN_Installation_Guide_for_Windows
Pre-Requisites:
Nvidia GPU, Nvidia GPU Drivers, Python, Pip, Torch
Nvidia CUDA Toolkit:
https://developer.nvidia.com/cuda-downloads
Nvidia Studio Drivers:
https://www.nvidia.com/Download/index.aspx?lang=en-us
Python Installation:
https://www.python.org/downloads/windows/
Pip Installation:
https://bootstrap.pypa.io/get-pip.py
Git Installation:
https://git-scm.com/download/win
Torch Installation:
pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio===0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
OpenCV:
pip install opencv-python
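Once Torch and OpenCV are installed, it can help to confirm that Python can import them and that Torch sees the GPU. The following is a minimal sketch and is not taken from the linked guide; the function name `check_environment` is made up for illustration.

```python
import importlib


def check_environment():
    """Report installed versions of torch and cv2, and whether CUDA is usable."""
    report = {"torch": None, "cv2": None, "cuda": False}
    for name in ("torch", "cv2"):
        try:
            module = importlib.import_module(name)
        except (ImportError, OSError):
            # Package missing, or its native libraries failed to load.
            continue
        report[name] = getattr(module, "__version__", "unknown")
        if name == "torch":
            # False on CPU-only builds or when no working Nvidia driver is found.
            report["cuda"] = bool(module.cuda.is_available())
    return report


for key, value in check_environment().items():
    print(f"{key}: {value}")
```

If `cuda` comes back as `False` even though Torch is installed, the usual suspects are a CPU-only Torch build or a driver problem.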
Cupscale:
https://github.com/n00mkrad/cupscale/releases
FF-Utils:
https://github.com/n00mkrad/ff-utils/releases
Models:
DigitalFrames 2.5
https://icedrive.net/1/b46ZwVaqfh
FFMPEG:
https://ffmpeg.org/download.html#build-windows
Workflow:
You can clip a small part of the video for testing first (in this example I'm taking the video stream and the 3rd audio stream):
ffmpeg -i input.mp4 -map 0:0 -map 0:3 -ss 00:01:30 -to 00:01:40 -c:v copy -c:a copy output.mp4
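The `-map 0:0 -map 0:3` indices depend on how the streams are laid out in your particular file. A minimal sketch for listing them with `ffprobe` (which ships alongside ffmpeg) is shown here; the helper names and the `input.mp4` file name are illustrative assumptions.

```python
import json
import subprocess


def parse_streams(ffprobe_json):
    """Turn ffprobe's JSON output into (index, codec_type, codec_name) tuples."""
    data = json.loads(ffprobe_json)
    return [
        (s["index"], s.get("codec_type", "?"), s.get("codec_name", "?"))
        for s in data.get("streams", [])
    ]


def probe_streams(path):
    """Run ffprobe on a file and list its streams (needs ffprobe on PATH)."""
    result = subprocess.run(
        ["ffprobe", "-v", "error",
         "-show_entries", "stream=index,codec_type,codec_name",
         "-of", "json", path],
        capture_output=True, text=True, check=True,
    )
    return parse_streams(result.stdout)


# Example call with a placeholder file name:
# for index, kind, codec in probe_streams("input.mp4"):
#     print(f"0:{index}  {kind}  {codec}")
```

The printed `0:N` values are the ones to pass to `-map`.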
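Split the video into frames using FF-Utils. If you need extra modifications such as deinterlacing, frame rate conversion, or scaling, you can preprocess the video manually with ffmpeg first (note that this command writes a processed video, not individual frames):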
ffmpeg -i input.mkv -map 0:0 -map 0:3 -vf "yadif=0, fps=24000/1001, scale=640:480" -c:v libx264 -b:v 8M -c:a copy output.mkv
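The ffmpeg command above writes a processed video (`output.mkv`) rather than individual frames. If you would like ffmpeg itself to write PNG frames, a sketch along these lines may work. The `frames` directory and the `%08d.png` naming are assumptions, chosen to match the eight-digit frame numbers in the file names used later on this page; the helper names are illustrative.

```python
import subprocess
from pathlib import Path

# Same filter chain as the command above: deinterlace, fix the rate, scale.
DEFAULT_FILTERS = "yadif=0,fps=24000/1001,scale=640:480"


def build_extract_command(source, frame_dir, filters=DEFAULT_FILTERS):
    """Build an ffmpeg argument list that writes one PNG per video frame."""
    pattern = str(Path(frame_dir) / "%08d.png")
    return ["ffmpeg", "-i", str(source), "-map", "0:0", "-vf", filters, pattern]


def extract_frames(source, frame_dir):
    """Create the output directory and run ffmpeg (needs ffmpeg on PATH)."""
    Path(frame_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_extract_command(source, frame_dir), check=True)


# Example call with placeholder names:
# extract_frames("input.mkv", "frames")
```

Passing the arguments as a list avoids shell quoting differences between Windows and other platforms.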
Run Cupscale to upscale the extracted frames.
Combine the frames back together with Cupscale, or use a custom ffmpeg command for added options.
Here's an example of a manual ffmpeg command (a bit janky on Windows, where ffmpeg builds typically lack glob pattern support):
ffmpeg -r 23.976 -f image2 -pattern_type glob -i '??/*.png' -vcodec libx265 -crf 16 -pix_fmt yuv420p output.mkv
This is a similar version that works on Windows:
ffmpeg -f image2 -framerate 23.976 -pattern_type sequence -i "%08d-4x_DigitalFrames_2.5.png" -vcodec libx264 -crf 16 -pix_fmt yuv420p imagetest.mkv
Or a much higher-quality (and much slower) version:
ffmpeg -r 23.976 -f image2 -s 2560x1920 -i "%08d-4x_DigitalFrames_2.5.png" -c:v libx264 -preset veryslow -crf 13 -pix_fmt yuv420p -profile:v high -level:v 4.1 -refs 4 -bf 3 qualitytest.mkv
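The frame rate passed when reassembling should match the rate the frames were extracted at, otherwise the video will not line up with the audio. 23.976 is a rounded form of 24000/1001. Here is a small sketch of the arithmetic, using the 10-second test clip from earlier as the example; the function names are illustrative.

```python
from fractions import Fraction

NTSC_FILM = Fraction(24000, 1001)  # exact rate that 23.976 approximates


def expected_frames(seconds, fps=NTSC_FILM):
    """Number of frames a clip of the given length should produce."""
    return round(Fraction(seconds) * fps)


def duration_seconds(frame_count, fps=NTSC_FILM):
    """Playback length of frame_count frames at the given rate."""
    return float(Fraction(frame_count) / fps)


print(expected_frames(10))        # 240 frames for a 10-second clip
print(duration_seconds(240))      # 10.01 seconds at 24000/1001 fps
print(duration_seconds(240, 25))  # 9.6 seconds if played back at 25 fps
```

Comparing the number of PNG files against `expected_frames` is a quick way to spot a frame rate mismatch before encoding.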
Convert to 1080p (1440x1080 for 4:3 content):
ffmpeg -i input.mkv -s 1440x1080 -c:a copy output.mkv
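The 1440x1080 figure comes from keeping the 4:3 aspect ratio of the 2560x1920 upscaled frames (640x480 upscaled 4x) at a height of 1080. A sketch of that calculation follows; the helper name is illustrative, and the rounding to an even width is there because yuv420p output needs even dimensions (the target height is assumed to be even already).

```python
def fit_height(width, height, target_height):
    """Scale (width, height) to target_height, keeping the aspect ratio.

    The resulting width is rounded to an even number.
    """
    scaled = width * target_height / height
    even_width = 2 * round(scaled / 2)
    return even_width, target_height


upscaled = (640 * 4, 480 * 4)       # 4x model output: (2560, 1920)
print(upscaled)
print(fit_height(*upscaled, 1080))  # (1440, 1080)
```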
You can combine the audio back in without re-encoding (e.g. this takes the video from the first input and the audio from the second):
ffmpeg -i input_0.mp4 -i input_1.mp4 -c copy -map 0:0 -map 1:1 -shortest out.mp4
Other notes:
Convert the container back to mkv from mp4:
ffmpeg -i input.mp4 -vcodec copy -acodec copy output.mkv