🤷 How to Pick a Model - Sirosky/Upscale-Hub GitHub Wiki

Introduction

This guide covers how to pick an upscaling model suitable for the source you are working on. A common mistake I see from those just learning how to upscale is that they download ESRGAN and call it a day. There's nothing necessarily wrong with that, but using a model unsuited to the source can result in undesirable issues. Models perform best on the kinds of sources they were trained to handle.

This guide goes a bit more in-depth than might be strictly necessary, so it's on the longer side. Feel free to just pop by the Discord and ask instead, and the community can recommend models that suit your project.

Prerequisites

This guide assumes that you have chaiNNer installed; if not, you can follow the guide here.

Navigating OpenModelDB

Of course, first you'll need to understand where to look. OpenModelDB is the community collection of super resolution models. Think of it as the HuggingFace or CivitAI (but less degenerate, maybe) of upscaling.

If you have a specific model in mind, you can search for it directly; otherwise, you can browse by tags. Beyond the basic tags, the Advanced Tag Selector helps narrow the results further. This is the recommended approach, as there are simply too many options otherwise!

Advanced Tag Selector

Upon hitting the Advanced Tag Selector, your screen explodes with options! But fear not, we'll walk through these.

  • Subject: Fairly self-explanatory. What are you trying to upscale: Stable Diffusion images? Anime? Faces? Pick what fits best.
  • Purpose: If you know the planned use of the model, you can select it here. Otherwise, I'd suggest leaving it blank, as it might exclude potentially helpful models that simply weren't tagged.
  • Architecture: Now this is interesting, because there can be quite a bit of nuance that goes into this. You can leave it blank, but architectures do have a significant impact on your upscaling experience.

Without going into the weeds, architectures are essentially the framework of the model. To use an analogy, if super resolution models are all considered "dogs", architectures are "dog breeds" and models are "individual dogs". A golden retriever is going to look and perhaps even behave differently from a chihuahua. Similarly, each architecture has its own quirks and differences. But perhaps the easiest way to categorize architectures is by their inference speed. For example, you might not want to use an extremely slow arch for videos, whereas if you're just upscaling single images, speed is less of a concern.

Currently (based on my subjective opinion), the community's favored architectures are as follows:

  • Lightweight archs (fastest speed, nearly exclusively used on videos): Compact, SPAN
  • Medium archs (balanced and flexible): OmniSR, ESRGAN (though very dated)
  • Heavyweight archs (slow or very slow, primarily for images): DAT, SwinIR, SRFormer, HAT

  • Scale: This is how much a model resizes an image. 1x performs no resize and is mostly used by models that address a specific issue (such as color bleed in video). 2x is common for video-focused models and doubles each dimension of the image. 4x models are used for both video and single images, and quadruple each dimension.
  • Input Type: I would recommend leaving this blank. At the time of writing, there are no audio models, and image / video models are often interchangeable.
  • Color: The color of the input, but you can probably just leave this blank.
  • License: Not an issue if you're not using the model commercially or doing anything dumb.
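As a quick sanity check on the Scale tag, the factor multiplies each dimension of the image, not the pixel count. A tiny sketch of the arithmetic (the function name is just for illustration):

```python
def output_resolution(width, height, scale):
    """Resolution after running a model with the given scale factor.

    The scale multiplies each dimension, so a 2x model actually
    quadruples the total pixel count.
    """
    return (width * scale, height * scale)

print(output_resolution(1920, 1080, 1))  # 1x: (1920, 1080), unchanged
print(output_resolution(1920, 1080, 2))  # 2x: (3840, 2160)
print(output_resolution(1280, 720, 4))   # 4x: (5120, 2880)
```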

Picking Models

Now that you have narrowed down the options, you can use the preview comparisons and each model's description to decide which to download. Try to pick models whose intended purpose (often included in the description) matches as closely as possible what you're trying to accomplish.

Testing Models

Once you've downloaded a model, it's recommended to test it and make sure it performs well on your source. chaiNNer is a fantastic utility for this: you can manually run each model on single images, a short clip, or frames extracted from a video. Alternatively (and this is what I prefer), you can pick a few images or frames from the source and run them through the Bulk Model Test chain, which can be grabbed here.

Instead of iterating through sample images, this iterates through a directory containing the models you want to test. Simply load the chain, select a test image and the folder containing the models, then visually compare the outputs to see which model works best on your source.