1. Krita AI Hires tab options - minsky91/krita-ai-diffusion-hires GitHub Wiki

Krita AI Hires tab options

Enable Tiled Diffusion

This toggle enables or disables Tiled Diffusion within the plugin. Tiled Diffusion is a pluggable component that switches image processing into a tiled mode, saving a lot of GPU VRAM and making generation run significantly faster than in the plugin’s standard Refine / Generate mode, with no discernible loss of quality. (The Standard Refine option of the plugin’s Upscale workspace also processes images in a tiled fashion, and somewhat faster than TD, but it is limited to refining combined with image guidance; it uses a very different algorithm and produces different output.)

Tiled Diffusion (TD), and particularly its Mixture of Diffusers method that is implemented in the plugin, is known not only for its speed but also for the characteristic “creative” quality of its denoising, the core of Stable Diffusion-driven image generation; this quality becomes apparent as soon as you start using it. TD also excels at sophisticated image refining and is capable of (mostly) avoiding visible tile seams, the artefact that plagues all other upscaling-refining solutions.

When TD is enabled, the accompanying Tiled VAE Encode and Decode are also automatically enabled, which additionally speeds up hires image processing. No noticeable degradation of the output has been detected in the course of implementing these features and subsequent testing. TD has been tested to be compatible with most features of Krita AI, except for a few specific scenarios of inpainting and region usage, in which case it’s turned off. It is also not activated when the canvas resolution is below the user-defined threshold (see TD Activate Resolution below).

To see various examples of TD use in Krita AI Hires, click here. A comparative study of the standard and Hires editions and extensive benchmark data can be found here.

Availability: the Enable TD option is only available in Krita AI Hires when it’s configured to use a Custom Comfy server. This is because it uses 3 custom nodes unavailable in the Local managed server distribution. Click here for installation instructions for the Hires edition of the plugin.

Note: TD is suppressed when the user-selected TD Activate Resolution (see below) is set higher than the image’s current resolution, even when the toggle is switched on.

TD Activate Resolution

For basic image resolutions (up to 1.5K), TD might not be the fastest or most efficient mode to generate or otherwise process images with Krita AI; its advantages only become clear above that. In my own testing, at 2K resolution and above the difference in output quality is obvious, and from 3K up the speed advantage of TD over standard plugin operation grows rapidly with resolution (with the exception of the Upscale / Refine mode, which also uses a tiled method).

TD Activate Resolution lets you select the resolution at which TD is activated for image generation and refining (provided it’s enabled with the Enable toggle above). Recommended value: 2K or 3K for the longest image dimension, depending on your particular image content. The optimal value can only be found by experimenting with TD.
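The interplay between the Enable toggle and the activation threshold can be pictured with a small sketch. It is not the plugin’s code: the function name `td_is_active`, the reading of “1K” as 1024 pixels and the use of the longest image side are all assumptions made for illustration.

```python
K = 1024  # assumption: "1K" is read as 1024 pixels


def td_is_active(td_enabled: bool, width: int, height: int,
                 activate_resolution_k: float) -> bool:
    """Hypothetical check: TD runs only if enabled AND the image is large enough."""
    longest_side = max(width, height)
    return td_enabled and longest_side >= activate_resolution_k * K


print(td_is_active(True, 1536, 1024, 2))   # False: below the 2K threshold
print(td_is_active(True, 3072, 2048, 2))   # True: toggle on and 3K >= 2K
print(td_is_active(False, 8192, 6144, 2))  # False: toggle is off
```

Under these assumptions, raising the threshold above the image’s resolution suppresses TD even with the toggle switched on, which matches the note in the Enable Tiled Diffusion section.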

Availability: same as for Enable TD above.

Hiresfix Guidance

Hiresfix is a method frequently found in Stable Diffusion tools that allows generating images at resolutions significantly higher than the model’s native one, such as 1024x1024 for SDXL and Flux. Hiresfix output tends to be of good quality and is often free of the distortions that non-native resolution output is usually prone to, especially in human figures. Hiresfix has been implemented in Krita AI from the start as a built-in feature (and it works quite well), but it has not been documented much, which is why it requires a bit of explanation here.

Hiresfix works as follows in Krita AI. Whenever you have a canvas of a resolution substantially higher than the checkpoint’s native one and request a from-scratch (100% Strength) generation, the plugin will first automatically downscale the image to the native dimensions of the same aspect ratio and generate at those dimensions. In the 2nd pass, the intermediate result will be upscaled, also automatically, back to the canvas dimensions using a default AI and/or non-AI method, and then refined with 40% Strength (equivalent to img2img with a 0.4 denoise factor in other SD tools), using the same prompt, checkpoint and sampler as in the 1st pass. The resulting output is often of striking clarity and detail (especially in facial features and hair), with small feature distortions mostly fixed, in an almost magical way. In many cases, however, it introduces visible mid-scale distortions of its own, and it can also be quite slow. Most of all, being an automatic feature, Krita AI’s built-in Hiresfix doesn’t allow for tuning of any sort.
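The two passes described above can be laid out as a short planning sketch. This is an illustration only, not Krita AI’s implementation: the helper names, the “same pixel area as the native resolution” rule and the snapping of sides to multiples of 8 are assumptions; only the 100% and 40% strengths come from the description above.

```python
import math


def native_pass_size(canvas_w: int, canvas_h: int,
                     native_side: int = 1024, multiple: int = 8) -> tuple:
    """Assumed rule: keep the aspect ratio, match the native pixel area."""
    scale = math.sqrt((native_side * native_side) / (canvas_w * canvas_h))
    if scale >= 1.0:
        # Canvas is not larger than native: no downscaling needed.
        return canvas_w, canvas_h

    def snap(value: int) -> int:
        # Round the scaled side to the nearest multiple (assumed to be 8).
        return max(multiple, int(round(value * scale / multiple)) * multiple)

    return snap(canvas_w), snap(canvas_h)


def hiresfix_plan(canvas_w: int, canvas_h: int,
                  refine_strength: float = 0.4) -> list:
    """Pass 1 generates from scratch at native size; pass 2 refines at canvas size."""
    return [
        {"pass": 1, "size": native_pass_size(canvas_w, canvas_h), "strength": 1.0},
        {"pass": 2, "size": (canvas_w, canvas_h), "strength": refine_strength},
    ]


for step in hiresfix_plan(4096, 4096):
    print(step)
```

For a 4096x4096 canvas with an SDXL-like native side of 1024, the sketch plans a 1024x1024 first pass followed by a refine at the full canvas size.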

Krita AI Hires improves this method, both in terms of quality and speed, by activating TD for the 2nd refine pass, which not only makes generation finish much sooner but also eliminates most of the mid-scale distortions. In certain cases, however, due to its tiled character, it may in turn introduce artefacts of its own, such as faint miniature versions of the main character (if the prompt suggests any) placed in a grid-like way in the image. Whether the artefacts become visible depends on the checkpoint and sampler, the image’s resolution, the background lightness or texture type, etc.

The Hiresfix Guidance parameter, a slider with values between -20 and 20, lets you vary the base denoise factor, thereby affecting the Hiresfix-generated output. Positive values map to the 0.4-0.2 denoise subrange and help to avoid the TD-related artefacts, at the price of the output getting incrementally closer to the non-refined native resolution output, which may contain small-scale distortions, most notably in faces. In contrast, negative values, which map to the 0.6-0.4 denoise subrange, relax the guidance and let the result deviate more and more from the non-refined native resolution output, often giving a more refined appearance, particularly of faces and other human form detail. Depending on the input pixel content and other plugin settings in use, different slider values may produce a great variety of output, so the best way is to experiment with the whole range.
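As a rough picture of the slider-to-denoise mapping, here is a minimal sketch that assumes a plain linear interpolation between the endpoints given above (-20 maps to 0.6, 0 to 0.4, 20 to 0.2). The function name is hypothetical, and the plugin’s internal calibration logic may well behave differently for intermediate values.

```python
def guidance_to_denoise(guidance: int, base: float = 0.4,
                        span: float = 0.2, limit: int = 20) -> float:
    """Assumed linear mapping of the Hiresfix Guidance slider to a denoise factor."""
    if not -limit <= guidance <= limit:
        raise ValueError(f"guidance must be within [-{limit}, {limit}]")
    # Positive guidance lowers denoise (stays closer to the native-resolution pass),
    # negative guidance raises it (lets the refine pass deviate more).
    return round(base - span * guidance / limit, 4)


for value in (-20, -10, 0, 10, 20):
    print(value, guidance_to_denoise(value))
```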

What’s more, the Hires version of the plugin’s Hiresfix feature includes a fix for an issue frequently reported when generating images at higher than native resolution with Flux models: the infamous screen grid pattern. The pattern is eliminated when using the TD plus Hiresfix combination, and so is the blurriness that is typically introduced when rendering such images. See examples of Hiresfix doctoring Flux images here.

Note that, being an experimental feature, Hiresfix Guidance may, in certain scenarios and with certain slider values, run into a conflict with other Krita AI image processing components. This may present itself as messed-up output or as a failure message displayed in red in the plugin’s UI. In such a case, the right way to fix the issue is to turn off Hiresfix Guidance’s internal calibration logic by setting the slider to 0 (default), or to its leftmost (-20) or rightmost (20) position. Hiresfix will still be applied, but in a more straightforward fashion that prevents such failures from happening.

Note 2: the Hiresfix Guidance option will work with TD switched off as well, but processing times will be slower and the output may have more distortions.

To see various examples of the Hiresfix Guidance effect in Krita AI Hires, click here.

Availability: same as for Enable TD.

Upload Method

To optimize processing of high resolution images, the way input files and generated workflows are submitted to the server has been radically reworked in Krita AI Hires. Rigorous benchmark testing that I carried out to investigate possible bottlenecks (see the detailed report here) revealed that the standard approach, Base64-encoding all input images and embedding them in the workflow as text-format nodes before uploading to the server, becomes much too wasteful of RAM, hard disk space and bandwidth when preparing and submitting hires content. At a certain size, it also triggers a failure due to a cap on the image input that Comfy can receive and process, as I found out a while ago (https://github.com/Acly/krita-ai-diffusion/issues/1265).

With the new upload method implemented in Krita AI Hires, input images are sent separately in a binary compressed format, which does away with bulky workflows and the 33% overhead that Base64 incurs. More importantly, images are submitted only once per session, so long as their pixel content doesn’t change. Additionally, multiple files are uploaded in parallel, which further speeds up the operation when the input includes, for instance, control layers and masks. As the benchmark data shows, the new method results in significantly shorter workflow preparation and file upload times (several times shorter, in fact), with lots of bandwidth saved along the way. The new method doesn’t affect image processing in any way: the output will be identical to that generated using the old method.
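Two of the points above are easy to check or picture in a few lines of Python. The first function measures the Base64 size overhead directly with the standard library. The `UploadCache` class is a purely hypothetical sketch of “submit only once while the pixel content is unchanged”, using a content hash as the identity; it is not how the plugin is known to implement it.

```python
import base64
import hashlib


def base64_overhead(n_bytes: int) -> float:
    """Relative size increase when n_bytes of binary data are Base64-encoded."""
    encoded_len = len(base64.b64encode(bytes(n_bytes)))
    return encoded_len / n_bytes - 1.0


class UploadCache:
    """Hypothetical per-session cache: identical pixel content is sent only once."""

    def __init__(self) -> None:
        self._seen = set()
        self.uploads = 0

    def submit(self, pixels: bytes) -> bool:
        digest = hashlib.sha256(pixels).hexdigest()
        if digest in self._seen:
            return False  # unchanged content, nothing to send
        self._seen.add(digest)
        self.uploads += 1  # a real client would transmit the bytes here
        return True


print(f"Base64 overhead: {base64_overhead(3000):.1%}")

cache = UploadCache()
layer = b"\x10\x20\x30" * 1000
print(cache.submit(layer), cache.submit(layer), cache.uploads)
```

Base64 turns every 3 input bytes into 4 output characters, which is where the roughly 33% figure comes from.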

Users who submit high resolution files (4K and up) for upscaling or enhancement will benefit from the new method the most, especially those with a low-spec PC and/or a slow connection (for instance, a laptop connected over wifi to a PC running a Comfy server).

Drop-down choices:

Binary: the new optimal method of file uploading (default).

Base64: Binary with additional Base64 encoding before uploading (for comparison testing only).

Workflow-embedded: the old method (for comparison testing only).

Availability: the Upload Method option is only available in Krita AI Hires when it’s configured to use a Custom Comfy server. This is because it uses a new custom node unavailable in the Local managed server distribution. Click here for installation instructions for the Hires edition of the plugin.

Fast Receive

Similarly to upload, the download method has been revamped in Krita AI Hires. I found the standard websocket-based routine to be completely inadequate for receiving high resolution images and replaced it with a fast http-based one. As a result, download times for a test png-compressed image were reduced from 8.5s to 0.4s for 4K resolution, from 36s to 2.5s for 8K, and from 86s to 3.2s for 16K. (These timings were recorded for transfers on a single PC hosting both the server and Krita AI as the client, so no wireless connection was involved; for a setup with the client connected over wifi, transfer times were of course longer, but not by a lot.) A full set of comparative benchmark data can be found here (the timing values just cited for standard Krita AI are in bold in the spreadsheet).
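To put the timings quoted above in perspective, the speedup factors can be worked out with a few lines of arithmetic. The numbers are the ones cited in the text; nothing here measures anything.

```python
# (websocket seconds, http seconds) per resolution, as quoted in the text above
timings = {"4K": (8.5, 0.4), "8K": (36.0, 2.5), "16K": (86.0, 3.2)}

# Speedup factor = old download time / new download time
speedups = {res: old / new for res, (old, new) in timings.items()}

for res, factor in speedups.items():
    print(f"{res}: about {factor:.0f}x faster")
```

That works out to roughly 21x, 14x and 27x for 4K, 8K and 16K respectively.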

Turning off Fast Receive will make Krita AI use the old websocket-based method; it is left there only for the purpose of comparison testing.

Availability: similarly to the Upload Method, the Fast Receive option is only available in Krita AI Hires when it’s configured to use a Custom Comfy server, due to the option’s use of a new custom node unavailable in the Local managed server distribution. Click here for installation instructions for the Hires edition of the plugin.

Lowest Jpeg Resolution

Saving generated images in jpeg format is possible in Krita AI, but only indirectly, via Krita’s Save As or Export functions. The built-in function Save Image in the plugin is coded to save in png (and, as the tests showed, using a very slow and inefficient compression component courtesy of the PyQt5 library). Upload and download image transfers are also always done using this format. Png, being a lossless format, is usually preferred over lossy alternatives such as jpeg and webp, and in many SD tools it is the only choice. This format however is costly in terms of compression and transfer times, as well as storage space use. When working with high resolution images the need for a lossy alternative becomes obvious. I don’t really see any rational reason not to offer the user a lossy format, jpeg in particular, as an alternative. It can be used primarily for draft generations, when you are still looking for a combination of settings to produce the desired effect - which is usually the majority of cases in image enhancement.

In Krita AI Hires, the jpeg format, with the high quality setting of 95, is available as an alternative to png for both image storage and transfer, saving in the long term an untold number of gigabytes of storage and bandwidth. An example: for a 16K image I used for testing, the file size dropped from 225 MB to 35 MB, and the download time from 10s to 2s (of which server-side compression took only 0.5s) when using jpeg. (The standard version can’t process images this large, so no comparison is possible.)

This option defines the minimal image resolution at which the plugin switches to transmitting and saving in jpeg format (default is 10K). Move the slider to the right to prevent the plugin from using jpeg at this or higher resolutions.
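The threshold behaviour can be sketched as a simple format selector. The function name, the reading of “10K” as 10 x 1024 pixels and the use of the longest image side are assumptions for illustration; the size figures are the ones from the 16K test image mentioned above.

```python
K = 1024  # assumption: "1K" is read as 1024 pixels


def transfer_format(width: int, height: int,
                    lowest_jpeg_resolution_k: float = 10.0) -> str:
    """Hypothetical selector: jpeg at or above the threshold, png below it."""
    longest_side = max(width, height)
    return "jpeg" if longest_side >= lowest_jpeg_resolution_k * K else "png"


print(transfer_format(8192, 6144))    # below the default 10K threshold
print(transfer_format(16384, 12288))  # at or above it

# Size ratio for the 16K test image quoted above: 225 MB as png, 35 MB as jpeg
print(f"png is about {225 / 35:.1f}x larger")
```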

Availability: the Lowest Jpeg Resolution option is available in Krita AI Hires for both Custom and Local managed server setups, found in the latter case in the Extras tab. However, due to the option’s use of new custom nodes unavailable in the local managed server distribution, only saving in jpeg format (via the Save Image thumbnail option) is supported for managed setup - no lossy compression during transfers, that is.

Progress Preview

Progress preview is an indispensable option when generating batches of images of medium-to-large resolutions, when each image starts to take minutes to generate. It can be useful for smaller resolutions as well (generating with a Flux model is the case when you might want it), and especially so when you own a PC with a low-spec GPU. The option is available in practically every SD webui tool or frontend, and it’s usually on by default. Standard Krita AI lacks it, however.

When enabled in Krita AI Hires, this option will cause generation progress to be displayed in the preview layer, updated as soon as the next sample (also known as ‘step’) is computed by the server. (The plugin will only do that for the 2nd half of the steps, since only those usually produce a discernible picture.) The preview process is interrupted once you click on any thumbnail to view a generated result. To resume progress preview, unselect the thumbnail by clicking outside of the thumbnail group and make sure the preview layer’s visibility is on.
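The “2nd half of the steps” rule mentioned above can be expressed in one line. This is a sketch with a hypothetical function name, and the exact cutoff the plugin uses for odd step counts is an assumption.

```python
def should_show_preview(step: int, total_steps: int) -> bool:
    """Assumed rule: show previews only for steps in the second half (step is 1-based)."""
    return step > total_steps / 2


shown = [s for s in range(1, 11) if should_show_preview(s, 10)]
print(shown)  # [6, 7, 8, 9, 10]
```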

It’s a true time- and electricity-saver (especially if you assign a hotkey to the Cancel current job action as I did), and I am really not sure why it wasn’t made available in Krita AI so far. It took me only about an hour to implement this feature in the plugin.

Availability: the Progress Preview option is available in Krita AI Hires for both Custom and Local managed server setups, found in the latter case in the Extras tab.

Save Gen Data

Saving generation metadata in the png file is one of the features most requested by Krita AI users. The request usually stems from their experience with other SD tools that have such a feature, making it possible to recreate the image later from the saved parameters. This is much harder to implement in Krita AI, given the sheer range of parameters available in the plugin that can affect the output. Also, input images cannot be saved within metadata. In the general case, it will be next to impossible to recreate a Krita AI-generated image using the plugin with just the parameters saved in a png, and definitely impossible with any other SD tool (except maybe ComfyUI). Your best option remains to save the images critical to your project, both input and output ones, as layers in a Krita .kra file, in which the plugin stores all essential parameters and from which one can, most of the time but not always, regenerate the image anew and, hopefully, identically.

Nevertheless, I made an effort to implement this feature the best I could, covering as much territory as reasonably possible. In addition to the standard generation parameters found in metadata saved by other SD tools, I included some essential generation statistics, such as the time it took to generate the image (when the Medium or Verbose level is used) and the pixel dimensions of all the images involved.

Check box choices:

As a text file: will save metadata in a plain txt file (see example below). The detail saved is controlled by the Metadata Verbosity option below, from Essential to Medium to Verbose.

As metadata in the png file: will save metadata in the png image file, with verbosity controlled in the same way. Note that, due to the way QImageWriter, the image handling component of the PyQt5 library, saves text chunks in the png file, the saved metadata cannot be retrieved and displayed by the Comfy frontend (which ignores zTXt-compressed chunks in a png). The metadata saved by Krita AI Hires can be retrieved, however, by the Automatic1111 and Forge webuis (with a few exceptions), as well as by the SD Prompt Reader utility, which specialises in this task.

Workflow embedded in the png file: will save the workflow generated by the plugin internally before submitting to the server. The same png chunk compatibility note as above applies to this option, which means Comfy will report “Unable to find workflow in …” when you drop the image onto it (but it’s definitely there!).

Display in the tooltip: will display metadata as a tooltip when you hover the mouse cursor over a generated image thumbnail. The tooltip’s verbosity is controlled by the same Verbosity option as for the previous choices, but the chosen verbosity level will be reduced by 1, so as to keep the screen footprint of the tooltip limited.
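If you want to check for yourself that the metadata is in the file even when a tool reports none, the text chunks of a png can be read with the Python standard library alone. The sketch below follows the PNG chunk layout (length, type, data, CRC) and handles both plain tEXt and zlib-compressed zTXt chunks; it skips iTXt chunks and does not verify CRCs. The keyword `parameters` in the usage example is the one commonly used by A1111-style tools and is an assumption here, not something confirmed for Krita AI Hires.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict:
    """Return {keyword: text} for all tEXt and zTXt chunks in a PNG byte stream."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG stream")
    texts = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
        if ctype == b"tEXt":
            key, _, value = body.partition(b"\x00")
            texts[key.decode("latin-1")] = value.decode("latin-1")
        elif ctype == b"zTXt":
            key, _, rest = body.partition(b"\x00")
            # rest[0] is the compression method byte (0 = zlib), the rest is the payload
            texts[key.decode("latin-1")] = zlib.decompress(rest[1:]).decode("latin-1")
        elif ctype == b"IEND":
            break
    return texts


def make_chunk(ctype: bytes, body: bytes) -> bytes:
    """Build a PNG chunk (used here only to create a tiny in-memory test stream)."""
    crc = zlib.crc32(ctype + body)
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


# Usage: a minimal stream carrying one compressed text chunk (not a viewable image)
sample = (PNG_SIGNATURE
          + make_chunk(b"zTXt", b"parameters\x00\x00" + zlib.compress(b"Steps: 10"))
          + make_chunk(b"IEND", b""))
print(read_png_text_chunks(sample))
```

For a real file, pass the result of `open(path, "rb").read()` to `read_png_text_chunks`.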

An example of metadata text with the verbosity level Verbose:

a serene landscape with forest and mountains on the horizon, a colored illustration
Negative prompt:
Steps: 10
Sampler: Euler a (Hyper)
Schedule type: Normal
CFG scale: 3.0
Seed: 1105287894
Model: art\dynavisionXLAllInOneStylized_releaseV0610Bakedvae.safetensors (SD XL)
Denoising strength: 1.0
Style Preset: Hyper with Lora
Style Preset filename: style-5.json
LoRas: LCM-Hyper\Hyper-SDXL-12steps-CFG-lora.safetensors: strength 1.0
Rescale CFG: 0.7
Canvas resolution: 8192x6144 pixels
Output resolution: 8192x6144 pixels
Region 1: prompt {background}
Region 2: prompt "a forest hut", resolution 3108x2696
Region 3: prompt "a mountain river", resolution 3851x2856
Region 4: prompt "a forest meadow", resolution 5207x1473
Region 5: prompt "an elderly forester man is walking to his forest hut, carrying a heavy bundle of woodsticks on his back", resolution 1656x2680
Models used: xinsirtile-sdxl-1.0.safetensors (ControlNet)
4x_NMKD-Superscale-SP_178000_G.pth (Upscale or Inpaint model)

Generation stats:
6 cached input image(s) of 0x0 pixels
Preparation time: 3.91 sec.
Workflow size: 0.05 MB, 59 nodes
Workflow upload time: 0.14 sec.
Output files total size: 53.43 MB in 1 PNG image(s) of 8192x6144 pixels
Output files download time: 3.28 sec.
Execution time: 164.7 sec.
Total active time: 168.6 sec.
Total lifetime: 353.2 sec.
Batch size: 3

System info:
os: nt
ram_total: 32 GB
ram_free: 23.1 GB
comfyui_version: 0.3.19
python_version: 3.10.11
pytorch_version: 2.6.0+cu124
GPU device features:
name: cuda:0 NVIDIA GeForce RTX 4070 Ti SUPER
vram_total: 16 GB
vram_free: 14.7 GB

Metadata Verbosity

This parameter controls the verbosity of the metadata saved alongside the output png image file, or of the screen tooltip.

Drop-down choices:

Essential: only essential metadata will be saved, including the prompt and main generation parameters.

Medium: more data will be saved as compared to Essential.

Verbose: all metadata and statistics relevant to the generation will be saved.
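The three levels, together with the tooltip rule described under Save Gen Data (the tooltip shows one level less than the selected one), can be modelled as a small ordered enum. The names below are hypothetical, and clamping at Essential when the selected level is already the lowest is an assumption.

```python
from enum import IntEnum


class Verbosity(IntEnum):
    ESSENTIAL = 0
    MEDIUM = 1
    VERBOSE = 2


def tooltip_verbosity(selected: Verbosity) -> Verbosity:
    """Tooltip uses one level below the selected one, never below ESSENTIAL (assumed)."""
    return Verbosity(max(selected - 1, Verbosity.ESSENTIAL))


for level in Verbosity:
    print(level.name, "->", tooltip_verbosity(level).name)
```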

Logfile Verbosity

This option is very similar to Metadata Verbosity and uses the same verbosity levels, except that it applies to the content of the client.log file maintained by the plugin. Setting it to Essential or Medium will make the logfile grow much more slowly than with Verbose; use those levels only when you have no issues with Krita AI.

Drop-down choices:

Essential: only a minimum of event data will be logged, including a few startup sequence lines and the beginning and end of generation events.

Medium: more data will be logged as compared to Essential, including the plugin’s startup sequence.

Verbose: all logging info will be saved in client.log. Note: this is the level you need to set when you are asked to submit info relevant to an issue you have encountered while using Krita AI.