Additional Embeddings

This tab allows you to add additional embeddings to your training.


Description

A technique for adding text embeddings to any training run. It is not limited to LoRA training: it can be used with fine-tunes, or even just to train multiple embeddings without training the UNet, DiT, or text encoder(s) at all.

The Additional embeddings tab is always available, no matter what type of training you are doing. This is how you combine additional embeddings with LoRA training or a fine-tune.

There are two ways to use additional embeddings.

  1. Pivotal Tuning - Train an embedding separately, then use the additional embeddings tab to load the resulting safetensors file into your LoRA or fine-tune training.
    1. The upside of this approach is that, once the embedding is finished, you have a stationary target to train the UNet or DiT against.
    2. The downside is that you need to train the embedding first. Embeddings train quickly, but it still takes time and admin.
  2. Combined Training - Train an additional embedding alongside your LoRA or fine-tune.
    1. The upside of this approach is that you do not need to train an embedding first.
    2. The downside is that the embedding is a moving target for the UNet or DiT training. It is usually best to stop training the embedding before the UNet or DiT, so the remaining training can settle around the point where the embedding stopped.

Usage

The output of an additional embedding differs depending on what you are training.

When training a LoRA, the default is to package the embedding as part of the LoRA file. If you do not want this, uncheck the corresponding box on the LoRA tab.

When training only an embedding, or a fine-tune, a subfolder containing the embedding is created in the location where your finished safetensors file is stored.

The same applies to incremental saves during training: a subfolder is created for each incremental save, with the embedding inside it (or packaged into the LoRA).

Not all generation programs can read an embedding that is packaged inside a LoRA, so disable this functionality if you do not want it.

To generate images, load both the LoRA and the embedding (or the fine-tuned model and the embedding) into your generation program of choice to get the combined result.
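As an illustration (assuming AUTOMATIC1111, a LoRA file named my_lora.safetensors, and an embedding whose placeholder/filename is my_token; these names are hypothetical), a prompt that combines both could look like:

```
a photo of my_token, <lora:my_lora:1.0>
```

The embedding is triggered by its placeholder/filename in the prompt, while the LoRA is loaded with the generation program's own LoRA syntax.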

Additional Embeddings GUI Setup

Add embedding

  • Pressing this button will add a default embedding to the tab.

Additional embedding settings

  • Red X - Pressing this button deletes the additional embedding.
  • Green plus - Pressing this button duplicates the additional embedding.
  • base embedding (default: blank) - Use this field to load an existing embedding for use during training, for example one that has already been trained and does not need further training.
  • placeholder (default: <embedding>) - The placeholder for your embedding. It is used as the filename (generated in a separate folder next to the model defined on the model tab) and must match the trigger word you use in your captions. It is also the placeholder you use in your prompt, for example in AUTOMATIC1111. Note that it does not need to be a single word; several words separated by spaces work as well.
  • token count (default: 1) - The number of tokens to use for your embedding. If you leave this field blank, OneTrainer uses your initial embedding text to calculate the token count automatically. If you set a value, missing tokens are padded with * (for example, a token count of 3 with the one-token init text dog becomes dog**); see the tokenizer sketch after this list.
  • train (default: on) - A toggle that specifies whether your next training run will train this additional embedding.
  • output embedding (default: off) - A toggle that specifies whether the embedding is trained on the output of the text encoder instead of the input. This option is intended to help with models that use T5 or other LLMs as text encoders.
  • stop training after (default: blank - NEVER) - Two fields that tell OneTrainer how long to train this embedding. Specifying NEVER trains the embedding for the entire run.
  • initial embedding text (default: *) - The word, words, or phrase that your embedding points to initially, before any training takes place. If you supply more tokens than the token count, the text is truncated; if you supply fewer, it is padded with *.
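
As a rough sketch of the token-count and padding behaviour described above, the snippet below uses the Hugging Face transformers CLIP tokenizer to count the tokens of an init text and pad it with * up to a chosen token count. This is an illustration only; OneTrainer's internal tokenization may differ, and the model name is just an example.

```python
# Sketch: count CLIP tokens for an init text and pad with "*" to a target token count.
# Assumes the Hugging Face `transformers` package; OneTrainer's own handling may differ.
from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

def pad_init_text(init_text: str, token_count: int) -> str:
    tokens = tokenizer.tokenize(init_text)       # e.g. "dog" -> ["dog</w>"] (1 token)
    missing = token_count - len(tokens)
    return init_text + "*" * max(missing, 0)     # "dog" with token_count=3 -> "dog**"

print(len(tokenizer.tokenize("dog")))            # token count of the init text
print(pad_init_text("dog", 3))                   # "dog**"
```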

Notes

  • Use a tool such as AUTOMATIC1111 or StableSwarmUI to determine how many tokens your intended initial text uses. This can help you set the token count for the embedding. You can also use this link to look up token information; make sure you select the CLIP Tokenizer from the dropdown list.
    • Alternatively, you can leave the token count blank and OneTrainer will calculate it for you.
  • Using an additional embedding in a LoRA training run that trains both the UNet and the text encoder(s) results in very fast learning for subject training.
  • Because additional embeddings are currently separate files, training a very large number of embeddings is not easy to manage or create.
  • There are likely many more ideas of what can be done with this technique. Try them out and share them on the Discord!
  • Using your caption trigger word as the placeholder for the embedding makes things much easier: you no longer have to use <embedding> and keep separate captions for embedding runs and LoRA/fine-tune/additional-embedding runs.
  • Prodigy struggles if you only train the UNet and an additional embedding (and do not train the text encoder(s)). Limiting the growth to 2 and using BF16 weights and calculations has been shown to work, at least in SDXL. If you try Prodigy with FP16 and get it to work, please share your settings.
  • On the training tab, under the embeddings learning rate, there is a "Preserve Embedding Norm" option. It rescales the trained embeddings to the average norm of all other embeddings (between 0.35 and 0.45); see the sketch after this list.
  • The learning rate may need to be adapted for each text encoder, but there is no general guidance for this, as it depends heavily on what you are training.
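
A minimal sketch of the "Preserve Embedding Norm" idea mentioned above, assuming PyTorch tensors; the function name and the way the target norm is computed are illustrative assumptions, not OneTrainer's actual implementation.

```python
# Sketch: rescale trained embedding vectors to the average norm of the other
# (untrained) token embeddings. Hypothetical helper, not OneTrainer's actual code.
import torch

def preserve_embedding_norm(trained: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
    """trained:   (n_trained_tokens, dim) vectors being optimized
       reference: (vocab_size, dim) the model's other token embeddings"""
    target_norm = reference.norm(dim=-1).mean()          # e.g. roughly 0.35-0.45 for SD/SDXL CLIP
    current_norm = trained.norm(dim=-1, keepdim=True)
    return trained * (target_norm / current_norm.clamp(min=1e-8))
```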

More Info

Pivotal tuning is a similar concept; here is some additional information to help you understand it further.
