Tools
Dataset Tools
A tool that helps generate captions with BLIP, BLIP2, or WD14, and masks for masked training using ClipSeg or Rembg.
Caption Generation
If you set an initial caption, generation starts from that text instead of an empty string (BLIP and BLIP2 only). You can also add a caption prefix and postfix, which is especially useful with WD14. Note that no spaces are added automatically, so you need to include them (or a "," or ".") yourself.
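As an illustration of that behavior, here is a minimal sketch (the function name and arguments are hypothetical, not OneTrainer's internal code) of how a prefix and postfix are joined onto a caption with no automatic separators:

```python
# Hypothetical sketch: prefix and postfix are concatenated directly,
# with no separators inserted automatically.
def apply_prefix_postfix(caption: str, prefix: str = "", postfix: str = "") -> str:
    return prefix + caption + postfix

# Without explicit separators, the tags run together:
apply_prefix_postfix("1girl, smile", prefix="my_style,", postfix="best quality")
# -> "my_style,1girl, smilebest quality"

# Include the spaces and commas yourself:
apply_prefix_postfix("1girl, smile", prefix="my_style, ", postfix=", best quality")
# -> "my_style, 1girl, smile, best quality"
```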
Mask Generation
OneTrainer includes a few different tools under the dataset tools button to automatically generate a mask, depending on your image type. With the batch generate mask tool, you can use ClipSeg, Rembg, Rembg-human, or Hex Color.
With ClipSeg, you can use prompts such as "a woman", "face of a woman", or "face and hair of a woman" to have the model create a mask outside of the areas you specify.
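For reference, the general ClipSeg technique looks roughly like the sketch below, using the CIDAS/clipseg-rd64-refined model from Hugging Face transformers. The threshold value and the -masklabel.png output name are illustrative assumptions, not OneTrainer's exact code:

```python
import torch
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation

processor = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
model = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")

image = Image.open("input.jpg").convert("RGB")
inputs = processor(text=["face of a woman"], images=[image], return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # low-resolution heatmap (352x352)

# Threshold the heatmap (0.3 is an arbitrary example value) and
# upscale the binary mask back to the original image size.
probs = torch.sigmoid(logits).squeeze()
mask = (probs > 0.3).numpy().astype("uint8") * 255
Image.fromarray(mask).resize(image.size).save("input-masklabel.png")
```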
With the manual paint features, you can mask off an area and then use the fill option to fill in the remainder instead of trying to cover everything with the brush.
Do not forget to press Enter after you are done manually editing your mask! Changes will not be saved unless you press Enter.
Video Tools
This provides some basic tools for getting screenshots from videos, splitting long videos into short clips, and downloading videos from URLs. It may be necessary to install ffmpeg for some video formats.
Multiple videos are processed in parallel, so if you have one very long (e.g. movie-length) video, it will run faster if you manually split it into a few shorter chunks first.
Extract Clips - specify either a single video file or a folder with multiple videos (including in subfolders) and a path to an output folder. If the "Output to Subdirectories" option is enabled, the outputs will be saved to separate folders under the output folder for each video processed, otherwise they will all be saved to the top level of the output folder. Videos are saved as .avi.
If the "Split at Cuts" option is enabled, it will use PySceneDetect to identify cuts in the video and split the input video at those locations. Otherwise splits may happen at any location, and the output clips may include cuts.
The "Max Length" setting determines how long the outputted clips will be, specified in seconds. Scenes detected with the "Split at Cuts" option may be split into shorter sections if they are longer. The resulting clips may be shorter than specified if the scenes are short, for example a 4-second scene with a max length of 3 seconds would be split into 2-second clips. Any sections <0.25 seconds are discarded.
Extract Images - same as clips: specify an input video/folder and an output folder, with the option to output to subdirectories. Images are saved as JPEGs.
Specify the number of images to capture per second; the default is 0.5, which is one image every 2 seconds. Images are selected at a random spread around the "center" frame of each interval, so running the tool multiple times will select different frames.
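That sampling can be sketched like this (the uniform jitter is an assumption; the real implementation may distribute frames differently):

```python
import random

def pick_frame_times(duration: float, images_per_second: float = 0.5) -> list[float]:
    """Pick one jittered timestamp per capture interval (default: every 2 s)."""
    interval = 1.0 / images_per_second
    times = []
    start = 0.0
    while start + interval <= duration:
        center = start + interval / 2
        offset = random.uniform(-interval / 2, interval / 2)
        times.append(center + offset)
        start += interval
    return times

pick_frame_times(10.0)  # five timestamps, one per 2-second window, different each run
```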
"Blur Removal" specifies a portion of the captured images to discard as "too blurry", which may help avoid capturing frames which are mid-transition or have motion blur. Uses the "variance of the Laplacian" method to quantify the sharpness of the image and discards the lowest in the set of selected frames, only saving the remaining.
Download - provide either a single link or a list of links (a .txt file with one link per line) to download to a specified output directory. This uses yt-dlp, which supports a wide variety of sites. Additional arguments can be provided; see the yt-dlp GitHub page for a list of applicable arguments and supported websites. The default arguments just reduce the spam in the terminal window while still displaying download progress.
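For context, downloading through yt-dlp's Python API looks roughly like this (the URL is a placeholder and the options shown are illustrative, not OneTrainer's actual defaults):

```python
from yt_dlp import YoutubeDL

urls = ["https://example.com/some-video"]  # placeholder URL

opts = {
    "paths": {"home": "downloads"},  # output directory
    "quiet": True,                   # reduce terminal output
}
with YoutubeDL(opts) as ydl:
    ydl.download(urls)
```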
Convert Model Tools
Convert between different model formats.
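As a rough illustration of what such a conversion involves (a generic sketch using the torch and safetensors libraries, not OneTrainer's implementation), a single-file checkpoint can be moved between the .safetensors and .ckpt containers like this:

```python
import torch
from safetensors.torch import load_file, save_file

# .safetensors -> .ckpt (filenames are placeholders)
state_dict = load_file("model.safetensors")
torch.save({"state_dict": state_dict}, "model.ckpt")

# .ckpt -> .safetensors (tensors must be contiguous for safetensors)
ckpt = torch.load("model.ckpt", map_location="cpu")
sd = ckpt.get("state_dict", ckpt)
save_file({k: v.contiguous() for k, v in sd.items()}, "model.safetensors")
```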