How to set up and evaluate validation datasets

Validation helps you clearly understand how well your training is progressing, determine whether your model is learning effectively, and pinpoint the best checkpoints during training. A well-designed validation setup provides visual metrics, like a clear loss curve, enabling precise, data-driven decisions rather than guesswork.

Setting Up Validation

1. Create a Dedicated Validation Concept

  • On your "concept" page (in your training UI):

    • Add a new concept.

    • Set its type explicitly to VALIDATION.

  • Crucially, do not enable:

    • Text variation

    • Image variation (no augmentations)

    • Shuffle

    • Caption dropout

    We want the validation process to remain consistent and completely repeatable across training sessions; otherwise, losses from different points in training are not directly comparable. (A hedged sketch for sanity-checking these settings follows below.)
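
A minimal sketch of such a check, assuming your concepts were saved to a JSON file: the path and every field name below are illustrative assumptions rather than OneTrainer's actual schema, so open the concept file your install writes and adapt the keys to what you see there.

```python
import json

# Hypothetical path -- point this at the concept file OneTrainer saves.
with open("training_concepts/concepts.json") as f:
    concepts = json.load(f)

for concept in concepts:
    if concept.get("type") != "VALIDATION":
        continue
    # Assumed key names; rename them to match your exported file.
    for key in ("text_variations", "image_variations",
                "enable_shuffle", "caption_dropout"):
        if concept.get(key):
            print(f"check concept {concept.get('name')!r}: {key} = {concept[key]}")
```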


2. Configure Validation Settings

  • On the general settings page, enable:

    • TensorBoard logging

    • Validation

  • Step Validation Intervals:

• If your dataset has fewer than 500 images, simply set the validation interval to 1 epoch.

    • For larger datasets, calculating steps per epoch helps maintain graph consistency.
      Example:

      • If your dataset completes an epoch in 1350 steps:

        • Set the interval to a value that divides this evenly (e.g., 675 steps to validate twice per epoch, or 450 steps to validate three times per epoch).
      • Exact divisions are not mandatory, but they are strongly advised: they keep validation points aligned within each epoch and yield a smoother, clearer graph. A short sketch for computing candidate intervals follows below.
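
Steps per epoch is roughly your image count divided by the batch size (scaled by any concept repeats), but take the exact figure from your own run. Given that number, a minimal sketch for listing intervals that divide an epoch evenly, using the 1350-step example above:

```python
def suggested_intervals(steps_per_epoch: int, max_per_epoch: int = 4) -> list[int]:
    """Validation intervals that split one epoch into k equal parts."""
    return [steps_per_epoch // k
            for k in range(1, max_per_epoch + 1)
            if steps_per_epoch % k == 0]

# For a 1350-step epoch this prints [1350, 675, 450]:
# validating once, twice, or three times per epoch.
print(suggested_intervals(1350))
```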


Selecting Validation Images

1. Image Uniqueness

  • Every validation image must be unique and must not appear anywhere in your training dataset.

  • Minor modifications like cropping or re-encoding do NOT count as new images. A hedged script for catching exact duplicates between the two sets follows below.
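
A minimal sketch for catching overlap, assuming your training and validation images live in two folders (the paths below are placeholders). It only flags byte-identical files; a cropped or re-encoded copy hashes differently yet still does not count as a new image, so near-duplicates need a manual check.

```python
import hashlib
from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}

def image_hashes(folder: Path) -> dict[str, Path]:
    """Map SHA-256 digest -> path for every image under a folder."""
    return {
        hashlib.sha256(p.read_bytes()).hexdigest(): p
        for p in folder.rglob("*")
        if p.suffix.lower() in IMAGE_EXTS
    }

train = image_hashes(Path("dataset/train"))       # placeholder paths
val = image_hashes(Path("dataset/validation"))

for digest in set(train) & set(val):
    print(f"duplicate: {val[digest]} is byte-identical to {train[digest]}")
```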

2. Simplicity

  • Select simple, clear images that precisely represent the core of what you want the model to learn.

    • If training a person's likeness, a close-up headshot or portrait is ideal.

    • Avoid overly complex backgrounds, interactions, or busy compositions; a simple image makes training progress much easier to judge.

3. Quantity

  • There's no strict minimum or maximum. However:

    • 3-5 images: minimal but sufficient to identify basic trends and training trajectory.

    • 10-15 images: ideal range, providing smoother and more reliable validation graphs while remaining manageable.

  • Select images representing diverse yet clear aspects of your target concept to effectively gauge generalization and concept reproduction.


Interpreting Validation Graphs

What You Should See

  • Your validation loss graph should initially show a smooth downward trend as the model improves and internalizes your concept.

  • Eventually, validation loss hits a first low point, which indicates the model has achieved optimal or near-optimal understanding of your dataset.

Larger vs. Smaller Datasets

  • Larger datasets (1000+ images) typically show one clear low point, which is often your ideal checkpoint.

  • Smaller datasets (30–100 images) often fluctuate: validation loss may rise after an initial low before eventually descending to a new low point.

Best Practices

  • Keep every checkpoint that corresponds to one of these low points in your validation curve: either the checkpoint saved exactly at the low point, or the next checkpoint saved after the low point was hit.

  • Each low-loss checkpoint will have differing strengths:

    • Some excel at precise reproduction of your trained concept.

    • Others offer improved generalization across varied prompts and contexts.

  • By focusing exclusively on the checkpoints these validation low points identify, you significantly reduce testing and guesswork after training concludes. A hedged sketch for extracting the low points from your TensorBoard logs follows below.
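
A minimal sketch for pulling the validation curve out of your TensorBoard event files and listing its local minima; the log directory and the scalar tag are assumptions, so print the available tags first and substitute whatever your run actually logs.

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

acc = EventAccumulator("workspace/run/tensorboard")  # hypothetical log dir
acc.Reload()
print(acc.Tags()["scalars"])  # discover the real validation-loss tag name

events = acc.Scalars("loss/validation")  # tag name is an assumption
steps = [e.step for e in events]
losses = [e.value for e in events]

# A local minimum is lower than both neighbours. Small datasets will
# produce several; keep the checkpoint saved at (or just after) each.
for i in range(1, len(losses) - 1):
    if losses[i] < losses[i - 1] and losses[i] < losses[i + 1]:
        print(f"low point at step {steps[i]}: validation loss {losses[i]:.4f}")
```

For the noisier curves that small datasets produce, apply some smoothing (the slider in TensorBoard's scalar view works) before treating any single dip as a real low point.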