How to set up and evaluate validation datasets

Validation helps you clearly understand how well your training is progressing, determine whether your model is learning effectively, and pinpoint the best checkpoints during training. A well-designed validation setup provides visual metrics, like a clear loss curve, enabling precise, data-driven decisions rather than guesswork.

Setting Up Validation

1. Create a Dedicated Validation Concept

  • On your "concept" page (in your training UI):

    • Add a new concept.

    • Set its type explicitly to VALIDATION.

  • Crucially, do not enable:

    • Text variation

    • Image variation (no augmentations)

    • Shuffle

    • Caption dropout

    We want the validation process to remain consistent and completely repeatable across training sessions; otherwise, losses from different points in training are not directly comparable. (A hedged sketch for sanity-checking these settings follows below.)
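
A minimal sketch of such a check, assuming your concepts were saved to a JSON file: the path and every field name below are illustrative assumptions rather than OneTrainer's actual schema, so open the concept file your install writes and adapt the keys to what you see there.

```python
import json

# Hypothetical path -- point this at the concept file OneTrainer saves.
with open("training_concepts/concepts.json") as f:
    concepts = json.load(f)

for concept in concepts:
    if concept.get("type") != "VALIDATION":
        continue
    # Assumed key names; rename them to match your exported file.
    for key in ("text_variations", "image_variations",
                "enable_shuffle", "caption_dropout"):
        if concept.get(key):
            print(f"check concept {concept.get('name')!r}: {key} = {concept[key]}")
```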


2. Configure Validation Settings

  • On the general settings page, enable:

    • TensorBoard logging

    • Validation

  • Step Validation Intervals:

• If your dataset has fewer than 500 images, simply set the validation interval to 1 epoch.

    • For larger datasets, calculating steps per epoch helps maintain graph consistency.
      Example:

      • If your dataset completes an epoch in 1350 steps:

        • Set the interval to a value that divides this evenly (e.g., 675 steps to validate twice per epoch, or 450 steps to validate three times per epoch).
      • Exact divisions are not mandatory, but they are strongly advised: they keep validation points aligned within each epoch and yield a smoother, clearer graph. A short sketch for computing candidate intervals follows below.
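
Steps per epoch is roughly your image count divided by the batch size (scaled by any concept repeats), but take the exact figure from your own run. Given that number, a minimal sketch for listing intervals that divide an epoch evenly, using the 1350-step example above:

```python
def suggested_intervals(steps_per_epoch: int, max_per_epoch: int = 4) -> list[int]:
    """Validation intervals that split one epoch into k equal parts."""
    return [steps_per_epoch // k
            for k in range(1, max_per_epoch + 1)
            if steps_per_epoch % k == 0]

# For a 1350-step epoch this prints [1350, 675, 450]:
# validating once, twice, or three times per epoch.
print(suggested_intervals(1350))
```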


Selecting Validation Images

1. Image Uniqueness

  • Every validation image must be unique and must not appear anywhere in your training dataset.

  • Minor modifications like cropping or re-encoding do NOT count as new images. A hedged script for catching exact duplicates between the two sets follows below.
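
A minimal sketch for catching overlap, assuming your training and validation images live in two folders (the paths below are placeholders). It only flags byte-identical files; a cropped or re-encoded copy hashes differently yet still does not count as a new image, so near-duplicates need a manual check.

```python
import hashlib
from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}

def image_hashes(folder: Path) -> dict[str, Path]:
    """Map SHA-256 digest -> path for every image under a folder."""
    return {
        hashlib.sha256(p.read_bytes()).hexdigest(): p
        for p in folder.rglob("*")
        if p.suffix.lower() in IMAGE_EXTS
    }

train = image_hashes(Path("dataset/train"))       # placeholder paths
val = image_hashes(Path("dataset/validation"))

for digest in set(train) & set(val):
    print(f"duplicate: {val[digest]} is byte-identical to {train[digest]}")
```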

2. Simplicity

  • Select simple, clear images that precisely represent the core of what you want the model to learn.

    • If training a person's likeness, a close-up headshot or portrait is ideal.

    • Avoid overly complex backgrounds, interactions, or busy compositions; a simple image makes training progress much easier to judge.

3. Quantity

  • There's no strict minimum or maximum. However:

    • 3-5 images: minimal but sufficient to identify basic trends and training trajectory.

    • 10-15 images: ideal range, providing smoother and more reliable validation graphs while remaining manageable.

  • Select images representing diverse yet clear aspects of your target concept to effectively gauge generalization and concept reproduction.


Interpreting Validation Graphs

What You Should See

  • Your validation loss graph should initially show a smooth downward trend as the model improves and internalizes your concept.

  • Eventually, validation loss hits a first low point, which indicates the model has achieved optimal or near-optimal understanding of your dataset.

Larger vs. Smaller Datasets

  • Larger datasets (1000+ images) typically show one clear low point, which is often your ideal checkpoint.

  • Smaller datasets (30–100 images) often fluctuate: validation loss may rise after an initial low before eventually descending to a new low point.

Best Practices

  • Keep every checkpoint that corresponds to one of these low points in your validation curve: either the checkpoint saved exactly at the low point, or the next checkpoint saved after the low point was hit.

  • Each low-loss checkpoint will have differing strengths:

    • Some excel at precise reproduction of your trained concept.

    • Others offer improved generalization across varied prompts and contexts.

  • By focusing exclusively on the checkpoints these validation low points identify, you significantly reduce testing and guesswork after training concludes. A hedged sketch for extracting the low points from your TensorBoard logs follows below.
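
A minimal sketch for pulling the validation curve out of your TensorBoard event files and listing its local minima; the log directory and the scalar tag are assumptions, so print the available tags first and substitute whatever your run actually logs.

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

acc = EventAccumulator("workspace/run/tensorboard")  # hypothetical log dir
acc.Reload()
print(acc.Tags()["scalars"])  # discover the real validation-loss tag name

events = acc.Scalars("loss/validation")  # tag name is an assumption
steps = [e.step for e in events]
losses = [e.value for e in events]

# A local minimum is lower than both neighbours. Small datasets will
# produce several; keep the checkpoint saved at (or just after) each.
for i in range(1, len(losses) - 1):
    if losses[i] < losses[i - 1] and losses[i] < losses[i + 1]:
        print(f"low point at step {steps[i]}: validation loss {losses[i]:.4f}")
```

For the noisier curves that small datasets produce, apply some smoothing (the slider in TensorBoard's scalar view works) before treating any single dip as a real low point.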