Dataset Preparation - InfluxOW/Stable-Diffusion-Text-To-Person GitHub Wiki

To train the model, you will need photos of the desired person: at least 10-15 pictures, with no upper limit. The most reasonable range is 15-30 photos.

  • It's crucial that each photo contains only one person.

  • The higher the quality of the photos and the more readable the person's face, the better.

  • The dataset should be diverse: different lighting, different environments, different poses, different facial positions. The less diverse the dataset, the more limited the final model is likely to be.

  • The person's appearance should be reasonably consistent across the photos: it's not good if they look overweight in some photos and skinny in others, or young in some and old in others. When scrolling through some people's social media profiles, you often get the feeling that the photos show not one but five different people. The human brain is essentially a neural network too, and if even it cannot generalize such photos into a coherent image of the person, a model trained on such a dataset may also produce contradictory results.

  • I also strongly recommend cropping the photos to a 1:1 aspect ratio.
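The size and cropping recommendations above can be sketched in Python. This is a minimal sketch, not part of the original workflow: the function names, the list of file extensions, and the wording of the size hints are all assumptions made for illustration, and the thresholds simply restate the numbers given in the text. The folder helper relies on the Pillow library (`Image.open`, `Image.crop`, `Image.save`), imported lazily so the pure helpers work without it.

```python
from pathlib import Path

# Assumed set of image formats; adjust to whatever your dataset contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def center_square_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def dataset_size_hint(count: int) -> str:
    """Map a photo count onto the ranges mentioned in the text (labels are illustrative)."""
    if count < 10:
        return "too few"
    if count < 15:
        return "workable minimum"
    if count <= 30:
        return "recommended range"
    return "above the recommended range"


def crop_folder(src: Path, dst: Path) -> int:
    """Center-crop every image in src to 1:1 and save it under dst.

    Hypothetical helper: requires Pillow (pip install Pillow).
    Returns the number of images written.
    """
    from PIL import Image  # imported here so the helpers above need no dependencies

    dst.mkdir(parents=True, exist_ok=True)
    saved = 0
    for path in sorted(src.iterdir()):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        with Image.open(path) as img:
            # img.size is (width, height); crop() takes a 4-tuple box.
            img.crop(center_square_box(*img.size)).save(dst / path.name)
        saved += 1
    return saved
```

A blind center crop can cut off a face that sits near the edge of the frame, so it is worth reviewing the output by eye and cropping such photos manually.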

In general, the principle of garbage in, garbage out fully applies to the dataset. The dataset has an even greater impact on the result than most training parameters, so take it seriously if you want high-quality output. However, as an experiment, you can train the model on all the decent photos you have; in about half of the cases, I was satisfied with the results of doing so.

Let's take a look at samples from the datasets used to generate the images from the Examples.


Danila Poperechnij | More

Karina Istomina | More

Lana Del Rey | More

Skryptonite | More


Let's also take a look at today's main character, Billie Eilish, for whom we will try to train an ideal model in the following sections.


Billie Eilish | More


As you can see, in half of these datasets I deviated from some of the rules described above, but the results still turned out at least decent. We'll find out how it turns out with Billie as we go further.


Next - Model Training ‐ Introduction
