Preparing good facesets
A well-prepared dataset is one of the most important components of any machine learning project, and this holds for work with DFL as well. A good dataset, or in our case a faceset, can often matter more than the model used or any of the other factors.
Please note: These are just general guidelines. Every deepfake is different, there are always exceptions, and what worked for one project may not work for another.
Good practice hint: It is always advisable to make regular backups, especially before deleting large numbers of pictures.
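As a minimal illustration (this is not an MVE feature, and the folder names below are only examples), you could script a quick timestamped backup of a faceset folder like this:

```python
# Minimal backup sketch: copy a faceset folder to a timestamped backup
# before doing any large-scale deletion. The paths are examples only.
import shutil
from datetime import datetime
from pathlib import Path

faceset_dir = Path("workspace/data_src/aligned")                 # example faceset path
backup_dir = Path("backups") / f"aligned_{datetime.now():%Y%m%d_%H%M%S}"

backup_dir.parent.mkdir(parents=True, exist_ok=True)
shutil.copytree(faceset_dir, backup_dir)
print(f"Backed up {faceset_dir} to {backup_dir}")
```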
Some basic advice and definitions will be listed here.
- subject(s) - the person(s) of interest for the project - the (two) people whose faces you want to swap
- SRC - Source; the subject you want to put in the final video; marked blue on the training graph
- DST - Destination; the subject you want to replace in the final video; marked yellow on the training graph
It is almost always true that having more pictures in your facesets helps. Some good and bad practices are listed below.
A few good practices:
- try and use highest quality sources you can get (for example, 4K videos)
- aim to have pictures of a wide variety of angles
- aim to have a wide variety of facial expressions (yawning, talking, anger, eyes open, eyes closed, etc.)
- expand your DST set beyond the video where you want DST replaced
- remove obstructed faces from SRC set (or at least make sure ALL obstructions are masked out)
Probably the most important practice for preparing facesets is making sure your SRC set covers all the angles of your DST set.
To compare the angle coverage of the two sets, you can either compare heatmaps of both sets, or use the Set Creator, which has a Set Compare graph feature.
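If you want to do a similar comparison outside of MVE, the idea can be sketched with a generic script. This is not how MVE computes its heatmaps; it assumes you have already exported per-face yaw/pitch estimates for each set to a CSV file (a hypothetical format with `yaw` and `pitch` columns):

```python
# Generic sketch: compare angle coverage between SRC and DST facesets.
# Assumes yaw/pitch angles (in degrees) were exported beforehand, e.g. one
# CSV per set with "yaw" and "pitch" columns; this is not MVE's own format.
import csv
import matplotlib.pyplot as plt

def load_angles(path):
    yaws, pitches = [], []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            yaws.append(float(row["yaw"]))
            pitches.append(float(row["pitch"]))
    return yaws, pitches

src_yaw, src_pitch = load_angles("src_angles.csv")   # hypothetical export of the SRC set
dst_yaw, dst_pitch = load_angles("dst_angles.csv")   # hypothetical export of the DST set

fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharex=True, sharey=True)
for ax, (yaw, pitch, title) in zip(axes, [(src_yaw, src_pitch, "SRC"),
                                          (dst_yaw, dst_pitch, "DST")]):
    ax.hist2d(yaw, pitch, bins=30)    # 2D heatmap of angle coverage
    ax.set_title(title)
    ax.set_xlabel("yaw (degrees)")
    ax.set_ylabel("pitch (degrees)")
plt.tight_layout()
plt.show()
```

Angle regions that are populated in the DST plot but empty in the SRC plot are the angles you still need to collect for SRC.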
A few bad practices:
- mixing very different faces of the same person (e.g. a very young subject and their older self, or natural-face pictures combined with heavy-makeup pictures)
- not fixing alignments
- not removing non-subject faces from the final training sets
You will probably want to start by acquiring high quality videos that include both of your subjects. Of course, that should always include the video in which you want to do the face replacement.
Once you have your video sources, it is time to extract the frames.
Some of the ways to do it include:
- using MVE's Video Extractor
- using DFL's `.bat` files
- directly using FFmpeg (see the sketch below)
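As an illustration of the FFmpeg route, here is a minimal sketch that calls FFmpeg from Python and dumps every frame to numbered PNG files. It requires `ffmpeg` to be on your PATH; the paths and filename pattern are only examples:

```python
# Minimal sketch: extract every frame of a video to numbered PNGs with FFmpeg.
# Paths and the output filename pattern are examples only.
import subprocess
from pathlib import Path

video = "data_src.mp4"              # example input video
out_dir = Path("data_src_frames")   # example output folder
out_dir.mkdir(exist_ok=True)

subprocess.run(
    ["ffmpeg", "-i", video, str(out_dir / "%06d.png")],
    check=True,
)
```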
Hint: it is a good idea to keep videos/frames containing the SRC subject separate from videos/frames containing the DST subject.
You can also expand your facesets by getting high quality pictures online.
For example, you can use MVE's integrated Image Downloader to download high quality pictures from Getty.
Once you have your frames, it is time to
- extract faces
- remove unwanted faces
- fix alignments
You will probably want to create masks for your faces as well. You can consult the mini guide on using XSeg in MVE.
There is a separate article you can consult specifically for extracting faces and fixing alignments:
https://github.com/MachineEditor/MachineVideoEditor/wiki/Extracting-Faces-and-Fixing-Alignment
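Purely as a generic illustration of what face extraction involves (this is not DFL's or MVE's extractor, which also produces landmarks and alignment metadata), a rough sketch using OpenCV's bundled Haar cascade might look like this:

```python
# Generic sketch: detect and crop faces from extracted frames using OpenCV's
# Haar cascade. NOT the DFL/MVE extractor; it only illustrates the basic idea.
from pathlib import Path
import cv2

frames_dir = Path("data_src_frames")   # example input folder of frames
faces_dir = Path("raw_faces")          # example output folder
faces_dir.mkdir(exist_ok=True)

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

for frame_path in sorted(frames_dir.glob("*.png")):
    frame = cv2.imread(str(frame_path))
    if frame is None:
        continue
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Crop each detected face region and save it next to the frame's name.
    for i, (x, y, w, h) in enumerate(detector.detectMultiScale(gray, 1.1, 5)):
        crop = frame[y:y + h, x:x + w]
        cv2.imwrite(str(faces_dir / f"{frame_path.stem}_{i}.png"), crop)
```

For real projects you should use the DFL/MVE extraction pipeline described in the article above, since the trainer needs the alignment data it embeds.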
First you need to define what counts as an unwanted face for your faceset.
Most often that will be:
- faces that do not correspond to your subject
- (heavily) obscured SRC faces
- bad quality / very blurry images
Machine Video Editor has several tools that can help you remove unwanted faces:
- Face Similarity sort can be used to keep only the faces of your subject and easily remove all other faces from your set
- Blur sort is not ideal, but it can help identify very blurry images (a generic blur-scoring sketch follows below)
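As a rough illustration of how blur sorting can work in general (a generic variance-of-Laplacian approach, not MVE's implementation; the folder path is an example), you could score images yourself and review the blurriest ones:

```python
# Generic sketch: rank faceset images by sharpness using the variance of the
# Laplacian (lower variance = blurrier). Not MVE's implementation; the folder
# path and the number of images printed are examples only.
from pathlib import Path
import cv2

faceset_dir = Path("workspace/data_src/aligned")   # example faceset path

scores = []
for img_path in sorted(faceset_dir.glob("*.jpg")):
    img = cv2.imread(str(img_path), cv2.IMREAD_GRAYSCALE)
    if img is None:
        continue
    sharpness = cv2.Laplacian(img, cv2.CV_64F).var()
    scores.append((sharpness, img_path.name))

# Print the blurriest images first so they can be reviewed by hand.
for sharpness, name in sorted(scores)[:20]:
    print(f"{sharpness:8.1f}  {name}")
```

Images with the lowest scores are the most likely candidates for removal, but always review them by eye before deleting anything.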