Mouth Masking: How Mouth Masking Works - iVideoGameBoss/iRoopDeepFaceCam GitHub Wiki
Deep Dive: How Mouth Masking Works - Advanced
This page provides a thorough explanation of the mouth masking system used in the face-swapping application. We'll dissect each function involved in creating, visualizing, and applying the mouth masks, detailing their inner workings and the logic behind their implementation.
Core Objective: Realistic Mouth Swapping
The goal of the mouth masking system is to improve the quality of face swaps by swapping the mouth from the source face onto the target face more seamlessly. This involves creating an accurate mask around the mouth and lower lip area, feathering the mask edges to blend with the original mouth, and matching the colors to the original.
Function Breakdown: Step-by-Step
Let's break down the individual functions and their interactions:
1. create_mouth_mask(face: Face, frame: Frame)
- Purpose: This function generates a basic mask around the mouth region using facial landmarks.
- Step-by-step breakdown:
Initialization:
- A black mask (`mask`) is created with the same height and width as the input `frame`.
- `mouth_cutout`, the image of the cropped mouth, is initialized to `None`.
Landmark Check:
- It checks whether the `face` object contains `landmark_2d_106`, a list of coordinates for landmarks on the face. If no landmarks exist, the function returns the black mask and `None` for both the cutout and the bounding box.
Key Landmark Extraction:
- The coordinates of the nose tip (landmark `80`) and the center of the bottom lip (landmark `73`) are extracted as floating-point numbers from `face.landmark_2d_106`.
Mask Dimensions:
- A vector, `center_to_nose`, is calculated from the center of the bottom lip to the tip of the nose.
- `mask_height` is calculated as the length of `center_to_nose` multiplied by `modules.globals.mask_size` and a constant `0.3`.
- `mask_top` is determined by offsetting the nose tip in the direction of `center_to_nose` using a constant `0.2`. An offset from `modules.globals.mask_down_size` is also added to move the mouth mask down or up, depending on its sign.
- `mask_bottom` is determined by offsetting `mask_top` in the direction of `center_to_nose`.
- The mouth landmarks between landmarks `52` and `71` are extracted as floating-point numbers from `face.landmark_2d_106`.
- `mouth_width` is calculated from the width of these mouth landmarks.
- The mask width is calculated using the formula `mask_width = mouth_width * 0.4 * modules.globals.mask_size * 0.8`.
Mask Polygon Construction:
- `mask_direction`, a vector perpendicular to `center_to_nose` that extends left and right from the center of the mouth, is derived by swapping the X and Y components of `center_to_nose` and negating one of them.
- `mask_direction` is normalized by dividing it by its length.
- A `mask_polygon` is created with 4 points. The first two points are horizontally offset from `mask_top` using `mask_direction` and `mask_width`. The next two points are horizontally offset from `mask_bottom` using `mask_direction` and `mask_width`.
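The geometry in the two steps above can be sketched with NumPy as follows. This is a minimal illustration, not the project's code: the function name `mouth_mask_polygon`, the plain `mask_size` and `mask_down_size` arguments (standing in for the `modules.globals` values), the `52:71` slice, and the exact way the offsets are combined are all assumptions based on the description.

```python
import numpy as np


def mouth_mask_polygon(landmarks, mask_size=1.0, mask_down_size=0.0):
    """Build the 4-point mask polygon from a (106, 2) landmark array.

    Hypothetical helper: names and argument defaults are illustrative.
    """
    nose_tip = landmarks[80].astype(np.float32)
    center_bottom = landmarks[73].astype(np.float32)

    # Vector from the center of the bottom lip to the nose tip.
    center_to_nose = nose_tip - center_bottom
    length = np.linalg.norm(center_to_nose)
    if length == 0:
        raise ValueError("nose tip and lip center coincide")

    mask_height = length * mask_size * 0.3

    # Assumption: mask_down_size shifts only the y coordinate.
    down = np.array([0.0, mask_down_size], dtype=np.float32)
    mask_top = nose_tip + center_to_nose * 0.2 + down
    # Assumption: mask_bottom lies mask_height away along center_to_nose.
    mask_bottom = mask_top + (center_to_nose / length) * mask_height

    # Assumption: "between landmarks 52 and 71" is read as the slice 52:71.
    mouth_points = landmarks[52:71].astype(np.float32)
    mouth_width = mouth_points[:, 0].max() - mouth_points[:, 0].min()
    mask_width = mouth_width * 0.4 * mask_size * 0.8

    # Perpendicular direction: swap x and y, negate one component.
    mask_direction = np.array(
        [-center_to_nose[1], center_to_nose[0]], dtype=np.float32
    )
    mask_direction /= np.linalg.norm(mask_direction)

    offset = mask_direction * mask_width
    polygon = np.array([
        mask_top + offset,
        mask_top - offset,
        mask_bottom - offset,
        mask_bottom + offset,
    ])
    return polygon.astype(np.int32)
```

The point order (top pair, then bottom pair in reverse) keeps the polygon edges from crossing, which matters when the polygon is later filled.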
Mask Drawing and Mouth Cropping:
- The `mask_polygon` is converted to a `numpy.ndarray` of integers and used to fill the mask with the color white.
- A bounding box around the mask is calculated.
- The `mouth_cutout` is created by cropping the area specified by the bounding box from the original `frame`.
Return:
- The function returns the mask, the cropped mouth (`mouth_cutout`), and a tuple that contains the top-left and bottom-right corners of the mouth mask's bounding box.
2. create_lower_mouth_mask(face: Face, frame: Frame)
- Purpose: This function creates a detailed mask for just the lower lip and surrounding area. It is used because the lower lip moves much more than the upper lip.
- Step-by-step breakdown:
Initialization:
- A black mask (`mask`) is created with the same height and width as the input `frame`.
- `mouth_cutout`, the image of the cropped mouth, is initialized to `None`.
Landmark Check:
- It checks whether the `face` object contains the `landmark_2d_106` landmarks. If not, the function returns the black mask, `None` for the cutout, the bounding box, and `None` for the lower lip polygon.
Lower Lip Landmark Extraction:
- The landmarks that make up the lower lip are extracted from `face.landmark_2d_106` using `lower_lip_order`. They are converted to floating-point values for more accurate calculations.
Calculate Landmark Center:
- The center of the lower lip landmarks is calculated by applying `np.mean` to `lower_lip_landmarks`.
Expand the Landmarks:
- An `expansion_factor` is calculated as `1 + modules.globals.mask_down_size`; this value determines how much the mask is expanded.
- The `lower_lip_landmarks` are then expanded around their center by subtracting the `center`, multiplying by `expansion_factor`, and adding the `center` back.
Top Lip Extension:
- A list of `toplip_indices` is created that selects the landmarks of the top lip.
- A `toplip_extension` is determined by multiplying the global variable `modules.globals.mask_size` by a constant `0.5`.
- The code then loops through `toplip_indices` and extends the lip: for each landmark it calculates the direction from the `center` to the landmark and adds that direction to the original landmark position.
Chin Extension:
- A list of `chin_indices` is created that selects the landmarks of the chin.
- A `chin_extension` is created using a constant of `2 * 0.2`.
- The code then loops through `chin_indices` and extends the chin by offsetting each landmark's Y component in proportion to its distance from the center's Y coordinate.
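The expansion, top lip extension, and chin extension steps can be sketched together as below. This is an illustration under stated assumptions rather than the project's implementation: the helper name `expand_lower_lip` is hypothetical, `mask_size` and `mask_down_size` are plain arguments standing in for `modules.globals`, the index lists are treated as positions within the ordered lower lip points (not raw landmark IDs), and the top lip direction is normalized and scaled by `toplip_extension`.

```python
import numpy as np

# Index lists as given in the "Mouth Mask" section of this page.
LOWER_LIP_ORDER = [65, 66, 62, 70, 69, 18, 19, 20, 21, 22, 23, 24,
                   0, 8, 7, 6, 5, 4, 3, 2, 65]
TOPLIP_INDICES = [20, 0, 1, 2, 3, 4, 5]
CHIN_INDICES = [11, 12, 13, 14, 15, 16]


def expand_lower_lip(landmarks, mask_size=1.0, mask_down_size=0.5):
    """Return (integer polygon, float center) for a (106, 2) landmark array."""
    points = landmarks[LOWER_LIP_ORDER].astype(np.float32)
    center = points.mean(axis=0)

    # Scale every point away from the center.
    expansion_factor = 1 + mask_down_size
    expanded = (points - center) * expansion_factor + center

    # Push the top lip points outward along the center-to-point direction.
    # Assumption: the direction is normalized before scaling.
    toplip_extension = mask_size * 0.5
    for idx in TOPLIP_INDICES:
        direction = expanded[idx] - center
        norm = np.linalg.norm(direction)
        if norm > 0:
            expanded[idx] += direction / norm * toplip_extension

    # Move chin points further from the center along the y axis only.
    chin_extension = 2 * 0.2
    for idx in CHIN_INDICES:
        expanded[idx, 1] += (expanded[idx, 1] - center[1]) * chin_extension

    return expanded.astype(np.int32), center
```

Because `LOWER_LIP_ORDER` starts and ends with landmark `65`, the first and last polygon points coincide and the outline is closed.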
Convert to Integer Coordinates:
- The extended points are converted to integers using `astype(np.int32)`.
Calculate Bounding Box:
- The coordinates of a bounding box that contains the lower lip mask are calculated using `np.min` for the top-left corner and `np.max` for the bottom-right corner.
- A padding value is determined as the width of the bounding box multiplied by 0.1.
- The bounding box is then expanded by the padding while being kept within the size of the frame.
- If the resulting bounding box dimensions are invalid, they are forced to be at least `1`.
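A minimal sketch of the padded, clamped bounding box follows. The function name `padded_bounding_box` and the exact clamping order are assumptions made for illustration; only the 0.1 padding ratio, the use of `np.min`/`np.max`, and the minimum size of 1 come from the description above.

```python
import numpy as np


def padded_bounding_box(points, frame_shape, pad_ratio=0.1):
    """Return (min_x, min_y, max_x, max_y) clamped to the frame."""
    height, width = frame_shape[:2]
    min_x, min_y = (int(v) for v in np.min(points, axis=0))
    max_x, max_y = (int(v) for v in np.max(points, axis=0))

    # Padding is proportional to the box width.
    padding = int((max_x - min_x) * pad_ratio)

    # Expand, then clamp so the box never leaves the frame.
    min_x = min(max(0, min_x - padding), width - 1)
    min_y = min(max(0, min_y - padding), height - 1)
    # The outer max() guarantees a box at least 1 pixel wide and tall.
    max_x = max(min(width, max_x + padding), min_x + 1)
    max_y = max(min(height, max_y + padding), min_y + 1)
    return min_x, min_y, max_x, max_y
```

With this layout the box can be used directly as a slice, for example `frame[min_y:max_y, min_x:max_x]`.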
Create the Mask:
- A mask is created with the same size as the bounding box.
- The mask is filled with the color white using the `expanded_landmarks` points.
- The mask is then blurred using a Gaussian blur filter with a size of `15`.
Place the Mask and Extract Cutout:
- The mask is placed at the correct location in the full-sized mask using the bounding box.
- The `mouth_cutout` is created by cropping the area specified by the bounding box from the original `frame`.
Return:
- The function returns the mask, the cropped mouth (`mouth_cutout`), the bounding box for the mouth, and the lower lip polygon, which is made up of the extended landmarks.
3. draw_mouth_mask_visualization(frame: Frame, face: Face, mouth_mask_data: tuple)
- Purpose: This function visualizes the mouth mask, bounding box, and feathered mask. It is used for debugging, to see where the mask is applied and how large it is.
- Step-by-step breakdown:
Initialization:
- The function makes a copy of the `frame` (`vis_frame`) so that the original frame is not modified.
- It checks whether the `face` has landmarks and `mouth_mask_data` is valid. If not, it returns the original frame.
Extract Data:
- The mouth mask, cutout, bounding box, and lower lip polygon are extracted from `mouth_mask_data`.
Ensure Coordinates are Valid:
- The code makes sure that the bounding box is within the bounds of `vis_frame`.
Mask Region Adjustment:
- The portion of the mask that matches the size of the bounding box is extracted.
Draw Lower Lip Polygon:
- The lower lip polygon points are used to draw a green line around the mouth.
Calculate Feather Amount:
- The feather amount for the mask is calculated from the size of the mask, the ratio in `modules.globals.mask_feather_ratio`, and a maximum value of `30`.
- The kernel size is chosen to be odd, as Gaussian blurring requires.
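One plausible way to compute these two values is sketched below. The description does not give the exact formula, so the integer division by the ratio, the lower bound of 1, and the `2 * n + 1` kernel rule are assumptions; `feather_params` is a hypothetical name and `mask_feather_ratio` stands in for `modules.globals.mask_feather_ratio`.

```python
def feather_params(mask_height, mask_width, mask_feather_ratio=8, max_feather=30):
    """Return (feather_amount, kernel_size) for a mask of the given size."""
    # Scale the feather with the smaller mask dimension, capped at max_feather.
    feather_amount = min(
        max_feather,
        mask_height // mask_feather_ratio,
        mask_width // mask_feather_ratio,
    )
    # Assumption: never let the feather drop below 1 pixel.
    feather_amount = max(1, feather_amount)

    # 2 * n + 1 is always odd, which Gaussian blur kernels require.
    kernel_size = 2 * feather_amount + 1
    return feather_amount, kernel_size
```

Under these assumptions a larger `mask_feather_ratio` yields a smaller feather, and very large masks are capped at a feather of 30.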
Apply Feathering and Color Visualization:
- The mask region is blurred with a Gaussian blur using the kernel size and feather amount determined previously.
- The blurred mask is then converted to an image with values between 0 and 255.
Text Labels:
- Text labels are added above and below the mask to indicate which part of the mask is being visualized.
Return:
- The modified `vis_frame` is returned with the mask visualization drawn on it.
4. apply_mouth_area(frame: np.ndarray, mouth_cutout: np.ndarray, mouth_box: tuple, face_mask: np.ndarray, mouth_polygon: np.ndarray)
- Purpose: This function takes the created mouth mask and applies it to the frame with color correction. It blends the swapped mouth region into the target frame using a feathered mask to hide edges.
- Step-by-step breakdown:
Initialization:
- The function receives the original `frame`, the cropped `mouth_cutout` image, the bounding box of the mouth mask (`mouth_box`), the full-sized `face_mask`, and the polygon that outlines the mouth (`mouth_polygon`).
Box Dimensions and Check:
- The bounding box values are extracted using tuple unpacking.
- A check verifies that the `mouth_cutout`, the bounding box, and the face mask are all valid and non-null. If any of them is null, the function returns the original `frame` without modification.
Resizing the Mouth Cutout:
- The `mouth_cutout` is resized to match the dimensions of the bounding box using `cv2.resize`.
- The region of interest (`roi`) where the mouth should go is selected from the original `frame`.
- The code then checks that the `roi` and the resized mouth cutout have the same shape; if they do not, the cutout is resized again to match the `roi`.
Color Correction:
- The colors of the `resized_mouth_cutout` are adjusted to match the colors of the original `roi` using the `apply_color_transfer` method.
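This page does not describe how `apply_color_transfer` works internally. As a rough illustration of what a color transfer step can look like, the sketch below matches the per-channel mean and standard deviation of one image to another. This is a swapped-in, generic statistics-matching technique and not necessarily what the project does; `match_color_statistics` is a hypothetical name, and implementations of this idea often convert to a perceptual color space first, which is omitted here.

```python
import numpy as np


def match_color_statistics(source, target):
    """Shift and scale each channel of `source` to match `target`.

    Both inputs are HxWx3 uint8 images; they may differ in size.
    """
    src = source.astype(np.float32)
    tgt = target.astype(np.float32)

    src_mean, src_std = src.mean(axis=(0, 1)), src.std(axis=(0, 1))
    tgt_mean, tgt_std = tgt.mean(axis=(0, 1)), tgt.std(axis=(0, 1))

    # Guard against division by zero for flat (single-color) channels.
    scale = tgt_std / np.maximum(src_std, 1e-6)

    result = (src - src_mean) * scale + tgt_mean
    return np.clip(result, 0, 255).astype(np.uint8)
```

In the context of this function, `source` would play the role of the resized mouth cutout and `target` the `roi` it is pasted into.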
Create Polygon Mask:
- A mask is created with the same height and width as the `roi`; this mask is used only to blend the mouth cutout with the `roi`.
- The `mouth_polygon` is shifted so that its coordinates are relative to the `roi`, and the polygon mask is then filled.
Apply Feathering:
- The edges of the polygon mask are blurred using a Gaussian blur. The size of the blur is controlled by the global variable `modules.globals.mask_feather_ratio`.
- The blurred mask is then normalized to the range `0` to `1`.
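To make the feathering step concrete, here is a NumPy-only sketch that blurs a binary mask with a separable Gaussian kernel and normalizes the result to the range 0 to 1. It is a stand-in for an OpenCV Gaussian blur, not the project's code: the names `gaussian_kernel_1d` and `feather_mask`, the choice of sigma, and the zero padding at the borders are assumptions, and the mask is assumed to be at least as large as the kernel in both dimensions.

```python
import numpy as np


def gaussian_kernel_1d(kernel_size, sigma):
    """Return a normalized 1D Gaussian kernel of odd length."""
    half = kernel_size // 2
    x = np.arange(-half, half + 1, dtype=np.float32)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()


def feather_mask(mask, feather_amount):
    """Blur a 2D mask and normalize it so its maximum is 1.0."""
    kernel_size = 2 * feather_amount + 1          # always odd
    sigma = max(feather_amount / 2.0, 0.5)        # illustrative choice
    kernel = gaussian_kernel_1d(kernel_size, sigma)

    blurred = mask.astype(np.float32)
    # Separable blur: convolve every row, then every column.
    blurred = np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, blurred
    )
    blurred = np.apply_along_axis(
        lambda col: np.convolve(col, kernel, mode="same"), 0, blurred
    )

    peak = blurred.max()
    return blurred / peak if peak > 0 else blurred
```

Pixels well inside the polygon stay at 1, pixels far outside stay at 0, and the band around the edge takes intermediate values, which is what softens the seam during blending.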
Combine Masks:
- The feathered mask is multiplied by the `face_mask` to get a final mask that covers the mouth and blends smoothly along the feathered edges.
- A new dimension is added to the combined mask so it can be used with multi-channel color images.
Blending:
- The `color_corrected_mouth` and the original `roi` are blended using the feathered mask.
Apply Face Mask:
- The `face_mask` is given 3 color channels to match `final_blend`.
- `final_blend` and the original `roi` are blended using the full face mask.
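The mask combination and the two blending passes described above amount to standard alpha blending. The sketch below shows one way to write them; it assumes the feathered mask is a float array in the range 0 to 1 and the face mask region is a `uint8` array in the range 0 to 255, and the function name `blend_mouth` is hypothetical.

```python
import numpy as np


def blend_mouth(roi, mouth, feathered_mask, face_mask_roi):
    """Blend `mouth` into `roi` (both HxWx3 uint8) and return uint8."""
    roi_f = roi.astype(np.float32)
    mouth_f = mouth.astype(np.float32)

    # Scale the face mask to 0..1 and combine it with the feathered mask.
    face_alpha = face_mask_roi.astype(np.float32) / 255.0
    combined = (feathered_mask * face_alpha)[:, :, np.newaxis]

    # First pass: feathered blend of the mouth over the region of interest.
    blended = mouth_f * combined + roi_f * (1.0 - combined)

    # Second pass: restrict the result to the face using a 3-channel mask.
    face_alpha_3 = np.repeat(face_alpha[:, :, np.newaxis], 3, axis=2)
    final_blend = blended * face_alpha_3 + roi_f * (1.0 - face_alpha_3)

    return final_blend.astype(np.uint8)
```

Where the face mask is 0 the original `roi` pixels pass through unchanged, and where both masks are fully on the mouth pixels replace them.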
Replace in the Frame:
- The blended mouth region is written back into the original frame.
Exception Handling:
- If an exception occurs, the function silently passes and returns the frame as is.
Return:
- The modified `frame` with the blended mouth is returned.
5. apply_mouth_area_with_landmarks(temp_frame, mouth_cutout, mouth_box, face_mask, target_face)
- Purpose: This function applies the mouth area to a frame, using landmarks to help create the mask when landmarks are available.
- Step-by-step breakdown:
Check if Landmarks are Valid:
- The landmarks are read from `target_face.landmark_2d_106`. If they are `None`, the `apply_mouth_area` function is called without landmarks and its result is returned.
Key Landmark Extraction:
- The coordinates of the nose tip (landmark `80`) and the center of the bottom lip (landmark `73`) are extracted as floating-point numbers from `target_face.landmark_2d_106`.
Mask Dimensions:
- A vector, `center_to_nose`, is calculated from the center of the bottom lip to the tip of the nose.
- `mask_height` is calculated as the length of `center_to_nose` multiplied by `modules.globals.mask_size` and a constant `0.3`.
- `mask_top` is determined by offsetting the nose tip in the direction of `center_to_nose` using a constant `0.2`. An offset from `modules.globals.mask_down_size` is also added to move the mouth mask down or up, depending on its sign.
- `mask_bottom` is determined by offsetting `mask_top` in the direction of `center_to_nose`.
- The mouth landmarks between landmarks `52` and `71` are extracted as floating-point numbers from `target_face.landmark_2d_106`.
- `mouth_width` is calculated from the width of these mouth landmarks.
- The mask width is calculated using the formula `mask_width = mouth_width * 0.4 * modules.globals.mask_size * 0.8`.
- `mask_direction`, a vector perpendicular to `center_to_nose` that extends left and right from the center of the mouth, is derived by swapping the X and Y components of `center_to_nose` and negating one of them.
- `mask_direction` is normalized by dividing it by its length.
- A `mask_polygon` is created with 4 points. The first two points are horizontally offset from `mask_top` using `mask_direction` and `mask_width`. The next two points are horizontally offset from `mask_bottom` using `mask_direction` and `mask_width`.
Apply Mouth Area with Landmarks:
- `apply_mouth_area` is then called with all the variables, including the `mouth_polygon`, and the result is returned.
Apply Mouth Area Without Landmarks:
- If the landmarks are `None`, `apply_mouth_area` is simply called with `None` for the `mouth_polygon`, and the result is returned.
Practical Considerations
- Performance: The mouth mask calculations and blending require additional processing which can impact the overall speed of the face swap process.
- Parameter Sensitivity: The effectiveness of the mouth masking relies on appropriate configuration of the global variables `modules.globals.mask_size`, `modules.globals.mask_down_size`, and `modules.globals.mask_feather_ratio`. These should be tuned for the use case.
- Edge Artifacts: Feathering helps mitigate hard edges, but artifacts may still be noticeable in some situations.
- Landmark Accuracy: The accuracy of the mouth mask depends heavily on the quality of the facial landmark detections.
Conclusion
The mouth masking system is an advanced technique for enhancing the quality and realism of face swaps. By creating, refining, and applying masks in multiple steps, it allows the mouth region to be swapped seamlessly. The breakdown above shows how each component works and which design choices make this core element effective.
InsightFace landmark_2d_106
Face Outline Mask
face_outline_indices = [1, 43, 48, 49, 104, 105, 17, 25, 26, 27, 28, 29, 30, 31, 32, 18, 19, 20, 21, 22, 23, 24, 0, 8,
7, 6, 5, 4, 3, 2, 16, 15, 14, 13, 12, 11, 10, 9, 1]
Mouth Mask
lower_lip_order = [65, 66, 62, 70, 69, 18, 19, 20, 21, 22, 23, 24, 0, 8, 7, 6, 5, 4, 3, 2, 65]
toplip_indices = [20, 0, 1, 2, 3, 4, 5]
chin_indices = [11, 12, 13, 14, 15, 16]