Mouth Masking: How Mouth Masking Works - iVideoGameBoss/iRoopDeepFaceCam GitHub Wiki

Deep Dive: How Mouth Masking Works - Advanced

This page provides a thorough explanation of the mouth masking system used in the face-swapping application. We'll dissect each function involved in creating, visualizing, and applying the mouth masks, detailing their inner workings and the logic behind their implementation.

Core Objective: Realistic Mouth Swapping

The goal of the mouth masking system is to improve the quality of face swaps by swapping the mouth from the source face onto the target face more seamlessly. This involves creating an accurate mask around the mouth and lower lip area, feathering the mask edges so they blend with the original mouth, and matching the colors to the original.

Function Breakdown: Step-by-Step

Let's break down the individual functions and their interactions:

1. `create_mouth_mask(face: Face, frame: Frame)`

Purpose: This function generates a basic mask around the mouth region from facial landmarks.
Step-by-step breakdown:
1. Initialization:
  - A black mask (mask) is created with the same height and width as the input frame.
  - mouth_cutout, which will hold the cropped mouth image, is initialized to None.
2. Landmark Check:
  - It checks whether the face object contains landmark_2d_106, a list of coordinates for landmarks on the face. If no landmarks exist, the function returns the black mask and None for both the cutout and the bounding box.
3. Key Landmark Extraction:
  - The coordinates of the nose tip (landmark 80) and the center of the bottom lip (landmark 73) are extracted from face.landmark_2d_106 as floating-point numbers.
4. Mask Dimensions:
  - A vector, center_to_nose, is calculated from the center of the bottom lip to the tip of the nose.
  - mask_height is the length of center_to_nose multiplied by modules.globals.mask_size and a constant 0.3.
  - mask_top is found by offsetting the nose tip along center_to_nose by a factor of 0.2. The value of modules.globals.mask_down_size is also added as an offset, which moves the mouth mask down or up depending on its sign.
  - mask_bottom is found by offsetting mask_top in the direction of center_to_nose.
  - The mouth landmarks between landmarks 52 and 71 are extracted from face.landmark_2d_106 as floating-point numbers.
  - mouth_width is calculated from the horizontal extent of these mouth landmarks.
  - The mask width is calculated as mask_width = mouth_width * 0.4 * modules.globals.mask_size * 0.8.
5. Mask Polygon Construction:
  - mask_direction is a vector perpendicular to center_to_nose, so it extends left and right from the center of the mouth. It is obtained by swapping the X and Y components of center_to_nose and negating one of them.
  - mask_direction is normalized by dividing it by its length.
  - A mask_polygon is created with 4 points. The first two points are offset sideways from mask_top using mask_direction and mask_width. The next two points are offset sideways from mask_bottom in the same way.
6. Mask Drawing and Mouth Cropping:
  - The mask_polygon is converted to a numpy.ndarray of integers and used to fill the mask with the color white.
  - A bounding box around the mask is calculated.
  - The mouth_cutout is created by cropping the area specified by the bounding box from the original frame.
7. Return:
  - The function returns the mask, the cropped mouth image (mouth_cutout), and a tuple containing the top-left and bottom-right corners of the mask's bounding box.
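
The geometry in steps 3 to 5 can be sketched with NumPy alone. This is a minimal illustration rather than the application's actual code: `mouth_mask_polygon` and its keyword arguments are hypothetical stand-ins for the `modules.globals` values, and two details are assumptions because the description leaves them open: `mask_bottom` is placed `mask_height` away from `mask_top`, and `mouth_width` is taken as the X extent of the mouth landmarks.

```python
import numpy as np


def mouth_mask_polygon(landmarks, mask_size=1.0, mask_down_size=0.0):
    """Build the 4-point mouth polygon from a (106, 2) landmark array.

    mask_size and mask_down_size stand in for modules.globals.mask_size
    and modules.globals.mask_down_size (hypothetical defaults).
    """
    nose_tip = landmarks[80].astype(np.float32)
    center_bottom = landmarks[73].astype(np.float32)

    # Vector from the bottom-lip center to the nose tip.
    center_to_nose = nose_tip - center_bottom
    length = np.linalg.norm(center_to_nose)
    if length == 0:
        return None  # degenerate landmarks, nothing to build

    mask_height = length * mask_size * 0.3
    mask_top = (nose_tip + center_to_nose * 0.2
                + np.array([0.0, mask_down_size], dtype=np.float32))
    # Assumption: mask_bottom lies mask_height away along center_to_nose.
    mask_bottom = mask_top + center_to_nose / length * mask_height

    # Assumption: mouth width is the X extent of landmarks 52..70.
    mouth_points = landmarks[52:71].astype(np.float32)
    mouth_width = np.max(mouth_points[:, 0]) - np.min(mouth_points[:, 0])
    mask_width = mouth_width * 0.4 * mask_size * 0.8

    # Perpendicular direction: swap X and Y, negate one, then normalize.
    mask_direction = np.array([-center_to_nose[1], center_to_nose[0]],
                              dtype=np.float32)
    mask_direction /= np.linalg.norm(mask_direction)

    # Corner order is chosen so the quadrilateral does not self-intersect.
    return np.array([
        mask_top + mask_direction * mask_width,
        mask_top - mask_direction * mask_width,
        mask_bottom - mask_direction * mask_width,
        mask_bottom + mask_direction * mask_width,
    ], dtype=np.float32)
```

The returned points are floating point. Converting them with `astype(np.int32)` gives the integer polygon that step 6 fills with white.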

2. `create_lower_mouth_mask(face: Face, frame: Frame)`

Purpose: This function creates a detailed mask for just the lower lip and surrounding area. It is used because the lower lip has much more movement than the upper lip.
Step-by-step breakdown:
1. Initialization:
  - A black mask (mask) is created with the same height and width as the input frame.
  - mouth_cutout, which will hold the cropped mouth image, is initialized to None.
2. Landmark Check:
  - It checks whether the face object contains the landmark_2d_106 landmarks. If not, the function returns the black mask, None for the cutout, the bounding box, and None for the lower lip polygon.
3. Lower Lip Landmark Extraction:
  - The landmarks that make up the lower lip are extracted from face.landmark_2d_106 in the order given by lower_lip_order. They are converted to floating-point values so that the later calculations are more accurate.
4. Calculate Landmark Center:
  - The center of the lower lip landmarks is calculated by using the np.mean function on the lower_lip_landmarks.
5. Expand the Landmarks:
  - An expansion_factor is calculated as 1 + modules.globals.mask_down_size. This value determines how much the mask is expanded.
  - The lower_lip_landmarks are then expanded around their center by subtracting the center, multiplying by expansion_factor, and adding the center back.
6. Top Lip Extension:
  - A list of toplip_indices is created that selects the landmarks of the top lip.
  - A toplip_extension is calculated by multiplying modules.globals.mask_size by a constant 0.5.
  - The code then loops through toplip_indices and extends each landmark: it calculates the direction from the center to the landmark and adds this direction, scaled by toplip_extension, to the original landmark position.
7. Chin Extension:
  - A list of chin_indices is created that selects the landmarks of the chin.
  - A chin_extension is set to the constant 2 * 0.2 (that is, 0.4).
  - The code then loops through chin_indices and extends the chin by offsetting each landmark's Y component according to its distance from the center's Y coordinate.
8. Convert to Integer Coordinates:
  - The extended points are converted to integers by using astype(np.int32).
9. Calculate Bounding Box:
  - The coordinates of a bounding box that can contain the lower lip mask are calculated by using np.min to get the top left corner, and np.max to get the bottom right corner.
  - A padding value is calculated as the width of the bounding box multiplied by 0.1.
  - The bounding box is then expanded by the padding and clamped so that it stays within the frame.
  - If the resulting bounding box dimensions are invalid, the width and height are forced to be at least 1.
10. Create the Mask:
  - A mask is created to be the same size as the bounding box.
  - The mask is filled using the expanded_landmarks points with the color white.
  - The mask is then blurred using a Gaussian blur with a kernel size of 15.
11. Place the Mask and Extract Cutout:
  - The mask is placed at the correct location in the full-sized mask using the bounding box.
  - The mouth_cutout is created by cropping the area specified by the bounding box from the original frame.
12. Return:
  - The function returns the mask, the cropped mouth image (mouth_cutout), the bounding box for the mouth, and the lower lip polygon, which is made up of the extended landmarks.
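
Steps 4, 5, and 9 come down to two small array operations: scaling points about their centroid, and computing a padded bounding box clamped to the frame. The sketch below shows both with NumPy. The helper names `expand_around_center` and `padded_bounding_box` are hypothetical and do not appear in the application, and the lip and chin extensions of steps 6 and 7 are left out.

```python
import numpy as np


def expand_around_center(points, expansion_factor):
    """Scale 2D points about their centroid (steps 4 and 5)."""
    points = np.asarray(points, dtype=np.float32)
    center = np.mean(points, axis=0)
    return (points - center) * expansion_factor + center


def padded_bounding_box(points, frame_shape, pad_ratio=0.1):
    """Return (min_x, min_y, max_x, max_y), padded and clamped (step 9)."""
    height, width = frame_shape[:2]
    pts = np.asarray(points).astype(np.int32)
    min_x, min_y = np.min(pts, axis=0)
    max_x, max_y = np.max(pts, axis=0)

    # Padding is a fraction of the box width, applied on all four sides.
    padding = int((max_x - min_x) * pad_ratio)
    min_x = max(0, int(min_x) - padding)
    min_y = max(0, int(min_y) - padding)
    max_x = min(width, int(max_x) + padding)
    max_y = min(height, int(max_y) + padding)

    # Guard against a degenerate box: keep width and height at least 1.
    max_x = max(max_x, min_x + 1)
    max_y = max(max_y, min_y + 1)
    return min_x, min_y, max_x, max_y
```

With these four values, the mask region and the cutout can both be taken with the same slice, for example `frame[min_y:max_y, min_x:max_x]`.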

3. `draw_mouth_mask_visualization(frame: Frame, face: Face, mouth_mask_data: tuple)`

Purpose: This function visualizes the mouth mask, bounding box, and feathered mask. It is used for debugging to see where the mask is being applied and how large it is.
Step-by-step breakdown:
1. Initialization:
  - The function makes a copy of the frame so we are not modifying the original frame.
  - It checks whether the face has landmarks and whether mouth_mask_data is valid. If not, it returns the original frame.
2. Extract Data:
  - The mouth mask, cutout, bounding box, and lower lip polygon are extracted from mouth_mask_data.
3. Ensure Coordinates are Valid:
  - The code makes sure that the bounding box is within the bounds of the vis_frame.
4. Mask Region Adjustment:
  - The portion of the mask that matches the size of the bounding box is extracted.
5. Draw Lower Lip Polygon:
  - The lower lip polygon points are used to draw a green line around the mouth.
6. Calculate Feather Amount:
  - The feather amount for the mask is calculated from the size of the mask and the ratio in modules.globals.mask_feather_ratio, and is capped at a maximum of 30.
  - The kernel size is derived from the feather amount so that it is always odd, as Gaussian blurring requires.
7. Apply Feathering and Color Visualization:
  - The mask region is blurred using a Gaussian blur with the kernel size and feather amount determined above.
  - The blurred mask is then scaled to the range 0 to 255.
8. Text Labels:
  - Text labels are added above and below the mask to indicate what part of the mask we are visualizing.
9. Return:
  - The modified vis_frame is returned with visualization of the mask.
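
Step 6 can be illustrated with a few lines of plain Python. The exact formula is an assumption: the description only says that the feather amount depends on the mask size and `mask_feather_ratio`, is capped at 30, and yields an odd kernel size. The function name `feather_kernel_size` is hypothetical.

```python
def feather_kernel_size(box_width, box_height, feather_ratio, max_feather=30):
    """Return (feather_amount, kernel_size) for a mask of the given size.

    Assumed formula: the smaller box side divided by the ratio,
    kept within the range 1..max_feather.
    """
    ratio = max(1, int(feather_ratio))
    feather_amount = max(
        1, min(max_feather, box_width // ratio, box_height // ratio)
    )
    # 2 * n + 1 is odd for every integer n, which suits a Gaussian kernel.
    kernel_size = 2 * feather_amount + 1
    return feather_amount, kernel_size
```

Under this assumption a larger ratio gives a smaller feather and therefore a sharper mask edge.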

4. `apply_mouth_area(frame: np.ndarray, mouth_cutout: np.ndarray, mouth_box: tuple, face_mask: np.ndarray, mouth_polygon: np.ndarray)`

Purpose: This function takes the created mouth mask and applies it to the frame with color correction. It blends the swapped mouth region into the target frame using a feathered mask to hide edges.
Step-by-step breakdown:
1. Initialization:
  - The function receives the original frame, the cropped mouth_cutout image, the bounding box of the mouth_mask, the full sized face_mask, and a polygon that represents the mouth_polygon.
2. Box Dimensions and Check:
  - The bounding box values are extracted using tuple unpacking.
  - The function checks that the mouth_cutout, the bounding box, and the face mask are all valid and not None. If any of them is None, it returns the original frame without modification.
3. Resizing the Mouth Cutout:
  - The mouth_cutout is resized to match the dimensions of the bounding box using cv2.resize.
  - The region of interest (roi) from the original frame where the mouth should be is selected.
  - The code then checks that the roi and the resized mouth cutout have the same shape. If they do not, the cutout is resized again to match the roi.
4. Color Correction:
  - The colors of the resized_mouth_cutout are adjusted to match the colors of the original roi using the apply_color_transfer method.
5. Create Polygon Mask:
  - A mask is created with the same height and width as the roi. This mask is used only to blend the mouth cutout with the roi.
  - The mouth_polygon is shifted so that its coordinates are relative to the roi, then the polygon mask is filled.
6. Apply Feathering:
  - The polygon mask edges are blurred by using a Gaussian Blur. The size of the blur is controlled by the global variable modules.globals.mask_feather_ratio.
  - The blurred mask is then normalized to be between 0 and 1.
7. Combine Masks:
  - The feathered mask is multiplied by the face_mask to get a final mask which masks the mouth, and blends smoothly using the feathered edges.
  - A new dimension is added to the combined mask so it can be used with color channel images.
8. Blending:
  - The color_corrected_mouth and original roi are blended using the feathered mask.
9. Apply Face Mask:
  - The face_mask is given 3 color channels, to match the final_blend.
  - The final_blend and original roi are blended using the full face mask.
10. Replace in the Frame:
  - The blended mouth region is put back into the original frame.
11. Exception Handling:
  - If an exception is raised, the function silently ignores it and returns the frame as is.
12. Return:
  - The modified frame with the blended mouth is returned.
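
The mask combination and blending in steps 7 and 8 amount to a per-pixel weighted average. The sketch below uses NumPy only. `combine_masks` and `blend_with_mask` are hypothetical helper names, and the sketch assumes that the feathered mask is already in the range 0 to 1 while the face mask is an 8-bit image in the range 0 to 255.

```python
import numpy as np


def combine_masks(feathered_mask, face_mask):
    """Multiply a feathered mask in [0, 1] by a face mask (step 7).

    Assumption: face_mask is uint8 in 0..255, so it is normalized first.
    """
    face = face_mask.astype(np.float32) / 255.0
    return feathered_mask.astype(np.float32) * face


def blend_with_mask(cutout, roi, mask):
    """Blend cutout over roi using a (H, W) mask in [0, 1] (step 8)."""
    # Add a channel axis so the mask broadcasts over the color channels.
    alpha = mask.astype(np.float32)[:, :, np.newaxis]
    blended = (cutout.astype(np.float32) * alpha
               + roi.astype(np.float32) * (1.0 - alpha))
    return np.clip(blended, 0, 255).astype(np.uint8)
```

Where the mask is 1 the result is the color-corrected cutout, where it is 0 the original roi is kept, and the feathered values in between give the soft edge.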

5. `apply_mouth_area_with_landmarks(temp_frame, mouth_cutout, mouth_box, face_mask, target_face)`

Purpose: This function is used to apply the mouth area to a frame, but it uses landmarks to help create the mask if the landmarks are available.
Step-by-step breakdown:
1. Check if Landmarks are Valid:
  - The landmarks are read from target_face.landmark_2d_106. If they are None, apply_mouth_area is called without landmarks and its result is returned.
2. Key Landmark Extraction:
  - The coordinates of the nose tip (landmark 80) and the center of the bottom lip (landmark 73) are extracted from target_face.landmark_2d_106 as floating-point numbers.
3. Mask Dimensions:
  - A vector, center_to_nose, is calculated from the center of the bottom lip to the tip of the nose.
  - mask_height is the length of center_to_nose multiplied by modules.globals.mask_size and a constant 0.3.
  - mask_top is found by offsetting the nose tip along center_to_nose by a factor of 0.2. The value of modules.globals.mask_down_size is also added as an offset, which moves the mouth mask down or up depending on its sign.
  - mask_bottom is found by offsetting mask_top in the direction of center_to_nose.
  - The mouth landmarks between landmarks 52 and 71 are extracted from target_face.landmark_2d_106 as floating-point numbers.
  - mouth_width is calculated from the horizontal extent of these mouth landmarks.
  - The mask width is calculated as mask_width = mouth_width * 0.4 * modules.globals.mask_size * 0.8.
  - mask_direction is a vector perpendicular to center_to_nose, so it extends left and right from the center of the mouth. It is obtained by swapping the X and Y components of center_to_nose and negating one of them.
  - mask_direction is normalized by dividing it by its length.
  - A mask_polygon is created with 4 points. The first two points are offset sideways from mask_top using mask_direction and mask_width. The next two points are offset sideways from mask_bottom in the same way.
4. Apply Mouth Area with Landmarks:
  - apply_mouth_area is then called with all the variables, passing the mask_polygon as the mouth_polygon, and the result is returned.
5. Apply Mouth Area Without Landmarks:
  - If the landmarks are None, apply_mouth_area is called with None for the mouth_polygon and the result is returned.
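
The control flow of this function is a simple fallback: build a polygon when landmarks exist, otherwise pass None. The sketch below shows only that pattern. `apply_with_optional_polygon`, `build_polygon`, and `apply_area` are hypothetical names, with the two callables standing in for the polygon construction of step 3 and for apply_mouth_area.

```python
def apply_with_optional_polygon(landmarks, build_polygon, apply_area):
    """Call apply_area with a polygon if landmarks exist, else with None.

    build_polygon: callable mapping landmarks to a polygon (stand-in).
    apply_area: callable taking the polygon or None (stand-in).
    """
    if landmarks is None:
        # No landmarks: fall back to blending without a polygon.
        return apply_area(None)
    return apply_area(build_polygon(landmarks))
```

Keeping the fallback in one place means the blending code only has to handle two cases, a polygon or None.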

Practical Considerations

Performance: The mouth mask calculations and blending require additional processing which can impact the overall speed of the face swap process.
Parameter Sensitivity: The effectiveness of the mouth masking relies on appropriate configuration of the global variables, such as modules.globals.mask_size, modules.globals.mask_down_size, and modules.globals.mask_feather_ratio. These should be configured based on the use case.
Edge Artifacts: Feathering helps mitigate hard edges, but artifacts may still be noticeable in some situations.
Landmark Accuracy: The accuracy of the mouth mask depends heavily on the quality of the facial landmark detections.

Conclusion

The mouth masking system in this application is an advanced technique for enhancing the quality and realism of face swaps. By creating, refining, and applying masks in multiple steps, it allows the mouth region to be swapped seamlessly. This breakdown illustrates how each component works and the considerations that go into this core element.

InsightFace landmark_2d_106

Face Outline Mask

    face_outline_indices = [1, 43, 48, 49, 104, 105, 17, 25, 26, 27, 28, 29, 30, 31, 32, 18, 19, 20, 21, 22, 23, 24, 0, 8,
                            7, 6, 5, 4, 3, 2, 16, 15, 14, 13, 12, 11, 10, 9, 1]

Mouth Mask

    lower_lip_order = [65, 66, 62, 70, 69, 18, 19, 20, 21, 22, 23, 24, 0, 8, 7, 6, 5, 4, 3, 2, 65]
    toplip_indices = [20, 0, 1, 2, 3, 4, 5]
    chin_indices = [11, 12, 13, 14, 15, 16]
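
As a small illustration of how such an index list can be used, the sketch below gathers the lower lip points from a (106, 2) landmark array in polygon order. `ordered_lower_lip` is a hypothetical helper name. Because the list starts and ends with index 65, the resulting polygon is closed.

```python
import numpy as np

lower_lip_order = [65, 66, 62, 70, 69, 18, 19, 20, 21, 22, 23, 24,
                   0, 8, 7, 6, 5, 4, 3, 2, 65]


def ordered_lower_lip(landmarks):
    """Return the lower lip points of a (106, 2) array in polygon order."""
    landmarks = np.asarray(landmarks, dtype=np.float32)
    if landmarks.shape != (106, 2):
        raise ValueError("expected a (106, 2) landmark array")
    # Indexing with a list of integers picks the rows in the given order.
    return landmarks[lower_lip_order]
```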

[Image: 2d106markup-jpg (InsightFace landmark_2d_106 markup)]