Week 06 ‐ Comparison of Face Detection Models - AkinduID/EyeRiz GitHub Wiki
- Compare and evaluate several face detection models
- Haar Cascades
- MediaPipe BlazeFace
- MediaPipe Holistic
- MTCNN
To assess the speed of several face detection methods, I ran four models on a stock video of a man speaking. The system captured 11 frames and measured the time each model took to detect the man's face in each frame. These times were plotted against the frame number, and the average time per model was calculated. Finally, a comparative graph of the average times was generated. The analysis focused on speed; accuracy was not explicitly tested, as the goal was to determine which model could detect faces the fastest for real-time applications.
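The timing procedure described above can be sketched roughly as follows. This is a minimal illustration, not the code used for the measurements: the names `time_detector` and `compare_detectors` are hypothetical, and each detector is assumed to be a plain callable that takes one frame.

```python
import time
from statistics import mean


def time_detector(detect, frames):
    """Return the per-frame detection times (in seconds) for one detector."""
    times = []
    for frame in frames:
        start = time.perf_counter()
        detect(frame)  # the detection result is discarded; only speed is measured
        times.append(time.perf_counter() - start)
    return times


def compare_detectors(detectors, frames):
    """Map each detector name to its average per-frame time in seconds."""
    return {name: mean(time_detector(fn, frames)) for name, fn in detectors.items()}
```

The per-frame lists from `time_detector` would feed the time-versus-frame plot, and the dictionary from `compare_detectors` would feed the comparative graph of averages.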
Haar Cascades is a classical machine-learning face detection method in which a cascade function is trained on positive images (containing faces) and negative images (without faces). Though it is known for its simplicity, it often falls short of modern approaches in both accuracy and speed.
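As a rough illustration, a Haar cascade detector can be set up with OpenCV's bundled frontal-face cascade. This sketch assumes the `opencv-python` package; the `scaleFactor` and `minNeighbors` values are common starting points rather than the settings used in this project, and `largest_face` is a hypothetical helper for choosing one face to follow.

```python
def make_haar_detector():
    """Build a detect(frame_bgr) function backed by OpenCV's frontal-face cascade."""
    import cv2  # imported lazily so largest_face() works without OpenCV installed

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )

    def detect(frame_bgr):
        # Haar cascades operate on grayscale images
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        return [tuple(int(v) for v in box) for box in boxes]

    return detect


def largest_face(boxes):
    """Pick the (x, y, w, h) box with the largest area, or None if there are none."""
    return max(boxes, key=lambda b: b[2] * b[3], default=None)
```

Loading the cascade once in `make_haar_detector` keeps the file-loading cost out of any per-frame timing.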
BlazeFace is an SSD-style (single-shot detector) model optimized for real-time face detection, particularly on mobile devices. It is lightweight and efficient, allowing fast inference without sacrificing accuracy. The model was trained primarily on selfie images, which makes it well suited to detecting faces in close-up views, and it can process video in real time even on low-power devices.
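A minimal sketch of running BlazeFace through MediaPipe's legacy `mp.solutions.face_detection` API is shown below. It assumes the installed `mediapipe` version still ships that API; the 0.5 confidence threshold in the usage comment is an illustrative value, and `relative_to_pixel_box` is a hypothetical helper, since MediaPipe reports boxes in normalized coordinates.

```python
def relative_to_pixel_box(xmin, ymin, width, height, frame_w, frame_h):
    """Convert a normalized [0, 1] box to integer pixel (x, y, w, h), kept inside the frame."""
    x = max(0, int(xmin * frame_w))
    y = max(0, int(ymin * frame_h))
    w = min(int(width * frame_w), frame_w - x)
    h = min(int(height * frame_h), frame_h - y)
    return x, y, w, h


def detect_faces_blazeface(detector, frame_rgb):
    """Run a MediaPipe FaceDetection instance on an RGB frame and return pixel boxes."""
    frame_h, frame_w = frame_rgb.shape[:2]
    results = detector.process(frame_rgb)
    boxes = []
    for detection in results.detections or []:  # detections is None when no face is found
        rb = detection.location_data.relative_bounding_box
        boxes.append(
            relative_to_pixel_box(rb.xmin, rb.ymin, rb.width, rb.height, frame_w, frame_h)
        )
    return boxes


# Usage (requires mediapipe):
#   import mediapipe as mp
#   with mp.solutions.face_detection.FaceDetection(
#       model_selection=0, min_detection_confidence=0.5
#   ) as detector:
#       boxes = detect_faces_blazeface(detector, frame_rgb)
```

Here `model_selection=0` selects the short-range model, which matches the close-up use case described above.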
The MediaPipe Holistic model is a comprehensive solution that integrates face, hand, and pose tracking into a single pipeline. It provides high accuracy in detecting facial landmarks and is ideal for applications that need to track multiple body parts. For this project's purposes, I used only its face output.
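Holistic returns face landmarks rather than a bounding box, so one way to use only its face output is to take the extent of those landmarks. The sketch below assumes the legacy `mp.solutions.holistic` API; `landmarks_to_box` is a hypothetical helper and may differ from how this project extracts the face.

```python
def landmarks_to_box(points):
    """Bounding box (xmin, ymin, xmax, ymax) around normalized (x, y) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


def detect_face_holistic(holistic, frame_rgb):
    """Return a normalized face box from a MediaPipe Holistic instance, or None."""
    results = holistic.process(frame_rgb)
    if results.face_landmarks is None:  # no face found in this frame
        return None
    return landmarks_to_box([(lm.x, lm.y) for lm in results.face_landmarks.landmark])


# Usage (requires mediapipe):
#   import mediapipe as mp
#   with mp.solutions.holistic.Holistic() as holistic:
#       box = detect_face_holistic(holistic, frame_rgb)
```

Note that the pipeline still runs its pose and hand stages internally even when only the face landmarks are read.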
The Multi-task Cascaded Convolutional Neural Network (MTCNN) is a widely used face detection algorithm that performs well under a variety of lighting conditions and face angles. It uses a three-stage CNN pipeline (P-Net, R-Net, and O-Net) to progressively refine its detections, which makes it highly accurate but slower than lightweight models such as BlazeFace.
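For comparison, a minimal MTCNN sketch using the `mtcnn` Python package is shown below. The package choice and the 0.9 confidence threshold are assumptions for illustration; `filter_confident` is a hypothetical helper.

```python
def filter_confident(detections, threshold=0.9):
    """Keep the (x, y, w, h) boxes of detections whose confidence meets the threshold."""
    return [tuple(d["box"]) for d in detections if d["confidence"] >= threshold]


def detect_faces_mtcnn(detector, frame_rgb, threshold=0.9):
    """Run an MTCNN detector on an RGB frame and return the confident boxes."""
    # detect_faces returns a list of dicts with 'box', 'confidence', and 'keypoints'
    return filter_confident(detector.detect_faces(frame_rgb), threshold)


# Usage (requires the mtcnn package):
#   from mtcnn import MTCNN
#   detector = MTCNN()
#   boxes = detect_faces_mtcnn(detector, frame_rgb)
```

Constructing `MTCNN()` once and reusing it avoids counting model loading in the per-frame time.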
- BlazeFace is the fastest model, likely due to its SSD architecture and design specifically for short-range face detection. This makes it ideal for real-time face detection, though it is limited to close-range scenarios.
- Holistic is the second fastest and, unlike BlazeFace, is not restricted to short ranges.
- Haar Cascades offer moderate speed.
- MTCNN has the slowest processing speed.
- In terms of accuracy, which was not formally measured in this test, MTCNN performs the best. BlazeFace and Holistic have a similar, moderate level of accuracy, while Haar Cascades have the lowest accuracy.
- Replace the cascade classifier method with BlazeFace.
- Add a gesture control feature and virtual camera integration.
- Optimize servo movements.