Real-time deep hair matting on mobile devices
This paper addresses the problem of live hair color augmentation. The authors claim that the techniques they present produce detailed hair mattes at 30 fps, even on an iPad.
The contribution of this paper is two-fold:
- The model is based on the MobileNets architecture instead of VGG16, which was the mainstream architecture in this line of research. This gives much faster inference and much lower memory use. While VGG16 occupies approximately 500MB of RAM and takes about 100 ms for a single forward pass even on a strong GPU, the proposed architecture takes only 15MB of memory and about 60 ms per forward pass when optimized.
- The loss function is based on the binary cross-entropy loss between the predicted and ground-truth masks, with an additional term that promotes perceptually accurate matting.
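As background for the first contribution: one well-known reason MobileNets is lighter than VGG-style networks is that it replaces standard convolutions with depthwise separable ones. This is a property of MobileNets itself rather than something stated in this summary. The sketch below compares weight counts for a single layer; the layer size (256 input and output channels, 3x3 kernel) and the function names are hypothetical, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k=3):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k=3):
    """Weights in a depthwise separable convolution (biases ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer size, chosen only for illustration.
c_in, c_out = 256, 256
std = standard_conv_params(c_in, c_out)
sep = separable_conv_params(c_in, c_out)
print(std, sep, round(std / sep, 1))  # 589824 67840 8.7
```

For this layer the separable version needs roughly 8.7 times fewer weights. The total savings for a whole network depend on its full layer configuration.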
This paper presents two networks, both based on MobileNets. The first, HairSegNet, is simply MobileNets modified into a fully convolutional network (FCN). The network architecture is shown below.
![HairSegNet](/Users/SPark9625/Library/Application Support/typora-user-images/image-20181110010849969.png)
The second, HairMatteNet, uses the additional loss term mentioned above and makes further changes to the first architecture: skip connections are added, and image and mask gradients are incorporated into the loss. Its structure is shown below.
![HairMatteNet](/Users/SPark9625/Library/Application Support/typora-user-images/image-20181110010912114.png)
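To make the idea of combining image and mask gradients in the loss more concrete, here is a rough sketch in NumPy (for illustration only, not the repository's PyTorch code). It assumes the extra term penalizes mask edges that do not line up with image edges, weighted by the mask gradient magnitude. The exact formula, the finite-difference gradients from `np.gradient`, the function names, and the `weight=0.5` default are all assumptions rather than details taken from this summary.

```python
import numpy as np


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between a predicted and a ground-truth mask."""
    p = np.clip(pred, eps, 1 - eps)
    return float(np.mean(-(target * np.log(p) + (1 - target) * np.log(1 - p))))


def unit_gradients(x, eps=1e-6):
    """Finite-difference gradients scaled to (near) unit length, plus magnitude."""
    gy, gx = np.gradient(x.astype(np.float64))
    mag = np.sqrt(gx ** 2 + gy ** 2)
    return gx / (mag + eps), gy / (mag + eps), mag


def gradient_consistency(image_gray, mask, eps=1e-6):
    """Near 0 when mask edges are parallel to image edges, near 1 when they are not."""
    ix, iy, _ = unit_gradients(image_gray)
    mx, my, m_mag = unit_gradients(mask)
    misalignment = 1.0 - (ix * mx + iy * my) ** 2
    # Average the misalignment, weighted by how strong the mask edge is.
    return float(np.sum(m_mag * misalignment) / (np.sum(m_mag) + eps))


def matting_loss(pred_mask, true_mask, image_gray, weight=0.5):
    """BCE plus a weighted gradient consistency term (weight is an assumed value)."""
    return bce(pred_mask, true_mask) + weight * gradient_consistency(image_gray, pred_mask)


# Toy example: an image with one vertical edge and two candidate masks.
image = np.zeros((8, 8))
image[:, 4:] = 1.0
aligned = image.copy()      # mask edge coincides with the image edge
rotated = np.zeros((8, 8))
rotated[4:, :] = 1.0        # mask edge is perpendicular to the image edge
print(gradient_consistency(image, aligned), gradient_consistency(image, rotated))
```

In the toy example the aligned mask scores close to 0 and the rotated mask close to 1, so the extra term pushes predicted mask boundaries toward edges that are actually present in the image.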
Experimental Results
Quantitative Results
![Quantitative Results](/Users/SPark9625/Library/Application Support/typora-user-images/image-20181110011153478.png)
Qualitative Results
![Qualitative Results](/Users/SPark9625/Library/Application Support/typora-user-images/image-20181110011238764.png)
The naive HairSegNet performs reasonably but clearly misses the fine details. Adding a guided filter recovers some of that detail, but still leaves a halo around the edges. The final architecture, HairMatteNet, appears to preserve the most detail.
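For readers unfamiliar with the guided filter mentioned above, the following is a minimal NumPy sketch of the standard formulation, which fits a local linear model between a guide image and the input. It assumes a grayscale guide and a single-channel mask as input. The helper names `box_mean` and `guided_filter`, and the radius `r` and `eps` defaults, are illustrative choices, not settings reported in this summary.

```python
import numpy as np


def box_mean(x, r):
    """Mean over a (2r+1) x (2r+1) window, using edge-replicated padding."""
    k = 2 * r + 1
    padded = np.pad(x, r, mode="edge")
    # Integral image with a leading row and column of zeros.
    s = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    s[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    h, w = x.shape
    total = s[k:k + h, k:k + w] - s[:h, k:k + w] - s[k:k + h, :w] + s[:h, :w]
    return total / (k * k)


def guided_filter(guide, src, r=2, eps=1e-4):
    """Smooth `src` while following the edges of `guide` (both 2-D float arrays)."""
    mean_i = box_mean(guide, r)
    mean_p = box_mean(src, r)
    var_i = box_mean(guide * guide, r) - mean_i ** 2
    cov_ip = box_mean(guide * src, r) - mean_i * mean_p
    # Per-window linear model: src is approximated by a * guide + b.
    a = cov_ip / (var_i + eps)
    b = mean_p - a * mean_i
    # Average the coefficients of all windows that cover each pixel.
    return box_mean(a, r) * guide + box_mean(b, r)


# Toy usage: refine a coarse mask using an image with a sharp vertical edge.
guide = np.zeros((16, 16))
guide[:, 8:] = 1.0
coarse_mask = box_mean(guide, 3)  # blurred stand-in for a coarse network output
refined = guided_filter(guide, coarse_mask)
print(refined.shape)
```

Because the output is locally a linear function of the guide, it inherits the guide's edges. That helps recover detail, but it is also consistent with the halo noted above when image edges near the hair boundary do not belong to the hair.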