Arxiv 2017
[arxiv 1702.02447] Region Ensemble Network: Improving Convolutional Network for Hand Pose Estimation [PDF] [code] [notes]
Hengkai Guo, Guijin Wang, Xinghao Chen, Cairong Zhang, Fei Qiao, Huazhong Yang
read 29/05/2017
Objective
Directly regress the 3D coordinates of the hand joints from a single depth image using a tree-structured Region Ensemble Network (REN)
Synthesis
Compares against the state-of-the-art methods Hands Deep in Deep Learning, feedLoop, and Hand pose estimation from local surface normals
Output: a 3*J vector holding the 3D world coordinates of the J hand joints
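As an illustration of the output format, the flat 3*J vector can be reshaped into one row per joint. This is a minimal sketch: the helper name `to_joints` is hypothetical, and the per-joint ordering (x, y, z stored contiguously) is an assumption, not something stated in these notes.

```python
import numpy as np

def to_joints(flat_pose):
    """Reshape a flat 3*J pose vector into a (J, 3) array of joint coordinates.

    Assumes the vector is laid out as x1, y1, z1, x2, y2, z2, ...
    """
    flat_pose = np.asarray(flat_pose, dtype=np.float64)
    if flat_pose.size % 3 != 0:
        raise ValueError("pose vector length must be a multiple of 3")
    return flat_pose.reshape(-1, 3)

# Toy vector for J = 2 joints
flat = np.arange(6, dtype=np.float64)
joints = to_joints(flat)
print(joints.shape)  # (2, 3)
```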
Pipeline
Preprocessing
- extract a depth cube around the hand
- normalize depth to [-1, 1]
- uniformly divide the feature maps of the convnet into an n x n grid (n=2 in practice) and feed each grid cell into its own FC layers (branches)
- concatenate the features from the last FC layer of every branch
- apply a regression layer to the concatenated features to predict the outputs
- end-to-end training
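The pipeline steps above can be sketched in NumPy as follows. This is a rough illustration, not the authors' Caffe implementation. All function names, parameters, and sizes are hypothetical; the convnet is replaced by random feature maps; each branch is reduced to a single FC layer with ReLU and no bias; and the crop uses a fixed pixel window rather than a true metric cube.

```python
import numpy as np

def normalize_depth_cube(depth, center_uv, center_z, half_px, half_depth):
    """Crop a square patch around the hand and map its depth to [-1, 1].

    Assumes the patch lies fully inside the image and that 0 marks missing depth.
    """
    u, v = center_uv
    patch = depth[v - half_px:v + half_px, u - half_px:u + half_px].astype(np.float64)
    patch[patch == 0] = center_z + half_depth  # missing depth goes to the far face
    patch = np.clip(patch, center_z - half_depth, center_z + half_depth)
    return (patch - center_z) / half_depth

def split_into_regions(features, n=2):
    """Split a (C, H, W) feature map into n*n equally sized spatial regions."""
    c, h, w = features.shape
    if h % n != 0 or w % n != 0:
        raise ValueError("H and W must be divisible by n")
    rh, rw = h // n, w // n
    return [features[:, i * rh:(i + 1) * rh, j * rw:(j + 1) * rw]
            for i in range(n) for j in range(n)]

def region_ensemble_forward(features, branch_weights, out_weights, n=2):
    """Run one FC branch per region, concatenate, then regress the pose."""
    branch_outputs = []
    for region, w in zip(split_into_regions(features, n), branch_weights):
        hidden = np.maximum(region.reshape(-1) @ w, 0.0)  # FC + ReLU
        branch_outputs.append(hidden)
    fused = np.concatenate(branch_outputs)  # ensemble by concatenation
    return fused @ out_weights              # linear regression layer

rng = np.random.default_rng(0)

# Preprocessing on a fake depth image (values in mm)
depth = rng.uniform(400.0, 1200.0, size=(64, 64))
cube = normalize_depth_cube(depth, center_uv=(32, 32), center_z=800.0,
                            half_px=16, half_depth=150.0)

# Stand-in for convnet feature maps; J is an arbitrary placeholder
C, H, W, n, hidden_dim, J = 4, 12, 12, 2, 8, 14
features = rng.standard_normal((C, H, W))
region_dim = C * (H // n) * (W // n)
branch_weights = [0.01 * rng.standard_normal((region_dim, hidden_dim))
                  for _ in range(n * n)]
out_weights = 0.01 * rng.standard_normal((n * n * hidden_dim, 3 * J))

pose = region_ensemble_forward(features, branch_weights, out_weights, n)
print(cube.shape, pose.shape)  # (32, 32) (42,)
```

Because every step is a differentiable array operation, the same structure written in a deep learning framework can be trained end-to-end with a single regression loss on the 3*J output.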
Results
Evaluated on the NYU and ICVL datasets; outperforms Hands Deep in Deep Learning, feedLoop, and Hand pose estimation from local surface normals
Claims state of the art
Code
Implemented in Caffe