1701.02468 - hassony2/inria-research-wiki GitHub Wiki
CVPR 2017
[arxiv 1701.02468] Unite the People: Closing the Loop Between 3D and 2D Human Representations arXiv:1701.02468v2 [PDF] [project page] [code to be published]
Christoph Lassner, Javier Romero, Martin Kiefel, Federica Bogo, Michael J. Black, Peter V. Gehler
Objective
Scalable method to produce 3D body model fits for 2D images
Improve body pose and shape optimization
Synthesis
Optimization includes a term that makes the body silhouettes match in 2D and 3D (it minimizes the distance between the silhouette projected from the 3D model and the silhouette in the original image, alongside the other terms)
Creates dataset UPI-3D with 3D pose correspondence
Pipeline
Mechanical Turk segmentation annotation (foreground and 6 body parts).
Use SMPLify fitting with an additional term that accounts for silhouette matching
Minimizes the sum of distances from each point on one silhouette to the closest point on the other silhouette. The distance is bidirectional: the term from the projected model to the silhouette uses an L1 distance (because of noise), while the reverse term uses an L2 distance. (This term is added to the previous SMPLify terms.)
Use human annotators to judge the quality of the optimization's proposed fit (binary accept or reject)
==> Creates dataset UPI-3D of 5,569 training images and 1,203 testing images
The segmentation dataset is obtained by projecting the segmentation defined on the 3D model onto the original images
Training is performed using a ResNet101 network with a pixel-wise cross-entropy loss
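As an illustration of the pixel-wise cross-entropy loss mentioned above, here is a minimal pure-Python sketch. It is not the authors' training code: the function name `pixelwise_cross_entropy`, the nested-list layout, and the toy inputs are assumptions made for this example only.

```python
import math

def pixelwise_cross_entropy(logits, labels):
    """Mean cross-entropy over all pixels (hypothetical helper).

    logits: H x W x C nested lists of raw class scores.
    labels: H x W nested lists of integer class ids.
    """
    total, count = 0.0, 0
    for logit_row, label_row in zip(logits, labels):
        for scores, target in zip(logit_row, label_row):
            # Log-sum-exp with max subtraction for numerical stability.
            m = max(scores)
            log_z = m + math.log(sum(math.exp(s - m) for s in scores))
            # Negative log-probability of the target class at this pixel.
            total += log_z - scores[target]
            count += 1
    return total / count

# Toy 1 x 2 "image" with 3 classes (made-up values).
logits = [[[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]
labels = [[0, 2]]
loss = pixelwise_cross_entropy(logits, labels)
```

Each pixel is treated as an independent classification problem over the body-part classes, and the per-pixel losses are averaged.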
Direct 3D pose and shape prediction
Based on 91 landmarks, directly estimates 3D pose (from 2D to 3D)
Two separately trained random forests: one predicts rotation vectors for the joints, the other predicts the depth
Definitions
silhouette: all pixels belonging to the body projection
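With silhouettes treated as sets of pixels, the bidirectional silhouette term described in the Pipeline section can be sketched as below. This is only an illustration under assumptions: it follows these notes in using plain (L1) distances from the projected model to the image silhouette and squared (L2) distances in the other direction, it uses a brute-force nearest-pixel search, and the names `nearest_distance` and `silhouette_energy` are hypothetical, not taken from the paper's code.

```python
import math

def nearest_distance(point, silhouette):
    """Euclidean distance from `point` to the closest pixel in `silhouette`."""
    px, py = point
    return min(math.hypot(px - qx, py - qy) for qx, qy in silhouette)

def silhouette_energy(model_sil, image_sil):
    """Bidirectional silhouette term (illustrative sketch only).

    Assumption, following the notes: model -> image uses plain distances,
    image -> model uses squared distances.
    """
    model_to_image = sum(nearest_distance(p, image_sil) for p in model_sil)
    image_to_model = sum(nearest_distance(p, model_sil) ** 2 for p in image_sil)
    return model_to_image + image_to_model

# Toy silhouettes as sets of (x, y) pixel coordinates (made-up values).
image_sil = {(0, 0), (1, 0), (2, 0)}
model_sil = {(0, 0), (1, 0), (4, 0)}

energy = silhouette_energy(model_sil, image_sil)
```

The energy is zero when both silhouettes coincide and grows as pixels of either silhouette move away from the other. A real implementation would more likely precompute a distance transform of each silhouette than search all pixel pairs.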