Random Forests for Real Time 3D Face Analysis
Frequently Asked Questions (14)
Q2. What have the authors stated for future works in "Random forests for real time 3d face analysis" ?
In their future work, the authors intend to train on full upper body models instead of isolated faces in order to better handle hair and other non-face body parts.
Q3. What are the key issues in the regressor?
Because the accuracy of a regressor depends on the amount of annotated training data, the acquisition and labeling of a training set are key issues.
Q4. How did Paysan et al. (2009) use the generic face template?
The 3D morphable model of Paysan et al. (2009) was used, together with graph-based non-rigid ICP (Li et al. 2009), to adapt the generic face template to the point cloud.
Q5. How can the authors make the patches more scale-invariant?
In order for the forest to be more scale-invariant, the size of the patches can be made dependent on the depth (e.g., at the patch center); however, in this work the authors assume the faces to be within a relatively narrow range of distances from the sensor.
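As an illustrative sketch only (the constants below are assumptions, not values from the paper), depth-dependent patch sizing follows from the pinhole camera model: a patch of fixed metric size projects to fewer pixels the farther it is from the sensor.

```python
def patch_size_px(depth_mm, base_size_mm=100.0, focal_px=575.0):
    """Scale a fixed metric patch size (in mm) to pixels via the
    pinhole model: size_px = focal_px * size_mm / depth_mm.
    base_size_mm and focal_px are illustrative values, not the paper's."""
    return base_size_mm * focal_px / depth_mm

# A patch centered at 1 m spans half as many pixels as one at 0.5 m.
near = patch_size_px(500.0)    # 115.0 px
far = patch_size_px(1000.0)    # 57.5 px
```

Reading the depth at the patch center, as the answer suggests, makes the pixel size a per-patch quantity computed at extraction time.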
Q6. How long does it take to fit the proposed PCA shape model?
In Nair and Cavallaro (2009), fitting the proposed PCA shape model containing only the upper facial features, i.e., without the mouth, takes on average 2 minutes per face.
Q7. What causes the errors around the mouth regions?
Most errors occur around the mouth regions due to the large deformations and the noisy reconstruction of the teeth and oral cavity.
Q8. How can patches be detected when a particular point is occluded?
Since all patches can vote for the localization of a specific point of the object, it can be detected even when that particular point is occluded.
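A minimal sketch of this voting idea, assuming a simple weighted-mean aggregation (the paper's actual aggregation, a mean-shift-style clustering of votes, differs):

```python
import numpy as np

def hough_estimate(votes, weights=None):
    """Aggregate per-patch 3D votes for a target point (e.g., the nose tip).
    Because every visible patch casts a vote, the point can be localized
    even when it is itself occluded."""
    votes = np.asarray(votes, dtype=float)
    if weights is None:
        weights = np.ones(len(votes))
    w = np.asarray(weights, dtype=float)
    return (votes * w[:, None]).sum(axis=0) / w.sum()

# Votes cast from visible patches (e.g., cheeks, forehead) still localize
# a hidden point; here the mean vote is (11, 1, 500).
votes = [[10.0, 0.0, 500.0], [12.0, 2.0, 498.0], [11.0, 1.0, 502.0]]
estimate = hough_estimate(votes)
```

Each training patch stores an offset to the target point; at test time, patches add their predicted offsets to their own locations, producing the vote list aggregated above.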
Q9. What does the head pose estimation system assume?
Their head pose estimation system does not assume any initialization phase nor person-specific training, and works on a frame-by-frame basis.
Q10. Why do the authors bin the space of yaw and pitch angles and cap the number of images per bin?
Because the database does not contain a uniform distribution of head poses, but has a sharp peak around the frontal face configuration, as can be noted from Fig. 21, the authors bin the space of yaw and pitch angles and cap the number of images for each bin.
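The bin-and-cap balancing could be sketched as follows; the bin width and cap are illustrative assumptions, not the paper's settings.

```python
import random
from collections import defaultdict

def balance_by_pose(samples, bin_deg=15, cap=200, seed=0):
    """Bin training images by (yaw, pitch) and keep at most `cap` images
    per bin, so the sharp frontal-pose peak does not dominate training.
    Each sample is (image_id, yaw_deg, pitch_deg); `bin_deg` and `cap`
    are illustrative values."""
    rng = random.Random(seed)
    bins = defaultdict(list)
    for s in samples:
        _, yaw, pitch = s
        bins[(int(yaw // bin_deg), int(pitch // bin_deg))].append(s)
    kept = []
    for group in bins.values():
        rng.shuffle(group)          # sample uniformly within each bin
        kept.extend(group[:cap])
    return kept

# 500 frontal images are capped at 200; the 50 profile images all survive.
samples = ([("frontal_%d" % i, 0.0, 0.0) for i in range(500)]
           + [("side_%d" % i, 30.0, 0.0) for i in range(50)])
balanced = balance_by_pose(samples)
```

Capping per bin rather than globally subsampling keeps the rarer poses fully represented while flattening the frontal peak.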
Q11. Why are GPUs not available in many scenarios?
GPUs are power-hungry and might not be available in many scenarios where portability is important, e.g., for mobile robots.
Q12. What does it mean when all leaves in a tree contain the probability p(c = 1|P) = 1?
It means that all patches extracted from the depth image are allowed to vote, no matter their appearance.
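A hedged sketch of the corresponding test-time filter (the function, threshold, and leaf layout are illustrative, not the paper's implementation): each leaf stores an estimated probability that its patches belong to the target class, and only sufficiently confident leaves cast votes. When every leaf stores probability 1, no vote is ever rejected.

```python
def collect_votes(leaves, prob_threshold=0.9):
    """Keep only votes from leaves whose estimated probability of
    belonging to the target class reaches `prob_threshold`.
    With prob_threshold = 0 every leaf passes, i.e., every patch
    votes regardless of appearance, as in the degenerate case where
    all leaves store p(c = 1|P) = 1. Each leaf is (p_positive, vote)."""
    return [vote for p, vote in leaves if p >= prob_threshold]

leaves = [(1.0, (5, 0, 0)), (0.4, (90, 90, 0)), (0.95, (6, 1, 0))]
all_votes = collect_votes(leaves, prob_threshold=0.0)  # all 3 patches vote
filtered = collect_votes(leaves, prob_threshold=0.9)   # outlier vote dropped
```

Thresholding on the leaf probability is what lets appearance act as a vote filter; removing it turns the forest into a pure regressor over all patches.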
Q13. What is the importance of a minimum size for the patches?
The plot shows that a minimum size for the patches is critical, since small patches cannot capture enough information to reliably predict the head pose.
Q14. How can the body pose be estimated?
In Girshick et al. (2011), it has been shown that the body pose can be more efficiently estimated by using regression instead of classification forests.