Real-Time Body Tracking with One Depth Camera and Inertial Sensors
References
Real-time human pose recognition in parts from single depth images
A survey on vision-based human action recognition
Efficient regression of general-activity human poses from depth images
Real time motion capture using a single time-of-flight camera
Spacetime stereo: shape recovery for dynamic scenes
Frequently Asked Questions (13)
Q2. How can the authors track skeletons in real-time?
Recent hybrid (generative + discriminative) monocular tracking algorithms, e.g. [1, 17], can track human skeletons in real time from a single depth camera, as long as the body is mostly front-facing.
Q3. How do they use the depth map to retrieve a pose?
To retrieve a pose X_DB from the database that matches the one in the depth image, Baak et al. [1] use geodesic extrema computed on the depth map as an index.
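As a rough illustration of this kind of index lookup (not the authors' actual data structures), the sketch below flattens geodesic-extrema positions into a feature vector and retrieves the nearest database pose. The names `extrema_feature` and `nearest_pose`, and the toy poses, are hypothetical:

```python
import math

def extrema_feature(extrema):
    """Flatten a list of 3D geodesic-extrema positions (e.g. head, hands,
    feet) into a single feature vector used as a database index."""
    return [c for point in extrema for c in point]

def nearest_pose(database, extrema):
    """Return the stored pose whose extrema feature is closest (Euclidean)
    to the query extrema; a stand-in for the index lookup of Baak et al. [1]."""
    query = extrema_feature(extrema)
    best_pose, best_dist = None, math.inf
    for feature, pose in database:
        d = math.dist(feature, query)
        if d < best_dist:
            best_pose, best_dist = pose, d
    return best_pose

# Toy database: (feature vector, pose label) pairs.
db = [
    (extrema_feature([(0, 1.7, 0), (-0.6, 1.2, 0), (0.6, 1.2, 0)]), "T-pose"),
    (extrema_feature([(0, 1.7, 0), (-0.2, 0.5, 0), (0.2, 0.5, 0)]), "arms-down"),
]
print(nearest_pose(db, [(0, 1.68, 0.05), (-0.58, 1.21, 0), (0.61, 1.19, 0)]))  # → T-pose
```

A real system would use an approximate nearest-neighbor structure rather than a linear scan, but the indexing idea is the same.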
Q4. What is the purpose of the tracker?
This tracker uses discriminative features detected in the depth data, so-called geodesic extrema E_I, to query a database containing pre-recorded full-body poses.
Q5. How do they find the closest point in the depth point cloud?
For every point in C_X, they find the closest point in the depth point cloud M_I and minimize the sum of distances between model and data points by local optimization over the joint angles.
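A minimal 2D sketch of this closest-point energy and its local optimization, assuming a single limb with one joint angle (the paper optimizes over all joint angles with a proper local optimizer; `model_points`, `energy`, and `local_optimize` are illustrative names):

```python
import math

def model_points(theta, n=5, length=1.0):
    """Points along a single 2D limb rotated by joint angle theta
    (a stand-in for the articulated model point set C_X)."""
    return [(t * length * math.cos(theta), t * length * math.sin(theta))
            for t in (i / n for i in range(1, n + 1))]

def energy(theta, data):
    """Sum over model points of the distance to the closest data point
    (the correspondence term minimized by the generative tracker)."""
    return sum(min(math.dist(p, q) for q in data) for p in model_points(theta))

def local_optimize(data, theta0, step=0.1, iters=50):
    """Crude local search over the joint angle with a shrinking step;
    real implementations use gradient-based optimization instead."""
    theta = theta0
    for _ in range(iters):
        candidates = (theta - step, theta, theta + step)
        theta = min(candidates, key=lambda t: energy(t, data))
        step *= 0.9
    return theta

# Synthetic "depth" points sampled from a limb at 30 degrees.
truth = math.radians(30)
data = model_points(truth, n=20)
est = local_optimize(data, theta0=0.0)
print(round(math.degrees(est), 1))
```

The search converges to roughly 30 degrees here; the essential structure (closest-point correspondences feeding a local pose optimization) mirrors the description above.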
Q6. What are the advantages of using IMUs?
IMUs are nowadays manufactured cheaply and compactly, and are integrated into many hand-held devices, such as smartphones and game consoles.
Q7. How do the authors use the generative tracking approach?
The authors also empower generative tracking to use both data sources for reliable pose inference, and develop a new late-fusion step that uses both modalities.
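A hedged sketch of what such a late-fusion step might look like: each candidate pose is scored against both modalities and the lowest combined error wins. The weighting and the per-modality error functions below are assumptions for illustration, not the paper's actual formulation:

```python
def fuse(hypotheses, depth_error, imu_error, w=0.5):
    """Late-fusion sketch: score each candidate pose against both
    modalities and keep the one with the lowest combined error.
    `w` trades off depth evidence against IMU evidence (assumed)."""
    return min(hypotheses,
               key=lambda h: w * depth_error(h) + (1 - w) * imu_error(h))

# Toy example: candidate poses with precomputed per-modality errors.
depth_err = {"generative": 0.2, "database": 0.6}
imu_err = {"generative": 0.7, "database": 0.1}
best = fuse(["generative", "database"], depth_err.get, imu_err.get)
print(best)  # → database
```

The point of a late fusion is exactly this: each tracking branch produces a full hypothesis first, and the decision between them happens afterwards using evidence from both sensors.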
Q8. What is the method to reconstruct human pose from depth data?
Also using depth features and regression forests, [16] generates correspondences between body parts and a pose- and size-parametrized human model, which is optimized in real time using a one-shot optimization approach.
Q9. What are the three types of trackers?
Most of the trackers introduced so far can be classified into three families: discriminative approaches, generative approaches, and approaches combining both strategies.
Q10. How many false positives have been found in the tested scenarios?
In the tested scenarios, values of τ3 up to 10% have shown a good trade-off between rejecting false positives and not rejecting too many body parts that are actually visible.
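The role of τ3 can be sketched as a simple per-part rejection test; the error values and function name below are illustrative assumptions, not the paper's exact quantities:

```python
def visible_parts(part_error, tau3=0.10):
    """Keep a body part in the visible set only if its mismatch ratio
    stays below the threshold tau3 (values up to ~10% worked well in
    the tested scenarios). Error values here are made up for illustration."""
    return {part for part, err in part_error.items() if err <= tau3}

errors = {"left_hand": 0.03, "right_hand": 0.25, "head": 0.08}
print(sorted(visible_parts(errors)))  # → ['head', 'left_hand']
```

Raising τ3 admits more parts (fewer visible parts wrongly rejected) at the cost of more false positives, which is exactly the trade-off described above.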
Q11. What other real-time algorithms were proposed by e.g. [17]?
Other real-time algorithms, e.g. [17], use a body-part detector similar to [13] to augment a generative tracker.
Q12. How does generative tracking optimize skeletal pose parameters?
Similar to [1], generative tracking optimizes skeletal pose parameters by minimizing the distance between corresponding points on the model and in the depth data.
Q13. How do the authors avoid false positives in the setBvis?
To account for its possible deviation from the “real” pose and to avoid false positives in the set B_vis, the authors introduce the threshold τ3 > 0.