Pose Estimation for Augmented Reality: A Hands-On Survey
Citations
DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion
Massive MIMO is a reality—What is next?: Five promising research directions for antenna arrays
A comparative analysis of SIFT, SURF, KAZE, AKAZE, ORB, and BRISK
All One Needs to Know about Metaverse: A Complete Survey on Technological Singularity, Virtual Ecosystem, and Research Agenda
PVN3D: A Deep Point-Wise 3D Keypoints Voting Network for 6DoF Pose Estimation
References
Distinctive Image Features from Scale-Invariant Keypoints
Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography
A method for registration of 3-D shapes
Multiple view geometry in computer vision
Frequently Asked Questions (13)
Q2. What is the key criterion for accuracy in AR?
For AR applications, accuracy, rather than computational efficiency (as long as real-time requirements are met), is the key criterion, in order to avoid jitter effects.
Q3. How many correspondences are used to estimate the pose?
To filter out erroneous correspondences, the authors apply RANSAC to small subsets of 7 correspondences, from which they estimate the pose using their PnP method.
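The hypothesize-and-verify structure of RANSAC is independent of the underlying estimator. A minimal generic sketch (illustrative only, not the survey's code; all names here are ours) could look like:

```python
import random

def ransac(data, model_fn, error_fn, sample_size, threshold, iterations):
    # Generic hypothesize-and-verify loop: fit a model on a minimal random
    # sample, count inliers, keep the hypothesis with the largest consensus.
    best_model, best_inliers = None, []
    for _ in range(iterations):
        sample = random.sample(data, sample_size)
        model = model_fn(sample)          # e.g. a PnP solver on 7 matches
        if model is None:                 # degenerate sample, skip it
            continue
        inliers = [d for d in data if error_fn(model, d) < threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

In the setting described above, `model_fn` would be the authors' PnP method applied to subsets of 7 correspondences (`sample_size=7`) and `error_fn` a reprojection error.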
Q4. What is the reason why the pose computation is not always easy to do?
Since such 3D knowledge is not always easily available (although, as the authors have seen, it can be computed on-line), pose computation can also be approached using less constraining knowledge about the viewed scene.
Q5. What is the way to extract features from perspectively transformed images?
Since a camera can move freely in AR applications, such features should be extracted from perspectively transformed images.
Q6. What is the solution to the pose estimation problem?
Although P3P is a well-known solution to the pose estimation problem, other PnP approaches that use more points (n > 3) were usually preferred.
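For n >= 6 non-coplanar points, one of the simplest PnP approaches is the Direct Linear Transform: solve linearly for the projection matrix P ~ K[R | t], then factor out the known intrinsics K. A hedged NumPy sketch (illustrative only; `dlt_pnp` is a name chosen here, not from the survey):

```python
import numpy as np

def dlt_pnp(pts3d, pts2d, K):
    # Direct Linear Transform PnP (n >= 6, non-coplanar points):
    # solve P ~ K [R | t] up to scale, then factor K out and
    # project the rotation part back onto SO(3).
    A = []
    for (X, Y, Z), (u, v) in zip(pts3d, pts2d):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    P = Vt[-1].reshape(3, 4)              # null-space solution, up to scale
    M = np.linalg.inv(K) @ P              # ~ [R | t], up to scale and sign
    M /= np.linalg.norm(M[2, :3])         # rows of R have unit norm
    if M[2, 3] < 0:                       # points must lie in front (t_z > 0)
        M = -M
    U, _, Vt2 = np.linalg.svd(M[:, :3])   # nearest rotation matrix
    return U @ Vt2, M[:, 3]               # (R, t)
```

More accurate PnP methods (EPnP, iterative refinement) are usually preferred in practice; this linear solution is mostly useful as an initialization.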
Q7. What is the solution to estimate the homography?
Another solution to estimate the homography is to consider the minimization of a cost function, the geometric distance, defined by:

    ĥ = argmin_h  Σ_{i=1..N} d(x¹ᵢ, ¹H₂ x²ᵢ)²        (24)

which can be solved directly for h, the vector of the 8 independent parameters h_k, k = 1...8, of the homography ¹H₂, using a gradient approach such as Gauss-Newton.
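A minimal numeric sketch of this Gauss-Newton minimization of the transfer distance (illustrative only; a finite-difference Jacobian stands in for the analytic derivatives a real implementation would use):

```python
import numpy as np

def hom_project(h, pts):
    # apply the homography (8 params, H[2,2] fixed to 1) to 2xN points
    H = np.append(h, 1.0).reshape(3, 3)
    q = H @ np.vstack([pts, np.ones(pts.shape[1])])
    return q[:2] / q[2]

def estimate_homography_gn(x1, x2, iters=20):
    # Gauss-Newton on the geometric (transfer) distance:
    # minimize sum_i d(x1_i, H x2_i)^2 over the 8 parameters h_k
    h = np.array([1., 0., 0., 0., 1., 0., 0., 0.])  # start at identity
    for _ in range(iters):
        r = (hom_project(h, x2) - x1).ravel()       # residual vector
        J = np.empty((r.size, 8))
        for k in range(8):                          # finite-difference Jacobian
            hp = h.copy()
            hp[k] += 1e-7
            J[:, k] = ((hom_project(hp, x2) - x1).ravel() - r) / 1e-7
        h += np.linalg.lstsq(J, -r, rcond=None)[0]  # Gauss-Newton update
    return np.append(h, 1.0).reshape(3, 3)
```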
Q8. How many iterations are required to ensure that p = 0.99?
For the P4P problem (n = 4), when the data is contaminated with 10% of outliers, 5 iterations are required to ensure that p = 0.99, and with 50% of outliers 72 iterations are necessary.
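These counts follow from the standard RANSAC formula k = log(1 - p) / log(1 - (1 - ε)^n), where ε is the outlier ratio and n the minimal sample size; a short check:

```python
import math

def ransac_iterations(p, outlier_ratio, sample_size):
    # k = log(1 - p) / log(1 - (1 - eps)^n), rounded up:
    # number of draws needed so that, with probability p, at least
    # one sample is free of outliers.
    w = (1.0 - outlier_ratio) ** sample_size  # prob. of an all-inlier sample
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - w))
```

With p = 0.99 and n = 4 this gives 5 iterations for 10% outliers and 72 for 50%, matching the figures above.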
Q9. What is the advantage of the pose estimation methods?
Since a comprehensive or even a sparse 3D knowledge is not always easily available, the development of pose estimation methods that involve less constraining knowledge about the observed scene has been considered.
Q10. What was the first system that enabled scene reconstruction and consequently camera localization in real-time?
KinectFusion [89] was one of the first systems that enabled scene reconstruction and, consequently, camera localization in real time and in a way compatible with interactive applications [53] (see Figure 14).
Q11. What are the main areas of the research that must be considered by both academics and industries?
Scalability of the solutions, as well as end-user and market acceptance, are clearly potential improvement areas that must be considered by both academia and industry.
Q12. What is the definition of a pose estimation problem?
A simple definition could be: "given a set of correspondences between 3D features and their projections in the image plane, pose estimation consists in computing the position and orientation of the camera".
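In other words, pose estimation seeks the (R, t) that best explains the observed projections. The quantity being minimized, the reprojection error, can be sketched as follows (an illustration of the definition, not code from the survey):

```python
import numpy as np

def reprojection_error(R, t, K, pts3d, pts2d):
    # Pose estimation seeks the (R, t) minimizing this quantity
    # over the set of 3D-2D correspondences.
    cam = R @ pts3d.T + t[:, None]        # 3D points in the camera frame
    proj = K @ cam
    proj = (proj[:2] / proj[2]).T         # perspective projection to pixels
    return float(np.sum(np.linalg.norm(proj - pts2d, axis=1)))
```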
Q13. What is the common approach for detecting a rectangle?
More precisely, a rectangular shape is first searched for in a binarized image, and then the camera pose with respect to the rectangle is computed from the known 3D coordinates of the marker's four corners, using approaches similar to those presented in section 3.1 or 4.1.3.
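Once the four corners are matched to the marker's known planar geometry, the pose can be recovered from the induced plane-to-image homography. An illustrative NumPy sketch under the assumption of known intrinsics K and a marker lying in the plane Z = 0 (function names are ours, not the survey's):

```python
import numpy as np

def homography_dlt(world_xy, img_uv):
    # DLT estimate of the plane-to-image homography from >= 4 corners
    A = []
    for (X, Y), (u, v) in zip(world_xy, img_uv):
        A.append([X, Y, 1, 0, 0, 0, -u*X, -u*Y, -u])
        A.append([0, 0, 0, X, Y, 1, -v*X, -v*Y, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def pose_from_homography(H, K):
    # For a marker in the plane Z = 0:  H ~ K [r1 r2 t]
    M = np.linalg.inv(K) @ H
    if M[2, 2] < 0:                       # marker must be in front of camera
        M = -M
    lam = 1.0 / np.linalg.norm(M[:, 0])   # columns of R have unit norm
    r1, r2, t = lam * M[:, 0], lam * M[:, 1], lam * M[:, 2]
    R = np.column_stack([r1, r2, np.cross(r1, r2)])
    U, _, Vt = np.linalg.svd(R)           # project back onto SO(3)
    return U @ Vt, t
```

With noisy corners, the orthonormality of r1 and r2 only holds approximately, which is why the final SVD projection onto a proper rotation matters in practice.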