Real-time 3D reconstruction at scale using voxel hashing
Citations
ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes
Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age
Volumetric and Multi-view CNNs for Object Classification on 3D Data
Matterport3D: Learning from RGB-D Data in Indoor Environments
References
A method for registration of 3-D shapes
Marching cubes: A high resolution 3D surface construction algorithm
KinectFusion: Real-time dense surface mapping and tracking
A volumetric method for building complex models from range images
Surface reconstruction from unorganized points
Frequently Asked Questions (18)
Q2. What future work do the authors mention in the paper "Real-time 3D reconstruction at scale using voxel hashing"?
To further extend the bounds of reconstruction, their method supports lightweight streaming without major data structure reorganization. Due to the high performance of their data structure, the available time budget can be utilized for further improving camera pose estimation, which directly improves reconstruction quality over existing online approaches. The authors believe the advantages of their method will be even more evident when future depth cameras with higher resolution sensing emerge, as their data structure is already capable of reconstructing surfaces beyond the resolution of existing depth sensors such as Kinect.
Q3. How is the list accessed in parallel?
Because the list is accessed in parallel, synchronization is necessary; it is achieved by incrementing or decrementing the end-of-list pointer with an atomic operation.
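The allocation scheme described in this answer can be sketched as follows. This is a minimal illustration, not the authors' GPU code: the class name `VoxelBlockHeap` and its methods are hypothetical, and because Python has no atomic integer, a `threading.Lock` stands in for the atomic increment and decrement of the end-of-list pointer.

```python
import threading


class VoxelBlockHeap:
    """Free list of voxel block indices with a shared end-of-list pointer.

    Hypothetical sketch: a lock stands in for the GPU atomic operation.
    """

    def __init__(self, num_blocks):
        self.free = list(range(num_blocks))  # indices of unallocated blocks
        self.end = num_blocks                # number of entries still free
        self._lock = threading.Lock()        # stand-in for an atomic add

    def allocate(self):
        """Pop a free block index, or return None when the heap is exhausted."""
        with self._lock:                     # models an atomic decrement
            if self.end == 0:
                return None
            self.end -= 1
            return self.free[self.end]

    def release(self, block_index):
        """Append a block index back onto the end of the free list."""
        with self._lock:                     # models an atomic increment
            self.free[self.end] = block_index
            self.end += 1
```

Without the synchronization, two threads could read the same pointer value and receive the same block index.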
Q4. What is the purpose of the concept of a truncated SDF?
To reduce computational cost, support sensor motion, and approximate sensor noise, Curless and Levoy introduce the notion of a truncated SDF (TSDF) which only stores the signed distance in a region around the observed surface.
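As a rough illustration of truncation, the sketch below keeps signed distances only for samples that lie within a band around the observed surface along a single ray. The function name, the one-dimensional setup, and the sign convention (positive in front of the surface, negative behind it) are assumptions made for the example.

```python
def tsdf_samples_along_ray(surface_depth, voxel_depths, truncation):
    """Map each voxel depth to its signed distance, keeping only the band.

    Assumed convention: positive in front of the surface, negative behind.
    Voxels farther than `truncation` from the surface store nothing.
    """
    samples = {}
    for z in voxel_depths:
        sdf = surface_depth - z
        if abs(sdf) <= truncation:
            samples[z] = sdf
    return samples
```

Voxels outside the band are simply absent from the result, which is what makes sparse storage of the volume possible.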
Q5. What is the significant limitation of the hierarchy?
The most significant limitation of the hierarchy is the data structure overhead causing a performance drop, particularly in complex scenes.
Q6. Why are the integration weights set according to depth?
The authors set the integration weights according to the depth values in order to incorporate the noise characteristics of the sensor; i.e., more weight is given to nearer depth measurements for which the authors assume less noise.
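One way to picture depth-dependent weighting is sketched below. The linear falloff in `integration_weight` is a hypothetical choice, since the answer above does not give the weighting function; `integrate` is the standard weighted running average used in volumetric fusion. Both function names are assumptions.

```python
def integration_weight(depth, max_depth):
    """Hypothetical weight: 1 at the sensor, falling linearly to 0 at max_depth."""
    return max(0.0, 1.0 - depth / max_depth)


def integrate(stored_sdf, stored_weight, new_sdf, new_weight):
    """Fuse a new signed distance into a voxel by weighted running average.

    Returns the updated (sdf, weight) pair.
    """
    total = stored_weight + new_weight
    if total == 0.0:
        return stored_sdf, stored_weight  # nothing to fuse
    fused = (stored_sdf * stored_weight + new_sdf * new_weight) / total
    return fused, total
```

With these weights, a measurement taken at 1 m pulls the fused value three times as strongly as one taken at 3 m when `max_depth` is 4 m.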
Q7. How do the authors lock a bucket for writing?
To avoid race conditions when inserting hash entries in parallel, the authors lock a bucket atomically for writing when a suitable empty position is found.
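The locking idea can be sketched with a per-bucket try-lock. This is an illustration under assumptions: the class `LockedBuckets` is hypothetical, a non-blocking `threading.Lock` stands in for an atomic lock flag, and returning `False` so the caller can retry later is a design choice of the sketch.

```python
import threading


class LockedBuckets:
    """Hash buckets with a per-bucket try-lock guarding insertion (sketch)."""

    def __init__(self, num_buckets, bucket_size):
        self.slots = [[None] * bucket_size for _ in range(num_buckets)]
        self.locks = [threading.Lock() for _ in range(num_buckets)]

    def try_insert(self, bucket, key):
        """Insert key into the bucket; return False if it is locked or full."""
        slots = self.slots[bucket]
        if key in slots:
            return True                       # already present
        if None not in slots:
            return False                      # no empty position
        lock = self.locks[bucket]
        if not lock.acquire(blocking=False):  # another writer holds the bucket
            return False                      # caller may retry later
        try:
            if key in slots:                  # re-check under the lock
                return True
            if None not in slots:
                return False
            slots[slots.index(None)] = key
            return True
        finally:
            lock.release()
```

The re-check under the lock prevents two writers from inserting the same key after both saw an empty position.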
Q8. Why is 3D reconstruction gaining newfound momentum?
While 3D reconstruction is an established field in computer vision and graphics, it is now gaining newfound momentum due to the wide availability of depth cameras (such as the Microsoft Kinect and Asus Xtion).
Q9. What is the advantage of their method?
To further extend the bounds of reconstruction, their method supports lightweight streaming without major data structure reorganization.
Q10. How is the point-plane energy function linearized on the GPU?
The point-plane energy function is linearized [Low 2004] on the GPU to a 6 × 6 matrix using a parallel reduction and solved via Singular Value Decomposition on the CPU.
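To make the 6 × 6 system concrete, the sketch below linearizes the point-plane residual under a small-angle assumption, where each correspondence contributes the row [p × n, n] and the right-hand side n · (q − p). It differs from the description above in two labeled ways: a serial loop replaces the parallel GPU reduction, and Gaussian elimination replaces the SVD solve. All function names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def build_normal_equations(src, dst, normals):
    """Accumulate the 6x6 matrix A and 6-vector b over all correspondences.

    Unknowns are ordered (rotation x, y, z, translation x, y, z).
    """
    A = [[0.0] * 6 for _ in range(6)]
    b = [0.0] * 6
    for p, q, n in zip(src, dst, normals):
        row = cross(p, n) + tuple(n)                   # Jacobian row [p x n, n]
        r = dot(n, [qi - pi for pi, qi in zip(p, q)])  # n . (q - p)
        for i in range(6):
            b[i] += row[i] * r
            for j in range(6):
                A[i][j] += row[i] * row[j]
    return A, b


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    Stand-in for the SVD solve; assumes A is non-singular.
    """
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x
```

Because the accumulated matrix is always 6 × 6 regardless of the number of correspondences, only a tiny system needs to be transferred and solved per iteration.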
Q11. How can the authors use the surface to estimate the pose?
Once the surface is extracted via raycasting, it can be shaded for rendering, or used for frame-to-model camera pose estimation [Newcombe et al. 2011].
Q12. What are the advantages of multilayered height maps?
Multilayered height-maps have been explored to support reconstruction of more complex 3D shapes such as balconies, doorways, and arches [Gallup et al. 2010].
Q13. What are the advantages of their method?
The authors believe the advantages of their method will be even more evident when future depth cameras with higher resolution sensing emerge, as their data structure is already capable of reconstructing surfaces beyond the resolution of existing depth sensors such as Kinect.
Q14. What is the advantage of streaming voxel blocks to the GPU?
This enhances performance, given the high host-GPU bandwidth and ability to efficiently cull voxel blocks outside of the view frustum.
Q15. What is the way to stream voxel blocks?
Their unstructured data structure is well suited to streaming, since moving voxel blocks in or out does not require any reorganization of the hash table.
Q16. What is the benefit of a linear hash table?
Their linear hash table data structure (unlike hierarchical data structures) allows fast parallel access to all allocated blocks for operations such as rasterization.
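The point about flat access can be illustrated with a toy table stored as a single list. Everything here is an assumption made for the example: the XOR-of-primes hash and its constants are illustrative, and linear probing is a simplification rather than the collision handling used in the paper. What the sketch shows is that finding every allocated block is one pass over a flat array, with no tree to traverse.

```python
P1, P2, P3 = 73856093, 19349669, 83492791  # illustrative large primes (assumption)


def block_hash(x, y, z, num_slots):
    """Spatial hash of integer block coordinates into a slot index."""
    return ((x * P1) ^ (y * P2) ^ (z * P3)) % num_slots


def insert(table, coords):
    """Store coords in the first free slot at or after its hash (linear probing)."""
    n = len(table)
    start = block_hash(*coords, n)
    for step in range(n):
        i = (start + step) % n
        if table[i] is None or table[i] == coords:
            table[i] = coords
            return i
    raise RuntimeError("hash table is full")


def allocated_entries(table):
    """Collect every occupied slot in one linear pass over the flat table."""
    return [entry for entry in table if entry is not None]
```

On a GPU, each slot of such a table could be checked by its own thread, which is why the scan parallelizes trivially.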
Q17. What is the weighting factor in the point-plane error-metric?
As their data structure also stores associated color data, the authors incorporate a weighting factor in the point-plane error-metric based on color consistency between extracted and input RGB values [Johnson and Bing Kang 1999].
Q18. How is the ego-motion of the sensor estimated?
The rigid six degree-of-freedom (6DoF) ego-motion of the sensor is estimated, typically using variants of ICP [Besl and McKay 1992; Chen and Medioni 1992].