Journal ArticleDOI
BundleFusion: real-time globally consistent 3D reconstruction using on-the-fly surface re-integration
TL;DR: In this paper, a robust pose estimation strategy is proposed for real-time, high-quality 3D scanning of large-scale scenes from RGB-D input, using an efficient hierarchical approach that removes the heavy reliance on temporal tracking and continually localizes to the globally optimized frames instead.
Abstract:
Real-time, high-quality, 3D scanning of large-scale scenes is key to mixed reality and robotic applications. However, scalability brings challenges of drift in pose estimation, introducing significant errors in the accumulated model. Approaches often require hours of offline processing to globally correct model errors. Recent online methods demonstrate compelling results but suffer from (1) needing minutes to perform online correction, preventing true real-time use; (2) brittle frame-to-frame (or frame-to-model) pose estimation, resulting in many tracking failures; or (3) supporting only unstructured point-based representations, which limit scan quality and applicability. We systematically address these issues with a novel, real-time, end-to-end reconstruction framework. At its core is a robust pose estimation strategy, optimizing per frame for a global set of camera poses by considering the complete history of RGB-D input with an efficient hierarchical approach. We remove the heavy reliance on temporal tracking and continually localize to the globally optimized frames instead. We contribute a parallelizable optimization framework, which employs correspondences based on sparse features and dense geometric and photometric matching. Our approach estimates globally optimized (i.e., bundle adjusted) poses in real time, supports robust tracking with recovery from gross tracking failures (i.e., relocalization), and re-estimates the 3D model in real time to ensure global consistency, all within a single framework. Our approach outperforms state-of-the-art online systems with quality on par to offline methods, but with unprecedented speed and scan completeness. Our framework leads to a comprehensive online scanning solution for large indoor environments, enabling ease of use and high-quality results.
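The abstract's correspondence-based pose optimization is far more involved than can be shown here, but one standard building block such systems rely on is the closed-form rigid alignment of corresponded 3D points (the Kabsch/Procrustes SVD solution). A minimal NumPy sketch, with an illustrative function name of my own choosing, not the authors' implementation:

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rigid transform (R, t) mapping src onto dst.

    src, dst: (N, 3) arrays of corresponding 3D points.
    Uses the Kabsch/Procrustes SVD solution, the standard closed-form
    step inside correspondence-based pose estimation.
    """
    c_src, c_dst = src.mean(0), dst.mean(0)
    H = (src - c_src).T @ (dst - c_dst)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_dst - R @ c_src
    return R, t
```

Real systems embed this inside a robust, hierarchical optimization over many frames with both sparse-feature and dense photometric/geometric terms, rather than a single two-frame solve.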
Citations
Journal ArticleDOI
Vision-based Robotic Grasping From Object Localization, Object Pose Estimation to Grasp Estimation for Parallel Grippers: A Review
TL;DR: This paper presents a comprehensive survey of vision-based robotic grasping, covering object localization, object pose estimation, and grasp estimation, along with the open challenges and future directions for addressing them.
Journal ArticleDOI
High-quality 3D Reconstruction with Depth Super-resolution and Completion
Li Jianwei, Wei Gao, Yihong Wu +2 more
TL;DR: A new depth super-resolution and completion method, implemented in a deep learning framework, is used to build a high-quality 3D reconstruction system that performs better on both single depth-image enhancement and 3D reconstruction.
Book ChapterDOI
3DFacilities: Annotated 3D Reconstructions of Building Facilities
TL;DR: In an effort to leverage the success of deep learning for Scan-to-BIM, 3DFacilities, an annotated dataset of 3D reconstructions of building facilities, is presented, containing over 11,000 individual RGB-D frames comprising 50 scene reconstructions annotated with 3D camera poses and per-vertex and per-pixel annotations.
Journal ArticleDOI
A Novel Method for Plane Extraction from Low-Resolution Inhomogeneous Point Clouds and its Application to a Customized Low-Cost Mobile Mapping System
TL;DR: A method for plane extraction from low-resolution inhomogeneous point clouds is presented, based on virtual scanlines and the Enhanced Line Simplification (ELS) algorithm; the results suggest that it can be applied to mobile mapping and sensor fusion.
Proceedings ArticleDOI
3D Reconstruction and Texture Optimization Using a Sparse Set of RGB-D Cameras
Wei Li, Xiao Xiao, James K. Hahn +2 more
TL;DR: A robust and efficient tile-based streaming pipeline for geometry reconstruction with TSDF fusion, which minimizes memory overhead and computation cost, is proposed, together with a multi-grid warping method for texture optimization that addresses misalignments of both global structures and small details caused by errors in multi-camera registration, optical distortion, and imprecise geometry.
References
Journal ArticleDOI
Distinctive Image Features from Scale-Invariant Keypoints
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
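The "reliable matching" stage commonly paired with such descriptors is Lowe's ratio test: a match is kept only when the nearest descriptor is clearly closer than the second nearest. A minimal NumPy sketch (function name and brute-force distance computation are illustrative assumptions, not this paper's implementation):

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Lowe's ratio test: keep a match (i in a -> j in b) only when
    the best neighbor is clearly closer than the second best.

    desc_a: (Na, D), desc_b: (Nb, D) descriptor arrays, Nb >= 2.
    Returns a list of (index_in_a, index_in_b) accepted matches.
    """
    # pairwise Euclidean distances, brute force
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=-1)
    order = np.argsort(d, axis=1)
    best, second = order[:, 0], order[:, 1]
    rows = np.arange(len(desc_a))
    keep = d[rows, best] < ratio * d[rows, second]
    return [(int(i), int(best[i])) for i in np.flatnonzero(keep)]
```

An ambiguous query (two near-identical candidates in `desc_b`) fails the test and is discarded, which is exactly what makes the matching robust to clutter and repeated texture.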
Journal ArticleDOI
A method for registration of 3-D shapes
Paul J. Besl, Neil D. McKay
TL;DR: In this paper, the authors describe a general-purpose representation-independent method for the accurate and computationally efficient registration of 3D shapes including free-form curves and surfaces, based on the iterative closest point (ICP) algorithm, which requires only a procedure to find the closest point on a geometric entity to a given point.
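The ICP algorithm this summary describes alternates two steps: match each point to its closest counterpart, then solve the resulting point-to-point alignment in closed form. A minimal point-to-point sketch in NumPy (brute-force matching, no outlier rejection; a toy version, not the paper's general representation-independent formulation):

```python
import numpy as np

def icp(src, dst, iters=20):
    """Point-to-point ICP sketch: alternate closest-point matching
    with a closed-form rigid alignment (Kabsch/SVD).

    Returns (R, t) such that src @ R.T + t approximates dst.
    Assumes the initial misalignment is small enough that nearest
    neighbors are mostly correct correspondences.
    """
    R_acc, t_acc = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        # closest point in dst for every point in cur (brute force)
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        matched = dst[d2.argmin(1)]
        # closed-form rigid alignment of cur onto its matches
        c_s, c_d = cur.mean(0), matched.mean(0)
        U, _, Vt = np.linalg.svd((cur - c_s).T @ (matched - c_d))
        S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R = Vt.T @ S @ U.T
        t = c_d - R @ c_s
        cur = cur @ R.T + t
        R_acc, t_acc = R @ R_acc, R @ t_acc + t
    return R_acc, t_acc
```

Practical variants add point-to-plane error metrics, correspondence pruning, and k-d trees for the nearest-neighbor search; this sketch keeps only the core alternation.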
Book
A Mathematical Introduction to Robotic Manipulation
TL;DR: In this paper, the authors present a detailed overview of the history of multifingered hands and dexterous manipulation, along with a mathematical model for steerable and non-driveable hands.
Book ChapterDOI
Indoor segmentation and support inference from RGBD images
TL;DR: The goal is to parse typical, often messy, indoor scenes into floor, walls, supporting surfaces, and object regions, and to recover support relationships, to better understand how 3D cues can best inform a structured 3D interpretation.
Proceedings ArticleDOI
KinectFusion: Real-time dense surface mapping and tracking
Richard Newcombe, Shahram Izadi, Otmar Hilliges, David Molyneaux, David Kim, Andrew J. Davison, Pushmeet Kohli, Jamie Shotton, Steve Hodges, Andrew Fitzgibbon
TL;DR: A system for accurate real-time mapping of complex and arbitrary indoor scenes in variable lighting conditions, using only a moving low-cost depth camera and commodity graphics hardware, which fuses all of the depth data streamed from a Kinect sensor into a single global implicit surface model of the observed scene in real time.
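The implicit surface model this summary refers to is a truncated signed distance function (TSDF) volume, and the fusion step boils down to a per-voxel weighted running average of truncated signed distances. A minimal NumPy sketch of that update (function name, normalization, and the signed-distance input are illustrative assumptions; the real system computes per-voxel distances by projecting into the depth map on the GPU):

```python
import numpy as np

def integrate(tsdf, weight, sdf_obs, trunc=0.1):
    """One TSDF integration step: truncate the observed signed
    distances and blend them into the volume with a weighted
    running average, the core update of volumetric depth fusion.

    tsdf, weight, sdf_obs: arrays of the same shape (one entry per
    voxel); sdf_obs holds the signed distance to the observed surface.
    """
    d = np.clip(sdf_obs / trunc, -1.0, 1.0)   # truncated, normalized SDF
    valid = sdf_obs > -trunc                   # skip voxels far behind the surface
    w_new = weight + valid
    tsdf_new = np.where(valid,
                        (tsdf * weight + d) / np.maximum(w_new, 1),
                        tsdf)
    return tsdf_new, w_new
```

The surface is then extracted at the zero crossing of the fused TSDF (e.g., by ray casting or marching cubes); averaging over many frames is what suppresses per-frame depth noise.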