Journal ArticleDOI

EVO: A Geometric Approach to Event-Based 6-DOF Parallel Tracking and Mapping in Real Time

01 Apr 2017 · IEEE Robotics and Automation Letters, Vol. 2, Iss. 2, pp. 593-600
TL;DR: EVO, an event-based visual odometry algorithm, leverages the outstanding properties of event cameras to track fast camera motions while recovering a semidense three-dimensional map of the environment, making significant progress in simultaneous localization and mapping.
Abstract: We present EVO, an event-based visual odometry algorithm. Our algorithm successfully leverages the outstanding properties of event cameras to track fast camera motions while recovering a semidense three-dimensional (3-D) map of the environment. The implementation runs in real time on a standard CPU and outputs up to several hundred pose estimates per second. Due to the nature of event cameras, our algorithm is unaffected by motion blur and operates very well in challenging, high dynamic range conditions with strong illumination changes. To achieve this, we combine a novel, event-based tracking approach based on image-to-model alignment with a recent event-based 3-D reconstruction algorithm in a parallel fashion. Additionally, we show that the output of our pipeline can be used to reconstruct intensity images from the binary event stream, though our algorithm does not require such intensity information. We believe that this work makes significant progress in simultaneous localization and mapping by unlocking the potential of event cameras. This allows us to tackle challenging scenarios that are currently inaccessible to standard cameras.
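The abstract describes two modules running in parallel: mapping integrates events into a semidense 3-D map of scene edges, while tracking registers an image of accumulated events against a projection of that map (image-to-model alignment). The sketch below shows one iteration of this loop in C++; all types and functions in it (Event, EdgeImage, projectMap, alignImages, ...) are hypothetical stand-ins, not the authors' implementation.

    #include <cstdint>
    #include <vector>

    // Hypothetical minimal types, placeholders only.
    struct Event     { uint16_t x, y; double t; bool polarity; };
    struct Pose      { /* e.g., an element of SE(3) */ };
    struct EdgeImage { /* image of accumulated events / projected edges */ };
    struct Map3D     { /* semidense 3-D map of scene edges */ };

    // Stubs standing in for the two modules of the pipeline.
    EdgeImage accumulateEvents(const std::vector<Event>&) { return {}; }
    EdgeImage projectMap(const Map3D&, const Pose&)       { return {}; }
    Pose      alignImages(const EdgeImage&, const EdgeImage&, const Pose& init) { return init; }
    void      updateMap(Map3D&, const std::vector<Event>&, const Pose&) {}

    // One iteration: tracking (image-to-model alignment), then mapping.
    void trackAndMap(Map3D& map, Pose& pose, const std::vector<Event>& batch) {
      EdgeImage observed = accumulateEvents(batch);   // events -> edge-like image
      EdgeImage model    = projectMap(map, pose);     // project current 3-D map
      pose = alignImages(model, observed, pose);      // update the pose estimate
      updateMap(map, batch, pose);                    // event-based 3-D reconstruction
    }

In the real system the two modules run concurrently, which is what makes the pipeline "parallel".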
Citations
Journal ArticleDOI
TL;DR: What is now the de-facto standard formulation for SLAM is presented, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers.
Abstract: Simultaneous Localization and Mapping (SLAM)consists in the concurrent construction of a model of the environment (the map), and the estimation of the state of the robot moving within it. The SLAM community has made astonishing progress over the last 30 years, enabling large-scale real-world applications, and witnessing a steady transition of this technology to industry. We survey the current state of SLAM. We start by presenting what is now the de-facto standard formulation for SLAM. We then review related work, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and tutorial to those who are users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues, that still deserve careful scientific investigation. The paper also contains the authors' take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? and Is SLAM solved?
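For reference, the "de-facto standard formulation" presented in the survey is maximum a posteriori estimation over a factor graph. In standard notation (a sketch, not a quote from the survey):

\[
\mathcal{X}^{\star} \;=\; \arg\max_{\mathcal{X}}\; p(\mathcal{X} \mid \mathcal{Z})
\;=\; \arg\min_{\mathcal{X}} \sum_{k} \big\lVert h_k(\mathcal{X}_k) - z_k \big\rVert^{2}_{\Omega_k},
\]

where \(\mathcal{X}\) stacks the robot trajectory and the map, each measurement \(z_k \in \mathcal{Z}\) has measurement model \(h_k\), and the Mahalanobis norm with information matrix \(\Omega_k\) follows from assuming Gaussian measurement noise.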

1,828 citations


Cites background from "EVO: A Geometric Approach to Event-Based 6-DOF Parallel Tracking and Mapping in Real Time"

  • ...high-speed motion [90] and high-dynamic range [132], [207], where standard cameras fail....


Journal ArticleDOI
TL;DR: This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras.
Abstract: Event cameras are bio-inspired sensors that differ from conventional frame cameras: Instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (in the order of μs), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in challenging scenarios for traditional cameras, such as low-latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
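For concreteness, the events mentioned here are usually described by the standard event-generation model: a pixel \(\mathbf{u}_k\) fires an event \(e_k = (\mathbf{u}_k, t_k, p_k)\) at time \(t_k\) as soon as the log-brightness change since its last event reaches a contrast threshold \(C\) (notation is ours, not quoted from the survey):

\[
L(\mathbf{u}_k, t_k) - L(\mathbf{u}_k, t_k - \Delta t_k) \;=\; p_k\, C, \qquad p_k \in \{+1, -1\},
\]

where \(L = \log I\) is the log intensity, \(\Delta t_k\) is the time elapsed since the previous event at that pixel, and the polarity \(p_k\) is the sign of the brightness change.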

697 citations


Cites background or methods from "EVO: A Geometric Approach to Event-Based 6-DOF Parallel Tracking and Mapping in Real Time"

  • ... this conversion step and directly recover the camera motion and scene structure from the events, as suggested by [128]; for example, by optimizing a function with photometric (i.e., event firing rate [26]) and inertial error terms, akin to VI-DSO [234] for standard cameras. Stereo event-based VIO is an unexplored topic, and it would be interesting to see how ideas from event-based depth estimation can...


  • ...measuring the strength of the scene edges. Recently, solutions to the full problem of event-based 3D SLAM for 6-DOF motions and natural scenes, not relying on additional sensing, have been proposed [25], [26] (Table 3). The approach in [25] extends [24] and consists of three interleaved probabilistic filters to perform pose tracking as well as depth and intensity estimation. However, it suffers from limited...


  • ...to trade off latency for efficiency, probabilistic filters [24], [25], [224] can operate on small groups of events. Other approaches are natively designed for groups, based for example on non-linear optimization [26], [127], [128], and run in real time on the CPU. Processing multiple events simultaneously is also beneficial to reduce noise. Opportunities: The above-mentioned SLAM methods lack loop-closure capabilities...


  • ...assumption of uncorrelated depth, intensity gradient, and camera motion. Furthermore, it is computationally intensive, requiring a GPU for real-time operation. In contrast, the semi-dense approach in [26] shows that intensity reconstruction is not needed for depth estimation or pose tracking. The approach has a geometric foundation: it performs space sweeping for 3D reconstruction [19] and edge-map alignment...


  • ...], [236] or using sparse signal processing with a patch-based learned dictionary that mapped events to image gradients, which were then Poisson-integrated [235]. Concurrently, the VO methods in [25], [26] extended the image reconstruction technique in [24] to 6-DOF camera motions by using the computed scene depth and poses: [25] used a robust variational regularizer to reduce noise and improve contrast...


Book
03 Jul 2020
TL;DR: This survey includes both the historically most relevant literature as well as the current state of the art on several specific topics, including recognition, reconstruction, motion estimation, tracking, scene understanding, and end-to-end learning for autonomous driving.
Abstract: Recent years have witnessed enormous progress in AI-related fields such as computer vision, machine learning, and autonomous vehicles. As with any rapidly growing field, it becomes increasingly difficult to stay up-to-date or enter the field as a beginner. While several survey papers on particular sub-problems have appeared, no comprehensive survey on problems, datasets, and methods in computer vision for autonomous vehicles has been published. This monograph attempts to narrow this gap by providing a survey on the state-of-the-art datasets and techniques. Our survey includes both the historically most relevant literature as well as the current state of the art on several specific topics, including recognition, reconstruction, motion estimation, tracking, scene understanding, and end-to-end learning for autonomous driving. Towards this goal, we analyze the performance of the state of the art on several challenging benchmarking datasets, including KITTI, MOT, and Cityscapes. Besides, we discuss open problems and current research challenges. To ease accessibility and accommodate missing references, we also provide a website that allows navigating topics as well as methods and provides additional information.

579 citations


Cites background from "EVO: A Geometric Approach to Event-Based 6-DOF Parallel Tracking and Mapping in Real Time"

  • ...Rebecq et al. (2016) propose an event-based 3D reconstruction algorithm to produce a parallel tracking and mapping pipeline that runs in real-time on the CPU....


Journal ArticleDOI
TL;DR: In this article, the authors present the dynamic and active-pixel vision sensor (DAVIS), which incorporates a conventional global-shutter camera and an event-based sensor in the same pixel array.
Abstract: New vision sensors, such as the dynamic and active-pixel vision sensor DAVIS, incorporate a conventional global-shutter camera and an event-based sensor in the same pixel array. These sensors have ...

370 citations

References
Proceedings ArticleDOI
09 May 2011
TL;DR: PCL (Point Cloud Library) is presented, an advanced and extensive approach to the subject of 3D perception that contains state-of-the-art algorithms for filtering, feature estimation, surface reconstruction, registration, model fitting and segmentation.
Abstract: With the advent of new, low-cost 3D sensing hardware such as the Kinect, and continued efforts in advanced point cloud processing, 3D perception gains more and more importance in robotics, as well as other fields. In this paper we present one of our most recent initiatives in the areas of point cloud perception: PCL (Point Cloud Library - http://pointclouds.org). PCL presents an advanced and extensive approach to the subject of 3D perception, and it's meant to provide support for all the common 3D building blocks that applications need. The library contains state-of-the-art algorithms for filtering, feature estimation, surface reconstruction, registration, model fitting and segmentation. PCL is supported by an international community of robotics and perception researchers. We provide a brief walkthrough of PCL including its algorithmic capabilities and implementation strategies.
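EVO uses this library for map post-processing (the radius filter quoted below). A minimal sketch of PCL's radius outlier removal follows; the parameter values are illustrative, not taken from the paper.

    #include <pcl/point_types.h>
    #include <pcl/point_cloud.h>
    #include <pcl/filters/radius_outlier_removal.h>

    int main() {
      // Input cloud; in practice filled with the semidense map points.
      pcl::PointCloud<pcl::PointXYZ>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZ>);
      pcl::PointCloud<pcl::PointXYZ>::Ptr filtered(new pcl::PointCloud<pcl::PointXYZ>);

      // Radius outlier removal: keep only points with enough neighbors nearby.
      pcl::RadiusOutlierRemoval<pcl::PointXYZ> ror;
      ror.setInputCloud(cloud);
      ror.setRadiusSearch(0.05);       // neighborhood radius in meters (illustrative)
      ror.setMinNeighborsInRadius(5);  // minimum neighbor count (illustrative)
      ror.filter(*filtered);
      return 0;
    }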

4,501 citations


"EVO: A Geometric Approach to Event-..." refers methods in this paper

  • ...We also apply a radius filter [21] to the resulting...


Proceedings ArticleDOI
13 Nov 2007
TL;DR: A system specifically designed to track a hand-held camera in a small AR workspace, with tracking and mapping processed in parallel threads on a dual-core computer, that produces detailed maps with thousands of landmarks which can be tracked at frame rate with accuracy and robustness rivalling that of state-of-the-art model-based systems.
Abstract: This paper presents a method of estimating camera pose in an unknown scene. While this has previously been attempted by adapting SLAM algorithms developed for robotic exploration, we propose a system specifically designed to track a hand-held camera in a small AR workspace. We propose to split tracking and mapping into two separate tasks, processed in parallel threads on a dual-core computer: one thread deals with the task of robustly tracking erratic hand-held motion, while the other produces a 3D map of point features from previously observed video frames. This allows the use of computationally expensive batch optimisation techniques not usually associated with real-time operation: The result is a system that produces detailed maps with thousands of landmarks which can be tracked at frame-rate, with an accuracy and robustness rivalling that of state-of-the-art model-based systems.
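The split described here, tracking and mapping as two parallel tasks sharing a map, is the architectural core. A minimal C++ skeleton of that structure (hypothetical, not PTAM's actual code):

    #include <atomic>
    #include <mutex>
    #include <thread>
    #include <vector>

    struct Keyframe {};  // an observed video frame with its pose
    struct Landmark {};  // a 3D point feature

    struct SharedMap {
      std::mutex mtx;
      std::vector<Keyframe> keyframes;
      std::vector<Landmark> landmarks;
    };

    std::atomic<bool> running{true};

    // Thread 1: robustly track erratic hand-held motion at frame rate.
    void trackingThread(SharedMap& map) {
      while (running) {
        std::lock_guard<std::mutex> lock(map.mtx);
        // ... localize the current frame against map.landmarks ...
      }
    }

    // Thread 2: build the map with computationally expensive batch optimisation.
    void mappingThread(SharedMap& map) {
      while (running) {
        std::lock_guard<std::mutex> lock(map.mtx);
        // ... add keyframes, refine map.landmarks (e.g., bundle adjustment) ...
      }
    }

    int main() {
      SharedMap map;
      std::thread tracker(trackingThread, std::ref(map));
      std::thread mapper(mappingThread, std::ref(map));
      running = false;  // stop immediately; a real system runs until shutdown
      tracker.join();
      mapper.join();
    }

Decoupling the threads is what lets the mapping side spend many milliseconds on batch optimisation without stalling frame-rate tracking.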

4,091 citations

Book ChapterDOI
06 Sep 2014
TL;DR: A novel direct tracking method which operates on \(\mathfrak{sim}(3)\), thereby explicitly detecting scale drift, and an elegant probabilistic solution to include the effect of noisy depth values in tracking are introduced.
Abstract: We propose a direct (feature-less) monocular SLAM algorithm which, in contrast to the current state of the art in direct methods, allows building large-scale, consistent maps of the environment. Along with highly accurate pose estimation based on direct image alignment, the 3D environment is reconstructed in real time as a pose graph of keyframes with associated semi-dense depth maps. These are obtained by filtering over a large number of pixelwise small-baseline stereo comparisons. The explicitly scale-drift-aware formulation allows the approach to operate on challenging sequences including large variations in scene scale. Major enablers are two key novelties: (1) a novel direct tracking method which operates on \(\mathfrak{sim}(3)\), thereby explicitly detecting scale drift, and (2) an elegant probabilistic solution to include the effect of noisy depth values into tracking. The resulting direct monocular SLAM system runs in real time on a CPU.
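Schematically, the direct tracking referred to above minimizes a variance-normalized photometric error over the semi-dense pixels (a sketch in the paper's spirit, not a verbatim quote):

\[
E(\xi) \;=\; \sum_{\mathbf{p} \in \Omega} \left\lVert \frac{r_{\mathbf{p}}^{2}(\xi)}{\sigma_{r_{\mathbf{p}}}^{2}} \right\rVert_{\delta},
\qquad
r_{\mathbf{p}}(\xi) \;=\; I_{\mathrm{ref}}(\mathbf{p}) - I\big(\omega(\mathbf{p},\, D_{\mathrm{ref}}(\mathbf{p}),\, \xi)\big),
\]

where \(\omega\) warps pixel \(\mathbf{p}\) with inverse depth \(D_{\mathrm{ref}}(\mathbf{p})\) under the transform \(\xi \in \mathfrak{sim}(3)\), \(\sigma_{r_{\mathbf{p}}}^{2}\) propagates the depth uncertainty into the residual (the probabilistic treatment of noisy depth mentioned above), and \(\lVert \cdot \rVert_{\delta}\) is the Huber norm.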

3,273 citations


"EVO: A Geometric Approach to Event-..." refers methods in this paper

  • ...Our tracking module relies on image-to-model alignment, which is also used in frame-based, direct VO pipelines [14], [15]....


Journal ArticleDOI
TL;DR: Presents an overview of image alignment since the Lucas-Kanade algorithm in a consistent framework, concentrating on the efficient inverse compositional algorithm and examining which extensions of Lucas-Kanade can be used with it without any significant loss of efficiency.
Abstract: Since the Lucas-Kanade algorithm was proposed in 1981, image alignment has become one of the most widely used techniques in computer vision. Applications range from optical flow and tracking to layered motion, mosaic construction, and face coding. Numerous algorithms have been proposed and a wide variety of extensions have been made to the original formulation. We present an overview of image alignment, describing most of the algorithms and their extensions in a consistent framework. We concentrate on the inverse compositional algorithm, an efficient algorithm that we recently proposed. We examine which of the extensions to Lucas-Kanade can be used with the inverse compositional algorithm without any significant loss of efficiency, and which cannot. In this paper, Part 1 in a series of papers, we cover the quantity approximated, the warp update rule, and the gradient descent approximation. In future papers, we will cover the choice of the error function, how to allow linear appearance variation, and how to impose priors on the parameters.
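At each iteration, the inverse compositional algorithm highlighted here solves (standard notation from this line of work):

\[
\Delta p \;=\; \arg\min_{\Delta p} \sum_{\mathbf{x}} \big[\, T(W(\mathbf{x}; \Delta p)) - I(W(\mathbf{x}; p)) \,\big]^{2},
\qquad
W(\mathbf{x}; p) \;\leftarrow\; W(\mathbf{x}; p) \circ W(\mathbf{x}; \Delta p)^{-1},
\]

where \(T\) is the template, \(I\) the input image, and \(W\) the warp with parameters \(p\). Because the linearization is taken on the template rather than on the image, the Jacobian and the Gauss-Newton Hessian are constant across iterations and can be precomputed, which is the source of the algorithm's efficiency.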

3,168 citations

Proceedings ArticleDOI
29 Sep 2014
TL;DR: A semi-direct monocular visual odometry algorithm that is precise, robust, and faster than current state-of-the-art methods and applied to micro-aerial-vehicle state-estimation in GPS-denied environments is proposed.
Abstract: We propose a semi-direct monocular visual odometry algorithm that is precise, robust, and faster than current state-of-the-art methods. The semi-direct approach eliminates the need for costly feature extraction and robust matching techniques for motion estimation. Our algorithm operates directly on pixel intensities, which results in subpixel precision at high frame rates. A probabilistic mapping method that explicitly models outlier measurements is used to estimate 3D points, which results in fewer outliers and more reliable points. Precise and high frame-rate motion estimation brings increased robustness in scenes of little, repetitive, and high-frequency texture. The algorithm is applied to micro-aerial-vehicle state estimation in GPS-denied environments and runs at 55 frames per second on the onboard embedded computer and at more than 300 frames per second on a consumer laptop. We call our approach SVO (Semi-direct Visual Odometry) and release our implementation as open-source software.
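Schematically, the semi-direct motion estimate comes from sparse image alignment: the relative pose between frames minimizes the photometric error of small patches around features whose depth is known (a sketch in standard notation, not a verbatim quote):

\[
T_{k,k-1} \;=\; \arg\min_{T}\; \frac{1}{2} \sum_{i} \big\lVert\, I_k\big(\pi(T\, \pi^{-1}(\mathbf{u}_i, d_i))\big) - I_{k-1}(\mathbf{u}_i) \,\big\rVert^{2},
\]

where \(\pi\) projects a 3D point into the image, \(\pi^{-1}(\mathbf{u}_i, d_i)\) back-projects pixel \(\mathbf{u}_i\) with depth \(d_i\), and the sum runs over patches around tracked features. Operating on a few such patches rather than all pixels is what makes the method fast while retaining subpixel direct alignment.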

1,814 citations


"EVO: A Geometric Approach to Event-..." refers background or methods in this paper

  • ...Since a motion-capture system is not available outdoors, we used a state-of-the-art VO method (SVO [14]) on the intensity frames of the DAVIS for comparison (Fig....


  • ...[14] C. Forster, M. Pizzoli, and D. Scaramuzza, “SVO: Fast semi-direct monocular visual odometry,” in Proc....


  • ...For comparison, SVO [14] uses up to 30 iterations....



  • ...Our tracking module relies on image-to-model alignment, which is also used in frame-based, direct VO pipelines [14], [15]....
