Journal ArticleDOI

EVO: A Geometric Approach to Event-Based 6-DOF Parallel Tracking and Mapping in Real Time

01 Apr 2017 · IEEE Robotics and Automation Letters, Vol. 2, Iss. 2, pp. 593-600
TL;DR: EVO, an event-based visual odometry algorithm, leverages the outstanding properties of event cameras to track fast camera motions while recovering a semidense three-dimensional map of the environment, making significant progress in simultaneous localization and mapping.
Abstract: We present EVO, an event-based visual odometry algorithm. Our algorithm successfully leverages the outstanding properties of event cameras to track fast camera motions while recovering a semidense three-dimensional (3-D) map of the environment. The implementation runs in real time on a standard CPU and outputs up to several hundred pose estimates per second. Due to the nature of event cameras, our algorithm is unaffected by motion blur and operates very well in challenging, high dynamic range conditions with strong illumination changes. To achieve this, we combine a novel, event-based tracking approach based on image-to-model alignment with a recent event-based 3-D reconstruction algorithm in a parallel fashion. Additionally, we show that the output of our pipeline can be used to reconstruct intensity images from the binary event stream, though our algorithm does not require such intensity information. We believe that this work makes significant progress in simultaneous localization and mapping by unlocking the potential of event cameras. This allows us to tackle challenging scenarios that are currently inaccessible to standard cameras.
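The abstract describes two modules running in parallel: mapping integrates events into a semidense 3-D map of scene edges, while tracking registers an image of accumulated events against a projection of that map (image-to-model alignment). The sketch below shows one iteration of this loop in C++; all types and functions in it (Event, EdgeImage, projectMap, alignImages, ...) are hypothetical stand-ins, not the authors' implementation.

    #include <cstdint>
    #include <vector>

    // Hypothetical minimal types, placeholders only.
    struct Event     { uint16_t x, y; double t; bool polarity; };
    struct Pose      { /* e.g., an element of SE(3) */ };
    struct EdgeImage { /* image of accumulated events / projected edges */ };
    struct Map3D     { /* semidense 3-D map of scene edges */ };

    // Stubs standing in for the two modules of the pipeline.
    EdgeImage accumulateEvents(const std::vector<Event>&) { return {}; }
    EdgeImage projectMap(const Map3D&, const Pose&)       { return {}; }
    Pose      alignImages(const EdgeImage&, const EdgeImage&, const Pose& init) { return init; }
    void      updateMap(Map3D&, const std::vector<Event>&, const Pose&) {}

    // One iteration: tracking (image-to-model alignment), then mapping.
    void trackAndMap(Map3D& map, Pose& pose, const std::vector<Event>& batch) {
      EdgeImage observed = accumulateEvents(batch);   // events -> edge-like image
      EdgeImage model    = projectMap(map, pose);     // project current 3-D map
      pose = alignImages(model, observed, pose);      // update the pose estimate
      updateMap(map, batch, pose);                    // event-based 3-D reconstruction
    }

In the real system the two modules run concurrently, which is what makes the pipeline "parallel".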
Citations
Journal ArticleDOI
TL;DR: What is now the de-facto standard formulation for SLAM is presented, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers.
Abstract: Simultaneous Localization and Mapping (SLAM)consists in the concurrent construction of a model of the environment (the map), and the estimation of the state of the robot moving within it. The SLAM community has made astonishing progress over the last 30 years, enabling large-scale real-world applications, and witnessing a steady transition of this technology to industry. We survey the current state of SLAM. We start by presenting what is now the de-facto standard formulation for SLAM. We then review related work, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and tutorial to those who are users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues, that still deserve careful scientific investigation. The paper also contains the authors' take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? and Is SLAM solved?
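For reference, the "de-facto standard formulation" presented in the survey is maximum a posteriori estimation over a factor graph. In standard notation (a sketch, not a quote from the survey):

\[
\mathcal{X}^{\star} \;=\; \arg\max_{\mathcal{X}}\; p(\mathcal{X} \mid \mathcal{Z})
\;=\; \arg\min_{\mathcal{X}} \sum_{k} \big\lVert h_k(\mathcal{X}_k) - z_k \big\rVert^{2}_{\Omega_k},
\]

where \(\mathcal{X}\) stacks the robot trajectory and the map, each measurement \(z_k \in \mathcal{Z}\) has measurement model \(h_k\), and the Mahalanobis norm with information matrix \(\Omega_k\) follows from assuming Gaussian measurement noise.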

1,828 citations


Cites background from "EVO: A Geometric Approach to Event-Based 6-DOF Parallel Tracking and Mapping in Real Time"

  • ...high-speed motion [90] and high-dynamic range [132], [207], where standard cameras fail....


Journal ArticleDOI
TL;DR: This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras.
Abstract: Event cameras are bio-inspired sensors that differ from conventional frame cameras: Instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (in the order of μs), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in challenging scenarios for traditional cameras, such as low-latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
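For concreteness, the events mentioned here are usually described by the standard event-generation model: a pixel \(\mathbf{u}_k\) fires an event \(e_k = (\mathbf{u}_k, t_k, p_k)\) at time \(t_k\) as soon as the log-brightness change since its last event reaches a contrast threshold \(C\) (notation is ours, not quoted from the survey):

\[
L(\mathbf{u}_k, t_k) - L(\mathbf{u}_k, t_k - \Delta t_k) \;=\; p_k\, C, \qquad p_k \in \{+1, -1\},
\]

where \(L = \log I\) is the log intensity, \(\Delta t_k\) is the time elapsed since the previous event at that pixel, and the polarity \(p_k\) is the sign of the brightness change.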

697 citations


Cites background or methods from "EVO: A Geometric Approach to Event-Based 6-DOF Parallel Tracking and Mapping in Real Time"

  • ... this conversion step and directly recover the camera motion and scene structure from the events, as suggested by [128]; for example, by optimizing a function with photometric (i.e., event firing rate [26]) and inertial error terms, akin to VI-DSO [234] for standard cameras. Stereo event-based VIO is an unexplored topic, and it would be interesting to see how ideas from event-based depth estimation can...


  • ...measuring the strength of the scene edges. Recently, solutions to the full problem of event-based 3D SLAM for 6-DOF motions and natural scenes, not relying on additional sensing, have been proposed [25], [26] (Table 3). The approach in [25] extends [24] and consists of three interleaved probabilistic filters to perform pose tracking as well as depth and intensity estimation. However, it suffers from limited...


  • ...to trade off latency for efficiency, probabilistic filters [24], [25], [224] can operate on small groups of events. Other approaches are natively designed for groups, based for example on non-linear optimization [26], [127], [128], and run in real time on the CPU. Processing multiple events simultaneously is also beneficial to reduce noise. Opportunities: The above-mentioned SLAM methods lack loop-closure capabilities...


  • ...assumption of uncorrelated depth, intensity gradient, and camera motion. Furthermore, it is computationally intensive, requiring a GPU for real-time operation. In contrast, the semi-dense approach in [26] shows that intensity reconstruction is not needed for depth estimation or pose tracking. The approach has a geometric foundation: it performs space sweeping for 3D reconstruction [19] and edge-map alignment...


  • ...], [236] or using sparse signal processing with a patch-based learned dictionary that mapped events to image gradients, which were then Poisson-integrated [235]. Concurrently, the VO methods in [25], [26] extended the image reconstruction technique in [24] to 6-DOF camera motions by using the computed scene depth and poses: [25] used a robust variational regularizer to reduce noise and improve contrast...


Book
03 Jul 2020
TL;DR: This survey includes both the historically most relevant literature as well as the current state of the art on several specific topics, including recognition, reconstruction, motion estimation, tracking, scene understanding, and end-to-end learning for autonomous driving.
Abstract: Recent years have witnessed enormous progress in AI-related fields such as computer vision, machine learning, and autonomous vehicles. As with any rapidly growing field, it becomes increasingly difficult to stay up-to-date or enter the field as a beginner. While several survey papers on particular sub-problems have appeared, no comprehensive survey on problems, datasets, and methods in computer vision for autonomous vehicles has been published. This monograph attempts to narrow this gap by providing a survey on the state-of-the-art datasets and techniques. Our survey includes both the historically most relevant literature as well as the current state of the art on several specific topics, including recognition, reconstruction, motion estimation, tracking, scene understanding, and end-to-end learning for autonomous driving. Towards this goal, we analyze the performance of the state of the art on several challenging benchmarking datasets, including KITTI, MOT, and Cityscapes. Besides, we discuss open problems and current research challenges. To ease accessibility and accommodate missing references, we also provide a website that allows navigating topics as well as methods and provides additional information.

579 citations


Cites background from "EVO: A Geometric Approach to Event-Based 6-DOF Parallel Tracking and Mapping in Real Time"

  • ...Rebecq et al. (2016) propose an event-based 3D reconstruction algorithm to produce a parallel tracking and mapping pipeline that runs in real-time on the CPU....


Journal ArticleDOI
TL;DR: In this article, the authors present the dynamic and active-pixel vision sensor (DAVIS), which incorporates a conventional global-shutter camera and an event-based sensor in the same pixel array.
Abstract: New vision sensors, such as the dynamic and active-pixel vision sensor DAVIS, incorporate a conventional global-shutter camera and an event-based sensor in the same pixel array. These sensors have ...

370 citations

References
Proceedings ArticleDOI
09 May 2011
TL;DR: PCL (Point Cloud Library) is presented, an advanced and extensive approach to the subject of 3D perception that contains state-of-the-art algorithms for filtering, feature estimation, surface reconstruction, registration, model fitting and segmentation.
Abstract: With the advent of new, low-cost 3D sensing hardware such as the Kinect, and continued efforts in advanced point cloud processing, 3D perception gains more and more importance in robotics, as well as other fields. In this paper we present one of our most recent initiatives in the areas of point cloud perception: PCL (Point Cloud Library - http://pointclouds.org). PCL presents an advanced and extensive approach to the subject of 3D perception, and it's meant to provide support for all the common 3D building blocks that applications need. The library contains state-of-the-art algorithms for filtering, feature estimation, surface reconstruction, registration, model fitting and segmentation. PCL is supported by an international community of robotics and perception researchers. We provide a brief walkthrough of PCL including its algorithmic capabilities and implementation strategies.
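EVO uses this library for map post-processing (the radius filter quoted below). A minimal sketch of PCL's radius outlier removal follows; the parameter values are illustrative, not taken from the paper.

    #include <pcl/point_types.h>
    #include <pcl/point_cloud.h>
    #include <pcl/filters/radius_outlier_removal.h>

    int main() {
      // Input cloud; in practice filled with the semidense map points.
      pcl::PointCloud<pcl::PointXYZ>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZ>);
      pcl::PointCloud<pcl::PointXYZ>::Ptr filtered(new pcl::PointCloud<pcl::PointXYZ>);

      // Radius outlier removal: keep only points with enough neighbors nearby.
      pcl::RadiusOutlierRemoval<pcl::PointXYZ> ror;
      ror.setInputCloud(cloud);
      ror.setRadiusSearch(0.05);       // neighborhood radius in meters (illustrative)
      ror.setMinNeighborsInRadius(5);  // minimum neighbor count (illustrative)
      ror.filter(*filtered);
      return 0;
    }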

4,501 citations


"EVO: A Geometric Approach to Event-..." refers methods in this paper

  • ...We also apply a radius filter [21] to the resulting...


Proceedings ArticleDOI
13 Nov 2007
TL;DR: A system specifically designed to track a hand-held camera in a small AR workspace, with tracking and mapping processed in parallel threads on a dual-core computer, that produces detailed maps with thousands of landmarks which can be tracked at frame rate with accuracy and robustness rivalling that of state-of-the-art model-based systems.
Abstract: This paper presents a method of estimating camera pose in an unknown scene. While this has previously been attempted by adapting SLAM algorithms developed for robotic exploration, we propose a system specifically designed to track a hand-held camera in a small AR workspace. We propose to split tracking and mapping into two separate tasks, processed in parallel threads on a dual-core computer: one thread deals with the task of robustly tracking erratic hand-held motion, while the other produces a 3D map of point features from previously observed video frames. This allows the use of computationally expensive batch optimisation techniques not usually associated with real-time operation: The result is a system that produces detailed maps with thousands of landmarks which can be tracked at frame-rate, with an accuracy and robustness rivalling that of state-of-the-art model-based systems.
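The split described here, tracking and mapping as two parallel tasks sharing a map, is the architectural core. A minimal C++ skeleton of that structure (hypothetical, not PTAM's actual code):

    #include <atomic>
    #include <mutex>
    #include <thread>
    #include <vector>

    struct Keyframe {};  // an observed video frame with its pose
    struct Landmark {};  // a 3D point feature

    struct SharedMap {
      std::mutex mtx;
      std::vector<Keyframe> keyframes;
      std::vector<Landmark> landmarks;
    };

    std::atomic<bool> running{true};

    // Thread 1: robustly track erratic hand-held motion at frame rate.
    void trackingThread(SharedMap& map) {
      while (running) {
        std::lock_guard<std::mutex> lock(map.mtx);
        // ... localize the current frame against map.landmarks ...
      }
    }

    // Thread 2: build the map with computationally expensive batch optimisation.
    void mappingThread(SharedMap& map) {
      while (running) {
        std::lock_guard<std::mutex> lock(map.mtx);
        // ... add keyframes, refine map.landmarks (e.g., bundle adjustment) ...
      }
    }

    int main() {
      SharedMap map;
      std::thread tracker(trackingThread, std::ref(map));
      std::thread mapper(mappingThread, std::ref(map));
      running = false;  // stop immediately; a real system runs until shutdown
      tracker.join();
      mapper.join();
    }

Decoupling the threads is what lets the mapping side spend many milliseconds on batch optimisation without stalling frame-rate tracking.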

4,091 citations

Book ChapterDOI
06 Sep 2014
TL;DR: A novel direct tracking method which operates on \(\mathfrak{sim}(3)\), thereby explicitly detecting scale drift, and an elegant probabilistic solution to include the effect of noisy depth values in tracking are introduced.
Abstract: We propose a direct (feature-less) monocular SLAM algorithm which, in contrast to the current state of the art in direct methods, allows building large-scale, consistent maps of the environment. Along with highly accurate pose estimation based on direct image alignment, the 3D environment is reconstructed in real time as a pose graph of keyframes with associated semi-dense depth maps. These are obtained by filtering over a large number of pixelwise small-baseline stereo comparisons. The explicitly scale-drift-aware formulation allows the approach to operate on challenging sequences including large variations in scene scale. Major enablers are two key novelties: (1) a novel direct tracking method which operates on \(\mathfrak{sim}(3)\), thereby explicitly detecting scale drift, and (2) an elegant probabilistic solution to include the effect of noisy depth values into tracking. The resulting direct monocular SLAM system runs in real time on a CPU.
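Schematically, the direct tracking referred to above minimizes a variance-normalized photometric error over the semi-dense pixels (a sketch in the paper's spirit, not a verbatim quote):

\[
E(\xi) \;=\; \sum_{\mathbf{p} \in \Omega} \left\lVert \frac{r_{\mathbf{p}}^{2}(\xi)}{\sigma_{r_{\mathbf{p}}}^{2}} \right\rVert_{\delta},
\qquad
r_{\mathbf{p}}(\xi) \;=\; I_{\mathrm{ref}}(\mathbf{p}) - I\big(\omega(\mathbf{p},\, D_{\mathrm{ref}}(\mathbf{p}),\, \xi)\big),
\]

where \(\omega\) warps pixel \(\mathbf{p}\) with inverse depth \(D_{\mathrm{ref}}(\mathbf{p})\) under the transform \(\xi \in \mathfrak{sim}(3)\), \(\sigma_{r_{\mathbf{p}}}^{2}\) propagates the depth uncertainty into the residual (the probabilistic treatment of noisy depth mentioned above), and \(\lVert \cdot \rVert_{\delta}\) is the Huber norm.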

3,273 citations


"EVO: A Geometric Approach to Event-..." refers methods in this paper

  • ...Our tracking module relies on image-to-model alignment, which is also used in frame-based, direct VO pipelines [14], [15]....


Journal ArticleDOI
TL;DR: Presents an overview of image alignment since the Lucas-Kanade algorithm in a consistent framework, concentrating on the efficient inverse compositional algorithm and examining which extensions of Lucas-Kanade can be used with it without any significant loss of efficiency.
Abstract: Since the Lucas-Kanade algorithm was proposed in 1981, image alignment has become one of the most widely used techniques in computer vision. Applications range from optical flow and tracking to layered motion, mosaic construction, and face coding. Numerous algorithms have been proposed and a wide variety of extensions have been made to the original formulation. We present an overview of image alignment, describing most of the algorithms and their extensions in a consistent framework. We concentrate on the inverse compositional algorithm, an efficient algorithm that we recently proposed. We examine which of the extensions to Lucas-Kanade can be used with the inverse compositional algorithm without any significant loss of efficiency, and which cannot. In this paper, Part 1 in a series of papers, we cover the quantity approximated, the warp update rule, and the gradient descent approximation. In future papers, we will cover the choice of the error function, how to allow linear appearance variation, and how to impose priors on the parameters.
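At each iteration, the inverse compositional algorithm highlighted here solves (standard notation from this line of work):

\[
\Delta p \;=\; \arg\min_{\Delta p} \sum_{\mathbf{x}} \big[\, T(W(\mathbf{x}; \Delta p)) - I(W(\mathbf{x}; p)) \,\big]^{2},
\qquad
W(\mathbf{x}; p) \;\leftarrow\; W(\mathbf{x}; p) \circ W(\mathbf{x}; \Delta p)^{-1},
\]

where \(T\) is the template, \(I\) the input image, and \(W\) the warp with parameters \(p\). Because the linearization is taken on the template rather than on the image, the Jacobian and the Gauss-Newton Hessian are constant across iterations and can be precomputed, which is the source of the algorithm's efficiency.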

3,168 citations

Proceedings ArticleDOI
29 Sep 2014
TL;DR: A semi-direct monocular visual odometry algorithm that is precise, robust, and faster than current state-of-the-art methods and applied to micro-aerial-vehicle state-estimation in GPS-denied environments is proposed.
Abstract: We propose a semi-direct monocular visual odometry algorithm that is precise, robust, and faster than current state-of-the-art methods. The semi-direct approach eliminates the need for costly feature extraction and robust matching techniques for motion estimation. Our algorithm operates directly on pixel intensities, which results in subpixel precision at high frame rates. A probabilistic mapping method that explicitly models outlier measurements is used to estimate 3D points, which results in fewer outliers and more reliable points. Precise and high frame-rate motion estimation brings increased robustness in scenes of little, repetitive, and high-frequency texture. The algorithm is applied to micro-aerial-vehicle state estimation in GPS-denied environments and runs at 55 frames per second on the onboard embedded computer and at more than 300 frames per second on a consumer laptop. We call our approach SVO (Semi-direct Visual Odometry) and release our implementation as open-source software.
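Schematically, the semi-direct motion estimate comes from sparse image alignment: the relative pose between frames minimizes the photometric error of small patches around features whose depth is known (a sketch in standard notation, not a verbatim quote):

\[
T_{k,k-1} \;=\; \arg\min_{T}\; \frac{1}{2} \sum_{i} \big\lVert\, I_k\big(\pi(T\, \pi^{-1}(\mathbf{u}_i, d_i))\big) - I_{k-1}(\mathbf{u}_i) \,\big\rVert^{2},
\]

where \(\pi\) projects a 3D point into the image, \(\pi^{-1}(\mathbf{u}_i, d_i)\) back-projects pixel \(\mathbf{u}_i\) with depth \(d_i\), and the sum runs over patches around tracked features. Operating on a few such patches rather than all pixels is what makes the method fast while retaining subpixel direct alignment.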

1,814 citations


"EVO: A Geometric Approach to Event-..." refers background or methods in this paper

  • ...Since a motion-capture system is not available outdoors, we used a state-of-the-art VO method (SVO [14]) on the intensity frames of the DAVIS for comparison (Fig....


  • ...[14] C. Forster, M. Pizzoli, and D. Scaramuzza, “SVO: Fast semi-direct monocular visual odometry,” in Proc....


  • ...For comparison, SVO [14] uses up to 30 iterations....



  • ...Our tracking module relies on image-to-model alignment, which is also used in frame-based, direct VO pipelines [14], [15]....
