
Showing papers by Paul Newman published in 2016


Journal ArticleDOI
TL;DR: A survey of the visual place recognition research landscape is presented, introducing the concepts behind place recognition, how a “place” is defined in a robotics context, and the major components of a place recognition system.
Abstract: Visual place recognition is a challenging problem due to the vast range of ways in which the appearance of real-world places can vary. In recent years, improvements in visual sensing capabilities, an ever-increasing focus on long-term mobile robot autonomy, and the ability to draw on state-of-the-art research in other disciplines—particularly recognition in computer vision and animal navigation in neuroscience—have all contributed to significant advances in visual place recognition systems. This paper presents a survey of the visual place recognition research landscape. We start by introducing the concepts behind place recognition—the role of place recognition in the animal kingdom, how a “place” is defined in a robotics context, and the major components of a place recognition system. Long-term robot operations have revealed that changing appearance can be a significant factor in visual place recognition failure; therefore, we discuss how place recognition solutions can implicitly or explicitly account for appearance change within the environment. Finally, we close with a discussion on the future of visual place recognition, in particular with respect to the rapid advances being made in the related fields of deep learning, semantic scene understanding, and video description.
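
As a concrete illustration of the retrieval core that most place recognition systems share, the following hedged Python sketch matches a query image against mapped places via global descriptors and nearest-neighbour search. The describe() function is a stand-in for any real global descriptor (a bag-of-words vector, pooled CNN features) and is not taken from the survey itself.

```python
# Minimal sketch of the retrieval core of a place recognition system: each
# mapped place is summarised by a global image descriptor, and a query image
# is matched to the nearest stored descriptor.
import numpy as np

def describe(image):
    # Placeholder: flatten and L2-normalise the image as a crude descriptor.
    d = np.asarray(image, dtype=float).ravel()
    return d / (np.linalg.norm(d) + 1e-12)

class PlaceDatabase:
    def __init__(self):
        self.descriptors = []   # one descriptor per mapped place
        self.place_ids = []

    def add_place(self, place_id, image):
        self.place_ids.append(place_id)
        self.descriptors.append(describe(image))

    def recognise(self, query_image, threshold=0.8):
        """Return the best-matching place id, or None if no match is good enough."""
        q = describe(query_image)
        sims = np.array([q @ d for d in self.descriptors])  # cosine similarity
        best = int(np.argmax(sims))
        return self.place_ids[best] if sims[best] >= threshold else None
```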

933 citations


Proceedings ArticleDOI
16 May 2016
TL;DR: This paper trains place-specific linear SVM classifiers, mined without supervision from a single mapping dataset, to recognise distinctive elements of the environment for camera-only localisation in challenging outdoor conditions.
Abstract: This paper is about camera-only localisation in challenging outdoor environments, where changes in lighting, weather and season cause traditional localisation systems to fail. Conventional approaches to the localisation problem rely on point-features such as SIFT, SURF or BRIEF to associate landmark observations in the live image with landmarks stored in the map; however, these features are brittle to the severe appearance change routinely encountered in outdoor environments. In this paper, we propose an alternative to traditional point-features: we train place-specific linear SVM classifiers to recognise distinctive elements in the environment. The core contribution of this paper is an unsupervised mining algorithm which operates on a single mapping dataset to extract distinct elements from the environment for localisation. We evaluate our system on 205km of data collected from central Oxford over a period of six months in bright sun, night, rain, snow and at all times of the day. Our experiment consists of a comprehensive N-vs-N analysis on 22 laps of the approximately 10km route in central Oxford. With our proposed system, the portion of the route where localisation fails is reduced by a factor of 6, from 33.3% to 5.5%.
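
To make the classifier idea concrete, here is a hedged Python sketch, not the paper's implementation: it trains one linear SVM per mined element on HOG features of fixed-size grayscale patches and scores candidate patches from a live image. The mining step that selects which patches are distinctive is assumed to have already run.

```python
# One linear SVM per distinctive element: patches of that element are
# positives, random background patches are negatives. All patches must share
# the same size so the HOG vectors are comparable.
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def hog_features(patches):
    return np.array([hog(p, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
                     for p in patches])

def train_element_classifier(positive_patches, negative_patches):
    X = np.vstack([hog_features(positive_patches), hog_features(negative_patches)])
    y = np.hstack([np.ones(len(positive_patches)), np.zeros(len(negative_patches))])
    clf = LinearSVC(C=1.0)
    clf.fit(X, y)
    return clf

def score_live_patches(clf, live_patches):
    """Higher decision values indicate likely re-detections of the element."""
    return clf.decision_function(hog_features(live_patches))
```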

100 citations


Proceedings ArticleDOI
01 Oct 2016
TL;DR: A probabilistic method is presented for fusing sparse 3D LIDAR data with stereo images to produce accurate dense depth maps and uncertainty estimates in real time, achieving accuracy competitive with state-of-the-art stereo approaches and credible uncertainty estimates that do not misrepresent the true errors.
Abstract: Real-time 3D perception is critical for localisation, mapping, path planning and obstacle avoidance for mobile robots and autonomous vehicles. For outdoor operation in real-world environments, 3D perception is often provided by sparse 3D LIDAR scanners, which provide accurate but low-density depth maps, and dense stereo approaches, which require significant computational resources for accurate results. Here, taking advantage of the complementary error characteristics of LIDAR range sensing and dense stereo, we present a probabilistic method for fusing sparse 3D LIDAR data with stereo images to provide accurate dense depth maps and uncertainty estimates in real-time. We evaluate the method on data collected from a small urban autonomous vehicle and the KITTI dataset, providing accuracy results competitive with state-of-the-art stereo approaches and credible uncertainty estimates that do not misrepresent the true errors, and demonstrate real-time operation on a range of low-power GPU systems.
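
One simple way to realise the fusion idea is per-pixel inverse-variance weighting of two Gaussian depth measurements. The numpy sketch below illustrates that principle; it is an assumption-laden simplification, not the paper's exact probabilistic model.

```python
# At each pixel, treat the stereo depth and the (sparse) LIDAR depth as
# independent Gaussian measurements; the fused estimate is their
# inverse-variance weighted mean.
import numpy as np

def fuse_depths(stereo_d, stereo_var, lidar_d, lidar_var, lidar_valid):
    """All inputs are HxW arrays; lidar_valid marks pixels with a LIDAR return.
    lidar_var should hold any positive placeholder where lidar_valid is False."""
    w_s = 1.0 / stereo_var
    w_l = np.where(lidar_valid, 1.0 / lidar_var, 0.0)
    fused = (w_s * stereo_d + w_l * np.where(lidar_valid, lidar_d, 0.0)) / (w_s + w_l)
    fused_var = 1.0 / (w_s + w_l)   # uncertainty shrinks where both sensors agree
    return fused, fused_var
```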

97 citations


Proceedings ArticleDOI
19 Jun 2016
TL;DR: This paper summarizes the work of the V-Charge project, comprising advances in network communication and parking space scheduling, multi-camera calibration, semantic mapping concepts, visual localization and motion planning; the project pushed visual localization, environment perception and automated parking to centimetre precision.
Abstract: Automated valet parking services provide great potential to increase the attractiveness of electric vehicles by mitigating their two main current deficiencies: reduced driving ranges and prolonged refueling times. The European research project V-Charge aims at providing this service on designated parking lots using close-to-market sensors only. For this purpose the project developed a prototype capable of performing fully automated navigation in mixed traffic on designated parking lots and GPS-denied parking garages with cameras and ultrasonic sensors only. This paper summarizes the work of the project, comprising advances in network communication and parking space scheduling, multi-camera calibration, semantic mapping concepts, visual localization and motion planning. The project pushed visual localization, environment perception and automated parking to centimetre precision. The developed infrastructure-based camera calibration and semi-supervised semantic mapping concepts greatly reduce maintenance efforts. Results are presented from extensive month-long field tests.

56 citations


Proceedings ArticleDOI
16 May 2016
TL;DR: A calibration method is proposed that automatically estimates, from natural scenes, the extrinsic calibration of a sensor pose-graph representing a system of lidars and cameras without sensor co-visibility constraints.
Abstract: We propose a calibration method that automatically estimates the extrinsic calibration of a sensor pose-graph from natural scenes. The sensor pose-graph represents a system of sensors, comprising lidars and cameras, without sensor co-visibility constraints. The method addresses the fact that each scene contributes differently to the calibration problem by introducing a diligent scene selection scheme. The algorithm searches over all scenes to extract a subset of exemplars, whose joint optimisation yields progressively better calibration estimates. This non-parametric method requires no knowledge of the physical world, and continuously finds scenes that better constrain the optimisation parameters. We explain the theory, implement the method, and provide detailed performance analyses with experiments on real-world data.
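
The exemplar-selection loop can be pictured as a greedy search. The sketch below is a hypothetical rendering of that idea, in which calibrate() and residual() stand in for the paper's pose-graph optimisation and its fitness measure.

```python
# Greedy scene selection: repeatedly add the candidate scene whose inclusion
# most improves the joint calibration estimate, stopping when no scene helps.
def select_exemplars(scenes, calibrate, residual, max_exemplars=10):
    exemplars = []
    remaining = list(scenes)
    best_score = float("inf")
    while remaining and len(exemplars) < max_exemplars:
        # Try each remaining scene and keep the one that helps most.
        trials = [(residual(calibrate(exemplars + [s])), s) for s in remaining]
        score, best = min(trials, key=lambda t: t[0])
        if score >= best_score:
            break   # no scene improves the estimate any further
        best_score = score
        exemplars.append(best)
        remaining.remove(best)
    return exemplars, best_score
```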

29 citations


Proceedings ArticleDOI
01 Oct 2016
TL;DR: This paper examines ways in which vehicles, considered as independent agents, can share, update and leverage each other's visual experiences in a mutually beneficial way, underpinning long-term operations of fleets of vehicles using visual localisation.
Abstract: This paper is about underpinning long-term operations of fleets of vehicles using visual localisation. In particular it examines ways in which vehicles, considered as independent agents, can share, update and leverage each other's visual experiences in a mutually beneficial way. We draw on our previous work in Experience-based Navigation (EBN) [1], in which a visual map supporting multiple representations of the same place is built, yielding real-time localisation capability for a solitary vehicle. We now consider how any number of such agents might operate in concert via data sharing policies that are germane to the shared task of lifelong localisation. We rapidly construct considerable maps by the conjoining of work distributed to asynchronous processes, and share expertise amongst the team by the selective dispensing of mission-specific map contents. We demonstrate and evaluate our system against 100km of data collected in North Oxford over a period of a month, featuring diverse variation in appearance due to atmospheric, lighting, and structural dynamics. We show that our framework is capable of creating maps in a fraction of the time required by single-agent EBN, with no significant loss in localisation robustness, and is able to furnish robots on real-world forays with maps which require much less storage.

20 citations


Posted Content
TL;DR: An automatic, quantitative framework is presented to aid designers in exploring resource-performance trade-offs and finding schedules for mobile robots, guided by questions such as “what is the minimum resource budget required to achieve a given level of performance?”
Abstract: The design of mobile autonomous robots is challenging due to limited on-board resources such as processing power and energy. A promising approach is to generate intelligent scheduling policies that trade off reduced resource consumption for a slightly lower but still acceptable level of performance. In this paper, we provide a framework to aid designers in exploring such resource-performance trade-offs and finding schedules for mobile robots, guided by questions such as "given a resource budget, what guarantees can be provided on achievable performance?" The framework is based on a quantitative multi-objective verification technique which, for a collection of possibly conflicting objectives, produces the Pareto front that contains all the optimal trade-offs that are achievable. The designer then selects a specific Pareto point based on the resource constraints and desired performance level, and a correct-by-construction schedule that meets those constraints is automatically generated. We demonstrate the efficacy of this framework on several scenarios of a robot with complex dynamics, with encouraging results.
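
The Pareto-front machinery itself is easy to illustrate. The sketch below filters candidate schedules, each scored on resource cost and performance by some external analysis, down to the non-dominated trade-offs, and then picks the cheapest point meeting a performance target. It is a toy stand-in for the verification technique, not part of it.

```python
# Keep only non-dominated (cost, performance) trade-offs: lower cost and
# higher performance are better.
def pareto_front(candidates):
    """candidates: list of (cost, performance, schedule) tuples."""
    front = []
    for c_cost, c_perf, c_sched in candidates:
        dominated = any(o_cost <= c_cost and o_perf >= c_perf and
                        (o_cost < c_cost or o_perf > c_perf)
                        for o_cost, o_perf, _ in candidates)
        if not dominated:
            front.append((c_cost, c_perf, c_sched))
    return sorted(front, key=lambda p: p[0])

# A designer would then pick the cheapest point meeting a performance target:
def pick(front, min_performance):
    feasible = [p for p in front if p[1] >= min_performance]
    return min(feasible, key=lambda p: p[0]) if feasible else None
```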

17 citations


Book ChapterDOI
01 Jan 2016
TL;DR: This paper proposes a method to execute the optimisation and regularisation in a 3D volume which has been only partially observed, thereby avoiding inappropriate interpolation and extrapolation, and offers an empirical analysis of the precision of the reconstructions.
Abstract: This paper is about dense regularised mapping using a single camera as it moves through large work spaces. Our technique is, as many are, a depth-map fusion approach. However, our desire to work both at large scales and outdoors precludes the use of RGB-D cameras. Instead, we need to work with the notoriously noisy depth maps produced from small sets of sequential camera images with known inter-frame poses. This, in turn, requires the application of a regulariser over the 3D surface induced by the fusion of multiple (of order 100) depth maps. We accomplish this by building and managing a cube of voxels. The combination of issues arising from noisy depth maps and moving through our workspace/voxel cube, so it envelops us, rather than orbiting around it as is common in desktop reconstructions, forces the algorithmic contribution of our work. Namely, we propose a method to execute the optimisation and regularisation in a 3D volume which has been only partially observed and thereby avoiding inappropriate interpolation and extrapolation. We demonstrate our technique indoors and outdoors and offer empirical analysis of the precision of the reconstructions.
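
A toy numpy sketch of the partial-volume idea follows: a smoothing update touches only observed voxels, and differences are taken only between pairs of observed neighbours, so nothing is interpolated into unseen space. The paper's actual regulariser over the fused depth-map volume is more sophisticated; this is an assumption-labelled illustration of the masking principle only.

```python
# Smooth a voxel volume while respecting an "observed" mask: unobserved
# voxels are never read from or written to.
import numpy as np

def masked_smooth(volume, observed, lam=0.2, iters=50):
    """volume: 3D float array; observed: 3D bool array of the same shape."""
    f = volume.copy()
    for _ in range(iters):
        acc = np.zeros_like(f)
        cnt = np.zeros_like(f)
        for axis in range(f.ndim):
            for shift in (1, -1):
                # Note: np.roll wraps at the borders; a real implementation
                # would pad the volume instead.
                nb = np.roll(f, shift, axis=axis)
                nb_obs = np.roll(observed, shift, axis=axis)
                valid = observed & nb_obs          # both sides of the edge seen
                acc += np.where(valid, nb, 0.0)
                cnt += valid
        upd = acc / np.maximum(cnt, 1.0)
        # Blend towards the neighbour average, but only at observed voxels.
        f = np.where(observed & (cnt > 0), (1 - lam) * f + lam * upd, f)
    return f
```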

15 citations


Posted Content
TL;DR: This paper provides the theory and system needed to create city-scale dense reconstructions from range data and, in particular, stereo cameras, applying a regularizer over a compressed 3D data structure while dealing with the complex boundary conditions this induces during the data-fusion stage.
Abstract: This paper is about the efficient generation of dense, colored models of city-scale environments from range data and in particular, stereo cameras. Better maps make for better understanding; better understanding leads to better robots, but this comes at a cost. The computational and memory requirements of large dense models can be prohibitive. We provide the theory and the system needed to create city-scale dense reconstructions. To do so, we apply a regularizer over a compressed 3D data structure while dealing with the complex boundary conditions this induces during the data-fusion stage. We show that only with these considerations can we swiftly create neat, large, "well behaved" reconstructions. We evaluate our system using the KITTI dataset and provide statistics for the metric errors in all surfaces created compared to those measured with 3D laser. Our regularizer reduces the median error by 40% in 3.4 km of dense reconstructions with a median accuracy of 6 cm. For subjective analysis, we provide a qualitative review of 6.1 km of our dense reconstructions in an attached video. These are the largest dense reconstructions from a single passive camera we are aware of in the literature.

11 citations


Proceedings ArticleDOI
01 Oct 2016
TL;DR: An on-line system that discovers and drives collision-free traversable paths using a variational approach to dense stereo vision; it is lightweight, runs on low-cost hardware and is remarkably quick to predict the scene semantics.
Abstract: In this paper we propose an on-line system that discovers and drives collision-free traversable paths, using a variational approach to dense stereo vision. Our system is lightweight, can be run on low-cost hardware and is remarkably quick to predict the semantics. In addition to the scene's path affordance it yields a segmentation of the local scene as a composite of distinctive labels, e.g. ground, sky, obstacles and vegetation. To estimate the labels, we use a very fast and lightweight (shallow) image classifier that combines informative feature channels derived from colour images and dense depth-map estimates. Unlike other approaches, we do not use local descriptors around pixel features. Instead, we encompass the label-predicted probabilities within a variational approach to image segmentation. Akin to dense depth-map estimation, we obtain semantically segmented images by means of convex regularisation. We show how our system can rapidly obtain the required semantics and paths at VGA resolution. Extensive experiments on the KITTI dataset demonstrate the robustness of our system in deriving collision-free local routes. An accompanying video shows the robustness of the system during live execution in an outdoor experiment.
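
The per-pixel labelling stage can be sketched as follows, a hedged Python illustration in which the feature channels (colour, depth and an image-row proxy for height) are assumptions, and a logistic-regression classifier stands in for the paper's shallow classifier. The subsequent convex-regularisation smoothing of the label probabilities is not shown.

```python
# Stack informative channels from the colour image and the dense depth map,
# then apply a shallow (linear) classifier at every pixel.
import numpy as np
from sklearn.linear_model import LogisticRegression

def pixel_features(rgb, depth):
    """rgb: HxWx3 float array, depth: HxW float array -> (H*W)xC features."""
    h, w, _ = rgb.shape
    height_proxy = np.tile(np.linspace(0, 1, h)[:, None], (1, w))  # image row
    chans = [rgb[..., 0], rgb[..., 1], rgb[..., 2], depth, height_proxy]
    return np.stack([c.ravel() for c in chans], axis=1)

def train(rgb, depth, labels):
    clf = LogisticRegression(max_iter=1000)
    clf.fit(pixel_features(rgb, depth), labels.ravel())
    return clf

def label_probabilities(clf, rgb, depth):
    h, w, _ = rgb.shape
    p = clf.predict_proba(pixel_features(rgb, depth))
    return p.reshape(h, w, -1)   # to be fed to the variational smoother
```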

11 citations


Book ChapterDOI
01 Jan 2016
TL;DR: This paper systematically identifies the data requirements of field robotics applications, designs a relational database capable of meeting their demands, and describes and demonstrates how the system is used to manage over 50TB of data collected over a period of 4 years.
Abstract: Field robotics applications have some unique and unusual data requirements—the curating, organisation and management of which are often overlooked. An emerging theme is the use of large corpora of spatiotemporally indexed sensor data which must be searched and leveraged both offline and online. Increasingly we build systems that must never stop learning. Every sortie requires swift, intelligent read-access to gigabytes of memories and the ability to augment the totality of stored experiences by writing new memories. This however leads to vast quantities of data which quickly become unmanageable, especially when we want to find what is relevant to our needs. The current paradigm of collecting data for specific purposes and storing them in ad-hoc ways will not scale to meet this challenge. In this paper we present the design and implementation of a data management framework that is capable of dealing with large datasets and provides functionality required by many offline and online robotics applications. We systematically identify the data requirements of these applications and design a relational database that is capable of meeting their demands. We describe and demonstrate how we use the system to manage over 50TB of data collected over a period of 4 years.
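
A hypothetical miniature of such a schema is sketched below, using sqlite3 for self-containedness: sorties own sensor records, and every record is indexed by timestamp and pose so that both offline and online queries can quickly find what is relevant. All table and column names here are illustrative, not the paper's.

```python
# Spatiotemporally indexed sensor records; bulky raw data stays on disk and
# the database holds only metadata and paths.
import sqlite3

conn = sqlite3.connect("robot_data.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS sortie (
    sortie_id   INTEGER PRIMARY KEY,
    started_utc REAL NOT NULL,
    platform    TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS sensor_record (
    record_id   INTEGER PRIMARY KEY,
    sortie_id   INTEGER NOT NULL REFERENCES sortie(sortie_id),
    sensor      TEXT NOT NULL,          -- e.g. 'lidar', 'stereo_left'
    stamp_utc   REAL NOT NULL,
    x REAL, y REAL, z REAL,             -- pose at capture time
    blob_path   TEXT NOT NULL           -- raw data lives outside the DB
);
CREATE INDEX IF NOT EXISTS idx_time  ON sensor_record(stamp_utc);
CREATE INDEX IF NOT EXISTS idx_space ON sensor_record(x, y);
""")

# Example spatiotemporal query: everything captured near (ox, oy) in a window.
t0, t1, ox, oy = 0.0, 3600.0, 10.0, 20.0   # illustrative values
rows = conn.execute(
    "SELECT blob_path FROM sensor_record "
    "WHERE stamp_utc BETWEEN ? AND ? AND x BETWEEN ? AND ? AND y BETWEEN ? AND ?",
    (t0, t1, ox - 50, ox + 50, oy - 50, oy + 50)).fetchall()
```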

Book ChapterDOI
01 Jan 2016
TL;DR: A novel state formulation is proposed that captures joint estimates of the sensor pose, a local static background and dynamic states of moving objects, and a new hierarchical data association algorithm to associate raw laser measurements to observable states.
Abstract: This paper presents a unified and model-free framework for the detection and tracking of dynamic objects with 2D laser range finders in an autonomous driving scenario. A novel state formulation is proposed that captures joint estimates of the sensor pose, a local static background and dynamic states of moving objects. In addition, we contribute a new hierarchical data association algorithm to associate raw laser measurements to observable states, and within which, a new variant of the Joint Compatibility Branch and Bound (JCBB) algorithm is introduced for problems with large numbers of measurements. The system is calibrated systematically on 7.5K labeled object examples and evaluated on 6K test cases, and is shown to greatly outperform an existing industry standard targeted at the same problem domain.
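
At the heart of JCBB-style association is a joint compatibility test. The hedged sketch below gates a hypothesised set of measurement-to-state pairings on the stacked Mahalanobis distance of their innovations against a chi-square threshold; the branch-and-bound search over pairings, and the paper's variant for large measurement sets, are omitted.

```python
# Accept a hypothesis only if the joint Mahalanobis distance of the stacked
# innovations falls inside a chi-square gate.
import numpy as np
from scipy.stats import chi2

def jointly_compatible(innovations, joint_cov, alpha=0.99):
    """innovations: stacked residual vector (k,); joint_cov: (k, k) covariance."""
    v = np.asarray(innovations, dtype=float)
    d2 = v @ np.linalg.solve(joint_cov, v)       # joint Mahalanobis distance
    return d2 <= chi2.ppf(alpha, df=v.size)
```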

Proceedings ArticleDOI
16 May 2016
TL;DR: This work proposes a method to process a sequence of laser point clouds and back-fill dense surfaces into gaps caused by removing objects from the scene - a valuable tool in scenarios where resource constraints permit only one mapping pass in a particular region.
Abstract: In mobile robotics applications, generation of accurate static maps is encumbered by the presence of ephemeral objects such as vehicles, pedestrians, or bicycles. We propose a method to process a sequence of laser point clouds and back-fill dense surfaces into gaps caused by removing objects from the scene - a valuable tool in scenarios where resource constraints permit only one mapping pass in a particular region. Our method processes laser scans in a three-dimensional voxel grid using the Truncated Signed Distance Function (TSDF) and then uses a Total Variation (TV) regulariser with a Kernel Conditional Density Estimation (KCDE) “soft” data term to interpolate missing surfaces. Using four scenarios captured with a push-broom 2D laser, our technique infills approximately 20 m² of missing surface area for each removed object. Our reconstruction's median error ranges between 5.64 cm and 9.24 cm, with standard deviations between 4.57 cm and 6.08 cm.

Patent
21 Apr 2016
TL;DR: A method and system of processing a series of images to identify at least a portion of an object of interest within the images, wherein each image represents at least part of an environment.
Abstract: A method and system of processing a series of images to identify at least a portion of an object of interest within the images, wherein each image represents at least a part of an environment. The method comprises obtaining a first two-dimensional (2D) image, in which at least one point, having a predetermined property, is labelled as forming at least part of the object of interest; segmenting the first 2D image to identify the at least one region corresponding to the at least one labelled point, to identify the at least a portion of the object of interest within the first 2D image; obtaining a second 2D image of the environment; propagating at least a portion of the region from the first 2D image to the second 2D image using three-dimensional (3D) geometric data; and segmenting the second 2D image to identify the at least one region having the predetermined property in the second 2D image, thereby identifying the at least a portion of the object of interest in the second 2D image.
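
The propagation step claimed above can be illustrated with a standard pinhole projection: 3D points tied to the labelled region in the first image are transformed into the second camera's frame and projected to seed the second segmentation. K, R and t below are assumed calibrated intrinsics and relative pose; this is a sketch of the geometric idea, not the patented method.

```python
# Transfer object-region seeds from view one to view two via 3D geometry.
import numpy as np

def propagate_labels(points_3d, K, R, t, image_shape):
    """points_3d: Nx3 points (first camera frame) belonging to the object.
    Returns integer pixel coordinates of the seeds visible in image two."""
    p2 = points_3d @ R.T + t          # into the second camera's frame
    p2 = p2[p2[:, 2] > 0]             # keep points in front of the camera
    uv = p2 @ K.T                     # pinhole projection
    uv = uv[:, :2] / uv[:, 2:3]
    h, w = image_shape
    ok = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return uv[ok].astype(int)         # seed pixels for segmenting image two
```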

Book ChapterDOI
01 Jan 2016
TL;DR: An Augmented Lagrangian (AL) method is proposed that markedly reduces the number of iterations required for convergence, a reduction of more than 50% in all cases compared with the state-of-the-art approach; part of this significant saving is invested in improving the accuracy of the depth map.
Abstract: The estimation of dense depth maps has become a fundamental module in the pipeline of many visual-based navigation and planning systems. The motivation of our work is to achieve fast and accurate in-situ infrastructure modelling from a monocular camera mounted on an autonomous car. Our technical contribution is in the application of a Lagrangian-multipliers-based formulation to minimise an energy that combines a non-convex data term with adaptive pixel-wise regularisation to yield the final local reconstruction. We advocate the use of constrained optimisation for this task; we shall show it is swift, accurate and simple to implement. Specifically we propose an Augmented Lagrangian (AL) method that markedly reduces the number of iterations required for convergence, a reduction of more than 50% in all cases in comparison to the state-of-the-art approach. As a result, part of this significant saving is invested in improving the accuracy of the depth map. We introduce a novel per-pixel inverse depth uncertainty estimation that allows us to apply adaptive regularisation to the initial depth map: highly informative inverse-depth pixels require less regularisation, while their influence can be propagated to more uncertain regions, providing a significant improvement in textureless regions. To illustrate the benefits of our approach, we ran our experiments on three synthetic datasets with perfect ground truth for textureless scenes. An exhaustive analysis shows that AL can speed up convergence by up to 90%, achieving less than 4 cm of error. In addition, we demonstrate the application of the proposed approach on a challenging urban outdoor dataset exhibiting a very diverse and heterogeneous structure.
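
To show the augmented-Lagrangian machinery itself, divorced from the depth-map energy, here is a toy Python example minimising x1^2 + x2^2 subject to x1 + x2 = 1: each outer iteration solves an unconstrained subproblem and performs dual ascent on the multiplier, which is the mechanism that buys the fast convergence the paper exploits.

```python
# Generic augmented-Lagrangian loop for a single equality constraint h(x) = 0.
import numpy as np
from scipy.optimize import minimize

def augmented_lagrangian(f, h, x0, rho=1.0, iters=20):
    """Minimise f(x) subject to h(x) = 0 via multiplier updates."""
    x, lam = np.asarray(x0, dtype=float), 0.0
    for _ in range(iters):
        def L(z):  # augmented Lagrangian for the current multiplier estimate
            c = h(z)
            return f(z) + lam * c + 0.5 * rho * c * c
        x = minimize(L, x).x          # unconstrained inner solve
        lam += rho * h(x)             # dual ascent on the multiplier
    return x

x = augmented_lagrangian(lambda z: z @ z, lambda z: z.sum() - 1.0, [0.0, 0.0])
print(x)   # approaches the constrained optimum (0.5, 0.5)
```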

Posted Content
TL;DR: This paper provides a framework to aid designers in exploring resource-performance trade-offs and finding schedules for mobile robots, guided by questions such as "what is the minimum resource budget required to achieve a given level of performance?"
Abstract: The design of mobile autonomous robots is challenging due to the limited on-board resources such as processing power and energy. A promising approach is to generate intelligent schedules that reduce the resource consumption while maintaining best performance, or more interestingly, to trade off reduced resource consumption for a slightly lower but still acceptable level of performance. In this paper, we provide a framework to aid designers in exploring such resource-performance trade-offs and finding schedules for mobile robots, guided by questions such as "what is the minimum resource budget required to achieve a given level of performance?" The framework is based on a quantitative multi-objective verification technique which, for a collection of possibly conflicting objectives, produces the Pareto front that contains all the optimal trade-offs that are achievable. The designer then selects a specific Pareto point based on the resource constraints and desired performance level, and a correct-by-construction schedule that meets those constraints is automatically generated. We demonstrate the efficacy of this framework on several robotic scenarios in both simulations and experiments with encouraging results.

Proceedings ArticleDOI
16 May 2016
TL;DR: This paper offers a formulation of the problem which, naturally and in a unified way, captures the variety of architectural constraints that can be discovered and applied in urban reconstructions, and demonstrates the approach in an end-to-end implementation.
Abstract: This paper is about discovering and leveraging architectural constraints in large scale 3D reconstructions using laser. Our contribution is to offer a formulation of the problem which, naturally and in a unified way, captures the variety of architectural constraints that can be discovered and applied in urban reconstructions. We focus in particular on the case of survey construction with a push-broom laser + VO system. Here visual odometry is combined with vertical 2D scans to create a 3D picture of the environment. A key characteristic here is that the sensors pass/sweep swiftly through the environment such that elements of the scene are seen only briefly by cameras and scanned just once by the laser. These qualities make for an ill-constrained optimisation problem which is greatly aided if architectural constraints can be discovered and appropriately applied. We demonstrate our approach in an end-to-end implementation which discovers salient architectural constraints and rejects false loop closures before invoking an optimisation to return a 3D model of the workspace. We evaluate the precision of this model by comparison to a ground truth provided by a third-party professional survey using high-end (static) 3D laser scanners.
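
One family of architectural constraints, planar surfaces, can be discovered with a standard RANSAC plane fit. The sketch below illustrates only that discovery step, with arbitrarily chosen thresholds; injecting the detected planes as constraints into the survey optimisation is not shown.

```python
# RANSAC plane detection over laser points: sample three points, fit a plane,
# count inliers, and keep the best model.
import numpy as np

def ransac_plane(points, iters=500, tol=0.05, rng=np.random.default_rng(0)):
    """points: Nx3 laser points. Returns (unit normal, offset, inlier mask)."""
    best_inliers, best_model = None, None
    for _ in range(iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(n)
        if norm < 1e-9:
            continue                   # degenerate (collinear) sample
        n = n / norm
        d = -n @ sample[0]
        inliers = np.abs(points @ n + d) < tol
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (n, d)
    return best_model[0], best_model[1], best_inliers
```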

Patent
21 Apr 2016
TL;DR: A system and method of generating a 3D representation of an environment comprising a plurality of points, substantially each point having an estimated depth relative to a reference.
Abstract: A system and method of generating a 3D representation of an environment, the 3D representation comprising a plurality of points, substantially each point having an estimated depth of that point relative to a reference. The method comprises the following steps: i) obtaining a depth-map generated from the environment; ii) calculating a certainty value for the estimated depths of at least some of the points within the depth-map; iii) for points having a certainty value below a first threshold, using a geometric assumption of the environment together with depth information for points having a certainty value above a second threshold to calculate a new estimated depth for those points below the first threshold; and iv) generating the 3D representation of the environment using the new estimated depths for points having a certainty value below the first threshold and the estimated depths from the depth-map for points having a certainty value above the first threshold.
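
A hedged sketch of the claim follows: low-certainty depths are re-estimated from a geometric assumption fitted to high-certainty pixels. The assumption here is deliberately simple, depth varying linearly with image row as it roughly does on a ground plane; the thresholds and the planar model are illustrative choices, not the patent's.

```python
# Replace low-certainty depths with values from a geometric model fitted
# only to high-certainty pixels.
import numpy as np

def fill_depths(depth, certainty, low_thresh=0.3, high_thresh=0.7):
    h, w = depth.shape
    rows = np.tile(np.arange(h)[:, None], (1, w)).astype(float)
    trusted = certainty > high_thresh
    # Fit depth = a*row + b to the trusted pixels (least squares).
    A = np.stack([rows[trusted], np.ones(trusted.sum())], axis=1)
    a, b = np.linalg.lstsq(A, depth[trusted], rcond=None)[0]
    filled = depth.copy()
    uncertain = certainty < low_thresh
    filled[uncertain] = a * rows[uncertain] + b   # geometric fallback
    return filled
```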