Deep learning features at scale for visual place recognition

doi:10.1109/ICRA.2017.7989366

Proceedings Article (Open Access)

Zetao Chen, +7 more

- pp 3223-3230

TLDR

This paper trains, at large scale, two CNN architectures for the specific place recognition task and employs a multi-scale feature encoding method to generate condition- and viewpoint-invariant features.

Abstract:

The success of deep learning techniques in the computer vision domain has triggered a range of initial investigations into their utility for visual place recognition, all using generic features from networks that were trained for other types of recognition tasks. In this paper, we train, at large scale, two CNN architectures for the specific place recognition task and employ a multi-scale feature encoding method to generate condition- and viewpoint-invariant features. To enable this training to occur, we have developed a massive Specific PlacEs Dataset (SPED) with hundreds of examples of place appearance change at thousands of different places, as opposed to the semantic place type datasets currently available. This new dataset enables us to set up a training regime that interprets place recognition as a classification problem. We comprehensively evaluate our trained networks on several challenging benchmark place recognition datasets and demonstrate that they achieve an average 10% increase in performance over other place recognition algorithms and pre-trained CNNs. By analyzing the network responses and their differences from pre-trained networks, we provide insights into what a network learns when training for place recognition, and what these results signify for future research in this area.
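To make the idea of a multi-scale feature encoding concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the encoding, so the sketch assumes a spatial-pyramid-style max pooling over a convolutional feature map, followed by L2 normalisation and cosine-similarity matching. The function names `multi_scale_descriptor` and `best_match`, the `levels` parameter, and the feature-map shape are all hypothetical.

```python
import numpy as np


def multi_scale_descriptor(feature_map, levels=(1, 2, 4)):
    """Pool a (C, H, W) feature map over 1x1, 2x2 and 4x4 grids and concatenate.

    Assumption: max pooling per grid cell; the paper's actual encoding may differ.
    """
    channels, height, width = feature_map.shape
    pooled = []
    for cells in levels:
        # Cell boundaries along each axis (cells must not exceed H or W).
        rows = np.linspace(0, height, cells + 1).astype(int)
        cols = np.linspace(0, width, cells + 1).astype(int)
        for i in range(cells):
            for j in range(cells):
                cell = feature_map[:, rows[i]:rows[i + 1], cols[j]:cols[j + 1]]
                # One value per channel for this cell.
                pooled.append(cell.max(axis=(1, 2)))
    descriptor = np.concatenate(pooled)
    norm = np.linalg.norm(descriptor)
    # L2-normalise so that a dot product equals cosine similarity.
    return descriptor / norm if norm > 0 else descriptor


def best_match(query, references):
    """Return the index of the most similar reference descriptor and all scores."""
    scores = references @ query
    return int(np.argmax(scores)), scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for CNN activations of five reference places (hypothetical data).
    maps = rng.standard_normal((5, 8, 8, 8))
    references = np.stack([multi_scale_descriptor(m) for m in maps])
    index, _ = best_match(multi_scale_descriptor(maps[2]), references)
    print("best matching place:", index)
```

Coarse grid levels summarise the whole image and tolerate viewpoint shifts, while finer levels keep some spatial layout. That trade-off is the usual motivation for pooling at several scales.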

Citations

Journal Article

SeqSLAM: Visual route-based navigation for sunny summer days and stormy winter nights

Michael Milford, +1 more

- 01 Jan 2012 -

Science & Engineering Faculty

TL;DR: A new approach to visual navigation under changing conditions, dubbed SeqSLAM, which removes the need for global matching performance by the vision front-end; instead, it must only pick the best match within any short sequence of images.

Proceedings Article

Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions

Torsten Sattler, +14 more

TL;DR: This paper introduces the first benchmark datasets specifically designed for analyzing the impact of day-night changes and of weather and seasonal variations on visual localization, and it points to sequence-based localization approaches and better local features as directions for future work.

Journal Article

Performance Analysis of Google Colaboratory as a Tool for Accelerating Deep Learning Applications

Tiago Carneiro, +5 more

- 08 Oct 2018 -

IEEE Access

TL;DR: This paper presents a detailed analysis of Colaboratory regarding hardware resources, performance, and limitations and shows that the performance reached using this cloud service is equivalent to the performance of the dedicated testbeds, given similar resources.

Proceedings Article

Semantic Visual Localization

Johannes L. Schönberger, +3 more

TL;DR: This paper uses a joint 3D geometric and semantic understanding of the world for robust visual localization under a wide range of viewing conditions, succeeding under conditions where previous approaches failed.

Proceedings Article

Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition

Stephen Hausler, +4 more

TL;DR: Patch-NetVLAD combines the advantages of both local and global descriptor methods by deriving patch-level features from NetVLAD residuals, which enables aggregation and matching of deep-learned local features defined over the feature-space grid.

References

Journal Article

SeqSLAM: Visual route-based navigation for sunny summer days and stormy winter nights

Michael Milford, +1 more

- 01 Jan 2012 -

Science & Engineering Faculty

TL;DR: A new approach to visual navigation under changing conditions, dubbed SeqSLAM, which removes the need for global matching performance by the vision front-end; instead, it must only pick the best match within any short sequence of images.

Proceedings Article

Aggregating Local Deep Features for Image Retrieval

Artem Babenko, +1 more

TL;DR: This paper shows that deep features and traditional hand-engineered features have quite different distributions of pairwise similarities, so existing aggregation methods have to be carefully re-evaluated. It also reveals that, in contrast to shallow features, the simple aggregation method based on sum pooling provides the best performance for deep convolutional features.

Journal Article

Fast and Incremental Method for Loop-Closure Detection Using Bags of Visual Words

Adrien Angeli, +3 more

- 01 Oct 2008 -

IEEE Transactions on Robotics

TL;DR: This work presents an online method that detects when an image comes from an already perceived scene, using local shape and color information. It extends the bag-of-words method used in image classification to incremental conditions and relies on Bayesian filtering to estimate loop-closure probability.

Proceedings Article

On the performance of ConvNet features for place recognition

Niko Sünderhauf, +4 more

TL;DR: This paper evaluates and compares the utility of three state-of-the-art ConvNets on problems of particular relevance to robot navigation, namely viewpoint invariance and condition invariance, and for the first time enables real-time place recognition performance using ConvNets with large maps.

Neural Codes for Image Retrieval

David Stutz

TL;DR: A thorough discussion of several state-of-the-art techniques in image retrieval by considering the associated subproblems: image description, descriptor compression, nearest-neighbor search and query expansion, and the combined use of deep architectures and hand-crafted image representations for accurate and efficient image retrieval.
