Deep learning features at scale for visual place recognition

doi:10.1109/ICRA.2017.7989366

Open Access, Proceedings Article

Zetao Chen, +7 more

- pp 3223-3230

TLDR

This paper trains, at large scale, two CNN architectures for the specific place recognition task and employs a multi-scale feature encoding method to generate condition- and viewpoint-invariant features.

Abstract:

The success of deep learning techniques in the computer vision domain has triggered a range of initial investigations into their utility for visual place recognition, all using generic features from networks that were trained for other types of recognition tasks. In this paper, we train, at large scale, two CNN architectures for the specific place recognition task and employ a multi-scale feature encoding method to generate condition- and viewpoint-invariant features. To enable this training to occur, we have developed a massive Specific PlacEs Dataset (SPED) with hundreds of examples of place appearance change at thousands of different places, as opposed to the semantic place type datasets currently available. This new dataset enables us to set up a training regime that interprets place recognition as a classification problem. We comprehensively evaluate our trained networks on several challenging benchmark place recognition datasets and demonstrate that they achieve an average 10% increase in performance over other place recognition algorithms and pre-trained CNNs. By analyzing the network responses and their differences from pre-trained networks, we provide insights into what a network learns when training for place recognition, and what these results signify for future research in this area.
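
As a rough illustration of the classification framing described in the abstract, the sketch below treats each physical place as one class, so that images of the same place under different conditions share a label. The record layout, the place identifiers, the toy descriptors, and the helper names `build_place_labels` and `match_place` are assumptions made for illustration and are not taken from the paper. The cosine-similarity matching step is a common retrieval choice, not necessarily the authors' procedure.

```python
import math


def build_place_labels(records):
    """Map each distinct place id to an integer class index (one class per place)."""
    place_ids = sorted({place for place, _ in records})
    return {place: idx for idx, place in enumerate(place_ids)}


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_place(query, reference):
    """Return the place id whose reference descriptor is most similar to the query."""
    return max(reference, key=lambda place: cosine_similarity(query, reference[place]))


# Hypothetical records: (place id, image name). Images of the same place taken
# under different conditions receive the same class label.
records = [
    ("camera_017", "summer_noon.jpg"),
    ("camera_017", "winter_dusk.jpg"),
    ("camera_342", "spring_rain.jpg"),
]
labels = build_place_labels(records)
training_pairs = [(image, labels[place]) for place, image in records]

# Hypothetical descriptors standing in for features taken from a trained network.
reference = {
    "camera_017": [0.9, 0.1, 0.2],
    "camera_342": [0.1, 0.8, 0.3],
}
query = [0.8, 0.2, 0.1]
best = match_place(query, reference)
```

In this framing the classifier is only a training device: at test time the places are usually unseen, so the learned features are compared directly, as `match_place` does here with toy vectors.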

Citations

Posted Content

SeqNet: Learning Descriptors for Sequence-based Hierarchical Place Recognition

Sourav Garg, +1 more

- 23 Feb 2021 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: SeqNet uses a temporal convolutional network to encode short image sequences with 1-D convolutions. The resulting descriptors are matched against the corresponding temporal descriptors from the reference dataset to provide an ordered list of place match hypotheses.

Journal ArticleDOI

Improving Street View Image Classification Using Pre-trained CNN Model Extracted Features

Meriem Djouadi, +1 more

- 26 Sep 2022 -

Periodica polytechnica. Electrical engin...

TL;DR: This paper presents a new approach for the challenging problem of image geo-localization using Convolutional Neural Networks (CNNs) and shows that this approach can improve the classification by achieving a good accuracy rate.

Journal ArticleDOI

Attention-Based Deep Odometry Estimation on Point Clouds

Prince Kapoor, +3 more

TL;DR: This paper extends an existing deep odometry estimation model for 3D point clouds by adding attention layers to the architecture, which reduces error on the test set by 66% relative to the original model.

Journal ArticleDOI

Merging Classification Predictions with Sequential Information for Lightweight Visual Place Recognition in Changing Environments

Bruno Arcanjo, +4 more

- 03 Oct 2022 -

arXiv.org

TL;DR: This work addresses lightweight VPR by proposing a novel system based on the combination of binary-weighted classifier networks with a one-dimensional convolutional network, dubbed merger, which achieves inference times as low as 1 millisecond.

Posted Content

Appearance-Invariant 6-DoF Visual Localization using Generative Adversarial Networks.

Lin Yimin, +2 more

- 24 Dec 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper proposes a visual localization network composed of a feature extraction network and a pose regression network, which can capture intrinsic appearance-invariant feature maps from unpaired samples taken in different weather conditions and seasons.
