End-to-End Learning of Deep Visual Representations for Image Retrieval

doi:10.1007/S11263-017-1016-8

Albert Gordo, +3 more

- 05 Jun 2017 -

International Journal of Computer Vision

- Vol. 124, Iss: 2, pp 237-254

TLDR

In this article, the authors leverage a large-scale but noisy landmark dataset and develop an automatic cleaning method that produces a suitable training set for deep retrieval. They then train a retrieval network with a siamese architecture that combines three streams with a triplet loss.

Abstract:

While deep learning has become a key ingredient in the top performing methods for many computer vision tasks, it has failed so far to bring similar improvements to instance-level image retrieval. In this article, we argue that reasons for the underwhelming results of deep methods on image retrieval are threefold: (1) noisy training data, (2) inappropriate deep architecture, and (3) suboptimal training procedure. We address all three issues. First, we leverage a large-scale but noisy landmark dataset and develop an automatic cleaning method that produces a suitable training set for deep retrieval. Second, we build on the recent R-MAC descriptor, show that it can be interpreted as a deep and differentiable architecture, and present improvements to enhance it. Last, we train this network with a siamese architecture that combines three streams with a triplet loss. At the end of the training process, the proposed architecture produces a global image representation in a single forward pass that is well suited for image retrieval. Extensive experiments show that our approach significantly outperforms previous retrieval approaches, including state-of-the-art methods based on costly local descriptor indexing and spatial verification. On Oxford 5k, Paris 6k and Holidays, we respectively report 94.7, 96.6, and 94.8 mean average precision. Our representations can also be heavily compressed using product quantization with little loss in accuracy.
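
The triplet loss mentioned in the abstract can be illustrated with a minimal sketch. It assumes one common formulation: each of the three streams yields an L2-normalized global descriptor, and the loss is half the hinge on the margin plus the squared query-positive distance minus the squared query-negative distance. The function names and the margin value of 0.1 are hypothetical choices for illustration and are not taken from this page.

```python
import math


def l2_normalize(v):
    """Scale a descriptor to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(query, positive, negative, margin=0.1):
    """Hinge-style ranking loss over one (query, positive, negative) triplet.

    The loss is zero once the positive is closer to the query than the
    negative by at least `margin` (an assumed, illustrative value).
    """
    q, p, n = (l2_normalize(v) for v in (query, positive, negative))
    return 0.5 * max(0.0, margin + squared_distance(q, p) - squared_distance(q, n))
```

With toy two-dimensional descriptors, a triplet whose positive coincides with the query and whose negative is orthogonal to it gives zero loss, while swapping the positive and negative gives a positive loss that training would push down.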

Citations

Journal Article

An overview of deep learning in medical imaging focusing on MRI

Alexander Lundervold, +4 more

- 25 Nov 2018 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: In this article, the authors provide a short overview of recent advances and some associated challenges in machine learning applied to medical image processing and image analysis, and provide a starting point for people interested in experimenting and perhaps contributing to the field of machine learning for medical imaging.

Journal Article

Fine-Tuning CNN Image Retrieval with No Human Annotation

Filip Radenovic, +2 more

- 01 Jul 2019 -

IEEE Transactions on Pattern Analysis an...

TL;DR: It is shown that both hard-positive and hard-negative examples, selected by exploiting the geometry and the camera positions available from the 3D models, enhance the performance of particular-object retrieval.

Posted Content

Fine-tuning CNN Image Retrieval with No Human Annotation

Filip Radenovic, +2 more

- 03 Nov 2017 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: In this article, the authors propose to fine-tune CNNs for image retrieval on a large collection of unordered images in a fully automated manner, using reconstructed 3D models obtained by state-of-the-art retrieval and structure-from-motion methods.

Proceedings Article

D2-Net: A Trainable CNN for Joint Description and Detection of Local Features

Mihai Dusmanu, +6 more

TL;DR: This work proposes an approach where a single convolutional neural network plays a dual role: It is simultaneously a dense feature descriptor and a feature detector, and shows that this model can be trained using pixel correspondences extracted from readily available large-scale SfM reconstructions, without any further annotations.

Journal Article

An overview of deep learning in medical imaging focusing on MRI

Alexander Lundervold, +4 more

- 01 May 2019 -

Zeitschrift Fur Medizinische Physik

TL;DR: This paper indicates how deep learning has been applied to the entire MRI processing chain, from acquisition to image retrieval, from segmentation to disease prediction, and provides a starting point for people interested in experimenting and contributing to the field of deep learning for medical imaging.

References

Proceedings Article

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

TL;DR: In this article, the authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting networks won 1st place on the ILSVRC 2015 classification task.

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: The authors train a deep convolutional neural network, consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax, which achieved state-of-the-art ImageNet classification performance at the time.

Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.

Proceedings Article

ImageNet: A large-scale hierarchical image database

Jia Deng, +5 more

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.

Journal Article

Distinctive Image Features from Scale-Invariant Keypoints

David G. Lowe

- 01 Nov 2004 -

International Journal of Computer Vision

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.

Related Papers (5)

Deep Residual Learning for Image Recognition

Object retrieval with large vocabularies and fast spatial matching

Distinctive Image Features from Scale-Invariant Keypoints

Video Google: a text retrieval approach to object matching in videos

ImageNet Classification with Deep Convolutional Neural Networks