scispace - formally typeset
Open AccessJournal ArticleDOI

End-to-End Learning of Deep Visual Representations for Image Retrieval

TLDR
In this article, the authors leverage a large-scale but noisy landmark dataset and develop an automatic cleaning method that produces a suitable training set for deep retrieval, and train this network with a siamese architecture that combines three streams with a triplet loss.
Abstract
While deep learning has become a key ingredient in the top performing methods for many computer vision tasks, it has failed so far to bring similar improvements to instance-level image retrieval In this article, we argue that reasons for the underwhelming results of deep methods on image retrieval are threefold: (1) noisy training data, (2) inappropriate deep architecture, and (3) suboptimal training procedure We address all three issues First, we leverage a large-scale but noisy landmark dataset and develop an automatic cleaning method that produces a suitable training set for deep retrieval Second, we build on the recent R-MAC descriptor, show that it can be interpreted as a deep and differentiable architecture, and present improvements to enhance it Last, we train this network with a siamese architecture that combines three streams with a triplet loss At the end of the training process, the proposed architecture produces a global image representation in a single forward pass that is well suited for image retrieval Extensive experiments show that our approach significantly outperforms previous retrieval approaches, including state-of-the-art methods based on costly local descriptor indexing and spatial verification On Oxford 5k, Paris 6k and Holidays, we respectively report 947, 966, and 948 mean average precision Our representations can also be heavily compressed using product quantization with little loss in accuracy

read more

Citations
More filters
Journal ArticleDOI

An overview of deep learning in medical imaging focusing on MRI

TL;DR: In this article, the authors provide a short overview of recent advances and some associated challenges in machine learning applied to medical image processing and image analysis, and provide a starting point for people interested in experimenting and perhaps contributing to the field of machine learning for medical imaging.
Journal ArticleDOI

Fine-Tuning CNN Image Retrieval with No Human Annotation

TL;DR: It is shown that both hard-positive and hard-negative examples, selected by exploiting the geometry and the camera positions available from the 3D models, enhance the performance of particular-object retrieval.
Posted Content

Fine-tuning CNN Image Retrieval with No Human Annotation

TL;DR: In this article, the authors proposed to fine-tune CNNs for image retrieval on a large collection of unordered images in a fully automated manner, using Reconstructed 3D models obtained by the state-of-the-art retrieval and structure-from-motion methods.
Proceedings ArticleDOI

D2-Net: A Trainable CNN for Joint Description and Detection of Local Features

TL;DR: This work proposes an approach where a single convolutional neural network plays a dual role: It is simultaneously a dense feature descriptor and a feature detector, and shows that this model can be trained using pixel correspondences extracted from readily available large-scale SfM reconstructions, without any further annotations.
Journal ArticleDOI

An overview of deep learning in medical imaging focusing on MRI

TL;DR: This paper indicates how deep learning has been applied to the entire MRI processing chain, from acquisition to image retrieval, from segmentation to disease prediction, and provides a starting point for people interested in experimenting and contributing to the field of deep learning for medical imaging.
References
More filters
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: The state-of-the-art performance of CNNs was achieved by Deep Convolutional Neural Networks (DCNNs) as discussed by the authors, which consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
Proceedings ArticleDOI

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Journal ArticleDOI

Distinctive Image Features from Scale-Invariant Keypoints

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.