Aggregating local descriptors into a compact image representation

doi:10.1109/CVPR.2010.5540039

Open Access, Proceedings Article

- pp 3304-3311

TLDR

This work proposes a simple yet efficient way of aggregating local image descriptors into a vector of limited dimension, which can be viewed as a simplification of the Fisher kernel representation, and shows how to jointly optimize the dimension reduction and the indexing algorithm.

Abstract:

We address the problem of image search on a very large scale, where three constraints have to be considered jointly: the accuracy of the search, its efficiency, and the memory usage of the representation. We first propose a simple yet efficient way of aggregating local image descriptors into a vector of limited dimension, which can be viewed as a simplification of the Fisher kernel representation. We then show how to jointly optimize the dimension reduction and the indexing algorithm, so that it best preserves the quality of vector comparison. The evaluation shows that our approach significantly outperforms the state of the art: the search accuracy is comparable to the bag-of-features approach for an image representation that fits in 20 bytes. Searching a 10 million image dataset takes about 50 ms.
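
The aggregation step described in the abstract can be illustrated with a small sketch. It assumes the formulation usually associated with this paper (VLAD, the vector of locally aggregated descriptors): each local descriptor is assigned to its nearest centroid in a small codebook, the residuals are summed per centroid, and the concatenated result is L2-normalized. The function names, the toy codebook, and the toy descriptors are hypothetical and chosen only for illustration; the dimension reduction and indexing stages are not shown.

```python
import math


def nearest_centroid(x, centroids):
    """Return the index of the centroid closest to x (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for i, c in enumerate(centroids):
        dist = sum((a - b) ** 2 for a, b in zip(x, c))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def aggregate(descriptors, centroids):
    """Aggregate local descriptors into one vector of length k * d.

    For each descriptor, the residual to its nearest centroid is added to
    that centroid's block of the output. The result is L2-normalized.
    """
    k, d = len(centroids), len(centroids[0])
    v = [0.0] * (k * d)
    for x in descriptors:
        i = nearest_centroid(x, centroids)
        for j in range(d):
            v[i * d + j] += x[j] - centroids[i][j]
    norm = math.sqrt(sum(a * a for a in v))
    if norm > 0:
        v = [a / norm for a in v]
    return v


# Toy data (assumed for illustration): a codebook with k = 2 centroids
# in d = 2 dimensions, and three local descriptors from one image.
centroids = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [10.0, 13.0]]

vec = aggregate(descriptors, centroids)
print(len(vec))  # k * d components
```

The output dimension depends only on the codebook size and the descriptor dimension, not on how many local descriptors the image contains, which is what makes the representation a fixed-length vector suitable for later compression.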

Citations

Proceedings Article

Going deeper with convolutions

Christian Szegedy, +8 more

TL;DR: Inception is a deep convolutional neural network architecture that set a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

Posted Content

Group Normalization

Yuxin Wu, +1 more

- 22 Mar 2018 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: Group Normalization can outperform its BN-based counterparts for object detection and segmentation in COCO, and for video classification in Kinetics, showing that GN can effectively replace the powerful BN in a variety of tasks.

Proceedings Article

NetVLAD: CNN Architecture for Weakly Supervised Place Recognition

Relja Arandjelovic, +4 more

TL;DR: A convolutional neural network architecture that is trainable in an end-to-end manner directly for the place recognition task and an efficient training procedure which can be applied on very large-scale weakly labelled tasks are developed.

Journal Article

Iterative Quantization: A Procrustean Approach to Learning Binary Codes for Large-Scale Image Retrieval

Yunchao Gong, +3 more

- 01 Dec 2013 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This paper addresses the problem of learning similarity-preserving binary codes for efficient similarity search in large-scale image collections by proposing a simple and efficient alternating minimization algorithm, dubbed iterative quantization (ITQ), and demonstrating an application of ITQ to learning binary attributes or "classemes" on the ImageNet data set.

Journal Article

Aggregating Local Image Descriptors into Compact Codes

Herve Jegou, +5 more

- 01 Sep 2012 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This paper first presents and evaluates different ways of aggregating local image descriptors into a vector and shows that the Fisher kernel achieves better performance than the reference bag-of-visual words approach for any given vector dimension.

References

Journal Article

Distinctive Image Features from Scale-Invariant Keypoints

David G. Lowe

- 01 Nov 2004 -

International Journal of Computer Vision

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.

Book

Pattern Recognition and Machine Learning

Christopher M. Bishop

TL;DR: This book covers probability distributions, linear models for regression, linear models for classification, neural networks, graphical models, mixture models and EM, sampling methods, continuous latent variables, and sequential data.

Journal Article

Pattern Recognition and Machine Learning

Radford M. Neal

- 01 Aug 2007 -

Technometrics

TL;DR: A review of the book Pattern Recognition and Machine Learning by Christopher M. Bishop.

Journal Article

A performance evaluation of local descriptors

Krystian Mikolajczyk, +1 more

- 01 Oct 2005 -

IEEE Transactions on Pattern Analysis an...

TL;DR: The ranking of the descriptors is observed to be mostly independent of the interest region detector, and the SIFT-based descriptors perform best. Among the low-dimensional descriptors, moments and steerable filters show the best performance.

Proceedings Article

Video Google: a text retrieval approach to object matching in videos

Josef Sivic, +1 more

TL;DR: An approach to object and scene retrieval that searches for and localizes all occurrences of a user-outlined object in a video. The object is represented by a set of viewpoint-invariant region descriptors, so that recognition can proceed successfully despite changes in viewpoint, illumination, and partial occlusion.

Related Papers (5)

Distinctive Image Features from Scale-Invariant Keypoints

Video Google: a text retrieval approach to object matching in videos

ImageNet Classification with Deep Convolutional Neural Networks

Scalable Recognition with a Vocabulary Tree

Deep Residual Learning for Image Recognition