This paper presents a probabilistic framework for modeling the feature-to-feature similarity measure, and proposes a function, within that probabilistic framework, to score the individual feature contributions to an image-to-image similarity.
Abstract:
Many recent object retrieval systems rely on local features for describing an image. The similarity between a pair of images is measured by aggregating the similarity between their corresponding local features. In this paper we present a probabilistic framework for modeling the feature-to-feature similarity measure. We then derive a query-adaptive distance which is appropriate for global similarity evaluation. Furthermore, we propose a function to score the individual contributions to an image-to-image similarity within the probabilistic framework. Experimental results show that our method improves the retrieval accuracy significantly and consistently. Moreover, our result compares favorably to the state-of-the-art.
TL;DR: A convolutional neural network architecture that is trainable in an end-to-end manner directly for the place recognition task and an efficient training procedure which can be applied on very large-scale weakly labelled tasks are developed.
TL;DR: A conditional random field model that reasons about possible groundings of scene graphs to test images and shows that the full model can be used to improve object localization compared to baseline methods and outperforms retrieval methods that use only objects or low-level image features.
TL;DR: It turns out that the learned matching function is so powerful that a simple tracker built upon it, coined Siamese INstance search Tracker, SINT, suffices to reach state-of-the-art performance.
TL;DR: A convolutional neural network architecture that is trainable in an end-to-end manner directly for the place recognition task, and significantly outperforms non-learnt image representations and off-the-shelf CNN descriptors on two challenging place recognition benchmarks.
TL;DR: A comprehensive survey of instance retrieval over the last decade, presenting milestones in modern instance retrieval, reviews a broad selection of previous works in different categories, and provides insights on the connection between SIFT and CNN-based methods.
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
TL;DR: An approach to object and scene retrieval which searches for and localizes all the occurrences of a user-outlined object in a video, represented by a set of viewpoint-invariant region descriptors so that recognition can proceed successfully despite changes in viewpoint, illumination and partial occlusion.
TL;DR: A comparative evaluation of different detectors is presented and it is shown that the proposed approach for detecting interest points invariant to scale and affine transformations provides better results than existing methods.
TL;DR: To improve query performance, this work adds an efficient spatial verification stage to re-rank the results returned from the bag-of-words model and shows that this consistently improves search quality, though by less of a margin when the visual vocabulary is large.
TL;DR: This paper introduces a product quantization-based approach for approximate nearest neighbor search to decompose the space into a Cartesian product of low-dimensional subspaces and to quantize each subspace separately.
Q1. What contributions have the authors mentioned in the paper "Query adaptive similarity for large scale object retrieval" ?
In this paper the authors present a probabilistic framework for modeling the feature-to-feature similarity measure, from which they derive a query-adaptive distance for global similarity evaluation. Furthermore, the authors propose a function to score the individual contributions to an image-to-image similarity within the probabilistic framework.
Q2. How does the mAP function perform without the feature scaling?
For the experiment on Oxford5k, the authors find that without the feature scaling, mAP drops from 0.739 to 0.707, while without burstiness weighting, mAP drops to 0.692.
Q3. How can the authors estimate the distance to the non-corresponding features?
The expected distance to the non-corresponding features can be used to adapt the original distance and can be efficiently estimated by introducing a small set of random features as negative examples.
Q4. How do the authors normalize distance between features?
Since the distribution of the Euclidean distance varies enormously from one query feature to another, the authors propose to normalize the distance locally to obtain a similar degree of measurement across queries.
Q5. What is the threshold for comparing the adaptive distance function to the Euclidean distance?
In order to compare the adaptive distance function to the Euclidean distance, the authors apply a threshold that separates matching from non-matching features.
Q6. How can the authors estimate the distance between the non-corresponding features?
Since the non-corresponding features are independent of the query, a set of randomly sampled, and thus unrelated, features can be used to represent the set of non-corresponding features for each query.
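As a rough illustration only (not the authors' implementation), the idea behind Q3, Q4 and Q6 can be sketched as follows: estimate the expected Euclidean distance from a query feature to a small random negative sample, then use that estimate to normalize the distances locally. The function name and the use of a z-score-style normalization are assumptions for this sketch.

```python
import numpy as np

def adaptive_distance(query_feat, db_feats, negatives):
    """Sketch of a query-adaptive distance: normalize Euclidean distances
    to a query feature by the distance statistics of a random negative set.

    negatives: randomly sampled features, independent of the query, standing
    in for the non-corresponding features (an assumption of this sketch).
    """
    # Euclidean distances from the query feature to candidate database features.
    d = np.linalg.norm(db_feats - query_feat, axis=1)
    # Expected distance to non-corresponding features, estimated cheaply
    # from the small random negative sample.
    d_neg = np.linalg.norm(negatives - query_feat, axis=1)
    mu, sigma = d_neg.mean(), d_neg.std()
    # Locally normalized distance: comparable across different query features.
    return (d - mu) / sigma
```

A true match (distance near zero) then maps to a strongly negative normalized value regardless of how the raw distances are scaled for that particular query feature.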
Q7. How do the authors calculate the distance between a query and a database?
In order to obtain an estimate of the pairwise distance d(xi, yj) between query and database features, the authors add a product quantization scheme as in [12] and use the same parameters as the original authors.
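For intuition, a minimal product-quantization sketch (in the spirit of [12], not the authors' exact configuration; function names, the codebook size, and the plain k-means training loop are assumptions here): split each descriptor into subvectors, quantize each subspace with its own small codebook, and estimate query-to-database distances with per-subspace lookup tables.

```python
import numpy as np

def pq_train(data, n_sub, k=16, iters=10, seed=0):
    """Learn one small k-means codebook per subspace (toy training loop)."""
    rng = np.random.default_rng(seed)
    d_sub = data.shape[1] // n_sub
    codebooks = []
    for s in range(n_sub):
        block = data[:, s * d_sub:(s + 1) * d_sub]
        cent = block[rng.choice(len(block), k, replace=False)].copy()
        for _ in range(iters):
            # Assign each subvector to its nearest centroid, then update.
            assign = ((block[:, None, :] - cent[None]) ** 2).sum(-1).argmin(1)
            for c in range(k):
                pts = block[assign == c]
                if len(pts):
                    cent[c] = pts.mean(0)
        codebooks.append(cent)
    return codebooks

def pq_encode(codebooks, feats):
    """Replace each subvector by the index of its nearest centroid."""
    d_sub = codebooks[0].shape[1]
    codes = []
    for s, cent in enumerate(codebooks):
        block = feats[:, s * d_sub:(s + 1) * d_sub]
        codes.append(((block[:, None, :] - cent[None]) ** 2).sum(-1).argmin(1))
    return np.stack(codes, axis=1)

def pq_distance(codebooks, query, codes):
    """Asymmetric distance: exact query subvectors vs. quantized database
    codes, summed from per-subspace lookup tables (squared Euclidean)."""
    d_sub = codebooks[0].shape[1]
    dist = np.zeros(len(codes))
    for s, cent in enumerate(codebooks):
        q_sub = query[s * d_sub:(s + 1) * d_sub]
        table = ((cent - q_sub) ** 2).sum(-1)  # k distances, computed once
        dist += table[codes[:, s]]
    return dist
```

The lookup tables make the distance estimate cheap: each database feature costs only n_sub table reads instead of a full vector comparison.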
Q8. How does the model adapt to the query feature?
The authors show, both on simulated and real data, that the Euclidean distance density distribution is highly query-dependent and that their model adapts the original distance accordingly.