A sparse kernel relevance model for automatic image annotation

doi:10.1007/S13735-014-0063-Y

Journal Article

Sean Moran, +1 more

- 19 Sep 2014 -

International Journal of Multimedia Info...

- Vol. 3, Iss: 4, pp 209-229

TL;DR: This paper introduces the SKL-CRM, a new form of the continuous relevance model (CRM) that adaptively selects the best-performing kernel per feature type for automatic image annotation, and finds that it is competitive with a suite of state-of-the-art image annotation models.

Abstract:

In this paper, we introduce a new form of the continuous relevance model (CRM), dubbed the SKL-CRM, that adaptively selects the best-performing kernel per feature type for automatic image annotation. Previous image annotation models apply a standard selection of kernels to model the distribution of image features. Popular examples include a Gaussian kernel for modelling GIST features or a Laplacian kernel for global colour histograms. In this work, we demonstrate that this standard assignment of kernels to feature types is sub-optimal and that substantially higher image annotation accuracy can be attained by adapting the kernel-feature assignment. We formulate an efficient greedy algorithm to find the best kernel-feature alignment and show that it is able to rapidly find a sparse subset of features that maximises the annotation $F_1$ score. In a second contribution, we introduce two data-adaptive kernels for image annotation, the generalised Gaussian and multinomial kernels, which we demonstrate model the distribution of image features better than standard kernels. Evaluation is conducted on three standard image datasets across a selection of different feature representations. The proposed SKL-CRM model is found to attain performance that is competitive with a suite of state-of-the-art image annotation models.
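The greedy kernel-feature alignment described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the function names, the kernel parameterisation, and the `score` callback (standing in for a validation $F_1$ evaluation of the annotation model) are all assumptions made for illustration, and the toy gains are arbitrary numbers rather than reported results. The kernel shown uses one common generalised Gaussian form in which a shape parameter `p` recovers a Gaussian-like kernel at `p = 2` and a Laplacian-like kernel at `p = 1`.

```python
import math


def generalised_gaussian_kernel(x, y, p=2.0, bandwidth=1.0):
    """Assumed form: exp(-sum_d |x_d - y_d|^p / bandwidth^p).

    p = 2 gives a Gaussian-like kernel, p = 1 a Laplacian-like kernel.
    """
    dist = sum(abs(a - b) ** p for a, b in zip(x, y))
    return math.exp(-dist / bandwidth ** p)


def greedy_kernel_feature_selection(features, kernels, score):
    """Forward greedy search over (feature, kernel) pairs.

    `score(assignment)` is a hypothetical callback returning a quality
    measure (for example validation F1) for a dict {feature: kernel}.
    Each round adds the single pair that improves the score the most and
    the search stops when no pair helps, so unused features are dropped
    and the final assignment is sparse.
    """
    assignment = {}
    best = float("-inf")
    while len(assignment) < len(features):
        candidate = None
        for feature in features:
            if feature in assignment:
                continue
            for kernel in kernels:
                trial = dict(assignment)
                trial[feature] = kernel
                trial_score = score(trial)
                if trial_score > best:
                    best, candidate = trial_score, (feature, kernel)
        if candidate is None:  # no remaining pair improves the score
            break
        assignment[candidate[0]] = candidate[1]
    return assignment, best


# Toy usage: arbitrary, made-up gains standing in for a real F1 evaluation.
toy_gain = {
    ("gist", "laplacian"): 0.30,
    ("gist", "gaussian"): 0.20,
    ("colour_hist", "gaussian"): 0.10,
    ("colour_hist", "laplacian"): 0.05,
    ("sift", "gaussian"): -0.05,
    ("sift", "laplacian"): -0.10,
}


def toy_score(assignment):
    return sum(toy_gain[(f, k)] for f, k in assignment.items())


selected, best_score = greedy_kernel_feature_selection(
    ["gist", "colour_hist", "sift"], ["gaussian", "laplacian"], toy_score
)
print(selected, best_score)
```

In the toy run the search keeps two of the three features and discards the one whose every kernel pairing lowers the score, which is the sense in which a greedy alignment of this kind yields a sparse feature subset. In practice `score` would wrap a full train-and-validate cycle, so each greedy round costs one evaluation per remaining feature-kernel pair.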

Citations

Proceedings Article

Automatic Image Annotation using Deep Learning Representations

Venkatesh N. Murthy, +2 more

TL;DR: It is demonstrated that word embedding vectors perform better than binary vectors as a representation of the tags associated with an image, and the CCA model is compared to a simple CNN-based linear regression model, which allows the CNN layers to be trained using back-propagation.

Proceedings Article

Deep Classifiers from Image Tags in the Wild

Hamid Izadinia, +4 more

TL;DR: This paper introduces a large-scale robust classification algorithm to handle the inherent noise in image tags found in the wild, together with a calibration procedure to better predict objective annotations, and shows that freely available wild tags can obtain results similar or superior to those from large databases of costly manual annotations.

Journal Article

Image Annotation by Propagating Labels from Semantic Neighbourhoods

Yashaswi Verma, +1 more

- 01 Jan 2017 -

International Journal of Computer Vision

TL;DR: This work proposes the 2-pass k-nearest neighbour (2PKNN) algorithm, a two-step variant of the classical k-nearest neighbour algorithm that tries to address issues in the image annotation task, and establishes a new state of the art on the prevailing image annotation datasets.

Journal Article

ImageCLEF annotation with explicit context-aware kernel maps

Hichem Sahbi

- 20 Mar 2015 -

International Journal of Multimedia Info...

TL;DR: This paper will show that the underlying kernel solution converges to a positive semi-definite fixed-point, which can also be expressed as a dot product involving “explicit” kernel maps.

Journal Article

Image annotation using multi-view non-negative matrix factorization with different number of basis vectors

Roya Rad, +1 more

- 01 Jul 2017 -

Journal of Visual Communication and Imag...

TL;DR: This paper presents an AIA system using a Non-negative Matrix Factorization (NMF) framework, which discovers a latent space by factorizing data into a set of non-negative basis vectors and coefficients, and is competitive with current state-of-the-art methods.

References

Journal Article

Distinctive Image Features from Scale-Invariant Keypoints

David G. Lowe

- 01 Nov 2004 -

International Journal of Computer Vision

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.

Journal Article

Normalized cuts and image segmentation

Jianbo Shi, +1 more

- 01 Aug 2000 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.

Proceedings Article

Normalized cuts and image segmentation

Jianbo Shi, +1 more

TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.

Proceedings Article

Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories

Svetlana Lazebnik, +2 more

TL;DR: This paper presents a method for recognizing scene categories based on approximate global geometric correspondence that exceeds the state of the art on the Caltech-101 database and achieves high accuracy on a large database of fifteen natural scene categories.

Journal Article

Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope

Aude Oliva, +1 more

- 01 May 2001 -

International Journal of Computer Vision

TL;DR: The performance of the spatial envelope model shows that specific information about object shape or identity is not a requirement for scene categorization and that modeling a holistic representation of the scene informs about its probable semantic category.

Related Papers (5)

TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation

A New Baseline for Image Annotation

Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary

Multiple Bernoulli relevance models for image and video annotation

Fast Image Tagging