Journal Article

A sparse kernel relevance model for automatic image annotation

TLDR
Introduces the SKL-CRM, a new form of the continuous relevance model (CRM) that adaptively selects the best-performing kernel per feature type for automatic image annotation, and finds that it is competitive with a suite of state-of-the-art image annotation models.
Abstract
In this paper, we introduce a new form of the continuous relevance model (CRM), dubbed the SKL-CRM, that adaptively selects the best-performing kernel per feature type for automatic image annotation. Previous image annotation models apply a standard selection of kernels to model the distribution of image features. Popular examples include a Gaussian kernel for modelling GIST features or a Laplacian kernel for global colour histograms. In this work, we demonstrate that this standard assignment of kernels to feature types is sub-optimal and that substantially higher image annotation accuracy can be attained by adapting the kernel-feature assignment. We formulate an efficient greedy algorithm to find the best kernel-feature alignment and show that it rapidly finds a sparse subset of features that maximises the annotation $F_1$ score. In a second contribution, we introduce two data-adaptive kernels for image annotation (the generalised Gaussian and multinomial kernels), which we demonstrate can model the distribution of image features better than standard kernels. Evaluation is conducted on three standard image datasets across a selection of different feature representations. The proposed SKL-CRM model is found to attain performance that is competitive with a suite of state-of-the-art image annotation models.
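
The greedy kernel-feature alignment described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it is plain greedy forward selection over (feature, kernel) pairs, and the function name `greedy_kernel_feature_selection`, the `score` callback, and the `TOY_GAIN` table are all hypothetical stand-ins. In practice `score` would be the validation $F_1$ of an annotation model built from the chosen pairs.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Pair = Tuple[str, str]  # (feature name, kernel name); names are illustrative


def greedy_kernel_feature_selection(
    features: Sequence[str],
    kernels: Sequence[str],
    score: Callable[[List[Pair]], float],
    max_pairs: Optional[int] = None,
) -> Tuple[List[Pair], float]:
    """Greedy forward selection: add the (feature, kernel) pair that most
    improves `score`, and stop as soon as no pair improves it."""
    selected: List[Pair] = []
    used_features = set()
    best_score = float("-inf")
    limit = len(features) if max_pairs is None else max_pairs

    while len(selected) < limit:
        # Each feature is assigned at most one kernel.
        candidates = [
            (f, k) for f in features if f not in used_features for k in kernels
        ]
        if not candidates:
            break
        top_score, top_pair = max((score(selected + [c]), c) for c in candidates)
        if top_score <= best_score:
            break  # no remaining pair helps, so the selection stays sparse
        selected.append(top_pair)
        used_features.add(top_pair[0])
        best_score = top_score

    return selected, best_score


# Hypothetical, made-up per-pair gains standing in for a real validation F1.
TOY_GAIN = {
    ("gist", "gaussian"): 0.30, ("gist", "laplacian"): 0.35,
    ("colour", "gaussian"): 0.10, ("colour", "laplacian"): 0.08,
    ("sift", "gaussian"): -0.02, ("sift", "laplacian"): -0.05,
}


def toy_f1(pairs: List[Pair]) -> float:
    """Assumed additive toy objective; a real F1 would not be additive."""
    return sum(TOY_GAIN[p] for p in pairs)


if __name__ == "__main__":
    chosen, f1 = greedy_kernel_feature_selection(
        ["gist", "colour", "sift"], ["gaussian", "laplacian"], toy_f1
    )
    print(chosen, f1)
```

In this toy setting the search picks a non-default kernel for `gist`, keeps `colour`, and drops `sift` because neither of its kernels improves the score, which mirrors the sparse subset the abstract refers to.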


Citations
Proceedings Article

Automatic Image Annotation using Deep Learning Representations

TL;DR: It is demonstrated that word embedding vectors perform better than binary vectors as a representation of the tags associated with an image and the CCA model is compared to a simple CNN based linear regression model, which allows the CNN layers to be trained using back-propagation.
Proceedings Article

Deep Classifiers from Image Tags in the Wild

TL;DR: This paper introduces a large-scale robust classification algorithm to handle the inherent noise in these tags, and a calibration procedure to better predict objective annotations, and shows that freely available, wild tags can obtain similar or superior results to large databases of costly manual annotations.
Journal Article

Image Annotation by Propagating Labels from Semantic Neighbourhoods

TL;DR: This work proposes the 2-pass k-nearest neighbour (2PKNN) algorithm, a two-step variant of the classical k-nearest neighbour algorithm that tries to address issues in the image annotation task, and establishes a new state-of-the-art on the prevailing image annotation datasets.
Journal Article

ImageCLEF annotation with explicit context-aware kernel maps

TL;DR: This paper will show that the underlying kernel solution converges to a positive semi-definite fixed-point, which can also be expressed as a dot product involving “explicit” kernel maps.
Journal Article

Image annotation using multi-view non-negative matrix factorization with different number of basis vectors

TL;DR: An AIA system using a Non-negative Matrix Factorization (NMF) framework, which discovers a latent space by factorizing data into a set of non-negative basis vectors and coefficients, and which is competitive with the current state-of-the-art methods.
References
Journal Article

Distinctive Image Features from Scale-Invariant Keypoints

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Journal Article

Normalized cuts and image segmentation

TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.
Proceedings Article

Normalized cuts and image segmentation

TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.
Proceedings Article

Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories

TL;DR: This paper presents a method for recognizing scene categories based on approximate global geometric correspondence that exceeds the state of the art on the Caltech-101 database and achieves high accuracy on a large database of fifteen natural scene categories.
Journal Article

Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope

TL;DR: The performance of the spatial envelope model shows that specific information about object shape or identity is not a requirement for scene categorization and that modeling a holistic representation of the scene informs about its probable semantic category.