Proceedings Article

Image annotation by composite kernel learning with group structure

28 Nov 2011, pp. 1497-1500
TL;DR: Comparisons with other image annotation algorithms show that the proposed Composite Kernel Learning with Group Structure (CKLGS) method achieves better annotation performance.
Abstract: A growing variety of heterogeneous features (such as color, shape and texture) can be extracted from images to describe different aspects of their visual characteristics. These high-dimensional heterogeneous visual features are intrinsically embedded in a non-linear space. To utilize them effectively, this paper proposes an approach called Composite Kernel Learning with Group Structure (CKLGS) that selects groups of discriminative features for image annotation. For each image label, the CKLGS method embeds the nonlinear image data with discriminative features into different Reproducing Kernel Hilbert Spaces (RKHS), and then composes these kernels to select groups of discriminative features. A classification model can thus be trained for image annotation. Experimental comparisons with other image annotation algorithms show that the proposed CKLGS achieves better performance.
Citations
Proceedings Article
01 Jan 2012
TL;DR: A novel image annotation approach is introduced, based on maximum margin classification and a new class of kernels that goes beyond the naive use of existing kernels and their restricted combinations in order to design “model-free” transductive kernels applicable to interconnected image databases.
Abstract: We introduce in this paper a novel image annotation approach based on maximum margin classification and a new class of kernels. The method goes beyond the naive use of existing kernels and their restricted combinations in order to design “model-free” transductive kernels applicable to interconnected image databases. The main contribution of our method is the minimization of an energy function mixing i) a reconstruction term that factorizes a matrix of interconnected image data as a product of a learned dictionary and a learned kernel map, ii) a fidelity term that ensures label predictions consistent with those provided in a training set, and iii) a smoothness term that guarantees similar labels for neighboring data and allows us to iteratively diffuse kernel maps and labels from labeled to unlabeled images. Solving this minimization problem makes it possible to learn both a decision criterion and a kernel map that guarantee linear separability in a high-dimensional space and good generalization performance. Experiments conducted on image annotation show that our obtained kernel achieves results at least comparable with related state-of-the-art methods on the MSRC and the Corel5k databases.

40 citations


Cites methods from "Image annotation by composite kerne..."

  • ...When applied, these transductive methods turned out to be very useful in order to overcome the limited cardinality of the labeled images in image annotation [8, 18, 35, 50, 52]....

    [...]

Journal Article
Fei Wu, Yahong Han, Xiang Liu, Jian Shao, Yueting Zhuang, Zhongfei Zhang
TL;DR: This paper introduces many of the recent efforts in sparsity-based heterogeneous feature selection, the representation of the intrinsic latent structure embedded in multimedia, and the related hashing index techniques.
Abstract: The amount of multimedia data on real-world multimedia sharing web sites, such as Flickr and YouTube, is growing rapidly. These data are usually of high dimensionality, high order, and large scale. Moreover, different types of media data are interrelated in complicated and extensive ways through contextual priors. It is well known that many features can be obtained from multimedia such as images and videos; these high-dimensional features often describe various aspects of the characteristics of multimedia. However, the obtained features are often over-complete for describing a given semantic concept. Selecting a limited set of discriminative features for a given concept is therefore crucial to making multimedia understanding more interpretable. Furthermore, effectively exploiting the intrinsic embedding structures in the various features can boost the performance of multimedia retrieval, so an appropriate representation of the latent information hidden in related features is also crucial to multimedia understanding. This paper introduces many of the recent efforts in sparsity-based heterogeneous feature selection, the representation of the intrinsic latent structure embedded in multimedia, and the related hashing index techniques.

21 citations


Cites background from "Image annotation by composite kerne..."

  • ...The composite kernel learning with group structure (CKLGS) is proposed in [86] to select groups of discriminative features....

    [...]

Journal Article
TL;DR: The experimental results showed that significant improvements in data compatibility were obtained when the online PC-KFA was used, based on an accuracy measure for long-term sequential online datasets, and that the computational time was reduced by more than 93% in online training compared with offline training.

16 citations

Journal Article
TL;DR: This paper proposes a web-community-based image annotation model built on weighted KNN, which is suited to annotating images from large-scale real web communities, and an annotation refinement method based on WordNet level that improves the annotation results for non-abstract words.

11 citations

Journal Article
TL;DR: The strength of the proposed S^2CLGS method for multi-label image annotation is that it integrates semi-supervised discriminant analysis, cross-domain learning and sparse coding.

11 citations

References
Journal Article
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Abstract: We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.

40,785 citations


"Image annotation by composite kerne..." refers background in this paper

  • ...Therefore, how to select out most discriminant subgroups of heterogeneous visual features for image annotation is an important problem....

    [...]
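
The lasso abstract above notes that the constraint on the sum of absolute coefficient values tends to drive some coefficients exactly to 0. A minimal sketch of that behaviour follows, assuming the penalized (Lagrangian) form of the problem rather than the constrained form stated in the abstract, and using plain coordinate descent, which is one common solver and not necessarily the one used in the referenced paper. The names `soft_threshold`, `lasso_cd`, and `lam`, and the synthetic data, are hypothetical and chosen for illustration only.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly 0."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimise 0.5 * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    Assumption: penalized form of the lasso; for a suitable lam it matches
    the constrained form (sum of |b_j| bounded by a constant).
    """
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)  # squared norm of each feature column
    for _ in range(n_iter):
        for j in range(p):
            # Residual with feature j's current contribution added back.
            r_j = y - X @ b + X[:, j] * b[j]
            rho = X[:, j] @ r_j
            b[j] = soft_threshold(rho, lam) / col_sq[j]
    return b

# Synthetic example (illustrative data, not from any paper):
# only features 0 and 3 carry signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
true_b = np.array([3.0, 0.0, 0.0, -2.0, 0.0])
y = X @ true_b + 0.1 * rng.standard_normal(100)

b_hat = lasso_cd(X, y, lam=20.0)
print(b_hat)
```

The soft-thresholding step is what produces exact zeros: any coordinate whose correlation with the residual falls below `lam` in magnitude is set to 0 rather than merely shrunk.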

Proceedings Article
08 Jul 2009
TL;DR: The benchmark results indicate that it is possible to learn effective models from a sufficiently large image dataset to facilitate general image retrieval, and four research issues on web image annotation and retrieval are identified.
Abstract: This paper introduces a web image dataset created by NUS's Lab for Media Search. The dataset includes: (1) 269,648 images and the associated tags from Flickr, with a total of 5,018 unique tags; (2) six types of low-level features extracted from these images, including 64-D color histogram, 144-D color correlogram, 73-D edge direction histogram, 128-D wavelet texture, 225-D block-wise color moments extracted over 5x5 fixed grid partitions, and 500-D bag of words based on SIFT descriptions; and (3) ground-truth for 81 concepts that can be used for evaluation. Based on this dataset, we highlight characteristics of Web image collections and identify four research issues on web image annotation and retrieval. We also provide the baseline results for web image annotation by learning from the tags using the traditional k-NN algorithm. The benchmark results indicate that it is possible to learn effective models from a sufficiently large image dataset to facilitate general image retrieval.

2,648 citations

Journal Article
TL;DR: This paper shows how the kernel matrix can be learned from data via semidefinite programming (SDP) techniques and leads directly to a convex method for learning the 2-norm soft margin parameter in support vector machines, solving an important open problem.
Abstract: Kernel-based learning algorithms work by embedding the data into a Euclidean space, and then searching for linear relations among the embedded data points. The embedding is performed implicitly, by specifying the inner products between each pair of points in the embedding space. This information is contained in the so-called kernel matrix, a symmetric and positive semidefinite matrix that encodes the relative positions of all points. Specifying this matrix amounts to specifying the geometry of the embedding space and inducing a notion of similarity in the input space---classical model selection problems in machine learning. In this paper we show how the kernel matrix can be learned from data via semidefinite programming (SDP) techniques. When applied to a kernel matrix associated with both training and test data this gives a powerful transductive algorithm---using the labeled part of the data one can learn an embedding also for the unlabeled part. The similarity between test points is inferred from training points and their labels. Importantly, these learning problems are convex, so we obtain a method for learning both the model class and the function without local minima. Furthermore, this approach leads directly to a convex method for learning the 2-norm soft margin parameter in support vector machines, solving an important open problem.

2,419 citations


"Image annotation by composite kerne..." refers background in this paper

  • ...Meanwhile, high-dimensional heterogeneous features extracted from the image are often embedded in a nonlinear and inseparable subspace and it is essential to discern the embedded subspace of original high-dimensional heterogenous features....

    [...]
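
The abstract above describes the kernel matrix as a symmetric, positive semidefinite matrix of pairwise inner products in the embedding space. A small sketch of that property follows, assuming a Gaussian (RBF) kernel as the example; the kernel choice, the `gamma` value, and the function name `rbf_kernel_matrix` are illustrative assumptions and are not taken from the referenced paper, which learns the kernel matrix via SDP rather than fixing it.

```python
import numpy as np

def rbf_kernel_matrix(X, gamma=0.5):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2) for rows of X."""
    sq = (X ** 2).sum(axis=1)
    # Squared distances via ||a||^2 + ||b||^2 - 2 a.b, clipped at 0 to
    # absorb tiny negative values caused by floating-point rounding.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))  # 20 illustrative points in 3-D

K = rbf_kernel_matrix(X)
is_symmetric = np.allclose(K, K.T)
min_eig = np.linalg.eigvalsh(K).min()  # smallest eigenvalue of K

print(is_symmetric, min_eig)
```

A smallest eigenvalue that is non-negative up to rounding error is the numerical signature of positive semidefiniteness, which is what allows `K` to be read as a matrix of inner products in some embedding space.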

Proceedings Article
04 Jul 2004
TL;DR: Experimental results are presented that show that the proposed novel dual formulation of the QCQP as a second-order cone programming problem is significantly more efficient than the general-purpose interior point methods available in current optimization toolboxes.
Abstract: While classical kernel-based classifiers are based on a single kernel, in practice it is often desirable to base classifiers on combinations of multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for the support vector machine (SVM), and showed that the optimization of the coefficients of such a combination reduces to a convex optimization problem known as a quadratically-constrained quadratic program (QCQP). Unfortunately, current convex optimization toolboxes can solve this problem only for a small number of kernels and a small number of data points; moreover, the sequential minimal optimization (SMO) techniques that are essential in large-scale implementations of the SVM cannot be applied because the cost function is non-differentiable. We propose a novel dual formulation of the QCQP as a second-order cone programming problem, and show how to exploit the technique of Moreau-Yosida regularization to yield a formulation to which SMO techniques can be applied. We present experimental results that show that our SMO-based algorithm is significantly more efficient than the general-purpose interior point methods available in current optimization toolboxes.

1,625 citations


"Image annotation by composite kerne..." refers background or methods in this paper

  • ...In the multiple kernel learning method [1], the problem of learning the best linear combination of kernels is given as follows: $K(x_i, x_j) = \sum_{m=1}^{M} \eta_m K_m(x_i, x_j)$ with coefficients subjected to $\sum_{m=1}^{M} \eta_m = 1$, $\eta_m \ge 0$, $1 \le m \le M$....

    [...]

  • ...Recently, Szafranski et al. [8] put forth an approach of composite kernel learning to overcome the deficiency of MKL....

    [...]

  • ...However, MKL does not take the group structure of features into account....

    [...]

  • ...Bach et al. [1] proposed Multiple Kernel Learning (MKL) to overcome such deficiency....

    [...]

  • ...The MKL method embedded the nonlinear image data with discriminative features into RKHS, and then utilized the kernel function in RKHS to select groups of discriminative features....

    [...]
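
The excerpts above describe MKL as a combination of M base kernels with non-negative weights that sum to 1. The sketch below shows only that combination step, with the weights fixed by hand; it does not learn the weights, so it is not an implementation of MKL or of CKLGS. The feature blocks ("color", "texture"), the kernel choices, and the names `combine_kernels` and `eta` are hypothetical and used for illustration.

```python
import numpy as np

def linear_kernel(A, B):
    """Linear kernel: plain inner products between rows of A and B."""
    return A @ B.T

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian kernel exp(-gamma * ||a - b||^2) between rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def combine_kernels(kernel_mats, eta):
    """Return sum_m eta[m] * K_m, enforcing eta[m] >= 0 and sum(eta) == 1."""
    eta = np.asarray(eta, dtype=float)
    if len(eta) != len(kernel_mats):
        raise ValueError("one weight is required per kernel matrix")
    if np.any(eta < 0) or not np.isclose(eta.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * K for w, K in zip(eta, kernel_mats))

# Two hypothetical feature groups for the same 10 images.
rng = np.random.default_rng(0)
color = rng.standard_normal((10, 4))    # assumed color features
texture = rng.standard_normal((10, 6))  # assumed texture features

K_color = rbf_kernel(color, color)
K_texture = linear_kernel(texture, texture)

# Hand-picked weights; an MKL solver would learn these from labeled data.
K = combine_kernels([K_color, K_texture], eta=[0.7, 0.3])
print(K.shape)
```

Because each base kernel is positive semidefinite and the weights are non-negative, the combined matrix is also a valid kernel and can be passed to any kernel classifier that accepts a precomputed Gram matrix.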

Proceedings Article
24 Aug 2008
TL;DR: This paper considers a general framework for extracting shared structures in multi-label classification, and includes several well-known algorithms as special cases, thus elucidating their intrinsic relationships.
Abstract: Multi-label problems arise in various domains such as multi-topic document categorization and protein function prediction. One natural way to deal with such problems is to construct a binary classifier for each label, resulting in a set of independent binary classification problems. Since the multiple labels share the same input space, and the semantics conveyed by different labels are usually correlated, it is essential to exploit the correlation information contained in different labels. In this paper, we consider a general framework for extracting shared structures in multi-label classification. In this framework, a common subspace is assumed to be shared among multiple labels. We show that the optimal solution to the proposed formulation can be obtained by solving a generalized eigenvalue problem, though the problem is non-convex. For high-dimensional problems, direct computation of the solution is expensive, and we develop an efficient algorithm for this case. One appealing feature of the proposed framework is that it includes several well-known algorithms as special cases, thus elucidating their intrinsic relationships. We have conducted extensive experiments on eleven multi-topic web page categorization tasks, and results demonstrate the effectiveness of the proposed formulation in comparison with several representative algorithms.

210 citations