
Showing papers on "Multiple kernel learning" published in 2007


Posted Content
TL;DR: In this paper, the authors consider the least-squares regression problem with regularization by a block 1-norm and derive necessary and sufficient conditions for the consistency of the group Lasso under practical assumptions, such as model misspecification.
Abstract: We consider the least-squares regression problem with regularization by a block 1-norm, i.e., a sum of Euclidean norms over spaces of dimension larger than one. This problem, referred to as the group Lasso, extends the usual regularization by the 1-norm, where all spaces have dimension one and the problem is commonly referred to as the Lasso. In this paper, we study the asymptotic model consistency of the group Lasso. We derive necessary and sufficient conditions for the consistency of the group Lasso under practical assumptions, such as model misspecification. When the linear predictors and Euclidean norms are replaced by functions and reproducing kernel Hilbert norms, the problem is usually referred to as multiple kernel learning and is commonly used for learning from heterogeneous data sources and for nonlinear variable selection. Using tools from functional analysis, and in particular covariance operators, we extend the consistency results to this infinite-dimensional case and also propose an adaptive scheme to obtain a consistent model estimate even when the necessary condition required for the non-adaptive scheme is not satisfied.
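The computational core of most group-Lasso solvers is the group soft-thresholding (proximal) operator induced by the block 1-norm. The paper above is purely theoretical, but a minimal numpy sketch of that operator may help fix ideas:

import numpy as np

def group_soft_threshold(w, groups, lam):
    """Proximal operator of the block 1-norm penalty lam * sum_g ||w_g||_2.

    Each group of coefficients is shrunk toward zero, and a whole group
    is zeroed out when its Euclidean norm falls below lam -- this is
    what produces group-level sparsity in the group Lasso.
    """
    w = w.copy()
    for g in groups:                       # g: array of indices for one block
        norm = np.linalg.norm(w[g])
        w[g] = 0.0 if norm <= lam else (1.0 - lam / norm) * w[g]
    return w

# One proximal-gradient step for  min_w 0.5*||y - Xw||^2 + lam * sum_g ||w_g||_2:
#   w = group_soft_threshold(w - step * X.T @ (X @ w - y), groups, step * lam)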

613 citations


Proceedings ArticleDOI
26 Dec 2007
TL;DR: This paper investigates the problem of learning optimal descriptors for a given classification task within the kernel learning framework, learning the optimal, domain-specific kernel as a combination of base kernels corresponding to base features that achieve different levels of trade-off.
Abstract: We investigate the problem of learning optimal descriptors for a given classification task. Many hand-crafted descriptors have been proposed in the literature for measuring visual similarity. Looking past initial differences, what really distinguishes one descriptor from another is the trade-off that it achieves between discriminative power and invariance. Since this trade-off must vary from task to task, no single descriptor can be optimal in all situations. Our focus, in this paper, is on learning the optimal trade-off for classification given a particular training set and prior constraints. The problem is posed in the kernel learning framework. We learn the optimal, domain-specific kernel as a combination of base kernels corresponding to base features which achieve different levels of trade-off (such as no invariance, rotation invariance, scale invariance, affine invariance, etc.). This leads to a convex optimisation problem with a unique global optimum which can be solved efficiently. The method is shown to achieve state-of-the-art performance on the UIUC textures, Oxford flowers and Caltech 101 datasets.
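Once the MKL step has produced the weights, the learned kernel is used like any other: the base Gram matrices are summed and handed to a precomputed-kernel SVM. A minimal sketch, assuming scikit-learn and precomputed base kernels (the weight-learning step itself is omitted):

import numpy as np
from sklearn.svm import SVC  # assumption: scikit-learn is available

def combine_kernels(base_kernels, weights):
    """Weighted sum K = sum_k d_k K_k of precomputed base Gram matrices,
    one per descriptor (e.g. rotation-, scale-, or affine-invariant features)."""
    return sum(d * K for d, K in zip(weights, base_kernels))

# Training with the learned combination (weights come from the MKL step):
#   K_train = combine_kernels(train_kernels, weights)
#   clf = SVC(kernel="precomputed").fit(K_train, y_train)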

566 citations


Proceedings ArticleDOI
20 Jun 2007
TL;DR: This paper proposes an algorithm for solving the MKL problem through an adaptive 2-norm regularization formulation and provides new insight into MKL algorithms based on block 1-norm regularization by showing that the two approaches are equivalent.
Abstract: An efficient and general multiple kernel learning (MKL) algorithm has been recently proposed by Sonnenburg et al. (2006). This approach has opened new perspectives since it makes the MKL approach tractable for large-scale problems, by iteratively using existing support vector machine code. However, it turns out that this iterative algorithm needs several iterations before converging towards a reasonable solution. In this paper, we address the MKL problem through an adaptive 2-norm regularization formulation. Weights on each kernel matrix are included in the standard SVM empirical risk minimization problem with an l1 constraint to encourage sparsity. We propose an algorithm for solving this problem and provide new insight into MKL algorithms based on block 1-norm regularization by showing that the two approaches are equivalent. Experimental results show that the resulting algorithm converges rapidly and that its efficiency compares favorably to other MKL algorithms.
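A hedged sketch of one such alternating iteration, assuming scikit-learn's SVC as the recycled SVM solver; the weight update shown is a common fixed-point heuristic for l1-constrained MKL, not necessarily the authors' exact step:

import numpy as np
from sklearn.svm import SVC  # the standard SVM solver being "recycled"

def mkl_weight_update(base_kernels, y, d, C=1.0):
    """One alternating step of l1-constrained MKL: train an SVM on the
    combined kernel K(d) = sum_k d_k K_k, then re-weight each kernel by
    its share ||f_k|| of the decision function's RKHS norm.
    """
    K = sum(dk * Kk for dk, Kk in zip(d, base_kernels))
    svm = SVC(C=C, kernel="precomputed").fit(K, y)
    coef = np.zeros(len(y))
    coef[svm.support_] = svm.dual_coef_[0]      # alpha_i * y_i on the SVs
    norms = np.array([dk * np.sqrt(coef @ Kk @ coef)
                      for dk, Kk in zip(d, base_kernels)])
    return norms / norms.sum()                  # project back onto the simplex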

310 citations


Proceedings ArticleDOI
20 Jun 2007
TL;DR: This work proposes MKL for joint feature maps, which provides a convenient and principled way for MKL with multiclass problems, and shows the equivalence of several different primal formulations including different regularizers.
Abstract: In many applications it is desirable to learn from several kernels. "Multiple kernel learning" (MKL) allows the practitioner to optimize over linear combinations of kernels. By enforcing sparse coefficients, it also generalizes feature selection to kernel selection. We propose MKL for joint feature maps. This provides a convenient and principled way for MKL with multiclass problems. In addition, we can exploit the joint feature map to learn kernels on output spaces. We show the equivalence of several different primal formulations including different regularizers. We present several optimization methods, and compare a convex quadratically constrained quadratic program (QCQP) and two semi-infinite linear programs (SILPs) on toy data, showing that the SILPs are faster than the QCQP. We then demonstrate the utility of our method by applying the SILP to three real world datasets.
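For intuition, one standard joint feature map for multiclass problems is Phi(x, y) = phi(x) tensored with the y-th standard basis vector, whose kernel factorizes as below; this is an illustrative assumption, not necessarily the exact map used in the paper:

import numpy as np

def joint_feature_kernel(Kx, ya, yb):
    """Kernel between joint feature maps Phi(x, y) = phi(x) (x) e_y, giving
    K((x, y), (x', y')) = Kx(x, x') * 1[y == y'].  Kx is an input-space Gram
    matrix; ya, yb are the corresponding label vectors."""
    return Kx * (ya[:, None] == yb[None, :])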

308 citations


Proceedings ArticleDOI
17 Jun 2007
TL;DR: A family of kernels between images is proposed, defined as kernels between their respective segmentation graphs and based on soft matching of subtree-patterns of those graphs, leveraging the natural structure of images while remaining robust to the uncertainty of the associated segmentation process.
Abstract: We propose a family of kernels between images, defined as kernels between their respective segmentation graphs. The kernels are based on soft matching of subtree-patterns of the respective graphs, leveraging the natural structure of images while remaining robust to the uncertainty of the associated segmentation process. Indeed, the output of morphological segmentation is often represented by a labelled graph, with each vertex corresponding to a segmented region and edges joining neighboring regions. However, such image representations have mostly remained underused for learning tasks, partly because of the observed instability of the segmentation process and the inherent hardness of inexact matching between uncertain graphs. Our kernels count common virtual substructures amongst images, which makes it possible to perform efficient supervised classification of natural images with a support vector machine. Moreover, the kernel machinery allows us to take advantage of recent advances in kernel-based learning: (i) semi-supervised learning reduces the required number of labelled images, while (ii) multiple kernel learning algorithms efficiently select the most relevant similarity measures between images within our family.
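To make the soft-matching idea concrete, here is a deliberately simplified, vertex-only sketch (numpy assumed): it compares all pairs of region descriptors with a Gaussian kernel, whereas the paper matches whole subtree-patterns of the region adjacency graphs:

import numpy as np

def region_soft_match_kernel(feats_a, feats_b, gamma=1.0):
    """Depth-zero 'soft matching' between two segmentation graphs: sum a
    Gaussian similarity over every pair of region descriptors.  feats_a
    and feats_b are (n_regions, n_features) arrays, one row per vertex.
    """
    d2 = ((feats_a[:, None, :] - feats_b[None, :, :]) ** 2).sum(axis=-1)
    return float(np.exp(-gamma * d2).sum())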

266 citations


Journal ArticleDOI
TL;DR: A novel rule extraction approach using the information provided by the separating hyperplane and the support vectors is proposed to improve the generalization capacity and comprehensibility of the rules and to reduce the computational complexity of the SVM.

113 citations


Book ChapterDOI
11 Apr 2007
TL;DR: This work proposes an evolutionary approach for finding the optimal weights of a combined kernel used by the Support Vector Machine (SVM) algorithm for particular classification problems, using a genetic algorithm to evolve these weights.
Abstract: Standard kernel-based classifiers use only a single kernel, but real-world applications and recent developments of various kernel methods have emphasized the need to consider a combination of multiple kernels. We propose an evolutionary approach for finding the optimal weights of a combined kernel used by the Support Vector Machine (SVM) algorithm for particular classification problems. We use a genetic algorithm (GA) for evolving these weights. The numerical experiments show that the evolved combined kernels (ECKs) perform better than the convex combined kernels (CCKs) for several classification problems.
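A minimal sketch of such a GA, assuming a user-supplied fitness function (e.g. the SVM cross-validation accuracy of the combined kernel); the selection and mutation operators here are generic stand-ins, not the paper's exact ones:

import numpy as np

def evolve_kernel_weights(fitness, n_kernels, pop=20, gens=50, sigma=0.1, seed=0):
    """Evolve weights w for a combined kernel sum_k w_k K_k with a simple
    GA: truncation selection plus Gaussian mutation.  `fitness(w)` must
    return a score to maximize, e.g. cross-validation accuracy.
    """
    rng = np.random.default_rng(seed)
    P = rng.random((pop, n_kernels))                   # initial population
    for _ in range(gens):
        scores = np.array([fitness(w) for w in P])
        elite = P[np.argsort(scores)[-(pop // 2):]]    # keep the best half
        children = elite + sigma * rng.standard_normal(elite.shape)
        P = np.clip(np.vstack([elite, children]), 0.0, None)  # weights >= 0
    return max(P, key=fitness)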

22 citations


Proceedings ArticleDOI
13 Jun 2007
TL;DR: It is shown that the MKL framework enables model selection and improves performance; three different applications are proposed, concerning combination of representations, automatic parameter setting, and feature selection.
Abstract: This paper presents a pedestrian detection method based on the multiple kernel framework. This approach enables us to select and combine different kinds of image representations. The combination is done through a linear combination of kernels, weighted according to the relevance of each kernel. After presenting some descriptors and detailing the multiple kernel framework, we propose three different applications: combination of representations, automatic parameter setting, and feature selection. We then show that the MKL framework enables us to perform model selection and improve performance.
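For the feature-selection application, a common construction is one base kernel per scalar feature, so that sparse MKL weights prune features directly; a sketch of that construction (numpy assumed, not the paper's exact descriptors):

import numpy as np

def per_feature_kernels(X, gamma=1.0):
    """One Gaussian Gram matrix per column of X, so that near-zero MKL
    weights on a kernel discard the corresponding feature."""
    return [np.exp(-gamma * (X[:, j:j + 1] - X[:, j][None, :]) ** 2)
            for j in range(X.shape[1])]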

9 citations


01 Oct 2007
TL;DR: This paper proposes an optimal way of integrating multiple features in the framework of multiple kernel learning, optimally combining seven kernels extracted from sequence, physico-chemical properties, pairwise alignment, and structural information, and significantly improving prediction performance compared with previous well-known methods.
Abstract: Phosphorylation is one of the most important post-translational modifications that regulate the activity of proteins. Predicting phosphorylation sites is the first step toward understanding the various biological processes that initiate the actual function of proteins in each signaling pathway. Although many prediction methods using single or multiple features extracted from protein sequences have been proposed, a systematic data-integration approach has not been applied to improve the accuracy of predicting general phosphorylation sites. In this paper, we propose an optimal way of integrating multiple features in the framework of multiple kernel learning. We optimally combine seven kernels extracted from sequence, physico-chemical properties, pairwise alignment, and structural information. On the Phospho.ELM data set, the accuracy evaluated by 5-fold cross-validation reaches 85% for serine, 85% for threonine, and 81% for tyrosine. Our computational experiments show a significant improvement in prediction performance relative to a single feature, or to a combined feature with equal weights. Moreover, our systematic integration method significantly improves prediction performance compared with previous well-known methods.
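A sketch of the evaluation loop, assuming scikit-learn and precomputed base Gram matrices; the kernel weights are taken as given here, whereas the paper learns them by MKL:

import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

def cv_accuracy(base_kernels, weights, y, C=1.0, folds=5):
    """5-fold cross-validated accuracy for a weighted sum of precomputed
    kernels, e.g. heterogeneous feature kernels built from sequence,
    physico-chemical, alignment, and structural information.
    """
    K = sum(w * Kk for w, Kk in zip(weights, base_kernels))
    accs = []
    for tr, te in StratifiedKFold(folds, shuffle=True, random_state=0).split(K, y):
        clf = SVC(C=C, kernel="precomputed").fit(K[np.ix_(tr, tr)], y[tr])
        accs.append(clf.score(K[np.ix_(te, tr)], y[te]))
    return float(np.mean(accs))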

Journal Article
TL;DR: The proposed multiple kernel learning algorithm shows that conic combinations of kernel matrices for classification lead to a convex quadratically constrained quadratic program, which can be efficiently solved by recycling standard SVM implementations.
Abstract: Text classification often involves multiple, heterogeneous data sources, and this paper puts forward a multiple kernel learning algorithm for this setting. It shows that conic combinations of kernel matrices for classification lead to a convex quadratically constrained quadratic program, which can be efficiently solved by recycling standard SVM implementations. Experimental results show that the proposed algorithm scales to hundreds of thousands of examples or hundreds of kernels to be combined, and that it achieves higher recall and precision when classifying e-mail text drawn from multiple, heterogeneous data sources.
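One common form of this QCQP (after normalizing each base kernel, e.g. to unit trace) is, schematically,

\begin{aligned}
\max_{\alpha,\, t} \quad & \sum_{i} \alpha_i - t \\
\text{s.t.} \quad & \tfrac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j K_k(x_i, x_j) \le t, \qquad k = 1, \dots, m, \\
& 0 \le \alpha_i \le C, \qquad \sum_{i} \alpha_i y_i = 0 .
\end{aligned}

Each candidate kernel contributes one quadratic constraint; semi-infinite reformulations of this program alternate between a linear program over the kernel weights and a standard SVM training run, which is how existing SVM implementations get recycled.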