Topic

Multiple kernel learning

About: Multiple kernel learning is a research topic. Over its lifetime, 1,630 publications have been published on this topic, receiving 56,082 citations.


Papers
Journal ArticleDOI
TL;DR: This work uses a sparse version of Multiple Kernel Learning (MKL) to simultaneously learn the contribution of each brain region, previously defined by an atlas, to the decision function and shows how this can lead to improved overall generalisation performance.
Abstract: Pattern recognition models have been increasingly applied to neuroimaging data over the last two decades. These applications have ranged from cognitive neuroscience to clinical problems. A common limitation of these approaches is that they do not incorporate previous knowledge about the brain structure and function into the models. Previous knowledge can be embedded into pattern recognition models by imposing a grouping structure based on anatomically or functionally defined brain regions. In this work, we present a novel approach that uses group sparsity to model the whole brain multivariate pattern as a combination of regional patterns. More specifically, we use a sparse version of Multiple Kernel Learning (MKL) to simultaneously learn the contribution of each brain region, previously defined by an atlas, to the decision function. Our application of MKL provides two beneficial features: (1) it can lead to improved overall generalisation performance when the grouping structure imposed by the atlas is consistent with the data; (2) it can identify a subset of relevant brain regions for the predictive model. In order to investigate the effect of the grouping in the proposed MKL approach we compared the results of three different atlases using three different datasets. The method has been implemented in the new version of the open-source Pattern Recognition for Neuroimaging Toolbox (PRoNTo).

58 citations
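The group-sparse MKL step above lends itself to a compact sketch. Below is a minimal Python illustration assuming one RBF kernel per atlas region (given as a list of column-index groups) and simplex-constrained kernel weights updated by projected gradient on the SVM dual objective, in the spirit of SimpleMKL; `groups`, the learning rate, and the iteration count are illustrative assumptions, not the PRoNTo implementation.

```python
# Sketch of group-sparse MKL: one kernel per brain region, weights on the
# probability simplex (hence sparse), learned SimpleMKL-style. Assumes
# binary labels; `groups` is a hypothetical list of column-index arrays.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

def region_kernels(X, groups):
    """One RBF kernel per feature group (e.g. per atlas region)."""
    return [rbf_kernel(X[:, g]) for g in groups]

def simplex_project(v):
    """Euclidean projection onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css - 1)[0][-1]
    return np.maximum(v - (css[rho] - 1.0) / (rho + 1.0), 0.0)

def sparse_mkl(kernels, y, iters=50, lr=1.0, C=1.0):
    beta = np.full(len(kernels), 1.0 / len(kernels))  # uniform start
    for _ in range(iters):
        K = sum(b * Km for b, Km in zip(beta, kernels))
        svc = SVC(C=C, kernel="precomputed").fit(K, y)
        sv = svc.support_
        a = svc.dual_coef_.ravel()                    # a_i = y_i * alpha_i
        # Gradient of the SVM dual objective w.r.t. each kernel weight.
        grad = np.array([-0.5 * a @ Km[np.ix_(sv, sv)] @ a for Km in kernels])
        beta = simplex_project(beta - lr * grad)      # sparse, sums to 1
    return beta                                       # one weight per region
```

Regions whose kernels receive zero weight drop out of the decision function, which is how a subset of relevant regions can be identified.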

Journal ArticleDOI
TL;DR: The experimental results show that the proposed MK-SVM method not only achieves better overall performance by taking advantage of multiple features but also has low computational complexity.
Abstract: In this letter, we propose a multiple kernel support vector machine (MK-SVM) method for voice activity detection (VAD) based on multiple features. To make the MK-SVM based VAD practical, we adapt the multiple kernel learning (MKL) idea to an efficient cutting-plane structural SVM solver. We further discuss the performance of the MK-SVM under two different optimization objectives: minimum classification error (MCE) and improvement of the receiver operating characteristic (ROC) curve. Our experimental results show that the proposed method not only achieves better overall performance by taking advantage of multiple features but also has low computational complexity.

58 citations
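The multiple-feature construction is easy to illustrate: one kernel per acoustic feature type, a weighted sum, and a single SVM on frame labels. A toy sketch follows; the feature names, sizes, and uniform weights are placeholders, and the cutting-plane structural solver and MCE/ROC objectives from the letter are not reproduced.

```python
# Toy multiple-feature VAD: combine per-feature-type kernels, train one SVM.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

def combined_kernel(views, weights=None):
    """views: list of (n_frames, d_m) arrays, one per feature type."""
    if weights is None:
        weights = [1.0 / len(views)] * len(views)  # uniform fallback
    return sum(w * rbf_kernel(V) for w, V in zip(weights, views))

# Random stand-ins for real frame-level features (e.g. MFCCs, energy).
rng = np.random.default_rng(0)
views = [rng.normal(size=(200, 13)), rng.normal(size=(200, 1))]
y = rng.integers(0, 2, size=200)            # 1 = speech, 0 = non-speech
K = combined_kernel(views)
vad = SVC(kernel="precomputed", C=1.0).fit(K, y)
print(vad.score(K, y))                      # training accuracy only
```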

Proceedings ArticleDOI
04 Dec 2013
TL;DR: This paper combines kernels at each layer and then optimizes over an estimate of the support vector machine leave-one-out error, rather than the dual objective function, improving performance on a variety of datasets.
Abstract: Deep learning methods have predominantly been applied to large artificial neural networks. Despite their state-of-the-art performance, these large networks typically do not generalize well to datasets with limited sample sizes. In this paper, we take a different approach by learning multiple layers of kernels. We combine kernels at each layer and then optimize over an estimate of the support vector machine leave-one-out error rather than the dual objective function. Our experiments on a variety of datasets show that each layer successively increases performance with only a few base kernels.

57 citations
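The selection criterion can be sketched with a stand-in: Vapnik's classic bound, leave-one-out error at most (#support vectors)/n, replaces the paper's estimate, the layer-wise kernel composition is omitted, and candidate weight vectors are assumed given.

```python
# Score candidate kernel combinations by an estimate of the SVM
# leave-one-out error instead of the dual objective. The #SV/n bound
# here is a simple stand-in for the estimate used in the paper.
import numpy as np
from sklearn.svm import SVC

def loo_upper_bound(K, y, C=1.0):
    svc = SVC(C=C, kernel="precomputed").fit(K, y)
    return len(svc.support_) / len(y)       # Vapnik's #SV/n bound

def pick_weights(kernels, y, candidates):
    """candidates: iterable of weight vectors over the base kernels."""
    def bound(ws):
        K = sum(w * Km for w, Km in zip(ws, kernels))
        return loo_upper_bound(K, y)
    return min(candidates, key=bound)       # weights minimizing the bound
```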

Book ChapterDOI
07 Oct 2012
TL;DR: It is shown that a scalable optimization process in the Fourier domain can be used to identify the different frequency bands that are useful for prediction on training data and recover efficient and scalable linear reformulations for both single and multiple kernel learning.
Abstract: Approximations based on random Fourier embeddings have recently emerged as an efficient and formally consistent methodology to design large-scale kernel machines [23]. By expressing the kernel as a Fourier expansion, features are generated based on a finite set of random basis projections, sampled from the Fourier transform of the kernel, with inner products that are Monte Carlo approximations of the original non-linear model. Based on the observation that different kernel-induced Fourier sampling distributions correspond to different kernel parameters, we show that a scalable optimization process in the Fourier domain can be used to identify the different frequency bands that are useful for prediction on training data. This approach allows us to design a family of linear prediction models where we can learn the hyper-parameters of the kernel together with the weights of the feature vectors jointly. Under this methodology, we recover efficient and scalable linear reformulations for both single and multiple kernel learning. Experiments show that our linear models produce fast and accurate predictors for complex datasets such as the Visual Object Challenge 2011 and ImageNet ILSVRC 2011.

57 citations
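The random Fourier embedding at the heart of this line of work is standard (Rahimi and Recht) and easy to sketch: sample frequencies from the Fourier transform of an RBF kernel, a Gaussian, so that feature inner products approximate the kernel. The paper's contribution, optimizing the sampling distribution itself, is not shown; `sigma` and `n_features` are illustrative.

```python
# Random Fourier features for the RBF kernel k(x, z) = exp(-||x-z||^2 / (2 s^2)):
# frequencies W are drawn from the kernel's Fourier transform, N(0, s^-2 I).
import numpy as np

def rff(X, n_features=500, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / sigma, size=(X.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Sanity check: feature dot products track the exact kernel values.
X = np.random.default_rng(1).normal(size=(5, 3))
Z = rff(X)
exact = np.exp(-((X[:, None] - X[None]) ** 2).sum(-1) / 2.0)
print(np.abs(Z @ Z.T - exact).max())        # small Monte Carlo error
```

With such features, kernel machines reduce to linear models on Z, which is what makes learning the kernel hyper-parameters jointly with the weights scalable.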

Posted Content
TL;DR: In this article, the authors propose a new algorithm that first consistently estimates the noise variance, based on the concept of minimal penalty previously introduced in the context of model selection; plugging this variance estimate into Mallows' $C_L$ penalty is proved to yield an algorithm satisfying an oracle inequality.
Abstract: This paper tackles the problem of selecting among several linear estimators in non-parametric regression; this includes model selection for linear regression, the choice of a regularization parameter in kernel ridge regression, spline smoothing or locally weighted regression, and the choice of a kernel in multiple kernel learning. We propose a new algorithm which first estimates consistently the variance of the noise, based upon the concept of minimal penalty, which was previously introduced in the context of model selection. Then, plugging our variance estimate in Mallows' $C_L$ penalty is proved to lead to an algorithm satisfying an oracle inequality. Simulation experiments with kernel ridge regression and multiple kernel learning show that the proposed algorithm often improves significantly existing calibration procedures such as generalized cross-validation.

57 citations
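The $C_L$ step is simple to sketch for kernel ridge regression, where the smoothing matrix is $A_\lambda = K(K+\lambda I)^{-1}$: pick the $\lambda$ minimizing $\|y - A_\lambda y\|^2 + 2\sigma^2\,\mathrm{tr}(A_\lambda)$. The `sigma2` argument below would come from the paper's minimal-penalty estimator, which this sketch does not implement.

```python
# Mallows' C_L for the ridge parameter in kernel ridge regression.
# sigma2 is a plug-in noise-variance estimate (minimal-penalty estimation
# of it, the paper's first step, is not implemented here).
import numpy as np

def cl_select(K, y, lams, sigma2):
    n = len(y)
    def cl(lam):
        A = K @ np.linalg.inv(K + lam * np.eye(n))  # smoothing matrix A_lam
        resid = y - A @ y
        return resid @ resid + 2.0 * sigma2 * np.trace(A)
    return min(lams, key=cl)                        # lambda minimizing C_L
```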


Network Information
Related Topics (5)
Convolutional neural network: 74.7K papers, 2M citations, 89% related
Deep learning: 79.8K papers, 2.1M citations, 89% related
Feature extraction: 111.8K papers, 2.1M citations, 87% related
Feature (computer vision): 128.2K papers, 1.7M citations, 87% related
Image segmentation: 79.6K papers, 1.8M citations, 86% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    21
2022    44
2021    72
2020    101
2019    113
2018    114