Topic

Multiple kernel learning

About: Multiple kernel learning is a machine-learning research topic concerned with learning an optimal combination of multiple base kernels. Over the lifetime, 1630 publications have been published within this topic, receiving 56082 citations.


Papers
Proceedings Article
08 Dec 2008
TL;DR: In this article, the kernel is assumed to decompose into a large sum of individual basis kernels that can be embedded in a directed acyclic graph; kernel selection can then be performed through a hierarchical multiple kernel learning framework, in time polynomial in the number of selected kernels.
Abstract: For supervised and unsupervised learning, positive definite kernels allow the use of large and potentially infinite-dimensional feature spaces with a computational cost that depends only on the number of observations. This is usually done through the penalization of predictor functions by Euclidean or Hilbertian norms. In this paper, we explore penalizing by sparsity-inducing norms such as the l1-norm or the block l1-norm. We assume that the kernel decomposes into a large sum of individual basis kernels which can be embedded in a directed acyclic graph; we show that it is then possible to perform kernel selection through a hierarchical multiple kernel learning framework, in polynomial time in the number of selected kernels. This framework is naturally applied to nonlinear variable selection; our extensive simulations on synthetic datasets and datasets from the UCI repository show that efficiently exploring the large feature space through sparsity-inducing norms leads to state-of-the-art predictive performance.

137 citations
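The sparsity-inducing selection idea above can be sketched in miniature: given several basis kernels, learn nonnegative, l1-penalized combination weights by aligning the combined kernel with the ideal label kernel y yᵀ via projected gradient descent. This is an illustrative toy, not the paper's hierarchical DAG algorithm; all function names and hyperparameters here are assumptions.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian RBF kernel matrix over the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

def sparse_kernel_weights(kernels, y, lam=0.01, lr=0.01, steps=500):
    """Nonnegative, l1-penalized weights aligning sum_k w_k K_k with the
    ideal label kernel y y^T (simple projected gradient descent)."""
    T = np.outer(y, y)
    n2 = float(y.size ** 2)
    w = np.full(len(kernels), 1.0 / len(kernels))
    for _ in range(steps):
        K = sum(wk * Kk for wk, Kk in zip(w, kernels))
        grad = np.array([np.sum((K - T) * Kk) for Kk in kernels]) / n2 + lam
        w = np.maximum(w - lr * grad, 0.0)  # project onto w_k >= 0
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = np.sign(X[:, 0])  # only feature 0 carries the label
kernels = [rbf_kernel(X[:, [j]], 1.0) for j in range(3)]
w = sparse_kernel_weights(kernels, y)
```

With one basis kernel per input variable, the l1 penalty drives the weights of uninformative variables toward zero, which is the nonlinear variable selection use case the abstract describes.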

Journal ArticleDOI
TL;DR: It is observed that while the direction of returns is not predictable using either text or returns, their size is, with text features producing significantly better performance than historical returns alone.
Abstract: We show how text from news articles can be used to predict intraday price movements of financial assets using support vector machines. Multiple kernel learning is used to combine equity returns with text as predictive features to increase classification performance and we develop an analytic center cutting plane method to solve the kernel learning problem efficiently. We observe that while the direction of returns is not predictable using either text or returns, their size is, with text features producing significantly better performance than historical returns alone.

135 citations
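The combination of returns and text features can be illustrated with a toy fixed-weight kernel sum and a precomputed-kernel classifier. This sketch uses kernel ridge regression with an equal-weight combination rather than the paper's SVM and analytic center cutting-plane method; the synthetic "views" and all names are assumptions.

```python
import numpy as np

def linear_kernel(X):
    return X @ X.T

def kernel_ridge_fit(K, y, reg=1e-2):
    """Dual coefficients of kernel ridge regression on a precomputed kernel."""
    return np.linalg.solve(K + reg * np.eye(len(y)), y)

rng = np.random.default_rng(1)
n = 60
y = np.where(rng.random(n) < 0.5, 1.0, -1.0)
X_returns = 0.5 * y[:, None] + rng.normal(size=(n, 6))   # noisy "returns" view
X_text = 0.5 * y[:, None] + rng.normal(size=(n, 80))     # noisy "text" view

# Fixed-weight multiple kernel combination of the two views.
K = 0.5 * linear_kernel(X_returns) + 0.5 * linear_kernel(X_text)
alpha = kernel_ridge_fit(K, y)
train_acc = np.mean(np.sign(K @ alpha) == y)
```

In the paper the combination weights are themselves learned rather than fixed; the point of the sketch is only that heterogeneous feature sources enter the model through their kernels.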

Proceedings ArticleDOI
01 Sep 2009
TL;DR: This paper introduces bag-of-detector scene descriptors that encode presence/absence and structural relations between object parts and derives a novel Bayesian classification method based on Gaussian processes with multiple kernel covariance functions (MKGPC), in order to automatically select and weight multiple features out of a large collection.
Abstract: Recognizing human action in non-instrumented video is a challenging task not only because of the variability produced by general scene factors like illumination, background, occlusion or intra-class variability, but also because of subtle behavioral patterns among interacting people or between people and objects in images. To improve recognition, a system may need to use not only low-level spatio-temporal video correlations but also relational descriptors between people and objects in the scene. In this paper we present contextual scene descriptors and Bayesian multiple kernel learning methods for recognizing human action in complex non-instrumented video. Our contribution is threefold: (1) we introduce bag-of-detector scene descriptors that encode presence/absence and structural relations between object parts; (2) we derive a novel Bayesian classification method based on Gaussian processes with multiple kernel covariance functions (MKGPC), in order to automatically select and weight multiple features, both low-level and high-level, out of a large collection, in a principled way; and (3) we perform large-scale evaluation using a variety of features on the KTH dataset and a recently introduced, challenging Hollywood movie dataset. On the KTH dataset, we obtain 94.1% accuracy, the best result reported to date. On the Hollywood dataset we obtain promising results in several action classes using fewer descriptors, with an improvement of about 9.1% over a previous benchmark.

134 citations
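The core idea of a multiple kernel covariance function can be shown in a few lines: a Gaussian process whose covariance is a weighted sum of RBF kernels, with the weights selecting among candidate feature scales. This is a minimal GP regression sketch (the paper addresses classification and learns the weights); the weights, bandwidths, and names below are assumptions.

```python
import numpy as np

def rbf(X, Z, gamma):
    d2 = np.sum(X ** 2, 1)[:, None] + np.sum(Z ** 2, 1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-gamma * d2)

def multi_kernel_cov(X, Z, weights, gammas):
    """GP covariance as a weighted sum of RBF kernels (MKGPC in miniature)."""
    return sum(w * rbf(X, Z, g) for w, g in zip(weights, gammas))

rng = np.random.default_rng(2)
X = rng.uniform(-3.0, 3.0, size=(30, 1))
y = np.sin(X[:, 0])
weights, gammas = [0.7, 0.3], [0.5, 2.0]

K = multi_kernel_cov(X, X, weights, gammas) + 1e-6 * np.eye(30)  # jitter for stability
alpha = np.linalg.solve(K, y)
X_test = np.array([[0.0], [1.0]])
mean = multi_kernel_cov(X_test, X, weights, gammas) @ alpha      # GP posterior mean
```

Because each kernel weight multiplies an entire covariance component, driving a weight to zero removes that feature channel from the model, which is how the method performs automatic feature selection.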

Proceedings Article
12 Feb 2016
TL;DR: This paper proposes an MKKM clustering with a novel, effective matrix-induced regularization to reduce such redundancy and enhance the diversity of the selected kernels and shows that maximizing the kernel alignment for clustering can be viewed as a special case of this approach.
Abstract: Multiple kernel k-means (MKKM) clustering aims to optimally combine a group of pre-specified kernels to improve clustering performance. However, we observe that existing MKKM algorithms do not sufficiently consider the correlation among these kernels. This could result in selecting mutually redundant kernels and affect the diversity of information sources utilized for clustering, which finally hurts the clustering performance. To address this issue, this paper proposes an MKKM clustering with a novel, effective matrix-induced regularization to reduce such redundancy and enhance the diversity of the selected kernels. We theoretically justify this matrix-induced regularization by revealing its connection with the commonly used kernel alignment criterion. Furthermore, this justification shows that maximizing the kernel alignment for clustering can be viewed as a special case of our approach and indicates the extendability of the proposed matrix-induced regularization for designing better clustering algorithms. As experimentally demonstrated on five challenging MKL benchmark data sets, our algorithm significantly improves existing MKKM and consistently outperforms the state-of-the-art ones in the literature, verifying the effectiveness and advantages of incorporating the proposed matrix-induced regularization.

133 citations
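The kernel correlation that the matrix-induced regularization targets can be sketched as a matrix of Frobenius inner products between the candidate kernels: highly correlated (redundant) kernel pairs get large entries, and penalizing a weighted form of this matrix discourages selecting them together. This is an illustrative computation of such a redundancy matrix under assumed bandwidths, not the paper's MKKM objective.

```python
import numpy as np

def rbf_kernel(X, gamma):
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

def redundancy_matrix(kernels):
    """M[p, q] = Frobenius inner product <K_p, K_q>, measuring how
    correlated (redundant) each pair of candidate kernels is."""
    m = len(kernels)
    M = np.empty((m, m))
    for p in range(m):
        for q in range(m):
            M[p, q] = np.sum(kernels[p] * kernels[q])
    return M

rng = np.random.default_rng(3)
X = rng.normal(size=(25, 4))
# Two near-duplicate wide-bandwidth kernels plus one very narrow one.
kernels = [rbf_kernel(X, 0.05), rbf_kernel(X, 0.06), rbf_kernel(X, 10.0)]
M = redundancy_matrix(kernels)
```

Here M[0, 1] is much larger than M[0, 2], so a regularizer built from M would discourage picking both near-duplicate kernels, keeping the selected set diverse.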

Journal ArticleDOI
TL;DR: A novel soft margin perspective for MKL is presented, which introduces an additional slack variable called kernel slack variable to each quadratic constraint of MKL, which corresponds to one support vector machine model using a single base kernel.
Abstract: Multiple kernel learning (MKL) has been proposed for kernel methods by learning the optimal kernel from a set of predefined base kernels. However, the traditional $L_1$ MKL method often achieves worse results than the simplest method using the average of base kernels (i.e., average kernel) in some practical applications. In order to improve the effectiveness of MKL, this paper presents a novel soft margin perspective for MKL. Specifically, we introduce an additional slack variable called the kernel slack variable to each quadratic constraint of MKL, each of which corresponds to one support vector machine model using a single base kernel. We first show that $L_1$ MKL can be deemed hard margin MKL, and then we propose a novel soft margin framework for MKL. Three commonly used loss functions, including the hinge loss, the square hinge loss, and the square loss, can be readily incorporated into this framework, leading to new soft margin MKL objective functions. Many existing MKL methods can be shown to be special cases under our soft margin framework. For example, the hinge loss soft margin MKL leads to a new box constraint for the kernel combination coefficients. Using different hyper-parameter values for this formulation, we can inherently bridge the method using the average kernel, $L_1$ MKL, and the hinge loss soft margin MKL. The square hinge loss soft margin MKL unifies the family of elastic net constraint/regularizer based approaches, and the square loss soft margin MKL incorporates $L_2$ MKL naturally. Moreover, we also develop efficient algorithms for solving both the hinge loss and square hinge loss soft margin MKL. Comprehensive experimental studies for various MKL algorithms on several benchmark data sets and two real world applications, including video action recognition and event recognition, demonstrate that our proposed algorithms can efficiently achieve an effective yet sparse solution for MKL.

130 citations
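The bridging behavior of the box constraint can be made concrete with a toy projection: clipping score-based kernel weights to a box [0, C] and renormalizing. A tight box flattens the weights toward the average kernel, while a loose box leaves the sparse, selection-like profile of $L_1$ MKL. This is only an illustration of the constraint's effect, not the paper's optimization; the scores and constants are made up.

```python
import numpy as np

def box_constrained_weights(scores, C):
    """Project score-based kernel weights onto the box 0 <= w_k <= C,
    then normalize to the simplex (illustrative, not the paper's solver)."""
    w = np.clip(scores, 0.0, C)
    return w / w.sum()

scores = np.array([1.0, 0.4, 0.1, 0.0])
w_small_C = box_constrained_weights(scores, C=0.05)  # tight box: near-uniform weights
w_large_C = box_constrained_weights(scores, C=10.0)  # loose box: sparse, selection-like
```

As C shrinks, all surviving weights hit the box boundary and become equal, recovering average-kernel behavior; as C grows, the box is inactive and the weight profile stays concentrated on the strongest kernels.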


Network Information
Related Topics (5)
Convolutional neural network
74.7K papers, 2M citations
89% related
Deep learning
79.8K papers, 2.1M citations
89% related
Feature extraction
111.8K papers, 2.1M citations
87% related
Feature (computer vision)
128.2K papers, 1.7M citations
87% related
Image segmentation
79.6K papers, 1.8M citations
86% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    21
2022    44
2021    72
2020    101
2019    113
2018    114