Predictive low-rank decomposition for kernel methods
Francis Bach, Michael I. Jordan
pp. 33–40
TLDR
This paper presents an algorithm that can exploit side information (e.g., classification labels, regression responses) in the computation of low-rank decompositions for kernel matrices, and presents simulation results showing that the algorithm yields decompositions of significantly smaller rank than those found by incomplete Cholesky decomposition.

Abstract
Low-rank matrix decompositions are essential tools in the application of kernel methods to large-scale learning problems. These decompositions have generally been treated as black boxes---the decomposition of the kernel matrix that they deliver is independent of the specific learning task at hand---and this is a potentially significant source of inefficiency. In this paper, we present an algorithm that can exploit side information (e.g., classification labels, regression responses) in the computation of low-rank decompositions for kernel matrices. Our algorithm has the same favorable scaling as state-of-the-art methods such as incomplete Cholesky decomposition---it is linear in the number of data points and quadratic in the rank of the approximation. We present simulation results that show that our algorithm yields decompositions of significantly smaller rank than those found by incomplete Cholesky decomposition.
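The incomplete Cholesky baseline mentioned in the abstract can be sketched as follows. This is a generic pivoted version that greedily pivots on the largest residual diagonal entry and uses no side information (i.e., the "black box" baseline, not the predictive variant the paper proposes); each of the m steps costs O(n·m), giving the linear-in-n, quadratic-in-rank scaling the abstract describes:

```python
import numpy as np

def incomplete_cholesky(K, rank, tol=1e-8):
    """Pivoted incomplete Cholesky: returns G with K ~= G @ G.T."""
    n = K.shape[0]
    G = np.zeros((n, rank))
    d = np.diag(K).astype(float).copy()   # residual diagonal of K - G @ G.T
    for j in range(rank):
        i = int(np.argmax(d))             # pivot on largest residual entry
        if d[i] < tol:                    # residual negligible: stop early
            return G[:, :j]
        G[:, j] = (K[:, i] - G @ G[i]) / np.sqrt(d[i])
        d -= G[:, j] ** 2
    return G

# Usage: approximate a 200 x 200 Gaussian kernel matrix at rank 20.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
sq = ((X[:, None] - X[None]) ** 2).sum(-1)
K = np.exp(-sq / 6.0)
G = incomplete_cholesky(K, 20)
err = np.linalg.norm(K - G @ G.T) / np.linalg.norm(K)
```

Note that the pivot rule looks only at the kernel matrix itself; the paper's contribution is precisely to let labels or responses influence which columns are selected.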
Citations
Journal Article
Efficient Additive Kernels via Explicit Feature Maps
Andrea Vedaldi, Andrew Zisserman
TL;DR: This work introduces explicit feature maps for the additive class of kernels, such as the intersection, Hellinger's, and χ2 kernels, commonly used in computer vision, and enables their use in large scale problems.
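Of the additive kernels named above, the Hellinger kernel is the simplest case: it admits an exact closed-form explicit map (phi(x) = sqrt(x)), so the kernel reduces to an ordinary dot product, whereas the χ2 and intersection kernels need the approximate sampled maps developed in the paper. A minimal sketch of the exact case:

```python
import numpy as np

# Histograms (nonnegative, e.g. l1-normalized bag-of-words features).
x = np.array([0.2, 0.3, 0.5])
y = np.array([0.1, 0.6, 0.3])

# Hellinger kernel evaluated directly: K(x, y) = sum_i sqrt(x_i * y_i).
k_direct = np.sum(np.sqrt(x * y))

# The same value via the explicit feature map phi(x) = sqrt(x):
# the additive kernel becomes a plain inner product in feature space,
# which is what enables fast linear solvers on large-scale problems.
k_mapped = np.dot(np.sqrt(x), np.sqrt(y))
```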
Journal Article
Kernel methods in system identification, machine learning and function estimation: A survey
TL;DR: A survey of kernel-based regularization and its connections with reproducing kernel Hilbert spaces and Bayesian estimation of Gaussian processes, demonstrating that learning techniques tailored to the specific features of dynamic systems may outperform conventional parametric approaches for the identification of stable linear systems.
Proceedings Article
Efficient additive kernels via explicit feature maps
Andrea Vedaldi, Andrew Zisserman
TL;DR: It is shown that the χ2 kernel, which has been found to yield the best performance in most applications, also has the most compact feature representation, and is able to obtain a significant performance improvement over current state of the art results based on the intersection kernel.
Journal Article
Sampling methods for the Nyström method
TL;DR: This work reports results of extensive experiments that provide a detailed comparison of various fixed and adaptive sampling techniques, and demonstrates the performance improvement associated with the ensemble Nyström method when used in conjunction with either fixed or adaptive sampling schemes.
Proceedings Article
Improved Nyström low-rank approximation and error analysis
TL;DR: An error analysis that directly relates the Nyström approximation quality with the encoding powers of the landmark points in summarizing the data is presented, and the resultant error bound suggests a simple and efficient sampling scheme, the k-means clustering algorithm, for NyStröm low-rank approximation.
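The landmark-based scheme behind that entry can be sketched as follows; this is an illustrative Nyström approximation with k-means-style landmarks (a few hand-rolled Lloyd iterations stand in for a real k-means routine), not the paper's exact procedure or error bound:

```python
import numpy as np

def rbf(A, B, gamma=0.2):
    """Gaussian kernel matrix between row sets A and B."""
    sq = ((A[:, None] - B[None]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

rng = np.random.default_rng(1)
X = rng.standard_normal((300, 2))

# Landmarks from a few Lloyd (k-means) iterations, standing in for
# the k-means clustering the error analysis suggests.
m = 15
centers = X[rng.choice(len(X), m, replace=False)]
for _ in range(10):
    labels = ((X[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
    for j in range(m):
        members = X[labels == j]
        if len(members):
            centers[j] = members.mean(0)

# Nystrom approximation: K ~= C @ pinv(W) @ C.T, built from the
# n x m cross-kernel C and the m x m landmark kernel W.
C = rbf(X, centers)
W = rbf(centers, centers)
K_approx = C @ np.linalg.pinv(W) @ C.T
err = np.linalg.norm(rbf(X, X) - K_approx) / np.linalg.norm(rbf(X, X))
```

The intuition matches the cited error analysis: the better the landmarks summarize the data (small quantization error), the better the low-rank approximation.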
References
Journal Article
The Elements of Statistical Learning
TL;DR: Chapter 11 includes more case studies in other areas, ranging from manufacturing to marketing research, and a detailed comparison with other diagnostic tools, such as logistic regression and tree-based methods.
Journal Article
Least Squares Support Vector Machine Classifiers
TL;DR: This paper proposes a least squares version of support vector machine (SVM) classifiers that follows from solving a set of linear equations, instead of the quadratic programming required for classical SVMs.
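The linear-system training step can be sketched as follows; this uses a common (n+1) x (n+1) formulation with the labels on the right-hand side (an illustrative variant, equivalent to kernel ridge regression with a bias term, rather than the paper's exact notation):

```python
import numpy as np

def lssvm_fit(K, y, gamma=10.0):
    """LS-SVM training: solve one (n+1) x (n+1) linear system, no QP.
    Illustrative formulation with labels on the right-hand side."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0                     # constraint: dual coefficients sum to zero
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma  # kernel block, ridge-regularized by 1/gamma
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]             # bias b, dual coefficients alpha

# Usage on a tiny separable toy set with a linear kernel.
X = np.array([[0., 0.], [0., 1.], [2., 2.], [3., 2.]])
y = np.array([-1., -1., 1., 1.])
K = X @ X.T
b, alpha = lssvm_fit(K, y)
pred = np.sign(K @ alpha + b)
```

Because training reduces to one linear solve, low-rank kernel factorizations such as those in the main paper plug in directly to make that solve cheap.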
Book
Kernel Methods for Pattern Analysis
TL;DR: This book provides an easy introduction for students and researchers to the growing field of kernel-based pattern analysis, demonstrating with examples how to handcraft an algorithm or a kernel for a new specific application, and covering all the necessary conceptual and mathematical tools to do so.