Journal Article

Classification with sparse grids using simplicial basis functions

Jochen Garcke, Michael Griebel
Intelligent Data Analysis, Vol. 6, Iss. 6, pp. 483-502
TL;DR: It turns out that the method scales linearly with the number of given data points and is well suited for data mining applications where the amount of data is very large, but where the dimension of the feature space is moderately high.
Abstract
Recently we presented a new approach [20] to the classification problem arising in data mining. It is based on the regularization network approach, but in contrast to other methods, which employ ansatz functions associated with data points, we use a grid in the usually high-dimensional feature space for the minimization process. To cope with the curse of dimensionality, we employ sparse grids [52]. Thus, only O(h_n^{-1} n^{d-1}) instead of O(h_n^{-d}) grid points and unknowns are involved. Here d denotes the dimension of the feature space and h_n = 2^{-n} gives the mesh size. We use the sparse grid combination technique [30], where the classification problem is discretized and solved on a sequence of conventional grids with uniform mesh sizes in each dimension. The sparse grid solution is then obtained by linear combination. The method computes a nonlinear classifier but scales only linearly with the number of data points and is therefore well suited for data mining applications where the amount of data is very large but the dimension of the feature space is moderately high. In contrast to our former work, where d-linear functions were used, we now apply linear basis functions based on a simplicial discretization. This allows us to handle more dimensions, and the algorithm needs fewer operations per data point. We further extend the method to so-called anisotropic sparse grids, where different, a priori chosen mesh sizes can be used for the discretization of each attribute. This can improve the run time of the method and the approximation results for data sets whose attributes are of differing importance. We describe the sparse grid combination technique for the classification problem, give implementation details, and discuss the complexity of the algorithm. It turns out that the method scales linearly with the number of given data points. Finally, we report on the quality of the classifier built by our new method on data sets with up to 14 dimensions. We show that our new method achieves correctness rates competitive with those of the best existing methods.
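
To make the combination technique concrete, the following is a minimal sketch (not from the paper) of how its component grids and coefficients can be enumerated and how the point counts compare with a full grid. It assumes the standard combination formula with level indices l_i >= 0 and coefficients (-1)^q C(d-1, q) on the diagonals |l|_1 = n - q; level conventions vary across the literature.

    from itertools import product
    from math import comb

    def combination_grids(n, d):
        # Component grids of the sparse grid combination technique at
        # level n in d dimensions: for q = 0, ..., d-1, every multi-index
        # l = (l_1, ..., l_d) with l_i >= 0 and |l|_1 = n - q enters with
        # coefficient (-1)^q * C(d-1, q).
        for q in range(d):
            coeff = (-1) ** q * comb(d - 1, q)
            for levels in product(range(n - q + 1), repeat=d):
                if sum(levels) == n - q:
                    yield levels, coeff

    def grid_points(levels):
        # a uniform grid of level l_i has 2**l_i + 1 points per dimension
        p = 1
        for l in levels:
            p *= 2 ** l + 1
        return p

    n, d = 6, 4
    total = sum(grid_points(l) for l, _ in combination_grids(n, d))
    full = (2 ** n + 1) ** d
    print(f"component grids of the combination technique: {total} points")
    print(f"full grid of level {n}: {full} points")

Running this for n = 6 and d = 4 shows the component grids together holding far fewer points than the full grid, which illustrates the O(h_n^{-1} n^{d-1}) versus O(h_n^{-d}) scaling stated above.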

Citations
Journal Article

Dimension-adaptive tensor-product quadrature

TL;DR: Develops and presents a dimension-adaptive quadrature method based on the sparse grid method; it tries to find the important dimensions and adaptively refines in those directions, guided by suitable error estimators, leading to an approach based on generalized sparse grid index sets.
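
As a rough illustration of the idea summarized above, here is a toy sketch: the greedy loop and admissibility test loosely follow the dimension-adaptive literature, and the error indicator is a made-up stand-in for a real estimator.

    def backward_neighbors(index):
        # indices obtained by decreasing one entry of the multi-index by 1
        return [
            tuple(l - 1 if k == j else l for k, l in enumerate(index))
            for j in range(len(index)) if index[j] > 0
        ]

    def refine(active, error_indicator, steps):
        # Greedy dimension-adaptive refinement: repeatedly move the active
        # index with the largest error estimate to the 'old' set and
        # activate those forward neighbors whose backward neighbors are
        # all already in the index set (admissibility).
        old = set()
        for _ in range(steps):
            best = max(active, key=error_indicator)
            active.remove(best)
            old.add(best)
            for j in range(len(best)):
                nb = tuple(l + 1 if k == j else l for k, l in enumerate(best))
                known = old | active
                if nb not in known and all(b in known for b in backward_neighbors(nb)):
                    active.add(nb)
        return old | active

    # toy run in d = 2 with a made-up indicator that decays faster in the
    # second dimension, mimicking one "important" and one unimportant axis
    index_set = refine({(0, 0)}, lambda l: 2.0 ** (-l[0] - 3 * l[1]), steps=5)
    print(sorted(index_set))
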
Journal Article

An Anisotropic Sparse Grid Stochastic Collocation Method for Partial Differential Equations with Random Input Data

TL;DR: This work proposes and analyzes an anisotropic sparse grid stochastic collocation method for solving partial differential equations with random coefficients and forcing terms (input data of the model) and provides a rigorous convergence analysis of the fully discrete problem.
Dissertation

Spatially Adaptive Sparse Grids for High-Dimensional Problems

Dirk Pflüger
TL;DR: The curse of dimensionality, i.e., the exponential dependency of the overall computational effort on the number of dimensions, is still a roadblock for the numerical treatment of high-dimensional problems; this dissertation addresses it with spatially adaptive sparse grids.
Journal Article

Multivariate Regression and Machine Learning with Sums of Separable Functions

TL;DR: Presents an algorithm for learning (or estimating) a function of many variables from scattered data; the function is approximated by a sum of separable functions, following the paradigm of separated representations, which makes the method suitable for large data sets in high dimensions.
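
The separated-representation paradigm mentioned here writes f(x_1, ..., x_d) ~= sum_{r=1}^R s_r * prod_{j=1}^d g_r^j(x_j), so storage and evaluation costs grow linearly in d rather than exponentially. A minimal evaluation sketch (all names illustrative, not the paper's API):

    import numpy as np

    def eval_separable_sum(x, weights, factors):
        # f(x_1, ..., x_d) ~= sum_r  s_r * prod_j g_r^j(x_j)
        # weights: length-R sequence of scalars s_r
        # factors: R lists of d univariate callables g_r^j (illustrative)
        total = 0.0
        for s_r, g_r in zip(weights, factors):
            term = s_r
            for x_j, g in zip(x, g_r):
                term *= g(x_j)
            total += term
        return total

    # rank-2 separable model of a function of d = 3 variables
    factors = [[np.sin, np.cos, np.exp], [np.cos, np.sin, np.tanh]]
    print(eval_separable_sum([0.1, 0.2, 0.3], [0.7, 0.3], factors))
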
Journal Article

Dynamic cluster formation using level set methods

TL;DR: Introduces the notion of a cluster intensity function (CIF), which captures the important characteristics of clusters and is more robust than approaches based on density functions obtained by kernel density estimation, which are often oscillatory or oversmoothed.
References
Book

The Nature of Statistical Learning Theory

TL;DR: Covers the setting of the learning problem, consistency of learning processes, bounds on the rate of convergence of learning processes, controlling the generalization ability of learning processes, constructing learning algorithms, and what is important in learning theory.
Book

Spline models for observational data

Grace Wahba
TL;DR: Develops a theory and practice for the estimation of functions from noisy data on functionals, establishing convergence properties, data-based smoothing parameter selection, confidence intervals, and numerical methods appropriate to a number of problems within this framework.
Journal Article

Generalized Cross-Validation as a Method for Choosing a Good Ridge Parameter

TL;DR: The generalized cross-validation (GCV) method is a generalized version of Allen's PRESS for choosing a good ridge parameter; it can also be used in subset selection and singular value truncation, and even to choose from among mixtures of these methods.
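
For reference, the GCV criterion itself is GCV(lambda) = (1/n) ||(I - A(lambda)) y||^2 / ((1/n) tr(I - A(lambda)))^2 with influence matrix A(lambda) = X (X^T X + n*lambda*I)^{-1} X^T. A small self-contained sketch (one common scaling convention; not code from the paper):

    import numpy as np

    def gcv_score(X, y, lam):
        # GCV(lam) = (1/n) * ||(I - A) y||^2 / ((1/n) * trace(I - A))^2
        # with influence matrix A = X (X^T X + n*lam*I)^{-1} X^T
        # (the n*lam scaling follows one common convention; others exist)
        n, p = X.shape
        A = X @ np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T)
        resid = y - A @ y
        return (resid @ resid / n) / (np.trace(np.eye(n) - A) / n) ** 2

    # choose the ridge parameter minimizing GCV over a small grid
    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 5))
    y = X @ np.array([1.0, 0.5, 0.0, -0.5, -1.0]) + 0.1 * rng.standard_normal(50)
    lams = np.logspace(-6, 1, 30)
    print(min(lams, key=lambda lam: gcv_score(X, y, lam)))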