Journal Article

Classification with sparse grids using simplicial basis functions

Jochen Garcke, Michael Griebel
Intelligent Data Analysis, Vol. 6, Iss. 6, pp. 483-502
TL;DR: It turns out that the method scales linearly with the number of given data points and is well suited for data mining applications where the amount of data is very large, but where the dimension of the feature space is moderately high.
Abstract
Recently we presented a new approach [20] to the classification problem arising in data mining. It is based on the regularization network approach, but in contrast to other methods, which employ ansatz functions associated with data points, we use a grid in the usually high-dimensional feature space for the minimization process. To cope with the curse of dimensionality, we employ sparse grids [52]. Thus, only O(h_n^{-1} n^{d-1}) instead of O(h_n^{-d}) grid points and unknowns are involved. Here d denotes the dimension of the feature space and h_n = 2^{-n} gives the mesh size. We use the sparse grid combination technique [30], where the classification problem is discretized and solved on a sequence of conventional grids with uniform mesh sizes in each dimension. The sparse grid solution is then obtained by linear combination. The method computes a nonlinear classifier but scales only linearly with the number of data points and is therefore well suited for data mining applications where the amount of data is very large but the dimension of the feature space is moderately high. In contrast to our former work, where d-linear functions were used, we now apply linear basis functions based on a simplicial discretization. This allows us to handle more dimensions, and the algorithm needs fewer operations per data point. We further extend the method to so-called anisotropic sparse grids, where different, a priori chosen mesh sizes can be used for the discretization of each attribute. This can improve the run time of the method and the approximation results for data sets whose attributes are of differing importance. We describe the sparse grid combination technique for the classification problem, give implementation details, and discuss the complexity of the algorithm. It turns out that the method scales linearly with the number of given data points. Finally, we report on the quality of the classifier built by our new method on data sets with up to 14 dimensions. We show that our new method achieves correctness rates competitive with those of the best existing methods.
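
To make the combination technique concrete, the following is a minimal sketch (not from the paper) of how its component grids and coefficients can be enumerated and how the point counts compare with a full grid. It assumes the standard combination formula with level indices l_i >= 0 and coefficients (-1)^q C(d-1, q) on the diagonals |l|_1 = n - q; level conventions vary across the literature.

    from itertools import product
    from math import comb

    def combination_grids(n, d):
        # Component grids of the sparse grid combination technique at
        # level n in d dimensions: for q = 0, ..., d-1, every multi-index
        # l = (l_1, ..., l_d) with l_i >= 0 and |l|_1 = n - q enters with
        # coefficient (-1)^q * C(d-1, q).
        for q in range(d):
            coeff = (-1) ** q * comb(d - 1, q)
            for levels in product(range(n - q + 1), repeat=d):
                if sum(levels) == n - q:
                    yield levels, coeff

    def grid_points(levels):
        # a uniform grid of level l_i has 2**l_i + 1 points per dimension
        p = 1
        for l in levels:
            p *= 2 ** l + 1
        return p

    n, d = 6, 4
    total = sum(grid_points(l) for l, _ in combination_grids(n, d))
    full = (2 ** n + 1) ** d
    print(f"component grids of the combination technique: {total} points")
    print(f"full grid of level {n}: {full} points")

Running this for n = 6 and d = 4 shows the component grids together holding far fewer points than the full grid, which illustrates the O(h_n^{-1} n^{d-1}) versus O(h_n^{-d}) scaling stated above.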

Citations
Journal Article

Dimension-adaptive tensor-product quadrature

TL;DR: Develops and presents a dimension-adaptive quadrature method based on the sparse grid method; it tries to find the important dimensions and adaptively refines in those directions, guided by suitable error estimators, leading to an approach based on generalized sparse grid index sets.
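
As a rough illustration of the idea summarized above, here is a toy sketch: the greedy loop and admissibility test loosely follow the dimension-adaptive literature, and the error indicator is a made-up stand-in for a real estimator.

    def backward_neighbors(index):
        # indices obtained by decreasing one entry of the multi-index by 1
        return [
            tuple(l - 1 if k == j else l for k, l in enumerate(index))
            for j in range(len(index)) if index[j] > 0
        ]

    def refine(active, error_indicator, steps):
        # Greedy dimension-adaptive refinement: repeatedly move the active
        # index with the largest error estimate to the 'old' set and
        # activate those forward neighbors whose backward neighbors are
        # all already in the index set (admissibility).
        old = set()
        for _ in range(steps):
            best = max(active, key=error_indicator)
            active.remove(best)
            old.add(best)
            for j in range(len(best)):
                nb = tuple(l + 1 if k == j else l for k, l in enumerate(best))
                known = old | active
                if nb not in known and all(b in known for b in backward_neighbors(nb)):
                    active.add(nb)
        return old | active

    # toy run in d = 2 with a made-up indicator that decays faster in the
    # second dimension, mimicking one "important" and one unimportant axis
    index_set = refine({(0, 0)}, lambda l: 2.0 ** (-l[0] - 3 * l[1]), steps=5)
    print(sorted(index_set))
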
Journal Article

An Anisotropic Sparse Grid Stochastic Collocation Method for Partial Differential Equations with Random Input Data

TL;DR: This work proposes and analyzes an anisotropic sparse grid stochastic collocation method for solving partial differential equations with random coefficients and forcing terms (input data of the model) and provides a rigorous convergence analysis of the fully discrete problem.
Dissertation

Spatially Adaptive Sparse Grids for High-Dimensional Problems

Dirk Pflüger
TL;DR: The curse of dimensionality, i.e., the exponential dependency of the overall computational effort on the number of dimensions, is still a roadblock for the numerical treatment of high-dimensional problems; this dissertation addresses it with spatially adaptive sparse grids.
Journal Article

Multivariate Regression and Machine Learning with Sums of Separable Functions

TL;DR: Presents an algorithm for learning (or estimating) a function of many variables from scattered data; the function is approximated by a sum of separable functions, following the paradigm of separated representations, which makes the method suitable for large data sets in high dimensions.
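
The separated-representation paradigm mentioned here writes f(x_1, ..., x_d) ~= sum_{r=1}^R s_r * prod_{j=1}^d g_r^j(x_j), so storage and evaluation costs grow linearly in d rather than exponentially. A minimal evaluation sketch (all names illustrative, not the paper's API):

    import numpy as np

    def eval_separable_sum(x, weights, factors):
        # f(x_1, ..., x_d) ~= sum_r  s_r * prod_j g_r^j(x_j)
        # weights: length-R sequence of scalars s_r
        # factors: R lists of d univariate callables g_r^j (illustrative)
        total = 0.0
        for s_r, g_r in zip(weights, factors):
            term = s_r
            for x_j, g in zip(x, g_r):
                term *= g(x_j)
            total += term
        return total

    # rank-2 separable model of a function of d = 3 variables
    factors = [[np.sin, np.cos, np.exp], [np.cos, np.sin, np.tanh]]
    print(eval_separable_sum([0.1, 0.2, 0.3], [0.7, 0.3], factors))
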
Journal Article

Dynamic cluster formation using level set methods

TL;DR: Introduces the notion of a cluster intensity function (CIF), which captures the important characteristics of clusters and is more robust than approaches based on density functions obtained by kernel density estimation, which are often oscillatory or oversmoothed.
References
Book

The Nature of Statistical Learning Theory

TL;DR: Covers the setting of the learning problem, consistency of learning processes, bounds on the rate of convergence of learning processes, controlling the generalization ability of learning processes, constructing learning algorithms, and what is important in learning theory.
Book

Spline models for observational data

Grace Wahba
TL;DR: Develops a theory and practice for the estimation of functions from noisy data on functionals, establishing convergence properties, data-based smoothing parameter selection, confidence intervals, and numerical methods appropriate to a number of problems within this framework.
Journal Article

Generalized Cross-Validation as a Method for Choosing a Good Ridge Parameter

TL;DR: The generalized cross-validation (GCV) method is a generalized version of Allen's PRESS for choosing a good ridge parameter; it can also be used in subset selection and singular value truncation, and even to choose from among mixtures of these methods.
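
For reference, the GCV criterion itself is GCV(lambda) = (1/n) ||(I - A(lambda)) y||^2 / ((1/n) tr(I - A(lambda)))^2 with influence matrix A(lambda) = X (X^T X + n*lambda*I)^{-1} X^T. A small self-contained sketch (one common scaling convention; not code from the paper):

    import numpy as np

    def gcv_score(X, y, lam):
        # GCV(lam) = (1/n) * ||(I - A) y||^2 / ((1/n) * trace(I - A))^2
        # with influence matrix A = X (X^T X + n*lam*I)^{-1} X^T
        # (the n*lam scaling follows one common convention; others exist)
        n, p = X.shape
        A = X @ np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T)
        resid = y - A @ y
        return (resid @ resid / n) / (np.trace(np.eye(n) - A) / n) ** 2

    # choose the ridge parameter minimizing GCV over a small grid
    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 5))
    y = X @ np.array([1.0, 0.5, 0.0, -0.5, -1.0]) + 0.1 * rng.standard_normal(50)
    lams = np.logspace(-6, 1, 30)
    print(min(lams, key=lambda lam: gcv_score(X, y, lam)))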