Proceedings ArticleDOI

An approach to incremental SVM learning algorithm

13 Nov 2000, pp. 268-273
TL;DR: This paper proposes an intercross iterative approach that extends SVM training to incremental learning, taking into account the mutual impact between new training data and historical data, and shows that the approach achieves more satisfactory classification precision.
Abstract: The classification algorithm that is based on a support vector machine (SVM) is now attracting more attention, due to its strong theoretical properties and good empirical results. In this paper, we first analyze the properties of the support vector (SV) set thoroughly, then introduce a new learning method that extends the SVM classification algorithm to incremental learning. The theoretical basis of this algorithm is the classification equivalence of the SV set and the training set. In this algorithm, knowledge is accumulated in the process of incremental learning. In addition, unimportant samples are discarded optimally by a least-recently used (LRU) scheme. Theoretical analyses and experimental results show that this algorithm not only speeds up the training process but also reduces storage costs, while classification precision is guaranteed.
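The retraining loop described above is straightforward to sketch. The following minimal Python illustration is our own, not the authors' code: it builds on scikit-learn's SVC, retrains on the accumulated SV set plus each new batch (exploiting the classification equivalence of the SV set and the training set), and uses a simple recency-stamp discard as a stand-in for the paper's LRU scheme.

```python
# Minimal sketch of SV-set-based incremental SVM learning (illustrative only;
# the LRU policy here is a simplified stand-in for the paper's scheme).
import numpy as np
from sklearn.svm import SVC

class IncrementalSVM:
    def __init__(self, max_store=1000, **svc_params):
        self.clf = SVC(**svc_params)
        self.X_keep = None          # retained samples (approximately the SV set)
        self.y_keep = None
        self.last_used = None       # recency stamps for the LRU-style discard
        self.t = 0
        self.max_store = max_store

    def partial_fit(self, X_new, y_new):
        self.t += 1
        if self.X_keep is None:
            X, y = X_new, y_new
            stamps = np.full(len(X_new), self.t)
        else:
            # Accumulated knowledge = old SV set; merge it with the new batch.
            X = np.vstack([self.X_keep, X_new])
            y = np.concatenate([self.y_keep, y_new])
            stamps = np.concatenate([self.last_used, np.full(len(X_new), self.t)])
        self.clf.fit(X, y)
        keep = self.clf.support_    # indices of the current support vectors
        stamps[keep] = self.t       # SVs of the current model count as "used"
        if len(keep) > self.max_store:
            # Discard the least-recently-used samples first.
            keep = keep[np.argsort(stamps[keep])[-self.max_store:]]
        self.X_keep, self.y_keep = X[keep], y[keep]
        self.last_used = stamps[keep]
        return self

    def predict(self, X):
        return self.clf.predict(X)
```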
Citations
Proceedings ArticleDOI
13 Oct 2005
TL;DR: A comparison of least squares support vector machines (LS-SVM) with SVM for regression shows that LS-SVM is preferable, especially for large-scale problems, because its solution procedure is highly efficient and, after pruning, both its sparseness and performance are comparable with those of SVM.
Abstract: Support vector machines (SVM) have been widely used in classification and nonlinear function estimation. However, the major drawback of SVM is the high computational burden of its constrained optimization programming. This disadvantage is overcome by least squares support vector machines (LS-SVM), which solve linear equations instead of a quadratic programming problem. This paper compares LS-SVM with SVM for regression. According to the parallel test results, LS-SVM is preferable, especially for large-scale problems, because its solution procedure is highly efficient and, after pruning, both the sparseness and performance of LS-SVM are comparable with those of SVM.
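The "linear equations instead of a quadratic program" point can be made concrete. Below is a minimal LS-SVM regression sketch in NumPy (our own illustration; the RBF kernel and the parameters gamma and sigma are assumptions, not the paper's setup): the entire dual solution comes from one symmetric (n+1)-by-(n+1) linear system.

```python
# Minimal LS-SVM regression sketch (illustrative, not the paper's implementation).
import numpy as np

def rbf(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    n = len(X)
    K = rbf(X, X, sigma)
    # One (n+1) x (n+1) linear system replaces the SVM quadratic program:
    # [ 0   1^T         ] [b]     [0]
    # [ 1   K + I/gamma ] [alpha] [y]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[1:], sol[0]          # alpha, b

def lssvm_predict(X_train, alpha, b, X_test, sigma=1.0):
    return rbf(X_test, X_train, sigma) @ alpha + b

# Usage on toy data:
X = np.linspace(0, 6, 50)[:, None]
y = np.sin(X).ravel()
alpha, b = lssvm_fit(X, y)
print(lssvm_predict(X, alpha, b, X[:3]))
```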

277 citations

Journal ArticleDOI
TL;DR: This survey focuses on evolving fuzzy rule-based models and neuro-fuzzy networks for clustering, classification, regression, and system identification in online, real-time environments, where learning and model development should be performed incrementally.

231 citations

Journal ArticleDOI
TL;DR: This paper combines an evolving classifier with trie-based user profiling to obtain a powerful self-learning online scheme, and further develops the recursive formula for the potential of a data point to become a cluster center using cosine distance.
Abstract: Knowledge about computer users is very beneficial for assisting them, predicting their future actions, or detecting masqueraders. In this paper, a new approach for automatically creating and recognizing the behavior profile of a computer user is presented. In this case, a computer user's behavior is represented as the sequence of commands he or she types during work. This sequence is transformed into a distribution of relevant subsequences of commands in order to find a profile that defines the user's behavior. Also, because a user profile is not necessarily fixed but rather evolves/changes, we propose an evolving method to keep the created profiles up to date, using an Evolving Systems approach. In this paper, we combine the evolving classifier with trie-based user profiling to obtain a powerful self-learning online scheme. We also further develop the recursive formula for the potential of a data point to become a cluster center using cosine distance, which is provided in the Appendix. The novel approach proposed in this paper is applicable to any problem of dynamic/evolving user behavior modeling where the behavior can be represented as a sequence of actions or events. It has been evaluated on several real data streams.
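The first step of this pipeline, turning a command sequence into a distribution of subsequences and comparing profiles by cosine distance, is easy to illustrate. The sketch below is our own simplification (plain n-gram counting rather than the paper's trie, and hypothetical example sessions), not the authors' implementation.

```python
# Illustrative sketch: turn a command sequence into a distribution of
# length-n subsequences (a simplified stand-in for the trie-based profile).
from collections import Counter

def profile(commands, n=3):
    grams = Counter(tuple(commands[i:i + n])
                    for i in range(len(commands) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.items()}

def cosine_similarity(p, q):
    # Cosine similarity between two subsequence distributions; the paper's
    # recursive potential formula is built on the corresponding distance.
    num = sum(p[g] * q.get(g, 0.0) for g in p)
    den = (sum(v * v for v in p.values()) ** 0.5 *
           sum(v * v for v in q.values()) ** 0.5)
    return num / den if den else 0.0

# Hypothetical sessions from the same user:
session_a = ["ls", "cd", "ls", "vim", "make", "ls", "cd", "ls"]
session_b = ["ls", "cd", "ls", "vim", "make", "gcc", "ls", "cd"]
print(cosine_similarity(profile(session_a), profile(session_b)))
```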

130 citations

Patent
25 Apr 2007
TL;DR: A method of identifying and localizing objects belonging to one of three or more classes is presented, which includes deriving vectors, each mapped to one of the objects, where each vector is an element of an N-dimensional space.
Abstract: A method of identifying and localizing objects belonging to one of three or more classes includes deriving vectors, each being mapped to one of the objects, where each of the vectors is an element of an N-dimensional space. The method includes training an ensemble of binary classifiers with a CISS technique, using an ECOC technique. For each object corresponding to a class, the method includes calculating the probability that the associated vector belongs to a particular class, using an ECOC probability estimation technique. In another embodiment, increased detection accuracy is achieved by using images obtained with different contrast methods. A nonlinear dimensionality reduction technique, Kernel PCA, is employed to extract features from the multi-contrast composite image. The Kernel PCA preprocessing shows improvements over traditional linear PCA preprocessing, possibly due to its ability to capture high-order, nonlinear correlations in the high-dimensional image space.
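Both building blocks named here, ECOC ensembles of binary classifiers and Kernel PCA feature extraction, are standard techniques. The sketch below shows the combination on a stock dataset using scikit-learn; the library, dataset, and all parameters are our assumptions, not the patent's implementation.

```python
# Illustrative ECOC + Kernel PCA pipeline (not the patent's implementation;
# dataset, library, and parameters are our own choices).
from sklearn.datasets import load_digits
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OutputCodeClassifier   # ECOC-style ensemble
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Nonlinear dimensionality reduction: capture higher-order correlations
# that linear PCA misses.
kpca = KernelPCA(n_components=30, kernel="rbf", gamma=1e-3).fit(X_tr)

# ECOC: each class gets a codeword; one binary SVM learns each bit.
ecoc = OutputCodeClassifier(SVC(kernel="linear"), code_size=2, random_state=0)
ecoc.fit(kpca.transform(X_tr), y_tr)
print(ecoc.score(kpca.transform(X_te), y_te))
```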

89 citations

Journal ArticleDOI
TL;DR: A comparative study of incremental learning and ensemble learning that reviews their histories and compares their performance w.r.t. accuracy and time efficiency under various concept drift scenarios.
Abstract: With the unlimited growth of real-world data and the increasing requirement for real-time processing, immediate processing of big stream data has become an urgent problem. In stream data, hidden patterns commonly evolve over time (i.e., concept drift), and many dynamic learning strategies have been proposed, such as incremental learning and ensemble learning. To the best of our knowledge, no work has systematically compared these two methods. In this paper, we conduct a comparative study of the two learning methods. We first introduce the concept of "concept drift" and propose how to quantitatively measure it. Then, we recall the histories of incremental learning and ensemble learning, introducing the milestones of their development. In experiments, we comprehensively compare and analyze their performance w.r.t. accuracy and time efficiency under various concept drift scenarios. We conclude with several possible future research problems.
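As one simple illustration of quantifying drift between two windows of a stream (our own example, not necessarily the measure proposed in the paper), drift can be scored as a distance between the windows' empirical distributions:

```python
# One simple way to quantify concept drift between two stream windows:
# total-variation distance between empirical histograms (an illustration,
# not necessarily the metric proposed in the paper).
import numpy as np

def drift_score(window_a, window_b, bins=20):
    lo = min(window_a.min(), window_b.min())
    hi = max(window_a.max(), window_b.max())
    p, _ = np.histogram(window_a, bins=bins, range=(lo, hi))
    q, _ = np.histogram(window_b, bins=bins, range=(lo, hi))
    p = p / p.sum()
    q = q / q.sum()
    return 0.5 * np.abs(p - q).sum()   # 0 = identical, 1 = disjoint

rng = np.random.default_rng(0)
before = rng.normal(0.0, 1.0, 1000)    # old concept
after = rng.normal(1.5, 1.0, 1000)     # shifted concept
print(drift_score(before, after))
```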

60 citations

References
Book
Vladimir Vapnik
01 Jan 1995
TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
Abstract: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

40,147 citations


Additional excerpts

  • ...SVM [2] is a new approach to pattern recognition based on Structural Risk Minimization, which is suited to problems with high-dimensional features and a given finite amount of training data, and can construct decision rules that generalize well....


01 Jan 1998
TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimates from small data pools, applying these estimations to real-life problems, and much more.
Abstract: A comprehensive look at learning and generalization theory. The statistical theory of learning and generalization concerns the problem of choosing desired functions on the basis of empirical data. Highly applicable to a variety of computer science and robotics fields, this book offers lucid coverage of the theory as a whole. Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimates from small data pools, applying these estimations to real-life problems, and much more.

26,531 citations


"An approach to incremental SVM lear..." refers background in this paper

  • ...O(N~+LjV~+d&N,) [ 7 ]. In the case of incremental learning (a = 1,p = 0), suppose the incremental set has...


Journal ArticleDOI
TL;DR: Several arguments that support the observed high accuracy of SVMs are reviewed, and numerous examples and proofs of most of the key theorems are given.
Abstract: The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and non-separable data, working through a non-trivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.

15,696 citations

Proceedings Article
01 Jan 2001
TL;DR: Computational results indicate that test set correctness for the reduced support vector machine (RSVM), with a nonlinear separating surface that depends on a small randomly selected portion of the dataset, is better than that of a conventional support vector machine (SVM) with a nonlinear surface that explicitly depends on the entire dataset, and much better than a conventional SVM using a small random sample of the data.
Abstract: An algorithm is proposed which generates a nonlinear kernel-based separating surface that requires as little as 1% of a large dataset for its explicit evaluation. To generate this nonlinear surface, the entire dataset is used as a constraint in an optimization problem with very few variables corresponding to the 1% of the data kept. The remainder of the data can be thrown away after solving the optimization problem. This is achieved by making use of a rectangular m × m̄ kernel K(A, Ā) that greatly reduces the size of the quadratic program to be solved and simplifies the characterization of the nonlinear separating surface. Here, the m rows of A represent the original m data points while the m̄ rows of Ā represent a greatly reduced m̄ data points. Computational results indicate that test set correctness for the reduced support vector machine (RSVM), with a nonlinear separating surface that depends on a small randomly selected portion of the dataset, is better than that of a conventional support vector machine (SVM) with a nonlinear surface that explicitly depends on the entire dataset, and much better than a conventional SVM using a small random sample of the data. Computational times, as well as memory usage, are much smaller for RSVM than for a conventional SVM using the entire dataset.
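The reduced-kernel idea is simple to demonstrate. The sketch below (our own simplification: RSVM itself solves a smooth SVM program, whereas here a regularized least-squares fit stands in for it, on hypothetical toy data) evaluates a rectangular kernel K(A, Ā) against a random ~1% subset Ā and fits a classifier in that reduced feature space.

```python
# Sketch of the reduced-kernel idea behind RSVM: a rectangular m x m_bar
# kernel against a small random subset, followed by a simple regularized
# least-squares fit (our simplification of RSVM's smooth SVM program).
import numpy as np

def rbf(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

rng = np.random.default_rng(0)
m, d = 2000, 2
A = rng.normal(size=(m, d))
y = np.sign(A[:, 0] * A[:, 1])               # a nonlinear toy concept
y[y == 0] = 1

m_bar = max(1, m // 100)                     # keep roughly 1% of the data
A_bar = A[rng.choice(m, m_bar, replace=False)]

K = rbf(A, A_bar)                            # rectangular m x m_bar kernel
K1 = np.hstack([K, np.ones((m, 1))])         # absorb the bias term
w = np.linalg.solve(K1.T @ K1 + 1e-3 * np.eye(m_bar + 1), K1.T @ y)

pred = np.sign(K1 @ w)                       # surface depends only on A_bar
print("training accuracy:", (pred == y).mean())
```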

691 citations


"An approach to incremental SVM lear..." refers methods in this paper

  • ...Most incremental learning algorithms are based on improving the SVM training process by collecting more useful data as support vectors ([5]-[8])....


Journal ArticleDOI
TL;DR: An implicit Lagrangian for the dual of a simple reformulation of the standard quadratic program of a linear support vector machine is proposed, which leads to the minimization of an unconstrained differentiable convex function in a space of dimensionality equal to the number of classified points.
Abstract: An implicit Lagrangian for the dual of a simple reformulation of the standard quadratic program of a linear support vector machine is proposed. This leads to the minimization of an unconstrained differentiable convex function in a space of dimensionality equal to the number of classified points. This problem is solvable by an extremely simple linearly convergent Lagrangian support vector machine (LSVM) algorithm. LSVM requires the inversion at the outset of a single matrix of the order of the much smaller dimensionality of the original input space plus one. The full algorithm is given in this paper in 11 lines of MATLAB code without any special optimization tools such as linear or quadratic programming solvers. This LSVM code can be used "as is" to solve classification problems with millions of points. For example, 2 million points in 10-dimensional input space were classified by a linear surface in 82 minutes on a Pentium III 500 MHz notebook with 384 megabytes of memory (and additional swap space), and in 7 minutes on a 250 MHz UltraSPARC II processor with 2 gigabytes of memory. Other standard classification test problems were also solved. Nonlinear kernel classification can also be solved by LSVM. Although it does not scale up to very large problems, it can handle any positive semidefinite kernel and is guaranteed to converge. A short MATLAB code is also given for nonlinear kernels and tested on a number of problems.
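The iteration is short enough to transcribe. Below is a NumPy sketch of the published LSVM update (our transcription, with nu and the step factor alpha set to the paper's usual choices and our own stopping tolerance); note how the Sherman-Morrison-Woodbury identity confines the matrix inversion to input-space dimension plus one.

```python
# NumPy sketch of the LSVM iteration (a transcription under assumptions,
# not a drop-in replacement for the authors' MATLAB code).
import numpy as np

def lsvm(A, y, nu=1.0, itmax=100, tol=1e-5):
    m, n = A.shape
    e = np.ones((m, 1))
    D = y.reshape(-1, 1)                     # labels in {-1, +1}
    H = D * np.hstack([A, -e])               # H = D [A, -e]
    # Sherman-Morrison-Woodbury: only an (n+1) x (n+1) matrix is inverted.
    S = np.linalg.inv(np.eye(n + 1) / nu + H.T @ H)
    Qinv = lambda v: nu * (v - H @ (S @ (H.T @ v)))
    alpha = 1.9 / nu                         # step factor, 0 < alpha < 2/nu
    u = Qinv(e)
    for _ in range(itmax):
        Qu = u / nu + H @ (H.T @ u)          # Q u without forming the m x m Q
        u_new = Qinv(e + np.maximum(Qu - e - alpha * u, 0.0))
        if np.linalg.norm(u_new - u) < tol:
            u = u_new
            break
        u = u_new
    w = A.T @ (D * u)                        # separating plane: x'w = gamma
    gamma = -float(e.T @ (D * u))
    return w.ravel(), gamma

# Usage: predictions are sign(X @ w - gamma).
```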

634 citations