Journal ArticleDOI

Training ν-support vector regression: theory and algorithms

01 Aug 2002-Neural Computation (MIT Press)-Vol. 14, Iss: 8, pp 1959-1977
TL;DR: This work discusses the relation between ε-support vector regression (ε-SVR) and ν-support vector regression (ν-SVR), focusing on properties that differ from those of C-support vector classification (C-SVC) and ν-support vector classification (ν-SVC).
Abstract: We discuss the relation between ε-support vector regression (ε-SVR) and ν-support vector regression (ν-SVR). In particular, we focus on properties that are different from those of C-support vector classification (C-SVC) and ν-support vector classification (ν-SVC). We then discuss some issues that do not occur in the case of classification: the possible range of ε and the scaling of target values. A practical decomposition method for ν-SVR is implemented, and computational experiments are conducted. We show some interesting numerical observations specific to regression.
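For readers who want to try the two formulations side by side, the following is a minimal sketch using scikit-learn, whose SVR and NuSVR estimators wrap LIBSVM (the library in which this paper's decomposition method is implemented). The parameter values and toy data are illustrative assumptions, not taken from the paper.

```python
# Fitting epsilon-SVR and nu-SVR on the same toy data (sketch).
import numpy as np
from sklearn.svm import SVR, NuSVR

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(200, 1)), axis=0)
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(200)

# epsilon-SVR: the tube half-width epsilon is fixed a priori.
eps_svr = SVR(kernel="rbf", C=1.0, epsilon=0.1).fit(X, y)

# nu-SVR: nu replaces epsilon as the user-chosen parameter; epsilon is
# determined automatically as part of the optimization.
nu_svr = NuSVR(kernel="rbf", C=1.0, nu=0.5).fit(X, y)

print("epsilon-SVR support vectors:", len(eps_svr.support_))
print("nu-SVR support vectors:     ", len(nu_svr.support_))
print("nu * l =", 0.5 * len(X))  # nu lower-bounds the SV fraction in nu-SVR
```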


Citations
Journal ArticleDOI
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users to easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.

40,826 citations
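As the excerpt below notes, the ν-SVR training method of this paper is implemented in LIBSVM itself; the following sketch invokes it directly through the official Python bindings. It assumes the libsvm-official package is installed; '-s 4' selects ν-SVR, '-t 2' the RBF kernel, '-c' is C, and '-n' is ν.

```python
# Training nu-SVR via LIBSVM's Python bindings (illustrative sketch).
from libsvm.svmutil import svm_train, svm_predict

y = [0.0, 0.8, 0.9, 0.1, -0.7]
x = [{1: 0.0}, {1: 1.0}, {1: 2.0}, {1: 3.0}, {1: 4.0}]  # sparse format

model = svm_train(y, x, '-s 4 -t 2 -c 1 -n 0.5')
predicted, _, _ = svm_predict(y, x, model)
print(predicted)
```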


Cites background from "Training ν-support vector regressi..."

  • ...and regression can be seen in (Chang and Lin, 2001, Section 4) and (Chang and Lin, 2002),...


01 Jan 2007
TL;DR: An attempt has been made to review the existing theory, methods, recent developments, and scope of Support Vector Regression.
Abstract: Instead of minimizing the observed training error, Support Vector Regression (SVR) attempts to minimize the generalization error bound so as to achieve generalized performance. The idea of SVR is based on the computation of a linear regression function in a high-dimensional feature space where the input data are mapped via a nonlinear function. SVR has been applied in various fields: time series and financial (noisy and risky) prediction, approximation of complex engineering analyses, convex quadratic programming and choices of loss functions, etc. In this paper, an attempt has been made to review the existing theory, methods, recent developments, and scope of SVR.
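To make the loss behind SVR concrete: residuals inside the ε-tube cost nothing, and outside it the cost grows linearly. A minimal NumPy sketch follows (the helper name is hypothetical, not from any library):

```python
# The epsilon-insensitive loss used by SVR (illustrative helper).
import numpy as np

def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    # Zero inside the tube |y_true - y_pred| <= eps, linear outside it.
    return np.maximum(0.0, np.abs(y_true - y_pred) - eps)

residuals = np.array([-0.05, 0.08, 0.25, -0.4])
print(eps_insensitive_loss(np.zeros(4), residuals, eps=0.1))
# -> [0.   0.   0.15 0.3 ]
```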

1,467 citations


Cites background from "Training ν-support vector regressi..."

  • ...Most algorithms for SVR [19, 11, 20, 21] require that the training samples be delivered in a single batch....


Journal ArticleDOI
TL;DR: The OL-SVR model is compared with three well-known prediction models: Gaussian maximum likelihood (GML), Holt exponential smoothing, and artificial neural network models. The comparisons suggest that GML, which relies heavily on the recurring characteristics of day-to-day traffic, performs slightly better than the other models under typical traffic conditions, as demonstrated by previous studies.
Abstract: Most literature on short-term traffic flow forecasting focused mainly on normal, or non-incident, conditions and, hence, limited their applicability when traffic flow forecasting is most needed, i.e., incident and atypical conditions. Accurate prediction of short-term traffic flow under atypical conditions, such as vehicular crashes, inclement weather, work zones, and holidays, is crucial to effective and proactive traffic management systems in the context of intelligent transportation systems (ITS) and, more specifically, dynamic traffic assignment (DTA). To this end, this paper presents an application of a supervised statistical learning technique called Online Support Vector Machine for Regression, or OL-SVR, for the prediction of short-term freeway traffic flow under both typical and atypical conditions. The OL-SVR model is compared with three well-known prediction models: Gaussian maximum likelihood (GML), Holt exponential smoothing, and artificial neural network models. The resultant performance comparisons suggest that GML, which relies heavily on the recurring characteristics of day-to-day traffic, performs slightly better than the other models under typical traffic conditions, as demonstrated by previous studies. Yet OL-SVR is the best performer under non-recurring atypical traffic conditions. It appears that for deployed ITS applications geared toward timely response to real-world atypical and incident situations, OL-SVR may be a better tool than GML.
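As a rough illustration of one of the baselines named above, here is a minimal sketch of Holt's linear exponential smoothing; the smoothing constants and traffic counts are illustrative assumptions, not values from the study.

```python
# Holt's linear (level + trend) exponential smoothing, one-step forecast.
def holt_forecast(series, alpha=0.5, beta=0.3):
    level, trend = series[0], series[1] - series[0]
    for x in series[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (level + trend)         # smoothed level
        trend = beta * (level - prev_level) + (1 - beta) * trend  # smoothed trend
    return level + trend  # one-step-ahead forecast

flows = [310, 305, 330, 360, 355, 390]  # e.g. vehicles per 5-minute interval
print(holt_forecast(flows))
```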

644 citations


Additional excerpts

  • ...the KKT conditions for OL-SVR can be rewritten as
    $$\frac{\partial L_D}{\partial \alpha_i} = \sum_{j=1}^{l} Q_{ij}(\alpha_j - \alpha_j^*) + \varepsilon - y_i + f - \delta_i + u_i = 0,$$
    $$\frac{\partial L_D}{\partial \alpha_i^*} = -\sum_{j=1}^{l} Q_{ij}(\alpha_j - \alpha_j^*) + \varepsilon + y_i - f - \delta_i^* + u_i^* = 0,$$
    $$\delta_i^{(*)} \ge 0,\quad \delta_i^{(*)}\alpha_i^{(*)} = 0,\quad u_i^{(*)} \ge 0,\quad u_i^{(*)}\bigl(\alpha_i^{(*)} - C\bigr) = 0,\qquad (3)$$
    where f in (3) is equal to b in (1) at optimality (Chang & Lin, 2002). A numerical check of the stationarity part of (3) is sketched after this excerpt.

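Referring back to the reconstructed conditions (3), the following NumPy sketch evaluates the stationarity residuals for a candidate dual solution; all names (Q for the kernel matrix, f for the bias term, alpha/alpha_star for the dual variables) are illustrative assumptions.

```python
# Stationarity residuals of the KKT system (3) (illustrative sketch).
import numpy as np

def kkt_stationarity_residuals(Q, y, alpha, alpha_star, eps, f):
    d = Q @ (alpha - alpha_star)
    r1 = d + eps - y + f    # should vanish where 0 < alpha_i < C
    r2 = -d + eps + y - f   # should vanish where 0 < alpha_star_i < C
    return r1, r2
```

At an optimum, the entries of r1 (resp. r2) corresponding to free variables, i.e. those strictly between the bounds 0 and C, should be numerically zero; the complementarity conditions in (3) account for the variables at the bounds.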

Journal ArticleDOI
TL;DR: A decomposition method for ν-SVM is proposed that is competitive with existing methods for C-SVM, and it is shown that in general the two are different problems with the same optimal solution set.
Abstract: The ν-support vector machine (ν-SVM) for classification proposed by Schölkopf, Smola, Williamson, and Bartlett (2000) has the advantage of using a parameter ν for controlling the number of support vectors. In this article, we investigate the relation between ν-SVM and C-SVM in detail. We show that in general they are two different problems with the same optimal solution set. Hence, we may expect that many numerical aspects of solving them are similar. However, compared to regular C-SVM, the formulation of ν-SVM is more complicated, so up to now there have been no effective methods for solving large-scale ν-SVM. We propose a decomposition method for ν-SVM that is competitive with existing methods for C-SVM. We also discuss the behavior of ν-SVM by some numerical experiments.
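The role of ν described above is easy to check empirically: it upper-bounds the fraction of margin errors and lower-bounds the fraction of support vectors. A minimal sketch using scikit-learn's NuSVC, which wraps LIBSVM (data and parameters are illustrative):

```python
# nu lower-bounds the support-vector fraction in nu-SVC (sketch).
import numpy as np
from sklearn.svm import NuSVC

rng = np.random.default_rng(1)
X = rng.standard_normal((300, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # roughly balanced toy labels

for nu in (0.1, 0.3, 0.5):
    clf = NuSVC(nu=nu, kernel="rbf").fit(X, y)
    print(f"nu={nu}: SV fraction = {len(clf.support_) / len(X):.2f}")
```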

461 citations

Journal ArticleDOI
TL;DR: This paper proposes a special procedure called initial adjustments, which adjusts the weights of ν-SVC based on the Karush-Kuhn-Tucker conditions to prepare an initial solution for incremental learning in the INSVR algorithm.

418 citations


Cites methods from "Training ν-support vector regressi..."

  • ...The original version of this theorem is proved in Chang and Lin (2002)....


  • ...Theorem 1 (Chang & Lin, 2002)....


  • ...For example, Chang and Lin (2001, 2002) gave an SMO algorithm and implementation for training ϵ-SVR....


  • ...Chang and Lin (2002) proposed a recognized SMO-type algorithm specially designed for batch ν-SVR training, which is implemented in C++ as a part of the LIBSVM software package (Chang & Lin, 2001)....


References
Journal ArticleDOI
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users to easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.

40,826 citations

01 Jan 1998
TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimation from small data pools, the application of these estimates to real-life problems, and much more.
Abstract: A comprehensive look at learning and generalization theory. The statistical theory of learning and generalization concerns the problem of choosing desired functions on the basis of empirical data. Highly applicable to a variety of computer science and robotics fields, this book offers lucid coverage of the theory as a whole. Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimation from small data pools, the application of these estimates to real-life problems, and much more.

26,531 citations


"Training v -support vector regressi..." refers background in this paper

  • ...This formulation is different from the original ε-SVR (Vapnik, 1998):...


  • ...This formulation is different from the original ε-SVR (Vapnik, 1998):
    $$(P_\varepsilon)\qquad \begin{aligned} \min_{w,b,\xi,\xi^*}\quad & \tfrac{1}{2}w^Tw + \tfrac{C}{l}\sum_{i=1}^{l}(\xi_i + \xi_i^*) \\ \text{subject to}\quad & (w^T\phi(x_i) + b) - y_i \le \varepsilon + \xi_i, \\ & y_i - (w^T\phi(x_i) + b) \le \varepsilon + \xi_i^*, \\ & \xi_i,\ \xi_i^* \ge 0,\ i = 1,\dots,l. \end{aligned}\qquad (1.2)$$

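For comparison with (1.2), the ν-SVR primal introduced by Schölkopf, Smola, Williamson, and Bartlett (2000) and studied in this paper can be written as follows (a transcription in the notation of the excerpt above, not a verbatim quotation); note that ε is here a variable of the optimization rather than a fixed parameter:

$$(P_\nu)\qquad \begin{aligned} \min_{w,b,\xi,\xi^*,\varepsilon}\quad & \tfrac{1}{2}w^Tw + C\Bigl(\nu\varepsilon + \tfrac{1}{l}\sum_{i=1}^{l}(\xi_i + \xi_i^*)\Bigr) \\ \text{subject to}\quad & (w^T\phi(x_i) + b) - y_i \le \varepsilon + \xi_i, \\ & y_i - (w^T\phi(x_i) + b) \le \varepsilon + \xi_i^*, \\ & \xi_i,\ \xi_i^* \ge 0,\ i = 1,\dots,l, \qquad \varepsilon \ge 0. \end{aligned}$$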

Proceedings ArticleDOI
08 Feb 1999
TL;DR: An edited collection on support vector learning, spanning theory, implementations (including Platt's sequential minimal optimization), and applications such as dynamic reconstruction of a chaotic system, time-series prediction, and pairwise classification.
Abstract: Introduction to support vector learning; roadmap. Part 1, Theory: three remarks on the support vector method of function estimation (Vladimir Vapnik); generalization performance of support vector machines and other pattern classifiers (Peter Bartlett and John Shawe-Taylor); Bayesian voting schemes and large margin classifiers (Nello Cristianini and John Shawe-Taylor); support vector machines, reproducing kernel Hilbert spaces, and randomized GACV (Grace Wahba); geometry and invariance in kernel based methods (Christopher J.C. Burges); on the annealed VC entropy for margin classifiers, a statistical mechanics study (Manfred Opper); entropy numbers, operators and support vector kernels (Robert C. Williamson et al.). Part 2, Implementations: solving the quadratic programming problem arising in support vector classification (Linda Kaufman); making large-scale support vector machine learning practical (Thorsten Joachims); fast training of support vector machines using sequential minimal optimization (John C. Platt). Part 3, Applications: support vector machines for dynamic reconstruction of a chaotic system (Davide Mattera and Simon Haykin); using support vector machines for time series prediction (Klaus-Robert Müller et al.); pairwise classification and support vector machines (Ulrich Kressel). Part 4, Extensions of the algorithm: reducing the run-time complexity in support vector machines (Edgar E. Osuna and Federico Girosi); support vector regression with ANOVA decomposition kernels (Mark O. Stitson et al.); support vector density estimation (Jason Weston et al.); combining support vector and mathematical programming methods for classification (Bernhard Schölkopf et al.).

5,506 citations


Additional excerpts

  • ...As it is difficult to select an appropriate ε, Schölkopf et al. (1999) introduced a new parameter ν that lets one control the number of support vectors and training errors....


01 Jan 1999
TL;DR: SMO breaks the large quadratic programming problem into a series of smallest-possible QP problems, which avoids a time-consuming numerical QP optimization as an inner loop; hence SMO is fastest for linear SVMs and sparse data sets.

5,350 citations


"Training v -support vector regressi..." refers methods in this paper

  • ...The decomposition method was first proposed for SVM classification (Osuna, Freund, & Girosi, 1997; Joachims, 1998; Platt, 1998)....


  • ...Following the idea of sequential minimal optimization (SMO) by Platt (1998), we use only two elements as the working set in each iteration....


Book
John Platt
08 Feb 1999
TL;DR: In this article, the authors proposed a new algorithm for training Support Vector Machines (SVMs) called SMO (Sequential Minimal Optimization), which breaks the large QP problem arising in SVM training into a series of smallest possible QP problems.
Abstract: This chapter describes a new algorithm for training Support Vector Machines: Sequential Minimal Optimization, or SMO. Training a Support Vector Machine (SVM) requires the solution of a very large quadratic programming (QP) optimization problem. SMO breaks this large QP problem into a series of smallest possible QP problems. These small QP problems are solved analytically, which avoids using a time-consuming numerical QP optimization as an inner loop. The amount of memory required for SMO is linear in the training set size, which allows SMO to handle very large training sets. Because large matrix computation is avoided, SMO scales somewhere between linear and quadratic in the training set size for various test problems, while a standard projected conjugate gradient (PCG) chunking algorithm scales somewhere between linear and cubic in the training set size. SMO's computation time is dominated by SVM evaluation; hence SMO is fastest for linear SVMs and sparse data sets. For the MNIST database, SMO is as fast as PCG chunking, while for the UCI Adult database and linear SVMs, SMO can be more than 1000 times faster than the PCG chunking algorithm.
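To make the "smallest possible QP problems" concrete, here is a schematic of the analytic two-variable update at the heart of SMO in the classification setting; the formulas are the standard ones from Platt's derivation, and the variable names are illustrative.

```python
# One analytic SMO step: optimize the pair (alpha1, alpha2) with all other
# dual variables fixed. E1, E2 are prediction errors f(x_i) - y_i; K11,
# K12, K22 are kernel entries; C is the box constraint.
def smo_pair_update(a1, a2, y1, y2, E1, E2, K11, K12, K22, C):
    # Feasible segment for a2 implied by 0 <= a <= C and the equality
    # constraint y1*a1 + y2*a2 = const.
    if y1 != y2:
        L, H = max(0.0, a2 - a1), min(C, C + a2 - a1)
    else:
        L, H = max(0.0, a1 + a2 - C), min(C, a1 + a2)
    eta = K11 + K22 - 2.0 * K12  # curvature along the segment
    if eta <= 0 or L >= H:
        return a1, a2  # degenerate case: leave the pair unchanged
    a2_new = min(H, max(L, a2 + y2 * (E1 - E2) / eta))  # clipped optimum
    a1_new = a1 + y1 * y2 * (a2 - a2_new)  # restore the equality constraint
    return a1_new, a2_new
```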

5,019 citations