
Showing papers on "Statistical learning theory published in 2006"


Journal ArticleDOI
TL;DR: This work introduces a statistical formulation of this problem in terms of a simple mixture model, presents an instantiation of the framework for maximum entropy classifiers and their linear chain counterparts, and shows improved performance on three real-world tasks across four data sets from the natural language processing domain.
Abstract: The most basic assumption used in statistical learning theory is that training data and test data are drawn from the same underlying distribution. Unfortunately, in many applications, the "in-domain" test data is drawn from a distribution that is related, but not identical, to the "out-of-domain" distribution of the training data. We consider the common case in which labeled out-of-domain data is plentiful, but labeled in-domain data is scarce. We introduce a statistical formulation of this problem in terms of a simple mixture model and present an instantiation of this framework to maximum entropy classifiers and their linear chain counterparts. We present efficient inference algorithms for this special case based on the technique of conditional expectation maximization. Our experimental results show that our approach leads to improved performance on three real world tasks on four different data sets from the natural language processing domain.
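The setup above can be made concrete with a minimal sketch: train a maximum entropy (logistic regression) classifier on a weighted union of plentiful out-of-domain data and scarce in-domain data. The mixing weight, data, and variable names below are illustrative assumptions; this is a crude stand-in for the paper's mixture model and conditional-EM inference, not a reimplementation of it.

```python
# Hypothetical sketch: logistic regression on a weighted union of out-of-domain
# and in-domain data; NOT the paper's conditional-EM mixture-model procedure.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic "out-of-domain" data (plentiful) and "in-domain" data (scarce),
# drawn from related but shifted distributions.
X_out = rng.normal(0.0, 1.0, size=(1000, 5))
y_out = (X_out[:, 0] + X_out[:, 1] > 0).astype(int)
X_in = rng.normal(0.5, 1.0, size=(50, 5))
y_in = (X_in[:, 0] + 0.5 * X_in[:, 1] > 0.5).astype(int)

lam = 0.3  # hypothetical mixing weight for the out-of-domain component
X = np.vstack([X_out, X_in])
y = np.concatenate([y_out, y_in])
w = np.concatenate([np.full(len(y_out), lam), np.full(len(y_in), 1.0)])

clf = LogisticRegression(max_iter=1000).fit(X, y, sample_weight=w)
print("in-domain accuracy:", clf.score(X_in, y_in))
```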

894 citations


Journal ArticleDOI
TL;DR: A novel modified TSVM classifier designed for addressing ill-posed remote-sensing problems is proposed that is able to mitigate the effects of suboptimal model selection and can address multiclass cases.
Abstract: This paper introduces a semisupervised classification method that exploits both labeled and unlabeled samples for addressing ill-posed problems with support vector machines (SVMs). The method is based on recent developments in statistical learning theory concerning transductive inference and in particular transductive SVMs (TSVMs). TSVMs exploit specific iterative algorithms which gradually search for a reliable separating hyperplane (in the kernel space) with a transductive process that incorporates both labeled and unlabeled samples in the training phase. Based on an analysis of the properties of the TSVMs presented in the literature, a novel modified TSVM classifier designed for addressing ill-posed remote-sensing problems is proposed. In particular, the proposed technique: 1) is based on a novel transductive procedure that exploits a weighting strategy for unlabeled patterns, based on a time-dependent criterion; 2) is able to mitigate the effects of suboptimal model selection (which is unavoidable in the presence of small-size training sets); and 3) can address multiclass cases. Experimental results confirm the effectiveness of the proposed method on a set of ill-posed remote-sensing classification problems representing different operative conditions.
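A much-simplified, hypothetical sketch of the transductive idea is given below: unlabeled samples are pseudo-labeled by the current SVM and folded back into training with a weight that grows across iterations, loosely mirroring the time-dependent weighting described above. scikit-learn has no built-in TSVM, so a plain SVC with sample weights stands in for the authors' algorithm.

```python
# Simplified transductive-style loop; NOT the authors' modified TSVM.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
labeled, unlabeled = np.arange(30), np.arange(30, 300)   # small labeled training set

clf = SVC(kernel="rbf", gamma="scale").fit(X[labeled], y[labeled])
for t in range(1, 6):
    weight_t = t / 5.0                                   # unlabeled weight grows over "time"
    pseudo = clf.predict(X[unlabeled])                   # current pseudo-labels
    X_train = np.vstack([X[labeled], X[unlabeled]])
    y_train = np.concatenate([y[labeled], pseudo])
    w = np.concatenate([np.ones(len(labeled)), np.full(len(unlabeled), weight_t)])
    clf = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train, sample_weight=w)

print("accuracy on the unlabeled pool:", (clf.predict(X[unlabeled]) == y[unlabeled]).mean())
```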

560 citations


Journal ArticleDOI
TL;DR: The support vector machine, a novel artificial intelligence-based method developed from statistical learning theory, is adopted herein to establish a real-time stage forecasting model that can effectively provide flood stage forecasts one to six hours ahead.

464 citations


Journal ArticleDOI
TL;DR: New data-driven models based on statistical learning theory were used to forecast flows at two time scales, seasonal flow volumes and hourly stream flows, and showed promising performance in solving site-specific, real-time water resources management problems.

254 citations


Journal ArticleDOI
TL;DR: Results from the SVM modeling are compared with predictions obtained from ANN models and show that SVM models performed better for soil moisture forecasting than ANN models.
Abstract: Herein, a recently developed methodology, Support Vector Machines (SVMs), is presented and applied to the challenge of soil moisture prediction. Support Vector Machines are derived from statistical learning theory and can be used to predict a quantity forward in time based on training that uses past data, hence providing a statistically sound approach to solving inverse problems. The principal strength of SVMs lies in the fact that they employ Structural Risk Minimization (SRM) instead of Empirical Risk Minimization (ERM). The SVMs formulate a quadratic optimization problem that ensures a global optimum, which makes them superior to traditional learning algorithms such as Artificial Neural Networks (ANNs). The resulting model is sparse and not characterized by the “curse of dimensionality.” Soil moisture distribution and variation is helpful in predicting and understanding various hydrologic processes, including weather changes, energy and moisture fluxes, drought, irrigation scheduling, and rainfall/runoff generation. Soil moisture and meteorological data are used to generate SVM predictions for four and seven days ahead. Predictions show good agreement with actual soil moisture measurements. Results from the SVM modeling are compared with predictions obtained from ANN models and show that SVM models performed better for soil moisture forecasting than ANN models.
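As a rough illustration of the forecasting setup (predicting a quantity forward in time from past values), the sketch below trains an SVM regressor on lagged values of a synthetic series to predict it several steps ahead. The lag and lead choices and all parameters are assumptions for demonstration, not the paper's configuration.

```python
# Illustrative lagged-feature SVM forecast on a synthetic series.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
series = np.cumsum(rng.normal(size=500)) * 0.1 + 20.0   # synthetic "soil moisture" series

lags, lead = 7, 4                                        # use the last 7 values to predict 4 steps ahead
X = np.array([series[t - lags:t] for t in range(lags, len(series) - lead)])
y = np.array([series[t + lead] for t in range(lags, len(series) - lead)])

split = int(0.8 * len(X))
model = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X[:split], y[:split])
print("test RMSE:", np.sqrt(np.mean((model.predict(X[split:]) - y[split:]) ** 2)))
```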

237 citations


Journal ArticleDOI
TL;DR: This paper is intended as an introduction to SVMs and their applications, emphasizing their key features, and some algorithmic extensions and illustrative real-world applications of SVMs are shown.
Abstract: Support vector machines (SVMs) appeared in the early nineties as optimal margin classifiers in the context of Vapnik's statistical learning theory. Since then SVMs have been successfully applied to real-world data analysis problems, often providing improved results compared with other techniques. The SVMs operate within the framework of regularization theory by minimizing an empirical risk in a well-posed and consistent way. A clear advantage of the support vector approach is that sparse solutions to classification and regression problems are usually obtained: only a few samples are involved in the determination of the classification or regression functions. This fact facilitates the application of SVMs to problems that involve a large amount of data, such as text processing and bioinformatics tasks. This paper is intended as an introduction to SVMs and their applications, emphasizing their key features. In addition, some algorithmic extensions and illustrative real-world applications of SVMs are shown.
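The sparsity property mentioned above can be seen directly by counting support vectors after training; the toy dataset and parameters below are arbitrary choices for illustration.

```python
# Only a fraction of the training samples become support vectors,
# and only those determine the decision function.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
print("support vectors:", clf.n_support_.sum(), "out of", len(X), "training samples")
```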

232 citations


Journal ArticleDOI
TL;DR: Support Vector Machines are a new generation of classification method that attempts to produce boundaries between classes by both minimising the empirical error from the training set and also controlling the complexity of the decision boundary, which can be non-linear.
Abstract: Support Vector Machines (SVMs) are a new generation of classification method. Derived from well principled Statistical Learning theory, this method attempts to produce boundaries between classes by both minimising the empirical error from the training set and also controlling the complexity of the decision boundary, which can be non-linear. SVMs use a kernel matrix to transform a non-linear separation problem in input space to a linear separation problem in feature space. Common kernels include the Radial Basis Function, Polynomial and Sigmoidal Functions. In many simulated studies and real applications, SVMs show superior generalisation performance compared to traditional classification methods. SVMs also provide several useful statistics that can be used for both model selection and feature selection because these statistics are the upper bounds of the generalisation performance estimation of Leave-One-Out Cross-Validation. SVMs can be employed for multiclass problems in addition to the traditional two ...
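The kernels named above are plug-in choices; the short sketch below compares RBF, polynomial, and sigmoid kernels by cross-validated accuracy on a toy problem. The numbers are illustrative only and are not taken from the paper.

```python
# Comparing common SVM kernels on a toy non-linear problem.
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
for kernel in ("rbf", "poly", "sigmoid"):
    score = cross_val_score(SVC(kernel=kernel, gamma="scale"), X, y, cv=5).mean()
    print(f"{kernel:8s} cv accuracy: {score:.3f}")
```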

148 citations


Book ChapterDOI
07 Oct 2006
TL;DR: In this paper, the authors study how relaxing the realizability assumption affects the sample complexity of active learning and show that active learning can be transformed to tolerate random bounded rate class noise, and in particular exponential label complexity savings over passive learning are still possible.
Abstract: Most of the existing active learning algorithms are based on the realizability assumption: The learner's hypothesis class is assumed to contain a target function that perfectly classifies all training and test examples. This assumption can hardly ever be justified in practice. In this paper, we study how relaxing the realizability assumption affects the sample complexity of active learning. First, we extend existing results on query learning to show that any active learning algorithm for the realizable case can be transformed to tolerate random bounded rate class noise. Thus, bounded rate class noise adds little extra complications to active learning, and in particular exponential label complexity savings over passive learning are still possible. However, it is questionable whether this noise model is any more realistic in practice than assuming no noise at all. Our second result shows that if we move to the truly non-realizable model of statistical learning theory, then the label complexity of active learning has the same dependence Ω(1/ε²) on the accuracy parameter ε as the passive learning label complexity. More specifically, we show that under the assumption that the best classifier in the learner's hypothesis class has generalization error at most β > 0, the label complexity of active learning is Ω(β²/ε² log(1/δ)), where the accuracy parameter ε measures how close to optimal within the hypothesis class the active learner has to get and δ is the confidence parameter. The implication of this lower bound is that exponential savings should not be expected in realistic models of active learning, and thus the label complexity goals in active learning should be refined.
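To make the quoted rate tangible, the tiny calculation below evaluates β²/ε² · log(1/δ) for a few accuracy targets. Since Ω(·) hides constants, these figures only show how the label complexity scales as ε shrinks, not actual label counts.

```python
# Scaling of the non-realizable active-learning lower bound beta^2/eps^2 * log(1/delta).
import math

beta, delta = 0.05, 0.05
for eps in (0.1, 0.05, 0.01):
    rate = beta**2 / eps**2 * math.log(1.0 / delta)
    print(f"eps={eps:5.2f}  beta^2/eps^2 * log(1/delta) = {rate:8.2f}")
```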

110 citations


Journal Article
TL;DR: It is shown that under the assumption that the best classifier in the learner's hypothesis class has generalization error at most β > 0, the label complexity of active learning is Ω(β²/ε² log(1/δ)), where the accuracy parameter ε measures how close to optimal within the hypothesis class the active learner has to get and δ is the confidence parameter.
Abstract: Most of the existing active learning algorithms are based on the realizability assumption: The learner's hypothesis class is assumed to contain a target function that perfectly classifies all training and test examples. This assumption can hardly ever be justified in practice. In this paper, we study how relaxing the realizability assumption affects the sample complexity of active learning. First, we extend existing results on query learning to show that any active learning algorithm for the realizable case can be transformed to tolerate random bounded rate class noise. Thus, bounded rate class noise adds little extra complications to active learning, and in particular exponential label complexity savings over passive learning are still possible. However, it is questionable whether this noise model is any more realistic in practice than assuming no noise at all. Our second result shows that if we move to the truly non-realizable model of statistical learning theory, then the label complexity of active learning has the same dependence Ω(1/ε²) on the accuracy parameter ε as the passive learning label complexity. More specifically, we show that under the assumption that the best classifier in the learner's hypothesis class has generalization error at most β > 0, the label complexity of active learning is Ω(β²/ε² log(1/δ)), where the accuracy parameter ε measures how close to optimal within the hypothesis class the active learner has to get and δ is the confidence parameter. The implication of this lower bound is that exponential savings should not be expected in realistic models of active learning, and thus the label complexity goals in active learning should be refined.

98 citations


Journal ArticleDOI
TL;DR: In this article, a support vector machine (SVM) is used to model the nonlinear dynamics of a battery in order to establish the relationship between the load voltage and the current under different temperatures and states of charge (SOC).

83 citations


Journal ArticleDOI
TL;DR: Efforts are made to assess the uncertainty and robustness of the machines in learning and forecasting as a function of model structure, model parameters, and bootstrapping samples, and the utility and practicality of the proposed approaches are demonstrated.

Journal ArticleDOI
TL;DR: Support Vector Regression was applied to predict the cold modulus of sialon ceramic with satisfactory results; the prediction accuracy of the SVR model was higher than those of the BP-ANN and PLS models.

Journal ArticleDOI
TL;DR: In this magnificent paper, Professor Koltchinskii offers general and powerful performance bounds for empirical risk minimization, a fundamental principle of statistical learning theory, and develops a powerful new methodology, iterative localization, which is able to explain most of the recent results and go significantly beyond them in many cases.
Abstract: In this magnificent paper, Professor Koltchinskii offers general and powerful performance bounds for empirical risk minimization, a fundamental principle of statistical learning theory. Since the elegant pioneering work of Vapnik and Chervonenkis in the early 1970s, various such bounds have been known that relate the performance of empirical risk minimizers to combinatorial and geometrical features of the class over which the minimization is performed. This area of research has been a rich source of motivation and a major field of applications of empirical process theory. The appearance of advanced concentration inequalities in the 1990s, primarily thanks to Talagrand’s influential work, provoked major advances in both empirical process theory and statistical learning theory and led to a much deeper understanding of some of the basic phenomena. In the discussed paper Professor Koltchinskii develops a powerful new methodology, iterative localization, which, with the help of concentration inequalities, is able to explain most of the recent results and go significantly beyond them in many cases. The main motivation behind Professor Koltchinskii’s paper is based on classical problems of statistical learning theory such as binary classification and regression in which, given a sample (Xi ,Y i), i = 1 ,...,n , of independent and identically distributed pairs of random variables (where the Xi take their values in some feature space X and the Yi are, say, real-valued), the goal is to find a function f : X → R whose risk, defined in terms of the expected value of an appropriately chosen loss function, is as small as possible. In the remaining part of this discussion we point out how the performance bounds of Professor Koltchinskii’s paper can be used to study a seemingly different model, motivated by nonparametric ranking problems, which has received increasing attention both in the statistical and machine learning literature. Indeed, in several applications, such as the search engine problem or credit risk screening, the goal is to learn how to rank—or to score—observations rather than just classify them. In this case, performance measures involve pairs of observations, as can be seen, for instance, with the AUC (Area Under an ROC Curve) criterion. In this

Journal ArticleDOI
TL;DR: Support vector machines, a relatively new paradigm in statistical learning theory, are studied for their potential to recognize transient behavior of detector signals corresponding to various accident events at nuclear power plants (NPPs), and SVM calculations have demonstrated that they can produce classifiers with good generalization ability for data.
Abstract: Support vector machines (SVMs), a relatively new paradigm in statistical learning theory, are studied for their potential to recognize transient behavior of detector signals corresponding to variou ...

Proceedings ArticleDOI
20 Aug 2006
TL;DR: This paper uses parallel mixtures of support vector machines (SVMs) for classification by integrating this method into an HMM-based speech recognition system; the hybrid system is trained and tested on the DARPA Resource Management corpus, showing better performance than an HMM-based decoder using Gaussian mixtures.
Abstract: Speech recognition is usually based on Hidden Markov Models (HMMs), which represent the temporal dynamics of speech very efficiently, and Gaussian mixture models, which classify speech into single speech units (phonemes) suboptimally. In this paper we use parallel mixtures of Support Vector Machines (SVMs) for classification by integrating this method into an HMM-based speech recognition system. SVMs are very appealing due to their grounding in statistical learning theory and have already shown good results in pattern recognition and in continuous speech recognition. They suffer, however, from a training effort that scales at least quadratically with the number of training vectors. The SVM mixtures need only nearly linear training time, making it easier to deal with the large amount of speech data. In our hybrid system we use the SVM mixtures as acoustic models in an HMM-based decoder. We train and test the hybrid system on the DARPA Resource Management (RM1) corpus, showing better performance than an HMM-based decoder using Gaussian mixtures.
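A rough sketch of the parallel-mixture idea (outside any HMM) is given below: the training set is split into subsets, one SVM is trained per subset, which is what makes training roughly linear in the number of examples, and the experts' probability outputs are averaged at test time. The gating network and the acoustic-model integration of the actual system are omitted, and all data and parameters are synthetic assumptions.

```python
# Simplified "parallel mixture of SVMs": one expert per data chunk, averaged at test time.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=3000, n_features=30, n_informative=10, random_state=0)
X_train, y_train = X[:2500], y[:2500]
X_test, y_test = X[2500:], y[2500:]

n_experts = 5
chunks = np.array_split(np.random.default_rng(0).permutation(len(X_train)), n_experts)
experts = [SVC(kernel="rbf", gamma="scale", probability=True).fit(X_train[idx], y_train[idx])
           for idx in chunks]

proba = np.mean([e.predict_proba(X_test) for e in experts], axis=0)   # average the experts
print("mixture accuracy:", (proba.argmax(axis=1) == y_test).mean())
```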

Journal Article
TL;DR: The experimental results demonstrate that the proposed model has higher detection accuracy of intrusions, especially for some unknown attacks, and show that SVM can effectively detect intrusions.
Abstract: We apply the SVM technique to network intrusion detection and propose an SVM-based network anomaly intrusion detection model. The experimental results demonstrate that the proposed model has higher detection accuracy of intrusions, especially for some unknown attacks. This shows that SVM can effectively detect intrusions.

Journal Article
TL;DR: This work considers the consistency of the ERM scheme over classes of combinations of very simple rules (base classifiers) in multiclass classification in order to establish a quantitative relationship between classification errors and convex risks.
Abstract: The consistency of classification algorithm plays a central role in statistical learning theory. A consistent algorithm guarantees us that taking more samples essentially suffices to roughly reconstruct the unknown distribution. We consider the consistency of ERM scheme over classes of combinations of very simple rules (base classifiers) in multiclass classification. Our approach is, under some mild conditions, to establish a quantitative relationship between classification errors and convex risks. In comparison with the related previous work, the feature of our result is that the conditions are mainly expressed in terms of the differences between some values of the convex function.

Proceedings ArticleDOI
01 Jan 2006
TL;DR: The use of support vector machine (SVM) learning to classify heart rate signals performs very well even with signals exhibiting very low signal to noise ratio which is not the case for other standard methods proposed by the literature.
Abstract: In this study, we discuss the use of Support Vector Machine (SVM) learning to classify heart rate signals. Each signal is represented by an attribute vector containing a set of statistical measures for the respective signal. At first, the SVM classifier is trained on data (attribute vectors) with known ground truth. Then, the learnt classifier parameters can be used for the categorization of new signals not belonging to the training set. We have experimented with both real and artificial signals, and the SVM classifier performs very well even with signals exhibiting very low signal-to-noise ratio, which is not the case for other standard methods proposed in the literature. I. INTRODUCTION Heart Rate Variability (HRV) analysis is based on measuring the variability of heart rate signals and, more specifically, the variability in intervals between R peaks of the electrocardiogram (ECG), referred to as RR intervals. Several techniques have been proposed for the investigation of the evolution of features of the HRV time series. A survey of statistical methods, based on the estimation of the statistical properties of the beat-to-beat time series, can be found in (1). These methods describe the average statistical behavior of the signal over a considered time window. Spectral methods (2), based on FFT or standard autoregressive modeling, were also proposed. More recently, nonlinear approaches, including Markov modeling (3), entropy-based metrics (4), (5), the mutual information measure (6) and probabilistic modeling (7), (8) were presented to examine heart rate fluctuations. Other methods include the application of the Karhunen-Loève transformation (9) or modulation analysis (10), (11). In this study, we investigate the potential benefit of using support vector machine (SVM) learning (12), (13) to classify heart rate signals. Support vector classifiers are based on recent advances in statistical learning theory (14). They use a hypothesis space of linear functions in a high-dimensional feature space, trained with a learning algorithm from optimization theory that implements a learning bias derived from statistical learning theory. In the last decade, SVM learning has found a wide range of applications (15), including image segmentation (16) and classification (17), object recognition (18), image fusion (19) and stereo correspondence (20). Based on our previous work on support vector classification
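A hedged sketch of this pipeline appears below: each heart-rate signal is reduced to a small vector of statistical measures of its RR intervals, and an SVM is trained on those vectors. The features chosen (mean, SDNN, RMSSD) and the synthetic signals are assumptions for illustration; the paper's exact attribute set is not reproduced.

```python
# Toy HRV pipeline: statistical features of RR intervals -> SVM classifier.
import numpy as np
from sklearn.svm import SVC

def hrv_features(rr):
    """Simple statistical measures of an RR-interval series (in seconds)."""
    diffs = np.diff(rr)
    return [rr.mean(), rr.std(), np.sqrt(np.mean(diffs**2))]   # mean, SDNN, RMSSD

rng = np.random.default_rng(2)
signals, labels = [], []
for k in range(200):
    healthy = k % 2 == 0
    sigma = 0.05 if healthy else 0.01          # lower variability for the "abnormal" class
    rr = 0.8 + rng.normal(0.0, sigma, size=300)
    signals.append(hrv_features(rr))
    labels.append(int(healthy))

X, y = np.array(signals), np.array(labels)
clf = SVC(kernel="rbf", gamma="scale").fit(X[:150], y[:150])
print("held-out accuracy:", clf.score(X[150:], y[150:]))
```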

01 Jan 2006
TL;DR: A framework named Granular Support Vector Machines (GSVM) is proposed to systematically and formally combine statistical learning theory, granular computing theory and soft computing theory to address challenging predictive data modeling problems effectively and/or efficiently, with specific focus on binary classification problems.
Abstract: With the emergence of biomedical informatics, Web intelligence, and E-business, new challenges are coming for knowledge discovery and data mining modeling problems. In this dissertation work, a framework named Granular Support Vector Machines (GSVM) is proposed to systematically and formally combine statistical learning theory, granular computing theory and soft computing theory to address challenging predictive data modeling problems effectively and/or efficiently, with specific focus on binary classification problems. In general, GSVM works in 3 steps. Step 1 is granulation to build a sequence of information granules from the original dataset or from the original feature space. Step 2 is modeling Support Vector Machines (SVM) in some of these information granules when necessary. Finally, step 3 is aggregation to consolidate information in these granules at a suitable abstract level. A good granulation method to find suitable granules is crucial for modeling a good GSVM. Under this framework, many different granulation algorithms including the GSVM-CMW (cumulative margin width) algorithm, the GSVM-AR (association rule mining) algorithm, a family of GSVM-RFE (recursive feature elimination) algorithms, the GSVM-DC (data cleaning) algorithm and the GSVM-RU (repetitive undersampling) algorithm are designed for binary classification problems with different characteristics. The empirical studies in the biomedical domain and many other application domains demonstrate that the framework is promising. As a preliminary step, this dissertation work will be extended in the future to build a Granular Computing based Predictive Data Modeling framework (GrC-PDM) with which we can create hybrid adaptive intelligent data mining systems for high quality prediction.
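A hypothetical, much-simplified instance of the three-step scheme is sketched below: granulate the data (here with k-means, which is not one of the granulation algorithms listed above), fit an SVM inside each granule that contains both classes, and aggregate by routing each test point to its granule's model.

```python
# Simplified GSVM-style pipeline: granulation -> per-granule SVM -> aggregation.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, y_train, X_test, y_test = X[:800], y[:800], X[800:], y[800:]

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X_train)     # step 1: granulation
models = {}
for g in range(4):
    idx = km.labels_ == g
    if len(np.unique(y_train[idx])) == 2:                             # step 2: SVM per mixed granule
        models[g] = SVC(kernel="rbf", gamma="scale").fit(X_train[idx], y_train[idx])
    else:                                                             # pure granule: constant label
        models[g] = int(y_train[idx][0])

granule = km.predict(X_test)                                          # step 3: aggregation
pred = np.array([models[g].predict(x[None, :])[0] if not isinstance(models[g], int)
                 else models[g] for g, x in zip(granule, X_test)])
print("GSVM-style accuracy:", (pred == y_test).mean())
```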

Journal ArticleDOI
TL;DR: The modelling results of the real drift data from the long-term measurement system of a DTG indicate that the SVM method is available practically in the modelling of DTG drift and the proposed strategy of combining SVM with AGO is effective in improving the modelling precision and the learning performance.
Abstract: In this paper, the support vector machine (SVM), a novel learning machine based on statistical learning theory (SLT), is described and applied in the drift modelling of the dynamically tuned gyroscope (DTG). As a data preprocessing method, accumulated generating operation (AGO) is applied to the SVM for further improving the modelling precision and the learning performance of the drift model. The grey modelling method and RBF neural network are also investigated as a comparison to the SVM and AGO–SVM modelling methods. The modelling results of the real drift data from the long-term measurement system of a DTG indicate that the SVM method is available practically in the modelling of DTG drift and the proposed strategy of combining SVM with AGO is effective in improving the modelling precision and the learning performance.
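Reading AGO in its usual grey-systems sense as a cumulative sum, the sketch below accumulates a synthetic drift series, fits an SVM regressor on the accumulated series, and differences the predictions back (inverse AGO). The data and parameters are stand-ins, not the DTG measurements or the paper's settings.

```python
# AGO (cumulative sum) preprocessing + SVM regression on a synthetic drift series.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(3)
drift = 0.01 * np.arange(300) + rng.normal(0.0, 0.2, 300)   # synthetic gyro drift series

ago = np.cumsum(drift)                                       # accumulated generating operation
lags = 5
X = np.array([ago[t - lags:t] for t in range(lags, len(ago))])
y = ago[lags:]

model = SVR(kernel="rbf", C=100.0, epsilon=0.01).fit(X[:250], y[:250])
ago_pred = model.predict(X[250:])
drift_pred = np.diff(np.concatenate([[ago[250 + lags - 1]], ago_pred]))  # inverse AGO
print("RMSE on original scale:", np.sqrt(np.mean((drift_pred - drift[250 + lags:]) ** 2)))
```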

Book ChapterDOI
22 Jun 2006
TL;DR: A sequential randomized algorithm, which at each step concentrates on functions having both low risk and low variance with respect to the previous step prediction function, which satisfies a simple risk bound.
Abstract: We propose a sequential randomized algorithm, which at each step concentrates on functions having both low risk and low variance with respect to the previous step prediction function. It satisfies a simple risk bound, which is sharp to the extent that the standard statistical learning approach, based on the supremum of empirical processes, does not lead to algorithms with such a tight guarantee on their efficiency. Our generalization error bounds complement the pioneering work of Cesa-Bianchi et al. [12] in which standard-style statistical results were recovered with tight constants using worst-case analysis. A nice feature of our analysis of the randomized estimator is to put forward the links between the probabilistic and worst-case viewpoints. It also allows us to recover recent model selection results due to Juditsky et al. [16] and to improve them in least squares regression with heavy noise, i.e. when no exponential moment condition is assumed on the output.

Journal ArticleDOI
TL;DR: Support vector machines (SVMs), as discussed by the authors, operate within the framework of regularization theory by minimizing an empirical risk in a well-posed and consistent way, and have been successfully applied to real-world data analysis problems, often providing improved results compared with other techniques.
Abstract: Support vector machines (SVMs) appeared in the early nineties as optimal margin classifiers in the context of Vapnik's statistical learning theory. Since then SVMs have been successfully applied to real-world data analysis problems, often providing improved results compared with other techniques. The SVMs operate within the framework of regularization theory by minimizing an empirical risk in a well-posed and consistent way. A clear advantage of the support vector approach is that sparse solutions to classification and regression problems are usually obtained: only a few samples are involved in the determination of the classification or regression functions. This fact facilitates the application of SVMs to problems that involve a large amount of data, such as text processing and bioinformatics tasks. This paper is intended as an introduction to SVMs and their applications, emphasizing their key features. In addition, some algorithmic extensions and illustrative real-world applications of SVMs are shown.

Journal ArticleDOI
01 Mar 2006
TL;DR: It is shown how to develop a mechanistically based learning machine (i.e., a machine that contains background knowledge) for the case of biological wastewater treatment systems, which has a hierarchical property and can be used to implement the IPSRM.
Abstract: This article introduces a novel approach to the issue of learning from empirical data coming from complex systems that are continuous, dynamic, highly nonlinear, and stochastic. The main feature of this approach is that it attempts to integrate the powerful statistical learning theoretic methods and the valuable background knowledge that one possesses about the system under study. The learning machines that have been used, up to now, for the implementation of Vapnik's inductive principle of structural risk minimization (IPSRM) are of the "black-box" type, such as artificial neural networks, ARMA models, or polynomial functions. These are generic models that contain absolutely no knowledge about the problem at hand. They are used to approximate the behavior of any system and are prodigal in their requirements of training data. In addition, the conditions that underlie the theory of statistical learning would not hold true when these "black-box" models are used to describe highly complex systems. In this paper, it is argued that the use of a learning machine whose structure is developed on the basis of the physical mechanisms of the system under study is more advantageous. Such a machine will indeed be specific to the problem at hand and will require many fewer data points for training than its black-box counterparts. Furthermore, because this machine contains background knowledge about the system, it will provide better approximations of the various dynamic modes of this system and will, therefore, satisfy some of the prerequisites that are needed for meeting the conditions of statistical learning theory (SLT). This paper shows how to develop such a mechanistically based learning machine (i.e., a machine that contains background knowledge) for the case of biological wastewater treatment systems. Fuzzy logic concepts, combined with the results of the research in the area of wastewater engineering, will be utilized to construct such a machine. This machine has a hierarchical property and can, therefore, be used to implement the IPSRM.

Journal ArticleDOI
TL;DR: The most basic assumption in statistical learning theory is that training data and test data are drawn from the same underlying distribution, but in many applications, the "in-domai...
Abstract: The most basic assumption used in statistical learning theory is that training data and test data are drawn from the same underlying distribution. Unfortunately, in many applications, the "in-domai...

Proceedings ArticleDOI
23 Oct 2006
TL;DR: It is proved that if a kernel has a perfect alignment with the classification task, the SVM classifier has better performance.
Abstract: This paper studies several key aspects of support vector machines (SVMs) for Web page classification. Developed from statistical learning theory, SVM is widely investigated and used for text categorization because of its high generalization performance and its ability to handle high-dimensional classification. Firstly, some methods for Web page representation are studied. Secondly, Web page classification based on SVM is implemented on a data set, and a naive Bayes (NB) classifier is used to study the performance of the SVM classifier in processing a high-dimensional space. Finally, a comparison of the polynomial kernel function and the radial basis function (RBF) kernel function is presented. It is proved that if a kernel has a perfect alignment with the classification task, the SVM classifier has better performance.
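The alignment criterion referred to above is commonly defined (following Cristianini et al.) as A(K, yyᵀ) = ⟨K, yyᵀ⟩_F / (‖K‖_F · ‖yyᵀ‖_F); the sketch below computes it for an RBF and a polynomial kernel on a toy problem, under the assumption that this is the notion of kernel-target alignment the paper uses.

```python
# Kernel-target alignment of two candidate kernels with the ideal kernel y y^T.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel, polynomial_kernel

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
y_pm = np.where(y == 1, 1.0, -1.0)
target = np.outer(y_pm, y_pm)                      # ideal kernel y y^T

def alignment(K, T):
    return np.sum(K * T) / (np.linalg.norm(K) * np.linalg.norm(T))   # Frobenius inner product / norms

for name, K in [("rbf", rbf_kernel(X)), ("poly", polynomial_kernel(X, degree=3))]:
    print(f"{name:5s} alignment with the target: {alignment(K, target):.3f}")
```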

Journal ArticleDOI
TL;DR: The results of comparative study show that the proposed VB index has high ability in producing a good cluster number estimate and in addition, it provides a new approach for cluster validity from the view of statistical learning theory.
Abstract: Cluster validity has been widely used to evaluate the fitness of partitions produced by clustering algorithms. This paper presents a new validity, which is called the Vapnik–Chervonenkis bound (VB) index, for data clustering. It is estimated based on the structural risk minimization (SRM) principle, which optimizes the bound simultaneously over both the distortion function (empirical risk) and the VC-dimension (model complexity). The smallest bound of the guaranteed risk achieved on some appropriate cluster number validates the best description of the data structure. We use the deterministic annealing (DA) algorithm as the underlying clustering technique to produce the partitions. Five numerical examples and two real data sets are used to illustrate the use of VB as a validity index. Its effectiveness is compared to several popular cluster-validity indexes. The results of comparative study show that the proposed VB index has high ability in producing a good cluster number estimate and in addition, it provides a new approach for cluster validity from the view of statistical learning theory.

Journal ArticleDOI
TL;DR: The LOBARE methodology is applied on two different types of models: an artificial intelligence (AI) model in the form of a support vector machine (SVM) application for forecasting soil moisture and a conceptual rainfall‐runoff (CRR) model represented by the Sacramento soil moisture accounting (SAC‐SMA) model.
Abstract: The performance of any model depends on how well its associated parameters are estimated. In the current application, a localized Bayesian recursive estimation (LOBARE) approach is devised for parameter estimation. The LOBARE methodology is an extension of the Bayesian recursive estimation (BARE) method. It is applied in this paper to two different types of models: an artificial intelligence (AI) model in the form of a support vector machine (SVM) application for forecasting soil moisture and a conceptual rainfall-runoff (CRR) model represented by the Sacramento soil moisture accounting (SAC-SMA) model. Support vector machines, based on statistical learning theory (SLT), represent the modeling task as a quadratic optimization problem and have already been used in various applications in hydrology. They require estimation of three parameters. SAC-SMA is a very well known model that estimates runoff. It has a 13-dimensional parameter space. In the LOBARE approach presented here, Bayesian inference is used in an iterative fashion to estimate the parameter space that will most likely enclose a best parameter set. This is done by narrowing the sampling space through updating the "parent" bounds based on their fitness. These bounds are actually the parameter sets that were selected by BARE runs on subspaces of the initial parameter space. The new approach results in faster convergence toward the optimal parameter set using minimum training/calibration data and fewer sets of parameter values. The efficacy of the localized methodology is also compared with the previously used BARE algorithm.
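A schematic of the bound-narrowing idea, not the BARE/LOBARE equations themselves, is sketched below: sample candidate parameter sets inside the current bounds, score them against calibration data, and shrink the bounds around the best-scoring samples before repeating. The linear model and all settings are hypothetical.

```python
# Schematic iterative narrowing of parameter bounds toward a best-fit region.
import numpy as np

rng = np.random.default_rng(4)
true_theta = np.array([2.0, -1.0, 0.5])
X = rng.normal(size=(200, 3))
y_obs = X @ true_theta + rng.normal(0.0, 0.1, 200)

lower, upper = np.full(3, -5.0), np.full(3, 5.0)          # initial "parent" bounds
for it in range(6):
    thetas = rng.uniform(lower, upper, size=(500, 3))      # candidate parameter sets
    rmse = np.sqrt(np.mean((X @ thetas.T - y_obs[:, None]) ** 2, axis=0))
    best = thetas[np.argsort(rmse)[:50]]                   # fittest subset
    lower, upper = best.min(axis=0), best.max(axis=0)      # updated, narrower bounds
    print(f"iter {it}: best RMSE {rmse.min():.3f}, mean bounds width {np.mean(upper - lower):.3f}")
```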

Journal ArticleDOI
TL;DR: Results indicate that SVM models can match or even surpass the predictive performance of the best conventional "theory-thick" global models based on nuclear phenomenology.
Abstract: Advances in statistical learning theory present the opportunity to develop statistical models of quantum many-body systems exhibiting remarkable predictive power. The potential of such "theory-thin" approaches is illustrated with the application of Support Vector Machines (SVMs) to global prediction of nuclear properties as functions of proton and neutron numbers Z and N across the nuclidic chart. Based on the principle of structural-risk minimization, SVMs learn from examples in the existing database of a given property Y, automatically and optimally identify a set of "support vectors" corresponding to representative nuclei in the training set, and approximate the mapping (Z, N) → Y in terms of these nuclei. Results are reported for nuclear masses, beta-decay lifetimes, and spins/parities of nuclear ground states. These results indicate that SVM models can match or even surpass the predictive performance of the best conventional "theory-thick" global models based on nuclear phenomenology.
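A minimal sketch of the mapping (Z, N) → Y is given below with a synthetic stand-in for the nuclear property Y, since the real mass and lifetime tables are not reproduced here; it only illustrates that the regression input is the pair of proton and neutron numbers.

```python
# SVM regression of a toy nuclear property Y from proton and neutron numbers (Z, N).
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(5)
Z = rng.integers(10, 100, size=400)
N = Z + rng.integers(0, 40, size=400)
Y = 0.5 * Z + 0.3 * N + 5.0 * np.sin(Z / 10.0) + rng.normal(0.0, 0.5, 400)  # toy property

X = np.column_stack([Z, N]).astype(float)
model = SVR(kernel="rbf", C=100.0, gamma=0.001).fit(X[:300], Y[:300])
print("test RMSE:", np.sqrt(np.mean((model.predict(X[300:]) - Y[300:]) ** 2)))
```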

Proceedings ArticleDOI
01 Jan 2006

Journal Article
TL;DR: The elements of statistical learning theory underlying support vector machines for classification, together with the main algorithms, are introduced; the main open issues of support vector machines are discussed, and their application prospects are outlined.
Abstract: Support vector machines are a novel kind of machine learning method that has become a focus of machine learning research because of its excellent performance. In this paper, the elements of statistical learning theory underlying support vector machines for classification, together with the corresponding algorithms, are introduced. The main open issues of support vector machines are discussed, and their application prospects are outlined.