
Showing papers on "Statistical learning theory published in 2006"


Journal ArticleDOI
TL;DR: This work introduces a statistical formulation of this problem in terms of a simple mixture model, presents an instantiation of the framework for maximum entropy classifiers and their linear chain counterparts, and shows improved performance on three real-world tasks across four data sets from the natural language processing domain.
Abstract: The most basic assumption used in statistical learning theory is that training data and test data are drawn from the same underlying distribution. Unfortunately, in many applications, the "in-domain" test data is drawn from a distribution that is related, but not identical, to the "out-of-domain" distribution of the training data. We consider the common case in which labeled out-of-domain data is plentiful, but labeled in-domain data is scarce. We introduce a statistical formulation of this problem in terms of a simple mixture model and present an instantiation of this framework to maximum entropy classifiers and their linear chain counterparts. We present efficient inference algorithms for this special case based on the technique of conditional expectation maximization. Our experimental results show that our approach leads to improved performance on three real world tasks on four different data sets from the natural language processing domain.
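The setup above can be made concrete with a minimal sketch: train a maximum entropy (logistic regression) classifier on a weighted union of plentiful out-of-domain data and scarce in-domain data. The mixing weight, data, and variable names below are illustrative assumptions; this is a crude stand-in for the paper's mixture model and conditional-EM inference, not a reimplementation of it.

```python
# Hypothetical sketch: logistic regression on a weighted union of out-of-domain
# and in-domain data; NOT the paper's conditional-EM mixture-model procedure.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic "out-of-domain" data (plentiful) and "in-domain" data (scarce),
# drawn from related but shifted distributions.
X_out = rng.normal(0.0, 1.0, size=(1000, 5))
y_out = (X_out[:, 0] + X_out[:, 1] > 0).astype(int)
X_in = rng.normal(0.5, 1.0, size=(50, 5))
y_in = (X_in[:, 0] + 0.5 * X_in[:, 1] > 0.5).astype(int)

lam = 0.3  # hypothetical mixing weight for the out-of-domain component
X = np.vstack([X_out, X_in])
y = np.concatenate([y_out, y_in])
w = np.concatenate([np.full(len(y_out), lam), np.full(len(y_in), 1.0)])

clf = LogisticRegression(max_iter=1000).fit(X, y, sample_weight=w)
print("in-domain accuracy:", clf.score(X_in, y_in))
```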

894 citations


Journal ArticleDOI
TL;DR: A novel modified TSVM classifier designed for addressing ill-posed remote-sensing problems is proposed that is able to mitigate the effects of suboptimal model selection and can address multiclass cases.
Abstract: This paper introduces a semisupervised classification method that exploits both labeled and unlabeled samples for addressing ill-posed problems with support vector machines (SVMs). The method is based on recent developments in statistical learning theory concerning transductive inference and in particular transductive SVMs (TSVMs). TSVMs exploit specific iterative algorithms which gradually search for a reliable separating hyperplane (in the kernel space) with a transductive process that incorporates both labeled and unlabeled samples in the training phase. Based on an analysis of the properties of the TSVMs presented in the literature, a novel modified TSVM classifier designed for addressing ill-posed remote-sensing problems is proposed. In particular, the proposed technique: 1) is based on a novel transductive procedure that exploits a weighting strategy for unlabeled patterns, based on a time-dependent criterion; 2) is able to mitigate the effects of suboptimal model selection (which is unavoidable in the presence of small-size training sets); and 3) can address multiclass cases. Experimental results confirm the effectiveness of the proposed method on a set of ill-posed remote-sensing classification problems representing different operative conditions.
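A much-simplified, hypothetical sketch of the transductive idea is given below: unlabeled samples are pseudo-labeled by the current SVM and folded back into training with a weight that grows across iterations, loosely mirroring the time-dependent weighting described above. scikit-learn has no built-in TSVM, so a plain SVC with sample weights stands in for the authors' algorithm.

```python
# Simplified transductive-style loop; NOT the authors' modified TSVM.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
labeled, unlabeled = np.arange(30), np.arange(30, 300)   # small labeled training set

clf = SVC(kernel="rbf", gamma="scale").fit(X[labeled], y[labeled])
for t in range(1, 6):
    weight_t = t / 5.0                                   # unlabeled weight grows over "time"
    pseudo = clf.predict(X[unlabeled])                   # current pseudo-labels
    X_train = np.vstack([X[labeled], X[unlabeled]])
    y_train = np.concatenate([y[labeled], pseudo])
    w = np.concatenate([np.ones(len(labeled)), np.full(len(unlabeled), weight_t)])
    clf = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train, sample_weight=w)

print("accuracy on the unlabeled pool:", (clf.predict(X[unlabeled]) == y[unlabeled]).mean())
```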

560 citations


Journal ArticleDOI
TL;DR: The support vector machine, a novel artificial intelligence-based method developed from statistical learning theory, is adopted herein to establish a real-time stage forecasting model that can effectively provide flood stage forecasts one to six hours ahead.

464 citations


Journal ArticleDOI
TL;DR: New data-driven models based on statistical learning theory were used to forecast flows at two time scales, seasonal flow volumes and hourly stream flows, and showed promising performance in solving site-specific, real-time water resources management problems.

254 citations


Journal ArticleDOI
TL;DR: Results from the SVM modeling are compared with predictions obtained from ANN models and show that SVM models performed better for soil moisture forecasting than ANN models.
Abstract: Herein, a recently developed methodology, Support Vector Machines (SVMs), is presented and applied to the challenge of soil moisture prediction. Support Vector Machines are derived from statistical learning theory and can be used to predict a quantity forward in time based on training that uses past data, hence providing a statistically sound approach to solving inverse problems. The principal strength of SVMs lies in the fact that they employ Structural Risk Minimization (SRM) instead of Empirical Risk Minimization (ERM). The SVMs formulate a quadratic optimization problem that ensures a global optimum, which makes them superior to traditional learning algorithms such as Artificial Neural Networks (ANNs). The resulting model is sparse and not characterized by the “curse of dimensionality.” Soil moisture distribution and variation is helpful in predicting and understanding various hydrologic processes, including weather changes, energy and moisture fluxes, drought, irrigation scheduling, and rainfall/runoff generation. Soil moisture and meteorological data are used to generate SVM predictions for four and seven days ahead. Predictions show good agreement with actual soil moisture measurements. Results from the SVM modeling are compared with predictions obtained from ANN models and show that SVM models performed better for soil moisture forecasting than ANN models.
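As a rough illustration of the forecasting setup (predicting a quantity forward in time from past values), the sketch below trains an SVM regressor on lagged values of a synthetic series to predict it several steps ahead. The lag and lead choices and all parameters are assumptions for demonstration, not the paper's configuration.

```python
# Illustrative lagged-feature SVM forecast on a synthetic series.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
series = np.cumsum(rng.normal(size=500)) * 0.1 + 20.0   # synthetic "soil moisture" series

lags, lead = 7, 4                                        # use the last 7 values to predict 4 steps ahead
X = np.array([series[t - lags:t] for t in range(lags, len(series) - lead)])
y = np.array([series[t + lead] for t in range(lags, len(series) - lead)])

split = int(0.8 * len(X))
model = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X[:split], y[:split])
print("test RMSE:", np.sqrt(np.mean((model.predict(X[split:]) - y[split:]) ** 2)))
```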

237 citations


Journal ArticleDOI
TL;DR: This paper is intended as an introduction to SVMs and their applications, emphasizing their key features, and some algorithmic extensions and illustrative real-world applications of SVMs are shown.
Abstract: Support vector machines (SVMs) appeared in the early nineties as optimal margin classifiers in the context of Vapnik's statistical learning theory. Since then SVMs have been successfully applied to real-world data analysis problems, often providing improved results compared with other techniques. The SVMs operate within the framework of regularization theory by minimizing an empirical risk in a well-posed and consistent way. A clear advantage of the support vector approach is that sparse solutions to classification and regression problems are usually obtained: only a few samples are involved in the determination of the classification or regression functions. This fact facilitates the application of SVMs to problems that involve a large amount of data, such as text processing and bioinformatics tasks. This paper is intended as an introduction to SVMs and their applications, emphasizing their key features. In addition, some algorithmic extensions and illustrative real-world applications of SVMs are shown.
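The sparsity property mentioned above can be seen directly by counting support vectors after training; the toy dataset and parameters below are arbitrary choices for illustration.

```python
# Only a fraction of the training samples become support vectors,
# and only those determine the decision function.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
print("support vectors:", clf.n_support_.sum(), "out of", len(X), "training samples")
```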

232 citations


Journal ArticleDOI
TL;DR: Support Vector Machines are a new generation of classification method that attempts to produce boundaries between classes by both minimising the empirical error from the training set and also controlling the complexity of the decision boundary, which can be non-linear.
Abstract: Support Vector Machines (SVMs) are a new generation of classification method. Derived from well principled Statistical Learning theory, this method attempts to produce boundaries between classes by both minimising the empirical error from the training set and also controlling the complexity of the decision boundary, which can be non-linear. SVMs use a kernel matrix to transform a non-linear separation problem in input space to a linear separation problem in feature space. Common kernels include the Radial Basis Function, Polynomial and Sigmoidal Functions. In many simulated studies and real applications, SVMs show superior generalisation performance compared to traditional classification methods. SVMs also provide several useful statistics that can be used for both model selection and feature selection because these statistics are the upper bounds of the generalisation performance estimation of Leave-One-Out Cross-Validation. SVMs can be employed for multiclass problems in addition to the traditional two ...
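The kernels named above are plug-in choices; the short sketch below compares RBF, polynomial, and sigmoid kernels by cross-validated accuracy on a toy problem. The numbers are illustrative only and are not taken from the paper.

```python
# Comparing common SVM kernels on a toy non-linear problem.
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
for kernel in ("rbf", "poly", "sigmoid"):
    score = cross_val_score(SVC(kernel=kernel, gamma="scale"), X, y, cv=5).mean()
    print(f"{kernel:8s} cv accuracy: {score:.3f}")
```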

148 citations


Book ChapterDOI
07 Oct 2006
TL;DR: In this paper, the authors study how relaxing the realizability assumption affects the sample complexity of active learning and show that active learning can be transformed to tolerate random bounded rate class noise, and in particular exponential label complexity savings over passive learning are still possible.
Abstract: Most of the existing active learning algorithms are based on the realizability assumption: The learner's hypothesis class is assumed to contain a target function that perfectly classifies all training and test examples. This assumption can hardly ever be justified in practice. In this paper, we study how relaxing the realizability assumption affects the sample complexity of active learning. First, we extend existing results on query learning to show that any active learning algorithm for the realizable case can be transformed to tolerate random bounded rate class noise. Thus, bounded rate class noise adds little extra complications to active learning, and in particular exponential label complexity savings over passive learning are still possible. However, it is questionable whether this noise model is any more realistic in practice than assuming no noise at all. Our second result shows that if we move to the truly non-realizable model of statistical learning theory, then the label complexity of active learning has the same dependence Ω(1/ε²) on the accuracy parameter ε as the passive learning label complexity. More specifically, we show that under the assumption that the best classifier in the learner's hypothesis class has generalization error at most β > 0, the label complexity of active learning is Ω(β²/ε² log(1/δ)), where the accuracy parameter ε measures how close to optimal within the hypothesis class the active learner has to get and δ is the confidence parameter. The implication of this lower bound is that exponential savings should not be expected in realistic models of active learning, and thus the label complexity goals in active learning should be refined.
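To make the quoted rate tangible, the tiny calculation below evaluates β²/ε² · log(1/δ) for a few accuracy targets. Since Ω(·) hides constants, these figures only show how the label complexity scales as ε shrinks, not actual label counts.

```python
# Scaling of the non-realizable active-learning lower bound beta^2/eps^2 * log(1/delta).
import math

beta, delta = 0.05, 0.05
for eps in (0.1, 0.05, 0.01):
    rate = beta**2 / eps**2 * math.log(1.0 / delta)
    print(f"eps={eps:5.2f}  beta^2/eps^2 * log(1/delta) = {rate:8.2f}")
```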

110 citations


Journal Article
TL;DR: It is shown that under the assumption that the best classifier in the learner's hypothesis class has generalization error at most β > 0, the label complexity of active learning is Ω(β²/ε² log(1/δ)), where the accuracy parameter ε measures how close to optimal within the hypothesis class the active learner has to get and δ is the confidence parameter.
Abstract: Most of the existing active learning algorithms are based on the realizability assumption: The learner's hypothesis class is assumed to contain a target function that perfectly classifies all training and test examples. This assumption can hardly ever be justified in practice. In this paper, we study how relaxing the realizability assumption affects the sample complexity of active learning. First, we extend existing results on query learning to show that any active learning algorithm for the realizable case can be transformed to tolerate random bounded rate class noise. Thus, bounded rate class noise adds little extra complications to active learning, and in particular exponential label complexity savings over passive learning are still possible. However, it is questionable whether this noise model is any more realistic in practice than assuming no noise at all. Our second result shows that if we move to the truly non-realizable model of statistical learning theory, then the label complexity of active learning has the same dependence Ω(1/ε²) on the accuracy parameter ε as the passive learning label complexity. More specifically, we show that under the assumption that the best classifier in the learner's hypothesis class has generalization error at most β > 0, the label complexity of active learning is Ω(β²/ε² log(1/δ)), where the accuracy parameter ε measures how close to optimal within the hypothesis class the active learner has to get and δ is the confidence parameter. The implication of this lower bound is that exponential savings should not be expected in realistic models of active learning, and thus the label complexity goals in active learning should be refined.

98 citations


Journal ArticleDOI
TL;DR: In this article, a support vector machine (SVM) is used to model the nonlinear dynamics of a battery in order to establish the relationship between the load voltage and the current under different temperatures and states of charge (SOC).

83 citations


Journal ArticleDOI
TL;DR: Efforts are made to assess the uncertainty and robustness of the machines in learning and forecasting as a function of model structure, model parameters, and bootstrapping samples, and the utility and practicality of the proposed approaches are demonstrated.

Journal ArticleDOI
TL;DR: Support Vector Regression was applied to predict the cold modulus of sialon ceramic with satisfactory results; the prediction accuracy of the SVR model was higher than those of the BP-ANN and PLS models.

Journal ArticleDOI
TL;DR: In this magnificent paper, Professor Koltchinskii offers general and powerful performance bounds for empirical risk minimization, a fundamental principle of statistical learning theory, and develops a powerful new methodology, iterative localization, which is able to explain most of the recent results and go significantly beyond them in many cases.
Abstract: In this magnificent paper, Professor Koltchinskii offers general and powerful performance bounds for empirical risk minimization, a fundamental principle of statistical learning theory. Since the elegant pioneering work of Vapnik and Chervonenkis in the early 1970s, various such bounds have been known that relate the performance of empirical risk minimizers to combinatorial and geometrical features of the class over which the minimization is performed. This area of research has been a rich source of motivation and a major field of applications of empirical process theory. The appearance of advanced concentration inequalities in the 1990s, primarily thanks to Talagrand’s influential work, provoked major advances in both empirical process theory and statistical learning theory and led to a much deeper understanding of some of the basic phenomena. In the discussed paper Professor Koltchinskii develops a powerful new methodology, iterative localization, which, with the help of concentration inequalities, is able to explain most of the recent results and go significantly beyond them in many cases. The main motivation behind Professor Koltchinskii’s paper is based on classical problems of statistical learning theory such as binary classification and regression in which, given a sample (Xi ,Y i), i = 1 ,...,n , of independent and identically distributed pairs of random variables (where the Xi take their values in some feature space X and the Yi are, say, real-valued), the goal is to find a function f : X → R whose risk, defined in terms of the expected value of an appropriately chosen loss function, is as small as possible. In the remaining part of this discussion we point out how the performance bounds of Professor Koltchinskii’s paper can be used to study a seemingly different model, motivated by nonparametric ranking problems, which has received increasing attention both in the statistical and machine learning literature. Indeed, in several applications, such as the search engine problem or credit risk screening, the goal is to learn how to rank—or to score—observations rather than just classify them. In this case, performance measures involve pairs of observations, as can be seen, for instance, with the AUC (Area Under an ROC Curve) criterion. In this

Journal ArticleDOI
TL;DR: Support vector machines, a relatively new paradigm in statistical learning theory, are studied for their potential to recognize transient behavior of detector signals corresponding to various accident events at nuclear power plants (NPPs), and SVM calculations have demonstrated that they can produce classifiers with good generalization ability for data.
Abstract: Support vector machines (SVMs), a relatively new paradigm in statistical learning theory, are studied for their potential to recognize transient behavior of detector signals corresponding to variou ...

Proceedings ArticleDOI
20 Aug 2006
TL;DR: This paper uses parallel mixtures of support vector machines (SVMs) for classification by integrating this method into an HMM-based speech recognition system; the hybrid system is trained and tested on the DARPA Resource Management corpus, showing better performance than an HMM-based decoder using Gaussian mixtures.
Abstract: Speech recognition is usually based on Hidden Markov Models (HMMs), which represent the temporal dynamics of speech very efficiently, and Gaussian mixture models, which classify speech into single speech units (phonemes) suboptimally. In this paper we use parallel mixtures of Support Vector Machines (SVMs) for classification by integrating this method into an HMM-based speech recognition system. SVMs are very appealing due to their grounding in statistical learning theory and have already shown good results in pattern recognition and in continuous speech recognition. They suffer, however, from a training effort that scales at least quadratically with the number of training vectors. The SVM mixtures need only nearly linear training time, making it easier to deal with the large amount of speech data. In our hybrid system we use the SVM mixtures as acoustic models in an HMM-based decoder. We train and test the hybrid system on the DARPA Resource Management (RM1) corpus, showing better performance than an HMM-based decoder using Gaussian mixtures.
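A rough sketch of the parallel-mixture idea (outside any HMM) is given below: the training set is split into subsets, one SVM is trained per subset, which is what makes training roughly linear in the number of examples, and the experts' probability outputs are averaged at test time. The gating network and the acoustic-model integration of the actual system are omitted, and all data and parameters are synthetic assumptions.

```python
# Simplified "parallel mixture of SVMs": one expert per data chunk, averaged at test time.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=3000, n_features=30, n_informative=10, random_state=0)
X_train, y_train = X[:2500], y[:2500]
X_test, y_test = X[2500:], y[2500:]

n_experts = 5
chunks = np.array_split(np.random.default_rng(0).permutation(len(X_train)), n_experts)
experts = [SVC(kernel="rbf", gamma="scale", probability=True).fit(X_train[idx], y_train[idx])
           for idx in chunks]

proba = np.mean([e.predict_proba(X_test) for e in experts], axis=0)   # average the experts
print("mixture accuracy:", (proba.argmax(axis=1) == y_test).mean())
```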

Journal Article
TL;DR: The experimental results demonstrate that the proposed model has higher detection accuracy of intrusions, especially for some unknown attacks, and show that SVM can effectively detect intrusions.
Abstract: We apply the SVM technique to network intrusion detection and propose an SVM-based network anomaly intrusion detection model. The experimental results demonstrate that the proposed model has higher detection accuracy of intrusions, especially for some unknown attacks. This shows that SVM can effectively detect intrusions.

Journal Article
TL;DR: This work considers the consistency of the ERM scheme over classes of combinations of very simple rules (base classifiers) in multiclass classification in order to establish a quantitative relationship between classification errors and convex risks.
Abstract: The consistency of classification algorithm plays a central role in statistical learning theory. A consistent algorithm guarantees us that taking more samples essentially suffices to roughly reconstruct the unknown distribution. We consider the consistency of ERM scheme over classes of combinations of very simple rules (base classifiers) in multiclass classification. Our approach is, under some mild conditions, to establish a quantitative relationship between classification errors and convex risks. In comparison with the related previous work, the feature of our result is that the conditions are mainly expressed in terms of the differences between some values of the convex function.

Proceedings ArticleDOI
01 Jan 2006
TL;DR: The use of support vector machine (SVM) learning to classify heart rate signals performs very well even with signals exhibiting very low signal to noise ratio which is not the case for other standard methods proposed by the literature.
Abstract: In this study, we discuss the use of Support Vector Machine (SVM) learning to classify heart rate signals. Each signal is represented by an attribute vector containing a set of statistical measures for the respective signal. At first, the SVM classifier is trained on data (attribute vectors) with known ground truth. Then, the learnt classifier parameters can be used for the categorization of new signals not belonging to the training set. We have experimented with both real and artificial signals, and the SVM classifier performs very well even with signals exhibiting very low signal-to-noise ratio, which is not the case for other standard methods proposed in the literature. I. INTRODUCTION Heart Rate Variability (HRV) analysis is based on measuring the variability of heart rate signals and, more specifically, the variability in intervals between R peaks of the electrocardiogram (ECG), referred to as RR intervals. Several techniques have been proposed for the investigation of the evolution of features of the HRV time series. A survey of statistical methods, based on the estimation of the statistical properties of the beat-to-beat time series, can be found in (1). These methods describe the average statistical behavior of the signal over a considered time window. Spectral methods (2), based on FFT or standard autoregressive modeling, were also proposed. More recently, nonlinear approaches, including Markov modeling (3), entropy-based metrics (4), (5), the mutual information measure (6) and probabilistic modeling (7), (8) were presented to examine heart rate fluctuations. Other methods include the application of the Karhunen-Loève transformation (9) or modulation analysis (10), (11). In this study, we investigate the potential benefit of using support vector machine (SVM) learning (12), (13) to classify heart rate signals. Support vector classifiers are based on recent advances in statistical learning theory (14). They use a hypothesis space of linear functions in a high-dimensional feature space, trained with a learning algorithm from optimization theory that implements a learning bias derived from statistical learning theory. In the last decade, SVM learning has found a wide range of applications (15), including image segmentation (16) and classification (17), object recognition (18), image fusion (19) and stereo correspondence (20). Based on our previous work on support vector classification
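A hedged sketch of this pipeline appears below: each heart-rate signal is reduced to a small vector of statistical measures of its RR intervals, and an SVM is trained on those vectors. The features chosen (mean, SDNN, RMSSD) and the synthetic signals are assumptions for illustration; the paper's exact attribute set is not reproduced.

```python
# Toy HRV pipeline: statistical features of RR intervals -> SVM classifier.
import numpy as np
from sklearn.svm import SVC

def hrv_features(rr):
    """Simple statistical measures of an RR-interval series (in seconds)."""
    diffs = np.diff(rr)
    return [rr.mean(), rr.std(), np.sqrt(np.mean(diffs**2))]   # mean, SDNN, RMSSD

rng = np.random.default_rng(2)
signals, labels = [], []
for k in range(200):
    healthy = k % 2 == 0
    sigma = 0.05 if healthy else 0.01          # lower variability for the "abnormal" class
    rr = 0.8 + rng.normal(0.0, sigma, size=300)
    signals.append(hrv_features(rr))
    labels.append(int(healthy))

X, y = np.array(signals), np.array(labels)
clf = SVC(kernel="rbf", gamma="scale").fit(X[:150], y[:150])
print("held-out accuracy:", clf.score(X[150:], y[150:]))
```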

01 Jan 2006
TL;DR: A framework named Granular Support Vector Machines (GSVM) is proposed to systematically and formally combine statistical learning theory, granular computing theory and soft computing theory to address challenging predictive data modeling problems effectively and/or efficiently, with specific focus on binary classification problems.
Abstract: With the emergence of biomedical informatics, Web intelligence, and E-business, new challenges are coming for knowledge discovery and data mining modeling problems. In this dissertation work, a framework named Granular Support Vector Machines (GSVM) is proposed to systematically and formally combine statistical learning theory, granular computing theory and soft computing theory to address challenging predictive data modeling problems effectively and/or efficiently, with specific focus on binary classification problems. In general, GSVM works in 3 steps. Step 1 is granulation to build a sequence of information granules from the original dataset or from the original feature space. Step 2 is modeling Support Vector Machines (SVM) in some of these information granules when necessary. Finally, step 3 is aggregation to consolidate information in these granules at a suitable abstract level. A good granulation method to find suitable granules is crucial for modeling a good GSVM. Under this framework, many different granulation algorithms including the GSVM-CMW (cumulative margin width) algorithm, the GSVM-AR (association rule mining) algorithm, a family of GSVM-RFE (recursive feature elimination) algorithms, the GSVM-DC (data cleaning) algorithm and the GSVM-RU (repetitive undersampling) algorithm are designed for binary classification problems with different characteristics. The empirical studies in the biomedical domain and many other application domains demonstrate that the framework is promising. As a preliminary step, this dissertation work will be extended in the future to build a Granular Computing based Predictive Data Modeling framework (GrC-PDM) with which we can create hybrid adaptive intelligent data mining systems for high quality prediction.
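A hypothetical, much-simplified instance of the three-step scheme is sketched below: granulate the data (here with k-means, which is not one of the granulation algorithms listed above), fit an SVM inside each granule that contains both classes, and aggregate by routing each test point to its granule's model.

```python
# Simplified GSVM-style pipeline: granulation -> per-granule SVM -> aggregation.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, y_train, X_test, y_test = X[:800], y[:800], X[800:], y[800:]

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X_train)     # step 1: granulation
models = {}
for g in range(4):
    idx = km.labels_ == g
    if len(np.unique(y_train[idx])) == 2:                             # step 2: SVM per mixed granule
        models[g] = SVC(kernel="rbf", gamma="scale").fit(X_train[idx], y_train[idx])
    else:                                                             # pure granule: constant label
        models[g] = int(y_train[idx][0])

granule = km.predict(X_test)                                          # step 3: aggregation
pred = np.array([models[g].predict(x[None, :])[0] if not isinstance(models[g], int)
                 else models[g] for g, x in zip(granule, X_test)])
print("GSVM-style accuracy:", (pred == y_test).mean())
```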

Journal ArticleDOI
TL;DR: The modelling results of the real drift data from the long-term measurement system of a DTG indicate that the SVM method is available practically in the modelling of DTG drift and the proposed strategy of combining SVM with AGO is effective in improving the modelling precision and the learning performance.
Abstract: In this paper, the support vector machine (SVM), a novel learning machine based on statistical learning theory (SLT), is described and applied in the drift modelling of the dynamically tuned gyroscope (DTG). As a data preprocessing method, accumulated generating operation (AGO) is applied to the SVM for further improving the modelling precision and the learning performance of the drift model. The grey modelling method and RBF neural network are also investigated as a comparison to the SVM and AGO–SVM modelling methods. The modelling results of the real drift data from the long-term measurement system of a DTG indicate that the SVM method is available practically in the modelling of DTG drift and the proposed strategy of combining SVM with AGO is effective in improving the modelling precision and the learning performance.
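Reading AGO in its usual grey-systems sense as a cumulative sum, the sketch below accumulates a synthetic drift series, fits an SVM regressor on the accumulated series, and differences the predictions back (inverse AGO). The data and parameters are stand-ins, not the DTG measurements or the paper's settings.

```python
# AGO (cumulative sum) preprocessing + SVM regression on a synthetic drift series.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(3)
drift = 0.01 * np.arange(300) + rng.normal(0.0, 0.2, 300)   # synthetic gyro drift series

ago = np.cumsum(drift)                                       # accumulated generating operation
lags = 5
X = np.array([ago[t - lags:t] for t in range(lags, len(ago))])
y = ago[lags:]

model = SVR(kernel="rbf", C=100.0, epsilon=0.01).fit(X[:250], y[:250])
ago_pred = model.predict(X[250:])
drift_pred = np.diff(np.concatenate([[ago[250 + lags - 1]], ago_pred]))  # inverse AGO
print("RMSE on original scale:", np.sqrt(np.mean((drift_pred - drift[250 + lags:]) ** 2)))
```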

Book ChapterDOI
22 Jun 2006
TL;DR: A sequential randomized algorithm, which at each step concentrates on functions having both low risk and low variance with respect to the previous step prediction function, which satisfies a simple risk bound.
Abstract: We propose a sequential randomized algorithm, which at each step concentrates on functions having both low risk and low variance with respect to the previous step prediction function. It satisfies a simple risk bound, which is sharp to the extent that the standard statistical learning approach, based on the supremum of empirical processes, does not lead to algorithms with such a tight guarantee on their efficiency. Our generalization error bounds complement the pioneering work of Cesa-Bianchi et al. [12] in which standard-style statistical results were recovered with tight constants using worst-case analysis. A nice feature of our analysis of the randomized estimator is to put forward the links between the probabilistic and worst-case viewpoints. It also allows us to recover recent model selection results due to Juditsky et al. [16] and to improve them in least squares regression with heavy noise, i.e. when no exponential moment condition is assumed on the output.

Journal ArticleDOI
TL;DR: Support vector machines (SVMs), as discussed by the authors, operate within the framework of regularization theory by minimizing an empirical risk in a well-posed and consistent way, and have been successfully applied to real-world data analysis problems, often providing improved results compared with other techniques.
Abstract: Support vector machines (SVMs) appeared in the early nineties as optimal margin classifiers in the context of Vapnik's statistical learning theory. Since then SVMs have been successfully applied to real-world data analysis problems, often providing improved results compared with other techniques. The SVMs operate within the framework of regularization theory by minimizing an empirical risk in a well-posed and consistent way. A clear advantage of the support vector approach is that sparse solutions to classification and regression problems are usually obtained: only a few samples are involved in the determination of the classification or regression functions. This fact facilitates the application of SVMs to problems that involve a large amount of data, such as text processing and bioinformatics tasks. This paper is intended as an introduction to SVMs and their applications, emphasizing their key features. In addition, some algorithmic extensions and illustrative real-world applications of SVMs are shown.

Journal ArticleDOI
01 Mar 2006
TL;DR: It is shown how to develop a mechanistically based learning machine (i.e., a machine that contains background knowledge) for the case of biological wastewater treatment systems, which has a hierarchical property and can be used to implement the IPSRM.
Abstract: This article introduces a novel approach to the issue of learning from empirical data coming from complex systems that are continuous, dynamic, highly nonlinear, and stochastic. The main feature of this approach is that it attempts to integrate the powerful statistical learning theoretic methods and the valuable background knowledge that one possesses about the system under study. The learning machines that have been used, up to now, for the implementation of Vapnik's inductive principle of structural risk minimization (IPSRM) are of the "black-box" type, such as artificial neural networks, ARMA models, or polynomial functions. These are generic models that contain absolutely no knowledge about the problem at hand. They are used to approximate the behavior of any system and are prodigal in their requirements of training data. In addition, the conditions that underlie the theory of statistical learning would not hold true when these "black-box" models are used to describe highly complex systems. In this paper, it is argued that the use of a learning machine whose structure is developed on the basis of the physical mechanisms of the system under study is more advantageous. Such a machine will indeed be specific to the problem at hand and will require many fewer data points for training than its black-box counterparts. Furthermore, because this machine contains background knowledge about the system, it will provide better approximations of the various dynamic modes of this system and will, therefore, satisfy some of the prerequisites that are needed for meeting the conditions of statistical learning theory (SLT). This paper shows how to develop such a mechanistically based learning machine (i.e., a machine that contains background knowledge) for the case of biological wastewater treatment systems. Fuzzy logic concepts, combined with the results of the research in the area of wastewater engineering, will be utilized to construct such a machine. This machine has a hierarchical property and can, therefore, be used to implement the IPSRM.

Journal ArticleDOI
TL;DR: The most basic assumption in statistical learning theory is that training data and test data are drawn from the same underlying distribution, but in many applications, the "in-domai...
Abstract: The most basic assumption used in statistical learning theory is that training data and test data are drawn from the same underlying distribution. Unfortunately, in many applications, the "in-domai...

Proceedings ArticleDOI
23 Oct 2006
TL;DR: It is proved that if a kernel has a perfect alignment with the classification task, the SVM classifier has better performance.
Abstract: This paper studies several key aspects of support vector machines (SVMs) for Web page classification. Developed from statistical learning theory, SVM is widely investigated and used for text categorization because of its high generalization performance and its ability to handle high-dimensional classification. Firstly, some methods for Web page representation are studied. Secondly, Web page classification based on SVM is implemented on a data set, and a naive Bayes (NB) classifier is used to study the performance of the SVM classifier in processing a high-dimensional space. Finally, a comparison of the polynomial kernel function and the radial basis function (RBF) kernel function is presented. It is proved that if a kernel has a perfect alignment with the classification task, the SVM classifier has better performance.
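The alignment criterion referred to above is commonly defined (following Cristianini et al.) as A(K, yyᵀ) = ⟨K, yyᵀ⟩_F / (‖K‖_F · ‖yyᵀ‖_F); the sketch below computes it for an RBF and a polynomial kernel on a toy problem, under the assumption that this is the notion of kernel-target alignment the paper uses.

```python
# Kernel-target alignment of two candidate kernels with the ideal kernel y y^T.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel, polynomial_kernel

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
y_pm = np.where(y == 1, 1.0, -1.0)
target = np.outer(y_pm, y_pm)                      # ideal kernel y y^T

def alignment(K, T):
    return np.sum(K * T) / (np.linalg.norm(K) * np.linalg.norm(T))   # Frobenius inner product / norms

for name, K in [("rbf", rbf_kernel(X)), ("poly", polynomial_kernel(X, degree=3))]:
    print(f"{name:5s} alignment with the target: {alignment(K, target):.3f}")
```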

Journal ArticleDOI
TL;DR: The results of comparative study show that the proposed VB index has high ability in producing a good cluster number estimate and in addition, it provides a new approach for cluster validity from the view of statistical learning theory.
Abstract: Cluster validity has been widely used to evaluate the fitness of partitions produced by clustering algorithms. This paper presents a new validity, which is called the Vapnik–Chervonenkis bound (VB) index, for data clustering. It is estimated based on the structural risk minimization (SRM) principle, which optimizes the bound simultaneously over both the distortion function (empirical risk) and the VC-dimension (model complexity). The smallest bound of the guaranteed risk achieved on some appropriate cluster number validates the best description of the data structure. We use the deterministic annealing (DA) algorithm as the underlying clustering technique to produce the partitions. Five numerical examples and two real data sets are used to illustrate the use of VB as a validity index. Its effectiveness is compared to several popular cluster-validity indexes. The results of comparative study show that the proposed VB index has high ability in producing a good cluster number estimate and in addition, it provides a new approach for cluster validity from the view of statistical learning theory.

Journal ArticleDOI
TL;DR: The LOBARE methodology is applied on two different types of models: an artificial intelligence (AI) model in the form of a support vector machine (SVM) application for forecasting soil moisture and a conceptual rainfall‐runoff (CRR) model represented by the Sacramento soil moisture accounting (SAC‐SMA) model.
Abstract: The performance of any model depends on how well its associated parameters are estimated. In the current application, a localized Bayesian recursive estimation (LOBARE) approach is devised for parameter estimation. The LOBARE methodology is an extension of the Bayesian recursive estimation (BARE) method. It is applied in this paper to two different types of models: an artificial intelligence (AI) model in the form of a support vector machine (SVM) application for forecasting soil moisture and a conceptual rainfall-runoff (CRR) model represented by the Sacramento soil moisture accounting (SAC-SMA) model. Support vector machines, based on statistical learning theory (SLT), represent the modeling task as a quadratic optimization problem and have already been used in various applications in hydrology. They require estimation of three parameters. SAC-SMA is a very well known model that estimates runoff. It has a 13-dimensional parameter space. In the LOBARE approach presented here, Bayesian inference is used in an iterative fashion to estimate the parameter space that will most likely enclose a best parameter set. This is done by narrowing the sampling space through updating the "parent" bounds based on their fitness. These bounds are actually the parameter sets that were selected by BARE runs on subspaces of the initial parameter space. The new approach results in faster convergence toward the optimal parameter set using minimum training/calibration data and fewer sets of parameter values. The efficacy of the localized methodology is also compared with the previously used BARE algorithm.
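A schematic of the bound-narrowing idea, not the BARE/LOBARE equations themselves, is sketched below: sample candidate parameter sets inside the current bounds, score them against calibration data, and shrink the bounds around the best-scoring samples before repeating. The linear model and all settings are hypothetical.

```python
# Schematic iterative narrowing of parameter bounds toward a best-fit region.
import numpy as np

rng = np.random.default_rng(4)
true_theta = np.array([2.0, -1.0, 0.5])
X = rng.normal(size=(200, 3))
y_obs = X @ true_theta + rng.normal(0.0, 0.1, 200)

lower, upper = np.full(3, -5.0), np.full(3, 5.0)          # initial "parent" bounds
for it in range(6):
    thetas = rng.uniform(lower, upper, size=(500, 3))      # candidate parameter sets
    rmse = np.sqrt(np.mean((X @ thetas.T - y_obs[:, None]) ** 2, axis=0))
    best = thetas[np.argsort(rmse)[:50]]                   # fittest subset
    lower, upper = best.min(axis=0), best.max(axis=0)      # updated, narrower bounds
    print(f"iter {it}: best RMSE {rmse.min():.3f}, mean bounds width {np.mean(upper - lower):.3f}")
```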

Journal ArticleDOI
TL;DR: Results indicate that SVM models can match or even surpass the predictive performance of the best conventional "theory-thick" global models based on nuclear phenomenology.
Abstract: Advances in statistical learning theory present the opportunity to develop statistical models of quantum many-body systems exhibiting remarkable predictive power. The potential of such "theory-thin" approaches is illustrated with the application of Support Vector Machines (SVMs) to global prediction of nuclear properties as functions of proton and neutron numbers Z and N across the nuclidic chart. Based on the principle of structural-risk minimization, SVMs learn from examples in the existing database of a given property Y, automatically and optimally identify a set of "support vectors" corresponding to representative nuclei in the training set, and approximate the mapping (Z, N) → Y in terms of these nuclei. Results are reported for nuclear masses, beta-decay lifetimes, and spins/parities of nuclear ground states. These results indicate that SVM models can match or even surpass the predictive performance of the best conventional "theory-thick" global models based on nuclear phenomenology.
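A minimal sketch of the mapping (Z, N) → Y is given below with a synthetic stand-in for the nuclear property Y, since the real mass and lifetime tables are not reproduced here; it only illustrates that the regression input is the pair of proton and neutron numbers.

```python
# SVM regression of a toy nuclear property Y from proton and neutron numbers (Z, N).
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(5)
Z = rng.integers(10, 100, size=400)
N = Z + rng.integers(0, 40, size=400)
Y = 0.5 * Z + 0.3 * N + 5.0 * np.sin(Z / 10.0) + rng.normal(0.0, 0.5, 400)  # toy property

X = np.column_stack([Z, N]).astype(float)
model = SVR(kernel="rbf", C=100.0, gamma=0.001).fit(X[:300], Y[:300])
print("test RMSE:", np.sqrt(np.mean((model.predict(X[300:]) - Y[300:]) ** 2)))
```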

Proceedings ArticleDOI
01 Jan 2006

Journal Article
TL;DR: The elements of statistical learning theory underlying support vector machines for classification, together with the main algorithms, are introduced; the main open issues of support vector machines are discussed, and their application prospects are outlined.
Abstract: Support vector machines are a novel kind of machine learning method that has become a focus of machine learning research because of its excellent performance. In this paper, the elements of statistical learning theory underlying support vector machines for classification, together with the corresponding algorithms, are introduced. The main open issues of support vector machines are discussed, and their application prospects are outlined.