
Showing papers on "Statistical learning theory published in 2004"


Book ChapterDOI
TL;DR: This tutorial introduces the techniques that are used to obtain results in the form of so-called error bounds in statistical learning theory.
Abstract: The goal of statistical learning theory is to study, in a statistical framework, the properties of learning algorithms. In particular, most results take the form of so-called error bounds. This tutorial introduces the techniques that are used to obtain such results.
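
To make the flavor of such error bounds concrete, here is a classical finite-class bound (a standard textbook example, not taken from this tutorial): with probability at least 1 - δ over an i.i.d. sample of size n,

```latex
\forall f \in F:\quad
R(f) \;\le\; \widehat{R}_n(f) \;+\; \sqrt{\frac{\ln|F| + \ln(1/\delta)}{2n}},
```

where R(f) is the true risk and R̂_n(f) the empirical risk on the sample; it follows from Hoeffding's inequality and a union bound over the finite hypothesis class F.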

602 citations


Journal ArticleDOI
TL;DR: An experimental application to network intrusion detection shows that SmartSifter was able to identify data with high scores that corresponded to attacks, with low computational costs.
Abstract: Outlier detection is a fundamental issue in data mining, specifically in fraud detection, network intrusion detection, network monitoring, etc. SmartSifter is an outlier detection engine addressing this problem from the viewpoint of statistical learning theory. This paper provides a theoretical basis for SmartSifter and empirically demonstrates its effectiveness. SmartSifter detects outliers in an on-line process through the on-line unsupervised learning of a probabilistic model (using a finite mixture model) of the information source. Each time a datum is input, SmartSifter employs an on-line discounting learning algorithm to learn the probabilistic model. A score is given to the datum based on the learned model, with a high score indicating a high possibility of being a statistical outlier. The novel features of SmartSifter are: (1) it is adaptive to non-stationary sources of data; (2) a score has a clear statistical/information-theoretic meaning; (3) it is computationally inexpensive; and (4) it can handle both categorical and continuous variables. An experimental application to network intrusion detection shows that SmartSifter was able to identify data with high scores that corresponded to attacks, with low computational costs. Further experimental application has identified a number of meaningful rare cases in actual health insurance pathology data from Australia's Health Insurance Commission.
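
As a rough illustration of the on-line discounting idea, here is a minimal sketch for a single one-dimensional Gaussian; the class name, discounting rule, and score are simplifications of ours, whereas the actual SmartSifter engine learns a finite mixture model and also handles categorical variables:

```python
import numpy as np

class DiscountingGaussian:
    """Online Gaussian model with exponential forgetting (discounting).

    A minimal sketch of the discounting idea for a single 1-D Gaussian;
    the real SmartSifter engine uses finite mixture models.
    """

    def __init__(self, r=0.05):
        self.r = r          # discounting rate: larger -> forget faster
        self.mu = 0.0       # running mean
        self.var = 1.0      # running variance

    def score_and_update(self, x):
        # Score first: negative log-likelihood under the current model,
        # so poorly explained points (outliers) receive high scores.
        score = 0.5 * np.log(2 * np.pi * self.var) + (x - self.mu) ** 2 / (2 * self.var)
        # Discounted (exponentially forgetting) parameter update.
        self.mu = (1 - self.r) * self.mu + self.r * x
        self.var = (1 - self.r) * self.var + self.r * (x - self.mu) ** 2
        return score

model = DiscountingGaussian(r=0.05)
stream = np.concatenate([np.random.normal(0, 1, 500), [8.0]])  # one injected outlier
scores = [model.score_and_update(x) for x in stream]
print(f"final score (outlier): {scores[-1]:.2f}, median score: {np.median(scores):.2f}")
```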

592 citations


Journal ArticleDOI
TL;DR: This paper proves tight data-dependent bounds for the risk of this hypothesis in terms of an easily computable statistic M_n associated with the on-line performance of the ensemble, and obtains risk tail bounds for kernel perceptron algorithms in terms of the spectrum of the empirical kernel matrix.
Abstract: In this paper, it is shown how to extract a hypothesis with small risk from the ensemble of hypotheses generated by an arbitrary on-line learning algorithm run on an independent and identically distributed (i.i.d.) sample of data. Using a simple large deviation argument, we prove tight data-dependent bounds for the risk of this hypothesis in terms of an easily computable statistic M_n associated with the on-line performance of the ensemble. Via sharp pointwise bounds on M_n, we then obtain risk tail bounds for kernel perceptron algorithms in terms of the spectrum of the empirical kernel matrix. These bounds reveal that the linear hypotheses found via our approach achieve optimal tradeoffs between hinge loss and margin size over the class of all linear functions, an issue that was left open by previous results. A distinctive feature of our approach is that the key tools for our analysis come from the model of prediction of individual sequences; i.e., a model making no probabilistic assumptions on the source generating the data. In fact, these tools turn out to be so powerful that we only need very elementary statistical facts to obtain our final risk bounds.
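
The generic form of such online-to-batch bounds can be sketched as follows (a standard Azuma/Hoeffding-style statement for a loss bounded in [0, 1], not the paper's exact result): if the on-line algorithm produces hypotheses h_0, ..., h_{n-1} with cumulative loss M_n on the sample, then with probability at least 1 - δ,

```latex
\frac{1}{n}\sum_{t=1}^{n} R(h_{t-1}) \;\le\; \frac{M_n}{n} + \sqrt{\frac{2\ln(1/\delta)}{n}},
```

so a hypothesis with small risk can be extracted from the ensemble whenever the average on-line loss M_n/n is small.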

580 citations


Book
27 Aug 2004
TL;DR: This book presents probabilistic and randomized methods for the analysis and design of uncertain systems, covering statistical learning theory for control design, sequential algorithms for probabilistic robust design and for LPV systems, and a scenario approach for probabilistic robust design.
Abstract: Overview.- Elements of Probability Theory.- Uncertain Linear Systems and Robustness.- Linear Robust Control Design.- Some Limits of the Robustness Paradigm.- Probabilistic Methods for Robustness.- Monte Carlo Methods.- Randomized Algorithms in Systems and Control.- Probability Inequalities.- Statistical Learning Theory and Control Design.- Sequential Algorithms for Probabilistic Robust Design.- Sequential Algorithms for LPV Systems.- Scenario Approach for Probabilistic Robust Design.- Random Number and Variate Generation.- Statistical Theory of Radial Random Vectors.- Vector Randomization Methods.- Statistical Theory of Radial Random Matrices.- Matrix Randomization Methods.- Applications of Randomized Algorithms.- Appendix.

393 citations


BookDOI
01 Jan 2004
TL;DR: Statistical learning theory and stochastic optimization.
Abstract: Statistical learning theory and stochastic optimization.

326 citations


Journal ArticleDOI
TL;DR: Support vector machine (SVM), a new powerful machine learning method based on statistical learning theory (SLT), is introduced into soft sensor modeling and a model selection method within the Bayesian evidence framework is proposed to select an optimal model for a soft sensor based on SVM.

289 citations


Journal ArticleDOI
TL;DR: This task is performed from the point of view of the theory of statistical learning, which provides a unified framework for regression, classification, and probability density estimation, and it is shown that only one group of methods is useful for structural reliability, according to some specific criteria.

172 citations


Book ChapterDOI
01 Jan 2004
TL;DR: This chapter discusses the statistical theory underlying various parameter-estimation methods, and gives algorithms which depend on alternatives to maximum-likelihood estimation, and describes parameter estimation algorithms which are motivated by these generalization bounds.
Abstract: A fundamental problem in statistical parsing is the choice of criteria and algorithms used to estimate the parameters in a model. The predominant approach in computational linguistics has been to use a parametric model with some variant of maximum-likelihood estimation. The assumptions under which maximum-likelihood estimation is justified are arguably quite strong. This chapter discusses the statistical theory underlying various parameter-estimation methods, and gives algorithms which depend on alternatives to (smoothed) maximum-likelihood estimation. We first give an overview of results from statistical learning theory. We then show how important concepts from the classification literature - specifically, generalization results based on margins on training data - can be derived for parsing models. Finally, we describe parameter estimation algorithms which are motivated by these generalization bounds.

121 citations


01 Jan 2004
TL;DR: This is meant to be a self-contained presentation of adaptive classification seen from the PAC-Bayesian point of view; the main improvements brought here are more localized bounds and the use of exchangeable prior distributions.
Abstract: This is meant to be a self-contained presentation of adaptive classification seen from the PAC-Bayesian point of view. Although most of the results are original, some review material about the VC dimension and support vector machines is also included. This study falls in the field of statistical learning theory, where complex data have to be analyzed from a limited amount of information drawn from a finite sample. It relies on non-asymptotic deviation inequalities, where the complexity of models is captured through the use of prior measures. The main improvements brought here are more localized bounds and the use of exchangeable prior distributions. Interesting consequences are drawn for the generalization properties of support vector machines and the design of new classification algorithms. 2000 Mathematics Subject Classification: 62H30, 68T05, 62B10.
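
For orientation, the non-localized PAC-Bayesian bound that such work refines reads roughly as follows (a classical McAllester/Maurer-style statement, not the localized bounds of this study): for any prior π fixed before seeing the data, with probability at least 1 - δ, simultaneously for all posteriors ρ,

```latex
\mathbb{E}_{f\sim\rho}\,R(f) \;\le\; \mathbb{E}_{f\sim\rho}\,\widehat{R}_n(f)
+ \sqrt{\frac{\mathrm{KL}(\rho\,\|\,\pi) + \ln(2\sqrt{n}/\delta)}{2n}},
```

where the Kullback-Leibler divergence KL(ρ‖π) plays the role of the model complexity captured by the prior measure.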

80 citations


Book
01 Jan 2004
TL;DR: This volume collects introductory tutorials on machine learning, ranging from pattern classification, Bayesian inference, and Gaussian processes to unsupervised learning, Monte Carlo methods, stochastic learning, statistical learning theory, and concentration inequalities.
Abstract: An Introduction to Pattern Classification.- Some Notes on Applied Mathematics for Machine Learning.- Bayesian Inference: An Introduction to Principles and Practice in Machine Learning.- Gaussian Processes in Machine Learning.- Unsupervised Learning.- Monte Carlo Methods for Absolute Beginners.- Stochastic Learning.- Introduction to Statistical Learning Theory.- Concentration Inequalities.

69 citations


Journal ArticleDOI
TL;DR: Inspired by several generalization bounds, "compression coefficients" for SVMs are constructed which measure the amount by which the training labels can be compressed by a code built from the separating hyperplane and can fairly accurately predict the parameters for which the test error is minimized.
Abstract: In this paper we investigate connections between statistical learning theory and data compression on the basis of support vector machine (SVM) model selection. Inspired by several generalization bounds, we construct "compression coefficients" for SVMs which measure the amount by which the training labels can be compressed by a code built from the separating hyperplane. The main idea is to relate the coding precision to geometrical concepts such as the width of the margin or the shape of the data in the feature space. The compression coefficients so derived combine well-known quantities such as the radius-margin term R^2/ρ^2, the eigenvalues of the kernel matrix, and the number of support vectors. To test whether they are useful in practice, we ran model selection experiments on benchmark data sets. As a result we found that compression coefficients can fairly accurately predict the parameters for which the test error is minimized.
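
A minimal sketch of the radius-margin quantity R^2/ρ^2 that these compression coefficients build on, for the linear-kernel case; the helper name and the crude radius estimate are ours (the exact R is the radius of the smallest enclosing ball in feature space), and the paper combines this term with kernel-matrix eigenvalues and support-vector counts:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Hypothetical helper illustrating the radius-margin quantity R^2 / rho^2
# for a linear SVM at different regularization strengths.
def radius_margin(X, y, C=1.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    w = clf.coef_.ravel()
    rho = 1.0 / np.linalg.norm(w)                            # geometric margin
    R = np.max(np.linalg.norm(X - X.mean(axis=0), axis=1))   # crude data radius
    return (R / rho) ** 2

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
for C in (0.01, 1.0, 100.0):
    print(f"C={C:>6}: R^2/rho^2 = {radius_margin(X, y, C):.1f}")
```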

Journal ArticleDOI
TL;DR: Methods to establish the mapping from the low-dimensional embedded space to the high-dimensional space for LLE are proposed, and their efficiency is validated on the reconstruction of multi-pose face images.

Proceedings ArticleDOI
31 Oct 2004
TL;DR: The hope is that this approach will enable "new science" in the design of self-managing systems by allowing the rapid and widespread application of statistical learning theory (SLT) techniques to problems of system dependability.
Abstract: Complex distributed Internet services form the basis not only of e-commerce but increasingly of mission-critical network-based applications. What is new is that the workload and internal architecture of three-tier enterprise applications present the opportunity for a new approach to keeping them running in the face of many common recoverable failures. The core of the approach is anomaly detection and localization based on statistical machine learning techniques. Unlike previous approaches, we propose anomaly detection and pattern mining not only for operational statistics such as mean response time, but also for structural behaviors of the system: what parts of the system, in what combinations, are being exercised in response to different kinds of external stimuli. In addition, rather than building baseline models a priori, we extract them by observing the behavior of the system over a short period of time during normal operation. We explain the necessary underlying assumptions and why they can be realized by systems research, report on some early successes using the approach, describe benefits of the approach that make it competitive as a path toward self-managing systems, and outline some research challenges. Our hope is that this approach will enable "new science" in the design of self-managing systems by allowing the rapid and widespread application of statistical learning theory (SLT) techniques to problems of system dependability.

Journal ArticleDOI
01 Feb 2004
TL;DR: A new Empirical Risk Functional as cost function for training neuro-fuzzy classifiers that provides a differentiable approximation of the misclassification rate, so that the Empirical Risk Minimization Principle formulated in Vapnik's Statistical Learning Theory can be applied.
Abstract: The paper proposes a new Empirical Risk Functional as cost function for training neuro-fuzzy classifiers. This cost function, called Approximate Differentiable Empirical Risk Functional (ADERF), provides a differentiable approximation of the misclassification rate so that the Empirical Risk Minimization Principle formulated in Vapnik's Statistical Learning Theory can be applied. Also, based on the proposed ADERF, a learning algorithm is formulated. Experimental results on a number of benchmark classification tasks are provided, and a comparison to alternative approaches is given.
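
The exact ADERF construction is given in the paper; the generic shape of such a differentiable surrogate for the misclassification rate is, for a classifier f_θ with labels y_i ∈ {-1, +1} and a steepness parameter β (our notation, not the paper's),

```latex
\widehat{R}_{\mathrm{approx}}(\theta) \;=\; \frac{1}{n}\sum_{i=1}^{n}
\sigma\!\bigl(-\beta\, y_i f_\theta(x_i)\bigr),
\qquad \sigma(u) = \frac{1}{1+e^{-u}},
```

which converges to the empirical misclassification rate as β → ∞ while remaining differentiable, so gradient-based training of the neuro-fuzzy classifier can minimize it directly.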

Proceedings ArticleDOI
01 Jan 2004
TL;DR: The support vector machine, a classifier motivated by statistical learning theory, is used in the pattern recognition stage of automatic pulmonary nodule detection and gives the unique optimal solution.
Abstract: Developing a Computer-Assisted Detection (CAD) system for automatic detection of pulmonary nodules in thoracic CT is a highly challenging research area in the medical domain. It requires the application of state-of-the-art image processing and pattern recognition technologies. The object recognition and feature extraction phase of such a system generates a large data set. As this data set normally contains a large quantity of non-nodule objects while the nodule objects are sparse, a Gaussian mixture model-based sampling method is used to reduce the non-nodule data and thus the classification complexity. The support vector machine, a classifier motivated by statistical learning theory, is used in the pattern recognition stage of automatic pulmonary nodule detection. After the training process, only support vectors will be used in the classification process. As the support vector machine classifier gives the unique optimal solution, the experiment on the lung nodule data shows a fast and satisfactory classification rate.
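
A hedged sketch of the two-stage idea on synthetic data: thin out the abundant majority class with a Gaussian mixture model, then train an SVM. The "keep the worst-explained negatives" rule below is our assumption standing in for the paper's sampling method:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

# Class 0 plays the role of the abundant non-nodule class.
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_neg, X_pos = X[y == 0], X[y == 1]

gmm = GaussianMixture(n_components=8, random_state=0).fit(X_neg)
# Keep the negatives the mixture explains *worst* (lowest log-likelihood):
# they tend to lie near the class boundary and carry most of the information.
keep = np.argsort(gmm.score_samples(X_neg))[: 3 * len(X_pos)]
X_train = np.vstack([X_neg[keep], X_pos])
y_train = np.concatenate([np.zeros(len(keep)), np.ones(len(X_pos))])

clf = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)
print(f"reduced training set: {len(X_train)} of {len(X)} samples")
```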


Journal ArticleDOI
TL;DR: The experimental results show that for motion estimation applications, SLT-based model selection compares favorably against alternative model selection methods, such as Akaike's final prediction error, Schwarz's criterion, generalized cross-validation, and Shibata's model selector.
Abstract: This paper describes a novel application of statistical learning theory (SLT) to single motion estimation and tracking. The problem of motion estimation can be related to statistical model selection, where the goal is to select one (correct) motion model from several possible motion models, given finite noisy samples. SLT, also known as Vapnik-Chervonenkis (VC) theory, provides analytic generalization bounds for model selection, which have been used successfully for practical model selection. This paper describes a successful application of an SLT-based model selection approach to the challenging problem of estimating optimal motion models from small data sets of image measurements (flow). We present results of experiments on both synthetic and real image sequences for motion interpolation and extrapolation; these results demonstrate the feasibility and strength of our approach. Our experimental results show that for motion estimation applications, SLT-based model selection compares favorably against alternative model selection methods, such as Akaike's final prediction error (fpe), Schwarz's criterion (sc), generalized cross-validation (gcv), and Shibata's model selector (sms). The paper also shows how to address the aperture problem using SLT-based model selection for a penalized linear (ridge regression) formulation.
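
For reference, the kind of analytic VC generalization bound that underlies SLT-based model selection has the form (the classical Vapnik bound for 0-1 loss; the paper uses a practical penalization variant of it): with probability at least 1 - δ, for a model class of VC dimension h,

```latex
R(f) \;\le\; \widehat{R}_n(f)
+ \sqrt{\frac{h\left(\ln(2n/h) + 1\right) + \ln(4/\delta)}{n}},
```

and model selection picks the motion model minimizing this penalized empirical risk rather than the empirical risk alone.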

Journal ArticleDOI
TL;DR: The purpose of this paper is to present a learning algorithm to classify data with nonlinear characteristics by applying the SVM method to AVO classification of gas sand and wet sand.
Abstract: The purpose of this paper is to present a learning algorithm to classify data with nonlinear characteristics. The support vector machine (SVM) is a novel type of learning machine based on statistical learning theory [Vapnik, 1998]. The SVM implements the following idea: it maps the input vector X into a high-dimensional feature space Z through some nonlinear mapping, chosen a priori. In this space, an optimal separating hyperplane is constructed to separate data groupings. The SVM learning method can be used to classify seismic data patterns for exploration and reservoir characterization applications, and it is particularly good at classifying data with nonlinear characteristics. As an example, the SVM method is applied to AVO classification of gas sand and wet sand.
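
In symbols, the idea the abstract describes is the standard soft-margin SVM: with a feature map Φ and kernel K(x, x') = ⟨Φ(x), Φ(x')⟩, the optimal separating hyperplane solves

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i
\quad\text{s.t.}\quad
y_i\bigl(\langle w, \Phi(x_i)\rangle + b\bigr) \ge 1 - \xi_i,\qquad \xi_i \ge 0,
```

and the resulting classifier is f(x) = sign(Σ_i α_i y_i K(x_i, x) + b), with nonzero coefficients α_i only for the support vectors.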

DOI
01 Jan 2004
TL;DR: This thesis considers a 2D object positioning application in a computer-rendered environment (CRE) that is operated with four mental activities (controlling MAs); BCI operation is asynchronous, namely the system is always active and reacts only when it recognizes any of the controlling MAs.
Abstract: Scalp-recorded electroencephalogram (EEG) signals reflect the combined synaptic and axonal activity of groups of neurons. In addition to their clinical applications, EEG signals can be used as support for direct brain-computer communication devices (brain-computer interfaces, BCIs). Indeed, during the performance of mental activities, EEG patterns that characterize them emerge. If actions executed by the BCI are associated with classes of patterns resulting from mental activities that do not involve any physical effort, communication by means of thoughts is achieved. The subject operates the BCI by performing mental activities which are recognized by the BCI through comparison with recognition models that are set up during a training phase.

In this thesis we consider a 2D object positioning application in a computer-rendered environment (CRE) that is operated with four mental activities (controlling MAs). BCI operation is asynchronous, namely the system is always active and reacts only when it recognizes any of the controlling MAs. The BCI analyzes segments of EEG (EEG-trials) and executes actions on the CRE in accordance with a set of rules (action rules) adapted to the subject's controlling skills.

EEG signals have small amplitudes and are therefore sensitive to external electromagnetic perturbations. In addition, subject-generated artifacts (ocular and muscular) can hinder BCI operation and even lead to misleading conclusions regarding the real controlling skills of a subject. Thus, it is especially important to remove external perturbations and detect subject-generated artifacts. External perturbations are removed using established signal processing techniques, and artifacts are detected through a singular event detection algorithm based on kernel methods. The detection parameters are calibrated at the beginning of each experimental session through an interactive procedure. Whenever an artifact is detected in an EEG-trial, the BCI notifies the subject by executing a special action.

Features that are relevant for the recognition of the controlling MAs are extracted from EEG-trials (free of artifacts) through the statistical analysis of their time, frequency, and phase properties. Since a complete analysis covering all these aspects would result in a very large number of features, various hypotheses on the nature of EEG are considered in order to reduce the number of needed features. Features are grouped into feature vectors that are used to build the recognition models using machine learning concepts. From a machine learning point of view, low-dimensional feature vectors are preferred as they reduce the risk of over-fitting. Recognition models are built based on statistical learning theory and kernel methods. The advantage of these methods resides in their high recognition accuracy and flexibility. A particular requirement of BCI systems is to continuously adapt to possible EEG changes resulting from external factors or subject adaptation to the BCI. This requirement is fulfilled by means of an online learning framework that makes the parameters of the recognition models easily updatable in a computationally efficient way.

After the completion of a series of training sessions, the feature extraction methods are chosen (according to an optimality criterion based on the recognition error), the initial recognition models are built for each controlling MA, and the action rules are set. In these sessions, the subject is asked to perform the controlling MAs in accordance with a training protocol which determines the training schedule. In later training sessions, the BCI provides feedback indicating to the subject how well the requested MA was recognized. Thus, the subject can modulate his brain activity so as to obtain positive feedback. Furthermore, at the end of each session the BCI updates its recognition models. Such updating is straightforward, as the recognition models can be dynamically updated, i.e. their parameters can be updated as new training data become available while progressively forgetting the contribution of old data. Because of the adaptation of the recognition models, the action rules must be adapted as well. This is achieved by considering, in the definition of the action rules, variables that change along with the recognition model parameters. The training schedule is decided based on the recognition error associated with each controlling MA, so those MAs with large recognition errors are trained more often.

The BCI developed in this thesis was validated by experiments on six subjects who participated in nine training sessions. The first three training sessions served to select the feature extraction methods, build the initial recognition models, and set the action rules. In the last six sessions, in addition to the training with feedback, positioning tests were carried out to measure the controlling skills acquired by the subjects during each session. The evaluation followed two criteria, namely the computation of the theoretical information transfer rate using estimates of the average recognition errors over the controlling MAs, and an experimental measure of the information transfer rate corresponding to the positioning tests. The latter has the advantage of corresponding to a real controlling situation and consequently reflects more closely the actual controlling skills of a subject. Both information transfer rates increased during the last six sessions and reached averages, over subjects, of 126 and 25 bits per minute, respectively.

Journal Article
TL;DR: The research shows that the proposed method offers better classification performance and generalization ability, and shorter training time, than methods based on artificial neural networks, especially for small samples.
Abstract: A fault diagnosis method based on SVM is proposed in this paper. The research shows that the proposed method offers better classification performance and generalization ability, and shorter training time, than methods based on artificial neural networks, especially for small samples.

Proceedings ArticleDOI
19 Sep 2004
TL;DR: A multilevel decision-making model for power transformer fault diagnosis based on statistical learning theory is presented; the dependability of this model is greatly enhanced, and its effectiveness and usefulness are demonstrated.
Abstract: After thoroughly analyzing the relationships between indications and faults, it has been found that there are no explicit mapping functions between the indications and the faults of oil-immersed power transformers. To handle this problem, a multilevel decision-making model for power transformer fault diagnosis based on statistical learning theory is presented. Based on the concentration distribution of some typical fault gases, the proposed approach determines the optimal solution with a few training samples. The output of this model is improved by refining, with a K-nearest-neighbor search classification, the SVM classification results that lie adjacent to the optimal separating hyperplane. The dependability of this model is thereby greatly enhanced, and its effectiveness and usefulness are demonstrated.
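
A hedged sketch of the refinement scheme on toy data: trust the SVM decision far from the separating hyperplane, but re-classify the ambiguous points inside a margin band with K-nearest neighbors. The band width tau and the synthetic data are our assumptions, not the paper's dissolved-gas setup:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

svm = SVC(kernel="rbf", gamma="scale").fit(X_tr, y_tr)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)

tau = 0.5                                  # margin-band width (assumed)
d = svm.decision_function(X_te)
# Inside the band |d| < tau, fall back to the K-nearest-neighbor decision.
pred = np.where(np.abs(d) < tau, knn.predict(X_te), svm.predict(X_te))
print(f"accuracy: {(pred == y_te).mean():.3f}, refined points: {(np.abs(d) < tau).sum()}")
```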

Journal Article
TL;DR: This paper introduces a new classifier design method based on a kernel extension of the classical Ho-Kashyap procedure that leads to robustness against outliers and a better approximation of the misclassification error.
Abstract: This paper introduces a new classifier design method based on a kernel extension of the classical Ho-Kashyap procedure. The proposed method uses an approximation of the absolute error rather than the squared error to design a classifier, which leads to robustness against outliers and a better approximation of the misclassification error. Additionally, easy control of the generalization ability is obtained using the structural risk minimization induction principle from statistical learning theory. Finally, examples are given to demonstrate the validity of the introduced method.
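
For context, the classical (squared-error) Ho-Kashyap procedure that the paper extends looks as follows; this sketch is the textbook linear version, whereas the paper kernelizes it and swaps the squared error for an absolute-error approximation:

```python
import numpy as np

def ho_kashyap(X, y, rho=0.1, n_iter=500, b0=0.1):
    """Classical (squared-error) Ho-Kashyap procedure.

    Returns the weight vector a for augmented patterns [x, 1]; the paper's
    variant replaces the squared error with an absolute-error approximation.
    """
    # Sign-normalize: multiply each augmented pattern by its label in {-1, +1}.
    Y = np.hstack([X, np.ones((len(X), 1))]) * y[:, None]
    Y_pinv = np.linalg.pinv(Y)
    b = np.full(len(X), b0)                # margin vector, kept positive
    a = Y_pinv @ b
    for _ in range(n_iter):
        e = Y @ a - b
        b = b + rho * (e + np.abs(e))      # only increase b where e > 0
        a = Y_pinv @ b                     # least-squares solution of Ya = b
    return a

X = np.array([[2.0, 2.0], [2.5, 3.0], [-1.0, -1.5], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
a = ho_kashyap(X, y)
print("predictions:", np.sign(np.hstack([X, np.ones((4, 1))]) @ a))
```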

Proceedings ArticleDOI
25 Jul 2004
TL;DR: This work proposes a cluster-based learning methodology to reduce training time and the memory size for SVM by using k-means based clustering technique, and applied this technique to text-independent speaker identification problems.
Abstract: Based on statistical learning theory, the support vector machine (SVM) is a powerful tool for various classification problems, such as pattern recognition and speaker identification. However, training an SVM consumes a large amount of memory and computing time. This work proposes a cluster-based learning methodology to reduce the training time and memory size for SVMs. Using a k-means based clustering technique, training data at the boundary of each cluster are selected for SVM learning. We also applied this technique to text-independent speaker identification problems. Without deteriorating recognition performance, the training data and training time can be reduced by up to 75% and 87.5%, respectively.
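
A hedged sketch of the cluster-based data reduction: run k-means, then keep only the points near each cluster's boundary, since interior points rarely become support vectors. The boundary criterion used here (farthest quarter from the own centroid) is our stand-in; the paper's exact rule and its speaker-identification features are not reproduced:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, random_state=0)
km = KMeans(n_clusters=20, n_init=10, random_state=0).fit(X)
dist = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)

keep = np.zeros(len(X), dtype=bool)
for c in range(20):
    idx = np.flatnonzero(km.labels_ == c)
    # Keep the 25% of each cluster that lies farthest from its centroid.
    keep[idx[np.argsort(dist[idx])[-max(1, len(idx) // 4):]]] = True

clf = SVC(kernel="rbf", gamma="scale").fit(X[keep], y[keep])
print(f"trained on {keep.sum()} of {len(X)} samples")
```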

Book ChapterDOI
Xizhao Wang, Qiang He
20 Sep 2004
TL;DR: This paper makes an attempt to enlarge the margin between the two support vector hyper-planes by feature weight adjustment and shows that the proposed techniques can enhance the generalization capability of the original SVM classifiers.
Abstract: It is well recognized that support vector machines (SVMs) produce better classification performance in terms of generalization power. An SVM constructs an optimal separating hyper-plane by maximizing the margin between two classes in a high-dimensional feature space. Based on statistical learning theory, the margin scale reflects the generalization capability to a great extent: the larger the margin, the better the generalization capability of the SVM. This paper makes an attempt to enlarge the margin between the two support vector hyper-planes by feature weight adjustment. The experiments demonstrate that the techniques proposed in this paper can enhance the generalization capability of the original SVM classifiers.
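
A small illustration of the relationship the abstract relies on: for a linear SVM the geometric margin is 1/‖w‖, so rescaling feature axes changes the attainable margin. The weighting rule below (weights taken from |w|) is a hypothetical stand-in for the paper's adjustment scheme, used only to show how the margin can be compared before and after reweighting:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=10, n_informative=3, random_state=0)

def margin(X, y):
    # Geometric margin of a trained linear SVM: 1 / ||w||.
    clf = SVC(kernel="linear", C=1.0).fit(X, y)
    return 1.0 / np.linalg.norm(clf.coef_)

# Hypothetical feature weights: emphasize the more discriminative axes.
w = np.abs(SVC(kernel="linear", C=1.0).fit(X, y).coef_.ravel())
weights = w / w.mean()
print(f"margin before: {margin(X, y):.4f}, after reweighting: {margin(X * weights, y):.4f}")
```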

Proceedings ArticleDOI
01 Jan 2004
TL;DR: An SVM (support vector machine), a powerful supervised learning algorithm belonging to statistical learning theory, is proposed; it receives the set of linear and nonlinear parameters extracted from the fetal heart rate (FHR) signal as input and gives an indication of fetal distress as output.
Abstract: The present work is concerned with the automatic identification of fetal sufferance in intrauterine growth retarded (IUGR) fetuses, based on a multiparametric analysis of cardiotocographic recordings feeding a neural classifier. As classification tool, we propose an SVM (support vector machine), which receives the set of linear and nonlinear parameters extracted from the fetal heart rate (FHR) signal as input and gives an indication of fetal distress as output. The SVM is a powerful supervised learning algorithm belonging to statistical learning theory; it minimizes the structural risk in various classification problems. Three SVMs are built with different kernels. Their training set includes 70 cases: 35 normal and 35 IUGR suffering fetuses. Classification results obtained with a 2nd-order polynomial kernel, on a test set of 30 unknown cases, show good values of accuracy, specificity and sensitivity. The SVM performance is very similar to that obtained with the multilayer perceptron and neuro-fuzzy classifiers proposed in previous works. The introduction of a hybrid unsupervised/supervised learning scheme integrating independent component analysis (ICA) with the SVM will be the natural development of this work, with a further improvement of the diagnostic ability of the system.

Journal ArticleDOI
TL;DR: A model derived from statistical learning theory and using the biological model of Thorpe et al. is introduced; this model is an interesting front-end for algorithms derived from the Vapnik theory, and its performance is evaluated on a restrained sign language recognition experiment.
Abstract: Regarding biological visual classification, recent series of experiments have highlighted the fact that data classification can be realized in the human visual cortex with latencies of about 100-150 ms, which, considering visual pathway latencies, is only compatible with a very specific processing architecture, described by models from Thorpe et al. Surprisingly enough, this experimental evidence is coherent with algorithms derived from statistical learning theory. More precisely, there is a double link: on one hand, the so-called Vapnik theory offers tools to evaluate and analyze the biological model's performance, and on the other hand, this model is an interesting front-end for algorithms derived from the Vapnik theory. The present contribution develops this idea, introducing a model derived from statistical learning theory and using the biological model of Thorpe et al. We evaluate its performance using a restrained sign language recognition experiment. This paper is intended to be read by biologists as well as statisticians; as a consequence, basic material from both fields has been reviewed.

Proceedings ArticleDOI
26 Aug 2004
TL;DR: A multi-class SVM-based semi-supervised approach that greatly improves classification efficiency while maintaining a practicable classification rate: the initial cluster centers are chosen manually, and the training samples are then labeled automatically with the fuzzy C-means clustering algorithm.
Abstract: The support vector machine (SVM), which is based on statistical learning theory (SLT), has shown much better performance than most other existing machine learning methods, which are based on traditional statistics. The original SVM was developed to solve the dichotomy classification problem, and various approaches have been presented to solve multi-class problems. Using a multi-class SVM classifier, we have obtained a high classification rate of 95.4% in remote sensing image classification. However, because the number of classes in remote sensing images is large, manually obtaining training samples is very time-consuming. Hence, we present a multi-class SVM-based semi-supervised approach: we first choose the initial cluster centers manually, then label the samples to be used for training automatically with the fuzzy C-means clustering algorithm. It is believed that this method greatly improves classification efficiency while maintaining a practicable classification rate.
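
A hedged sketch of the semi-supervised pipeline on synthetic blobs: seed cluster centers "manually", run a minimal fuzzy C-means, keep only the most confident pseudo-labels, and train a multi-class SVM on them. The confidence threshold, data, and center picks are illustrative, not the paper's remote-sensing setup:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

def fuzzy_cmeans(X, centers, m=2.0, n_iter=50):
    """Minimal fuzzy C-means starting from user-chosen initial centers,
    mirroring the paper's manually seeded clustering. Returns memberships U."""
    C = np.asarray(centers, dtype=float).copy()
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2) + 1e-12
        # u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1))
        U = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2 / (m - 1)), axis=2)
        C = (U.T ** m @ X) / (U.T ** m).sum(axis=1, keepdims=True)
    return U

X, _ = make_blobs(n_samples=900, centers=3, shuffle=False, random_state=0)
init = X[[0, 300, 600]]                     # stand-in for manual center picks
U = fuzzy_cmeans(X, init)
conf, labels = U.max(axis=1), U.argmax(axis=1)
train = conf > 0.9                          # keep only confident pseudo-labels
clf = SVC(kernel="rbf", gamma="scale").fit(X[train], labels[train])
print(f"pseudo-labeled training samples: {train.sum()} of {len(X)}")
```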

Journal ArticleDOI
TL;DR: This work introduces tools from machine learning to deal with the sparsity of data and applies one of these tools, the Support Vector Machine, to delineate geologic facies from hydraulic conductivity data.
Abstract: Insufficient site parameterization remains a major stumbling block for efficient and reliable prediction of flow and transport in a subsurface environment. The lack of sufficient parameter data is usually dealt with by treating relevant parameters as random fields, which enables one to employ various geostatistical and stochastic tools. The major conceptual difficulty with these techniques is that they rely on the ergodicity hypothesis to interchange spatial and ensemble statistics. Instead of treating deterministic material properties as random, we introduce tools from machine learning to deal with the sparsity of data. To demonstrate the relevance and advantages of this approach, we apply one of these tools, the Support Vector Machine, to delineate geologic facies from hydraulic conductivity data.

Proceedings ArticleDOI
05 Jan 2004
TL;DR: The work presented here examines the feasibility of applying SVMs to the aerodynamic modeling field through empirical comparisons between the SVMs and the commonly used neural network technique through two practical data modeling cases.
Abstract: Aerodynamic data modeling plays an important role in aerospace and industrial fluid engineering. Support vector machines (SVMs), a novel type of learning algorithm based on statistical learning theory, can be used for regression problems and have been reported to perform well, with promising results. The work presented here examines the feasibility of applying SVMs to the aerodynamic modeling field. Mainly, empirical comparisons between SVMs and the commonly used neural network technique are carried out through two practical data modeling cases – performance prediction of a new prototype mixer for engine combustors, and calibration of a five-hole pressure probe. A CFD-based diffuser optimization design is also involved in the article, in which an SVM is used to construct a response surface and thereby to let the optimization operate on an easily computable surrogate space. The obtained simulation results in all the application cases demonstrate that SVMs are potential options for the ch...
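
A hedged sketch of the response-surface idea: fit a support vector regressor to a handful of expensive evaluations, then optimize over the cheap surrogate. The one-dimensional objective below is a toy stand-in for a CFD-based diffuser design problem:

```python
import numpy as np
from sklearn.svm import SVR

def expensive_objective(x):            # pretend each call is a CFD run
    return (x - 0.3) ** 2 + 0.1 * np.sin(15 * x)

rng = np.random.default_rng(0)
X_doe = rng.uniform(0, 1, 30)[:, None]          # design-of-experiments samples
y_doe = expensive_objective(X_doe.ravel())

surrogate = SVR(kernel="rbf", C=100.0, epsilon=0.01).fit(X_doe, y_doe)

grid = np.linspace(0, 1, 1001)[:, None]         # cheap search on the surrogate
x_best = grid[np.argmin(surrogate.predict(grid))][0]
print(f"surrogate minimizer: {x_best:.3f}, "
      f"true value there: {expensive_objective(x_best):.4f}")
```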

Journal ArticleDOI
TL;DR: The results showed that the normalization methods could affect the prediction performance of support vector machines and could be useful for determining a proper normalization method to achieve the best performance in SVMs.
Abstract: Support vector machines (SVMs), based on statistical learning theory, are currently among the most popular and efficient approaches for pattern recognition problems because of their remarkable performance in terms of prediction accuracy. It is, however, necessary to choose a proper normalization method for the input vectors in order to improve system performance. Various normalization methods for SVMs have been studied in this research, and the results show that the normalization method can affect prediction performance. These results could be useful for determining a proper normalization method to achieve the best performance in SVMs.
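
A hedged sketch of the kind of comparison the paper reports: the same SVM under different input normalizations, scored by cross-validated accuracy. The dataset and the two scalers are illustrative; the paper's exact normalization methods may differ:

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
for name, scaler in [("none", None),
                     ("z-score", StandardScaler()),
                     ("min-max", MinMaxScaler())]:
    steps = ([scaler] if scaler is not None else []) + [SVC(kernel="rbf", gamma="scale")]
    acc = cross_val_score(make_pipeline(*steps), X, y, cv=5).mean()
    print(f"{name:>8}: {acc:.3f}")
```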