
Showing papers on "Statistical learning theory published in 2004"


Book ChapterDOI
TL;DR: This tutorial introduces the techniques that are used to obtain results in the form of so-called error bounds in statistical learning theory.
Abstract: The goal of statistical learning theory is to study, in a statistical framework, the properties of learning algorithms. In particular, most results take the form of so-called error bounds. This tutorial introduces the techniques that are used to obtain such results.
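
To make the flavor of such error bounds concrete, here is a classical finite-class bound (a standard textbook example, not taken from this tutorial): with probability at least 1 - δ over an i.i.d. sample of size n,

```latex
\forall f \in F:\quad
R(f) \;\le\; \widehat{R}_n(f) \;+\; \sqrt{\frac{\ln|F| + \ln(1/\delta)}{2n}},
```

where R(f) is the true risk and R̂_n(f) the empirical risk on the sample; it follows from Hoeffding's inequality and a union bound over the finite hypothesis class F.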

602 citations


Journal ArticleDOI
TL;DR: An experimental application to network intrusion detection shows that SmartSifter was able to identify data with high scores that corresponded to attacks, with low computational costs.
Abstract: Outlier detection is a fundamental issue in data mining, specifically in fraud detection, network intrusion detection, network monitoring, etc. SmartSifter is an outlier detection engine addressing this problem from the viewpoint of statistical learning theory. This paper provides a theoretical basis for SmartSifter and empirically demonstrates its effectiveness. SmartSifter detects outliers in an on-line process through the on-line unsupervised learning of a probabilistic model (using a finite mixture model) of the information source. Each time a datum is input, SmartSifter employs an on-line discounting learning algorithm to learn the probabilistic model. A score is given to the datum based on the learned model, with a high score indicating a high possibility of being a statistical outlier. The novel features of SmartSifter are: (1) it is adaptive to non-stationary sources of data; (2) a score has a clear statistical/information-theoretic meaning; (3) it is computationally inexpensive; and (4) it can handle both categorical and continuous variables. An experimental application to network intrusion detection shows that SmartSifter was able to identify data with high scores that corresponded to attacks, with low computational costs. Further experimental application has identified a number of meaningful rare cases in actual health insurance pathology data from Australia's Health Insurance Commission.
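
As a rough illustration of the on-line discounting idea, here is a minimal sketch for a single one-dimensional Gaussian; the class name, discounting rule, and score are simplifications of ours, whereas the actual SmartSifter engine learns a finite mixture model and also handles categorical variables:

```python
import numpy as np

class DiscountingGaussian:
    """Online Gaussian model with exponential forgetting (discounting).

    A minimal sketch of the discounting idea for a single 1-D Gaussian;
    the real SmartSifter engine uses finite mixture models.
    """

    def __init__(self, r=0.05):
        self.r = r          # discounting rate: larger -> forget faster
        self.mu = 0.0       # running mean
        self.var = 1.0      # running variance

    def score_and_update(self, x):
        # Score first: negative log-likelihood under the current model,
        # so poorly explained points (outliers) receive high scores.
        score = 0.5 * np.log(2 * np.pi * self.var) + (x - self.mu) ** 2 / (2 * self.var)
        # Discounted (exponentially forgetting) parameter update.
        self.mu = (1 - self.r) * self.mu + self.r * x
        self.var = (1 - self.r) * self.var + self.r * (x - self.mu) ** 2
        return score

model = DiscountingGaussian(r=0.05)
stream = np.concatenate([np.random.normal(0, 1, 500), [8.0]])  # one injected outlier
scores = [model.score_and_update(x) for x in stream]
print(f"final score (outlier): {scores[-1]:.2f}, median score: {np.median(scores):.2f}")
```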

592 citations


Journal ArticleDOI
TL;DR: This paper proves tight data-dependent bounds for the risk of this hypothesis in terms of an easily computable statistic M_n associated with the on-line performance of the ensemble, and obtains risk tail bounds for kernel perceptron algorithms in terms of the spectrum of the empirical kernel matrix.
Abstract: In this paper, it is shown how to extract a hypothesis with small risk from the ensemble of hypotheses generated by an arbitrary on-line learning algorithm run on an independent and identically distributed (i.i.d.) sample of data. Using a simple large deviation argument, we prove tight data-dependent bounds for the risk of this hypothesis in terms of an easily computable statistic M_n associated with the on-line performance of the ensemble. Via sharp pointwise bounds on M_n, we then obtain risk tail bounds for kernel perceptron algorithms in terms of the spectrum of the empirical kernel matrix. These bounds reveal that the linear hypotheses found via our approach achieve optimal tradeoffs between hinge loss and margin size over the class of all linear functions, an issue that was left open by previous results. A distinctive feature of our approach is that the key tools for our analysis come from the model of prediction of individual sequences; i.e., a model making no probabilistic assumptions on the source generating the data. In fact, these tools turn out to be so powerful that we only need very elementary statistical facts to obtain our final risk bounds.
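
The generic form of such online-to-batch bounds can be sketched as follows (a standard Azuma/Hoeffding-style statement for a loss bounded in [0, 1], not the paper's exact result): if the on-line algorithm produces hypotheses h_0, ..., h_{n-1} with cumulative loss M_n on the sample, then with probability at least 1 - δ,

```latex
\frac{1}{n}\sum_{t=1}^{n} R(h_{t-1}) \;\le\; \frac{M_n}{n} + \sqrt{\frac{2\ln(1/\delta)}{n}},
```

so a hypothesis with small risk can be extracted from the ensemble whenever the average on-line loss M_n/n is small.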

580 citations


Book
27 Aug 2004
TL;DR: This book presents probabilistic and randomized methods for the analysis and design of uncertain systems, covering statistical learning theory for control design, sequential algorithms for probabilistic robust design and for LPV systems, and a scenario approach for probabilistic robust design.
Abstract: Overview.- Elements of Probability Theory.- Uncertain Linear Systems and Robustness.- Linear Robust Control Design.- Some Limits of the Robustness Paradigm.- Probabilistic Methods for Robustness.- Monte Carlo Methods.- Randomized Algorithms in Systems and Control.- Probability Inequalities.- Statistical Learning Theory and Control Design.- Sequential Algorithms for Probabilistic Robust Design.- Sequential Algorithms for LPV Systems.- Scenario Approach for Probabilistic Robust Design.- Random Number and Variate Generation.- Statistical Theory of Radial Random Vectors.- Vector Randomization Methods.- Statistical Theory of Radial Random Matrices.- Matrix Randomization Methods.- Applications of Randomized Algorithms.- Appendix.

393 citations


BookDOI
01 Jan 2004
TL;DR: Statistical learning theory and stochastic optimization.
Abstract: Statistical learning theory and stochastic optimization.

326 citations


Journal ArticleDOI
TL;DR: Support vector machine (SVM), a new powerful machine learning method based on statistical learning theory (SLT), is introduced into soft sensor modeling and a model selection method within the Bayesian evidence framework is proposed to select an optimal model for a soft sensor based on SVM.

289 citations


Journal ArticleDOI
TL;DR: This task is performed from the point of view of the theory of statistical learning, which provides a unified framework for regression, classification, and probability density estimation, and it is shown that only one group of methods is useful for structural reliability, according to some specific criteria.

172 citations


Book ChapterDOI
01 Jan 2004
TL;DR: This chapter discusses the statistical theory underlying various parameter-estimation methods, and gives algorithms which depend on alternatives to maximum-likelihood estimation, and describes parameter estimation algorithms which are motivated by these generalization bounds.
Abstract: A fundamental problem in statistical parsing is the choice of criteria and algorithms used to estimate the parameters in a model. The predominant approach in computational linguistics has been to use a parametric model with some variant of maximum-likelihood estimation. The assumptions under which maximum-likelihood estimation is justified are arguably quite strong. This chapter discusses the statistical theory underlying various parameter-estimation methods, and gives algorithms which depend on alternatives to (smoothed) maximum-likelihood estimation. We first give an overview of results from statistical learning theory. We then show how important concepts from the classification literature - specifically, generalization results based on margins on training data - can be derived for parsing models. Finally, we describe parameter estimation algorithms which are motivated by these generalization bounds.

121 citations


01 Jan 2004
TL;DR: This is meant to be a self-contained presentation of adaptive classification seen from the PAC-Bayesian point of view; the main improvements brought here are more localized bounds and the use of exchangeable prior distributions.
Abstract: This is meant to be a self-contained presentation of adaptive classification seen from the PAC-Bayesian point of view. Although most of the results are original, some review material about the VC dimension and support vector machines is also included. This study falls in the field of statistical learning theory, where complex data have to be analyzed from a limited amount of information drawn from a finite sample. It relies on non-asymptotic deviation inequalities, where the complexity of models is captured through the use of prior measures. The main improvements brought here are more localized bounds and the use of exchangeable prior distributions. Interesting consequences are drawn for the generalization properties of support vector machines and the design of new classification algorithms. 2000 Mathematics Subject Classification: 62H30, 68T05, 62B10.
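
For orientation, the non-localized PAC-Bayesian bound that such work refines reads roughly as follows (a classical McAllester/Maurer-style statement, not the localized bounds of this study): for any prior π fixed before seeing the data, with probability at least 1 - δ, simultaneously for all posteriors ρ,

```latex
\mathbb{E}_{f\sim\rho}\,R(f) \;\le\; \mathbb{E}_{f\sim\rho}\,\widehat{R}_n(f)
+ \sqrt{\frac{\mathrm{KL}(\rho\,\|\,\pi) + \ln(2\sqrt{n}/\delta)}{2n}},
```

where the Kullback-Leibler divergence KL(ρ‖π) plays the role of the model complexity captured by the prior measure.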

80 citations


Book
01 Jan 2004
TL;DR: This volume collects introductory tutorials on machine learning, ranging from pattern classification, Bayesian inference, and Gaussian processes to unsupervised learning, Monte Carlo methods, stochastic learning, statistical learning theory, and concentration inequalities.
Abstract: An Introduction to Pattern Classification.- Some Notes on Applied Mathematics for Machine Learning.- Bayesian Inference: An Introduction to Principles and Practice in Machine Learning.- Gaussian Processes in Machine Learning.- Unsupervised Learning.- Monte Carlo Methods for Absolute Beginners.- Stochastic Learning.- Introduction to Statistical Learning Theory.- Concentration Inequalities.

69 citations


Journal ArticleDOI
TL;DR: Inspired by several generalization bounds, "compression coefficients" for SVMs are constructed which measure the amount by which the training labels can be compressed by a code built from the separating hyperplane and can fairly accurately predict the parameters for which the test error is minimized.
Abstract: In this paper we investigate connections between statistical learning theory and data compression on the basis of support vector machine (SVM) model selection. Inspired by several generalization bounds, we construct "compression coefficients" for SVMs which measure the amount by which the training labels can be compressed by a code built from the separating hyperplane. The main idea is to relate the coding precision to geometrical concepts such as the width of the margin or the shape of the data in the feature space. The compression coefficients so derived combine well-known quantities such as the radius-margin term R^2/ρ^2, the eigenvalues of the kernel matrix, and the number of support vectors. To test whether they are useful in practice, we ran model selection experiments on benchmark data sets. As a result we found that compression coefficients can fairly accurately predict the parameters for which the test error is minimized.
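
A minimal sketch of the radius-margin quantity R^2/ρ^2 that these compression coefficients build on, for the linear-kernel case; the helper name and the crude radius estimate are ours (the exact R is the radius of the smallest enclosing ball in feature space), and the paper combines this term with kernel-matrix eigenvalues and support-vector counts:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Hypothetical helper illustrating the radius-margin quantity R^2 / rho^2
# for a linear SVM at different regularization strengths.
def radius_margin(X, y, C=1.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    w = clf.coef_.ravel()
    rho = 1.0 / np.linalg.norm(w)                            # geometric margin
    R = np.max(np.linalg.norm(X - X.mean(axis=0), axis=1))   # crude data radius
    return (R / rho) ** 2

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
for C in (0.01, 1.0, 100.0):
    print(f"C={C:>6}: R^2/rho^2 = {radius_margin(X, y, C):.1f}")
```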

Journal ArticleDOI
TL;DR: Methods to establish the mapping from the low-dimensional embedded space to the high-dimensional space for LLE are proposed, and their efficiency is validated on the reconstruction of multi-pose face images.

Proceedings ArticleDOI
31 Oct 2004
TL;DR: The hope is that this approach will enable "new science" in the design of self-managing systems by allowing the rapid and widespread application of statistical learning theory (SLT) techniques to problems of system dependability.
Abstract: Complex distributed Internet services form the basis not only of e-commerce but increasingly of mission-critical network-based applications. What is new is that the workload and internal architecture of three-tier enterprise applications present the opportunity for a new approach to keeping them running in the face of many common recoverable failures. The core of the approach is anomaly detection and localization based on statistical machine learning techniques. Unlike previous approaches, we propose anomaly detection and pattern mining not only for operational statistics such as mean response time, but also for structural behaviors of the system: what parts of the system, in what combinations, are being exercised in response to different kinds of external stimuli. In addition, rather than building baseline models a priori, we extract them by observing the behavior of the system over a short period of time during normal operation. We explain the necessary underlying assumptions and why they can be realized by systems research, report on some early successes using the approach, describe benefits of the approach that make it competitive as a path toward self-managing systems, and outline some research challenges. Our hope is that this approach will enable "new science" in the design of self-managing systems by allowing the rapid and widespread application of statistical learning theory (SLT) techniques to problems of system dependability.

Journal ArticleDOI
01 Feb 2004
TL;DR: A new Empirical Risk Functional as cost function for training neuro-fuzzy classifiers that provides a differentiable approximation of the misclassification rate, so that the Empirical Risk Minimization Principle formulated in Vapnik's Statistical Learning Theory can be applied.
Abstract: The paper proposes a new Empirical Risk Functional as cost function for training neuro-fuzzy classifiers. This cost function, called Approximate Differentiable Empirical Risk Functional (ADERF), provides a differentiable approximation of the misclassification rate so that the Empirical Risk Minimization Principle formulated in Vapnik's Statistical Learning Theory can be applied. Also, based on the proposed ADERF, a learning algorithm is formulated. Experimental results on a number of benchmark classification tasks are provided, and a comparison to alternative approaches is given.
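
The exact ADERF construction is given in the paper; the generic shape of such a differentiable surrogate for the misclassification rate is, for a classifier f_θ with labels y_i ∈ {-1, +1} and a steepness parameter β (our notation, not the paper's),

```latex
\widehat{R}_{\mathrm{approx}}(\theta) \;=\; \frac{1}{n}\sum_{i=1}^{n}
\sigma\!\bigl(-\beta\, y_i f_\theta(x_i)\bigr),
\qquad \sigma(u) = \frac{1}{1+e^{-u}},
```

which converges to the empirical misclassification rate as β → ∞ while remaining differentiable, so gradient-based training of the neuro-fuzzy classifier can minimize it directly.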

Proceedings ArticleDOI
01 Jan 2004
TL;DR: The support vector machine, a classifier motivated by statistical learning theory, is used in the pattern recognition stage of automatic pulmonary nodule detection and gives the unique optimal solution.
Abstract: Developing a Computer-Assisted Detection (CAD) system for automatic detection of pulmonary nodules in thoracic CT is a highly challenging research area in the medical domain. It requires the application of state-of-the-art image processing and pattern recognition technologies. The object recognition and feature extraction phase of such a system generates a large data set. As this data set normally contains a large quantity of non-nodule objects while the nodule objects are sparse, a Gaussian mixture model-based sampling method is used to reduce the non-nodule data and thus the classification complexity. The support vector machine, a classifier motivated by statistical learning theory, is used in the pattern recognition stage of automatic pulmonary nodule detection. After the training process, only support vectors will be used in the classification process. As the support vector machine classifier gives the unique optimal solution, the experiment on the lung nodule data shows a fast and satisfactory classification rate.
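
A hedged sketch of the two-stage idea on synthetic data: thin out the abundant majority class with a Gaussian mixture model, then train an SVM. The "keep the worst-explained negatives" rule below is our assumption standing in for the paper's sampling method:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

# Class 0 plays the role of the abundant non-nodule class.
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_neg, X_pos = X[y == 0], X[y == 1]

gmm = GaussianMixture(n_components=8, random_state=0).fit(X_neg)
# Keep the negatives the mixture explains *worst* (lowest log-likelihood):
# they tend to lie near the class boundary and carry most of the information.
keep = np.argsort(gmm.score_samples(X_neg))[: 3 * len(X_pos)]
X_train = np.vstack([X_neg[keep], X_pos])
y_train = np.concatenate([np.zeros(len(keep)), np.ones(len(X_pos))])

clf = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)
print(f"reduced training set: {len(X_train)} of {len(X)} samples")
```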


Journal ArticleDOI
TL;DR: The experimental results show that for motion estimation applications, SLT-based model selection compares favorably against alternative model selection methods, such as Akaike's final prediction error, Schwarz's criterion, generalized cross-validation, and Shibata's model selector.
Abstract: This paper describes a novel application of statistical learning theory (SLT) to single motion estimation and tracking. The problem of motion estimation can be related to statistical model selection, where the goal is to select one (correct) motion model from several possible motion models, given finite noisy samples. SLT, also known as Vapnik-Chervonenkis (VC) theory, provides analytic generalization bounds for model selection, which have been used successfully for practical model selection. This paper describes a successful application of an SLT-based model selection approach to the challenging problem of estimating optimal motion models from small data sets of image measurements (flow). We present results of experiments on both synthetic and real image sequences for motion interpolation and extrapolation; these results demonstrate the feasibility and strength of our approach. Our experimental results show that for motion estimation applications, SLT-based model selection compares favorably against alternative model selection methods, such as Akaike's final prediction error (fpe), Schwarz's criterion (sc), generalized cross-validation (gcv), and Shibata's model selector (sms). The paper also shows how to address the aperture problem using SLT-based model selection for a penalized linear (ridge regression) formulation.
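
For reference, the kind of analytic VC generalization bound that underlies SLT-based model selection has the form (the classical Vapnik bound for 0-1 loss; the paper uses a practical penalization variant of it): with probability at least 1 - δ, for a model class of VC dimension h,

```latex
R(f) \;\le\; \widehat{R}_n(f)
+ \sqrt{\frac{h\left(\ln(2n/h) + 1\right) + \ln(4/\delta)}{n}},
```

and model selection picks the motion model minimizing this penalized empirical risk rather than the empirical risk alone.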

Journal ArticleDOI
TL;DR: The purpose of this paper is to present a learning algorithm to classify data with nonlinear characteristics by applying the SVM method to AVO classification of gas sand and wet sand.
Abstract: The purpose of this paper is to present a learning algorithm to classify data with nonlinear characteristics. The support vector machine (SVM) is a novel type of learning machine based on statistical learning theory [Vapnik, 1998]. The SVM implements the following idea: it maps the input vector X into a high-dimensional feature space Z through some nonlinear mapping, chosen a priori. In this space, an optimal separating hyperplane is constructed to separate data groupings. The SVM learning method can be used to classify seismic data patterns for exploration and reservoir characterization applications, and it is particularly good at classifying data with nonlinear characteristics. As an example, the SVM method is applied to AVO classification of gas sand and wet sand.
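
In symbols, the idea the abstract describes is the standard soft-margin SVM: with a feature map Φ and kernel K(x, x') = ⟨Φ(x), Φ(x')⟩, the optimal separating hyperplane solves

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i
\quad\text{s.t.}\quad
y_i\bigl(\langle w, \Phi(x_i)\rangle + b\bigr) \ge 1 - \xi_i,\qquad \xi_i \ge 0,
```

and the resulting classifier is f(x) = sign(Σ_i α_i y_i K(x_i, x) + b), with nonzero coefficients α_i only for the support vectors.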

DOI
01 Jan 2004
TL;DR: This thesis considers a 2D object positioning application in a computer-rendered environment (CRE) that is operated with four mental activities (controlling MAs); BCI operation is asynchronous, namely the system is always active and reacts only when it recognizes any of the controlling MAs.
Abstract: Scalp-recorded electroencephalogram (EEG) signals reflect the combined synaptic and axonal activity of groups of neurons. In addition to their clinical applications, EEG signals can be used as support for direct brain-computer communication devices (brain-computer interfaces, BCIs). Indeed, during the performance of mental activities, EEG patterns that characterize them emerge. If actions executed by the BCI are associated with classes of patterns resulting from mental activities that do not involve any physical effort, communication by means of thoughts is achieved. The subject operates the BCI by performing mental activities which are recognized by the BCI through comparison with recognition models that are set up during a training phase.

In this thesis we consider a 2D object positioning application in a computer-rendered environment (CRE) that is operated with four mental activities (controlling MAs). BCI operation is asynchronous, namely the system is always active and reacts only when it recognizes any of the controlling MAs. The BCI analyzes segments of EEG (EEG-trials) and executes actions on the CRE in accordance with a set of rules (action rules) adapted to the subject's controlling skills.

EEG signals have small amplitudes and are therefore sensitive to external electromagnetic perturbations. In addition, subject-generated artifacts (ocular and muscular) can hinder BCI operation and even lead to misleading conclusions regarding the real controlling skills of a subject. Thus, it is especially important to remove external perturbations and detect subject-generated artifacts. External perturbations are removed using established signal processing techniques, and artifacts are detected through a singular event detection algorithm based on kernel methods. The detection parameters are calibrated at the beginning of each experimental session through an interactive procedure. Whenever an artifact is detected in an EEG-trial, the BCI notifies the subject by executing a special action.

Features that are relevant for the recognition of the controlling MAs are extracted from EEG-trials (free of artifacts) through the statistical analysis of their time, frequency, and phase properties. Since a complete analysis covering all these aspects would result in a very large number of features, various hypotheses on the nature of EEG are considered in order to reduce the number of needed features. Features are grouped into feature vectors that are used to build the recognition models using machine learning concepts. From a machine learning point of view, low-dimensional feature vectors are preferred as they reduce the risk of over-fitting. Recognition models are built based on statistical learning theory and kernel methods. The advantage of these methods resides in their high recognition accuracy and flexibility. A particular requirement of BCI systems is to continuously adapt to possible EEG changes resulting from external factors or subject adaptation to the BCI. This requirement is fulfilled by means of an online learning framework that makes the parameters of the recognition models easily updatable in a computationally efficient way.

After the completion of a series of training sessions, the feature extraction methods are chosen (according to an optimality criterion based on the recognition error), the initial recognition models are built for each controlling MA, and the action rules are set. In these sessions, the subject is asked to perform the controlling MAs in accordance with a training protocol which determines the training schedule. In later training sessions, the BCI provides feedback indicating to the subject how well the requested MA was recognized. Thus, the subject can modulate his brain activity so as to obtain positive feedback. Furthermore, at the end of each session the BCI updates its recognition models. Such updating is straightforward, as the recognition models can be dynamically updated, i.e. their parameters can be updated as new training data become available while progressively forgetting the contribution of old data. Because of the adaptation of the recognition models, the action rules must be adapted as well. This is achieved by considering, in the definition of the action rules, variables that change along with the recognition model parameters. The training schedule is decided based on the recognition error associated with each controlling MA, so those MAs with large recognition errors are trained more often.

The BCI developed in this thesis was validated by experiments on six subjects who participated in nine training sessions. The first three training sessions served to select the feature extraction methods, build the initial recognition models, and set the action rules. In the last six sessions, in addition to the training with feedback, positioning tests were carried out to measure the controlling skills acquired by the subjects during each session. The evaluation followed two criteria, namely the computation of the theoretical information transfer rate using estimates of the average recognition errors over the controlling MAs, and an experimental measure of the information transfer rate corresponding to the positioning tests. The latter has the advantage of corresponding to a real controlling situation and consequently reflects more closely the actual controlling skills of a subject. Both information transfer rates increased during the last six sessions and reached averages, over subjects, of 126 and 25 bits per minute, respectively.

Journal Article
TL;DR: The research shows that the proposed method offers better classification performance and generalization ability, and shorter training time, than methods based on artificial neural networks, especially for small samples.
Abstract: A fault diagnosis method based on SVM is proposed in this paper. The research shows that the proposed method offers better classification performance and generalization ability, and shorter training time, than methods based on artificial neural networks, especially for small samples.

Proceedings ArticleDOI
19 Sep 2004
TL;DR: A multilevel decision-making model for power transformer fault diagnosis based on statistical learning theory is presented; the dependability of this model is greatly enhanced, and its effectiveness and usefulness are demonstrated.
Abstract: After thoroughly analyzing the relationships between indications and faults, it has been found that there are no explicit mapping functions between the indications and the faults of oil-immersed power transformers. To handle this problem, a multilevel decision-making model for power transformer fault diagnosis based on statistical learning theory is presented. Based on the concentration distribution of some typical fault gases, the proposed approach determines the optimal solution with a few training samples. The output of this model is improved by refining, with a K-nearest-neighbor search classification, the SVM classification results that lie adjacent to the optimal separating hyperplane. The dependability of this model is thereby greatly enhanced, and its effectiveness and usefulness are demonstrated.
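
A hedged sketch of the refinement scheme on toy data: trust the SVM decision far from the separating hyperplane, but re-classify the ambiguous points inside a margin band with K-nearest neighbors. The band width tau and the synthetic data are our assumptions, not the paper's dissolved-gas setup:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

svm = SVC(kernel="rbf", gamma="scale").fit(X_tr, y_tr)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)

tau = 0.5                                  # margin-band width (assumed)
d = svm.decision_function(X_te)
# Inside the band |d| < tau, fall back to the K-nearest-neighbor decision.
pred = np.where(np.abs(d) < tau, knn.predict(X_te), svm.predict(X_te))
print(f"accuracy: {(pred == y_te).mean():.3f}, refined points: {(np.abs(d) < tau).sum()}")
```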

Journal Article
TL;DR: This paper introduces a new classifier design method based on a kernel extension of the classical Ho-Kashyap procedure that leads to robustness against outliers and a better approximation of the misclassification error.
Abstract: This paper introduces a new classifier design method based on a kernel extension of the classical Ho-Kashyap procedure. The proposed method uses an approximation of the absolute error rather than the squared error to design a classifier, which leads to robustness against outliers and a better approximation of the misclassification error. Additionally, easy control of the generalization ability is obtained using the structural risk minimization induction principle from statistical learning theory. Finally, examples are given to demonstrate the validity of the introduced method.
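
For context, the classical (squared-error) Ho-Kashyap procedure that the paper extends looks as follows; this sketch is the textbook linear version, whereas the paper kernelizes it and swaps the squared error for an absolute-error approximation:

```python
import numpy as np

def ho_kashyap(X, y, rho=0.1, n_iter=500, b0=0.1):
    """Classical (squared-error) Ho-Kashyap procedure.

    Returns the weight vector a for augmented patterns [x, 1]; the paper's
    variant replaces the squared error with an absolute-error approximation.
    """
    # Sign-normalize: multiply each augmented pattern by its label in {-1, +1}.
    Y = np.hstack([X, np.ones((len(X), 1))]) * y[:, None]
    Y_pinv = np.linalg.pinv(Y)
    b = np.full(len(X), b0)                # margin vector, kept positive
    a = Y_pinv @ b
    for _ in range(n_iter):
        e = Y @ a - b
        b = b + rho * (e + np.abs(e))      # only increase b where e > 0
        a = Y_pinv @ b                     # least-squares solution of Ya = b
    return a

X = np.array([[2.0, 2.0], [2.5, 3.0], [-1.0, -1.5], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
a = ho_kashyap(X, y)
print("predictions:", np.sign(np.hstack([X, np.ones((4, 1))]) @ a))
```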

Proceedings ArticleDOI
25 Jul 2004
TL;DR: This work proposes a cluster-based learning methodology to reduce training time and the memory size for SVM by using k-means based clustering technique, and applied this technique to text-independent speaker identification problems.
Abstract: Based on statistical learning theory, the support vector machine (SVM) is a powerful tool for various classification problems, such as pattern recognition and speaker identification. However, training an SVM consumes a large amount of memory and computing time. This work proposes a cluster-based learning methodology to reduce the training time and memory size for SVMs. Using a k-means based clustering technique, training data at the boundary of each cluster are selected for SVM learning. We also applied this technique to text-independent speaker identification problems. Without deteriorating recognition performance, the training data and training time can be reduced by up to 75% and 87.5%, respectively.
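
A hedged sketch of the cluster-based data reduction: run k-means, then keep only the points near each cluster's boundary, since interior points rarely become support vectors. The boundary criterion used here (farthest quarter from the own centroid) is our stand-in; the paper's exact rule and its speaker-identification features are not reproduced:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, random_state=0)
km = KMeans(n_clusters=20, n_init=10, random_state=0).fit(X)
dist = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)

keep = np.zeros(len(X), dtype=bool)
for c in range(20):
    idx = np.flatnonzero(km.labels_ == c)
    # Keep the 25% of each cluster that lies farthest from its centroid.
    keep[idx[np.argsort(dist[idx])[-max(1, len(idx) // 4):]]] = True

clf = SVC(kernel="rbf", gamma="scale").fit(X[keep], y[keep])
print(f"trained on {keep.sum()} of {len(X)} samples")
```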

Book ChapterDOI
Xizhao Wang, Qiang He
20 Sep 2004
TL;DR: This paper makes an attempt to enlarge the margin between the two support vector hyper-planes by feature weight adjustment and shows that the proposed techniques can enhance the generalization capability of the original SVM classifiers.
Abstract: It is well recognized that support vector machines (SVMs) produce better classification performance in terms of generalization power. An SVM constructs an optimal separating hyper-plane by maximizing the margin between two classes in a high-dimensional feature space. Based on statistical learning theory, the margin scale reflects the generalization capability to a great extent: the larger the margin, the better the generalization capability of the SVM. This paper makes an attempt to enlarge the margin between the two support vector hyper-planes by feature weight adjustment. The experiments demonstrate that the techniques proposed in this paper can enhance the generalization capability of the original SVM classifiers.
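
A small illustration of the relationship the abstract relies on: for a linear SVM the geometric margin is 1/‖w‖, so rescaling feature axes changes the attainable margin. The weighting rule below (weights taken from |w|) is a hypothetical stand-in for the paper's adjustment scheme, used only to show how the margin can be compared before and after reweighting:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=10, n_informative=3, random_state=0)

def margin(X, y):
    # Geometric margin of a trained linear SVM: 1 / ||w||.
    clf = SVC(kernel="linear", C=1.0).fit(X, y)
    return 1.0 / np.linalg.norm(clf.coef_)

# Hypothetical feature weights: emphasize the more discriminative axes.
w = np.abs(SVC(kernel="linear", C=1.0).fit(X, y).coef_.ravel())
weights = w / w.mean()
print(f"margin before: {margin(X, y):.4f}, after reweighting: {margin(X * weights, y):.4f}")
```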

Proceedings ArticleDOI
01 Jan 2004
TL;DR: An SVM (support vector machine), a powerful supervised learning algorithm belonging to statistical learning theory, is proposed; it receives the set of linear and nonlinear parameters extracted from the fetal heart rate (FHR) signal as input and gives an indication of fetal distress as output.
Abstract: The present work is concerned with the automatic identification of fetal sufferance in intrauterine growth retarded (IUGR) fetuses, based on a multiparametric analysis of cardiotocographic recordings feeding a neural classifier. As classification tool, we propose an SVM (support vector machine), which receives the set of linear and nonlinear parameters extracted from the fetal heart rate (FHR) signal as input and gives an indication of fetal distress as output. The SVM is a powerful supervised learning algorithm belonging to statistical learning theory; it minimizes the structural risk in various classification problems. Three SVMs are built with different kernels. Their training set includes 70 cases: 35 normal and 35 IUGR suffering fetuses. Classification results obtained with a 2nd-order polynomial kernel, on a test set of 30 unknown cases, show good values of accuracy, specificity and sensitivity. The SVM performance is very similar to that obtained with the multilayer perceptron and neuro-fuzzy classifiers proposed in previous works. The introduction of a hybrid unsupervised/supervised learning scheme integrating independent component analysis (ICA) with the SVM will be the natural development of this work, with a further improvement of the diagnostic ability of the system.

Journal ArticleDOI
TL;DR: A model derived from statistical learning theory and using the biological model of Thorpe et al. is introduced; this model is an interesting front-end for algorithms derived from the Vapnik theory, and its performance is evaluated on a restrained sign language recognition experiment.
Abstract: Regarding biological visual classification, recent series of experiments have highlighted the fact that data classification can be realized in the human visual cortex with latencies of about 100-150 ms, which, considering visual pathway latencies, is only compatible with a very specific processing architecture, described by models from Thorpe et al. Surprisingly enough, this experimental evidence is coherent with algorithms derived from statistical learning theory. More precisely, there is a double link: on one hand, the so-called Vapnik theory offers tools to evaluate and analyze the biological model's performance, and on the other hand, this model is an interesting front-end for algorithms derived from the Vapnik theory. The present contribution develops this idea, introducing a model derived from statistical learning theory and using the biological model of Thorpe et al. We evaluate its performance using a restrained sign language recognition experiment. This paper is intended to be read by biologists as well as statisticians; as a consequence, basic material from both fields has been reviewed.

Proceedings ArticleDOI
26 Aug 2004
TL;DR: A multi-class SVM-based semi-supervised approach that greatly improves classification efficiency while maintaining a practicable classification rate: the initial cluster centers are chosen manually, and the training samples are then labeled automatically with the fuzzy C-means clustering algorithm.
Abstract: The support vector machine (SVM), which is based on statistical learning theory (SLT), has shown much better performance than most other existing machine learning methods, which are based on traditional statistics. The original SVM was developed to solve the dichotomy classification problem, and various approaches have been presented to solve multi-class problems. Using a multi-class SVM classifier, we have obtained a high classification rate of 95.4% in remote sensing image classification. However, because the number of classes in remote sensing images is large, manually obtaining training samples is very time-consuming. Hence, we present a multi-class SVM-based semi-supervised approach: we first choose the initial cluster centers manually, then label the samples to be used for training automatically with the fuzzy C-means clustering algorithm. It is believed that this method greatly improves classification efficiency while maintaining a practicable classification rate.
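
A hedged sketch of the semi-supervised pipeline on synthetic blobs: seed cluster centers "manually", run a minimal fuzzy C-means, keep only the most confident pseudo-labels, and train a multi-class SVM on them. The confidence threshold, data, and center picks are illustrative, not the paper's remote-sensing setup:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

def fuzzy_cmeans(X, centers, m=2.0, n_iter=50):
    """Minimal fuzzy C-means starting from user-chosen initial centers,
    mirroring the paper's manually seeded clustering. Returns memberships U."""
    C = np.asarray(centers, dtype=float).copy()
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2) + 1e-12
        # u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1))
        U = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2 / (m - 1)), axis=2)
        C = (U.T ** m @ X) / (U.T ** m).sum(axis=1, keepdims=True)
    return U

X, _ = make_blobs(n_samples=900, centers=3, shuffle=False, random_state=0)
init = X[[0, 300, 600]]                     # stand-in for manual center picks
U = fuzzy_cmeans(X, init)
conf, labels = U.max(axis=1), U.argmax(axis=1)
train = conf > 0.9                          # keep only confident pseudo-labels
clf = SVC(kernel="rbf", gamma="scale").fit(X[train], labels[train])
print(f"pseudo-labeled training samples: {train.sum()} of {len(X)}")
```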

Journal ArticleDOI
TL;DR: This work introduces tools from machine learning to deal with the sparsity of data and applies one of these tools, the Support Vector Machine, to delineate geologic facies from hydraulic conductivity data.
Abstract: Insufficient site parameterization remains a major stumbling block for efficient and reliable prediction of flow and transport in a subsurface environment. The lack of sufficient parameter data is usually dealt with by treating relevant parameters as random fields, which enables one to employ various geostatistical and stochastic tools. The major conceptual difficulty with these techniques is that they rely on the ergodicity hypothesis to interchange spatial and ensemble statistics. Instead of treating deterministic material properties as random, we introduce tools from machine learning to deal with the sparsity of data. To demonstrate the relevance and advantages of this approach, we apply one of these tools, the Support Vector Machine, to delineate geologic facies from hydraulic conductivity data.

Proceedings ArticleDOI
05 Jan 2004
TL;DR: The work presented here examines the feasibility of applying SVMs to the aerodynamic modeling field through empirical comparisons between the SVMs and the commonly used neural network technique through two practical data modeling cases.
Abstract: Aerodynamic data modeling plays an important role in aerospace and industrial fluid engineering. Support vector machines (SVMs), a novel type of learning algorithm based on statistical learning theory, can be used for regression problems and have been reported to perform well, with promising results. The work presented here examines the feasibility of applying SVMs to the aerodynamic modeling field. Mainly, empirical comparisons between SVMs and the commonly used neural network technique are carried out through two practical data modeling cases – performance prediction of a new prototype mixer for engine combustors, and calibration of a five-hole pressure probe. A CFD-based diffuser optimization design is also involved in the article, in which an SVM is used to construct a response surface and thereby to let the optimization operate on an easily computable surrogate space. The obtained simulation results in all the application cases demonstrate that SVMs are potential options for the ch...
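
A hedged sketch of the response-surface idea: fit a support vector regressor to a handful of expensive evaluations, then optimize over the cheap surrogate. The one-dimensional objective below is a toy stand-in for a CFD-based diffuser design problem:

```python
import numpy as np
from sklearn.svm import SVR

def expensive_objective(x):            # pretend each call is a CFD run
    return (x - 0.3) ** 2 + 0.1 * np.sin(15 * x)

rng = np.random.default_rng(0)
X_doe = rng.uniform(0, 1, 30)[:, None]          # design-of-experiments samples
y_doe = expensive_objective(X_doe.ravel())

surrogate = SVR(kernel="rbf", C=100.0, epsilon=0.01).fit(X_doe, y_doe)

grid = np.linspace(0, 1, 1001)[:, None]         # cheap search on the surrogate
x_best = grid[np.argmin(surrogate.predict(grid))][0]
print(f"surrogate minimizer: {x_best:.3f}, "
      f"true value there: {expensive_objective(x_best):.4f}")
```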

Journal ArticleDOI
TL;DR: The results showed that the normalization methods could affect the prediction performance of support vector machines and could be useful for determining a proper normalization method to achieve the best performance in SVMs.
Abstract: Support vector machines (SVMs), based on statistical learning theory, are currently among the most popular and efficient approaches for pattern recognition problems because of their remarkable performance in terms of prediction accuracy. It is, however, necessary to choose a proper normalization method for the input vectors in order to improve system performance. Various normalization methods for SVMs have been studied in this research, and the results show that the normalization method can affect prediction performance. These results could be useful for determining a proper normalization method to achieve the best performance in SVMs.
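
A hedged sketch of the kind of comparison the paper reports: the same SVM under different input normalizations, scored by cross-validated accuracy. The dataset and the two scalers are illustrative; the paper's exact normalization methods may differ:

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
for name, scaler in [("none", None),
                     ("z-score", StandardScaler()),
                     ("min-max", MinMaxScaler())]:
    steps = ([scaler] if scaler is not None else []) + [SVC(kernel="rbf", gamma="scale")]
    acc = cross_val_score(make_pipeline(*steps), X, y, cv=5).mean()
    print(f"{name:>8}: {acc:.3f}")
```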