
Showing papers on "Statistical learning theory published in 2000"


Book
01 Jan 2000
TL;DR: This is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory, and will guide practitioners to updated literature, new applications, and on-line software.
Abstract: From the publisher: This is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory. SVMs deliver state-of-the-art performance in real-world applications such as text categorisation, hand-written character recognition, image classification, biosequence analysis, etc., and are now established as one of the standard tools for machine learning and data mining. Students will find the book both stimulating and accessible, while practitioners will be guided smoothly through the material required for a good grasp of the theory and its applications. The concepts are introduced gradually in accessible and self-contained stages, while the presentation is rigorous and thorough. Pointers to relevant literature and web sites containing software ensure that it forms an ideal starting point for further study. Equally, the book and its associated web site will guide practitioners to updated literature, new applications, and on-line software.

13,736 citations


Journal ArticleDOI
TL;DR: The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.
Abstract: The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, neural network techniques and methods imported from statistical learning theory have been receiving increasing attention. The design of a recognition system requires careful attention to the following issues: definition of pattern classes, sensing environment, pattern representation, feature extraction and selection, cluster analysis, classifier design and learning, selection of training and test samples, and performance evaluation. In spite of almost 50 years of research and development in this field, the general problem of recognizing complex patterns with arbitrary orientation, location, and scale remains unsolved. New and emerging applications, such as data mining, web searching, retrieval of multimedia data, face recognition, and cursive handwriting recognition, require robust and efficient pattern recognition techniques. The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.

6,527 citations


Book
01 Mar 2000
TL;DR: This book is the first comprehensive introduction to Support Vector Machines, a new generation learning system based on recent advances in statistical learning theory, and introduces Bayesian analysis of learning and relates SVMs to Gaussian Processes and other kernel based learning methods.
Abstract: This book is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory. The book also introduces Bayesian analysis of learning and relates SVMs to Gaussian Processes and other kernel based learning methods. SVMs deliver state-of-the-art performance in real-world applications such as text categorisation, hand-written character recognition, image classification, biosequence analysis, etc. Their first introduction in the early 1990s led to a recent explosion of applications and deepening theoretical analysis that has now established Support Vector Machines, along with neural networks, as one of the standard tools for machine learning and data mining. Students will find the book both stimulating and accessible, while practitioners will be guided smoothly through the material required for a good grasp of the theory and application of these techniques. The concepts are introduced gradually in accessible and self-contained stages, though in each stage the presentation is rigorous and thorough. Pointers to relevant literature and web sites containing software ensure that it forms an ideal starting point for further study. Equally, the book will equip the practitioner to apply the techniques, and an associated web site will provide pointers to updated literature, new applications, and on-line software.

4,327 citations


01 Jan 2000
TL;DR: A new framework for the general learning problem, and a novel powerful learning method called Support Vector Machine or SVM, which can solve small sample learning problems better are introduced.
Abstract: Data-based machine learning covers a wide range of topics, from pattern recognition to function regression and density estimation. Most of the existing methods are based on traditional statistics, which provides conclusions only for the situation where the sample size tends to infinity, so they may not work in practical cases with limited samples. Statistical Learning Theory (SLT), developed by Vapnik et al., is a small-sample statistics that concerns mainly the statistical principles that apply when samples are limited, especially the properties of the learning procedure in such cases. SLT provides a new framework for the general learning problem, and a novel, powerful learning method called the Support Vector Machine (SVM), which can solve small-sample learning problems better. It is believed that the study of SLT and SVM is becoming a new hot area in the field of machine learning. This review introduces the basic ideas of SLT and SVM, their major characteristics, and some current research trends.

408 citations


Proceedings ArticleDOI
24 Jul 2000
TL;DR: The main purpose of the paper is to compare the support vector machine (SVM) developed by Cortes and Vapnik (1995) with other techniques such as backpropagation and radial basis function (RBF) networks for financial forecasting applications.
Abstract: The main purpose of the paper is to compare the support vector machine (SVM) developed by Cortes and Vapnik (1995) with other techniques such as backpropagation and radial basis function (RBF) networks for financial forecasting applications. The theory of the SVM algorithm is based on statistical learning theory. Training of SVMs leads to a quadratic programming (QP) problem. Preliminary computational results for stock price prediction are also presented.

303 citations
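As the abstract above notes, training an SVM leads to a quadratic programming (QP) problem. A minimal, hypothetical sketch of that dual QP (hard-margin, linear kernel, solved with a generic constrained optimizer rather than a specialized QP routine; the data below are illustrative) might look like:

```python
# Hypothetical toy of the dual QP behind hard-margin SVM training:
#   maximize  sum_i a_i - 1/2 sum_ij a_i a_j y_i y_j <x_i, x_j>
#   subject to  a_i >= 0  and  sum_i a_i y_i = 0.
import numpy as np
from scipy.optimize import minimize

# Tiny linearly separable toy set (illustrative data, not from the paper).
X = np.array([[2.0, 2.0], [2.5, 1.5], [-2.0, -2.0], [-1.5, -2.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])

Q = (y[:, None] * y[None, :]) * (X @ X.T)   # Q_ij = y_i y_j <x_i, x_j>

def neg_dual(a):
    # Negated dual objective, since scipy minimizes.
    return 0.5 * a @ Q @ a - a.sum()

res = minimize(neg_dual, np.zeros(len(y)),
               bounds=[(0.0, None)] * len(y),
               constraints=[{'type': 'eq', 'fun': lambda a: a @ y}])

a = res.x
w = (a * y) @ X                        # primal weight vector from the duals
sv = a > 1e-6                          # support vectors carry a_i > 0
b = float(np.mean(y[sv] - X[sv] @ w))  # bias from the support vectors

print(np.sign(X @ w + b))              # recovers y on this separable toy set
```

A production SVM would use a dedicated QP or SMO solver and a soft-margin formulation; this toy only illustrates the optimization structure the abstract refers to.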


Proceedings ArticleDOI
01 Aug 2000
TL;DR: An experimental application to network intrusion detection shows that SmartSifter was able to identify data with high scores that corresponded to attacks, with low computational costs.
Abstract: Outlier detection is a fundamental issue in data mining, specifically in fraud detection, network intrusion detection, network monitoring, etc. SmartSifter is an outlier detection engine addressing this problem from the viewpoint of statistical learning theory. This paper provides a theoretical basis for SmartSifter and empirically demonstrates its effectiveness. SmartSifter detects outliers in an on-line process through the on-line unsupervised learning of a probabilistic model (using a finite mixture model) of the information source. Each time a datum is input, SmartSifter employs an on-line discounting learning algorithm to learn the probabilistic model. A score is given to the datum based on the learned model, with a high score indicating a high possibility of being a statistical outlier. The novel features of SmartSifter are: (1) it is adaptive to non-stationary sources of data; (2) a score has a clear statistical/information-theoretic meaning; (3) it is computationally inexpensive; and (4) it can handle both categorical and continuous variables. An experimental application to network intrusion detection shows that SmartSifter was able to identify data with high scores that corresponded to attacks, with low computational costs. Further experimental application has identified a number of meaningful rare cases in actual health insurance pathology data from Australia's Health Insurance Commission.

182 citations
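The on-line discounting scheme described above can be illustrated with a deliberately simplified, hypothetical sketch: a single univariate Gaussian (in place of SmartSifter's finite mixture model) is updated with an exponential forgetting factor, and each datum is scored by its negative log-likelihood under the model learned so far. The factor `r` and initial parameters are illustrative assumptions.

```python
# Hypothetical sketch of on-line discounting outlier scoring: a running
# Gaussian (SmartSifter itself uses mixtures) with forgetting factor r;
# the score is the datum's negative log-likelihood under the current model.
import math

def make_scorer(r=0.05):
    state = {'mu': 0.0, 'var': 1.0}   # illustrative initial parameters
    def score(x):
        mu, var = state['mu'], state['var']
        s = 0.5 * math.log(2 * math.pi * var) + (x - mu) ** 2 / (2 * var)
        # Discounted (exponentially forgetting) parameter update.
        state['mu'] = (1 - r) * mu + r * x
        state['var'] = max((1 - r) * var + r * (x - state['mu']) ** 2, 1e-6)
        return s
    return score

scorer = make_scorer()
stream = [0.1, -0.2, 0.0, 0.15, 8.0, 0.05]
scores = [scorer(x) for x in stream]
print(scores.index(max(scores)))   # 4: the anomalous datum 8.0 scores highest
```

The discounting makes the model forget old data exponentially fast, which is what gives the real engine its adaptivity to non-stationary sources.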


Proceedings ArticleDOI
30 Aug 2000
TL;DR: SVM architectures for multi-class classification problems are discussed, in particular binary trees of SVMs are considered to solve the multi-class problem.
Abstract: Support vector machines (SVM) are learning algorithms derived from statistical learning theory. The SVM approach was originally developed for binary classification problems. In this paper SVM architectures for multi-class classification problems are discussed; in particular, we consider binary trees of SVMs to solve the multi-class problem. Numerical results for different classifiers on a benchmark data set of handwritten digits are presented.

99 citations
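The binary-tree decomposition discussed above can be sketched as follows for a 4-class toy problem: the root node separates classes {0,1} from {2,3}, and each child separates the remaining pair. To keep the sketch self-contained, simple least-squares linear classifiers stand in for the SVMs at the internal nodes (an assumption for illustration only), and the data are synthetic.

```python
# Hypothetical sketch of a binary tree of binary classifiers for a
# 4-class problem. Least-squares linear classifiers stand in for SVMs
# purely to keep the toy self-contained.
import numpy as np

def fit_binary(X, t):
    # Least-squares linear classifier with bias term; t in {-1, +1}.
    A = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(A, t, rcond=None)
    return w

def decide(w, x):
    return np.append(x, 1.0) @ w < 0   # True -> take the "left" branch

# Four well-separated synthetic clusters (illustrative data).
rng = np.random.default_rng(0)
cents = np.array([[0, 0], [0, 5], [5, 0], [5, 5]], dtype=float)
X = np.vstack([c + 0.3 * rng.standard_normal((20, 2)) for c in cents])
y = np.repeat([0, 1, 2, 3], 20)

root = fit_binary(X, np.where(y <= 1, -1.0, 1.0))                   # {0,1} vs {2,3}
left = fit_binary(X[y <= 1], np.where(y[y <= 1] == 0, -1.0, 1.0))   # 0 vs 1
right = fit_binary(X[y >= 2], np.where(y[y >= 2] == 2, -1.0, 1.0))  # 2 vs 3

def tree_predict(x):
    if decide(root, x):
        return 0 if decide(left, x) else 1
    return 2 if decide(right, x) else 3

acc = np.mean([tree_predict(x) == c for x, c in zip(X, y)])
print(acc)   # near-perfect on this easy synthetic set
```

An N-class problem needs only N-1 node classifiers arranged this way, and each test point traverses a single root-to-leaf path, which is the appeal of the tree over one-vs-one schemes.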


Proceedings ArticleDOI
10 Sep 2000
TL;DR: Empirical results show that the proposed wavelet thresholding for image denoising under the framework provided by statistical learning theory outperforms Donoho's level dependent thresholding techniques and the advantages become more significant under finite sample and non-Gaussian noise settings.
Abstract: This paper describes wavelet thresholding for image denoising under the framework provided by statistical learning theory, a.k.a. Vapnik-Chervonenkis (VC) theory. Under the framework of VC-theory, wavelet thresholding amounts to ordering of wavelet coefficients according to their relevance to accurate function estimation, followed by discarding insignificant coefficients. Existing wavelet thresholding methods specify an ordering based on the coefficient magnitude, and use threshold(s) derived under a Gaussian noise assumption and asymptotic settings. In contrast, the proposed approach uses orderings better reflecting the statistical properties of natural images, and VC-based thresholding developed for finite sample settings under very general noise assumptions. A tree structure is proposed to order the wavelet coefficients based on their magnitude, scale and spatial location. The choice of a threshold is based on the general VC method for model complexity control. Empirical results show that the proposed method outperforms Donoho's (1992, 1995) level-dependent thresholding techniques and the advantages become more significant under finite sample and non-Gaussian noise settings.

89 citations
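For contrast with the VC-based approach, the magnitude-based thresholding baseline mentioned above can be sketched with a one-level Haar transform: small detail coefficients are zeroed and the signal reconstructed. The signal, noise level, and threshold below are illustrative assumptions, not the paper's settings.

```python
# Hypothetical sketch of magnitude-based hard thresholding: one-level
# Haar transform, detail coefficients below an ad hoc threshold zeroed,
# then reconstruction.
import numpy as np

def haar_forward(x):
    e, o = x[0::2], x[1::2]
    return (e + o) / np.sqrt(2), (e - o) / np.sqrt(2)   # approx, detail

def haar_inverse(a, d):
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

rng = np.random.default_rng(1)
sigma = 0.3
clean = np.repeat([0.0, 4.0, 0.0, 4.0], 16)     # piecewise-constant signal
noisy = clean + sigma * rng.standard_normal(clean.size)

a, d = haar_forward(noisy)
d_hat = np.where(np.abs(d) > 3 * sigma / np.sqrt(2), d, 0.0)  # hard threshold
denoised = haar_inverse(a, d_hat)

err_before = np.mean((noisy - clean) ** 2)
err_after = np.mean((denoised - clean) ** 2)
print(err_before > err_after)   # True: thresholding reduces the error
```

The paper's point is that the ordering (here, raw magnitude) and the threshold (here, a Gaussian-motivated constant) are exactly the two choices VC-based complexity control replaces.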


Journal ArticleDOI
TL;DR: The main concepts of Statistical Learning Theory are overviewed, a framework in which learning from examples can be studied in a principled way and well known as well as emerging learning techniques such as Regularization Networks and Support Vector Machines are discussed.
Abstract: In this paper we first overview the main concepts of Statistical Learning Theory, a framework in which learning from examples can be studied in a principled way. We then briefly discuss well known as well as emerging learning techniques such as Regularization Networks and Support Vector Machines, which can be justified in terms of the same induction principle.

82 citations


Journal ArticleDOI
TL;DR: In this article, the authors introduce bootstrap learning methods and the concept of stopping times to drastically reduce the bound on the number of samples required to achieve a performance level, and apply these results to obtain more efficient algorithms which probabilistically guarantee stability and robustness levels when designing controllers for uncertain systems.
Abstract: Probabilistic methods and statistical learning theory have been shown to provide approximate solutions to "difficult" control problems. Unfortunately, the number of samples required in order to guarantee stringent performance levels may be prohibitively large. This paper introduces bootstrap learning methods and the concept of stopping times to drastically reduce the bound on the number of samples required to achieve a performance level. We then apply these results to obtain more efficient algorithms which probabilistically guarantee stability and robustness levels when designing controllers for uncertain systems.

77 citations
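To see why the sample bounds the paper above targets can be prohibitive, the standard additive Chernoff/Hoeffding bound (the kind of worst-case bound the bootstrap and stopping-time ideas aim to improve on) can be computed directly:

```python
# The additive Hoeffding/Chernoff bound: n >= ln(2/delta) / (2 * eps**2)
# samples suffice to estimate a probability to within eps with
# confidence 1 - delta.
import math

def hoeffding_samples(eps, delta):
    return math.ceil(math.log(2 / delta) / (2 * eps ** 2))

print(hoeffding_samples(0.05, 0.01))   # 1060
print(hoeffding_samples(0.001, 1e-6))  # millions: stringent levels explode
```

The quadratic dependence on 1/eps is what makes stringent performance levels so expensive, and sequential (stopping-time) schemes can often terminate far earlier in practice.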


Dissertation
01 Jan 2000
TL;DR: The main result here is the first known consistency proof of a principal curve estimation scheme, and an application of the polygonal line algorithm to hand-written character skeletonization.
Abstract: The subjects of this thesis are unsupervised learning in general, and principal curves in particular. Principal curves were originally defined by Hastie [Has84] and Hastie and Stuetzle [HS89] (hereafter HS) to formally capture the notion of a smooth curve passing through the "middle" of a d-dimensional probability distribution or data cloud. Based on the definition, HS also developed an algorithm for constructing principal curves of distributions and data sets. The field has been very active since Hastie and Stuetzle's groundbreaking work. Numerous alternative definitions and methods for estimating principal curves have been proposed, and principal curves were further analyzed and compared with other unsupervised learning techniques. Several applications in various areas including image analysis, feature extraction, and speech processing demonstrated that principal curves are not only of theoretical interest, but they also have a legitimate place in the family of practical unsupervised learning techniques. Although the concept of principal curves as considered by HS has several appealing characteristics, complete theoretical analysis of the model seems to be rather hard. This motivated us to redefine principal curves in a manner that allowed us to carry out extensive theoretical analysis while preserving the informal notion of principal curves. Our first contribution to the area is, hence, a new theoretical model that is analyzed by using tools of statistical learning theory. Our main result here is the first known consistency proof of a principal curve estimation scheme. The theoretical model proved to be too restrictive to be practical. However, it inspired the design of a new practical algorithm to estimate principal curves based on data. The polygonal line algorithm, which compares favorably with previous methods both in terms of performance and computational complexity, is our second contribution to the area of principal curves.
To complete the picture, in the last part of the thesis we consider an application of the polygonal line algorithm to hand-written character skeletonization.

Book ChapterDOI
Vladimir Vapnik1
01 Jan 2000
TL;DR: In this chapter, a new approach to the main problems of statistical learning theory is introduced: pattern recognition, regression estimation, and density estimation.
Abstract: In this chapter we introduce a new approach to the main problems of statistical learning theory: pattern recognition, regression estimation, and density estimation.

Proceedings ArticleDOI
11 Dec 2000
TL;DR: A simple implementation of the support vector machine (SVM) for pattern recognition, that is not based on solving a complex quadratic optimization problem, is proposed, based on a few simple heuristics.
Abstract: We propose a simple implementation of the support vector machine (SVM) for pattern recognition that is not based on solving a complex quadratic optimization problem. Instead we propose a simple, iterative algorithm that is based on a few simple heuristics. The proposed algorithm finds high-quality solutions in a fast and intuitively simple way. In experiments on the COIL database, on the extended COIL database and on the Sonar database of the UCI Irvine repository, DirectSVM is able to find solutions that are similar to those found by the original SVM. However, DirectSVM is able to find these solutions substantially faster, while requiring less computational resources than the original SVM.

Journal ArticleDOI
TL;DR: Modifications to the standard cascade-correlation learning that take into account the optimal hyperplane constraints are introduced and Experimental results demonstrate that with modified cascade correlation, considerable performance gains are obtained, including better generalization, smaller network size, and faster learning.
Abstract: The main advantages of cascade-correlation learning are the abilities to learn quickly and to determine the network size. However, recent studies have shown that in many problems the generalization performance of a cascade-correlation trained network may not be quite optimal. Moreover, to reach a certain performance level, a larger network may be required than with other training methods. Recent advances in statistical learning theory emphasize the importance of a learning method to be able to learn optimal hyperplanes. This has led to advanced learning methods, which have demonstrated substantial performance improvements. Based on these recent advances in statistical learning theory, we introduce modifications to the standard cascade-correlation learning that take into account the optimal hyperplane constraints. Experimental results demonstrate that with modified cascade correlation, considerable performance gains are obtained compared to the standard cascade-correlation learning. This includes better generalization, smaller network size, and faster learning.

Proceedings ArticleDOI
11 Sep 2000
TL;DR: This work uses SVM classifier as the main module of the system and proposes a parallel implementation on an FPGA programmed with VHDL to exploit the regularities of the SVM decision function in an integrated vision system.
Abstract: Based on statistical learning theory, the Support Vector Machine is a novel neural network method for solving image classification problems. It has been shown to obtain the optimal decision hyperplane and is also insensitive to the dimensionality of the problem. The decision function is constructed from the support vectors obtained during the learning process. Each pixel block in the training database is processed as an input vector; the learning process finds, among the input vectors, those which will construct the solution (the support vectors), together with the weights and the threshold of the neural network. SVM does not need a test database, and the solution depends entirely on the training database. The aim of our work is to exploit the regularities of the SVM decision function in an integrated vision system. The application of our vision system is object detection and localization. We use an SVM classifier as the main module of the system. In order to reduce the classification computation time, we propose a parallel implementation on an FPGA programmed in VHDL.

01 Jan 2000
TL;DR: In this paper, the authors introduce bootstrap learning methods and the concept of stopping times to reduce the bound on the number of samples required to achieve a performance level, and then apply these results to obtain more efficient algorithms which probabilistically guarantee stability and robustness levels when designing controllers for uncertain systems.
Abstract: Recently, probabilistic methods and statistical learning theory have been shown to provide approximate solutions to "difficult" control problems. Unfortunately, the number of samples required in order to guarantee stringent performance levels may be prohibitively large. This paper introduces bootstrap learning methods and the concept of stopping times to drastically reduce the bound on the number of samples required to achieve a performance level. We then apply these results to obtain more efficient algorithms which probabilistically guarantee stability and robustness levels when designing controllers for uncertain systems.

Proceedings ArticleDOI
03 Sep 2000
TL;DR: This paper proposes an object recognition and detection method by a combination of support vector machine classifier (SVM) and rotation invariant phase-only correlation (RIPOC) that can recognize and detect objects from image sequences without special image marks or sensors and show information about the objects through a head-mounted display.
Abstract: This paper proposes an object recognition and detection method by a combination of support vector machine classifier (SVM) and rotation invariant phase-only correlation (RIPOC). SVM is a learning technique that is well founded in statistical learning theory. RIPOC is a position and rotation invariant pattern matching technique. We combined these two techniques to develop an augmented reality system. This system can recognize and detect objects from image sequences without special image marks or sensors and show information about the objects through a head-mounted display. The system runs in real time.

01 Jan 2000
TL;DR: This work, which aims at establishing foundations for the statistical analysis of multi-class models, based on a new notion of margin, paves the way for the theoretical study of the existing multi- class support vector machines and the design of new ones.
Abstract: The theory and practice of discriminant analysis have been mainly developed for two-class problems (computation of dichotomies). This phenomenon can easily be explained, since there is an obvious way to perform multi-category discrimination tasks using solely models computing dichotomies. It consists in dividing the problem at hand into several one-against-all ones and applying a simple rule to construct the global discriminant function from the partial ones. Adopting a direct approach, however, should make it possible to improve the results, let them be theoretical (bounds on the expected risk) or practical (values of the empirical risk and the expected risk). Although multi-category extensions of the main models computing dichotomies can often be conceived simply, as in the case of multi-layer perceptrons, in other cases this cannot be done readily, but only at the expense of the loss of part of the theoretical foundations. This is for instance the main shortcoming of the multi-category support vector machines developed so far. One of the major difficulties of multi-category discriminant analysis rests in the fact that it requires specific uniform convergence results. Indeed, the uniform strong laws of large numbers established for dichotomies do not extend nicely to multi-category problems. They become significantly looser. This is problematical indeed, since the question of the quality of bounds is of central importance if one wants to implement with confidence the structural risk minimization inductive principle, which is precisely the principle grounding the support vector method. In this paper, building upon the notions of margin used in the context of statistical learning theory and boosting theory, and the corresponding generalization error bounds, we derive sharper bounds on the expected risk (generalization error) of multi-class vector-valued discriminant models. The main result is an extension of a lemma by Bartlett.
After a discussion about the notion of margin and its use for two-class discriminant analysis, we derive the main theorem and its corollaries. We then show how to bound the capacity measure for sets of functions of interest and study specifically the case of the multivariate linear regression model, which is of particular importance, since it is directly related to multi-category support vector machines. Finally, the bound is assessed on a real-world biocomputing problem. This work, which aims at establishing foundations for the statistical analysis of multi-class models, based on a new notion of margin, paves the way for the theoretical study of the existing multi-class support vector machines and the design of new ones.

Proceedings ArticleDOI
24 Jul 2000
TL;DR: A new learning machine model for classification problems is presented, based on decompositions of multiclass classification problems into sets of two-class subproblems, assigned to nonlinear dichotomizers that learn their task independently of each other.
Abstract: We present a new learning machine model for classification problems, based on decompositions of multiclass classification problems into sets of two-class subproblems, assigned to nonlinear dichotomizers that learn their task independently of each other. The experimentation performed on classical data sets shows that this learning machine model achieves significant performance improvements over MLP, and over previous classifier models based on decomposition of polychotomies into dichotomies. The theoretical reasons for the good generalization properties of the proposed learning machine model are explained in the framework of statistical learning theory.

Journal ArticleDOI
TL;DR: In this article, the authors argue that the people working in these two areas constitute two cultures arising from distinct historical roots and operating under two different paradigms: descriptive statistics and classical inference concerning the behavior of well-conditioned infinite-dimensional objects.

Dissertation
01 Jan 2000
TL;DR: This thesis presents a theoretical justification of these machines within a unified framework based on the statistical learning theory of Vapnik, and studies the generalization performance of RN and SVM within this framework.
Abstract: This thesis studies the problem of supervised learning using a family of machines, namely kernel learning machines. A number of standard learning methods belong to this family, such as Regularization Networks (RN) and Support Vector Machines (SVM). The thesis presents a theoretical justification of these machines within a unified framework based on the statistical learning theory of Vapnik. The generalization performance of RN and SVM is studied within this framework, and bounds on the generalization error of these machines are proved. In the second part, the thesis goes beyond standard one-layer learning machines, and probes into the problem of learning using hierarchical learning schemes. In particular it investigates the question: what happens when instead of training one machine using the available examples we train many of them, each in a different way, and then combine the machines? Two types of ensembles are defined: voting combinations and adaptive combinations. The statistical properties of these hierarchical learning schemes are investigated both theoretically and experimentally: bounds on their generalization performance are proved, and experiments characterizing their behavior are shown. Finally, the last part of the thesis discusses the problem of choosing data representations for learning. It is an experimental part that uses the particular problem of object detection in images as a framework to discuss a number of issues that arise when kernel machines are used in practice. Thesis Supervisor: Tomaso Poggio Title: Uncas and Helen Whitaker Professor of Brain and Cognitive Sciences

Proceedings Article
01 Sep 2000
TL;DR: This study investigates the basic SVM method and points out some problems that may arise especially in large scale problems with abundant data, and proposes a novel SVM type method that aims to avoid the problems found in the basic method.
Abstract: The concept of optimal hyperplane has been recently proposed in the context of statistical learning theory. The important property of an optimal hyperplane is that it provides maximum margins to each class to be separated. Obviously, such a decision boundary is expected to yield good generalization. Currently, the support vector machines (SVM) are probably one of the very few models (if not the only ones) that make use of the optimal hyperplane concept. In this study we investigate the basic SVM method and point out some problems that may arise especially in large scale problems with abundant data. Moreover, we propose a novel SVM type method that aims to avoid the problems found in the basic method. The experimental results demonstrate that the proposed method can give very good classification performance. However, the results also point out another potential problem in the SVM scheme which should be considered in the future studies.

Proceedings ArticleDOI
24 Jul 2000
TL;DR: This work studies the generalization performance of multiclass discriminant systems and establishes on this performance a bound based on covering numbers, which is then applied to a linear ensemble method which estimates the class posterior probabilities.
Abstract: Starting from a direct definition of the notion of margin in the multiclass case, we study the generalization performance of multiclass discriminant systems. In the framework of statistical learning theory, we establish on this performance a bound based on covering numbers. An application to a linear ensemble method which estimates the class posterior probabilities provides us with a way to compare this bound and another one based on combinatorial dimensions, with respect to the capacity measure they incorporate. Experimental results highlight their usefulness for a real-world problem.
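Covering-number margin bounds of the kind invoked above typically take the following generic form (constants and logarithmic factors vary by author; this is a schematic sketch, not the paper's exact result):

```latex
% With probability at least 1 - \delta over a sample of size m,
% every f in the class \mathcal{F} satisfies, for margin \gamma > 0:
R(f) \;\le\; \hat{R}_{\gamma}(f)
  \;+\; \sqrt{\frac{2}{m}\left(\ln \mathcal{N}\left(\mathcal{F}, \gamma/2, 2m\right) + \ln\frac{2}{\delta}\right)}
```

Here R(f) is the expected risk, \hat{R}_{\gamma}(f) the empirical risk at margin \gamma, and \mathcal{N} a covering number of the function class at scale \gamma/2: the capacity measure that the paper compares against combinatorial-dimension alternatives.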

Proceedings ArticleDOI
22 Aug 2000
TL;DR: It is shown that fusion of features, soft decisions, and hard decisions each yield improved performance with respect to the individual sensors, and fusion decreases the overall error rate.
Abstract: A method is described to improve the performance of sensor fusion algorithms. Data sets available for training fusion algorithms are often smaller than desired, since the sensor suite used for data acquisition is always limited by the slowest, least reliable sensor. In addition, the fusion process expands the dimension of the data, which increases the requirement for training data. By using structural risk minimization, a technique of statistical learning theory, a classifier of optimal complexity can be obtained, leading to improved performance. A technique for jointly optimizing the local decision thresholds is also described for hard-decision fusion. The procedure is demonstrated for EMI, GPR and MWIR data acquired at the US Army mine lanes at Fort AP Hill, VA, Site 71A. It is shown that fusion of features, soft decisions, and hard decisions each yield improved performance with respect to the individual sensors. Fusion decreases the overall error rate from roughly 20 percent for the best single sensor to roughly 10 percent for the best fused result.

Journal Article
TL;DR: This paper presents a new framework for the general learning problem and a powerful new learning method called support vector machine which can solve small sample learning problems better and eliminates the rupture of fractal curve in sample calculations.
Abstract: Rupture of fractal curves is controlled using support vector machine function regression in the later period of fractal interpolation. The method uses Statistical Learning Theory (SLT), which mainly considers the statistical properties of small samples, especially the properties of the learning procedure in such cases. SLT provides a new framework for the general learning problem and a powerful new learning method called the support vector machine, which can solve small-sample learning problems better. The method not only eliminates the rupture of fractal curves in sample calculations, but also retains the advantage of fractal interpolation, which can display details.

Journal Article
TL;DR: The results of practical application indicate that the performance of SVM is superior to RBFNN and that SVM overcomes the problem of overfitting excellently.
Abstract: Aiming at the problem of pattern recognition in sedimentary facies analysis, we put forward a scheme that applies SVM to sedimentary facies recognition. Unlike traditional methods, which try to reduce the dimension of the input space (i.e., feature selection and transformation), SVM increases the dimension of the input space to ensure that the data are linearly separable in the high-dimensional space. The method is feasible because it only changes the inner product operation, and the complexity of the algorithm does not increase. Using SVM we need not spend time on feature extraction, but can instead rely on its intrinsic feature-extraction ability, which makes it more suitable in practical instances. The results of practical application indicate that the performance of SVM is superior to RBFNN and that SVM overcomes the problem of overfitting excellently.

Proceedings ArticleDOI
01 Oct 2000
TL;DR: The evolution of a trainable object detection system for classifying objects, such as faces, people and cars, in complex cluttered images is described, along with some data which provide a glimpse of how 3D objects are represented in the visual cortex.
Abstract: Summary form only given. Learning is becoming the central problem in trying to understand intelligence and in trying to develop intelligent machines. The paper outlines some previous efforts in developing machines that learn. It sketches the author's work on statistical learning theory and theoretical results on the problem of classification and function approximation that connect regularization theory and support vector machines. The main application focus is classification (and regression) in various domains, such as sound, text, video and bioinformatics. In particular, the paper describes the evolution of a trainable object detection system for classifying objects, such as faces, people and cars, in complex cluttered images. Finally, it speculates on the implications of this research for how the brain works and reviews some data which provide a glimpse of how 3D objects are represented in the visual cortex.

Proceedings ArticleDOI
Partha Niyogi1
28 May 2000
TL;DR: An analysis of real-valued function learning using neural networks shows how the generalization ability of a learner is bounded both by finite data and limited representational capacity and shifts attention away from asymptotics to learning with finite resources.
Abstract: We discuss two seemingly disparate problems of learning from examples within the framework of statistical learning theory. The first involves real-valued function learning using neural networks and an analysis of this has two interesting aspects (1) it shows how the generalization ability of a learner is bounded both by finite data and limited representational capacity (2) it shifts attention away from asymptotics to learning with finite resources. The perspective that this yields is then brought to bear on the second problem of learning natural language grammars to articulate some issues that computational linguistics needs to deal with.

01 Jan 2000
TL;DR: The decision-oriented mapping of pollution using hybrid models based on statistical learning theory and geostatistics is considered in this article.
Abstract: The decision-oriented mapping of pollution using hybrid models based on statistical learning theory and geostatistics is considered.