
Showing papers on "Statistical learning theory published in 2000"


Book
01 Jan 2000
TL;DR: This is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory, and will guide practitioners to updated literature, new applications, and on-line software.
Abstract: From the publisher: This is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory. SVMs deliver state-of-the-art performance in real-world applications such as text categorisation, hand-written character recognition, image classification, biosequence analysis, etc., and are now established as one of the standard tools for machine learning and data mining. Students will find the book both stimulating and accessible, while practitioners will be guided smoothly through the material required for a good grasp of the theory and its applications. The concepts are introduced gradually in accessible and self-contained stages, while the presentation is rigorous and thorough. Pointers to relevant literature and web sites containing software ensure that it forms an ideal starting point for further study. Equally, the book and its associated web site will guide practitioners to updated literature, new applications, and on-line software.

13,736 citations


Journal ArticleDOI
TL;DR: The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.
Abstract: The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, neural network techniques and methods imported from statistical learning theory have been receiving increasing attention. The design of a recognition system requires careful attention to the following issues: definition of pattern classes, sensing environment, pattern representation, feature extraction and selection, cluster analysis, classifier design and learning, selection of training and test samples, and performance evaluation. In spite of almost 50 years of research and development in this field, the general problem of recognizing complex patterns with arbitrary orientation, location, and scale remains unsolved. New and emerging applications, such as data mining, web searching, retrieval of multimedia data, face recognition, and cursive handwriting recognition, require robust and efficient pattern recognition techniques. The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.

6,527 citations


Book
01 Mar 2000
TL;DR: This book is the first comprehensive introduction to Support Vector Machines, a new generation learning system based on recent advances in statistical learning theory, and introduces Bayesian analysis of learning and relates SVMs to Gaussian Processes and other kernel based learning methods.
Abstract: This book is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory. The book also introduces Bayesian analysis of learning and relates SVMs to Gaussian Processes and other kernel based learning methods. SVMs deliver state-of-the-art performance in real-world applications such as text categorisation, hand-written character recognition, image classification, biosequence analysis, etc. Their first introduction in the early 1990s led to a recent explosion of applications and deepening theoretical analysis that has now established Support Vector Machines, along with neural networks, as one of the standard tools for machine learning and data mining. Students will find the book both stimulating and accessible, while practitioners will be guided smoothly through the material required for a good grasp of the theory and application of these techniques. The concepts are introduced gradually in accessible and self-contained stages, though in each stage the presentation is rigorous and thorough. Pointers to relevant literature and web sites containing software ensure that it forms an ideal starting point for further study. Equally, the book will equip the practitioner to apply the techniques, and an associated web site will provide pointers to updated literature, new applications, and on-line software.

4,327 citations


01 Jan 2000
TL;DR: A new framework for the general learning problem, and a novel powerful learning method called Support Vector Machine or SVM, which can solve small sample learning problems better are introduced.
Abstract: Data-based machine learning covers a wide range of topics, from pattern recognition to function regression and density estimation. Most of the existing methods are based on traditional statistics, which provides conclusions only for the situation where the sample size tends to infinity, so they may not work in practical cases with limited samples. Statistical Learning Theory (SLT), developed by Vapnik et al., is a small-sample statistics that concerns mainly the statistical principles that apply when samples are limited, especially the properties of the learning procedure in such cases. SLT provides a new framework for the general learning problem, and a novel, powerful learning method called the Support Vector Machine (SVM), which can solve small-sample learning problems better. It is believed that the study of SLT and SVM is becoming a new hot area in the field of machine learning. This review introduces the basic ideas of SLT and SVM, their major characteristics, and some current research trends.

408 citations


Proceedings ArticleDOI
24 Jul 2000
TL;DR: The main purpose of the paper is to compare the support vector machine (SVM) developed by Cortes and Vapnik (1995) with other techniques such as backpropagation and radial basis function (RBF) networks for financial forecasting applications.
Abstract: The main purpose of the paper is to compare the support vector machine (SVM) developed by Cortes and Vapnik (1995) with other techniques such as backpropagation and radial basis function (RBF) networks for financial forecasting applications. The theory of the SVM algorithm is based on statistical learning theory. Training of SVMs leads to a quadratic programming (QP) problem. Preliminary computational results for stock price prediction are also presented.

303 citations
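As the abstract above notes, training an SVM leads to a quadratic programming (QP) problem. A minimal, hypothetical sketch of that dual QP (hard-margin, linear kernel, solved with a generic constrained optimizer rather than a specialized QP routine; the data below are illustrative) might look like:

```python
# Hypothetical toy of the dual QP behind hard-margin SVM training:
#   maximize  sum_i a_i - 1/2 sum_ij a_i a_j y_i y_j <x_i, x_j>
#   subject to  a_i >= 0  and  sum_i a_i y_i = 0.
import numpy as np
from scipy.optimize import minimize

# Tiny linearly separable toy set (illustrative data, not from the paper).
X = np.array([[2.0, 2.0], [2.5, 1.5], [-2.0, -2.0], [-1.5, -2.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])

Q = (y[:, None] * y[None, :]) * (X @ X.T)   # Q_ij = y_i y_j <x_i, x_j>

def neg_dual(a):
    # Negated dual objective, since scipy minimizes.
    return 0.5 * a @ Q @ a - a.sum()

res = minimize(neg_dual, np.zeros(len(y)),
               bounds=[(0.0, None)] * len(y),
               constraints=[{'type': 'eq', 'fun': lambda a: a @ y}])

a = res.x
w = (a * y) @ X                        # primal weight vector from the duals
sv = a > 1e-6                          # support vectors carry a_i > 0
b = float(np.mean(y[sv] - X[sv] @ w))  # bias from the support vectors

print(np.sign(X @ w + b))              # recovers y on this separable toy set
```

A production SVM would use a dedicated QP or SMO solver and a soft-margin formulation; this toy only illustrates the optimization structure the abstract refers to.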


Proceedings ArticleDOI
01 Aug 2000
TL;DR: An experimental application to network intrusion detection shows that SmartSifter was able to identify data with high scores that corresponded to attacks, with low computational costs.
Abstract: Outlier detection is a fundamental issue in data mining, specifically in fraud detection, network intrusion detection, network monitoring, etc. SmartSifter is an outlier detection engine addressing this problem from the viewpoint of statistical learning theory. This paper provides a theoretical basis for SmartSifter and empirically demonstrates its effectiveness. SmartSifter detects outliers in an on-line process through the on-line unsupervised learning of a probabilistic model (using a finite mixture model) of the information source. Each time a datum is input, SmartSifter employs an on-line discounting learning algorithm to learn the probabilistic model. A score is given to the datum based on the learned model, with a high score indicating a high possibility of being a statistical outlier. The novel features of SmartSifter are: (1) it is adaptive to non-stationary sources of data; (2) a score has a clear statistical/information-theoretic meaning; (3) it is computationally inexpensive; and (4) it can handle both categorical and continuous variables. An experimental application to network intrusion detection shows that SmartSifter was able to identify data with high scores that corresponded to attacks, with low computational costs. Further experimental application has identified a number of meaningful rare cases in actual health insurance pathology data from Australia's Health Insurance Commission.

182 citations
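The on-line discounting scheme described above can be illustrated with a deliberately simplified, hypothetical sketch: a single univariate Gaussian (in place of SmartSifter's finite mixture model) is updated with an exponential forgetting factor, and each datum is scored by its negative log-likelihood under the model learned so far. The factor `r` and initial parameters are illustrative assumptions.

```python
# Hypothetical sketch of on-line discounting outlier scoring: a running
# Gaussian (SmartSifter itself uses mixtures) with forgetting factor r;
# the score is the datum's negative log-likelihood under the current model.
import math

def make_scorer(r=0.05):
    state = {'mu': 0.0, 'var': 1.0}   # illustrative initial parameters
    def score(x):
        mu, var = state['mu'], state['var']
        s = 0.5 * math.log(2 * math.pi * var) + (x - mu) ** 2 / (2 * var)
        # Discounted (exponentially forgetting) parameter update.
        state['mu'] = (1 - r) * mu + r * x
        state['var'] = max((1 - r) * var + r * (x - state['mu']) ** 2, 1e-6)
        return s
    return score

scorer = make_scorer()
stream = [0.1, -0.2, 0.0, 0.15, 8.0, 0.05]
scores = [scorer(x) for x in stream]
print(scores.index(max(scores)))   # 4: the anomalous datum 8.0 scores highest
```

The discounting makes the model forget old data exponentially fast, which is what gives the real engine its adaptivity to non-stationary sources.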


Proceedings ArticleDOI
30 Aug 2000
TL;DR: SVM architectures for multi-class classification problems are discussed, in particular binary trees of SVMs are considered to solve the multi-class problem.
Abstract: Support vector machines (SVM) are learning algorithms derived from statistical learning theory. The SVM approach was originally developed for binary classification problems. In this paper SVM architectures for multi-class classification problems are discussed; in particular, we consider binary trees of SVMs to solve the multi-class problem. Numerical results for different classifiers on a benchmark data set of handwritten digits are presented.

99 citations
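The binary-tree decomposition discussed above can be sketched as follows for a 4-class toy problem: the root node separates classes {0,1} from {2,3}, and each child separates the remaining pair. To keep the sketch self-contained, simple least-squares linear classifiers stand in for the SVMs at the internal nodes (an assumption for illustration only), and the data are synthetic.

```python
# Hypothetical sketch of a binary tree of binary classifiers for a
# 4-class problem. Least-squares linear classifiers stand in for SVMs
# purely to keep the toy self-contained.
import numpy as np

def fit_binary(X, t):
    # Least-squares linear classifier with bias term; t in {-1, +1}.
    A = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(A, t, rcond=None)
    return w

def decide(w, x):
    return np.append(x, 1.0) @ w < 0   # True -> take the "left" branch

# Four well-separated synthetic clusters (illustrative data).
rng = np.random.default_rng(0)
cents = np.array([[0, 0], [0, 5], [5, 0], [5, 5]], dtype=float)
X = np.vstack([c + 0.3 * rng.standard_normal((20, 2)) for c in cents])
y = np.repeat([0, 1, 2, 3], 20)

root = fit_binary(X, np.where(y <= 1, -1.0, 1.0))                   # {0,1} vs {2,3}
left = fit_binary(X[y <= 1], np.where(y[y <= 1] == 0, -1.0, 1.0))   # 0 vs 1
right = fit_binary(X[y >= 2], np.where(y[y >= 2] == 2, -1.0, 1.0))  # 2 vs 3

def tree_predict(x):
    if decide(root, x):
        return 0 if decide(left, x) else 1
    return 2 if decide(right, x) else 3

acc = np.mean([tree_predict(x) == c for x, c in zip(X, y)])
print(acc)   # near-perfect on this easy synthetic set
```

An N-class problem needs only N-1 node classifiers arranged this way, and each test point traverses a single root-to-leaf path, which is the appeal of the tree over one-vs-one schemes.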


Proceedings ArticleDOI
10 Sep 2000
TL;DR: Empirical results show that the proposed wavelet thresholding for image denoising under the framework provided by statistical learning theory outperforms Donoho's level dependent thresholding techniques and the advantages become more significant under finite sample and non-Gaussian noise settings.
Abstract: This paper describes wavelet thresholding for image denoising under the framework provided by statistical learning theory, a.k.a. Vapnik-Chervonenkis (VC) theory. Under the framework of VC-theory, wavelet thresholding amounts to ordering of wavelet coefficients according to their relevance to accurate function estimation, followed by discarding insignificant coefficients. Existing wavelet thresholding methods specify an ordering based on the coefficient magnitude, and use threshold(s) derived under a Gaussian noise assumption and asymptotic settings. In contrast, the proposed approach uses orderings better reflecting the statistical properties of natural images, and VC-based thresholding developed for finite sample settings under very general noise assumptions. A tree structure is proposed to order the wavelet coefficients based on their magnitude, scale and spatial location. The choice of a threshold is based on the general VC method for model complexity control. Empirical results show that the proposed method outperforms Donoho's (1992, 1995) level-dependent thresholding techniques and the advantages become more significant under finite sample and non-Gaussian noise settings.

89 citations
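For contrast with the VC-based approach, the magnitude-based thresholding baseline mentioned above can be sketched with a one-level Haar transform: small detail coefficients are zeroed and the signal reconstructed. The signal, noise level, and threshold below are illustrative assumptions, not the paper's settings.

```python
# Hypothetical sketch of magnitude-based hard thresholding: one-level
# Haar transform, detail coefficients below an ad hoc threshold zeroed,
# then reconstruction.
import numpy as np

def haar_forward(x):
    e, o = x[0::2], x[1::2]
    return (e + o) / np.sqrt(2), (e - o) / np.sqrt(2)   # approx, detail

def haar_inverse(a, d):
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

rng = np.random.default_rng(1)
sigma = 0.3
clean = np.repeat([0.0, 4.0, 0.0, 4.0], 16)     # piecewise-constant signal
noisy = clean + sigma * rng.standard_normal(clean.size)

a, d = haar_forward(noisy)
d_hat = np.where(np.abs(d) > 3 * sigma / np.sqrt(2), d, 0.0)  # hard threshold
denoised = haar_inverse(a, d_hat)

err_before = np.mean((noisy - clean) ** 2)
err_after = np.mean((denoised - clean) ** 2)
print(err_before > err_after)   # True: thresholding reduces the error
```

The paper's point is that the ordering (here, raw magnitude) and the threshold (here, a Gaussian-motivated constant) are exactly the two choices VC-based complexity control replaces.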


Journal ArticleDOI
TL;DR: The main concepts of Statistical Learning Theory are overviewed, a framework in which learning from examples can be studied in a principled way and well known as well as emerging learning techniques such as Regularization Networks and Support Vector Machines are discussed.
Abstract: In this paper we first overview the main concepts of Statistical Learning Theory, a framework in which learning from examples can be studied in a principled way. We then briefly discuss well known as well as emerging learning techniques such as Regularization Networks and Support Vector Machines, which can be justified in terms of the same induction principle.

82 citations


Journal ArticleDOI
TL;DR: In this article, the authors introduce bootstrap learning methods and the concept of stopping times to drastically reduce the bound on the number of samples required to achieve a performance level, and apply these results to obtain more efficient algorithms which probabilistically guarantee stability and robustness levels when designing controllers for uncertain systems.
Abstract: Probabilistic methods and statistical learning theory have been shown to provide approximate solutions to "difficult" control problems. Unfortunately, the number of samples required in order to guarantee stringent performance levels may be prohibitively large. This paper introduces bootstrap learning methods and the concept of stopping times to drastically reduce the bound on the number of samples required to achieve a performance level. We then apply these results to obtain more efficient algorithms which probabilistically guarantee stability and robustness levels when designing controllers for uncertain systems.

77 citations
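To see why the sample bounds the paper above targets can be prohibitive, the standard additive Chernoff/Hoeffding bound (the kind of worst-case bound the bootstrap and stopping-time ideas aim to improve on) can be computed directly:

```python
# The additive Hoeffding/Chernoff bound: n >= ln(2/delta) / (2 * eps**2)
# samples suffice to estimate a probability to within eps with
# confidence 1 - delta.
import math

def hoeffding_samples(eps, delta):
    return math.ceil(math.log(2 / delta) / (2 * eps ** 2))

print(hoeffding_samples(0.05, 0.01))   # 1060
print(hoeffding_samples(0.001, 1e-6))  # millions: stringent levels explode
```

The quadratic dependence on 1/eps is what makes stringent performance levels so expensive, and sequential (stopping-time) schemes can often terminate far earlier in practice.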


Dissertation
01 Jan 2000
TL;DR: The main result here is the first known consistency proof of a principal curve estimation scheme, and an application of the polygonal line algorithm to hand-written character skeletonization.
Abstract: The subjects of this thesis are unsupervised learning in general, and principal curves in particular. Principal curves were originally defined by Hastie [Has84] and Hastie and Stuetzle [HS89] (hereafter HS) to formally capture the notion of a smooth curve passing through the "middle" of a d-dimensional probability distribution or data cloud. Based on the definition, HS also developed an algorithm for constructing principal curves of distributions and data sets. The field has been very active since Hastie and Stuetzle's groundbreaking work. Numerous alternative definitions and methods for estimating principal curves have been proposed, and principal curves were further analyzed and compared with other unsupervised learning techniques. Several applications in various areas including image analysis, feature extraction, and speech processing demonstrated that principal curves are not only of theoretical interest, but they also have a legitimate place in the family of practical unsupervised learning techniques. Although the concept of principal curves as considered by HS has several appealing characteristics, complete theoretical analysis of the model seems to be rather hard. This motivated us to redefine principal curves in a manner that allowed us to carry out extensive theoretical analysis while preserving the informal notion of principal curves. Our first contribution to the area is, hence, a new theoretical model that is analyzed by using tools of statistical learning theory. Our main result here is the first known consistency proof of a principal curve estimation scheme. The theoretical model proved to be too restrictive to be practical. However, it inspired the design of a new practical algorithm to estimate principal curves based on data. The polygonal line algorithm, which compares favorably with previous methods both in terms of performance and computational complexity, is our second contribution to the area of principal curves.
To complete the picture, in the last part of the thesis we consider an application of the polygonal line algorithm to hand-written character skeletonization.

Book ChapterDOI
Vladimir Vapnik1
01 Jan 2000
TL;DR: In this chapter, a new approach to the main problems of statistical learning theory is introduced: pattern recognition, regression estimation, and density estimation.
Abstract: In this chapter we introduce a new approach to the main problems of statistical learning theory: pattern recognition, regression estimation, and density estimation.

Proceedings ArticleDOI
11 Dec 2000
TL;DR: A simple implementation of the support vector machine (SVM) for pattern recognition, that is not based on solving a complex quadratic optimization problem, is proposed, based on a few simple heuristics.
Abstract: We propose a simple implementation of the support vector machine (SVM) for pattern recognition that is not based on solving a complex quadratic optimization problem. Instead we propose a simple, iterative algorithm that is based on a few simple heuristics. The proposed algorithm finds high-quality solutions in a fast and intuitively simple way. In experiments on the COIL database, on the extended COIL database and on the Sonar database of the UCI Irvine repository, DirectSVM is able to find solutions that are similar to those found by the original SVM. However, DirectSVM is able to find these solutions substantially faster, while requiring less computational resources than the original SVM.

Journal ArticleDOI
TL;DR: Modifications to the standard cascade-correlation learning that take into account the optimal hyperplane constraints are introduced and Experimental results demonstrate that with modified cascade correlation, considerable performance gains are obtained, including better generalization, smaller network size, and faster learning.
Abstract: The main advantages of cascade-correlation learning are the abilities to learn quickly and to determine the network size. However, recent studies have shown that in many problems the generalization performance of a cascade-correlation trained network may not be quite optimal. Moreover, to reach a certain performance level, a larger network may be required than with other training methods. Recent advances in statistical learning theory emphasize the importance of a learning method to be able to learn optimal hyperplanes. This has led to advanced learning methods, which have demonstrated substantial performance improvements. Based on these recent advances in statistical learning theory, we introduce modifications to the standard cascade-correlation learning that take into account the optimal hyperplane constraints. Experimental results demonstrate that with modified cascade correlation, considerable performance gains are obtained compared to the standard cascade-correlation learning. This includes better generalization, smaller network size, and faster learning.

Proceedings ArticleDOI
11 Sep 2000
TL;DR: This work uses SVM classifier as the main module of the system and proposes a parallel implementation on an FPGA programmed with VHDL to exploit the regularities of the SVM decision function in an integrated vision system.
Abstract: Based on statistical learning theory, the Support Vector Machine is a novel neural network method for solving image classification problems. It has been shown to obtain the optimal decision hyperplane and is also insensitive to the dimensionality of the problem. The decision function is constructed from the support vectors obtained during the learning process. Each pixel block in the training database is processed as an input vector; the learning process finds, among the input vectors, those which will construct the solution (the support vectors), together with the weights and the threshold of the neural network. SVM does not need a test database, and the solution depends entirely on the training database. The aim of our work is to exploit the regularities of the SVM decision function in an integrated vision system. The application of our vision system is object detection and localization. We use an SVM classifier as the main module of the system. In order to reduce the classification computation time, we propose a parallel implementation on an FPGA programmed in VHDL.

01 Jan 2000
TL;DR: In this paper, the authors introduce bootstrap learning methods and the concept of stopping times to reduce the bound on the number of samples required to achieve a performance level, and then apply these results to obtain more efficient algorithms which probabilistically guarantee stability and robustness levels when designing controllers for uncertain systems.
Abstract: Recently, probabilistic methods and statistical learning theory have been shown to provide approximate solutions to "difficult" control problems. Unfortunately, the number of samples required in order to guarantee stringent performance levels may be prohibitively large. This paper introduces bootstrap learning methods and the concept of stopping times to drastically reduce the bound on the number of samples required to achieve a performance level. We then apply these results to obtain more efficient algorithms which probabilistically guarantee stability and robustness levels when designing controllers for uncertain systems.

Proceedings ArticleDOI
03 Sep 2000
TL;DR: This paper proposes an object recognition and detection method by a combination of support vector machine classifier (SVM) and rotation invariant phase-only correlation (RIPOC) that can recognize and detect objects from image sequences without special image marks or sensors and show information about the objects through a head-mounted display.
Abstract: This paper proposes an object recognition and detection method by a combination of support vector machine classifier (SVM) and rotation invariant phase-only correlation (RIPOC). SVM is a learning technique that is well founded in statistical learning theory. RIPOC is a position and rotation invariant pattern matching technique. We combined these two techniques to develop an augmented reality system. This system can recognize and detect objects from image sequences without special image marks or sensors and show information about the objects through a head-mounted display. The system runs in real time.

01 Jan 2000
TL;DR: This work, which aims at establishing foundations for the statistical analysis of multi-class models, based on a new notion of margin, paves the way for the theoretical study of the existing multi- class support vector machines and the design of new ones.
Abstract: The theory and practice of discriminant analysis have been mainly developed for two-class problems (computation of dichotomies). This phenomenon can easily be explained, since there is an obvious way to perform multi-category discrimination tasks using solely models computing dichotomies. It consists in dividing the problem at hand into several one-against-all ones and applying a simple rule to construct the global discriminant function from the partial ones. Adopting a direct approach, however, should make it possible to improve the results, let them be theoretical (bounds on the expected risk) or practical (values of the empirical risk and the expected risk). Although multi-category extensions of the main models computing dichotomies can often be conceived simply, as in the case of multi-layer perceptrons, in other cases this cannot be done readily, but only at the expense of the loss of part of the theoretical foundations. This is for instance the main shortcoming of the multi-category support vector machines developed so far. One of the major difficulties of multi-category discriminant analysis rests in the fact that it requires specific uniform convergence results. Indeed, the uniform strong laws of large numbers established for dichotomies do not extend nicely to multi-category problems. They become significantly looser. This is problematical indeed, since the question of the quality of bounds is of central importance if one wants to implement with confidence the structural risk minimization inductive principle, which is precisely the principle grounding the support vector method. In this paper, building upon the notions of margin used in the context of statistical learning theory and boosting theory, and the corresponding generalization error bounds, we derive sharper bounds on the expected risk (generalization error) of multi-class vector-valued discriminant models. The main result is an extension of a lemma by Bartlett.
After a discussion about the notion of margin and its use for two-class discriminant analysis, we derive the main theorem and its corollaries. We then show how to bound the capacity measure for sets of functions of interest and study specifically the case of the multivariate linear regression model, which is of particular importance, since it is directly related to multi-category support vector machines. Finally, the bound is assessed on a real-world biocomputing problem. This work, which aims at establishing foundations for the statistical analysis of multi-class models, based on a new notion of margin, paves the way for the theoretical study of the existing multi-class support vector machines and the design of new ones.

Proceedings ArticleDOI
24 Jul 2000
TL;DR: A new learning machine model for classification problems is presented, based on decompositions of multiclass classification problems into sets of two-class subproblems, assigned to nonlinear dichotomizers that learn their task independently of each other.
Abstract: We present a new learning machine model for classification problems, based on decompositions of multiclass classification problems into sets of two-class subproblems, assigned to nonlinear dichotomizers that learn their task independently of each other. The experimentation performed on classical data sets shows that this learning machine model achieves significant performance improvements over MLP, and over previous classifier models based on decomposition of polychotomies into dichotomies. The theoretical reasons for the good generalization properties of the proposed learning machine model are explained in the framework of statistical learning theory.

Journal ArticleDOI
TL;DR: In this article, the authors argue that the people working in these two areas constitute two cultures arising from distinct historical roots and operating under two different paradigms: descriptive statistics and classical inference concerning the behavior of well-conditioned infinite-dimensional objects.

Dissertation
01 Jan 2000
TL;DR: This thesis presents a theoretical justification of these machines within a unified framework based on the statistical learning theory of Vapnik, and studies the generalization performance of RN and SVM within this framework.
Abstract: This thesis studies the problem of supervised learning using a family of machines, namely kernel learning machines. A number of standard learning methods belong to this family, such as Regularization Networks (RN) and Support Vector Machines (SVM). The thesis presents a theoretical justification of these machines within a unified framework based on the statistical learning theory of Vapnik. The generalization performance of RN and SVM is studied within this framework, and bounds on the generalization error of these machines are proved. In the second part, the thesis goes beyond standard one-layer learning machines, and probes into the problem of learning using hierarchical learning schemes. In particular it investigates the question: what happens when instead of training one machine using the available examples we train many of them, each in a different way, and then combine the machines? Two types of ensembles are defined: voting combinations and adaptive combinations. The statistical properties of these hierarchical learning schemes are investigated both theoretically and experimentally: bounds on their generalization performance are proved, and experiments characterizing their behavior are shown. Finally, the last part of the thesis discusses the problem of choosing data representations for learning. It is an experimental part that uses the particular problem of object detection in images as a framework to discuss a number of issues that arise when kernel machines are used in practice. Thesis Supervisor: Tomaso Poggio Title: Uncas and Helen Whitaker Professor of Brain and Cognitive Sciences

Proceedings Article
01 Sep 2000
TL;DR: This study investigates the basic SVM method and points out some problems that may arise especially in large scale problems with abundant data, and proposes a novel SVM type method that aims to avoid the problems found in the basic method.
Abstract: The concept of optimal hyperplane has been recently proposed in the context of statistical learning theory. The important property of an optimal hyperplane is that it provides maximum margins to each class to be separated. Obviously, such a decision boundary is expected to yield good generalization. Currently, the support vector machines (SVM) are probably one of the very few models (if not the only ones) that make use of the optimal hyperplane concept. In this study we investigate the basic SVM method and point out some problems that may arise especially in large scale problems with abundant data. Moreover, we propose a novel SVM type method that aims to avoid the problems found in the basic method. The experimental results demonstrate that the proposed method can give very good classification performance. However, the results also point out another potential problem in the SVM scheme which should be considered in the future studies.

Proceedings ArticleDOI
24 Jul 2000
TL;DR: This work studies the generalization performance of multiclass discriminant systems and establishes on this performance a bound based on covering numbers, which is then applied to a linear ensemble method which estimates the class posterior probabilities.
Abstract: Starting from a direct definition of the notion of margin in the multiclass case, we study the generalization performance of multiclass discriminant systems. In the framework of statistical learning theory, we establish on this performance a bound based on covering numbers. An application to a linear ensemble method which estimates the class posterior probabilities provides us with a way to compare this bound and another one based on combinatorial dimensions, with respect to the capacity measure they incorporate. Experimental results highlight their usefulness for a real-world problem.
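Covering-number margin bounds of the kind invoked above typically take the following generic form (constants and logarithmic factors vary by author; this is a schematic sketch, not the paper's exact result):

```latex
% With probability at least 1 - \delta over a sample of size m,
% every f in the class \mathcal{F} satisfies, for margin \gamma > 0:
R(f) \;\le\; \hat{R}_{\gamma}(f)
  \;+\; \sqrt{\frac{2}{m}\left(\ln \mathcal{N}\left(\mathcal{F}, \gamma/2, 2m\right) + \ln\frac{2}{\delta}\right)}
```

Here R(f) is the expected risk, \hat{R}_{\gamma}(f) the empirical risk at margin \gamma, and \mathcal{N} a covering number of the function class at scale \gamma/2: the capacity measure that the paper compares against combinatorial-dimension alternatives.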

Proceedings ArticleDOI
22 Aug 2000
TL;DR: It is shown that fusion of features, soft decisions, and hard decisions each yield improved performance with respect to the individual sensors, and fusion decreases the overall error rate.
Abstract: A method is described to improve the performance of sensor fusion algorithms. Data sets available for training fusion algorithms are often smaller than desired, since the sensor suite used for data acquisition is always limited by the slowest, least reliable sensor. In addition, the fusion process expands the dimension of the data, which increases the requirement for training data. By using structural risk minimization, a technique of statistical learning theory, a classifier of optimal complexity can be obtained, leading to improved performance. A technique for jointly optimizing the local decision thresholds is also described for hard-decision fusion. The procedure is demonstrated for EMI, GPR and MWIR data acquired at the US Army mine lanes at Fort AP Hill, VA, Site 71A. It is shown that fusion of features, soft decisions, and hard decisions each yield improved performance with respect to the individual sensors. Fusion decreases the overall error rate from roughly 20 percent for the best single sensor to roughly 10 percent for the best fused result.

Journal Article
TL;DR: This paper presents a new framework for the general learning problem and a powerful new learning method called support vector machine which can solve small sample learning problems better and eliminates the rupture of fractal curve in sample calculations.
Abstract: Rupture of fractal curves is controlled using support vector machine function regression in the later period of fractal interpolation. The method uses Statistical Learning Theory (SLT), which mainly considers the statistical properties of small samples, especially the properties of the learning procedure in such cases. SLT provides a new framework for the general learning problem and a powerful new learning method called the support vector machine, which can solve small-sample learning problems better. The method not only eliminates the rupture of fractal curves in sample calculations, but also retains the advantage of fractal interpolation, which can display details.

Journal Article
TL;DR: The results of practical application indicate that the performance of SVM is superior to RBFNN and that SVM overcomes the problem of overfitting excellently.
Abstract: Aiming at the problem of pattern recognition in sedimentary facies analysis, we put forward a scheme that applies SVM to sedimentary facies recognition. Unlike traditional methods, which try to reduce the dimension of the input space (i.e., feature selection and transformation), SVM increases the dimension of the input space to ensure that the data are linearly separable in the high-dimensional space. The method is feasible because it only changes the inner product operation, and the complexity of the algorithm does not increase. Using SVM we need not spend time on feature extraction, but can instead rely on its intrinsic feature-extraction ability, which makes it more suitable in practical instances. The results of practical application indicate that the performance of SVM is superior to RBFNN and that SVM overcomes the problem of overfitting excellently.

Proceedings ArticleDOI
01 Oct 2000
TL;DR: The evolution of a trainable object detection system for classifying objects, such as faces, people and cars, in complex cluttered images is described, along with some data which provide a glimpse of how 3D objects are represented in the visual cortex.
Abstract: Summary form only given. Learning is becoming the central problem in trying to understand intelligence and in trying to develop intelligent machines. The paper outlines some previous efforts in developing machines that learn. It sketches the author's work on statistical learning theory and theoretical results on the problem of classification and function approximation that connect regularization theory and support vector machines. The main application focus is classification (and regression) in various domains, such as sound, text, video and bioinformatics. In particular, the paper describes the evolution of a trainable object detection system for classifying objects, such as faces, people and cars, in complex cluttered images. Finally, it speculates on the implications of this research for how the brain works and reviews some data which provide a glimpse of how 3D objects are represented in the visual cortex.

Proceedings ArticleDOI
Partha Niyogi1
28 May 2000
TL;DR: An analysis of real-valued function learning using neural networks shows how the generalization ability of a learner is bounded both by finite data and limited representational capacity and shifts attention away from asymptotics to learning with finite resources.
Abstract: We discuss two seemingly disparate problems of learning from examples within the framework of statistical learning theory. The first involves real-valued function learning using neural networks and an analysis of this has two interesting aspects (1) it shows how the generalization ability of a learner is bounded both by finite data and limited representational capacity (2) it shifts attention away from asymptotics to learning with finite resources. The perspective that this yields is then brought to bear on the second problem of learning natural language grammars to articulate some issues that computational linguistics needs to deal with.

01 Jan 2000
TL;DR: The decision-oriented mapping of pollution using hybrid models based on statistical learning theory and geostatistics is considered in this article.
Abstract: The decision-oriented mapping of pollution using hybrid models based on statistical learning theory and geostatistics is considered.