Topic

Statistical learning theory

About: Statistical learning theory is a research topic. Over the lifetime, 1618 publications have been published within this topic receiving 158033 citations.

...read moreread less

Papers published on a yearly basis

Papers

PDF

Open Access

More filters

Journal Article•DOI•

On the mathematical foundations of learning

[...]

Felipe Cucker¹, Steve Smale², Steve Smale¹•Institutions (2)

City University of Hong Kong¹, University of California, Berkeley²

05 Oct 2001-Bulletin of the American Mathematical Society

TL;DR: A main theme of this report is the relationship of approximation to learning and the primary role of sampling (inductive inference) and relations of the theory of learning to the mainstream of mathematics are emphasized.

...read moreread less

Abstract: (1) A main theme of this report is the relationship of approximation to learning and the primary role of sampling (inductive inference). We try to emphasize relations of the theory of learning to the mainstream of mathematics. In particular, there are large roles for probability theory, for algorithms such as least squares, and for tools and ideas from linear algebra and linear analysis. An advantage of doing this is that communication is facilitated and the power of core mathematics is more easily brought to bear. We illustrate what we mean by learning theory by giving some instances. (a) The understanding of language acquisition by children or the emergence of languages in early human cultures. (b) In Manufacturing Engineering, the design of a new wave of machines is anticipated which uses sensors to sample properties of objects before, during, and after treatment. The information gathered from these samples is to be analyzed by the machine to decide how to better deal with new input objects (see [43]). (c) Pattern recognition of objects ranging from handwritten letters of the alphabet to pictures of animals, to the human voice. Understanding the laws of learning plays a large role in disciplines such as (Cognitive) Psychology, Animal Behavior, Economic Decision Making, all branches of Engineering, Computer Science, and especially the study of human thought processes (how the brain works). Mathematics has already played a big role towards the goal of giving a universal foundation of studies in these disciplines. We mention as examples the theory of Neural Networks going back to McCulloch and Pitts [25] and Minsky and Papert [27], the PAC learning of Valiant [40], Statistical Learning Theory as developed by Vapnik [42], and the use of reproducing kernels as in [17] among many other mathematical developments. We are heavily indebted to these developments. Recent discussions with a number of mathematicians have also been helpful. In

...read moreread less

1,651 citations

Journal Article•DOI•

Comparing support vector machines with Gaussian kernels to radial basis function classifiers

[...]

Bernhard Schölkopf¹, Kah-Kay Sung², C.J.C. Burges³, Federico Girosi⁴, Partha Niyogi³, Tomaso Poggio⁴, Vladimir Vapnik⁵ - Show less +3 more•Institutions (5)

Max Planck Society¹, National University of Singapore², Alcatel-Lucent³, Massachusetts Institute of Technology⁴, AT&T⁵

01 Nov 1997-IEEE Transactions on Signal Processing

TL;DR: The results show that on the United States postal service database of handwritten digits, the SV machine achieves the highest recognition accuracy, followed by the hybrid system, and the SV approach is thus not only theoretically well-founded but also superior in a practical application.

...read moreread less

Abstract: The support vector (SV) machine is a novel type of learning machine, based on statistical learning theory, which contains polynomial classifiers, neural networks, and radial basis function (RBF) networks as special cases. In the RBF case, the SV algorithm automatically determines centers, weights, and threshold that minimize an upper bound on the expected test error. The present study is devoted to an experimental comparison of these machines with a classical approach, where the centers are determined by X-means clustering, and the weights are computed using error backpropagation. We consider three machines, namely, a classical RBF machine, an SV machine with Gaussian kernel, and a hybrid system with the centers determined by the SV method and the weights trained by error backpropagation. Our results show that on the United States postal service database of handwritten digits, the SV machine achieves the highest recognition accuracy, followed by the hybrid system. The SV approach is thus not only theoretically well-founded but also superior in a practical application.

...read moreread less

1,385 citations

Journal Article•DOI•

Domain adaptation for statistical classifiers

[...]

Hal Daumé¹, Daniel Marcu¹•Institutions (1)

University of Southern California¹

01 May 2006-Journal of Artificial Intelligence Research

TL;DR: This work introduces a statistical formulation of this problem in terms of a simple mixture model and presents an instantiation of this framework to maximum entropy classifiers and their linear chain counterparts and leads to improved performance on three real world tasks on four different data sets from the natural language processing domain.

...read moreread less

Abstract: The most basic assumption used in statistical learning theory is that training data and test data are drawn from the same underlying distribution. Unfortunately, in many applications, the "in-domain" test data is drawn from a distribution that is related, but not identical, to the "out-of-domain" distribution of the training data. We consider the common case in which labeled out-of-domain data is plentiful, but labeled in-domain data is scarce. We introduce a statistical formulation of this problem in terms of a simple mixture model and present an instantiation of this framework to maximum entropy classifiers and their linear chain counterparts. We present efficient inference algorithms for this special case based on the technique of conditional expectation maximization. Our experimental results show that our approach leads to improved performance on three real world tasks on four different data sets from the natural language processing domain.

...read moreread less

894 citations

Proceedings Article•

Semi-Supervised Support Vector Machines

[...]

Kristin P. Bennett¹, Ayhan Demiriz¹•Institutions (1)

Rensselaer Polytechnic Institute¹

01 Dec 1998

TL;DR: A general S3VM model is proposed that minimizes both the misclassification error and the function capacity based on all the available data that can be converted to a mixed-integer program and then solved exactly using integer programming.

...read moreread less

Abstract: We introduce a semi-supervised support vector machine (S3VM) method. Given a training set of labeled data and a working set of unlabeled data, S3VM constructs a support vector machine using both the training and working sets. We use S3VM to solve the transduction problem using overall risk minimization (ORM) posed by Vapnik. The transduction problem is to estimate the value of a classification function at the given points in the working set. This contrasts with the standard inductive learning problem of estimating the classification function at all possible values and then using the fixed function to deduce the classes of the working set data. We propose a general S3VM model that minimizes both the misclassification error and the function capacity based on all the available data. We show how the S3VM model for 1-norm linear support vector machines can be converted to a mixed-integer program and then solved exactly using integer programming. Results of S3VM and the standard 1-norm support vector machine approach are compared on ten data sets. Our computational results support the statistical learning theory results showing that incorporating working data improves generalization when insufficient training information is available. In every case, S3VM either improved or showed no significant difference in generalization compared to the traditional approach.

...read moreread less

882 citations

Journal Article•DOI•

A survey of multi-view machine learning

[...]

Shiliang Sun¹•Institutions (1)

East China Normal University¹

17 Feb 2013-Neural Computing and Applications

TL;DR: This paper reviews theories developed to understand the properties and behaviors of multi-view learning and gives a taxonomy of approaches according to the machine learning mechanisms involved and the fashions in which multiple views are exploited.

...read moreread less

Abstract: Multi-view learning or learning with multiple distinct feature sets is a rapidly growing direction in machine learning with well theoretical underpinnings and great practical success. This paper reviews theories developed to understand the properties and behaviors of multi-view learning and gives a taxonomy of approaches according to the machine learning mechanisms involved and the fashions in which multiple views are exploited. This survey aims to provide an insightful organization of current developments in the field of multi-view learning, identify their limitations, and give suggestions for further research. One feature of this survey is that we attempt to point out specific open problems which can hopefully be useful to promote the research of multi-view machine learning.

...read moreread less

782 citations

Collapse

Network Information

Performance

Metrics

1,647

Papers

173,903

Citations

No. of papers in the topic in previous years
Year	Papers
2023	9
2022	19
2021	59
2020	69
2019	72
2018	47

Statistical learning theory

Papers published on a yearly basis

Papers

Trending Questions (10)

Network Information

Related Topics (5)

Performance

Metrics