
Showing papers on "Statistical learning theory published in 1998"


01 Jan 1998
TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimation from small data pools, the application of these estimates to real-life problems, and much more.
Abstract: A comprehensive look at learning and generalization theory. The statistical theory of learning and generalization concerns the problem of choosing desired functions on the basis of empirical data. Highly applicable to a variety of computer science and robotics fields, this book offers lucid coverage of the theory as a whole. Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimation from small data pools, the application of these estimates to real-life problems, and much more.

26,531 citations


Proceedings Article
01 Dec 1998
TL;DR: A general S3VM model is proposed that minimizes both the misclassification error and the function capacity based on all the available data; the model can be converted to a mixed-integer program and then solved exactly using integer programming.
Abstract: We introduce a semi-supervised support vector machine (S3VM) method. Given a training set of labeled data and a working set of unlabeled data, S3VM constructs a support vector machine using both the training and working sets. We use S3VM to solve the transduction problem using overall risk minimization (ORM) posed by Vapnik. The transduction problem is to estimate the value of a classification function at the given points in the working set. This contrasts with the standard inductive learning problem of estimating the classification function at all possible values and then using the fixed function to deduce the classes of the working set data. We propose a general S3VM model that minimizes both the misclassification error and the function capacity based on all the available data. We show how the S3VM model for 1-norm linear support vector machines can be converted to a mixed-integer program and then solved exactly using integer programming. Results of S3VM and the standard 1-norm support vector machine approach are compared on ten data sets. Our computational results support the statistical learning theory results showing that incorporating working data improves generalization when insufficient training information is available. In every case, S3VM either improved or showed no significant difference in generalization compared to the traditional approach.
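The overall-risk-minimization idea above, choosing working-set labels jointly with the classifier, can be sketched in a toy form: enumerate every labeling of a tiny 1-D working set and score each with a simple objective combining misclassification count and an inverse-margin capacity term. This is not the paper's mixed-integer 1-norm SVM formulation; the threshold classifier, the scoring rule, and the constant C are illustrative assumptions.

```python
from itertools import product

def best_threshold(points):
    """For labeled 1-D points [(x, y)], y in {-1, +1}, scan candidate
    thresholds and return (threshold, sign, errors, margin)."""
    xs = sorted(x for x, _ in points)
    # candidate thresholds: midpoints between consecutive sorted xs
    cands = [(a + b) / 2 for a, b in zip(xs, xs[1:])] or [xs[0]]
    best = None
    for t in cands:
        for s in (+1, -1):           # orientation of the threshold rule
            errs = sum(1 for x, y in points if s * (1 if x > t else -1) != y)
            margin = min(abs(x - t) for x, _ in points)
            score = (errs, -margin)  # fewer errors first, then larger margin
            if best is None or score < best[0]:
                best = (score, t, s, errs, margin)
    _, t, s, errs, margin = best
    return t, s, errs, margin

def s3vm_enumerate(labeled, unlabeled, C=10.0):
    """Try every labeling of the working set (feasible only for tiny sets),
    score each by C * errors - margin, and keep the minimizer."""
    best = None
    for ys in product((-1, +1), repeat=len(unlabeled)):
        pts = labeled + list(zip(unlabeled, ys))
        t, s, errs, margin = best_threshold(pts)
        obj = C * errs - margin      # misclassification + capacity surrogate
        if best is None or obj < best[0]:
            best = (obj, ys, t, s)
    return best[1], best[2], best[3]

labeled = [(0.0, -1), (1.0, -1), (4.0, +1), (5.0, +1)]
unlabeled = [1.5, 3.5]
ys, t, s = s3vm_enumerate(labeled, unlabeled)
print(ys, t)  # labels (-1, +1); threshold 2.5, the widest-margin split
```

Note how the unlabeled points pull the decision boundary into the gap between the two clusters, exactly the generalization benefit the paper's experiments report; the real formulation replaces this exponential enumeration with binary variables in a mixed-integer program.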

882 citations


Journal ArticleDOI
TL;DR: The use of randomized algorithms to solve some problems in control system design that are perceived to be "difficult" is presented, showing that the randomized approach can be quite successful in tackling a practical problem.
Abstract: The topic of the present article is the use of randomized algorithms to solve some problems in control system designs that are perceived to be "difficult". A brief introduction is given to the notions of computational complexity that are pertinent to the present discussion, and then some problems in control system analysis and synthesis that are difficult in a complexity-theoretic sense are described. Some of the elements of statistical learning theory, which forms the basis of the randomized approach, are briefly described. Finally, these two sets of ideas are brought together to show that it is possible to construct efficient randomized algorithms for each of the difficult problems discussed by using the ideas of statistical learning theory. A real-life design example of synthesizing a first-order controller for the longitudinal stabilization of an unstable fighter aircraft is then presented to show that the randomized approach can be quite successful in tackling a practical problem.
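The randomized recipe described above can be sketched under assumed toy dynamics: a scalar uncertain plant x[t+1] = a*x[t] + b*u[t] with a static gain controller u = -k*x. The plant family, the performance measure, and the sample sizes below are invented for illustration; the point is only the pattern of sampling plants at random and picking the gain with the best empirical average performance.

```python
import random

random.seed(0)

def sample_plant():
    # Hypothetical uncertain scalar plant x[t+1] = a*x[t] + b*u[t]
    a = random.uniform(1.0, 1.5)   # open-loop unstable (a > 1)
    b = random.uniform(0.8, 1.2)
    return a, b

def performance(k, plant):
    """Closed-loop pole with u = -k*x is (a - b*k); penalize instability."""
    a, b = plant
    pole = abs(a - b * k)
    return pole if pole < 1.0 else 10.0   # large penalty when unstable

# Randomized design: draw plant samples and candidate gains, then pick the
# gain with the best *average* performance over the sampled plant family.
plants = [sample_plant() for _ in range(500)]
candidates = [random.uniform(0.0, 3.0) for _ in range(200)]
best_k = min(candidates,
             key=lambda k: sum(performance(k, p) for p in plants) / len(plants))
print(round(best_k, 2))
```

Averaging over the sampled family, rather than optimizing the worst case, is exactly the design philosophy the article advocates; here the winning gain ends up stabilizing every sampled plant while keeping the average closed-loop pole small.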

173 citations


Proceedings ArticleDOI
04 May 1998
TL;DR: This work suggests using the framework of statistical learning theory to explain the effect of weight initialization on complexity control in multilayer perceptron (MLP) networks trained via backpropagation.
Abstract: Complexity control of a learning method is critical for obtaining good generalization with finite training data. We discuss complexity control in multilayer perceptron (MLP) networks trained via backpropagation. For such networks, the number of hidden units and/or network weights is usually used as a complexity parameter. However, application of backpropagation training introduces additional mechanisms for complexity control. These mechanisms are implicit in the implementation of an optimization procedure, and they cannot be easily quantified (in contrast to the number of weights or the number of hidden units). We suggest using the framework of statistical learning theory to explain the effect of weight initialization. Using this framework, we demonstrate the effect of weight initialization on complexity control in MLP networks.
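The effect described above can be illustrated with a toy experiment (the architecture, weight scales, and complexity proxy below are assumptions, not the authors' setup): a tanh MLP with small initial weights realizes a nearly linear function, so its deviation from the best linear fit, a crude proxy for effective complexity, is far smaller than with large initial weights.

```python
import math, random

random.seed(1)

def mlp(x, W1, b1, W2):
    """One-hidden-layer tanh network with scalar input and output."""
    h = [math.tanh(w * x + b) for w, b in zip(W1, b1)]
    return sum(v * hi for v, hi in zip(W2, h))

def nonlinearity(scale, hidden=20, n=101):
    """RMS deviation of a randomly initialized network from its best
    least-squares linear fit: a proxy for the realized function's
    effective complexity."""
    W1 = [random.gauss(0, scale) for _ in range(hidden)]
    b1 = [random.gauss(0, scale) for _ in range(hidden)]
    W2 = [random.gauss(0, scale) for _ in range(hidden)]
    xs = [i / (n - 1) * 4 - 2 for i in range(n)]        # grid on [-2, 2]
    ys = [mlp(x, W1, b1, W2) for x in xs]
    # least-squares line through (xs, ys)
    mx, my = sum(xs) / n, sum(ys) / n
    beta = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
    resid = [y - (my + beta * (x - mx)) for x, y in zip(xs, ys)]
    return math.sqrt(sum(r * r for r in resid) / n)

print(nonlinearity(0.05) < nonlinearity(2.0))  # small init stays near-linear
```

Since backpropagation moves weights gradually away from their initial values, starting small effectively starts the search in a low-complexity region of function space, which is the implicit complexity-control mechanism the paper analyzes.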

12 citations


Journal Article
TL;DR: It is shown that a property from statistical learning theory known as uniform convergence of empirical means (UCEM) plays an important role in allowing us to construct efficient randomized algorithms for a wide variety of controller synthesis problems; in particular, robust stabilization and weighted H2/H∞-norm minimization are amenable to the randomized approach.
Abstract: By now it is known that several problems in the robustness analysis and synthesis of control systems are NP-complete or NP-hard. These negative results force us to modify our notion of solving a given problem. If we cannot solve a problem exactly because it is NP-hard, then we must settle for solving it approximately. If we cannot solve all instances of a problem, we must settle for solving almost all instances of a problem. An approach that is recently gaining popularity is that of using randomized algorithms. The notion of a randomized algorithm as defined here is somewhat different from that in the computer science literature, and enlarges the class of problems that can be efficiently solved. We begin with the premise that many problems in robustness analysis and synthesis can be formulated as the minimization of an objective function with respect to the controller parameters. It is argued that, in order to assess the performance of a controller as the plant varies over a prespecified family, it is better to use the average performance of the controller as the objective function to be minimized, rather than its worst-case performance, as the worst-case objective function usually leads to rather conservative designs. Then it is shown that a property from statistical learning theory known as uniform convergence of empirical means (UCEM) plays an important role in allowing us to construct efficient randomized algorithms for a wide variety of controller synthesis problems. In particular, whenever the UCEM property holds, there exists an efficient (i.e., polynomial-time) randomized algorithm. Using very recent results in VC-dimension theory, it is shown that the UCEM property holds in several problems such as robust stabilization and weighted H2/H∞-norm minimization. Hence it is possible to solve such problems efficiently using randomized algorithms.
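The UCEM property rests on quantitative convergence of empirical means to true means. For a single bounded performance measure, Hoeffding's inequality already gives the standard sample-size estimate m ≥ ln(2/δ)/(2ε²) for the empirical mean to be within ε of the true mean with probability at least 1 − δ. A small self-contained check (the Bernoulli "performance measure" below is an illustrative stand-in):

```python
import math, random

def hoeffding_samples(eps, delta):
    """Samples m so that, for [0,1]-valued measures, Hoeffding gives
    P(|empirical mean - true mean| > eps) <= delta."""
    return math.ceil(math.log(2 / delta) / (2 * eps ** 2))

m = hoeffding_samples(eps=0.05, delta=0.01)
print(m)  # 1060

# Empirical check: estimate the mean of a Bernoulli(0.3) performance
# measure 200 times and count how often the estimate misses by > eps.
random.seed(42)
trials = 200
bad = sum(
    abs(sum(random.random() < 0.3 for _ in range(m)) / m - 0.3) > 0.05
    for _ in range(trials)
)
print(bad)  # far fewer failures than delta * trials = 2 would allow
```

The UCEM property strengthens this to convergence *uniformly over the whole controller family*, which is where the VC-dimension results enter: the bound must hold simultaneously for every candidate controller, not just one.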

9 citations


Journal ArticleDOI
TL;DR: In this paper, a brief introduction is given to some statistical aspects of PAC (probably approximately correct) learning theory, and a close connection is drawn between the principal results in PAC learning theory and those in empirical process theory, the latter being a well-established branch of probability theory.

5 citations


Book ChapterDOI
01 Jan 1998
TL;DR: It is shown that a property from statistical learning theory known as uniform convergence of empirical means (UCEM) plays an important role in allowing us to construct efficient randomized algorithms for a wide variety of controller synthesis problems, and whenever the UCEM property holds, there exists an efficient (i.e., polynomial-time) randomized algorithm.
Abstract: By now it is known that several problems in control and matrix theory are NP-hard. These include matrix problems that arise in control theory, as well as other problems in the robustness analysis and synthesis of control systems. These negative results force us to modify our notion of "solving" a given problem. If we cannot solve a problem exactly because it is NP-hard, then we must settle for solving it approximately. If we cannot solve all instances of a problem, we must settle for solving "almost all" instances of a problem. An approach that is recently gaining popularity is that of using randomized algorithms. The notion of a randomized algorithm as defined here is somewhat different from that in the computer science literature, and enlarges the class of problems that can be efficiently solved. We begin with the premise that many problems in robustness analysis and synthesis can be formulated as the minimization of an objective function with respect to the controller parameters. It is argued that, in order to assess the performance of a controller as the plant varies over a prespecified family, it is better to use the average performance of the controller as the objective function to be minimized, rather than its worst-case performance, as the worst-case objective function usually leads to rather conservative designs. Then it is shown that a property from statistical learning theory known as uniform convergence of empirical means (UCEM) plays an important role in allowing us to construct efficient randomized algorithms for a wide variety of controller synthesis problems. In particular, whenever the UCEM property holds, there exists an efficient (i.e., polynomial-time) randomized algorithm. Using very recent results in VC-dimension theory, it is shown that the UCEM property holds in several problems such as robust stabilization and weighted H∞-norm minimization. Hence it is possible to solve such problems efficiently using randomized algorithms.
The paper is concluded by showing that the statistical learning methodology is also applicable to some NP-hard matrix problems.

1 citation


Journal ArticleDOI
TL;DR: A number of recent results in statistical learning theory are summarised in the context of nonlinear system identification, leading to the statement of a number of characterisation results.