
Showing papers on "Empirical risk minimization published in 1998"


01 Jan 1998
TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimation from small data pools, the application of these estimates to real-life problems, and much more.
Abstract: A comprehensive look at learning and generalization theory. The statistical theory of learning and generalization concerns the problem of choosing desired functions on the basis of empirical data. Highly applicable to a variety of computer science and robotics fields, this book offers lucid coverage of the theory as a whole. Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimation from small data pools, the application of these estimates to real-life problems, and much more.
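The empirical risk minimization principle at the heart of this theory can be sketched in a few lines: among a class of candidate functions, select the one with the lowest average loss on the training sample. The threshold-classifier hypothesis class and the toy data below are illustrative choices, not taken from the book.

```python
# Minimal sketch of empirical risk minimization (ERM): pick, from a
# finite hypothesis class, the function with the lowest average loss
# on the sample. Hypothesis class and data are illustrative.

def empirical_risk(f, data):
    """Average 0-1 loss of hypothesis f on the sample."""
    return sum(f(x) != y for x, y in data) / len(data)

def erm(hypotheses, data):
    """Return the hypothesis minimizing the empirical risk."""
    return min(hypotheses, key=lambda f: empirical_risk(f, data))

# Toy example: threshold classifiers on the real line.
data = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
hypotheses = [lambda x, t=t / 10: int(x > t) for t in range(10)]
best = erm(hypotheses, data)
print(empirical_risk(best, data))  # 0.0 on this sample
```

The book's central question is when this procedure is consistent, i.e. when low empirical risk implies low expected risk as the sample grows.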

26,531 citations


Proceedings Article
01 Dec 1998
TL;DR: A simple learning rule is derived, the VAPS algorithm, which can be instantiated to generate a wide range of new reinforcement-learning algorithms, and allows policy-search and value-based algorithms to be combined, thus unifying two very different approaches to reinforcement learning into a single Value and Policy Search algorithm.
Abstract: A simple learning rule is derived, the VAPS algorithm, which can be instantiated to generate a wide range of new reinforcement-learning algorithms. These algorithms solve a number of open problems, define several new approaches to reinforcement learning, and unify different approaches to reinforcement learning under a single theory. These algorithms all have guaranteed convergence, and include modifications of several existing algorithms that were known to fail to converge on simple MDPs. These include Q-learning, SARSA, and advantage learning. In addition to these value-based algorithms it also generates pure policy-search reinforcement-learning algorithms, which learn optimal policies without learning a value function. In addition, it allows policy-search and value-based algorithms to be combined, thus unifying two very different approaches to reinforcement learning into a single Value and Policy Search (VAPS) algorithm. And these algorithms converge for POMDPs without requiring a proper belief state. Simulation results are given, and several areas for future research are discussed.

284 citations


Journal ArticleDOI
TL;DR: This approach differs from previous complexity regularization neural-network function learning schemes in that it operates with random covering numbers and l1 metric entropy, making it possible to consider much broader families of activation functions, namely functions of bounded variation.
Abstract: We apply the method of complexity regularization to derive estimation bounds for nonlinear function estimation using a single hidden layer radial basis function network. Our approach differs from previous complexity regularization neural-network function learning schemes in that we operate with random covering numbers and l1 metric entropy, making it possible to consider much broader families of activation functions, namely functions of bounded variation. Some constraints previously imposed on the network parameters are also eliminated this way. The network is trained by means of complexity regularization involving empirical risk minimization. Bounds on the expected risk in terms of the sample size are obtained for a large class of loss functions. Rates of convergence to the optimal loss are also derived.
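The model-selection rule behind complexity regularization can be illustrated with a hedged sketch: fit RBF networks of increasing size k by least squares, then pick the k minimizing empirical risk plus a complexity penalty. The penalty form c*k*log(n)/n, the Gaussian basis width, and the data are illustrative assumptions; the paper derives its penalties from covering numbers and metric entropy.

```python
import numpy as np

# Hedged sketch of complexity regularization as a model-selection rule:
# among single-hidden-layer RBF fits of increasing size k, pick the one
# minimizing empirical risk plus a complexity penalty. The penalty
# c*k*log(n)/n is an illustrative stand-in, not the paper's bound.

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-1, 1, n)
y = np.sin(3 * x) + 0.1 * rng.standard_normal(n)

def rbf_design(x, centers, width=0.3):
    """Gaussian RBF design matrix with fixed centers and width."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width**2))

def fit_rbf(x, y, k):
    """Least-squares fit of a k-center RBF expansion."""
    centers = np.linspace(-1, 1, k)
    Phi = rbf_design(x, centers)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return centers, w

def empirical_risk(x, y, centers, w):
    return np.mean((rbf_design(x, centers) @ w - y) ** 2)

c = 0.05  # illustrative penalty constant
scores = {}
for k in range(1, 15):
    centers, w = fit_rbf(x, y, k)
    scores[k] = empirical_risk(x, y, centers, w) + c * k * np.log(n) / n
best_k = min(scores, key=scores.get)
print(best_k)
```

The penalized score balances fit against network size, which is the mechanism by which the paper's bounds control expected risk.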

90 citations


01 Jan 1998
TL;DR: A vector space based method is presented that performs a linear mapping from documents to scalar utility values and thus guarantees transitivity; the approach is extended to polynomial utility functions by using the potential function method, which makes it possible to incorporate higher-order correlations of features into the utility function at minimal computational cost.
Abstract: In this paper we investigate the problem of learning a preference relation from a given set of ranked documents. We show that the Bayes optimal decision function, when applied to learning a preference relation, may violate transitivity. This is undesirable for information retrieval, because it is in conflict with a document ranking based on the user's preferences. To overcome this problem we present a vector space based method that performs a linear mapping from documents to scalar utility values and thus guarantees transitivity. The learning of the relation between documents is formulated as a classification problem on pairs of documents and is solved using the principle of structural risk minimization for good generalization. The approach is extended to polynomial utility functions by using the potential function method (the so-called "kernel trick"), which makes it possible to incorporate higher-order correlations of features into the utility function at minimal computational cost. The resulting algorithm is tested on an example with artificial data. The algorithm successfully learns the utility function underlying the training examples and shows good classification performance.
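A minimal sketch of the pairwise formulation, under illustrative assumptions (a linear utility and a perceptron-style training loop; the paper itself uses structural risk minimization): each pair where document i outranks document j yields a difference vector x_i - x_j that should score positive, and the learned weight vector defines a transitive scalar utility.

```python
import numpy as np

# Hedged sketch: learn a linear utility u(x) = w.x by classifying
# difference vectors of document pairs. Any linear utility induces a
# transitive ordering by construction. Data and the perceptron-style
# loop are illustrative, not from the paper.

rng = np.random.default_rng(1)
d = 5
w_true = rng.standard_normal(d)           # hidden utility direction
docs = rng.standard_normal((40, d))
order = np.argsort(-docs @ w_true)        # ground-truth ranking

# Training pairs: (higher-ranked doc) - (lower-ranked doc), label +1.
pairs = [docs[order[i]] - docs[order[j]]
         for i in range(len(order)) for j in range(i + 1, len(order))]

w = np.zeros(d)
for _ in range(50):                       # perceptron updates on pairs
    for z in pairs:
        if w @ z <= 0:
            w += z

frac_correct = np.mean([w @ z > 0 for z in pairs])
print(frac_correct)
```

Replacing the inner product with a kernel on document pairs gives the polynomial-utility extension the abstract mentions.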

71 citations


01 Jan 1998
TL;DR: Although eligibility traces increased the rate of convergence to the optimal value function compared to learning with macro-actions but without eligibility traces, eligibility traces did not permit the optimal policy to be learned as quickly as it was using macro- actions.
Abstract: Several researchers have proposed reinforcement learning methods that obtain advantages in learning by using temporally extended actions, or macro-actions, but none has carefully analyzed what these advantages are. In this paper, we separate and analyze two advantages of using macro-actions in reinforcement learning: the effect on exploratory behavior, independent of learning, and the effect on the speed with which the learning process propagates accurate value information. We empirically measure the separate contributions of these two effects in gridworld and simulated robotic environments. In these environments, both effects were significant, but the effect of value propagation was larger. We also compare the accelerations of value propagation due to macro-actions and eligibility traces in the gridworld environment. Although eligibility traces increased the rate of convergence to the optimal value function compared to learning with macro-actions but without eligibility traces, eligibility traces did not permit the optimal policy to be learned as quickly as it was using macro-actions.
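The value-propagation effect the paper measures can be sketched on a toy task (illustrative, not the paper's gridworld): on a 10-state corridor, TD(lambda) with accumulating eligibility traces spreads the goal reward to every visited state within a single episode, while TD(0) updates only the state adjacent to the goal.

```python
# Hedged sketch of value propagation with eligibility traces on a
# deterministic corridor: states 0..9 step right to a terminal goal at
# state 10, which pays reward 1. One episode suffices to show the
# difference between TD(0) and TD(lambda).

def run(lam, n=10, alpha=0.5, gamma=1.0):
    V = [0.0] * (n + 1)            # V[n] is the terminal goal state
    e = [0.0] * (n + 1)            # eligibility traces
    for s in range(n):             # one deterministic walk to the goal
        r = 1.0 if s + 1 == n else 0.0
        delta = r + gamma * V[s + 1] - V[s]
        e[s] += 1.0                # accumulating trace for current state
        for i in range(n):
            V[i] += alpha * delta * e[i]
            e[i] *= gamma * lam    # decay all traces
    return V

v_td0 = run(lam=0.0)
v_lam = run(lam=0.9)
print(sum(v > 0 for v in v_td0[:10]))   # 1: only the state next to goal
print(sum(v > 0 for v in v_lam[:10]))   # 10: credit reaches every state
```

A macro-action that jumps several states toward the goal would produce a similar multi-step backup, which is the comparison the paper draws.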

47 citations


03 Apr 1998
TL;DR: Empirical Risk Approximation is proposed as a new induction principle for unsupervised learning, developed in analogy to the highly successful statistical learning theory of classification and regression due to Vapnik and Chervonenkis.
Abstract: Unsupervised learning algorithms are designed to extract structure from data without reference to explicit teacher information. The quality of the learned structure is determined by a cost function which guides the learning process. This paper proposes Empirical Risk Approximation as a new induction principle for unsupervised learning. The complexity of the unsupervised learning models is automatically controlled by the two conditions for learning: (i) the empirical risk of learning should uniformly converge towards the expected risk; (ii) the hypothesis class should retain a minimal variety for consistent inference. The maximum entropy principle with deterministic annealing as an efficient search strategy arises from the Empirical Risk Approximation principle as the optimal inference strategy for large learning problems. Parameter selection of learnable data structures is demonstrated for the case of k-means clustering. What is unsupervised learning? Learning algorithms are designed with the goal in mind that they should extract structure from data. Two classes of algorithms have been widely discussed in the literature: supervised and unsupervised learning. The distinction between the two classes relates to supervision or teacher information which is either available to the learning algorithm or missing in the learning process. This paper presents a theory of unsupervised learning which has been developed in analogy to the highly successful statistical learning theory of classification and regression [Vapnik, 1982; Vapnik, 1995]. In supervised learning of classification boundaries or of regression functions the learning algorithm is provided with example points and selects the best candidate function from a set of functions, called the hypothesis class.
Statistical learning theory, developed by Vapnik and Chervonenkis in a series of seminal papers (see [Vapnik, 1982; Vapnik, 1995]), measures the amount of information in a data set which can be used to determine the parameters of the classification or regression models. Computational learning theory [Valiant, 1984] addresses computational problems of supervised learning in addition to the statistical constraints. In this paper I propose a theoretical framework for unsupervised learning based on optimization of a quality functional for structures in data. The learning algorithm extracts an underlying structure from a sample data set under the guidance of a quality measure denoted as learning costs. The extracted structure of the data is encoded by a loss function and it is assumed to produce a learning risk below a predefined risk threshold. This induction principle is referred to as Empirical Risk Approximation (ERA) and is summarized …
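The deterministic-annealing inference strategy that arises from the paper's principle can be sketched for k-means: soft (Gibbs) assignments at temperature T, center re-estimation, and a cooling schedule toward the hard clustering limit. The data, the cooling schedule, and the symmetry-breaking initialization below are all illustrative choices, not the paper's.

```python
import numpy as np

# Hedged sketch of deterministic-annealing k-means: assignments are
# softened by a temperature T, centers are re-estimated, and T is
# cooled toward the hard k-means limit. Two well-separated Gaussian
# clusters serve as toy data.

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-3, 0.5, (50, 2)),    # cluster near (-3, -3)
               rng.normal(3, 0.5, (50, 2))])    # cluster near (3, 3)
mu = np.array([[0.0, 0.0], [0.1, 0.1]])         # nearly coincident centers

T = 8.0
while T > 0.01:
    for _ in range(5):                          # EM-style updates at fixed T
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
        p = np.exp(-(d2 - d2.min(1, keepdims=True)) / T)  # stabilized Gibbs
        p /= p.sum(1, keepdims=True)            # soft assignments
        mu = (p.T @ X) / p.sum(0)[:, None]      # re-estimate centers
    T *= 0.8                                    # cooling schedule
print(np.sort(mu[:, 0]))
```

At high T both centers sit near the global mean; as T falls the solution splits, which is the annealed model-complexity control the abstract describes.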

32 citations


Proceedings ArticleDOI
23 May 1998
TL;DR: A new approach is presented to the composition of learning algorithms (in various models) for classes of constant VC-dimension into learning algorithms for more complicated classes, and it is shown that if a class of constant VC-dimension is PAC-learnable from a class of constant VC-dimension then it is SQ-learnable and PAC-learnable with malicious noise.

14 citations


Journal ArticleDOI
TL;DR: This paper demonstrates worst-case upper bounds on the absolute loss for the Perceptron learning algorithm and the Exponentiated Update learning algorithm, which is related to the Weighted Majority algorithm.

9 citations


Book ChapterDOI
Kuniaki Uehara1
14 Dec 1998
TL;DR: A framework called random case analysis is proposed, which can predict various aspects of a learning algorithm's behavior, requires less computational time than other theoretical analyses, and is easily applied to practical learning algorithms.
Abstract: In machine learning, it is important to reduce the computational time needed to analyze learning algorithms. Some researchers have attempted to understand learning algorithms by experimenting with them on a variety of domains. Others have presented theoretical analyses of learning algorithms using approximate mathematical models. A mathematical model has the deficiency that, if it is too simplified, it may lose the essential behavior of the original algorithm. Furthermore, experimental analyses are based only on informal analyses of the learning task, whereas theoretical analyses address the worst case. Therefore, the results of theoretical analyses are often quite different from empirical results. In our framework, called random case analysis, we adopt the idea of randomized algorithms. Random case analysis can predict various aspects of a learning algorithm's behavior and requires less computational time than other theoretical analyses. Furthermore, the framework is easily applied to practical learning algorithms.

5 citations


Proceedings ArticleDOI
10 Nov 1998
TL;DR: Experimental studies exhibit that the string representation of genetic algorithms (GA) is a key issue in determining the suitable network structures and the performances of function approximation for the two learning algorithms.
Abstract: Neural networks based on wavelets are constructed to study function learning problems. Two types of learning algorithms, the overall multilevel learning (OML) and the pyramidal multilevel learning (PML) with genetic neuron selection, are comparatively studied for convergence rate and accuracy using data samples of a piecewise-defined signal. Moreover, the two algorithms are examined using orthogonal and non-orthogonal bases. Experimental studies exhibit that the string representation of genetic algorithms (GA) is a key issue in determining the suitable network structures and the function-approximation performance of the two learning algorithms.
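The genetic neuron selection idea can be sketched under illustrative assumptions (a Haar wavelet basis, a simple mutation-and-selection GA, and a toy piecewise signal; none of these come from the paper): a bitstring marks which wavelet atoms are active, and fitness is the least-squares fit error of the active atoms plus a small size penalty.

```python
import numpy as np

# Hedged sketch of GA-based neuron (atom) selection for a wavelet
# network: each individual is a bitstring over a Haar dictionary, and
# fitness is the negative penalized least-squares error on a piecewise
# signal. Basis, GA operators, and signal are illustrative.

rng = np.random.default_rng(4)
t = np.linspace(0, 1, 128, endpoint=False)
signal = np.where(t < 0.5, 1.0, -1.0) + 0.3 * np.sin(8 * np.pi * t)

def haar_atoms(levels=4):
    """Constant atom plus Haar wavelets down to the given level."""
    atoms = [np.ones_like(t)]
    for j in range(levels):
        for k in range(2 ** j):
            a = np.zeros_like(t)
            lo, mid, hi = k / 2**j, (k + 0.5) / 2**j, (k + 1) / 2**j
            a[(t >= lo) & (t < mid)] = 1.0
            a[(t >= mid) & (t < hi)] = -1.0
            atoms.append(a * 2 ** (j / 2))
    return np.array(atoms)

atoms = haar_atoms()

def fitness(bits):
    """Negative MSE of the best fit using the active atoms, minus a size cost."""
    idx = np.flatnonzero(bits)
    if idx.size == 0:
        return -np.inf
    Phi = atoms[idx].T
    w, *_ = np.linalg.lstsq(Phi, signal, rcond=None)
    return -np.mean((Phi @ w - signal) ** 2) - 0.001 * idx.size

pop = rng.integers(0, 2, (20, len(atoms)))
for _ in range(40):
    scores = np.array([fitness(b) for b in pop])
    parents = pop[np.argsort(scores)[-10:]]          # keep best half
    children = parents[rng.integers(0, 10, 20)].copy()
    flip = rng.random(children.shape) < 0.05         # bit-flip mutation
    children[flip] ^= 1
    pop = children
best = max(pop, key=fitness)
print(round(float(-fitness(best)), 3))
```

The bitstring encoding here is exactly the "string representation" the abstract flags: it determines which network structures the GA can reach.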

3 citations


Proceedings Article
01 Jan 1998
TL;DR: It is pointed out that algorithms based on explicit modeling of the elites' distribution tend to converge to undesirable local optima, and the algorithm is modified to overcome this defect.
Abstract: Population search algorithms for optimization problems, such as genetic algorithms, are an effective way to find an optimal value, especially when we have little information about the objective function. Baluja has proposed effective algorithms that model the distribution of elites explicitly with a statistical model. We propose such an algorithm based on Gaussian modeling of elites, and analyze the convergence property of the algorithm by defining the objective function as a stochastic model. We point out that algorithms based on explicit modeling of the elites' distribution tend to converge to undesirable local optima, and we modify the algorithm to overcome this defect.
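The elite-modeling scheme can be sketched as a Gaussian estimation-of-distribution loop (objective and settings are illustrative, not the paper's): sample a population, keep the fittest fraction, refit the Gaussian to those elites, and repeat. The rapid shrinkage of the fitted variance is one way such algorithms can converge prematurely, which is the defect the abstract analyzes.

```python
import numpy as np

# Hedged sketch of a Gaussian estimation-of-distribution algorithm:
# repeatedly fit a Gaussian to the elite samples and resample from it.
# The toy objective has a single maximum at x = 2.

rng = np.random.default_rng(3)

def f(x):
    return -(x - 2.0) ** 2            # fitness, maximized at x = 2

mean, std = 0.0, 2.0
for _ in range(30):
    pop = rng.normal(mean, std, 100)
    elites = pop[np.argsort(f(pop))[-20:]]   # top 20% by fitness
    mean, std = elites.mean(), elites.std() + 1e-12
print(round(float(mean), 2))
```

A simple guard in the same spirit as the abstract's modification (though the paper's actual fix may differ) is to enforce a lower bound on std so the search distribution cannot collapse before locating the optimum.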

Journal ArticleDOI
TL;DR: A number of recent results in statistical learning theory are summarised in the context of nonlinear system identification, leading to several characterisation results.