
Showing papers on "Generalization published in 1992"


Proceedings ArticleDOI
01 Jul 1992
TL;DR: A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented, applicable to a wide variety of classification functions, including Perceptrons, polynomials, and Radial Basis Functions.
Abstract: A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented. The technique is applicable to a wide variety of classification functions, including Perceptrons, polynomials, and Radial Basis Functions. The effective number of parameters is adjusted automatically to match the complexity of the problem. The solution is expressed as a linear combination of supporting patterns. These are the subset of training patterns that are closest to the decision boundary. Bounds on the generalization performance based on the leave-one-out method and the VC-dimension are given. Experimental results on optical character recognition problems demonstrate the good generalization obtained when compared with other learning algorithms.

11,211 citations
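
A minimal sketch of the margin-maximization idea, using scikit-learn's SVC as a stand-in for the paper's solver; the library, dataset, and hyperparameters below are illustrative assumptions, not the authors' setup:

```python
# Minimal sketch of margin maximization with a linear kernel, using
# scikit-learn's SVC (not the paper's original solver).  The "supporting
# patterns" of the abstract correspond to SVC's support vectors.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=200, centers=2, cluster_std=1.2, random_state=0)

clf = SVC(kernel="linear", C=1e6)   # large C approximates a hard margin
clf.fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
margin = 2.0 / np.linalg.norm(w)    # geometric margin between the two bounding hyperplanes

print("number of supporting patterns:", len(clf.support_))
print("margin width:", margin)
# The decision function is a linear combination of the supporting patterns:
# f(x) = sum_i alpha_i * y_i * <x_i, x> + b
```

With a nearly hard margin, the classifier is determined by the supporting patterns alone, which is the sparsity the abstract describes.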


Journal ArticleDOI
TL;DR: The conclusion is that for almost any real-world generalization problem one should use some version of stacked generalization to minimize the generalization error rate.

5,834 citations
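
A hedged sketch of the stacked-generalization recipe; the level-0 and level-1 learners and the dataset are arbitrary choices for illustration. Level-0 models are trained on folds, and their out-of-fold predictions become the inputs of a level-1 model:

```python
# Sketch of stacked generalization: level-0 learners are trained on folds,
# their out-of-fold predictions form the inputs of a level-1 learner.
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.model_selection import KFold, cross_val_score
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor
from sklearn.neighbors import KNeighborsRegressor

X, y = load_diabetes(return_X_y=True)
level0 = [Ridge(alpha=1.0), DecisionTreeRegressor(max_depth=4), KNeighborsRegressor(5)]

# Build the level-1 training set from out-of-fold predictions.
meta_X = np.zeros((len(y), len(level0)))
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    for j, model in enumerate(level0):
        model.fit(X[train_idx], y[train_idx])
        meta_X[test_idx, j] = model.predict(X[test_idx])

level1 = Ridge(alpha=1.0)            # the generalizer of the generalizers
# Rough indication only; a fully nested evaluation would rebuild meta_X inside each outer fold.
score = cross_val_score(level1, meta_X, y, cv=5).mean()
print("cross-validated R^2 of the stacked model:", round(score, 3))
```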


Journal ArticleDOI
TL;DR: A generalization of the numerical renormalization-group procedure used first by Wilson for the Kondo problem is presented and it is shown that this formulation is optimal in a certain sense.
Abstract: A generalization of the numerical renormalization-group procedure used first by Wilson for the Kondo problem is presented. It is shown that this formulation is optimal in a certain sense. As a demonstration of the effectiveness of this approach, results from numerical real-space renormalization-group calculations for Heisenberg chains are presented.

5,625 citations
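
The central step of the generalization, truncating a block basis to the dominant eigenvectors of its reduced density matrix, can be sketched numerically; the full sweeping algorithm and the Heisenberg-chain results of the paper are not reproduced here:

```python
# Core step of the density-matrix renormalization group: truncate a block's
# basis to the m dominant eigenvectors of its reduced density matrix.
# Sketch for an 8-site spin-1/2 Heisenberg chain split into two 4-site halves.
import numpy as np
from functools import reduce

sx = np.array([[0, 1], [1, 0]]) / 2.0
sy = np.array([[0, -1j], [1j, 0]]) / 2.0
sz = np.array([[1, 0], [0, -1]]) / 2.0
I2 = np.eye(2)

def site_op(op, i, L):
    """Embed a single-site operator at site i of an L-site chain."""
    ops = [I2] * L
    ops[i] = op
    return reduce(np.kron, ops)

L = 8
H = sum(site_op(s, i, L) @ site_op(s, i + 1, L)
        for i in range(L - 1) for s in (sx, sy, sz))

evals, evecs = np.linalg.eigh(H)
ground = evecs[:, 0]                                 # ground state of the full chain

psi = ground.reshape(2 ** (L // 2), 2 ** (L // 2))   # block x environment
rho = psi @ psi.conj().T                             # reduced density matrix of the block
w = np.linalg.eigvalsh(rho)[::-1]                    # eigenvalues, descending

m = 8                                                # number of block states kept
print("truncation error with m = 8:", 1.0 - w[:m].sum())
```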



Journal ArticleDOI
TL;DR: A generalization of Allen's interval-based approach to temporal reasoning is presented; the notion of 'conceptual neighborhood' of qualitative relations between events is central to the approach, which uses semi-intervals rather than intervals as the basic units of knowledge.

701 citations


Journal ArticleDOI
TL;DR: It is shown that for smooth networks, i.e., those with continuously varying weights and smooth transfer functions, the generalization curve asymptotically obeys an inverse power law, while for nonsmooth networks other behaviors can appear, depending on the nature of the nonlinearities as well as the realizability of the rule.
Abstract: Learning from examples in feedforward neural networks is studied within a statistical-mechanical framework. Training is assumed to be stochastic, leading to a Gibbs distribution of networks characterized by a temperature parameter T. Learning of realizable rules as well as of unrealizable rules is considered. In the latter case, the target rule cannot be perfectly realized by a network of the given architecture. Two useful approximate theories of learning from examples are studied: the high-temperature limit and the annealed approximation. Exact treatment of the quenched disorder generated by the random sampling of the examples leads to the use of the replica theory. Of primary interest is the generalization curve, namely, the average generalization error $\epsilon_g$ versus the number of examples P used for training. The theory implies that, for a reduction in $\epsilon_g$ that remains finite in the large-N limit, P should generally scale as $\alpha N$, where N is the number of independently adjustable weights in the network. We show that for smooth networks, i.e., those with continuously varying weights and smooth transfer functions, the generalization curve asymptotically obeys an inverse power law. In contrast, for nonsmooth networks other behaviors can appear, depending on the nature of the nonlinearities as well as the realizability of the rule. In particular, a discontinuous learning transition from a state of poor to a state of perfect generalization can occur in nonsmooth networks learning realizable rules. We illustrate both gradual and continuous learning with a detailed analytical and numerical study of several single-layer perceptron models. Comparing with the exact replica theory of perceptron learning, we find that for realizable rules the high-temperature and annealed theories provide very good approximations to the generalization performance. Assuming this to hold for multilayer networks as well, we propose a classification of possible asymptotic forms of learning curves in general realizable models. For unrealizable rules we find that the above approximations fail in general to predict correctly the shapes of the generalization curves. Another indication of the important role of quenched disorder for unrealizable rules is that the generalization error is not necessarily a monotonically increasing function of temperature. Also, unrealizable rules can possess genuine spin-glass phases indicative of degenerate minima separated by high barriers.

461 citations
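
A numerical companion to the notion of a generalization curve, assuming the simplest setting: a Hebbian student imitating a teacher perceptron. This is not the Gibbs ensemble or replica calculation of the paper, just an illustrative estimate of epsilon_g versus alpha = P/N:

```python
# Numerical estimate of a generalization curve epsilon_g(alpha) for a
# Hebbian student imitating a teacher perceptron (a simple realizable rule).
# For perceptrons the generalization error is arccos(R)/pi, with R the
# normalized overlap between student and teacher weight vectors.
import numpy as np

rng = np.random.default_rng(1)
N = 200                                     # number of adjustable weights

def hebb_learning_curve(alphas, trials=20):
    curve = []
    for a in alphas:
        P = int(a * N)                      # number of training examples
        errs = []
        for _ in range(trials):
            teacher = rng.normal(size=N)
            X = rng.normal(size=(P, N))
            y = np.sign(X @ teacher)
            student = (y[:, None] * X).sum(axis=0)          # Hebb rule
            R = student @ teacher / (np.linalg.norm(student) * np.linalg.norm(teacher))
            errs.append(np.arccos(R) / np.pi)
        curve.append(np.mean(errs))
    return curve

alphas = [0.5, 1, 2, 4, 8, 16]
for a, e in zip(alphas, hebb_learning_curve(alphas)):
    print(f"alpha = {a:>4}:  epsilon_g ~ {e:.3f}")
```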


Journal ArticleDOI
TL;DR: It is proved that AC-5, in conjunction with node consistency, provides a decision procedure for these constraints running in time $O(ed)$ and has an important application in constraint logic programming over finite domains.

450 citations
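
AC-5 itself is a generic algorithm parameterized by the constraint class. As a hedged, simpler illustration of arc-consistency filtering, here is AC-3 over finite domains; the specializations that give the $O(ed)$ bound for functional, anti-functional, and monotonic constraints are not reproduced:

```python
# Hedged illustration: AC-3 arc-consistency filtering over finite domains
# (a simpler relative of AC-5, without its constraint-class specializations).
from collections import deque

def ac3(domains, constraints):
    """domains: {var: set of values}; constraints: {(x, y): predicate(vx, vy)}."""
    # Treat each binary constraint as a directed arc in both directions.
    arcs = {(x, y): c for (x, y), c in constraints.items()}
    arcs.update({(y, x): (lambda vy, vx, c=c: c(vx, vy)) for (x, y), c in constraints.items()})
    queue = deque(arcs)
    while queue:
        x, y = queue.popleft()
        # Remove every value of x that has no support in the domain of y.
        removed = {vx for vx in domains[x]
                   if not any(arcs[(x, y)](vx, vy) for vy in domains[y])}
        if removed:
            domains[x] -= removed
            queue.extend((z, x) for (z, w) in arcs if w == x and z != y)
    return domains

# Example: x < y < z over 1..4 leaves only the arc-consistent values.
doms = {v: set(range(1, 5)) for v in "xyz"}
cons = {("x", "y"): lambda a, b: a < b, ("y", "z"): lambda a, b: a < b}
print(ac3(doms, cons))   # {'x': {1, 2}, 'y': {2, 3}, 'z': {3, 4}}
```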


Journal ArticleDOI
TL;DR: In this article, the assumptions of conventional effective-mass theory, especially that of continuity of the envelope function at an abrupt interface, are reviewed critically, making the need for a fresh approach apparent.
Abstract: The assumptions of conventional effective-mass theory, especially the one of continuity of the envelope function at an abrupt interface, are reviewed critically so that the need for a fresh approach becomes apparent. A new envelope-function method, developed by the author over the past few years, is reviewed. This new method is based on both a generalization and a novel application to microstructures of the Luttinger-Kohn envelope-function expansion. The differences between this new method and the conventional envelope-function method are emphasized. An alternative derivation of the new envelope-function equations, which are exact, to that already published is provided. A new and improved derivation of the author's effective-mass equation is given, in which the differences in the zone-centre eigenstates of the constituent crystals are taken into account. The cause of the kinks in the conventional effective-mass envelope function, at abrupt effective-mass changes, is identified.

375 citations


Book
01 Jan 1992

311 citations


Book ChapterDOI
01 Jun 1992
TL;DR: The scheme is viewed as a generalization of the inference methods of classical time-series analysis in the sense that it allows description of non-linear, multivariate dynamic systems with complex conditional independence structures.
Abstract: A computational scheme for reasoning about dynamic systems using (causal) probabilistic networks is presented. The scheme is based on the framework of Lauritzen and Spiegelhalter (1988), and may be viewed as a generalization of the inference methods of classical time-series analysis in the sense that it allows description of non-linear, multivariate dynamic systems with complex conditional independence structures. Further, the scheme provides a method for efficient backward smoothing and possibilities for efficient, approximate forecasting methods. The scheme has been implemented on top of the HUGIN shell.

217 citations
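
A much-simplified illustration, assuming a single discrete state variable per time slice (an HMM). The paper's junction-tree machinery for multivariate slices is not reproduced, but the forward-filtering/backward-smoothing pattern is the same:

```python
# Hedged, much-simplified illustration: forward filtering and backward
# smoothing for a single discrete hidden chain (one state variable per time
# slice).  General multivariate slices would require junction-tree propagation.
import numpy as np

A = np.array([[0.9, 0.1],      # state transition matrix P(x_t | x_{t-1})
              [0.2, 0.8]])
B = np.array([[0.8, 0.2],      # observation matrix P(y_t | x_t)
              [0.3, 0.7]])
prior = np.array([0.5, 0.5])
obs = [0, 0, 1, 1, 0]          # observed symbols y_1..y_T

# Forward filtering: alpha_t(x) = P(x_t = x | y_1..y_t)
alphas = []
a = prior * B[:, obs[0]]
alphas.append(a / a.sum())
for y in obs[1:]:
    a = (alphas[-1] @ A) * B[:, y]
    alphas.append(a / a.sum())

# Backward smoothing: gamma_t(x) = P(x_t = x | y_1..y_T)
gammas = [alphas[-1]]
for t in range(len(obs) - 2, -1, -1):
    joint = alphas[t][:, None] * A              # shape (x_t, x_{t+1})
    cond = joint / joint.sum(axis=0)            # P(x_t | x_{t+1}, y_1..y_t)
    gammas.insert(0, cond @ gammas[0])
print(np.round(np.array(gammas), 3))
```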


Proceedings Article
30 Nov 1992
TL;DR: It is shown that even high-order polynomial classifiers in high dimensional spaces can be trained with a small amount of training data and yet generalize better than classifiers with a smaller VC-dimension.
Abstract: Large VC-dimension classifiers can learn difficult tasks, but are usually impractical because they generalize well only if they are trained with huge quantities of data. In this paper we show that even high-order polynomial classifiers in high dimensional spaces can be trained with a small amount of training data and yet generalize better than classifiers with a smaller VC-dimension. This is achieved with a maximum margin algorithm (the Generalized Portrait). The technique is applicable to a wide variety of classifiers, including Perceptrons, polynomial classifiers (sigma-pi unit networks) and Radial Basis Functions. The effective number of parameters is adjusted automatically by the training algorithm to match the complexity of the problem. It is shown to equal the number of training patterns that are closest to the decision boundary (supporting patterns). Bounds on the generalization error and the speed of convergence of the algorithm are given. Experimental results on handwritten digit recognition demonstrate good generalization compared to other algorithms.
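
A hedged stand-in for the experiment: a high-degree polynomial classifier trained by a maximum-margin solver (scikit-learn's SVC here, on the small built-in digits set rather than the original handwritten-digit data):

```python
# Hedged stand-in for the paper's experiment: a high-order polynomial
# classifier trained with a maximum-margin solver (scikit-learn's SVC),
# evaluated on the small built-in digits set rather than the original data.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X / 16.0, y, train_size=500, random_state=0)    # deliberately small training set

clf = SVC(kernel="poly", degree=4, C=10.0)           # high-capacity polynomial classifier
clf.fit(X_train, y_train)

print("training patterns :", len(X_train))
print("supporting patterns:", clf.support_vectors_.shape[0])
print("test accuracy      :", round(clf.score(X_test, y_test), 3))
```

Despite the enormous VC-dimension of a degree-4 polynomial in this input space, the margin-maximizing solution depends only on the supporting patterns, which is the point of the abstract.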

Journal ArticleDOI
TL;DR: Two schemes for simplifying quantifier alternation, called Skolemization and raising, are presented and various optimizations on the general unification search problem are discussed.

Journal ArticleDOI
TL;DR: This paper summarizes the results of a retrospective review of generalization in the context of social skills research with preschool children and reveals some differences concerning the practices employed by studies within each group.
Abstract: This paper summarizes the results of a retrospective review of generalization in the context of social skills research with preschool children. A review of studies from 22 journals (1976 to 1990) that assessed generalization as part of social interaction research provided information concerning the prevalence of studies that have assessed generalization, common practices concerning the production and assessment of generalization, and the overall success of obtaining generalization and maintenance of social behaviors. A comparison of the most and least successful studies, with respect to generalization, revealed some differences concerning the practices employed by studies within each group. Differences differentially related to the production of generalization are discussed and recommendations are provided to guide and support future research efforts.

Journal ArticleDOI
TL;DR: A new set of algorithms for locally adaptive line generalization, based on the so-called natural principle of objective generalization, is described and compared with benchmarks based on both manual cartographic procedures and a standard method found in many geographical information systems.
Abstract: This article describes a new set of algorithms for locally adaptive line generalization based on the so-called natural principle of objective generalization. The drawbacks of existing methods of line generalization are briefly discussed and the algorithms described. The performance of these new methods is compared with benchmarks based on both manual cartographic procedures and a standard method found in many geographical information systems.
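
The article's own locally adaptive algorithms are not spelled out in the abstract; the kind of "standard method found in many geographical information systems" that it benchmarks against is typically a Douglas-Peucker routine, sketched here as an assumption:

```python
# Hedged sketch of the Douglas-Peucker routine, the kind of standard GIS
# line-generalization method used as a benchmark in the article (the article's
# own locally adaptive algorithms are not shown).
import math

def perpendicular_distance(p, a, b):
    """Distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    if (ax, ay) == (bx, by):
        return math.hypot(px - ax, py - ay)
    num = abs((by - ay) * px - (bx - ax) * py + bx * ay - by * ax)
    return num / math.hypot(bx - ax, by - ay)

def douglas_peucker(points, tolerance):
    """Recursively keep the point farthest from the chord if it exceeds tolerance."""
    if len(points) < 3:
        return list(points)
    dists = [perpendicular_distance(p, points[0], points[-1]) for p in points[1:-1]]
    i = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[i - 1] <= tolerance:
        return [points[0], points[-1]]
    left = douglas_peucker(points[: i + 1], tolerance)
    right = douglas_peucker(points[i:], tolerance)
    return left[:-1] + right          # avoid duplicating the split point

line = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)]
print(douglas_peucker(line, tolerance=0.5))
```

Because the tolerance is fixed for the whole line, this benchmark is not locally adaptive, which is exactly the drawback the article's methods address.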

01 Jan 1992
TL;DR: This paper studies the generalization performance of backpropagation learning on a syllabification task, in the context of connectionism and natural language processing.
Abstract: Published as: Daelemans, W. M. P., & Bosch, A. P. J. (1992). Generalization performance of backpropagation learning on a syllabification task. In M. F. J. Drossaers & A. Nijholt (Eds.), Connectionism and natural language processing: Proceedings of the Third Twente Workshop on Language Technology, TWLT3, Enschede, May 12-13, 1992 (organized by Project Parlevink) (Vol. 3, pp. 27-38). (Memoranda Informatica; Vol. 3, No. 92-64). University of Twente, Department of Computer Science.

Journal ArticleDOI
TL;DR: An algorithm is presented for finding, in a multicriteria network, Pareto-optimal paths, one for each efficient objective vector; it generalizes an earlier algorithm for the bicriterion case.
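
A hedged sketch of the general idea, not the paper's specific algorithm: a label-correcting search that keeps, at every node, the set of cost vectors not dominated by any other label:

```python
# Hedged sketch of multicriteria path search: a label-correcting loop that
# keeps, at each node, every cost vector not dominated by another label
# (not the specific algorithm of the paper).  Two objectives are assumed here.
from collections import deque

def dominates(a, b):
    """a dominates b if a is componentwise <= b and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and a != b

def pareto_paths(graph, source, target):
    """graph: {u: [(v, (c1, c2)), ...]}.  Returns nondominated cost vectors at target."""
    labels = {u: set() for u in graph}           # nondominated cost vectors per node
    labels[source].add((0, 0))
    queue = deque([(source, (0, 0))])
    while queue:
        u, cost = queue.popleft()
        if cost not in labels[u]:                # label was pruned after being queued
            continue
        for v, edge_cost in graph.get(u, []):
            new = tuple(c + e for c, e in zip(cost, edge_cost))
            if any(dominates(old, new) or old == new for old in labels[v]):
                continue
            labels[v] = {old for old in labels[v] if not dominates(new, old)}
            labels[v].add(new)
            queue.append((v, new))
    return sorted(labels[target])

# Two objectives, e.g. length and cost: both s->a->t and s->b->t are Pareto-optimal.
graph = {"s": [("a", (1, 4)), ("b", (3, 1))],
         "a": [("t", (1, 4))],
         "b": [("t", (3, 1))],
         "t": []}
print(pareto_paths(graph, "s", "t"))    # [(2, 8), (6, 2)]
```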

Journal ArticleDOI
TL;DR: In this paper, a vector-valued variational principle is presented, using a general concept of ∊-efficiency and a nonconvex separation theorem.
Abstract: This paper presents a vector-valued variational principle by using a general concept of ∊-efficiency and a nonconvex separation theorem.

Journal ArticleDOI
TL;DR: It is shown that a weight decay of the same size as the variance of the noise on the teacher improves the generalization and suppresses the overfitting, and that weight noise and output noise act similarly above the transition at alpha = 1.
Abstract: The authors study the evolution of the generalization ability of a simple linear perceptron with N inputs which learns to imitate a 'teacher perceptron'. The system is trained on p = alpha N example inputs drawn from some distribution and the generalization ability is measured by the average agreement with the teacher on test examples drawn from the same distribution. The dynamics may be solved analytically and exhibits a phase transition from imperfect to perfect generalization at alpha = 1, when there are no errors (static noise) in the training examples. If the examples are produced by an erroneous teacher, overfitting is observed, i.e. the generalization error starts to increase after a finite time of training. It is shown that a weight decay of the same size as the variance of the noise (errors) on the teacher improves the generalization and suppresses the overfitting. The generalization error as a function of time is calculated numerically for various values of the parameters. Finally dynamic noise in the training is considered. White noise on the input corresponds on average to a weight decay, and can thus improve generalization, whereas white noise on the weights or the output degrades generalization. Generalization is particularly sensitive to noise on the weights (for alpha < 1), where it makes the error constantly increase with time, but this effect is also shown to be damped by a weight decay. Weight noise and output noise act similarly above the transition at alpha = 1.
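
A small numerical companion, with normalizations that differ from the paper, so only the qualitative picture carries over: without weight decay the test error of a linear student passes through a minimum and then rises (overfitting), and a suitable weight decay suppresses this:

```python
# Hedged numerical companion: a linear student learning a noisy linear
# "teacher" with alpha = 2.  Without weight decay the test error passes
# through a minimum and then rises (overfitting); a nonzero weight decay
# (ridge term) suppresses this.  Normalizations differ from the paper.
import numpy as np

rng = np.random.default_rng(0)
N, P, sigma = 100, 200, 0.5                      # weights, examples, noise level

teacher = rng.normal(size=N)
X = rng.normal(size=(P, N)) / np.sqrt(N)
y = X @ teacher + sigma * rng.normal(size=P)     # erroneous teacher outputs
X_test = rng.normal(size=(5000, N)) / np.sqrt(N)
y_test = X_test @ teacher                        # noise-free targets for testing

def test_error(w):
    return np.mean((X_test @ w - y_test) ** 2)

# Gradient descent without weight decay: the error rises again after its minimum.
w, lr, errs = np.zeros(N), 0.1, []
for _ in range(5000):
    w -= lr * (X.T @ (X @ w - y))
    errs.append(test_error(w))
print(f"no decay : min error {min(errs):.4f}, error after long training {errs[-1]:.4f}")

# Ridge (weight-decay) solutions: some nonzero decay beats no decay.
for lam in (0.0, 0.1, 0.25, 1.0):
    w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(N), X.T @ y)
    print(f"decay {lam:4.2f}: error {test_error(w_ridge):.4f}")
```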

Journal ArticleDOI
TL;DR: In this article, a generalization of the standard normalized quadratic form has been proposed, which can provide a local second-order approximation while maintaining the correct curvature globally.
Abstract: In this paper, the authors propose and estimate a system of producer output supply and input demand functions that generalizes the standard normalized quadratic form. The generalization adds either linear or quadratic splines in a time (or technical change) variable, yet retains the main attractive property of the normalized quadratic, which is that it can provide a local second order approximation while maintaining the correct curvature globally. However, the generalization has additional desirable approximation properties with respect to the splined variable and, thus, permits a more flexible treatment of technical change than is provided by standard flexible functional forms. Copyright 1992 by Economics Department of the University of Pennsylvania and the Osaka University Institute of Social and Economic Research Association.

Journal ArticleDOI
TL;DR: In this article, a new learning algorithm for the one-layer perceptron is presented, which aims to maximize the generalization gain per example by maximizing the expected stability of the example in the teacher perceptron.
Abstract: A new learning algorithm for the one-layer perceptron is presented. It aims to maximize the generalization gain per example. Analytical results are obtained for the case of single presentation of each example. The weight attached to a Hebbian term is a function of the expected stability of the example in the teacher perceptron. This leads to upper bounds for the generalization ability. This scheme can be iterated and the results of numerical simulations show that it converges, within errors, to the theoretical optimal generalization ability of the Bayes algorithm. Analytical and numerical results for an algorithm with maximized generalization in the learning strategy with selection of examples are obtained and it is proved that, as expected, orthogonal selection is optimal. Exponential decay of the generalization error is obtained for the single presentation of selected examples.

Book ChapterDOI
01 Jan 1992
TL;DR: An improved version of a self-organizing network model, proposed at ICANN-91 and since applied to various problems, is described; the improvements are the generalization of the model to arbitrary dimension and the introduction of a local estimate of the probability density.
Abstract: In this paper an improved version of a self-organizing network model is described which was proposed at ICANN-91 [3] and has since been applied to various problems [1,2,5]. The improvements presented here are the generalization of the model to arbitrary dimension and the introduction of a local estimate of the probability density. The latter leads to a very clear distinction between necessary and superfluous neurons with respect to modeling a given probability distribution. This makes it possible to automatically generate network structures that are nearly optimally suited for the distribution at hand.
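
A simplified growing-network sketch in the spirit of the model; the simplex topology of arbitrary dimension and the local probability-density estimate described in the abstract are omitted, and all parameters are illustrative:

```python
# Simplified sketch of a growing self-organizing network: units adapt toward
# the inputs, accumulate a local signal counter, and a new unit is inserted
# periodically next to the busiest one.  The paper's model additionally keeps
# a k-dimensional simplex topology and a local density estimate, omitted here.
import numpy as np

rng = np.random.default_rng(0)

def ring_data(n):
    """Toy input distribution: points on a noisy ring."""
    theta = rng.uniform(0, 2 * np.pi, n)
    return np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.normal(size=(n, 2))

units = rng.normal(size=(2, 2))          # start with two units in R^2
counters = np.zeros(2)
eps_winner, eps_others, insert_every = 0.1, 0.002, 200

for t, x in enumerate(ring_data(5000), start=1):
    d = np.linalg.norm(units - x, axis=1)
    winner = int(np.argmin(d))
    counters[winner] += 1.0
    units[winner] += eps_winner * (x - units[winner])        # move winner toward input
    units += eps_others * (x - units)                        # weak global adaptation
    counters *= 0.995                                        # forget old signals
    if t % insert_every == 0:                                # grow the network
        busiest = int(np.argmax(counters))
        nearest = int(np.argsort(np.linalg.norm(units - units[busiest], axis=1))[1])
        units = np.vstack([units, 0.5 * (units[busiest] + units[nearest])])
        counters = np.append(counters, counters[busiest] / 2.0)
        counters[busiest] /= 2.0

print("final number of units:", len(units))
```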

Journal ArticleDOI
TL;DR: In this article, the authors question the independence assumption by theoretically integrating situational variables into the validity generalization estimation process, which is based on the assumption that the effects of statistical artifacts on validities are independent of the effects on situational moderators.
Abstract: A primary objective of validity generalization analysis is to decompose the between-situation variance in validities into (a) variance attributable to between-situation differences in statistical artifacts and (b) variance attributable to between-situation differences in (unidentified) situational moderators. This process is based on the assumption that the effects of statistical artifacts on validities are independent of the effects of situational moderators on validities. The present article seeks to question the independence assumption by theoretically integrating situational variables into the validity generalization estimation process.
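
A bare-bones sketch of the variance decomposition the article builds on, with hypothetical numbers: the observed between-study variance of validities minus the variance expected from sampling error alone. The article's point, that situational moderators should be integrated into this step rather than assumed independent of the artifacts, is not modeled:

```python
# Bare-bones sketch of the variance decomposition underlying validity
# generalization: observed variance of validities minus the variance expected
# from sampling error alone.  Numbers are hypothetical.
import numpy as np

r = np.array([0.10, 0.45, 0.15, 0.50, 0.28, 0.05])   # observed validities per study
n = np.array([  68,  120,   85,   45,  150,   95])   # sample size per study

r_bar = np.average(r, weights=n)                      # sample-size-weighted mean validity
observed_var = np.average((r - r_bar) ** 2, weights=n)

# Expected sampling-error variance of a correlation (classic approximation).
sampling_var = (1 - r_bar ** 2) ** 2 / (n.mean() - 1)

residual_var = max(observed_var - sampling_var, 0.0)  # attributed to (situational) moderators
print(f"mean validity          : {r_bar:.3f}")
print(f"observed variance      : {observed_var:.4f}")
print(f"sampling-error variance: {sampling_var:.4f}")
print(f"residual variance      : {residual_var:.4f}")
print(f"percent variance due to sampling error: {100 * sampling_var / observed_var:.1f}%")
```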

Book ChapterDOI
01 Jun 1992
TL;DR: The fundamental updating process in the transferable belief model is related to the concept of specialization and can be described by a specialization matrix, and it is shown that Dempster's rule of conditioning corresponds essentially to the least committed specialization.
Abstract: The fundamental updating process in the transferable belief model is related to the concept of specialization and can be described by a specialization matrix. The degree of belief in the truth of a proposition is a degree of justified support. The Principle of Minimal Commitment implies that one should never give more support to the truth of a proposition than justified. We show that Dempster's rule of conditioning corresponds essentially to the least committed specialization, and that Dempster's rule of combination results essentially from commutativity requirements. The concept of generalization, dual to the concept of specialization, is described.
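
A hedged sketch of the unnormalized rule of conditioning on mass functions: each focal set transfers its mass to its intersection with the evidence, and mass that lands on the empty set is kept rather than renormalized away, as in the transferable belief model. The specialization-matrix formulation of the paper is not reproduced:

```python
# Hedged sketch: unnormalized Dempster conditioning on mass functions
# (the transferable belief model keeps the mass transferred to the empty set
# instead of renormalizing).  Focal sets are modelled as frozensets.

def condition(m, b):
    """Transfer the mass of each focal set C to C & b (unnormalized conditioning)."""
    out = {}
    for c, mass in m.items():
        a = c & b
        out[a] = out.get(a, 0.0) + mass
    return out

frame = frozenset({"x", "y", "z"})
m = {frozenset({"x"}): 0.3,
     frozenset({"x", "y"}): 0.4,
     frame: 0.3}                       # mass on the whole frame = unassigned belief

b = frozenset({"y", "z"})              # evidence: the truth lies in {y, z}
m_cond = condition(m, b)
for focal, mass in sorted(m_cond.items(), key=lambda kv: -kv[1]):
    print(set(focal) or "{} (conflict)", mass)
```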

Book ChapterDOI
TL;DR: In this article, the minimality and realization theory for discrete time-varying finite dimensional linear systems with time-varying state spaces is developed; the results appear as a natural generalization of the corresponding theory for the time-independent case.
Abstract: The minimality and realization theory is developed for discrete time-varying finite dimensional linear systems with time-varying state spaces. The results appear as a natural generalization of the corresponding theory for the time-independent case. Special attention is paid to periodical systems. The case when the state space dimensions do not change in time is re-examined.

Journal ArticleDOI
01 Jan 1992
TL;DR: The project described in this article had two primary objectives: to design a strategy for terrain generalization that is adaptive to different terrain types, scales, and map purposes, and to implement and evaluate some components of this approach to assess its potential.
Abstract: The project described in this article had two primary objectives: to design a strategy for terrain generalization that is adaptive to different terrain types, scales, and map purposes, and to implement and evaluate some components of this approach to assess its potential. The strategy includes three different generalization methods: a global filtering procedure, a selective (iterative) filtering method, and a heuristic approach based on the generalization of the terrain's structure lines. For a given generalization problem that is constrained by the terrain character, map objective, scale, graphic limits, and data quality, the appropriate technique is selected through structure and process recognition procedures. Some of the key components of the strategy have been implemented and some experiments were conducted. Other parts were covered by proposing models that could serve as implementation guidelines. Our work was intended to break ground for future research. Recommendations for appropriate parameter se...

Journal ArticleDOI
TL;DR: It is found that, in some cases, the average generalization of neural networks trained on a variety of simple functions is significantly better than the VC bound: the approach to perfect performance is exponential in the number of examples m, rather than the 1/m result of the bound.
Abstract: We describe a series of numerical experiments that measure the average generalization capability of neural networks trained on a variety of simple functions. These experiments are designed to test the relationship between average generalization performance and the worst-case bounds obtained from formal learning theory using the Vapnik-Chervonenkis (VC) dimension (Blumer et al. 1989; Haussler et al. 1990). Recent statistical learning theories (Tishby et al. 1989; Schwartz et al. 1990) suggest that surpassing these bounds might be possible if the spectrum of possible generalizations has a “gap” near perfect performance. We indeed find that, in some cases, the average generalization is significantly better than the VC bound: the approach to perfect performance is exponential in the number of examples m, rather than the 1/m result of the bound. However, in these cases, we have not found evidence of the gap predicted by the above statistical theories. In other cases, we do find the 1/m behavior of the VC bound...

Journal ArticleDOI
TL;DR: In this paper, the notion of p-capacity for a reversible Markov operator on a general measure space was defined and it was shown that uniform estimates for the ratio of capacity and measure are equivalent to certain imbedding theorems for the Orlicz and Dirichlet norms.
Abstract: We define the notion of p-capacity for a reversible Markov operator on a general measure space and prove that uniform estimates for the ratio of capacity and measure are equivalent to certain imbedding theorems for the Orlicz and Dirichlet norms. As a corollary we get results on connections between embedding theorems and isoperimetric properties for general Markov operators and, particularly, a generalization of the Kesten theorem on the spectral radius of random walks on amenable groups for the case of arbitrary graphs with non-finitely supported transition probabilities.


Book ChapterDOI
Henry Kautz, Bart Selman
12 Jul 1992
TL;DR: It is proved that unless NP ⊆ non-uniform P, not all theories have small Horn least-upper-bound approximations.
Abstract: Knowledge compilation speeds inference by creating tractable approximations of a knowledge base, but this advantage is lost if the approximations are too large. We show how learning concept generalizations can allow for a more compact representation of the tractable theory. We also give a general induction rule for generating such concept generalizations. Finally, we prove that unless NP ⊆ non-uniform P, not all theories have small Horn least-upper-bound approximations.