
Showing papers on "Generalization published in 1995"



Journal ArticleDOI
TL;DR: The three main rhetorics used in accounting studies are statistical, contextual, and constructive generalization: statistical generalization rhetoric relies on formal arguments brought from a mathematical theory, contextual generalization rhetoric is based on understanding of the historical and institutional context, and constructive generalization relies on the diffusion of innovation.
Abstract: Generalization in accounting research is always suspect as the social context and institutions of accounting change over time and space. However, exactly for this reason, there are a number of different ways to reach pragmatic and somewhat generalizable results. No research programme or approach has an absolute upper hand in understanding the true dynamics of economic development. The genuine puzzle of inductive reasoning creates a rhetorical element for all attempts to generalize in accounting research. The main rhetorics used in accounting studies are statistical, contextual, and constructive generalization. To put it in broad terms, statistical generalization rhetoric relies on formal arguments brought from a mathematical theory, contextual generalization rhetoric is based on understanding of the historical and institutional context, and constructive generalization relies on the diffusion of innovation. Combining the often silenced opportunities of contextual or constructive generalization rhetorics in...

324 citations


Journal ArticleDOI
TL;DR: In this paper, the partial regularity of suitable weak solutions to the dynamical systems modelling the flow of liquid crystals is established, which is a natural generalization of an earlier work of Caffarelli-Kohn-Nirenberg on the Navier-Stokes system with some simplifications due to better estimates on the pressure term.
Abstract: Here we established the partial regularity of suitable weak solutions to the dynamical systems modelling the flow of liquid crystals. It is a natural generalization of an earlier work of Caffarelli-Kohn-Nirenberg on the Navier-Stokes system with some simplifications due to better estimates on the pressure term.

292 citations


Journal ArticleDOI
TL;DR: Much better generalization can be obtained by using a variable interpolation kernel in combination with conjugate gradient optimization of the similarity metric and kernel size to create a variable-kernel similarity metric (VSM) learning.
Abstract: Nearest-neighbor interpolation algorithms have many useful properties for applications to learning, but they often exhibit poor generalization. In this paper, it is shown that much better generalization can be obtained by using a variable interpolation kernel in combination with conjugate gradient optimization of the similarity metric and kernel size. The resulting method is called variable-kernel similarity metric (VSM) learning. It has been tested on several standard classification data sets, and on these problems it shows better generalization than backpropagation and most other learning methods. The number of parameters that must be determined through optimization is orders of magnitude smaller than for backpropagation or radial basis function (RBF) networks, which may indicate that the method better captures the essential degrees of variation in learning. Other features of VSM learning are discussed that make it relevant to models for biological learning in the brain.

276 citations
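The variable-kernel idea above can be illustrated with a minimal sketch: a weighted nearest-neighbor classifier whose neighbor votes are scaled by a Gaussian kernel of distance. The function name, the fixed bandwidth, and the plain Euclidean metric are simplifications for illustration; in VSM learning the metric and kernel size are themselves optimized (e.g. by conjugate gradient), which this sketch does not attempt.

```python
import numpy as np

def kernel_knn_predict(X_train, y_train, x, bandwidth=1.0, k=5):
    """Weighted nearest-neighbor classification with a Gaussian kernel.

    The k nearest training points vote, each weighted by a Gaussian of
    its distance to x. In VSM learning the bandwidth and the distance
    metric are tuned by optimization; here they are fixed inputs.
    """
    d = np.linalg.norm(X_train - x, axis=1)           # distances to x
    nn = np.argsort(d)[:k]                            # k nearest neighbors
    w = np.exp(-(d[nn] ** 2) / (2 * bandwidth ** 2))  # kernel weights
    classes = np.unique(y_train)
    scores = [w[y_train[nn] == c].sum() for c in classes]
    return classes[int(np.argmax(scores))]
```

Because the votes decay smoothly with distance, predictions change gradually as x moves, which is one intuition for the improved generalization over plain nearest-neighbor interpolation.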


Journal ArticleDOI
TL;DR: Most of the known results on linear networks, including backpropagation learning and the structure of the error function landscape, the temporal evolution of generalization, and unsupervised learning algorithms and their properties are surveyed.
Abstract: Networks of linear units are the simplest kind of networks, where the basic questions related to learning, generalization, and self-organization can sometimes be answered analytically. We survey most of the known results on linear networks, including: 1) backpropagation learning and the structure of the error function landscape, 2) the temporal evolution of generalization, and 3) unsupervised learning algorithms and their properties. The connections to classical statistical ideas, such as principal component analysis (PCA), are emphasized as well as several simple but challenging open questions. A few new results are also spread across the paper, including an analysis of the effect of noise on backpropagation networks and a unified view of all unsupervised algorithms.

258 citations


Journal ArticleDOI
TL;DR: A new index is proposed for the evaluation of the sensitivity of the output of the multilayer perceptron to small input changes and a way is presented for improving these sensitivity criteria.
Abstract: In most applications of the multilayer perceptron (MLP) the main objective is to maximize the generalization ability of the network. We show that this ability is related to the sensitivity of the output of the MLP to small input changes. Several criteria have been proposed for the evaluation of the sensitivity. We propose a new index and present a way for improving these sensitivity criteria. Some numerical experiments allow a first comparison of the efficiencies of these criteria.

247 citations
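One generic way to quantify the input-output sensitivity discussed above is a finite-difference estimate of the average output change per unit input perturbation. The index below is an illustrative stand-in under that assumption, not the paper's proposed criterion, and the small tanh network is likewise hypothetical.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """One-hidden-layer perceptron with tanh units (illustrative)."""
    h = np.tanh(W1 @ x + b1)
    return np.tanh(W2 @ h + b2)

def sensitivity_index(X, params, eps=1e-4):
    """Average output change per unit input perturbation, estimated by
    central finite differences over a sample X. A generic sensitivity
    measure; the paper's own index may be defined differently."""
    total = 0.0
    for x in X:
        for i in range(len(x)):
            xp, xm = x.copy(), x.copy()
            xp[i] += eps
            xm[i] -= eps
            total += np.abs(mlp_forward(xp, *params)
                            - mlp_forward(xm, *params)).sum() / (2 * eps)
    return total / (len(X) * X.shape[1])
```

A network whose output barely moves under small input perturbations scores near zero; larger values flag decision boundaries that hug the training data, which is the link to generalization the abstract draws.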



Book ChapterDOI
01 Feb 1995
TL;DR: This work establishes the theoretical foundations of multi-scale models and derives TD algorithms for learning them, treating only the prediction problem--that of learning a model and value function for the case of fixed agent behavior.
Abstract: … hierarchical or multi-level planning and reinforcement learning. In this paper we treat only the prediction problem--that of learning a model and value function for the case of fixed agent behavior. Within this context, we establish the theoretical foundations of multi-scale models and derive TD algorithms for learning them. Two small computational experiments are presented to test and illustrate the theory. This work is an extension and generalization of the work of Singh (1992), Dayan (1993), and Sutton and Pinette (1985).

188 citations


Journal ArticleDOI
TL;DR: In this paper, the authors introduce and survey random-cluster measures from the probabilist's point of view, giving clear statements of some of the many open problems, and present new results for such measures.
Abstract: The random-cluster model is a generalization of percolation and ferromagnetic Potts models, due to Fortuin and Kasteleyn. Not only is the random-cluster model a worthwhile topic for study in its own right, but also it provides much information about phase transitions in the associated physical models. This paper serves two functions. First, we introduce and survey random-cluster measures from the probabilist's point of view, giving clear statements of some of the many open problems. Second, we present new results for such measures, as follows. We discuss the relationship between weak limits of random-cluster measures and measures satisfying a suitable DLR condition. Using an argument based on the convexity of pressure, we prove the uniqueness of random-cluster measures for all but (at most) countably many values of the parameter $p$. Related results concerning phase transition in two or more dimensions are included, together with various stimulating conjectures. The uniqueness of the infinite cluster is employed in an intrinsic way in part of these arguments. In the second part of this paper is constructed a Markov process whose level sets are reversible Markov processes with random-cluster measures as unique equilibrium measures. This construction enables a coupling of random-cluster measures for all values of $p$. Furthermore, it leads to a proof of the semicontinuity of the percolation probability and provides a heuristic probabilistic justification for the widely held belief that there is a first-order phase transition if and only if the cluster-weighting factor $q$ is sufficiently large.

183 citations


Journal ArticleDOI
TL;DR: The general problem of Turán, having an extremely simple formulation but being extremely hard to solve, has become one of the most fascinating extremal problems in combinatorics; the present situation is described and open conjectures are listed.
Abstract: The numbers which are traditionally named in honor of Paul Turán were introduced by him as a generalization of a problem he solved in 1941. The general problem of Turán, having an extremely simple formulation but being extremely hard to solve, has become one of the most fascinating extremal problems in combinatorics. We describe the present situation and list conjectures which are not so hopeless.

176 citations


01 Jan 1995
TL;DR: The technique can be viewed as an instance of Martens' and Gallagher's recent framework for global termination of partial deduction, but it is more general in some important respects, e.g. it uses well-quasi orderings rather than well-founded orderings.
Abstract: This paper presents a termination technique for positive supercompilation, based on notions from term algebra. The technique is not particularly biased towards positive supercompilation, but also works for deforestation and partial evaluation. It appears to be well suited for partial deduction too. The technique guarantees termination, yet it is not overly conservative. Our technique can be viewed as an instance of Martens' and Gallagher's recent framework for global termination of partial deduction, but it is more general in some important respects, e.g. it uses well-quasi orderings rather than well-founded orderings. Its merits are illustrated on several examples.

Book ChapterDOI
21 Sep 1995
TL;DR: A system of line segment relations which generalizes Allen's system of interval relations to two dimensions is introduced and it is shown that this generalization differs in interesting properties from the generalizations based on topological relations which have been proposed so far.
Abstract: Ordering information is a special type of spatial information that derives from the linear, planar or spatial ordering of points. A definition of ordering information in terms of the orientation of simplexes is used in this paper to introduce a system of line segment relations which generalizes Allen's system of interval relations to two dimensions. It shows that this generalization differs in interesting properties from the generalizations based on topological relations which have been proposed so far. The conceptual neighborhood structure of the line segment relations provides the foundation of ordering information reasoning. This is illustrated with an example from motion planning. Finally, the problem of representing ordering information is addressed. In that context the cell complex representation of Frank and Kuhn is compared with the approach presented here.

Proceedings ArticleDOI
01 Oct 1995
TL;DR: In this article, the authors add functional continuations and prompts to a language with an ML-style type system, and prove that well-typed terms never produce run-time type errors.
Abstract: We add functional continuations and prompts to a language with an ML-style type system. The operators significantly extend and simplify the control operators in SML/NJ, and can themselves be used to implement (simple) exceptions. We prove that well-typed terms never produce run-time type errors and give a module for implementing them in the latest version of SML/NJ.

Book ChapterDOI
01 Jan 1995
TL;DR: The concept of "on-the-fly" map generalization is very different from the implementation approaches described in the paper by Muller et al.: batch and interactive generalization (Chapter 1), so a special structure is proposed: the GAP-tree.
Abstract: The concept of "on-the-fly" map generalization is very different from the implementation approaches described in the paper by Muller et al.: batch and interactive generalization (Chapter 1). The term batch generalization is used for the process in which a computer gets an input dataset and returns an output dataset using algorithms, rules, or constraints (Lagrange et al., 1993) without the intervention of humans. Area partitioning poses some special problems when being generalized. In order to avoid gaps when not selecting small area features, a special structure is proposed: the GAP-tree. Section 9.3 describes two other reactive data structures, which will be used in combination with the new GAP-tree: the Reactive-tree and the BLG-tree. The implementation and test results are given in section 9.4, where both visual and numerical results are shown. Finally, conclusions and future work are summarized in section 9.5. (from Author)


Journal ArticleDOI
Michael Kearns1
27 Nov 1995
TL;DR: It is argued that the following qualitative properties of cross-validation behavior should be quite robust to significant changes in the underlying model selection problem: when the target function complexity is small compared to the sample size, the performance of cross validation is relatively insensitive to the choice of γ.
Abstract: We give a theoretical and experimental analysis of the generalization error of cross validation using two natural measures of the problem under consideration. The approximation rate measures the accuracy to which the target function can be ideally approximated as a function of the number of parameters, and thus captures the complexity of the target function with respect to the hypothesis model. The estimation rate measures the deviation between the training and generalization errors as a function of the number of parameters, and thus captures the extent to which the hypothesis model suffers from overfitting. Using these two measures, we give a rigorous and general bound on the error of the simplest form of cross validation. The bound clearly shows the dangers of making γ —the fraction of data saved for testing—too large or too small. By optimizing the bound with respect to γ, we then argue that the following qualitative properties of cross-validation behavior should be quite robust to significant changes ...
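The role of γ can be made concrete with a plain holdout split, the simplest form of cross validation the abstract analyzes. The helper below is an illustrative sketch (the function name is hypothetical): making γ too small leaves the validation estimate noisy, while making it too large starves training, which is the tradeoff the bound captures.

```python
import numpy as np

def holdout_split(X, y, gamma, rng):
    """Hold out a fraction gamma of the data for validation.

    gamma is the fraction of examples reserved for estimating
    generalization error; the remaining 1 - gamma fraction is used
    for training.
    """
    n = len(X)
    n_test = max(1, int(round(gamma * n)))   # at least one test point
    idx = rng.permutation(n)                 # random shuffle of indices
    test, train = idx[:n_test], idx[n_test:]
    return X[train], y[train], X[test], y[test]
```

Sweeping gamma over, say, 0.05 to 0.5 and plotting the resulting validation error against the true test error is a simple way to observe the tradeoff empirically.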

Proceedings Article
27 Nov 1995
TL;DR: This paper shows that neural networks with continuous activation functions have VC dimension at least as large as the square of the number of weights w. This result settles a long-standing open question, namely whether the well-known O(w log w) bound, known for hard-threshold nets, also holds for more general sigmoidal nets.
Abstract: This paper shows that neural networks which use continuous activation functions have VC dimension at least as large as the square of the number of weights w. This result settles a long-standing open question, namely whether the well-known O(w log w) bound, known for hard-threshold nets, also holds for more general sigmoidal nets. Implications for the number of samples needed for valid generalization are discussed.

Journal ArticleDOI
01 Jun 1995-EPL
TL;DR: In this paper, a physical consideration is given to reinforce, interpret and generalize the result of Shepelyansky that a not too weak short-range interaction can cause correlated two-electron states to be substantially delocalized with respect to single-electron ones.
Abstract: A physical consideration is given to reinforce, interpret and generalize the result of Shepelyansky that a not too weak short-range interaction can cause correlated two-electron states to be substantially delocalized with respect to single-electron ones. A generalization of the Thouless block-scaling picture is used and its wide applicability is pointed out. A similar effect for correlated electron-hole pairs is also found. Some physical applications are briefly discussed.

Journal ArticleDOI
TL;DR: The Lorenz zonotope is a multivariate generalization of the Lorenz curve; it allows one to define multivariate Lorenz majorization, whose properties are studied.
Abstract: The Lorenz zonotope is a multivariate generalization of the Lorenz curve. It allows one to define multivariate Lorenz majorization, whose properties are studied.

Journal ArticleDOI
TL;DR: In this paper, the authors present a logic of generalization based on proximal similarity, heterogeneity of irrelevancies, discriminant validity, empirical interpolation and extrapolation, and explanation.
Abstract: Both experiments and ethnographies are highly localized, so they are often criticized for lack of generalizability. The present article describes a logic of generalization that may help solve such problems. The logic consists of five principles outlined by Cook (1990): (a) proximal similarity, (b) heterogeneity of irrelevancies, (c) discriminant validity, (d) empirical interpolation and extrapolation, and (e) explanation. Because validity is a property of knowledge claims, not methods, these five principles apply to claims about generalization generated by any method, including both ethnographies and experiments. The principles are illustrated using Rizzo and Corsaro's interesting ethnographies as examples.

Journal ArticleDOI
TL;DR: A novel algorithm is presented which supplements the training phase in feedforward networks with various forms of information about desired learning properties to improve convergence, learning speed, and generalization properties through prompt activation of the hidden units, optimal alignment of successive weight vector offsets, elimination of excessive hidden nodes, and regulation of the magnitude of search steps in the weight space.
Abstract: A novel algorithm is presented which supplements the training phase in feedforward networks with various forms of information about desired learning properties. This information is represented by conditions which must be satisfied in addition to the demand for minimization of the usual mean square error cost function. The purpose of these conditions is to improve convergence, learning speed, and generalization properties through prompt activation of the hidden units, optimal alignment of successive weight vector offsets, elimination of excessive hidden nodes, and regulation of the magnitude of search steps in the weight space. The algorithm is applied to several small- and large-scale binary benchmark training tasks, to test its convergence ability and learning speed, as well as to a large-scale OCR problem, to test its generalization capability. Its performance in terms of percentage of local minima, learning speed, and generalization ability is evaluated and found superior to the performance of the backpropagation algorithm and variants thereof taking especially into account the statistical significance of the results.

Book ChapterDOI
01 Jan 1995
TL;DR: In this paper, a generalization of the Rasch model to a discrete mixture distribution model is presented, which can be used to test the fit of the ordinary Rasch models.
Abstract: This chapter deals with the generalization of the Rasch model to a discrete mixture distribution model. Its basic assumption is that the Rasch model holds within subpopulations of individuals, but with different parameter values in each subgroup. These subpopulations are not defined by manifest indicators, rather they have to be identified by applying the model. Model equations are derived by conditioning out the class-specific ability parameters and introducing class-specific score probabilities as model parameters. The model can be used to test the fit of the ordinary Rasch model. By means of an example it is illustrated that this goodness-of-fit test can be more powerful for detecting model violations than the conditional likelihood ratio test by Andersen.
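Under standard Rasch notation (the symbols below are the conventional ones, not necessarily the chapter's), the mixture generalization described above can be written as:

```latex
% Ordinary Rasch model: person ability \theta_v, item difficulty \beta_i
P(X_{vi} = 1 \mid \theta_v, \beta_i)
  = \frac{\exp(\theta_v - \beta_i)}{1 + \exp(\theta_v - \beta_i)}

% Mixed Rasch model: the Rasch model holds within each latent class
% c = 1, \dots, C, with mixing proportions \pi_c and class-specific
% item parameters \boldsymbol{\beta}_c
P(\mathbf{x}_v)
  = \sum_{c=1}^{C} \pi_c \, P(\mathbf{x}_v \mid c ; \boldsymbol{\beta}_c)
```

Comparing the fit of the one-class case (the ordinary Rasch model) against a multi-class mixture is what yields the goodness-of-fit test the chapter describes.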

Journal ArticleDOI
TL;DR: This paper offers the generalization that competitive promotions are mixed strategies, and establishes the empirical regularity that promotions are independent across competitors.
Abstract: This paper offers the generalization that competitive promotions are mixed strategies. First an empirical regularity is established that promotions are independent across competitors. This regulari...

Book ChapterDOI
01 Jan 1995
TL;DR: In this article, the conservation law for generalization performance in a uniformly random universe was studied and a more meaningful measure of generalization was introduced, expected generalization, which is conserved only when certain symmetric properties hold in our universe.
Abstract: The “Conservation Law for Generalization Performance” [Schaffer, 1994] states that for any learning algorithm and bias, “generalization is a zero-sum enterprise.” In this paper we study the law and show that while the law is true, the manner in which the Conservation Law adds up generalization performance over all target concepts, without regard to the probability with which each concept occurs, is relevant only in a uniformly random universe. We then introduce a more meaningful measure of generalization, expected generalization performance. Unlike the Conservation Law's measure of generalization performance (which is, in essence, defined to be zero), expected generalization performance is conserved only when certain symmetric properties hold in our universe. There is no reason to believe, a priori, that such symmetries exist; learning algorithms may well exhibit non-zero (expected) generalization performance.

Journal ArticleDOI
TL;DR: A relationship between the mixed-search number of a graph G and the proper-path-width of G is established and complexity results are proved.

Journal ArticleDOI
TL;DR: An in-depth study investigated two algorithms for line simplification and caricatural generalization (namely, those developed by Douglas and Peucker, and Visvalingam, respectively) in the context of a wider program of research on scale-free mapping; the results suggest the Douglas-Peucker algorithm is better at minimal simplification.
Abstract: This paper reports the results of an in-depth study which investigated two algorithms for line simplification and caricatural generalization (namely, those developed by Douglas and Peucker, and Vis...
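The first of the two algorithms studied can be sketched in its standard textbook form (details of the 1995 study's implementation may differ): keep the endpoints of a polyline, find the vertex farthest from the chord between them, and recurse only if that distance exceeds a tolerance.

```python
import numpy as np

def douglas_peucker(points, tol):
    """Recursive Douglas-Peucker line simplification (standard form).

    Keeps the endpoints, finds the interior vertex farthest from the
    chord, and recurses on both halves if that distance exceeds tol;
    otherwise all interior vertices are dropped.
    """
    points = np.asarray(points, dtype=float)
    if len(points) <= 2:
        return points.tolist()
    a, b = points[0], points[-1]
    chord = b - a
    norm = np.linalg.norm(chord)
    if norm == 0:                       # degenerate: endpoints coincide
        d = np.linalg.norm(points[1:-1] - a, axis=1)
    else:                               # perpendicular distance to chord
        d = np.abs(np.cross(chord, points[1:-1] - a)) / norm
    i = int(np.argmax(d)) + 1           # index of farthest vertex
    if d[i - 1] > tol:
        left = douglas_peucker(points[: i + 1], tol)
        right = douglas_peucker(points[i:], tol)
        return left[:-1] + right        # avoid duplicating split vertex
    return [a.tolist(), b.tolist()]
```

The tolerance plays the role of the scale parameter: a small tol yields the "minimal simplification" regime the study found the algorithm best suited for, while large tol produces caricatural output.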


Journal ArticleDOI
TL;DR: A graph-theoretic formulation of a wide class of facet-defining inequalities, including Koppen's and Fishburn's inequalities, is proposed, extending work of Doignon and Falmagne.

Journal ArticleDOI
TL;DR: In this paper, the relation between the existence of an orthonormal wavelet and a multi-resolution wavelet is clarified, and four theorems for their existence are proved.
Abstract: Methods from noncommutative harmonic analysis are used to develop an abstract theory of orthonormal wavelets. The relationship between the existence of an orthonormal wavelet and the existence of a multi-resolution is clarified, and four theorems guaranteeing the existence of wavelets are proved. As a special case of the fourth theorem, a generalization of known results on the existence of smooth wavelets having compact support is obtained.

Journal ArticleDOI
TL;DR: This article utilizes an algorithm for urban space pattern analysis and illustrates how it can be applied to capture the essence of urban settlement in order to generate objective, intermediate design solutions for the representation of urban settlement at a range of scales.
Abstract: When a human cartographer designs a map, the symbols represent objects in a way that attempts to convey geographic process occurring at a given scale. The skill is in abstracting patterns of process and representing them in a context at an appropriate scale. The cartographer conveys the essence of the phenomenon by capturing the essential characteristics of a feature. In order for automated cartographic systems to match this subtlety in design and this ability to characterize, they must have cartometric techniques that enable the same abstraction of pattern to take place; techniques that take into account the phenomenological aspects of features. This article utilizes an algorithm for urban space pattern analysis and illustrates how it can be applied to capture the essence of urban settlement in order to generate objective, intermediate design solutions for the representation of urban settlement at a range of scales. Such an algorithm is considered to be one of a growing set of generalization operators th...