
Showing papers on "Generalization published in 1998"


01 Jan 1998
TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimation from small data pools, the application of these estimates to real-life problems, and much more.
Abstract: A comprehensive look at learning and generalization theory. The statistical theory of learning and generalization concerns the problem of choosing desired functions on the basis of empirical data. Highly applicable to a variety of computer science and robotics fields, this book offers lucid coverage of the theory as a whole. Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimation from small data pools, the application of these estimates to real-life problems, and much more.
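
For reference, the book's central consistency result can be stated schematically (a paraphrase, not a quotation from the text): empirical risk minimization is consistent if and only if the empirical risks converge to the expected risks uniformly (one-sidedly) over the function class.

```latex
% Paraphrase of the key theorem: ERM is consistent iff one-sided
% uniform convergence of empirical risk to expected risk holds:
\lim_{m \to \infty} \Pr\!\Big\{ \sup_{\alpha \in \Lambda}
  \big( R(\alpha) - R_{\mathrm{emp}}(\alpha) \big) > \varepsilon \Big\} = 0
  \qquad \forall \varepsilon > 0
```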

26,531 citations


Journal ArticleDOI
TL;DR: Results in this paper show that if a large neural network is used for a pattern classification problem and the learning algorithm finds a network with small weights that has small squared error on the training patterns, then the generalization performance depends on the size of the weights rather than the number of weights.
Abstract: Sample complexity results from computational learning theory, when applied to neural network learning for pattern classification problems, suggest that for good generalization performance the number of training examples should grow at least linearly with the number of adjustable parameters in the network. Results in this paper show that if a large neural network is used for a pattern classification problem and the learning algorithm finds a network with small weights that has small squared error on the training patterns, then the generalization performance depends on the size of the weights rather than the number of weights. For example, consider a two-layer feedforward network of sigmoid units, in which the sum of the magnitudes of the weights associated with each unit is bounded by A and the input dimension is n. We show that the misclassification probability is no more than a certain error estimate (that is related to squared error on the training set) plus A³√((log n)/m) (ignoring log A and log m factors), where m is the number of training patterns. This may explain the generalization performance of neural networks, particularly when the number of training examples is considerably smaller than the number of weights. It also supports heuristics (such as weight decay and early stopping) that attempt to keep the weights small during training. The proof techniques appear to be useful for the analysis of other pattern classifiers: when the input domain is a totally bounded metric space, we use the same approach to give upper bounds on misclassification probability for classifiers with decision boundaries that are far from the training examples.
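
Written out, the bound sketched in the abstract has the following schematic form (constants and the log A, log m factors suppressed, as noted above):

```latex
% Schematic form of the bound: m training patterns, input dimension n,
% per-unit weight magnitudes bounded by A:
\Pr\{\text{misclassification}\} \;\le\;
  \hat{\varepsilon}_{\mathrm{train}}
  \;+\; O\!\left( A^{3} \sqrt{\tfrac{\log n}{m}} \right)
```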

1,234 citations


Journal ArticleDOI
TL;DR: A result is presented that allows one to trade off errors on the training sample against improved generalization performance, and a more general result in terms of "luckiness" functions, which provides a quite general way for exploiting serendipitous simplicity in observed data to obtain better prediction accuracy from small training sets.
Abstract: The paper introduces some generalizations of Vapnik's (1982) method of structural risk minimization (SRM). As well as making explicit some of the details on SRM, it provides a result that allows one to trade off errors on the training sample against improved generalization performance. It then considers the more general case when the hierarchy of classes is chosen in response to the data. A result is presented on the generalization performance of classifiers with a "large margin". This theoretically explains the impressive generalization performance of the maximal margin hyperplane algorithm of Vapnik and co-workers (which is the basis for their support vector machines). The paper concludes with a more general result in terms of "luckiness" functions, which provides a quite general way for exploiting serendipitous simplicity in observed data to obtain better prediction accuracy from small training sets. Four examples are given of such functions, including the Vapnik-Chervonenkis (1971) dimension measured on the sample.
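
For readers unfamiliar with the term, the "margin" of a separating hyperplane (w, b) on a labeled sample {(x_i, y_i)} is the quantity that the maximal margin algorithm maximizes:

```latex
% Geometric margin of a hyperplane (w, b); the maximal margin
% hyperplane is the one maximizing this quantity:
\gamma \;=\; \min_{i}\; \frac{y_i \big( \langle w, x_i \rangle + b \big)}{\lVert w \rVert}
```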

589 citations


Journal ArticleDOI
TL;DR: In this paper, reliability generalization characterizes the typical reliability of scores for a given test across studies, the amount of variability in reliability coefficients for given measures, and the sources of variability of reliability coefficients across studies.
Abstract: Because tests are not reliable, it is important to explore score reliability in virtually all studies. The present article proposes and illustrates a new method-reliability generalization-that can be used in a meta-analysis application similar to validity generalization. Reliability generalization characterizes (a) the typical reliability of scores for a given test across studies, (b) the amount of variability in reliability coefficients for given measures, and (c) the sources of variability in reliability coefficients across studies. The use of reliability generalization is illustrated here by analyzing 87 reliability coefficients reported for the two scales of the Bem Sex Role Inventory (BSRI).
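
As a minimal sketch of the meta-analytic bookkeeping involved (hypothetical data and study feature; this is not the authors' code), one can characterize the typical reliability, its variability, and a candidate source of that variability:

```python
import numpy as np

# Hypothetical reliability coefficients (e.g., Cronbach's alpha) from
# several studies, plus one coded study feature (0 = student sample,
# 1 = community sample).
alphas = np.array([0.86, 0.80, 0.91, 0.75, 0.88, 0.82, 0.79, 0.90])
feature = np.array([0, 0, 1, 1, 0, 1, 1, 0])

print("typical reliability:", alphas.mean())   # (a)
print("variability:", alphas.var(ddof=1))      # (b)

# (c) sources of variability: least-squares regression of the
# coefficients on the coded study feature.
X = np.column_stack([np.ones_like(feature), feature])
coef, *_ = np.linalg.lstsq(X, alphas, rcond=None)
print("intercept, feature effect:", coef)
```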

380 citations


Journal ArticleDOI
TL;DR: In this paper, a definition of weak ω-categories based on a higher-order generalization of the apparatus of operads is presented; weak ω-categories are defined as algebras of certain higher operads.

316 citations


15 Oct 1998
TL;DR: The convergence of the backpropagation algorithm with respect to a) the complexity of the required function approximation, b) the size of the network in relation to the size required for an optimal solution, and c) the degree of noise in the training data is investigated.
Abstract: One of the most important aspects of any machine learning paradigm is how it scales according to problem size and complexity. Using a task with known optimal training error, and a pre-specified maximum number of training updates, we investigate the convergence of the backpropagation algorithm with respect to a) the complexity of the required function approximation, b) the size of the network in relation to the size required for an optimal solution, and c) the degree of noise in the training data. In general, for a) the solution found is worse when the function to be approximated is more complex, for b) oversized networks can result in lower training and generalization error in certain cases, and for c) the use of committee or ensemble techniques can be more beneficial as the level of noise in the training data is increased. For the experiments we performed, we do not obtain the optimal solution in any case. We further support the observation that larger networks can produce better training and generalization error using a face recognition example where a network with many more parameters than training points generalizes better than smaller networks.
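
A toy version of this experimental protocol (my own sketch, not the authors' setup) fixes a budget of training updates and compares training against generalization error for networks of different sizes on a noisy target:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# Noisy data from a known target function (the noise level plays role c).
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0]) + rng.normal(0, 0.1, size=200)
X_test = rng.uniform(-1, 1, size=(1000, 1))
y_test = np.sin(3 * X_test[:, 0])

# Vary network size (role b) under a fixed budget of training updates.
for hidden in (2, 10, 100):
    net = MLPRegressor(hidden_layer_sizes=(hidden,), max_iter=500,
                       random_state=0)
    net.fit(X, y)
    print(hidden,
          mean_squared_error(y, net.predict(X)),            # training error
          mean_squared_error(y_test, net.predict(X_test)))  # generalization
```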

271 citations


01 Jan 1998
TL;DR: Generalization of the Steiger-Lind root mean square error of approximation fit indexes and interval estimation procedure to models based on multiple independent samples is discussed; an approach that seems both reasonable and workable is suggested, and caution is given against one that definitely seems inappropriate.
Abstract: Generalization of the Steiger-Lind root mean square error of approximation fit indexes and interval estimation procedure to models based on multiple independent samples is discussed. In this article, we suggest an approach that seems both reasonable and workable, and caution against one that definitely seems inappropriate.
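
For context, the single-sample Steiger-Lind index being generalized has the standard form below (χ² is the model test statistic, df its degrees of freedom, N the sample size):

```latex
% Steiger-Lind RMSEA for a single sample:
\mathrm{RMSEA} \;=\; \sqrt{ \max\!\left( \frac{\chi^{2} - df}{df\,(N-1)},\; 0 \right) }
```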

253 citations


Journal ArticleDOI
TL;DR: In this article, a matrix variate generalization of the power exponential distribution family is proposed, which can be useful in generalizing statistical procedures in multivariate analysis and in designing robust alternatives to them.
Abstract: This paper proposes a matrix variate generalization of the power exponential distribution family, which can be useful in generalizing statistical procedures in multivariate analysis and in designing robust alternatives to them. An example is added to show an application of the generalization.
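
For orientation, the multivariate (vector) power exponential density being generalized has, up to its normalizing constant k, the form below; the matrix variate version replaces the quadratic form with a trace expression (my paraphrase of the construction, not the paper's exact statement):

```latex
% Multivariate power exponential density; beta = 1 recovers the Gaussian:
f(x) \;=\; k \,\lvert \Sigma \rvert^{-1/2}
  \exp\!\Big\{ -\tfrac{1}{2} \big[ (x-\mu)^{\top} \Sigma^{-1} (x-\mu) \big]^{\beta} \Big\}
```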

250 citations


Journal ArticleDOI
TL;DR: In this paper, a novel approach using an artificial neural network was used to develop a model for analyzing the relationship between the Global Radiation (GR) and climatological variables, and to predict GR for locations not covered by the model's training data.

242 citations


Journal ArticleDOI
TL;DR: In this paper, the authors extended the Poisson model of games with population uncertainty by allowing that expected population sizes and players' utility functions may depend on an unknown state of the world.
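
In the Poisson model the number of players is Poisson distributed; the extension described here lets the expected population size (and utilities) depend on an unknown state θ (schematic statement in my notation):

```latex
% Population size in a Poisson game, with the expected size
% \lambda(\theta) depending on an unknown state \theta:
\Pr\{\,k \text{ players} \mid \theta\,\} \;=\;
  e^{-\lambda(\theta)}\, \frac{\lambda(\theta)^{k}}{k!}
```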

199 citations


Journal Article
Hang Li, Naoki Abe
TL;DR: In this paper, the problem of generalizing values of a case frame slot for a verb is viewed as that of estimating a conditional probability distribution over a partition of words, and a new generalization method based on the minimum description length (MDL) principle is proposed.
Abstract: A new method for automatically acquiring case frame patterns from large corpora is proposed. In particular, the problem of generalizing values of a case frame slot for a verb is viewed as that of estimating a conditional probability distribution over a partition of words, and a new generalization method based on the Minimum Description Length (MDL) principle is proposed. For the sake of efficiency, the proposed method makes use of an existing thesaurus and restricts its attention to those partitions that are present as "cuts" in the thesaurus tree, thus reducing the generalization problem to that of estimating a "tree cut model" of the thesaurus tree. An efficient algorithm is given, which provably obtains the optimal tree cut model for the given frequency data of a case slot, in the sense of MDL. Case frame patterns obtained by the method were used to resolve PP-attachment ambiguity. Experimental results indicate that the proposed method improves upon or is at least comparable with existing methods.
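
A minimal brute-force sketch of the MDL cut selection (the paper gives an efficient recursive algorithm; this toy version simply enumerates every cut of a small hypothetical thesaurus):

```python
import math

# Toy thesaurus: internal node = (name, [children]), leaf = (name, freq).
TREE = ("ENTITY", [
    ("ANIMAL", [("bird", 4), ("cat", 3), ("dog", 3)]),
    ("ARTIFACT", [("car", 0), ("plane", 1), ("boat", 0)]),
])

def freq(node):
    return sum(freq(c) for c in node[1]) if isinstance(node[1], list) else node[1]

def n_leaves(node):
    return sum(n_leaves(c) for c in node[1]) if isinstance(node[1], list) else 1

def cuts(node):
    """All cuts of the subtree rooted at node."""
    yield [node]                      # the node itself as a single class ...
    if isinstance(node[1], list):     # ... or any combination of child cuts
        partial = [[]]
        for child in node[1]:
            partial = [p + c for p in partial for c in cuts(child)]
        yield from partial

def description_length(cut, n_total):
    param = (len(cut) - 1) / 2 * math.log2(n_total)   # parameter DL
    data = 0.0                                        # data DL
    for node in cut:
        f = freq(node)
        if f > 0:   # probability spread uniformly over a class's words
            data -= f * math.log2(f / n_total / n_leaves(node))
    return param + data

n_total = freq(TREE)
best = min(cuts(TREE), key=lambda c: description_length(c, n_total))
print([name for name, _ in best])     # e.g. ['ANIMAL', 'ARTIFACT']
```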

Journal ArticleDOI
TL;DR: A method for incrementally updating approximations of a concept is presented; it uses the inductive learning algorithm LERS, based on rough set theory, to implement a quasi-incremental algorithm for learning classification rules from very large databases generalized by dynamic conceptual hierarchies provided by users.
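
The rough-set machinery underlying LERS can be illustrated generically (this is a textbook construction, not the LERS implementation): objects that are indiscernible on the chosen attributes form equivalence classes, and a concept is bracketed by its lower and upper approximations.

```python
from collections import defaultdict

# Toy decision table (hypothetical): object id -> attribute values.
objects = {
    1: ("high", "yes"), 2: ("high", "yes"), 3: ("low", "no"),
    4: ("low", "no"),   5: ("high", "no"),
}
concept = {1, 2, 3}   # objects belonging to the target concept

# Equivalence classes of the indiscernibility relation.
classes = defaultdict(set)
for obj, values in objects.items():
    classes[values].add(obj)

lower, upper = set(), set()
for c in classes.values():
    if c <= concept:      # class wholly inside the concept
        lower |= c
    if c & concept:       # class overlapping the concept
        upper |= c

print(lower)  # {1, 2}        -- certainly in the concept
print(upper)  # {1, 2, 3, 4}  -- possibly in the concept
```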

Proceedings Article
01 Dec 1998
TL;DR: This work investigates the problem of learning a classification task on data represented in terms of their pairwise proximities, which does not refer to an explicit feature representation of the data items and is thus more general than the standard approach of using Euclidean feature vectors.
Abstract: We investigate the problem of learning a classification task on data represented in terms of their pairwise proximities. This representation does not refer to an explicit feature representation of the data items and is thus more general than the standard approach of using Euclidean feature vectors, from which pairwise proximities can always be calculated. Our first approach is based on a combined linear embedding and classification procedure resulting in an extension of the Optimal Hyperplane algorithm to pseudo-Euclidean data. As an alternative we present another approach based on a linear threshold model in the proximity values themselves, which is optimized using Structural Risk Minimization. We show that prior knowledge about the problem can be incorporated by the choice of distance measures and examine different metrics with respect to their generalization. Finally, the algorithms are successfully applied to protein structure data and to data from the cat's cerebral cortex. They show better performance than K-nearest-neighbor classification.
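
A stripped-down sketch of the first approach's flavor (my own simplification, not the authors' algorithm): embed a symmetric distance matrix with the classical (Torgerson) construction, keep the leading components, and train a linear classifier in the embedded space.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Hypothetical data: two classes in 2-D, but the learner only ever
# sees the pairwise-distance matrix D, never the coordinates.
X = rng.normal(size=(60, 2)) + np.repeat([[0, 0], [3, 3]], 30, axis=0)
y = np.repeat([0, 1], 30)
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)

# Classical (Torgerson) embedding of the squared distances.
n = len(D)
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J
w, V = np.linalg.eigh(B)
top = np.argsort(w)[::-1][:2]                    # leading components
Z = V[:, top] * np.sqrt(np.maximum(w[top], 0))   # embedded coordinates

clf = LinearSVC().fit(Z, y)                      # linear classifier
print("training accuracy:", clf.score(Z, y))
```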

Journal ArticleDOI
TL;DR: It is claimed that it is always possible to classify objects according to the existence dependency relationship, thus removing the necessity for the Part Of relation and other kinds of associations between object types.
Abstract: In object-oriented conceptual modeling, the generalization/specialization hierarchy and the whole/part relationship are prevalent classification schemes for object types. This paper presents an object-oriented conceptual model where, in the end, object types are classified according to two relationships only: existence dependency and generalization/specialization. Existence dependency captures some of the interesting semantics that are usually associated with the concept of aggregation (also called composition or Part Of relation), but in contrast with the latter concept, the semantics of existence dependency are very precise and its use clear cut. The key advantages of classifying object types according to existence dependency are the simplicity of the concept, its absolute unambiguity, and the fact that it enables checking conceptual schemes for semantic integrity and consistency. We first define the notion of existence dependency and claim that it is always possible to classify objects according to this relationship, thus removing the necessity for the Part Of relation and other kinds of associations between object types. The second claim of this paper is that existence dependency is the key to semantic integrity checking to a level unknown to current object-oriented analysis methods. In other words: existence dependency allows us to track and solve inconsistencies in an object-oriented conceptual schema.
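
As a toy illustration of the semantics (a hypothetical example, not taken from the paper): an order line is existence dependent on its order, so it can only come into existence through its order and refers to it for its whole life.

```python
class Order:
    """Master object; its order lines are existence dependent on it."""
    def __init__(self, number):
        self.number = number
        self._lines = []

    def add_line(self, product, qty):
        line = OrderLine(self, product, qty)   # lines created via the order
        self._lines.append(line)
        return line


class OrderLine:
    """Dependent object: cannot exist without an Order."""
    def __init__(self, order, product, qty):
        if order is None:
            raise ValueError("an OrderLine cannot exist without an Order")
        self.order = order      # fixed for the line's entire lifetime
        self.product = product
        self.qty = qty


order = Order(42)
line = order.add_line("widget", 3)
print(line.order.number)   # 42
```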

Journal ArticleDOI
01 Jan 1998
TL;DR: The aim of this paper is to observe school-case solutions available in standard cartographic books and to replicate them automatically, preserving the overall line structure through bends that are mathematically defined according to size, shape, and context.
Abstract: Many solutions for line generalizations have already been proposed. Most of them, however, are geometric solutions, not cartographic ones. The position we take in this paper is to observe school-case solutions available in standard cartographic books and try to replicate those automatically. A central criterion guiding the process of cartographic generalization is line structure, which itself can be decomposed into a series of line bends. Hence our solution is to preserve the overall structure with line bends which are mathematically defined according to size, shape, and context. Rules are subsequently applied using operators such as elimination, combination, and exaggeration. The algorithms that were used are both procedural and knowledge based. Various experiments were conducted on physical and political geographic lines, and we show the graphical results so that readers may visually assess them. Further research to improve the present solutions is discussed, particularly options for avoiding conflicts ...
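
A minimal sketch of the bend-based view (my own simplification; the paper's operators are considerably richer): split a polyline into bends at changes of turning direction, measure each bend's size, and eliminate bends below a threshold.

```python
import numpy as np

def generalize_line(pts, min_detour):
    """Drop bends (runs of same-sign turns) whose detour is small."""
    pts = np.asarray(pts, float)
    a, b, c = pts[:-2], pts[1:-1], pts[2:]
    cross = ((b[:, 0] - a[:, 0]) * (c[:, 1] - b[:, 1])
             - (b[:, 1] - a[:, 1]) * (c[:, 0] - b[:, 0]))
    signs = np.sign(cross)          # turn direction at each interior vertex

    keep = [pts[0]]
    i = 0
    while i < len(signs):
        j = i
        while j + 1 < len(signs) and signs[j + 1] == signs[i]:
            j += 1                  # interior vertices i..j form one bend
        seg = pts[i:j + 3]
        # Bend size: how far the path detours beyond its chord.
        detour = (np.linalg.norm(np.diff(seg, axis=0), axis=1).sum()
                  - np.linalg.norm(seg[-1] - seg[0]))
        if detour >= min_detour:    # elimination operator: drop small bends
            keep.extend(pts[i + 1:j + 2])
        i = j + 1
    keep.append(pts[-1])
    return np.array(keep)

# A small zig is smoothed away; the large bend survives.
line = [(0, 0), (1, 0.1), (2, 0), (3, 0), (4, 3), (5, 0), (6, 0)]
print(generalize_line(line, min_detour=0.5))
```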

Journal ArticleDOI
TL;DR: In this paper, a generalization of the circular and hyperbolic functions is proposed, based on the Tsallis statistics and on a consistent generalisation of the Euler formula, and some properties of the presently proposed q-trigonometry are then investigated.
Abstract: A generalization of the circular and hyperbolic functions is proposed, based on the Tsallis statistics and on a consistent generalization of the Euler formula. Some properties of the presently proposed q-trigonometry are then investigated. The generalized functions are exact solutions of a nonlinear oscillator. Original circular and hyperbolic relations are recovered as the limiting case.
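
Concretely, the construction rests on the q-exponential of Tsallis statistics together with a generalized Euler formula; the ordinary circular functions are recovered in the limit q → 1:

```latex
% q-exponential and the generalized Euler formula:
e_q(x) = \big[ 1 + (1-q)\,x \big]^{\frac{1}{1-q}}, \qquad
e_q(ix) = \cos_q(x) + i \sin_q(x),
\\[4pt]
\cos_q(x) = \frac{e_q(ix) + e_q(-ix)}{2}, \qquad
\sin_q(x) = \frac{e_q(ix) - e_q(-ix)}{2i}
```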

Journal ArticleDOI
TL;DR: It is proved that universal fault-tolerant computation is possible with any higher-dimensional stabilizer code for prime d, and the theory of fault-tolerant computations using such codes is discussed.
Abstract: Instead of a quantum computer where the fundamental units are 2-dimensional qubits, we can consider a quantum computer made up of d-dimensional systems. There is a straightforward generalization of the class of stabilizer codes to d-dimensional systems, and I will discuss the theory of fault-tolerant computation using such codes. I prove that universal fault-tolerant computation is possible with any higher-dimensional stabilizer code for prime d.
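
The d-dimensional generalization is built from the generalized Pauli operators, which generate the group from which stabilizer codes are constructed (standard definitions, with ω a primitive d-th root of unity):

```latex
% Generalized Pauli operators on a d-dimensional system
% (arithmetic mod d, \omega = e^{2\pi i/d}):
X \lvert j \rangle = \lvert j+1 \bmod d \rangle, \qquad
Z \lvert j \rangle = \omega^{j} \lvert j \rangle, \qquad
ZX = \omega\, XZ
```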

Journal ArticleDOI
TL;DR: This paper shows that generalization occurs only when some studied items are systematically marked and that the process consists of two components, one of which involves making a link between the phonological markers and the indicators (e.g., definite and indefinite articles) of subclass membership.

Journal ArticleDOI
TL;DR: This article elaborates global control for partial deduction, using the concept of a characteristic tree, which encapsulates specialization behavior rather than syntactic structure, to guide generalization and polyvariance, and shows how this can be done in a correct and elegant way.
Abstract: Given a program and some input data, partial deduction computes a specialized program handling any remaining input more efficiently. However, controlling the process well is a rather difficult problem. In this article, we elaborate global control for partial deduction: for which atoms, among possibly infinitely many, should specialized relations be produced, meanwhile guaranteeing correctness as well as termination? Our work is based on two ingredients. First, we use the concept of a characteristic tree, encapsulating specialization behavior rather than syntactic structure, to guide generalization and polyvariance, and we show how this can be done in a correct and elegant way. Second, we structure combinations of atoms and associated characteristic trees in global trees registering “causal” relationships among such pairs. This allows us to spot looming nontermination and perform proper generalization in order to avert the danger, without having to impose a depth bound on characteristic trees. The practical relevance and benefits of the work are illustrated through extensive experiments. Finally, a similar approach may improve upon current (on-line) control strategies for program transformation in general such as (positive) supercompilation of functional programs. It also seems valuable in the context of abstract interpretation to handle infinite domains of infinite height with more precision.
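
Generalization in this setting is typically computed as a most specific generalization (msg) of atoms; the following is a minimal sketch of first-order anti-unification (a generic textbook construction, not the article's control algorithm):

```python
import itertools

_fresh = itertools.count()

def msg(s, t, table=None):
    """Most specific generalization of two terms.

    Representation: constants are strings, compound terms are tuples
    (functor, arg1, ..., argn)."""
    if table is None:
        table = {}
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        return (s[0],) + tuple(msg(a, b, table) for a, b in zip(s[1:], t[1:]))
    # Mismatched subterms generalize to a variable; the same pair of
    # subterms is always mapped to the same variable.
    if (s, t) not in table:
        table[(s, t)] = "X%d" % next(_fresh)
    return table[(s, t)]

# msg( p(f(a), a), p(f(b), b) )  ==>  p(f(X0), X0)
print(msg(("p", ("f", "a"), "a"), ("p", ("f", "b"), "b")))
```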

Journal ArticleDOI
TL;DR: The authors showed 9- and 11-month-old infants events in which animal and vehicle properties were demonstrated, such as drinking from a cup or giving a ride, and tested imitation and generalization of these properties. Infants generalized the properties broadly to both typical and novel exemplars within the appropriate domain, and rarely to exemplars from the inappropriate domain.
Abstract: Using little models, we showed 9- and 11-month-old infants events in which animal or vehicle properties were demonstrated, such as a dog drinking from a cup or a car giving a ride. The infants were tested on imitation of these properties on the same exemplars as used for the modeling, on generalization to other exemplars from the same domain, and on generalization to exemplars from an inappropriate domain. Infants generalized the properties broadly to both typical and novel exemplars within the appropriate domain, and rarely to exemplars from the inappropriate domain. It is concluded that at least by 9 months infants have formed global concepts of animals and vehicles that control the way infants learn the characteristic properties of these classes.

Book
01 Dec 1998
TL;DR: In this paper, the results of 30 years of investigation by the author into the creation of a new theory on statistical analysis of observations, based on the principle of random arrays of random vectors and matrices of increasing dimensions, are described.
Abstract: This book contains the results of 30 years of investigation by the author into the creation of a new theory of statistical analysis of observations, based on the principle of random arrays of random vectors and matrices of increasing dimensions. It describes limit phenomena of sequences of random observations, which occupy a central place in the theory of random matrices. This is the first book to explore statistical analysis of random arrays, and it provides the necessary tools for such analysis. This book is a natural generalization of multidimensional statistical analysis and aims to provide its readers with new, improved estimators for this analysis. The book consists of 14 chapters and opens with the theory of sample random matrices of fixed dimension, which makes it possible to cover not only the problems of multidimensional statistical analysis, but also some important problems of mechanics, physics and economics. The second chapter deals with all 50 known canonical equations of the new statistical analysis, which form the basis for finding new and improved statistical estimators. Chapters 3-5 contain detailed proofs of the three main laws of the theory of sample random matrices. In chapters 6-10, detailed, strong proofs of the Circular and Elliptic Laws and their generalizations are given. In chapters 11-13, the convergence rates of spectral functions are given for the practical application of new estimators, and important questions of random matrix physics are considered. The final chapter contains 54 new statistical estimators, which generalize the main estimators of statistical analysis.

Book ChapterDOI
01 Oct 1998
TL;DR: A neural network's ability to generalize from examples is estimated using ideas from statistical mechanics, and a variety of learning problems that can be treated exactly by the replica method of statistical physics are introduced.
Abstract: We estimate a neural network’s ability to generalize from examples using ideas from statistical mechanics. We discuss the connection between this approach and other powerful concepts from mathematical statistics, computer science, and information theory that are useful in explaining the performance of such machines. For the simplest network, the perceptron, we introduce a variety of learning problems that can be treated exactly by the replica method of statistical physics.
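
For the perceptron learning a teacher rule from random examples, the generalization error in this framework has a simple geometric expression in terms of the overlap R between the student and teacher weight vectors (a standard result in this literature):

```latex
% Perceptron generalization error as a function of the normalized
% student-teacher overlap R:
\epsilon_g \;=\; \frac{1}{\pi}\arccos(R), \qquad
R \;=\; \frac{\mathbf{w}\cdot\mathbf{w}^{*}}{\lVert\mathbf{w}\rVert\,\lVert\mathbf{w}^{*}\rVert}
```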

Journal ArticleDOI
TL;DR: In this paper, the authors studied the optimal cutting strategy for an ongoing forest, using stochastic impulse control, and showed how Faustmann's formula can be generalized to growing forests.
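
For context, the deterministic Faustmann formula being generalized values bare land under an infinite sequence of identical rotations (standard form: p timber price, f(T) volume at rotation age T, c replanting cost, r the discount rate):

```latex
% Deterministic Faustmann land value, maximized over the rotation age T:
V(T) \;=\; \frac{p\,f(T)\,e^{-rT} - c}{1 - e^{-rT}}
```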

Journal ArticleDOI
TL;DR: The differentiability of the Wiener process as a sesquilinear form on a dense domain in the Hilbert space of square-integrable functions over Wiener space is shown and is extended to the quantum context, providing a basis for a corresponding generalization of the Ito theory of stochastic integration.
Abstract: A natural explanation for extreme irregularities in the evolution of prices in financial markets is provided by quantum effects. The lack of simultaneous observability of relevant variables and the interference of attempted observation with the values of these variables represent such effects. These characteristics have been noted by traders and economists and appear intrinsic to market dynamics. This explanation is explored here in terms of a corresponding generalization of the Wiener process and its role in the Black–Scholes–Merton theory. The differentiability of the Wiener process as a sesquilinear form on a dense domain in the Hilbert space of square-integrable functions over Wiener space is shown and is extended to the quantum context. This provides a basis for a corresponding generalization of the Ito theory of stochastic integration. An extension of the Black–Scholes option pricing formula to the quantum context is deduced.
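
As a reference point, the classical Black–Scholes European call price that the paper extends to the quantum context is (standard notation: S spot price, K strike, r riskless rate, σ volatility, T maturity, N the standard normal cdf):

```latex
% Classical Black-Scholes call price:
C = S\,N(d_1) - K e^{-rT} N(d_2), \qquad
d_{1,2} = \frac{\ln(S/K) + (r \pm \sigma^{2}/2)\,T}{\sigma\sqrt{T}}
```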

Journal ArticleDOI
TL;DR: In this article, the authors prove convergence of global, bounded, and smooth solutions of the wave equation with linear dissipation and analytic nonlinearity, and a generalization and examples of applications are given at the end of the paper.
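
Schematically, the problem studied has the form below (my paraphrase, with f analytic), and the result asserts that every global, bounded, smooth solution converges to an equilibrium as t → ∞:

```latex
% Wave equation with linear dissipation and analytic nonlinearity:
u_{tt} + u_{t} - \Delta u + f(u) = 0 \ \ \text{in } \Omega, \qquad
u = 0 \ \ \text{on } \partial\Omega
```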

Book ChapterDOI
TL;DR: It is shown that evolutionary algorithms are able to converge to the set of minimal elements in finite time with probability one, provided that the search space is finite, the time-invariant variation operator is associated with a positive transition probability function and that the selection operator obeys the so-called ‘elite preservation strategy.’
Abstract: The task of finding minimal elements of a partially ordered set is a generalization of the task of finding the global minimum of a real-valued function or of finding Pareto-optimal points of a multicriteria optimization problem. It is shown that evolutionary algorithms are able to converge to the set of minimal elements in finite time with probability one, provided that the search space is finite, the time-invariant variation operator is associated with a positive transition probability function and that the selection operator obeys the so-called ‘elite preservation strategy.’
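
A minimal sketch of such an elitist EA on a finite search space with a Pareto partial order (a toy illustration of the setting; the chapter proves convergence guarantees rather than prescribing an implementation):

```python
import random

random.seed(0)

# Finite search space: bit strings of length 4, with two hypothetical
# objectives to minimize; these induce a partial (Pareto) order.
def objectives(x):
    ones = sum(x)
    return (ones, abs(2 - ones))

def dominates(a, b):
    fa, fb = objectives(a), objectives(b)
    return all(u <= v for u, v in zip(fa, fb)) and fa != fb

def vary(x):
    # Positive transition probability: any point can reach any other,
    # since each bit flips independently with probability 1/4.
    return tuple(b ^ (random.random() < 0.25) for b in x)

# Elite-preserving archive of mutually non-dominated points.
archive = [tuple(random.randint(0, 1) for _ in range(4))]
for _ in range(1000):
    child = vary(random.choice(archive))
    if not any(dominates(a, child) for a in archive):
        archive = [a for a in archive if not dominates(child, a)]
        if child not in archive:
            archive.append(child)

print(sorted({objectives(a) for a in archive}))  # approaches the minimal elements
```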

Journal ArticleDOI
01 Mar 1998
TL;DR: The study shows that a set of sophisticated generalization operators can be constructed for generalization of complex data objects, a dimension-based class generalization mechanism can be developed for object cube construction, and sophisticated rule formation methods can be developed for extraction of different kinds of knowledge from data.
Abstract: Data mining is the discovery of knowledge and useful information from the large amounts of data stored in databases. With the increasing popularity of object-oriented database systems in advanced database applications, it is important to study the data mining methods for object-oriented databases because mining knowledge from such databases may improve understanding, organization, and utilization of the data stored there. In this paper, issues on generalization-based data mining in object-oriented databases are investigated in three aspects: (1) generalization of complex objects, (2) class-based generalization, and (3) extraction of different kinds of rules. An object cube model is proposed for class-based generalization, on-line analytical processing, and data mining. The study shows that (i) a set of sophisticated generalization operators can be constructed for generalization of complex data objects, (ii) a dimension-based class generalization mechanism can be developed for object cube construction, and (iii) sophisticated rule formation methods can be developed for extraction of different kinds of knowledge from data, including characteristic rules, discriminant rules, association rules, and classification rules. Furthermore, the application of such discovered knowledge may substantially enhance the power and flexibility of browsing databases, organizing databases and querying data and knowledge in object-oriented databases.
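
The flavor of generalization-based mining can be conveyed with classic attribute-oriented induction (a simplified generic sketch, not the paper's object cube model): climb each attribute's concept hierarchy until few distinct values remain, then merge identical tuples into counted rules.

```python
from collections import Counter

# Hypothetical concept hierarchies: value -> more general concept.
HIERARCHY = {
    "city": {"Vancouver": "Canada", "Toronto": "Canada",
             "Seattle": "USA", "Boston": "USA"},
    "major": {"math": "science", "physics": "science",
              "history": "arts", "music": "arts"},
}
THRESHOLD = 2   # max distinct values allowed per attribute

def generalize(rows, attr, i):
    # Assumes each climb strictly reduces the number of distinct values.
    while len({r[i] for r in rows}) > THRESHOLD:
        rows = [r[:i] + (HIERARCHY[attr].get(r[i], r[i]),) + r[i + 1:]
                for r in rows]
    return rows

rows = [("Vancouver", "math"), ("Toronto", "physics"),
        ("Seattle", "history"), ("Boston", "music"),
        ("Vancouver", "physics")]
for i, attr in enumerate(["city", "major"]):
    rows = generalize(rows, attr, i)

# Merged tuples with counts, read as characteristic rules.
print(Counter(rows))   # e.g. ('Canada', 'science'): 3, ('USA', 'arts'): 2
```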

Journal ArticleDOI
TL;DR: A statistically based methodology for the design of neural networks when the dimension d of the network input is comparable to the size n of the training set, illustrated in detail in the context of short-term forecasting of the demand for electric power from an electric utility.
Abstract: We introduce a statistically based methodology for the design of neural networks when the dimension d of the network input is comparable to the size n of the training set. If one proceeds straightforwardly, then one is committed to a network of complexity exceeding n. The result will be good performance on the training set but poor generalization performance when the network is presented with new data. To avoid this we need to select carefully the network architecture, including control over the input variables. Our approach to selecting a network architecture first selects a subset of input variables (features) using the nonparametric statistical process of difference-based variance estimation and then selects a simple network architecture using projection pursuit regression (PPR) ideas combined with the statistical idea of slicing inverse regression (SIR). The resulting network, which is then retrained without regard to the PPR/SIR determined parameters, is one of moderate complexity (number of parameters significantly less than n) whose performance on the training set can be expected to generalize well. The application of this methodology is illustrated in detail in the context of short-term forecasting of the demand for electric power from an electric utility.
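
A compact sketch of the SIR step alone (generic sliced inverse regression, not the authors' full PPR/SIR pipeline): standardize the inputs, slice by the response, and take leading eigenvectors of the covariance of slice means as candidate input directions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: y depends on x only through one direction b.
n, d = 500, 8
X = rng.normal(size=(n, d))
b = np.zeros(d); b[0] = 1.0
y = np.sin(X @ b) + 0.1 * rng.normal(size=n)

# 1. Whiten X.
mu = X.mean(axis=0)
L = np.linalg.cholesky(np.linalg.inv(np.cov(X, rowvar=False)))
Z = (X - mu) @ L

# 2. Slice by the sorted response and average Z within slices.
slices = np.array_split(np.argsort(y), 10)
means = np.array([Z[s].mean(axis=0) for s in slices])
weights = np.array([len(s) for s in slices]) / n

# 3. Leading eigenvector of the weighted covariance of slice means,
#    mapped back to the original coordinates.
M = (means * weights[:, None]).T @ means
vals, vecs = np.linalg.eigh(M)
beta = L @ vecs[:, -1]
print(np.round(beta / np.linalg.norm(beta), 2))   # ≈ ±(1, 0, ..., 0)
```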

Journal ArticleDOI
TL;DR: Tests of whether viewpoint-specific representations for some members of a class facilitate the recognition of other members of that class support the hypothesis that image-based representations are viewpoint dependent, but that these representations generalize across members of perceptually defined classes.