scispace - formally typeset

Showing papers on "Generalization" published in 2006


Book
01 Jan 2006
TL;DR: In this book, the author presents an overview of constructions and surface generalizations, explains how generalizations are learned and constrained, and discusses cross-linguistic generalizations in argument realization.
Abstract: Part One: Constructions 1. Overview 2. Surface Generalizations 3. Item Specific Knowledge and Generalizations Part Two: Learning Generalizations 4. How Generalizations are Learned 5. How Generalizations are Constrained 6. Why Generalizations are Learned Part Three: Explaining Generalizations 7. Island Constraints and Scope 8. Grammatical Categorization: Subject Auxiliary Inversion 9. Cross-linguistic Generalizations in Argument Realization 10. Variations on a Constructionist Theme 11. Conclusion References Index

2,337 citations


Posted Content
Lek-Heng Lim1
TL;DR: In this article, a theory of eigenvalues, eigenvectors, singular values, and singular vectors for tensors based on a constrained variational approach much like the Rayleigh quotient for symmetric matrix eigenvalues is proposed.
Abstract: We propose a theory of eigenvalues, eigenvectors, singular values, and singular vectors for tensors based on a constrained variational approach much like the Rayleigh quotient for symmetric matrix eigenvalues. These notions are particularly useful in generalizing certain areas where the spectral theory of matrices has traditionally played an important role. For illustration, we will discuss a multilinear generalization of the Perron-Frobenius theorem.

646 citations


Book ChapterDOI
01 Jan 2006
TL;DR: This work proposes to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences, and incorporates this new language model into a state-of-the-art speech recognizer of conversational speech.
Abstract: A central goal of statistical language modeling is to learn the joint probability function of sequences of words in a language. This is intrinsically difficult because of the curse of dimensionality: a word sequence on which the model will be tested is likely to be different from all the word sequences seen during training. Traditional but very successful approaches based on n-grams obtain generalization by concatenating very short overlapping sequences seen in the training set. We propose to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences. Generalization is obtained because a sequence of words that has never been seen before gets high probability if it is made of words that are similar (in the sense of having a nearby representation) to words forming an already seen sentence. Training such large models (with millions of parameters) within a reasonable time is itself a significant challenge. We report on several methods to speed-up both training and probability computation, as well as comparative experiments to evaluate the improvements brought by these techniques. We finally describe the incorporation of this new language model into a state-of-the-art speech recognizer of conversational speech.

587 citations
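The core of the model is a shared embedding table feeding a small neural network that outputs next-word probabilities. A minimal forward-pass sketch with random untrained weights (all sizes V, d, n_ctx, h are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
V, d, n_ctx, h = 50, 8, 3, 16   # vocab size, embedding dim, context length, hidden units

C = rng.standard_normal((V, d)) * 0.1          # shared word embeddings
H = rng.standard_normal((h, n_ctx * d)) * 0.1  # hidden layer weights
U = rng.standard_normal((V, h)) * 0.1          # output layer weights

def next_word_probs(context):
    """P(w | context) for every word w, given n_ctx word indices."""
    x = C[context].reshape(-1)            # concatenate the context embeddings
    a = np.tanh(H @ x)
    scores = U @ a
    e = np.exp(scores - scores.max())     # numerically stable softmax
    return e / e.sum()

p = next_word_probs([3, 17, 42])
```

Because all words share the embedding table C, contexts made of similar (nearby-embedded) words produce similar probability distributions, which is the source of the generalization the abstract describes.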


Journal ArticleDOI
TL;DR: An explicit reciprocal transformation between a two-component generalization of the Camassa-Holm equation, called the 2-CH system, and the first negative flow of the AKNS hierarchy is established in this paper.
Abstract: An explicit reciprocal transformation between a two-component generalization of the Camassa–Holm equation, called the 2-CH system, and the first negative flow of the AKNS hierarchy is established. This transformation enables one to obtain solutions of the 2-CH system from those of the first negative flow of the AKNS hierarchy. Interesting examples of peakon and multi-kink solutions of the 2-CH system are presented.

365 citations


Journal ArticleDOI
TL;DR: This work proposes an evolution equation for the level-set function based on a generalization of the concept of topological gradient, which results in a new algorithm allowing for all kinds of topology changes.

297 citations


Journal ArticleDOI
TL;DR: In this paper, the authors describe a simple scheme, based on the Nystrom method, for extending empirical functions f defined on a set X to a larger set X̄, where the extension process involves the construction of a specific family of functions that are termed geometric harmonics.

253 citations
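The Nystrom extension underlying this scheme can be sketched in a few lines: diagonalize a kernel matrix on the training set X, then extend the j-th eigenvector to a new point x̄ as φ_j(x̄) = (1/λ_j) Σ_i k(x̄, x_i) v_j[i]. The Gaussian kernel, the bandwidth eps, and the data are illustrative choices, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((30, 2))          # training set
eps = 1.0

def kernel(a, b):
    return np.exp(-np.sum((a - b) ** 2) / eps)

K = np.array([[kernel(x, y) for y in X] for x in X])
lam, V = np.linalg.eigh(K)                # eigenpairs, eigenvalues ascending

def extend(x_new, j):
    """Nystrom extension of the j-th eigenvector of K to a new point."""
    k_vec = np.array([kernel(x_new, xi) for xi in X])
    return k_vec @ V[:, j] / lam[j]
```

By construction the extension agrees with the eigenvector itself on the training points, since K v_j = λ_j v_j.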


Journal ArticleDOI
TL;DR: In this article, leave-one-out (LOO) stability is defined as a statistical form of well-posedness, and it is shown that for bounded loss classes LOO stability is sufficient for generalization and necessary and sufficient for consistency of ERM.
Abstract: Solutions of learning problems by Empirical Risk Minimization (ERM) – and almost-ERM when the minimizer does not exist – need to be consistent, so that they may be predictive. They also need to be well-posed in the sense of being stable, so that they might be used robustly. We propose a statistical form of stability, defined as leave-one-out (LOO) stability. We prove that for bounded loss classes LOO stability is (a) sufficient for generalization, that is convergence in probability of the empirical error to the expected error, for any algorithm satisfying it and, (b) necessary and sufficient for consistency of ERM. Thus LOO stability is a weak form of stability that represents a sufficient condition for generalization for symmetric learning algorithms while subsuming the classical conditions for consistency of ERM. In particular, we conclude that a certain form of well-posedness and consistency are equivalent for ERM.

227 citations
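The quantity at the heart of the paper is how much a learned hypothesis changes when one training point is deleted. A minimal numerical illustration (our own toy example, not the paper's) using the empirical mean as the learning algorithm:

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.uniform(0, 1, 20)            # a bounded sample, values in [0, 1]
n = len(z)

full = z.mean()                      # hypothesis trained on the full sample
# Leave-one-out deviations: how far the hypothesis moves when point i is removed.
loo_devs = np.array([abs(full - np.delete(z, i).mean()) for i in range(n)])
beta = loo_devs.max()                # empirical LOO stability constant
```

For the mean, deleting z_i shifts the estimate by exactly |z_i - full| / (n - 1), so beta is O(1/n) for bounded data: a stable, and hence generalizing, algorithm in the paper's sense.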


Journal ArticleDOI
TL;DR: This paper establishes the stability of BP in the presence of noise for sparse enough representations; the result is a direct generalization of the noiseless BP study, and indeed, when the noise power is reduced to zero, the known results for noiseless BP are recovered.

195 citations


Journal ArticleDOI
TL;DR: A framework is provided for a statistically grounded meta-analysis of coefficient alpha using its sampling distribution; two empirical examples are offered to illustrate these methods, and limitations of reliability generalization are described.
Abstract: The meta-analysis of coefficient alpha across many studies is becoming more common in psychology by a methodology labeled reliability generalization. Existing reliability generalization studies have not used the sampling distribution of coefficient alpha for precision weighting and other common meta-analytic procedures. A framework is provided for a statistically grounded meta-analysis of coefficient alpha using its sampling distribution. Two empirical examples are offered to illustrate these methods, and limitations of reliability generalization are described.

183 citations
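The statistic being meta-analyzed is Cronbach's coefficient alpha, α = k/(k-1) · (1 - Σ item variances / total-score variance). A short sketch of the point estimate itself (the paper's contribution, precision weighting by alpha's sampling distribution, is not reproduced here; the data below are simulated):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's coefficient alpha for an (n_subjects, k_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_vars.sum() / total_var)

rng = np.random.default_rng(7)
base = rng.normal(size=100)                    # simulated true scores
parallel = np.column_stack([base] * 4)         # four perfectly parallel items
noisy = parallel + rng.normal(scale=0.5, size=parallel.shape)

alpha_parallel = cronbach_alpha(parallel)      # identical items give alpha = 1
alpha_noisy = cronbach_alpha(noisy)            # measurement error lowers alpha
```

In a reliability-generalization meta-analysis, each study would contribute such an alpha, weighted by its sampling variance.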


Journal ArticleDOI
TL;DR: In this article, the authors introduced the paradigm of network error correction as a generalization of classical link-by-link error correction and obtained the network generalizations of the Hamming bound and the Singleton bound in classical algebraic coding theory.
Abstract: In Part I of this paper, we introduced the paradigm of network error correction as a generalization of classical link-by-link error correction. We also obtained the network generalizations of the Hamming bound and the Singleton bound in classical algebraic coding theory. In Part II, we prove the network generalization of the Gilbert-Varshamov bound and its enhancement. With the latter, we show that the tightness of the Singleton bound is preserved in the network setting. We also discuss the implication of the results in this paper.

171 citations


Book ChapterDOI
08 May 2006
TL;DR: In this paper, a generalization of the framework of Dung is proposed which allows sets of arguments to attack other arguments; the semantics associated with the original framework are extended to this generalization, and all results in the paper by Dung are shown to have an equivalent in this more abstract framework.
Abstract: One of the most widely studied systems of argumentation is the one described by Dung in a paper from 1995. Unfortunately, this framework does not allow for joint attacks on arguments, which we argue must be required of any truly abstract argumentation framework. A few frameworks can be said to allow for such interactions among arguments, but for various reasons we believe that these are inadequate for modelling argumentation systems with joint attacks. In this paper we propose a generalization of the framework of Dung, which allows for sets of arguments to attack other arguments. We extend the semantics associated with the original framework to this generalization, and prove that all results in the paper by Dung have an equivalent in this more abstract framework.

Proceedings ArticleDOI
14 Jun 2006
TL;DR: In this article, the authors generalize the constraint tightening approach to robust model predictive control, which guarantees robust feasibility and convergence for a constrained linear system subject to persistent, unknown but bounded disturbances.
Abstract: This paper generalizes the constraint tightening approach to robust model predictive control, which guarantees robust feasibility and convergence for a constrained linear system subject to persistent, unknown but bounded disturbances. The constraints in the optimization are tightened in a monotonic sequence such that a predetermined candidate correction policy is feasible for all possible disturbances. The generalization in this paper enables the candidate policy to be time-varying and considers a general convergence problem. A key feature of the generalization is the potential to use a range of nilpotent candidate policies, which eliminate the need to compute a robustly invariant terminal constraint set.

Journal ArticleDOI
TL;DR: In this article, a variant of the well-known isomorphism between completely positive maps and bipartite density operators is derived, which makes this connection much more explicit and is applied to elucidate the connection between no-cloning and no-broadcasting theorems and the monogamy of entanglement.
Abstract: Quantum theory can be regarded as a noncommutative generalization of classical probability. From this point of view, one expects quantum dynamics to be analogous to classical conditional probabilities. In this paper, a variant of the well-known isomorphism between completely positive maps and bipartite density operators is derived, which makes this connection much more explicit. This isomorphism is given an operational interpretation in terms of statistical correlations between ensemble preparation procedures and outcomes of measurements. Finally, the isomorphism is applied to elucidate the connection between no-cloning and no-broadcasting theorems and the monogamy of entanglement, and a simplified proof of the no-broadcasting theorem is obtained as a by-product.
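The isomorphism the paper builds on associates to a map Φ the Choi matrix J(Φ) = Σ_ij |i⟩⟨j| ⊗ Φ(|i⟩⟨j|); complete positivity of Φ corresponds to J being positive semidefinite, and trace preservation to the partial trace of J over the output equaling the identity. A minimal sketch for qubits, using the identity channel as the example (the variant isomorphism derived in the paper is not reproduced here):

```python
import numpy as np

d = 2

def choi(channel):
    """Choi matrix J(Phi) = sum_ij |i><j| (x) Phi(|i><j|)."""
    J = np.zeros((d * d, d * d), dtype=complex)
    for i in range(d):
        for j in range(d):
            E = np.zeros((d, d), dtype=complex)
            E[i, j] = 1.0
            J += np.kron(E, channel(E))
    return J

J = choi(lambda X: X)   # Choi matrix of the identity channel

# Complete positivity <=> J positive semidefinite;
# trace preservation <=> partial trace over the output equals the identity.
eigvals = np.linalg.eigvalsh(J)
partial_trace = np.einsum('ikjk->ij', J.reshape(d, d, d, d))
```

For the identity channel, J is (up to normalization) the projector onto a maximally entangled state, which is what makes the map–state correspondence work.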

Proceedings Article
04 Dec 2006
TL;DR: This work introduces the concept of irreducible independent subspaces or components and presents a generalization to a parameter-free mixture model, which relieves the condition of at-most-one-Gaussian by including previous results on non-Gaussian component analysis.
Abstract: The increasingly popular independent component analysis (ICA) may only be applied to data following the generative ICA model in order to guarantee algorithm-independent and theoretically valid results. Subspace ICA models generalize the assumption of component independence to independence between groups of components. They are attractive candidates for dimensionality reduction methods, however are currently limited by the assumption of equal group sizes or less general semi-parametric models. By introducing the concept of irreducible independent subspaces or components, we present a generalization to a parameter-free mixture model. Moreover, we relieve the condition of at-most-one-Gaussian by including previous results on non-Gaussian component analysis. After introducing this general model, we discuss joint block diagonalization with unknown block sizes, on which we base a simple extension of JADE to algorithmically perform the subspace analysis. Simulations confirm the feasibility of the algorithm.

Journal Article
TL;DR: An approach to the inductive synthesis of recursive equations from input/output-examples which is based on the classical two-step approach to induction of functional Lisp programs of Summers (1977) is described.
Abstract: We describe an approach to the inductive synthesis of recursive equations from input/output-examples which is based on the classical two-step approach to induction of functional Lisp programs of Summers (1977). In a first step, I/O-examples are rewritten to traces which explain the outputs given the respective inputs based on a datatype theory. These traces can be integrated into one conditional expression which represents a non-recursive program. In a second step, this initial program term is generalized into recursive equations by searching for syntactical regularities in the term. Our approach extends the classical work in several aspects. The most important extensions are that we are able to induce a set of recursive equations in one synthesizing step, the equations may contain more than one recursive call, and additionally needed parameters are automatically introduced.

Proceedings ArticleDOI
17 Jun 2006
TL;DR: This paper provides new insights into how the method works and uses these to derive new algorithms which given the data alone automatically learn different plausible data partitionings.
Abstract: Spectral clustering is a simple yet powerful method for finding structure in data using spectral properties of an associated pairwise similarity matrix. This paper provides new insights into how the method works and uses these to derive new algorithms which given the data alone automatically learn different plausible data partitionings. The main theoretical contribution is a generalization of a key result in the field, the multicut lemma [7]. We use this generalization to derive two algorithms. The first uses the eigenvalues of a given affinity matrix to infer the number of clusters in data, and the second combines learning the affinity matrix with inferring the number of clusters. A hierarchical implementation of the algorithms is also derived. The algorithms are theoretically motivated and demonstrated on nontrivial data sets.
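The first algorithm's idea, inferring the number of clusters from the eigenvalues of the affinity matrix, can be seen in the ideal case: for a disconnected similarity graph, the random-walk-normalized affinity has eigenvalue 1 with multiplicity equal to the number of components. A toy sketch (the paper's multicut-lemma analysis is far more general; the block affinity below is our own construction):

```python
import numpy as np

# Ideal affinity for two well-separated clusters: two disconnected blocks.
A = np.zeros((10, 10))
A[:5, :5] = 1.0
A[5:, 5:] = 1.0

P = A / A.sum(axis=1, keepdims=True)     # random-walk normalization D^-1 A
evals = np.sort(np.linalg.eigvals(P).real)[::-1]

# Multiplicity of eigenvalue 1 = number of connected components = #clusters.
n_clusters = int(np.sum(evals > 1 - 1e-8))
```

With noisy affinities the eigenvalues near 1 spread out, and the eigengap is used to pick the cluster count instead of an exact multiplicity.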

Journal ArticleDOI
TL;DR: This generalization of the Lambert W function expresses the exact solutions for general-relativistic self-gravitating N-body systems in one spatial and one time dimension, and reveals a previously unknown mathematical link between the (1+1) gravity problem and the Schrödinger wave equation.
Abstract: We present a canonical form for a natural and necessary generalization of the Lambert W function, natural in that it requires minimal mathematical definitions for this generalization, and necessary in that it provides a means of expressing solutions to a number of physical problems of fundamental nature. This generalization expresses the exact solutions for general-relativistic self-gravitating N-body systems in one spatial and one time dimension, and a previously unknown mathematical link between the (1+1) gravity problem and the Schrödinger wave equation.
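For reference, the ordinary Lambert W function being generalized solves W(z) e^{W(z)} = z. A self-contained Newton-iteration sketch of the principal branch for z ≥ 0 (the paper's generalized function is not implemented here; `lambert_w` and its starting guess are our own):

```python
import numpy as np

def lambert_w(z, tol=1e-12):
    """Principal branch of W(z) for z >= 0: solve w * exp(w) = z by Newton."""
    w = np.log1p(z)                        # reasonable starting guess for z >= 0
    for _ in range(100):
        e = np.exp(w)
        step = (w * e - z) / (e * (w + 1))  # f(w) / f'(w)
        w -= step
        if abs(step) < tol:
            break
    return w

w1 = lambert_w(1.0)   # the omega constant, W(1) ~ 0.567143
```

The defining identity w e^w = z is the easiest way to check any implementation.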

Book ChapterDOI
01 Jan 2006
TL;DR: It is shown that the p positive elements can be determined up to a constant number of misclassifications, bounded by the gap between the thresholds, and the number of tests needed to achieve this goal if n elements are given.
Abstract: We introduce a natural generalization of the well-studied group testing problem: A test gives a positive (negative) answer if the pool contains at least u (at most l) positive elements, and an arbitrary answer if the number of positive elements is between these fixed thresholds l and u. We show that the p positive elements can be determined up to a constant number of misclassifications, bounded by the gap between the thresholds. This is in a sense the best possible result. Then we study the number of tests needed to achieve this goal if n elements are given. If the gap is zero, the complexity is, similarly to classical group testing, O(plogn) for any fixed u. For the general case we propose a two-phase strategy consisting of a Distill and a Compress phase. We obtain some tradeoffs between classification accuracy and the number of tests.

Journal ArticleDOI
TL;DR: Bregman divergences are used to motivate a generalization of the least mean squared (LMS) algorithm, which can handle generalized linear models where the output of the system is a linear function combined with a nonlinear transfer function.
Abstract: Recently much work has been done analyzing online machine learning algorithms in a worst case setting, where no probabilistic assumptions are made about the data. This is analogous to the H∞ setting used in adaptive linear filtering. Bregman divergences have become a standard tool for analyzing online machine learning algorithms. Using these divergences, we motivate a generalization of the least mean squared (LMS) algorithm. The loss bounds for these so-called p-norm algorithms involve other norms than the standard 2-norm. The bounds can be significantly better if a large proportion of the input variables are irrelevant, i.e., if the weight vector we are trying to learn is sparse. We also prove results for nonstationary targets. We only know how to apply kernel methods to the standard LMS algorithm (i.e., p=2). However, even in the general p-norm case, we can handle generalized linear models where the output of the system is a linear function combined with a nonlinear transfer function (e.g., the logistic sigmoid).
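The baseline being generalized is the classical LMS update w ← w + η (y - w·x) x, which is the p = 2 member of the p-norm family. A minimal sketch of that baseline on a noiseless linear target (the p ≠ 2 link functions from the paper are not implemented; the data, step size, and target weights are our own):

```python
import numpy as np

rng = np.random.default_rng(4)
w_true = np.array([1.0, -2.0, 0.5])     # unknown target weight vector
w = np.zeros(3)
eta = 0.05                              # step size

# Classical (p = 2) LMS: stochastic gradient descent on the squared error.
for _ in range(5000):
    x = rng.standard_normal(3)
    y = w_true @ x                      # noiseless target output
    w += eta * (y - w @ x) * x
```

The p-norm variants replace this additive update with updates taken through a link function, which is what yields the better bounds for sparse targets that the abstract mentions.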

Journal ArticleDOI
TL;DR: It is proved that interval t-norms together with interval automorphisms form a category, and a functor between these categories is provided that always returns the best interval representation of any t-norm and automorphism, and can therefore be used to address the optimality of interval fuzzy algorithms.

Posted Content
TL;DR: An explicit solution to the rank-constrained matrix approximation problem in the Frobenius norm is given, which is a generalization of the classical approximation of an m × n matrix by a matrix of rank at most k.
Abstract: In this paper we give an explicit solution to the rank-constrained matrix approximation problem in the Frobenius norm, which is a generalization of the classical approximation of an m × n matrix A by a matrix of rank at most k.
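The classical special case being generalized is the Eckart–Young theorem: the best Frobenius-norm rank-k approximation is the truncated SVD, with optimal error equal to the l2 norm of the discarded singular values. A quick sketch of that special case (the paper's more general constrained problem is not shown; the matrix and k are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((6, 4))
k = 2

U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_k = (U[:, :k] * s[:k]) @ Vt[:k]       # truncated SVD: best rank-k approximation

# Eckart-Young: the optimal error is sqrt(sum of squared discarded singular values).
err = np.linalg.norm(A - A_k, 'fro')
```

Any other rank-k matrix gives a Frobenius error at least as large as `err`.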

Journal ArticleDOI
TL;DR: In this article, a three-parameter generalization of the Weibull distribution is presented to deal with general situations in modeling survival process with various shapes in the hazard function.
Abstract: A three-parameter generalization of the Weibull distribution is presented to deal with general situations in modeling survival process with various shapes in the hazard function. This generalized W...

Patent
31 May 2006
TL;DR: A variety of techniques are described by which keyword sets and target audience profiles may be generalized in a systematic and effective way with reference to relationships between keywords, profiles, and the data of an underlying user population.
Abstract: A variety of techniques are described by which keyword sets and target audience profiles may be generalized in a systematic and effective way with reference to relationships between keywords, profiles, and the data of an underlying user population.

Journal ArticleDOI
Bob Rehder1
TL;DR: Findings suggest that category-based property generalization is often an instance of causal inference.
Abstract: Five experiments were performed to investigate the category-based generalization of nonblank properties, properties that were novel but that were attributed to existing category features with causal explanations. Experiments 1–3 tested how such explanations interact with the well-known effects of similarity on such generalizations. The results showed that when the causal explanations were used, standard effects of typicality (Experiment 1), diversity (Experiment 2), or similarity itself (Experiment 3) were almost completely eliminated. Experiments 4 and 5 demonstrated that category-based generalizations exhibit some of the standard properties of causal reasoning; for example, an effect (i.e., a novel category property) is judged to be more prevalent when its cause (i.e., an existing category feature) is also prevalent. These findings suggest that category-based property generalization is often an instance of causal inference.

Proceedings Article
02 Jun 2006
TL;DR: A generalization of Dung's theory of argumentation is presented that takes into account additional constraints on the admissible sets of arguments, expressed as a propositional formula over the set of arguments.
Abstract: We present a generalization of Dung's theory of argumentation that takes into account additional constraints on the admissible sets of arguments, expressed as a propositional formula over the set of arguments. We point out several semantics for such constrained argumentation frameworks, and compare the corresponding inference relations w.r.t. cautiousness. We show that our setting encompasses some previous approaches based on Dung's theory as specific cases. We also investigate the complexity issue for the inference relations in the extended setting. Interestingly, we show that our generalization does not lead to a complexity shift w.r.t. inference for several semantics.
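Dung's admissibility semantics, which this work constrains, is easy to state operationally: a set of arguments is admissible if it is conflict-free and defends each of its members. A tiny brute-force sketch, with a propositional constraint applied as a final filter (the framework A→B→C and the constraint "A implies C" are our own toy choices, not from the paper):

```python
from itertools import combinations

args = {'A', 'B', 'C'}
attacks = {('A', 'B'), ('B', 'C')}     # A attacks B, B attacks C

def conflict_free(S):
    return not any((a, b) in attacks for a in S for b in S)

def defends(S, a):
    """S defends a if every attacker of a is attacked by some member of S."""
    attackers = {x for (x, y) in attacks if y == a}
    return all(any((s, x) in attacks for s in S) for x in attackers)

def admissible(S):
    return conflict_free(S) and all(defends(S, a) for a in S)

adm = [set(S) for r in range(len(args) + 1)
       for S in combinations(sorted(args), r) if admissible(set(S))]

# Constrained framework: keep only admissible sets satisfying "A implies C".
constrained = [S for S in adm if ('A' not in S) or ('C' in S)]
```

Here the constraint eliminates {A} (it contains A without C), illustrating how a propositional formula prunes the admissible sets.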

Proceedings ArticleDOI
15 May 2006
TL;DR: An architecture is presented for generically extracting the constraints of a given task in a programming-by-demonstration framework and for generalizing the acquired knowledge, and the skills themselves, to various contexts.
Abstract: This paper presents an architecture for solving generically the problem of extracting the constraints of a given task in a programming by demonstration framework and the problem of generalizing the acquired knowledge to various contexts. We validate the architecture in a series of experiments, where a human demonstrator teaches a humanoid robot simple manipulatory tasks. First, the combined joint angles and hand path motions are projected into a generic latent space, composed of a mixture of Gaussians (GMM) spreading across the spatial dimensions of the motion. Second, the temporal variation of the latent representation of the motion is encoded in a hidden Markov model (HMM). This two-step probabilistic encoding provides a measure of the spatio-temporal correlations across the different modalities collected by the robot, which determines a metric of imitation performance. A generalization of the demonstrated trajectories is then performed using Gaussian mixture regression (GMR). Finally, to generalize skills across contexts, we compute formally the trajectory that optimizes the metric, given the new context and the robot's specific body constraints.

Journal ArticleDOI
TL;DR: A new general class of methods is proposed for each alternative generalization of canonical correlation; together these form a superclass of methods that strike a compromise between explaining the variance within sets of variables and explaining the agreement between sets of variables.

Journal ArticleDOI
TL;DR: In this paper, the equivalence of intersection-body generalizations of the Busemann–Petty problem is studied via the integral geometry of the Grassmann manifold.

Journal ArticleDOI
TL;DR: In this article, the authors consider the fractional generalization of nonholonomic constraints defined by equations with fractional derivatives, provide examples, derive the corresponding equations of motion from a variational principle, and prove that fractional constraints can be used to describe the evolution of dynamical systems in which some coordinates and velocities are related through a power-law memory function.
Abstract: We consider the fractional generalization of nonholonomic constraints defined by equations with fractional derivatives and provide some examples. The corresponding equations of motion are derived using a variational principle. We prove that fractional constraints can be used to describe the evolution of dynamical systems in which some coordinates and velocities are related through a power-law memory function.

Journal ArticleDOI
TL;DR: A way to simulate the basic interactions between two individuals with different opinions, in the context of strategic game theory, is proposed, and a generalization of the Deffuant et al. model of continuous opinion dynamics is obtained.
Abstract: A way to simulate the basic interactions between two individuals with different opinions, in the context of strategic game theory, is proposed. Various games are considered, which produce different kinds of opinion formation dynamics. First, by assuming that all individuals (players) are equals, we obtain the bounded confidence model of continuous opinion dynamics proposed by Deffuant et al. In such a model a tolerance threshold is defined, such that individuals with difference in opinion larger than the threshold cannot interact. Then, we consider that the individuals have different inclinations to change opinion and different abilities in convincing the others. In this way, we obtain the so-called "Stubborn individuals and Orators" (SO) model, a generalization of the Deffuant et al. model, in which the threshold tolerance is different for every couple of individuals. We explore, by numerical simulations, the dynamics of the SO model, and we propose further generalizations that can be implemented.
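The Deffuant et al. baseline that the SO model generalizes is a few lines of simulation: pick random pairs, and if their opinions differ by less than the tolerance d, move both toward each other by a fraction mu of the difference. A minimal sketch of that symmetric baseline (the SO model's per-pair thresholds and asymmetric convincing abilities are not implemented; n, mu, d, and the iteration count are our own choices):

```python
import numpy as np

rng = np.random.default_rng(6)
n, mu, d = 50, 0.5, 1.1        # agents, convergence parameter, tolerance threshold

x = rng.uniform(0, 1, n)       # initial opinions in [0, 1]
m0 = x.mean()

for _ in range(20000):
    i, j = rng.integers(n, size=2)
    if i != j and abs(x[i] - x[j]) < d:
        xi, xj = x[i], x[j]
        x[i] += mu * (xj - xi)  # both opinions move toward each other
        x[j] += mu * (xi - xj)
```

With d larger than the whole opinion range every pair can interact, so the population reaches consensus while the symmetric updates conserve the mean opinion; smaller d fragments the population into opinion clusters instead.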