
Showing papers on "Generalization published in 2007"


Posted Content
TL;DR: The cube operator as discussed by the authors generalizes the histogram, cross-tabulation, roll-up, drill-down, and sub-total constructs found in most report writers, and treats each of the N aggregation attributes as a dimension of N-space.
Abstract: Data analysis applications typically aggregate data across many dimensions looking for anomalies or unusual patterns. The SQL aggregate functions and the GROUP BY operator produce zero-dimensional or one-dimensional aggregates. Applications need the N-dimensional generalization of these operators. This paper defines that operator, called the data cube or simply cube. The cube operator generalizes the histogram, cross-tabulation, roll-up, drill-down, and sub-total constructs found in most report writers. The novelty is that cubes are relations. Consequently, the cube operator can be imbedded in more complex non-procedural data analysis programs. The cube operator treats each of the N aggregation attributes as a dimension of N-space. The aggregate of a particular set of attribute values is a point in this space. The set of points forms an N-dimensional cube. Super-aggregates are computed by aggregating the N-cube to lower dimensional spaces. This paper (1) explains the cube and roll-up operators, (2) shows how they fit in SQL, (3) explains how users can define new aggregate functions for cubes, and (4) discusses efficient techniques to compute the cube. Many of these features are being added to the SQL Standard.
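
A minimal Python sketch of the cube idea described above (not the SQL operator itself, and nothing like the efficient computation techniques the paper discusses): aggregate a measure over every subset of the dimension attributes, using an 'ALL' placeholder for rolled-up dimensions. The example rows are illustrative.

```python
from itertools import combinations
from collections import defaultdict

def data_cube(rows, dims, measure):
    """Aggregate `measure` over every subset of the `dims` attributes.

    Rolled-up dimensions get the placeholder 'ALL', mirroring the value the
    cube operator introduces for super-aggregates.
    """
    cube = defaultdict(float)
    subsets = (set(c) for r in range(len(dims) + 1) for c in combinations(dims, r))
    for keep in subsets:
        for row in rows:
            key = tuple(row[d] if d in keep else 'ALL' for d in dims)
            cube[key] += row[measure]
    return dict(cube)

rows = [
    {'model': 'Chevy', 'year': 1994, 'color': 'red',  'sales': 5.0},
    {'model': 'Chevy', 'year': 1994, 'color': 'blue', 'sales': 87.0},
    {'model': 'Ford',  'year': 1995, 'color': 'red',  'sales': 64.0},
]
for key, total in sorted(data_cube(rows, ['model', 'year', 'color'], 'sales').items(), key=str):
    print(key, total)
```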

1,870 citations


01 Jan 2007
TL;DR: The notion of bias in generalization problems is defined, and it is shown that biases are necessary for the inductive leap.
Abstract: Learning involves the ability to generalize from past experience in order to deal with new situations that are "related to" this experience. The inductive leap needed to deal with new situations seems to be possible only under certain biases for choosing one generalization of the situation over another. This paper defines precisely the notion of bias in generalization problems, then shows that biases are necessary for the inductive leap. Classes of justifiable biases are considered, and the relationship between bias and domain-independence is examined. We restrict the scope of this discussion to the problem of generalizing from training instances, defined as follows: The Generalization Problem. Given: ...

555 citations


Book
30 Jun 2007
TL;DR: An alternative selection scheme based on relative bounds between estimators is described and studied, and a two-step localization technique that can handle the selection of a parametric model from a family of models is presented.
Abstract: This monograph deals with adaptive supervised classification, using tools borrowed from statistical mechanics and information theory, stemming from the PAC-Bayesian approach pioneered by David McAllester and applied to a conception of statistical learning theory forged by Vladimir Vapnik. Using convex analysis on the set of posterior probability measures, we show how to get local measures of the complexity of the classification model involving the relative entropy of posterior distributions with respect to Gibbs posterior measures. We then discuss relative bounds, comparing the generalization error of two classification rules, showing how the margin assumption of Mammen and Tsybakov can be replaced with some empirical measure of the covariance structure of the classification model. We show how to associate to any posterior distribution an effective temperature relating it to the Gibbs prior distribution with the same level of expected error rate, and how to estimate this effective temperature from data, resulting in an estimator whose expected error rate converges according to the best possible power of the sample size adaptively under any margin and parametric complexity assumptions. We describe and study an alternative selection scheme based on relative bounds between estimators, and present a two-step localization technique which can handle the selection of a parametric model from a family of models. We show how to extend systematically all the results obtained in the inductive setting to transductive learning, and use this to improve Vapnik's generalization bounds, extending them to the case when the sample is made of independent non-identically distributed pairs of patterns and labels. Finally we review briefly the construction of Support Vector Machines and show how to derive generalization bounds for them, measuring the complexity either through the number of support vectors or through the value of the transductive or inductive margin.
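
For orientation, a standard (non-localized) PAC-Bayes bound of the kind this monograph refines; the notation is generic (R the true error, r the empirical error on an i.i.d. sample of size n, π a prior and ρ any posterior over classifiers), and this is not the specific relative or localized bound derived in the text.

```latex
% With probability at least 1 - \delta, simultaneously for every posterior \rho:
\mathbb{E}_{h \sim \rho}\,[R(h)] \;\le\; \mathbb{E}_{h \sim \rho}\,[r(h)]
  \;+\; \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
```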

369 citations


Proceedings ArticleDOI
11 Jun 2007
TL;DR: A polynomial time approximation scheme (PTAS) for the minimum feedback arc set problem on tournaments and a simple weighted generalization gives a PTAS for Kemeny-Young rank aggregation.
Abstract: We present a polynomial time approximation scheme (PTAS) for the minimum feedback arc set problem on tournaments. A simple weighted generalization gives a PTAS for Kemeny-Young rank aggregation.
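
A brute-force illustration of the Kemeny-Young objective mentioned above (exponential in the number of candidates, nothing like the PTAS of the paper): choose the permutation minimizing total pairwise disagreement with the input rankings. The example votes are made up.

```python
from itertools import permutations

def kendall_tau_distance(order, ranking):
    """Number of candidate pairs ordered differently by `order` and `ranking`."""
    pos = {c: i for i, c in enumerate(ranking)}
    return sum(1
               for i in range(len(order))
               for j in range(i + 1, len(order))
               if pos[order[i]] > pos[order[j]])

def kemeny_aggregate(rankings):
    candidates = rankings[0]
    return min(permutations(candidates),
               key=lambda order: sum(kendall_tau_distance(order, r) for r in rankings))

votes = [('a', 'b', 'c', 'd'), ('b', 'a', 'c', 'd'), ('a', 'c', 'b', 'd')]
print(kemeny_aggregate(votes))  # a consensus ordering, e.g. ('a', 'b', 'c', 'd')
```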

284 citations


01 Jan 2007
TL;DR: A new model of human concept learning that provides a rational analysis of learning feature-based concepts is proposed, built upon Bayesian inference for a grammatically structured hypothesis space, a concept language of logical rules.
Abstract: A Rational Analysis of Rule-based Concept Learning. Noah D. Goodman 1 (ndg@mit.edu), Thomas Griffiths 2 (tom_griffiths@berkeley.edu), Jacob Feldman 3 (jacob@ruccs.rutgers.edu), Joshua B. Tenenbaum 1 (jbt@mit.edu). 1 Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology; 2 Department of Psychology, University of California, Berkeley; 3 Department of Psychology, Center for Cognitive Science, Rutgers University.

We propose a new model of human concept learning that provides a rational analysis for learning of feature-based concepts. This model is built upon Bayesian inference for a grammatically structured hypothesis space, a "concept language" of logical rules. We compare the model predictions to human generalization judgments in two well-known category learning experiments, and find good agreement for both average and individual participants' generalizations. Keywords: concept learning; categorization; Bayesian induction; probabilistic grammar; rules.

Introduction. Concepts are a topic of perennial interest to psychology, particularly concepts which identify kinds of things. Such concepts are mental representations which enable one to discriminate between objects that satisfy the concept and those which do not. Given their discriminative use, a natural hypothesis is that concepts are simply rules for classifying objects based on features. Indeed, the "classical" theory of concepts (see Smith and Medin, 1981) takes this viewpoint, suggesting that a concept can be expressed as a simple feature-based rule: a conjunction of features that are necessary and jointly sufficient for membership. Early models based on this approach failed to account for many aspects of human categorization behavior, especially the graded use of concepts (Mervis and Rosch, 1981). Attention consequently turned to models with a more statistical nature: similarity to prototypes or to exemplars (Medin and Schaffer, 1978; Kruschke, 1992; Love et al., 2004).

The statistical nature of many of these models has made them amenable to a rational analysis (Anderson, 1990), which attempts to explain why people do what they do, complementing (often apparently ad hoc) process-level accounts. Despite the success of similarity-based models, recently renewed interest has led to more sophisticated rule-based models. Among the reasons for this reconsideration are the inability of similarity-based models to provide a method for concept combination, common reports by participants that they "feel as if" they are using a rule, and the unrealistic memory demands of most similarity-based models. The RULEX model (Nosofsky et al., 1994), for instance, treats concepts as conjunctive rules plus exceptions, learned by a heuristic search process, and has some of the best fits to human experimental data, particularly for the judgments of individual participants. Parallel motivation for reexamining the role of logical structures in human concept representation comes from evidence that the difficulty of learning a new concept is well predicted by its logical complexity (Feldman, 2000).

However, existing rule-based models are primarily heuristic: no rational analysis has been provided, and they have not been tied to statistical approaches to induction. A rational analysis for rule-based models might assume that concepts are (represented as) rules, and ask what degree of belief a rational agent should accord to each rule, given some observed examples. We answer this question by formulating the hypothesis space of rules as words in a "concept language" generated by a context-free grammar. Considering the probability of productions in this grammar leads to a prior probability for words in the language, and the logical form of these words motivates an expression for the probability of observed examples given a rule. The methods of Bayesian analysis then lead to the Rational Rules model of concept learning. This grammatical approach to induction has benefits for Bayesian rational analysis: it compactly specifies an infinite, and flexible, hypothesis space of structured rules and a prior that decreases with complexity. The Rational Rules model thus makes contributions to both rule-based concept modeling and rational statistical learning models: to the former it provides a rational analysis, and to the latter it provides the grammar-based approach. Across a range of experimental tasks, this new model achieves comparable fits to the best rule-based models in the literature, but with fewer free parameters and arbitrary processing assumptions.

An Analysis of Concepts. A general approach to the rational analysis of inductive learning problems has emerged in recent years (Anderson, 1990; Tenenbaum, 1999; Chater and Oaksford, 1999). Under this approach a space of hypotheses is posited, and beliefs are assigned using Bayesian statistics, a coherent framework that combines data and a priori knowledge to give posterior degrees of belief. Uses of this approach, for instance in causal induction (Griffiths and Tenenbaum, 2005) and word learning (Xu and Tenenbaum, 2005), have successfully predicted human generalization behavior in a range of tasks. In our case, we wish to establish a hypothesis space of rules, and analyze the behavior of a rational agent trying to learn those rules from labeled examples. Thus the learning problem is to determine P(F | E, ℓ(E)), where F ranges over rules, E is the set of observed example objects (possibly with repeats) and ℓ(E) are the observed labels. (Throughout this section we consider a single labeled concept, thus ℓ(x) ∈ {0, 1} indicates whether x is an example or a non-example of the concept.) This quantity may be expressed ...
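
The truncated final sentence presumably continues with the standard Bayesian decomposition of this posterior; a hedged reconstruction (not quoted from the paper) is:

```latex
P\bigl(F \mid E, \ell(E)\bigr) \;\propto\; P\bigl(\ell(E) \mid E, F\bigr)\, P(F),
```

with the grammar supplying the prior P(F) and the logical form of F supplying the likelihood of the observed labels.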

278 citations


Journal ArticleDOI
TL;DR: An attribute generalization and its relation to feature selection and feature extraction are discussed and a new approach for incrementally updating approximations of a concept is presented under the characteristic relation-based rough sets.
Abstract: Any attribute set in an information system may be evolving in time when new information arrives. Approximations of a concept by rough set theory need updating for data mining or other related tasks. For incrementally updating approximations of a concept, methods using the tolerance relation and similarity relation have previously been studied in the literature. The characteristic relation-based rough sets approach provides more informative results than the tolerance- and similarity-relation-based approaches. In this paper, an attribute generalization and its relation to feature selection and feature extraction are first discussed. Then, a new approach for incrementally updating approximations of a concept is presented under the characteristic relation-based rough sets. Finally, the approach of direct computation of rough set approximations and the proposed approach of dynamic maintenance of rough set approximations are employed for performance comparison. An extensive experimental evaluation on a large soybean database from MLC shows that the proposed approach effectively handles a dynamic attribute generalization in data mining.
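
A small sketch of the basic rough-set constructs this paper builds on, using an ordinary indiscernibility (equivalence) relation rather than the characteristic relation of the paper; the objects, attributes, and concept are made up for illustration.

```python
def indiscernibility_classes(objects, attributes):
    """Partition objects by equality of their values on the chosen attributes."""
    classes = {}
    for name, values in objects.items():
        key = tuple(values[a] for a in attributes)
        classes.setdefault(key, set()).add(name)
    return list(classes.values())

def approximations(objects, attributes, concept):
    """Lower and upper approximations of `concept` (a set of object names)."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(objects, attributes):
        if block <= concept:
            lower |= block          # block certainly inside the concept
        if block & concept:
            upper |= block          # block possibly inside the concept
    return lower, upper

objects = {
    'o1': {'colour': 'red',  'size': 'big'},
    'o2': {'colour': 'red',  'size': 'big'},
    'o3': {'colour': 'blue', 'size': 'small'},
}
print(approximations(objects, ['colour', 'size'], {'o1', 'o3'}))
```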

277 citations


Journal ArticleDOI
TL;DR: In this paper, the concept of Euler-Lagrange fractional extremal was used to prove a Noether-type theorem for the calculus of variations with fractional derivatives.

274 citations


Journal ArticleDOI
TL;DR: In this paper, a portfolio choice application compared the effect of changes in confidence under ambiguity vs. changes in estimation risk under Bayesian learning, and the former was shown to induce a trend towards more stock market participation and investment even when the latter does not.
Abstract: This paper considers learning when the distinction between risk and ambiguity matters. It first describes thought experiments, dynamic variants of those provided by Ellsberg, that highlight a sense in which the Bayesian learning model is extreme—it models agents who are implausibly ambitious about what they can learn in complicated environments. The paper then provides a generalization of the Bayesian model that accommodates the intuitive choices in the thought experiments. In particular, the model allows decision-makers’ confidence about the environment to change—along with beliefs—as they learn. A portfolio choice application compares the effect of changes in confidence under ambiguity vs. changes in estimation risk under Bayesian learning. The former is shown to induce a trend towards more stock market participation and investment even when the latter does not.

251 citations


Journal ArticleDOI
TL;DR: In this paper, a fractional-order dynamical model of love is proposed, in which the state dynamics of the model are permitted to assume fractional orders, and it is shown that with appropriate model parameters, strange chaotic attractors may be obtained under different fractional orders.
Abstract: This paper examines fractional-order dynamical models of love. It offers a generalization of a dynamical model recently reported in the literature. The generalization is obtained by permitting the state dynamics of the model to assume fractional orders. The fact that fractional systems possess memory justifies this generalization, as the time evolution of romantic relationships is naturally impacted by memory. We show that with appropriate model parameters, strange chaotic attractors may be obtained under different fractional orders, thus confirming previously reported results obtained from integer-order models, yet at an advantage of reduced system order. Furthermore, this work opens a new direction of research whereby fractional derivative applications might offer more insight into the modeling of dynamical systems in psychology and life sciences. Our results are validated by numerical simulations.
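
A rough numerical sketch of how a fractional-order system can be simulated with a Grünwald-Letnikov discretization; the two-variable linear dynamics and all coefficients here are illustrative placeholders, not the model studied in the paper.

```python
import numpy as np

def gl_coefficients(q, n):
    """Grunwald-Letnikov coefficients c_j = (-1)^j * C(q, j), computed recursively."""
    c = np.empty(n + 1)
    c[0] = 1.0
    for j in range(1, n + 1):
        c[j] = c[j - 1] * (1.0 - (q + 1.0) / j)
    return c

def simulate_fractional(f, q, x0, h=0.01, steps=2000):
    """Explicit GL scheme for the system D^q x = f(x), with one order q_i per state."""
    d = len(x0)
    c = np.array([gl_coefficients(qi, steps) for qi in q])
    x = np.zeros((steps + 1, d))
    x[0] = x0
    for n in range(1, steps + 1):
        # memory term: sum_{j=1..n} c_j * x_{n-j}, per state component
        memory = np.array([c[i, 1:n + 1] @ x[n - 1::-1, i] for i in range(d)])
        x[n] = f(x[n - 1]) * h ** np.asarray(q) - memory
    return x

# Illustrative linear two-variable dynamics with placeholder coefficients.
A = np.array([[-0.2, 1.0],
              [-1.0, -0.1]])
traj = simulate_fractional(lambda z: A @ z, q=[0.9, 0.95], x0=[1.0, 0.0])
print(traj[-1])
```

As a sanity check, setting q = 1 reduces the scheme to ordinary forward Euler.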

221 citations


Proceedings Article
11 Mar 2007
TL;DR: It is argued that this setting is more natural in some experimental settings, and an algorithm based on convex optimization techniques is proposed for a generalization of the non-metric multidimensional scaling problem in which only a set of order relations of the form dij < dkl is provided.
Abstract: We consider the non-metric multidimensional scaling problem: given a set of dissimilarities ∆, find an embedding whose inter-point Euclidean distances have the same ordering as ∆. In this paper, we look at a generalization of this problem in which only a set of order relations of the form dij < dkl are provided. Unlike the original problem, these order relations can be contradictory and need not be specified for all pairs of dissimilarities. We argue that this setting is more natural in some experimental settings and propose an algorithm based on convex optimization techniques to solve this problem. We apply this algorithm to human subject data from a psychophysics experiment concerning how reflectance properties are perceived. We also look at the standard NMDS problem, where a dissimilarity matrix ∆ is provided as input, and show that we can always find an order-respecting embedding of ∆.
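
A toy numpy sketch of the order-relation setting, not the authors' convex formulation: given constraints saying pair (i, j) should end up closer than pair (k, l), minimize a hinge penalty on violated constraints by plain gradient descent on the point coordinates. The constraint list and hyperparameters are made up.

```python
import numpy as np

def order_embedding(n_points, constraints, dim=2, margin=0.1, lr=0.05, iters=2000, seed=0):
    """constraints: list of (i, j, k, l) meaning d(i, j) < d(k, l) is desired."""
    rng = np.random.default_rng(seed)
    X = rng.normal(scale=0.1, size=(n_points, dim))
    for _ in range(iters):
        grad = np.zeros_like(X)
        for i, j, k, l in constraints:
            close = X[i] - X[j]
            far = X[k] - X[l]
            # hinge on squared distances: want ||xi-xj||^2 + margin <= ||xk-xl||^2
            if close @ close + margin > far @ far:
                grad[i] += 2 * close
                grad[j] -= 2 * close
                grad[k] -= 2 * far
                grad[l] += 2 * far
        X -= lr * grad
    return X

# Four points; ask that (0,1) end up closer than (2,3) and (0,2) closer than (1,3).
X = order_embedding(4, [(0, 1, 2, 3), (0, 2, 1, 3)])
print(np.round(X, 2))
```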

197 citations



Proceedings ArticleDOI
20 Jun 2007
TL;DR: This work proposes a generalization of multilinear models using nonlinear basis functions and obtains a multifactor form of the Gaussian process latent variable model, in which each factor is kernelized independently, allowing nonlinear mappings from any particular factor to the data.
Abstract: We introduce models for density estimation with multiple, hidden, continuous factors. In particular, we propose a generalization of multilinear models using nonlinear basis functions. By marginalizing over the weights, we obtain a multifactor form of the Gaussian process latent variable model. In this model, each factor is kernelized independently, allowing nonlinear mappings from any particular factor to the data. We learn models for human locomotion data, in which each pose is generated by factors representing the person's identity, gait, and the current state of motion. We demonstrate our approach using time-series prediction, and by synthesizing novel animation from the model.

Journal ArticleDOI
TL;DR: In this paper, a method for learning sparse representations shared across multiple tasks is proposed, which is based on a novel non-convex regularizer which controls the number of learned features common across the tasks.
Abstract: We present a method for learning sparse representations shared across multiple tasks. This method is a generalization of the well-known single-task 1-norm regularization. It is based on a novel non-convex regularizer which controls the number of learned features common across the tasks. We prove that the method is equivalent to solving a convex optimization problem for which there is an iterative algorithm which, as we prove, converges to an optimal solution. The algorithm has a simple interpretation: it alternately performs a supervised and an unsupervised step, where in the former step it learns task functions and in the latter step it learns sparse representations, common across tasks, for these functions. We also provide an extension of the algorithm which learns sparse nonlinear representations using kernels. We report experiments on simulated and real data sets which demonstrate that the proposed method can both improve the performance relative to learning each task independently and lead to a few learned features common across related tasks. Our algorithm can also be used, as a special case, to simply select - not learn - a few common variables across the tasks.
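
A rough numpy sketch of the alternating idea described above: regularized per-task regressions given a shared structure matrix D, then an update of D from the stacked task weights. The ε-smoothing, fixed iteration count, and synthetic data are simplifications of my own, not the paper's exact algorithm or its kernel extension.

```python
import numpy as np

def multitask_feature_learning(Xs, ys, gamma=1.0, iters=50, eps=1e-6):
    """Alternating sketch: per-task ridge-like solves, then shared-structure update."""
    d = Xs[0].shape[1]
    D = np.eye(d) / d                                    # start from isotropic structure
    for _ in range(iters):
        D_inv = np.linalg.inv(D + eps * np.eye(d))
        A = np.column_stack([
            np.linalg.solve(X.T @ X + gamma * D_inv, X.T @ y)
            for X, y in zip(Xs, ys)                      # one weight vector per task
        ])
        # D <- (A A^T)^(1/2), normalized to unit trace (via eigen-decomposition)
        vals, vecs = np.linalg.eigh(A @ A.T + eps * np.eye(d))
        root = (vecs * np.sqrt(np.clip(vals, 0, None))) @ vecs.T
        D = root / np.trace(root)
    return A, D

# Two synthetic tasks sharing the same two relevant features out of ten.
rng = np.random.default_rng(0)
w = np.zeros(10)
w[:2] = [2.0, -3.0]
Xs = [rng.normal(size=(40, 10)) for _ in range(2)]
ys = [X @ w + 0.1 * rng.normal(size=40) for X in Xs]
A, D = multitask_feature_learning(Xs, ys)
print(np.round(np.diag(D), 3))   # mass should concentrate on the shared features
```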

Posted Content
TL;DR: The main result of this paper is a (practically optimal) criterion for the pseudoeffectivity of the twisted relative canonical bundles of surjective projective maps.
Abstract: The main result of the present article is a (practically optimal) criterion for the pseudoeffectivity of the twisted relative canonical bundles of surjective projective maps. Our theorem has several applications in algebraic geometry; to start with, we obtain the natural analytic generalization of some semipositivity results due to E. Viehweg and F. Campana. As a byproduct, we give a simple and direct proof of a recent result due to C. Hacon--J. McKernan, S. Takayama and H. Tsuji concerning the extension of twisted pluricanonical forms. More applications will be offered in the sequel of this article.

Journal ArticleDOI
TL;DR: This work proposes a generalization of CCA to several data sets, which is shown to be equivalent to the classical maximum variance (MAXVAR) generalization proposed by Kettenring.

Journal ArticleDOI
TL;DR: This paper presented a taxonomy that distinguishes between students' activity as they generalize, or generalizing actions, and students' final statements of generalization, or reflection generalizations, based on a 3-week teaching experiment and a series of individual interviews.
Abstract: This article presents a cohesive, empirically grounded categorization system differentiating the types of generalizations students constructed when reasoning mathematically. The generalization taxonomy developed out of an empirical study conducted during a 3-week teaching experiment and a series of individual interviews. Qualitative analysis of data from teaching sessions with 7 seventh-graders and individual interviews with 7 eighth-graders resulted in a taxonomy that distinguishes between students' activity as they generalize, or generalizing actions, and students' final statements of generalization, or reflection generalizations. The three major generalizing action categories that emerged from analysis are (a) relating, in which one forms an association between two or more problems or objects, (b) searching, in which one repeats an action to locate an element of similarity, and (c) extending, in which one expands a pattern or relation into a more general structure. Reflection generalizations took the f...

Journal ArticleDOI
01 Jun 2007
TL;DR: This paper proposes a generalization of narrowing which can be used to solve reachability goals in initial and free models of a rewrite theory ℛ and identifies several large classes of rewrite theories, covering many practical applications, for which narrowing is strongly complete.
Abstract: Narrowing was introduced, and has traditionally been used, to solve equations in initial and free algebras modulo a set of equations E. This paper proposes a generalization of narrowing which can be used to solve reachability goals in initial and free models of a rewrite theory ℛ. We show that narrowing is sound and weakly complete (i.e., complete for normalized solutions) under appropriate executability assumptions about ℛ. We also show that in general narrowing is not strongly complete, that is, not complete when some solutions can be further rewritten by ℛ. We then identify several large classes of rewrite theories, covering many practical applications, for which narrowing is strongly complete. Finally, we illustrate an application of narrowing to analysis of cryptographic protocols.

Journal ArticleDOI
TL;DR: This letter proposes a one-parameter family of integration, called α-integration, which includes all of these well-known integrations; these are generalizations of various averages of numbers such as arithmetic, geometric, and harmonic averages.
Abstract: When there are a number of stochastic models in the form of probability distributions, one needs to integrate them. Mixtures of distributions are frequently used, but exponential mixtures also provide a good means of integration. This letter proposes a one-parameter family of integration, called α-integration, which includes all of these well-known integrations. These are generalizations of various averages of numbers such as arithmetic, geometric, and harmonic averages. There are psychophysical experiments that suggest that α-integrations are used in the brain. The α-divergence between two distributions is defined, which is a natural generalization of Kullback-Leibler divergence and Hellinger distance, and it is proved that α-integration is optimal in the sense of minimizing α-divergence. The theory is applied to generalize the mixture of experts and the product of experts to the α-mixture of experts. The α-predictive distribution is also stated in the Bayesian framework.
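
A small numeric illustration of the α-mean family mentioned above, under Amari's convention f_α(u) = u^((1-α)/2) (and log u for α = 1); with that convention α = -1, 1, 3 recover the arithmetic, geometric, and harmonic means. Treat the exact parameterization as an assumption of this sketch rather than a quotation from the letter.

```python
import numpy as np

def alpha_mean(values, weights, alpha):
    """Weighted alpha-integration of positive numbers (Amari's convention assumed)."""
    v = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    if alpha == 1:                       # f(u) = log u  -> weighted geometric mean
        return float(np.exp(w @ np.log(v)))
    p = (1.0 - alpha) / 2.0              # f(u) = u^p, inverse f^{-1}(u) = u^(1/p)
    return float((w @ v ** p) ** (1.0 / p))

vals, wts = [1.0, 4.0, 16.0], [1, 1, 1]
print(alpha_mean(vals, wts, -1))   # arithmetic mean: 7.0
print(alpha_mean(vals, wts, 1))    # geometric mean: 4.0
print(alpha_mean(vals, wts, 3))    # harmonic mean: ~2.29
```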

Journal ArticleDOI
01 Oct 2007
TL;DR: In this article, a generalization of the Diffie-Hellman key exchange over finite sets has been proposed, where abelian semigroups act on finite sets and a simple semigroup action is constructed from simple semirings.
Abstract: A generalization of the original Diffie-Hellman key exchange in $(\mathbb{Z}/p\mathbb{Z})^*$ found a new depth when Miller [27] and Koblitz [16] suggested that such a protocol could be used with the group over an elliptic curve. In this paper, we propose a further vast generalization where abelian semigroups act on finite sets. We define a Diffie-Hellman key exchange in this setting and we illustrate how to build interesting semigroup actions using finite (simple) semirings. The practicality of the proposed extensions relies on the orbit sizes of the semigroup actions, and at this point it is an open question how to compute the sizes of these orbits in general, and also whether there exists a square-root attack in general. In Section 5 a concrete practical semigroup action built from simple semirings is presented. It will require further research to analyse this system.
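
A toy sketch of key exchange from a commutative action, with classical modular exponentiation standing in for the semiring-based actions the paper constructs; the parameters are tiny placeholders, not secure or recommended choices.

```python
import secrets

def make_dh(action, element):
    """Generic Diffie-Hellman from a commutative action: act(a, act(b, x)) == act(b, act(a, x))."""
    def keypair(secret):
        return secret, action(secret, element)          # (private, public)
    def shared(my_secret, their_public):
        return action(my_secret, their_public)
    return keypair, shared

# Classical instance: integers acting on (Z/pZ)* by exponentiation.
p, g = 2**127 - 1, 5                                    # illustrative parameters only
action = lambda k, x: pow(x, k, p)

keypair, shared = make_dh(action, g)
a_priv, a_pub = keypair(secrets.randbelow(p - 2) + 1)
b_priv, b_pub = keypair(secrets.randbelow(p - 2) + 1)
assert shared(a_priv, b_pub) == shared(b_priv, a_pub)   # both sides derive the same key
```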

Journal ArticleDOI
TL;DR: The concept of I-convergence is a generalization of statistical convergence and it is dependent on the notion of the ideal I of subsets of the set N of positive integers as mentioned in this paper.
Abstract: The concept of I-convergence is a generalization of statistical convergence and it is dependent on the notion of the ideal I of subsets of the set N of positive integers. In this paper we prove a decomposition theorem for I-convergent sequences, introduce the notions of I-Cauchy sequence and I∗-Cauchy sequence, and then study certain of their properties.
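
For reference, the textbook definitions behind the abstract (with natural density d and an ideal I of subsets of N); these are the standard formulations, not anything specific to this paper's decomposition theorem.

```latex
% Natural density of A \subseteq \mathbb{N}:
d(A) = \lim_{m \to \infty} \frac{1}{m}\,\bigl|\{\, n \le m : n \in A \,\}\bigr|.
% Statistical convergence:
x_n \xrightarrow{\ \mathrm{stat}\ } L \iff
  \forall \varepsilon > 0:\ d\bigl(\{\, n : |x_n - L| \ge \varepsilon \,\}\bigr) = 0.
% I-convergence for an admissible ideal I \subseteq 2^{\mathbb{N}}:
x_n \xrightarrow{\ I\ } L \iff
  \forall \varepsilon > 0:\ \{\, n : |x_n - L| \ge \varepsilon \,\} \in I.
% Choosing I = \{ A : d(A) = 0 \} recovers statistical convergence.
```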


Proceedings ArticleDOI
21 Oct 2007
TL;DR: The approach combines ideas from the junta test of Fischer et al. [16] with ideas from learning theory, and yields property testers that make poly(s/ε) queries for Boolean function classes such as s-term DNF formulas and s-sparse polynomials over finite fields.
Abstract: We describe a general method for testing whether a function on n input variables has a concise representation. The approach combines ideas from the junta test of Fischer et al. [16] with ideas from learning theory, and yields property testers that make poly(s/ε) queries (independent of n) for Boolean function classes such as s-term DNF formulas (answering a question posed by Parnas et al. [12]), size-s decision trees, size-s Boolean formulas, and size-s Boolean circuits. The method can be applied to non-Boolean valued function classes as well. This is achieved via a generalization of the notion of variation from Fischer et al. to non-Boolean functions. Using this generalization we extend the original junta test of Fischer et al. to work for non-Boolean functions, and give poly(s/ε)-query testing algorithms for non-Boolean valued function classes such as size-s algebraic circuits and s-sparse polynomials over finite fields. We also prove an Ω(√s) query lower bound for nonadaptively testing s-sparse polynomials over finite fields of constant size. This shows that in some instances, our general method yields a property tester with query complexity that is optimal (for nonadaptive algorithms) up to a polynomial factor.

Journal Article
TL;DR: This work studies LVQ rigorously within a simplifying model situation: two competing prototypes are trained from a sequence of examples drawn from a mixture of Gaussians, yielding typical learning curves, convergence properties, and achievable generalization abilities.
Abstract: Learning vector quantization (LVQ) schemes constitute intuitive, powerful classification heuristics with numerous successful applications but, so far, limited theoretical background. We study LVQ rigorously within a simplifying model situation: two competing prototypes are trained from a sequence of examples drawn from a mixture of Gaussians. Concepts from statistical physics and the theory of on-line learning allow for an exact description of the training dynamics in high-dimensional feature space. The analysis yields typical learning curves, convergence properties, and achievable generalization abilities. This is also possible for heuristic training schemes which do not relate to a cost function. We compare the performance of several algorithms, including Kohonen's LVQ1 and LVQ+/-, a limiting case of LVQ2.1. The former shows close to optimal performance, while LVQ+/- displays divergent behavior. We investigate how early stopping can overcome this difficulty. Furthermore, we study a crisp version of robust soft LVQ, which was recently derived from a statistical formulation. Surprisingly, it exhibits relatively poor generalization. Performance improves if a window for the selection of data is introduced; the resulting algorithm corresponds to cost function based LVQ2. The dependence of these results on the model parameters, for example, prior class probabilities, is investigated systematically; simulations confirm our analytical findings.
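
The basic LVQ1 update analyzed in the paper, sketched in numpy: move the winning prototype toward a correctly classified example and away from a misclassified one. Learning-rate schedules and the other variants (LVQ+/-, LVQ2.1, robust soft LVQ) are omitted, and the data here are synthetic.

```python
import numpy as np

def lvq1_train(X, y, prototypes, proto_labels, lr=0.05, epochs=20, seed=0):
    """Plain LVQ1: attract or repel the nearest prototype depending on label agreement."""
    W = np.array(prototypes, dtype=float)
    rng = np.random.default_rng(seed)
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            x = X[i]
            winner = np.argmin(np.sum((W - x) ** 2, axis=1))   # nearest prototype
            sign = 1.0 if proto_labels[winner] == y[i] else -1.0
            W[winner] += sign * lr * (x - W[winner])
    return W

# Two Gaussian clusters, one prototype per class.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, size=(100, 2)), rng.normal(2, 1, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)
print(lvq1_train(X, y, prototypes=[[0.0, 0.5], [0.5, 0.0]], proto_labels=[0, 1]))
```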

Journal ArticleDOI
TL;DR: The reaction-diffusion-space, a combination of linear scale-space and mathematical morphology, is used here for the development of an automatic generalization procedure for three-dimensional (3D) building models.
Abstract: In image analysis, scale-space theory is used, e.g., for object recognition. A scale-space is obtained by deriving coarser representations at different scales from an image. With it, the behaviour of image features over scales can be analysed. One example of a scale-space is the reaction-diffusion-space, a combination of linear scale-space and mathematical morphology. As scale-spaces have an inherent abstraction capability, they are used here for the development of an automatic generalization procedure for three-dimensional (3D) building models. It can be used to generate level of detail (LOD) representations of 3D city models. Practically, it works by moving parallel facets towards each other until a 3D feature under a certain extent is eliminated or a gap is closed. As not all building structures consist of perpendicular facets, means for a squaring of non-orthogonal structures are given. Results for generalization and squaring are shown and remaining problems are discussed. The conference version of this paper is Forberg [Forberg, A., 2004. Generalization of 3D Building Data Based on a Scale-Space Approach. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 35 (Part B4) http://www.isprs.org/istanbul2004/comm4/papers/341.pdf (accessed January 17, 2007)].

Journal ArticleDOI
TL;DR: This work generalizes the concept of neutral element, and this generalization gives rise to a new class of AMC binary operators on [0,1] called n-uninorms, denoted U^n, where n comes from the generalization of the neutral element.

Book ChapterDOI
01 Jan 2007
TL;DR: This chapter describes modeling techniques for using these algorithms to form a comprehensive generalization process, and describes the historical evolution of these modeling techniques as well as their strengths and weaknesses.
Abstract: Publisher Summary. Research on the automation of cartographic generalization has led to the development of a large number of generalization algorithms. This chapter describes modeling techniques for using these algorithms to form a comprehensive generalization process. Important issues include when to use the generalization algorithms and how to trigger and control them. Three main modeling techniques are described: condition-action modeling, human interaction modeling, and constraint-based modeling. In a condition-action modeling process an identification of objects and relationships between objects is first performed. Then, based on the identified conditions, generalization algorithms are triggered. Human interaction modeling is based on the principle that the cognitive workload can be shared between computer and human. The computer typically carries out those tasks that can be sufficiently formalized to be cast into algorithms, while the human assumes responsibility for guiding and controlling the computer software. Finally, in constraint-based modeling the starting point is the requirements (constraints) of the generalized map. An optimization process then finds a generalization solution that satisfies as many of the constraints as possible. The chapter describes the historical evolution of these modeling techniques as well as their strengths and weaknesses.

Journal ArticleDOI
TL;DR: How the algorithms described here provide a general architecture for addressing the pipeline problem -- the problem of passing information back and forth between various stages of processing in a perceptual system is discussed.
Abstract: We consider the problem of computing a lightest derivation of a global structure using a set of weighted rules. A large variety of inference problems in AI can be formulated in this framework. We generalize A* search and heuristics derived from abstractions to a broad class of lightest derivation problems. We also describe a new algorithm that searches for lightest derivations using a hierarchy of abstractions. Our generalization of A* gives a new algorithm for searching AND/OR graphs in a bottom-up fashion. We discuss how the algorithms described here provide a general architecture for addressing the pipeline problem -- the problem of passing information back and forth between various stages of processing in a perceptual system. We consider examples in computer vision and natural language processing. We apply the hierarchical search algorithm to the problem of estimating the boundaries of convex objects in grayscale images and compare it to other search methods. A second set of experiments demonstrates the use of a new compositional model for finding salient curves in images.
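
A compact sketch of the lightest-derivation setting itself (essentially Knuth's generalization of Dijkstra's algorithm), not of the paper's A* or hierarchical-abstraction algorithms: each rule has a weight, a conclusion, and antecedents, and the cheapest derivable statement is finalized repeatedly. The rule set is made up.

```python
import heapq

def lightest_derivations(rules):
    """rules: list of (weight, conclusion, antecedents); the cost of a derivation is the
    rule weight plus the costs of the antecedents' lightest derivations."""
    cost = {}
    heap = []
    for w, head, body in rules:
        if not body:
            heapq.heappush(heap, (w, head))              # axioms enter immediately
    while heap:
        c, head = heapq.heappop(heap)
        if head in cost:
            continue                                      # already finalized more cheaply
        cost[head] = c
        for w, h, body in rules:
            if h not in cost and all(b in cost for b in body):
                heapq.heappush(heap, (w + sum(cost[b] for b in body), h))
    return cost

rules = [
    (1.0, 'A', []),            # axiom A with weight 1
    (2.0, 'B', []),            # axiom B with weight 2
    (0.5, 'C', ['A', 'B']),    # C derivable from A and B
    (5.0, 'C', []),            # ... or directly, but more expensively
    (0.1, 'D', ['C']),
]
print(lightest_derivations(rules))   # {'A': 1.0, 'B': 2.0, 'C': 3.5, 'D': 3.6}
```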

Proceedings ArticleDOI
11 Nov 2007
TL;DR: A safety analysis of finite-state systems is described that generalizes from counterexamples to the inductiveness of the safety specification to inductive invariants and abstracts the system's state space relative to the asserted property.
Abstract: Scaling verification to large circuits requires some form of abstraction relative to the asserted property. We describe a safety analysis of finite-state systems that generalizes from counterexamples to the inductiveness of the safety specification to inductive invariants. It thus abstracts the system's state space relative to the property. The analysis either strengthens a safety specification to be inductive or discovers a counterexample to its correctness. The analysis is easily made parallel. We provide experimental data showing how the analysis time decreases with the number of processes on several hard problems.

Journal ArticleDOI
TL;DR: The Bohr radius for power series of holomorphic functions mapping a multidimensional Reinhardt domain into the convex domain in the complex plane is independent of this convex domain as mentioned in this paper.
Abstract: The Bohr radius for power series of holomorphic functions mapping a multidimensional Reinhardt domain into the convex domain in the complex plane is independent of this convex domain.
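
For context, the classical one-variable statement that results of this kind generalize; the constant 1/3 is Bohr's sharp radius for the unit disk, whereas the paper's multidimensional, Reinhardt-domain version is a different (and convex-domain-independent) statement.

```latex
% Classical Bohr inequality: if f(z) = \sum_{n \ge 0} a_n z^n is holomorphic on the
% unit disk \mathbb{D} with |f(z)| < 1, then
\sum_{n \ge 0} |a_n|\, r^n \;\le\; 1 \qquad \text{for all } r \le \tfrac{1}{3},
% and the constant 1/3 is sharp.
```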

Proceedings ArticleDOI
14 May 2007
TL;DR: Preliminary experiments with a novel algorithm, AMBI (Approximate Models Based on Instances), demonstrate that this approach yields faster learning on some standard benchmark problems than many contemporary algorithms.
Abstract: Reinforcement learning promises a generic method for adapting agents to arbitrary tasks in arbitrary stochastic environments, but applying it to new real-world problems remains difficult, a few impressive success stories notwithstanding. Most interesting agent-environment systems have large state spaces, so performance depends crucially on efficient generalization from a small amount of experience. Current algorithms rely on model-free function approximation, which estimates the long-term values of states and actions directly from data and assumes that actions have similar values in similar states. This paper proposes model-based function approximation, which combines two forms of generalization by assuming that in addition to having similar values in similar states, actions also have similar effects. For one family of generalization schemes known as averagers, computation of an approximate value function from an approximate model is shown to be equivalent to the computation of the exact value function for a finite model derived from data. This derivation both integrates two independent sources of generalization and permits the extension of model-based techniques developed for finite problems. Preliminary experiments with a novel algorithm, AMBI (Approximate Models Based on Instances), demonstrate that this approach yields faster learning on some standard benchmark problems than many contemporary algorithms.