Showing papers in "Journal of Artificial Intelligence Research in 1995"


Journal ArticleDOI
TL;DR: Foidl as mentioned in this paper is a method for inducing logic programs from examples that learns a new class of concepts called first-order decision lists, defined as ordered lists of clauses each ending in a cut.
Abstract: This paper presents a method for inducing logic programs from examples that learns a new class of concepts called first-order decision lists, defined as ordered lists of clauses each ending in a cut. The method, called Foidl, is based on Foil (Quinlan, 1990) but employs intensional background knowledge and avoids the need for explicit negative examples. It is particularly useful for problems that involve rules with specific exceptions, such as learning the past-tense of English verbs, a task widely studied in the context of the symbolic/connectionist debate. Foidl is able to learn concise, accurate programs for this problem from significantly fewer examples than previous methods (both connectionist and symbolic).

200 citations
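
To illustrate the decision-list idea in the abstract above, here is a minimal Python sketch of an ordered list of rules in which the first matching rule decides the answer, much as a cut at the end of a clause stops later clauses from being tried. The rule table and the `past_tense` function are hypothetical and hand-written for illustration. They are not output of Foidl and do not reproduce its induction procedure.

```python
# Hypothetical, hand-written rules (assumption: Foidl would induce such a
# list from examples; this table is only an illustration).
# Each rule is (suffix_to_match, replacement). Order matters: specific
# exceptions come before the general default, and the first match wins,
# which plays the role of the cut at the end of each clause.
RULES = [
    ("go", "went"),   # exception: go -> went
    ("eep", "ept"),   # sleep -> slept, keep -> kept
    ("e", "ed"),      # bake -> baked
    ("", "ed"),       # default: append "ed"
]


def past_tense(verb: str) -> str:
    """Return the past tense chosen by the first rule whose suffix matches."""
    for suffix, replacement in RULES:
        if verb.endswith(suffix):
            stem = verb[: len(verb) - len(suffix)]
            return stem + replacement
    raise ValueError("unreachable: the empty suffix matches every verb")
```

Because the default rule sits last, adding a new exception only requires inserting one rule above it; no explicit negative examples are needed to block the default.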


Journal ArticleDOI
TL;DR: OPUS as discussed by the authors is a branch and bound search algorithm that enables efficient admissible search through spaces for which the order of search operator application is not significant, and has potential for application in other areas of artificial intelligence, notably, truth maintenance.
Abstract: OPUS is a branch and bound search algorithm that enables efficient admissible search through spaces for which the order of search operator application is not significant. The algorithm's search efficiency is demonstrated with respect to very large machine learning search spaces. The use of admissible search is of potential value to the machine learning community as it means that the exact learning biases to be employed for complex learning tasks can be precisely specified and manipulated. OPUS also has potential for application in other areas of artificial intelligence, notably, truth maintenance.

197 citations
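
The following Python sketch illustrates the general setting the abstract describes: a search space where the order of operator application does not matter, explored with admissible branch-and-bound pruning. It is a simplified toy, not the OPUS algorithm itself. The function `best_subset`, the knapsack-style objective, and the optimistic bound are all assumptions made for illustration.

```python
def best_subset(values, weights, capacity):
    """Pruned search over subsets of items (a small knapsack-style toy)."""
    best = {"score": 0, "subset": ()}

    def expand(chosen, score, weight, start):
        if score > best["score"]:
            best["score"], best["subset"] = score, tuple(chosen)
        # Optimistic bound: pretend every remaining positive-value item fits.
        bound = score + sum(v for v in values[start:] if v > 0)
        if bound <= best["score"]:
            return  # admissible pruning: no descendant can beat the incumbent
        for i in range(start, len(values)):
            if weight + weights[i] <= capacity:
                # Only operators with a larger index remain available, so
                # each unordered set of operators is generated exactly once.
                expand(chosen + [i], score + values[i], weight + weights[i], i + 1)

    expand([], 0, 0, 0)
    return best["score"], best["subset"]
```

Since the bound never underestimates what a subtree can achieve, pruning cannot discard an optimal subset, which is what makes the search admissible in this toy setting.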


Journal ArticleDOI
TL;DR: The method induces solutions from samples in the form of ordered disjunctive normal form (DNF) decision rules, which can be extended to search efficiently for similar cases prior to approximating function values.
Abstract: We describe a machine learning method for predicting the value of a real-valued function, given the values of multiple input variables. The method induces solutions from samples in the form of ordered disjunctive normal form (DNF) decision rules. A central objective of the method and representation is the induction of compact, easily interpretable solutions. This rule-based decision model can be extended to search efficiently for similar cases prior to approximating function values. Experimental results on real-world data demonstrate that the new techniques are competitive with existing machine learning and statistical methods and can sometimes yield superior regression performance.

181 citations


Journal ArticleDOI
TL;DR: In this article, the authors present a definition of cause and effect in terms of decision-theoretic primitives and thereby provide a principled foundation for causal reasoning, which departs from the traditional view of causation in that causal assertions may vary with the set of decisions available.
Abstract: We present a definition of cause and effect in terms of decision-theoretic primitives and thereby provide a principled foundation for causal reasoning. Our definition departs from the traditional view of causation in that causal assertions may vary with the set of decisions available. We argue that this approach provides added clarity to the notion of cause. Also in this paper, we examine the encoding of causal relationships in directed acyclic graphs. We describe a special class of influence diagrams, those in canonical form, and show its relationship to Pearl's representation of cause and effect. Finally, we show how canonical form facilitates counterfactual reasoning.

140 citations


Journal ArticleDOI
TL;DR: In this article, an expectation-driven low-level image segmentation approach is presented for road detection using mesh-connected massively parallel SIMD architectures capable of handling hierarchical data structures, where the input image is assumed to contain a distorted version of a given template.
Abstract: The main aim of this work is the development of a vision-based road detection system fast enough to cope with the difficult real-time constraints imposed by moving vehicle applications. The hardware platform, a special-purpose massively parallel system, has been chosen to minimize system production and operational costs. This paper presents a novel approach to expectation-driven low-level image segmentation, which can be mapped naturally onto mesh-connected massively parallel SIMD architectures capable of handling hierarchical data structures. The input image is assumed to contain a distorted version of a given template; a multiresolution stretching process is used to reshape the original template in accordance with the acquired image content, minimizing a potential function. The distorted template is the process output.

119 citations


Journal ArticleDOI
TL;DR: This paper presents an approach to learning from situated, interactive tutorial instruction within an ongoing agent that combines a form of explanation-based learning that is situated for each instruction with a full suite of contextually guided responses to incomplete explanations.
Abstract: This paper presents an approach to learning from situated, interactive tutorial instruction within an ongoing agent. Tutorial instruction is a flexible (and thus powerful) paradigm for teaching tasks because it allows an instructor to communicate whatever types of knowledge an agent might need in whatever situations might arise. To support this flexibility, however, the agent must be able to learn multiple kinds of knowledge from a broad range of instructional interactions. Our approach, called situated explanation, achieves such learning through a combination of analytic and inductive techniques. It combines a form of explanation-based learning that is situated for each instruction with a full suite of contextually guided responses to incomplete explanations. The approach is implemented in an agent called INSTRUCTO-SOAR that learns hierarchies of new tasks and other domain knowledge from interactive natural language instructions. INSTRUCTO-SOAR meets three key requirements of flexible instructability that distinguish it from previous systems: (1) it can take known or unknown commands at any instruction point; (2) it can handle instructions that apply to either its current situation or to a hypothetical situation specified in language (as in, for instance, conditional instructions); and (3) it can learn, from instructions, each class of knowledge it uses to perform tasks.

88 citations


Journal ArticleDOI
TL;DR: A new abstraction methodology and a related sound and complete learning algorithm that allows the complete change of representation language of planning cases from concrete to abstract is developed.
Abstract: Abstraction is one of the most promising approaches to improve the performance of problem solvers. In several domains, abstraction by dropping sentences of a domain description, as used in most hierarchical planners, has proven useful. In this paper we present examples which illustrate significant drawbacks of abstraction by dropping sentences. To overcome these drawbacks, we propose a more general view of abstraction involving the change of representation language. We have developed a new abstraction methodology and a related sound and complete learning algorithm that allows the complete change of representation language of planning cases from concrete to abstract. However, to achieve a powerful change of the representation language, the abstract language itself as well as rules which describe admissible ways of abstracting states must be provided in the domain model. This new abstraction approach is the core of PARIS (Plan Abstraction and Refinement in an Integrated System), a system in which abstract planning cases are automatically learned from given concrete cases. An empirical study in the domain of process planning in mechanical engineering shows significant advantages of the proposed reasoning from abstract cases over classical hierarchical planning.

86 citations


Journal ArticleDOI
Roni Khardon
TL;DR: The two translation problems are equivalent under polynomial reductions, and that they are equivalent to the corresponding decision problem, which is equivalent to deciding whether a given set of models is the set of characteristic models for a given Horn expression.
Abstract: Characteristic models are an alternative, model based, representation for Horn expressions. It has been shown that these two representations are incomparable and each has its advantages over the other. It is therefore natural to ask what is the cost of translating, back and forth, between these representations. Interestingly, the same translation questions arise in database theory, where it has applications to the design of relational databases. This paper studies the computational complexity of these problems. Our main result is that the two translation problems are equivalent under polynomial reductions, and that they are equivalent to the corresponding decision problem. Namely, translating is equivalent to deciding whether a given set of models is the set of characteristic models for a given Horn expression. We also relate these problems to the hypergraph transversal problem, a well known problem which is related to other applications in AI and for which no polynomial time algorithm is known. It is shown that in general our translation problems are at least as hard as the hypergraph transversal problem, and in a special case they are equivalent to it.

69 citations


Journal ArticleDOI
TL;DR: This paper shows that the problem of diffusion of context and credit is reduced when the transition probabilities approach 0 or 1, i.e., the transition probability matrices are sparse and the model essentially deterministic.
Abstract: This paper studies the problem of ergodicity of transition probability matrices in Markovian models, such as hidden Markov models (HMMs), and how it makes very difficult the task of learning to represent long-term context for sequential data. This phenomenon hurts the forward propagation of long-term context information, as well as learning a hidden state representation to represent long-term context, which depends on propagating credit information backwards in time. Using results from Markov chain theory, we show that this problem of diffusion of context and credit is reduced when the transition probabilities approach 0 or 1, i.e., the transition probability matrices are sparse and the model essentially deterministic. The results found in this paper apply to learning approaches based on continuous optimization, such as gradient descent and the Baum-Welch algorithm.

38 citations
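
The diffusion effect described in the abstract above can be seen numerically in a small sketch. The Python code below raises a transition matrix to a power and measures how far apart its rows remain; when the rows coincide, the distribution after that many steps no longer depends on the starting state. The two example matrices and the helper names `matmul` and `row_spread` are assumptions chosen for illustration and are not taken from the paper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def row_spread(a, steps):
    """Largest gap between any two rows of a**steps (steps >= 1).

    A gap near 0 means the start state has been forgotten after `steps` steps.
    """
    power = a
    for _ in range(steps - 1):
        power = matmul(power, a)
    n = len(power)
    return max(
        abs(power[i][j] - power[k][j])
        for i in range(n)
        for k in range(n)
        for j in range(len(power[0]))
    )


# Illustrative two-state chains (assumed values, not from the paper).
diffuse = [[0.6, 0.4], [0.4, 0.6]]                 # probabilities far from 0 and 1
near_deterministic = [[0.99, 0.01], [0.01, 0.99]]  # probabilities close to 0 and 1

print(row_spread(diffuse, 20), row_spread(near_deterministic, 20))
```

For symmetric two-state chains like these, with switching probability p, the gap after t steps is (1 - 2p)^t, so the near-deterministic chain retains information about its start state far longer than the diffuse one.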


Journal ArticleDOI
TL;DR: FLECS, as mentioned in this paper, is a planner that can be used to study which domains and problems are best for which planning strategies; it represents a novel contribution to planning in that it explicitly provides the choice of which commitment strategy to use while planning.
Abstract: There has been evidence that least-commitment planners can efficiently handle planning problems that involve difficult goal interactions. This evidence has led to the common belief that delayed-commitment is the "best" possible planning strategy. However, we recently found evidence that eager-commitment planners can handle a variety of planning problems more efficiently, in particular those with difficult operator choices. Resigned to the futility of trying to find a universally successful planning strategy, we devised a planner that can be used to study which domains and problems are best for which planning strategies. In this article we introduce this new planning algorithm, FLECS, which uses a FLExible Commitment Strategy with respect to plan-step orderings. It is able to use any strategy from delayed-commitment to eager-commitment. The combination of delayed and eager operator-ordering commitments allows FLECS to take advantage of the benefits of explicitly using a simulated execution state and reasoning about planning constraints. FLECS can vary its commitment strategy across different problems and domains, and also during the course of a single planning problem. FLECS represents a novel contribution to planning in that it explicitly provides the choice of which commitment strategy to use while planning. FLECS provides a framework to investigate the mapping from planning domains and problems to efficient planning strategies.

36 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed an improvement to the standard local activation function used for symmetric networks, called activate, which guarantees that a global minimum is found in linear time for tree-like subnetworks.
Abstract: Symmetric networks designed for energy minimization such as Boltzmann machines and Hopfield nets are frequently investigated for use in optimization, constraint satisfaction and approximation of NP-hard problems. Nevertheless, finding a global solution (i.e., a global minimum for the energy function) is not guaranteed and even a local solution may take an exponential number of steps. We propose an improvement to the standard local activation function used for such networks. The improved algorithm guarantees that a global minimum is found in linear time for tree-like subnetworks. The algorithm, called activate, is uniform and does not assume that the network is tree-like. It can identify tree-like subnetworks even in cyclic topologies (arbitrary networks) and avoid local minima along these trees. For acyclic networks, the algorithm is guaranteed to converge to a global minimum from any initial state of the system (self-stabilization) and remains correct under various types of schedulers. On the negative side, we show that in the presence of cycles, no uniform algorithm exists that guarantees optimality even under a sequential asynchronous scheduler. An asynchronous scheduler can activate only one unit at a time while a synchronous scheduler can activate any number of units in a single time step. In addition, no uniform algorithm exists to optimize even acyclic networks when the scheduler is synchronous. Finally, we show how the algorithm can be improved using the cycle-cutset scheme. The general algorithm, called activate-with-cutset, improves over activate and has some performance guarantees that are related to the size of the network's cycle-cutset.

Journal ArticleDOI
TL;DR: A stronger form of implication is introduced, called T-implication, which is decidable between clauses, and it is shown that for every finite set of clauses there exists a least general generalization under T-implication.
Abstract: In the area of inductive learning, generalization is a main operation, and the usual definition of induction is based on logical implication. Recently there has been a rising interest in clausal representation of knowledge in machine learning. Almost all inductive learning systems that perform generalization of clauses use the relation θ-subsumption instead of implication. The main reason is that there is a well-known and simple technique to compute least general generalizations under θ-subsumption, but not under implication. However generalization under θ-subsumption is inappropriate for learning recursive clauses, which is a crucial problem since recursion is the basic program structure of logic programs. We note that implication between clauses is undecidable, and we therefore introduce a stronger form of implication, called T-implication, which is decidable between clauses. We show that for every finite set of clauses there exists a least general generalization under T-implication. We describe a technique to reduce generalizations under implication of a clause to generalizations under θ-subsumption of what we call an expansion of the original clause. Moreover we show that for every non-tautological clause there exists a T-complete expansion, which means that every generalization under T-implication of the clause is reduced to a generalization under θ-subsumption of the expansion.

Journal ArticleDOI
TL;DR: Using a large number of classified Othello positions, feature weights for evaluation functions with a game-phase-independent meaning are estimated by means of logistic regression, Fisher's linear discriminant, and the quadratic discriminant function for normally distributed features.
Abstract: This article describes an application of three well-known statistical methods in the field of game-tree search: using a large number of classified Othello positions, feature weights for evaluation functions with a game-phase-independent meaning are estimated by means of logistic regression, Fisher's linear discriminant, and the quadratic discriminant function for normally distributed features. Thereafter, the playing strengths are compared by means of tournaments between the resulting versions of a world-class Othello program. In this application, logistic regression, which is used here for the first time in the context of game playing, leads to better results than the other approaches.

Journal ArticleDOI
TL;DR: An algorithm for identifying inaccurate data by using qualitative correlations among related data as confirmatory or disconfirmatory evidence is presented, and a practical system for interpreting infrared spectra by applying the method is developed.
Abstract: Identifying inaccurate data has long been regarded as a significant and difficult problem in AI. In this paper, we present a new method for identifying inaccurate data on the basis of qualitative correlations among related data. First, we introduce the definitions of related data and qualitative correlations among related data. Then we put forward a new concept called support coefficient function (SCF). SCF can be used to extract, represent, and calculate qualitative correlations among related data within a dataset. We propose an approach to determining dynamic shift intervals of inaccurate data, and an approach to calculating possibility of identifying inaccurate data, respectively. Both of the approaches are based on SCF. Finally we present an algorithm for identifying inaccurate data by using qualitative correlations among related data as confirmatory or disconfirmatory evidence. We have developed a practical system for interpreting infrared spectra by applying the method, and have fully tested the system against several hundred real spectra. The experimental results show that the method is significantly better than the conventional methods used in many similar systems.

Journal ArticleDOI
TL;DR: In this article, a general framework called FLARE is proposed, which combines inductive learning using prior knowledge together with reasoning in a propositional setting, and several examples are presented, including classical induction, many important reasoning protocols and two simple expert systems.
Abstract: Learning and reasoning are both aspects of what is considered to be intelligence. Their studies within AI have been separated historically, learning being the topic of machine learning and neural networks, and reasoning falling under classical (or symbolic) AI. However, learning and reasoning are in many ways interdependent. This paper discusses the nature of some of these interdependencies and proposes a general framework called FLARE, that combines inductive learning using prior knowledge together with reasoning in a propositional setting. Several examples that test the framework are presented, including classical induction, many important reasoning protocols and two simple expert systems.

Journal ArticleDOI
TL;DR: This paper introduces ICET, a new algorithm for cost-sensitive classification that uses a genetic algorithm to evolve a population of biases for a decision tree induction algorithm.
Abstract: This paper introduces ICET, a new algorithm for cost-sensitive classification. ICET uses a genetic algorithm to evolve a population of biases for a decision tree induction algorithm. The fitness fu...

Journal ArticleDOI
TL;DR: This paper presents negative results showing that any natural clause of constant-depth determinate k-ary recursive clauses is not efficiently learnable.
Abstract: In a companion paper it was shown that the class of constant-depth determinate k-ary recursive clauses is efficiently learnable. In this paper we present negative results showing that any natural g...

Journal ArticleDOI
TL;DR: For many years, the intuitions underlying partial-order planning were largely taken for granted; only in the past few years has there been renewed interest in the fundamental principles underlying partial-order planning.
Abstract: For many years, the intuitions underlying partial-order planning were largely taken for granted. Only in the past few years has there been renewed interest in the fundamental principles underlying ...

Journal ArticleDOI
TL;DR: Temporal difference (TD) methods as discussed by the authors constitute a class of methods for learning predictions in multi-step prediction problems, parameterized by a recency factor λ, and are currently the most important applicat...
Abstract: Temporal difference (TD) methods constitute a class of methods for learning predictions in multi-step prediction problems, parameterized by a recency factor λ. Currently the most important applicat...
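
As a point of reference for the class of methods named in the abstract above, the sketch below shows a generic textbook form of tabular TD(λ) with accumulating eligibility traces. It is not necessarily the variant studied in the paper. The function name `td_lambda`, the transition format, the toy three-state chain, and the parameter values are all assumptions made for illustration.

```python
def td_lambda(episodes, n_states, alpha=0.1, gamma=1.0, lam=0.8):
    """Tabular TD(lambda) with accumulating eligibility traces.

    Each episode is a list of (state, reward, next_state) transitions;
    next_state is None when the episode terminates.
    """
    values = [0.0] * n_states
    for episode in episodes:
        traces = [0.0] * n_states  # traces are reset at the start of an episode
        for state, reward, next_state in episode:
            next_value = 0.0 if next_state is None else values[next_state]
            td_error = reward + gamma * next_value - values[state]
            traces[state] += 1.0
            for s in range(n_states):
                # Recently visited states receive a larger share of the error.
                values[s] += alpha * td_error * traces[s]
                traces[s] *= gamma * lam
    return values


# Toy deterministic chain 0 -> 1 -> 2 -> end, with reward 1 on the last step.
chain = [(0, 0.0, 1), (1, 0.0, 2), (2, 1.0, None)]
values = td_lambda([chain] * 500, n_states=3)
print(values)
```

The recency factor `lam` controls how quickly credit for a prediction error fades for states visited earlier: with `lam=0` only the current state is updated, while values closer to 1 spread each error further back along the trajectory.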