scispace - formally typeset
Search or ask a question

Showing papers in "arXiv: Artificial Intelligence in 1996"


Posted Content
TL;DR: A survey of reinforcement learning from a computer science perspective can be found in this article, where the authors discuss the central issues of RL, including trading off exploration and exploitation, establishing the foundations of RL via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state.
Abstract: This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word ``reinforcement.'' The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.

5,970 citations


Posted Content
TL;DR: In this article, the authors proposed a new method for exploiting causal independencies in exact Bayesian network inference, which enables one to further factorize the conditional probabilities into a combination of even smaller factors and consequently obtain a finer-grain factorization of the joint probability.
Abstract: A new method is proposed for exploiting causal independencies in exact Bayesian network inference. A Bayesian network can be viewed as representing a factorization of a joint probability into the multiplication of a set of conditional probabilities. We present a notion of causal independence that enables one to further factorize the conditional probabilities into a combination of even smaller factors and consequently obtain a finer-grain factorization of the joint probability. The new formulation of causal independence lets us specify the conditional probability of a variable given its parents in terms of an associative and commutative operator, such as ``or'', ``sum'' or ``max'', on the contribution of each parent. We start with a simple algorithm VE for Bayesian network inference that, given evidence and a query variable, uses the factorization to find the posterior distribution of the query. We show how this algorithm can be extended to exploit causal independence. Empirical studies, based on the CPCS networks for medical diagnosis, show that this method is more efficient than previous methods and allows for inference in larger networks than previous algorithms.

437 citations


Posted Content
TL;DR: This work shows how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression.
Abstract: For many types of machine learning algorithms, one can compute the statistically `optimal' way to select training data. In this paper, we review how optimal data selection techniques have been used with feedforward neural networks. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are computationally expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate. Empirically, we observe that the optimality criterion sharply decreases the number of training examples the learner needs in order to achieve good performance.

274 citations


Posted Content
TL;DR: Information about preferences between rules can be expressed in the logical language and derived dynamically by using a reserved predicate symbol and a naming technique to extend well-founded semantics for logic programs with two types of negation.
Abstract: The paper describes an extension of well-founded semantics for logic programs with two types of negation. In this extension information about preferences between rules can be expressed in the logical language and derived dynamically. This is achieved by using a reserved predicate symbol and a naming technique. Conflicts among rules are resolved whenever possible on the basis of derived preference information. The well-founded conclusions of prioritized logic programs can be computed in polynomial time. A legal reasoning example illustrates the usefulness of the approach.

118 citations


Posted Content
TL;DR: The notion of worth is used in an altered definition of utility, thus enabling agreements in a wider class of joint-goal reachable situations and introducing two mechanisms, one "strict," the other "tolerant," and their affects on the stability and efficiency of negotiation outcomes.
Abstract: This paper lays part of the groundwork for a domain theory of negotiation, that is, a way of classifying interactions so that it is clear, given a domain, which negotiation mechanisms and strategies are appropriate. We define State Oriented Domains, a general category of interaction. Necessary and sufficient conditions for cooperation are outlined. We use the notion of worth in an altered definition of utility, thus enabling agreements in a wider class of joint-goal reachable situations. An approach is offered for conflict resolution, and it is shown that even in a conflict situation, partial cooperative steps can be taken by interacting agents (that is, agents in fundamental conflict might still agree to cooperate up to a certain point). A Unified Negotiation Protocol (UNP) is developed that can be used in all types of encounters. It is shown that in certain borderline cooperative situations, a partial cooperative agreement (i.e., one that does not achieve all agents' goals) might be preferred by all agents, even though there exists a rational agreement that would achieve all their goals. Finally, we analyze cases where agents have incomplete information on the goals and worth of other agents. First we consider the case where agents' goals are private information, and we analyze what goal declaration strategies the agents might adopt to increase their utility. Then, we consider the situation where the agents' goals (and therefore stand-alone costs) are common knowledge, but the worth they attach to their goals is private information. We introduce two mechanisms, one 'strict', the other 'tolerant', and analyze their affects on the stability and efficiency of negotiation outcomes.

108 citations


Posted Content
TL;DR: This paper discusses the design and empirical analysis of algorithms for a temporal reasoning system based on Allen's influential interval-based framework for representing temporal information, and develops variable and value ordering heuristics that are shown empirically to dramatically improve the performance of the algorithm.
Abstract: Many applications -- from planning and scheduling to problems in molecular biology -- rely heavily on a temporal reasoning component. In this paper, we discuss the design and empirical analysis of algorithms for a temporal reasoning system based on Allen's influential interval-based framework for representing temporal information. At the core of the system are algorithms for determining whether the temporal information is consistent, and, if so, finding one or more scenarios that are consistent with the temporal information. Two important algorithms for these tasks are a path consistency algorithm and a backtracking algorithm. For the path consistency algorithm, we develop techniques that can result in up to a ten-fold speedup over an already highly optimized implementation. For the backtracking algorithm, we develop variable and value ordering heuristics that are shown empirically to dramatically improve the performance of the algorithm. As well, we show that a previously suggested reformulation of the backtracking search problem can reduce the time and space requirements of the backtracking search. Taken together, the techniques we develop allow a temporal reasoning component to solve problems that are of practical size.

103 citations


Posted Content
TL;DR: A computational paradigm, spatial aggregation, is developed to unify the description of a class of imagistic problem solvers and describes the computational structure of three implemented problemsolvers in terms of the spatial aggregation generic operators by mixing and matching a library of commonly used routines.
Abstract: Visual thinking plays an important role in scientific reasoning. Based on the research in automating diverse reasoning tasks about dynamical systems, nonlinear controllers, kinematic mechanisms, and fluid motion, we have identified a style of visual thinking, imagistic reasoning. Imagistic reasoning organizes computations around image-like, analogue representations so that perceptual and symbolic operations can be brought to bear to infer structure and behavior. Programs incorporating imagistic reasoning have been shown to perform at an expert level in domains that defy current analytic or numerical methods. We have developed a computational paradigm, spatial aggregation, to unify the description of a class of imagistic problem solvers. A program written in this paradigm has the following properties. It takes a continuous field and optional objective functions as input, and produces high-level descriptions of structure, behavior, or control actions. It computes a multi-layer of intermediate representations, called spatial aggregates, by forming equivalence classes and adjacency relations. It employs a small set of generic operators such as aggregation, classification, and localization to perform bidirectional mapping between the information-rich field and successively more abstract spatial aggregates. It uses a data structure, the neighborhood graph, as a common interface to modularize computations. To illustrate our theory, we describe the computational structure of three implemented problem solvers -- KAM, MAPS, and HIPAIR --- in terms of the spatial aggregation generic operators by mixing and matching a library of commonly used routines.

71 citations


Posted Content
Diane J. Litman1
TL;DR: This paper explores the use of machine learning for classifying cue phrases as discourse or sentential in natural language processing systems that exploit discourse structure, e.g., for performing tasks such as anaphora resolution and plan recognition.
Abstract: Cue phrases may be used in a discourse sense to explicitly signal discourse structure, but also in a sentential sense to convey semantic rather than structural information. Correctly classifying cue phrases as discourse or sentential is critical in natural language processing systems that exploit discourse structure, e.g., for performing tasks such as anaphora resolution and plan recognition. This paper explores the use of machine learning for classifying cue phrases as discourse or sentential. Two machine learning programs (Cgrendel and C4.5) are used to induce classification models from sets of pre-classified cue phrases and their features in text and speech. Machine learning is shown to be an effective technique for not only automating the generation of classification models, but also for improving upon previous results. When compared to manually derived classification models already in the literature, the learned models often perform with higher accuracy and contain new linguistic insights into the data. In addition, the ability to automatically construct classification models makes it easier to comparatively analyze the utility of alternative feature representations of the data. Finally, the ease of retraining makes the learning approach more scalable and flexible than manual methods.

70 citations


Posted Content
TL;DR: In this article, a mean field theory for sigmoid belief networks based on ideas from statistical mechanics was developed, which provides a tractable approximation to the true probability distribution in these networks; it also yields a lower bound on the likelihood of evidence.
Abstract: We develop a mean field theory for sigmoid belief networks based on ideas from statistical mechanics. Our mean field theory provides a tractable approximation to the true probability distribution in these networks; it also yields a lower bound on the likelihood of evidence. We demonstrate the utility of this framework on a benchmark problem in statistical pattern recognition---the classification of handwritten digits.

26 citations


Posted Content
TL;DR: In this article, the authors give a systematic treatment of the existence or non-existence of least generalizations and greatest specializations of finite sets of clauses in each of these six ordered sets.
Abstract: The main operations in Inductive Logic Programming (ILP) are generalization and specialization, which only make sense in a generality order. In ILP, the three most important generality orders are subsumption, implication and implication relative to background knowledge. The two languages used most often are languages of clauses and languages of only Horn clauses. This gives a total of six different ordered languages. In this paper, we give a systematic treatment of the existence or non-existence of least generalizations and greatest specializations of finite sets of clauses in each of these six ordered sets. We survey results already obtained by others and also contribute some answers of our own. Our main new results are, firstly, the existence of a computable least generalization under implication of every finite set of clauses containing at least one non-tautologous function-free clause (among other, not necessarily function-free clauses). Secondly, we show that such a least generalization need not exist under relative implication, not even if both the set that is to be generalized and the background knowledge are function-free. Thirdly, we give a complete discussion of existence and non-existence of greatest specializations in each of the six ordered languages.

21 citations


Posted Content
TL;DR: In this paper, an inexpensive strategy for creating initial clusterings, coupled with several control strategies for iterative optimization, each of which repeatedly modifies an initial clustering in search of a better one, are evaluated.
Abstract: Clustering is often used for discovering structure in data. Clustering systems differ in the objective function used to evaluate clustering quality and the control strategy used to search the space of clusterings. Ideally, the search strategy should consistently construct clusterings of high quality, but be computationally inexpensive as well. In general, we cannot have it both ways, but we can partition the search so that a system inexpensively constructs a `tentative' clustering for initial examination, followed by iterative optimization, which continues to search in background for improved clusterings. Given this motivation, we evaluate an inexpensive strategy for creating initial clusterings, coupled with several control strategies for iterative optimization, each of which repeatedly modifies an initial clustering in search of a better one. One of these methods appears novel as an iterative optimization strategy in clustering contexts. Once a clustering has been constructed it is judged by analysts -- often according to task-specific criteria. Several authors have abstracted these criteria and posited a generic performance task akin to pattern completion, where the error rate over completed patterns is used to `externally' judge clustering utility. Given this performance task, we adapt resampling-based pruning strategies used by supervised learning systems to the task of simplifying hierarchical clusterings, thus promising to ease post-clustering analysis. Finally, we propose a number of objective functions, based on attribute-selection measures for decision-tree induction, that might perform well on the error rate and simplicity dimensions.

Posted Content
TL;DR: In this paper, a hierarchy of classes of knowledge bases, Omega_1,Omega_2,..., with the following properties: first, Omega is the class of all stratified knowledge bases; second, if a knowledge base Pi is in Omega_k, then Pi has at most k stable models, and all of them may be found in time O(lnk), where l is the length of the knowledge base and n the number of atoms in Pi.
Abstract: Finding the stable models of a knowledge base is a significant computational problem in artificial intelligence. This task is at the computational heart of truth maintenance systems, autoepistemic logic, and default logic. Unfortunately, it is NP-hard. In this paper we present a hierarchy of classes of knowledge bases, Omega_1,Omega_2,..., with the following properties: first, Omega_1 is the class of all stratified knowledge bases; second, if a knowledge base Pi is in Omega_k, then Pi has at most k stable models, and all of them may be found in time O(lnk), where l is the length of the knowledge base and n the number of atoms in Pi; third, for an arbitrary knowledge base Pi, we can find the minimum k such that Pi belongs to Omega_k in time polynomial in the size of Pi; and, last, where K is the class of all knowledge bases, it is the case that union{i=1 to infty} Omega_i = K, that is, every knowledge base belongs to some class in the hierarchy.

Posted Content
TL;DR: In this paper, an extension to the constraint satisfaction problem (CSP) called MUSE CSP (MUltiply SEgmented Constraint Satisfaction Problem) is described.
Abstract: This paper describes an extension to the constraint satisfaction problem (CSP) called MUSE CSP (MUltiply SEgmented Constraint Satisfaction Problem). This extension is especially useful for those problems which segment into multiple sets of partially shared variables. Such problems arise naturally in signal processing applications including computer vision, speech processing, and handwriting recognition. For these applications, it is often difficult to segment the data in only one way given the low-level information utilized by the segmentation algorithms. MUSE CSP can be used to compactly represent several similar instances of the constraint satisfaction problem. If multiple instances of a CSP have some common variables which have the same domains and constraints, then they can be combined into a single instance of a MUSE CSP, reducing the work required to apply the constraints. We introduce the concepts of MUSE node consistency, MUSE arc consistency, and MUSE path consistency. We then demonstrate how MUSE CSP can be used to compactly represent lexically ambiguous sentences and the multiple sentence hypotheses that are often generated by speech recognition algorithms so that grammar constraints can be used to provide parses for all syntactically correct sentences. Algorithms for MUSE arc and path consistency are provided. Finally, we discuss how to create a MUSE CSP from a set of CSPs which are labeled to indicate when the same variable is shared by more than a single CSP.

Posted Content
TL;DR: In this paper, a methodology for proving termination of general logic programs w.r.t. the Prolog selection rule is introduced, which is based on the notions of low-, weakly up-, and up-acceptable programs.
Abstract: Termination of logic programs with negated body atoms (here called general logic programs) is an important topic. One reason is that many computational mechanisms used to process negated atoms, like Clark's negation as failure and Chan's constructive negation, are based on termination conditions. This paper introduces a methodology for proving termination of general logic programs w.r.t. the Prolog selection rule. The idea is to distinguish parts of the program depending on whether or not their termination depends on the selection rule. To this end, the notions of low-, weakly up-, and up-acceptable program are introduced. We use these notions to develop a methodology for proving termination of general logic programs, and show how interesting problems in non-monotonic reasoning can be formalized and implemented by means of terminating general logic programs.

Posted Content
TL;DR: In this article, an MDL-inspired penalty is applied to such tests, eliminating some of them from consideration and altering the relative desirability of all tests, leading to smaller decision trees with higher predictive accuracies.
Abstract: A reported weakness of C4.5 in domains with continuous attributes is addressed by modifying the formation and evaluation of tests on continuous attributes. An MDL-inspired penalty is applied to such tests, eliminating some of them from consideration and altering the relative desirability of all tests. Empirical trials show that the modifications lead to smaller decision trees with higher predictive accuracies. Results also confirm that a new version of C4.5 incorporating these changes is superior to recent approaches that use global discretization and that construct small trees with multi-interval splits.

Posted Content
TL;DR: Cassandra as mentioned in this paper uses explicit decision-steps that enable the agent executing the plan to decide which plan branch to follow, which are planned for in the same way as any other subgoals.
Abstract: A fundamental assumption made by classical AI planners is that there is no uncertainty in the world: the planner has full knowledge of the conditions under which the plan will be executed and the outcome of every action is fully predictable. These planners cannot therefore construct contingency plans, i.e., plans in which different actions are performed in different circumstances. In this paper we discuss some issues that arise in the representation and construction of contingency plans and describe Cassandra, a partial-order contingency planner. Cassandra uses explicit decision-steps that enable the agent executing the plan to decide which plan branch to follow. The decision-steps in a plan result in subgoals to acquire knowledge, which are planned for in the same way as any other subgoals. Cassandra thus distinguishes the process of gathering information from the process of making decisions. The explicit representation of decisions in Cassandra allows a coherent approach to the problems of contingent planning, and provides a solid base for extensions such as the use of different decision-making procedures.

Posted Content
TL;DR: In this paper, a particular first-order learning system is modified to customize it for finding definitions of functional relations, which leads to faster learning times and, in some cases, to definitions that have higher predictive accuracy.
Abstract: First-order learning involves finding a clause-form definition of a relation from examples of the relation and relevant background information. In this paper, a particular first-order learning system is modified to customize it for finding definitions of functional relations. This restriction leads to faster learning times and, in some cases, to definitions that have higher predictive accuracy. Other first-order learning systems might benefit from similar specialization.

Posted Content
TL;DR: This approach identifies strategies that not only decrease the amount of CPU time required to produce schedules, but also increase the percentage of problems that are solvable within computational resource limitations.
Abstract: Although most scheduling problems are NP-hard, domain specific techniques perform well in practice but are quite expensive to construct. In adaptive problem-solving solving, domain specific knowledge is acquired automatically for a general problem solver with a flexible control architecture. In this approach, a learning system explores a space of possible heuristic methods for one well-suited to the eccentricities of the given domain and problem distribution. In this article, we discuss an application of the approach to scheduling satellite communications. Using problem distributions based on actual mission requirements, our approach identifies strategies that not only decrease the amount of CPU time required to produce schedules, but also increase the percentage of problems that are solvable within computational resource limitations.

Posted Content
TL;DR: In this paper, the authors proposed two domain-independent techniques for bringing well-founded partial-order planners closer to practicality, one based on a simple adjustment to the default A* heuristic used by UCPOP to select plans for refinement, and the other based on preferring ''zero commitment'' (forced) plan refinements whenever possible and using LIFO prioritization otherwise.
Abstract: We propose some domain-independent techniques for bringing well-founded partial-order planners closer to practicality. The first two techniques are aimed at improving search control while keeping overhead costs low. One is based on a simple adjustment to the default A* heuristic used by UCPOP to select plans for refinement. The other is based on preferring ``zero commitment'' (forced) plan refinements whenever possible, and using LIFO prioritization otherwise. A more radical technique is the use of operator parameter domains to prune search. These domains are initially computed from the definitions of the operators and the initial and goal conditions, using a polynomial-time algorithm that propagates sets of constants through the operator graph, starting in the initial conditions. During planning, parameter domains can be used to prune nonviable operator instances and to remove spurious clobbering threats. In experiments based on modifications of UCPOP, our improved plan and goal selection strategies gave speedups by factors ranging from 5 to more than 1000 for a variety of problems that are nontrivial for the unmodified version. Crucially, the hardest problems gave the greatest improvements. The pruning technique based on parameter domains often gave speedups by an order of magnitude or more for difficult problems, both with the default UCPOP search strategy and with our improved strategy. The Lisp code for our techniques and for the test problems is provided in on-line appendices.

Posted Content
TL;DR: Using a game-theoretic framework, this paper studies the design of partially controlled multi-agent systems in two contexts: in one context, the uncontrollable agents are expected utility maximizers, while in the other they are reinforcement learners.
Abstract: Motivated by the control theoretic distinction between controllable and uncontrollable events, we distinguish between two types of agents within a multi-agent system: controllable agents, which are directly controlled by the system's designer, and uncontrollable agents, which are not under the designer's direct control. We refer to such systems as partially controlled multi-agent systems, and we investigate how one might influence the behavior of the uncontrolled agents through appropriate design of the controlled agents. In particular, we wish to understand which problems are naturally described in these terms, what methods can be applied to influence the uncontrollable agents, the effectiveness of such methods, and whether similar methods work across different domains. Using a game-theoretic framework, this paper studies the design of partially controlled multi-agent systems in two contexts: in one context, the uncontrollable agents are expected utility maximizers, while in the other they are reinforcement learners. We suggest different techniques for controlling agents' behavior in each domain, assess their success, and examine their relationship.

Posted Content
TL;DR: Three intelligent user interfaces are described: handwriting recognition, adaptive menus, and predictive fillin in the context of adding a person's name and address to an electronic organizer, which shows handwriting recognition is slower than typing on an on-screen, soft keyboard, while adaptive menus and predictive Fillin can be twice as fast.
Abstract: Efficiently entering information into a computer is key to enjoying the benefits of computing. This paper describes three intelligent user interfaces: handwriting recognition, adaptive menus, and predictive fillin. In the context of adding a personUs name and address to an electronic organizer, tests show handwriting recognition is slower than typing on an on-screen, soft keyboard, while adaptive menus and predictive fillin can be twice as fast. This paper also presents strategies for applying these three interfaces to other information collection domains.

Posted Content
TL;DR: In this paper, the authors present new experimental evidence against the utility of Occam's razor for post-processing decision trees produced by C4.5, and they demonstrate that the resulting more complex decision trees have, on average, higher predictive accuracy than the less complex original decision trees.
Abstract: This paper presents new experimental evidence against the utility of Occam's razor. A~systematic procedure is presented for post-processing decision trees produced by C4.5. This procedure was derived by rejecting Occam's razor and instead attending to the assumption that similar objects are likely to belong to the same class. It increases a decision tree's complexity without altering the performance of that tree on the training data from which it is inferred. The resulting more complex decision trees are demonstrated to have, on average, for a variety of common learning tasks, higher predictive accuracy than the less complex original decision trees. This result raises considerable doubt about the utility of Occam's razor as it is commonly applied in modern machine learning.

Posted Content
TL;DR: New characterizations of decomposable models in terms of independence relationships are presented, which are obtained by adding a single axiom to the well-known set characterizing dependency models that are isomorphic to undirected graphs.
Abstract: Decomposable dependency models possess a number of interesting and useful properties. This paper presents new characterizations of decomposable models in terms of independence relationships, which are obtained by adding a single axiom to the well-known set characterizing dependency models that are isomorphic to undirected graphs. We also briefly discuss a potential application of our results to the problem of learning graphical models from data.