Showing papers in "Journal of Artificial Intelligence Research in 2000"

Journal Article•DOI•

Hierarchical reinforcement learning with the MAXQ value function decomposition

01 Aug 2000-Journal of Artificial Intelligence Research

TL;DR: The paper presents an online model-free learning algorithm, MAXQ-Q, and proves that it converges with probability 1 to a kind of locally-optimal policy known as a recursively optimal policy, even in the presence of the five kinds of state abstraction.

Abstract: This paper presents a new approach to hierarchical reinforcement learning based on decomposing the target Markov decision process (MDP) into a hierarchy of smaller MDPs and decomposing the value function of the target MDP into an additive combination of the value functions of the smaller MDPs. The decomposition, known as the MAXQ decomposition, has both a procedural semantics--as a subroutine hierarchy--and a declarative semantics--as a representation of the value function of a hierarchical policy. MAXQ unifies and extends previous work on hierarchical reinforcement learning by Singh, Kaelbling, and Dayan and Hinton. It is based on the assumption that the programmer can identify useful subgoals and define subtasks that achieve these subgoals. By defining such subgoals, the programmer constrains the set of policies that need to be considered during reinforcement learning. The MAXQ value function decomposition can represent the value function of any policy that is consistent with the given hierarchy. The decomposition also creates opportunities to exploit state abstractions, so that individual MDPs within the hierarchy can ignore large parts of the state space. This is important for the practical application of the method. This paper defines the MAXQ hierarchy, proves formal results on its representational power, and establishes five conditions for the safe use of state abstractions. The paper presents an online model-free learning algorithm, MAXQ-Q, and proves that it converges with probability 1 to a kind of locally-optimal policy known as a recursively optimal policy, even in the presence of the five kinds of state abstraction. The paper evaluates the MAXQ representation and MAXQ-Q through a series of experiments in three domains and shows experimentally that MAXQ-Q (with state abstractions) converges to a recursively optimal policy much faster than flat Q learning. 
The fact that MAXQ learns a representation of the value function has an important benefit: it makes it possible to compute and execute an improved, non-hierarchical policy via a procedure similar to the policy improvement step of policy iteration. The paper demonstrates the effectiveness of this nonhierarchical execution experimentally. Finally, the paper concludes with a comparison to related work and a discussion of the design tradeoffs in hierarchical reinforcement learning.
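
The additive decomposition described in the abstract can be sketched as follows. The symbols are assumptions, since the abstract gives none: i is a subtask, a is one of its child actions, V is the value of executing a subtask from state s, and C is a "completion" term for the reward collected after the child finishes.

```latex
% Assumed notation (not given in the abstract):
%   \pi        hierarchical policy, with \pi_i the policy of subtask i
%   V^\pi(i,s) expected return of executing subtask i from state s
%   C^\pi(i,s,a) expected return for completing i after child a terminates
\begin{align}
Q^{\pi}(i, s, a) &= V^{\pi}(a, s) + C^{\pi}(i, s, a) \\
V^{\pi}(i, s) &=
  \begin{cases}
    Q^{\pi}(i, s, \pi_i(s)) & \text{if } i \text{ is a composite subtask} \\
    \sum_{s'} P(s' \mid s, i)\, R(s' \mid s, i) & \text{if } i \text{ is a primitive action}
  \end{cases}
\end{align}
```

Unrolling the recursion from the root subtask down to a primitive action expresses the root value as a sum of completion terms plus one primitive reward term, which is the sense in which the value function is an additive combination of the smaller MDPs' value functions.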

1,486 citations

Journal Article•DOI•

A model of inductive bias learning

Jonathan Baxter¹•Institutions (1)

Australian National University¹

01 Feb 2000-Journal of Artificial Intelligence Research

TL;DR: Under certain restrictions on the set of all hypothesis spaces available to the learner, it is shown that a hypothesis space that performs well on a sufficiently large number of training tasks will also perform well when learning novel tasks in the same environment.

Abstract: A major problem in machine learning is that of inductive bias: how to choose a learner's hypothesis space so that it is large enough to contain a solution to the problem being learnt, yet small enough to ensure reliable generalization from reasonably-sized training sets. Typically such bias is supplied by hand through the skill and insights of experts. In this paper a model for automatically learning bias is investigated. The central assumption of the model is that the learner is embedded within an environment of related learning tasks. Within such an environment the learner can sample from multiple tasks, and hence it can search for a hypothesis space that contains good solutions to many of the problems in the environment. Under certain restrictions on the set of all hypothesis spaces available to the learner, we show that a hypothesis space that performs well on a sufficiently large number of training tasks will also perform well when learning novel tasks in the same environment. Explicit bounds are also derived demonstrating that learning multiple tasks within an environment of related tasks can potentially give much better generalization than learning a single task.

1,084 citations

Journal Article•DOI•

Value-function approximations for partially observable Markov decision processes

Milos Hauskrecht¹•Institutions (1)

Brown University¹

01 Aug 2000-Journal of Artificial Intelligence Research

TL;DR: This work surveys various approximation methods, analyzes their properties and relations and provides some new insights into their differences, and presents a number of new approximation methods and novel refinements of existing techniques.

Abstract: Partially observable Markov decision processes (POMDPs) provide an elegant mathematical framework for modeling complex decision and planning problems in stochastic domains in which states of the system are observable only indirectly, via a set of imperfect or noisy observations. The modeling advantage of POMDPs, however, comes at a price -- exact methods for solving them are computationally very expensive and thus applicable in practice only to very simple problems. We focus on efficient approximation (heuristic) methods that attempt to alleviate the computational problem and trade off accuracy for speed. We have two objectives here. First, we survey various approximation methods, analyze their properties and relations and provide some new insights into their differences. Second, we present a number of new approximation methods and novel refinements of existing techniques. The theoretical results are supported by experiments on a problem from the agent navigation domain.

583 citations

Journal Article•DOI•

An application of reinforcement learning to dialogue strategy selection in a spoken dialogue system for email

Marilyn A. Walker¹•Institutions (1)

AT&T¹

01 Feb 2000-Journal of Artificial Intelligence Research

TL;DR: A novel method by which a spoken dialogue system can learn to choose an optimal dialogue strategy from its experience interacting with human users, based on a combination of reinforcement learning and performance modeling of spoken dialogue systems.

Abstract: This paper describes a novel method by which a spoken dialogue system can learn to choose an optimal dialogue strategy from its experience interacting with human users. The method is based on a combination of reinforcement learning and performance modeling of spoken dialogue systems. The reinforcement learning component applies Q-learning (Watkins, 1989), while the performance modeling component applies the PARADISE evaluation framework (Walker et al., 1997) to learn the performance function (reward) used in reinforcement learning. We illustrate the method with a spoken dialogue system named elvis (EmaiL Voice Interactive System), that supports access to email over the phone. We conduct a set of experiments for training an optimal dialogue strategy on a corpus of 219 dialogues in which human users interact with elvis over the phone. We then test that strategy on a corpus of 18 dialogues. We show that elvis can learn to optimize its strategy selection for agent initiative, for reading messages, and for summarizing email folders.

231 citations

Journal Article•DOI•

Exact phase transitions in random constraint satisfaction problems

Ke Xu¹, Wei Li¹•Institutions (1)

Beihang University¹

01 Feb 2000-Journal of Artificial Intelligence Research

TL;DR: It is proved that phase transitions from a region where almost all problems are satisfiable to a region where almost all problems are unsatisfiable do exist for Model RB as the number of variables approaches infinity.

Abstract: In this paper we propose a new type of random CSP model, called Model RB, which is a revision to the standard Model B. It is proved that phase transitions from a region where almost all problems are satisfiable to a region where almost all problems are unsatisfiable do exist for Model RB as the number of variables approaches infinity. Moreover, the critical values at which the phase transitions occur are also known exactly. By relating the hardness of Model RB to Model B, it is shown that there exist a lot of hard instances in Model RB.

209 citations

Journal Article•DOI•

AIS-BN: an adaptive importance sampling algorithm for evidential reasoning in large Bayesian networks

Jian Cheng¹, Marek J. Druzdzel¹•Institutions (1)

University of Pittsburgh¹

01 Aug 2000-Journal of Artificial Intelligence Research

TL;DR: An adaptive importance sampling algorithm, AIS-BN, is proposed that shows promising convergence rates even under extreme conditions and seems to outperform the existing sampling algorithms consistently. One source of this improvement is a pair of heuristics for initializing the importance function, based on the theoretical properties of importance sampling in finite-dimensional integrals and the structural advantages of Bayesian networks.

Abstract: Stochastic sampling algorithms, while an attractive alternative to exact algorithms in very large Bayesian network models, have been observed to perform poorly in evidential reasoning with extremely unlikely evidence. To address this problem, we propose an adaptive importance sampling algorithm, AIS-BN, that shows promising convergence rates even under extreme conditions and seems to outperform the existing sampling algorithms consistently. Three sources of this performance improvement are (1) two heuristics for initialization of the importance function that are based on the theoretical properties of importance sampling in finite-dimensional integrals and the structural advantages of Bayesian networks, (2) a smooth learning method for the importance function, and (3) a dynamic weighting function for combining samples from different stages of the algorithm. We tested the performance of the AIS-BN algorithm along with two state of the art general purpose sampling algorithms, likelihood weighting (Fung & Chang, 1989; Shachter & Peot, 1989) and self-importance sampling (Shachter & Peot, 1989). We used in our tests three large real Bayesian network models available to the scientific community: the CPCS network (Pradhan et al., 1994), the PATHFINDER network (Heckerman, Horvitz, & Nathwani, 1990), and the ANDES network (Conati, Gertner, VanLehn, & Druzdzel, 1997), with evidence as unlikely as 10^-41. While the AIS-BN algorithm always performed better than the other two algorithms, in the majority of the test cases it achieved orders of magnitude improvement in precision of the results. Improvement in speed given a desired precision is even more dramatic, although we are unable to report numerical results here, as the other algorithms almost never achieved the precision reached even by the first few iterations of the AIS-BN algorithm.

208 citations

Journal Article•DOI•

The complexity of reasoning with cardinality restrictions and nominals in expressive description logics

Stephan Tobies¹•Institutions (1)

RWTH Aachen University¹

01 Feb 2000-Journal of Artificial Intelligence Research

TL;DR: This work reduces reasoning with cardinality restrictions to reasoning with the (in general weaker) terminological formalism of general axioms for ALCQ extended with nominals, and shows that pure concept satisfiability for ALCQI with nominals is NExpTime-complete.

Abstract: We study the complexity of the combination of the Description Logics ALCQ and ALCQI with a terminological formalism based on cardinality restrictions on concepts. These combinations can naturally be embedded into C2, the two variable fragment of predicate logic with counting quantifiers, which yields decidability in NExpTime. We show that this approach leads to an optimal solution for ALCQI, as ALCQI with cardinality restrictions has the same complexity as C2 (NExpTime-complete). In contrast, we show that for ALCQ, the problem can be solved in ExpTime. This result is obtained by a reduction of reasoning with cardinality restrictions to reasoning with the (in general weaker) terminological formalism of general axioms for ALCQ extended with nominals. Using the same reduction, we show that, for the extension of ALCQI with nominals, reasoning with general axioms is a NExpTime-complete problem. Finally, we sharpen this result and show that pure concept satisfiability for ALCQI with nominals is NExpTime-complete. Without nominals, this problem is known to be PSpace-complete.

192 citations

Journal Article•DOI•

Axiomatizing causal reasoning

Joseph Y. Halpern¹•Institutions (1)

Cornell University¹

01 Feb 2000-Journal of Artificial Intelligence Research

TL;DR: It is shown that to reason about causality in the most general third class, one must extend the language used by Galles and Pearl.

Abstract: Causal models defined in terms of a collection of equations, as defined by Pearl, are axiomatized here. Axiomatizations are provided for three successively more general classes of causal models: (1) the class of recursive theories (those without feedback), (2) the class of theories where the solutions to the equations are unique, (3) arbitrary theories (where the equations may not have solutions and, if they do, they are not necessarily unique). It is shown that to reason about causality in the most general third class, we must extend the language used by Galles and Pearl (1997, 1998). In addition, the complexity of the decision procedures is characterized for all the languages and classes of models considered.

191 citations

Journal Article•DOI•

Conformant planning via symbolic model checking

Alessandro Cimatti, Marco Roveri¹•Institutions (1)

University of Milan¹

01 Aug 2000-Journal of Artificial Intelligence Research

TL;DR: In this article, the authors present a general planning algorithm for conformant planning, which applies to fully nondeterministic domains, with uncertainty in the initial condition and in action effects, based on the representation of the planning domain as a finite state automaton.

Abstract: We tackle the problem of planning in nondeterministic domains, by presenting a new approach to conformant planning. Conformant planning is the problem of finding a sequence of actions that is guaranteed to achieve the goal despite the nondeterminism of the domain. Our approach is based on the representation of the planning domain as a finite state automaton. We use Symbolic Model Checking techniques, in particular Binary Decision Diagrams, to compactly represent and efficiently search the automaton. In this paper we make the following contributions. First, we present a general planning algorithm for conformant planning, which applies to fully nondeterministic domains, with uncertainty in the initial condition and in action effects. The algorithm is based on a breadth-first, backward search, and returns conformant plans of minimal length, if a solution to the planning problem exists, otherwise it terminates concluding that the problem admits no conformant solution. Second, we provide a symbolic representation of the search space based on Binary Decision Diagrams (BDDs), which is the basis for search techniques derived from symbolic model checking. The symbolic representation makes it possible to analyze potentially large sets of states and transitions in a single computation step, thus providing for an efficient implementation. Third, we present CMBP (Conformant Model Based Planner), an efficient implementation of the data structures and algorithm described above, directly based on BDD manipulations, which allows for a compact representation of the search layers and an efficient implementation of the search steps. Finally, we present an experimental comparison of our approach with the state-of-the-art conformant planners CGP, QBFPLAN and GPT. Our analysis includes all the planning problems from the distribution packages of these systems, plus other problems defined to stress a number of specific factors. 
Our approach appears to be the most effective: CMBP is strictly more expressive than QBFPLAN and CGP and, in all the problems where a comparison is possible, CMBP outperforms its competitors, sometimes by orders of magnitude.

147 citations

Journal Article•DOI•

On reasonable and forced goal orderings and their use in an agenda-driven planning algorithm

Jana Koehler, Jörg Hoffmann¹•Institutions (1)

University of Freiburg¹

01 Feb 2000-Journal of Artificial Intelligence Research

TL;DR: The paper formally defines and discusses two different goal orderings, which are called the reasonable and the forced ordering, and shows how the ordering relations can be used to compute a so-called goal agenda that divides G into an ordered set of subgoals.

Abstract: The paper addresses the problem of computing goal orderings, which is one of the longstanding issues in AI planning. It makes two new contributions. First, it formally defines and discusses two different goal orderings, which are called the reasonable and the forced ordering. Both orderings are defined for simple STRIPS operators as well as for more complex ADL operators supporting negation and conditional effects. The complexity of these orderings is investigated and their practical relevance is discussed. Secondly, two different methods to compute reasonable goal orderings are developed. One of them is based on planning graphs, while the other investigates the set of actions directly. Finally, it is shown how the ordering relations, which have been derived for a given set of goals G, can be used to compute a so-called goal agenda that divides G into an ordered set of subgoals. Any planner can then, in principle, use the goal agenda to plan for increasing sets of subgoals. This can lead to an exponential complexity reduction, as the solution to a complex planning problem is found by solving easier subproblems. Since only a polynomial overhead is caused by the goal agenda computation, a potential exists to dramatically speed up planning algorithms as we demonstrate in the empirical evaluation, where we use this method in the IPP planner.

146 citations

Journal Article•DOI•

On the compilability and expressive power of propositional planning formalisms

Bernhard Nebel¹•Institutions (1)

University of Freiburg¹

01 Feb 2000-Journal of Artificial Intelligence Research

TL;DR: In this paper, the expressiveness of a large family of propositional planning formalisms, ranging from basic STRIPS to a formalism with conditional effects, partial state specifications, and propositional formulae in the preconditions, is analyzed.

Abstract: The recent approaches of extending the GRAPHPLAN algorithm to handle more expressive planning formalisms raise the question of what the formal meaning of "expressive power" is. We formalize the intuition that expressive power is a measure of how concisely planning domains and plans can be expressed in a particular formalism by introducing the notion of "compilation schemes" between planning formalisms. Using this notion, we analyze the expressiveness of a large family of propositional planning formalisms, ranging from basic STRIPS to a formalism with conditional effects, partial state specifications, and propositional formulae in the preconditions. One of the results is that conditional effects cannot be compiled away if plan size should grow only linearly but can be compiled away if we allow for polynomial growth of the resulting plans. This result confirms that the recently proposed extensions to the GRAPHPLAN algorithm concerning conditional effects are optimal with respect to the "compilability" framework. Another result is that general propositional formulae cannot be compiled into conditional effects if the plan size should be preserved linearly. This implies that allowing general propositional formulae in preconditions and effect conditions adds another level of difficulty in generating a plan.

Journal Article•DOI•

Randomized algorithms for the loop cutset problem

Ann Becker¹, Reuven Bar-Yehuda¹, Dan Geiger¹•Institutions (1)

Technion – Israel Institute of Technology¹

01 Feb 2000-Journal of Artificial Intelligence Research

TL;DR: In this paper, a randomized algorithm is presented that finds a minimum weight loop cutset in a Bayesian network with probability at least 1 - (1 - 1/6^k)^(c 6^k), where c > 1 is a constant specified by the user and k is the minimal size of a minimum weight loop cutset.

Abstract: We show how to find a minimum weight loop cutset in a Bayesian network with high probability. Finding such a loop cutset is the first step in the method of conditioning for inference. Our randomized algorithm for finding a loop cutset outputs a minimum loop cutset after O(c 6^k kn) steps with probability at least 1 - (1 - 1/6^k)^(c 6^k), where c > 1 is a constant specified by the user, k is the minimal size of a minimum weight loop cutset, and n is the number of vertices. We also show empirically that a variant of this algorithm often finds a loop cutset that is closer to the minimum weight loop cutset than the ones found by the best deterministic algorithms known.
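
Reading the success probability as 1 - (1 - 1/6^k)^(c 6^k), a short calculation suggests why a user-chosen constant c controls the failure rate independently of k. This is a sketch using only the standard inequality 1 - x <= e^(-x); it is not taken from the paper.

```latex
% Uses 1 - x \le e^{-x} with x = 1/6^k; the exponent reading 6^k is an assumption.
\begin{align}
\left(1 - \frac{1}{6^k}\right)^{c\,6^k}
  &\le \left(e^{-1/6^k}\right)^{c\,6^k} = e^{-c}, \\
1 - \left(1 - \frac{1}{6^k}\right)^{c\,6^k}
  &\ge 1 - e^{-c}.
\end{align}
```

Under this reading, c = 5 already gives a success probability of at least 1 - e^(-5), roughly 0.993, whatever the value of k.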

Journal Article•DOI•

Robust agent teams via socially-attentive monitoring

Gal A. Kaminka¹, Milind Tambe¹•Institutions (1)

University of Southern California¹

01 Feb 2000-Journal of Artificial Intelligence Research

TL;DR: This work empirically and analytically explores a family of socially-attentive teamwork monitoring algorithms in two dynamic, complex, multi-agent domains, under varying conditions of task distribution and uncertainty, and shows that a centralized scheme using a complex algorithm trades correctness for completeness and requires monitoring all teammates.

Abstract: Agents in dynamic multi-agent environments must monitor their peers to execute individual and group plans. A key open question is how much monitoring of other agents' states is required to be effective: The Monitoring Selectivity Problem. We investigate this question in the context of detecting failures in teams of cooperating agents, via Socially-Attentive Monitoring, which focuses on monitoring for failures in the social relationships between the agents. We empirically and analytically explore a family of socially-attentive teamwork monitoring algorithms in two dynamic, complex, multi-agent domains, under varying conditions of task distribution and uncertainty. We show that a centralized scheme using a complex algorithm trades correctness for completeness and requires monitoring all teammates. In contrast, a simple distributed teamwork monitoring algorithm results in correct and complete detection of teamwork failures, despite relying on limited, uncertain knowledge, and monitoring only key agents in a team. In addition, we report on the design of a socially-attentive monitoring system and demonstrate its generality in monitoring several coordination relationships, diagnosing detected failures, and both on-line and off-line applications.

Journal Article•DOI•

OBDD-based universal planning for synchronized agents in non-deterministic domains

Rune Møller Jensen¹, Manuela Veloso¹•Institutions (1)

Carnegie Mellon University¹

01 Aug 2000-Journal of Artificial Intelligence Research

TL;DR: In this article, a new planning domain description language, NADL, has been proposed to specify non-deterministic, multi-agent domains with controllable agents and uncontrollable environment agents.

Abstract: Recently model checking representation and search techniques were shown to be efficiently applicable to planning, in particular to non-deterministic planning. Such planning approaches use Ordered Binary Decision Diagrams (OBDDs) to encode a planning domain as a non-deterministic finite automaton and then apply fast algorithms from model checking to search for a solution. OBDDs can effectively scale and can provide universal plans for complex planning domains. We are particularly interested in addressing the complexities arising in non-deterministic, multi-agent domains. In this article, we present UMOP, a new universal OBDD-based planning framework for non-deterministic, multi-agent domains. We introduce a new planning domain description language, NADL, to specify non-deterministic, multi-agent domains. The language contributes the explicit definition of controllable agents and uncontrollable environment agents. We describe the syntax and semantics of NADL and show how to build an efficient OBDD-based representation of an NADL description. The UMOP planning system uses NADL and different OBDD-based universal planning algorithms. It includes the previously developed strong and strong cyclic planning algorithms. In addition, we introduce our new optimistic planning algorithm that relaxes optimality guarantees and generates plausible universal plans in some domains where no strong nor strong cyclic solution exists. We present empirical results applying UMOP to domains ranging from deterministic and single-agent with no environment actions to non-deterministic and multi-agent with complex environment actions. UMOP is shown to be a rich and efficient planning system.

Journal Article•DOI•

Planning graph as a (dynamic) CSP: exploiting EBL, DDB and other CSP search techniques in Graphplan

Subbarao Kambhampati¹•Institutions (1)

Arizona State University¹

01 Feb 2000-Journal of Artificial Intelligence Research

TL;DR: This paper describes how explanation based learning, dependency directed backtracking, dynamic variable ordering, forward checking, sticky values and random-restart search strategies can be adapted to Graphplan and demonstrates that these augmentations improve Graphplan's performance significantly.

Abstract: This paper reviews the connections between Graphplan's planning-graph and the dynamic constraint satisfaction problem and motivates the need for adapting CSP search techniques to the Graphplan algorithm. It then describes how explanation based learning, dependency directed backtracking, dynamic variable ordering, forward checking, sticky values and random-restart search strategies can be adapted to Graphplan. Empirical results are provided to demonstrate that these augmentations improve Graphplan's performance significantly (up to 1000x speedups) on several benchmark problems. Special attention is paid to the explanation-based learning and dependency directed backtracking techniques as they are empirically found to be most useful in improving the performance of Graphplan.

Journal Article•DOI•

Backbone fragility and the local search cost peak

Josh Singer¹, Ian P. Gent², Alan Smaill¹•Institutions (2)

University of Edinburgh¹, University of St Andrews²

01 Feb 2000-Journal of Artificial Intelligence Research

TL;DR: It is proposed that high-cost random instances for local search are those with very large backbones which are also backbone-fragile, and that the decay in cost beyond the satisfiability threshold is due to increasing backbone robustness (the opposite of backbone fragility).

Abstract: The local search algorithm WSAT is one of the most successful algorithms for solving the satisfiability (SAT) problem. It is notably effective at solving hard Random 3-SAT instances near the so-called 'satisfiability threshold', but still shows a peak in search cost near the threshold and large variations in cost over different instances. We make a number of significant contributions to the analysis of WSAT on high-cost random instances, using the recently-introduced concept of the backbone of a SAT instance. The backbone is the set of literals which are entailed by an instance. We find that the number of solutions predicts the cost well for small-backbone instances but is much less relevant for the large-backbone instances which appear near the threshold and dominate in the overconstrained region. We show a very strong correlation between search cost and the Hamming distance to the nearest solution early in WSAT's search. This pattern leads us to introduce a measure of the backbone fragility of an instance, which indicates how persistent the backbone is as clauses are removed. We propose that high-cost random instances for local search are those with very large backbones which are also backbone-fragile. We suggest that the decay in cost beyond the satisfiability threshold is due to increasing backbone robustness (the opposite of backbone fragility). Our hypothesis makes three correct predictions. First, that the backbone robustness of an instance is negatively correlated with the local search cost when other factors are controlled for. Second, that backbone-minimal instances (which are 3-SAT instances altered so as to be more backbone-fragile) are unusually hard for WSAT. Third, that the clauses most often unsatisfied during search are those whose deletion has the most effect on the backbone. In understanding the pathologies of local search methods, we hope to contribute to the development of new and better techniques.
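
The abstract defines the backbone as the set of literals entailed by an instance. As a minimal illustration of that definition only, the sketch below computes a backbone by brute-force enumeration; the function name `backbone` and the DIMACS-style clause encoding are assumptions made here, and the approach is exponential in the number of variables, so it is suitable only for toy instances and is not the method used in the paper.

```python
from itertools import product


def backbone(clauses, num_vars):
    """Return the set of literals that hold in every satisfying assignment.

    clauses: list of clauses, each a list of non-zero ints (DIMACS style:
             3 means x3 is true, -3 means x3 is false).
    Returns None when the instance is unsatisfiable.
    """
    common = None
    for bits in product([False, True], repeat=num_vars):
        # A clause is satisfied when at least one of its literals is true.
        satisfied = all(
            any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        )
        if not satisfied:
            continue
        # Literals made true by this assignment.
        lits = {v if bits[v - 1] else -v for v in range(1, num_vars + 1)}
        # Keep only literals shared by all satisfying assignments so far.
        common = lits if common is None else common & lits
    return common


# (x1) and (not x1 or x2) and (x2 or x3): x1 and x2 are forced, x3 is free.
example = [[1], [-1, 2], [2, 3]]
print(sorted(backbone(example, 3)))
```

In this toy instance x1 and x2 are in the backbone while x3 is not, since both values of x3 extend to a solution.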

Journal Article•DOI•

Space efficiency of propositional knowledge representation formalisms

Marco Cadoli¹, Francesco M. Donini², Paolo Liberatore¹, Marco Schaerf¹•Institutions (2)

Sapienza University of Rome¹, Instituto Politécnico Nacional²

01 Aug 2000-Journal of Artificial Intelligence Research

TL;DR: In this paper, the authors investigate the space efficiency of a Propositional Knowledge Representation (PKR) formalism, where knowledge is either a set of propositional interpretations (models) or propositional formulae (theorems).

Abstract: We investigate the space efficiency of a Propositional Knowledge Representation (PKR) formalism. Intuitively, the space efficiency of a formalism F in representing a certain piece of knowledge α, is the size of the shortest formula of F that represents α. In this paper we assume that knowledge is either a set of propositional interpretations (models) or a set of propositional formulae (theorems). We provide a formal way of talking about the relative ability of PKR formalisms to compactly represent a set of models or a set of theorems. We introduce two new compactness measures, the corresponding classes, and show that the relative space efficiency of a PKR formalism in representing models/theorems is directly related to such classes. In particular, we consider formalisms for nonmonotonic reasoning, such as circumscription and default logic, as well as belief revision operators and the stable model semantics for logic programs with negation. One interesting result is that formalisms with the same time complexity do not necessarily belong to the same space efficiency class.

Journal Article•DOI•

On deducing conditional independence from d-separation in causal graphs with feedback

Radford M. Neal¹•Institutions (1)

University of Toronto¹

01 Feb 2000-Journal of Artificial Intelligence Research

TL;DR: This article shows by example that the d-separation criterion for conditional independence in acyclic causal networks does not, in general, apply to networks of discrete variables that have feedback cycles, even when the variables of the system are uniquely determined by the random disturbances; some condition stronger than uniqueness is needed.

Abstract: Pearl and Dechter (1996) claimed that the d-separation criterion for conditional independence in acyclic causal networks also applies to networks of discrete variables that have feedback cycles, provided that the variables of the system are uniquely determined by the random disturbances. I show by example that this is not true in general. Some condition stronger than uniqueness is needed, such as the existence of a causal dynamics guaranteed to lead to the unique solution.

Journal Article•DOI•

Asimovian adaptive agents

Diana F. Gordon¹•Institutions (1)

United States Naval Research Laboratory¹

01 Aug 2000-Journal of Artificial Intelligence Research

TL;DR: Two solutions are presented: positive results that certain learning operators are a priori guaranteed to preserve useful classes of behavioral assurance constraints (which implies that no reverification is needed for these operators), and efficient incremental reverification algorithms for those learning operators that have negative a priori results.

Abstract: The goal of this research is to develop agents that are adaptive and predictable and timely. At first blush, these three requirements seem contradictory. For example, adaptation risks introducing undesirable side effects, thereby making agents' behavior less predictable. Furthermore, although formal verification can assist in ensuring behavioral predictability, it is known to be time-consuming. Our solution to the challenge of satisfying all three requirements is the following. Agents have finite-state automaton plans, which are adapted online via evolutionary learning (perturbation) operators. To ensure that critical behavioral constraints are always satisfied, agents' plans are first formally verified. They are then reverified after every adaptation. If reverification concludes that constraints are violated, the plans are repaired. The main objective of this paper is to improve the efficiency of reverification after learning, so that agents have a sufficiently rapid response time. We present two solutions: positive results that certain learning operators are a priori guaranteed to preserve useful classes of behavioral assurance constraints (which implies that no reverification is needed for these operators), and efficient incremental reverification algorithms for those learning operators that have negative a priori results.

Journal Article•DOI•

Reasoning on interval and point-based disjunctive metric constraints in temporal contexts

Federico Barber¹•Institutions (1)

Polytechnic University of Valencia¹

01 Feb 2000-Journal of Artificial Intelligence Research

TL;DR: A temporal model for reasoning on disjunctive metric constraints on intervals and time points in temporal contexts that is able to represent non-binary constraints, such that logical dependencies on disjuncts in constraints can be handled.

Abstract: We introduce a temporal model for reasoning on disjunctive metric constraints on intervals and time points in temporal contexts. This temporal model is composed of a labeled temporal algebra and its reasoning algorithms. The labeled temporal algebra defines labeled disjunctive metric point-based constraints, where each disjunct in each input disjunctive constraint is univocally associated to a label. Reasoning algorithms manage labeled constraints, associated label lists, and sets of mutually inconsistent disjuncts. These algorithms guarantee consistency and obtain a minimal network. Additionally, constraints can be organized in a hierarchy of alternative temporal contexts. Therefore, we can reason on context-dependent disjunctive metric constraints on intervals and points. Moreover, the model is able to represent non-binary constraints, such that logical dependencies on disjuncts in constraints can be handled. The computational cost of reasoning algorithms is exponential in accordance with the underlying problem complexity, although some improvements are proposed.
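To make the notion of a disjunctive metric point-based constraint concrete, the sketch below checks consistency by brute force: it picks one disjunct per constraint and tests the resulting simple temporal problem for negative cycles with Floyd-Warshall. This is not the labeled temporal algebra described in the abstract; it is a generic enumeration approach swapped in for illustration, and the function names and the `(i, j, [(lo, hi), ...])` encoding are assumptions. Its cost grows exponentially with the number of disjunctive constraints, in line with the complexity noted in the abstract.

```python
from itertools import product

INF = float("inf")

def stp_consistent(num_points, simple_constraints):
    # Each simple constraint (i, j, lo, hi) means lo <= x_j - x_i <= hi.
    # Build the distance graph: edge i->j with weight hi, edge j->i with weight -lo.
    d = [[0 if a == b else INF for b in range(num_points)]
         for a in range(num_points)]
    for i, j, lo, hi in simple_constraints:
        d[i][j] = min(d[i][j], hi)
        d[j][i] = min(d[j][i], -lo)
    # Floyd-Warshall all-pairs shortest paths.
    for k in range(num_points):
        for a in range(num_points):
            for b in range(num_points):
                if d[a][k] + d[k][b] < d[a][b]:
                    d[a][b] = d[a][k] + d[k][b]
    # A negative entry on the diagonal signals a negative cycle, i.e. inconsistency.
    return all(d[a][a] >= 0 for a in range(num_points))

def consistent_choices(num_points, disjunctive_constraints):
    # Each disjunctive constraint is (i, j, [(lo, hi), ...]):
    # x_j - x_i must fall in at least one of the listed intervals.
    options = [
        [(i, j, lo, hi) for lo, hi in intervals]
        for i, j, intervals in disjunctive_constraints
    ]
    # Try every combination of one disjunct per constraint.
    return [
        combo for combo in product(*options)
        if stp_consistent(num_points, combo)
    ]

# Three time points; only the first disjunct of the first constraint is viable.
example = [
    (0, 1, [(1, 2), (5, 6)]),
    (1, 2, [(1, 2)]),
    (0, 2, [(2, 4)]),
]
viable = consistent_choices(3, example)
```

In this example, choosing the interval [5, 6] for the first constraint forces the third point to lie at least 6 units after the first, which contradicts the [2, 4] bound, so only the combination using [1, 2] survives.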

Journal Article•DOI•

A model of inductive bias learning

Jonathan Baxter

01 Mar 2000-Journal of Artificial Intelligence Research

TL;DR: A major problem in machine learning is that of inductive bias: how to choose a learner's hypothesis space so that it is large enough to contain a solution to the problem being learnt, yet small eno...

Journal Article•DOI•

Planning graph as a (dynamic) CSP

Subbarao Kambhampati

01 Feb 2000-Journal of Artificial Intelligence Research

TL;DR: This paper reviews the connections between Graphplan's planning-graph and the dynamic constraint satisfaction problem and motivates the need for adapting CSP search techniques to the Graphplan algo.

Journal Article•DOI•

Exact phase transitions in random constraint satisfaction problems

Ke Xu, Wei Li

01 Mar 2000-Journal of Artificial Intelligence Research

TL;DR: In this paper, a new type of random CSP model, called Model RB, is proposed, which is a revision to the standard Model B. It is proved that phase transitions from a region where almost all problems ar...
