
Showing papers in "Journal of Artificial Intelligence Research in 2008"


Journal ArticleDOI
TL;DR: This article suggests an alternative mechanism for coordinating the movement of autonomous vehicles through intersections and demonstrates in simulation that this new mechanism has the potential to significantly outperform current intersection control technology--traffic lights and stop signs.
Abstract: Artificial intelligence research is ushering in a new era of sophisticated, mass-market transportation technology. While computers can already fly a passenger jet better than a trained human pilot, people are still faced with the dangerous yet tedious task of driving automobiles. Intelligent Transportation Systems (ITS) is the field that focuses on integrating information technology with vehicles and transportation infrastructure to make transportation safer, cheaper, and more efficient. Recent advances in ITS point to a future in which vehicles themselves handle the vast majority of the driving task. Once autonomous vehicles become popular, autonomous interactions amongst multiple vehicles will be possible. Current methods of vehicle coordination, which are all designed to work with human drivers, will be outdated. The bottleneck for roadway efficiency will no longer be the drivers, but rather the mechanism by which those drivers' actions are coordinated. While open-road driving is a well-studied and more-or-less-solved problem, urban traffic scenarios, especially intersections, are much more challenging. We believe current methods for controlling traffic, specifically at intersections, will not be able to take advantage of the increased sensitivity and precision of autonomous vehicles as compared to human drivers. In this article, we suggest an alternative mechanism for coordinating the movement of autonomous vehicles through intersections. Drivers and intersections in this mechanism are treated as autonomous agents in a multiagent system. In this multiagent system, intersections use a new reservation-based approach built around a detailed communication protocol, which we also present. We demonstrate in simulation that our new mechanism has the potential to significantly outperform current intersection control technology--traffic lights and stop signs. Because our mechanism can emulate a traffic light or stop sign, it subsumes the most popular current methods of intersection control. This article also presents two extensions to the mechanism. The first extension allows the system to control human-driven vehicles in addition to autonomous vehicles. The second gives priority to emergency vehicles without significant cost to civilian vehicles. The mechanism, including both extensions, is implemented and tested in simulation, and we present experimental results that strongly attest to the efficacy of this approach.
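To make the reservation-based idea concrete, here is a minimal Python sketch of an intersection agent that grants or rejects requests for space-time slots. The tile/time-step granularity, the request format, and the class names are illustrative assumptions, not the paper's actual protocol.

# Minimal sketch of a reservation-based intersection manager (illustrative only;
# the tile/step granularity and request format are assumptions, not the paper's protocol).
class IntersectionManager:
    def __init__(self):
        # maps (tile, time_step) -> vehicle id that has reserved it
        self.reservations = {}

    def request(self, vehicle_id, tiles_by_step):
        """tiles_by_step: dict mapping time_step -> set of tiles the vehicle would occupy."""
        needed = [(tile, t) for t, tiles in tiles_by_step.items() for tile in tiles]
        # Reject if any required space-time tile is already reserved by another vehicle.
        if any(self.reservations.get(key, vehicle_id) != vehicle_id for key in needed):
            return False  # vehicle must slow down and retry later
        for key in needed:
            self.reservations[key] = vehicle_id
        return True  # reservation confirmed; vehicle may proceed as declared

manager = IntersectionManager()
print(manager.request("car-1", {0: {(0, 0)}, 1: {(0, 1)}}))  # True: granted
print(manager.request("car-2", {1: {(0, 1)}}))               # False: conflicts with car-1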

1,163 citations


Journal ArticleDOI
TL;DR: SATzilla is described, an automated approach for constructing per-instance algorithm portfolios for SAT that use so-called empirical hardness models to choose among their constituent solvers and is improved by integrating local search solvers as candidate solvers, by predicting performance score instead of runtime, and by using hierarchical hardness models that take into account different types of SAT instances.
Abstract: It has been widely observed that there is no single "dominant" SAT solver; instead, different solvers perform best on different instances. Rather than following the traditional approach of choosing the best solver for a given class of instances, we advocate making this decision online on a per-instance basis. Building on previous work, we describe SATzilla, an automated approach for constructing per-instance algorithm portfolios for SAT that use so-called empirical hardness models to choose among their constituent solvers. This approach takes as input a distribution of problem instances and a set of component solvers, and constructs a portfolio optimizing a given objective function (such as mean runtime, percent of instances solved, or score in a competition). The excellent performance of SATzilla was independently verified in the 2007 SAT Competition, where our SATzilla07 solvers won three gold, one silver and one bronze medal. In this article, we go well beyond SATzilla07 by making the portfolio construction scalable and completely automated, and improving it by integrating local search solvers as candidate solvers, by predicting performance score instead of runtime, and by using hierarchical hardness models that take into account different types of SAT instances. We demonstrate the effectiveness of these new techniques in extensive experimental results on data sets including instances from the most recent SAT competition.
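The portfolio idea can be sketched in a few lines: fit one "empirical hardness model" per solver that maps instance features to predicted (log) runtime, then pick the solver with the smallest prediction for each new instance. The synthetic features, linear models, and solver names below are placeholders; SATzilla's actual feature set and model class are described in the paper.

# Per-instance solver selection with toy empirical hardness models (illustrative data).
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((50, 4))  # pretend instance features
runtimes = {"solver_A": X @ np.array([1.0, 2.0, 0.0, 0.0]) + 0.1,
            "solver_B": X @ np.array([0.0, 0.0, 3.0, 1.0]) + 0.1}

# Fit one linear model per solver: features -> log runtime.
models = {name: np.linalg.lstsq(X, np.log(y), rcond=None)[0] for name, y in runtimes.items()}

def pick_solver(features):
    # Select the solver whose model predicts the smallest (log) runtime on this instance.
    predicted = {name: float(features @ w) for name, w in models.items()}
    return min(predicted, key=predicted.get)

print(pick_solver(rng.random(4)))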

916 citations


Journal ArticleDOI
TL;DR: The objectives here are to survey the various existing online POMDP methods, analyze their properties and discuss their advantages and disadvantages; and to thoroughly evaluate these online approaches in different environments under various metrics.
Abstract: Partially Observable Markov Decision Processes (POMDPs) provide a rich framework for sequential decision-making under uncertainty in stochastic domains. However, solving a POMDP is often intractable except for small problems due to their complexity. Here, we focus on online approaches that alleviate the computational complexity by computing good local policies at each decision step during the execution. Online algorithms generally consist of a lookahead search to find the best action to execute at each time step in an environment. Our objectives here are to survey the various existing online POMDP methods, analyze their properties and discuss their advantages and disadvantages; and to thoroughly evaluate these online approaches in different environments under various metrics (return, error bound reduction, lower bound improvement). Our experimental results indicate that state-of-the-art online heuristic search methods can handle large POMDP domains efficiently.
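The basic shape of such an online planner is a depth-limited lookahead over belief states. The sketch below shows that shape on a made-up two-state model; real online solvers add bounds, pruning, and heuristics, and the toy transition/observation/reward tables are assumptions for illustration only.

def belief_update(b, a, o, T, O):
    # Bayes update of the belief b after taking action a and observing o.
    new_b = {s2: sum(b[s] * T[s][a][s2] for s in b) * O[s2][a][o] for s2 in b}
    norm = sum(new_b.values())
    return {s: (p / norm if norm > 0 else 0.0) for s, p in new_b.items()}, norm

def lookahead(b, depth, A, Obs, T, O, R, gamma=0.95):
    # Depth-limited expectimax over belief states; returns (value, best action).
    if depth == 0:
        return max(sum(b[s] * R[s][a] for s in b) for a in A), None
    best_v, best_a = float("-inf"), None
    for a in A:
        v = sum(b[s] * R[s][a] for s in b)
        for o in Obs:
            nb, p_o = belief_update(b, a, o, T, O)
            if p_o > 0:
                v += gamma * p_o * lookahead(nb, depth - 1, A, Obs, T, O, R, gamma)[0]
        if v > best_v:
            best_v, best_a = v, a
    return best_v, best_a

S = ["s0", "s1"]; A = ["stay", "switch"]; Obs = ["low", "high"]
T = {s: {a: {s2: (0.9 if (s2 == s) == (a == "stay") else 0.1) for s2 in S} for a in A} for s in S}
O = {s2: {a: {"low": 0.8 if s2 == "s0" else 0.2, "high": 0.2 if s2 == "s0" else 0.8} for a in A} for s2 in S}
R = {"s0": {"stay": 1.0, "switch": 0.0}, "s1": {"stay": 0.0, "switch": 1.0}}
print(lookahead({"s0": 0.5, "s1": 0.5}, 3, A, Obs, T, O, R))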

557 citations


Journal ArticleDOI
TL;DR: This paper introduces the notions of conservative extension, safety and module for a very general class of logic-based ontology languages, and provides the notion of a safety class, which characterizes any sufficient condition for safety, and identifies a family of safety classes--called locality--which enjoys a collection of desirable properties.
Abstract: In this paper, we propose a set of tasks that are relevant for the modular reuse of ontologies. In order to formalize these tasks as reasoning problems, we introduce the notions of conservative extension, safety and module for a very general class of logic-based ontology languages. We investigate the general properties of and relationships between these notions and study the relationships between the relevant reasoning problems we have previously identified. To study the computability of these problems, we consider, in particular, Description Logics (DLs), which provide the formal underpinning of the W3C Web Ontology Language (OWL), and show that all the problems we consider are undecidable or algorithmically unsolvable for the description logic underlying OWL DL. In order to achieve a practical solution, we identify conditions sufficient for an ontology to reuse a set of symbols "safely"--that is, without changing their meaning. We provide the notion of a safety class, which characterizes any sufficient condition for safety, and identify a family of safety classes--called locality--which enjoys a collection of desirable properties. We use the notion of a safety class to extract modules from ontologies, and we provide various modularization algorithms that are appropriate to the properties of the particular safety class in use. Finally, we show practical benefits of our safety checking and module extraction algorithms.
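The overall shape of locality-based module extraction is a fixpoint loop: keep pulling in axioms that are not "local" with respect to the signature gathered so far. The sketch below is a crude stand-in, where an axiom is just a pair of symbol sets and "non-local" simply means "mentions a symbol of the current signature"; the paper's actual syntactic and semantic locality conditions on DL axioms are much more refined.

def extract_module(axioms, seed_signature):
    # axioms: list of (defined_symbols, used_symbols); crude stand-in for DL axioms.
    module, signature = [], set(seed_signature)
    changed = True
    while changed:
        changed = False
        for ax in axioms:
            defined, used = ax
            if ax not in module and (defined | used) & signature:
                module.append(ax)             # treated as non-local: it constrains the signature
                signature |= defined | used   # its symbols may pull in further axioms
                changed = True
    return module

axioms = [({"Cystic_Fibrosis"}, {"Fibrosis", "located_in", "Pancreas"}),
          ({"Fibrosis"}, {"Disease"}),
          ({"Car"}, {"Vehicle"})]
print(extract_module(axioms, {"Cystic_Fibrosis"}))  # the Car/Vehicle axiom stays out of the module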

416 citations


Journal ArticleDOI
TL;DR: This work shows how previous formulations of sentence compression can be recast as ILPs and extends these models with novel global constraints to infer globally optimal compressions in the presence of linguistically motivated constraints.
Abstract: Sentence compression holds promise for many applications ranging from summarization to subtitle generation. Our work views sentence compression as an optimization problem and uses integer linear programming (ILP) to infer globally optimal compressions in the presence of linguistically motivated constraints. We show how previous formulations of sentence compression can be recast as ILPs and extend these models with novel global constraints. Experimental results on written and spoken texts demonstrate improvements over state-of-the-art models.
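The compression-as-optimization view can be illustrated with a toy stand-in: choose a 0/1 "keep" decision per token to maximize a score subject to constraints (here, a dependent may be kept only if its head is kept). The paper formulates this as an ILP; the exhaustive search, scores, and dependencies below are made up purely to make the objective and constraints concrete.

from itertools import product

tokens = ["the", "very", "old", "committee", "approved", "the", "plan"]
score  = [0.1,   0.05,  0.3,   0.9,          1.0,        0.1,   0.8]
head   = [3,     2,     3,     4,            None,       6,     4]   # index of syntactic head

def best_compression(max_len=5):
    best, best_keep = float("-inf"), None
    for keep in product([0, 1], repeat=len(tokens)):
        if sum(keep) > max_len:
            continue
        # Constraint: keep a token only if its head is also kept.
        if any(k and head[i] is not None and not keep[head[i]] for i, k in enumerate(keep)):
            continue
        value = sum(k * s for k, s in zip(keep, score))
        if value > best:
            best, best_keep = value, keep
    return [t for t, k in zip(tokens, best_keep) if k]

print(best_compression())  # e.g. ['the', 'old', 'committee', 'approved', 'plan']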

337 citations


Journal ArticleDOI
TL;DR: This paper studies whether similar Q-value functions can be defined for decentralized POMDP models (Dec-POMDPs), and how policies can be extracted from such value functions, and describes a family of algorithms for extracting policies from such Q- value functions.
Abstract: Decision-theoretic planning is a popular approach to sequential decision making problems, because it treats uncertainty in sensing and acting in a principled way. In single-agent frameworks like MDPs and POMDPs, planning can be carried out by resorting to Q-value functions: an optimal Q-value function Q* is computed in a recursive manner by dynamic programming, and then an optimal policy is extracted from Q*. In this paper we study whether similar Q-value functions can be defined for decentralized POMDP models (Dec-POMDPs), and how policies can be extracted from such value functions. We define two forms of the optimal Q-value function for Dec-POMDPs: one that gives a normative description as the Q-value function of an optimal pure joint policy and another one that is sequentially rational and thus gives a recipe for computation. This computation, however, is infeasible for all but the smallest problems. Therefore, we analyze various approximate Q-value functions that allow for efficient computation. We describe how they relate, and we prove that they all provide an upper bound to the optimal Q-value function Q*. Finally, unifying some previous approaches for solving Dec-POMDPs, we describe a family of algorithms for extracting policies from such Q-value functions, and perform an experimental evaluation on existing test problems, including a new firefighting benchmark problem.

318 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider unions of conjunctive queries over knowledge bases formulated in the prominent DL SHIQ and allow transitive roles in both the query and the knowledge base, and show decidability of query answering in this setting and establish two tight complexity bounds.
Abstract: Conjunctive queries play an important role as an expressive query language for Description Logics (DLs). Although modern DLs usually provide for transitive roles, conjunctive query answering over DL knowledge bases is only poorly understood if transitive roles are admitted in the query. In this paper, we consider unions of conjunctive queries over knowledge bases formulated in the prominent DL SHIQ and allow transitive roles in both the query and the knowledge base. We show decidability of query answering in this setting and establish two tight complexity bounds: regarding combined complexity, we prove that there is a deterministic algorithm for query answering that needs time single exponential in the size of the KB and double exponential in the size of the query, which is optimal. Regarding data complexity, we prove containment in co-NP.

207 citations


Journal ArticleDOI
TL;DR: In this article, the map is partitioned into subgraphs of known structure with entry and exit restrictions, and planning then becomes a search in the much smaller space of subgraph configurations.
Abstract: Multi-robot path planning is difficult due to the combinatorial explosion of the search space with every new robot added. Complete search of the combined state space soon becomes intractable. In this paper we present a novel form of abstraction that allows us to plan much more efficiently. The key to this abstraction is the partitioning of the map into subgraphs of known structure with entry and exit restrictions, which we can represent compactly. Planning then becomes a search in the much smaller space of subgraph configurations. Once an abstract plan is found, it can be quickly resolved into a correct (but possibly sub-optimal) concrete plan without the need for further search. We prove that this technique is sound and complete and demonstrate its practical effectiveness on a real map. A contending solution, prioritised planning, is also evaluated and shown to have similar performance, albeit at the cost of completeness. The two approaches are not necessarily conflicting; we demonstrate how they can be combined into a single algorithm which outperforms either approach alone.

193 citations


Journal ArticleDOI
TL;DR: This work considers the problem of allocating fairly a set of indivisible goods among agents from the point of view of compact representation and computational complexity, and identifies the complexity of determining whether there exists an efficient and envy-free allocation.
Abstract: We consider the problem of allocating fairly a set of indivisible goods among agents from the point of view of compact representation and computational complexity. We start by assuming that agents have dichotomous preferences expressed by propositional formulae. We express efficiency and envy-freeness in a logical setting, which reveals unexpected connections to nonmonotonic reasoning. Then we identify the complexity of determining whether there exists an efficient and envy-free allocation, for several notions of efficiency, when preferences are represented in a succinct way (as well as restrictions of this problem). We first study the problem under the assumption that preferences are dichotomous, and then in the general case.
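A minimal sketch of the decision problem in the dichotomous case follows. Here an agent's preference is encoded as a set of goods it wants (satisfied iff it receives all of them), efficiency is taken as maximizing the number of satisfied agents, and envy-freeness means no unsatisfied agent would be satisfied with another agent's bundle; the paper works with propositional formulae and several efficiency notions, so this encoding is only one illustrative instantiation.

from itertools import product

goods = ["a", "b", "c"]
wants = {"1": {"a"}, "2": {"b", "c"}, "3": {"a", "b"}}  # made-up dichotomous preferences

def satisfied(agent, bundle):
    return wants[agent] <= bundle

def allocations():
    agents = list(wants)
    for assign in product(agents, repeat=len(goods)):  # every good goes to exactly one agent
        yield {ag: {g for g, owner in zip(goods, assign) if owner == ag} for ag in agents}

def envy_free(alloc):
    # agent i envies j if i is unsatisfied with its own bundle but would be satisfied with j's
    return not any(not satisfied(i, alloc[i]) and satisfied(i, alloc[j])
                   for i in alloc for j in alloc if i != j)

best_welfare = max(sum(satisfied(i, al[i]) for i in al) for al in allocations())
efficient_and_ef = [al for al in allocations()
                    if envy_free(al) and sum(satisfied(i, al[i]) for i in al) == best_welfare]
print(efficient_and_ef[:1])  # one efficient, envy-free allocation (if any exists)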

146 citations


Journal ArticleDOI
TL;DR: An empirical evaluation comparing a wide set of solving alternatives on a broad set of optimization benchmarks indicates that the performance of MINIMAXSAT is usually close to the best specialized alternative and, in some cases, even better.
Abstract: In this paper we introduce MINIMAXSAT, a new Max-SAT solver that is built on top of MINISAT+. It incorporates the best current SAT and Max-SAT techniques. It can handle hard clauses (clauses of mandatory satisfaction as in SAT), soft clauses (clauses whose falsification is penalized by a cost as in Max-SAT) as well as pseudo-boolean objective functions and constraints. Its main features are: learning and backjumping on hard clauses; resolution-based and subtraction-based lower bounding; and lazy propagation with the two-watched literal scheme. Our empirical evaluation comparing a wide set of solving alternatives on a broad set of optimization benchmarks indicates that the performance of MINIMAXSAT is usually close to the best specialized alternative and, in some cases, even better.
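The hard/soft clause distinction is easy to state in code: hard clauses must hold, each falsified soft clause adds its cost, and the objective is minimum total cost. The brute-force enumeration below only makes that objective concrete on a tiny made-up instance; MINIMAXSAT itself relies on learning, lower bounding, and lazy propagation rather than enumeration.

from itertools import product

# A literal is (variable, polarity); a clause is a list of literals.
hard = [[(1, True), (2, True)]]                 # x1 or x2 (must be satisfied)
soft = [([(1, False)], 3), ([(2, False)], 2)]   # (not x1, cost 3), (not x2, cost 2)

def holds(clause, assignment):
    return any(assignment[v] == pol for v, pol in clause)

def min_cost(n_vars=2):
    best = None
    for values in product([False, True], repeat=n_vars):
        assignment = dict(enumerate(values, start=1))
        if not all(holds(c, assignment) for c in hard):
            continue  # hard clause violated: assignment is inadmissible
        cost = sum(w for c, w in soft if not holds(c, assignment))
        best = cost if best is None else min(best, cost)
    return best

print(min_cost())  # 2: set x2 true and x1 false, paying only the cheaper soft clause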

138 citations


Journal ArticleDOI
TL;DR: This paper characterize the worst-case complexity of manipulation and control in the context of four prominent multiwinner voting systems, under different formulations of the strategic agent's goal.
Abstract: Although recent years have seen a surge of interest in the computational aspects of social choice, no specific attention has previously been devoted to elections with multiple winners, e.g., elections of an assembly or committee. In this paper, we characterize the worst-case complexity of manipulation and control in the context of four prominent multiwinner voting systems, under different formulations of the strategic agent's goal.

Journal ArticleDOI
TL;DR: The notion of analogical dissimilarity, a measure of how far four objects are from being in analogical proportion, is defined, along with learning algorithms, i.e. methods to find the triple of objects in a learning sample which has the least analogical dissimilarity with a given object.
Abstract: This paper defines the notion of analogical dissimilarity between four objects, with a special focus on objects structured as sequences. Firstly, it studies the case where the four objects have a null analogical dissimilarity, i.e. are in analogical proportion. Secondly, when one of these objects is unknown, it gives algorithms to compute it. Thirdly, it tackles the problem of defining analogical dissimilarity, which is a measure of how far four objects are from being in analogical proportion. In particular, when objects are sequences, it gives a definition and an algorithm based on an optimal alignment of the four sequences. It also gives learning algorithms, i.e. methods to find the triple of objects in a learning sample which has the least analogical dissimilarity with a given object. Two practical experiments are described: the first is a classification problem on benchmarks of binary and nominal data, the second shows how the generation of sequences by solving analogical equations enables a handwritten character recognition system to rapidly be adapted to a new writer.
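As a toy numeric analogue (not the paper's sequence-alignment algorithm), solving the analogical equation "a is to b as c is to x" for feature vectors, and measuring how far a fourth object d is from that solution, looks like this; the componentwise arithmetic is an illustrative assumption.

def solve_analogy(a, b, c):
    # toy numeric analogue of "a : b :: c : x": x = c + (b - a), componentwise
    return [ci + (bi - ai) for ai, bi, ci in zip(a, b, c)]

def analogical_dissimilarity(a, b, c, d):
    # how far (a, b, c, d) is from being in proportion: distance between d and the solved x
    return sum(abs(di - xi) for di, xi in zip(d, solve_analogy(a, b, c)))

print(solve_analogy([1, 0], [1, 1], [3, 0]))                      # [3, 1]
print(analogical_dissimilarity([1, 0], [1, 1], [3, 0], [3, 1]))   # 0: in analogical proportion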

Journal ArticleDOI
TL;DR: In this paper, the authors investigated the computational complexity of testing dominance and consistency in CP-nets and showed that both dominance and consistency for general CP-nets are PSPACE-complete.
Abstract: We investigate the computational complexity of testing dominance and consistency in CP-nets. Previously, the complexity of dominance has been determined for restricted classes in which the dependency graph of the CP-net is acyclic. However, there are preferences of interest that define cyclic dependency graphs; these are modeled with general CP-nets. In our main results, we show here that both dominance and consistency for general CP-nets are PSPACE-complete. We then consider the concept of strong dominance, dominance equivalence and dominance incomparability, and several notions of optimality, and identify the complexity of the corresponding decision problems. The reductions used in the proofs are from STRIPS planning, and thus reinforce the earlier established connections between both areas.

Journal ArticleDOI
TL;DR: The Latent Relation Mapping Engine (LRME), as described in this paper, combines ideas from the Structure Mapping Engine (SME) and Latent Relational Analysis (LRA) in order to remove the requirement for hand-coded representations.
Abstract: Many AI researchers and cognitive scientists have argued that analogy is the core of cognition. The most influential work on computational modeling of analogy-making is Structure Mapping Theory (SMT) and its implementation in the Structure Mapping Engine (SME). A limitation of SME is the requirement for complex hand-coded representations. We introduce the Latent Relation Mapping Engine (LRME), which combines ideas from SME and Latent Relational Analysis (LRA) in order to remove the requirement for hand-coded representations. LRME builds analogical mappings between lists of words, using a large corpus of raw text to automatically discover the semantic relations among the words. We evaluate LRME on a set of twenty analogical mapping problems, ten based on scientific analogies and ten based on common metaphors. LRME achieves human-level performance on the twenty problems. We compare LRME with a variety of alternative approaches and find that they are not able to reach the same level of performance.

Journal ArticleDOI
TL;DR: This article uses naive mean field approximation and belief propagation to approximate the optimal control in systems with linear dynamics, compares the approximate inference methods with the exact solution, and shows that they can accurately compute the optimal control.
Abstract: In this article we consider the issue of optimal control in collaborative multi-agent systems with stochastic dynamics. The agents have a joint task in which they have to reach a number of target states. The dynamics of the agents contains additive control and additive noise, and the autonomous part factorizes over the agents. Full observation of the global state is assumed. The goal is to minimize the accumulated joint cost, which consists of integrated instantaneous costs and a joint end cost. The joint end cost expresses the joint task of the agents. The instantaneous costs are quadratic in the control and factorize over the agents. The optimal control is given as a weighted linear combination of single-agent to single-target controls. The single-agent to single-target controls are expressed in terms of diffusion processes. These controls, when not closed form expressions, are formulated in terms of path integrals, which are calculated approximately by Metropolis-Hastings sampling. The weights in the control are interpreted as marginals of a joint distribution over agent to target assignments. The structure of the latter is represented by a graphical model, and the marginals are obtained by graphical model inference. Exact inference of the graphical model will break down in large systems, and so approximate inference methods are needed. We use naive mean field approximation and belief propagation to approximate the optimal control in systems with linear dynamics. We compare the approximate inference methods with the exact solution, and we show that they can accurately compute the optimal control. Finally, we demonstrate the control method in multi-agent systems with nonlinear dynamics consisting of up to 80 agents that have to reach an equal number of target states.

Journal ArticleDOI
TL;DR: A new MARL algorithm called the Weighted Policy Learner (WPL), which allows agents to reach a Nash Equilibrium (NE) in benchmark 2-player-2-action games with minimum knowledge and outperforms the state-of-the-art algorithms in a more realistic setting of 100 agents interacting and learning concurrently.
Abstract: Several multiagent reinforcement learning (MARL) algorithms have been proposed to optimize agents' decisions. Due to the complexity of the problem, the majority of the previously developed MARL algorithms assumed agents either had some knowledge of the underlying game (such as Nash equilibria) and/or observed other agents' actions and the rewards they received. We introduce a new MARL algorithm called the Weighted Policy Learner (WPL), which allows agents to reach a Nash Equilibrium (NE) in benchmark 2-player-2-action games with minimum knowledge. Using WPL, the only feedback an agent needs is its own local reward (the agent does not observe other agents' actions or rewards). Furthermore, WPL does not assume that agents know the underlying game or the corresponding Nash Equilibrium a priori. We experimentally show that our algorithm converges in benchmark two-player-two-action games. We also show that our algorithm converges in the challenging Shapley's game where previous MARL algorithms failed to converge without knowing the underlying game or the NE. Furthermore, we show that WPL outperforms the state-of-the-art algorithms in a more realistic setting of 100 agents interacting and learning concurrently. An important aspect of understanding the behavior of a MARL algorithm is analyzing the dynamics of the algorithm: how the policies of multiple learning agents evolve over time as agents interact with one another. Such an analysis not only verifies whether agents using a given MARL algorithm will eventually converge, but also reveals the behavior of the MARL algorithm prior to convergence. We analyze our algorithm in two-player-two-action games and show that symbolically proving WPL's convergence is difficult, because of the non-linear nature of WPL's dynamics, unlike previous MARL algorithms that had either linear or piece-wise-linear dynamics. Instead, we numerically solve WPL's dynamics differential equations and compare the solution to the dynamics of previous MARL algorithms.
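A rough single-agent view of a WPL-style update is sketched below: probability mass moves toward actions whose estimated reward beats the current policy value, with the step scaled by (1 - pi(a)) when increasing and by pi(a) when decreasing, so updates slow near the simplex boundary. The learning-rate schedule, value estimation, and projection step are simplified assumptions here, not the paper's exact formulation.

def wpl_update(pi, reward_estimates, eta=0.01):
    # pi: current policy over actions; reward_estimates: the agent's own reward estimates.
    value = sum(p * r for p, r in zip(pi, reward_estimates))  # expected reward of current policy
    new_pi = []
    for p, r in zip(pi, reward_estimates):
        delta = r - value
        weight = (1 - p) if delta > 0 else p   # weighting that slows updates near 0 and 1
        new_pi.append(p + eta * delta * weight)
    # crude projection back onto the probability simplex
    new_pi = [max(q, 1e-6) for q in new_pi]
    total = sum(new_pi)
    return [q / total for q in new_pi]

pi = [0.5, 0.5]
for _ in range(200):
    pi = wpl_update(pi, reward_estimates=[1.0, 0.0])  # only the agent's own rewards are used
print(pi)  # probability mass shifts toward the first action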

Journal ArticleDOI
TL;DR: This work presents the first tractable, exact solution for the problem of identifying actions' effects in partially observable STRIPS domains, and it is shown that these results have been applied to an autonomous agent in a virtual world, facilitating decision making, diagnosis, and exploration.
Abstract: We present exact algorithms for identifying deterministic actions' effects and preconditions in dynamic partially observable domains. They apply when one does not know the action model (the way actions affect the world) of a domain and must learn it from partial observations over time. Such scenarios are common in real world applications. They are challenging for AI tasks because traditional domain structures that underlie tractability (e.g., conditional independence) fail there (e.g., world features become correlated). Our work departs from traditional assumptions about partial observations and action models. In particular, it focuses on problems in which actions are deterministic and of simple logical structure and observation models have all features observed with some frequency. We obtain tractable algorithms for the modified problem for such domains. Our algorithms take sequences of partial observations over time as input, and output deterministic action models that could have led to those observations. The algorithms output all or one of those models (depending on our choice), and are exact in that no model is misclassified given the observations. Our algorithms take polynomial time in the number of time steps and state features for some traditional action classes examined in the AI-planning literature, e.g., STRIPS actions. In contrast, traditional approaches for HMMs and Reinforcement Learning are inexact and exponentially intractable for such domains. Our experiments verify the theoretical tractability guarantees, and show that we identify action models exactly. Several applications in planning, autonomous exploration, and adventure-game playing already use these results. They are also promising for probabilistic settings, partially observable reinforcement learning, and diagnosis.

Journal ArticleDOI
TL;DR: This paper presents Networks of Influence Diagrams (NID), a compact, natural and highly expressive language for reasoning about agents' beliefs and decision-making processes that makes an explicit distinction between agents' optimal strategies, and how they actually behave in reality.
Abstract: This paper presents Networks of Influence Diagrams (NID), a compact, natural and highly expressive language for reasoning about agents' beliefs and decision-making processes. NIDs are graphical structures in which agents' mental models are represented as nodes in a network; a mental model for an agent may itself use descriptions of the mental models of other agents. NIDs are demonstrated by examples, showing how they can be used to describe conflicting and cyclic belief structures, and certain forms of bounded rationality. In an opponent modeling domain, NIDs were able to outperform other computational agents whose strategies were not known in advance. NIDs are equivalent in representation to Bayesian games but they are more compact and structured than this formalism. In particular, the equilibrium definition for NIDs makes an explicit distinction between agents' optimal strategies, and how they actually behave in reality.

Journal ArticleDOI
TL;DR: In this article, the authors propose a base learner VR-Tree which generates trees with variable-randomness, which is able to span from the conventional deterministic trees to the complete-random trees using a probabilistic parameter.
Abstract: In this paper, we show that a continuous spectrum of randomisation exists, in which most existing tree randomisations are only operating around the two ends of the spectrum. That leaves a huge part of the spectrum largely unexplored. We propose a base learner VR-Tree which generates trees with variable-randomness. VR-Trees are able to span from the conventional deterministic trees to the complete-random trees using a probabilistic parameter. Using VR-Trees as the base models, we explore the entire spectrum of randomised ensembles, together with Bagging and Random Subspace. We discover that the two halves of the spectrum have their distinct characteristics; and the understanding of which allows us to propose a new approach in building better decision tree ensembles. We name this approach Coalescence, which coalesces a number of points in the random-half of the spectrum. Coalescence acts as a committee of "experts" to cater for unforeseeable conditions presented in training data. Coalescence is found to perform better than any single operating point in the spectrum, without the need to tune to a specific level of randomness. In our empirical study, Coalescence ranks top among the benchmarking ensemble methods including Random Forests, Random Subspace and C5 Boosting; and only Coalescence is significantly better than Bagging and Max-Diverse Ensemble among all the methods in the comparison. Although Coalescence is not significantly better than Random Forests, we have identified conditions under which one will perform better than the other.
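The variable-randomness idea at a single node can be sketched as: with probability alpha, take the best split by information gain (the deterministic end of the spectrum); otherwise take a random split (the complete-random end). The gain measure, candidate generation, and tiny dataset below are standard illustrative choices, not necessarily the paper's exact ones.

import math, random

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in
                (labels.count(l) for l in set(labels)) if c)

def choose_split(rows, labels, alpha, rng=random):
    # Candidate splits: threshold each feature at each observed value.
    candidates = [(f, row[f]) for row in rows for f in range(len(row))]
    def gain(split):
        f, v = split
        left = [l for r, l in zip(rows, labels) if r[f] <= v]
        right = [l for r, l in zip(rows, labels) if r[f] > v]
        if not left or not right:
            return -1.0
        n = len(labels)
        return entropy(labels) - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
    if rng.random() < alpha:
        return max(candidates, key=gain)   # deterministic, gain-maximising split
    return rng.choice(candidates)          # completely random split

rows = [(1.0, 3.2), (2.0, 1.1), (0.5, 2.7), (3.0, 0.4)]
labels = ["a", "a", "b", "b"]
print(choose_split(rows, labels, alpha=0.7))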

Journal ArticleDOI
TL;DR: In this paper, a general definition of additive abstractions that can be applied to any state space and prove that heuristics based on additive abstraction are consistent as well as admissible is presented.
Abstract: Informally, a set of abstractions of a state space S is additive if the distance between any two states in S is always greater than or equal to the sum of the corresponding distances in the abstract spaces. The first known additive abstractions, called disjoint pattern databases, were experimentally demonstrated to produce state of the art performance on certain state spaces. However, previous applications were restricted to state spaces with special properties, which precludes disjoint pattern databases from being defined for several commonly used testbeds, such as Rubik's Cube, TopSpin and the Pancake puzzle. In this paper we give a general definition of additive abstractions that can be applied to any state space and prove that heuristics based on additive abstractions are consistent as well as admissible. We use this new definition to create additive abstractions for these testbeds and show experimentally that well chosen additive abstractions can reduce search time substantially for the (18,4)-TopSpin puzzle and by three orders of magnitude over state of the art methods for the 17-Pancake puzzle. We also derive a way of testing if the heuristic value returned by additive abstractions is provably too low and show that the use of this test can reduce search time for the 15-puzzle and TopSpin by roughly a factor of two.
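The additivity condition translates directly into code: given a set of abstraction mappings and exact distances in each abstract space, the heuristic for a state is the sum of the abstract distances. The toy projections and precomputed distance tables below are made up for illustration; building and checking valid additive abstractions is what the paper addresses.

def additive_heuristic(state, abstractions, abstract_distance):
    # abstractions: list of projection functions phi_i; abstract_distance[i] maps an
    # abstract state to its exact distance-to-goal in abstract space i.
    return sum(abstract_distance[i][phi(state)] for i, phi in enumerate(abstractions))

# Example: a state is a 4-tuple; abstraction 0 keeps components (0, 1), abstraction 1 keeps (2, 3).
abstractions = [lambda s: s[:2], lambda s: s[2:]]
abstract_distance = [
    {(0, 1): 0, (1, 0): 2},   # precomputed distances in abstract space 0 (made up)
    {(2, 3): 0, (3, 2): 4},   # precomputed distances in abstract space 1 (made up)
]
print(additive_heuristic((1, 0, 3, 2), abstractions, abstract_distance))  # 2 + 4 = 6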

Journal ArticleDOI
TL;DR: This article defines and axiomatically characterize several decision rules tailored for the joint handling of positive and negative arguments in an ordinal setting, and shows that the simplest rules can be viewed as extensions of the maximin and maximax criteria to the bipolar case, and consequently suffer from poor decisive power.
Abstract: Making a decision is often a matter of listing and comparing positive and negative arguments. In such cases, the evaluation scale for decisions should be considered bipolar, that is, negative and positive values should be explicitly distinguished. That is what is done, for example, in Cumulative Prospect Theory. However, contrary to the latter framework that presupposes genuine numerical assessments, human agents often decide on the basis of an ordinal ranking of the pros and the cons, and by focusing on the most salient arguments. In other terms, the decision process is qualitative as well as bipolar. In this article, based on a bipolar extension of possibility theory, we define and axiomatically characterize several decision rules tailored for the joint handling of positive and negative arguments in an ordinal setting. The simplest rules can be viewed as extensions of the maximin and maximax criteria to the bipolar case, and consequently suffer from poor decisive power. More decisive rules that refine the former are also proposed. These refinements agree both with principles of efficiency and with the spirit of order-of-magnitude reasoning, that prevails in qualitative decision theory. The most refined decision rule uses leximin rankings of the pros and the cons, and the ideas of counting arguments of equal strength and cancelling pros by cons. It is shown to come down to a special case of Cumulative Prospect Theory, and to subsume the "Take the Best" heuristic studied by cognitive psychologists.

Journal ArticleDOI
TL;DR: This work extends the MDP model to incorporate: 1) simultaneous action execution, 2) durative actions, and 3) stochastic durations, and develops several algorithms to combat the computational explosion introduced by these features.
Abstract: Probabilistic planning problems are typically modeled as a Markov Decision Process (MDP). MDPs, while an otherwise expressive model, allow only for sequential, non-durative actions. This poses severe restrictions in modeling and solving a real world planning problem. We extend the MDP model to incorporate: 1) simultaneous action execution, 2) durative actions, and 3) stochastic durations. We develop several algorithms to combat the computational explosion introduced by these features. The key theoretical ideas used in building these algorithms are: modeling a complex problem as an MDP in extended state/action space, pruning of irrelevant actions, sampling of relevant actions, using informed heuristics to guide the search, hybridizing different planners to achieve benefits of both, approximating the problem and replanning. Our empirical evaluation illuminates the different merits in using various algorithms, viz., optimality, empirical closeness to optimality, theoretical error bounds, and speed.

Journal ArticleDOI
TL;DR: The paper develops First Order Decision Diagrams (FODD), a new compact representation for functions over relational structures, together with a set of operators to combine FODDs, and novel reduction techniques to keep the representation small.
Abstract: Markov decision processes capture sequential decision making under uncertainty, where an agent must choose actions so as to optimize long term reward. The paper studies efficient reasoning mechanisms for Relational Markov Decision Processes (RMDP) where world states have an internal relational structure that can be naturally described in terms of objects and relations among them. Two contributions are presented. First, the paper develops First Order Decision Diagrams (FODD), a new compact representation for functions over relational structures, together with a set of operators to combine FODDs, and novel reduction techniques to keep the representation small. Second, the paper shows how FODDs can be used to develop solutions for RMDPs, where reasoning is performed at the abstract level and the resulting optimal policy is independent of domain size (number of objects) or instantiation. In particular, a variant of the value iteration algorithm is developed by using special operations over FODDs, and the algorithm is shown to converge to the optimal policy.

Journal ArticleDOI
TL;DR: In this paper, the authors present an extensive axiomatic study of ranking systems and show that there is no general social ranking rule that satisfies both transitivity and ranked independence of irrelevant alternatives.
Abstract: Reasoning about agent preferences on a set of alternatives, and the aggregation of such preferences into some social ranking is a fundamental issue in reasoning about multi-agent systems. When the set of agents and the set of alternatives coincide, we get the ranking systems setting. A famous type of ranking systems are page ranking systems in the context of search engines. In this paper we present an extensive axiomatic study of ranking systems. In particular, we consider two fundamental axioms: Transitivity, and Ranked Independence of Irrelevant Alternatives. Surprisingly, we find that there is no general social ranking rule that satisfies both requirements. Furthermore, we show that our impossibility result holds under various restrictions on the class of ranking problems considered. However, when transitivity is weakened, an interesting possibility result is obtained. In addition, we show a complete axiomatization of approval voting using ranked IIA.

Journal ArticleDOI
TL;DR: Empirical evidence shows that the proposed novel method for approximate inference in Bayesian networks can achieve good approximation accuracy at low online computational cost.
Abstract: We propose a novel method for approximate inference in Bayesian networks (BNs). The idea is to sample data from a BN, learn a latent tree model (LTM) from the data offline, and when online, make inference with the LTM instead of the original BN. Because LTMs are tree-structured, inference takes linear time. In the meantime, they can represent complex relationship among leaf nodes and hence the approximation accuracy is often good. Empirical evidence shows that our method can achieve good approximation accuracy at low online computational cost.

Journal ArticleDOI
TL;DR: A polynomial-time algorithm that uses macros to generate plans for the class 3S of planning problems with binary state variables and acyclic causal graphs implies that plan generation may be tractable even when a planning problem has an exponentially long minimal solution.
Abstract: We present three new complexity results for classes of planning problems with simple causal graphs. First, we describe a polynomial-time algorithm that uses macros to generate plans for the class 3S of planning problems with binary state variables and acyclic causal graphs. This implies that plan generation may be tractable even when a planning problem has an exponentially long minimal solution. We also prove that the problem of plan existence for planning problems with multi-valued variables and chain causal graphs is NP-hard. Finally, we show that plan existence for planning problems with binary state variables and polytree causal graphs is NP-complete.

Journal ArticleDOI
TL;DR: This work begins the study of the mediation and enforcement values, providing several general results on the value of correlation as captured by these concepts, and presents a set of results for the more specialized case of congestion games, a class of games that received a lot of attention in the recent literature.
Abstract: Correlated equilibrium generalizes Nash equilibrium to allow correlation devices. Correlated equilibrium captures the idea that in many systems there exists a trusted administrator who can recommend behavior to a set of agents, but cannot enforce such behavior. This makes this solution concept most appropriate to the study of multi-agent systems in AI. Aumann showed an example of a game, and of a correlated equilibrium in this game in which the agents' welfare (expected sum of players' utilities) is greater than their welfare in all mixed-strategy equilibria. Following the idea initiated by the price of anarchy literature, this suggests the study of two major measures for the value of correlation in a game with nonnegative payoffs: 1. The ratio of the maximal welfare obtained in a correlated equilibrium to the maximal welfare obtained in a mixed-strategy equilibrium. We refer to this ratio as the mediation value. 2. The ratio of the maximal welfare to the maximal welfare obtained in a correlated equilibrium. We refer to this ratio as the enforcement value. In this work we initiate the study of the mediation and enforcement values, providing several general results on the value of correlation as captured by these concepts. We also present a set of results for the more specialized case of congestion games, a class of games that received a lot of attention in the recent literature.
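Written out in one notation (our own, for readability): with W denoting expected social welfare (the expected sum of players' utilities, assumed nonnegative), NE(G) the mixed-strategy equilibria and CE(G) the correlated equilibria of a game G, the two ratios described above are

MV(G) = \frac{\sup_{\sigma \in CE(G)} W(\sigma)}{\sup_{\sigma \in NE(G)} W(\sigma)},
\qquad
EV(G) = \frac{\max_{s} W(s)}{\sup_{\sigma \in CE(G)} W(\sigma)},

where the maximum in EV(G) ranges over all strategy profiles, i.e. the best welfare achievable at all.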

Journal ArticleDOI
TL;DR: This paper defines primitive update operations on the models of Computation Tree Logic (CTL) and formalize the principle of minimal change for CTL model update, and describes a formal algorithm that implements this approach.
Abstract: Model checking is a promising technology, which has been applied for verification of many hardware and software systems. In this paper, we introduce the concept of model update towards the development of an automatic system modification tool that extends model checking functions. We define primitive update operations on the models of Computation Tree Logic (CTL) and formalize the principle of minimal change for CTL model update. These primitive update operations, together with the underlying minimal change principle, serve as the foundation for CTL model update. Essential semantic and computational characterizations are provided for our CTL model update approach. We then describe a formal algorithm that implements this approach. We also illustrate two case studies of CTL model updates for the well-known microwave oven example and the Andrew File System 1, from which we further propose a method to optimize the update results in complex system modifications.

Journal ArticleDOI
TL;DR: This paper extends classic and modern real-time search algorithms with an automated mechanism for dynamic depth and subgoal selection, achieving nearly a three-fold improvement in suboptimality over the existing state-of-the-art algorithms and a 15-fold improvement in the amount of planning per action.
Abstract: Real-time heuristic search is a challenging type of agent-centered search because the agent's planning time per action is bounded by a constant independent of problem size. A common problem that imposes such restrictions is pathfinding in modern computer games where a large number of units must plan their paths simultaneously over large maps. Common search algorithms (e.g., A*, IDA*, D*, ARA*, AD*) are inherently not real-time and may lose completeness when a constant bound is imposed on per-action planning time. Real-time search algorithms retain completeness but frequently produce unacceptably suboptimal solutions. In this paper, we extend classic and modern real-time search algorithms with an automated mechanism for dynamic depth and subgoal selection. The new algorithms remain real-time and complete. On large computer game maps, they find paths within 7% of optimal while on average expanding roughly a single state per action. This is nearly a three-fold improvement in suboptimality over the existing state-of-the-art algorithms and, at the same time, a 15-fold improvement in the amount of planning per action.
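The real-time constraint in miniature: each action is chosen after a bounded amount of lookahead, and the heuristic of the current state is raised to the best neighbour's f-value before moving. The sketch below is the classic LRTA*-style step such algorithms build on; the paper's dynamic depth and subgoal selection are not shown, and the tiny corridor example is an assumption for illustration.

def lrta_star(start, goal, neighbours, cost, h, max_steps=1000):
    state, path = start, [start]
    for _ in range(max_steps):
        if state == goal:
            return path
        # depth-1 lookahead: score every neighbour by edge cost plus heuristic
        scored = [(cost(state, n) + h.get(n, 0.0), n) for n in neighbours(state)]
        best_f, best_n = min(scored)
        h[state] = max(h.get(state, 0.0), best_f)  # learning step: raise h to avoid revisiting loops
        state = best_n
        path.append(state)
    return path

# Tiny 1-D corridor example (assumed for illustration).
grid_goal = 4
path = lrta_star(0, grid_goal,
                 neighbours=lambda s: [s - 1, s + 1] if s > 0 else [s + 1],
                 cost=lambda a, b: 1.0,
                 h={s: abs(grid_goal - s) for s in range(6)})
print(path)  # [0, 1, 2, 3, 4]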

Journal ArticleDOI
TL;DR: This work presents the AND/OR Multi-Valued Decision Diagram (AOMDD), which compiles a graphical model into a canonical form that supports polynomial (e.g., solution counting, belief updating) or constant time (e.g., equivalence of graphical models) queries.
Abstract: Inspired by the recently introduced framework of AND/OR search spaces for graphical models, we propose to augment Multi-Valued Decision Diagrams (MDD) with AND nodes, in order to capture function decomposition structure and to extend these compiled data structures to general weighted graphical models (e.g., probabilistic models). We present the AND/OR Multi-Valued Decision Diagram (AOMDD) which compiles a graphical model into a canonical form that supports polynomial (e.g., solution counting, belief updating) or constant time (e.g., equivalence of graphical models) queries. We provide two algorithms for compiling the AOMDD of a graphical model. The first is search-based, and works by applying reduction rules to the trace of the memory intensive AND/OR search algorithm. The second is inference-based and uses a Bucket Elimination schedule to combine the AOMDDs of the input functions via the APPLY operator. For both algorithms, the compilation time and the size of the AOMDD are, in the worst case, exponential in the treewidth of the graphical model, rather than pathwidth as is known for ordered binary decision diagrams (OBDDs). We introduce the concept of semantic treewidth, which helps explain why the size of a decision diagram is often much smaller than the worst case bound. We provide an experimental evaluation that demonstrates the potential of AOMDDs.