
Showing papers in "Journal of Artificial Intelligence Research in 2001"


Journal ArticleDOI
TL;DR: A novel search strategy is introduced that combines hill-climbing with systematic search, and it is shown how other powerful heuristic information can be extracted and used to prune the search space.
Abstract: We describe and evaluate the algorithmic techniques that are used in the FF planning system. Like the HSP system, FF relies on forward state space search, using a heuristic that estimates goal distances by ignoring delete lists. Unlike HSP's heuristic, our method does not assume facts to be independent. We introduce a novel search strategy that combines hill-climbing with systematic search, and we show how other powerful heuristic information can be extracted and used to prune the search space. FF was the most successful automatic planner at the recent AIPS-2000 planning competition. We review the results of the competition, give data for other benchmark domains, and investigate the reasons for the runtime performance of FF compared to HSP.
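
The goal-distance estimate that FF (like HSP) builds on is the delete relaxation: ignore delete lists and count how quickly goal facts become reachable. Below is a minimal sketch of that idea, using an additive level-counting variant rather than FF's actual relaxed-plan extraction; the tuple-based STRIPS action format is an assumption of this example.

```python
# Sketch of a delete-relaxation goal-distance estimate (HSP/FF-style idea).
# Actions are (name, preconditions, add_list); delete lists are ignored.
# This is an illustrative level-counting heuristic, not FF's relaxed-plan extraction.

def relaxed_goal_distance(state, goals, actions):
    level = {f: 0 for f in state}      # first relaxed level at which each fact appears
    depth = 0
    while not all(g in level for g in goals):
        depth += 1
        new_facts = set()
        for _name, pre, add in actions:
            if all(p in level for p in pre):
                new_facts |= {f for f in add if f not in level}
        if not new_facts:
            return float("inf")        # some goal is unreachable even in the relaxation
        for f in new_facts:
            level[f] = depth
    return sum(level[g] for g in goals)

# Tiny example: two relaxed steps are needed to reach the goal.
actions = [("pick", {"hand-empty"}, {"holding"}),
           ("stack", {"holding"}, {"on-b"})]
print(relaxed_goal_distance({"hand-empty"}, {"on-b"}, actions))  # -> 2
```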

1,994 citations


Journal ArticleDOI
TL;DR: In this article, a simulation-based algorithm for generating a biased estimate of the gradient of the average reward in Partially Observable Markov Decision Processes (POMDPs) controlled by parameterized stochastic policies is proposed.
Abstract: Gradient-based approaches to direct policy search in reinforcement learning have received much recent attention as a means to solve problems of partial observability and to avoid some of the problems associated with policy degradation in value-function methods. In this paper we introduce GPOMDP, a simulation-based algorithm for generating a biased estimate of the gradient of the average reward in Partially Observable Markov Decision Processes (POMDPs) controlled by parameterized stochastic policies. A similar algorithm was proposed by Kimura, Yamamura, and Kobayashi (1995). The algorithm's chief advantages are that it requires storage of only twice the number of policy parameters, uses one free parameter β ∈ [0, 1] (which has a natural interpretation in terms of bias-variance trade-off), and requires no knowledge of the underlying state. We prove convergence of GPOMDP, and show how the correct choice of the parameter β is related to the mixing time of the controlled POMDP. We briefly describe extensions of GPOMDP to controlled Markov chains, continuous state, observation and control spaces, multiple agents, higher-order derivatives, and a version for training stochastic policies with internal states. In a companion paper (Baxter, Bartlett, & Weaver, 2001) we show how the gradient estimates generated by GPOMDP can be used in both a traditional stochastic gradient algorithm and a conjugate-gradient procedure to find local optima of the average reward.
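
The heart of GPOMDP is a single pass along a trajectory that maintains a β-discounted eligibility trace of policy score functions and averages reward-weighted traces; storing the trace and the running average is what keeps memory at twice the number of policy parameters. A minimal sketch under assumed interfaces (the `env`, `policy_sample`, and `grad_log_policy` callables are hypothetical stand-ins, not the paper's notation):

```python
import numpy as np

def gpomdp_gradient_estimate(env, policy_sample, grad_log_policy, theta, beta, T):
    """One-pass GPOMDP-style estimate of the average-reward gradient.

    Assumed (hypothetical) interfaces:
      env.reset() -> observation
      env.step(action) -> (observation, reward)
      policy_sample(theta, obs) -> action            (stochastic policy)
      grad_log_policy(theta, obs, action) -> ndarray shaped like theta
    """
    z = np.zeros_like(theta)        # eligibility trace, discounted by beta in [0, 1]
    delta = np.zeros_like(theta)    # running average of reward-weighted traces
    obs = env.reset()
    for t in range(T):
        action = policy_sample(theta, obs)
        score = grad_log_policy(theta, obs, action)
        obs, reward = env.step(action)
        z = beta * z + score
        delta += (reward * z - delta) / (t + 1)
    return delta                    # biased estimate; bias shrinks as beta -> 1
```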

587 citations


Journal ArticleDOI
TL;DR: This paper presents a fuzzy extension of ALC, combining Zadeh's fuzzy logic with a classical DL, where concepts become fuzzy and, thus, reasoning about imprecise concepts is supported.
Abstract: Description Logics (DLs) are suitable, well-known, logics for managing structured knowledge. They allow reasoning about individuals and well-defined concepts, i.e. sets of individuals with common properties. Experience in using DLs in applications has shown that in many cases we would like to extend their capabilities. In particular, their use in the context of Multimedia Information Retrieval (MIR) leads to the conviction that such DLs should allow the treatment of the inherent imprecision in multimedia object content representation and retrieval. In this paper we present a fuzzy extension of ALC, combining Zadeh's fuzzy logic with a classical DL. In particular, concepts become fuzzy and, thus, reasoning about imprecise concepts is supported. We define its syntax and semantics, describe its properties, and present a constraint propagation calculus for reasoning in it.
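
Under the Zadeh-style semantics used here, concept membership is a degree in [0, 1]: conjunction is min, disjunction is max, negation is 1 − x, and the value and existential restrictions become inf/sup over role successors. A small evaluator over a finite fuzzy interpretation illustrates this; the tuple/dictionary encoding of concepts and interpretations is only an assumption of the sketch.

```python
# Evaluate fuzzy-ALC concepts over a finite fuzzy interpretation (Zadeh semantics).
# Concepts: ("atom", A), ("not", C), ("and", C, D), ("or", C, D),
#           ("some", R, C) = exists R.C, ("all", R, C) = forall R.C.
# I["concepts"][A][x] and I["roles"][R][(x, y)] are membership degrees in [0, 1].

def degree(concept, x, I, domain):
    kind = concept[0]
    if kind == "atom":
        return I["concepts"][concept[1]].get(x, 0.0)
    if kind == "not":
        return 1.0 - degree(concept[1], x, I, domain)
    if kind == "and":
        return min(degree(concept[1], x, I, domain), degree(concept[2], x, I, domain))
    if kind == "or":
        return max(degree(concept[1], x, I, domain), degree(concept[2], x, I, domain))
    if kind == "some":   # sup_y min(R(x, y), C(y))
        _, R, C = concept
        return max((min(I["roles"][R].get((x, y), 0.0), degree(C, y, I, domain))
                    for y in domain), default=0.0)
    if kind == "all":    # inf_y max(1 - R(x, y), C(y))
        _, R, C = concept
        return min((max(1.0 - I["roles"][R].get((x, y), 0.0), degree(C, y, I, domain))
                    for y in domain), default=1.0)
    raise ValueError(f"unknown constructor {kind!r}")
```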

548 citations


Journal ArticleDOI
TL;DR: A logical/mathematical framework for statistical parameter learning of parameterized logic programs, i.e. definite clause programs containing probabilistic facts with a parameterized distribution, and a new EM algorithm that can significantly outperform the Inside-Outside algorithm.
Abstract: We propose a logical/mathematical framework for statistical parameter learning of parameterized logic programs, i.e. definite clause programs containing probabilistic facts with a parameterized distribution. It extends the traditional least Herbrand model semantics in logic programming to distribution semantics, a possible-world semantics with a probability distribution that is unconditionally applicable to arbitrary logic programs, including ones for HMMs, PCFGs and Bayesian networks. We also propose a new EM algorithm, the graphical EM algorithm, that runs for a class of parameterized logic programs representing sequential decision processes where each decision is exclusive and independent. It runs on a new data structure called support graphs, which describe the logical relationship between observations and their explanations, and learns parameters by computing inside and outside probabilities generalized for logic programs. The complexity analysis shows that, when combined with OLDT search for all explanations for observations, the graphical EM algorithm, despite its generality, has the same time complexity as existing EM algorithms, i.e. the Baum-Welch algorithm for HMMs, the Inside-Outside algorithm for PCFGs, and the one for singly connected Bayesian networks, which have been developed independently in each research field. Learning experiments with PCFGs using two corpora of moderate size indicate that the graphical EM algorithm can significantly outperform the Inside-Outside algorithm.
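
Stripped of the support-graph machinery, the learning problem is EM over explanations: each observation has a set of mutually exclusive explanations, each explanation is a conjunction of probabilistic facts (switch outcomes), the E-step distributes each observation over its explanations in proportion to their probabilities, and the M-step renormalises the switch parameters. The sketch below enumerates explanations explicitly, which is precisely what the graphical EM algorithm avoids by sharing structure in support graphs; the interfaces are assumptions for illustration.

```python
from collections import defaultdict
from math import prod

def em_probabilistic_facts(observations, explanations_of, switch_values, iters=50):
    """Naive EM for exclusive explanations built from probabilistic facts.

    explanations_of(obs) -> list of explanations, each a list of (switch, value)
    pairs; switch_values maps each switch to its list of possible values.
    Explanations are enumerated explicitly (no support-graph sharing).
    """
    theta = {s: {v: 1.0 / len(vals) for v in vals} for s, vals in switch_values.items()}
    for _ in range(iters):
        counts = defaultdict(float)
        for obs in observations:
            expls = explanations_of(obs)
            probs = [prod(theta[s][v] for s, v in e) for e in expls]
            total = sum(probs)
            if total == 0.0:
                continue
            for e, p in zip(expls, probs):         # E-step: split obs over explanations
                for s, v in e:
                    counts[(s, v)] += p / total
        for s, vals in switch_values.items():      # M-step: renormalise each switch
            norm = sum(counts[(s, v)] for v in vals)
            if norm > 0.0:
                theta[s] = {v: counts[(s, v)] / norm for v in vals}
    return theta
```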

287 citations


Journal ArticleDOI
TL;DR: An implemented system recognizes the occurrence of events described by simple spatial-motion verbs in short image sequences; using force dynamics and event logic to specify the lexical semantics of events makes the system more robust than prior systems based on motion profile.
Abstract: This paper presents an implemented system for recognizing the occurrence of events described by simple spatial-motion verbs in short image sequences. The semantics of these verbs is specified with event-logic expressions that describe changes in the state of force-dynamic relations between the participants of the event. An efficient finite representation is introduced for the infinite sets of intervals that occur when describing liquid and semi-liquid events. Additionally, an efficient procedure using this representation is presented for inferring occurrences of compound events, described with event-logic expressions, from occurrences of primitive events. Using force dynamics and event logic to specify the lexical semantics of events allows the system to be more robust than prior systems based on motion profile.

272 citations


Journal ArticleDOI
TL;DR: GIB, the program being described, involves five separate technical advances: partition search, the practical application of Monte Carlo techniques to realistic problems, a focus on achievable sets to solve problems inherent in the Monte Carlo approach, an extension of alpha-beta pruning from total orders to arbitrary distributive lattices, and the use of squeaky wheel optimization to find approximately optimal solutions to cardplay problems.
Abstract: This paper investigates the problems arising in the construction of a program to play the game of contract bridge. These problems include both the difficulty of solving the game's perfect information variant, and techniques needed to address the fact that bridge is not, in fact, a perfect information game. GIB, the program being described, involves five separate technical advances: partition search, the practical application of Monte Carlo techniques to realistic problems, a focus on achievable sets to solve problems inherent in the Monte Carlo approach, an extension of alpha-beta pruning from total orders to arbitrary distributive lattices, and the use of squeaky wheel optimization to find approximately optimal solutions to cardplay problems. GIB is currently believed to be of approximately expert caliber, and is currently the strongest computer bridge program in the world.
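
The Monte Carlo component of the cardplay can be read as: sample complete deals consistent with everything observed so far, solve each sample as a perfect-information (double-dummy) problem, and play the card with the best average outcome. A hedged sketch of that loop follows; `sample_consistent_deal` and `perfect_info_value` are hypothetical helpers standing in for GIB's actual machinery, and the sketch ignores the achievable-sets refinement mentioned above.

```python
from collections import defaultdict

def monte_carlo_card_choice(legal_cards, observations, n_samples,
                            sample_consistent_deal, perfect_info_value):
    """Choose the card with the best average value over sampled deals.

    Assumed (hypothetical) helpers:
      sample_consistent_deal(observations) -> a full deal consistent with play so far
      perfect_info_value(deal, card) -> value of playing `card` with all hands visible
    """
    totals = defaultdict(float)
    for _ in range(n_samples):
        deal = sample_consistent_deal(observations)
        for card in legal_cards:
            totals[card] += perfect_info_value(deal, card)
    return max(legal_cards, key=lambda card: totals[card] / n_samples)
```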

184 citations


Journal ArticleDOI
TL;DR: This paper proposes a method for accelerating the convergence of value iteration, a well-known algorithm for finding optimal policies for POMDPs; the method has been evaluated on an array of benchmark problems and was found to be very effective.
Abstract: Partially observable Markov decision processes (POMDPs) have recently become popular among many AI researchers because they serve as a natural model for planning under uncertainty. Value iteration is a well-known algorithm for finding optimal policies for POMDPs. It typically takes a large number of iterations to converge. This paper proposes a method for accelerating the convergence of value iteration. The method has been evaluated on an array of benchmark problems and was found to be very effective: It enabled value iteration to converge after only a few iterations on all the test problems.
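
For reference, value iteration is the dynamic-programming loop that repeatedly applies the Bellman backup until the value function stops changing; the paper's method aims to reach that fixed point in far fewer sweeps. The sketch below is the plain tabular, fully observable version (the POMDP case applies the analogous backup over belief states), not the acceleration technique itself.

```python
def value_iteration(states, actions, P, R, gamma, tol=1e-6):
    """Plain tabular value iteration (the unaccelerated baseline).

    P[s][a] -> list of (next_state, probability); R[s][a] -> immediate reward;
    gamma in (0, 1) is the discount factor.
    """
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                       for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V
```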

152 citations


Journal ArticleDOI
TL;DR: The experiments show that adapting the algorithms used for qualitative temporal reasoning can solve large RCC-8 instances, even if they are in the phase transition region, provided that one uses the maximal tractable subsets of RCC-8 that have been identified by us.
Abstract: The theoretical properties of qualitative spatial reasoning in the RCC-8 framework have been analyzed extensively. However, no empirical investigation has been made yet. Our experiments show that adapting the algorithms used for qualitative temporal reasoning can solve large RCC-8 instances, even if they are in the phase transition region, provided that one uses the maximal tractable subsets of RCC-8 that we have identified. In particular, we demonstrate that the orthogonal combination of heuristic methods is successful in solving almost all apparently hard instances in the phase transition region up to a certain size in reasonable time.
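
The reasoning workhorse behind such experiments is path consistency over a qualitative calculus: every edge carries a set of base relations, and the algorithm keeps intersecting R(i, j) with the composition R(i, k) ∘ R(k, j) until nothing changes or some relation becomes empty. A generic sketch follows; the RCC-8 composition table and converse mapping are assumed to be supplied.

```python
def compose_sets(S1, S2, composition_table):
    """Compose two sets of base relations via a pairwise composition table."""
    out = set()
    for r1 in S1:
        for r2 in S2:
            out |= composition_table[(r1, r2)]
    return out

def path_consistency(n, constraints, composition_table, converse):
    """Enforce path consistency on an n-node qualitative constraint network.

    constraints[(i, j)]: set of base relations (both directions present and
    converse-consistent); converse[r]: the converse base relation.
    Refines `constraints` in place; returns False on inconsistency.
    """
    queue = {(i, j) for i in range(n) for j in range(n) if i != j}
    while queue:
        i, j = queue.pop()
        for k in range(n):
            if k == i or k == j:
                continue
            for a, c, b in ((i, j, k), (k, i, j)):
                # refine R(a, b) with the composition R(a, c) o R(c, b)
                refined = constraints[(a, b)] & compose_sets(
                    constraints[(a, c)], constraints[(c, b)], composition_table)
                if refined != constraints[(a, b)]:
                    if not refined:
                        return False                   # inconsistent network
                    constraints[(a, b)] = refined
                    constraints[(b, a)] = {converse[r] for r in refined}
                    queue.add((a, b))
    return True
```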

152 citations


Journal ArticleDOI
TL;DR: This paper focuses on the local consistencies that are stronger than arc consistency, without changing the structure of the network, i.e., only removing inconsistent values from the domains.
Abstract: Enforcing local consistencies is one of the main features of constraint reasoning. Which level of local consistency should be used when searching for solutions in a constraint network is a basic question. Arc consistency and partial forms of arc consistency have been widely studied, and have been known for some time through the forward checking or the MAC search algorithms. Until recently, stronger forms of local consistency remained limited to those that change the structure of the constraint graph, and thus could not be used in practice, especially on large networks. This paper focuses on the local consistencies that are stronger than arc consistency without changing the structure of the network, i.e., only removing inconsistent values from the domains. In the last five years, several such local consistencies have been proposed by us or by others. We give an overview of all of them and highlight some relations between them. We compare them both theoretically and experimentally, considering their pruning efficiency and the time required to enforce them.
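
Arc consistency, the baseline all of these stronger consistencies are measured against, removes every value that has no support on some binary constraint. The classic AC-3 formulation, as a sketch:

```python
from collections import deque

def ac3(variables, domains, neighbors, consistent):
    """AC-3 for binary CSPs.

    neighbors[x]: variables constrained with x;
    consistent(x, vx, y, vy): does (vx, vy) satisfy the constraint between x and y?
    Prunes `domains` in place; returns False if some domain is wiped out.
    """
    queue = deque((x, y) for x in variables for y in neighbors[x])
    while queue:
        x, y = queue.popleft()
        # remove values of x that have no support in the domain of y
        removed = [vx for vx in domains[x]
                   if not any(consistent(x, vx, y, vy) for vy in domains[y])]
        if removed:
            domains[x] = [vx for vx in domains[x] if vx not in removed]
            if not domains[x]:
                return False
            for z in neighbors[x]:
                if z != y:
                    queue.append((z, x))       # revisit arcs pointing at x
    return True
```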

138 citations


Journal ArticleDOI
TL;DR: In this article, the authors present algorithms that perform gradient ascent of the average reward in a partially observable Markov decision process (POMDP) using biased estimates of the performance gradient in POMDPs.
Abstract: In this paper, we present algorithms that perform gradient ascent of the average reward in a partially observable Markov decision process (POMDP). These algorithms are based on GPOMDP, an algorithm introduced in a companion paper (Baxter & Bartlett, 2001), which computes biased estimates of the performance gradient in POMDPs. The algorithm's chief advantages are that it uses only one free parameter β ∈ [0, 1], which has a natural interpretation in terms of bias-variance trade-off, it requires no knowledge of the underlying state, and it can be applied to infinite state, control and observation spaces. We show how the gradient estimates produced by GPOMDP can be used to perform gradient ascent, both with a traditional stochastic-gradient algorithm, and with an algorithm based on conjugate-gradients that utilizes gradient information to bracket maxima in line searches. Experimental results are presented illustrating both the theoretical results of Baxter and Bartlett (2001) on a toy problem, and practical aspects of the algorithms on a number of more realistic problems.

136 citations


Journal ArticleDOI
TL;DR: This work uses WHIRL - an information integration system - to implement different recommendation algorithms derived from information retrieval principles, and uses a novel autonomous procedure for gathering reviewer interest information from the Web.
Abstract: The growing need to manage and exploit the proliferation of online data sources is opening up new opportunities for bringing people closer to the resources they need. For instance, consider a recommendation service through which researchers can receive daily pointers to journal papers in their fields of interest. We survey some of the known approaches to the problem of technical paper recommendation and ask how they can be extended to deal with multiple information sources. More specifically, we focus on a variant of this problem - recommending conference paper submissions to reviewing committee members - which offers us a testbed to try different approaches. Using WHIRL - an information integration system - we are able to implement different recommendation algorithms derived from information retrieval principles. We also use a novel autonomous procedure for gathering reviewer interest information from the Web. We evaluate our approach and compare it to other methods using preference data provided by members of the AAAI-98 conference reviewing committee along with data about the actual submissions.
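
The information-retrieval core of such a recommender can be stated compactly: represent each reviewer's gathered interest text and each submission as weighted term vectors and rank submissions by cosine similarity. The sketch below shows that ranking step; the tokenisation and tf-idf weighting here are illustrative choices, not WHIRL's actual similarity machinery.

```python
import math
import re
from collections import Counter

def tfidf_vectors(documents):
    """documents: dict name -> text.  Returns dict name -> {term: tf-idf weight}."""
    tokens = {name: re.findall(r"[a-z]+", text.lower()) for name, text in documents.items()}
    df = Counter(term for toks in tokens.values() for term in set(toks))
    n_docs = len(documents)
    return {name: {t: tf * math.log(n_docs / df[t]) for t, tf in Counter(toks).items()}
            for name, toks in tokens.items()}

def cosine(u, v):
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_papers_for_reviewer(reviewer_text, papers):
    """papers: dict paper_id -> abstract text.  Returns ids, best match first."""
    vectors = tfidf_vectors({"__reviewer__": reviewer_text, **papers})
    query = vectors.pop("__reviewer__")
    return sorted(papers, key=lambda pid: cosine(query, vectors[pid]), reverse=True)
```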

Journal ArticleDOI
TL;DR: This work demonstrates that with simple modifications, the STRIPS action representation language can be used to represent interacting actions and develops a sound and complete partial-order planner for planning with concurrent interacting actions, POMP, that extends existing partial- order planners in a straightforward way.
Abstract: In order to generate plans for agents with multiple actuators, agent teams, or distributed controllers, we must be able to represent and plan using concurrent actions with interacting effects. This has historically been considered a challenging task requiring a temporal planner with the ability to reason explicitly about time. We show that with simple modifications, the STRIPS action representation language can be used to represent interacting actions. Moreover, algorithms for partial-order planning require only small modifications in order to be applied in such multiagent domains. We demonstrate this fact by developing a sound and complete partial-order planner for planning with concurrent interacting actions, POMP, that extends existing partial-order planners in a straightforward way. These results open the way to the use of partial-order planners for the centralized control of cooperative multiagent systems.
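
The representational point can be illustrated without a full planner: a joint action is a set of STRIPS actions executed concurrently, and in the simplest reading it is well defined only when no member deletes what another member requires or adds. The sketch below shows that applicability test and the combined state update; the tuple-based action format is an assumption of this example, and it omits the additional machinery the paper introduces to represent genuinely interacting concurrent effects.

```python
def joint_action_applicable(state, joint_action):
    """joint_action: list of STRIPS actions (name, preconds, adds, deletes), as sets.

    Applicable when every precondition holds in `state` and no member's delete
    list clashes with another member's preconditions or add list.
    """
    if not all(pre <= state for _, pre, _, _ in joint_action):
        return False
    for i, (_, _, _, dels_i) in enumerate(joint_action):
        for j, (_, pre_j, adds_j, _) in enumerate(joint_action):
            if i != j and (dels_i & pre_j or dels_i & adds_j):
                return False                      # conflicting concurrent effects
    return True

def apply_joint_action(state, joint_action):
    adds = set().union(*(a[2] for a in joint_action))
    dels = set().union(*(a[3] for a in joint_action))
    return (state - dels) | adds
```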

Journal ArticleDOI
TL;DR: This paper clarifies the different variants of the Reduced Error Pruning algorithm, brings new insight to its algorithmic properties, analyses the algorithm with fewer imposed assumptions than before, and includes the previously overlooked empty subtrees in the analysis.
Abstract: Top-down induction of decision trees has been observed to suffer from the inadequate functioning of the pruning phase. In particular, it is known that the size of the resulting tree grows linearly with the sample size, even though the accuracy of the tree does not improve. Reduced Error Pruning is an algorithm that has been used as a representative technique in attempts to explain the problems of decision tree learning. In this paper we present analyses of Reduced Error Pruning in three different settings. First we study the basic algorithmic properties of the method, properties that hold independent of the input decision tree and pruning examples. Then we examine a situation that intuitively should lead to the subtree under consideration being replaced by a leaf node, one in which the class label and attribute values of the pruning examples are independent of each other. This analysis is conducted under two different assumptions. The general analysis shows that the pruning probability of a node fitting pure noise is bounded by a function that decreases exponentially as the size of the tree grows. In a specific analysis we assume that the examples are distributed uniformly to the tree. This assumption lets us approximate the number of subtrees that are pruned because they do not receive any pruning examples. This paper clarifies the different variants of the Reduced Error Pruning algorithm, brings new insight to its algorithmic properties, analyses the algorithm with fewer imposed assumptions than before, and includes the previously overlooked empty subtrees in the analysis.
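
Reduced Error Pruning itself is short enough to state as code: route a separate pruning set down the tree, then bottom-up replace any subtree whose majority-class leaf makes no more errors on the pruning examples reaching it than the subtree does. A minimal sketch with a dictionary-based tree (the node format is an illustrative assumption):

```python
from collections import Counter

def rep_prune(node, examples):
    """Reduced Error Pruning on a tree of dicts.

    Leaf: {"label": c}.  Internal: {"attr": a, "children": {value: subtree}}.
    examples: list of (feature_dict, label) pruning examples reaching this node.
    Returns (pruned_subtree, errors_of_that_subtree_on_examples).
    """
    if "label" in node:
        return node, sum(1 for _, label in examples if label != node["label"])
    subtree_errors = 0
    for value, child in list(node["children"].items()):
        reaching = [(f, l) for f, l in examples if f.get(node["attr"]) == value]
        pruned_child, child_errors = rep_prune(child, reaching)
        node["children"][value] = pruned_child
        subtree_errors += child_errors       # examples matching no branch are ignored here
    labels = Counter(label for _, label in examples)
    majority = labels.most_common(1)[0][0] if labels else None
    leaf_errors = len(examples) - (max(labels.values()) if labels else 0)
    if leaf_errors <= subtree_errors:        # prune if the leaf is at least as accurate
        return {"label": majority}, leaf_errors
    return node, subtree_errors
```

Note that a subtree receiving no pruning examples (the empty-subtree case discussed in the paper) is always collapsed here, since both error counts are zero.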

Journal ArticleDOI
TL;DR: It is shown that for several variations of partially observable Markov decision processes, polynomial-time algorithms for finding control policies are unlikely to or simply don't have guarantees of finding policies within a constant factor or a constant summand of optimal.
Abstract: We show that for several variations of partially observable Markov decision processes, polynomial-time algorithms for finding control policies are unlikely to or simply don't have guarantees of finding policies within a constant factor or a constant summand of optimal. Here "unlikely" means "unless some complexity classes collapse," where the collapses considered are P = NP, P = PSPACE, or P = EXP. Until or unless these collapses are shown to hold, any control-policy designer must choose between such performance guarantees and efficient computation.


Journal ArticleDOI
TL;DR: This paper shows that there exists a "perfect" dynamic variable ordering such that CBJ becomes redundant, and empirically shows that adding CBJ to a backtracking algorithm that maintains generalized arc consistency (GAC), an algorithm that is referred to as GAC-CBJ, can still provide orders of magnitude speedups.
Abstract: In recent years, many improvements to backtracking algorithms for solving constraint satisfaction problems have been proposed. The techniques for improving backtracking algorithms can be conveniently classified as look-ahead schemes and look-back schemes. Unfortunately, look-ahead and look-back schemes are not entirely orthogonal, as it has been observed empirically that the enhancement of look-ahead techniques is sometimes counterproductive to the effects of look-back techniques. In this paper, we focus on the relationship between the two most important look-ahead techniques--using a variable ordering heuristic and maintaining a level of local consistency during the backtracking search--and the look-back technique of conflict-directed backjumping (CBJ). We show that there exists a "perfect" dynamic variable ordering such that CBJ becomes redundant. We also show theoretically that the higher the level of local consistency maintained in the backtracking search, the less of an improvement backjumping provides. Our theoretical results partially explain why a backtracking algorithm doing more in the look-ahead phase cannot benefit more from the backjumping look-back scheme. Finally, we show empirically that adding CBJ to a backtracking algorithm that maintains generalized arc consistency (GAC), an algorithm that we refer to as GAC-CBJ, can still provide orders of magnitude speedups. Our empirical results contrast with Bessiere and Regin's conclusion (1996) that CBJ is useless to an algorithm that maintains arc consistency.
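
Conflict-directed backjumping, the look-back scheme analysed here, records for each variable the set of earlier levels that caused its failures and, at a dead end, jumps straight back to the deepest of them instead of backtracking chronologically. A recursive sketch for binary CSPs, with no look-ahead (so not GAC-CBJ itself); the `consistent` predicate is an assumed interface:

```python
def cbj_search(order, domains, consistent, assignment=None, i=0):
    """Conflict-directed backjumping for binary CSPs (recursive sketch, no look-ahead).

    order: fixed list of variables; consistent(x, vx, y, vy): constraint check.
    Returns (solution_or_None, conflict_set_of_earlier_levels).
    """
    if assignment is None:
        assignment = {}
    if i == len(order):
        return dict(assignment), set()
    x = order[i]
    conf_set = set()                               # levels blamed for failures at x
    for v in domains[x]:
        culprit = None
        for j in range(i):                         # check back against past variables
            if not consistent(x, v, order[j], assignment[order[j]]):
                culprit = j
                break
        if culprit is not None:
            conf_set.add(culprit)
            continue
        assignment[x] = v
        solution, deeper = cbj_search(order, domains, consistent, assignment, i + 1)
        del assignment[x]
        if solution is not None:
            return solution, set()
        if i not in deeper:                        # this level is not to blame: jump over it
            return None, deeper
        conf_set |= deeper - {i}                   # absorb the deeper conflict set
    return None, conf_set
```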

Journal ArticleDOI
TL;DR: Planning by Rewriting (PbR), a new paradigm for efficient high-quality domain-independent planning that exploits declarative plan-rewriting rules and efficient local search techniques to transform an easy-to-generate, but possibly suboptimal, initial plan into a high- quality plan.
Abstract: Domain-independent planning is a hard combinatorial problem. Taking into account plan quality makes the task even more difficult. This article introduces Planning by Rewriting (PbR), a new paradigm for efficient high-quality domain-independent planning. PbR exploits declarative plan-rewriting rules and efficient local search techniques to transform an easy-to-generate, but possibly suboptimal, initial plan into a high-quality plan. In addition to addressing the issues of planning efficiency and plan quality, this framework offers a new anytime planning algorithm. We have implemented this planner and applied it to several existing domains. The experimental results show that the PbR approach provides significant savings in planning effort while generating high-quality plans.
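
Operationally, PbR is local search in the space of complete plans: generate candidates by applying rewrite rules to the current plan, move to a cheaper candidate, and keep the best plan seen so far, which is what makes the procedure anytime. A schematic sketch follows; the rule and cost interfaces are assumptions for illustration, and the real system uses richer rule matching and neighborhood strategies.

```python
def plan_by_rewriting(initial_plan, rewrite_rules, cost, max_steps=1000):
    """Hill-climbing over complete plans via plan-rewriting rules (schematic).

    rewrite_rules: iterable of functions plan -> list of candidate plans;
    cost: plan -> number (lower is better).  The best plan found so far is
    always available, so the search can be stopped at any time.
    """
    current = best = initial_plan
    for _ in range(max_steps):
        candidates = [p for rule in rewrite_rules for p in rule(current)]
        improving = [p for p in candidates if cost(p) < cost(current)]
        if not improving:
            break                                  # local optimum under the rule set
        current = min(improving, key=cost)
        if cost(current) < cost(best):
            best = current
    return best
```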

Journal ArticleDOI
TL;DR: This article describes ATTac-2000, the first-place finisher in TAC, which uses a principled bidding strategy that includes several elements of adaptivity, and presents isolated empirical results indicating the robustness and effectiveness of the adaptive strategy.
Abstract: The First Trading Agent Competition (TAC) was held from June 22nd to July 8th, 2000. TAC was designed to create a benchmark problem in the complex domain of e-marketplaces and to motivate researchers to apply unique approaches to a common task. This article describes ATTac-2000, the first-place finisher in TAC. ATTac-2000 uses a principled bidding strategy that includes several elements of adaptivity. In addition to the success at the competition, isolated empirical results are presented indicating the robustness and effectiveness of ATTac-2000's adaptive strategy.

Journal ArticleDOI
TL;DR: The paper presents the benefits of adopting opposite directions in the pre-processing and search phases, discusses some difficulties that arise in the pre-processing phase and introduces techniques to cope with them, and presents several methods of improving the efficiency of the heuristic.
Abstract: This paper presents GRT, a domain-independent heuristic planning system for STRIPS worlds. GRT solves problems in two phases. In the pre-processing phase, it estimates the distance between each fact and the goals of the problem, in a backward direction. Then, in the search phase, these estimates are used to further estimate the distance between each intermediate state and the goals, thus guiding the search process in a forward direction and on a best-first basis. The paper presents the benefits of adopting opposite directions in the pre-processing and search phases, discusses some difficulties that arise in the pre-processing phase and introduces techniques to cope with them. Moreover, it presents several methods of improving the efficiency of the heuristic, by enriching the representation and by reducing the size of the problem. Finally, a method of overcoming local optimal states, based on domain axioms, is proposed. According to it, difficult problems are decomposed into easier sub-problems that have to be solved sequentially. The performance results from various domains, including those of the recent planning competitions, show that GRT is among the fastest planners.

Journal ArticleDOI
Christopher A. Meek
TL;DR: The problem of learning an optimal path graphical model from data is considered and it is shown to be NP-hard for the maximum likelihood and minimum description length approaches and a Bayesian approach.
Abstract: I consider the problem of learning an optimal path graphical model from data and show the problem to be NP-hard for the maximum likelihood and minimum description length approaches and a Bayesian approach. This hardness result holds despite the fact that the problem is a restriction of the polynomially solvable problem of finding the optimal tree graphical model.

Journal ArticleDOI
TL;DR: A general notion of algebraic conditional plausibility measures is defined; probability measures, ranking functions, possibility measures, and sets of probability measures can all be viewed as special cases, and it is shown that algebraic conditional plausibility measures can be represented using Bayesian networks.
Abstract: A general notion of algebraic conditional plausibility measures is defined. Probability measures, ranking functions, possibility measures, and (under the appropriate definitions) sets of probability measures can all be viewed as defining algebraic conditional plausibility measures. It is shown that algebraic conditional plausibility measures can be represented using Bayesian networks.

Journal ArticleDOI
TL;DR: It is shown that although determining subsumption between concept descriptions has the same complexity (though requiring different algorithms), the story is different in the case of determining the least common subsumer (lcs); for attributes interpreted as partial functions, the lcs exists and can be computed relatively easily.
Abstract: Functional relationships between objects, called "attributes", are of considerable importance in knowledge representation languages, including Description Logics (DLs). A study of the literature indicates that papers have made, often implicitly, different assumptions about the nature of attributes: whether they are always required to have a value, or whether they can be partial functions. The work presented here is the first explicit study of this difference for subclasses of the CLASSIC DL, involving the same-as concept constructor. It is shown that although determining subsumption between concept descriptions has the same complexity (though requiring different algorithms), the story is different in the case of determining the least common subsumer (lcs). For attributes interpreted as partial functions, the lcs exists and can be computed relatively easily; even in this case our results correct and extend three previous papers about the lcs of DLs. In the case where attributes must have a value, the lcs may not exist, and even if it exists it may be of exponential size. Interestingly, it is possible to decide in polynomial time if the lcs exists.

Journal ArticleDOI
TL;DR: An algorithm for identifying noun-phrase antecedents of pronouns and adjectival anaphors in Spanish dialogues is presented; it is based on linguistic constraints and preferences and uses an anaphoric accessibility space within which the algorithm finds the antecedent noun phrase.
Abstract: This paper presents an algorithm for identifying noun-phrase antecedents of pronouns and adjectival anaphors in Spanish dialogues. We believe that anaphora resolution requires numerous sources of information in order to find the correct antecedent of the anaphor. These sources can be of different kinds, e.g., linguistic information, discourse/dialogue structure information, or topic information. For this reason, our algorithm uses various different kinds of information (hybrid information). The algorithm is based on linguistic constraints and preferences and uses an anaphoric accessibility space within which the algorithm finds the noun phrase. We present some experiments related to this algorithm and this space using a corpus of 204 dialogues. The algorithm is implemented in Prolog. According to this study, 95.9% of antecedents were located in the proposed space, a precision of 81.3% was obtained for pronominal anaphora resolution, and 81.5% for adjectival anaphora.

Journal ArticleDOI
TL;DR: The ability of two classes of algorithms to propagate and discover reachability and relevance constraints in classical planning problems is compared, shedding light on the ability of different plan-encoding schemes to propagate information forward and backward.
Abstract: In recent years, there has been a growing awareness of the importance of reachability and relevance-based pruning techniques for planning, but little work specifically targets these techniques. In this paper, we compare the ability of two classes of algorithms to propagate and discover reachability and relevance constraints in classical planning problems. The first class of algorithms operates on SAT-encoded planning problems obtained using the linear and GRAPHPLAN encoding schemes. It applies unit propagation and more general resolution steps (involving larger clauses) to these plan encodings. The second class operates at the plan level and contains two families of pruning algorithms: Reachable-k and Relevant-k. Reachable-k provides a coherent description of a number of existing forward pruning techniques used in numerous algorithms, while Relevant-k captures different grades of backward pruning. Our results shed light on the ability of different plan-encoding schemes to propagate information forward and backward and on the relative merit of plan-level and SAT-level pruning methods.

Journal ArticleDOI
TL;DR: This paper proposes mean-field approximations for a broad class of Belief networks, of which sigmoid and noisy-or networks can be seen as special cases, based on a powerful mean-field theory suggested by Plefka.
Abstract: The chief aim of this paper is to propose mean-field approximations for a broad class of Belief networks, of which sigmoid and noisy-or networks can be seen as special cases. The approximations are based on a powerful mean-field theory suggested by Plefka. We show that Saul, Jaakkola, and Jordan's approach is the first-order approximation in Plefka's theory, via a variational derivation. The application of Plefka's theory to belief networks is not computationally tractable. To tackle this problem we propose new approximations based on Taylor series. Small-scale experiments show that the proposed schemes are attractive.
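
As a concrete, simpler stand-in for the mean-field idea (not the paper's Plefka expansion for sigmoid and noisy-or belief networks), naive first-order mean field for a pairwise binary model reduces to a set of coupled fixed-point equations, each mean being a sigmoid of the expected input from its neighbours:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def naive_mean_field(W, b, iters=200, tol=1e-8):
    """Naive (first-order) mean field for a pairwise binary {0,1} model.

    W: symmetric coupling matrix with zero diagonal; b: bias vector.
    Iterates m <- sigmoid(W m + b) to a fixed point and returns the means
    m_i ~ q(x_i = 1).  Illustrative only: this is the undirected analogue,
    not the paper's Plefka-based approximation for belief networks.
    """
    m = np.full(len(b), 0.5)
    for _ in range(iters):
        new_m = sigmoid(W @ m + b)
        if np.max(np.abs(new_m - m)) < tol:
            return new_m
        m = new_m
    return m
```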
