Open Access Proceedings Article

Stochastic safest and shortest path problems

Florent Teichteil-Königsbuch
pp. 1825–1831
TLDR
This work introduces a more general and richer dual optimization criterion, which minimizes the average (undiscounted) cost of only those paths leading to the goal, among all policies that maximize the probability of reaching the goal.
Abstract
Optimal solutions to Stochastic Shortest Path Problems (SSPs) usually require that there exist at least one policy that reaches the goal with probability 1 from the initial state. This condition is very strong and rules out many interesting problems, for instance those where all possible policies reach some dead-end state with positive probability. We introduce a more general and richer dual optimization criterion, which minimizes the average (undiscounted) cost of only those paths leading to the goal, among all policies that maximize the probability of reaching the goal. We present policy update equations in the form of dynamic programming for this new dual criterion; they differ from the standard Bellman equations. We demonstrate that our equations converge over an infinite horizon without any condition on the structure of the problem or on its policies, which actually extends the class of SSPs that can be solved. We show experimentally that our dual criterion provides well-founded solutions to SSPs that cannot be solved by the standard criterion, and that using a discount factor with the latter does yield solution policies, but ones that are not optimal with respect to our well-founded criterion.
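
To make the criterion concrete for readers skimming this page, here is one plausible formalization in standard SSP notation. The symbols T, c, and the goal hitting time τ_G, as well as the exact form of the coupled updates, are our reading of the abstract, not equations reproduced from the paper:

```latex
% Dual criterion sketch (our notation, an assumption): SSP (S, A, T, c) with
% goal set G, initial state s_0, and goal hitting time \tau_G. First maximize
% the probability of reaching G, then minimize expected cost over
% goal-reaching paths only.
\[
  \Pi^{\ast} = \operatorname*{arg\,max}_{\pi}
      \Pr\nolimits^{\pi}\!\bigl(\tau_G < \infty \mid s_0\bigr),
  \qquad
  \pi^{\ast} \in \operatorname*{arg\,min}_{\pi \in \Pi^{\ast}}
      \mathbb{E}^{\pi}\!\Bigl[\,\textstyle\sum_{t=0}^{\tau_G - 1} c(s_t, a_t)
      \,\Big|\, \tau_G < \infty \Bigr].
\]
% A coupled dynamic-programming scheme of the kind the abstract alludes to
% (a sketch; the paper's exact updates may differ). P is the goal
% probability, V the goal-conditioned expected cost:
\begin{align*}
  P_{k+1}(s) &= \max_{a \in A} \sum_{s'} T(s, a, s')\, P_k(s'),\\
  \mathcal{A}_{k+1}(s) &= \operatorname*{arg\,max}_{a \in A}
      \sum_{s'} T(s, a, s')\, P_k(s'),\\
  V_{k+1}(s) &= \min_{a \in \mathcal{A}_{k+1}(s)} \frac{1}{P_{k+1}(s)}
      \sum_{s'} T(s, a, s')\, P_k(s') \bigl(c(s, a) + V_k(s')\bigr)
      \quad \text{if } P_{k+1}(s) > 0.
\end{align*}
```

The reweighting of successor values by P_k(s') is what would make such updates differ from the standard Bellman equations: expectations are taken only over trajectories that eventually reach the goal, so dead-end branches contribute nothing to V.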



Citations
Journal Article

A survey of multi-objective sequential decision-making

TL;DR: This article surveys algorithms designed for sequential decision-making problems with multiple objectives and proposes a taxonomy that classifies multi-objective methods according to the applicable scenario, the nature of the scalarization function, and the type of policies considered.
Book

Automated Planning and Acting

TL;DR: This book presents a comprehensive paradigm of planning and acting using the most recent and advanced automated-planning techniques, and explains the computational deliberation capabilities that allow an actor to reason about its actions, choose them, organize them purposefully, and act deliberately to achieve an objective.
Book

Planning with Markov Decision Processes: An AI Perspective

TL;DR: Markov Decision Processes (MDPs) are widely used in Artificial Intelligence for modeling sequential decision-making scenarios with probabilistic dynamics; they are the framework of choice when designing an intelligent agent that needs to act for long periods of time in an environment where its actions could have uncertain outcomes.
Proceedings Article

Simulated penetration testing: from Dijkstra to Turing Test++

TL;DR: By analyzing prior work in AI and other relevant areas, this paper derives a systematization of the simulated-pentesting model space, highlighting a multitude of interesting challenges for AI sequential decision-making research.
Proceedings Article

A theory of goal-oriented MDPs with dead ends

TL;DR: This paper presents value-iteration-based and heuristic-search algorithms for solving stochastic shortest path (SSP) MDPs with dead-end states, and conducts a preliminary empirical study comparing the performance of these algorithms on different MDP classes.
References
Book

Markov Decision Processes: Discrete Stochastic Dynamic Programming

TL;DR: Puterman provides a uniquely up-to-date, unified, and rigorous treatment of theoretical, computational, and applied research on Markov decision process models, focusing primarily on infinite-horizon discrete-time models and models with discrete state spaces, while also examining models with arbitrary state spaces, finite-horizon models, and continuous-time discrete-state models.

Book

Neuro-dynamic programming

TL;DR: This is the first textbook that fully explains the neuro-dynamic programming/reinforcement learning methodology, which is a recent breakthrough in the practical application of neural networks and dynamic programming to complex problems of planning, optimal decision making, and intelligent control.
Proceedings Article

Labeled RTDP: improving the convergence of real-time dynamic programming

TL;DR: This paper introduces a labeling scheme into RTDP that speeds up its convergence while retaining its good anytime behavior, and shows that Labeled RTDP (LRTDP) converges orders of magnitude faster than RTDP, and also faster than another recent heuristic-search DP algorithm, LAO*.
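
The labeling idea is compact enough to sketch. Below is a minimal, self-contained Python illustration on a toy SSP of our own making (the state space, costs, and helper names are hypothetical, not taken from the cited paper); full LRTDP interleaves this check with RTDP trials:

```python
# Illustrative sketch of LRTDP's labeling idea (our own toy example, not the
# authors' code): a state is labeled "solved" once every state reachable from
# it under the current greedy policy has a Bellman residual below EPS, so
# later trials can skip it.

EPS = 1e-4
GOAL = 3                      # absorbing goal state with zero cost
ACTIONS = {                   # hypothetical SSP: state -> {action: [(succ, prob), ...]}
    0: {'a': [(1, 0.5), (2, 0.5)], 'b': [(2, 1.0)]},
    1: {'a': [(3, 1.0)]},
    2: {'a': [(3, 0.9), (0, 0.1)]},
}
COST = 1.0                    # unit cost for every action outside the goal

def q_value(s, a, V):
    return COST + sum(p * V[s2] for s2, p in ACTIONS[s][a])

def greedy(s, V):
    return min(ACTIONS[s], key=lambda a: q_value(s, a, V))

def residual(s, V):
    return abs(V[s] - q_value(s, greedy(s, V), V))

def check_solved(s, V, solved):
    """Try to label s (and its greedy descendants) as solved."""
    ok, open_, closed = True, [s], set()
    while open_:
        s = open_.pop()
        closed.add(s)
        if s == GOAL or s in solved:
            continue
        if residual(s, V) > EPS:
            ok = False        # value still changing here: no label yet,
            continue          # and no need to search below this state
        for s2, p in ACTIONS[s][greedy(s, V)]:
            if p > 0 and s2 not in closed and s2 not in open_:
                open_.append(s2)
    if ok:
        solved |= closed      # the whole greedy envelope has converged
    else:
        for s in closed - {GOAL} - solved:
            V[s] = q_value(s, greedy(s, V), V)   # extra backup before retrying
    return ok

V = {s: 0.0 for s in (0, 1, 2, GOAL)}
solved = set()
while not check_solved(0, V, solved):   # repeated backups until labeled
    pass                                # (full LRTDP interleaves trials here)
print(V[0], solved)                     # V[0] converges to about 2.1
```

Note that a failed check still performs one extra backup per visited state, so labeling accelerates convergence without giving up RTDP's anytime character.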
Journal Article

The first probabilistic track of the international planning competition

TL;DR: The 2004 International Planning Competition, IPC-4, included a probabilistic planning track for the first time. This article describes the new domain-specification language created for the track, the evaluation methodology, the competition domains developed, and the results of the participating teams.