Journal Article

RONALD A. HOWARD “Dynamic Programming and Markov Processes,”

01 Feb 1961-Technometrics (Taylor & Francis Group)-Vol. 3, Iss: 1, pp 120-121
TL;DR: A brief review of Ronald A. Howard's book Dynamic Programming and Markov Processes, which introduced the policy-iteration method for solving Markov decision processes.
Abstract: (1961). RONALD A. HOWARD, "Dynamic Programming and Markov Processes." Technometrics: Vol. 3, No. 1, pp. 120–121.
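Howard's book is best known for the policy-iteration algorithm, which alternates exact policy evaluation with greedy policy improvement until the policy stops changing. The following is a minimal sketch for a small finite MDP; the tabular setup, problem sizes, and random transition model are illustrative assumptions, not taken from the book or the review.

```python
import numpy as np

# A minimal policy-iteration sketch for a finite MDP, assuming tabular
# transitions P[a, s, s'] and rewards R[s, a]; the toy sizes and random
# model below are illustrative, not from Howard's book.
n_states, n_actions, gamma = 4, 2, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))  # P[a, s, s']
R = rng.standard_normal((n_states, n_actions))                    # R[s, a]

policy = np.zeros(n_states, dtype=int)
while True:
    # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
    P_pi = P[policy, np.arange(n_states)]          # row s is P[policy[s], s, :]
    r_pi = R[np.arange(n_states), policy]
    v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
    # Policy improvement: act greedily with respect to v.
    q = R + gamma * np.einsum("asx,x->sa", P, v)   # q[s, a]
    new_policy = q.argmax(axis=1)
    if np.array_equal(new_policy, policy):
        break                                      # no change: policy optimal
    policy = new_policy

print("optimal policy:", policy, "values:", np.round(v, 3))
```

For a finite MDP this loop terminates in finitely many iterations, since each improvement step yields a strictly better policy until the optimum is reached.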
Citations
Journal Article
TL;DR: The software architecture of an autonomous, interactive tour-guide robot is presented, which integrates localization, mapping, collision avoidance, planning, and various modules concerned with user interaction and Web-based telepresence and enables robots to operate safely, reliably, and at high speeds in highly dynamic environments.
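As a rough illustration of the kind of modular integration this TL;DR describes, the sketch below wires stub localization, planning, and collision-avoidance modules into a single control step. Every class and interface here is hypothetical, not the cited robot's actual software architecture.

```python
# A minimal sketch of a modular robot control loop: localization,
# planning, and collision avoidance as interchangeable modules feeding
# one executive. All interfaces are illustrative stubs.
from dataclasses import dataclass

@dataclass
class Pose:
    x: float
    y: float

class Localizer:
    def estimate(self, odometry: Pose) -> Pose:
        return odometry                    # stub: trust odometry directly

class Planner:
    def next_waypoint(self, pose: Pose, goal: Pose) -> Pose:
        # Stub: step a fixed fraction of the way toward the goal.
        return Pose(pose.x + 0.1 * (goal.x - pose.x),
                    pose.y + 0.1 * (goal.y - pose.y))

class CollisionAvoider:
    def veto(self, waypoint: Pose) -> bool:
        return False                       # stub: no obstacles sensed

def control_step(loc, plan, avoid, odometry, goal):
    pose = loc.estimate(odometry)
    waypoint = plan.next_waypoint(pose, goal)
    return pose if avoid.veto(waypoint) else waypoint

print(control_step(Localizer(), Planner(), CollisionAvoider(),
                   Pose(0, 0), Pose(5, 5)))
```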

889 citations


Cites methods from "RONALD A. HOWARD “Dynamic Programmi..."

  • ...The minimum-cost path is computed using a modified version of value iteration, a popular dynamic programming algorithm [4, 61]:...

    [...]
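Value iteration, as used in the excerpt above, repeatedly applies a Bellman backup until the cost-to-go values stop changing; greedily descending the converged values from any start cell then yields a minimum-cost path. Below is a minimal sketch on a 4-connected grid; the grid dimensions, goal cell, and unit step cost are illustrative assumptions, not the cited robot's actual map.

```python
# A minimal value-iteration sketch for minimum-cost paths on a
# 4-connected grid; grid size, goal, and costs are illustrative.
import math

rows, cols, goal = 4, 5, (0, 4)
step_cost = 1.0
V = [[math.inf] * cols for _ in range(rows)]
V[goal[0]][goal[1]] = 0.0

changed = True
while changed:                       # sweep until no value improves
    changed = False
    for r in range(rows):
        for c in range(cols):
            if (r, c) == goal:
                continue
            best = min(
                V[nr][nc] + step_cost
                for nr, nc in ((r-1, c), (r+1, c), (r, c-1), (r, c+1))
                if 0 <= nr < rows and 0 <= nc < cols
            )
            if best < V[r][c]:       # Bellman backup for cost-to-go
                V[r][c] = best
                changed = True

print(V[3][0])                       # cost-to-go from the bottom-left cell
```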

Book
01 Jan 2007
TL;DR: The workload model that is the basis of traditional analysis of the single queue becomes a foundation for workload relaxations used in the treatment of complex networks. Lyapunov functions and dynamic programming equations lead to the celebrated MaxWeight policy.
Abstract: Power grids, flexible manufacturing, cellular communications: interconnectedness has consequences. This remarkable book gives the tools and philosophy you need to build network models detailed enough to capture essential dynamics but simple enough to expose the structure of effective control solutions and to clarify analysis. Core chapters assume only exposure to stochastic processes and linear algebra at the undergraduate level; later chapters are for advanced graduate students and researchers/practitioners. This gradual development bridges classical theory with the state-of-the-art. The workload model that is the basis of traditional analysis of the single queue becomes a foundation for workload relaxations used in the treatment of complex networks. Lyapunov functions and dynamic programming equations lead to the celebrated MaxWeight policy along with many generalizations. Other topics include methods for synthesizing hedging and safety stocks, stability theory for networks, and techniques for accelerated simulation. Examples and figures throughout make ideas concrete. Solutions to end-of-chapter exercises available on a companion website.
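The MaxWeight policy highlighted above chooses, in each time slot, the feasible service vector that maximizes the sum of queue lengths weighted by their offered service rates. A minimal simulation sketch follows; the queue count, arrival probabilities, and feasible service vectors are illustrative assumptions, not an example from the book.

```python
# A minimal MaxWeight sketch: each slot, pick the service vector that
# maximizes sum_i queue_i * rate_i. Queues, arrival rates, and feasible
# service vectors below are illustrative assumptions.
import random

service_vectors = [(2, 0, 0), (0, 2, 0), (0, 0, 2), (1, 1, 0)]
arrival_prob = (0.4, 0.3, 0.3)       # Bernoulli arrivals per queue per slot
queues = [0, 0, 0]
random.seed(0)

for t in range(10_000):
    # MaxWeight decision: maximize the queue-length-weighted service rate.
    mu = max(service_vectors,
             key=lambda v: sum(q * r for q, r in zip(queues, v)))
    for i in range(3):
        served = min(queues[i], mu[i])
        arrived = 1 if random.random() < arrival_prob[i] else 0
        queues[i] = queues[i] - served + arrived

print("queue lengths after 10k slots:", queues)
```

Because the arrival-rate vector lies strictly inside the convex hull of the service vectors, this policy keeps the queues stable without knowing the arrival rates, which is the property the book's Lyapunov analysis establishes.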

555 citations

Journal Article
TL;DR: It is proposed that the probabilistic approach to robotics scales better to complex real-world applications than approaches that ignore a robot's uncertainty.
Abstract: This article describes a methodology for programming robots known as probabilistic robotics. The probabilistic paradigm pays tribute to the inherent uncertainty in robot perception, relying on explicit representations of uncertainty when determining what to do. This article surveys some of the progress in the field, using in-depth examples to illustrate some of the nuts and bolts of the basic approach. My central conjecture is that the probabilistic approach to robotics scales better to complex real-world applications than approaches that ignore a robot's uncertainty.
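The explicit representation of uncertainty that this paradigm relies on is typically a belief distribution maintained by a Bayes filter, which alternates a motion-model prediction with a measurement-model correction. Below is a minimal discrete sketch on a ring of cells; the motion and sensor models and all probabilities are illustrative assumptions, not from the article.

```python
# A minimal discrete Bayes filter sketch: a belief over a ring of cells
# is predicted through a noisy motion model, then corrected by a noisy
# measurement. All models and probabilities are illustrative.
n = 10
belief = [1.0 / n] * n               # uniform prior over cell positions

def predict(belief):
    """Move one cell right, but undershoot/overshoot with prob 0.1 each."""
    return [0.8 * belief[(i - 1) % n]
            + 0.1 * belief[i]
            + 0.1 * belief[(i - 2) % n]
            for i in range(n)]

def correct(belief, measured_cell):
    """Sensor reports the true cell with prob 0.7, a neighbor with 0.15."""
    likelihood = [0.7 if i == measured_cell
                  else 0.15 if abs(i - measured_cell) % n in (1, n - 1)
                  else 0.01
                  for i in range(n)]
    posterior = [l * b for l, b in zip(likelihood, belief)]
    total = sum(posterior)
    return [p / total for p in posterior]

belief = predict(belief)
belief = correct(belief, measured_cell=3)
print(max(range(n), key=lambda i: belief[i]))   # most likely cell: 3
```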

496 citations


Cites background from "RONALD A. HOWARD “Dynamic Programmi..."

  • ...The most prominent approach to calculating it is value iteration [1, 41], a version of dynamic programming for computing the expected cumulative cost of belief states that has become highly popular in the field of reinforcement learning [47, 85]....

    [...]

Proceedings Article
01 Jan 1993
TL;DR: The paper presents the DG-learning algorithm, which learns efficiently to achieve dynamically changing goals and exhibits good knowledge transfer between goals; experiments demonstrate the superiority of DG-learning over Q learning in a moderately large, synthetic, non-deterministic domain.
Abstract: Temporal difference methods solve the temporal credit assignment problem for reinforcement learning. An important subproblem of general reinforcement learning is learning to achieve dynamic goals. Although existing temporal difference methods, such as Q learning, can be applied to this problem, they do not take advantage of its special structure. This paper presents the DG-learning algorithm, which learns efficiently to achieve dynamically changing goals and exhibits good knowledge transfer between goals. In addition, this paper shows how traditional relaxation techniques can be applied to the problem. Finally, experimental results are given that demonstrate the superiority of DG learning over Q learning in a moderately large, synthetic, non-deterministic domain.
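Q learning, the baseline DG-learning is compared against, updates a table of action values toward a bootstrapped temporal-difference target after every transition. A minimal tabular sketch follows; the chain environment and hyperparameters are illustrative assumptions, not the paper's experimental domain.

```python
# A minimal tabular Q-learning sketch on a toy chain: states 0..4,
# actions left/right, reward 1 on reaching the right end. Environment
# and hyperparameters are illustrative assumptions.
import random

n_states, actions = 5, (-1, +1)
alpha, gamma, epsilon = 0.5, 0.9, 0.1
Q = [[0.0, 0.0] for _ in range(n_states)]
random.seed(0)

for episode in range(500):
    s = 0
    while s != n_states - 1:
        # Epsilon-greedy action selection.
        a = random.randrange(2) if random.random() < epsilon \
            else max((0, 1), key=lambda i: Q[s][i])
        s_next = min(max(s + actions[a], 0), n_states - 1)
        reward = 1.0 if s_next == n_states - 1 else 0.0
        # Temporal-difference update toward the bootstrapped target.
        target = reward + gamma * max(Q[s_next])
        Q[s][a] += alpha * (target - Q[s][a])
        s = s_next

print([round(max(q), 2) for q in Q])   # learned state values along the chain
```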

340 citations

Book
01 May 1998
TL;DR: This chapter surveys basic methods for learning maps and for high-speed autonomous navigation with indoor mobile robots, aimed at researchers and engineers who attempt to build reliable mobile robot navigation software.
Abstract: This chapter surveys basic methods for learning maps and high speed autonomous navigation for indoor mobile robots. The methods have been developed in our lab over the past few years, and most of them have been tested thoroughly in various indoor environments. The chapter is targeted towards researchers and engineers who attempt to build reliable mobile robot navigation software.

271 citations


Cites methods from "RONALD A. HOWARD “Dynamic Programmi..."

  • ...The minimum-cost path is computed using a modified version of value iteration, a popular dynamic programming algorithm [2, 24]:...

    [...]