
Showing papers on "Markov decision process published in 1968"



Journal ArticleDOI
TL;DR: In this article, two methods for computing optimal decision sequences and their cost functions in Markov renewal programs are presented: policy iteration and a linear programming formulation. A third solution technique applies to certain, but not all, of these programs, and new techniques for solving a broad class of shortest-route problems are obtained as a byproduct.
Abstract: Two methods are presented for computing optimal decision sequences and their cost functions. The first method, called 'policy iteration,' is an adaptation of an iterative scheme that is widely used for sequential decision problems. The second method is to specify a linear programming problem whose solution determines an optimal policy and its cost function. A third solution technique is shown to apply to certain, but not all, of these Markov renewal programs. As a byproduct of the development, new techniques are provided for solving a broad class of shortest-route problems. (Author)
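As a rough illustration of the policy-iteration scheme (shown here in the simpler discounted-MDP setting rather than the paper's Markov renewal programs; all transition data are invented), a minimal sketch in Python:

```python
import numpy as np

# Minimal policy iteration for a discounted MDP -- a sketch of the general
# iterative scheme; the paper's version targets Markov renewal programs.
# States, actions, rewards, and the discount factor are hypothetical.
n_states, n_actions, beta = 3, 2, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a] is a distribution over next states
R = rng.uniform(0, 1, size=(n_states, n_actions))                 # expected one-step returns

policy = np.zeros(n_states, dtype=int)
while True:
    # Policy evaluation: solve (I - beta * P_pi) v = r_pi exactly.
    P_pi = P[np.arange(n_states), policy]
    r_pi = R[np.arange(n_states), policy]
    v = np.linalg.solve(np.eye(n_states) - beta * P_pi, r_pi)
    # Policy improvement: act greedily with respect to v.
    Q = R + beta * P @ v
    new_policy = Q.argmax(axis=1)
    if np.array_equal(new_policy, policy):
        break  # a policy stable under improvement is optimal
    policy = new_policy
print("optimal policy:", policy, "values:", v)
```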

153 citations



Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of finding a policy that maximizes the expected return over a given planning horizon, where the return rate depends on both the policy and the sample path of the process.
Abstract: The system we consider may be in one of n states at any point in time and its probability law is a Markov process which depends on the policy (control) chosen. The return to the system over a given planning horizon is the integral (over that horizon) of a return rate which depends on both the policy and the sample path of the process. Our objective is to find a policy which maximizes the expected return over the given planning horizon. A necessary and sufficient condition for optimality is obtained, and a constructive proof is given that there is a piecewise constant policy which is optimal. A bound on the number of switches (points where the piecewise constant policy jumps) is obtained for the case where there are two states.
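Stated symbolically (the notation below is mine, not the authors'), the problem is:

```latex
% Objective: x(t) is the sample path of the n-state process under policy u
% over the planning horizon [0, T], and r is the policy- and path-dependent
% return rate. Notation assumed for illustration.
\max_{u} \; \mathbb{E}\!\left[ \int_{0}^{T} r\bigl(x(t), u(t)\bigr)\, dt \right]
```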

107 citations



Journal ArticleDOI
TL;DR: In this paper, a semi-Markovian decision process is defined, the concept of returns associated with this process is introduced, and the average return per unit time that the system earns in the steady state is obtained.
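For reference, the standard renewal-reward expression for the steady-state average return of a semi-Markov process (a sketch of the kind of quantity the paper derives; the notation is assumed, not the paper's):

```latex
% pi is the stationary distribution of the embedded chain, r_i the expected
% return accrued per visit to state i, and tau_i the expected holding time
% in state i.
g \;=\; \frac{\sum_{i} \pi_i \, r_i}{\sum_{i} \pi_i \, \tau_i}
```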

38 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider decision processes on a denumerable state space and show that, under the assumption that always choosing the same decision at each state yields an ergodic (i.e. positive recurrent) Markov chain, all possible decision procedures are, in an appropriate sense, uniformly ergodic.
Abstract: This paper considers decision processes on a denumerable state space. At each state a finite number of decisions is allowed. The main assumption is that if one always chooses the same decision at each state the resulting Markov chain is ergodic (i.e. positive recurrent). Under this assumption it is shown that all possible decision procedures are (in an appropriate sense) uniformly ergodic.

9 citations


Journal ArticleDOI
TL;DR: The linear programming solution to Markov chain models is presented and compared with the dynamic programming solution; the elements of the simplex tableau are shown to contain information relevant to understanding the programmed system.
Abstract: Some essential elements of the Markov chain theory are reviewed, along with programming of economic models which incorporate Markovian matrices and whose objective function is the maximization of the present value of an infinite stream of income. The linear programming solution to these models is presented and compared to the dynamic programming solution. Several properties of the solution are analyzed and it is shown that the elements of the simplex tableau contain information relevant to the understanding of the programmed system. It is also shown that the model can be extended to cover, among other elements, multiprocess enterprises and the realistic cases of programming in the face of probable deterioration of the productive capacity of the system or its total destruction.
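A minimal sketch of the textbook linear-programming formulation for a discounted Markovian decision model of this kind (all data are hypothetical, and scipy's linprog stands in for the simplex computation the paper discusses):

```python
import numpy as np
from scipy.optimize import linprog

# LP formulation of a discounted Markov decision model. Variables: v[s], the
# present value of state s. Constraints:
#   v[s] >= r[s, a] + beta * sum_s' P[s, a, s'] * v[s']   for every (s, a);
# minimizing sum_s v[s] makes each constraint tight at an optimal action.
n_states, n_actions, beta = 3, 2, 0.9
rng = np.random.default_rng(1)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.uniform(0, 1, size=(n_states, n_actions))

# Rewrite each constraint as A_ub @ v <= b_ub: (beta*P[s,a] - e_s) @ v <= -r[s,a].
A_ub, b_ub = [], []
for s in range(n_states):
    for a in range(n_actions):
        row = beta * P[s, a]
        row[s] -= 1.0
        A_ub.append(row)
        b_ub.append(-R[s, a])
res = linprog(c=np.ones(n_states), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * n_states)  # values may be any sign
v = res.x
# An optimal action at s is one whose constraint is (numerically) tight.
policy = np.argmax(R + beta * P @ v, axis=1)
print("present values:", v, "policy:", policy)
```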

8 citations



Journal ArticleDOI
TL;DR: It is shown that a two-stage stochastic program with recourse whose right-hand sides are random has optimal decision rules that are continuous and piecewise linear, but this result does not extend to programs with three or more stages.
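For context, the generic two-stage recourse model reads (notation assumed, not necessarily the paper's):

```latex
% First-stage decision x, recourse decision y, random right-hand side xi.
\min_{x \ge 0} \; c^{\top} x + \mathbb{E}_{\xi}\!\left[ Q(x, \xi) \right],
\qquad
Q(x, \xi) \;=\; \min_{y \ge 0} \left\{\, q^{\top} y \;:\; W y = \xi - T x \,\right\}
```

With only the right-hand side random, the recourse value Q(x, ξ) is piecewise linear and convex in ξ by standard parametric-LP arguments, which is consistent with the continuity and piecewise linearity of the optimal decision rules asserted above.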

8 citations


Journal ArticleDOI
TL;DR: A multistage decision problem is optimized using a new formulation of stochastic dynamic programming: one stage is modeled as a Markov decision process with an infinite number of substages, and it is shown how this process may be compressed and handled as a single stage in the larger problem.
Abstract: A multistage decision problem is optimized using a new formulation of stochastic dynamic programming. The problem optimized in this paper concerns a semiconductor production process where the transitions at each work station are stochastic. The mathematical model employs at one stage a Markov decision process with an infinite number of substages and shows how this process may be compressed and handled as one stage in the larger problem.
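A minimal sketch of the compression idea, under assumptions of mine (a discounted inner process solved by successive approximation, whose fixed-point values then serve as the one-stage return of the outer recursion; the paper's model of the semiconductor process may differ):

```python
import numpy as np

# "Compressing" an infinite-substage Markov decision process into one stage
# of a larger dynamic program: value-iterate the inner MDP to its fixed point,
# then let the outer recursion treat those values as a one-stage return.
def solve_inner_mdp(P, R, beta=0.95, tol=1e-10):
    """P[s, a, s'] are substage transition probabilities, R[s, a] substage returns."""
    v = np.zeros(P.shape[0])
    while True:
        v_new = (R + beta * P @ v).max(axis=1)  # Bellman operator
        if np.max(np.abs(v_new - v)) < tol:
            return v_new  # compressed one-stage value of each state
        v = v_new

# Hypothetical work-station model: 4 states, 3 controls per state.
rng = np.random.default_rng(2)
P = rng.dirichlet(np.ones(4), size=(4, 3))
R = rng.uniform(0, 1, size=(4, 3))
print("compressed stage values:", solve_inner_mdp(P, R))
# The outer problem then sees the whole inner process as a single stage:
# outer_value[s] = max_a ( outer_R[s, a]
#                          + sum_s' outer_P[s, a, s'] * inner_value[s'] + ... )
```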



Journal ArticleDOI
D. Detchmendy, R. Kalaba
TL;DR: A procedure for reducing the decision dimensionality in a dynamic programming calculation is presented for multi-stage decision processes in which the dimension of the decision vector is greater than the dimension of the state vector.
Abstract: A procedure for reducing the decision dimensionality in a dynamic programming calculation is presented for multi-stage decision processes for which the dimension of the decision vector is greater than the dimension of the state vector. This procedure facilitates the numerical and analytical investigations of this class of optimization problems.