
Showing papers on "Markov decision process published in 1974"




Journal ArticleDOI
TL;DR: In this paper, a method is introduced for transforming Markov models of dynamic programming while preserving the Markov property; the method applies to relatively general sets of states.
Abstract: If a set of states is given in a problem of dynamic programming in which each state can be observed only partially, the given model is generally transformed into a new model with completely observed states. In this article a method is introduced with which Markov models of dynamic programming can be transformed and which preserves the Markov property. The method applies to relatively general sets of states.
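
The standard device for such a transformation is to replace the partially observed state by a probability distribution (belief) over the hidden states, which is itself a completely observed Markov state. The sketch below illustrates that belief update for a small hypothetical model; the matrices P and O are illustrative placeholders, not data from the paper.

```python
import numpy as np

# Hypothetical 2-state, 2-observation model; values are illustrative only.
P = np.array([[0.7, 0.3],      # P[s, s'] = state transition probability
              [0.4, 0.6]])
O = np.array([[0.9, 0.1],      # O[s', o] = probability of observing o in state s'
              [0.2, 0.8]])

def belief_update(b, o):
    """One step of the completely observed surrogate model: propagate the
    belief b over hidden states, then condition on the observation o."""
    b_pred = b @ P                  # predict the next-state distribution
    b_post = b_pred * O[:, o]       # weight by the observation likelihood
    return b_post / b_post.sum()    # renormalize to a probability vector

b = np.array([0.5, 0.5])            # initial belief over the hidden states
print(belief_update(b, o=1))        # updated belief after observing o = 1
```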

89 citations


Journal ArticleDOI
TL;DR: An example is given which demonstrates that using a decision theory analysis for the basic chance-constrained model of stochastic linear programming may lead to an apparent dilemma, namely, 0 > EVSI > EVPI.
Abstract: An example is given which demonstrates that using a decision theory analysis for the basic chance-constrained model of stochastic linear programming may lead to an apparent dilemma, namely, 0 > EVSI > EVPI. The problem is discussed and a resolution suggested.

32 citations


Journal ArticleDOI
TL;DR: This paper examines real-time decision rules for a U.S. Air Force inventory system where items are repaired rather than “used up” and a heuristic rule is presented, justified theoretically by showing that the rule is optimal for a modified model.
Abstract: This paper examines real-time decision rules for a U.S. Air Force inventory system where items are repaired rather than “used up.” The problem is to decide which user in the system has the greatest need for the newly available inventory items coming out of repair. The system is modeled as a Markov decision process and a heuristic rule is presented. This rule, the Transportation Time Look Ahead policy, is justified theoretically by showing that the rule is optimal for a modified model. Thus we have a theoretical justification of a decision rule in a large-scale dynamic programming application.

29 citations


Journal ArticleDOI
TL;DR: For continuous time Markov decision chains of finite duration, this article showed that the vector of maximal total rewards, less a linear average-return term, converges as the duration $t \rightarrow \infty$.
Abstract: For continuous time Markov decision chains of finite duration, we show that the vector of maximal total rewards, less a linear average-return term, converges as the duration $t \rightarrow \infty$. We then show that there are policies which are both simultaneously $\varepsilon$-optimal for all durations $t$ and are stationary except possibly for a final, finite segment. Further, the length of this final segment depends on $\varepsilon$, but not on $t$ for large enough $t$, while the initial stationary part of the policy is independent of both $\varepsilon$ and $t$.

21 citations



Journal ArticleDOI
TL;DR: For countable state, finite action dynamic programming problems with bounded rewards, the authors give an example of a policy that is optimal under Blackwell's criterion (maximizing the expected discounted total return for all discount factors sufficiently close to 1) yet not optimal under Derman's average cost criterion.
Abstract: We consider countable state, finite action dynamic programming problems with bounded rewards. Under Blackwell's optimality criterion, a policy is optimal if it maximizes the expected discounted total return for all values of the discount factor sufficiently close to 1. We give an example where a policy meets that optimality criterion, but is not optimal with respect to Derman's average cost criterion. We also give conditions under which this pathology cannot occur.

18 citations


Journal ArticleDOI
TL;DR: A theory of the design process, based on an analogy with the well-known Markov process in probability theory, is developed and applied to the classic highway location problem first discussed by Alexander and Manheim (1962).
Abstract: A theory of the design process, based on an analogy with the well-known Markov process in probability theory, is developed and applied in this paper. Design is considered as a process of averaging a set of conflicting factors, and the sequential averaging characteristic of the Markov process is presented algebraically with an emphasis upon the weight of each factor in the final solution. A classification of Markov chains and an interpretation using linear graph theory serves to delimit the set of relevant design problems, and a particular group of such problems based on symmetric structures is specifically described. A second analogy between the choice of design method and the theory of Markov decision processes exists, and the problem of selecting an optimal method using this decision theory is solved using a dynamic programming algorithm due to Howard (1960). The theory is then applied to the classic highway location problem first discussed by Alexander and Manheim (1962), and some comparisons between the different results are attempted. Finally the place of the theory in the wider context of design is briefly alluded to.
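
The dynamic programming algorithm due to Howard (1960) referred to above is policy iteration, which alternates exact policy evaluation with greedy policy improvement. A minimal sketch for a generic discounted finite MDP follows; it illustrates Howard's general scheme, not the specific highway-location formulation, and the data layout (P, R, gamma) is an assumption of the sketch.

```python
import numpy as np

def policy_iteration(P, R, gamma=0.9):
    """Howard-style policy iteration for a discounted finite MDP.
    P[a] is the transition matrix and R[a] the expected reward vector
    under action a; gamma is the discount factor."""
    n_actions, n_states = len(P), P[0].shape[0]
    policy = np.zeros(n_states, dtype=int)
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
        P_pi = np.array([P[policy[s]][s] for s in range(n_states)])
        r_pi = np.array([R[policy[s]][s] for s in range(n_states)])
        v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
        # Policy improvement: greedy one-step look-ahead on v.
        Q = np.array([R[a] + gamma * P[a] @ v for a in range(n_actions)])
        new_policy = Q.argmax(axis=0)
        if np.array_equal(new_policy, policy):
            return policy, v        # a stable policy is optimal
        policy = new_policy
```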

18 citations



01 Jan 1974
TL;DR: Some new algorithms for solving discounted Markov decision problems are introduced, and it is shown how the several successive approximation (S.A.) algorithms may be combined.
Abstract: Successive Approximation (S.A.) methods, for solving discounted Markov decision problems, have been developed to avoid the extensive computations that are connected with linear programming and policy iteration techniques for solving large-scale problems. Several authors give such an S.A. algorithm. In this paper we introduce some new algorithms, and furthermore it will be shown how the several S.A. algorithms may be combined. For each algorithm, converging sequences of upper and lower bounds for the optimal value will be given.
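
A typical successive approximation scheme of this kind iterates the Bellman operator and extracts converging upper and lower bounds on the optimal value from the extrema of the successive differences (MacQueen-type bounds). The sketch below shows one generic variant under those assumptions; it is not any particular algorithm from the paper.

```python
import numpy as np

def successive_approximation(P, R, gamma=0.9, tol=1e-6):
    """Value iteration for a discounted finite MDP with converging
    upper and lower bounds on the optimal value (MacQueen-type bounds)."""
    n_actions, n_states = len(P), P[0].shape[0]
    v = np.zeros(n_states)
    while True:
        Q = np.array([R[a] + gamma * P[a] @ v for a in range(n_actions)])
        v_new = Q.max(axis=0)                              # one Bellman step
        diff = v_new - v
        lower = v_new + gamma * diff.min() / (1 - gamma)   # lower bound on v*
        upper = v_new + gamma * diff.max() / (1 - gamma)   # upper bound on v*
        if (upper - lower).max() < tol:
            return lower, upper, Q.argmax(axis=0)          # bounds and greedy policy
        v = v_new
```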

16 citations


Journal ArticleDOI
TL;DR: A stationary discrete dynamic programming model that is a generalization of the finite state and finite action Markov programming problem; the parameters of the problem are allowed to be random variables, and it is indicated when the expected values of these random variables are certainty equivalents.
Abstract: This paper considers a stationary discrete dynamic programming model that is a generalization of the finite state and finite action Markov programming problem. We specify conditions under which an optimal stationary linear decision rule exists and show how this optimal policy can be calculated using linear programming, policy iteration, or value iteration. In addition we allow the parameters of the problem to be random variables and indicate when the expected values of these random variables are certainty equivalents.
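
For the finite state, finite action special case, the linear programming route mentioned above can be written as minimizing a sum of state values subject to the Bellman inequalities. The sketch below shows that generic formulation with scipy; the data layout (P, R, gamma) is an assumption of the sketch, not the paper's generalized model.

```python
import numpy as np
from scipy.optimize import linprog

def solve_discounted_mdp_lp(P, R, gamma=0.9):
    """Linear program for a discounted finite MDP:
    minimize sum_s v(s) subject to v >= R[a] + gamma * P[a] @ v for every action a."""
    n_actions, n_states = len(P), P[0].shape[0]
    c = np.ones(n_states)                        # objective weights on the state values
    A_ub, b_ub = [], []
    for a in range(n_actions):
        # Rewrite v >= R[a] + gamma * P[a] v as -(I - gamma * P[a]) v <= -R[a].
        A_ub.append(-(np.eye(n_states) - gamma * P[a]))
        b_ub.append(-np.asarray(R[a]))
    res = linprog(c, A_ub=np.vstack(A_ub), b_ub=np.concatenate(b_ub),
                  bounds=[(None, None)] * n_states)
    return res.x                                 # the optimal value function
```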


Journal ArticleDOI
TL;DR: This work characterizes decision rules, called preferred, which may be used in the initially stationary part of nearly optimal policies, and then, under conditions involving state recurrence and accessibility, considers finding such rules.
Abstract: Motivated by a planning horizon result for continuous time Markov decision chains, we study decision rules, called preferred, which may be used in the initially stationary part of nearly optimal policies. We characterize these rules and then, under conditions involving state recurrence and accessibility, consider finding such rules. We also discuss the connection between preferred rules and certain discounted process decision rules, and the role of preferred rules in optimal policies.


Journal ArticleDOI
TL;DR: In this paper, the problem of computing optimal policies for Markov decision processes by iterative methods is considered; the policies computed are optimal for all sufficiently small interest rates.
Abstract: This paper considers the problem of computing, by iterative methods, optimal policies for Markov decision processes. The policies computed are optimal for all sufficiently small interest rates.

01 Jan 1974
TL;DR: This paper estimates the human capital associated with an organization.
Abstract: (1975). Estimating the Human Capital Associated with an Organization. Accounting and Business Research: Vol. 6, No. 21, pp. 48-56.

Book ChapterDOI
01 Jan 1974
TL;DR: In this article, a class of Markovian decision processes is characterized using a weak row sum criterion, which is shown to be necessary and sufficient for the absolute convergence of present values, and a modified successive value iteration procedure is obtained.
Abstract: A class of Markovian decision processes is characterized using a weak row sum criterion. The criterion is shown to be necessary and sufficient for the absolute convergence of present values. A modified successive value iteration procedure is obtained.

01 Jan 1974
TL;DR: This paper considers a completely ergodic Markov decision process with finite state and decision spaces using the average return per unit time criterion and derives an algorithm which approximates the optimal solution.
Abstract: In this paper we consider a completely ergodic Markov decision process with finite state and decision spaces using the average return per unit time criterion. An algorithm is derived which approximates the optimal solution. It will be shown that this algorithm is finite and supplies upper and lower bounds for the maximal average return and a near optimal policy with average return between these bounds.
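
For a completely ergodic finite MDP under the average return per unit time criterion, a standard finite approximation of this flavour is relative value iteration, in which the extrema of the successive differences bound the maximal average return and the greedy rule at termination is near optimal. The sketch below illustrates that generic scheme under the usual aperiodicity assumptions; it is not claimed to be the paper's exact algorithm.

```python
import numpy as np

def relative_value_iteration(P, R, tol=1e-6):
    """Approximate the maximal average return of an ergodic finite MDP.
    The extrema of the successive differences give lower and upper bounds
    on the optimal gain, and the greedy rule at termination is near optimal."""
    n_actions, n_states = len(P), P[0].shape[0]
    v = np.zeros(n_states)
    while True:
        Q = np.array([R[a] + P[a] @ v for a in range(n_actions)])
        v_new = Q.max(axis=0)
        diff = v_new - v
        lower, upper = diff.min(), diff.max()    # bounds on the maximal average return
        if upper - lower < tol:
            return lower, upper, Q.argmax(axis=0)
        v = v_new - v_new[0]                     # keep values relative to a reference state
```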


Journal ArticleDOI
TL;DR: In this paper, the question of whether there always exists an initially stationary optimal policy for continuous time Markov decision chains was raised, and the authors gave an example in which there is no such policy.
Abstract: Results in another paper in this issue may raise the question of whether for finite state and action, continuous time parameter Markov decision chains there always exists an initially stationary optimal policy. We give an example in which there is no such policy.

Journal ArticleDOI
TL;DR: In this article, an expression for the expected return for each available action is developed, as a perturbation to the basic process, and the optimal action and value of the forecast are obtained by combining a Policy Iteration solution of the imperfect information process and evaluation of the policies for the perturbed process.
Abstract: The Markov Decision Process formulation and its application to processes in which there is uncertainty as to process state, or imperfect state information, are reviewed. The question of how to determine an optimal action if some form of intelligence or “forecast” of the process state is available for a single process stage is posed. An expression for the expected return for each available action is developed, as a perturbation to the basic process. The optimal action and value of the forecast are obtained by combining a Policy Iteration solution of the imperfect information process and evaluation of the policies for the perturbed process based on these imperfect information process parameters.
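
Read generically, the calculation weights each action's return by a probability distribution over the unobserved state, once for the baseline estimate and once for the forecast, and values the forecast by the resulting improvement. The sketch below shows that generic computation with made-up numbers; the matrix Q and the particular definition of the forecast's value are assumptions of the sketch, not the paper's perturbation expressions.

```python
import numpy as np

def action_values_under_belief(belief, Q):
    """Expected return of each action when the unobserved state is described
    by the probability vector `belief`; Q[a, s] is the return of action a
    when the underlying state is s (hypothetical values below)."""
    return Q @ belief

def value_of_forecast(prior, forecast, Q):
    """Gain from acting on the forecast distribution rather than on the
    action that looks best under the prior, both evaluated under the forecast."""
    best_with_forecast = action_values_under_belief(forecast, Q).max()
    prior_action = action_values_under_belief(prior, Q).argmax()
    return best_with_forecast - Q[prior_action] @ forecast

Q = np.array([[5.0, 1.0],                # 2 actions x 2 states, illustrative returns
              [2.0, 4.0]])
print(value_of_forecast(prior=np.array([0.3, 0.7]),
                        forecast=np.array([0.9, 0.1]), Q=Q))   # prints 2.4
```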


Proceedings ArticleDOI
01 Jan 1974
TL;DR: A program for simulating Markov processes (MARKOV) intended for use by someone with programming experience, which establishes the initial state of the process, the present state, the previous state, and the number of transitions from each state to all states.
Abstract: This paper describes, in language-free flow diagram form, a program for simulating Markov processes (MARKOV) intended for use by someone with programming experience. The program establishes the initial state of the process (if it is unknown), the present state, the previous state, and the number of transitions from each state to all states. These parameters can be used to determine various characteristics of Markov processes of interest to the systems analyst.
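
A compact modern analogue of such a simulator draws successive states from a transition matrix while recording the previous state, the present state, and the transition counts. The sketch below illustrates this with an assumed two-state example chain; it does not reproduce the original MARKOV program's interface.

```python
import numpy as np

def simulate_markov(P, n_steps, initial_state=None, seed=None):
    """Simulate a finite Markov chain: track the previous state, the present
    state, and the number of transitions from each state to all states."""
    rng = np.random.default_rng(seed)
    n = P.shape[0]
    counts = np.zeros((n, n), dtype=int)
    state = initial_state if initial_state is not None else int(rng.integers(n))
    for _ in range(n_steps):
        previous, state = state, int(rng.choice(n, p=P[state]))
        counts[previous, state] += 1             # tally the observed transition
    return state, counts

P = np.array([[0.9, 0.1],                        # illustrative two-state chain
              [0.5, 0.5]])
final_state, counts = simulate_markov(P, n_steps=1000, initial_state=0, seed=42)
print(final_state, counts)
```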