
Showing papers on "Markov decision process" published in 1973


Journal ArticleDOI
TL;DR: A batch service queue is considered in which each batch size and its time of service are subject to control; the policies minimizing the expected continuously discounted cost and the expected cost per unit time over an infinite horizon are shown to have a threshold form.
Abstract: A batch service queue is considered where each batch size and its time of service is subject to control. Costs are incurred for serving the customers and for holding them in the system. Viewing the system as a Markov decision process (i.e., dynamic program) with unbounded costs, we show that policies which minimize the expected continuously discounted cost and the expected cost per unit time over an infinite time horizon are of the form: at a review point when x customers are waiting, serve min {x, Q} customers (Q being the, possibly infinite, service capacity) if and only if x exceeds a certain optimal level M. Methods of computing M for both the discounted and average cost contexts are presented.
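As a rough illustration of how the optimal level M might be computed in the discounted case, the sketch below runs value iteration on a truncated version of such a queue; the arrival distribution, cost figures, discount factor, and truncation level are assumptions made for the example, not values from the paper.

```python
import math

# All parameters below are illustrative assumptions, not taken from the paper.
lam   = 2.0          # Poisson arrival rate per review period
h     = 1.0          # holding cost per waiting customer per period
K, c  = 5.0, 0.5     # fixed and per-customer service costs
Q     = 10           # service capacity per batch
beta  = 0.95         # discount factor
X_MAX = 200          # truncation of the (in principle unbounded) queue length

arrivals = [math.exp(-lam) * lam**k / math.factorial(k) for k in range(40)]
arrivals[-1] += 1.0 - sum(arrivals)      # lump the truncated Poisson tail

def backup(V, x, serve):
    """One-step cost-to-go in state x under the chosen action (serve or wait)."""
    served = min(x, Q) if serve else 0
    cost = h * x + (K + c * served if serve else 0.0)
    rest = x - served
    return cost + beta * sum(p * V[min(rest + k, X_MAX)]
                             for k, p in enumerate(arrivals))

V = [0.0] * (X_MAX + 1)
for _ in range(1000):                    # value iteration
    V_new = [min(backup(V, x, False), backup(V, x, True)) for x in range(X_MAX + 1)]
    if max(abs(a - b) for a, b in zip(V, V_new)) < 1e-6:
        V = V_new
        break
    V = V_new

# Estimated optimal level M: smallest queue length at which serving is (weakly) better.
M = next((x for x in range(X_MAX + 1) if backup(V, x, True) <= backup(V, x, False)), None)
print("estimated optimal level M =", M)
```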

192 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider a semi-Markov decision process with arbitrary action space, where the state space is the nonnegative integers and the one-period reward is bounded by a polynomial in n.
Abstract: We consider a semi-Markov decision process with arbitrary action space; the state space is the nonnegative integers. As in queueing systems, we assume that {0, 1, 2,..., n + N} is the set of states accessible from state n in one transition, where N is finite and independent of n. The novel feature of this model is that the one-period reward is not required to be uniformly bounded; instead, we merely assume it to be bounded by a polynomial in n. Our main concern is with the average cost problem. A set of conditions sufficient for there to be an optimal stationary policy which can be obtained from the usual functional equation is developed. These conditions are quite weak and, as illustrated in several queueing examples, are easily verified.
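For orientation, the "usual functional equation" referred to here is the average-cost optimality equation for a semi-Markov decision process, which in standard (not necessarily the authors') notation reads:

```latex
% g: optimal average reward rate, h: relative value function,
% r(i,a): expected one-transition reward, \tau(i,a): expected sojourn time,
% p_{ij}(a): transition probabilities (only states 0,...,i+N are reachable from i).
h(i) \;=\; \max_{a \in A(i)} \Big[\, r(i,a) - g\,\tau(i,a) + \sum_{j=0}^{i+N} p_{ij}(a)\, h(j) \Big],
\qquad i = 0, 1, 2, \dots
```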

136 citations


Journal ArticleDOI
TL;DR: In this paper, moment optimality is used to define a policy that maximizes the sequence of signed moments of total discounted return with a positive (negative) sign if the moment is odd (even).
Abstract: Standard finite state and action discrete time Markov decision processes with discounting are studied using a new optimality criterion called moment optimality. A policy is moment optimal if it lexicographically maximizes the sequence of signed moments of total discounted return with a positive (negative) sign if the moment is odd (even). This criterion is equivalent to being slightly risk averse. It is shown that a stationary policy is moment optimal by examining the negative of the Laplace transform of the total return random variable. An algorithm to construct all stationary moment optimal policies is developed. The algorithm is shown to be finite.
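In symbols, and paraphrasing in generic notation rather than the paper's own: writing R_pi for the total discounted return under policy pi, moment optimality asks for lexicographic maximization of the signed moment sequence, and those moments are exactly the coefficients of the Laplace transform that the authors analyze.

```latex
% Signed moment sequence to be maximized lexicographically:
\big(\, \mathbb{E}[R_{\pi}],\; -\mathbb{E}[R_{\pi}^{2}],\; \mathbb{E}[R_{\pi}^{3}],\; -\mathbb{E}[R_{\pi}^{4}],\; \dots \big)
% Connection to the Laplace transform of the total discounted return:
\mathbb{E}\!\left[e^{-sR_{\pi}}\right] \;=\; \sum_{n \ge 0} \frac{(-s)^{n}}{n!}\;\mathbb{E}\big[R_{\pi}^{n}\big].
```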

91 citations


Journal ArticleDOI
TL;DR: In this paper, a new test for suboptimal actions in discounted Markov decision problems is proposed and discussed in relation to that of MacQueen and Porteus, and preferred computational schemes are given.
Abstract: A new test for suboptimal actions in discounted Markov decision problems is proposed. The test is discussed in relation to that of MacQueen and Porteus, and preferred computational schemes are given.

35 citations


Journal ArticleDOI
TL;DR: Upper and lower bounds on the optimal value function of a finite discounted Markov decision problem can be computed easily when the problem is solved by linear programming or policy iteration, and can be used to identify suboptimal actions.
Abstract: This note points out that upper and lower bounds on the optimal value function of a finite discounted Markov decision problem can be computed easily when the problem is solved by linear programming or policy iteration. These bounds can be used to identify suboptimal actions.
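The note's exact formulas are not reproduced here, but bounds and elimination tests of this general kind (in the spirit of MacQueen and Porteus) typically take the following form for a maximization problem with discount factor beta:

```latex
% MacQueen-type bounds from successive approximations V_n:
V_n(i) + \frac{\beta}{1-\beta}\,\min_j \big[V_n(j) - V_{n-1}(j)\big]
  \;\le\; V^*(i) \;\le\;
V_n(i) + \frac{\beta}{1-\beta}\,\max_j \big[V_n(j) - V_{n-1}(j)\big]
% Given any bounds \underline{V} \le V^* \le \overline{V}, an action a is suboptimal
% in state i (and can be eliminated) whenever
r(i,a) + \beta \sum_j p_{ij}(a)\, \overline{V}(j) \;<\; \underline{V}(i).
```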

22 citations



Journal ArticleDOI
TL;DR: In this article, the authors explore the relevance of the theory of Markov processes to the analysis of stock price movements, and apply the Markov model to more disaggregated data, specifically to individual stock price data.
Abstract: The purpose of this article is to explore the relevance of the theory of Markov processes to the analysis of stock price movements. The present study was prompted by the work of Dryden [6], in which aggregate data on United Kingdom share prices were analyzed within a Markovian framework, and which indicated that it might be fruitful to apply the Markov model to more disaggregated data, specifically to individual stock price data.
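As a minimal sketch of this kind of Markovian analysis of individual share prices (the three-state classification, the toy price series, and the estimator below are illustrative assumptions, not Dryden's or the authors' specification):

```python
import numpy as np

def price_states(prices, eps=1e-9):
    """Classify daily price moves as 0 = down, 1 = unchanged, 2 = up."""
    changes = np.diff(np.asarray(prices, dtype=float))
    return np.where(changes > eps, 2, np.where(changes < -eps, 0, 1))

def transition_matrix(states, n_states=3):
    """Maximum-likelihood estimate of the one-step transition matrix."""
    counts = np.zeros((n_states, n_states))
    for s, t in zip(states[:-1], states[1:]):
        counts[s, t] += 1
    rows = counts.sum(axis=1, keepdims=True)
    rows[rows == 0] = 1.0          # avoid division by zero for unseen states
    return counts / rows

# Toy example with made-up prices
prices = [100, 101, 100, 100, 102, 103, 101, 101, 104]
P = transition_matrix(price_states(prices))
print(P)   # rows: from-state (down/unchanged/up); columns: to-state
```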

15 citations


Book ChapterDOI
01 Jan 1973
TL;DR: Probabilistic potential theory as mentioned in this paper deals with those aspects of Markov processes and martingales that can be put in the framework of analytic potential theory, in such a way that potential-theoretic concepts and problems have a probabilistic interpretation.
Abstract: This chapter presents the potential theory for Markov chains. Probabilistic potential theory is a comparatively new branch of the theory of stochastic processes, more specifically of Markov processes and martingales, and has developed extensively in recent years into an independent, well-established, and very popular discipline. Probabilistic potential theory deals with those aspects of Markov processes and martingales that can be put in the framework of analytic potential theory, in such a way that potential-theoretic concepts and problems have a probabilistic interpretation. It covers the majority of problems of interest in Markov processes and martingales, and their applications. On the one hand, the potential-theoretic approach provided far-reaching unification for Markov processes and opened new research areas; on the other hand, probabilistic methods helped to solve outstanding problems in analytic potential theory. Potential-theoretic notions, such as the Dirichlet problem, the Martin boundary, Choquet capacity, Cartan's fine topology, and many others, are indispensable in the theory of Markov processes.

13 citations



Journal ArticleDOI
TL;DR: In this article, sufficient conditions are given for the optimal control of Markov processes when the control policy is stationary and the process possesses a stationary distribution, and the costs are unbounded and additive, and may or may not be discounted.

5 citations


ReportDOI
01 Jul 1973
TL;DR: It is suggested that improved effectiveness modeling may be possible with Markov methods because more data can be used to estimate model parameters and three models that include false contacts are suggested.
Abstract: The report presents methodology for evaluating ASW system effectiveness in dynamic operational contexts. It shows that conditional probability models may be generalized by using Markov modeling methods and suggests three models that include false contacts. Parameter estimation methods are given for Markov models and the results of experimentation with several methods are presented. Some of the methods use operational data not currently used in effectiveness modeling; hence, it is suggested that improved effectiveness modeling may be possible with Markov methods because more data can be used to estimate model parameters. State space structuring is investigated and results of numerical experimentation are given for different sets of states applied to the same set of data. (Data were generated by hypothetical non-Markov dynamic processes and a Monte Carlo simulation model.)
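As a hedged illustration of the general idea of Markov effectiveness modeling (the states, transition probabilities, and horizon below are hypothetical and not taken from the report), one can propagate a state distribution through an assumed transition matrix and read off the probability of reaching an absorbing "target attacked" state:

```python
import numpy as np

# Hypothetical states, not taken from the report:
# 0 = searching, 1 = false contact, 2 = valid contact, 3 = target attacked (absorbing)
P = np.array([
    [0.80, 0.10, 0.10, 0.00],
    [0.70, 0.20, 0.10, 0.00],
    [0.20, 0.05, 0.45, 0.30],
    [0.00, 0.00, 0.00, 1.00],
])

def effectiveness(P, start=0, steps=24):
    """Probability of having reached the absorbing state within `steps` transitions."""
    dist = np.zeros(P.shape[0])
    dist[start] = 1.0
    for _ in range(steps):
        dist = dist @ P
    return dist[-1]

print(f"P(attack within 24 steps) = {effectiveness(P):.3f}")
```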

Journal ArticleDOI
01 Jul 1973
TL;DR: An efficient algorithm to determine the optimal sequence of measurements is presented, on the basis of which a sensitivity analysis with respect to the process parameters is also carried out.
Abstract: Two kinds of problems regarding measurement optimization in stochastic decision processes, when measurements are costly or constrained not to exceed a given number, have been investigated in recent years: the first refers to the optimal timing of observations of the state vector of the process, while the second refers to the convenience of buying information about the random actions exerted by a stochastic environment. In this paper the two problems are considered from a unified point of view; in other words, the decision maker has to determine the optimal observation policy under the assumption that both the state and the random vectors are measurable. A solution based on the application of dynamic programming is discussed for a general class of multistage processes. Analytical results are then obtained for scalar linear systems with quadratic cost on state and control. In this case, an efficient algorithm to determine the optimal sequence of measurements is presented, on the basis of which a sensitivity analysis with respect to the process parameters is also carried out.
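The paper's efficient algorithm is not reproduced here; as a purely illustrative brute-force sketch (all system and cost parameters are assumed), one can enumerate measurement schedules for a scalar linear system, propagate the estimation-error variance through the usual scalar Kalman recursions, and pick the cheapest schedule:

```python
from itertools import combinations

# Assumed scalar system x_{k+1} = a x_k + w_k, measurement y_k = x_k + v_k when taken.
a, q, r = 0.9, 1.0, 0.5        # dynamics gain, process noise var, measurement noise var
N, budget = 8, 3               # horizon and maximum number of measurements
c_state, c_meas = 1.0, 0.4     # weight on error variance, cost per measurement
p0 = 2.0                       # initial error variance

def schedule_cost(measure_at):
    """Total cost of a given set of measurement times."""
    p, cost = p0, 0.0
    for k in range(N):
        p = a * a * p + q                  # variance prediction step
        if k in measure_at:                # optional measurement update
            p = p * r / (p + r)
            cost += c_meas
        cost += c_state * p                # penalize residual uncertainty
    return cost

best = min((frozenset(s) for m in range(budget + 1)
            for s in combinations(range(N), m)), key=schedule_cost)
print("best measurement times:", sorted(best), "cost:", round(schedule_cost(best), 3))
```

For short horizons this enumeration is exact; the point of an efficient algorithm such as the one the paper develops is precisely to avoid this combinatorial search.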

DissertationDOI
01 Jan 1973
TL;DR: This dissertation studies aspects of the convergence of Bayes policies and posterior distributions for Markov decision processes.
Abstract: "Aspects of the convergence of Bayes policies and posterior distributions for Markov decision processes" (1973). Retrospective Theses and Dissertations, Paper 5003.

Journal ArticleDOI
TL;DR: In this paper, sufficient conditions for the existence of a solution of the equation x = f + Px are generalized to those for cyclic chains and multi-chains for denumerable Markovian decision processes.
Abstract: The aim of this paper is to identify the denumerable stochastic systems which have an optimal strategy, namely, to give general sufficient conditions for the existence of an optimal strategy for denumerable Markovian decision processes. This is accomplished by using Markov potential theory and by showing the range of validity of a method for finding an optimal strategy known as Howard's technique. Since Markov potential theory helps to characterize the properties of evaluations of strategies on such processes, it plays an important role in finding the sufficient conditions mentioned above. For Markov potential theory, new concepts are introduced, such as absorbable chains and quasi-potentials, and using them, sufficient conditions for the existence of a solution of the equation x = f + Px are generalized to those for cyclic chains and multi-chains. The results obtained in this paper are as follows: if a system represents a strong Markov chain for any strategy, then there exists an optimal strategy w...
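For orientation, the equation in question and the potential-theoretic form of its solution are, in standard notation (convergence of the series being precisely what conditions of the kind developed here must guarantee):

```latex
% Potential equation for a kernel P and charge f:
x \;=\; f + P x
% When f \ge 0 and the potential series converges, its minimal nonnegative solution is
x \;=\; \sum_{n=0}^{\infty} P^{n} f,
% formally (I - P)^{-1} f, the Markov potential of f.
```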

Journal ArticleDOI
TL;DR: A methodology is presented to select between two marketing or pricing plans with the objective of maximizing profit; the decision structure is portrayed in graphical form, with alternating stages of decision and uncertainty.
Abstract: A methodology is presented to select between two marketing or pricing plans with the objective of maximizing profit. The decision structure is portrayed in graphical form, with alternating stages of decision and uncertainty. A standard dynamic programming algorithm is used to solve the graph and aid in final decision making. A decision path is plotted at each decision vertex on the graph such that the decision maker can determine, at any combination of sales volume and time period, which of the two marketing plans to follow from that vertex so as to achieve the greatest expected profit for the remaining span of time.
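A minimal sketch of the backward induction involved is given below; the sales-volume grid, plan-specific transition probabilities, and profit figures are invented for illustration and are not the paper's data:

```python
import numpy as np

T, V = 4, 5                      # decision periods and sales-volume levels (hypothetical)
# profit[plan][v]: expected one-period profit at volume level v under each plan
profit = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],     # plan A
                   [0.5, 1.5, 3.5, 5.0, 6.0]])    # plan B

def drift_matrix(p_up):          # simple random-walk-style volume dynamics
    M = np.zeros((V, V))
    for v in range(V):
        M[v, min(v + 1, V - 1)] += p_up
        M[v, max(v - 1, 0)]     += 1 - p_up
    return M

trans = np.array([drift_matrix(0.5), drift_matrix(0.6)])   # per-plan transitions

value  = np.zeros(V)             # terminal value
policy = np.zeros((T, V), dtype=int)
for t in reversed(range(T)):     # backward induction over the decision graph
    q = profit + trans @ value   # q[plan][v]: expected profit-to-go for each choice
    policy[t] = np.argmax(q, axis=0)
    value = q.max(axis=0)

print("plan to follow (0=A, 1=B) by period and volume level:\n", policy)
```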

Journal ArticleDOI
TL;DR: In this paper, an optimal surface-to-air missile (SAM) firing pattern to defend a surface target is obtained via application of the concepts of closed-loop (feedback) and open-loop optimal control.
Abstract: An operations research problem concerning the optimal surface-to-air missile (SAM) firing pattern to defend a surface target is solved via application of the concepts of closed-loop (feedback) and open-loop optimal control. The SAM defense problem is formulated as a Markov decision process with the number of SAMs in each salvo as the decision variable. Interesting cases, including the presence of imperfect sensor observation and a bound on the number of SAMs available, are considered. The principle of dynamic programming and the technique of nonlinear integer programming are applied to reach closed-loop and open-loop solutions. Numerical examples are given for illustration.
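As a toy illustration of the salvo-size decision (single-missile kill probability, missile cost, survival penalty, and inventory below are assumptions, not the paper's data), a closed-loop solution can be obtained by backward induction over the remaining opportunities and remaining missiles:

```python
from functools import lru_cache

p_kill   = 0.6     # single-SAM kill probability (assumed)
c_sam    = 1.0     # cost per SAM fired (assumed)
penalty  = 20.0    # cost if the target survives all engagements (assumed)
T, stock = 3, 6    # engagement opportunities and SAMs available

@lru_cache(maxsize=None)
def cost_to_go(t, sams_left):
    """Minimum expected cost when the target is still alive with t opportunities left."""
    if t == 0:
        return penalty
    best = float("inf")
    for k in range(sams_left + 1):          # salvo size decision
        p_survive = (1 - p_kill) ** k       # target survives the salvo
        expected = k * c_sam + p_survive * cost_to_go(t - 1, sams_left - k)
        best = min(best, expected)
    return best

salvo = min(range(stock + 1),
            key=lambda k: k * c_sam + (1 - p_kill) ** k * cost_to_go(T - 1, stock - k))
print("first-salvo size:", salvo, "expected cost:", round(cost_to_go(T, stock), 3))
```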

Book ChapterDOI
TL;DR: This chapter deals with a decision-making system that uses the Markov process as a system model and applies Bayesian decision theory to a Markov chain with uncertain transition probabilities.
Abstract: This chapter deals with a decision-making system based on the Markov process as a system model. Applications of this model have been made in the fields of physics, chemistry, biology, and operations research. In these applications it is generally assumed that the matrix of transition probabilities is known. The Bayes, or risk-minimizing, decision is derived in the chapter. An a priori density is specified over the transition probabilities, and Bayes' formula is used to compute an updated posterior density after observations are recorded. The decision maker selects the decision that minimizes his expected loss or risk. Once a Markov chain is defined, a reward structure can be placed over the states. Applying Bayesian decision theory to a Markov chain with uncertain probabilities would be simple if a density over the transition probabilities could be easily transformed into a density over the steady-state probabilities. A two-state Markov chain poses no problem; a five-state Markov chain, however, necessitates operations in a 20-dimensional space.
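A minimal sketch of the Bayesian machinery described here, under assumed Dirichlet prior pseudo-counts and made-up observation counts for a two-state chain; the Monte Carlo step at the end illustrates why, for larger chains, the induced density over steady-state probabilities is awkward to handle in closed form:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-state chain: Dirichlet prior pseudo-counts for each row
prior  = np.array([[1.0, 1.0],
                   [1.0, 1.0]])
# Observed transition counts (hypothetical data)
counts = np.array([[8, 2],
                   [3, 7]])
posterior = prior + counts           # conjugate Dirichlet update, row by row

def steady_state(P):
    """Stationary distribution of a transition matrix via the left eigenvector."""
    vals, vecs = np.linalg.eig(P.T)
    v = np.real(vecs[:, np.argmax(np.real(vals))])
    return v / v.sum()

# Monte Carlo: sample transition matrices from the posterior and map each
# to its stationary distribution, approximating the induced density.
samples = np.array([
    steady_state(np.vstack([rng.dirichlet(row) for row in posterior]))
    for _ in range(5000)
])
print("posterior mean of steady-state probabilities:", samples.mean(axis=0))
```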