
Showing papers on "Markov decision process" published in 1973


Journal ArticleDOI
TL;DR: A batch service queue is considered in which each batch size and its time of service are subject to control; the policies minimizing the expected continuously discounted cost and the expected cost per unit time over an infinite horizon are shown to have a threshold form.
Abstract: A batch service queue is considered where each batch size and its time of service is subject to control. Costs are incurred for serving the customers and for holding them in the system. Viewing the system as a Markov decision process (i.e., dynamic program) with unbounded costs, we show that policies which minimize the expected continuously discounted cost and the expected cost per unit time over an infinite time horizon are of the form: at a review point when x customers are waiting, serve min {x, Q} customers (Q being the, possibly infinite, service capacity) if and only if x exceeds a certain optimal level M. Methods of computing M for both the discounted and average cost contexts are presented.
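As a rough illustration of how the optimal level M might be computed in the discounted case, the sketch below runs value iteration on a truncated version of such a queue; the arrival distribution, cost figures, discount factor, and truncation level are assumptions made for the example, not values from the paper.

```python
import math

# All parameters below are illustrative assumptions, not taken from the paper.
lam   = 2.0          # Poisson arrival rate per review period
h     = 1.0          # holding cost per waiting customer per period
K, c  = 5.0, 0.5     # fixed and per-customer service costs
Q     = 10           # service capacity per batch
beta  = 0.95         # discount factor
X_MAX = 200          # truncation of the (in principle unbounded) queue length

arrivals = [math.exp(-lam) * lam**k / math.factorial(k) for k in range(40)]
arrivals[-1] += 1.0 - sum(arrivals)      # lump the truncated Poisson tail

def backup(V, x, serve):
    """One-step cost-to-go in state x under the chosen action (serve or wait)."""
    served = min(x, Q) if serve else 0
    cost = h * x + (K + c * served if serve else 0.0)
    rest = x - served
    return cost + beta * sum(p * V[min(rest + k, X_MAX)]
                             for k, p in enumerate(arrivals))

V = [0.0] * (X_MAX + 1)
for _ in range(1000):                    # value iteration
    V_new = [min(backup(V, x, False), backup(V, x, True)) for x in range(X_MAX + 1)]
    if max(abs(a - b) for a, b in zip(V, V_new)) < 1e-6:
        V = V_new
        break
    V = V_new

# Estimated optimal level M: smallest queue length at which serving is (weakly) better.
M = next((x for x in range(X_MAX + 1) if backup(V, x, True) <= backup(V, x, False)), None)
print("estimated optimal level M =", M)
```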

192 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider a semi-Markov decision process with arbitrary action space, where the state space is the nonnegative integers and the one-period reward is bounded by a polynomial in n.
Abstract: We consider a semi-Markov decision process with arbitrary action space; the state space is the nonnegative integers. As in queueing systems, we assume that {0, 1, 2,..., n + N} is the set of states accessible from state n in one transition, where N is finite and independent of n. The novel feature of this model is that the one-period reward is not required to be uniformly bounded; instead, we merely assume it to be bounded by a polynomial in n. Our main concern is with the average cost problem. A set of conditions sufficient for there to be an optimal stationary policy which can be obtained from the usual functional equation is developed. These conditions are quite weak and, as illustrated in several queueing examples, are easily verified.
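For orientation, the "usual functional equation" referred to here is the average-cost optimality equation for a semi-Markov decision process, which in standard (not necessarily the authors') notation reads:

```latex
% g: optimal average reward rate, h: relative value function,
% r(i,a): expected one-transition reward, \tau(i,a): expected sojourn time,
% p_{ij}(a): transition probabilities (only states 0,...,i+N are reachable from i).
h(i) \;=\; \max_{a \in A(i)} \Big[\, r(i,a) - g\,\tau(i,a) + \sum_{j=0}^{i+N} p_{ij}(a)\, h(j) \Big],
\qquad i = 0, 1, 2, \dots
```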

136 citations


Journal ArticleDOI
TL;DR: In this paper, moment optimality is used to define a policy that maximizes the sequence of signed moments of total discounted return with a positive (negative) sign if the moment is odd (even).
Abstract: Standard finite state and action discrete time Markov decision processes with discounting are studied using a new optimality criterion called moment optimality. A policy is moment optimal if it lexicographically maximizes the sequence of signed moments of total discounted return with a positive (negative) sign if the moment is odd (even). This criterion is equivalent to being slightly risk averse. It is shown that a stationary policy is moment optimal by examining the negative of the Laplace transform of the total return random variable. An algorithm to construct all stationary moment optimal policies is developed. The algorithm is shown to be finite.
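In symbols, and paraphrasing in generic notation rather than the paper's own: writing R_pi for the total discounted return under policy pi, moment optimality asks for lexicographic maximization of the signed moment sequence, and those moments are exactly the coefficients of the Laplace transform that the authors analyze.

```latex
% Signed moment sequence to be maximized lexicographically:
\big(\, \mathbb{E}[R_{\pi}],\; -\mathbb{E}[R_{\pi}^{2}],\; \mathbb{E}[R_{\pi}^{3}],\; -\mathbb{E}[R_{\pi}^{4}],\; \dots \big)
% Connection to the Laplace transform of the total discounted return:
\mathbb{E}\!\left[e^{-sR_{\pi}}\right] \;=\; \sum_{n \ge 0} \frac{(-s)^{n}}{n!}\;\mathbb{E}\big[R_{\pi}^{n}\big].
```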

91 citations


Journal ArticleDOI
TL;DR: In this paper, a new test for suboptimal actions in discounted Markov decision problems is proposed and discussed in relation to that of MacQueen and Porteus, and preferred computational schemes are given.
Abstract: A new test for suboptimal actions in discounted Markov decision problems is proposed. The test is discussed in relation to that of MacQueen and Porteus, and preferred computational schemes are given.

35 citations


Journal ArticleDOI
TL;DR: Upper and lower bounds on the optimal value function of a finite discounted Markov decision problem can be computed easily when the problem is solved by linear programming or policy iteration, and can be used to identify suboptimal actions.
Abstract: This note points out that upper and lower bounds on the optimal value function of a finite discounted Markov decision problem can be computed easily when the problem is solved by linear programming or policy iteration. These bounds can be used to identify suboptimal actions.
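The note's exact formulas are not reproduced here, but bounds and elimination tests of this general kind (in the spirit of MacQueen and Porteus) typically take the following form for a maximization problem with discount factor beta:

```latex
% MacQueen-type bounds from successive approximations V_n:
V_n(i) + \frac{\beta}{1-\beta}\,\min_j \big[V_n(j) - V_{n-1}(j)\big]
  \;\le\; V^*(i) \;\le\;
V_n(i) + \frac{\beta}{1-\beta}\,\max_j \big[V_n(j) - V_{n-1}(j)\big]
% Given any bounds \underline{V} \le V^* \le \overline{V}, an action a is suboptimal
% in state i (and can be eliminated) whenever
r(i,a) + \beta \sum_j p_{ij}(a)\, \overline{V}(j) \;<\; \underline{V}(i).
```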

22 citations



Journal ArticleDOI
TL;DR: In this article, the authors explore the relevance of the theory of Markov processes to the analysis of stock price movements, and apply the Markov model to more disaggregated data, specifically to individual stock price data.
Abstract: The purpose of this article is to explore the relevance of the theory of Markov processes to the analysis of stock price movements. The present study was prompted by the work of Dryden [6], in which aggregate data on United Kingdom share prices were analyzed within a Markovian framework, and which indicated that it might be fruitful to apply the Markov model to more disaggregated data, specifically to individual stock price data.
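As a minimal sketch of this kind of Markovian analysis of individual share prices (the three-state classification, the toy price series, and the estimator below are illustrative assumptions, not Dryden's or the authors' specification):

```python
import numpy as np

def price_states(prices, eps=1e-9):
    """Classify daily price moves as 0 = down, 1 = unchanged, 2 = up."""
    changes = np.diff(np.asarray(prices, dtype=float))
    return np.where(changes > eps, 2, np.where(changes < -eps, 0, 1))

def transition_matrix(states, n_states=3):
    """Maximum-likelihood estimate of the one-step transition matrix."""
    counts = np.zeros((n_states, n_states))
    for s, t in zip(states[:-1], states[1:]):
        counts[s, t] += 1
    rows = counts.sum(axis=1, keepdims=True)
    rows[rows == 0] = 1.0          # avoid division by zero for unseen states
    return counts / rows

# Toy example with made-up prices
prices = [100, 101, 100, 100, 102, 103, 101, 101, 104]
P = transition_matrix(price_states(prices))
print(P)   # rows: from-state (down/unchanged/up); columns: to-state
```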

15 citations


Book ChapterDOI
01 Jan 1973
TL;DR: Probabilistic potential theory as mentioned in this paper deals with those aspects of Markov processes and martingales that can be put in the framework of analytic potential theory, in such a way that potential-theoretic concepts and problems have a probabilistic interpretation.
Abstract: This chapter presents the potential theory for Markov chains. Probabilistic potential theory is a comparatively new branch of the theory of stochastic processes, more specifically of Markov processes and martingales, and has developed extensively in recent years into an independent, well-established, and very popular discipline. Probabilistic potential theory deals with those aspects of Markov processes and martingales that can be put in the framework of analytic potential theory, in such a way that potential-theoretic concepts and problems have a probabilistic interpretation. It covers the majority of problems of interest in Markov processes and martingales, and their applications. On the one hand, the potential-theoretic approach provided far-reaching unification for Markov processes and opened new research areas; on the other hand, probabilistic methods helped to solve outstanding problems in analytic potential theory. Potential-theoretic notions, such as the Dirichlet problem, the Martin boundary, Choquet capacity, Cartan's fine topology, and many others, are indispensable in the theory of Markov processes.

13 citations



Journal ArticleDOI
TL;DR: In this article, sufficient conditions are given for the optimal control of Markov processes when the control policy is stationary and the process possesses a stationary distribution, and the costs are unbounded and additive, and may or may not be discounted.

5 citations


ReportDOI
01 Jul 1973
TL;DR: It is suggested that improved effectiveness modeling may be possible with Markov methods because more data can be used to estimate model parameters and three models that include false contacts are suggested.
Abstract: The report presents methodology for evaluating ASW system effectiveness in dynamic operational contexts. It shows that conditional probability models may be generalized by using Markov modeling methods and suggests three models that include false contacts. Parameter estimation methods are given for Markov models and the results of experimentation with several methods are presented. Some of the methods use operational data not currently used in effectiveness modeling; hence, it is suggested that improved effectiveness modeling may be possible with Markov methods because more data can be used to estimate model parameters. State space structuring is investigated and results of numerical experimentation are given for different sets of states applied to the same set of data. (Data were generated by hypothetical non-Markov dynamic processes and a Monte Carlo simulation model.)
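As a hedged illustration of the general idea of Markov effectiveness modeling (the states, transition probabilities, and horizon below are hypothetical and not taken from the report), one can propagate a state distribution through an assumed transition matrix and read off the probability of reaching an absorbing "target attacked" state:

```python
import numpy as np

# Hypothetical states, not taken from the report:
# 0 = searching, 1 = false contact, 2 = valid contact, 3 = target attacked (absorbing)
P = np.array([
    [0.80, 0.10, 0.10, 0.00],
    [0.70, 0.20, 0.10, 0.00],
    [0.20, 0.05, 0.45, 0.30],
    [0.00, 0.00, 0.00, 1.00],
])

def effectiveness(P, start=0, steps=24):
    """Probability of having reached the absorbing state within `steps` transitions."""
    dist = np.zeros(P.shape[0])
    dist[start] = 1.0
    for _ in range(steps):
        dist = dist @ P
    return dist[-1]

print(f"P(attack within 24 steps) = {effectiveness(P):.3f}")
```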

Journal ArticleDOI
01 Jul 1973
TL;DR: An efficient algorithm to determine the optimal sequence of measurements is presented, on the basis of which a sensitivity analysis with respect to the process parameters is also carried out.
Abstract: Two kinds of problems regarding measurement optimization in stochastic decision processes, when measurements are costly or constrained not to exceed a given number, have been investigated in recent years: the first refers to the optimal timing of observations of the state vector of the process, while the second refers to the convenience of buying information about the random actions exerted by a stochastic environment. In this paper the two problems are considered from a unified point of view; in other words, the decision maker has to determine the optimal observation policy under the assumption that both the state and the random vectors are measurable. A solution based on the application of dynamic programming is discussed for a general class of multistage processes. Analytical results are then obtained for scalar linear systems with quadratic cost on state and control. In this case, an efficient algorithm to determine the optimal sequence of measurements is presented, on the basis of which a sensitivity analysis with respect to the process parameters is also carried out.
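The paper's efficient algorithm is not reproduced here; as a purely illustrative brute-force sketch (all system and cost parameters are assumed), one can enumerate measurement schedules for a scalar linear system, propagate the estimation-error variance through the usual scalar Kalman recursions, and pick the cheapest schedule:

```python
from itertools import combinations

# Assumed scalar system x_{k+1} = a x_k + w_k, measurement y_k = x_k + v_k when taken.
a, q, r = 0.9, 1.0, 0.5        # dynamics gain, process noise var, measurement noise var
N, budget = 8, 3               # horizon and maximum number of measurements
c_state, c_meas = 1.0, 0.4     # weight on error variance, cost per measurement
p0 = 2.0                       # initial error variance

def schedule_cost(measure_at):
    """Total cost of a given set of measurement times."""
    p, cost = p0, 0.0
    for k in range(N):
        p = a * a * p + q                  # variance prediction step
        if k in measure_at:                # optional measurement update
            p = p * r / (p + r)
            cost += c_meas
        cost += c_state * p                # penalize residual uncertainty
    return cost

best = min((frozenset(s) for m in range(budget + 1)
            for s in combinations(range(N), m)), key=schedule_cost)
print("best measurement times:", sorted(best), "cost:", round(schedule_cost(best), 3))
```

For short horizons this enumeration is exact; the point of an efficient algorithm such as the one the paper develops is precisely to avoid this combinatorial search.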

DissertationDOI
01 Jan 1973
TL;DR: This dissertation studies aspects of the convergence of Bayes policies and posterior distributions for Markov decision processes.
Abstract: "Aspects of the convergence of Bayes policies and posterior distributions for Markov decision processes" (1973). Retrospective Theses and Dissertations, Paper 5003.

Journal ArticleDOI
TL;DR: In this paper, sufficient conditions for the existence of a solution of the equation x = f + Px are generalized to those for cyclic chains and multi-chains for denumerable Markovian decision processes.
Abstract: The aim of this paper is to identify the denumerable stochastic systems which have an optimal strategy, namely, to give general sufficient conditions for the existence of an optimal strategy for denumerable Markovian decision processes. This is accomplished by using Markov potential theory and by showing the range of validity of a method for finding an optimal strategy known as Howard's technique. Since Markov potential theory helps to characterize the properties of evaluations of strategies on such processes, it plays an important role in finding the sufficient conditions mentioned above. For Markov potential theory, new concepts are introduced, such as absorbable chains and quasi-potentials, and using them, sufficient conditions for the existence of a solution of the equation x = f + Px are generalized to those for cyclic chains and multi-chains. The results obtained in this paper are as follows: if a system represents a strong Markov chain for any strategy, then there exists an optimal strategy w...
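For orientation, the equation in question and the potential-theoretic form of its solution are, in standard notation (convergence of the series being precisely what conditions of the kind developed here must guarantee):

```latex
% Potential equation for a kernel P and charge f:
x \;=\; f + P x
% When f \ge 0 and the potential series converges, its minimal nonnegative solution is
x \;=\; \sum_{n=0}^{\infty} P^{n} f,
% formally (I - P)^{-1} f, the Markov potential of f.
```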

Journal ArticleDOI
TL;DR: A methodology is presented to select between two marketing or pricing plans with the objective of maximizing profit; the decision structure is portrayed in graphical form, with alternating stages of decision and uncertainty.
Abstract: A methodology is presented to select between two marketing or pricing plans with the objective of maximizing profit. The decision structure is portrayed in graphical form, with alternating stages of decision and uncertainty. A standard dynamic programming algorithm is used to solve the graph and aid in final decision making. A decision path is plotted at each decision vertex on the graph such that the decision maker can determine, at any combination of sales volume and time period, which of the two marketing plans to follow from that vertex so as to achieve the greatest expected profit for the remaining span of time.
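A minimal sketch of the backward induction involved is given below; the sales-volume grid, plan-specific transition probabilities, and profit figures are invented for illustration and are not the paper's data:

```python
import numpy as np

T, V = 4, 5                      # decision periods and sales-volume levels (hypothetical)
# profit[plan][v]: expected one-period profit at volume level v under each plan
profit = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],     # plan A
                   [0.5, 1.5, 3.5, 5.0, 6.0]])    # plan B

def drift_matrix(p_up):          # simple random-walk-style volume dynamics
    M = np.zeros((V, V))
    for v in range(V):
        M[v, min(v + 1, V - 1)] += p_up
        M[v, max(v - 1, 0)]     += 1 - p_up
    return M

trans = np.array([drift_matrix(0.5), drift_matrix(0.6)])   # per-plan transitions

value  = np.zeros(V)             # terminal value
policy = np.zeros((T, V), dtype=int)
for t in reversed(range(T)):     # backward induction over the decision graph
    q = profit + trans @ value   # q[plan][v]: expected profit-to-go for each choice
    policy[t] = np.argmax(q, axis=0)
    value = q.max(axis=0)

print("plan to follow (0=A, 1=B) by period and volume level:\n", policy)
```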

Journal ArticleDOI
TL;DR: In this paper, an optimal surface-to-air missile (SAM) firing pattern to defend a surface target is obtained via application of the concepts of closed-loop (feedback) and open-loop optimal control.
Abstract: An operations research problem concerning the optimal surface-to-air missile (SAM) firing pattern to defend a surface target is solved via application of the concepts of closed-loop (feedback) and open-loop optimal control. The SAM defense problem is formulated as a Markov decision process with the number of SAMs in each salvo as the decision variable. Interesting cases, including the presence of imperfect sensor observation and a bound on the number of SAMs available, are considered. The principle of dynamic programming and the technique of nonlinear integer programming are applied to reach closed-loop and open-loop solutions. Numerical examples are given for illustration.
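As a toy illustration of the salvo-size decision (single-missile kill probability, missile cost, survival penalty, and inventory below are assumptions, not the paper's data), a closed-loop solution can be obtained by backward induction over the remaining opportunities and remaining missiles:

```python
from functools import lru_cache

p_kill   = 0.6     # single-SAM kill probability (assumed)
c_sam    = 1.0     # cost per SAM fired (assumed)
penalty  = 20.0    # cost if the target survives all engagements (assumed)
T, stock = 3, 6    # engagement opportunities and SAMs available

@lru_cache(maxsize=None)
def cost_to_go(t, sams_left):
    """Minimum expected cost when the target is still alive with t opportunities left."""
    if t == 0:
        return penalty
    best = float("inf")
    for k in range(sams_left + 1):          # salvo size decision
        p_survive = (1 - p_kill) ** k       # target survives the salvo
        expected = k * c_sam + p_survive * cost_to_go(t - 1, sams_left - k)
        best = min(best, expected)
    return best

salvo = min(range(stock + 1),
            key=lambda k: k * c_sam + (1 - p_kill) ** k * cost_to_go(T - 1, stock - k))
print("first-salvo size:", salvo, "expected cost:", round(cost_to_go(T, stock), 3))
```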

Book ChapterDOI
TL;DR: This chapter deals with a decision-making system that uses the Markov process as a system model and applies Bayesian decision theory to a Markov chain with uncertain transition probabilities.
Abstract: This chapter deals with a decision-making system based on the Markov process as a system model. Applications of this model have been made in the fields of physics, chemistry, biology, and operations research. In these applications it is generally assumed that the matrix of transition probabilities is known. The Bayes, or risk-minimizing, decision is derived in the chapter. An a priori density is specified over the transition probabilities, and Bayes' formula is used to compute an updated posterior density after observations are recorded. The decision maker selects the decision that minimizes his expected loss or risk. Once a Markov chain is defined, a reward structure can be placed over the states. Applying Bayesian decision theory to a Markov chain with uncertain probabilities would be simple if a density over the transition probabilities could be easily transformed into a density over the steady-state probabilities. A two-state Markov chain poses no problem; a five-state Markov chain, however, necessitates operations in a 20-dimensional space.
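A minimal sketch of the Bayesian machinery described here, under assumed Dirichlet prior pseudo-counts and made-up observation counts for a two-state chain; the Monte Carlo step at the end illustrates why, for larger chains, the induced density over steady-state probabilities is awkward to handle in closed form:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-state chain: Dirichlet prior pseudo-counts for each row
prior  = np.array([[1.0, 1.0],
                   [1.0, 1.0]])
# Observed transition counts (hypothetical data)
counts = np.array([[8, 2],
                   [3, 7]])
posterior = prior + counts           # conjugate Dirichlet update, row by row

def steady_state(P):
    """Stationary distribution of a transition matrix via the left eigenvector."""
    vals, vecs = np.linalg.eig(P.T)
    v = np.real(vecs[:, np.argmax(np.real(vals))])
    return v / v.sum()

# Monte Carlo: sample transition matrices from the posterior and map each
# to its stationary distribution, approximating the induced density.
samples = np.array([
    steady_state(np.vstack([rng.dirichlet(row) for row in posterior]))
    for _ in range(5000)
])
print("posterior mean of steady-state probabilities:", samples.mean(axis=0))
```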