
Showing papers on "Markov decision process" published in 1983


Journal ArticleDOI
TL;DR: This article addresses the long-term average cost control of continuous-time Markov processes and surveys problems and methods from various works on continuous control, optimal stopping, and impulse control.
Abstract: This paper addresses the long-term average cost control of continuous time Markov processes. A survey of problems and methods contained in various works is given for continuous control, optimal stopping, and impulse control.

78 citations


Journal ArticleDOI
TL;DR: This paper establishes the existence of a solution to the optimality equations in undiscounted semi-Markov decision models with countable state space, under conditions generalizing the hitherto obtained results.
Abstract: This paper establishes the existence of a solution to the optimality equations in undiscounted semi-Markov decision models with countable state space, under conditions generalizing the hitherto obtained results. In particular, we merely require the existence of a finite set of states in which every pair of states can reach each other via some stationary policy, instead of the traditional and restrictive assumption that every stationary policy has a single irreducible set of states. A replacement model and an inventory model illustrate why this extension is essential. Our approach differs fundamentally from classical approaches; we convert the optimality equations into a form suitable for the application of a fixed point theorem.

64 citations


Journal ArticleDOI
TL;DR: In this article, the sets of (Pareto) maximal returns and maximal policies for Markov decision processes with vector-valued returns are defined, and monotonicity conditions are shown to guarantee that a stationary policy is among the maximal policies and that the maximal returns lie in the convex hull of returns of stationary policies.
Abstract: Dynamic programming models with vector-valued returns are investigated. The sets of (Pareto) maximal returns and (Pareto) maximal policies are defined. Monotonicity conditions are shown to be sufficient for the set of maximal policies to include a stationary policy, and for the set of maximal returns to be in the convex hull of returns of stationary policies. In particular, it is shown that these results hold for Markov decision processes.

61 citations


Journal ArticleDOI
TL;DR: In this paper, the authors investigated the existence of stationary average optimal policies for Markov decision drift processes and derived sufficient conditions, which guarantee that a 'limit point' of a sequence of discounted optimal policies with the discounting factor approaching 1 is an average optimal policy.
Abstract: Recently the authors introduced the concept of Markov decision drift processes. A Markov decision drift process can be seen as a straightforward generalization of a Markov decision process with continuous time parameter. In this paper we investigate the existence of stationary average optimal policies for Markov decision drift processes. Using a well-known Abelian theorem we derive sufficient conditions, which guarantee that a 'limit point' of a sequence of discounted optimal policies with the discounting factor approaching 1 is an average optimal policy. An alternative set of sufficient conditions is obtained for the case in which the discounted optimal policies generate regenerative stochastic processes. The latter set of conditions is easier to verify in several applications. The results of this paper are also applicable to Markov decision processes with discrete or continuous time parameter and to semi-Markov decision processes. In this sense they generalize some well-known results for Markov decision processes with finite or compact action space. Applications to an M/M/1 queueing model and a maintenance replacement model are given. It is shown that under certain conditions on the model parameters the average optimal policy for the M/M/1 queueing model is monotone non-decreasing (as a function of the number of waiting customers) with respect to the service intensity and monotone non-increasing with respect to the arrival intensity. For the maintenance replacement model we prove the average optimality of a bang-bang type policy. Special attention is paid to the computation of the optimal control parameters.

52 citations


Journal ArticleDOI
TL;DR: In this article, a stochastic network problem that includes interconnected queues is described and studied within the framework of controlled Markov chains with average cost criterion and with special cost and transition structures.
Abstract: Controlled Markov chains with average cost criterion and with special cost and transition structures are studied. Existence of optimal stationary strategies is established for the average cost criterion. Corresponding dynamic programming equations are derived. A stochastic network problem that includes interconnected queues as a special case is described and studied within this framework.

41 citations


Journal ArticleDOI
TL;DR: In this article, the authors study Markov jump decision processes with both continuously and instantaneously acting decisions and with deterministic drift between jumps, and obtain necessary and sufficient optimality conditions for these decision processes in terms of equations and inequalities of quasi-variational type.
Abstract: We study Markov jump decision processes with both continuously and instantaneously acting decisions and with deterministic drift between jumps. Such decision processes were recently introduced and studied from the point of view of discrete-time approximations by Van der Duyn Schouten. We obtain necessary and sufficient optimality conditions for these decision processes in terms of equations and inequalities of quasi-variational type. By means of the latter we find simple necessary and sufficient conditions for the existence of stationary optimal policies in such processes with finite state and action spaces, both in the discounted and average-per-unit-time reward cases.

35 citations



Journal ArticleDOI
TL;DR: This paper describes a computational comparison of value iteration algorithms for discounted Markov decision processes and concludes that the current state-of-the-art approaches to solving these problems are unsatisfactory.

17 citations
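
To make the object of comparison concrete, the fragment below sketches the basic successive-approximation (value iteration) scheme for a finite discounted MDP in Python. The two-state data and function names are invented for illustration and are not the paper's test problems; the paper benchmarks variants of this basic scheme.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Basic value iteration for a finite discounted MDP.

    P[a] is the |S| x |S| transition matrix for action a,
    R[a] is the length-|S| vector of expected one-step rewards for action a.
    Returns an approximate optimal value function and a greedy policy.
    """
    V = np.zeros(P[0].shape[0])
    for _ in range(max_iter):
        Q = np.array([R[a] + gamma * P[a] @ V for a in range(len(P))])
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return V_new, Q.argmax(axis=0)

# Invented two-state, two-action example (not from the paper).
P = [np.array([[0.8, 0.2], [0.1, 0.9]]),
     np.array([[0.5, 0.5], [0.6, 0.4]])]
R = [np.array([1.0, 0.0]), np.array([0.5, 2.0])]
V_opt, greedy_policy = value_iteration(P, R)
```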


Book ChapterDOI
01 Jan 1983
TL;DR: A Markov decision modeling approach is presented for a large-scale multi-objective problem--the maintenance of a statewide network of roads--integrating management objectives for public safety and comfort with State and Federal budgetary policies and engineering considerations.
Abstract: This article discusses a Markov decision modeling approach to the solution of a large-scale multi-objective problem--the maintenance of a statewide network of roads. This approach integrates management objectives for public safety and comfort, and preservation of the considerable investment in highways, with State and Federal budgetary policies and engineering considerations. The Markov decision model captures the dynamic and probabilistic aspects of the maintenance problem and considers the influence of environmental factors, the type of roads, traffic densities and various engineering factors influencing road deterioration. The model recommends the best maintenance action for each mile of the network of highways, and specifies the minimum funds required to carry out the maintenance program.

13 citations
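
Network-level maintenance models of this kind are commonly posed as linear programs over long-run state-action frequencies, which also makes budget restrictions easy to attach as extra rows. The sketch below illustrates that generic formulation with invented condition states, costs, and transition probabilities, solved with scipy; it is not the authors' model or data.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: states = {good, poor}; actions = {routine, rebuild}.
P = {('good', 'routine'): [0.80, 0.20], ('good', 'rebuild'): [0.95, 0.05],
     ('poor', 'routine'): [0.30, 0.70], ('poor', 'rebuild'): [0.90, 0.10]}
cost = {('good', 'routine'): 1.0, ('good', 'rebuild'): 5.0,
        ('poor', 'routine'): 2.0, ('poor', 'rebuild'): 6.0}
states, actions = ['good', 'poor'], ['routine', 'rebuild']
pairs = [(s, a) for s in states for a in actions]

# Decision variables x[s,a]: long-run fraction of road-miles in state s
# receiving action a.  Minimise average cost subject to chain balance and
# normalisation; a budget restriction could be added as an inequality row.
c = [cost[p] for p in pairs]
A_eq = []
for j, s in enumerate(states):
    # flow into state s must equal flow out of state s
    A_eq.append([(1.0 if p[0] == s else 0.0) - P[p][j] for p in pairs])
A_eq.append([1.0] * len(pairs))          # frequencies sum to one
b_eq = [0.0] * len(states) + [1.0]
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * len(pairs))
freq = dict(zip(pairs, res.x))           # optimal state-action frequencies
```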


Journal ArticleDOI
David J. Soukup
TL;DR: In this article, the author explores fund-raising strategies for a nonprofit organization to maximize net income by modeling the giving pattern of members as a Markov chain and evaluating the alternatives.
Abstract: The author explores the strategies of fund-raising for a nonprofit organization to maximize net income. The giving pattern of members is modeled as a Markov chain. The alternatives are evaluated us...

13 citations
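
As a rough illustration of the donor-behaviour idea, the fragment below sets up a small Markov chain over giving states and computes the long-run distribution and expected annual gift. The states, transition matrix, and gift amounts are entirely made up and are not Soukup's data or strategy comparison.

```python
import numpy as np

# Hypothetical donor states: 'lapsed', 'small gift', 'large gift'.
P = np.array([[0.70, 0.25, 0.05],    # yearly transition probabilities
              [0.30, 0.55, 0.15],
              [0.20, 0.30, 0.50]])
gift = np.array([0.0, 25.0, 100.0])  # expected gift in each state ($)

# Stationary distribution: left eigenvector of P for eigenvalue 1.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
pi = pi / pi.sum()

expected_annual_gift = pi @ gift     # long-run average gift per member
```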



ReportDOI
01 Mar 1983
TL;DR: In this article, the authors give an overview of some recent developments in optimal stochastic control theory, and discuss the case of continuously acting control, in which at each time t a control u_t is applied to the system.
Abstract: The purpose of this article is to give an overview of some recent developments in optimal stochastic control theory. Broadly speaking, stochastic control theory deals with models of systems whose evolution is affected both by certain random influences and also by certain inputs chosen by a controller. The authors are concerned here only with state-space formulations of control problems in continuous time. Moreover, the authors consider only Markovian control problems in which the state x_t of the process being controlled is Markov provided the controller follows a Markov control policy. They mainly discuss the case of continuously acting control, in which at each time t a control u_t is applied to the system.

Book ChapterDOI
01 Jan 1983
TL;DR: In this article, the authors prove that in any total reward countable state Markov decision process there exists a Markov strategy π which is uniformly nearly-optimal in the following sense: v(i,π) ≥ v*(i) − ε − εu*(i) for any initial state i.
Abstract: In this paper the following result is proved. In any total reward countable state Markov decision process a Markov strategy π exists which is uniformly nearly-optimal in the following sense: v(i,π) ≥ v*(i) − ε − εu*(i) for any initial state i. Here v* denotes the value function of the process and u* denotes the value of the process if all negative rewards are neglected.

Journal ArticleDOI
TL;DR: In this article, an adaptive policy and a learning policy for Markov Decision Processes with uncertain transition matrices are defined; in the average case a non-Bayesian analysis of the model is studied and an optimal adaptive policy is constructed.
Abstract: This study is concerned with Markov Decision Processes with uncertain transition matrices. In the discounted case, the Bayesian analysis of this model is studied. We define an adaptive policy and a learning policy and show that there exists, for any ε > 0, an ε-optimal and learning policy. In the average case, the non-Bayesian analysis of this model is studied and an optimal adaptive policy is constructed.
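
One standard way to make a Bayesian treatment of an uncertain transition matrix concrete is a Dirichlet prior over each row, updated from observed transitions, with the adaptive policy acting on the posterior mean. The sketch below is a generic illustration of that idea under invented class and method names; it is not the specific construction or policies of the paper.

```python
import numpy as np

class DirichletTransitionModel:
    """Posterior over each row of an unknown transition matrix P(s' | s, a)."""

    def __init__(self, n_states, n_actions, prior=1.0):
        # symmetric Dirichlet prior pseudo-counts for every (s, a) row
        self.counts = np.full((n_states, n_actions, n_states), prior)

    def update(self, s, a, s_next):
        """Record one observed transition (s, a) -> s_next."""
        self.counts[s, a, s_next] += 1.0

    def posterior_mean(self):
        """Point estimate of P used by a certainty-equivalent adaptive policy."""
        return self.counts / self.counts.sum(axis=2, keepdims=True)

model = DirichletTransitionModel(n_states=3, n_actions=2)
model.update(0, 1, 2)                 # observed: state 0, action 1, next state 2
P_hat = model.posterior_mean()
```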

Journal ArticleDOI
TL;DR: In this article, the authors consider continuous-time Markov decision processes where decisions can be made at any time and show that there exists a monotone optimal policy among all the regular policies.
Abstract: By considering continuous-time Markov decision processes where decisions can be made at any time, we show in the case of M/M/1 queues with discounted costs that there exists a monotone optimal policy among all the regular policies.

Book ChapterDOI
01 Jan 1983
TL;DR: Operational planning in a general purpose ship terminal is treated; as a check, simulation is used, which leads to an iterative aggregation-disaggregation approach.
Abstract: Operational planning in a general purpose ship terminal is treated. The decisions to be taken concern the weekly manpower capacity and the assignment of manpower and equipment to ships. As a Markov decision problem the model is very large and aggregation is desirable. As a check, simulation is used, which leads to an iterative aggregation-disaggregation approach.

Proceedings ArticleDOI
01 Dec 1983
TL;DR: A discrete-time model is presented for a system of two queues that compete for the service attention of a single server with infinite buffer capacity, and a fixed prioritization scheme is shown to be optimal under both the expected long-run average criterion and the expected discounted criterion.
Abstract: A discrete-time model is presented for a system of two queues that compete for the service attention of a single server with infinite buffer capacity. The arrivals are modelled by an i.i.d. random sequence of a general type while the service completions are generated by independent Bernoulli streams, and the allocation of service attention is governed by feedback policies which are based on past decisions and buffer content histories. The cost of operation per unit time is a linear function of the queue sizes. Under the model assumptions, a fixed prioritization scheme, known as the µc-rule, is shown to be optimal when the expected long-run average criterion and the expected discounted criterion, over both finite and infinite horizons, are used. The analysis is based on the Dynamic Programming methodology for Markov decision processes and takes advantage of the sample path properties of the adopted state-space model.
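
The µc-rule itself is simple to state and simulate: among nonempty queues, always serve the one with the largest product of service-completion probability µ and holding cost c. The toy simulation below is a generic illustration under Bernoulli service and Bernoulli arrivals with invented parameters; it is not the authors' proof or their exact arrival model.

```python
import random

def mu_c_rule(queue_lengths, mu, c):
    """Return the index of the nonempty queue with the largest mu*c, or None."""
    candidates = [i for i, q in enumerate(queue_lengths) if q > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda i: mu[i] * c[i])

def simulate(horizon=10_000, arrival_p=(0.2, 0.3), mu=(0.6, 0.5), c=(2.0, 1.0)):
    """Discrete-time two-queue system served according to the mu-c rule."""
    q = [0, 0]
    total_cost = 0.0
    for _ in range(horizon):
        total_cost += c[0] * q[0] + c[1] * q[1]    # linear holding cost
        served = mu_c_rule(q, mu, c)
        if served is not None and random.random() < mu[served]:
            q[served] -= 1                          # Bernoulli service completion
        for i in range(2):                          # independent arrivals
            if random.random() < arrival_p[i]:
                q[i] += 1
    return total_cost / horizon

average_cost = simulate()
```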

Journal ArticleDOI
TL;DR: In this paper, an operational method for solving dynamic programs can be used, in some cases, to solve the problem of maximizing a firm's market value, formulated as a Markov decision problem that can be solved via linear programming.
Abstract: This paper shows how an operational method for solving dynamic programs can be used, in some cases, to solve the problem of maximizing a firm's market value. The problem is formulated as a Markov decision problem that can be solved via linear programming. The paper shows how to calculate or estimate the state-contingent prices that are used to value the firm. In addition, the paper points out how states can be aggregated to make the solution technique more practical. The paper's final section contains a specific example.
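
The linear-programming route mentioned here is, in its generic form, the standard primal LP for a discounted MDP: minimise a sum of state values subject to v(s) ≥ r(s,a) + β Σ p(s'|s,a) v(s') for every state-action pair. The sketch below uses invented data and scipy purely for illustration; the state-contingent prices and aggregation scheme of the paper are not reproduced.

```python
import numpy as np
from scipy.optimize import linprog

# Invented data: 2 states, 2 actions, discount factor beta.
beta = 0.95
P = {0: np.array([[0.9, 0.1], [0.2, 0.8]]),        # P[a][s, s']
     1: np.array([[0.6, 0.4], [0.5, 0.5]])}
r = {0: np.array([1.0, 0.5]), 1: np.array([0.8, 1.2])}   # r[a][s]
n = 2

# Primal LP: minimise sum_s v(s)  s.t.  v(s) >= r(s,a) + beta * (P[a] v)(s),
# written as  -(I - beta * P[a]) v <= -r[a]  for every action a.
c = np.ones(n)
A_ub = np.vstack([-(np.eye(n) - beta * P[a]) for a in (0, 1)])
b_ub = np.concatenate([-r[a] for a in (0, 1)])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * n)
v_star = res.x
# With the default HiGHS solver, res.ineqlin.marginals holds the dual values
# of the (s, a) constraints (discounted state-action frequencies).
```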

Journal ArticleDOI
TL;DR: In this article, the authors consider how partially observable Markov decision processes may be transformed into piecewise linear ones, which have many advantages in that they are easily represented in a computer.
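
Piecewise-linear structure in partially observable problems usually refers to value functions over the belief simplex represented by a finite set of "alpha vectors", V(b) = max over α of α·b. Whether or not that is exactly the transformation studied here, the tiny fragment below illustrates why such a representation is easy to store and evaluate in a computer; the numbers are made up.

```python
import numpy as np

# A piecewise-linear convex value function over beliefs in the 2-simplex,
# stored as a finite set of alpha vectors (one linear piece each).
alpha_vectors = np.array([[3.0, 0.0, 1.0],
                          [1.0, 2.0, 1.5],
                          [0.5, 0.5, 4.0]])   # invented numbers

def value(belief, alphas=alpha_vectors):
    """V(b) = max over alpha vectors of the inner product alpha . b."""
    return float(np.max(alphas @ belief))

b = np.array([0.2, 0.5, 0.3])    # a belief state (entries sum to one)
print(value(b))                  # evaluates the piecewise-linear function at b
```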

Proceedings ArticleDOI
29 Aug 1983
TL;DR: A method based upon the policy iteration technique for Markov decision processes is used to obtain the optimal delayed resolution policy from the class of stationary delayed resolution policies for given values of the parameters.
Abstract: A class of policies called stationary delayed resolution policies have been proposed recently for sharing a finite number of buffers at a store-and-forward node in a message switching network [9]. It has been shown that with respect to the total weighted throughput these policies comprise the optimal class of policies. In this paper, we present methods to obtain an optimal policy from the class of stationary delayed resolution policies for given values of the parameters. A method based upon the policy iteration technique for Markov decision processes is used to obtain the optimal delayed resolution policy. It is shown that the policy iteration technique, while useful in obtaining the exact optimal policy, becomes intractable for practical values of buffer sizes and number of message classes. A class of policies called SRS delayed resolution policies is proposed. It is shown that the best SRS delayed resolution policies closely approximate the performance of the optimal delayed resolution policies.
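
For reference, the policy-iteration technique invoked here alternates exact policy evaluation (solving a linear system) with greedy policy improvement. The sketch below is a generic discounted-reward version with invented data; the paper applies the technique to its buffer-sharing model, which is not reproduced, and the intractability it reports comes from the size of that state space.

```python
import numpy as np

def policy_iteration(P, R, gamma=0.9):
    """Howard-style policy iteration for a finite discounted MDP.

    P[a]: |S| x |S| transition matrix, R[a]: expected rewards, per action a.
    """
    n = P[0].shape[0]
    policy = np.zeros(n, dtype=int)
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
        P_pi = np.array([P[policy[s]][s] for s in range(n)])
        r_pi = np.array([R[policy[s]][s] for s in range(n)])
        v = np.linalg.solve(np.eye(n) - gamma * P_pi, r_pi)
        # Policy improvement: act greedily with respect to v.
        Q = np.array([R[a] + gamma * P[a] @ v for a in range(len(P))])
        new_policy = Q.argmax(axis=0)
        if np.array_equal(new_policy, policy):
            return policy, v
        policy = new_policy

# Invented two-state, two-action example.
P = [np.array([[0.7, 0.3], [0.4, 0.6]]),
     np.array([[0.9, 0.1], [0.2, 0.8]])]
R = [np.array([0.0, 1.0]), np.array([0.5, 0.8])]
policy, v = policy_iteration(P, R)
```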

Journal ArticleDOI
TL;DR: In this paper, the LP formulation for an undiscounted multi-chain Markov decision problem can be put in a block upper-triangular form by a polynomial-time procedure.

01 Jan 1983
TL;DR: In this paper, it was shown that for the positive and gambling cases such strategies cannot be constructed by simply switching to a "better" action or gamble at each successive return to a state.
Abstract: In every finite-state leavable gambling problem and in every finite-state Markov decision process with discounted, negative or positive reward criteria there exists a Markov strategy which is monotonically improving and optimal in the limit along every history. An example is given to show that for the positive and gambling cases such strategies cannot be constructed by simply switching to a "better" action or gamble at each successive return to a state. Key words and phrases: gambling problem, Markov decision process, strategy, stationary strategy, monotonically improving strategy, limit-optimal strategy.

Book ChapterDOI
01 Jan 1983
TL;DR: In this article, a simplex-like algorithm over the field of rational functions in ρ is used to solve the ρ-optimality problem simultaneously for all ρ near enough to 0, thereby producing a Blackwell-optimal policy.
Abstract: In a finite Markov decision process a ρ-optimal policy (ρ being the interest rate) can be found for fixed ρ by solving a linear programming problem. We solve such problems simultaneously for all ρ near enough to 0 by considering the problem not in the field of real numbers but in that of rational functions in ρ and applying a simplex-like algorithm in that field. This finite algorithm produces a Blackwell-optimal policy, its total discounted reward as a rational function in ρ, and also the interval (0, ρ0] in which that policy is ρ-optimal.
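
The idea of computing in the field of rational functions in ρ can be illustrated with a computer-algebra system: for a fixed stationary policy, the total discounted reward v = (I − (1+ρ)⁻¹P)⁻¹ r is a vector of rational functions of ρ, which can then be examined symbolically near ρ = 0. The fragment below (sympy, invented two-state data) shows only this evaluation step, not the simplex-like algorithm of the paper.

```python
import sympy as sp

rho = sp.symbols('rho', positive=True)
beta = 1 / (1 + rho)                     # discount factor for interest rate rho

# Invented two-state chain under one fixed stationary policy.
P = sp.Matrix([[sp.Rational(1, 2), sp.Rational(1, 2)],
               [sp.Rational(1, 4), sp.Rational(3, 4)]])
r = sp.Matrix([1, 2])

# Total discounted reward as a vector of rational functions of rho.
v = sp.simplify((sp.eye(2) - beta * P).inv() * r)

# Behaviour near rho = 0, the regime relevant for Blackwell optimality
# (the leading 1/rho term carries the average reward).
print(sp.series(v[0], rho, 0, 2))
```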

01 Jan 1983
TL;DR: It is demonstrated that the relation between modelling and numerical analysis for Markov decision processes is very close, since numerical possibilities strongly depend on the structure of the model, both for straightforward numerical techniques and for numerical analysis based on aggregation and/or decomposition.
Abstract: The main topic of the paper is the relation between modelling and numerical analysis for Markov decision processes. It is demonstrated that the relation is very close, since numerical possibilities strongly depend on the structure of the model. This is true even for straightforward numerical techniques, and all the more so for numerical analysis based on aggregation and/or decomposition. Examples amplify the arguments.


Journal ArticleDOI
TL;DR: In this article, sufficient conditions for certain functions to be convex are presented; since a convex function takes its maximum at an extreme point, the conditions may greatly simplify a problem.
Abstract: The paper presents sufficient conditions for certain functions to be convex. Functions of this type often appear in Markov decision processes, where their maximum is the solution of the problem. Since a convex function takes its maximum at an extreme point, the conditions may greatly simplify a problem. In some cases a full solution may be obtained after the reduction is made. Some illustrative examples are discussed. Keywords: optimal policy; convex function.

Book ChapterDOI
01 Jan 1983
TL;DR: A finite semi-Markov decision process is studied to maximize the expected average reward and convergence results are stated in the form of theorems and some examples are given.
Abstract: A finite semi-Markov decision process is studied to maximize the expected average reward. The semi-Markov kernel of the process depends on an unknown parameter taking values in a subset [a, b] of ℝ^S. A controller modelled as a learning automaton updates sequentially the probabilities of generating decisions based on the observed decisions, states, and jump times. Convergence results are stated in the form of theorems and some examples are given.
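
A learning automaton of the kind described maintains a probability vector over the available decisions and nudges it after each observed outcome. The linear reward-inaction update below is one classical scheme, shown only as a generic illustration with invented names and parameters; the paper's precise updating rule and the conditions of its convergence theorems are not reproduced here.

```python
import numpy as np

class LinearRewardInactionAutomaton:
    """Keeps action probabilities p and reinforces actions that performed well."""

    def __init__(self, n_actions, step=0.05):
        self.p = np.full(n_actions, 1.0 / n_actions)
        self.step = step

    def choose(self, rng=None):
        if rng is None:
            rng = np.random.default_rng()
        return rng.choice(len(self.p), p=self.p)

    def update(self, action, normalized_reward):
        """normalized_reward in [0, 1]; move p towards the chosen action."""
        bonus = self.step * normalized_reward
        self.p = (1.0 - bonus) * self.p
        self.p[action] += bonus          # probabilities still sum to one

automaton = LinearRewardInactionAutomaton(n_actions=3)
a = automaton.choose()
automaton.update(a, normalized_reward=0.7)
```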