List of notations; Preface to the second edition; Preface to the first edition; 1. Stochastic programming models; 2. Two-stage problems; 3. Multistage problems; 4. Optimization models with probabilistic constraints; 5. Statistical inference; 6. Risk averse optimization; 7. Background material; 8. Bibliographical remarks; Bibliography; Index.

Lectures on Stochastic Programming: Modeling and Theory

Introduction to linear and nonlinear programming

This exciting and pioneering overview of multiagent systems, which are systems composed of multiple interacting intelligent agents (e.g., in online trading), offers a computer science perspective on the subject while integrating ideas from operations research, game theory, economics, logic, and even philosophy and linguistics. The authors emphasize foundations to create a broad and rigorous treatment of their subject, with thorough presentations of distributed problem solving, game theory, multiagent communication and learning, social choice, mechanism design, auctions, cooperative game theory, and modal logics of knowledge and belief. For each topic, basic concepts are introduced, examples are given, proofs of key results are offered, and algorithmic considerations are examined. An appendix covers background material in probability theory, classical logic, Markov decision processes and mathematical programming. Written by two of the leading researchers of this engaging field, this book will surely serve as THE reference for researchers in the fastest-growing area of computer science, and be used as a text for advanced undergraduate or graduate courses.

Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations

This report presents a unified approach to the study of constrained Markov decision processes with a countable state space and unbounded costs. We consider a single controller with several objectives; the goal is to design a controller that minimizes one cost objective subject to inequality constraints on the other cost objectives. The objectives that we study are the expected average cost and the expected total cost (of which the discounted cost is a special case). We provide two frameworks: the case where costs are bounded below, and the contracting framework. We characterize the set of achievable expected occupation measures as well as performance vectors. This allows us to reduce the original dynamic control problem to an infinite linear program. We present a Lagrangian approach that enables us to obtain sensitivity analysis. In particular, we obtain asymptotic results for the constrained control problem: convergence of both the value and the policies in the time horizon and in the discount factor. Finally, we present several state truncation algorithms that make it possible to approximate the solution of the original control problem via finite linear programs.
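
As a rough illustration of the reduction to linear programming mentioned above, the following sketch writes out the finite-state, discounted special case. The notation is an assumption introduced here, not taken from the report: \rho is the discounted occupation measure, c the cost to be minimized, d_k and V_k the constrained costs and their bounds, \beta the discount factor, P the transition kernel, and \alpha the initial distribution. With a countable state space the same program has infinitely many variables and constraints.

```latex
\begin{align*}
\min_{\rho \ge 0}\quad & \sum_{x,a} \rho(x,a)\, c(x,a) \\
\text{s.t.}\quad & \sum_{x,a} \rho(x,a)\, d_k(x,a) \le V_k, \qquad k = 1,\dots,K, \\
& \sum_{a} \rho(y,a) - \beta \sum_{x,a} \rho(x,a)\, P(y \mid x,a) = \alpha(y) \qquad \text{for all states } y.
\end{align*}
```

Under these assumptions a feasible \rho corresponds to a stationary policy that chooses action a in state x with probability \rho(x,a) / \sum_{a'} \rho(x,a') wherever the denominator is positive.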

Constrained Markov Decision Processes

Aimed at senior undergraduate and graduate students in chemical engineering and technology (materials science), physical chemistry and related fields, this text tackles fundamental colloidal issues, illustrating them with the relevant experimental techniques. It shows the connection between the molecular world and the colloidal domain, and how molecular interactions lead to colloidal behaviour. It focuses on association colloids rather than on the more traditional lyophobic colloidal systems.

The Colloidal Domain: Where Physics, Chemistry, Biology, and Technology Meet

1 Introduction.- 1.0 Background.- 1.1 Raison d'Etre and Limitations.- 1.2 A Menu of Courses and Prerequisites.- 1.3 For the Cognoscenti.- 1.4 Style and Nomenclature.- I Mathematical Programming Perspective.- 2 Markov Decision Processes: The Noncompetitive Case.- 2.0 Introduction.- 2.1 The Summable Markov Decision Processes.- 2.2 The Finite Horizon Markov Decision Process.- 2.3 Linear Programming and the Summable Markov Decision Models.- 2.4 The Irreducible Limiting Average Process.- 2.5 Application: The Hamiltonian Cycle Problem.- 2.6 Behavior and Markov Strategies.- 2.7 Policy Improvement and Newton's Method in Summable MDPs.- 2.8 Connection Between the Discounted and the Limiting Average Models.- 2.9 Linear Programming and the Multichain Limiting Average Process.- 2.10 Bibliographic Notes.- 2.11 Problems.- 3 Stochastic Games via Mathematical Programming.- 3.0 Introduction.- 3.1 The Discounted Stochastic Games.- 3.2 Linear Programming and the Discounted Stochastic Games.- 3.3 Modified Newton's Method and the Discounted Stochastic Games.- 3.4 Limiting Average Stochastic Games: The Issues.- 3.5 Zero-Sum Single-Controller Limiting Average Game.- 3.6 Application: The Travelling Inspector Model.- 3.7 Nonlinear Programming and Zero-Sum Stochastic Games.- 3.8 Nonlinear Programming and General-Sum Stochastic Games.- 3.9 Shapley's Theorem via Mathematical Programming.- 3.10 Bibliographic Notes.- 3.11 Problems.- II Existence, Structure and Applications.- 4 Summable Stochastic Games.- 4.0 Introduction.- 4.1 The Stochastic Game Model.- 4.2 Transient Stochastic Games.- 4.2.1 Stationary Strategies.- 4.2.2 Extension to Nonstationary Strategies.- 4.3 Discounted Stochastic Games.- 4.3.1 Introduction.- 4.3.2 Solutions of Discounted Stochastic Games.- 4.3.3 Structural Properties.- 4.3.4 The Limit Discount Equation.- 4.4 Positive Stochastic Games.- 4.5 Total Reward Stochastic Games.- 4.6 Nonzero-Sum Discounted Stochastic Games.- 4.6.1 Existence of Equilibrium Points.- 4.6.2 A Nonlinear Complementarity Problem.- 4.6.3 Perfect Equilibrium Points.- 4.7 Bibliographic Notes.- 4.8 Problems.- 5 Average Reward Stochastic Games.- 5.0 Introduction.- 5.1 Irreducible Stochastic Games.- 5.2 Existence of the Value.- 5.3 Stationary Strategies.- 5.4 Equilibrium Points.- 5.5 Bibliographic Notes.- 5.6 Problems.- 6 Applications and Special Classes of Stochastic Games.- 6.0 Introduction.- 6.1 Economic Competition and Stochastic Games.- 6.2 Inspection Problems and Single-Control Games.- 6.3 The Presidency Game and Switching-Control Games.- 6.4 Fishery Games and AR-AT Games.- 6.5 Applications of SER-SIT Games.- 6.6 Advertisement Models and Myopic Strategies.- 6.7 Spend and Save Games and the Weighted Reward Criterion.- 6.8 Bibliographic Notes.- 6.9 Problems.- Appendix G Matrix and Bimatrix Games and Mathematical Programming.- G.1 Introduction.- G.2 Matrix Game.- G.3 Linear Programming.- G.4 Bimatrix Games.- G.5 Mangasarian-Stone Algorithm for Bimatrix Games.- G.6 Bibliographic Notes.- Appendix H A Theorem of Hardy and Littlewood.- H.1 Introduction.- H.2 Preliminaries, Results and Examples.- H.3 Proof of the Hardy-Littlewood Theorem.- Appendix M Markov Chains.- M.1 Introduction.- M.2 Stochastic Matrix.- M.3 Invariant Distribution.- M.4 Limit Discounting.- M.5 The Fundamental Matrix.- M.6 Bibliographic Notes.- Appendix P Complex Varieties and the Limit Discount Equation.- P.1 Background.- P.2 Limit Discount Equation as a Set of Simultaneous Polynomials.- P.3 Algebraic and Analytic Varieties.- P.4 Solution of the Limit Discount Equation via Analytic Varieties.- References.

Competitive Markov decision processes

We consider finite-state, finite-action stochastic games over an infinite time horizon. We survey algorithms for the computation of minimax optimal stationary strategies in the zero-sum case, and of Nash equilibria in stationary strategies in the nonzero-sum case. We also survey those theoretical results that pave the way towards future development of algorithms.
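
The abstract does not say which algorithms are surveyed. As one classical example for the zero-sum discounted case, the sketch below applies Shapley's value iteration to a game in which both players have exactly two actions in every state, so each auxiliary matrix game has a closed-form value. The function names, the data layout and the two-action restriction are assumptions made for illustration only.

```python
def value_2x2(m):
    """Value of a 2x2 zero-sum matrix game (row player maximizes)."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        # A pure saddle point exists.
        return maximin
    # No saddle point: both players mix; classical closed form.
    return (a * d - b * c) / (a + d - b - c)


def shapley_iteration(rewards, trans, beta, tol=1e-10, max_iter=10_000):
    """Approximate the discounted value vector of a zero-sum stochastic game.

    rewards[s][i][j]  : payoff to the row player in state s under actions (i, j)
    trans[s][i][j][t] : probability of moving from s to t under actions (i, j)
    beta              : discount factor in [0, 1)
    """
    n = len(rewards)
    v = [0.0] * n
    for _ in range(max_iter):
        new_v = []
        for s in range(n):
            # Auxiliary matrix game: immediate payoff plus discounted continuation.
            aux = [
                [
                    rewards[s][i][j]
                    + beta * sum(trans[s][i][j][t] * v[t] for t in range(n))
                    for j in range(2)
                ]
                for i in range(2)
            ]
            new_v.append(value_2x2(aux))
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


# Hypothetical one-state game: the stage game has value 1.5, so with
# beta = 0.5 the discounted value is 1.5 / (1 - 0.5).
rewards = [[[3.0, 1.0], [0.0, 2.0]]]
trans = [[[[1.0], [1.0]], [[1.0], [1.0]]]]
values = shapley_iteration(rewards, trans, beta=0.5)
```

For larger action sets the closed form no longer applies and each auxiliary game would have to be solved as a linear program instead.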

Algorithms for stochastic games — A survey

We consider a Markov decision process with both the expected limiting average and the discounted total return criteria, appropriately modified to include a penalty for the variability in the stream of rewards. In both cases we formulate appropriate nonlinear programs in the space of state-action frequencies (averaged or discounted), whose optimal solutions are shown to be related to the optimal policies in the corresponding “variance-penalized MDP.” The analysis of one of the discounted cases is facilitated by the introduction of a “Cartesian product of two independent MDPs.”
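
To make the idea of a variance penalty over state-action frequencies concrete, the sketch below evaluates one plausible form of such an objective for the limiting average case: mean reward minus a multiple of the reward variance, both taken under a given frequency vector. The exact functional used in the paper may differ; the function name, the dictionary layout and the penalty weight `lam` are assumptions made here.

```python
def variance_penalized_objective(x, r, lam):
    """Mean reward minus lam times reward variance under frequencies x.

    x   : dict mapping (state, action) to its long-run frequency (sums to 1)
    r   : dict mapping (state, action) to the one-step reward
    lam : nonnegative penalty weight (hypothetical parameter)
    """
    mean = sum(x[k] * r[k] for k in x)
    second_moment = sum(x[k] * r[k] ** 2 for k in x)
    variance = second_moment - mean ** 2
    return mean - lam * variance


# Hypothetical frequencies: half the time reward 0, half the time reward 2.
x = {("s0", "a"): 0.5, ("s1", "a"): 0.5}
r = {("s0", "a"): 0.0, ("s1", "a"): 2.0}
risk_neutral = variance_penalized_objective(x, r, lam=0.0)
penalized = variance_penalized_objective(x, r, lam=1.0)
```

Because the variance term contains the square of a linear function of x, the objective is no longer linear in the frequencies, which is why a nonlinear program arises.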

Variance-Penalized Markov Decision Processes

We introduce the time-consistency concept that is inspired by the so-called “principle of optimality” of dynamic programming and demonstrate – via an example – that the conditional value-at-risk (CVaR) need not be time-consistent in a multi-stage case. Then, we give the formulation of the target-percentile risk measure which is time-consistent and hence more suitable in the multi-stage investment context. Finally, we also generalize the value-at-risk and CVaR to multi-stage risk measures based on the theory and structure of the target-percentile risk measure.
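
For readers unfamiliar with the single-period quantities that the paper extends, the sketch below computes value-at-risk and CVaR for a discrete loss distribution. CVaR uses the Rockafellar-Uryasev representation CVaR_alpha = min over t of t + E[(L - t)+] / (1 - alpha), whose minimum for a discrete distribution is attained at one of the atoms. This is only the static building block; it does not reproduce the paper's time-consistency example or the target-percentile measure. Function names are assumptions.

```python
def value_at_risk(losses, probs, alpha):
    """Smallest loss level whose cumulative probability reaches alpha."""
    cumulative = 0.0
    for loss, p in sorted(zip(losses, probs)):
        cumulative += p
        if cumulative >= alpha:
            return loss
    # Guard against rounding when the probabilities sum to slightly under 1.
    return max(losses)


def conditional_value_at_risk(losses, probs, alpha):
    """CVaR at level alpha (0 <= alpha < 1) of a discrete loss distribution."""
    def objective(t):
        expected_excess = sum(p * max(l - t, 0.0) for l, p in zip(losses, probs))
        return t + expected_excess / (1.0 - alpha)

    # The objective is convex and piecewise linear with kinks at the atoms,
    # so checking the atoms is enough to find the minimum.
    return min(objective(t) for t in losses)


# Hypothetical loss distribution: loss 0 with probability 0.9, loss 10 otherwise.
losses = [0.0, 10.0]
probs = [0.9, 0.1]
var_95 = value_at_risk(losses, probs, 0.95)
cvar_95 = conditional_value_at_risk(losses, probs, 0.95)
cvar_50 = conditional_value_at_risk(losses, probs, 0.5)
```

Here `cvar_50` averages the worst half of the outcomes, which mixes the loss of 10 (probability 0.1) with the loss of 0 (probability 0.4).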

Time consistent dynamic risk measures

Addresses the following basic feasibility problem for infinite-horizon Markov decision processes (MDPs): can a policy be found that achieves a specified value (target) of the long-run limiting average reward at a specified probability level (percentile)? Related optimization problems of maximizing the target for a specified percentile and vice versa are also considered. The authors present a complete (and discrete) classification of both the maximal achievable target levels and of their corresponding percentiles. The authors also provide an algorithm for computing a deterministic policy corresponding to any feasible target-percentile pair. Next the authors consider similar problems for an MDP with multiple rewards and/or constraints. This case presents some difficulties and leads to several open problems. An LP-based formulation provides constructive solutions for most cases.
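
The target-percentile question can be illustrated for a single fixed stationary policy on a finite MDP. Under such a policy the long-run average reward settles, with probability one, at the gain of whichever recurrent class the process ends up in, so the percentile achieved for a target is the total probability of reaching classes whose gain meets it. The sketch below assumes the class gains and the probabilities of reaching each class are already known; it does not search over policies as the paper's algorithm does, and all names are hypothetical.

```python
def percentile_for_target(class_gains, class_probs, target):
    """Probability that the long-run average reward is at least `target`."""
    return sum(p for g, p in zip(class_gains, class_probs) if g >= target)


def max_target_for_percentile(class_gains, class_probs, percentile):
    """Largest class gain that is met with probability at least `percentile`.

    Returns None if no target among the class gains is feasible.
    """
    feasible = [
        g for g in class_gains
        if percentile_for_target(class_gains, class_probs, g) >= percentile
    ]
    return max(feasible) if feasible else None


# Hypothetical policy: reaches a class with gain 1 w.p. 0.25, gain 3 w.p. 0.75.
gains = [1.0, 3.0]
reach_probs = [0.25, 0.75]
p_at_2 = percentile_for_target(gains, reach_probs, 2.0)
best_at_75 = max_target_for_percentile(gains, reach_probs, 0.75)
best_at_80 = max_target_for_percentile(gains, reach_probs, 0.8)
```

Only the class gains need to be tried as targets, since the achieved percentile changes only at those values; this matches the discrete character of the classification described in the abstract.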

Jerzy A. Filar

Papers

Competitive Markov decision processes

Algorithms for stochastic games — A survey

Variance-Penalized Markov Decision Processes

Time consistent dynamic risk measures

Percentile performance criteria for limiting average Markov decision processes