A Linearly Relaxed Approximate Linear Program for Markov Decision Processes
Citations
108 citations
44 citations
Cites background or methods from "A Linearly Relaxed Approximate Line..."
...Others have studied various approaches to compress the large constraint set into a smaller one (Taylor and Parr, 2012; Lakshminarayanan et al., 2017)....
43 citations
Cites methods from "A Linearly Relaxed Approximate Line..."
...The approximate algorithm can be viewed as a constraint sampling procedure (De Farias and Van Roy, 2004; Lakshminarayanan et al., 2017) for the dual LP....
28 citations
21 citations
Cites background or methods from "A Linearly Relaxed Approximate Line..."
...Our method is based on a subtle variation on the classic LP formulation of optimal control in MDPs due to Manne (1960). One key element in our formulation is a linear relaxation of some of the constraints in this LP, a technique with a long history: a similar relaxation was first proposed by Schweitzer and Seidmann (1985), whose approach was later popularized by the influential work of de Farias and Van Roy (2003)....
...This latter paper initiated a long line of work studying the properties of solutions to various linearly relaxed versions of the LP, mostly focusing on the quality of value functions extracted from the solutions (see, e.g., Petrik and Zilberstein, 2009; Desai et al., 2012; Lakshminarayanan et al., 2018)....
References
11,625 citations
5,188 citations
785 citations
Additional excerpts
...In this paper we adopt the framework of discrete-time, discounted MDPs, where a controller steers the stochastically evolving state of a system while receiving rewards that depend on the states visited and actions chosen....
...Approximate linear programming (ALP) and its variants have been widely applied to Markov Decision Processes (MDPs) with a large number of states....
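For orientation, the ALP mentioned in this excerpt and the kind of linear relaxation the paper's title refers to can be sketched as follows. This is a generic textbook-style statement under assumed notation, not a formula quoted from the paper: the state set $S$, action set $A$, reward $g_a(s)$, transition probabilities $p_a(s,s')$, discount factor $\alpha \in (0,1)$, state-relevance weights $c$, feature matrix $\Phi$ with $k$ columns, and nonnegative weight matrix $W$ with $m$ columns are all symbols introduced here for illustration.

```latex
% Exact LP for a discounted MDP (reward formulation), one constraint per (s, a):
\min_{J \in \mathbb{R}^{|S|}} \; c^\top J
\quad \text{s.t.} \quad
J(s) \;\ge\; g_a(s) + \alpha \sum_{s' \in S} p_a(s, s')\, J(s')
\qquad \forall s \in S,\ a \in A.

% ALP: restrict J to the span of the features, J = \Phi r with r \in \mathbb{R}^k.
% The number of variables drops to k, but there are still |S||A| constraints:
\min_{r \in \mathbb{R}^{k}} \; c^\top \Phi r
\quad \text{s.t.} \quad
(\Phi r)(s) \;\ge\; g_a(s) + \alpha \sum_{s' \in S} p_a(s, s')\, (\Phi r)(s')
\qquad \forall s \in S,\ a \in A.

% Linear relaxation (sketch): keep only m nonnegative linear combinations
% of the ALP constraints, with weights W_{(s,a),i} \ge 0:
\sum_{s \in S} \sum_{a \in A} W_{(s,a),i}
\Big[ (\Phi r)(s) - g_a(s) - \alpha \sum_{s' \in S} p_a(s, s')\, (\Phi r)(s') \Big] \;\ge\; 0,
\qquad i = 1, \dots, m.
```

Any $r$ that is feasible for the ALP is also feasible for the relaxed program, since nonnegative combinations of satisfied inequalities remain satisfied. The relaxed feasible set is therefore at least as large, which is why the quality of the resulting value function depends on how the weights are chosen.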
...While the second assumption limits the scope of MDPs that the result can be applied to, the other two assumptions limit the choice of the basis functions....
...The book of [KM12] gives a relatively fresh, algorithm-centered summary of existing methods suitable for planning in MDPs. AI research tends to focus on empirical results through the development of various benchmarks, and little if any effort is devoted to the theoretical understanding of the quality-effort tradeoff exhibited by the various algorithms that are developed in this field....
...Keywords: Markov Decision Processes (MDPs), Approximate Linear Programming (ALP).
I. INTRODUCTION
Markov decision processes (MDPs) have proved to be an indispensable model for sequential decision making under uncertainty, with applications in networking, traffic control, robotics, operations research, business, finance, artificial intelligence, health-care and more (see, e.g., [Whi93; Rus96a; FS02; HY07; SB10; BR11; Put94; LL12; AA+15; BD17])....
643 citations
503 citations