Journal ArticleDOI

Interactive visualization for testing Markov Decision Processes

TLDR
The first visualization targeting MDP testing, MDPvis, is presented, and its generality is shown by connecting it to two reinforcement learning frameworks that implement many different MDPs of interest in the research community.
Abstract
Markov Decision Processes (MDPs) are a formulation for optimization problems in sequential decision making. Solving MDPs often requires implementing a simulator for optimization algorithms to invoke when updating decision-making rules known as policies. The combination of simulator and optimizer is subject to failures of specification, implementation, integration, and optimization that may produce invalid policies. We present these failures as queries for a visual analytic system (MDPVIS). MDPVIS addresses three visualization research gaps. First, the data acquisition gap is addressed through a general simulator-visualization interface. Second, the data analysis gap is addressed through a generalized MDP information visualization. Finally, the cognition gap is addressed by exposing model components to the user. MDPVIS generalizes a visualization for wildfire management. We use that problem to illustrate MDPVIS and to show the visualization's generality by connecting it to two reinforcement learning frameworks that implement many different MDPs of interest in the research community.

Highlights
- Markov decision processes (MDPs) formalize sequential decision optimization problems.
- Complex simulators often implement MDPs and are subject to a variety of bugs.
- Interactive visualizations support testing MDPs and optimization algorithms.
- The first visualization targeting MDP testing, MDPvis, is presented.
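To make the abstract's simulator-optimizer combination concrete, the sketch below shows one shape such a simulator interface could take. It is a hypothetical Python illustration, not MDPVIS's actual interface; the names WildfireSimulator, reset, transition, and rollout are invented for exposition.

    import random

    class WildfireSimulator:
        """Toy MDP simulator: state is remaining fuel, action is suppression effort in [0, 1]."""

        def reset(self):
            # Draw an initial state from the start-state distribution.
            self.fuel = 100.0
            return {"fuel": self.fuel}

        def transition(self, action):
            # Sample a next state and reward given the current state and action.
            burned = random.uniform(0, 10) * (1.0 - action)
            self.fuel = max(0.0, self.fuel - burned)
            reward = -burned - 0.5 * action  # fire damage plus suppression cost
            return {"fuel": self.fuel}, reward

    def rollout(sim, policy, horizon=50):
        """Generate one trajectory; a visualization renders many such rollouts."""
        state, trajectory = sim.reset(), []
        for _ in range(horizon):
            action = policy(state)
            state, reward = sim.transition(action)
            trajectory.append((state, reward))
        return trajectory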


Citations
Proceedings ArticleDOI

Gamut: A Design Probe to Understand How Data Scientists Understand Machine Learning Models

TL;DR: This work investigated why and how professional data scientists interpret models and how interface affordances can support them in answering questions about model interpretability, showing that interpretability is not a monolithic concept.
Proceedings ArticleDOI

A Survey on Interactive Reinforcement Learning: Design Principles and Open Challenges

TL;DR: This paper elucidates the roles played by HCI researchers in interactive RL, identifies promising ideas and research directions, and proposes generic design principles that give researchers a guide to effectively implementing interactive RL applications.
Proceedings ArticleDOI

Personalizable and Interactive Sequence Recommender System

TL;DR: An interactive sequence recommender system (SeRIES) prototype is designed and developed; it uses visualizations to explain and justify the recommendations and provides controls so that users may personalize them.
Journal ArticleDOI

Infrastructure maintenance and replacement optimization under multiple uncertainties and managerial flexibility

TL;DR: This work proposes an optimization approach that incorporates the flexibility to choose among multiple successive intervention strategies, regular asset degradation, structural failure, and multiple price uncertainties, obtaining transition probabilities from existing price data.
References
Book

Dynamic Programming

TL;DR: The more the authors study the information processing aspects of the mind, the more perplexed and impressed they become, and it will be a very long time before they understand these processes sufficiently to reproduce them.
Book

Markov Decision Processes: Discrete Stochastic Dynamic Programming

TL;DR: Puterman provides a uniquely up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models, focusing primarily on infinite horizon, discrete time models with discrete state spaces while also examining models with arbitrary state spaces, finite horizon models, and continuous-time discrete state models.
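As a concrete instance of the discrete-state, infinite-horizon models the book treats, here is a minimal value iteration sketch in Python; the 3-state, 2-action transition and reward tables are invented toy data, not an example from the book.

    import numpy as np

    # P[a][s][s'] is a transition probability, R[s][a] an expected reward (toy data).
    P = np.array([[[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]],
                  [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]]])
    R = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 2.0]])
    gamma = 0.95

    V = np.zeros(3)
    for _ in range(1000):
        # Bellman optimality backup: Q[s, a] = R[s, a] + gamma * sum_s' P[a, s, s'] V[s']
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < 1e-8:  # converged to the fixed point
            break
        V = V_new
    policy = Q.argmax(axis=1)  # greedy policy with respect to the converged values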
Book

Dynamic Programming and Optimal Control

TL;DR: The leading and most up-to-date textbook on the far-ranging algorithmic methodology of Dynamic Programming, which can be used for optimal control, Markovian decision problems, planning and sequential decision making under uncertainty, and discrete/combinatorial optimization.
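The recursion at the heart of this methodology is the Bellman optimality equation, stated here in standard notation (not necessarily the book's own):

    V^*(s) = \max_{a} \Big[ R(s, a) + \gamma \sum_{s'} P(s' \mid s, a)\, V^*(s') \Big]

where \gamma \in [0, 1) is the discount factor; value iteration, as sketched above, applies this backup repeatedly until V converges.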
Proceedings Article

Policy Gradient Methods for Reinforcement Learning with Function Approximation

TL;DR: This paper proves for the first time that a version of policy iteration with arbitrary differentiable function approximation is convergent to a locally optimal policy.
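The paper's central result, the policy gradient theorem, can be written in now-standard notation (which may differ cosmetically from the paper's) as:

    \nabla_\theta J(\theta) = \sum_{s} d^{\pi}(s) \sum_{a} \nabla_\theta \pi_\theta(a \mid s)\, Q^{\pi}(s, a)

where d^{\pi} is the discounted state distribution under the policy \pi_\theta and Q^{\pi} is the state-action value function; the gradient involves no derivative of d^{\pi}, which is what makes the result compatible with function approximation.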
Journal ArticleDOI

What is dynamic programming

TL;DR: Sequence alignment methods often use something called a 'dynamic programming' algorithm, which can be a good idea or a bad idea, depending on the method used.
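As a small illustration of the dynamic programming idea behind sequence alignment, here is the classic edit distance recurrence in Python; this is a generic textbook example, not code from the cited article.

    def edit_distance(a: str, b: str) -> int:
        """Classic DP: d[i][j] = minimum edits turning a[:i] into b[:j]."""
        d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        for i in range(len(a) + 1):
            d[i][0] = i  # delete all of a[:i]
        for j in range(len(b) + 1):
            d[0][j] = j  # insert all of b[:j]
        for i in range(1, len(a) + 1):
            for j in range(1, len(b) + 1):
                cost = 0 if a[i - 1] == b[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,         # deletion
                              d[i][j - 1] + 1,         # insertion
                              d[i - 1][j - 1] + cost)  # match or substitution
        return d[-1][-1]

    print(edit_distance("kitten", "sitting"))  # prints 3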