Journal ArticleDOI
Interactive visualization for testing Markov Decision Processes
Sean McGregor, Hailey Buckingham, Thomas G. Dietterich, Rachel Houtman, Claire A. Montgomery, Ronald Metoyer +5 more
TLDR
The first visualization targeting MDP testing, MDPVIS, is presented, and the visualization's generality is shown by connecting it to two reinforcement learning frameworks that implement many different MDPs of interest in the research community.

Abstract:
Markov Decision Processes (MDPs) are a formulation for optimization problems in sequential decision making. Solving MDPs often requires implementing a simulator for optimization algorithms to invoke when updating decision-making rules known as policies. The combination of simulator and optimizer is subject to failures of specification, implementation, integration, and optimization that may produce invalid policies. We present these failures as queries for a visual analytic system (MDPVIS). MDPVIS addresses three visualization research gaps. First, the data acquisition gap is addressed through a general simulator-visualization interface. Second, the data analysis gap is addressed through a generalized MDP information visualization. Finally, the cognition gap is addressed by exposing model components to the user. MDPVIS generalizes a visualization for wildfire management. We use that problem to illustrate MDPVIS and show the visualization's generality by connecting it to two reinforcement learning frameworks that implement many different MDPs of interest in the research community.

Highlights:
- Markov decision processes (MDPs) formalize sequential decision optimization problems
- Complex simulators often implement MDPs and are subject to a variety of bugs
- Interactive visualizations support testing MDPs and optimization algorithms
- The first visualization targeting MDP testing, MDPVIS, is presented
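The abstract's simulator/policy split can be sketched as follows. This is a minimal toy example, not MDPVIS's actual simulator interface: the two-state dynamics, actions, and rewards are invented purely for illustration.

```python
import random

def simulate(policy, horizon=10, seed=0):
    """Roll out a policy against a toy two-state MDP and return total reward.

    A hypothetical minimal simulator: the optimizer would repeatedly invoke
    a function like this while updating the policy.
    """
    rng = random.Random(seed)
    state = 0
    total_reward = 0.0
    for _ in range(horizon):
        action = policy(state)  # a policy maps the current state to an action
        # Toy stochastic transition: action 1 tends to move toward state 1
        state = 1 if rng.random() < (0.8 if action == 1 else 0.2) else 0
        total_reward += 1.0 if state == 1 else 0.0  # reward for being in state 1
    return total_reward

# A trivial policy that always chooses action 1
print(simulate(lambda s: 1))
```

Because the simulator is the only point of contact between the environment and the optimizer, a bug anywhere in a function like `simulate` silently corrupts every policy learned from it, which is the class of failure the paper's visualization targets.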
Citations
Proceedings ArticleDOI
Gamut: A Design Probe to Understand How Data Scientists Understand Machine Learning Models
TL;DR: This work investigated why and how professional data scientists interpret models and how interface affordances can support them in answering questions about model interpretability, showing that interpretability is not a monolithic concept.
Proceedings ArticleDOI
A Survey on Interactive Reinforcement Learning: Design Principles and Open Challenges
TL;DR: This paper elucidates the roles played by HCI researchers in interactive RL, identifies ideas and promising research directions, and proposes generic design principles that give researchers a guide to effectively implementing interactive RL applications.
Proceedings ArticleDOI
Personalizable and Interactive Sequence Recommender System
TL;DR: An interactive sequence recommender system (SeRIES) prototype is designed and developed that uses visualizations to explain and justify recommendations and provides controls so that users may personalize them.
Journal ArticleDOI
Infrastructure maintenance and replacement optimization under multiple uncertainties and managerial flexibility
TL;DR: This research proposes an optimization approach that incorporates the flexibility to choose between multiple successive intervention strategies, regular asset degradation, structural failure, and multiple price uncertainties, and obtains transition probabilities from existing price data.
References
Book
Dynamic Programming
TL;DR: The more the authors study the information processing aspects of the mind, the more perplexed and impressed they become, and it will be a very long time before they understand these processes sufficiently to reproduce them.
Book
Markov Decision Processes: Discrete Stochastic Dynamic Programming
TL;DR: Puterman provides a uniquely up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models, focusing primarily on infinite-horizon discrete-time models and models with discrete state spaces, while also examining models with arbitrary state spaces, finite-horizon models, and continuous-time discrete-state models.
Book
Dynamic Programming and Optimal Control
TL;DR: The leading and most up-to-date textbook on the far-ranging algorithmic methodology of dynamic programming, which can be used for optimal control, Markovian decision problems, planning and sequential decision making under uncertainty, and discrete/combinatorial optimization.
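The dynamic programming texts above center on the Bellman optimality recursion, which can be sketched as value iteration on a toy problem. The transition table `P` and reward table `R` below are invented for illustration and are not drawn from any of the books listed.

```python
# Value iteration on a toy 2-state, 2-action MDP.
P = {  # P[s][a] = list of (next_state, probability)
    0: {0: [(0, 0.9), (1, 0.1)], 1: [(0, 0.2), (1, 0.8)]},
    1: {0: [(0, 0.5), (1, 0.5)], 1: [(1, 1.0)]},
}
R = {0: {0: 0.0, 1: 0.0}, 1: {0: 1.0, 1: 1.0}}  # R[s][a] = immediate reward
gamma = 0.9  # discount factor

V = {0: 0.0, 1: 0.0}
for _ in range(200):  # repeat the Bellman optimality backup until it converges
    V = {
        s: max(R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
               for a in P[s])
        for s in P
    }
print(V)
```

Each sweep replaces V(s) with the best one-step lookahead value; because the backup is a gamma-contraction, the iterates converge geometrically to the optimal value function.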
Proceedings Article
Policy Gradient Methods for Reinforcement Learning with Function Approximation
TL;DR: This paper proves for the first time that a version of policy iteration with arbitrary differentiable function approximation is convergent to a locally optimal policy.
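The idea behind the policy gradient result can be illustrated with a REINFORCE-style sketch; this is a hedged toy example (a softmax policy over two actions on a bandit with made-up reward means), not the paper's exact algorithm or convergence setting.

```python
import math
import random

rng = random.Random(0)
theta = [0.0, 0.0]           # policy parameters, one per action
reward_mean = [0.2, 0.8]     # assumed for illustration: action 1 is better

def probs(theta):
    """Softmax policy: probability of each action given the parameters."""
    z = [math.exp(t) for t in theta]
    s = sum(z)
    return [x / s for x in z]

alpha = 0.1  # step size
for _ in range(2000):
    p = probs(theta)
    a = 0 if rng.random() < p[0] else 1         # sample an action from the policy
    r = reward_mean[a] + rng.gauss(0, 0.1)      # noisy reward sample
    # Ascend the log-likelihood gradient scaled by reward:
    # d/d theta_i log pi(a) = 1[i == a] - p_i
    for i in range(2):
        theta[i] += alpha * r * ((1.0 if i == a else 0.0) - p[i])

print(probs(theta))  # the better-rewarded action should receive most probability
```

The update nudges the parameters so that actions yielding higher reward become more probable, which is the stochastic-gradient view of policy improvement that the convergence result formalizes.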
Journal ArticleDOI
What is dynamic programming?
TL;DR: Sequence alignment methods often use something called a 'dynamic programming' algorithm, which can be a good idea or a bad idea, depending on the method used.