Showing papers by "Thomas G. Dietterich published in 2015"

PDF

Open Access

Research priorities for robust and beneficial artificial intelligence

[...]

01 Jan 2015

TL;DR: This article gives numerous examples of worthwhile research aimed at ensuring that AI remains robust and beneficial.

...read moreread less

419 citations

Posted Content•

A Meta-Analysis of the Anomaly Detection Problem

[...]

Andrew Emmott, Shubhomoy Das, Thomas G. Dietterich, Alan Fern, Weng-Keen Wong - Show less +1 more

03 Mar 2015-arXiv: Artificial Intelligence

TL;DR: A thorough meta-analysis of the anomaly detection problem is provided, providing an ontology for describing anomaly detection contexts, a methodology for controlling various aspects of benchmark creation, guidelines for future experimental design and a discussion of the many potential pitfalls of trying to measure success in this field.

...read moreread less

Abstract: This article provides a thorough meta-analysis of the anomaly detection problem. To accomplish this we first identify approaches to benchmarking anomaly detection algorithms across the literature and produce a large corpus of anomaly detection benchmarks that vary in their construction across several dimensions we deem important to real-world applications: (a) point difficulty, (b) relative frequency of anomalies, (c) clusteredness of anomalies, and (d) relevance of features. We apply a representative set of anomaly detection algorithms to this corpus, yielding a very large collection of experimental results. We analyze these results to understand many phenomena observed in previous work. First we observe the effects of experimental design on experimental results. Second, results are evaluated with two metrics, ROC Area Under the Curve and Average Precision. We employ statistical hypothesis testing to demonstrate the value (or lack thereof) of our benchmarks. We then offer several approaches to summarizing our experimental results, drawing several conclusions about the impact of our methodology as well as the strengths and weaknesses of some algorithms. Last, we compare results against a trivial solution as an alternate means of normalizing the reported performance of algorithms. The intended contributions of this article are many; in addition to providing a large publicly-available corpus of anomaly detection benchmarks, we provide an ontology for describing anomaly detection contexts, a methodology for controlling various aspects of benchmark creation, guidelines for future experimental design and a discussion of the many potential pitfalls of trying to measure success in this field.

...read moreread less

72 citations

Journal Article•DOI•

Rise of concerns about AI: reflections and directions

[...]

Thomas G. Dietterich¹, Eric Horvitz²•Institutions (2)

Oregon State University¹, Microsoft²

28 Sep 2015-Communications of The ACM

TL;DR: Research, leadership, and communication about AI futures will help shape the future of science, technology, and society.

...read moreread less

Abstract: Research, leadership, and communication about AI futures.

...read moreread less

61 citations

Journal Article•DOI•

Letter to the Editor: Research Priorities for Robust and Beneficial Artificial Intelligence: An Open Letter

[...]

Stuart Russell¹, Thomas G. Dietterich², Eric Horvitz³, Bart Selman⁴, Francesca Rossi⁵, Demis Hassabis, Shane Legg, Mustafa Suleyman, Dileep George, D. Scott Phoenix - Show less +6 more•Institutions (5)

University of California, Berkeley¹, Oregon State University², Microsoft³, Cornell University⁴, University of Padua⁵

31 Dec 2015-Ai Magazine

TL;DR: It is believed that research on how to make AI systems robust and beneficial is both important and timely, and that there are concrete research directions that can be pursued today.

...read moreread less

Abstract: Artificial intelligence (AI) research has explored a variety of problems and approaches since its inception, but for the last 20 years or so has been focused on the problems surrounding the construction of intelligent agents — systems that perceive and act in some environment. In this context, "intelligence" is related to statistical and economic notions of rationality — colloquially, the ability to make good decisions, plans, or inferences. The adoption of probabilistic and decision-theoretic representations and statistical learning methods has led to a large degree of integration and cross-fertilization among AI, machine learning, statistics, control theory, neuroscience, and other fields. The establishment of shared theoretical frameworks, combined with the availability of data and processing power, has yielded remarkable successes in various component tasks such as speech recognition, image classification, autonomous vehicles, machine translation, legged locomotion, and question-answering systems. As capabilities in these areas and others cross the threshold from laboratory research to economically valuable technologies, a virtuous cycle takes hold whereby even small improvements in performance are worth large sums of money, prompting greater investments in research. There is now a broad consensus that AI research is progressing steadily, and that its impact on society is likely to increase. The potential benefits are huge, since everything that civilization has to offer is a product of human intelligence; we cannot predict what we might achieve when this intelligence is magnified by the tools AI may provide, but the eradication of disease and poverty are not unfathomable. Because of the great potential of AI, it is important to research how to reap its benefits while avoiding potential pitfalls. The progress in AI research makes it timely to focus research not only on making AI more capable, but also on maximizing the societal benefit of AI. Such considerations motivated the AAAI 2008–09 Presidential Panel on Long-Term AI Futures and other projects on AI impacts, and constitute a significant expansion of the field of AI itself, which up to now has focused largely on techniques that are neutral with respect to purpose. We recommend expanded research aimed at ensuring that increasingly capable AI systems are robust and beneficial: our AI systems must do what we want them to do. The attached research priorities document [see page X] gives many examples of such research directions that can help maximize the societal benefit of AI. This research is by necessity interdisciplinary, because it involves both society and AI. It ranges from economics, law and philosophy to computer security, formal methods and, of course, various branches of AI itself. In summary, we believe that research on how to make AI systems robust and beneficial is both important and timely, and that there are concrete research directions that can be pursued today.

...read moreread less

42 citations

Posted Content•

Systematic Construction of Anomaly Detection Benchmarks from Real Data

[...]

Andrew Emmott¹, Shubhomoy Das¹, Thomas G. Dietterich¹, Alan Fern¹, Weng-Keen Wong¹ - Show less +1 more•Institutions (1)

Oregon State University¹

03 Mar 2015-arXiv: Artificial Intelligence

TL;DR: A methodology for transforming existing classification data sets into ground-truthed benchmark data sets for anomaly detection, which produces data sets that vary along three important dimensions: point difficulty, relative frequency of anomalies, and clusteredness.

...read moreread less

Abstract: Research in anomaly detection suffers from a lack of realistic and publicly-available data sets. Because of this, most published experiments in anomaly detection validate their algorithms with application-specific case studies or benchmark datasets of the researchers' construction. This makes it difficult to compare different methods or to measure progress in the field. It also limits our ability to understand the factors that determine the performance of anomaly detection algorithms. This article proposes a new methodology for empirical analysis and evaluation of anomaly detection algorithms. It is based on generating thousands of benchmark datasets by transforming existing supervised learning benchmark datasets and manipulating properties relevant to anomaly detection. The paper identifies and validates four important dimensions: (a) point difficulty, (b) relative frequency of anomalies, (c) clusteredness of anomalies, and (d) relevance of features. We apply our generated datasets to analyze several leading anomaly detection algorithms. The evaluation verifies the importance of these dimensions and shows that, while some algorithms are clearly superior to others, anomaly detection accuracy is determined more by variation in the four dimensions than by the choice of algorithm.

...read moreread less

34 citations

Journal Article•DOI•

Penalized likelihood methods improve parameter estimates in occupancy models

[...]

Rebecca A. Hutchinson¹, Jonathon J. Valente¹, Sarah C. Emerson¹, Matthew G. Betts¹, Thomas G. Dietterich¹ - Show less +1 more•Institutions (1)

Oregon State University¹

01 Aug 2015-Methods in Ecology and Evolution

TL;DR: In this paper, the authors explore strategies for estimating parameters based on maximizing a penalized likelihood, which augments the usual likelihood with a penalty function that encodes information about what parameter values are undesirable.

...read moreread less

Abstract: Summary Occupancy models are employed in species distribution modelling to account for imperfect detection during field surveys. While this approach is popular in the literature, problems can occur when estimating the model parameters. In particular, the maximum likelihood estimates can exhibit bias and large variance for data sets with small sample sizes, which can result in estimated occupancy probabilities near 0 and 1 (‘boundary estimates’). In this paper, we explore strategies for estimating parameters based on maximizing a penalized likelihood. Penalized likelihood methods augment the usual likelihood with a penalty function that encodes information about what parameter values are undesirable. We introduce penalties for occupancy models that have analogues in ridge regression and Bayesian approaches, and we compare them to a penalty developed for occupancy models in prior work. We examine the bias, variance and mean squared error of parameter estimates obtained from each method on synthetic data. Across all of the synthetic data sets, the penalized estimation methods had lower mean squared error than the maximum likelihood estimates. We also provide an example of the application of these methods to point counts of avian species. Penalized likelihood methods show similar improvements when tested using empirical bird point count data. We discuss considerations for choosing among these methods when modelling occupancy. We conclude that penalized methods may be of practical utility for fitting occupancy models with small sample sizes, and we are releasing R code that implements these methods.

...read moreread less

28 citations

Proceedings Article•DOI•

ℋC-search for structured prediction in computer vision

[...]

Michael Lam¹, Janardhan Rao Doppa², Sinisa Todorovic¹, Thomas G. Dietterich¹•Institutions (2)

Oregon State University¹, Washington State University²

07 Jun 2015

TL;DR: This paper introduces a search operator suited to the vision domain that improves a candidate solution by probabilistically sampling likely object configurations in the scene from the hierarchical Berkeley segmentation, and complements this search operator by applying the DAgger algorithm to robustly train the search heuristic so it learns from its previous mistakes.

...read moreread less

Abstract: The mainstream approach to structured prediction problems in computer vision is to learn an energy function such that the solution minimizes that function. At prediction time, this approach must solve an often-challenging optimization problem. Search-based methods provide an alternative that has the potential to achieve higher performance. These methods learn to control a search procedure that constructs and evaluates candidate solutions. The recently-developed ℋC-Search method has been shown to achieve state-of-the-art results in natural language processing, but mixed success when applied to vision problems. This paper studies whether ℋC-Search can achieve similarly competitive performance on basic vision tasks such as object detection, scene labeling, and monocular depth estimation, where the leading paradigm is energy minimization. To this end, we introduce a search operator suited to the vision domain that improves a candidate solution by probabilistically sampling likely object configurations in the scene from the hierarchical Berkeley segmentation. We complement this search operator by applying the DAgger algorithm to robustly train the search heuristic so it learns from its previous mistakes. Our evaluation shows that these improvements reduce the branching factor and search depth, and thus give a significant performance boost. Our state-of-the-art results on scene labeling and depth estimation suggest that ℋC-Search provides a suitable tool for learning and inference in vision.

...read moreread less

28 citations

Journal Article•

PAC optimal MDP planning with application to invasive species management

[...]

Majid Alkaee Taleghan¹, Thomas G. Dietterich¹, Mark Crowley¹, Kim Hall¹, H. Jo Albers² - Show less +1 more•Institutions (2)

Oregon State University¹, University of Wyoming²

01 Jan 2015-Journal of Machine Learning Research

TL;DR: It is shown that the improved confidence intervals and the new search heuristics yield reductions of between 8% and 47% in the number of simulator calls required to reach near-optimal policies.

...read moreread less

Abstract: In a simulator-defined MDP, the Markovian dynamics and rewards are provided in the form of a simulator from which samples can be drawn. This paper studies MDP planning algorithms that attempt to minimize the number of simulator calls before terminating and outputting a policy that is approximately optimal with high probability. The paper introduces two heuristics for efficient exploration and an improved confidence interval that enables earlier termination with probabilistic guarantees. We prove that the heuristics and the confidence interval are sound and produce with high probability an approximately optimal policy in polynomial time. Experiments on two benchmark problems and two instances of an invasive species management problem show that the improved confidence intervals and the new search heuristics yield reductions of between 8% and 47% in the number of simulator calls required to reach near-optimal policies.

...read moreread less

24 citations

Proceedings Article•

α-min: a compact approximate solver for finite-horizon POMDPs

[...]

Yann Dujardin¹, Thomas G. Dietterich², Iadine Chadès¹•Institutions (2)

Commonwealth Scientific and Industrial Research Organisation¹, Oregon State University²

25 Jul 2015

TL;DR: A new algorithm (α-min) is introduced that formulates a Mixed Integer Linear Program (MILP) to calculate approximate solutions for finite-horizon POMDP problems with limited numbers of α-vectors.

...read moreread less

Abstract: In many POMDP applications in computational sustainability, it is important that the computed policy have a simple description, so that it can be easily interpreted by stakeholders and decision makers. One measure of simplicity for POMDP value functions is the number of α-vectors required to represent the value function. Existing POMDP methods seek to optimize the accuracy of the value function, which can require a very large number of α-vectors. This paper studies methods that allow the user to explore the tradeoff between the accuracy of the value function and the number of α-vectors. Building on previous point-based POMDP solvers, this paper introduces a new algorithm (α-min) that formulates a Mixed Integer Linear Program (MILP) to calculate approximate solutions for finite-horizon POMDP problems with limited numbers of α-vectors. At each time-step, α-min calculates α-vectors to greedily minimize the gap between current upper and lower bounds of the value function. In doing so, good upper and lower bounds are quickly reached allowing a good approximation of the problem with few α-vectors. Experimental results show that α-min provides good approximate solutions given a fixed number of α-vectors on small benchmark problems, on a larger randomly generated problem, as well as on a computational sustainability problem to best manage the endangered Sumatran tiger.

...read moreread less

14 citations

Proceedings Article•DOI•

Facilitating testing and debugging of Markov Decision Processes with interactive visualization

[...]

Sean McGregor¹, Hailey Buckingham¹, Thomas G. Dietterich¹, Rachel Houtman¹, Claire A. Montgomery¹, Ronald Metoyer¹ - Show less +2 more•Institutions (1)

Oregon State University¹

17 Dec 2015

TL;DR: MDPVIS addresses three visualization research gaps by generalizing a visualization for wildfire management and addressing the data acquisition, data analysis, and cognition gap through a generalized MDP information visualization.

...read moreread less

Abstract: Researchers in AI and Operations Research employ the framework of Markov Decision Processes (MDPs) to formalize problems of sequential decision making under uncertainty. A common approach is to implement a simulator of the stochastic dynamics of the MDP and a Monte Carlo optimization algorithm that invokes this simulator to solve the MDP. The resulting software system is often realized by integrating several systems and functions that are collectively subject to failures of specification, implementation, integration, and optimization. We present these failures as queries for a computational steering visual analytic system (MDPVIS). MDPVIS addresses three visualization research gaps. First, the data acquisition gap is addressed through a general simulator-visualization interface. Second, the data analysis gap is addressed through a generalized MDP information visualization. Finally, the cognition gap is addressed by exposing model components to the user. MDPVIS generalizes a visualization for wildfire management. We use that problem to illustrate MDPVIS.

...read moreread less

12 citations

Posted Content•

Sequential Feature Explanations for Anomaly Detection

[...]

Amran Siddiqui¹, Alan Fern¹, Thomas G. Dietterich¹, Weng-Keen Wong¹•Institutions (1)

Oregon State University¹

28 Feb 2015-arXiv: Artificial Intelligence

TL;DR: This article forms the problem of optimizingSFEs for a particular density-based anomaly detector, and presents both greedy algorithms and an optimal algorithm, based on branch-and-bound search, for optimizing SFEs.

...read moreread less

Abstract: In many applications, an anomaly detection system presents the most anomalous data instance to a human analyst, who then must determine whether the instance is truly of interest (e.g. a threat in a security setting). Unfortunately, most anomaly detectors provide no explanation about why an instance was considered anomalous, leaving the analyst with no guidance about where to begin the investigation. To address this issue, we study the problems of computing and evaluating sequential feature explanations (SFEs) for anomaly detectors. An SFE of an anomaly is a sequence of features, which are presented to the analyst one at a time (in order) until the information contained in the highlighted features is enough for the analyst to make a confident judgement about the anomaly. Since analyst effort is related to the amount of information that they consider in an investigation, an explanation's quality is related to the number of features that must be revealed to attain confidence. One of our main contributions is to present a novel framework for large scale quantitative evaluations of SFEs, where the quality measure is based on analyst effort. To do this we construct anomaly detection benchmarks from real data sets along with artificial experts that can be simulated for evaluation. Our second contribution is to evaluate several novel explanation approaches within the framework and on traditional anomaly detection benchmarks, offering several insights into the approaches.

...read moreread less

Proceedings Article•

Learning greedy policies for the easy-first framework

[...]

Jun Xie¹, Chao Ma¹, Janardhan Rao Doppa², Prashanth Mannem¹, Xiaoli Z. Fern¹, Thomas G. Dietterich¹, Prasad Tadepalli¹ - Show less +3 more•Institutions (2)

Oregon State University¹, Washington State University²

25 Jan 2015

TL;DR: This work forms greedy policy learning in the Easy-first approach as a novel non-convex optimization problem and solves it via an efficient Majorization Minimization (MM) algorithm.

...read moreread less

Abstract: Easy-first, a search-based structured prediction approach, has been applied to many NLP tasks including dependency parsing and coreference resolution. This approach employs a learned greedy policy (action scoring function) to make easy decisions first, which constrains the remaining decisions and makes them easier. We formulate greedy policy learning in the Easy-first approach as a novel non-convex optimization problem and solve it via an efficient Majorization Minimization (MM) algorithm. Results on within-document coreference and cross-document joint entity and event coreference tasks demonstrate that the proposed approach achieves statistically significant performance improvement over existing training regimes for Easy-first and is less susceptible to overfitting.

...read moreread less

Proceedings Article•

Progressive abstraction refinement for sparse sampling

[...]

Jesse Hostetler¹, Alan Fern¹, Thomas G. Dietterich¹•Institutions (1)

Oregon State University¹

12 Jul 2015

TL;DR: A progressive abstraction refinement algorithm is proposed that refines an initially coarse abstraction during search in order to match the abstraction to the sample budget and combines the strong performance of coarse abstractions at small sample budgets with the ability to exploit larger budgets for further performance gains.

...read moreread less

Abstract: Monte Carlo tree search (MCTS) algorithms can encounter difficulties when solving Markov decision processes (MDPs) in which the outcomes of actions are highly stochastic. This stochastic branching can be reduced through state abstraction. In online planning with a time budget, there is a complex tradeoff between loss in performance due to overly coarse abstraction versus gain in performance from reducing the problem size. Coarse but unsound abstractions often outperform sound abstractions for practical budgets. Motivated by this, we propose a progressive abstraction refinement algorithm that refines an initially coarse abstraction during search in order to match the abstraction to the sample budget. Our experiments show that the algorithm combines the strong performance of coarse abstractions at small sample budgets with the ability to exploit larger budgets for further performance gains.

...read moreread less

Proceedings Article•DOI•

Improving Automated Email Tagging with Implicit Feedback

[...]

Mohammad S. Sorower¹, Michael Slater¹, Thomas G. Dietterich¹•Institutions (1)

Oregon State University¹

05 Nov 2015

TL;DR: This paper proposes three algorithms (and two baselines) for incorporating implicit feedback into the EP2 tag predictor, and shows that implicit feedback mechanisms can provide a useful performance boost for email tagging systems.

...read moreread less

Abstract: Tagging email is an important tactic for managing information overload. Machine learning methods can help the user with this task by predicting tags for incoming email messages. The natural user interface displays the predicted tags on the email message, and the user doesn't need to do anything unless those predictions are wrong (in which case, the user can delete the incorrect tags and add the missing tags). From a machine learning perspective, this means that the learning algorithm never receives confirmation that its predictions are correct---it only receives feedback when it makes a mistake. This can lead to slower learning, particularly when the predictions were not very confident, and hence, the learning algorithm would benefit from positive feedback. One could assume that if the user never changes any tag, then the predictions are correct, but users sometimes forget to correct the tags, presumably because they are focused on the content of the email messages and fail to notice incorrect and missing tags. The aim of this paper is to determine whether implicit feedback can provide useful additional training examples to the email prediction subsystem of TaskTracer, known as EP2 (Email Predictor 2). Our hypothesis is that the more time a user spends working on an email message, the more likely it is that the user will notice tag errors and correct them. If no corrections are made, then perhaps it is safe for the learning system to treat the predicted tags as being correct and train accordingly. This paper proposes three algorithms (and two baselines) for incorporating implicit feedback into the EP2 tag predictor. These algorithms are then evaluated using email interaction and tag correction events collected from 14 user-study participants as they performed email-directed tasks while using TaskTracer EP2. The results show that implicit feedback produces important increases in training feedback, and hence, significant reductions in subsequent prediction errors despite the fact that the implicit feedback is not perfect. We conclude that implicit feedback mechanisms can provide a useful performance boost for email tagging systems.

...read moreread less

Posted Content•

Transductive Optimization of Top k Precision

[...]

Li-Ping Liu¹, Thomas G. Dietterich¹, Nan Li², Zhi-Hua Zhou²•Institutions (2)

Oregon State University¹, Nanjing University²

20 Oct 2015-arXiv: Learning

TL;DR: Transductive Top K (TTK) as discussed by the authors minimizes the hinge loss over all training instances under the constraint that exactly $k$ test instances are predicted as positive, which is similar to our approach.

...read moreread less

Abstract: Consider a binary classification problem in which the learner is given a labeled training set, an unlabeled test set, and is restricted to choosing exactly $k$ test points to output as positive predictions. Problems of this kind---{\it transductive precision@$k$}---arise in information retrieval, digital advertising, and reserve design for endangered species. Previous methods separate the training of the model from its use in scoring the test points. This paper introduces a new approach, Transductive Top K (TTK), that seeks to minimize the hinge loss over all training instances under the constraint that exactly $k$ test instances are predicted as positive. The paper presents two optimization methods for this challenging problem. Experiments and analysis confirm the importance of incorporating the knowledge of $k$ into the learning process. Experimental evaluations of the TTK approach show that the performance of TTK matches or exceeds existing state-of-the-art methods on 7 UCI datasets and 3 reserve design problem instances.

...read moreread less

Proceedings Article•

MDPVIS: An Interactive Visualization for Testing Markov Decision Processes.

[...]

Sean McGregor¹, Hailey Buckingham¹, Rachel Houtman¹, Claire A. Montgomery¹, Ronald Metoyer¹, Thomas G. Dietterich¹ - Show less +2 more•Institutions (1)

Oregon State University¹

01 Jan 2015

TL;DR: A domain agnostic visual analytic design and implementation for testing and debugging MDPs: MDPVIS, which treats optimization as an open-ended process whose parameters are repeatedly changed forTesting and debugging.

...read moreread less

Abstract: Markov Decision Process (MDP) simulators and optimization algorithms integrate several systems and functions that are collectively subject to failures of specification, implementation, integration, and optimization. We present a domain agnostic visual analytic design and implementation for testing and debugging MDPs: MDPVIS. A common approach for solving Markov Decision Processes is to implement a simulator of the stochastic dynamics of the MDP and a Monte Carlo optimization algorithm that invokes this simulator. The resulting software system is often realized by integrating several systems and functions that are collectively subject to failures (referred to as “bugs”) of specification, implementation, integration, and optimization. Since bugs are subject to the same stochastic processes as their underlying systems, detecting and characterizing bugs requires exploration with an “informed trial and error” (Sedlmair et al. 2014) process. This process involves writing an interactive client to manually execute transitions, followed by a visualization of state development as a policy rule is followed. A domain agnostic visual analytic interface could facilitate testing and debugging, but during semi-structured interviews of MDP researchers we did not find anyone using a generic visualization tool for testing. We posit this is because researchers have heretofore not had access to a visualization that is easily connected to their MDP simulator and MDP optimizer. In the following section we summarize the implementation and integration of MDPVIS presented in McGregor et al. (2015). Implementation and Integration MDPVIS’ target users are researchers interested in steering the optimization itself, simulator developers who are interested in ensuring the policies optimized for the problem domain are well founded, or domain policy experts primarily interested in the outcomes produced by the optimized policy. In real-world settings these roles can be filled by a sinCopyright c © 2015, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. gle person, or each role can be performed by a large team of developers and domain experts. MDPVIS extends computational steering from the high performance scientific visualization community (Parker et al. 1996). Whereas computational steering traditionally refers to modifying a computer process during its execution (Mulder, van Wijk, and van Liere 1999), we treat optimization as an open-ended process whose parameters are repeatedly changed for testing and debugging. Sedlmair et al. (2014) label techniques for understanding the relationship between input parameters and outputs as Parameter Space Analysis (PSA), “...the systematic variation of model input parameters, generating outputs for each combination of parameters, and investigating the relation between parameter settings and corresponding outputs.” This is a suitable definition for the MDP debugging and testing processes. Testing for MDP bugs requires exploring Monte Carlo rollouts. These rollouts are the output of the system under test, but since the distribution of these rollouts is defined by applying a policy in many successive states, the rollouts are tightly coupled with the parameter space of the MDP’s component functions. Similarly, establishing bug causality (debugging) requires varying the model parameters and examining the resulting rollouts. The VL/HCC paper explores MDP testing questions in the following six broad tasks introduced by Sedlmair et al. (2014): 1. Fitting: Do the outputs match real-world data or expectations? 2. Outliers: Are low probability events occurring with unexpected frequency? 3. Partition: Do different system parameters produce the expected differences? 4. Optimization: Did the optimization algorithm find the local optimum and does the policy exploit a bug in the specification or implementation? 5. Uncertainty: How confident are we in the proposed results? 6. Sensitivity: Do small changes to the system result in big changes to the optimized policy? To test these questions, MDPVIS (Figure 1) has four computational steering control sets and three visualization areas. The controls give the reward, model, and policy parameters that are exposed by the MDP’s software. These layers are memoized in an exploration history that records the parameters and rollouts computed by the MDP. The first visualization area shows state distributions at time steps under the current policy. The second visualization area gives the distribution of a variable’s development through time. The last visualization area gives details of individual states. Each of these steering controls and visualizations are designed to integrate with MDP simulators and optimizers using the same read-eval-print loop (REPL) that is typically implemented in current development practices. We built MDPVIS as a data-driven web application. The web stack emphasizes standard data interchange formats that are easily linked to MDP simulators and optimization algorithms. We identified four HTTP requests (initialize, rollouts, optimize, and state) that are answered by the MDP simulator or optimizer. In each case the current values of the steering controls are sent to a web server acting as a bridge between the HTTP request and the syntax expected by the REPL. 1. /initialize – Ask for the steering parameters that should be displayed to the user. The parameters are a list of tuples, each containing the name, description, minimum value, maximum value, and current value of a parameter. These parameters are then rendered as HTML input elements for the user to modify. Following initialization, MDPVIS automatically requests rollouts for the initial parameters. 2. /rollouts?QUERY – Get rollouts for the parameters that are currently defined in the user interface. The server returns an array of arrays containing the state variables that should be rendered for each time step. 3. /optimize?QUERY – Get a newly optimized policy. This sends all the parameters defined in the user interface and the optimization algorithm returns a newly optimized policy. 4. /state?IDENTIFIER – Get the full state details and images. This is required for high dimensional problems in which the entire state cannot be returned to the client for every state in a rollout The most domain-specific element of any MDP visualization is the representation of a specific state. In Figure 1 individual states are given as two dimensional images of landscape fuel levels. This is a visualization that our forestry colleagues typically generate for natural resource domains. The fourth HTTP request can optionally return images to accommodate these landscapes and arbitrary domain images. These landscapes can be rendered without any changes to the MDPVIS code base. A live version of the visualization is available at MDPvis.github.io for a wildfire suppression policy domain (Houtman et al. 2013). The visualization has been tested on Google Chrome and Firefox and is responsive to a variety of screen resolutions. In the VL/HCC paper we presented a use-case study to provide anecdotal evidence of the utility of MDPVIS on the wildfire problem. The case study involved user sessions with our forestry economics collaborators who have formulated an MDP optimization problem to study fire suppression policies. When applying MDPVIS we found numerous simulator and the optimization bugs.

...read moreread less

Executive Summary: Success in the quest for articial intelligence has the potential to bring unprecedented benets to humanity, and it is therefore worthwhile to research how to maximize these benets while avoiding potential pitfalls. This document gives numerous examples (which should by no means be construed as an exhaustive list) of such worthwhile research aimed at ensuring that AI remains robust and benecial.

[...]

Richard Mallah, Erik Brynjolfsson, Ryan Calo, Thomas G. Dietterich, Dileep George, Demis Hassabis, Eric Horvitz, James Manyika, Michael Osborne, David C. Parkes, Francesca Rossi, Bart Selman, Murray Shanahan - Show less +9 more

01 Jan 2015