Definitions, methods, and applications in interpretable machine learning.

doi:10.1073/PNAS.1900654116

Open AccessJournal ArticleDOI

Definitions, methods, and applications in interpretable machine learning.

W. James Murdoch, +5 more

- 29 Oct 2019 -

Proceedings of the National Academy of S...

- Vol. 116, Iss: 44, pp 22071-22080

Chats0

TLDR

The authors define interpretability in the context of machine learning and introduce the predictive, descriptive, relevant (PDR) framework for discussing interpretations, with three overarching desiderata for evaluation: predictive accuracy, descriptive accuracy, and relevancy, with relevance judged relative to a human audience.

Abstract:

Machine-learning models have demonstrated great success in learning complex patterns that enable them to make predictions about unobserved data. In addition to using models for prediction, the ability to interpret what a model has learned is receiving an increasing amount of attention. However, this increased focus has led to considerable confusion about the notion of interpretability. In particular, it is unclear how the wide array of proposed interpretation methods are related and what common concepts can be used to evaluate them. We aim to address these concerns by defining interpretability in the context of machine learning and introducing the predictive, descriptive, relevant (PDR) framework for discussing interpretations. The PDR framework provides 3 overarching desiderata for evaluation: predictive accuracy, descriptive accuracy, and relevancy, with relevancy judged relative to a human audience. Moreover, to help manage the deluge of interpretation methods, we introduce a categorization of existing techniques into model-based and post hoc categories, with subgroups including sparsity, modularity, and simulatability. To demonstrate how practitioners can use the PDR framework to evaluate and understand interpretations, we provide numerous real-world examples. These examples highlight the often underappreciated role played by human audiences in discussions of interpretability. Finally, based on our framework, we discuss limitations of existing methods and directions for future work. We hope that this work will provide a common vocabulary that will make it easier for both practitioners and researchers to discuss and choose from the full range of interpretation methods.

Definitions, methods, and applications in interpretable machine learning.

Citations

Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI.

Explainable AI: A Review of Machine Learning Interpretability Methods

Explainable Machine Learning for Scientific Insights and Discoveries

A Survey on the Explainability of Supervised Machine Learning

Drug discovery with explainable artificial intelligence

References

Random Forests

Regression Shrinkage and Selection via the Lasso

Histograms of oriented gradients for human detection

ggplot2: Elegant Graphics for Data Analysis

Classification and Regression Trees.

Related Papers (5)

"Why Should I Trust You?": Explaining the Predictions of Any Classifier

Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead

A unified approach to interpreting model predictions

A Survey of Methods for Explaining Black Box Models

Random Forests