
Showing papers on "Interpretability published in 2020"


Journal ArticleDOI
TL;DR: An explanation method for trees is presented that enables the computation of optimal local explanations for individual predictions, and the authors demonstrate their method on three medical datasets.
Abstract: Tree-based machine learning models such as random forests, decision trees and gradient boosted trees are popular nonlinear predictive models, yet comparatively little attention has been paid to explaining their predictions. Here we improve the interpretability of tree-based models through three main contributions. (1) A polynomial time algorithm to compute optimal explanations based on game theory. (2) A new type of explanation that directly measures local feature interaction effects. (3) A new set of tools for understanding global model structure based on combining many local explanations of each prediction. We apply these tools to three medical machine learning problems and show how combining many high-quality local explanations allows us to represent global structure while retaining local faithfulness to the original model. These tools enable us to (1) identify high-magnitude but low-frequency nonlinear mortality risk factors in the US population, (2) highlight distinct population subgroups with shared risk characteristics, (3) identify nonlinear interaction effects among risk factors for chronic kidney disease and (4) monitor a machine learning model deployed in a hospital by identifying which features are degrading the model’s performance over time. Given the popularity of tree-based machine learning models, these improvements to their interpretability have implications across a broad set of domains. Tree-based machine learning models are widely used in domains such as healthcare, finance and public services. The authors present an explanation method for trees that enables the computation of optimal local explanations for individual predictions, and demonstrate their method on three medical datasets.
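
The explanation method described here corresponds to the TreeExplainer available in the open-source shap Python package. A minimal usage sketch (using a scikit-learn regressor and a toy dataset rather than the paper's medical data) might look like this:

```python
# Minimal sketch: local explanations and interaction effects for a tree ensemble,
# assuming the open-source `shap` package and a fitted scikit-learn model.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)                       # polynomial-time, exact Shapley values for trees
shap_values = explainer.shap_values(X)                      # one local explanation per prediction
interactions = explainer.shap_interaction_values(X[:100])   # local pairwise feature interaction effects

# Combining many local explanations into a global view of the model (requires matplotlib).
shap.summary_plot(shap_values, X, show=False)
```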

2,548 citations


Posted Content
Kelvin Guu1, Kenton Lee1, Zora Tung1, Panupong Pasupat1, Ming-Wei Chang1 
TL;DR: The effectiveness of Retrieval-Augmented Language Model pre-training (REALM) is demonstrated by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA) and is found to outperform all previous methods by a significant margin, while also providing qualitative benefits such as interpretability and modularity.
Abstract: Language model pre-training has been shown to capture a surprising amount of world knowledge, crucial for NLP tasks such as question answering. However, this knowledge is stored implicitly in the parameters of a neural network, requiring ever-larger networks to cover more facts. To capture knowledge in a more modular and interpretable way, we augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia, used during pre-training, fine-tuning and inference. For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner, using masked language modeling as the learning signal and backpropagating through a retrieval step that considers millions of documents. We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA). We compare against state-of-the-art models for both explicit and implicit knowledge storage on three popular Open-QA benchmarks, and find that we outperform all previous methods by a significant margin (4-16% absolute accuracy), while also providing qualitative benefits such as interpretability and modularity.
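
As a rough, schematic illustration (not the actual REALM implementation), the retrieve-then-read idea can be written as marginalizing the answer probability over the top-k retrieved documents, p(y|x) = sum over z of p(z|x) p(y|x, z):

```python
# Schematic sketch of retrieval-augmented marginalization; `answer_logprob_fn`
# is a stand-in for the reader model and is assumed to return a scalar tensor.
import torch

def retrieval_augmented_logprob(query_emb, doc_embs, answer_logprob_fn, k=5):
    """query_emb: (d,); doc_embs: (N, d); returns log p(y | x) marginalized over top-k docs."""
    scores = doc_embs @ query_emb                               # dense inner-product retrieval scores
    topk = torch.topk(scores, k)
    retrieval_logprobs = torch.log_softmax(topk.values, dim=0)  # log p(z | x) over the retrieved docs
    answer_logprobs = torch.stack([answer_logprob_fn(i) for i in topk.indices])  # log p(y | x, z)
    # Log-sum-exp marginalization; gradients flow into both the retriever and the reader.
    return torch.logsumexp(retrieval_logprobs + answer_logprobs, dim=0)
```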

563 citations


Journal ArticleDOI
25 Dec 2020-Entropy
TL;DR: In this paper, a literature review and taxonomy of machine learning interpretability methods are presented, as well as links to their programming implementations, in the hope that this survey would serve as a reference point for both theorists and practitioners.
Abstract: Recent advances in artificial intelligence (AI) have led to its widespread industrial adoption, with machine learning systems demonstrating superhuman performance in a significant number of tasks. However, this surge in performance has often been achieved through increased model complexity, turning such systems into “black box” approaches and causing uncertainty regarding the way they operate and, ultimately, the way that they come to decisions. This ambiguity has made it problematic for machine learning systems to be adopted in sensitive yet critical domains, where their value could be immense, such as healthcare. As a result, scientific interest in the field of Explainable Artificial Intelligence (XAI), a field that is concerned with the development of new methods that explain and interpret machine learning models, has been tremendously reignited over recent years. This study focuses on machine learning interpretability methods; more specifically, a literature review and taxonomy of these methods are presented, as well as links to their programming implementations, in the hope that this survey would serve as a reference point for both theorists and practitioners.

543 citations


Journal ArticleDOI
TL;DR: In this paper, the authors provide a timely overview of explainable AI, with a focus on 'post-hoc' explanations, explain its theoretical foundations, and put interpretability algorithms to a test both from a theory and comparative evaluation perspective using extensive simulations.
Abstract: With the broader and highly successful usage of machine learning in industry and the sciences, there has been a growing demand for Explainable AI. Interpretability and explanation methods for gaining a better understanding of the problem-solving abilities and strategies of nonlinear Machine Learning, in particular, deep neural networks, are therefore receiving increased attention. In this work we aim to (1) provide a timely overview of this active emerging field, with a focus on 'post-hoc' explanations, and explain its theoretical foundations, (2) put interpretability algorithms to a test both from a theory and comparative evaluation perspective using extensive simulations, (3) outline best-practice aspects, i.e., how best to include interpretation methods into the standard usage of machine learning, and (4) demonstrate successful usage of explainable AI in a representative selection of application scenarios. Finally, we discuss challenges and possible future directions of this exciting foundational field of machine learning.

385 citations


Posted Content
TL;DR: An interactive visualization tool called Captum Insights, built on top of the Captum library, enables sample-based model debugging and visualization using feature importance metrics and is designed for easy understanding and use.
Abstract: In this paper we introduce a novel, unified, open-source model interpretability library for PyTorch [12]. The library contains generic implementations of a number of gradient and perturbation-based attribution algorithms, also known as feature, neuron and layer importance algorithms, as well as a set of evaluation metrics for these algorithms. It can be used for both classification and non-classification models including graph-structured models built on Neural Networks (NN). In this paper we give a high-level overview of supported attribution algorithms and show how to perform memory-efficient and scalable computations. We emphasize that the three main characteristics of the library are multimodality, extensibility and ease of use. Multimodality supports different modalities of input such as image, text, audio or video. Extensibility allows adding new algorithms and features. The library is also designed for easy understanding and use. In addition, we introduce an interactive visualization tool called Captum Insights that is built on top of the Captum library and allows sample-based model debugging and visualization using feature importance metrics.
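
A minimal usage sketch of Captum's attribution API (the toy model and data below are illustrative, not taken from the paper):

```python
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

# Toy model standing in for any differentiable PyTorch classifier.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
model.eval()

inputs = torch.randn(4, 10)
ig = IntegratedGradients(model)                      # one of the gradient-based attribution algorithms
attributions, delta = ig.attribute(inputs, target=1, return_convergence_delta=True)
print(attributions.shape)                            # per-feature importance scores, same shape as inputs
```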

312 citations


Proceedings ArticleDOI
21 Apr 2020
TL;DR: It is indicated that data scientists over-trust and misuse interpretability tools, and few of the study's participants were able to accurately describe the visualizations output by these tools.
Abstract: Machine learning (ML) models are now routinely deployed in domains ranging from criminal justice to healthcare. With this newfound ubiquity, ML has moved beyond academia and grown into an engineering discipline. To that end, interpretability tools have been designed to help data scientists and machine learning practitioners better understand how ML models work. However, there has been little evaluation of the extent to which these tools achieve this goal. We study data scientists' use of two existing interpretability tools, the InterpretML implementation of GAMs and the SHAP Python package. We conduct a contextual inquiry (N=11) and a survey (N=197) of data scientists to observe how they use interpretability tools to uncover common issues that arise when building and evaluating ML models. Our results indicate that data scientists over-trust and misuse interpretability tools. Furthermore, few of our participants were able to accurately describe the visualizations output by these tools. We highlight qualitative themes for data scientists' mental models of interpretability tools. We conclude with implications for researchers and tool designers, and contextualize our findings in the social science literature.

311 citations


Posted ContentDOI
13 Jun 2020-bioRxiv
TL;DR: DeepEMhancer, a deep learning approach designed to perform automatic post-processing of cryo-EM maps, was evaluated on a testing set of 20 different experimental maps, showing its ability to obtain much cleaner and more detailed versions of the experimental maps.
Abstract: Cryo-electron microscopy (cryo-EM) maps are among the most valuable sources of information for protein structure modeling. However, due to the loss of contrast at high frequencies, they generally need to be post-processed before modeling in order to improve their interpretability. To that end, approaches based on B-factor correction are the most popular choices, yet they suffer from some limitations, such as the fact that the correction is applied globally, ignoring the heterogeneity in local map quality that cryo-EM reconstructions tend to exhibit. With the aim of overcoming these limitations, here we present DeepEMhancer, a deep learning approach designed to perform automatic post-processing of cryo-EM maps. Trained on a dataset of pairs of experimental cryo-EM maps and maps sharpened by LocScale using their respective atomic models, DeepEMhancer has automatically learned how to post-process experimental maps, performing masking-like and sharpening-like operations in a single step. DeepEMhancer has been evaluated on a testing set of 20 different experimental maps, showing its ability to obtain much cleaner and more detailed versions of the experimental maps, thus improving their interpretability. Additionally, we have illustrated the benefits of DeepEMhancer with a use case in which the structure of the SARS-CoV-2 RNA polymerase is improved.

304 citations


Journal ArticleDOI
TL;DR: It is argued that, beyond improving model interpretability as a goal in itself, machine learning needs to integrate the medical experts in the design of data analysis interpretation strategies. Otherwise, machine learning is unlikely to become a part of routine clinical and health care practice.
Abstract: In a short period of time, many areas of science have made a sharp transition towards data-dependent methods. In some cases, this process has been enabled by simultaneous advances in data acquisition and the development of networked system technologies. This new situation is particularly clear in the life sciences, where data overabundance has sparked a flurry of new methodologies for data management and analysis. This can be seen as a perfect scenario for the use of machine learning and computational intelligence techniques to address problems in which more traditional data analysis approaches might struggle. But this scenario also poses some serious challenges. One of them is model interpretability and explainability, especially for complex nonlinear models. In some areas such as medicine and health care, not addressing such a challenge might seriously limit the chances of adoption, in real practice, of computer-based systems that rely on machine learning and computational intelligence methods for data analysis. In this paper, we reflect on recent investigations about the interpretability and explainability of machine learning methods and discuss their impact on medicine and health care. We pay specific attention to one of the ways in which interpretability and explainability in this context can be addressed, which is through data and model visualization. We argue that, beyond improving model interpretability as a goal in itself, we need to integrate the medical experts in the design of data analysis interpretation strategies. Otherwise, machine learning is unlikely to become a part of routine clinical and health care practice.

272 citations


Proceedings ArticleDOI
07 Apr 2020
TL;DR: The current binary definition of faithfulness sets a potentially unrealistic bar for being considered faithful; the authors call for discarding this binary notion in favor of a more graded one, which is of greater practical utility.
Abstract: With the growing popularity of deep-learning based NLP models, comes a need for interpretable systems. But what is interpretability, and what constitutes a high-quality interpretation? In this opinion piece we reflect on the current state of interpretability evaluation research. We call for more clearly differentiating between different desired criteria an interpretation should satisfy, and focus on the faithfulness criteria. We survey the literature with respect to faithfulness evaluation, and arrange the current approaches around three assumptions, providing an explicit form to how faithfulness is "defined" by the community. We provide concrete guidelines on how evaluation of interpretation methods should and should not be conducted. Finally, we claim that the current binary definition for faithfulness sets a potentially unrealistic bar for being considered faithful. We call for discarding the binary notion of faithfulness in favor of a more graded one, which we believe will be of greater practical utility.

265 citations


Journal ArticleDOI
27 May 2020
TL;DR: This article provides insight into the current state of the art of interpretability methods for radiology AI and into radiologists' opinions on the topic, and suggests trends and challenges that need to be addressed to effectively streamline interpretability methods in clinical practice.
Abstract: As artificial intelligence (AI) systems begin to make their way into clinical radiology practice, it is crucial to assure that they function correctly and that they gain the trust of experts. Toward this goal, approaches to make AI "interpretable" have gained attention to enhance the understanding of a machine learning algorithm, despite its complexity. This article aims to provide insights into the current state of the art of interpretability methods for radiology AI. This review discusses radiologists' opinions on the topic and suggests trends and challenges that need to be addressed to effectively streamline interpretability methods in clinical practice. Supplemental material is available for this article. © RSNA, 2020 See also the commentary by Gastounioti and Kontos in this issue.

222 citations


Posted Content
TL;DR: This work proposes a novel way to compute relevancy for Transformer networks that assigns local relevance based on the Deep Taylor Decomposition principle and then propagates these relevancy scores through the layers.
Abstract: Self-attention techniques, and specifically Transformers, are dominating the field of text processing and are becoming increasingly popular in computer vision classification tasks. In order to visualize the parts of the image that led to a certain classification, existing methods either rely on the obtained attention maps or employ heuristic propagation along the attention graph. In this work, we propose a novel way to compute relevancy for Transformer networks. The method assigns local relevance based on the Deep Taylor Decomposition principle and then propagates these relevancy scores through the layers. This propagation involves attention layers and skip connections, which challenge existing methods. Our solution is based on a specific formulation that is shown to maintain the total relevancy across layers. We benchmark our method on very recent visual Transformer networks, as well as on a text classification problem, and demonstrate a clear advantage over the existing explainability methods.
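
For orientation, one of the heuristic attention-graph propagation baselines the paper improves on is attention rollout; a minimal sketch of that baseline (not the proposed Deep-Taylor-based relevance propagation) looks like this:

```python
import torch

def attention_rollout(attentions):
    """attentions: list of per-layer attention maps, each of shape (heads, tokens, tokens)."""
    n = attentions[0].shape[-1]
    rollout = torch.eye(n)
    for attn in attentions:
        avg = attn.mean(dim=0)                      # average over heads
        avg = 0.5 * avg + 0.5 * torch.eye(n)        # crude handling of skip connections
        avg = avg / avg.sum(dim=-1, keepdim=True)   # renormalize rows
        rollout = avg @ rollout                     # propagate relevance layer by layer
    return rollout                                  # token-to-token relevance estimate
```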

Posted Content
TL;DR: QAGS (pronounced “kags”), an automatic evaluation protocol that is designed to identify factual inconsistencies in a generated summary, is proposed and is believed to be a promising tool in automatically generating usable and factually consistent text.
Abstract: Practical applications of abstractive summarization models are limited by frequent factual inconsistencies with respect to their input. Existing automatic evaluation metrics for summarization are largely insensitive to such errors. We propose an automatic evaluation protocol called QAGS (pronounced "kags") that is designed to identify factual inconsistencies in a generated summary. QAGS is based on the intuition that if we ask questions about a summary and its source, we will receive similar answers if the summary is factually consistent with the source. To evaluate QAGS, we collect human judgments of factual consistency on model-generated summaries for the CNN/DailyMail (Hermann et al., 2015) and XSUM (Narayan et al., 2018) summarization datasets. QAGS has substantially higher correlations with these judgments than other automatic evaluation metrics. Also, QAGS offers a natural form of interpretability: The answers and questions generated while computing QAGS indicate which tokens of a summary are inconsistent and why. We believe QAGS is a promising tool in automatically generating usable and factually consistent text.
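
An illustrative sketch of the protocol described above; `generate_questions` and `answer` are hypothetical stand-ins for a question-generation model and a QA model, not the authors' released components:

```python
def token_f1(a, b):
    """Token-level F1 between two answer strings."""
    a_tok, b_tok = a.lower().split(), b.lower().split()
    common = sum(min(a_tok.count(t), b_tok.count(t)) for t in set(a_tok))
    if common == 0:
        return 0.0
    precision, recall = common / len(a_tok), common / len(b_tok)
    return 2 * precision * recall / (precision + recall)

def qags_style_score(source, summary, generate_questions, answer, n_questions=10):
    """Average answer agreement: high when the summary is factually consistent with the source."""
    questions = generate_questions(summary, n=n_questions)
    scores = [token_f1(answer(q, summary), answer(q, source)) for q in questions]
    return sum(scores) / len(scores)
```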

Journal ArticleDOI
TL;DR: In this article, the authors give an overview of interpretability approaches and provide examples of practical interpretability of machine learning in different areas of healthcare, including prediction of health-related outcomes, optimizing treatments or improving the efficiency of screening for specific conditions.
Abstract: There is a need to ensure that machine learning models are interpretable. Higher interpretability of the model means easier comprehension and explanation of future predictions for end-users. Further, interpretable machine learning models allow healthcare experts to make reasonable and data-driven decisions to provide personalized decisions that can ultimately lead to a higher quality of service in healthcare. Generally, we can classify interpretability approaches into two groups, where the first focuses on personalized interpretation (local interpretability) while the second summarizes prediction models on a population level (global interpretability). Alternatively, we can group interpretability methods into model-specific techniques, which are designed to interpret predictions generated by a specific model, such as a neural network, and model-agnostic approaches, which provide easy-to-understand explanations of predictions made by any machine learning model. Here, we give an overview of interpretability approaches and provide examples of practical interpretability of machine learning in different areas of healthcare, including prediction of health-related outcomes, optimizing treatments or improving the efficiency of screening for specific conditions. Further, we outline future directions for interpretable machine learning and highlight the importance of developing algorithmic solutions that can enable machine-learning-driven decision making in high-stakes healthcare problems.
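
A minimal sketch contrasting the two groups, assuming scikit-learn and the shap package (the dataset is a generic stand-in, not a healthcare cohort):

```python
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# Global interpretability: which features matter at the population level (model-agnostic)?
global_importance = permutation_importance(model, X, y, n_repeats=10, random_state=0)

# Local interpretability: why did the model make this particular prediction for one patient?
local_explanation = shap.TreeExplainer(model).shap_values(X.iloc[[0]])
```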

Journal ArticleDOI
TL;DR: A review of the current research effort into making DNNs safe and trustworthy, by focusing on four aspects: verification, testing, adversarial attack and defence, and interpretability, as discussed by the authors.

Journal ArticleDOI
TL;DR: Experimental results on three small and medium basins in China suggest that the proposed STA-LSTM model outperforms Historical Average, Fully Connected Network (FCN), Convolutional Neural Networks (CNN), Graph Convolutional Networks (GCN), the original LSTM, spatial attention LSTM, and temporal attention LSTM (TA-LSTM) in most cases.

Proceedings Article
01 Apr 2020
TL;DR: This work proposes SAGE, a model-agnostic method that quantifies predictive power while accounting for feature interactions and shows that SAGE can be calculated efficiently and that it assigns more accurate importance values than other methods.
Abstract: Understanding the inner workings of complex machine learning models is a long-standing problem and most recent research has focused on local interpretability. To assess the role of individual input features in a global sense, we explore the perspective of defining feature importance through the predictive power associated with each feature. We introduce two notions of predictive power (model-based and universal) and formalize this approach with a framework of additive importance measures, which unifies numerous methods in the literature. We then propose SAGE, a model-agnostic method that quantifies predictive power while accounting for feature interactions. Our experiments show that SAGE can be calculated efficiently and that it assigns more accurate importance values than other methods.
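
A simplified sketch of the underlying idea (not the official SAGE implementation): credit each feature with the average reduction in loss obtained when it is revealed, over sampled feature orderings, with "removing" a feature crudely approximated here by mean imputation:

```python
import numpy as np

def sage_style_importance(model, X, y, loss_fn, n_permutations=50, seed=0):
    """Global importance via sampled feature orderings; X is a NumPy array."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X.mean(axis=0)
    contrib = np.zeros(d)
    for _ in range(n_permutations):
        order = rng.permutation(d)
        X_masked = np.tile(means, (n, 1))             # start with every feature "removed"
        prev_loss = loss_fn(y, model.predict(X_masked))
        for j in order:
            X_masked[:, j] = X[:, j]                  # reveal feature j
            new_loss = loss_fn(y, model.predict(X_masked))
            contrib[j] += prev_loss - new_loss        # loss reduction credited to feature j
            prev_loss = new_loss
    return contrib / n_permutations                   # higher = more predictive power
```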

Posted Content
TL;DR: This systematic review contributes to the body of knowledge by clustering these methods with a hierarchical classification system with four main clusters: review articles, theories and notions, methods, and their evaluation.
Abstract: Explainable Artificial Intelligence (XAI) has experienced significant growth over the last few years. This is due to the widespread application of machine learning, particularly deep learning, which has led to the development of highly accurate models that lack explainability and interpretability. A plethora of methods to tackle this problem have been proposed, developed and tested. This systematic review contributes to the body of knowledge by clustering these methods with a hierarchical classification system with four main clusters: review articles, theories and notions, methods, and their evaluation. It also summarises the state-of-the-art in XAI and recommends future research directions.

Journal ArticleDOI
TL;DR: It is concluded that the fuzzy neural network models and their derivations are efficient in constructing a system with a high degree of accuracy and an appropriate level of interpretability, working across a wide range of areas of economics and science.

Proceedings ArticleDOI
14 Jun 2020
TL;DR: Wang et al. propose a model-driven deep neural network with fully interpretable network structures for single image rain removal; based on the convolutional dictionary learning mechanism for representing rain, they utilize the proximal gradient descent technique to design an iterative algorithm containing only simple operators for solving the model.
Abstract: Deep learning (DL) methods have achieved state-of-the-art performance in the task of single image rain removal. Most current DL architectures, however, still lack sufficient interpretability and are not fully integrated with the physical structures inside general rain streaks. To address this issue, in this paper we propose a model-driven deep neural network for the task, with fully interpretable network structures. Specifically, based on the convolutional dictionary learning mechanism for representing rain, we propose a novel single image deraining model and utilize the proximal gradient descent technique to design an iterative algorithm containing only simple operators for solving the model. Such a simple implementation scheme allows us to unfold it into a new deep network architecture, called the rain convolutional dictionary network (RCDNet), with almost every network module corresponding one-to-one to an operation involved in the algorithm. By end-to-end training of the proposed RCDNet, all the rain kernels and proximal operators can be automatically extracted, faithfully characterizing the features of both the rain and clean background layers, which naturally leads to better deraining performance, especially in real scenarios. Comprehensive experiments substantiate the superiority of the proposed network, especially its good generality to diverse testing scenarios and the good interpretability of all its modules, as compared with the state of the art both visually and quantitatively.
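
A minimal sketch of the unfolding principle the paper builds on, where one proximal gradient step over the rain feature maps becomes a network stage with a learned convolutional dictionary and a learned proximal operator (illustrative of the technique, not the exact RCDNet architecture):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnfoldedStage(nn.Module):
    """One unfolded proximal gradient step: M <- prox(M - eta * C^T (C*M - (O - B)))."""
    def __init__(self, n_maps=32):
        super().__init__()
        self.dictionary = nn.Conv2d(n_maps, 3, kernel_size=9, padding=4, bias=False)  # rain kernels C
        self.step = nn.Parameter(torch.tensor(0.1))                                   # step size eta
        self.prox = nn.Sequential(                                                    # learned proximal operator
            nn.Conv2d(n_maps, n_maps, 3, padding=1), nn.ReLU(),
            nn.Conv2d(n_maps, n_maps, 3, padding=1),
        )

    def forward(self, rain_maps, rainy_img, background):
        residual = self.dictionary(rain_maps) - (rainy_img - background)              # C*M - (O - B)
        grad = F.conv_transpose2d(residual, self.dictionary.weight, padding=4)        # apply C^T to the residual
        return self.prox(rain_maps - self.step * grad)

stage = UnfoldedStage()
maps = stage(torch.randn(1, 32, 64, 64), torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64))
```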

Posted Content
TL;DR: This paper proposes a novel self-supervised framework for multivariate time-series anomaly detection that outperforms other state-of-the-art models on three real-world datasets, has good interpretability, and is useful for anomaly diagnosis.
Abstract: Anomaly detection on multivariate time-series is of great importance in both data mining research and industrial applications. Recent approaches have achieved significant progress in this topic, but there are remaining limitations. One major limitation is that they do not capture the relationships between different time-series explicitly, resulting in inevitable false alarms. In this paper, we propose a novel self-supervised framework for multivariate time-series anomaly detection to address this issue. Our framework considers each univariate time-series as an individual feature and includes two graph attention layers in parallel to learn the complex dependencies of multivariate time-series in both temporal and feature dimensions. In addition, our approach jointly optimizes a forecasting-based model and a reconstruction-based model, obtaining better time-series representations through a combination of single-timestamp prediction and reconstruction of the entire time-series. We demonstrate the efficacy of our model through extensive experiments. The proposed method outperforms other state-of-the-art models on three real-world datasets. Further analysis shows that our method has good interpretability and is useful for anomaly diagnosis.
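
A toy sketch of the joint forecasting-plus-reconstruction objective described above (graph attention layers omitted; this illustrates the training signal only, not the proposed model):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointAnomalyModel(nn.Module):
    """Shared encoder with a forecasting head and a reconstruction head."""
    def __init__(self, n_features, window, hidden=64):
        super().__init__()
        self.encoder = nn.GRU(n_features, hidden, batch_first=True)
        self.forecast_head = nn.Linear(hidden, n_features)           # predict the next timestamp
        self.recon_head = nn.Linear(hidden, window * n_features)     # reconstruct the whole window

    def forward(self, x):                                            # x: (batch, window, n_features)
        _, h = self.encoder(x)
        h = h.squeeze(0)
        return self.forecast_head(h), self.recon_head(h).view(x.shape)

model = JointAnomalyModel(n_features=8, window=30)
x, next_step = torch.randn(16, 30, 8), torch.randn(16, 8)
forecast, reconstruction = model(x)
loss = F.mse_loss(forecast, next_step) + F.mse_loss(reconstruction, x)  # joint objective
```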

Journal ArticleDOI
TL;DR: Modifications to the existing models of fuzzy rough neural networks are proposed, and a powerful evolutionary framework for fuzzy rough neural networks is developed by inheriting the merits of both systems, while the objectives of prediction precision and network simplicity are considered.
Abstract: Fuzzy rough theory can describe real-world situations in a mathematically effective and interpretable way, while evolutionary neural networks can be utilized to solve complex problems. Combining them with these complementary capabilities may lead to an evolutionary fuzzy rough neural network with both interpretability and prediction capability. In this article, we propose modifications to the existing models of fuzzy rough neural network and then develop a powerful evolutionary framework for fuzzy rough neural networks by inheriting the merits of both the aforementioned systems. We first introduce rough neurons and enhance the consequence nodes, and further integrate the interval type-2 fuzzy set into the existing fuzzy rough neural network model. Thus, several modified fuzzy rough neural network models are proposed. While simultaneously considering the objectives of prediction precision and network simplicity, each model is transformed into a multiobjective optimization problem by encoding the structure, membership functions, and the parameters of the network. To solve these optimization problems, distributed parallel multiobjective evolutionary algorithms are proposed. We enhance the optimization processes with several measures including optimizer replacement and parameter adaption. In the distributed parallel environment, the tedious and time-consuming neural network optimization can be alleviated by numerous computational resources, significantly reducing the computational time. Through experimental verification on complex stock time series prediction tasks, the proposed optimization algorithms and the modified fuzzy rough neural network models exhibit significant improvements over the existing fuzzy rough neural network and the long short-term memory network.

Journal ArticleDOI
TL;DR: It is discovered that a single algorithm with 19 control neurons, connecting 32 encapsulated input features to outputs by 253 synapses, learns to map high-dimensional inputs into steering commands, showing superior generalizability, interpretability and robustness compared with orders-of-magnitude larger black-box learning systems.
Abstract: A central goal of artificial intelligence in high-stakes decision-making applications is to design a single algorithm that simultaneously expresses generalizability by learning coherent representations of its world and interpretable explanations of its dynamics. Here, we combine brain-inspired neural computation principles and scalable deep learning architectures to design compact neural controllers for task-specific compartments of a full-stack autonomous vehicle control system. We discover that a single algorithm with 19 control neurons, connecting 32 encapsulated input features to outputs by 253 synapses, learns to map high-dimensional inputs into steering commands. This system shows superior generalizability, interpretability and robustness compared with orders-of-magnitude larger black-box learning systems. The obtained neural agents enable high-fidelity autonomy for task-specific parts of a complex autonomous system. Inspired by the brain of the roundworm Caenorhabditis elegans, the authors design a highly compact neural network controller that works directly from raw input pixels. Compared with larger networks, this compact controller demonstrates improved generalization, robustness and interpretability on a lane-keeping task.

Posted Content
TL;DR: A simple but comprehensive taxonomy for interpretability is proposed, systematically review recent studies on interpretability of neural networks, describe applications of interpretability in medicine, and discuss future research directions, such as in relation to fuzzy logic and brain science.
Abstract: Deep learning as represented by the artificial deep neural networks (DNNs) has achieved great success in many important areas that deal with text, images, videos, graphs, and so on. However, the black-box nature of DNNs has become one of the primary obstacles for their wide acceptance in mission-critical applications such as medical diagnosis and therapy. Due to the huge potential of deep learning, interpreting neural networks has recently attracted much research attention. In this paper, based on our comprehensive taxonomy, we systematically review recent studies in understanding the mechanism of neural networks, describe applications of interpretability especially in medicine, and discuss future directions of interpretability research, such as in relation to fuzzy logic and brain science.

Journal ArticleDOI
TL;DR: This paper shows how Shapley additive explanations can be used to interpret the outputs of a deep neural network designed to predict nitrogen dioxide concentrations in Madrid and compares three explanatory methods to determine which one is more suitable for the air quality data and for the chosen machine learning model.

Journal ArticleDOI
TL;DR: A novel method for estimating the uncertainty associated with important features in the input is proposed, and it is demonstrated how interpretability and uncertainty can be modeled in decision support systems (DSSs) for semantic segmentation of colorectal polyps.

Proceedings ArticleDOI
01 Oct 2020
TL;DR: The Language Interpretability Tool (LIT), an open-source platform for visualization and understanding of NLP models, is presented, which integrates local explanations, aggregate analysis, and counterfactual generation into a streamlined, browser-based interface to enable rapid exploration and error analysis.
Abstract: We present the Language Interpretability Tool (LIT), an open-source platform for visualization and understanding of NLP models. We focus on core questions about model behavior: Why did my model make this prediction? When does it perform poorly? What happens under a controlled change in the input? LIT integrates local explanations, aggregate analysis, and counterfactual generation into a streamlined, browser-based interface to enable rapid exploration and error analysis. We include case studies for a diverse set of workflows, including exploring counterfactuals for sentiment analysis, measuring gender bias in coreference systems, and exploring local behavior in text generation. LIT supports a wide range of models---including classification, seq2seq, and structured prediction---and is highly extensible through a declarative, framework-agnostic API. LIT is under active development, with code and full documentation available at https://github.com/pair-code/lit.


Posted Content
TL;DR: The TabTransformer is a novel deep tabular data modeling architecture for supervised and semi-supervised learning built upon self-attention based Transformers that outperforms the state-of-the-art deep learning methods for tabular data by at least 1.0% on mean AUC, and matches the performance of tree-based ensemble models.
Abstract: We propose TabTransformer, a novel deep tabular data modeling architecture for supervised and semi-supervised learning. The TabTransformer is built upon self-attention based Transformers. The Transformer layers transform the embeddings of categorical features into robust contextual embeddings to achieve higher prediction accuracy. Through extensive experiments on fifteen publicly available datasets, we show that the TabTransformer outperforms the state-of-the-art deep learning methods for tabular data by at least 1.0% on mean AUC, and matches the performance of tree-based ensemble models. Furthermore, we demonstrate that the contextual embeddings learned from TabTransformer are highly robust against both missing and noisy data features, and provide better interpretability. Lastly, for the semi-supervised setting we develop an unsupervised pre-training procedure to learn data-driven contextual embeddings, resulting in an average 2.1% AUC lift over the state-of-the-art methods.
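
A minimal PyTorch sketch of the architecture as described above: categorical features are embedded, contextualized by Transformer layers, then concatenated with the normalized continuous features and fed to an MLP head. Dimensions and layer counts are illustrative, not the paper's configuration:

```python
import torch
import torch.nn as nn

class TabTransformerSketch(nn.Module):
    def __init__(self, cat_cardinalities, n_continuous, d_model=32, n_heads=4, n_layers=3, n_classes=2):
        super().__init__()
        self.cat_embeddings = nn.ModuleList([nn.Embedding(c, d_model) for c in cat_cardinalities])
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)        # contextualizes categorical embeddings
        self.norm = nn.LayerNorm(n_continuous)
        self.head = nn.Sequential(
            nn.Linear(d_model * len(cat_cardinalities) + n_continuous, 64),
            nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, x_cat, x_cont):
        tokens = torch.stack([emb(x_cat[:, i]) for i, emb in enumerate(self.cat_embeddings)], dim=1)
        context = self.encoder(tokens).flatten(1)                    # contextual embeddings, flattened
        return self.head(torch.cat([context, self.norm(x_cont)], dim=1))

model = TabTransformerSketch(cat_cardinalities=[5, 12, 3], n_continuous=4)
logits = model(torch.randint(0, 3, (8, 3)), torch.randn(8, 4))
```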

Posted Content
TL;DR: Comprehensive experiments substantiate the superiority of the proposed model-driven deep neural network for the task, especially its good generality to diverse testing scenarios and the good interpretability of all its modules, as compared with the state of the art both visually and quantitatively.
Abstract: Deep learning (DL) methods have achieved state-of-the-art performance in the task of single image rain removal. Most current DL architectures, however, still lack sufficient interpretability and are not fully integrated with the physical structures inside general rain streaks. To address this issue, in this paper we propose a model-driven deep neural network for the task, with fully interpretable network structures. Specifically, based on the convolutional dictionary learning mechanism for representing rain, we propose a novel single image deraining model and utilize the proximal gradient descent technique to design an iterative algorithm containing only simple operators for solving the model. Such a simple implementation scheme allows us to unfold it into a new deep network architecture, called the rain convolutional dictionary network (RCDNet), with almost every network module corresponding one-to-one to an operation involved in the algorithm. By end-to-end training of the proposed RCDNet, all the rain kernels and proximal operators can be automatically extracted, faithfully characterizing the features of both the rain and clean background layers, which naturally leads to better deraining performance, especially in real scenarios. Comprehensive experiments substantiate the superiority of the proposed network, especially its good generality to diverse testing scenarios and the good interpretability of all its modules, as compared with the state of the art both visually and quantitatively. The source code is available at this https URL.

Journal ArticleDOI
09 Jun 2020-Mbio
TL;DR: A reusable open-source pipeline to train, validate, and interpret ML models that help identify microbial biomarkers of disease and highlight the importance of choosing an ML approach based on the goal of the study, as the choice will inform expectations of performance and interpretability.
Abstract: Machine learning (ML) modeling of the human microbiome has the potential to identify microbial biomarkers and aid in the diagnosis of many diseases such as inflammatory bowel disease, diabetes, and colorectal cancer. Progress has been made toward developing ML models that predict health outcomes using bacterial abundances, but inconsistent adoption of training and evaluation methods calls the validity of these models into question. Furthermore, there appears to be a preference by many researchers to favor increased model complexity over interpretability. To overcome these challenges, we trained seven models that used fecal 16S rRNA sequence data to predict the presence of colonic screen relevant neoplasias (SRNs) (n = 490 patients, 261 controls and 229 cases). We developed a reusable open-source pipeline to train, validate, and interpret ML models. To show the effect of model selection, we assessed the predictive performance, interpretability, and training time of L2-regularized logistic regression, L1- and L2-regularized support vector machines (SVM) with linear and radial basis function kernels, a decision tree, random forest, and gradient boosted trees (XGBoost). The random forest model performed best at detecting SRNs with an area under the receiver operating characteristic curve (AUROC) of 0.695 (interquartile range [IQR], 0.651 to 0.739) but was slow to train (83.2 h) and not inherently interpretable. Despite its simplicity, L2-regularized logistic regression followed random forest in predictive performance with an AUROC of 0.680 (IQR, 0.625 to 0.735), trained faster (12 min), and was inherently interpretable. Our analysis highlights the importance of choosing an ML approach based on the goal of the study, as the choice will inform expectations of performance and interpretability. IMPORTANCE Diagnosing diseases using machine learning (ML) is rapidly being adopted in microbiome studies. However, the estimated performance associated with these models is likely overoptimistic. Moreover, there is a trend toward using black box models without a discussion of the difficulty of interpreting such models when trying to identify microbial biomarkers of disease. This work represents a step toward developing more-reproducible ML practices in applying ML to microbiome research. We implement a rigorous pipeline and emphasize the importance of selecting ML models that reflect the goal of the study. These concepts are not particular to the study of human health but can also be applied to environmental microbiology studies.
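
An illustrative sketch of the kind of cross-validated comparison such a pipeline performs, with synthetic features standing in for the 16S rRNA abundance table:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a patients-by-taxa abundance matrix.
X, y = make_classification(n_samples=490, n_features=200, n_informative=20, random_state=0)

models = {
    "L2 logistic regression (interpretable)": LogisticRegression(penalty="l2", C=1.0, max_iter=5000),
    "random forest (black box)": RandomForestClassifier(n_estimators=500, random_state=0),
}
for name, model in models.items():
    auroc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: AUROC = {auroc.mean():.3f} +/- {auroc.std():.3f}")
```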