Proceedings Article

RETAIN: An interpretable predictive model for healthcare using reverse time attention mechanism

01 Jan 2016-Vol. 29, pp 3504-3512
TL;DR: In this paper, a two-level neural attention model is proposed to detect influential past visits and significant clinical variables within those visits (e.g. key diagnoses) in reverse time order so that recent clinical visits are likely to receive higher attention.
Abstract: Accuracy and interpretability are two dominant features of successful predictive models. Typically, a choice must be made in favor of complex black box models such as recurrent neural networks (RNN) for accuracy versus less accurate but more interpretable traditional models such as logistic regression. This tradeoff poses challenges in medicine where both accuracy and interpretability are important. We addressed this challenge by developing the REverse Time AttentIoN model (RETAIN) for application to Electronic Health Records (EHR) data. RETAIN achieves high accuracy while remaining clinically interpretable and is based on a two-level neural attention model that detects influential past visits and significant clinical variables within those visits (e.g. key diagnoses). RETAIN mimics physician practice by attending the EHR data in a reverse time order so that recent clinical visits are likely to receive higher attention. RETAIN was tested on a large health system EHR dataset with 14 million visits completed by 263K patients over an 8 year period and demonstrated predictive accuracy and computational scalability comparable to state-of-the-art methods such as RNN, and ease of interpretability comparable to traditional models.
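
The abstract describes the two-level attention only in words, so the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes visits arrive as multi-hot code vectors, substitutes a plain tanh RNN for whatever recurrent unit the model actually uses, and omits biases and the final classifier. All names (`retain_style_context`, `make_params`, `W_emb`, `w_alpha`, `W_beta`, and so on) are hypothetical. It shows visit-level weights (`alpha`) and variable-level weights (`beta`) being computed from RNNs that read the visits most recent first, then combined into one context vector.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def tanh_rnn(inputs, W_in, W_rec):
    """Plain tanh RNN (assumed stand-in); returns one hidden state per step."""
    h = np.zeros(W_rec.shape[0])
    states = []
    for x in inputs:
        h = np.tanh(W_in @ x + W_rec @ h)
        states.append(h)
    return np.stack(states)

def retain_style_context(visits, params):
    """visits: (T, n_codes) multi-hot matrix, oldest visit first."""
    v = visits @ params["W_emb"].T                 # (T, m) visit embeddings
    v_rev = v[::-1]                                # most recent visit is read first
    # Run both RNNs in reverse time, then flip back to the original visit order.
    g = tanh_rnn(v_rev, params["Wa_in"], params["Wa_rec"])[::-1]
    h = tanh_rnn(v_rev, params["Wb_in"], params["Wb_rec"])[::-1]
    alpha = softmax(g @ params["w_alpha"])         # (T,) one weight per visit
    beta = np.tanh(h @ params["W_beta"].T)         # (T, m) one weight per embedding dim
    context = (alpha[:, None] * beta * v).sum(axis=0)
    return context, alpha, beta

def make_params(n_codes, m, hidden, seed=0):
    """Random toy parameters, for illustration only."""
    rng = np.random.default_rng(seed)
    shapes = {
        "W_emb": (m, n_codes),
        "Wa_in": (hidden, m), "Wa_rec": (hidden, hidden), "w_alpha": (hidden,),
        "Wb_in": (hidden, m), "Wb_rec": (hidden, hidden), "W_beta": (m, hidden),
    }
    return {name: rng.normal(scale=0.5, size=shape) for name, shape in shapes.items()}

# Toy patient: 4 visits over a vocabulary of 6 codes.
visits = np.array([
    [1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 1],
], dtype=float)
params = make_params(n_codes=6, m=3, hidden=5)
context, alpha, beta = retain_style_context(visits, params)
print("visit-level attention:", np.round(alpha, 3))
```

Because the context vector is a weighted sum of visit embeddings, each visit's share of the prediction can be read directly from `alpha` and `beta`, which is the kind of interpretability the abstract emphasizes.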

Citations
Proceedings ArticleDOI
17 Oct 2018
TL;DR: Experimental results show that the proposed KAME significantly improves the prediction performance compared with the state-of-the-art approaches, guarantees the robustness with both sufficient and insufficient data, and learns interpretable disease representations.
Abstract: The goal of diagnosis prediction task is to predict the future health information of patients from their historical Electronic Healthcare Records (EHR). The most important and challenging problem of diagnosis prediction is to design an accurate, robust and interpretable predictive model. Existing work solves this problem by employing recurrent neural networks (RNNs) with attention mechanisms, but these approaches suffer from the data sufficiency problem. To obtain good performance with insufficient data, graph-based attention models are proposed. However, when the training data are sufficient, they do not offer any improvement in performance compared with ordinary attention-based models. To address these issues, we propose KAME, an end-to-end, accurate and robust model for predicting patients' future health information. KAME not only learns reasonable embeddings for nodes in the knowledge graph, but also exploits general knowledge to improve the prediction accuracy with the proposed knowledge attention mechanism. With the learned attention weights, KAME allows us to interpret the importance of each piece of knowledge in the graph. Experimental results on three real world datasets show that the proposed KAME significantly improves the prediction performance compared with the state-of-the-art approaches, guarantees the robustness with both sufficient and insufficient data, and learns interpretable disease representations.

151 citations

Journal ArticleDOI
TL;DR: It is found that XAI evaluation in medicine has not been adequately and formally practiced, and ample opportunities exist to advance XAI research in medicine.

151 citations

Journal ArticleDOI
TL;DR: Zhang et al. proposed a dual-stage two-phase (DSTP) based recurrent neural network (DSTP-RNN) for long-term time series prediction, in which the first phase produces violent but decentralized response weights, while the second phase leads to stationary and concentrated response weights.
Abstract: Long-term prediction of multivariate time series is still an important but challenging problem. The key to solve this problem is capturing (1) the spatial correlations at the same time, (2) the spatio-temporal relationships at different times, and (3) long-term dependency of the temporal relationships between different series. Attention-based recurrent neural networks (RNN) can effectively represent and learn the dynamic spatio-temporal relationships between exogenous series and target series, but they only perform well in one-step time prediction and short-term time prediction. In this paper, inspired by human attention mechanism including the dual-stage two-phase (DSTP) model and the influence mechanism of target information and non-target information, we propose DSTP-based RNN (DSTP-RNN) and DSTP-RNN-Ⅱ respectively for long-term time series prediction. Specifically, we first propose the DSTP-based structure to enhance the spatial correlations between exogenous series. The first phase produces violent but decentralized response weight, while the second phase leads to stationary and concentrated response weight. Then, we employ multiple attentions on target series to boost the long-term dependency. Finally, we study the performance of deep spatial attention mechanism and provide interpretation. Experimental results demonstrate that the present work can be successfully used to develop expert or intelligent systems for a wide range of applications, with state-of-the-art performances superior to nine baseline methods on four datasets in the fields of energy, finance, environment and medicine, respectively. Overall, the present work carries a significant value not merely in the domain of machine intelligence and deep learning, but also in the fields of many applications.

150 citations

Journal ArticleDOI
TL;DR: This study introduces BEHRT: A deep neural sequence transduction model for electronic health records (EHR), capable of simultaneously predicting the likelihood of 301 conditions in one’s future visits and shows a striking improvement over the existing state-of-the-art deep EHR models.
Abstract: Today, despite decades of developments in medicine and the growing interest in precision healthcare, vast majority of diagnoses happen once patients begin to show noticeable signs of illness. Early indication and detection of diseases, however, can provide patients and carers with the chance of early intervention, better disease management, and efficient allocation of healthcare resources. The latest developments in machine learning (including deep learning) provides a great opportunity to address this unmet need. In this study, we introduce BEHRT: A deep neural sequence transduction model for electronic health records (EHR), capable of simultaneously predicting the likelihood of 301 conditions in one’s future visits. When trained and evaluated on the data from nearly 1.6 million individuals, BEHRT shows a striking improvement of 8.0–13.2% (in terms of average precision scores for different tasks), over the existing state-of-the-art deep EHR models. In addition to its scalability and superior accuracy, BEHRT enables personalised interpretation of its predictions; its flexible architecture enables it to incorporate multiple heterogeneous concepts (e.g., diagnosis, medication, measurements, and more) to further improve the accuracy of its predictions; its (pre-)training results in disease and patient representations can be useful for future studies (i.e., transfer learning).

149 citations

Journal ArticleDOI
TL;DR: A computational framework, Patient2Vec, is proposed to learn an interpretable deep representation of longitudinal EHR data, which is personalized for each patient, and it achieves an area under curve around 0.799, outperforming baseline methods.
Abstract: The wide implementation of electronic health record (EHR) systems facilitates the collection of large-scale health data from real clinical settings. Despite the significant increase in adoption of EHR systems, these data remain largely unexplored, but present a rich data source for knowledge discovery from patient health histories in tasks, such as understanding disease correlations and predicting health outcomes. However, the heterogeneity, sparsity, noise, and bias in these data present many complex challenges. This complexity makes it difficult to translate potentially relevant information into machine learning algorithms. In this paper, we propose a computational framework, Patient2Vec, to learn an interpretable deep representation of longitudinal EHR data, which is personalized for each patient. To evaluate this approach, we apply it to the prediction of future hospitalizations using real EHR data and compare its predictive performance with baseline methods. Patient2Vec produces a vector space with meaningful structure, and it achieves an area under curve around 0.799, outperforming baseline methods. In the end, the learned feature importance can be visualized and interpreted at both the individual and population levels to bring clinical insights.

141 citations