Open Access · Posted Content

RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism

TLDR
The REverse Time AttentIoN model (RETAIN) is developed for application to Electronic Health Records (EHR) data; it achieves high accuracy while remaining clinically interpretable, based on a two-level neural attention model that detects influential past visits and significant clinical variables within those visits.
Abstract
Accuracy and interpretability are two dominant features of successful predictive models. Typically, a choice must be made in favor of complex black-box models such as recurrent neural networks (RNN) for accuracy versus less accurate but more interpretable traditional models such as logistic regression. This tradeoff poses challenges in medicine, where both accuracy and interpretability are important. We addressed this challenge by developing the REverse Time AttentIoN model (RETAIN) for application to Electronic Health Records (EHR) data. RETAIN achieves high accuracy while remaining clinically interpretable and is based on a two-level neural attention model that detects influential past visits and significant clinical variables within those visits (e.g., key diagnoses). RETAIN mimics physician practice by attending to the EHR data in reverse time order, so that recent clinical visits are likely to receive higher attention. RETAIN was tested on a large health system EHR dataset with 14 million visits completed by 263K patients over an 8-year period, and demonstrated predictive accuracy and computational scalability comparable to state-of-the-art methods such as RNNs, and ease of interpretability comparable to traditional models.
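For intuition, here is a minimal numpy sketch of the two-level attention computation described above. It is a sketch under loud assumptions, not the paper's implementation: the two reverse-time RNNs are replaced by identity stand-ins, all weights are random, and names such as W_alpha and W_beta are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Illustrative sizes: T visits, each embedded as an m-dimensional vector v_i.
T, m = 5, 8
V = rng.normal(size=(T, m))       # visit embeddings v_1..v_T

# RETAIN runs two RNNs over the visits in *reverse* time order; here the
# RNN outputs are replaced by the reversed embeddings themselves to keep
# the sketch self-contained.
g = V[::-1]                       # stand-in for RNN_alpha outputs
h = V[::-1]                       # stand-in for RNN_beta outputs

W_alpha = rng.normal(size=(m,))   # scores each visit -> one scalar
W_beta = rng.normal(size=(m, m))  # gates each variable within a visit

alpha = softmax(g @ W_alpha)[::-1]   # visit-level attention, shape (T,)
beta = np.tanh(h @ W_beta)[::-1]     # variable-level attention, shape (T, m)

# Context vector: visits weighted by alpha, variables gated by beta.
context = (alpha[:, None] * beta * V).sum(axis=0)

# A prediction would then be, e.g., sigmoid(w_out @ context). Because alpha
# and beta enter the sum directly, each visit's and each variable's
# contribution can be read off, which is the source of interpretability.
print("visit attention:", np.round(alpha, 3))
```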


Citations
Journal ArticleDOI

Opportunities and obstacles for deep learning in biology and medicine.

TL;DR: It is found that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made over the prior state of the art.
Journal ArticleDOI

Opportunities and challenges in developing deep learning models using electronic health records data: a systematic review.

TL;DR: A systematic review of deep learning models for electronic health record (EHR) data is conducted, illustrating the deep learning architectures used for analyzing different data sources and their target applications.
Posted Content

Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting

TL;DR: The Temporal Fusion Transformer (TFT) is introduced: a novel attention-based architecture that combines high-performance multi-horizon forecasting with interpretable insights into temporal dynamics. Three practical interpretability use cases of TFT are showcased.
Journal ArticleDOI

Time Series Forecasting With Deep Learning: A Survey

TL;DR: This article surveys common encoder and decoder designs used in both one-step-ahead and multi-horizon time-series forecasting, describing how each model incorporates temporal information into its predictions.
References
Journal ArticleDOI

Long short-term memory

TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete time steps by enforcing constant error flow through constant error carousels within special units.
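To make the "constant error carousel" concrete, the following numpy sketch performs one step of an LSTM cell (the common modern variant with a forget gate, which the original 1997 formulation lacked). The sizes and random weights are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_in, n_hid = 4, 3
Wx = rng.normal(scale=0.1, size=(4 * n_hid, n_in))   # input weights (i, f, o, g)
Wh = rng.normal(scale=0.1, size=(4 * n_hid, n_hid))  # recurrent weights
b = np.zeros(4 * n_hid)

def lstm_step(x, h, c):
    """One LSTM step. The cell state c is the 'constant error carousel':
    it is updated only additively, through elementwise gates, so error
    signals can flow along it across many time steps without vanishing."""
    z = Wx @ x + Wh @ h + b
    i = sigmoid(z[0 * n_hid:1 * n_hid])   # input gate
    f = sigmoid(z[1 * n_hid:2 * n_hid])   # forget gate
    o = sigmoid(z[2 * n_hid:3 * n_hid])   # output gate
    g = np.tanh(z[3 * n_hid:4 * n_hid])   # candidate values
    c = f * c + i * g                     # additive cell update (the CEC)
    h = o * np.tanh(c)                    # hidden output
    return h, c

h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(1000):                     # bridging a long lag
    h, c = lstm_step(rng.normal(size=n_in), h, c)
print("h after 1000 steps:", np.round(h, 3))
```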
Proceedings Article

Neural Machine Translation by Jointly Learning to Align and Translate

TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of the basic encoder-decoder architecture, and it is proposed to extend it by allowing the model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
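A minimal numpy sketch of the soft-search step this summary describes, i.e., additive attention over the source annotations; the weights are random stand-ins and the dimensions are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# T_src source annotations h_1..h_T of width d_h, one decoder state s.
T_src, d_h, d_s, d_a = 6, 4, 4, 5
H = rng.normal(size=(T_src, d_h))   # encoder annotations
s = rng.normal(size=d_s)            # current decoder state

Wa = rng.normal(size=(d_a, d_s))    # projects the decoder state
Ua = rng.normal(size=(d_a, d_h))    # projects each annotation
va = rng.normal(size=d_a)           # scoring vector

# Additive scoring: e_j = va^T tanh(Wa s + Ua h_j), then soft alignment.
e = np.tanh(s @ Wa.T + H @ Ua.T) @ va   # shape (T_src,)
a = softmax(e)                          # soft alignment weights
context = a @ H                         # expected annotation, shape (d_h,)

# The decoder conditions each target word on this per-step context vector
# rather than on a single fixed-length encoding of the whole sentence.
print("alignment weights:", np.round(a, 3))
```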
Proceedings ArticleDOI

Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation

TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
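Written out, the joint training objective named in this summary is standard conditional maximum likelihood over N source/target pairs (a textbook formulation, stated here for clarity rather than quoted from the paper):

```latex
\max_{\theta} \; \frac{1}{N} \sum_{n=1}^{N} \log p_{\theta}\left(\mathbf{y}_{n} \mid \mathbf{x}_{n}\right)
```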
Journal ArticleDOI

Learning long-term dependencies with gradient descent is difficult

TL;DR: This work shows why gradient-based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases, and exposes a trade-off between efficient learning by gradient descent and latching onto information for long periods.
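The difficulty has a compact numeric illustration: backpropagation through a recurrence multiplies the error signal by the recurrent Jacobian once per time step, so the signal shrinks (or blows up) exponentially with the lag. The sketch below uses a linear recurrence as a stand-in; a real RNN's nonlinearity derivatives only accelerate the decay.

```python
import numpy as np

rng = np.random.default_rng(3)

# Recurrent weights with largest singular value below 1 (contractive),
# so the backpropagated signal decays exponentially with the lag.
n = 8
W = rng.normal(size=(n, n)) * 0.3

for lag in [1, 10, 50, 100]:
    g = np.ones(n)                 # error signal at the final step
    for _ in range(lag):
        g = W.T @ g                # one backward step (tanh derivative
                                   # omitted; it only shrinks g further)
    print(f"lag {lag:3d}: gradient norm = {np.linalg.norm(g):.2e}")
```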