Open Access Proceedings Article (DOI)

Dipole: Diagnosis Prediction in Healthcare via Attention-based Bidirectional Recurrent Neural Networks

TLDR
Dipole employs bidirectional recurrent neural networks to capture information from both past and future visits, and introduces three attention mechanisms to measure the relationships among different visits for prediction.
Abstract
Predicting the future health information of patients from historical Electronic Health Records (EHR) is a core research task in the development of personalized healthcare. Patient EHR data consist of sequences of visits over time, where each visit contains multiple medical codes, including diagnosis, medication, and procedure codes. The most important challenges for this task are to model the temporality and high dimensionality of sequential EHR data and to interpret the prediction results. Existing work solves this problem by employing recurrent neural networks (RNNs) to model EHR data and utilizing simple attention mechanisms to interpret the results. However, RNN-based approaches suffer from the problem that the performance of RNNs drops when sequences are long, and the relationships between subsequent visits are ignored by current RNN-based approaches. To address these issues, we propose Dipole, an end-to-end, simple and robust model for predicting patients' future health information. Dipole employs bidirectional recurrent neural networks to retain the information of both past and future visits, and it introduces three attention mechanisms to measure the relationships among different visits for prediction. With the attention mechanisms, Dipole can interpret the prediction results effectively. Dipole also allows us to interpret the learned medical code representations, which medical experts have positively confirmed. Experimental results on two real-world EHR datasets show that the proposed Dipole can significantly improve prediction accuracy compared with state-of-the-art diagnosis prediction approaches and provide clinically meaningful interpretation.
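The abstract's core idea, attending over the hidden states of a bidirectional RNN to weight each visit's contribution, can be sketched as follows. This is an illustrative NumPy sketch of one attention variant (a general, bilinear scoring function) over stand-in hidden states; it is not the authors' exact implementation, and all names (`general_attention`, `W`, etc.) are hypothetical.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def general_attention(hidden, query, W):
    """Score each visit's hidden state against the current query with a
    bilinear form, then build a context vector as the weighted sum."""
    scores = np.array([query @ W @ h for h in hidden])  # one score per visit
    weights = softmax(scores)                           # distribution over visits
    context = weights @ hidden                          # weighted sum of states
    return weights, context

rng = np.random.default_rng(0)
T, d = 5, 4                       # 5 visits, hidden size 4
hidden = rng.normal(size=(T, d))  # stand-in for concatenated BiRNN states
query = rng.normal(size=d)        # stand-in for the current hidden state
W = rng.normal(size=(d, d))       # learned bilinear attention matrix

weights, context = general_attention(hidden, query, W)
```

The attention weights form a distribution over visits, which is what makes per-visit interpretation of a prediction possible.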


Citations
Journal ArticleDOI

Opportunities and challenges in developing deep learning models using electronic health records data: a systematic review.

TL;DR: A systematic review of deep learning models for electronic health record (EHR) data is conducted, and various deep learning architectures for analyzing different data sources and their target applications are illustrated.
Posted Content

An Attentive Survey of Attention Models

TL;DR: A taxonomy that groups existing techniques into coherent categories in attention models is proposed, and how attention has been used to improve the interpretability of neural networks is described.
Journal ArticleDOI

The importance of interpretability and visualization in machine learning for applications in medicine and health care

TL;DR: It is argued that, beyond improving model interpretability as a goal in itself, machine learning needs to integrate medical experts into the design of data analysis interpretation strategies; otherwise, machine learning is unlikely to become part of routine clinical and health care practice.
Journal ArticleDOI

WiFi CSI Based Passive Human Activity Recognition Using Attention Based BLSTM

TL;DR: This paper proposes a new deep learning based approach, attention based bi-directional long short-term memory (ABLSTM), for passive human activity recognition using WiFi CSI signals; the model learns representative features in both directions from raw sequential CSI measurements.
Posted Content

Med-BERT: pre-trained contextualized embeddings on large-scale structured electronic health records for disease prediction

TL;DR: Inspired by BERT, Med-BERT is a contextualized embedding model pretrained on a structured EHR dataset of 28,490,650 patients that substantially improves the prediction accuracy and can boost the area under the receiver operating characteristics curve (AUC) by 1.21–6.14% in two disease prediction tasks from two clinical databases.
References
Journal ArticleDOI

Long short-term memory

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Proceedings Article

Distributed Representations of Words and Phrases and their Compositionality

TL;DR: This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.
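The negative-sampling alternative to the hierarchical softmax mentioned above can be sketched as a per-pair loss: push the score of the true (center, context) pair up and the scores of a few sampled negatives down, instead of normalizing over the whole vocabulary. This is an illustrative NumPy sketch with hypothetical names (`negative_sampling_loss`), not the original word2vec code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def negative_sampling_loss(center, context, negatives):
    """Skip-gram with negative sampling: maximize the score of the true
    (center, context) pair and minimize the scores of k sampled negatives."""
    pos = -np.log(sigmoid(center @ context))           # true pair term
    neg = -np.sum(np.log(sigmoid(-(negatives @ center))))  # k negative terms
    return pos + neg

rng = np.random.default_rng(1)
d, k = 8, 5                          # embedding size, number of negatives
center = rng.normal(size=d)          # center-word embedding
context = rng.normal(size=d)         # true context-word embedding
negatives = rng.normal(size=(k, d))  # k sampled negative embeddings

loss = negative_sampling_loss(center, context, negatives)
```

Because each term is a negative log of a sigmoid, the loss is always positive and cheap to compute: k + 1 dot products per training pair rather than a full-vocabulary softmax.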
Proceedings Article

Neural Machine Translation by Jointly Learning to Align and Translate

TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and it is proposed to extend it by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
Proceedings ArticleDOI

Effective Approaches to Attention-based Neural Machine Translation

TL;DR: A global approach which always attends to all source words and a local one that only looks at a subset of source words at a time are examined, demonstrating the effectiveness of both approaches on the WMT translation tasks between English and German in both directions.
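The global/local distinction described above can be sketched directly: global attention scores every source state, while local attention restricts scoring to a window around an aligned position. This is an illustrative NumPy sketch with a dot-product scoring function; the function names and the fixed window center are assumptions for the example, not the paper's exact formulation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def global_attention(query, states):
    """Global attention: every source state gets a (dot-product) score."""
    return softmax(states @ query)

def local_attention(query, states, center, window):
    """Local attention: only states within the window around the aligned
    position are scored; positions outside the window get zero weight."""
    lo, hi = max(0, center - window), min(len(states), center + window + 1)
    weights = np.zeros(len(states))
    weights[lo:hi] = softmax(states[lo:hi] @ query)
    return weights

rng = np.random.default_rng(2)
S, d = 7, 4                        # 7 source positions, hidden size 4
states = rng.normal(size=(S, d))   # stand-in encoder hidden states
query = rng.normal(size=d)         # stand-in decoder hidden state

g = global_attention(query, states)          # nonzero over all positions
l = local_attention(query, states, 3, 1)     # nonzero only near position 3
```

Both variants produce a distribution over source positions; the local variant trades coverage for cheaper scoring on long sequences.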