Open Access · Journal Article

Correction: Extracting Family History Information From Electronic Health Records: Natural Language Processing Analysis

TL;DR
The authors used transformers to extract disease mentions from clinical notes, rule-based methods and coreference resolution techniques to extract family member (FM) information, and transfer learning strategies to improve the annotation of diseases.
Abstract
Background: The prognosis, diagnosis, and treatment of many genetic disorders and familial diseases significantly improve if the family history (FH) of a patient is known. Such information is often written in the free text of clinical notes.

Objective: The aim of this study is to develop automated methods that enable access to FH data through natural language processing.

Methods: We performed information extraction by using transformers to extract disease mentions from notes. We also experimented with rule-based methods for extracting family member (FM) information from text and coreference resolution techniques. We evaluated different transfer learning strategies to improve the annotation of diseases. We provided a thorough error analysis of the contributing factors that affect such information extraction systems.

Results: Our experiments showed that the combination of domain-adaptive pretraining and intermediate-task pretraining achieved an F1 score of 81.63% for the extraction of diseases and FMs from notes when it was tested on a public shared task data set from the National Natural Language Processing Clinical Challenges (N2C2), providing a statistically significant improvement over the baseline (P<.001). In comparison, in the 2019 N2C2/Open Health Natural Language Processing Shared Task, the median F1 score of all 17 participating teams was 76.59%.

Conclusions: Our approach, which leverages a state-of-the-art named entity recognition model for disease mention detection coupled with a hybrid method for FM mention detection, achieved an effectiveness that was close to that of the top 3 systems participating in the 2019 N2C2 FH extraction challenge, with only the top system convincingly outperforming our approach in terms of precision.
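The Methods pair a transformer-based disease tagger with rule-based detection of family member (FM) mentions. Below is a minimal sketch of what such FM rules can look like; the lexicon, the side-of-family cues, and the example note are illustrative assumptions, not the authors' actual rule set.

```python
# A minimal sketch of rule-based family-member (FM) mention detection.
# The lexicon and side-of-family cues are hypothetical examples.
import re

FAMILY_MEMBERS = ["mother", "father", "sister", "brother", "daughter", "son",
                  "aunt", "uncle", "grandmother", "grandfather", "cousin"]
SIDE_CUES = ["maternal", "paternal"]

FM_PATTERN = re.compile(
    r"\b(?:(?P<side>" + "|".join(SIDE_CUES) + r")\s+)?"
    r"(?P<member>" + "|".join(FAMILY_MEMBERS) + r")s?\b",
    re.IGNORECASE,
)

def extract_family_members(text):
    """Return (family member, side of family) mentions found by the rules."""
    mentions = []
    for m in FM_PATTERN.finditer(text):
        side = m.group("side").lower() if m.group("side") else "NA"
        mentions.append((m.group("member").lower(), side))
    return mentions

note = "Her maternal grandmother had diabetes; father died of colon cancer."
print(extract_family_members(note))
# [('grandmother', 'maternal'), ('father', 'NA')]
```

A fuller system would also cover in-law and step relations and would need to link each FM mention to nearby disease mentions, which is roughly where the coreference resolution techniques mentioned above come in.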



Citations
Journal Article

Clinician documentation of patient centered care in the electronic health record

TL;DR: In this paper, the authors explored the feasibility of using patient-centered care (PCC) documentation as a measure of the delivery of PCC in a health system, that is, of the degree to which the care delivered is patient centered.
Proceedings Article

CSIRO Data61 Team at BioLaySumm Task 1: Lay Summarisation of Biomedical Research Articles Using Generative Models

TL;DR: This paper presented a comprehensive set of experiments and analyses investigating the effectiveness of existing pre-trained language models in generating lay summaries; the team's submission ranked second on the relevance criterion and third overall among 21 competing teams.
Book Chapter

Investigating the Impact of Query Representation on Medical Information Retrieval

TL;DR: In this article, the authors investigated the effect that various types of patient-related information extracted from unstructured clinical notes have on two different tasks: patient allocation in clinical trials and medical literature retrieval.
Journal Article

Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods

Xiang Dai et al.
TL;DR: The DEAL (Detecting Entities in the Astrophysics Literature) shared task presented in this paper was the first attempt to build systems that identify entities in a dataset composed of scholarly articles from the astrophysics literature.
References
More filters
Proceedings Article

Deep contextualized word representations

TL;DR: This paper introduced a new type of deep contextualized word representation that models both complex characteristics of word use (e.g., syntax and semantics), and how these uses vary across linguistic contexts (i.e., to model polysemy).
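The key property described here is that a word's representation depends on its sentence. The sketch below illustrates that idea with a BERT-family checkpoint from Hugging Face rather than ELMo itself, purely to keep the example self-contained; the model name and sentences are assumptions.

```python
# A minimal illustration of contextual embeddings: the same surface form
# receives different vectors in different contexts (the polysemy property).
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embedding_of(sentence, word):
    """Return the contextual vector of `word` within `sentence`."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]      # (seq_len, hidden)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    return hidden[tokens.index(word)]

v1 = embedding_of("she sat by the river bank", "bank")
v2 = embedding_of("he deposited cash at the bank", "bank")
print(torch.cosine_similarity(v1, v2, dim=0))           # noticeably below 1.0
```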
Proceedings Article

The Stanford CoreNLP Natural Language Processing Toolkit

TL;DR: This paper describes the design and use of the Stanford CoreNLP toolkit, an extensible pipeline that provides core natural language analysis, and attributes its adoption to a simple, approachable design, straightforward interfaces, the inclusion of robust, high-quality analysis components, and minimal associated baggage.
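The toolkit exposes its annotators through a pipeline and an HTTP server. Below is a minimal usage sketch that assumes a CoreNLP server is already running locally on port 9000; the example text and the chosen annotators are illustrative.

```python
# A minimal sketch of querying a locally running CoreNLP server over HTTP.
import json
import requests

text = "Her father was diagnosed with colon cancer in Rochester."
props = {"annotators": "tokenize,ssplit,pos,lemma,ner", "outputFormat": "json"}

resp = requests.post("http://localhost:9000/",
                     params={"properties": json.dumps(props)},
                     data=text.encode("utf-8"))
doc = resp.json()

# Print each token with its part-of-speech and named-entity tag.
for sentence in doc["sentences"]:
    for token in sentence["tokens"]:
        print(token["word"], token["pos"], token["ner"])
```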
Proceedings Article

Neural Machine Translation of Rare Words with Subword Units

TL;DR: This paper introduces a simpler and more effective approach, making the NMT model capable of open-vocabulary translation by encoding rare and unknown words as sequences of subword units, and empirically shows that subword models improve over a back-off dictionary baseline for the WMT 15 English-German and English-Russian translation tasks by up to 1.1 and 1.3 BLEU, respectively.
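The core idea is that a rare word is segmented into subword units by replaying merge operations learned from training data. A toy sketch of that encoding step follows; the merge table below is invented for illustration and is not a learned vocabulary.

```python
# A minimal sketch of applying learned byte-pair-encoding (BPE) merges to a word.

def bpe_encode(word, merges):
    """Split `word` into subword units by greedily applying learned merges.

    `merges` maps a symbol pair (a, b) to its priority (lower = learned earlier).
    """
    symbols = list(word) + ["</w>"]          # characters plus end-of-word marker
    while len(symbols) > 1:
        # Find the adjacent pair with the best (lowest) merge priority.
        pairs = [(merges.get((a, b), float("inf")), i)
                 for i, (a, b) in enumerate(zip(symbols, symbols[1:]))]
        best_rank, best_i = min(pairs)
        if best_rank == float("inf"):        # no learned merge applies
            break
        symbols[best_i:best_i + 2] = [symbols[best_i] + symbols[best_i + 1]]
    return symbols

# Toy merge table: the word "lowest" is split into subwords seen in training.
toy_merges = {("l", "o"): 0, ("lo", "w"): 1, ("e", "s"): 2,
              ("es", "t"): 3, ("est", "</w>"): 4}
print(bpe_encode("lowest", toy_merges))      # ['low', 'est</w>']
```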
Proceedings Article

Neural Architectures for Named Entity Recognition

TL;DR: Paper presented at the 2016 Conference of the North American Chapter of the Association for Computational Linguistics, held in San Diego (CA, USA), June 12-17, 2016.
Journal Article

BioBERT: a pre-trained biomedical language representation model for biomedical text mining.

TL;DR: This article proposed BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining), which is a domain-specific language representation model pre-trained on large-scale biomedical corpora.
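In the context of this work, a checkpoint such as BioBERT is a natural starting point for the disease-mention tagger described in the abstract. The sketch below shows the general token-classification setup with the Hugging Face transformers library; the model identifier, label set, and example sentence are assumptions, and real fine-tuning would align BIO labels to word pieces and run a training loop.

```python
# A minimal sketch of setting up a BioBERT-style model for disease-mention
# tagging; "dmis-lab/biobert-v1.1" is an assumed Hub checkpoint name.
from transformers import AutoTokenizer, AutoModelForTokenClassification

labels = ["O", "B-Disease", "I-Disease"]     # BIO scheme for disease mentions
tokenizer = AutoTokenizer.from_pretrained("dmis-lab/biobert-v1.1")
model = AutoModelForTokenClassification.from_pretrained(
    "dmis-lab/biobert-v1.1", num_labels=len(labels)
)

# Tokenise one note fragment and take per-token label predictions (untrained
# head, so the output is only meaningful after fine-tuning).
enc = tokenizer("Mother was diagnosed with breast cancer at age 52.",
                return_tensors="pt")
logits = model(**enc).logits                 # shape: (1, seq_len, num_labels)
pred = logits.argmax(-1)
print(pred)
```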