A Comparative Analysis of Active Learning for Biomedical Text Mining

doi:10.3390/ASI4010023

Open Access Journal Article

Usman Naseem, +4 more

- Vol. 4, Iss: 1, pp 23

TLDR

Experiments show that AL has the potential to significantly reduce the cost of manual labelling, and that AL-assisted pre-annotation accelerates the de novo annotation process by reducing the annotation time required.

Abstract:

An enormous amount of clinical free-text information, such as pathology reports, progress reports, clinical notes and discharge summaries, has been collected at hospitals and medical care clinics. These data provide an opportunity to develop many useful machine learning (ML) applications if they can be transformed into a learnable structure with appropriate labels for supervised learning. The annotation of these data has to be performed by qualified clinical experts, and the high cost of annotation therefore limits their use. Active learning (AL), an underutilised ML technique that can label new data, is a promising candidate to address the high cost of labelling. AL has been successfully applied to labelling for speech recognition and text classification; however, there is a lack of literature investigating its use for clinical purposes. We performed a comparative investigation of various AL techniques using ML and deep learning (DL)-based strategies on three unique biomedical datasets. We investigated the random sampling (RS), least confidence (LC), informative diversity and density (IDD), margin, and maximum representativeness-diversity (MRD) AL query strategies. Our experiments show that AL has the potential to significantly reduce the cost of manual labelling. Furthermore, pre-labelling performed using AL expedites the labelling process by reducing the time required for labelling.
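To make the query strategies named in the abstract more concrete, the following is a minimal sketch of three of them (RS, LC and margin) using their common textbook definitions. It is not the authors' implementation: the function names, the toy probability table and the batch size are all illustrative assumptions, and IDD and MRD are omitted because the abstract does not define them.

```python
import random
from typing import List, Sequence


def least_confidence(probs: Sequence[Sequence[float]]) -> List[float]:
    """LC score: 1 - highest class probability (higher = more uncertain)."""
    return [1.0 - max(p) for p in probs]


def margin(probs: Sequence[Sequence[float]]) -> List[float]:
    """Margin score: negative gap between the two most probable classes,
    so that a small gap (an ambiguous sample) yields a high score."""
    scores = []
    for p in probs:
        top1, top2 = sorted(p, reverse=True)[:2]
        scores.append(-(top1 - top2))
    return scores


def select_batch(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring unlabelled samples."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def random_sampling(n: int, k: int, seed: int = 0) -> List[int]:
    """RS baseline: pick k distinct indices uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(range(n), k)


# Hypothetical class probabilities predicted by some model for four
# unlabelled documents (toy values, not results from the paper).
pool_probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.50, 0.49, 0.01],
    [0.70, 0.20, 0.10],
]

lc_pick = select_batch(least_confidence(pool_probs), k=2)
margin_pick = select_batch(margin(pool_probs), k=2)
rs_pick = random_sampling(len(pool_probs), k=2)
```

In an AL loop, the selected indices would be sent to an annotator, the newly labelled samples added to the training set, and the model retrained before the next query round. Note that LC and margin can rank the same pool differently, since LC looks only at the top class while margin compares the top two.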

Citations

Posted Content

A Comprehensive Survey on Word Representation Models: From Classical to State-Of-The-Art Word Representation Language Models

Usman Naseem, +3 more

- 28 Oct 2020 -

arXiv: Computation and Language

TL;DR: Describes the variety of text representation methods and model designs that have blossomed in the context of NLP, including SOTA LMs, which can transform large volumes of text into effective vector representations capturing the same semantic information.

Journal Article

A Comprehensive Survey on Word Representation Models: From Classical to State-of-the-Art Word Representation Language Models

Usman Naseem, +3 more

TL;DR: This paper surveys word representation models and their power of expression, from classical to modern-day state-of-the-art word representation language models (LMs).

Journal Article

Clinically Applicable Machine Learning Approaches to Identify Attributes of Chronic Kidney Disease (CKD) for Use in Low-Cost Diagnostic Screening

Md. Rashed-Al-Mahfuz, +5 more

- 15 Apr 2021 -

IEEE Journal of Translational Engineerin...

TL;DR: In this paper, the authors developed machine learning models using selective key pathological categories to identify clinical test attributes that will aid in accurate early diagnosis of chronic kidney disease (CKD).

Journal Article

A Novel Approach of Transcriptomic microRNA Analysis Using Text Mining Methods: An Early Detection of Multiple Sclerosis Disease

Nehal M. Ali, +3 more

- 01 Jan 2021 -

IEEE Access

TL;DR: In this article, the authors presented a complete predictive model by combining consecutive transcriptomic data preprocessing procedures, followed by the proposed KmerFIDF method as a feature extraction method and linear discriminant analysis for dimensionality reduction.

Proceedings Article

A Novel Approach for Implementing Conventional LBIST by High Execution Microprocessors

TL;DR: In this article, a lower built-in self-test (LBIST) mechanism is used to design a microprocessor, and the proposed methodology attains performance measures such as 97.5% power efficiency and area.

References

Posted Content

BioALBERT: A Simple and Effective Pre-trained Language Model for Biomedical Named Entity Recognition

Usman Naseem, +5 more

- 19 Sep 2020 -

arXiv: Computation and Language

TL;DR: Biomedical ALBERT (A Lite Bidirectional Encoder Representations from Transformers for Biomedical Text Mining) is proposed, an effective domain-specific language model trained on large-scale biomedical corpora designed to capture biomedical context-dependent NER.

Posted Content

Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets.

Yifan Peng, +2 more

- 13 Jun 2019 -

arXiv: Computation and Language

TL;DR: The Biomedical Language Understanding Evaluation (BLUE) benchmark is introduced to facilitate research on pre-training language representations in the biomedicine domain; it consists of five tasks with ten datasets that cover both biomedical and clinical texts with different dataset sizes and difficulties.

Book Chapter

Detecting Alzheimer's Disease by Exploiting Linguistic Information from Nepali Transcript.

Surendrabikram Thapa, +5 more

TL;DR: The study concludes that the difficulty AD patients have in processing information is reflected in their speech while describing a picture, and it sets a baseline for the early detection of AD using NLP in the Nepali language.

Book Chapter

Recognizing Biomedical Named Entities Based on the Sentence Vector/Twin Word Embeddings Conditioned Bidirectional LSTM

Lishuang Li, +3 more

TL;DR: A bidirectional recurrent neural network with LSTM units is adopted to identify biomedical entities; twin word embeddings and a sentence vector are added to enrich the input information, so that complex feature extraction can be skipped.

Proceedings Article

UAV-aided 5G Network in Suburban, Urban, Dense Urban, and High-rise Urban Environments

Shah Khalid Khan, +6 more

TL;DR: In this paper, a brief experimental review on ray-tracing simulation for a UAV-aided 5G network is presented, where the authors assess the usage of UAV in next-generation wireless networks, i.e., deploying UAV as a relay using millimeter wave concurrently in backhaul and access links.
