Proceedings ArticleDOI
Active Learning with Deep Pre-trained Models for Sequence Tagging of Clinical and Biomedical Texts
Artem Shelmanov, Vadim Liventsev, Danil Kireev, Nikita Khromov, Alexander Panchenko, Irina Fedulova, Dmitry V. Dylov +6 more
pp. 482–489
TLDR
An annotation tool empowered with active learning and deep pre-trained models, usable for entity annotation directly from the Jupyter IDE, is proposed, along with a modification to a standard uncertainty sampling strategy that is shown to be beneficial for annotating very skewed datasets.
Abstract
Active learning is a technique that helps to minimize the annotation budget required for the creation of a labeled dataset while maximizing the performance of a model trained on this dataset. It has been shown that active learning can be successfully applied to sequence tagging tasks of text processing in conjunction with deep learning models even when a limited amount of labeled data is available. Recent advances in transfer learning methods for natural language processing based on deep pre-trained models such as ELMo and BERT offer a much better ability to generalize on small annotated datasets compared to their shallow counterparts. The combination of deep pre-trained models and active learning leads to a powerful approach to dealing with annotation scarcity. In this work, we investigate the potential of this approach on clinical and biomedical data. The experimental evaluation shows that the combination of active learning and deep pre-trained models outperforms the standard methods of active learning. We also suggest a modification to a standard uncertainty sampling strategy and empirically show that it could be beneficial for annotation of very skewed datasets. Finally, we propose an annotation tool empowered with active learning and deep pre-trained models that can be used for entity annotation directly from the Jupyter IDE.
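The uncertainty sampling loop the abstract refers to can be sketched as follows. This is an illustrative, minimal implementation of length-normalized least-confidence scoring (the MNLP-style heuristic commonly used for sequence-tagging active learning), not the paper's exact modified strategy; the function names and the shape of the probability inputs are assumptions for the sketch.

```python
import math

def least_confidence_scores(token_probs):
    """Score each unlabeled sequence by model uncertainty.

    token_probs: a list of sequences, where each sequence is a list of
    per-token probability distributions over the tag set.
    Returns one score per sequence (higher = less confident), computed as
    the length-normalized negative log-probability of the most likely tag
    at each token.
    """
    scores = []
    for seq in token_probs:
        # Negative log of the top tag probability for every token.
        nll = [-math.log(max(dist)) for dist in seq]
        # Normalize by sequence length so long sentences are not
        # automatically ranked as "most uncertain".
        scores.append(sum(nll) / len(nll))
    return scores

def select_batch(token_probs, k):
    """Indices of the k most uncertain sequences to send for annotation."""
    scores = least_confidence_scores(token_probs)
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
```

In an active learning round, the tagger is trained on the labeled pool, `select_batch` picks the hardest unlabeled sequences, those are annotated, and the cycle repeats.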
Citations
Posted Content
A Survey of Deep Active Learning
TL;DR: A formal classification method for the existing work in deep active learning is provided, along with a comprehensive and systematic overview, to investigate whether AL can be used to reduce the cost of sample annotation while retaining the powerful learning capabilities of DL.
Proceedings ArticleDOI
Active Learning for BERT: An Empirical Study
Liat Ein-Dor, Alon Halfon, Ariel Gera, Eyal Shnarch, Lena Dankin, Leshem Choshen, Marina Danilevsky, Ranit Aharonov, Yoav Katz, Noam Slonim +9 more
TL;DR: The results demonstrate that AL can boost BERT performance, especially in the most realistic scenario in which the initial set of labeled examples is created using keyword-based queries, resulting in a biased sample of the minority class.
Journal ArticleDOI
Annotating social determinants of health using active learning, and characterizing determinants using neural event extraction
TL;DR: The Social History Annotation Corpus (SHAC) as discussed by the authors includes 4480 social history sections with detailed annotation for 12 social determinants of health (SDOH) characterizing the status, extent, and temporal information of 18K distinct events.
Proceedings ArticleDOI
Unsupervised non-parametric change point detection in electrocardiography
TL;DR: A new unsupervised and non-parametric method to detect change points in electrocardiography that reaches the level of performance of the supervised state-of-the-art techniques.
Journal ArticleDOI
LETS: A Label-Efficient Training Scheme for Aspect-Based Sentiment Analysis by Using a Pre-Trained Language Model
TL;DR: The authors proposed task-specific pre-training to exploit unlabeled task-specific corpus data, label augmentation to maximize the utility of labeled data, and active learning to label data strategically.
References
Journal ArticleDOI
Long short-term memory
TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete time steps by enforcing constant error flow through constant error carousels within special units.
Proceedings Article
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin +7 more
TL;DR: This paper proposed the Transformer, a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely, and achieved state-of-the-art performance on English-to-French translation.
Proceedings ArticleDOI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: BERT as mentioned in this paper pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Proceedings Article
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
TL;DR: This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.