A Comparison of Word-based and Context-based Representations for Classification Problems in Health Informatics
Aditya Joshi, Sarvnaz Karimi, Ross Sparks, Cecile Paris, C. Raina MacIntyre
pp. 135-141
TL;DR:
The authors compare two kinds of representations (word-based versus context-based) for three classification problems: influenza infection classification, drug usage classification, and personal health mention classification, showing that context-based representations based on ELMo, Universal Sentence Encoder, Neural-Net Language Model, and FLAIR outperform Word2Vec, GloVe, and the two adapted using the MeSH ontology.
Abstract:
Distributed representations of text can be used as features when training a statistical classifier. These representations may be created as a composition of word vectors or as context-based sentence vectors. We compare the two kinds of representations (word versus context) for three classification problems: influenza infection classification, drug usage classification, and personal health mention classification. For statistical classifiers trained on each of these problems, context-based representations based on ELMo, Universal Sentence Encoder, Neural-Net Language Model, and FLAIR are better than Word2Vec, GloVe, and the two adapted using the MeSH ontology. There is an improvement of 2-4% in accuracy when these context-based representations are used instead of word-based representations.
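As a minimal sketch of the word-based composition the abstract describes, the snippet below averages toy word vectors into a fixed-length sentence feature. The three-dimensional vectors are illustrative stand-ins for real pretrained Word2Vec/GloVe embeddings; a context-based encoder such as ELMo or the Universal Sentence Encoder would instead map the whole sentence to a vector directly.

```python
import numpy as np

# Toy pretrained word vectors: stand-ins for Word2Vec/GloVe lookups
# (real embeddings would be loaded from a pretrained model).
word_vectors = {
    "flu": np.array([0.9, 0.1, 0.0]),
    "fever": np.array([0.8, 0.2, 0.1]),
    "today": np.array([0.1, 0.7, 0.3]),
}

def word_based_representation(tokens):
    """Compose a sentence vector as the mean of its word vectors."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    return np.mean(vecs, axis=0)

# The resulting fixed-length vector is what a statistical classifier
# (e.g. a linear model) would consume as its feature vector.
features = word_based_representation(["flu", "fever", "today"])
```

A context-based representation would replace `word_based_representation` with a single encoder call over the raw sentence, which is what gives those models access to word order and context.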
Citations
Proceedings Article
Contextual and Non-Contextual Word Embeddings: an in-depth Linguistic Investigation.
TL;DR: It is shown that, although BERT is capable of understanding the full context of each word in an input sequence, the implicit knowledge encoded in its aggregated sentence representations is still comparable to that of a contextual-independent model.
Journal Article
Survey of Text-based Epidemic Intelligence: A Computational Linguistics Perspective
TL;DR: This survey discusses approaches for epidemic intelligence that use textual datasets, referring to them as “text-based epidemic intelligence,” and views past work in terms of two broad categories: health mention classification and health event detection.
To BERT or not to BERT - Comparing Contextual Embeddings in a Deep Learning Architecture for the Automatic Recognition of four Types of Speech, Thought and Writing Representation.
TL;DR: This paper evaluates recognizers for four very different types of speech, thought and writing representation (STWR) in German texts, based on deep learning with two different customized contextual embeddings, namely FLAIR and BERT.
Book Chapter
End-to-End Fine-Grained Neural Entity Recognition of Patients, Interventions, Outcomes.
Anjani Dhrangadhariya, Gustavo Aguilar, Thamar Solorio, Roger Hilfiker, Henning Müller
TL;DR: This paper uses multitask learning (MTL) with a related auxiliary task to improve fine-grained PICO recognition, compares it with single-task learning (STL), and achieves state-of-the-art performance.
Journal Article
Aggregation levels when the time between events is Weibull distributed
TL;DR: In this paper, the authors present an aggregation process that is best for early detection of outbreaks of events, including sales, warranty claims, disease outbreaks, hurricanes, and floods.
References
Proceedings Article
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
TL;DR: This paper proposes a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely, and achieves state-of-the-art performance on English-to-French translation.
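The central operation of that architecture, scaled dot-product attention, is compact enough to sketch in NumPy. This is a toy single-head, unbatched version of the published formula softmax(QK^T / sqrt(d_k)) V, with illustrative query/key/value matrices:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable row-wise softmax over the key axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# One query attending over two keys: it is closer to the first key,
# so the output leans toward the first value row.
Q = np.array([[1.0, 0.0]])
K = np.array([[1.0, 0.0], [0.0, 1.0]])
V = np.array([[10.0, 0.0], [0.0, 10.0]])
out = scaled_dot_product_attention(Q, K, V)
```

Real Transformer layers add learned projections, multiple heads, masking, and batching on top of this kernel.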
Proceedings Article
Glove: Global Vectors for Word Representation
TL;DR: A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
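The global log-bilinear objective can be sketched on a toy co-occurrence matrix: fit w_i·w̃_j + b_i + b̃_j to log X_ij, weighted by f(X_ij) = (X_ij/x_max)^α. This sketch uses plain SGD for simplicity (the original paper uses AdaGrad):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0., 3., 1.],
              [3., 0., 2.],
              [1., 2., 0.]])  # toy word-word co-occurrence counts

dim = 2
W = rng.normal(scale=0.1, size=(3, dim))   # word vectors
Wc = rng.normal(scale=0.1, size=(3, dim))  # context vectors
b, bc = np.zeros(3), np.zeros(3)

def weight_fn(x, x_max=10.0, alpha=0.75):
    """GloVe's weighting: down-weight rare pairs, cap frequent ones."""
    return (x / x_max) ** alpha if x < x_max else 1.0

def loss():
    total = 0.0
    for i in range(3):
        for j in range(3):
            if X[i, j] > 0:  # the objective only sums over observed pairs
                err = W[i] @ Wc[j] + b[i] + bc[j] - np.log(X[i, j])
                total += weight_fn(X[i, j]) * err ** 2
    return total

lr = 0.05
initial = loss()
for _ in range(200):
    for i in range(3):
        for j in range(3):
            if X[i, j] > 0:
                err = W[i] @ Wc[j] + b[i] + bc[j] - np.log(X[i, j])
                g = 2 * weight_fn(X[i, j]) * err
                w_i = W[i].copy()
                W[i] -= lr * g * Wc[j]
                Wc[j] -= lr * g * w_i
                b[i] -= lr * g
                bc[j] -= lr * g
final = loss()
```

After training, the weighted squared error drops, i.e. the dot products plus biases approximate the log co-occurrence counts.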
Proceedings Article
Distributed Representations of Words and Phrases and their Compositionality
TL;DR: This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.
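Negative sampling replaces the hierarchical softmax with a handful of binary logistic objectives per training pair. A toy NumPy sketch of one skip-gram-with-negative-sampling update (index choices and hyperparameters here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 10, 4
W_in = rng.normal(scale=0.1, size=(vocab_size, dim))   # target-word vectors
W_out = rng.normal(scale=0.1, size=(vocab_size, dim))  # context-word vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_step(center, context, negatives, lr=0.1):
    """One skip-gram-with-negative-sampling update: push the observed
    (center, context) pair together and the sampled negatives apart."""
    for word, label in [(context, 1.0)] + [(n, 0.0) for n in negatives]:
        v, u = W_in[center].copy(), W_out[word].copy()
        g = lr * (sigmoid(v @ u) - label)  # gradient of the logistic loss
        W_out[word] -= g * v
        W_in[center] -= g * u

# Repeatedly train one positive pair against two sampled negatives:
# the model's probability for the observed pair should rise.
before = sigmoid(W_in[2] @ W_out[5])
for _ in range(50):
    sgns_step(center=2, context=5, negatives=[1, 7])
after = sigmoid(W_in[2] @ W_out[5])
```

In the full method, negatives are drawn from a smoothed unigram distribution rather than fixed as they are here.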
Journal Article
LIBLINEAR: A Library for Large Linear Classification
TL;DR: LIBLINEAR is an open source library for large-scale linear classification that supports logistic regression and linear support vector machines and provides easy-to-use command-line tools and library calls for users and developers.
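LIBLINEAR itself is a C library with command-line tools and bindings (scikit-learn, for example, exposes it through `solver='liblinear'`). The problem class it solves, linear classifiers such as logistic regression, can be illustrated with a tiny NumPy fit; this sketch uses plain unregularized batch gradient descent on synthetic data, not LIBLINEAR's far faster dual/trust-region solvers:

```python
import numpy as np

# Synthetic linearly separable data: the label is determined by a
# linear score of the first two features.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

# Batch gradient descent on the logistic loss -- the model family
# LIBLINEAR trains at much larger scale, with regularization.
w = np.zeros(3)
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-X @ w))   # predicted probabilities
    w -= 0.1 * X.T @ (p - y) / len(y)  # average-gradient step

train_accuracy = np.mean((X @ w > 0) == (y == 1))
```

Because the data is separable by construction, the learned weight vector recovers the generating direction and training accuracy is near perfect.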
Proceedings Article
Deep contextualized word representations
Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer
TL;DR: This paper introduced a new type of deep contextualized word representation that models both complex characteristics of word use (e.g., syntax and semantics), and how these uses vary across linguistic contexts (i.e., to model polysemy).