Topic

Word embedding

About: Word embedding is a research topic. Over its lifetime, 4,683 publications have been published within this topic, receiving 153,378 citations. The topic is also known as: word embeddings.


Papers
Journal ArticleDOI
TL;DR: Zhang et al. used a BiLSTM to encode the question text to obtain sentence-level information, and then used bagging to further improve the overall performance of BiBERT.
Abstract: Biomedical factoid question answering is an important task in biomedical question answering applications and has attracted much attention because of the reliability of its answers. In a question answering system, a good word representation matters greatly, and a proper word embedding can usually improve system performance significantly. Following the success of pre-trained models on general natural language processing tasks, pre-trained models have been widely used in the biomedical domain as well, and many pre-trained-model-based approaches have proven effective for biomedical question answering. Besides a proper word embedding, named entities are also important information for biomedical question answering. Inspired by the concept of transfer learning, this research develops a mechanism to fine-tune BioBERT with a named entity dataset to improve question answering performance. Furthermore, a BiLSTM is applied to encode the question text and obtain sentence-level information. To better combine the question-level and token-level information, bagging is used to further improve overall performance. The proposed framework has been evaluated on the BioASQ 6b and 7b datasets, and the results show its promising potential.

17 citations
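To make the encoding step concrete, here is a minimal PyTorch sketch of the pattern the abstract describes: running a BiLSTM over contextual token embeddings (such as BioBERT's output) to obtain both token-level and sentence-level representations. The class, dimensions, and dummy input are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch (illustrative, not the paper's exact architecture):
# a BiLSTM over BERT-style token embeddings yielding token-level outputs
# and a sentence-level summary vector.
import torch
import torch.nn as nn

class BiLSTMEncoder(nn.Module):
    def __init__(self, embed_size=768, lstm_size=256):
        super().__init__()
        self.lstm = nn.LSTM(embed_size, lstm_size,
                            batch_first=True, bidirectional=True)

    def forward(self, token_embeddings):
        # token_embeddings: (batch, seq_len, embed_size), e.g. BioBERT output
        token_states, (h_n, _) = self.lstm(token_embeddings)
        # concatenate the final forward and backward hidden states
        sentence_state = torch.cat([h_n[0], h_n[1]], dim=-1)
        return token_states, sentence_state

encoder = BiLSTMEncoder()
dummy = torch.randn(2, 32, 768)  # stand-in for BioBERT token embeddings
tokens, sentence = encoder(dummy)
print(tokens.shape, sentence.shape)  # (2, 32, 512) (2, 512)
```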

Proceedings ArticleDOI
20 Jul 2018
TL;DR: It is hypothesized that humans do not associate a single polarity or sentiment with each word; the paper makes use of the Hilbert space representation of microscopic particles in quantum mechanics to ascribe a relative phase to each word, and investigates two quantum-inspired models for deriving the meaning of a combination of words.
Abstract: A challenging task for word embeddings is to capture the emergent meaning or polarity of a combination of individual words. For example, existing word embedding approaches will assign high probabilities to the words "Penguin" and "Fly" if they frequently co-occur, but they fail to capture the fact that the two occur in an opposite sense: penguins do not fly. We hypothesize that humans do not associate a single polarity or sentiment with each word; a word contributes to the overall polarity of a combination of words depending on which other words it is combined with. This is analogous to the behavior of microscopic particles, which exist in all possible states at the same time and interfere with each other to give rise to new states depending on their relative phases. We make use of the Hilbert space representation of such particles in quantum mechanics, ascribing to each word a relative phase, which is a complex number, and investigate two such quantum-inspired models for deriving the meaning of a combination of words. The proposed models achieve better performance than state-of-the-art non-quantum models on the binary sentence classification task.

17 citations
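The interference idea can be sketched in a few lines of Python: attach a phase to each normalized word vector, superpose the resulting complex amplitudes, and take the squared magnitude of the sum. The vectors and phases below are fabricated purely for illustration; the paper's actual models are more elaborate.

```python
# Illustrative sketch of interference between complex-valued word states:
# words in phase reinforce each other, words out of phase cancel.
import numpy as np

def word_state(vector, phase):
    """Attach a relative phase to a unit-normalized word vector."""
    v = np.asarray(vector, dtype=float)
    return (v / np.linalg.norm(v)) * np.exp(1j * phase)

def combined_intensity(*states):
    # superpose the amplitudes, then take the squared magnitude
    return float(np.linalg.norm(np.sum(states, axis=0)) ** 2)

rng = np.random.default_rng(0)
penguin = rng.normal(size=8)
fly = penguin + 0.5 * rng.normal(size=8)  # a word with overlapping meaning

in_phase = combined_intensity(word_state(penguin, 0.0), word_state(fly, 0.0))
out_of_phase = combined_intensity(word_state(penguin, 0.0), word_state(fly, np.pi))
print(in_phase > out_of_phase)  # True: the relative phase changes the combined meaning
```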

Posted Content
TL;DR: This article estimates emotion intensity in tweets with a generalized regressor system that combines lexical, syntactic, and pre-trained word embedding features, trains several general regressors on them, and finally combines the best-performing models into an ensemble.
Abstract: The paper describes experiments on estimating emotion intensity in tweets using a generalized regressor system. The system combines lexical, syntactic, and pre-trained word embedding features, trains general regressors on them, and finally combines the best-performing models to create an ensemble. The proposed system placed 3rd out of 22 systems on the leaderboard of the WASSA-2017 Shared Task on Emotion Intensity.

17 citations
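The ensembling pattern is simple to sketch with scikit-learn: fit several regressors on one feature matrix (here a random stand-in for the averaged embedding, lexical, and syntactic features) and average their predictions. The model choices and synthetic data are assumptions, not the system's actual configuration.

```python
# Minimal sketch of a regressor ensemble over embedding-style features;
# the data and model choices are placeholders, not the WASSA-2017 system.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))                       # stand-in feature matrix
y = 0.7 * X[:, 0] + rng.normal(scale=0.1, size=200)  # synthetic intensity scores

models = [
    Ridge(alpha=1.0),
    RandomForestRegressor(n_estimators=100, random_state=0),
    GradientBoostingRegressor(random_state=0),
]
for m in models:
    m.fit(X, y)

def ensemble_predict(X_new):
    # average the individual regressors' predictions
    return np.mean([m.predict(X_new) for m in models], axis=0)

print(ensemble_predict(X[:3]))
```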

Proceedings ArticleDOI
Mamdouh Farouk1
01 Dec 2018
TL;DR: This paper combines the use of pre-trained word vectors and WordNet to measure semantic similarity between two sentences, and achieves better results compared with other approaches previously proposed for measuring sentence similarity.
Abstract: Measuring semantic similarity between sentences is a crucial task for many applications. The emergence of word embeddings encourages calculating similarity between words and between sentences based on this new semantic word representation. On the other hand, WordNet is widely used to find the semantic distance between sentences. This paper combines the use of pre-trained word vectors and WordNet to measure semantic similarity between two sentences. In addition, word order similarity is applied to make the final similarity more accurate. The proposed approach has been implemented and tested on standard datasets. Experiments show that the presented method achieves better results compared with other approaches previously proposed for measuring sentence similarity.

17 citations
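A hedged Python sketch of the three-signal combination the abstract outlines: cosine similarity of averaged word vectors, a WordNet path-similarity score, and a word-order component, blended with weights. The weights, the word-alignment strategy, and the `vectors` argument (any token-to-vector mapping, e.g. loaded GloVe vectors) are assumptions, not the paper's exact formulation.

```python
# Sketch of combining embedding, WordNet, and word-order similarity.
# Requires: pip install nltk numpy; then nltk.download('wordnet') once.
import numpy as np
from nltk.corpus import wordnet as wn

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def wordnet_word_sim(w1, w2):
    # best path similarity over all sense pairs; 0 if either word is unknown
    scores = [s1.path_similarity(s2) or 0.0
              for s1 in wn.synsets(w1) for s2 in wn.synsets(w2)]
    return max(scores, default=0.0)

def word_order_sim(tokens1, tokens2):
    # compare word-position vectors over the joint vocabulary
    vocab = list(dict.fromkeys(tokens1 + tokens2))
    r1 = np.array([tokens1.index(w) + 1 if w in tokens1 else 0 for w in vocab])
    r2 = np.array([tokens2.index(w) + 1 if w in tokens2 else 0 for w in vocab])
    return 1.0 - np.linalg.norm(r1 - r2) / np.linalg.norm(r1 + r2)

def sentence_sim(tokens1, tokens2, vectors, w_emb=0.5, w_wn=0.3, w_ord=0.2):
    # embedding signal: cosine of averaged word vectors
    emb = cosine(np.mean([vectors[t] for t in tokens1], axis=0),
                 np.mean([vectors[t] for t in tokens2], axis=0))
    # WordNet signal: each word's best match in the other sentence
    wn_score = float(np.mean([max(wordnet_word_sim(a, b) for b in tokens2)
                              for a in tokens1]))
    return w_emb * emb + w_wn * wn_score + w_ord * word_order_sim(tokens1, tokens2)
```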

Journal ArticleDOI
TL;DR: In this article, a fake news detection model for low-resource African languages, such as Amharic, is presented; evaluated on the ETH_FAKE dataset using the AMFTWE embedding, it performed very well.
Abstract: The need to fight the progressively negative impact of fake news is escalating, as is evident in the drive to conduct research and develop tools that could do this job. However, a lack of adequate datasets and good word embeddings has made it hard for detection methods to be sufficiently accurate. These resources are entirely missing for "low-resource" African languages, such as Amharic, and alleviating these critical problems should not be left for tomorrow. Deep learning methods and word embeddings have contributed a great deal to devising automatic fake news detection mechanisms. Several contributions are presented, including an Amharic fake news detection model, a general-purpose Amharic corpus (GPAC), a novel Amharic fake news detection dataset (ETH_FAKE), and an Amharic fastText word embedding (AMFTWE). Our Amharic fake news detection model, evaluated with the ETH_FAKE dataset and using the AMFTWE, performed very well.

17 citations
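As a concrete, hedged sketch of the pipeline the abstract describes: train subword-aware fastText embeddings on a corpus (useful for a morphologically rich language like Amharic), average them per document, and fit a classifier. The toy corpus and labels below are stand-ins; GPAC, ETH_FAKE, and AMFTWE themselves are not reproduced here.

```python
# Minimal sketch: fastText embeddings + averaged document vectors + classifier.
# The corpus and labels are toy stand-ins for GPAC / ETH_FAKE.
import numpy as np
from gensim.models import FastText
from sklearn.linear_model import LogisticRegression

corpus = [["some", "tokenized", "text"],
          ["another", "tokenized", "document"],
          ["a", "fabricated", "claim"],
          ["a", "verified", "report"]]
labels = [0, 0, 1, 0]  # 1 = fake, 0 = real (toy labels)

# subword-aware embeddings handle out-of-vocabulary and inflected forms
ft = FastText(sentences=corpus, vector_size=32, window=3, min_count=1, epochs=50)

def doc_vector(tokens):
    # average the word vectors to get one fixed-size vector per document
    return np.mean([ft.wv[t] for t in tokens], axis=0)

X = np.stack([doc_vector(doc) for doc in corpus])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(X))
```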


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations, 87% related
Unsupervised learning: 22.7K papers, 1M citations, 86% related
Deep learning: 79.8K papers, 2.1M citations, 85% related
Reinforcement learning: 46K papers, 1M citations, 84% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 84% related
Performance Metrics
No. of papers in the topic in previous years:
Year    Papers
2023    317
2022    716
2021    736
2020    1,025
2019    1,078
2018    788