Topic
Word embedding
About: Word embedding is a research topic. Over its lifetime, 4683 publications have been published within this topic, receiving 153378 citations. The topic is also known as: word embeddings.
Papers
•
20 Feb 2020
TL;DR: This work proposes FrameAxis, a method of characterizing the framing of a given text by identifying the most relevant semantic axes ("microframes") defined by antonym word pairs, and demonstrates that it can reliably characterize documents with relevant microframes.
Abstract: We propose FrameAxis, a method of characterizing the framing of a given text by identifying the most relevant semantic axes ("microframes") defined by antonym word pairs. In contrast to the traditional framing analysis, which has been constrained by a small number of manually annotated general frames, our unsupervised approach provides much more detailed insights, by considering a host of semantic axes. Our method is capable of quantitatively teasing out framing bias -- how biased a text is in each microframe -- and framing intensity -- how much each microframe is used -- from the text, offering a nuanced characterization of framing. We evaluate our approach using SemEval datasets as well as three other datasets and human evaluations, demonstrating that FrameAxis can reliably characterize documents with relevant microframes. Our method may allow scalable and nuanced computational analyses of framing across disciplines.
17 citations
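The idea of a semantic axis built from an antonym pair can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract gives no formulas, so it assumes that framing bias is the frequency-weighted mean cosine similarity between a document's word vectors and the axis, and that framing intensity is the weighted mean squared deviation of those similarities from a baseline value. The function names, the `baseline` parameter, and the 2-D toy vectors are all hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def semantic_axis(vectors, pos_word, neg_word):
    """Axis for an antonym pair: vector(pos_word) - vector(neg_word)."""
    return [p - n for p, n in zip(vectors[pos_word], vectors[neg_word])]


def framing_scores(tokens, vectors, axis, baseline=0.0):
    """Return (bias, intensity) of a token list along one axis.

    Assumed definitions (not taken from the abstract):
      bias      = count-weighted mean of cos(word, axis)
      intensity = count-weighted mean of (cos(word, axis) - baseline)^2
    Tokens without a vector are ignored.
    """
    counts = Counter(t for t in tokens if t in vectors)
    total = sum(counts.values())
    if total == 0:
        return 0.0, 0.0
    sims = {w: cosine(vectors[w], axis) for w in counts}
    bias = sum(counts[w] * sims[w] for w in counts) / total
    intensity = sum(counts[w] * (sims[w] - baseline) ** 2 for w in counts) / total
    return bias, intensity


# Hypothetical 2-D embeddings, chosen by hand for illustration only.
TOY_VECTORS = {
    "good": [1.0, 0.0],
    "bad": [-1.0, 0.0],
    "great": [0.9, 0.1],
    "awful": [-0.9, 0.1],
    "table": [0.0, 1.0],
}

if __name__ == "__main__":
    axis = semantic_axis(TOY_VECTORS, "good", "bad")
    print(framing_scores(["great", "great", "table"], TOY_VECTORS, axis))
    print(framing_scores(["awful", "table"], TOY_VECTORS, axis))
```

In practice the vectors would come from a pretrained embedding model and the baseline from a reference corpus; both choices are left open here.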
••
TL;DR: An LSTM with Word2Vec embeddings detects disease-infected people (dengue or flu) in tweets with a reported accuracy of 94%, and the authors report that it performs better than other leading methods.
Abstract: With the massive spike in the use of Online Social Network Site (OSNS) platforms such as Web 2.0 services, microblogs, and online blogs, valuable information in the form of sentiments, thoughts, and opinions, as well as reports of epidemic outbreaks, is shared. With OSNSs being widely accessible, this work proposes a novel approach for disease (dengue or flu) detection based on social media posts. For this purpose, an automated approach is designed using LSTM (Long Short-Term Memory) and word embedding techniques. The performance of the proposed approach is then validated using a set of standard evaluation metrics. In addition, the effectiveness of the selected models is evaluated with performance measurement techniques. The accuracy of the proposed approach is evaluated using two word embedding techniques: Word2Vec with Skip-gram (SG) and Word2Vec with Continuous Bag-of-Words (CBOW). Based on the results reported in this paper, the LSTM with Word2Vec CBOW achieved better results than the LSTM with Word2Vec SG embeddings. Our findings show that the proposed method yields 94% accuracy compared to state-of-the-art approaches. Consequently, LSTM performed better than other leading methods in the detection of disease-infected people in tweets. Finally, spatial analysis is performed to identify the disease-infected region.
17 citations
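The two Word2Vec variants compared in this entry differ in what they predict: Skip-gram predicts each context word from the center word, while CBOW predicts the center word from its surrounding context. The sketch below only shows how the training examples differ for the two variants; it does not train a model and is not the paper's pipeline. The function names and the `window` parameter are illustrative assumptions.

```python
def skipgram_pairs(tokens, window=2):
    """Skip-gram: one (center, context_word) pair per context position."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """CBOW: one (context_words, center) example per token with a non-empty context."""
    examples = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        if context:
            examples.append((context, center))
    return examples


if __name__ == "__main__":
    tweet = ["i", "have", "flu"]  # hypothetical tokenized post
    print(skipgram_pairs(tweet, window=1))
    print(cbow_examples(tweet, window=1))
```

Skip-gram produces more, smaller training examples per sentence, while CBOW pools the context into a single example per position.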
••
01 Dec 2016
TL;DR: The authors employed dynamic Gaussian Bayesian networks to learn significant network motifs of words and concepts, which were used to pre-train the convolutional neural network and capture the dynamics of discourse across several sentences.
Abstract: Subjectivity detection aims to distinguish natural language as either opinionated (positive or negative) or neutral. In word vector based convolutional neural network models, a word's meaning is simply a signal that helps to classify larger entities such as a document. Previous works do not usually consider the prior distribution when using sliding windows to learn word embeddings and, hence, they are unable to capture higher-order and long-range features in text. In this paper, we employ dynamic Gaussian Bayesian networks to learn significant network motifs of words and concepts. These motifs are used to pre-train the convolutional neural network and capture the dynamics of discourse across several sentences.
17 citations
••
01 Dec 2016
TL;DR: An improved tf-idf algorithm is combined with word embeddings to represent documents, and the combination is evaluated in text classification experiments on the Sogou Chinese classification corpus.
Abstract: Word2vec is a neural network language model which can convert words and phrases into high-quality distributed vectors (called word embeddings) that capture semantic word relationships, so it offers a unique perspective on text classification and other natural language processing (NLP) tasks. In this paper, we propose combining an improved tf-idf algorithm with word embeddings as a way to represent documents, and we conduct text classification experiments on the Sogou Chinese classification corpus. Our results show that the combination of word embeddings and the improved tf-idf algorithm can outperform either individually.
17 citations
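One common way to combine tf-idf with word embeddings is to represent a document as the tf-idf-weighted average of its word vectors. The sketch below illustrates that general idea only. It uses a standard smoothed tf-idf rather than the paper's improved variant, which the abstract does not specify, and the paper may combine the two signals differently. The function names and the toy corpus and vectors are hypothetical.

```python
import math
from collections import Counter


def idf_table(docs):
    """Smoothed inverse document frequency for every word in a tokenized corpus."""
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each word once per document
    return {w: math.log((1 + n_docs) / (1 + c)) + 1.0 for w, c in df.items()}


def doc_vector(doc, vectors, idf):
    """tf-idf-weighted average of the word vectors in one tokenized document.

    Words missing from `vectors` or `idf` are skipped; a document with no
    usable words maps to the zero vector.
    """
    counts = Counter(w for w in doc if w in vectors and w in idf)
    dim = len(next(iter(vectors.values())))
    weighted = [0.0] * dim
    total_weight = 0.0
    for w, c in counts.items():
        weight = (c / len(doc)) * idf[w]  # term frequency times idf
        total_weight += weight
        for i, x in enumerate(vectors[w]):
            weighted[i] += weight * x
    if total_weight == 0.0:
        return weighted
    return [x / total_weight for x in weighted]


# Hypothetical 2-D embeddings and a two-document corpus, for illustration only.
TOY_EMBEDDINGS = {
    "stock": [1.0, 0.0],
    "market": [0.8, 0.2],
    "football": [0.0, 1.0],
    "match": [0.2, 0.8],
    "the": [0.5, 0.5],
}
TOY_DOCS = [["stock", "market", "the"], ["football", "match", "the"]]

if __name__ == "__main__":
    idf = idf_table(TOY_DOCS)
    for doc in TOY_DOCS:
        print(doc, doc_vector(doc, TOY_EMBEDDINGS, idf))
```

The resulting fixed-length vectors can be passed to any standard classifier; words that appear in every document, such as "the" here, receive a lower weight than topic-specific words.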
••
TL;DR: A new method for the discovery of influential users in Instagram is designed, by focusing on user-generated posts as an alternative source of information, to potentially augment the existing solutions based on network topology or connections.
Abstract: Influencer marketing through social networks is becoming an important alternative to traditional ways of advertising. Various solutions have been proposed that often take advantage of graph-based a...
17 citations