Topic

Word embedding

About: Word embedding is a research topic. Over the lifetime, 4,683 publications have been published within this topic, receiving 153,378 citations. The topic is also known as: word embeddings.


Papers
Proceedings ArticleDOI
01 Sep 2017
TL;DR: A neural architecture that trains a sentiment-aware word embedding by integrating sentiment supervision at both the document and word levels is developed, to enhance the quality of the word embedding as well as the sentiment lexicon.
Abstract: Sentiment lexicon is an important tool for identifying the sentiment polarity of words and texts. How to automatically construct sentiment lexicons has become a research topic in the field of sentiment analysis and opinion mining. Recently there were some attempts to employ representation learning algorithms to construct a sentiment lexicon with sentiment-aware word embedding. However, these methods were normally trained under document-level sentiment supervision. In this paper, we develop a neural architecture to train a sentiment-aware word embedding by integrating the sentiment supervision at both document and word levels, to enhance the quality of word embedding as well as the sentiment lexicon. Experiments on the SemEval 2013-2016 datasets indicate that the sentiment lexicon generated by our approach achieves the state-of-the-art performance in both supervised and unsupervised sentiment classification, in comparison with several strong sentiment lexicon construction methods.

54 citations
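
As a rough illustration of the general idea in the abstract above (joint document- and word-level sentiment supervision over a shared embedding table), here is a minimal PyTorch sketch. All module names, shapes, and hyperparameters are illustrative assumptions, not the authors' architecture.

```python
# Minimal sketch: one embedding table trained under two losses --
# document-level polarity and word-level (lexicon) polarity.
import torch
import torch.nn as nn

class SentimentAwareEmbedding(nn.Module):
    def __init__(self, vocab_size, dim, num_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.doc_head = nn.Linear(dim, num_classes)   # document-level polarity
        self.word_head = nn.Linear(dim, num_classes)  # word-level (lexicon) polarity

    def forward(self, token_ids):
        vecs = self.emb(token_ids)                    # (batch, seq, dim)
        doc_logits = self.doc_head(vecs.mean(dim=1))  # average-pooled document vector
        word_logits = self.word_head(vecs)            # per-token polarity scores
        return doc_logits, word_logits

model = SentimentAwareEmbedding(vocab_size=10_000, dim=100)
ce = nn.CrossEntropyLoss(ignore_index=-1)  # -1 marks words absent from the seed lexicon

token_ids = torch.randint(0, 10_000, (8, 20))  # toy batch
doc_labels = torch.randint(0, 2, (8,))         # document-level supervision
word_labels = torch.full((8, 20), -1)          # word-level supervision, mostly unknown
word_labels[:, 0] = 1                          # pretend the first token is in the lexicon

doc_logits, word_logits = model(token_ids)
loss = ce(doc_logits, doc_labels) + ce(word_logits.reshape(-1, 2), word_labels.reshape(-1))
loss.backward()
```

The ignore_index trick lets the word-level loss apply only to tokens with a known lexicon label, while every token still contributes to the document-level loss through the shared embedding.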

Journal ArticleDOI
TL;DR: Convolutional Neural Networks (CNN) with margin loss and different embedding models are proposed for detecting fake news, and the proposed architectures are evaluated on two recent well-known datasets in the field, namely ISOT and LIAR.
Abstract: The advent of online news platforms such as social media, news blogs, and online newspapers in recent years, with features such as swift information flow, easy access, and low cost, encourages people to seek and consume the news they provide. These platforms also increase the opportunities for deceptive parties to influence public opinion and awareness by producing fake news, i.e., news consisting of false and deceptive information published to achieve specific political and economic gains. Since it is very difficult for individuals to discern fake news from its content alone, an automatic fake news detection approach for preventing the spread of such false information is essential. In this paper, Convolutional Neural Networks (CNN) with margin loss and different embedding models are proposed for detecting fake news. We compare static word embeddings with non-static embeddings, which can be incrementally up-trained and updated during the training phase. Our proposed architectures are evaluated on two recent well-known datasets in the field, namely ISOT and LIAR. Our results with the best architecture show encouraging performance, outperforming the state-of-the-art methods by 7.9% on ISOT and 2.1% on the test set of the LIAR dataset.

54 citations
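
For the flavor of a text CNN trained with a margin loss, here is a minimal PyTorch sketch; the filter sizes, margin, and the static/non-static toggle are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch: Kim-style text CNN classifier with a multi-class
# hinge (margin) loss instead of cross-entropy.
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    def __init__(self, vocab_size, dim=100, n_filters=128, num_classes=2, freeze=False):
        super().__init__()
        # freeze=True gives a "static" embedding; False lets it update ("non-static")
        self.emb = nn.Embedding(vocab_size, dim)
        self.emb.weight.requires_grad = not freeze
        self.convs = nn.ModuleList(
            nn.Conv1d(dim, n_filters, kernel_size=k) for k in (3, 4, 5)
        )
        self.fc = nn.Linear(3 * n_filters, num_classes)

    def forward(self, token_ids):
        x = self.emb(token_ids).transpose(1, 2)  # (batch, dim, seq)
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))

model = TextCNN(vocab_size=20_000)
margin_loss = nn.MultiMarginLoss(margin=1.0)   # hinge-style margin loss
logits = model(torch.randint(0, 20_000, (4, 50)))
loss = margin_loss(logits, torch.tensor([0, 1, 1, 0]))
loss.backward()
```

The margin loss pushes the score of the correct class above every other class by a fixed margin, rather than maximizing log-likelihood as cross-entropy does.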

Journal ArticleDOI
TL;DR: This paper tackles PPI extraction using convolutional neural networks (CNN) and proposes a shortest dependency path based CNN (sdpCNN) model that avoids the bias introduced by manual feature selection.
Abstract: The state-of-the-art methods for protein-protein interaction (PPI) extraction are primarily based on kernel methods, and their performance strongly depends on handcrafted features. In this paper, we tackle PPI extraction by using convolutional neural networks (CNN) and propose a shortest dependency path based CNN (sdpCNN) model. The proposed method (1) takes only the sdp and word embedding as input and (2) avoids bias from feature selection by using CNN. We performed experiments on the standard AIMed and BioInfer datasets, and the experimental results demonstrate that our approach outperforms state-of-the-art kernel-based methods. In particular, by tracking the sdpCNN model, we find that sdpCNN can extract key features automatically, and we verify that pretrained word embedding is crucial to the PPI task.

54 citations
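
A minimal sketch of the sdpCNN idea under stated assumptions: dependency parsing and extraction of the shortest dependency path between the two protein mentions are assumed to happen upstream, and the pretrained vectors here are random stand-ins, not real embeddings.

```python
# Minimal sketch: a CNN over shortest-dependency-path (sdp) tokens
# for binary PPI classification, initialized from pretrained vectors.
import torch
import torch.nn as nn

pretrained = torch.randn(5_000, 100)  # stand-in for pretrained word vectors

class SdpCNN(nn.Module):
    def __init__(self, pretrained_vectors, n_filters=100):
        super().__init__()
        # initialize from pretrained embeddings and fine-tune them
        self.emb = nn.Embedding.from_pretrained(pretrained_vectors, freeze=False)
        self.conv = nn.Conv1d(pretrained_vectors.size(1), n_filters, kernel_size=3)
        self.fc = nn.Linear(n_filters, 2)  # interaction / no interaction

    def forward(self, sdp_ids):
        x = self.emb(sdp_ids).transpose(1, 2)      # (batch, dim, path_len)
        h = self.conv(x).relu().max(dim=2).values  # max-pool over the path
        return self.fc(h)

# e.g. the padded token ids of the dependency path between two protein mentions
model = SdpCNN(pretrained)
logits = model(torch.randint(0, 5_000, (2, 7)))
```

Restricting the input to the dependency path, rather than the whole sentence, is what removes most of the manual feature engineering: the path already localizes the words that relate the two proteins.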

Journal ArticleDOI
TL;DR: The experimental results show that SVM performs well on the sentiment classification task with any of the feature models used; however, the Word2Vec model has the lowest accuracy compared to the other baseline methods, including Bag-of-Words models using binary TF, raw TF, and TF-IDF.
Abstract: Online product reviews have become a source of greatly valuable information for consumers in making purchase decisions and for producers in improving their product and marketing strategies. However, it becomes more and more difficult for people to understand and evaluate the general opinion about a particular product manually as the number of available reviews increases. Hence, an automatic approach is preferred. One of the most popular techniques is a machine learning approach such as the Support Vector Machine (SVM). In this study, we explore the use of the Word2Vec model as features in SVM-based sentiment analysis of product reviews in the Indonesian language. The experimental results show that SVM performs well on the sentiment classification task with any of the models used. However, the Word2Vec model has the lowest accuracy (only 0.70) compared to the other baseline methods, including Bag-of-Words models using binary TF, raw TF, and TF-IDF. This is because only a small dataset was used to train the Word2Vec model; Word2Vec needs many examples to learn word representations and place similar words in closer positions.

54 citations
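
The pipeline the abstract describes is straightforward to sketch with gensim and scikit-learn: train Word2Vec, average each review's word vectors, and fit an SVM. The toy corpus and labels below are illustrative only, and, as the abstract itself notes, a real Word2Vec model needs far more data than this.

```python
# Minimal sketch: Word2Vec features averaged per review, fed to an SVM.
import numpy as np
from gensim.models import Word2Vec
from sklearn.svm import SVC

# Toy tokenized Indonesian reviews with sentiment labels (1 = positive).
reviews = [["produk", "bagus", "sekali"], ["kualitas", "buruk"],
           ["bagus", "dan", "murah"], ["buruk", "sekali"]]
labels = [1, 0, 1, 0]

w2v = Word2Vec(sentences=reviews, vector_size=50, min_count=1, epochs=20)

def review_vector(tokens):
    # Average the vectors of in-vocabulary tokens; zeros if none are known.
    vecs = [w2v.wv[t] for t in tokens if t in w2v.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.vector_size)

X = np.stack([review_vector(r) for r in reviews])
clf = SVC(kernel="rbf").fit(X, labels)
print(clf.predict(X[:1]))
```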

Proceedings ArticleDOI
01 Aug 2016
TL;DR: A new calculus for subspaces is introduced that supports operations like “−1 × hate = love” and “give me a neutral word for greasy” (i.e., oleaginous), and extends analogy computations like “king−man+woman = queen”.
Abstract: We decompose a standard embedding space into interpretable orthogonal subspaces and a “remainder” subspace. We consider four interpretable subspaces in this paper: polarity, concreteness, frequency and part-of-speech (POS) subspaces. We introduce a new calculus for subspaces that supports operations like “−1 × hate = love” and “give me a neutral word for greasy” (i.e., oleaginous). This calculus extends analogy computations like “king−man+woman = queen”. For the tasks of Antonym Classification and POS Tagging our method outperforms the state of the art. We create test sets for Morphological Analogies and for the new task of Polarity Spectrum Creation.

53 citations
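
A small numpy sketch of the flavor of these operations, under the assumption of a one-dimensional "polarity" direction: negating a word's component along that direction while preserving the orthogonal remainder implements "−1 × hate", and the classic analogy arithmetic the calculus extends is shown alongside. The vectors here are random stand-ins, so the printed neighbors are arbitrary; only the arithmetic is meaningful.

```python
# Minimal sketch: flip a word's component along a "polarity" direction,
# leaving the remainder subspace untouched; plus plain analogy arithmetic.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["love", "hate", "greasy", "oleaginous", "king", "queen", "man", "woman"]
E = rng.normal(size=(len(vocab), 50))          # stand-in embedding matrix
E /= np.linalg.norm(E, axis=1, keepdims=True)  # unit-normalize rows
idx = {w: i for i, w in enumerate(vocab)}

polarity = rng.normal(size=50)                 # stand-in for a learned polarity axis
polarity /= np.linalg.norm(polarity)

def flip_polarity(v):
    # "-1 x word": negate the polarity component, keep the orthogonal remainder
    return v - 2 * (v @ polarity) * polarity

def nearest(v, exclude=()):
    sims = E @ v / np.linalg.norm(v)           # cosine similarity to all words
    for w in exclude:
        sims[idx[w]] = -np.inf
    return vocab[int(np.argmax(sims))]

print(nearest(flip_polarity(E[idx["hate"]]), exclude=("hate",)))
# The analogy computation the calculus generalizes:
print(nearest(E[idx["king"]] - E[idx["man"]] + E[idx["woman"]],
              exclude=("king", "man", "woman")))
```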


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations, 87% related
Unsupervised learning: 22.7K papers, 1M citations, 86% related
Deep learning: 79.8K papers, 2.1M citations, 85% related
Reinforcement learning: 46K papers, 1M citations, 84% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 84% related
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    317
2022    716
2021    736
2020    1,025
2019    1,078
2018    788