Journal Article

Refining Word Embeddings Using Intensity Scores for Sentiment Analysis

TLDR
A word vector refinement model is proposed that refines existing pretrained word vectors using real-valued sentiment intensity scores from sentiment lexicons, moving each word vector closer to lexicon words that are both semantically and sentimentally similar (i.e., have similar intensity scores) and further away from sentimentally dissimilar ones.
Abstract
Word embeddings that provide continuous low-dimensional vector representations of words have been extensively used for various natural language processing tasks. However, existing context-based word embeddings such as Word2vec and GloVe typically fail to capture sufficient sentiment information, which may result in words with similar vector representations having an opposite sentiment polarity (e.g., good and bad), thus degrading sentiment analysis performance. To tackle this problem, recent studies have suggested learning sentiment embeddings to incorporate the sentiment polarity (positive and negative) information from labeled corpora. This study adopts another strategy to learn sentiment embeddings. Instead of creating a new word embedding from labeled corpora, we propose a word vector refinement model to refine existing pretrained word vectors using real-valued sentiment intensity scores provided by sentiment lexicons. The idea of the refinement model is to improve each word vector such that it can be closer in the lexicon to both semantically and sentimentally similar words (i.e., those with similar intensity scores) and further away from sentimentally dissimilar words (i.e., those with dissimilar intensity scores). An obvious advantage of the proposed method is that it can be applied to any pretrained word embeddings. In addition, the intensity scores can provide more fine-grained (real-valued) sentiment information than binary polarity labels to guide the refinement process. Experimental results show that the proposed refinement model can improve both conventional word embeddings and previously proposed sentiment embeddings for binary, ternary, and fine-grained sentiment classification on the SemEval and Stanford Sentiment Treebank datasets.
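To make the idea concrete, here is a minimal Python sketch of one plausible reading of the refinement procedure: for each word covered by the lexicon, find its nearest neighbors in the embedding space, re-rank them by how close their intensity scores are to the target word's score, and nudge the vector toward the sentimentally closest neighbors. The function names, the 1/(rank+1) weights, and the gamma-weighted update below are illustrative assumptions, not the paper's exact formulation.

import numpy as np

def cosine(a, b):
    # cosine similarity between two dense vectors
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def refine(embeddings, lexicon, k=10, gamma=0.9, iterations=10):
    # embeddings: dict word -> vector; lexicon: dict word -> real-valued
    # intensity score (e.g., a 1-9 valence rating); both supplied by the caller.
    words = [w for w in lexicon if w in embeddings]
    vecs = {w: np.asarray(embeddings[w], dtype=float) for w in words}
    for _ in range(iterations):
        updated = {}
        for w in words:
            v = vecs[w]
            # top-k semantic neighbors among the lexicon words
            nearest = sorted((n for n in words if n != w),
                             key=lambda n: cosine(v, vecs[n]),
                             reverse=True)[:k]
            # re-rank neighbors by sentiment-intensity closeness
            nearest.sort(key=lambda n: abs(lexicon[w] - lexicon[n]))
            # sentimentally closer neighbors pull harder (assumed 1/(rank+1) weights)
            weights = np.array([1.0 / (r + 1) for r in range(len(nearest))])
            weights /= weights.sum()
            target = sum(wt * vecs[n] for wt, n in zip(weights, nearest))
            # keep most of the original (semantic) vector, move slightly toward target
            updated[w] = gamma * v + (1.0 - gamma) * target
        vecs.update(updated)
    return vecs

Because the update interpolates rather than replaces, words not covered by the lexicon keep their original vectors, which is consistent with the claim that the method can be layered on top of any pretrained embedding.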


Citations
Journal Article

Sentiment analysis using deep learning architectures: a review

TL;DR: This paper provides a detailed survey of popular deep learning models increasingly applied to sentiment analysis, presents a taxonomy of the field, and highlights the power of deep learning architectures for solving sentiment analysis problems.
Journal Article

Evaluating word embedding models: methods and experimental results

TL;DR: This article evaluates a large number of word embedding models for language processing applications and adopts correlation analysis to study the performance consistency of extrinsic and intrinsic evaluators.
Journal Article

Transformer based Deep Intelligent Contextual Embedding for Twitter sentiment analysis

TL;DR: This paper presents DICET, a transformer-based method for sentiment analysis that encodes representations from a transformer and applies deep intelligent contextual embedding to enhance the quality of tweets by removing noise, while taking word sentiments, polysemy, syntax, and semantic knowledge into account.
Journal Article

Neo: A Learned Query Optimizer

TL;DR: Neo (Neural Optimizer) is a learning-based query optimizer that relies on deep neural networks to generate query execution plans; it can adapt to underlying data patterns and is robust to estimation errors.
Journal Article

Sentiment analysis on the impact of coronavirus in social life using the BERT model

TL;DR: In this article, sentiment analysis using the BERT model was performed on tweets to understand people's opinions and mental state during the coronavirus pandemic.
References
Proceedings Article

Glove: Global Vectors for Word Representation

TL;DR: A new global log-bilinear regression model that combines the advantages of the two major model families in the literature (global matrix factorization and local context window methods) and produces a vector space with meaningful substructure.
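For reference, the GloVe model summarized here fits word vectors via a weighted least-squares objective over the word-word co-occurrence counts X_ij (formula recalled from the GloVe paper):

J = \sum_{i,j=1}^{V} f(X_{ij}) \left( w_i^\top \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2

where f caps the influence of very frequent pairs, w_i and \tilde{w}_j are word and context vectors, and b_i, \tilde{b}_j are biases.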
Proceedings Article

Distributed Representations of Words and Phrases and their Compositionality

TL;DR: This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
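The negative-sampling alternative mentioned in this summary replaces the full softmax: for an observed pair of input word w_I and output word w_O, it maximizes (formula recalled from the paper):

\log \sigma\!\left( {v'_{w_O}}^\top v_{w_I} \right) + \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)} \left[ \log \sigma\!\left( -{v'_{w_i}}^\top v_{w_I} \right) \right]

i.e., the observed pair is scored as real while k words drawn from a noise distribution P_n(w) are scored as fake.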
Posted Content

Efficient Estimation of Word Representations in Vector Space

TL;DR: This paper proposes two novel model architectures for computing continuous vector representations of words from very large data sets; the quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks.
Proceedings Article

Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation

TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
Journal Article

WordNet: a lexical database for English

TL;DR: WordNet is an online lexical database designed for use under program control; it provides a more effective combination of traditional lexicographic information and modern computing.