scispace - formally typeset
Search or ask a question

Showing papers on "Sentiment analysis published in 2016"


Proceedings Article
16 Mar 2016
TL;DR: This paper aims to undertake a stepwise methodology to determine the effects of an average person's tweets over fluctuation of stock prices of a multinational firm called Samsung Electronics Ltd.
Abstract: In the recent years, micro blogging platforms like Twitter have become instrumental in gauging public mood. So it makes it a feasible predictive strategy to speculate the rise and fall of stock prices. This paper aims to undertake a stepwise methodology to determine the effects of an average person's tweets over fluctuation of stock prices of a multinational firm called Samsung Electronics Ltd. It involves extracting tweets from twitter, data cleansing and application of a suitable algorithm in order to get the adequate sentiment analysis. The vast impact created by twitter data feed has been greatly studied in this paper. Attempts have been made to design an algorithm which works well analysing the positive, negative and neutral tweets.

1,978 citations


Proceedings ArticleDOI
01 Nov 2016
TL;DR: This paper reveals that the sentiment polarity of a sentence is not only determined by the content but is also highly related to the concerned aspect, and proposes an Attention-based Long Short-Term Memory Network for aspect-level sentiment classification.
Abstract: Aspect-level sentiment classification is a fine-grained task in sentiment analysis. Since it provides more complete and in-depth results, aspect-level sentiment analysis has received much attention these years. In this paper, we reveal that the sentiment polarity of a sentence is not only determined by the content but is also highly related to the concerned aspect. For instance, “The appetizers are ok, but the service is slow.”, for aspect taste, the polarity is positive while for service, the polarity is negative. Therefore, it is worthwhile to explore the connection between an aspect and the content of a sentence. To this end, we propose an Attention-based Long Short-Term Memory Network for aspect-level sentiment classification. The attention mechanism can concentrate on different parts of a sentence when different aspects are taken as input. We experiment on the SemEval 2014 dataset and results show that our model achieves state-ofthe-art performance on aspect-level sentiment classification.

1,722 citations


Journal ArticleDOI
TL;DR: The emerging fields of affective computing and sentiment analysis, which leverage human-computer interaction, information retrieval, and multimodal signal processing for distilling people's sentiments from the ever-growing amount of online social data.
Abstract: Understanding emotions is an important aspect of personal development and growth, and as such it is a key tile for the emulation of human intelligence. Besides being important for the advancement of AI, emotion processing is also important for the closely related task of polarity detection. The opportunity to automatically capture the general public's sentiments about social events, political movements, marketing campaigns, and product preferences has raised interest in both the scientific community, for the exciting open challenges, and the business world, for the remarkable fallouts in marketing and financial market prediction. This has led to the emerging fields of affective computing and sentiment analysis, which leverage human-computer interaction, information retrieval, and multimodal signal processing for distilling people's sentiments from the ever-growing amount of online social data.

1,153 citations


Proceedings ArticleDOI
01 Jan 2016
TL;DR: This paper describes the SemEval 2016 shared task on Aspect Based Sentiment Analysis (ABSA), a continuation of the respective tasks of 2014 and 2015, which attracted 245 submissions from 29 teams and provided 19 training and 20 testing datasets.
Abstract: This paper describes the SemEval 2016 shared task on Aspect Based Sentiment Analysis (ABSA), a continuation of the respective tasks of 2014 and 2015. In its third year, the task provided 19 training and 20 testing datasets for 8 languages and 7 domains, as well as a common evaluation procedure. From these datasets, 25 were for sentence-level and 14 for text-level ABSA; the latter was introduced for the first time as a subtask in SemEval. The task attracted 245 submissions from 29 teams.

1,139 citations


Proceedings ArticleDOI
05 Nov 2016
TL;DR: This paper propose a machine reading simulator which processes text incrementally from left to right and performs shallow reasoning with memory and attention, enabling adaptive memory usage during recurrence with neural attention, offering a way to weakly induce relations among tokens.
Abstract: In this paper we address the question of how to render sequence-level networks better at handling structured input. We propose a machine reading simulator which processes text incrementally from left to right and performs shallow reasoning with memory and attention. The reader extends the Long Short-Term Memory architecture with a memory network in place of a single memory cell. This enables adaptive memory usage during recurrence with neural attention, offering a way to weakly induce relations among tokens. The system is initially designed to process a single sequence but we also demonstrate how to integrate it with an encoder-decoder architecture. Experiments on language modeling, sentiment analysis, and natural language inference show that our model matches or outperforms the state of the art.

879 citations


Journal ArticleDOI
TL;DR: In this paper, the authors describe the nuances of the textual analysis and some of the tripwires in implementation and highlight the contemporary textual analysis literature and highlight areas of future research.
Abstract: Relative to quantitative methods traditionally used in accounting and finance, textual analysis is substantially less precise. Thus, understanding the art is of equal importance to understanding the science. In this survey, we describe the nuances of the method and, as users of textual analysis, some of the tripwires in implementation. We also review the contemporary textual analysis literature and highlight areas of future research.

858 citations


Proceedings Article
19 Jun 2016
TL;DR: This paper introduced the dynamic memory network (DMN), a neural network architecture which processes input sequences and questions, forms episodic memories, and generates relevant answers, which can be trained end-to-end and obtains state-of-the-art results on several types of tasks and datasets.
Abstract: Most tasks in natural language processing can be cast into question answering (QA) problems over language input. We introduce the dynamic memory network (DMN), a neural network architecture which processes input sequences and questions, forms episodic memories, and generates relevant answers. Questions trigger an iterative attention process which allows the model to condition its attention on the inputs and the result of previous iterations. These results are then reasoned over in a hierarchical recurrent sequence model to generate answers. The DMN can be trained end-to-end and obtains state-of-the-art results on several types of tasks and datasets: question answering (Facebook's bAbI dataset), text classification for sentiment analysis (Stanford Sentiment Treebank) and sequence modeling for part-of-speech tagging (WSJ-PTB). The training for these different tasks relies exclusively on trained word vector representations and input-question-answer triplets.

843 citations


Journal ArticleDOI
TL;DR: This paper used a 7-layer deep convolutional neural network to tag each word in opinionated sentences as either aspect or non-aspect word, and developed a set of linguistic patterns for the same purpose and combined them with the neural network.
Abstract: In this paper, we present the first deep learning approach to aspect extraction in opinion mining. Aspect extraction is a subtask of sentiment analysis that consists in identifying opinion targets in opinionated text, i.e., in detecting the specific aspects of a product or service the opinion holder is either praising or complaining about. We used a 7-layer deep convolutional neural network to tag each word in opinionated sentences as either aspect or non-aspect word. We also developed a set of linguistic patterns for the same purpose and combined them with the neural network. The resulting ensemble classifier, coupled with a word-embedding model for sentiment analysis, allowed our approach to obtain significantly better accuracy than state-of-the-art methods.

716 citations


Proceedings ArticleDOI
01 Jun 2016
TL;DR: The SemEval-2016 Task 4 comprises five subtasks, three of which represent a significant departure from previous editions. as mentioned in this paper discusses the fourth year of the Sentiment Analysis in Twitter Task and discusses the three new subtasks focus on two variants of the basic sentiment classification in Twitter task.
Abstract: This paper discusses the fourth year of the ”Sentiment Analysis in Twitter Task”. SemEval-2016 Task 4 comprises five subtasks, three of which represent a significant departure from previous editions. The first two subtasks are reruns from prior years and ask to predict the overall sentiment, and the sentiment towards a topic in a tweet. The three new subtasks focus on two variants of the basic “sentiment classification in Twitter” task. The first variant adopts a five-point scale, which confers an ordinal character to the classification task. The second variant focuses on the correct estimation of the prevalence of each class of interest, a task which has been called quantification in the supervised learning literature. The task continues to be very popular, attracting a total of 43 teams.

702 citations


Journal ArticleDOI
TL;DR: A novel solution to target-oriented sentiment summarization and SA of short informal texts with a main focus on Twitter posts known as “tweets” is introduced and it is shown that the hybrid polarity detection system not only outperforms the unigram state-of-the-art baseline, but also could be an advantage over other methods when used as a part of a sentiment summarizing system.
Abstract: Sentiment Analysis (SA) and summarization has recently become the focus of many researchers, because analysis of online text is beneficial and demanded in many different applications. One such application is productbased sentiment summarization of multi-documents with the purpose of informing users about pros and cons of various products. This paper introduces a novel solution to target-oriented sentiment summarization and SA of short informal texts with a main focus on Twitter posts known as “tweets”. We compare different algorithms and methods for SA polarity detection and sentiment summarization. We show that our hybrid polarity detection system not only outperforms the unigram state-of-the-art baseline, but also could be an advantage over other methods when used as a part of a sentiment summarization system. Additionally, we illustrate that our SA and summarization system exhibits a high performance with various useful functionalities and features. Sentiment classification aims to automatically predict sentiment polarity (e.g., positive or negative) of users publishing sentiment data (e.g., reviews, blogs). Although traditional classification algorithms can be used to train sentiment classifiers from manually labeled text data, the labeling work can be time-consuming and ex-pensive. Meanwhile, users often use some different words when they express sentiment in different domains. If we directly apply a classifier trained in one domain to other domains, the performance will be very low due to the differences between these domains. In this work, we develop a general solution to sentiment classification when we do not have any labels in a target domain but have some labeled data in a different domain, regarded as source domain.

679 citations


Journal ArticleDOI
TL;DR: An in-depth overview of the current state-of-the-art of aspect-level sentiment analysis is given, showing the tremendous progress that has been made in finding both the target, which can be an entity as such, or some aspect of it, and the corresponding sentiment.
Abstract: The field of sentiment analysis, in which sentiment is gathered, analyzed, and aggregated from text, has seen a lot of attention in the last few years. The corresponding growth of the field has resulted in the emergence of various subareas, each addressing a different level of analysis or research question. This survey focuses on aspect-level sentiment analysis, where the goal is to find and aggregate sentiment on entities mentioned within documents or aspects of them. An in-depth overview of the current state-of-the-art is given, showing the tremendous progress that has already been made in finding both the target, which can be an entity as such, or some aspect of it, and the corresponding sentiment. Aspect-level sentiment analysis yields very fine-grained sentiment information which can be useful for applications in various domains. Current solutions are categorized based on whether they provide a method for aspect detection, sentiment analysis, or both. Furthermore, a breakdown based on the type of algorithm used is provided. For each discussed study, the reported performance is included. To facilitate the quantitative evaluation of the various proposed methods, a call is made for the standardization of the evaluation methodology that includes the use of shared data sets. Semantically-rich concept-centric aspect-level sentiment analysis is discussed and identified as one of the most promising future research direction.

Journal ArticleDOI
TL;DR: This survey describes the nuances of the method and, as users of textual analysis, some of the tripwires in implementation and reviews the contemporary textual analysis literature to highlight areas of future research.
Abstract: Relative to quantitative methods traditionally used in accounting and finance, textual analysis is substantially less precise Thus, understanding the art is of equal importance to understanding the science In this survey we describe the nuances of the method and, as users of textual analysis, some of the tripwires in implementation We also review the contemporary textual analysis literature and highlight areas of future research

Journal ArticleDOI
01 Sep 2016
TL;DR: This comprehensive introduction to sentiment analysis takes a natural-language-processing point of view to help readers understand the underlying structure of the problem and the language constructs commonly used to express opinions, sentiments, and emotions.

Proceedings ArticleDOI
Zhiting Hu1, Xuezhe Ma1, Zhengzhong Liu1, Eduard Hovy1, Eric P. Xing1 
21 Mar 2016
TL;DR: This paper proposed a general framework capable of enhancing various types of neural networks (e.g., CNNs and RNNs) with declarative first-order logic rules, which transferred the structured information of logic rules into the weights of neural network.
Abstract: Combining deep neural networks with structured logic rules is desirable to harness flexibility and reduce uninterpretability of the neural models. We propose a general framework capable of enhancing various types of neural networks (e.g., CNNs and RNNs) with declarative first-order logic rules. Specifically, we develop an iterative distillation method that transfers the structured information of logic rules into the weights of neural networks. We deploy the framework on a CNN for sentiment analysis, and an RNN for named entity recognition. With a few highly intuitive rules, we obtain substantial improvements and achieve state-of-the-art or comparable results to previous best-performing systems.

Posted Content
TL;DR: This paper proposes a general methodology to analyze and interpret decisions from a neural model by observing the effects on the model of erasing various parts of the representation, such as input word-vector dimensions, intermediate hidden units, or input words.
Abstract: While neural networks have been successfully applied to many natural language processing tasks, they come at the cost of interpretability. In this paper, we propose a general methodology to analyze and interpret decisions from a neural model by observing the effects on the model of erasing various parts of the representation, such as input word-vector dimensions, intermediate hidden units, or input words. We present several approaches to analyzing the effects of such erasure, from computing the relative difference in evaluation metrics, to using reinforcement learning to erase the minimum set of input words in order to flip a neural model's decision. In a comprehensive analysis of multiple NLP tasks, including linguistic feature classification, sentence-level sentiment analysis, and document level sentiment aspect prediction, we show that the proposed methodology not only offers clear explanations about neural model decisions, but also provides a way to conduct error analysis on neural models.

Proceedings ArticleDOI
01 Dec 2016
TL;DR: A novel method to extract features from visual and textual modalities using deep convolutional neural networks and significantly outperform the state of the art of multimodal emotion recognition and sentiment analysis on different datasets is presented.
Abstract: Technology has enabled anyone with an Internet connection to easily create and share their ideas, opinions and content with millions of other people around the world. Much of the content being posted and consumed online is multimodal. With billions of phones, tablets and PCs shipping today with built-in cameras and a host of new video-equipped wearables like Google Glass on the horizon, the amount of video on the Internet will only continue to increase. It has become increasingly difficult for researchers to keep up with this deluge of multimodal content, let alone organize or make sense of it. Mining useful knowledge from video is a critical need that will grow exponentially, in pace with the global growth of content. This is particularly important in sentiment analysis, as both service and product reviews are gradually shifting from unimodal to multimodal. We present a novel method to extract features from visual and textual modalities using deep convolutional neural networks. By feeding such features to a multiple kernel learning classifier, we significantly outperform the state of the art of multimodal emotion recognition and sentiment analysis on different datasets.

Journal ArticleDOI
TL;DR: A survey which covers Opining Mining, Sentiment Analysis, techniques, tools and classification is presented which covers the polarity of extracted public opinions.
Abstract: In this age, in this nation, public sentiment is everything. With it, nothing can fail; against it, nothing can succeed. Whoever molds public sentiment goes deeper than he who enacts statutes, or pronounces judicial decisions (Abraham Lincoln, 1858 ) [1]. It is apparent from President Lincoln's well known quote that legislators understood the force of open assumption quite a while prior. In today world, the Internet is the main source of information. An enormous amount of information and opinion online is scattered and unstructured with no machine to arrange it. Because of demand the public to know opinions about exact product and services, political issues, or social scientists. That’s led us to study of field Opining Mining and Sentiment Analysis. Opining Mining and Sentiment Analysis have recently played a significant role for researchers because analysis of online text is beneficial for the market research political issue, business intelligence, online shopping, and scientific survey from psychological. Sentiment Analysis identifies the polarity of extracted public opinions. This paper presents a survey which covers Opining Mining, Sentiment Analysis, techniques, tools and classification.

Journal ArticleDOI
TL;DR: This paper presents a survey on the sentiment analysis challenges relevant to their approaches and techniques, and presents a summary of current techniques and approaches used for sentiment analysis.
Abstract: With accelerated evolution of the internet as websites, social networks, blogs, online portals, reviews, opinions, recommendations, ratings, and feedback are generated by writers. This writer generated sentiment content can be about books, people, hotels, products, research, events, etc. These sentiments become very beneficial for businesses, governments, and individuals. While this content is meant to be useful, a bulk of this writer generated content require using the text mining techniques and sentiment analysis. But there are several challenges facing the sentiment analysis and evaluation process. These challenges become obstacles in analyzing the accurate meaning of sentiments and detecting the suitable sentiment polarity. Sentiment analysis is the practice of applying natural language processing and text analysis techniques to identify and extract subjective information from text. This paper presents a survey on the sentiment analysis challenges relevant to their approaches and techniques.

Journal ArticleDOI
TL;DR: Four different machine learning algorithms such as Naive Bayes (NB), Maximum Entropy (ME), Stochastic Gradient Descent (SGD), and Support Vector Machine (SVM) have been considered for classification of human sentiments.
Abstract: A large number of sentiment reviews, blogs and comments present online.These reviews must be classified to obtain a meaningful information.Four different supervised machine learning algorithm used for classification.Unigram, Bigram, Trigram models and their combinations used for classification.The classification is done on IMDb movie review dataset. With the ever increasing social networking and online marketing sites, the reviews and blogs obtained from those, act as an important source for further analysis and improved decision making. These reviews are mostly unstructured by nature and thus, need processing like classification or clustering to provide a meaningful information for future uses. These reviews and blogs may be classified into different polarity groups such as positive, negative, and neutral in order to extract information from the input dataset. Supervised machine learning methods help to classify these reviews. In this paper, four different machine learning algorithms such as Naive Bayes (NB), Maximum Entropy (ME), Stochastic Gradient Descent (SGD), and Support Vector Machine (SVM) have been considered for classification of human sentiments. The accuracy of different methods are critically examined in order to access their performance on the basis of parameters such as precision, recall, f-measure, and accuracy.

Proceedings ArticleDOI
07 Aug 2016
TL;DR: A regional CNN-LSTM model consisting of two parts: regional CNN and LSTM to predict the VA ratings of texts is proposed, showing that the proposed method outperforms lexicon-based, regression- based, and NN-based methods proposed in previous studies.
Abstract: Dimensional sentiment analysis aims to recognize continuous numerical values in multiple dimensions such as the valencearousal (VA) space. Compared to the categorical approach that focuses on sentiment classification such as binary classification (i.e., positive and negative), the dimensional approach can provide more fine-grained sentiment analysis. This study proposes a regional CNN-LSTM model consisting of two parts: regional CNN and LSTM to predict the VA ratings of texts. Unlike a conventional CNN which considers a whole text as input, the proposed regional CNN uses an individual sentence as a region, dividing an input text into several regions such that the useful affective information in each region can be extracted and weighted according to their contribution to the VA prediction. Such regional information is sequentially integrated across regions using LSTM for VA prediction. By combining the regional CNN and LSTM, both local (regional) information within sentences and long-distance dependency across sentences can be considered in the prediction process. Experimental results show that the proposed method outperforms lexicon-based, regression-based, and NN-based methods proposed in previous studies.

Journal ArticleDOI
TL;DR: This paper proposes a novel methodology for multimodal sentiment analysis, which consists in harvesting sentiments from Web videos by demonstrating a model that uses audio, visual and textual modalities as sources of information.

Journal ArticleDOI
TL;DR: Fields related to sentiment analysis in Twitter including Twitter opinion retrieval, tracking sentiments over time, irony detection, emotion detection, and tweet sentiment quantification, tasks that have recently attracted increasing attention are discussed.
Abstract: Sentiment analysis in Twitter is a field that has recently attracted research interest. Twitter is one of the most popular microblog platforms on which users can publish their thoughts and opinions. Sentiment analysis in Twitter tackles the problem of analyzing the tweets in terms of the opinion they express. This survey provides an overview of the topic by investigating and briefly describing the algorithms that have been proposed for sentiment analysis in Twitter. The presented studies are categorized according to the approach they follow. In addition, we discuss fields related to sentiment analysis in Twitter including Twitter opinion retrieval, tracking sentiments over time, irony detection, emotion detection, and tweet sentiment quantification, tasks that have recently attracted increasing attention. Resources that have been used in the Twitter sentiment analysis literature are also briefly presented. The main contributions of this survey include the presentation of the proposed approaches for sentiment analysis in Twitter, their categorization according to the technique they use, and the discussion of recent research trends of the topic and its related fields.

Journal ArticleDOI
TL;DR: Different from typical lexicon-based approaches, SentiCircles takes into account the co-occurrence patterns of words in different contexts in tweets to capture their semantics and update their pre-assigned strength and polarity in sentiment lexicons accordingly.
Abstract: We propose a semantic sentiment representation of words called SentiCircle.SentiCircle captures the contextual semantic of words from their co-occurrences.SentiCircle updates the sentiment of words based on their contextual semantics.SentiCircle can be used to perform entity- and tweet-level level sentiment analysis. Sentiment analysis on Twitter has attracted much attention recently due to its wide applications in both, commercial and public sectors. In this paper we present SentiCircles, a lexicon-based approach for sentiment analysis on Twitter. Different from typical lexicon-based approaches, which offer a fixed and static prior sentiment polarities of words regardless of their context, SentiCircles takes into account the co-occurrence patterns of words in different contexts in tweets to capture their semantics and update their pre-assigned strength and polarity in sentiment lexicons accordingly. Our approach allows for the detection of sentiment at both entity-level and tweet-level. We evaluate our proposed approach on three Twitter datasets using three different sentiment lexicons to derive word prior sentiments. Results show that our approach significantly outperforms the baselines in accuracy and F-measure for entity-level subjectivity (neutral vs. polar) and polarity (positive vs. negative) detections. For tweet-level sentiment detection, our approach performs better than the state-of-the-art SentiStrength by 4-5% in accuracy in two datasets, but falls marginally behind by 1% in F-measure in the third dataset.

Journal ArticleDOI
TL;DR: This article addresses the fundamental question of exploiting the dynamics between visual gestures and verbal messages to be able to better model sentiment by introducing the first multimodal dataset with opinion-level sentiment intensity annotations and proposing a new computational representation, called multi-modal dictionary, based on a language-gesture study.
Abstract: People share their opinions, stories, and reviews through online video sharing websites every day. The automatic analysis of these online opinion videos is bringing new or understudied research challenges to the field of computational linguistics and multimodal analysis. Among these challenges is the fundamental question of exploiting the dynamics between visual gestures and verbal messages to be able to better model sentiment. This article addresses this question in four ways: introducing the first multimodal dataset with opinion-level sentiment intensity annotations; studying the prototypical interaction patterns between facial gestures and spoken words when inferring sentiment intensity; proposing a new computational representation, called multimodal dictionary, based on a language-gesture study; and evaluating the authors' proposed approach in a speaker-independent paradigm for sentiment intensity prediction. The authors' study identifies four interaction types between facial gestures and verbal content: neutral, emphasizer, positive, and negative interactions. Experiments show statistically significant improvement when using multimodal dictionary representation over the conventional early fusion representation (that is, feature concatenation).

Proceedings Article
Peng Zhou1, Zhenyu Qi1, Suncong Zheng1, Jiaming Xu1, Hongyun Bao1, Bo Xu1 
21 Nov 2016
TL;DR: One of the proposed models achieves highest accuracy on Stanford Sentiment Treebank binary classification and fine-grained classification tasks and also utilizes 2D convolution to sample more meaningful information of the matrix.
Abstract: Recurrent Neural Network (RNN) is one of the most popular architectures used in Natural Language Processsing (NLP) tasks because its recurrent structure is very suitable to process variablelength text. RNN can utilize distributed representations of words by first converting the tokens comprising each text into vectors, which form a matrix. And this matrix includes two dimensions: the time-step dimension and the feature vector dimension. Then most existing models usually utilize one-dimensional (1D) max pooling operation or attention-based operation only on the time-step dimension to obtain a fixed-length vector. However, the features on the feature vector dimension are not mutually independent, and simply applying 1D pooling operation over the time-step dimension independently may destroy the structure of the feature representation. On the other hand, applying two-dimensional (2D) pooling operation over the two dimensions may sample more meaningful features for sequence modeling tasks. To integrate the features on both dimensions of the matrix, this paper explores applying 2D max pooling operation to obtain a fixed-length representation of the text. This paper also utilizes 2D convolution to sample more meaningful information of the matrix. Experiments are conducted on six text classification tasks, including sentiment analysis, question classification, subjectivity classification and newsgroup classification. Compared with the state-of-the-art models, the proposed models achieve excellent performance on 4 out of 6 tasks. Specifically, one of the proposed models achieves highest accuracy on Stanford Sentiment Treebank binary classification and fine-grained classification tasks.

Journal ArticleDOI
TL;DR: A survey and a comparative analyses of existing techniques for opinion mining like machine learning and lexicon-based approaches, together with evaluation metrics are provided, using various machine learning algorithms on twitter data streams.
Abstract: With the advancement of web technology and its growth, there is a huge volume of data present in the web for internet users and a lot of data is generated too. Internet has become a platform for online learning, exchanging ideas and sharing opinions. Social networking sites like Twitter, Facebook, Google+ are rapidly gaining popularity as they allow people to share and express their views about topics,have discussion with different communities, or post messages across the world. There has been lot of work in the field of sentiment analysis of twitter data. This survey focuses mainly on sentiment analysis of twitter data which is helpful to analyze the information in the tweets where opinions are highly unstructured, heterogeneous and are either positive or negative, or neutral in some cases. In this paper, we provide a survey and a comparative analyses of existing techniques for opinion mining like machine learning and lexicon-based approaches, together with evaluation metrics. Using various machine learning algorithms like Naive Bayes, Max Entropy, and Support Vector Machine, we provide a research on twitter data streams.General challenges and applications of Sentiment Analysis on Twitter are also discussed in this paper.

Book ChapterDOI
01 Jan 2016
TL;DR: Sentiment analysis is the task of automatically determining from text the attitude, emotion, or some other affectual state of the author as mentioned in this paper, which is a difficult task due to the complexity and subtlety of language use.
Abstract: Sentiment analysis is the task of automatically determining from text the attitude, emotion, or some other affectual state of the author. This chapter summarizes the diverse landscape of tasks and applications associated with sentiment analysis. We outline key challenges stemming from the complexity and subtlety of language use, the prevalence of creative and non-standard language, and the lack of paralinguistic information, such as tone and stress markers. We describe automatic systems and datasets commonly used in sentiment analysis. We summarize several manual and automatic approaches to creating valence- and emotion-association lexicons. We also discuss preliminary approaches for sentiment composition (how smaller units of text combine to express sentiment) and approaches for detecting sentiment in figurative and metaphoric language—these are the areas where we expect to see significant work in the near future.

Journal ArticleDOI
TL;DR: A benchmark comparison of twenty-four popular sentiment analysis methods, covering messages posted on social networks, movie and product reviews, as well as opinions and comments in news articles is presented, highlighting the extent to which the prediction performance of these methods varies considerably across datasets.
Abstract: In the last few years thousands of scientific papers have investigated sentiment analysis, several startups that measure opinions on real data have emerged and a number of innovative products related to this theme have been developed. There are multiple methods for measuring sentiments, including lexical-based and supervised machine learning methods. Despite the vast interest on the theme and wide popularity of some methods, it is unclear which one is better for identifying the polarity (i.e., positive or negative) of a message. Accordingly, there is a strong need to conduct a thorough apple-to-apple comparison of sentiment analysis methods, as they are used in practice, across multiple datasets originated from different data sources. Such a comparison is key for understanding the potential limitations, advantages, and disadvantages of popular methods. This article aims at filling this gap by presenting a benchmark comparison of twenty-four popular sentiment analysis methods (which we call the state-of-the-practice methods). Our evaluation is based on a benchmark of eighteen labeled datasets, covering messages posted on social networks, movie and product reviews, as well as opinions and comments in news articles. Our results highlight the extent to which the prediction performance of these methods varies considerably across datasets. Aiming at boosting the development of this research area, we open the methods’ codes and datasets used in this article, deploying them in a benchmark system, which provides an open API for accessing and comparing sentence-level sentiment analysis methods.

Journal ArticleDOI
Duyu Tang1, Furu Wei2, Bing Qin1, Nan Yang2, Ting Liu1, Ming Zhou2 
TL;DR: This work develops a number of neural networks with tailoring loss functions, and applies sentiment embeddings to word-level sentiment analysis, sentence level sentiment classification, and building sentiment lexicons, showing results that consistently outperform context-basedembeddings on several benchmark datasets of these tasks.
Abstract: We propose learning sentiment-specific word embeddings dubbed sentiment embeddings in this paper. Existing word embedding learning algorithms typically only use the contexts of words but ignore the sentiment of texts. It is problematic for sentiment analysis because the words with similar contexts but opposite sentiment polarity, such as good and bad , are mapped to neighboring word vectors. We address this issue by encoding sentiment information of texts (e.g., sentences and words) together with contexts of words in sentiment embeddings. By combining context and sentiment level evidences, the nearest neighbors in sentiment embedding space are semantically similar and it favors words with the same sentiment polarity. In order to learn sentiment embeddings effectively, we develop a number of neural networks with tailoring loss functions, and collect massive texts automatically with sentiment signals like emoticons as the training data. Sentiment embeddings can be naturally used as word features for a variety of sentiment analysis tasks without feature engineering. We apply sentiment embeddings to word-level sentiment analysis, sentence level sentiment classification, and building sentiment lexicons. Experimental results show that sentiment embeddings consistently outperform context-based embeddings on several benchmark datasets of these tasks. This work provides insights on the design of neural networks for learning task-specific word embeddings in other natural language processing tasks.

Proceedings Article
01 Dec 2016
TL;DR: SenticNet 4 overcomes limitations by leveraging on conceptual primitives automatically generated by means of hierarchical clustering and dimensionality reduction.
Abstract: An important difference between traditional AI systems and human intelligence is the human ability to harness commonsense knowledge gleaned from a lifetime of learning and experience to make informed decisions. This allows humans to adapt easily to novel situations where AI fails catastrophically due to a lack of situation-specific rules and generalization capabilities. Commonsense knowledge also provides background information that enables humans to successfully operate in social situations where such knowledge is typically assumed. Since commonsense consists of information that humans take for granted, gathering it is an extremely difficult task. Previous versions of SenticNet were focused on collecting this kind of knowledge for sentiment analysis but they were heavily limited by their inability to generalize. SenticNet 4 overcomes such limitations by leveraging on conceptual primitives automatically generated by means of hierarchical clustering and dimensionality reduction.