Author

Kevan Buckley

Bio: Kevan Buckley is an academic researcher from the University of Wolverhampton. The author has contributed to research in topics including sentiment analysis and reputation systems. The author has an h-index of 14 and has co-authored 48 publications receiving 3811 citations. Previous affiliations of Kevan Buckley include University of Wales & Information Technology University.

Papers
Journal IssueDOI
TL;DR: SentiStrength, as discussed by the authors, is able to predict positive emotion with 60.6% accuracy and negative emotion with 72.8% accuracy, both based upon strength scales of 1-5.
Abstract: A huge number of informal messages are posted every day in social network sites, blogs, and discussion forums. Emotions seem to be frequently important in these texts for expressing friendship, showing social support or as part of online arguments. Algorithms to identify sentiment and sentiment strength are needed to help understand the role of emotion in this informal communication and also to identify inappropriate or anomalous affective utterances, potentially associated with threatening behavior to the self or others. Nevertheless, existing sentiment detection algorithms tend to be commercially oriented, designed to identify opinions about products rather than user behaviors. This article partly fills this gap with a new algorithm, SentiStrength, to extract sentiment strength from informal English text, using new methods to exploit the de facto grammars and spelling styles of cyberspace. Applied to MySpace comments and with a lookup table of term sentiment strengths optimized by machine learning, SentiStrength is able to predict positive emotion with 60.6% accuracy and negative emotion with 72.8% accuracy, both based upon strength scales of 1–5. The former, but not the latter, is better than baseline and a wide range of general machine learning approaches. © 2010 Wiley Periodicals, Inc.

1,371 citations
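
The lookup-table idea in the abstract above can be illustrated with a minimal Python sketch. It is not the SentiStrength implementation: the lexicon entries, their weights, the `score` function, and the rule that a run of three or more repeated letters adds one point of strength are all assumptions made for illustration. Only the dual 1-5 positive and negative scales and the use of a term-strength lookup table come from the abstract.

```python
import re

# Hypothetical toy lookup table (term -> strength); positive terms > 0, negative terms < 0.
# These weights are invented for illustration, not taken from SentiStrength.
TOY_LEXICON = {"love": 3, "great": 3, "nice": 2, "hate": -4, "awful": -4, "sad": -3}


def collapse_repeats(word):
    """Collapse runs of 3+ identical letters to one letter.

    Returns the collapsed word and whether anything was collapsed.
    This is a crude stand-in for handling informal spelling ("looove");
    it would mangle words such as "coooool".
    """
    collapsed = re.sub(r"(.)\1{2,}", r"\1", word)
    return collapsed, collapsed != word


def score(text):
    """Return (positive, negative) strengths, each on a 1-5 scale (1 = no sentiment)."""
    positive, negative = 1, 1
    for raw in re.findall(r"[a-z]+", text.lower()):
        word, emphasized = collapse_repeats(raw)
        strength = TOY_LEXICON.get(word, 0)
        if strength == 0:
            continue
        if emphasized:
            # Assumed rule: repeated letters boost the strength by one point.
            strength += 1 if strength > 0 else -1
        if strength > 0:
            positive = max(positive, min(strength, 5))
        else:
            negative = max(negative, min(-strength, 5))
    return positive, negative


print(score("I looove you but hate mondays"))
```

Reporting the strongest positive and strongest negative term separately, rather than a single net score, lets a text register as both positive and negative at once.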

Journal ArticleDOI
TL;DR: An improved version of the algorithm SentiStrength for sentiment strength detection across the social web that primarily uses direct indications of sentiment is assessed, suggesting that, even unsupervised, SentiStrength is robust enough to be applied to a wide variety of different social web contexts.
Abstract: Sentiment analysis is concerned with the automatic extraction of sentiment-related information from text. Although most sentiment analysis addresses commercial tasks, such as extracting opinions from product reviews, there is increasing interest in the affective dimension of the social web, and Twitter in particular. Most sentiment analysis algorithms are not ideally suited to this task because they exploit indirect indicators of sentiment that can reflect genre or topic instead. Hence, such algorithms used to process social web texts can identify spurious sentiment patterns caused by topics rather than affective phenomena. This article assesses an improved version of the algorithm SentiStrength for sentiment strength detection across the social web that primarily uses direct indications of sentiment. The results from six diverse social web data sets (MySpace, Twitter, YouTube, Digg, RunnersWorld, BBCForums) indicate that SentiStrength 2 is successful in the sense of performing better than a baseline approach for all data sets in both supervised and unsupervised cases. SentiStrength is not always better than machine-learning approaches that exploit indirect indicators of sentiment, however, and is particularly weaker for positive sentiment in news-related discussions. Overall, the results suggest that, even unsupervised, SentiStrength is robust enough to be applied to a wide variety of different social web contexts.

1,008 citations

Journal ArticleDOI
TL;DR: A study of a month of English Twitter posts is reported, assessing whether popular events are typically associated with increases in sentiment strength, as seems intuitively likely, using the top 30 events as determined by a measure of relative increase in (general) term usage.
Abstract: The microblogging site Twitter generates a constant stream of communication, some of which concerns events of general interest. An analysis of Twitter may, therefore, give insights into why particular events resonate with the population. This article reports a study of a month of English Twitter posts, assessing whether popular events are typically associated with increases in sentiment strength, as seems intuitively likely. Using the top 30 events, determined by a measure of relative increase in (general) term usage, the results give strong evidence that popular events are normally associated with increases in negative sentiment strength and some evidence that peaks of interest in events have stronger positive sentiment than the time before the peak. It seems that many positive events, such as the Oscars, are capable of generating increased negative sentiment in reaction to them. Nevertheless, the surprisingly small average change in sentiment associated with popular events (typically 1% and only 6% for Tiger Woods' confessions) is consistent with events affording posters opportunities to satisfy pre-existing personal goals more often than eliciting instinctive reactions. © 2011 Wiley Periodicals, Inc.

783 citations

Journal ArticleDOI
27 Jul 2011-PLOS ONE
TL;DR: The results prove that collective emotional states can be created and modulated via Internet communication and that emotional expressiveness is the fuel that sustains some e-communities.
Abstract: Background. E-communities, social groups interacting online, have recently become an object of interdisciplinary research. As with face-to-face meetings, Internet exchanges may not only include factual information but also emotional information – how participants feel about the subject discussed or other group members. Emotions in turn are known to be important in affecting interaction partners in offline communication in many ways. Could emotions in Internet exchanges affect others and systematically influence quantitative and qualitative aspects of the trajectory of e-communities? The development of automatic sentiment analysis has made large scale emotion detection and analysis possible using text messages collected from the web. However, it is not clear if emotions in e-communities primarily derive from individual group members' personalities or if they result from intra-group interactions, and whether they influence group activities.
Methodology/Principal Findings. Here, for the first time, we show the collective character of affective phenomena on a large scale as observed in four million posts downloaded from Blogs, Digg and BBC forums. To test whether the emotions of a community member may influence the emotions of others, posts were grouped into clusters of messages with similar emotional valences. The frequency of long clusters was much higher than it would be if emotions occurred at random. Distributions for cluster lengths can be explained by preferential processes because conditional probabilities for consecutive messages grow as a power law with cluster length. For BBC forum threads, average discussion lengths were higher for larger values of absolute average emotional valence in the first ten comments and the average amount of emotion in messages fell during discussions.
Conclusions/Significance. Overall, our results prove that collective emotional states can be created and modulated via Internet communication and that emotional expressiveness is the fuel that sustains some e-communities.

203 citations
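
The clustering step described in the abstract above (grouping consecutive posts with the same emotional valence and asking how likely a cluster is to continue) can be sketched in Python as follows. The function names and the estimator, the share of clusters of length at least n that go on to length n + 1, are assumptions for illustration. The abstract does not give the authors' exact procedure, and the sketch ignores the fact that the last cluster in a thread is cut off by the thread ending.

```python
from itertools import groupby


def cluster_lengths(valences):
    """Group consecutive equal valences; return a list of (valence, run_length) pairs."""
    return [(valence, len(list(run))) for valence, run in groupby(valences)]


def continuation_probability(valences, valence):
    """Estimate P(next post has `valence` | the current cluster already has n such posts).

    Returns {n: probability} for every cluster length n observed for `valence`.
    """
    lengths = [n for v, n in cluster_lengths(valences) if v == valence]
    probabilities = {}
    for n in range(1, max(lengths, default=0) + 1):
        reached = sum(1 for length in lengths if length >= n)   # clusters that got to n posts
        continued = sum(1 for length in lengths if length > n)  # ...and then grew further
        probabilities[n] = continued / reached
    return probabilities


# Toy thread: -1 = negative, 0 = neutral, 1 = positive (labels are illustrative).
thread = [-1, -1, -1, 1, -1, -1, 0, 1, 1]
print(cluster_lengths(thread))
print(continuation_probability(thread, -1))
```

If valences were independent from post to post, this probability would not depend on n, so a value that rises with n is the kind of signal the abstract reports.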

Journal ArticleDOI
TL;DR: An empirical study of user activity in online BBC discussion forums, measured by the number of posts written by individual debaters and the average sentiment of these posts, shows that most posts contain negative emotions and the most active users in individual threads express predominantly negative sentiments.
Abstract: We present an empirical study of user activity in online BBC discussion forums, measured by the number of posts written by individual debaters and the average sentiment of these posts. Nearly 2.5 million posts from over 18 thousand users were investigated. Scale-free distributions were observed for activity in individual discussion threads as well as for overall activity. The number of unique users in a thread normalized by the thread length decays with thread length, suggesting that thread life is sustained by mutual discussions rather than by independent comments. Automatic sentiment analysis shows that most posts contain negative emotions and the most active users in individual threads express predominantly negative sentiments. It follows that the average emotion of longer threads is more negative and that threads can be sustained by negative comments. An agent-based computer simulation model has been used to reproduce several essential characteristics of the analyzed system. The model stresses the role of discussions between users, especially emotionally laden quarrels between supporters of opposite opinions, and represents many observed statistics of the forum.

162 citations
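
The thread statistic mentioned in the abstract above, the number of unique users in a thread normalized by thread length, can be computed as in the following Python sketch. The input format (one list of author identifiers per thread) and the function name are assumptions for illustration.

```python
from collections import defaultdict


def normalized_unique_users(threads):
    """Return {thread_length: mean(unique_users / thread_length)}.

    `threads` is a list of threads, each a list of author ids in posting order.
    """
    ratios = defaultdict(list)
    for authors in threads:
        if authors:  # skip empty threads to avoid dividing by zero
            ratios[len(authors)].append(len(set(authors)) / len(authors))
    return {length: sum(values) / len(values) for length, values in sorted(ratios.items())}


# Toy data: author ids are invented.
threads = [["a", "b"], ["a", "a"], ["a", "b", "a", "b"]]
print(normalized_unique_users(threads))
```

A ratio that falls as threads get longer means the extra posts come from users who are already in the thread, which is what the abstract reads as mutual discussion rather than independent comments.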


Cited by
01 Jan 1990
TL;DR: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented in this article.
Abstract: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented in this article.

2,933 citations

Journal ArticleDOI
TL;DR: The Semantic Orientation CALculator (SO-CAL) uses dictionaries of words annotated with their semantic orientation (polarity and strength), and incorporates intensification and negation, and is applied to the polarity classification task.
Abstract: We present a lexicon-based approach to extracting sentiment from text. The Semantic Orientation CALculator (SO-CAL) uses dictionaries of words annotated with their semantic orientation (polarity and strength), and incorporates intensification and negation. SO-CAL is applied to the polarity classification task, the process of assigning a positive or negative label to a text that captures the text's opinion towards its main subject matter. We show that SO-CAL's performance is consistent across domains and in completely unseen data. Additionally, we describe the process of dictionary creation, and our use of Mechanical Turk to check dictionaries for consistency and reliability.

2,798 citations
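
The lexicon-based approach described in the abstract above (words annotated with polarity and strength, adjusted by intensification and negation) can be illustrated with a minimal Python sketch. It is not SO-CAL: the dictionary entries, the intensifier percentages, and the choice to treat negation as a simple sign flip on the next sentiment word are all assumptions made for illustration.

```python
# Hypothetical dictionaries; all values are invented for this sketch.
ORIENTATION = {"good": 3.0, "excellent": 5.0, "bad": -3.0, "terrible": -5.0}
INTENSIFIERS = {"very": 0.25, "extremely": 0.5, "slightly": -0.5}  # fractional change
NEGATORS = {"not", "never", "no"}


def text_orientation(text):
    """Sum the adjusted orientation of every sentiment word in `text`."""
    total = 0.0
    modifier = 1.0   # accumulated intensification for the next sentiment word
    negated = False  # whether a negator directly precedes the current phrase
    for token in text.lower().split():
        token = token.strip(".,!?")
        if token in NEGATORS:
            negated = True
        elif token in INTENSIFIERS:
            modifier *= 1.0 + INTENSIFIERS[token]
        elif token in ORIENTATION:
            value = ORIENTATION[token] * modifier
            total += -value if negated else value  # assumed: negation flips the sign
            modifier, negated = 1.0, False
        else:
            # Any other word ends the current modifier/negation scope.
            modifier, negated = 1.0, False
    return total


def polarity(text):
    """Map the summed orientation to a positive / negative / neutral label."""
    total = text_orientation(text)
    return "positive" if total > 0 else "negative" if total < 0 else "neutral"


print(text_orientation("not very good"), polarity("not very good"))
print(polarity("The food was extremely good"))
```

Because any non-sentiment word resets the scope, a phrase such as "not a good idea" is not treated as negated here. A fuller system would need a more careful notion of negation scope.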

01 Nov 2008

2,686 citations

01 Jan 1995
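TL;DR: A bibliometric analysis of the journal Scientometrics for 1979-1991, covering its research papers, sections, authors, and editorial board, identifies the research focus, centers of activity, and development trends of scientometrics.
Abstract: This paper presents a quantitative analysis of the research papers, sections, authors and their countries, and editorial board members and their countries, of the international scientometrics journal Scientometrics for 1979-1991. It reveals the research priorities, centers of activity, and development trends of scientometrics, and illustrates the role that leading scholars play in developing this emerging discipline.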

1,636 citations

Proceedings ArticleDOI
01 Jun 2014
TL;DR: Three neural networks are developed to effectively incorporate the supervision from sentiment polarity of text (e.g. sentences or tweets) in their loss functions, and performance is further improved by concatenating SSWE with an existing feature set.
Abstract: We present a method that learns word embeddings for Twitter sentiment classification in this paper. Most existing algorithms for learning continuous word representations typically only model the syntactic context of words but ignore the sentiment of text. This is problematic for sentiment analysis as they usually map words with similar syntactic context but opposite sentiment polarity, such as good and bad, to neighboring word vectors. We address this issue by learning sentiment-specific word embedding (SSWE), which encodes sentiment information in the continuous representation of words. Specifically, we develop three neural networks to effectively incorporate the supervision from sentiment polarity of text (e.g. sentences or tweets) in their loss functions. To obtain large scale training corpora, we learn the sentiment-specific word embedding from massive distant-supervised tweets collected by positive and negative emoticons. Experiments on applying SSWE to a benchmark Twitter sentiment classification dataset in SemEval 2013 show that (1) the SSWE feature performs comparably with hand-crafted features in the top-performing system; (2) the performance is further improved by concatenating SSWE with an existing feature set.

1,157 citations