
Showing papers on "Microblogging" published in 2015


Proceedings ArticleDOI
17 Oct 2015
TL;DR: A novel approach to capture the temporal characteristics of features related to microblog contents, users and propagation patterns based on the time series of a rumor's lifecycle, for which a time series modeling technique is applied to incorporate various social context information.
Abstract: Automatically identifying rumors from online social media, especially microblogging websites, is an important research issue. Most existing work on rumor detection focuses on modeling features related to microblog contents, users and propagation patterns, but ignores the importance of the variation of these social context features during the message propagation over time. In this study, we propose a novel approach to capture the temporal characteristics of these features based on the time series of a rumor's lifecycle, for which a time series modeling technique is applied to incorporate various social context information. Our experiments using the events in two microblog datasets confirm that the method outperforms state-of-the-art rumor detection approaches by large margins. Moreover, our model demonstrates strong performance on detecting rumors at an early stage after their initial broadcast.

514 citations
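
The interval-based idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `interval_features`, the equal-width binning, and the choice of a per-interval mean plus the change between consecutive intervals are all assumptions made for illustration.

```python
from typing import List, Tuple


def interval_features(posts: List[Tuple[float, float]], n_intervals: int) -> List[float]:
    """Summarize one event as a time series of a single feature.

    posts: (timestamp, feature_value) pairs for the event's messages.
    Returns the per-interval means followed by the differences between
    consecutive interval means (a crude measure of variation over time).
    """
    if not posts or n_intervals < 1:
        raise ValueError("need at least one post and one interval")

    start = min(t for t, _ in posts)
    end = max(t for t, _ in posts)
    # Fall back to width 1.0 when all posts share one timestamp.
    width = (end - start) / n_intervals or 1.0

    sums = [0.0] * n_intervals
    counts = [0] * n_intervals
    for t, value in posts:
        # The last post falls exactly on the right edge, so clamp the index.
        idx = min(int((t - start) / width), n_intervals - 1)
        sums[idx] += value
        counts[idx] += 1

    # Empty intervals get 0.0 (an assumption; other fill rules are possible).
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    deltas = [b - a for a, b in zip(means, means[1:])]
    return means + deltas
```

A vector like this could then be fed to any standard classifier; which features and which classifier the paper actually uses is not stated in the abstract.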


Proceedings ArticleDOI
13 Apr 2015
TL;DR: A graph-kernel based hybrid SVM classifier which captures the high-order propagation patterns in addition to semantic features such as topics and sentiments and is 88% confident in detecting an average false rumor just 24 hours after the initial broadcast.
Abstract: This paper studies the problem of automatic detection of false rumors on Sina Weibo, the popular Chinese microblogging social network. Traditional feature-based approaches extract features from the false rumor message, its author, as well as the statistics of its responses to form a flat feature vector. This ignores the propagation structure of the messages and has not achieved very good results. We propose a graph-kernel based hybrid SVM classifier which captures the high-order propagation patterns in addition to semantic features such as topics and sentiments. The new model achieves a classification accuracy of 91.3% on a randomly selected Weibo dataset, significantly higher than state-of-the-art approaches. Moreover, our approach can be applied at the early stage of rumor propagation and is 88% confident in detecting an average false rumor just 24 hours after the initial broadcast.

507 citations


Journal ArticleDOI
TL;DR: This work constructs several forms of social network for users communicating about climate change on the popular microblogging platform Twitter and identifies a number of general patterns in user behaviours relating to engagement with alternative views.
Abstract: Action to tackle the complex and divisive issue of climate change will be strongly influenced by public perception. Online social media and associated social networks are an increasingly important forum for public debate and are known to influence individual attitudes and behaviours – yet online discussions and social networks related to climate change are not well understood. Here we construct several forms of social network for users communicating about climate change on the popular microblogging platform Twitter. We classify user attitudes to climate change based on message content and find that social networks are characterised by strong attitude-based homophily and segregation into polarised “sceptic” and “activist” groups. Most users interact only with like-minded others, in communities dominated by a single view. However, we also find mixed-attitude communities in which sceptics and activists frequently interact. Messages between like-minded users typically carry positive sentiment, while messages between sceptics and activists carry negative sentiment. We identify a number of general patterns in user behaviours relating to engagement with alternative views. Users who express negative sentiment are themselves the target of negativity. Users in mixed-attitude communities are less likely to hold a strongly polarised view, but more likely to express negative sentiment towards other users with differing views. Overall, social media discussions of climate change often occur within polarising “echo chambers”, but also within “open forums”, mixed-attitude communities that reduce polarisation and stimulate debate. Our results have implications for public engagement with this important global challenge.

378 citations


Proceedings ArticleDOI
Xiaomo Liu1, Armineh Nourbakhsh1, Quanzhi Li1, Rui Fang1, Sameena Shah1 
17 Oct 2015
TL;DR: This paper shows using real streaming data that it is possible, using their approach, to debunk rumors accurately and efficiently, often much faster than manual verification by professionals.
Abstract: In this paper, we propose the first real time rumor debunking algorithm for Twitter. We use cues from 'wisdom of the crowds', that is, the aggregate 'common sense' and investigative journalism of Twitter users. We concentrate on identification of a rumor as an event that may comprise one or more conflicting microblogs. We continue monitoring the rumor event and generate real time updates dynamically based on any additional information received. We show using real streaming data that it is possible, using our approach, to debunk rumors accurately and efficiently, often much faster than manual verification by professionals.

319 citations


Journal ArticleDOI
TL;DR: The authors analyzed the structure and content of the political conversations that took place through the microblogging platform Twitter in the context of the 2011 Spanish legislative elections and the 2012 U.S. presidential elections and found that Twitter replicates most of the existing inequalities in public political exchanges.
Abstract: In this article, we analyze the structure and content of the political conversations that took place through the microblogging platform Twitter in the context of the 2011 Spanish legislative elections and the 2012 U.S. presidential elections. Using a unique database of nearly 70 million tweets collected during both election campaigns, we find that Twitter replicates most of the existing inequalities in public political exchanges. Twitter users who write about politics tend to be male, to live in urban areas, and to have extreme ideological preferences. Our results have important implications for future research on the relationship between social media and politics, since they highlight the need to correct for potential biases derived from these sources of inequality.

274 citations


Journal ArticleDOI
TL;DR: In this paper, the authors provide an empirical test of the "Twitter effect", which postulates that microblogging word of mouth (MWOM) shared through Twitter and similar services affects early product adoption behaviors by immediately disseminating consumers' post-purchase quality evaluations.
Abstract: This research provides an empirical test of the “Twitter effect,” which postulates that microblogging word of mouth (MWOM) shared through Twitter and similar services affects early product adoption behaviors by immediately disseminating consumers’ post-purchase quality evaluations. This is a potentially crucial factor for the success of experiential media products and other products whose distribution strategy relies on a hyped release. Studying the four million MWOM messages sent via Twitter concerning 105 movies on their respective opening weekends, the authors find support for the Twitter effect and report evidence of a negativity bias. In a follow-up incident study of 600 Twitter users who decided not to see a movie based on negative MWOM, the authors shed additional light on the Twitter effect by investigating how consumers use MWOM information in their decision-making processes and describing MWOM’s defining characteristics. They use these insights to position MWOM in the word-of-mouth landscape, to identify future word-of-mouth research opportunities based on this conceptual positioning, and to develop managerial implications.

259 citations


Journal ArticleDOI
TL;DR: The authors surveyed K-16 educators regarding their use of the microblogging service for professional purposes and found that respondents described multifaceted and intense use, with PD activities more common than use with students and families.
Abstract: Traditional, top-down professional development (PD) can render teachers mere implementers of the ideas of others, but there is some hope that the participatory nature of social media such as Twitter might support more grassroots PD. To better understand Twitter’s role in education, we conducted a survey of K–16 educators regarding their use of the microblogging service for professional purposes. Respondents described multifaceted and intense use, with PD activities more common than use with students and families. This paper delves into qualitative data from 494 respondents who described their perspectives on Twitter PD. Educators praised the platform as efficient, accessible and interactive. Twitter was credited with providing opportunities to access novel ideas and stay abreast of education advances and trends, particularly regarding educational technology. Numerous respondents compared Twitter favorably with other PD available to them. Members of our sample also appreciated how Twitter connected them to...

211 citations


Journal ArticleDOI
TL;DR: The authors argue for a comprehensive definition that extends virality to social networking and microblogging sites by emphasizing users’ behaviors beyond sheer access and viewership, and report two studies that investigate viral behavioral intentions toward pro-social messages shared on Facebook and Twitter.
Abstract: With the growing sophistication of social media, virality of online content has become an indicator of online message effectiveness. We argue for a comprehensive definition that extends virality to social networking and microblogging sites, by emphasizing users’ behaviors beyond sheer access and viewership. Across two studies, we investigate viral behavioral intentions (VBIs) toward pro-social messages shared on Facebook and Twitter. We further explore how motivations and uses of Facebook and Twitter predict VBIs toward messages shared on these websites.

161 citations


Proceedings ArticleDOI
17 Oct 2015
TL;DR: A novel framework which first classifies tweets to extract situational information, and then summarizes the information achieves superior performance compared to state-of-the-art tweet summarization approaches.
Abstract: Microblogging sites like Twitter have become important sources of real-time information during disaster events. A significant amount of valuable situational information is available in these sites; however, this information is immersed among hundreds of thousands of tweets, mostly containing sentiments and opinion of the masses, that are posted during such events. To effectively utilize microblogging sites during disaster events, it is necessary to (i) extract the situational information from among the large amounts of sentiment and opinion, and (ii) summarize the situational information, to help decision-making processes when time is critical. In this paper, we develop a novel framework which first classifies tweets to extract situational information, and then summarizes the information. The proposed framework takes into consideration the typicalities pertaining to disaster events where (i) the same tweet often contains a mixture of situational and non-situational information, and (ii) certain numerical information, such as the number of casualties, varies rapidly with time, and thus achieves superior performance compared to state-of-the-art tweet summarization approaches.

161 citations


Proceedings ArticleDOI
02 Feb 2015
TL;DR: The proposed model explicitly characterizes the process through which a message gains its retweets, by capturing a power-law temporal relaxation function corresponding to the aging in the ability of the message to attract new retweets and an exponential reinforcement mechanism characterizing the "richer-get-richer" phenomenon.
Abstract: Popularity prediction on microblogging platforms aims to predict the future popularity of a message based on its retweeting dynamics in the early stages. Existing works mainly focus on exploring effective features for prediction, while ignoring the underlying arrival process of retweets. Also, the effect of user activity variation on the retweeting dynamics in the early stages has been neglected. In this paper, we propose an extended reinforced Poisson process model with a time mapping process to model the retweeting dynamics and predict the future popularity. The proposed model explicitly characterizes the process through which a message gains its retweets, by capturing a power-law temporal relaxation function corresponding to the aging in the ability of the message to attract new retweets and an exponential reinforcement mechanism characterizing the "richer-get-richer" phenomenon. Further, we introduce the notion of Weibo time and integrate a time mapping process into the proposed model to eliminate the effect of user activity variation. Extensive experiments on two Weibo datasets, with 10K and 18K messages respectively, demonstrate the effectiveness of our proposed model in popularity prediction.

145 citations
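
As a rough illustration of the reinforced Poisson process described in the abstract above, the retweet arrival rate can be sketched as a product of an attractiveness constant, a power-law relaxation term, and a reinforcement term. The abstract does not give the exact functional forms, so everything below is an assumption: the names `relaxation`, `reinforcement` and `retweet_rate`, the `+ 1` time shift, and the saturating geometric-sum form of the reinforcement are hypothetical choices. The time mapping to "Weibo time" is not modeled here.

```python
import math


def relaxation(t: float, gamma: float) -> float:
    """Power-law aging: the message attracts fewer retweets as it gets older.

    The +1 shift is an assumption that avoids a singularity at t = 0.
    """
    return (t + 1.0) ** (-gamma)


def reinforcement(k: int, beta: float) -> float:
    """'Richer-get-richer' term after k earlier retweets (assumed form).

    Each additional retweet adds an exponentially smaller boost, so the
    term grows with k but saturates.
    """
    return sum(math.exp(-beta * j) for j in range(k + 1))


def retweet_rate(t: float, k: int, alpha: float, gamma: float, beta: float) -> float:
    """Instantaneous retweet rate at age t, given k retweets so far."""
    return alpha * relaxation(t, gamma) * reinforcement(k, beta)
```

With these assumed forms, the rate falls as the message ages at a fixed retweet count and rises with the retweet count at a fixed age, which is the qualitative behavior the abstract describes.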


Journal ArticleDOI
TL;DR: This paper investigates machine-learning-based rumor identification schemes by applying five new features based on users' behaviors, and combines the new features with existing user behavior-based features that have proved effective to predict whether a microblog post is a rumor.
Abstract: In recent years, microblog systems such as Twitter and Sina Weibo have averaged millions of active users. On the other hand, the microblog system has become a new platform for spreading rumors. In this paper, we investigate the machine-learning-based rumor identification approaches. We observed that feature design and selection has a stronger impact on the rumor identification accuracy than the selection of machine-learning algorithms. Meanwhile, the rumor publishers' behavior may diverge from normal users', and a rumor post may have different responses from a normal post. However, mass behavior on rumor posts has not been explored adequately. Hence, we investigate rumor identification schemes by applying five new features based on users' behaviors, and combine the new features with existing user behavior-based features that have proved effective, such as followers' comments and reposting, to predict whether a microblog post is a rumor. Experiment results on real-world data from Sina Weibo demonstrate the efficacy and efficiency of our proposed method and features. From the experiments, we conclude that the rumor detection based on mass behaviors is more effective than the detection based on microblogs' inherent features.

Journal ArticleDOI
TL;DR: In this paper, the authors survey a range of techniques applied to infer the location of Twitter users from inception to state of the art, finding significant improvements over time in the granularity levels and better accuracy with results driven by refinements to algorithms and inclusion of more spatial features.
Abstract: The increasing popularity of the social networking service, Twitter, has made it more involved in day-to-day communications, strengthening social relationships and information dissemination. Conversations on Twitter are now being explored as indicators within early warning systems to alert of imminent natural disasters such as earthquakes and aid prompt emergency responses to crime. Producers are privileged to have limitless access to market perception from consumer comments on social media and microblogs. Targeted advertising can be made more effective based on user profile information such as demography, interests and location. While these applications have proven beneficial, the ability to effectively infer the location of Twitter users has even more immense value. However, accurately identifying where a message originated from or an author's location remains a challenge, thus essentially driving research in that regard. In this paper, we survey a range of techniques applied to infer the location of Twitter users from inception to state of the art. We find significant improvements over time in the granularity levels and better accuracy with results driven by refinements to algorithms and inclusion of more spatial features.

Journal ArticleDOI
TL;DR: A model of microblogging use continuance based on theories of continuance, habit and critical mass suggests that continued use intention is strongly determined by perceived usefulness, satisfaction and habit.
Abstract: The most popular microblogging service, Twitter, has established a large user base, in spite of numerous criticisms. This study aims to examine why this is the case. In particular, the study develops a model of microblogging use continuance based on theories of continuance, habit and critical mass. The model is then tested via a Web survey of Twitter users and PLS path modeling. The results suggest that continued use intention is strongly determined by perceived usefulness, satisfaction and habit (R² = 0.454). The paper rounds off with conclusions and implications for future research and practice in this very new area of inquiry.

Journal ArticleDOI
TL;DR: Results reveal that three types of gratifications were obtained from using both microblog and WeChat: content gratification, social gratification and hedonic gratification.
Abstract: Purpose – The purpose of this study is to explore the general and specific gratifications obtained from using microblog and WeChat. Design/methodology/approach – To shed light on the difference of gratifications to use microblog and WeChat, 18 interviews with social media users in China were conducted. Findings – Results reveal that three types of gratifications were obtained from using both microblog and WeChat: content gratification, social gratification and hedonic gratification. Also, the strength and components of each gratification for microblog and WeChat were different. Content gratification plays the most salient role in using microblog, while social gratification is the most important for WeChat usage. In addition, content gratification of microblog usage is related to information seeking and information sharing, while social gratification of WeChat usage is constituted by private social networking and convenient communication. Furthermore, content gratification of WeChat usage refers to high-qu...

Book ChapterDOI
01 Jan 2015
TL;DR: An approach to selection of a new feature set based on Information Gain, Bigram, and Object-oriented extraction methods for sentiment analysis on a social networking site is introduced, and a sentiment analysis model based on Naive Bayes and Support Vector Machine is proposed.
Abstract: Twitter is a microblogging site in which users can post updates (tweets) to friends (followers). It has become an immense dataset of the so-called sentiments. In this paper, we introduce an approach to selection of a new feature set based on Information Gain, Bigram, and Object-oriented extraction methods for sentiment analysis on a social networking site. In addition, we also propose a sentiment analysis model based on Naive Bayes and Support Vector Machine. Its purpose is to analyze sentiment more effectively. This model proved to be highly effective and accurate on the analysis of feelings.

Journal ArticleDOI
TL;DR: This paper proposes a multimedia social event summarization framework to automatically generate visualized summaries from the microblog stream of multiple media types and conducts extensive experiments on two real-world microblog datasets to demonstrate the superiority of the proposed framework as compared to the state-of-the-art approaches.
Abstract: Microblogging services have revolutionized the way people exchange information. Confronted with the ever-increasing numbers of social events and the corresponding microblogs with multimedia contents, it is desirable to provide visualized summaries to help users to quickly grasp the essence of these social events for better understanding. While existing approaches mostly focus only on text-based summary, microblog summarization with multiple media types (e.g., text, image, and video) is scarcely explored. In this paper, we propose a multimedia social event summarization framework to automatically generate visualized summaries from the microblog stream of multiple media types. Specifically, the proposed framework comprises three stages, as follows. 1) A noise removal approach is first devised to eliminate potentially noisy images. An effective spectral filtering model is exploited to estimate the probability that an image is relevant to a given event. 2) A novel cross-media probabilistic model, termed Cross-Media-LDA (CMLDA), is proposed to jointly discover subevents from microblogs of multiple media types. The intrinsic correlations among these different media types are well explored and exploited for reinforcing the cross-media subevent discovery process. 3) Finally, based on the cross-media knowledge of all the discovered subevents, a multimedia microblog summary generation process is designed to jointly identify both representative textual and visual samples, which are further aggregated to form a holistic visualized summary. We conduct extensive experiments on two real-world microblog datasets to demonstrate the superiority of the proposed framework as compared to the state-of-the-art approaches.

Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper investigated the value of Chinese social media for monitoring air quality trends and related public perceptions and response, and found that the volume of pollution-related messages is highly correlated with particle pollution levels, with Pearson correlation values up to .718 (n=74, P <.001).
Abstract: Background: Recent studies have demonstrated the utility of social media data sources for a wide range of public health goals, including disease surveillance, mental health trends, and health perceptions and sentiment. Most such research has focused on English-language social media for the task of disease surveillance. Objective: We investigated the value of Chinese social media for monitoring air quality trends and related public perceptions and response. The goal was to determine if this data is suitable for learning actionable information about pollution levels and public response. Methods: We mined a collection of 93 million messages from Sina Weibo, China’s largest microblogging service. We experimented with different filters to identify messages relevant to air quality, based on keyword matching and topic modeling. We evaluated the reliability of the data filters by comparing message volume per city to air particle pollution rates obtained from the Chinese government for 74 cities. Additionally, we performed a qualitative study of the content of pollution-related messages by coding a sample of 170 messages for relevance to air quality, and whether the message included details such as a reactive behavior or a health concern. Results: The volume of pollution-related messages is highly correlated with particle pollution levels, with Pearson correlation values up to .718 (n=74, P <.001). Our qualitative results found that 67.1% (114/170) of messages were relevant to air quality and of those, 78.9% (90/114) were a firsthand report. Of firsthand reports, 28% (32/90) indicated a reactive behavior and 19% (17/90) expressed a health concern. Additionally, 3 messages of 170 requested that action be taken to improve quality. 
Conclusions: We have found quantitatively that message volume in Sina Weibo is indicative of true particle pollution levels, and we have found qualitatively that messages contain rich details including perceptions, behaviors, and self-reported health effects. Social media data can augment existing air pollution surveillance data, especially perception and health-related data that traditionally requires expensive surveys or interviews. [J Med Internet Res 2015;17(3):e22]
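
The headline statistic in this abstract is a Pearson correlation between per-city message volume and particle pollution levels. A minimal sketch of that calculation follows; the function name `pearson_r` is illustrative, and the study's actual data and tooling are not described in the abstract.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of co-deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")

    return cov / math.sqrt(var_x * var_y)
```

In the study's setting, `xs` would hold one message-volume value per city and `ys` the matching pollution measurement, giving 74 pairs.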

Journal ArticleDOI
TL;DR: A new approach to detect burst novel events in microblogging stream by utilizing multiple types of information, i.e., term frequency, and user social relation, and the popularity of detected event is predicted through a proposed diffusion model which takes both the content and user information of the event into account.

15 Nov 2015
TL;DR: In this article, the authors address two common issues within the context of microblog social media: first, they detect rumors as a type of misinformation propagation and next they go beyond detection to perform the task of rumor classification.
Abstract: With the pervasiveness of online media data as a source of information, verifying the validity of this information is becoming even more important yet quite challenging. Rumors spread a large quantity of misinformation on microblogs. In this study, we address two common issues within the context of microblog social media. First, we detect rumors as a type of misinformation propagation, and next we go beyond detection to perform the task of rumor classification. We explore the problem using a standard data set. We devise novel features and study their impact on the task. We experiment with various levels of preprocessing as a precursor of the classification as well as grouping of features. We achieve an f-measure of over 0.82 in the RDC task on a mixed rumors data set and 84 percent on a single rumor data set using a two-step classification approach.

Journal ArticleDOI
TL;DR: This article utilizes users' microblogs to extract their emotions at different granularity levels and during different time windows and proves that considering user emotional context can indeed improve recommendation performance in terms of hit rate, precision, recall, and F1 score.
Abstract: Utilize microblogs to extract users' emotions. Correlate users, music and the users' emotion. Develop an emotion-aware method to perform music recommendation. Context-aware recommendation has become increasingly important and popular in recent years, as users are immersed in enormous music contents and have difficulty making their choices. User emotion, as one of the most important contexts, has the potential to improve music recommendation, but has not yet been fully explored due to the great difficulty of emotion acquisition. This article utilizes users' microblogs to extract their emotions at different granularity levels and during different time windows. The approach then correlates three elements: user, music and the user's emotion when he/she is listening to the music piece. Based on the associations extracted from a data set crawled from a Chinese Twitter service, we develop several emotion-aware methods to perform music recommendation. We conduct a series of experiments and show that considering user emotional context can indeed improve recommendation performance in terms of hit rate, precision, recall, and F1 score.

Journal ArticleDOI
TL;DR: A competency-enhancing social networking application, 'pedagogical tweeting', is presented as a solution for the dilemma of non-participating (non-engaged) students in class.

Proceedings ArticleDOI
18 May 2015
TL;DR: A probabilistic model using a Self-Excited Hawkes Process (SEHP) to characterize the process through which individual microblogs gain their popularity, which demonstrates that the SEHP model consistently outperforms the model based on reinforced Poisson process.
Abstract: The ability to model and predict the popularity dynamics of individual user generated items on online media has important implications in a wide range of areas. In this paper, we propose a probabilistic model using a Self-Excited Hawkes Process (SEHP) to characterize the process through which individual microblogs gain their popularity. This model explicitly captures the triggering effect of each forwarding, distinguishing itself from the reinforced Poisson process based model where all previous forwardings are simply aggregated as a single triggering effect. We validate the proposed model by applying it on Sina Weibo, the most popular microblogging network in China. Experimental results demonstrate that the SEHP model consistently outperforms the model based on reinforced Poisson process.
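
The triggering effect described in the abstract above can be made concrete with the conditional intensity of a self-exciting (Hawkes) process, in which every earlier forwarding adds its own decaying contribution to the current rate. The sketch below is an illustration only: the exponential kernel and the parameter names `mu`, `alpha` and `beta` are assumptions, since the abstract does not specify the kernel the authors use.

```python
import math
from typing import Iterable


def hawkes_intensity(t: float, event_times: Iterable[float],
                     mu: float, alpha: float, beta: float) -> float:
    """Conditional intensity at time t of a Hawkes process.

    mu    : background rate of forwardings
    alpha : jump in intensity caused by each earlier forwarding
    beta  : exponential decay rate of that jump (assumed kernel)
    Only forwardings strictly before t contribute.
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - t_i))
        for t_i in event_times
        if t_i < t
    )
    return mu + excitation
```

Because each forwarding keeps its own term in the sum, recent forwardings weigh more than old ones. This is the contrast the abstract draws with the reinforced Poisson process, where previous forwardings are aggregated into a single triggering effect.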

Proceedings Article
25 Jan 2015
TL;DR: A general unsupervised framework to explore events from tweets, consisting of a pipeline of filtering, extraction and categorization, which achieves a precision of 70.49% in event extraction, outperforming a competitive baseline by nearly 6%.
Abstract: Twitter, as a popular microblogging service, has become a new information channel for users to receive and exchange the most up-to-date information on current events. However, since there is no control on how users can publish messages on Twitter, finding newsworthy events from Twitter becomes a difficult task like "finding a needle in a haystack". In this paper we propose a general unsupervised framework to explore events from tweets, which consists of a pipeline process of filtering, extraction and categorization. To filter out noisy tweets, the filtering step exploits a lexicon-based approach to separate tweets that are event-related from those that are not. Then, based on these event-related tweets, the structured representations of events are extracted and categorized automatically using an unsupervised Bayesian model without the use of any labelled data. Moreover, the categorized events are assigned event type labels without human intervention. The proposed framework has been evaluated on over 60 million tweets which were collected for one month in December 2010. A precision of 70.49% is achieved in event extraction, outperforming a competitive baseline by nearly 6%. Events are also clustered into coherent groups with the automatically assigned event type label.

Journal ArticleDOI
TL;DR: An integrated research model is developed based on TAM and motivational models to explore the motives that lead users to continued Twitter usage, and the analysis reveals the salient role of intrinsic motivation and perceived ease of use in continued Twitter usage.
Abstract: This research investigates use continuance in the most popular microblogging service, Twitter. Based on TAM and motivational models, we develop an integrated research model to explore the motives that lead users to continued Twitter usage. Structural equation modelling analysis of survey data from 385 Twitter users revealed the salient role of intrinsic motivation and perceived ease of use in continued Twitter usage. Our findings have important implications for theory and practice in this new area of inquiry.

Journal ArticleDOI
TL;DR: Wang et al. as discussed by the authors examined a vast amount of tourism-related Chinese social media posts using a visual analytic approach to examine the impact of travel news on tourists' attitudes towards travel policy changes.
Abstract: This research note seeks to examine a vast amount of tourism-related Chinese social media posts using a visual analytic approach. Visual analytics turns information overload into an opportunity. In this case, the mainstream Chinese microblog service, Sina Weibo, was selected as it generates large volumes of data, representing significant consumer insights, that are challenging to analyse by other common research methods. The most frequently reposted tourist visa news in the first eight months of 2014 were harvested and used as a case study. Findings from this study demonstrate that a visual analytic approach can offer insights into the impact of travel news on Chinese consumers. These insights include potential tourist generating regions, the life span of travel news, and tourists’ attitudes towards travel policy changes. Such insights provide important implications for scholars and practitioners, such as enabling real-time decisions of Destination Management Organizations’ social media marketing strategi...

Journal Article
TL;DR: In this paper, the authors investigated the use of Twitter by preservice teachers in a face-to-face undergraduate teacher education course taught by the author and found that the majority of participants maintained a positive opinion of Twitter's educational potential and indicated intentions to utilize it for professional purposes, including classroom applications, in the future.
Abstract: Twitter has demonstrated potential to facilitate learning at the university level, and K-12 educators' use of the microblogging service Twitter to facilitate professional development appears to be on the rise. Research on microblogging as a part of teacher education is, however, limited. This paper investigates the use of Twitter by preservice teachers (N = 20) in a face-to-face undergraduate teacher education course taught by the author. The participants completed student teaching the subsequent semester, after which a survey was conducted to explore whether they had continued to use Twitter for professional purposes and why or why not. In reflections upon the fall semester's experience, preservice teachers noted several benefits to the use of Twitter in the course, including support of resource sharing, communication, and connection with educators both inside and outside of the class. During the spring semester, the majority of participants stopped professional Twitter activity, with many citing a lack of time. Those who continued use in the spring most commonly did so to gather teaching resources. The majority of participants maintained a positive opinion of Twitter's educational potential and indicated intentions to utilize it for professional purposes, including classroom applications, in the future.

Posted Content
TL;DR: In this article, a Self-Excited Hawkes Process (SEHP) model is proposed to characterize how individual microblogs gain popularity; it explicitly captures the triggering effect of each forwarding, in contrast to the reinforced Poisson process based model, where all previous forwardings are simply aggregated as a single triggering effect.
Abstract: The ability to model and predict the popularity dynamics of individual user-generated items on online media has important implications in a wide range of areas. In this paper, we propose a probabilistic model using a Self-Excited Hawkes Process (SEHP) to characterize the process through which individual microblogs gain their popularity. This model explicitly captures the triggering effect of each forwarding, distinguishing itself from the reinforced Poisson process based model where all previous forwardings are simply aggregated as a single triggering effect. We validate the proposed model by applying it to Sina Weibo, the most popular microblogging network in China. Experimental results demonstrate that the SEHP model consistently outperforms the model based on reinforced Poisson process.
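
To make the idea of a per-forwarding triggering effect concrete, here is a minimal sketch of a self-exciting (Hawkes) intensity function. It is an illustration only, not the paper's model: the exponential decay kernel, the function name `hawkes_intensity`, and the parameters `mu`, `alpha` and `beta` are assumptions chosen for this example, since the abstract does not specify the kernel or its parameterization.

```python
import math

def hawkes_intensity(t, event_times, mu=0.1, alpha=0.5, beta=1.0):
    """Rate of new forwardings at time t under a textbook Hawkes process.

    Hypothetical parameters (not taken from the paper):
      mu    - baseline rate with no prior forwardings
      alpha - jump in the rate caused by a single forwarding
      beta  - exponential decay speed of that jump
    """
    # Each earlier forwarding contributes its own decaying term, so recent
    # forwardings excite the process more than old ones.
    excitation = sum(
        alpha * math.exp(-beta * (t - s)) for s in event_times if s < t
    )
    return mu + excitation

if __name__ == "__main__":
    forwardings = [0.0, 0.5, 0.6]  # made-up forwarding timestamps
    for t in (0.7, 2.0, 5.0):
        print(t, round(hawkes_intensity(t, forwardings), 4))
```

Because every forwarding has its own term in the sum, a burst of recent forwardings raises the rate more than the same number of old ones, which is the distinction the abstract draws against aggregating all previous forwardings into one effect.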

Journal ArticleDOI
TL;DR: The main results show that URLs or use of informal communication increases chances of message forwarding and contextual factors such as user characteristics impact diffusion probability, and recommendations about how to reach a larger number of citizens through social media communications are presented.
Abstract: Social media are becoming increasingly important for communication between government organisations and citizens. Although research on this issue is expanding, the structure of these new communication patterns is still poorly understood. This study contributes to our understanding of these new communication patterns by developing an explanatory model of message diffusion on social media. Messages from 964 Dutch police force Twitter accounts are analysed using trace data drawn from the Twitter™ API to explain why certain police tweets are forwarded and others are not. Based on an iterative human calibration procedure, message topics were automatically coded based on customised lexicons. A principal component analysis of message characteristics generated four distinct patterns of use in (in)personal communication and new versus reproduced content. Message characteristics were combined with user characteristics in a multilevel logistic general linear model. Our main results show that URLs or use of informal communication increases chances of message forwarding. In addition, contextual factors such as user characteristics impact diffusion probability. Recommendations are discussed for further research into authorship styles and their implications for social media message diffusion. For the police and other government practitioners, a list of recommendations about how to reach a larger number of citizens through social media communications is presented.

Journal Article
TL;DR: An open source approach is presented in which Twitter microblog data are collected, pre-processed, analyzed and visualized using open source tools to perform text mining and sentiment analysis of user-contributed online reviews about two giant UK retail stores, Tesco and Asda, over the Christmas 2014 period.
Abstract: Social media has arisen not only as a personal communication medium, but also as a medium for communicating opinions about products and services, or even political and general events, among its users. Due to its wide reach and popularity, a massive amount of user reviews and opinions is produced and shared daily. Twitter is one of the most widely used social media microblogging sites. Mining user opinions from social media data is not a straightforward task; it can be accomplished in different ways. In this work, an open source approach is presented in which Twitter microblog data are collected, pre-processed, analyzed and visualized using open source tools to perform text mining and sentiment analysis of user-contributed online reviews about two giant retail stores in the UK, namely Tesco and Asda, over the Christmas 2014 period. Collecting customer opinions using conventional methods such as surveys can be an expensive and time-consuming task. Sentiment analysis of customer opinions makes it easier for businesses to understand their competitive value in a changing market and to understand their customers' views about their products and services, which also provides insight into future marketing strategies and decision-making policies.

Journal ArticleDOI
TL;DR: A comparative analysis on samples obtained from two of Twitter’s streaming APIs with a more complete Twitter dataset is performed to gain an in-depth understanding of the nature of Twitter data samples and their potential for use in various data mining tasks.
Abstract: Researchers have begun studying content obtained from microblogging services such as Twitter to address a variety of technological, social, and commercial research questions. The large number of Twitter users and even larger volume of tweets often make it impractical to collect and maintain a complete record of activity; therefore, most research and some commercial software applications rely on samples, often relatively small samples, of Twitter data. For the most part, sample sizes have been based on availability and practical considerations. Relatively little attention has been paid to how well these samples represent the underlying stream of Twitter data. To fill this gap, this article performs a comparative analysis on samples obtained from two of Twitter’s streaming APIs with a more complete Twitter dataset to gain an in-depth understanding of the nature of Twitter data samples and their potential for use in various data mining tasks.
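
As a rough illustration of what comparing a sample against a more complete dataset can involve, the sketch below measures how far a sample's hashtag distribution is from that of a fuller collection. The choice of hashtags as the feature, the total variation distance as the metric, and the helper names `relative_freq` and `total_variation` are assumptions made for this example; the abstract does not state which measures the article uses.

```python
from collections import Counter

def relative_freq(items):
    """Map each distinct item to its share of all occurrences."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: n / total for key, n in counts.items()}

def total_variation(p, q):
    """Total variation distance between two frequency dicts.

    0.0 means identical distributions; 1.0 means no overlap at all.
    """
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

if __name__ == "__main__":
    # Made-up hashtag lists standing in for a full dataset and a sample.
    full = ["#news", "#news", "#sports", "#music"]
    sample = ["#news", "#sports"]
    distance = total_variation(relative_freq(full), relative_freq(sample))
    print(f"total variation distance: {distance:.2f}")
```

A distance near zero would suggest the sample preserves the relative popularity of hashtags, while a larger value would flag features that the sample over- or under-represents.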