Proceedings ArticleDOI

Information credibility on twitter

TL;DR: There are measurable differences in the way messages propagate that can be used to classify them automatically as credible or not credible, with precision and recall in the range of 70% to 80%.
Abstract: We analyze the information credibility of news propagated through Twitter, a popular microblogging service. Previous research has shown that most of the messages posted on Twitter are truthful, but the service is also used to spread misinformation and false rumors, often unintentionally. In this paper we focus on automatic methods for assessing the credibility of a given set of tweets. Specifically, we analyze microblog postings related to "trending" topics, and classify them as credible or not credible, based on features extracted from them. We use features from users' posting and re-posting ("re-tweeting") behavior, from the text of the posts, and from citations to external sources. We evaluate our methods using a significant number of human assessments about the credibility of items on a recent sample of Twitter postings. Our results show that there are measurable differences in the way messages propagate that can be used to classify them automatically as credible or not credible, with precision and recall in the range of 70% to 80%.
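
The approach summarized above is supervised classification over topic-level features aggregated from tweets (the paper reports results with a J48 decision tree). The sketch below is a minimal illustration of that kind of pipeline using scikit-learn's CART decision tree as a stand-in; the feature names and toy data are assumptions for demonstration, not the paper's actual feature set or corpus.

```python
# Minimal sketch of feature-based credibility classification over trending
# topics, in the spirit of the approach described above (which used a J48
# decision tree). Feature names and the toy data are illustrative assumptions.
from sklearn.tree import DecisionTreeClassifier

def topic_features(tweets):
    """Aggregate per-topic features from a list of tweet dicts."""
    n = len(tweets)
    return [
        sum(t["has_url"] for t in tweets) / n,           # fraction of tweets citing a URL
        sum(t["question_mark"] for t in tweets) / n,     # fraction containing '?'
        sum(t["retweets"] for t in tweets) / n,          # mean retweet count
        sum(t["author_followers"] for t in tweets) / n,  # mean author follower count
    ]

# Two hypothetical topics with human credibility labels (1 = credible, 0 = not).
topic_a = [{"has_url": 1, "question_mark": 0, "retweets": 12, "author_followers": 5400},
           {"has_url": 1, "question_mark": 0, "retweets": 3,  "author_followers": 900}]
topic_b = [{"has_url": 0, "question_mark": 1, "retweets": 40, "author_followers": 30},
           {"has_url": 0, "question_mark": 1, "retweets": 55, "author_followers": 75}]

X = [topic_features(topic_a), topic_features(topic_b)]
y = [1, 0]

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(clf.predict([topic_features(topic_b)]))  # -> [0]
```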


Citations
Journal ArticleDOI
09 Mar 2018-Science
TL;DR: A large-scale analysis of tweets reveals that false rumors spread further and faster than the truth, and false news was more novel than true news, which suggests that people were more likely to share novel information.
Abstract: We investigated the differential diffusion of all of the verified true and false news stories distributed on Twitter from 2006 to 2017. The data comprise ~126,000 stories tweeted by ~3 million people more than 4.5 million times. We classified news as true or false using information from six independent fact-checking organizations that exhibited 95 to 98% agreement on the classifications. Falsehood diffused significantly farther, faster, deeper, and more broadly than the truth in all categories of information, and the effects were more pronounced for false political news than for false news about terrorism, natural disasters, science, urban legends, or financial information. We found that false news was more novel than true news, which suggests that people were more likely to share novel information. Whereas false stories inspired fear, disgust, and surprise in replies, true stories inspired anticipation, sadness, joy, and trust. Contrary to conventional wisdom, robots accelerated the spread of true and false news at the same rate, implying that false news spreads more than the truth because humans, not robots, are more likely to spread it.
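
The diffusion measures compared above (how far, fast, deep and broad stories spread) are properties of retweet cascades. As an illustration of the quantities involved rather than the authors' pipeline, the sketch below computes size, depth and maximum breadth for a single cascade represented as a parent-to-children mapping; the toy cascade is assumed.

```python
# Illustrative computation of size, depth and max breadth for one retweet
# cascade, represented as a mapping from a tweet id to its direct retweets.
# The toy cascade is an assumption for demonstration only.
from collections import defaultdict, deque

children = defaultdict(list, {
    "root": ["a", "b"],          # original tweet and its direct retweets
    "a": ["c", "d"],
    "d": ["e"],
})

def cascade_metrics(children, root="root"):
    depth_counts = defaultdict(int)      # number of nodes at each depth
    queue = deque([(root, 0)])
    size = 0
    while queue:
        node, depth = queue.popleft()
        size += 1
        depth_counts[depth] += 1
        for child in children[node]:
            queue.append((child, depth + 1))
    return {
        "size": size,                               # users who (re)tweeted
        "depth": max(depth_counts),                 # longest retweet chain
        "max_breadth": max(depth_counts.values()),  # widest cascade level
    }

print(cascade_metrics(children))  # {'size': 6, 'depth': 3, 'max_breadth': 2}
```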

4,241 citations


Cites background from "Information credibility on twitter"

  • ...Some work develops theoretical models of rumor diffusion [37, 38, 39, 40], or methods for rumor detection [41, 42, 43, 44], credibility evaluation [45] or interventions to curtail the spread of rumors [46, 47, 48]....


Journal ArticleDOI
TL;DR: A comprehensive review of detecting fake news on social media, covering fake news characterizations based on psychology and social theories, existing detection algorithms from a data mining perspective, evaluation metrics, and representative datasets.
Abstract: Social media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from social media. On the other hand, it enables the wide spread of "fake news", i.e., low quality news with intentionally false information. The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Therefore, fake news detection on social media has recently become an emerging research that is attracting tremendous attention. Fake news detection on social media presents unique characteristics and challenges that make existing detection algorithms from traditional news media ineffective or not applicable. First, fake news is intentionally written to mislead readers to believe false information, which makes it difficult and nontrivial to detect based on news content; therefore, we need to include auxiliary information, such as user social engagements on social media, to help make a determination. Second, exploiting this auxiliary information is challenging in and of itself as users' social engagements with fake news produce data that is big, incomplete, unstructured, and noisy. Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including fake news characterizations on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media.

1,891 citations

Posted Content
TL;DR: This survey presents a comprehensive review of detecting fake news on social media, including fake news characterizations on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics and representative datasets, and future research directions for fake news detection on social media.
Abstract: Social media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from social media. On the other hand, it enables the wide spread of "fake news", i.e., low quality news with intentionally false information. The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Therefore, fake news detection on social media has recently become an emerging research that is attracting tremendous attention. Fake news detection on social media presents unique characteristics and challenges that make existing detection algorithms from traditional news media ineffective or not applicable. First, fake news is intentionally written to mislead readers to believe false information, which makes it difficult and nontrivial to detect based on news content; therefore, we need to include auxiliary information, such as user social engagements on social media, to help make a determination. Second, exploiting this auxiliary information is challenging in and of itself as users' social engagements with fake news produce data that is big, incomplete, unstructured, and noisy. Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including fake news characterizations on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media.
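
The survey frames detection as combining news-content signals with auxiliary social-context signals such as user engagements. Purely as a rough illustration of that framing (the feature choices, article texts, engagement numbers and labels below are assumptions, not material from the survey), the sketch concatenates text features with simple engagement features before fitting a classifier.

```python
# Rough sketch of the "content + social context" framing described above:
# concatenate text features with engagement features. All data are assumed.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

articles = ["Celebrity X endorses miracle cure, doctors stunned",
            "City council approves new budget for road repairs"]
# Per-article [shares, comments, distinct sharing users] (assumed values).
engagement = np.array([[950, 120, 400],
                       [35, 4, 30]], dtype=float)
labels = [1, 0]  # 1 = fake, 0 = real (assumed)

text_feats = TfidfVectorizer().fit_transform(articles).toarray()
X = np.hstack([text_feats, engagement])  # content features + social context

clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(X))  # sanity check on the toy training data
```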

887 citations


Cites background or methods from "Information credibility on twitter"

  • ...to infer the credibility and reliability for each user using various aspects of user demographics, such as registration age, number of followers/followees, number of tweets the user has authored, etc [11]. Group level user features capture overall characteristics of groups of users related to the news [99]. The assumption is that the spreaders of fake news 10https://www.wired.com/2016/12/photos-fuel-s...


  • ...h as supporting, denying, etc [37]. Topic features can be extracted using topic models, such as latent Dirichlet allocation (LDA) [49]. Credibility features for posts assess the degree of reliability [11]. Group level features aim to aggregate the feature values for all relevant posts for specific news articles by using "wisdom of crowds". For example, the average credibility scores are used to ev...


Proceedings Article
09 Jul 2016
TL;DR: A novel method based on recurrent neural networks that learns continuous representations of microblog events for identifying rumors, detecting them more quickly and accurately than existing techniques, including the leading online rumor debunking services.
Abstract: Microblogging platforms are an ideal place for spreading rumors and automatically debunking rumors is a crucial problem. To detect rumors, existing approaches have relied on hand-crafted features for employing machine learning algorithms that require daunting manual effort. Upon facing a dubious claim, people dispute its truthfulness by posting various cues over time, which generates long-distance dependencies of evidence. This paper presents a novel method that learns continuous representations of microblog events for identifying rumors. The proposed model is based on recurrent neural networks (RNN) for learning the hidden representations that capture the variation of contextual information of relevant posts over time. Experimental results on datasets from two real-world microblog platforms demonstrate that (1) the RNN method outperforms state-of-the-art rumor detection models that use hand-crafted features; (2) performance of the RNN-based algorithm is further improved via sophisticated recurrent units and extra hidden layers; (3) RNN-based method detects rumors more quickly and accurately than existing techniques, including the leading online rumor debunking services.
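
The model described above treats an event as a variable-length, time-ordered sequence of post representations and outputs a rumor/non-rumor decision. The sketch below is a minimal GRU-based version of that idea in PyTorch; the input dimensionality, time binning and random example input are assumptions rather than the authors' exact architecture.

```python
# Minimal sketch of an RNN-based rumor classifier over time-ordered post
# representations (e.g., tf-idf vectors per time interval). Dimensions and
# the random example input are assumptions for demonstration.
import torch
import torch.nn as nn

class RumorRNN(nn.Module):
    def __init__(self, input_dim=5000, hidden_dim=100, num_classes=2):
        super().__init__()
        self.gru = nn.GRU(input_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):         # x: (batch, time_steps, input_dim)
        _, h_n = self.gru(x)      # h_n: (1, batch, hidden_dim), last hidden state
        return self.out(h_n[-1])  # logits over {non-rumor, rumor}

model = RumorRNN()
event = torch.randn(1, 20, 5000)     # one event, 20 time intervals of posts
print(model(event).softmax(dim=-1))  # predicted class probabilities
```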

791 citations


Cites background or methods from "Information credibility on twitter"

  • ...We construct two microblog datasets using Twitter (www. twitter.com) and Sina Weibo (weibo.com)....


  • ...To balance the two classes, we further added some non-rumor events from two public datasets [Castillo et al., 2011; Kwon et al., 2013]....


  • ...The simplest RNN model, tanh-RNN, achieves 82.7% accuracy on Twitter and 87.3% on Weibo....


  • ...For example, on August 25th of 2015, a rumor about “shootouts and kidnappings by drug gangs happening near schools in Veracruz” spread through Twitter and Facebook1....


  • ...We refine the keywords by adding, deleting or replacing words manually, and iteratively until the composed queries can have reasonably precise Twitter search results....


Journal ArticleDOI
TL;DR: An increasing trend is observed in published articles on health-related misinformation and the role of social media in its propagation; the most extensively studied topics involving misinformation relate to vaccination, Ebola and Zika virus, although others, such as nutrition, cancer, fluoridation of water and smoking, also feature.

773 citations


Cites background from "Information credibility on twitter"

  • ...Many studies have thus analysed the credibility of user-generated contents and the cognitive process involved in the decision to spread online information on social and political events (Abbasi and Liu, 2013; Castillo et al., 2011; Lupia, 2013; Swire et al., 2017)....


References
Proceedings ArticleDOI
04 Oct 2010
TL;DR: A characterization of spam on Twitter finds that 8% of 25 million URLs posted to the site point to phishing, malware, and scams listed on popular blacklists, and examines whether the use of URL blacklists would help to significantly stem the spread of Twitter spam.
Abstract: In this work we present a characterization of spam on Twitter. We find that 8% of 25 million URLs posted to the site point to phishing, malware, and scams listed on popular blacklists. We analyze the accounts that send spam and find evidence that it originates from previously legitimate accounts that have been compromised and are now being puppeteered by spammers. Using clickthrough data, we analyze spammers' use of features unique to Twitter and the degree that they affect the success of spam. We find that Twitter is a highly successful platform for coercing users to visit spam pages, with a clickthrough rate of 0.13%, compared to much lower rates previously reported for email spam. We group spam URLs into campaigns and identify trends that uniquely distinguish phishing, malware, and spam, to gain an insight into the underlying techniques used to attract users.Given the absence of spam filtering on Twitter, we examine whether the use of URL blacklists would help to significantly stem the spread of Twitter spam. Our results indicate that blacklists are too slow at identifying new threats, allowing more than 90% of visitors to view a page before it becomes blacklisted. We also find that even if blacklist delays were reduced, the use by spammers of URL shortening services for obfuscation negates the potential gains unless tools that use blacklists develop more sophisticated spam filtering.
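
One operational point above is that URL shortening hides blacklisted destinations, so a tweeted URL has to be resolved to its final landing page before a blacklist lookup can work. The sketch below illustrates such a check with the requests library; the example blacklist entries and URL are placeholders, not data from the study.

```python
# Illustrative check of a (possibly shortened) tweeted URL against a domain
# blacklist: follow redirects to the landing page, then look up the host.
# The blacklist contents and example URL are placeholders.
from urllib.parse import urlparse
import requests

BLACKLISTED_DOMAINS = {"malware.example", "phish.example"}  # assumed entries

def is_blacklisted(url, timeout=5):
    try:
        # Follow the shortener's redirect chain to the final landing page.
        resp = requests.head(url, allow_redirects=True, timeout=timeout)
        final_host = urlparse(resp.url).hostname or ""
    except requests.RequestException:
        return False  # unreachable: treat as unknown rather than spam
    return final_host.lower() in BLACKLISTED_DOMAINS

print(is_blacklisted("https://example.com/some-short-link"))
```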

613 citations


"Information credibility on twitter" refers background in this paper

  • ...This has attracted spammers that use Twitter to attract visitors to (typically) web pages offering products or services [4, 11, 36]....


Proceedings ArticleDOI
28 Mar 2011
TL;DR: A web service that tracks political memes in Twitter and helps detect astroturfing, smear campaigns, and other misinformation in the context of U.S. political elections is demonstrated.
Abstract: Online social media are complementing and in some cases replacing person-to-person social interaction and redefining the diffusion of information. In particular, microblogs have become crucial grounds on which public relations, marketing, and political battles are fought. We demonstrate a web service that tracks political memes in Twitter and helps detect astroturfing, smear campaigns, and other misinformation in the context of U.S. political elections. We also present some cases of abusive behaviors uncovered by our service. Our web service is based on an extensible framework that will enable the real-time analysis of meme diffusion in social media by mining, visualizing, mapping, classifying, and modeling massive streams of public microblogging events.

506 citations

Proceedings ArticleDOI
06 Feb 2010
TL;DR: This paper considers a subset of the computer-mediated communication that took place during the flooding of the Red River Valley in the US and Canada in March and April 2009, focusing on the use of Twitter, a microblogging service, to identify mechanisms of information production, distribution, and organization.
Abstract: This paper considers a subset of the computer-mediated communication (CMC) that took place during the flooding of the Red River Valley in the US and Canada in March and April 2009. Focusing on the use of Twitter, a microblogging service, we identified mechanisms of information production, distribution, and organization. The Red River event resulted in a rapid generation of Twitter communications by numerous sources using a variety of communications forms, including autobiographical and mainstream media reporting, among other types. We examine the social life of microblogged information, identifying generative, synthetic, derivative and innovative properties that sustain the broader system of interaction. The landscape of Twitter is such that the production of new information is supported through derivative activities of directing, relaying, synthesizing, and redistributing, and is additionally complemented by socio-technical innovation. These activities comprise self-organization of information.

493 citations


Additional excerpts

  • ...Twitter has been used widely during emergency situations, such as wildfires [6], hurricanes [12], floods [32, 33, 31] and earthquakes [15, 7]....


Journal ArticleDOI
TL;DR: This article examines spam around a one-time Twitter meme—“robotpickuplines” and shows the existence of structural network differences between spam accounts and legitimate users, highlighting challenges in disambiguating spammers from legitimate users.
Abstract: Spam becomes a problem as soon as an online communication medium becomes popular. Twitter’s behavioral and structural properties make it a fertile breeding ground for spammers to proliferate. In this article we examine spam around a one-time Twitter meme—“robotpickuplines”. We show the existence of structural network differences between spam accounts and legitimate users. We conclude by highlighting challenges in disambiguating spammers from legitimate users.
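
The structural differences mentioned above concern how accounts sit in the follower graph. As an assumed illustration of the kind of per-account structural features one might compare (not the study's actual measurements), the sketch below computes degree and clustering statistics with networkx on a tiny made-up follower graph.

```python
# Illustrative per-account structural features on a small assumed follower
# graph (edges point from follower to followee), using networkx.
import networkx as nx

g = nx.DiGraph()
g.add_edges_from([
    ("spammer", "alice"), ("spammer", "bob"), ("spammer", "carol"),  # follows many
    ("alice", "bob"), ("bob", "alice"), ("carol", "alice"),          # mutual community
])

def structural_features(graph, node):
    followers = graph.in_degree(node)
    followees = graph.out_degree(node)
    return {
        "followers": followers,
        "followees": followees,
        "follower_ratio": followers / max(followees, 1),            # reciprocity proxy
        "clustering": nx.clustering(graph.to_undirected(), node),   # local cohesion
    }

for account in ("spammer", "alice"):
    print(account, structural_features(g, account))
```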

350 citations


"Information credibility on twitter" refers background in this paper

  • ...This has attracted spammers that use Twitter to attract visitors to (typically) web pages offering products or services [4, 11, 36]....


Proceedings ArticleDOI
03 Nov 2009
TL;DR: The understanding of how LBSN can be used as a reliable source of spatio-temporal information is improved by analysing the temporal, spatial and social dynamics of Twitter activity during a major forest fire event in the South of France in July 2009.
Abstract: The emergence of innovative web applications, often labelled as Web 2.0, has permitted an unprecedented increase of content created by non-specialist users. In particular, Location-based Social Networks (LBSN) are designed as platforms allowing the creation, storage and retrieval of vast amounts of georeferenced and user-generated contents. LBSN can thus be seen by Geographic Information specialists as a timely and cost-effective source of spatio-temporal information for many fields of application, provided that they can set up workflows to retrieve, validate and organise such information. This paper aims to improve the understanding on how LBSN can be used as a reliable source of spatio-temporal information, by analysing the temporal, spatial and social dynamics of Twitter activity during a major forest fire event in the South of France in July 2009.

350 citations