Proceedings ArticleDOI

Information credibility on twitter

TL;DR: There are measurable differences in the way messages propagate that can be used to classify them automatically as credible or not credible, with precision and recall in the range of 70% to 80%.
Abstract: We analyze the information credibility of news propagated through Twitter, a popular microblogging service. Previous research has shown that most of the messages posted on Twitter are truthful, but the service is also used to spread misinformation and false rumors, often unintentionally. In this paper we focus on automatic methods for assessing the credibility of a given set of tweets. Specifically, we analyze microblog postings related to "trending" topics, and classify them as credible or not credible, based on features extracted from them. We use features from users' posting and re-posting ("re-tweeting") behavior, from the text of the posts, and from citations to external sources. We evaluate our methods using a significant number of human assessments about the credibility of items on a recent sample of Twitter postings. Our results show that there are measurable differences in the way messages propagate that can be used to classify them automatically as credible or not credible, with precision and recall in the range of 70% to 80%.
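
The approach summarized above is supervised classification over topic-level features aggregated from tweets (the paper reports results with a J48 decision tree). The sketch below is a minimal illustration of that kind of pipeline using scikit-learn's CART decision tree as a stand-in; the feature names and toy data are assumptions for demonstration, not the paper's actual feature set or corpus.

```python
# Minimal sketch of feature-based credibility classification over trending
# topics, in the spirit of the approach described above (which used a J48
# decision tree). Feature names and the toy data are illustrative assumptions.
from sklearn.tree import DecisionTreeClassifier

def topic_features(tweets):
    """Aggregate per-topic features from a list of tweet dicts."""
    n = len(tweets)
    return [
        sum(t["has_url"] for t in tweets) / n,           # fraction of tweets citing a URL
        sum(t["question_mark"] for t in tweets) / n,     # fraction containing '?'
        sum(t["retweets"] for t in tweets) / n,          # mean retweet count
        sum(t["author_followers"] for t in tweets) / n,  # mean author follower count
    ]

# Two hypothetical topics with human credibility labels (1 = credible, 0 = not).
topic_a = [{"has_url": 1, "question_mark": 0, "retweets": 12, "author_followers": 5400},
           {"has_url": 1, "question_mark": 0, "retweets": 3,  "author_followers": 900}]
topic_b = [{"has_url": 0, "question_mark": 1, "retweets": 40, "author_followers": 30},
           {"has_url": 0, "question_mark": 1, "retweets": 55, "author_followers": 75}]

X = [topic_features(topic_a), topic_features(topic_b)]
y = [1, 0]

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(clf.predict([topic_features(topic_b)]))  # -> [0]
```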


Citations
Journal ArticleDOI
09 Mar 2018-Science
TL;DR: A large-scale analysis of tweets reveals that false rumors spread further and faster than the truth, and false news was more novel than true news, which suggests that people were more likely to share novel information.
Abstract: We investigated the differential diffusion of all of the verified true and false news stories distributed on Twitter from 2006 to 2017. The data comprise ~126,000 stories tweeted by ~3 million people more than 4.5 million times. We classified news as true or false using information from six independent fact-checking organizations that exhibited 95 to 98% agreement on the classifications. Falsehood diffused significantly farther, faster, deeper, and more broadly than the truth in all categories of information, and the effects were more pronounced for false political news than for false news about terrorism, natural disasters, science, urban legends, or financial information. We found that false news was more novel than true news, which suggests that people were more likely to share novel information. Whereas false stories inspired fear, disgust, and surprise in replies, true stories inspired anticipation, sadness, joy, and trust. Contrary to conventional wisdom, robots accelerated the spread of true and false news at the same rate, implying that false news spreads more than the truth because humans, not robots, are more likely to spread it.
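
The diffusion measures compared above (how far, fast, deep and broad stories spread) are properties of retweet cascades. As an illustration of the quantities involved rather than the authors' pipeline, the sketch below computes size, depth and maximum breadth for a single cascade represented as a parent-to-children mapping; the toy cascade is assumed.

```python
# Illustrative computation of size, depth and max breadth for one retweet
# cascade, represented as a mapping from a tweet id to its direct retweets.
# The toy cascade is an assumption for demonstration only.
from collections import defaultdict, deque

children = defaultdict(list, {
    "root": ["a", "b"],          # original tweet and its direct retweets
    "a": ["c", "d"],
    "d": ["e"],
})

def cascade_metrics(children, root="root"):
    depth_counts = defaultdict(int)      # number of nodes at each depth
    queue = deque([(root, 0)])
    size = 0
    while queue:
        node, depth = queue.popleft()
        size += 1
        depth_counts[depth] += 1
        for child in children[node]:
            queue.append((child, depth + 1))
    return {
        "size": size,                               # users who (re)tweeted
        "depth": max(depth_counts),                 # longest retweet chain
        "max_breadth": max(depth_counts.values()),  # widest cascade level
    }

print(cascade_metrics(children))  # {'size': 6, 'depth': 3, 'max_breadth': 2}
```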

4,241 citations


Cites background from "Information credibility on twitter"

  • ...Some work develops theoretical models of rumor diffusion [37, 38, 39, 40], or methods for rumor detection [41, 42, 43, 44], credibility evaluation [45] or interventions to curtail the spread of rumors [46, 47, 48]....


Journal ArticleDOI
TL;DR: A comprehensive review of detecting fake news on social media, covering fake news characterizations based on psychology and social theories, existing detection algorithms from a data mining perspective, evaluation metrics, and representative datasets.
Abstract: Social media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from social media. On the other hand, it enables the wide spread of "fake news", i.e., low quality news with intentionally false information. The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Therefore, fake news detection on social media has recently become an emerging research that is attracting tremendous attention. Fake news detection on social media presents unique characteristics and challenges that make existing detection algorithms from traditional news media ineffective or not applicable. First, fake news is intentionally written to mislead readers to believe false information, which makes it difficult and nontrivial to detect based on news content; therefore, we need to include auxiliary information, such as user social engagements on social media, to help make a determination. Second, exploiting this auxiliary information is challenging in and of itself as users' social engagements with fake news produce data that is big, incomplete, unstructured, and noisy. Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including fake news characterizations on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media.

1,891 citations

Posted Content
TL;DR: This survey presents a comprehensive review of detecting fake news on social media, including fake news characterizations on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics and representative datasets, and future research directions for fake news detection on social media.
Abstract: Social media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from social media. On the other hand, it enables the wide spread of "fake news", i.e., low quality news with intentionally false information. The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Therefore, fake news detection on social media has recently become an emerging research that is attracting tremendous attention. Fake news detection on social media presents unique characteristics and challenges that make existing detection algorithms from traditional news media ineffective or not applicable. First, fake news is intentionally written to mislead readers to believe false information, which makes it difficult and nontrivial to detect based on news content; therefore, we need to include auxiliary information, such as user social engagements on social media, to help make a determination. Second, exploiting this auxiliary information is challenging in and of itself as users' social engagements with fake news produce data that is big, incomplete, unstructured, and noisy. Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including fake news characterizations on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media.
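
The survey frames detection as combining news-content signals with auxiliary social-context signals such as user engagements. Purely as a rough illustration of that framing (the feature choices, article texts, engagement numbers and labels below are assumptions, not material from the survey), the sketch concatenates text features with simple engagement features before fitting a classifier.

```python
# Rough sketch of the "content + social context" framing described above:
# concatenate text features with engagement features. All data are assumed.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

articles = ["Celebrity X endorses miracle cure, doctors stunned",
            "City council approves new budget for road repairs"]
# Per-article [shares, comments, distinct sharing users] (assumed values).
engagement = np.array([[950, 120, 400],
                       [35, 4, 30]], dtype=float)
labels = [1, 0]  # 1 = fake, 0 = real (assumed)

text_feats = TfidfVectorizer().fit_transform(articles).toarray()
X = np.hstack([text_feats, engagement])  # content features + social context

clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(X))  # sanity check on the toy training data
```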

887 citations


Cites background or methods from "Information credibility on twitter"

  • ...to infer the credibility and reliability for each user using various aspects of user demographics, such as registration age, number of followers/followees, number of tweets the user has authored, etc [11]. Group level user features capture overall characteristics of groups of users related to the news [99]. The assumption is that the spreaders of fake news 10https://www.wired.com/2016/12/photos-fuel-s...


  • ...h as supporting, denying, etc [37]. Topic features can be extracted using topic models, such as latent Dirichlet allocation (LDA) [49]. Credibility features for posts assess the degree of reliability [11]. Group level features aim to aggregate the feature values for all relevant posts for specific news articles by using "wisdom of crowds". For example, the average credibility scores are used to ev...


Proceedings Article
09 Jul 2016
TL;DR: A novel method based on recurrent neural networks that learns continuous representations of microblog events for identifying rumors, detecting them more quickly and accurately than existing techniques, including the leading online rumor debunking services.
Abstract: Microblogging platforms are an ideal place for spreading rumors and automatically debunking rumors is a crucial problem. To detect rumors, existing approaches have relied on hand-crafted features for employing machine learning algorithms that require daunting manual effort. Upon facing a dubious claim, people dispute its truthfulness by posting various cues over time, which generates long-distance dependencies of evidence. This paper presents a novel method that learns continuous representations of microblog events for identifying rumors. The proposed model is based on recurrent neural networks (RNN) for learning the hidden representations that capture the variation of contextual information of relevant posts over time. Experimental results on datasets from two real-world microblog platforms demonstrate that (1) the RNN method outperforms state-of-the-art rumor detection models that use hand-crafted features; (2) performance of the RNN-based algorithm is further improved via sophisticated recurrent units and extra hidden layers; (3) RNN-based method detects rumors more quickly and accurately than existing techniques, including the leading online rumor debunking services.
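
The model described above treats an event as a variable-length, time-ordered sequence of post representations and outputs a rumor/non-rumor decision. The sketch below is a minimal GRU-based version of that idea in PyTorch; the input dimensionality, time binning and random example input are assumptions rather than the authors' exact architecture.

```python
# Minimal sketch of an RNN-based rumor classifier over time-ordered post
# representations (e.g., tf-idf vectors per time interval). Dimensions and
# the random example input are assumptions for demonstration.
import torch
import torch.nn as nn

class RumorRNN(nn.Module):
    def __init__(self, input_dim=5000, hidden_dim=100, num_classes=2):
        super().__init__()
        self.gru = nn.GRU(input_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):         # x: (batch, time_steps, input_dim)
        _, h_n = self.gru(x)      # h_n: (1, batch, hidden_dim), last hidden state
        return self.out(h_n[-1])  # logits over {non-rumor, rumor}

model = RumorRNN()
event = torch.randn(1, 20, 5000)     # one event, 20 time intervals of posts
print(model(event).softmax(dim=-1))  # predicted class probabilities
```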

791 citations


Cites background or methods from "Information credibility on twitter"

  • ...We construct two microblog datasets using Twitter (www. twitter.com) and Sina Weibo (weibo.com)....


  • ...To balance the two classes, we further added some non-rumor events from two public datasets [Castillo et al., 2011; Kwon et al., 2013]....


  • ...The simplest RNN model, tanh-RNN, achieves 82.7% accuracy on Twitter and 87.3% on Weibo....


  • ...For example, on August 25th of 2015, a rumor about “shootouts and kidnappings by drug gangs happening near schools in Veracruz” spread through Twitter and Facebook1....


  • ...We refine the keywords by adding, deleting or replacing words manually, and iteratively until the composed queries can have reasonably precise Twitter search results....


Journal ArticleDOI
TL;DR: An increasing trend is observed in published articles on health-related misinformation and the role of social media in its propagation; the most extensively studied topics involving misinformation relate to vaccination, Ebola and Zika virus, although others, such as nutrition, cancer, fluoridation of water and smoking, also feature.

773 citations


Cites background from "Information credibility on twitter"

  • ...Many studies have thus analysed the credibility of user-generated contents and the cognitive process involved in the decision to spread online information on social and political events (Abbasi and Liu, 2013; Castillo et al., 2011; Lupia, 2013; Swire et al., 2017)....


References
Proceedings ArticleDOI
04 Oct 2010
TL;DR: A characterization of spam on Twitter finds that 8% of 25 million URLs posted to the site point to phishing, malware, and scams listed on popular blacklists, and examines whether the use of URL blacklists would help to significantly stem the spread of Twitter spam.
Abstract: In this work we present a characterization of spam on Twitter. We find that 8% of 25 million URLs posted to the site point to phishing, malware, and scams listed on popular blacklists. We analyze the accounts that send spam and find evidence that it originates from previously legitimate accounts that have been compromised and are now being puppeteered by spammers. Using clickthrough data, we analyze spammers' use of features unique to Twitter and the degree that they affect the success of spam. We find that Twitter is a highly successful platform for coercing users to visit spam pages, with a clickthrough rate of 0.13%, compared to much lower rates previously reported for email spam. We group spam URLs into campaigns and identify trends that uniquely distinguish phishing, malware, and spam, to gain an insight into the underlying techniques used to attract users.Given the absence of spam filtering on Twitter, we examine whether the use of URL blacklists would help to significantly stem the spread of Twitter spam. Our results indicate that blacklists are too slow at identifying new threats, allowing more than 90% of visitors to view a page before it becomes blacklisted. We also find that even if blacklist delays were reduced, the use by spammers of URL shortening services for obfuscation negates the potential gains unless tools that use blacklists develop more sophisticated spam filtering.
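
One operational point above is that URL shortening hides blacklisted destinations, so a tweeted URL has to be resolved to its final landing page before a blacklist lookup can work. The sketch below illustrates such a check with the requests library; the example blacklist entries and URL are placeholders, not data from the study.

```python
# Illustrative check of a (possibly shortened) tweeted URL against a domain
# blacklist: follow redirects to the landing page, then look up the host.
# The blacklist contents and example URL are placeholders.
from urllib.parse import urlparse
import requests

BLACKLISTED_DOMAINS = {"malware.example", "phish.example"}  # assumed entries

def is_blacklisted(url, timeout=5):
    try:
        # Follow the shortener's redirect chain to the final landing page.
        resp = requests.head(url, allow_redirects=True, timeout=timeout)
        final_host = urlparse(resp.url).hostname or ""
    except requests.RequestException:
        return False  # unreachable: treat as unknown rather than spam
    return final_host.lower() in BLACKLISTED_DOMAINS

print(is_blacklisted("https://example.com/some-short-link"))
```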

613 citations


"Information credibility on twitter" refers background in this paper

  • ...This has attracted spammers that use Twitter to attract visitors to (typically) web pages offering products or services [4, 11, 36]....


Proceedings ArticleDOI
28 Mar 2011
TL;DR: A web service that tracks political memes in Twitter and helps detect astroturfing, smear campaigns, and other misinformation in the context of U.S. political elections is demonstrated.
Abstract: Online social media are complementing and in some cases replacing person-to-person social interaction and redefining the diffusion of information. In particular, microblogs have become crucial grounds on which public relations, marketing, and political battles are fought. We demonstrate a web service that tracks political memes in Twitter and helps detect astroturfing, smear campaigns, and other misinformation in the context of U.S. political elections. We also present some cases of abusive behaviors uncovered by our service. Our web service is based on an extensible framework that will enable the real-time analysis of meme diffusion in social media by mining, visualizing, mapping, classifying, and modeling massive streams of public microblogging events.

506 citations

Proceedings ArticleDOI
06 Feb 2010
TL;DR: This paper considers a subset of the computer-mediated communication that took place during the flooding of the Red River Valley in the US and Canada in March and April 2009, focusing on the use of Twitter, a microblogging service, to identify mechanisms of information production, distribution, and organization.
Abstract: This paper considers a subset of the computer-mediated communication (CMC) that took place during the flooding of the Red River Valley in the US and Canada in March and April 2009. Focusing on the use of Twitter, a microblogging service, we identified mechanisms of information production, distribution, and organization. The Red River event resulted in a rapid generation of Twitter communications by numerous sources using a variety of communications forms, including autobiographical and mainstream media reporting, among other types. We examine the social life of microblogged information, identifying generative, synthetic, derivative and innovative properties that sustain the broader system of interaction. The landscape of Twitter is such that the production of new information is supported through derivative activities of directing, relaying, synthesizing, and redistributing, and is additionally complemented by socio-technical innovation. These activities comprise self-organization of information.

493 citations


Additional excerpts

  • ...Twitter has been used widely during emergency situations, such as wildfires [6], hurricanes [12], floods [32, 33, 31] and earthquakes [15, 7]....


Journal ArticleDOI
TL;DR: This article examines spam around a one-time Twitter meme—“robotpickuplines” and shows the existence of structural network differences between spam accounts and legitimate users, highlighting challenges in disambiguating spammers from legitimate users.
Abstract: Spam becomes a problem as soon as an online communication medium becomes popular. Twitter’s behavioral and structural properties make it a fertile breeding ground for spammers to proliferate. In this article we examine spam around a one-time Twitter meme—“robotpickuplines”. We show the existence of structural network differences between spam accounts and legitimate users. We conclude by highlighting challenges in disambiguating spammers from legitimate users.
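
The structural differences mentioned above concern how accounts sit in the follower graph. As an assumed illustration of the kind of per-account structural features one might compare (not the study's actual measurements), the sketch below computes degree and clustering statistics with networkx on a tiny made-up follower graph.

```python
# Illustrative per-account structural features on a small assumed follower
# graph (edges point from follower to followee), using networkx.
import networkx as nx

g = nx.DiGraph()
g.add_edges_from([
    ("spammer", "alice"), ("spammer", "bob"), ("spammer", "carol"),  # follows many
    ("alice", "bob"), ("bob", "alice"), ("carol", "alice"),          # mutual community
])

def structural_features(graph, node):
    followers = graph.in_degree(node)
    followees = graph.out_degree(node)
    return {
        "followers": followers,
        "followees": followees,
        "follower_ratio": followers / max(followees, 1),            # reciprocity proxy
        "clustering": nx.clustering(graph.to_undirected(), node),   # local cohesion
    }

for account in ("spammer", "alice"):
    print(account, structural_features(g, account))
```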

350 citations


"Information credibility on twitter" refers background in this paper

  • ...This has attracted spammers that use Twitter to attract visitors to (typically) web pages offering products or services [4, 11, 36]....


Proceedings ArticleDOI
03 Nov 2009
TL;DR: The understanding of how LBSN can be used as a reliable source of spatio-temporal information is improved by analysing the temporal, spatial and social dynamics of Twitter activity during a major forest fire event in the South of France in July 2009.
Abstract: The emergence of innovative web applications, often labelled as Web 2.0, has permitted an unprecedented increase of content created by non-specialist users. In particular, Location-based Social Networks (LBSN) are designed as platforms allowing the creation, storage and retrieval of vast amounts of georeferenced and user-generated contents. LBSN can thus be seen by Geographic Information specialists as a timely and cost-effective source of spatio-temporal information for many fields of application, provided that they can set up workflows to retrieve, validate and organise such information. This paper aims to improve the understanding on how LBSN can be used as a reliable source of spatio-temporal information, by analysing the temporal, spatial and social dynamics of Twitter activity during a major forest fire event in the South of France in July 2009.

350 citations