Information credibility on twitter

doi:10.1145/1963405.1963500

Home
/
Papers
/
Information credibility on twitter

Proceedings Article•DOI•

Information credibility on twitter

Carlos Castillo¹, Marcelo Mendoza², Barbara Poblete³•Institutions (3)

Yahoo!¹, Federico Santa María Technical University², University of Chile³

28 Mar 2011-pp 675-684

TL;DR: There are measurable differences in the way messages propagate, that can be used to classify them automatically as credible or not credible, with precision and recall in the range of 70% to 80%.

read less

Abstract: We analyze the information credibility of news propagated through Twitter, a popular microblogging service. Previous research has shown that most of the messages posted on Twitter are truthful, but the service is also used to spread misinformation and false rumors, often unintentionally.On this paper we focus on automatic methods for assessing the credibility of a given set of tweets. Specifically, we analyze microblog postings related to "trending" topics, and classify them as credible or not credible, based on features extracted from them. We use features from users' posting and re-posting ("re-tweeting") behavior, from the text of the posts, and from citations to external sources.We evaluate our methods using a significant number of human assessments about the credibility of items on a recent sample of Twitter postings. Our results shows that there are measurable differences in the way messages propagate, that can be used to classify them automatically as credible or not credible, with precision and recall in the range of 70% to 80%.

...read moreread less

Content maybe subject to copyright Report

Citations

PDF

Open Access

More filters

Proceedings Article•DOI•

A Probabilistic Model for Malicious User and Rumor Detection on Social Media

[...]

Yihong Zhang¹, Takahiro Hara¹•Institutions (1)

Osaka University¹

07 Jan 2020

TL;DR: A probabilistic model that describes user maliciousness with a two-sided perception of rumors and true stories is proposed, and it is shown that the approach can outperform existing methods in detecting rumors, especially for more confusing stories.

...read moreread less

Abstract: Rumor detection in recent years has emerged as an important research topic, as fake news on social media now has more significant impacts on people’s lives, especially during complex and controversial events. Most existing rumor detection techniques, however, only provide shallow analyses of users who propagate rumors. In this paper, we propose a probabilistic model that describes user maliciousness with a two-sided perception of rumors and true stories. We model not only the behavior of retweeting rumors, but also the intention. We propose learning algorithms for discovering latent attributes and detecting rumors based on such attributes, supposedly more effectively when the stories involve retweets with mixed intentions. Using real-world rumor datasets, we show that our approach can outperform existing methods in detecting rumors, especially for more confusing stories. We also show that our approach can capture malicious users more effectively.

...read moreread less

1 citations

Cites background or methods from "Information credibility on twitter"

...has tested a list of features with regard to their effectiveness in predicting message credibility, and has been used as a popular baseline method in later works regarding message veracity [14]....
[...]
...investigated the topic of information credibility on Twitter, and proposed a prediction model based on a list of features [14]....
[...]

Journal Article•DOI•

FuDFEND: Fuzzy-Domain for Multi-domain Fake News Detection

[...]

Mahmood R. Azimi-Sadjadi¹•Institutions (1)

University of Chinese Academy of Sciences¹

01 Jan 2022-Lecture Notes in Computer Science

TL;DR: FuDFEND as discussed by the authors utilizes a neural network to fit the fuzzy inference process which constructs a fuzzy domain label for each news item, and then the feature extraction module uses the fuzzy label to extract the multi-domain features of the news and obtain the total feature representation.

...read moreread less

Abstract: On the Internet, fake news exists in various domain (e.g., education, health). Since news in different domains has different features, researchers have begun to use single domain label for fake news detection recently. Existing works show that using single domain label can improve the accuracy of fake news detection model. However, there are two problems in previous works. Firstly, they ignore that a piece of news may have features from different domains. The single domain label focuses only on the features of one domain. This may reduce the performance of the model. Secondly, their model cannot transfer the domain knowledge to the other dataset without domain label. In this paper, we propose a novel model, FuDFEND, which solves the limitations above by introducing the fuzzy inference mechanism. Specifically, FuDFEND utilizes a neural network to fit the fuzzy inference process which constructs a fuzzy domain label for each news item. Then, the feature extraction module uses the fuzzy domain label to extract the multi-domain features of the news and obtain the total feature representation. Finally, the discriminator module uses the total feature representation to discriminate whether the news item is fake news. The results on the Weibo21 show that our model works better than the model using only single domain label. In addition, our model transfers domain knowledge better to Thu dataset which has no domain label.

...read moreread less

1 citations

Book Chapter•DOI•

"Not some trumped up beef": Assessing Credibility of Online Restaurant Reviews

[...]

Marina Kobayashi¹, Victoria Schwanda Sosik¹, David A. Huffaker¹•Institutions (1)

Google¹

14 Sep 2015

TL;DR: Results are presented showing that attributes of the reviewer and review content influence credibility ratings, and how eWOM platforms can be designed to coach reviewers to write better reviews and present reviews in a manner that facilitates credibility judgments is suggested.

...read moreread less

Abstract: Online reviews, or electronic word of mouth (eWOM), are an essential source of information for people making decisions about products and services, however they are also susceptible to abuses such as spamming and defamation. Therefore when making decisions, readers must determine if reviews are credible. Yet relatively little research has investigated how people make credibility judgments of online reviews. This paper presents quantitative and qualitative results from a survey of 1,979 respondents, showing that attributes of the reviewer and review content influence credibility ratings. Especially important for judging credibility is the level of detail in the review, whether or not it is balanced in sentiment, and whether the reviewer demonstrates expertise. Our findings contribute to the understanding of how people judge eWOM credibility, and we suggest how eWOM platforms can be designed to coach reviewers to write better reviews and present reviews in a manner that facilitates credibility judgments.

...read moreread less

Cites background from "Information credibility on twitter"

..., [8], [22], [34]) and allows researchers to collect high-quality data from a more diverse population than the typical university student sample [6]....
[...]
...Studies modeling the credib ility of factual informat ion in tweets show that informat ion like the number of tweets people have made or the number of fo llowers they have predicts credibility [8], [17], and these cues may likewise be useful for eWOM....
[...]
...We find support for these heuristics in the case of eWOM specifically, and inspired by work on judging factual information in tweets [8] we also suggest concrete signals to facilitate heuristic credib ility judgment in eWOM....
[...]
..., [8]), however with respect to verifiable, factual news events , not subjective opinions about products and services ....
[...]

Posted Content•

Combining exogenous and endogenous signals with a semi-supervised co-attention network for early detection of COVID-19 fake tweets

[...]

Rachit Bansal¹, William Scott Paka², Nidhi³, Shubhashis Sengupta³, Tanmoy Chakraborty² - Show less +1 more•Institutions (3)

Delhi Technological University¹, Indraprastha Institute of Information Technology², Accenture³

12 Apr 2021

TL;DR: In this article, a co-attention mechanism is proposed to fuse signal representations optimally for early detection of fake news in COVID-19 Twitter fake news, with additional behavioral test sets to validate early detection.

...read moreread less

Abstract: Fake tweets are observed to be ever-increasing, demanding immediate countermeasures to combat their spread. During COVID-19, tweets with misinformation should be flagged and neutralized in their early stages to mitigate the damages. Most of the existing methods for early detection of fake news assume to have enough propagation information for large labeled tweets -- which may not be an ideal setting for cases like COVID-19 where both aspects are largely absent. In this work, we present ENDEMIC, a novel early detection model which leverages exogenous and endogenous signals related to tweets, while learning on limited labeled data. We first develop a novel dataset, called CTF for early COVID-19 Twitter fake news, with additional behavioral test sets to validate early detection. We build a heterogeneous graph with follower-followee, user-tweet, and tweet-retweet connections and train a graph embedding model to aggregate propagation information. Graph embeddings and contextual features constitute endogenous, while time-relative web-scraped information constitutes exogenous signals. ENDEMIC is trained in a semi-supervised fashion, overcoming the challenge of limited labeled data. We propose a co-attention mechanism to fuse signal representations optimally. Experimental results on ECTF, PolitiFact, and GossipCop show that ENDEMIC is highly reliable in detecting early fake tweets, outperforming nine state-of-the-art methods significantly.

...read moreread less

Supervised Machine Learning Methods for Early Detection of Untrue Information

[...]

Akanksha Mathur, Chandra Prakash Gupta

01 Jan 2021

Collapse

References

PDF

Open Access

More filters

Proceedings Article•DOI•

What is Twitter, a social network or a news media?

[...]

Haewoon Kwak¹, Changhyun Lee¹, Hosung Park¹, Sue Moon¹•Institutions (1)

KAIST¹

26 Apr 2010

TL;DR: In this paper, the authors have crawled the entire Twittersphere and found a non-power-law follower distribution, a short effective diameter, and low reciprocity, which all mark a deviation from known characteristics of human social networks.

...read moreread less

Abstract: Twitter, a microblogging service less than three years old, commands more than 41 million users as of July 2009 and is growing fast. Twitter users tweet about any topic within the 140-character limit and follow others to receive their tweets. The goal of this paper is to study the topological characteristics of Twitter and its power as a new medium of information sharing.We have crawled the entire Twitter site and obtained 41.7 million user profiles, 1.47 billion social relations, 4,262 trending topics, and 106 million tweets. In its follower-following topology analysis we have found a non-power-law follower distribution, a short effective diameter, and low reciprocity, which all mark a deviation from known characteristics of human social networks [28]. In order to identify influentials on Twitter, we have ranked users by the number of followers and by PageRank and found two rankings to be similar. Ranking by retweets differs from the previous two rankings, indicating a gap in influence inferred from the number of followers and that from the popularity of one's tweets. We have analyzed the tweets of top trending topics and reported on their temporal behavior and user participation. We have classified the trending topics based on the active period and the tweets and show that the majority (over 85%) of topics are headline news or persistent news in nature. A closer look at retweets reveals that any retweeted tweet is to reach an average of 1,000 users no matter what the number of followers is of the original tweet. Once retweeted, a tweet gets retweeted almost instantly on next hops, signifying fast diffusion of information after the 1st retweet.To the best of our knowledge this work is the first quantitative study on the entire Twittersphere and information diffusion on it.

...read moreread less

6,108 citations

Proceedings Article•DOI•

Earthquake shakes Twitter users: real-time event detection by social sensors

[...]

Takeshi Sakaki¹, Makoto Okazaki¹, Yutaka Matsuo¹•Institutions (1)

University of Tokyo¹

26 Apr 2010

TL;DR: This paper investigates the real-time interaction of events such as earthquakes in Twitter and proposes an algorithm to monitor tweets and to detect a target event and produces a probabilistic spatiotemporal model for the target event that can find the center and the trajectory of the event location.

...read moreread less

Abstract: Twitter, a popular microblogging service, has received much attention recently. An important characteristic of Twitter is its real-time nature. For example, when an earthquake occurs, people make many Twitter posts (tweets) related to the earthquake, which enables detection of earthquake occurrence promptly, simply by observing the tweets. As described in this paper, we investigate the real-time interaction of events such as earthquakes in Twitter and propose an algorithm to monitor tweets and to detect a target event. To detect a target event, we devise a classifier of tweets based on features such as the keywords in a tweet, the number of words, and their context. Subsequently, we produce a probabilistic spatiotemporal model for the target event that can find the center and the trajectory of the event location. We consider each Twitter user as a sensor and apply Kalman filtering and particle filtering, which are widely used for location estimation in ubiquitous/pervasive computing. The particle filter works better than other comparable methods for estimating the centers of earthquakes and the trajectories of typhoons. As an application, we construct an earthquake reporting system in Japan. Because of the numerous earthquakes and the large number of Twitter users throughout the country, we can detect an earthquake with high probability (96% of earthquakes of Japan Meteorological Agency (JMA) seismic intensity scale 3 or more are detected) merely by monitoring tweets. Our system detects earthquakes promptly and sends e-mails to registered users. Notification is delivered much faster than the announcements that are broadcast by the JMA.

...read moreread less

3,976 citations

"Information credibility on twitter" refers background in this paper

...to track epidemics [17], detect news events [28], geolocate such events [27], and find controversial emerging controversial topics [24]....
[...]

Proceedings Article•DOI•

Why we twitter: understanding microblogging usage and communities

[...]

Akshay Java¹, Xiaodan Song, Tim Finin¹, Belle L. Tseng•Institutions (1)

University of Maryland, Baltimore County¹

12 Aug 2007

TL;DR: It is found that people use microblogging to talk about their daily activities and to seek or share information and the user intentions associated at a community level are analyzed to show how users with similar intentions connect with each other.

...read moreread less

Abstract: Microblogging is a new form of communication in which users can describe their current status in short posts distributed by instant messages, mobile phones, email or the Web. Twitter, a popular microblogging tool has seen a lot of growth since it launched in October, 2006. In this paper, we present our observations of the microblogging phenomena by studying the topological and geographical properties of Twitter's social network. We find that people use microblogging to talk about their daily activities and to seek or share information. Finally, we analyze the user intentions associated at a community level and show how users with similar intentions connect with each other.

...read moreread less

3,025 citations

"Information credibility on twitter" refers background in this paper

...In the table we have separated two broad types of topics: news and conversation, following the broad categories found in [13, 22]....
[...]
...While most messages on Twitter are conversation and chatter, people also use it to share relevant information and to report news [13, 22, 21]....
[...]

Proceedings Article•DOI•

Microblogging during two natural hazards events: what twitter may contribute to situational awareness

[...]

Sarah Vieweg¹, Amanda Lee Hughes¹, Kate Starbird¹, Leysia Palen¹•Institutions (1)

University of Colorado Boulder¹

10 Apr 2010

TL;DR: Analysis of microblog posts generated during two recent, concurrent emergency events in North America via Twitter, a popular microblogging service, aims to inform next steps for extracting useful, relevant information during emergencies using information extraction (IE) techniques.

...read moreread less

Abstract: We analyze microblog posts generated during two recent, concurrent emergency events in North America via Twitter, a popular microblogging service. We focus on communications broadcast by people who were "on the ground" during the Oklahoma Grassfires of April 2009 and the Red River Floods that occurred in March and April 2009, and identify information that may contribute to enhancing situational awareness (SA). This work aims to inform next steps for extracting useful, relevant information during emergencies using information extraction (IE) techniques.

...read moreread less

1,479 citations

Additional excerpts

...Twitter has been used widely during emergency situations, such as wildfires [6], hurricanes [12], floods [32, 33, 31] and earthquakes [15, 7]....
[...]

Proceedings Article•DOI•

Finding high-quality content in social media

[...]

Eugene Agichtein¹, Carlos Castillo², Debora Donato², Aristides Gionis², Gilad Mishne² - Show less +1 more•Institutions (2)

Emory University¹, Yahoo!²

11 Feb 2008

TL;DR: This paper introduces a general classification framework for combining the evidence from different sources of information, that can be tuned automatically for a given social media type and quality definition, and shows that its system is able to separate high-quality items from the rest with an accuracy close to that of humans.

...read moreread less

Abstract: The quality of user-generated content varies drastically from excellent to abuse and spam. As the availability of such content increases, the task of identifying high-quality content sites based on user contributions --social media sites -- becomes increasingly important. Social media in general exhibit a rich variety of information sources: in addition to the content itself, there is a wide array of non-content information available, such as links between items and explicit quality ratings from members of the community. In this paper we investigate methods for exploiting such community feedback to automatically identify high quality content. As a test case, we focus on Yahoo! Answers, a large community question/answering portal that is particularly rich in the amount and types of content and social interactions available in it. We introduce a general classification framework for combining the evidence from different sources of information, that can be tuned automatically for a given social media type and quality definition. In particular, for the community question/answering domain, we show that our system is able to separate high-quality items from the rest with an accuracy close to that of humans

...read moreread less

1,300 citations