Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter

doi:10.18653/V1/P17-2102

Open Access Proceedings Article

- Vol. 2, pp 647-653

TLDR

This work builds predictive models to classify 130 thousand news posts as suspicious or verified and to predict four sub-types of suspicious news: satire, hoaxes, clickbait and propaganda. It shows that neural network models trained on tweet content and social network interactions outperform lexical models.

Abstract:

Pew Research polls report that 62 percent of U.S. adults get news on social media (Gottfried and Shearer, 2016). In a December poll, 64 percent of U.S. adults said that “made-up news” has caused a “great deal of confusion” about the facts of current events (Barthel et al., 2016). Fabricated stories in social media, ranging from deliberate propaganda to hoaxes and satire, contribute to this confusion in addition to having serious effects on global stability. In this work we build predictive models to classify 130 thousand news posts as suspicious or verified, and to predict four sub-types of suspicious news: satire, hoaxes, clickbait and propaganda. We show that neural network models trained on tweet content and social network interactions outperform lexical models. Unlike previous work on deception detection, we find that adding syntax and grammar features to our models does not improve performance. Incorporating linguistic features improves classification results; however, social interaction features are most informative for finer-grained separation between the four types of suspicious news posts.
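To make the "lexical model" side of the comparison concrete, here is a minimal sketch of a bag-of-words classifier for the binary suspicious/verified task. It is not the authors' model: it assumes a multinomial Naive Bayes over unigram counts with add-one smoothing, and the four example posts, their labels, and all function names are invented purely for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real tweet tokenization)."""
    return text.lower().split()


def train_naive_bayes(posts):
    """Count documents per class and tokens per class from (text, label) pairs."""
    class_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for text, label in posts:
        class_counts[label] += 1
        for tok in tokenize(text):
            token_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, token_counts, vocab


def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    class_counts, token_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior from class frequencies.
        score = math.log(class_counts[label] / total_docs)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(token_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # ignore tokens never seen in training
                score += math.log((token_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, invented for this sketch (not from the paper's corpus).
TOY_POSTS = [
    ("you won't believe what happened next", "suspicious"),
    ("shocking secret they don't want you to know", "suspicious"),
    ("senate passes budget bill after late vote", "verified"),
    ("city council approves new transit budget", "verified"),
]

model = train_naive_bayes(TOY_POSTS)
print(predict(model, "shocking secret you won't believe"))
```

A model of this kind sees only word counts. The abstract's claim is that neural models which also use social network interactions do better, particularly when separating the four suspicious sub-types.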

Citations

MisInfoWars: A linguistic analysis of deceptive and credible news

Technological Approaches to Detecting Online Disinformation and Manipulation

LSACoNet: A Combination of Lexical and Conceptual Features for Analysis of Fake News Spreaders on Twitter.

FacTweet: Profiling Fake News Twitter Accounts.

A Large-Scale Longitudinal Multimodal Dataset of State-Backed Information Operations on Twitter

References

Adam: A Method for Stochastic Optimization

Glove: Global Vectors for Word Representation

Large-Scale Video Classification with Convolutional Neural Networks

Liberals and Conservatives Rely on Different Sets of Moral Foundations

Related Papers (5)

Fake News Detection on Social Media: A Data Mining Perspective

"Liar, Liar Pants on Fire": A New Benchmark Dataset for Fake News Detection

The spread of true and false news online

Information credibility on twitter

Social Media and Fake News in the 2016 Election