Journal ArticleDOI

Fake News Detection on Social Media: A Data Mining Perspective

01 Sep 2017-SIGKDD Explorations (ACM)-Vol. 19, Iss: 1, pp 22-36
TL;DR: A comprehensive review of detecting fake news on social media, covering fake news characterizations based on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics, and representative datasets.
Abstract: Social media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from social media. On the other hand, it enables the wide spread of "fake news", i.e., low-quality news with intentionally false information. The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Therefore, fake news detection on social media has recently become an emerging research topic that is attracting tremendous attention. Fake news detection on social media presents unique characteristics and challenges that make existing detection algorithms from traditional news media ineffective or not applicable. First, fake news is intentionally written to mislead readers into believing false information, which makes it difficult and nontrivial to detect based on news content alone; therefore, we need to include auxiliary information, such as user social engagements on social media, to help make a determination. Second, exploiting this auxiliary information is challenging in and of itself, as users' social engagements with fake news produce data that is big, incomplete, unstructured, and noisy. Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including fake news characterizations based on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics, and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media.
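The survey frames detection as a binary classification task evaluated with standard metrics such as precision, recall, F1, and accuracy. A minimal sketch of computing these, assuming scikit-learn and invented toy labels rather than any dataset from the paper:

```python
# Hedged sketch: fake news detection scored as binary classification,
# with the fake class treated as the positive label. Toy data only.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # 1 = fake, 0 = real (hypothetical ground truth)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]  # hypothetical classifier output

print(f"Accuracy:  {accuracy_score(y_true, y_pred):.2f}")
print(f"Precision: {precision_score(y_true, y_pred):.2f}")
print(f"Recall:    {recall_score(y_true, y_pred):.2f}")
print(f"F1:        {f1_score(y_true, y_pred):.2f}")
```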
Citations
Proceedings ArticleDOI
13 May 2020
TL;DR: A model for detecting fake news that examines the accuracy of a report and predicts its authenticity; by extracting features and forming credibility scores from textual information, it builds an ensemble network that learns representations of news reports, authors, and titles simultaneously.
Abstract: Owing to the easy access, rapid growth, and proliferation of information available through regular news media and social media, it has become easy for people to find and consume news. On the other hand, it has become a daunting task to differentiate false information from true information, leading to the widespread circulation of fake news. Fake news can be defined as a type of deceptive journalism: statements crafted to trick and mislead people. The credibility of the social media platforms on which such news is mostly shared is also at stake. Forged news of this kind can have serious negative societal impacts, and its detection has therefore become an emerging area attracting research attention. In this paper, we propose a model for detecting fake news by examining the accuracy of a report and predicting its authenticity. By extracting features and forming credibility scores from the textual information, the model builds an ensemble network that learns representations of news reports, authors, and titles simultaneously. Several machine learning algorithms, including SVM, CNN, LSTM, KNN, and Naive Bayes, were compared, and LSTM achieved the best accuracy at 97%. The performance and effectiveness of the classifiers were evaluated in terms of precision, recall, and F1-score.
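The paper's exact feature extraction and credibility scoring are not spelled out here, so the following is only an illustrative sketch of the classifier-comparison step: TF-IDF features stand in for the unspecified features, the deep models (CNN, LSTM) are omitted to keep the example self-contained, and the headlines are invented.

```python
# Hedged sketch of comparing several classifiers on text features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC
from sklearn.metrics import f1_score

texts = [  # hypothetical headlines, not from the paper's dataset
    "miracle cure doctors do not want you to know", "aliens endorse candidate",
    "celebrity secretly replaced by clone", "moon landing filmed in basement",
    "city council approves new transit budget", "rain expected this weekend",
    "university opens new research lab", "local team wins regional final",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = fake, 0 = real

X = TfidfVectorizer().fit_transform(texts)
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.25,
                                          stratify=labels, random_state=0)

for name, clf in [("SVM", LinearSVC()),
                  ("KNN", KNeighborsClassifier(n_neighbors=3)),
                  ("Naive Bayes", MultinomialNB())]:
    clf.fit(X_tr, y_tr)
    print(name, "F1:", f1_score(y_te, clf.predict(X_te)))
```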

32 citations

Proceedings ArticleDOI
01 Nov 2019
TL;DR: This article reviews and analyses many research and survey articles to give readers a concise overview of what fake news is, its different flavours across the news spectrum, its characteristics, and the basics of its identification.
Abstract: One can easily say that in today's world, information, or news to some, is more precious than money itself. This news needs to be authentic, yet it is usually found in adulterated form, leaving a dire need to distinguish real news from any possible fake news. News, being a form of information, can be judged for authenticity against proofs and sources. A human can usually identify real news from fake news through an innate capacity to deduce logic and to spot an outlandish source of an information piece, provided a few trusted sources are available for checking facts and myths. On a real-time basis, however, there is a dire need for software that can nip such 'false news' in the bud, which makes this one of the most actively researched areas today. Primarily a part of Information Retrieval, the area is attracting attention from researchers worldwide who seek a real-time solution to the problem. In this article we review and analyse many research articles, along with many survey articles, to provide readers with a concise idea of what fake news is, its different flavours across the news spectrum, its characteristics, and the basics of its identification. We also cover the methods used by prior researchers in the same field, using a few studies as examples to explain the basics of those methods. Future prospects are included in this article as well, along with the challenges one faces while doing research in this very field.

31 citations

Journal ArticleDOI
TL;DR: A manually assembled and verified dataset of 900 news articles, 500 annotated as real and 400 as fake, enabling the investigation of automated fake news detection approaches in Urdu, along with a baseline classification and its evaluation.
Abstract: The paper presents a new corpus for fake news detection in the Urdu language, along with a baseline classification and its evaluation. With the escalating use of the Internet worldwide and the substantially increasing impact of readily available ambiguous information, the challenge of quickly identifying fake news in digital media across languages becomes more acute. We provide a manually assembled and verified dataset containing 900 news articles, 500 annotated as real and 400 as fake, allowing the investigation of automated fake news detection approaches in Urdu. The news articles in the truthful subset come from legitimate news sources, and their validity has been manually verified. For the fake subset, the known difficulty of finding fake news was solved by hiring professional journalists, native speakers of Urdu, who were instructed to intentionally write deceptive news articles. The dataset covers 5 topics: (i) Business, (ii) Health, (iii) Showbiz, (iv) Sports, and (v) Technology. To establish our Urdu dataset as a benchmark, we performed a baseline classification. We crafted a variety of text-representation feature sets, including word n-grams, character n-grams, functional word n-grams, and their combinations. After applying a variety of feature weighting schemes, we ran a series of classifiers on the train-test split. The results show sizable performance gains by the AdaBoost classifier, with an F1 of 0.87 on the fake class and 0.90 on the real class. We provide results evaluated against different metrics for convenient comparison in future research. The dataset is publicly available for research purposes.
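A rough sketch of the baseline setup the abstract describes: word and character n-gram features combined and fed to an AdaBoost classifier, with per-class F1 reported. The English placeholder texts are hypothetical stand-ins for the Urdu corpus, and the n-gram ranges are assumptions, not the paper's exact configuration.

```python
# Hedged sketch: n-gram feature union + AdaBoost with per-class F1.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, make_pipeline
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import f1_score

features = FeatureUnion([
    ("word_ngrams", TfidfVectorizer(analyzer="word", ngram_range=(1, 2))),
    ("char_ngrams", TfidfVectorizer(analyzer="char", ngram_range=(2, 4))),
])
model = make_pipeline(features, AdaBoostClassifier(random_state=0))

train_texts = [  # hypothetical placeholders; the real corpus is Urdu news text
    "reported miracle device cures all illness overnight",
    "secret memo proves celebrity scandal fabricated entirely",
    "ministry announces revised school examination schedule",
    "national team names squad for upcoming series",
]
train_labels = [1, 1, 0, 0]  # 1 = fake, 0 = real
model.fit(train_texts, train_labels)

test_texts = ["leaked letter reveals impossible overnight cure",
              "city hospital inaugurates new cardiology wing"]
test_labels = [1, 0]
pred = model.predict(test_texts)
print("F1 (fake):", f1_score(test_labels, pred, pos_label=1))
print("F1 (real):", f1_score(test_labels, pred, pos_label=0))
```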

31 citations

Proceedings ArticleDOI
06 May 2021
TL;DR: This paper investigates the downstream consequences of social corrections on users' subsequent sharing of other content, finding causal evidence that being publicly corrected by another user shifts one's attention away from accuracy, which presents an important challenge for social correction approaches.
Abstract: A prominent approach to combating online misinformation is to debunk false content. Here we investigate downstream consequences of social corrections on users’ subsequent sharing of other content. Being corrected might make users more attentive to accuracy, thus improving their subsequent sharing. Alternatively, corrections might not improve subsequent sharing - or even backfire - by making users feel defensive, or by shifting their attention away from accuracy (e.g., towards various social factors). We identified N=2,000 users who shared false political news on Twitter, and replied to their false tweets with links to fact-checking websites. We find causal evidence that being corrected decreases the quality, and increases the partisan slant and language toxicity, of the users’ subsequent retweets (but has no significant effect on primary tweets). This suggests that being publicly corrected by another user shifts one's attention away from accuracy - presenting an important challenge for social correction approaches.
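As a loose illustration of the before/after logic, one could compare the mean source-quality score of a user's retweets pre- and post-correction with a paired test. The numbers below are invented, and the study's actual causal analysis is considerably more careful than this sketch.

```python
# Hedged sketch: paired comparison of retweet quality around a correction.
from scipy import stats

pre_quality  = [0.62, 0.55, 0.71, 0.48, 0.66]  # hypothetical source-quality scores
post_quality = [0.51, 0.44, 0.58, 0.39, 0.47]  # same (hypothetical) users, after

t, p = stats.ttest_rel(pre_quality, post_quality)  # paired t-test
print(f"mean pre={sum(pre_quality)/5:.2f}, post={sum(post_quality)/5:.2f}, "
      f"t={t:.2f}, p={p:.3f}")
```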

31 citations

Journal ArticleDOI
TL;DR: In this paper, the authors used the ongoing COVID-19 pandemic as a case study to systematically investigate factors associated with the spread of multi-topic misinformation related to one event on social media based on the heuristic-systematic model.
Abstract: The spread of misinformation on social media has become a major societal issue in recent years. In this work, we used the ongoing COVID-19 pandemic as a case study to systematically investigate factors associated with the spread of multi-topic misinformation related to one event on social media, based on the heuristic-systematic model. Among factors related to systematic processing of information, we discovered that the topic of a misinformation story matters, with conspiracy theories being the most likely to be retweeted. As for factors related to heuristic processing of information, such as when citizens look up to their leaders during such a crisis, our results demonstrated that the behavior of a political leader, former US President Donald J. Trump, may have nudged people's sharing of COVID-19 misinformation. The outcomes of this study help social media platforms and users better understand and prevent the spread of misinformation on social media.
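One hypothetical way to operationalize the topic analysis is a logistic regression of retweet outcomes on topic indicator features, where a positive coefficient on the conspiracy indicator would mirror the finding above. The feature names and data here are invented for illustration, not taken from the study.

```python
# Hedged sketch: retweet likelihood regressed on topic indicators.
import numpy as np
from sklearn.linear_model import LogisticRegression

# columns: [is_conspiracy, is_fake_cure, is_fake_policy] (hypothetical topics)
X = np.array([[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]])
y = np.array([1, 1, 1, 0, 0, 0])  # 1 = the misinformation tweet was retweeted

model = LogisticRegression().fit(X, y)
for name, coef in zip(["conspiracy", "fake_cure", "fake_policy"], model.coef_[0]):
    print(f"{name}: {coef:+.2f}")  # sign indicates association with retweeting
```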

31 citations

References
Journal ArticleDOI
28 May 2015-Nature
TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
Abstract: Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
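As a minimal sketch of the core idea, backpropagation adjusting the internal parameters of a multi-layer model, here is a tiny two-layer network trained on XOR in plain NumPy. It bears no relation in scale to the systems the paper surveys.

```python
# Hedged sketch: two-layer network, parameters updated by backpropagation.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # toy XOR inputs
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, W2 = rng.normal(size=(2, 8)), rng.normal(size=(8, 1))
for _ in range(5000):
    h = np.tanh(X @ W1)                      # layer 1: hidden representation
    out = 1 / (1 + np.exp(-(h @ W2)))        # layer 2: sigmoid output
    grad_out = out - y                       # dLoss/d(pre-sigmoid), cross-entropy
    grad_h = (grad_out @ W2.T) * (1 - h**2)  # backprop through tanh layer
    W2 -= 0.1 * h.T @ grad_out               # gradient step on layer 2 weights
    W1 -= 0.1 * X.T @ grad_h                 # gradient step on layer 1 weights

print(np.round(out.ravel(), 2))  # should approach [0, 1, 1, 0]
```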

46,982 citations

Book ChapterDOI
TL;DR: In this paper, the authors present a critique of expected utility theory as a descriptive model of decision making under risk, and develop an alternative model, called prospect theory, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights.
Abstract: This paper presents a critique of expected utility theory as a descriptive model of decision making under risk, and develops an alternative model, called prospect theory. Choices among risky prospects exhibit several pervasive effects that are inconsistent with the basic tenets of utility theory. In particular, people underweight outcomes that are merely probable in comparison with outcomes that are obtained with certainty. This tendency, called the certainty effect, contributes to risk aversion in choices involving sure gains and to risk seeking in choices involving sure losses. In addition, people generally discard components that are shared by all prospects under consideration. This tendency, called the isolation effect, leads to inconsistent preferences when the same choice is presented in different forms. An alternative theory of choice is developed, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights. The value function is normally concave for gains, commonly convex for losses, and is generally steeper for losses than for gains. Decision weights are generally lower than the corresponding probabilities, except in the range of low probabilities. Overweighting of low probabilities may contribute to the attractiveness of both insurance and gambling. Expected utility theory has dominated the analysis of decision making under risk. It has been generally accepted as a normative model of rational choice (24), and widely applied as a descriptive model of economic behavior, e.g. (15, 4). Thus, it is assumed that all reasonable people would wish to obey the axioms of the theory (47, 36), and that most people actually do, most of the time. The present paper describes several classes of choice problems in which preferences systematically violate the axioms of expected utility theory. In the light of these observations we argue that utility theory, as it is commonly interpreted and applied, is not an adequate descriptive model, and we propose an alternative account of choice under risk.
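The value function's qualitative shape described above (concave for gains, convex for losses, steeper for losses) can be sketched with a power-form parametrization. The specific functional form and the coefficients below come from the authors' later cumulative prospect theory work and are used here only as an illustration.

```python
# Hedged sketch of a prospect-theory value function.
# alpha/beta/lam values are from Tversky & Kahneman (1992), not this paper.
def value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Value of a gain/loss x relative to the reference point (x = 0)."""
    return x ** alpha if x >= 0 else -lam * ((-x) ** beta)

print(value(100))   # subjective value of a 100-unit gain (~57.5)
print(value(-100))  # the matching loss looms larger (~-129.4, about 2.25x)
```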

35,067 citations

Book ChapterDOI
09 Jan 2004
TL;DR: An outline of a theory of intergroup conflict and some preliminary data relating to the theory are presented in this chapter. The analysis, however, is limited to cases where the salient dimensions of intergroup differentiation involve scarce resources.
Abstract: This chapter presents an outline of a theory of intergroup conflict and some preliminary data relating to the theory. Much of the work on the social psychology of intergroup relations has focused on patterns of individual prejudices and discrimination and on the motivational sequences of interpersonal interaction. The intensity of explicit intergroup conflicts of interests is closely related in human cultures to the degree of opprobrium attached to the notion of "renegade" or "traitor." The basic and highly reliable finding is that the trivial, ad hoc intergroup categorization leads to in-group favoritism and discrimination against the out-group. Many orthodox definitions of "social groups" are unduly restrictive when applied to the context of intergroup relations. The equation of social competition and intergroup conflict rests on the assumptions concerning an "ideal type" of social stratification in which the salient dimensions of intergroup differentiation are those involving scarce resources.

14,812 citations

Journal ArticleDOI
TL;DR: Cumulative prospect theory applies to uncertain as well as to risky prospects with any number of outcomes and allows different weighting functions for gains and for losses; two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting functions.
Abstract: We develop a new version of prospect theory that employs cumulative rather than separable decision weights and extends the theory in several respects. This version, called cumulative prospect theory, applies to uncertain as well as to risky prospects with any number of outcomes, and it allows different weighting functions for gains and for losses. Two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting functions. A review of the experimental evidence and the results of a new experiment confirm a distinctive fourfold pattern of risk attitudes: risk aversion for gains and risk seeking for losses of high probability; risk seeking for gains and risk aversion for losses of low probability. Expected utility theory reigned for several decades as the dominant normative and descriptive model of decision making under uncertainty, but it has come under serious question in recent years. There is now general agreement that the theory does not provide an adequate description of individual choice: a substantial body of evidence shows that decision makers systematically violate its basic tenets. Many alternative models have been proposed in response to this empirical challenge (for reviews, see Camerer, 1989; Fishburn, 1988; Machina, 1987). Some time ago we presented a model of choice, called prospect theory, which explained the major violations of expected utility theory in choices between risky prospects with a small number of outcomes (Kahneman and Tversky, 1979; Tversky and Kahneman, 1986). The key elements of this theory are 1) a value function that is concave for gains, convex for losses, and steeper for losses than for gains, and 2) a nonlinear transformation of the probability scale, which overweights small probabilities and underweights moderate and high probabilities.
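The nonlinear probability weighting can be sketched with the inverse-S-shaped one-parameter form used in the paper, w(p) = p^γ / (p^γ + (1-p)^γ)^(1/γ); γ = 0.61 is the paper's estimate for gains. Small probabilities come out overweighted and large ones underweighted, matching the fourfold pattern described above.

```python
# Hedged sketch of the cumulative-prospect-theory weighting function.
def weight(p, gamma=0.61):
    """Decision weight for probability p (gamma = 0.61 for gains per the paper)."""
    return p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma)

for p in (0.01, 0.1, 0.5, 0.9, 0.99):
    print(f"w({p}) = {weight(p):.3f}")  # small p overweighted, large p underweighted
```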

13,433 citations
