Journal ArticleDOI

Fake News Detection on Social Media: A Data Mining Perspective

01 Sep 2017-Sigkdd Explorations (ACM)-Vol. 19, Iss: 1, pp 22-36
TL;DR: This survey presents a comprehensive review of detecting fake news on social media, covering fake news characterizations based on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics, and representative datasets.
Abstract: Social media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from social media. On the other hand, it enables the wide spread of "fake news", i.e., low-quality news with intentionally false information. The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Therefore, fake news detection on social media has recently become an emerging research topic that is attracting tremendous attention. Fake news detection on social media presents unique characteristics and challenges that make existing detection algorithms from traditional news media ineffective or not applicable. First, fake news is intentionally written to mislead readers into believing false information, which makes it difficult and nontrivial to detect based on news content alone; therefore, we need to include auxiliary information, such as user social engagements on social media, to help make a determination. Second, exploiting this auxiliary information is challenging in and of itself, as users' social engagements with fake news produce data that is big, incomplete, unstructured, and noisy. Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including fake news characterizations based on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics, and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media.
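
To make the idea of combining news-content and social-context signals concrete, the sketch below (my own illustration, not the survey's method) feeds TF-IDF text features plus two hypothetical social-engagement features to a logistic-regression detector; the feature names and toy examples are assumptions.

import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = [
    "Celebrity X secretly endorses miracle cure",   # hypothetical fake story
    "City council approves new transit budget",     # hypothetical real story
]
labels = [1, 0]  # 1 = fake, 0 = real

# News-content features: TF-IDF over the article text.
vectorizer = TfidfVectorizer()
X_text = vectorizer.fit_transform(texts)

# Social-context features (hypothetical): how widely and by whom the story was shared.
X_social = csr_matrix(np.array([
    [5400, 12.0],    # share_count, avg_user_account_age_days
    [180, 900.0],
]))

X = hstack([X_text, X_social])
clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))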
Citations
Journal ArticleDOI
TL;DR: In this article, the authors conducted a meta-analysis of 125 studies to aggregate their results quantitatively, focusing on statistical analysis of data from numerous separate primary investigations to identify overall trends.
Abstract: The ubiquitous access and exponential growth of information available on social media networks have facilitated the spread of fake news, complicating the task of distinguishing between this and real news. Fake news is a significant social problem that has a profoundly negative impact on society. Despite the large number of studies on fake news detection, they have not yet been combined to offer coherent insight on trends and advancements in this domain. Hence, the primary objective of this study was to fill this knowledge gap. The method for selecting the pertinent articles for extraction was created using the preferred reporting items for systematic reviews and meta-analyses (PRISMA). This study reviewed deep learning, machine learning, and ensemble-based fake news detection methods through a meta-analysis of 125 studies to aggregate their results quantitatively. The meta-analysis primarily focused on statistical analysis of data from numerous separate primary investigations to identify overall trends. The results of the meta-analysis were reported by spatial distribution, the approaches adopted, the sample size, and the performance of methods in terms of accuracy. According to the between-study variance statistics, high heterogeneity was found, with τ² = 3.441; the ratio of true heterogeneity to total observed variation was I² = 75.27%, with a heterogeneity chi-square (Q) of 501.34, 124 degrees of freedom, and p ≤ 0.001. A p-value of 0.912 from Egger's statistical test confirmed the absence of publication bias. The findings of the meta-analysis indicated that the approaches recommended in the included primary studies on fake news detection were satisfactorily effective. Furthermore, the findings can inform researchers about various approaches they can use to detect online fake news.
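
As a quick consistency check on the heterogeneity figures above, the standard relation I² = (Q − df) / Q × 100 reproduces the reported value; the snippet below is my own arithmetic, not code from the study.

Q, df = 501.34, 124
I2 = max(0.0, (Q - df) / Q) * 100
print(f"I^2 = {I2:.2f}%")  # ~75.27%, matching the value reported above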

2 citations

Journal ArticleDOI
TL;DR: In this paper, the authors argue that only an institutional approach can adequately take up the challenge against the corresponding institution of fake news, and propose marking reliable information with a trademark and news signing, drawing on the notions of approved information and brand equity.
Abstract: This paper argues that Michael Polanyi’s account of how science, as an institution, establishes knowledge can provide a structure for a future institution capable of countering misinformation, or fake news, and deepfakes. I argue that only an institutional approach can adequately take up the challenge against the corresponding institution of fake news. Filtering news and information can be troubling, since it carries the threat of censorship and limits on free speech. Instead, I propose marking reliable information with a trademark and news signing, drawing on the notions of approved information and brand equity. I offer a method of creating a standard for online news that people can rely on (similar to high-quality consumer products).

2 citations

Proceedings ArticleDOI
30 Nov 2020
TL;DR: This work proposes FakeNewsSetGen, a process that builds Fake News datasets that contain news propagation data and support comparison among state-of-the-art MLFN, and illustrates its viability and adequacy.
Abstract: Due to easy access and low cost, social media online news consumption has increased significantly over the last decade. Despite their benefits, some social media platforms allow anyone to post news with intense spreading power, which amplifies an old problem: the dissemination of Fake News. In the face of this scenario, several machine learning-based methods to automatically detect Fake News (MLFN) have been proposed. All of them require datasets to train and evaluate their detection models. Although recent MLFN were designed to consider data regarding the news propagation on social media, most of the few available datasets do not contain this kind of data. Hence, comparing the performance of those recent MLFN with the others is restricted to a very limited number of datasets. Moreover, none of the existing datasets with propagation data contain news in Portuguese, which impairs the evaluation of MLFN in this language. Thus, this work proposes FakeNewsSetGen, a process that builds Fake News datasets that contain news propagation data and support comparison among state-of-the-art MLFN. FakeNewsSetGen's software engineering process was guided to include all kinds of data required by the existing MLFN. In order to illustrate FakeNewsSetGen's viability and adequacy, a case study was carried out. It encompassed the implementation of a FakeNewsSetGen prototype and the application of this prototype to create a dataset called FakeNewsSet, with news in Portuguese. Five MLFN with different kinds of data requirements (two of them demanding news propagation data) were applied to FakeNewsSet and compared, demonstrating the potential use of both the proposed process and the created dataset.
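
The kind of record such a propagation-aware dataset might hold can be sketched as follows; the field names and structure are hypothetical illustrations, not FakeNewsSet's actual schema.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Retweet:                      # one propagation event (hypothetical fields)
    user_id: str
    user_followers: int
    seconds_after_post: int

@dataclass
class NewsRecord:
    news_id: str
    text: str                       # news content, e.g. in Portuguese
    label: str                      # "fake" or "real"
    source_url: str
    retweets: List[Retweet] = field(default_factory=list)  # propagation data

record = NewsRecord("n001", "Texto da notícia ...", "fake",
                    "https://example.com/noticia",
                    [Retweet("u42", 1500, 320)])
print(len(record.retweets))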

2 citations

Journal ArticleDOI
TL;DR: In this article , a new and automated procedure for the monitoring of online news accuracy is proposed, which relies on the fact that online news articles are often updated after initial publication, thereby also correcting errors.
Abstract: In the past decade, news consumption has shifted from printed news media to online alternatives. Although these come with advantages, online news poses challenges as well. Notable here is the increased competition between online newspapers and other online news providers to attract readers. Hereby, speed is often favored over quality. As a consequence, the need for new tools to monitor online news accuracy has grown. In this work, a fundamentally new and automated procedure for the monitoring of online news accuracy is proposed. The approach relies on the fact that online news articles are often updated after initial publication, thereby also correcting errors. Automated observation of the changes being made to online articles and detection of the errors that are corrected may offer useful insights concerning news accuracy. The potential of the presented automated error correction detection model is illustrated by building supervised classification models for the detection of objective, subjective and linguistic errors in online news updates respectively. The models are built using a large news update data set being collected during two consecutive years for six different Flemish online newspapers. A subset of 21,129 changes is then annotated using a combination of automated and human annotation via an online annotation platform. Finally, manually crafted features and text embeddings obtained by four different language models (TF-IDF, word2vec, BERTje and SBERT) are fed to three supervised machine learning algorithms (logistic regression, support vector machines and decision trees) and performance of the obtained models is subsequently evaluated. Results indicate that small differences in performance exist between the different learning algorithms and language models. Using the best-performing models, F 2 -scores of 0.45, 0.25 and 0.80 are obtained for the classification of objective, subjective and linguistic errors respectively. • Online news accuracy can be monitored by exploiting information in news updates. • A dataset consisting of 21,129 changes to online news is gathered and annotated. • Automated classification models are built for error detection in online news. • Performance of three learning algorithms and four language models is evaluated. • Good predictive capability is obtained for objective and linguistic error detection.
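
A minimal sketch of the kind of pipeline described above (TF-IDF features on an update's text fed to logistic regression, scored with the recall-weighted F2 measure) is shown below; the toy change snippets and labels are invented, and the setup is an assumption about the general approach rather than the authors' code.

from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import fbeta_score

# hypothetical news-update changes and whether each one corrects an error
changes = [
    "minister said 40 million, not 4 million",    # objective error corrected
    "fixed a speling mistake in the lead",        # linguistic error corrected
    "rephrased the headline for tone",            # stylistic update, no error
    "corrected the match score from 2-1 to 3-1",  # objective error corrected
]
is_error_correction = [1, 1, 0, 1]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(changes, is_error_correction)

preds = model.predict(changes)
# beta=2 weights recall more heavily than precision, as in the F2-scores above
print(fbeta_score(is_error_correction, preds, beta=2))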

2 citations

References
Journal ArticleDOI
28 May 2015-Nature
TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
Abstract: Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
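
The mechanism the abstract describes, using backpropagation to indicate how each layer's parameters should change given the error signal propagated from the layer above, can be illustrated with a tiny two-layer network; this NumPy sketch illustrates the general idea and is not code from the paper.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))                   # toy inputs
y = (X.sum(axis=1, keepdims=True) > 0) * 1.0  # toy binary targets

W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(4, 1))
lr = 0.1
for _ in range(200):
    # forward pass: each layer computes a new representation of its input
    h = np.tanh(X @ W1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2)))       # output probability
    # backward pass: propagate the error signal layer by layer (chain rule)
    grad_out = (p - y) / len(X)               # gradient of cross-entropy w.r.t. the logit
    grad_W2 = h.T @ grad_out
    grad_h = grad_out @ W2.T * (1 - h ** 2)   # back through the tanh nonlinearity
    grad_W1 = X.T @ grad_h
    W1 -= lr * grad_W1                        # each layer's parameters are updated
    W2 -= lr * grad_W2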

46,982 citations

Book ChapterDOI
TL;DR: In this paper, the authors present a critique of expected utility theory as a descriptive model of decision making under risk, and develop an alternative model, called prospect theory, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights.
Abstract: This paper presents a critique of expected utility theory as a descriptive model of decision making under risk, and develops an alternative model, called prospect theory. Choices among risky prospects exhibit several pervasive effects that are inconsistent with the basic tenets of utility theory. In particular, people underweight outcomes that are merely probable in comparison with outcomes that are obtained with certainty. This tendency, called the certainty effect, contributes to risk aversion in choices involving sure gains and to risk seeking in choices involving sure losses. In addition, people generally discard components that are shared by all prospects under consideration. This tendency, called the isolation effect, leads to inconsistent preferences when the same choice is presented in different forms. An alternative theory of choice is developed, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights. The value function is normally concave for gains, commonly convex for losses, and is generally steeper for losses than for gains. Decision weights are generally lower than the corresponding probabilities, except in the range of low probabilities. Overweighting of low probabilities may contribute to the attractiveness of both insurance and gambling. EXPECTED UTILITY THEORY has dominated the analysis of decision making under risk. It has been generally accepted as a normative model of rational choice (24), and widely applied as a descriptive model of economic behavior, e.g. (15, 4). Thus, it is assumed that all reasonable people would wish to obey the axioms of the theory (47, 36), and that most people actually do, most of the time. The present paper describes several classes of choice problems in which preferences systematically violate the axioms of expected utility theory. In the light of these observations we argue that utility theory, as it is commonly interpreted and applied, is not an adequate descriptive model and we propose an alternative account of choice under risk.
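
The shape of the value function described above (concave for gains, convex for losses, and steeper for losses than for gains) is commonly illustrated with a power function; the parameters below are the estimates usually associated with the later cumulative version of the theory and are included only as an assumed illustration, not as figures from this paper.

def value(x, alpha=0.88, lam=2.25):
    # concave for gains, convex for losses, and steeper for losses (lam > 1)
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha

print(value(100), value(-100))  # the loss of 100 looms larger than the gain of 100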

35,067 citations

Book ChapterDOI
09 Jan 2004
TL;DR: A theory of intergroup conflict and some preliminary data relating to the theory are presented in this article. However, the analysis is limited to the case where the salient dimensions of the intergroup differentiation are those involving scarce resources.
Abstract: This chapter presents an outline of a theory of intergroup conflict and some preliminary data relating to the theory. Much of the work on the social psychology of intergroup relations has focused on patterns of individual prejudices and discrimination and on the motivational sequences of interpersonal interaction. The intensity of explicit intergroup conflicts of interests is closely related in human cultures to the degree of opprobrium attached to the notion of "renegade" or "traitor." The basic and highly reliable finding is that the trivial, ad hoc intergroup categorization leads to in-group favoritism and discrimination against the out-group. Many orthodox definitions of "social groups" are unduly restrictive when applied to the context of intergroup relations. The equation of social competition and intergroup conflict rests on the assumptions concerning an "ideal type" of social stratification in which the salient dimensions of intergroup differentiation are those involving scarce resources.

14,812 citations

Journal ArticleDOI
TL;DR: Cumulative prospect theory, as developed by the authors, applies to uncertain as well as to risky prospects with any number of outcomes and allows different weighting functions for gains and for losses; two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting functions.
Abstract: We develop a new version of prospect theory that employs cumulative rather than separable decision weights and extends the theory in several respects. This version, called cumulative prospect theory, applies to uncertain as well as to risky prospects with any number of outcomes, and it allows different weighting functions for gains and for losses. Two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting functions. A review of the experimental evidence and the results of a new experiment confirm a distinctive fourfold pattern of risk attitudes: risk aversion for gains and risk seeking for losses of high probability; risk seeking for gains and risk aversion for losses of low probability. Expected utility theory reigned for several decades as the dominant normative and descriptive model of decision making under uncertainty, but it has come under serious question in recent years. There is now general agreement that the theory does not provide an adequate description of individual choice: a substantial body of evidence shows that decision makers systematically violate its basic tenets. Many alternative models have been proposed in response to this empirical challenge (for reviews, see Camerer, 1989; Fishburn, 1988; Machina, 1987). Some time ago we presented a model of choice, called prospect theory, which explained the major violations of expected utility theory in choices between risky prospects with a small number of outcomes (Kahneman and Tversky, 1979; Tversky and Kahneman, 1986). The key elements of this theory are 1) a value function that is concave for gains, convex for losses, and steeper for losses than for gains, and 2) a nonlinear transformation of the probability scale, which overweights small probabilities and underweights moderate and high probabilities.
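
The nonlinear decision weights mentioned above are often illustrated with a one-parameter weighting function; the functional form and gamma = 0.61 below are the commonly cited estimates for gains and are used here purely as an assumed illustration.

def weight(p, gamma=0.61):
    # overweights small probabilities, underweights moderate and high ones
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

for p in (0.01, 0.5, 0.99):
    print(p, round(weight(p), 3))  # 0.01 is overweighted; 0.5 and 0.99 are underweighted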

13,433 citations
