Fake News Detection on Social Media: A Data Mining Perspective
TL;DR: This survey presents a comprehensive review of detecting fake news on social media, including fake news characterizations on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics and representative datasets.
Abstract: Social media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from social media. On the other hand, it enables the wide spread of "fake news", i.e., low quality news with intentionally false information. The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Therefore, fake news detection on social media has recently become an emerging research area that is attracting tremendous attention. Fake news detection on social media presents unique characteristics and challenges that make existing detection algorithms from traditional news media ineffective or not applicable. First, fake news is intentionally written to mislead readers to believe false information, which makes it difficult and nontrivial to detect based on news content; therefore, we need to include auxiliary information, such as user social engagements on social media, to help make a determination. Second, exploiting this auxiliary information is challenging in and of itself as users' social engagements with fake news produce data that is big, incomplete, unstructured, and noisy. Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including fake news characterizations on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media.
Citations
29 Oct 2019
TL;DR: Preliminary experiments with two real-world datasets provided evidence that the proposed method can detect Fake News without demanding the explicit opinion of the users about the news and without compromising the classification results obtained by the state-of-the-art crowd signal-based method.
Abstract: The proliferation of Fake News on social media has been a source of widespread concern. One of the main approaches to automatically detect this type of news is based on crowd signals, i.e., opinions manifested by social media users concerning whether the news is fake or not. Although promising, this approach depends on information that is not always available: the explicit opinion of the users about the news to be checked. To overcome this drawback, this article proposes a crowd signal-based method that does not demand the users' explicit opinion to detect Fake News. The proposed method infers the users' opinions from their news spreading (publication/propagation) behavior. Preliminary experiments with two real-world datasets provided evidence that the proposed method can detect Fake News without demanding the explicit opinion of the users about the news and without compromising the classification results obtained by the state-of-the-art crowd signal-based method.
7 citations
30 Nov 2020
TL;DR: This study proposes an extended method that, in addition to the grammatical classification and polarity based sentiment analysis, also uses the analysis of emotions to detect Fake News written in Portuguese.
Abstract: In the last decades, the dissemination of News through digital media has increased the information accessibility previously offered by traditional channels. Despite their benefits, digital media have exacerbated an old problem: the spread of Fake News (i.e., false News intentionally published). Faced with this scenario, the linguistic approaches to automatic Fake News detection use information that can be directly extracted from the News' text. Several methods based on these approaches use grammatical classification and sentiment analysis over News written in Portuguese. However, as far as it was possible to observe in the related literature, these methods are limited to identifying the polarity of sentiment (i.e., positive, neutral or negative) in the text. Although polarity classification is an effective method for a wide range of natural language processing applications, it does not address language nuances (e.g., emotions such as anger, sadness, etc.) that can provide evidence that a text contains false information. Hence, this study proposes an extended method that, in addition to the grammatical classification and polarity based sentiment analysis, also uses the analysis of emotions to detect Fake News written in Portuguese. The extended method showed promising results on experimental data, obtaining accuracy greater than 92%. On average, the proposed method outperformed polarity and grammatical classification based methods by 1.4 percentage points.
7 citations
TL;DR: In this article, the authors compare and classify current research works according to multiple features, including the use of Social Network Analysis and Social Capital models, users' motivations for participation and organizational costs, and adoption of the social media platform from below. The results show that many of these current systems are developed without taking into proper consideration the social structures and processes, with some notable and positive exceptions.
Abstract: This article describes the current landscape in the fields of social media and socio-technical systems. In particular, it analyzes the different ways in which social media are adopted in organizations, workplaces, educational and smart environments. One interesting aspect of this integration is the use of social media for members’ participation and access to the processes and services of their organization. Those services cover many different types of daily routines and life activities, such as health, education, and transport. In this survey, we compare and classify current research works according to multiple features, including: the use of Social Network Analysis and Social Capital models, users’ motivations for participation and organizational costs, and adoption of the social media platform from below. Our results show that many of these current systems are developed without taking into proper consideration the social structures and processes, with some notable and positive exceptions.
7 citations
13 Sep 2021
TL;DR: This article proposed a zero-shot cross-lingual transfer learning framework that can adapt a rumour detection model trained for a source language to another target language by using pre-trained multilingual language models (e.g. multilingual BERT) and a self-training loop to iteratively bootstrap the creation of "silver labels" in the target language.
Abstract: Most rumour detection models for social media are designed for one specific language (mostly English). There are over 40 languages on Twitter and most languages lack annotated resources to build rumour detection models. In this paper we propose a zero-shot cross-lingual transfer learning framework that can adapt a rumour detection model trained for a source language to another target language. Our framework utilises pretrained multilingual language models (e.g. multilingual BERT) and a self-training loop to iteratively bootstrap the creation of “silver labels” in the target language to adapt the model from the source language to the target language. We evaluate our methodology on English and Chinese rumour datasets and demonstrate that our model substantially outperforms competitive benchmarks in both source and target language rumour detection.
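The self-training loop described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: a toy nearest-centroid classifier stands in for the multilingual language model, the input vectors stand in for language-independent embeddings, and the names `train`, `predict`, `self_train` and the `threshold` value are hypothetical. It is not the authors' implementation.

```python
def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def train(examples):
    """Fit a nearest-centroid classifier from (vector, label) pairs.

    Stand-in for fine-tuning a multilingual model (assumption).
    """
    by_label = {}
    for vec, label in examples:
        by_label.setdefault(label, []).append(vec)
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def predict(model, vec):
    """Return (label, confidence) for a model with at least two classes.

    Confidence lies in [0.5, 1.0] and grows as the nearest centroid gets
    closer relative to the second nearest one.
    """
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(vec, c)) ** 0.5, label)
        for label, c in model.items()
    )
    (d1, best), (d2, _) = dists[0], dists[1]
    confidence = 1.0 - d1 / (d1 + d2) if d1 + d2 > 0 else 0.5
    return best, confidence


def self_train(source_labeled, target_unlabeled, threshold=0.8, max_rounds=5):
    """Bootstrap "silver labels" for target-language examples.

    Each round: label the remaining target examples with the current model,
    keep only confident predictions as silver labels, then retrain on the
    source data plus all silver labels collected so far.
    """
    model = train(source_labeled)
    silver = []
    remaining = list(target_unlabeled)
    for _ in range(max_rounds):
        confident, uncertain = [], []
        for vec in remaining:
            label, conf = predict(model, vec)
            (confident if conf >= threshold else uncertain).append((vec, label))
        if not confident:
            break  # nothing new to learn from; stop early
        silver.extend(confident)
        remaining = [vec for vec, _ in uncertain]
        model = train(source_labeled + silver)
    return model, silver
```

In this sketch, ambiguous target examples never receive a silver label, which reflects the usual motivation for a confidence threshold: low-confidence pseudo-labels would otherwise feed errors back into the next training round.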
7 citations
21 Mar 2019
TL;DR: This work has shown that unstructured text, as one of the most important data forms, plays a crucial role in data-driven decision making in domains ranging from social networking and information retrieval to scien...
Abstract: Unstructured text, as one of the most important data forms, plays a crucial role in data-driven decision making in domains ranging from social networking and information retrieval to scien...
7 citations
References
TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
Abstract: Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
46,982 citations
TL;DR: In this paper, the authors present a critique of expected utility theory as a descriptive model of decision making under risk, and develop an alternative model, called prospect theory, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights.
Abstract: This paper presents a critique of expected utility theory as a descriptive model of decision making under risk, and develops an alternative model, called prospect theory. Choices among risky prospects exhibit several pervasive effects that are inconsistent with the basic tenets of utility theory. In particular, people underweight outcomes that are merely probable in comparison with outcomes that are obtained with certainty. This tendency, called the certainty effect, contributes to risk aversion in choices involving sure gains and to risk seeking in choices involving sure losses. In addition, people generally discard components that are shared by all prospects under consideration. This tendency, called the isolation effect, leads to inconsistent preferences when the same choice is presented in different forms. An alternative theory of choice is developed, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights. The value function is normally concave for gains, commonly convex for losses, and is generally steeper for losses than for gains. Decision weights are generally lower than the corresponding probabilities, except in the range of low probabilities. Overweighting of low probabilities may contribute to the attractiveness of both insurance and gambling. Expected utility theory has dominated the analysis of decision making under risk. It has been generally accepted as a normative model of rational choice (24), and widely applied as a descriptive model of economic behavior, e.g. (15, 4). Thus, it is assumed that all reasonable people would wish to obey the axioms of the theory (47, 36), and that most people actually do, most of the time. The present paper describes several classes of choice problems in which preferences systematically violate the axioms of expected utility theory. In the light of these observations we argue that utility theory, as it is commonly interpreted and applied, is not an adequate descriptive model and we propose an alternative account of choice under risk.
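The abstract describes the value function only qualitatively: defined over gains and losses rather than final assets, concave for gains, convex for losses, and steeper for losses. The following is a minimal sketch of one function with those properties. The power form and the parameter values (0.88 and 2.25) are assumptions chosen for illustration and are not stated in this abstract; the names `value` and `outcome_value` are hypothetical.

```python
def value(x, alpha=0.88, beta=0.88, loss_aversion=2.25):
    """Subjective value of a change x measured from the reference point.

    Assumed power form: concave for gains (alpha < 1), convex for losses
    (beta < 1), and steeper for losses (loss_aversion > 1).
    """
    if x >= 0:
        return x ** alpha
    return -loss_aversion * (-x) ** beta


def outcome_value(final_assets, reference_point):
    """Value depends on the gain or loss relative to a reference point,
    not on the final asset position itself."""
    return value(final_assets - reference_point)


if __name__ == "__main__":
    # A loss of 100 weighs more heavily than a gain of 100.
    print(value(100.0), value(-100.0))
    # The same final wealth is valued differently from different reference points.
    print(outcome_value(1100.0, 1000.0), outcome_value(1100.0, 1200.0))
```

Under these assumed parameters, doubling a gain less than doubles its value, and doubling a loss less than doubles its negative value, which is one way to express the concave and convex shapes named in the abstract.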
35,067 citations
09 Jan 2004
TL;DR: This chapter presents a theory of intergroup conflict and some preliminary data relating to the theory, with the analysis limited to the case where the salient dimensions of intergroup differentiation are those involving scarce resources.
Abstract: This chapter presents an outline of a theory of intergroup conflict and some preliminary data relating to the theory. Much of the work on the social psychology of intergroup relations has focused on patterns of individual prejudices and discrimination and on the motivational sequences of interpersonal interaction. The intensity of explicit intergroup conflicts of interests is closely related in human cultures to the degree of opprobrium attached to the notion of "renegade" or "traitor." The basic and highly reliable finding is that the trivial, ad hoc intergroup categorization leads to in-group favoritism and discrimination against the out-group. Many orthodox definitions of "social groups" are unduly restrictive when applied to the context of intergroup relations. The equation of social competition and intergroup conflict rests on the assumptions concerning an "ideal type" of social stratification in which the salient dimensions of intergroup differentiation are those involving scarce resources.
14,812 citations
TL;DR: Cumulative prospect theory applies to uncertain as well as to risky prospects with any number of outcomes, and it allows different weighting functions for gains and for losses. Two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting functions.
Abstract: We develop a new version of prospect theory that employs cumulative rather than separable decision weights and extends the theory in several respects. This version, called cumulative prospect theory, applies to uncertain as well as to risky prospects with any number of outcomes, and it allows different weighting functions for gains and for losses. Two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting functions. A review of the experimental evidence and the results of a new experiment confirm a distinctive fourfold pattern of risk attitudes: risk aversion for gains and risk seeking for losses of high probability; risk seeking for gains and risk aversion for losses of low probability. Expected utility theory reigned for several decades as the dominant normative and descriptive model of decision making under uncertainty, but it has come under serious question in recent years. There is now general agreement that the theory does not provide an adequate description of individual choice: a substantial body of evidence shows that decision makers systematically violate its basic tenets. Many alternative models have been proposed in response to this empirical challenge (for reviews, see Camerer, 1989; Fishburn, 1988; Machina, 1987). Some time ago we presented a model of choice, called prospect theory, which explained the major violations of expected utility theory in choices between risky prospects with a small number of outcomes (Kahneman and Tversky, 1979; Tversky and Kahneman, 1986). The key elements of this theory are 1) a value function that is concave for gains, convex for losses, and steeper for losses than for gains,
13,433 citations
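The cumulative decision weights mentioned in the abstract above can be sketched in a few lines. The abstract (truncated in this listing) gives no formulas, so the parametric forms below are assumptions: a power value function and an inverse-S weighting function with commonly cited parameter values (0.88, 2.25, 0.61, 0.69). The function names are hypothetical, and the sketch covers risky prospects with known probabilities only.

```python
def value(x, alpha=0.88, loss_aversion=2.25):
    """Assumed power value function: concave for gains, convex and steeper for losses."""
    return x ** alpha if x >= 0 else -loss_aversion * (-x) ** alpha


def weight(p, gamma):
    """Assumed inverse-S weighting function on [0, 1]; overweights small p."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)


def cpt_value(prospect, gamma_gain=0.61, gamma_loss=0.69):
    """Cumulative prospect value of a list of (outcome, probability) pairs.

    Gains are ranked from best to worst and losses from worst to best.
    Each outcome's decision weight is the difference between the weighted
    cumulative probability including it and the one excluding it, with a
    separate weighting function for gains and for losses.
    """
    gains = sorted((t for t in prospect if t[0] >= 0), key=lambda t: t[0], reverse=True)
    losses = sorted((t for t in prospect if t[0] < 0), key=lambda t: t[0])
    total = 0.0
    for ranked, gamma in ((gains, gamma_gain), (losses, gamma_loss)):
        cumulative = 0.0
        for x, p in ranked:
            upper = min(1.0, cumulative + p)  # guard against float round-off
            total += (weight(upper, gamma) - weight(cumulative, gamma)) * value(x)
            cumulative = upper
    return total


if __name__ == "__main__":
    # Compare a risky prospect with receiving its expected value for sure.
    risky = cpt_value([(100.0, 0.05), (0.0, 0.95)])
    sure = value(5.0)
    print("prefers gamble" if risky > sure else "prefers sure amount")
```

With these assumed parameters the sketch reproduces the fourfold pattern named in the abstract: the sure amount is preferred for high-probability gains and low-probability losses, and the gamble is preferred for low-probability gains and high-probability losses.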