Journal ArticleDOI

Fake News Detection on Social Media: A Data Mining Perspective

01 Sep 2017-Sigkdd Explorations (ACM)-Vol. 19, Iss: 1, pp 22-36
TL;DR: The authors present a comprehensive review of detecting fake news on social media, including fake news characterizations grounded in psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics, and representative datasets.
Abstract: Social media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from social media. On the other hand, it enables the wide spread of "fake news", i.e., low-quality news with intentionally false information. The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Therefore, fake news detection on social media has recently become an emerging research topic that is attracting tremendous attention. Fake news detection on social media presents unique characteristics and challenges that make existing detection algorithms from traditional news media ineffective or not applicable. First, fake news is intentionally written to mislead readers into believing false information, which makes it difficult and nontrivial to detect based on news content alone; therefore, we need to include auxiliary information, such as user social engagements on social media, to help make a determination. Second, exploiting this auxiliary information is challenging in and of itself, as users' social engagements with fake news produce data that is big, incomplete, unstructured, and noisy. Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including fake news characterizations grounded in psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics, and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media.
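The survey reviews evaluation metrics for fake news detection; the standard precision, recall, and F1 computations it refers to can be sketched in Python. The labels and predictions below are hypothetical, for illustration only:

```python
def precision_recall_f1(y_true, y_pred, positive="fake"):
    """Standard detection metrics, treating `positive` as the fake-news class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical detector output on six articles:
y_true = ["fake", "fake", "real", "real", "fake", "real"]
y_pred = ["fake", "real", "real", "fake", "fake", "real"]
p, r, f = precision_recall_f1(y_true, y_pred)
```

Treating "fake" as the positive class means precision penalizes false alarms on real news, while recall penalizes fake stories that slip through undetected.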
Citations
Journal ArticleDOI
14 Oct 2020
TL;DR: In this article, the authors focus on the problem of fact-checking online information in the context of Bangladesh, a country in the Global South, where most people in the 'news audience' want the news media to verify the authenticity of the information they see online.
Abstract: There has been a growing interest within the CSCW community in understanding the characteristics of misinformation propagated through computational media, and in devising techniques to address the associated challenges. However, most work in this area has concentrated on cases in the Western world, leaving largely unaddressed the portion of the problem situated in the Global South. This paper aims to broaden the scope of this discourse by focusing on the problem in the context of Bangladesh, a country in the Global South. The spread of misinformation on Facebook in Bangladesh, a country with a population of over 163 million, has resulted in chaos, hate attacks, and killings. By interviewing journalists and fact-checkers, and surveying the general public, we analyzed the current state of verifying misinformation in Bangladesh. Our findings show that most people in the 'news audience' want the news media to verify the authenticity of the information they see online. However, newspaper journalists say that fact-checking online information is not part of their job, and that it is beyond their capacity given the amount of information published online every day. We further find that the voluntary fact-checkers in Bangladesh are not equipped with sufficient infrastructural support to fill this gap. We show how our findings connect to some of the core concerns of the CSCW community around social media, collaboration, infrastructural politics, and information inequality. From our analysis, we also suggest several pathways to increase the impact of fact-checking efforts through collaboration, technology design, and infrastructure development.

22 citations

Journal ArticleDOI
TL;DR: The authors propose a fine-grained, social-psychology-grounded taxonomy with six categories to capture the different dimensions of the stereotype (positive vs. negative) and annotate a novel StereoImmigrants dataset with sentences that Spanish politicians have stated in the Congress of Deputies.
Abstract: Stereotypes are a type of social bias massively present in the texts that computational models use. Some stereotypes present special difficulties because they do not rely on personal attributes. This is the case for stereotypes about immigrants, a social category that is a preferred target of hate speech and discrimination. We propose a new approach to detecting stereotypes about immigrants in texts, focusing not on the personal attributes assigned to the minority but on the frames, that is, the narrative scenarios, in which the group is placed in public speeches. We propose a fine-grained, social-psychology-grounded taxonomy with six categories to capture the different dimensions of the stereotype (positive vs. negative) and annotate a novel StereoImmigrants dataset with sentences that Spanish politicians have stated in the Congress of Deputies. We aggregate these categories into two supracategories: Victims, which expresses the positive stereotype about immigrants, and Threat, which expresses the negative one. We carried out two preliminary experiments: first, to evaluate the automatic detection of stereotypes; and second, to distinguish between the two supracategories of immigrant stereotypes. In these experiments, we employed state-of-the-art transformer models (monolingual and multilingual) and four classical machine learning classifiers. We achieve accuracy above 0.83 with the BETO model in both experiments, showing that transformers can capture stereotypes about immigrants with a high level of accuracy.

22 citations

Journal ArticleDOI
TL;DR: This research explored how ungeotagged tweets can be used to understand traffic events in India, developing a novel framework that not only categorizes traffic-related tweets but also extracts the locations of traffic events from tweet content in Greater Mumbai.
Abstract: Detecting traffic events and their locations is important for an effective transportation management system and better urban policy making. Traffic events include traffic accidents, congestion, and parking issues, to name a few. Currently, traffic events are detected through static sensors, e.g., CCTV cameras and loop detectors. However, these have limited spatial coverage and high maintenance costs, especially in developing regions. On the other hand, with Web 2.0 and ubiquitous mobile platforms, people can act as social sensors, sharing different traffic events along with their locations. We investigated whether Twitter, a social media platform, can be useful for understanding urban traffic events from tweets in India. However, such tweets are informal and noisy and contain vernacular geographical information, making the location retrieval task challenging. So far, most authors have used geotagged tweets to identify traffic events, but these account for only 0.1%-3% of tweets, and sometimes less. Recently, Twitter removed precise geotagging, further decreasing the utility of such approaches. To address these issues, this research explored how ungeotagged tweets could be used to understand traffic events in India. We developed a novel framework that not only categorizes traffic-related tweets but also extracts the locations of traffic events from tweet content in Greater Mumbai. The results show that an SVM-based model performs best at detecting traffic-related tweets. For extracting location information, a hybrid georeferencing model consisting of a supervised learning algorithm and a number of spatial rules outperforms other models. The results suggest that people in India, especially in Greater Mumbai, often share traffic information along with location mentions, which can be used to complement existing physical transport infrastructure in a cost-effective manner to manage transport services in the urban environment.
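The finding that an SVM-based model performs best at detecting traffic-related tweets can be illustrated with a minimal scikit-learn sketch. The tweets, labels, and TF-IDF pipeline below are illustrative assumptions, not the paper's actual data or feature set:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical labeled tweets; a real system needs thousands of examples.
tweets = [
    "huge traffic jam near andheri flyover avoid the route",
    "accident on western express highway traffic moving slowly",
    "heavy congestion at dadar signal since morning",
    "road blocked at bandra due to waterlogging traffic stuck",
    "parking chaos outside the stadium traffic crawling",
    "signal failure causing long traffic queues at kurla",
    "great lunch with friends at the new cafe today",
    "watching the cricket match what a fantastic innings",
    "new phone launch looks amazing cannot wait",
    "monsoon skies over the city look beautiful tonight",
    "finished reading a wonderful novel this weekend",
    "excited about the music festival next month",
]
labels = ["traffic"] * 6 + ["other"] * 6

# TF-IDF features feeding a linear SVM classifier.
model = make_pipeline(TfidfVectorizer(), LinearSVC(random_state=0))
model.fit(tweets, labels)

pred = model.predict(["massive jam and slow traffic near the highway"])[0]
```

Extracting the mentioned locations (the paper's hybrid georeferencing step) would then run on the tweets this classifier flags as traffic-related.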

21 citations

Proceedings ArticleDOI
11 Jul 2021
TL;DR: The authors propose Rumor Detection on social media with Event Augmentations (RDEA), which integrates three augmentation strategies, modifying both reply attributes and event structure, to extract meaningful rumor propagation patterns and to learn intrinsic representations of user engagement.
Abstract: With the rapid growth of digital data on the Internet, rumor detection on social media has been vital. Existing deep learning-based methods have achieved promising results due to their ability to learn high-level representations of rumors. Despite the success, we argue that these approaches require large reliable labeled data to train, which is time-consuming and data-inefficient. To address this challenge, we present a new solution, Rumor Detection on social media with Event Augmentations (RDEA), which innovatively integrates three augmentation strategies by modifying both reply attributes and event structure to extract meaningful rumor propagation patterns and to learn intrinsic representations of user engagement. Moreover, we introduce contrastive self-supervised learning for the efficient implementation of event augmentations and alleviate limited data issues. Extensive experiments conducted on two public datasets demonstrate that RDEA achieves state-of-the-art performance over existing baselines. Besides, we empirically show the robustness of RDEA when labeled data are limited.
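The abstract names contrastive self-supervised learning without specifying the objective; a minimal NumPy sketch of a standard NT-Xent contrastive loss (SimCLR-style, an assumption about the general technique rather than RDEA's exact formulation) over a batch of augmented-pair embeddings:

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss over a batch of positive pairs (z1[i], z2[i]):
    each embedding is pulled toward its augmented partner and pushed
    away from every other embedding in the batch."""
    z = np.concatenate([z1, z2], axis=0).astype(float)   # (2N, d)
    z /= np.linalg.norm(z, axis=1, keepdims=True)        # cosine geometry
    sim = z @ z.T / temperature                          # scaled similarities
    np.fill_diagonal(sim, -np.inf)                       # drop self-pairs
    n = len(z1)
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])  # partner index
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()

# Aligned pairs (identical views) should score a lower loss
# than mismatched pairs.
aligned = nt_xent_loss(np.eye(2), np.eye(2))
mismatched = nt_xent_loss(np.eye(2), np.eye(2)[::-1])
```

A lower loss for aligned pairs than for mismatched ones is what drives augmented views of the same event toward similar representations, which is how such objectives alleviate limited labeled data.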

21 citations

Journal ArticleDOI
TL;DR: A novel system is proposed for detecting fake news articles based on content-based features and WOA-Xgbtree, an Extreme Gradient Boosting tree algorithm optimized by the Whale Optimization Algorithm, to classify news articles using the extracted features.

21 citations

References
Journal ArticleDOI
28 May 2015-Nature
TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
Abstract: Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
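The backpropagation mechanism the abstract describes, using the loss gradient to tell each layer how to change its parameters, can be shown with a minimal two-layer network in NumPy (a toy regression problem, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: learn y = x1 + x2 with a tiny two-layer network.
X = rng.normal(size=(16, 2))
y = X.sum(axis=1, keepdims=True)

W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)      # hidden representation
    return h, h @ W2 + b2         # prediction

def mse(pred, y):
    return ((pred - y) ** 2).mean()

h, pred = forward(X)
loss_before = mse(pred, y)

# One backpropagation step: gradients flow from the loss
# backwards through each layer's parameters.
g_pred = 2 * (pred - y) / len(X)        # dL/dpred
gW2 = h.T @ g_pred; gb2 = g_pred.sum(0)
g_h = (g_pred @ W2.T) * (1 - h ** 2)    # chain rule through tanh
gW1 = X.T @ g_h;     gb1 = g_h.sum(0)

lr = 0.05
W1 -= lr * gW1; b1 -= lr * gb1
W2 -= lr * gW2; b2 -= lr * gb2

_, pred_after = forward(X)
loss_after = mse(pred_after, y)
```

A single gradient step with a small learning rate lowers the training loss, which is the basic mechanism by which each layer's representation is adjusted from the one before it.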

46,982 citations

Book ChapterDOI
TL;DR: In this paper, the authors present a critique of expected utility theory as a descriptive model of decision making under risk, and develop an alternative model, called prospect theory, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights.
Abstract: This paper presents a critique of expected utility theory as a descriptive model of decision making under risk, and develops an alternative model, called prospect theory. Choices among risky prospects exhibit several pervasive effects that are inconsistent with the basic tenets of utility theory. In particular, people underweight outcomes that are merely probable in comparison with outcomes that are obtained with certainty. This tendency, called the certainty effect, contributes to risk aversion in choices involving sure gains and to risk seeking in choices involving sure losses. In addition, people generally discard components that are shared by all prospects under consideration. This tendency, called the isolation effect, leads to inconsistent preferences when the same choice is presented in different forms. An alternative theory of choice is developed, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights. The value function is normally concave for gains, commonly convex for losses, and is generally steeper for losses than for gains. Decision weights are generally lower than the corresponding probabilities, except in the range of low probabilities. Overweighting of low probabilities may contribute to the attractiveness of both insurance and gambling. EXPECTED UTILITY THEORY has dominated the analysis of decision making under risk. It has been generally accepted as a normative model of rational choice (24), and widely applied as a descriptive model of economic behavior, e.g. (15, 4). Thus, it is assumed that all reasonable people would wish to obey the axioms of the theory (47, 36), and that most people actually do, most of the time. The present paper describes several classes of choice problems in which preferences systematically violate the axioms of expected utility theory.
In the light of these observations we argue that utility theory, as it is commonly interpreted and applied, is not an adequate descriptive model, and we propose an alternative account of choice under risk.

35,067 citations

Book ChapterDOI
09 Jan 2004
TL;DR: A theory of intergroup conflict and some preliminary data relating to the theory are presented. The analysis is limited to the case where the salient dimensions of intergroup differentiation are those involving scarce resources.
Abstract: This chapter presents an outline of a theory of intergroup conflict and some preliminary data relating to the theory. Much of the work on the social psychology of intergroup relations has focused on patterns of individual prejudices and discrimination and on the motivational sequences of interpersonal interaction. The intensity of explicit intergroup conflicts of interests is closely related in human cultures to the degree of opprobrium attached to the notion of "renegade" or "traitor." The basic and highly reliable finding is that the trivial, ad hoc intergroup categorization leads to in-group favoritism and discrimination against the out-group. Many orthodox definitions of "social groups" are unduly restrictive when applied to the context of intergroup relations. The equation of social competition and intergroup conflict rests on the assumptions concerning an "ideal type" of social stratification in which the salient dimensions of intergroup differentiation are those involving scarce resources.

14,812 citations

Journal ArticleDOI
TL;DR: Cumulative prospect theory as discussed by the authors applies to uncertain as well as to risky prospects with any number of outcomes, and it allows different weighting functions for gains and for losses, and two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting function.
Abstract: We develop a new version of prospect theory that employs cumulative rather than separable decision weights and extends the theory in several respects. This version, called cumulative prospect theory, applies to uncertain as well as to risky prospects with any number of outcomes, and it allows different weighting functions for gains and for losses. Two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting functions. A review of the experimental evidence and the results of a new experiment confirm a distinctive fourfold pattern of risk attitudes: risk aversion for gains and risk seeking for losses of high probability; risk seeking for gains and risk aversion for losses of low probability. Expected utility theory reigned for several decades as the dominant normative and descriptive model of decision making under uncertainty, but it has come under serious question in recent years. There is now general agreement that the theory does not provide an adequate description of individual choice: a substantial body of evidence shows that decision makers systematically violate its basic tenets. Many alternative models have been proposed in response to this empirical challenge (for reviews, see Camerer, 1989; Fishburn, 1988; Machina, 1987). Some time ago we presented a model of choice, called prospect theory, which explained the major violations of expected utility theory in choices between risky prospects with a small number of outcomes (Kahneman and Tversky, 1979; Tversky and Kahneman, 1986). The key elements of this theory are 1) a value function that is concave for gains, convex for losses, and steeper for losses than for gains,
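The value and weighting functions described here have a standard parametric form in cumulative prospect theory; the sketch below uses Tversky and Kahneman's 1992 median parameter estimates (alpha = beta = 0.88, lambda = 2.25, gamma = 0.61):

```python
def value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Value function: concave for gains, convex for losses,
    and steeper for losses than for gains (lam > 1 is loss aversion)."""
    return x ** alpha if x >= 0 else -lam * (-x) ** beta

def weight(p, gamma=0.61):
    """Decision weight for probability p: overweights small
    probabilities and underweights moderate-to-high ones."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

# Loss aversion: a loss of 100 looms larger than a gain of 100.
gain, loss = value(100), value(-100)
```

With these parameters the functions reproduce loss aversion (|v(-x)| > v(x)) and the overweighting of small probabilities that the text links to the fourfold pattern of risk attitudes.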

13,433 citations
