Journal ArticleDOI

Fake News Detection on Social Media: A Data Mining Perspective

01 Sep 2017-Sigkdd Explorations (ACM)-Vol. 19, Iss: 1, pp 22-36
TL;DR: A comprehensive review of detecting fake news on social media is presented, including fake news characterizations based on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics, and representative datasets.
Abstract: Social media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from social media. On the other hand, it enables the wide spread of "fake news", i.e., low-quality news with intentionally false information. The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Therefore, fake news detection on social media has recently become an emerging research topic that is attracting tremendous attention. Fake news detection on social media presents unique characteristics and challenges that make existing detection algorithms from traditional news media ineffective or not applicable. First, fake news is intentionally written to mislead readers into believing false information, which makes it difficult and nontrivial to detect based on news content alone; therefore, we need to include auxiliary information, such as user social engagements on social media, to help make a determination. Second, exploiting this auxiliary information is challenging in and of itself, as users' social engagements with fake news produce data that is big, incomplete, unstructured, and noisy. Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including fake news characterizations based on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics, and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media.
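The abstract's two-signal idea (news content plus social context) can be illustrated with a toy scorer. The cue words, engagement heuristic, and weights below are invented for this sketch; they are not the survey's method.

```python
# Toy illustration (assumption: cue words, weights, and scoring rules are
# invented for this sketch; they are not the survey's method).

def content_score(text, cue_words=("shocking", "secret", "miracle")):
    """Share of sensationalist cue words found in the text."""
    words = set(text.lower().split())
    return sum(w in words for w in cue_words) / len(cue_words)

def social_score(shares, unique_sharers):
    """Many shares from few distinct accounts can hint at coordinated spread."""
    if shares == 0:
        return 0.0
    return 1.0 - unique_sharers / shares

def fake_news_risk(text, shares, unique_sharers, w_content=0.5, w_social=0.5):
    """Blend content and social-context signals into one risk score in [0, 1]."""
    return (w_content * content_score(text)
            + w_social * social_score(shares, unique_sharers))

risk = fake_news_risk("shocking secret cure revealed", shares=1000, unique_sharers=50)
```

Real systems replace both heuristics with learned models, but the combination step (content signal plus social-context signal) mirrors the structure the abstract describes.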
Citations
Proceedings ArticleDOI
11 Aug 2021
TL;DR: A training-free machine learning algorithm, Differential Sentence Semantic Analyses (DSSA), is designed and implemented for fake news detection, achieving 70.4% accuracy and outperforming traditional data-driven neural network technologies, which are typically projected at around 55.0% accuracy.
Abstract: A persistent challenge to AI theories and technologies is fake news recognition, which demands not only syntactic analyses of language expressions but also comprehension of their semantics. This work presents an autonomous system for fake news recognition based on a novel approach of machine semantic learning. A training-free machine learning algorithm of Differential Sentence Semantic Analyses (DSSA) is designed and implemented for fake news detection. A large set of 876 experiments randomly selected from DataCup’19 demonstrated 70.4% accuracy, outperforming traditional data-driven neural network technologies normally projected at the 55.0% accuracy level. The DSSA methodology paves the way toward autonomous, training-free, and real-time trustworthy technologies for machine knowledge learning and semantics composition.
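The abstract does not specify DSSA's internals, so as a loose stand-in, the idea of a training-free, differential comparison between two sentences can be sketched with a simple word-overlap measure; this is only an illustration, not the DSSA algorithm.

```python
# Loose stand-in (assumption: DSSA's actual algorithm is not given in the
# abstract; this sketch only illustrates a training-free, differential
# comparison between two sentences).

def sentence_similarity(a, b):
    """Jaccard overlap of word sets: needs no training data or parameters."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 1.0

def differential_score(claim, reference):
    """Divergence of a claim from a reference statement (0 = identical)."""
    return 1.0 - sentence_similarity(claim, reference)
```

The appeal of such training-free measures, as the abstract argues, is that they need no labeled corpus and run in real time; the trade-off is that surface overlap is a much cruder semantic signal than learned representations.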

2 citations

Journal ArticleDOI
TL;DR: Counterfactual learning on graphs has recently shown promising results in alleviating the drawbacks of GNNs, such as their lack of interpretability, their tendency to inherit bias from the training data, and their inability to model causal relations.
Abstract: Graph-structured data are pervasive in the real world, such as social networks, molecular graphs and transaction networks. Graph neural networks (GNNs) have achieved great success in representation learning on graphs, facilitating various downstream tasks. However, GNNs have several drawbacks: they lack interpretability, can easily inherit bias from the training data, and cannot model causal relations. Recently, counterfactual learning on graphs has shown promising results in alleviating these drawbacks. Various graph counterfactual learning approaches have been proposed for counterfactual fairness, explainability, link prediction and other applications on graphs. To facilitate the development of this promising direction, in this survey, we categorize and comprehensively review papers on graph counterfactual learning. We divide existing methods into four categories based on the research problems studied. For each category, we provide background and motivating examples, a general framework summarizing existing works and a detailed review of these works. We point out promising future research directions at the intersection of graph-structured data, counterfactual learning, and real-world applications. To offer a comprehensive view of resources for future studies, we compile a collection of open-source implementations, public datasets, and commonly used evaluation metrics. This survey aims to serve as a "one-stop shop" for building a unified understanding of graph counterfactual learning categories and current resources. We also maintain a repository for papers and resources and will keep updating it at https://github.com/TimeLovercc/Awesome-Graph-Causal-Learning.

2 citations

Proceedings ArticleDOI
28 Nov 2018
TL;DR: A machine learning method is proposed to evaluate the trustworthiness of a piece of information based on its associated image, reaching an accuracy of about 85% in hoax identification.
Abstract: In the last few years, the impact of information spread through online social networks has continuously grown. For this reason, understanding the trustworthiness of news has become one of the most important challenges for an Internet user, especially during crisis events or in political, health and social issues. As part of a more comprehensive project for the detection of fake news, this paper proposes a machine learning method to evaluate the trustworthiness of a piece of information, especially considering its associated image. In the work described in this paper, the training and test datasets were first collected from the web by downloading more than 1000 images related to trusted and fake Facebook pages. All collected images were processed using the Google Vision online service to extract their specific internal details. For each image, various kinds of features were considered, including its color composition, the recognized objects, the list of sites on which it is published, and any contained text. These details were then used to train classifiers with different algorithms, which allowed us to reach an accuracy of about 85% in hoax identification. Future research will focus on social-network information related to images, to improve the system accuracy and acquire more knowledge about various types of news spread online.
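The pipeline the abstract describes, flattening per-image annotations into a feature vector and training a classifier on it, can be sketched as follows. The annotation keys and the nearest-centroid rule are illustrative assumptions; the paper feeds actual Google Vision output into standard classification algorithms.

```python
# Sketch (assumptions: the annotation keys and the nearest-centroid rule are
# illustrative; the paper trains standard classifiers on Google Vision output).

def image_features(annotations):
    """Flatten Vision-style annotations into a numeric feature vector."""
    return [
        annotations.get("dominant_red", 0.0),        # color composition
        float(len(annotations.get("objects", []))),  # recognized objects
        float(len(annotations.get("pages", []))),    # sites republishing the image
        1.0 if annotations.get("text") else 0.0,     # embedded text present
    ]

def centroid(vectors):
    """Per-dimension mean of a list of feature vectors."""
    return [sum(col) / len(col) for col in zip(*vectors)]

def classify(vec, trusted_c, hoax_c):
    """Assign the label of the nearest class centroid (squared distance)."""
    d = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    return "hoax" if d(vec, hoax_c) < d(vec, trusted_c) else "trusted"

trusted_c = centroid([[0.1, 2, 1, 0], [0.3, 4, 3, 0]])
hoax_c = centroid([[0.9, 1, 40, 1], [0.7, 1, 60, 1]])
```

The "sites republishing the image" feature is the interesting one: a hoax image tends to be recycled across many unrelated pages, which is exactly the kind of signal a reverse-image lookup exposes.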

2 citations

Proceedings ArticleDOI
27 Jul 2020
TL;DR: Public sentiment on India's Goods and Services Tax (GST) is analyzed using a hybrid approach that combines a lexicon-based method with supervised machine learning.
Abstract: Microblogging sites and other social networking platforms have become the primary means of communication and knowledge sharing as technology progresses. People across the globe express their views on products and services, predict share prices, and give feedback on government policies. Not everything shared on social networks is authentic or true, but it definitely forms a basis for investigating and comprehending public sentiment. Public sentiment can affect the economic landscape via foreign investment and stock markets, among other financial and social impacts. In this paper, we analyze public sentiment on the Goods and Services Tax, popularly known as GST in India. GST subsumes eight central and nine state taxes, thereby unifying the country's indirect tax framework, which gives rise to varied opinions and reactions and makes it imperative to analyze the collective sentiment. We used a hybrid approach to sentiment analysis that combines a lexicon-based method with a supervised machine learning approach to determine public sentiment. We accumulated 163,373 tweets over a span of three weeks, from July 4th to 25th, 2017, after GST was implemented in India w.e.f. July 1st, 2017. A spatio-temporal analysis was performed on the collected tweets. In this research, we annotated 22,000 unique tweets with the help of a lexicon-based method and then applied supervised machine learning techniques with a set of six distinct algorithms to train models and predict the polarity of the complete data set. K-fold cross-validation, for K in the range of 3-10, was used to assess the models on independent data, and accuracy, precision, recall and F1 score of all the models were best when K approached 10. As a result, we observed that SVM and Logistic Regression could predict the polarity of new incoming tweets with an accuracy of 77.6% and 79.31% respectively.
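The two-stage hybrid described above, weak labels from a lexicon, then a supervised model trained on those labels, can be sketched minimally. The lexicon words and the word-vote learner below are stand-ins for the paper's lexicon and its six supervised algorithms.

```python
# Sketch of the two-stage hybrid (assumptions: the lexicon words and the
# word-vote learner are stand-ins for the paper's lexicon and its six
# supervised algorithms).
from collections import defaultdict

POSITIVE = {"good", "great", "simplifies"}
NEGATIVE = {"bad", "confusing", "burden"}

def lexicon_label(tweet):
    """Stage 1: weakly label a tweet from sentiment-lexicon hits."""
    words = tweet.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "pos" if score > 0 else ("neg" if score < 0 else "neutral")

def train(tweets):
    """Stage 2: learn per-word label counts from the weak labels."""
    counts = defaultdict(lambda: {"pos": 0, "neg": 0, "neutral": 0})
    for t in tweets:
        label = lexicon_label(t)
        for w in t.lower().split():
            counts[w][label] += 1
    return counts

def predict(tweet, counts):
    """Predict the polarity of a new tweet by summed word votes."""
    votes = {"pos": 0, "neg": 0, "neutral": 0}
    for w in tweet.lower().split():
        for lab, c in counts.get(w, {}).items():
            votes[lab] += c
    return max(votes, key=votes.get)

model = train(["gst is great", "filing is confusing", "gst simplifies tax"])
```

The design rationale is the same as in the paper: the lexicon provides labels cheaply at scale, while the learned model generalizes to tweets whose wording the lexicon alone would miss.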

2 citations

Book ChapterDOI
02 Dec 2019
TL;DR: The paper presents a hybrid approach to social network analysis for obtaining information on suspicious user profiles, based on the integration of statistical techniques, data mining, and visual analysis.
Abstract: The paper presents a hybrid approach to social network analysis for obtaining information on suspicious user profiles. The offered approach is based on integration of statistical techniques, data mining and visual analysis. The advantage of the proposed approach is that it needs limited kinds of social network data ("likes" in groups and links between users), which is often in open access. The results of experiments confirming the applicability of the proposed approach are outlined.
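Working only from the two data kinds the abstract names (group "likes" and links between users), one simple statistic such an analysis might start from is pairs of unlinked users whose group likes overlap heavily; the criterion below is an illustrative assumption, not the paper's actual method.

```python
# Sketch (assumption: "many shared group likes between unlinked users" is one
# plausible starting statistic; the paper's exact criteria are not given in
# the abstract).

def shared_likes(likes, u, v):
    """Number of groups both users have 'liked'."""
    return len(likes[u] & likes[v])

def suspicious_pairs(likes, friends, min_shared=3):
    """Unlinked user pairs whose group 'likes' overlap heavily."""
    users = sorted(likes)
    flagged = []
    for i, u in enumerate(users):
        for v in users[i + 1:]:
            linked = v in friends.get(u, set()) or u in friends.get(v, set())
            if not linked and shared_likes(likes, u, v) >= min_shared:
                flagged.append((u, v))
    return flagged

likes = {"a": {1, 2, 3, 4}, "b": {1, 2, 3, 9}, "c": {7, 8}}
friends = {"a": {"c"}}
```

Both inputs are exactly the openly accessible data the abstract highlights, which is the stated advantage of the approach.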

2 citations

References
Journal ArticleDOI
28 May 2015-Nature
TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
Abstract: Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
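The core mechanism the abstract describes, using the gradient of the error to adjust internal parameters, can be shown in miniature with a single weight; deep networks repeat exactly this step layer by layer via backpropagation. The toy target function and learning rate are illustrative.

```python
# Minimal illustration of the abstract's core loop: use the gradient of the
# error to adjust an internal parameter. Deep nets repeat this layer by layer
# via backpropagation; here a single weight learns to fit y = 2x.
w = 0.0
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # samples of the target y = 2x
for _ in range(100):                 # epochs
    for x, y in data:
        pred = w * x                 # forward pass
        grad = 2.0 * (pred - y) * x  # d(squared error)/dw, by the chain rule
        w -= 0.05 * grad             # gradient step
# w converges toward 2.0
```

The chain-rule computation of `grad` is the one-parameter case of what backpropagation does for every weight in every layer.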

46,982 citations

Book ChapterDOI
TL;DR: In this paper, the authors present a critique of expected utility theory as a descriptive model of decision making under risk, and develop an alternative model, called prospect theory, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights.
Abstract: This paper presents a critique of expected utility theory as a descriptive model of decision making under risk, and develops an alternative model, called prospect theory. Choices among risky prospects exhibit several pervasive effects that are inconsistent with the basic tenets of utility theory. In particular, people underweight outcomes that are merely probable in comparison with outcomes that are obtained with certainty. This tendency, called the certainty effect, contributes to risk aversion in choices involving sure gains and to risk seeking in choices involving sure losses. In addition, people generally discard components that are shared by all prospects under consideration. This tendency, called the isolation effect, leads to inconsistent preferences when the same choice is presented in different forms. An alternative theory of choice is developed, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights. The value function is normally concave for gains, commonly convex for losses, and is generally steeper for losses than for gains. Decision weights are generally lower than the corresponding probabilities, except in the range of low probabilities. Overweighting of low probabilities may contribute to the attractiveness of both insurance and gambling. Expected utility theory has dominated the analysis of decision making under risk. It has been generally accepted as a normative model of rational choice (24), and widely applied as a descriptive model of economic behavior, e.g. (15, 4). Thus, it is assumed that all reasonable people would wish to obey the axioms of the theory (47, 36), and that most people actually do, most of the time. The present paper describes several classes of choice problems in which preferences systematically violate the axioms of expected utility theory. In the light of these observations we argue that utility theory, as it is commonly interpreted and applied, is not an adequate descriptive model, and we propose an alternative account of choice under risk.

35,067 citations

Book ChapterDOI
09 Jan 2004
TL;DR: A theory of intergroup conflict and some preliminary data relating to the theory are presented in this chapter, with the analysis of social competition resting on the case where the salient dimensions of intergroup differentiation are those involving scarce resources.
Abstract: This chapter presents an outline of a theory of intergroup conflict and some preliminary data relating to the theory. Much of the work on the social psychology of intergroup relations has focused on patterns of individual prejudices and discrimination and on the motivational sequences of interpersonal interaction. The intensity of explicit intergroup conflicts of interests is closely related in human cultures to the degree of opprobrium attached to the notion of "renegade" or "traitor." The basic and highly reliable finding is that the trivial, ad hoc intergroup categorization leads to in-group favoritism and discrimination against the out-group. Many orthodox definitions of "social groups" are unduly restrictive when applied to the context of intergroup relations. The equation of social competition and intergroup conflict rests on the assumptions concerning an "ideal type" of social stratification in which the salient dimensions of intergroup differentiation are those involving scarce resources.

14,812 citations

Journal ArticleDOI
TL;DR: Cumulative prospect theory applies to uncertain as well as to risky prospects with any number of outcomes and allows different weighting functions for gains and for losses; two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting functions.
Abstract: We develop a new version of prospect theory that employs cumulative rather than separable decision weights and extends the theory in several respects. This version, called cumulative prospect theory, applies to uncertain as well as to risky prospects with any number of outcomes, and it allows different weighting functions for gains and for losses. Two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting functions. A review of the experimental evidence and the results of a new experiment confirm a distinctive fourfold pattern of risk attitudes: risk aversion for gains and risk seeking for losses of high probability; risk seeking for gains and risk aversion for losses of low probability. Expected utility theory reigned for several decades as the dominant normative and descriptive model of decision making under uncertainty, but it has come under serious question in recent years. There is now general agreement that the theory does not provide an adequate description of individual choice: a substantial body of evidence shows that decision makers systematically violate its basic tenets. Many alternative models have been proposed in response to this empirical challenge (for reviews, see Camerer, 1989; Fishburn, 1988; Machina, 1987). Some time ago we presented a model of choice, called prospect theory, which explained the major violations of expected utility theory in choices between risky prospects with a small number of outcomes (Kahneman and Tversky, 1979; Tversky and Kahneman, 1986). The key elements of this theory are 1) a value function that is concave for gains, convex for losses, and steeper for losses than for gains,
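The two key elements the abstract names, a value function over gains and losses and nonlinear decision weights, can be written out concretely. The parameter values below (alpha = beta = 0.88, lambda = 2.25, gamma = 0.61) are the commonly cited median estimates associated with cumulative prospect theory, used here only for illustration.

```python
# Worked sketch of cumulative prospect theory's two key elements (assumption:
# the parameter values are commonly cited median estimates, for illustration).

def value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Concave for gains, convex for losses, steeper for losses (loss aversion)."""
    return x ** alpha if x >= 0 else -lam * ((-x) ** beta)

def weight(p, gamma=0.61):
    """Inverse-S decision weight: overweights small probabilities,
    underweights moderate and large ones."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)
```

These two functions reproduce the fourfold pattern the abstract describes: for example, `weight(0.05) > 0.05` is the overweighting of low probabilities that makes both lottery tickets and insurance attractive.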

13,433 citations
