Journal ArticleDOI

Fake News Detection on Social Media: A Data Mining Perspective

01 Sep 2017 - SIGKDD Explorations (ACM) - Vol. 19, Iss. 1, pp. 22-36
TL;DR: Shu et al. present a comprehensive review of detecting fake news on social media, including fake news characterizations grounded in psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics, and representative datasets.
Abstract: Social media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from social media. On the other hand, it enables the wide spread of "fake news", i.e., low quality news with intentionally false information. The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Therefore, fake news detection on social media has recently become an emerging research topic that is attracting tremendous attention. Fake news detection on social media presents unique characteristics and challenges that make existing detection algorithms from traditional news media ineffective or not applicable. First, fake news is intentionally written to mislead readers to believe false information, which makes it difficult and nontrivial to detect based on news content; therefore, we need to include auxiliary information, such as user social engagements on social media, to help make a determination. Second, exploiting this auxiliary information is challenging in and of itself, as users' social engagements with fake news produce data that is big, incomplete, unstructured, and noisy. Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including fake news characterizations on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics, and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media.
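As a concrete, hypothetical illustration of the detection setup the survey formalizes, the sketch below combines news-content features with one auxiliary social-context signal in a simple classifier. The texts, the "low-credibility share count" feature, and all values are made up for illustration; this is a minimal baseline in the spirit of the approaches surveyed, not the survey's own method.

```python
# Toy fake news classifier: TF-IDF content features plus one assumed
# social-context feature (shares by low-credibility accounts).
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = [
    "Breaking: miracle cure doctors don't want you to know",
    "Senate passes annual budget bill after long debate",
    "Celebrity secretly replaced by clone, insiders claim",
    "Local council approves new public transit route",
]
low_cred_shares = np.array([[120], [3], [85], [1]])  # invented auxiliary signal
labels = [1, 0, 1, 0]  # 1 = fake, 0 = real

vec = TfidfVectorizer()
X = hstack([vec.fit_transform(texts), csr_matrix(low_cred_shares)])

clf = LogisticRegression(max_iter=1000).fit(X, labels)
test = ["Shocking miracle cure revealed, doctors stunned"]
x_test = hstack([vec.transform(test), csr_matrix([[90]])])
print(clf.predict(x_test))  # likely [1] on this toy data
```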
Citations
Posted Content
TL;DR: It is shown that the emergence of Trusted Execution Environments and anonymous cryptocurrencies, for the first time, allows the implementation of such a lease service while guaranteeing fairness, plausible deniability and anonymity, therefore shielding the users and account renters from prosecution.
Abstract: We investigate identity lease, a new type of service in which users lease their identities to third parties by providing them with full or restricted access to their online accounts or credentials. We discuss how identity lease could be abused to subvert the digital society, facilitating the spread of fake news and subverting electronic voting by enabling the sale of votes. We show that the emergence of Trusted Execution Environments and anonymous cryptocurrencies, for the first time, allows the implementation of such a lease service while guaranteeing fairness, plausible deniability and anonymity, therefore shielding the users and account renters from prosecution. To show that such a service can be practically implemented, we build an example service that we call TEEvil leveraging Intel SGX and ZCash. Finally, we discuss defense mechanisms and challenges in the mitigation of identity lease services.

4 citations

Proceedings ArticleDOI
19 Sep 2022
TL;DR: A Domain- and Instance-level Transfer Framework for Fake News Detection (DITFEND) is proposed, which could improve the performance of specific target domains and shows superior effectiveness for fake news detection.
Abstract: Social media spreads both real news and fake news in various domains including politics, health, entertainment, etc. It is crucial to automatically detect fake news, especially news in influential domains like politics and health, because it may lead to serious social impact, e.g., panic during the COVID-19 pandemic. Some studies exploit the correlation between domains and perform multi-domain fake news detection. However, these multi-domain methods suffer from a seesaw problem: the performance in some domains is often improved at the cost of hurting the performance in other domains, which can lead to unsatisfying performance in the specific target domains. To address this issue, we propose a Domain- and Instance-level Transfer Framework for Fake News Detection (DITFEND), which improves performance on specific target domains. To transfer coarse-grained domain-level knowledge, we train a general model on data from all domains from a meta-learning perspective. To transfer fine-grained instance-level knowledge and adapt the general model to a target domain, a language model is trained on the target domain to evaluate the transferability of each data instance in the source domains and re-weight each instance's contribution. Experiments on two real-world datasets demonstrate the effectiveness of DITFEND. In both offline and online experiments, DITFEND shows superior effectiveness for fake news detection.

4 citations
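
To make the instance-level transfer idea above concrete, here is a toy sketch that scores source-domain instances with a language model fit on target-domain text and converts the scores into sample weights. DITFEND uses a neural language model and meta-learning; this sketch substitutes a smoothed unigram model and invented example sentences purely for illustration.

```python
# Sketch of DITFEND-style instance re-weighting: source instances that
# "look like" the target domain under a target-domain LM get larger
# training weights.
import math
from collections import Counter

def train_unigram_lm(target_texts):
    counts = Counter(tok for t in target_texts for tok in t.lower().split())
    total, vocab = sum(counts.values()), len(counts)
    # Add-one smoothing so unseen tokens get nonzero probability.
    return lambda tok: (counts[tok] + 1) / (total + vocab + 1)

def transfer_weight(text, lm):
    toks = text.lower().split()
    # Average log-likelihood under the target-domain LM.
    avg_ll = sum(math.log(lm(t)) for t in toks) / max(len(toks), 1)
    return math.exp(avg_ll)  # in (0, 1], usable as a sample weight

target = ["new vaccine trial shows strong results",
          "health officials warn of flu season"]
source = ["election results contested in court",
          "vaccine misinformation spreads online"]

lm = train_unigram_lm(target)
print([transfer_weight(s, lm) for s in source])
# The health-related source sentence gets a slightly larger weight.
```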

Journal ArticleDOI
TL;DR: A novel heterogeneous GCN-based method for dynamic rumor detection (HDGCN), mainly composed of a joint content and propagation module and an ODE-based dynamic module that outperformed the mainstream model on two real-world datasets from Twitter.
Abstract: The development of social media has provided open and convenient platforms for people to express their opinions, which also allows rumors to circulate. Detecting rumors within massive amounts of information therefore becomes particularly essential. Previous methods for rumor detection focused on mining features from content and propagation patterns but neglected dynamic features that couple content with propagation patterns. In this paper, we propose a novel heterogeneous GCN-based method for dynamic rumor detection (HDGCN), mainly composed of a joint content and propagation module and an ODE-based dynamic module. The joint content and propagation module constructs a content-propagation heterogeneous graph to obtain rumor representations by mining the interaction between post content and propagation structures during rumor propagation. The ODE-based dynamic module leverages a GCN integrated with an ordinary differential equation system to explore dynamic features of heterogeneous graphs. To evaluate the performance of our proposed HDGCN model, we conducted extensive experiments on two real-world datasets from Twitter. Our proposed model outperformed mainstream models.

4 citations
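
The ODE-based dynamic module can be pictured as integrating GCN-style message passing through continuous time. The sketch below is an assumed, simplified rendering of that idea (forward Euler over dH/dt, a toy 3-node graph, random features); the paper's actual ODE system and heterogeneous graph construction differ in detail.

```python
# Toy graph-ODE: treat GCN propagation as the derivative of node
# states and integrate it with forward Euler.
import numpy as np

def normalize_adj(A):
    # Symmetric normalization A_hat = D^-1/2 (A + I) D^-1/2, as in GCNs.
    A = A + np.eye(A.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
    return d_inv_sqrt @ A @ d_inv_sqrt

def ode_gcn(A_hat, H0, W, t_end=1.0, steps=10):
    # Euler integration of dH/dt = tanh(A_hat @ H @ W) - H, a simple
    # stable choice; the paper's ODE system differs in detail.
    H, dt = H0.copy(), t_end / steps
    for _ in range(steps):
        H = H + dt * (np.tanh(A_hat @ H @ W) - H)
    return H

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # 3-node propagation tree
H0 = rng.normal(size=(3, 4))  # node features (e.g., post embeddings)
W = rng.normal(size=(4, 4))
print(ode_gcn(normalize_adj(A), H0, W))
```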

Journal ArticleDOI
TL;DR: In this article, the authors developed the first comprehensive fake news detection dataset for Pakistani news by using multiple fact-checked news APIs and evaluated the developed dataset using multiple state-of-the-art artificial intelligence techniques.
Abstract: Fake news has a huge impact on readers' minds and has therefore become a major concern. Identifying fake news, or differentiating between fake and authentic news, is quite challenging. The trend of fake news in Pakistan has grown considerably in the last decade. This research aims to develop the first comprehensive fake news detection dataset for Pakistani news by using multiple fact-checked news APIs. This research also evaluates the developed dataset using multiple state-of-the-art artificial intelligence techniques. Five machine learning techniques, namely Naive Bayes, KNN, Logistic Regression, SVM, and Decision Trees, are used, while two deep learning techniques, CNN and LSTM, are used with GloVe and BERT embeddings. The performance of all the applied models and embeddings is compared based on precision, recall, F1-score, and accuracy. The results show that an LSTM initialized with GloVe embeddings performs best. The research also analyzes the misclassified samples by comparing them with human judgements.

4 citations
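
The model comparison described above follows a standard benchmarking loop. A minimal sketch with scikit-learn, using placeholder texts rather than the paper's Pakistani-news corpus:

```python
# Train the five classical models on shared TF-IDF features and
# compare precision, recall, F1-score, and accuracy.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import precision_recall_fscore_support, accuracy_score

models = {
    "Naive Bayes": MultinomialNB(),
    "KNN": KNeighborsClassifier(n_neighbors=3),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": LinearSVC(),
    "Decision Tree": DecisionTreeClassifier(),
}

train_texts = ["fake claim about cure", "official budget report",
               "hoax alert spreading", "court ruling published"]
train_labels = [1, 0, 1, 0]
test_texts, test_labels = ["another hoax cure claim", "budget ruling report"], [1, 0]

vec = TfidfVectorizer()
X_train, X_test = vec.fit_transform(train_texts), vec.transform(test_texts)

for name, model in models.items():
    preds = model.fit(X_train, train_labels).predict(X_test)
    p, r, f1, _ = precision_recall_fscore_support(
        test_labels, preds, average="binary", zero_division=0)
    acc = accuracy_score(test_labels, preds)
    print(f"{name}: P={p:.2f} R={r:.2f} F1={f1:.2f} Acc={acc:.2f}")
```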

Proceedings ArticleDOI
04 Nov 2020
TL;DR: Results show that the suggested CNN and RNN-LSTM based hybrid approach achieves the highest accuracy of 92%, surpassing most classical models, using the Adam optimizer and the binary cross-entropy loss function.
Abstract: Fake news is a new phenomenon involving false information and fraud that spreads through online social media or traditional news media. Today, fake news can be easily created and distributed across many social media platforms and has a widespread impact on the real world. It is critical to develop efficient algorithms and tools for early detection of how false information is disseminated on social media platforms and why it succeeds in deceiving users. Most research methods today are based on machine learning, deep learning, feature engineering, graph mining, image and video analysis, and newly developed datasets and web services for detecting deceptive content. A strong need therefore emerges for a suitable method that can reliably detect false information. This study suggests a hybrid approach using a CNN model and an RNN-LSTM model to detect false information. First, the NLTK toolkit is used to remove stop words, punctuation, and special characters from the text; the same toolkit then tokenizes and preprocesses the text. GloVe word embeddings are then applied to the preprocessed text. Higher-level features of the input text are extracted by the CNN model using convolutional and max-pooling layers, while long-term dependencies between word sequences are captured by the RNN-LSTM model. The suggested model also applies dropout with dense layers to enhance the efficiency of the hybrid model. Results show that the suggested CNN and RNN-LSTM based hybrid approach achieves the highest accuracy of 92%, surpassing most classical models, with the Adam optimizer and the binary cross-entropy loss function.

4 citations
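
A sketch of the described CNN + RNN-LSTM hybrid in Keras follows. The paper specifies GloVe embeddings, dropout, dense layers, the Adam optimizer, and binary cross-entropy; the layer sizes, filter counts, and sequence length here are illustrative assumptions.

```python
# CNN extracts local n-gram features, max-pooling downsamples them,
# an LSTM captures long-term dependencies, and dropout plus dense
# layers finish the binary classifier.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE, EMBED_DIM, MAX_LEN = 20_000, 100, 300  # assumed dimensions

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,), dtype="int32"),
    layers.Embedding(VOCAB_SIZE, EMBED_DIM),  # load a GloVe matrix here in practice
    layers.Conv1D(128, 5, activation="relu"),  # higher-level local features
    layers.MaxPooling1D(pool_size=2),
    layers.LSTM(64),                           # long-term dependencies
    layers.Dropout(0.5),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),     # fake vs. real
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```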

References
Journal ArticleDOI
28 May 2015 - Nature
TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
Abstract: Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.

46,982 citations
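
The backpropagation step the abstract describes, computing each layer's representation from the previous layer's and propagating gradients backwards to adjust internal parameters, can be shown in a few lines. A minimal numeric sketch with one hidden layer and a single gradient step:

```python
# Forward pass builds each layer's representation from the previous
# one; backward pass applies the chain rule to adjust the weights.
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.normal(size=(1, 3)), np.array([[1.0]])  # toy input and target
W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(4, 1))

# Forward pass.
h = np.tanh(x @ W1)
y_hat = h @ W2
loss = 0.5 * ((y_hat - y) ** 2).sum()

# Backward pass: gradients flow from the loss back through each layer.
d_y_hat = y_hat - y
dW2 = h.T @ d_y_hat
d_h = d_y_hat @ W2.T * (1 - h ** 2)  # tanh'(z) = 1 - tanh(z)^2
dW1 = x.T @ d_h

lr = 0.1
W1, W2 = W1 - lr * dW1, W2 - lr * dW2  # one gradient step
print(f"loss before step: {loss:.4f}")
```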

Book ChapterDOI
TL;DR: In this paper, the authors present a critique of expected utility theory as a descriptive model of decision making under risk, and develop an alternative model, called prospect theory, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights.
Abstract: This paper presents a critique of expected utility theory as a descriptive model of decision making under risk, and develops an alternative model, called prospect theory. Choices among risky prospects exhibit several pervasive effects that are inconsistent with the basic tenets of utility theory. In particular, people underweight outcomes that are merely probable in comparison with outcomes that are obtained with certainty. This tendency, called the certainty effect, contributes to risk aversion in choices involving sure gains and to risk seeking in choices involving sure losses. In addition, people generally discard components that are shared by all prospects under consideration. This tendency, called the isolation effect, leads to inconsistent preferences when the same choice is presented in different forms. An alternative theory of choice is developed, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights. The value function is normally concave for gains, commonly convex for losses, and is generally steeper for losses than for gains. Decision weights are generally lower than the corresponding probabilities, except in the range of low probabilities. Overweighting of low probabilities may contribute to the attractiveness of both insurance and gambling. Expected utility theory has dominated the analysis of decision making under risk. It has been generally accepted as a normative model of rational choice (24), and widely applied as a descriptive model of economic behavior, e.g. (15, 4). Thus, it is assumed that all reasonable people would wish to obey the axioms of the theory (47, 36), and that most people actually do, most of the time. The present paper describes several classes of choice problems in which preferences systematically violate the axioms of expected utility theory. In the light of these observations we argue that utility theory, as it is commonly interpreted and applied, is not an adequate descriptive model and we propose an alternative account of choice under risk.

35,067 citations
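
The value and weighting shapes described above have a widely used parametric form, estimated in Tversky and Kahneman's 1992 follow-up (the cumulative prospect theory entry below). A small sketch using those median parameter estimates:

```python
# Prospect-theory value and weighting functions with the median
# parameter estimates from the 1992 follow-up paper.
def value(x, alpha=0.88, beta=0.88, lam=2.25):
    # Concave for gains, convex for losses, steeper for losses.
    return x ** alpha if x >= 0 else -lam * ((-x) ** beta)

def weight(p, gamma=0.61):
    # Inverse-S decision weight: overweights low probabilities,
    # underweights moderate and high ones.
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

print(value(100), value(-100))     # a $100 loss looms larger than a $100 gain
print(weight(0.01), weight(0.99))  # w(0.01) > 0.01 but w(0.99) < 0.99
```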

Book ChapterDOI
09 Jan 2004
TL;DR: A theory of intergroup conflict and some preliminary data relating to the theory are presented in this article, but the analysis is limited to the case where the salient dimensions of intergroup differentiation are those involving scarce resources.
Abstract: This chapter presents an outline of a theory of intergroup conflict and some preliminary data relating to the theory. Much of the work on the social psychology of intergroup relations has focused on patterns of individual prejudices and discrimination and on the motivational sequences of interpersonal interaction. The intensity of explicit intergroup conflicts of interests is closely related in human cultures to the degree of opprobrium attached to the notion of "renegade" or "traitor." The basic and highly reliable finding is that the trivial, ad hoc intergroup categorization leads to in-group favoritism and discrimination against the out-group. Many orthodox definitions of "social groups" are unduly restrictive when applied to the context of intergroup relations. The equation of social competition and intergroup conflict rests on the assumptions concerning an "ideal type" of social stratification in which the salient dimensions of intergroup differentiation are those involving scarce resources.

14,812 citations

Journal ArticleDOI
TL;DR: Cumulative prospect theory as discussed by the authors applies to uncertain as well as to risky prospects with any number of outcomes, and it allows different weighting functions for gains and for losses, and two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting function.
Abstract: We develop a new version of prospect theory that employs cumulative rather than separable decision weights and extends the theory in several respects. This version, called cumulative prospect theory, applies to uncertain as well as to risky prospects with any number of outcomes, and it allows different weighting functions for gains and for losses. Two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting functions. A review of the experimental evidence and the results of a new experiment confirm a distinctive fourfold pattern of risk attitudes: risk aversion for gains and risk seeking for losses of high probability; risk seeking for gains and risk aversion for losses of low probability. Expected utility theory reigned for several decades as the dominant normative and descriptive model of decision making under uncertainty, but it has come under serious question in recent years. There is now general agreement that the theory does not provide an adequate description of individual choice: a substantial body of evidence shows that decision makers systematically violate its basic tenets. Many alternative models have been proposed in response to this empirical challenge (for reviews, see Camerer, 1989; Fishburn, 1988; Machina, 1987). Some time ago we presented a model of choice, called prospect theory, which explained the major violations of expected utility theory in choices between risky prospects with a small number of outcomes (Kahneman and Tversky, 1979; Tversky and Kahneman, 1986). The key elements of this theory are 1) a value function that is concave for gains, convex for losses, and steeper for losses than for gains, and 2) a nonlinear transformation of the probability scale, which overweights small probabilities and underweights moderate and high probabilities.

13,433 citations
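
For reference, the parametric forms estimated in this paper can be written compactly (a condensed transcription, with the paper's median parameter estimates α = β = 0.88, λ = 2.25, γ = 0.61 for gains, and δ = 0.69 for losses):

```latex
% Value function: concave for gains, convex for losses, loss-averse.
v(x) =
\begin{cases}
x^{\alpha} & \text{if } x \ge 0 \\
-\lambda(-x)^{\beta} & \text{if } x < 0
\end{cases}
% Cumulative weighting functions for gains (w^+) and losses (w^-).
w^{+}(p) = \frac{p^{\gamma}}{\bigl(p^{\gamma} + (1-p)^{\gamma}\bigr)^{1/\gamma}},
\qquad
w^{-}(p) = \frac{p^{\delta}}{\bigl(p^{\delta} + (1-p)^{\delta}\bigr)^{1/\delta}}
```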

Trending Questions (1)
Issue of fake news

The paper discusses the issue of fake news on social media and its potential negative impacts on individuals and society.