Journal ArticleDOI

Fake News Detection on Social Media: A Data Mining Perspective

01 Sep 2017 - SIGKDD Explorations (ACM) - Vol. 19, Iss. 1, pp. 22-36
TL;DR: Shu et al. present a comprehensive review of detecting fake news on social media, including fake news characterizations grounded in psychology and social theories, existing algorithms from a data mining perspective, and evaluation metrics and representative datasets.
Abstract: Social media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from social media. On the other hand, it enables the wide spread of "fake news", i.e., low-quality news with intentionally false information. The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Therefore, fake news detection on social media has recently become an emerging research topic that is attracting tremendous attention. Fake news detection on social media presents unique characteristics and challenges that make existing detection algorithms from traditional news media ineffective or not applicable. First, fake news is intentionally written to mislead readers into believing false information, which makes it difficult and nontrivial to detect based on news content alone; therefore, we need to include auxiliary information, such as user social engagements on social media, to help make a determination. Second, exploiting this auxiliary information is challenging in and of itself, as users' social engagements with fake news produce data that is big, incomplete, unstructured, and noisy. Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including fake news characterizations based on psychology and social theories, existing algorithms from a data mining perspective, and evaluation metrics and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media.
Citations
Journal ArticleDOI
01 Aug 2019
TL;DR: This tutorial provides a unifying framework for categorizing prior research focusing on four facets of fake news: detection, propagation, mitigation and intervention.
Abstract: Fake news is a major threat to global democracy, resulting in diminished trust in government, journalism and civil society. The public popularity of social media and social networks has caused a contagion of fake news where conspiracy theories, disinformation and extreme views flourish. Detection and mitigation of fake news is one of the fundamental problems of our times and has attracted widespread attention. While fact-checking websites such as Snopes and PolitiFact, and major companies such as Google, Facebook, and Twitter, have taken preliminary steps towards addressing fake news, much more remains to be done. As an interdisciplinary topic, various facets of fake news have been studied by communities as diverse as machine learning, databases, journalism, political science and many more. The objective of this tutorial is two-fold. First, we wish to familiarize the database community with the efforts by other communities on combating fake news. We provide a panoramic view of the state of the art of research on various aspects including detection, propagation, mitigation, and intervention of fake news. Next, we provide a concise and intuitive summary of prior research by the database community and discuss how it could be used to counteract fake news. The tutorial covers research from areas such as data integration, truth discovery and fusion, probabilistic databases, knowledge graphs and crowdsourcing through the lens of fake news. Effective tools for addressing fake news can only be built by leveraging the synergistic relationship between the database and other research communities. We hope that our tutorial provides an impetus towards such a synthesis of ideas and the creation of new ones.

12 citations

Journal ArticleDOI
TL;DR: In this paper, the authors used k-means clustering and manual coding to classify tweets by theme, sentiment, length and count of emojis, pictures, videos and links.
Abstract: BACKGROUND Micro-blogging services empower health institutions to quickly disseminate health information to many users. By analysing user data, infodemiology (i.e. improving public health using user-contributed health-related content) can be measured in terms of information diffusion. OBJECTIVES Tweets by the WHO were examined in order to identify tweet attributes that lead to a high information diffusion rate, using Twitter data collected between November 2019 and January 2020. METHODS One thousand one hundred and seventy-seven tweets were collected using Python's Tweepy library. Afterwards, k-means clustering and manual coding were used to classify tweets by theme, sentiment, length, and count of emojis, pictures, videos and links. The resulting groups with different characteristics were analysed for significant differences using Mann-Whitney U- and Kruskal-Wallis H-tests. RESULTS The topic of the tweet, the included links, emojis and (one) picture, as well as the tweet length, significantly affected the tweets' diffusion, whereas sentiment and videos did not show any significant influence on the diffusion of tweets. DISCUSSION The findings of this study give insight into why specific health topics might generate less attention and fail to achieve sufficient information diffusion. CONCLUSION The subject and appearance of a tweet influence its diffusion, making its design as essential as the preparation of its content.
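The pipeline in this abstract maps onto a few standard Python libraries. The sketch below is a minimal illustration, not the authors' code; the file name who_tweets.csv, the column names (text, retweets, has_picture), and the cluster count are all assumptions made for the example:

```python
# Minimal sketch of the described pipeline: k-means theme clustering,
# then nonparametric tests on a diffusion proxy (retweet count).
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from scipy.stats import mannwhitneyu, kruskal

tweets = pd.read_csv("who_tweets.csv")  # hypothetical export of the collected tweets

# Theme discovery: k-means over TF-IDF vectors of the tweet text.
tfidf = TfidfVectorizer(stop_words="english", max_features=5000)
X = tfidf.fit_transform(tweets["text"])
tweets["theme"] = KMeans(n_clusters=8, random_state=0).fit_predict(X)

# Two-group attribute (e.g., picture vs. no picture): Mann-Whitney U-test.
with_pic = tweets.loc[tweets["has_picture"] == 1, "retweets"]
without_pic = tweets.loc[tweets["has_picture"] == 0, "retweets"]
print(mannwhitneyu(with_pic, without_pic))

# Multi-group attribute (theme): Kruskal-Wallis H-test across clusters.
groups = [g["retweets"].values for _, g in tweets.groupby("theme")]
print(kruskal(*groups))
```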

12 citations

Proceedings ArticleDOI
01 Oct 2020
TL;DR: The results show that the proposed model outperforms other state-of-the-art methods on rumor detection; its three modules are jointly trained to improve overall performance.
Abstract: With the development of technology and the expansion of social media, rumors spread widely, and rumor detection has gradually attracted widespread concern. Early methods based on handcrafted features have been abandoned due to inefficiency, and deep learning methods have been gradually adopted in recent years. However, most methods consider only content information such as text, which is often not enough for the specific task of rumor detection. Some studies take propagation rules into consideration, such as kernel-based methods and RvNN. In addition, the structures formed via the propagation of rumors and non-rumors have different properties. Compared with dynamic propagation, structure here is the final result of propagation: it is static and global. To incorporate this structure information, we propose a model that captures textual, propagation, and structure information. The model contains three components: an encoder, a decoder, and a detector. The encoder uses an efficient Graph Convolutional Network that takes the initial texts as input and updates the representations through propagation to learn text and propagation information. The encoded representations are then passed to the decoder, an autoencoder that learns the overall structure information. Simultaneously, the detector utilizes the output of the encoder to classify events as rumors or not. These three modules are jointly trained to improve the model's effectiveness. We verified our method on three real-world datasets, and the results show that our method outperforms other state-of-the-art methods.
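To make the three-component design and the joint training concrete, here is a minimal plain-PyTorch sketch. It is an illustration under assumptions (a single graph-convolution step, mean pooling for the event-level representation, the layer sizes, and the loss weight alpha), not the paper's actual architecture:

```python
# Encoder (graph convolution) + decoder (autoencoder) + detector (classifier),
# jointly trained. a_hat is a normalized adjacency matrix of the propagation graph.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, a_hat):
        # One graph-convolution step: aggregate neighbors, then transform.
        return torch.relu(self.lin(a_hat @ x))

class RumorModel(nn.Module):
    def __init__(self, text_dim, hid_dim):
        super().__init__()
        self.encoder = GCNLayer(text_dim, hid_dim)   # text + propagation info
        self.decoder = nn.Linear(hid_dim, text_dim)  # autoencoder branch (structure)
        self.detector = nn.Linear(hid_dim, 2)        # rumor / non-rumor

    def forward(self, x, a_hat):
        h = self.encoder(x, a_hat)
        recon = self.decoder(h)                              # reconstruction signal
        logits = self.detector(h.mean(dim=0, keepdim=True))  # event-level label
        return recon, logits

model = RumorModel(text_dim=300, hid_dim=64)
ce, mse = nn.CrossEntropyLoss(), nn.MSELoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(x, a_hat, label, alpha=0.5):
    # Joint objective: detection loss plus weighted reconstruction loss.
    recon, logits = model(x, a_hat)
    loss = ce(logits, label) + alpha * mse(recon, x)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Usage: x is (num_nodes, 300), a_hat is (num_nodes, num_nodes),
# label is e.g. torch.tensor([1]) for a rumor event.
```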

12 citations

Journal ArticleDOI
TL;DR: Multiple data sets are examined using the integrated DCFM and FSA models to give cybersecurity experts a clearer picture of the threat, which will help in planning a better response.
Abstract: Social media has influenced socio-political aspects of many societies around the world. It is an effortless way for people to enhance their communication, connect with like-minded people, and share ideas. Online social networks (OSNs) can be used for noble causes by bringing together communities with common shared interests and promoting awareness of various causes. However, there is a dark side to the use of OSNs. OSNs can also be used as a coordination and amplification platform for attacks. For instance, adversaries can increase the impact of an attack by causing panic in an area by promoting the attack through OSNs. Public data can help adversaries determine the best timing for attacks and schedule them, and OSNs can then be used to coordinate the attacks on networks or physical locations. This convergence of the cyber and physical worlds is known as cybernetics. In this paper, we introduce an integrated method to identify malicious behavior and the actors responsible for propagating this behavior via online social networks. Throughout history, we have used surveillance techniques to monitor negative behavior, activities, and information. Quantitative socio-technical methods such as deviant cyber flash mob (DCFM) detection and focal structure analysis (FSA) can provide reconnaissance capabilities that enable cities and governments to look beyond internal data and identify threats based on active events. Groups of powerful hackers can be identified through FSA, an integrated model that uses a betweenness centrality method at the node level and spectral modularity at the group level to identify a hidden, malicious, and powerful focal structure (a subset of the network). Assessment of groups using DCFM methods can help to identify powerful actors and prevent attacks. In this study, we examine multiple data sets using the integrated DCFM and FSA models to help cybersecurity experts see a clearer picture of the threat, which will help to plan a better response.
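A rough approximation of the FSA idea can be assembled with networkx: betweenness centrality at the node level, a modularity-based community step (greedy maximization here, standing in for the spectral method), then ranking communities by the average centrality of their members. The example graph and the scoring rule are illustrative assumptions, not the authors' implementation:

```python
# Approximate focal-structure search: central nodes + cohesive communities.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.karate_club_graph()  # placeholder OSN graph

# Node level: actors that bridge many shortest paths.
centrality = nx.betweenness_centrality(G)

# Group level: communities with high modularity (greedy stand-in here).
communities = greedy_modularity_communities(G)

# Candidate "focal structure": the community whose members carry the
# highest average betweenness, i.e., a small, well-placed, cohesive group.
focal = max(communities,
            key=lambda c: sum(centrality[n] for n in c) / len(c))
print(sorted(focal))
```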

12 citations

Journal ArticleDOI
TL;DR: The findings show that previous studies on the applications of Blockchain in social media are focused mainly on blocking fake news and enhancing data privacy, and this is the first systematic literature review that elucidates the combination of Blockchain and social media.
Abstract: Social media has transformed the mode of communication globally by providing an extensive system for exchanging ideas, initiating business contracts, and proposing new professional ideas. However, there are many limitations to the use of social media, such as misinformation, lack of effective content moderation, digital piracy, data breaches, identity fraud, and fake news. In order to address these limitations, several studies have introduced the application of Blockchain technology in social media. Blockchain can provide transparency, traceability, tamper-proofing, confidentiality, security, information control, and supervision. This paper is a systematic literature review of papers covering the application of Blockchain technology in social media. To the best of our knowledge, this is the first systematic literature review that elucidates the combination of Blockchain and social media. Using several electronic databases, 42 related papers were reviewed. Our findings show that previous studies on the applications of Blockchain in social media have focused mainly on blocking fake news and enhancing data privacy. Research in this domain began in 2017. This review additionally discusses several challenges in applying Blockchain technologies in social media contexts, and proposes alternative ideas for future implementation and research.

12 citations

References
Journal ArticleDOI
28 May 2015 - Nature
TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
Abstract: Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
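As a toy illustration of the core mechanism the abstract describes, the sketch below composes multiple processing layers and updates their internal parameters by backpropagation; the network sizes and the random data are placeholder assumptions, not anything from the paper:

```python
# A toy multi-layer model trained by backpropagation (placeholder data).
import torch
import torch.nn as nn

net = nn.Sequential(                 # stacked processing layers, each
    nn.Linear(784, 128), nn.ReLU(),  # computing a new representation
    nn.Linear(128, 64), nn.ReLU(),   # of the previous layer's output
    nn.Linear(64, 10),
)
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.SGD(net.parameters(), lr=0.1)

x = torch.randn(32, 784)             # fake batch of inputs
y = torch.randint(0, 10, (32,))      # fake class labels
loss = loss_fn(net(x), y)
loss.backward()                      # backpropagation: gradients of the loss
opt.step()                           # w.r.t. every layer's parameters
```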

46,982 citations

Book ChapterDOI
TL;DR: In this paper, the authors present a critique of expected utility theory as a descriptive model of decision making under risk, and develop an alternative model, called prospect theory, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights.
Abstract: This paper presents a critique of expected utility theory as a descriptive model of decision making under risk, and develops an alternative model, called prospect theory. Choices among risky prospects exhibit several pervasive effects that are inconsistent with the basic tenets of utility theory. In particular, people underweight outcomes that are merely probable in comparison with outcomes that are obtained with certainty. This tendency, called the certainty effect, contributes to risk aversion in choices involving sure gains and to risk seeking in choices involving sure losses. In addition, people generally discard components that are shared by all prospects under consideration. This tendency, called the isolation effect, leads to inconsistent preferences when the same choice is presented in different forms. An alternative theory of choice is developed, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights. The value function is normally concave for gains, commonly convex for losses, and is generally steeper for losses than for gains. Decision weights are generally lower than the corresponding probabilities, except in the range of low probabilities. Overweighting of low probabilities may contribute to the attractiveness of both insurance and gambling. Expected utility theory has dominated the analysis of decision making under risk. It has been generally accepted as a normative model of rational choice (24), and widely applied as a descriptive model of economic behavior, e.g. (15, 4). Thus, it is assumed that all reasonable people would wish to obey the axioms of the theory (47, 36), and that most people actually do, most of the time. The present paper describes several classes of choice problems in which preferences systematically violate the axioms of expected utility theory. In the light of these observations we argue that utility theory, as it is commonly interpreted and applied, is not an adequate descriptive model, and we propose an alternative account of choice under risk.
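The qualitative shape claims (concave for gains, convex for losses, steeper for losses) are often made concrete with the parametric value function later estimated in the cumulative version of the theory; shown here for illustration, with the commonly cited 1992 parameter estimates noted in the comments:

```latex
% A standard parametric prospect-theory value function. Concavity for
% gains (\alpha < 1), convexity for losses (\beta < 1), and loss aversion
% (\lambda > 1) reproduce the shape described in the abstract; the 1992
% estimates were \alpha = \beta = 0.88 and \lambda = 2.25.
v(x) =
\begin{cases}
  x^{\alpha} & \text{if } x \ge 0 \\
  -\lambda (-x)^{\beta} & \text{if } x < 0
\end{cases}
```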

35,067 citations

Book ChapterDOI
09 Jan 2004
TL;DR: A theory of intergroup conflict and some preliminary data relating to the theory are presented in this article; the analysis, however, is limited to the case where the salient dimensions of intergroup differentiation are those involving scarce resources.
Abstract: This chapter presents an outline of a theory of intergroup conflict and some preliminary data relating to the theory. Much of the work on the social psychology of intergroup relations has focused on patterns of individual prejudices and discrimination and on the motivational sequences of interpersonal interaction. The intensity of explicit intergroup conflicts of interests is closely related in human cultures to the degree of opprobrium attached to the notion of "renegade" or "traitor." The basic and highly reliable finding is that the trivial, ad hoc intergroup categorization leads to in-group favoritism and discrimination against the out-group. Many orthodox definitions of "social groups" are unduly restrictive when applied to the context of intergroup relations. The equation of social competition and intergroup conflict rests on the assumptions concerning an "ideal type" of social stratification in which the salient dimensions of intergroup differentiation are those involving scarce resources.

14,812 citations

Journal ArticleDOI
TL;DR: Cumulative prospect theory, as developed by the authors, applies to uncertain as well as to risky prospects with any number of outcomes and allows different weighting functions for gains and for losses; two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting functions.
Abstract: We develop a new version of prospect theory that employs cumulative rather than separable decision weights and extends the theory in several respects. This version, called cumulative prospect theory, applies to uncertain as well as to risky prospects with any number of outcomes, and it allows different weighting functions for gains and for losses. Two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting functions. A review of the experimental evidence and the results of a new experiment confirm a distinctive fourfold pattern of risk attitudes: risk aversion for gains and risk seeking for losses of high probability; risk seeking for gains and risk aversion for losses of low probability. Expected utility theory reigned for several decades as the dominant normative and descriptive model of decision making under uncertainty, but it has come under serious question in recent years. There is now general agreement that the theory does not provide an adequate description of individual choice: a substantial body of evidence shows that decision makers systematically violate its basic tenets. Many alternative models have been proposed in response to this empirical challenge (for reviews, see Camerer, 1989; Fishburn, 1988; Machina, 1987). Some time ago we presented a model of choice, called prospect theory, which explained the major violations of expected utility theory in choices between risky prospects with a small number of outcomes (Kahneman and Tversky, 1979; Tversky and Kahneman, 1986). The key elements of this theory are 1) a value function that is concave for gains, convex for losses, and steeper for losses than for gains; and 2) a nonlinear transformation of the probability scale, which overweights small probabilities and underweights moderate and high probabilities.
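As a concrete anchor for the "cumulative rather than separable decision weights", the probability weighting function for gains in the 1992 formulation is usually written in the closed form below (an analogous form with a separate exponent is used for losses; the single-parameter form shown here is the commonly cited one):

```latex
% Cumulative-prospect-theory probability weighting function for gains.
% With \gamma < 1 this overweights small probabilities and underweights
% moderate and high ones, matching the fourfold pattern described above.
w^{+}(p) = \frac{p^{\gamma}}{\bigl(p^{\gamma} + (1-p)^{\gamma}\bigr)^{1/\gamma}}
```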

13,433 citations

Trending Questions (1)
Issue of fake news

The paper discusses the issue of fake news on social media and its potential negative impacts on individuals and society.