Journal ArticleDOI

Fake News Detection on Social Media: A Data Mining Perspective

01 Sep 2017 - SIGKDD Explorations (ACM) - Vol. 19, Iss. 1, pp. 22-36
TL;DR: This survey presents a comprehensive review of detecting fake news on social media, covering fake news characterizations grounded in psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics, and representative datasets.
Abstract: Social media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from social media. On the other hand, it enables the wide spread of "fake news", i.e., low quality news with intentionally false information. The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Therefore, fake news detection on social media has recently become an emerging research topic that is attracting tremendous attention. Fake news detection on social media presents unique characteristics and challenges that make existing detection algorithms from traditional news media ineffective or not applicable. First, fake news is intentionally written to mislead readers into believing false information, which makes it difficult and nontrivial to detect based on news content alone; therefore, we need to include auxiliary information, such as user social engagements on social media, to help make a determination. Second, exploiting this auxiliary information is challenging in and of itself, as users' social engagements with fake news produce data that is big, incomplete, unstructured, and noisy. Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including fake news characterizations grounded in psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics, and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media.
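To make the content-based setup concrete, here is a minimal sketch of the kind of news-content classifier the survey categorizes, using scikit-learn. The articles and labels are hypothetical toy data; a real system would append the social-context features the survey emphasizes.

```python
# Minimal sketch (hypothetical data) of a news-content classifier of the
# kind the survey categorizes: TF-IDF features + logistic regression.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

articles = [
    "Scientists confirm the new vaccine passed all safety trials",
    "SHOCKING: celebrity secretly controls world banks, insiders say",
    "City council approves budget for road repairs next year",
    "Miracle fruit cures every known disease overnight",
]
labels = [0, 1, 0, 1]  # 0 = real, 1 = fake (hypothetical ground truth)

# TF-IDF turns each article into a sparse term-weight vector; the
# classifier then learns which terms correlate with the fake label.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(articles, labels)

print(model.predict(["Insiders say this miracle cure is being hidden"]))
# A real system would concatenate social-context features (user
# engagements, propagation patterns) to the content vector.
```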
Citations
Journal ArticleDOI
28 May 2020
TL;DR: If people perceive that a CNR owner has prior experience in crisis response, can help the public to respond to the event, understands the situation, has the best interests of affected individuals in mind, or will correct misinformation, they tend to trust that CNR.
Abstract: People often create social media accounts and pages named after crisis events. We call such accounts and pages Crisis Named Resources (CNRs). CNRs share information about crisis events and are followed by many. Yet, they also appear suddenly (at crisis onset) and in most cases, the owners are unknown. Thus, it can be challenging for audiences in particular to know whether to trust (or not trust) these CNRs and the information they provide. In this study, we conducted surveys and interviews with members of the public and experts in crisis informatics, emergency response, and communication studies to evaluate the trustworthiness of CNRs named after the 2017 Hurricane Irma. Findings showed that participants evaluated trustworthiness based on their perceptions of a CNR's content, information source, profile, and owner. Findings also show that if people perceive that a CNR owner has prior experience in crisis response, can help the public to respond to the event, understands the situation, has the best interests of affected individuals in mind, or will correct misinformation, they tend to trust that CNR. Participant demographics and expertise showed no effect on perceptions of trustworthiness.

6 citations

Journal ArticleDOI
16 Jul 2021
TL;DR: In this paper, the authors discuss a new conceptual model to examine the phenomenon of fake news and propose a mechanism by which to determine how likely users may be to share fake news with others.
Abstract: The authors discuss a new conceptual model to examine the phenomenon of fake news. Their model focuses on the relationship between the creator and the consumer of fake news and proposes a mechanism by which to determine how likely users may be to share fake news with others. In particular, it is hypothesized that information users would likely be influenced by seven factors in choosing to share fake news or to verify information, including the user's: (1) level of online trust; (2) level of self-disclosure online; (3) amount of social comparison; (4) level of FoMO anxiety; (5) level of social media fatigue; (6) concept of self and role identity; and (7) level of educational attainment. The implications reach into many well-established avenues of inquiry in education, Library and Information Science (LIS), sociology, and other disciplines, including communities of practice, information acquiring and sharing, social positioning, social capital theory, self-determination, rational choice (e.g., satisficing and information overload), critical thinking, and information literacy. Understanding the multiple root causes of creating and sharing fake news will help to alleviate its spread. Relying too heavily on any one factor to combat fake news—education level, for example—may have limited impact on mitigating its effects. Establishing thresholds for a certain combination of factors may better predict the tendency of users to share fake news. The authors also speculate on the role information literacy education programs can play in light of a more complex understanding of how fake news operates.
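The seven-factor model is conceptual and the paper reports no fitted weights. The sketch below is one hypothetical operationalization: a logistic combination of normalized factor scores, with illustrative weights and threshold chosen only to show how combined thresholds might predict sharing.

```python
import math

# Hypothetical operationalization of the seven-factor model: the paper
# proposes the factors but fits no weights, so these numbers are
# illustrative only. Education is assumed protective (negative weight).
FACTORS = ["online_trust", "self_disclosure", "social_comparison",
           "fomo_anxiety", "social_media_fatigue", "role_identity",
           "educational_attainment"]
WEIGHTS = [0.9, 0.5, 0.6, 0.8, 0.4, 0.3, -1.1]
BIAS = -1.0  # illustrative threshold location

def share_probability(scores):
    """scores: factor name -> normalized value in [0, 1]."""
    z = BIAS + sum(w * scores[f] for f, w in zip(FACTORS, WEIGHTS))
    return 1.0 / (1.0 + math.exp(-z))

user = dict.fromkeys(FACTORS, 0.7)
user["educational_attainment"] = 0.9
p = share_probability(user)
print(f"P(share) = {p:.2f} ->", "likely shares" if p > 0.5 else "likely verifies")
```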

6 citations

Journal ArticleDOI
TL;DR: A novel method for analyzing topics extracted from Twitter using the dynamic wavelet fingerprint technique (DWFT), which can identify behavior, localized in time, that is characteristic of how different topics propagate through Twitter.
Abstract: We describe a novel method for analyzing topics extracted from Twitter using the dynamic wavelet fingerprint technique (DWFT). Topics are derived, via a dynamic topic model, from the 7 tweet storms analyzed in the study. Using the time series of each topic, we run DWFT analyses to obtain a two-dimensional, time-scale, binary image. Gaussian mixture model clustering is used to identify individual objects, or storm cells, characteristic of specific local behaviors that commonly occur in topics. The DWFT time series transformation is volume agnostic, meaning tweet storms of different intensities can be compared. We find that we can identify behavior, localized in time, that is characteristic of how different topics propagate through Twitter. The use of dynamic topic models and the DWFT creates the basis for future applications as a real-time Twitter analysis system for flagging fake news.
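As a rough illustration of the pipeline (topic time series, time-scale binary image, storm-cell clustering), the sketch below substitutes a plain continuous wavelet transform from PyWavelets for the actual DWFT, whose slicing and thresholding scheme differs; the tweet-count series is synthetic.

```python
# Rough sketch of the pipeline: topic time series -> wavelet transform ->
# binary time-scale image -> GMM clustering of "storm cells". A plain
# continuous wavelet transform stands in for the actual DWFT here.
import numpy as np
import pywt
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
series = rng.poisson(5, 512).astype(float)                   # baseline chatter
series += 80 * np.exp(-((np.arange(512) - 200) ** 2) / 50)   # injected burst
series = (series - series.mean()) / series.std()             # volume-agnostic

coefs, _ = pywt.cwt(series, scales=np.arange(1, 33), wavelet="morl")
binary = np.abs(coefs) > 2.0                                 # time-scale image

# Cluster the active (scale, time) pixels into storm cells.
pixels = np.argwhere(binary)
gmm = GaussianMixture(n_components=2, random_state=0).fit(pixels)
print("storm-cell centers (scale, time):", gmm.means_.round(1))
```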

6 citations

Posted ContentDOI
TL;DR: Although these findings were obtained during the first peak of the COVID-19 pandemic, it is recommended that the authorities immediately introduce the national media as a reliable news source, allowing the media and its journalists to reduce the gap between themselves and the public sphere.
Abstract: BACKGROUND: In the COVID-19 pandemic, rumors travel far faster than the outbreak itself. The current study aimed to evaluate the factors affecting the attitudes of individuals towards rumor-producing media in Iran. METHODS: An online cross-sectional survey was conducted in Iran in March 2020 on the sources of information and rumors, along with individuals' perceptions of the reasons for rumor propagation during the COVID-19 pandemic. RESULTS: Results showed that the majority of the participants (59.3%) believed that social media were the main source of rumors. The lack of a reliable and formal news source was also considered the most common cause of rumor propagation by the participants (63.6%). An evaluation was carried out to identify the main sources of misinformation and rumors: retired participants considered foreign media the main source (P < 0.001), while middle-income participants believed that social media were the main source (P < 0.001). Highly educated participants (P < 0.001), government employees, and middle-income individuals (P = 0.008) believed that national media produced rumors. CONCLUSION: Although the findings were obtained during the first peak of the COVID-19 pandemic, it is recommended that the authorities immediately introduce the national media as a reliable news source, allowing both the media and its journalists to reduce the gap between themselves and the public sphere. It was also suggested that social networks and foreign media be more accountable during pandemics.

6 citations

Journal ArticleDOI
TL;DR: In this article, the authors investigated the ways in which mainstream digital news covers the etiology of obesity and diseases associated with the burden of obesity, finding that the discourse linking the obesity epidemic to personal afflictions is the most emphasized approach.
Abstract: Background: The fact that the number of individuals with obesity has increased worldwide calls into question media efforts to inform the public. This study attempts to determine the ways in which mainstream digital news covers the etiology of obesity and diseases associated with the burden of obesity. Objective: The dual objectives of this study are to understand what the news reports about obesity and to explore meaning in the data by extending preconceived grounded theory. Methods: Ten years of news text (2010-2019) were analyzed to compare the development of obesity-related coverage and its potential impact on public perception in Mainland China, Hong Kong, and Taiwan. Digital news stories on obesity, along with afflictions and inferences, were sampled from 9 Chinese mainstream newspapers. An automatic content analysis tool, DiVoMiner, was employed; this computer-aided platform is designed to organize and filter large sets of data on the basis of patterns of word occurrence and term discovery. The Python 3 programming language was used to explore connections and patterns created by the aggregated interactions. Results: A total of 30,968 news stories were identified, with increasing attention since 2016. The highest intensity of newspaper coverage of obesity communication was observed in Taiwan. Overall, the 2 most emphasized shared causative attributes of obesity were stress (n=4483, 33.0%) and tobacco use (n=3148, 23.2%). The burden linking obesity and cardiovascular diseases is implied to be the greatest, although the aggregated interaction of edge centrality shows the strongest link between cancer and obesity. This study goes beyond traditional journalism studies by extending the framework of computational and customizable web-based text analysis, which could set a norm for researchers and practitioners working on similar data projects. Conclusions: Consistent with previous studies, the discourse linking the obesity epidemic to personal afflictions is the most emphasized approach. Our study also indicates that the inclination to blame personal attributes for health afflictions potentially limits social and governmental responsibility for addressing this issue.
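DiVoMiner is a proprietary platform, so the sketch below only approximates the described analysis: it builds a term co-occurrence graph from toy story codings and scores links by edge betweenness centrality, which is one plausible reading of the paper's "edge centrality".

```python
# Hypothetical approximation of the co-occurrence analysis: terms that
# appear in the same story are linked, link weights count co-occurrences,
# and edges are ranked by betweenness centrality.
from itertools import combinations
import networkx as nx

stories = [  # toy stand-ins for coded news stories
    {"obesity", "stress", "cardiovascular"},
    {"obesity", "tobacco", "cancer"},
    {"obesity", "cancer", "stress"},
    {"obesity", "cardiovascular", "tobacco"},
]

G = nx.Graph()
for terms in stories:
    for u, v in combinations(sorted(terms), 2):
        weight = G.get_edge_data(u, v, default={"weight": 0})["weight"]
        G.add_edge(u, v, weight=weight + 1)  # count co-occurrences

for (u, v), c in sorted(nx.edge_betweenness_centrality(G).items(),
                        key=lambda kv: -kv[1])[:3]:
    print(f"{u} -- {v}: betweenness {c:.3f}, co-occurrences {G[u][v]['weight']}")
```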

6 citations

References
Journal ArticleDOI
28 May 2015-Nature
TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
Abstract: Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
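A minimal numpy sketch of the abstract's core mechanism, backpropagation: the gradient of a loss is chained backwards through two layers to update their weights. The data and architecture are toy choices for illustration.

```python
# Minimal backpropagation sketch: a two-layer network trained by chaining
# the loss gradient backwards through each layer (numpy only).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float).reshape(-1, 1)  # XOR-like target

W1, b1 = rng.normal(scale=0.5, size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(scale=0.5, size=(8, 1)), np.zeros(1)

for _ in range(2000):
    h = np.tanh(X @ W1 + b1)                 # layer 1: learned representation
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))     # layer 2: sigmoid output
    grad_z = (p - y) / len(X)                # d(cross-entropy)/d(logit)
    gW2, gb2 = h.T @ grad_z, grad_z.sum(0)   # chain rule, output layer
    grad_h = (grad_z @ W2.T) * (1 - h ** 2)  # chain rule through tanh
    gW1, gb1 = X.T @ grad_h, grad_h.sum(0)
    for P, g in ((W1, gW1), (b1, gb1), (W2, gW2), (b2, gb2)):
        P -= 0.5 * g                         # gradient-descent update

print("training accuracy:", ((p > 0.5) == y).mean())
```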

46,982 citations

Book ChapterDOI
TL;DR: In this paper, the authors present a critique of expected utility theory as a descriptive model of decision making under risk, and develop an alternative model, called prospect theory, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights.
Abstract: This paper presents a critique of expected utility theory as a descriptive model of decision making under risk, and develops an alternative model, called prospect theory. Choices among risky prospects exhibit several pervasive effects that are inconsistent with the basic tenets of utility theory. In particular, people underweight outcomes that are merely probable in comparison with outcomes that are obtained with certainty. This tendency, called the certainty effect, contributes to risk aversion in choices involving sure gains and to risk seeking in choices involving sure losses. In addition, people generally discard components that are shared by all prospects under consideration. This tendency, called the isolation effect, leads to inconsistent preferences when the same choice is presented in different forms. An alternative theory of choice is developed, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights. The value function is normally concave for gains, commonly convex for losses, and is generally steeper for losses than for gains. Decision weights are generally lower than the corresponding probabilities, except in the range of low probabilities. Overweighting of low probabilities may contribute to the attractiveness of both insurance and gambling. Expected utility theory has dominated the analysis of decision making under risk. It has been generally accepted as a normative model of rational choice (24), and widely applied as a descriptive model of economic behavior, e.g. (15, 4). Thus, it is assumed that all reasonable people would wish to obey the axioms of the theory (47, 36), and that most people actually do, most of the time. The present paper describes several classes of choice problems in which preferences systematically violate the axioms of expected utility theory. In the light of these observations we argue that utility theory, as it is commonly interpreted and applied, is not an adequate descriptive model and we propose an alternative account of choice under risk.
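The 1979 paper specifies only the qualitative shapes of the value and weighting functions; the sketch below borrows the parametric forms and median estimates from Tversky and Kahneman's 1992 follow-up to reproduce the certainty effect numerically.

```python
# Illustrative prospect-theory evaluation, V = sum w(p) * v(x). The 1979
# paper gives only qualitative shapes; the functional forms and numbers
# below are the median estimates from Tversky & Kahneman (1992).
def v(x, alpha=0.88, lam=2.25):
    # concave for gains, convex and steeper (loss aversion) for losses
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha

def w(p, gamma=0.61):
    # overweights low probabilities, underweights moderate and high ones
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

def V(prospect):  # prospect: list of (outcome, probability) pairs
    return sum(w(p) * v(x) for x, p in prospect)

# Certainty effect: a sure 3000 is preferred to an 80% chance of 4000,
# even though the gamble's expected value (3200) is higher.
print(V([(3000, 1.0)]), ">", V([(4000, 0.80)]))  # ~1147 > ~899
```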

35,067 citations

Book ChapterDOI
09 Jan 2004
TL;DR: A theory of intergroup conflict and some preliminary data relating to it are presented, though the analysis is limited to cases where the salient dimensions of intergroup differentiation involve scarce resources.
Abstract: This chapter presents an outline of a theory of intergroup conflict and some preliminary data relating to the theory. Much of the work on the social psychology of intergroup relations has focused on patterns of individual prejudices and discrimination and on the motivational sequences of interpersonal interaction. The intensity of explicit intergroup conflicts of interests is closely related in human cultures to the degree of opprobrium attached to the notion of "renegade" or "traitor." The basic and highly reliable finding is that the trivial, ad hoc intergroup categorization leads to in-group favoritism and discrimination against the out-group. Many orthodox definitions of "social groups" are unduly restrictive when applied to the context of intergroup relations. The equation of social competition and intergroup conflict rests on the assumptions concerning an "ideal type" of social stratification in which the salient dimensions of intergroup differentiation are those involving scarce resources.

14,812 citations

Journal ArticleDOI
TL;DR: Cumulative prospect theory applies to uncertain as well as to risky prospects with any number of outcomes, and it allows different weighting functions for gains and for losses; two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting functions.
Abstract: We develop a new version of prospect theory that employs cumulative rather than separable decision weights and extends the theory in several respects. This version, called cumulative prospect theory, applies to uncertain as well as to risky prospects with any number of outcomes, and it allows different weighting functions for gains and for losses. Two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting functions. A review of the experimental evidence and the results of a new experiment confirm a distinctive fourfold pattern of risk attitudes: risk aversion for gains and risk seeking for losses of high probability; risk seeking for gains and risk aversion for losses of low probability. Expected utility theory reigned for several decades as the dominant normative and descriptive model of decision making under uncertainty, but it has come under serious question in recent years. There is now general agreement that the theory does not provide an adequate description of individual choice: a substantial body of evidence shows that decision makers systematically violate its basic tenets. Many alternative models have been proposed in response to this empirical challenge (for reviews, see Camerer, 1989; Fishburn, 1988; Machina, 1987). Some time ago we presented a model of choice, called prospect theory, which explained the major violations of expected utility theory in choices between risky prospects with a small number of outcomes (Kahneman and Tversky, 1979; Tversky and Kahneman, 1986). The key elements of this theory are 1) a value function that is concave for gains, convex for losses, and steeper for losses than for gains, and 2) a nonlinear transformation of the probability scale, which overweights small probabilities and underweights moderate and high probabilities.
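The sketch below illustrates the cumulative weighting scheme itself: ranked outcomes receive differences of weighted cumulative probabilities rather than separately weighted probabilities. Parameters are the paper's median estimates; the example prospect is illustrative.

```python
# Cumulative weighting sketch: outcomes are ranked from most to least
# extreme, and each gets the *difference* of weighted cumulative
# probabilities rather than w(p_i) directly.
def v(x, alpha=0.88, lam=2.25):
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha

def w(p, c):  # c = 0.61 for gains, 0.69 for losses (median estimates)
    return p ** c / (p ** c + (1 - p) ** c) ** (1 / c)

def cpt_value(prospect):
    """prospect: list of (outcome, probability) pairs summing to 1."""
    total = 0.0
    for outcomes, c in ((sorted((o for o in prospect if o[0] >= 0), reverse=True), 0.61),
                        (sorted(o for o in prospect if o[0] < 0), 0.69)):
        cum = 0.0
        for x, p in outcomes:  # most extreme outcome first
            total += (w(cum + p, c) - w(cum, c)) * v(x)
            cum += p
    return total

# Fourfold pattern, low-probability gain: the gamble is valued well above
# a sure amount equal to its expected value (risk seeking).
print(cpt_value([(100, 0.01), (0, 0.99)]), ">", cpt_value([(1, 1.0)]))
```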

13,433 citations
