scispace - formally typeset
Search or ask a question
Journal ArticleDOI

Fake News Detection on Social Media: A Data Mining Perspective

01 Sep 2017-Sigkdd Explorations (ACM)-Vol. 19, Iss: 1, pp 22-36
TL;DR: Wang et al. as discussed by the authors presented a comprehensive review of detecting fake news on social media, including fake news characterizations on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics and representative datasets.
Abstract: Social media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from social media. On the other hand, it enables the wide spread of \fake news", i.e., low quality news with intentionally false information. The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Therefore, fake news detection on social media has recently become an emerging research that is attracting tremendous attention. Fake news detection on social media presents unique characteristics and challenges that make existing detection algorithms from traditional news media ine ective or not applicable. First, fake news is intentionally written to mislead readers to believe false information, which makes it difficult and nontrivial to detect based on news content; therefore, we need to include auxiliary information, such as user social engagements on social media, to help make a determination. Second, exploiting this auxiliary information is challenging in and of itself as users' social engagements with fake news produce data that is big, incomplete, unstructured, and noisy. Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including fake news characterizations on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media.
Citations
More filters
Proceedings ArticleDOI
19 Jul 2018
TL;DR: An end-to-end framework named Event Adversarial Neural Network (EANN), which can derive event-invariant features and thus benefit the detection of fake news on newly arrived events, is proposed.
Abstract: As news reading on social media becomes more and more popular, fake news becomes a major issue concerning the public and government. The fake news can take advantage of multimedia content to mislead readers and get dissemination, which can cause negative effects or even manipulate the public events. One of the unique challenges for fake news detection on social media is how to identify fake news on newly emerged events. Unfortunately, most of the existing approaches can hardly handle this challenge, since they tend to learn event-specific features that can not be transferred to unseen events. In order to address this issue, we propose an end-to-end framework named Event Adversarial Neural Network (EANN), which can derive event-invariant features and thus benefit the detection of fake news on newly arrived events. It consists of three main components: the multi-modal feature extractor, the fake news detector, and the event discriminator. The multi-modal feature extractor is responsible for extracting the textual and visual features from posts. It cooperates with the fake news detector to learn the discriminable representation for the detection of fake news. The role of event discriminator is to remove the event-specific features and keep shared features among events. Extensive experiments are conducted on multimedia datasets collected from Weibo and Twitter. The experimental results show our proposed EANN model can outperform the state-of-the-art methods, and learn transferable feature representations.

627 citations

Journal ArticleDOI
TL;DR: A fake news data repository FakeNewsNet is presented, which contains two comprehensive data sets with diverse features in news content, social context, and spatiotemporal information, and is discussed for potential applications on fake news study on social media.
Abstract: Social media has become a popular means for people to consume and share the news. At the same time, however, it has also enabled the wide dissemination of fake news, that is, news with intentionally false information, causing significant negative effects on society. To mitigate this problem, the research of fake news detection has recently received a lot of attention. Despite several existing computational solutions on the detection of fake news, the lack of comprehensive and community-driven fake news data sets has become one of major roadblocks. Not only existing data sets are scarce, they do not contain a myriad of features often required in the study such as news content, social context, and spatiotemporal information. Therefore, in this article, to facilitate fake news-related research, we present a fake news data repository FakeNewsNet, which contains two comprehensive data sets with diverse features in news content, social context, and spatiotemporal information. We present a comprehensive description of the FakeNewsNet, demonstrate an exploratory analysis of two data sets from different perspectives, and discuss the benefits of the FakeNewsNet for potential applications on fake news study on social media.

577 citations

Journal ArticleDOI
TL;DR: This survey provides a thorough review of techniques for manipulating face images including DeepFake methods, and methods to detect such manipulations, with special attention to the latest generation of DeepFakes.

502 citations

Journal ArticleDOI
TL;DR: To address the spread of misinformation, the frontline healthcare providers should be equipped with the most recent research findings and accurate information, and advanced technologies like natural language processing or data mining approaches should be applied in the detection and removal of online content with no scientific basis from all social media platforms.
Abstract: The coronavirus disease 2019 (COVID-19) pandemic has not only caused significant challenges for health systems all over the globe but also fueled the surge of numerous rumors, hoaxes, and misinformation, regarding the etiology, outcomes, prevention, and cure of the disease. Such spread of misinformation is masking healthy behaviors and promoting erroneous practices that increase the spread of the virus and ultimately result in poor physical and mental health outcomes among individuals. Myriad incidents of mishaps caused by these rumors have been reported globally. To address this issue, the frontline healthcare providers should be equipped with the most recent research findings and accurate information. The mass media, healthcare organization, community-based organizations, and other important stakeholders should build strategic partnerships and launch common platforms for disseminating authentic public health messages. Also, advanced technologies like natural language processing or data mining approaches should be applied in the detection and removal of online content with no scientific basis from all social media platforms. Furthermore, these practices should be controlled with regulatory and law enforcement measures alongside ensuring telemedicine-based services providing accurate information on COVID-19.

474 citations

Journal ArticleDOI
TL;DR: A comprehensive overview of the finding to date relating to fake news is presented, characterized the negative impact of online fake news, and the state-of-the-art in detection methods are characterized.
Abstract: Over the recent years, the growth of online social media has greatly facilitated the way people communicate with each other. Users of online social media share information, connect with other people and stay informed about trending events. However, much recent information appearing on social media is dubious and, in some cases, intended to mislead. Such content is often called fake news. Large amounts of online fake news has the potential to cause serious problems in society. Many point to the 2016 U.S. presidential election campaign as having been influenced by fake news. Subsequent to this election, the term has entered the mainstream vernacular. Moreover it has drawn the attention of industry and academia, seeking to understand its origins, distribution and effects. Of critical interest is the ability to detect when online content is untrue and intended to mislead. This is technically challenging for several reasons. Using social media tools, content is easily generated and quickly spread, leading to a large volume of content to analyse. Online information is very diverse, covering a large number of subjects, which contributes complexity to this task. The truth and intent of any statement often cannot be assessed by computers alone, so efforts must depend on collaboration between humans and technology. For instance, some content that is deemed by experts of being false and intended to mislead are available. While these sources are in limited supply, they can form a basis for such a shared effort. In this survey, we present a comprehensive overview of the finding to date relating to fake news. We characterize the negative impact of online fake news, and the state-of-the-art in detection methods. Many of these rely on identifying features of the users, content, and context that indicate misinformation. We also study existing datasets that have been used for classifying fake news. Finally, we propose promising research directions for online fake news analysis.

449 citations

References
More filters
Proceedings Article
01 Jul 2013
TL;DR: This paper analyzes the effectiveness of the fusion of global and local features in automatic image annotation and content based image retrieval community, including some classic models and their illustrations in the literature.
Abstract: Feature extraction and representation is a crucial step for multimedia processing. How to extract ideal features that can reflect the intrinsic content of the images as complete as possible is still a challenging problem in computer vision. However, very little research has paid attention to this problem in the last decades. So in this paper, we focus our review on the latest development in image feature extraction and provide a comprehensive survey on image feature representation techniques. In particular, we analyze the effectiveness of the fusion of global and local features in automatic image annotation and content based image retrieval community, including some classic models and their illustrations in the literature. Finally, we summarize this paper with some important conclusions and point out the future potential research directions.

248 citations

Journal ArticleDOI
TL;DR: In this article, the authors focus on how both Italian and US Facebook users relate to two distinct narratives (involving conspiracy theories and science), and offer quantitative evidence that echo chambers actually exist on social media.
Abstract: Do echo chambers actually exist on social media? By focusing on how both Italian and US Facebook users relate to two distinct narratives (involving conspiracy theories and science), we offer quantitative evidence that they do. The explanation involves users’ tendency to promote their favored narratives and hence to form polarized groups. Confirmation bias helps to account for users’ decisions about whether to spread content, thus creating informational cascades within identifiable communities. At the same time, aggregation of favored information within those communities reinforces selective exposure and group polarization. We provide empirical evidence that because they focus on their preferred narratives, users tend to assimilate only confirming claims and to ignore apparent refutations.

242 citations

Proceedings ArticleDOI
03 Apr 2017
TL;DR: A new framework Visual Content Enhanced POI recommendation (VPOI), which incorporates visual contents for POI recommendations, is proposed, which demonstrates the effectiveness of the proposed framework on real-world datasets.
Abstract: The rapid growth of Location-based Social Networks (LBSNs) provides a vast amount of check-in data, which facilitates the study of point-of-interest (POI) recommendation. The majority of the existing POI recommendation methods focus on four aspects, i.e., temporal patterns, geographical influence, social correlations and textual content indications. For example, user's visits to locations have temporal patterns and users are likely to visit POIs near them. In real-world LBSNs such as Instagram, users can upload photos associating with locations. Photos not only reflect users' interests but also provide informative descriptions about locations. For example, a user who posts many architecture photos is more likely to visit famous landmarks; while a user posts lots of images about food has more incentive to visit restaurants. Thus, images have potentials to improve the performance of POI recommendation. However, little work exists for POI recommendation by exploiting images. In this paper, we study the problem of enhancing POI recommendation with visual contents. In particular, we propose a new framework Visual Content Enhanced POI recommendation (VPOI), which incorporates visual contents for POI recommendations. Experimental results on real-world datasets demonstrate the effectiveness of the proposed framework.

233 citations

Journal ArticleDOI
TL;DR: Key achievements of user identity linkage across online social networks including state-of- the-art algorithms, evaluation metrics, and representative datasets are reviewed.
Abstract: The increasing popularity and diversity of social media sites has encouraged more and more people to participate on multiple online social networks to enjoy their services. Each user may create a user identity, which can includes profile, content, or network information, to represent his or her unique public figure in every social network. Thus, a fundamental question arises -- can we link user identities across online social networks? User identity linkage across online social networks is an emerging task in social media and has attracted increasing attention in recent years. Advancements in user identity linkage could potentially impact various domains such as recommendation and link prediction. Due to the unique characteristics of social network data, this problem faces tremendous challenges. To tackle these challenges, recent approaches generally consist of (1) extracting features and (2) constructing predictive models from a variety of perspectives. In this paper, we review key achievements of user identity linkage across online social networks including stateof- the-art algorithms, evaluation metrics, and representative datasets. We also discuss related research areas, open problems, and future research directions for user identity linkage across online social networks.

231 citations

Book ChapterDOI
01 Jan 2017
TL;DR: Experimental results on two realworld datasets of social media demonstrate the effectiveness of the proposed deep learning framework SiNE for signed network embedding that optimizes an objective function guided by social theories that provide a fundamental understanding of signed social networks.
Abstract: Network embedding is to learn low-dimensional vector representations for nodes of a given social network, facilitating many tasks in social network analysis such as link prediction. The vast majority of existing embedding algorithms are designed for unsigned social networks or social networks with only positive links. However, networks in social media could have both positive and negative links, and little work exists for signed social networks. From recent findings of signed network analysis, it is evident that negative links have distinct properties and added value besides positive links, which brings about both challenges and opportunities for signed network embedding. In this paper, we propose a deep learning framework SiNE for signed network embedding. The framework optimizes an objective function guided by social theories that provide a fundamental understanding of signed social networks. Experimental results on two realworld datasets of social media demonstrate the effectiveness of the proposed framework SiNE.

230 citations

Trending Questions (1)
Issue of fake news

The paper discusses the issue of fake news on social media and its potential negative impacts on individuals and society.