Journal ArticleDOI

Fake News Detection on Social Media: A Data Mining Perspective

01 Sep 2017 - SIGKDD Explorations (ACM) - Vol. 19, Iss. 1, pp. 22-36
TL;DR: Shu et al. present a comprehensive review of detecting fake news on social media, including fake news characterizations based on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics, and representative datasets.
Abstract: Social media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from social media. On the other hand, it enables the wide spread of "fake news", i.e., low quality news with intentionally false information. The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Therefore, fake news detection on social media has recently become an emerging research topic that is attracting tremendous attention. Fake news detection on social media presents unique characteristics and challenges that make existing detection algorithms from traditional news media ineffective or not applicable. First, fake news is intentionally written to mislead readers to believe false information, which makes it difficult and nontrivial to detect based on news content; therefore, we need to include auxiliary information, such as user social engagements on social media, to help make a determination. Second, exploiting this auxiliary information is challenging in and of itself as users' social engagements with fake news produce data that is big, incomplete, unstructured, and noisy. Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including fake news characterizations on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media.
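The survey frames detection as a classification problem over news content plus auxiliary social-context signals. As a rough illustration of that framing (not the survey's own code), here is a minimal scikit-learn sketch that concatenates TF-IDF content features with log-scaled engagement counts; the toy data and the meaning of the engagement columns are assumptions.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy data; the engagement columns ([shares, comments]) are assumptions.
texts = ["shocking cure doctors hide", "city council approves budget"]
engagement = np.array([[5400, 230], [12, 3]])
labels = np.array([1, 0])  # 1 = fake, 0 = real

# Content features: TF-IDF over the article/post text.
X_content = TfidfVectorizer().fit_transform(texts)
# Social-context features: log-scaled engagement counts as extra columns.
X_social = csr_matrix(np.log1p(engagement))
X = hstack([X_content, X_social])

clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```

A real system would replace the toy engagement statistics with the richer, noisier user-engagement data the survey describes.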
Citations
Proceedings ArticleDOI
27 Aug 2019
TL;DR: This work incorporates users' latent sentiments into an end-to-end deep embedding framework for detecting fake news, named SAME, and defines a novel regularization loss to bring embeddings of relevant pairs closer.
Abstract: How to effectively detect fake news and prevent its diffusion on social media has gained much attention in recent years. However, relatively little focus has been given to exploiting the user comments left on posts, and the latent sentiments therein, for detecting fake news. Inspired by the rich information available in user comments on social media, we investigate whether the latent sentiments hidden in user comments can help distinguish fake news from reliable content. We incorporate users' latent sentiments into an end-to-end deep embedding framework for detecting fake news, named SAME. First, we use multi-modal networks to deal with heterogeneous data modalities. Second, to learn semantically meaningful spaces per data source, we adopt an adversarial mechanism. Third, we define a novel regularization loss to bring embeddings of relevant pairs closer. Our comprehensive validation using two real-world datasets, PolitiFact and GossipCop, demonstrates the effectiveness of SAME in detecting fake news, significantly outperforming state-of-the-art methods.
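The third component, a regularization loss that pulls embeddings of relevant pairs together, is in the spirit of a contrastive objective. Below is a hedged PyTorch sketch of such a pairwise loss; the paper's exact formulation may differ, and the margin value is an assumption.

```python
import torch
import torch.nn.functional as F

def pairwise_reg_loss(z_a, z_b, relevant, margin=1.0):
    """z_a, z_b: (batch, dim) embeddings from two data sources.
    relevant: (batch,) tensor, 1.0 for relevant pairs, 0.0 otherwise."""
    d = F.pairwise_distance(z_a, z_b)                  # distance per pair
    pull = relevant * d.pow(2)                         # pull relevant pairs closer
    push = (1 - relevant) * F.relu(margin - d).pow(2)  # push irrelevant pairs apart
    return (pull + push).mean()

z_news = torch.randn(4, 64)      # e.g. news-content embeddings
z_comments = torch.randn(4, 64)  # e.g. user-comment embeddings
rel = torch.tensor([1.0, 1.0, 0.0, 0.0])
print(pairwise_reg_loss(z_news, z_comments, rel))
```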

87 citations

Journal ArticleDOI
01 Jun 2021
TL;DR: BERT and similar pre-trained models perform the best for fake news detection, especially with very small datasets; hence, these models are a significantly better option for languages with limited electronic content, i.e., training data.
Abstract: The proliferation of fake news and its propagation on social media has become a major concern due to its ability to create devastating impacts. Different machine learning approaches have been suggested to detect fake news. However, most of these focused on a specific type of news (such as political news), which raises the question of dataset bias in the models used. In this research, we conducted a benchmark study to assess the performance of different applicable machine learning approaches on three different datasets, of which we accumulated the largest and most diversified one. We explored a number of advanced pre-trained language models for fake news detection along with the traditional and deep learning ones, and compared their performances from different aspects, to the best of our knowledge for the first time. We find that BERT and similar pre-trained models perform the best for fake news detection, especially with very small datasets. Hence, these models are a significantly better option for languages with limited electronic content, i.e., training data. We also carried out several analyses based on the models' performance, the article's topic, and the article's length, and discuss different lessons learned from them. We believe that this benchmark study will help the research community explore further and help news sites/blogs select the most appropriate fake news detection method.
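The pre-trained models benchmarked here follow the standard fine-tuning recipe. A minimal sketch using the Hugging Face transformers API is shown below; the checkpoint name, learning rate, and toy examples are illustrative assumptions, not the paper's exact configuration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # 0 = real, 1 = fake

texts = ["aliens endorse candidate", "parliament passes trade bill"]
batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
labels = torch.tensor([1, 0])

optim = torch.optim.AdamW(model.parameters(), lr=2e-5)
out = model(**batch, labels=labels)  # forward pass returns loss and logits
out.loss.backward()                  # one fine-tuning gradient step
optim.step()
print(out.logits.argmax(dim=-1))
```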

86 citations

Posted Content
TL;DR: A new dataset for fine-grained disinformation analysis that focuses on COVID-19, combines the perspectives and the interests of journalists, fact-checkers, social media platforms, policy makers, and society as a whole, and covers both English and Arabic is designed and annotated.
Abstract: With the emergence of the COVID-19 pandemic, the political and the medical aspects of disinformation merged as the problem got elevated to a whole new level to become the first global infodemic. Fighting this infodemic is ranked second in the list of the most important focus areas of the World Health Organization, with dangers ranging from promoting fake cures, rumors, and conspiracy theories to spreading xenophobia and panic. Addressing the issue requires solving a number of challenging problems such as identifying messages containing claims, determining their check-worthiness and factuality, and their potential to do harm as well as the nature of that harm, to mention just a few. Thus, here we design, annotate, and release to the research community a new dataset for fine-grained disinformation analysis that (i) focuses on COVID-19, (ii) combines the perspectives and the interests of journalists, fact-checkers, social media platforms, policy makers, and society as a whole, and (iii) covers both English and Arabic. Finally, we show strong evaluation results using state-of-the-art Transformers, thus confirming the practical utility of the annotation schema and of the dataset.
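An annotation schema that poses several questions per message (check-worthiness, factuality, harm, among others) naturally maps onto a shared encoder with one classification head per question. The PyTorch sketch below illustrates that multi-head structure with a stand-in embedding-bag encoder; the paper's systems use state-of-the-art Transformers, and the head names here are simplifications of the actual schema.

```python
import torch
import torch.nn as nn

class MultiQuestionClassifier(nn.Module):
    def __init__(self, vocab_size=30000, dim=128):
        super().__init__()
        # Shared text representation; a stand-in for a Transformer encoder.
        self.encoder = nn.EmbeddingBag(vocab_size, dim)
        # One binary head per annotation question (simplified schema).
        self.heads = nn.ModuleDict({
            "check_worthy": nn.Linear(dim, 2),
            "factual": nn.Linear(dim, 2),
            "harmful": nn.Linear(dim, 2),
        })

    def forward(self, token_ids):
        z = self.encoder(token_ids)  # (batch, dim) pooled representation
        return {q: head(z) for q, head in self.heads.items()}

model = MultiQuestionClassifier()
logits = model(torch.randint(0, 30000, (4, 32)))  # batch of 4 tokenized posts
print({q: tuple(v.shape) for q, v in logits.items()})
```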

86 citations

Posted Content
TL;DR: This work builds hierarchical propagation networks of fake news at the macro and micro levels, shows the effectiveness of propagation network features for fake news detection, and paves the way towards a healthier online news ecosystem.
Abstract: Consuming news from social media is becoming increasingly popular. However, social media also enables the wide spread of fake news. Because of its detrimental effects, fake news detection has attracted increasing attention. However, the performance of detecting fake news only from news content is generally limited, as fake news pieces are written to mimic true news. In the real world, news pieces spread through propagation networks on social media, and these propagation networks usually involve multiple levels. In this paper, we study the challenging problem of investigating and exploiting hierarchical news propagation networks on social media for fake news detection. In an attempt to understand the correlations between news propagation networks and fake news, first, we build hierarchical propagation networks at the macro level and micro level for fake news and true news; second, we perform a comparative analysis of the propagation network features from linguistic, structural, and temporal perspectives between fake and real news, which demonstrates the potential of utilizing these features to detect fake news; third, we show the effectiveness of these propagation network features for fake news detection. We further validate the effectiveness of these features through feature importance analysis. Altogether, this work presents a data-driven view of hierarchical propagation networks and fake news and paves the way towards a healthier online news ecosystem.
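To make the macro-level analysis concrete, here is a hedged networkx sketch computing a few structural and temporal cascade features of the kind compared between fake and real news; the toy cascade, timestamps, and feature choices are illustrative assumptions, and the paper's feature set is considerably richer.

```python
import networkx as nx

# Toy propagation cascade: edges point from a post to the reposts it spawned.
G = nx.DiGraph()
G.add_edges_from([("news", "u1"), ("news", "u2"), ("u1", "u3"), ("u3", "u4")])
timestamps = {"news": 0, "u1": 5, "u2": 9, "u3": 30, "u4": 65}  # minutes

depth = max(nx.shortest_path_length(G, "news").values())  # cascade depth
size = G.number_of_nodes()                                # cascade size
max_breadth = max(d for _, d in G.out_degree())           # widest fan-out
lifetime = max(timestamps.values()) - timestamps["news"]  # temporal span

features = [depth, size, max_breadth, lifetime]
print(features)  # e.g. fed into a downstream classifier
```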

86 citations

Journal ArticleDOI
TL;DR: In this article, an updated deep neural network was proposed for detecting false news in tweets carrying COVID-19 information, and the results obtained with the proposed framework reveal high accuracy in detecting fake and non-fake tweets.
Abstract: COVID-19 has affected all people's lives. Though COVID-19 is still on the rise, the existence of misinformation about the virus has grown in parallel. The spread of misinformation has created confusion among people, caused disturbances in society, and even led to deaths. Social media is central to our daily lives, and the Internet has become a significant source of knowledge. Owing to the widespread damage caused by fake news, it is important to build computerized systems to detect it. This paper proposes updated deep neural networks for the identification of false news: the Modified-LSTM (one to three layers) and the Modified-GRU (one to three layers). In particular, we investigate a large dataset of tweets carrying information related to COVID-19, separating the dubious claims into two categories: true and false. We compare the performance of the proposed approaches, in terms of prediction accuracy, with six machine learning techniques: decision trees (DT), logistic regression (LR), k-nearest neighbors (KNN), random forests (RF), support vector machines (SVM), and naive Bayes (NB). The parameters of the deep learning techniques are optimized using Keras-tuner. Four benchmark datasets were used. Two feature extraction methods (TF-IDF with n-grams) were used to extract essential features from the four benchmark datasets for the baseline machine learning models, and a word-embedding feature extraction method was used for the proposed deep neural network methods. The results obtained with the proposed framework reveal high accuracy in detecting fake and non-fake tweets containing COVID-19 information, demonstrating significant improvement over the existing state-of-the-art results of the baseline machine learning models.
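For reference, a stacked-LSTM classifier of the kind the paper calls Modified-LSTM (one to three layers over word embeddings) can be sketched in Keras as follows; the layer sizes and other hyperparameters are placeholders where the paper used Keras-tuner.

```python
import tensorflow as tf

vocab_size, seq_len = 20000, 50
model = tf.keras.Sequential([
    tf.keras.Input(shape=(seq_len,)),                # padded token-id sequences
    tf.keras.layers.Embedding(vocab_size, 128),      # word-embedding input
    tf.keras.layers.LSTM(64, return_sequences=True), # LSTM layer 1 of the stack
    tf.keras.layers.LSTM(64),                        # LSTM layer 2
    tf.keras.layers.Dense(1, activation="sigmoid"),  # fake vs. non-fake
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(padded_token_ids, labels, epochs=3)  # assuming preprocessed tweets
```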

85 citations

References
Journal ArticleDOI
28 May 2015 - Nature
TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
Abstract: Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.

46,982 citations

Book ChapterDOI
TL;DR: In this paper, the authors present a critique of expected utility theory as a descriptive model of decision making under risk, and develop an alternative model, called prospect theory, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights.
Abstract: This paper presents a critique of expected utility theory as a descriptive model of decision making under risk, and develops an alternative model, called prospect theory. Choices among risky prospects exhibit several pervasive effects that are inconsistent with the basic tenets of utility theory. In particular, people underweight outcomes that are merely probable in comparison with outcomes that are obtained with certainty. This tendency, called the certainty effect, contributes to risk aversion in choices involving sure gains and to risk seeking in choices involving sure losses. In addition, people generally discard components that are shared by all prospects under consideration. This tendency, called the isolation effect, leads to inconsistent preferences when the same choice is presented in different forms. An alternative theory of choice is developed, in which value is assigned to gains and losses rather than to final assets and in which probabilities are replaced by decision weights. The value function is normally concave for gains, commonly convex for losses, and is generally steeper for losses than for gains. Decision weights are generally lower than the corresponding probabilities, except in the range of low probabilities. Overweighting of low probabilities may contribute to the attractiveness of both insurance and gambling. Expected utility theory has dominated the analysis of decision making under risk. It has been generally accepted as a normative model of rational choice (24), and widely applied as a descriptive model of economic behavior, e.g. (15, 4). Thus, it is assumed that all reasonable people would wish to obey the axioms of the theory (47, 36), and that most people actually do, most of the time. The present paper describes several classes of choice problems in which preferences systematically violate the axioms of expected utility theory. In the light of these observations we argue that utility theory, as it is commonly interpreted and applied, is not an adequate descriptive model, and we propose an alternative account of choice under risk.

35,067 citations

Book ChapterDOI
09 Jan 2004
TL;DR: A theory of intergroup conflict and some preliminary data relating to the theory are presented in this chapter, with the analysis limited to the case where the salient dimensions of intergroup differentiation are those involving scarce resources.
Abstract: This chapter presents an outline of a theory of intergroup conflict and some preliminary data relating to the theory. Much of the work on the social psychology of intergroup relations has focused on patterns of individual prejudices and discrimination and on the motivational sequences of interpersonal interaction. The intensity of explicit intergroup conflicts of interests is closely related in human cultures to the degree of opprobrium attached to the notion of "renegade" or "traitor." The basic and highly reliable finding is that the trivial, ad hoc intergroup categorization leads to in-group favoritism and discrimination against the out-group. Many orthodox definitions of "social groups" are unduly restrictive when applied to the context of intergroup relations. The equation of social competition and intergroup conflict rests on the assumptions concerning an "ideal type" of social stratification in which the salient dimensions of intergroup differentiation are those involving scarce resources.

14,812 citations

Journal ArticleDOI
TL;DR: Cumulative prospect theory, as discussed by the authors, applies to uncertain as well as to risky prospects with any number of outcomes and allows different weighting functions for gains and for losses; two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting functions.
Abstract: We develop a new version of prospect theory that employs cumulative rather than separable decision weights and extends the theory in several respects. This version, called cumulative prospect theory, applies to uncertain as well as to risky prospects with any number of outcomes, and it allows different weighting functions for gains and for losses. Two principles, diminishing sensitivity and loss aversion, are invoked to explain the characteristic curvature of the value function and the weighting functions. A review of the experimental evidence and the results of a new experiment confirm a distinctive fourfold pattern of risk attitudes: risk aversion for gains and risk seeking for losses of high probability; risk seeking for gains and risk aversion for losses of low probability. Expected utility theory reigned for several decades as the dominant normative and descriptive model of decision making under uncertainty, but it has come under serious question in recent years. There is now general agreement that the theory does not provide an adequate description of individual choice: a substantial body of evidence shows that decision makers systematically violate its basic tenets. Many alternative models have been proposed in response to this empirical challenge (for reviews, see Camerer, 1989; Fishburn, 1988; Machina, 1987). Some time ago we presented a model of choice, called prospect theory, which explained the major violations of expected utility theory in choices between risky prospects with a small number of outcomes (Kahneman and Tversky, 1979; Tversky and Kahneman, 1986). The key elements of this theory are 1) a value function that is concave for gains, convex for losses, and steeper for losses than for gains,
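For concreteness, the value and weighting functions described above are usually written in the parametric forms Tversky and Kahneman (1992) estimated, shown below; the exponents and loss-aversion coefficient are their median estimates (alpha = beta = 0.88, lambda = 2.25, gamma of about 0.61 for gains and 0.69 for losses).

```latex
v(x) =
\begin{cases}
  x^{\alpha} & \text{if } x \ge 0 \\
  -\lambda(-x)^{\beta} & \text{if } x < 0
\end{cases}
\qquad
w(p) = \frac{p^{\gamma}}{\left(p^{\gamma} + (1-p)^{\gamma}\right)^{1/\gamma}}
```

The kink at zero and lambda greater than 1 capture loss aversion, while the inverse-S shape of w(p) captures the overweighting of low probabilities noted in the abstract.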

13,433 citations
