Stance classification in rumours as a sequential task exploiting the tree structure of social media conversations

Proceedings Article•

Arkaitz Zubiaga¹, Elena Kochkina¹, Maria Liakata¹, Rob Procter¹, Michal Lukasik²

University of Warwick¹, University of Sheffield²

21 Sep 2016, pp. 2438-2448

TL;DR: The authors model Twitter conversations as a tree structure and test two sequential classifiers, Linear-Chain CRF and Tree CRF, each of which makes different assumptions about the conversational structure.

Abstract: Rumour stance classification, the task that determines if each tweet in a collection discussing a rumour is supporting, denying, questioning or simply commenting on the rumour, has been attracting substantial interest. Here we introduce a novel approach that makes use of the sequence of transitions observed in tree-structured conversation threads in Twitter. The conversation threads are formed by harvesting users’ replies to one another, which results in a nested tree-like structure. Previous work addressing the stance classification task has treated each tweet as a separate unit. Here we analyse tweets by virtue of their position in a sequence and test two sequential classifiers, Linear-Chain CRF and Tree CRF, each of which makes different assumptions about the conversational structure. We experiment with eight Twitter datasets, collected during breaking news, and show that exploiting the sequential structure of Twitter conversations achieves significant improvements over the non-sequential methods. Our work is the first to model Twitter conversations as a tree structure in this manner, introducing a novel way of tackling NLP tasks on Twitter conversations.
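The tree-structured threads described in the abstract can be pictured with a small sketch. It assumes, purely for illustration, that a thread is stored as a hypothetical dictionary mapping each reply's id to the id of the tweet it answers, and it enumerates the root-to-leaf branches of that tree. A sequence model such as a linear-chain CRF could plausibly be fed one branch at a time, whereas a tree model would keep the whole structure; the names `branches` and `thread` are assumptions and do not come from the paper.

```python
from collections import defaultdict


def branches(replies, root):
    """Return every root-to-leaf path in a reply tree.

    `replies` maps a tweet id to the id of the tweet it replies to
    (a hypothetical representation chosen for this sketch).
    """
    # Invert the child -> parent mapping into parent -> [children].
    children = defaultdict(list)
    for child, parent in replies.items():
        children[parent].append(child)

    paths = []
    stack = [[root]]
    while stack:
        path = stack.pop()
        kids = children.get(path[-1], [])
        if not kids:
            # A tweet with no replies ends a branch.
            paths.append(path)
        for kid in kids:
            stack.append(path + [kid])
    return sorted(paths)


# Hypothetical thread: "a" is the source tweet, "b" and "c" reply to it,
# and "d" replies to "b".
thread = {"b": "a", "c": "a", "d": "b"}
print(branches(thread, "a"))
```

Each returned list is one conversational sequence, so a tweet near the root (here "a") appears in several branches.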

Citations

What is Twitter

Rizal Setya Perdana

01 Jan 2013

1,098 citations

Journal Article•DOI•

Detection and Resolution of Rumours in Social Media: A Survey

Arkaitz Zubiaga¹, Ahmet Aker², Kalina Bontcheva³, Maria Liakata¹, Rob Procter¹

University of Warwick¹, University of Duisburg-Essen², University of Sheffield³

20 Feb 2018-ACM Computing Surveys

TL;DR: The authors provide an overview of research into social media rumours with the ultimate goal of developing a rumour classification system that consists of four components: rumour detection, rumour tracking, rumour stance classification, and rumour veracity classification.

Abstract: Despite the increasing use of social media platforms for information and news gathering, its unmoderated nature often leads to the emergence and spread of rumours, i.e., items of information that are unverified at the time of posting. At the same time, the openness of social media platforms provides opportunities to study how users share and discuss rumours, and to explore how to automatically assess their veracity, using natural language processing and data mining techniques. In this article, we introduce and discuss two types of rumours that circulate on social media: long-standing rumours that circulate for long periods of time, and newly emerging rumours spawned during fast-paced events such as breaking news, where reports are released piecemeal and often with an unverified status in their early stages. We provide an overview of research into social media rumours with the ultimate goal of developing a rumour classification system that consists of four components: rumour detection, rumour tracking, rumour stance classification, and rumour veracity classification. We delve into the approaches presented in the scientific literature for the development of each of these four components. We summarise the efforts and achievements so far toward the development of rumour classification systems and conclude with suggestions for avenues for future research in social media mining for the detection and resolution of rumours.

498 citations

Proceedings Article•DOI•

Rumor Detection on Twitter with Tree-structured Recursive Neural Networks

Jing Ma¹, Wei Gao², Kam-Fai Wong¹

The Chinese University of Hong Kong¹, University of New South Wales²

01 Jul 2018

TL;DR: This work proposes two recursive neural models, based on bottom-up and top-down tree-structured neural networks, for rumor representation learning and classification; both naturally conform to the propagation layout of tweets.

Abstract: Automatic rumor detection is technically very challenging. In this work, we try to learn discriminative features from tweet content by following its non-sequential propagation structure, and to generate more powerful representations for identifying different types of rumors. We propose two recursive neural models based on bottom-up and top-down tree-structured neural networks for rumor representation learning and classification, which naturally conform to the propagation layout of tweets. Results on two public Twitter datasets demonstrate that our recursive neural models 1) achieve much better performance than state-of-the-art approaches; 2) demonstrate superior capacity for detecting rumors at a very early stage.

370 citations

Journal Article•DOI•

Fake news detection: A hybrid CNN-RNN based deep learning approach

Jamal Abdul Nasir¹,², Osama Subhani Khan², Iraklis Varlamis³

National University of Ireland, Galway¹, International Islamic University, Islamabad², National and Kapodistrian University of Athens³

01 Apr 2021

TL;DR: In this article, a hybrid deep learning model that combines convolutional and recurrent neural networks for fake news classification was proposed, achieving detection results that are significantly better than other non-hybrid baseline methods.

Abstract: The explosion of social media has allowed individuals to spread information without cost, with little investigation and fewer filters than before. This has amplified the old problem of fake news, which has become a major concern due to its negative impact on communities. In order to tackle the rise and spread of fake news, automatic detection techniques building on artificial intelligence and machine learning have been researched. The recent achievements of deep learning techniques in complex natural language processing tasks make them a promising solution for fake news detection too. This work proposes a novel hybrid deep learning model that combines convolutional and recurrent neural networks for fake news classification. The model was successfully validated on two fake news datasets (ISO and FA-KES), achieving detection results that are significantly better than other non-hybrid baseline methods. Further experiments on the generalization of the proposed model across different datasets had promising results.

202 citations

Book Chapter•DOI•

Exploiting Context for Rumour Detection in Social Media

Arkaitz Zubiaga¹, Maria Liakata¹,², Rob Procter¹,²

University of Warwick¹, The Turing Institute²

13 Sep 2017

TL;DR: A novel approach using Conditional Random Fields that learns from the sequential dynamics of social media posts is compared with the current state-of-the-art rumour detection system, as well as other baselines; the results provide evidence for the generalisability of the classifier.

Abstract: Tools that are able to detect unverified information posted on social media during a news event can help to avoid the spread of rumours that turn out to be false. In this paper we compare a novel approach using Conditional Random Fields that learns from the sequential dynamics of social media posts with the current state-of-the-art rumour detection system, as well as other baselines. In contrast to existing work, our classifier does not need to observe tweets querying the stance of a post to deem it a rumour but, instead, exploits context learned during the event. Our classifier has improved precision and recall over the state-of-the-art classifier that relies on querying tweets, as well as outperforming our best baseline. Moreover, the results provide evidence for the generalisability of our classifier.

172 citations

References

Proceedings Article•

Distributed Representations of Words and Phrases and their Compositionality

Tomas Mikolov¹, Ilya Sutskever¹, Kai Chen¹, Greg S. Corrado¹, Jeffrey Dean¹

Google¹

05 Dec 2013

TL;DR: This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.

Abstract: The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of "Canada" and "Air" cannot be easily combined to obtain "Air Canada". Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.

24,012 citations

Proceedings Article•

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

John Lafferty¹, Andrew McCallum, Fernando Pereira

Carnegie Mellon University¹

28 Jun 2001

TL;DR: This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.

Abstract: We present conditional random fields, a framework for building probabilistic models to segment and label sequence data. Conditional random fields offer several advantages over hidden Markov models and stochastic grammars for such tasks, including the ability to relax strong independence assumptions made in those models. Conditional random fields also avoid a fundamental limitation of maximum entropy Markov models (MEMMs) and other discriminative Markov models based on directed graphical models, which can be biased towards states with few successor states. We present iterative parameter estimation algorithms for conditional random fields and compare the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.
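For orientation, the linear-chain form of a conditional random field is commonly written as below. This is the standard textbook parameterisation rather than notation quoted from this page: the feature functions $f_k$, weights $\lambda_k$, and sequence length $T$ are symbols introduced here as assumptions.

```latex
p(\mathbf{y} \mid \mathbf{x})
  = \frac{1}{Z(\mathbf{x})}
    \exp\!\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, \mathbf{x}, t) \Big),
\qquad
Z(\mathbf{x})
  = \sum_{\mathbf{y}'}
    \exp\!\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, \mathbf{x}, t) \Big)
```

Because $Z(\mathbf{x})$ normalises over whole label sequences rather than per state, the model sidesteps the bias towards states with few successors that the abstract attributes to MEMMs.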

13,190 citations

Proceedings Article•DOI•

What is Twitter, a social network or a news media?

Haewoon Kwak¹, Changhyun Lee¹, Hosung Park¹, Sue Moon¹

KAIST¹

26 Apr 2010

TL;DR: In this paper, the authors have crawled the entire Twittersphere and found a non-power-law follower distribution, a short effective diameter, and low reciprocity, which all mark a deviation from known characteristics of human social networks.

Abstract: Twitter, a microblogging service less than three years old, commands more than 41 million users as of July 2009 and is growing fast. Twitter users tweet about any topic within the 140-character limit and follow others to receive their tweets. The goal of this paper is to study the topological characteristics of Twitter and its power as a new medium of information sharing. We have crawled the entire Twitter site and obtained 41.7 million user profiles, 1.47 billion social relations, 4,262 trending topics, and 106 million tweets. In its follower-following topology analysis we have found a non-power-law follower distribution, a short effective diameter, and low reciprocity, which all mark a deviation from known characteristics of human social networks [28]. In order to identify influentials on Twitter, we have ranked users by the number of followers and by PageRank and found two rankings to be similar. Ranking by retweets differs from the previous two rankings, indicating a gap in influence inferred from the number of followers and that from the popularity of one's tweets. We have analyzed the tweets of top trending topics and reported on their temporal behavior and user participation. We have classified the trending topics based on the active period and the tweets and show that the majority (over 85%) of topics are headline news or persistent news in nature. A closer look at retweets reveals that any retweeted tweet is to reach an average of 1,000 users no matter what the number of followers is of the original tweet. Once retweeted, a tweet gets retweeted almost instantly on next hops, signifying fast diffusion of information after the 1st retweet. To the best of our knowledge this work is the first quantitative study on the entire Twittersphere and information diffusion on it.

6,108 citations

Proceedings Article•DOI•

Twitter under crisis: can we trust what we RT?

Marcelo Mendoza¹, Barbara Poblete¹, Carlos Castillo¹

Yahoo!¹

25 Jul 2010

TL;DR: The behavior of Twitter users under an emergency situation is explored and it is shown that it is possible to detect rumors by using aggregate analysis on tweets, and that the propagation of tweets that correspond to rumors differs from tweets that spread news.

Abstract: In this article we explore the behavior of Twitter users under an emergency situation. In particular, we analyze the activity related to the 2010 earthquake in Chile and characterize Twitter in the hours and days following this disaster. Furthermore, we perform a preliminary study of certain social phenomena, such as the dissemination of false rumors and confirmed news. We analyze how this information propagated through the Twitter network, with the purpose of assessing the reliability of Twitter as an information source under extreme circumstances. Our analysis shows that the propagation of tweets that correspond to rumors differs from tweets that spread news because rumors tend to be questioned more than news by the Twitter community. This result shows that it is possible to detect rumors by using aggregate analysis on tweets.

1,012 citations

Proceedings Article•

Rumor has it: Identifying Misinformation in Microblogs

Vahed Qazvinian¹, Emily Rosengren¹, Dragomir R. Radev¹, Qiaozhu Mei¹

University of Michigan¹

27 Jul 2011

TL;DR: This paper addresses the problem of rumor detection in microblogs and explores the effectiveness of 3 categories of features (content-based, network-based, and microblog-specific memes) for correctly identifying rumors; the authors believe that their dataset is the first large-scale dataset on rumor detection.

Abstract: A rumor is commonly defined as a statement whose true value is unverifiable. Rumors may spread misinformation (false information) or disinformation (deliberately false information) on a network of people. Identifying rumors is crucial in online social media where large amounts of information are easily spread across a large network by sources with unverified authority. In this paper, we address the problem of rumor detection in microblogs and explore the effectiveness of 3 categories of features: content-based, network-based, and microblog-specific memes for correctly identifying rumors. Moreover, we show how these features are also effective in identifying disinformers, users who endorse a rumor and further help it to spread. We perform our experiments on more than 10,000 manually annotated tweets collected from Twitter and show how our retrieval model achieves more than 0.95 in Mean Average Precision (MAP). Finally, we believe that our dataset is the first large-scale dataset on rumor detection. It can open new dimensions in analyzing online misinformation and other aspects of microblog conversations.

792 citations