Real-time Detection of Content Polluters in Partially Observable Twitter Networks
Citations
Detect Me If You Can: Spam Bot Detection Using Inductive Representation Learning
You talkin’ to me? Exploring Human/Bot Communication Patterns during Riot Events
DeeProBot: a hybrid deep neural network model for social bot detection based on user profile data
Fake news detection based on explicit and implicit signals of a hybrid crowd: An approach inspired in meta-learning
#ArsonEmergency and Australia's "Black Summer": Polarisation and Misinformation on Social Media
References
Fast unfolding of communities in large networks
The rise of social bots
Social bots distort the 2016 U.S. Presidential election online discussion
BotOrNot: A System to Evaluate Social Bots
The Size Distribution of Cities: An Examination of the Pareto Law and Primacy
Frequently Asked Questions (12)
Q2. Why were the authors restricted to using only streamed tweets?
Due to rate limits on the public API and the high cost of accessing data, the authors were restricted to using only streamed tweets satisfying certain criteria.
Q3. How many bots were created on 20 February 2014?
A total of 109 political bot accounts were created on 20 February 2014 with only 12 unique names, a strong indication of a bot network.
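This creation-burst pattern suggests a simple screening heuristic. Below is a minimal sketch (not the authors' code) that flags days with many new accounts but few distinct display names; the column names `created_at` and `name`, and both thresholds, are illustrative assumptions.

```python
import pandas as pd

def flag_creation_bursts(accounts: pd.DataFrame,
                         min_accounts: int = 50,
                         max_name_ratio: float = 0.25) -> pd.DataFrame:
    """Flag creation dates with many new accounts but few unique names.

    `accounts` is assumed to have a datetime column `created_at` and a
    string column `name` (display name). Thresholds are illustrative.
    """
    daily = accounts.groupby(accounts["created_at"].dt.date).agg(
        n_accounts=("name", "size"),
        n_unique_names=("name", "nunique"),
    )
    return daily[
        (daily["n_accounts"] >= min_accounts)
        & (daily["n_unique_names"] / daily["n_accounts"] <= max_name_ratio)
    ]

# Example: 109 accounts created on one day with only 12 unique names would
# be flagged, since 12 / 109 ≈ 0.11 is well below the 0.25 ratio threshold.
```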
Q4. What is the purpose of the data?
The data are targeted at studying civil unrest and are intended to capture the ways in which people express opinions and organize marches, rallies, and peaceful or violent protests within Australia.
Q5. What was the main goal of the research?
The main goal of their work was to detect content polluters in real time in a dataset of tweets related to Australian social unrest events, without access to the users' complete profile information.
Q6. What is the challenging aspect of this work?
The most challenging aspect of this work is validating the results, since user perceptions are not always correct and standard bot detection methods are prone to misclassification even when complete Twitter account information is available [9, 17, 18].
Q7. How did the authors detect content polluter accounts?
The authors detected content polluter accounts using message diversity, since they did not have access to complete account information, whereas Truthy exploited features obtained from the complete user profile and network.
Q8. What did the researchers find interesting about the model?
The authors noted that the model was no longer erroneously predicting events related to ‘escorts’, which improved model performance noticeably.
Q9. How do the authors measure the extent of diversity?
The authors measure the extent of diversity in two ways. (1) The Gini coefficient ($G \in [0, 1]$):

$$G = \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} \lvert u_i^d - u_j^d \rvert}{2n \sum_{i=1}^{n} u_i^d}, \tag{1}$$

where $n$ is the number of users tweeting a particular URL.
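As a quick illustration, here is a minimal sketch of Equation (1) in Python (not the authors' code); the input is assumed to be the per-user tweet counts $u_i^d$ for a single URL.

```python
import numpy as np

def gini(counts):
    """Gini coefficient of per-user tweet counts for one URL (Eq. 1).

    Returns 0 when every user tweets the URL equally often and
    approaches 1 when a single user dominates.
    """
    u = np.asarray(counts, dtype=float)
    n = len(u)
    # Sum of pairwise absolute differences, normalised by 2n * sum(u).
    mad = np.abs(u[:, None] - u[None, :]).sum()
    return mad / (2 * n * u.sum())

print(gini([5, 5, 5, 5]))   # 0.0   -> perfectly even sharing
print(gini([97, 1, 1, 1]))  # 0.72  -> one account dominates
```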
Q10. When does the Twitter API return an error message for an account query?
Given a query for a specific account, the Twitter API returns an error message if the account is suspended by Twitter or deleted by the user.
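A minimal sketch of such a check against the v1.1 `users/show` endpoint is shown below (not the authors' code); `BEARER_TOKEN` is a placeholder, and the error codes 50 ("User not found") and 63 ("User has been suspended") follow the v1.1 API's documented conventions.

```python
import requests

BEARER_TOKEN = "..."  # placeholder; supply your own app credentials

def account_status(screen_name: str) -> str:
    """Classify an account as active, deleted/not found, or suspended."""
    resp = requests.get(
        "https://api.twitter.com/1.1/users/show.json",
        params={"screen_name": screen_name},
        headers={"Authorization": f"Bearer {BEARER_TOKEN}"},
    )
    if resp.ok:
        return "active"
    codes = {e.get("code") for e in resp.json().get("errors", [])}
    if 63 in codes:   # "User has been suspended"
        return "suspended"
    if 50 in codes:   # "User not found" (e.g. deleted by the user)
        return "deleted"
    return f"unknown (HTTP {resp.status_code})"
```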
Q11. How did detecting bots in the dataset help?
Removing the bots that the authors detected from their dataset helped to reduce noise in the data and significantly improved the performance of the prediction models.
Q12. How can an account be marked as a bot?
Otherwise, look through the account's pattern of tweets; if the tweeting behaviour is very spammy, for example a highly consistent tweeting frequency with all tweets posted from a single source, then mark the account as a bot.
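This rule of thumb translates into a short sketch (illustrative only, not the authors' code): the `timestamps`/`sources` inputs and the threshold are assumptions, with regularity measured by the coefficient of variation of inter-tweet gaps.

```python
import numpy as np

def looks_like_bot(timestamps, sources, max_cv: float = 0.1) -> bool:
    """Heuristic bot check from the answer above (threshold illustrative).

    timestamps: tweet times as Unix seconds, in posting order.
    sources: the client each tweet was posted from (e.g. "Twitter Web App").
    Flags accounts whose inter-tweet gaps are highly regular (low
    coefficient of variation) and whose tweets all come from one source.
    """
    gaps = np.diff(np.asarray(timestamps, dtype=float))
    if len(gaps) == 0 or gaps.mean() == 0:
        return False  # too few tweets to judge
    cv = gaps.std() / gaps.mean()            # regularity of tweeting
    single_source = len(set(sources)) == 1   # everything from one client
    return cv <= max_cv and single_source

# Example: tweets exactly every 600 s from a single client -> flagged.
print(looks_like_bot([0, 600, 1200, 1800], ["MyBotClient"] * 4))  # True
```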