Proceedings Article
A social-spam detection framework
De Wang, Danesh Irani, Calton Pu +2 more
- pp 46-54
TL;DR: This work proposes a framework for spam detection which can be used across all social network sites and provides an experimental study of real datasets from social networks to demonstrate the flexibility and feasibility of the framework.
Abstract:
Social networks such as Facebook, MySpace, and Twitter have become increasingly important for reaching millions of users. Consequently, spammers are increasingly using such networks for propagating spam. Although existing filtering techniques such as collaborative filters and behavioral analysis filters are able to significantly reduce spam, each social network needs to build its own independent spam filter and support a spam team to keep spam prevention techniques current. We propose a framework for spam detection which can be used across all social network sites. The framework has numerous benefits, including: 1) new spam detected on one social network can quickly be identified across social networks; 2) accuracy of spam detection will improve with a large amount of data from across social networks; 3) other techniques (such as blacklists and message shingling) can be integrated and centralized; 4) new social networks can plug into the system easily, preventing spam at an early stage. We provide an experimental study of real datasets from social networks to demonstrate the flexibility and feasibility of our framework.
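The abstract names message shingling as one technique that could be integrated into the framework. The following is a minimal sketch of the general idea, not the authors' implementation: each message is broken into overlapping word k-grams (shingles), and two messages are treated as near-duplicates when the Jaccard similarity of their shingle sets is high. The function names, the shingle size `k=3`, and the `threshold=0.5` cutoff are illustrative assumptions.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles of a message (k is an assumed default)."""
    words = text.lower().split()
    if len(words) < k:
        # Short messages yield a single shingle made of all their words.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined here as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def is_near_duplicate(message, known_spam, k=3, threshold=0.5):
    """Check whether a message closely matches any known spam message.

    The threshold is a hypothetical value chosen for illustration only.
    """
    s = shingles(message, k)
    return any(jaccard(s, shingles(spam, k)) >= threshold for spam in known_spam)


if __name__ == "__main__":
    known = ["click here to win a free phone today"]
    print(is_near_duplicate("click here to win a free phone now", known))
    print(is_near_duplicate("see you at lunch tomorrow", known))
```

In a cross-network setting such as the one the abstract describes, the `known_spam` collection could hold messages flagged on any participating network, so that a lightly edited copy posted elsewhere is still matched.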
Citations
Early detection of fake news on social media
TL;DR: The experimental results demonstrate that the proposed models can detect fake news with over 90% accuracy within five minutes after it starts to spread and before it is retweeted 50 times, which is significantly faster than state-of-the-art baselines.
When social bots attack: Modeling susceptibility of users in online social networks
TL;DR: The results suggest that susceptible users tend to use Twitter for a conversational purpose and tend to be more open and social since they communicate with many different users, use more social words and show more affection than non-susceptible users.
Journal Article
Twitter spam detection: Survey of new approaches and comparative study
TL;DR: A new survey of Twitter spam detection techniques, intended for readers with or without expertise in this area and for those seeking a deep understanding of the field in order to develop new methods.
Journal Article
Deep learning for misinformation detection on online social networks: a survey and new perspectives
TL;DR: A state-of-the-art review of automated misinformation detection in social networks, where deep learning (DL) is used to automatically process data and create patterns for decision making, not only to extract global features but also to achieve better results.
Proceedings Article
TubeSpam: Comment Spam Filtering on YouTube
TL;DR: The statistical analysis of the results indicates that, at a 99.9% confidence level, decision trees, logistic regression, Bernoulli Naive Bayes, random forests, and linear and Gaussian SVMs are statistically equivalent for comment spam filtering on YouTube.
References
Journal Article
The WEKA data mining software: an update
TL;DR: This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.
Journal Article
Additive Logistic Regression: A Statistical View of Boosting
TL;DR: This work shows that this seemingly mysterious phenomenon of boosting can be understood in terms of well-known statistical principles, namely additive modeling and maximum likelihood, and develops more direct approximations and shows that they exhibit nearly identical results to boosting.
Proceedings Article
A Bayesian Approach to Filtering Junk E-Mail
TL;DR: This work examines methods for the automated construction of filters to eliminate such unwanted messages from a user’s mail stream, and shows the efficacy of such filters in a real world usage scenario, arguing that this technology is mature enough for deployment.
Journal Article
Support vector machines for spam categorization
TL;DR: The use of support vector machines (SVMs) in classifying e-mail as spam or nonspam is studied by comparing them to three other classification algorithms: Ripper, Rocchio, and boosting decision trees. SVMs performed best when using binary features.
Book Chapter
Combating web spam with TrustRank
TL;DR: This paper proposes techniques to semi-automatically separate reputable, good pages from spam, and shows that they can effectively filter out spam from a significant fraction of the web, based on a good seed set of less than 200 sites.