Proceedings Article

A social-spam detection framework

TLDR
This work proposes a framework for spam detection which can be used across all social network sites and provides an experimental study of real datasets from social networks to demonstrate the flexibility and feasibility of the framework.
Abstract
Social networks such as Facebook, MySpace, and Twitter have become increasingly important for reaching millions of users. Consequently, spammers are increasingly using such networks to propagate spam. Although existing filtering techniques such as collaborative filters and behavioral analysis filters can significantly reduce spam, each social network needs to build its own independent spam filter and support a spam team to keep spam prevention techniques current. We propose a framework for spam detection which can be used across all social network sites. The framework has numerous benefits, including: 1) new spam detected on one social network can quickly be identified across social networks; 2) accuracy of spam detection will improve with a large amount of data from across social networks; 3) other techniques (such as blacklists and message shingling) can be integrated and centralized; 4) new social networks can plug into the system easily, preventing spam at an early stage. We provide an experimental study of real datasets from social networks to demonstrate the flexibility and feasibility of our framework.
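
The abstract names message shingling as one technique that could be integrated and centralized. As an illustration only, and not the authors' implementation, a minimal sketch of word-level shingling with Jaccard similarity might look like the following. The function names, the shingle size `k`, and the `threshold` value are all assumptions chosen for the example.

```python
from typing import List, Set, Tuple

Shingle = Tuple[str, ...]


def shingles(text: str, k: int = 3) -> Set[Shingle]:
    """Return the set of k-word shingles (contiguous word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        # Short messages yield a single shingle holding all of their words.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: Set[Shingle], b: Set[Shingle]) -> float:
    """Jaccard similarity |a & b| / |a | b|; 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def is_near_duplicate(message: str, known_spam: List[str],
                      threshold: float = 0.5, k: int = 3) -> bool:
    """Flag a message whose shingle set overlaps any known spam message.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    target = shingles(message, k)
    return any(jaccard(target, shingles(s, k)) >= threshold for s in known_spam)
```

In a shared setting, the `known_spam` list would stand in for messages reported by any participating network, so that a lightly edited copy of a spam message seen on one site could be flagged on another.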


Citations

Early detection of fake news on social media

Yang Liu
TL;DR: The experimental results demonstrate that the proposed models can detect fake news with over 90% accuracy within five minutes after it starts to spread and before it is retweeted 50 times, which is significantly faster than state-of-the-art baselines.

When social bots attack: Modeling susceptibility of users in online social networks

TL;DR: The results suggest that susceptible users tend to use Twitter for a conversational purpose and tend to be more open and social since they communicate with many different users, use more social words and show more affection than non-susceptible users.
Journal Article

Twitter spam detection: Survey of new approaches and comparative study

TL;DR: A new survey of Twitter spam detection techniques, aimed at readers with or without expertise in this area and at those seeking a deep understanding of the field in order to develop new methods.
Journal Article

Deep learning for misinformation detection on online social networks: a survey and new perspectives

TL;DR: A state-of-the-art review of automated misinformation detection in social networks where deep learning (DL) is used to automatically process data and create patterns to make decisions not only to extract global features but also to achieve better results.
Proceedings Article

TubeSpam: Comment Spam Filtering on YouTube

TL;DR: The statistical analysis of results indicates that, at a 99.9% confidence level, decision trees, logistic regression, Bernoulli Naive Bayes, random forests, and linear and Gaussian SVMs are statistically equivalent for comment spam filtering on YouTube.
References
Journal Article

The WEKA data mining software: an update

TL;DR: This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.
Journal Article

Additive Logistic Regression: A Statistical View of Boosting

TL;DR: This work shows that this seemingly mysterious phenomenon of boosting can be understood in terms of well-known statistical principles, namely additive modeling and maximum likelihood, and develops more direct approximations and shows that they exhibit nearly identical results to boosting.
Proceedings Article

A Bayesian Approach to Filtering Junk E-Mail

TL;DR: This work examines methods for the automated construction of filters to eliminate such unwanted messages from a user’s mail stream, and shows the efficacy of such filters in a real world usage scenario, arguing that this technology is mature enough for deployment.
Journal Article

Support vector machines for spam categorization

TL;DR: The use of support vector machines in classifying e-mail as spam or nonspam is studied by comparing it to three other classification algorithms: Ripper, Rocchio, and boosting decision trees. SVMs performed best when using binary features.
Book Chapter

Combating web spam with TrustRank

TL;DR: This paper proposes techniques to semi-automatically separate reputable, good pages from spam, and shows that they can effectively filter out spam from a significant fraction of the web, based on a good seed set of less than 200 sites.