Journal Article

A collaborative anti-spam system

TLDR
Rough set theory is used to generate spam rules, an XML format is used to exchange them, and a reinforcement learning approach manages the rules to improve the accuracy and efficiency of the spam filter.
Abstract
The growing volume of spam mail has created a need for precise anti-spam filters that detect unsolicited email. Most prior work focuses only on spam rule generation on a standalone mail server. This paper presents a collaborative framework for spam rule generation, exchange, and management. The spam filter can be built on a combination of rough set theory, genetic algorithms, and reinforcement learning. In this paper, we use rough set theory to generate spam rules and an XML format to exchange them. Spam rule management is handled by a reinforcement learning approach. The experimental results support two conclusions: (1) Rule management keeps high-performance rules and discards out-of-date rules, which improves the accuracy and efficiency of the spam filter. (2) Rules exchanged among mail servers help the spam filter block more spam messages than a standalone filter does.
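
The exchange and management steps described in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `SpamRule` class, the XML element and attribute names, the keyword-matching rule form, and the `alpha` and `threshold` values are hypothetical and are not taken from the paper. The score update is a simple reward-driven moving average standing in for the paper's reinforcement learning approach, and rough-set rule generation is not shown.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class SpamRule:
    """Hypothetical rule: fires when every keyword appears in the message."""
    rule_id: str
    keywords: tuple
    score: float = 0.5  # running estimate of how reliable the rule is

    def matches(self, text: str) -> bool:
        lowered = text.lower()
        return all(k in lowered for k in self.keywords)


def rules_to_xml(rules) -> str:
    """Serialize rules to an illustrative XML layout (not the paper's schema)."""
    root = ET.Element("rules")
    for rule in rules:
        node = ET.SubElement(root, "rule", id=rule.rule_id, score=f"{rule.score:.3f}")
        for keyword in rule.keywords:
            ET.SubElement(node, "keyword").text = keyword
    return ET.tostring(root, encoding="unicode")


def rules_from_xml(payload: str):
    """Parse rules received from a peer mail server."""
    root = ET.fromstring(payload)
    return [
        SpamRule(
            rule_id=node.get("id"),
            keywords=tuple(k.text for k in node.findall("keyword")),
            score=float(node.get("score")),
        )
        for node in root.findall("rule")
    ]


def update_rule(rule: SpamRule, text: str, is_spam: bool, alpha: float = 0.2) -> None:
    """Move the score toward 1 on a correct hit and toward 0 on a false positive."""
    if rule.matches(text):
        reward = 1.0 if is_spam else 0.0
        rule.score += alpha * (reward - rule.score)


def prune(rules, threshold: float = 0.3):
    """Discard rules whose score has fallen below the threshold."""
    return [r for r in rules if r.score >= threshold]


# One server exports its rules; a peer imports them and keeps learning.
local = [SpamRule("r1", ("free", "prize")), SpamRule("r2", ("meeting",))]
received = rules_from_xml(rules_to_xml(local))
for _ in range(5):
    update_rule(received[0], "Claim your FREE prize now", is_spam=True)
    update_rule(received[1], "Team meeting at noon", is_spam=False)
kept = prune(received)
```

In this toy run the rule that keeps matching spam gains score and survives pruning, while the rule that keeps firing on legitimate mail decays and is discarded, which mirrors the abstract's claim that rule management keeps high-performance rules and drops out-of-date ones.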


Citations
Journal Article

Collaborative Security: A Survey and Taxonomy

TL;DR: A comprehensive study of different mechanisms of collaboration and defense in collaborative security, covering six types of security systems, with the goal of helping to make collaborative security systems more resilient and efficient.
Journal Article

A decision support system: Automated crime report analysis and classification for e-government

TL;DR: A decision support system (DSS) is developed that combines natural language processing (NLP) techniques, similarity measures, and machine learning (a Naive Bayes classifier) to support crime analysis and to classify which crime reports discuss the same crime and which discuss different crimes.
Journal Article

Rough sets for spam filtering: Selecting appropriate decision rules for boundary e-mail classification

TL;DR: From the experiments carried out, it is concluded that the proposed algorithms can outperform other well-known anti-spam filtering techniques such as support vector machines (SVM), Adaboost and different types of Bayes classifiers.
Journal Article

Overview of textual anti-spam filtering techniques

TL;DR: The most common techniques for anti-spam filtering based on analysis of e-mail content are summarized, and machine learning algorithms that have been adopted to detect and control spam, such as Naive Bayesian classifiers, support vector machines, and neural networks, are examined.
Journal Article

Machine Learning Techniques for Spam Detection in Email and IoT Platforms: Analysis and Research Challenges

TL;DR: Machine learning techniques for spam filtering in email and IoT platforms are surveyed and classified into suitable categories, and a comprehensive comparison of these techniques is made based on accuracy, precision, recall, etc.