Open Access Journal Article

Improving Knowledge Based Spam Detection Methods: The Effect of Malicious Related Features in Imbalance Data Distribution

TLDR
The paper investigates spam detection with the aim of developing an efficient method for identifying spam email based on analysis of message content, and identifies a feature set containing a considerable number of malicious-related features.
Abstract
Spam is no longer just unsolicited commercial email that wastes our time; it also consumes network traffic and mail servers’ storage. Furthermore, spam has become a major component of several attack vectors, including phishing, cross-site scripting, cross-site request forgery, and malware infection. Statistics show that the amount of spam carrying malicious content has increased relative to spam advertising legitimate products and services. In this paper, we investigate spam detection with the aim of developing an efficient method for identifying spam email based on analysis of message content. We identify a feature set that contains a considerable number of malicious-related features. Our goal is to study how these features help classical classifiers identify spam emails. To make the problem more challenging, we developed spam classification models on imbalanced data in which spam emails form the rare class, accounting for only 16.5% of the total emails. Different metrics were used to evaluate the developed models. The results show a noticeable improvement in the spam classification models when they are trained on a dataset that includes malicious-related features.
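The abstract does not say which evaluation metrics were used, so the following is only a minimal sketch of why accuracy alone can mislead when spam is the rare class at 16.5%. The metric set (precision, recall, F1, G-mean), the function name `imbalance_metrics`, and all counts are illustrative assumptions, not values from the paper.

```python
from math import sqrt


def imbalance_metrics(tp, fp, tn, fn):
    """Compute common metrics with spam treated as the positive class.

    The metric selection here is an assumption for illustration; the
    abstract only states that "different metrics" were used.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # G-mean balances performance on the rare and the majority class.
    g_mean = sqrt(recall * specificity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "g_mean": g_mean,
    }


# Hypothetical corpus of 1000 emails with 16.5% spam (165 spam, 835 legitimate).
# A trivial model that labels every email as legitimate:
baseline = imbalance_metrics(tp=0, fp=0, tn=835, fn=165)

# A hypothetical classifier that catches most spam (counts are made up):
candidate = imbalance_metrics(tp=140, fp=20, tn=815, fn=25)

print(baseline)
print(candidate)
```

Under these assumed counts, the trivial baseline reaches an accuracy of 0.835 while its recall, F1, and G-mean are all zero, which is why class-sensitive metrics are usually preferred for rare-class problems like this one.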


Citations
Journal Article

An intelligent system for spam detection and identification of the most relevant features based on evolutionary Random Weight Networks

TL;DR: An intelligent detection system that is based on Genetic Algorithm and Random Weight Network is proposed to deal with email spam detection tasks and can automatically identify the most relevant features of the spam emails.
Proceedings Article

Optimizing Feedforward neural networks using Krill Herd algorithm for E-mail spam detection

TL;DR: Evaluation results show that the developed training approach using Krill Herd algorithm outperforms the other two algorithms and will be applied for an E-mail spam detection model.
Proceedings Article

Spam profile detection in social networks based on public features

TL;DR: The nature of spam profiles on Twitter is investigated with the goal of improving social spam detection, and preliminary experiments show that promising detection rates can be obtained using such features regardless of the language of the tweets.
Book Chapter

A Hybrid Approach Based on Particle Swarm Optimization and Random Forests for E-Mail Spam Filtering

TL;DR: Experimental results on a real-world spam data set show that the proposed method performs better than five other traditional machine learning approaches from the literature.
Proceedings Article

Statistical Detection of Online Drifting Twitter Spam: Invited Paper

TL;DR: A fuzzy-based information decomposition technique is developed and an asymmetric sampling technique is proposed to re-balance the sizes of spam samples and non-spam samples in the training data to significantly improve the detection performance for drifting Twitter spam.