Open Access · Journal Article

A Novel Method of Spam Mail Detection using Text Based Clustering Approach

M. Basavaraju, +1 more
10 Aug 2010, Vol. 5, Iss. 4, pp. 15-25
TLDR
A novel method of efficient spam mail classification using clustering techniques is presented in this research paper, which can extract spam/non-spam email and detect the spam email efficiently.
Abstract
A novel method of efficient spam mail classification using clustering techniques is presented in this research paper. E-mail spam is one of the major problems of today’s internet, bringing financial damage to companies and annoying individual users. Among the approaches developed to stop spam, filtering is an important and popular one. A new spam detection technique using text clustering based on the vector space model is proposed in this research paper. Using this method, one can extract spam and non-spam email and detect spam email efficiently. The data are represented using a vector space model. Clustering is the technique used for data reduction: it divides the data into groups based on pattern similarities such that each group is abstracted by one or more representatives. Recently, there has been a growing emphasis on exploratory analysis of very large datasets to discover useful patterns, which is called data mining. Each cluster is
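The general idea described in the abstract (vector space representation followed by clustering, with each cluster summarized by a representative) can be sketched as below. This is a minimal illustration, not the authors' implementation: the TF-IDF weighting, cosine similarity, the k-means style update, the toy messages, and every function name are assumptions introduced here, since the abstract does not specify them.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase and keep purely alphabetic tokens (an illustrative choice).
    return [w for w in text.lower().split() if w.isalpha()]

def tfidf_vectors(docs):
    # Vector space model: one unit-length TF-IDF vector per document.
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    n = len(docs)
    df = Counter(w for toks in tokenized for w in set(toks))
    idf = {w: math.log(n / df[w]) + 1.0 for w in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[w] * idf[w] for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors

def cosine(a, b):
    # Inputs are unit length, so the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))

def cluster(vectors, seeds, iterations=10):
    # k-means style loop; each centroid acts as the cluster representative.
    centroids = [vectors[i][:] for i in seeds]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each document to its most similar centroid.
        labels = [
            max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        # Recompute each centroid as the normalized mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                norm = math.sqrt(sum(x * x for x in mean)) or 1.0
                centroids[c] = [x / norm for x in mean]
    return labels

# Hypothetical toy messages: two spam-like, two ordinary.
emails = [
    "win free money now",
    "free prize claim money",
    "meeting agenda for monday",
    "project meeting notes monday",
]
vectors = tfidf_vectors(emails)
labels = cluster(vectors, seeds=(0, 2))  # fixed seeds keep the run deterministic
print(labels)  # [0, 0, 1, 1]
```

Fixed seed documents are used instead of random initialization only to keep the example reproducible. A real filter would still have to decide which cluster corresponds to spam, for example from a few labeled messages.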

