Proceedings Article

A Combining Classifiers Approach for Detecting Email Spams

TLDR
Results show that the novel combining-classifiers approach gives the best results in comparison with the individual classifiers, in terms of both high accuracy and low false positives.
Abstract
Email is a rapid and cheap communication medium for sending and receiving information, but spam is becoming a nuisance for such communication. Good spam filtering requires not only high accuracy but also a low false positive rate. This paper presents a combining-classifiers approach with a committee selection mechanism, where the main objective is to combine the individual decisions of good classifiers to obtain the best classification outcome in the spam classification domain. In this context, three classifiers have been selected: "Boosted Bayesian", "Boosted Naive Bayes", and Support Vector Machine (SVM). For combining the classifiers, boosted Bayesian and boosted naive Bayes are chosen as the members of the committee, and SVM is taken as the president. The committee members were selected based on our previous study, in which we found that boosting with AdaBoost improves the performance of probabilistic classifiers. Results show that the novel combining-classifiers approach gives the best results in comparison with the individual classifiers, in terms of both high accuracy and low false positives. In addition, the greedy stepwise feature search method is found to perform well in this study.
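The committee-and-president idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact combination rule, so the rule used here is an assumption: the committee members' label is accepted when they agree, and the president decides otherwise. The three classifier functions are hypothetical keyword-based stand-ins, not trained boosted Bayesian, boosted naive Bayes, or SVM models, and the helper names `committee_decision` and `false_positive_rate` are invented for illustration.

```python
from typing import Callable, Sequence

Classifier = Callable[[str], str]  # maps a message to "spam" or "ham"


def committee_decision(message: str,
                       members: Sequence[Classifier],
                       president: Classifier) -> str:
    """Assumed rule: unanimous members decide; otherwise the president does."""
    votes = {member(message) for member in members}
    if len(votes) == 1:
        return votes.pop()
    return president(message)


def false_positive_rate(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of legitimate ("ham") messages wrongly labelled "spam"."""
    ham_predictions = [p for p, a in zip(predicted, actual) if a == "ham"]
    if not ham_predictions:
        return 0.0
    return sum(p == "spam" for p in ham_predictions) / len(ham_predictions)


# Hypothetical stand-ins for the three trained classifiers.
def member_a(message: str) -> str:   # stands in for the boosted Bayesian member
    return "spam" if "free" in message.lower() else "ham"


def member_b(message: str) -> str:   # stands in for the boosted naive Bayes member
    return "spam" if "winner" in message.lower() else "ham"


def president(message: str) -> str:  # stands in for the SVM president
    return "spam" if "click" in message.lower() else "ham"


messages = [
    "Free prize, winner! Click now",   # members agree on spam
    "Free lunch at the office today",  # members disagree; president decides
    "Meeting moved to 3pm",            # members agree on ham
]
actual = ["spam", "ham", "ham"]
predicted = [committee_decision(m, [member_a, member_b], president)
             for m in messages]
```

In this toy run the second message is the interesting case: one member flags it as spam, the other does not, and the president's vote keeps a legitimate message out of the spam folder, which is the kind of false positive the abstract is concerned with.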


Citations
Journal Article

A Survey on Machine Learning Techniques for Cyber Security in the Last Decade

TL;DR: This paper aims to provide a comprehensive overview of the challenges that ML techniques face in protecting cyberspace against attacks, by reviewing the literature of the last decade on ML techniques for cyber security, including intrusion detection, spam detection, and malware detection on computer networks and mobile networks.
Journal Article

Spam filtering using integrated distribution-based balancing approach and regularized deep neural networks

TL;DR: A novel spam filter integrating an N-gram tf.idf feature selection, modified distribution-based balancing algorithm and a regularized deep multi-layer perceptron NN model with rectified linear units is proposed, which outperforms state-of-the-art spam filters and several machine learning algorithms commonly used to classify text.
Journal Article

Spam filtering using a logistic regression model trained by an artificial bee colony algorithm

TL;DR: A novel spam detection method is proposed that combines the artificial bee colony algorithm with a logistic regression classification model; it outperforms the other spam detection techniques considered in this study in terms of classification accuracy.
Journal Article

A multi class random forest (MCRF) model for classification of small plant peptides

TL;DR: Results of this study show that the proposed MCRF classifier has potential to accurately classify multi-level imbalanced data.
Book Chapter

Spam Detection Using Ensemble Learning

TL;DR: A voting classifier, a type of ensemble learning, is used to calculate the accuracy of different combinations of classifiers, and results show that the voting classifier produces more accurate predictions than an individual classifier.
References

Statistical learning theory

TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of learning process, the author covers function estimates from small data pools, applying these estimations to real-life problems, and much more.
Book Chapter

Text Categorization with Support Vector Machines: Learning with Many Relevant Features

TL;DR: This paper explores the use of Support Vector Machines for learning text classifiers from examples and analyzes the particular properties of learning with text data and identifies why SVMs are appropriate for this task.
Book Chapter

Naive (Bayes) at forty: the independence assumption in information retrieval

TL;DR: The naive Bayes classifier, currently experiencing a renaissance in machine learning, has long been a core technique in information retrieval, and some of the variations used for text retrieval and classification are reviewed.
Journal Article

Jackknife, Bootstrap and Other Resampling Methods in Regression Analysis

Chien-Fu Wu
- 01 Dec 1986 - 
TL;DR: In this paper, a class of weighted jackknife variance estimators for the least square estimator by deleting any fixed number of observations at a time was proposed, and three bootstrap methods were considered.
Proceedings Article

A Bayesian Approach to Filtering Junk E-Mail

TL;DR: This work examines methods for the automated construction of filters to eliminate such unwanted messages from a user’s mail stream, and shows the efficacy of such filters in a real world usage scenario, arguing that this technology is mature enough for deployment.