Open Access · Posted Content
PhishOut: Effective Phishing Detection Using Selected Features.
TLDR
This paper applies knowledge discovery principles, from data cleansing, integration, selection, aggregation, and data mining through to knowledge extraction, and compares six machine-learning approaches to detect phishing based on a small number of carefully chosen features.
Abstract:
Phishing emails are the first step in many of today's attacks. They come with a simple hyperlink, a request for action, or a full replica of an existing service or website. The goal is generally to trick the user into voluntarily giving away sensitive information such as login credentials. Many approaches and applications have been proposed and developed to catch and filter phishing emails. However, the problem still lacks a complete and comprehensive solution. In this paper, we apply knowledge discovery principles, from data cleansing, integration, selection, aggregation, and data mining through to knowledge extraction. We study feature effectiveness based on Information Gain and contribute two new features to the literature. We compare six machine-learning approaches to detect phishing based on a small number of carefully chosen features. We calculate false positives, false negatives, mean absolute error, recall, precision, and F-measure, and achieve very low false positive and false negative rates. Naïve Bayes has the lowest true positive rate, and overall Neural Networks hold the most promise for accurate phishing detection, with an accuracy of 99.4%.
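The abstract ranks features by Information Gain and reports precision, recall, and F-measure. The following is a minimal sketch of those standard definitions only, not the authors' implementation. The function names, the toy labels, and the two example features (`has_ip_url`, `noisy_flag`) are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the expected entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F-measure for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical toy data: 1 = phishing, 0 = legitimate.
labels = [1, 1, 1, 0, 0, 0]
has_ip_url = [1, 1, 1, 0, 0, 0]   # hypothetical feature that separates the classes
noisy_flag = [1, 0, 1, 0, 1, 0]   # hypothetical feature that is weakly informative

gain_strong = information_gain(has_ip_url, labels)
gain_weak = information_gain(noisy_flag, labels)

predictions = [1, 1, 0, 0, 0, 1]  # hypothetical classifier output
precision, recall, f1 = precision_recall_f1(labels, predictions)
```

A feature with higher gain tells the classifier more about the class label, which is the rationale for keeping only a small number of high-gain features.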
Citations
Journal Article · DOI
A systematic literature review on phishing website detection techniques
Asadullah Safi, Satinder Singh +1 more
TL;DR: A systematic literature survey was conducted on 80 scientific papers published in the last five years in research journals, conferences, leading workshops, theses, book chapters, and high-ranking websites.
Proceedings Article · DOI
A Comparative Study on Email Phishing Detection Using Machine Learning Techniques
TL;DR: This paper compares previous studies of commonly used supervised machine learning techniques for detecting phishing email attacks, such as Decision Tree (DT), Naive Bayes (NB), Random Forest (RF), and Support Vector Machine (SVM).
Journal Article · DOI
Identification of pharming in communication networks using ensemble learning
TL;DR: This work aims at enhancing pharming detection strategies by adopting machine learning classification algorithms that include K-Nearest Neighbors, Decision Tree, Random Forest, Gaussian Naive Bayes, Logistic Regression, Support Vector Machine, Adaptive Boosting, Gradient Boosting, and Extra Trees Classifier.
Proceedings Article · DOI
Accuracy Comparison of Different Machine Learning Models in Phishing Detection
TL;DR: In this paper, the authors compared different machine learning algorithms for detecting whether a URL is legitimate or phishing, using a web page phishing detection dataset.
Journal Article · DOI
Systematic Literature Review: Anti-Phishing Defences and Their Application to Before-the-click Phishing Email Detection
TL;DR: This paper discusses the performance and suitability of using these techniques for detecting phishing emails before the end-user even reads the email, and suggests some promising areas for further research.
References
Book
Data Mining: Practical Machine Learning Tools and Techniques
TL;DR: This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.
Journal Article · DOI
The random subspace method for constructing decision forests
TL;DR: A method to construct a decision tree based classifier is proposed that maintains highest accuracy on training data and improves on generalization accuracy as it grows in complexity.
Proceedings Article · DOI
Learning to detect phishing emails
TL;DR: This method is applicable, with slight modification, to the detection of phishing websites or the emails used to direct victims to these sites, and correctly identifies over 96% of the phishing emails while misclassifying only on the order of 0.1% of the legitimate emails.
Proceedings Article
Client-Side Defense Against Web-Based Identity Theft.
TL;DR: A framework for client-side defense is proposed: a browser plug-in that examines web pages and warns the user when requests for data may be part of a spoof attack.
Journal Article · DOI
The state of phishing attacks
TL;DR: Looking past the systems people use, they target the people using the systems.
Related Papers (5)
Classifying Phishing Emails Using Confidence-Weighted Linear Classifiers
Ram B. Basnet, Andrew H. Sung +1 more
Applying machine learning and natural language processing to detect phishing email
Areej AlHogail, Afrah Alsabih +1 more