Proceedings ArticleDOI

An integrated approach to detect phishing mail attacks: a case study

06 Oct 2009-pp 193-199
TL;DR: Proposes a resilient and effective method that uses fuzzy logic to quantify and qualify website phishing characteristics and factors in order to assess whether a website is engaged in phishing.
Abstract: Phishing is the process of luring unsuspecting Internet users to a fake website by means of authentic-looking email and messages for fraudulent purposes. The method phishers most often use to lure victims is a mass email constructed to look like an authentic message from a well-known company. Phishing websites pose both technical and social problems, and because the issue is complicated and hard to understand and analyze, no single silver bullet is known to date that solves it entirely. This paper proposes a resilient and effective method that uses fuzzy logic to quantify and qualify website phishing characteristics and factors in order to assess whether phishing activity is taking place. The approach views the webpage in three layers. The first layer, Domain Name Checker, is based entirely on the characteristics of hyperlinks. The second, Code Script Checker, looks for attacker tricks that use JavaScript to hide information from the user and potentially launch sophisticated attacks. The last layer, Page Content Checker, checks for a phishing site based on its sub-criteria. Finally, if any of them (with regard to the true one) is higher than its corresponding preset threshold, the webpage is reported as a phishing suspect.
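The final decision rule in the abstract (report the page when any layer exceeds its own preset threshold) can be sketched as follows. This is a minimal illustration only: the `LayerScore` type, the numeric scores, and the threshold values are hypothetical and do not come from the paper, which derives its layer outputs from fuzzy logic rules not given in the abstract.

```python
from dataclasses import dataclass


@dataclass
class LayerScore:
    name: str
    score: float      # hypothetical risk score in [0, 1] produced by the layer
    threshold: float  # hypothetical preset threshold for that layer


def triggered_layers(layers):
    """Return the names of layers whose score exceeds their own threshold."""
    return [layer.name for layer in layers if layer.score > layer.threshold]


def is_phishing_suspect(layers):
    """A page is a suspect as soon as any single layer exceeds its threshold."""
    return len(triggered_layers(layers)) > 0


# Illustrative values only; they are not results from the paper.
page = [
    LayerScore("Domain Name Checker", 0.72, 0.60),
    LayerScore("Code Script Checker", 0.10, 0.50),
    LayerScore("Page Content Checker", 0.35, 0.55),
]

print(is_phishing_suspect(page), triggered_layers(page))
```

Because the rule is an "any layer" test, a single strong signal (here the hyperlink-based layer) is enough to flag the page even when the other two layers look benign.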
Citations
Proceedings ArticleDOI
01 Nov 2011
TL;DR: A cadre of file matching algorithms is implemented to detect phishing websites based on their content, employing a custom data set consisting of 17,992 phishing attacks targeting 159 different brands and demonstrating that some can achieve a detection rate of greater than 90% while maintaining a low false positive rate.
Abstract: Phishers continue to alter the source code of the web pages used in their attacks to mimic changes to legitimate websites of spoofed organizations and to avoid detection by phishing countermeasures. Manipulations can be as subtle as source code changes or as apparent as adding or removing significant content. To appropriately respond to these changes to phishing campaigns, a cadre of file matching algorithms is implemented to detect phishing websites based on their content, employing a custom data set consisting of 17,992 phishing attacks targeting 159 different brands. The results of the experiments using a variety of different content-based approaches demonstrate that some can achieve a detection rate of greater than 90% while maintaining a low false positive rate.

35 citations

Proceedings Article
24 Apr 2012
TL;DR: This paper describes an attempt to create an automated method that produces clusters of phishing websites that target the same brand and that, evidence suggests, were created by the same phishing group or individual.
Abstract: Phishing websites attempt to deceive people into exposing their passwords, user IDs, and other sensitive information by mimicking legitimate websites such as those of banks, product vendors, and service providers. Phishing websites are a pervasive and ongoing problem. Examining and analyzing a phishing website is a good first step in an investigation, but it can be a manually intensive job, and analyzing a large continuous feed of phishing websites manually would be an almost insurmountable problem because of the time and labor required. Automated methods are needed that group large volumes of phishing website data and allow investigators to focus their efforts on the largest groupings, which represent the most prevalent phishing groups or individuals. An attempt to create such an automated method is described in this paper. The method is based on the assumption that phishing websites attacking a particular brand are often reused many times by a particular group or individual, and that when the targeted brand changes, a new phishing website is not created from scratch; rather, incremental upgrades are made to the original phishing website. The method employs a SLINK-style clustering algorithm that uses local domain file commonality between websites as a distance metric. It produces clusters of phishing websites that target the same brand and that, evidence suggests, were created by the same phishing group or individual.
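The clustering idea in this abstract can be illustrated with a small single-linkage sketch. It makes several assumptions that the abstract does not confirm: each site is represented as a set of file hashes, distance is one minus the Jaccard overlap of those sets, and clusters are formed by cutting at a fixed `max_distance`. It also uses a naive pairwise pass with union-find rather than the SLINK algorithm itself, which computes the same single-linkage grouping more efficiently.

```python
def file_distance(a, b):
    """Assumed metric: 1 - Jaccard similarity of two sets of file hashes."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def single_link_clusters(sites, max_distance):
    """Group sites so that any two sites joined by a chain of close pairs
    (distance <= max_distance) end up in the same cluster."""
    names = sorted(sites)
    parent = {name: name for name in names}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if file_distance(sites[a], sites[b]) <= max_distance:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values())


# Hypothetical sites; "h1".."h9" stand in for hashes of files found on each site.
sites = {
    "site_a": {"h1", "h2", "h3"},
    "site_b": {"h2", "h3", "h4"},
    "site_c": {"h3", "h4", "h5"},
    "site_d": {"h9"},
}

print(single_link_clusters(sites, max_distance=0.5))
```

In this toy data `site_a` and `site_c` share only one file, yet they land in the same cluster because each is close to `site_b`. That chaining behaviour is what lets single linkage follow a kit through incremental revisions.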

22 citations


Cites methods from "An integrated approach to detect ph..."

  • ...Other researchers have used components within the source code [2][11]....

Proceedings ArticleDOI
28 Jul 2015
TL;DR: This study investigates and identifies parameters in a single platform based on fuzzy system and neural network for phishing websites detection and achieves the best performance compared to other results in the field.
Abstract: This study investigates and identifies parameters in a single platform based on fuzzy system and neural network for phishing websites detection. The new approach utilizes Fuzzy systems, neural network with a set of parameters and a data set to detect phishing sites with high accuracy in real-time. A total of 300 data from six sources were used as training and testing sets using 2-fold cross-validation to train and validate the model, which has achieved the best performance (99.6%) compared to other results in the field.

13 citations

Journal ArticleDOI
06 May 2019
TL;DR: An Enhanced Malicious URLs Detection (EMUD) model is developed with machine learning techniques for better classification and accurate results.
Abstract: Phishing is the process of enticing people into visiting fraudulent websites and persuading them to enter their personal information. Large numbers of phishing emails are sent with the aim of making web users believe that they are communicating with a trusted entity or organization. Phishing is carried out using advanced and harmful tactics such as malicious or phishing URLs, so detecting such URLs has become necessary. Numerous anti-phishing techniques are in use to distinguish fake websites from authentic ones, but they are not effective. This research focuses on the relevant URL features that discriminate between legitimate and malicious/phishing URLs. The impact of email phishing can be largely reduced by combining these features appropriately with classification techniques. Therefore, an Enhanced Malicious URLs Detection (EMUD) model is developed with machine learning techniques for better classification and accurate results.

11 citations

Proceedings ArticleDOI
09 Oct 2014
TL;DR: This study presents a novel parameter tuning framework based on a neuron-fuzzy system with comprehensive features, which can enhance system performance in realtime and will provide guidance to the researchers who are using similar techniques in the field.
Abstract: Phishing attacks have become more sophisticated in web-based transactions. As a result, various solutions have been developed to tackle the problem. Such solutions include feature-based and blacklist-based approaches that apply machine learning algorithms. However, accurate, real-time solutions are still lacking. Most machine learning algorithms are parameter driven, but the parameters are difficult to tune to a desirable output. In line with Jiang and Ma’s findings, this study presents a parameter tuning framework that uses a neuron-fuzzy system with comprehensive features in order to maximize system performance. The neuron-fuzzy system was chosen because it can generate fuzzy rules from given features and learn new features. Extensive experiments were conducted using different feature-sets, two cross-validation methods, a hybrid method, and different parameters, and achieved 98.4% accuracy. Our results demonstrated high performance compared to other results in the field. As a contribution, we introduced a novel parameter tuning framework based on a neuron-fuzzy system with six feature-sets and identified different numbers of membership functions, different numbers of epochs, and different sizes of feature-sets on a single platform. Parameter tuning based on a neuron-fuzzy system with comprehensive features can enhance system performance in real time. The outcome will provide guidance to researchers who are using similar techniques in the field. It will decrease difficulties and increase confidence in the process of tuning parameters on a given problem.

8 citations


Cites background or methods from "An integrated approach to detect ph..."

  • ...…extract feature including: (1) Data Protection Act designed to protect personal data, (2) Bill C-28, CAN SPAM is a new Canada bill of 2010 requires that all email senders to obtain prior consent from recipient, (3) Web copyright: one of the tactic used by phishing is to create a…...

  • ...Overall 342 features were utilized for training and testing....

References
Proceedings ArticleDOI
08 May 2007
TL;DR: The design, implementation, and evaluation of CANTINA, a novel, content-based approach to detecting phishing web sites, based on the TF-IDF information retrieval algorithm, are presented.
Abstract: Phishing is a significant problem involving fraudulent email and web sites that trick unsuspecting users into revealing private information. In this paper, we present the design, implementation, and evaluation of CANTINA, a novel, content-based approach to detecting phishing web sites, based on the TF-IDF information retrieval algorithm. We also discuss the design and evaluation of several heuristics we developed to reduce false positives. Our experiments show that CANTINA is good at detecting phishing sites, correctly labeling approximately 95% of phishing sites.
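As a rough illustration of the TF-IDF weighting that CANTINA builds on, the sketch below scores the terms of one page against a small reference corpus and returns the highest-weighted ones. The smoothed IDF formula, the toy corpus, and the function names are assumptions made for this example; the abstract does not specify the exact weighting or how the resulting terms are used.

```python
import math
from collections import Counter


def tf_idf(doc_tokens, corpus):
    """Score each term in doc_tokens: term frequency times a smoothed IDF.

    corpus is a list of token lists used only to estimate document frequency.
    """
    counts = Counter(doc_tokens)
    n_docs = len(corpus)
    scores = {}
    for term, count in counts.items():
        df = sum(1 for doc in corpus if term in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothing avoids division by zero
        scores[term] = (count / len(doc_tokens)) * idf
    return scores


def top_terms(doc_tokens, corpus, k=5):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    scores = tf_idf(doc_tokens, corpus)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical reference corpus and page tokens.
corpus = [
    ["login", "account", "bank"],
    ["news", "weather", "account"],
    ["login", "news", "sports"],
]
page = ["examplebank", "login", "examplebank", "password", "account"]

print(top_terms(page, corpus, k=2))
```

Terms that are frequent on the page but rare in the reference corpus, such as a brand name, rise to the top. That is what makes them useful as a compact signature of what the page claims to be.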

813 citations

Proceedings Article
01 Jan 2004
TL;DR: A framework for client-side defense is proposed: a browser plug-in that examines web pages and warns the user when requests for data may be part of a spoof attack.
Abstract: Web spoofing is a significant problem involving fraudulent email and web sites that trick unsuspecting users into revealing private information. We discuss some aspects of common attacks and propose a framework for client-side defense: a browser plug-in that examines web pages and warns the user when requests for data may be part of a spoof attack. While the plug-in, SpoofGuard, has been tested using actual sites obtained through government agencies concerned about the problem, we expect that web spoofing and other forms of identity theft will be continuing problems in

487 citations

Book ChapterDOI
12 Jul 2007
TL;DR: Over a period of three weeks, the effectiveness of the blacklists maintained by Google and Microsoft was tested with 10,000 phishing URLs, and the existence of page properties that can be used to identify phishing pages was explored.
Abstract: Phishing is an electronic online identity theft in which the attackers use a combination of social engineering and web site spoofing techniques to trick a user into revealing confidential information. This information is typically used to make an illegal economic profit (e.g., by online banking transactions, purchase of goods using stolen credentials, etc.). Although simple, phishing attacks are remarkably effective. As a result, the numbers of successful phishing attacks have been continuously increasing and many anti-phishing solutions have been proposed. One popular and widely-deployed solution is the integration of blacklist-based anti-phishing techniques into browsers. However, it is currently unclear how effective such blacklisting approaches are in mitigating phishing attacks in real life. In this paper, we report our findings on analyzing the effectiveness of two popular anti-phishing solutions. Over a period of three weeks, we automatically tested the effectiveness of the blacklists maintained by Google and Microsoft with 10,000 phishing URLs. Furthermore, by analyzing a large number of phishing pages, we explored the existence of page properties that can be used to identify phishing pages.

194 citations

Proceedings ArticleDOI
01 Oct 2006
TL;DR: This paper proposes a new end-host based anti-phishing algorithm, called LinkGuard, that utilizes the generic characteristics of the hyperlinks in phishing attacks, derived by analyzing the phishing data archive provided by the Anti-Phishing Working Group (APWG).
Abstract: Phishing is a new type of network attack where the attacker creates a replica of an existing Web page to fool users (e.g., by using specially designed e-mails or instant messages) into submitting personal, financial, or password data to what they think is their service provider's Web site. In this paper, we propose a new end-host based anti-phishing algorithm, which we call LinkGuard, by utilizing the generic characteristics of the hyperlinks in phishing attacks. These characteristics are derived by analyzing the phishing data archive provided by the Anti-Phishing Working Group (APWG). Because it is based on the generic characteristics of phishing attacks, LinkGuard can detect not only known but also unknown phishing attacks. We have implemented LinkGuard in Windows XP. Our experiments verified that LinkGuard is effective at detecting and preventing both known and unknown phishing attacks with minimal false negatives. LinkGuard successfully detects 195 out of the 203 phishing attacks. Our experiments also showed that LinkGuard is lightweight and can detect and prevent phishing attacks in real time.
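To make the notion of "generic characteristics of the hyperlinks" concrete, the sketch below checks two properties of a link: whether the target host is a raw IP address, and whether the visible link text names a different host than the actual target. These two checks are assumptions chosen for illustration; the abstract does not list the characteristics LinkGuard actually uses, and the function names are hypothetical.

```python
import ipaddress
from urllib.parse import urlparse


def host_of(url):
    """Extract the lowercase hostname, tolerating text without a scheme."""
    parsed = urlparse(url if "://" in url else "http://" + url)
    return (parsed.hostname or "").lower()


def is_ip_host(host):
    """True if the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def link_warnings(visible_text, href):
    """Return warnings for one hyperlink (illustrative checks only)."""
    warnings = []
    actual = host_of(href)
    if is_ip_host(actual):
        warnings.append("href uses a raw IP address")

    # Only treat the visible text as a host if it looks like one.
    text = visible_text.strip()
    looks_like_host = "." in text and " " not in text
    shown = host_of(text) if looks_like_host else ""
    if shown and shown != actual:
        warnings.append("visible text names a different host than href")
    return warnings


# Hypothetical link: text shows a bank domain, target is an IP address.
print(link_warnings("www.example-bank.com", "http://192.0.2.10/login"))
```

Checks like these depend only on the link itself rather than on a list of known bad sites, which is consistent with the abstract's claim that the approach can flag previously unseen attacks.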

126 citations

Proceedings ArticleDOI
07 Apr 2008
TL;DR: A novel approach to overcome the 'fuzziness' in traditional Website phishing risk assessment is presented and an intelligent resilient and effective model for detecting phishing Websites is proposed.
Abstract: Phishing websites are forged web pages created by malicious people to mimic the web pages of real websites in an attempt to defraud people of their personal information. Detecting and identifying phishing websites is a complex and dynamic problem involving many factors and criteria. Because of the subjective considerations and ambiguities involved in detection, a fuzzy logic (FL) model can be a more effective tool for assessing and identifying phishing websites than traditional tools, since it offers a more natural way of dealing with quality factors rather than exact values. In this paper, we present a novel approach to overcome the 'fuzziness' in traditional website phishing risk assessment and propose an intelligent, resilient, and effective model for detecting phishing websites. The proposed model is based on FL operators, which are used to characterize the website phishing factors and indicators as fuzzy variables, and it produces six measures and criteria of website phishing attack dimensions with a layer structure. Our experimental results showed the significance and importance of the phishing website criteria (URL & Domain Identity) represented by layer one, and the varying influence of the phishing characteristic layers on the final phishing website rate.

58 citations