Proceedings ArticleDOI

EvilSeed: A Guided Approach to Finding Malicious Web Pages

TLDR
EVILSEED leverages the crawling infrastructure of search engines to retrieve URLs that are much more likely to be malicious than a random page on the web, and increases the "toxicity" of the input URL stream.
Abstract
Malicious web pages that use drive-by download attacks or social engineering techniques to install unwanted software on a user's computer have become the main avenue for the propagation of malicious code. To search for malicious web pages, the first step is typically to use a crawler to collect URLs that are live on the Internet. Then, fast prefiltering techniques are employed to reduce the number of pages that need to be examined by more precise, but slower, analysis tools (such as honeyclients). While effective, these techniques require a substantial amount of resources. A key reason is that the crawler encounters many pages on the web that are benign; that is, the "toxicity" of the stream of URLs being analyzed is low. In this paper, we present EVILSEED, an approach to search the web more efficiently for pages that are likely malicious. EVILSEED starts from an initial seed of known, malicious web pages. Using this seed, our system automatically generates search engine queries to identify other malicious pages that are similar or related to the ones in the initial seed. By doing so, EVILSEED leverages the crawling infrastructure of search engines to retrieve URLs that are much more likely to be malicious than a random page on the web. In other words, EVILSEED increases the "toxicity" of the input URL stream. We also envision that the features EVILSEED relies on could be directly applied by search engines in their prefilters. We have implemented our approach and evaluated it on a large-scale dataset. The results show that EVILSEED identifies malicious web pages more efficiently than crawler-based approaches.
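
The abstract describes the EVILSEED pipeline at a high level: mine known-malicious seed pages for characteristic content, turn that content into search engine queries, and feed the returned candidate URLs to the usual prefilter/honeyclient chain. The sketch below is a minimal illustration of that flow under stated assumptions, not the authors' implementation; fetch_page, search_engine, and prefilter are hypothetical stubs standing in for a real crawler, search API, and fast static filter.

# Minimal sketch (an assumption, not the EVILSEED code) of a guided search step:
# build one query per malicious seed page and keep only candidates that a cheap
# prefilter flags, so the expensive honeyclient sees a more "toxic" URL stream.
import re
from collections import Counter
from typing import List

def fetch_page(url: str) -> str:
    """Hypothetical stub: download and return the HTML of a seed page."""
    return ""

def search_engine(query: str, limit: int = 100) -> List[str]:
    """Hypothetical stub: return URLs a search engine yields for the query."""
    return []

def prefilter(url: str) -> bool:
    """Hypothetical stub: fast static check that flags a URL as worth deeper analysis."""
    return True

def query_terms(html: str, n: int = 5) -> List[str]:
    """Pick the most frequent distinctive words from a seed page's text."""
    words = re.findall(r"[a-z]{4,}", html.lower())
    return [w for w, _ in Counter(words).most_common(n)]

def guided_candidates(seed_urls: List[str]) -> List[str]:
    """URLs similar or related to the malicious seed, ready for a honeyclient."""
    candidates: List[str] = []
    for seed in seed_urls:
        query = " ".join(query_terms(fetch_page(seed)))
        candidates.extend(search_engine(query))       # reuse the search engine's crawl
    return [url for url in candidates if prefilter(url)]  # raise the stream's "toxicity"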



Citations
Journal ArticleDOI

Graph based anomaly detection and description: a survey

TL;DR: This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods for anomaly detection in data represented as graphs, and gives a general framework for the algorithms categorized under various settings.
Posted Content

Graph-based Anomaly Detection and Description: A Survey

TL;DR: A comprehensive survey of the state-of-the-art methods for anomaly detection in data represented as graphs can be found in this article, where the authors highlight the effectiveness, scalability, generality, and robustness aspects of the methods.
Proceedings ArticleDOI

Manufacturing compromise: the emergence of exploit-as-a-service

TL;DR: DNS traffic from real networks is used to provide a unique perspective on the popularity of malware families based on the frequency with which their binaries are installed by drive-by downloads, as well as the lifetime and popularity of domains funneling users to exploits.
Proceedings ArticleDOI

Nazca: Detecting Malware Distribution in Large-Scale Networks

TL;DR: This paper studies how clients in real-world networks download and install malware, and presents Nazca, a system that detects infections in large-scale networks by looking at the telltale signs of the malicious network infrastructures that orchestrate these malware installers.
Proceedings Article

Automatically detecting vulnerable websites before they turn malicious

TL;DR: A novel classification system is designed, implemented, and evaluated that predicts whether a given, not yet compromised website will become malicious in the future, i.e., whether a currently benign website will become compromised within a year.
References
Proceedings ArticleDOI

Detection and analysis of drive-by-download attacks and malicious JavaScript code

TL;DR: A novel approach to the detection and analysis of malicious JavaScript code is presented: it uses a number of features and machine-learning techniques to establish the characteristics of normal JavaScript code, and identifies anomalous JavaScript code by emulating its behavior and comparing it to the established profiles.
Proceedings Article

EXPOSURE : Finding malicious domains using passive DNS analysis

TL;DR: This paper introduces EXPOSURE, a system that employs large-scale, passive DNS analysis techniques to detect domains that are involved in malicious activity; it uses 15 features extracted from DNS traffic that characterize different properties of DNS names and the ways in which they are queried.
Proceedings Article

All Your iFRAMEs Point to Us

TL;DR: The relationship between the user browsing habits and exposure to malware, the techniques used to lure the user into the malware distribution networks, and the different properties of these networks are studied.
Proceedings Article

The ghost in the browser: analysis of web-based malware

TL;DR: This work identifies the four prevalent mechanisms used to inject malicious content on popular web sites: web server security, user-contributed content, advertising, and third-party widgets, and presents examples of abuse found on the Internet.
Proceedings Article

Automated Web Patrol with Strider HoneyMonkeys: Finding Web Sites That Exploit Browser Vulnerabilities.

TL;DR: The design and implementation of the Strider HoneyMonkey Exploit Detection System is described, which consists of a pipeline of “monkey programs” running possibly vulnerable browsers on virtual machines with different patch levels and patrolling the Web to seek out and classify web sites that exploit browser vulnerabilities.