Showing papers in "Journal in Computer Virology in 2008"


Journal ArticleDOI
TL;DR: A survey of the different reasoning techniques deployed among the behavioral detectors has been drawn up, classified according to a new taxonomy introduced inside the paper.
Abstract: Behavioral detection differs from appearance detection in that it identifies the actions performed by the malware rather than syntactic markers. Identifying these malicious actions and interpreting their final purpose is a complex reasoning process. This paper draws up a survey of the different reasoning techniques deployed among the behavioral detectors. These detectors have been classified according to a new taxonomy introduced inside the paper. Strongly inspired from the domain of program testing, this taxonomy divides the behavioral detectors into two main families: simulation-based and formal detectors. Inside these families, ramifications are then derived according to the data collection mechanisms, the data interpretation, the adopted model and its generation, and the decision support.

213 citations


Journal ArticleDOI
TL;DR: The Intelligent Malware Detection System (IMDS) is an integrated system consisting of three major modules: PE parser, OOA rule generator, and rule based classifier, and an OOA_Fast_FP-Growth algorithm is adapted to efficiently generate OOA rules for classification.
Abstract: The proliferation of malware has presented a serious threat to the security of computer systems. Traditional signature-based anti-virus systems fail to detect polymorphic/metamorphic and new, previously unseen malicious executables. Data mining methods such as Naive Bayes and Decision Tree have been studied on small collections of executables. In this paper, resting on the analysis of Windows APIs called by PE files, we develop the Intelligent Malware Detection System (IMDS) using Objective-Oriented Association (OOA) mining based classification. IMDS is an integrated system consisting of three major modules: PE parser, OOA rule generator, and rule based classifier. An OOA_Fast_FP-Growth algorithm is adapted to efficiently generate OOA rules for classification. A comprehensive experimental study on a large collection of PE files obtained from the anti-virus laboratory of KingSoft Corporation is performed to compare various malware detection approaches. Promising experimental results demonstrate that the accuracy and efficiency of our IMDS system outperform popular anti-virus software such as Norton AntiVirus and McAfee VirusScan, as well as previous data mining based detection systems which employed Naive Bayes, Support Vector Machine (SVM) and Decision Tree techniques. Our system has already been incorporated into the scanning tool of KingSoft’s Anti-Virus software.

183 citations
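
The rule-based classification described in this abstract rests on association rules whose consequent is a class label. The following is a minimal sketch of how the support and confidence of one such rule could be computed, assuming toy data and the textbook definitions of both measures. The API-call sets, the labels, and the helper name `rule_stats` are hypothetical, and this is not the OOA_Fast_FP-Growth algorithm itself.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical toy data: (set of API calls seen in a PE file, is_malware label).
Sample = Tuple[FrozenSet[str], bool]


def rule_stats(samples: List[Sample], antecedent: FrozenSet[str]) -> Tuple[float, float]:
    """Return (support, confidence) of the rule `antecedent -> malware`."""
    # Labels of all samples whose API set contains every call in the antecedent.
    covered = [label for apis, label in samples if antecedent <= apis]
    hits = sum(covered)  # samples that contain the antecedent AND are malware
    support = hits / len(samples) if samples else 0.0
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence


samples: List[Sample] = [
    (frozenset({"OpenProcess", "WriteProcessMemory", "CreateRemoteThread"}), True),
    (frozenset({"OpenProcess", "WriteProcessMemory"}), True),
    (frozenset({"OpenProcess", "ReadFile"}), False),
    (frozenset({"ReadFile", "WriteFile"}), False),
]

support, confidence = rule_stats(
    samples, frozenset({"OpenProcess", "WriteProcessMemory"})
)
```

A classifier in this style would keep only rules whose support and confidence exceed chosen thresholds; the thresholds used by IMDS are not stated in the abstract.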


Journal ArticleDOI
TL;DR: It is proved that reliable static detection of a particular category of metamorphic viruses is an \({\mathcal{NP}}\)-complete problem.
Abstract: This paper deals with metamorphic viruses. More precisely, it examines the use of advanced code obfuscation techniques with respect to metamorphic viruses. Our objective is to evaluate the difficulty of a reliable static detection of viruses that use such obfuscation techniques. Here we extend Spinellis’ result (IEEE Trans. Inform. Theory, 49(1), 280–284, 2003) on the detection complexity of bounded-length polymorphic viruses to metamorphic viruses. In particular, we prove that reliable static detection of a particular category of metamorphic viruses is an \({\mathcal{NP}}\)-complete problem. Then we empirically illustrate our result by constructing a practical obfuscator which could be used by metamorphic viruses in the future to evade detection.

176 citations


Journal ArticleDOI
TL;DR: This paper proposes a flexible and automated approach to extract malware behaviour by observing all the system function calls performed in a virtualized execution environment and shows how the accuracy of the classification process can be improved using a phylogenetic tree.
Abstract: Several malware analysis techniques suppose that the disassembled code of a piece of malware is available, which is however not always possible. This paper proposes a flexible and automated approach to extract malware behaviour by observing all the system function calls performed in a virtualized execution environment. Similarities and distances between malware behaviours are computed which allows to classify malware behaviours. The main features of our approach reside in coupling a sequence alignment method to compute similarities and leverage the Hellinger distance to compute associated distances. We also show how the accuracy of the classification process can be improved using a phylogenetic tree. Such a tree shows common functionalities and evolution of malware. This is relevant when dealing with obfuscated malware variants that have often similar behaviour. The phylogenetic trees were assessed using known antivirus results and only a few malware behaviours were wrongly classified.

115 citations
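
The abstract above names the Hellinger distance as the measure used between malware behaviours. As a rough illustration, the sketch below computes it between two system-call frequency distributions. How the paper derives its distributions (it couples them with sequence alignment) is not given here, so building them from raw call counts is an assumption, as are the helper names and the sample traces.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def call_distribution(trace: Sequence[str]) -> Dict[str, float]:
    """Turn a trace of system-call names into relative frequencies."""
    counts = Counter(trace)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {call: n / total for call, n in counts.items()}


def hellinger(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Hellinger distance between two discrete distributions, in [0, 1]."""
    keys = set(p) | set(q)
    squared = sum(
        (math.sqrt(p.get(k, 0.0)) - math.sqrt(q.get(k, 0.0))) ** 2 for k in keys
    )
    return math.sqrt(squared / 2.0)


# Hypothetical traces of two samples run in a sandbox.
trace_a = ["NtCreateFile", "NtWriteFile", "NtWriteFile", "NtClose"]
trace_b = ["NtCreateFile", "NtWriteFile", "NtClose", "NtClose"]
distance = hellinger(call_distribution(trace_a), call_distribution(trace_b))
```

A distance of 0 means identical call distributions and 1 means no calls in common; pairwise distances of this kind are the usual input to a tree-building step such as the phylogenetic tree mentioned in the abstract.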


Journal ArticleDOI
TL;DR: This paper has taken advantage of the lack of security mechanisms for browser extensions and implemented a malware application for the popular Firefox web browser, which it is claimed takes complete control of the user’s browser space, can observe all activity performed through the browser and is undetectable.
Abstract: In this paper we examine security issues of functionality extension mechanisms supported by web browsers. Extensions (or “plug-ins”) in modern web browsers enjoy unrestrained access at all times and thus are attractive vectors for malware. To solidify the claim, we take on the role of malware writers looking to assume control of a user’s browser space. We have taken advantage of the lack of security mechanisms for browser extensions and implemented a malware application for the popular Firefox web browser, which we call browserSpy, that requires no special privileges to be installed. browserSpy takes complete control of the user’s browser space, can observe all activity performed through the browser and is undetectable. We then adopt the role of defenders to discuss defense strategies against such malware. Our primary contribution is a mechanism that uses code integrity checking techniques to control the extension installation and loading process. We describe two implementations of this mechanism: a drop-in solution that employs JavaScript and a faster, in-browser solution that makes use of the browser’s native cryptography implementation. We also discuss techniques for runtime monitoring of extension behavior to provide a foundation for defending against threats posed by installed extensions.

99 citations


Journal ArticleDOI
TL;DR: This paper surveys various techniques that have been used in public or private tools in order to enhance the password cracking process, and addresses the issues of algorithmic and implementation optimisations, the use of special purpose hardware and the use of the Markov chains tool.
Abstract: This paper surveys various techniques that have been used in public or private tools in order to enhance the password cracking process. After a brief overview of this process, it addresses the issues of algorithmic and implementation optimisations, the use of special purpose hardware and the use of the Markov chains tool. Experimental results are then shown, comparing several implementations.

70 citations
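
The abstract above mentions Markov chains as a tool for password cracking without giving details. One common reading is a character-level model that ranks candidate passwords by likelihood so the most probable ones are tried first. The sketch below assumes a first-order model with add-one smoothing over a 95-symbol printable-ASCII alphabet; the training words, the `START` marker, and the function names are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable

START = "^"  # hypothetical marker for "beginning of password"


def train(words: Iterable[str]) -> Dict[str, Dict[str, int]]:
    """Count character transitions, including the transition from START."""
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for word in words:
        prev = START
        for ch in word:
            counts[prev][ch] += 1
            prev = ch
    return counts


def log_prob(word: str, counts: Dict[str, Dict[str, int]],
             alphabet_size: int = 95) -> float:
    """Log-probability of `word` under the model, with add-one smoothing."""
    total = 0.0
    prev = START
    for ch in word:
        row = counts.get(prev, {})
        seen = sum(row.values())
        total += math.log((row.get(ch, 0) + 1) / (seen + alphabet_size))
        prev = ch
    return total


model = train(["password", "passw0rd", "pass123"])
candidates = ["zxqv", "pass", "word"]
# Most likely candidates first, which is the order a cracker would try them in.
ranked = sorted(candidates, key=lambda w: log_prob(w, model), reverse=True)
```

Candidates that resemble the training set score higher and are tested earlier, which is where the speed-up over plain brute force would come from.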


Journal ArticleDOI
TL;DR: The state of the art concerning this class of attacks is documented, relevant protection approaches are summed up, and directions for future research are provided.
Abstract: The term JavaScript Malware describes attacks that abuse the web browser’s capabilities to execute malicious script-code within the victim’s local execution context. Unlike related attacks, JavaScript Malware does not rely on security vulnerabilities in the web browser’s code but instead solely utilizes legal means in respect to the applying specification documents. Such attacks can either invade the user’s privacy, explore and exploit the LAN, or use the victimized browser as an attack proxy. This paper documents the state of the art concerning this class of attacks, sums up relevant protection approaches, and provides directions for future research.

53 citations


Journal ArticleDOI
TL;DR: An overview is given of all known “live” memory collection techniques on a Windows system and of freely available memory analysis tools, and limitations and known anti-collection techniques are reviewed.
Abstract: This paper gives an overview of all known “live” memory collection techniques on a Windows system, and freely available memory analysis tools. Limitations and known anti-collection techniques will also be reviewed. Analysis techniques will be illustrated through some practical examples, drawn from past forensics challenges. This paper is forensics-oriented, but the information provided will also be of interest to malware analysts fighting against stealth rootkits.

48 citations


Journal ArticleDOI
TL;DR: This paper exposes the research results on 802.11 driver vulnerabilities by focusing on the design and implementation of a fully featured 802.11 fuzzer that enabled us to find several critical implementation bugs that are potentially exploitable by attackers.
Abstract: 802.11 Wireless local area networks are unfortunately notoriously infamous due to their many, critical security flaws. Last year, world-first 802.11 wireless driver vulnerabilities were publicly disclosed, making them a critical and recent threat. In this paper, we expose our research results on 802.11 driver vulnerabilities by focusing on the design and implementation of a fully featured 802.11 fuzzer that enabled us to find several critical implementation bugs that are potentially exploitable by attackers. Lastly, we will detail the successful exploitation of the first 802.11 remote kernel stack overflow under Linux (madwifi driver).

39 citations


Journal ArticleDOI
TL;DR: This paper first introduces two formal definitions for the interactive and the distributed viruses, and designs an operational framework to describe malicious behaviors based on interactive languages.
Abstract: Several semantic-based malware analyzers have recently been put forward, each one defining its own model to capture the code behavior. All these semantic models, and abstract virology models likewise, fundamentally rely on formalisms equivalent to Turing Machines. However, as stated by recent advances in computer theory, these same formalisms do not capture appropriately interactions and concurrency. Unfortunately, malware, adaptable and resilient by essence, are likely to use these mechanisms intensively. In this paper, we thus extend the malware models to the specifically designed Interaction Machines. We first introduce two formal definitions for the interactive and the distributed viruses. According to different classes of interactions, their detection complexity is strongly impacted. Based on interactive languages, we then design an operational framework to describe malicious behaviors. Descriptions for some representative behaviors are given to complete and assess this framework.

19 citations


Journal ArticleDOI
TL;DR: This paper proposes a solution that provides both anonymity and security to Ad Hoc networks with a limited impact on QoS, and this method could be efficient against some viral attacks.
Abstract: Privacy and security solutions today require the protection of personal information so that it may not be disclosed to unauthorized participants for illegal purposes. It is a challenge to address these issues in networks with strong constraints such as Ad Hoc networks. The security increase is often obtained with a quality of service (QoS) decrease. We propose in this paper a solution that provides both anonymity and security to Ad Hoc networks with a limited impact on QoS. This method could be efficient against some viral attacks. We also give some security proofs of our solution for Ad Hoc networks.

Journal ArticleDOI
TL;DR: A dynamic redirection mechanism of connections initiated from the honeypot that gives the attacker the illusion of being actually connected to a remote machine whereas he is redirected to another local honeypot.
Abstract: High-interaction honeypots are interesting as they help understand how attacks unfold on a compromised machine. However, observations are generally limited to the operations performed by the attackers on the honeypot itself. Outgoing malicious activities carried out from the honeypot towards remote machines on the Internet are generally disallowed for legal liability reasons. It is particularly instructive, however, to observe activities initiated from the honeypot in order to monitor attacker behavior across different, possibly compromised remote machines. This paper proposes to this end a dynamic redirection mechanism of connections initiated from the honeypot. This mechanism gives the attacker the illusion of being actually connected to a remote machine whereas he is redirected to another local honeypot. The originality of the proposed redirection mechanism lies in its dynamic aspect: the redirections are made automatically on the fly. This mechanism has been implemented and tested on a Linux kernel. This paper presents the design and the implementation of this mechanism.

Journal ArticleDOI
TL;DR: Two light-weight worm detection algorithms that offer significant advantages over fixed-threshold methods are presented and RBS+TRW, an algorithm that combines fan-out rate (RBS) and probability of failure (TRW) of connections to new destinations, is introduced.
Abstract: We present two light-weight worm detection algorithms that offer significant advantages over fixed-threshold methods. The first algorithm, rate-based sequential hypothesis testing (RBS), aims at the large class of worms that attempt to quickly propagate, thus exhibiting abnormal levels of the rate at which hosts initiate connections to new destinations. The foundation of RBS derives from the theory of sequential hypothesis testing, the use of which for detecting randomly scanning hosts was first introduced by our previous work developing TRW (Jung et al. in Proceedings of the IEEE Symposium on Security and Privacy, 9–12 May 2004). The sequential hypothesis testing methodology enables us to engineer detectors to meet specific targets for false-positive and false-negative rates, rather than triggering when fixed thresholds are crossed. In this sense, the detectors that we introduce are truly adaptive. We then introduce RBS+TRW, an algorithm that combines fan-out rate (RBS) and probability of failure (TRW) of connections to new destinations. RBS+TRW provides a unified framework that at one end acts as pure RBS and at the other end as pure TRW. Selecting an operating point that includes both mechanisms extends RBS’s power in detecting worms that scan randomly selected IP addresses. Using four traces from three qualitatively different sites, we evaluate RBS and RBS+TRW in terms of false positives, false negatives, and detection speed, finding that RBS+TRW provides good detection of actual Code Red worm outbreaks that we caught in our trace as well as internal Web crawlers that we use as proxies for targeting worms. In doing so, RBS+TRW generates fewer than one false alarm per hour for a wide range of parameter choices.
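
To make the idea of sequential hypothesis testing concrete, here is a minimal sketch of a Wald-style sequential probability ratio test over connection outcomes, in the spirit of the TRW side of the abstract (probability of failure of connections to new destinations). It does not implement RBS, which works on connection rates. The failure probabilities, the target error rates, and the function name `sprt` are assumptions chosen for illustration.

```python
import math
from typing import Iterable


def sprt(outcomes: Iterable[bool], p_fail_benign: float, p_fail_scanner: float,
         alpha: float = 0.01, beta: float = 0.01) -> str:
    """Classify a host from first-contact connection outcomes.

    outcomes: True when a connection to a new destination failed.
    alpha:    target false-positive rate; beta: target false-negative rate.
    """
    upper = math.log((1 - beta) / alpha)   # crossing it accepts "scanner"
    lower = math.log(beta / (1 - alpha))   # crossing it accepts "benign"
    llr = 0.0                              # running log-likelihood ratio
    for failed in outcomes:
        if failed:
            llr += math.log(p_fail_scanner / p_fail_benign)
        else:
            llr += math.log((1 - p_fail_scanner) / (1 - p_fail_benign))
        if llr >= upper:
            return "scanner"
        if llr <= lower:
            return "benign"
    return "pending"  # not enough evidence yet


# Hypothetical parameters: benign hosts fail 20% of the time, scanners 80%.
verdict = sprt([True, True, True, True], p_fail_benign=0.2, p_fail_scanner=0.8)
```

The thresholds depend only on the chosen error targets, which is what lets such a detector be tuned to false-positive and false-negative rates instead of to a fixed count of failed connections.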

Journal ArticleDOI
Philippe Lagadec
TL;DR: The security issues with technical details are shown, including XML and ZIP obfuscation techniques that may be used to bypass antiviruses, and how to design a filter to get rid of unwanted parts in a safe way is described.
Abstract: OpenDocument and Open XML are both new open file formats for office documents. OpenDocument is an ISO standard, promoted by OpenOffice.org and Sun StarOffice. Open XML is the new format for Microsoft Office 2007 documents, an ECMA standard. These two formats share the same basic principles: XML files within a ZIP archive, with an open schema, in contrast to good-old proprietary formats (MS Word, Excel, PowerPoint, ...). However, both of them suffer from many security issues, similar to previous Office formats: malicious people can still embed and hide malware (Trojan horses and viruses) thanks to macros, scripts, OLE objects and similar features. This paper shows the security issues with technical details, including XML and ZIP obfuscation techniques that may be used to bypass antiviruses, and describes how to design a filter to get rid of unwanted parts in a safe way.

Journal ArticleDOI
TL;DR: This paper extends the attack using a self-referential query to Windows, SQL Server 2005, and ASP with their respective latest updates installed, and makes the query more robust by making certain that the table can contain it.
Abstract: Automatic identification and collection (AIDC) technologies have made the life of a man much easier on numerous platforms. Of the various such technologies, the radio frequency identification devices (RFID) have become pervasive essentially because they can track from a greater physical distance than the rest. The back end that supports these RFID systems has always been working well until it encounters a badly formatted RFID tag. There have hardly been any incidents where such tags, once identified by the back-end systems, can in fact wreak havoc via the interacting databases in the RFID infrastructure. Recently, there has been significant research in this area. In the previous work, the author managed to do an attack using a self-referential query on Linux, Oracle, and PHP. However, they have been unable to test it on SQL Server 2005. This paper differs from the previous work in the way that it extends the attack using a self-referential query to Windows, SQL Server 2005, and ASP with their respective latest updates installed. The query itself is more robust by making certain that the table can contain it.

Journal ArticleDOI
TL;DR: This paper introduces the “normalizer construction problem” (NCP), and formalizes a restricted form of the problem called “NCP=”, which assumes a model of the engine is already known in the form of a term rewriting system, and shows that even this restricted version of the problem is undecidable.
Abstract: A malware mutation engine is able to transform a malicious program to create a different version of the program. Such mutation engines are used at distribution sites or in self-propagating malware in order to create variation in the distributed programs. Program normalization is a way to remove variety introduced by mutation engines, and can thus simplify the problem of detecting variant strains. This paper introduces the “normalizer construction problem” (NCP), and formalizes a restricted form of the problem called “NCP=”, which assumes a model of the engine is already known in the form of a term rewriting system. It is shown that even this restricted version of the problem is undecidable. A procedure is provided that can, in certain cases, automatically solve NCP= from the model of the engine. This procedure is analyzed in conjunction with term rewriting theory to create a list of distinct classes of normalizer construction problems. These classes yield a list of possible attack vectors. Three strategies are defined for approximate solutions of NCP=, and an analysis is provided of the risks they entail. A case study using the W32.Evol virus suggests the approximations may be effective in practice for countering mutated malware.

Journal ArticleDOI
TL;DR: A static analysis by abstract interpretation that is focused on security properties: without executing the program, it ensures the absence of any heap overflows.
Abstract: Several security flaws are the consequence of the presence of programming errors or bugs in software. Heap overflow is the typical example of such errors that allows an attacker to take control of a machine. But considering the growing size and complexity of present software, implementing programs without any error is not an easy task. In this paper, we present a static analysis by abstract interpretation that is focused on security properties: without executing the program, it ensures the absence of any heap overflows.

Journal ArticleDOI
TL;DR: It is shown how these particular malicious codes are innovative compared to usual malware such as viruses, Trojan horses, etc., and a functional architecture for rootkits is introduced.
Abstract: This article deals with rootkit conception. We show how these particular malicious codes are innovative compared to usual malware such as viruses, Trojan horses, etc. From that comparison, we introduce a functional architecture for rootkits. We also propose some criteria to characterize a rootkit and thus, to qualify and assess the different kinds of rootkits. We purposely adopt a global view with respect to this topic, that is, we do not restrict our study to the rootkit software. Namely, we also consider the communication between the attacker and his tool, and the induced interactions with the system. Obviously, we notice that the problems faced during rootkit conception are close to those of steganography, while however showing the limits of such a comparison. Finally, we present a rootkit paradigm that runs in kernel-mode under Linux and also some new techniques in order to improve its stealth features.

Journal ArticleDOI
TL;DR: A new technique that preserves the malicious code on the target system even after the browser window is closed is described, which makes it possible to perform effective CSRF attacks using the XMLHTTPRequest object.
Abstract: This paper presents a state of the art of cross-site request forgery (CSRF) attacks and new techniques which can be used by potential intruders to make them more effective. Several attack scenarios on widely used web applications are discussed, and a vulnerability which affects most recent browsers is explained. This vulnerability makes it possible to perform effective CSRF attacks using the XMLHTTPRequest object. In addition, this paper describes a new technique that preserves the malicious code on the target system even after the browser window is closed. Lastly, best solutions to prevent these attacks are discussed to enable everyone (users, browser or Web applications developers, professionals in charge of IT security in an organization or a company) to prevent or manage this threat.

Journal ArticleDOI
TL;DR: The paper shows that virus replication can be identified and used to detect known and unknown viruses and develops two detection models based on virus replication.
Abstract: New viruses spread faster than ever and current signature based detection does not protect against these unknown viruses. Behavior based detection is the currently preferred defense against unknown viruses. The drawback of behavior based detection is that it detects only specific classes of viruses, or succeeds only under certain conditions, and produces false positives. This paper presents a characterization of virus replication which is the only virus characteristic guaranteed to be consistently present in all viruses. Two detection models based on virus replication are developed, one using operation sequence matching and the other using frequency measures. Regression analysis was generated for both models. A safe list is used to minimize false positives. In our testing using operation sequence matching, over 250 viruses were detected with 43 subsequences. There were minimal false negatives. The replication sequence of just one virus detected 130 viruses, 45% of all tested viruses. Our testing using frequency measures detected all test viruses with no false negatives. The paper shows that virus replication can be identified and used to detect known and unknown viruses.
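
The operation sequence matching model and the safe list from this abstract can be pictured with a small sketch. It assumes that a replication pattern matches when its operations appear in the observed trace in the same order, with unrelated operations allowed in between; the paper's exact matching rule is not given in the abstract. The operation names, process names, and function names are hypothetical.

```python
from typing import Iterable, List, Sequence, Set


def contains_subsequence(trace: Iterable[str], pattern: Sequence[str]) -> bool:
    """True if `pattern` occurs in `trace` in order, gaps allowed."""
    it = iter(trace)
    # `op in it` advances the iterator until it finds `op`, so order is enforced.
    return all(op in it for op in pattern)


def flag_process(name: str, trace: Sequence[str],
                 patterns: List[Sequence[str]], safe_list: Set[str]) -> bool:
    """Flag a process whose trace matches any replication pattern."""
    if name in safe_list:  # known-good programs are never flagged
        return False
    return any(contains_subsequence(trace, p) for p in patterns)


# Hypothetical replication pattern: read own image, then write it into another executable.
patterns = [["open_self", "read", "open_other_exe", "write"]]
safe_list = {"installer.exe"}
trace = ["open_self", "read", "query_time", "open_other_exe", "write", "close"]

flagged = flag_process("unknown.exe", trace, patterns, safe_list)
```

Allowing gaps is what would let one replication sequence match many variants that interleave unrelated operations, at the price of more potential false positives, which is the role the safe list plays here.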

Journal ArticleDOI
TL;DR: Three new classifiers are introduced that characterize a function’s use of global data, which will allow malware analysts to correlate both functions and global data references between two binaries and will lead to a quick identification of any sharing that is occurring.
Abstract: Research and development efforts have recently started to compare malware variants, as it is believed that malware authors are reusing code. A number of these projects have focused on identifying functions through the use of signature-based classifiers. We introduce three new classifiers that characterize a function’s use of global data. Experiments on malware show that we can meaningfully correlate functions on the basis of their global data references even when their functions share little code. We also present an algorithm that combines existing classifiers and our new ones into an ensemble for correlating functions in two binary programs. For testing, we developed a model for comparing our work to previous signature based classifiers. We then used that model to show how our new combined ensemble classifier dominates the previously reported classifiers. The resulting ensemble can be used by malware analysts when they are comparing two binaries. This technique will allow them to correlate both functions and global data references between the two and will lead to a quick identification of any sharing that is occurring.

Journal ArticleDOI
TL;DR: A novel classification of computer viruses using a formalised notion of reproductive models based on Gibson's theory of affordances is presented, and how computer virus reproduction models can be classified according to whether or not any of their reproductive actions are afforded by other entities is shown.
Abstract: We present a novel classification of computer viruses using a formalised notion of reproductive models based on Gibson’s theory of affordances. A computer virus reproduction model consists of: a labelled transition system to represent the states and actions involved in that virus’s reproduction; a notion of entities that are active in the reproductive process, and are present in certain states; a sequence of actions corresponding to the means of reproduction of the virus; and a formalisation of the actions afforded by entities to other entities. Informally, an affordance is an action that one entity allows another to perform. For example, an operating system might afford a computer virus the ability to read data from the disk. We show how computer virus reproduction models can be classified according to whether or not any of their reproductive actions are afforded by other entities. We give examples of reproduction models for three different computer viruses, and show how reproduction model classification can be automated. To demonstrate this we give three examples of how computer viruses can be classified automatically using static and dynamic analysis, and show how classifications can be tailored for different types of anti-virus behaviour monitoring software. Finally, we compare our approach with related work, and give directions for future research.

Journal ArticleDOI
TL;DR: This work proposes the CRYPTOPAGE architecture which implements memory encryption, memory integrity protection checking and information leakage protection together with a low performance penalty (3% slowdown on average) by combining the Counter Mode of operation, local authentication values and MERKLE trees.
Abstract: Malicious software and other attacks are a major concern in the computing ecosystem and there is a need to go beyond the answers based on untrusted software. Trusted and secure computing can add a new hardware dimension to software protection. Several secure computing hardware architectures using memory encryption and memory integrity checkers have been proposed during the past few years to provide applications with a tamper resistant environment. Some solutions, such as HIDE, have also been proposed to solve the problem of information leakage on the address bus. We propose the CRYPTOPAGE architecture which implements memory encryption, memory integrity protection checking and information leakage protection together with a low performance penalty (3% slowdown on average) by combining the Counter Mode of operation, local authentication values and MERKLE trees. It has also several other security features such as attestation, secure storage for applications and program identification. We present some applications of the CRYPTOPAGE architecture in the computer virology field as a proof of concept of improving security in presence of viruses compared to software only solutions.
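
The abstract above lists Merkle trees among the building blocks used for memory integrity checking. The sketch below shows the general idea only: a single root hash commits to every memory block, so any modification changes the root. SHA-256, the duplication of the last node on odd levels, and the function name `merkle_root` are assumptions made for illustration; the hash function and tree layout actually used by CRYPTOPAGE are not stated in the abstract.

```python
import hashlib
from typing import List


def merkle_root(blocks: List[bytes]) -> bytes:
    """Compute a Merkle root over a list of memory blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    # Leaves are the hashes of the individual blocks.
    level = [hashlib.sha256(block).digest() for block in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: pad odd levels by duplication
        # Each parent is the hash of its two concatenated children.
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


memory = [b"page-0", b"page-1", b"page-2", b"page-3"]
trusted_root = merkle_root(memory)

# An attacker modifying one block off-chip changes the recomputed root.
tampered = list(memory)
tampered[2] = b"page-2-modified"
tamper_detected = merkle_root(tampered) != trusted_root
```

In a hardware setting only the root needs to be kept in trusted on-chip storage, and verifying one block requires only the hashes along its path to the root instead of the whole memory.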

Journal ArticleDOI
TL;DR: An overview of major kernel data structures which are used to handle processes under Linux 2.6 on an Intel IA-32 architecture is given and constraints such as execution context problems, module relocation, system calls usage prerequisites and kernel shellcode development have to be dealt with.
Abstract: Exploits are increasingly targeting operating system kernel vulnerabilities. For one, applications in user space are better protected by the developers and the kernel than in the past. Second, the promise of a successful kernel exploit is tantalizing: full control over the targeted environment. Under Linux, kernel space exploits differ noticeably from user space exploits. Constraints such as execution context problems, module relocation, system calls usage prerequisites and kernel shellcode development have to be dealt with. These kernel exploits are the focus of this paper. We first give an overview of major kernel data structures which are used to handle processes under Linux 2.6 on an Intel IA-32 architecture. We then illustrate the aforementioned constraints by means of two practical Wifi Linux Drivers Stack Overflow exploits.