scispace - formally typeset
Search or ask a question
Author

Andrew H. Sung

Bio: Andrew H. Sung is an academic researcher from University of Southern Mississippi. The author has contributed to research in topics: Support vector machine & Steganalysis. The author has an hindex of 38, co-authored 163 publications receiving 5692 citations. Previous affiliations of Andrew H. Sung include New Mexico Institute of Mining and Technology & University of Coimbra.


Papers
More filters
Proceedings ArticleDOI
07 Aug 2002
TL;DR: Using a set of benchmark data from a KDD (knowledge discovery and data mining) competition designed by DARPA, it is demonstrated that efficient and accurate classifiers can be built to detect intrusions.
Abstract: Information security is an issue of serious global concern. The complexity, accessibility, and openness of the Internet have served to increase the security risk of information systems tremendously. This paper concerns intrusion detection. We describe approaches to intrusion detection using neural networks and support vector machines. The key ideas are to discover useful patterns or features that describe user behavior on a system, and use the set of relevant features to build classifiers that can recognize anomalies and known intrusions, hopefully in real time. Using a set of benchmark data from a KDD (knowledge discovery and data mining) competition designed by DARPA, we demonstrate that efficient and accurate classifiers can be built to detect intrusions. We compare the performance of neural networks based, and support vector machine based, systems for intrusion detection.

779 citations

Proceedings ArticleDOI
27 Jan 2003
TL;DR: This paper applies the technique of deleting one feature at a time to perform experiments on SVMs and neural networks to rank the importance of input features for the DARPA collected intrusion data and shows that SVM-based and neural network based IDSs using a reduced number of features can deliver enhanced or comparable performance.
Abstract: Intrusion detection is a critical component of secure information systems. This paper addresses the issue of identifying important input features in building an intrusion detection system (IDS). Since elimination of the insignificant and/or useless inputs leads to a simplification of the problem, faster and more accurate detection may result. Feature ranking and selection, therefore, is an important issue in intrusion detection. We apply the technique of deleting one feature at a time to perform experiments on SVMs and neural networks to rank the importance of input features for the DARPA collected intrusion data. Important features for each of the 5 classes of intrusion patterns in the DARPA data are identified. It is shown that SVM-based and neural network based IDSs using a reduced number of features can deliver enhanced or comparable performance. An IDS for class-specific detection based on five SVMs is proposed.

419 citations

Journal ArticleDOI
TL;DR: It is shown that an ensemble of ANNs, SVMs and MARS is superior to individual approaches for intrusion detection in terms of classification accuracy.

369 citations

Proceedings ArticleDOI
06 Dec 2004
TL;DR: This paper presents a robust signature-based malware detection technique, with emphasis on detecting obfuscated malware and mutated (or metamorphic) malware.
Abstract: Software security assurance and malware (Trojans, worms, and viruses, etc.) detection are important topics of information security. Software obfuscation, a general technique that is useful for protecting software from reverse engineering, can also be used by hackers to circumvent the malware detection tools. Current static malware detection techniques have serious limitations, and sandbox testing also fails to provide a complete solution due to time constraints. In this paper, we present a robust signature-based malware detection technique, with emphasis on detecting obfuscated (or polymorphic) malware and mutated (or metamorphic) malware. The hypothesis is that all versions of the same malware share a common core signature that is a combination of several features of the code. After a particular malware has been first identified, it can be analyzed to extract the signature, which provides a basis for detecting variants and mutants of the same malware in the future. Encouraging experimental results on a large set of recent malware are presented.

234 citations

Book ChapterDOI
01 Jan 2008
TL;DR: Though there are several anti-phishing software and techniques for detecting potential phishing attempts in emails and detecting phishing contents on websites, phishers come up with new and hybrid techniques to circumvent the availableSoftware and techniques.
Abstract: Phishing is a form of identity theft that occurs when a malicious Web site impersonates a legitimate one in order to acquire sensitive information such as passwords, account details, or credit card numbers.Though there are several anti-phishing software and techniques for detecting potential phishing attempts in emails and detecting phishing contents on websites, phishers come up with new and hybrid techniques to circumvent the available software and techniques.

210 citations


Cited by
More filters
Book
12 Aug 2008
TL;DR: This book explains the principles that make support vector machines (SVMs) a successful modelling and prediction tool for a variety of applications and provides a unique in-depth treatment of both fundamental and recent material on SVMs that so far has been scattered in the literature.
Abstract: This book explains the principles that make support vector machines (SVMs) a successful modelling and prediction tool for a variety of applications. The authors present the basic ideas of SVMs together with the latest developments and current research questions in a unified style. They identify three reasons for the success of SVMs: their ability to learn well with only a very small number of free parameters, their robustness against several types of model violations and outliers, and their computational efficiency compared to several other methods. Since their appearance in the early nineties, support vector machines and related kernel-based methods have been successfully applied in diverse fields of application such as bioinformatics, fraud detection, construction of insurance tariffs, direct marketing, and data and text mining. As a consequence, SVMs now play an important role in statistical machine learning and are used not only by statisticians, mathematicians, and computer scientists, but also by engineers and data analysts. The book provides a unique in-depth treatment of both fundamental and recent material on SVMs that so far has been scattered in the literature. The book can thus serve as both a basis for graduate courses and an introduction for statisticians, mathematicians, and computer scientists. It further provides a valuable reference for researchers working in the field. The book covers all important topics concerning support vector machines such as: loss functions and their role in the learning process; reproducing kernel Hilbert spaces and their properties; a thorough statistical analysis that uses both traditional uniform bounds and more advanced localized techniques based on Rademacher averages and Talagrand's inequality; a detailed treatment of classification and regression; a detailed robustness analysis; and a description of some of the most recent implementation techniques. To make the book self-contained, an extensive appendix is added which provides the reader with the necessary background from statistics, probability theory, functional analysis, convex analysis, and topology.

4,664 citations

Journal ArticleDOI
TL;DR: The second part of the tutorial focuses on the recently proposed layer-wise relevance propagation (LRP) technique, for which the author provides theory, recommendations, and tricks, to make most efficient use of it on real data.

1,939 citations

Journal ArticleDOI
01 Nov 2000
TL;DR: The issues of posterior probability estimation, the link between neural and conventional classifiers, learning and generalization tradeoff in classification, the feature variable selection, as well as the effect of misclassification costs are examined.
Abstract: Classification is one of the most active research and application areas of neural networks. The literature is vast and growing. This paper summarizes some of the most important developments in neural network classification research. Specifically, the issues of posterior probability estimation, the link between neural and conventional classifiers, learning and generalization tradeoff in classification, the feature variable selection, as well as the effect of misclassification costs are examined. Our purpose is to provide a synthesis of the published research in this area and stimulate further research interests and efforts in the identified topics.

1,737 citations

Journal ArticleDOI
TL;DR: The complexity of ML/DM algorithms is addressed, discussion of challenges for using ML/ DM for cyber security is presented, and some recommendations on when to use a given method are provided.
Abstract: This survey paper describes a focused literature survey of machine learning (ML) and data mining (DM) methods for cyber analytics in support of intrusion detection. Short tutorial descriptions of each ML/DM method are provided. Based on the number of citations or the relevance of an emerging method, papers representing each method were identified, read, and summarized. Because data are so important in ML/DM approaches, some well-known cyber data sets used in ML/DM are described. The complexity of ML/DM algorithms is addressed, discussion of challenges for using ML/DM for cyber security is presented, and some recommendations on when to use a given method are provided.

1,704 citations