Author

Hee-Sun Kim

Bio: Hee-Sun Kim is an academic researcher at Konkuk University. The author has contributed to research on the following topics: Bag-of-words model and Fingerprint (computing). The author has an h-index of 1 and has co-authored 1 publication receiving 13 citations.

Papers

Proceedings Article • DOI

Application of sim-hash algorithm and big data analysis in spam email detection system

Phuc-Tran Ho¹, Hee-Sun Kim¹, Sung-Ryul Kim¹ • Institutions (1)

Konkuk University¹

05 Oct 2014

TL;DR: The paper proposes a novel similarity-based method that implements the fingerprinting technique on a parallel processing framework and uses a meet-in-the-middle approach to achieve higher accuracy in spam email detection.

Abstract: Currently, many effective techniques are used for filtering spam emails. However, spammers have largely identified the weaknesses of those methods and use them to bypass current detection systems. In this paper, we propose a novel similarity-based method that implements the fingerprinting technique on a parallel processing framework. Furthermore, a meet-in-the-middle approach is used in our method to achieve higher accuracy in the spam email detection system. Our experimental results demonstrate the improved efficiency of the proposed method.
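
The fingerprinting idea named in the title (sim-hash) can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the use of MD5 as the per-token hash, the fixed 64-bit width, and the function names `simhash` and `hamming_distance` are all assumptions made for illustration, and the parallel processing framework and meet-in-the-middle step mentioned in the abstract are not shown.

```python
import hashlib

BITS = 64  # fingerprint width (an assumption for this sketch)


def simhash(text: str) -> int:
    """Return a 64-bit simhash fingerprint of whitespace-separated tokens."""
    weights = [0] * BITS
    for token in text.lower().split():
        # Hash each token to a stable 64-bit integer. MD5 is used here only
        # because it is deterministic across runs, not for security.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        # Each bit of the token hash votes +1 or -1 on that bit position.
        for i in range(BITS):
            weights[i] += 1 if (h >> i) & 1 else -1
    # A fingerprint bit is set when the positive votes outnumber the negative.
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")
```

Under these assumptions, two emails would be treated as near-duplicates when the Hamming distance between their fingerprints falls below a chosen threshold; the threshold itself is a tuning choice that the abstract does not specify.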

16 citations

Cited by

Journal Article • DOI

Malytics: A Malware Detection Scheme

Mahmood Yousefi-Azar¹, Leonard G. C. Hamey¹, Vijay Varadharajan², Shiping Chen³ • Institutions (3)

Macquarie University¹, University of Newcastle², Commonwealth Scientific and Industrial Research Organisation³

24 Sep 2018 - IEEE Access

TL;DR: Malytics is a novel malware detection scheme that does not depend on any particular tool or operating system and that outperforms a wide range of learning-based techniques, as well as individual state-of-the-art models, on both Android and Windows platforms.

Abstract: An important problem in cyber-security is malware analysis. Besides good precision and recognition rate, a malware detection scheme ideally needs to generalize well to novel malware families (a.k.a. zero-day attacks). It is important that the system does not require excessive computation, particularly for deployment on mobile devices. In this paper, we propose a novel scheme to detect malware, which we call Malytics. It is not dependent on any particular tool or operating system. It extracts static features of any given binary file to distinguish malware from benign files. Malytics consists of three stages: feature extraction, similarity measurement, and classification. The three stages are implemented by a neural network with two hidden layers and an output layer. We show that feature extraction, which is performed by tf-simhashing, is equivalent to the first layer of a particular neural network. We evaluate Malytics' performance on both Android and Windows platforms. Malytics outperforms a wide range of learning-based techniques and also individual state-of-the-art models on both platforms. We also show that Malytics is resilient and robust in addressing zero-day malware samples. The F1-score of Malytics is 97.21% and 99.45% on Android dex files and Windows PE files, respectively, in the applied datasets. The speed and efficiency of Malytics are also evaluated.

28 citations

Proceedings Article • DOI

Improve the Prediction Accuracy of Naïve Bayes Classifier with Association Rule Mining

Tianda Yang¹, Kai Qian¹, Dan Chia-Tien Lo¹, Ying Xie¹, Yong Shi¹, Lixin Tao² • Institutions (2)

Kennesaw State University¹, Pace University²

09 Apr 2016

TL;DR: This work proposes using association rule mining to improve the Naïve Bayes classifier, a well-known algorithm in big data classification that is based on an assumption of independence between features.

Abstract: Nowadays, big data contains vast business opportunities. Companies have begun to analyze their data to predict potential customers and inform business decisions using the Naive Bayes classifier, association rule mining, decision trees, and other well-known algorithms. An accurate classification result may help a company lead its industry. Companies seek feasible business intelligence to obtain reliable prediction results. In this paper, we propose using association rule mining to improve the Naive Bayes classifier. The Naive Bayes classifier is a well-known algorithm in big data classification, but it is based on an assumption of independence between features. Association rule mining is popular and useful for discovering relations between inputs in big data analysis. We use a bank marketing data set to illustrate this work. In general, this work can be helpful for business data sets.

19 citations

Proceedings Article • DOI

Detecting spam and phishing mails using SVM and obfuscation URL detection algorithm

Prajakta S. Patil¹, Rashmi A. Rane¹, Madhuri Bhalekar¹ • Institutions (1)

Maharashtra Institute of Technology¹

01 Jan 2017

TL;DR: The paper proposes a system that uses the SVM technique along with the map-reduce paradigm to achieve higher accuracy in spam email detection, and that tries to overcome the two hurdles of the SVM.

Abstract: Phishing is a criminal scheme to steal a user's personal data and other credentials. It is a fraud that acquires a victim's confidential information, such as passwords, bank account details, credit card numbers, and financial usernames and passwords, which can later be misused by an attacker. We aim to use fundamental visual features of a web page's appearance as the basis for detecting page similarities. We propose a novel solution to efficiently detect phishing web pages. Note that page layouts and contents are fundamental features of a web page's appearance. Since the standard way to specify page layouts is through the style sheet (CSS), we develop an algorithm to detect similarities in key elements related to CSS. In this paper, we propose a system that uses the SVM technique along with the map-reduce paradigm to achieve higher accuracy in the detection of spam email. By using the map-reduce technique, we also try to overcome the two hurdles of the SVM.

15 citations

Proceedings Article

Proceedings of the thirty-fourth annual ACM symposium on Theory of computing

John H. Reif¹ • Institutions (1)

Duke University¹

19 May 2002

TL;DR: The topics included algorithms and computational complexity bounds for classical problems in algebra, geometry, topology, graph theory, game theory, logic and machine learning, as well as theoretical aspects of security, databases, information retrieval and networks, the web, computational biology, and alternative models of computation including quantum computation and self-assembly.

Abstract: The papers in this volume were presented at the Thirty-Fourth Annual ACM Symposium on Theory of Computing (STOC 2002), held in Montreal, Quebec, Canada, May 19-21, 2002. The Symposium was sponsored by the ACM Special Interest Group on Algorithms and Computation Theory (SIGACT). In response to a call for papers, 287 paper submissions were received. All were submitted electronically. The program committee conducted its deliberations electronically, via an on-line meeting that ran from January 10 to January 19. The committee selected 91 papers from among the submissions. The submissions were not refereed, and many of these papers represented reports of continuing research. It is expected that most of them will appear in a more polished and complete form in scientific journals. The papers encompassed a wide variety of areas of theoretical computer science. The topics included algorithms and computational complexity bounds for classical problems in algebra, geometry, topology, graph theory, game theory, logic and machine learning, as well as theoretical aspects of security, databases, information retrieval, and networks, the web, computational biology, and alternative models of computation including quantum computation and self-assembly.

14 citations

Proceedings Article • DOI

Web Service-Enabled Spam Filtering with Naïve Bayes Classification

Wanqing You¹, Kai Qian¹, Dan Lo¹, Prabir Bhattacharya², Minzhe Guo², Ying Qian³ • Institutions (3)

Southern Polytechnic State University¹, University of Cincinnati², East China Normal University³

30 Mar 2015

TL;DR: An anti-spam filter is developed that uses the Naïve Bayesian classifier as an effective engine to pick out spam emails; the classifier was trained on the Enron Spam Dataset, a well-known spam/legitimate email dataset.

Abstract: Electronic mail has become a convenient and inexpensive means of communication regardless of distance. However, an increasing volume of unsolicited emails is dramatically reducing productivity. There is a need for reliable anti-spam filters to separate such messages from legitimate ones. The Naive Bayesian classifier is suggested as an effective engine to pick out spam emails. We have developed an anti-spam filter that employs this content-based classifier. This statistics-based classifier was trained on the Enron Spam Dataset, a well-known spam/legitimate email dataset. We developed this filter as a Web Service, which consumes the emails a user uploads and returns the predicted probability that the given email is spam. This engine was implemented with RESTEasy technology and consists of three phases, which train on pre-labeled emails and then apply the Naive Bayes theorem to calculate an email's spamicity.
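
The train-then-score flow described in the abstract can be sketched as follows. This is an illustrative sketch only, not the paper's web service: the whitespace tokenizer, add-one (Laplace) smoothing, and the function names `train` and `spamicity` are assumptions, and the sketch expects at least one spam and one legitimate email in the training data.

```python
import math
from collections import Counter


def train(emails):
    """Count tokens and emails per class from (text, is_spam) pairs."""
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in emails:
        counts[is_spam].update(text.lower().split())
        totals[is_spam] += 1
    return counts, totals


def spamicity(text, counts, totals):
    """Return P(spam | text) under the naive independence assumption."""
    vocab_size = len(set(counts[True]) | set(counts[False]))
    n_emails = totals[True] + totals[False]
    log_score = {}
    for label in (True, False):
        n_tokens = sum(counts[label].values())
        # Start from the class prior, then add one log-likelihood per token.
        score = math.log(totals[label] / n_emails)
        for token in text.lower().split():
            # Add-one smoothing keeps unseen tokens from zeroing the product.
            score += math.log((counts[label][token] + 1) / (n_tokens + vocab_size))
        log_score[label] = score
    # Normalize in log space to avoid underflow on long emails.
    top = max(log_score.values())
    spam = math.exp(log_score[True] - top)
    ham = math.exp(log_score[False] - top)
    return spam / (spam + ham)
```

A caller would build the counts once with `train(labeled_emails)` and then call `spamicity(new_email_text, counts, totals)` for each incoming message, flagging it when the returned probability exceeds a chosen cutoff.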

10 citations