scispace - formally typeset
Search or ask a question
Author

Hiroki Takakura

Other affiliations: Nagoya University, Tottori University, Kyoto University  ...read more
Bio: Hiroki Takakura is an academic researcher from National Institute of Informatics. The author has contributed to research in topics: Intrusion detection system & Anomaly-based intrusion detection system. The author has an hindex of 19, co-authored 84 publications receiving 1156 citations. Previous affiliations of Hiroki Takakura include Nagoya University & Tottori University.


Papers
More filters
Proceedings ArticleDOI
10 Apr 2011
TL;DR: A new evaluation dataset, called Kyoto 2006+, built on the 3 years of real traffic data which are obtained from diverse types of honeypots which will greatly contribute to IDS researchers in obtaining more practical, useful and accurate evaluation results.
Abstract: With the rapid evolution and proliferation of botnets, large-scale cyber attacks such as DDoS, spam emails are also becoming more and more dangerous and serious cyber threats. Because of this, network based security technologies such as Network based Intrusion Detection Systems (NIDSs), Intrusion Prevention Systems (IPSs), firewalls have received remarkable attention to defend our crucial computer systems, networks and sensitive information from attackers on the Internet. In particular, there has been much effort towards high-performance NIDSs based on data mining and machine learning techniques. However, there is a fatal problem in that the existing evaluation dataset, called KDD Cup 99' dataset, cannot reflect current network situations and the latest attack trends. This is because it was generated by simulation over a virtual network more than 10 years ago. To the best of our knowledge, there is no alternative evaluation dataset. In this paper, we present a new evaluation dataset, called Kyoto 2006+, built on the 3 years of real traffic data (Nov. 2006 ~ Aug. 2009) which are obtained from diverse types of honeypots. Kyoto 2006+ dataset will greatly contribute to IDS researchers in obtaining more practical, useful and accurate evaluation results. Furthermore, we provide detailed analysis results of honeypot data and share our experiences so that security researchers are able to get insights into the trends of latest cyber attacks and the Internet situations.

259 citations

Journal ArticleDOI
TL;DR: A new anomaly detection method by which it can automatically tune and optimize the values of parameters without predefining them is proposed and evaluated over real traffic data obtained from Kyoto University honeypots.

95 citations

Journal ArticleDOI
TL;DR: This work applies a hierarchical data visualization technique that represents leaf nodes as black square icons and branch nodes as rectangular borders enclosing the icons to bioactive chemical visualization and job distribution in parallel-computing environments.
Abstract: A technique for visualizing intrusion-detection system log files using hierarchical data based on IP addresses represents the number of incidents for thousands of computers in one display space. Our technique applies a hierarchical data visualization technique that represents leaf nodes as black square icons and branch nodes as rectangular borders enclosing the icons. This representation style visualizes thousands of hierarchical data leaf nodes equally in one display space. We applied the technique to bioactive chemical visualization and job distribution in parallel-computing environments.

72 citations

Patent
01 Apr 2008
TL;DR: In this paper, a plurality of frames of representative images are extracted from each motion image in accordance with a specified frame extraction method, and thumbnail images of the respective extracted representative images were produced, and each thumbnail image is related to a playback position in the motion image.
Abstract: In an information presenting apparatus, information associated with each of one or more motion images stored in a storage apparatus is displayed at a location corresponding to a file production time in the form of a calendar view. A plurality of frames of representative images are extracted from each motion image in accordance with a specified frame extraction method. Thumbnail images of the respective extracted representative images are produced, and each thumbnail image is related to a playback position in the motion image. The resultant thumbnail images are managed as a set of motion image thumbnail images in accordance with a production time of the motion image. Information associated with each motion image is displayed at a location corresponding to the production time of the motion image in the form of the calendar view. Thumbnail images associated with each motion image are displayed in an expanded form in the order of time.

43 citations

Proceedings ArticleDOI
21 Apr 2008
TL;DR: Two types of honeypot are proposed to collect unforeseen exploit codes automatically while maintaining their concealment against malicious attackers; cooperation based active honeypot and self-protection type honeypot.
Abstract: Honeypot is one of the most popular tools to decoy attackers into our network, and to capture lots of information about the activity of malicious attackers. By tracing and analyzing collected traffic data, we can find out unknown malicious codes under an experimental stage before some codes become hazardous to an application. Although many honeypots have been proposed, there is a common problem that they can be detected easily by malicious attackers. This is very important in success or failure of honeypots because if once an attacker notices that he/she is working on a honeypot, we can no longer observe his/her malicious activities. In this paper, we propose two types of honeypot to collect unforeseen exploit codes automatically while maintaining their concealment against malicious attackers; cooperation based active honeypot and self-protection type honeypot. We have evaluated the proposed honeypots which are deployed in Kyoto University, and showed that they have capability to collect some unknown malicious codes.

39 citations


Cited by
More filters
Journal ArticleDOI

[...]

08 Dec 2001-BMJ
TL;DR: There is, I think, something ethereal about i —the square root of minus one, which seems an odd beast at that time—an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

Proceedings ArticleDOI
01 Jan 2018
TL;DR: A reliable dataset is produced that contains benign and seven common attack network flows, which meets real world criteria and is publicly avaliable and evaluates the performance of a comprehensive set of network traffic features and machine learning algorithms to indicate the best set of features for detecting the certain attack categories.
Abstract: With exponential growth in the size of computer networks and developed applications, the significant increasing of the potential damage that can be caused by launching attacks is becoming obvious. Meanwhile, Intrusion Detection Systems (IDSs) and Intrusion Prevention Systems (IPSs) are one of the most important defense tools against the sophisticated and ever-growing network attacks. Due to the lack of adequate dataset, anomaly-based approaches in intrusion detection systems are suffering from accurate deployment, analysis and evaluation. There exist a number of such datasets such as DARPA98, KDD99, ISC2012, and ADFA13 that have been used by the researchers to evaluate the performance of their proposed intrusion detection and intrusion prevention approaches. Based on our study over eleven available datasets since 1998, many such datasets are out of date and unreliable to use. Some of these datasets suffer from lack of traffic diversity and volumes, some of them do not cover the variety of attacks, while others anonymized packet information and payload which cannot reflect the current trends, or they lack feature set and metadata. This paper produces a reliable dataset that contains benign and seven common attack network flows, which meets real world criteria and is publicly avaliable. Consequently, the paper evaluates the performance of a comprehensive set of network traffic features and machine learning algorithms to indicate the best set of features for detecting the certain attack categories.

1,931 citations

Journal ArticleDOI
TL;DR: A highly scalable and hybrid DNNs framework called scale-hybrid-IDS-AlertNet is proposed which can be used in real-time to effectively monitor the network traffic and host-level events to proactively alert possible cyberattacks.
Abstract: Machine learning techniques are being widely used to develop an intrusion detection system (IDS) for detecting and classifying cyberattacks at the network-level and the host-level in a timely and automatic manner. However, many challenges arise since malicious attacks are continually changing and are occurring in very large volumes requiring a scalable solution. There are different malware datasets available publicly for further research by cyber security community. However, no existing study has shown the detailed analysis of the performance of various machine learning algorithms on various publicly available datasets. Due to the dynamic nature of malware with continuously changing attacking methods, the malware datasets available publicly are to be updated systematically and benchmarked. In this paper, a deep neural network (DNN), a type of deep learning model, is explored to develop a flexible and effective IDS to detect and classify unforeseen and unpredictable cyberattacks. The continuous change in network behavior and rapid evolution of attacks makes it necessary to evaluate various datasets which are generated over the years through static and dynamic approaches. This type of study facilitates to identify the best algorithm which can effectively work in detecting future cyberattacks. A comprehensive evaluation of experiments of DNNs and other classical machine learning classifiers are shown on various publicly available benchmark malware datasets. The optimal network parameters and network topologies for DNNs are chosen through the following hyperparameter selection methods with KDDCup 99 dataset. All the experiments of DNNs are run till 1,000 epochs with the learning rate varying in the range [0.01-0.5]. The DNN model which performed well on KDDCup 99 is applied on other datasets, such as NSL-KDD, UNSW-NB15, Kyoto, WSN-DS, and CICIDS 2017, to conduct the benchmark. Our DNN model learns the abstract and high-dimensional feature representation of the IDS data by passing them into many hidden layers. Through a rigorous experimental testing, it is confirmed that DNNs perform well in comparison with the classical machine learning classifiers. Finally, we propose a highly scalable and hybrid DNNs framework called scale-hybrid-IDS-AlertNet which can be used in real-time to effectively monitor the network traffic and host-level events to proactively alert possible cyberattacks.

847 citations

Journal ArticleDOI
TL;DR: In this article, the authors provide a focused literature survey of data sets for network-based intrusion detection and describes the underlying packet-and flow-based network data in detail, identifying 15 different properties to assess the suitability of individual data sets.

422 citations

Journal ArticleDOI
TL;DR: The experimental results demonstrate that the proposed hybrid intrusion detection method is better than the conventional methods in terms of the detection rate for both unknown and known attacks while it maintains a low false positive rate.
Abstract: In this paper, a new hybrid intrusion detection method that hierarchically integrates a misuse detection model and an anomaly detection model in a decomposition structure is proposed. First, a misuse detection model is built based on the C4.5 decision tree algorithm and then the normal training data is decomposed into smaller subsets using the model. Next, multiple one-class SVM models are created for the decomposed subsets. As a result, each anomaly detection model does not only use the known attack information indirectly, but also builds the profiles of normal behavior very precisely. The proposed hybrid intrusion detection method was evaluated by conducting experiments with the NSL-KDD data set, which is a modified version of well-known KDD Cup 99 data set. The experimental results demonstrate that the proposed method is better than the conventional methods in terms of the detection rate for both unknown and known attacks while it maintains a low false positive rate. In addition, the proposed method significantly reduces the high time complexity of the training and testing processes. Experimentally, the training and testing time of the anomaly detection model is shown to be only 50% and 60%, respectively, of the time required for the conventional models.

414 citations