Author

Mauno Pihelgas

Bio: Mauno Pihelgas is an academic researcher from the NATO Cooperative Cyber Defence Centre of Excellence. The author has contributed to research on the topics Reference architecture and Autonomous agent. The author has an h-index of 7 and has co-authored 13 publications receiving 191 citations. Previous affiliations of Mauno Pihelgas include NATO and Tallinn University of Technology.

Papers
Proceedings ArticleDOI
09 Nov 2015
TL;DR: The LogCluster algorithm is presented, which implements data clustering and line pattern mining for textual event logs, and an open source implementation of LogCluster is described.
Abstract: Modern IT systems often produce large volumes of event logs, and event pattern discovery is an important log management task. For this purpose, data mining methods have been suggested in many previous works. In this paper, we present the LogCluster algorithm which implements data clustering and line pattern mining for textual event logs. The paper also describes an open source implementation of LogCluster.

143 citations

Proceedings ArticleDOI
06 Oct 2014
TL;DR: This paper first focuses on using log analysis techniques for collecting technical security metrics from security logs of common types (e.g., network IDS alarm logs, workstation logs, and NetFlow data sets), and then describes a production framework for collecting and reporting technical security metrics which is based on novel open-source technologies for big data.
Abstract: During recent years, establishing proper metrics for measuring system security has received increasing attention. Security logs contain vast amounts of information which are essential for creating many security metrics. Unfortunately, security logs are known to be very large, making their analysis a difficult task. Furthermore, recent security metrics research has focused on generic concepts, and the issue of collecting security metrics with log analysis methods has not been well studied. In this paper, we will first focus on using log analysis techniques for collecting technical security metrics from security logs of common types (e.g., network IDS alarm logs, workstation logs, and NetFlow data sets). We will also describe a production framework for collecting and reporting technical security metrics which is based on novel open-source technologies for big data.

26 citations

Posted Content
TL;DR: This report describes a reference architecture for intelligent software agents performing active, largely autonomous cyber-defense actions on military networks of computing and communicating devices, and describes the rationale and purpose that drive the definition of the AICA Reference Architecture.
Abstract: This report - a major revision of its previous release - describes a reference architecture for intelligent software agents performing active, largely autonomous cyber-defense actions on military networks of computing and communicating devices. The report is produced by the North Atlantic Treaty Organization (NATO) Research Task Group (RTG) IST-152 "Intelligent Autonomous Agents for Cyber Defense and Resilience". In a conflict with a technically sophisticated adversary, NATO military tactical networks will operate in a heavily contested battlefield. Enemy software cyber agents - malware - will infiltrate friendly networks and attack friendly command, control, communications, computers, intelligence, surveillance, and reconnaissance and computerized weapon systems. To fight them, NATO needs artificial cyber hunters - intelligent, autonomous, mobile agents specialized in active cyber defense. With this in mind, in 2016, NATO initiated RTG IST-152. Its objective has been to help accelerate the development and transition to practice of such software agents by producing a reference architecture and technical roadmap. This report presents the concept and architecture of an Autonomous Intelligent Cyber-defense Agent (AICA). We describe the rationale of the AICA concept, explain the methodology and purpose that drive the definition of the AICA Reference Architecture, and review some of the main features and challenges of AICAs.

24 citations

Book ChapterDOI
02 Nov 2016
TL;DR: The Internet Protocol Version 6 (IPv6) transition opens a wide scope for potential attack vectors, and effective tools are required for the execution of security operations for assessment of possible attack vectors related to IPv6 security.
Abstract: The Internet Protocol Version 6 (IPv6) transition opens a wide scope for potential attack vectors. IPv6 transition mechanisms could allow the set-up of covert egress communication channels over an IPv4-only or dual-stack network, resulting in full compromise of a target network. Therefore effective tools are required for the execution of security operations for assessment of possible attack vectors related to IPv6 security.

14 citations

Proceedings ArticleDOI
22 May 2018
TL;DR: In this paper, the authors present the concept and architecture of an Autonomous Intelligent Cyber defense Agent (AICA) and present the future lines of research that will help develop and test the AICA / MAICA concept.
Abstract: Within the future Global Information Grid, complex massively interconnected systems, isolated defense vehicles, sensors and effectors, and infrastructures and systems demanding extremely low failure rates, to which human security operators cannot have an easy access and cannot deliver fast enough reactions to cyber-attacks, need an active, autonomous and intelligent cyber defense. Multi Agent Systems for Cyber Defense may provide an answer to this requirement. This paper presents the concept and architecture of an Autonomous Intelligent Cyber defense Agent (AICA). First, we describe the rationale of the AICA concept. Secondly, we explain the methodology and purpose that drive the definition of the AICA Reference Architecture (AICARA) by NATO's IST-152 Research and Technology Group. Thirdly, we review some of the main features and challenges of Multi Autonomous Intelligent Cyber defense Agent (MAICA). Fourthly, we depict the initially assumed AICA Reference Architecture. Then we present one of our preliminary research issues, assumptions and ideas. Finally, we present the future lines of research that will help develop and test the AICA / MAICA concept.

14 citations


Cited by
Book ChapterDOI
01 Jan 2014
TL;DR: This chapter is devoted to a more detailed examination of game theory and examines two game theoretic scenarios: simultaneous-move and multi-stage games.
Abstract: This chapter is devoted to a more detailed examination of game theory. Game theory, an important tool for analyzing strategic behavior, is concerned with how individuals make decisions when they recognize that their actions affect, and are affected by, the actions of other individuals or groups. Strategic behavior recognizes that the decision-making process is frequently mutually interdependent. Game theory is the study of strategic behavior involving the interaction of two or more individuals, teams, or firms, usually referred to as players. Two game theoretic scenarios are examined in this chapter: simultaneous-move and multi-stage games. In simultaneous-move games the players effectively move at the same time. A normal-form game summarizes the players, possible strategies and payoffs from alternative strategies in a simultaneous-move game. Simultaneous-move games may be either noncooperative or cooperative. In contrast to noncooperative games, players of cooperative games engage in collusive behavior. A Nash equilibrium, which is a solution to a problem in game theory, occurs when the players’ payoffs cannot be improved by changing strategies. Simultaneous-move games may be either one-shot or repeated games. One-shot games are played only once. Repeated games are games that are played more than once. Infinitely-repeated games are played over and over again without end. Finitely-repeated games are played a limited number of times. Finitely-repeated games have certain or uncertain ends.

814 citations
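
The Nash equilibrium condition described in the abstract above can be illustrated with a minimal sketch. It assumes a two-player normal-form game given as two payoff matrices and only searches for pure-strategy equilibria, using the standard reading that no player can gain by changing strategy alone. The function name and the prisoner's-dilemma payoffs are hypothetical and are not taken from the chapter.

```python
from itertools import product


def pure_nash_equilibria(payoffs_row, payoffs_col):
    """Return every (row, col) strategy pair from which neither player
    can improve their own payoff by deviating unilaterally."""
    n_rows = len(payoffs_row)
    n_cols = len(payoffs_row[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        # Row player: compare against every alternative row, column fixed.
        row_best = all(payoffs_row[r][c] >= payoffs_row[alt][c]
                       for alt in range(n_rows))
        # Column player: compare against every alternative column, row fixed.
        col_best = all(payoffs_col[r][c] >= payoffs_col[r][alt]
                       for alt in range(n_cols))
        if row_best and col_best:
            equilibria.append((r, c))
    return equilibria


# Hypothetical prisoner's-dilemma payoffs: strategy 0 = cooperate, 1 = defect.
ROW = [[-1, -3], [0, -2]]   # payoffs to the row player
COL = [[-1, 0], [-3, -2]]   # payoffs to the column player

print(pure_nash_equilibria(ROW, COL))
```

With these illustrative payoffs, mutual defection is the only pair from which neither player gains by switching alone. Games with no pure-strategy equilibrium return an empty list, since mixed strategies are outside the scope of this sketch.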

Proceedings ArticleDOI
27 May 2019
TL;DR: This paper presents a comprehensive evaluation study on automated log parsing, evaluating 13 log parsers on a total of 16 log datasets spanning distributed systems, supercomputers, operating systems, mobile systems, server applications, and standalone software and reports the results in terms of accuracy, robustness, and efficiency.
Abstract: Logs are imperative in the development and maintenance process of many software systems. They record detailed runtime information that allows developers and support engineers to monitor their systems and dissect anomalous behaviors and errors. The increasing scale and complexity of modern software systems, however, make the volume of logs explode. In many cases, the traditional way of manual log inspection becomes impractical. Many recent studies, as well as industrial tools, resort to powerful text search and machine learning-based analytics solutions. Due to the unstructured nature of logs, a first crucial step is to parse log messages into structured data for subsequent analysis. In recent years, automated log parsing has been widely studied in both academia and industry, producing a series of log parsers by different techniques. To better understand the characteristics of these log parsers, in this paper, we present a comprehensive evaluation study on automated log parsing and further release the tools and benchmarks for easy reuse. More specifically, we evaluate 13 log parsers on a total of 16 log datasets spanning distributed systems, supercomputers, operating systems, mobile systems, server applications, and standalone software. We report the benchmarking results in terms of accuracy, robustness, and efficiency, which are of practical importance when deploying automated log parsing in production. We also share the success stories and lessons learned in an industrial application at Huawei. We believe that our work could serve as the basis and provide valuable guidance to future research and deployment of automated log parsing.

254 citations
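
The step the abstract above calls parsing log messages into structured data can be sketched as follows. This is a deliberately simple illustration, not one of the 13 evaluated parsers: it assumes that variable parameters can be masked with two hand-written regular expressions, and the mask tokens, function names, and sample lines are all hypothetical.

```python
import re

# Assumed masking rules (illustrative only): IPv4-like addresses first,
# then any remaining standalone integers.
MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def to_template(message):
    """Replace variable parts of a log message with placeholder tokens."""
    for pattern, token in MASKS:
        message = pattern.sub(token, message)
    return message


def group_by_template(lines):
    """Map each derived template to the raw log lines that produced it."""
    groups = {}
    for line in lines:
        groups.setdefault(to_template(line), []).append(line)
    return groups


# Hypothetical sample log lines.
SAMPLE = [
    "Connection from 10.0.0.1 port 22",
    "Connection from 10.0.0.2 port 2222",
    "Disk usage at 91 percent",
]

for template, members in group_by_template(SAMPLE).items():
    print(len(members), template)
```

Hand-written masks like these break down on unfamiliar log formats, which is the kind of limitation that motivates the automated parsers the study benchmarks.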

Proceedings ArticleDOI
09 Nov 2015
TL;DR: The LogCluster algorithm is presented, which implements data clustering and line pattern mining for textual event logs, and an open source implementation of LogCluster is described.
Abstract: Modern IT systems often produce large volumes of event logs, and event pattern discovery is an important log management task. For this purpose, data mining methods have been suggested in many previous works. In this paper, we present the LogCluster algorithm which implements data clustering and line pattern mining for textual event logs. The paper also describes an open source implementation of LogCluster.

143 citations

Proceedings ArticleDOI
13 Aug 2016
TL;DR: The novelty in this work stems from the new techniques employed to overcome the instrumentation requirements or application specific assumptions made in prior log mining approaches, and improve the accuracy of mined templates and the cfg in the presence of long parameters and high amount of interleaving respectively.
Abstract: We focus on the problem of detecting anomalous run-time behavior of distributed applications from their execution logs. Specifically we mine templates and template sequences from logs to form a control flow graph (cfg) spanning distributed components. This cfg represents the baseline healthy system state and is used to flag deviations from the expected behavior of runtime logs. The novelty in our work stems from the new techniques employed to: (1) overcome the instrumentation requirements or application specific assumptions made in prior log mining approaches, (2) improve the accuracy of mined templates and the cfg in the presence of long parameters and high amount of interleaving respectively, and (3) improve by orders of magnitude the scalability of the cfg mining process in terms of volume of log data that can be processed per day. We evaluate our approach using (a) synthetic log traces and (b) multiple real-world log datasets collected at different layers of application stack. Results demonstrate that our template mining, cfg mining, and anomaly detection algorithms have high accuracy. The distributed implementation of our pipeline is highly scalable and has more than 500 GB/day of log data processing capability even on a 10 low-end VM based (Spark + Hadoop) cluster. We also demonstrate the efficacy of our end-to-end system using a case study with the Openstack VM provisioning system.

101 citations

Posted Content
TL;DR: The authors present a comprehensive evaluation study on automated log parsing, release the tools and benchmarks for easy reuse, and evaluate 13 log parsers on a total of 16 log datasets.
Abstract: Logs are imperative in the development and maintenance process of many software systems. They record detailed runtime information that allows developers and support engineers to monitor their systems and dissect anomalous behaviors and errors. The increasing scale and complexity of modern software systems, however, make the volume of logs explode. In many cases, the traditional way of manual log inspection becomes impractical. Many recent studies, as well as industrial tools, resort to powerful text search and machine learning-based analytics solutions. Due to the unstructured nature of logs, a first crucial step is to parse log messages into structured data for subsequent analysis. In recent years, automated log parsing has been widely studied in both academia and industry, producing a series of log parsers by different techniques. To better understand the characteristics of these log parsers, in this paper, we present a comprehensive evaluation study on automated log parsing and further release the tools and benchmarks for easy reuse. More specifically, we evaluate 13 log parsers on a total of 16 log datasets spanning distributed systems, supercomputers, operating systems, mobile systems, server applications, and standalone software. We report the benchmarking results in terms of accuracy, robustness, and efficiency, which are of practical importance when deploying automated log parsing in production. We also share the success stories and lessons learned in an industrial application at Huawei. We believe that our work could serve as the basis and provide valuable guidance to future research and deployment of automated log parsing.

92 citations