What are the future works mentioned in the paper "Hierarchical visualization of network intrusion detection data in the ip address space" ?

The authors plan to prove the effectiveness of the technique by observing its use in real network management with real users. Also, the following issues, as well as issues discussed in Section 6, will be the focus of future work based on this technique:

• Combination with intelligent techniques, such as data mining and knowledge management, to effectively discover and alert on high-security incidents. Some kinds of trends or attack patterns can also be discovered by developing visualizations of the time sequence of intrusions.

What is the way to minimize occlusions?

Another idea for minimizing the occlusions is to solve a viewing optimization problem so that the entropy of the visualization result is maximized.

How long did it take to create the hierarchy of computers?

In their measurement, the implementation took 120 seconds to read the log file, 0.6 seconds to form and visualize the hierarchy of computers, and 7.1 seconds to recount incidents during GUI operations.

What is the function for displaying the list of attacks?

If the display space allows displaying a larger dialog window, additional information, such as IP addresses of receivers, and signature IDs, is presented so that users can easily specify past attacks.

How many times did administrators disconnect the senders or receivers of high-security incidents?

Administrators of the computer network used for these figures disconnected the senders or receivers of incidents 16 times in two months because of pernicious attacks.

(Open Access) Hierarchical visualization of network intrusion detection data (2006) | Takayuki Itoh

Q: What are the contributions in "Hierarchical visualization of network intrusion detection data in the ip address space" ?

In this paper, the authors present a technique for representing the statistics and trends of incidents in large-scale computer networks.

Hierarchical Visualization of Network Intrusion Detection Data in the IP Address Space

Takayuki ITOH 1,2

Hiroki TAKAKURA

Atsushi SAWADA

Koji KOYAMADA

1) Department of Information Sciences, Faculty of Science, Ochanomizu University

2) Academic Center for Computing and Media Studies, Kyoto University

3) Center for the Promotion of Excellence in Higher Education, Kyoto University

itot@computer.org, {takakura, sawada, koyamada}@media.kyoto-u.ac.jp

1. Introduction

Intrusion detection is an active area of research. Many Intrusion Detection System (IDS) products are available, and these systems generally detect network intrusions and record the intrusions into log files. To understand the performance and limitations of these systems, we have conducted a study on several of the IDS products that are deployed on open and large-scale computer networks. We have identified the following issues:

• Several IDS systems send e-mails to network administrators for each incident (i.e., an intrusion record). They often send enormous numbers of e-mails if the network is large-scale. Moreover, it is very difficult to understand the relevancy and statistics of aggregate incidents by only receiving the alerts for individual incidents.

• Recent attacks often consist of complicated combinations of various incidents. Intelligent, intuitive, and real-time solutions are required for an overall understanding of the complicated behavior.

• Databases for storing IDS logs often grow huge, and therefore usability of the databases is often problematic. Solutions that assist the user in querying for data are desirable to reduce query operations.

• GUIs of current IDS products visualize information very superficially. For example, the time sequence of numbers of incidents of the whole domain may be visualized as simple bar charts or polygonal charts. The user may need to perform many operations to explore detailed information, but in many cases administrators are often too busy to take the time to operate the GUI.

Recently various studies on intrusion analysis and secure network management have been reported [1,2]. In addition to those, visualization of incidents is very effective for intuitively and quickly understanding their distribution.

This paper presents a new technique to visualize the contents of huge IDS log files. The goals of the visualization technique are to make the available statistics from IDS systems understandable and to offer an interactive way of exploring detailed information. Another feature of the visualization technique is representing the distribution of incidents in IP-address spaces, revealing the relevancy of the distribution to the organizational structure of real society.

The technique first forms a four-level hierarchy of computers, by grouping the computers according to their IP addresses byte-by-byte. It then visualizes the hierarchical data as bars and nested rectangles [3,4], where bars denote computers and rectangles denote groups of computers. It finally represents the statistics of incidents by mapping the number of incidents of each computer as heights of the bars. The technique can represent the distribution of incidents in large-scale computer networks consisting of several thousand computers.

The technique helps the user intuitively understand the distribution and trend of enormous numbers of incidents in IP-address spaces of computer networks. It also helps the discovery of relevant relationships between distributions of incidents and the organization of real society, because IP addresses are usually assigned according to the organization of real society.

Moreover, the technique can provide the capability to explore detailed information about incidents for each computer, by representing computers as clickable icons. This capability assists users in exploring the detailed information of incidents for each computer.

This paper presents experimental results on visualizing enormous numbers of real incidents, and describes what kind of trends are observed from the experimental results.

2. Related works

Many IDS products provide detection, warning, and analysis capabilities for incidents, but they have not completely solved the issues described in Section 1. Several recent works address these issues.

On the other hand, it is important to minimize damage by discovering high-security incidents intuitively and immediately, and information visualization is a valuable technology for this task. As described in the sidebar “Visualization for computer network and intrusion detection”, recent works for visualization of network intrusion include the following features:

• Visualization and detail-on-demand user interfaces showing time sequences of network traffic,

• Filtering of error detection or unimportant malicious accesses from visualization results,

• Visual data mining for discovery of suspicious traffic patterns from general log files, and

• Visualization of distribution of traffic on IP-address-oriented display spaces.

The technique presented in this paper can be categorized as “visualization of IP-address-oriented spaces”. Here, the difference between this technique and existing techniques is that this technique attempts to maximize the density of the information on display. This feature represents computers as small clickable icons, enabling a user interface that presents details on demand for each computer.

Also, the technique is useful for discovering the behavior of incidents relevant to the distribution of computers and the organizational structure of real society. For example, it visualizes the following behaviors:

• One computer attacks many others simultaneously.

• Many computers attack one computer simultaneously.

• When a computer is attacked and a virus is placed on the computer, it then turns to attack other computers.

• One (or more) computers attack other computers in the same group or department.

With these features, the presented visualization technique complements existing techniques well.

3. Hierarchical data visualization

3.1 Rectangle packing for hierarchical data

The proposed technique applies a hierarchical data visualization technique presented in [3,4]. Figure 1 is an example of the visualization by this technique, which represents leaf-nodes as black square icons, and branch-nodes as rectangular borders enclosing the icons.

The visualization technique places thousands of leaf-nodes into one display space while satisfying the following conditions:

• It never overlaps leaf-nodes or branch-nodes with other nodes in the same hierarchy,

• It attempts to minimize the display area requirement, and

• It draws all leaf-nodes by equally shaped and sized icons.

Figure 1. Example of hierarchical data visualization using a rectangle packing algorithm.

This representation style is suitable to equally visualize thousands of leaf-nodes of hierarchical data in one display space. We applied the technique to visualization of bioactive chemicals [4], distribution of jobs in parallel computing environments [5], and so on.

The technique first packs icons, and then encloses them in rectangular borders. Similarly, it packs a set of rectangles that belong to higher levels, and generates the larger rectangles that enclose them. Repeating the process from the lowest level toward the highest level, the technique places all of the data onto the layout area. The packing algorithm for icons and rectangles is the key technology for the visualization technique. Itoh et al. proposed a rectangle packing algorithm [3] for hierarchical data visualization, and an improved rectangle packing algorithm was later presented in [4]. Both algorithms place icons and rectangles one-by-one onto display spaces, choosing each position from multiple candidate positions.

As shown in Figure 2, the improved rectangle packing algorithm [4] applies grid-like subdivision of a display area using extension lines of edges of previously placed rectangles. The algorithm quickly generates multiple candidate positions for the rectangle currently being placed by referring to the grid-like subdivision. It generates at most four candidates at the corner of empty subspaces of the grid-like space, where the current rectangle can be placed without yielding any unnecessary gaps with previously placed rectangles. The algorithm then decides the position of the rectangle while it avoids overlapping the rectangle with previously placed ones, and attempts to minimize the area and aspect ratio of the whole grid-like space. If there is no adequate candidate position to place the rectangle, the algorithm additionally generates several candidate positions outside the grid-like space, and selects one of the candidates to place the rectangle.
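The placement step can be illustrated with a much-simplified sketch. It is not the algorithm of [3] or [4]: instead of generating at most four candidates per empty subspace, it tries every intersection of the grid lines induced by previously placed rectangles, and it scores a candidate by bounding-box area multiplied by aspect ratio, which is an assumed cost function. The names `overlaps` and `place_rectangles` are hypothetical.

```python
def overlaps(a, b):
    """True if rectangles a and b, given as (x, y, w, h), share interior area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def place_rectangles(sizes):
    """Place (w, h) rectangles one by one; return a list of (x, y, w, h)."""
    placed = []
    for w, h in sizes:
        # Grid lines: the origin plus the right and top edges of placed rectangles.
        xs = {0.0} | {x + pw for x, _, pw, _ in placed}
        ys = {0.0} | {y + ph for _, y, _, ph in placed}
        best, best_cost = None, None
        for x in sorted(xs):
            for y in sorted(ys):
                cand = (x, y, w, h)
                if any(overlaps(cand, p) for p in placed):
                    continue
                # Bounding box of the layout if this candidate were chosen.
                width = max([x + w] + [px + pw for px, _, pw, _ in placed])
                height = max([y + h] + [py + ph for _, py, _, ph in placed])
                # Assumed cost: area times aspect ratio (1.0 for a square).
                cost = width * height * max(width / height, height / width)
                if best_cost is None or cost < best_cost:
                    best, best_cost = cand, cost
        # A free candidate always exists: the rightmost grid line at y = 0.
        placed.append(best)
    return placed


layout = place_rectangles([(1, 1)] * 4)
```

With four unit squares, this sketch arranges them into a 2 x 2 block, because the cost term penalizes both wasted area and elongated layouts.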

3.2 Visualization in the IP address space

The presented technique groups the computers according to their IP addresses to form hierarchical data. It first groups them according to the first byte of the IP addresses. It again groups them according to the second byte of the IP addresses, and finally groups according to the third byte of the IP addresses. Consequently the technique forms four-level hierarchical data as shown in Figure 3(Left). The technique visualizes the structure of the computer network by representing the hierarchical data as shown in Figure 3(Right). Here, black icons in Figure 3(Right) represent computers, and the rectangular borders represent groups of computers.
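The byte-by-byte grouping can be sketched as nested dictionaries. This is a minimal illustration, assuming dotted-quad IPv4 strings as input; the function names `ip_key` and `build_ip_hierarchy` are hypothetical, and the sample addresses are the node labels that appear in Figure 3.

```python
def ip_key(addr):
    """Convert '1.2.3.4' into the tuple (1, 2, 3, 4) for grouping and sorting."""
    return tuple(int(part) for part in addr.split("."))


def build_ip_hierarchy(addresses):
    """Group addresses by first, then second, then third byte.

    Returns {byte1: {byte2: {byte3: [address, ...]}}}; the innermost lists
    hold the leaf-nodes (individual computers) in numeric order.
    """
    tree = {}
    for addr in sorted(set(addresses), key=ip_key):
        b1, b2, b3, _ = ip_key(addr)  # raises ValueError unless 4 parts
        tree.setdefault(b1, {}).setdefault(b2, {}).setdefault(b3, []).append(addr)
    return tree


SAMPLE = ["1.2.3.4", "1.2.3.5", "1.2.4.5", "1.2.4.6",
          "1.3.3.4", "2.3.5.5", "3.5.6.8", "2.3.5.7"]
hierarchy = build_ip_hierarchy(SAMPLE)
```

Here `hierarchy[1][2]` corresponds to the group labeled 1.2.*.* in Figure 3, with subgroups for 1.2.3.* and 1.2.4.*.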

We think that the technique is useful for the visualization of computer network spaces because:

• The technique visualizes large-scale hierarchical data containing thousands of leaf-nodes without overlapping, and therefore it can represent thousands of computers as clickable icons in one display space. The technique is therefore useful as a GUI to directly explore detailed information about incidents of arbitrary computers in large-scale computer networks.

• The technique visualizes a hierarchy of computers according to their IP addresses. Therefore, it can briefly represent the correlation between incidents and groups of computers in real society, because IP addresses are often assigned according to the structure of a real organization.

Figure 2. Improved rectangle packing algorithm. (Upper) Previously-placed rectangles, and grid-like subdivision of a display space. (Center) Candidate positions for placing the current rectangle. (Lower) Placement of the current rectangle, and the update of grid-like subdivision.

4. Implementation

4.1 Network intrusion detection data

The presented technique consumes the log files of a commercial IDS system (Cisco Secure IDS 4320 [6]). The system detects incidents based on signatures that predefine the typical patterns of malicious accesses. The technique inputs the following items from the description of the log files, as shown in Figure 4(1):

• IP address of a computer sending incidents.

• IP address of a computer receiving incidents.

• Date and time.

• Positive integer ID (signature ID) that denotes the specific signature.

• Security level (1, 2, 3, 4, and 5).
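The five input items can be captured in a small record type. The sketch below assumes a hypothetical comma-separated line layout (timestamp, sender, receiver, signature ID, security level); the actual log format of the IDS is not described in this paper, and the names `Incident` and `parse_incident` are illustrative only.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Incident:
    timestamp: datetime
    sender: str          # IP address of the computer sending the incident
    receiver: str        # IP address of the computer receiving the incident
    signature_id: int    # positive integer identifying the signature
    security_level: int  # 1 to 5


def parse_incident(line):
    """Parse one line of the assumed format 'time, sender, receiver, id, level'."""
    ts, sender, receiver, sig, level = (field.strip() for field in line.split(","))
    incident = Incident(
        timestamp=datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"),
        sender=sender,
        receiver=receiver,
        signature_id=int(sig),
        security_level=int(level),
    )
    if incident.signature_id <= 0:
        raise ValueError("signature ID must be a positive integer")
    if not 1 <= incident.security_level <= 5:
        raise ValueError("security level must be between 1 and 5")
    return incident


# Made-up example line; the addresses are from Figure 3.
example = parse_incident("2005-06-01 13:05:22, 1.2.3.4, 1.2.4.5, 3030, 4")
```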

4.2 Visualization procedure

Consuming the log files, the presented technique visualizes the incidents in the following processing order:

RDB-like data structure:

Consuming the log file, the presented technique forms a data structure like a relational database (RDB), as shown in Figure 4(2). It constructs tables for time, signature IDs, security levels, senders’ IP addresses, and receivers’ IP addresses. The data structure accelerates the aggregation of incidents.
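One plausible reading of this structure is a set of lookup tables that map each attribute value to the rows holding it. The sketch below is an assumption about how such tables might look, not the authors' implementation; the minute-level time bucket, the tuple layout, and the names `build_tables` and `rows_matching` are all hypothetical.

```python
from collections import defaultdict


def build_tables(incidents):
    """Index incident rows by attribute value.

    Each incident is (timestamp, sender, receiver, signature_id, level),
    with timestamp as a 'YYYY-MM-DD HH:MM:SS' string.
    """
    names = ("time", "signature", "level", "sender", "receiver")
    tables = {name: defaultdict(list) for name in names}
    for row, (ts, sender, receiver, sig, level) in enumerate(incidents):
        tables["time"][ts[:16]].append(row)  # bucket by minute (assumption)
        tables["signature"][sig].append(row)
        tables["level"][level].append(row)
        tables["sender"][sender].append(row)
        tables["receiver"][receiver].append(row)
    return tables


def rows_matching(tables, signature=None, level=None):
    """Return the set of rows satisfying all given conditions (None if no condition)."""
    candidates = None
    for table, key in (("signature", signature), ("level", level)):
        if key is None:
            continue
        rows = set(tables[table].get(key, []))
        candidates = rows if candidates is None else candidates & rows
    return candidates


# Made-up rows for illustration.
ROWS = [
    ("2005-06-01 13:05:22", "1.2.3.4", "1.2.4.5", 3030, 4),
    ("2005-06-01 13:05:40", "1.2.3.4", "1.2.4.6", 3030, 2),
    ("2005-06-01 14:00:00", "2.3.5.5", "1.2.3.4", 5000, 5),
]
tables = build_tables(ROWS)
```

Looking up a condition then costs one dictionary access per table instead of a scan over every incident, which is the kind of speed-up the text attributes to the data structure.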

Construction of hierarchical data:

Simultaneously the technique lists the IP addresses of senders and receivers, and forms hierarchical data by referring to IP addresses byte-by-byte, as shown in Figure 4(3). Here all the computers described in the log file are registered in the hierarchical data.

Aggregation of incidents for each computer:

The technique then counts the total number of sending and receiving incidents for each computer. Here it can specify the conditions, such as signature IDs, security levels, and range of times, to filter non-important incidents. If a signature ID is specified, the technique counts the matching incidents by referring to the signature ID table. Similarly, it refers to the time or security level tables if the range of time or the security level is specified.
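The counting step with optional conditions can be sketched as follows. For brevity this sketch scans the incidents linearly rather than consulting per-attribute tables, so it illustrates only the filtering semantics; the tuple layout, the sample data, and the name `count_incidents` are assumptions.

```python
from collections import Counter
from datetime import datetime


def count_incidents(incidents, signature_ids=None, levels=None, start=None, end=None):
    """Count sent and received incidents per IP address.

    Each incident is (timestamp, sender, receiver, signature_id, level).
    A condition left as None is not applied.
    """
    sent, received = Counter(), Counter()
    for ts, sender, receiver, sig, level in incidents:
        if signature_ids is not None and sig not in signature_ids:
            continue
        if levels is not None and level not in levels:
            continue
        if start is not None and ts < start:
            continue
        if end is not None and ts > end:
            continue
        sent[sender] += 1
        received[receiver] += 1
    return sent, received


# Made-up incidents for illustration.
SAMPLE = [
    (datetime(2005, 6, 1, 13, 5), "1.2.3.4", "1.2.4.5", 3030, 4),
    (datetime(2005, 6, 1, 13, 7), "1.2.3.4", "1.2.4.6", 3030, 2),
    (datetime(2005, 6, 1, 14, 0), "2.3.5.5", "1.2.3.4", 5000, 5),
]
sent, received = count_incidents(SAMPLE, levels={4, 5})
```

The two counters correspond to the two quantities that the visualization later maps to bar heights for each computer.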

Representation:

The technique then visualizes the hierarchical data. Here it represents the numbers of sending and receiving incidents for each computer, by mapping the numbers as heights of leaf-nodes. As shown in Figure 5, the technique represents the numbers of sending and receiving incidents by assigning different colors. Examples shown in Figures 8 to 10 represent the number of sent incidents as blue, and the number of received incidents as red.

Figure 3. (Left) Hierarchy of computers according to their IP addresses. (Right) Illustration of visualization results of the hierarchical data. [Node labels in the figure: computers 1.2.3.4, 1.2.3.5, 1.2.4.5, 1.2.4.6, 1.3.3.4, 2.3.5.5, 3.5.6.8, 2.3.5.7; groups 1.*.*.*, 1.2.*.*, 1.2.3.*, 1.2.4.*, 1.3.*.*, 2.*.*.*, 3.*.*.*]

Figure 4. Processing order of the proposed visualization technique.

Configuration of high-security (or low-security) incidents:

Generally an IDS does not always provide adequate warning of the security level of incidents because impact of incidents strongly depends on each computer network’s situation. The presented technique consumes the description of signature IDs and IP addresses of experienced high-security (or low-security) incidents, as shown in Figure 4(4). This capability allows an administrator to configure the visualization results according to his or her preferences, for example:

• “Incidents which have specific signature IDs are always erroneous or ignorable in this network”,

• “Incidents which have specific signature IDs have damaged this network in the past”, and

• “Incidents which have specific IP addresses of senders have damaged this network in the past”.

Also, the capability allows configuring the following computers as high-security:

• “Computers that sent or received more than a constant number of incidents in a constant time”, and

• “Computers whose number of sending or receiving incidents drastically increases.”
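The two computer-level rules above can be sketched as simple predicates. The paper does not define the constants or what counts as a drastic increase, so the window length, the count threshold, the growth factor, and the minimum count below are all assumed parameters, and the function names are hypothetical.

```python
def exceeds_rate(timestamps, max_count, window_seconds):
    """True if more than max_count events fall within any window of the given length.

    timestamps are event times in seconds, in any order.
    """
    times = sorted(timestamps)
    start = 0
    for end, t in enumerate(times):
        # Shrink the window from the left until it spans at most window_seconds.
        while t - times[start] > window_seconds:
            start += 1
        if end - start + 1 > max_count:
            return True
    return False


def increased_drastically(previous_count, current_count, factor=5.0, minimum=10):
    """True if the count grew by more than `factor` between two periods.

    `minimum` suppresses alerts on very small counts; both defaults are assumptions.
    """
    return current_count >= minimum and current_count > factor * max(previous_count, 1)


burst = exceeds_rate([0, 10, 20, 30], max_count=3, window_seconds=60)
spread = exceeds_rate([0, 100, 200, 300], max_count=3, window_seconds=60)
```

In this example `burst` is true because four incidents occur within 30 seconds, while `spread` is false because no 60-second window holds more than one incident.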

The technique can control the level of detail of visualization by eliminating or assigning dark colors to leaf-nodes corresponding to low-security computers.

Also, it can assign bright colors to leaf-nodes corresponding to computers sending or receiving pre-defined high-security incidents, to alert administrators of the return of known attacks. The example shown in Figure 10 represents computers sending or receiving high-security incidents in yellow.

4.3 GUI capability

We developed the GUI of the presented technique as a Java Applet. The features of the GUI are as follows.

Dialog window for configuring conditions for counting incidents:

The GUI pops up a dialog window for configuring conditions for counting incidents, including signature IDs, security levels, ranges of times, and IP addresses. Figure 6(Upper) shows an example of the dialog window. Given the conditions, the technique only counts incidents satisfying the conditions. The GUI enables more focused visualization, for example:

• “The network was damaged during 13:05 to 13:10, so I would like to visualize the distribution of incidents during that time,”

• “The network was damaged by the specific signatures, so I would like to visualize the distribution of the signatures,” or

• “This specific computer is often problematic, so I would like to visualize the distribution of incidents related to the computer by specifying its IP address.”

Dialog window for listing incidents for specific computer:

The GUI pops up a dialog window that displays the list of incidents for a specific computer that is the sender or the receiver. The dialog window pops up when a leaf-node is clicked, and then shows the list of incidents for the specific computer corresponding to the clicked leaf-node. Figure 6(Lower-left) shows an example of the dialog window.

Dialog for listing records of typical attacks:

Figure 5. Illustration of visualizing the numbers of incidents as heights of leaf-nodes.


References

VisFlowConnect: netflow visualizations of link relationships for security situational awareness

Home-centric visualization of network traffic for security administration

Intrusion and misuse detection in large-scale systems

SnortView: visualization system of snort logs

MAIDS: mining alarming incidents from data streams

