
NStreamAware: Real-Time Visual Analytics for Data
Streams to Enhance Situational Awareness
Fabian Fischer
University of Konstanz, Germany
Fabian.Fischer@uni-konstanz.de
Daniel A. Keim
University of Konstanz, Germany
Daniel.Keim@uni-konstanz.de
ABSTRACT
The analysis of data streams is important in many security-
related domains to gain situational awareness. To provide
monitoring and visual analysis of such data streams, we
propose a system, called NStreamAware, that uses mod-
ern distributed processing technologies to analyze streams
using stream slices, which are presented to analysts in a
web-based visual analytics application, called NVisAware.
Furthermore, we visually guide the user in the feature se-
lection process to summarize the slices to focus on the most
interesting parts of the stream based on introduced expert
knowledge of the analyst. We show through case studies how
the system can be used to gain situational awareness
and eventually enhance network security. Furthermore, we
apply the system to a social media data stream to compete
in an international challenge to evaluate the applicability of
our approach to other domains.
Categories and Subject Descriptors
C.2.0 [Computer-Communication Networks]: General—
Security and protection; I.3.8 [Computer Graphics]: Ap-
plications; H.5.2 [Information Interfaces and Presenta-
tion]: User Interfaces
Keywords
Real-Time Processing, Data Streams, Situational Aware-
ness, Network Security, Visual Analytics
1. INTRODUCTION
In many security-related scenarios the analysis and situ-
ational assessment of data streams is crucial to detect sus-
picious behavior, to monitor and understand ongoing activ-
ities, or to reduce streams to focus on the most relevant
parts. For example, in the field of system and network ad-
ministration, network routers and servers produce a con-
tinuous stream of NetFlow records or system log messages,
and hundreds of system metrics and performance data. Sometimes
analysts perform close real-time monitoring, while in other
situations they have no choice but to focus
only on the most important parts of a data stream. The
same is true in the field of law enforcement in the analysis
of criminal activities of ongoing threats to maintain situ-
ational awareness (SA). In this scenario, analysts need to
handle streams of possibly important social media messages
and call center messages. Both scenarios are technically re-
lated and show the high importance of research in the field
of data stream analysis with the analyst in the loop that
is a key to enhance situational awareness. The challenge
in this field is also to merge and aggregate heterogeneous
high-velocity data streams. While we do have a wide variety
of highly scalable databases and there has been much re-
search in intrusion and anomaly detection, fully automated
systems do not yet work sufficiently well. To support
understanding, generate insights, and evaluate hypotheses,
analysts need to have a central role in such a system, so as
not to lose context and to be able to judge data provenance.
The ultimate goal is to allow the analysts to actually get an
idea of what is going on in a data stream to gain situational
awareness. Such analysts are often “being asked to make
decisions on ill-defined problems. These problems may con-
tain uncertain or incomplete data, and are often complex to
piece together. Consequently, decision makers rely heavily
on intuition, knowledge and experience” [14], which high-
lights the need to guide analysts to the right parts of a data
stream, because it is impossible to analyze everything at the
same level of detail.
In this paper, we introduce NStreamAware, which is a vi-
sual analytics system designed to address this challenge us-
ing latest analysis technologies available from the big data
analysis community [20] and real-time visual analytics re-
search [12].
The main contributions of our work are the following:
Firstly, a system architecture, called NStreamAware, based
on Apache Spark Streaming [2] to summarize incoming data
streams in sliding slices. Secondly, a web-based visual an-
alytics application, called NVisAware, using a novel com-
bination of various visualization techniques within multiple
sliding slices to visually summarize the data stream based
on selected features steered by a visual analytics interface.
The remainder of this paper is structured as follows: Sec-
tion 2 elaborates on important design considerations. Sec-
tion 3 gives an overview of related work. Section 4 describes
the different aspects of our approach, while the evaluation is
discussed in Section 5. Section 6 discusses limitations and
future work, and Section 7 concludes.
Konstanzer Online-Publikations-System (KOPS)
URL: http://nbn-resolving.de/urn:nbn:de:bsz:352-0-267315
Published in: Proceedings of the Eleventh Workshop on Visualization for Cyber Security (VizSec '14), Paris, France, November 10, 2014 / Lane
Harrison et al. (Eds.). New York, NY: ACM, 2014, pp. 65-72. ISBN 978-1-4503-2826-5

2. DESIGN CONSIDERATIONS
Based on the given problem, experience, and expert feed-
back from earlier work in the field, we identified the following
design considerations and principles as crucial for our ap-
proach.
DC1 Incorporate novel scalable analytics methods:
Scalable, distributed, and proven large-scale analysis
frameworks must be building blocks of a system able
to address big data problems. We need to take ad-
vantage of such novel technologies from the big data
community and use them in visual analytics applica-
tion. We need to bring those worlds together and keep
the analyst in the loop to address complex problems.
DC2 Enabling real-time monitoring: While it is not
possible to present all raw messages of high-speed
streams, in many scenarios analysts still want to closely
monitor messages from a particular system, or based
on specific filter criteria, in real-time. Many available
visual analytics systems, however, still require static
batch loading first. We see the need to be able to
directly push data to our system in a streaming fashion,
and to smoothly switch between monitoring and
exploration.
DC3 Deterministic screen updates, independent from
underlying data streams: The problem in systems
supporting DC2 is the high cognitive load for analysts
when analyzing real-time streams. Because of
the unpredictable characteristics of data streams with
respect to volume, velocity, variety, and veracity, we
additionally need visualizations that decouple the
flow rate of a data stream from screen updates and
keep the latter constant and predictable, so as not to
overwhelm the user. There is a trade-off between DC2
and DC3 when trying to achieve both at the same time.
DC4 Fusion of heterogeneous data sources: Many avail-
able systems focus on individual data sources and
provide little flexibility to incorporate and correlate
various heterogeneous data sources. Admittedly, focusing
on particular individual data sources helps to develop
highly effective, specialized visualization systems. On the
other hand, it is important to cover a broader range of
scenarios and tasks to provide better situational as-
sessment.
DC5 User-steered feature selection: Feature selection
is an important field in which to support analysts using
appropriate visualization and interaction techniques. Our
goal is to enhance understanding of data streams and
provide more compact overviews. In this process, we
want to integrate the human in the workflow, which re-
quires a tight coupling of visual representations, inter-
action, and analytic methods.
3. RELATED WORK
The contributions of our work are related to various re-
search fields, which we discuss in the following
section. Many researchers focus on the algorithmic analysis
of data streams, especially in the field of stream clustering
[1] and event detection. In recent years, there has been a focus
on social data streams because of the wide availability of
such data. While most of these systems focus on the
detection of events, our work contributes more to the field of
visualizing a condensed heterogeneous data stream to fo-
cus on more interesting changes, omitting or merging less
interesting ranges to eventually focus on important parts
in more detail. This idea is related to the work of Xie et
al. [19] proposing a fully-automated merging algorithm for
time-series data streams.
A recent study by Wanner et al. [18] takes a look at the
evolution of visual analytics applications for event detection
for text streams and concludes that “visualizations were pri-
marily used as presentation, but had no interaction possible
to steer the underlying data processing algorithm”. This
confirms our assumption that many systems do not cover
DC5 appropriately. Our approach differs in that we provide
interactions so that users are able to steer the feature se-
lection process. Therefore, the system does not rely only
on the fully automated selection of interesting parts, but
also on the user-adjusted feature set.
analytics systems for data streams is to enhance situational
awareness to facilitate decision making. Endsley provides a
widely used generic definition of SA. It “is the perception
of the elements in the environment within a volume of time
and space, the comprehension of their meaning, and the pro-
jection of their status in the near future” [6]. Further work
makes it clear that situation awareness primarily resides “in
the minds of humans”, while situation assessment better de-
scribes the “process or set of processes” leading to the state
of SA [16]. In the complex field of computer network secu-
rity operations, only a combination of various tools used by
experienced domain experts will eventually be able to guide
the user to such a cognitive state. Franke and Brynielsson [9]
give a systematic literature overview specifically for the field
of cyber situational awareness.
Furthermore, there is not only work on SA systems, but
also visualization techniques (e.g., [7]) designed to convey
the current state of the network to best support situational
assessment. ELVIS [10] is a highly interactive system to an-
alyze system log data, but cannot be applied to real-time
streams. SnortView [11] focus on the specific analysis of
intrusion detection alerts and does satisfy DC2. The focus
of Event Visualizer [8], is to provide real-time visualizations
for event data streams (e.g., system log data) to provide
real-time monitoring and possibilities to smoothly switch to
exploration mode covering DC2 and DC4. In contrast to
this event-based approach, Best et al. [3] proposes another
real-time system to enhance situational awareness using the
analysis of network traffic based on LiveRAC [13]. The ana-
lyzed and aggregated time-series are displayed in a zoomable
tabular interface to provide the analyst an interactive explo-
ration interface for time-series data, while our approach is
more general to include also other data types (e.g., frequent
words or users, hierarchical overviews) addressing DC4.
Additionally, Shiravi et al. [15] provide an extensive
overview of various visualization systems for network secu-
rity based on five major use case classes: Host/Server Moni-
toring, Internal/External Monitoring, Port Activity, Attack
Patterns, and Routing Behavior. The authors also identi-
fied that most security visualization systems, in
their current state, are “mostly suitable for offline forensics
analysis”, while “real-time processing of network events re-
quires extensive resources, both in terms of the computation
power required to process an event, as well as the amount
of memory needed to store the aggregated statistics” [15].
Compared to work specifically found in one of these use case
classes, our approach tries to combine multiple use cases into
a real-time visual analytics system and addresses the scala-
bility issues using Apache Spark.
4. VISUAL ANALYTICS SYSTEM
In the following, we describe the building blocks of NStrea-
mAware. The overall architecture can be seen in Figure 1.
To process the data stream, we made use of various mod-
ern technologies to provide a scalable infrastructure for our
modular visual analytics system. Our architecture consists
of our REST Service, Spark Service and a web application
with various visualizations, called NVisAware. To provide
proven and scalable data processing, we make use of Apache
Spark (https://spark.apache.org/), RabbitMQ (http://www.rabbitmq.com/),
ElasticSearch (http://www.elasticsearch.org/), and MongoDB (http://www.mongodb.com/).
The REST Service (1) connects to the data streams (2)
and preprocesses the data and calculates various additional
information for the incoming events. The service also
provides a REST interface to retrieve historical data or man-
age insights. All events are stored in a distributed Elas-
ticSearch cluster and are forwarded to our message broker
RabbitMQ.
The Spark Service (3), which runs on top of the Apache
Spark Streaming platform for analytics, generates real-time
summaries on sliding windows, and stores them to a Mon-
goDB database (4). Spark Streaming is a development frame-
work that helps to implement analytical algorithms executed
in large distributed cluster environments, providing scala-
bility even in big data scenarios. The Spark Service is im-
plemented using Scala and calculates various statistics and
features based on sliding windows. Table 1 shows a selec-
tion of calculated example features for a network security
use case. We call these summaries, which are generated in
a regular interval, sliding slices. Those slices and also a se-
lection of raw messages are eventually forwarded to our web
application NVisAware (5), so that they can be visualized
in the graphical user interface to the analyst using various
interactive real-time displays. All modules are loosely cou-
pled, so that they can be run on separate computers or in
cluster environments to achieve best performance for large-
scale data streams.
4.1 REST Service Module
The REST Service (1), which is implemented as a multi-
threaded standalone Java application, provides a REST in-
terface accessible by all other modules, especially the web
application. This REST service is used to handle job queu-
ing and to answer data requests. To attach new data streams,
the respective jobs can be sent to the service via a defined
REST API. The job is added as a new thread and the API
can be used to control or retrieve status information about
these running jobs. Incoming messages from the data stream
are then preprocessed, fields are extracted, and eventually
treated as individual events, enriched with various addi-
tional attributes. The procedure is based on the assigned
scenario configuration. For social media messages, sentiment
values are calculated, while for IP-related data geo lookups
can be made. In practice, many servers do not provide very
accurate timestamps; therefore, a new field with the current
timestamp is added as well, to have more accurate tim-
ings in cases where the source machine does not make use of the
network time protocol or uses deviating time settings.
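The enrichment step can be sketched as follows. This is a minimal, hypothetical Python illustration only (the actual module is a multi-threaded Java service); the syslog-style pattern and field names are assumptions for demonstration.

```python
from datetime import datetime, timezone
import re

# Hypothetical field-extraction pattern; not the authors' actual parser.
SYSLOG_RE = re.compile(r"^(?P<host>\S+)\s+(?P<program>[\w\-/]+):\s+(?P<message>.*)$")

def enrich(raw_line: str) -> dict:
    """Extract fields from a raw syslog-style line and attach a receive
    timestamp, since sender clocks may be inaccurate or unsynchronized."""
    event = {"raw": raw_line}
    m = SYSLOG_RE.match(raw_line)
    if m:
        event.update(m.groupdict())
    # Timestamp added on arrival: more reliable than sender clocks when the
    # source does not use NTP or has deviating time settings.
    event["received_at"] = datetime.now(timezone.utc).isoformat()
    return event

e = enrich("webserver sshd: Failed password for invalid user admin")
```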
Figure 1: System Architecture: NStreamAware uses
various modern systems, including Apache Spark,
RabbitMQ, MongoDB, and ElasticSearch, to pro-
vide the needed scalability for an interactive visual
analytics application.
4.2 Module for Spark Streaming
Apache Spark provides a distributed memory abstraction
that is fault-tolerant and efficient. This helps to program
distributed data processing applications without worrying
about fault tolerance. Apache Spark introduces a program-
ming model, called Resilient Distributed Datasets (RDDs),
which provides an interface of coarse-grained transformations
(e.g., map, group-by, filter, join). The RDDs can be ad-
dressed within Scala similar to normal collections; however,
they are in fact spread over the underlying cluster machines.
If a transformation is called on an RDD, the execution is ac-
tually performed on various worker machines. When an action is
called (e.g., count), the result is retrieved from all workers
to return final results. We use the streaming extension of
Apache Spark and use the same programming model to an-
alyze data streams in real-time. We define a sliding window
and connect to a RabbitMQ queue to receive messages for-
warded by the REST Service. Currently, we defined various
feature types to be calculated on the incoming messages:
count, set, new-set, key-value list, and key-array list. All
features seen in Table 1, for example, belong to one of
these feature types. After calculating the various features,
they are directly stored to a MongoDB collection. When all
features are ready, NVisAware is notified via RabbitMQ to
retrieve the sliding slice content via the REST API using
the appropriate database queries. Count provides a simple
count of the number of messages. A set stores the unique
values that occurred within a sliding window, while a new-set
feature will only include values, which have never been seen
in the whole stream before. A key-value list can be used
to count the number of occurrences for all words to gather
a list of frequent words. The key-array list can be used to
store for each key an array of values. This can be used, for
example, to track for each IP address all used port numbers
in the sliding window.
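The five feature types can be sketched in a few lines. This is a plain-Python illustration rather than the distributed Scala/Spark implementation described here; the event field names (`host`, `src`, `port`) are assumptions.

```python
from collections import Counter

def window_features(events: list, seen_hosts: set) -> dict:
    """Compute one example of each feature type over a sliding window.
    seen_hosts carries the stream-global state needed for the new-set."""
    hosts = [e["host"] for e in events]
    features = {
        "#events": len(events),               # count
        "hosts_set": set(hosts),              # set: unique values in the window
        "newHosts": set(hosts) - seen_hosts,  # new-set: never seen in the stream
        "hosts": Counter(hosts),              # key-value list: value frequencies
        "topTalker": {},                      # key-array list
    }
    for e in events:  # track all used port numbers per source address
        features["topTalker"].setdefault(e["src"], []).append(e["port"])
    seen_hosts.update(hosts)                  # advance the global stream state
    return features

seen = set()
slice1 = window_features([
    {"host": "a", "src": "10.0.0.1", "port": 22},
    {"host": "a", "src": "10.0.0.1", "port": 80},
    {"host": "b", "src": "10.0.0.2", "port": 443},
], seen)
```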
Feature          Type            Stream
#events          count           Syslog
timestamps       set             Syslog
#programs        count           Syslog
#hosts           count           Syslog
#frequentWords   count           Syslog
programs         key-value list  Syslog
hosts            key-value list  Syslog
frequentWords    key-value list  Syslog
newHosts         new-set         Syslog
newPrograms      new-set         Syslog
srcAddr          key-value list  NetFlow
dstAddr          key-value list  NetFlow
srcPorts         key-value list  NetFlow
dstPorts         key-value list  NetFlow
topTalker        key-array list  NetFlow
#srcAddr         count           NetFlow
#dstAddr         count           NetFlow
#srcPorts        count           NetFlow
#dstPorts        count           NetFlow
ossecAlerts      key-value list  OSSEC
Table 1: Selection of aggregation features for each
sliding slice generated by our implemented analysis
and aggregation module.
4.3 NVisAware Web Application
The graphical user interface is provided by our web ap-
plication NVisAware, which offers various displays. The
application is written in HTML5 and JavaScript using sev-
eral visualization libraries. The display consists of multiple
configuration and parameter views and six main tabs: Real-
Time Data Stream, Real-Time Sliding Slices, Visual Feature
Selection, Summarized Sliding Slices, Event Timeline & In-
sights, and Search & Exploration. The first display can be
seen in Figure 3 and is used to take a look at the raw mes-
sages in the data stream.
4.4 Real-Time Sliding Slices
To visually represent the generated sliding slices, we pro-
vide a novel visualization with various embedded charts like
word clouds, node-link diagrams, treemaps, and counters
within each slice. The slices are juxtaposed next to
each other to provide a timeline based on consecutive slices,
as seen in Figure 4. The prominent background color uses
a colormap from dark green over white to pink based on a
diverging ColorBrewer set. The color encodes a similarity
score to the previous slice to alert the analyst. In the up-
per left corner, a star icon can be used to store the slice for
further investigations. The slice will also be added to the
Event Timeline & Insights view, where all starred objects
are presented in a traditional interactive timeline to explore
the events flagged and labeled by the analysts.
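The similarity-based background coloring could be sketched as follows. The Jaccard measure and the concrete RGB endpoints are assumptions; the paper specifies only a diverging dark-green/white/pink ColorBrewer scheme.

```python
def jaccard(a: set, b: set) -> float:
    """Example similarity between two slices' value sets (an assumption;
    the paper does not name the similarity measure)."""
    return len(a & b) / len(a | b) if a | b else 1.0

def diverging_color(similarity: float) -> tuple:
    """Map similarity in [0, 1] onto a diverging scale:
    1.0 -> dark green (unchanged), 0.5 -> white, 0.0 -> pink (alarming)."""
    green, white, pink = (27, 120, 55), (247, 247, 247), (197, 27, 125)
    lo, hi = (pink, white) if similarity < 0.5 else (white, green)
    t = similarity * 2 if similarity < 0.5 else (similarity - 0.5) * 2
    # Linear interpolation per RGB channel between the two endpoints.
    return tuple(round(l + (h - l) * t) for l, h in zip(lo, hi))
```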
4.5 Visual Feature Selection
In many situations, the analyst is not interested in fol-
lowing the data stream in real-time. However, in some
cases a summary of the current data stream should be pro-
vided. Fully-automated summarizations are hard to achieve
for complex heterogeneous data streams. Therefore, we pro-
vide a visual feature selection interface, to steer the merging
algorithm based on the user’s criteria.
All count features in Table 1 can directly be used in the
feature timelines in Figure 2. More features can be derived
from key-value lists, for example, the occurrences over time
of a specific word found in the stream. Each feature time-
line contains many values, one value for each sliding slice
observed so far. This data is processed on the server side
and each feature timeline is cut into segments: Each time-
line is clustered using the DBSCAN algorithm. Afterwards,
consecutive slices belonging to the same cluster are merged
to a segment. The start and end points of these possibly
important segments are visible as vertical colored lines and
through the background shading within the timelines. The
analyst can visually interpret these segments, modify them,
or add new segments for interesting parts, which were not
detected by the algorithm. The analyst can remove or re-
order the features using drag and drop.
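The segmentation described above can be sketched with a small one-dimensional DBSCAN over a feature timeline (one value per sliding slice), followed by merging consecutive same-cluster slices into segments. The `eps`/`min_pts` parameters and the value-space distance are illustrative assumptions; the server-side implementation may differ.

```python
def dbscan_1d(values, eps=1.0, min_pts=2):
    """Label each timeline value with a cluster id; -1 marks noise."""
    labels = [None] * len(values)  # None = unvisited
    cluster = -1
    for i in range(len(values)):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(len(values)) if abs(values[j] - values[i]) <= eps]
        if len(neighbors) < min_pts:
            labels[i] = -1  # noise (may be absorbed by a later cluster)
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:  # expand the cluster from density-reachable points
            j = queue.pop()
            if labels[j] in (None, -1):
                was_noise = labels[j] == -1
                labels[j] = cluster
                if not was_noise:
                    nbrs = [k for k in range(len(values)) if abs(values[k] - values[j]) <= eps]
                    if len(nbrs) >= min_pts:
                        queue.extend(nbrs)
    return labels

def segments(labels):
    """Merge consecutive slices with the same label into (start, end, label)."""
    out = []
    for i, lab in enumerate(labels):
        if out and out[-1][2] == lab:
            out[-1] = (out[-1][0], i, lab)
        else:
            out.append((i, i, lab))
    return out
```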
Figure 2: Visual Feature Selection: The analyst is in
the loop to steer the merging algorithm to provide
meaningful summaries of sliding slices.
The final feature order and selection are sent to the REST
service, where all segments are merged together under the
given constraints, ignoring low-ranked conflicting fea-
tures and keeping non-conflicting and more specific segments.
Eventually, the original sliding slices can be compressed
according to the resulting heuristic merge and importance
model. Less important segments are merged together pro-
viding a multi-focal scaling of the data stream steered by
the analyst according to the tasks at hand.
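The compression step can be illustrated as follows, under the assumption that the selected segments mark certain slice indices as important and that every maximal run of unimportant slices collapses into one summary slice; the data model is hypothetical.

```python
def compress(n_slices, important):
    """important: set of slice indices covered by selected segments.
    Returns a list of (start, end) ranges; kept slices appear as (i, i),
    merged summary slices as wider ranges."""
    out, i = [], 0
    while i < n_slices:
        if i in important:
            out.append((i, i))  # keep important slices individually
            i += 1
        else:
            j = i
            while j + 1 < n_slices and j + 1 not in important:
                j += 1
            out.append((i, j))  # one merged summary slice for the whole run
            i = j + 1
    return out
```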

5. EVALUATION
In general, it is quite challenging to evaluate complex vi-
sual analytics applications. Individual design decisions can
be formally evaluated in user studies and many decisions are
indeed based on perception studies. However, proper evalua-
tion of complex expert applications is more than to evaluate
all individual design decisions. Describing convincing use
cases or presenting case studies with experts are often the
only reasonable ways. However, these results are often
subjective and hard to compare to alternative approaches.
Another reason is that “insight, the major aim of visual
analytics, is ill-defined and hard to measure” [17]. This is
even more true if we are talking about a mental state of sit-
uational awareness as the goal of the system. Generally, there
is also a lack of proper ground truth, and the sensitive na-
ture of the involved data streams makes it hard to share the
data. In that respect, international challenges that pro-
vide complex but anonymized data streams are very helpful
for a proper evaluation based on gained insights.
With this in mind, we decided to pursue two directions
of evaluation. Firstly, we describe a case study of how our
system can be used in the operational computer network of
a working group to help the system administrator stay
informed about the most important activities. Secondly, to
evaluate the real-time capabilities of our system and the in-
sights management, we actively participated in the VAST Chal-
lenge 2014 with an early version of our prototype.
5.1 Application for Network Security
To show the capabilities of our system, we deployed
it in the computer network of a working group with
about 85 active local devices including workstations, mobile
devices, and servers, producing about 1.4 million NetFlow
records per day with peaks up to 10 000 records per minute.
13 servers are connected to a central syslog server, produc-
ing 30 000 to 80 000 messages per day with individual peaks
of up to 5 000 messages per minute. These servers are also
monitored using OSSEC [4], which is a widely used “host-
based intrusion detection system that performs log analysis,
file integrity checking, policy monitoring, rootkit detection,
real-time alerting and active response”. The generated alerts
are also pushed to the central syslog server. With this infras-
tructure in place, we were able to forward the data streams
to our REST Service to make them available for NStrea-
mAware. In the following, we made use of the system log
stream (SL), NetFlow stream (NF), and OSSEC alert stream
(OS). It would be easy to further include additional data
from the underlying network, for example, system metrics,
Snort alerts, or web server access logs.
The analyst opened the web application NVisAware in a
modern web browser and added the data streams as jobs to
the server-side REST Service. Seconds later, the first mes-
sages appeared in the Real-Time Data Streams tab as seen
in Figure 3. This view is a split-screen showing the real-
time events of SL and OS as textual messages, similar to a
traditional tail -f command on UNIX systems. The bottom
window presents a zoomable geographical map to plot and
cluster extracted geographic locations. NF records are not
plotted to the geographic map, because a geographic map of
the total IP traffic will most likely not provide actionable in-
sights. However, mapping specific IP addresses of successful
logins can be worth monitoring to identify suspicious be-
havior or to reveal misuse of login credentials. Furthermore,
Figure 3: Real-Time Data Stream: Display to mon-
itor the incoming live streams as raw messages and
plot extracted geographic locations to a map.
real-time filtering and search can be applied to reduce the
number of live events shown in the display.
The Spark Service was operated in local mode on a stan-
dard workstation (Dell OptiPlex 980, Core i7-860, 4× 2.80 GHz,
8 GB RAM) with 10 separate worker threads. To provide
further scalability the service could also be deployed to a
cluster of hardware machines running Apache Spark or to a
cloud-based deployment. To provide a new sliding slice every
30 seconds, we initialized the system with a batch and slide
interval of 30s and a window length of 60s. These settings
depend on the general characteristics of the data streams.
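With a batch/slide interval of 30 s and a window length of 60 s, each slice covers 60 s of the stream and overlaps its predecessor by 30 s. A pure-Python sketch of the resulting window boundaries (the real windowing is done by Spark Streaming):

```python
def window_bounds(stream_start, stream_end, slide=30, length=60):
    """Yield (window_start, window_end) pairs in seconds: one sliding
    slice per slide step, each spanning the full window length."""
    end = stream_start + length
    while end <= stream_end:
        yield (end - length, end)
        end += slide

# First two minutes of a stream: three overlapping 60 s slices.
ws = list(window_bounds(0, 120))
```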
To reduce the cognitive load, the analyst decided to switch
to the real-time sliding slices visualization as seen in Figure 4
showing an example of five consecutive slices. The interac-
tive display can be explored by the analyst while new slices
are continuously added to the right in regular intervals to
support situational awareness. The first slice contains criti-
cal OSSEC alerts (L5, L10, L3) visualized in a small treemap
widget (1). Alerts with a severity of 10 should warn the an-
alyst of ongoing security issues, which should be explored
using drill-down functions. Those alerts are related to au-
thentication issues as seen in the word cloud (2). Another
treemap widget in the first slice (3) gives an overview of
involved programs. The third slice suddenly reveals high
port usage (4), which can be recognized in the port counter.
The treemap of source hosts (5) reveals the originating host. The
analyst can use the IP-Port node-link diagram based on NF
(6) to visually explore those suspicious connections.
Later on, the analyst decided to not look on all sliding
slices, but to compress the view based on specific features.
Figure 5 shows that the analyst is interested in slices with
highly critical OSSEC alerts of level 10, segments based on
the number of syslog messages received, and based on the
number of destination ports utilized in the computer net-
work. Based on this selection the slices are merged accord-
ingly. (1) relates to the segments relating to a port scan.
After that, there were no important slices according to the
feature selection, so a long time span is merged to a single
summary slice (2). The analyst was also interested in the
message drop in (3). Then various OSSEC alerts occurred
in multiple sliding slices (4). This area seams to be highly
suspicious, leading to many individual summary slices to
provide more details. Eventually, there are further suspi-
cious events based on NF data in (5) and another peak with
OSSEC alerts in (6) related to invalid SSH logins.
References (selection)
[1] C. C. Aggarwal, J. Han, J. Wang, and P. S. Yu. A framework for clustering evolving data streams.
[2] M. Zaharia, T. Das, H. Li, S. Shenker, and I. Stoica. Discretized streams: An efficient and fault-tolerant model for stream processing on large clusters.
[6] M. R. Endsley. Toward a theory of situation awareness in dynamic systems.
[9] U. Franke and J. Brynielsson. Cyber situational awareness – a systematic review of the literature.
[15] H. Shiravi, A. Shiravi, and A. A. Ghorbani. A survey of visualization systems for network security.
Frequently Asked Questions (14)
Q1. What contributions have the authors mentioned in the paper "NStreamAware: real-time visual analytics for data streams to enhance situational awareness"?

To provide monitoring and visual analysis of such data streams, the authors propose a system, called NStreamAware, that uses modern distributed processing technologies to analyze streams using stream slices, which are presented to analysts in a web-based visual analytics application, called NVisAware. Furthermore, the authors visually guide the user in the feature selection process to summarize the slices and focus on the most interesting parts of the stream, based on expert knowledge introduced by the analyst. The authors show through case studies how the system can be used to gain situational awareness and eventually enhance network security. Furthermore, the authors apply the system to a social media data stream to compete in an international challenge and evaluate the applicability of their approach to other domains.

However, the system still needs to be applied to a larger computer network, which is part of the future work. Automatically determining good sizes for the sliding windows is also planned for the future. The merging model based on the feature selection process could be applied to the real-time stream in the future, to actually merge sliding slices in real time, which is not fully implemented yet. Tracking individual events over time was not the focus of this work; however, further work to extend the approach in that respect seems promising.

Apache Spark introduces a programming model called Resilient Distributed Datasets (RDDs), which provides an interface to coarse-grained transformations (e.g., map, group-by, filter, join).
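The idea behind coarse-grained transformations is that each operation applies to the dataset as a whole rather than mutating individual records. The following plain-Python sketch illustrates that style on a toy syslog-like dataset; it is not Spark code, and the helper names are invented for illustration:

```python
# Illustrative sketch of RDD-style coarse-grained transformations
# (map, filter, group-by) in plain Python. Not actual Spark code;
# function names are hypothetical.

from collections import defaultdict

def rdd_map(data, fn):
    # Apply fn to every element, producing a new dataset.
    return [fn(x) for x in data]

def rdd_filter(data, pred):
    # Keep only elements satisfying the predicate.
    return [x for x in data if pred(x)]

def rdd_group_by(data, key_fn):
    # Group elements by a key, as Spark's groupBy would.
    groups = defaultdict(list)
    for x in data:
        groups[key_fn(x)].append(x)
    return dict(groups)

# Example: (host, process) records -> message count per host
records = [("hostA", "sshd"), ("hostB", "cron"), ("hostA", "kernel")]
by_host = rdd_group_by(records, key_fn=lambda r: r[0])
counts = {host: len(msgs) for host, msgs in by_host.items()}
# counts == {"hostA": 2, "hostB": 1}
```

In Spark these transformations are additionally lazy and distributed across a cluster, which is what makes the model suitable for high-volume streams.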

The ultimate goal of visual analytics systems for data streams is to enhance situational awareness to facilitate decision making. 

Because of the unpredictable characteristics of data streams with respect to volume, velocity, variety, and veracity, the authors additionally need visualizations that can decouple the flow rate of a data stream from screen updates and keep the latter constant and predictable, so as not to overwhelm the user.
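One simple way to realize this decoupling (an illustrative sketch with hypothetical names, not the authors' implementation) is to buffer incoming events as they arrive and refresh the display on a fixed timer, independent of the arrival rate:

```python
# Sketch: decouple the stream's (unpredictable) arrival rate from
# screen updates by buffering events and refreshing at a fixed rate.
# Class and method names are invented for illustration.

from collections import deque

class DecoupledView:
    def __init__(self, max_buffer=10_000):
        # Bounded buffer: oldest events are dropped on overflow.
        self.buffer = deque(maxlen=max_buffer)

    def on_event(self, event):
        """Called at the unpredictable rate of the stream."""
        self.buffer.append(event)

    def refresh(self):
        """Called at a constant rate (e.g., once per second) by a UI timer."""
        batch = list(self.buffer)
        self.buffer.clear()
        return batch  # render this batch; update rate is bounded by the timer

view = DecoupledView()
for i in range(5):
    view.on_event(i)
frame = view.refresh()
# frame == [0, 1, 2, 3, 4]; a second refresh with no new events is empty
```

The bounded buffer additionally caps memory use during bursts, at the cost of dropping the oldest unrendered events.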

Their architecture consists of a REST Service, a Spark Service, and a web application with various visualizations, called NVisAware.

To provide a new sliding slice every 30 seconds, the authors initialized the system with a batch and slide interval of 30s and a window length of 60s. 
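With these parameters, every 60 s window overlaps its predecessor by one 30 s batch. The resulting slice boundaries can be sketched as follows (a minimal illustration with a hypothetical helper, not the authors' code):

```python
# Sketch of sliding-window boundaries for a 60 s window length and a
# 30 s slide interval, matching the parameters described above.
# Illustrative only; helper name is hypothetical.

def window_starts(stream_length_s, window_s=60, slide_s=30):
    """Yield (start, end) boundaries of each complete sliding window."""
    t = 0
    while t + window_s <= stream_length_s:
        yield (t, t + window_s)
        t += slide_s

slices = list(window_starts(150))
# slices == [(0, 60), (30, 90), (60, 120), (90, 150)]
```

Each consecutive pair of slices shares a 30 s overlap, so every batch of events contributes to exactly two sliding slices.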

To provide further scalability, the service could also be deployed to a cluster of hardware machines running Apache Spark or to a cloud-based deployment.

Thirteen servers are connected to a central syslog server, producing 30,000 to 80,000 messages per day, with individual peaks of up to 5,000 messages per minute.

The Spark Service was operated in local mode with 10 separate worker threads on a standard workstation (Dell OptiPlex 980, Core i7-860 quad-core at 2.80 GHz, 8 GB RAM).

When displaying hundreds of sliding slices at the same time, performance decreased because of browser and memory restrictions on the workstation.

A first analysis had to be sent to the organizers within three hours of first connecting to the final data stream, which was available from 20:00 to 21:30. The stream could be played only once, forcing participants to do real-time processing and provide an immediate situational assessment under time pressure.

Because of an ongoing conflict involving an organization known as the Protectors of Kronos (POK), the group is suspected in the disappearance.

The slice will also be added to the Event Timeline & Insights view, where all starred objects are presented in a traditional interactive timeline to explore the events flagged and labeled by the analysts.