Semantic Interaction for Visual Analytics: Toward Coupling Cognition and Computation

Abstract
Alex Endert's dissertation "Semantic Interaction for Visual Analytics: Inferring Analytical Reasoning for Model Steering" described semantic interaction, a user interaction methodology for visual analytics (VA). It showed that user interaction embodies users' analytic process and can thus be mapped to model-steering functionality for "human-in-the-loop" system design. The dissertation contributed a framework (or pipeline) that describes such a process, a prototype VA system to test semantic interaction, and a user evaluation to demonstrate semantic interaction's impact on the analytic process. This research is influencing current VA research and has implications for future VA research.


6 July/August 2014 Published by the IEEE Computer Society 0272-1716/14/$31.00 © 2014 IEEE
Dissertation Impact
Editor: Jim Foley
Semantic Interaction for Visual Analytics
Toward Coupling Cognition and Computation
Alex Endert
Paci c Northwest National Laboratory
The world is becoming increasingly instrumented with sensors, monitoring, and other methods for generating data describing social, physical, and natural phenomena. So, data exist that could be analyzed to uncover, or discover, the phenomena from which they were created. However, as the analytic models leveraged to analyze these data continue to increase in complexity and computational capability, how can visualizations and user interaction methodologies adapt and evolve to continue to foster discovery and sensemaking?
User interaction is critical to such visual data exploration’s success because it lets users test assertions, assumptions, and hypotheses about the information, given their prior knowledge about the world. This cognitive process can be generally called sensemaking. Visual analytics (VA) emphasizes sensemaking of large, complex datasets through interactively exploring visualizations generated through a combination of analytic models. (For more on this, see the related sidebar.) So, a central focus is understanding how to leverage human cognition in concert with powerful computation through usable visual metaphors.
My PhD dissertation coined the term semantic interaction in the context of a user interaction methodology for model steering in VA systems [1]. It made three primary contributions. First, it explained the interactions users commonly employ when analyzing text information spatially without computational layout models, and the meaning they externalize into the manually crafted spatial constructs [2,3]. Second, it enabled bidirectionality of spatializations by inverting popular dimension reduction models [4–6]. Finally, it evaluated semantic interaction’s impact on sensemaking through the synchronization of the analytic-model parameters, the visualization, and the users’ insights in the text analysis domain [7].
Semantic Interaction
Semantic interaction aims to enable co-reasoning between the user and the analytic models (coupling cognition and computation) without requiring the user to directly control them. To do this, it utilizes the visual metaphor in two ways:
• the metaphor through which the insights are obtained (that is, the visualization of information created by computational models) and
• the interaction metaphor through which hypotheses and assertions are communicated (that is, interaction occurs within the visual metaphor).
Users directly manipulate data in visualizations; semantic interaction then captures tacit knowledge of the user and steers the underlying analytic models. These models can be adapted incrementally on the basis of the user’s sensemaking process and domain expertise explicated through the user’s interaction. (For semantic interaction design guidelines, see the related sidebar.)
That is, the visualization’s visual constructs expose the underlying analytic models’ parameters. On the basis of common visual metaphors (such as the geographic, spatial metaphor in which proximity approximates similarity), we can infer tacit knowledge of the user’s reasoning by inverting these analytic models. So, users are shielded from the underlying complexities and can interact with their data through a bidirectional visual medium. The interactions users perform in the visualizations to augment the visual encodings within the metaphor enable the inference of their analytic reasoning, which is systematically applied to the underlying models.
The Semantic Interaction Pipeline
The information visualization pipeline in Figure 1 shows how data characteristics are extracted and assigned visual attributes or encodings, ultimately creating a visualization [8]. Visualizations following this pipeline exhibit two primary components of the visual interface: the visualization showing the information and a GUI. The GUI’s graphical controls (sliders, knobs, and so on) let users directly manipulate the parameters they control.
For example, direct manipulation user interfaces let users directly augment the values of data parameters and see the corresponding change in the visualization [9]. (One example is using a slider to set the range of home prices and observing the filtered results in a map showing homes for sale.) This model has been a successful user interaction framework for information visualizations. Figure 2a shows an example of such an interface.
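The home-price slider can be sketched in a few lines of Python. This is only a minimal illustration of a dynamic query, not code from any system named in this article; the `Home` record, the `filter_by_price` helper, and the sample listings are assumptions invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Home:
    address: str  # hypothetical listing label
    price: int    # asking price in dollars


def filter_by_price(homes, low, high):
    """Return the homes whose price falls inside the slider range [low, high]."""
    return [home for home in homes if low <= home.price <= high]


# Made-up listings standing in for the homes shown on the map.
homes = [
    Home("12 Oak St", 180_000),
    Home("7 Pine Ave", 320_000),
    Home("3 Elm Ct", 455_000),
]

# Moving the slider handles re-runs the query; the map redraws `visible`.
visible = filter_by_price(homes, 200_000, 400_000)
print([home.address for home in visible])
```

The user controls a data parameter (the price range) directly, and the visualization only reflects the result. The models discussed next replace that simple parameter with ones that are much harder to reason about.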
VA systems have adopted this approach. However, a distinct difference is the added complexity of the models (and their parameters) being controlled. For example, instead of filtering the data by selecting ranges for home prices, users employ graphical controls over model parameters such as weighting the mixture of eigenvectors of a principal component analysis (PCA) dimension reduction model to produce 2D views of high-dimensional data. To users without expertise in such models, this poses fundamental usability challenges. Figure 2b shows an example of this type of direct manipulation interface.
The semantic interaction pipeline (see Figure 3) directly binds model-steering techniques to the interactive affordances created by the visualization. For example, a distance function used to determine the relative similarity between two data points (often visually depicted using distance in a spatial layout) can be the interactive affordance to let users explore that relationship. So, the user interacts directly with the visual metaphor, creating a bidirectional medium between the user and the analytic models. This interaction method is similar to “by example” interaction because users can directly show their intention using the visualization’s structure. This adds to visualization’s role in the reasoning process, in that it’s not only a method for gaining insight but also one for directly interacting with the information and the system.
The bidirectionality afforded by semantic interaction comes through binding the parameter controls traditionally afforded by the GUI directly within the visual metaphor. Through this binding, the system can infer the user’s analytic reasoning from the user’s interaction with the visualization regarding the underlying mathematical model’s parameters. Specifically, a spatial layout is one visual metaphor in which my colleagues and I have conducted much semantic interaction research [4,6,7].
Semantic Interaction with Spatializations
A spatial visual metaphor (a spatialization) dem-
onstrates the bidirectionality afforded by semantic
interaction. A spatial metaphor lends itself to com-
mon dimension reduction models to reduce the di-
mensionality of complex data to two dimensions.
For example, relationships and similarities be-
tween high-dimensional data objects can be shown
in two dimensions by leveraging such dimension
reduction models as PCA, multidimensional scal-
ing, and force-directed layouts. Generally, these
models try to approximate the distance between
data objects in their true, high-dimensional rep-
resentation using fewer dimensions.
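One common way to quantify how well a low-dimensional layout approximates the high-dimensional distances is a stress score, as used in multidimensional scaling. The sketch below assumes Euclidean distance and raw (unnormalized) stress; the function names and sample points are illustrative only and are not taken from the systems discussed here.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def raw_stress(high_dim, low_dim):
    """Sum of squared gaps between high- and low-dimensional pairwise distances."""
    return sum(
        (euclidean(high_dim[i], high_dim[j]) - euclidean(low_dim[i], low_dim[j])) ** 2
        for i, j in combinations(range(len(high_dim)), 2)
    )


# Three hypothetical data objects in 3D space (they happen to lie on one line).
high_dim = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 8.0, 0.0)]

faithful_layout = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]   # keeps every pairwise distance
collapsed_layout = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]  # discards every pairwise distance

print(raw_stress(high_dim, faithful_layout))
print(raw_stress(high_dim, collapsed_layout))
```

A layout algorithm in this family searches for 2D positions that make such a score small; the faithful layout above scores zero, and the collapsed one scores far higher.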
Figure 1. The information visualization pipeline [8]. (Diagram labels: Data, Algorithm, Visualization, User (perceive), User (interact).) Users can directly interact with the data (for example, filtering or correcting values), algorithm (for example, adjusting weights of relationships or changing parameter values), or visualization (for example, selecting a different encoding or modifying zoom levels).
Figure 2. Examples of two types of direct manipulation interfaces. (a) Spotfire employs direct manipulation for dynamic querying (ranges for data values, such as the portfolio value or number of trades) for information visualization. (b) iPCA applies direct manipulation to visual analytics (VA); for example, directly controlling each dimension’s relative contribution for principal component analysis [10].
Researchers have applied semantic interaction methods to this visual metaphor. For example, inverting PCA, multidimensional scaling, and generative topographic mapping can enable semantic interaction in bidirectional spatializations [4,11]. The ability to understand each model’s parameters that can be exposed through the visual encoding (in this case, the relative distance between data points) enabled this affordance. Further research has explored the tradeoffs between the various ways to map the user feedback of changing the relative distance between data objects to the underlying dimension reduction models [5,12].
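To make the idea of mapping a distance change back onto model parameters concrete, here is a deliberately simple heuristic. It is not the inversion of PCA, multidimensional scaling, or generative topographic mapping described in the cited work. It assumes a weighted Euclidean distance and a hypothetical `pull_together` update that fires when the user drags two points closer: weight shifts toward the dimensions on which the two points already agree, and the weights are renormalized to sum to 1.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def pull_together(a, b, weights, rate=0.5):
    """Hypothetical update for "these two points belong closer together".

    Dimensions where a and b agree gain weight; the dimension where they
    differ most gains none. The result is renormalized to sum to 1.
    """
    diffs = [abs(x - y) for x, y in zip(a, b)]
    max_diff = max(diffs) or 1.0  # avoid dividing by zero for identical points
    agreement = [1.0 - d / max_diff for d in diffs]
    raw = [w * (1.0 + rate * g) for w, g in zip(weights, agreement)]
    total = sum(raw)
    return [r / total for r in raw]


# Two made-up data objects: identical on dimension 0, far apart on dimension 1.
a = (1.0, 0.0, 5.0)
b = (1.0, 4.0, 5.5)
weights = [1 / 3, 1 / 3, 1 / 3]

new_weights = pull_together(a, b, weights)
print(weighted_distance(a, b, weights), weighted_distance(a, b, new_weights))
```

Under the new weights the two objects are closer in model space, so a layout recomputed from those weights would tend to place them nearer each other, along with any other objects that agree on the up-weighted dimensions.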
Impact: Current and Future
Semantic interaction to increase the usability of complex VA systems has evolved along with VA’s growth and maturity as a research discipline. Interactivity has become increasingly important, and users’ attempts to communicate their hypotheses and assertions about the data to foster sensemaking have continued to employ (if not depend on) analytic models. Semantic interaction has helped foster this communication between the user and the model, having an impact beyond that at the time of my dissertation.
Sensemaking is the process of someone acquiring an understanding of the world based on that person’s conceptual model of events, actions, and information. Researchers have developed visual-analytics (VA) systems that support aspects of this process. This support can be characterized by the systems’ user interactions, especially as they pertain to the visual metaphor and underlying models. Sensemaking has two primary subprocesses: foraging and synthesis [1].
Foraging
During foraging, users filter and gather collections of interesting or relevant information. Scientists categorize VA tools that support foraging by their ability to pass data through complex analytic and statistical models and visualize the dataset’s computed structure for the user to gain insight (see Figure A). So, users interact with these tools primarily by directly manipulating the models’ parameters.
For example, interfaces that apply the information visualization interaction methodology of direct manipulation [2] present users with a set of graphical controls (sliders, knobs, and so on) to control and modify the model parameters’ values. In VA tools, understanding these parameters (and the result of changing their values) can be difficult and is often outside the area of expertise for an expert in the specific data domain (for example, genomics and international politics). In these cases, users must translate their domain expertise and semantics about the information to determine which parameters to adjust (and by how much), a fundamental usability concern.
VA tools leverage such models as entity extraction, topic modeling, link analysis, dimensionality reduction, clustering, and labeling. These models use various distance metrics to measure similarity between data objects. You can use these models to spatialize data. For example, you can represent unstructured text as a bag of words, high-dimensional data in which each dimension is a unique keyword or phrase in the text. Visualizations such as IN-SPIRE’s Galaxy View [3] organize points representing text documents such that nearby points represent similar documents. This helps users recognize relationships between documents and between clusters of documents.
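The bag-of-words representation and a similarity measure over it can be sketched as follows. This is a minimal example under stated assumptions: tokens are lowercase whitespace-separated words, similarity is cosine similarity, and the three documents are invented. It does not describe how IN-SPIRE or any other named tool computes similarity.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count each lowercase, whitespace-separated token in the text."""
    return Counter(text.lower().split())


def cosine_similarity(bag_a, bag_b):
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    dot = sum(bag_a[term] * bag_b[term] for term in bag_a.keys() & bag_b.keys())
    norm_a = math.sqrt(sum(count * count for count in bag_a.values()))
    norm_b = math.sqrt(sum(count * count for count in bag_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up documents: the first two share most terms, the third shares none.
docs = [
    "bank robbery suspect arrested",
    "suspect arrested after bank robbery",
    "heavy rain floods river valley",
]
bags = [bag_of_words(doc) for doc in docs]

print(cosine_similarity(bags[0], bags[1]))
print(cosine_similarity(bags[0], bags[2]))
```

A spatialization built on such scores would place the first two documents near each other and the third farther away, which is the "nearby points represent similar documents" reading described above.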
Synthesis
On the basis of the information acquired from foraging, users advance through the synthesis stages. In these stages, they construct and test hypotheses about how the foraged information might relate to their understanding of the world. Synthesis tools let users organize and maintain their hypotheses and insight regarding the data (see Figure B). These tools often employ a flexible, informal spatial medium or canvas.
For example, by organizing spatial layouts, users can
externalize their insights about a dataset on the basis of
the information’s position.
4
Users frequently organize such
layouts by complex schemas and mixed metaphors, often
organized topically according to the semantics relevant
to their analysis needs. Analysts use tools that support
manually constructing spatializations to visually synthesize
hypotheses.
5
That is, they create spatial structures (often
mixing clusters, timelines, connections, geography, order
Visual Analytics for Sensemaking
Figure A. Interaction with foraging tools. Users interact directly with the statistical model (red), then gain insight through observing the change in the visualization (blue).
Figure B. Interaction with synthesis tools. Users manually create a spatial layout of the information to maintain and organize their insights about the data.

IEEE Computer Graphics and Applications 9
Making Insights in Big Data Accessible
The ForceSPIRE system demonstrates how a spatialization of text documents can be the primary interface for user interaction (see Figure 4) [6]. ForceSPIRE uses relative distance to indicate documents’ similarity. It computes the distances through force-directed layout. The single spatial layout is the primary view, through which most interaction occurs. We chose the user interactions specifically to correspond with those found during studies observing users performing text analysis using a spatial metaphor [2,3]. The studies found that users reposition documents, highlight phrases, take notes, and perform text searches while actively reading. ForceSPIRE couples each of these interactions with model updates [6].
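As a rough illustration of how a force-directed layout can turn similarity into distance, the sketch below relaxes a small spring system. It is not ForceSPIRE's implementation: the rule that a spring's rest length is `1 - similarity`, the step size, and the three-document example are all assumptions chosen to keep the example short.

```python
import math


def spring_layout(positions, rest_lengths, step=0.1, iterations=500):
    """Relax a spring system so each pair's distance approaches its rest length.

    positions: list of [x, y] starting coordinates (not all on one line).
    rest_lengths: dict mapping an index pair (i, j) to the desired distance.
    """
    pos = [list(p) for p in positions]
    for _ in range(iterations):
        moves = [[0.0, 0.0] for _ in pos]
        for (i, j), rest in rest_lengths.items():
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            dist = math.hypot(dx, dy) or 1e-9
            # Positive when the spring is stretched, negative when compressed.
            pull = (dist - rest) / dist
            moves[i][0] += step * pull * dx
            moves[i][1] += step * pull * dy
            moves[j][0] -= step * pull * dx
            moves[j][1] -= step * pull * dy
        for p, m in zip(pos, moves):
            p[0] += m[0]
            p[1] += m[1]
    return pos


def distance(a, b):
    """Euclidean distance between two 2D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Hypothetical similarities: documents 0 and 1 are alike, document 2 is not.
similarity = {(0, 1): 0.8, (0, 2): 0.0, (1, 2): 0.0}
rest_lengths = {pair: 1.0 - s for pair, s in similarity.items()}

layout = spring_layout([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], rest_lengths)
print(distance(layout[0], layout[1]), distance(layout[0], layout[2]))
```

If a user interaction changes the similarity between two documents, rerunning the relaxation with the new rest lengths moves them in the layout, which is one simple way to picture interactions being coupled with model updates.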
My colleagues and I directly extended the findings from this research into work in analyzing large volumes of text. We used multiple tiers and styles of analytic and mathematical models to process and retrieve data, extract features, and so on. Each of these stages in the data-processing pipeline presents opportunities to steer the model on the basis of the inference of the user interaction [13]. For example, a challenge in large data volumes is retrieving only the most relevant subset of the data to maintain locally and visualize. Thus, how can semantic interaction steer information retrieval techniques to locally maintain and visualize only the most relevant information with respect to the user’s analytic process? Many such techniques can benefit
of discovery, process waypoints, and so on) that
carry meaning to them regarding their sense-
making process.
Such informal relationships in the spatial layout are beneficial because they don’t require users to overformalize relationships too early in the process. This gradual increase in relationship formality is called incremental formalism [6]. This approach directly presents the user interaction to users both in the visual metaphor and on the data. So, the users can leverage their domain expertise to make sense of the information.
References
1. P. Pirolli and S.K. Card, “Information Foraging Models of Browsers for Very Large Document Spaces,” Proc. 1998 Int’l Working Conf. Advanced Visual Interfaces (AVI 98), 1998, pp. 83–93.
2. B. Shneiderman, “Direct Manipulation: A Step beyond Programming Languages,” Computer, vol. 16, no. 8, 1983, pp. 57–69.
3. A. Endert, P. Fiaux, and C. North, “Semantic Interaction for Visual Text Analytics,” Proc. 2012 SIGCHI Conf. Human Factors in Computing Systems (CHI 12), 2012, pp. 473–482.
4. C. Andrews, A. Endert, and C. North, “Space to Think: Large, High-Resolution Displays for Sensemaking,” Proc. 2010 SIGCHI Conf. Human Factors in Computing Systems (CHI 10), 2010, pp. 55–64.
5. A. Endert et al., “The Semantics of Clustering: Analysis of User-Generated Spatializations of Text Documents,” Proc. Int’l Working Conf. Advanced Visual Interfaces (AVI 12), 2012.
6. F.M. Shipman III and C.C. Marshall, “Formality Considered Harmful: Experiences, Emerging Themes, and Directions on the Use of Formal Representations in Interactive Systems,” Computer Supported Cooperative Work, vol. 8, no. 4, 1999, pp. 333–352.
Figure 3. The semantic interaction pipeline. (Diagram labels: Hard data, Soft data, Algorithm (project), Spatialization, User (perceive), User (interact), Algorithm (interpret).) Users interact directly with the visualization, from which inferences are made to update the model or algorithm. Semantic interaction uses the stored “soft data” in conjunction with the “hard data” (raw data) to incorporate the user’s expertise into the VA system.
Figure 4. With ForceSPIRE, users can search, highlight, annotate, and reposition documents spatially. Documents can appear as minimized rectangles (see the yellow, blue, and teal rectangles in the enlarged region at the bottom) and as full-detail windows (resizable by the user). ForceSPIRE makes model inferences on each user interaction, creating machine and human co-reasoning.

from the information inferred about the user to
more accurately query within, and across, databases
containing relevant information. That is, how can
semantic interaction scale the inferred reasoning of
the user into the larger data volumes through the
malleability of information retrieval techniques?
Furthermore, this might require additional visual
representations (or aggregations) of information.
Semantic interaction has impacted projects at Pacific Northwest National Laboratory that stem from user needs to understand these large volumes of text data. Semantic interaction’s capability to capture the analytic reasoning associated with a user interaction and amplify that reasoning into the analytic model lets users extend their reach and coverage into the larger data scales. These users’ domain expertise generally does not include knowledge in statistics or the data sciences. So, placing their user interaction directly onto the visual data representations enables them to reason on the data using the visualization and to communicate their hypotheses and assertions directly in the visualization. Anecdotal feedback from these users has been positive, with a user evaluation in progress. Similarly, research at Virginia Tech is investigating how semantic interaction can help steer information retrieval techniques to address big-data challenges. This research is fundamentally advancing our understanding of semantic interaction and evolving ForceSPIRE as a testbed for prototyping and evaluating specific pairings of user interaction and computation.
Semantic interaction techniques have also af-
fected big-data challenges that emphasize a vari-
ety of data (for example, multimedia). Phenomena
that are captured, collected, and encoded digitally
often span multiple media types. So, promoting
sensemaking through VA technologies often re-
quires users to reason across multiple media types.
One challenge with such heterogeneous datasets is
to correlate, or fuse, the data types’ feature spaces
that represent a cognitively cohesive concept or
topic. Through inferring the higher-level analytic
reasoning from user interaction tailored toward
each of these data types, the opportunity exists to
successfully decode phenomena whose discovery
and understanding require multiple data types.
From Streaming Data to Streaming Insights
The continuous sensing and collecting of information poses streaming-data challenges and opportunities. A specific challenge is how to understand evolving and changing phenomena in real time. In terms of steering and adapting the underlying models using semantic interaction, challenges exist regarding the temporal nature of the data and the reasoning process. As users generate hypotheses and reason about the data, how can the models interpret the temporal nature of those hypotheses and assertions? How can VA systems working with streaming data understand the temporal importance of what information to retain and what to delete as a user progresses through sensemaking?
Researchers are applying semantic interaction
to streaming-data challenges (following the last
design guideline in the sidebar “Semantic Interac-
tion Design Guidelines”). Instead of using seman-
tic interaction to understand the features users
are interested in over time, the goal here might
be to understand the features or data that users
Here are guidelines for semantic interaction for spatializations [1]:
• A visual “near = similar” metaphor supports analysts’ spatial cognition and is generated by statistical models and similarity metrics [2].
• Use semantic interactions within the visual metaphor, based on common interactions occurring in spatial analytic processes [3] such as searching, highlighting, annotating, and repositioning documents.
• Interpret and map the semantic interactions to the model’s underlying parameters, by updating weights and adding information.
• Shield users from the complexity of the underlying mathematical models and parameters.
• Models should learn incrementally by taking into account interaction during the entire analytic process, supporting analysts’ process of incremental formalism [4].
• Provide visual feedback of the updated model and learned parameters within the visual metaphor.
• Reuse learned model parameters in streaming data or future data.
References
1. A. Endert, “Semantic Interaction for Visual Analytics: Inferring Analytical Reasoning for Model Steering,” PhD dissertation, Dept. Computer Science, Virginia Tech, 2012.
2. A. Skupin, “A Cartographic Approach to Visualizing Conference Abstracts,” IEEE Computer Graphics and Applications, vol. 22, no. 1, 2002, pp. 50–58.
3. A. Endert, P. Fiaux, and C. North, “Semantic Interaction for Visual Text Analytics,” Proc. 2012 SIGCHI Conf. Human Factors in Computing Systems (CHI 12), 2012, pp. 473–482.
4. F.M. Shipman III and C.C. Marshall, “Formality Considered Harmful: Experiences, Emerging Themes, and Directions on the Use of Formal Representations in Interactive Systems,” Computer Supported Cooperative Work, vol. 8, no. 4, 1999, pp. 333–352.
Semantic Interaction Design Guidelines
