Vega-Lite: A Grammar of Interactive Graphics
Arvind Satyanarayan, Dominik Moritz, Kanit Wongsuphasawat, and Jeffrey Heer
Fig. 1. Example visualizations authored with Vega-Lite. From left-to-right: layered line chart combining raw and average values,
dual-axis layered bar and line chart, brushing and linking in a scatterplot matrix, layered cross-filtering, and an interactive index chart.
Abstract—We present Vega-Lite, a high-level grammar that enables rapid specification of interactive data visualizations. Vega-Lite
combines a traditional grammar of graphics, providing visual encoding rules and a composition algebra for layered and multi-view
displays, with a novel grammar of interaction. Users specify interactive semantics by composing selections. In Vega-Lite, a selection
is an abstraction that defines input event processing, points of interest, and a predicate function for inclusion testing. Selections
parameterize visual encodings by serving as input data, defining scale extents, or by driving conditional logic. The Vega-Lite compiler
automatically synthesizes requisite data flow and event handling logic, which users can override for further customization. In contrast
to existing reactive specifications, Vega-Lite selections decompose an interaction design into concise, enumerable semantic units.
We evaluate Vega-Lite through a range of examples, demonstrating succinct specification of both customized interaction methods
and common techniques such as panning, zooming, and linked selection.
Index Terms—Information visualization, interaction, systems, toolkits, declarative specification
1 INTRODUCTION
Grammars of graphics span a gamut of expressivity. Low-level gram-
mars such as Protovis [3], D3 [4], and Vega [22] are useful for ex-
planatory data visualization or as a basis for customized analysis
tools, as their primitives offer fine-grained control. However, for ex-
ploratory visualization, higher-level grammars such as ggplot2 [27],
and grammar-based systems such as Tableau (née Polaris [24]), are
typically preferred as they favor conciseness over expressiveness. An-
alysts rapidly author partial specifications of visualizations; the gram-
mar applies default values to resolve ambiguities, and synthesizes low-
level details to produce visualizations.
High-level languages can also enable search and inference over the
space of visualizations. For example, Wongsuphasawat et al. [30] in-
troduced Vega-Lite to power the Voyager visualization browser. By
providing a smaller surface area than the lower-level Vega language,
Vega-Lite makes systematic enumeration and ranking of data transfor-
mations and visual encodings more tractable.
However, existing high-level languages provide limited support for
interactivity. An analyst can, at most, enable a predefined set of com-
mon techniques (linked selections, panning & zooming, etc.) or pa-
rameterize their visualization with dynamic query widgets [21]. For
custom, direct-manipulation interaction they must instead turn to im-
perative event handling callbacks. Recognizing that callbacks can be
error-prone to author, and require complex static analysis to reason
about, Satyanarayan et al. [23] recently formulated declarative interac-
tion primitives for Vega. While these additions facilitate programmatic
generation and retargeting of interactive visualizations, they remain
low-level. Verbose specification impedes rapid authoring and hinders
systematic exploration of alternative designs.
Arvind Satyanarayan is with Stanford University. E-mail:
arvindsatya@cs.stanford.edu.
Dominik Moritz, Kanit Wongsuphasawat, and Jeffrey Heer are with the
University of Washington. E-mails: {domoritz, kanitw, jheer}@uw.edu.
In this paper we extend Vega-Lite to enable concise, high-level
specification of interactive data visualizations. To support expressive
interaction methods, we first contribute an algebra to compose single-
view Vega-Lite specifications into multi-view displays using layer,
concatenate, facet and repeat operators. Vega-Lite’s compiler infers
how input data should be reused across constituent views, and whether
scale domains should be unioned or remain independent.
Second, we contribute a high-level interaction grammar. With
Vega-Lite, an interaction design is composed of selections: visual el-
ements or data points that are chosen when input events occur. Selec-
tions parameterize visual encodings by serving as input data, defining
scale extents, and providing predicate functions for testing or filtering
items. For example, a rectangular “brush” is a common interaction
technique for data visualization. In Vega-Lite, a brush is defined as a
selection that holds two data points that correspond to its extents (e.g.,
captured when the mouse button is pressed and as it is dragged, re-
spectively). Its predicate can be used to highlight visual elements that
fall within the brushed region, and to materialize a dataset as input to
other encodings. The selection can also serve as the scale domain for a
secondary view, thereby constructing an overview + detail interaction.
For added expressivity, Vega-Lite provides a series of operators to
transform a selection. Transforms can be triggered by input events as
well, and manipulate selection points or predicate functions. For ex-
ample, a toggle transform adds or removes a point from the selection,
while a project transform modifies the predicate to define inclusion
over specified data fields.
The Vega-Lite compiler synthesizes a low-level Vega specifica-
tion [22] with the requisite data flow, and default event handling logic
that a user can override. Through a range of examples, we demon-
strate that Vega-Lite brings the advantages of high-level specification
to interactive visualization. Common methods, including linked selec-
tion, panning, and zooming, as well as custom techniques (drawn from
an established taxonomy [31]) can be concisely described. Moreover,
selections, transformations, and their application to visual encodings
decompose interaction into a parametric design space. We show how
each of these parameters can be systematically varied to generate
alternate interaction techniques for a given set of visual encodings. Such
enumeration can be useful to explore alternative designs, and can aid
higher-level reasoning about interaction; for example, by recommending
suitable interaction techniques as part of a design tool.
2 RELATED WORK
Vega-Lite builds on prior work on grammars of graphics, visualization
systems, and techniques for interactive selection and querying.
2.1 Grammar-Based Visual Encoding
Since the initial publication of Wilkinson’s The Grammar of Graph-
ics [29] in 1999, formal grammars for statistical graphics have grown
increasingly popular as a way to succinctly specify visualizations.
Wilkinson’s work was quickly followed by the Stanford Polaris sys-
tem [24], later commercialized as Tableau. Hadley Wickham’s popular
ggplot2 [27] and ggvis [20] packages implement variants of Wilkin-
son’s model in the R statistical language. These tools eschew chart
templates, which offer limited means of customization, in favor of
combinatorial building blocks. Abstracting data models, graphical
marks, visual encoding channels, scales and guides (i.e., axes and leg-
ends) yields a more expressive design space, and allows analysts to
rapidly construct graphics for exploratory analysis [13]. Concise spec-
ification is achieved in part through ambiguity: users may omit details
such as scale transforms (e.g., linear or log) or color palettes, which
are then filled in using a rule-based system of smart defaults. More
expressive lower-level (and thus more verbose) grammars, including
those of Protovis [3], D3 [4], and Vega [22], have been widely used
for creating explanatory and highly-customized graphics.
The design of Vega-Lite is heavily influenced by these works.
Drawing from Wilkinson’s grammar and Polaris/Tableau, Vega-Lite
similarly represents basic plots using a set of encoding definitions that
map data attributes to visual channels such as position, color, shape,
and size, and may include common data transformations such as bin-
ning, aggregation, sorting, and filtering. Drawing from Vega, Vega-
Lite uses a portable JSON (JavaScript Object Notation) syntax that
permits generation from a variety of programming languages. Vega-
Lite specifications are compiled to full Vega specifications, hence the
expressive gamut of Vega-Lite is a strict subset of that of Vega. As we
will later demonstrate, Vega-Lite sacrifices some expressiveness for
dramatic gains in the conciseness and clarity of specification.
In terms of visual encoding, Vega-Lite differs most from other high-
level grammars in its approach to multiple view displays. Each of
these grammars supports faceting (or nesting) to construct trellis plots
in which each cell similarly visualizes a different partition of the data.
Both Wilkinson’s grammar and Polaris/Tableau achieve this through a
table algebra over data fields, which in turn determines spatial sub-
divisions. Tableau additionally supports the construction of multi-
view dashboards via a different mechanism, with each view backed
by a separate specification. In contrast, we contribute a view alge-
bra: starting with unit specifications that define a single plot, Vega-
Lite expresses composite views using operators for layering, horizon-
tal or vertical concatenation, faceting, and parameterized repetition.
When applicable, these operators will merge scale domains and prop-
erly align constituent views. Disparate views can also be combined
into arbitrary dashboards, all within a unified algebraic model.
2.2 Specifying Interactions in Visualization Systems
Despite the central role of interaction in effective data visualization
[13, 19], little work has been done to develop a grammar for specify-
ing interaction techniques. Wilkinson’s grammar includes no notion
of interaction. Tableau supports common interaction techniques, but
relies on mechanisms external to the visual encoding grammar. Early
systems like GGobi [25] support common techniques as well, and pro-
vide imperative APIs for custom methods. However, such APIs make
easy tasks needlessly complex, burdening developers with learning
low-level execution details. More recent systems, including Protovis,
D3, and VisDock [7], offer a typology of common techniques that can
be applied to a visualization. Such top-down approaches, however,
limit customization and composition. For example, D3’s interactors
encapsulate event processing, making it difficult to combine them if
their events conflict (e.g., if dragging triggers brushing and panning).
The prior work perhaps most closely related to Vega-Lite is the Re-
active Vega language [23]. Reactive Vega draws on Functional Reac-
tive Programming techniques to formulate composable, declarative in-
teraction primitives for data visualization. Reactive Vega models input
events as continuous data streams. To succinctly define event streams
of interest, Vega employs an event selector syntax, which Vega-Lite
also uses for customized event logic. Event streams, in turn, drive
dynamic variables called signals. Signals parameterize the remainder
of the visualization specification, endowing it with reactive semantics.
When a new event fires, it propagates to dependent signals; visual en-
codings that use them are automatically re-evaluated and re-rendered.
This reactive approach is not only capable of expressing a diverse set
of interactions [23], it is performant as well [22], with interactive per-
formance at least twice as fast as the equivalent D3 program.
However, the resulting reactive specifications are low-level and ver-
bose. Specifying common techniques can be time-consuming, requir-
ing tens of lines of JSON, and it is difficult to know how to adapt
techniques in pursuit of alternative designs. In contrast, Vega-Lite is
a higher-level specification language, with primitives that decompose
interaction design into a parametric space. Common methods require
typically 1-2 lines of code, and design variations can be explored by
systematically enumerating defined properties. Nevertheless, Reac-
tive Vega provides a performant runtime and an “assembly language”
to which Vega-Lite specifications are compiled.
2.3 Interactive Selection and Querying
Selection, often in the form of users clicking or lassoing visual items
of interest, is a fundamental operation in user interfaces and has
been well-studied in the context of data visualization. For example,
in Snap-Together Visualization [17], multiple views are coordinated
via “primary-” and “foreign-key” actions, which propagate selected
data tuples from one view to the others. Wilhelm [28] describes the
need for such “indirect object manipulation” methods as an axiom
of interactive data displays. Chen’s compound brushing [6] provides
a visual dataflow language for specifying a rich space of transfor-
mations of brush selections. More recently, Brunel [5] provides a
special #selection data field that is dynamically populated with
the elements a user interacts with, and can be used to link multi-
ple views or filter input data. Similarly, RStudio’s Shiny [21], an
imperative web application layer, provides brushedPoints and
nearestPoints functions which can be used throughout an R
script to operate on selected elements.
Other systems have studied formally representing selections as data
queries [28]. For example, brushing interactions in VQE [9] generate
extensional queries that enumerate all items of interest; a form-based
interface enables specification of intensional (declarative) queries. In-
dividual point and brush selections in DEVise [15], known as visual
queries, map to a declarative structure and are used to link together
multiple views. With VIQING [18], rectangular “rubber band” selec-
tions are modeled as range extents, and views can be dropped on top
of each other to join their underlying datasets. Heer et al. [12] demon-
strate that by modeling a selection as a declarative query, interactive
“query relaxation” can successively capture more items of interest.
Vega-Lite builds on this work by richly integrating an interactive
selection abstraction with the primitives of visual encoding grammars.
Vega-Lite selections are populated with one or more points of interest,
in response to user interaction. Extensible predicate functions map se-
lections to declarative queries, and allow a minimal set of “backing”
points to represent the full space of selected points. Additional op-
erators can transform a selection’s predicate or backing points (e.g.,
offsetting them to translate a brush selection or perform panning).
Selections then parameterize visual encodings by serving as input data,
defining scale extents, or using predicates to test or filter items. The
end result is an enumerable, combinatorial design space of interac-
tive statistical graphics, with concise specification of not only linking
interactions, but panning, zooming, and custom techniques as well.

(a) Line chart with aggregation
{
  "data": {"url": "data/weather.csv", "formatType": "csv"},
  "mark": "line",
  "encoding": {
    "x": {"field": "date", "type": "temporal", "timeUnit": "month"},
    "y": {"field": "temp_max", "type": "quantitative", "aggregate": "mean"},
    "color": {"field": "location", "type": "nominal"}
  }
}
(b) Correlation between wind and temperature
{
  "data": {"url": "data/weather.csv", "formatType": "csv"},
  "mark": "point",
  "encoding": {
    "x": {"field": "temp_max", "type": "quantitative", "bin": true},
    "y": {"field": "wind", "type": "quantitative", "bin": true},
    "size": {"field": "*", "aggregate": "count"},
    "color": {"field": "location", "type": "nominal"}
  }
}
(c) Stacked bar chart of weather types
{
  "data": {"url": "data/weather.csv", "formatType": "csv"},
  "mark": "bar",
  "encoding": {
    "x": {"field": "location", "type": "nominal"},
    "y": {"field": "*", "type": "quantitative", "aggregate": "count"},
    "color": {"field": "weather", "type": "nominal"}
  }
}
Fig. 2. Vega-Lite unit specifications visualizing weather data. These examples demonstrate varied mark types and data transformations.
3 THE VEGA-LITE GRAMMAR OF GRAPHICS
Vega-Lite combines a grammar of graphics with a novel grammar of
interaction. In this section, we describe Vega-Lite’s basic visual en-
coding constructs and an algebra for view composition. In prior work,
Wongsuphasawat et al. [30] introduced the simplest Vega-Lite specification
(here referred to as a unit specification), which defines a single
Cartesian plot with a specific mark type to encode data (e.g., bars,
lines, plotting symbols). Given multiple unit plots, we introduce layer,
concat, facet, and repeat operators to provide an algebra for construct-
ing composite views. This algebra can express layered plots, trellis
plots, and arbitrary multiple view displays. Each operator is responsi-
ble for combining or aligning underlying scales and axes as needed.
3.1 Unit Specification
A unit specification describes a single Cartesian plot, with a backing
data set, a given mark-type, and a set of one or more encoding def-
initions for visual channels such as position (x, y), color, size, etc.
Formally, a unit view consists of a four-tuple:
unit := (data, transforms, mark-type, encodings)
The data definition identifies a data source, a relational table con-
sisting of records (rows) with named attributes (columns). This data ta-
ble can be subject to a set of transforms, including filtering and adding
derived fields via formulas. The mark-type specifies the geometric ob-
ject used to visually encode the data records. Legal values include bar,
line, area, text, rule for reference lines, and plotting symbols (point &
tick). The encodings determine how data attributes map to the proper-
ties of visual marks. Formally, an encoding is a seven-tuple:
encoding := (channel, field, data-type, value, functions, scale, guide)
Available visual encoding channels include spatial position (x, y),
color, shape, size, and text. An order channel controls sorting of
stacked elements (e.g., for stacked bar charts and the layering order of
line charts). A path order channel determines the sequence in which
points of a line or area mark are connected to each other. A detail
channel includes additional group-by fields in aggregate plots.
The field string denotes a data attribute to visualize, along with a
given data-type (one of nominal, ordinal, quantitative or temporal).
Alternatively, one can specify a constant literal value to serve as the
data field. The data field can additionally be transformed using func-
tions such as binning, aggregation (sum, average, etc.), and sorting.
An encoding may also specify properties of a scale that maps from
the data domain to a visual range, and a guide (axis or legend) for
visualizing the scale. If not specified, Vega-Lite will automatically
populate default properties based on the channel and data-type. For x
and y channels, either a linear scale (for quantitative data) or an ordinal
scale (for ordinal and nominal data) is instantiated, along with an axis.
For color, size, and shape channels, suitable palettes and legends are
generated. For example, quantitative color encodings use a single-
hue luminance ramp, while nominal color encodings use a categorical
palette with varied hues. Our default assignments largely follow the
model of prior systems [24, 30].
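The default rules in the paragraph above can be read as a small lookup. The following is a minimal sketch in Python, not Vega-Lite's actual implementation: the function name and the returned dictionary layout are assumptions made for illustration, and only the cases the text states explicitly are covered.

```python
def default_encoding_properties(channel, data_type):
    """Return illustrative defaults for a (channel, data-type) pair.

    Hypothetical helper: it only encodes the defaults stated in the text
    and raises for combinations the text does not describe.
    """
    if channel in ("x", "y"):
        # Positional channels get a scale plus an axis.
        if data_type == "quantitative":
            return {"scale": "linear", "guide": "axis"}
        if data_type in ("ordinal", "nominal"):
            return {"scale": "ordinal", "guide": "axis"}
    elif channel == "color":
        # Color encodings get a palette plus a legend.
        if data_type == "quantitative":
            return {"palette": "single-hue luminance ramp", "guide": "legend"}
        if data_type == "nominal":
            return {"palette": "categorical", "guide": "legend"}
    raise ValueError(f"default for {channel}/{data_type} is not stated in the text")
```

Calling `default_encoding_properties("x", "nominal")` yields an ordinal scale with an axis, matching the rule for x and y channels.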
Unit specifications are capable of expressing a variety of com-
mon, useful plots of both raw and aggregated data. Examples include
bar charts, histograms, dot plots, scatter plots, line graphs, and area
graphs. Our formal definitions are instantiated in a JSON (JavaScript
Object Notation) syntax, as shown in Fig. 2.
3.2 View Composition Algebra
Given multiple unit specifications, composite views can be created us-
ing a set of composition operators. Here we describe the set of sup-
ported operators. We use the term view to refer to any Vega-Lite spec-
ification, whether it is a unit or composite specification.
3.2.1 Layer
The layer operator accepts multiple unit specifications to produce a
view in which subsequent charts are plotted on top of each other. For
example, a layered view could consist of one layer showing a his-
togram of a full data set, and another overlaying a histogram of a fil-
tered subset (Fig. 11). The signature of the operator is:
layer([unit_1, unit_2, ...], resolve)
To create a layered view, we produce shared scales (if their types
match) and merge guides by default. For example, we compute the
union of the data domains for the x or y channel, for which we then
generate a single scale. We believe this is a useful default for producing
coherent and comparable layers. However, Vega-Lite cannot
enforce that a unioned domain is semantically meaningful. To prohibit
layering of composite views with incongruent internal structures, the
layer operator restricts its operands to be unit views.
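As a rough illustration of the default just described, the sketch below merges per-layer domains for one channel. It assumes each layer's scale is summarized as a (scale type, numeric domain) pair; the function name, this data layout, and the `resolution` argument are hypothetical and do not mirror the compiler's internals.

```python
def resolve_layer_domains(layer_scales, resolution="union"):
    """Return one domain per resulting scale for a single channel.

    layer_scales: list of (scale_type, (lo, hi)) pairs, one per layer.
    With "union" and matching scale types, the layers share one scale
    whose domain covers every layer; otherwise each layer keeps its own.
    """
    scale_types = {scale_type for scale_type, _ in layer_scales}
    if resolution == "union" and len(scale_types) == 1:
        lo = min(domain[0] for _, domain in layer_scales)
        hi = max(domain[1] for _, domain in layer_scales)
        return [(lo, hi)]
    return [domain for _, domain in layer_scales]
```

For two linear y-scales with domains (0, 10) and (5, 40), the shared domain is (0, 40). Note that the union is purely numeric: as the text cautions, nothing checks that the merged domain is semantically meaningful.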
To override the default behavior, users can specify strategies to resolve
scales and guides using tuples of the form (channel, scale|guide,
resolution), where resolution is one of independent or union.
Independent scales and guides for each layer produce a dual-axis view, as
shown in the layered plots in Fig. 3(a).
(a) Dual axis layered chart
{
  "layers": [
    {
      "data": {"url": "data/weather.csv", "formatType": "csv"},
      "transform": {"filter": "datum.location === 'Seattle'"},
      "mark": "bar",
      "encoding": {
        "x": {"field": "date", "type": "temporal", "timeUnit": "month"},
        "y": {
          "field": "precipitation", "type": "quantitative",
          "aggregate": "mean", "axis": {"grid": false} },
        "color": {"value": "#77b2c7"}
      }
    }, {
      "data": {"url": "data/weather.csv", "formatType": "csv"},
      "transform": {"filter": "datum.location === 'Seattle'"},
      "mark": "line",
      "encoding": {
        "x": {"field": "date", "type": "temporal", "timeUnit": "month"},
        "y": {
          "field": "temp_max", "type": "quantitative",
          "aggregate": "mean", "axis": {"grid": false} },
        "color": {"value": "#ce323c"}
      }
    }
  ],
  "resolve": {
    "y": {"scale": "independent"}
  }
}
(b) Vertical concatenation of two charts
{
  "vconcat": [
    { ... },
    {
      "data": {"url": "data/weather.csv", "formatType": "csv"},
      "transform": {"filter": "datum.precipitation > 0"},
      "mark": "point",
      "encoding": {
        "y": {"field": "location", "type": "nominal"},
        "x": {"field": "*", "type": "quantitative", "aggregate": "count"},
        "color": {"field": "date", "type": "temporal", "timeUnit": "year"}
      }
    }
  ]
}
Fig. 3. (a) A dual axis chart that layers lines for temperature on top of bars for precipitation; each layer uses an independent y-scale. (b) The
temperature line chart from Fig. 2(a) concatenated with rainy day counts in New York and Seattle; scales and guides for each plot are independent.

(a) Faceted charts
{
  "data": {"url": "data/weather.csv", "formatType": "csv"},
  "facet": {
    "column": {"field": "location", "type": "nominal"}
  },
  "spec": {
    "mark": "line",
    "encoding": {
      "x": { ... },
      "y": { ... },
      "color": { ... }
    }
  }
}
(b) Repeated charts
{
  "repeat": {
    "column": ["temp_max", "precipitation"]
  },
  "spec": {
    "data": {"url": "data/weather.csv", "formatType": "csv"},
    "mark": "line",
    "encoding": {
      "x": { ... },
      "y": {
        "field": {"repeat": "column"},
        "type": "quantitative",
        "aggregate": "mean"
      },
      "color": { ... }
    }
  }
}
Fig. 4. (a) Weather data faceted by location; the y-axis is shared, and the underlying scale domains unioned, to enable easier comparison.
(b) Repetition of different measures across columns; the y channel references the column template parameter to vary the encoding.
3.2.2 Concatenation
To place views side-by-side, Vega-Lite provides operators for horizon-
tal and vertical concatenation. The signatures for these operators are:
hconcat([view_1, view_2, ...], resolve)
vconcat([view_1, view_2, ...], resolve)
If aligned spatial channels have matching data fields (e.g., the y
channels in an hconcat use the same field), a shared scale and axis
are used. Axis composition facilitates comparison across views and
optimizes the underlying implementation. Fig. 3(b) concatenates the
line chart from Fig. 2(a) with a dot plot, using independent scales.
3.2.3 Facet
While concatenation allows composition of arbitrary views, one often
wants to set up multiple views in a parameterized fashion. The facet
operator produces a trellis plot [1] by subsetting the data by the distinct
values of a field. The signature of the facet operator is:
facet(channel, data, field, view, scale, axis, resolve)
The channel indicates if sub-plots should be laid out vertically (row)
or horizontally (column). The given data source is partitioned using
distinct values of the field. The view specification provides a template
for the sub-plots, inheriting the backing data for each partition from
the operator. The scale and axis parameters specify how sub-plots are
positioned and labeled. Fig. 4(a) demonstrates faceting into columns.
To facilitate comparison, scales and guides for quantitative fields
are shared by default. This ensures that each facet visualizes the same
data domain. However, for ordinal scales we generate independent
scales by default to avoid unnecessary inclusion of empty categories,
akin to Polaris’ nest operator. When faceting by fiscal quarter and
visualizing per-month data in each cell, one likely wishes to see three
months per quarter, not twelve months of which nine are empty. Users
can override the default behavior via the resolve component.
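The faceting behavior described above can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not the compiler's logic: records are plain dictionaries, and the function names and the example field names (`quarter`, `month`, `sales`) are hypothetical.

```python
from collections import defaultdict


def facet(records, field):
    """Partition records by the distinct values of `field`."""
    partitions = defaultdict(list)
    for row in records:
        partitions[row[field]].append(row)
    return dict(partitions)


def shared_quantitative_domain(partitions, measure):
    """One (min, max) domain spanning every partition (the default for quantitative fields)."""
    values = [row[measure] for rows in partitions.values() for row in rows]
    return (min(values), max(values))


def independent_ordinal_domains(partitions, category):
    """Per-partition domains holding only the categories present in that cell."""
    return {
        key: list(dict.fromkeys(row[category] for row in rows))
        for key, rows in partitions.items()
    }
```

With per-month records faceted by `quarter`, the independent ordinal domains contain only the months that occur in each quarter, which is the "three months, not twelve" behavior the text motivates.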
3.2.4 Repeat
The repeat operator generates multiple plots, but unlike facet allows
full replication of a data set in each cell. For example, repeat can be
used to create a scatterplot matrix (SPLOM), where each cell shows a
different 2D projection of the same data table. The signature is:
repeat(channel, values, scale, axis, view, resolve)
Similar to facet, the channel parameter indicates if plots should di-
vide by row or column. Rather than partition data according to a field,
this operator generates one plot for each entry in a list of values.
Encodings within the repeated view specification can refer to this
provided value to parameterize the plot.¹ By default, scales and axes are
independent, but legends are shared when data fields coincide. Like
facet, the scale and axis components allow users to override defaults
for how sub-plots are positioned and labeled, while resolve controls
resolution of scales and guides within the plots themselves.
¹ As the repeat operator requires parameterization of the inner view, it is
not strictly algebraic. It is possible to achieve algebraic “purity” via explicit
repeated concatenation or by reformulating the repeat operator (e.g., by including
rewrite rules that apply to the inner view specification). However, we believe
the current syntax to be more usable and concise than these alternatives.
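One way to picture the repeat operator is as template expansion: each entry in the list of values is substituted wherever the inner view references the repeated channel. The sketch below assumes the `{"repeat": <channel>}` reference form shown in Fig. 4(b); the function name is hypothetical and the real compiler may work quite differently.

```python
def expand_repeat(channel, values, template):
    """Return one view per value, replacing {"repeat": channel} references."""

    def substitute(node, value):
        if isinstance(node, dict):
            if node == {"repeat": channel}:
                return value  # the template parameter itself
            return {key: substitute(child, value) for key, child in node.items()}
        if isinstance(node, list):
            return [substitute(child, value) for child in node]
        return node  # strings, numbers, booleans pass through unchanged

    # New dictionaries are built on the way down, so the template is not mutated.
    return [substitute(template, value) for value in values]
```

Applied to the template from Fig. 4(b) with values `["temp_max", "precipitation"]`, this produces two views whose y fields are `temp_max` and `precipitation` respectively. It also shows why the operator is not strictly algebraic: it must look inside the inner view.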
3.3 Nested Views
Composition operators can be combined to create more complex
nested views or dashboards, with the output of one operator serving as
input to a subsequent operator. For instance, a layer of two unit views
might be repeated, and then concatenated with a different unit view.
The one exception is the layer operator, which, as previously noted,
only accepts unit views to ensure consistent plots. For concision,
two-dimensional faceted or repeated layouts can be achieved by applying
the operators to the row and column channels simultaneously. When
faceting a composite view, only the dataset targeted by the operator is
partitioned; any other datasets specified in sub-views are replicated.
4 THE VEGA-LITE GRAMMAR OF INTERACTION
To support specification of interaction techniques, Vega-Lite extends
the definition of unit specifications to also include a set of selections.
Selections identify the set of points a user is interested in manipulat-
ing. In this section, we define the components of a selection, describe
a series of transforms for modifying selections, and detail how selec-
tions can parameterize visual encodings to make them interactive.
4.1 Selection Components
We formally define a selection as an eight-tuple:
selection := (name, type, predicate, domain|range,
event, init, transforms, resolve)
When an input event occurs, the selection is populated with backing
points of interest. These points are the minimal set needed to identify
all selected points. The selection type determines how many backing
values are stored, and how the predicate function uses them to deter-
mine the set of selected points. Supported types include a single point,
a list of points, or an interval of points.
A point selection is backed by a single datum, and its predicate tests
for an exact match against properties of this datum. It can also function
like a dynamic variable (or signal in Vega [23]), and can be invoked
as such. For example, it can be referenced by name within a filter ex-
pression, or its values used directly for particular encoding channels.
List selections, on the other hand, are backed by datasets into which
points are inserted, modified or removed as events fire. Lists express
discrete selections, as their predicates test for an exact match with at
least one value in the backing dataset. The order of points in a list
selection can be semantically meaningful, for example when a list se-
lection serves as an ordinal scale domain. Fig. 5 illustrates how points
are highlighted in a scatterplot using point and list selections.
Intervals are similar to list selections. They are backed by datasets,
but their predicates determine whether an argument falls within the
minimum and maximum extent defined by the backing points. Thus,
they express continuous selections. The compiler automatically adds
a rectangle mark, as shown in Fig. 6(a), to depict the selected inter-
val. Users can customize the appearance of this mark via the brush
keyword, or disable it altogether when defining the selection.
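The three selection types differ mainly in how their predicates use the backing points. The following Python sketch illustrates that difference under simplifying assumptions: data tuples are dictionaries, the function names are hypothetical, and the optional `fields` argument loosely imitates the project transform by restricting the comparison to specified data fields.

```python
def point_predicate(backing, fields=None):
    """Exact match against the single backing datum."""
    keys = list(fields) if fields is not None else list(backing)

    def test(datum):
        return all(datum.get(key) == backing[key] for key in keys)

    return test


def list_predicate(backing_points, fields=None):
    """Exact match with at least one datum in the backing dataset."""
    tests = [point_predicate(point, fields) for point in backing_points]
    return lambda datum: any(t(datum) for t in tests)


def interval_predicate(backing_points, fields):
    """Inclusion within the min/max extent spanned by the backing points."""

    def test(datum):
        for field in fields:
            lo = min(point[field] for point in backing_points)
            hi = max(point[field] for point in backing_points)
            if not lo <= datum[field] <= hi:
                return False
        return True

    return test
```

For instance, `point_predicate(clicked, ["Origin"])` accepts every datum sharing the clicked datum's `Origin`, while `interval_predicate` needs only two backing points (the brush corners) to test any number of data tuples.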
{
  "data": {"url": "data/cars.json"},
  "mark": "circle",
  "select": {
    "id": {"type": "point"}
  },
  "encoding": {
    "x": {"field": "Horsepower", "type": "Q"},
    "y": {"field": "MPG", "type": "Q"},
    "color": [
      {"if": "id", "field": "Origin", "type": "N"},
      {"value": "grey"}
    ],
    "size": {"value": 100}
  }
}
(a) Highlight a single point on click
"id": {"type": "list", "toggle": true}
(b) Highlight a list of individual points
"id": {"type": "list", "on": "mouseover", "toggle": true}
(c) "Paintbrush": highlight multiple points on hover
"id": {"type": "point", "project": {"fields": ["Origin"]}}
(d) Highlight a single Origin
"select": {
  "id": {"type": "list", "toggle": true, "project": {"fields": ["Origin"]}}
}, ...
(e) Highlight a list of Origins
Fig. 5. (a) Adding a single point selection to parameterize the fill color
of a scatterplot’s circle mark. (b) Switching to a list selection, with the
toggle transform automatically added (true enables default shift-click
event handling). (c) Specifying a custom event trigger: the first point is
selected on mouseover and subsequent points when the shift key is pressed
(customizable via the toggle transform). (d) Using the project transform
with a single-point selection to highlight all points with a matching
Origin, and (e) combining it with a list selection to select multiple
Origins.

Predicate functions enable a minimal set of backing points to represent
the full space of selected points. For example, with predicates,
an interval selection need only be backed by two points: the minimum
and maximum values of the interval. While selection types provide
default definitions, predicates can be customized to concisely specify
an expressive space of selections. For example, a single point selection
with a custom predicate of the form datum.binned_price
== selection.binned_price is sufficient for selecting all data
points that fall within a given bin.
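A custom predicate of this kind can be sketched as follows. The helper names `bin_start` and `make_bin_predicate`, the fixed bin width, and the `price` field are assumptions made for illustration; the paper only gives the form of the predicate expression.

```python
def bin_start(value: float, step: float) -> float:
    """Lower edge of the bin containing `value` (bins of width `step` from 0)."""
    return (value // step) * step


def make_bin_predicate(selected: dict, field: str, step: float):
    """Build a predicate that accepts every datum in the selected datum's bin.

    Mirrors `datum.binned_price == selection.binned_price`: only one backing
    point is stored, but the predicate widens the match to the whole bin.
    """
    target = bin_start(selected[field], step)
    return lambda datum: bin_start(datum[field], step) == target
```

For example, with bins of width 10, selecting a datum priced at 17 also selects data priced at 12 and 19.5, but not one priced at 23.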
By default, backing points lie in the data domain. For example,
if the user clicks a mark instance, the underlying data tuple is added
to the selection. If no tuple is available, event properties are passed
through inverse scale transforms. For example, as the user moves
their mouse within the data rectangle, the mouse position is inverted
through the x and y scales and stored in the selection. Defining
selections over data values, rather than visual properties, facilitates
reuse across distinct views; each view may have different encodings
specified, but all are likely to share the same data domain. However,
some interactions are inherently about manipulating visual properties,
for example, interactively selecting the colors of a heatmap. For such
cases, users can define selections over the visual range instead. When
input events occur, visual elements or event properties are then stored.
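The inverse scale transform mentioned above can be illustrated with a linear scale. This is a sketch under the assumption of a purely linear mapping; `make_linear_scale` is a hypothetical helper, not part of Vega-Lite, and real scales may be logarithmic, ordinal, or otherwise non-linear.

```python
def make_linear_scale(domain, range_):
    """Return (scale, invert) functions for a linear scale.

    `scale` maps a data value to a pixel position; `invert` maps a pixel
    position (e.g. a mouse coordinate) back into the data domain.
    """
    d0, d1 = domain
    r0, r1 = range_

    def scale(value):
        return r0 + (value - d0) / (d1 - d0) * (r1 - r0)

    def invert(pixel):
        return d0 + (pixel - r0) / (r1 - r0) * (d1 - d0)

    return scale, invert


# Usage: invert a mouse position through hypothetical x and y scales.
x_scale, x_invert = make_linear_scale((0, 200), (0, 400))
y_scale, y_invert = make_linear_scale((0, 50), (300, 0))   # pixel y grows downward
backing_point = {"x": x_invert(100), "y": y_invert(150)}
```

Because the stored backing point is expressed in data units, another view with different pixel ranges can reuse it by applying its own forward scales.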
The particular events that update a selection are determined by
the platform a Vega-Lite specification is compiled on, and the input
modalities it supports. By default we use mouse events on desktops,
and touch events on mobile and tablet devices. A user can specify
alternate events using Vega’s event selector syntax [23]. For example,
Fig. 5(c) demonstrates how mouseover events are used to populate
a list selection. With the event selector syntax, multiple events
are specified using a comma (e.g., mousedown, mouseup adds
items to the selection when either event occurs). A sequence of events
is denoted with the right-combinator. For example, [mousedown,
mouseup] > mousemove selects all mousemove events that occur
between a mousedown and a mouseup (otherwise known as
“drag” events). Events can also be filtered using square brackets (e.g.,
mousemove[event.pageY > 5] for events at the top of the
page) and throttled using braces (e.g., mousemove{100ms} populates
a selection at most every 100 milliseconds).
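The semantics of the sequence and throttle operators can be modeled over a recorded list of events. This is a hedged sketch of the behavior described in the text, not Vega's event selector parser; the `between` and `throttle` functions and the event dictionaries with `type` and `t` (timestamp in milliseconds) keys are assumptions for illustration.

```python
def between(events, start, end, target):
    """Model of `[start, end] > target`: yield `target` events that occur
    after a `start` event and before the following `end` event."""
    active = False
    for e in events:
        if e["type"] == start:
            active = True
        elif e["type"] == end:
            active = False
        elif e["type"] == target and active:
            yield e


def throttle(events, ms):
    """Model of `target{ms}`: let an event through at most every `ms` ms."""
    last = None
    for e in events:
        if last is None or e["t"] - last >= ms:
            last = e["t"]
            yield e


# Usage: drag events, throttled to one every 100 ms.
log = [
    {"type": "mousemove", "t": 0},
    {"type": "mousedown", "t": 10},
    {"type": "mousemove", "t": 20},
    {"type": "mousemove", "t": 60},
    {"type": "mousemove", "t": 130},
    {"type": "mouseup", "t": 140},
    {"type": "mousemove", "t": 150},
]
drags = list(between(log, "mousedown", "mouseup", "mousemove"))
throttled = list(throttle(drags, 100))
```

Only the mousemove events inside the press-release window count as drags, and throttling then drops those that arrive less than 100 ms after the last one let through.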
Finally, selections can be initialized with specific backing points
(we defer discussion of transforms and resolve to subsequent sections).
Vega-Lite provides a built-in mechanism to initialize list and interval
selections using the scales of the unit specification they are defined
in. Doing so populates the selection with the given scales’ domain or
range, as appropriate for the selection, and parameterizes the scales to
use the selection instead. By default, this occurs for the scales of the x
and y channels, but alternate scales can be specified by the user. This
step allows scale extents to be interactively manipulated, yet remain
automatically initialized by the input data.
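The two-way relationship described above (the scale seeds the selection, and the selection then drives the scale) can be sketched as follows. The `Scale` class and `init_from_scale` function are hypothetical stand-ins; how the compiler actually wires this up is not shown in this section.

```python
class Scale:
    """Hypothetical scale whose domain defaults to the extent of the data."""

    def __init__(self, data_values):
        self._data_domain = (min(data_values), max(data_values))
        self.selection = None   # set once a selection parameterizes the scale

    @property
    def domain(self):
        # Read the domain from the selection if one has been attached.
        if self.selection is not None:
            return tuple(self.selection)
        return self._data_domain


def init_from_scale(scale):
    """Populate an interval selection with the scale's domain, then make the
    scale read its domain from that selection."""
    selection = list(scale.domain)
    scale.selection = selection
    return selection


# Usage: the selection starts at the data extent, then moves the scale.
x = Scale([3, 10, 7])
sel = init_from_scale(x)
initial_domain = x.domain
sel[0], sel[1] = 5, 12          # e.g. the result of a pan interaction
panned_domain = x.domain
```

The selection starts out equal to the data-driven domain, so the initial view is unchanged, while later updates to the selection change what the scale shows.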
4.2 Selection Transforms
Analogous to data transforms, selection transforms manipulate the
components of the selection they are applied to. For example, they
may perform operations on the backing points, alter a selection’s
predicate function, or modify the input events that update the selection.
We identify the following transforms as a minimal set to support both
common and custom interaction techniques:
project(fields, channels): Alters a selection’s predicate function to
determine inclusion by matching only the given fields. Some fields,
however, may be difficult for users to address directly (e.g., new fields
introduced due to inline binning or aggregation transformations). For
such cases, a list of channels may also be specified (e.g., color,
size). Fig. 5(d, e) demonstrate how project can be used to select
all points with matching Origin fields, for example. This transform
is also used to restrict interval selections to a particular dimension
(Fig. 6(c)) or to determine which scales initialize a selection.
toggle(event): This transform is automatically instantiated for
uninitialized list selections. When the event occurs, the corresponding
point is added or removed from a list selection’s backing dataset. By
default, the toggle event corresponds to the selection’s event but with
the shift key pressed. For example, in Fig. 5(b), additional points are
added to the list selection on shift-click (where click is the default
event for list selections). The selection in Fig. 5(c), however,
specifies a custom mouseover event. Thus, additional points are inserted
when the shift key is pressed and the mouse cursor hovers over a point.
translate(events, by): Offsets the spatial properties (or corresponding
data fields) of backing points by an amount determined by the
coordinates of the sequenced events. For example, on the desktop,
drag events ([mousedown, mouseup] > mousemove) are
used and the offset corresponds to the difference between where the
mousedown and subsequent mousemove events occur. If no coordinates
are available (e.g., as with keyboard events), an optional by
argument should be specified. This transform respects the project
transform as well, restricting movement to the specified dimensions.
This transform is automatically instantiated for interval selections,
enabling movement of brushed regions (Fig. 6(b)) or panning of the
visualization when scale extents initialize the selection (Fig. 7).
zoom(event, factor): Applies a scale factor, determined by the event,
to the spatial properties (or corresponding data fields) of backing
points. An optional factor should be specified, if it cannot be
determined from the events (e.g., when the arrow keys are pressed).
nearest(): Computes a Voronoi decomposition, and augments the
selection’s event processing, such that the data value or visual element
nearest the selection’s triggering event is selected (approximating a
Bubble Cursor [11]).
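The effect of each transform on a selection's backing points or predicate can be sketched in a few lines. All function names and signatures below are hypothetical, intervals are reduced to one dimension, and `nearest` uses a brute-force nearest-neighbor search in place of an actual Voronoi decomposition (the nearest site is, by definition, the one whose Voronoi cell contains the query position).

```python
import math


def project(backing, fields):
    """project: build a predicate that matches on the given fields only."""
    return lambda datum: any(
        all(datum[f] == b[f] for f in fields) for b in backing
    )


def toggle(backing, datum):
    """toggle: remove `datum` from the backing list if present, else add it."""
    if datum in backing:
        backing.remove(datum)
    else:
        backing.append(datum)


def translate(interval, delta):
    """translate: offset both ends of a 1-D interval by `delta`."""
    lo, hi = interval
    return (lo + delta, hi + delta)


def zoom(interval, anchor, factor):
    """zoom: scale a 1-D interval about `anchor` by `factor`."""
    lo, hi = interval
    return (anchor + (lo - anchor) * factor, anchor + (hi - anchor) * factor)


def nearest(points, x, y):
    """nearest: the point closest to (x, y), found by brute force."""
    return min(points, key=lambda p: math.hypot(p["x"] - x, p["y"] - y))
```

For a drag, the `delta` passed to `translate` would be the difference between the mousemove and mousedown positions after inverting them into data units; for keyboard-driven interactions it would come from the optional by argument instead.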
