Vega-Lite: A Grammar of Interactive Graphics

TL;DR: Vega-Lite combines a traditional grammar of graphics, providing visual encoding rules and a composition algebra for layered and multi-view displays, with a novel grammar of interaction, that enables rapid specification of interactive data visualizations.
Abstract: We present Vega-Lite, a high-level grammar that enables rapid specification of interactive data visualizations. Vega-Lite combines a traditional grammar of graphics, providing visual encoding rules and a composition algebra for layered and multi-view displays, with a novel grammar of interaction. Users specify interactive semantics by composing selections. In Vega-Lite, a selection is an abstraction that defines input event processing, points of interest, and a predicate function for inclusion testing. Selections parameterize visual encodings by serving as input data, defining scale extents, or by driving conditional logic. The Vega-Lite compiler automatically synthesizes requisite data flow and event handling logic, which users can override for further customization. In contrast to existing reactive specifications, Vega-Lite selections decompose an interaction design into concise, enumerable semantic units. We evaluate Vega-Lite through a range of examples, demonstrating succinct specification of both customized interaction methods and common techniques such as panning, zooming, and linked selection.

Summary

1 INTRODUCTION

  • Grammars of graphics span a gamut of expressivity.
  • Analysts rapidly author partial specifications of visualizations; the grammar applies default values to resolve ambiguities, and synthesizes low-level details to produce visualizations.
  • Selections parameterize visual encodings by serving as input data, defining scale extents, and providing predicate functions for testing or filtering items.
  • Through a range of examples, the authors demonstrate that Vega-Lite brings the advantages of high-level specification to interactive visualization.

2.1 Grammar-Based Visual Encoding

  • Since the initial publication of Wilkinson’s The Grammar of Graphics [29] in 1999, formal grammars for statistical graphics have grown increasingly popular as a way to succinctly specify visualizations.
  • Wilkinson’s work was quickly followed by the Stanford Polaris system [24], later commercialized as Tableau.
  • Drawing from Wilkinson’s grammar and Polaris/Tableau, Vega-Lite similarly represents basic plots using a set of encoding definitions that map data attributes to visual channels such as position, color, shape, and size, and may include common data transformations such as binning, aggregation, sorting, and filtering.
  • Vega-Lite specifications are compiled to full Vega specifications, hence the expressive gamut of Vega-Lite is a strict subset of that of Vega.
  • Disparate views can also be combined into arbitrary dashboards, all within a unified algebraic model.

2.2 Specifying Interactions in Visualization Systems

  • Despite the central role of interaction in effective data visualization [13, 19], little work has been done to develop a grammar for specifying interaction techniques.
  • Wilkinson’s grammar includes no notion of interaction.
  • Reactive Vega draws on Functional Reactive Programming techniques to formulate composable, declarative interaction primitives for data visualization.
  • When a new event fires, it propagates to dependent signals; visual encodings that use them are automatically re-evaluated and re-rendered.
  • Specifying common techniques can be time-consuming, requiring tens of lines of JSON, and it is difficult to know how to adapt techniques in pursuit of alternative designs.

3 THE VEGA-LITE GRAMMAR OF GRAPHICS

  • Vega-Lite combines a grammar of graphics with a novel grammar of interaction.
  • The authors describe Vega-Lite’s basic visual encoding constructs and an algebra for view composition.
  • In prior work, Wongsuphasawat et al. [30] introduced the simplest Vega-Lite specification — here referred to as a unit specification — that defines a single Cartesian plot with a specific mark type to encode data (e.g., bars, lines, plotting symbols).
  • Given multiple unit plots, the authors introduce layer, concat, facet, and repeat operators to provide an algebra for constructing composite views.
  • Each operator is responsible for combining or aligning underlying scales and axes as needed.

3.1 Unit Specification

  • A unit specification describes a single Cartesian plot, with a backing data set, a given mark-type, and a set of one or more encoding definitions for visual channels such as position (x, y), color, size, etc.
  • Formally, an encoding is a seven-tuple: encoding := (channel, field, data-type, value, functions, scale, guide). Available visual encoding channels include spatial position (x, y), color, shape, size, and text.
  • The field string denotes a data attribute to visualize, along with a given data-type (one of nominal, ordinal, quantitative or temporal).
  • If not specified, Vega-Lite will automatically populate default properties based on the channel and data-type.
  • For x and y channels, either a linear scale (for quantitative data) or an ordinal scale (for ordinal and nominal data) is instantiated, along with an axis.

3.2 View Composition Algebra

  • Given multiple unit specifications, composite views can be created using a set of composition operators.
  • Here the authors describe the set of supported operators.
  • The authors use the term view to refer to any Vega-Lite specification, whether it is a unit or composite specification.

3.2.1 Layer

  • The layer operator accepts multiple unit specifications to produce a view in which subsequent charts are plotted on top of each other.
  • The authors compute the union of the data domains for the x or y channel, for which they then generate a single scale.
  • The authors believe this is a useful default for producing coherent and comparable layers.
  • Vega-Lite cannot enforce that a unioned domain is semantically meaningful.
  • Independent scales and guides for each layer produce a dual-axis view, as shown in the layered plots in Fig. 3(a).
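The domain union performed by the layer operator can be illustrated with a minimal Python sketch. It assumes quantitative domains represented as [min, max] pairs; the function name `union_domains` and the numeric extents are hypothetical and are not taken from the Vega-Lite implementation.

```python
def union_domains(domains):
    """Union quantitative [min, max] domains into one shared domain."""
    lows, highs = zip(*domains)
    return [min(lows), max(highs)]


raw_domain = [3, 41]     # hypothetical extent of the raw values
mean_domain = [12, 27]   # hypothetical extent of the averaged values

# Both layers are then drawn against a single scale with this domain.
shared_domain = union_domains([raw_domain, mean_domain])
```

The sketch also shows why the default cannot guarantee a meaningful result: the union is computed the same way whether the layers hold comparable quantities or unrelated ones.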

3.2.2 Concatenation

  • To place views side-by-side, Vega-Lite provides operators for horizontal and vertical concatenation.
  • If aligned spatial channels have matching data fields (e.g., the y channels in an hconcat use the same field), a shared scale and axis are used.
  • Axis composition facilitates comparison across views and optimizes the underlying implementation.
  • Fig. 3(b) concatenates the line chart from Fig. 2(a) with a dot plot, using independent scales.

3.2.3 Facet

  • While concatenation allows composition of arbitrary views, one often wants to set up multiple views in a parameterized fashion.
  • The facet operator produces a trellis plot [1] by subsetting the data by the distinct values of a field.
  • The scale and axis parameters specify how sub-plots are positioned and labeled.
  • To facilitate comparison, scales and guides for quantitative fields are shared by default.
  • Users can override the default behavior via the resolve component.

3.2.4 Repeat

  • The repeat operator generates multiple plots, but unlike facet allows full replication of a data set in each cell.
  • The signature is repeat(channel, values, scale, axis, view, resolve). Similar to facet, the channel parameter indicates if plots should divide by row or column.
  • By default, scales and axes are independent, but legends are shared when data fields coincide.
  • As the repeat operator requires parameterization of the inner view, it is not strictly algebraic.
  • The authors believe the current syntax to be more usable and concise than these alternatives.
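The difference between facet and repeat can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the operators' actual implementation: the helper names, the `make_view` callback, and the weather rows are hypothetical, and scales, axes, and the resolve component are left out.

```python
def facet(records, field):
    """Partition records by the distinct values of `field` (one cell each)."""
    cells = {}
    for row in records:
        cells.setdefault(row[field], []).append(row)
    return cells


def repeat(records, fields, make_view):
    """Build one view per field; every cell receives the full data set."""
    return [{"data": records, "view": make_view(f)} for f in fields]


weather = [  # hypothetical rows
    {"location": "Seattle", "temp_max": 12, "wind": 4},
    {"location": "New York", "temp_max": 9, "wind": 6},
    {"location": "Seattle", "temp_max": 15, "wind": 3},
]

faceted = facet(weather, "location")
repeated = repeat(weather, ["temp_max", "wind"],
                  lambda f: {"mark": "line", "y": f})
```

Each faceted cell sees only its own subset of rows, whereas each repeated cell sees all rows and differs only in the field substituted into the inner view.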

3.3 Nested Views

  • Composition operators can be combined to create more complex nested views or dashboards, with the output of one operator serving as input to a subsequent operator.
  • A layer of two unit views might be repeated, and then concatenated with a different unit view.
  • The one exception is the layer operator, which, as previously noted, only accepts unit views to ensure consistent plots.
  • For concision, two dimensional faceted or repeated layouts can be achieved by applying the operators to the row and column channels simultaneously.
  • When faceting a composite view, only the dataset targeted by the operator is partitioned; any other datasets specified in sub-views are replicated.

4 THE VEGA-LITE GRAMMAR OF INTERACTION

  • To support specification of interaction techniques, Vega-Lite extends the definition of unit specifications to also include a set of selections.
  • Selections identify the set of points a user is interested in manipulating.
  • The authors define the components of a selection, describe a series of transforms for modifying selections, and detail how selections can parameterize visual encodings to make them interactive.

4.1 Selection Components

  • The authors formally define a selection as an eight-tuple: selection := (name, type, predicate, domain|range, event, init, transforms, resolve). A point selection is backed by a single datum, and its predicate tests for an exact match against properties of this datum.
  • Fig. 5(c) demonstrates how mouseover events are used to populate a list selection.
  • Doing so populates the selection with the given scales’ domain or range, as appropriate for the selection, and parameterizes the scales to use the selection instead.
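The relationship between a selection's backing points and its predicate can be sketched in Python. This is a simplified illustration, not Vega-Lite's generated logic: the function names, the `fields` argument, and the car records are hypothetical. The interval case follows the brush described in the introduction, which is backed by two points marking its extents.

```python
def point_predicate(backing, fields):
    """Exact match against the given properties of the single backing datum."""
    return lambda d: all(d[f] == backing[f] for f in fields)


def list_predicate(backing_points, fields):
    """Match if the datum equals any backing point on the given fields."""
    return lambda d: any(all(d[f] == p[f] for f in fields)
                         for p in backing_points)


def interval_predicate(start, end, fields):
    """Match if the datum lies within the extents spanned by two points."""
    def test(d):
        return all(min(start[f], end[f]) <= d[f] <= max(start[f], end[f])
                   for f in fields)
    return test


cars = [  # hypothetical data
    {"id": 1, "hp": 90, "mpg": 30},
    {"id": 2, "hp": 150, "mpg": 18},
    {"id": 3, "hp": 110, "mpg": 24},
]

clicked = point_predicate(cars[1], ["id"])
hovered = list_predicate([cars[0], cars[2]], ["id"])
brushed = interval_predicate({"hp": 80, "mpg": 20}, {"hp": 120, "mpg": 35},
                             ["hp", "mpg"])
```

In each case a small set of backing points, together with a predicate, stands for the full set of selected items.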

4.2 Selection Transforms

  • Analogous to data transforms, selection transforms manipulate the components of the selection they are applied to.
  • They may perform operations on the backing points, alter a selection’s predicate function, or modify the input events that update the selection.
  • In Fig. 5(b), additional points are added to the list selection on shift-click (where click is the default event for list selections).
  • If no coordinates are available (e.g., as with keyboard events), an optional by argument should be specified.
  • All transforms are first parsed, setting properties on an internal representation of a selection, before they are compiled to produce event handling and interaction logic.
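Two of the transforms named in the paper, toggle and project, can be sketched in Python as follows. The sketch assumes a selection is a plain list of data records; the function signatures and the example records are hypothetical and do not reflect how the compiler implements these transforms.

```python
def toggle(points, datum):
    """Return a new selection with `datum` removed if present, else added."""
    if datum in points:
        return [p for p in points if p != datum]
    return points + [datum]


def project(points, fields):
    """Build a predicate that tests inclusion over `fields` only."""
    selected = {tuple(p[f] for f in fields) for p in points}
    return lambda d: tuple(d[f] for f in fields) in selected


a = {"id": 1, "origin": "USA"}
b = {"id": 2, "origin": "Japan"}
c = {"id": 3, "origin": "USA"}

selection = toggle([], a)          # click: a is added
selection = toggle(selection, b)   # shift-click: b is added
selection = toggle(selection, a)   # shift-click on a again: a is removed

# Projecting over "origin" selects every record that shares a's origin.
same_origin = project([a], ["origin"])
```

Here toggle changes the backing points, while project leaves them untouched and changes only what the predicate compares.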

4.3 Selection-Driven Visual Encodings

  • Once selections are defined, they parameterize visual encodings to make them interactive — visual encodings are automatically reevaluated as selections change.
  • Each data tuple participating in the encoding is evaluated against selection predicates in turn, and visual properties are set corresponding to the first branch that evaluates to true.
  • As shown in Fig. 5, the fill color of the scatterplot circles is determined by a data field if they fall within the id selection, or set to grey otherwise.
  • By default, this applies a selection’s predicate against the data tuples (or visual elements) of the unit specification it is defined in.
  • For multi-view displays, selection names can be specified as the domain or range of a particular channel’s scale.
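The conditional evaluation described above can be sketched in Python. This is an illustration only: `resolve_channel`, the palette, and the set of selected ids are hypothetical stand-ins for a selection predicate and a color scale.

```python
def resolve_channel(datum, branches, otherwise):
    """Return the value of the first branch whose predicate accepts `datum`."""
    for predicate, encode in branches:
        if predicate(datum):
            return encode(datum)
    return otherwise


palette = {"USA": "steelblue", "Japan": "orange"}   # hypothetical color scale
selected_ids = {2}                                   # stands in for a selection


def in_selection(d):
    return d["id"] in selected_ids


def fill(d):
    # Color by a data field inside the selection, grey otherwise.
    return resolve_channel(
        d, [(in_selection, lambda datum: palette[datum["origin"]])], "grey")
```

When the selection changes, re-running `fill` over the data tuples yields the updated colors, which corresponds to the automatic re-evaluation the paper describes.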

4.4 Disambiguating Composite Selections

  • Selections are defined within unit specifications, providing a default context.
  • Several strategies exist for resolving this ambiguity.
  • Setting a selection to resolve to independent creates one instance per view, and each unit uses only its own selection to determine inclusion.
  • More concretely, with the SPLOM example, these settings would continue to produce one brush per cell, and points would highlight when they lie within at least one brush or if they are within every brush as shown in Fig. 8(c, d).
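The resolution strategies can be compared with a short Python sketch. It assumes one brush predicate per view and uses the labels "independent", "union", and "intersect" for the three behaviors described above; the function name, view names, and thresholds are hypothetical.

```python
def is_highlighted(datum, brushes, view, resolve):
    """`brushes` maps each view name to that view's brush predicate."""
    if resolve == "independent":
        return brushes[view](datum)          # only this view's own brush
    hits = [inside(datum) for inside in brushes.values()]
    if resolve == "union":
        return any(hits)                     # within at least one brush
    if resolve == "intersect":
        return all(hits)                     # within every brush
    raise ValueError(f"unknown resolution: {resolve}")


brushes = {
    "hp_vs_mpg": lambda d: 80 <= d["hp"] <= 120,
    "mpg_vs_weight": lambda d: d["mpg"] >= 25,
}
car = {"hp": 110, "mpg": 24}
```

The same point and the same two brushes give different highlighting results depending only on the resolution setting.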

5 THE VEGA-LITE COMPILER

  • The Vega-Lite compiler ingests a JSON specification and outputs a lower-level Reactive Vega specification (also expressed as JSON).
  • To overcome these challenges, the compiler generates the output Vega specification in four phases: parse ingests and disambiguates the Vega-Lite specification; build creates the necessary internal representations to map between Vega-Lite and Vega primitives; merge optimizes this representation to remove redundancies; and finally, assemble compiles this representation into a Vega specification.
  • If the color channel is mapped to a nominal field, and the user has not specified a scale domain, a categorical color palette is inferred.
  • Once the necessary components have been built, the compiler performs a bottom-up traversal of the model tree to merge redundant components.
  • Each run-time selection transform (i.e., those that are triggered by an event) generates signals as well, and may augment the selection’s data source with data transformations.
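The bottom-up merge step can be illustrated with a toy Python sketch. The summary does not spell out the merge rules, so the rule used here (hoist a scale to the parent only when every child defines it identically) is an assumption, as are the tree layout and the name `merge_scales`.

```python
def merge_scales(node):
    """Bottom-up pass: hoist scales that all children define identically."""
    children = node.get("children", [])
    for child in children:
        merge_scales(child)                      # visit descendants first
    if children:
        candidates = children[0].setdefault("scales", {})
        shared = {
            name: scale for name, scale in candidates.items()
            if all(c.setdefault("scales", {}).get(name) == scale
                   for c in children)
        }
        for child in children:
            for name in shared:
                del child["scales"][name]        # drop the redundant copies
        node.setdefault("scales", {}).update(shared)
    return node


tree = {"children": [
    {"scales": {"x": {"field": "date"}, "y": {"field": "temp_max"}}},
    {"scales": {"x": {"field": "wind"}, "y": {"field": "temp_max"}}},
]}
merged = merge_scales(tree)
```

After the pass, the shared y scale lives once on the parent and each child keeps only the x scale that is specific to it.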

6 EXAMPLE INTERACTIVE VISUALIZATIONS

  • Vega-Lite’s design is motivated by two goals: to enable rapid yet expressive specification of interactive visualizations, and to do so with concise primitives that facilitate systematic enumeration and exploration of design variations.
  • Such changes to the specification are not mutually exclusive, and can be composed as shown in Fig. 5(e).
  • Users can now brush, pan and zoom the scatterplot.
  • Moreover, by enabling this interaction through composable primitives (rather than a single, specific “pan and zoom” operator [4]), Vega-Lite also facilitates exploring related interactions in the design space.
  • Instead of applying the selection back onto the input dataset, the authors can instead materialize it as an overlay (Fig. 11).

7 DISCUSSION

  • The examples demonstrate that Vega-Lite specifications are more concise than those of the lower-level Vega language, and yet are sufficiently expressive to cover an interactive visualization taxonomy.
  • Nevertheless, the authors identify two classes of limitations that currently exist.
  • While their selection abstraction supports interactive linking of marks, their view algebra does not yet provide means of visually linking marks across views (e.g., as in the Domino system [10]).
  • By offering a multi-view grammar of graphics tightly integrated with a grammar of interaction, Vega-Lite facilitates rapid exploration of design variations.

Vega-Lite: A Grammar of Interactive Graphics
Arvind Satyanarayan, Dominik Moritz, Kanit Wongsuphasawat, and Jeffrey Heer
Fig. 1. Example visualizations authored with Vega-Lite. From left-to-right: layered line chart combining raw and average values,
dual-axis layered bar and line chart, brushing and linking in a scatterplot matrix, layered cross-filtering, and an interactive index chart.
Abstract—We present Vega-Lite, a high-level grammar that enables rapid specification of interactive data visualizations. Vega-Lite
combines a traditional grammar of graphics, providing visual encoding rules and a composition algebra for layered and multi-view
displays, with a novel grammar of interaction. Users specify interactive semantics by composing selections. In Vega-Lite, a selection
is an abstraction that defines input event processing, points of interest, and a predicate function for inclusion testing. Selections
parameterize visual encodings by serving as input data, defining scale extents, or by driving conditional logic. The Vega-Lite compiler
automatically synthesizes requisite data flow and event handling logic, which users can override for further customization. In contrast
to existing reactive specifications, Vega-Lite selections decompose an interaction design into concise, enumerable semantic units.
We evaluate Vega-Lite through a range of examples, demonstrating succinct specification of both customized interaction methods
and common techniques such as panning, zooming, and linked selection.
Index Terms—Information visualization, interaction, systems, toolkits, declarative specification
1 INTRODUCTION
Grammars of graphics span a gamut of expressivity. Low-level gram-
mars such as Protovis [3], D3 [4], and Vega [22] are useful for ex-
planatory data visualization or as a basis for customized analysis
tools, as their primitives offer fine-grained control. However, for ex-
ploratory visualization, higher-level grammars such as ggplot2 [27],
and grammar-based systems such as Tableau (née Polaris [24]), are
typically preferred as they favor conciseness over expressiveness. An-
alysts rapidly author partial specifications of visualizations; the gram-
mar applies default values to resolve ambiguities, and synthesizes low-
level details to produce visualizations.
High-level languages can also enable search and inference over the
space of visualizations. For example, Wongsuphasawat et al. [30] in-
troduced Vega-Lite to power the Voyager visualization browser. By
providing a smaller surface area than the lower-level Vega language,
Vega-Lite makes systematic enumeration and ranking of data transfor-
mations and visual encodings more tractable.
However, existing high-level languages provide limited support for
interactivity. An analyst can, at most, enable a predefined set of com-
mon techniques (linked selections, panning & zooming, etc.) or pa-
rameterize their visualization with dynamic query widgets [21]. For
custom, direct-manipulation interaction they must instead turn to im-
perative event handling callbacks. Recognizing that callbacks can be
error-prone to author, and require complex static analysis to reason
about, Satyanarayan et al. [23] recently formulated declarative interac-
tion primitives for Vega. While these additions facilitate programmatic
generation and retargeting of interactive visualizations, they remain
low-level. Verbose specification impedes rapid authoring and hinders
systematic exploration of alternative designs.
Arvind Satyanarayan is with Stanford University. E-mail:
arvindsatya@cs.stanford.edu.
Dominik Moritz, Kanit Wongsuphasawat, and Jeffrey Heer are with the
University of Washington. E-mails: {domoritz, kanitw, jheer}@uw.edu.
In this paper we extend Vega-Lite to enable concise, high-level
specification of interactive data visualizations. To support expressive
interaction methods, we first contribute an algebra to compose single-
view Vega-Lite specifications into multi-view displays using layer,
concatenate, facet and repeat operators. Vega-Lite’s compiler infers
how input data should be reused across constituent views, and whether
scale domains should be unioned or remain independent.
Second, we contribute a high-level interaction grammar. With
Vega-Lite, an interaction design is composed of selections: visual el-
ements or data points that are chosen when input events occur. Selec-
tions parameterize visual encodings by serving as input data, defining
scale extents, and providing predicate functions for testing or filtering
items. For example, a rectangular “brush” is a common interaction
technique for data visualization. In Vega-Lite, a brush is defined as a
selection that holds two data points that correspond to its extents (e.g.,
captured when the mouse button is pressed and as it is dragged, re-
spectively). Its predicate can be used to highlight visual elements that
fall within the brushed region, and to materialize a dataset as input to
other encodings. The selection can also serve as the scale domain for a
secondary view, thereby constructing an overview + detail interaction.
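As a rough illustration of a selection serving as a scale domain, the following Python sketch derives the domain of a detail view from the two backing points of a brush. The helper `linear_scale`, the field name `x`, and the pixel range are hypothetical; this is a sketch of the idea, not of the code Vega-Lite generates.

```python
def linear_scale(domain, pixel_range):
    """Return a function mapping data values in `domain` to `pixel_range`."""
    d0, d1 = domain
    r0, r1 = pixel_range
    return lambda v: r0 + (v - d0) / (d1 - d0) * (r1 - r0)


# Two backing points of a brush: where the drag started and where it is now.
brush = [{"x": 60}, {"x": 20}]

# The brush extents become the scale domain of the detail view.
detail_domain = sorted(point["x"] for point in brush)
detail_x = linear_scale(detail_domain, (0, 400))
```

Dragging the brush changes the backing points, and rebuilding `detail_x` from them is what makes the detail view follow the overview.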
For added expressivity, Vega-Lite provides a series of operators to
transform a selection. Transforms can be triggered by input events as
well, and manipulate selection points or predicate functions. For ex-
ample, a toggle transform adds or removes a point from the selection,
while a project transform modifies the predicate to define inclusion
over specified data fields.
The Vega-Lite compiler synthesizes a low-level Vega specifica-
tion [22] with the requisite data flow, and default event handling logic
that a user can override. Through a range of examples, we demon-
strate that Vega-Lite brings the advantages of high-level specification
to interactive visualization. Common methods, including linked selec-
tion, panning, and zooming, as well as custom techniques (drawn from
an established taxonomy [31]) can be concisely described. Moreover,
selections, transformations, and their application to visual encodings
decompose interaction into a parametric design space. We show how
each of these parameters can be systematically varied to generate alternate
interaction techniques for a given set of visual encodings. Such
enumeration can be useful to explore alternative designs, and can aid
higher-level reasoning about interaction, for example by recommending
suitable interaction techniques as part of a design tool.
2 RELATED WORK
Vega-Lite builds on prior work on grammars of graphics, visualization
systems, and techniques for interactive selection and querying.
2.1 Grammar-Based Visual Encoding
Since the initial publication of Wilkinson’s The Grammar of Graph-
ics [29] in 1999, formal grammars for statistical graphics have grown
increasingly popular as a way to succinctly specify visualizations.
Wilkinson’s work was quickly followed by the Stanford Polaris sys-
tem [24], later commercialized as Tableau. Hadley Wickham’s popular
ggplot2 [27] and ggvis [20] packages implement variants of Wilkin-
son’s model in the R statistical language. These tools eschew chart
templates, which offer limited means of customization, in favor of
combinatorial building blocks. Abstracting data models, graphical
marks, visual encoding channels, scales and guides (i.e., axes and leg-
ends) yields a more expressive design space, and allows analysts to
rapidly construct graphics for exploratory analysis [13]. Concise spec-
ification is achieved in part through ambiguity: users may omit details
such as scale transforms (e.g., linear or log) or color palettes, which
are then filled in using a rule-based system of smart defaults. More
expressive lower-level (and thus more verbose) grammars, including
those of Protovis [3], D3 [4], and Vega [22], have been widely used
for creating explanatory and highly-customized graphics.
The design of Vega-Lite is heavily influenced by these works.
Drawing from Wilkinson’s grammar and Polaris/Tableau, Vega-Lite
similarly represents basic plots using a set of encoding definitions that
map data attributes to visual channels such as position, color, shape,
and size, and may include common data transformations such as bin-
ning, aggregation, sorting, and filtering. Drawing from Vega, Vega-
Lite uses a portable JSON (JavaScript Object Notation) syntax that
permits generation from a variety of programming languages. Vega-
Lite specifications are compiled to full Vega specifications, hence the
expressive gamut of Vega-Lite is a strict subset of that of Vega. As we
will later demonstrate, Vega-Lite sacrifices some expressiveness for
dramatic gains in the conciseness and clarity of specification.
In terms of visual encoding, Vega-Lite differs most from other high-
level grammars in its approach to multiple view displays. Each of
these grammars supports faceting (or nesting) to construct trellis plots
in which each cell similarly visualizes a different partition of the data.
Both Wilkinson’s grammar and Polaris/Tableau achieve this through a
table algebra over data fields, which in turn determines spatial sub-
divisions. Tableau additionally supports the construction of multi-
view dashboards via a different mechanism, with each view backed
by a separate specification. In contrast, we contribute a view alge-
bra: starting with unit specifications that define a single plot, Vega-
Lite expresses composite views using operators for layering, horizon-
tal or vertical concatenation, faceting, and parameterized repetition.
When applicable, these operators will merge scale domains and prop-
erly align constituent views. Disparate views can also be combined
into arbitrary dashboards, all within a unified algebraic model.
2.2 Specifying Interactions in Visualization Systems
Despite the central role of interaction in effective data visualization
[13, 19], little work has been done to develop a grammar for specify-
ing interaction techniques. Wilkinson’s grammar includes no notion
of interaction. Tableau supports common interaction techniques, but
relies on mechanisms external to the visual encoding grammar. Early
systems like GGobi [25] support common techniques as well, and pro-
vide imperative APIs for custom methods. However, such APIs make
easy tasks needlessly complex, burdening developers with learning
low-level execution details. More recent systems, including Protovis,
D3, and VisDock [7], offer a typology of common techniques that can
be applied to a visualization. Such top-down approaches, however,
limit customization and composition. For example, D3’s interactors
encapsulate event processing, making it difficult to combine them if
their events conflict (e.g., if dragging triggers brushing and panning).
The prior work perhaps most closely related to Vega-Lite is the Re-
active Vega language [23]. Reactive Vega draws on Functional Reac-
tive Programming techniques to formulate composable, declarative in-
teraction primitives for data visualization. Reactive Vega models input
events as continuous data streams. To succinctly define event streams
of interest, Vega employs an event selector syntax, which Vega-Lite
also uses for customized event logic. Event streams, in turn, drive
dynamic variables called signals. Signals parameterize the remainder
of the visualization specification, endowing it with reactive semantics.
When a new event fires, it propagates to dependent signals; visual en-
codings that use them are automatically re-evaluated and re-rendered.
This reactive approach is not only capable of expressing a diverse set
of interactions [23], it is performant as well [22], with interactive per-
formance at least twice as fast as the equivalent D3 program.
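The propagation from events to signals to encodings can be illustrated with a minimal Python sketch. This is not Reactive Vega's implementation: the `Signal` class, its `on_change` and `update` methods, and the `mouse_x` example are hypothetical names, and the event stream is modeled as a plain list of values.

```python
class Signal:
    """A dynamic variable that notifies dependents when its value changes."""

    def __init__(self, value):
        self.value = value
        self._dependents = []

    def on_change(self, callback):
        # Register something (e.g. an encoding) that depends on this signal.
        self._dependents.append(callback)

    def update(self, value):
        # A new event sets the value and propagates it to every dependent.
        self.value = value
        for callback in self._dependents:
            callback(value)


rendered = []                   # stands in for re-rendered output
mouse_x = Signal(0)
mouse_x.on_change(lambda x: rendered.append({"rule_x": x}))

for event_x in [12, 40, 75]:    # a toy stream of mouse positions
    mouse_x.update(event_x)
```

The encoding never polls for input; it is re-evaluated only because the signal it depends on was updated.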
However, the resulting reactive specifications are low-level and ver-
bose. Specifying common techniques can be time-consuming, requir-
ing tens of lines of JSON, and it is difficult to know how to adapt
techniques in pursuit of alternative designs. In contrast, Vega-Lite is
a higher-level specification language, with primitives that decompose
interaction design into a parametric space. Common methods require
typically 1-2 lines of code, and design variations can be explored by
systematically enumerating defined properties. Nevertheless, Reac-
tive Vega provides a performant runtime and an “assembly language”
to which Vega-Lite specifications are compiled.
2.3 Interactive Selection and Querying
Selection, often in the form of users clicking or lassoing visual items
of interest, is a fundamental operation in user interfaces and has
been well-studied in the context of data visualization. For example,
in Snap-Together Visualization [17], multiple views are coordinated
via “primary-” and “foreign-key” actions, which propagate selected
data tuples from one view to the others. Wilhelm [28] describes the
need for such “indirect object manipulation” methods as an axiom
of interactive data displays. Chen’s compound brushing [6] provides
a visual dataflow language for specifying a rich space of transfor-
mations of brush selections. More recently, Brunel [5] provides a
special #selection data field that is dynamically populated with
the elements a user interacts with, and can be used to link multi-
ple views or filter input data. Similarly, RStudio’s Shiny [21], an
imperative web application layer, provides brushedPoints and
nearestPoints functions which can be used throughout an R
script to operate on selected elements.
Other systems have studied formally representing selections as data
queries [28]. For example, brushing interactions in VQE [9] generate
extensional queries that enumerate all items of interest; a form-based
interface enables specification of intensional (declarative) queries. In-
dividual point and brush selections in DEVise [15], known as visual
queries, map to a declarative structure and are used to link together
multiple views. With VIQING [18], rectangular “rubber band” selec-
tions are modeled as range extents, and views can be dropped on top
of each other to join their underlying datasets. Heer et al. [12] demon-
strate that by modeling a selection as a declarative query, interactive
“query relaxation” can successively capture more items of interest.
Vega-Lite builds on this work by richly integrating an interactive
selection abstraction with the primitives of visual encoding grammars.
Vega-Lite selections are populated with one or more points of interest,
in response to user interaction. Extensible predicate functions map se-
lections to declarative queries, and allow a minimal set of “backing”
points to represent the full space of selected points. Additional op-
erators can transform a selection’s predicate or backing points (e.g.,
offsetting them to translate a brush selection or perform panning).
Selections then parameterize visual encodings by serving as input data,
defining scale extents, or using predicates to test or filter items. The
end result is an enumerable, combinatorial design space of interac-
tive statistical graphics, with concise specification of not only linking
interactions, but panning, zooming, and custom techniques as well.

(a) Line chart with aggregation
{
  "data": {
    "url": "data/weather.csv",
    "formatType": "csv"
  },
  "mark": "line",
  "encoding": {
    "x": {
      "field": "date",
      "type": "temporal",
      "timeUnit": "month"
    },
    "y": {
      "field": "temp_max",
      "type": "quantitative",
      "aggregate": "mean"
    },
    "color": {
      "field": "location",
      "type": "nominal"
    }
  }
}
(b) Correlation between wind and temperature
{
  "data": {
    "url": "data/weather.csv",
    "formatType": "csv"
  },
  "mark": "point",
  "encoding": {
    "x": {
      "field": "temp_max",
      "type": "quantitative",
      "bin": true
    },
    "y": {
      "field": "wind",
      "type": "quantitative",
      "bin": true
    },
    "size": {
      "field": "*",
      "aggregate": "count"
    },
    "color": {
      "field": "location",
      "type": "nominal"
    }
  }
}
(c) Stacked bar chart of weather types
{
  "data": {
    "url": "data/weather.csv",
    "formatType": "csv"
  },
  "mark": "bar",
  "encoding": {
    "x": {
      "field": "location",
      "type": "nominal"
    },
    "y": {
      "field": "*",
      "type": "quantitative",
      "aggregate": "count"
    },
    "color": {
      "field": "weather",
      "type": "nominal"
    }
  }
}
Fig. 2. Vega-Lite unit specifications visualizing weather data. These examples demonstrate varied mark types and data transformations.
3 THE VEGA-LITE GRAMMAR OF GRAPHICS
Vega-Lite combines a grammar of graphics with a novel grammar of
interaction. In this section, we describe Vega-Lite’s basic visual en-
coding constructs and an algebra for view composition. In prior work,
Wongsuphasawat et al. [30] introduced the simplest Vega-Lite specification
(here referred to as a unit specification) that defines a single
Cartesian plot with a specific mark type to encode data (e.g., bars,
lines, plotting symbols). Given multiple unit plots, we introduce layer,
concat, facet, and repeat operators to provide an algebra for construct-
ing composite views. This algebra can express layered plots, trellis
plots, and arbitrary multiple view displays. Each operator is responsi-
ble for combining or aligning underlying scales and axes as needed.
3.1 Unit Specification
A unit specification describes a single Cartesian plot, with a backing
data set, a given mark-type, and a set of one or more encoding def-
initions for visual channels such as position (x, y), color, size, etc.
Formally, a unit view consists of a four-tuple:
unit := (data, transforms, mark-type, encodings)
The data definition identifies a data source, a relational table con-
sisting of records (rows) with named attributes (columns). This data ta-
ble can be subject to a set of transforms, including filtering and adding
derived fields via formulas. The mark-type specifies the geometric ob-
ject used to visually encode the data records. Legal values include bar,
line, area, text, rule for reference lines, and plotting symbols (point &
tick). The encodings determine how data attributes map to the proper-
ties of visual marks. Formally, an encoding is a seven-tuple:
encoding := (channel, field, data-type, value, functions, scale, guide)
Available visual encoding channels include spatial position (x, y),
color, shape, size, and text. An order channel controls sorting of
stacked elements (e.g., for stacked bar charts and the layering order of
line charts). A path order channel determines the sequence in which
points of a line or area mark are connected to each other. A detail
channel includes additional group-by fields in aggregate plots.
The field string denotes a data attribute to visualize, along with a
given data-type (one of nominal, ordinal, quantitative or temporal).
Alternatively, one can specify a constant literal value to serve as the
data field. The data field can additionally be transformed using func-
tions such as binning, aggregation (sum, average, etc.), and sorting.
An encoding may also specify properties of a scale that maps from
the data domain to a visual range, and a guide (axis or legend) for
visualizing the scale. If not specified, Vega-Lite will automatically
populate default properties based on the channel and data-type. For x
and y channels, either a linear scale (for quantitative data) or an ordinal
scale (for ordinal and nominal data) is instantiated, along with an axis.
For color, size, and shape channels, suitable palettes and legends are
generated. For example, quantitative color encodings use a single-
hue luminance ramp, while nominal color encodings use a categorical
palette with varied hues. Our default assignments largely follow the
model of prior systems [24, 30].
Unit specifications are capable of expressing a variety of com-
mon, useful plots of both raw and aggregated data. Examples include
bar charts, histograms, dot plots, scatter plots, line graphs, and area
graphs. Our formal definitions are instantiated in a JSON (JavaScript
Object Notation) syntax, as shown in Fig. 2.
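The two tuples and the default rules stated above can be sketched as plain data classes. This is an illustrative model only, not Vega-Lite's implementation: the class names, the `field_name` attribute, and the helpers `default_scale_type` and `default_guide` are hypothetical, and only the defaults spelled out in the text are encoded.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Encoding:
    """encoding := (channel, field, data-type, value, functions, scale, guide)"""
    channel: str
    field_name: Optional[str] = None  # the paper's "field" (renamed for this sketch)
    data_type: Optional[str] = None   # nominal | ordinal | quantitative | temporal
    value: Any = None                 # constant literal used in place of a field
    functions: Dict[str, Any] = field(default_factory=dict)  # e.g. bin, aggregate
    scale: Optional[Dict[str, Any]] = None
    guide: Optional[Dict[str, Any]] = None


@dataclass
class Unit:
    """unit := (data, transforms, mark-type, encodings)"""
    data: Dict[str, Any]
    mark_type: str
    encodings: List[Encoding]
    transforms: List[Dict[str, Any]] = field(default_factory=list)


def default_scale_type(enc: Encoding) -> Optional[str]:
    """Default scale for x and y, as described in the text."""
    if enc.channel in ("x", "y"):
        if enc.data_type == "quantitative":
            return "linear"
        if enc.data_type in ("ordinal", "nominal"):
            return "ordinal"
    return None  # cases the text does not spell out


def default_guide(enc: Encoding) -> Optional[str]:
    """Axes for positional channels, legends for color, size, and shape."""
    if enc.channel in ("x", "y"):
        return "axis"
    if enc.channel in ("color", "size", "shape"):
        return "legend"
    return None


# The stacked bar chart of Fig. 2, restated with the classes above.
stacked_bars = Unit(
    data={"url": "data/weather.csv", "formatType": "csv"},
    mark_type="bar",
    encodings=[
        Encoding("x", "location", "nominal"),
        Encoding("y", "*", "quantitative", functions={"aggregate": "count"}),
        Encoding("color", "weather", "nominal"),
    ],
)
```

In this model an omitted scale or guide is simply `None`, and the helper functions stand in for the step where defaults are populated from the channel and data-type.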
3.2 View Composition Algebra
Given multiple unit specifications, composite views can be created us-
ing a set of composition operators. Here we describe the set of sup-
ported operators. We use the term view to refer to any Vega-Lite spec-
ification, whether it is a unit or composite specification.
3.2.1 Layer
The layer operator accepts multiple unit specifications to produce a
view in which subsequent charts are plotted on top of each other. For
example, a layered view could consist of one layer showing a his-
togram of a full data set, and another overlaying a histogram of a fil-
tered subset (Fig. 11). The signature of the operator is:
layer([unit_1, unit_2, ...], resolve)
To create a layered view, we produce shared scales (if their types
match) and merge guides by default. For example, we compute the
union of the data domains for the x or y channel, for which we then
generate a single scale. We believe this is a useful default for producing
coherent and comparable layers. However, Vega-Lite cannot
enforce that a unioned domain is semantically meaningful. To prohibit
layering of composite views with incongruent internal structures, the
layer operator restricts its operands to be unit views.
To override the default behavior, users can specify strategies to re-
solve scales and guides using tuples of the form (channel, scale|guide,
{
"layers": [
{
"data": {"url": "data/weather.csv","formatType": "csv"},
"transform": {"filter": "datum.location === 'Seattle'"},
"mark": "bar",
"encoding": {
"x": {
"field": "date", "type": "temporal",
"timeUnit": "month" },
"y": {
"field": "precipitation", "type": "quantitative",
"aggregate": "mean", "axis": {"grid": false} },
"color": {"value": "#77b2c7"} }
}, {
"data": {"url": "data/weather.csv","formatType": "csv"},
"transform": {"filter": "datum.location === 'Seattle'"},
"mark": "line",
"encoding": {
"x": {
"field": "date", "type": "temporal",
"timeUnit": "month" },
"y": {
"field": "temp_max", "type": "quantitative",
"aggregate": "mean", "axis": {"grid": false} },
"color": {"value": "#ce323c"} }
} ],
"resolve": {
"y": {"scale": "independent"}
} }
(a) Dual axis layered chart
{
"vconcat": [
{ ... },
{
"data": {
"url": "data/weather.csv",
"formatType": "csv"
},
"transform": {
"filter": "datum.precipitation > 0"
},
"mark": "point",
"encoding": {
"y": {"field": "location","type": "nominal"},
"x": {
"field": "*",
"type": "quantitative",
"aggregate": "count"
},
"color": {
"field": "date",
"type": "temporal",
"timeUnit": "year"
}
}
}
]
}
(b) Vertical concatenation of two charts
Fig. 3. (a) A dual axis chart that layers lines for temperature on top of bars for precipitation; each layer uses an independent y-scale. (b) The
temperature line chart from Fig. 2(a) concatenated with rainy day counts in New York and Seattle; scales and guides for each plot are independent.

(a) Faceted charts (b) Repeated charts
{
"data": {
"url": "data/weather.csv",
"formatType": "csv"
},
"facet": {
"column": {
"field": "location",
"type": "nominal"
}
},
"spec": {
"mark": "line",
"encoding": {
"x": { ... },
"y": { ... },
"color": { ... }
}
}
}
{
"repeat": {
"column": ["temp_max","precipitation"]
},
"spec": {
"data": {
"url": "data/weather.csv",
"formatType": "csv"
},
"mark": "line",
"encoding": {
"x": { ... },
"y": {
"field": {"repeat": "column"},
"type": "quantitative",
"aggregate": "mean"
},
"color": { ... }
}
} }
Fig. 4. (a) Weather data faceted by location; the y-axis is shared, and the underlying scale domains unioned, to enable easier comparison.
(b) Repetition of different measures across columns; the y channel references the column template parameter to vary the encoding.
resolution), where resolution is one of independent or union. Inde-
pendent scales and guides for each layer produce a dual-axis view, as
shown in the layered plots in Fig. 3(a).
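The scale resolution behavior of the layer operator can be illustrated with a short sketch. It assumes each layer is summarized by a hypothetical mapping from channel to a (min, max) data extent, and that `resolve` maps a channel to "union" (the default) or "independent"; the function name and the numeric extents are made up for illustration.

```python
def resolve_layer_domains(layers, resolve=None):
    """Return, per channel, the list of scale domains the layered view needs.

    A single-element list means one shared scale (unioned domain);
    several elements mean one independent scale per layer.
    """
    resolve = resolve or {}
    channels = sorted({ch for layer in layers for ch in layer})
    result = {}
    for ch in channels:
        domains = [layer[ch] for layer in layers if ch in layer]
        if resolve.get(ch, "union") == "independent":
            result[ch] = domains
        else:
            lo = min(d[0] for d in domains)
            hi = max(d[1] for d in domains)
            result[ch] = [(lo, hi)]
    return result


# Illustrative extents: monthly precipitation bars and temperature lines.
bars = {"x": (1, 12), "y": (0.0, 9.5)}
lines = {"x": (1, 12), "y": (40.0, 80.0)}

shared = resolve_layer_domains([bars, lines])
dual_axis = resolve_layer_domains([bars, lines], {"y": "independent"})
```

With the default, both layers share a y domain of (0.0, 80.0); with the independent strategy each layer keeps its own extent, which corresponds to the dual-axis arrangement of Fig. 3(a).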
3.2.2 Concatenation
To place views side-by-side, Vega-Lite provides operators for horizon-
tal and vertical concatenation. The signatures for these operators are:
hconcat([view_1, view_2, ...], resolve)
vconcat([view_1, view_2, ...], resolve)
If aligned spatial channels have matching data fields (e.g., the y
channels in an hconcat use the same field), a shared scale and axis
are used. Axis composition facilitates comparison across views and
optimizes the underlying implementation. Fig. 3(b) concatenates the
line chart from Fig. 2(a) with a dot plot, using independent scales.
3.2.3 Facet
While concatenation allows composition of arbitrary views, one often
wants to set up multiple views in a parameterized fashion. The facet
operator produces a trellis plot [1] by subsetting the data by the distinct
values of a field. The signature of the facet operator is:
facet(channel, data, field, view, scale, axis, resolve)
The channel indicates if sub-plots should be laid out vertically (row)
or horizontally (column). The given data source is partitioned using
distinct values of the field. The view specification provides a template
for the sub-plots, inheriting the backing data for each partition from
the operator. The scale and axis parameters specify how sub-plots are
positioned and labeled. Fig. 4(a) demonstrates faceting into columns.
To facilitate comparison, scales and guides for quantitative fields
are shared by default. This ensures that each facet visualizes the same
data domain. However, for ordinal scales we generate independent
scales by default to avoid unnecessary inclusion of empty categories,
akin to Polaris’ nest operator. When faceting by fiscal quarter and
visualizing per-month data in each cell, one likely wishes to see three
months per quarter, not twelve months of which nine are empty. Users
can override the default behavior via the resolve component.
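The faceting defaults described above can be sketched as follows. This is a simplified model under stated assumptions: records are plain dictionaries, the function name `facet` and its parameters are hypothetical, and the sales figures are invented to mirror the fiscal quarter example.

```python
def facet(rows, by, quantitative, ordinal):
    """Partition rows by the distinct values of `by`.

    The quantitative domain is shared across cells (computed over all rows),
    while the ordinal domain is independent (only categories in that cell).
    """
    partitions = {}
    for row in rows:
        partitions.setdefault(row[by], []).append(row)

    values = [row[quantitative] for row in rows]
    shared_domain = (min(values), max(values))

    cells = {}
    for key, part in partitions.items():
        categories = []
        for row in part:
            if row[ordinal] not in categories:
                categories.append(row[ordinal])
        cells[key] = {
            "data": part,
            "domains": {quantitative: shared_domain, ordinal: categories},
        }
    return cells


# Invented data: monthly sales, faceted by fiscal quarter.
sales = [
    {"quarter": "Q1", "month": "Jan", "sales": 10},
    {"quarter": "Q1", "month": "Feb", "sales": 30},
    {"quarter": "Q1", "month": "Mar", "sales": 20},
    {"quarter": "Q2", "month": "Apr", "sales": 50},
    {"quarter": "Q2", "month": "May", "sales": 40},
    {"quarter": "Q2", "month": "Jun", "sales": 60},
]
cells = facet(sales, by="quarter", quantitative="sales", ordinal="month")
```

Each cell shows only its own three months, yet every cell uses the same sales domain, so bar heights remain comparable across quarters.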
3.2.4 Repeat
The repeat operator generates multiple plots, but unlike facet allows
full replication of a data set in each cell. For example, repeat can be
used to create a scatterplot matrix (SPLOM), where each cell shows a
different 2D projection of the same data table. The signature is:
repeat(channel, values, scale, axis, view, resolve)
Similar to facet, the channel parameter indicates if plots should di-
vide by row or column. Rather than partition data according to a field,
this operator generates one plot for each entry in a list of values. Encodings
within the repeated view specification can refer to this provided
value to parameterize the plot.¹ By default, scales and axes are
independent, but legends are shared when data fields coincide. Like
facet, the scale and axis components allow users to override defaults
for how sub-plots are positioned and labeled, while resolve controls
resolution of scales and guides within the plots themselves.
¹ As the repeat operator requires parameterization of the inner view, it is
not strictly algebraic. It is possible to achieve algebraic “purity” via explicit re-
peated concatenation or by reformulating the repeat operator (e.g., by including
rewrite rules that apply to the inner view specification). However, we believe
the current syntax to be more usable and concise than these alternatives.
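One way to picture the repeat operator is as template expansion: every reference of the form {"repeat": "column"} inside the inner view is replaced by the current value. The sketch below follows the JSON shape of Fig. 4(b) but is only an illustration; the helper names `substitute` and `expand_repeat` are hypothetical, the x encoding is an assumed stand-in for the elided one in the figure, and only a single repeat channel is handled.

```python
def substitute(node, channel, value):
    """Return a copy of `node` with {"repeat": channel} replaced by `value`."""
    if isinstance(node, dict):
        if node == {"repeat": channel}:
            return value
        return {k: substitute(v, channel, value) for k, v in node.items()}
    if isinstance(node, list):
        return [substitute(v, channel, value) for v in node]
    return node


def expand_repeat(spec, channel="column"):
    """Generate one concrete view per entry in the repeat list for `channel`."""
    return [
        substitute(spec["spec"], channel, value)
        for value in spec["repeat"][channel]
    ]


repeated = {
    "repeat": {"column": ["temp_max", "precipitation"]},
    "spec": {
        "data": {"url": "data/weather.csv", "formatType": "csv"},
        "mark": "line",
        "encoding": {
            # Assumed x encoding; the figure elides it.
            "x": {"field": "date", "type": "temporal", "timeUnit": "month"},
            "y": {
                "field": {"repeat": "column"},
                "type": "quantitative",
                "aggregate": "mean",
            },
        },
    },
}
views = expand_repeat(repeated)
```

Every generated view keeps the full data source, in contrast to facet, which hands each cell only its partition of the data.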
3.3 Nested Views
Composition operators can be combined to create more complex
nested views or dashboards, with the output of one operator serving as
input to a subsequent operator. For instance, a layer of two unit views
might be repeated, and then concatenated with a different unit view.
The one exception is the layer operator, which, as previously noted,
only accepts unit views to ensure consistent plots. For concision,
two-dimensional faceted or repeated layouts can be achieved by applying
the operators to the row and column channels simultaneously. When
faceting a composite view, only the dataset targeted by the operator is
partitioned; any other datasets specified in sub-views are replicated.
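The nesting rules can be modeled with a few types, as in the sketch below. The class names are hypothetical and carry only enough structure to show which compositions are allowed: any view may be repeated or concatenated, but the layer operator rejects operands that are not unit views.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Unit:
    mark: str


@dataclass
class Layer:
    layers: List[Unit]

    def __post_init__(self):
        # The layer operator restricts its operands to unit views.
        if not all(isinstance(view, Unit) for view in self.layers):
            raise TypeError("layer only accepts unit views")


@dataclass
class Repeat:
    values: List[str]
    view: "View"


@dataclass
class HConcat:
    views: List["View"]


View = Union[Unit, Layer, Repeat, HConcat]

# A layer of two unit views is repeated, then concatenated with another unit.
dashboard = HConcat([
    Repeat(["temp_max", "precipitation"], Layer([Unit("bar"), Unit("line")])),
    Unit("point"),
])

# Layering a composite view is rejected.
try:
    Layer([HConcat([Unit("bar")])])
    layering_composite_allowed = True
except TypeError:
    layering_composite_allowed = False
```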
4 THE VEGA-LITE GRAMMAR OF INTERACTION
To support specification of interaction techniques, Vega-Lite extends
the definition of unit specifications to also include a set of selections.
Selections identify the set of points a user is interested in manipulat-
ing. In this section, we define the components of a selection, describe
a series of transforms for modifying selections, and detail how selec-
tions can parameterize visual encodings to make them interactive.
4.1 Selection Components
We formally define a selection as an eight-tuple:
selection := (name, type, predicate, domain|range,
event, init, transforms, resolve)
When an input event occurs, the selection is populated with backing
points of interest. These points are the minimal set needed to identify
all selected points. The selection type determines how many backing
values are stored, and how the predicate function uses them to deter-
mine the set of selected points. Supported types include a single point,
a list of points, or an interval of points.
A point selection is backed by a single datum, and its predicate tests
for an exact match against properties of this datum. It can also function
like a dynamic variable (or signal in Vega [23]), and can be invoked
as such. For example, it can be referenced by name within a filter ex-
pression, or its values used directly for particular encoding channels.
List selections, on the other hand, are backed by datasets into which
points are inserted, modified or removed as events fire. Lists express
discrete selections, as their predicates test for an exact match with at
least one value in the backing dataset. The order of points in a list
selection can be semantically meaningful, for example when a list se-
lection serves as an ordinal scale domain. Fig. 5 illustrates how points
are highlighted in a scatterplot using point and list selections.
Intervals are similar to list selections. They are backed by datasets,
but their predicates determine whether an argument falls within the
minimum and maximum extent defined by the backing points. Thus,
they express continuous selections. The compiler automatically adds
a rectangle mark, as shown in Fig. 6(a), to depict the selected inter-
val. Users can customize the appearance of this mark via the brush
keyword, or disable it altogether when defining the selection.
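The three selection types differ mainly in their backing values and predicates. The following sketch models each predicate as a Python closure; the function names and the car records are illustrative assumptions, not part of Vega-Lite.

```python
def point_predicate(backing):
    """Point selection: exact match against the single backing datum."""
    return lambda datum: datum == backing


def list_predicate(backing):
    """List selection: exact match with at least one backing datum."""
    return lambda datum: any(datum == b for b in backing)


def interval_predicate(backing, fields=None):
    """Interval selection: test whether a datum falls within the minimum
    and maximum extent spanned by the backing points, per field."""
    fields = fields or list(backing[0])
    extents = {
        f: (min(b[f] for b in backing), max(b[f] for b in backing))
        for f in fields
    }
    return lambda datum: all(
        lo <= datum[f] <= hi for f, (lo, hi) in extents.items()
    )


# Invented records in the spirit of the cars data set.
cars = [
    {"Horsepower": 130, "MPG": 18},
    {"Horsepower": 95, "MPG": 24},
    {"Horsepower": 115, "MPG": 25},
]

clicked = point_predicate(cars[0])
toggled = list_predicate([cars[0], cars[2]])
brushed = interval_predicate(
    [{"Horsepower": 90, "MPG": 20}, {"Horsepower": 120, "MPG": 30}]
)
```

Only two backing points are needed to describe the brushed rectangle, however many records fall inside it.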
Predicate functions enable a minimal set of backing points to rep-
resent the full space of selected points. For example, with predicates,

{
"data": {"url": "data/cars.json"},
"mark": "circle",
"select": {
"id": {"type": "point"}
},
"encoding": {
"x": {"field": "Horsepower", "type": "Q"},
"y": {"field": "MPG", "type": "Q"},
"color": [
{"if": "id", "field": "Origin", "type": "N"},
{"value": "grey"}
],
"size": {"value": 100}
}
}
(a) Highlight a single point on click
"id": {"type": "point", "project": {"fields": ["Origin"]}}
(d) Highlight a single Origin
"id": {"type": "list", "toggle": true}
(b) Highlight a list of individual points
"select": {
"id": {"type": "list", "toggle": true, "project": {"fields": ["Origin"]}}
}, ...
(e) Highlight a list of Origins
"id": {"type": "list", "on": "mouseover", "toggle": true}
(c) "Paintbrush": highlight multiple points on hover
Fig. 5. (a) Adding a single point selection to parameterize the fill color of a scatterplot’s circle mark. (b) Switching to a list selection, with the toggle
transform automatically added (true enables default shift-click event handling). (c) Specifying a custom event trigger: the first point is selected on
mouseover and subsequent points when the shift key is pressed (customizable via the toggle transform). (d) Using the project transform with a
single-point selection to highlight all points with a matching Origin, and (e) combining it with a list selection to select multiple Origins.
an interval selection need only be backed by two points: the minimum
and maximum values of the interval. While selection types provide
default definitions, predicates can be customized to concisely specify
an expressive space of selections. For example, a single point selection
with a custom predicate of the form datum.binned_price ==
selection.binned_price is sufficient for selecting all data
points that fall within a given bin.
By default, backing points lie in the data domain. For example,
if the user clicks a mark instance, the underlying data tuple is added
to the selection. If no tuple is available, event properties are passed
through inverse scale transforms. For example, as the user moves
their mouse within the data rectangle, the mouse position is inverted
through the x and y scales and stored in the selection. Defining selections
over data values, rather than visual properties, facilitates reuse
across distinct views; the views may specify different encodings
but are likely to share the same data domain. However, some
interactions are inherently about manipulating visual properties, for
example, interactively selecting the colors of a heatmap. For such
cases, users can define selections over the visual range instead. When
input events occur, visual elements or event properties are then stored.
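Passing event properties through inverse scale transforms can be illustrated with a linear scale. The sketch below is a simplified assumption (linear scales only, with a hypothetical `linear_scale` helper and invented extents); it shows how a mouse position in pixels becomes a backing point in the data domain.

```python
def linear_scale(domain, pixel_range):
    """Return (scale, invert) functions for a linear mapping."""
    (d0, d1), (r0, r1) = domain, pixel_range

    def scale(value):
        return r0 + (value - d0) / (d1 - d0) * (r1 - r0)

    def invert(pixel):
        return d0 + (pixel - r0) / (r1 - r0) * (d1 - d0)

    return scale, invert


# Invented extents: a 400 x 300 pixel plot; the y range is flipped because
# pixel coordinates grow downward.
scale_x, invert_x = linear_scale((0, 200), (0, 400))
scale_y, invert_y = linear_scale((0, 50), (300, 0))

# A mouse event at pixel (100, 150) is stored as a point in data space.
backing_point = {"Horsepower": invert_x(100), "MPG": invert_y(150)}
```

Because the stored point lives in the data domain, another view with different pixel dimensions can reuse it by applying its own forward scales.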
The particular events that update a selection are determined by
the platform a Vega-Lite specification is compiled on, and the input
modalities it supports. By default we use mouse events on desktops,
and touch events on mobile and tablet devices. A user can specify
alternate events using Vega’s event selector syntax [23]. For exam-
ple, Fig. 5(c) demonstrates how mouseover events are used to pop-
ulate a list selection. With the event selector syntax, multiple events
are specified using a comma (e.g., mousedown, mouseup adds
items to the selection when either event occurs). A sequence of events
is denoted with the right-combinator. For example, [mousedown,
mouseup] > mousemove selects all mousemove events that oc-
cur between a mousedown and a mouseup (otherwise known as
“drag” events). Events can also be filtered using square brackets (e.g.,
mousemove [event.pageY > 5] for events at the top of the
page) and throttled using braces (e.g., mousemove{100ms} popu-
lates a selection at most every 100 milliseconds).
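The semantics of sequencing, filtering, and throttling can be sketched over a recorded list of events. This is not Vega's event selector parser; the event dictionaries with `type`, `t` (milliseconds), and `pageY` keys, and the helper names, are assumptions made for illustration.

```python
def between(events, start, end, target):
    """[start, end] > target: target events occurring between start and end."""
    active = False
    for event in events:
        if event["type"] == start:
            active = True
        elif event["type"] == end:
            active = False
        elif active and event["type"] == target:
            yield event


def filtered(events, predicate):
    """target[expr]: keep only events for which the expression holds."""
    return (event for event in events if predicate(event))


def throttle(events, interval_ms):
    """target{interval}: pass at most one event per interval."""
    last = None
    for event in events:
        if last is None or event["t"] - last >= interval_ms:
            last = event["t"]
            yield event


# An invented recording of mouse events.
log = [
    {"type": "mousemove", "t": 0, "pageY": 3},
    {"type": "mousedown", "t": 10, "pageY": 3},
    {"type": "mousemove", "t": 20, "pageY": 4},
    {"type": "mousemove", "t": 60, "pageY": 8},
    {"type": "mousemove", "t": 130, "pageY": 9},
    {"type": "mouseup", "t": 140, "pageY": 9},
    {"type": "mousemove", "t": 150, "pageY": 9},
]

drag = list(between(log, "mousedown", "mouseup", "mousemove"))
drag_throttled = list(throttle(drag, 100))
drag_filtered = list(filtered(drag, lambda e: e["pageY"] > 5))
```

Mouse movement before the mousedown and after the mouseup is ignored, which is what distinguishes a drag from plain cursor motion.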
Finally, selections can be initialized with specific backing points
(we defer discussion of transforms and resolve to subsequent sections).
Vega-Lite provides a built-in mechanism to initialize list and interval
selections using the scales of the unit specification they are defined
in. Doing so populates the selection with the given scales’ domain or
range, as appropriate for the selection, and parameterizes the scales to
use the selection instead. By default, this occurs for the scales of the x
and y channels, but alternate scales can be specified by the user. This
step allows scale extents to be interactively manipulated, yet remain
automatically initialized by the input data.
4.2 Selection Transforms
Analogous to data transforms, selection transforms manipulate the
components of the selection they are applied to. For example, they
may perform operations on the backing points, alter a selection’s pred-
icate function, or modify the input events that update the selection.
We identify the following transforms as a minimal set to support both
common and custom interaction techniques:
project(fields, channels): Alters a selection’s predicate function to
determine inclusion by matching only the given fields. Some fields,
however, may be difficult for users to address directly (e.g., new fields
introduced due to inline binning or aggregation transformations). For
such cases, a list of channels may also be specified (e.g., color,
size). Fig. 5(d, e) demonstrate how project can be used to select
all points with matching Origin fields, for example. This transform
is also used to restrict interval selections to a particular dimension
(Fig. 6(c)) or to determine which scales initialize a selection.
toggle(event): This transform is automatically instantiated for
uninitialized list selections. When the event occurs, the corresponding
point is added or removed from a list selection’s backing dataset. By
default, the toggle event corresponds to the selection’s event but with
the shift key pressed. For example, in Fig. 5(b), additional points are
added to the list selection on shift-click (where click is the default
event for list selections). The selection in Fig. 5(c), however, speci-
fies a custom mouseover event. Thus, additional points are inserted
when the shift key is pressed and the mouse cursor hovers over a point.
translate(events, by): Offsets the spatial properties (or correspond-
ing data fields) of backing points by an amount determined by the
coordinates of the sequenced events. For example, on the desk-
top, drag events ([mousedown, mouseup] > mousemove) are
used and the offset corresponds to the difference between where the
mousedown and subsequent mousemove events occur. If no coor-
dinates are available (e.g., as with keyboard events), an optional by
argument should be specified. This transform respects the project
transform as well, restricting movement to the specified dimensions.
This transform is automatically instantiated for interval selections,
enabling movement of brushed regions (Fig. 6(b)) or panning of the
visualization when scale extents initialize the selection (Fig. 7).
zoom(event, factor): Applies a scale factor, determined by the event,
to the spatial properties (or corresponding data fields) of backing
points. An optional factor should be specified, if it cannot be deter-
mined from the events (e.g., when the arrow keys are pressed).
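The transforms described so far can be sketched as small functions over a selection's backing values. The names mirror the transforms, but the signatures are simplified assumptions: intervals are (min, max) pairs along one dimension, and `zoom` scales about a hypothetical `anchor` that defaults to the interval's midpoint.

```python
def project(backing, fields):
    """Predicate that determines inclusion by matching only the given fields."""
    return lambda datum: all(datum[f] == backing[f] for f in fields)


def toggle(backing, datum):
    """Add the datum to the backing list, or remove it if already present."""
    if datum in backing:
        return [b for b in backing if b != datum]
    return backing + [datum]


def translate(extent, delta):
    """Offset an interval by the distance between two sequenced events."""
    lo, hi = extent
    return (lo + delta, hi + delta)


def zoom(extent, factor, anchor=None):
    """Scale an interval about an anchor (assumed: midpoint by default)."""
    lo, hi = extent
    if anchor is None:
        anchor = (lo + hi) / 2
    return (anchor + (lo - anchor) * factor, anchor + (hi - anchor) * factor)


# Invented records for illustration.
a = {"Name": "a", "Origin": "USA"}
b = {"Name": "b", "Origin": "Japan"}
c = {"Name": "c", "Origin": "USA"}

same_origin = project(a, ["Origin"])
after_shift_click = toggle([a], b)
after_second_click = toggle(after_shift_click, a)
panned = translate((10, 20), 5)
zoomed_in = zoom((0, 100), 0.5)
```

When the interval is initialized from a scale's domain, applying `translate` or `zoom` to it and feeding the result back into the scale corresponds to panning or zooming the view.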
nearest(): Computes a Voronoi decomposition, and augments the
selection’s event processing, such that the data value or visual element

References
More filters
Journal ArticleDOI
TL;DR: This work shows how representational transparency improves expressiveness and better integrates with developer tools than prior approaches, while offering comparable notational efficiency and retaining powerful declarative components.
Abstract: Data-Driven Documents (D3) is a novel representation-transparent approach to visualization for the web Rather than hide the underlying scenegraph within a toolkit-specific abstraction, D3 enables direct inspection and manipulation of a native representation: the standard document object model (DOM) With D3, designers selectively bind input data to arbitrary document elements, applying dynamic transforms to both generate and modify content We show how representational transparency improves expressiveness and better integrates with developer tools than prior approaches, while offering comparable notational efficiency and retaining powerful declarative components Immediate evaluation of operators further simplifies debugging and allows iterative development Additionally, we demonstrate how D3 transforms naturally enable animation and interaction with dramatic performance improvements over intermediate representations

2,550 citations


"Vega-Lite: A Grammar of Interactive..." refers background in this paper

  • ...Moreover, by enabling this interaction through composable primitives (rather than a single, specific “pan and zoom” operator [4]), Vega-Lite also facilitates exploring related interactions in the design space....

    [...]

  • ...More expressive lower-level (and thus more verbose) grammars, including those of Protovis [3], D3 [4], and Vega [22], have been widely used for creating explanatory and highly-customized graphics....

    [...]

  • ...Low-level grammars such as Protovis [3], D3 [4], and Vega [22] are useful for explanatory data visualization or as a basis for customized analysis tools, as their primitives offer fine-grained control....

    [...]

Journal ArticleDOI
TL;DR: The approach is based on graphical perception—the visual decoding of information encoded on graphs—and it includes both theory and experimentation to test the theory, providing a guideline for graph construction.
Abstract: The subject of graphical methods for data analysis and for data presentation needs a scientific foundation. In this article we take a few steps in the direction of establishing such a foundation. Our approach is based on graphical perception—the visual decoding of information encoded on graphs—and it includes both theory and experimentation to test the theory. The theory deals with a small but important piece of the whole process of graphical perception. The first part is an identification of a set of elementary perceptual tasks that are carried out when people extract quantitative information from graphs. The second part is an ordering of the tasks on the basis of how accurately people perform them. Elements of the theory are tested by experimentation in which subjects record their judgments of the quantitative information on graphs. The experiments validate these elements but also suggest that the set of elementary tasks should be expanded. The theory provides a guideline for graph construction...

1,545 citations


"Vega-Lite: A Grammar of Interactive..." refers background in this paper

  • ...Voyager leverages perceptual effectiveness criteria [2, 8, 16] to rank candidate visual encodings....

Journal ArticleDOI
TL;DR: APT (A Presentation Tool) is an application-independent presentation tool that automatically designs effective graphical presentations (such as bar charts, scatter plots, and connected graphs) of relational information, based on the view that graphical presentations are sentences of graphical languages.
Abstract: The goal of the research described in this paper is to develop an application-independent presentation tool that automatically designs effective graphical presentations (such as bar charts, scatter plots, and connected graphs) of relational information. Two problems are raised by this goal: The codification of graphic design criteria in a form that can be used by the presentation tool, and the generation of a wide variety of designs so that the presentation tool can accommodate a wide variety of information. The approach described in this paper is based on the view that graphical presentations are sentences of graphical languages. The graphic design issues are codified as expressiveness and effectiveness criteria for graphical languages. Expressiveness criteria determine whether a graphical language can express the desired information. Effectiveness criteria determine whether a graphical language exploits the capabilities of the output medium and the human visual system. A wide variety of designs can be systematically generated by using a composition algebra that composes a small set of primitive graphical languages. Artificial intelligence techniques are used to implement a prototype presentation tool called APT (A Presentation Tool), which is based on the composition algebra and the graphic design criteria.

1,483 citations

Book
01 Nov 2010
TL;DR: This work is an unprecedented attempt to synthesize principles of graphic communication with the logic of standard rules applied to writing and topography in an array of more than 1,000 maps and diagrams.
Abstract: Originally published in French in 1967, "Semiology of Graphics" holds a significant place in the theory of information design. Founded on Jacques Bertin's practical experience as a cartographer, Part One of this work is an unprecedented attempt to synthesize principles of graphic communication with the logic of standard rules applied to writing and topography. Part Two brings Bertin's theory to life, presenting a close study of graphic techniques including shape, orientation, color, texture, volume, and size in an array of more than 1,000 maps and diagrams.

1,309 citations


"Vega-Lite: A Grammar of Interactive..." refers background in this paper

  • ...Voyager leverages perceptual effectiveness criteria [2, 8, 16] to rank candidate visual encodings....

Journal ArticleDOI
TL;DR: Seven general categories of interaction techniques widely used in Infovis are proposed, organized around a user's intent while interacting with a system rather than the low-level interaction techniques provided by a system.
Abstract: Even though interaction is an important part of information visualization (Infovis), it has garnered a relatively low level of attention from the Infovis community. A few frameworks and taxonomies of Infovis interaction techniques exist, but they typically focus on low-level operations and do not address the variety of benefits interaction provides. After conducting an extensive review of Infovis systems and their interactive capabilities, we propose seven general categories of interaction techniques widely used in Infovis: 1) Select, 2) Explore, 3) Reconfigure, 4) Encode, 5) Abstract/Elaborate, 6) Filter, and 7) Connect. These categories are organized around a user's intent while interacting with a system rather than the low-level interaction techniques provided by a system. The categories can act as a framework to help discuss and evaluate interaction techniques and hopefully lay an initial foundation toward a deeper understanding and a science of interaction.

1,018 citations


"Vega-Lite: A Grammar of Interactive..." refers background in this paper

  • ...The selection can also serve as the scale domain for a secondary view, thereby constructing an overview + detail interaction....

Frequently Asked Questions (10)
Q1. What are the contributions in "Vega-Lite: A Grammar of Interactive Graphics"?

The authors present Vega-Lite, a high-level grammar that enables rapid specification of interactive data visualizations. The Vega-Lite compiler automatically synthesizes requisite data flow and event handling logic, which users can override for further customization. 

One promising avenue for future work is to develop models and techniques to analogously recommend suitable interaction methods for given visualizations and underlying data types. 

The filterWith data transform applies the selection against the backing datasets such that only data values that fall within the selection are displayed. 
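
The filtering behavior described above can be illustrated with a rough Python sketch that models an interval selection as per-field ranges and keeps only the records inside it. The names `interval_predicate` and `filter_with` and the sample records are hypothetical and not part of Vega-Lite; the actual transform is compiled into a dataflow rather than executed like this.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, float]
Interval = Dict[str, Tuple[float, float]]  # field name -> (low, high)


def interval_predicate(selection: Interval) -> Callable[[Record], bool]:
    """Build an inclusion test: a record passes if every selected field is in range."""
    def test(record: Record) -> bool:
        return all(lo <= record[field] <= hi for field, (lo, hi) in selection.items())
    return test


def filter_with(data: List[Record], selection: Interval) -> List[Record]:
    """Keep only the records that fall within the selection (hypothetical helper)."""
    inside = interval_predicate(selection)
    return [r for r in data if inside(r)]


# Made-up sample data and brush extents, for illustration only.
cars = [
    {"horsepower": 95, "mpg": 24},
    {"horsepower": 150, "mpg": 16},
    {"horsepower": 70, "mpg": 31},
]
brush = {"horsepower": (60, 100), "mpg": (20, 35)}
visible = filter_with(cars, brush)
```

Here the selection doubles as a predicate function, which is the role the abstract assigns to it; an empty selection in this sketch passes every record through.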

Low-level grammars such as Protovis [3], D3 [4], and Vega [22] are useful for explanatory data visualization or as a basis for customized analysis tools, as their primitives offer fine-grained control. 

by): Offsets the spatial properties (or corresponding data fields) of backing points by an amount determined by the coordinates of the sequenced events. 
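
The offsetting behavior described above can be sketched in a few lines of Python: each pair of successive pointer events yields a delta, and both backing points of an interval are shifted by it. The function name `offset_interval` and the event coordinates are assumptions made for illustration only.

```python
from typing import Tuple

Point = Tuple[float, float]


def offset_interval(corners: Tuple[Point, Point], previous: Point, current: Point) -> Tuple[Point, Point]:
    """Shift both corners by the movement between two sequenced events."""
    dx = current[0] - previous[0]
    dy = current[1] - previous[1]
    (x0, y0), (x1, y1) = corners
    return ((x0 + dx, y0 + dy), (x1 + dx, y1 + dy))


brush = ((10.0, 20.0), (50.0, 60.0))
# Simulated drag: the pointer moves from (30, 30) to (35, 25), then to (40, 25).
events = [(30.0, 30.0), (35.0, 25.0), (40.0, 25.0)]
for prev, cur in zip(events, events[1:]):
    brush = offset_interval(brush, prev, cur)
```

The interval keeps its size while its position follows the drag, which is the behavior needed for moving a brush or panning a view.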

Once the necessary components have been built, the compiler performs a bottom-up traversal of the model tree to merge redundant components. 
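
A bottom-up merge of this kind can be sketched as a post-order tree walk. The `Model` class, the component names, and the merge rule used here (hoist a component to the parent when every child defines it identically) are assumptions for illustration; the source does not spell out the compiler's actual merge criteria.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Model:
    """A node in a hypothetical model tree; components map a name to a definition."""
    name: str
    components: Dict[str, str] = field(default_factory=dict)
    children: List["Model"] = field(default_factory=list)


def merge_redundant(node: Model) -> None:
    """Post-order walk: children are merged first, then shared components are hoisted."""
    for child in node.children:
        merge_redundant(child)
    if not node.children:
        return
    first = node.children[0].components
    # Assumed rule: a component is redundant if every child defines it identically.
    shared = {
        key: value
        for key, value in first.items()
        if all(c.components.get(key) == value for c in node.children)
    }
    for key, value in shared.items():
        if key in node.components:
            continue  # the parent already defines it; leave the children untouched
        node.components[key] = value
        for child in node.children:
            del child.components[key]


left = Model("left", {"scale:x": "linear[0,100]", "scale:y": "linear[0,50]"})
right = Model("right", {"scale:x": "linear[0,100]", "scale:y": "linear[0,10]"})
root = Model("concat", children=[left, right])
merge_redundant(root)
```

After the walk, the identical x scale lives once on the parent, while the differing y scales stay with their own views.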

To support expressive interaction methods, the authors first contribute an algebra to compose single-view Vega-Lite specifications into multi-view displays using layer, concatenate, facet, and repeat operators.

nearest(): Computes a Voronoi decomposition and augments the selection’s event processing such that the data value or visual element nearest the selection’s triggering event is selected (approximating a Bubble Cursor [11]).
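
A point's Voronoi cell contains exactly the locations that are closer to it than to any other point, so a nearest-neighbor lookup returns the same answer as locating the cell that contains the event. The brute-force Python sketch below relies on that equivalence rather than building the decomposition; `nearest_index` and the sample coordinates are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def nearest_index(points: List[Point], event: Point) -> int:
    """Return the index of the point whose Voronoi cell contains the event position."""
    return min(range(len(points)), key=lambda i: math.dist(points[i], event))


# Made-up mark positions and a pointer event, for illustration only.
marks = [(10.0, 10.0), (50.0, 40.0), (90.0, 15.0)]
selected = nearest_index(marks, (60.0, 30.0))
```

A precomputed decomposition would answer repeated lookups faster; the linear scan is only meant to show which element ends up selected.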

Their formal definitions are instantiated in a JSON (JavaScript Object Notation) syntax, as shown in Fig. 2. Given multiple unit specifications, composite views can be created using a set of composition operators.
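
To give a feel for the JSON form, the Python sketch below builds unit specifications as dictionaries and combines them with layer, concatenate, and facet operators. The keys follow the publicly documented Vega-Lite syntax as best understood here and may differ in detail from the version shown in the paper's Fig. 2; the helper `unit`, the data URL, and the field names are placeholders.

```python
import json


def unit(mark: str, x: str, y: str) -> dict:
    """Build a unit specification: one mark type plus its visual encodings."""
    return {
        "mark": mark,
        "encoding": {
            "x": {"field": x, "type": "quantitative"},
            "y": {"field": y, "type": "quantitative"},
        },
    }


data = {"url": "data/cars.json"}  # placeholder dataset location
points = unit("point", "horsepower", "mpg")
trend = unit("line", "horsepower", "mpg")

# Layer: draw both units in one shared coordinate space.
layered = {"data": data, "layer": [points, trend]}

# Concatenate: place two views side by side.
side_by_side = {"data": data, "hconcat": [points, unit("point", "weight", "mpg")]}

# Facet: repeat one unit per value of a nominal field.
faceted = {
    "data": data,
    "facet": {"column": {"field": "origin", "type": "nominal"}},
    "spec": points,
}

print(json.dumps(layered, indent=2))
```

Because every composite is itself plain JSON, composite views can be nested inside further operators in the same way.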

Specifying common techniques can be time-consuming, requiring tens of lines of JSON, and it is difficult to know how to adapt techniques in pursuit of alternative designs.