What are the contributions in "Vega-Lite: A Grammar of Interactive Graphics"?

The authors present Vega-Lite, a high-level grammar that enables rapid specification of interactive data visualizations. The Vega-Lite compiler automatically synthesizes requisite data flow and event handling logic, which users can override for further customization.

What is the future work in "Vega-Lite: A Grammar of Interactive Graphics"?

One promising avenue for future work is to develop models and techniques to analogously recommend suitable interaction methods for given visualizations and underlying data types.

What is the function that applies the selection against the backing datasets?

The filterWith data transform applies the selection against the backing datasets such that only data values that fall within the selection are displayed.

What is the function that offsets the spatial properties of the backing points?

by): Offsets the spatial properties (or corresponding data fields) of backing points by an amount determined by the coordinates of the sequenced events.

What is the process of merging components?

Once the necessary components have been built, the compiler performs a bottom-up traversal of the model tree to merge redundant components.

How does Vega-Lite support expressive interaction methods?

To support expressive interaction methods, the authors first contribute an algebra to compose single-view Vega-Lite specifications into multi-view displays using layer, concatenate, facet and repeat operators.

What is the function that augments the selection’s event processing?

nearest(): Computes a Voronoi decomposition, and augments the selection’s event processing, such that the data value or visual element nearest the selection’s triggering event is selected (approximating a Bubble Cursor [11]).

What is the syntax for creating a composite view?

Their formal definitions are instantiated in a JSON (JavaScript Object Notation) syntax, as shown in Fig. 2. Given multiple unit specifications, composite views can be created using a set of composition operators.

How can you adapt techniques to a different design?

Specifying common techniques can be time-consuming, requiring tens of lines of JSON, and it is difficult to know how to adapt techniques in pursuit of alternative designs.

(Open Access) Vega-Lite: A Grammar of Interactive Graphics (2017) | Arvind Satyanarayan

Q: What are the primary features of a low-level grammar?

Low-level grammars such as Protovis [3], D3 [4], and Vega [22] are useful for explanatory data visualization or as a basis for customized analysis tools, as their primitives offer fine-grained control.

Vega-Lite: A Grammar of Interactive Graphics

Arvind Satyanarayan, Dominik Moritz, Kanit Wongsuphasawat, and Jeffrey Heer

Fig. 1. Example visualizations authored with Vega-Lite. From left-to-right: layered line chart combining raw and average values, dual-axis layered bar and line chart, brushing and linking in a scatterplot matrix, layered cross-filtering, and an interactive index chart.

Abstract: We present Vega-Lite, a high-level grammar that enables rapid specification of interactive data visualizations. Vega-Lite combines a traditional grammar of graphics, providing visual encoding rules and a composition algebra for layered and multi-view displays, with a novel grammar of interaction. Users specify interactive semantics by composing selections. In Vega-Lite, a selection is an abstraction that defines input event processing, points of interest, and a predicate function for inclusion testing. Selections parameterize visual encodings by serving as input data, defining scale extents, or by driving conditional logic. The Vega-Lite compiler automatically synthesizes requisite data flow and event handling logic, which users can override for further customization. In contrast to existing reactive specifications, Vega-Lite selections decompose an interaction design into concise, enumerable semantic units. We evaluate Vega-Lite through a range of examples, demonstrating succinct specification of both customized interaction methods and common techniques such as panning, zooming, and linked selection.

Index Terms: Information visualization, interaction, systems, toolkits, declarative specification

1 INTRODUCTION

Grammars of graphics span a gamut of expressivity. Low-level grammars such as Protovis [3], D3 [4], and Vega [22] are useful for explanatory data visualization or as a basis for customized analysis tools, as their primitives offer fine-grained control. However, for exploratory visualization, higher-level grammars such as ggplot2 [27], and grammar-based systems such as Tableau (née Polaris [24]), are typically preferred as they favor conciseness over expressiveness. Analysts rapidly author partial specifications of visualizations; the grammar applies default values to resolve ambiguities, and synthesizes low-level details to produce visualizations.

High-level languages can also enable search and inference over the space of visualizations. For example, Wongsuphasawat et al. [30] introduced Vega-Lite to power the Voyager visualization browser. By providing a smaller surface area than the lower-level Vega language, Vega-Lite makes systematic enumeration and ranking of data transformations and visual encodings more tractable.

However, existing high-level languages provide limited support for interactivity. An analyst can, at most, enable a predefined set of common techniques (linked selections, panning & zooming, etc.) or parameterize their visualization with dynamic query widgets [21]. For custom, direct-manipulation interaction they must instead turn to imperative event handling callbacks. Recognizing that callbacks can be error-prone to author, and require complex static analysis to reason about, Satyanarayan et al. [23] recently formulated declarative interaction primitives for Vega. While these additions facilitate programmatic generation and retargeting of interactive visualizations, they remain low-level. Verbose specification impedes rapid authoring and hinders systematic exploration of alternative designs.

• Arvind Satyanarayan is with Stanford University. E-mail: arvindsatya@cs.stanford.edu.
• Dominik Moritz, Kanit Wongsuphasawat, and Jeffrey Heer are with the University of Washington. E-mails: {domoritz, kanitw, jheer}@uw.edu.

In this paper we extend Vega-Lite to enable concise, high-level specification of interactive data visualizations. To support expressive interaction methods, we first contribute an algebra to compose single-view Vega-Lite specifications into multi-view displays using layer, concatenate, facet and repeat operators. Vega-Lite’s compiler infers how input data should be reused across constituent views, and whether scale domains should be unioned or remain independent.

Second, we contribute a high-level interaction grammar. With Vega-Lite, an interaction design is composed of selections: visual elements or data points that are chosen when input events occur. Selections parameterize visual encodings by serving as input data, defining scale extents, and providing predicate functions for testing or filtering items. For example, a rectangular “brush” is a common interaction technique for data visualization. In Vega-Lite, a brush is defined as a selection that holds two data points that correspond to its extents (e.g., captured when the mouse button is pressed and as it is dragged, respectively). Its predicate can be used to highlight visual elements that fall within the brushed region, and to materialize a dataset as input to other encodings. The selection can also serve as the scale domain for a secondary view, thereby constructing an overview + detail interaction.

For added expressivity, Vega-Lite provides a series of operators to transform a selection. Transforms can be triggered by input events as well, and manipulate selection points or predicate functions. For example, a toggle transform adds or removes a point from the selection, while a project transform modifies the predicate to define inclusion over specified data fields.

The Vega-Lite compiler synthesizes a low-level Vega specification [22] with the requisite data flow, and default event handling logic that a user can override. Through a range of examples, we demonstrate that Vega-Lite brings the advantages of high-level specification to interactive visualization. Common methods, including linked selection, panning, and zooming, as well as custom techniques (drawn from an established taxonomy [31]) can be concisely described. Moreover, selections, transformations, and their application to visual encodings decompose interaction into a parametric design space. We show how each of these parameters can be systematically varied to generate alternate interaction techniques for a given set of visual encodings. Such enumeration can be useful to explore alternative designs, and can aid higher-level reasoning about interaction, for example by recommending suitable interaction techniques as part of a design tool.

2 RELATED WORK

Vega-Lite builds on prior work on grammars of graphics, visualization systems, and techniques for interactive selection and querying.

2.1 Grammar-Based Visual Encoding

Since the initial publication of Wilkinson’s The Grammar of Graphics [29] in 1999, formal grammars for statistical graphics have grown increasingly popular as a way to succinctly specify visualizations. Wilkinson’s work was quickly followed by the Stanford Polaris system [24], later commercialized as Tableau. Hadley Wickham’s popular ggplot2 [27] and ggvis [20] packages implement variants of Wilkinson’s model in the R statistical language. These tools eschew chart templates, which offer limited means of customization, in favor of combinatorial building blocks. Abstracting data models, graphical marks, visual encoding channels, scales and guides (i.e., axes and legends) yields a more expressive design space, and allows analysts to rapidly construct graphics for exploratory analysis [13]. Concise specification is achieved in part through ambiguity: users may omit details such as scale transforms (e.g., linear or log) or color palettes, which are then filled in using a rule-based system of smart defaults. More expressive lower-level (and thus more verbose) grammars, including those of Protovis [3], D3 [4], and Vega [22], have been widely used for creating explanatory and highly-customized graphics.

The design of Vega-Lite is heavily influenced by these works. Drawing from Wilkinson’s grammar and Polaris/Tableau, Vega-Lite similarly represents basic plots using a set of encoding definitions that map data attributes to visual channels such as position, color, shape, and size, and may include common data transformations such as binning, aggregation, sorting, and filtering. Drawing from Vega, Vega-Lite uses a portable JSON (JavaScript Object Notation) syntax that permits generation from a variety of programming languages. Vega-Lite specifications are compiled to full Vega specifications, hence the expressive gamut of Vega-Lite is a strict subset of that of Vega. As we will later demonstrate, Vega-Lite sacrifices some expressiveness for dramatic gains in the conciseness and clarity of specification.

In terms of visual encoding, Vega-Lite differs most from other high-level grammars in its approach to multiple view displays. Each of these grammars supports faceting (or nesting) to construct trellis plots in which each cell similarly visualizes a different partition of the data. Both Wilkinson’s grammar and Polaris/Tableau achieve this through a table algebra over data fields, which in turn determines spatial subdivisions. Tableau additionally supports the construction of multi-view dashboards via a different mechanism, with each view backed by a separate specification. In contrast, we contribute a view algebra: starting with unit specifications that define a single plot, Vega-Lite expresses composite views using operators for layering, horizontal or vertical concatenation, faceting, and parameterized repetition. When applicable, these operators will merge scale domains and properly align constituent views. Disparate views can also be combined into arbitrary dashboards, all within a unified algebraic model.

2.2 Specifying Interactions in Visualization Systems

Despite the central role of interaction in effective data visualization [13, 19], little work has been done to develop a grammar for specifying interaction techniques. Wilkinson’s grammar includes no notion of interaction. Tableau supports common interaction techniques, but relies on mechanisms external to the visual encoding grammar. Early systems like GGobi [25] support common techniques as well, and provide imperative APIs for custom methods. However, such APIs make easy tasks needlessly complex, burdening developers with learning low-level execution details. More recent systems, including Protovis, D3, and VisDock [7], offer a typology of common techniques that can be applied to a visualization. Such top-down approaches, however, limit customization and composition. For example, D3’s interactors encapsulate event processing, making it difficult to combine them if their events conflict (e.g., if dragging triggers brushing and panning).

The prior work perhaps most closely related to Vega-Lite is the Reactive Vega language [23]. Reactive Vega draws on Functional Reactive Programming techniques to formulate composable, declarative interaction primitives for data visualization. Reactive Vega models input events as continuous data streams. To succinctly define event streams of interest, Vega employs an event selector syntax, which Vega-Lite also uses for customized event logic. Event streams, in turn, drive dynamic variables called signals. Signals parameterize the remainder of the visualization specification, endowing it with reactive semantics. When a new event fires, it propagates to dependent signals; visual encodings that use them are automatically re-evaluated and re-rendered. This reactive approach is not only capable of expressing a diverse set of interactions [23], it is performant as well [22], with interactive performance at least twice as fast as the equivalent D3 program.
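The propagation behavior described above can be illustrated with a deliberately naive sketch. It is not Reactive Vega's dataflow implementation; the `Signal` class, its method names, and the `mouse_x` and `label` variables are hypothetical, chosen only to show an event value flowing to dependent signals.

```python
class Signal:
    """A dynamic variable that is recomputed when an upstream signal changes."""

    def __init__(self, value=None, compute=None, deps=()):
        self.deps = list(deps)
        self.compute = compute
        self.dependents = []
        for dep in self.deps:
            dep.dependents.append(self)
        # Derived signals take their initial value from their dependencies.
        self.value = compute(*(d.value for d in self.deps)) if compute else value

    def set(self, value):
        """Record a new event value and propagate it downstream."""
        self.value = value
        self._propagate()

    def _propagate(self):
        # Re-evaluate every dependent, then let each one notify its own dependents.
        for signal in self.dependents:
            signal.value = signal.compute(*(d.value for d in signal.deps))
            signal._propagate()


# Hypothetical usage: an input signal driving a derived one.
mouse_x = Signal(0)
label = Signal(compute=lambda x: f"x={x}", deps=[mouse_x])
mouse_x.set(42)
```

A production dataflow would schedule updates in topological order so that each signal is evaluated once per event; this sketch simply recurses.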

However, the resulting reactive specifications are low-level and verbose. Specifying common techniques can be time-consuming, requiring tens of lines of JSON, and it is difficult to know how to adapt techniques in pursuit of alternative designs. In contrast, Vega-Lite is a higher-level specification language, with primitives that decompose interaction design into a parametric space. Common methods typically require 1-2 lines of code, and design variations can be explored by systematically enumerating defined properties. Nevertheless, Reactive Vega provides a performant runtime and an “assembly language” to which Vega-Lite specifications are compiled.

2.3 Interactive Selection and Querying

Selection, often in the form of users clicking or lassoing visual items of interest, is a fundamental operation in user interfaces and has been well-studied in the context of data visualization. For example, in Snap-Together Visualization [17], multiple views are coordinated via “primary-” and “foreign-key actions,” which propagate selected data tuples from one view to the others. Wilhelm [28] describes the need for such “indirect object manipulation” methods as an axiom of interactive data displays. Chen’s compound brushing [6] provides a visual dataflow language for specifying a rich space of transformations of brush selections. More recently, Brunel [5] provides a special #selection data field that is dynamically populated with the elements a user interacts with, and can be used to link multiple views or filter input data. Similarly, RStudio’s Shiny [21], an imperative web application layer, provides brushedPoints and nearestPoints functions which can be used throughout an R script to operate on selected elements.

Other systems have studied formally representing selections as data queries [28]. For example, brushing interactions in VQE [9] generate extensional queries that enumerate all items of interest; a form-based interface enables specification of intensional (declarative) queries. Individual point and brush selections in DEVise [15], known as visual queries, map to a declarative structure and are used to link together multiple views. With VIQING [18], rectangular “rubber band” selections are modeled as range extents, and views can be dropped on top of each other to join their underlying datasets. Heer et al. [12] demonstrate that by modeling a selection as a declarative query, interactive “query relaxation” can successively capture more items of interest.

Vega-Lite builds on this work by richly integrating an interactive selection abstraction with the primitives of visual encoding grammars. Vega-Lite selections are populated with one or more points of interest, in response to user interaction. Extensible predicate functions map selections to declarative queries, and allow a minimal set of “backing” points to represent the full space of selected points. Additional operators can transform a selection’s predicate or backing points (e.g., offsetting them to translate a brush selection or perform panning). Selections then parameterize visual encodings by serving as input data, defining scale extents, or using predicates to test or filter items. The end result is an enumerable, combinatorial design space of interactive statistical graphics, with concise specification of not only linking interactions, but panning, zooming, and custom techniques as well.

(a) Line chart with aggregation

{
  "data": {"url": "data/weather.csv", "formatType": "csv"},
  "mark": "line",
  "encoding": {
    "x": {"field": "date", "type": "temporal", "timeUnit": "month"},
    "y": {"field": "temp_max", "type": "quantitative", "aggregate": "mean"},
    "color": {"field": "location", "type": "nominal"}
  }
}

(b) Correlation between wind and temperature

{
  "data": {"url": "data/weather.csv", "formatType": "csv"},
  "mark": "point",
  "encoding": {
    "x": {"field": "temp_max", "type": "quantitative", "bin": true},
    "y": {"field": "wind", "type": "quantitative", "bin": true},
    "size": {"field": "*", "aggregate": "count"},
    "color": {"field": "location", "type": "nominal"}
  }
}

(c) Stacked bar chart of weather types

{
  "data": {"url": "data/weather.csv", "formatType": "csv"},
  "mark": "bar",
  "encoding": {
    "x": {"field": "location", "type": "nominal"},
    "y": {"field": "*", "type": "quantitative", "aggregate": "count"},
    "color": {"field": "weather", "type": "nominal"}
  }
}

Fig. 2. Vega-Lite unit specifications visualizing weather data. These examples demonstrate varied mark types and data transformations.

3 THE VEGA-LITE GRAMMAR OF GRAPHICS

Vega-Lite combines a grammar of graphics with a novel grammar of interaction. In this section, we describe Vega-Lite’s basic visual encoding constructs and an algebra for view composition. In prior work, Wongsuphasawat et al. [30] introduced the simplest Vega-Lite specification (here referred to as a unit specification), which defines a single Cartesian plot with a specific mark type to encode data (e.g., bars, lines, plotting symbols). Given multiple unit plots, we introduce layer, concat, facet, and repeat operators to provide an algebra for constructing composite views. This algebra can express layered plots, trellis plots, and arbitrary multiple view displays. Each operator is responsible for combining or aligning underlying scales and axes as needed.

3.1 Unit Specification

A unit specification describes a single Cartesian plot, with a backing data set, a given mark-type, and a set of one or more encoding definitions for visual channels such as position (x, y), color, size, etc. Formally, a unit view consists of a four-tuple:

unit := (data, transforms, mark-type, encodings)

The data definition identifies a data source, a relational table consisting of records (rows) with named attributes (columns). This data table can be subject to a set of transforms, including filtering and adding derived fields via formulas. The mark-type specifies the geometric object used to visually encode the data records. Legal values include bar, line, area, text, rule for reference lines, and plotting symbols (point & tick). The encodings determine how data attributes map to the properties of visual marks. Formally, an encoding is a seven-tuple:

encoding := (channel, field, data-type, value, functions, scale, guide)

Available visual encoding channels include spatial position (x, y), color, shape, size, and text. An order channel controls sorting of stacked elements (e.g., for stacked bar charts and the layering order of line charts). A path order channel determines the sequence in which points of a line or area mark are connected to each other. A detail channel includes additional group-by fields in aggregate plots.

The field string denotes a data attribute to visualize, along with a given data-type (one of nominal, ordinal, quantitative or temporal). Alternatively, one can specify a constant literal value to serve as the data field. The data field can additionally be transformed using functions such as binning, aggregation (sum, average, etc.), and sorting.

An encoding may also specify properties of a scale that maps from the data domain to a visual range, and a guide (axis or legend) for visualizing the scale. If not specified, Vega-Lite will automatically populate default properties based on the channel and data-type. For x and y channels, either a linear scale (for quantitative data) or an ordinal scale (for ordinal and nominal data) is instantiated, along with an axis. For color, size, and shape channels, suitable palettes and legends are generated. For example, quantitative color encodings use a single-hue luminance ramp, while nominal color encodings use a categorical palette with varied hues. Our default assignments largely follow the model of prior systems [24, 30].
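As a rough illustration of the defaulting rules just described, the following sketch maps a channel and data-type to a default scale type and guide. It is a reading of the prose, not the compiler's code; the function names `default_scale_type` and `default_guide` are hypothetical, and cases the text does not spell out (such as temporal fields) are rejected rather than guessed.

```python
def default_scale_type(channel: str, data_type: str) -> str:
    """Default scale type for positional channels, as stated in the text."""
    if channel not in ("x", "y"):
        raise ValueError("only the x and y scale defaults are spelled out in the text")
    if data_type == "quantitative":
        return "linear"
    if data_type in ("ordinal", "nominal"):
        return "ordinal"
    # The passage does not state a default for other data types.
    raise ValueError(f"no default stated for data type {data_type!r}")


def default_guide(channel: str) -> str:
    """Positional channels get an axis; color, size, and shape get a legend."""
    if channel in ("x", "y"):
        return "axis"
    if channel in ("color", "size", "shape"):
        return "legend"
    raise ValueError(f"no default guide stated for channel {channel!r}")


scale = default_scale_type("x", "quantitative")  # "linear"
guide = default_guide("color")                   # "legend"
```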

Unit specifications are capable of expressing a variety of common, useful plots of both raw and aggregated data. Examples include bar charts, histograms, dot plots, scatter plots, line graphs, and area graphs. Our formal definitions are instantiated in a JSON (JavaScript Object Notation) syntax, as shown in Fig. 2.

3.2 View Composition Algebra

Given multiple unit specifications, composite views can be created using a set of composition operators. Here we describe the set of supported operators. We use the term view to refer to any Vega-Lite specification, whether it is a unit or composite specification.

3.2.1 Layer

The layer operator accepts multiple unit specifications to produce a view in which subsequent charts are plotted on top of each other. For example, a layered view could consist of one layer showing a histogram of a full data set, and another overlaying a histogram of a filtered subset (Fig. 11). The signature of the operator is:

layer([unit_1, unit_2, ...], resolve)

To create a layered view, we produce shared scales (if their types match) and merge guides by default. For example, we compute the union of the data domains for the x or y channel, for which we then generate a single scale. We believe this is a useful default for producing coherent and comparable layers. However, Vega-Lite cannot enforce that a unioned domain is semantically meaningful. To prohibit layering of composite views with incongruent internal structures, the layer operator restricts its operands to be unit views.

(a) Dual axis layered chart

{
  "layers": [
    {
      "data": {"url": "data/weather.csv", "formatType": "csv"},
      "transform": {"filter": "datum.location === 'Seattle'"},
      "mark": "bar",
      "encoding": {
        "x": {"field": "date", "type": "temporal", "timeUnit": "month"},
        "y": {
          "field": "precipitation", "type": "quantitative",
          "aggregate": "mean", "axis": {"grid": false}
        },
        "color": {"value": "#77b2c7"}
      }
    }, {
      "data": {"url": "data/weather.csv", "formatType": "csv"},
      "transform": {"filter": "datum.location === 'Seattle'"},
      "mark": "line",
      "encoding": {
        "x": {"field": "date", "type": "temporal", "timeUnit": "month"},
        "y": {
          "field": "temp_max", "type": "quantitative",
          "aggregate": "mean", "axis": {"grid": false}
        },
        "color": {"value": "#ce323c"}
      }
    }
  ],
  "resolve": {
    "y": {"scale": "independent"}
  }
}

(b) Vertical concatenation of two charts

{
  "vconcat": [
    { ... },
    {
      "data": {"url": "data/weather.csv", "formatType": "csv"},
      "transform": {"filter": "datum.precipitation > 0"},
      "mark": "point",
      "encoding": {
        "y": {"field": "location", "type": "nominal"},
        "x": {"field": "*", "type": "quantitative", "aggregate": "count"},
        "color": {"field": "date", "type": "temporal", "timeUnit": "year"}
      }
    }
  ]
}

Fig. 3. (a) A dual axis chart that layers lines for temperature on top of bars for precipitation; each layer uses an independent y-scale. (b) The temperature line chart from Fig. 2(a) concatenated with rainy day counts in New York and Seattle; scales and guides for each plot are independent.

(a) Faceted charts

{
  "data": {"url": "data/weather.csv", "formatType": "csv"},
  "facet": {
    "column": {"field": "location", "type": "nominal"}
  },
  "spec": {
    "mark": "line",
    "encoding": {
      "x": { ... },
      "y": { ... },
      "color": { ... }
    }
  }
}

(b) Repeated charts

{
  "repeat": {
    "column": ["temp_max", "precipitation"]
  },
  "spec": {
    "data": {"url": "data/weather.csv", "formatType": "csv"},
    "mark": "line",
    "encoding": {
      "x": { ... },
      "y": {
        "field": {"repeat": "column"},
        "type": "quantitative",
        "aggregate": "mean"
      },
      "color": { ... }
    }
  }
}

Fig. 4. (a) Weather data faceted by location; the y-axis is shared, and the underlying scale domains unioned, to enable easier comparison. (b) Repetition of different measures across columns; the y channel references the column template parameter to vary the encoding.

To override the default behavior, users can specify strategies to resolve scales and guides using tuples of the form (channel, scale|guide, resolution), where resolution is one of independent or union. Independent scales and guides for each layer produce a dual-axis view, as shown in the layered plots in Fig. 3(a).
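The two resolution strategies for layered scales can be sketched as follows. This is a minimal model that assumes each layer's quantitative domain is a `(min, max)` pair; the function names and the numeric extents are illustrative and not taken from the paper or from the weather dataset.

```python
def union_extent(extents):
    """Union of (min, max) extents: the smallest minimum and largest maximum."""
    return (min(lo for lo, _ in extents), max(hi for _, hi in extents))


def resolve_layer_domains(extents, resolution="union"):
    """Return the scale domain each layer ends up using."""
    if resolution == "union":
        shared = union_extent(extents)
        return [shared for _ in extents]  # every layer uses one shared scale
    if resolution == "independent":
        return list(extents)              # each layer keeps its own scale
    raise ValueError(f"unknown resolution {resolution!r}")


# Illustrative extents for two layers (made-up numbers).
precipitation, temperature = (0.0, 6.0), (8.0, 27.0)
shared = resolve_layer_domains([precipitation, temperature])
dual_axis = resolve_layer_domains([precipitation, temperature], "independent")
```

With the union default both layers are drawn against the domain (0.0, 27.0); the independent setting keeps the two extents apart, which is what produces a dual-axis chart.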

3.2.2 Concatenation

To place views side-by-side, Vega-Lite provides operators for horizontal and vertical concatenation. The signatures for these operators are:

hconcat([view_1, view_2, ...], resolve)
vconcat([view_1, view_2, ...], resolve)

If aligned spatial channels have matching data fields (e.g., the y channels in an hconcat use the same field), a shared scale and axis are used. Axis composition facilitates comparison across views and optimizes the underlying implementation. Fig. 3(b) concatenates the line chart from Fig. 2(a) with a dot plot, using independent scales.
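The matching-field condition for concatenated views can be expressed as a small check. This is a sketch of the stated rule only; `shares_scale` is a hypothetical helper, and the views are plain dicts shaped like the JSON in Fig. 2.

```python
def shares_scale(view_a: dict, view_b: dict, channel: str) -> bool:
    """True when both views encode the same data field on the given channel."""
    field_a = view_a.get("encoding", {}).get(channel, {}).get("field")
    field_b = view_b.get("encoding", {}).get(channel, {}).get("field")
    return field_a is not None and field_a == field_b


line = {"mark": "line", "encoding": {"y": {"field": "temp_max"}}}
dots = {"mark": "point", "encoding": {"y": {"field": "location"}}}
same_y = shares_scale(line, line, "y")   # True: identical y field
mixed_y = shares_scale(line, dots, "y")  # False: different y fields
```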

3.2.3 Facet

While concatenation allows composition of arbitrary views, one often wants to set up multiple views in a parameterized fashion. The facet operator produces a trellis plot [1] by subsetting the data by the distinct values of a field. The signature of the facet operator is:

facet(channel, data, field, view, scale, axis, resolve)

The channel indicates if sub-plots should be laid out vertically (row) or horizontally (column). The given data source is partitioned using distinct values of the field. The view specification provides a template for the sub-plots, inheriting the backing data for each partition from the operator. The scale and axis parameters specify how sub-plots are positioned and labeled. Fig. 4(a) demonstrates faceting into columns.

To facilitate comparison, scales and guides for quantitative fields are shared by default. This ensures that each facet visualizes the same data domain. However, for ordinal scales we generate independent scales by default to avoid unnecessary inclusion of empty categories, akin to Polaris’ nest operator. When faceting by fiscal quarter and visualizing per-month data in each cell, one likely wishes to see three months per quarter, not twelve months of which nine are empty. Users can override the default behavior via the resolve component.
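The faceting defaults can be illustrated with a small sketch that partitions rows by a field, then derives a shared quantitative domain and per-facet ordinal domains. The helper names and the sales figures are hypothetical; only the quarter and month example comes from the text.

```python
def facet_partition(rows, field):
    """Group rows by the distinct values of `field`, keeping first-seen order."""
    parts = {}
    for row in rows:
        parts.setdefault(row[field], []).append(row)
    return parts


def shared_quantitative_domain(parts, field):
    """One (min, max) domain across every facet, so cells are comparable."""
    values = [row[field] for rows in parts.values() for row in rows]
    return (min(values), max(values))


def independent_ordinal_domains(parts, field):
    """Per-facet category lists, so no cell shows empty categories."""
    return {key: list(dict.fromkeys(row[field] for row in rows))
            for key, rows in parts.items()}


months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
rows = [{"quarter": "Q1" if i < 3 else "Q2", "month": m, "sales": i + 1}
        for i, m in enumerate(months)]  # made-up data
parts = facet_partition(rows, "quarter")
y_domain = shared_quantitative_domain(parts, "sales")
x_domains = independent_ordinal_domains(parts, "month")
```

Each quarter's cell lists only its own three months, while the sales domain spans both quarters.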

3.2.4 Repeat

The repeat operator generates multiple plots, but unlike facet allows full replication of a data set in each cell. For example, repeat can be used to create a scatterplot matrix (SPLOM), where each cell shows a different 2D projection of the same data table. The signature is:

repeat(channel, values, scale, axis, view, resolve)

Similar to facet, the channel parameter indicates if plots should divide by row or column. Rather than partition data according to a field, this operator generates one plot for each entry in a list of values. Encodings within the repeated view specification can refer to this provided value to parameterize the plot.¹ By default, scales and axes are independent, but legends are shared when data fields coincide. Like facet, the scale and axis components allow users to override defaults for how sub-plots are positioned and labeled, while resolve controls resolution of scales and guides within the plots themselves.

¹ As the repeat operator requires parameterization of the inner view, it is not strictly algebraic. It is possible to achieve algebraic “purity” via explicit repeated concatenation or by reformulating the repeat operator (e.g., by including rewrite rules that apply to the inner view specification). However, we believe the current syntax to be more usable and concise than these alternatives.
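One way to picture the repeat operator is as template expansion: each entry in the value list yields a copy of the inner view with the `{"repeat": channel}` reference replaced. The sketch below assumes the JSON shape of Fig. 4(b) and a single repeated channel; `substitute` and `expand_repeat` are hypothetical names, and this is not how the Vega-Lite compiler is implemented.

```python
def substitute(node, channel, value):
    """Return a copy of `node` with every {"repeat": channel} replaced by `value`."""
    if isinstance(node, dict):
        if node == {"repeat": channel}:
            return value
        return {key: substitute(child, channel, value) for key, child in node.items()}
    if isinstance(node, list):
        return [substitute(child, channel, value) for child in node]
    return node


def expand_repeat(repeat_spec, channel):
    """Produce one concrete view per entry in the repeated value list."""
    return [substitute(repeat_spec["spec"], channel, value)
            for value in repeat_spec["repeat"][channel]]


spec = {
    "repeat": {"column": ["temp_max", "precipitation"]},
    "spec": {
        "mark": "line",
        "encoding": {
            "y": {"field": {"repeat": "column"},
                  "type": "quantitative", "aggregate": "mean"},
        },
    },
}
views = expand_repeat(spec, "column")
```

Because `substitute` builds new dicts rather than mutating, the template in `spec` stays reusable for every entry in the list.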

3.3 Nested Views

Composition operators can be combined to create more complex nested views or dashboards, with the output of one operator serving as input to a subsequent operator. For instance, a layer of two unit views might be repeated, and then concatenated with a different unit view. The one exception is the layer operator, which, as previously noted, only accepts unit views to ensure consistent plots. For concision, two dimensional faceted or repeated layouts can be achieved by applying the operators to the row and column channels simultaneously. When faceting a composite view, only the dataset targeted by the operator is partitioned; any other datasets specified in sub-views are replicated.
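The nesting rule, including the restriction that layers accept only unit views, can be sketched as a recursive check over a view tree. The keys `layers`, `hconcat`, `vconcat`, and `spec` follow the JSON shown in Figs. 3 and 4; treating any dict with a `mark` key as a unit view is a simplifying assumption, and `validate_view` is a hypothetical helper.

```python
def is_unit(view: dict) -> bool:
    """Assumption: a unit view is recognized by its top-level "mark" key."""
    return "mark" in view


def validate_view(view: dict) -> None:
    """Raise ValueError if a layer operand anywhere in the tree is not a unit view."""
    for child in view.get("layers", []):
        if not is_unit(child):
            raise ValueError("layer operands must be unit views")
    for key in ("hconcat", "vconcat"):
        for child in view.get(key, []):
            validate_view(child)
    if "spec" in view:  # inner template of a facet or repeat
        validate_view(view["spec"])


unit = {"mark": "line", "encoding": {}}
# A layer of two unit views, repeated, then concatenated with another unit view.
dashboard = {
    "hconcat": [
        {"repeat": {"column": ["temp_max", "wind"]}, "spec": {"layers": [unit, unit]}},
        unit,
    ]
}
validate_view(dashboard)  # passes silently
```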

4 THE VEGA-LITE GRAMMAR OF INTERACTION

To support specification of interaction techniques, Vega-Lite extends the definition of unit specifications to also include a set of selections. Selections identify the set of points a user is interested in manipulating. In this section, we define the components of a selection, describe a series of transforms for modifying selections, and detail how selections can parameterize visual encodings to make them interactive.

4.1 Selection Components

We formally define a selection as an eight-tuple:

selection := (name, type, predicate, domain|range, event, init, transforms, resolve)

When an input event occurs, the selection is populated with backing points of interest. These points are the minimal set needed to identify all selected points. The selection type determines how many backing values are stored, and how the predicate function uses them to determine the set of selected points. Supported types include a single point, a list of points, or an interval of points.

A point selection is backed by a single datum, and its predicate tests for an exact match against properties of this datum. It can also function like a dynamic variable (or signal in Vega [23]), and can be invoked as such. For example, it can be referenced by name within a filter expression, or its values used directly for particular encoding channels.

List selections, on the other hand, are backed by datasets into which points are inserted, modified or removed as events fire. Lists express discrete selections, as their predicates test for an exact match with at least one value in the backing dataset. The order of points in a list selection can be semantically meaningful, for example when a list selection serves as an ordinal scale domain. Fig. 5 illustrates how points are highlighted in a scatterplot using point and list selections.

Intervals are similar to list selections. They are backed by datasets, but their predicates determine whether an argument falls within the minimum and maximum extent defined by the backing points. Thus, they express continuous selections. The compiler automatically adds a rectangle mark, as shown in Fig. 6(a), to depict the selected interval. Users can customize the appearance of this mark via the brush keyword, or disable it altogether when defining the selection.
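The three selection types differ mainly in what their predicates test. The following is a minimal sketch under our own assumptions, not the Vega-Lite compiler's implementation: the function names, the fields argument, and the sample records are hypothetical.

```python
def point_predicate(backing, fields):
    """Point: exact match against the single backing datum."""
    def test(datum):
        return backing is not None and all(datum[f] == backing[f] for f in fields)
    return test


def list_predicate(backing, fields):
    """List: exact match with at least one datum in the backing dataset."""
    def test(datum):
        return any(all(datum[f] == b[f] for f in fields) for b in backing)
    return test


def interval_predicate(backing, fields):
    """Interval: value lies within the min/max extent of the backing points."""
    def test(datum):
        return bool(backing) and all(
            min(b[f] for b in backing) <= datum[f] <= max(b[f] for b in backing)
            for f in fields
        )
    return test


# Hypothetical sample records.
cars = [
    {"Name": "a", "Horsepower": 130, "MPG": 18},
    {"Name": "b", "Horsepower": 95, "MPG": 24},
    {"Name": "c", "Horsepower": 65, "MPG": 31},
]

is_point = point_predicate(cars[1], ["Name"])
in_list = list_predicate([cars[0], cars[2]], ["Name"])
in_brush = interval_predicate(
    [{"Horsepower": 60, "MPG": 20}, {"Horsepower": 100, "MPG": 35}],
    ["Horsepower", "MPG"],
)

point_hits = [c["Name"] for c in cars if is_point(c)]
list_hits = [c["Name"] for c in cars if in_list(c)]
brush_hits = [c["Name"] for c in cars if in_brush(c)]
```

Note that the interval is backed by only two points, yet its predicate selects every record inside the extent.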

    {
      "data": {"url": "data/cars.json"},
      "mark": "circle",
      "select": {
        "id": {"type": "point"}
      },
      "encoding": {
        "x": {"field": "Horsepower", "type": "Q"},
        "y": {"field": "MPG", "type": "Q"},
        "color": [
          {"if": "id", "field": "Origin", "type": "N"},
          {"value": "grey"}
        ],
        "size": {"value": 100}
      }
    }
    (a) Highlight a single point on click

    "id": {"type": "list", "toggle": true}
    (b) Highlight a list of individual points

    "id": {"type": "list", "on": "mouseover", "toggle": true}
    (c)

    "id": {"type": "point", "project": {"fields": ["Origin"]}}
    (d) Highlight a single Origin

    "select": {
      "id": {"type": "list", "toggle": true, "project": {"fields": ["Origin"]}}
    }, ...
    (e) Highlight a list of Origins

Fig. 5. (a) Adding a single point selection to parameterize the fill color of a scatterplot’s circle mark. (b) Switching to a list selection, with the toggle transform automatically added (true enables default shift-click event handling). (c) Specifying a custom event trigger: the first point is selected on mouseover and subsequent points when the shift key is pressed (customizable via the toggle transform). (d) Using the project transform with a single-point selection to highlight all points with a matching Origin, and (e) combining it with a list selection to select multiple Origins.

Predicate functions enable a minimal set of backing points to represent the full space of selected points. For example, with predicates, an interval selection need only be backed by two points: the minimum and maximum values of the interval. While selection types provide default definitions, predicates can be customized to concisely specify an expressive space of selections. For example, a single point selection with a custom predicate of the form datum.binned_price == selection.binned_price is sufficient for selecting all data points that fall within a given bin.
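The binned-price predicate can be mimicked in a few lines. This is a sketch only: the price field, the bin width of 10, and the helper names are assumptions made for illustration, and the binning is computed on the fly rather than stored in a binned_price field.

```python
import math


def bin_start(value, step):
    """Lower edge of the fixed-width bin containing value."""
    return math.floor(value / step) * step


def binned_predicate(selected, step=10):
    """Custom predicate: a datum matches if it shares the selected datum's bin."""
    selected_bin = bin_start(selected["price"], step)
    return lambda datum: bin_start(datum["price"], step) == selected_bin


# Hypothetical records; one backing point selects every record in its bin.
items = [{"price": 12}, {"price": 18}, {"price": 25}, {"price": 9.99}]
same_bin = binned_predicate({"price": 15})
hits = [item["price"] for item in items if same_bin(item)]
```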

By default, backing points lie in the data domain. For example, if the user clicks a mark instance, the underlying data tuple is added to the selection. If no tuple is available, event properties are passed through inverse scale transforms. For example, as the user moves their mouse within the data rectangle, the mouse position is inverted through the x and y scales and stored in the selection. Defining selections over data values, rather than visual properties, facilitates reuse across distinct views: views may specify different encodings, but they are likely to share the same data domain. However, some interactions are inherently about manipulating visual properties, for example interactively selecting the colors of a heatmap. For such cases, users can define selections over the visual range instead. When input events occur, visual elements or event properties are then stored.
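To make the inverse scale step concrete, the sketch below inverts a mouse position through two linear scales. The linear_scale helper, the domains, and the pixel ranges are assumptions for illustration; real scales may be non-linear or discrete.

```python
def linear_scale(domain, range_):
    """Return (scale, invert) functions for a linear mapping domain -> range."""
    d0, d1 = domain
    r0, r1 = range_

    def scale(value):
        return r0 + (value - d0) / (d1 - d0) * (r1 - r0)

    def invert(pixel):
        return d0 + (pixel - r0) / (r1 - r0) * (d1 - d0)

    return scale, invert


# Hypothetical scales: x maps 0..200 to 0..400 px; y is flipped because
# pixel coordinates grow downward.
x_scale, x_invert = linear_scale((0, 200), (0, 400))
y_scale, y_invert = linear_scale((0, 50), (300, 0))

# A mouse event inside the data rectangle, stored in the data domain.
mouse = {"x": 100, "y": 150}
backing_point = {
    "Horsepower": x_invert(mouse["x"]),
    "MPG": y_invert(mouse["y"]),
}
```

Because the stored point is in data space, another view with different pixel ranges can test its own data against it without conversion.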

The particular events that update a selection are determined by the platform a Vega-Lite specification is compiled on, and the input modalities it supports. By default we use mouse events on desktops, and touch events on mobile and tablet devices. A user can specify alternate events using Vega’s event selector syntax [23]. For example, Fig. 5(c) demonstrates how mouseover events are used to populate a list selection. With the event selector syntax, multiple events are specified using a comma (e.g., mousedown, mouseup adds items to the selection when either event occurs). A sequence of events is denoted with the right-combinator (>). For example, [mousedown, mouseup] > mousemove selects all mousemove events that occur between a mousedown and a mouseup (otherwise known as “drag” events). Events can also be filtered using square brackets (e.g., mousemove[event.pageY > 5] for events at the top of the page) and throttled using braces (e.g., mousemove{100ms} populates a selection at most every 100 milliseconds).
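The semantics of the sequence and throttle operators can be approximated over a recorded event stream. This sketch does not parse the selector syntax; the between and throttle helpers and the event dictionaries are hypothetical stand-ins for what [mousedown, mouseup] > mousemove and mousemove{100ms} describe.

```python
def between(events, start, end, target):
    """Yield target events that occur after a start event and before the next end."""
    active = False
    for event in events:
        if event["type"] == start:
            active = True
        elif event["type"] == end:
            active = False
        elif event["type"] == target and active:
            yield event


def throttle(events, interval_ms):
    """Yield an event only if at least interval_ms passed since the last one yielded."""
    last = None
    for event in events:
        if last is None or event["time"] - last >= interval_ms:
            last = event["time"]
            yield event


# Hypothetical recorded stream (time in milliseconds).
stream = [
    {"type": "mousemove", "time": 0},
    {"type": "mousedown", "time": 10},
    {"type": "mousemove", "time": 20},
    {"type": "mousemove", "time": 60},
    {"type": "mousemove", "time": 130},
    {"type": "mouseup", "time": 140},
    {"type": "mousemove", "time": 150},
]

drag = list(between(stream, "mousedown", "mouseup", "mousemove"))
throttled = list(throttle(drag, 100))
```

Only the three moves between the press and the release count as drag events, and throttling at 100 ms then keeps the first and the last of them.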

Finally, selections can be initialized with specific backing points (we defer discussion of transforms and resolve to subsequent sections). Vega-Lite provides a built-in mechanism to initialize list and interval selections using the scales of the unit specification they are defined in. Doing so populates the selection with the given scales’ domain or range, as appropriate for the selection, and parameterizes the scales to use the selection instead. By default, this occurs for the scales of the x and y channels, but alternate scales can be specified by the user. This step allows scale extents to be interactively manipulated, yet remain automatically initialized by the input data.

4.2 Selection Transforms

Analogous to data transforms, selection transforms manipulate the components of the selection they are applied to. For example, they may perform operations on the backing points, alter a selection’s predicate function, or modify the input events that update the selection. We identify the following transforms as a minimal set to support both common and custom interaction techniques:

project(fields, channels): Alters a selection’s predicate function to determine inclusion by matching only the given fields. Some fields, however, may be difficult for users to address directly (e.g., new fields introduced due to inline binning or aggregation transformations). For such cases, a list of channels may also be specified (e.g., color, size). Fig. 5(d, e) demonstrate how project can be used to select all points with matching Origin fields, for example. This transform is also used to restrict interval selections to a particular dimension (Fig. 6(c)) or to determine which scales initialize a selection.

toggle(event): This transform is automatically instantiated for uninitialized list selections. When the event occurs, the corresponding point is added or removed from a list selection’s backing dataset. By default, the toggle event corresponds to the selection’s event but with the shift key pressed. For example, in Fig. 5(b), additional points are added to the list selection on shift-click (where click is the default event for list selections). The selection in Fig. 5(c), however, specifies a custom mouseover event. Thus, additional points are inserted when the shift key is pressed and the mouse cursor hovers over a point.

translate(events, by): Offsets the spatial properties (or corresponding data fields) of backing points by an amount determined by the coordinates of the sequenced events. For example, on the desktop, drag events ([mousedown, mouseup] > mousemove) are used and the offset corresponds to the difference between where the mousedown and subsequent mousemove events occur. If no coordinates are available (e.g., as with keyboard events), an optional by argument should be specified. This transform respects the project transform as well, restricting movement to the specified dimensions. This transform is automatically instantiated for interval selections, enabling movement of brushed regions (Fig. 6(b)) or panning of the visualization when scale extents initialize the selection (Fig. 7).

zoom(event, factor): Applies a scale factor, determined by the event, to the spatial properties (or corresponding data fields) of backing points. An optional factor should be specified if it cannot be determined from the events (e.g., when the arrow keys are pressed).

nearest(): Computes a Voronoi decomposition, and augments the selection’s event processing, such that the data value or visual element nearest the selection’s triggering event is selected (approximating a Bubble Cursor [11]).
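The effect of these transforms on a selection's components can be sketched with small pure functions. None of this is the Vega-Lite implementation: the function signatures are hypothetical, intervals are reduced to a one-dimensional (min, max) pair, zooming is assumed to scale about an anchor point, and nearest uses a brute-force distance search in place of the Voronoi decomposition named above.

```python
def project(fields):
    """Build a predicate factory that matches backing points on the given fields only."""
    def make_predicate(backing):
        return lambda datum: any(
            all(datum[f] == b[f] for f in fields) for b in backing
        )
    return make_predicate


def toggle(backing, point):
    """Remove the point from a list selection if present, otherwise add it."""
    if point in backing:
        return [b for b in backing if b != point]
    return backing + [point]


def translate(extent, delta):
    """Offset both backing points of an interval by delta (e.g. a drag distance)."""
    lo, hi = extent
    return (lo + delta, hi + delta)


def zoom(extent, factor, anchor=None):
    """Scale an interval about an anchor (its midpoint by default)."""
    lo, hi = extent
    if anchor is None:
        anchor = (lo + hi) / 2
    return (anchor + (lo - anchor) * factor, anchor + (hi - anchor) * factor)


def nearest(points, x, y):
    """Brute-force stand-in for the Voronoi lookup: the point closest to (x, y)."""
    return min(points, key=lambda p: (p["x"] - x) ** 2 + (p["y"] - y) ** 2)


# Usage with hypothetical values.
by_origin = project(["Origin"])([{"Origin": "Japan"}])
matches = by_origin({"Origin": "Japan", "MPG": 30})

selected = toggle([], {"id": 1})
selected = toggle(selected, {"id": 2})
selected = toggle(selected, {"id": 1})  # a second toggle removes the point

moved = translate((10, 20), 5)
zoomed = zoom((0, 100), 0.5)
closest = nearest([{"x": 0, "y": 0}, {"x": 10, "y": 10}], 8, 9)
```

Treating each transform as a function from selection components to selection components mirrors the description above: project changes the predicate, toggle, translate and zoom change the backing points, and nearest changes which element an event resolves to.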
