Accepted for publication by IEEE. ©2013 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Visual Trafﬁc Jam Analysis Based on Trajectory Data

Zuchao Wang, Min Lu, Xiaoru Yuan, Member, IEEE, Junping Zhang, Member, IEEE, and Huub van de Wetering

Fig. 1. An overview of our system. (a) The spatial view shows the trafﬁc jam density on each road of Beijing by color, and one

trafﬁc jam propagation graph is highlighted in black. (b) The embedded road speed views show the speed patterns of four roads in the

highlighted black propagation graph. (c) The graph list view shows a list of sorted trafﬁc jam propagation graphs. (d) The multi-faceted

ﬁlter view allows ﬁltering of propagation graphs by time and size. (e) The graph projection view shows the topological relationship of

graph clusters, where graphs in the same cluster have very similar topology.

Abstract—In this work, we present an interactive system for visual analysis of urban trafﬁc congestion based on GPS trajectories.

For these trajectories we develop strategies to extract and derive trafﬁc jam information. After cleaning the trajectories, they are

matched to a road network. Subsequently, trafﬁc speed on each road segment is computed and trafﬁc jam events are automatically

detected. Spatially and temporally related events are concatenated in, so-called, trafﬁc jam propagation graphs. These graphs form a

high-level description of a trafﬁc jam and its propagation in time and space. Our system provides multiple views for visually exploring

and analyzing the trafﬁc condition of a large city as a whole, on the level of propagation graphs, and on road segment level. Case

studies with 24 days of taxi GPS trajectories collected in Beijing demonstrate the effectiveness of our system.

Index Terms—Trafﬁc visualization, trafﬁc jam propagation

1 INTRODUCTION

Trafﬁc jams form a serious problem in modern cities. They bring about

considerable economic loss, increase travel times and aggravate pollu-

tion. Governments spend a great amount of money trying to monitor

and understand trafﬁc jams, but this seems difﬁcult due to the com-

plex nature of trafﬁc jams. One of the complexities is unpredictability.

Sometimes trafﬁc jams occur, sometimes not. Another complexity is

that trafﬁc jams are dynamic and interrelated. Trafﬁc jams can, for

instance, propagate from one road on to other roads. Due to these complexities, a fully automatic analysis of traffic jams is hard, requiring considerable experience and knowledge. In this work, we present a visual analysis system to study the patterns of traffic jams and their propagation. Our system combines automatic computation and human knowledge. We first extract traffic jams from GPS trajectories, from which we construct propagation graphs. Then we design a visual interface to explore both traffic jam patterns and the propagation of traffic jams. As far as we know, there is no previous work in visual analytics that deeply studies these traffic jam aspects.

• Zuchao Wang is with Key Laboratory of Machine Perception (Ministry of Education), and School of EECS, Peking University. E-mail: zuchao.wang@pku.edu.cn.

• Min Lu is with Key Laboratory of Machine Perception (Ministry of Education), School of EECS, and Center for Computational Science and Engineering, Peking University. E-mail: lumin.vis@gmail.com.

• Xiaoru Yuan is with Key Laboratory of Machine Perception (Ministry of Education), School of EECS, and Center for Computational Science and Engineering, Peking University. E-mail: xiaoru.yuan@pku.edu.cn.

• Junping Zhang is with Shanghai Key Laboratory of Intelligent Information Processing, and School of Computer Science, Fudan University. E-mail: jpzhang@fudan.edu.cn.

• Huub van de Wetering is with Department of Mathematics and Computer Science, Technische Universiteit Eindhoven. E-mail: h.v.d.wetering@tue.nl.

Manuscript received 31 March 2013; accepted 1 August 2013; posted online 13 October 2013; mailed on 4 October 2013.

For information on obtaining reprints of this article, please send e-mail to: tvcg@computer.org.

Traditional trafﬁc jam detection methods are based on road side sen-

sors, like induction loops or radar [31] and monitor only a few critical

points [25]. A GPS based method, however, can theoretically monitor

a complete road network. This enables us to better study trafﬁc jam

propagation. Furthermore, the installation of expensive road side devices is not required. Previous GPS based traffic jam detection methods either study only separate jams [10, 34], and therefore give scattered traffic information about the road network, or are unable to attribute traffic jams to specific roads [29, 15]. Traffic jam data from these

works are not suitable for visual exploration. In this work, we derive a

road bound trafﬁc jam dataset from GPS trajectory data, and structure

the detected trafﬁc jams by building propagation graphs. Our data is

more suitable for visual exploration.

In the visual interface, as shown in Figure 1, we allow users to

make multilevel exploration, from trafﬁc patterns on a single road to

the trafﬁc jam condition in a whole city. We support various ﬁltering

techniques to query speciﬁc kinds of propagation graphs, and we allow

users to compare them.

Our major contributions are:

• We present a process to automatically extract trafﬁc jams from

noisy GPS trajectory data. Our data is structured and road bound,

therefore suitable for visual exploration.

• We design a visual interface to explore the trafﬁc jams and their

propagation. The exploration is multilevel, and supports ﬁltering

and comparison of trafﬁc jam propagation graphs.

2 RELATED WORK

Our related work section is split into an analysis subsection on trafﬁc

event detection and two visual subsections on trafﬁc visualization and

propagation graph visualization.

2.1 Trafﬁc Event Detection

Perhaps the most commercialized technique to detect trafﬁc events is

by radar like sensors [31]. Analyzing video streams from a roadside

camera also helps to understand the trafﬁc [53, 22]. However, both

techniques require the installation of high-cost devices, and can only

monitor at fixed positions along the road. In contrast, the GPS technique is cheaper and able to monitor the whole road network. Therefore, much recent research focuses on analyzing GPS trajectories.

The data mining community has long been working on trajectory

data. They have studied different kinds of patterns [23, 26]. See Zheng

et al.’s book [54] for an overview.

We are most interested in trafﬁc jam detection. Trafﬁc jams are im-

portant trafﬁc events. They are usually characterized by long travel

time, or, equivalently, low speed. Many trafﬁc jam detection algo-

rithms are based on road speed calculation [18], or low speed vehicle

cluster detection [10, 34]. Bauze et al.’s trafﬁc monitor system also de-

tects trafﬁc jams by low speed [13]. We use a road speed calculation

based method to detect trafﬁc jams. It provides trafﬁc situation data,

not only in a few congested places and during a few periods, but it

does so in much larger regions and periods. Such data is more suitable

for free exploration. Our work is different from the works cited above,

in that we focus on the propagation of trafﬁc jams.

Krogh et al. [25] have studied green waves on road stretches with

signalized intersections. However, this differs from trafﬁc jam propa-

gation, and their scenario is much simpler than our city network. Re-

cently, Zheng et al. published a paper [29] studying the causal inter-

actions of trafﬁc outliers. They ﬁrst segment a city into medium sized

regions, then study the trafﬁc ﬂows on the links between the regions.

Outliers are detected and arranged as outlier trees. Our propagation

graph construction uses the same idea, but our focus is on trafﬁc jam

events, not on outliers. A more important difference is that, trafﬁc jams

originally happen on roads, not on links between regions. So when

users detect an anomalous link, it is difﬁcult to explore and explain

their results. Although in later work [15] they correlate the anomalous

links to anomalous routes, it is still unclear where anomalies occur on

long routes. In our results, we can directly see trafﬁc jams propagating

on roads, as they actually are. Our model is more suitable for visual

analysis.

The trafﬁc condition in road networks can also be modelled by

Probabilistic Graph Models (PGM) [27, 37]. This technique is able

to learn the temporal change of trafﬁc conditions for each road, and

the spatial dependency between roads from historical data. Therefore,

it can simulate the macro trafﬁc, and can make predictions of future

trafﬁc conditions. However, here we mainly want to summarize the

historical data, and support user explorations. Our trafﬁc jam detection

and propagation graph construction algorithm already summarize the

historical trafﬁc and their results are easy to understand and explore.

In contrast, a PGM, although being more generic, has parameters that

are harder to tune, and is harder to understand, explore and evaluate.

2.2 Trafﬁc Visualization

A major type of trafﬁc data is trajectory data. In this case, all trajectory

visualization techniques can be used. An overview of all trajectories

is the ﬁrst step in their visual analysis. It often requires aggregation.

A density map [51] provides an overview by visual aggregation. It

plots the trajectory density and helps identify “hot” spots. Density

maps may also show the density of multi-variant trajectories [42] and

extracted events [41]. Different from density maps, techniques such

as spatial aggregation [12] and spatial-temporal aggregation [6, 43]

provide overview by data aggregation. They discretize the spatial and

temporal dimension into many regions, ﬂows, or bins. Statistics are

performed on each discrete spatial-temporal unit, e.g. a region in a

time bin. This aggregated information is then visualized.

Micro-behavior analysis is another common task. In this case, tra-

jectories have to be treated individually. Hurter et al. [21] show how to

select trajectories with specific position and attributes. Guo et al. [19]

present a system to analyze the trafﬁc at a road intersection. Liu et

al. [28] present a system to study the route diversity in a city.

Temporal information is critical in trajectory visualization. The space-time cube [20, 24, 7] uses the z-axis to represent time, but suffers from

visual clutter. A trajectory may also be represented as a timeline [9,

16], but the spatial information is then largely lost. Events can be

extracted from these time series [11].

For trajectory attributes, Tominski et al. [46] and von Landesberger

et al. [49] have addressed their visual analysis problem.

Some of the above cited works study the events of trajectories.

However, none of them focuses on the interactions of these events,

and none of them focuses on trafﬁc jams. Our work aims to deeply

study trafﬁc jams and their interactions.

Although most of the trafﬁc visualizations use trajectory data, some

use other types of sensor data. Pack et al. [35] study trafﬁc incidents

data. They design a linked view interface to visualize the spatial, tem-

poral and multi-dimensional aspects of the incidents. Users are al-

lowed to select, ﬁlter and cluster these incidents. Piringer et al. [38]

study the surveillance videos in a tunnel. They automatically detect

and prioritize different types of events and mark them in space and

time. For each event, users can check the original videos. Both focus on traffic events, but neither focuses on the interactions of these events.

Our work studies these interactions, and we use trajectory data, which

requires a different event detection algorithm.

2.3 Propagation Graph Visualization

Propagation graphs may be visualized by animations and small mul-

tiples. However, these techniques have limitations [39]. Therefore,

people also designed other visual metaphors. The spatial, temporal

and topological aspects of the propagation graph can be visualized by

separate techniques, like FlowMap [48], Massive Sequence View [47]

and graph layout [44]. It remains challenging to visualize all aspects

in one view. In our work, we have applied the animation, ﬂow map

and graph layout techniques.

3 OVERVIEW

In this section, we ﬁrst present the design requirements. After that we

describe the input data, and deﬁne the trafﬁc jam data model. Finally

we present the system workﬂow.

3.1 Design Requirement

To study trafﬁc jams, we need a data model, according to which we

extract and structure the trafﬁc jam data. It should satisfy the following

three requirements:

R1: Complete We require that basic trafﬁc jam information is

available, including location and time. Besides, speed information

should always be there, even when there is no trafﬁc jam. It helps users

to understand how trafﬁc condition changes, and to check whether the

trafﬁc jam detection is appropriate.

R2: Structured We require that the trafﬁc jams in the model are

interrelated: we are not only interested in individual trafﬁc jams at

separate locations and time, but also how these trafﬁc jams are related,

and how they propagate from one location to another.

R3: Road bound We require the trafﬁc jams to be deﬁned on roads,

and to propagate along the road network, as they actually happen. This helps users to associate the traffic jam data with their real

world knowledge during visual exploration.

A visual interface to explore and analyze the data model, should

satisfy the following requirements.

R4: Informative We require that the system shows all critical infor-

mation of the trafﬁc jams, including location, time, propagation path,

size of the propagation, and the road speed.

R5: Multi-level We require that the trafﬁc jams can be explored at

multiple levels. The lowest level should be the congestion behavior on

a single road segment. Above that we require to analyze the trafﬁc jam

propagation among different road segments, and to compare different

propagations. On the highest level, we require to study the congestion

status of the whole city.

R6: Filterable We require to ﬁlter trafﬁc jams according to spatial,

temporal properties, and size of propagation. In this way, we can focus

on speciﬁc types of trafﬁc jams, and make deeper analysis of them.

3.2 Description of Input Data

We use GPS trajectory data and road network data as input, to cal-

culate and analyze trafﬁc jams. GPS trajectory data contains many

trajectories. Each trajectory consists of a list of sampling points.

Each sampling point has a position record (longitude, latitude for 2D data), time stamp time, speed magnitude velocity, moving direction vangle, and optionally a set of attributes $a_0, \ldots, a_{n-1}$. These sampling points are sorted in time ascending order. Each part between

two consecutive sampling points is called a trajectory segment.
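The trajectory layout described above can be sketched as a small record type. This is only an illustration: the field names follow the text (time, velocity, vangle), while the class names, the units, and the tuple used for the optional attributes are assumptions.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class SamplingPoint:
    longitude: float        # degrees east
    latitude: float         # degrees north
    time: float             # time stamp (assumed: seconds since some epoch)
    velocity: float         # speed magnitude
    vangle: float           # moving direction
    attributes: Tuple = ()  # optional attributes, e.g. (passengerState,)


@dataclass
class Trajectory:
    points: List[SamplingPoint] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Sampling points are kept in ascending time order.
        self.points.sort(key=lambda p: p.time)

    def segments(self) -> Iterator[Tuple[SamplingPoint, SamplingPoint]]:
        """Yield each pair of consecutive sampling points (a trajectory segment)."""
        return zip(self.points, self.points[1:])
```

A trajectory with n sampling points therefore has n - 1 trajectory segments.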

A road network consists of nodes and ways. Each node has a spatial

position. It can be either an intersection or a shape point. Each way

contains an ordered list of nodes that deﬁnes the spatial position and

shape of the way. A way can be a one-way street or a two-way street.

Our GPS dataset is a real taxi dataset recorded in the city of Beijing,

which is prone to trafﬁc jams. The dataset contains the GPS trajecto-

ries of 28,519 taxis. Estimated from a government report [5], they

include 43% of all licensed taxis in Beijing, and account for 7% of

the trafﬁc ﬂow volume within Beijing’s 4th Ring. The dataset spans

24 days, from March 2nd to 25th, 2009. It contains 379,107,927 sam-

pling points, and the data size is 34.5GB. The only attribute is the

boolean passengerState, indicating whether there are passengers in

the taxi. The sampling rate is one point per 30 seconds. However,

60% of the sampling points are missing, so, two consecutive points

frequently have a time difference of over 3 minutes.

Our road network dataset comes from a query from Open-

StreetMap’s jXAPI [17]. We extract all roads in the spatial range

from 116.109E to 116.673E and from 39.743N to 40.119N. This gives

40.9MB of data, containing 169,171 nodes, and 35,422 ways.

3.3 Trafﬁc Jam Data Model

Our model structures three types of information: the road speed, the

trafﬁc jams, and the relationships between trafﬁc jams. In our model,

the time is discretized into time bins. The two directions on a way are

treated separately, each as a directed way (abbrev. as dWay). A dWay

and a time bin are the smallest spatial and temporal unit.

The road speed information gives a basic description of the road

condition. For each dWay, at each time bin, there will be a speed

record. The speed value can be empty if it can not be estimated.

The trafﬁc jam information summarizes all the detected trafﬁc jams.

It consists of a list of traffic jam events (abbrev. as events). An event is defined as a triple $\langle d, t_0, t_1 \rangle$, where the dWay $d$ is the location of the event and the integers $t_0$ and $t_1$ with $t_0 \le t_1$ are the start and end time bin of the event, respectively. So, the whole event takes place in the interval $[t_0..t_1]$ that spans $t_1 - t_0 + 1$ time bins.

The relationships between traffic jams are characterized by traffic jam propagation graphs (abbrev. as graphs). A graph is a directed network of events, defined as $\langle V, E \rangle$, where $V$ is a set of events, and $E$ is a set of directed links between events. It is both acyclic and connected. A directed link is notated as $e_0 \to e_1$, meaning that event $e_0$ leads to $e_1$, or equivalently, $e_1$ is caused by $e_0$. An event can be caused by 0 or more events and can also lead to 0 or more events. For each graph, a spatial propagation path (abbrev. as path) can be derived, which is a directed network of dWays. It can have cycles. A link $d_0 \to d_1$ in a path means that the corresponding traffic jam propagates from dWay $d_0$ to $d_1$.
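As a rough sketch of this data model, events and graphs might be represented as follows. The class and field names (Event, PropagationGraph, t0, t1) are illustrative assumptions, and dWays are reduced to string identifiers.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Event:
    d: str    # identifier of the dWay where the jam occurs
    t0: int   # start time bin
    t1: int   # end time bin, with t0 <= t1

    def num_bins(self) -> int:
        """Number of time bins spanned by the event."""
        return self.t1 - self.t0 + 1


@dataclass
class PropagationGraph:
    events: Set[Event]
    links: Set[Tuple[Event, Event]]   # (cause, effect) pairs

    def path(self) -> Set[Tuple[str, str]]:
        """Spatial propagation path: dWay-to-dWay links derived from event links."""
        return {(cause.d, effect.d) for cause, effect in self.links}
```

Because several event links can map to the same pair of dWays, and events on one dWay can recur at different times, the derived path may contain cycles even though the event graph does not.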

3.4 Work Flow

Our visual analysis work consists of two phases. The ﬁrst phase is

preprocessing, in which we start from the input data, and extract trafﬁc

jam data that ﬁts our model. The second phase is visual exploration, in

which we explore the preprocessed data. Figure 2 gives an overview

of our system. We will explain the preprocessing phase in Section 4,

and the visual exploration phase in Section 5.

4 PREPROCESSING

Our preprocessing phase consists of six steps. The ﬁrst two steps im-

prove the quality of the input data. In step 1 Road Network Processing,

we improve the road network quality by ﬁltering out irrelevant data,

merging and splitting ways, and correcting errors. In step 2 GPS Data

Cleaning, the trajectories are cleaned. One obvious thing is to remove

GPS errors. To accurately estimate road speed later on, we also ﬁlter

out stops that do not reﬂect trafﬁc conditions, such as parking.

To estimate road speed we perform another two steps. In step 3 Map

Matching, we match GPS trajectories to the road network to correlate

the trajectory speed with the road speed. After this step, each trajectory

sampling point is mapped to one position on a dWay (not a lane), and

each trajectory segment is mapped to a path on the road network. In

step 4 Road Speed Calculation, we estimate the speed of a dWay at

a time bin, based on the speed of the trajectories that map to it. This

can be performed by averaging the trajectory speed. This estimation

can be inaccurate due to, for instance, insufﬁcient number of mapped

trajectories, and incomplete ﬁltering of parking cases in step 2.

In the last two steps, propagation graphs are constructed on trafﬁc

jam events. In step 5 Trafﬁc Jam Detection, trafﬁc jam events are de-

tected based on speed. For each dWay, an abnormally low speed for

consecutive time bins, is considered as a trafﬁc jam event. Finally,

in step 6 Propagation Graph Construction, we predict the causal rela-

tionships between the detected trafﬁc jam events, based on their spatial

temporal relationship.

In the rest of this section, we discuss the preprocessing steps in

more detail. Further details on their parameter setting are in the ap-

pendices.

4.1 Road Network Processing

The road network data downloaded from OpenStreetMap not only

contains highways, but also waterways, buildings, etc. Therefore, we

ﬁrst extract all drivable ways from the data. Then we ﬁlter out the tiny

road pieces that are not connected to the major network, and ensure that all roads are connected. After that, we want the heading relation between two dWays to be clear and unidirectional. Therefore, we

reconstruct the ways in the road network data, such that two ways can

only intersect at their end points. In the reconstruction, we require

that the length of each way is less than 1km, which ensures the spatial

resolution.
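The length constraint alone can be illustrated with a short sketch. It assumes that a way is a polyline in a projected metric coordinate system and that long edges may be subdivided by interpolated shape points; it does not cover the splitting at intersections, and the function name is hypothetical.

```python
import math


def split_way(nodes, max_len_m=1000.0):
    """Split a polyline (list of (x_m, y_m)) into consecutive pieces, each
    at most max_len_m long, adding interpolated nodes on long edges."""
    # First densify: subdivide every edge longer than max_len_m.
    dense = [nodes[0]]
    for a, b in zip(nodes, nodes[1:]):
        k = max(1, math.ceil(math.dist(a, b) / max_len_m))
        for s in range(1, k + 1):
            dense.append((a[0] + (b[0] - a[0]) * s / k,
                          a[1] + (b[1] - a[1]) * s / k))
    # Then greedily cut at a node before a piece would exceed max_len_m.
    pieces, current, length = [], [dense[0]], 0.0
    for a, b in zip(dense, dense[1:]):
        step = math.dist(a, b)
        if length + step > max_len_m and len(current) > 1:
            pieces.append(current)
            current, length = [a], 0.0
        current.append(b)
        length += step
    pieces.append(current)
    return pieces
```

Consecutive pieces share their end node, so the resulting ways still meet only at end points.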

4.2 GPS Data Cleaning

For the GPS data, we remove ﬁve kinds of records: the irrelevant data,

the erroneous data, the low sampling data, the non-jam stop data and

the tiny trajectory data. We use a set of ﬁlters to achieve this.

Data out of the spatial range of the road network is irrelevant and

removed by ﬁlter F1. Problems in erroneous data with respect to time

or position are removed by ﬁlters F2 and F3. They typically manifest

as two records with identical time stamps or segments with high speed.

F1: Unrealistic Coordinates We remove sampling points outside

the range [116.109E, 116.673E] × [39.743N, 40.119N].

F2: Duplicated Time Stamp If in a trajectory there are points with

the same time stamp, we only keep the ﬁrst occurrence and remove the

other points with the same time stamp.

F3: High Speed We consider a trajectory segment speed higher

than 90km/h unrealistic. In such cases we remove the trajectory seg-

ment and split the trajectory into two parts.
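Filters F1 to F3 can be sketched as below. The sketch assumes that a trajectory is a time-ordered list of (time in seconds, longitude, latitude) tuples and that segment speed is the great-circle distance divided by the time difference; the function names are hypothetical.

```python
import math

LON_RANGE = (116.109, 116.673)
LAT_RANGE = (39.743, 40.119)
MAX_SPEED_KMH = 90.0


def haversine_km(p, q):
    """Great-circle distance in km between two (time_s, lon, lat) points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[1], p[2], q[1], q[2]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def f1_in_range(points):
    """F1: keep only points inside the spatial range of the road network."""
    return [p for p in points
            if LON_RANGE[0] <= p[1] <= LON_RANGE[1]
            and LAT_RANGE[0] <= p[2] <= LAT_RANGE[1]]


def f2_dedup_time(points):
    """F2: keep the first point for every time stamp."""
    seen, kept = set(), []
    for p in points:
        if p[0] not in seen:
            seen.add(p[0])
            kept.append(p)
    return kept


def f3_split_high_speed(points):
    """F3: drop segments faster than MAX_SPEED_KMH, splitting the trajectory."""
    if not points:
        return []
    parts, current = [], [points[0]]
    for prev, cur in zip(points, points[1:]):
        hours = (cur[0] - prev[0]) / 3600.0
        speed = haversine_km(prev, cur) / hours if hours > 0 else float("inf")
        if speed > MAX_SPEED_KMH:
            parts.append(current)
            current = [cur]
        else:
            current.append(cur)
    parts.append(current)
    return parts
```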

Fig. 2. The work flow of our system. In the preprocessing step, we extract traffic jam data from GPS trajectories and a road network. In the visual exploration step, we analyze the extracted traffic jams and their propagation. [Diagram: in the preprocessing phase, the raw taxi GPS data and the raw road network pass through GPS Data Cleaning, Road Network Processing, Map Matching, Road Speed Calculation, Traffic Jam Detection, and Propagation Graph Construction. In the visual exploration and analysis phase, spatial, temporal, size, and topological filters connect the city level traffic jam density visualization, the propagation graph level exploration, and the road segment level exploration and analysis.]

A low sampling rate results in trajectories with long segments or long time intervals. Our speed calculation is based on trajectory segments, so we require realistic speed change between the start and end points of a segment for interpolating accurately. This is not possible in such cases. Therefore, filters F4 and F5 remove them.

F4: Long Distance We remove segments with length over 2km.

F5: Long Time We remove segments with time interval over

10min.

Non-jam stop data are due to parking, passengers getting in or out

of the car, and stopping to wait for passengers. This does not include

waiting for green lights, because a long wait for a green light implies congestion. Filter F6 removes the first case, parking, and F7 removes the passenger cases.

F6: Parking We assume taxis staying within a 50m radius during

30min are actually parking, and thus remove these points.

F7: Waiting for Passenger We remove segments where the pas-

sengerState attribute changes. This splits trajectories into ones with

constant passengerState. Then we remove stops at the beginning and

end of the shorter trajectories, assuming that taxi drivers usually wait

for new passengers immediately after dropping old ones, or that they

wait until they have a new one. For stops at the beginning or at the

end, we assume either a few points with identical positions, or a point

with velocity equal to zero.

F6 is implemented using a stop detection algorithm [36]. We do not

plan to identify interesting spots as in the original paper, but parking

stops, including the cases when GPS position seriously oscillates (Fig-

ure 3(Right)). Therefore, we just use the Euclidean distance in their

algorithm, not the distance along the path.
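A much simplified version of the parking filter F6 is sketched below. It is not the cited stop detection algorithm [36]: it anchors each candidate stop at its first point, uses planar Euclidean distance on points given as (time in seconds, x in metres, y in metres), and the function name is hypothetical.

```python
import math

RADIUS_M = 50.0
MIN_DURATION_S = 30 * 60.0


def remove_parking(points):
    """Drop runs of points that stay within RADIUS_M of the run's first
    point for at least MIN_DURATION_S. Points are in ascending time order."""
    kept, i, n = [], 0, len(points)
    while i < n:
        j = i
        # Extend the run while the next point stays near the anchor point i.
        while (j + 1 < n
               and math.hypot(points[j + 1][1] - points[i][1],
                              points[j + 1][2] - points[i][2]) <= RADIUS_M):
            j += 1
        if points[j][0] - points[i][0] >= MIN_DURATION_S:
            i = j + 1            # the whole run is a parking stop: skip it
        else:
            kept.append(points[i])
            i += 1
    return kept
```

Using the Euclidean distance means that a stop whose GPS position oscillates inside the radius is still treated as one stop.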



Fig. 3. (left) Stops removed by F6, with each sampling point represented

by a red dot. (right) One stop with the sampling points connected by red

lines. It spans 97min, and seems to oscillate due to GPS drift.

Tiny trajectories are mostly small fragments generated by ﬁlters.

By rendering them on the screen, we ﬁnd that they can hardly be used.

We remove them by ﬁlter F8.

F8: Tiny Trajectory We remove all trajectories with at most 5

sampling points or less than 500m long.

The ﬁlters are applied in the order: F1, F2, F3, F4, F5, F6, F7, where

ﬁlter F8 is applied directly after each ﬁlter to remove tiny trajectories.
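The filter ordering can be sketched as a small pipeline in which F8 runs after every other filter. The sketch assumes that each filter maps a list of trajectories to a list of trajectories, that points are (time in seconds, x in metres, y in metres) tuples, and that the helper names are hypothetical.

```python
import math


def path_length_m(traj):
    """Length of a trajectory given as (time_s, x_m, y_m) points."""
    return sum(math.hypot(b[1] - a[1], b[2] - a[2])
               for a, b in zip(traj, traj[1:]))


def f8_drop_tiny(trajectories):
    """F8: drop trajectories with at most 5 points or shorter than 500 m."""
    return [t for t in trajectories
            if len(t) > 5 and path_length_m(t) >= 500.0]


def run_pipeline(trajectories, filters):
    """Apply each filter in order, running F8 directly after every one.

    A filter may split one trajectory into several, which is why tiny
    fragments are removed after each step rather than only at the end.
    """
    for f in filters:
        trajectories = f8_drop_tiny(f(trajectories))
    return trajectories
```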

4.3 Map Matching

We adopt the ST-matching algorithm [30] for map matching, since it

is suitable for data with low sampling rate. However, the algorithm

can not be directly used in our work, and we adapt it at three points.

First of all, as most of the ways in our road network data do not have

speed limit records, T-matching is impossible. Therefore, we only do

S-matching. Analysis in the original paper [30] shows that the accu-

racy then drops by 2%. We consider that acceptable. Secondly, as

our road network data has a few errors, such as wrong road directions

and missing roads, we allow trajectory sampling points and trajectory

segments to be unmatched. Otherwise, there would be many errors, as

shown in Figure 4. We assume a missing match is better than a wrong

match, in terms of accurately estimating the road speed. Sampling

points without candidate match points are considered unmatched. Tra-

jectory segments with transmission probability V less than a threshold

Δ are considered unmatched. Finally, we would match each trajec-

tory sampling point to one position on one dWay, therefore we need

to know the driving direction of the taxi at each sampling point. This

is achieved in a post processing step by simply looking at the matched

position of neighbouring sampling points.
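The unmatched rule for trajectory segments can be illustrated as follows. The sketch assumes that the transmission probability V is the ratio between the straight-line distance of two consecutive sampling points and the length of the shortest network path between their candidate positions, as in the spatial analysis of ST-matching. The threshold value used in the example is a placeholder, not the value from the appendices, and the function names are hypothetical.

```python
def transmission_probability(straight_line_m, network_path_m):
    """Ratio of straight-line distance to shortest-path length, capped at 1."""
    if network_path_m <= 0:
        return 1.0          # both candidates at the same network position
    return min(1.0, straight_line_m / network_path_m)


def is_unmatched(straight_line_m, network_path_m, delta):
    """A segment is left unmatched when V falls below the threshold delta."""
    return transmission_probability(straight_line_m, network_path_m) < delta
```

For example, a 500 m segment that could only be matched to a 2500 m detour has V = 0.2 and would be left unmatched for any threshold above 0.2, which is the situation caused by missing roads.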

Fig. 4. The map matching produces many errors, if we do not allow

unmatch. One example is the red trajectory segment from sampling

point A to B matching to a long blue path. This is due to missing roads.

4.4 Road Speed Calculation

After mapping the trajectories to the road network, we can use trajectory speed to calculate road speed. In this step, we only use the

matched parts of the trajectories. We choose a time bin size of 10min.

For each dWay and for each time bin, we extract all taxi trajectories

that pass the dWay within this time bin. We reconstruct the movement

of the taxis assuming they follow the map matching result, and move at

constant speed between two consecutive sampling points. Therefore,

we can calculate an average travel speed for each taxi. After removing

the taxis with exceptionally high speed (detected by an outlier detec-

tion algorithm [1]), we make an average of the average speeds on the

remaining taxis and get the road speed. The speed averaging is per

trajectory, not per sampling point. We also record support, which is

the number of remaining taxis. The higher the support, the higher the

accuracy of the road speed calculation. We define that a speed estimation is valid when $\mathit{support} \ge \mathit{min\_support}$. The default value is $\mathit{min\_support} = 5$.
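The per-bin speed estimate can be sketched as below. The input is assumed to be one average travel speed per taxi for a single dWay and time bin. The outlier rule (an upper Tukey fence on the quartiles) is a stand-in, since the cited outlier detection algorithm [1] is not described here, and the function name is hypothetical.

```python
import statistics


def road_speed(taxi_speeds_kmh, min_support=5):
    """Return (speed, support) for one dWay and time bin.

    speed is None when the estimate is not valid (support < min_support).
    """
    speeds = list(taxi_speeds_kmh)
    if len(speeds) >= 4:
        # Remove exceptionally high speeds (assumed rule: above Q3 + 1.5 * IQR).
        q1, _, q3 = statistics.quantiles(speeds, n=4)
        upper = q3 + 1.5 * (q3 - q1)
        speeds = [s for s in speeds if s <= upper]
    support = len(speeds)
    if support < min_support:
        return None, support
    # Average of the per-taxi averages, not of individual sampling points.
    return statistics.fmean(speeds), support
```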

4.5 Trafﬁc Jam Detection

After calculating the road speed, we do a trafﬁc jam event detection

on each dWay. Our idea is to use a speed threshold per dWay based on

an estimation of the free-ﬂow speed of the dWay. A speed limit may

be a good estimation. Unfortunately we do not have it in our data.

Krogh et al. [25] estimate free-ﬂow speed from non-peak hour speed

records. However, in Beijing different dWays may have different non-peak hours. Instead, we sort all valid speeds for a dWay in ascending order, and take the speed value at the F% position as the free-flow speed. Then each time bin on this dWay with a valid speed less than C% of the free-flow speed is said to have a low speed. The default parameter values, F = 85 and C = 45, give us 400,985 events.
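The detection for one dWay can be sketched as follows. The sketch assumes that the speeds are given as a list indexed by time bin with None for invalid estimates, that the F% position is taken by simple index truncation, and that every maximal run of consecutive low-speed bins forms one event. No minimum event length is enforced, and the function names are hypothetical.

```python
def free_flow_speed(valid_speeds, f_percent=85):
    """Speed value at the F% position of the ascending sorted valid speeds."""
    ordered = sorted(valid_speeds)
    idx = min(len(ordered) - 1, int(len(ordered) * f_percent / 100))
    return ordered[idx]


def detect_events(speeds, f_percent=85, c_percent=45):
    """Return (t0, t1) inclusive runs of consecutive low-speed time bins."""
    valid = [s for s in speeds if s is not None]
    if not valid:
        return []
    threshold = free_flow_speed(valid, f_percent) * c_percent / 100
    events, start = [], None
    for t, s in enumerate(speeds):
        low = s is not None and s < threshold
        if low and start is None:
            start = t                       # a new run of low speeds begins
        elif not low and start is not None:
            events.append((start, t - 1))   # the run ended at the previous bin
            start = None
    if start is not None:
        events.append((start, len(speeds) - 1))
    return events
```

A bin without a valid speed ends the current run, so a gap in the data splits one long jam into two events in this sketch.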

4.6 Propagation Graph Construction

Now that we have extracted events for all dWays, we build the propagation graphs by defining directed links among events. We use a rule based method. We assume a directed link $e_0 \to e_1$ exists if and only if $e_0.t_0 \le e_1.t_0 \le e_0.t_1$, and $e_0.d$ is immediately ahead of $e_1.d$. The former statement is a temporal constraint, saying that when $e_1$ starts, $e_0$ is still happening. The latter statement is a spatial constraint, saying that

the two events are spatially connected, and the trafﬁc jams propagate

backward. The backward propagation is our assumption, which means

the trafﬁc jam will propagate in a reverse direction to the direction of

trafﬁc ﬂow. Although it is not ﬁrmly validated, many observations and

experiments [14, 45] support this. We have this constraint because our

temporal resolution is not high enough. When we observe two adja-

cent roads congest at the same time bin, it is not clear from the data

which leads to which. In our road network, it is usually the case that

one dWay is ahead of another. One exception is for the two directions

on the same two-way street. We do not make any link between them,

because such propagation is associated with a u-turn trafﬁc ﬂow. By

experience, such u-turn trafﬁc ﬂow is usually not dominant in the total

trafﬁc ﬂow volume, and not likely to propagate trafﬁc jams. Besides,

our tests show that adding such links would add considerable noise in the

constructed graphs.

We construct the graphs with a modiﬁed version of the STOTree

algorithm [29] and end up with 226,227 graphs of which 162,429 con-

tain only one event. We calculate the spatial propagation path and

three size measures for each graph: number of events, time span, and

total distance. The latter is the sum of the length in kilometers of all

trafﬁc jam events in the graph.
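The rule based linking can be sketched as below. This is not the modified STOTree algorithm: it applies the temporal and spatial rule directly and then groups events into weakly connected components. Events are assumed to be (dWay, start bin, end bin) tuples, ahead[d] is assumed to hold the dWays immediately ahead of d in driving direction, and the function names are hypothetical.

```python
from collections import defaultdict


def build_links(events, ahead):
    """Return (cause, effect) pairs: the cause is still happening when the
    effect starts, and the cause's dWay is immediately ahead of the effect's."""
    by_dway = defaultdict(list)
    for e in events:
        by_dway[e[0]].append(e)
    links = []
    for effect in events:
        for d0 in ahead.get(effect[0], ()):
            for cause in by_dway[d0]:
                if cause[1] <= effect[1] <= cause[2]:
                    links.append((cause, effect))
    return links


def connected_graphs(events, links):
    """Group events into weakly connected components, one per graph."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, graphs = set(), []
    for e in events:
        if e in seen:
            continue
        stack, component = [e], set()
        while stack:
            cur = stack.pop()
            if cur in component:
                continue
            component.add(cur)
            stack.extend(neighbours[cur] - component)
        seen |= component
        graphs.append(component)
    return graphs
```

An event without any link forms a graph of its own, which corresponds to the single-event graphs mentioned above.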

5 VISUALIZATION DESIGN

According to the design requirements in Section 3.1, we provide our system with five views (in four windows). We design a pixel-based road speed view (embedded in Figure 1(b)) to show the speeds and events of one dWay. We design a graph list view (Figure 1(c)) to show the propagation graphs, and a graph projection view (Figure 1(e)) to show their topological relationships. We design a spatial view (Figure 1(a)) to show the traffic jam density on each dWay, and the propagation path of one highlighted graph. We also design a multi-faceted filter view (Figure 1(d)) to filter the propagation graphs.

5.1 Pixel Based Road Speed View

In our system, the road speeds and traffic jam events carry the low-level traffic jam information. In designing a visualization for them, we have two concerns. Firstly, we need a compact visualization to be able to present multiple roads side by side for comparison. Secondly, according to our experience, road speed variation has strong daily and weekly patterns. It is important to present them in the analysis.

With these concerns in mind, we design a table-like pixel-based visualization for a dWay, as illustrated in Figure 5(c). Each row represents a day, each column represents a 10-minute time interval, and each cell represents a time bin. Optionally, the table can be divided into weekly blocks by black horizontal lines. We use cell color to represent the road speed on a dWay at the corresponding time bin. The color scale is given in Figure 5(b): red represents low speed, and green high speed. For cells without a valid speed estimation, we use gray. Hovering the mouse over a cell reveals detailed speed information, including the time of the cell, the speed value, and its support in brackets.

Fig. 5. Road speed view showing the speeds and events for one dWay. For the green road in (a) the speed variation is shown in (c). Each row represents one day, and each column represents 10 min in a day, so each cell is a time bin. Cell color represents the calculated speed, with the color scale in (b). We mark the extracted events by black boxes in (c). Instead of using black boxes, we can use cell size to mark the events, as shown in (d). The events involved in the currently highlighted propagation graph are highlighted by a thick black box. When filtering is applied, all irrelevant cells turn gray, as shown in (e).
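The table layout described above (one row per day, one column per 10-minute bin) can be sketched as a simple mapping from timestamps to cells. The helper names `cell_of` and `speed_color`, the linear red-to-green ramp, the `max_speed` normalization, and the example dates are all assumptions made for illustration; the actual color scale is the one shown in Figure 5(b).

```python
from datetime import date, datetime

BIN_MINUTES = 10
BINS_PER_DAY = 24 * 60 // BIN_MINUTES  # 144 columns per row


def cell_of(ts, first_day):
    """Map a timestamp to (row, column) in the road speed table.

    row is the day offset from first_day, column is the 10-minute bin
    within that day.
    """
    row = (ts.date() - first_day).days
    col = (ts.hour * 60 + ts.minute) // BIN_MINUTES
    return row, col


def speed_color(speed, max_speed):
    """Return an (r, g, b) color: red for low speed, green for high speed.

    Cells without a valid speed estimate (speed is None) are gray.
    The linear ramp is an assumed stand-in for the paper's color scale.
    """
    if speed is None:
        return (128, 128, 128)
    ratio = max(0.0, min(1.0, speed / max_speed))
    return (round(255 * (1 - ratio)), round(255 * ratio), 0)


# Arbitrary example: 08:25 on the third day of a hypothetical data set.
row, col = cell_of(datetime(2024, 1, 3, 8, 25), date(2024, 1, 1))
```

Dividing the table into weekly blocks then amounts to drawing a separator after every seventh row.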

In order to show the events on this dWay, we draw black boxes on the road speed view, as illustrated in Figure 5(c). The cells covered by a box correspond to the time bins in a traffic jam. Specifically, the left/right boundary represents the start/end time of the event. If we are more interested in the events than in the details of speed, we can use cell size to mark events, as shown in Figure 5(d). We make the cells in traffic jam events, which we call event cells, bigger than the non-event cells. Events pop out in this style, and no black boxes are required to mark them. This is especially useful when we embed the road speed view in the spatial view, as shown in Figure 1(b). There, due to limited screen space, we have to trade speed detail for event information. Using black boxes would seriously hide the cell color.

In the road speed view, we can highlight a traffic jam propagation graph by clicking on an event, which will then be marked by a thick black box (Figure 5(c),(d)). The propagation graph containing this event will be highlighted and shown in the spatial view.

We only show information for cells satisfying the filter; all other cells turn gray. These include non-event cells outside of the time range (defined by the temporal filters), and event cells not belonging to the selected propagation graphs. An event outside the time range may still keep its color, as long as its corresponding propagation graph intersects the time range. A filtered road speed view is shown in Figure 5(e).
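The graying rule can be written as a small predicate. The sketch below assumes that the time range and graph spans are expressed as inclusive time-bin indices and that each event cell knows the id of its propagation graph; the names `select_graphs` and `cell_is_shown` are hypothetical, and the size filters are left out for brevity.

```python
def select_graphs(graph_spans, time_range):
    """Return ids of propagation graphs whose time span intersects the range.

    graph_spans maps a graph id to its (start_bin, end_bin) span.
    """
    lo, hi = time_range
    return {gid for gid, (start, end) in graph_spans.items()
            if start <= hi and end >= lo}


def cell_is_shown(time_bin, graph_id, time_range, selected_graphs):
    """Return True if the cell keeps its color, False if it turns gray.

    graph_id is None for non-event cells. Non-event cells are tested
    against the time range; event cells are tested against the set of
    selected propagation graphs.
    """
    if graph_id is None:
        lo, hi = time_range
        return lo <= time_bin <= hi
    return graph_id in selected_graphs


graph_spans = {1: (5, 12), 2: (20, 25)}
time_range = (10, 15)
selected = select_graphs(graph_spans, time_range)
```

With these values, an event cell at bin 7 belonging to graph 1 keeps its color even though bin 7 lies outside the time range, because graph 1 intersects the range. A non-event cell at the same bin turns gray.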

5.2 Graph List View

After showing the speeds and events on individual roads, we consider showing the propagation graphs. This is information on a higher level, and reveals the interactions of traffic jams on different roads. In designing the visualization to show the propagation graphs, we have two concerns. Firstly, there are many propagation graphs, but we can only show a few simultaneously on the screen. Secondly, we need to com-


