What have the authors contributed in "Travel time estimation of a path using sparse trajectories" ?

In this paper, the authors propose a citywide, real-time model for estimating the travel time of any path (represented as a sequence of connected road segments) in a city, based on the GPS trajectories of vehicles received in the current time slot and over a period of history, as well as map data sources. Though this is a strategically important task in many traffic monitoring and routing systems, the problem has not been well solved because of three challenges: data sparsity (in most cases, no trajectory exactly traverses a query path), the many possible ways of combining trajectories, and the need to answer queries instantly anywhere in the city. The authors devise and prove an objective function to model the tradeoff between the length of a path and its support, with which they find the optimal concatenation of trajectories for an estimate through a dynamic programming solution. In addition, the authors propose using frequent trajectory patterns (mined from historical trajectories) to scale down the candidates of concatenation and a suffix-tree-based index to manage the trajectories received in the present time slot. The results demonstrate the effectiveness, efficiency and scalability of their method beyond baseline approaches.

What future works have the authors mentioned in the paper "Travel time estimation of a path using sparse trajectories" ?

In the future, the authors plan to infer the travel time of a path for a particular driver. In addition, the authors would like to study the impact of other factors, such as weather conditions and air quality, on the travel time estimation of a path.

What is the main reason for the accuracy of the map-matching?

The map-matching for high-sampling-rate trajectories is more accurate than for low-sampling-rate taxi trajectories, resulting in a more accurate estimate of the ground truth.

How does the model predict the travel time of a road segment?

When a vehicle passes through, the time interval for crossing two adjacent loop detectors is recorded, based on which the speed of the vehicle is inferred. [9, 14, 16] use various models to estimate the travel speed on an individual road segment based on the sensor readings from loop detectors, and then convert the speed into a travel time. [19] predicts the travel time of a road segment by applying support vector regression to its historical travel times.

How many segments are retrieved from the query paths?

The travel times of 58,223 road segments (about 26.8% of the road segments in the query paths) are finally retrieved from the inferred tensor for constructing the optimal concatenation, i.e., 4.7 road segments per path.

How do the authors find the optimal concatenation of trajectories?

Using a dynamic programming solution, the authors find the most optimal concatenation of trajectories for estimating a path’s travel time.

What is the way to deal with the weakness of the individual road segment-based methods?

A possible approach to deal with the weakness of the individual road segment-based methods is to estimate the travel time of a path as a whole based on frequent trajectory patterns.

How long can the authors infer the travel time on each road segment for each particular driver?

In total, the authors can infer the travel time on each road segment for each particular driver within 6.4min if using 25 cores in a server.

How much time is the average error of the estimated travel time?

Given the queries introduced in Section 5.1.3, on average, the absolute error of the estimated travel time is about 2 minutes per path, which is about 19% of the true travel time.

Why is the length of a path collected in the study so long?

The major reason is that the paths collected in the study are usually long (8.78 km each on average), and the model is more accurate on long paths than on short ones.

How do the authors calculate the travel time of a query path?

In the implementation, if not building an effective indexing structure, the authors need to scan a trajectory when calculating the travel time of a path based on the trajectory (i.e., Line 11 of Algorithm 2).

(Open Access) Travel time estimation of a path using sparse trajectories (2014) | Yilun Wang

Travel Time Estimation of a Path using Sparse Trajectories

Yilun Wang^{1,2,*}, Yu Zheng^{1,+}, Yexiang Xue^{1,3,*}

1 Microsoft Research, No.5 Danling Street, Haidian District, Beijing 100080, China

2 College of Computer Science, Zhejiang University

3 Department of Computer Science, Cornell University

{v-yilwan, yuzheng}@microsoft.com, yexiang@cs.cornell.edu

ABSTRACT

In this paper, we propose a citywide and real-time model for estimating the travel time of any path (represented as a sequence of connected road segments) in a city, based on the GPS trajectories of vehicles received in the current time slot and over a period of history, as well as map data sources. Though this is a strategically important task in many traffic monitoring and routing systems, the problem has not been well solved yet given the following three challenges. The first is the data sparsity problem, i.e., many road segments may not be traveled by any GPS-equipped vehicle in the present time slot. In most cases, we cannot find a trajectory exactly traversing a query path either. Second, for the fragment of a path with trajectories, there are multiple ways of using (or combining) the trajectories to estimate the corresponding travel time. Finding an optimal combination is a challenging problem, subject to a tradeoff between the length of a path and the number of trajectories traversing the path (i.e., support). Third, we need to instantly answer users' queries, which may occur in any part of a given city. This calls for an efficient, scalable and effective solution that can enable citywide and real-time travel time estimation. To address these challenges, we model different drivers' travel times on different road segments in different time slots with a three-dimensional tensor. Combined with geospatial, temporal and historical contexts learned from trajectories and map data, we fill in the tensor's missing values through a context-aware tensor decomposition approach. We then devise and prove an objective function to model the aforementioned tradeoff, with which we find the optimal concatenation of trajectories for an estimate through a dynamic programming solution. In addition, we propose using frequent trajectory patterns (mined from historical trajectories) to scale down the candidates of concatenation and a suffix-tree-based index to manage the trajectories received in the present time slot. We evaluate our method based on extensive experiments, using GPS trajectories generated by more than 32,000 taxis over a period of two months. The results demonstrate the effectiveness, efficiency and scalability of our method beyond baseline approaches.

Categories and Subject Descriptors

H.2.8 [Database Management]: Database Applications - data

mining, Spatial databases and GIS;

Keywords

Travel time estimation; tensor; trajectories; urban computing;

1. INTRODUCTION

Real-time estimation of the travel time of a path, which is represented

by a sequence of connected road segments, is of great importance for

traffic monitoring [1], finding driving directions [20], ridesharing [13]

and taxi dispatching [22]. Existing solutions, e.g., using loop sensors,

usually tell people the travel speed of an individual road segment

rather than the travel time of an entire path. The latter’s value is not a

simple summation of the travel time of each individual road segment,

as a path also contains road intersections (sometimes with traffic

lights) where a driver needs to slow down or wait for a while.

Explicitly modeling the time delay at an intersection is not easy [8]. In

addition, these methods have limited coverage, as many streets do not

have a loop sensor embedded.

An alternative method is to use floating car data (e.g., GPS trajectories of vehicles) to estimate the travel time of a path. For example, as shown in Figure 1, we estimate the travel time of path $r_1 \to r_2 \to r_3 \to r_4$, using four trajectories $Tr_1$, $Tr_2$, $Tr_3$, and $Tr_4$. Unfortunately, there are three major issues remaining unsolved in existing methods. They are as follows:

Figure 1. Problem demonstration

1) Data sparsity: For example, $r_4$ is not traversed by any trajectory in the previous 30 minutes. Using an average of $r_4$'s historical travel times is not accurate enough (since its traffic conditions change over time of day and day of the week). Sometimes, the road may never be traversed by any trajectories (even in history) in our dataset, as in practice we only have the data of a sample of vehicles.

2) Trajectory concatenation: For the sub-path (e.g., $r_1 \to r_2 \to r_3$) with trajectories, how to combine these trajectories effectively to achieve an accurate estimate is still a challenging problem. Clearly, there are multiple ways of using the four trajectories shown in Figure 1. For instance, we can calculate the travel time of $r_1 \to r_2 \to r_3$ solely based on $Tr_2$. Or, we can compute the travel time for $r_1$ (based on $Tr_1$ and $Tr_2$), $r_2$ (based on $Tr_1$, $Tr_2$ and $Tr_3$), and $r_3$ (using $Tr_2$, $Tr_3$ and $Tr_4$), separately. Later, the travel time of $r_1 \to r_2 \to r_3$ can be obtained by summing the travel times of each road segment. We can also use $Tr_1$ and $Tr_2$ to estimate the travel time of $r_1 \to r_2$, then concatenate it with that of $r_3$; or, do $r_2 \to r_3$ first based on $Tr_2$ and $Tr_3$, then concatenate it with $r_1$.

Different concatenations have their own advantages and disadvantages, subject to a trade-off between their support and length. The ideal situation is to estimate the travel time of $r_1 \to r_2 \to r_3$ using many trajectories like $Tr_2$ covering the entire path. Such trajectories reflect the traffic conditions of an entire path, including intersections, traffic lights and direction turns; hence, there is no need to model these complex factors separately and explicitly. However, as the length of a path increases, the number of trajectories (i.e., the support) traveling on the path decreases (refer to Figure 10 A) for details). Consequently, the confidence of the travel time (derived from few drivers) decreases. For example, what if $Tr_2$ is generated by an uncommon driver or in an unusual situation like pedestrians crossing a street? Furthermore, in many cases, we cannot even find a trajectory passing an entire path. On the other hand, using the concatenation of shorter sub-paths can have more occurrences of trajectories on each sub-path (i.e., having a high confidence in the derived travel time for each sub-path). But this results in more fragments, across which the aforementioned complex factors are difficult to model. The more fragments a concatenation contains, the more inaccuracy a path's travel time could involve.

+ Yu Zheng is the corresponding author of this paper.

*The paper was done when the first and third authors were interns in Microsoft Research under the supervision of the second author who contributed the main idea and algorithms of this paper.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

KDD’14, August 24–27, 2014, New York, New York, USA.

http://dx.doi.org/10.1145/2623330.2623656

3) Tradeoff among Scalability, effectiveness and efficiency: As

users can query any path in a city, we need to model the traffic

conditions with a city scale, which usually contains tens of

thousands of road segments. In the meantime, we have to answer users' queries instantly. So, a good solution should be scalable, effective and efficient, all simultaneously. This requirement rules out some complex models that work well only on a particular road.

In this paper, we propose a model for instant Path Travel Time

Estimation (PTTE), based on sparse trajectories generated by a

sample of vehicles (e.g., some GPS equipped taxicabs) in the

recent time slots as well as in history. Our model is comprised of two major components. One is to estimate the travel time for road segments not traversed by any trajectory, through a context-aware tensor decomposition (CATD) approach. The second is to find the optimal concatenation (OC) of trajectories to estimate a path's travel time using a dynamic programming solution. Our work has three primary contributions:

• Dealing with the missing values: We model different drivers’

travel times on different road segments in different time slots

with a three dimensional tensor. Combined with geospatial,

temporal and historical contexts learned from other data

sources, we fill in the tensor’s missing values through a

context-aware tensor decomposition approach. To expedite

the inference, we partition a city into disjoint geo-regions and

carry out the decomposition for each region in parallel.

• Optimal concatenation: We devise and prove an objective function that can model the tradeoff between the support and length of a concatenation. Using a dynamic programming solution, we find the optimal concatenation of

trajectories for estimating a path’s travel time. In addition,

we use frequent trajectory patterns mined in advance to scale

down the candidates of concatenation and propose a suffix-

tree-based index to manage the recently received trajectories,

improving the efficiency of our model.

• Evaluation: We evaluate our model with the real trajectories generated by over 32,000 taxis over a period of two months on

Beijing’s road network. The results of extensive experiments

demonstrate the advantages of our model. A sample of the

data has been released at [25].

The rest of the paper is organized as follows: Section 2 overviews our model. Section 3 elaborates on the method for inferring the travel time of road segments without trajectories. Section 4 introduces the method that searches for the optimal concatenation. Section 5 presents the experiments and Section 6 summarizes related work. We conclude the paper in Section 7.

2. OVERVIEW

Definition 1: Road Network. A road network $RN$ is comprised of a set of road segments $\{r\}$ connected among each other in a graph format. Each road segment $r$ is a directed edge with two terminal points, a list of intermediate points describing the segment, a length $r.len$, a level $r.lev$ (e.g. a highway or a street), a direction $r.dir$ (e.g. one-way or bi-directional) and the number of lanes $r.n$.

Definition 2: Trajectory. A spatial trajectory $Tr$ is a sequence of time-ordered points, $Tr: p_1 \to p_2 \to \cdots \to p_n$, where each point has a geospatial coordinate set and a timestamp, $p = (x, y, t)$.

Definition 3: Path. A path $P$ is represented by a sequence of connected road segments, e.g., $P: r_1 \to r_2 \to \cdots \to r_n$, in an $RN$.

Definition 4: Trajectory pattern. A trajectory pattern $tp$ is a sequential pattern of road segments with a support over a threshold, calculated by the number of trajectories traversing these road segments. If we set support as 2, $r_1 \to r_2$ and $r_2 \to r_3$ in Figure 1 are trajectory patterns, while $r_1 \to r_2 \to r_3$ is not eligible.
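The support count behind Definition 4 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' mining procedure: the trajectory coverage in `TRAJS` is an assumed reading of the Figure 1 discussion, and the names `contains_subpath`, `support` and `is_pattern` are hypothetical helpers introduced here.

```python
from typing import Dict, List, Sequence

# Assumed map-matched trajectories (hypothetical data, loosely following
# the Figure 1 discussion): each trajectory is a list of road segment ids.
TRAJS: Dict[str, List[str]] = {
    "Tr1": ["r1", "r2"],
    "Tr2": ["r1", "r2", "r3"],
    "Tr3": ["r2", "r3"],
    "Tr4": ["r3"],
}


def contains_subpath(traj: Sequence[str], sub: Sequence[str]) -> bool:
    """Return True if `sub` occurs as a contiguous run inside `traj`."""
    m = len(sub)
    return any(list(traj[i:i + m]) == list(sub)
               for i in range(len(traj) - m + 1))


def support(trajs: Dict[str, List[str]], sub: Sequence[str]) -> int:
    """Number of trajectories that traverse the whole sub-path."""
    return sum(contains_subpath(t, sub) for t in trajs.values())


def is_pattern(sub: Sequence[str], min_support: int = 2) -> bool:
    """A sub-path is a trajectory pattern if its support reaches the threshold."""
    return support(TRAJS, sub) >= min_support
```

With this assumed coverage, the two-segment sub-paths reach the threshold of 2 while the three-segment sub-path is only traversed by one trajectory, which mirrors the example in the definition.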

Definition 5: Concatenation. A path $P$ can be decomposed into different concatenations ($\|$) of its sub-paths, $P = P_1 \,\|\, \cdots \,\|\, P_i \,\|\, \cdots \,\|\, P_j \,\|\, \cdots \,\|\, P_n$, $1 \le i, j \le n$. For instance, $r_1 \to r_2 \to r_3 \to r_4$ can be formed by $(r_1 \to r_2 \to r_3) \,\|\, r_4$, or $(r_1 \to r_2) \,\|\, (r_3 \to r_4)$, or $r_1 \,\|\, (r_2 \to r_3 \to r_4)$. Thus, the travel time of $P$ can be obtained via the summation of different concatenations, e.g., $T_{r_1 \to r_2 \to r_3} + T_{r_4}$, or $T_{r_1 \to r_2} + T_{r_3 \to r_4}$, or $T_{r_1} + T_{r_2 \to r_3 \to r_4}$.

Definition 6: Travel Time. A driver $u$'s travel time on a road segment $r$ in time slot $t$ is defined as $T_{u,r,t}$. Likewise, $T_{u,P,t}$ denotes $u$'s travel time on path $P$ in time slot $t$.

Figure 2. Framework of our model

Figure 2 presents the framework of our model, which is comprised of two major parts. In the upper part, we project each trajectory received in the current time slot onto a road network, using a map-matching algorithm [21]. The trajectories (combined with road network data) are then used to construct a 3D tensor $\mathcal{A}_r$ where the three dimensions stand for road segments, drivers and time slots, respectively. Each entry is the travel time of a particular driver on a particular road segment in a specific time slot. We partition a day into several time slots based on a certain time interval (e.g., we divide a day into 48 time slots with 30 minutes each in the experiments). Clearly, the tensor is very sparse (i.e., having many entries without values), as a driver can only travel a few road segments in a time slot. To deal with the data sparsity problem, we extract three categories of features, consisting of geospatial, temporal, and historical contexts, from the road network data and trajectories. The first two feature sets are stored in two matrices, respectively, and the historical context is represented by another tensor $\mathcal{A}_h$. The two matrices and $\mathcal{A}_h$ are then factorized with $\mathcal{A}_r$ collaboratively, helping fill $\mathcal{A}_r$'s missing entries in the current time slot (i.e., inferring the travel time of road segments not traveled by any trajectory in the current time slot). The general idea is that road segments with similar contexts could have a similar travel time. The context matrices and tensor reveal this similarity and have a larger proportion of non-zero entries than $\mathcal{A}_r$, thereby reducing the factorization error and improving the inference accuracy. After filling the missing entries in $\mathcal{A}_r$, we obtain the travel time of any driver on any road segment in the current time slot (stored in $\mathcal{A}_{rec}$).
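As a small illustration of the time-slot partition described above, the sketch below maps a timestamp to one of the 48 half-hour slots of a day. The function name `slot_index` and the constants are assumptions made for this example only.

```python
from datetime import datetime

SLOT_MINUTES = 30                         # slot length used in the experiments
SLOTS_PER_DAY = 24 * 60 // SLOT_MINUTES   # 48 slots per day


def slot_index(ts: datetime, slot_minutes: int = SLOT_MINUTES) -> int:
    """Return the zero-based index of the time slot that contains `ts`."""
    minutes_since_midnight = ts.hour * 60 + ts.minute
    return minutes_since_midnight // slot_minutes
```

For example, any timestamp between 2:00pm and 2:30pm falls into the same slot, so all trajectories received in that half hour contribute to the same slice of the tensor.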



In the bottom part, given a query path $P$, we estimate its travel time in the current time slot, based on $\mathcal{A}_{rec}$, the trajectories received in the time slot and trajectory patterns. Specifically, we devise and prove an objective function that can represent the tradeoff between the length and support of a trajectory pattern. Based on the objective function, we find the optimal concatenation of trajectories for a path, using a dynamic programming approach. In practice, it is not necessary to try every possible concatenation of a path, as some sub-paths have never been traversed by any trajectory. So, we mine frequent trajectory patterns from historical trajectories in advance and study the concatenation of these existing patterns to estimate the travel time of a path. This reduces the online computational loads significantly, while guaranteeing accuracy in travel time estimation. Note that we are not using the historical travel time of a trajectory pattern. The patterns just provide us with candidate schemes of sub-paths for finding an optimal concatenation of a path. Each trajectory pattern's travel time in the current time slot is mainly calculated based on the trajectories received in the time slot. If a pattern contains road segments not traversed by trajectories in the current time slot, we retrieve the inferred time from $\mathcal{A}_{rec}$, according to the driver, road segment and time slot. For instance, two drivers ($u_1$, $u_2$) travelled $r_1 \to r_2$, but nobody traveled $r_3$ in a pattern $r_1 \to r_2 \to r_3$, in the current time slot $t$. That is, $T_{u_1, r_1 \to r_2, t}$ and $T_{u_2, r_1 \to r_2, t}$ can be calculated from the present trajectory data, while $T_{u_1, r_3, t}$ and $T_{u_2, r_3, t}$ are unknown. In this case, we retrieve the latter two from $\mathcal{A}_{rec}$, calculating $T_{u_1, r_1 \to r_2 \to r_3, t} = T_{u_1, r_1 \to r_2, t} + T_{u_1, r_3, t}$, and $T_{u_2, r_1 \to r_2 \to r_3, t} = T_{u_2, r_1 \to r_2, t} + T_{u_2, r_3, t}$.

With $\mathcal{A}_{rec}$, we can estimate a driver's travel time on a trajectory pattern even if the recently received data is incomplete. The dimension of drivers in $\mathcal{A}_{rec}$ enables us to calculate the variance among different drivers' travel times on a road segment or a sub-path. Intrinsically, different drivers travel the same road segment with different times, mainly depending on the different traffic conditions they experience. Thus, the variance implies the complexity of traffic conditions on a road segment or a sub-path, helping estimate a more accurate travel time of a path (elaborated in Section 4.1). Finally, the travel time of a path is calculated as:

$$T_P = \sum_{tp \in \Psi} \frac{\sum_{u \in U} T_{u,tp,t}}{|U|} \qquad (1)$$

where $\Psi$ is the concatenation of path $P$, represented by a set of trajectory patterns $tp$; $U$ is the collection of drivers traversing (or partially traversing) a $tp$; $t$ is the current time slot.
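Read as a computation, Equation 1 averages the drivers' travel times on each pattern of the chosen concatenation and sums those averages. The following sketch assumes each pattern is given as a mapping from a driver id to that driver's travel time in seconds; the function name and data layout are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def path_travel_time(concatenation: List[Dict[str, float]]) -> float:
    """Sum, over the patterns of a concatenation, of the mean driver time.

    Each element of `concatenation` maps a driver id to that driver's
    (observed or inferred) travel time on one trajectory pattern.
    """
    return sum(mean(list(times.values())) for times in concatenation)
```

For instance, a concatenation of two patterns with driver times {100, 140} and {60} yields 120 + 60 = 180 seconds.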

3. DEALING WITH MISSING VALUES

3.1 Tensor Building and Feature Extraction

To model the traffic conditions of the current time slot, we construct a tensor $\mathcal{A}_r \in \mathbb{R}^{N \times M \times L}$, with the three dimensions standing for road segments, drivers and time slots, respectively, based on the GPS trajectories received in the most recent $L$ time slots and the road network data. As shown in Figure 3, an entry $\mathcal{A}_r(i, j, k) = c$ denotes that the $i$th road segment is traveled by the $j$th driver with a time cost $c$ in time slot $k$ (e.g., 2-2:30pm). The last time slot denotes the present time slot, combined with the $L-1$ time slots right before it to formulate the tensor. Clearly, the tensor is very sparse as a driver can only travel a few road segments in a short time period. If we were able to fill in the missing entries in terms of the values of non-zero entries, we would know the travel time of any driver on any road segment in the present time slot.
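A minimal sketch of how such a tensor could be assembled with NumPy is given below. It assumes map-matched records of the form (road index, driver index, slot index, travel time) and averages repeated traversals of the same cell; the record layout, the averaging, and the use of NaN plus a boolean mask for missing entries are choices made for this illustration, not details taken from the paper.

```python
import numpy as np


def build_tensor(records, n_roads, n_drivers, n_slots):
    """Build a (roads x drivers x slots) travel-time tensor.

    `records` is an iterable of (road_idx, driver_idx, slot_idx, travel_time).
    Returns the tensor (NaN where nothing was observed) and a boolean mask
    marking the observed entries.
    """
    sums = np.zeros((n_roads, n_drivers, n_slots))
    counts = np.zeros_like(sums)
    for i, j, k, cost in records:
        sums[i, j, k] += cost
        counts[i, j, k] += 1
    observed = counts > 0
    tensor = np.full(sums.shape, np.nan)
    tensor[observed] = sums[observed] / counts[observed]
    return tensor, observed


def fill_ratio(observed):
    """Fraction of tensor entries that carry an observed value."""
    return observed.sum() / observed.size
```

The fill ratio makes the sparsity discussed in the text concrete: with many road segments and drivers, only a tiny fraction of cells is ever observed in a single time slot.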

A common approach to this problem is to decompose a tensor into the multiplication of a few (low-rank) matrices and a core tensor (or just a few vectors), based on the tensor's non-zero entries. For example, we can decompose $\mathcal{A}_r$ into the multiplication of a core tensor $\mathcal{S} \in \mathbb{R}^{d_R \times d_U \times d_T}$ and three matrices, $R \in \mathbb{R}^{N \times d_R}$, $U \in \mathbb{R}^{M \times d_U}$, $T \in \mathbb{R}^{L \times d_T}$, if using a Tucker decomposition model. An objective function is defined as Equation 2 to control the errors.

$$\mathcal{L}(\mathcal{S}, R, U, T) = \frac{1}{2}\left\| \mathcal{A}_r - \mathcal{S} \times_R R \times_U U \times_T T \right\|^2 + \frac{\lambda}{2}\left( \|\mathcal{S}\|^2 + \|R\|^2 + \|U\|^2 + \|T\|^2 \right) \qquad (2)$$

where $\|\cdot\|$ denotes the $l_2$ norm and $\frac{\lambda}{2}\left( \|\mathcal{S}\|^2 + \|R\|^2 + \|U\|^2 + \|T\|^2 \right)$ is a regularization of penalties to avoid over-fitting; $d_R$, $d_U$, and $d_T$ are usually very small, denoting the number of latent factors. $\lambda$ is a parameter controlling the contributions of the regularization. Afterwards, we can recover the missing values in $\mathcal{A}_r$ by multiplying the decomposed factors as $\mathcal{A}_{rec} = \mathcal{S} \times_R R \times_U U \times_T T$.

Figure 3. The model dealing with data sparsity

In our problem, however, the tensor is overly sparse. For example, if setting 30 minutes as a time slot, only 0.03% of the entries of $\mathcal{A}_r$ have values. Decomposing $\mathcal{A}_r$ solely based on its own non-zero entries is not accurate enough. To this end, we build another tensor $\mathcal{A}_h$ based on the historical trajectories over a long period of time (e.g. one month). As shown in Figure 3, $\mathcal{A}_h$ has the same structure as $\mathcal{A}_r$, while an entry $\mathcal{A}_h(i, j, k) = c'$ denotes the $j$th driver's average travel time on the $i$th road segment in time slot $k$ in the history. Intrinsically, $\mathcal{A}_h$ is much denser than $\mathcal{A}_r$, denoting the historical traffic patterns and drivers' behavior on an entire road network. For instance, using one-month trajectories and setting 30 minutes as a time slot, the non-zero entries of $\mathcal{A}_h$ are about 0.4%. Decomposing $\mathcal{A}_r$ and $\mathcal{A}_h$ together reduces the error of supplementing $\mathcal{A}_r$.

Besides $\mathcal{A}_h$, we also construct another two matrices $X$ and $Y$ to help the decomposition of $\mathcal{A}_r$. Specifically, as illustrated in Figure 4 A), $Y$ stores the geographical features of each road segment, such as $r.len$, $r.lev$, $r.dir$, $r.n$, the number of neighbors at its terminals (e.g., 2 and 3 neighbors in the illustrated example), and a tortuosity ratio, as well as the distribution of Points of Interest (POIs) around $r$'s terminals. While $Y$ captures the similarity between different road segments in geographic spaces, matrix $X$ (consisting of $X_r$ and $X_h$) represents the correlation between different time slots in terms of the coarse-grained traffic conditions. More specifically, we partition a city into disjoint and uniform grids (e.g., 4×4 in Figure 4 B), each of which is comprised of many road segments. $X_r$ is built based on the recent trajectory data received from $t_i$ to $t_j$ (e.g., 1pm-3pm), reflecting the present traffic conditions on a road network. An entry of $X_r$ denotes the number of vehicles traversing a particular grid in a particular time slot. A row of $X_r$ represents the coarse-grained traffic conditions in a city in a particular time slot. Consequently, the similarity of two different rows indicates the correlation of traffic flows between two time slots. Additionally, in contrast to using the traffic flow on each individual road segment in $\mathcal{A}_r$, $X_r$ can be filled densely, and therefore can help reduce the error of decomposing $\mathcal{A}_r$. $X_h$ has the same structure as $X_r$, storing the historical average number of vehicles traversing a grid from $t_i$ to $t_j$. In other words, $X_r$ and $X_h$ respectively correspond to the coarse-grained current and historical traffic conditions in the same span of time of day. In the implementation, we build $\mathcal{A}_h$ and $X_h$ of an entire day in advance and retrieve the entries according to the current time (and the number of time slots $L$ needed) when constructing $\mathcal{A}$ and $X$. For example, as shown in Figure 4 C), the rows from $t_i$ to $t_j$ will be retrieved from the prebuilt $X_h$ to construct $X$ with $X_r$.



Figure 4. Constructing context matrices

3.2 Tensor Decomposition

To achieve a high accuracy of decomposition, we put together $\mathcal{A}_r$ and $\mathcal{A}_h$ (i.e., $\mathcal{A} = \mathcal{A}_r \,\|\, \mathcal{A}_h$, as shown in Figure 3), decomposing $\mathcal{A}$ with context matrices $X$ and $Y$ collaboratively. The objective function is defined as Equation 3,

$$\mathcal{L}(\mathcal{S}, R, U, T, F, G) = \frac{1}{2}\left\| \mathcal{A} - \mathcal{S} \times_R R \times_U U \times_T T \right\|^2 + \frac{\lambda_1}{2}\left\| X - TG \right\|^2 + \frac{\lambda_2}{2}\left\| Y - RF \right\|^2 + \frac{\lambda_3}{2}\left( \|\mathcal{S}\|^2 + \|R\|^2 + \|U\|^2 + \|T\|^2 + \|F\|^2 + \|G\|^2 \right), \qquad (3)$$

where $\mathcal{A} \in \mathbb{R}^{N \times M \times 2L}$ and $X \in \mathbb{R}^{2L \times P}$, $P$ denotes the number of grids; $Y \in \mathbb{R}^{N \times Q}$, $Q$ denotes the dimension of geographical features; $T \in \mathbb{R}^{2L \times d_T}$, $G \in \mathbb{R}^{d_T \times P}$, $R \in \mathbb{R}^{N \times d_R}$ and $F \in \mathbb{R}^{d_R \times Q}$ are low rank latent factor matrices for time slots, grids, roads and geographical features. Later, we can recover $\mathcal{A}$ according to $\mathcal{A}_{rec} = \mathcal{S} \times_R R \times_U U \times_T T$. $\lambda_1$, $\lambda_2$, and $\lambda_3$ are parameters controlling the contribution of different parts.

In our model, $\mathcal{A}$ and $X$ share matrix $T$, and $\mathcal{A}$ and $Y$ share matrix $R$. The dense representation of $X$ and $Y$ helps generate a relatively accurate $T$ and $R$, which reduce the decomposition error of $\mathcal{A}$ in turn. Additionally, the combination of $X_r$ and $X_h$ reveals how the current coarse-grained traffic condition deviates from its historical patterns. The information of the deviation is then propagated to $\mathcal{A}$, helping figure out the fine-grained deviation between current traffic conditions and historical traffic patterns on each road segment. So, our model considers both geospatial and temporal correlations. It also incorporates the knowledge from present and historical traffic data. As there is no closed-form solution for finding the optimal result of Equation 3, we use a numeric method, gradient descent, to find a local optimum, as presented in Figure 5.

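Algorithm 1: Tensor Decomposition

Input: tensor $\mathcal{A}$, matrix $X$, and matrix $Y$, an error threshold $\varepsilon$

Output: $\mathcal{S}$, $R$, $U$, $T$

1. Initialize $\mathcal{S} \in \mathbb{R}^{d_R \times d_U \times d_T}$, $R \in \mathbb{R}^{N \times d_R}$, $U \in \mathbb{R}^{M \times d_U}$, $T \in \mathbb{R}^{2L \times d_T}$, $F \in \mathbb{R}^{d_R \times Q}$, $G \in \mathbb{R}^{d_T \times P}$ with small random values

2. Set $\eta$ as step size

3. While $Loss_t - Loss_{t+1} > \varepsilon$

4. Foreach $\mathcal{A}_{ijk} \neq 0$

5. $\hat{\mathcal{A}}_{ijk} = \mathcal{S} \times_R R_{i*} \times_U U_{j*} \times_T T_{k*}$;

6. $R_{i*} \leftarrow R_{i*} - \eta\lambda_3 R_{i*} - \eta(\hat{\mathcal{A}}_{ijk} - \mathcal{A}_{ijk}) \times \mathcal{S} \times_U U_{j*} \times_T T_{k*} - \eta\lambda_2 (R_{i*} \times F - Y_{i*}) \times F^{\top}$;

7. $U_{j*} \leftarrow U_{j*} - \eta\lambda_3 U_{j*} - \eta(\hat{\mathcal{A}}_{ijk} - \mathcal{A}_{ijk}) \times \mathcal{S} \times_R R_{i*} \times_T T_{k*}$;

8. $T_{k*} \leftarrow T_{k*} - \eta\lambda_3 T_{k*} - \eta(\hat{\mathcal{A}}_{ijk} - \mathcal{A}_{ijk}) \times \mathcal{S} \times_R R_{i*} \times_U U_{j*} - \eta\lambda_1 (T_{k*} \times G - X_{k*}) \times G^{\top}$;

9. $\mathcal{S} \leftarrow \mathcal{S} - \eta\lambda_3 \mathcal{S} - \eta(\hat{\mathcal{A}}_{ijk} - \mathcal{A}_{ijk}) \times R_{i*} \otimes U_{j*} \otimes T_{k*}$;

10. $G \leftarrow G - \eta\lambda_3 G - \eta\lambda_1 T_{k*}^{\top} \times (T_{k*} \times G - X_{k*})$;

11. $F \leftarrow F - \eta\lambda_3 F - \eta\lambda_2 R_{i*}^{\top} \times (R_{i*} \times F - Y_{i*})$;

12. Return $\mathcal{S}$, $R$, $U$, $T$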

Figure 5. Algorithm for decomposing a tensor
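For readers who want to experiment, the coupled objective (a tensor fit plus two context-matrix fits plus an l2 penalty) can be minimized with a few lines of NumPy. The sketch below is not the element-wise algorithm of Figure 5: it swaps in plain batch gradient descent over all observed entries at once, stores missing tensor entries as 0 with a boolean mask, and uses hypothetical names (`reconstruct`, `loss`, `gradient_step`, the parameter dictionary `p`).

```python
import numpy as np


def reconstruct(S, R, U, T):
    """Tucker reconstruction: core tensor times the three factor matrices."""
    return np.einsum("abc,ia,jb,kc->ijk", S, R, U, T)


def loss(A, mask, X, Y, p, lam1, lam2, lam3):
    """Masked tensor fit + context fits (X ~ T G, Y ~ R F) + l2 penalty."""
    S, R, U, T, F, G = (p[k] for k in "SRUTFG")
    fit = 0.5 * np.sum((mask * (reconstruct(S, R, U, T) - A)) ** 2)
    ctx = (0.5 * lam1 * np.sum((X - T @ G) ** 2)
           + 0.5 * lam2 * np.sum((Y - R @ F) ** 2))
    reg = 0.5 * lam3 * sum(np.sum(v ** 2) for v in p.values())
    return fit + ctx + reg


def gradient_step(A, mask, X, Y, p, lam1, lam2, lam3, eta):
    """One batch gradient-descent update of all six factors."""
    S, R, U, T, F, G = (p[k] for k in "SRUTFG")
    E = mask * (reconstruct(S, R, U, T) - A)   # residual on observed entries
    ex, ey = T @ G - X, R @ F - Y              # context residuals
    grads = {
        "S": np.einsum("ijk,ia,jb,kc->abc", E, R, U, T),
        "R": np.einsum("ijk,abc,jb,kc->ia", E, S, U, T) + lam2 * ey @ F.T,
        "U": np.einsum("ijk,abc,ia,kc->jb", E, S, R, T),
        "T": np.einsum("ijk,abc,ia,jb->kc", E, S, R, U) + lam1 * ex @ G.T,
        "F": lam2 * R.T @ ey,
        "G": lam1 * T.T @ ex,
    }
    # The l2 penalty contributes lam3 * factor to every gradient.
    return {k: p[k] - eta * (grads[k] + lam3 * p[k]) for k in p}
```

Because the time-slot factor is shared with the temporal context matrix and the road factor with the geographic one, the dense context matrices pull those factors toward sensible values even where the tensor itself has few observations. A batch update is easier to read than per-entry updates, at the price of touching the whole tensor on every step.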

The symbol "$\times$" denotes the matrix multiplication; $\times_R$ stands for the tensor-matrix multiplication, where the subscript $R$ stands for the direction, e.g., $\mathcal{H} = \mathcal{S} \times_R R$ is $\mathcal{H}_{ijk} = \sum_{l} \mathcal{S}_{ljk} \times R_{il}$; $\otimes$ is the tensor outer product (also called Kronecker product); the entries of the $i$th row of matrix $R$ are represented as $R_{i*}$. More specifically, we use an element-wise optimization algorithm (instead of batch decomposition) [10], which updates the factors independently (meaning they can be performed in parallel).

In reality, tensor $\mathcal{A}$ is very large, given hundreds of thousands of road segments and tens of thousands of drivers. Decomposing such a big tensor is very time consuming, therefore reducing the feasibility of our method in providing online services. To address this issue, as illustrated in Figure 6, we partition a city into several disjoint regions, building a tensor for each region based on the data of the region. The matrices $X$ and $Y$ are built in each smaller region accordingly. By setting a proper splitting boundary, we try to keep these small tensors at a similar size. As a result, $\mathcal{A}$ is replaced by a few small tensors, which will be factorized in parallel and more efficiently. We validate (in later experiments) that the partition does not compromise the accuracy of the original decomposition when choosing a proper number of partitions.

Figure 6. Spatial partition for expediting the tensor decomposition

4. OPTIMAL CONCATENATION (OC)

4.1 Objective Function

Given a path  covered by trajectories, we need to find the best

concatenation that results in an accurate travel time estimation.

Intuitively, the best decomposition is the one that achieves the

lowest empirical risk between the estimate and true travel time 



Suppose  is decomposed as 



||



||||



, where the estimated

travel time is 























, the squared empirical risk is

then wrote as,



,



,



,,



































, (4)

Hence, our problem is to search for the best concatenation which

yields the least empirical risk, formally defined as,

argmin





,



,,





,



,



,,



subject to 



||



||||



. (5)

To come up with a computable form of $R(P_1, P_2, \ldots, P_k)$, we relate $R(P_1, P_2, \ldots, P_k)$ with $E\big[(y_{P_i} - \hat{y}_{P_i})^2\big]$, where $y_{P_i}$ is the true travel time of sub-path $P_i$. It is fair to assume that if $P = P_1 \| P_2 \| \cdots \| P_k$ then $y_P = \sum_{i=1}^{k} y_{P_i}$. Hence we have

$$
\begin{aligned}
R(P_1, P_2, \ldots, P_k) &= E\big[(y_P - \hat{y}_P)^2\big] = E\Big[\Big(\sum_{i=1}^{k} y_{P_i} - \sum_{i=1}^{k} \hat{y}_{P_i}\Big)^2\Big] \\
&= E\Big[\sum_{i=1}^{k} (y_{P_i} - \hat{y}_{P_i})^2 + \sum_{i=1}^{k} \sum_{j \neq i} (y_{P_i} - \hat{y}_{P_i})(y_{P_j} - \hat{y}_{P_j})\Big] \\
&= \sum_{i=1}^{k} E\big[(y_{P_i} - \hat{y}_{P_i})^2\big] + \sum_{i=1}^{k} \sum_{j \neq i} E\big[(y_{P_i} - \hat{y}_{P_i})(y_{P_j} - \hat{y}_{P_j})\big].
\end{aligned}
$$

If we assume $\hat{y}_{P_i}$ and $\hat{y}_{P_j}$ are independent, we have $E\big[(y_{P_i} - \hat{y}_{P_i})(y_{P_j} - \hat{y}_{P_j})\big] = E\big[y_{P_i} - \hat{y}_{P_i}\big]\, E\big[y_{P_j} - \hat{y}_{P_j}\big] = 0$. Therefore,

$$R(P_1, P_2, \ldots, P_k) = \sum_{i=1}^{k} E\big[(y_{P_i} - \hat{y}_{P_i})^2\big]. \tag{6}$$

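Further,

$$
\begin{aligned}
E\big[(y_{P_i} - \hat{y}_{P_i})^2\big] &= E\Big[\Big(y_{P_i} - \frac{1}{n_i} \sum_{j=1}^{n_i} y_{P_i,j}\Big)^2\Big] \\
&= \frac{1}{n_i^2} \sum_{j=1}^{n_i} E\big[(y_{P_i} - y_{P_i,j})^2\big] \\
&= \frac{1}{n_i^2} \sum_{j=1}^{n_i} \operatorname{Var}(y_{P_i} - y_{P_i,j}) \\
&= \frac{1}{n_i} \operatorname{Var}(y_{P_i,j}),
\end{aligned} \tag{7}
$$

where $n_i$ is the number of drivers passing $P_i$, and $y_{P_i,j}$ denotes the $j$-th driver's travel time on $P_i$; $\operatorname{Var}(y_{P_i,j})$ is the variance of these drivers' travel times. Then, Equation 5 can be represented as:

$$\operatorname*{arg\,min}_{P_1, P_2, \ldots, P_k} \sum_{i=1}^{k} \frac{1}{n_i} \operatorname{Var}(y_{P_i,j}) \quad \text{subject to } P_1 \| P_2 \| \cdots \| P_k = P. \tag{8}$$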

Equation 8 reflects the aforementioned tradeoff between the support and the length of the sub-paths in a concatenation. On one hand, it is easier to find more drivers traveling a shorter sub-path. The more drivers pass a sub-path (i.e., the higher the support $n_i$), the smaller the error of the inferred travel time. On the other hand, the shorter a sub-path is, the bigger the variance in travel time tends to be, because traveling a short path involves a lot of uncertainty. For example, if a driver travels only one road segment, the travel time will be significantly influenced by a traffic light. As a result, different drivers' travel times could be dramatically different.
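The tradeoff can be made concrete by evaluating the objective of Equation 8 for two candidate decompositions. The sketch below is only an illustration: the travel times are invented, the function name `concatenation_risk` is hypothetical, and using the population variance (`statistics.pvariance`) for the variance term is an assumption.

```python
from statistics import pvariance

def concatenation_risk(sub_path_times):
    """Equation 8 objective: sum over sub-paths of Var(times) / n_i.

    sub_path_times holds one list per sub-path with the travel times (seconds)
    of the n_i drivers who traversed that sub-path in the current time slot.
    """
    return sum(pvariance(times) / len(times) for times in sub_path_times)

# Hypothetical data: four drivers cover each single segment,
# but only two drivers cover the two-segment path as a whole.
per_segment = [[50.0, 70.0, 90.0, 110.0], [40.0, 60.0, 80.0, 100.0]]
whole_path = [[150.0, 154.0]]

risk_segments = concatenation_risk(per_segment)  # high support, high variance
risk_whole = concatenation_risk(whole_path)      # low support, low variance
```

With these made-up numbers the longer sub-path wins despite its lower support, because its variance is much smaller. With noisier whole-path observations the per-segment decomposition would win instead.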

4.2 Dynamic Programming Solution

To solve the optimization problem shown in Equation 8, we propose a dynamic programming solution. Suppose a path $P$: $r_1 \to r_2 \to \cdots \to r_n$. For a sub-path $P_i$, denote $f(P_i) = \frac{1}{n_i} \operatorname{Var}(y_{P_i,j})$; then the optimization problem of $P$ can be represented as Equation 9.

$$\operatorname*{arg\,min}_{P_1, P_2, \ldots, P_k} \sum_{i=1}^{k} f(P_i) \quad \text{subject to } P_1 \| P_2 \| \cdots \| P_k = P. \tag{9}$$

Let $D_i$ be the minimal value of the objective in the above problem for the sub-path $r_1 \to r_2 \to \cdots \to r_i$; then the minimal value of the squared empirical risk function of $P$ is $D_n$. Additionally, with $D_0 = 0$, we have a state transition function as Equation 10.

$$D_i = \min_{0 \le j < i} \big( D_j + f(r_{j+1} \| \cdots \| r_i) \big). \tag{10}$$

Algorithm 2: Query path decomposition

Input: a query path $P$: $r_1 \to r_2 \to \cdots \to r_n$, a collection of trajectory patterns, a time slot $t$, trajectories received in $t$, and tensor $\mathcal{A}_r$

Output: $\Psi_n$, the optimal concatenation of path $P$

1. $D_0 \leftarrow 0$, $\Psi_0 \leftarrow \emptyset$;
2. For $i \leftarrow 1$ to $n$ do
3.   $D_i \leftarrow \infty$; $\Psi_i \leftarrow \emptyset$;
4.   For $j \leftarrow i$ down to 1 do
5.     $P' \leftarrow r_j \to \cdots \to r_i$;
6.     $U \leftarrow$ retrieve the drivers traversing (or partially traversing) $P'$ from the trajectory database
7.     Foreach $u \in U$ do
8.       $y^{\mathcal{A}}_{P',u,t} \leftarrow 0$;
9.       Foreach $r \in P'$ not traversed by $u$'s trajectory $Tr$
10.        $y^{\mathcal{A}}_{P',u,t} \leftarrow y^{\mathcal{A}}_{P',u,t} + \mathcal{A}_r(r,u,t)$;
11.      $y^{Tr}_{P',u,t} \leftarrow$ Calculate the time for the rest of $P'$ based on $Tr$;
12.      $y_{P',u,t} \leftarrow y^{\mathcal{A}}_{P',u,t} + y^{Tr}_{P',u,t}$;
13.    $f(P') \leftarrow \frac{1}{|U|} \operatorname{Var}(y_{P',u,t})$;
14.    If $D_{j-1} + f(P') < D_i$
15.      $D_i \leftarrow D_{j-1} + f(P')$;
16.      $\Psi_i \leftarrow \Psi_{j-1} \| P'$;
17. Return $\Psi_n$;

Figure 7. Algorithm for finding the most optimal concatenation

Using Algorithm 2 shown in Figure 7, we solve this problem with a complexity of $O(n^2 \cdot m)$, where $n$ is the number of road segments in $P$ and $m$ is the number of drivers passing a segment.
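The recurrence in Equation 10 can be sketched as a short dynamic program. This is a simplified illustration rather than Algorithm 2 itself: it assumes the per-driver travel times of every usable sub-path are already available in a dictionary (the hypothetical `driver_times`), skips the tensor-based filling step, and uses the population variance.

```python
import math
from statistics import pvariance

def optimal_concatenation(segments, driver_times):
    """Minimise sum_i Var(times on P_i) / n_i over all splits of `segments`.

    driver_times maps a tuple of consecutive segment IDs to the list of travel
    times observed on that sub-path; sub-paths missing from it are unusable.
    Returns (minimal risk, list of chosen sub-paths).
    """
    n = len(segments)
    best = [0.0] + [math.inf] * n   # best[i]: minimal risk for the first i segments
    split = [0] * (n + 1)           # split[i]: start index of the last sub-path
    for i in range(1, n + 1):
        for j in range(i, 0, -1):   # candidate last sub-path is segments[j-1:i]
            times = driver_times.get(tuple(segments[j - 1:i]))
            if not times:
                continue
            cost = best[j - 1] + pvariance(times) / len(times)
            if cost < best[i]:
                best[i], split[i] = cost, j - 1
    # Walk the split points backwards to recover the chosen sub-paths.
    parts, i = [], n
    while i > 0 and best[i] < math.inf:
        parts.append(tuple(segments[split[i]:i]))
        i = split[i]
    return best[n], parts[::-1]

# Hypothetical observations for a three-segment query path.
driver_times = {
    ("r1",): [50.0, 70.0, 90.0, 110.0],
    ("r2",): [40.0, 60.0, 80.0, 100.0],
    ("r3",): [30.0, 34.0],
    ("r1", "r2"): [150.0, 154.0],
}
risk, parts = optimal_concatenation(["r1", "r2", "r3"], driver_times)
```

Note that in this sketch a sub-path observed by a single driver has zero variance and therefore zero cost, so a practical version would likely require a minimum support before accepting a sub-path.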

In practice, it is not necessary to check every concatenation of a

path, as many sub-paths may not be traversed by any trajectory in

the current time slot. To further improve the efficiency of our

solution, we mine frequent trajectory patterns from historical

trajectories in advance. Then, we just need to check the concatenations of trajectory patterns. Specifically, we can stop the iteration at Line 4 in Algorithm 2 if $P'$ is not a trajectory pattern.

We use a suffix-tree-based algorithm [18] to find the frequent trajectory patterns. Specifically, after being map-matched, a trajectory can be regarded as a string of road segment IDs. In a suffix tree where a node denotes a road segment ID, a trajectory is then represented as a path on the tree. For example, the four trajectories shown in Figure 1 can be represented as the tree depicted in Figure 8 A), where $r_1 \to r_2 \to r_3$ is the leftmost path of the tree. $r_2 \to r_3$ and $r_3$ are suffixes of $r_1 \to r_2 \to r_3$. The number associated with each link stands for the number of trajectories passing the path (i.e., the support). If setting 2 as the minimum support, we find $r_1 \to r_2$, $r_2 \to r_3$, and $r_3$ are patterns. In reality, the suffix tree is built based on historical trajectories over a long period of time. As long trajectory patterns are very rare, we set the maximum length of a pattern to 20 road segments.

Figure 8. Mining frequent trajectory patterns and using them with the tensor
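The support counting behind the pattern mining can be sketched without a real suffix tree. The code below is a stand-in, not the algorithm of [18]: it counts every contiguous sub-path directly in a `Counter`, which gives the same supports as the link counts of a suffix tree on small inputs. The trajectories are hypothetical (they are not the ones in Figure 1), and `mine_patterns` is an illustrative name.

```python
from collections import Counter

def mine_patterns(trajectories, min_support, max_length=20):
    """Count every contiguous sub-path of up to max_length segments; keep frequent ones.

    Each trajectory is a list of road segment IDs (the result of map-matching).
    The support of a sub-path is the number of trajectories containing it.
    """
    support = Counter()
    for traj in trajectories:
        seen = set()
        for start in range(len(traj)):            # every suffix of the trajectory ...
            stop = min(len(traj), start + max_length)
            for end in range(start + 1, stop + 1):
                seen.add(tuple(traj[start:end]))   # ... contributes each of its prefixes
        support.update(seen)                       # count each trajectory at most once
    return {path: count for path, count in support.items() if count >= min_support}

# Hypothetical map-matched trajectories.
trajectories = [["r1", "r2", "r3"], ["r1", "r2", "r6"], ["r2", "r3"], ["r5", "r4"]]
patterns = mine_patterns(trajectories, min_support=2)

def is_pattern(sub_path):
    """Pruning test for the inner loop of the dynamic program."""
    return tuple(sub_path) in patterns
```

Because support can only shrink as a sub-path grows, a failed `is_pattern` check on a sub-path also rules out every longer sub-path that contains it, which is what justifies stopping the inner loop early.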

4.3 Working with Tensor $\mathcal{A}_r$

Note that a query path may have some road segments that are not traversed by any trajectory in the current time slot, though these segments may belong to a trajectory pattern (in history). Following the example shown in Figure 1, we demonstrate in Figure 8 B) how $\mathcal{A}_r$ is used with trajectory patterns to help the decomposition of a query path. To estimate the travel time of a query path $P$: $r_1 \to r_2 \to r_3 \to r_4$ in time slot $t$, we first search the suffix tree, which was built based on the trajectory data over a long history (not the one shown in the left part of Figure 8 A), for the trajectory patterns that $P$ contains, e.g., $r_1 \to r_2$ and $r_3 \to r_4$. To calculate $f(r_3 \to r_4)$ defined in Equation 9, we need to know the travel time of each driver passing $r_3 \to r_4$. However, $r_4$ is not traversed by any trajectory in time slot $t$. That is, $y_{r_3 \to r_4, u, t}$ is unknown for every driver, though $y_{r_3, u, t}$ can be calculated based on the recently received trajectories, i.e., $Tr_1$, $Tr_2$ and $Tr_3$. To address this issue, we retrieve $\mathcal{A}_r(r_4, u_1, t)$, $\mathcal{A}_r(r_4, u_2, t)$, and $\mathcal{A}_r(r_4, u_3, t)$ from $\mathcal{A}_r$ and calculate $y_{r_3 \to r_4, u, t}$ for $(u_1, u_2, u_3)$, respectively, by Equation 11.

$$y_{r_3 \to r_4, u, t} = y_{r_3, u, t} + \mathcal{A}_r(r_4, u, t). \tag{11}$$
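The filling step of Equation 11 can be sketched as follows. The data layout is an assumption made for illustration: the tensor is modelled as a dictionary keyed by (segment, driver, slot), the helper name `fill_pattern_time` and the segment and driver IDs are hypothetical, and treating an absent entry like a negative one is an extra assumption beyond the text, which only describes the fallback for negative entries.

```python
def fill_pattern_time(observed_seconds, missing_segments, driver, slot,
                      tensor, historical_avg):
    """Add tensor entries for the segments the driver did not traverse in `slot`.

    observed_seconds is the driver's time on the part of the pattern covered by
    a received trajectory; tensor maps (segment, driver, slot) to a recovered
    travel time; historical_avg maps a segment to its long-term average time.
    """
    total = observed_seconds
    for seg in missing_segments:
        entry = tensor.get((seg, driver, slot))
        if entry is None or entry < 0:
            # Negative (or, by assumption here, absent) entries fall back
            # to the historical average travel time of the segment.
            entry = historical_avg[seg]
        total += entry
    return total

# Hypothetical recovered tensor entries for one uncovered segment "rB".
tensor = {("rB", "u1", 7): 35.0, ("rB", "u2", 7): -4.0}
historical_avg = {"rB": 40.0}

time_u1 = fill_pattern_time(120.0, ["rB"], "u1", 7, tensor, historical_avg)
time_u2 = fill_pattern_time(130.0, ["rB"], "u2", 7, tensor, historical_avg)
```

The per-driver totals produced this way are the inputs to the variance term of the objective, so a driver-specific tensor entry changes the variance estimate in a way that a single shared average could not.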

Having these travel times, we can calculate the optimal concatenation according to Equations 9 and 10 and Algorithm 2. When the supplement of an entry is negative, we resort to the historical average travel time. The dimension of users in tensor $\mathcal{A}_r$ enables us to retrieve a more accurate travel time for a particular driver, resulting in a better estimate of the variance of travel times (as in Equation 8). We validate that this is more accurate than just using a historical average of

[Figure 8 panels: A) An example of suffix-tree; B) Filling in the missing time for a pattern]
