Mining GPS Data for Trajectory Recommendation

Abstract
The wide use of GPS sensors in smart phones encourages people to record their personal trajectories and share them with others on the Internet. A recommendation service is needed to help people process the large quantity of trajectories and select potentially interesting ones. GPS trace data is a new format of information, and few works focus on building user preference profiles from it. In this work we proposed a trajectory recommendation framework and developed three recommendation methods, namely, Activity-Based Recommendation (ABR), GPS-Based Recommendation (GBR) and Hybrid Recommendation. ABR recommends trajectories relying purely on activity tags. For GBR, we proposed a generative model to construct user profiles based on GPS traces. Hybrid Recommendation combines ABR and GBR. Finally, we conducted extensive experiments to evaluate the proposed solutions, and the hybrid solution displayed the best performance.


Peifeng Yin¹, Mao Ye², Wang-Chien Lee¹, and Zhenhui Li³
¹ Department of Computer Science and Engineering, Pennsylvania State University
² Pinterest, San Francisco Bay Area, CA
³ College of Information Science and Technology, Pennsylvania State University
pzy102@cse.psu.edu, m.daniel.ye@gmail.com, wlee@cse.psu.edu, jessieli@ist.psu.edu
1 Introduction
With the rapid development of mobile devices, wireless networks and Web 2.0
technology, a number of location-based sharing services, e.g., Foursquare¹,
Facebook Place², Everytrail³ and GPSXchange⁴, have emerged in recent years.
Among them, Everytrail and GPSXchange stand out because they allow users to
share their outdoor experiences by uploading GPS trajectory data of various
outdoor activities, e.g., hiking and biking. By sharing trajectory information,
these Web 2.0 sites provide excellent resources for their users to plan or
explore outdoor activities of interest.
The large number of trajectories available on those web sites makes it
challenging for users to find what they are searching for. Also, unlike
conventional items with rich text, it is difficult to judge whether a
trajectory is interesting or not based on the activity tag or the raw GPS data.
Therefore, in order to automatically discover interesting trajectories, a
trajectory recommendation service is highly desirable.
¹ http://www.foursquare.com
² http://www.facebook.com/places/
³ http://www.everytrail.com
⁴ http://www.gpsxchange.com/

2 P.Yin et. al.
Conventional collaborative filtering (CF) techniques do not fit the trajectory
recommendation problem. CF requires people to access the same items in order to
compute user interest similarity. However, on a trajectory sharing website, no
two people generate exactly the same trajectory, so user similarity cannot be
calculated by “accessing the same item”.
In this work, we explore the ideas of content-based recommendation techniques
[1, 8, 13]. We consider two types of trajectory “content”: activity tags and
GPS points. The activity tags, such as hiking or biking, are annotated by the
users themselves. The trajectory is represented as a sequence of GPS points with
corresponding time stamps.
Recommendation based on tags is named activity-based recommendation (ABR),
which utilizes the tag content (if available) to make trajectory
recommendations. Since the tags are manually labeled by the creator, they can be
treated as a good feature for a trajectory. Unfortunately, activity tags are not
always available for a GPS trajectory. In the Everytrail data we collected, about
12.61% of the trajectories do not have tags. Additionally, ABR may not be able
to make a useful recommendation if there are too many candidates with the same
tag. For example, in our collected data, 14% of all tagged trajectories are
tagged with “hiking”. One intuitive solution would be to use the geographical
region as a filter to eliminate infeasible candidates. However, this does not
really solve the problem. For example, after constraining the search result to
“San Fran”, we still found 96 hiking trajectories in the collected Everytrail
dataset. Finally, trajectories with the same tag may have different moving
patterns, which ABR is unable to capture. Consider two hiking fans. The first
one likes to take a gentle walk so she can take a lot of photographs, while the
other one treats hiking as physical exercise. Naturally, the two trajectories,
although both labeled as “hiking”, may contain very different features, which
ABR fails to capture.
Considering the weak points of ABR discussed above, we also exploit the sampled
points in GPS trajectories for recommendation and call the proposed technique
GPS-based recommendation (GBR). The raw GPS data contains plentiful movement
information (e.g., speed, change of speed, etc.), which implicitly captures the
user’s outdoor experiences. For example, techniques for using raw GPS data to
infer the transportation modes (e.g., taking bus, taking subway, biking and
walking) of trajectories have been studied [6, 7, 17, 18, 21, 22]. However,
these techniques are not applicable to our trajectory recommendation service
since we aim to capture users’ moving habits and use them to differentiate the
trajectories of the same activity type. Taking the example of the hiking fans
mentioned earlier, existing techniques can only classify both of them as
“hiking”. However, what a recommender system needs are more personalized moving
habits, e.g., gentle walking or intense trotting. We argue that such information
is embedded in GPS data and we aim to mine it to facilitate trajectory
recommendation.
The rest of the paper is organized as follows. Section 2 formally defines the
problem, introduces ABR and reviews the related work. Sections 3 and 4
respectively detail the GPS feature extraction and the generative model in GBR.
Section 5 presents the evaluation of our proposed solutions. Finally, Section 6
concludes the paper.

Mining GPS for Traj. Rec. 3
2 Preliminaries
In this section, we first formally introduce the trajectory recommendation
problem and discuss the sub-tasks to tackle the proposed problem. Then we provide
a comprehensive literature review on recommendation and trajectory related
research work.
2.1 Problem Formulation
A trajectory consists of two parts, i.e., an activity tag (which could be
absent) and a raw GPS trace. Formally, a trajectory is represented as
$T = \langle a, T_G \rangle$, where $a \in \{hiking, biking, \cdots, null\}$
denotes the activity tag and $T_G$ stands for the raw GPS trace.
The GPS trace is obtained via a GPS sensor, which samples the moving object’s
current location together with the sampling time stamp. Thus the original format
is a series of triples, as defined below.
Definition 1 (Raw GPS Trace) A GPS trace $T_G = \{pt_1, \cdots, pt_n\}$ is
defined as a series of sample points $pt_i = \langle x_i, y_i, t_i \rangle$,
where $x_i, y_i$ represent the latitude and longitude of the $i$-th point and
$t_i$ stands for the time stamp.
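As an illustration of these two definitions, the following minimal sketch shows one way a trajectory could be held in memory. The class and field names (`GpsPoint`, `Trajectory`, `lat`, `lon`, `t`) are hypothetical, and time stamps are assumed to be in seconds; the paper does not prescribe a storage format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class GpsPoint:
    """One sample pt_i = <x_i, y_i, t_i>."""
    lat: float  # x_i, latitude in degrees
    lon: float  # y_i, longitude in degrees
    t: float    # t_i, time stamp (assumed here: seconds)


@dataclass
class Trajectory:
    """T = <a, T_G>; activity is None when the tag is absent (null)."""
    activity: Optional[str]
    trace: List[GpsPoint]

    def duration(self) -> float:
        """Elapsed time between the first and the last sample."""
        if len(self.trace) < 2:
            return 0.0
        return self.trace[-1].t - self.trace[0].t


hike = Trajectory(
    activity="hiking",
    trace=[GpsPoint(37.80, -122.40, 0.0), GpsPoint(37.81, -122.41, 600.0)],
)
untagged = Trajectory(activity=None, trace=hike.trace)
```

Representing the missing tag as `None` keeps untagged trajectories usable by methods that rely only on the GPS trace.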
The recommendation problem is to find a subset of candidate trajectories
that could be of interest to an active user. More formally, given a collection of
trajectories $S = \{T_1, \cdots, T_n\}$ and a person $u$, recommendation needs to
find $k$ trajectories $S' = \{T_{r_1}, \cdots, T_{r_k}\}$ that $u$ is most
interested in. Suppose we have a ranking function $Score(T, u)$ that can compute
the “interest degree” of a trajectory to a user; the recommendation can then be
formulated as follows.
Definition 2 (Top-k Trajectory Recommendation) Given a trajectory set
$S = \{T_1, \cdots, T_n\}$, the recommendation service for user $u$ needs to find
a subset of $k$ trajectories $S' = \{T_{r_1}, \cdots, T_{r_k}\}$ so that
$\forall T_i \in S - S'$, we have

$$Score(T_i, u) \le \min_{T_j \in S'} Score(T_j, u) \qquad (1)$$
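The top-k condition can be sketched in a few lines once some scoring function is available. The helper name `top_k` and the toy scores below are hypothetical; the scores merely stand in for Score(T, u) of one fixed user.

```python
import heapq
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def top_k(candidates: Iterable[T], score: Callable[[T], float], k: int) -> List[T]:
    """Return the k candidates with the highest score, best first."""
    return heapq.nlargest(k, candidates, key=score)


# Hypothetical scores standing in for Score(T, u) of one fixed user u.
toy_scores = {"T1": -0.2, "T2": -1.5, "T3": -0.7, "T4": -3.0}
recommended = top_k(toy_scores, toy_scores.get, k=2)
```

Every trajectory left out of `recommended` scores no higher than the lowest-scored trajectory inside it, which is the property the definition asks for.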
The above definition reveals three problems for trajectory recommendation.
The first two are how to represent the trajectory (Feature Extraction) and the
user (User Profile Modeling) in a proper way to facilitate the computation of a
ranking score. The final one is how to design an effective ranking function
Score(T, u) to measure the “interest degree”.
2.2 Activity-based Recommendation
ABR tries to capture a person’s activity preferences based on her previously
shared trajectories. This preference for different activities is represented as
a series of probabilities, whose values are obtained by maximizing the joint
probability of the observed data.
Let $A = \{a_1, \cdots, a_n\}$ denote the collection of all activity tags and
$p_i$, $1 \le i \le n$, denote the probability that the user $u$ is interested
in activity $a_i$. Obviously $\sum_{i=1}^{n} p_i = 1$. For the user’s previously
published trajectories, the activity tags are $X = \{x_1, \cdots, x_m\}$, where
$x_j \in A$, $1 \le j \le m$. $X$ is the observed data for the user, and the
task is to estimate the user’s preference, i.e., the values of $p_i$, based on
these experiences. We assume that the instances $x_j \in X$ are independent of
each other, so the probability of observing $X$ is given in Equation (2).
$$P(X \mid p_1, \cdots, p_n) = \prod_{j=1}^{m} P(x_j \mid p_1, \cdots, p_n) = \prod_{j=1}^{m} \sum_{i=1}^{n} p_i \cdot \mathbf{1}_{x_j = a_i} = \prod_{i=1}^{n} p_i^{n_i} \qquad (2)$$
where $n_i$ represents the number of trajectories in $X$ that are tagged with
$a_i$. To learn the value of $p_i$, we need to maximize Equation (2) under the
constraint that the sum of all probabilities is equal to 1, which gives the
objective function shown in Equation (3).
$$L(p_1, \cdots, p_n) = \log P(X \mid p_1, \cdots, p_n) + \lambda \Big(1 - \sum_{i=1}^{n} p_i\Big) = \sum_{i=1}^{n} n_i \log p_i + \lambda \Big(1 - \sum_{i=1}^{n} p_i\Big) \qquad (3)$$
where λ is a Lagrange multiplier.
The objective function is solved by setting each partial derivative
$\partial L / \partial p_i$ to 0.
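For completeness, the stationarity condition can be worked out explicitly. The steps below follow from Equation (3) together with the constraint that the probabilities sum to 1; the last equality uses the fact that every one of the $m$ observed tags belongs to $A$.

```latex
\frac{\partial L}{\partial p_i} = \frac{n_i}{p_i} - \lambda = 0
\;\Longrightarrow\; p_i = \frac{n_i}{\lambda},
\qquad
\sum_{i=1}^{n} p_i = 1
\;\Longrightarrow\; \lambda = \sum_{j=1}^{n} n_j = m,
\qquad
p_i = \frac{n_i}{\sum_{j=1}^{n} n_j}.
```

In words, the estimated preference for an activity is simply the fraction of the user's tagged trajectories that carry that tag.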
For ABR, the ranking function is thus defined as:

$$Score_{abr}(T, u) = \log \sum_{i=1}^{n} p_i \cdot \mathbf{1}_{T.activity = a_i}, \qquad p_i = \frac{n_i}{\sum_{j=1}^{n} n_j} \qquad (4)$$
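A minimal sketch of ABR under these definitions is given below. The function names `fit_abr` and `score_abr` are hypothetical, and the handling of unseen or missing tags (a score of negative infinity) is an assumption made for the example; the text above does not say how such cases are treated.

```python
import math
from collections import Counter
from typing import Dict, List, Optional


def fit_abr(tags: List[str]) -> Dict[str, float]:
    """Maximum-likelihood estimate p_i = n_i / sum_j n_j from a user's past tags."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {activity: n / total for activity, n in counts.items()}


def score_abr(activity: Optional[str], profile: Dict[str, float]) -> float:
    """Score_abr(T, u) = log p_i for the activity a_i of trajectory T.

    Assumption (not from the paper): an activity the user never shared, or a
    missing tag, has p_i = 0 and is scored as -inf.
    """
    p = profile.get(activity, 0.0)
    return math.log(p) if p > 0 else float("-inf")


history = ["hiking", "hiking", "biking", "hiking"]
profile = fit_abr(history)
```

With this history, a hiking candidate scores log 0.75 and a biking candidate log 0.25, so all candidates sharing a tag receive the same score, which is the limitation of ABR discussed in the introduction.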
2.3 Related Work
Due to the wide use of GPS-equipped smart phones, much attention has been
focused on using trajectory data to improve people’s lives. Among these efforts,
transportation mode detection is most related to our work.
Zheng et al. [21, 22] collected GPS data from 47 people and compared different
machine learning techniques for classifying transportation modes. These methods,
however, cannot be used for recommendation. Trajectory recommendation requires
a ranking score for each candidate trajectory, while classification algorithms,
e.g., decision trees, can only output binary values. In [17, 18], Reddy et al.
compared and even ranked different types of trajectory features. One of the most
important features in their work is the instantaneous acceleration recorded by
an accelerometer. This information is usually unavailable for common trajectory
data since most smart phones are not equipped with accelerometers. In [6, 7],
different trajectories of moving objects, including eye-tracking, are collected
for transportation mode classification.
Trajectories contain plenty of valuable information. Previous classification
works explored different types of features that can capture the trajectory
modes well. However, they did not pay attention to the user’s moving habits,
which are also contained in trajectory data. Li et al. [10, 11] try to mine
moving patterns from GPS data of animals. GPS data in our case are records of a
person’s trips that happen at different places, and few of them overlap with
each other. Therefore no periodic patterns can be mined out of such “scattered”
data. In [9, 14, 15], the Discrete Fourier Transform is also used to extract
features from trajectory data. However, their goal is clustering, which is quite
straightforward with the extracted data. Our work develops a generative model
based on these features to learn user moving habits for recommendation.
Other works related to recommendation are based on semantic information of
trajectories [3, 4, 20]. These works treat a trajectory as a sequence of
“meaningful places” and use the semantic information of the locations, e.g.,
restaurants and shopping centers. In our case, trajectories do not have semantic
tags. Furthermore, not all trajectories contain meaningful locations. For
example, a hiking trajectory is unlikely to pass places such as restaurants or
shopping centers.
3 GPS Feature Extraction
In this section we focus on extracting features from GPS data. Specifically, we
introduce two types of features, i.e., partial-view feature (PVF) and entire-view
feature (EVF). The PVF mainly consists of physical values such as speed,
velocity, etc., and is easy to understand.
Given a trajectory’s raw GPS data, average velocity, average acceleration and
other physical measurements can be easily computed, and they represent some
characteristics of that trajectory. In this work, the PVF contains the total
length of the trajectory $Len$, the total time of the trajectory $Time$, the
top-$pf_1$ maximum velocities $\hat{V}_1, \cdots, \hat{V}_{pf_1}$ and the
top-$pf_2$ accelerations $\hat{A}_1, \cdots, \hat{A}_{pf_2}$.
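The PVF described above could be computed roughly as follows. This is a sketch under several assumptions that are not stated in the text: distances use the haversine formula, time stamps are in seconds and strictly increasing, acceleration is the change of segment speed divided by the later segment's duration, accelerations are ranked by magnitude, and short traces are zero-padded. The names `haversine_m` and `partial_view_feature` are hypothetical.

```python
import heapq
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (latitude_deg, longitude_deg, time_s)
EARTH_RADIUS_M = 6_371_000.0


def haversine_m(p: Point, q: Point) -> float:
    """Great-circle distance in metres between two samples."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(h)))


def partial_view_feature(trace: Sequence[Point], pf1: int, pf2: int) -> List[float]:
    """Return [Len, Time, top-pf1 speeds, top-pf2 acceleration magnitudes]."""
    if len(trace) < 2:
        return [0.0] * (2 + pf1 + pf2)
    dists = [haversine_m(a, b) for a, b in zip(trace, trace[1:])]
    dts = [b[2] - a[2] for a, b in zip(trace, trace[1:])]
    if any(dt <= 0 for dt in dts):
        raise ValueError("time stamps must be strictly increasing")
    # Average speed on each segment between consecutive samples.
    speeds = [d / dt for d, dt in zip(dists, dts)]
    # Assumed definition: speed change divided by the later segment's duration.
    accels = [abs(v2 - v1) / dt for v1, v2, dt in zip(speeds, speeds[1:], dts[1:])]
    top_v = heapq.nlargest(pf1, speeds)
    top_a = heapq.nlargest(pf2, accels)
    # Zero-pad so that every trajectory yields a vector of the same length.
    top_v += [0.0] * (pf1 - len(top_v))
    top_a += [0.0] * (pf2 - len(top_a))
    return [sum(dists), trace[-1][2] - trace[0][2], *top_v, *top_a]


toy_trace = [(0.0, 0.0, 0.0), (0.0, 0.001, 10.0), (0.0, 0.003, 20.0)]
feature = partial_view_feature(toy_trace, pf1=2, pf2=2)
```

The resulting vector always has 2 + pf1 + pf2 entries, which keeps trajectories of different lengths comparable.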
The EVF tries to capture the global features and is harder to understand
semantically. We adopt the Discrete Fourier Transform (DFT) to transform the GPS
data, and a discussion is provided in Section 3.2.
3.1 Entire-view Feature
Before applying DFT to GPS data, two issues need to be addressed. Firstly,
different trajectories may have different lengths, i.e., different numbers of
sampling points. If we take the whole GPS trace as input, DFT will generate
features that have different dimensions. This makes it difficult to compare two
trajectories, as they might be in different frequency spectrums. Secondly, there
are three kinds of signals that can be obtained from GPS traces, i.e., distance
signal, velocity signal and acceleration signal. We need to decide which one
should be used as DFT input.
For the first problem, a sliding window of fixed size is used to split the GPS
trace into several segments. The DFT coefficients of these segments are then
refined to form a GPS feature of the same size. This processing method is similar
to music compression and classification [12, 16]. As for the second problem, we
choose the speed signal because i) it is less affected by the sampling rate than
the distance signal and ii) it reflects the moving status more accurately than
the acceleration signal. Given two trajectories which have the same sampled data
points (i.e., latitude, longitude and the number of points) except for the time
stamps, the DFT features of the distance signal will be the same. However, the
moving status of the two trajectories could be quite different if the sampling
rates are not the same. The speed series avoids this weakness. Also, note that
the acceleration signal is converted from the velocity signal under the
assumption that the object is moving at a constant acceleration between two
sampled points. Each manipulation of the GPS data,
References
LIBSVM: A library for support vector machines
Item-based collaborative filtering recommendation algorithms
Fab: content-based, collaborative recommendation