What contributions have the authors mentioned in the paper "Symbolic representation and retrieval of moving object trajectories" ?

In this paper, normalized edit distance ( NED ) is proposed to measure the similarity between two trajectories. The authors evaluate the efficacy of NED and compare it with those of Euclidean distance, Dynamic Time Warping ( DTW ), and Longest Common Subsequences ( LCSS ), showing that NED is more robust and accurate for trajectories that contain noise and local time shifting. Furthermore, in order to improve the retrieval efficiency, the authors propose a novel representation of trajectories, called movement pattern strings, which convert the trajectories into a symbolic representation. The distances that are computed in a symbolic space are lower bounds of the distances of original trajectory data, which guarantees that no false dismissals will be introduced using movement pattern strings to retrieve trajectories.

What have the authors stated for future works in "Symbolic representation and retrieval of moving object trajectories" ?

Future work includes the following problems: 1. Finding an embedding method, which keeps both the lower bound property and the temporal order of elements in the strings.

What is the pruning power of MPS?

MPS has fairly stable pruning power across trajectory lengths because its strings preserve the order of the corresponding (movement direction, distance ratio) pairs. It also removes many false candidates because it takes the neighbors of each symbol into account.

How did they decompose the raw object sequences into components?

Chen and Chang [4] used the wavelet transform to decompose raw object trajectories (position sequences) into components at different scales.

What is the simplest way to use MPS as a filter?

Using MPS as filter is based on the assumption that the retrieval cost may be reduced due to the smaller size of MPS compared to movement sequences.

How many neighbors does each integer point in the frequency space have?

Because the number of neighbors of each integer point in the frequency space is limited (at most 8), the computation time of Algorithm 3 is still linear.

How do the authors use frequency vectors to reduce the cost of computing NED?

The authors define NMFD between two frequency vectors and use frequency vectors as filters to save CPU time when computing NED.

How do the authors get the results for LCSS and NED?

For the ASL data, the authors obtain the best results for LCSS and NED when εdir = 0.167π and εdis = 0.1 ∗ σmax, where σmax is the maximum value of the movement distance ratio in the data set, which can be obtained when raw trajectories are converted to movement sequences.

What is the algorithm for quantizing a (movement direction, distance ratio) pair?

Once the authors quantize the (movement direction, distance ratio) space into subregions and derive the movement alphabet A, the authors use Algorithm 1 to map a (movement direction, distance ratio) pair (θ, σ) into a symbol.
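Algorithm 1 itself is not reproduced on this page, so the following is only a minimal sketch of how such a mapping could look, assuming a uniform grid over the (movement direction, distance ratio) space. The grid sizes `n_dir` and `n_dis`, the value range `[0, sigma_max]`, the direction range `[0, 2π)`, and the letter alphabet are all hypothetical choices made for illustration, not the authors' parameters.

```python
import math
import string


def make_quantizer(n_dir=8, n_dis=4, sigma_max=1.0):
    """Build a (theta, sigma) -> symbol mapping on a uniform grid.

    n_dir, n_dis, and sigma_max are hypothetical parameters; the paper's
    Algorithm 1 may partition the space differently.
    """
    alphabet = string.ascii_letters  # 52 available symbols
    if n_dir * n_dis > len(alphabet):
        raise ValueError("grid has more cells than available symbols")

    def to_symbol(theta, sigma):
        # Wrap the direction into [0, 2*pi) and find its angular bin.
        theta = theta % (2 * math.pi)
        d = min(int(theta / (2 * math.pi) * n_dir), n_dir - 1)
        # Clamp the distance ratio into [0, sigma_max] and find its bin.
        sigma = min(max(sigma, 0.0), sigma_max)
        r = min(int(sigma / sigma_max * n_dis), n_dis - 1)
        # One symbol per grid cell, numbered row by row.
        return alphabet[d * n_dis + r]

    return to_symbol


def to_mps(movement_sequence, to_symbol):
    """Convert a list of (theta, sigma) pairs into a string of symbols."""
    return "".join(to_symbol(theta, sigma) for theta, sigma in movement_sequence)
```

With the default grid, `to_mps([(0.0, 0.0), (math.pi, 0.5)], make_quantizer())` yields a two-symbol string, one symbol per movement.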

What is the similarity measure that the authors propose?

The similarity measure that the authors propose takes the longest common subsequences, gap penalties and compared sequence lengths into consideration.

What is the algorithm for mapping a movement pattern string into a symbol?

Given a movement sequence MA = [(θa,1, σa,1), . . . , (θa,n, σa,n)] of length n and movement pattern alphabet A, a movement pattern string (MPS) is defined as a sequence of symbols: Sa,1Sa,2 . . .

What is the way to reduce false candidates in the retrieval of trajectory data?

Their experimental results confirm that NED is a suitable and superior similarity measure for trajectory data and feature vector with NMFD can effectively reduce the false candidates in trajectory retrieval.

What is the NED between the original movement sequences MA and MB?

NED between original movement sequences MA and MB is 0, whereas the NED between MPSA and MPSB that is computed based on the standard edit distance [20] is 1, which is not the lower bound of 0.

Why is the quantization map used to convert a (movement direction, distance ratio)?

This is because the (movement direction, distance ratio) pairs that are located near the boundary of quantization subregions may be assigned different symbols and require a replace operation that is not needed in the original sequence comparison.

Why does NED achieve the same number of correct results as LCSS?

Due to the lower bound property of NED on MPS, clustering on it achieves nearly the same number of correct results as that of clustering on original movement sequences.

How did they measure the distance between two trajectories?

Little and Gu [22] used the path and speed curves to represent the motion trajectories and measured the distance between two trajectories using DTW.

What is the difference between MPS and FV?

In terms of total retrieval efficiency, FV is much better than MPS because the computation cost of FV is linear, whereas that of MPS is quadratic.

What is the cost of converting movement patterns into MPS?

Even though the authors reduce the storage requirements by converting movement sequences into movement pattern strings, the cost of computing the NED between two MPSs is still O(n∗m), since the length of a movement sequence and that of its corresponding movement pattern string are the same.

What is the frequency distance between u and v?

Let u and v be integer points in s-dimensional space. The frequency distance FD(u, v) between u and v is defined as the minimum number of steps required to go from u to v (or, equivalently, from v to u) by moving to a neighbor point at each step.
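The definition above depends on what counts as a neighbor, which this page does not spell out. As a hedged illustration, the sketch below uses the classic frequency distance for plain strings, assuming that one step corresponds to one edit operation: an insertion (one coordinate +1), a deletion (one coordinate -1), or a replacement (one coordinate +1 and another -1). Under that assumption the minimum number of steps has a closed form. This is not the paper's modified frequency distance (MFD), and the function names are hypothetical.

```python
from collections import Counter


def frequency_vector(s, alphabet):
    """Count how often each alphabet symbol occurs in the string s."""
    counts = Counter(s)
    return [counts[ch] for ch in alphabet]


def frequency_distance(u, v):
    """Minimum number of neighbor steps from u to v.

    Assumes a step is an insert, a delete, or a replace as described in
    the lead-in. Replacements cancel one surplus against one deficit, so
    the answer is the larger of the total surplus and the total deficit.
    """
    surplus = sum(a - b for a, b in zip(u, v) if a > b)
    deficit = sum(b - a for a, b in zip(u, v) if b > a)
    return max(surplus, deficit)
```

For example, over the alphabet `"abcd"`, the vectors of `"abc"` and `"abd"` are one step apart (a single replacement), which matches their edit distance of 1.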

(Open Access) Symbolic representation and retrieval of moving object trajectories (2004) | Lei Chen

Symbolic Representation and Retrieval of Moving Object Trajectories

Lei Chen, M. Tamer Özsu

University of Waterloo

School of Computer Science

Waterloo, Canada

{l6chen,tozsu}@uwaterloo.ca

Vincent Oria

New Jersey Inst. of Technology

Dept. of Computer Science

Newark, New Jersey, USA

{vincent.oria@njit.edu}

Technical Report CS-2003-30 Sept 2003

Abstract

Similarity-based retrieval of moving object trajectories is useful in many applications, such as GPS systems and sports and surveillance video analysis. However, due to sensor failures, errors in detection techniques, or different sampling rates, noise, local shifts, and scaling may appear in the trajectory records. Hence, it is difficult to design a robust and fast similarity measure for similarity-based retrieval in a large database. In this paper, normalized edit distance (NED) is proposed to measure the similarity between two trajectories. We evaluate the efficacy of NED and compare it with those of Euclidean distance, Dynamic Time Warping (DTW), and Longest Common Subsequences (LCSS), showing that NED is more robust and accurate for trajectories that contain noise and local time shifting. Furthermore, in order to improve the retrieval efficiency, we propose a novel representation of trajectories, called movement pattern strings, which convert the trajectories into a symbolic representation. Movement pattern strings encode both the movement direction and the movement distance information of the trajectories. The distances that are computed in a symbolic space are lower bounds of the distances of original trajectory data, which guarantees that no false dismissals will be introduced using movement pattern strings to retrieve trajectories. Finally, we define a modified frequency distance for frequency vectors that are obtained from movement pattern strings to reduce the dimensionality of movement pattern strings and the computation cost of NED. The experimental results show that the cost of retrieving similar trajectories can be greatly reduced when the modified frequency distance is used as a filter.

1 Introduction

With the growth of mobile computing and the development of computer vision techniques, it has become possible to trace the trajectories of moving objects in real life and in videos. A number of interesting applications have been developed based on the analysis of trajectories. For example, using a GPS system, and by mining the trajectories of animals in a large farming area, it is possible to determine migration patterns of certain groups of animals. In sports videos, such as hockey, it is quite useful for coaches or sports researchers to know the movement patterns of top players. In a store surveillance video monitoring system, finding the customers' movement patterns may help in the arrangement of merchandise. All of these applications require the definition of an accurate and robust similarity measure to determine similarity among trajectories.

The trajectory of a moving object is defined as the successive positions of the moving object over a period of time. Therefore, trajectories can be considered as two (X-Y plane) or three (X-Y-Z space) dimensional time series data. Considerable research has been conducted on similarity-based retrieval on one dimensional time series data, such as stock or commodity prices, sales volume, weather data and biomedical measurements [1, 13, 14, 18, 19, 23, 24, 30]. A question that can be easily raised is: "Can we apply these techniques for one dimensional time series data to trajectories?" Unfortunately, the answer is negative; directly applying these techniques will not give satisfactory results. The reason is that trajectories of moving objects have their own characteristics, which are briefly introduced in the next section.

1.1 Characteristics of Trajectories

Compared to one dimensional time series data, trajectories of moving objects have the following differences:

• Trajectories are always two or three dimensional. Since each point of a trajectory is represented as a vector in two or three dimensions, dimensionality reduction techniques for one dimensional time series data, such as Discrete Fourier Transform (DFT) [1], Discrete Wavelet Transform (DWT) [19, 23], Singular Value Decomposition (SVD) [13, 18] and Piece-Wise Aggregate Approximation (PAA) [14, 30], cannot be applied to trajectories. Naively treating each dimension of the moving object positions independently, the trajectories can be considered as two or three one-dimensional time series. However, applying dimensionality reduction techniques independently on each of the dimensions will lead to the loss of valuable information on the interdependency among the dimensions embedded in the positions of a trajectory.

• Trajectories may have many outliers. Unlike stock, weather, or commodity price data, trajectories of moving objects are captured by recording the positions of the objects from time to time (or tracing the moving object from frame to frame in video data). Therefore, due to sensor failures or errors in detection techniques, many outliers may appear. The similarity measures for one dimensional time series data, such as Euclidean distance [1] and Dynamic Time Warping (DTW) [31], are very sensitive to noise and cannot be applied to trajectories [26].

• Similar movement patterns may appear in different spatial regions of trajectories. Different sampling rates of tracking and recording devices combined with different speeds of the moving objects may introduce various local scaling and shifting factors into trajectories. Several techniques have been proposed to remove the shifting and scaling effects by introducing shifting and scaling functions [5, 6]. Unfortunately, these techniques work fine for global shifting and scaling but not for the local shifting and scaling in movement patterns that appear in the trajectories.

After reviewing the complex characteristics of trajectory data, a question that comes to mind is: "Can we find a suitable similarity measure which takes these characteristics into consideration when we compare trajectories?" Furthermore, with the proposed similarity measure, "how can we improve the retrieval efficiency?" We address these two questions in this paper.

1.2 Accurate and Robust Similarity Measures for Trajectories

[Figure 1 image not reproduced: panels (a) and (b) each plot a pair of trajectories with a normalized LCSS of 0.36; panel (b) is marked with a gap.]

Figure 1: A comparison of trajectories with the same normalized LCSS but different gap sizes

Recently, Longest Common Subsequence (LCSS) has been proposed to measure the similarity between trajectories [26]. Compared to DTW and Euclidean distance, LCSS allows the matching sequence to stretch and some elements to be unmatched, which makes it robust to noise [26]. However, LCSS has difficulties in differentiating sequences that have longest common subsequences of the same length but different sizes of gaps in between. Figure 1 shows an example of this case (the original trajectory data are two dimensional; for clarity, the figures show only one dimension), where the normalized LCSS score [26] between trajectories T1 and T2 (Figure 1(a)) is the same as that between T1 and T3 (Figure 1(b)). However, by comparing the three trajectories (the horizontal grey lines are used to show the common subsequences between two trajectories), it is quite clear that T1 is more similar to T2 than to T3.
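The limitation described above can be reproduced on toy strings. The sketch below is a simplification under stated assumptions: it matches symbols exactly rather than within thresholds on real-valued trajectory points, and it normalizes by the length of the shorter sequence, which is one common convention and may differ from the score used in [26]. The function names are hypothetical.

```python
def lcss_length(a, b):
    """Length of the longest common subsequence of a and b (O(n*m) DP)."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                # Matching elements extend the common subsequence.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # Otherwise skip an element of a or of b, at no cost.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


def normalized_lcss(a, b):
    """LCSS length divided by the shorter length (assumed normalization)."""
    shortest = min(len(a), len(b))
    return lcss_length(a, b) / shortest if shortest else 0.0
```

Here `normalized_lcss("abcd", "abxcd")` and `normalized_lcss("abcd", "abxxxxcd")` give the same score, because skipped elements cost nothing: the measure cannot tell a gap of one element from a gap of four.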

In this paper, we define a distance measure called Normalized Edit Distance (NED) to measure the similarity between two trajectories. NED is based on Edit Distance (ED) [20], which is widely used in bioinformatics and speech recognition to measure the similarity between two strings. In contrast to LCSS, NED considers the gaps in between subsequences as well as the subsequences themselves. For example, for the trajectories shown in Figure 1, the value of NED is 0.7 between T1 and T2 and 0.78 between T1 and T3 (the detailed definition of NED is given in Section 2), which conforms to the perceptual similarity that T1 is more similar to T2 than to T3.
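To make the contrast with LCSS concrete, here is a minimal sketch of the standard edit distance on strings, computed with the usual dynamic program. The normalization shown (dividing by the longer length) is an assumption for illustration only; the paper's own definition of NED is given in its Section 2, which is not reproduced here, and it operates on real-valued movement sequences rather than on characters.

```python
def edit_distance(a, b):
    """Standard edit distance with unit-cost insert, delete, and replace."""
    n, m = len(a), len(b)
    prev = list(range(m + 1))  # distances from the empty prefix of a
    for i in range(1, n + 1):
        curr = [i] + [0] * m
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # delete a[i-1]
                          curr[j - 1] + 1,     # insert b[j-1]
                          prev[j - 1] + cost)  # match or replace
        prev = curr
    return prev[m]


def normalized_edit_distance(a, b):
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0
```

Unlike LCSS, every skipped element is charged: `edit_distance("abcd", "abxcd")` is 1 while `edit_distance("abcd", "abxxxxcd")` is 4, so a larger gap yields a larger distance. The nested loops also show why the cost grows with the product of the two sequence lengths.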

However, the space and time cost of computing NED is very high, increasing the retrieval cost as a consequence. Since edit distance is originally defined for strings, it seems possible to convert the real-valued trajectory data into strings and utilize the well defined algorithms and embedded distance functions of strings to improve the retrieval efficiency. Thus, we propose a novel trajectory representation, called movement pattern strings (MPS). An MPS is derived from a trajectory by quantizing the (movement direction, distance ratio) space into a set of distinct equal-sized subregions and representing each subregion by a symbol. Most importantly, the NED that is computed from two MPSs establishes the lower bound of the NED of two original sequences of movement direction and distance pairs, which guarantees that no false dismissals will be introduced using the symbolic representation. Furthermore, we define a modified frequency distance (MFD) between two frequency vectors (FV) of movement pattern strings to reduce the CPU time spent on computing the NED of two movement sequences. A normalized MFD (NMFD) between two FVs is also a lower bound of the NED between two trajectories. Therefore, we can directly use FV as a filter to remove the false candidates during the retrieval.
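The filtering idea in this paragraph follows the general filter-and-refine pattern: a cheap lower bound discards candidates that cannot qualify, and the expensive distance is computed only for the survivors. The sketch below illustrates that pattern on plain strings under loud assumptions: it uses the length difference as a stand-in lower bound for the standard edit distance (a known bound), not the paper's NMFD, and `range_query` and `length_lower_bound` are hypothetical names.

```python
def edit_distance(a, b):
    """Standard unit-cost edit distance (the expensive, exact measure)."""
    prev = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        curr = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost)
        prev = curr
    return prev[len(b)]


def length_lower_bound(a, b):
    """Cheap stand-in lower bound: the edit distance is at least |n - m|."""
    return abs(len(a) - len(b))


def range_query(query, database, radius, lower_bound, distance):
    """Return items within radius of query, plus the number of exact calls."""
    results = []
    exact_computations = 0
    for candidate in database:
        # Filter step: a lower bound above the radius rules the candidate
        # out safely, so no true answer is ever dismissed.
        if lower_bound(query, candidate) > radius:
            continue
        # Refine step: pay for the exact distance only when needed.
        exact_computations += 1
        if distance(query, candidate) <= radius:
            results.append(candidate)
    return results, exact_computations
```

Because the filter never overestimates the true distance, it can only produce false candidates (removed in the refine step), never false dismissals; a tighter lower bound simply prunes more candidates before the exact computation.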

1.3 Our Main Contributions

The main contributions of our paper are the following:

1. We define a distance measure, NED, based on ED, to measure the similarity between two trajectories. NED is more robust than DTW and Euclidean distance and more accurate than LCSS.

2. We develop a transformation scheme to convert a trajectory into a symbolic representation, called movement pattern strings, and prove that the NED that is computed over a symbolic space is the lower


