Dimensionality reduction for fast similarity search in large time series databases

doi:10.1007/PL00011669

Journal Article

Eamonn Keogh, +3 more

- 01 Aug 2001 -

Knowledge and Information Systems

- Vol. 3, Iss: 3, pp 263-286

TLDR

This work introduces a new dimensionality reduction technique called Piecewise Aggregate Approximation (PAA), compares it theoretically and empirically to the other techniques, and demonstrates its superiority.

Abstract:

The problem of similarity search in large time series databases has attracted much attention recently. It is a non-trivial problem because of the inherent high dimensionality of the data. The most promising solutions involve first performing dimensionality reduction on the data, and then indexing the reduced data with a spatial access method. Three major dimensionality reduction techniques have been proposed: Singular Value Decomposition (SVD), the Discrete Fourier transform (DFT), and more recently the Discrete Wavelet Transform (DWT). In this work we introduce a new dimensionality reduction technique which we call Piecewise Aggregate Approximation (PAA). We theoretically and empirically compare it to the other techniques and demonstrate its superiority. In addition to being competitive with or faster than the other methods, our approach has numerous other advantages. It is simple to understand and to implement, it allows more flexible distance measures, including weighted Euclidean queries, and the index can be built in linear time.
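As a rough illustration of the idea in the abstract, the sketch below reduces a series of length n to N values by averaging equal-width frames, then compares two series in the reduced space. The function names (`paa`, `euclidean`, `paa_distance`), the sample data, and the restriction that n be a multiple of N are assumptions made for this example, not taken from the paper. The sqrt(n/N) scaling is the usual way to make the reduced-space distance a lower bound on the true Euclidean distance; the abstract itself does not spell it out.

```python
import math
from typing import List, Sequence


def paa(series: Sequence[float], n_segments: int) -> List[float]:
    """Reduce `series` to `n_segments` values by averaging equal-width frames.

    Assumption for this sketch: len(series) is a multiple of n_segments.
    """
    n = len(series)
    if n_segments <= 0 or n == 0 or n % n_segments != 0:
        raise ValueError("series length must be a positive multiple of n_segments")
    width = n // n_segments
    # One pass over the data: each frame mean costs O(width), so O(n) overall.
    return [sum(series[i:i + width]) / width for i in range(0, n, width)]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain Euclidean distance between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def paa_distance(pa: Sequence[float], pb: Sequence[float], original_length: int) -> float:
    """Distance in the reduced space, scaled by sqrt(n / N)."""
    scale = math.sqrt(original_length / len(pa))
    return scale * euclidean(pa, pb)


# Hypothetical sample data, length n = 8, reduced to N = 4 segments.
query = [1.0, 3.0, 2.0, 4.0, 6.0, 8.0, 7.0, 5.0]
candidate = [2.0, 2.0, 3.0, 5.0, 5.0, 7.0, 9.0, 5.0]

q_reduced = paa(query, 4)      # [2.0, 3.0, 7.0, 6.0]
c_reduced = paa(candidate, 4)  # [2.0, 4.0, 6.0, 7.0]

reduced_dist = paa_distance(q_reduced, c_reduced, len(query))
true_dist = euclidean(query, candidate)
print(reduced_dist <= true_dist)  # True for this data
```

In an index-based search, the reduced vectors would be what gets stored in the spatial access method, and the reduced-space distance would be used to discard candidates before the full series are compared.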

Citations

Book Chapter

A Survey of Clustering Data Mining Techniques

Pavel Berkhin

TL;DR: This survey concentrates on clustering algorithms from a data mining perspective as a data modeling technique that provides for concise summaries of the data.

Journal Article

Exact indexing of dynamic time warping

Eamonn Keogh, +1 more

- 01 Mar 2005 -

Knowledge and Information Systems

TL;DR: This work introduces a novel technique for the exact indexing of Dynamic time warping and proves its vast superiority over all competing approaches in the largest and most comprehensive set of time series indexing experiments ever undertaken.

Journal Article

Experiencing SAX: a novel symbolic representation of time series

Jessica Lin, +3 more

- 01 Oct 2007 -

Data Mining and Knowledge Discovery

TL;DR: The utility of the new symbolic representation of time series is demonstrated; it allows dimensionality/numerosity reduction and also allows distance measures to be defined on the symbolic representation that lower bound the corresponding distance measures defined on the original series.

Journal Article

Querying and mining of time series data: experimental comparison of representations and distance measures

Hui Ding, +4 more

TL;DR: An extensive set of time series experiments is conducted, re-implementing 8 different representation methods and 9 similarity measures and their variants, and testing their effectiveness on 38 time series data sets from a wide variety of application domains, to provide a unified validation of some of the existing achievements.

Journal Article

A review on time series data mining

Tak-chung Fu

- 01 Feb 2011 -

Engineering Applications of Artificial I...

TL;DR: The primary objective of this paper is to serve as a glossary that gives interested researchers an overall picture of current time series data mining development and helps them identify potential research directions for further investigation.

References

Proceedings Article

R-trees: a dynamic index structure for spatial searching

Antonin Guttman

TL;DR: A dynamic index structure called an R-tree is described, algorithms for searching and updating it are given, and it is concluded that it is useful for current database systems in spatial applications.

Book

Pattern recognition and neural networks

Brian D. Ripley, +1 more

TL;DR: Professor Ripley brings together two crucial ideas in pattern recognition, statistical methods and machine learning via neural networks, in this self-contained account.

Pattern Recognition and Neural Networks

Yann LeCun, +3 more

Book Chapter

Efficient Similarity Search In Sequence Databases

Rakesh Agrawal, +2 more

TL;DR: An indexing method for time sequences is proposed for processing similarity queries; it uses R*-trees to index the sequences and answer the queries efficiently, and experimental results show that the method is superior to search based on sequential scanning.

Proceedings Article

Fast subsequence matching in time-series databases

Christos Faloutsos, +2 more

TL;DR: An efficient indexing method to locate 1-dimensional subsequences within a collection of sequences, such that the subsequences match a given (query) pattern within a specified tolerance.

Related Papers (5)

Efficient Similarity Search In Sequence Databases

Fast subsequence matching in time-series databases

A symbolic representation of time series, with implications for streaming algorithms

Experiencing SAX: a novel symbolic representation of time series

Using dynamic time warping to find patterns in time series