Journal Article

Dimensionality reduction for fast similarity search in large time series databases

TLDR
This work introduces a new dimensionality reduction technique called Piecewise Aggregate Approximation (PAA), compares it theoretically and empirically to the other techniques, and demonstrates its superiority.
Abstract
The problem of similarity search in large time series databases has attracted much attention recently. It is a non-trivial problem because of the inherent high dimensionality of the data. The most promising solutions involve first performing dimensionality reduction on the data, and then indexing the reduced data with a spatial access method. Three major dimensionality reduction techniques have been proposed: Singular Value Decomposition (SVD), the Discrete Fourier Transform (DFT), and more recently the Discrete Wavelet Transform (DWT). In this work we introduce a new dimensionality reduction technique which we call Piecewise Aggregate Approximation (PAA). We theoretically and empirically compare it to the other techniques and demonstrate its superiority. In addition to being competitive with or faster than the other methods, our approach has numerous other advantages. It is simple to understand and to implement, it allows more flexible distance measures, including weighted Euclidean queries, and the index can be built in linear time.
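
The abstract does not spell out how PAA is computed. As the technique is commonly described, a series of length n is split into N equal-length frames and each frame is replaced by its mean; a distance computed on the reduced vectors and scaled by the frame length then lower-bounds the Euclidean distance on the original series, which is what makes the reduced data safe to index. The following is a minimal sketch under those assumptions. The function names and sample values are illustrative and are not taken from the paper, and the sketch assumes n is an exact multiple of N.

```python
import math


def paa(series, n_segments):
    """Reduce a series to n_segments values by averaging equal-length frames.

    Assumption of this sketch: len(series) is an exact multiple of n_segments.
    """
    n = len(series)
    if n_segments <= 0 or n % n_segments != 0:
        raise ValueError("len(series) must be a positive multiple of n_segments")
    frame = n // n_segments
    # One pass over the data, so the reduction runs in linear time.
    return [sum(series[i:i + frame]) / frame for i in range(0, n, frame)]


def paa_distance(paa_a, paa_b, original_length):
    """Distance between two PAA vectors, scaled by the frame length.

    The scaling keeps this value at or below the Euclidean distance
    between the two original series.
    """
    frame = original_length / len(paa_a)
    return math.sqrt(frame * sum((a - b) ** 2 for a, b in zip(paa_a, paa_b)))


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Illustrative data (hypothetical values, length 8 reduced to 4 dimensions).
query = [1, 2, 3, 4, 5, 6, 7, 8]
candidate = [2, 2, 2, 2, 6, 6, 9, 9]

reduced_query = paa(query, 4)
reduced_candidate = paa(candidate, 4)

lower_bound = paa_distance(reduced_query, reduced_candidate, len(query))
true_distance = euclidean(query, candidate)
```

In an index, a candidate can be discarded whenever `lower_bound` already exceeds the query radius, because the true distance cannot be smaller than that value.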


Citations
Book Chapter

A Survey of Clustering Data Mining Techniques

TL;DR: This survey concentrates on clustering algorithms from a data mining perspective as a data modeling technique that provides for concise summaries of the data.
Journal Article

Exact indexing of dynamic time warping

TL;DR: This work introduces a novel technique for the exact indexing of dynamic time warping and proves its vast superiority over all competing approaches in the largest and most comprehensive set of time series indexing experiments ever undertaken.
Journal Article

Experiencing SAX: a novel symbolic representation of time series

TL;DR: A new symbolic representation of time series is introduced and its utility demonstrated; it allows dimensionality/numerosity reduction, and it allows distance measures to be defined on the symbolic representation that lower-bound the corresponding distance measures defined on the original series.
Journal Article

Querying and mining of time series data: experimental comparison of representations and distance measures

TL;DR: An extensive set of time series experiments is conducted, re-implementing 8 different representation methods and 9 similarity measures and their variants, and testing their effectiveness on 38 time series data sets from a wide variety of application domains, to provide a unified validation of some of the existing achievements.
Journal Article

A review on time series data mining

TL;DR: The primary objective of this paper is to serve as a glossary that gives interested researchers an overall picture of current time series data mining developments and helps them identify potential research directions for further investigation.
References
Proceedings Article

R-trees: a dynamic index structure for spatial searching

TL;DR: A dynamic index structure called an R-tree is described, algorithms for searching and updating it are given, and it is concluded that the structure is useful for current database systems in spatial applications.
Book

Pattern recognition and neural networks

TL;DR: Professor Ripley brings together two crucial ideas in pattern recognition, statistical methods and machine learning via neural networks, in this self-contained account.

Book Chapter

Efficient Similarity Search In Sequence Databases

TL;DR: An indexing method for time sequences is proposed for processing similarity queries; it uses R*-trees to index the sequences and answer the queries efficiently, and experimental results show that the method is superior to search based on sequential scanning.
Proceedings Article

Fast subsequence matching in time-series databases

TL;DR: An efficient indexing method is presented for locating 1-dimensional subsequences within a collection of sequences, such that the subsequences match a given (query) pattern within a specified tolerance.