Proceedings ArticleDOI

Segment-based approach for subsequence searches in sequence databases

TLDR
Time warping, as described in this paper, is a transformation that allows any sequence element to replicate itself as many times as needed without extra cost; the time warping distance is defined as the smallest distance between two sequences transformed by time warping.
Abstract
The sequence database is a set of data sequences, each of which is an ordered list of elements [1]. Sequences of stock prices, money exchange rates, temperature data, product sales data, and company growth rates are typical examples of sequence databases [2, 8]. Similarity search is an operation that finds sequences or subsequences whose changing patterns are similar to that of a given query sequence [1, 2, 8]. Similarity search is of growing importance in many new applications such as data mining and data warehousing [6, 17]. There have been many research efforts [1, 7, 8, 10, 17] for efficient similarity searches in sequence databases using the Euclidean distance as a similarity measure. However, recent techniques [13–15, 18] tend to favor the time warping distance for its higher accuracy and wider applicability at the expense of high computation cost. Time warping is a transformation that allows any sequence element to replicate itself as many times as needed without extra costs [18]. For example, two sequences X = 〈20, 21, 21, 20, 20, 23, 23, 23〉 and Q = 〈20, 20, 21, 20, 23〉 can be identically transformed into 〈20, 20, 21, 21, 20, 20, 23, 23, 23〉 by time warping. The time warping distance is defined as the smallest distance between two sequences transformed by time warping. While the Euclidean distance can be used only when the two sequences compared are of the same length, the time warping distance can be applied to any two sequences of arbitrary lengths. Therefore, the time warping distance fits well with databases where sequences are of different lengths. The time warping distance can be applied to both whole sequence and subsequence searches. Let us first consider the
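The definition of the time warping distance given above can be illustrated with a short dynamic-programming sketch. This is not the paper's segment-based search method; it is the textbook quadratic-time recurrence, and it assumes the absolute difference as the per-element base distance (the abstract does not state which base distance is used). The function name `time_warping_distance` is hypothetical.

```python
def time_warping_distance(s, q):
    """Smallest cumulative distance between s and q under time warping.

    Assumption: the base distance between two elements is their absolute
    difference. Any element may be replicated at no extra cost, which the
    recurrence models by letting either sequence stay on the same element
    while the other one advances.
    """
    if not s or not q:
        raise ValueError("both sequences must be non-empty")

    n, m = len(s), len(q)
    inf = float("inf")

    # cost[i][j] holds the warping distance between s[:i] and q[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            base = abs(s[i - 1] - q[j - 1])
            cost[i][j] = base + min(
                cost[i - 1][j],      # q[j-1] is replicated to match s[i-1]
                cost[i][j - 1],      # s[i-1] is replicated to match q[j-1]
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# The two sequences from the abstract warp to the same sequence,
# so their time warping distance is zero.
X = [20, 21, 21, 20, 20, 23, 23, 23]
Q = [20, 20, 21, 20, 23]
print(time_warping_distance(X, Q))  # 0.0
```

The table has one cell per pair of elements, so the cost grows with the product of the two sequence lengths, which is consistent with the high computation cost the abstract attributes to this measure.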


Citations
Journal ArticleDOI

Exact indexing of dynamic time warping

TL;DR: This work introduces a novel technique for the exact indexing of dynamic time warping and proves its vast superiority over all competing approaches in the largest and most comprehensive set of time series indexing experiments ever undertaken.
Journal ArticleDOI

A review on time series data mining

TL;DR: The primary objective of this paper is to serve as a glossary that gives interested researchers an overall picture of current time series data mining developments and helps them identify potential research directions for further investigation.
Journal ArticleDOI

On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration

TL;DR: The most exhaustive set of time series experiments ever attempted, re-implementing the contributions of more than two dozen papers and testing them on 50 real-world, highly diverse datasets, supports the claim that there is a need for a set of time series benchmarks and more careful empirical evaluation in the data mining community.
Proceedings ArticleDOI

An online algorithm for segmenting time series

TL;DR: This paper undertakes the first extensive review and empirical comparison of all proposed techniques for mining time-series data with fatal flaws and introduces a novel algorithm that is empirically shown to be superior to all others in the literature.
Book ChapterDOI

Chapter 36 – Exact Indexing of Dynamic Time Warping

TL;DR: Dynamic time warping (DTW) is a much more robust distance measure for time series, allowing similar shapes to match even if they are out of phase in the time axis, but does not obey the triangular inequality and, thus, has resisted attempts at exact indexing.
References
Book

Fundamentals of speech recognition

TL;DR: This book presents a meta-modelling framework for speech recognition that automates the very labor-intensive, time-consuming, and therefore expensive process of manually modeling speech.
Proceedings ArticleDOI

R-trees: a dynamic index structure for spatial searching

TL;DR: A dynamic index structure called an R-tree is described which meets this need, and algorithms for searching and updating it are given and it is concluded that it is useful for current database systems in spatial applications.
Book

The Analysis of Time Series: An Introduction

TL;DR: This book covers simple descriptive techniques for time series, estimation in the time domain, forecasting, stationary processes in the frequency domain, spectral analysis, bivariate processes, linear systems, state-space models and the Kalman filter, non-linear models, multivariate time series modelling, and some other topics.
Journal ArticleDOI

Data mining: an overview from a database perspective

TL;DR: In this paper, a survey of the available data mining techniques is provided and a comparative study of such techniques is presented, based on a database researcher's point-of-view.
Book ChapterDOI

Efficient Similarity Search In Sequence Databases

TL;DR: An indexing method for time sequences is proposed for processing similarity queries; it uses R*-trees to index the sequences and answer the queries efficiently, and experimental results show that the method is superior to search based on sequential scanning.