Author

Abdullah Mueen

Bio: Abdullah Mueen is an academic researcher at the University of New Mexico. His research focuses on cluster analysis and dynamic time warping. He has an h-index of 31 and has co-authored 86 publications receiving 5,042 citations. His previous affiliations include the University of California, Riverside and the Bangladesh University of Engineering and Technology.


Papers
Proceedings ArticleDOI
12 Aug 2012
TL;DR: This work shows that a combination of four novel ideas makes it possible to search and mine truly massive time series for the first time, and that in large datasets exact search under DTW can be much faster than the current state-of-the-art Euclidean distance search algorithms.
Abstract: Most time series data mining algorithms use similarity search as a core subroutine, and thus the time taken for similarity search is the bottleneck for virtually all time series data mining algorithms. The difficulty of scaling search to large datasets largely explains why most academic work on time series data mining has plateaued at considering a few millions of time series objects, while much of industry and science sits on billions of time series objects waiting to be explored. In this work we show that by using a combination of four novel ideas we can search and mine truly massive time series for the first time. We demonstrate the following extremely unintuitive fact: in large datasets we can exactly search under DTW much more quickly than the current state-of-the-art Euclidean distance search algorithms. We demonstrate our work on the largest set of time series experiments ever attempted. In particular, the largest dataset we consider is larger than the combined size of all of the time series datasets considered in all data mining papers ever published. We show that our ideas allow us to solve higher-level time series data mining problems such as motif discovery and clustering at scales that would otherwise be untenable. In addition to mining massive datasets, we will show that our ideas also have implications for real-time monitoring of data streams, allowing us to handle much faster arrival rates and/or use cheaper and lower-powered devices than are currently possible.
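The speedups in this line of work come from pruning: cheap lower bounds filter out most candidates so the expensive DTW computation runs rarely. Below is a minimal sketch of that style of search, using the LB_Keogh lower bound to prune a banded (Sakoe-Chiba) DTW over z-normalized, equal-length subsequences. This is an illustration of the general idea, not the authors' UCR Suite implementation; the function names and the window parameter r are ours.

```python
import numpy as np

def znorm(x):
    """Z-normalize, the standard preprocessing for subsequence comparison."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / (x.std() + 1e-8)

def lb_keogh(q, c, r):
    """LB_Keogh: a cheap lower bound on banded DTW(q, c) with window r."""
    total = 0.0
    for i in range(len(q)):
        lo, hi = max(0, i - r), min(len(c), i + r + 1)
        upper, lower = c[lo:hi].max(), c[lo:hi].min()
        if q[i] > upper:
            total += (q[i] - upper) ** 2
        elif q[i] < lower:
            total += (q[i] - lower) ** 2
    return total

def dtw(a, b, r):
    """Squared DTW distance restricted to a Sakoe-Chiba band of half-width r."""
    n = len(a)
    D = np.full((n + 1, n + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - r), min(n, i + r) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, n]

def nearest_subsequence(series, query, r=5):
    """1-NN search: prune candidates with LB_Keogh, confirm with banded DTW."""
    series = np.asarray(series, dtype=float)
    query = znorm(query)
    m, best, best_loc = len(query), np.inf, -1
    for start in range(len(series) - m + 1):
        cand = znorm(series[start:start + m])
        if lb_keogh(query, cand, r) >= best:
            continue  # lower bound already exceeds best-so-far: skip full DTW
        d = dtw(query, cand, r)
        if d < best:
            best, best_loc = d, start
    return best_loc, best
```

Because LB_Keogh never exceeds the banded DTW distance, the pruning is admissible: the search returns exactly the same answer as brute force, just with far fewer DTW calls.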

969 citations

Journal ArticleDOI
TL;DR: An extensive experimental study re-implements eight different time series representations and nine similarity measures (and their variants), tests their effectiveness on 38 time series data sets from a wide variety of application domains, and presents comparative experimental findings on their effectiveness.
Abstract: The previous decade has brought a remarkable increase in interest in applications that deal with querying and mining of time series data. Many of the research efforts in this context have focused on introducing new representation methods for dimensionality reduction or novel similarity measures for the underlying data. In the vast majority of cases, each individual work introducing a particular method has made specific claims and, aside from the occasional theoretical justifications, provided quantitative experimental observations. However, for the most part, the comparative aspects of these experiments were too narrowly focused on demonstrating the benefits of the proposed methods over some of the previously introduced ones. In order to provide a comprehensive validation, we conducted an extensive experimental study re-implementing eight different time series representations and nine similarity measures and their variants, and testing their effectiveness on 38 time series data sets from a wide variety of application domains. In this article, we give an overview of these different techniques and present our comparative experimental findings regarding their effectiveness. In addition to providing a unified validation of some of the existing achievements, our experiments also indicate that, in some cases, certain claims in the literature may be unduly optimistic.
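Comparisons of this kind typically score each similarity measure by the accuracy of a 1-nearest-neighbor classifier that uses it, since 1-NN accuracy isolates the measure from any classifier tuning. A minimal sketch of that protocol follows; the dataset arrays and the euclidean baseline are placeholders, not the paper's exact experimental harness.

```python
import numpy as np

def euclidean(a, b):
    """Squared Euclidean distance, one baseline measure in such comparisons."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sum((a - b) ** 2))

def one_nn_accuracy(train_X, train_y, test_X, test_y, dist):
    """Score a similarity measure by 1-NN classification accuracy."""
    correct = 0
    for x, y in zip(test_X, test_y):
        nearest = int(np.argmin([dist(x, t) for t in train_X]))
        correct += int(train_y[nearest] == y)
    return correct / len(test_y)

# Example: acc = one_nn_accuracy(train_X, train_y, test_X, test_y, euclidean)
```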

747 citations

Proceedings ArticleDOI
01 Dec 2016
TL;DR: A novel scalable algorithm for time series subsequence all-pairs-similarity-search that computes the answers to the time series motif and time series discord problems as a side-effect, incidentally providing the fastest known algorithm for both of these extensively studied problems.
Abstract: The all-pairs-similarity-search (or similarity join) problem has been extensively studied for text and a handful of other datatypes. However, surprisingly little progress has been made on similarity joins for time series subsequences. The lack of progress probably stems from the daunting nature of the problem. For even modest-sized datasets the obvious nested-loop algorithm can take months, and the typical speed-up techniques in this domain (i.e., indexing, lower-bounding, triangular-inequality pruning and early abandoning) at best produce one or two orders of magnitude speedup. In this work we introduce a novel scalable algorithm for time series subsequence all-pairs-similarity-search. For exceptionally large datasets, the algorithm can be trivially cast as an anytime algorithm and produce high-quality approximate solutions in reasonable time. The exact similarity join algorithm computes the answer to the time series motif and time series discord problem as a side-effect, and our algorithm incidentally provides the fastest known algorithm for both these extensively-studied problems. We demonstrate the utility of our ideas for two time series data mining problems: motif discovery and novelty discovery.
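The structure this similarity join produces is now widely known as the matrix profile: for every subsequence, the distance to its nearest non-overlapping neighbor. The sketch below is a deliberately naive version that only illustrates the semantics; the paper's actual algorithm is dramatically faster, and this is not its implementation.

```python
import numpy as np

def znorm(x):
    return (x - x.mean()) / (x.std() + 1e-8)

def matrix_profile(series, m):
    """For each length-m subsequence, the z-normalized Euclidean distance
    to its nearest neighbor, excluding trivially overlapping matches."""
    series = np.asarray(series, dtype=float)
    n = len(series) - m + 1
    subs = np.array([znorm(series[i:i + m]) for i in range(n)])
    profile = np.full(n, np.inf)
    for i in range(n):
        for j in range(n):
            if abs(i - j) < m:
                continue  # overlapping subsequences are trivial matches
            profile[i] = min(profile[i], np.linalg.norm(subs[i] - subs[j]))
    return profile

# The minimum of the profile locates the motif pair and the maximum locates
# the discord, exactly the "side-effect" answers the abstract mentions:
# mp = matrix_profile(ts, 64); motif_at, discord_at = mp.argmin(), mp.argmax()
```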

452 citations

Proceedings ArticleDOI
01 Jan 2009
TL;DR: For the first time, a tractable exact algorithm to find time series motifs is presented, and it is shown to be fast enough to be used as a subroutine in higher-level data mining algorithms for anytime classification, near-duplicate detection, and summarization.
Abstract: Time series motifs are pairs of individual time series, or subsequences of a longer time series, which are very similar to each other. As with their discrete analogues in computational biology, this similarity hints at structure which has been conserved for some reason and may therefore be of interest. Since the formalization of time series motifs in 2002, dozens of researchers have used them for diverse applications in many different domains. Because the obvious algorithm for computing motifs is quadratic in the number of items, more than a dozen approximate algorithms to discover motifs have been proposed in the literature. In this work, for the first time, we show a tractable exact algorithm to find time series motifs. As we shall show through extensive experiments, our algorithm is up to three orders of magnitude faster than brute-force search in large datasets. We further show that our algorithm is fast enough to be used as a subroutine in higher-level data mining algorithms for anytime classification, near-duplicate detection and summarization, and we consider detailed case studies in domains as diverse as electroencephalograph interpretation and entomological telemetry data mining.
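The core pruning idea behind exact motif discovery of this kind is a triangle-inequality lower bound computed from distances to a reference series: since |d(a, ref) - d(b, ref)| <= d(a, b), pairs whose reference distances differ by more than the best motif found so far can be skipped without computing their true distance. Below is a minimal sketch of that idea for motif pairs over a collection of equal-length series; the paper's algorithm (often called MK in later work) uses multiple reference points and further optimizations, so treat this as an illustration with names of our own choosing.

```python
import numpy as np

def motif_pair(series_list):
    """Find the closest pair among equal-length series, pruning with a
    triangle-inequality lower bound from distances to one reference series."""
    X = [np.asarray(s, dtype=float) for s in series_list]
    ref = X[0]
    d_ref = np.array([np.linalg.norm(s - ref) for s in X])
    order = np.argsort(d_ref)            # scan pairs in reference-distance order
    best, pair = np.inf, (None, None)
    for offset in range(1, len(order)):
        improvement_possible = False
        for k in range(len(order) - offset):
            i, j = order[k], order[k + offset]
            if d_ref[j] - d_ref[i] >= best:
                continue  # |d(i,ref) - d(j,ref)| <= d(i,j): prune this pair
            improvement_possible = True
            d = np.linalg.norm(X[i] - X[j])
            if d < best:
                best, pair = d, (int(i), int(j))
        if not improvement_possible:
            break  # gaps only grow with offset; no later pair can win
    return pair, best
```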

412 citations

Proceedings ArticleDOI
21 Aug 2011
TL;DR: This work introduces a novel algorithm that finds shapelets an order of magnitude faster than current methods, and shows for the first time an augmented shapelet representation that distinguishes the data based on conjunctions or disjunctions of shapelets.
Abstract: Time series shapelets are small, local patterns in a time series that are highly predictive of a class and are thus very useful features for building classifiers and for certain visualization and summarization tasks. While shapelets were introduced only recently, they have already seen significant adoption and extension in the community. Despite their immense potential as a data mining primitive, there are two important limitations of shapelets. First, their expressiveness is limited to simple binary presence/absence questions. Second, even though shapelets are computed offline, the time taken to compute them is significant. In this work, we address the latter problem by introducing a novel algorithm that finds shapelets in less time than current methods by an order of magnitude. Our algorithm is based on intelligent caching and reuse of computations, and the admissible pruning of the search space. Because our algorithm is so fast, it creates an opportunity to consider more expressive shapelet queries. In particular, we show for the first time an augmented shapelet representation that distinguishes the data based on conjunctions or disjunctions of shapelets. We call our novel representation Logical-Shapelets. We demonstrate the efficiency of our approach on the classic benchmark datasets used for these problems, and show several case studies where logical shapelets significantly outperform the original shapelet representation and other time series classification techniques. We demonstrate the utility of our ideas in domains as diverse as gesture recognition, robotics, and biometrics.
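The shapelet primitive the paper builds on is simple: the distance from a shapelet to a time series is the minimum distance over all subsequences of the series, and a class is predicted by thresholding that distance. Below is a minimal sketch of that primitive plus a conjunctive test in the spirit of the paper's logical shapelets; the thresholds and function names are illustrative, not the authors' implementation.

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Distance from a shapelet to a series: the best match over all
    subsequences of the series."""
    series = np.asarray(series, dtype=float)
    shapelet = np.asarray(shapelet, dtype=float)
    m = len(shapelet)
    return min(float(np.linalg.norm(series[i:i + m] - shapelet))
               for i in range(len(series) - m + 1))

def conjunction_rule(series, shapelets, thresholds):
    """A logical-shapelet style AND test: every shapelet must appear
    somewhere in the series within its distance threshold."""
    return all(shapelet_distance(series, s) <= t
               for s, t in zip(shapelets, thresholds))
```

A plain shapelet answers only a single presence/absence question; combining several such tests with AND/OR is what lets the augmented representation separate classes that no single pattern can.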

298 citations


Cited by
Reference EntryDOI
15 Oct 2004

2,118 citations

Book
16 Apr 2013
Abstract (table of contents): Why is Nonparametric Regression Important?; How to Construct Nonparametric Regression Estimates; Lower Bounds; Partitioning Estimates; Kernel Estimates; k-NN Estimates; Splitting the Sample; Cross Validation; Uniform Laws of Large Numbers; Least Squares Estimates I: Consistency; Least Squares Estimates II: Rate of Convergence; Least Squares Estimates III: Complexity Regularization; Consistency of Data-Dependent Partitioning Estimates; Univariate Least Squares Spline Estimates; Multivariate Least Squares Spline Estimates; Neural Networks Estimates; Radial Basis Function Networks; Orthogonal Series Estimates; Advanced Techniques from Empirical Process Theory; Penalized Least Squares Estimates I: Consistency; Penalized Least Squares Estimates II: Rate of Convergence; Dimension Reduction Techniques; Strong Consistency of Local Averaging Estimates; Semi-Recursive Estimates; Recursive Estimates; Censored Observations; Dependent Observations

1,931 citations

Journal ArticleDOI
TL;DR: This article presents the most exhaustive study of DNNs for TSC to date, training 8730 deep learning models on 97 time series datasets, and provides an open-source deep learning framework to the TSC community.
Abstract: Time Series Classification (TSC) is an important and challenging problem in data mining. With the increase of time series data availability, hundreds of TSC algorithms have been proposed. Among these methods, only a few have considered Deep Neural Networks (DNNs) to perform this task. This is surprising, as deep learning has seen very successful applications in recent years. DNNs have indeed revolutionized the field of computer vision, especially with the advent of novel deeper architectures such as Residual and Convolutional Neural Networks. Apart from images, sequential data such as text and audio can also be processed with DNNs to reach state-of-the-art performance for document classification and speech recognition. In this article, we study the current state-of-the-art performance of deep learning algorithms for TSC by presenting an empirical study of the most recent DNN architectures for TSC. We give an overview of the most successful deep learning applications in various time series domains under a unified taxonomy of DNNs for TSC. We also provide an open source deep learning framework to the TSC community where we implemented each of the compared approaches and evaluated them on a univariate TSC benchmark (the UCR/UEA archive) and 12 multivariate time series datasets. By training 8730 deep learning models on 97 time series datasets, we propose the most exhaustive study of DNNs for TSC to date.
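One of the strongest baselines in studies of this kind is a fully convolutional network: stacked 1D convolutions followed by global average pooling, so the model handles series of any length. Below is a minimal PyTorch sketch in that spirit; the layer sizes and the SimpleFCN name are our illustrative choices, not the paper's benchmarked implementations.

```python
import torch
import torch.nn as nn

class SimpleFCN(nn.Module):
    """Stacked 1D convolutions + global average pooling, in the spirit of
    the fully convolutional baselines such studies benchmark."""
    def __init__(self, n_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 128, kernel_size=7, padding=3),
            nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, 256, kernel_size=5, padding=2),
            nn.BatchNorm1d(256), nn.ReLU(),
            nn.Conv1d(256, 128, kernel_size=3, padding=1),
            nn.BatchNorm1d(128), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),       # global average pooling over time
        )
        self.classifier = nn.Linear(128, n_classes)

    def forward(self, x):                  # x: (batch, 1, series_length)
        return self.classifier(self.features(x).squeeze(-1))

# Usage: logits = SimpleFCN(n_classes=5)(torch.randn(8, 1, 128))
```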

1,833 citations

Book
02 Jan 1991

1,377 citations