Author

Matthias Renz

Other affiliations: George Mason University, Ludwig Maximilian University of Munich

Bio: Matthias Renz is an academic researcher at the University of Kiel. The author has contributed to research on nearest neighbor search and probabilistic logic. The author has an h-index of 26 and has co-authored 144 publications that have received 3,094 citations. Previous affiliations of Matthias Renz include George Mason University and Ludwig Maximilian University of Munich.

Papers published on a yearly basis (chart; years shown: 2001, 2003–2023)

Papers

Book•

Advances in Spatial and Temporal Databases

Michael Gertz, Matthias Renz, Xiaofang Zhou, Erik Hoel, Wei-Shinn Ku, Agnes Voisard, Chengyang Zhang, Haiquan Chen, Liang Tang, Yan Huang, Chang-Tien Lu, Siva Ravada

01 Jan 2008

TL;DR: The RICC (Reachability Index Construction by Contraction) approach for processing spatiotemporal reachability queries without the instant exchange assumption is proposed and tested on two types of realistic datasets.

Abstract: Spatiotemporal reachability queries arise naturally when determining how diseases, information, and physical items can propagate through a collection of moving objects; such queries are significant for many important domains like epidemiology, public health, security monitoring, surveillance, and social networks. While traditional reachability queries have been studied in graphs extensively, what makes spatiotemporal reachability queries different and challenging is that the associated graph is dynamic and space-time dependent. As the spatiotemporal dataset becomes very large over time, a solution needs to be I/O-efficient. Previous work assumes an ‘instant exchange’ scenario (where information can be instantly transferred and retransmitted between objects), which may not be the case in many real-world applications. In this paper we propose the RICC (Reachability Index Construction by Contraction) approach for processing spatiotemporal reachability queries without the instant exchange assumption. We tested our algorithm on two types of realistic datasets using queries of various temporal lengths and different types (with single and multiple sources and targets). The results of our experiments show that RICC can be efficiently used for answering a wide range of spatiotemporal reachability queries on disk-resident datasets.

438 citations

Proceedings Article•DOI•

Probabilistic frequent itemset mining in uncertain databases

Thomas Bernecker¹, Hans-Peter Kriegel¹, Matthias Renz¹, Florian Verhein¹, Andreas Zuefle¹•Institutions (1)

Ludwig Maximilian University of Munich¹

28 Jun 2009

TL;DR: This paper introduces new probabilistic formulations of frequent itemsets based on possible world semantics, and presents a framework which is able to solve the Probabilistic Frequent Itemset Mining (PFIM) problem efficiently.

Abstract: Probabilistic frequent itemset mining in uncertain transaction databases semantically and computationally differs from traditional techniques applied to standard "certain" transaction databases. The consideration of existential uncertainty of item(sets), indicating the probability that an item(set) occurs in a transaction, makes traditional techniques inapplicable. In this paper, we introduce new probabilistic formulations of frequent itemsets based on possible world semantics. In this probabilistic context, an itemset X is called frequent if the probability that X occurs in at least minSup transactions is above a given threshold τ. To the best of our knowledge, this is the first approach addressing this problem under possible worlds semantics. In consideration of the probabilistic formulations, we present a framework which is able to solve the Probabilistic Frequent Itemset Mining (PFIM) problem efficiently. An extensive experimental evaluation investigates the impact of our proposed techniques and shows that our approach is orders of magnitude faster than straight-forward approaches.
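
The frequentness test described in this abstract (the probability that X occurs in at least minSup transactions exceeds τ) can be illustrated with a small dynamic program. This is a minimal sketch, not the paper's framework: it assumes that the occurrence probabilities of X in different transactions are independent, and the function names `frequentness_probability` and `is_probabilistic_frequent` are hypothetical.

```python
from typing import Sequence


def frequentness_probability(probs: Sequence[float], min_sup: int) -> float:
    """Return P(support of X >= min_sup).

    probs[i] is the probability that itemset X occurs in transaction i.
    Assumption (not taken from the source): transactions are independent.
    """
    if min_sup <= 0:
        return 1.0
    if min_sup > len(probs):
        return 0.0
    # dist[k] for k < min_sup: probability that exactly k of the transactions
    # processed so far contain X. dist[min_sup] collects "min_sup or more".
    dist = [1.0] + [0.0] * min_sup
    for p in probs:
        new = [0.0] * (min_sup + 1)
        for k, mass in enumerate(dist):
            if mass == 0.0:
                continue
            if k == min_sup:
                new[k] += mass  # threshold already reached, stays reached
            else:
                new[k] += mass * (1.0 - p)  # X absent from this transaction
                new[k + 1] += mass * p      # X present in this transaction
        dist = new
    return dist[min_sup]


def is_probabilistic_frequent(probs: Sequence[float], min_sup: int, tau: float) -> bool:
    """X is frequent if its frequentness probability is above tau."""
    return frequentness_probability(probs, min_sup) > tau


if __name__ == "__main__":
    # Hypothetical occurrence probabilities of one itemset in four transactions.
    example = [0.9, 0.8, 0.5, 0.1]
    print(frequentness_probability(example, 2))
    print(is_probabilistic_frequent(example, 2, 0.7))
```

Capping the state at minSup keeps the work at O(n · minSup) for n transactions, instead of enumerating all 2^n possible worlds.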

276 citations

Proceedings Article•DOI•

A generic framework for efficient subspace clustering of high-dimensional data

Hans-Peter Kriegel¹, Peer Kröger¹, Matthias Renz¹, S. Wurst¹•Institutions (1)

Ludwig Maximilian University of Munich¹

27 Nov 2005

TL;DR: A generic framework to overcome limitations in subspace clustering methods, based on an efficient filter-refinement architecture that scales at most quadratically w.r.t. the data dimensionality and the dimensionality of the subspace clusters.

Abstract: Subspace clustering has been investigated extensively since traditional clustering algorithms often fail to detect meaningful clusters in high-dimensional data spaces. Many recently proposed subspace clustering methods suffer from two severe problems: First, the algorithms typically scale exponentially with the data dimensionality and/or the subspace dimensionality of the clusters. Second, for performance reasons, many algorithms use a global density threshold for clustering, which is quite questionable since clusters in subspaces of significantly different dimensionality will most likely exhibit significantly varying densities. In this paper, we propose a generic framework to overcome these limitations. Our framework is based on an efficient filter-refinement architecture that scales at most quadratically w.r.t. the data dimensionality and the dimensionality of the subspace clusters. It can be applied to any clustering notions including notions that are based on a local density threshold. A broad experimental evaluation on synthetic and real-world data empirically shows that our method achieves a significant gain of runtime and quality in comparison to state-of-the-art subspace clustering algorithms.

175 citations

Book Chapter•DOI•

Probabilistic nearest-neighbor query on uncertain objects

Hans-Peter Kriegel¹, Peter Kunath¹, Matthias Renz¹•Institutions (1)

Ludwig Maximilian University of Munich¹

09 Apr 2007

TL;DR: This paper introduces an efficient strategy for processing probabilistic nearest-neighbor queries, as the computation of these probability values is very expensive.

Abstract: Nearest-neighbor queries are an important query type for commonly used feature databases. In many different application areas, e.g. sensor databases, location based services or face recognition systems, distances between objects have to be computed based on vague and uncertain data. A successful approach is to express the distance between two uncertain objects by probability density functions which assign a probability value to each possible distance value. By integrating the complete probabilistic distance function as a whole directly into the query algorithm, the full information provided by these functions is exploited. The result of such a probabilistic query algorithm consists of tuples containing the result object and a probability value indicating the likelihood that the object satisfies the query predicate. In this paper we introduce an efficient strategy for processing probabilistic nearest-neighbor queries, as the computation of these probability values is very expensive. In a detailed experimental evaluation, we demonstrate the benefits of our probabilistic query approach. The experiments show that we can achieve high quality query results with rather low computational cost.

166 citations

Proceedings Article•DOI•

Route skyline queries: A multi-preference path planning approach

Hans-Peter Kriegel¹, Matthias Renz¹, Matthias Schubert¹•Institutions (1)

Ludwig Maximilian University of Munich¹

01 Mar 2010

TL;DR: This work employs graph embedding techniques to enable a best-first based graph exploration considering route preferences based on arbitrary road attributes and shows that this approach is able to reduce the search space significantly and that the skyline can be computed in efficient time in the experimental evaluation.

Abstract: In recent years, the research community introduced various methods for processing skyline queries in multidimensional databases. The skyline operator retrieves all objects being optimal w.r.t. an arbitrary linear weighting of the underlying criteria. The most prominent example query is to find a reasonable set of hotels which are cheap but close to the beach. In this paper, we propose a new approach for computing skylines on routes (paths) in a road network considering multiple preferences like distance, driving time, the number of traffic lights, gas consumption, etc. Since the consideration of different preferences usually involves different routes, a skyline-fashioned answer with relevant route candidates is highly useful. In our work, we employ graph embedding techniques to enable a best-first based graph exploration considering route preferences based on arbitrary road attributes. The core of our skyline query processor is a route iterator which iteratively computes the top routes according to (at least one) preference in an efficient way avoiding that route computations need to be issued from scratch in each iteration. Furthermore, we propose pruning techniques in order to reduce the search space. Our pruning strategies aim at pruning as many route candidates as possible during the graph exploration. Therefore, we are able to prune candidates which are only partially explored. Finally, we show that our approach is able to reduce the search space significantly and that the skyline can be computed in efficient time in our experimental evaluation.
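
To make the notion of a route skyline concrete, the sketch below filters a handful of already-computed candidate routes by Pareto dominance, which is the usual way a skyline is operationalized. It does not reproduce the paper's route iterator, graph embedding, or pruning strategies; the route names and cost vectors are invented for illustration, and lower values are assumed to be better for every criterion.

```python
from typing import Dict, List, Tuple

# One cost per preference, e.g. (distance_km, driving_time_min, traffic_lights).
Cost = Tuple[float, ...]


def dominates(a: Cost, b: Cost) -> bool:
    """a dominates b if a is no worse in every criterion and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(routes: Dict[str, Cost]) -> List[str]:
    """Return the names of all routes that no other route dominates."""
    result = []
    for name, cost in routes.items():
        dominated = any(
            dominates(other_cost, cost)
            for other_name, other_cost in routes.items()
            if other_name != name
        )
        if not dominated:
            result.append(name)
    return result


if __name__ == "__main__":
    # Hypothetical candidate routes between the same start and destination.
    candidates = {
        "A": (10.0, 15.0, 4.0),
        "B": (12.0, 12.0, 6.0),
        "C": (13.0, 16.0, 6.0),  # no better than A anywhere, worse in all three
        "D": (9.0, 20.0, 2.0),
    }
    print(skyline(candidates))
```

This quadratic filter only works once all candidates are known; the abstract's point is that enumerating every route first is infeasible on a road network, which is why pruning partially explored candidates matters.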

145 citations

Cited by

Data Mining - Concepts and Techniques.

Petra Perner

01 Jan 2002

9,314 citations

Journal Article•

When is nearest neighbor meaningful

Kevin S. Beyer, Jonathan Goldstein, Raghu Ramakrishnan, Uri Shaft

01 Jan 1999-Lecture Notes in Computer Science

TL;DR: In this article, the authors explore the effect of dimensionality on the nearest neighbor problem and show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point.

Abstract: We explore the effect of dimensionality on the nearest neighbor problem. We show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point. To provide a practical perspective, we present empirical results on both real and synthetic data sets that demonstrate that this effect can occur for as few as 10-15 dimensions. These results should not be interpreted to mean that high-dimensional indexing is never meaningful; we illustrate this point by identifying some high-dimensional workloads for which this effect does not occur. However, our results do emphasize that the methodology used almost universally in the database literature to evaluate high-dimensional indexing techniques is flawed, and should be modified. In particular, most such techniques proposed in the literature are not evaluated versus simple linear scan, and are evaluated over workloads for which nearest neighbor is not meaningful. Often, even the reported experiments, when analyzed carefully, show that linear scan would outperform the techniques being proposed on the workloads studied in high (10-15) dimensionality!
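
The concentration effect described in this abstract can be observed with a small experiment. The sketch below is only an illustration under an assumed workload (points and query drawn independently and uniformly from the unit hypercube, Euclidean distance); it is not the authors' experimental setup, and `contrast_ratio` is a hypothetical helper name.

```python
import math
import random


def contrast_ratio(dim: int, n_points: int = 500, seed: int = 0) -> float:
    """Ratio of farthest to nearest distance from one random query point.

    Assumption: i.i.d. uniform coordinates in [0, 1), the simplest case
    covered by the result. A ratio close to 1 means that the nearest and
    farthest neighbors are almost equally far away.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))
    return max(distances) / min(distances)


if __name__ == "__main__":
    for dim in (2, 10, 100, 1000):
        print(dim, round(contrast_ratio(dim), 2))
```

On this workload the printed ratio shrinks toward 1 as the dimensionality grows, which is the situation in which a nearest-neighbor query stops discriminating between data points.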

1,992 citations

Journal Article•DOI•

Querying and mining of time series data: experimental comparison of representations and distance measures

Hui Ding¹, Goce Trajcevski¹, Peter Scheuermann¹, Xiaoyue Wang², Eamonn Keogh²•Institutions (2)

Northwestern University¹, University of California, Riverside²

01 Aug 2008

TL;DR: An extensive set of time series experiments are conducted re-implementing 8 different representation methods and 9 similarity measures and their variants and testing their effectiveness on 38 time series data sets from a wide variety of application domains to provide a unified validation of some of the existing achievements.

Abstract: The last decade has witnessed a tremendous growth of interest in applications that deal with querying and mining of time series data. Numerous representation methods for dimensionality reduction and similarity measures geared towards time series have been introduced. Each individual work introducing a particular method has made specific claims and, aside from the occasional theoretical justifications, provided quantitative experimental observations. However, for the most part, the comparative aspects of these experiments were too narrowly focused on demonstrating the benefits of the proposed methods over some of the previously introduced ones. In order to provide a comprehensive validation, we conducted an extensive set of time series experiments re-implementing 8 different representation methods and 9 similarity measures and their variants, and testing their effectiveness on 38 time series data sets from a wide variety of application domains. In this paper, we give an overview of these different techniques and present our comparative experimental findings regarding their effectiveness. Our experiments have provided both a unified validation of some of the existing achievements, and in some cases, suggested that certain claims in the literature may be unduly optimistic.

1,387 citations

Journal Article•DOI•

A review on time series data mining

Tak-chung Fu¹•Institutions (1)

Hong Kong Polytechnic University¹

01 Feb 2011-Engineering Applications of Artificial Intelligence

TL;DR: The primary objective of this paper is to serve as a glossary that gives interested researchers an overall picture of current time series data mining development and helps them identify potential research directions for further investigation.

1,358 citations

Journal Article•DOI•

Trajectory Data Mining: An Overview

Yu Zheng¹•Institutions (1)

Microsoft¹

12 May 2015-ACM Transactions on Intelligent Systems and Technology

TL;DR: A systematic survey on the major research into trajectory data mining, providing a panorama of the field as well as the scope of its research topics, and introduces the methods that transform trajectories into other data formats, such as graphs, matrices, and tensors.

Abstract: The advances in location-acquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles, and animals. Many techniques have been proposed for processing, managing, and mining trajectory data in the past decade, fostering a broad range of applications. In this article, we conduct a systematic survey on the major research into trajectory data mining, providing a panorama of the field as well as the scope of its research topics. Following a road map from the derivation of trajectory data, to trajectory data preprocessing, to trajectory data management, and to a variety of mining tasks (such as trajectory pattern mining, outlier detection, and trajectory classification), the survey explores the connections, correlations, and differences among these existing techniques. This survey also introduces the methods that transform trajectories into other data formats, such as graphs, matrices, and tensors, to which more data mining and machine learning techniques can be applied. Finally, some public trajectory datasets are presented. This survey can help shape the field of trajectory data mining, providing a quick understanding of this field to the community.

1,289 citations
