
Showing papers by "Yannis Theodoridis" published in 2002


Journal Article
TL;DR: In this paper, the authors argue that dynamic R-tree construction is a typical clustering problem which can be addressed by incorporating existing clustering algorithms, and they adopt the well-known k-means algorithm as a working example.
Abstract: Spatial indexing is a well-researched field that has benefited computer science with many outstanding results. Our effort in this paper can be seen as revisiting some outstanding contributions to spatial indexing, questioning some paradigms, and designing an access method with globally improved performance characteristics. In particular, we argue that dynamic R-tree construction is a typical clustering problem which can be addressed by incorporating existing clustering algorithms. As a working example, we adopt the well-known k-means algorithm. Further, we study the effect of relaxing the two-way split procedure and propose a multi-way split, which is inherently supported by clustering techniques. We compare our clustering approach to two prominent examples of spatial access methods, the R- and the R*-tree.

57 citations


Book ChapterDOI
08 Sep 2002
TL;DR: It is argued that dynamic R-tree construction is a typical clustering problem which can be addressed by incorporating existing clustering algorithms, and the well-known k-means algorithm is adopted as a working example.
Abstract: Spatial indexing is a well-researched field that has benefited computer science with many outstanding results. Our effort in this paper can be seen as revisiting some outstanding contributions to spatial indexing, questioning some paradigms, and designing an access method with globally improved performance characteristics. In particular, we argue that dynamic R-tree construction is a typical clustering problem which can be addressed by incorporating existing clustering algorithms. As a working example, we adopt the well-known k-means algorithm. Further, we study the effect of relaxing the "two-way" split procedure and propose a "multi-way" split, which is inherently supported by clustering techniques. We compare our clustering approach to two prominent examples of spatial access methods, the R- and the R*-tree.

55 citations
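
The idea of treating a node split as a clustering problem can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say how k is chosen, how rectangles are compared, or how the clustering is initialised, so the code below assumes 2-D rectangles, clusters them by the Euclidean distance between their centers, and uses a deterministic farthest-first initialisation. The names `Rect`, `kmeans_split`, and `mbr` are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
Point = Tuple[float, float]


def center(r: Rect) -> Point:
    """Center point of a rectangle."""
    return ((r[0] + r[2]) / 2.0, (r[1] + r[3]) / 2.0)


def mbr(rects: List[Rect]) -> Rect:
    """Minimum bounding rectangle of a non-empty group of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans_split(entries: List[Rect], k: int, iters: int = 20) -> List[List[Rect]]:
    """Split the entries of an overflowing node into up to k groups.

    Assumption: entries are clustered by their center points; k = 2 gives a
    two-way split, k > 2 a multi-way split.
    """
    pts = [center(r) for r in entries]
    # Farthest-first initialisation: deterministic, spreads the seeds out.
    cents = [pts[0]]
    while len(cents) < k:
        cents.append(max(pts, key=lambda p: min(sq_dist(p, c) for c in cents)))
    assign: List[int] = []
    for _ in range(iters):
        # Assignment step: each entry goes to its nearest centroid.
        new_assign = [min(range(k), key=lambda j: sq_dist(p, cents[j])) for p in pts]
        if new_assign == assign:
            break  # converged
        assign = new_assign
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(pts, assign) if a == j]
            if members:
                cents[j] = (sum(m[0] for m in members) / len(members),
                            sum(m[1] for m in members) / len(members))
    groups = [[r for r, a in zip(entries, assign) if a == j] for j in range(k)]
    return [g for g in groups if g]  # drop empty clusters
```

Each returned group would become one new node, with `mbr(group)` as its bounding rectangle in the parent. A real index would also have to enforce minimum and maximum node occupancy, which this sketch ignores.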


Proceedings ArticleDOI
04 Nov 2002
TL;DR: A new density-biased sampling algorithm that exploits spatial indexes, and the local density information they preserve, to provide better sampling quality and fast access to elements of the dataset.
Abstract: In this paper we describe a new density-biased sampling algorithm. It exploits spatial indexes and the local density information they preserve, to provide improved quality of the sampling result and fast access to elements of the dataset. It attains improved sampling quality with respect to factors like skew, noise or dimensionality. Moreover, it has the advantage of efficiently handling dynamic updates, and it requires low execution times. The performance of the proposed method is examined experimentally. The comparative results illustrate its superiority over existing methods.

22 citations
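
To make the notion of density-biased sampling concrete, here is a minimal sketch under stated assumptions. It does not reproduce the paper's method: a uniform grid stands in for the spatial index (each cell playing the role of a leaf node that records a local point count), and the bias follows one common formulation in which the expected number of samples drawn from a cell with n points is proportional to n^e. With e = 1 this reduces to uniform sampling; with e = 0 every cell contributes the same expected number of samples. All function and parameter names are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]


def grid_cells(points: List[Point], cell_size: float) -> Dict[Cell, List[Point]]:
    """Bucket points into square grid cells (a stand-in for index leaf nodes)."""
    cells: Dict[Cell, List[Point]] = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))
    return dict(cells)


def inclusion_probabilities(counts: Dict[Hashable, int], target: float,
                            e: float) -> Dict[Hashable, float]:
    """Per-point inclusion probability for each cell.

    A cell with n points gets an expected quota of target * n**e / sum(n_j**e);
    dividing by n gives the per-point probability, capped at 1.
    """
    weights = {c: n ** e for c, n in counts.items()}
    total = sum(weights.values())
    return {c: min(1.0, target * weights[c] / total / counts[c]) for c in counts}


def density_biased_sample(points: List[Point], cell_size: float, target: float,
                          e: float, seed: int = 0) -> List[Point]:
    """Draw a sample whose expected size is at most `target`."""
    rng = random.Random(seed)
    cells = grid_cells(points, cell_size)
    probs = inclusion_probabilities({c: len(ps) for c, ps in cells.items()}, target, e)
    # Each point is kept independently with its cell's probability.
    return [p for c, ps in cells.items() for p in ps if rng.random() < probs[c]]
```

Choosing e below 1 oversamples sparse regions relative to uniform sampling, which is what helps small clusters survive in the sample. Because the probability is capped at 1, very sparse cells can fall short of their quota, so the realised sample may be smaller than `target`.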


01 Jan 2002
TL;DR: This report examines the different types of patterns that are extracted from a data set, in order to gather the necessary requirements for the definition of a pattern model, which constitutes the heart of the Pattern Base Management System that will be designed.
Abstract: Data-intensive applications produce complex information that poses requirements for novel Database Management Systems (DBMSs). Such information is characterized by its huge volume and by its diversity and complexity, since data processing methods such as pattern recognition, data mining and knowledge extraction result in knowledge artifacts like clusters, association rules, decision trees and others. These artifacts, which we call patterns, need to be stored and retrieved efficiently. In order to accomplish this we have to express them within a formalism and a language. In this report we review the concept of patterns and their applicability in several research domains related to the proposed work, and we define the knowledge domain related to the PANDA project. It is important to interrelate these domains in order to be able to define the problem in a comprehensive and complete way and to come up with requirements for a pattern management system. We examine the different types of patterns that are extracted from a data set, in order to gather the necessary requirements for the definition of a pattern model. This model constitutes the heart of the Pattern Base Management System that will be designed.

8 citations