
Showing papers by "Nikos Mamoulis" published in 2006


Proceedings Article
01 Sep 2006
TL;DR: This paper studies k-NN monitoring in road networks, where the distance between a query and a data object is determined by the length of the shortest path connecting them, and proposes two methods that can handle arbitrary object and query moving patterns, as well as fluctuations of edge weights.
Abstract: Recent research has focused on continuous monitoring of nearest neighbors (NN) in highly dynamic scenarios, where the queries and the data objects move frequently and arbitrarily. All existing methods, however, assume the Euclidean distance metric. In this paper we study k-NN monitoring in road networks, where the distance between a query and a data object is determined by the length of the shortest path connecting them. We propose two methods that can handle arbitrary object and query moving patterns, as well as fluctuations of edge weights. The first one maintains the query results by processing only updates that may invalidate the current NN sets. The second method follows the shared execution paradigm to reduce the processing time. In particular, it groups together the queries that fall in the path between two consecutive intersections in the network, and produces their results by monitoring the NN sets of these intersections. We experimentally verify the applicability of the proposed techniques to continuous monitoring of large data and query sets.

230 citations
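
To make the network distance in the entry above concrete, here is a minimal sketch of a one-off (snapshot) k-NN query evaluated by Dijkstra expansion from the query node. It is not either of the two monitoring methods the paper proposes. The adjacency-list layout, the function name `network_knn`, and the simplification that objects sit on nodes rather than along edges are all assumptions made for illustration.

```python
import heapq


def network_knn(graph, objects, source, k):
    """Return up to k (object, distance) pairs nearest to `source`,
    where distance is the shortest-path length in the network.

    graph:   dict mapping node -> list of (neighbor, edge_weight) pairs
    objects: dict mapping object id -> node where the object is located
             (assumption: objects lie on nodes, not along edges)
    """
    # Index objects by the node they occupy.
    at_node = {}
    for obj, node in objects.items():
        at_node.setdefault(node, []).append(obj)

    dist = {source: 0.0}
    heap = [(0.0, source)]
    settled = set()
    result = []
    # Dijkstra settles nodes in increasing distance order, so objects
    # are discovered nearest-first and expansion can stop at k results.
    while heap and len(result) < k:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        for obj in sorted(at_node.get(u, [])):
            if len(result) < k:
                result.append((obj, d))
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return result


# Hypothetical toy network with four intersections and three objects.
roads = {
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("a", 1.0), ("c", 2.0)],
    "c": [("a", 4.0), ("b", 2.0), ("d", 1.0)],
    "d": [("c", 1.0)],
}
cars = {"o1": "c", "o2": "d", "o3": "b"}
print(network_knn(roads, cars, "a", 2))
```

Re-running the query after every object move or edge-weight change is the naive baseline; the abstract describes methods that avoid this by processing only the updates that can invalidate a current result.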


Journal Article
TL;DR: This paper presents the first algorithms for efficient RNN search in generic metric spaces that require no detailed representations of objects, and can be applied as long as their mutual distances can be computed and the distance metric satisfies the triangle inequality.
Abstract: Given a set D of objects, a reverse nearest neighbor (RNN) query returns the objects o in D such that o is closer to a query object q than to any other object in D, according to a certain similarity metric. The existing RNN solutions are not sufficient because they either 1) rely on precomputed information that is expensive to maintain in the presence of updates or 2) are applicable only when the data consists of "Euclidean objects" and similarity is measured using the L2 norm. In this paper, we present the first algorithms for efficient RNN search in generic metric spaces. Our techniques require no detailed representations of objects, and can be applied as long as their mutual distances can be computed and the distance metric satisfies the triangle inequality. We confirm the effectiveness of the proposed methods with extensive experiments.

112 citations
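
The RNN definition in the entry above can be stated directly as a brute-force check that uses only a distance function, which is all a generic metric space provides. This is a quadratic-time sketch of the query semantics, not the paper's algorithm; the function names and the choice of Hamming distance as the example metric are assumptions made for illustration.

```python
def reverse_nearest_neighbors(data, q, dist):
    """Return every object o in `data` that is strictly closer to q
    than to any other object in `data`.

    dist: any metric on the objects; only pairwise distances are used,
          never coordinates or other object representations.
    """
    result = []
    for i, o in enumerate(data):
        d_oq = dist(o, q)
        # o qualifies if no other data object is at least as close as q.
        if all(dist(o, p) > d_oq for j, p in enumerate(data) if j != i):
            result.append(o)
    return result


def hamming(a, b):
    """Hamming distance between equal-length strings (a valid metric)."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical toy data: bit strings compared by Hamming distance.
codes = ["00000", "00011", "11111"]
print(reverse_nearest_neighbors(codes, "00111", hamming))
```

Ties are resolved against q here (an object equidistant from q and another data object is not reported), which follows the "closer than" wording of the abstract.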


Journal Article
TL;DR: A fundamental lemma is provided, which can be used to prune the search space while traversing the graph in search of RNNs, and two RNN methods are developed: an eager algorithm that attempts to prune network nodes as soon as they are visited and a lazy technique that prunes the search space when a data point is discovered.
Abstract: A reverse nearest neighbor (RNN) query returns the data objects that have a query point as their nearest neighbor (NN). Although such queries have been studied quite extensively in Euclidean spaces, there is no previous work in the context of large graphs. In this paper, we provide a fundamental lemma, which can be used to prune the search space while traversing the graph in search of RNNs. Based on it, we develop two RNN methods: an eager algorithm that attempts to prune network nodes as soon as they are visited and a lazy technique that prunes the search space when a data point is discovered. We study retrieval of an arbitrary number k of reverse nearest neighbors, investigate the benefits of materialization, cover several query types, and deal with cases where the queries and the data objects reside on nodes or edges of the graph. The proposed techniques are evaluated in various practical scenarios involving spatial maps, computer networks, and the DBLP coauthorship graph.

103 citations


Proceedings Article
18 Dec 2006
TL;DR: This work formally defines the problem of mining collocation episodes, proposes two scalable algorithms for its efficient solution, and empirically evaluates the performance of the proposed methods using synthetically generated data that emulate real-world object movements.
Abstract: Given a collection of trajectories of moving objects with different types (e.g., pumas, deer, vultures), we introduce the problem of discovering collocation episodes in them (e.g., if a puma is moving near a deer, then a vulture is also going to move close to the same deer with high probability within the next 3 minutes). Collocation episodes capture the inter-movement regularities among different types of objects. We formally define the problem of mining collocation episodes and propose two scalable algorithms for its efficient solution. We empirically evaluate the performance of the proposed methods using synthetically generated data that emulate real-world object movements.

61 citations


Proceedings Article
03 Apr 2006
TL;DR: A new algorithm is proposed, designed to minimize the number of object accesses, the computational cost, and the memory requirements of top-k search, which accesses fewer objects, while being orders of magnitude faster.
Abstract: A top-k query combines different rankings of the same set of objects and returns the k objects with the highest combined score according to an aggregate function. We bring to light some key observations, which impose two phases that any top-k algorithm, based on sorted accesses, should go through. Based on them, we propose a new algorithm, which is designed to minimize the number of object accesses, the computational cost, and the memory requirements of top-k search. Adaptations of our algorithm for search variants (exact scores, on-line and incremental search, top-k joins, other aggregate functions, etc.) are also provided. Extensive experiments with synthetic and real data show that, compared to previous techniques, our method accesses fewer objects, while being orders of magnitude faster.

39 citations
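
For readers unfamiliar with the top-k setting in the entry above, the sketch below shows the classic threshold-style evaluation that interleaves sorted accesses with random accesses. It is a well-known baseline for this problem, not the new algorithm the paper proposes. The function name `threshold_topk`, the use of summation as the aggregate function, and the assumption that every object appears in every ranking are choices made for illustration.

```python
import heapq


def threshold_topk(rankings, k):
    """Top-k objects by summed score over several rankings.

    rankings: list of dicts mapping object -> score; every object is
              assumed to appear in every ranking.
    Returns a list of (object, total_score), best first.
    """
    # One list per ranking, ordered by descending score (sorted access).
    sorted_lists = [
        sorted(r.items(), key=lambda kv: kv[1], reverse=True)
        for r in rankings
    ]
    seen = {}
    for depth in range(len(sorted_lists[0])):
        for sl in sorted_lists:
            obj = sl[depth][0]
            if obj not in seen:
                # Random access: fetch the object's score in every ranking.
                seen[obj] = sum(r[obj] for r in rankings)
        # No unseen object can score above the sum of the scores at the
        # current depth, so this value bounds everything not yet seen.
        threshold = sum(sl[depth][1] for sl in sorted_lists)
        top = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        if len(top) == k and top[-1][1] >= threshold:
            return top  # early termination
    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])


# Hypothetical scores for three objects under two rankings.
by_price = {"a": 0.9, "b": 0.8, "c": 0.1}
by_rating = {"a": 0.2, "b": 0.7, "c": 0.95}
print([obj for obj, _ in threshold_topk([by_price, by_rating], 1)])
```

In this toy run the top-1 answer is confirmed after two rounds of sorted access, without reading the last position of either list.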


Proceedings Article
03 Apr 2006
TL;DR: This paper studies an interesting generalization of the RNN query, where not all dimensions are considered, but only an ad hoc subset thereof, and develops appropriate algorithms for projected RNN queries, without relying on multidimensional indexes.
Abstract: Given an object q, modeled by a multidimensional point, a reverse nearest neighbors (RNN) query returns the set of objects in the database that have q as their nearest neighbor. In this paper, we study an interesting generalization of the RNN query, where not all dimensions are considered, but only an ad-hoc subset thereof. The rationale is that (i) the dimensionality might be too high for the result of a regular RNN query to be useful, (ii) missing values may implicitly define a meaningful subspace for RNN retrieval, and (iii) analysts may be interested in the query results only for a set of (ad-hoc) problem dimensions (i.e., object attributes). We consider a suitable storage scheme and develop appropriate algorithms for projected RNN queries, without relying on multidimensional indexes. Our methods are experimentally evaluated with real and synthetic data.

34 citations



Book Chapter
26 Mar 2006
TL;DR: Analytical and experimental results demonstrate that a branch-and-bound method is highly effective in practice, outperforming alternative approaches by a significant factor.
Abstract: Given a set of N multi-dimensional points, we study the computation of φ-quantiles according to a ranking function F, which is provided by the user at runtime. Specifically, F computes a score based on the coordinates of each point; our objective is to report the object whose score is the φN-th smallest in the dataset. φ-quantiles provide a succinct summary about the F-distribution of the underlying data, which is useful for online decision support, data mining, selectivity estimation, query optimization, etc. Assuming that the dataset is indexed by a spatial access method, we propose several algorithms for retrieving a quantile efficiently. Analytical and experimental results demonstrate that a branch-and-bound method is highly effective in practice, outperforming alternative approaches by a significant factor.

14 citations
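
The φ-quantile defined in the entry above can be computed naively by scoring and sorting every point, which is the baseline that index-based methods aim to beat. The sketch below shows only that definition, not the branch-and-bound method the abstract refers to; the function name `phi_quantile` and the decision to round φN up to the next integer rank are assumptions made for illustration.

```python
import math


def phi_quantile(points, F, phi):
    """Return the point whose F-score is the ceil(phi * N)-th smallest.

    points: non-empty list of coordinate tuples
    F:      user-supplied ranking function mapping a point to a score
    phi:    fraction in (0, 1]
    """
    ranked = sorted(points, key=F)  # O(N log N) full scan and sort
    rank = max(1, math.ceil(phi * len(points)))
    return ranked[rank - 1]


# Hypothetical data and ranking function F(x, y) = x + 3y.
pts = [(1, 5), (2, 2), (4, 1), (3, 3), (0, 0)]
score = lambda p: p[0] + 3 * p[1]
print(phi_quantile(pts, score, 0.5))
```

Because F is supplied at query time, the ordering cannot be precomputed, which is why the abstract assumes a spatial index rather than a sorted list.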


Proceedings Article
05 Jun 2006
TL;DR: The geometric properties of the network itself can be exploited for the detection of movement direction, from a single instance of sensor reading only, and the estimation is performed in a distributed processing fashion.
Abstract: We examine the problem of detecting the direction of motion in a binary sensor network; in such a network, each sensor's value is supplied reliably in a single bit of information: whether the moving object is approaching or moving away from the sensor. We demonstrate that the geometric properties of the network itself can be exploited for the detection of movement direction from only a single instance of sensor readings. Moreover, the estimation is performed in a distributed fashion, with only minimal data collection at situation-dependent leading sensors, and imposes a low computational burden on each sensor. In addition, different detection instances drain the resources of different groups of sensors, each small compared to the size of the whole network. Our experiments demonstrate high accuracy that increases with sensor density and/or sensing range, while the responsiveness of the detection model is practically instantaneous.

9 citations


Proceedings Article
25 Apr 2006
TL;DR: This work presents novel algorithms that estimate the data distribution before deciding the physical operator independently for each partition, and suggests that the proposed methods outperform the competitors in terms of efficiency and applicability.
Abstract: PDAs, cellular phones and other mobile devices are now capable of supporting complex data manipulation operations. Here, we focus on ad-hoc spatial joins of datasets residing in multiple non-cooperative servers. Assuming that there is no mediator available, the spatial joins must be evaluated on the mobile device. Contrary to common applications that consider the cost at the server side, our main concern is minimizing the transferred data while meeting the resource constraints of the device. We show that existing methods, based on partitioning and pruning, are inadequate in many realistic situations. Then, we present novel algorithms that estimate the data distribution before deciding the physical operator independently for each partition. Our experiments with a prototype implementation on a WiFi-enabled PDA suggest that the proposed methods outperform the competitors in terms of efficiency and applicability.

5 citations