scispace - formally typeset
Search or ask a question
Author

Nikos Mamoulis

Bio: Nikos Mamoulis is an academic researcher from University of Ioannina. The author has contributed to research in topics: Joins & Spatial query. The author has an hindex of 56, co-authored 282 publications receiving 11121 citations. Previous affiliations of Nikos Mamoulis include University of Hong Kong & Max Planck Society.


Papers
More filters
Proceedings ArticleDOI
15 Apr 2007
TL;DR: This paper formally defines spatial preference queries and proposes appropriate indexing techniques and search algorithms for them and their methods are experimentally evaluated for a wide range of problem settings.
Abstract: A spatial preference query ranks objects based on the qualities of features in their spatial neighborhood. For example, consider a real estate agency office that holds a database with available flats for lease. A customer may want to rank the flats with respect to the appropriateness of their location, defined after aggregating the qualities of other features (e.g., restaurants, cafes, hospital, market, etc.) within a distance range from them. In this paper, we formally define spatial preference queries and propose appropriate indexing techniques and search algorithms for them. Our methods are experimentally evaluated for a wide range of problem settings.

103 citations

Journal ArticleDOI
TL;DR: A fundamental lemma is provided, which can be used to prune the search space while traversing the graph in search for RNN, and two RNN methods are developed; an eager algorithm that attempts to prunes network nodes as soon as they are visited and a lazy technique that prunes thesearch space when a data point is discovered.
Abstract: A reverse nearest neighbor (RNN) query returns the data objects that have a query point as their nearest neighbor (NN). Although such queries have been studied quite extensively in Euclidean spaces, there is no previous work in the context of large graphs. In this paper, we provide a fundamental lemma, which can be used to prune the search space while traversing the graph in search for RNN. Based on it, we develop two RNN methods; an eager algorithm that attempts to prune network nodes as soon as they are visited and a lazy technique that prunes the search space when a data point is discovered. We study retrieval of an arbitrary number k of reverse nearest neighbors, investigate the benefits of materialization, cover several query types, and deal with cases where the queries and the data objects reside on nodes or edges of the graph. The proposed techniques are evaluated in various practical scenarios involving spatial maps, computer networks, and the DBLP coauthorship graph.

103 citations

Posted Content
TL;DR: A network embedding is a representation of a large graph in a low-dimensional space, where vertices are modeled as vectors to preserve the proximity between vertices in the original graph.
Abstract: A network embedding is a representation of a large graph in a low-dimensional space, where vertices are modeled as vectors. The objective of a good embedding is to preserve the proximity between vertices in the original graph. This way, typical search and mining methods can be applied in the embedded space with the help of off-the-shelf multidimensional indexing approaches. Existing network embedding techniques focus on homogeneous networks, where all vertices are considered to belong to a single class.

102 citations

Book ChapterDOI
22 Aug 2005
TL;DR: This work proposes adaptations of spatial access methods and search algorithms for probabilistic versions of range queries and nearest neighbors and conducts an extensive experimental study, which evaluates the effectiveness of proposed solutions.
Abstract: We study the problem of answering spatial queries in databases where objects exist with some uncertainty and they are associated with an existential probability. The goal of a thresholding probabilistic spatial query is to retrieve the objects that qualify the spatial predicates with probability that exceeds a threshold. Accordingly, a ranking probabilistic spatial query selects the objects with the highest probabilities to qualify the spatial predicates. We propose adaptations of spatial access methods and search algorithms for probabilistic versions of range queries and nearest neighbors and conduct an extensive experimental study, which evaluates the effectiveness of proposed solutions.

102 citations

Proceedings ArticleDOI
21 Jun 2004
TL;DR: Alternative methods for processing ANN queries depending on whether A and B are indexed are studied, showing that they are an order of magnitude faster than a previous approach based on closest-pairs query processing.
Abstract: Given two sets A and B of multidimensional objects, the all-nearest-neighbors (ANN) query retrieves for each object in A its nearest neighbor in B. Although this operation is common in several applications, it has not received much attention in the database literature. In this paper we study alternative methods for processing ANN queries depending on whether A and B are indexed: Our algorithms are evaluated through extensive experimentation using synthetic and real datasets. The performance studies show that they are an order of magnitude faster than a previous approach based on closest-pairs query processing.

101 citations


Cited by
More filters
01 Jan 2002

9,314 citations

01 Aug 2000
TL;DR: Assessment of medical technology in the context of commercialization with Bioentrepreneur course, which addresses many issues unique to biomedical products.
Abstract: BIOE 402. Medical Technology Assessment. 2 or 3 hours. Bioentrepreneur course. Assessment of medical technology in the context of commercialization. Objectives, competition, market share, funding, pricing, manufacturing, growth, and intellectual property; many issues unique to biomedical products. Course Information: 2 undergraduate hours. 3 graduate hours. Prerequisite(s): Junior standing or above and consent of the instructor.

4,833 citations

01 Jan 2006
TL;DR: There have been many data mining books published in recent years, including Predictive Data Mining by Weiss and Indurkhya [WI98], Data Mining Solutions: Methods and Tools for Solving Real-World Problems by Westphal and Blaxton [WB98], Mastering Data Mining: The Art and Science of Customer Relationship Management by Berry and Linofi [BL99].
Abstract: The book Knowledge Discovery in Databases, edited by Piatetsky-Shapiro and Frawley [PSF91], is an early collection of research papers on knowledge discovery from data. The book Advances in Knowledge Discovery and Data Mining, edited by Fayyad, Piatetsky-Shapiro, Smyth, and Uthurusamy [FPSSe96], is a collection of later research results on knowledge discovery and data mining. There have been many data mining books published in recent years, including Predictive Data Mining by Weiss and Indurkhya [WI98], Data Mining Solutions: Methods and Tools for Solving Real-World Problems by Westphal and Blaxton [WB98], Mastering Data Mining: The Art and Science of Customer Relationship Management by Berry and Linofi [BL99], Building Data Mining Applications for CRM by Berson, Smith, and Thearling [BST99], Data Mining: Practical Machine Learning Tools and Techniques by Witten and Frank [WF05], Principles of Data Mining (Adaptive Computation and Machine Learning) by Hand, Mannila, and Smyth [HMS01], The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman [HTF01], Data Mining: Introductory and Advanced Topics by Dunham, and Data Mining: Multimedia, Soft Computing, and Bioinformatics by Mitra and Acharya [MA03]. There are also books containing collections of papers on particular aspects of knowledge discovery, such as Machine Learning and Data Mining: Methods and Applications edited by Michalski, Brakto, and Kubat [MBK98], and Relational Data Mining edited by Dzeroski and Lavrac [De01], as well as many tutorial notes on data mining in major database, data mining and machine learning conferences.

2,591 citations

Journal Article
TL;DR: In this article, the authors explore the effect of dimensionality on the nearest neighbor problem and show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance of the farthest data point.
Abstract: We explore the effect of dimensionality on the nearest neighbor problem. We show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point. To provide a practical perspective, we present empirical results on both real and synthetic data sets that demonstrate that this effect can occur for as few as 10-15 dimensions. These results should not be interpreted to mean that high-dimensional indexing is never meaningful; we illustrate this point by identifying some high-dimensional workloads for which this effect does not occur. However, our results do emphasize that the methodology used almost universally in the database literature to evaluate high-dimensional indexing techniques is flawed, and should be modified. In particular, most such techniques proposed in the literature are not evaluated versus simple linear scan, and are evaluated over workloads for which nearest neighbor is not meaningful. Often, even the reported experiments, when analyzed carefully, show that linear scan would outperform the techniques being proposed on the workloads studied in high (10-15) dimensionality!.

1,992 citations