Author

Nikos Mamoulis

Other affiliations include University of Hong Kong, Max Planck Society, and University of California, Riverside.

Bio: Nikos Mamoulis is an academic researcher at the University of Ioannina. The author has contributed to research on the topics of joins and spatial queries, has an h-index of 56, and has co-authored 282 publications that have received 11,121 citations. Previous affiliations of Nikos Mamoulis include University of Hong Kong and Max Planck Society.

Papers published on a yearly basis (chart omitted; years covered: 1996, 1998, 1999, and 2001 to 2023)

Papers

Journal Article

SRX: efficient management of spatial RDF data

Konstantinos Theocharidis¹, John Liagouris², Nikos Mamoulis³, Panagiotis Bouros⁴, Manolis Terrovitis

University of Peloponnese¹, ETH Zurich², University of Ioannina³, University of Mainz⁴

01 Oct 2019

TL;DR: SRX outperforms three popular RDF stores on spatial queries; compared to RDF-3X, it improves performance for queries with spatial predicates while incurring little overhead during updates.

Abstract: We present a general encoding scheme for the efficient management of spatial RDF data. The scheme approximates the geometries of the RDF entities inside their (integer) IDs and can be used, along with several operators and optimizations we introduce, to accelerate queries with spatial predicates and to re-encode entities dynamically in case of updates. We implement our ideas in SRX, a system built on top of the popular RDF-3X system. SRX extends RDF-3X with support for three types of spatial queries: range selections (e.g., find entities within a given polygon), spatial joins (e.g., find pairs of entities whose locations are close to each other), and spatial k-nearest neighbors (e.g., find the three closest entities from a given location). We evaluate SRX on spatial queries and updates with real RDF data, and we also compare its performance with the latest versions of three popular RDF stores. The results show SRX's superior performance over the competitors; compared to RDF-3X, SRX improves its performance for queries with spatial predicates while incurring little overhead during updates.

8 citations

Journal Article

Lightweight privacy-preserving peer-to-peer data integration

Ye Zhang¹, Wai Kit Wong², Siu-Ming Yiu³, Nikos Mamoulis³, David W. Cheung³

Pennsylvania State University¹, Hang Seng Management College², University of Hong Kong³

01 Jan 2013

TL;DR: This paper develops a lightweight protocol that satisfies mapping privacy, extends it to a more complex one that facilitates parallel translation by peers, and, under a stronger adversary model in which peers may collude, proposes an efficient protocol that guards against collusion.

Abstract: Peer Data Management Systems (PDMS) are an attractive solution for managing distributed heterogeneous information. When a peer (client) requests data from another peer (server) with a different schema, translations of the query and its answer are done by a sequence of intermediate peers (translators). There are two privacy issues in this P2P data integration process: (i) answer privacy: no unauthorized parties (including the translators) should learn the query result; (ii) mapping privacy: the schema and the value mappings used by the translators to perform the translation should not be revealed to other peers. Elmeleegy and Ouzzani proposed the PPP protocol that is the first to support privacy-preserving querying in PDMS. However, PPP suffers from several shortcomings. First, PPP does not satisfy the requirement of answer privacy, because it is based on commutative encryption; we show that this issue can be fixed by adopting another cryptographic technique called oblivious transfer. Second, PPP adopts a weaker notion for mapping privacy, which allows the client peer to observe certain mappings done by translators. In this paper, we develop a lightweight protocol, which satisfies mapping privacy and extend it to a more complex one that facilitates parallel translation by peers. Furthermore, we consider a stronger adversary model where there may be collusions among peers and propose an efficient protocol that guards against collusions. We conduct an experimental study on the performance of the proposed protocols using both real and synthetic data. The results show that the proposed protocols not only achieve a better privacy guarantee than PPP, but they are also more efficient.

8 citations

Proceedings Article

Location aware keyword query suggestion based on document proximity

Shuyao Qi¹, Dingming Wu², Nikos Mamoulis³

University of Hong Kong¹, Shenzhen University², University of Ioannina³

16 May 2016

TL;DR: This paper designs a location-aware keyword query suggestion framework built on a weighted keyword-document graph, which captures both the semantic relevance between keyword queries and the spatial distance between the resulting documents and the user's location.

Abstract: Consider a user who has issued a keyword query to a search engine. We study the effective suggestion of alternative keyword queries to the user, which are semantically relevant to the original query and whose results are documents that correspond to objects near the user's location. For this purpose, we propose a weighted keyword-document graph which captures semantic and proximity relevance between queries and documents. Then, we use the graph to suggest queries that are near in terms of graph distance to the original queries. To make our framework scalable, we propose a partition-based approach that greatly outperforms the baseline algorithm.

8 citations

Journal Article

Thematic ranking of object summaries for keyword search

Georgios John Fakas¹, Yilun Cai, Zhi Cai², Nikos Mamoulis³

Uppsala University¹, Beijing University of Technology², University of Ioannina³

01 Oct 2017

TL;DR: This paper argues that the effective thematic ranking of OSs should gracefully combine IR-style properties, authoritative ranking, and affinity; it models the ranking problem as a top-k group-by join and proposes an algorithm that computes the join efficiently using appropriate count statistics.

Abstract: An Object Summary (OS) is a tree structure of tuples that summarizes the context of a particular Data Subject (DS) tuple. The OS has been used as a model of keyword search in relational databases, where, given a set of keywords, the objective is to identify the DS tuples relevant to the keywords and their corresponding OSs. However, a query may return a large number of OSs, which raises the issue of effectively and efficiently ranking them in order to present only the most important ones to the user. In this paper, we propose a model that ranks OSs containing a set of identifying keywords (e.g., Chen) according to their relevance to a set of thematic keywords (e.g., Mining). We argue that the effective thematic ranking of OSs should gracefully combine IR-style properties, authoritative ranking, and affinity. Our ranking problem is modeled and solved as a top-k group-by join; we propose an algorithm that computes the join efficiently, taking advantage of appropriate count statistics, and compare it with baseline approaches. An experimental evaluation on the DBLP and TPC-H databases verifies the effectiveness and efficiency of our proposal.

7 citations

Journal Article

Complex Spatial Query Processing

Nikos Mamoulis¹, Dimitris Papadias², Dinos Arkoumanis³

University of Hong Kong¹, Hong Kong University of Science and Technology², National Technical University of Athens³

01 Dec 2004 - Geoinformatica

TL;DR: This paper provides formulae that accurately estimate the selectivity of complex spatial queries involving combinations of spatial selections and joins, and proposes algorithms that process spatial joins and selections simultaneously and are typically more efficient than combinations of simple operators.

Abstract: The user of a Geographical Information System is not limited to conventional spatial selections and joins, but may also pose more complicated and descriptive queries. In this paper, we focus on the efficient processing and optimization of complex spatial queries that involve combinations of spatial selections and joins. Our contribution is manifold; we first provide formulae that accurately estimate the selectivity of such queries. These formulae, paired with cost models for selections and joins, can be used to combine spatial operators in an optimal way. Second, we propose algorithms that process spatial joins and selections simultaneously and are typically more efficient than combinations of simple operators. Finally, we study the problem of optimizing complex spatial queries using these operators, by providing (i) cost models, and (ii) rules that reduce the optimization space significantly. The accuracy of the selectivity models and the efficiency of the proposed algorithms are evaluated through experimentation.

7 citations

Cited by

Data Mining - Concepts and Techniques.

Petra Perner

01 Jan 2002

9,314 citations

Introducing the Special Issue on “Bioinformatics”

장병탁, 김삼묘, 허철구

01 Aug 2000

TL;DR: A Bioentrepreneur course on assessing medical technology in the context of commercialization, covering many issues unique to biomedical products.

Abstract: BIOE 402. Medical Technology Assessment. 2 or 3 hours. Bioentrepreneur course. Assessment of medical technology in the context of commercialization. Objectives, competition, market share, funding, pricing, manufacturing, growth, and intellectual property; many issues unique to biomedical products. Course Information: 2 undergraduate hours. 3 graduate hours. Prerequisite(s): Junior standing or above and consent of the instructor.

4,833 citations

Data Mining: Concepts and Techniques (2nd edition)

Jiawei Han, Micheline Kamber

01 Jan 2006

TL;DR: There have been many data mining books published in recent years, including Predictive Data Mining by Weiss and Indurkhya [WI98], Data Mining Solutions: Methods and Tools for Solving Real-World Problems by Westphal and Blaxton [WB98], and Mastering Data Mining: The Art and Science of Customer Relationship Management by Berry and Linoff [BL99].

Abstract: The book Knowledge Discovery in Databases, edited by Piatetsky-Shapiro and Frawley [PSF91], is an early collection of research papers on knowledge discovery from data. The book Advances in Knowledge Discovery and Data Mining, edited by Fayyad, Piatetsky-Shapiro, Smyth, and Uthurusamy [FPSSe96], is a collection of later research results on knowledge discovery and data mining. There have been many data mining books published in recent years, including Predictive Data Mining by Weiss and Indurkhya [WI98], Data Mining Solutions: Methods and Tools for Solving Real-World Problems by Westphal and Blaxton [WB98], Mastering Data Mining: The Art and Science of Customer Relationship Management by Berry and Linoff [BL99], Building Data Mining Applications for CRM by Berson, Smith, and Thearling [BST99], Data Mining: Practical Machine Learning Tools and Techniques by Witten and Frank [WF05], Principles of Data Mining (Adaptive Computation and Machine Learning) by Hand, Mannila, and Smyth [HMS01], The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman [HTF01], Data Mining: Introductory and Advanced Topics by Dunham, and Data Mining: Multimedia, Soft Computing, and Bioinformatics by Mitra and Acharya [MA03]. There are also books containing collections of papers on particular aspects of knowledge discovery, such as Machine Learning and Data Mining: Methods and Applications edited by Michalski, Bratko, and Kubat [MBK98], and Relational Data Mining edited by Dzeroski and Lavrac [De01], as well as many tutorial notes on data mining in major database, data mining and machine learning conferences.

2,591 citations

Matrix Factorization Techniques for Recommender Systems

Patrick Seemann

01 Jan 2014

2,080 citations

Journal Article

When is nearest neighbor meaningful?

Kevin S. Beyer, Jonathan Goldstein, Raghu Ramakrishnan, Uri Shaft

01 Jan 1999 - Lecture Notes in Computer Science

TL;DR: In this article, the authors explore the effect of dimensionality on the nearest neighbor problem and show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point.

Abstract: We explore the effect of dimensionality on the nearest neighbor problem. We show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point. To provide a practical perspective, we present empirical results on both real and synthetic data sets that demonstrate that this effect can occur for as few as 10-15 dimensions. These results should not be interpreted to mean that high-dimensional indexing is never meaningful; we illustrate this point by identifying some high-dimensional workloads for which this effect does not occur. However, our results do emphasize that the methodology used almost universally in the database literature to evaluate high-dimensional indexing techniques is flawed, and should be modified. In particular, most such techniques proposed in the literature are not evaluated versus simple linear scan, and are evaluated over workloads for which nearest neighbor is not meaningful. Often, even the reported experiments, when analyzed carefully, show that linear scan would outperform the techniques being proposed on the workloads studied in high (10-15) dimensionality!

1,992 citations
