Showing papers by "Nikos Mamoulis" published in 2018

Proceedings Article•DOI•

Yixiang Fang¹, Reynold Cheng¹, Gao Cong², Nikos Mamoulis³, Yun Li⁴•Institutions (4)

University of Hong Kong¹, Nanyang Technological University², University of Ioannina³, Nanjing University⁴

16 Apr 2018

TL;DR: This paper proves that answering SPM queries is computationally intractable and proposes two algorithms for their evaluation, which experiments on real datasets show to be effective and efficient.

Abstract: In this paper, we study the spatial pattern matching (SPM) query. Given a set D of spatial objects (e.g., houses and shops), each with a textual description, we aim at finding all combinations of objects from D that match a user-defined spatial pattern P. A pattern P is a graph where vertices represent spatial objects, and edges denote distance relationships between them. The SPM query returns the instances that satisfy P. An example of P can be "a house within 10-minute walk from a school, which is at least 2km away from a hospital". The SPM query can benefit users such as house buyers, urban planners, and archaeologists. We prove that answering such queries is computationally intractable, and propose two efficient algorithms for their evaluation. Extensive experimental evaluation and case studies on four real datasets show that our proposed solutions are highly effective and efficient.
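To make the query definition concrete, the following is a naive brute-force sketch of SPM evaluation. It is not one of the two algorithms proposed in the paper; it simply enumerates every combination of keyword-matching objects and checks the distance bounds on each pattern edge. The function name `spm_brute_force`, the tuple layouts, the use of Euclidean distance, and the 0.8 km stand-in for a "10-minute walk" are all assumptions made for illustration.

```python
from itertools import product
from math import hypot, inf


def spm_brute_force(objects, pattern_vertices, pattern_edges):
    """Naive SPM evaluation (illustrative only, exponential in the pattern size).

    objects:          list of (object_id, keyword, x, y)
    pattern_vertices: list of keywords, one per pattern vertex
    pattern_edges:    list of (i, j, lo, hi), meaning the objects matched to
                      vertices i and j must be between lo and hi apart
    """
    # Candidate objects for each pattern vertex: those carrying its keyword.
    candidates = [[o for o in objects if o[1] == kw] for kw in pattern_vertices]
    matches = []
    for combo in product(*candidates):
        # An object may be matched to at most one pattern vertex.
        if len({o[0] for o in combo}) < len(combo):
            continue
        # Every edge's distance constraint must hold.
        if all(
            lo <= hypot(combo[i][2] - combo[j][2], combo[i][3] - combo[j][3]) <= hi
            for i, j, lo, hi in pattern_edges
        ):
            matches.append(tuple(o[0] for o in combo))
    return matches


# Hypothetical data, coordinates in km.
objects = [
    ("h1", "house", 0.0, 0.0),
    ("h2", "house", 10.0, 0.0),
    ("s1", "school", 0.5, 0.0),
    ("p1", "hospital", 3.0, 0.0),
]
vertices = ["house", "school", "hospital"]
edges = [(0, 1, 0.0, 0.8),   # house within an assumed 0.8 km of a school
         (1, 2, 2.0, inf)]   # school at least 2 km from a hospital
print(spm_brute_force(objects, vertices, edges))  # [('h1', 's1', 'p1')]
```

The enumeration cost grows with the product of the candidate-list sizes, which is consistent with the intractability result stated in the abstract and motivates the need for smarter evaluation strategies.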

26 citations

Journal Article•DOI•

Density-Based Place Clustering Using Geo-Social Network Data

Dingming Wu¹, Jieming Shi, Nikos Mamoulis²•Institutions (2)

Shenzhen University¹, University of Ioannina²

01 May 2018-IEEE Transactions on Knowledge and Data Engineering

TL;DR: This paper shows how the density-based clustering paradigm can be extended to places visited by users of a geo-social network, taking into account spatio-temporal information and the social relationships between the users who visit the clustered places.

Abstract: Spatial clustering deals with the unsupervised grouping of places into clusters and finds important applications in urban planning and marketing. Current spatial clustering models disregard information about the people who visit the clustered places and the times of their visits. In this paper, we show how the density-based clustering paradigm can be extended to apply to places which are visited by users of a geo-social network. Our model considers spatio-temporal information and the social relationships between users who visit the clustered places. After formally defining the model and the distance measure it relies on, we provide alternatives to our model and the distance measure. We evaluate the effectiveness of our model via a case study on real data; in addition, we design two quantitative measures, called social entropy and community score, to evaluate the quality of the discovered clusters. The results show that temporal-geo-social clusters have special properties and cannot be found by applying simple spatial clustering approaches and other alternatives.
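As a rough illustration of the general idea, the sketch below runs a textbook DBSCAN-style procedure with a pluggable distance function. The distance used here, a weighted mix of normalized Euclidean distance and the Jaccard distance between visitor sets, is a hypothetical stand-in and not the measure defined in the paper; it also ignores the temporal dimension. The names `geo_social_distance`, `dbscan`, `max_d`, and `w` are assumptions.

```python
from collections import deque
from math import hypot


def geo_social_distance(p, q, max_d=1.0, w=0.5):
    """Hypothetical distance: weighted spatial distance plus visitor-set Jaccard distance."""
    spatial = hypot(p["x"] - q["x"], p["y"] - q["y"]) / max_d
    union = p["users"] | q["users"]
    social = 1.0 - len(p["users"] & q["users"]) / len(union) if union else 1.0
    return w * spatial + (1 - w) * social


def dbscan(places, eps, min_pts, dist=geo_social_distance):
    """Textbook DBSCAN over an arbitrary distance; returns one label per place (-1 = noise)."""
    n = len(places)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        return [j for j in range(n) if dist(places[i], places[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border place
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core place joins as border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core place, so keep expanding
    return labels


# Hypothetical places: d is spatially close to the others but shares no visitors.
places = [
    {"x": 0.0, "y": 0.0, "users": {"u1", "u2"}},
    {"x": 0.1, "y": 0.0, "users": {"u1", "u2"}},
    {"x": 0.2, "y": 0.0, "users": {"u1", "u2", "u3"}},
    {"x": 0.1, "y": 0.1, "users": {"u9"}},
]
print(dbscan(places, eps=0.3, min_pts=3))  # [0, 0, 0, -1]
```

In this toy run, the fourth place would be absorbed by a purely spatial clustering but is left out once visitor overlap contributes to the distance, which is the kind of effect the abstract describes.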

20 citations

Journal Article•DOI•

Entity-Based Query Recommendation for Long-Tail Queries

Zhipeng Huang¹, Bogdan Cautis², Reynold Cheng¹, Yudian Zheng¹, Nikos Mamoulis³, Jing Yan¹•Institutions (3)

University of Hong Kong¹, University of Paris-Sud², University of Ioannina³

22 Aug 2018-ACM Transactions on Knowledge Discovery From Data

TL;DR: This article examines two information sources: a knowledge base (or KB), such as YAGO and Freebase; and a click log, which contains the URLs accessed by a query user, and studies how to use these sources to find new entities useful for query recommendation.

Abstract: Query recommendation, which suggests related queries to search engine users, has attracted a lot of attention in recent years. Most of the existing solutions, which perform analysis of users’ search history (or query logs), are often insufficient for long-tail queries that rarely appear in query logs. To handle such queries, we study the use of entities found in queries to provide recommendations. Specifically, we extract entities from a query, and use these entities to explore new ones by consulting an information source. The discovered entities are then used to suggest new queries to the user. In this article, we examine two information sources: (1) a knowledge base (or KB), such as YAGO and Freebase; and (2) a click log, which contains the URLs accessed by a query user. We study how to use these sources to find new entities useful for query recommendation. We further study a hybrid framework that integrates different query recommendation methods effectively. As shown in the experiments, our proposed approaches provide better recommendations than existing solutions for long-tail queries. In addition, our query recommendation process takes less than 100ms to complete. Thus, our solution is suitable for providing online query recommendation services for search engines.

20 citations

Journal Article•DOI•

Investment recommendation by discovering high-quality opinions in investor based social networks

Wenting Tu¹, Min Yang², David W. Cheung³, Nikos Mamoulis³•Institutions (3)

Shanghai University of Finance and Economics¹, Chinese Academy of Sciences², University of Hong Kong³

01 Nov 2018-Information Systems

TL;DR: Experimental results on real datasets demonstrate the effectiveness of the work in recommending high-quality investment opinions and profitable portfolios.

17 citations

Proceedings Article•DOI•

T-Crowd: Effective Crowdsourcing for Tabular Data

Caihua Shan¹, Nikos Mamoulis², Guoliang Li³, Reynold Cheng¹, Zhipeng Huang¹, Yudian Zheng⁴•Institutions (4)

University of Hong Kong¹, University of Ioannina², Tsinghua University³, Twitter⁴

16 Apr 2018

TL;DR: This paper presents T-Crowd, a crowdsourcing system that considers attribute relationships, seamlessly supports categorical and continuous attributes, and outperforms state-of-the-art methods, improving the quality of truth inference.

Abstract: We study the effective use of crowdsourcing in filling missing values in a given relation (e.g., a table containing different attributes of celebrity stars, such as nationality and age). A task given to a worker typically consists of questions about the missing attribute values (e.g., what is the age of Jet Li?). Existing work often treats related attributes independently, leading to suboptimal performance. We present T-Crowd: a crowdsourcing system that considers attribute relationships. T-Crowd integrates each worker's answers on different attributes to effectively learn his/her trustworthiness and the true data values. Our solution seamlessly supports categorical and continuous attributes. Our experiments on real datasets show that T-Crowd outperforms state-of-the-art methods, improving the quality of truth inference.

14 citations

Posted Content•

Flow Motifs in Interaction Networks

Chrysanthi Kosyfaki¹, Nikos Mamoulis¹, Evaggelia Pitoura¹, Panayiotis Tsaparas¹•Institutions (1)

University of Ioannina¹

19 Oct 2018-arXiv: Social and Information Networks

TL;DR: This paper introduces network flow motifs, a novel type of motifs that model significant flow transfer among a set of vertices within a constrained time window and designs an algorithm for identifying flow motif instances in a large graph.

Abstract: Many real-world phenomena are best represented as interaction networks with dynamic structures (e.g., transaction networks, social networks, traffic networks). Interaction networks capture flow of data which is transferred between their vertices along a timeline. Analyzing such networks is crucial toward comprehending processes in them. A typical analysis task is the finding of motifs, which are small subgraph patterns that repeat themselves in the network. In this paper, we introduce network flow motifs, a novel type of motifs that model significant flow transfer among a set of vertices within a constrained time window. We design an algorithm for identifying flow motif instances in a large graph. Our algorithm can be easily adapted to find the top-k instances of maximal flow. In addition, we design a dynamic programming module that finds the instance with the maximum flow. We evaluate the performance of the algorithm on three real datasets and identify flow motifs which are significant for these graphs. Our results show that our algorithm is scalable and that the real networks indeed include interesting motifs, which appear much more frequently than in randomly generated networks having similar characteristics.

12 citations

Proceedings Article•DOI•

SpaceKey: Exploring Patterns in Spatial Databases

Yixiang Fang¹, Reynold Cheng¹, Jikun Wang¹, Lukito Budiman¹, Gao Cong², Nikos Mamoulis•Institutions (2)

University of Hong Kong¹, Nanyang Technological University²

16 Apr 2018

TL;DR: This paper proposes SpaceKey, a system for retrieving and visualizing spatial objects returned by SGK queries, which also supports a novel query, called SPM query, defined by a spatial pattern: a graph whose vertices contain keywords and whose edges are associated with distance constraints.

Abstract: Spatial objects associated with keywords are prevalent in applications such as Google Maps and Twitter. Recently, the topic of spatial keyword queries has received plenty of attention. Spatial Group Keyword (SGK) search is a popular class of queries; their goal is to find a set of objects which are close to each other and are associated with a set of input keywords. In this paper, we propose SpaceKey, a system for retrieving and visualizing spatial objects returned by SGK queries. In addition to existing SGK query types, SpaceKey supports a novel query, called SPM query. An SPM query is defined by a spatial pattern, a graph whose vertices contain keywords and whose edges are associated with distance constraints. The results are sets of objects that match the pattern. SpaceKey allows users to perform comparison analysis between different SGK query types. We plan to make SpaceKey an open-source web-based platform, and design API functions for software developers to plug other SGK query algorithms into our system.

8 citations

Proceedings Article•

Interval Count Semi-Joins.

Panagiotis Bouros¹, Nikos Mamoulis²•Institutions (2)

Aarhus University¹, University of Ioannina²

01 Jan 2018

TL;DR: The state-of-the-art algorithm for interval joins is extended to evaluate ICSJ at the cost of only scanning the sorted interval endpoints, enabling an efficient evaluation of an interval count semi-join operation.

Abstract: Interval joins find applications in several domains, including temporal and spatial databases, uncertain data management, and streaming data processing. In this paper, we study the evaluation of an interval count semi-join (ICSJ) operation that can be used for selecting or ranking intervals based on the number of join pairs they appear in. We extend the state-of-the-art algorithm for interval joins to evaluate ICSJ at the cost of only scanning the sorted interval endpoints.
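The operation itself can be sketched in a few lines. The version below is a simple counting approach over sorted endpoints using binary search; it is not the extended interval-join algorithm from the paper, only an illustration of what ICSJ computes. It assumes closed intervals given as `(start, end)` tuples, and the function name `interval_count_semijoin` is hypothetical.

```python
from bisect import bisect_left, bisect_right


def interval_count_semijoin(R, S):
    """For each closed interval in R, count the intervals in S that overlap it.

    An S interval fails to overlap r only if it starts after r ends or ends
    before r starts, so the overlap count is
    (# S starts <= r.end) - (# S ends < r.start).
    """
    starts = sorted(s for s, _ in S)
    ends = sorted(e for _, e in S)
    counts = []
    for r_start, r_end in R:
        started = bisect_right(starts, r_end)   # S intervals with start <= r_end
        finished = bisect_left(ends, r_start)   # S intervals with end < r_start
        counts.append(started - finished)
    return counts


R = [(1, 5), (6, 7), (10, 12)]
S = [(0, 2), (4, 6), (7, 9), (12, 15)]
print(interval_count_semijoin(R, S))  # [2, 2, 1]
```

The resulting counts can then be used to select intervals (for example, those with a count above a threshold) or to rank them, as the abstract describes.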

7 citations

Journal Article•DOI•

Recommending packages with validity constraints to groups of users

Shuyao Qi¹, Nikos Mamoulis², Evaggelia Pitoura², Panayiotis Tsaparas²•Institutions (2)

University of Hong Kong¹, University of Ioannina²

01 Feb 2018-Knowledge and Information Systems

TL;DR: This paper formulates the P2G problem, proposes probabilistic models that capture the preference of a group toward a package, incorporating factors such as user impact and package viability, and investigates the issue of recommendation fairness.

Abstract: The success of recommender systems has made them the focus of a massive research effort in both industry and academia. Recent work has generalized recommendations to suggest packages of items to single users, or single items to groups of users. However, to the best of our knowledge, the interesting problem of recommending a package to a group of users (P2G) has not been studied to date. This is a problem with several practical applications, such as recommending vacation packages to tourist groups, entertainment packages to groups of friends or sets of courses to groups of students. In this paper, we formulate the P2G problem, and we propose probabilistic models that capture the preference of a group toward a package, incorporating factors such as user impact and package viability. We also investigate the issue of recommendation fairness. This is a novel consideration that arises in our setting, where we require that no user is consistently slighted by the item selection in the package. In addition, we study a special case of the P2G problem, where the recommended items are places and the recommendation should consider the current locations of the users in the group. We present aggregation algorithms for finding the best packages and compare our suggested models with baseline approaches stemming from previous work. The results show that our models find packages of high quality which consider all special requirements of P2G recommendation.

7 citations

Proceedings Article•

Flow Motifs in Interaction Networks.

Chrysanthi Kosyfaki¹, Nikos Mamoulis¹, Evaggelia Pitoura¹, Panayiotis Tsaparas¹•Institutions (1)

University of Ioannina¹

01 Jan 2018

TL;DR: In this paper, the authors introduce network flow motifs, a novel type of motifs that model significant flow transfer among a set of vertices within a constrained time window, and design an algorithm for identifying flow motif instances in a large graph.

6 citations

Proceedings Article•DOI•

DSANLS: Accelerating Distributed Nonnegative Matrix Factorization via Sketching

Yuqiu Qian¹, Conghui Tan², Nikos Mamoulis³, David W. Cheung¹•Institutions (3)

University of Hong Kong¹, The Chinese University of Hong Kong², University of Ioannina³

02 Feb 2018

TL;DR: This paper proposes a distributed sketched alternating nonnegative least squares (DSANLS) framework for NMF, which utilizes a matrix sketching technique to reduce the size of the nonnegative least squares subproblems in each iteration for U and V.

Abstract: Nonnegative matrix factorization (NMF) has been successfully applied in different fields, such as text mining, image processing, and video analysis. NMF is the problem of determining two nonnegative low rank matrices U and V, for a given input matrix M, such that M ≈ UVᵀ. There is an increasing interest in parallel and distributed NMF algorithms, due to the high cost of centralized NMF on large matrices. In this paper, we propose a distributed sketched alternating nonnegative least squares (DSANLS) framework for NMF, which utilizes a matrix sketching technique to reduce the size of nonnegative least squares subproblems in each iteration for U and V. We design and analyze two different random matrix generation techniques and two subproblem solvers. Our theoretical analysis shows that DSANLS converges to the stationary point of the original NMF problem and it greatly reduces the computational cost in each subproblem as well as the communication cost within the cluster. DSANLS is implemented using MPI for communication, and tested on both dense and sparse real datasets. The results demonstrate the efficiency and scalability of our framework, compared to the state-of-the-art distributed NMF MPI implementation.
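For orientation, the NMF objective and the alternating step can be written out as below. The first two lines are the standard formulation; the third shows one common way a sketching matrix can shrink a least squares subproblem and is only an illustrative assumption about the general idea, so the exact subproblem used by DSANLS may differ. The shapes and the symbols S, m, n, k, and d are introduced here for illustration and do not come from the abstract.

```latex
% Assumed shapes: M is m x n, U is m x k, V is n x k,
% and S is an n x d random sketching matrix with d much smaller than n.
\begin{align}
  \min_{U \ge 0,\; V \ge 0} \; & \lVert M - U V^{\top} \rVert_F^2
      && \text{(NMF objective)} \\
  U \leftarrow \arg\min_{U \ge 0} \; & \lVert M - U V^{\top} \rVert_F^2
      && \text{(alternating step for } U \text{ with } V \text{ fixed)} \\
  U \leftarrow \arg\min_{U \ge 0} \; & \lVert M S - U V^{\top} S \rVert_F^2
      && \text{(sketched step, illustrative)}
\end{align}
```

The step for V is symmetric, with U held fixed. In the sketched form the residual has d columns instead of n, which is where the reduction in subproblem size would come from.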

Journal Article•DOI•

Location-aware query reformulation for search engines

Zhipeng Huang¹, Yuqiu Qian¹, Nikos Mamoulis²•Institutions (2)

University of Hong Kong¹, University of Ioannina²

01 Oct 2018-Geoinformatica

TL;DR: This paper proposes an effective spatial proximity measure between a query issuer and a query with a location distribution obtained from its clicked URLs in the query history, and extends popular query recommendation and auto-completion approaches to the authors' location-aware setting, which suggest query reformulations that are semantically relevant to the original query and give results that are spatially close to the query issuer.

Abstract: Query reformulation, including query recommendation and query auto-completion, is a popular add-on feature of search engines, which provide related and helpful reformulations of a keyword query. Due to the dropping prices of smartphones and the increasing coverage and bandwidth of mobile networks, a large percentage of search engine queries are issued from mobile devices. This makes it possible to improve the quality of query recommendation and auto-completion by considering the physical locations of the query issuers. However, limited research has been done on location-aware query reformulation for search engines. In this paper, we propose an effective spatial proximity measure between a query issuer and a query with a location distribution obtained from its clicked URLs in the query history. Based on this, we extend popular query recommendation and auto-completion approaches to our location-aware setting, which suggest query reformulations that are semantically relevant to the original query and give results that are spatially close to the query issuer. In addition, we extend the bookmark coloring algorithm for graph proximity search to support our proposed query recommendation approaches online, and we adapt an A* search algorithm to support our query auto-completion approach. We also propose a spatial partitioning based approximation that accelerates the computation of our proposed spatial proximity. We conduct experiments using a real query log, which show that our proposed approaches significantly outperform previous work in terms of quality, and they can be efficiently applied online.

Patent•

KSP algorithm-based resource description framework query method and system

Dingming Wu, Jieming Shi, Nikos Mamoulis, Yuan Shuai

29 Nov 2018

TL;DR: This patent presents a KSP algorithm-based resource description framework (RDF) query method, configured to employ the KSP algorithm to search for a semantic position of a query keyword in an RDF graph.

Abstract: A KSP algorithm-based resource description framework query method, configured to employ the KSP algorithm to search for a semantic position of a query keyword in an RDF graph. The query method is user-friendly because a user does not need to master a specialized query language and simply needs to input a query keyword. The query method returns a subtree containing all inputted query keywords and near a query position.
