Showing papers on "Skyline" published in 2011

Open Access

Proceedings Article•DOI•

Efficient parallel skyline processing using hyperplane projections

Henning Köhler¹, Jing Yang², Xiaofang Zhou¹•Institutions (2)

University of Queensland¹, Renmin University of China²

12 Jun 2011

TL;DR: This work uses hyperplane projections to obtain useful partitions of the data set for parallel processing that not only ensure small local skyline sets but also enable efficient merging of results, and provides insights into the impact of different optimization strategies.

Abstract: The skyline of a set of multi-dimensional points (tuples) consists of those points for which no clearly better point exists in the given set, using component-wise comparison on domains of interest. Skyline queries, i.e., queries that involve computation of a skyline, can be computationally expensive, so it is natural to consider parallelized approaches which make good use of multiple processors. We approach this problem by using hyperplane projections to obtain useful partitions of the data set for parallel processing. These partitions not only ensure small local skyline sets, but enable efficient merging of results as well. Our experiments show that our method consistently outperforms similar approaches for parallel skyline computation, regardless of data distribution, and provides insights on the impacts of different optimization strategies.
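The skyline definition and the partition-then-merge idea in this abstract can be sketched in a few lines of Python. This is a minimal illustration, assuming smaller values are better in every dimension. The names `dominates`, `skyline`, and `partitioned_skyline` are hypothetical, and the round-robin split is only a stand-in for the hyperplane-projection partitioning the paper describes.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """True if p is no worse than q everywhere and strictly better somewhere.

    Assumption: smaller is better in every dimension.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points: Sequence[Point]) -> List[Point]:
    """Naive quadratic skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def partitioned_skyline(points: Sequence[Point], parts: int) -> List[Point]:
    """Compute a local skyline per partition, then merge the local results.

    Assumption: a round-robin split stands in for a real partitioning scheme;
    each local skyline could be computed by a separate worker.
    """
    chunks = [points[i::parts] for i in range(parts)]
    local = [skyline(chunk) for chunk in chunks]
    candidates = [p for part in local for p in part]
    # Only local skyline points can be global skyline points, so one more
    # skyline pass over the candidates yields the global result.
    return skyline(candidates)


data = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]
print(skyline(data))
print(sorted(partitioned_skyline(data, 2)))
```

The merge step only sees the union of the local skylines, which is why partitions that keep local skylines small also keep merging cheap.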

73 citations

Proceedings Article•DOI•

Mining Dominant Patterns in the Sky

Arnaud Soulet, Chedy Raïssi¹, Marc Plantevit, Bruno Crémilleux•Institutions (1)

French Institute for Research in Computer Science and Automation¹

11 Dec 2011

TL;DR: This work establishes theoretical relationships between pattern condensed representations and skyline pattern mining and shows that it is possible to compute automatically a subset of measures involved in the user query which allows the patterns to be condensed and thus facilitates the computation of the skyline patterns.

Abstract: Pattern discovery is at the core of numerous data mining tasks. Although many methods focus on efficiency in pattern mining, they still suffer from the problem of choosing a threshold that influences the final extraction result. The goal of our study is to make the results of pattern mining useful from a user-preference point of view. To this end, we integrate into the pattern discovery process the idea of skyline queries in order to mine skyline patterns in a threshold-free manner. Because the skyline patterns satisfy a formal dominance property, they not only have a global interest but also have semantics that are easily understood by the user. In this work, we first establish theoretical relationships between pattern condensed representations and skyline pattern mining. We also show that it is possible to compute automatically a subset of measures involved in the user query which allows the patterns to be condensed and thus facilitates the computation of the skyline patterns. This forms the basis for a novel approach to mining skyline patterns. We illustrate the efficiency of our approach over several data sets, including a use case from chemoinformatics, and show that small sets of dominant patterns are produced under various measures.

70 citations

Journal Article•

Corroborating Information from Web Sources.

Werner Kießling¹, Markus Endres¹, Florian Wenzel²•Institutions (2)

Rutgers University¹, University of Augsburg²

01 Jan 2011-IEEE Data(base) Engineering Bulletin

TL;DR: Preference SQL is a declarative extension of standard SQL by strict partial order preferences, behaving like soft constraints under the BMO query model, enabling a seamless application integration with standard SQL back-end systems.

Abstract: Preference SQL is a declarative extension of standard SQL by strict partial order preferences, behaving like soft constraints under the BMO query model. Preference queries can be formulated intuitively following an inductive constructor-based approach. Both qualitative methods such as Pareto / skyline and quantitative methods such as numerical ranking, definable over categorical as well as numerical attribute domains, can be used. The Preference SQL System is implemented as a middleware component, enabling a seamless application integration with standard SQL back-end systems. The preference query optimizer performs algebraic transformations of preference relational algebra as well as cost-based algorithm selection, e.g., for efficient Pareto / skyline evaluation. Ongoing work extends Preference SQL towards efficient support for personalized location-based mobile geo-services and social networks.

68 citations

Proceedings Article•DOI•

Representative skylines using threshold-based preference distributions

Atish Das Sarma¹, Ashwin Lall¹, Danupon Nanongkai¹, Richard J. Lipton¹, Jim Xu¹•Institutions (1)

Georgia Institute of Technology¹

11 Apr 2011

TL;DR: One of the main contributions is to formulate the problem of displaying k representative skyline points such that the probability that a random user would click on one of them is maximized.

Abstract: The study of skylines and their variants has received considerable attention in recent years. Skylines are essentially sets of most interesting (undominated) tuples in a database. However, since the skyline is often very large, much research effort has been devoted to identifying a smaller subset of (say k) “representative skyline” points. Several different definitions of representative skylines have been considered. Most of these formulations are intuitive in that they try to achieve some kind of clustering “spread” over the entire skyline, with k points. In this work, we take a more principled approach in defining the representative skyline objective. One of our main contributions is to formulate the problem of displaying k representative skyline points such that the probability that a random user would click on one of them is maximized.

59 citations

Journal Article•DOI•

Parallel skyline computation on multicore architectures

Hyeonseung Im¹, Jonghyun Park¹, Sungwoo Park¹•Institutions (1)

Pohang University of Science and Technology¹

01 Jun 2011-Information Systems

TL;DR: This paper compares two parallel skyline algorithms: a parallel version of the branch-and-bound algorithm (BBS) and a new parallel algorithm based on skeletal parallel programming, which is comparable to parallel BBS in speed.

52 citations

Journal Article•DOI•

Constrained Skyline Query Processing against Distributed Data Sites

Lijiang Chen¹, Bin Cui¹, Hua Lu²•Institutions (2)

Peking University¹, Aalborg University²

01 Feb 2011-IEEE Transactions on Knowledge and Data Engineering

TL;DR: This paper proposes a partition algorithm that divides all data sites into incomparable groups such that the skyline computations in all groups can be parallelized without changing the final result, and develops a novel algorithm framework called PaDSkyline for parallel skyline query processing among partitioned site groups.

Abstract: The skyline of a multidimensional point set is a subset of interesting points that are not dominated by others. In this paper, we investigate constrained skyline queries in a large-scale unstructured distributed environment, where relevant data are distributed among geographically scattered sites. We first propose a partition algorithm that divides all data sites into incomparable groups such that the skyline computations in all groups can be parallelized without changing the final result. We then develop a novel algorithm framework called PaDSkyline for parallel skyline query processing among partitioned site groups. We also employ intragroup optimization and multifiltering technique to improve the skyline query processes within each group. In particular, multiple (local) skyline points are sent together with the query as filtering points, which help identify unqualified local skyline points early on a data site. In this way, the amount of data to be transmitted via network connections is reduced, and thus, the overall query response time is shortened further. Cost models and heuristics are proposed to guide the selection of a given number of filtering points from a superset. A cost-efficient model is developed to determine how many filtering points to use for a particular data site. The results of an extensive experimental study demonstrate that our proposals are effective and efficient.
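The filtering-point idea in this abstract can be illustrated with a small Python sketch. It assumes smaller values are better in every dimension; `answer_at_site` and the sample data are hypothetical, and the sketch does not model PaDSkyline's site grouping or its cost-based choice of filtering points.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """Assumption: smaller is better in every dimension."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def local_skyline(points: Sequence[Point]) -> List[Point]:
    """Naive skyline over the points stored at one site."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def answer_at_site(site_points: Sequence[Point],
                   filter_points: Sequence[Point]) -> List[Point]:
    """Return the local skyline points that survive the filtering points.

    The filtering points arrive together with the query; any local skyline
    point they dominate cannot be in the global skyline, so it is dropped
    before anything is sent back over the network.
    """
    return [p for p in local_skyline(site_points)
            if not any(dominates(f, p) for f in filter_points)]


site = [(4, 4), (1, 9), (5, 3)]
filters = [(3, 3)]  # hypothetical skyline point already found elsewhere
print(answer_at_site(site, filters))
print(answer_at_site(site, []))
```

With the single filtering point, the site returns one point instead of three, which is the reduction in transmitted data the abstract refers to.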

51 citations

Proceedings Article•DOI•

On the Use of Fuzzy Dominance for Computing Service Skyline Based on QoS

Karim Benouaret, Djamal Benslimane, Allel Hadjali¹•Institutions (1)

Institut de Recherche en Informatique et Systèmes Aléatoires¹

04 Jul 2011

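TL;DR: A new concept, called alpha-dominant service skyline, is introduced to address two drawbacks of service skylines (they often favor services with a poor compromise between QoS attributes, and they can be too large for users to choose from), and a suitable algorithm for computing it efficiently is developed.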

Abstract: Nowadays, the exploding number of functionally similar Web services has led to a new challenge of selecting the most relevant services using quality of service (QoS) aspects. Traditionally, the relevance of a service is determined by computing an overall score that aggregates individual QoS values. Users are required to assign weights to QoS attributes. This is a rather demanding task and an imprecise specification of the weights could result in missing some user desired services. Recent approaches focus on computing service skyline over a set of QoS aspects. This can completely free users from assigning weights to QoS attributes. However, two main drawbacks characterize such approaches. First, the service skyline often privileges services with a bad compromise between different QoS attributes. Second, as the size of the service skyline may be quite large, users will be overwhelmed during the service selection process. In this paper, we introduce a new concept, called alpha-dominant service skyline, to address the above issues and we develop a suitable algorithm for computing it efficiently. Experimental evaluation conducted on synthetically generated datasets, demonstrates both the effectiveness of the introduced concept and the efficiency of the proposed algorithm.

50 citations

Journal Article•DOI•

Flexible and Efficient Resolution of Skyline Query Size Constraints

Hua Lu¹, Christian S. Jensen², Zhenjie Zhang•Institutions (2)

Aalborg University¹, Aarhus University²

01 Jul 2011-IEEE Transactions on Knowledge and Data Engineering

TL;DR: The paper proposes a new approach, called skyline ordering, that forms a skyline-based partitioning of a given data set such that an order exists among the partitions and set-wide maximization techniques can be applied within each partition.

Abstract: Given a set of multidimensional points, a skyline query returns the interesting points that are not dominated by other points. It has been observed that the actual cardinality (s) of a skyline query result may differ substantially from the desired result cardinality (k), which has prompted studies on how to reduce s for the case where k < s. Based on these observations, the paper proposes a new approach, called skyline ordering, that forms a skyline-based partitioning of a given data set such that an order exists among the partitions. Then, set-wide maximization techniques may be applied within each partition. Efficient algorithms are developed for skyline ordering and for resolving size constraints using the skyline order. The results of extensive experiments show that skyline ordering yields a flexible framework for the efficient and scalable resolution of arbitrary size constraints on skyline queries.
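One plausible reading of an ordered, skyline-based partitioning is to peel off successive skyline layers. The Python sketch below follows that reading and is not taken from the paper: `skyline_order` and `resolve_size` are hypothetical names, smaller values are assumed to be better, and the plain slice used to cut the last layer is a placeholder for the set-wide maximization techniques the abstract mentions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """Assumption: smaller is better in every dimension."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points: Sequence[Point]) -> List[Point]:
    return [p for p in points if not any(dominates(q, p) for q in points)]


def skyline_order(points: Sequence[Point]) -> List[List[Point]]:
    """Partition points into layers: the skyline, then the skyline of the rest, ..."""
    remaining = list(points)
    layers: List[List[Point]] = []
    while remaining:
        layer = skyline(remaining)
        layers.append(layer)
        remaining = [p for p in remaining if p not in layer]
    return layers


def resolve_size(points: Sequence[Point], k: int) -> List[Point]:
    """Return k points, taking whole layers first and cutting the last one."""
    result: List[Point] = []
    for layer in skyline_order(points):
        if len(result) + len(layer) <= k:
            result.extend(layer)
        else:
            # Placeholder choice: a real system would pick the best subset here.
            result.extend(layer[: k - len(result)])
            break
    return result


data = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]
print(skyline_order(data))
print(resolve_size(data, 4))
```

The same routine handles both k smaller than the skyline size (the first layer is cut) and k larger than it (later layers are added).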

50 citations

Journal Article•DOI•

Collaborative Filtering with Personalized Skylines

Ilaria Bartolini, Zhenjie Zhang, Dimitris Papadias¹•Institutions (1)

Hong Kong University of Science and Technology¹

01 Feb 2011-IEEE Transactions on Knowledge and Data Engineering

TL;DR: This work proposes Collaborative Filtering Skyline (CFS), a general framework that combines the advantages of CF with those of the skyline operator, and proposes the top-k personalized skyline, where the user specifies the required output cardinality.

Abstract: Collaborative filtering (CF) systems exploit previous ratings and similarity in user behavior to recommend the top-k objects/records which are potentially most interesting to the user assuming a single score per object. However, in various applications, a record (e.g., hotel) may be rated on several attributes (value, service, etc.), in which case simply returning the ones with the highest overall scores fails to capture the individual attribute characteristics and to accommodate different selection criteria. In order to enhance the flexibility of CF, we propose Collaborative Filtering Skyline (CFS), a general framework that combines the advantages of CF with those of the skyline operator. CFS generates a personalized skyline for each user based on scores of other users with similar behavior. The personalized skyline includes objects that are good on certain aspects, and eliminates the ones that are not interesting on any attribute combination. Although the integration of skylines and CF has several attractive properties, it also involves rather expensive computations. We face this challenge through a comprehensive set of algorithms and optimizations that reduce the cost of generating personalized skylines. In addition to exact skyline processing, we develop an approximate method that provides error guarantees. Finally, we propose the top-k personalized skyline, where the user specifies the required output cardinality.

48 citations

Proceedings Article•DOI•

Stochastic skyline operator

Xuemin Lin¹, Ying Zhang¹, Wenjie Zhang¹, Muhammad Aamir Cheema¹•Institutions (1)

University of New South Wales¹

11 Apr 2011

TL;DR: It is shown that the problem of stochastic skyline is NP-complete with respect to the dimensionality, and novel algorithms are developed to efficiently compute the stochastic skyline over multi-dimensional uncertain data, which run in polynomial time if the dimensionality is fixed.

Abstract: In many applications involving the multiple criteria optimal decision making, users may often want to make a personal trade-off among all optimal solutions. As a key feature, the skyline in a multi-dimensional space provides the minimum set of candidates for such purposes by removing all points not preferred by any (monotonic) utility/scoring functions; that is, the skyline removes all objects not preferred by any user no matter how their preferences vary. Driven by many applications with uncertain data, the probabilistic skyline model is proposed to retrieve uncertain objects based on skyline probabilities. Nevertheless, skyline probabilities cannot capture the preferences of monotonic utility functions. Motivated by this, in this paper we propose a novel skyline operator, namely stochastic skyline. In the light of the expected utility principle, stochastic skyline guarantees to provide the minimum set of candidates for the optimal solutions over all possible monotonic multiplicative utility functions. In contrast to the conventional skyline or the probabilistic skyline computation, we show that the problem of stochastic skyline is NP-complete with respect to the dimensionality. Novel and efficient algorithms are developed to compute stochastic skyline over multi-dimensional uncertain data, which run in polynomial time if the dimensionality is fixed. We also show, by theoretical analysis and experiments, that the size of stochastic skyline is quite similar to that of conventional skyline over certain data. Comprehensive experiments demonstrate that our techniques are efficient and scalable regarding both CPU and IO costs.

41 citations

Proceedings Article•DOI•

Authentication of location-based skyline queries

Xin Lin¹, Jianliang Xu¹, Haibo Hu¹•Institutions (1)

Hong Kong Baptist University¹

24 Oct 2011

TL;DR: This paper studies the authentication problem for location-based skyline queries, which have recently been receiving increasing attention in LBS applications, and proposes two authentication methods: one based on the traditional MR-tree index and the other based on a newly developed MR-Sky-tree.

Abstract: In outsourced spatial databases, the location-based service (LBS) provides query services to the clients on behalf of the data owner. However, if the LBS is not trustworthy, it may return incorrect or incomplete query results. Thus, authentication is needed to verify the soundness and completeness of query results. In this paper, we study the authentication problem for location-based skyline queries, which have recently been receiving increasing attention in LBS applications. We propose two authentication methods: one based on the traditional MR-tree index and the other based on a newly developed MR-Sky-tree. Experimental results demonstrate the efficiency of our proposed methods in terms of the authentication cost.

Proceedings Article•DOI•

Skyline query processing over joins

Akrivi Vlachou¹, Christos Doulkeridis¹, Neoklis Polyzotis²•Institutions (2)

Norwegian University of Science and Technology¹, University of California, Santa Cruz²

12 Jun 2011

TL;DR: The novel SFSJ algorithm is introduced that fuses the identification of skyline tuples with the computation of the join and is able to compute the correct skyline set by accessing only a subset of the input tuples, i.e., it has the property of early termination.

Abstract: This paper addresses the problem of efficiently computing the skyline set of a relational join. Existing techniques either require access to all tuples of the input relations or demand specialized multi-dimensional access methods to generate the skyline join result. To avoid these inefficiencies, we introduce the novel SFSJ algorithm that fuses the identification of skyline tuples with the computation of the join. SFSJ is able to compute the correct skyline set by accessing only a subset of the input tuples, i.e., it has the property of early termination. SFSJ employs standard access methods for reading the input tuples and is readily implementable in an existing database system. Moreover, it can be used in pipelined execution plans, as it generates the skyline tuples progressively. Additionally, we formally analyze the performance of SFSJ and propose a novel strategy for accessing the input tuples that is proven to be optimal for SFSJ. Finally, we present an extensive experimental study that validates the effectiveness of SFSJ and demonstrates its advantages over existing techniques.

Journal Article•DOI•

Preference elicitation in prioritized skyline queries

Denis Mindolin¹, Jan Chomicki¹•Institutions (1)

University at Buffalo¹

01 Apr 2011

TL;DR: This work studies p-skyline queries that generalize skyline queries by allowing varying attribute importance in preference relations, and proposes an elicitation algorithm that has high accuracy and good scalability.

Abstract: Preference queries incorporate the notion of binary preference relation into relational database querying. Instead of returning all the answers, such queries return only the best answers, according to a given preference relation. Preference queries are a fast growing area of database research. Skyline queries constitute one of the most thoroughly studied classes of preference queries. A well-known limitation of skyline queries is that skyline preference relations assign the same importance to all attributes. In this work, we study p-skyline queries that generalize skyline queries by allowing varying attribute importance in preference relations. We perform an in-depth study of the properties of p-skyline preference relations. In particular, we study the problems of containment and minimal extension. We apply the obtained results to the central problem of the paper: eliciting relative importance of attributes. Relative importance is implicit in the constructed p-skyline preference relation. The elicitation is based on user-selected sets of superior (positive) and inferior (negative) examples. We show that the computational complexity of elicitation depends on whether inferior examples are involved. If they are not, elicitation can be achieved in polynomial time. Otherwise, it is NP complete. Our experiments show that the proposed elicitation algorithm has high accuracy and good scalability.

Journal Article•DOI•

Ranking uncertain sky: The probabilistic top-k skyline operator

Ying Zhang¹, Wenjie Zhang¹, Xuemin Lin¹, Bin Jiang², Jian Pei²•Institutions (2)

NICTA¹, Simon Fraser University²

01 Jul 2011-Information Systems

TL;DR: An efficient exact algorithm for computing the top-k skyline objects is developed for discrete cases, and an efficient randomized algorithm with an ε-approximation guarantee is developed to address applications where each object may have a massive set of instances or a continuous probability density function.

Proceedings Article•DOI•

Efficient execution plans for distributed skyline query processing

João B. Rocha-Junior¹, Akrivi Vlachou¹, Christos Doulkeridis¹, Kjetil Nørvåg¹•Institutions (1)

Norwegian University of Science and Technology¹

21 Mar 2011

TL;DR: This work introduces SkyPlan, a novel framework for processing distributed skyline queries that generates execution plans aimed at optimizing query processing performance and that consistently outperforms the state-of-the-art algorithm.

Abstract: In this paper, we study the generation of efficient execution plans for skyline query processing in large-scale distributed environments. In such a setting, each server stores autonomously a fraction of the data, thus all servers need to process the skyline query. An execution plan defines the order in which the individual skyline queries are processed on different servers, and influences the performance of query processing. Querying servers consecutively reduces the amount of transferred data and the number of queried servers, since skyline points obtained by one server prune points in the subsequent servers, but also increases the latency of the system. To address this trade-off, we introduce a novel framework, called SkyPlan, for processing distributed skyline queries that generates execution plans aiming at optimizing the performance of query processing. Thus, we quantify the gain of querying consecutively different servers. Then, execution plans are generated that maximize the overall gain, while also taking into account additional objectives, such as bounding the maximum number of hops required for the query or balancing the load on different servers fairly. Finally, we present an algorithm for distributed processing based on the generated plan that continuously refines the execution plan during in-network processing. Our framework consistently outperforms the state-of-the-art algorithm.

Journal Article•DOI•

Asymptotically efficient algorithms for skyline probabilities of uncertain data

Mikhail J. Atallah¹, Yinian Qi¹, Hao Yuan²•Institutions (2)

Purdue University¹, City University of Hong Kong²

02 Jun 2011-ACM Transactions on Database Systems

TL;DR: This work proposes a new algorithm for computing all skyline probabilities that is asymptotically faster than earlier work, and studies the online version of the problem, answering a query for d-dimensional data in $O(n^{1-1/d} \log^{d-1} n)$ time after preprocessing.

Abstract: Skyline computation is widely used in multicriteria decision making. As research in uncertain databases draws increasing attention, skyline queries with uncertain data have also been studied. Some earlier work focused on probabilistic skylines with a given threshold; Atallah and Qi [2009] studied the problem to compute skyline probabilities for all instances of uncertain objects without the use of thresholds, and proposed an algorithm with subquadratic time complexity. In this work, we propose a new algorithm for computing all skyline probabilities that is asymptotically faster: worst-case $O(n\sqrt{n}\log n)$ time and $O(n)$ space for 2D data; $O(n^{2-1/d}\log^{d-1} n)$ time and $O(n\log^{d-2} n)$ space for d-dimensional data. Furthermore, we study the online version of the problem: Given any query point p (unknown until the query time), return the probability that no instance in the given data set dominates p. We propose an algorithm for answering such an online query for d-dimensional data in $O(n^{1-1/d}\log^{d-1} n)$ time after preprocessing the data in $O(n^{2-1/d}\log^{d-1} n)$ time and space.
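The online query in this abstract, the probability that no instance dominates a query point p, can be written as a naive linear scan. The Python sketch below is a baseline for intuition only, not the sublinear algorithm the paper proposes. It assumes a common uncertain-data model that the abstract does not spell out: each object is a list of mutually exclusive (location, probability) instances, different objects are independent, and smaller values are better. The name `prob_not_dominated` is hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, ...]
Instance = Tuple[Point, float]  # (location, probability of that location)
UncertainObject = Sequence[Instance]


def dominates(p: Point, q: Point) -> bool:
    """Assumption: smaller is better in every dimension."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def prob_not_dominated(p: Point, objects: Sequence[UncertainObject]) -> float:
    """Probability that no instance of any object dominates p.

    For each object, sum the probabilities of its instances that dominate p;
    the object fails to dominate p with one minus that mass. Independence
    across objects (an assumption) lets the per-object factors multiply.
    """
    result = 1.0
    for obj in objects:
        dominating_mass = sum(prob for q, prob in obj if dominates(q, p))
        result *= 1.0 - dominating_mass
    return result


objects = [
    [((1, 1), 0.5), ((5, 5), 0.5)],
    [((2, 2), 0.4), ((9, 9), 0.6)],
]
print(prob_not_dominated((3, 3), objects))
print(prob_not_dominated((0, 0), objects))
```

This scan touches every instance for every query, which is the cost the abstract's preprocessing is designed to avoid.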

Book Chapter•DOI•

On different types of fuzzy skylines

Allel Hadjali¹, Olivier Pivert¹, Henri Prade²•Institutions (2)

University of Rennes¹, University of Toulouse²

28 Jun 2011

TL;DR: This paper deals with database preference queries based on the skyline paradigm, which aim at retrieving the tuples not Pareto-dominated by any other, and proposes different ways to fuzzify such queries in order to make them more flexible, to increase their discrimination power, or to make them more drastic or more tolerant.

Abstract: This paper deals with database preference queries based on the skyline paradigm, which aim at retrieving the tuples not Pareto-dominated by any other. We propose different ways to fuzzify such queries in order to make them more flexible, to increase their discrimination power, to make them more drastic or more tolerant. In particular, some of these extensions make it possible to reduce the risk of getting many incomparable tuples, even when the number of dimensions is high.

Proceedings Article•DOI•

Efficient reverse skyline retrieval with arbitrary non-metric similarity measures

Prasad M. Deshpande¹, Deepak P¹•Institutions (1)

IBM¹

21 Mar 2011

TL;DR: This paper considers Reverse Skyline query processing where the distances between attribute values are not necessarily metric, and proposes a method of using group-level reasoning and early pruning to micro-optimize processing by reducing attribute-level comparisons.

Abstract: A Reverse Skyline query returns all objects whose skyline contains the query object. In this paper, we consider Reverse Skyline query processing where the distances between attribute values are not necessarily metric. We outline real-world cases that motivate Reverse Skyline processing in such scenarios. We consider various optimizations to develop efficient algorithms for Reverse Skyline processing. Firstly, we consider block-based processing of objects to optimize IO costs. We then explore pre-processing to re-arrange objects on disk to reduce computational and IO costs. We then present our main contribution, which is a method of using group-level reasoning and early pruning to micro-optimize processing by reducing attribute-level comparisons. An extensive empirical evaluation with real-world datasets and synthetic data of varying characteristics shows that our optimization techniques are indeed very effective in dramatically speeding up Reverse Skyline processing, both in terms of computational costs and IO costs.
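The definition in the first sentence of this abstract can be made concrete with a brute-force Python sketch. It assumes the usual dynamic-skyline reading: an object's skyline is computed over per-attribute distances to that object. The per-attribute distance is passed in as a function, so it need not be a metric. The names `reverse_skyline`, `dyn_dominates`, and `abs_diff` are hypothetical, and none of the paper's block-based or pruning optimizations are modelled.

```python
from typing import Callable, List, Sequence, Tuple

Obj = Tuple[float, ...]
Dist = Callable[[float, float], float]  # per-attribute dissimilarity, any function


def dyn_dominates(a: Obj, b: Obj, ref: Obj, dist: Dist) -> bool:
    """True if a is at least as close to ref as b on every attribute,
    and strictly closer on at least one."""
    da = [dist(x, r) for x, r in zip(a, ref)]
    db = [dist(x, r) for x, r in zip(b, ref)]
    return all(x <= y for x, y in zip(da, db)) and any(x < y for x, y in zip(da, db))


def reverse_skyline(query: Obj, data: Sequence[Obj], dist: Dist) -> List[Obj]:
    """Objects o such that no other object dominates the query with respect to o."""
    result = []
    for i, o in enumerate(data):
        others = (x for j, x in enumerate(data) if j != i)
        if not any(dyn_dominates(x, query, o, dist) for x in others):
            result.append(o)
    return result


def abs_diff(x: float, y: float) -> float:
    """Example dissimilarity; any other two-argument function would do."""
    return abs(x - y)


data = [(1, 1), (2, 2), (10, 10)]
print(reverse_skyline((3, 3), data, abs_diff))
```

Every object is compared against every other object on every attribute here, which is the attribute-level comparison cost that the abstract's optimizations aim to reduce.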

Book Chapter•DOI•

Efficiently evaluating skyline queries on RDF databases

Ling Chen¹, Sidan Gao¹, Kemafor Anyanwu¹•Institutions (1)

North Carolina State University¹

29 May 2011

TL;DR: This paper presents an approach for optimizing skyline queries over RDF data stored using a vertically partitioned schema model based on the concept of a "Header Point" which maintains a concise summary of the already visited regions of the data space.

Abstract: Skyline queries are a class of preference queries that compute the Pareto-optimal tuples from a set of tuples and are valuable for multicriteria decision making scenarios. While this problem has received significant attention in the context of a single relational table, skyline queries over joins of multiple tables that are typical of storage models for RDF data have received much less attention. A naive approach such as a join-first-skyline-later strategy splits the join and skyline computation phases, which limits opportunities for optimization. Other existing techniques for multi-relational skyline queries assume storage and indexing techniques that are not typically used with RDF, which would require a preprocessing step for data transformation. In this paper, we present an approach for optimizing skyline queries over RDF data stored using a vertically partitioned schema model. It is based on the concept of a "Header Point" which maintains a concise summary of the already visited regions of the data space. This summary allows some fraction of nonskyline tuples to be pruned from advancing to the skyline processing phase, thus reducing the overall cost of expensive dominance checks required in the skyline phase. We further present more aggressive pruning rules that result in the computation of near-complete skylines in significantly less time than the complete algorithm. A comprehensive performance evaluation of different algorithms is presented using datasets with different types of data distributions generated by a benchmark data generator.

Proceedings Article•DOI•

(Approximate) uncertain skylines

Peyman Afshani¹, Pankaj K. Agarwal², Lars Arge³, Kasper Green Larsen³, Jeff M. Phillips⁴•Institutions (4)

Dalhousie University¹, Duke University², Aarhus University³, University of Utah⁴

21 Mar 2011

TL;DR: This work considers the problem of computing the probability of each point lying on the skyline, that is, the probability that it is not dominated by any other input point, and improves the best known exact solution.

Abstract: Given a set of points with uncertain locations, we consider the problem of computing the probability of each point lying on the skyline, that is, the probability that it is not dominated by any other input point. If each point's uncertainty is described as a probability distribution over a discrete set of locations, we improve the best known exact solution. We also suggest why we believe our solution might be optimal. Next, we describe simple, near-linear time approximation algorithms for computing the probability of each point lying on the skyline. In addition, some of our methods can be adapted to construct data structures that can efficiently determine the probability of a query point lying on the skyline.

Book Chapter•DOI•

MSSQ: manhattan spatial skyline queries

Wanbin Son¹, Seung-won Hwang¹, Hee-Kap Ahn¹•Institutions (1)

Pohang University of Science and Technology¹

24 Aug 2011

TL;DR: This work presents a simple and efficient algorithm which, given a set P of data points and a set Q of query points in the plane, returns the set of spatial skyline points in just O(|P| log |P|) time, which is significantly lower in complexity than the best known method.

Abstract: Skyline queries have gained attention lately for supporting effective retrieval over massive spatial data. While efficient algorithms have been studied for spatial skyline queries using Euclidean distance, or, L2 norm, these algorithms are (1) still quite computationally intensive and (2) unaware of the road constraints. Our goal is to develop a more efficient algorithm for L1 norm, also known as Manhattan distance, which closely reflects road network distance for metro areas with well-connected road networks. Towards this goal, we present a simple and efficient algorithm which, given a set P of data points and a set Q of query points in the plane, returns the set of spatial skyline points in just O(|P| log |P|) time, assuming that |Q| = |P|. This is significantly lower in complexity than the best known method. In addition to efficiency and applicability, our proposed algorithm has another desirable property of independent computation and extensibility to L∞ norm, which naturally invites parallelism and widens applicability. Our extensive empirical results suggest that our algorithm outperforms the state-of-the-art approaches by orders of magnitude.
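For reference, a spatial skyline under the L1 norm can be computed by brute force as below. This Python sketch is a quadratic baseline for illustrating the query itself, not the O(|P| log |P|) algorithm the paper presents. It assumes the usual definition, in which a data point is dominated if another data point is at least as close to every query point and strictly closer to one. The names `manhattan` and `spatial_skyline` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point2D = Tuple[float, float]


def manhattan(a: Point2D, b: Point2D) -> float:
    """L1 (Manhattan) distance between two points in the plane."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def spatial_skyline(P: Sequence[Point2D], Q: Sequence[Point2D]) -> List[Point2D]:
    """Data points in P not spatially dominated with respect to query points Q."""
    # Map each data point to its vector of distances to the query points.
    dists = [[manhattan(p, q) for q in Q] for p in P]

    def dominated(i: int) -> bool:
        # Point i is dominated if some other point j is no farther from any
        # query point and strictly closer to at least one.
        return any(
            all(dj <= di for dj, di in zip(dists[j], dists[i]))
            and any(dj < di for dj, di in zip(dists[j], dists[i]))
            for j in range(len(P)) if j != i
        )

    return [p for i, p in enumerate(P) if not dominated(i)]


P = [(0, 0), (2, 2), (10, 10), (0, 1)]
Q = [(1, 1), (3, 3)]
print(spatial_skyline(P, Q))
```

Swapping `manhattan` for another distance function changes the norm without touching the dominance test, which makes the baseline handy for checking a faster implementation.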

Journal Article•DOI•

Optimized skyline queries on road networks using nearest neighbors

Maytham Safar¹, Dalal El-Amin¹, David Taniar²•Institutions (2)

Kuwait University¹, Monash University²

01 Dec 2011

TL;DR: This work proposes a new algorithm that requires remarkably fewer network distance calculations; it uses a progressive nearest neighbor algorithm to minimize the set of candidates, then evaluates those candidates by comparing them only to a subset of the discovered skyline points.

Abstract: Skyline queries are used with data-intensive applications, such as mobile location-based services, to support multi-criteria decision-making and to prune the data space by returning the most "interesting" data points. The most interesting data points are those that are not dominated by any other point. The spatial network skyline query is a subset of the skyline query problem in which data points are nodes in a road network and the attributes of the data points are network distances relative to a set of query points. The difficulty with spatial network skyline queries is that these attributes must be calculated with an expensive distance calculation operation. Previous works that addressed this problem (Deng et al., Proceedings of the 23rd international conference on data engineering, 796-805, 2007; Sharifzadeh et al., Proceedings of the 32nd international conference on very large databases, 751-762, 2009) involved extensive network distance calculation between the query points and data points. A new algorithm that requires remarkably fewer network distance calculations is proposed in this work. Our approach uses a progressive nearest neighbor algorithm to minimize the set of candidates, then evaluates those candidates by comparing them only to a subset of the discovered skyline points. Experiments showed the effectiveness of our algorithm compared to previous works.

Journal Article•DOI•

Skyline-based registration of 3D laser scans

Andreas Nüchter¹, Stanislav Gutev¹, Dorit Borrmann¹, Jan Elseberg¹•Institutions (1)

Jacobs University Bremen¹

15 May 2011-Geo-spatial Information Science

TL;DR: The skyline features are extracted from panoramic 3D scans and encoded as strings enabling the use of string matching for merging the scans, and initial results in the old city center of Bremen are presented.

Abstract: Acquisition and registration of terrestrial 3D laser scans is a fundamental task in mapping and modeling of cities in three dimensions. To automate this task, marker-free registration methods are required. This paper proposes a novel method based on the existence of skyline features. The skyline features are extracted from panoramic 3D scans and encoded as strings, enabling the use of string matching for merging the scans. Initial results of the proposed method in the old city center of Bremen are presented.

Book Chapter•DOI•

On possibilistic skyline queries

Patrick Bosc¹, Allel Hadjali¹, Olivier Pivert¹•Institutions (1)

University of Rennes¹

26 Oct 2011

TL;DR: This paper deals with Skyline queries in the context of possibilistic databases, where uncertain attribute values are represented by possibility distributions, and a basic algorithm suited to their evaluation is provided.

Abstract: This paper deals with Skyline queries in the context of possibilistic databases, where uncertain attribute values are represented by possibility distributions. In this framework, Skyline queries aim at computing the extent to which any tuple from a given relation is possibly/certainly not dominated by any other tuple from that relation. Besides the interpretation of possibilistic Skyline queries, a basic algorithm suited to their evaluation is provided.

Book Chapter•DOI•

Dynamic skylines considering range queries

Wen-Chi Wang¹, En Tzu Wang², Arbee L. P. Chen³•Institutions (3)

Chunghwa Telecom¹, Industrial Technology Research Institute², National Chengchi University³

22 Apr 2011

TL;DR: An efficient algorithm based on the grid index and a novel variant of the well-known Z-order curve is proposed to solve the problem of computing dynamic skylines considering range queries and results demonstrate that it is effective and efficient.

Abstract: Dynamic skyline queries are practical in many applications. For example, if no data exist to fully satisfy a query q in an information system, the data "closer" to the requirements of q can be retrieved as answers. Finding the nearest neighbors of q can be one solution; finding the data not dynamically dominated by any other data with respect to q, i.e., the dynamic skyline regarding q, can be another. A data point p is defined to dynamically dominate another data point s if, in each dimension, the distance between p and q is no larger than the corresponding distance between s and q, and in at least one dimension the distance between p and q is smaller than that between s and q. Some approaches for answering dynamic skyline queries have been proposed. However, the existing approaches only consider the query as a point rather than a range in each dimension, although range queries are also frequently issued by users. We make the first attempt to solve the problem of computing dynamic skylines considering range queries in this paper. To deal with this problem, we propose an efficient algorithm based on the grid index and a novel variant of the well-known Z-order curve. Moreover, a series of experiments is performed to evaluate the proposed algorithm, and the experimental results demonstrate that it is effective and efficient.
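
The dynamic dominance definition in this abstract translates directly into code. The Python sketch below covers only the point-query case that the abstract attributes to existing approaches, using a naive pairwise scan. It does not implement the range-query extension, the grid index, or the Z-order variant. Function names are hypothetical.

```python
from typing import List, Sequence

Point = Sequence[float]


def dynamically_dominates(p: Point, s: Point, q: Point) -> bool:
    """True if p dynamically dominates s with respect to the query point q."""
    # Per-dimension distances of p and s to the query point.
    dp = [abs(pi - qi) for pi, qi in zip(p, q)]
    ds = [abs(si - qi) for si, qi in zip(s, q)]
    no_worse = all(a <= b for a, b in zip(dp, ds))
    strictly_better = any(a < b for a, b in zip(dp, ds))
    return no_worse and strictly_better


def dynamic_skyline(points: List[Point], q: Point) -> List[Point]:
    """Points that no other point dynamically dominates with respect to q (naive O(n^2) scan)."""
    return [
        s
        for s in points
        if not any(dynamically_dominates(p, s, q) for p in points)
    ]
```

For the query point (3, 3), the point (4, 4) has per-dimension distances (1, 1) and so dominates (1, 1), whose distances are (2, 2), even though (1, 1) has smaller raw coordinates. This is what distinguishes a dynamic skyline from a conventional one.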

Journal Article•

Skyline Extraction using a Multistage Edge Filtering

Byung-Ju Kim, Jong-Jin Shin, Hwa-Jin Nam, Jin-Soo Kim

20 Jul 2011-World Academy of Science, Engineering and Technology, International Journal of Electrical, Computer, Energetic, Electronic and Communication Engineering

TL;DR: The proposed edge-based skyline extraction algorithm is robust in severe environments with clutter and performs well even on low-resolution infrared sensor images.

Abstract: Skyline extraction in mountainous images can be used for navigation of vehicles or UAVs (unmanned air vehicles), but it is very hard to extract the skyline shape because of clutter such as clouds, sea lines and field borders in images. We developed an edge-based skyline extraction algorithm using a proposed multistage edge filtering (MEF) technique. In this method, the characteristics of clutter in the image are first defined, and then the lines classified as clutter are eliminated in stages using the proposed MEF technique. After this processing, we select the final line from among the remaining lines using skyline measures. The proposed algorithm is robust in severe environments with clutter and performs well even on low-resolution infrared sensor images. We tested the proposed algorithm on images obtained in the field by an infrared camera and confirmed that it produced better performance and faster processing time than conventional algorithms. Keywords: MEF, mountainous image, navigation, skyline

Journal Article•DOI•

A clustering based approach for skyline diversity

Zhenhua Huang¹, Yang Xiang¹, Bo Zhang¹, Xiaoling Liu²•Institutions (2)

Tongji University¹, Fudan University²

01 Jul 2011-Expert Systems With Applications

TL;DR: An efficient evaluation approach is proposed which is based on the circinal index to seamlessly integrate subspace skyline computation, K-means clustering and representatives selection, and returns K "representative" and "diverse" skyline objects to users.

Abstract: Skyline query processing has recently received a lot of attention in the database and data-mining communities. To the best of our knowledge, existing research mainly focuses on how to efficiently return the whole skyline set. However, when the cardinality and dimensionality of input objects increase, the number of skylines grows exponentially, and hence this "huge" skyline set is completely useless to users. On the other hand, in most real applications, the objects are usually clustered, and therefore many objects have similar attribute values. Motivated by the above facts, in this paper, we present a novel type of SkyCluster query to capture the skyline diversity and improve the usefulness of the skyline result. The SkyCluster query integrates K-means clustering into skyline computation, and returns K "representative" and "diverse" skyline objects to users. To process such a query, a straightforward approach is to simply combine the existing techniques developed for skyline-only and clustering-only processing. But this approach is costly, since skyline computation and K-means clustering are both CPU-sensitive. We propose an efficient evaluation approach which is based on the circinal index to seamlessly integrate subspace skyline computation, K-means clustering and representatives selection. Also, we present a novel optimization heuristic to further improve the query performance. An experimental study shows that our approach is both efficient and effective.

Book Chapter•DOI•

Categorical data skyline using classification tree

Wookey Lee¹, Justin J. Song¹, Carson K. Leung²•Institutions (2)

Inha University¹, University of Manitoba²

18 Apr 2011

TL;DR: This paper pioneers an entirely new domain for skyline queries, namely categorical data, for which the corresponding ranking measures are developed, and tests the proposed algorithm using the ACM Computing Classification System.

Abstract: The skyline query is an effective method to process large multidimensional data sets, as it can pinpoint the target data so that dominated data (say, 95% of data) can be efficiently excluded as unnecessary data objects. However, most of the conventional skyline algorithms were developed to handle numerical data. Thus, most text data were excluded from being processed by the algorithms. In this paper, we pioneer an entirely new domain for skyline queries, namely categorical data, for which the corresponding ranking measures for the skyline queries are developed. We tested our proposed algorithm using the ACM Computing Classification System.

Journal Article•DOI•

Spatial skyline queries: exact and approximation algorithms

Mu-Woong Lee¹, Wanbin Son¹, Hee-Kap Ahn¹, Seung-won Hwang¹•Institutions (1)

Pohang University of Science and Technology¹

01 Oct 2011-Geoinformatica

TL;DR: This work presents a simple and efficient algorithm that computes the correct results, and proposes a fast approximation algorithm that returns a desirable subset of the skyline results.

Abstract: As more data-intensive applications emerge, advanced retrieval semantics, such as ranking and skylines, have attracted the attention of researchers. Geographic information systems are a good example of an application using a massive amount of spatial data. Our goal is to efficiently support exact and approximate skyline queries over massive spatial datasets. A spatial skyline query, consisting of multiple query points, retrieves data points that are not farther than any other data points from all query points. To achieve this goal, we present a simple and efficient algorithm that computes the correct results, and we also propose a fast approximation algorithm that returns a desirable subset of the skyline results. In addition, we propose a continuous query algorithm to trace changes of skyline points while a query point moves. To validate the effectiveness and efficiency of our algorithm, we provide an extensive empirical comparison between our algorithms and the best known spatial skyline algorithms from several perspectives.

Journal Article•DOI•

Skyline and mapping aware join query evaluation

Venkatesh Raghavan¹, Elke A. Rundensteiner¹, Shweta Srivastava¹•Institutions (1)

Worcester Polytechnic Institute¹

01 Sep 2011-Information Systems

TL;DR: A robust execution framework called SKIN is proposed to evaluate skyline over joins and is shown to be robust for both skyline-friendly (independent and correlated) as well as skyline-unfriendly (anti-correlated) data distributions.
