Showing papers on "Skyline" published in 2014

Open Access

Journal Article•DOI•

Panorama: a targeted proteomics knowledge base.

Vagisha Sharma¹, Josh Eckels, Greg K. Taylor, Nicholas J. Shulman¹, Andrew B. Stergachis¹, Shannon A. Joyner², Ping Yan³, Jeffrey R. Whiteaker³, Goran N. Halusa⁴, Birgit Schilling⁵, Bradford W. Gibson⁵, Christopher M. Colangelo⁶, Amanda G. Paulovich³, Steven A. Carr⁷, Jacob D. Jaffe⁷, Michael J. MacCoss¹, Brendan MacLean¹•Institutions (7)

University of Washington¹, Carnegie Mellon University², Fred Hutchinson Cancer Research Center³, Leidos⁴, Buck Institute for Research on Aging⁵, Yale University⁶, Broad Institute⁷

18 Aug 2014-Journal of Proteome Research

TL;DR: Panorama allows laboratories to store and organize curated results contained in Skyline documents with fine-grained permissions, which facilitates distributed collaboration and secure sharing of published and unpublished data via a web-browser interface.

Abstract: Panorama is a web application for storing, sharing, analyzing, and reusing targeted assays created and refined with Skyline,1 an increasingly popular Windows client software tool for targeted proteomics experiments. Panorama allows laboratories to store and organize curated results contained in Skyline documents with fine-grained permissions, which facilitates distributed collaboration and secure sharing of published and unpublished data via a web-browser interface. It is fully integrated with the Skyline workflow and supports publishing a document directly to a Panorama server from the Skyline user interface. Panorama captures the complete Skyline document information content in a relational database schema. Curated results published to Panorama can be aggregated and exported as chromatogram libraries. These libraries can be used in Skyline to pick optimal targets in new experiments and to validate peak identification of target peptides. Panorama is open-source and freely available. It is distributed as ...

191 citations

Proceedings Article•DOI•

Stochastic skyline route planning under time-varying uncertainty

Bin Yang¹, Chenjuan Guo¹, Christian S. Jensen², Manohar Kaul¹, Shuo Shang³•Institutions (3)

Aarhus University¹, Aalborg University², China University of Petroleum³

01 Mar 2014

TL;DR: A multi-cost, time-dependent, uncertain graph (MTUG) model of a road network based on GPS data from vehicles that traversed the road network is defined and efficient algorithms to retrieve stochastic skyline routes for a given source-destination pair and a start time are proposed.

Abstract: Different uses of a road network call for the consideration of different travel costs: in route planning, travel time and distance are typically considered, and greenhouse gas (GHG) emissions are increasingly being considered. Further, travel costs such as travel time and GHG emissions are time-dependent and uncertain. To support such uses, we propose techniques that enable the construction of a multi-cost, time-dependent, uncertain graph (MTUG) model of a road network based on GPS data from vehicles that traversed the road network. Based on the MTUG, we define stochastic skyline routes that consider multiple costs and time-dependent uncertainty, and we propose efficient algorithms to retrieve stochastic skyline routes for a given source-destination pair and a start time. Empirical studies with three road networks in Denmark and a substantial GPS data set offer insight into the design properties of the MTUG and the efficiency of the stochastic skyline routing algorithms.

122 citations

Proceedings Article•DOI•

Efficient Skyline Computation in MapReduce

Kasper Mullesgaard, Jens Laurits Pedersen, Hua Lu, Yongluan Zhou

01 Jan 2014

TL;DR: A novel approach to compute skylines efficiently in MapReduce is proposed, using a grid partitioning scheme to divide the data space into partitions, and employing a bitstring to represent the partitions.

Abstract: Skyline queries are useful for finding interesting tuples from a large data set according to multiple criteria. The sizes of data sets are constantly increasing and the architecture of back-ends are switching from single-node environments to non-conventional paradigms like MapReduce. Despite the usefulness of skyline queries, existing works on skyline computation in MapReduce do not take full advantage of parallelism but still run significant parts serially. In this paper, we propose a novel approach to compute skylines efficiently in MapReduce. We design a grid partitioning scheme to divide the data space into partitions, and employ a bitstring to represent the partitions. The bitstring is efficiently obtained in MapReduce, and it clearly helps prune partitions (and tuples) that cannot have skyline tuples. Based on the grid partitioning, we propose two MapReduce algorithms to compute skylines. Both algorithms utilize the bitstring and distribute the original tuples to multiple mappers and make use of them to compute local skylines in parallel. In particular, MapReduce Grid Partitioning based Single-Reducer Skyline Computation (MR-GPSRS) employs a single reducer to assemble the local skylines appropriately to compute the global skyline. In contrast, MapReduce Grid Partitioning based Multiple Reducer Skyline Computation (MR-GPMRS) further divides local skylines and distributes them to multiple reducers that compute the global skyline in an independent and parallel manner. The proposed algorithms are evaluated through extensive experiments, and the results show that MR-GPMRS significantly outperforms the alternatives in various settings.

68 citations
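
The grid-partitioning idea in the abstract above can be illustrated with a small single-process sketch. This is not the MR-GPSRS or MR-GPMRS algorithm from the paper: the function names (`grid_skyline`, `cell_of`), the unit-cube data range, and the "smaller is better" convention are assumptions made for illustration, and the set of occupied cells stands in for the bitstring. It only shows why an occupied cell can rule out whole other cells before any tuple comparison.

```python
def dominates(a, b):
    """a dominates b: no worse on every dimension, strictly better on one (smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def local_skyline(points):
    """Brute-force skyline of a list of tuples."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def cell_of(point, cells_per_dim):
    # Assumption: every coordinate lies in [0, 1).
    return tuple(min(int(x * cells_per_dim), cells_per_dim - 1) for x in point)


def grid_skyline(points, cells_per_dim=4):
    # "Map" step: bucket tuples by grid cell. The set of non-empty cells
    # plays the role of the bitstring in this sketch.
    buckets = {}
    for p in points:
        buckets.setdefault(cell_of(p, cells_per_dim), []).append(p)
    occupied = set(buckets)

    def pruned(cell):
        # If an occupied cell is strictly smaller on every coordinate, each of
        # its tuples dominates every tuple in `cell`, so `cell` can be dropped.
        return any(all(o < c for o, c in zip(other, cell)) for other in occupied)

    survivors = [pts for cell, pts in buckets.items() if not pruned(cell)]
    # Local skylines per surviving cell, then a single merge ("one reducer").
    candidates = [p for pts in survivors for p in local_skyline(pts)]
    return sorted(local_skyline(candidates))
```

In a real MapReduce job the per-cell local skylines would be computed by separate mappers; here they run in one loop purely to keep the sketch short.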

Journal Article•DOI•

Scalable skyline computation using a balanced pivot selection technique

Jongwuk Lee¹, Seung-won Hwang¹•Institutions (1)

Pohang University of Science and Technology¹

01 Jan 2014-Information Systems

TL;DR: This work develops a novel technique to select a cost-optimal point, called a pivot point, that minimizes the number of comparisons in point-based space partitioning, and designs an efficient greedy algorithm for the k representative skyline using the skytree.

57 citations

Proceedings Article•DOI•

Crowd-powered find algorithms

Anish Das Sarma, Aditya Parameswaran¹, Hector Garcia-Molina², Alon Halevy³•Institutions (3)

Urbana University¹, Stanford University², Google³

19 May 2014

TL;DR: This work formally defines the problem of using humans to find a bounded number of items satisfying certain properties from a data set, and designs optimal algorithms that span the skyline of cost and time, i.e., provide designers the ability to control the cost vs. time trade-off.

Abstract: We consider the problem of using humans to find a bounded number of items satisfying certain properties, from a data set. For instance, we may want humans to identify a select number of travel photos from a data set of photos to display on a travel website, or a candidate set of resumes that meet certain requirements from a large pool of applicants. Since data sets can be enormous, and since monetary cost and latency of data processing with humans can be large, optimizing the use of humans for finding items is an important challenge. We formally define the problem using the metrics of cost and time, and design optimal algorithms that span the skyline of cost and time, i.e., we provide designers the ability to control the cost vs. time trade-off. We study the deterministic as well as error-prone human answer settings, along with multiplicative and additive approximations. Lastly, we study how we may design algorithms with specific expected cost and time measures.

55 citations

Journal Article•DOI•

On Skyline Groups

Nan Zhang¹, Chengkai Li², Naeemul Hassan², Sundaresan Rajasekaran¹, Gautam Das²•Institutions (2)

George Washington University¹, University of Texas at Arlington²

01 Apr 2014-IEEE Transactions on Knowledge and Data Engineering

TL;DR: Two anti-monotonic properties with varying degrees of applicability are identified: order-specific property which applies to SUM, MIN, and MAX as well as weak candidate-generation property which applies to MIN and MAX only.

Abstract: We formulate and investigate the novel problem of finding the skyline k-tuple groups from an n-tuple data set-i.e., groups of k tuples which are not dominated by any other group of equal size, based on aggregate-based group dominance relationship. The major technical challenge is to identify effective anti-monotonic properties for pruning the search space of skyline groups. To this end, we first show that the anti-monotonic property in the well-known Apriori algorithm does not hold for skyline group pruning. Then, we identify two anti-monotonic properties with varying degrees of applicability: order-specific property which applies to SUM, MIN, and MAX as well as weak candidate-generation property which applies to MIN and MAX only. Experimental results on both real and synthetic data sets verify that the proposed algorithms achieve orders of magnitude performance gain over the baseline method.

48 citations
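
To make the notion of aggregate-based group dominance concrete, the following is a brute-force baseline sketch, not the pruning algorithms proposed in the paper. It assumes smaller values are preferred and that a group is summarized by applying one aggregate function (`sum`, `min`, or `max`) per dimension; the names `aggregate` and `skyline_groups` are hypothetical.

```python
from itertools import combinations


def dominates(a, b):
    """a dominates b: no worse on every dimension, strictly better on one (smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def aggregate(group, fn):
    """Collapse a group of tuples into one vector by aggregating each dimension."""
    return tuple(fn(column) for column in zip(*group))


def skyline_groups(points, k, fn=sum):
    """Enumerate all k-tuple groups and keep those whose aggregate vector is not dominated."""
    groups = list(combinations(points, k))
    vectors = [aggregate(g, fn) for g in groups]
    return [g for g, v in zip(groups, vectors)
            if not any(dominates(w, v) for w in vectors)]
```

Enumerating all C(n, k) groups is exactly the cost that anti-monotonic pruning is meant to avoid, so this sketch is only useful for checking small cases.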

Journal Article•DOI•

Toward efficient multidimensional subspace skyline computation

Jongwuk Lee¹, Seung-won Hwang¹•Institutions (1)

Pohang University of Science and Technology¹

01 Feb 2014

TL;DR: This paper addresses skyline groups in which a skyline point (or a set of skyline points) is annotated with decisive subspaces and develops orthogonal optimization principles that benefit both approaches of multidimensional subspace skyline computation.

Abstract: Skyline queries have attracted considerable attention to assist multicriteria analysis of large-scale datasets. In this paper, we focus on multidimensional subspace skyline computation that has been actively studied for two approaches. First, to narrow down a full-space skyline, users may consider multiple subspace skylines reflecting their interest. For this purpose, we tackle the concept of a skycube, which consists of all possible non-empty subspace skylines in a given full space. Second, to understand diverse semantics of subspace skylines, we address skyline groups in which a skyline point (or a set of skyline points) is annotated with decisive subspaces. Our primary contributions are to identify common building blocks of the two approaches and to develop orthogonal optimization principles that benefit both approaches. Our experimental results show the efficiency of proposed algorithms by comparing them with state-of-the-art algorithms in both synthetic and real-life datasets.

38 citations
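
The skycube mentioned in the abstract above (all non-empty subspace skylines of a full space) can be sketched naively as follows. This is a definition-level illustration under the assumption that smaller values are preferred; the function names are hypothetical and none of the paper's optimization principles are implemented.

```python
from itertools import combinations


def dominates(a, b):
    """a dominates b: no worse on every dimension, strictly better on one (smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def subspace_skyline(points, dims):
    """Skyline of the points when only the dimensions in `dims` are compared."""
    def project(p):
        return tuple(p[d] for d in dims)
    return [p for p in points
            if not any(dominates(project(q), project(p)) for q in points)]


def skycube(points):
    """Map every non-empty subset of dimensions to its subspace skyline."""
    d = len(points[0])
    return {dims: subspace_skyline(points, dims)
            for r in range(1, d + 1)
            for dims in combinations(range(d), r)}
```

For d dimensions the result has 2^d - 1 entries, which is why sharing work across subspaces matters in practice.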

Journal Article•DOI•

Authenticating Location-Based Skyline Queries in Arbitrary Subspaces

Xin Lin¹, Jianliang Xu², Haibo Hu², Wang-Chien Lee³•Institutions (3)

East China Normal University¹, Hong Kong Baptist University², Pennsylvania State University³

01 Jun 2014-IEEE Transactions on Knowledge and Data Engineering

TL;DR: A prefetching-based approach is developed that enables clients to compute new LASQ results locally during movement, without frequently contacting the server for query re-evaluation, and a basic Merkle Skyline R-tree method and a novel Partial S4-tree method to authenticate one-shot LASQs are proposed.

Abstract: With the ever-increasing use of smartphones and tablet devices, location-based services (LBSs) have experienced explosive growth in the past few years. To scale up services, there has been a rising trend of outsourcing data management to Cloud service providers, which provide query services to clients on behalf of data owners. However, in this data-outsourcing model, the service provider can be untrustworthy or compromised, thereby returning incorrect or incomplete query results to clients, intentionally or not. Therefore, empowering clients to authenticate query results is imperative for outsourced databases. In this paper, we study the authentication problem for location-based arbitrary-subspace skyline queries (LASQs), which represent an important class of LBS applications. We propose a basic Merkle Skyline R-tree method and a novel Partial S4-tree method to authenticate one-shot LASQs. For the authentication of continuous LASQs, we develop a prefetching-based approach that enables clients to compute new LASQ results locally during movement, without frequently contacting the server for query re-evaluation. Experimental results demonstrate the efficiency of our proposed methods and algorithms under various system settings.

36 citations

Journal Article•DOI•

On efficient reverse skyline query processing

Yunjun Gao¹, Qing Liu¹, Baihua Zheng², Gang Chen¹•Institutions (2)

Zhejiang University¹, Singapore Management University²

01 Jun 2014-Expert Systems With Applications

TL;DR: Several efficient algorithms for exact RSQ processing over multidimensional datasets are proposed, which utilize a conventional data-partitioning index on the dataset P, and employ precomputation, reuse, and pruning techniques to boost the query performance.

Abstract: We propose two efficient algorithms for exact RSQ processing. We use precomputation, reuse, and pruning techniques to boost query performance. We extend our techniques to tackle a natural variant of RSQ, i.e., CRSQ. Extensive experiments show that our algorithms outperform RSSA by 2-3 times. Given a D-dimensional data set P and a query point q, a reverse skyline query (RSQ) returns all the data objects in P whose dynamic skyline contains q. It is important for many real life applications such as business planning and environmental monitoring. Currently, the state-of-the-art algorithm for answering the RSQ is the reverse skyline using skyline approximations (RSSA) algorithm, which is based on the precomputed approximations of the skylines. Although RSSA has some desirable features, e.g., applicability to arbitrary data distributions and dimensions, it requires multiple accesses of the same nodes, incurring redundant I/O and CPU costs. In this paper, we propose several efficient algorithms for exact RSQ processing over multidimensional datasets. Our methods utilize a conventional data-partitioning index (e.g., R-tree) on the dataset P, and employ precomputation, reuse, and pruning techniques to boost the query performance. In addition, we extend our techniques to tackle a natural variant of the RSQ, i.e., constrained reverse skyline query (CRSQ), which retrieves the reverse skyline inside a specified constrained region. Extensive experimental evaluation using both real and synthetic datasets demonstrates that our proposed algorithms outperform RSSA by several orders of magnitude under all experimental settings.

36 citations
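
The RSQ definition quoted in the abstract (return every object in P whose dynamic skyline contains q) can be checked directly with a quadratic brute-force sketch. It uses no index, precomputation, or pruning, so it is not one of the paper's algorithms; the function name is hypothetical, and the sketch assumes the usual convention that the dynamic skyline of p is computed on coordinate-wise absolute differences to p, with p itself excluded from the competitors.

```python
def dominates(a, b):
    """a dominates b: no worse on every dimension, strictly better on one (smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def reverse_skyline(points, q):
    """Return every p in `points` whose dynamic skyline would contain q."""
    result = []
    for i, p in enumerate(points):
        # q as seen from p: coordinate-wise distance to p.
        q_rel = tuple(abs(a - b) for a, b in zip(q, p))
        # Every other data point as seen from p.
        others = (tuple(abs(a - b) for a, b in zip(r, p))
                  for j, r in enumerate(points) if j != i)
        # q is in p's dynamic skyline if no other point is closer to p
        # on all dimensions and strictly closer on at least one.
        if not any(dominates(o, q_rel) for o in others):
            result.append(p)
    return result
```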

Journal Article•DOI•

Taking the Big Picture: representative skylines based on significance and diversity

Matteo Magnani¹, Ira Assent², Michael Lind Mortensen²•Institutions (2)

Uppsala University¹, Aarhus University²

01 Oct 2014

TL;DR: This paper introduces a novel approach taking both the significance of all the records and their diversity into account, adapting to available knowledge of the scoring function, but also working under complete ignorance.

Abstract: The skyline is a popular operator to extract records from a database when a record scoring function is not available. However, the result of a skyline query can be very large. The problem addressed in this paper is the automatic selection of a small number (k) of representative skyline records. Existing approaches have only focused on partial aspects of this problem. Some try to identify sets of diverse records giving an overall approximation of the skyline. These techniques, however, are sensitive to the scaling of attributes or to the insertion of non-skyline records into the database. Others exploit some knowledge of the record scoring function to identify the most significant record, but not sets of records representative of the whole skyline. In this paper, we introduce a novel approach taking both the significance of all the records and their diversity into account, adapting to available knowledge of the scoring function, but also working under complete ignorance. We show the intractability of the problem and present approximate algorithms. We experimentally show that our approach is efficient, scalable and that it improves existing works in terms of the significance and diversity of the results.

36 citations

Journal Article•DOI•

Processing k-skyband, constrained skyline, and group-by skyline queries on incomplete data

Yunjun Gao¹, Xiaoye Miao¹, Huiyong Cui¹, Gang Chen¹, Qing Li²•Institutions (2)

Zhejiang University¹, City University of Hong Kong²

01 Aug 2014-Expert Systems With Applications

TL;DR: This paper is the first study of k-skyband query processing on incomplete data, where multi-dimensional data items are missing some values of their dimensions, and formalizes the problem, and presents two efficient algorithms for processing it.

Abstract: The skyline operator has been extensively explored in the literature, and most of the existing approaches assume that all dimensions are available for all data items. However, many practical applications such as sensor networks, decision making, and location-based services may involve incomplete data items, i.e., some dimensional values are missing due to device failure or privacy preservation. To our knowledge, this paper is the first study of k-skyband (kSB) query processing on incomplete data, where multi-dimensional data items are missing some values of their dimensions. We formalize the problem, and then present two efficient algorithms for processing it. Our methods introduce some novel concepts including expired skyline, shadow skyline, and thickness warehouse, in order to boost the search performance. As a second step, we extend our techniques to tackle constrained skyline (CS) and group-by skyline (GBS) queries over incomplete data. Extensive experiments with both real and synthetic data sets demonstrate the effectiveness and efficiency of our proposed algorithms under various experimental settings.
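
A k-skyband query over incomplete data can be sketched by brute force as below. The sketch assumes a convention commonly used for incomplete data, namely that two items are compared only on the dimensions both have observed, that missing values are encoded as `None`, and that smaller values are preferred. The function names are hypothetical, and the paper's expired skyline, shadow skyline, and thickness warehouse concepts are not modelled.

```python
def dominates_incomplete(a, b):
    """a dominates b on the dimensions observed in both (None marks a missing value)."""
    common = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    return all(x <= y for x, y in common) and any(x < y for x, y in common)


def k_skyband(points, k):
    """Items dominated by fewer than k other items; k = 1 gives the skyline."""
    return [p for p in points
            if sum(dominates_incomplete(q, p) for q in points) < k]
```

Under this convention dominance is no longer transitive and can even be cyclic, which is one reason techniques built for complete data do not carry over directly.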

Journal Article•DOI•

A framework for installable external tools in Skyline

Daniel Broudy¹, Trevor Killeen¹, Meena Choi¹, Nicholas J. Shulman¹, Deepak R. Mani¹, Susan E. Abbatiello¹, Deepak Mani¹, Rushdy Ahmad¹, Alexandria K. Sahu¹, Birgit Schilling¹, Kaipo Tamura¹, Yuval Boss¹, Vagisha Sharma¹, Bradford W. Gibson¹, Steven A. Carr¹, Olga Vitek¹, Michael J. MacCoss¹, Brendan MacLean¹•Institutions (1)

Buck Institute for Research on Aging¹

01 Sep 2014-Bioinformatics

TL;DR: The new external tools framework allows researchers to integrate their tools into Skyline without modifying the Skyline codebase, and tool developers can now easily share their tools with proteomics researchers using Skyline.

Abstract: Summary: Skyline is a Windows client application for targeted proteomics method creation and quantitative data analysis. The Skyline document model contains extensive mass spectrometry data from targeted proteomics experiments performed using selected reaction monitoring, parallel reaction monitoring and data-independent and data-dependent acquisition methods. Researchers have developed software tools that perform statistical analysis of the experimental data contained within Skyline documents. The new external tools framework allows researchers to integrate their tools into Skyline without modifying the Skyline codebase. Installed tools provide point-and-click access to downstream statistical analysis of data processed in Skyline. The framework also specifies a uniform interface to format tools for installation into Skyline. Tool developers can now easily share their tools with proteomics researchers using Skyline. Availability and implementation: Skyline is available as a single-click self-updating web installation at http://skyline.maccosslab.org. This Web site also provides access to installable external tools and documentation. Contact: brendanx@u.washington.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Proceedings Article•DOI•

Introduction of an outranking method in the Cloud computing research and Selection System based on the Skyline

Manar Abourezq, Abdellah Idrissi

28 May 2014

TL;DR: An improvement of the Cloud Service Research and Selection System (CSRSS), conceiving an Agent that uses both the Skyline and an outranking method to determine which Cloud services best meet the users' requirements.

Abstract: In this paper, we present an improvement of the Cloud Service Research and Selection System (CSRSS) which allows Cloud users to search through Cloud services in the database and find the ones that match their requirements. The CSRSS is based on the Skyline and showed some promising first results. We consider that the research and selection of a Cloud service among a set of Cloud services is a choice problem. So, in this work, we thought of using outranking methods in order to better refine the results. Our work's main contribution is conceiving an Agent that uses both the Skyline and an outranking method to determine which Cloud services best meet the users' requirements. Empirical results show that our method is effective and very promising.

Journal Article•DOI•

MSSQ: Manhattan Spatial Skyline Queries

Wanbin Son¹, Seung-won Hwang¹, Hee-Kap Ahn¹•Institutions (1)

Pohang University of Science and Technology¹

01 Mar 2014-Information Systems

TL;DR: This work presents a simple and efficient algorithm which, given a set P of data points and a set Q of query points in the plane, returns the set of spatial skyline points in just O(|P| log |P|) time, which is significantly lower in complexity than the best known method.
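
For reference, the spatial skyline itself can be written down as a brute-force check. The sketch below is a quadratic-time baseline, not the O(|P| log |P|) algorithm from the paper. It assumes the standard definition in which a data point p spatially dominates p' when p is at least as close to every query point and strictly closer to at least one, with closeness measured in Manhattan (L1) distance; the function names are hypothetical.

```python
def dominates(a, b):
    """a dominates b: no worse on every dimension, strictly better on one (smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def manhattan(a, b):
    """L1 distance between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def manhattan_spatial_skyline(data_points, query_points):
    """Data points whose vector of distances to the query points is not dominated."""
    dist = {p: tuple(manhattan(p, q) for q in query_points) for p in data_points}
    return [p for p in data_points
            if not any(dominates(dist[o], dist[p]) for o in data_points)]
```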

Journal Article•DOI•

Parallelizing skyline queries over uncertain data streams with sliding window partitioning and grid index

Xiaoyong Li¹, Yijie Wang¹, Xiaoling Li¹, Yuan Wang¹•Institutions (1)

National University of Defense Technology¹

01 Nov 2014-Knowledge and Information Systems

TL;DR: An effective framework is proposed, named distributed parallel framework to address the skyline query problem over uncertain data streams with the sliding window streaming model, and an efficient approach is proposed to further optimize the parallel skyline computation with an optimized streaming item mapping strategy and the grid index.

Abstract: Skyline query processing over uncertain data streams has attracted considerable attention in the database community recently, due to its importance in helping users make intelligent decisions over complex data in many real applications. Although many recent efforts have addressed skyline computation over data streams in a centralized environment, typically with one processor, they cannot be well adapted to skyline queries over complex uncertain streaming data, due to the computational complexity of the query and the limited processing capability. Furthermore, none of the existing studies on parallel skyline computation can effectively address the skyline query problem over uncertain data streams, as they are all developed to address the problem of parallel skyline queries over static certain data sets. In this paper, we formally define the parallel query problem over uncertain data streams with the sliding window streaming model. Particularly, for the first time, we propose an effective framework, named distributed parallel framework, to address the problem based on the sliding window partitioning. Furthermore, we propose an efficient approach (parallel streaming skyline) to further optimize the parallel skyline computation with an optimized streaming item mapping strategy and the grid index. Extensive experiments with real deployment over synthetic and real data are conducted to demonstrate the effectiveness and efficiency of the proposed techniques.

Book Chapter•DOI•

Linear Path Skyline Computation in Bicriteria Networks

Michael Shekelyan¹, Gregor Jossé¹, Matthias Schubert¹, Hans-Peter Kriegel¹•Institutions (1)

Ludwig Maximilian University of Munich¹

21 Apr 2014

TL;DR: This paper examines the subset of the path skyline which is optimal under the most common type of preference function, the weighted sum, and introduces a new algorithm to compute all linearly non-dominated paths denoted as linear path skyline.

Abstract: A bicriteria network is an interlinked data set where edges are labeled with two cost attributes. An example is a road network where edges represent road segments being labeled with traversal time and energy consumption. To measure the proximity of two nodes in network data, the common method is to compute a cost optimal path between the nodes. In a bicriteria network, there often is no unique path being optimal w.r.t. both types of cost. Instead, a path skyline describes the set of non-dominated paths that are optimal under varying preference functions. In this paper, we examine the subset of the path skyline which is optimal under the most common type of preference function, the weighted sum. We will examine characteristics of this more strict domination relation. Furthermore, we introduce techniques to efficiently maintain the set of linearly non-dominated paths. Finally, we will introduce a new algorithm to compute all linearly non-dominated paths denoted as linear path skyline. In our experimental evaluation, we will compare our new approach to other methods for computing the linear skyline and efficient approaches to compute path skylines.
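
The relationship between the path skyline and its linearly non-dominated subset can be illustrated on plain cost pairs. The sketch below assumes the candidate paths have already been enumerated and reduced to (cost1, cost2) tuples, which sidesteps the actual network search the paper addresses. It keeps only the cost pairs that are the unique minimizer of some weighted sum, i.e. the strict vertices of the lower-left convex hull; dropping collinear points is a choice made here for simplicity, and the function names are hypothetical.

```python
def skyline_2d(costs):
    """Conventional skyline of 2-D cost pairs (smaller is better), sorted by first cost."""
    best = []
    for c in sorted(set(costs)):
        if not best or c[1] < best[-1][1]:
            best.append(c)
    return best


def cross(o, a, b):
    """Cross product of vectors o->a and o->b; positive means a left (counter-clockwise) turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def linear_skyline(costs):
    """Skyline points that lie on the lower-left convex hull of the cost pairs."""
    hull = []
    for c in skyline_2d(costs):
        # Pop the previous point while it lies on or above the segment hull[-2] -> c.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], c) <= 0:
            hull.pop()
        hull.append(c)
    return hull
```

With the cost pairs (1, 10), (2, 9), (4, 3), (8, 2), the pair (2, 9) is in the conventional skyline but is never optimal under any weighted sum, so it is dropped from the linear skyline.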

Book Chapter•DOI•

APSkyline: Improved Skyline Computation for Multicore Architectures

Stian Liknes¹, Akrivi Vlachou¹, Christos Doulkeridis², Kjetil Nørvåg¹•Institutions (2)

Norwegian University of Science and Technology¹, University of Piraeus²

21 Apr 2014

TL;DR: APSkyline is a new approach for multicore skyline query processing, which adheres to the partition-execute-merge framework and employs an angle-based partitioning approach, which increases the degree of pruning that can be achieved in the execute phase, thus significantly reducing the number of candidate points that need to be checked in the final merging phase.

Abstract: The trend towards in-memory analytics and CPUs with an increasing number of cores calls for new algorithms that can efficiently utilize the available resources. This need is particularly evident in the case of CPU-intensive query operators. One example of such a query with applicability in data analytics is the skyline query. In this paper, we present APSkyline, a new approach for multicore skyline query processing, which adheres to the partition-execute-merge framework. Contrary to existing research, we focus on the partitioning phase to achieve significant performance gains, an issue largely overlooked in previous work in multicore processing. In particular, APSkyline employs an angle-based partitioning approach, which increases the degree of pruning that can be achieved in the execute phase, thus significantly reducing the number of candidate points that need to be checked in the final merging phase. APSkyline is extremely efficient for hard cases of skyline processing, as in the cases of datasets with large skyline result sets, where it is meaningful to exploit multicore processing.
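
The partition-execute-merge framework with angle-based partitioning can be sketched sequentially for the two-dimensional case. This is an illustration under several assumptions, not the APSkyline implementation: points are non-negative 2-D tuples, smaller is better, the angular range [0, π/2] is cut into equal-width slices, and the per-partition work runs in a plain loop instead of on separate cores. The function names are hypothetical.

```python
import math


def dominates(a, b):
    """a dominates b: no worse on every dimension, strictly better on one (smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def local_skyline(points):
    """Brute-force skyline of a list of tuples."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def angle_partition(points, num_partitions):
    """Partition phase: assign each 2-D point to a slice by its polar angle."""
    parts = [[] for _ in range(num_partitions)]
    width = (math.pi / 2) / num_partitions
    for p in points:
        angle = math.atan2(p[1], p[0])  # in [0, pi/2] for non-negative coordinates
        parts[min(int(angle / width), num_partitions - 1)].append(p)
    return parts


def angle_partitioned_skyline(points, num_partitions=4):
    # Execute phase: one local skyline per angular slice (parallelizable).
    candidates = [p for part in angle_partition(points, num_partitions)
                  for p in local_skyline(part)]
    # Merge phase: remove candidates dominated by points from other slices.
    return sorted(local_skyline(candidates))
```

Each angular slice contains points both near and far from the origin, so near points tend to eliminate far ones inside the slice and fewer candidates reach the merge phase than with a simple range split on one axis.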

Proceedings Article•DOI•

Skyline Travel Routes: Exploring Skyline for Trip Planning

Wan Ting Hsu¹, Yu Ting Wen¹, Ling Yin Wei, Wen-Chih Peng¹•Institutions (1)

National Chiao Tung University¹

14 Jul 2014

TL;DR: This paper considers some factors, such as the visiting time information of POIs and the set of query points, in retrieving travel routes, which could be mapped into dimensional spaces and shows that skyline travel routes indeed provide more diversity in the query result.

Abstract: Given a spatial range Q and a set of query points specified by users, the goal of this paper is to return the travel routes that fulfill two requirements: (1) travel routes should contain all the specified query points, and (2) travel routes should be within the spatial range Q. Furthermore, we claim that each query point may have its proper visiting time. As such, the travel routes should go through these query points at their corresponding proper visiting time. To avoid some redundant information in the travel routes, we utilize the skyline concept to retrieve travel routes with more diversity. Specifically, in our paper, we consider some factors, such as the visiting time information of POIs and the set of query points, in retrieving travel routes. These factors could be mapped into dimensional spaces. Then, each travel route is viewed as a data point in the dimensional space. Thus, skyline data points (referred to as skyline travel routes) are returned as the query result. Skyline travel routes could provide more diversity in the query result of trip route recommendations. To evaluate our proposed methods, we conducted extensive experiments on real datasets. The experimental results show that skyline travel routes indeed provide more diversity in the query result. In addition, we evaluate the efficiency of retrieving skyline travel routes.

Proceedings Article•DOI•

Continuous fragmented skylines over distributed streams

Odysseas Papapetrou¹, Minos Garofalakis¹•Institutions (1)

Technical University of Crete¹

19 May 2014

TL;DR: This paper presents the first known distributed approach for continuous fragmented skylines, namely distributed monitoring of skyline over complex functions of fragmented multi-dimensional objects, and proposes several optimizations, including a new technique based on random-walk models for adaptively determining the most efficient monitoring strategy for each object.

Abstract: Distributed skyline computation is important for a wide range of application domains, from distributed and web-based systems to ISP-network monitoring and distributed databases. The problem is particularly challenging in dynamic distributed settings, where the goal is to efficiently monitor a continuous skyline query over a collection of distributed streams. All existing work relies on the assumption of a single point of reference for object attributes/dimensions, i.e., objects may be vertically or horizontally partitioned, but the accurate value of each dimension for each object is always maintained by a single site. This assumption is unrealistic for several distributed monitoring applications, where object information is fragmented over a set of distributed streams (each monitored by a different site) and needs to be aggregated (e.g., averaged) across several sites. Furthermore, it is frequently useful to define skyline dimensions through complex functions over the aggregated objects, which raises further challenges for dealing with object fragmentation. In this paper, we present the first known distributed approach for continuous fragmented skylines, namely distributed monitoring of skylines over complex functions of fragmented multi-dimensional objects. We also propose several optimizations, including a new technique based on random-walk models for adaptively determining the most efficient monitoring strategy for each object. A thorough experimental study with synthetic and real-life data sets verifies the effectiveness of our approach, demonstrating order-of-magnitude improvements in communication costs compared to the only available centralized solution.

Book Chapter•DOI•

Computing Skyline from Evidential Data

Sayda Elmi¹, Karim Benouaret², Allel Hadjali, Mohamed Anis Bach Tobji¹, Boutheina Ben Yaghlane³•Institutions (3)

Tunis University¹, Jean Monnet University², Carthage College³

15 Sep 2014

TL;DR: This paper introduces a skyline model that is appropriate to the evidential data nature and develops an efficient algorithm to compute this kind of skyline.

Abstract: The skyline operator is a powerful means in multi-criteria decision-making since it retrieves the most interesting objects according to a set of attributes. On the other hand, uncertainty is inherent in many real applications. One of the most powerful approaches used to model uncertainty is the evidence theory. Databases that manage such type of data are called evidential databases. In this paper, we tackle the problem of skyline analysis on evidential databases. We first introduce a skyline model that is appropriate to the evidential data nature. We then develop an efficient algorithm to compute this kind of skyline. Finally, we present a thorough experimental evaluation of our approach.

Journal Article•DOI•

Skyline ranking for uncertain databases

Hyountaek Yong¹, Jongwuk Lee¹, Jinha Kim¹, Seung-won Hwang¹•Institutions (1)

Pohang University of Science and Technology¹

20 Jul 2014-Information Sciences

TL;DR: Novel skyline algorithms are proposed that efficiently deal with maybe uncertainty, leveraging auxiliary indexes, i.e., an R-tree or a dominance graph, and are orders of magnitude faster than a naive method.

Journal Article•DOI•

Domination in the Probabilistic World: Computing Skylines for Arbitrary Correlations and Ranking Semantics

Ilaria Bartolini¹, Paolo Ciaccia¹, Marco Patella¹•Institutions (1)

University of Bologna¹

26 May 2014-ACM Transactions on Database Systems

TL;DR: It is shown how, under mild conditions that indeed hold for all known PRSs, checking P-domination can be cast into an optimization problem, whose complexity is characterized for a variety of combinations of ranking semantics and correlation models.

Abstract: In a probabilistic database, deciding whether a tuple u is better than another tuple v does not have a univocal answer; rather, it depends on the specific Probabilistic Ranking Semantics (PRS) one adopts to combine tuples' scores and probabilities. In deterministic databases, skyline queries are known to be a remarkable alternative to (top-k) ranking queries, because they relieve the user of the burden of specifying a scoring function that combines values of different attributes into a single score. The skyline of a deterministic relation R is the set of undominated tuples in R: tuple u dominates tuple v iff, on all the attributes of interest, u is better than or equal to v and strictly better on at least one attribute. Domination is equivalent to having s(u) ≥ s(v) for all monotone scoring functions s(). The skyline of a probabilistic relation Rp can be similarly defined as the set of P-undominated tuples in Rp, where now u P-dominates v iff, whatever monotone scoring function one would use to combine the skyline attributes, u is deemed better than v by the PRS at hand. This definition, which is applicable to arbitrary ranking semantics and probabilistic correlation models, is parametric in the adopted PRS, thus it ensures that ranking and skyline queries will always return consistent results. In this article we provide an overall view of the problem of computing the skyline of a probabilistic relation. We show how, under mild conditions that indeed hold for all known PRSs, checking P-domination can be cast into an optimization problem, whose complexity we characterize for a variety of combinations of ranking semantics and correlation models. For each analyzed case we also provide specific P-domination rules, which are exploited by the algorithm we detail for the case where the probabilistic model is known to the query processor. We also consider the case in which the probability of tuple events can only be obtained through an oracle, and describe another skyline algorithm for this loosely integrated scenario. Our experimental evaluation of P-domination rules and skyline algorithms confirms the theoretical analysis.
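
The deterministic definitions in this abstract (dominance and the skyline as the set of undominated tuples) can be illustrated with a minimal Python sketch. It is not the article's algorithm: it assumes that larger values are better on every attribute, uses a simple quadratic scan, and the names `dominates`, `skyline`, and `relation` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """True if u is >= v on every attribute and > v on at least one.

    Assumption for this sketch: larger values are better on all attributes.
    """
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def skyline(relation: List[Point]) -> List[Point]:
    """Return the undominated tuples of a deterministic relation.

    Naive O(n^2) pairwise scan, kept simple for illustration.
    """
    return [u for u in relation if not any(dominates(v, u) for v in relation)]


# Hypothetical two-attribute relation.
relation = [(3.0, 9.0), (5.0, 5.0), (4.0, 4.0), (9.0, 1.0), (2.0, 8.0)]
result = skyline(relation)
print(result)  # (4, 4) and (2, 8) are dominated and therefore dropped
```

In this toy relation, (4, 4) is dominated by (5, 5) and (2, 8) by (3, 9), so three tuples remain. The probabilistic case described in the abstract replaces `dominates` with a PRS-specific P-domination check, which this sketch does not attempt.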

Journal Article•DOI•

Parallel skyline queries over uncertain data streams in cloud computing environments

Xiaoyong Li¹, Yijie Wang¹, Xiaoling Li¹, Yuan Wang¹•Institutions (1)

National University of Defense Technology¹

01 Jan 2014-International Journal of Web and Grid Services

TL;DR: Three parallel models SPM, APM, and DPM are proposed to address the parallel skyline query problem over uncertain data streams in cloud computing environments and an adaptive sliding granularity adjustment strategy and a load balance strategy are suggested to further optimise the queries.

Abstract: Skyline query processing over uncertain data streams has attracted considerable attention recently, due to its importance in helping users make intelligent decisions on complex data. Nevertheless, existing studies focus only on retrieving the skylines over data streams in a centralised environment, typically with one processor, which limits scalability and cannot meet the requirements of massive data analysis. Cloud computing provides unprecedented opportunities for supporting massive data management, which can be well adapted to parallel skyline queries. In this paper, we extensively study the parallel skyline query problem over uncertain data streams in cloud computing environments. In particular, three parallel models, SPM, APM, and DPM, are proposed to address the problem based on sliding window partitioning. Additionally, an adaptive sliding granularity adjustment strategy and a load balance strategy are proposed to further optimise the queries. Extensive experiments are conducted to demonstrate the effectiveness and efficiency of the proposals.

Proceedings Article•DOI•

Skyline Query Processing over Encrypted Data: An Attribute-Order-Preserving-Free Approach

Suvarna Bothe¹, Alfredo Cuzzocrea², Panagiotis Karras³, Akrivi Vlachou⁴•Institutions (4)

Rutgers University¹, University of Calabria², Skolkovo Institute of Science and Technology³, Norwegian University of Science and Technology⁴

07 Nov 2014

TL;DR: In this paper, the authors present eSkyline, a prototype system and query interface that enables the processing of skyline queries over encrypted data, even without preserving the order on each attribute as order-preserving encryption would do.

Abstract: Reconciling the need for efficient relational query processing over Clouds with the security of the data itself is emerging as one of the most challenging research problems in the Big Data era. Indeed, in current analytics-oriented engines, such as Google Analytics and Amazon S3, where key-value storage-representation and efficient-management models are employed to cope with the simultaneous processing of billions of transactions, querying encrypted data is becoming one of the most troublesome problems, which has also attracted a great deal of attention from the research community. While this issue has been studied for a large variety of data formats, e.g., relational, RDF, and multidimensional data, very few initiatives have addressed skyline query processing over encrypted data, which is, indeed, relevant for database analytics. In order to fill this methodological and technological gap, in this paper we present eSkyline, a prototype system and query interface that enables the processing of skyline queries over encrypted data, even without preserving the order on each attribute as order-preserving encryption would do. Our system comprises an encryption scheme that facilitates the evaluation of domination relationships and hence allows state-of-the-art skyline processing algorithms to be used. In order to prove the effectiveness and the reliability of our system, we also provide the details of the underlying encryption scheme, plus a suitable GUI that allows a user to interact with a server and showcases the efficiency of computing skyline queries and decrypting the results.

Journal Article•DOI•

Faster output-sensitive skyline computation algorithm

Jinfei Liu¹, Li Xiong¹, Xiaofeng Xu¹•Institutions (1)

Emory University¹

01 Dec 2014-Information Processing Letters

TL;DR: This work presents the second output-sensitive skyline computation algorithm, which is faster than the only existing output-sensitive skyline computation algorithm in the worst case because it does not rely on the existence of a linear-time procedure for finding medians.

Journal Article•DOI•

GDPS: An Efficient Approach for Skyline Queries over Distributed Uncertain Data

Xiaoyong Li¹, Yijie Wang¹, Xiaoling Li¹, Xiaowei Wang¹, Jie Yu¹•Institutions (1)

National University of Defense Technology¹

01 Aug 2014-Big Data Research

TL;DR: This paper extensively studies the distributed probabilistic skyline query problem and proposes an efficient approach, GDPS, to address the problem with an optimized iterative feedback mechanism based on the grid summary.

Book Chapter•DOI•

Geo-Social Skyline Queries

Tobias Emrich¹, Maximilian Franzke¹, Nikos Mamoulis², Matthias Renz¹, Andreas Züfle¹•Institutions (2)

Ludwig Maximilian University of Munich¹, University of Hong Kong²

21 Apr 2014

TL;DR: This paper proposes an efficient solution to the problem of geo-social skyline queries by showing how the RWR-distance can be bounded efficiently and effectively in order to identify true hits and true drops early, and shows that the presented pruning techniques vastly reduce the number of objects for which a more exact social distance has to be computed.

Abstract: With modern GPS-equipped mobile devices providing social-networking services, interest in developing advanced services that combine location-based services with social networking services is growing rapidly. Based on geo-social networks that couple personal location information with personal social context information, such services are facilitated by geo-social queries that extract useful information combining social relationships and current locations of the users. In this paper, we tackle the problem of geo-social skyline queries, a problem that has not been addressed so far. Given a set of persons D connected in a social network SN with information about their current location, a geo-social skyline query reports, for a given user U ∈ D and a given location P (not necessarily the location of the user), the Pareto-optimal set of persons who are close to P and closely connected to U in SN. We measure the social connectivity between users using the widely adopted, but very expensive, Random Walk with Restart method (RWR) to obtain the social distance between users in the social network. We propose an efficient solution by showing how the RWR-distance can be bounded efficiently and effectively in order to identify true hits and true drops early. Our experimental evaluation shows that, using only our proposed bounds, the presented pruning techniques vastly reduce the number of objects for which a more exact social distance has to be computed.
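
The query described in this abstract can be sketched naively in Python: compute Random Walk with Restart proximities from the user by power iteration, then keep the persons who are Pareto-optimal with respect to spatial distance to P (smaller is better) and RWR proximity to U (larger is better). This is the expensive baseline, not the paper's bounding and pruning technique. The restart probability, the use of Euclidean distance, the treatment of RWR proximity as the social measure, and every name below are assumptions made for illustration; the graph must have no isolated nodes.

```python
import math
from typing import Dict, List, Tuple


def rwr(adj: Dict[str, List[str]], source: str,
        restart: float = 0.15, iters: int = 100) -> Dict[str, float]:
    """Random Walk with Restart proximities from `source` (power iteration).

    Assumes every node has at least one neighbour.
    """
    score = {n: 1.0 if n == source else 0.0 for n in adj}
    for _ in range(iters):
        # With probability `restart` the walker jumps back to the source.
        nxt = {n: (restart if n == source else 0.0) for n in adj}
        for n, neighbours in adj.items():
            share = (1.0 - restart) * score[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        score = nxt
    return score


def geo_social_skyline(locations: Dict[str, Tuple[float, float]],
                       adj: Dict[str, List[str]],
                       user: str,
                       query_point: Tuple[float, float]) -> List[str]:
    """Persons not dominated on (distance to P, RWR proximity to the user)."""
    proximity = rwr(adj, user)
    cands = [(p, math.dist(loc, query_point), proximity[p])
             for p, loc in locations.items() if p != user]

    def dominated(c: Tuple[str, float, float]) -> bool:
        return any(o[1] <= c[1] and o[2] >= c[2] and (o[1] < c[1] or o[2] > c[2])
                   for o in cands)

    return sorted(c[0] for c in cands if not dominated(c))


# Hypothetical chain-shaped network u - a - b - c and 2D locations.
adj = {"u": ["a"], "a": ["u", "b"], "b": ["a", "c"], "c": ["b"]}
locations = {"u": (5.0, 5.0), "a": (3.0, 0.0), "b": (1.0, 0.0), "c": (2.0, 0.0)}
proximity = rwr(adj, "u")
answer = geo_social_skyline(locations, adj, "u", (0.0, 0.0))
print(answer)
```

In this toy network, person c is both farther from P and less connected to u than person b, so c is dropped, while a (most connected) and b (closest) remain.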

Journal Article•DOI•

Selecting Dynamic Skyline Services for QoS-based Service Composition

Jian Wu, Liang Chen, Tingting Liang

01 Sep 2014-Applied Mathematics & Information Sciences

TL;DR: This paper proposes a skyline service model as well as a novel skyline algorithm to maintain dynamic skyline services, and an extensive performance study is presented to verify the effectiveness and efficiency of the approach.

Abstract: With the growing adoption of web services on the Internet, service selection has become an important issue in service-oriented computing (SOC). An appropriate service selection algorithm is the fundamental guarantee for composing complex services from single-function components effectively. The quality of the selected component services is crucial for the performance of the service composition. Therefore, selecting the best services from a set of services with similar functionality has become a hot issue. Recently, the skyline has been introduced to solve the problem by selecting skyline services as the best candidate services. In this paper, we focus on selecting skyline services in a dynamic environment, where new services may appear, original services may become invalid, and the QoS of services may change. We propose a skyline service model as well as a novel skyline algorithm to maintain dynamic skyline services. An extensive performance study is presented to verify the effectiveness and efficiency of our approach.

Journal Article•DOI•

City Skyline Conservation: Sustaining the Premier Image of Kuala Lumpur

Nurulhuda Abdul Hamid Yusoff¹, Anuar Mohd Noor¹, Rosmadi Ghazali¹•Institutions (1)

Universiti Teknologi MARA¹

01 Jan 2014-Procedia environmental sciences

TL;DR: In this article, the authors investigated the quality and image of city skyline and its transformation due to new high-rise buildings and showed that the effectiveness of these techniques for assessing and pre-testing tall building proposals depends upon the local context of decision making.

Abstract: A city skyline is a unique fingerprint and inherent abstract reflecting a city's image and identity in terms of its spatial, historical, social, cultural and economic structures over time. Acting as important components, skyscrapers are intended to reflect a premier image and status, which have promotional and competitive benefits to a city. A rising city like Kuala Lumpur has aimed to improve its global standing through tall buildings and skyscrapers such as Petronas Towers and Kuala Lumpur Tower. The towers were designed to re-image the whole city and directly placed Kuala Lumpur on the world map as a world-class city. The city's skyline, therefore, is an instantly recognizable, distinctive asset that is important to protect. However, due to improving technology and global city competition, many new tall buildings have been proposed with the intention to replace the iconic role of these two towers. The proposal and construction of these new buildings exceed the allowable maximum height and have given rise to the urge to re-image and re-brand the identity of this national capital city, eclipsing the iconic role of Petronas Towers and KL Tower. The study focused on how the potential impacts of newly proposed tall buildings influence the existing Kuala Lumpur skylines. The aim was to investigate the quality and image of the city skyline and its transformation due to new high-rise buildings. This research made use of the Geographical Information System (GIS) and its 3D modeling function to construct, assess and analyze the city silhouettes. It also showed that the effectiveness of these techniques for assessing and pre-testing tall building proposals depends upon the local context of decision making.

Journal Article•DOI•

Approximate convex skyline: a partitioned layer-based index for efficient processing top-k queries

Sun-Young Ihm¹, Ki-Eun Lee², Aziz Nasridinov³, Jun-Seok Heo⁴, Young Ho Park¹•Institutions (4)

Sookmyung Women's University¹, LG Electronics², Dongguk University³, Samsung⁴

01 May 2014-Knowledge Based Systems

TL;DR: This paper proposes a method, called the Approximate Convex Skyline Enhanced (simply, AppCSE), which reduces the index building time and memory usage of the convex skyline, and shows that the degradation of query performance is negligible when using AppCSE as the layering scheme.

Abstract: A top-k query returns k tuples with the highest (or the lowest) scores from a relation. Layer-based methods are the representative ones for processing top-k queries efficiently. These methods construct a list of layers, where the ith layer contains the tuples that can potentially be the top-i answer. Thus, the layer-based methods can answer top-k queries by reading at most k layers. To construct layers, the existing layer-based methods use convex skyline, convex hull or skyline methods. Among them, the convex skyline is constructed by computing the convex hull over the skyline. Accordingly, the layer size of the convex skyline is relatively smaller than that of the convex hull, and the index building time is relatively shorter than that of the skyline. However, for large and high-dimensional databases, the convex skyline suffers from long index building time and large memory usage, because most objects can become skyline points. This paper focuses on how to build an index that contains a smaller number of objects compared to the skyline and takes less time to construct compared to the convex skyline. Specifically, we propose a method, called the Approximate Convex Skyline Enhanced (simply, AppCSE), which reduces the index building time and memory usage of the convex skyline. In the proposed method, we first construct the skyline, then partition the region of the skyline into multiple subregions, and compute the convex hull in each subregion with virtual objects. After that, AppCSE combines the objects obtained by computing the convex hull. Through various experiments with synthetic and real datasets, we demonstrate that the proposed method significantly reduces the index building time and memory usage compared to the existing methods. In addition, we show that the degradation of query performance is negligible when using AppCSE as the layering scheme.
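
The layer-based idea summarized in this abstract (the ith layer holds tuples that can be a top-i answer, so a top-k query reads at most k layers) can be sketched with plain skyline layers. This is deliberately not AppCSE and not the convex skyline: the layers here are built by repeatedly peeling the skyline, larger attribute values are assumed to be better, the scoring function is assumed to be linear with strictly positive weights, and all names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """u is >= v everywhere and > v somewhere (larger is better)."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def skyline_layers(points: List[Point]) -> List[List[Point]]:
    """Peel the skyline repeatedly: layer 1 is the skyline, layer 2 the
    skyline of what remains, and so on."""
    remaining = list(points)
    layers: List[List[Point]] = []
    while remaining:
        layer = [u for u in remaining
                 if not any(dominates(v, u) for v in remaining)]
        layers.append(layer)
        remaining = [p for p in remaining if p not in layer]
    return layers


def score(weights: Sequence[float], p: Point) -> float:
    """Linear scoring function; weights are assumed strictly positive."""
    return sum(w * x for w, x in zip(weights, p))


def top_k(layers: List[List[Point]], weights: Sequence[float], k: int) -> List[Point]:
    """Answer a top-k query by reading only the first k layers."""
    candidates = [p for layer in layers[:k] for p in layer]
    return sorted(candidates, key=lambda p: score(weights, p), reverse=True)[:k]


# Hypothetical data set and query weights.
points = [(9.0, 1.0), (1.0, 9.0), (6.0, 6.0), (5.0, 5.0),
          (4.0, 4.0), (8.0, 0.0), (2.0, 2.0)]
weights = (0.7, 0.3)
layers = skyline_layers(points)
answer = top_k(layers, weights, 3)
brute_force = sorted(points, key=lambda p: score(weights, p), reverse=True)[:3]
print(layers)
print(answer)
```

A tuple in layer j is dominated by at least one tuple in each earlier layer, so at least j - 1 tuples score strictly higher under positive weights. That is why reading k layers suffices in this sketch. Convex-hull-based layers, as in the abstract, aim to make each layer smaller than a plain skyline layer.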
