Showing papers by "Yufei Tao" published in 2003

Proceedings Article

An optimal and progressive algorithm for skyline queries

Dimitris Papadias¹, Yufei Tao², Greg Fu¹, Bernhard Seeger³•Institutions (3)

Hong Kong University of Science and Technology¹, Carnegie Mellon University², University of Marburg³

09 Jun 2003

TL;DR: BBS is a progressive algorithm also based on nearest neighbor search, which is IO optimal, i.e., it performs a single access only to those R-tree nodes that may contain skyline points and its space overhead is significantly smaller than that of NN.

Abstract: The skyline of a set of d-dimensional points contains the points that are not dominated by any other point on all dimensions. Skyline computation has recently received considerable attention in the database community, especially for progressive (or online) algorithms that can quickly return the first skyline points without having to read the entire data file. Currently, the most efficient algorithm is NN (nearest neighbors), which applies the divide-and-conquer framework on datasets indexed by R-trees. Although NN has some desirable features (such as high speed for returning the initial skyline points, applicability to arbitrary data distributions and dimensions), it also presents several inherent disadvantages (need for duplicate elimination if d>2, multiple accesses of the same node, large space overhead). In this paper we develop BBS (branch-and-bound skyline), a progressive algorithm also based on nearest neighbor search, which is IO optimal, i.e., it performs a single access only to those R-tree nodes that may contain skyline points. Furthermore, it does not retrieve duplicates and its space overhead is significantly smaller than that of NN. Finally, BBS is simple to implement and can be efficiently applied to a variety of alternative skyline queries. An analytical and experimental comparison shows that BBS outperforms NN (usually by orders of magnitude) under all problem instances.
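To make the dominance definition concrete, here is a minimal sketch of a skyline computed by brute force. It is not BBS and uses no R-tree; it is a quadratic baseline that assumes smaller values are preferable on every dimension and that a point dominates another when it is no worse everywhere and strictly better somewhere. The names `dominates`, `naive_skyline`, and the sample `hotels` data are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p dominates q (assumption: smaller is better).

    p must be no worse than q on every dimension and strictly
    better on at least one.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def naive_skyline(points: List[Point]) -> List[Point]:
    """Keep every point that no other point dominates (O(n^2) comparisons)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical data: (price, distance to the beach) for five hotels.
hotels = [(1.0, 9.0), (2.0, 4.0), (3.0, 5.0), (4.0, 1.0), (5.0, 3.0)]
print(naive_skyline(hotels))  # [(1.0, 9.0), (2.0, 4.0), (4.0, 1.0)]
```

A progressive algorithm such as the one described in the abstract would report these points one at a time without scanning the whole input; the brute-force version only illustrates what the final answer contains.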

853 citations

Book Chapter

Query processing in spatial network databases

Dimitris Papadias¹, Jun Zhang¹, Nikos Mamoulis², Yufei Tao³•Institutions (3)

Hong Kong University of Science and Technology¹, University of Hong Kong², City University of Hong Kong³

09 Sep 2003

TL;DR: A Euclidean restriction and a network expansion framework that take advantage of location and connectivity to efficiently prune the search space are developed and applied to the most popular spatial queries.

Abstract: Despite the importance of spatial networks in real-life applications, most of the spatial database literature focuses on Euclidean spaces. In this paper we propose an architecture that integrates network and Euclidean information, capturing pragmatic constraints. Based on this architecture, we develop a Euclidean restriction and a network expansion framework that take advantage of location and connectivity to efficiently prune the search space. These frameworks are successfully applied to the most popular spatial queries, namely nearest neighbors, range search, closest pairs and e-distance joins, in the context of spatial network databases.
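The idea of expanding outward along the network can be illustrated with a small sketch. The code below is a plain Dijkstra-style expansion that stops at the first node holding a data object, which gives the nearest neighbor by network distance. It is only an illustration of the general idea, not the paper's framework: it assumes objects sit exactly on graph nodes, the graph is an in-memory adjacency list, and the names `network_nn`, `roads`, and `restaurants` are hypothetical.

```python
import heapq
from typing import Dict, List, Optional, Set, Tuple

# Adjacency list: node -> list of (neighbor, edge length).
Graph = Dict[str, List[Tuple[str, float]]]


def network_nn(graph: Graph, source: str,
               objects: Set[str]) -> Optional[Tuple[str, float]]:
    """Return (node, network distance) of the closest object node, or None."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        if node in objects:
            return node, d  # first object reached is the nearest one
        for neighbor, length in graph.get(node, []):
            candidate = d + length
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return None  # no object is reachable from the source


# Hypothetical road network with the query located at node "q".
roads: Graph = {
    "q": [("a", 2.0), ("b", 5.0)],
    "a": [("q", 2.0), ("c", 4.0)],
    "b": [("q", 5.0)],
    "c": [("a", 4.0)],
}
restaurants = {"b", "c"}
print(network_nn(roads, "q", restaurants))  # ('b', 5.0)
```

Because nodes are visited in increasing network distance, the search never explores parts of the network farther away than the answer, which is the pruning effect that connectivity provides.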

675 citations

Book Chapter

The TPR*-tree: an optimized spatio-temporal access method for predictive queries

Yufei Tao¹, Dimitris Papadias², Jimeng Sun²•Institutions (2)

City University of Hong Kong¹, Hong Kong University of Science and Technology²

09 Sep 2003

TL;DR: This paper proposes a new index structure called the TPR*-tree, which takes into account the unique features of dynamic objects through a set of improved construction algorithms and provides cost models that determine the optimal performance achievable by any data-partition spatio-temporal access method.

Abstract: A predictive spatio-temporal query retrieves the set of moving objects that will intersect a query window during a future time interval. Currently, the only access method for processing such queries in practice is the TPR-tree. In this paper we first perform an analysis to determine the factors that affect the performance of predictive queries and show that several of these factors are not considered by the TPR-tree, which uses the insertion/deletion algorithms of the R*-tree designed for static data. Motivated by this, we propose a new index structure called the TPR*-tree, which takes into account the unique features of dynamic objects through a set of improved construction algorithms. In addition, we provide cost models that determine the optimal performance achievable by any data-partition spatio-temporal access method. Using experimental comparison, we illustrate that the TPR*-tree is nearly optimal and significantly outperforms the TPR-tree under all conditions.
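The predicate behind a predictive query can be sketched for a single object without any index. The code below assumes a point object moving linearly with constant velocity, a static axis-aligned query window, and time measured from the moment the position was recorded (t = 0). These are simplifying assumptions for illustration, and the function names are hypothetical; an index such as the TPR-tree or TPR*-tree exists to avoid running this test on every object.

```python
from typing import Optional, Sequence, Tuple


def axis_interval(pos: float, vel: float, lo: float, hi: float,
                  t0: float, t1: float) -> Optional[Tuple[float, float]]:
    """Times within [t0, t1] at which pos + vel * t lies in [lo, hi]."""
    if vel == 0:
        # A stationary coordinate is either always inside or never inside.
        return (t0, t1) if lo <= pos <= hi else None
    ta, tb = (lo - pos) / vel, (hi - pos) / vel
    start, end = max(min(ta, tb), t0), min(max(ta, tb), t1)
    return (start, end) if start <= end else None


def will_intersect(pos: Sequence[float], vel: Sequence[float],
                   window_lo: Sequence[float], window_hi: Sequence[float],
                   t0: float, t1: float) -> bool:
    """True if the moving point is inside the window at some time in [t0, t1]."""
    start, end = t0, t1
    for p, v, lo, hi in zip(pos, vel, window_lo, window_hi):
        interval = axis_interval(p, v, lo, hi, start, end)
        if interval is None:
            return False  # this axis is never inside during the remaining time
        start, end = interval  # narrow the time range axis by axis
    return True


# Point starting at the origin, window [5, 6] x [5, 6].
print(will_intersect((0, 0), (1, 1), (5, 5), (6, 6), 0, 10))  # True
print(will_intersect((0, 0), (1, 2), (5, 5), (6, 6), 0, 10))  # False
```

In the second call the point crosses the window's x-range during [5, 6] but its y-range during [2.5, 3], so it is never inside the window on both axes at the same time.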

488 citations

Proceedings Article

Location-based spatial queries

Jun Zhang¹, Manli Zhu¹, Dimitris Papadias¹, Yufei Tao², Dik Lun Lee¹•Institutions (2)

Hong Kong University of Science and Technology¹, Carnegie Mellon University²

09 Jun 2003

TL;DR: This paper proposes an approach that enables mobile clients to determine the validity of previous queries based on their current locations. It focuses on two of the most common spatial query types, namely nearest neighbor and window queries, defines the validity region in each case, and proposes the corresponding query processing algorithms.

Abstract: In this paper we propose an approach that enables mobile clients to determine the validity of previous queries based on their current locations. In order to make this possible, the server returns, in addition to the query result, a validity region around the client's location within which the result remains the same. We focus on two of the most common spatial query types, namely nearest neighbor and window queries, define the validity region in each case and propose the corresponding query processing algorithms. In addition, we provide analytical models for estimating the expected size of the validity region. Our techniques can significantly reduce the number of queries issued to the server, while introducing minimal computational and network overhead compared to traditional spatial queries.

310 citations

Proceedings Article

Selectivity estimation for predictive spatio-temporal queries

Yufei Tao¹, Jimeng Sun², Dimitris Papadias²•Institutions (2)

Carnegie Mellon University¹, Hong Kong University of Science and Technology²

05 Mar 2003

TL;DR: A cost model for selectivity estimation of predictive spatio-temporal window queries with high accuracy, ability to handle all query types, and efficient handling of updates is proposed.

Abstract: We propose a cost model for selectivity estimation of predictive spatio-temporal window queries. Initially, we focus on uniform data, proposing formulae that capture both points and rectangles, and any type of object/query mobility combination (i.e., dynamic objects, dynamic queries or both). Then, we apply the model to nonuniform datasets by introducing spatio-temporal histograms, which, in addition to the spatial distribution, also consider the velocity distributions during partitioning. The advantages of our techniques are (i) high accuracy (1-2 orders of magnitude lower error than previous techniques), (ii) ability to handle all query types, and (iii) efficient handling of updates.

85 citations

Journal Article

Spatial queries in dynamic environments

Yufei Tao¹, Dimitris Papadias²•Institutions (2)

City University of Hong Kong¹, Hong Kong University of Science and Technology²

01 Jun 2003-ACM Transactions on Database Systems

TL;DR: Time-parameterized and continuous versions of the most common spatial queries, i.e., window queries, nearest neighbors, spatial joins, are studied, proposing efficient processing algorithms and accurate cost models.

Abstract: Conventional spatial queries are usually meaningless in dynamic environments since their results may be invalidated as soon as the query or data objects move. In this paper we formulate two novel query types, time-parameterized and continuous queries, applicable in such environments. A time-parameterized query retrieves the actual result at the time when the query is issued, the expiry time of the result given the current motion of the query and database objects, and the change that causes the expiration. A continuous query retrieves tuples of the form (result, interval), where each result is accompanied by a future interval, during which it is valid. We study time-parameterized and continuous versions of the most common spatial queries (i.e., window queries, nearest neighbors, spatial joins), proposing efficient processing algorithms and accurate cost models.

81 citations

Journal Article

Analysis of predictive spatio-temporal queries

Yufei Tao¹, Jimeng Sun², Dimitris Papadias³•Institutions (3)

City University of Hong Kong¹, Carnegie Mellon University², Hong Kong University of Science and Technology³

01 Dec 2003-ACM Transactions on Database Systems

TL;DR: Probabilistic cost models that estimate the selectivity of spatio-temporal window queries and joins, and the expected distance between a query and its nearest neighbor(s) are presented.

Abstract: Given a set of objects S, a spatio-temporal window query q retrieves the objects of S that will intersect the window during the (future) interval qT. A nearest neighbor query q retrieves the objects of S closest to q during qT. Given a threshold d, a spatio-temporal join retrieves the pairs of objects from two datasets that will come within distance d from each other during qT. In this article, we present probabilistic cost models that estimate the selectivity of spatio-temporal window queries and joins, and the expected distance between a query and its nearest neighbor(s). Our models capture any query/object mobility combination (moving queries, moving objects or both) and any data type (points and rectangles) in arbitrary dimensionality. In addition, we develop specialized spatio-temporal histograms, which take into account both location and velocity information, and can be incrementally maintained. Extensive performance evaluation verifies that the proposed techniques produce highly accurate estimation on both uniform and non-uniform data.

73 citations

Book Chapter

Validity information retrieval for spatio-temporal queries: Theoretical performance bounds

Yufei Tao¹, Nikos Mamoulis², Dimitris Papadias³•Institutions (3)

Carnegie Mellon University¹, University of Hong Kong², Hong Kong University of Science and Technology³

24 Jul 2003

TL;DR: This paper presents the first theoretical study on validity queries, and develops indexes and algorithms with attractive I/O complexities that reveal the problem characteristics and permit the deployment of existing structures.

Abstract: The results of traditional spatial queries (i.e., range search, nearest neighbor, etc.) are usually meaningless in spatio-temporal applications, because they will be invalidated by the movements of query and/or data objects. In practice, a query result R should be accompanied by validity information specifying (i) the (future) time T that R will expire, and (ii) the change C of R at time T (so that R can be updated incrementally). Although several algorithms have been proposed for this problem, their worst-case performance is the same as that of sequential scan. This paper presents the first theoretical study on validity queries, and develops indexes and algorithms with attractive I/O complexities. Our discussion covers numerous important variations of the problem and different query/object mobility combinations. The solutions involve a set of non-trivial reductions that reveal the problem characteristics and permit the deployment of existing structures.

19 citations

Proceedings Article

The power-method: a comprehensive estimation technique for multi-dimensional queries

Yufei Tao¹, Christos Faloutsos², Dimitris Papadias³•Institutions (3)

City University of Hong Kong¹, Carnegie Mellon University², Hong Kong University of Science and Technology³

03 Nov 2003

TL;DR: The Power-method is developed, a comprehensive technique applicable to a wide range of query optimization problems under various metrics that eliminates the local uniformity assumption and is accurate even in scenarios where existing approaches completely fail.

Abstract: Existing estimation approaches for multi-dimensional databases often rely on the assumption that data distribution in a small region is uniform, which seldom holds in practice. Moreover, their applicability is limited to specific estimation tasks under a certain distance metric. This paper develops the Power-method, a comprehensive technique applicable to a wide range of query optimization problems under various metrics. The Power-method eliminates the local uniformity assumption and is accurate even in scenarios where existing approaches completely fail. Furthermore, it performs estimation by evaluating only one simple formula with minimal computational overhead. Extensive experiments confirm that the Power-method outperforms previous techniques in terms of accuracy and applicability to various optimization scenarios.

15 citations

Journal Article

Recent progress on selected topics in database research: a report by nine young Chinese researchers working in the United States

Zhiyuan Chen¹, Chen Li², Jian Pei³, Yufei Tao⁴, Haixun Wang⁵, Wei Wang⁶, Jiong Yang⁷, Jun Yang⁸, Donghui Zhang⁹•Institutions (9)

Microsoft¹, University of California, Irvine², University at Buffalo³, Carnegie Mellon University⁴, IBM⁵, University of North Carolina at Chapel Hill⁶, University of Illinois at Urbana–Champaign⁷, Duke University⁸, Northeastern University⁹

01 Sep 2003-Journal of Computer Science and Technology

TL;DR: Nine young Chinese researchers working in the United States present concise surveys and report their recent progress on the selected fields that they are working on, and hope that such an effort would attract more and more researchers, especially those in China, to enter the frontiers of database research and promote collaborations.

Abstract: The study on database technologies, or more generally, the technologies of data and information management, is an important and active research field. Recently, many exciting results have been reported. In this fast growing field, Chinese researchers play more and more active roles. Research papers from Chinese scholars, both in China and abroad, appear in prestigious academic forums. In this paper, we, nine young Chinese researchers working in the United States, present concise surveys and report our recent progress on the selected fields that we are working on. Although the paper covers only a small number of topics and the selection of the topics is far from balanced, we hope that such an effort would attract more and more researchers, especially those in China, to enter the frontiers of database research and promote collaborations. For the obvious reason, the authors are listed alphabetically, while the sections are arranged in the order of the author list.

7 citations