Yufei Tao

Researcher at The Chinese University of Hong Kong

Publications -  212
Citations -  16395

Yufei Tao is an academic researcher from The Chinese University of Hong Kong. The author has contributed to research in topics: Query optimization & Nearest neighbor search. The author has an h-index of 64, co-authored 202 publications receiving 15631 citations. Previous affiliations of Yufei Tao include University of Queensland & Hong Kong University of Science and Technology.

Papers
Proceedings Article

Quality and efficiency in high dimensional nearest neighbor search

TL;DR: This work proposes a new access method, the locality-sensitive B-tree (LSB-tree), that enables fast high-dimensional NN search with excellent quality. It dramatically reduces space and query cost and outperforms adhoc-LSH, even though the latter has no quality guarantee.
Proceedings Article

On the Anonymization of Sparse High-Dimensional Data

TL;DR: This work proposes a novel anonymization method for sparse high-dimensional data that employs a particular representation that captures the correlation in the underlying data, and facilitates the formation of anonymized groups with low information loss.
Proceedings Article

Distance-Based Representative Skyline

TL;DR: This work proposes a new definition of representative skyline that minimizes the distance between a non-representative skyline point and its nearest representative, and shows that it not only better captures the contour of the entire skyline than the previous method, but also can be computed much faster.
Proceedings Article

SUBSKY: Efficient Computation of Skylines in Subspaces

TL;DR: The core of SUBSKY is a transformation that converts multi-dimensional data to 1D values and enables several effective pruning heuristics; the technique can be implemented in any relational database.
Proceedings Article

DBSCAN Revisited: Mis-Claim, Un-Fixability, and Approximation

TL;DR: It is proved that for d ≥ 3, the DBSCAN problem requires Ω(n^{4/3}) time to solve unless very significant breakthroughs (ones widely believed to be impossible) could be made in theoretical computer science, and that, when a slight approximation is allowed, the running time can be dramatically brought down to O(n) in expectation regardless of the dimensionality d.