Data structures and algorithms for nearest neighbor search in general metric spaces
Peter N. Yianilos
pp. 311-321
TLDR
The vp-tree (vantage point tree) is introduced in several forms, together with associated algorithms, as an improved method for these difficult search problems in general metric spaces.
Abstract:
We consider the computational problem of finding nearest neighbors in general metric spaces. Of particular interest are spaces that may not be conveniently embedded or approximated in Euclidian space, or where the dimensionality of a Euclidian representation is very high. Also relevant are high-dimensional Euclidian settings in which the distribution of data is in some sense of lower dimension and embedded in the space. The vp-tree (vantage point tree) is introduced in several forms, together with associated algorithms, as an improved method for these difficult search problems. Tree construction executes in O(n log(n)) time, and search is, under certain circumstances and in the limit, O(log(n)) expected time. The theoretical basis for this approach is developed and the results of several experiments are reported. In Euclidian cases, kd-tree performance is compared.
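The idea behind the vp-tree can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names VPNode, build_vp_tree and nearest are hypothetical, the vantage point is chosen at random, and each level sorts its points by distance for simplicity, so the sketch does not achieve the O(n log(n)) construction bound stated in the abstract. It only assumes that dist is a metric, so that the triangle inequality can be used to skip subtrees.

```python
import math
import random


class VPNode:
    """One node of the tree: a vantage point and a split radius (hypothetical layout)."""

    def __init__(self, point, radius=0.0, inside=None, outside=None):
        self.point = point      # the vantage point stored at this node
        self.radius = radius    # median distance from the vantage point
        self.inside = inside    # subtree of points with dist <= radius
        self.outside = outside  # subtree of points with dist >= radius


def build_vp_tree(points, dist, rng=None):
    """Recursively split points around a randomly chosen vantage point."""
    rng = rng or random.Random(0)
    points = list(points)
    if not points:
        return None
    # Assumption: random vantage point selection, chosen here for brevity.
    i = rng.randrange(len(points))
    points[i], points[-1] = points[-1], points[i]
    vp = points.pop()
    if not points:
        return VPNode(vp)
    # Sort the remaining points by distance to the vantage point.
    by_dist = sorted(((dist(vp, p), p) for p in points), key=lambda t: t[0])
    mid = len(by_dist) // 2
    radius = by_dist[mid][0]
    inside = [p for _, p in by_dist[:mid]]
    outside = [p for _, p in by_dist[mid:]]
    return VPNode(vp, radius,
                  build_vp_tree(inside, dist, rng),
                  build_vp_tree(outside, dist, rng))


def nearest(root, query, dist):
    """Return (point, distance) of the nearest stored point to query."""
    best = [None, float("inf")]

    def visit(node):
        if node is None:
            return
        d = dist(query, node.point)
        if d < best[1]:
            best[0], best[1] = node.point, d
        # Visit the more promising side first; the triangle inequality
        # decides whether the other side can still hold a closer point.
        if d < node.radius:
            if d - best[1] <= node.radius:
                visit(node.inside)
            if d + best[1] >= node.radius:
                visit(node.outside)
        else:
            if d + best[1] >= node.radius:
                visit(node.outside)
            if d - best[1] <= node.radius:
                visit(node.inside)

    visit(root)
    return best[0], best[1]


# Usage with Euclidean distance; any function satisfying the metric axioms works.
sample = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5), (5.0, 5.0)]
tree = build_vp_tree(sample, math.dist)
closest_point, closest_distance = nearest(tree, (0.9, 1.2), math.dist)
```

Because the search only ever calls dist, the same code can be used with a non-Euclidean metric such as edit distance on strings, which is the setting the abstract emphasizes.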
Citations
Journal Article
Nearest Neighbor Search in General Metric Spaces Using a Tree Data Structure with a Simple Heuristic
TL;DR: A new algorithm for nearest neighbor search in general metric spaces is presented that organizes the database into recursively partitioned Voronoi regions and represents these partitions in a tree, using the triangle inequality to derive the minimum possible distance.
Posted Content
A multiresolution framework to characterize single-cell state landscapes
Shahin Mohammadi, Jose Davila-Velderrain, Manolis Kellis, et al.
TL;DR: ACTIONet is introduced, a comprehensive framework that combines archetypal analysis and network theory to provide a ready-to-use analytical approach for multiresolution single-cell state characterization.
Journal Article
Lempel-Ziv Jaccard Distance, an effective alternative to ssdeep and sdhash
TL;DR: This work proposes and tests LZJD's effectiveness as a similarity digest hash for digital forensics, and develops a high performance Java implementation with the same command-line arguments as sdhash, making it easy to integrate into existing workflows.
Book Chapter
An index data structure for searching in metric space databases
TL;DR: Empirical results show that the EGNAT is suitable for conducting similarity searches on very large metric space databases, and it is shown that this data structure allows efficient parallelization on distributed memory parallel architectures.
Posted Content
Candidate Generation with Binary Codes for Large-Scale Top-N Recommendation
Wang-Cheng Kang, Julian McAuley
TL;DR: A candidate generation and re-ranking based framework (CIGAR) is proposed, which first learns a preference-preserving binary embedding for building a hash table to retrieve candidates, and then learns to re-rank the candidates using real-valued ranking models with a candidate-oriented objective.
References
Book
Introduction to Statistical Pattern Recognition
TL;DR: This completely revised second edition presents an introduction to statistical pattern recognition, which is appropriate as a text for introductory courses in pattern recognition and as a reference book for workers in the field.
Journal Article
Voronoi diagrams—a survey of a fundamental geometric data structure
TL;DR: The Voronoi diagram divides the plane into regions according to which of a given set of points is nearest.
Journal Article
An Algorithm for Finding Best Matches in Logarithmic Expected Time
TL;DR: An algorithm and data structure are presented for searching a file containing N records, each described by k real valued keys, for the m closest matches or nearest neighbors to a given query record.
Journal Article
A Branch and Bound Algorithm for Computing k-Nearest Neighbors
TL;DR: The method of branch and bound is implemented in the present algorithm to facilitate rapid calculation of the k-nearest neighbors, by eliminating the necessity of calculating many distances.