Data structures and algorithms for nearest neighbor search in general metric spaces
Peter N. Yianilos
- pp 311-321
TLDR
The vp-tree (vantage point tree) is introduced in several forms, together with associated algorithms, as an improved method for these difficult search problems in general metric spaces.
Abstract:
We consider the computational problem of finding nearest neighbors in general metric spaces. Of particular interest are spaces that may not be conveniently embedded or approximated in Euclidean space, or where the dimensionality of a Euclidean representation is very high. Also relevant are high-dimensional Euclidean settings in which the distribution of data is in some sense of lower dimension and embedded in the space. The vp-tree (vantage point tree) is introduced in several forms, together with associated algorithms, as an improved method for these difficult search problems. Tree construction executes in O(n log(n)) time, and search is, under certain circumstances and in the limit, O(log(n)) expected time. The theoretical basis for this approach is developed and the results of several experiments are reported. In Euclidean cases, kd-tree performance is compared.
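To make the idea concrete, here is a minimal Python sketch of a vantage point tree. It is an illustration under stated assumptions, not the paper's implementation: the names `VPNode`, `build`, `nearest`, and `edit_distance` are hypothetical, the vantage point is chosen uniformly at random, and the median split is found by sorting, which is simpler than linear-time median selection and so does not reach the O(n log(n)) construction bound quoted above. The only requirement on `dist` is that it is a metric, since pruning relies on the triangle inequality.

```python
import random
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class VPNode:
    point: Any                           # vantage point stored at this node
    radius: float = 0.0                  # median distance from the vantage point
    inside: Optional["VPNode"] = None    # points at distance <= radius
    outside: Optional["VPNode"] = None   # points at distance >= radius


def build(points: List[Any], dist: Callable[[Any, Any], float],
          rng: random.Random) -> Optional[VPNode]:
    """Recursively split the points around the median distance to a vantage point."""
    if not points:
        return None
    rest = list(points)
    vp = rest.pop(rng.randrange(len(rest)))  # assumption: random vantage point
    node = VPNode(vp)
    if rest:
        # Sort by distance to the vantage point (sorting is a simplification).
        scored = sorted(((dist(vp, p), p) for p in rest), key=lambda t: t[0])
        mid = len(scored) // 2
        node.radius = scored[mid][0]
        node.inside = build([p for _, p in scored[:mid]], dist, rng)
        node.outside = build([p for _, p in scored[mid:]], dist, rng)
    return node


def nearest(root: Optional[VPNode], query: Any,
            dist: Callable[[Any, Any], float]) -> Tuple[Any, float]:
    """Return (nearest point, distance); (None, inf) for an empty tree."""
    best_point, best_dist = None, float("inf")

    def visit(node: Optional[VPNode]) -> None:
        nonlocal best_point, best_dist
        if node is None:
            return
        d = dist(query, node.point)
        if d < best_dist:
            best_point, best_dist = node.point, d
        # Descend into the more promising side first, then visit the other
        # side only if the triangle inequality cannot rule it out.
        if d < node.radius:
            visit(node.inside)
            if d + best_dist >= node.radius:
                visit(node.outside)
        else:
            visit(node.outside)
            if d - best_dist <= node.radius:
                visit(node.inside)

    visit(root)
    return best_point, best_dist


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, a metric on strings with no Euclidean embedding needed."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


# Usage: nearest-neighbor search over strings under edit distance.
words = ["tree", "three", "metric", "median", "search", "vantage", "point"]
root = build(words, edit_distance, random.Random(0))
print(nearest(root, "trie", edit_distance))
```

The string example is chosen because edit distance is a metric with no convenient Euclidean embedding, which is the setting the abstract singles out. The tree never inspects coordinates; it only calls `dist`.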
Citations
Book ChapterDOI
Evaluating document-to-document relevance based on document language model: modeling, implementation and performance evaluation
TL;DR: A document language model is used to represent a document's topical content, with an explanation of why it can reveal the document's topics, and two distributional similarity measures based on the document language model are established to evaluate document-to-document relevance.
Journal ArticleDOI
Tangent space Data Driven framework for elasto-plastic material behaviors
TL;DR: In this article, the authors describe the use of the new paradigm of Model-Free Data Driven Computational Mechanics for solving elasto-plastic evolutionary problems, using a tangent operator and transition rules based on threshold laws, ready to be implemented into existing finite element software packages.
Proceedings ArticleDOI
Life between computer vision and databases
TL;DR: A correct way to create a data base that relies on such heterogeneous techniques as those developed by computer vision researchers without collapsing under the sheer weight of its own complexity goes through the definition of abstract data types.
Scalable Clustering for Immune Repertoire Sequence Analysis
TL;DR: The result shows that the indexing-enhanced HC preserves the clustering quality very well, while also significantly reducing the time complexity of the original HC; SCT with HC is the fastest approximate HC method with slightly sacrificed quality; and SparkMST scales out satisfactorily and gives significant performance gain with a large Spark cluster.
Book ChapterDOI
Indexing Multiple-Instance Objects
TL;DR: This paper introduces an indexing technique supporting efficient queries on Multiple-Instance (MI) objects that has a dynamic structure that supports efficient insertions and deletions and is based on an effective similarity measure for MI objects.
References
Book
Introduction to Statistical Pattern Recognition
TL;DR: This completely revised second edition presents an introduction to statistical pattern recognition, which is appropriate as a text for introductory courses in pattern recognition and as a reference book for workers in the field.
Journal ArticleDOI
Voronoi diagrams—a survey of a fundamental geometric data structure
TL;DR: The Voronoi diagram divides the plane according to the nearest-neighbor rule, associating each point with the region of the plane closest to it.
Journal ArticleDOI
An Algorithm for Finding Best Matches in Logarithmic Expected Time
TL;DR: An algorithm and data structure are presented for searching a file containing N records, each described by k real valued keys, for the m closest matches or nearest neighbors to a given query record.
Journal ArticleDOI
A Branch and Bound Algorithm for Computing k-Nearest Neighbors
TL;DR: The method of branch and bound is implemented in the present algorithm to facilitate rapid calculation of the k-nearest neighbors, by eliminating the necessity of calculating many distances.