Open Access Proceedings Article

Data structures and algorithms for nearest neighbor search in general metric spaces

TLDR
The vp-tree (vantage point tree) is introduced in several forms, together with associated algorithms, as an improved method for difficult nearest neighbor search problems in general metric spaces.
Abstract
We consider the computational problem of finding nearest neighbors in general metric spaces. Of particular interest are spaces that may not be conveniently embedded or approximated in Euclidean space, or where the dimensionality of a Euclidean representation is very high. Also relevant are high-dimensional Euclidean settings in which the distribution of data is in some sense of lower dimension and embedded in the space. The vp-tree (vantage point tree) is introduced in several forms, together with associated algorithms, as an improved method for these difficult search problems. Tree construction executes in O(n log(n)) time, and search is, under certain circumstances and in the limit, O(log(n)) expected time. The theoretical basis for this approach is developed and the results of several experiments are reported. In Euclidean cases, kd-tree performance is compared.
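
As a rough illustration of the idea summarized in the abstract, the following Python sketch builds a basic vantage point tree and runs a nearest neighbor query against it. It is a minimal sketch under stated assumptions, not the paper's algorithm in any of its several forms: the vantage point is picked at random, the split radius is the median distance to the vantage point, and the names `VPNode`, `build_vp_tree`, and `nearest` are hypothetical. It is also not tuned to meet the construction and search bounds stated above. The only requirement placed on `dist` is that it behaves as a metric, since the pruning step relies on the triangle inequality.

```python
import math
import random
import statistics


class VPNode:
    """One node of the tree: a vantage point and the radius that splits the rest."""

    def __init__(self, point, radius, inside, outside):
        self.point = point      # vantage point stored at this node
        self.radius = radius    # median distance from the vantage point
        self.inside = inside    # subtree of points with distance <= radius
        self.outside = outside  # subtree of points with distance > radius


def build_vp_tree(points, dist, rng=None):
    """Recursively build a tree over `points` using the metric `dist`."""
    if rng is None:
        rng = random.Random(0)  # assumption: random vantage point selection
    points = list(points)
    if not points:
        return None
    vantage = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vantage, 0.0, None, None)
    distances = [dist(vantage, p) for p in points]
    radius = statistics.median(distances)
    inside = [p for p, d in zip(points, distances) if d <= radius]
    outside = [p for p, d in zip(points, distances) if d > radius]
    return VPNode(vantage, radius,
                  build_vp_tree(inside, dist, rng),
                  build_vp_tree(outside, dist, rng))


def nearest(node, query, dist, best=None):
    """Return (distance, point) for the stored point closest to `query`."""
    if node is None:
        return best
    d = dist(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    # Search the side of the boundary that contains the query first.
    if d <= node.radius:
        first, second = node.inside, node.outside
    else:
        first, second = node.outside, node.inside
    best = nearest(first, query, dist, best)
    # By the triangle inequality, every point on the other side is at least
    # |d - radius| away from the query, so skip that side when this bound
    # already exceeds the best distance found so far.
    if abs(d - node.radius) <= best[0]:
        best = nearest(second, query, dist, best)
    return best


# Example usage with the Euclidean metric (math.dist); any metric would do.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (5.0, 5.0)]
tree = build_vp_tree(points, math.dist)
distance, neighbor = nearest(tree, (1.2, 0.9), math.dist)
```

Because the tree only ever calls `dist`, the same code can be used with a non-Euclidean metric such as the Manhattan distance, which reflects the general metric space setting the abstract targets. Heavily duplicated points can make the median split unbalanced in this sketch.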


Citations
Patent

Interaction method with an expert system that utilizes stutter peak rule

TL;DR: In this article, a rule base is re-applied to make at least one second decision, wherein the third decision is different from the second decision or either the first or second decisions are accepted.

Approximation and Relaxation Approaches for Parallel and Distributed Machine Learning

Stephen Tyree
TL;DR: This work demonstrates various techniques by which non-linear models can be made amenable to larger data sets and significantly more training parallelism by strategically introducing approximation in certain optimization steps.
Proceedings Article

HQANN: Efficient and Robust Similarity Search for Hybrid Queries with Structured and Unstructured Constraints

TL;DR: HQANN is a simple yet highly efficient hybrid query processing framework which can be easily embedded into existing proximity graph-based ANNS algorithms and guarantees both low latency and high recall by leveraging navigation sense among attributes and fusing vector similarity search with attribute filtering.

Fuzzy Clustering for Content-based Indexing in Multimedia Database

Ho-Yin Yue
TL;DR: This work uses Sequential Fuzzy Competitive Clustering (SFCC), a fast and noise-resistant fuzzy clustering algorithm, to obtain the natural clusters information and uses the result of SFCC clustering to construct a good indexing structure (SFCC-b-tree) for effective nearest-neighbor search.
Journal Article

Data-independent vantage point selection for range queries

TL;DR: This work proposes a data-independent technique for creating vantage points, which requires that the values in each dimension of the feature vectors be bounded, and shows that the proposed technique is superior to existing methods.
References
Book

Introduction to Statistical Pattern Recognition

TL;DR: This completely revised second edition presents an introduction to statistical pattern recognition, which is appropriate as a text for introductory courses in pattern recognition and as a reference book for workers in the field.
Journal Article

Voronoi diagrams—a survey of a fundamental geometric data structure

TL;DR: The Voronoi diagram divides the plane into regions according to which of a given set of points is nearest.
Journal Article

An Algorithm for Finding Best Matches in Logarithmic Expected Time

TL;DR: An algorithm and data structure are presented for searching a file containing N records, each described by k real valued keys, for the m closest matches or nearest neighbors to a given query record.
Journal Article

A Branch and Bound Algorithm for Computing k-Nearest Neighbors

TL;DR: The method of branch and bound is implemented in the present algorithm to facilitate rapid calculation of the k-nearest neighbors, by eliminating the necessity of calculating many distances.