Open Access Proceedings Article

Data structures and algorithms for nearest neighbor search in general metric spaces

TLDR
The vp-tree (vantage point tree) is introduced in several forms, together with associated algorithms, as an improved method for these difficult search problems in general metric spaces.
Abstract
We consider the computational problem of finding nearest neighbors in general metric spaces. Of particular interest are spaces that may not be conveniently embedded or approximated in Euclidian space, or where the dimensionality of a Euclidian representation is very high. Also relevant are high-dimensional Euclidian settings in which the distribution of data is in some sense of lower dimension and embedded in the space. The vp-tree (vantage point tree) is introduced in several forms, together with associated algorithms, as an improved method for these difficult search problems. Tree construction executes in O(n log(n)) time, and search is, under certain circumstances and in the limit, O(log(n)) expected time. The theoretical basis for this approach is developed and the results of several experiments are reported. In Euclidian cases, kd-tree performance is compared.
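The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's own algorithm in any of its several forms: it assumes a uniformly random vantage point and a split at the median distance, and the names `VPNode`, `build`, and `nearest` are hypothetical. The only requirement on `dist` is that it be a metric, since pruning relies on the triangle inequality.

```python
import math
import random
from statistics import median


class VPNode:
    """One tree node: a vantage point, a split radius, and two subtrees."""

    def __init__(self, point, radius, inside, outside):
        self.point = point
        self.radius = radius    # median distance from the vantage point
        self.inside = inside    # points with dist(point, p) <= radius
        self.outside = outside  # points with dist(point, p) > radius


def build(points, dist, rng=None):
    """Build a tree over `points` using only the metric `dist`."""
    rng = rng or random.Random(0)
    points = list(points)
    if not points:
        return None
    # Assumption: pick the vantage point at random.
    vp = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vp, 0.0, None, None)
    dists = [dist(vp, p) for p in points]
    mu = median(dists)
    inside = [p for p, d in zip(points, dists) if d <= mu]
    outside = [p for p, d in zip(points, dists) if d > mu]
    return VPNode(vp, mu, build(inside, dist, rng), build(outside, dist, rng))


def nearest(node, q, dist, best=None):
    """Return (distance, point) for the nearest neighbor of q, or None."""
    if node is None:
        return best
    d = dist(q, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    # Search the side that contains q first.
    if d <= node.radius:
        first, second = node.inside, node.outside
    else:
        first, second = node.outside, node.inside
    best = nearest(first, q, dist, best)
    # By the triangle inequality, every point on the other side is at
    # least |d - radius| away from q, so skip it when that bound cannot
    # beat the best distance found so far.
    if abs(d - node.radius) < best[0]:
        best = nearest(second, q, dist, best)
    return best


# Usage: 2-D points under the Euclidean metric (any metric would do).
gen = random.Random(1)
pts = [(gen.random(), gen.random()) for _ in range(200)]
tree = build(pts, math.dist)
d_best, p_best = nearest(tree, (0.5, 0.5), math.dist)
```

With a median split, each level roughly halves the point set, which is where the O(n log(n)) construction cost comes from. How much of the tree a query can skip depends on how often the pruning test succeeds, which in turn depends on the metric and the data distribution.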


Citations
Journal Article

Content-based image retrieval at the end of the early years

TL;DR: The working conditions of content-based retrieval: patterns of use, types of pictures, the role of semantics, and the sensory gap are discussed, as well as aspects of system engineering: databases, system architecture, and evaluation.
Monograph

Planning Algorithms: Introductory Material

TL;DR: This coherent and comprehensive book unifies material from several sources, including robotics, control theory, artificial intelligence, and algorithms, into planning under differential constraints that arise when automating the motions of virtually any mechanical system.
Proceedings Article

Approximate nearest neighbors: towards removing the curse of dimensionality

TL;DR: In this paper, the authors present two algorithms for the approximate nearest neighbor problem in high-dimensional spaces, for data sets of size n living in R^d, which require space that is only polynomial in n and d.
Journal Article

Accelerating t-SNE using tree-based algorithms

TL;DR: Variants of the Barnes-Hut algorithm and of the dual-tree algorithm that approximate the gradient used for learning t-SNE embeddings in O(N log N) are developed and shown to substantially accelerate and make it possible to learn embeddings of data sets with millions of objects.
Proceedings Article

From Word Embeddings To Document Distances

TL;DR: It is demonstrated on eight real world document classification data sets, in comparison with seven state-of-the-art baselines, that the Word Mover's Distance metric leads to unprecedented low k-nearest neighbor document classification error rates.
References
Journal Article

Strategies for efficient incremental nearest neighbor search

TL;DR: It is shown that incremental search can be implemented as a sequence of invocations of a previously published non-incremental algorithm, and a new incremental search algorithm is presented which finds the next nearest neighbor more efficiently by eliminating redundant computations.
Journal Article

A Technique to Identify Nearest Neighbors

TL;DR: It is shown that this procedure may be used to eliminate distance calculations when finding nearest neighbors according to any Minkowski p-metric.
Journal Article

The nearest neighbor problem in an abstract metric space

TL;DR: A new method for achieving this goal in an abstract metric space by selecting those models that are closest to an unknown relational description in a database of relational models.
Journal Article

Tree structures for high dimensionality nearest neighbor searching

TL;DR: A probabilistic version of the algorithm is presented which provides significantly faster searching with little degradation in retrieval quality, and some savings over a sequential search can be achieved in this type of application.