Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs

doi:10.1109/TPAMI.2018.2889473

Yu. A. Malkov, +1 more

- 01 Apr 2020 -

IEEE Transactions on Pattern Analysis an...

- Vol. 42, Iss: 4, pp 824-836

TLDR

Hierarchical Navigable Small World (HNSW) is a fully graph-based approach for approximate K-nearest neighbor search that needs no additional search structures (which are typically used at the coarse search stage of most proximity graph techniques).

Abstract:

We present a new approach for the approximate K-nearest neighbor search based on navigable small world graphs with controllable hierarchy (Hierarchical NSW, HNSW). The proposed solution is fully graph-based, without any need for additional search structures (typically used at the coarse search stage of most proximity graph techniques). Hierarchical NSW incrementally builds a multi-layer structure consisting of a hierarchical set of proximity graphs (layers) for nested subsets of the stored elements. The maximum layer in which an element is present is selected randomly with an exponentially decaying probability distribution. This allows producing graphs similar to the previously studied Navigable Small World (NSW) structures while additionally having the links separated by their characteristic distance scales. Starting the search from the upper layer together with utilizing the scale separation boosts the performance compared to NSW and allows a logarithmic complexity scaling. Additional employment of a heuristic for selecting proximity graph neighbors significantly increases performance at high recall and in the case of highly clustered data. Performance evaluation has demonstrated that the proposed general metric space search index is able to strongly outperform previous open-source state-of-the-art vector-only approaches. Similarity of the algorithm to the skip list structure allows straightforward balanced distributed implementation.
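The two mechanisms the abstract describes, random layer assignment with exponentially decaying probability and a search that starts from the upper layer, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the dictionary-based graph layout, and the normalization `m_l = 1 / ln(M)` with `M = 16` are assumptions made here for illustration, and the search is a plain greedy walk that returns a single candidate rather than the paper's full candidate-list search.

```python
import math
import random
from collections import Counter


def sample_max_layer(m_l: float, rng: random.Random) -> int:
    """Draw the top layer of a new element as floor(-ln(U) * m_l), U in (0, 1]."""
    u = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log() never sees 0
    return int(-math.log(u) * m_l)  # value is non-negative, so int() acts as floor


def layer_histogram(n: int, m: int = 16, seed: int = 0) -> Counter:
    """Count how many of n elements get each top layer (m is an assumed parameter)."""
    rng = random.Random(seed)
    m_l = 1.0 / math.log(m)  # assumed normalization: each layer is ~m times sparser
    return Counter(sample_max_layer(m_l, rng) for _ in range(n))


def greedy_descent(layers, points, entry, query):
    """Greedy walk from the top layer down to layer 0.

    layers[k] maps a node id to its neighbor ids on layer k (layers[-1] is the top);
    points maps a node id to its coordinates. Returns the id where the walk stops.
    """
    current = entry
    for graph in reversed(layers):
        while True:
            # Pick the neighbor closest to the query on this layer.
            best = min(
                graph.get(current, ()),
                key=lambda node: math.dist(points[node], query),
                default=current,
            )
            if math.dist(points[best], query) >= math.dist(points[current], query):
                break  # local minimum on this layer: drop to the next one
            current = best
    return current
```

With these assumptions, most elements land only on layer 0 and each higher layer holds roughly a factor of `M` fewer elements, which is what lets the walk cover long distances on sparse upper layers before refining on the dense bottom layer.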

Citations

Journal Article

Billion-Scale Similarity Search with GPUs

Jeff Johnson, +2 more

- 01 Jul 2021 -

IEEE Transactions on Big Data

TL;DR: This paper proposes a novel design for k-selection that enables the construction of high-accuracy brute-force, approximate, and compressed-domain search based on product quantization, and applies it in different similarity search scenarios.

Journal Article

Generalizing RNA velocity to transient cell states through dynamical modeling.

Volker Bergen, +4 more

- 03 Aug 2020 -

Nature Biotechnology

TL;DR: scVelo reconstructs transient cell states and differentiation pathways from single-cell RNA-sequencing data, infers gene-specific rates of transcription, splicing and degradation, recovers each cell’s position in the underlying differentiation processes, and detects putative driver genes.

Posted Content

Generalizing RNA velocity to transient cell states through dynamical modeling

Volker Bergen, +4 more

- 29 Oct 2019 -

bioRxiv

TL;DR: scVelo enables disentangling heterogeneous subpopulation kinetics with unprecedented resolution in hippocampal dentate gyrus neurogenesis and pancreatic endocrinogenesis, and the authors anticipate that it will greatly facilitate the study of lineage decisions, gene regulation, and pathway activity identification.

Posted Content

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

Patrick S. H. Lewis, +11 more

- 22 May 2020 -

arXiv: Computation and Language

TL;DR: This paper explores a general-purpose fine-tuning recipe for retrieval-augmented generation (RAG), that is, models which combine pre-trained parametric and non-parametric memory for language generation, and finds that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.

Posted Content

Pretrained Transformers for Text Ranking: BERT and Beyond

Jimmy Lin, +2 more

- 13 Oct 2020 -

arXiv: Information Retrieval

TL;DR: This tutorial provides an overview of text ranking with neural network architectures known as transformers, of which BERT (Bidirectional Encoder Representations from Transformers) is the best-known example, and covers a wide range of techniques.

References

Journal Article

Distinctive Image Features from Scale-Invariant Keypoints

David G. Lowe

- 01 Nov 2004 -

International Journal of Computer Vision

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.

Journal Article

Gradient-based learning applied to document recognition

Yann LeCun, +6 more

TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.

Journal Article

Collective dynamics of small-world networks

Duncan J. Watts, +1 more

- 04 Jun 1998 -

Nature

TL;DR: This paper explores simple models of networks that can be tuned through the middle ground between regular and random topologies: regular networks ‘rewired’ to introduce increasing amounts of disorder. It finds that these systems can be highly clustered, like regular lattices, yet have small characteristic path lengths, like random graphs.

Journal Article

ImageNet Large Scale Visual Recognition Challenge

Olga Russakovsky, +11 more

- 01 Dec 2015 -

International Journal of Computer Vision

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images. It has been run annually from 2010 to present, attracting participation from more than fifty institutions.

Proceedings Article

Glove: Global Vectors for Word Representation

Jeffrey Pennington, +2 more

TL;DR: This paper presents a new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.

Related Papers (5)

Product Quantization for Nearest Neighbor Search

Multidimensional binary search trees used for associative searching

Approximate nearest neighbors: towards removing the curse of dimensionality

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Similarity Search in High Dimensions via Hashing