
Showing papers on "Locality-sensitive hashing" published in 1982


Journal ArticleDOI
TL;DR: A complete characterization of the probability distribution of the directory size and depth is derived, and its implications on the design of the directory are studied.
Abstract: Extendible hashing is an attractive direct-access technique which has been introduced recently. It is characterized by a combination of database-size flexibility and fast direct access. This paper derives performance measures for extendible hashing, and considers their implications on the physical database design. A complete characterization of the probability distribution of the directory size and depth is derived, and its implications on the design of the directory are studied. The expected input/output costs of various operations are derived, and the effects of varying physical design parameters on the expected average operating cost and on the expected volume are studied.

62 citations
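
For readers unfamiliar with the directory whose size and depth the abstract above analyzes, the following is a minimal in-memory sketch of extendible hashing. It is not taken from the paper: the class names, the `bucket_capacity` parameter, and the choice of indexing the directory by the low-order bits of Python's built-in `hash` are all assumptions made for illustration.

```python
class Bucket:
    """A fixed-capacity bucket that remembers how many hash bits it uses."""

    def __init__(self, depth):
        self.depth = depth   # local depth
        self.items = {}


class ExtendibleHash:
    """Illustrative sketch only; names and parameters are hypothetical."""

    def __init__(self, bucket_capacity=2):
        self.capacity = bucket_capacity
        self.global_depth = 0            # the directory has 2**global_depth entries
        self.directory = [Bucket(0)]

    def _index(self, key):
        # Use the low-order global_depth bits of the hash as the directory index.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key):
        return self.directory[self._index(key)].items.get(key)

    def put(self, key, value):
        # Note: keys with identical hash values beyond the bucket capacity
        # would make this loop split forever; the sketch ignores that case.
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.items or len(bucket.items) < self.capacity:
                bucket.items[key] = value
                return
            self._split(bucket)

    def _split(self, bucket):
        if bucket.depth == self.global_depth:
            # The bucket already uses every directory bit: double the directory.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.depth += 1
        sibling = Bucket(bucket.depth)
        bit = 1 << (bucket.depth - 1)    # the newly significant hash bit
        old_items, bucket.items = bucket.items, {}
        for k, v in old_items.items():
            target = sibling if hash(k) & bit else bucket
            target.items[k] = v
        # Redirect the directory entries whose new bit is set to the sibling.
        for i, b in enumerate(self.directory):
            if b is bucket and i & bit:
                self.directory[i] = sibling
```

In this sketch the directory only ever doubles, and it does so exactly when a full bucket's local depth equals the global depth, which is why the directory size depends on how the hashed keys happen to fall into buckets.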


Journal ArticleDOI
TL;DR: Techniques are developed for tuning an important parameter that relates the sizes of the address region and the cellar in order to optimize the average running times of different implementations of the coalesced hashing method.
Abstract: The coalesced hashing method is one of the faster searching methods known today. This paper is a practical study of coalesced hashing for use by those who intend to implement or further study the algorithm. Techniques are developed for tuning an important parameter that relates the sizes of the address region and the cellar in order to optimize the average running times of different implementations. A value for the parameter is reported that works well in most cases. Detailed graphs explain how the parameter can be tuned further to meet specific needs. The resulting tuned algorithm outperforms several well-known methods including standard coalesced hashing, separate (or direct) chaining, linear probing, and double hashing. A variety of related methods are also analyzed including deletion algorithms, a new and improved insertion strategy called varied-insertion, and applications to external searching on secondary storage devices.

26 citations
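
To make the "address region" and "cellar" in the abstract above concrete, here is a minimal sketch of coalesced hashing with a cellar. It is an illustration under stated assumptions, not the paper's tuned implementation: the class name, the `address_size` and `cellar_size` parameters, the use of Python's built-in `hash`, and the restriction that keys are never `None` are all hypothetical choices. The ratio between the two sizes is the kind of parameter the paper tunes; no particular value is implied here.

```python
class CoalescedHashTable:
    """Illustrative sketch only; names and parameters are hypothetical."""

    def __init__(self, address_size, cellar_size):
        self.address_size = address_size          # slots that keys can hash to
        total = address_size + cellar_size        # cellar slots sit at the top
        self.keys = [None] * total                # None marks an empty slot
        self.next = [-1] * total                  # -1 marks the end of a chain
        self.free = total - 1                     # scans downward for empty slots

    def _home(self, key):
        return hash(key) % self.address_size

    def search(self, key):
        """Return the slot holding key, or -1 if it is absent."""
        i = self._home(key)
        while i != -1:
            if self.keys[i] == key:
                return i
            i = self.next[i]
        return -1

    def insert(self, key):
        """Insert key (if absent) and return the slot where it is stored."""
        i = self._home(key)
        if self.keys[i] is None:
            self.keys[i] = key
            return i
        # Walk to the end of the chain that passes through the home slot.
        while True:
            if self.keys[i] == key:
                return i
            if self.next[i] == -1:
                break
            i = self.next[i]
        # Take the highest-numbered empty slot, so the cellar fills first.
        while self.free >= 0 and self.keys[self.free] is not None:
            self.free -= 1
        if self.free < 0:
            raise OverflowError("table is full")
        j = self.free
        self.keys[j] = key
        self.next[i] = j                          # link the new slot to the chain
        return j
```

Because colliding keys are stored in the cellar first, chains for different home slots stay separate until the cellar is exhausted; only then do colliders spill into the address region, where chains can coalesce.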


Journal ArticleDOI
TL;DR: Efficient algorithms are presented for two geometric problems that involve finding the best projection of a set of points from two-space onto a line, with two different notions of “best”.
Abstract: Efficient algorithms are presented for two geometric problems. Both problems involve finding the best projection of a set of points from two-space onto a line, with two different notions of “best”. The key technique is to identify critical angles in between which the functions to be optimized have nice trigonometric forms that can be solved exactly. Applications to hashing arise when we look for the best linear combination of two hashing functions.

13 citations