Showing papers on "Feature hashing" published in 1982

Open Access

Journal Article•DOI•

Analysis of Extendible Hashing

Haim Mendelson¹•Institutions (1)

Saint Petersburg State University¹

01 Nov 1982-IEEE Transactions on Software Engineering

TL;DR: A complete characterization of the probability distribution of the directory size and depth is derived, and its implications on the design of the directory are studied.

Abstract: Extendible hashing is an attractive direct-access technique which has been introduced recently. It is characterized by a combination of database-size flexibility and fast direct access. This paper derives performance measures for extendible hashing, and considers their implications on the physical database design. A complete characterization of the probability distribution of the directory size and depth is derived, and its implications on the design of the directory are studied. The expected input/output costs of various operations are derived, and the effects of varying physical design parameters on the expected average operating cost and on the expected volume are studied.

62 citations

Journal Article•DOI•

A letter oriented minimal perfect hashing function

Curtis R. Cook¹, R. R. Oldehoeft²•Institutions (2)

Oregon State University¹, Colorado State University²

01 Sep 1982-SIGPLAN Notices

TL;DR: A letter oriented algorithm is developed that handles more than one word per iteration and that frequently outperforms Cichelli's backtracking algorithm.

Abstract: Cichelli has presented a simple method for constructing minimal perfect hash tables of identifiers for small static word sets. The hash function value for a word is computed as the sum of the length of the word and the values associated with the first and last letters of the word. Cichelli's backtracking algorithm considers one word at a time and performs an exhaustive search to find the letter value assignments. In considering heuristics to improve his algorithm we were led to develop a letter oriented algorithm that handles more than one word per iteration and that frequently outperforms Cichelli's. We also investigate the impact of relaxing the minimality requirement and allowing blank spaces in the constructed table. This substantially reduced the execution time of the algorithm. This relaxation and partitioning data sets are shown to be two useful schemes for handling large data sets.
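
The hash function described in this abstract (word length plus the values of the first and last letters) can be illustrated with a minimal Python sketch. The search below is a naive backtracking over letter values, written only for illustration: it is neither Cichelli's exact procedure nor the letter oriented algorithm of Cook and Oldehoeft. The names `cichelli_hash`, `find_letter_values`, and `max_value`, and the sample word list, are assumptions introduced here.

```python
from typing import Dict, List, Optional


def cichelli_hash(word: str, g: Dict[str, int]) -> int:
    """Word length plus the values assigned to its first and last letters."""
    return len(word) + g[word[0]] + g[word[-1]]


def find_letter_values(words: List[str], max_value: int) -> Optional[Dict[str, int]]:
    """Naive backtracking search (illustrative assumption, not the paper's method).

    Looks for letter values in [0, max_value] such that all hash values are
    distinct and fit in a window of len(words) consecutive integers, which is
    the minimality requirement.
    """
    n = len(words)
    g: Dict[str, int] = {}

    def fits(hashes: List[int]) -> bool:
        return len(set(hashes)) == len(hashes) and max(hashes) - min(hashes) < n

    def place(i: int, hashes: List[int]) -> bool:
        if i == n:
            return True
        word = words[i]
        # Letters of this word that still need a value (first and last only).
        pending = [c for c in dict.fromkeys((word[0], word[-1])) if c not in g]
        return assign(pending, 0, i, hashes)

    def assign(letters: List[str], j: int, i: int, hashes: List[int]) -> bool:
        if j == len(letters):
            new_hashes = hashes + [cichelli_hash(words[i], g)]
            return fits(new_hashes) and place(i + 1, new_hashes)
        for value in range(max_value + 1):
            g[letters[j]] = value
            if assign(letters, j + 1, i, hashes):
                return True
        del g[letters[j]]  # undo the assignment before backtracking
        return False

    return g if place(0, []) else None


# Hypothetical sample word set, chosen only for illustration.
words = ["do", "end", "if", "for", "while"]
letter_values = find_letter_values(words, max_value=4)
if letter_values is not None:
    hashes = [cichelli_hash(w, letter_values) for w in words]
    base = min(hashes)
    table = {h - base: w for h, w in zip(hashes, words)}  # slots 0..len(words)-1
```

Relaxing the window test in `fits` (for example, allowing a window larger than `len(words)`) corresponds to the non-minimal tables with blank spaces that the abstract mentions.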

46 citations

Journal Article•DOI•

Implementations for coalesced hashing

Jeffrey Scott Vitter¹•Institutions (1)

Brown University¹

01 Dec 1982-Communications of the ACM

TL;DR: Techniques are developed for tuning an important parameter that relates the sizes of the address region and the cellar in order to optimize the average running times of different implementations of the coalesced hashing method.

Abstract: The coalesced hashing method is one of the faster searching methods known today. This paper is a practical study of coalesced hashing for use by those who intend to implement or further study the algorithm. Techniques are developed for tuning an important parameter that relates the sizes of the address region and the cellar in order to optimize the average running times of different implementations. A value for the parameter is reported that works well in most cases. Detailed graphs explain how the parameter can be tuned further to meet specific needs. The resulting tuned algorithm outperforms several well-known methods including standard coalesced hashing, separate (or direct) chaining, linear probing, and double hashing. A variety of related methods are also analyzed including deletion algorithms, a new and improved insertion strategy called varied-insertion, and applications to external searching on secondary storage devices.
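
To make the address region and cellar concrete, here is a minimal Python sketch of a coalesced hash table with a cellar, assuming the commonly described layout: hash addresses fall in the first part of the table, and colliding keys are placed in the highest-numbered empty slot and linked to the end of the chain. The class name, the `address_factor` parameter name, the toy hash function, and the value 0.8 in the usage lines are all assumptions made for illustration; in particular, 0.8 is not the value reported in the paper, and deletion and the varied-insertion strategy are not shown.

```python
from typing import List, Optional


class CoalescedHashTable:
    """Illustrative coalesced hashing with a cellar (no deletion support)."""

    def __init__(self, total_slots: int, address_factor: float) -> None:
        self.keys: List[Optional[str]] = [None] * total_slots
        self.next: List[int] = [-1] * total_slots  # -1 marks the end of a chain
        # Hash addresses fall in [0, address_slots); the rest is the cellar.
        self.address_slots = max(1, int(total_slots * address_factor))
        self.free = total_slots - 1  # scans downward for the next empty slot

    def _home(self, key: str) -> int:
        # Toy deterministic hash, assumed here only to keep the sketch simple.
        return sum(key.encode()) % self.address_slots

    def insert(self, key: str) -> None:
        i = self._home(key)
        if self.keys[i] is None:
            self.keys[i] = key
            return
        # Walk the chain; stop early if the key is already present.
        while True:
            if self.keys[i] == key:
                return
            if self.next[i] == -1:
                break
            i = self.next[i]
        # Take the highest-numbered empty slot (cellar first, then address region).
        while self.free >= 0 and self.keys[self.free] is not None:
            self.free -= 1
        if self.free < 0:
            raise OverflowError("table is full")
        self.keys[self.free] = key
        self.next[i] = self.free

    def contains(self, key: str) -> bool:
        i = self._home(key)
        while i != -1 and self.keys[i] is not None:
            if self.keys[i] == key:
                return True
            i = self.next[i]
        return False


# Usage: 10 slots, of which 8 form the address region and 2 the cellar.
table = CoalescedHashTable(total_slots=10, address_factor=0.8)
for key in ["ab", "ba", "c"]:  # "ab" and "ba" collide under the toy hash
    table.insert(key)
```

Varying `address_factor` changes how many collisions can be absorbed by the cellar before chains start to coalesce inside the address region, which is the trade-off the tuning in the paper addresses.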

26 citations