Journal Article

Multiattribute hashing using Gray codes

Christos Faloutsos
- Vol. 15, Iss: 2, pp 227-238
TLDR
This paper develops a mathematical model, derives formulas giving the average performance of both methods (binary codes and Gray codes), and shows that the proposed Gray-code method achieves 0% to 50% relative savings over binary codes.
Abstract
[...] The binary value of this string determines the bucket in which the record is stored. In this paper we propose to use Gray codes instead of binary codes to map record signatures to buckets. In Gray codes, successive codewords differ in the value of exactly one bit position; thus, successive buckets hold records with similar record signatures. The proposed method achieves better clustering of similar records and avoids some of the (expensive) random disk accesses, replacing them with sequential ones. We develop a mathematical model, derive formulas giving the average performance of both methods, and show that the proposed method achieves 0% to 50% relative savings over the binary codes. We also discuss how Gray codes could be applied to some retrieval methods designed for range queries, such as the grid file [Nievergelt84a] and the approach based on the so-called z-ordering [Orenstein84a].
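
The clustering effect described in the abstract can be illustrated with a small Python sketch. It is a toy illustration only, not the paper's mathematical model. It assumes the standard binary-reflected Gray code, and it assumes that under the Gray-code scheme a signature is stored in the bucket whose address is the signature's rank in Gray-code order. All function names and the `*` wildcard notation for unspecified bits are hypothetical.

```python
from itertools import product


def binary_to_gray(n: int) -> int:
    """Return the binary-reflected Gray codeword for rank n."""
    return n ^ (n >> 1)


def gray_to_binary(g: int) -> int:
    """Return the rank of codeword g in binary-reflected Gray order."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def matching_signatures(pattern: str) -> list[int]:
    """Enumerate signatures matching a partial-match pattern.

    The pattern is a string over '0', '1', and '*', where '*' marks
    an unspecified bit (hypothetical notation for this sketch).
    """
    choices = [("0", "1") if ch == "*" else (ch,) for ch in pattern]
    return [int("".join(bits), 2) for bits in product(*choices)]


def count_clusters(buckets) -> int:
    """Count runs of consecutive bucket addresses."""
    ordered = sorted(buckets)
    return sum(
        1 for i, b in enumerate(ordered) if i == 0 or b != ordered[i - 1] + 1
    )


def clusters_for(pattern: str) -> tuple[int, int]:
    """Return (clusters under binary codes, clusters under Gray codes)."""
    sigs = matching_signatures(pattern)
    # Binary scheme: the bucket address is the signature's binary value.
    binary = count_clusters(sigs)
    # Gray scheme (assumed): the bucket address is the signature's Gray rank.
    gray = count_clusters(gray_to_binary(s) for s in sigs)
    return binary, gray


if __name__ == "__main__":
    for pattern in ("*1*", "**1", "1**"):
        print(pattern, clusters_for(pattern))
```

For 3-bit signatures, the query `*1*` touches buckets 2, 3, 6, 7 under binary codes (two clusters) but buckets 2, 3, 4, 5 under Gray codes (one cluster). The query `**1` drops from four clusters to two, while `1**` forms a single cluster under both schemes. Each cluster stands for one random access followed by sequential reads, so fewer clusters means fewer random disk accesses.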

Citations
Journal Article

Multidimensional access methods

TL;DR: The class of point access methods, which are used to search sets of points in two or more dimensions, is presented, and theoretical and experimental results concerning the relative performance of various approaches are discussed.
Proceedings Article

A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces

TL;DR: It is shown formally that partitioning and clustering techniques for similarity search in high-dimensional vector spaces (HDVSs) exhibit linear complexity at high dimensionality, and that existing methods are outperformed on average by a simple sequential scan if the number of dimensions exceeds around 10.
Journal Article

Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases

TL;DR: An overview of the current state of the art in querying multimedia databases is provided, describing the index structures and algorithms for efficient query processing in high-dimensional spaces.
Journal Article

Analysis of the clustering properties of the Hilbert space-filling curve

TL;DR: This work analyzes the clustering property of the Hilbert space-filling curve by deriving closed-form formulas for the number of clusters in a given query region of an arbitrary shape and shows that the Hilbert curve achieves better clustering than the z curve.
Proceedings Article

Linear clustering of objects with multiple attributes

TL;DR: A mapping based on Hilbert's space-filling curve is presented, which outperforms previously proposed mappings on average over a variety of operating conditions.
References
Journal Article

Multidimensional binary search trees used for associative searching

TL;DR: The multidimensional binary search tree (or k-d tree) is developed as a data structure for storing information to be retrieved by associative searches, and it is shown to be quite efficient in its storage requirements.
Journal Article

The Grid File: An Adaptable, Symmetric Multikey File Structure

TL;DR: This work discusses in detail the design decisions that led to the grid file, presents simulation results of its behavior, and compares it to other multikey access file structures.
Proceedings Article

The K-D-B-tree: a search structure for large multidimensional dynamic indexes

TL;DR: The K-D-B-tree is a data structure that combines the properties of the k-d tree and the B-tree.
Journal Article

Extendible hashing—a fast access method for dynamic files

TL;DR: This work studies, by analysis and simulation, the performance of extendible hashing and indicates that it provides an attractive alternative to other access methods, such as balanced trees.
Proceedings Article

Linear hashing: a new tool for file and table addressing

Witold Litwin
TL;DR: With linear hashing, a record in the file is, in general, found in one access, while the load may stay practically constant at up to 90%. No other algorithms attaining such performance are known.