
Showing papers on "Feature hashing" published in 2001


Proceedings ArticleDOI
22 Apr 2001
TL;DR: This work describes an approach for obtaining good hash tables based on using multiple hashes of each input key (which is an IP address), which proves extremely suitable in instances where the goal is to have one hash bucket fit into a cache line.
Abstract: High performance Internet routers require a mechanism for very efficient IP address lookups. Some techniques used to this end, such as binary search on levels, need to construct quickly a good hash table for the appropriate IP prefixes. We describe an approach for obtaining good hash tables based on using multiple hashes of each input key (which is an IP address). The methods we describe are fast, simple, scalable, parallelizable, and flexible. In particular, in instances where the goal is to have one hash bucket fit into a cache line, using multiple hashes proves extremely suitable. We provide a general analysis of this hashing technique and specifically discuss its application to binary search on levels.

294 citations
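
The multiple-hash idea in the abstract above can be illustrated with a small sketch. It assumes the common "place the key in the least-loaded of d candidate buckets" rule with a fixed bucket capacity (standing in for a cache line); the class name `MultiChoiceTable`, the helper `bucket_choices`, and the use of salted BLAKE2b are illustrative assumptions, not details taken from the paper.

```python
import hashlib


def bucket_choices(key: bytes, num_buckets: int, d: int) -> list[int]:
    """Return d candidate bucket indices for key, one salted hash per choice."""
    choices = []
    for i in range(d):
        # Assumption: a different one-byte salt per choice stands in for
        # d independent hash functions.
        digest = hashlib.blake2b(key, digest_size=8, salt=bytes([i])).digest()
        choices.append(int.from_bytes(digest, "big") % num_buckets)
    return choices


class MultiChoiceTable:
    """Hypothetical table: each key may live in any of its d candidate buckets."""

    def __init__(self, num_buckets: int, bucket_capacity: int, d: int = 2):
        self.buckets = [[] for _ in range(num_buckets)]
        self.capacity = bucket_capacity  # e.g. entries that fit in one cache line
        self.d = d

    def insert(self, key: bytes, value) -> bool:
        """Place the entry in the least-loaded candidate; False means overflow.

        Duplicate keys are not detected in this sketch.
        """
        choices = bucket_choices(key, len(self.buckets), self.d)
        target = min(choices, key=lambda b: len(self.buckets[b]))
        if len(self.buckets[target]) >= self.capacity:
            return False
        self.buckets[target].append((key, value))
        return True

    def lookup(self, key: bytes):
        """Probe all d candidate buckets; they could be examined in parallel."""
        for b in bucket_choices(key, len(self.buckets), self.d):
            for stored_key, value in self.buckets[b]:
                if stored_key == key:
                    return value
        return None


if __name__ == "__main__":
    table = MultiChoiceTable(num_buckets=8, bucket_capacity=4)
    address = bytes([192, 168, 0, 1])  # an IPv4 address as 4 raw bytes
    table.insert(address, "next-hop-A")
    print(table.lookup(address))
```

Choosing the emptier of several candidates keeps the fullest bucket small, which is what makes a fixed per-bucket capacity workable; the cost is that a lookup must probe d buckets instead of one.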


Proceedings ArticleDOI
07 Dec 2001
TL;DR: In this paper, robust video hashing is proposed as a tool to extract perceptual features from moving image sequences and identify any sufficiently long unknown video segment by efficiently matching the hash value of the short segment with a large database of pre-computed hash values.
Abstract: This paper presents the concept of robust video hashing as a tool for video identification. We present considerations and a technique for (i) extracting essential perceptual features from moving image sequences and (ii) identifying any sufficiently long unknown video segment by efficiently matching the hash value of the short segment with a large database of pre-computed hash values.

91 citations


Patent
24 Apr 2001
TL;DR: This patent describes a technology for recognizing the perceptual similarity of the content of digital goods; it produces hash values that are proximally near each other when the digital goods contain perceptually similar content.
Abstract: An implementation of a technology is described herein for recognizing the perceptual similarity of the content of digital goods. At least one implementation, described herein, introduces a new hashing technique. More particularly, this hashing technique produces hash values for digital goods that are proximally near each other, when the digital goods contain perceptually similar content. In other words, if the content of digital goods is perceptually similar, then their hash values are, likewise, similar. The hash values are proximally near each other. This is unlike conventional hashing techniques where the hash values of goods with perceptually similar content are far apart with high probability in some distance sense (e.g., Hamming). This abstract itself is not intended to limit the scope of this patent. The scope of the present invention is pointed out in the appended claims.

82 citations
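
The property stated in the abstract above, that perceptually similar content should yield hash values that are close in Hamming distance, can be shown with a toy example. The sketch below is not the patented technique: it substitutes a simple median-threshold bit hash over a list of numeric features (for instance, block-average intensities), and the names `median_bit_hash` and `hamming` are hypothetical.

```python
import statistics


def median_bit_hash(features: list[float]) -> list[int]:
    """One bit per feature: 1 if the feature is above the median, else 0."""
    med = statistics.median(features)
    return [1 if x > med else 0 for x in features]


def hamming(a: list[int], b: list[int]) -> int:
    """Number of positions at which two equal-length bit lists differ."""
    if len(a) != len(b):
        raise ValueError("hashes must have the same length")
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    original = [10, 200, 30, 180, 50, 160, 70, 140]
    brighter = [x + 3 for x in original]   # small, perceptually minor change
    reordered = list(reversed(original))   # structurally different content

    h = median_bit_hash(original)
    print(hamming(h, median_bit_hash(brighter)))
    print(hamming(h, median_bit_hash(reordered)))
```

Because each bit depends only on how a feature compares with the median, a uniform brightness shift leaves the hash unchanged, whereas a cryptographic hash of the same two inputs would differ in roughly half of its bits.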


Journal Article
TL;DR: This paper presents a new approach to constructing a cryptographic hash function, called Pair Chaining & Modular Arithmetic Combining Incremental Hash Function (PCIHF), whose attractive properties are incrementality and parallelizability.
Abstract: Most hash functions that use iterative constructions are inefficient for bulk hashing of documents with high similarity. In this paper, we present a new approach to constructing a cryptographic hash function called Pair Chaining & Modular Arithmetic Combining Incremental Hash Function (PCIHF). PCIHF has two attractive properties: incrementality and parallelizability. The security of PCIHF has also been analyzed comprehensively. Finally, we show that PCIHF is not only universal one-way but also collision-free.

4 citations
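
Incrementality, as named in the abstract above, means that changing one block of a document should not require rehashing the whole document. The sketch below is only an illustration of that idea and is not the PCIHF construction: it assumes, based on the name alone, that adjacent block pairs are hashed and the results are combined by modular addition. The modulus, the use of SHA-256, and the function names are all assumptions, and no security claim is made for this toy.

```python
import hashlib

MODULUS = 2 ** 256  # assumed combining modulus, chosen to match SHA-256 output


def _pair_term(left: bytes, right: bytes) -> int:
    """Hash one adjacent pair; the length prefix keeps the boundary unambiguous."""
    data = len(left).to_bytes(4, "big") + left + right
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def pair_chain_hash(blocks: list[bytes]) -> int:
    """Sum the hashes of all adjacent block pairs modulo MODULUS.

    The pair terms are independent, so they could be computed in parallel.
    """
    if len(blocks) < 2:
        raise ValueError("need at least two blocks")
    pairs = zip(blocks, blocks[1:])
    return sum(_pair_term(a, b) for a, b in pairs) % MODULUS


def update_block(digest: int, blocks: list[bytes], index: int, new_block: bytes) -> int:
    """Replace blocks[index] in place and return the updated digest.

    Only the (at most two) pair terms that contain the block are recomputed.
    """
    affected = [i for i in (index - 1, index) if 0 <= i < len(blocks) - 1]
    old = sum(_pair_term(blocks[i], blocks[i + 1]) for i in affected)
    blocks[index] = new_block
    new = sum(_pair_term(blocks[i], blocks[i + 1]) for i in affected)
    return (digest - old + new) % MODULUS
```

For a document of n blocks, `update_block` costs a constant number of hash calls, while recomputing with `pair_chain_hash` costs n - 1.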


Journal ArticleDOI
TL;DR: This work proposes a criss-cross hash join strategy that draws from both hashing and indexing techniques, inheriting the advantages of each; its page maps are simpler, more compact, and easier to maintain than the traditional data structures associated with index-based join methods.
Abstract: Join processing in relational database systems continues to be a difficult and challenging problem. In this research, we propose a criss-cross hash join strategy that draws from both hashing and indexing techniques, inheriting the advantages of each. To facilitate the criss-cross hash join, a simple data structure, termed page map, is introduced. The page maps aid in reducing the hashing effort incurred in the current hash based join methods. Furthermore, the page maps implicitly capture and exploit the possible inherent order among tuples in the relations, however partial it may be, to achieve superior performance. As the proposed methodology relies on the hashing scheme, the page maps are simpler, more compact, and easier to maintain than the traditional data structures associated with index based join methods. We develop the ideas intuitively first, followed by a formal development of the concepts and the algorithms. A detailed probabilistic analysis of the algorithms is presented and their performance is assessed through extensive empirical investigations. The empirical analysis suggests significant performance improvements over the current state-of-the-art hybrid hash method, especially in the presence of possible inherent order.

3 citations