Showing papers on "Feature hashing" published in 2001

Open Access

Proceedings Article

Using multiple hash functions to improve IP lookups

Andrei Z. Broder¹, Michael Mitzenmacher²

22 Apr 2001

TL;DR: This work describes a way to build good hash tables by applying multiple hash functions to each input key (an IP address). The approach is especially suitable when the goal is to fit one hash bucket into a cache line.

Abstract: High-performance Internet routers require a mechanism for very efficient IP address lookups. Some techniques used to this end, such as binary search on levels, need to quickly construct a good hash table for the appropriate IP prefixes. We describe an approach for obtaining good hash tables based on using multiple hashes of each input key (which is an IP address). The methods we describe are fast, simple, scalable, parallelizable, and flexible. In particular, in instances where the goal is to have one hash bucket fit into a cache line, using multiple hashes proves extremely suitable. We provide a general analysis of this hashing technique and specifically discuss its application to binary search on levels.
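
One way to picture the multiple-hash idea is the minimal sketch below. It is a generic "least loaded of d candidate buckets" table and not necessarily the exact scheme analyzed in the paper. The class `MultiChoiceTable`, the helper `bucket_index`, and the use of salted BLAKE2b as the hash family are all assumptions made for illustration.

```python
import hashlib


def bucket_index(key: str, seed: int, num_buckets: int) -> int:
    """Map a key to a bucket; each seed acts as a different hash function (assumed family)."""
    digest = hashlib.blake2b(
        key.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little") % num_buckets


class MultiChoiceTable:
    """Hypothetical table: each key has num_hashes candidate buckets."""

    def __init__(self, num_buckets: int, num_hashes: int = 2):
        self.buckets = [[] for _ in range(num_buckets)]
        self.num_hashes = num_hashes

    def _candidates(self, key: str) -> list:
        return [
            bucket_index(key, seed, len(self.buckets))
            for seed in range(self.num_hashes)
        ]

    def insert(self, key: str, value) -> None:
        candidates = self._candidates(key)
        # Overwrite the value if the key is already stored in a candidate bucket.
        for i in candidates:
            bucket = self.buckets[i]
            for j, (stored_key, _) in enumerate(bucket):
                if stored_key == key:
                    bucket[j] = (key, value)
                    return
        # Otherwise place the key in the currently least-loaded candidate bucket.
        target = min(candidates, key=lambda i: len(self.buckets[i]))
        self.buckets[target].append((key, value))

    def lookup(self, key: str):
        # A lookup must probe every candidate bucket.
        for i in self._candidates(key):
            for stored_key, value in self.buckets[i]:
                if stored_key == key:
                    return value
        return None

    def max_load(self) -> int:
        return max(len(bucket) for bucket in self.buckets)
```

With `num_hashes=1` this reduces to ordinary chained hashing. Comparing `max_load()` for one and two hashes on the same key set is a quick way to observe how extra choices tend to flatten bucket sizes, which is the property that matters when a bucket has to fit in a cache line.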

294 citations

Proceedings Article

Visual hashing of digital video: applications and techniques

Job C. Oostveen¹, Ton Kalker¹, Jaap A. Haitsma¹

Philips¹

07 Dec 2001

TL;DR: This paper proposes robust video hashing as a tool for video identification: perceptual features are extracted from moving image sequences, and any sufficiently long unknown video segment is identified by efficiently matching its hash value against a large database of pre-computed hash values.

Abstract: This paper presents the concept of robust video hashing as a tool for video identification. We present considerations and a technique for (i) extracting essential perceptual features from moving image sequences and (ii) identifying any sufficiently long unknown video segment by efficiently matching the hash value of the short segment with a large database of pre-computed hash values.
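
The matching step (ii) can be sketched as a search for the database position with the lowest bit error rate. This is only an illustration under stated assumptions: per-frame hashes are taken to be 32-bit integers, the database is a plain dictionary, and `best_match`, `bits_per_hash`, and `max_ber` are hypothetical names. A brute-force scan stands in for whatever efficient index the paper actually uses, and the feature extraction of step (i) is not covered.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hash values."""
    return bin(a ^ b).count("1")


def best_match(query, database, bits_per_hash=32, max_ber=0.25):
    """Return (title, offset, bit_error_rate) of the closest segment, or None.

    query:    list of per-frame hash values for the unknown segment
    database: dict mapping a title to its list of per-frame hash values
    """
    best = None
    for title, hashes in database.items():
        # Slide the query over every possible start position in this title.
        for offset in range(len(hashes) - len(query) + 1):
            window = hashes[offset:offset + len(query)]
            errors = sum(hamming(q, h) for q, h in zip(query, window))
            ber = errors / (len(query) * bits_per_hash)
            if ber <= max_ber and (best is None or ber < best[2]):
                best = (title, offset, ber)
    return best
```

The threshold `max_ber` expresses the "robust" part: a distorted copy is accepted as long as only a small fraction of its hash bits differ from the stored ones.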

91 citations

Patent

Robust recognizer of perceptually similar content

M. Kivanc Mihcak¹, Ramarathnam Venkatesan¹

Microsoft¹

24 Apr 2001

TL;DR: This patent describes a hashing technique for recognizing the perceptual similarity of the content of digital goods: goods that contain similar content receive hash values that are proximally near each other.

Abstract: An implementation of a technology is described herein for recognizing the perceptual similarity of the content of digital goods. At least one implementation, described herein, introduces a new hashing technique. More particularly, this hashing technique produces hash values for digital goods that are proximally near each other, when the digital goods contain perceptually similar content. In other words, if the content of digital goods is perceptually similar, then their hash values are, likewise, similar. The hash values are proximally near each other. This is unlike conventional hashing techniques where the hash values of goods with perceptually similar content are far apart with high probability in some distance sense (e.g., Hamming). This abstract itself is not intended to limit the scope of this patent. The scope of the present invention is pointed out in the appended claims.
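
The contrast with conventional hashing can be illustrated with a toy example. The sketch below is not the patented technique: it swaps in a simple mean-threshold hash (`mean_threshold_hash`, a hypothetical helper) and compares it with SHA-256 on two nearly identical lists of 8-bit samples, which are assumed stand-ins for a digital good and a slightly altered copy.

```python
import hashlib


def mean_threshold_hash(samples) -> int:
    """Toy similarity-preserving hash: one bit per sample, set if above the mean."""
    mean = sum(samples) / len(samples)
    bits = 0
    for sample in samples:
        bits = (bits << 1) | (1 if sample > mean else 0)
    return bits


def sha256_int(samples) -> int:
    """Conventional cryptographic hash of the same samples (values must be 0..255)."""
    return int.from_bytes(hashlib.sha256(bytes(samples)).digest(), "big")


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hash values."""
    return bin(a ^ b).count("1")


original = [10, 200, 30, 180, 50, 220, 20, 190]
noisy = [12, 198, 31, 181, 49, 222, 19, 192]  # small perturbation of each sample

# Distance between the toy similarity-preserving hashes of the two inputs.
near = hamming(mean_threshold_hash(original), mean_threshold_hash(noisy))
# Distance between the SHA-256 digests of the same two inputs.
far = hamming(sha256_int(original), sha256_int(noisy))
print(near, far)
```

For these inputs the mean-threshold hashes coincide, while the SHA-256 digests are unrelated bit strings, which is the behavior the abstract attributes to conventional hashing.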

82 citations

Journal Article

Incremental Hash Function based on Pair Chaining & Modular Arithmetic Combining

Bok-Min Goi, Mohammad Umar Siddiqi, Hean-Teik Chuah

01 Jan 2001-Lecture Notes in Computer Science

TL;DR: This paper presents a new approach to constructing a cryptographic hash function, the Pair Chaining & Modular Arithmetic Combining Incremental Hash Function (PCIHF), which is both incremental and parallelizable.

Abstract: Most hash functions that use iterative constructions are inefficient for bulk hashing of documents with high similarity. In this paper, we present a new approach to constructing a cryptographic hash function called Pair Chaining & Modular Arithmetic Combining Incremental Hash Function (PCIHF). PCIHF has two attractive properties: incrementality and parallelizability. The security of PCIHF has also been analyzed comprehensively. Finally, we show that PCIHF is not only universal one-way but also collision-free.
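
To make incrementality concrete, here is a minimal sketch of the general idea suggested by the name: hash each pair of adjacent blocks and combine the results by modular addition. It is an assumption-laden illustration, not the PCIHF construction itself and not a vetted cryptographic design. The functions `pair_term`, `full_hash`, and `update_hash`, the use of SHA-256, and the modulus 2^256 are all choices made here for illustration.

```python
import hashlib

MOD = 2 ** 256  # assumed modulus for the combining step


def pair_term(left: bytes, right: bytes) -> int:
    """Hash one pair of adjacent blocks; the length prefix keeps the encoding unambiguous."""
    data = len(left).to_bytes(4, "big") + left + right
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def full_hash(blocks) -> int:
    """Sum the pair terms of all adjacent blocks modulo MOD.

    Each term is independent of the others, so they could be computed in parallel.
    """
    return sum(pair_term(a, b) for a, b in zip(blocks, blocks[1:])) % MOD


def update_hash(old_hash: int, blocks, index: int, new_block: bytes) -> int:
    """Hash after replacing blocks[index], recomputing only the pairs that touch it."""
    delta = 0
    if index > 0:
        delta += pair_term(blocks[index - 1], new_block)
        delta -= pair_term(blocks[index - 1], blocks[index])
    if index < len(blocks) - 1:
        delta += pair_term(new_block, blocks[index + 1])
        delta -= pair_term(blocks[index], blocks[index + 1])
    return (old_hash + delta) % MOD
```

Replacing one block costs at most four pair hashes regardless of document length, whereas an iterative construction would have to rehash everything after the changed block.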

4 citations

Journal Article

Criss-cross hash joins: design and analysis

Ram Gopal¹, Ram Ramesh, Stanley Zionts

University of Connecticut¹

01 Jul 2001-IEEE Transactions on Knowledge and Data Engineering

TL;DR: This work proposes a criss-cross hash join strategy that draws on both hashing and indexing techniques and inherits the advantages of each. Its supporting data structure, the page map, is simpler, more compact, and easier to maintain than the traditional data structures associated with index-based join methods.

Abstract: Join processing in relational database systems continues to be a difficult and challenging problem. In this research, we propose a criss-cross hash join strategy that draws from both hashing and indexing techniques, inheriting the advantages of each. To facilitate the criss-cross hash join, a simple data structure, termed page map, is introduced. The page maps aid in reducing the hashing effort incurred in the current hash based join methods. Furthermore, the page maps implicitly capture and exploit the possible inherent order among tuples in the relations, however partial it may be, to achieve superior performance. As the proposed methodology relies on the hashing scheme, the page maps are simpler, more compact, and easier to maintain than the traditional data structures associated with index based join methods. We develop the ideas intuitively first, followed by a formal development of the concepts and the algorithms. A detailed probabilistic analysis of the algorithms is presented and their performance is assessed through extensive empirical investigations. The empirical analysis suggests significant performance improvements over the current state-of-the-art hybrid hash method, especially in the presence of possible inherent order.
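
For context, the plain build-and-probe hash join that hash-based strategies start from can be sketched as follows. This is the textbook in-memory baseline, not the criss-cross method: the page map structure is not reproduced because the abstract does not specify it. The function name `hash_join` and the row-as-dictionary representation are assumptions for illustration.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Equi-join two lists of dict rows using an in-memory hash table."""
    # Build phase: hash every row of the (ideally smaller) build relation on its key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)

    # Probe phase: each probe row looks up its matching build rows by key.
    joined = []
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            joined.append({**match, **row})
    return joined


departments = [{"dept_id": 1, "dept": "db"}, {"dept_id": 2, "dept": "net"}]
employees = [
    {"emp": "ann", "dept_id": 1},
    {"emp": "bob", "dept_id": 3},
    {"emp": "cy", "dept_id": 1},
]
rows = hash_join(departments, employees, "dept_id", "dept_id")
```

Every tuple of both relations is hashed or probed exactly once here, regardless of any order already present in the data. That per-tuple hashing effort is what the abstract says page maps aim to reduce.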

3 citations