Showing papers on "Feature hashing" published in 2007

Proceedings Article

Robust Hash for Detecting and Localizing Image Tampering

Institute for Infocomm Research Singapore¹

12 Nov 2007

TL;DR: This paper presents an image hashing method that not only detects but also localizes tampering using a small signature (< 1 kB), and demonstrates its efficacy compared with existing methods.

Abstract: An image hash should be (1) robust to allowable operations and (2) sensitive to illegal manipulations and distinct queries. Some applications also require the hash to be able to localize image tampering. This requires the hash to contain both robust content and alignment information to meet the above criteria. Fulfilling this is difficult because of two contradictory requirements. First, the hash should be small; second, to verify authenticity and then localize tampering, the hash would need to carry a large amount of information about the original. Hence a tradeoff between these requirements needs to be found. This paper presents an image hashing method that addresses this concern, to not only detect but also localize tampering using a small signature (< 1 kB). Illustrative experiments bring out the efficacy of the proposed method compared to existing methods.

128 citations

Proceedings Article

Principles of hash-based text retrieval

Benno Stein¹

Bauhaus University, Weimar¹

23 Jul 2007

TL;DR: The design principles behind hash-based search methods are revealed and it is shown how optimum hash functions for similarity search can be derived and the rationale of their effectiveness is explained.

Abstract: Hash-based similarity search reduces a continuous similarity relation to the binary concept "similar or not similar": two feature vectors are considered as similar if they are mapped onto the same hash key. In terms of runtime performance this principle is unequaled, while at the same time being unaffected by dimensionality concerns. Similarity hashing is applied with great success for near similarity search in large document collections, and it is considered as a key technology for near-duplicate detection and plagiarism analysis. This paper reveals the design principles behind hash-based search methods and presents them in a unified way. We introduce new stress statistics that are suited to analyze the performance of hash-based search methods, and we explain the rationale of their effectiveness. Based on these insights, we show how optimum hash functions for similarity search can be derived. We also present new results of a comparative study between different hash-based search methods.

97 citations
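
The core idea in the abstract above, that two feature vectors count as similar exactly when they map to the same hash key, can be illustrated with a small sketch. The random-hyperplane scheme used here is a commonly used stand-in chosen for illustration; it is not claimed to be the hash function studied in the paper, and the names `make_hash`, `build_index` and `candidates` are hypothetical.

```python
import random
from collections import defaultdict


def make_hash(dim, bits, seed=0):
    """Return a function mapping a vector to a `bits`-bit integer key.

    Assumption: random hyperplanes are used as the hash family; each bit
    records on which side of one hyperplane the vector lies.
    """
    rng = random.Random(seed)
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]

    def hash_key(vec):
        key = 0
        for plane in planes:
            dot = sum(p * v for p, v in zip(plane, vec))
            key = (key << 1) | (1 if dot >= 0 else 0)
        return key

    return hash_key


def build_index(vectors, hash_key):
    """Group item names by hash key; a shared bucket means 'similar'."""
    index = defaultdict(list)
    for name, vec in vectors.items():
        index[hash_key(vec)].append(name)
    return index


def candidates(query, index, hash_key):
    """Look up the bucket of the query: one hash, no pairwise comparison."""
    return index.get(hash_key(query), [])


if __name__ == "__main__":
    h = make_hash(dim=3, bits=8, seed=1)
    docs = {"a": [1.0, 2.0, 3.0], "b": [2.0, 4.0, 6.0], "c": [-1.0, -2.0, -3.0]}
    idx = build_index(docs, h)
    print(candidates([1.0, 2.0, 3.0], idx, h))
```

The lookup cost does not depend on the number of indexed vectors, which is the runtime advantage the abstract refers to; the price is that similarity becomes a yes/no decision.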

Journal Article

Multi-resolution similarity hashing

Vassil Roussev¹, Golden G. Richard¹, Lodovico Marziale¹

University of New Orleans¹

01 Sep 2007, Digital Investigation

TL;DR: The essential idea is to produce an efficient and scalable hashing scheme, called a multi-resolution similarity hash (or MRS hash), that can be used to supplement traditional cryptographic hashing during the initial pass over the raw evidence; it is a generalization of recent work in the area.

75 citations

Proceedings Article

External perfect hashing for very large key sets

Fabiano C. Botelho¹, Nivio Ziviani¹

Universidade Federal de Minas Gerais¹

06 Nov 2007

TL;DR: The main contribution is the first algorithm that has experimentally proven practicality for sets in the order of billions of keys and has time and space usage carefully analyzed without unrealistic assumptions.

Abstract: We present a simple and efficient external perfect hashing scheme (referred to as EPH algorithm) for very large static key sets. We use a number of techniques from the literature to obtain a novel scheme that is theoretically well-understood and at the same time achieves an order-of-magnitude increase in the size of the problem to be solved compared to previous "practical" methods. We demonstrate the scalability of our algorithm by constructing minimum perfect hash functions for a set of 1.024 billion URLs from the World Wide Web of average length 64 characters in approximately 62 minutes, using a commodity PC. Our scheme produces minimal perfect hash functions using approximately 3.8 bits per key. For perfect hash functions in the range {0,...,2n - 1} the space usage drops to approximately 2.7 bits per key. The main contribution is the first algorithm that has experimentally proven practicality for sets in the order of billions of keys and has time and space usage carefully analyzed without unrealistic assumptions.

55 citations
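
To put the figures quoted in the abstract above into perspective, the following sketch works through the arithmetic. The key count, bits-per-key rates, average key length and construction time are taken from the abstract; treating one URL character as one byte is an assumption made here, and the function name is hypothetical.

```python
def structure_size_bytes(num_keys, bits_per_key):
    """Space needed to describe a hash function at a given bits-per-key rate."""
    return num_keys * bits_per_key / 8


NUM_KEYS = 1_024_000_000   # 1.024 billion URLs (from the abstract)
AVG_KEY_BYTES = 64         # average URL length; assumes 1 character = 1 byte
BUILD_MINUTES = 62         # approximate construction time (from the abstract)

minimal_bytes = structure_size_bytes(NUM_KEYS, 3.8)      # minimal perfect hash
non_minimal_bytes = structure_size_bytes(NUM_KEYS, 2.7)  # perfect hash, larger range
raw_key_bytes = NUM_KEYS * AVG_KEY_BYTES
keys_per_second = NUM_KEYS / (BUILD_MINUTES * 60)

print(f"minimal perfect hash function: {minimal_bytes / 1e6:.1f} MB")      # about 486.4 MB
print(f"non-minimal perfect hash:      {non_minimal_bytes / 1e6:.1f} MB")  # about 345.6 MB
print(f"raw key set:                   {raw_key_bytes / 1e9:.1f} GB")      # about 65.5 GB
print(f"construction throughput:       {keys_per_second:,.0f} keys/s")     # about 275,269
```

Under these assumptions the function description is roughly two orders of magnitude smaller than the key set it indexes, which is why the keys themselves have to be processed externally while the resulting function fits in main memory.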

Book Chapter

Content based image hashing via wavelet and radon transform

Xin C. Guo¹, Dimitrios Hatzinakos¹

University of Toronto¹

11 Dec 2007

TL;DR: Results show that the proposed method can resist perceptually insignificant modifications such as compression, filtering, scaling and rotation and is also able to successfully detect content changing attacks such as insertion of foreign objects.

Abstract: Image hash function based on the image content has applications in watermarking, authentication and image retrieval. This paper presents an algorithm for generating an image hash that is robust against content-preserving modifications and at the same time, is capable of detecting malicious tampering. Robust features are first extracted from the discrete wavelet transform followed by the Radon transform. Probabilistic quantization is then used to map the feature values to a binary sequence. Results show that the proposed method can resist perceptually insignificant modifications such as compression, filtering, scaling and rotation. It is also able to successfully detect content changing attacks such as insertion of foreign objects.

32 citations

Journal Article

Recent development of perceptual image hashing

Wang Shuo-zhong¹, Zhang Xin-peng¹

Shanghai University¹

01 Aug 2007, Journal of Shanghai University (English Edition)

TL;DR: This article reviews some representative image hashing techniques proposed in recent years, with emphasis on how to meet the conflicting requirements of perceptual robustness and security, and introduces two image hashing approaches developed in the authors' own research.

Abstract: The easy generation, storage, transmission and reproduction of digital images have caused serious abuse and security problems. Assurance of the rightful ownership, integrity, and authenticity is a major concern to academia as well as industry. On the other hand, efficient search of the huge amount of images has become a great challenge. Image hashing is a technique suitable for use in image authentication and content based image retrieval (CBIR). In this article, we review some representative image hashing techniques proposed in recent years, with emphasis on how to meet the conflicting requirements of perceptual robustness and security. Following a brief introduction to some earlier methods, we focus on a typical two-stage structure and some geometric-distortion resilient techniques. We then introduce two image hashing approaches developed in our own research, and reveal security problems in some existing methods due to the absence of secret keys in certain stages of the image feature extraction, or availability of a large quantity of images, keys, or the hash function to the adversary. More research efforts are needed in developing truly robust and secure image hashing techniques.

22 citations

Book Chapter

Biometric hashing based on genetic selection and its application to on-line signatures

M.R. Freire¹, Julian Fierrez¹, Javier Galbally¹, Javier Ortega-Garcia¹

Autonomous University of Madrid¹

27 Aug 2007

TL;DR: A general biometric hash generation scheme based on vector quantization of multiple feature subsets selected with genetic optimization, which overcomes the dimensionality problem of other hash generation algorithms and makes it possible to exploit all the discriminative information found in large feature sets.

Abstract: We present a general biometric hash generation scheme based on vector quantization of multiple feature subsets selected with genetic optimization. The quantization of subsets overcomes the dimensionality problem of other hash generation algorithms, while the feature selection step using an integer-coding genetic algorithm makes it possible to exploit all the discriminative information found in large feature sets. We provide experimental results of the proposed hashing for verification of on-line signatures. Development and evaluation experiments are reported on the MCYT signature database, comprising 16,500 signatures from 330 subjects.

19 citations

Book Chapter

A secure and robust wavelet-based hashing scheme for image authentication

Fawad Ahmed¹, Mohammed Yakoob Siyal¹

Nanyang Technological University¹

09 Jan 2007

TL;DR: This paper presents a novel hashing scheme that is resilient to allowable non-malicious manipulations such as JPEG compression and high-pass filtering, and is sensitive enough to detect tampering with precise localization.

Abstract: The purpose of an image hash is to provide a compact representation of the whole image. Designing a good image hash function requires careful consideration of many issues such as robustness, security and tamper detection with precise localization. In this paper, we present a novel hashing scheme that addresses these issues in a unified framework. We analyze the security issues in image hashing and present new ideas to counter some of the attacks that we shall describe in this paper. Our proposed scheme is resilient to allowable non-malicious manipulations such as JPEG compression and high-pass filtering, and is sensitive enough to detect tampering with precise localization. Several experimental results are presented to demonstrate the effectiveness of the proposed scheme.

14 citations

Applying Hash-based Indexing in Text-based Information Retrieval

Benno Stein, Martin Potthast

01 Jan 2007

TL;DR: This analysis shows the potential of tailored hash-based indexing methods, identifies basic retrieval tasks which can benefit from this new technology, relates them to well-known applications and discusses how hash-based indexing is applied.

Abstract: Hash-based indexing is a powerful technology for similarity search in large document collections [13]. The central idea is the interpretation of hash collisions as similarity indication, provided that an appropriate hash function is given. In this paper we identify basic retrieval tasks which can benefit from this new technology, we relate them to well-known applications and discuss how hash-based indexing is applied. Moreover, we present two recently developed hash-based indexing approaches and compare the achieved performance improvements in real-world retrieval settings. This analysis, which has not been conducted in this or a similar form until now, shows the potential of tailored hash-based indexing methods.

12 citations

Patent

Scaling machine learning using approximate counting that uses feature hashing

Simon Tong¹, Noam Shazeer¹

Google¹

16 May 2007

TL;DR: A system may track statistics for a number of features using an approximate counting technique by subjecting each feature to multiple, different hash functions to generate multiple, different hash values, where each of the hash values may identify a particular location in a memory, and storing statistics for each feature at the particular locations identified by the hash values.

Abstract: A system may track statistics for a number of features using an approximate counting technique by: subjecting each feature to multiple, different hash functions to generate multiple, different hash values, where each of the hash values may identify a particular location in a memory, and storing statistics for each feature at the particular locations identified by the hash values. The system may generate rules for a model based on the tracked statistics.

8 citations
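
The approximate counting scheme described in the abstract above (several different hash functions per feature, each selecting a memory location where statistics are stored) can be sketched as follows. This is a minimal illustration, not the patented system: the class name `ApproxCounter`, the use of salted SHA-256 to obtain different hash functions, the table size, and reading a count back as the minimum over the locations are all assumptions made here. The minimum rule is borrowed from count-min-style sketches; the abstract does not say how statistics are read back.

```python
import hashlib


class ApproxCounter:
    """Track per-feature counts in a fixed-size memory using several hashes."""

    def __init__(self, num_hashes=4, table_size=1024):
        self.num_hashes = num_hashes
        self.table_size = table_size
        self.table = [0] * table_size  # the shared "memory"

    def _locations(self, feature):
        # Assumption: different hash functions are obtained by salting
        # SHA-256 with the function index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{feature}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.table_size

    def add(self, feature, count=1):
        """Store the statistic at every location the feature hashes to."""
        for loc in self._locations(feature):
            self.table[loc] += count

    def estimate(self, feature):
        """Read back the smallest stored value; it never undercounts."""
        return min(self.table[loc] for loc in self._locations(feature))


if __name__ == "__main__":
    counter = ApproxCounter()
    for token in ["click", "click", "view", "click"]:
        counter.add(token)
    print(counter.estimate("click"), counter.estimate("view"))
```

Memory use is fixed by `table_size` regardless of how many distinct features appear; collisions can only inflate an estimate, and using several hash functions makes it less likely that every location of a feature is inflated at once.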

Proceedings Article

Image authentication using soft hashing technique

Fawad Ahmed¹, Mohammed Yakoob Siyal¹

Nanyang Technological University¹

01 Dec 2007

TL;DR: An image hashing technique that attempts to simultaneously address the robustness, fragility and security issues is presented; it improves on the authors' previously proposed scheme with a wavelet-based smoothening to improve robustness against JPEG compression and a modified intensity-transformation for enhancing the security.

Abstract: Designing a hash function for multimedia authentication encompasses many issues like robustness to non-malicious distortion, sensitivity to detect malicious manipulations and security. In this paper, we present an image hashing technique that attempts to simultaneously address the robustness, fragility and security issues. This scheme is an improved version of our previously proposed scheme [1], with a wavelet-based smoothening to improve robustness against JPEG compression and a modified intensity-transformation for enhancing the security. Several experimental results are presented to demonstrate the effectiveness of the proposed scheme.

Journal Article

Optimal XOR hashing for non-uniformly distributed address lookup in computer networks

Christopher J. Martinez¹, Wei-Ming Lin¹, Parimal Patel¹

University of Texas at San Antonio¹

01 Nov 2007, Journal of Network and Computer Applications

TL;DR: This paper addresses the cases in which the distribution follows a natural negative linear distribution, a partial negative linear distribution, or an exponential distribution, which are found to closely approximate many real-life database distributions, and derives a general formula for calculating the distribution variance produced by any given non-overlapping bit-grouping XOR hashing function.
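
As a rough illustration of what a non-overlapping bit-grouping XOR hash looks like, the sketch below splits an address into consecutive, non-overlapping groups of bits and XORs the groups into a table index, then measures how unevenly a set of addresses spreads over the buckets. Grouping consecutive bits is an assumption made here for simplicity; the paper is about choosing groupings that are optimal for a given distribution, which this sketch does not attempt. The names `xor_fold` and `bucket_variance` are hypothetical.

```python
def xor_fold(address, key_bits=32, index_bits=8):
    """XOR consecutive, non-overlapping `index_bits`-wide groups of the address."""
    mask = (1 << index_bits) - 1
    address &= (1 << key_bits) - 1  # keep only the low `key_bits` bits
    index = 0
    while address:
        index ^= address & mask
        address >>= index_bits
    return index


def bucket_variance(addresses, index_bits=8):
    """Variance of bucket occupancy; 0.0 means a perfectly even spread."""
    counts = [0] * (1 << index_bits)
    for addr in addresses:
        counts[xor_fold(addr, index_bits=index_bits)] += 1
    mean = len(addresses) / len(counts)
    return sum((c - mean) ** 2 for c in counts) / len(counts)


if __name__ == "__main__":
    # 0xC0A80001 is 192.168.0.1: 0xC0 ^ 0xA8 ^ 0x00 ^ 0x01 == 0x69
    print(hex(xor_fold(0xC0A80001)))
    print(bucket_variance(range(256)))
```

A lower occupancy variance means fewer long collision chains during lookup, which is the quantity the derived formula is said to predict for a given grouping.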

Proceedings Article

Partial Geometric Hashing for Retrieving Similar Interaction Protein Using Profile

Y. Kiuchi¹, Tomonobu Ozaki¹, Takenao Ohkawa¹

Kobe University¹

02 Apr 2007

TL;DR: This paper proposes a method for retrieving similar interaction proteins using profiles that represent the features of the interaction site binding to a certain compound, based on the geometric hashing technique.

Abstract: Protein function is expressed by binding to other compounds at a local portion, called an interaction site. Since the structure of the interaction site and the function of a protein are closely related, retrieving similar interaction proteins is effective in clarifying the function of a protein. We have proposed a method for retrieving similar interaction proteins using profiles that represent the features of the interaction site binding to a certain compound. Because this method must compare structures between proteins and a profile, we use the geometric hashing technique, which is one of the popular methods for structure comparison. However, the problem with structure comparison using geometric hashing is that memory usage becomes too large. This paper proposes a method for arranging the geometric hashing to alleviate this problem. First, only small parts of the target structures are stored in the hash table to reduce the size of the hash table. By evaluating this hash table, we screen out candidates of similar structures between target proteins and query profiles. Second, overall structures are compared for these candidates. In order to reduce the time for retrieval, we evaluate the information of the origin, which is not generally evaluated, without increasing the size of the hash table. Reference set, the basis for transforming in geometric hashing, are sorted

Proceedings Article

A PCB Component Location Method Based on Image Hashing

Yu Longjiang¹, Yang Ying¹, Sun Sheng-he¹

Harbin Institute of Technology¹

01 Aug 2007

TL;DR: A novel method based on image hashing is proposed to locate the acquired component from a CCD DSC camera, and is verified by experimental results.

Abstract: A novel method based on image hashing is proposed in this paper to locate the acquired component from a CCD DSC camera. An image hash that is resistant to geometric distortions is extracted. The extracted hash is used to identify the component and related defects, including missing components, wrong components, and inverse orientation. The proposed method is verified by experimental results.

Patent

Methods of offering guidance on common language usage utilizing a hashing function consisting of a hash triplet

David T. Lorenzen, Nicholas J. Witchey

07 May 2007

TL;DR: A hash triplet, consisting of a hash for each document word and two hashes involving the word and its preceding and following words, is used to gauge conformity of a document to one or more standards, to provide suggestions to the author, and to filter email.

Abstract: Usages of language are analyzed in ways that are at least partially language independent. In preferred embodiments, portions of a document are hashed, and the resulting hash values are compared with each other and with those of other documents in real-time. Analyses can be used to gauge conformity of a document to one or more standards utilizing a hash triplet consisting of a hash for each document word and two involving the word and its preceding and following words, to provide suggestions to the author, and to filter email.
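
One possible reading of the hash triplet described in the abstract above is sketched here: for every word, one hash of the word alone, one of the word with its preceding word, and one of the word with its following word. The whitespace tokenizer, the truncated SHA-1 digests, the separator byte and the function names are assumptions made for illustration; the patent text quoted here does not specify them.

```python
import hashlib


def _h(*parts):
    """Short hash of one or more strings (hypothetical helper)."""
    data = "\x1f".join(parts).encode("utf-8")
    return hashlib.sha1(data).hexdigest()[:8]


def hash_triplets(text):
    """Return one (word, prev+word, word+next) hash triplet per word."""
    words = text.lower().split()  # assumption: plain whitespace tokenization
    triplets = []
    for i, word in enumerate(words):
        prev_word = words[i - 1] if i > 0 else ""
        next_word = words[i + 1] if i + 1 < len(words) else ""
        triplets.append((_h(word), _h(prev_word, word), _h(word, next_word)))
    return triplets


if __name__ == "__main__":
    for triplet in hash_triplets("the cat saw the dog"):
        print(triplet)
```

Because only hash values are compared, the same procedure applies to any language that can be split into words, which matches the abstract's point about partial language independence. The two context hashes distinguish the same word used in different surroundings.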