
Showing papers on "Feature hashing" published in 1998


Journal ArticleDOI
TL;DR: A new approach based on an elastic hash table, in which the geometric hash function is computed through learning and therefore makes no assumptions about the statistical characteristics of the invariants; the method was shown to perform well on both artificial and real data.
Abstract: A major problem associated with geometric hashing and the methods which have emerged from it is the nonuniform distribution of invariants over the hash space. In this paper, a new approach is proposed based on an elastic hash table. We proceed by distributing the hash bins over the invariants. The key idea is to associate the hash bins with the output nodes of a self-organizing feature map (SOFM) neural network which is trained using the invariants as training examples. In this way, the location of a hash bin in the space of invariants is determined by the weight vector of the node associated with that bin. The advantage of the proposed approach is that it adapts to the invariants through learning: it makes no assumptions about their statistical characteristics, and the geometric hash function is actually computed through learning. Furthermore, the SOFM's topology-preserving property ensures that the computed geometric hash function is well behaved.
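
As a concrete illustration of the mechanism described above, the sketch below trains a small SOFM on a set of invariants (a NumPy array of shape (N, d)) and uses each output node's learned weight vector as the location of one hash bin; hashing an invariant then amounts to finding its best-matching unit. The grid size, learning-rate schedule, and neighborhood width are illustrative assumptions, not values from the paper.

```python
import numpy as np

def train_sofm(invariants, grid=(10, 10), epochs=20, lr0=0.5, sigma0=3.0):
    """Train a SOFM whose node weights become the hash-bin locations."""
    rng = np.random.default_rng(0)
    n_nodes = grid[0] * grid[1]
    # Node weights live in invariant space; initialize from random samples.
    w = invariants[rng.choice(len(invariants), n_nodes)].astype(float)
    # Fixed 2D grid coordinates of each node, used for the neighborhood term.
    gx, gy = np.meshgrid(np.arange(grid[0]), np.arange(grid[1]))
    coords = np.stack([gx.ravel(), gy.ravel()], axis=1).astype(float)
    t, t_max = 0, epochs * len(invariants)
    for _ in range(epochs):
        for x in rng.permutation(invariants):
            lr = lr0 * (1.0 - t / t_max)              # decaying learning rate
            sigma = 0.5 + sigma0 * (1.0 - t / t_max)  # shrinking neighborhood
            bmu = np.argmin(((w - x) ** 2).sum(axis=1))  # best-matching unit
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            w += lr * np.exp(-d2 / (2 * sigma**2))[:, None] * (x - w)
            t += 1
    return w

def hash_bin(w, invariant):
    # The learned geometric hash function: an invariant maps to the bin
    # whose weight vector is nearest.
    return int(np.argmin(((w - invariant) ** 2).sum(axis=1)))

# Usage: 1000 synthetic 2D invariants with a nonuniform distribution.
invariants = np.random.default_rng(1).normal(size=(1000, 2)) ** 3
bins = train_sofm(invariants)
print(hash_bin(bins, invariants[0]))
```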

18 citations


Journal ArticleDOI
TL;DR: A non-expansive hashing scheme, in which similar inputs are stored in nearby memory locations, that allows any set drawn from a large universe to be stored compactly and retrieved efficiently.
Abstract: In a non-expansive hashing scheme, similar inputs are stored in memory locations which are close. We develop a non-expansive hashing scheme wherein any set drawn from a large universe may be stored compactly and retrieved efficiently. We explain how to use non-expansive hashing schemes for efficient storage and retrieval of noisy data. A dynamic version of this hashing scheme is presented as well.
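
A toy illustration of the non-expansive property only, not the paper's construction: the stand-in below buckets scalar keys on a grid so that similar keys land in the same or adjacent buckets, which lets a lookup recover noisy data by probing a few neighboring buckets. The cell width and probe radius are assumptions chosen for the example.

```python
from collections import defaultdict

class LocalityHash:
    """Bucket scalar keys on a grid so similar keys land in nearby buckets."""
    def __init__(self, cell=0.5):
        self.cell = cell
        self.table = defaultdict(list)   # bucket index -> [(key, value), ...]

    def _bucket(self, key):
        # Nearby keys receive nearby bucket indices: a non-expansive mapping.
        return int(key // self.cell)

    def insert(self, key, value):
        self.table[self._bucket(key)].append((key, value))

    def lookup(self, noisy_key, radius=1):
        # Probe the noisy key's bucket plus `radius` neighbors on each side,
        # then return the closest stored key, if any.
        b = self._bucket(noisy_key)
        hits = []
        for i in range(b - radius, b + radius + 1):
            hits.extend(self.table.get(i, []))
        return min(hits, key=lambda kv: abs(kv[0] - noisy_key), default=None)

h = LocalityHash()
h.insert(3.14, "pi")
h.insert(2.72, "e")
print(h.lookup(3.10))   # -> (3.14, 'pi') despite the perturbed query
```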

9 citations


Proceedings ArticleDOI
Tanveer Syeda-Mahmood1
TL;DR: This paper presents a new method of indexing image databases, called location hashing, which is based on the principle of geometric hashing and uses a special data structure, the location hash tree, to organize feature information from the images in a database.
Abstract: Queries referring to content embedded within images are an essential component of content-based search, browse, or summarize operations in image databases. Localizing such queries under changes in appearance, occlusions, and background clutter is a difficult problem for which current spatial access structures in databases are not suitable. In this paper, we present a new method of indexing image databases, called location hashing, that uses a special data structure, called the location hash tree, for organizing feature information from the images of a database. Location hashing is based on the principle of geometric hashing. It simultaneously determines the relevant images in the database, and the regions within them, which are most likely to contain the 2D pattern query, without incurring a detailed search of either. The location hash tree, being a red-black tree, allows for efficient search for candidate locations using pose-invariant feature information derived from the query.
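
A minimal sketch of the lookup idea under stated assumptions: Python has no built-in red-black tree, so a sorted list searched with bisect stands in for the balanced location hash tree; each entry maps a scalar pose-invariant key (the paper's invariants would in general be multi-dimensional) to the (image, region) it came from, and matching query invariants vote for candidate locations.

```python
import bisect
from collections import Counter

class LocationHashTree:
    """Sorted-list stand-in for the balanced (red-black) location hash tree."""
    def __init__(self):
        self.keys = []      # sorted pose-invariant values
        self.entries = []   # parallel list of (image_id, region_id)

    def insert(self, invariant, image_id, region_id):
        i = bisect.bisect_left(self.keys, invariant)
        self.keys.insert(i, invariant)
        self.entries.insert(i, (image_id, region_id))

    def candidates(self, query_invariants, tol=0.05):
        # Each stored feature whose invariant lies within `tol` of a query
        # invariant casts one vote for its (image, region) of origin.
        votes = Counter()
        for q in query_invariants:
            lo = bisect.bisect_left(self.keys, q - tol)
            hi = bisect.bisect_right(self.keys, q + tol)
            votes.update(self.entries[lo:hi])
        return votes.most_common()   # likely (image, region) pairs first

tree = LocationHashTree()
tree.insert(0.42, "img1", "regionA")
tree.insert(0.43, "img2", "regionB")
tree.insert(0.90, "img1", "regionC")
print(tree.candidates([0.425]))   # both nearby entries, ranked by votes
```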

2 citations


Journal ArticleDOI
Isidore Rigoutsos1, Alex Delis
TL;DR: A two-stage methodology that uses knowledge of the hashing function to reorganize the group assignments so that the resulting groups have similar expected cardinalities; the method is generally applicable and independent of the particular hashing function used.
Abstract: Increasingly large data sets are being stored in networked architectures. Many of the available data structures are not easily amenable to parallel realizations. Hashing schemes show promise in that respect, for the simple reason that the underlying data structure can be decomposed and spread among the set of cooperating nodes with minimal communication and maintenance requirements. In all cases, storage utilization and load balancing are issues that need to be addressed. One can identify two basic approaches to the problem: one is to address it as part of the design of the data structure that is used to store and retrieve the data; the other is to keep the data structure intact and address the problem separately. The method that we present here falls in the latter category and is applicable whenever a hash table is the preferred data structure. Intrinsically attached to the hash table is a hashing function that partitions a possibly unbounded set of data items into a finite set of groups, assigning each data item to one of the groups. In general, the hashing function cannot guarantee that the various groups will have the same cardinality on average for all possible data item distributions. In this paper, we propose a two-stage methodology that uses knowledge of the hashing function to reorganize the group assignments so that the resulting groups have similar expected cardinalities. The method is generally applicable and independent of the hashing function used. We show the power of the methodology using both synthetic and real-world databases. The derived quasi-uniform storage occupancy and associated load-balancing gains are significant.
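
A sketch of the two-stage idea under assumptions of our own: stage one estimates each group's expected cardinality from a sample of keys and the known hashing function, and stage two reorganizes the group-to-node assignment greedily (heaviest group to the lightest node) so that expected loads come out near-uniform. The greedy assignment is an assumed stand-in, not the paper's exact reorganization algorithm.

```python
import heapq
from collections import Counter

def stage1_expected_loads(sample_keys, hash_fn, n_groups):
    # Stage 1: estimate each group's expected cardinality empirically,
    # using knowledge of the hashing function and a sample of keys.
    counts = Counter(hash_fn(k) % n_groups for k in sample_keys)
    return [counts.get(g, 0) for g in range(n_groups)]

def stage2_remap(loads, n_nodes):
    # Stage 2: reorganize group assignments. Heaviest groups are placed
    # first, each onto the currently lightest node.
    heap = [(0, node) for node in range(n_nodes)]   # (node load, node id)
    heapq.heapify(heap)
    group_to_node = {}
    for g in sorted(range(len(loads)), key=lambda g: -loads[g]):
        load, node = heapq.heappop(heap)
        group_to_node[g] = node
        heapq.heappush(heap, (load + loads[g], node))
    return group_to_node

# Usage: a skewed key distribution, 64 hash groups spread over 4 nodes.
keys = [i * i for i in range(10000)]
loads = stage1_expected_loads(keys, hash, 64)
remap = stage2_remap(loads, n_nodes=4)
per_node = Counter()
for g, load in enumerate(loads):
    per_node[remap[g]] += load
print(per_node)   # near-uniform expected occupancy across the four nodes
```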

2 citations