Topic

Hash table

About: Hash table is a research topic. Over its lifetime, 5,080 publications have appeared within this topic, receiving 119,312 citations. The topic is also known as: hash map.


Papers
Proceedings Article
27 Aug 2001
TL;DR: The concept of a Content-Addressable Network (CAN) as a distributed infrastructure that provides hash table-like functionality on Internet-like scales is introduced and its scalability, robustness and low-latency properties are demonstrated through simulation.
Abstract: Hash tables - which map "keys" onto "values" - are an essential building block in modern software systems. We believe a similar functionality would be equally valuable to large distributed systems. In this paper, we introduce the concept of a Content-Addressable Network (CAN) as a distributed infrastructure that provides hash table-like functionality on Internet-like scales. The CAN is scalable, fault-tolerant and completely self-organizing, and we demonstrate its scalability, robustness and low-latency properties through simulation.

6,703 citations

Book Chapter
07 Mar 2002
TL;DR: In this paper, the authors describe a peer-to-peer distributed hash table with provable consistency and performance in a fault-prone environment, which routes queries and locates nodes using a novel XOR-based metric topology.
Abstract: We describe a peer-to-peer distributed hash table with provable consistency and performance in a fault-prone environment. Our system routes queries and locates nodes using a novel XOR-based metric topology that simplifies the algorithm and facilitates our proof. The topology has the property that every message exchanged conveys or reinforces useful contact information. The system exploits this information to send parallel, asynchronous query messages that tolerate node failures without imposing timeout delays on users.

3,196 citations
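
The XOR-based metric mentioned in the abstract above can be illustrated with a short Python sketch. It assumes node IDs are plain non-negative integers and that the distance between two IDs is their bitwise XOR read as an integer. The names `xor_distance` and `closest_nodes` are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
def xor_distance(a: int, b: int) -> int:
    """Distance between two IDs: their bitwise XOR read as an unsigned integer."""
    return a ^ b


def closest_nodes(target: int, node_ids: list[int], k: int = 2) -> list[int]:
    """Return the k known node IDs closest to `target` under the XOR metric."""
    return sorted(node_ids, key=lambda n: xor_distance(n, target))[:k]


if __name__ == "__main__":
    known = [0b0001, 0b1001, 0b1100, 0b0111]  # made-up 4-bit IDs
    print(closest_nodes(0b1000, known))
```

Under this metric the distance from an ID to itself is zero and the distance is symmetric, so two nodes always agree on how far apart they are.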

Proceedings Article
Wei Liu, Jun Wang, Rongrong Ji, Yu-Gang Jiang, Shih-Fu Chang
16 Jun 2012
TL;DR: The authors propose a kernel-based supervised hashing model that needs only a limited amount of supervised information (similar and dissimilar data pairs) and a feasible training cost to achieve high-quality hashing, and that significantly outperforms the state of the art in searching both metric distance neighbors and semantically similar neighbors.
Abstract: Recent years have witnessed the growing popularity of hashing in large-scale vision problems. It has been shown that the hashing quality could be boosted by leveraging supervised information into hash function learning. However, the existing supervised methods either lack adequate performance or often incur cumbersome model training. In this paper, we propose a novel kernel-based supervised hashing model which requires a limited amount of supervised information, i.e., similar and dissimilar data pairs, and a feasible training cost in achieving high quality hashing. The idea is to map the data to compact binary codes whose Hamming distances are minimized on similar pairs and simultaneously maximized on dissimilar pairs. Our approach is distinct from prior works by utilizing the equivalence between optimizing the code inner products and the Hamming distances. This enables us to sequentially and efficiently train the hash functions one bit at a time, yielding very short yet discriminative codes. We carry out extensive experiments on two image benchmarks with up to one million samples, demonstrating that our approach significantly outperforms the state-of-the-arts in searching both metric distance neighbors and semantically similar neighbors, with accuracy gains ranging from 13% to 46%.

1,461 citations
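
The link between code inner products and Hamming distances mentioned in the abstract above can be checked numerically. The sketch below assumes codes with entries in {-1, +1} and uses the standard identity inner(a, b) = r - 2 * hamming(a, b) for codes of length r. The helper names and the sample codes are made up for illustration, and the paper's exact formulation may differ.

```python
def hamming(a: list[int], b: list[int]) -> int:
    """Number of positions at which the two codes differ."""
    return sum(x != y for x, y in zip(a, b))


def inner(a: list[int], b: list[int]) -> int:
    """Inner product of two codes with entries in {-1, +1}."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    a = [1, -1, 1, 1]   # made-up 4-bit codes
    b = [1, 1, -1, 1]
    r = len(a)
    # Each agreeing bit adds +1 and each differing bit adds -1 to the inner product.
    print(inner(a, b) == r - 2 * hamming(a, b))
```

Because the two quantities are related by an affine map with a negative slope, making the inner product large for a pair is the same as making its Hamming distance small, and vice versa.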

Proceedings Article
30 Jul 2011
TL;DR: KenLM is a library that implements two data structures for efficient language model queries, reducing both time and memory costs; it is integrated into the Moses, cdec, and Joshua translation systems.
Abstract: We present KenLM, a library that implements two data structures for efficient language model queries, reducing both time and memory costs. The Probing data structure uses linear probing hash tables and is designed for speed. Compared with the widely-used SRILM, our Probing model is 2.4 times as fast while using 57% of the memory. The Trie data structure is a trie with bit-level packing, sorted records, interpolation search, and optional quantization aimed at lower memory consumption. Trie simultaneously uses less memory than the smallest lossless baseline and less CPU than the fastest baseline. Our code is open-source, thread-safe, and integrated into the Moses, cdec, and Joshua translation systems. This paper describes the several performance techniques used and presents benchmarks against alternative implementations.

1,297 citations
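
For readers unfamiliar with linear probing, the technique named in the abstract above, here is a minimal Python sketch. It assumes a fixed capacity with no deletion or resizing, and the class name `LinearProbingTable` is hypothetical. It illustrates the general technique only and does not reflect KenLM's actual data layout or API.

```python
class LinearProbingTable:
    """Open-addressing hash table: on a collision, try the next slot in order."""

    def __init__(self, capacity: int = 8):
        self._slots = [None] * capacity  # each slot is None or a (key, value) pair

    def _find(self, key):
        """Return the index holding `key` or the first empty slot, else None if full."""
        n = len(self._slots)
        i = hash(key) % n
        for _ in range(n):
            slot = self._slots[i]
            if slot is None or slot[0] == key:
                return i
            i = (i + 1) % n  # linear probe: step to the adjacent slot, wrapping around
        return None

    def put(self, key, value):
        i = self._find(key)
        if i is None:
            raise RuntimeError("table is full")
        self._slots[i] = (key, value)

    def get(self, key, default=None):
        i = self._find(key)
        if i is None or self._slots[i] is None:
            return default
        return self._slots[i][1]


if __name__ == "__main__":
    table = LinearProbingTable(capacity=4)
    table.put(0, "a")
    table.put(4, "b")  # collides with key 0 at slot 0, lands in slot 1
    print(table.get(4), table.get(12))
```

Colliding keys end up in adjacent slots, so a lookup scans a short contiguous run of memory, which is one common reason this layout is chosen when speed matters.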

Journal Article
TL;DR: Hash tables are an essential building block in modern software systems, and the authors argue that a similar functionality would be equally valuable to large distributed systems.
Abstract: Hash tables - which map "keys" onto "values" - are an essential building block in modern software systems. We believe a similar functionality would be equally valuable to large distributed systems....

1,254 citations


Network Information
Related Topics (5)
Server
79.5K papers, 1.4M citations
91% related
Network packet
159.7K papers, 2.2M citations
86% related
Wireless sensor network
142K papers, 2.4M citations
85% related
Encryption
98.3K papers, 1.4M citations
85% related
Wireless network
122.5K papers, 2.1M citations
85% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    59
2022    112
2021    115
2020    181
2019    276
2018    259