
Showing papers by "Jeffrey Scott Vitter" published in 1985


Journal ArticleDOI
TL;DR: Theoretical and empirical results indicate that Algorithm Z outperforms current methods by a significant margin, and an efficient Pascal-like implementation is given that incorporates these modifications and that is suitable for general use.
Abstract: We introduce fast algorithms for selecting a random sample of n records without replacement from a pool of N records, where the value of N is unknown beforehand. The main result of the paper is the design and analysis of Algorithm Z; it does the sampling in one pass using constant space and in O(n(1 + log(N/n))) expected time, which is optimum, up to a constant factor. Several optimizations are studied that collectively improve the speed of the naive version of the algorithm by an order of magnitude. We give an efficient Pascal-like implementation that incorporates these modifications and that is suitable for general use. Theoretical and empirical results indicate that Algorithm Z outperforms current methods by a significant margin.

1,725 citations
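
For orientation, the one-pass, constant-space sampling problem described in the abstract above can be sketched with the simple textbook reservoir method. This is not Algorithm Z: it draws one random number per record, so it runs in O(N) time rather than the O(n(1 + log(N/n))) expected time the paper reports. The function name `reservoir_sample` and its parameters are illustrative assumptions, not taken from the paper.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], n: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return n records chosen uniformly without replacement from a stream
    whose length N is not known in advance (basic reservoir method)."""
    rng = rng or random.Random()
    reservoir: List[T] = []
    for t, record in enumerate(stream):
        if t < n:
            # The first n records fill the reservoir directly.
            reservoir.append(record)
        else:
            # Record number t+1 enters with probability n / (t + 1),
            # evicting a uniformly chosen current member.
            j = rng.randint(0, t)  # inclusive on both ends
            if j < n:
                reservoir[j] = record
    return reservoir
```

A call such as `reservoir_sample(open_records(), 10)` (with `open_records` standing in for any hypothetical iterator) holds only the 10 chosen records in memory at any time. As the abstract describes it, Algorithm Z gains its speed over methods like this one by doing far less work per record.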


Proceedings ArticleDOI
21 Oct 1985
TL;DR: An efficient new algorithm for dynamic Huffman coding, Algorithm V, is introduced; it performs one-pass coding and transmission in real time and uses at most one more bit per letter than the standard two-pass Huffman algorithm, which is optimum in the worst case among all one-pass schemes.
Abstract: We introduce an efficient new algorithm for dynamic Huffman coding, called Algorithm V. It performs one-pass coding and transmission in real-time, and uses at most one more bit per letter than does the standard two-pass Huffman algorithm; this is optimum in the worst case among all one-pass schemes. We also analyze the dynamic Huffman algorithm due to Faller, Gallager, and Knuth. In each algorithm, both the sender and the receiver maintain equivalent dynamically varying Huffman trees. The processing time required to encode and decode a letter whose node in the dynamic Huffman tree is currently on the lth level is O(l); hence, the processing can be done in real time. Empirical tests show that Algorithm V performs quite well in practice, often better than the two-pass method. The proposed algorithm is well-suited for file compression and online encoding/decoding in data networks.

35 citations
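
The bound in the abstract above is stated relative to the standard two-pass Huffman algorithm. As a point of reference, that baseline can be sketched as follows: the first pass counts letter frequencies, and the second pass would encode with the resulting code. This is not Algorithm V and does not maintain a dynamically varying tree; the function names `huffman_code_lengths` and `encoded_bits` are illustrative assumptions.

```python
import heapq
from collections import Counter
from typing import Dict


def huffman_code_lengths(message: str) -> Dict[str, int]:
    """Pass 1: count frequencies and build optimal codeword lengths."""
    freq = Counter(message)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct letter still needs a one-bit codeword.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, tiebreak, {letter: depth within this subtree}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every leaf one level deeper.
        merged = {s: d + 1 for s, d in d1.items()}
        merged.update({s: d + 1 for s, d in d2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def encoded_bits(message: str) -> int:
    """Pass 2 (cost only): total bits used to encode the message."""
    lengths = huffman_code_lengths(message)
    return sum(lengths[c] for c in message)
```

For a message of t letters costing S bits under this baseline (excluding the cost of transmitting the code itself), the abstract's claim is that Algorithm V uses at most S + t bits while reading the input only once.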


Journal ArticleDOI
TL;DR: A practical disk-efficient I/O interface is demonstrated, and its I/O performance is shown in many cases to be optimum, up to a constant factor, among all disk-efficient interfaces.
Abstract: We introduce the notion of an I/O interface for optical digital (write-once) disks, which is quite different from earlier research. The purpose of an I/O interface is to allow existing operating systems and application programs that use magnetic disks to use optical disks instead, with minimal change. We define what it means for an I/O interface to be disk-efficient. We demonstrate a practical disk-efficient I/O interface and show that its I/O performance in many cases is optimum, up to a constant factor, among all disk-efficient interfaces. The interface is most effective for applications that are not update-intensive. An additional capability is a built-in history mechanism that provides software support for accessing previous versions of records. Even if not implemented, the I/O interface can be used as a programming tool to develop efficient special purpose applications for use with optical disks.

32 citations



Journal ArticleDOI
TL;DR: This paper gives strong evidence for the conjecture that the varied-insertion coalesced hashing method (VICH) is optimum among all direct chaining algorithms in the restricted class studied, by showing that VICH is optimum under fairly general conditions.
Abstract: Direct chaining is a popular and efficient class of hashing algorithms. In this paper we study optimum algorithms among direct chaining methods, under the restrictions that the records in the hash table are not moved after they are inserted, that for each chain the relative ordering of the records in the chain does not change after more insertions, and that only one link field is used per table slot. The varied-insertion coalesced hashing method (VICH), which is proposed and analyzed in [CV84], is conjectured to be optimum among all direct chaining algorithms in this class. We give strong evidence in favor of the conjecture by showing that VICH is optimum under fairly general conditions.

3 citations
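
The restrictions named in the abstract above (records are never moved after insertion, chain order never changes, one link field per slot) can be made concrete with a minimal sketch of plain coalesced hashing in which colliding records are appended to the end of a chain. This is not VICH, whose insertion rule is defined in [CV84] and is not reproduced here; the class name, the top-down free-slot scan, and the use of Python's built-in `hash` are all assumptions made for illustration.

```python
from typing import Hashable, List, Optional


class CoalescedHashTable:
    """Direct chaining inside the table itself, one link field per slot."""

    def __init__(self, size: int) -> None:
        self.keys: List[Optional[Hashable]] = [None] * size
        self.link: List[int] = [-1] * size  # -1 marks the end of a chain
        self.free = size - 1  # scan pointer for empty slots, moving downward

    def _home(self, key: Hashable) -> int:
        return hash(key) % len(self.keys)

    def insert(self, key: Hashable) -> int:
        """Store key and return its slot; stored records are never moved."""
        i = self._home(key)
        if self.keys[i] is None:
            self.keys[i] = key
            return i
        # Home slot is taken: walk to the end of the chain passing through it.
        while True:
            if self.keys[i] == key:
                return i  # already present
            if self.link[i] == -1:
                break
            i = self.link[i]
        # Find the highest-numbered empty slot and link it onto the chain.
        while self.free >= 0 and self.keys[self.free] is not None:
            self.free -= 1
        if self.free < 0:
            raise OverflowError("hash table is full")
        self.keys[self.free] = key
        self.link[i] = self.free
        return self.free

    def contains(self, key: Hashable) -> bool:
        i = self._home(key)
        while i != -1 and self.keys[i] is not None:
            if self.keys[i] == key:
                return True
            i = self.link[i]
        return False
```

Because a colliding record takes whichever slot is empty, chains with different home addresses can merge ("coalesce"). The position at which a new record is linked into such a chain is the design choice that distinguishes the variants the paper compares.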



Proceedings Article
01 Jan 1985
TL;DR: An efficient new algorithm for dynamic Huffman coding, Algorithm V, is introduced; it performs one-pass coding and transmission in real time and uses at most one more bit per letter than the standard two-pass Huffman algorithm, which is optimum in the worst case among all one-pass schemes.
Abstract: We introduce an efficient new algorithm for dynamic Huffman coding, called Algorithm V. It performs one-pass coding and transmission in real-time, and uses at most one more bit per letter than does the standard two-pass Huffman algorithm; this is optimum in the worst case among all one-pass schemes. We also analyze the dynamic Huffman algorithm due to Faller, Gallager, and Knuth. In each algorithm, both the sender and the receiver maintain equivalent dynamically varying Huffman trees. The processing time required to encode and decode a letter whose node in the dynamic Huffman tree is currently on the lth level is O(l); hence, the processing can be done in real time. Empirical tests show that Algorithm V performs quite well in practice, often better than the two-pass method. The proposed algorithm is well-suited for file compression and online encoding/decoding in data networks.

1 citation