Journal ArticleDOI

A space efficient direct access data structure

01 Mar 2017-Journal of Discrete Algorithms (Elsevier)-Vol. 43, pp 26-37
TL;DR: The pruning procedure is improved, and empirical evidence is given that when memory storage is the main concern, the suggested data structure outperforms other direct access techniques such as those due to Külekci, DACs and sampling, at the cost of a slowdown as compared to DACs and fixed-length encoding.
About: This article is published in Journal of Discrete Algorithms. The article was published on 2017-03-01 and is currently open access. It has received 15 citations to date. The article focuses on the topics: Canonical Huffman code & Huffman coding.
Citations
Book ChapterDOI
01 Apr 2021
TL;DR: A new dynamic Huffman encoding approach is proposed that provably always performs at least as well as static Huffman coding, and may be better than standard dynamic Huffman coding for certain files.
Abstract: Huffman coding is known to be optimal, yet its dynamic version may yield smaller compressed files. The best known bound is that the number of bits used by dynamic Huffman coding to encode a message of n characters is at most n bits more than the number of bits required by static Huffman coding. In particular, dynamic Huffman coding can also generate a larger encoded file than the static variant, though in practice the file is often, but not always, smaller. We propose here a new dynamic Huffman encoding approach that provably always performs at least as well as static Huffman coding, and may be better than standard dynamic Huffman coding for certain files. This is achieved by reversing the direction of the references of the encoded elements to those forming the model of the encoding: instead of pointing backwards, they look into the future.
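As a point of reference for the static baseline the abstract compares against, static Huffman construction can be sketched in a few lines (an illustrative sketch, not the paper's algorithm; all names here are our own):

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a static Huffman code for `text`; returns {symbol: bitstring}."""
    freq = Counter(text)
    if len(freq) == 1:  # degenerate single-symbol alphabet
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tiebreak, tree); a tree is a symbol or a (left, right) pair.
    heap = [(w, i, s) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        # Repeatedly merge the two lightest subtrees.
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, count, (t1, t2)))
        count += 1
    code = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            code[tree] = prefix
    walk(heap[0][2], "")
    return code
```

Dynamic variants rebuild or adjust this tree as the message is processed; the approach above instead fixes the code once from the full frequency table.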

8 citations

Journal ArticleDOI
TL;DR: It is found that there is much variability in the randomness of the output of lossless compression techniques, and evidence is presented that arithmetic coding may produce an output identical to that of Huffman coding.
Abstract: It seems reasonable to expect from a good compression method that its output should not be further compressible, because it should behave essentially like random data. We investigate this premise for a variety of known lossless compression techniques, and find that, surprisingly, there is much variability in the randomness, depending on the chosen method. Arithmetic coding seems to produce perfectly random output, whereas that of Huffman or Ziv-Lempel coding still contains many dependencies. In particular, the output of Huffman coding has already been proven to be random under certain conditions, and we present evidence here that arithmetic coding may produce an output that is identical to that of Huffman.
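The premise above, that a good compressor's output should look essentially random, can be probed empirically. A rough sketch using zeroth-order byte entropy (zlib is chosen here only as a readily available DEFLATE implementation, not as one of the paper's test subjects):

```python
import math
import zlib
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Empirical zeroth-order entropy of a byte string, in bits per byte."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

raw = b"ab" * 5000           # highly repetitive input: 1 bit/byte of entropy
packed = zlib.compress(raw)  # DEFLATE = LZ77 parsing + Huffman coding
```

The compressed stream is far shorter and its byte distribution is much closer to uniform, though, as the abstract notes, zeroth-order statistics alone cannot expose the residual dependencies that distinguish Huffman or Ziv-Lempel output from truly random data.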

7 citations

Journal ArticleDOI
TL;DR: An alternative to compressed suffix arrays is introduced, based on representing a sequence of integers using Fibonacci encodings, thereby reducing the space requirements of state-of-the-art implementations of the suffix array, while retaining the searching functionalities.
Abstract: An alternative to compressed suffix arrays is introduced, based on representing a sequence of integers using Fibonacci encodings, thereby reducing the space requirements of state-of-the-art implementations of the suffix array, while retaining the searching functionalities. Empirical tests support the theoretical space complexity improvements and show that there is no deterioration in the processing times.
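A minimal sketch of the Fibonacci encoding of a single integer, the building block the abstract refers to (illustrative only; the cited paper's suffix-array representation builds considerably more machinery on top of it):

```python
def fibonacci_encode(n: int) -> str:
    """Fibonacci code of n >= 1: the Zeckendorf representation written from
    F(2) upward, terminated with an extra '1' so every codeword ends in '11'
    (which can never occur inside a Zeckendorf representation)."""
    fibs = [1, 2]                      # F(2)=1, F(3)=2, ...
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    usable = fibs[:-1]
    bits = ["0"] * len(usable)
    remainder = n
    for i in range(len(usable) - 1, -1, -1):  # greedy, largest Fibonacci first
        if usable[i] <= remainder:
            bits[i] = "1"
            remainder -= usable[i]
    return "".join(bits) + "1"
```

Because no two consecutive Fibonacci numbers appear in a Zeckendorf representation, the terminating "11" makes the code prefix-free and self-delimiting, which is what enables direct access into a concatenated stream of codewords.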

7 citations


Cites methods from "A space efficient direct access dat..."

  • ...Range decoding in WT is proposed in [12]....


Journal ArticleDOI
TL;DR: This research shows experimental results of different integer encoders such as Rice, Simple9, Simple16, PForDelta codes, and DACs and a method to determine an appropriate k value for building a k2-raster compact data structure with competitive performance is discussed.
Abstract: This paper examines the various variable-length encoders that provide integer encoding to hyperspectral scene data within a k²-raster compact data structure. This compact data structure leads to a compression ratio similar to that produced by some of the classical compression techniques. It also provides direct access for query to its data elements without requiring any decompression. The selection of the integer encoder is critical for obtaining a competitive performance considering both the compression ratio and access time. In this research, we show experimental results of different integer encoders such as Rice, Simple9, Simple16, PForDelta codes, and DACs. Further, a method to determine an appropriate k value for building a k²-raster compact data structure with competitive performance is discussed.
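Of the integer encoders compared above, the Rice code is the simplest to sketch: a quotient in unary followed by k remainder bits (an illustrative bitstring version with a hand-picked parameter, not the paper's implementation):

```python
def rice_encode(n: int, k: int) -> str:
    """Rice code of n >= 0 with parameter k: the quotient n >> k in unary
    (q ones and a terminating zero), then the remainder in exactly k bits."""
    q = n >> k
    r = n & ((1 << k) - 1)
    remainder_bits = format(r, "b").zfill(k) if k > 0 else ""
    return "1" * q + "0" + remainder_bits
```

The choice of k governs the quotient/remainder trade-off, mirroring the paper's point that picking the right parameters (there, the k of the k²-raster and the encoder itself) is what makes the compression ratio and access time competitive.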

6 citations

Journal ArticleDOI
TL;DR: In this paper, a dynamic Huffman encoding is proposed which, instead of basing itself on the information gathered from the already processed portion of the file, as traditional adaptive codings do, rather uses the information that is still to come.

4 citations

References
Journal ArticleDOI
TL;DR: An application is the construction of a uniformly universal sequence of codes for countable memoryless sources, in which the n-th code has a ratio of average codeword length to source rate bounded by a function of n for all sources with positive rate.
Abstract: Countable prefix codeword sets are constructed with the universal property that assigning messages in order of decreasing probability to codewords in order of increasing length gives an average codeword length, for any message set with positive entropy, less than a constant times the optimal average codeword length for that source. Some of the sets also have the asymptotically optimal property that the ratio of average codeword length to entropy approaches one uniformly as entropy increases. An application is the construction of a uniformly universal sequence of codes for countable memoryless sources, in which the n-th code has a ratio of average codeword length to source rate bounded by a function of n for all sources with positive rate; the bound is less than two for n = 0 and approaches one as n increases.
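The Elias gamma code is one concrete member of the family of universal prefix codeword sets described above (a sketch for illustration; the cited paper defines and analyzes several such sets):

```python
def elias_gamma(n: int) -> str:
    """Elias gamma code of n >= 1: len(bin(n)) - 1 zeros, then n in binary.
    The leading zeros announce the length, making the code prefix-free."""
    b = format(n, "b")
    return "0" * (len(b) - 1) + b
```

The codeword length is roughly 2 lg n + 1 bits, so shorter (more probable, under the universal assignment) messages get shorter codewords without any knowledge of the source distribution.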

1,306 citations

Proceedings ArticleDOI
12 Jan 2003
TL;DR: A novel implementation of compressed suffix arrays exhibiting new tradeoffs between search time and space occupancy for a given text (or sequence) of n symbols over an alphabet σ, where each symbol is encoded by lg|σ| bits.
Abstract: We present a novel implementation of compressed suffix arrays exhibiting new tradeoffs between search time and space occupancy for a given text (or sequence) of n symbols over an alphabet σ, where each symbol is encoded by lg |σ| bits. We show that compressed suffix arrays use just nH_h + σ bits, while retaining full text indexing functionalities, such as searching any pattern sequence of length m in O(m lg |σ| + polylog(n)) time. The term H_h ≤ lg |σ| denotes the h-th-order empirical entropy of the text, which means that our index is nearly optimal in space apart from lower-order terms, achieving asymptotically the empirical entropy of the text (with a multiplicative constant 1). If the text is highly compressible so that H_h = o(1) and the alphabet size is small, we obtain a text index with o(m) search time that requires only o(n) bits. Further results and tradeoffs are reported in the paper.
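For contrast with the compressed variant above, a plain (uncompressed) suffix array with pattern counting can be sketched in a few lines; it supports the same searches but occupies O(n lg n) bits rather than space close to the empirical entropy (a didactic sketch, not the cited implementation):

```python
from bisect import bisect_left, bisect_right

def suffix_array(text: str):
    """Plain suffix array: start positions of all suffixes, in sorted order."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def count_occurrences(text: str, sa, pattern: str) -> int:
    """Count occurrences of `pattern` by binary search over sorted suffixes.
    Materializing the length-m prefixes keeps the sketch short; a real index
    compares against the text in place instead."""
    m = len(pattern)
    prefixes = [text[i:i + m] for i in sa]  # sorted, because sa is
    return bisect_right(prefixes, pattern) - bisect_left(prefixes, pattern)
```

Every occurrence of a pattern is a prefix of some suffix, so all matches form one contiguous run in the sorted order; compressed suffix arrays preserve exactly this search structure while representing the permutation succinctly.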

818 citations


"A space efficient direct access dat..." refers background in this paper

  • ...A space efficient direct access data structure — Gilad Baruch (Computer Science Department, Bar Ilan University, Ramat Gan 52900, Israel), Shmuel T. Klein (Computer Science Department, Bar Ilan University), Dana Shapira (Computer Science Department, Ariel University, Ariel 40700, Israel)...


Book
01 Jan 1935

768 citations

Proceedings ArticleDOI
30 Oct 1989
TL;DR: Data structures that represent static unlabeled trees and planar graphs are developed; for trees they are asymptotically optimal, in that no other structure encodes n-node trees with fewer bits per node as n grows without bound.
Abstract: Data structures that represent static unlabeled trees and planar graphs are developed. The structures are more space efficient than conventional pointer-based representations, but (to within a constant factor) they are just as time efficient for traversal operations. For trees, the data structures described are asymptotically optimal: there is no other structure that encodes n-node trees with fewer bits per node, as n grows without bound. For planar graphs (and for all graphs of bounded page number), the data structure described uses linear space: it is within a constant factor of the most succinct representation.
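The flavor of such succinct tree representations can be illustrated with the balanced-parentheses encoding, which spends 2 bits per node (a sketch of one classical encoding in this spirit; the cited paper's own constructions also cover level-order bitmaps and planar graphs):

```python
def encode_tree(tree) -> str:
    """Balanced-parentheses encoding of an ordered tree: write '(' on entering
    a node and ')' on leaving it. Here a tree is a list of child subtrees,
    so [] is a leaf. The output uses exactly 2 bits (characters) per node."""
    return "(" + "".join(encode_tree(child) for child in tree) + ")"
```

An n-node tree thus takes 2n bits, matching the information-theoretic lower bound of roughly 2n - Θ(lg n) bits up to lower-order terms; traversal operations can then be supported on the bitstring with o(n) bits of auxiliary structures.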

759 citations


"A space efficient direct access dat..." refers background in this paper

  • ...A space efficient direct access data structure — Gilad Baruch (Computer Science Department, Bar Ilan University, Ramat Gan 52900, Israel), Shmuel T. Klein (Computer Science Department, Bar Ilan University), Dana Shapira (Computer Science Department, Ariel University, Ariel 40700, Israel)...
