Open Access, Journal Article

Efficient Data Compression Scheme using Dynamic Huffman Code Applied on Arabic Language

Sameh Ghwanmeh, +2 more
31 Dec 2006, Vol. 2, Iss. 12, pp. 885-888
TLDR
The experimental results show that the average message length and the compression efficiency are better for Arabic text than for English text, and that the main factor significantly affecting the compression ratio and the average message length is the frequency of the symbols in the text.
Abstract
Developing an efficient compression scheme for the Arabic language is a difficult task. This paper applies dynamic Huffman coding, a variable-length bit coding method, to the compression of Arabic text. Experimental tests were performed on both Arabic and English text, and the compression efficiency obtained for the two languages is compared. The compression rate is also compared against the size of the file being compressed. It was found that the compression ratio decreases as the file size increases, for both Arabic and English text. The experimental results show that the average message length and the compression efficiency are better for Arabic text than for English text. The results also show that the main factor significantly affecting the compression ratio and the average message length is the frequency of the symbols in the text.
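
To make the quantities named in the abstract concrete, the following is a minimal Python sketch of how average message length and coding efficiency depend on symbol frequencies. It is not the paper's method: it builds a static Huffman code from the whole text rather than the dynamic (adaptive) variant the paper uses, and it assumes the usual textbook definition of efficiency as entropy divided by average code length, which the abstract does not spell out. The function names `huffman_code_lengths` and `coding_stats` are hypothetical.

```python
import heapq
import math
from collections import Counter


def huffman_code_lengths(text):
    """Return {symbol: code length in bits} for a static Huffman code.

    Assumption: static Huffman over the full text, used here only as a
    stand-in for the dynamic Huffman coding described in the paper.
    """
    if not text:
        raise ValueError("text must not be empty")
    freq = Counter(text)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def coding_stats(text):
    """Average code length (bits/symbol), entropy, and efficiency."""
    freq = Counter(text)
    n = len(text)
    lengths = huffman_code_lengths(text)
    average_length = sum(freq[s] * lengths[s] for s in freq) / n
    entropy = -sum((c / n) * math.log2(c / n) for c in freq.values())
    return {
        "average_length": average_length,
        "entropy": entropy,
        # Assumed definition: efficiency = entropy / average code length.
        "efficiency": entropy / average_length,
    }


if __name__ == "__main__":
    # Skewed symbol frequencies give a shorter average length than uniform ones.
    print(coding_stats("aaaabbc"))
    print(coding_stats("abcd"))
```

Running the same function on an Arabic sample and an English sample of similar size would give a rough view of how the symbol frequency distribution drives the average message length, though the figures would not be directly comparable to the paper's dynamic Huffman results.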


Citations
Journal Article

Multilayer model for Arabic text compression.

TL;DR: The novelties of the compression technique presented in this article are that the morphological structure of words can be used to support better compression and to improve the performance of traditional compression techniques, and that words can be searched for directly in the compressed text through the appropriate one of its layers.
Posted Content

An Enhanced Static Data Compression Scheme Of Bengali Short Message

TL;DR: Character Masking, Dictionary Matching, Associative rule of data mining and a Hyphenation algorithm for syllable-based compression are implemented in hierarchical steps to achieve low-complexity lossless compression of text messages for any mobile device.
Journal Article

Arabic Short Text Compression

TL;DR: A new technique is proposed that uses the fact that Arabic texts have single-case letters to compress small Arabic texts with limited resources; a reasonable compression ratio can be achieved using less than 0.4 KB of memory overhead.
Journal Article

Hybrid Technique for Arabic Text Compression

TL;DR: This paper presents a hybrid technique that uses the linguistic features of the Arabic language, together with the Burrows-Wheeler compression algorithm, to improve the compression ratio of Arabic texts.
Proceedings Article

Entropy of Malayalam language and text compression using Huffman coding

TL;DR: An informational analysis of Malayalam language text is done, and it is found that the Huffman compression algorithm achieves a compression ratio of 66 percent for the standard Malayalam database used.
References
Book

Introduction to data compression

TL;DR: The author explains the development of the Huffman Coding Algorithm and some of the techniques used in its implementation, as well as some of its applications, including Image Compression, which is based on the JBIG standard.
Journal Article

Optimizing bitmap indices with efficient compression

TL;DR: This article presents a new compression scheme called Word-Aligned Hybrid (WAH) code that makes compressed bitmap indices efficient even for high-cardinality attributes and proves that the new compressed bitmap index, like the best variants of the B-tree index, is optimal for one-dimensional range queries.
Journal Article

Data compression using dynamic Markov modelling

TL;DR: Experimental results reported here indicate that the Markov modelling approach generally achieves much better data compression than that observed with competing methods on typical computer data.
Journal Article

Bounds on the redundancy of Huffman codes (Corresp.)

TL;DR: Upper bounds are presented on the redundancy of Huffman codes when the extreme probabilities P_{1} and P_{N} are known.
Journal Article

Improved word-aligned binary compression for text indexing

TL;DR: An improved compression mechanism for handling the compressed inverted indexes used in text retrieval systems is presented, extending the word-aligned binary coding carry method.