Topic

Run-length encoding

About: Run-length encoding (RLE) is a lossless data compression technique that replaces each run of identical symbols with a single symbol and a repeat count. Over the lifetime, 504 publications have been published within this topic, receiving 4,441 citations. The topic is also known as: RLE.
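As a concrete illustration of the technique, a minimal run-length encoder and decoder can be written in a few lines of Python; the function names below are illustrative and are not taken from any of the papers listed on this page.

from itertools import groupby

def rle_encode(data):
    """Collapse each run of identical symbols into a (symbol, count) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(data)]

def rle_decode(pairs):
    """Expand (symbol, count) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in pairs)

# Long runs compress well; highly varied data does not.
encoded = rle_encode("aaaabbbccd")
print(encoded)              # [('a', 4), ('b', 3), ('c', 2), ('d', 1)]
print(rle_decode(encoded))  # aaaabbbccd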


Papers
Journal ArticleDOI
TL;DR: Proposes a cluster of novel, hybrid text and image compression algorithms that employ efficient data structures such as hashes and graphs, and that retrieve an optimal set of patterns through pruning, reducing the code table size and thereby saving database scans and storage space.
Abstract: Data compression has been one of the enabling technologies of the ongoing digital multimedia revolution for decades, producing renowned algorithms such as Huffman encoding, LZ77, Gzip, RLE and JPEG. Researchers have looked into character/word based approaches to text and image compression, missing out on the larger aspect of pattern mining from large databases. The central theme of our compression research is the compression perspective of data mining suggested by Naren Ramakrishnan et al., wherein efficient versions of seminal text/image compression algorithms are developed using various Frequent Pattern Mining (FPM) and clustering techniques. This paper proposes a cluster of novel and hybrid efficient text and image compression algorithms employing efficient data structures such as hashes and graphs. We retrieve an optimal set of patterns through pruning, which is efficient in terms of database scans and storage space because it reduces the code table size. Moreover, a detailed analysis of time and space complexity is performed for some of our approaches, and various text structures are proposed. Simulation results over various sparse/dense benchmark text corpora indicate 18% to 751% improvement in compression ratio over other state-of-the-art techniques. In image compression, our results showed up to 45% improvement in compression ratio and up to 40% in image quality efficiency.

2 citations
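The abstract above describes substituting mined frequent patterns with short codes from a pruned code table. The sketch below is not the authors' algorithm; it only illustrates the general pattern-substitution idea in Python, with a hand-picked pattern list standing in for the frequent-pattern-mining and pruning steps.

def substitute_patterns(text, patterns):
    """Replace each pattern with a one-character code; return the result and the code table.

    `patterns` is a stand-in for the output of a frequent-pattern-mining step,
    which is the part the paper actually focuses on.
    """
    code_table = {p: chr(0xE000 + i) for i, p in enumerate(patterns)}  # private-use codepoints as codes
    compressed = text
    for pattern, code in code_table.items():
        compressed = compressed.replace(pattern, code)
    return compressed, code_table

def restore_patterns(compressed, code_table):
    """Invert the substitution using the code table."""
    for pattern, code in code_table.items():
        compressed = compressed.replace(code, pattern)
    return compressed

text = "the cat sat on the mat and the cat ran"
comp, table = substitute_patterns(text, ["the ", "cat "])
assert restore_patterns(comp, table) == text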

Patent
02 Nov 2001
TL;DR: In this article, the authors propose an image processor, an image processing method, and a recording medium that compress split image data and properly link the boundary parts generated by the splitting, so that data at the boundaries is processed exactly while the load placed on image processing is reduced.
Abstract: PROBLEM TO BE SOLVED: To provide an image processor, an image processing method, and a recording medium capable of processing data at boundary parts exactly and of reducing the load placed on image processing, by compressing split image data and properly linking the boundary parts generated by the splitting. SOLUTION: An image processor 1 according to one embodiment of this invention is constituted of an A/D conversion part 1a that A/D-converts an input image signal, a binarization part 1b that binarizes the image data output from 1a, a run-length encoding part 1c that converts the binary data output from 1b into run-length codes, a data memory 1d that temporarily stores the run-length encoded data output from 1c, a linkage processing part 1e that links the run-length encoded data stored in 1d, and a linked-data memory 1f that stores the run-length encoded data linked by 1e.

2 citations
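The linkage step is only summarized in the abstract above, so its exact procedure is not reproduced here. The sketch below (in Python, with illustrative names) shows the generic operation such a step has to perform: when the two halves of a binarized row are run-length encoded independently, a run that crosses the split boundary must be merged when the encoded halves are concatenated.

def encode_row(bits):
    """Run-length encode a row of binary pixels into [value, length] pairs."""
    runs = []
    for bit in bits:
        if runs and runs[-1][0] == bit:
            runs[-1][1] += 1
        else:
            runs.append([bit, 1])
    return runs

def link_runs(left, right):
    """Concatenate two encoded halves, merging a run that spans the boundary."""
    if left and right and left[-1][0] == right[0][0]:
        merged = [left[-1][0], left[-1][1] + right[0][1]]
        return left[:-1] + [merged] + right[1:]
    return left + right

row = [0, 0, 1, 1, 1, 1, 0, 0]
left, right = encode_row(row[:4]), encode_row(row[4:])
# Encoding the halves separately splits the run of four 1s;
# linking restores the same runs as encoding the whole row at once.
assert link_runs(left, right) == encode_row(row)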

Proceedings ArticleDOI
01 Jan 2015
TL;DR: The experimental results show that the two methods are competitive if the training and testing texts are in the same set of languages, but the run-length encoding based method works better than the byte pattern based method if they are in different sets of languages.
Abstract: Text-based pictures called ASCII art are often used in Web pages, email text, and so on. They enrich expression in text data, but they can be noise for natural language processing, and large ASCII art is deformed on small display devices. ASCII art can be ignored or replaced with other strings by ASCII art extraction methods, which detect the areas of ASCII art in a given text. Our research group and another research group independently proposed two different ASCII art extraction methods: a run-length encoding based method and a byte pattern based method, respectively. Both methods use text classifiers constructed by machine learning algorithms, but they use different text attributes. In this paper, we compare the two methods in ASCII art extraction experiments where the training and testing texts are in English and Japanese. Our experimental results show that the two methods are competitive when the training and testing texts are in the same set of languages, but the run-length encoding based method works better than the byte pattern based method when they are in different sets of languages.

2 citations
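The paper's attribute sets are not given in the abstract, so the following is only a hedged illustration of the underlying idea: run-length statistics of a text block can be turned into a small feature vector for a classifier, since ASCII art tends to contain much longer runs of repeated characters than ordinary prose. The feature choices below are illustrative, not the published ones.

from itertools import groupby

def run_length_features(block):
    """Toy run-length statistics for a text block (not the paper's attribute set)."""
    lines = block.splitlines() or [""]
    run_lengths = [len(list(run))
                   for line in lines
                   for _, run in groupby(line)]
    if not run_lengths:
        return [0.0, 0.0, 0.0]
    longest = max(run_lengths)
    mean = sum(run_lengths) / len(run_lengths)
    long_ratio = sum(1 for r in run_lengths if r >= 3) / len(run_lengths)
    return [float(longest), mean, long_ratio]

# ASCII art rows contain long runs of '-' and spaces; prose rarely does.
print(run_length_features("+--------+\n|  ART   |\n+--------+"))
print(run_length_features("This is an ordinary English sentence."))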

Journal ArticleDOI
TL;DR: Presents CVS (Compressed Vector Set), a fast and space-efficient data mining framework that handles both sparse and dense datasets and processes them faster than the conventional sparse vector representation, with smaller memory usage.
Abstract: In this paper, we present CVS (Compressed Vector Set), a fast and space-efficient data mining framework that efficiently handles both sparse and dense datasets. CVS holds a set of vectors in a compressed format and conducts primitive vector operations, such as the p-norm and the dot product, without decompression. By combining these primitive operations, CVS accelerates prominent data mining and machine learning algorithms, including the k-nearest neighbor algorithm, stochastic gradient descent on logistic regression, and kernel methods. In contrast to the commonly used sparse matrix/vector representation, which is not effective for dense datasets, CVS handles sparse and dense datasets in a unified manner. Our experimental results demonstrate that CVS can process both dense and sparse datasets faster than the conventional sparse vector representation, with smaller memory usage.

2 citations
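The CVS format itself is not detailed in the abstract. Assuming, purely for illustration, that a vector is stored as (value, run-length) pairs, the Python sketch below shows how a primitive operation such as the dot product can be computed directly on the compressed representation, without decompressing either vector.

def dot_rle(a, b):
    """Dot product of two run-length-encoded vectors of equal dense length.

    Each vector is a list of (value, run_length) pairs; overlapping runs are
    multiplied without ever materializing the dense vectors.
    """
    if not a or not b:
        return 0.0
    total = 0.0
    i = j = 0
    rem_a, rem_b = a[0][1], b[0][1]
    while i < len(a) and j < len(b):
        overlap = min(rem_a, rem_b)
        total += a[i][0] * b[j][0] * overlap
        rem_a -= overlap
        rem_b -= overlap
        if rem_a == 0:
            i += 1
            rem_a = a[i][1] if i < len(a) else 0
        if rem_b == 0:
            j += 1
            rem_b = b[j][1] if j < len(b) else 0
    return total

# Dense equivalents: [0,0,0,2,2,2,2,2] . [1,1,1,1,0,0,0,0] = 2
print(dot_rle([(0, 3), (2, 5)], [(1, 4), (0, 4)]))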

Network Information
Related Topics (5)
Network packet: 159.7K papers, 2.2M citations, 76% related
Feature extraction: 111.8K papers, 2.1M citations, 75% related
Convolutional neural network: 74.7K papers, 2M citations, 74% related
Image processing: 229.9K papers, 3.5M citations, 74% related
Cluster analysis: 146.5K papers, 2.9M citations, 74% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2021    23
2020    20
2019    20
2018    28
2017    27
2016    24