Patent

Document information compression and retrieval system and document information registration and retrieval method

Abstract
A document information compression and retrieval system which reduces the document data amount and shortens the retrieval time when mass document information is registered and retrieved. A method of registering document information in a document information retrieval system which stores document information consisting of a large number of characters for retrieval of the stored document information. Entered document information is separated into words. Whether or not each of the words is a word to which a compressed code is assigned is determined. If not already assigned, a compressed code is assigned to the word. The words are converted into the assigned compressed codes for storing a compressed text. At output, retrieval information is accepted and converted into compressed code and stored compressed texts are searched for the compressed text matching the compressed code of the retrieval information, then the words corresponding to the compressed codes are used to expand the compressed text into original document information.
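
The registration and retrieval steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `CompressedStore`, its methods, the regex tokenizer, and the use of sequential integers as compressed codes are all assumptions made for illustration. Separator runs (spaces, punctuation) are given codes as well so that expansion reproduces the original text exactly.

```python
import re


class CompressedStore:
    """Hypothetical word-level dictionary coder with search over compressed text."""

    def __init__(self):
        self.code_of = {}   # token -> compressed code (assumed: sequential integers)
        self.word_of = []   # compressed code -> token
        self.texts = []     # each stored document is a list of codes

    @staticmethod
    def _tokenize(text):
        # Split into word runs and non-word runs; concatenating the
        # tokens gives back the input unchanged.
        return re.findall(r"\w+|\W+", text)

    def register(self, text):
        """Separate text into words, assign codes where missing, store the codes."""
        codes = []
        for token in self._tokenize(text):
            code = self.code_of.get(token)
            if code is None:
                # No code assigned yet: assign the next free one.
                code = len(self.word_of)
                self.code_of[token] = code
                self.word_of.append(token)
            codes.append(code)
        self.texts.append(codes)
        return len(self.texts) - 1  # document id

    def search(self, word):
        """Convert the query word to its code and match against stored codes."""
        code = self.code_of.get(word)
        if code is None:
            return []  # a word that was never registered cannot match
        return [doc_id for doc_id, codes in enumerate(self.texts) if code in codes]

    def expand(self, doc_id):
        """Map each code back to its word to rebuild the original text."""
        return "".join(self.word_of[code] for code in self.texts[doc_id])


store = CompressedStore()
first = store.register("the cat sat on the mat")
second = store.register("a dog sat")
hits = store.search("sat")
restored = store.expand(first)
```

Because the search compares codes rather than characters, no stored text has to be expanded until a match is found, which is the source of the shorter retrieval time the abstract describes. A real system would presumably use variable-length codes and an index rather than a linear scan; both are left out here to keep the sketch short.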


Citations
Patent

Method and system for fast indexing and searching of text in compound-word languages

TL;DR: A content-index search system is proposed for fast indexing and searching of text in compound-word languages such as Japanese, Chinese, Hebrew, and Arabic.
Patent

A Lempel-Ziv data compression technique utilizing a dictionary pre-filled with frequent letter combinations, words and/or phrases

TL;DR: The adaptive compression technique improves on the Lempel-Ziv (LZ) technique by reducing the required storage space (18) and the transmission time when transferring data (22).
Patent

Method and apparatus for producing a hybrid data structure for displaying a raster image

TL;DR: In this article, a system for producing a raster image derived from coded and non-coded portions of a hybrid data structure from an input bitmap is presented. But the system is limited to a single image.
Patent

System and methods for accelerated data storage and retrieval

TL;DR: In this paper, a data storage and retrieval accelerator is proposed to reduce the time required to store and retrieve data from computer to disk, in conjunction with random access memory, in a display controller, and/or in an input/output controller.
Patent

Method and apparatus for classifying document information

TL;DR: A document information classification method and apparatus for classifying a document group and arranging a classified result hierarchically on the basis of key words given to the document groups and words appearing in documents without dependence on a prescribed classification system is presented in this article.
References
Proceedings ArticleDOI

Data compression and database performance

TL;DR: It is shown that many query processing algorithms can manipulate compressed data just as well as decompressed data, and that processing compressed data can speed query processing by a factor much larger than the compression factor.
Patent

Fast data compressor with direct lookup table indexing into history buffer

Chambers, +1 more
TL;DR: A cooperating data compressor, compressed data format, and data decompressor are proposed to compress an input data block (HB) into a compressed data block having that format.
Patent

Method for determining boundaries of words in text

TL;DR: A method for determining the boundaries of a symbol or word string within an image, including the steps of determining page orientation, isolating symbol strings from adjacent symbol strings, establishing a set of boundaries or references with respect to which measurements about, or further processing of, the symbol string may be made, is described in this article.
Patent

System for dynamically compressing and decompressing electronic data

TL;DR: A data compression system for encoding and decoding textual data is presented, including an encoder for encoding the data and a decoder for decoding the encoded data.
Patent

Text compression and expansion method and apparatus

TL;DR: A text compression method and apparatus are disclosed that enable overall compression ratios of more than six or eight to one for normal language text. Dictionary entries are categorized by a weighted frequency-of-use ranking, in which the product of a word's length in characters and its frequency of occurrence in the text is taken as the figure of merit for ranking words to be placed in the individual dictionaries.