
Showing papers on "Document layout analysis" published in 1990


Journal ArticleDOI
TL;DR: Recognition experiments with a prototype system on a variety of complex printed documents show that the proposed system can read different types of printed documents at an accuracy rate of 94.8–97.2%.

258 citations


Patent
30 Jul 1990
TL;DR: A document storage and retrieval system stores the document body as an image and stores text information as a character code string for retrieval; a retrieval is executed against the text information, and the related document image is then displayed on a retrieval terminal according to the retrieval result.
Abstract: A document storage and retrieval system stores a document body in the form of an image, stores text information in the form of a character code string for retrieval, and executes a retrieval with reference to the text information, followed by displaying the related document image on a retrieval terminal according to the retrieval result. This arrangement supports retrieval over the full contents of a document and also displays the document body, printed in an easy-to-read format, directly as an image.

160 citations


Proceedings ArticleDOI
S. Tsujimoto1, H. Asada1
16 Jun 1990
TL;DR: Experimental results on a variety of document formats have shown that the proposed method is applicable to most of the documents commonly encountered in daily use, although there is still room for further refinement of the transformation rules.
Abstract: A document understanding method based on the tree representation of document structures is proposed. It is shown that documents have an obvious hierarchical structure in their geometry which is represented by a tree. A small number of rules are introduced to transform the geometric structure into the logical structure which represents the semantics. The virtual field separator technique is employed to utilize the information carried by special constituents of documents such as field separators and frames, keeping the number of transformation rules small. Experimental results on a variety of document formats have shown that the proposed method is applicable to most of the documents commonly encountered in daily use, although there is still room for further refinement of the transformation rules.

122 citations


Proceedings ArticleDOI
01 Mar 1990
TL;DR: This paper shows how user-specified layout constraints may be easily added to many automatic graph layout algorithms and allows a continuum between manual and automatic layout by allowing the user to specify how stable the graph's layout should be.
Abstract: Automatic layout algorithms are commonly used when displaying graphs on the screen because they provide a “nice” drawing of the graph without user intervention. There are, however, a couple of disadvantages to automatic layout. Without user intervention, an automatic layout algorithm is only capable of producing an aesthetically pleasing drawing of the graph. User- or application-specified layout constraints (often concerning the semantics of a graph) are difficult or impossible to specify. A second problem is that automatic layout algorithms seldom make use of information in the current layout when calculating the new layout. This can also be frustrating to the user because whenever a new layout is done, the user's orientation in the graph is lost. This paper suggests using layout constraints to solve both of these problems. We show how user-specified layout constraints may be easily added to many automatic graph layout algorithms. Additionally, the constraints specified by the current layout are used when calculating the new layout to achieve a more stable layout. This approach allows a continuum between manual and automatic layout by allowing the user to specify how stable the graph's layout should be.

92 citations


Proceedings ArticleDOI
16 Jun 1990
TL;DR: A novel approach to automatic classification of digitized office documents, based on the inductive generalization of their layout style, is presented; it is supported by the observation that for a number of printed documents it is possible to find a set of relevant and invariant layout features.
Abstract: A novel approach to automatic classification of digitized office documents, based on the inductive generalization of their layout style, is presented. It is supported by the observation that for a number of printed documents it is possible to find a set of relevant and invariant layout features. These are geometrical characteristics automatically detected through a segmentation and layout analysis process. The learning step, in which significant examples of document classes are used to train the classification system, involves the novel idea of integrating parametric (numerical) and conceptual (symbolic) learning methods.

75 citations


Patent
Isamu Iwai1, Miwako Doi1, Mika Fukui1
29 Mar 1990
TL;DR: A document processing system includes an input section, a memory section, a text analyzing section, an image identifying section, an image size identifying section, a layout processing section, and an output section.
Abstract: A document processing system includes an input section, a memory section, a text analyzing section, an image identifying section, an image size identifying section, a layout processing section, and an output section. Document data is constituted by text data and image data. The text data includes key information corresponding to the image data, and the image data is laid out in the document data. The text data and image data input through the input section are stored in the memory section. The text analyzing section identifies a position in the document data at which the image data is to be laid out, based on a position of key information in the text data. The image identifying section identifies image data corresponding to the key information. The image size identifying section identifies an image size of the image data identified by the image identifying section. The layout processing section lays out the identified image data at the identified image layout position in accordance with a predetermined layout rule.

41 citations
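
The key-driven placement described in the patent abstract above can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the function name `layout_images`, the use of plain substring matching to find key information, and the layout rule itself (place each image directly after the first paragraph that mentions its key). The patent only states that a predetermined layout rule is applied.

```python
from typing import Dict, List, Tuple


def layout_images(paragraphs: List[str],
                  images: Dict[str, Tuple[int, int]]) -> List[tuple]:
    """Return a flat layout: text items interleaved with image items.

    `images` maps a key string (hypothetical, e.g. "Fig. A") to the
    image size as (width, height).
    """
    layout = []
    placed = set()
    for para in paragraphs:
        layout.append(("text", para))
        for key, size in images.items():
            # Assumed rule: an image goes right after the first
            # paragraph whose text contains its key.
            if key in para and key not in placed:
                layout.append(("image", key, size))
                placed.add(key)
    return layout
```

Substring matching is deliberately naive (a key such as "Fig. 1" would also match "Fig. 10"); a real system would need a stricter notion of key information.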


Proceedings Article
01 Jan 1990
TL;DR: An overview of techniques for document image analysis is presented, with an emphasis on those for graphics recognition and interpretation; the techniques are derived from the fields of image processing, pattern recognition, and machine vision.
Abstract: An overview is presented of algorithms and techniques for document image analysis, with an emphasis on those for graphics recognition and interpretation. The techniques are derived from the fields of image processing, pattern recognition, and machine vision. The objective in document image analysis is to recognize page contents including layout, text, and figures. Although optical character recognition (OCR) falls within the context of document image analysis, we do not cover this area since OCR techniques have been covered extensively in the literature. We also limit the focus to images containing binary information. Topics covered are segmentation of document images into text and graphics regions, vectorization to obtain lines, identification of graphical primitives, and generation of succinct image interpretations.

29 citations


Proceedings ArticleDOI
01 Aug 1990
TL;DR: Applications foreseen for the image segmentation include modified facsimile systems, achievement of artifact-free OCR and conversion of document images into files with separate formats for text, graphics and pictures.
Abstract: Document scanning is now an accepted part of office procedure, allowing the incorporation of digitized images into new documents and the conversion of scanned print into ASCII by optical character recognition (OCR). Often document pages contain more than one form of information - textual, graphical and/or pictorial. Segmentation of document images into these three categories is feasible with the aid of image processing. Projections of the thresholded document images in conjunction with autocorrelation are used to check text alignment. Then the edge shifting properties of the rank filter are used to coalesce image regions containing text into solid near-rectangular blocks. Pyramidal reduction is combined with the filtering to ease the computational burden. Horizontal and vertical projections are used to segment whole pages recursively into homogeneous blocks whose properties are then analysed. Applications foreseen for the image segmentation include modified facsimile systems, achievement of artifact-free OCR and conversion of document images into files with separate formats for text, graphics and pictures. © (1990) COPYRIGHT SPIE--The International Society for Optical Engineering. Downloading of the abstract is permitted for personal use only.

20 citations


Patent
07 Dec 1990
TL;DR: A method and an apparatus for document formatting are described that reflect the preference of the operator and the overall balance, so that the desired formatting can be obtained efficiently without tedious post-processing operations.
Abstract: A method and an apparatus for document formatting, capable of reflecting the preference of the operator and the overall balance such that the desired formatting can be obtained efficiently without tedious post-processing operations. In the apparatus, document data representing the document including figure data representing figure elements of the document, and region data indicating layout region to which the document is to be laid out are inputted, candidate layouts for each figure element to be laid out are generated, one of the generated candidate layouts is selected, and the document is formatted in the layout region, according to the selected one of the candidate layouts.

20 citations


Proceedings ArticleDOI
16 Jun 1990
TL;DR: An enhanced border-following algorithm and its application to document image processing are presented; various kinds of components in a document image can be flexibly segmented and extracted with a variable-size mask for border following instead of the conventional 3×3 mask.
Abstract: An enhanced border-following algorithm and its application to document image processing are presented. Various kinds of components (characters, text lines, text blocks, figures, tables, etc.) in a document image can be flexibly segmented and extracted with a variable-size mask for border following instead of the conventional 3×3 mask. An automatic document image structuring process to construct a multimedia document and a raster/geometric conversion method for the segmented graphic parts of the image, such as diagrams and tables, are discussed.

19 citations


Journal Article
TL;DR: This paper proposes a new method of document image understanding which employs the domain specific knowledge base called document model, and introduces the strategy of hypothesis generation and testing.
Abstract: Document image understanding is the task of generating a structured description of the contents of a document. In this paper, we propose a new method of document image understanding which employs a domain-specific knowledge base called the document model. The document model is a structural representation of constraints on the layout structure as well as the logical structure of a target document. Since variation of the structure can be described in the document model, intermediate results of understanding generally include multiple candidates. In order to generate a plausible description from such candidates, we introduce the strategy of hypothesis generation and testing. Experiments on 100 visiting cards demonstrate the effectiveness of our method.

Journal Article
TL;DR: The basic idea in the method is to utilize the spatial and geometric relationships between document items to extract and classify the meaningful information from documents automatically.
Abstract: This paper introduces a new method to extract and classify the meaningful information from documents automatically. The basic idea in our method is to utilize the spatial and geometric relationships between document items. Our approach remains applicable even if the layout structures are modified to some extent, because the coordinate values of positions, sizes, lengths and so on are not specified directly. Additionally, experiments on typical documents such as library cataloging cards, name cards and letters are presented.

Patent
09 Nov 1990
TL;DR: In this article, the authors propose to convert a text file which is represented with linear character strings into a hierarchical tree structure by analyzing index character strings corresponding to the chapters, paragraphs, and clauses in the main body of a document and automatically generating the tree-shaped logical structure.
Abstract: PURPOSE: To convert a text file which is represented with linear character strings into a hierarchical tree structure by analyzing index character strings corresponding to the chapters, paragraphs, and clauses in the main body of a document and automatically generating the tree-shaped logical structure. CONSTITUTION: A document read part 101 recognizes the characters of inputted document image data and the recognized document data are stored, document by document, in a document data storage part 103; and an index symbol analytic part 102 extracts index symbols and generates the logical structures of the documents from the meaning of the index symbols, and the generated logical structures are stored in the logical structure data storage part 104. A display control part 105 displays the logical structure of a document on a terminal device 106 with a screen according to the stored logical structure data. Consequently, the document file which is represented with linear character strings can be converted into the hierarchical tree structure. COPYRIGHT: (C)1992,JPO&Japio
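
The conversion from linear text to a tree described in the abstract above can be sketched in a few lines. This is an illustrative sketch only: it assumes that index strings are dotted decimal numbers at the start of a line (such as "1." or "2.1"), whereas the patent speaks more generally of index symbols for chapters, paragraphs and clauses. The names `Node`, `INDEX_RE` and `build_tree` are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Assumed index format: "1", "1.", "1.2", "1.2.3" followed by a title.
INDEX_RE = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+(.*)$")


@dataclass
class Node:
    index: str
    title: str
    body: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def build_tree(lines: List[str]) -> Node:
    """Build a logical tree from linear text using the index strings."""
    root = Node("", "document")
    stack = [root]  # stack[d] is the currently open node at depth d
    for raw in lines:
        line = raw.strip()
        match = INDEX_RE.match(line)
        if match:
            depth = match.group(1).count(".") + 1
            depth = min(depth, len(stack))  # tolerate skipped levels
            del stack[depth:]               # close deeper sections
            node = Node(match.group(1), match.group(2))
            stack[-1].children.append(node)
            stack.append(node)
        elif line:
            stack[-1].body.append(line)     # plain text joins the open section
    return root
```

A body line that happens to begin with a number would be misread as a heading under this assumption, which is one reason a real system would analyze the meaning of the index symbols rather than rely on a single pattern.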

Journal Article
TL;DR: The authors have developed a document image structure analysis method to generate a layout structure, as well as to detect such document elements as characters, pictures and figures.
Abstract: A document input system, with character recognition technique, is used for converting printed matter, such as books and magazines, into code-format information. In order to improve this document input system's performance, an appropriate document structure analysis technique is indispensable. When storing data from general printed documents into a database, it is necessary to represent the document structure. Therefore, a document layout structure generation method is especially important. For this purpose, the authors have developed a document image structure analysis method to generate a layout structure, as well as to detect such document elements as characters, pictures and figures. This method was developed on a personal computer. Its usability is described in this paper.

Journal ArticleDOI
TL;DR: More flexible and interactive formatting editors for structured document preparation presuppose a strong distinction between logical and layout structure and incorporate a formal description of the mapping by which the layout is derived from the logical structure.

Patent
03 Jan 1990
TL;DR: In this paper, a general layout structure of a document is used to optimize its processing by identifying the possible layout presentation constructs appearing in the subsequent specific instance of the conforming document.
Abstract: A method is disclosed for utilizing a general layout structure of a document which contains relationships within its layout constructs that offer choices when creating the document and conforming instances of logical elements with the general layout structure, taking into account specific device characteristics, to generate the final-form document. The relationships are defined as expressions similar to those existing in general logical document structure definitions. Thus, an intermediate phase of document interchange between revision and final-form is introduced which saves data transmission time and gives the receiver some flexibility in presentation options while still conforming to a general layout definition. Further, the general layout definition may be used by a receiver to optimize its processing by identifying the possible layout presentation constructs appearing in the subsequent specific instance of the conforming document.

Patent
13 Dec 1990
TL;DR: A document formatting apparatus automatically arranges input document data so as to match a pre-formatted document, by extracting the layout structure, including character properties and logical properties of each text data item, from the pre-formatted document and applying the matching character properties to the input text data.
Abstract: A document formatting apparatus automatically arranges input document data so as to match a pre-formatted document. Firstly, the layout structure, including character properties and logical properties of each text data item, is extracted from a pre-formatted document. Secondly, logical properties of each text data item are extracted from an input document. Thirdly, each of the logical properties of the input document is compared with the corresponding logical properties of the pre-formatted document. When the logical properties of input text data match logical properties of the pre-formatted document, the corresponding character properties of the pre-formatted document are applied to the input text data. Therefore, each text data item of the input document is automatically arranged in accordance with the preset layout structure and corresponding character properties.
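
The matching step in the abstract above (compare logical properties, then copy over the corresponding character properties) can be sketched as a dictionary lookup. This is a simplified illustration under stated assumptions: each text data item is modelled as a dict with a `"logical"` label and, for the pre-formatted document, a `"style"` dict of character properties. The function names and the dict layout are hypothetical, and the extraction of logical properties from raw documents is assumed to have happened already.

```python
from typing import Dict, List


def extract_template(preformatted: List[dict]) -> Dict[str, dict]:
    """Map each logical property (e.g. "title") to its character properties."""
    return {item["logical"]: item["style"] for item in preformatted}


def apply_template(template: Dict[str, dict], items: List[dict]) -> List[dict]:
    """Apply template character properties to items whose logical property matches."""
    formatted = []
    for item in items:
        style = template.get(item["logical"])
        if style is not None:
            # Matched: take the character properties of the pre-formatted document.
            formatted.append({**item, "style": dict(style)})
        else:
            # No match: leave the item unchanged.
            formatted.append(dict(item))
    return formatted
```

A usage example: with a template built from `[{"logical": "title", "style": {"font": "bold", "size": 18}}]`, an input item `{"logical": "title", "text": "Report"}` comes back with the bold 18-point style attached, while an item labelled `"footer"` is returned untouched.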

DOI
01 Jan 1990
TL;DR: A knowledge-based approach, developed for the identification of logical objects in a document image, is described, which has been implemented for the analysis of single-sided business letters in Common Lisp on a SUN 3/60 Workstation.
Abstract: This report focuses on the analysis steps necessary for paper document processing. It is divided into three major parts: document image preprocessing, knowledge-based geometric classification of the image, and expectation-driven text recognition. It first illustrates the low-level image processing procedures that provide the physical document structure of a scanned document image. Furthermore, it describes a knowledge-based approach, developed for the identification of logical objects (e.g., the sender or the footnote of a letter) in a document image. The logical identifiers provide a context-restricted consideration of the contained text. Using specific logical dictionaries, expectation-driven text recognition can identify text parts of specific interest. The system has been implemented for the analysis of single-sided business letters in Common Lisp on a SUN 3/60 Workstation. It is running for a large population of different letters. The report also illustrates and discusses examples of typical results obtained by the system.