Topic

Document layout analysis

About: Document layout analysis is a research topic. To date, 1,462 publications on this topic have received 34,021 citations.


Papers
Patent
14 Nov 1997
TL;DR: A document search system automatically segments document images into one or more layout objects and then computes a set of attributes for each identified layout object. The attributes describe the layout structure of a page image in terms of the spatial relations that layout objects have to frames of reference.
Abstract: A programming interface of a document search system enables a user to dynamically specify features of documents recorded in a corpus of documents. The programming interface provides category and format flexibility for defining different genres of documents. The document search system initially segments document images into one or more layout objects. Each layout object identifies a structural element in a document such as text blocks, graphics, or halftones. Subsequently, the document search system computes a set of attributes for each of the identified layout objects. The set of attributes is used to describe the layout structure of a page image of a document in terms of the spatial relations that layout objects have to frames of reference that are defined by other layout objects. Using the set of attributes, a user defines features of a document with the programming interface. After receiving a feature or attribute and a set of document images selected by a user, the system forms a set of image segments by identifying those layout objects in the set of document images that make up the selected feature or attribute. The system then sorts the set of image segments into meaningful groupings of objects which have similarities and/or recurring patterns. In operation, the system sorts images in the image domain based on segments (or portions) of a document image which have been automatically extracted by the system. As a result, searching becomes more efficient because it is performed on limited portions of a document. Subsequently, document images in the set of document images are ordered and displayed to a user in accordance with the meaningful groupings.

113 citations
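
The idea of describing layout objects by their spatial relation to a frame of reference can be illustrated with a minimal Python sketch. Everything named here (`LayoutObject`, `relative_attributes`, `select_feature`, and the particular attributes computed) is a hypothetical illustration, not the patent's actual interface or attribute set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayoutObject:
    """A rectangular structural element; names and fields are assumptions."""
    kind: str      # e.g. "text", "graphics", "halftone"
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float


def relative_attributes(obj, frame):
    """Describe obj by its position and size relative to a frame of reference.

    The frame is itself a LayoutObject (here it could be the whole page).
    """
    return {
        "kind": obj.kind,
        "rel_left": (obj.x - frame.x) / frame.width,
        "rel_top": (obj.y - frame.y) / frame.height,
        "rel_width": obj.width / frame.width,
        "rel_height": obj.height / frame.height,
    }


def select_feature(objects, frame, predicate):
    """Return the layout objects whose relative attributes satisfy predicate."""
    return [o for o in objects if predicate(relative_attributes(o, frame))]
```

Under these assumptions, a user-defined feature such as "text block near the top of the page" becomes a predicate like `lambda a: a["kind"] == "text" and a["rel_top"] < 0.1` passed to `select_feature`.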

Patent
04 Oct 1994
TL;DR: A document marker, including first values dependent upon the layout and the contents of the document and assigned by generating or preprocessing software, is provided in machine-readable symbology on the face of a printed version of the document.
Abstract: A document marker, including first values dependent upon the layout and the contents of the document and assigned by generating or preprocessing software, is provided in machine-readable symbology on the face of a printed version of the document. The marker may include encoded document layout information and values assigned on sequences of the original text, including text-dependent decimation sequences, error correction codes or check-sums. Upon optical character recognition scanning, or other digitizing reproduction, the marker is also scanned. The scanning computer, having corresponding software, assigns second values dependent upon the layout and contents of the reproduced document. Upon comparison of the first and second decimation sequences, line and character errors can be detected and some errors corrected, thereby generating re-aligned candidate sequences. Optional error correction codes can provide further correcting capabilities, as applied to the re-aligned reproduced document sequences; and an optional check-sum comparison can be utilized to verify that the reproduced sequences are correct.

112 citations
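
The comparison of first values (assigned at generation) against second values (assigned after scanning) can be sketched in Python as follows. This is only an illustration of the check-sum comparison step: it uses CRC-32 from the standard `zlib` module as a stand-in for the patent's check values, it does not implement decimation sequences or error correction, and the function names `make_marker` and `find_line_errors` are hypothetical.

```python
import zlib


def make_marker(lines):
    """Assign one check value per text line (CRC-32 is an assumed stand-in)."""
    return [zlib.crc32(line.encode("utf-8")) for line in lines]


def find_line_errors(marker, scanned_lines):
    """Return indices of lines whose recomputed value differs from the marker.

    Assumes the scan produced the same number of lines as the original;
    re-alignment after dropped or merged lines is not handled here.
    """
    second_values = make_marker(scanned_lines)
    return [
        i
        for i, (first, second) in enumerate(zip(marker, second_values))
        if first != second
    ]
```

For instance, if OCR misreads a zero as the letter O in the first line, `find_line_errors` reports index 0 while leaving correctly recognized lines unflagged.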

Patent
15 Dec 1995
TL;DR: A method of displaying information in a computer system that includes a plurality of document images, text files, and positions files, where a first text file represents the optical character recognized text of a corresponding first document image.
Abstract: A method of displaying information in a computer system is described. The computer system includes a plurality of document images, a plurality of text files, and a plurality of positions files. A first text file of the plurality of text files represents optical character recognized text of a corresponding first document image of the plurality of document images. A first positions file of the plurality of positions files relates character information in the first text file to a position in the first document image. The computer system searches the plurality of text files using a search term to generate a set of found text files. Each found text file of the set of found text files includes at least a first matching text string to the search term; the set of found text files includes the first text file. The system accesses the first positions file to determine a first region in the first document image corresponding to the first matching text string. The system displays the first document image including displaying a first enhanced view of the first region, the first enhanced view being enhanced relative to a display of the first document image, the first enhanced view being determined from a previously stored visual enhancement definition.

111 citations
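
The lookup from a matching text string to a region of the document image can be sketched in Python. The data layout is an assumption made for illustration: `positions[i]` holds a bounding box `(x0, y0, x1, y1)` for character `i` of the recognized text. The patent does not specify the positions file format, and `find_regions` is a hypothetical name.

```python
def find_regions(text, positions, term):
    """Return one bounding box per occurrence of term in text.

    positions[i] is the assumed (x0, y0, x1, y1) box of text[i] in the image.
    Each returned box is the union of the boxes of the matched characters.
    """
    if not term:
        return []
    regions = []
    start = text.find(term)
    while start != -1:
        boxes = positions[start:start + len(term)]
        regions.append((
            min(b[0] for b in boxes),
            min(b[1] for b in boxes),
            max(b[2] for b in boxes),
            max(b[3] for b in boxes),
        ))
        start = text.find(term, start + 1)
    return regions
```

A viewer could then draw an enhanced view (for example, a highlight) over each returned box when displaying the document image.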

Journal Article
TL;DR: The proposed technique retrieves document images by a new word shape coding scheme, which captures the document content through annotating each word image by a word shape code.
Abstract: This paper presents a document retrieval technique that is capable of searching document images without optical character recognition (OCR). The proposed technique retrieves document images by a new word shape coding scheme, which captures the document content through annotating each word image by a word shape code. In particular, we annotate word images by using a set of topological shape features including character ascenders/descenders, character holes, and character water reservoirs. With the annotated word shape codes, document images can be retrieved by either query keywords or a query document image. Experimental results show that the proposed document image retrieval technique is fast, efficient, and tolerant to various types of document degradation.

111 citations
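
To give a feel for word shape coding, here is a deliberately simplified Python sketch. It is not the paper's method: it works on known characters rather than word images, and it uses only an assumed ascender/descender classification, omitting the character holes and water reservoirs the abstract mentions. The letter sets and the function name `word_shape_code` are illustrative assumptions.

```python
# Assumed, simplified classes: which lowercase letters rise above the
# x-height and which drop below the baseline in a typical Latin typeface.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def word_shape_code(word):
    """Map each character to 'A' (ascender), 'D' (descender), or 'x' (neither).

    Uppercase letters and digits are treated as ascenders, since they are
    usually taller than the x-height.
    """
    code = []
    for ch in word:
        if ch in ASCENDERS or ch.isupper() or ch.isdigit():
            code.append("A")
        elif ch in DESCENDERS:
            code.append("D")
        else:
            code.append("x")
    return "".join(code)
```

With this coding, `word_shape_code("layout")` gives `"AxDxxA"`. Different words can share a code, which is why richer features matter for retrieval quality.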

Patent
19 Jan 1994
TL;DR: A word-based retrieval index is created based on title-type regions and/or text-type regions, the retrieval index being used in conjunction with a search query so as to be able to retrieve documents which match the search query.
Abstract: Method and apparatus for storing document images, for creating a retrieval index by which the document images may be retrieved, and for displaying the retrieved document images. When a document image is obtained, the document image is subjected to rule-based block selection techniques whereby individual regions within the document region are identified, and the types of regions are also identified, such as title-type regions, text-type regions, line art-type regions, halftone-type regions and color image-type regions. The identification is used to create structural information, and both the document image and the structural information are stored. A word-based retrieval index is created based on title-type regions and/or text-type regions, the retrieval index being used in conjunction with a search query so as to be able to retrieve documents which match the search query. The retrieved documents are displayed in either a full image mode or a rapid browsing mode. In the rapid browsing mode, the full image of the document is not displayed, but rather only an abstract structural view of the document image based on the stored structural information. The level of abstraction may be specified by the operator in connection with the identified structural regions of the document, whereby, for example, only a structural view is displayed, or only title-type regions are displayed mixed with remaining structure.

110 citations
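
A word-based retrieval index restricted to title-type and text-type regions can be sketched as a small inverted index in Python. The input format (a mapping from document id to `(region_type, recognized_text)` pairs), the whitespace tokenization, and the names `build_index` and `search` are assumptions for illustration; the patent's block selection and indexing details are not reproduced here.

```python
from collections import defaultdict

# Only these region types contribute words to the index (per the abstract).
INDEXED_TYPES = {"title", "text"}


def build_index(documents):
    """Build word -> set of document ids from title/text regions only.

    documents: dict mapping doc_id to a list of (region_type, text) pairs.
    """
    index = defaultdict(set)
    for doc_id, regions in documents.items():
        for region_type, text in regions:
            if region_type in INDEXED_TYPES:
                for word in text.lower().split():
                    index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

Words that appear only in halftone or line-art regions never enter the index, so a query for them returns no documents.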


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 82% related
Feature (computer vision): 128.2K papers, 1.7M citations, 82% related
Object detection: 46.1K papers, 1.3M citations, 81% related
Image segmentation: 79.6K papers, 1.8M citations, 80% related
Convolutional neural network: 74.7K papers, 2M citations, 79% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    5
2022    19
2021    34
2020    19
2019    14
2018    9