Topic

Document layout analysis

About: Document layout analysis is a research topic. To date, 1462 publications have appeared within this topic, receiving 34021 citations in total.


Papers
Patent
02 Aug 1991
TL;DR: A high-speed document verification system checks documents printed with a pattern whose predetermined arrangement of differing reflectivity (due to varying densities, line resolutions, or fluorescence) encodes information about the document.
Abstract: A high-speed document verification system includes a document which is printed with a pattern having a predetermined arrangement of different reflectivity due to varying densities, line resolutions, or fluorescence. The arrangement represents information about the document. The document is fed into a high-speed document scanner sensitive to the varying ink densities or line resolutions. A graphic image of the document is produced by the scanner and this image or a graphic file of the image is checked to see if the proper pattern exists. A comparison unit, such as an optical character recognition system, may be used to compare the scanned document's image with known density arrangements of valid documents to determine what information, if any, is represented by the arrangement. The graphic image may be sent to an operator's work station to be visually checked rather than being compared by the comparison unit, or the image may be sent to the operator after it has been rejected by the comparison unit.
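The comparison step described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: reflectivity readings are normalized to [0, 1], they are quantized into three density levels, and the names `quantize`, `verify`, and `known_patterns` are hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple

Pattern = Tuple[int, ...]


def quantize(readings: Sequence[float],
             thresholds: Sequence[float] = (0.33, 0.66)) -> Pattern:
    """Map each reflectivity reading in [0, 1] to a discrete density level.

    The thresholds are illustrative assumptions, not values from the patent.
    """
    return tuple(sum(r >= t for t in thresholds) for r in readings)


def verify(readings: Sequence[float],
           known_patterns: Dict[Pattern, str]) -> Tuple[str, Optional[str]]:
    """Compare a scanned pattern with known arrangements of valid documents.

    Returns ("accepted", info) when the pattern is known; otherwise the scan
    is routed to an operator, mirroring the fallback the abstract mentions.
    """
    info = known_patterns.get(quantize(readings))
    if info is None:
        return ("operator_review", None)
    return ("accepted", info)


# Hypothetical table of valid arrangements and the information they encode.
known_patterns = {(0, 2, 1, 2): "series A"}
result_ok = verify([0.1, 0.9, 0.5, 0.8], known_patterns)
result_bad = verify([0.9, 0.9, 0.9, 0.9], known_patterns)
```

An exact dictionary lookup is the simplest possible comparison; a real scanner would presumably need some tolerance for noise in the readings.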

67 citations

Journal Article
TL;DR: A system for segmenting and understanding text and mathematical expressions in a document can be divided into six stages: page segmentation and labeling, character segmentation, feature extraction, character recognition, expression formation, and error correction and expression extraction.

66 citations

Journal Article
TL;DR: A geometric matching algorithm finds the optimal page frame of structured documents (journal articles, books, magazines) by exploiting their text alignment property; removing characters outside the computed page frame reduces the OCR error rate.
Abstract: When a page of a book is scanned or photocopied, textual noise (extraneous symbols from the neighboring page) and/or non-textual noise (black borders, speckles, ...) appear along the border of the document. Existing document analysis methods can handle non-textual noise reasonably well, whereas textual noise still presents a major issue for document analysis systems. Textual noise may result in undesired text in optical character recognition (OCR) output that needs to be removed afterwards. Existing document cleanup methods try to explicitly detect and remove marginal noise. This paper presents a new perspective for document image cleanup by detecting the page frame of the document. The goal of page frame detection is to find the actual page contents area, ignoring marginal noise along the page border. We use a geometric matching algorithm to find the optimal page frame of structured documents (journal articles, books, magazines) by exploiting their text alignment property. We evaluate the algorithm on the UW-III database. The results show that the error rates are below 4% for each of the performance measures used. Further tests were run on a dataset of magazine pages and on a set of camera-captured document images. To demonstrate the benefits of using page frame detection in practical applications, we choose OCR and layout-based document image retrieval as sample applications. Experiments using a commercial OCR system show that by removing characters outside the computed page frame, the OCR error rate is reduced from 4.3% to 1.7% on the UW-III dataset. The use of page frame detection in a layout-based document image retrieval application decreases the retrieval error rates by 30%.
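The cleanup step that follows page frame detection (discarding characters outside the frame) might look like the sketch below. It assumes the frame is already known and does not implement the paper's geometric matching. The `Box` and `Glyph` types and the center-point test are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def center(self) -> Tuple[float, float]:
        return ((self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2)


@dataclass(frozen=True)
class Glyph:
    """One OCR'd character with its bounding box (hypothetical type)."""
    text: str
    box: Box


def inside(frame: Box, box: Box) -> bool:
    """True when the center of `box` lies within `frame` (an assumed rule)."""
    cx, cy = box.center()
    return frame.x0 <= cx <= frame.x1 and frame.y0 <= cy <= frame.y1


def filter_to_page_frame(glyphs: List[Glyph], frame: Box) -> List[Glyph]:
    """Keep only the glyphs that fall inside the detected page frame."""
    return [g for g in glyphs if inside(frame, g.box)]


# The frame is taken as given here; detecting it is the paper's contribution.
frame = Box(5, 5, 100, 100)
glyphs = [
    Glyph("a", Box(10, 10, 20, 20)),  # body text inside the frame
    Glyph("x", Box(0, 10, 4, 20)),    # textual noise from the neighboring page
]
kept = [g.text for g in filter_to_page_frame(glyphs, frame)]
```

Testing the center point rather than requiring full containment keeps characters that slightly overlap the frame edge; either rule would be a defensible choice.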

66 citations

Patent
08 Nov 1994
TL;DR: A document layout processing device for structured documents stores a document's logical structure, which has a preselected page format with a plurality of columns, together with the document contents corresponding to each component of the logical structure and layout directive information indicating whether each component should be laid out in a single column or in a multi-column area.
Abstract: A document layout processing device for the layout of a structured document is disclosed wherein a logical structure of a document is stored in the device; the logical structure has a preselected specific page format with a plurality of columns; document contents corresponding to each of the components of the logical structure; and a layout directive information indicating whether the components of the logical structure should be laid out in a single column or in a multi-column area, whereby a content layout method lays out the document in one of the columns or in the multi-column area according to the logical structure while referring to the layout directive information; and a method of using the device for generating a multi-column area that extends over a number of columns including a specific column.
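One way to picture a layout directive is the sketch below. It is not the patented method. It assumes each component carries a height and a directive of either `"single"` or `"multi"`, that single-column components go to the currently shortest column, and that a multi-column component spans all columns below everything already placed. The function name `layout` and the tuple format are hypothetical.

```python
from typing import List, Tuple, Union

Component = Tuple[str, float, str]             # (name, height, directive)
Placement = Tuple[str, Union[int, str], float]  # (name, column or "all", y)


def layout(components: List[Component], n_columns: int) -> List[Placement]:
    """Place components top-down according to their layout directive."""
    heights = [0.0] * n_columns  # current fill height of each column
    placements: List[Placement] = []
    for name, height, directive in components:
        if directive == "multi":
            # A spanning area starts below the tallest column and
            # pushes every column down by its own height.
            y = max(heights)
            placements.append((name, "all", y))
            heights = [y + height] * n_columns
        else:
            # Assumed policy: fill the shortest column first.
            col = min(range(n_columns), key=heights.__getitem__)
            placements.append((name, col, heights[col]))
            heights[col] += height
    return placements


placed = layout(
    [("title", 2, "multi"), ("a", 5, "single"),
     ("b", 3, "single"), ("fig", 4, "multi")],
    n_columns=2,
)
```

In this example the title spans both columns at the top, `a` and `b` fill columns 0 and 1, and the figure spans both columns below the taller of the two.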

66 citations

Proceedings Article
21 Dec 2000
TL;DR: A new paradigm, 'random graph probing,' is described for comparing the results returned by the recognition system and the representation created during ground-truthing, which could be applied to other document recognition tasks and perhaps even other computer vision problems as well.
Abstract: Tables are an important means for communicating information in written media, and understanding such tables is a challenging problem in document layout analysis. In this paper we describe a general solution to the problem of recognizing the structure of a detected table region. First, hierarchical clustering is used to identify columns, and then spatial and lexical criteria are used to classify headers. We also address the problem of evaluating table structure recognition. Our model is based on a directed acyclic attribute graph, or table DAG. We describe a new paradigm, 'random graph probing,' for comparing the results returned by the recognition system and the representation created during ground-truthing. Probing is in fact a general concept that could be applied to other document recognition tasks and perhaps even other computer vision problems as well.
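The column-identification step can be illustrated with a small agglomerative clustering sketch. The abstract does not state which features, linkage, or stopping rule the authors use, so the choices here are assumptions: words are clustered by the x-coordinate of their centers, using single linkage and a hypothetical `max_gap` cutoff.

```python
from typing import List


def cluster_columns(x_centers: List[float], max_gap: float) -> List[List[float]]:
    """Single-linkage agglomerative clustering of 1-D positions.

    Starts with one cluster per point and repeatedly merges the two closest
    clusters until the smallest remaining gap exceeds `max_gap`.
    """
    clusters = [[x] for x in sorted(x_centers)]
    while len(clusters) > 1:
        # In sorted 1-D data the closest pair of clusters is always adjacent,
        # and their single-linkage distance is the gap between their edges.
        gaps = [clusters[i + 1][0] - clusters[i][-1]
                for i in range(len(clusters) - 1)]
        i = min(range(len(gaps)), key=gaps.__getitem__)
        if gaps[i] > max_gap:
            break
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters


# Word centers from a hypothetical three-column table region.
columns = cluster_columns([10, 12, 11, 50, 52, 90], max_gap=5)
```

Each resulting cluster corresponds to one candidate column. Classifying headers with spatial and lexical criteria would be a separate step that is not sketched here.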

65 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 82% related
Feature (computer vision): 128.2K papers, 1.7M citations, 82% related
Object detection: 46.1K papers, 1.3M citations, 81% related
Image segmentation: 79.6K papers, 1.8M citations, 80% related
Convolutional neural network: 74.7K papers, 2M citations, 79% related
Performance Metrics
No. of papers in the topic in previous years
Year  Papers
2023  5
2022  19
2021  34
2020  19
2019  14
2018  9