Topic

Document layout analysis

About: Document layout analysis is a research topic. Over the lifetime of the topic, 1462 publications have appeared, receiving 34021 citations.


Papers
Patent
30 Jun 1995
TL;DR: This patent determines the geometric and logical structure of a document page from its image by partitioning the image into text and non-text regions, which are then organized into related groups in the correct reading order.
Abstract: Apparatus and method are provided which determine the geometric and logical structure of a document page from its image. The document image is partitioned into regions (both text and non-text) which are then organized into related "article" groups in the correct reading order. The invention uses image-based features, text-based features, and assumptions based on knowledge of expected layout, to find the correct reading order of the text blocks on a document page. It can handle complex layouts which have multiple configurations of columns on a page and inset material (such as figures and inset text blocks). The apparatus comprises two main components, a geometric page segmentor and a logical page organizer. The geometric page segmentor partitions a binary image of a document page into fundamental units of text or non-text, and produces a list of rectangular blocks, their locations on the page in points (1/72 inch), and the locations of horizontal rule lines on the page. The logical page organizer receives a list of text region locations, rule line locations, associated ASCII text (as found from an OCR) for the text blocks, and a list of text attributes (such as face style and point size). The logical page organizer groups appropriately the components (both text and non-text) which comprise a document page, sequences them in a correct reading order and establishes the dominance pattern (e.g., find the lead article on a newspaper page).

158 citations
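
The abstract above describes a geometric page segmentor that outputs rectangular blocks located in points (1/72 inch) and a logical page organizer that sequences them in reading order. As a rough illustration only, and not the patented method, the following Python sketch orders blocks with a naive rule: group blocks into columns by horizontal overlap, read columns left to right, and read each column top to bottom. The `Block` class, the function names, and the top-left page origin are all assumptions made for this example; the patent additionally uses text features, rule lines, and layout knowledge that are not modeled here.

```python
from dataclasses import dataclass

POINTS_PER_INCH = 72  # the abstract gives block locations in points (1/72 inch)


@dataclass(frozen=True)
class Block:
    """Hypothetical rectangular block; coordinates in points, origin at top-left."""
    x0: float
    y0: float
    x1: float
    y1: float
    kind: str = "text"  # "text" or "non-text"


def points_to_inches(value):
    """Convert a length in points to inches."""
    return value / POINTS_PER_INCH


def group_into_columns(blocks):
    """Greedily group blocks whose horizontal extents overlap into columns."""
    columns = []
    for block in sorted(blocks, key=lambda b: b.x0):
        for column in columns:
            col_x0 = min(b.x0 for b in column)
            col_x1 = max(b.x1 for b in column)
            # Strict overlap test: touching edges do not count as overlap.
            if block.x0 < col_x1 and block.x1 > col_x0:
                column.append(block)
                break
        else:
            columns.append([block])
    return columns


def naive_reading_order(blocks):
    """Read columns left to right, and blocks within a column top to bottom."""
    ordered = []
    for column in group_into_columns(blocks):
        ordered.extend(sorted(column, key=lambda b: b.y0))
    return ordered


if __name__ == "__main__":
    page = [
        Block(324, 72, 540, 300),  # right column
        Block(72, 220, 288, 400),  # left column, lower block
        Block(72, 72, 288, 200),   # left column, upper block
    ]
    for block in naive_reading_order(page):
        print(block)
```

This simple rule breaks down on exactly the cases the abstract highlights, such as pages with multiple column configurations or inset material, which is why a dedicated logical organization step is needed.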

Patent
14 Jan 2004
TL;DR: This patent proposes a method to generate a minimum set of simplified and navigable web contents from a single web document that is oversized for targeted smaller devices, while preserving text, image, transactional and embedded presentation constraint information.
Abstract: A method is disclosed to generate, while preserving text, image, transactional and embedded presentation constraint information, a minimum set of simplified and navigable web contents from a single web document that is oversized for targeted smaller devices. The method includes a parser, a content tree builder, a document tree builder, a document simplifier, a virtual layout engine, a document partitioner, a content scalar and a markup generator. The parser generates markup and data tags from an HTML source document. The builder constructs a content tree. The simplifier transforms the document tree into an intermediate one defined by a subset of XHTML tags and attributes. Layout constraints, including size, area, placement order, and column/row relationships, are calculated for partitioning and scaling the document tree into sub document trees with assigned navigation order and hierarchical hyperlinks. A simplified HTML document is then generated with the markup generator.

155 citations

Proceedings ArticleDOI
20 Sep 1999
TL;DR: It is argued that experimental evaluation on relatively small test sets, as is very common in document analysis, has to be taken with extreme care from a statistical point of view.
Abstract: In document analysis, it is common to prove the usefulness of a component by an experimental evaluation. By applying the respective algorithms to a test sample, effectiveness measures such as recall, precision, and accuracy are computed. The goal of such an evaluation is two-fold: on the one hand, it shows that the absolute effectiveness of the algorithm is acceptable for practical use. On the other hand, the evaluation can prove that the algorithm has a better or worse effectiveness than another algorithm. We argue that experimental evaluation on relatively small test sets, as is very common in document analysis, has to be taken with extreme care from a statistical point of view. In fact, it is surprising how weak statements derived from such evaluations are.

153 citations
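
To make the point about small test sets concrete, the following Python sketch computes a Wilson score confidence interval for an accuracy measured on a test sample. This is a standard textbook interval chosen here for illustration; it is an assumption of this example and not necessarily the statistical analysis used in the paper. The function name and the sample counts are hypothetical.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to an approximate 95% confidence level.
    Returns a (lower, upper) tuple.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    p = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    centre = (p + z2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Same observed accuracy (90%) on a small and a larger hypothetical test set.
    for correct, total in [(45, 50), (900, 1000)]:
        low, high = wilson_interval(correct, total)
        print(f"{correct}/{total}: [{low:.3f}, {high:.3f}]")
```

With 45 correct out of 50, the interval spans roughly 0.79 to 0.96, so two algorithms whose measured accuracies differ by a few points on such a sample cannot be reliably ranked.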

Proceedings ArticleDOI
21 Jun 1994
TL;DR: Document image understanding encompasses the technology required to make paper documents equivalent to other computer exchange media like floppies, tapes, and CDROMs; this survey restricts itself to documents such as business letters, forms, and scientific and technical articles such as those found in archival journals and technical conferences.
Abstract: Document image understanding encompasses the technology required to make paper documents equivalent to other computer exchange media like floppies, tapes, and CDROMs. The physical reader of the paper document is the scanner, just as the physical reader of the floppy is the floppy drive, the physical reader of the tape cartridge is the tape cartridge drive, and the physical reader of the CDROM is the CDROM drive. In the survey presented, we restrict ourselves to documents such as business letters, forms, and scientific and technical articles such as those found in archival journals and technical conferences. Understanding such documents involves estimating the rotation skew of each document page, determining the geometric page layout, labeling blocks as text or non-text, determining the read order for text blocks, recognizing the text of text blocks through an OCR system, determining the logical page layout, and formatting the data and information of the document in a suitable way for use by a word processing system or by an information retrieval system.

152 citations

Journal ArticleDOI
TL;DR: The authors introduce a classification tree to manage the relationships among different classes of layout structures and propose a method to recognize the layout structures of multiple kinds of table-form document images.
Abstract: Many approaches have reported that knowledge-based layout recognition methods are very successful in classifying the meaningful data from document images automatically. However, these approaches are applicable to only the same kind of documents because they are based on the paradigm that specifies the structure definition information in advance so as to be able to analyze a particular class of documents intelligently. In this paper, the authors propose a method to recognize the layout structures of multiple kinds of table-form document images. For this purpose, the authors introduce a classification tree to manage the relationships among different classes of layout structures. The authors' recognition system has two modes: layout knowledge acquisition and layout structure recognition. In the layout knowledge acquisition mode, table-form document images are distinguished according to this classification tree, and then the structure description trees, which specify the logical structures of table-form documents, are generated automatically. In the layout structure recognition mode, individual item fields in the table-form document images are extracted and classified successfully by searching the classification tree and interpreting the structure description tree.

151 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 82% related
Feature (computer vision): 128.2K papers, 1.7M citations, 82% related
Object detection: 46.1K papers, 1.3M citations, 81% related
Image segmentation: 79.6K papers, 1.8M citations, 80% related
Convolutional neural network: 74.7K papers, 2M citations, 79% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    5
2022    19
2021    34
2020    19
2019    14
2018    9