
Document layout analysis

About: Document layout analysis is a research topic. Over its lifetime, 1,462 publications have appeared within this topic, receiving 34,021 citations.


Papers
Journal Article
TL;DR: This paper proposes a new method, called the document multithresholding technique, based on a page layout analysis (PLA) technique and on a neural-network multilevel threshold-selection approach. The method is applicable to any mixed-type document and achieves document multithresholding by taking advantage of the types of the document blocks.

25 citations

Proceedings Article
01 Dec 2012
TL;DR: A new text line extraction technique based on Spiral Run Length Smearing Algorithm (SRLSA) is reported, where the digitized document image is partitioned into a number of vertical fragments of equal width and the text line segments present in these fragments are identified by applying SRLSA.
Abstract: Extraction of text lines from document images is one of the important steps in the process of an Optical Character Recognition (OCR) system. In the case of handwritten document images, the presence of skewed, touching or overlapping text line(s) makes this process a real challenge to the researcher. In the present work, a new text line extraction technique based on Spiral Run Length Smearing Algorithm (SRLSA) is reported. Firstly, the digitized document image is partitioned into a number of vertical fragments of equal width. Then all the text line segments present in these fragments are identified by applying SRLSA. Finally, the neighboring text line segments are analyzed and merged (if necessary) to place them inside the same text line boundary in which they actually belong. For experimental purposes, the technique is tested on the CMATERdb1.1.1 and CMATERdb1.2.1 databases. The present technique successfully extracts 87.09% and 89.35% of the text lines from the said databases, respectively.

25 citations
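
The abstract above does not spell out how the spiral variant works, so the following is only a minimal sketch of the classic horizontal run-length smearing idea applied inside equal-width vertical fragments. It is not the authors' SRLSA. The function names `smear_row` and `smear_fragments`, the 0/1 list-of-lists image encoding, and the `threshold` parameter are assumptions made for illustration.

```python
def smear_row(row, threshold):
    """Classic horizontal run-length smearing on a single row.

    row: list of 0/1 ints, where 1 is ink (foreground).
    Background gaps of length <= threshold that lie between two ink
    pixels are filled with 1, so nearby strokes fuse into one segment.
    """
    out = list(row)
    last_ink = None  # index of the most recent ink pixel seen so far
    for x, pixel in enumerate(row):
        if pixel == 1:
            if last_ink is not None:
                gap = x - last_ink - 1
                if 0 < gap <= threshold:
                    for k in range(last_ink + 1, x):
                        out[k] = 1
            last_ink = x
    return out


def smear_fragments(image, fragment_width, threshold):
    """Cut the page into vertical fragments of equal width and smear
    each fragment independently, row by row.

    Gaps that straddle a fragment boundary are deliberately not filled;
    the last fragment is narrower if the width does not divide evenly.
    """
    smeared = []
    for row in image:
        new_row = []
        for start in range(0, len(row), fragment_width):
            fragment = row[start:start + fragment_width]
            new_row.extend(smear_row(fragment, threshold))
        smeared.append(new_row)
    return smeared
```

Working per fragment keeps each smeared segment short, which is what lets a later merging step follow skewed lines instead of fusing neighbouring lines across the whole page width.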

01 Jan 1997
TL;DR: This paper presents the layout strategy developed for GraphVisualizer3D, which combines manual layout techniques and automatic algorithms in a synergistic manner, and a grid system is provided that can be nested to any arbitrary depth.
Abstract: There is increasing evidence that 3D visualization of complex structures has advantages over 2D visualization. While nested directed graphs are an important method of representing information in 2D or 3D, they must be effectively organized in order to be understood. Most work on graph layout has assumed that fully automatic layout is desirable. Through our work with graphs representing large software structures, we have found that, due to the importance of the semantic content, it is necessary to combine automatic layout with manual layout. This paper describes a system called GraphVisualizer3D, which was designed to help people understand large nested graph structures by displaying them in 3D. This system is currently being applied to the problem of understanding large bodies of software. In this paper we present the layout strategy developed for GraphVisualizer3D, which combines manual layout techniques and automatic algorithms in a synergistic manner. In order to facilitate manual layout, a grid system is provided that can be nested to any arbitrary depth. The automatic layout is accomplished by layering followed by a node migration algorithm, whereby nodes migrate to their final position under the influence of a variety of different forces. Options are provided to allow users to switch back and forth between manual layout and automatic layout. GV3D has been tested with large examples containing more than 35,000 nodes and 40,000 relationships.

25 citations

Proceedings Article
07 Mar 1996
TL;DR: In this article, a bottom-up method for recognizing tables within a document is proposed based on the paradigm of graph rewriting, where the document image is transformed into a layout graph whose nodes and edges represent document entities and their interrelations respectively.
Abstract: This paper proposes a bottom-up method for recognizing tables within a document. This method is based on the paradigm of graph-rewriting. First, the document image is transformed into a layout graph whose nodes and edges represent document entities and their interrelations respectively. This graph is subsequently rewritten using a set of rules designed based on a priori document knowledge and general formatting conventions. The resulting graph provides a logical view of the document content. It can be parsed to provide general format analysis information.

25 citations
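
To make the graph-rewriting idea in the abstract above concrete, here is a small sketch of a layout graph and a single rewrite rule. The paper's actual rule set is not given in the abstract, so the node labels (`"word"`, `"row"`), the relation names (`"h_aligned"`, `"above"`), and the rule itself are hypothetical and chosen only for illustration.

```python
def apply_row_rule(nodes, edges):
    """Apply one rewrite step in place; return True if the graph changed.

    nodes: dict mapping node id -> label.
    edges: set of (source, relation, target) triples.

    Hypothetical rule: a 'word' or 'row' node linked by 'h_aligned' to
    a 'word' node is rewritten into a single 'row' node. The absorbed
    node's remaining edges are redirected to the surviving node.
    """
    for a, rel, b in sorted(edges):
        if rel == "h_aligned" and nodes[a] in ("word", "row") and nodes[b] == "word":
            nodes[a] = "row"
            del nodes[b]
            redirected = set()
            for s, r, d in edges:
                s2 = a if s == b else s
                d2 = a if d == b else d
                if s2 != d2:  # drop self-loops created by the merge
                    redirected.add((s2, r, d2))
            edges.clear()
            edges.update(redirected)
            return True
    return False


def rewrite(nodes, edges):
    """Rewrite until no rule matches. Each successful step removes one
    node, so the loop always terminates."""
    while apply_row_rule(nodes, edges):
        pass
    return nodes, edges
```

For example, three words chained by `h_aligned` edges collapse into one `row` node, while an `above` edge from the first word to a fourth entity is preserved on the merged node.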

Book Chapter
01 Jan 2008
TL;DR: This chapter investigates the problem of detecting the reading order relationship between components of a logical structure, using a machine learning method that exploits only spatial information on the page layout.
Abstract: Document image understanding refers to logical and semantic analysis of document images in order to extract information understandable to humans and codify it into machine-readable form. Most of the studies on document image understanding have targeted the specific problem of associating layout components with logical labels, while less attention has been paid to the problem of extracting relationships between logical components, such as cross-references. In this chapter, we investigate the problem of detecting the reading order relationship between components of a logical structure. The domain specific knowledge required for this task is automatically acquired from a set of training examples by applying a machine learning method. The input of the learning method is the description of “chains” of layout components defined by the user. The output is a logical theory which defines two predicates, first_to_read/1 and succ_in_reading/2, useful for consistently reconstructing all chains in the training set. Only spatial information on the page layout is exploited for both single and multiple chain reconstruction. The proposed approach has been evaluated on a set of document images processed by the system WISDOM++. Documents are characterized by two important structures: the layout structure and the logical structure. Both are the results of repeatedly dividing the content of a document into increasingly smaller parts, and are typically represented by means of a tree structure. The difference between them is the criteria adopted for structuring the document content: the layout structure is based on the presentation of the content, while the logical structure is based on the human-perceptible meaning of the content.
The extraction of the layout structures from images of scanned paper documents is a complex process, typically denoted as document layout analysis, which involves several steps including preprocessing, page decomposition (or segmentation), classification of segments according to content type (e.g., text, graphics, pictures) and hierarchical organization on the basis of perceptual

25 citations
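
The abstract above names two learned predicates, one marking the first component to read and one relating a component to its successor. Assuming those predicates have already been evaluated into plain facts, chain reconstruction can be sketched as below. The function name `reconstruct_chains`, the string block ids, and the assumption that each component has at most one successor are illustrative choices, not details taken from the chapter.

```python
def reconstruct_chains(first_to_read, succ_in_reading):
    """Rebuild reading-order chains from predicate facts.

    first_to_read: iterable of component ids that start a chain.
    succ_in_reading: iterable of (a, b) pairs meaning b is read
        immediately after a. Assumes at most one successor per
        component; a later pair for the same `a` overrides an earlier one.
    """
    successor = dict(succ_in_reading)
    chains = []
    for start in first_to_read:
        chain = [start]
        seen = {start}
        # Follow successor links, stopping at the chain end or on a cycle.
        while chain[-1] in successor and successor[chain[-1]] not in seen:
            nxt = successor[chain[-1]]
            chain.append(nxt)
            seen.add(nxt)
        chains.append(chain)
    return chains
```

Passing several starting components yields one chain per start, which corresponds to the multiple-chain case mentioned in the abstract.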


Network Information
Related Topics (5)
Feature extraction
111.8K papers, 2.1M citations
82% related
Feature (computer vision)
128.2K papers, 1.7M citations
82% related
Object detection
46.1K papers, 1.3M citations
81% related
Image segmentation
79.6K papers, 1.8M citations
80% related
Convolutional neural network
74.7K papers, 2M citations
79% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    5
2022    19
2021    34
2020    19
2019    14
2018    9