Topic

Document layout analysis

About: Document layout analysis is a research topic. To date, 1462 publications have appeared within this topic, receiving 34021 citations in total.


Papers
Proceedings Article
23 Sep 2007
TL;DR: This paper detects the reading order of layout components with a data mining approach that acquires domain-specific knowledge from a set of training examples and induces a probabilistic classifier based on the Bayesian framework, which is then used to reconstruct either single or multiple chains of layout components.
Abstract: Determining the reading order for layout components extracted from a document image can be a crucial problem for several applications. It enables the reconstruction of a single textual element from texts associated to multiple layout components and makes both information extraction and content-based retrieval of documents more effective. A common aspect for all methods reported in the literature is that they strongly depend on the specific domain and are scarcely reusable when the classes of documents or the task at hand changes. In this paper, we investigate the problem of detecting the reading order of layout components by resorting to a data mining approach which acquires the domain specific knowledge from a set of training examples. The input of the learning method is the description of the "chains" of layout components defined by the user. Only spatial information is exploited to describe a chain, thus making the proposed approach also applicable to the cases in which no text can be associated to a layout component. The method induces a probabilistic classifier based on the Bayesian framework which is used for reconstructing either single or multiple chains of layout components. It has been evaluated on a set of document images.
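The idea in the abstract above (learn from user-defined chains, use only spatial information, score candidate successors with a Bayesian classifier) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two discretized features in `relation`, the naive Bayes model with Laplace smoothing, the names `FollowsClassifier` and `reconstruct_chain`, and the greedy reconstruction of a single chain that starts at the topmost, leftmost component.

```python
import math
from collections import Counter

def relation(a, b):
    """Discretized position of box b relative to box a (hypothetical features).
    Boxes are (x0, y0, x1, y1) with y growing downward."""
    ax, ay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bx, by = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    horiz = "right" if bx > ax else ("left" if bx < ax else "same")
    vert = "below" if by > ay else ("above" if by < ay else "same")
    return horiz, vert

class FollowsClassifier:
    """Naive Bayes estimate of whether component b directly follows a."""

    def __init__(self):
        self.class_counts = Counter()    # label -> number of training pairs
        self.feature_counts = Counter()  # (label, feature index, value) -> count

    def fit(self, chains):
        # Consecutive pairs in a chain are positives; all other ordered pairs
        # drawn from the same chain are negatives.
        for chain in chains:
            for i, a in enumerate(chain):
                for j, b in enumerate(chain):
                    if i == j:
                        continue
                    label = (j == i + 1)
                    self.class_counts[label] += 1
                    for k, value in enumerate(relation(a, b)):
                        self.feature_counts[(label, k, value)] += 1
        return self

    def log_odds(self, a, b):
        """log P(follows | features) - log P(not follows | features)."""
        total = sum(self.class_counts.values())
        score = 0.0
        for label, sign in ((True, 1), (False, -1)):
            n = self.class_counts[label]
            log_p = math.log((n + 1) / (total + 2))
            for k, value in enumerate(relation(a, b)):
                # Each feature takes 3 values, hence the "+ 3" in the smoothing.
                log_p += math.log((self.feature_counts[(label, k, value)] + 1) / (n + 3))
            score += sign * log_p
        return score

def reconstruct_chain(clf, boxes):
    """Greedy single-chain reconstruction from the topmost, leftmost box."""
    remaining = sorted(boxes, key=lambda b: (b[1], b[0]))
    chain = [remaining.pop(0)]
    while remaining:
        nxt = max(remaining, key=lambda b: clf.log_odds(chain[-1], b))
        remaining.remove(nxt)
        chain.append(nxt)
    return chain

# One two-column training page: left column top to bottom, then right column.
train = [[(0, 0, 10, 10), (0, 20, 10, 30), (20, 0, 30, 10), (20, 20, 30, 30)]]
clf = FollowsClassifier().fit(train)

# An unseen two-column page, given in arbitrary order.
page = [(60, 50, 110, 90), (0, 50, 50, 90), (60, 0, 110, 40), (0, 0, 50, 40)]
order = reconstruct_chain(clf, page)
```

Because only box coordinates are used, the sketch also applies to components with no associated text, which is the property the abstract highlights. Handling multiple chains per page would require an additional rule for ending one chain and starting another, which is not attempted here.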

11 citations

Book
01 Jan 1997

11 citations

Su Chen
03 Oct 1996
TL;DR: The aim of this study is to apply solid statistical methods to systematically model and extract various layout structures on document images, such as words, text lines and text blocks, through the computation theory of the recursive morphological transforms.
Abstract: The aim of this study is to apply solid statistical methods to systematically model and extract various layout structures on document images, such as words, text lines and text blocks. We first establish the computation theory of the recursive morphological transforms, namely the recursive erosion transform, the recursive dilation transform, the recursive opening transform, and the recursive closing transform. The transforms serve as a set of powerful tools for the document image shape analysis. Then we describe our efforts to construct a series of carefully ground-truthed document image databases, such as the UW English document image database (I). The database offers a platform based on which we can develop, train and evaluate our document layout analysis system. We present three sub-components of our document layout analysis system. They are the text skew estimation, the word segmentation, and the object spatial analysis. The text skew estimation finds the text skew angle of a document image. We develop an automatic text skew estimation algorithm using the recursive opening and closing transforms. It computes the estimated text skew angles which are within 0.5° of the true text skew angles with a probability of 0.95 on real images. The word segmentation detects all the words on a document image. We describe a word segmentation algorithm that utilizes the recursive closing transform. We derive the quantitative measures, such as the rates of miss, false, correct, splitting, merging and spurious detections, to evaluate its performance. The results show that the algorithm correctly detects the words on a document image at a rate of about 95%. The object spatial analysis treats the detected words as atomic and employs a probabilistic linear displacement model (PLDM) and an augmented PLDM model to model and extract the text lines and text blocks in a document image. By gathering statistics from a large population of document images, we are able to validate our models and determine the proper model parameters. The correct text line and text block detection rates are about 92% and 81% respectively.

11 citations

Proceedings Article
01 Dec 2016
TL;DR: The layout analysis method merges a classic top-down approach and a bottom-up classification process based on local geometrical features, while regions are classified by means of features extracted from a Convolutional Neural Network merged in a Random Forest classifier.
Abstract: Document layout segmentation and recognition is an important task in the creation of digitized document collections, especially when dealing with historical documents. This paper presents a hybrid approach to layout segmentation as well as a strategy to classify document regions, which is applied to the process of digitization of a historical encyclopedia. Our layout analysis method merges a classic top-down approach and a bottom-up classification process based on local geometrical features, while regions are classified by means of features extracted from a Convolutional Neural Network merged in a Random Forest classifier. Experiments are conducted on the first volume of the “Enciclopedia Treccani”, a large dataset containing 999 manually annotated pages from the historical Italian encyclopedia.

11 citations

Patent
26 Dec 1991
TL;DR: This patent detects column setting positions in an input document by extracting its components, defining the blank areas between components that adjoin each other along the character string direction, and analysing the continuity of those blank areas in the vertical direction.
Abstract: PURPOSE: To detect a column setting position using blank area of components in an input document. CONSTITUTION: A component extracting means 21 extracts components of an input document out of the image data inputted through an image inputting part 1 where an input document is inputted as an image data. A blank area detecting means 22, relating to the components extracted by the component extracting means 21, decides the components adjoining each other in the direction of character strings of the input document and also defines the blank area between components. A column setting position deciding means 23, relating to the blank area between components which has been defined, analyses continuity between the character string direction and the blank area between other components in vertical direction so as to decide the column setting position. COPYRIGHT: (C)1993,JPO&Japio
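The three means described in the abstract above (component extraction, blank area detection between adjoining components, and a continuity check in the vertical direction) can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented procedure: components are assumed to be already extracted as bounding boxes and grouped into text lines, "continuity" is interpreted as a blank interval that persists across every supplied line, and the function names and the `min_width` threshold are hypothetical.

```python
def line_gaps(line):
    """Blank x-intervals between horizontally adjacent boxes of one text line.
    Each box is (x0, y0, x1, y1)."""
    boxes = sorted(line)  # tuples sort by x0 first
    if not boxes:
        return []
    gaps = []
    right = boxes[0][2]
    for x0, _, x1, _ in boxes[1:]:
        if x0 > right:
            gaps.append((right, x0))
        right = max(right, x1)
    return gaps

def column_gaps(lines, min_width=10):
    """X-intervals that stay blank in every line (assumed continuity criterion)."""
    common = None
    for line in lines:
        gaps = line_gaps(line)
        if common is None:
            common = gaps
        else:
            # Keep only the parts of earlier blank intervals that are also
            # blank in the current line.
            common = [
                (max(a0, b0), min(a1, b1))
                for a0, a1 in common
                for b0, b1 in gaps
                if min(a1, b1) > max(a0, b0)
            ]
    return [(lo, hi) for lo, hi in (common or []) if hi - lo >= min_width]

# Three text lines of a two-column block; the inter-word gaps are narrow and
# shift from line to line, while the column gutter stays near x = 95..115.
lines = [
    [(0, 0, 40, 10), (45, 0, 90, 10), (120, 0, 160, 10), (165, 0, 200, 10)],
    [(0, 15, 30, 25), (35, 15, 95, 25), (115, 15, 200, 25)],
    [(0, 30, 92, 40), (118, 30, 150, 40), (156, 30, 200, 40)],
]
gutters = column_gaps(lines)
```

Ordinary word gaps rarely line up vertically across many lines, so intersecting the per-line blank intervals leaves only the gutter. A real document would also need tolerance for lines that span both columns, such as headings, which this sketch does not handle.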

11 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 82% related
Feature (computer vision): 128.2K papers, 1.7M citations, 82% related
Object detection: 46.1K papers, 1.3M citations, 81% related
Image segmentation: 79.6K papers, 1.8M citations, 80% related
Convolutional neural network: 74.7K papers, 2M citations, 79% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    5
2022    19
2021    34
2020    19
2019    14
2018    9