Topic

Document layout analysis

About: Document layout analysis is a research topic. Over its lifetime, 1,462 publications have been published within this topic, receiving 34,021 citations.


Papers
Proceedings ArticleDOI
06 Jun 2021
TL;DR: In this paper, a hierarchical recurrent neural network (RNN) architecture is proposed to address the hierarchical structure inherent in handwritten documents; a novel feature aggregation pooling technique for transferring data between hierarchical levels improves computational efficiency, making the approach suitable for on-device mobile computing.
Abstract: The paper presents an original solution for processing free-form online handwritten documents, aimed at separating multi-class handwritten content into text, tables, formulas, drawings, etc. Stroke classification is an important step in automatic document layout analysis (DLA) in handwritten document recognition systems. Major DLA challenges arise from the wide diversity of handwritten content, varied writing styles, a lack of contextual knowledge, and the complicated structure of free-form handwritten documents. In this paper, we propose a hierarchical recurrent neural network (RNN) architecture to address the hierarchical structure inherent in handwritten documents. A novel feature aggregation pooling technique for transferring data between hierarchical levels yields higher computational efficiency, making the suggested approach suitable for on-device mobile computing. The presented approach achieves new state-of-the-art results in multi-class classification, with an accuracy of 97.25% on the IAMonDo dataset. This result can serve as the basis for efficient mobile applications for free-form handwritten document recognition.
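The idea of passing features between hierarchical levels can be illustrated with a minimal sketch. It assumes that point-level feature vectors are pooled into one vector per stroke by simple averaging; the abstract does not specify the pooling operation, so mean pooling and the names `mean_pool` and `aggregate_levels` are illustrative assumptions, not the authors' method.

```python
from typing import Dict, List

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a non-empty list of equal-length feature vectors."""
    if not vectors:
        raise ValueError("cannot pool an empty group")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def aggregate_levels(point_features: List[Vector], stroke_ids: List[int]) -> List[Vector]:
    """Pool point-level features into one vector per stroke.

    Assumption: mean pooling stands in for the paper's aggregation step.
    Strokes are returned in order of first appearance.
    """
    if len(point_features) != len(stroke_ids):
        raise ValueError("each point needs exactly one stroke id")
    groups: Dict[int, List[Vector]] = {}
    for feat, sid in zip(point_features, stroke_ids):
        groups.setdefault(sid, []).append(feat)
    return [mean_pool(group) for group in groups.values()]


# Three points: the first two belong to stroke 0, the last to stroke 1.
stroke_level = aggregate_levels(
    [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0]],
    [0, 0, 1],
)
```

The upper level then processes one vector per stroke instead of one per point, which is where a shorter sequence could reduce computation.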

9 citations

Patent
28 Sep 2015
TL;DR: In this article, the layout intent associated with explicitly formatted document elements in a document is inferred, and an intent-based document is then created using the inferred layout intent for some or all of the explicitly formatted document elements in the document.
Abstract: Technologies are described herein for inferring the layout intent associated with explicitly formatted document elements in a document. The layout type of a document having explicitly formatted document elements is determined. Once the layout type for the document has been determined, the layout intent of explicitly formatted document elements in the document may be determined based, at least in part, on the determined layout type of the document. Heuristic algorithms and/or machine learning classifiers may determine the layout intent of the explicitly formatted document elements in the document. An intent-based document is then created using the inferred layout intent for some or all of the explicitly formatted document elements in the document. The intent-based document may then be provided to an intent-based rendering or authoring application for rendering based upon the inferred layout intent.

9 citations

Proceedings ArticleDOI
12 Dec 2008
TL;DR: This research focuses on the classification of non-text blocks in technical documents into table, graph, and figure, and shows that a support vector machine classifies better than a backpropagation neural network.
Abstract: Text and non-text segmentation and classification is a very important step in a document layout analysis system before the document is presented to an OCR system. Heuristic rules have been used in segmenting and classifying the text and non-text blocks. This research focuses on the classification of non-text blocks in technical documents into table, graph, and figure. A comparative study is conducted between a backpropagation neural network and a support vector machine, and the result shows that the support vector machine classifies better than the backpropagation neural network.

9 citations

Patent
27 Jul 2005
TL;DR: A method of generating a document template comprising: analysing a document (16) by: extracting layout information from the document (6), the layout information comprising one or more document zones and the position of the zones (112) on a page or pages of the document; determining properties of each of the one or more document zones; and generating a semantic label for each zone according to the properties determined for that zone.
Abstract: A method of generating a document template comprising: analysing a document (16) by: extracting layout information from the document (6), the layout information comprising one or more document zones and the position of the zones (112) on a page or pages of the document; determining properties of each of the one or more zones (112); and generating a semantic label (113) for each zone (112) according to the properties determined for that zone (112); and generating a template, the template comprising the layout information and the semantic labels (112).
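The pipeline in this abstract (extract zones and positions, determine zone properties, assign semantic labels, emit a template) can be sketched as follows. Everything concrete here is hypothetical: the `Zone` fields, the chosen properties, the thresholds, and the label names are illustrative assumptions and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Zone:
    """A rectangular document zone with its position on the page (hypothetical fields)."""
    x: float
    y: float
    width: float
    height: float
    text: str = ""


def zone_properties(zone: Zone, page_height: float) -> Dict[str, float]:
    """Determine simple properties of a zone (assumed: relative top position, word count)."""
    return {
        "relative_top": zone.y / page_height,
        "word_count": float(len(zone.text.split())),
    }


def semantic_label(props: Dict[str, float]) -> str:
    """Assign a label from the properties using illustrative heuristic thresholds."""
    if props["word_count"] == 0:
        return "figure"  # assumption: a zone without text is a figure
    if props["relative_top"] < 0.15 and props["word_count"] <= 12:
        return "title"   # assumption: short text near the top is a title
    return "body"


def generate_template(zones: List[Zone], page_height: float) -> List[Dict]:
    """Combine layout information and semantic labels into a template."""
    return [
        {
            "position": (z.x, z.y, z.width, z.height),
            "label": semantic_label(zone_properties(z, page_height)),
        }
        for z in zones
    ]


page = [
    Zone(50, 20, 500, 40, "Quarterly Report"),
    Zone(50, 100, 500, 300, "This section summarises results " * 5),
    Zone(50, 450, 500, 200, ""),
]
template = generate_template(page, page_height=800)
labels = [entry["label"] for entry in template]
```

The resulting template keeps each zone's position alongside its label, so new content could be placed into zones by label rather than by fixed coordinates.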

9 citations

Proceedings ArticleDOI
25 Aug 2013
TL;DR: A new method of ground-truth estimation using multispectral (MS) imaging representation space for the sake of document image binarization and based on the cooperation of multiple classifiers under some constraints is proposed.
Abstract: Human ground-truthing is the manual labelling of samples (pixels for example) to generate reference data without any automatic algorithm help. Although a manual ground-truth is more accurate than a machine ground-truth, it still suffers from mislabeling and/or judgement errors. In this paper we propose a new method of ground-truth estimation using multispectral (MS) imaging representation space for the sake of document image binarization. Starting from the initial manual ground-truth, the proposed classification method aims to select automatically some samples with correct labels (well-labeled pixels) from each class for the training phase, then reassign new labels to the document image pixels. The classification scheme is based on the cooperation of multiple classifiers under some constraints. A real data set of MS historical document images and their ground-truth is created to demonstrate the effectiveness of the proposed method of ground-truth estimation.

9 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 82% related
Feature (computer vision): 128.2K papers, 1.7M citations, 82% related
Object detection: 46.1K papers, 1.3M citations, 81% related
Image segmentation: 79.6K papers, 1.8M citations, 80% related
Convolutional neural network: 74.7K papers, 2M citations, 79% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    5
2022    19
2021    34
2020    19
2019    14
2018    9