Topic

Document layout analysis

About: Document layout analysis is a research topic. Over its lifetime, 1462 publications have been published within this topic, receiving 34021 citations.

Papers

Patent

Layout design supporting device

Takashi Kiguchi, Yasuhiro Kobayashi, Mitsuda Toru, Wada Yutaka

19 Dec 1984

TL;DR: During dialog-based layout planning, information concerning the estimation of the layout is computed and displayed on an output device as pairs of an object shift and the resulting change in cost.

Abstract: PURPOSE: To improve the efficiency of layout design by computing and displaying information concerning the estimation of a layout while the layout plan is being worked out through dialog processing. CONSTITUTION: First, position data 5, cost data 6 and object data 7 are loaded into a computer main body 2 as initial data, following the procedure of a program 8. Next, following the same procedure, one or two of the layout objects are selected while the others are held fixed; a central arithmetic unit 4 computes the cost, which serves as the layout estimation value, for shifts of the selected objects, and each shift is paired with the resulting change in cost. The local shifts of the layout objects and the corresponding changes in the layout estimation value are then output and displayed as pairs on an output device 3. The layout designer updates the position data as necessary. COPYRIGHT: (C)1986,JPO&Japio
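The pairing of an object shift with its change in cost, as described in the abstract above, can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patented method: the abstract does not say how the cost is computed, so the weighted-distance cost function, the eight-neighbour shift set, and the names `layout_cost` and `shift_cost_pairs` are all hypothetical.

```python
from itertools import product
from math import hypot


def layout_cost(positions, cost):
    """ASSUMED cost model: sum over object pairs of (pair weight x Euclidean distance)."""
    names = sorted(positions)
    total = 0.0
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            # Pair weights may be stored under either ordering of the pair.
            w = cost.get((a, b), cost.get((b, a), 0.0))
            (ax, ay), (bx, by) = positions[a], positions[b]
            total += w * hypot(ax - bx, ay - by)
    return total


def shift_cost_pairs(positions, cost, target, step=1.0):
    """Return (shift, cost_change) pairs for local shifts of one object, others fixed."""
    base = layout_cost(positions, cost)
    x, y = positions[target]
    pairs = []
    for dx, dy in product((-step, 0.0, step), repeat=2):
        if dx == 0.0 and dy == 0.0:
            continue  # skip the "no shift" case
        trial = dict(positions)  # all other objects keep their positions
        trial[target] = (x + dx, y + dy)
        pairs.append(((dx, dy), layout_cost(trial, cost) - base))
    # Most cost-reducing shift first, as a designer would likely want to see it.
    return sorted(pairs, key=lambda p: p[1])


# Hypothetical example data: three objects and their pairwise weights.
positions = {"A": (0.0, 0.0), "B": (4.0, 0.0), "C": (0.0, 3.0)}
cost = {("A", "B"): 2.0, ("A", "C"): 1.0, ("B", "C"): 0.5}
pairs = shift_cost_pairs(positions, cost, "A")
for shift, delta in pairs:
    print(shift, round(delta, 3))
```

In this sketch the designer would read the printed list, choose a shift, update `positions`, and repeat, which mirrors the interactive loop the abstract describes.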

5 citations

Proceedings Article

Restoration of arbitrarily warped document images based on text line and word detection

Basilis Gatos, Konstantinos Ntirogiannis¹

National and Kapodistrian University of Athens¹

31 Jan 2007

TL;DR: This paper presents a technique for efficient restoration of arbitrarily warped document images, mainly bound volumes captured by a digital camera that suffer from non-linear warp. The technique is based on adaptive document image binarization, text line and word detection, and a first-draft binary image dewarping based on word rotation and shifting.

Abstract: This paper presents a novel technique for efficient restoration of arbitrarily warped document images. Our aim is to recover document images that are mainly bound volumes captured by a digital camera and suffer from non-linear warp. The proposed technique is applied to grayscale document images and is based on several distinct steps: an adaptive document image binarization, a text line and word detection, a first-draft binary image dewarping based on word rotation and shifting and, finally, a complete restoration of the original grayscale warped image guided by the binary dewarping result. In this paper, we present a detailed description of the proposed technique as well as the implementation results for each step of our methodology. The experimental results on several arbitrarily warped documents indicate the effectiveness of the proposed technique.

5 citations

Proceedings Article

Vision-Based Layout Detection from Scientific Literature using Recurrent Convolutional Neural Networks

Huichen Yang¹, William H. Hsu¹

Kansas State University¹

10 Jan 2021

TL;DR: In this paper, the authors presented an approach for adapting convolutional neural networks for object recognition and classification to scientific literature layout detection, a shared subtask of several information extraction problems.

Abstract: We present an approach for adapting convolutional neural networks for object recognition and classification to scientific literature layout detection (SLLD), a shared subtask of several information extraction problems. Scientific publications contain multiple types of information sought by researchers in various disciplines, organized into an abstract, bibliography, and sections documenting related work, experimental methods, and results; however, there is no effective way to extract this information due to their diverse layout. In this paper, we present a novel approach to developing an end-to-end learning framework to segment and classify major regions of a scientific document. We consider scientific document layout analysis as an object detection task over digital images, without any additional text features that need to be added into the network during the training process. Our technical objective is to implement transfer learning via fine-tuning of pre-trained networks and thereby demonstrate that this deep learning architecture is suitable for tasks that lack very large document corpora for training ab initio. As part of the experimental test bed for empirical evaluation of this approach, we created a merged multi-corpus data set for scientific publication layout detection tasks. Our results show good improvement with fine-tuning of a pre-trained base network using this merged data set, compared to the baseline convolutional neural network architecture.

5 citations

Journal Article

Structure recognition of various kinds of table-form documents

Qin Luo¹, Toyohide Watanabe¹, Noboru Sugie¹

Nagoya University¹

01 Jan 1994 - Systems and Computers in Japan

TL;DR: This paper addresses table-form documents as the objects of processing, and reports on a method that can recognize the document structures of various kinds of table-form documents, using a classification tree that hierarchically manages the information for each case of table-form documents.

Abstract: Recognizing the structure of a document means discriminating its layout structure, i.e., its two-dimensional configuration and format, and identifying the individual item data. Most studies of this kind so far, however, are based on a paradigm in which the information concerning the document structure is defined beforehand for a particular type of document and used as a knowledge base. Such a paradigm succeeds in recognizing documents with the same structure or the same kind of structure, but is not applicable when various kinds of document structures are mixed. This paper addresses table-form documents as the objects of processing, and reports on a method that can recognize the document structures of various kinds of table-form documents. Table-form documents fall into various classes, with various configurations and contents, according to their use and the adjacency relationships between item fields. To recognize the document structure exactly for various kinds of table-form documents, it is essential to develop a processing method based on the information for each class of table-form documents. For this purpose, a classification tree is used, which hierarchically manages the information for each case of table-form documents. A structure recognition system for multiple kinds of table-form documents is realized within this framework, including recognition of the table-form document class, automatic acquisition of layout structure information, and recognition of the document structure.

5 citations

Journal Article

Text line Segmentation of Curved Document Images

Anusree, Dhanya M. Dhanalakshmy

01 Jan 2014 - International Journal of Engineering Research and Applications

TL;DR: A segmentation technique is described that detects curled text lines in camera-captured document images, addressing the problem of text line detection in printed documents.

Abstract: Document image analysis has been widely used in historical and heritage studies, education and digital libraries. Document image analysis techniques are mainly used to improve the human readability and the OCR quality of documents. During digitization, camera-captured images contain warped documents due to perspective and geometric distortions. The main difficulty is detecting text lines in such documents. Many algorithms have been proposed for text line detection in printed documents, but they fail to extract text lines in curved documents. This paper describes a segmentation technique that detects curled text lines in camera-captured document images. Keywords: Curved document images, Gradient vector flow, Gray scale image, Optical character recognition, Rectification.

5 citations

Metrics

Papers: 1,488
Citations: 35,779

No. of papers in the topic in previous years
Year	Papers
2023	5
2022	19
2021	34
2020	19
2019	14
2018	9
