Topic

Document layout analysis

About: Document layout analysis is a research topic. Over its lifetime, 1,462 publications have been published within this topic, receiving 34,021 citations.


Papers
01 Jan 2014
TL;DR: This chapter provides a comprehensive review of the state of the art in the field of automated document understanding, highlights key methods developed for different target applications, and provides practical recommendations for designing a document understanding system for the problem at hand.
Abstract: Automatic document understanding is one of the most important tasks when dealing with printed documents since all post-ordered systems require the captured but process-relevant data. Analysis of the logical layout of documents not only enables an automatic conversion into a semantically marked-up electronic representation but also reveals options for developing higher-level functionality like advanced search (e.g., limiting search to titles only), automatic routing of business letters, automatic processing of invoices, and developing link structures to facilitate navigation through books. Over the last three decades, a number of techniques have been proposed to address the challenges arising in logical layout analysis of documents originating from many different domains. This chapter provides a comprehensive review of the state of the art in the field of automated document understanding, highlights key methods developed for different target applications, and provides practical recommendations for designing a document understanding system for the problem at hand.

18 citations

Patent
Doron Kletter1
05 Feb 2010
TL;DR: A method and system for generating fine-grained fingerprints that identify content in a rendered document, by applying image-based techniques to identify patterns in a document rendered by an electronic document rendering system, irrespective of the file format in which the rendered document was electronically created.
Abstract: A method and system generates fine-grained fingerprints for identifying content in a rendered document. It includes applying image-based techniques to identify patterns in a document rendered by an electronic document rendering system, irrespective of a file format in which the rendered document was electronically created. The applying of the image-based technique includes identifying candidate keypoints at locations in a local image neighborhood of the document, and combining the locations of the candidate keypoints to form a fine-grained fingerprint identifying patterns representing content in the document.

18 citations

Proceedings ArticleDOI
26 Jul 2009
TL;DR: A general bottom-up strategy to tackle the layout analysis of (possibly) non-Manhattan documents, and two specializations of it to handle both bitmap and PS/PDF sources are proposed.
Abstract: Layout analysis is a fundamental step in automatic document processing. Many different techniques have been proposed to perform this task. Some follow a top-down approach: they start by identifying the high level components of the page structure and then recursively split them until basic blocks are found. On the other hand, bottom-up approaches start with the smallest elements (e.g., the pixels in case of digitized document) and then recursively merge them into higher level components. A first limitation of such methods is that most of them are designed to deal only with digitized documents and hence are not applicable to native digital documents which are nowadays pervasive. Furthermore, top-down and most of bottom-up methods are able to process Manhattan layout documents only. In this work, we propose a general bottom-up strategy to tackle the layout analysis of (possibly) non-Manhattan documents, and two specializations of it to handle both bitmap and PS/PDF sources. It was successfully embedded and tested in the DOMINUS document management system.
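
The bottom-up idea described in this abstract (start from the smallest elements and repeatedly merge them into higher-level components) can be illustrated with a minimal sketch. This is not the method of the paper or of DOMINUS: it assumes the basic elements are already available as axis-aligned bounding boxes, and it merges any two boxes whose gap is at most a hypothetical `max_gap` parameter, using a union-find structure. The names `Box`, `gap`, and `merge_bottom_up` are introduced here for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (x0, y0, x1, y1) with x0 <= x1 and y0 <= y1.
Box = Tuple[int, int, int, int]


def gap(a: Box, b: Box) -> int:
    """Largest axis-wise gap between two boxes (0 if they touch or overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dx, dy)


def merge_bottom_up(boxes: List[Box], max_gap: int) -> List[Box]:
    """Group boxes that are transitively within max_gap of each other
    and return one enclosing box per group, sorted for determinism."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of boxes that lie close enough together.
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if gap(boxes[i], boxes[j]) <= max_gap:
                parent[find(i)] = find(j)

    # Collect the members of each group and compute their enclosing box.
    groups: Dict[int, List[Box]] = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    return sorted(
        (min(b[0] for b in g), min(b[1] for b in g),
         max(b[2] for b in g), max(b[3] for b in g))
        for g in groups.values()
    )
```

Calling `merge_bottom_up` repeatedly with a growing `max_gap` would mimic the recursive merging from characters to words to lines to blocks. Enclosing rectangles are a Manhattan simplification; handling non-Manhattan layouts, as the paper aims to, would need a richer region representation than this sketch uses.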

18 citations

Posted ContentDOI
TL;DR: A novel nonparametric and unsupervised method to compensate for undesirable document image distortions aiming to optimally improve OCR accuracy and text detection rate is presented.
Abstract: Digital camera and mobile document image acquisition are new trends arising in the world of Optical Character Recognition and text detection. In some cases, such a process introduces many distortions and produces poorly scanned text or text-photo images and natural images, leading to unreliable OCR digitization. In this paper, we present a novel nonparametric and unsupervised method to compensate for undesirable document image distortions, aiming to optimally improve OCR accuracy. Our approach relies on a very efficient stack of document image enhancing techniques to recover deformation of the entire document image. First, we propose a local brightness and contrast adjustment method to effectively handle lighting variations and the irregular distribution of image illumination. Second, we use an optimized greyscale conversion algorithm to transform our document image to greyscale. Third, we sharpen the useful information in the resulting greyscale image using the unsharp masking method. Finally, an optimal global binarization approach is used to prepare the final document image for OCR. The proposed approach can significantly improve text detection rate and optical character recognition accuracy. To demonstrate the efficiency of our approach, an exhaustive experimentation on a standard dataset is presented.
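
The last three stages named in this abstract (greyscale conversion, unsharp masking, global binarization) can be sketched in plain Python. The abstract does not specify the algorithms, so everything below is an assumption: the greyscale step uses standard ITU-R BT.601 luma weights rather than the paper's optimized conversion, the blur inside the unsharp mask is a 3x3 mean filter, and Otsu's method stands in for the unnamed global binarization. The local brightness and contrast adjustment step is omitted.

```python
from typing import List, Tuple

Gray = List[List[int]]  # rows of intensities in 0..255


def to_gray(rgb: List[List[Tuple[int, int, int]]]) -> Gray:
    """Luma conversion with BT.601 weights (an assumed stand-in)."""
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
            for row in rgb]


def box_blur(img: Gray) -> List[List[float]]:
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


def unsharp_mask(img: Gray, amount: float = 1.0) -> Gray:
    """sharpened = original + amount * (original - blurred), clamped to 0..255."""
    blurred = box_blur(img)
    return [[min(255, max(0, int(round(p + amount * (p - b)))))
             for p, b in zip(row, brow)]
            for row, brow in zip(img, blurred)]


def otsu_threshold(img: Gray) -> int:
    """Threshold t that maximises between-class variance (classes: <= t and > t)."""
    hist = [0] * 256
    for row in img:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(img: Gray) -> Gray:
    """Map pixels at or below the Otsu threshold to 0 and the rest to 255."""
    t = otsu_threshold(img)
    return [[0 if p <= t else 255 for p in row] for row in img]
```

A call such as `binarize(unsharp_mask(to_gray(rgb_image)))` chains the three steps in the order the abstract lists them; `rgb_image` is a hypothetical nested list of `(r, g, b)` tuples.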

18 citations

Proceedings ArticleDOI
01 Dec 2008
TL;DR: An algorithm for ruling estimation of Glagolitic texts that is based on text line extraction and is suitable for degraded manuscripts, extrapolating the baselines with a priori knowledge of the ruling.
Abstract: In order to preserve our cultural heritage and to enable automated document processing, libraries and national archives have started digitizing historical documents. In the case of degraded manuscripts (e.g. by mold, humidity, bad storage conditions), the text or parts of it can disappear. The remaining parts of the text can be segmented and the ruling can be extrapolated with a priori knowledge. Since the ruling defines the position of the text within a page, it can be used for layout analysis and as a basis for enhancing readability. Furthermore, information about the scribe (hand) of the manuscript and its spatiotemporal origin can be gained by analyzing the ruling. This paper presents an algorithm for ruling estimation of Glagolitic texts that is based on text line extraction and is suitable for degraded manuscripts, extrapolating the baselines with a priori knowledge of the ruling. The algorithm was tested on 30 pages of the Missale Sinaiticum and the evaluation was based on visual criteria.

18 citations


Network Information
Related Topics (5)
Feature extraction
111.8K papers, 2.1M citations
82% related
Feature (computer vision)
128.2K papers, 1.7M citations
82% related
Object detection
46.1K papers, 1.3M citations
81% related
Image segmentation
79.6K papers, 1.8M citations
80% related
Convolutional neural network
74.7K papers, 2M citations
79% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    5
2022    19
2021    34
2020    19
2019    14
2018    9