Topic

Document layout analysis

About: Document layout analysis is a research topic. Over its lifetime, 1462 publications have been published within this topic, receiving 34021 citations.

Papers

Proceedings Article

Detecting Text Areas and Decorative Elements in Ancient Manuscripts

Angelika Garz¹, Markus Diem¹, Robert Sablatnig¹

Vienna University of Technology¹

16 Nov 2010

TL;DR: An approach for detecting decorative elements (such as initials and headlines) and text regions in ancient manuscripts is presented; the results show that the method is able to locate regular text in ancient manuscripts.

Abstract: An approach for the detection of decorative elements (such as initials and headlines) and text regions, focused on ancient manuscripts, is presented. Due to their age, ancient manuscripts suffer from degradation and staining, and their ink has faded over time. Identifying decorative elements and text regions allows a manuscript to be indexed and serves as input for Optical Character Recognition (OCR), as it localizes regions of interest within document pages. We propose a robust method inspired by state-of-the-art object recognition methodologies. Scale Invariant Feature Transform (SIFT) descriptors are chosen to detect the regions of interest, and the scale of the interest points is used for localization. The classification is based on the fact that local properties of the decorative elements differ from those of regular text. The results show that the method is able to locate regular text in ancient manuscripts. The detection rate for decorative elements is not as high as for regular text, but the results are already promising.

13 citations

Proceedings Article

Skew Estimation of Sparsely Inscribed Document Fragments

Markus Diem¹, Florian Kleber¹, Robert Sablatnig¹

Vienna University of Technology¹

27 Mar 2012

TL;DR: Results show that the proposed skew estimation is comparable with state-of-the-art methods and outperforms them on a real dataset consisting of 658 snippets.

Abstract: Document analysis is performed to analyze entire forms (e.g. intelligent form analysis, table detection) or to describe the layout/structure of a document for further processing. A pre-processing step of document analysis methods is skew estimation of scanned or photographed documents. Current skew estimation methods require the existence of large text areas, depend on the text type and can be limited to a specific angle range. The proposed method is gradient based, combined with a Focused Nearest Neighbor Clustering of interest points, and has no limitations regarding the detectable angle range. The upside-down decision is based on statistical analysis of ascenders and descenders. The method can be applied to entire documents as well as to document fragments containing only a few words. Results show that the proposed skew estimation is comparable with state-of-the-art methods and outperforms them on a real dataset consisting of 658 snippets.

13 citations
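
The abstract above clusters interest points by their nearest neighbors to find the text direction. As a rough illustration only, the sketch below uses the plain nearest-neighbor angle histogram rather than the Focused Nearest Neighbor Clustering of the paper, and it omits both the gradient step and the upside-down decision. The function name `nearest_neighbor_skew`, the `bin_width` parameter and the input format (a list of (x, y) interest points with y growing downwards) are assumptions made for this example.

```python
import math
from collections import Counter


def nearest_neighbor_skew(points, bin_width=1.0):
    """Return the dominant nearest-neighbor direction in degrees.

    Simplified stand-in, not the paper's method: each point votes for the
    direction to its nearest neighbor, folded into [-90, 90), and the
    fullest histogram bin wins. Bins at -90 and +90 are not merged.
    """
    pts = list(points)
    votes = Counter()
    for i, (x, y) in enumerate(pts):
        best_d2 = None
        best_vec = None
        for j, (u, v) in enumerate(pts):
            if i == j:
                continue
            d2 = (u - x) ** 2 + (v - y) ** 2
            if d2 == 0:
                continue  # ignore duplicate points
            if best_d2 is None or d2 < best_d2:
                best_d2, best_vec = d2, (u - x, v - y)
        if best_vec is None:
            continue
        angle = math.degrees(math.atan2(best_vec[1], best_vec[0]))
        # A direction and its opposite describe the same text line.
        angle = (angle + 90.0) % 180.0 - 90.0
        votes[round(angle / bin_width)] += 1
    if not votes:
        raise ValueError("need at least two distinct points")
    peak_bin, _ = votes.most_common(1)[0]
    return peak_bin * bin_width
```

Because only neighboring points vote, a sketch like this can work on a fragment with a few words, which is the setting the abstract targets; the quadratic neighbor search is acceptable only for small inputs.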

Proceedings Article

Document image skew detection and correction method based on extreme points

Marian Wagdy, Ibrahima Faye, Dayang Rohaya

03 Jun 2014

TL;DR: The method is based on the concept that any document image has objects with rectangular shape, such as paragraphs, text lines, tables and figures, that can be bounded by rectangles; the angle of the rectangle that fits the largest connected component represents the angle of document skew.

Abstract: In this paper we present a method for estimating the document image skew angle. The main idea of this method is based on the concept that any document image has objects with rectangular shape such as paragraphs, text lines, tables and figures. These objects can be bounded by rectangles. We use the properties of extreme points to obtain the corners of the rectangle that fits the largest connected component of the document image. The angle of this rectangle represents the angle of document skew. The experimental results show the high performance of the algorithm in detecting the angle of skew for a variety of documents with different levels of complexity.

13 citations
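
The extreme-point idea in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `largest_component` and `skew_from_extreme_points`, the list-of-rows binary image, the 8-connectivity, the tie-breaking rules and the sign convention (y grows downwards, so a positive angle means the long side descends to the right) are all choices made for this example.

```python
import math
from collections import deque


def largest_component(grid):
    """Return the (x, y) pixels of the largest 8-connected foreground blob.

    grid is a list of rows; truthy cells are foreground (ink).
    """
    height = len(grid)
    width = len(grid[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    best = []
    for y0 in range(height):
        for x0 in range(width):
            if not grid[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            blob = []
            while queue:
                x, y = queue.popleft()
                blob.append((x, y))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and grid[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            if len(blob) > len(best):
                best = blob
    return best


def skew_from_extreme_points(points):
    """Estimate the skew in degrees from the extreme points of one blob.

    For a roughly rectangular blob the leftmost, topmost and rightmost
    points are three consecutive corners; the longer of the two edges
    between them is taken as the long side of the rectangle.
    """
    pts = list(points)
    left = min(pts, key=lambda p: (p[0], p[1]))    # ties: smallest y
    top = min(pts, key=lambda p: (p[1], -p[0]))    # ties: largest x
    right = max(pts, key=lambda p: (p[0], p[1]))   # ties: largest y
    edge_a = (top[0] - left[0], top[1] - left[1])
    edge_b = (right[0] - top[0], right[1] - top[1])
    long_edge = edge_a if math.hypot(*edge_a) >= math.hypot(*edge_b) else edge_b
    return math.degrees(math.atan2(long_edge[1], long_edge[0]))
```

Calling `skew_from_extreme_points(largest_component(grid))` chains the two steps. The estimate is only meaningful when the largest component is close to a filled rectangle; a ragged blob would need a more careful fit.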

Patent

System for processing structured document

Kazuyoshi Tanaka, 一義田中

18 Nov 2003

TL;DR: A system for processing a structured document is proposed that can reduce programming labor in a software program for handling structured documents by eliminating data conversion processing for document contents and data extraction processing for the selection of necessary data.

Abstract: PROBLEM TO BE SOLVED: To provide a system for processing a structured document that can reduce labor for programming, in a software program for handling a structured document, by eliminating data conversion processing for processing of document contents and data extraction processing for selection of necessary data. SOLUTION: A structured document holds a document structure definition represented by declarations of document elements forming document contents and by a set of semantic relations defined between the document elements, and a set of instances of the respective document elements matching the document structure definition. The system for processing a structured document, which comprises data reading means for reading the instances of document elements from the structured document and editing them into data processible by a software program to provide them, comprises basic data structure selecting means for selecting and specifying a data structure of the data provided by the data reading means, from known basic data structures or object structures, such as array, set, list, tree, graph and table structures. COPYRIGHT: (C)2005,JPO&NCIPI

13 citations

Proceedings Article

A hybrid method for table detection from document image

Tran Tuan Anh¹, Na Inseop¹, Kim Soo-Hyung¹

Chonnam National University¹

01 Nov 2015

TL;DR: A hybrid method consisting of the alternative bottom-up and top-down approaches is implemented to find table region candidates, which are then examined by analyzing text lines and spare lines to detect tables in document images.

Abstract: In this paper, we present a hybrid method consisting of three main stages for detecting tables in document images. Based on table structure, our system separates tables into two main categories: ruling line tables and non-ruling line tables. In the first stage, the text and non-text elements in the document are classified by a heuristic filter. Then, white space analysis is used to group the text elements into text lines, while ruling line table candidates are identified from non-text elements. In the second stage, based on the text lines, text and non-text elements, a hybrid method which consists of the alternative bottom-up and top-down approaches is implemented to find the table region candidates. In the final stage, these candidates are examined to get the table regions by analyzing text lines and spare lines. Experimental results with the document database from the ICDAR2013 table competition show that the proposed method works better than the previous ones.

13 citations
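
The first stage in the abstract above groups text elements into text lines using white space. The sketch below shows one simple way such a grouping could look; it is a gap-threshold stand-in, not the paper's white space analysis. The function names, the `(x0, y0, x1, y1)` box format, the `max_gap` value and the half-height overlap rule are all assumptions made for this example.

```python
def vertical_overlap(a, b):
    """Height of the vertical overlap of two (x0, y0, x1, y1) boxes.

    The value is negative when the boxes do not overlap vertically.
    """
    return min(a[3], b[3]) - max(a[1], b[1])


def group_into_text_lines(boxes, max_gap=15):
    """Greedily group text boxes into left-to-right line segments.

    A box joins an existing segment when it overlaps the segment's last
    box vertically by more than half of the smaller height and the
    horizontal white space between them is at most max_gap. Wider gaps
    start a new segment, even on the same row.
    """
    lines = []
    for box in sorted(boxes, key=lambda b: b[0]):
        for line in lines:
            last = line[-1]
            min_height = min(last[3] - last[1], box[3] - box[1])
            same_row = vertical_overlap(last, box) > 0.5 * min_height
            gap = box[0] - last[2]
            if same_row and gap <= max_gap:
                line.append(box)
                break
        else:
            # No segment accepted the box, so it starts a new one.
            lines.append([box])
    # Order segments from top to bottom by their first box.
    return sorted(lines, key=lambda line: line[0][1])
```

With this rule, a row whose cells are separated by wide white space is split into several segments, which is one plausible cue for a table without ruling lines; how the paper actually turns text lines into table candidates is not specified in the abstract.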

Performance Metrics

Papers: 1,488
Citations: 35,779

No. of papers in the topic in previous years
Year	Papers
2023	5
2022	19
2021	34
2020	19
2019	14
2018	9
