Document layout analysis

About: Document layout analysis is a research topic. Over its lifetime, 1,462 publications have been published within this topic, receiving 34,021 citations.

Journal Article

From paper to office document standard representation

Andreas Dengel¹, Rainer Bleisinger¹, Rainer Hoch¹, Frank Fein¹, Frank Hönes¹

German Research Centre for Artificial Intelligence¹

01 Jul 1992-IEEE Computer

TL;DR: The principles of the model-based document analysis system called Pi ODA (paper interface to office document architecture), which was developed as a prototype for the analysis of single-sided business letters in German, are presented.

Abstract: The principles of the model-based document analysis system called Pi ODA (paper interface to office document architecture), which was developed as a prototype for the analysis of single-sided business letters in German, are presented. Initially, Pi ODA extracts a part-of hierarchy of nested layout objects such as text-blocks, lines, and words based on their presentation on the page. Subsequently, in a step called logical labeling, the layout objects and their compositions are geometrically analyzed to identify corresponding logical objects that can be related to a human-perceptible meaning, such as sender, recipient, and date in a letter. A context-sensitive text recognition for logical objects is then applied using logical vocabularies and syntactic knowledge. As a result, Pi ODA produces a document representation that conforms to the ODA international standard.
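
The two stages described in this abstract (a part-of hierarchy of layout objects, then geometric logical labeling) can be illustrated with a minimal sketch. It is not the Pi ODA implementation: the `LayoutObject` class, the `label_blocks` function, and every position threshold are hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class LayoutObject:
    """A node in a part-of hierarchy (page > block > line > word)."""
    kind: str                                   # "page", "block", "line", or "word"
    bbox: Tuple[float, float, float, float]     # (x0, y0, x1, y1), origin at top-left
    children: List["LayoutObject"] = field(default_factory=list)
    label: Optional[str] = None                 # logical label, filled in later


def label_blocks(page: LayoutObject) -> None:
    """Assign a logical label to each block from its position on the page.

    The thresholds are invented for illustration; a real system would
    derive such rules from a document model.
    """
    _, _, page_w, page_h = page.bbox
    for block in page.children:
        x0, y0, x1, y1 = block.bbox
        cx = (x0 + x1) / 2 / page_w   # normalized horizontal centre
        cy = (y0 + y1) / 2 / page_h   # normalized vertical centre
        if cy < 0.15:
            block.label = "sender"
        elif cy < 0.35 and cx < 0.5:
            block.label = "recipient"
        elif cy < 0.35:
            block.label = "date"
        else:
            block.label = "body"


# A toy A4 letter (units: millimetres) with four text blocks.
page = LayoutObject("page", (0, 0, 210, 297), children=[
    LayoutObject("block", (20, 10, 120, 30)),
    LayoutObject("block", (20, 55, 100, 90)),
    LayoutObject("block", (140, 80, 190, 88)),
    LayoutObject("block", (20, 110, 190, 260)),
])
label_blocks(page)
labels = [block.label for block in page.children]
```

Keeping the label on the layout node itself means later steps, such as the context-sensitive text recognition the abstract mentions, could pick a vocabulary per logical object.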

168 citations

Proceedings Article

PubLayNet: Largest Dataset Ever for Document Layout Analysis

Xu Zhong¹, Jianbin Tang¹, Antonio Jimeno Yepes¹

IBM¹

16 Aug 2019

TL;DR: The PubLayNet dataset for document layout analysis is developed by automatically matching the XML representations and the content of over 1 million PDF articles that are publicly available on PubMed Central; experiments demonstrate that deep neural networks trained on PubLayNet accurately recognize the layout of scientific articles.

Abstract: Recognizing the layout of unstructured digital documents is an important step when parsing the documents into structured machine-readable format for downstream applications. Deep neural networks that are developed for computer vision have been proven to be an effective method to analyze the layout of document images. However, document layout datasets that are currently publicly available are several orders of magnitude smaller than established computer vision datasets. Models have to be trained by transfer learning from a base model that is pre-trained on a traditional computer vision dataset. In this paper, we develop the PubLayNet dataset for document layout analysis by automatically matching the XML representations and the content of over 1 million PDF articles that are publicly available on PubMed Central. The size of the dataset is comparable to established computer vision datasets, containing over 360 thousand document images, where typical document layout elements are annotated. The experiments demonstrate that deep neural networks trained on PubLayNet accurately recognize the layout of scientific articles. The pre-trained models are also a more effective base model for transfer learning on a different document domain. We release the dataset (https://github.com/ibm-aur-nlp/PubLayNet) to support development and evaluation of more advanced models for document layout analysis.
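
To make the idea of automatic matching concrete, here is a rough sketch of how text boxes extracted from a PDF could be labeled using the text of XML nodes. It is an assumption-laden illustration, not the PubLayNet pipeline: the function `match_boxes_to_xml`, the input shapes, and the 0.8 similarity threshold are all hypothetical, and fuzzy matching with the standard-library `difflib` is a stand-in for whatever matching the authors used.

```python
from difflib import SequenceMatcher


def match_boxes_to_xml(pdf_boxes, xml_nodes, threshold=0.8):
    """Label each PDF text box with the category of its best-matching XML node.

    pdf_boxes: list of (bbox, text) pairs taken from the PDF layout
    xml_nodes: list of (category, text) pairs taken from the XML
    Boxes with no sufficiently similar XML node are left unannotated.
    """
    annotations = []
    for bbox, text in pdf_boxes:
        best_category, best_score = None, 0.0
        for category, xml_text in xml_nodes:
            score = SequenceMatcher(None, text.lower(), xml_text.lower()).ratio()
            if score > best_score:
                best_category, best_score = category, score
        if best_score >= threshold:
            annotations.append({"bbox": bbox, "category": best_category})
    return annotations


# Toy inputs: the second box carries a line-break hyphen, the third is a
# page footer that has no counterpart in the XML.
pdf_boxes = [
    ((50, 40, 500, 70), "Deep Learning for Layout Analysis"),
    ((50, 100, 500, 300), "We study the layout of scien- tific articles."),
    ((50, 700, 500, 720), "Page 3 of 12"),
]
xml_nodes = [
    ("title", "Deep learning for layout analysis"),
    ("text", "We study the layout of scientific articles."),
]
annotations = match_boxes_to_xml(pdf_boxes, xml_nodes)
```

Fuzzy rather than exact comparison tolerates small extraction differences such as hyphenation, while the threshold keeps unmatched content (headers, footers) out of the annotations.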

160 citations

Patent

Document storage and retrieval system

Hiromichi Fujisawa¹, Atsushi Hatakeyama¹, Yasuaki Nakano¹, Junichi Higashino¹, Toshihiro Hananoi¹

Hitachi¹

30 Jul 1990

TL;DR: A document storage and retrieval system stores a document body in the form of an image and stores text information in the form of a character code string for retrieval. It executes a retrieval with reference to the text information and then displays the related document image on a retrieval terminal according to the retrieval result.

Abstract: A document storage and retrieval system stores a document body in the form of an image and stores text information in the form of a character code string for retrieval. It executes a retrieval with reference to the text information and then displays the related document image on a retrieval terminal according to the retrieval result. Such a system can be used to retrieve the full contents of a document and also to display the document body, printed in an easy-to-read format, directly in the form of an image.
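
The arrangement described here (image kept for display, character codes kept for search) can be sketched in a few lines. This is only an illustration under simple assumptions, not the patented design: the `DocumentStore` class, its method names, and the whitespace-based word index are hypothetical.

```python
from collections import defaultdict


class DocumentStore:
    """Keeps the page image for display and the text for retrieval."""

    def __init__(self):
        self._images = {}                  # doc_id -> image bytes
        self._index = defaultdict(set)     # word -> ids of documents containing it

    def add(self, doc_id, image, text):
        """Store the image and index the recognized text word by word."""
        self._images[doc_id] = image
        for word in text.lower().split():
            self._index[word].add(doc_id)

    def search(self, query):
        """Return (doc_id, image) for documents containing every query word."""
        words = query.lower().split()
        if not words:
            return []
        hits = set.intersection(*(self._index.get(w, set()) for w in words))
        return [(doc_id, self._images[doc_id]) for doc_id in sorted(hits)]


store = DocumentStore()
store.add("d1", b"img1", "Patent on document retrieval")
store.add("d2", b"img2", "Document image storage")
both = store.search("document")
only_d2 = store.search("document storage")
nothing = store.search("missing")
```

The search never touches the images; they are looked up only once the text index has produced a result, which mirrors the separation the abstract describes.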

160 citations

Proceedings Article

Table Detection Using Deep Learning

Azka Gilani, Shah Rukh Qasim¹, Imran Malik¹, Faisal Shafait¹

University of the Sciences¹

01 Nov 2017

TL;DR: The proposed method works with high precision on document images with varying layouts that include documents, research papers, and magazines, and beats Tesseract's state-of-the-art table detection system by a significant margin.

Abstract: Table detection is a crucial step in many document analysis applications as tables are used for presenting essential information to the reader in a structured manner. It is a hard problem due to varying layouts and encodings of the tables. Researchers have proposed numerous techniques for table detection based on layout analysis of documents. Most of these techniques fail to generalize because they rely on hand-engineered features which are not robust to layout variations. In this paper, we have presented a deep learning-based method for table detection. In the proposed method, document images are first pre-processed. These images are then fed to a Region Proposal Network followed by a fully connected neural network for table detection. The proposed method works with high precision on document images with varying layouts that include documents, research papers, and magazines. We have done our evaluations on the publicly available UNLV dataset, where it beats Tesseract's state-of-the-art table detection system by a significant margin.

159 citations

Journal Article

A fast algorithm for bottom-up document layout analysis

A. Simon¹, J.-C. Pret¹, A.P. Johnson¹

University of Leeds¹

01 Mar 1997-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A new bottom-up method for document layout analysis is presented; it is based on Kruskal's algorithm and uses a special distance metric between the components to construct the physical page structure.

Abstract: This paper describes a new bottom-up method for document layout analysis. The algorithm was implemented in the CLIDE (Chemical Literature Data Extraction) system, but the method described here is suitable for a broader range of documents. It is based on Kruskal's algorithm and uses a special distance metric between the components to construct the physical page structure. The method has all the major advantages of bottom-up systems: independence from different text spacing and independence from different block alignments. The algorithm's computational complexity is reduced to linear by using heuristics and path compression.
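
A minimal sketch of Kruskal-style grouping with a union-find structure and path compression is given below. It is not the CLIDE algorithm: the plain gap distance in `box_distance` stands in for the paper's special distance metric, `max_gap` is a hypothetical stopping threshold, and the all-pairs edge list makes this version quadratic rather than linear, since the paper's heuristics are not reproduced.

```python
from itertools import combinations


def find(parent, i):
    """Return the representative of i's set, compressing the path on the way."""
    root = i
    while parent[root] != root:
        root = parent[root]
    while parent[i] != root:
        parent[i], i = root, parent[i]
    return root


def box_distance(a, b):
    """Gap between two boxes (x0, y0, x1, y1); 0 if they touch or overlap."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return (dx * dx + dy * dy) ** 0.5


def group_components(boxes, max_gap):
    """Merge components along the shortest edges first, stopping at max_gap."""
    parent = list(range(len(boxes)))
    edges = sorted(
        (box_distance(boxes[i], boxes[j]), i, j)
        for i, j in combinations(range(len(boxes)), 2)
    )
    for dist, i, j in edges:
        if dist > max_gap:
            break                      # remaining edges are longer still
        root_i, root_j = find(parent, i), find(parent, j)
        if root_i != root_j:           # skip edges that would form a cycle
            parent[root_j] = root_i
    groups = {}
    for i in range(len(boxes)):
        groups.setdefault(find(parent, i), []).append(i)
    return sorted(groups.values())


# Three words on one line and two words on a line 30 units lower.
boxes = [
    (0, 0, 10, 10), (12, 0, 22, 10), (24, 0, 34, 10),
    (0, 40, 10, 50), (12, 40, 22, 50),
]
lines = group_components(boxes, max_gap=5)
```

Raising `max_gap` merges the resulting groups into larger units, which is one way a bottom-up method can build words into lines and lines into blocks.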

158 citations

Metrics

Papers: 1,488
Citations: 35,779

No. of papers in the topic in previous years
Year	Papers
2023	5
2022	19
2021	34
2020	19
2019	14
2018	9
