Topic
Document layout analysis
About: Document layout analysis is a research topic. Over its lifetime, 1462 publications have appeared within this topic, receiving 34021 citations.
•
01 Jan 1990
TL;DR: This article gives an overview of techniques for document image analysis, with an emphasis on graphics recognition and interpretation; the techniques are derived from the fields of image processing, pattern recognition, and machine vision.
Abstract: An overview is presented of algorithms and techniques for document image analysis, with an emphasis on those for graphics recognition and interpretation. The techniques are derived from the fields of image processing, pattern recognition, and machine vision. The objective in document image analysis is to recognize page contents, including layout, text, and figures. Although optical character recognition (OCR) falls within the context of document image analysis, we do not cover this area, since OCR techniques have been covered extensively in the literature. We also limit the focus to images containing binary information. Topics covered are segmentation of document images into text and graphics regions, vectorization to obtain lines, identification of graphical primitives, and generation of succinct image interpretations.
29 citations
•
07 Mar 2005
TL;DR: An apparatus and method are disclosed for easily generating document data (tag file) in a form that makes it possible to perform various processes upon the document data, including easy retrieval.
Abstract: An apparatus and method are disclosed for easily generating document data (tag file) in a form that makes it possible to perform various processes upon the document data. An original document (plain text) is divided into morphological elements, and morphological information is added thereto. Information representing the hierarchical document structures is also added. Furthermore, information indicating referential relations between portions in the original document is also added.
29 citations
•
07 Apr 2014
TL;DR: A Document Image Analysis system able to extract homogeneous typed and handwritten text regions from complex-layout documents of various types, based on two connected-component classification stages that successively discriminate text/non-text and typed/handwritten shapes.
Abstract: This paper presents a Document Image Analysis (DIA) system able to extract homogeneous typed and handwritten text regions from complex-layout documents of various types. The method is based on two connected-component classification stages that successively discriminate text/non-text and typed/handwritten shapes, followed by an original block segmentation method based on white-rectangle detection. We present the results obtained by the system during the first competition round of the MAURDOR campaign.
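As a rough illustration of the first stage described in the abstract above, the following Python sketch extracts connected components from a small binary image and applies a toy text/non-text rule. The function names, the choice of 4-connectivity, and the height threshold are assumptions made for illustration only; the paper's actual classifiers and its white-rectangle block segmentation are not reproduced here.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of the
    4-connected foreground regions in a binary image (1 = ink)."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def classify_component(box, max_text_height=3):
    """Hypothetical stand-in for a text/non-text classifier:
    short components are called text, tall ones non-text."""
    top, _left, bottom, _right = box
    height = bottom - top + 1
    return "text" if height <= max_text_height else "non-text"


# A tiny made-up page: a small blob (glyph-like) and a tall block (figure-like).
page = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

boxes = connected_components(page)
labels = [classify_component(box) for box in boxes]
```

A real system would replace the height rule with a trained classifier over richer component features, and a second stage would separate typed from handwritten shapes.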
29 citations
•
05 Sep 2021
TL;DR: The authors propose VSR, a unified framework for document layout analysis that combines vision, semantics, and relations. A two-stream network extracts modality-specific visual and semantic features, which are adaptively fused to make full use of complementary information.
Abstract: Document layout analysis is crucial for understanding document structures. On this task, vision and semantics of documents, and relations between layout components contribute to the understanding process. Though many works have been proposed to exploit the above information, they show unsatisfactory results. NLP-based methods model layout analysis as a sequence labeling task and show insufficient capabilities in layout modeling. CV-based methods model layout analysis as a detection or segmentation task, but bear limitations of inefficient modality fusion and lack of relation modeling between layout components. To address the above limitations, we propose a unified framework VSR for document layout analysis, combining vision, semantics and relations. VSR supports both NLP-based and CV-based methods. Specifically, we first introduce vision through document image and semantics through text embedding maps. Then, modality-specific visual and semantic features are extracted using a two-stream network, which are adaptively fused to make full use of complementary information. Finally, given component candidates, a relation module based on graph neural network is incorporated to model relations between components and output final results. On three popular benchmarks, VSR outperforms previous models by large margins. Code will be released soon.
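The abstract above does not spell out how the two streams are "adaptively fused". One common pattern for such fusion is a learned per-element gate that blends the two feature vectors; the sketch below shows that generic idea in plain Python. The function `gated_fusion`, its weight parameters, and the example values are all hypothetical and should not be read as the VSR implementation.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(visual, semantic, w_visual, w_semantic, bias):
    """Blend two equal-length feature vectors element by element.

    For each position a gate g in (0, 1) is computed from both inputs,
    and the output is g * visual + (1 - g) * semantic. The weights and
    bias stand in for parameters that would normally be learned.
    """
    if not (len(visual) == len(semantic) == len(w_visual) == len(w_semantic)):
        raise ValueError("all inputs must have the same length")
    fused = []
    for v, s, wv, ws in zip(visual, semantic, w_visual, w_semantic):
        gate = sigmoid(wv * v + ws * s + bias)
        fused.append(gate * v + (1.0 - gate) * s)
    return fused


# Made-up features: with zero weights and zero bias the gate is 0.5,
# so the fused vector is the plain average of the two inputs.
visual_features = [1.0, 0.0]
semantic_features = [0.0, 1.0]
averaged = gated_fusion(visual_features, semantic_features,
                        [0.0, 0.0], [0.0, 0.0], bias=0.0)
```

Because the gate depends on the inputs, the blend can lean towards the visual stream in some positions and the semantic stream in others, which is the sense in which such fusion is adaptive.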
29 citations
•
14 Aug 1995
TL;DR: This paper presents a new approach to document analysis based on modified fractal signature that can divide a document into blocks in only one step and be used to process documents with high geometrical complexity.
Abstract: This paper presents a new approach to document analysis. The proposed approach is based on a modified fractal signature. Unlike the time-consuming traditional approaches (top-down and bottom-up), where iterative operations are necessary to break a document into blocks to extract its geometric (layout) structure, this new approach can divide a document into blocks in only one step. This approach can be used to process documents with high geometrical complexity. Experiments have been conducted to validate the proposed approach for document processing.
28 citations