Topic

Document layout analysis

About: Document layout analysis is a research topic. Over its lifetime, 1,462 publications have appeared within this topic, receiving 34,021 citations.


Papers
Proceedings Article
23 Sep 2007
TL;DR: This paper presents a new framework for in-depth analysis of the performance of layout analysis methods that provides detailed information at various levels that can be used by method developers to identify specific problems and improve their work.
Abstract: This paper presents a new framework for in-depth analysis of the performance of layout analysis methods. Contrary to existing approaches aimed at evaluation or benchmarking, the proposed framework provides detailed information at various levels that can be used by method developers to identify specific problems and improve their work. Complex layouts are supported as well as the flexible configuration of goal-oriented performance analysis scenarios. The comparison of segmentation results against the ground truth is performed in a very efficient way based on a decomposition of any region shape into an interval-based description. The framework has been validated using the dataset and method results of the ICDAR2005 Page Segmentation Competition.

26 citations
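
The interval-based comparison mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not specify the decomposition, so this example assumes a region is stored as horizontal pixel runs per image row; the names `Region`, `rect_to_region`, `row_overlap`, and `overlap_area` are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]          # half-open run [start, end) along x
Region = Dict[int, List[Interval]]  # row y -> sorted, disjoint runs (assumed layout)


def rect_to_region(x0: int, y0: int, x1: int, y1: int) -> Region:
    """Decompose the rectangle [x0, x1) x [y0, y1) into one run per row."""
    return {y: [(x0, x1)] for y in range(y0, y1)}


def area(region: Region) -> int:
    """Total number of pixels covered by the region."""
    return sum(end - start for runs in region.values() for start, end in runs)


def row_overlap(a: List[Interval], b: List[Interval]) -> int:
    """Overlap length of two sorted, disjoint run lists (two-pointer sweep)."""
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if hi > lo:
            total += hi - lo
        # Advance whichever run finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def overlap_area(a: Region, b: Region) -> int:
    """Pixels shared by two regions, computed row by row."""
    return sum(row_overlap(runs, b[y]) for y, runs in a.items() if y in b)


ground_truth = rect_to_region(0, 0, 10, 10)
detected = rect_to_region(5, 5, 15, 15)
shared = overlap_area(ground_truth, detected)
print(shared, area(ground_truth))
```

Because each row is handled with a linear sweep over sorted runs, the cost depends on the number of runs rather than the number of pixels, which is one plausible reading of why an interval description makes the comparison efficient.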

Patent
25 Aug 1982
TL;DR: This patent describes an improved method, in an interactive text processing system, for creating or revising documents by adding a multi-page insert of text data at a specified location in the document.
Abstract: An improved method in an interactive text processing system for creating/revising documents by adding a multi-page insert of text data at a specified location in the document, comprising displaying the document, signalling to the system the location within the document into which the text data is to be copied, specifying to the system the identification and location of the insert text data, scanning the insert text data for INCLUDE instructions, resolving the INCLUDE instructions prior to adding the text data into the document, and copying the specified text data into the document at the signalled location. In a specific embodiment, text data up to ten pages can be copied and up to five levels of nested INCLUDE instructions can be resolved.

26 citations
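
The scan-resolve-copy sequence in the patent abstract above can be sketched as follows. This is only an illustration under stated assumptions: the patent abstract gives no instruction syntax, so the `INCLUDE <name>` line format, the in-memory `store` of named text blocks, and the function names are all hypothetical. The depth limit of five comes from the specific embodiment mentioned in the abstract.

```python
from typing import Dict, List

MAX_DEPTH = 5                # "up to five levels of nested INCLUDE instructions"
INCLUDE_PREFIX = "INCLUDE "  # hypothetical instruction syntax, not from the patent


def resolve_includes(lines: List[str], store: Dict[str, List[str]],
                     depth: int = 0) -> List[str]:
    """Scan lines for INCLUDE instructions and expand them recursively."""
    resolved: List[str] = []
    for line in lines:
        if line.startswith(INCLUDE_PREFIX):
            if depth >= MAX_DEPTH:
                raise RecursionError("INCLUDE nesting exceeds MAX_DEPTH")
            name = line[len(INCLUDE_PREFIX):].strip()
            resolved.extend(resolve_includes(store[name], store, depth + 1))
        else:
            resolved.append(line)
    return resolved


def insert_text(document: List[str], position: int,
                insert_lines: List[str], store: Dict[str, List[str]]) -> List[str]:
    """Resolve INCLUDEs first, then copy the text in at the signalled position."""
    expanded = resolve_includes(insert_lines, store)
    return document[:position] + expanded + document[position:]


store = {"a": ["A1", "INCLUDE b"], "b": ["B1"]}
result = insert_text(["x", "y"], 1, ["INCLUDE a", "tail"], store)
print(result)
```

Resolving before copying means the target document never contains an unresolved instruction, which matches the order of steps listed in the abstract.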

Proceedings Article
23 Oct 2006
TL;DR: An automatic orientation detection and categorization technique that is capable of detecting the orientation of multilingual documents with arbitrary skew and categorizing document images according to the underlying languages is presented.
Abstract: This paper presents an automatic orientation detection and categorization technique that is capable of detecting the orientation of multilingual documents with arbitrary skew and categorizing document images according to the underlying languages. We carry out orientation detection and categorization through document vectorization, which encodes document orientation and language information and converts each document image into an electronic document vector through the exploitation of the density and distribution of vertical component runs. For each language of interest, a pair of vector templates is first constructed through a training process. Orientation and category of the query image are then determined based on distances between the query document vector and the constructed vector templates. Experiments over 492 testing document images show that the average orientation detection and categorization rates reach up to 97.56% and 99.59%, respectively.

26 citations
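
The template-matching step described in the abstract above can be sketched in a few lines. This is a simplified stand-in, not the paper's feature: it assumes a binary image given as nested lists (1 = ink), summarises where vertical black runs fall across horizontal bands, and assumes the "pair of vector templates" per language means one template per orientation. The function names and the `"demo"` language label are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Image = List[List[int]]  # binary image, 1 = black pixel (assumed input format)


def run_position_vector(image: Image, bands: int = 4) -> List[float]:
    """Normalised count of vertical black runs per horizontal band.

    Each run is assigned to the band containing its midpoint, so rotating
    the page by 180 degrees mirrors the band each run falls in.
    """
    height, width = len(image), len(image[0])
    counts = [0] * bands
    for x in range(width):
        start = None
        for y in range(height + 1):
            black = y < height and image[y][x] == 1
            if black and start is None:
                start = y
            elif not black and start is not None:
                mid = (start + y - 1) / 2
                counts[min(int(mid * bands / height), bands - 1)] += 1
                start = None
    total = sum(counts) or 1
    return [c / total for c in counts]


def classify(vector: List[float],
             templates: Dict[Tuple[str, str], List[float]]) -> Tuple[str, str]:
    """Return the (language, orientation) key of the nearest template."""
    return min(templates, key=lambda key: math.dist(vector, templates[key]))


upright = [[1, 1], [1, 0], [0, 0], [0, 1]]
rotated = [row[::-1] for row in upright[::-1]]  # 180-degree rotation
templates = {
    ("demo", "upright"): run_position_vector(upright),
    ("demo", "rotated"): run_position_vector(rotated),
}
print(classify(run_position_vector(rotated), templates))
```

In the paper the templates are built by a training process over many documents; here they are taken from single toy images purely to show the nearest-template decision.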

Patent
28 May 2010
TL;DR: This patent describes a method for generating a flat layout design view by importing a hierarchical block of digital instances as a schematic symbol, instantiating and binding that symbol as a hierarchical layout instance in the flat layout, and embedding digital layout block instances within the design layout.
Abstract: The present invention provides a method for generating a flat layout design view that comprises importing port definitions of a first hierarchical block of digital instances from a source as a schematic symbol, importing port definitions of digital instances within the first hierarchical block from the source, instantiating the schematic symbol as a hierarchical layout instance in the flat layout, binding the hierarchical layout instance to the schematic symbol, and embedding digital layout block instances within the design layout by replacing the digital instances of a digital layout block with digital layout instances of a top layout module of the design layout.

26 citations

Proceedings Article
T. Kochi, T. Saitoh
20 Sep 1999
TL;DR: An automatic document entry system is described that identifies the type of document and extracts textual information, such as titles or authors, from semi-formatted document images.
Abstract: An automatic document entry system is described that identifies the type of document and extracts textual information, such as titles or authors, from semi-formatted document images. The system registers documents, offers easy retrieval of documents used in a daily workflow, analyzes the layout structure of documents by using document-specific models, and assumes that each type of document is known in advance. In this paper, we focus on a method for identifying the type of document.

26 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 82% related
Feature (computer vision): 128.2K papers, 1.7M citations, 82% related
Object detection: 46.1K papers, 1.3M citations, 81% related
Image segmentation: 79.6K papers, 1.8M citations, 80% related
Convolutional neural network: 74.7K papers, 2M citations, 79% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    5
2022    19
2021    34
2020    19
2019    14
2018    9