Topic

Document layout analysis

About: Document layout analysis is a research topic. Over its lifetime, 1,462 publications have appeared within this topic, receiving 34,021 citations.


Papers
Patent
03 Oct 1983
TL;DR: A flexible, expandable document structure is presented that incorporates information item blocks and indexing blocks related through pointers, together with means for applying visual and informational attributes to document text.
Abstract: A document processing system including a control structure having separated supervisory and document functions. The document functions, including a document buffer and document access control means, are the sole means for accessing documents, and the document function routines are selected from a predetermined library of such routines. The system includes a flexible, expandable document structure incorporating information item blocks and indexing blocks related through pointers and means for applying visual and informational attributes to document text.

46 citations

Journal ArticleDOI
15 Oct 2007
TL;DR: This paper presents a hybrid approach to segmenting and classifying the contents of document images into three types of regions: Graphics, Text and Space.
Abstract: In this paper we present a hybrid approach to segment and classify the contents of document images. A document image is segmented into three types of regions: Graphics, Text and Space. The image of a document is subdivided into blocks, and for each block five GLCM (Grey Level Co-occurrence Matrix) features are extracted. Based on these features, blocks are then clustered into three groups using the K-Means algorithm; connected blocks that belong to the same group are merged. The classification of groups is done using pre-learned heuristic rules. Experiments were conducted on scanned newspapers and images from the MediaTeam Document Database.

46 citations
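
The block-level pipeline in the abstract above (GLCM features per block, then K-Means clustering) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation: the abstract does not name the five GLCM features, so the five statistics below (contrast, energy, homogeneity, entropy, row mean) are an assumed choice, and the function names `glcm`, `glcm_features` and `kmeans` are hypothetical. Block merging and the heuristic classification rules are not shown.

```python
import math
import random

def glcm(block, levels, dx=1, dy=0):
    """Normalised grey-level co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    total = 0
    h, w = len(block), len(block[0])
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                counts[block[y][x]][block[ny][nx]] += 1
                total += 1
    if total == 0:
        return counts
    return [[c / total for c in row] for row in counts]

def glcm_features(p):
    """Five common GLCM statistics (an assumed set, not taken from the paper)."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    energy = sum(v * v for _, _, v in cells)
    homogeneity = sum(v / (1 + abs(i - j)) for i, j, v in cells)
    entropy = -sum(v * math.log(v) for _, _, v in cells if v > 0)
    row_mean = sum(i * v for i, _, v in cells)
    return [contrast, energy, homogeneity, entropy, row_mean]

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-Means on lists of floats; returns one label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centre by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

In use, each image block (a 2-D list of quantised grey levels) would be passed through `glcm` and `glcm_features`, and the resulting feature vectors clustered with `kmeans(features, 3)` to obtain the three groups the abstract mentions.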

Patent
01 Dec 1992
TL;DR: A document reading apparatus that can determine an effective image pickup area containing no foreign object, such as the operator's hands or fingers pressing a document, and rectify image data prior to the imaging operation, making use of the object's difference from the document in chromaticity, luminous density, and the like.
Abstract: A document reading apparatus which can determine an effective image pickup area containing no object such as operator's hands or fingers pressing a document and rectify image data prior to imaging operation, making use of a difference of the object from the document in chromaticity, luminous density, and the like.

45 citations

Patent
11 Jan 1996
TL;DR: A method is presented for bottom-up recognition of tables within a document based on the paradigm of graph rewriting, where the document image is transformed into a layout graph whose nodes and edges represent document entities and their interrelations, respectively.
Abstract: The present invention is a method for bottom-up recognition of tables within a document. This method is based on the paradigm of graph rewriting. First, the document image is transformed into a layout graph whose nodes and edges represent document entities and their interrelations, respectively. This graph is subsequently rewritten using a set of rules designed based on a priori document knowledge and general formatting conventions. The resulting graph provides a logical view of the document content. It can be parsed to provide general format analysis information.

45 citations
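
To make the graph-rewriting idea in the abstract above more concrete, here is a toy sketch of a layout graph and a bottom-up rewriting loop. Everything in it is an assumption for illustration: the `LayoutGraph` class, the node labels (`cell`, `row`, `table`), the relation names (`right`, `below`) and the four rules are hypothetical and are not taken from the patent, which does not list its rules in the abstract.

```python
from dataclasses import dataclass, field

@dataclass
class LayoutGraph:
    nodes: dict = field(default_factory=dict)  # node id -> label
    edges: set = field(default_factory=set)    # (src id, relation, dst id)

def apply_rule(g, rule):
    """Apply one rewrite rule once; return True if the graph changed.

    A rule (src_label, relation, dst_label, new_label) merges two nodes
    joined by `relation` into a single node labelled `new_label`.
    """
    src_label, relation, dst_label, new_label = rule
    for a, rel, b in sorted(g.edges):
        if rel == relation and g.nodes[a] == src_label and g.nodes[b] == dst_label:
            merged = f"{a}+{b}"
            del g.nodes[a], g.nodes[b]
            g.nodes[merged] = new_label
            # Redirect edges of the merged nodes and drop self-loops.
            new_edges = set()
            for s, r, d in g.edges:
                s2 = merged if s in (a, b) else s
                d2 = merged if d in (a, b) else d
                if s2 != d2:
                    new_edges.add((s2, r, d2))
            g.edges = new_edges
            return True
    return False

def rewrite(g, rules):
    """Apply rules repeatedly until none of them matches."""
    changed = True
    while changed:
        changed = any(apply_rule(g, rule) for rule in rules)
    return g

# Hypothetical rules: cells merge into rows, rows merge into a table.
RULES = [
    ("cell", "right", "cell", "row"),
    ("row", "right", "cell", "row"),
    ("row", "below", "row", "table"),
    ("table", "below", "row", "table"),
]

# A 2x2 grid of cells; (a, "right", b) means b lies to the right of a.
grid = LayoutGraph(
    nodes={"c00": "cell", "c01": "cell", "c10": "cell", "c11": "cell"},
    edges={("c00", "right", "c01"), ("c10", "right", "c11"),
           ("c00", "below", "c10"), ("c01", "below", "c11")},
)
rewrite(grid, RULES)
```

After rewriting, the four cell nodes of the example grid collapse into one node labelled `table`, which is the kind of logical view the abstract describes. A realistic rule set would also need geometric conditions (alignment, spacing) that this sketch leaves out.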

Journal Article
TL;DR: A novel skew detection method is presented for binary document images; it considers selected characters of the text, which may be subjected to thinning and the Hough transform to estimate the skew angle accurately.
Abstract: Document image processing has become an increasingly important technology in the automation of office documentation tasks. During document scanning, skew is inevitably introduced into the incoming document image. Since the algorithms for layout analysis and character recognition are generally very sensitive to page skew, skew detection and correction in document images are critical steps before layout analysis. In this paper, a novel skew detection method is presented for binary document images. The method considers selected characters of the text, which may be subjected to thinning and the Hough transform to estimate the skew angle accurately. Several experiments have been conducted on various types of documents such as documents containing English Documents, Journals, Text-Book, Different Languages and Document with different fonts, Documents

45 citations
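
The Hough-transform step mentioned in the abstract above can be illustrated with a small voting sketch. This is a generic Hough-style skew estimator in plain Python, not the paper's method: character selection and thinning are assumed to have already produced a list of foreground points, and the function name `estimate_skew` and its parameters (`max_angle`, `step`, `rho_res`) are hypothetical.

```python
import math
from collections import Counter

def estimate_skew(points, max_angle=15.0, step=0.5, rho_res=1.0):
    """Return the candidate angle (degrees) whose Hough votes are most peaked.

    points: iterable of (x, y) foreground coordinates.
    For a text line y = x*tan(a) + c, the value y*cos(a) - x*sin(a) is
    constant, so at the true skew angle the votes pile into few rho bins.
    The sign of the result depends on the image coordinate convention.
    """
    points = list(points)
    best_angle, best_score = 0.0, -1
    n_steps = int(round(2 * max_angle / step))
    for k in range(n_steps + 1):
        angle = -max_angle + k * step
        t = math.radians(angle)
        c, s = math.cos(t), math.sin(t)
        # Vote: one quantised rho value per point for this angle.
        votes = Counter(round((y * c - x * s) / rho_res) for x, y in points)
        # Sum of squared bin counts rewards concentrated votes.
        score = sum(v * v for v in votes.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle

# Synthetic example: three parallel "text lines" skewed by 5 degrees.
slope = math.tan(math.radians(5.0))
sample = [(x, x * slope + c) for c in (20, 60, 100) for x in range(0, 200, 2)]
angle = estimate_skew(sample)
```

Scoring each angle by the sum of squared bin counts is one common design choice; taking the single largest bin would also work but is more sensitive to one dominant line. Once the angle is known, the page would be rotated by its negative before layout analysis.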


Network Information
Related Topics (5)
Feature extraction
111.8K papers, 2.1M citations
82% related
Feature (computer vision)
128.2K papers, 1.7M citations
82% related
Object detection
46.1K papers, 1.3M citations
81% related
Image segmentation
79.6K papers, 1.8M citations
80% related
Convolutional neural network
74.7K papers, 2M citations
79% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    5
2022    19
2021    34
2020    19
2019    14
2018    9