Topic

Document layout analysis

About: Document layout analysis is a research topic. Over its lifetime, 1,462 publications have been published within this topic, receiving 34,021 citations.


Papers
Patent
24 Feb 2004
TL;DR: This patent proposes a document information retrieving device intended to return more accurate results for an accepted input text than conventional approaches. It calculates fixed and variable relation strengths, ranks document information by similarity to the input text, and computes a relation strength total value between each link destination document and the documents in the similarity order list that designate it as a link destination.
Abstract: PROBLEM TO BE SOLVED: To provide a document information retrieving device for retrieving document information with higher accuracy of retrieval result than the conventional manner, on the basis of an accepted input text. SOLUTION: Fixed relation strength is calculated, and variable relation strength is calculated. The similarity of input text information with document information is calculated at the time of retrieval, and a similarity order document information list in which the document information is stored in the order of higher similarity, and a relation strength total value showing the strength of a relation between the link destination document information of link destinations described in the document information of the similarity order document information list and all the pieces of document information in the similarity order document information list in which the link destination document information is designated as link destinations is calculated. Also, the link destination document information is inserted between the respective pieces of document information of the similarity order document information list so as not to be overlapped on the basis of the similarity, and the sequences of the document information and the link destination document information are changed, and a retrieval result list including the document information and the link destination document information is generated. COPYRIGHT: (C)2005, JPO&NCIPI

4 citations

Proceedings ArticleDOI
14 Aug 1995
TL;DR: An automatic determination of the positions for input of new text in partially filled text columns is described to bridge the gap between the non-coded archived documents and the coded information which is used to update the documents later.
Abstract: This paper describes how document analysis techniques like OCR, layout analysis, model based recognition and interpretation can be fruitfully applied in the field of high-volume, high-accuracy document capturing with very hard time constraints. We describe the way we set up a workflow that enables reliable capturing of real-estate registration documents. Techniques from document analysis are used to speed up the archiving process and to raise its quality. In particular an automatic determination of the positions for input of new text in partially filled text columns is described. This makes it possible to bridge the gap between the non-coded archived documents and the coded information which is used to update the documents later.

4 citations

Proceedings ArticleDOI
07 Dec 1999
TL;DR: This work proposes a new layout method using the magnetic spring model for object diagrams used in the Object Modeling Technique (OMT) and implements it into an object diagram-drawing CASE tool.
Abstract: Proposes a new layout method using the magnetic spring model for object diagrams used in the Object Modeling Technique (OMT). Our new method has two main characteristics. Firstly, it considers the interactive layout. The user can use the layout function many times during the drawing of the diagram. The result of the layout is suitable for continuing to draw the diagram smoothly. Secondly, our method can avoid the overlapping of nodes with a new definition of the distance between nodes. We have implemented the new layout method into an object diagram-drawing CASE tool. We compare the results of the new layout method with results of the old layout method. We have made sure that our new layout method is more suitable for the interactive layout than the old layout method was, and that it avoids node overlaps.

4 citations

Proceedings ArticleDOI
02 Feb 2018
TL;DR: A hybrid method consisting of three fundamental steps to detect table zones is presented: classification of the regions, detection of the tables that consist of intersecting horizontal and vertical lines, and identification of the table zones made up of only parallel lines.
Abstract: Table detection is a crucial step in many document analysis applications as tables are used for presenting essential information to readers in a structured manner. It is still a challenging problem due to the variety of table structures and the complexity of document layout. This paper presents a hybrid method consisting of three fundamental steps to detect table zones: classification of the regions, detection of the tables that consist of intersecting horizontal and vertical lines, and identification of the tables made up of only parallel lines. Experiments on the UW-III dataset show that the obtained results are very promising.

4 citations
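
As a rough illustration of the line-based steps named in the table-detection entry above, the sketch below labels a candidate zone from line segments that are assumed to have been extracted already. It is not the authors' method: the segment representation, the function names `crosses` and `classify_zone`, and the thresholds `tol`, `min_crossings` and `min_parallel` are all hypothetical choices made for this example, and the region-classification step is not covered.

```python
from typing import List, Tuple

# Assumed representation: a segment is (x1, y1, x2, y2) in pixels.
# Horizontal segments have y1 == y2; vertical segments have x1 == x2.
Segment = Tuple[int, int, int, int]


def crosses(h: Segment, v: Segment, tol: int = 2) -> bool:
    """Return True if horizontal segment h meets vertical segment v,
    allowing a small pixel tolerance for lines that nearly touch."""
    hx1, hy, hx2, _ = h
    vx, vy1, _, vy2 = v
    within_x = min(hx1, hx2) - tol <= vx <= max(hx1, hx2) + tol
    within_y = min(vy1, vy2) - tol <= hy <= max(vy1, vy2) + tol
    return within_x and within_y


def classify_zone(
    horizontal: List[Segment],
    vertical: List[Segment],
    min_crossings: int = 4,   # hypothetical threshold
    min_parallel: int = 3,    # hypothetical threshold
) -> str:
    """Label a zone by the kind of ruling lines it contains."""
    crossings = sum(crosses(h, v) for h in horizontal for v in vertical)
    if crossings >= min_crossings:
        # Enough horizontal/vertical intersections to suggest a ruled grid.
        return "ruled_table"
    if crossings == 0 and len(horizontal) >= min_parallel:
        # Only parallel horizontal rules, no intersecting verticals.
        return "parallel_line_table"
    return "not_table"


# A 3x3 grid of rules, and the same horizontal rules on their own.
grid_h = [(0, y, 100, y) for y in (0, 10, 20)]
grid_v = [(x, 0, x, 20) for x in (0, 50, 100)]

print(classify_zone(grid_h, grid_v))
print(classify_zone(grid_h, []))
```

In practice the thresholds would need tuning on real pages, and the segments would come from a separate line-detection stage.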

Patent
27 Aug 2015
TL;DR: This patent describes a system, including a document processing module, a feature processing module, and a feature generation module, for identifying augmented features based on a Bayesian analysis of a text document.
Abstract: Identification of augmented features based on a Bayesian analysis of a text document is disclosed. One example is a system including a document processing module, a feature processing module, and a feature generation module. The document processing module receives a text document via a processor. The feature processing module automatically identifies, based on a Bayesian analysis of the text document, a plurality of augmented features in the text document, the plurality of augmented features including at least one of local, sectional, and document-level features of the text document, and extracts, via the processor, the identified plurality of augmented features from the text document. The feature generation module generates, via the processor, a feature representation of the text document based on the extracted plurality of augmented features.

4 citations


Network Information
Related Topics (5)
Feature extraction
111.8K papers, 2.1M citations
82% related
Feature (computer vision)
128.2K papers, 1.7M citations
82% related
Object detection
46.1K papers, 1.3M citations
81% related
Image segmentation
79.6K papers, 1.8M citations
80% related
Convolutional neural network
74.7K papers, 2M citations
79% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    5
2022    19
2021    34
2020    19
2019    14
2018    9