Topic
Document layout analysis
About: Document layout analysis is a research topic. Over its lifetime, 1462 publications have appeared within this topic, receiving 34021 citations.
Papers
31 Aug 2005
TL;DR: A filter-based method organizes the input features into clusters so that a good subset of features can be selected during each cycle, which reduces the computation.
Abstract: The purpose of this work is to develop a pattern recognition system that simulates human vision. A transparent neural network with context returns is used. The context returns consist of using global vision to correct local vision (i.e. input data are corrected according to the neural network outputs). To avoid computing all the input features during these context returns, a filter-based method was designed to organize the features into clusters. This allows a good subset of input features to be found during each cycle, which reduces the computation. The value of the method is shown in the case of logical document structure retrieval.
5 citations
TL;DR: The presented Mask R-CNN-based method is shown to segment text lines successfully even in a challenging scenario, and a new challenging dataset of Arabic historical manuscripts, VML-AHTE, where numerous diacritics are present, is introduced.
Abstract: Text line extraction is an essential preprocessing step in many handwritten document image analysis tasks. It includes detecting text lines in a document image and segmenting the regions of each detected line. Deep learning-based methods are frequently used for text line detection. However, only a limited number of methods tackle the problems of detection and segmentation together. This paper proposes a holistic method that applies Mask R-CNN for text line extraction. A Mask R-CNN model is trained to extract text line fragments from document patches, which are then merged to form the text lines of an entire page. The presented method was evaluated on two well-known datasets of historical documents, DIVA-HisDB and ICDAR 2015-HTR, and achieved state-of-the-art results. In addition, we introduce a new challenging dataset of Arabic historical manuscripts, VML-AHTE, where numerous diacritics are present. We show that the presented Mask R-CNN-based method can successfully segment text lines, even in such a challenging scenario.
5 citations
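The abstract above says that text line fragments predicted on patches are merged into page-level lines, but it does not describe the merging rule. The sketch below is only one plausible illustration, not the paper's method: it assumes each fragment has already been mapped to page coordinates as a bounding box `(x0, y0, x1, y1)`, and it joins fragments that overlap vertically and are horizontally close. The function names and the thresholds `min_overlap` and `max_gap` are hypothetical.

```python
def vertical_overlap(a, b):
    """Vertical intersection of two boxes, relative to the shorter box."""
    inter = min(a[3], b[3]) - max(a[1], b[1])
    shorter = min(a[3] - a[1], b[3] - b[1])
    return max(inter, 0) / shorter if shorter > 0 else 0.0


def horizontal_gap(a, b):
    """Horizontal distance between two boxes (negative if they overlap)."""
    return max(a[0], b[0]) - min(a[2], b[2])


def merge_fragments(boxes, min_overlap=0.5, max_gap=10):
    """Group fragment boxes into lines with union-find (assumed rule)."""
    parent = list(range(len(boxes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            same_row = vertical_overlap(boxes[i], boxes[j]) >= min_overlap
            close = horizontal_gap(boxes[i], boxes[j]) <= max_gap
            if same_row and close:
                parent[find(i)] = find(j)

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)

    # One enclosing box per group, sorted top to bottom.
    lines = [
        (min(b[0] for b in g), min(b[1] for b in g),
         max(b[2] for b in g), max(b[3] for b in g))
        for g in groups.values()
    ]
    return sorted(lines, key=lambda b: (b[1], b[0]))


# Four made-up fragments from neighbouring patches, forming two lines.
fragments = [(0, 10, 100, 30), (95, 12, 200, 32),
             (0, 50, 100, 70), (105, 51, 180, 69)]
merged_lines = merge_fragments(fragments)
print(merged_lines)
```

A real pipeline would more likely merge the predicted pixel masks rather than boxes; boxes are used here only to keep the example short.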
01 Nov 2017
TL;DR: A novel technique for layout analysis of documents with complex Manhattan layouts that requires only one parameter, the number of Gaussians used to fit the height histogram data, and is therefore easy to automate and adapt to many documents.
Abstract: This paper proposes a novel technique for layout analysis of documents with complex Manhattan layouts. The technique is designed for Indic script newspapers but also works on many other types of documents with Manhattan layouts, not necessarily in Indic scripts. The main idea behind the algorithm is to categorise the physical elements of a document into noise, text, titles and graphics based on their heights. A histogram of heights is computed from the bounding boxes of connected components, and a multi-Gaussian fit is used to discover optimal split points between the categories. The Gaussian with the highest peak is assumed to correspond to running text. Running text regions are grouped into blocks using nearest neighbour analysis. These initial regions are further refined using a second-level classification of the other elements into graphics, light-coloured text on a dark background, and graphical separators. The resulting layouts show accuracies comparable to some of the best and most popular algorithms such as MHS (winner of the ICDAR-RDCL2015 competition) and PRImA's Aletheia (a tool developed by PRImA Research Lab). Results of testing on many Indic script newspapers and other documents, and a comparison with Aletheia and MHS on the ICDAR dataset, demonstrate its performance. Our initial results on an Indic document dataset show high performance in identifying running text (> 98%), with an accuracy of 82% in identifying the other elements. Ground truth data for the Indic script newspaper documents is being generated for more extensive quantitative testing. The strength of our algorithm is that it requires only one parameter, the number of Gaussians used to fit the height histogram data, and is therefore easy to automate and adapt to many documents.
5 citations
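The height-based categorisation described in the abstract above can be illustrated with a small sketch. It assumes connected-component heights are already available as a list of numbers, fits a one-dimensional Gaussian mixture with plain expectation-maximisation (the abstract does not say which fitting procedure the authors use), and takes the component with the highest peak as running text. All function names, the label strings, the `min_sigma` floor and the sample heights are hypothetical.

```python
import math


def gauss(x, mean, sigma):
    """Density of a normal distribution at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def fit_height_mixture(heights, k, iters=200, min_sigma=0.5):
    """Fit k Gaussians to the heights with EM; returns sorted
    (mean, sigma, weight) tuples. k is the single user parameter."""
    lo, hi = min(heights), max(heights)
    # Assumed initialisation: means spread evenly over the height range.
    means = [lo + (i + 0.5) * (hi - lo) / k for i in range(k)]
    sigmas = [max((hi - lo) / (2 * k), min_sigma)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each height.
        resp = []
        for h in heights:
            p = [w * gauss(h, m, s)
                 for w, m, s in zip(weights, means, sigmas)]
            total = sum(p) or 1e-300
            resp.append([x / total for x in p])
        # M-step: re-estimate each component from its responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue
            means[j] = sum(r[j] * h for r, h in zip(resp, heights)) / nj
            var = sum(r[j] * (h - means[j]) ** 2
                      for r, h in zip(resp, heights)) / nj
            sigmas[j] = max(math.sqrt(var), min_sigma)
            weights[j] = nj / len(heights)
    return sorted(zip(means, sigmas, weights))


def running_text_index(components):
    """Index of the Gaussian with the highest peak (weight / sigma)."""
    return max(range(len(components)),
               key=lambda j: components[j][2] / components[j][1])


def label_height(h, components, text_idx):
    """Assign a height to its most likely component and name it."""
    j = max(range(len(components)),
            key=lambda i: components[i][2] * gauss(h, *components[i][:2]))
    if j == text_idx:
        return "text"
    return "noise" if j < text_idx else "title_or_graphic"


# Made-up heights in pixels: a few specks, many text glyphs, a few titles.
heights = [2, 3, 2, 3] + [19, 20, 21, 20, 20, 19, 21, 20] * 5 + [58, 60, 62]
components = fit_height_mixture(heights, k=3)
text_idx = running_text_index(components)
print([round(m, 1) for m, _, _ in components], text_idx)
```

Labelling each height by its most likely component is equivalent to placing split points where adjacent weighted Gaussians cross, which is one way to read the "optimal split points" in the abstract.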
29 Mar 2005
TL;DR: The copying machine 1 specifies a character-carrying portion in the image data output by the scanner, performs character recognition processing to generate 1st character information on the basis of the recognized character data, and compares the 1st character information with 2nd character information, i.e. information on characters included in a document to be controlled that is referred to from document management index data 18b, thereby deciding whether the document and the document to be controlled have similar contents.
Abstract: PROBLEM TO BE SOLVED: To provide an information processor that can suitably restrain copying processing and FAX processing of various kinds of documents, including general documents generated and used in an office, and an information processing method and a program therefor. SOLUTION: A scanner 16 reads the document and outputs image data. A copying machine 1 specifies a character-carrying portion in the image data output by the scanner, performs character recognition processing to generate 1st character information on the basis of the recognized character data, and compares the 1st character information with 2nd character information, i.e. information on characters included in a document to be controlled that is referred to from document management index data 18b, thereby deciding whether the document and the document to be controlled have similar contents. The copying machine 1 then specifies, in accordance with that decision, a document to be controlled whose contents are similar to those of the document, and refers to the control information of the specified document in the document management index data 18b to decide whether the document processing requested for the document is performed. COPYRIGHT: (C)2007,JPO&INPIT
5 citations
23 Sep 2007
TL;DR: This paper presents work on automatically locating charts in document pages, an important stage in a chart image recognition and understanding system currently being developed, and proposes a set of simple statistical features for building the classifier.
Abstract: This paper presents our work on automatically locating charts in document pages, which is an important stage in our chart image recognition and understanding system currently being developed. To achieve this, there are two sub-goals to be reached: locating figure blocks in a given document image, and building a classifier to differentiate charts from non-chart figures. For the first sub-goal, besides traditional logical block labelling, relevant text blocks such as text descriptions and labels in a figure must be included in the located figure blocks to facilitate the interpretation processes in the following stages. For the second sub-goal, we propose a set of simple statistical features for building the classifier. We tested our system with the entire collection of scanned journal pages in the University of Washington database I. The experimental results are discussed in this paper.
5 citations