Topic
Document layout analysis
About: Document layout analysis is a(n) research topic. Over the lifetime, 1462 publication(s) have been published within this topic receiving 34021 citation(s).
Papers published on a yearly basis
Papers
More filters
Book•
01 Jan 1995
TL;DR: The document spectrum (or docstrum), which is a method for structural page layout analysis based on bottom-up, nearest-neighbor clustering of page components, yields an accurate measure of skew, within-line, and between-line spacings and locates text lines and text blocks.
Abstract: Page layout analysis is a document processing technique used to determine the format of a page. This paper describes the document spectrum (or docstrum), which is a method for structural page layout analysis based on bottom-up, nearest-neighbor clustering of page components. The method yields an accurate measure of skew, within-line, and between-line spacings and locates text lines and text blocks. It is advantageous over many other methods in three main ways: independence from skew angle, independence from different text spacings, and the ability to process local regions of different text orientations within the same image. Results of the method shown for several different page formats and for randomly oriented subpages on the same image illustrate the versatility of the method. We also discuss the differences, advantages, and disadvantages of the docstrum with respect to other lay-out methods. >
628 citations
TL;DR: The document spectrum (or docstrum) as discussed by the authors is a method for structural page layout analysis based on bottom-up, nearest-neighbor clustering of page components, which yields an accurate measure of skew, within-line, and between-line spacings and locates text lines and text blocks.
Abstract: Page layout analysis is a document processing technique used to determine the format of a page. This paper describes the document spectrum (or docstrum), which is a method for structural page layout analysis based on bottom-up, nearest-neighbor clustering of page components. The method yields an accurate measure of skew, within-line, and between-line spacings and locates text lines and text blocks. It is advantageous over many other methods in three main ways: independence from skew angle, independence from different text spacings, and the ability to process local regions of different text orientations within the same image. Results of the method shown for several different page formats and for randomly oriented subpages on the same image illustrate the versatility of the method. We also discuss the differences, advantages, and disadvantages of the docstrum with respect to other lay-out methods. >
624 citations
TL;DR: This paper discusses some layout adjustment methods and the preservation of the 'mental map' of the diagram, and two kinds of layout adjustments are described, an algorithm for rearranging a diagram to avoid overlapping nodes and a method aimed at changing the focus of interest of the user without destroying the mental map.
Abstract: Many models in software and information engineering use graph representations; examples are data flow diagrams, state transition diagrams, flow charts, PERT charts, organization charts, Petri nets and entity-relationship diagrams. The usefulness of these graph representations depends on the quality of the layout of the graphs. Automatic graph layout, which can release humans from graph drawing, is now available in several visualization systems. Most automatic layout facilities take a purely combinatorial description of a graph and produce a layout of the graph; these methods are called 'layout creation' methods. For interactive systems, another kind of layout is needed: a facility which can adjust a layout after a change is made by the user or by the application. Although layout adjustment is essential in interactive systems, most existing layout algorithms are designed for layout creation. The use of a layout creation method for layout adjustment may totally rearrange the layout and thus destroy the user's 'mental map' of the diagram; thus a set of layout adjustment methods, separate from layout creation methods, is needed. This paper discusses some layout adjustment methods and the preservation of the 'mental map' of the diagram. First, several models are proposed to make the concept of 'mental map' more precise. Then two kinds of layout adjustments are described. One is an algorithm for rearranging a diagram to avoid overlapping nodes, and the other is a method aimed at changing the focus of interest of the user without destroying the mental map. Next, some experience with visualization systems in which the techniques have been employed is also described.
591 citations
TL;DR: The document image acquisition process and the knowledge base that must be entered into the system to process a family of page images are described, and the process by which the X-Y tree data structure converts a 2-D page-segmentation problem into a series of 1-D string-parsing problems that can be tackled using conventional compiler tools.
Abstract: Gobbledoc, a system providing remote access to stored documents, which is based on syntactic document analysis and optical character recognition (OCR), is discussed. In Gobbledoc, image processing, document analysis, and OCR operations take place in batch mode when the documents are acquired. The document image acquisition process and the knowledge base that must be entered into the system to process a family of page images are described. The process by which the X-Y tree data structure converts a 2-D page-segmentation problem into a series of 1-D string-parsing problems that can be tackled using conventional compiler tools is also described. Syntactic analysis is used in Gobbledoc to divide each page into labeled rectangular blocks. Blocks labeled text are converted by OCR to obtain a secondary (ASCII) document representation. Since such symbolic files are better suited for computerized search than for human access to the document content and because too many visual layout clues are lost in the OCR process (including some special characters), Gobbledoc preserves the original block images for human browsing. Storage, networking, and display issues specific to document images are also discussed. >
456 citations