Topic

Document layout analysis

About: Document layout analysis is a research topic. Over its lifetime, 1462 publications have been published within this topic, receiving 34021 citations.


Papers
Patent
10 Aug 2011
TL;DR: Data defining a document is received from an online document processing service and a plurality of elements within the document is identified; for each element, an object is invoked to generate layout data, and the element is rendered based on that layout data.
Abstract: Data defining a document is received from an online document processing service, and a plurality of elements within the document is identified. The plurality of elements may comprise paragraphs, lines of text, images, tables, headers, footers, footnotes, footnote reference information, etc. For each of the plurality of elements, a respective object comprising a layout function and a render function is generated. An object corresponding to an element is invoked to generate layout data associated with the element, and the element is rendered based on the layout data.
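The element-object pattern described in the abstract (one object per element, exposing a layout function and a render function) can be sketched roughly as follows. This is a minimal illustration only; the class names, the `LayoutData` fields, and the `render_document` driver are assumptions made for the sketch and are not taken from the patent.

```python
# Minimal sketch of the per-element layout/render object pattern described
# in the abstract. Class and method names are illustrative, not from the patent.
from dataclasses import dataclass


@dataclass
class LayoutData:
    x: float
    y: float
    width: float
    height: float


class ParagraphElement:
    def __init__(self, text: str):
        self.text = text

    def layout(self, page_width: float, cursor_y: float) -> LayoutData:
        # Compute layout data for this element (position and extent).
        line_count = max(1, len(self.text) // 80 + 1)
        return LayoutData(x=0.0, y=cursor_y, width=page_width, height=14.0 * line_count)

    def render(self, data: LayoutData) -> str:
        # Render the element based on its previously computed layout data.
        return f"<p style='top:{data.y}px;width:{data.width}px'>{self.text}</p>"


def render_document(elements, page_width=612.0):
    """Invoke each element object: first layout, then render."""
    cursor_y, output = 0.0, []
    for element in elements:
        data = element.layout(page_width, cursor_y)
        output.append(element.render(data))
        cursor_y += data.height
    return "\n".join(output)


print(render_document([ParagraphElement("Hello, layout."), ParagraphElement("Second paragraph.")]))
```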

5 citations

Proceedings ArticleDOI
01 Aug 2018
TL;DR: A binarization-free dynamic programming approach that generates an equidistant text line extraction polygon is presented and compared with other solutions, ranging from human-reviewed ground-truth polygons to simpler automatically generated rectangle areas.
Abstract: Text Line Segmentation is a basic document layout task that consists of detecting and extracting the text lines present in a document page image. Although considered a basic task, it is generally a necessary step for higher-level Handwritten Text Recognition (HTR) tasks. Most state-of-the-art automatic text recognition, text-to-line image alignment and keyword spotting systems require it because they need isolated text line images as input. Traditionally, most Text Line Segmentation approaches cover both the detection and extraction sub-steps. However, the community has recently shifted its focus to tackling baseline detection in document images independently. This shift creates the need for extraction methods that use these detected baselines as input. In this paper, a binarization-free dynamic programming approach that generates an equidistant text line extraction polygon is presented. The approach performs this calculation based on the information provided by previously detected text baselines and automatically generated foreground-pixel distance maps. We evaluate our approach both on a synthetic competition corpus and on a challenging real handwritten text recognition corpus. We evaluate it not only at the graphical error level but also in terms of the impact it has on an HTR task trained with the line images it yields. We compare our solution with other solutions, ranging from human-reviewed ground-truth polygons to simpler automatically generated rectangle areas.
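The core idea, a dynamic programming search over a foreground-pixel distance map for a path running between two detected baselines, can be sketched as below. This is a simplified illustration under stated assumptions (a precomputed distance map and integer baseline heights), not the paper's exact algorithm; the function name and cost model are hypothetical.

```python
# Simplified sketch of computing a separating path between two consecutive
# baselines with dynamic programming over a foreground-distance map.
# This illustrates the general idea only; it is not the paper's exact method.
import numpy as np


def separating_path(distance_map: np.ndarray, y_top: int, y_bottom: int) -> np.ndarray:
    """Return, for every column, a row index between y_top and y_bottom.

    The path prefers rows that are far from foreground pixels (large distance
    values) and is kept smooth by allowing at most one row of vertical
    movement per column, in the spirit of a minimum-cost seam.
    """
    band = distance_map[y_top:y_bottom, :]          # restrict to the inter-line band
    height, width = band.shape
    cost = -band.astype(np.float64)                 # high distance -> low cost
    acc = np.full((height, width), np.inf)
    acc[:, 0] = cost[:, 0]
    back = np.zeros((height, width), dtype=np.int64)

    for x in range(1, width):
        for y in range(height):
            lo, hi = max(0, y - 1), min(height, y + 2)   # allow |dy| <= 1 per column
            prev = acc[lo:hi, x - 1]
            k = int(np.argmin(prev))
            acc[y, x] = cost[y, x] + prev[k]
            back[y, x] = lo + k

    # Backtrack the cheapest path from the last column.
    path = np.zeros(width, dtype=np.int64)
    path[-1] = int(np.argmin(acc[:, -1]))
    for x in range(width - 1, 0, -1):
        path[x - 1] = back[path[x], x]
    return path + y_top                             # back to full-image coordinates
```

A distance map of this kind could be obtained, for example, with a Euclidean distance transform of an estimated foreground mask; the paper itself generates its distance maps automatically without binarization.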

5 citations

Journal ArticleDOI
TL;DR: This paper presents a technology that addresses the problem of accurately regenerating the full text layout by closely preserving the original textual layout of the scanned PDF, using the open-source document analysis and OCR system OCRopus together with geometric layout and positioning information.
Abstract: Information can include text, pictures and signatures that can be scanned into a document format, such as the Portable Document Format (PDF), and easily emailed to recipients around the world. Upon the document’s arrival, the receiver can open and view it using a vast array of different PDF viewing applications such as Adobe Reader and Apple Preview. Hence, today the use of the PDF has become pervasive. Since a scanned PDF is an image format, it is inaccessible to assistive technologies such as a screen reader, so retrieving the information requires Optical Character Recognition (OCR). The OCR software processes the scanned PDF file and, through text extraction, generates an editable text-formatted document. This text document can then be edited, formatted, searched and indexed, as well as translated or converted to speech. A problem that the OCR software does not solve is the accurate regeneration of the full text layout. This paper presents a technology that addresses this issue by closely preserving the original textual layout of the scanned PDF, using the open-source document analysis and OCR system OCRopus together with geometric layout and positioning information. The main issues considered in this research are the preservation of the correct reading order and the representation of common logical structural elements such as section headings, line breaks, paragraphs, captions, sidebars, footer bars, running headers, embedded images, graphics, tables and mathematical expressions.
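A rough sketch of the reading-order idea: sort OCR line boxes into columns and top-to-bottom order, and insert paragraph breaks where large vertical gaps occur. The box format and the thresholds are assumptions made for illustration; this is not the OCRopus API or the paper's exact procedure.

```python
# Hedged sketch: rebuilding an approximate reading order and paragraph breaks
# from OCR line bounding boxes. The box format (x0, y0, x1, y1, text) is an
# assumption for illustration; it is not the OCRopus output format.
def reconstruct_text(lines, column_gap=200, paragraph_gap=1.8):
    """lines: list of (x0, y0, x1, y1, text) tuples, with y increasing downwards."""
    # Crude column assignment: bucket lines by their left edge, then sort by y.
    lines = sorted(lines, key=lambda l: (l[0] // column_gap, l[1]))
    output, prev = [], None
    for x0, y0, x1, y1, text in lines:
        if prev is not None:
            prev_x0, prev_y1, prev_height = prev
            new_column = (x0 // column_gap) != (prev_x0 // column_gap)
            big_gap = (y0 - prev_y1) > paragraph_gap * prev_height
            if new_column or big_gap:
                output.append("")            # blank line marks a paragraph break
        output.append(text)
        prev = (x0, y1, max(1, y1 - y0))
    return "\n".join(output)
```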

5 citations

Proceedings ArticleDOI
18 Jun 1996
TL;DR: The algorithm is very fast, works on low-resolution document pages and is robust against skew; no assumptions are made about the layout of the document, the shape of the text regions, or the font size and style.
Abstract: This paper describes a fast and flexible method for extracting text regions from a document page containing text, graphics, and pictures. Such regions can be given as input to an OCR system. The user fixes two parameters, the minimum width w of the text to be detected and the precision ε needed (both expressed as a percentage of the image width), according to the implementation needs. The method works by subdividing the page into overlapping columns whose width and inter-shift depend on w and ε, and by performing text line extraction on each column separately. Subsequently, a statistical analysis of the text line elements found in each column is performed, and they are connected to form complete text lines. Finally, related pieces of text are merged into blocks so that a sensible reading order is provided for the OCR system. The algorithm is very fast, is able to work on low-resolution document pages, and is robust against skew. The algorithm is also very flexible: no assumptions are made about the layout of the document, the shape of the text regions, or the font size and style; the main assumption is that the background is uniform and the text is approximately horizontal. Despite the statistical nature of the method, a single line of text of a certain font size is generally sufficient to warrant detection. Experimental results are shown which demonstrate the effectiveness of the method on several different kinds of documents.
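The overlapping-column subdivision step can be illustrated with a short sketch. The abstract does not give the exact relation between w, ε and the column geometry, so the formulas below are assumptions; only the overall structure (columns of fixed width, shifted by a fixed step) follows the description.

```python
# Hedged sketch of the overlapping-column subdivision step. The exact relation
# between (w, epsilon) and the column geometry is not given in the abstract,
# so the formulas below are illustrative assumptions.
def overlapping_columns(image_width, w_percent, eps_percent):
    """Yield (start, end) pixel ranges of overlapping vertical columns.

    w_percent   -- minimum text width to detect, as % of image width
    eps_percent -- required precision, as % of image width
    """
    col_width = max(1, int(image_width * w_percent / 100.0))
    shift = max(1, int(image_width * eps_percent / 100.0))
    start = 0
    while start < image_width:
        yield start, min(image_width, start + col_width)
        start += shift


# Example: a 2000-pixel-wide page, minimum text width 10%, precision 2%.
columns = list(overlapping_columns(2000, 10, 2))
```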

5 citations

Proceedings ArticleDOI
04 Jul 2005
TL;DR: A method for skew detection of document images is presented, based on the directions of text lines in the images and using a line-likeness measure that reflects the strength and direction with which a pixel belongs to a text line.
Abstract: A method for skew detection of document images is presented. The method is based on information about the directions of text lines in the images. To extract these directions, a measure of the line-likeness of text is adopted, which reflects the strength and direction with which a pixel belongs to a text line. This measure is applicable to text of various fonts, sizes, and even languages. The skew angle of the whole image is obtained as a consensus among all pixels in the image, and several methods for forming this consensus are proposed. Experimental results of the method on document images are presented.
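The consensus step can be illustrated with a minimal sketch in which every pixel votes for an angle with a weight, and the skew estimate is taken as the peak of the weighted histogram. Since the abstract does not define the line-likeness measure, gradient magnitude and orientation are used here as a stand-in, and the function name and parameters are hypothetical.

```python
# Hedged sketch of the consensus step: each pixel votes for an angle with a
# weight, and the skew angle is taken as the peak of the weighted histogram.
# Gradient magnitude/orientation is used here as a stand-in for the paper's
# line-likeness measure, which is not specified in the abstract.
import numpy as np


def estimate_skew(gray: np.ndarray, angle_range=15.0, bins=301) -> float:
    """Estimate the skew angle (degrees) of a grayscale page image."""
    gy, gx = np.gradient(gray.astype(np.float64))
    weight = np.hypot(gx, gy)                       # voting strength per pixel
    # Text-line direction is roughly perpendicular to the intensity gradient.
    angle = np.degrees(np.arctan2(gy, gx)) - 90.0
    angle = (angle + 90.0) % 180.0 - 90.0           # wrap into (-90, 90]

    mask = np.abs(angle) <= angle_range             # keep only near-horizontal votes
    hist, edges = np.histogram(angle[mask], bins=bins,
                               range=(-angle_range, angle_range),
                               weights=weight[mask])
    centers = 0.5 * (edges[:-1] + edges[1:])
    return float(centers[np.argmax(hist)])          # consensus = histogram peak
```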

5 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 82% related
Feature (computer vision): 128.2K papers, 1.7M citations, 82% related
Object detection: 46.1K papers, 1.3M citations, 81% related
Image segmentation: 79.6K papers, 1.8M citations, 80% related
Convolutional neural network: 74.7K papers, 2M citations, 79% related
Performance
Metrics
No. of papers in the topic in previous years:
Year  Papers
2023  5
2022  19
2021  34
2020  19
2019  14
2018  9