Book Chapter

Automatic table detection in document images

Abstract
In this paper, we propose a novel technique for automatic table detection in document images. Lines and tables are among the most frequent graphic, non-textual entities in documents, and their detection is directly related to OCR performance as well as to the description of the document layout. The proposed workflow comprises three distinct steps: (i) image pre-processing; (ii) horizontal and vertical line detection; and (iii) table detection. The efficiency of the method is demonstrated with a performance evaluation scheme that covers a wide variety of documents, such as forms, newspapers/magazines, scientific journals, tickets/bank cheques, certificates and handwritten documents.
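
The three-step workflow named in the abstract can be illustrated with a minimal sketch. The abstract does not describe the algorithms used in each step, so everything below is an assumption made for illustration only: a global threshold stands in for pre-processing, long runs of ink pixels stand in for line detection, and a count of line crossings stands in for table detection. The function names and parameters (`binarize`, `detect_lines`, `looks_like_table`, `min_len`, `min_crossings`) are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Image = List[List[int]]      # grayscale rows; 0 = black, 255 = white
Line = Tuple[int, int, int]  # (fixed coordinate, start, end), end inclusive


def binarize(gray: Image, threshold: int = 128) -> List[List[bool]]:
    """Step (i), simplified: global threshold. True marks an ink (dark) pixel.
    Assumption: the paper's actual pre-processing is not given in the abstract."""
    return [[pixel < threshold for pixel in row] for row in gray]


def find_runs(cells: List[bool], min_len: int) -> List[Tuple[int, int]]:
    """Return (start, end) for every run of True values at least min_len long."""
    runs: List[Tuple[int, int]] = []
    start = None
    for i, on in enumerate(list(cells) + [False]):  # sentinel closes a trailing run
        if on and start is None:
            start = i
        elif not on and start is not None:
            if i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    return runs


def detect_lines(ink: List[List[bool]], min_len: int) -> Tuple[List[Line], List[Line]]:
    """Step (ii), simplified: long ink runs along rows (horizontal lines)
    and along columns (vertical lines). A line thicker than one pixel is
    reported once per pixel row or column it covers."""
    horizontal = [(y, s, e) for y, row in enumerate(ink)
                  for s, e in find_runs(row, min_len)]
    columns = [list(col) for col in zip(*ink)]
    vertical = [(x, s, e) for x, col in enumerate(columns)
                for s, e in find_runs(col, min_len)]
    return horizontal, vertical


def looks_like_table(horizontal: List[Line], vertical: List[Line],
                     min_crossings: int = 4) -> bool:
    """Step (iii), simplified: treat a region as a table candidate when
    enough horizontal and vertical lines cross each other."""
    crossings = sum(
        1
        for y, x0, x1 in horizontal
        for x, y0, y1 in vertical
        if x0 <= x <= x1 and y0 <= y <= y1
    )
    return crossings >= min_crossings


# Usage: a 7x7 synthetic image with ruling lines on rows and columns 0, 3 and 6.
grid = [[0 if (y % 3 == 0 or x % 3 == 0) else 255 for x in range(7)]
        for y in range(7)]
h_lines, v_lines = detect_lines(binarize(grid), min_len=5)
is_table = looks_like_table(h_lines, v_lines)
```

A run-length approach is used here only because it needs no external libraries. Skewed, broken or faint ruling lines, which are common in the document types listed in the abstract, would require more robust techniques than this sketch provides.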

