Book Chapter
Automatic table detection in document images
Basilios Gatos, Dimitrios Danatsas, Ioannis Pratikakis, Stavros Perantonis
pp. 609-618
TL;DR: The efficiency of the proposed method is demonstrated by using a performance evaluation scheme which considers a great variety of documents such as forms, newspapers/magazines, scientific journals, tickets/bank cheques, certificates and handwritten documents.
Abstract:
In this paper, we propose a novel technique for automatic table detection in document images. Lines and tables are among the most frequent graphic, non-textual entities in documents, and detecting them directly affects both OCR performance and the description of the document layout. We propose a workflow for table detection that comprises three distinct steps: (i) image pre-processing; (ii) horizontal and vertical line detection; and (iii) table detection. The efficiency of the proposed method is demonstrated using a performance evaluation scheme that covers a wide variety of documents, including forms, newspapers/magazines, scientific journals, tickets/bank cheques, certificates and handwritten documents.
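The abstract names the three steps but does not specify how each is carried out. The following is a minimal sketch of such a workflow, not the authors' method: it assumes a global threshold for pre-processing, run-length scanning for line detection, and a simple "at least two crossing lines in each direction" rule for table detection. All function names, parameters and thresholds (`binarize`, `horizontal_runs`, `vertical_runs`, `detect_table`, `min_len`, `min_lines`) are hypothetical.

```python
def binarize(gray, threshold=128):
    """Step (i), assumed: global threshold; dark pixels become 1 (ink)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def horizontal_runs(binary, min_len):
    """Step (ii): ink runs of at least min_len pixels, as (row, col_start, col_end)."""
    runs = []
    for r, row in enumerate(binary):
        start = None
        for c, px in enumerate(list(row) + [0]):  # sentinel closes a run at the row end
            if px and start is None:
                start = c
            elif not px and start is not None:
                if c - start >= min_len:
                    runs.append((r, start, c - 1))
                start = None
    return runs


def vertical_runs(binary, min_len):
    """Step (ii): same scan on the transposed image, as (col, row_start, row_end)."""
    transposed = [list(col) for col in zip(*binary)]
    return horizontal_runs(transposed, min_len)


def crosses(h, v):
    """True if horizontal line h and vertical line v share a pixel."""
    r, c0, c1 = h
    c, r0, r1 = v
    return c0 <= c <= c1 and r0 <= r <= r1


def detect_table(h_lines, v_lines, min_lines=2):
    """Step (iii), assumed rule: bounding box (top, left, bottom, right) of the
    mutually crossing lines, or None if too few lines cross."""
    hs = [h for h in h_lines if any(crosses(h, v) for v in v_lines)]
    vs = [v for v in v_lines if any(crosses(h, v) for h in h_lines)]
    if len(hs) < min_lines or len(vs) < min_lines:
        return None
    top = min(r for r, _, _ in hs)
    bottom = max(r for r, _, _ in hs)
    left = min(c for c, _, _ in vs)
    right = max(c for c, _, _ in vs)
    return (top, left, bottom, right)


# Synthetic 7x9 grayscale page (255 = white) with a ruled 2x2 grid drawn in black.
page = [[255] * 9 for _ in range(7)]
for r in (1, 3, 5):
    for c in range(1, 8):
        page[r][c] = 0
for c in (1, 4, 7):
    for r in range(1, 6):
        page[r][c] = 0

binary = binarize(page)
h_lines = horizontal_runs(binary, min_len=5)
v_lines = vertical_runs(binary, min_len=5)
table_box = detect_table(h_lines, v_lines)
```

On real scans, skew, broken rulings and text strokes would make this naive run-length scan unreliable, so the sketch only illustrates how the three steps feed into one another.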
Citations
Reference Book
Handbook of Document Image Processing and Recognition
David Doermann, Karl Tombre
TL;DR: The Handbook of Document Image Processing and Recognition is a comprehensive resource on the latest methods and techniques in document image processing and recognition that enables the reader to make an informed decision for their specific problems.
Proceedings Article
Table Detection Using Deep Learning
TL;DR: The proposed method works with high precision on document images with varying layouts, including documents, research papers, and magazines, and beats Tesseract's state-of-the-art table detection system by a significant margin.
Proceedings Article
ICDAR 2003 page segmentation competition
TL;DR: The results indicate that although methods continue to mature, there is still a considerable need to develop robust methods that deal with everyday documents.
Journal Article
Keyword-guided word spotting in historical printed documents using synthetic data and user feedback
Thomas Konidaris, Basilis Gatos, K. Ntzios, Ioannis Pratikakis, Sergios Theodoridis, Stavros Perantonis
TL;DR: A novel technique for word spotting in historical printed documents combining synthetic data and user feedback is proposed to search for keywords typed by the user in a large collection of digitized printed historical documents.
Proceedings Article
Table detection in heterogeneous documents
Faisal Shafait, Ray Smith
TL;DR: Evaluation of the algorithm on document images from publicly available UNLV dataset shows competitive performance in comparison to the table detection module of a commercial OCR system.
References
Journal Article
A survey of table recognition: Models, observations, transformations, and inferences
TL;DR: This presentation clarifies both the decisions made by a table recognizer and the assumptions and inferencing techniques that underlie these decisions.
Proceedings Article
ICDAR 2003 page segmentation competition
TL;DR: The results indicate that although methods continue to mature, there is still a considerable need to develop robust methods that deal with everyday documents.
Book Chapter
An Adaptive Binarization Technique for Low Quality Historical Documents
TL;DR: A novel digital image binarization scheme for low-quality historical documents that allows further content exploitation in an efficient way and performs better than current state-of-the-art adaptive thresholding techniques.
Proceedings Article
Page Segmentation Competition
TL;DR: The results of the page segmentation competition held in the context of ICDAR2005 indicate that although methods seem to be maturing, there is still a considerable need to develop robust methods that deal with everyday documents.
Proceedings Article
Trainable table location in document images
TL;DR: An approach to table location in document images is described that uses a hierarchical representation based on the MXY tree; an optimization method identifies the optimal values of the thresholds involved in the algorithm.