
A comprehensive survey of mostly textual document segmentation algorithms since 2008

Sebastien Eskenazi, +2 more
01 Apr 2017, Vol. 64, pp 1-14
TLDR
This survey highlights the variety of approaches proposed for document image segmentation since 2008 and provides a clear typology of documents and of document image segmentation algorithms.
About
This article is published in Pattern Recognition. The article was published on 2017-04-01 and is currently open access. It has received 84 citations to date. The article focuses on the topics: Document clustering & Image segmentation.


Citations
Journal Article

Recent advances in convolutional neural networks

TL;DR: This article gives a broad survey of recent advances in convolutional neural networks, discussing improvements to CNNs in several aspects, namely layer design, activation function, loss function, regularization, optimization, and fast computation.
Posted Content

Recent Advances in Convolutional Neural Networks

TL;DR: This paper details improvements to CNNs in several aspects, including layer design, activation function, loss function, regularization, optimization, and fast computation, and introduces various applications of convolutional neural networks in computer vision, speech, and natural language processing.
Proceedings Article

FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents

TL;DR: This work presents a new dataset for form understanding in noisy scanned documents (FUNSD) that aims at extracting and structuring the textual content of forms, and is the first publicly available dataset with comprehensive annotations to address the form understanding (FoUn) task.
Journal Article

A Two-Stage Method for Text Line Detection in Historical Documents

TL;DR: This paper presents a two-stage text line detection method for historical documents: the first stage labels each pixel as one of three classes (baseline, separator, or other), and the second stage performs bottom-up clustering to build baselines.
Journal Article

Document Layout Analysis: A Comprehensive Survey

TL;DR: This survey paper presents a critical study of different document layout analysis (DLA) techniques and comprehensively discusses the phases of DLA algorithms within a general framework derived from reviewing the research in the field.
References
Journal Article

A feature-integration theory of attention

TL;DR: A new hypothesis about the role of focused attention is proposed, which offers a new set of criteria for distinguishing separable from integral features and a new rationale for predicting which tasks will show attention limits and which will not.
Journal Article

SLIC Superpixels Compared to State-of-the-Art Superpixel Methods

TL;DR: A new superpixel algorithm is introduced, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels and is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.
Journal Article

Shape matching and object recognition using shape contexts

TL;DR: This paper presents work on computing shape models that are computationally fast and invariant to basic transformations such as translation, scaling, and rotation, and proposes shape detection using a feature called shape context, which describes the shape of the object.
Journal Article

Algorithms for the reduction of the number of points required to represent a digitized line or its caricature

TL;DR: This paper presents two algorithms that reduce the number of points required to represent a line and, if desired, produce caricatures, and compares them with the most promising methods suggested so far.
Journal Article

Document analysis system

TL;DR: The requirements and components of a proposed Document Analysis System, which assists a user in encoding printed documents for computer processing, are outlined. Several critical functions have been investigated, and the technical approaches are discussed.