
Showing papers by "Thomas M. Breuel published in 2005"


Proceedings ArticleDOI
31 Aug 2005
TL;DR: A new algorithm for removing both perspective and page curl distortion is presented; it requires only a single camera image as input and relies on a priori layout information instead of additional hardware, giving it the potential to become a general purpose preprocessing tool for camera based document capture.
Abstract: Digital cameras have become almost ubiquitous and their use for fast and casual capturing of natural images is unchallenged. For making images of documents, however, they have not caught up to flatbed scanners yet, mainly because camera images tend to suffer from distortion due to the perspective and are therefore limited in their further use for archival or OCR. For images of non-planar paper surfaces like books, page curl causes additional distortion, which poses an even greater problem due to its nonlinearity. This paper presents a new algorithm for removing both perspective and page curl distortion. It requires only a single camera image as input and relies on a priori layout information instead of additional hardware. Therefore, it is much more user friendly than most previous approaches, and allows for flexible ad hoc document capture. Results are presented showing that the algorithm produces visually pleasing output and increases OCR accuracy, thus having the potential to become a general purpose preprocessing tool for camera based document capture.

116 citations


01 Jan 2005
TL;DR: The design of a prototype system that takes a step toward handling documents printed on paper as comfortably as electronic ones, while at the same time offering electronic access to all paper documents that have been present on the desktop during the uptime of the system.
Abstract: Ever since text processors became popular, users have dreamt of handling documents printed on paper as comfortably as electronic ones, with full text search typically appearing very close to the top of the wish list. This paper presents the design of a prototype system that takes a step in this direction. The user’s desktop is continuously monitored, and a high-resolution snapshot of each detected document is taken using a digital camera. The resulting image is processed using specially designed dewarping and OCR algorithms, making a digital and fully searchable version of the document available to the user in real time. These steps are performed without any user interaction. This enables the system to run as a background task without disturbing the user in her work, while at the same time offering electronic access to all paper documents that have been present on the desktop during the uptime of the system.

44 citations


Proceedings ArticleDOI
14 Nov 2005
TL;DR: An approximate separable filtering scheme consisting of three 1D convolutions is proposed, which can outperform an FFT-based implementation when the kernel size is small compared to the size of the 3D images.
Abstract: Anisotropic Gaussian filters are useful for adaptive smoothing and feature extraction. In our application, micro-tomographic images of fibers were smoothed by anisotropic Gaussians, which is more natural here than using their isotropic counterparts. But filtering large 3D data is very time consuming. We extend the work of Geusebroek et al. on fast Gauss filtering to three dimensions [(J.-M. Geusebroek et al., 2003), (G.Z. Yang et al., 1996)]. We propose an approximate separable filtering scheme which consists of three 1D convolutions. Initial experiments suggest that this filter can outperform an FFT-based implementation when the kernel size is small compared to the size of the 3D images.

8 citations
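The paper's contribution is decomposing an arbitrarily *oriented* anisotropic 3D Gaussian into three 1D convolutions along generally non-orthogonal directions. As a rough sketch of the underlying separability idea only — the axis-aligned special case with one sigma per axis, not the paper's oriented decomposition — the filter can be written as three successive 1D convolutions (all function names here are illustrative, not from the paper):

```python
import numpy as np

def gauss1d(sigma, radius=None):
    # Discrete, normalized 1D Gaussian kernel.
    if radius is None:
        radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def separable_gauss3d(vol, sigmas):
    # Axis-aligned anisotropic Gaussian smoothing of a 3D volume,
    # applied as three successive 1D convolutions (one per axis).
    out = vol.astype(float)
    for axis, sigma in enumerate(sigmas):
        k = gauss1d(sigma)
        out = np.apply_along_axis(
            lambda m: np.convolve(m, k, mode="same"), axis, out)
    return out

rng = np.random.default_rng(0)
vol = rng.standard_normal((32, 32, 32))
smoothed = separable_gauss3d(vol, sigmas=(2.0, 1.0, 0.5))
print(smoothed.shape)               # (32, 32, 32)
print(smoothed.std() < vol.std())   # True: smoothing reduces noise variance
```

The appeal of separability is cost: a direct 3D convolution with a kernel of side k touches O(k³) samples per voxel, while three 1D passes touch only O(3k), which is why a small-kernel separable filter can beat an FFT-based implementation on large volumes.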


Book ChapterDOI
25 Aug 2005
TL;DR: This work presents a new solution to the robust detection of lines and arcs in scanned documents or technical drawings that works directly on run-length encoded data and is based on a branch-and-bound approach.
Abstract: The robust detection of lines and arcs in scanned documents or technical drawings is an important problem in document image understanding. We present a new solution to this problem that works directly on run-length encoded data. The method finds globally optimal solutions to parameterized thick line and arc models. Line thickness is part of the model and directly used during the matching process. Unlike previous approaches, it does not require any thinning or other preprocessing steps, no computation of the line adjacency graphs, and no heuristics. Furthermore, the only search-related parameter that needs to be specified is the desired numerical accuracy of the solution. The method is based on a branch-and-bound approach for the globally optimal detection of these geometric primitives using runs of black pixels in a bi-level image. We present qualitative and quantitative results of the algorithm on images used in the 2003 and 2005 GREC arc segmentation contests.

5 citations
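The branch-and-bound search principle used here can be illustrated on a deliberately simplified toy problem: finding the offset of a horizontal thick line through 1D point data, rather than the paper's thick line and arc models on run-length encoded images. This is a sketch of the interval-splitting idea only; all names are hypothetical and nothing below reproduces the paper's algorithm:

```python
import heapq
import numpy as np

def upper_bound(ys, lo, hi, t):
    # Points that could lie within thickness t of SOME offset r in [lo, hi];
    # this can only overestimate the count for any single r in the interval.
    return int(np.sum((ys >= lo - t) & (ys <= hi + t)))

def bnb_offset(ys, lo, hi, t, eps=1e-3):
    # Best-first branch-and-bound over the offset parameter r in [lo, hi]:
    # repeatedly split the interval with the highest bound; once the best
    # interval is narrower than eps, its bound is the (within-eps) optimal
    # match count, and only the desired accuracy eps needs to be specified.
    heap = [(-upper_bound(ys, lo, hi, t), lo, hi)]
    while heap:
        neg_ub, a, b = heapq.heappop(heap)
        if b - a < eps:
            return 0.5 * (a + b), -neg_ub
        m = 0.5 * (a + b)
        heapq.heappush(heap, (-upper_bound(ys, a, m, t), a, m))
        heapq.heappush(heap, (-upper_bound(ys, m, b, t), m, b))

ys = np.array([0.0, 4.9, 5.0, 5.05, 5.1, 9.0])
r, n = bnb_offset(ys, 0.0, 10.0, t=0.2)
print(n)  # 4 points matched; offset r lands near 5.0
```

Because the bound is monotone (it never decreases when the interval grows), the first sufficiently narrow interval popped from the priority queue is guaranteed to contain a globally optimal parameter, which is the same global-optimality property the paper exploits for thick lines and arcs.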


01 Jan 2005
TL;DR: This paper proposes a collection of methods to measure the quality of restoration algorithms for document images that show a non-linear distortion due to perspective or page curl, and describes the buildup of a document image database meant to serve as a common data basis for all kinds of restoration from images of 3D-shaped documents.
Abstract: Many algorithms to remove distortion from document images have been proposed in recent years, but so far there is no reliable method for comparing their performance. In this paper we propose a collection of methods to measure the quality of such restoration algorithms for document images that show a non-linear distortion due to perspective or page curl. For the results of these measurements to be meaningful, a common data set with ground truth is required. We have therefore started the buildup of a document image database that is meant to serve as a common data basis for all kinds of restoration from images of 3D-shaped documents. The long-term goal is to establish this database and its future extensions as a tool for document image dewarping that is as fruitful and indispensable as, e.g., the NIST database is for OCR, or the Caltech database is for object and face recognition.