Author

James L. Fisher

Bio: James L. Fisher is an academic researcher from Mitre Corporation. The author has contributed to research in topics: Run-length encoding & Pixel. The author has an h-index of 2 and has co-authored 2 publications receiving 267 citations.

Papers
Proceedings ArticleDOI
16 Jun 1990
TL;DR: By creating a burst image from the original document image, the processing time of the Hough transform can be reduced by a factor of as much as 7.4 for documents with gray-scale images, and interline spacing can be determined more accurately.
Abstract: As part of the development of a document image analysis system, a method, based on the Hough transform, was devised for the detection of document skew and interline spacing, necessary parameters for the automatic segmentation of text from graphics. Because the Hough transform is computationally expensive, the amount of data within a document image is reduced through the computation of its horizontal and vertical black runlengths. Histograms of these runlengths are used to determine whether the document is in portrait or landscape orientation. A gray-scale burst image is created from the black runlengths that are perpendicular to the text lines by placing the length of the run in the run's bottom-most pixel. By creating a burst image from the original document image, the processing time of the Hough transform can be reduced by a factor of as much as 7.4 for documents with gray-scale images. Because only small runlengths are input to the Hough transform, and because the accumulator array is incremented by the runlength associated with a pixel rather than by 1, the negative effects of noise, black margins, and figures are avoided. Consequently, interline spacing can be determined more accurately.
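The construction can be sketched in a few lines of Python/NumPy. This is a minimal illustration, not the authors' implementation: it assumes a portrait-oriented page supplied as a 0/1 array with 1 marking black pixels, and the maximum-runlength cutoff and angle range are illustrative values rather than parameters taken from the paper.

import numpy as np

def burst_image(binary, max_run=25):
    # Build a gray-scale "burst" image from a binary page (1 = black).
    # For every vertical black run, the run's length is written into the
    # run's bottom-most pixel; every other pixel stays 0.  Runs longer
    # than max_run (an illustrative cutoff) are assumed to belong to
    # figures or margins and are discarded.
    h, w = binary.shape
    burst = np.zeros((h, w), dtype=np.int32)
    for x in range(w):
        run = 0
        for y in range(h):
            if binary[y, x]:
                run += 1
            else:
                if 0 < run <= max_run:
                    burst[y - 1, x] = run          # bottom-most pixel of the run
                run = 0
        if 0 < run <= max_run:
            burst[h - 1, x] = run                  # run touches the bottom edge
    return burst

def hough_skew(burst, angles_deg=np.arange(-15.0, 15.25, 0.25)):
    # Weighted Hough transform over the nonzero burst pixels only.  Each
    # pixel votes with its stored runlength rather than 1, so baseline
    # strokes dominate and isolated noise contributes little.
    ys, xs = np.nonzero(burst)
    weights = burst[ys, xs]
    thetas = np.deg2rad(angles_deg)
    diag = int(np.ceil(np.hypot(*burst.shape)))
    acc = np.zeros((len(thetas), 2 * diag + 1), dtype=np.int64)
    for i, t in enumerate(thetas):
        rho = np.round(xs * np.cos(t) + ys * np.sin(t)).astype(int) + diag
        np.add.at(acc[i], rho, weights)
    best = int(np.argmax(acc.max(axis=1)))
    return angles_deg[best], acc

The skew estimate is the angle of the strongest accumulator row; the spacing between successive peaks along the rho axis at that angle gives the interline spacing.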

263 citations

Book
01 Jan 1995
TL;DR: In this article, a gray-scale burst image is created from the black runlengths that are perpendicular to the text lines by placing the length of the run in the run's bottom-most pixel.

7 citations


Cited by
Book
25 Nov 1996
TL;DR: Algorithms for Image Processing and Computer Vision, 2nd Edition provides the tools to speed development of image processing applications.
Abstract: A cookbook of algorithms for common image processing applications. Thanks to advances in computer hardware and software, algorithms have been developed that support sophisticated image processing without requiring an extensive background in mathematics. This bestselling book has been fully updated with the newest of these, including 2D vision methods in content-based searches and the use of graphics cards as image processing computational aids. It's an ideal reference for software engineers and developers, advanced programmers, graphics programmers, scientists, and other specialists who require highly specialized image processing. Algorithms now exist for a wide variety of sophisticated image processing applications required by software engineers and developers, advanced programmers, graphics programmers, scientists, and related specialists. This bestselling book has been completely updated to include the latest algorithms, including 2D vision methods in content-based searches, details on modern classifier methods, and graphics cards used as image processing computational aids. It saves hours of mathematical calculating by using distributed processing and GPU programming, and gives non-mathematicians the shortcuts needed to program relatively sophisticated applications. Algorithms for Image Processing and Computer Vision, 2nd Edition provides the tools to speed development of image processing applications.

1,517 citations

Journal ArticleDOI
Lawrence O'Gorman
TL;DR: The document spectrum (or docstrum) as discussed by the authors is a method for structural page layout analysis based on bottom-up, nearest-neighbor clustering of page components, which yields an accurate measure of skew, within-line, and between-line spacings and locates text lines and text blocks.
Abstract: Page layout analysis is a document processing technique used to determine the format of a page. This paper describes the document spectrum (or docstrum), which is a method for structural page layout analysis based on bottom-up, nearest-neighbor clustering of page components. The method yields an accurate measure of skew, within-line, and between-line spacings and locates text lines and text blocks. It is advantageous over many other methods in three main ways: independence from skew angle, independence from different text spacings, and the ability to process local regions of different text orientations within the same image. Results of the method shown for several different page formats and for randomly oriented subpages on the same image illustrate the versatility of the method. We also discuss the differences, advantages, and disadvantages of the docstrum with respect to other layout methods.
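The clustering step can be illustrated with a short sketch. This is not O'Gorman's implementation, only a rough approximation assuming SciPy is available: connected components stand in for characters, and the k value, histogram bin count, and the 30/60 degree angular split are illustrative choices rather than the paper's parameters.

import numpy as np
from scipy import ndimage
from scipy.spatial import cKDTree

def docstrum_estimates(binary, k=5):
    # Label connected components (roughly, characters) and take each
    # centroid's k nearest neighbours; skew and spacings are read off
    # the angle and distance statistics of those neighbour pairs.
    labels, n = ndimage.label(binary)
    centroids = np.array(ndimage.center_of_mass(binary, labels, range(1, n + 1)))
    if len(centroids) <= k:
        raise ValueError("too few components for a docstrum estimate")
    dists, idx = cKDTree(centroids).query(centroids, k=k + 1)
    dists, idx = dists[:, 1:], idx[:, 1:]            # drop each point's self-match

    vecs = centroids[idx] - centroids[:, None, :]    # neighbour vectors, (row, col) order
    angles = np.mod(np.arctan2(vecs[..., 0], vecs[..., 1]), np.pi)  # fold opposite directions

    # Dominant neighbour orientation = skew angle.
    hist, edges = np.histogram(angles, bins=180, range=(0.0, np.pi))
    skew = edges[np.argmax(hist)] + (edges[1] - edges[0]) / 2

    # Pairs roughly parallel to the text direction give the within-line
    # (character) spacing; pairs roughly perpendicular to it give the
    # between-line spacing.
    rel = np.abs(np.mod(angles - skew + np.pi / 2, np.pi) - np.pi / 2)
    within = np.median(dists[rel < np.deg2rad(30)])
    between = np.median(dists[rel > np.deg2rad(60)])

    if skew > np.pi / 2:                             # report skew in (-90, 90] degrees
        skew -= np.pi
    return np.rad2deg(skew), within, between

Because every quantity is measured relative to nearest neighbours, the estimates do not depend on a particular skew angle or on fixed text spacings, which is the property the abstract highlights.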

654 citations

Book
01 Jan 1995
TL;DR: The document spectrum (or docstrum), which is a method for structural page layout analysis based on bottom-up, nearest-neighbor clustering of page components, yields an accurate measure of skew, within-line, and between-line spacings and locates text lines and text blocks.

628 citations

Book
01 Jan 1995
TL;DR: This article describes a class of techniques based on smeared run-length codes that divide a page into gray and nearly white parts; segmentation is then performed by finding connected components of either the gray elements or the white.
Abstract: Page segmentation is the process by which a scanned page is divided into columns and blocks which are then classified as halftones, graphics, or text. Past techniques have used the fact that such parts form right rectangles for most printed material. This property does not hold when the page is tilted, and heuristics based on it fail in such cases unless a rather expensive tilt-angle estimation is performed. We describe a class of techniques based on smeared run-length codes that divide a page into gray and nearly white parts. Segmentation is then performed by finding connected components of either the gray elements or the white, the latter forming white streams that partition a page into blocks of printed material. Such techniques appear quite robust in the presence of severe tilt (even greater than 10°) and are also quite fast (about a second per page on a SPARCstation for gray-element aggregation). Further classification into text or halftones is based mostly on properties of the correlation across scanlines. For text, the correlation of adjacent scanlines tends to be quite high, but it drops rapidly with distance. For halftones, the correlation of adjacent scanlines is usually well below that for text, but it does not change much with distance.
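A minimal sketch of this style of segmentation is given below. It follows the classic run-length smoothing recipe (horizontal and vertical smearing combined with a logical AND) as an approximation of the gray-element aggregation described above; the gap thresholds are illustrative, the page is assumed to be a 0/1 NumPy array with 1 marking black, and the scanline-correlation test that separates text from halftones is not reproduced.

import numpy as np
from scipy import ndimage

def smear(binary, max_gap):
    # Run-length smearing along rows: white runs of at most max_gap
    # pixels between two black pixels are filled with black.
    out = binary.copy()
    for row in out:                                # rows are views, edits stick
        black = np.flatnonzero(row)
        for start, step in zip(black[:-1], np.diff(black)):
            white = step - 1                       # length of the white run
            if 0 < white <= max_gap:
                row[start + 1:start + 1 + white] = 1
    return out

def segment_blocks(binary, h_gap=30, v_gap=15):
    # Smear horizontally and vertically, keep only pixels "gray" in both
    # directions, then label the gray connected components; each labelled
    # region is a candidate block of printed material.
    horiz = smear(binary, h_gap)
    vert = smear(binary.T, v_gap).T
    gray = horiz & vert
    labels, n = ndimage.label(gray)
    return labels, ndimage.find_objects(labels)

Because the smearing and the connected-component step never assume upright rectangles, moderate tilt mainly stretches the gray regions rather than breaking them, which is consistent with the robustness to tilt claimed in the abstract.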

249 citations

Journal ArticleDOI
TL;DR: A new, fast approach is presented in which skew-angle detection takes advantage of information obtained by the page-orientation algorithm; the results indicate that detection accuracy can be improved by minimizing the effects of non-textual data.

225 citations