Author

Richard G. Casey

Other affiliations: Cisco Systems, Inc.
Bio: Richard G. Casey is an academic researcher from IBM. The author has contributed to research in topics: Tree (data structure) & Character (mathematics). The author has an h-index of 16 and has co-authored 26 publications receiving 1896 citations. Previous affiliations of Richard G. Casey include Cisco Systems, Inc.

Papers
Journal ArticleDOI
TL;DR: The requirements and components for a proposed Document Analysis System, which assists a user in encoding printed documents for computer processing, are outlined, and several critical functions and their technical approaches are discussed.
Abstract: This paper outlines the requirements and components for a proposed Document Analysis System, which assists a user in encoding printed documents for computer processing. Several critical functions have been investigated and the technical approaches are discussed. The first is the segmentation and classification of digitized printed documents into regions of text and images. A nonlinear, run-length smoothing algorithm has been used for this purpose. By using the regular features of text lines, a linear adaptive classification scheme discriminates text regions from others. The second technique studied is an adaptive approach to the recognition of the hundreds of font styles and sizes that can occur on printed documents. A preclassifier is constructed during the input process and used to speed up a well-known pattern-matching method for clustering characters from an arbitrary print source into a small sample of prototypes. Experimental results are included.
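
The run-length smoothing step can be sketched in a few lines. This is a minimal illustration of the idea, assuming binary images with 1 = black; the threshold values and function names are illustrative, since suitable thresholds depend on scan resolution.

```python
import numpy as np

def rlsa_1d(line, c):
    """Fill white gaps (runs of 0s) of length <= c that lie
    between two black pixels (1s)."""
    out = line.copy()
    last_black = None
    for i, px in enumerate(line):
        if px:
            if last_black is not None and 0 < i - last_black - 1 <= c:
                out[last_black + 1:i] = 1   # close the gap
            last_black = i
    return out

def rlsa(img, c_h=300, c_v=500):
    """Nonlinear run-length smoothing: smear rows and columns
    independently, then AND the two smeared images so that only
    regions dense in both directions survive as blocks."""
    horiz = np.array([rlsa_1d(row, c_h) for row in img])
    vert = np.array([rlsa_1d(col, c_v) for col in img.T]).T
    return horiz & vert
```

Connected components of the resulting blocks would then be classified as text or image regions using regular features of text lines, as described above.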

718 citations

Journal ArticleDOI
TL;DR: It is shown that a constrained run-length algorithm is well suited to partitioning most documents into areas of text lines, solid black lines, and rectangular boxes enclosing graphics and halftone images.

428 citations

Journal ArticleDOI
Richard G. Casey
TL;DR: Comparison experiments showed that error rates were reduced by integral factors if the patterns were normalized before scanning for recognition, and second-order moments of the pattern are convenient properties to use in specifying the transformation.
Abstract: Handprinted characters can be made more uniform in appearance than the as-written version if an appropriate linear transformation is performed on each input pattern. The transformation can be implemented electronically by programming a flying-spot raster-scanner to scan at specified angles rather than only along specified axes. Alternatively, curve-follower normalization can be achieved by transforming the coordinate waveforms in a linear combining network. Second-order moments of the pattern are convenient properties to use in specifying the transformation. By mapping the original pattern into one having a scalar moment matrix all linear pattern variations can be removed. Comparison experiments with three sets of handprinted numerals showed that error rates were reduced by integral factors if the patterns were normalized before scanning for recognition.
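
The normalization described here amounts to a whitening transform on the pattern coordinates: after centering, the pattern is mapped so that its second-order moment matrix becomes a scalar matrix. A minimal sketch, assuming the pattern is given as an (N, 2) array of black-pixel coordinates; the function name is hypothetical.

```python
import numpy as np

def moment_normalize(points):
    """Linearly transform a pattern so its second-order central
    moment matrix becomes the identity (a scalar matrix), removing
    linear variations such as slant, rotation, and unequal scaling."""
    centered = points - points.mean(axis=0)      # remove translation
    cov = centered.T @ centered / len(centered)  # second-order moments
    vals, vecs = np.linalg.eigh(cov)             # assumes a nondegenerate pattern
    transform = vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T
    return centered @ transform.T
```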

130 citations

Journal ArticleDOI
TL;DR: An important equivalence between operations with functional relations and operations with analogous Boolean functions is demonstrated; it is computationally helpful in exploring the properties of a given set of functional relations, as well as in partitioning a data set into subfiles for efficient implementation.
Abstract: The notion of a functional relation among the attributes of a data set can be fruitfully applied in the structuring of an information system. These relations are meaningful both to the user of the system in his semantic understanding of the data, and to the designer in implementing the system. An important equivalence between operations with functional relations and operations with analogous Boolean functions is demonstrated in this paper. The equivalence is computationally helpful in exploring the properties of a given set of functional relations, as well as in the task of partitioning a data set into subfiles for efficient implementation.
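
The equivalence can be made concrete by testing whether a candidate functional dependency follows from a given set: each FD X -> Y maps to the Boolean implication (AND of X) -> (AND of Y), and implication between sets of FDs coincides with implication between the corresponding formulas. A brute-force sketch over truth assignments, exponential in the number of attributes and for illustration only:

```python
from itertools import product

def fd_implies(fds, candidate, attrs):
    """Decide FD implication via the Boolean correspondence:
    `candidate` follows from `fds` iff every 0/1 assignment
    satisfying all FD formulas also satisfies the candidate's."""
    def holds(fd, assign):
        lhs, rhs = fd  # X -> Y reads: all of X true implies all of Y true
        return not all(assign[a] for a in lhs) or all(assign[a] for a in rhs)
    for bits in product([0, 1], repeat=len(attrs)):
        assign = dict(zip(attrs, bits))
        if all(holds(fd, assign) for fd in fds) and not holds(candidate, assign):
            return False
    return True

# A -> B and B -> C together imply A -> C:
print(fd_implies([({'A'}, {'B'}), ({'B'}, {'C'})],
                 ({'A'}, {'C'}), ['A', 'B', 'C']))  # True
```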

127 citations

Patent
Richard G. Casey, David R. Ferguson
02 Oct 1989
TL;DR: A computer-implemented method for extracting character data from printed forms is proposed: an image consisting only of the lines in the master form is built and displayed, and masks are defined on it, each corresponding to a field where data would be located in a filled-in form.
Abstract: A computer-implemented method operable with conventional OCR scanning equipment and software, extracts character data from printed forms. A blank master form is scanned and its digital image stored. Clusters of ON bits of the master form image are first recognized as part of a line and then connected to form lines. All of the lines in the master form image are then identified by row and column start position and column end position, thereby creating a master-form-description. The resulting image, which consists only of lines in the master form, can then be displayed. Regions or masks in the displayed image of master form lines are then created, each mask corresponding to a field where data would be located in a filled-in form. Each data mask is spaced from nearby lines by a predetermined data margin, referred to as D. A filled-in or data form is then scanned and lines are also recognized and identified in a similar manner to create a data-form-description. The data-form-description is compared with the master-form-description by computing the horizontal and vertical offsets and skew of the two forms relative to one another. The created data masks, whose orientation with respect to the master form has been previously determined, are then transposed into the data form image using the computed values of horizontal and vertical offsets and skew. In this manner, the data masks are correctly located on the data form so that the actual data values in the data form reside within the corresponding data masks. Routines are then implemented for detecting extraneous data intruding into the data masks and for growing the masks, i.e. enlarging the masks to capture data which may extend beyond the perimeter of the masks. Thus, the data masks are adaptive in that they are grown if data does not lie entirely within the perimeter of the masks. During the mask growth routine, lines which are part of the background form are detected and removed by line removal algorithms. Following the removal of extraneous data from the masks, the growth of the masks to capture data, and any subsequent line removal, the remaining data from the masks is extracted and transferred to a new file. The new file then contains only data comprising characters of the data values in the desired regions, which can then be operated on by conventional OCR software to identify the specific character values.
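
The mask-transposition step can be sketched as mapping a mask's corners through the measured offsets and skew and taking their bounding box. A minimal illustration under a small-angle rotation model; the coordinate conventions and names are hypothetical.

```python
import math

def transpose_mask(mask, dx, dy, skew_rad):
    """Map an axis-aligned mask (top, left, bottom, right) from
    master-form coordinates into data-form coordinates using the
    computed offsets and skew; return the bounding box of the
    four transformed corners."""
    top, left, bottom, right = mask
    cos_t, sin_t = math.cos(skew_rad), math.sin(skew_rad)
    corners = [(r, c) for r in (top, bottom) for c in (left, right)]
    mapped = [(r * cos_t - c * sin_t + dy, r * sin_t + c * cos_t + dx)
              for r, c in corners]
    rows, cols = [p[0] for p in mapped], [p[1] for p in mapped]
    return (min(rows), min(cols), max(rows), max(cols))
```

The mask-growth routine described above would then enlarge this rectangle whenever connected data pixels cross its perimeter.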

110 citations


Cited by
Journal ArticleDOI
TL;DR: Various types of moments have been used to recognize image patterns in a number of applications, and some fundamental questions are addressed, such as image-representation ability, noise sensitivity, and information redundancy.
Abstract: Various types of moments have been used to recognize image patterns in a number of applications. A number of moments are evaluated and some fundamental questions are addressed, such as image-representation ability, noise sensitivity, and information redundancy. Moments considered include regular moments, Legendre moments, Zernike moments, pseudo-Zernike moments, rotational moments, and complex moments. Properties of these moments are examined in detail and the interrelationships among them are discussed. Both theoretical and experimental results are presented.
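
Two of the moment families evaluated, regular (geometric) moments and their translation-invariant central counterparts, follow directly from the definitions. A minimal sketch for a 2-D intensity array; names are illustrative.

```python
import numpy as np

def geometric_moment(img, p, q):
    """Regular moment m_pq = sum over pixels of x^p * y^q * f(x, y)."""
    y, x = np.indices(img.shape)
    return float((x**p * y**q * img).sum())

def central_moment(img, p, q):
    """Central moment mu_pq: the same sum taken about the centroid,
    which makes it invariant to translation."""
    m00 = geometric_moment(img, 0, 0)
    xbar = geometric_moment(img, 1, 0) / m00
    ybar = geometric_moment(img, 0, 1) / m00
    y, x = np.indices(img.shape)
    return float(((x - xbar)**p * (y - ybar)**q * img).sum())
```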

1,522 citations

Journal ArticleDOI
TL;DR: This work discusses in detail the design decisions that led to the grid file, presents simulation results of its behavior, and compares it to other multikey access file structures.
Abstract: Traditional file structures that provide multikey access to records, for example, inverted files, are extensions of file structures originally designed for single-key access. They manifest various deficiencies in particular for multikey access to highly dynamic files. We study the dynamic aspects of file structures that treat all keys symmetrically, that is, file structures which avoid the distinction between primary and secondary keys. We start from a bitmap approach and treat the problem of file design as one of data compression of a large sparse matrix. This leads to the notions of a grid partition of the search space and of a grid directory, which are the keys to a dynamic file structure called the grid file. This file system adapts gracefully to its contents under insertions and deletions, and thus achieves an upper bound of two disk accesses for single record retrieval; it also handles range queries and partially specified queries efficiently. We discuss in detail the design decisions that led to the grid file, present simulation results of its behavior, and compare it to other multikey access file structures.
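
The lookup path through a grid directory can be sketched with an in-memory toy: one sorted scale per key dimension locates a grid cell, and the cell indexes a bucket, mirroring the directory-probe-plus-bucket-probe pattern behind the two-disk-access bound. Bucket splitting, scale refinement, and disk layout are omitted; all names are illustrative.

```python
from bisect import bisect_right

class GridFile:
    """Toy grid file: symmetric multikey access via a grid directory."""

    def __init__(self, scales):
        self.scales = scales   # scales[d]: sorted split points for dimension d
        self.directory = {}    # grid-cell index tuple -> bucket (list of records)

    def _cell(self, key):
        # One interval search per dimension locates the grid cell.
        return tuple(bisect_right(s, k) for s, k in zip(self.scales, key))

    def insert(self, key, record):
        self.directory.setdefault(self._cell(key), []).append((key, record))

    def lookup(self, key):
        # One directory probe, one bucket probe.
        return [r for k, r in self.directory.get(self._cell(key), []) if k == key]
```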

1,222 citations

Journal ArticleDOI
TL;DR: A new approach to shape recognition is presented, based on a virtually infinite family of binary features (queries) of the image data designed to accommodate prior information about shape invariance and regularity; a comparison with artificial neural network methods is also presented.
Abstract: We explore a new approach to shape recognition based on a virtually infinite family of binary features (queries) of the image data, designed to accommodate prior information about shape invariance and regularity. Each query corresponds to a spatial arrangement of several local topographic codes (or tags), which are in themselves too primitive and common to be informative about shape. All the discriminating power derives from relative angles and distances among the tags. The important attributes of the queries are a natural partial ordering corresponding to increasing structure and complexity; semi-invariance, meaning that most shapes of a given class will answer the same way to two queries that are successive in the ordering; and stability, since the queries are not based on distinguished points and substructures. No classifier based on the full feature set can be evaluated, and it is impossible to determine a priori which arrangements are informative. Our approach is to select informative features and build tree classifiers at the same time by inductive learning. In effect, each tree provides an approximation to the full posterior where the features chosen depend on the branch that is traversed. Due to the number and nature of the queries, standard decision tree construction based on a fixed-length feature vector is not feasible. Instead we entertain only a small random sample of queries at each node, constrain their complexity to increase with tree depth, and grow multiple trees. The terminal nodes are labeled by estimates of the corresponding posterior distribution over shape classes. An image is classified by sending it down every tree and aggregating the resulting distributions. The method is applied to classifying handwritten digits and synthetic linear and nonlinear deformations of three hundred LaTeX symbols. State-of-the-art error rates are achieved on the National Institute of Standards and Technology database of digits. The principal goal of the experiments on LaTeX symbols is to analyze invariance, generalization error and related issues, and a comparison with artificial neural network methods is presented in this context.
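
The classification rule, sending the image down every tree and aggregating the leaf posteriors, can be sketched as follows. The node structure and the representation of queries as callables are hypothetical simplifications of the paper's scheme.

```python
import numpy as np

class QueryNode:
    """Internal nodes hold a binary query (a callable on the image);
    terminal nodes hold an estimated posterior over shape classes."""

    def __init__(self, query=None, left=None, right=None, posterior=None):
        self.query, self.left, self.right = query, left, right
        self.posterior = posterior

    def descend(self, image):
        if self.posterior is not None:   # terminal node
            return self.posterior
        branch = self.right if self.query(image) else self.left
        return branch.descend(image)

def classify(trees, image):
    """Average the posteriors returned by every tree, then take the argmax."""
    return int(np.argmax(np.mean([t.descend(image) for t in trees], axis=0)))
```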

1,214 citations

Patent
Todd A. Cass
30 Jul 1996
TL;DR: In this article, a reference-based mark extraction technique is proposed in which the second document image serves as a reference image and substantially the entirety of the first document image is compared with substantially the entirety of the second image.
Abstract: A processor is provided with first and second document images. The first image represents an instance of a reference document to which instance a mark has been added. The second image is selected from among a collection of document images and represents the reference document without the mark. The processor automatically extracts from the first document image a set of pixels representing the mark. This is done by performing a reference-based mark extraction technique in which the second document image serves as a reference image and in which substantially the entirety of the first document image is compared with substantially the entirety of the second document image. Also, the processor is provided with information about a set of active elements of the reference document. The reference document has at least one such active element and each active element is associated with at least one action. The processor interprets the extracted set of pixels representing the mark by determining whether the mark indicates any of the active elements of the reference document. If the mark indicates an active element, the processor facilitates the action with which the indicated active element is associated.
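
Once the two images are registered, the extraction itself reduces to a pixelwise difference: keep what is ON in the marked instance but OFF in the reference. A minimal sketch assuming registered binary arrays; offset and skew correction would precede this step in practice.

```python
import numpy as np

def extract_mark(instance, reference):
    """Reference-based mark extraction on registered boolean images:
    keep pixels set in the marked instance but not in the reference."""
    return instance & ~reference
```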

1,099 citations

Journal ArticleDOI
TL;DR: It is shown that logic provides a convenient formalism for studying classical database problems, including the representation and manipulation of deduced facts and incomplete information.
Abstract: The purpose of this paper is to show that logic provides a convenient formalism for studying classical database problems. There are two main parts to the paper, devoted respectively to conventional databases and deductive databases. In the first part, we focus on query languages, integrity modeling and maintenance, query optimization, and data dependencies. The second part deals mainly with the representation and manipulation of deduced facts and incomplete information.
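
The deductive side can be given a concrete flavor with the classic ancestor rules evaluated bottom-up to a fixpoint, in the Datalog style such surveys build on; the relation names are illustrative.

```python
def transitive_closure(parent):
    """Naive bottom-up evaluation of the deductive rules
        ancestor(X, Y) <- parent(X, Y)
        ancestor(X, Z) <- parent(X, Y), ancestor(Y, Z)
    iterating until no new facts are deduced."""
    ancestor = set(parent)
    while True:
        new = {(x, z) for (x, y) in parent
                      for (y2, z) in ancestor if y == y2}
        if new <= ancestor:
            return ancestor
        ancestor |= new

facts = {("ann", "bob"), ("bob", "cid")}
print(transitive_closure(facts))  # adds the deduced fact ('ann', 'cid')
```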

769 citations