
Showing papers in "IEEE Transactions on Pattern Analysis and Machine Intelligence in 1983"


Journal ArticleDOI
TL;DR: This paper describes a number of statistical models for use in speech recognition, with special attention to determining the parameters for such models from sparse data, and describes two decoding methods, one appropriate for constrained artificial languages and one appropriate for more realistic decoding tasks.
Abstract: Speech recognition is formulated as a problem of maximum likelihood decoding. This formulation requires statistical models of the speech production process. In this paper, we describe a number of statistical models for use in speech recognition. We give special attention to determining the parameters for such models from sparse data. We also describe two decoding methods, one appropriate for constrained artificial languages and one appropriate for more realistic decoding tasks. To illustrate the usefulness of the methods described, we review a number of decoding results that have been obtained with them.
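
At its core, the decoding task described here is a search for the most probable state/word sequence under a statistical model. As a purely illustrative sketch (not the authors' decoder; the discrete-HMM parameterization and the function name are assumptions), a Viterbi-style maximum likelihood decoder looks like this:

```
import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    """Most likely state sequence of a discrete HMM (illustrative sketch).
    log_pi[i]   : log initial probability of state i
    log_A[i, j] : log transition probability i -> j
    log_B[i, k] : log probability of emitting symbol k from state i
    obs         : list of observed symbol indices
    """
    n_states, T = log_A.shape[0], len(obs)
    delta = np.full((T, n_states), -np.inf)
    back = np.zeros((T, n_states), dtype=int)
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A        # predecessor score for every pair (i, j)
        back[t] = np.argmax(scores, axis=0)           # best predecessor of each state j
        delta[t] = scores[back[t], np.arange(n_states)] + log_B[:, obs[t]]
    path = [int(np.argmax(delta[-1]))]                # backtrace from the best final state
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```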

1,637 citations


Journal ArticleDOI
TL;DR: The power of the binomial model to produce blurry, sharp, line-like, and blob-like textures is demonstrated and the synthetic microtextures closely resembled their real counterparts, while the regular and inhomogeneous textures did not.
Abstract: We consider a texture to be a stochastic, possibly periodic, two-dimensional image field. A texture model is a mathematical procedure capable of producing and describing a textured image. We explore the use of Markov random fields as texture models. The binomial model, where each point in the texture has a binomial distribution with parameter controlled by its neighbors and ``number of tries'' equal to the number of gray levels, was taken to be the basic model for the analysis. A method of generating samples from the binomial model is given, followed by a theoretical and practical analysis of the method's convergence. Examples show how the parameters of the Markov random field control the strength and direction of the clustering in the image. The power of the binomial model to produce blurry, sharp, line-like, and blob-like textures is demonstrated. Natural texture samples were digitized and their parameters were estimated under the Markov random field model. A hypothesis test was used for an objective assessment of goodness-of-fit under the Markov random field model. Overall, microtextures fit the model well. The estimated parameters of the natural textures were used as input to the generation procedure. The synthetic microtextures closely resembled their real counterparts, while the regular and inhomogeneous textures did not.
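
To make the generation procedure concrete, here is a minimal sketch of Gibbs-style sampling from a binomial Markov random field: each pixel is redrawn from a binomial distribution whose parameter is a logistic function of its four neighbors. The neighborhood, parameter values, and function name are illustrative assumptions, not the paper's estimated models.

```
import numpy as np

def binomial_mrf_texture(shape=(64, 64), levels=8, a=-4.0, b=0.3,
                         sweeps=50, rng=None):
    """Sample a texture from a binomial MRF by repeated Gibbs sweeps.
    Each pixel ~ Binomial(levels - 1, p), where p is a logistic function
    of the (toroidal) 4-neighborhood; a and b are illustrative parameters."""
    rng = np.random.default_rng() if rng is None else rng
    img = rng.integers(0, levels, size=shape)
    H, W = shape
    for _ in range(sweeps):
        for i in range(H):
            for j in range(W):
                t = a + b * (img[i, (j - 1) % W] + img[i, (j + 1) % W] +
                             img[(i - 1) % H, j] + img[(i + 1) % H, j])
                p = 1.0 / (1.0 + np.exp(-t))
                img[i, j] = rng.binomial(levels - 1, p)
    return img
```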

1,496 citations


Journal ArticleDOI
TL;DR: It is shown that the problem of finding consistent labelings is equivalent to solving a variational inequality, and a procedure nearly identical to the relaxation operator derived under restricted circumstances serves in the more general setting.
Abstract: A large class of problems can be formulated in terms of the assignment of labels to objects. Frequently, processes are needed which reduce ambiguity and noise, and select the best label among several possible choices. Relaxation labeling processes are just such a class of algorithms. They are based on the parallel use of local constraints between labels. This paper develops a theory to characterize the goal of relaxation labeling. The theory is founded on a definition of consistency in labelings, extending the notion of constraint satisfaction. In certain restricted circumstances, an explicit functional exists that can be maximized to guide the search for consistent labelings. This functional is used to derive a new relaxation labeling operator. When the restrictions are not satisfied, the theory relies on variational calculus. It is shown that the problem of finding consistent labelings is equivalent to solving a variational inequality. A procedure nearly identical to the relaxation operator derived under restricted circumstances serves in the more general setting. Further, a local convergence result is established for this operator. The standard relaxation labeling formulas are shown to approximate our new operator, which leads us to conjecture that successful applications of the standard methods are explainable by the theory developed here. Observations about convergence and generalizations to higher order compatibility relations are described.
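
For reference, the standard relaxation labeling update that the theory is related to can be sketched as follows (a minimal version of the classical scheme, not the paper's new operator; the array shapes and names are assumptions):

```
import numpy as np

def relaxation_step(p, r):
    """One standard relaxation labeling iteration.
    p[i, l]       : current probability of label l at object i (rows sum to 1)
    r[i, j, l, m] : compatibility of (object i, label l) with (object j, label m)
    """
    q = np.einsum('ijlm,jm->il', r, p)        # support q_i(l) = sum_j sum_m r * p_j(m)
    new_p = np.clip(p * (1.0 + q), 0.0, None)
    return new_p / new_p.sum(axis=1, keepdims=True)
```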

964 citations


Journal ArticleDOI
TL;DR: A variety of approaches to generalized range finding are surveyed and a perspective on their applicability and shortcomings in the context of computer vision studies is presented.
Abstract: In recent times a great deal of interest has been shown, amongst the computer vision and robotics research community, in the acquisition of range data for supporting scene analysis leading to remote (noncontact) determination of configurations and space filling extents of three-dimensional object assemblages. This paper surveys a variety of approaches to generalized range finding and presents a perspective on their applicability and shortcomings in the context of computer vision studies.

887 citations


Journal ArticleDOI
TL;DR: The ``volume segment'' representation presented in this paper is a volumetric representation that facilitates modification yet is descriptive of surface detail in a bounding volume approximating the object generating the contours.
Abstract: Occluding contours from an image sequence with view-point specifications determine a bounding volume approximating the object generating the contours. The initial creation and continual refinement of the approximation requires a volumetric representation that facilitates modification yet is descriptive of surface detail. The ``volume segment'' representation presented in this paper is one such representation.

500 citations


Journal ArticleDOI
TL;DR: A general purpose automated planner/scheduler is described which generates parallel plans to achieve goals with imposed time constraints; both durations and start time windows may be specified for sets of goal conditions.
Abstract: A general purpose automated planner/scheduler is described which generates parallel plans to achieve goals with imposed time constraints. Both durations and start time windows may be specified for sets of goal conditions. The parallel plans consist of not just actions but also of events (triggered by circumstances), inferences, and scheduled events (completely beyond the actor's control). Deterministic durations of all such activities are explicitly modeled, and may be any computable function of the activity variables. A start time window for each activity in the plan is updated dynamically during plan generation, in order to maintain consistency with the windows and durations of adjacent activities and goals. The plans are tailored around scheduled events. The final plan network resembles a PERT chart. From this a schedule of nominal start times for each activity is generated. Examples are drawn from the traditional blocksworld and also from a realistic ``Spaceworld,'' in which an autonomous spacecraft photographs objects in deep space and transmits the information to Earth.

423 citations


Journal ArticleDOI
TL;DR: A method for automated construction of classifications called conceptual clustering is described and compared to methods used in numerical taxonomy, in which descriptive concepts are conjunctive statements involving relations on selected object attributes and optimized according to an assumed global criterion of clustering quality.
Abstract: A method for automated construction of classifications called conceptual clustering is described and compared to methods used in numerical taxonomy. This method arranges objects into classes representing certain descriptive concepts, rather than into classes defined solely by a similarity metric in some a priori defined attribute space. A specific form of the method is conjunctive conceptual clustering, in which descriptive concepts are conjunctive statements involving relations on selected object attributes and optimized according to an assumed global criterion of clustering quality. The method, implemented in program CLUSTER/2, is tested together with 18 numerical taxonomy methods on two exemplary problems: 1) a construction of a classification of popular microcomputers and 2) the reconstruction of a classification of selected plant disease categories. In both experiments, the majority of numerical taxonomy methods (14 out of 18) produced results which were difficult to interpret and seemed to be arbitrary. In contrast to this, the conceptual clustering method produced results that had a simple interpretation and corresponded well to solutions preferred by people.

342 citations


Journal ArticleDOI
TL;DR: ACRONYM as mentioned in this paper is a comprehensive domain independent model-based system for vision and manipulation related tasks, which uses invariants for image feature prediction and makes predictions of image features and their relations from 3D geometric models.
Abstract: ACRONYM is a comprehensive domain independent model-based system for vision and manipulation related tasks. Many of its submodules and representations have been described elsewhere. Here the derivation and use of invariants for image feature prediction is described. Predictions of image features and their relations are made from three-dimensional geometric models. Instructions are generated which tell the interpretation algorithms how to make use of image feature measurements to derive three-dimensional size, structural, and spatial constraints on the original three-dimensional models. Some preliminary examples of ACRONYM's interpretations of aerial images are shown.

338 citations


Journal ArticleDOI
TL;DR: In this article, the authors describe an approach to the recognition of stacked objects with planar and curved surfaces. Their system works in two phases: in the learning phase, scenes each containing a single object are shown one at a time; in the recognition phase, the description of an unknown scene is matched to the object models so that the stacked objects are recognized sequentially.
Abstract: This paper describes an approach to the recognition of stacked objects with planar and curved surfaces. The system works in two phases. In the learning phase, scenes each containing a single object are shown one at a time. The range data of a scene are obtained by a range finder. The description of each scene is built in terms of properties of regions and relations between them. This description is stored as an object model. In the recognition phase, an unknown scene is described in the same way as in the learning phase. Then the description is matched to the object models so that the stacked objects are recognized sequentially. Efficient matching is achieved by a combination of data-driven and model-driven search processes. Experimental results for blocks and machine parts are shown.

250 citations


Journal ArticleDOI
TL;DR: An Automated Lumber Processing System (ALPS) that employs computer tomography, optical scanning technology, the calculation of an optimum cutting strategy, and a computer-driven laser cutting device is described.
Abstract: Continued increases in the cost of materials and labor make it imperative for furniture manufacturers to control costs by improved yield and increased productivity. This paper describes an Automated Lumber Processing System (ALPS) that employs computer tomography, optical scanning technology, the calculation of an optimum cutting strategy, and a computer-driven laser cutting device. While certain major hardware components of ALPS are already commercially available, a major missing element is the automatic inspection system needed to locate and identify surface defects on boards. This paper reports research aimed at developing such an inspection system. The basic strategy is to divide the digital image of a board into a number of disjoint rectangular regions and classify each independently. This simple procedure has the advantage of allowing an obvious parallel processing implementation. The study shows that measures of tonal and pattern related qualities are needed. The tonal measures are the mean, variance, skewness, and kurtosis of the gray levels. The pattern related measures are those based on cooccurrence matrices. In this initial feasibility study, these combined measures yielded an overall 88.3 percent correct classification on the eight defects most commonly found in lumber. To minimize the number of calculations needed to make the required classifications, a sequential classifier is proposed.
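
The tonal measures named above are straightforward to compute for each rectangular region; a minimal sketch (the function name and interface are assumptions, and the cooccurrence-based pattern measures are omitted):

```
import numpy as np

def tonal_features(region):
    """Mean, variance, skewness, and kurtosis of the gray levels in one
    rectangular board region (illustrative sketch of the tonal measures)."""
    g = np.asarray(region, dtype=float).ravel()
    mean = g.mean()
    var = g.var()
    sd = np.sqrt(var) if var > 0 else 1.0
    skew = np.mean(((g - mean) / sd) ** 3)
    kurt = np.mean(((g - mean) / sd) ** 4)
    return mean, var, skew, kurt
```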

238 citations


Journal ArticleDOI
TL;DR: In this article, a nonparametric method of discriminant analysis is proposed, based on nonparametric extensions of commonly used scatter matrices, that works well even for non-Gaussian data sets; a procedure is also proposed to test the structural similarity of two distributions.
Abstract: A nonparametric method of discriminant analysis is proposed. It is based on nonparametric extensions of commonly used scatter matrices. Two advantages result from the use of the proposed nonparametric scatter matrices. First, they are generally of full rank. This provides the ability to specify the number of extracted features desired. This is in contrast to parametric discriminant analysis, which for an L class problem typically can determine at most L − 1 features. Second, the nonparametric nature of the scatter matrices allows the procedure to work well even for non-Gaussian data sets. Using the same basic framework, a procedure is proposed to test the structural similarity of two distributions. The procedure works in high-dimensional space. It specifies a linear decomposition of the original data space in which a relative indication of dissimilarity along each new basis vector is provided. The nonparametric scatter matrices are also used to derive a clustering procedure, which is recognized as a k-nearest neighbor version of the nonparametric valley seeking algorithm. The form which results provides a unified view of the parametric nearest mean reclassification algorithm and the nonparametric valley seeking algorithm.

Journal ArticleDOI
TL;DR: The effectiveness of the theory of fuzzy sets in detecting different regional boundaries of X-ray images is demonstrated and the system performance for different parameter conditions is illustrated by application to an image of a radiograph of the wrist.
Abstract: The effectiveness of the theory of fuzzy sets in detecting different regional boundaries of X-ray images is demonstrated. The algorithm includes a prior enhancement of the contrast among the regions (having small change in gray levels) using the contrast intensification (INT) operation along with smoothing in the fuzzy property plane before detecting its edges. The property plane is extracted from the spatial domain using the S, π, and (1 − π) functions and the fuzzifiers. Final edge detection is achieved using the max or min operator. The system performance for different parameter conditions is illustrated by application to an image of a radiograph of the wrist.
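
The contrast intensification (INT) operation is commonly defined as a piecewise quadratic that pushes memberships away from 0.5; a small sketch under that assumption:

```
import numpy as np

def intensify(mu, passes=1):
    """Contrast intensification (INT) on a fuzzy membership plane:
    memberships below 0.5 move toward 0, those above move toward 1."""
    mu = np.asarray(mu, dtype=float)
    for _ in range(passes):
        mu = np.where(mu <= 0.5, 2.0 * mu ** 2, 1.0 - 2.0 * (1.0 - mu) ** 2)
    return mu
```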

Journal ArticleDOI
TL;DR: Considering the Hough transformation as a linear imaging process recasts certain well-known problems, provides a useful vocabulary, and possibly indicates a source of applicable literature on the behavior of the Hough transformation in various forms of noise.
Abstract: Considering the Hough transformation as a linear imaging process recasts certain well-known problems, provides a useful vocabulary, and possibly indicates a source of applicable literature on the behavior of the Hough transformation in various forms of noise. A consideration of the analytic form of peaks in parameter space sets the stage for the idea of using complementary (negative) votes to cancel off-peak positive votes in parameter space, thus sharpening peaks and reducing bias.
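
One way to picture the complementary-vote idea is a toy line Hough accumulator in which each positive vote is paired with negative votes at nearby parameter bins, sharpening peaks; the offsets, weights, and names below are illustrative assumptions rather than the paper's analysis:

```
import numpy as np

def hough_with_negative_votes(points, n_theta=180, n_rho=200, rho_max=200.0,
                              neg_weight=0.5, offset=2):
    """Line Hough accumulator: +1 along each point's sinusoid, -neg_weight
    at bins displaced by +/- offset in rho (toy complementary voting)."""
    acc = np.zeros((n_theta, n_rho))
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    rows = np.arange(n_theta)
    for x, y in points:
        rho = x * np.cos(thetas) + y * np.sin(thetas)
        k = np.round((rho + rho_max) / (2 * rho_max) * (n_rho - 1)).astype(int)
        ok = (k >= 0) & (k < n_rho)
        acc[rows[ok], k[ok]] += 1.0
        for d in (-offset, offset):
            kk = k + d
            okd = (kk >= 0) & (kk < n_rho)
            acc[rows[okd], kk[okd]] -= neg_weight
    return acc, thetas
```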

Journal ArticleDOI
TL;DR: An evaluation of four clustering methods and four external criterion measures was conducted with respect to the effect of the number of clusters, dimensionality, and relative cluster sizes on the recovery of true cluster structure, and results indicated that the four criterion measures were generally consistent with each other.
Abstract: An evaluation of four clustering methods and four external criterion measures was conducted with respect to the effect of the number of clusters, dimensionality, and relative cluster sizes on the recovery of true cluster structure. The four methods were the single link, complete link, group average (UPGMA), and Ward's minimum variance algorithms. The results indicated that the four criterion measures were generally consistent with each other, of which two highly similar pairs were identified. The first pair consisted of the Rand and corrected Rand statistics, and the second pair was the Jaccard and the Fowlkes and Mallows indexes. With respect to the methods, recovery was found to improve as the number of clusters increased and as the number of dimensions increased. The relative cluster size factor produced differential performance effects, with Ward's procedure providing the best recovery when the clusters were of equal size. The group average method gave equivalent or better recovery when the clusters were of unequal size.
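
The pair-counting criterion measures compared in the study can be computed from four pair counts; a short sketch (the corrected Rand index, which additionally adjusts for chance agreement, is omitted):

```
from itertools import combinations

def pair_counting_indices(labels_true, labels_found):
    """Rand, Jaccard, and Fowlkes-Mallows indexes from pair counts."""
    a = b = c = d = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_t = labels_true[i] == labels_true[j]
        same_f = labels_found[i] == labels_found[j]
        if same_t and same_f:        a += 1   # together in both partitions
        elif same_t and not same_f:  b += 1   # together only in the true partition
        elif same_f:                 c += 1   # together only in the recovered partition
        else:                        d += 1   # apart in both
    rand = (a + d) / (a + b + c + d)
    jaccard = a / (a + b + c) if (a + b + c) else 1.0
    fm = a / ((a + b) * (a + c)) ** 0.5 if (a + b) and (a + c) else 0.0
    return rand, jaccard, fm
```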

Journal ArticleDOI
TL;DR: A laser time-of-flight range scanner capable of acquiring a reasonable rangepic of 64 × 64 spatial resolution in 4 s is described; however, at the current state of development, 100 samples/point are required to achieve a range accuracy of approximately ±? cm.
Abstract: The requirements of three-dimensional scene analysis in support of vision driven robotic manipulation are such that direct range finding capabilities can greatly reduce the computational burden (time and complexity) for reliable determination of the placement and spatial extent of objects in the scene. This paper describes a laser time-of-flight range scanner capable of acquiring a reasonable rangepic of 64 × 64 spatial resolution in 4 s. However, at the current state of development, 100 samples/point are required to achieve a range accuracy of approximately ±? cm, with a scan time of 40 s. Returned signal amplitude dependency is also apparent in the range determination, the dynamic range of intensity being considerable. Improved accuracy can be traded for time. The vision/robotics laboratory environment within which the range scanner operates is briefly touched upon and some preliminary rangepic results presented.

Journal ArticleDOI
TL;DR: A fractal curve first discussed by Giuseppe Peano has useful statistical properties in image processing, and applications include multispectral image display, compression in the color and spatial domain, classification, and color display.
Abstract: A fractal curve first discussed by Giuseppe Peano has useful statistical properties in image processing. The curve can act as a transform between a line and n-dimensional space retaining some of the spatially associative properties of the space. Applications of the curve include multispectral image display, compression in the color and spatial domain, classification, and color display. The technique enables the display of high quality color images on a frame store capable of displaying only 256 colors simultaneously. A logically ordered and complete set of colors for false color work can be selected. An improvement in compression by about 3X for color images can be obtained by using this method in conjunction with an existing spatial compression technique. A simple algorithm for constructing the curves is shown.

Journal ArticleDOI
TL;DR: A modified Mellin transform for digital implementation is developed and applied to range radar profiles of naval vessels and results in the desired insensitivity without having the low-pass filtering characteristic that exists in other Fourier-Mellin implementations.
Abstract: A modified Mellin transform for digital implementation is developed and applied to range radar profiles of naval vessels. The scale invariance property of the Mellin transform provides a means for extracting features from the profiles which are insensitive to the aspect angle of the radar. Past implementations of the Mellin transform based on the FFT have required exponential sampling, interpolation, and the computation of a correction term, all of which introduce errors into the transform. In addition, exponential sampling results in a factor of ln N increase in the number of data points. An alternate implementation, developed in the paper, utilizes a direct expansion of the Mellin integral definition. This direct Mellin transform (DMT) eliminates the implementation problems associated with the FFT approach, and does not increase the number of samples. A scale and translation invariant transform, the modified DMT (MDMT), is then developed from a modification of the DMT. The MDMT applied to the FFT of the radar profiles results in the desired insensitivity without having the low-pass filtering characteristic that exists in other Fourier-Mellin implementations.
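
A minimal sketch of the direct-expansion idea: approximate the Mellin integral by a sum over the uniformly sampled profile and keep the magnitude, which is insensitive to a rescaling of the argument. The sampling convention and function name are assumptions, not the paper's exact DMT:

```
import numpy as np

def direct_mellin_magnitude(f, omegas):
    """|M(w)| with M(w) = sum_k f[k] * k**(j*w - 1), k = 1..N.
    For g(x) = f(a*x) the transform picks up only a unit-modulus factor
    a**(-j*w), so the magnitude is scale-insensitive."""
    f = np.asarray(f, dtype=float)
    k = np.arange(1, len(f) + 1, dtype=float)
    return np.array([abs(np.sum(f * np.exp((1j * w - 1.0) * np.log(k))))
                     for w in omegas])
```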

Journal ArticleDOI
TL;DR: The correlation coefficients are used for segmentation according to texture and are first evaluated on a set of square regions forming two levels of the quadratic picture tree (or pyramid).
Abstract: The correlation coefficients are used for segmentation according to texture. They are first evaluated on a set of square regions forming two levels of the quadratic picture tree (or pyramid). If the coefficients of a square and its four children in the tree are similar, then that region is considered to be of uniform texture. If not, it is replaced by its children. In this way, the split-and-merge algorithm is used to achieve a preliminary segmentation. It is followed by a grouping algorithm using the correlation coefficients and the region adjacency graph, plus a small region elimination step. The latter regions are grouped according to their gray level because texture cannot be defined reliably on very small regions. Examples of implementation on four pictures are included.
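
The split step can be sketched as a recursive comparison of a square's texture features with those of its four quadtree children; the particular correlation features, threshold, and names below are illustrative assumptions (the merge, grouping, and small-region steps are omitted):

```
import numpy as np

def corr_features(block):
    """Lag-1 horizontal and vertical correlation coefficients of a block."""
    b = np.asarray(block, dtype=float)
    b = b - b.mean()
    denom = (b * b).sum() or 1.0
    return np.array([(b[:, :-1] * b[:, 1:]).sum() / denom,
                     (b[:-1, :] * b[1:, :]).sum() / denom])

def split(image, i, j, size, tol=0.1, min_size=8):
    """Keep a square whose correlation coefficients match those of its four
    children; otherwise replace it by the children (sizes assumed powers of 2)."""
    parent = corr_features(image[i:i + size, j:j + size])
    half = size // 2
    children = [(i, j), (i, j + half), (i + half, j), (i + half, j + half)]
    feats = [corr_features(image[y:y + half, x:x + half]) for y, x in children]
    uniform = all(np.abs(f - parent).max() <= tol for f in feats)
    if uniform or half < min_size:
        return [(i, j, size)]
    regions = []
    for y, x in children:
        regions.extend(split(image, y, x, half, tol, min_size))
    return regions
```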

Journal ArticleDOI
TL;DR: This paper presents a method for the direct computation of the focus of expansion using an optimization approach, and shows how the optical flow can be computed using thefocus of expansion.
Abstract: Optical flow carries valuable information about the nature and depth of surfaces and the relative motion between observer and objects. In the extraction of this information, the focus of expansion plays a vital role. In contrast to the current approaches, this paper presents a method for the direct computation of the focus of expansion using an optimization approach. The optical flow can then be computed using the focus of expansion.
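
One common way to set up such an optimization, sketched here under the assumption of a purely translating observer: every flow vector should be collinear with the ray from the focus of expansion, which makes the FOE a linear least-squares estimate (an illustration of the idea, not necessarily the paper's exact formulation):

```
import numpy as np

def focus_of_expansion(x, y, u, v):
    """Least-squares FOE (x0, y0) from flow vectors (u, v) at points (x, y):
    minimize sum [u*(y - y0) - v*(x - x0)]**2, which is linear in (x0, y0)."""
    A = np.column_stack([v, -u])
    rhs = v * x - u * y
    (x0, y0), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return x0, y0
```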

Journal ArticleDOI
TL;DR: The results make it theoretically possible to identify extremal edges of an intensity function f(x, y) of two variables by considering the gradient vector field V = ∇f; the properties needed are established using vector analysis and differential geometry and are then used in three ways.
Abstract: We use rotational and curvature properties of vector fields to identify critical features of an image. Using vector analysis and differential geometry, we establish the properties needed, and then use these properties in three ways. First, our results make it theoretically possible to identify extremal edges of an intensity function f(x, y) of two variables by considering the gradient vector field V = ∇f. There is also enough information in ∇f to find regions of high curvature (i.e., high curvature of the level paths of f). For color images, we use the vector field V = (I, Q). In application, the image is partitioned into a grid of squares. On the boundary of each square, V/|V| is sampled, and these unit vectors are used as the tangents of a curve Γ. The rotation number (or topological degree) and the average curvature of Γ are computed for each square. Analysis of these numbers yields information on edges and curvature. Experimental results from both simulated and real data are described.
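
The rotation number of the sampled boundary field can be computed by accumulating wrapped angle increments around the closed curve; a small sketch (sampling and naming are assumptions):

```
import numpy as np

def rotation_number(angles):
    """Topological degree of a unit vector field sampled in order around a
    closed boundary: sum the wrapped angle increments and divide by 2*pi."""
    a = np.asarray(angles, dtype=float)
    d = np.diff(np.append(a, a[0]))              # include the step that closes the loop
    d = (d + np.pi) % (2.0 * np.pi) - np.pi      # wrap each increment into (-pi, pi]
    return d.sum() / (2.0 * np.pi)

# the boundary angles would come from the gradient field, e.g.
# angles = np.arctan2(gy_on_boundary, gx_on_boundary)
```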

Journal ArticleDOI
Robert M. Haralick1
TL;DR: From a Bayesian decision theoretic framework, it is shown that the reason why the usual statistical approaches do not take context into account is because of the assumptions made on the joint prior probability function and because of the simplistic loss function chosen.
Abstract: From a Bayesian decision theoretic framework, we show that the reason why the usual statistical approaches do not take context into account is because of the assumptions made on the joint prior probability function and because of the simplistic loss function chosen. We illustrate how the constraints sometimes employed by artificial intelligence researchers constitute a different kind of assumption on the joint prior probability function. We discuss a couple of loss functions which do take context into account and when combined with the joint prior probability constraint create a decision problem requiring a combinatorial state space search. We also give a theory for how probabilistic relaxation works from a Bayesian point of view.

Journal ArticleDOI
TL;DR: This paper describes a method for handling this case: a known object is detected by finding changes in orientation, translation, and scale of the object from its canonical description; the method is a Hough technique and has the characteristic insensitivity to occlusion and noise.
Abstract: An important problem in vision is to detect the presence of a known rigid 3-D object. The general 3-D object recognition task can be thought of as building a description of the object that must have at least two parts: 1) the internal description of the object itself (with respect to an object-centered frame); and 2) the transformation of the object-centered frame to the viewer-centered (image) frame. The reason for this decomposition is parsimony: different views of the object should have minimal impact on its description. This is achieved by factoring the object's description into two sets of parameters, one which is view-independent (the object-centered component) and one which is view-varying (the viewing transformation). Often a description of the object is known beforehand and the task reduces to finding the object-frame to viewer-frame transformation. This paper describes a method for handling this case: a known object is detected by finding changes in orientation, translation, and scale of the object from its canonical description. The method is a Hough technique and has the characteristic insensitivity to occlusion and noise.

Journal ArticleDOI
TL;DR: Problems and methods in automating visual inspection of printed circuit boards (PCB's) are presented, and an algorithm comparing local features of the patterns to be inspected with those of a reference pattern is proposed.
Abstract: The purpose of this correspondence is to present problems and methods in automating visual inspection of printed circuit boards (PCB's). Vertical and diagonal illumination are useful in detecting PCB patterns correctly. An algorithm comparing local features of the patterns to be inspected with those of a reference pattern is proposed. An inspection system using the developed technologies is also described.

Journal ArticleDOI
TL;DR: In this paper, the problem of image segmentation is considered in the context of a mixture of probability distributions, where segments fall into classes and a probability distribution is associated with each class of segment.
Abstract: The problem of image segmentation is considered in the context of a mixture of probability distributions. The segments fall into classes. A probability distribution is associated with each class of segment. Parametric families of distributions are considered, a set of parameter values being associated with each class. With each observation is associated an unobservable label, indicating from which class the observation arose. Segmentation algorithms are obtained by applying a method of iterated maximum likelihood to the resulting likelihood function. A numerical example is given. Choice of the number of classes, using Akaike's information criterion (AIC) for model identification, is illustrated.
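
An illustrative sketch of iterated maximum likelihood for a one-dimensional Gaussian mixture, with AIC computed so the number of classes can be chosen by minimizing it (an EM-style simplification; the paper's parametric families and update details may differ):

```
import numpy as np

def gaussian_mixture_ml(x, k, iters=100):
    """Fit a k-class 1-D Gaussian mixture by iterated maximum likelihood;
    class membership plays the role of the unobservable label."""
    x = np.asarray(x, dtype=float)
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))
    var = np.full(k, x.var() / k)
    w = np.full(k, 1.0 / k)

    def densities():
        return w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

    for _ in range(iters):
        resp = densities()
        resp /= resp.sum(axis=1, keepdims=True)           # posterior label probabilities
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    loglik = np.log(densities().sum(axis=1)).sum()
    aic = -2.0 * loglik + 2.0 * (3 * k - 1)               # weights + means + variances
    return mu, var, w, aic                                # choose k minimizing AIC
```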

Journal ArticleDOI
Perry L. Miller1
TL;DR: A heuristic approach to risk analysis, based on three basic principles which are described in detail, is outlined; ATTENDING's critiquing approach may prove more acceptable clinically and may avoid certain social, medical, and medicolegal drawbacks.
Abstract: The ATTENDING system is designed to critique a physician's preoperative plan for anesthetic management. In undertaking to critique a physician's plan, ATTENDING differs from other medical decision making systems, which in effect attempt to tell a physician what to do. ATTENDING's approach may prove more acceptable clinically, and may avoid certain social, medical, and medicolegal drawbacks. To critique a physician's plan, ATTENDING must confront three basic problems. 1) It must be able to explore flexibly all possible approaches for a patient's management. The formalism of an ``augmented decision network'' allows this. 2) It must be able to assess the relative risks and benefits of alternative approaches intelligently. A heuristic approach to risk analysis is outlined, based on three basic principles which are described in detail. 3) It must produce a potentially complex analysis which critiques the plan in focused, readable prose. This is facilitated by PROSENET, an approach which allows clean separation between the organization of the content of an analysis and its expression in English prose.

Journal ArticleDOI
TL;DR: It is shown that the envelopes of deterministic autocorrelations have essentially a cosine-like behavior but with jump discontinuities at points where the normalized relative displacement is the reciprocal of an integer.
Abstract: The auto/cross correlation of L2 functions are constrained by certain bounds which may often be used to advantage. These bounds apply to all the common cross correlation functions used for registration purposes (called ``deterministic'' correlation functions in this paper, as opposed to stochastic correlation based on non-L2 functions). It is shown that the envelopes of deterministic autocorrelations have essentially a cosine-like behavior but with jump discontinuities at points where the normalized relative displacement is the reciprocal of an integer. Several inequalities extending these results are given. It is shown how these can be applied toward obtaining improved registration algorithms.

Journal ArticleDOI
TL;DR: An inspection system designed to automatically detect and classify surface imperfections occurring on continuously cast hot steel slabs in a steel mill, which uses statistical as well as syntactic/semantic classification techniques.
Abstract: This paper describes an inspection system designed to automatically detect and classify surface imperfections occurring on continuously cast hot steel slabs in a steel mill. The need to perform inspection at real-time throughput rates of 546 Kpixels/s using off-the-shelf components has led to the development of a unique architecture and algorithm set. The segmentation operations are done in a high-speed array processor front end. The imperfection classification is done in a mini-computer which uses statistical as well as syntactic/semantic classification techniques. The inspection system is currently undergoing evaluation in a steel mill.

Journal ArticleDOI
D. J. Burr1
TL;DR: The simpler problem of recognizing separated hand-written letters is addressed here, and a warp-based character matching technique is employed with dictionary lookup, using a 16 000 word vocabulary.
Abstract: Computer recognition of cursive handwriting is complicated by the lack of letter separation. In an attempt to gain insight on the cursive problem, the simpler problem of recognizing separated hand-written letters is addressed here. A warp-based character matching technique is employed with dictionary lookup, using a 16 000 word vocabulary. Cooperative users achieve high recognition accuracy with this on-line system, which is easily tailored to the individual. Possible extensions to cursive writing are discussed.
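
Warp-based matching of two feature sequences is typically a dynamic-programming alignment; a minimal sketch (the feature extraction, cost, and names are assumptions, not the paper's exact matcher):

```
import numpy as np

def warp_distance(a, b):
    """Dynamic-programming warp distance between two feature sequences."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# recognition: compare the unknown letter's sequence against stored templates,
# keep the nearest, then validate word hypotheses against the lexicon
```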

Journal ArticleDOI
Chih-Shing Ho1
TL;DR: The precision of geometric features such as the centroid, area, perimeter, and orientation measured by a 2-D digital vision system is analyzed and tested experimentally and the sampling process can be simplified or eliminated.
Abstract: The precision of geometric features such as the centroid, area, perimeter, and orientation measured by a 2-D digital vision system is analyzed and tested experimentally. The digitizing error of various geometric features can be expressed in terms of the dimensionless perimeter of the object. As a result of this work, the sampling process can be simplified or eliminated. The analysis is also expanded to cover 3-D digital vision systems.

Journal ArticleDOI
TL;DR: A criterion which measures the quality of the estimate of the covariance matrix of a multivariate normal distribution is developed and the necessary number of training samples is predicted.
Abstract: In this paper a criterion which measures the quality of the estimate of the covariance matrix of a multivariate normal distribution is developed. Based on this criterion, the necessary number of training samples is predicted. Experimental results which are used as a guide for determining the number of training samples are included.