# Papers in IEEE Transactions on Pattern Analysis and Machine Intelligence, 1993

••

TL;DR: Efficient algorithms are presented for computing the Hausdorff distance under all possible relative positions of a binary image and a model, and the method is shown to extend naturally to comparing a portion of a model against an image.

Abstract: The Hausdorff distance measures the extent to which each point of a model set lies near some point of an image set and vice versa. Thus, this distance can be used to determine the degree of resemblance between two objects that are superimposed on one another. Efficient algorithms for computing the Hausdorff distance between all possible relative positions of a binary image and a model are presented. The focus is primarily on the case in which the model is only allowed to translate with respect to the image. The techniques are extended to rigid motion. The Hausdorff distance computation differs from many other shape comparison methods in that no correspondence between the model and the image is derived. The method is quite tolerant of small position errors such as those that occur with edge detectors and other feature extraction methods. It is shown that the method extends naturally to the problem of comparing a portion of a model against an image.
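As a concrete illustration of the definition above (not the paper's efficient algorithms, which avoid this brute-force scan), the Hausdorff distance between two small made-up point sets can be computed directly:

```python
import math

def directed_hausdorff(A, B):
    # h(A, B): the largest distance from a point of A to its nearest point in B.
    return max(min(math.dist(a, b) for b in B) for a in A)

def hausdorff(A, B):
    # Symmetric Hausdorff distance: each point of either set must lie near the other set.
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))

model = [(0, 0), (1, 0), (1, 1)]  # illustrative toy point sets
image = [(0, 0), (1, 0), (1, 2)]
print(hausdorff(model, image))  # 1.0
```

Note that no model-to-image correspondence is computed anywhere, which is the property the abstract highlights.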

4,194 citations

••

TL;DR: A method for rapid visual recognition of personal identity is described, based on the failure of a statistical test of independence, which implies a theoretical "cross-over" error rate of one in 131,000 when a decision criterion is adopted that would equalize the false accept and false reject error rates.

Abstract: A method for rapid visual recognition of personal identity is described, based on the failure of a statistical test of independence. The most unique phenotypic feature visible in a person's face is the detailed texture of each eye's iris. The visible texture of a person's iris in a real-time video image is encoded into a compact sequence of multi-scale quadrature 2-D Gabor wavelet coefficients, whose most-significant bits comprise a 256-byte "iris code". Statistical decision theory generates identification decisions from Exclusive-OR comparisons of complete iris codes at the rate of 4000 per second, including calculation of decision confidence levels. The distributions observed empirically in such comparisons imply a theoretical "cross-over" error rate of one in 131,000 when a decision criterion is adopted that would equalize the false accept and false reject error rates. In the typical recognition case, given the mean observed degree of iris code agreement, the decision confidence levels correspond formally to a conditional false accept probability of one in about 10^31.
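The Exclusive-OR comparison described above amounts to a normalized Hamming distance between two 256-byte codes. A minimal sketch; the toy byte patterns below are illustrative, not Daugman's actual Gabor-phase encoding:

```python
def hamming_fraction(code_a, code_b):
    # Fraction of disagreeing bits between two equal-length iris codes,
    # computed with a single big-integer Exclusive-OR.
    assert len(code_a) == len(code_b)
    diff = int.from_bytes(code_a, "big") ^ int.from_bytes(code_b, "big")
    return bin(diff).count("1") / (8 * len(code_a))

# Toy 256-byte codes (real iris codes come from Gabor wavelet phase bits).
a = bytes([0b10110000] * 256)
b = bytes([0b10010001] * 256)
print(hamming_fraction(a, b))  # 0.25
```

A decision criterion is then just a threshold on this fraction, which is what makes millions of comparisons per second feasible.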

3,399 citations

••

TL;DR: Two new algorithms for computer recognition of human faces, one based on the computation of a set of geometrical features, such as nose width and length, mouth position, and chin shape, and the second based on almost-gray-level template matching are presented.

Abstract: Two new algorithms for computer recognition of human faces, one based on the computation of a set of geometrical features, such as nose width and length, mouth position, and chin shape, and the second based on almost-gray-level template matching, are presented. The results obtained for the testing sets show about 90% correct recognition using geometrical features and perfect recognition using template matching.
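Gray-level template matching of the kind described can be sketched in one dimension as a sliding sum-of-squared-differences search; the scanline and template below are made-up toy data, not face imagery:

```python
def ssd(a, b):
    # Sum of squared gray-level differences between two equal-length windows.
    return sum((p - q) ** 2 for p, q in zip(a, b))

def match_template(image, template):
    # Slide the template across the scanline; the best match is the
    # offset with the smallest SSD.
    n, m = len(image), len(template)
    return min(range(n - m + 1), key=lambda off: ssd(image[off:off + m], template))

scanline = [10, 10, 50, 80, 50, 10, 10]  # made-up gray levels
template = [50, 80, 50]
print(match_template(scanline, template))  # 2
```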

2,671 citations

••

TL;DR: A 3-D generalization of the balloon model as a 3-D deformable surface, which evolves in 3-D images, is presented, and properties of energy-minimizing surfaces concerning their relationship with 3-D edge points are shown.

Abstract: The use of energy-minimizing curves, known as "snakes," to extract features of interest in images has been introduced by Kass, Witkin, and Terzopoulos (1987). A balloon model was introduced by Cohen (1991) as a way to generalize and solve some of the problems encountered with the original method. A 3-D generalization of the balloon model as a 3-D deformable surface, which evolves in 3-D images, is presented. It is deformed under the action of internal and external forces attracting the surface toward detected edgels by means of an attraction potential. We also show properties of energy-minimizing surfaces concerning their relationship with 3-D edge points. To solve the minimization problem for a surface, two simplified approaches are shown first, defining a 3-D surface as a series of 2-D planar curves. Then, after comparing the finite-element method and finite-difference method in the 2-D problem, we solve the 3-D model using the finite-element method, yielding greater stability and faster convergence. This model is applied to segmenting magnetic resonance images.

1,576 citations

••

TL;DR: A novel graph-theoretic approach for data clustering is presented and applied to image segmentation; the fast algorithm yields an optimal solution equivalent to that obtained by partitioning the complete equivalent tree and can handle very large graphs with several hundred thousand vertices.

Abstract: A novel graph theoretic approach for data clustering is presented and its application to the image segmentation problem is demonstrated. The data to be clustered are represented by an undirected adjacency graph G with arc capacities assigned to reflect the similarity between the linked vertices. Clustering is achieved by removing arcs of G to form mutually exclusive subgraphs such that the largest inter-subgraph maximum flow is minimized. For graphs of moderate size (approximately 2000 vertices), the optimal solution is obtained through partitioning a flow and cut equivalent tree of G, which can be efficiently constructed using the Gomory-Hu algorithm (1961). However, for larger graphs this approach is impractical. New theorems for subgraph condensation are derived and are then used to develop a fast algorithm which hierarchically constructs and partitions a partially equivalent tree of much reduced size. This algorithm results in an optimal solution equivalent to that obtained by partitioning the complete equivalent tree and is able to handle very large graphs with several hundred thousand vertices. The new clustering algorithm is applied to the image segmentation problem. The segmentation is achieved by effectively searching for closed contours of edge elements (equivalent to minimum cuts in G), which consist mostly of strong edges, while rejecting contours containing isolated strong edges. This method is able to accurately locate region boundaries and at the same time guarantees the formation of closed edge contours.

1,223 citations

••

TL;DR: It is shown that the SSSD-in-inverse-distance function exhibits a unique and clear minimum at the correct matching position, even when the underlying intensity patterns of the scene include ambiguities or repetitive patterns.

Abstract: A stereo matching method that uses multiple stereo pairs with various baselines generated by a lateral displacement of a camera to obtain precise distance estimates without suffering from ambiguity is presented. Matching is performed simply by computing the sum of squared-difference (SSD) values. The SSD functions for individual stereo pairs are represented with respect to the inverse distance and are then added to produce the sum of SSDs. This resulting function is called the SSSD-in-inverse-distance. It is shown that the SSSD-in-inverse-distance function exhibits a unique and clear minimum at the correct matching position, even when the underlying intensity patterns of the scene include ambiguities or repetitive patterns. The authors first define a stereo algorithm based on the SSSD-in-inverse-distance and present a mathematical analysis to show how the algorithm can remove ambiguity and increase precision. Experimental results with real stereo images are presented to demonstrate the effectiveness of the algorithm.
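The SSSD-in-inverse-distance idea can be illustrated with synthetic SSD curves: each baseline produces periodic minima over inverse distance (the ambiguity), but only the true inverse distance minimizes every curve, so the sum has a unique minimum. The cosine-shaped curves below are a toy stand-in for real SSD values:

```python
import math

def ssd_curve(zeta, baseline, zeta_true):
    # Toy SSD for one stereo pair of a repetitive scene: minima repeat
    # with a period that depends on the baseline (the matching ambiguity).
    return 1.0 - math.cos(2 * math.pi * baseline * (zeta - zeta_true))

zeta_true = 0.4           # assumed true inverse distance (illustrative)
baselines = [1.0, 1.5, 2.0]
candidates = [k / 100 for k in range(101)]

# SSSD-in-inverse-distance: add the per-pair curves over a common
# inverse-distance axis, then take the argmin.
sssd = {z: sum(ssd_curve(z, b, zeta_true) for b in baselines) for z in candidates}
best = min(sssd, key=sssd.get)
print(best)  # 0.4
```

Each individual curve has several zero minima in [0, 1]; only zeta = 0.4 is shared by all three baselines, which is the disambiguation argument of the paper in miniature.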

1,066 citations

••

TL;DR: The reliability exhibited by texture signatures based on wavelet packet analysis suggests that the multiresolution properties of such transforms are beneficial for accomplishing segmentation, classification, and subtle discrimination of texture.

Abstract: This correspondence introduces a new approach to characterize textures at multiple scales. The performance of wavelet packet spaces is measured in terms of sensitivity and selectivity for the classification of twenty-five natural textures. Both energy and entropy metrics were computed for each wavelet packet and incorporated into distinct scale space representations, where each wavelet packet (channel) reflected a specific scale and orientation sensitivity. Wavelet packet representations for twenty-five natural textures were classified without error by a simple two-layer network classifier. An analyzing function of large regularity (D20) was shown to be slightly more efficient in representation and discrimination than a similar function with fewer vanishing moments (D6). In addition, energy representations computed from the standard wavelet decomposition alone (17 features) provided classification without error for the twenty-five textures included in our study. The reliability exhibited by texture signatures based on wavelet packet analysis suggests that the multiresolution properties of such transforms are beneficial for accomplishing segmentation, classification, and subtle discrimination of texture.
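The per-channel energy and entropy metrics can be sketched on a toy 1-D signal with a single Haar decomposition step (the paper uses 2-D wavelet packets with Daubechies D6 and D20 filters; Haar and the signal below are used here only for brevity):

```python
import math

def haar_step(x):
    # One level of the Haar transform: approximation and detail channels
    # (length of x assumed even for brevity).
    approx = [(a + b) / math.sqrt(2) for a, b in zip(x[::2], x[1::2])]
    detail = [(a - b) / math.sqrt(2) for a, b in zip(x[::2], x[1::2])]
    return approx, detail

def channel_features(c):
    # Energy of the channel, plus the Shannon entropy of its normalized
    # energy distribution -- the two per-channel texture metrics.
    energy = sum(v * v for v in c)
    if energy == 0.0:
        return 0.0, 0.0
    probs = [v * v / energy for v in c if v != 0.0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return energy, entropy

signal = [2.0, 2.0, 2.0, 2.0, 0.0, 4.0, 0.0, 4.0]  # smooth half, oscillating half
approx, detail = haar_step(signal)
energy, entropy = channel_features(detail)
```

The detail channel concentrates the energy of the oscillating half of the signal, which is the kind of scale selectivity the classifier feeds on.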

831 citations

••

TL;DR: The document spectrum (or docstrum) is a method for structural page layout analysis based on bottom-up, nearest-neighbor clustering of page components; it yields an accurate measure of skew, within-line, and between-line spacings and locates text lines and text blocks.

Abstract: Page layout analysis is a document processing technique used to determine the format of a page. This paper describes the document spectrum (or docstrum), which is a method for structural page layout analysis based on bottom-up, nearest-neighbor clustering of page components. The method yields an accurate measure of skew, within-line, and between-line spacings and locates text lines and text blocks. It is advantageous over many other methods in three main ways: independence from skew angle, independence from different text spacings, and the ability to process local regions of different text orientations within the same image. Results of the method shown for several different page formats and for randomly oriented subpages on the same image illustrate the versatility of the method. We also discuss the differences, advantages, and disadvantages of the docstrum with respect to other layout methods.

654 citations

••

TL;DR: An improved terminating criterion for the optimization scheme that is based on topographic features of the graph of the intensity image is proposed, as well as a continuation method based on a discrete scale-space representation.

Abstract: The problems of segmenting a noisy intensity image and tracking a nonrigid object in the plane are discussed. In evaluating these problems, a technique based on an active contour model commonly called a snake is examined. The technique is applied to cell locomotion and tracking studies. The snake permits both the segmentation and tracking problems to be simultaneously solved in constrained cases. A detailed analysis of the snake model, emphasizing its limitations and shortcomings, is presented, and improvements to the original description of the model are proposed. Problems of convergence of the optimization scheme are considered. In particular, an improved terminating criterion for the optimization scheme that is based on topographic features of the graph of the intensity image is proposed. Hierarchical filtering methods, as well as a continuation method based on a discrete scale-space representation, are discussed. Results for both segmentation and tracking are presented. Possible failures of the method are discussed.

644 citations

••

TL;DR: An estimation technique that uses deformable contour models (snakes) to track the nonrigid motions of facial features in video images is developed; it estimates muscle actuator controls with sufficient accuracy to permit the face model to resynthesize transient expressions.

Abstract: An approach to the analysis of dynamic facial images for the purposes of estimating and resynthesizing dynamic facial expressions is presented. The approach exploits a sophisticated generative model of the human face originally developed for realistic facial animation. The face model, which may be simulated and rendered at interactive rates on a graphics workstation, incorporates a physics-based synthetic facial tissue and a set of anatomically motivated facial muscle actuators. The estimation of dynamical facial muscle contractions from video sequences of expressive human faces is considered. An estimation technique that uses deformable contour models (snakes) to track the nonrigid motions of facial features in video images is developed. The technique estimates muscle actuator controls with sufficient accuracy to permit the face model to resynthesize transient expressions.

602 citations

••

TL;DR: A physics-based framework for 3-D shape and nonrigid motion estimation for real-time computer vision systems is presented, and a recursive shape and motion estimator that employs the Lagrange equations as a system model is developed.

Abstract: A physics-based framework for 3-D shape and nonrigid motion estimation for real-time computer vision systems is presented. The framework features dynamic models that incorporate the mechanical principles of rigid and nonrigid bodies into conventional geometric primitives. Through the efficient numerical simulation of Lagrange equations of motion, the models can synthesize physically correct behaviors in response to applied forces and imposed constraints. Applying continuous Kalman filtering theory, a recursive shape and motion estimator that employs the Lagrange equations as a system model is developed. The system model continually synthesizes nonrigid motion in response to generalized forces that arise from the inconsistency between the incoming observations and the estimated model state. The observation forces also account formally for instantaneous uncertainties and incomplete information. A Riccati procedure updates a covariance matrix that transforms the forces in accordance with the system dynamics and prior observation history. Experiments involving model fitting and tracking of articulated and flexible objects from noisy 3-D data are described.

••

TL;DR: Interactive graphics systems driven by visual input are discussed, along with an extension of the basic technique to include structure recovery; the accuracy of the visual technique is quantitatively compared with traditional sensing.

Abstract: Interactive graphics systems that are driven by visual input are discussed. The underlying computer vision techniques and a theoretical formulation that addresses issues of accuracy, computational efficiency, and compensation for display latency are presented. Experimental results quantitatively compare the accuracy of the visual technique with traditional sensing. An extension to the basic technique to include structure recovery is discussed.

••

TL;DR: Simulations with long image sequences of real-world scenes indicate that the approach to estimating the motion of the head and facial expressions in model-based facial image coding not only greatly reduces computational complexity but also substantially improves estimation accuracy.

Abstract: An approach to estimating the motion of the head and facial expressions in model-based facial image coding is presented. An affine nonrigid motion model is set up. The specific knowledge about facial shape and facial expression is formulated in this model in the form of parameters. A direct method of estimating the two-view motion parameters that is based on the affine method is discussed. Based on the reasonable assumption that the 3-D motion of the face is almost smooth in the time domain, several approaches to predicting the motion of the next frame are proposed. Using a 3-D model, the approach is characterized by a feedback loop connecting computer vision and computer graphics. Embedding the synthesis techniques into the analysis phase greatly improves the performance of motion estimation. Simulations with long image sequences of real-world scenes indicate that the method not only greatly reduces computational complexity but also substantially improves estimation accuracy.

••

TL;DR: The proposed feature extraction algorithm has several desirable properties: it predicts the minimum number of features necessary to achieve the same classification accuracy as in the original space for a given pattern recognition problem; and it finds the necessary feature vectors.

Abstract: A novel approach to feature extraction for classification based directly on the decision boundaries is proposed. It is shown how discriminantly redundant features and discriminantly informative features are related to decision boundaries. A procedure to extract discriminantly informative features based on a decision boundary is proposed. The proposed feature extraction algorithm has several desirable properties: (1) it predicts the minimum number of features necessary to achieve the same classification accuracy as in the original space for a given pattern recognition problem; and (2) it finds the necessary feature vectors. The proposed algorithm does not deteriorate under the circumstances of equal class means or equal class covariances as some previous algorithms do. Experiments show that the performance of the proposed algorithm compares favorably with those of previous algorithms.

••

TL;DR: A general, matrix-based method using regularization is presented, which eliminates some fundamental problems with inverse filtering: inaccuracies in finding the frequency domain representation, windowing effects, and border effects.

Abstract: The concept of depth from focus involves calculating distances to points in an observed scene by modeling the effect that the camera's focal parameters have on images acquired with a small depth of field. This technique is passive and requires only a single camera. The most difficult segment of calculating depth from focus is deconvolving the defocus operator from the scene and modeling it. Most current methods for determining the defocus operator employ inverse filtering. The authors reveal some fundamental problems with inverse filtering: inaccuracies in finding the frequency domain representation, windowing effects, and border effects. A general, matrix-based method using regularization is presented, which eliminates these problems. The new method is confirmed experimentally, with the results showing an RMS error of 1.3%.

••

TL;DR: In this article, an automatic feature recognizer decomposes the total volume to be machined into volumetric features that satisfy stringent conditions for manufacturability, and correspond to operations typically performed in 3-axis machining centers.

Abstract: An automatic feature recognizer is discussed that decomposes the total volume to be machined into volumetric features that satisfy stringent conditions for manufacturability and correspond to operations typically performed in 3-axis machining centers. Unlike most of the previous research, the approach is based on general techniques for dealing with features with intersecting volumes. Feature interactions are represented explicitly in the recognizer's output, to facilitate spatial reasoning in subsequent planning stages. A generate-and-test strategy is used. OPS-5 production rules generate hints or clues for the existence of features, and post them on a blackboard. The clues are assessed, and those judged promising are processed to ensure that they correspond to actual features, and to gather information for process planning. Computational geometry techniques are used to produce the largest volumetric feature compatible with the available data. The feature's accessibility and its interactions with others are analyzed. The validity tests ensure that the proposed features are accessible, do not intrude into the desired part, and satisfy other machinability conditions. The process continues until it produces a complete decomposition of the volume to be machined into fully-specified features.

••

TL;DR: A strategy for acquiring 3-D data of an unknown scene, using range images obtained by a light stripe range finder is addressed, where the foci of attention are occluded regions and the system can resolve the appearance of occlusions by analyzing them.

Abstract: A strategy for acquiring 3-D data of an unknown scene, using range images obtained by a light stripe range finder, is addressed. The foci of attention are occluded regions, i.e., only the scene at the borders of the occlusions is modeled to compute the next move. Since the system has knowledge of the sensor geometry, it can resolve the appearance of occlusions by analyzing them. The problem of 3-D data acquisition is divided into two subproblems due to two types of occlusions. An occlusion arises either when the reflected laser light does not reach the camera or when the directed laser light does not reach the scene surface. After taking the range image of a scene, the regions of no data due to the first kind of occlusion are extracted. The missing data are acquired by rotating the sensor system in the scanning plane, which is defined by the first scan. After a complete image of the surface illuminated from the first scanning plane has been built, the regions of missing data due to the second kind of occlusions are located. Then, the directions of the next scanning planes for further 3-D data acquisition are computed.

••

TL;DR: Experiments in hand-written digit recognition are presented, revealing that the normalized edit distance consistently provides better results than both the unnormalized and post-normalized classical edit distances.

Abstract: Given two strings X and Y over a finite alphabet, the normalized edit distance between X and Y, d(X,Y), is defined as the minimum of W(P)/L(P), where P is an editing path between X and Y, W(P) is the sum of the weights of the elementary edit operations of P, and L(P) is the number of these operations (length of P). It is shown that in general, d(X,Y) cannot be computed by first obtaining the conventional (unnormalized) edit distance between X and Y and then normalizing this value by the length of the corresponding editing path. In order to compute normalized edit distances, an algorithm that can be implemented to work in O(mn^2) time and O(n^2) memory space is proposed, where m and n are the lengths of the strings under consideration, and m >= n. Experiments in hand-written digit recognition are presented, revealing that the normalized edit distance consistently provides better results than both the unnormalized and post-normalized classical edit distances.
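The algorithm can be sketched directly from the definition: tabulate, for every attainable path length k, the minimum path weight, then minimize W/k over k. The unit weights and the zero-cost-match convention below are illustrative choices, not the paper's exact weighting:

```python
INF = float("inf")

def normalized_edit_distance(x, y, w_ins=1.0, w_del=1.0, w_sub=1.0):
    # Minimize W(P)/L(P) over all edit paths P, where L(P) counts every
    # elementary operation; matches are zero-weight substitutions here.
    m, n = len(x), len(y)
    if m == 0 and n == 0:
        return 0.0
    kmax = m + n
    # W[k][i][j]: minimum weight of a length-k path editing x[:i] into y[:j]
    W = [[[INF] * (n + 1) for _ in range(m + 1)] for _ in range(kmax + 1)]
    W[0][0][0] = 0.0
    for k in range(1, kmax + 1):
        for i in range(m + 1):
            for j in range(n + 1):
                best = INF
                if j > 0:                      # insertion
                    best = min(best, W[k - 1][i][j - 1] + w_ins)
                if i > 0:                      # deletion
                    best = min(best, W[k - 1][i - 1][j] + w_del)
                if i > 0 and j > 0:            # substitution (free on a match)
                    cost = 0.0 if x[i - 1] == y[j - 1] else w_sub
                    best = min(best, W[k - 1][i - 1][j - 1] + cost)
                W[k][i][j] = best
    return min(W[k][m][n] / k for k in range(1, kmax + 1) if W[k][m][n] < INF)

print(round(normalized_edit_distance("abc", "abd"), 4))  # 0.3333
```

With m >= n the table has O(mn(m+n)) = O(mn^2) entries, matching the complexity quoted in the abstract.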

••

TL;DR: A multimodal approach to motion estimation is presented in which the computation of visual motion is based on several complementary constraints; it is shown that multiple constraints can provide more accurate flow estimation in a wide range of circumstances.

Abstract: The estimation of dense velocity fields from image sequences is basically an ill-posed problem, primarily because the data only partially constrain the solution. It is rendered especially difficult by the presence of motion boundaries and occlusion regions which are not taken into account by standard regularization approaches. In this paper, the authors present a multimodal approach to the problem of motion estimation in which the computation of visual motion is based on several complementary constraints. It is shown that multiple constraints can provide more accurate flow estimation in a wide range of circumstances. The theoretical framework relies on Bayesian estimation associated with global statistical models, namely, Markov random fields. The constraints introduced here aim to address the following issues: optical flow estimation while preserving motion boundaries, processing of occlusion regions, and fusion between gradient and feature-based motion constraint equations. Deterministic relaxation algorithms are used to merge information and to provide a solution to the maximum a posteriori estimation of the unknown dense motion field. The algorithm is well suited to a multiresolution implementation which brings an appreciable speed-up as well as a significant improvement of estimation when large displacements are present in the scene. Experiments on synthetic and real world image sequences are reported.

••

TL;DR: It is shown that even a small pixel-level perturbation may override the epipolar information that is essential for the linear algorithms to distinguish different motions, indicating the need for optimal estimation in the presence of noise.

Abstract: The causes of existing linear algorithms exhibiting various high sensitivities to noise are analyzed. It is shown that even a small pixel-level perturbation may override the epipolar information that is essential for the linear algorithms to distinguish different motions. This analysis indicates the need for optimal estimation in the presence of noise. Methods are introduced for optimal motion and structure estimation under two situations of noise distribution: known and unknown. Computationally, the optimal estimation amounts to minimizing a nonlinear function. For the correct convergence of this nonlinear minimization, a two-step approach is used. The first step is using a linear algorithm to give a preliminary estimate for the parameters. The second step is minimizing the optimal objective function starting from that preliminary estimate as an initial guess. A remarkable accuracy improvement has been achieved by this two-step approach over using the linear algorithm alone.

••

TL;DR: Experimental results showed that the LP approach is superior to both other methods in matching graphs.

Abstract: A linear programming (LP) approach is proposed for the weighted graph matching problem. A linear program is obtained by formulating the graph matching problem in the L1 norm and then transforming the resulting quadratic optimization problem to a linear one. The linear program is solved using a simplex-based algorithm. Then, approximate 0-1 integer solutions are obtained by applying the Hungarian method on the real solutions of the linear program. The complexity of the proposed algorithm is polynomial time, and it is O(n^6 L) for matching graphs of size n. The developed algorithm is compared to two other algorithms. One is based on an eigendecomposition approach and the other on a symmetric polynomial transform. Experimental results showed that the LP approach is superior to both other methods in matching graphs.

••

TL;DR: The robustness of local phase information for measuring image velocity and binocular disparity is addressed, with particular attention to the stability of phase with respect to geometric deformations and its linearity as a function of spatial position.

Abstract: This paper concerns the robustness of local phase information for measuring image velocity and binocular disparity. It addresses the dependence of phase behavior on the initial filters as well as the image variations that exist between different views of a 3D scene. We are particularly interested in the stability of phase with respect to geometric deformations, and its linearity as a function of spatial position. These properties are important to the use of phase information, and are shown to depend on the form of the filters as well as their frequency bandwidths. Phase instabilities are also discussed using the model of phase singularities described by Jepson and Fleet. In addition to phase-based methods, these results are directly relevant to differential optical flow methods and zero-crossing tracking.

••

TL;DR: It is demonstrated that the motion equations that relate the egomotion and/or the motion of the objects in the scene to the optical flow are considerably simplified if the velocity is represented in a polar or log-polar coordinate system, as opposed to a Cartesian representation.

Abstract: The application of an anthropomorphic retina-like visual sensor and the advantages of polar and log-polar mapping for visual navigation are investigated. It is demonstrated that the motion equations that relate the egomotion and/or the motion of the objects in the scene to the optical flow are considerably simplified if the velocity is represented in a polar or log-polar coordinate system, as opposed to a Cartesian representation. The analysis is conducted for tracking egomotion but is then generalized to arbitrary sensor and object motion. The main result stems from the abundance of equations that can be written directly that relate the polar or log-polar optical flow with the time to impact. Experiments performed on images acquired from real scenes are presented.
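The simplification comes from the mapping itself: a uniform image expansion (as produced by forward egomotion) becomes a constant shift along the log-radial axis, independent of image position. A minimal sketch with an illustrative scale factor:

```python
import math

def to_log_polar(x, y):
    # Cartesian image coordinates -> (log rho, theta), a retina-like mapping.
    rho = math.hypot(x, y)
    return math.log(rho), math.atan2(y, x)

# A uniform expansion of the image by scale s shifts every point by the
# same amount (log s, 0) in log-polar coordinates.
s = 1.1
p1 = to_log_polar(3.0, 4.0)
p2 = to_log_polar(3.0 * s, 4.0 * s)
shift = (p2[0] - p1[0], p2[1] - p1[1])
```

Here `shift` is (log 1.1, 0) regardless of which point is chosen, which is why the log-polar flow equations relate so directly to time to impact.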

••

TL;DR: The capabilities of subsequential transductions are illustrated through a series of experiments that also show the high effectiveness of the proposed learning method in obtaining very accurate and compact transducers for the corresponding tasks.

Abstract: A formalization of the transducer learning problem and an effective and efficient method for the inductive learning of an important class of transducers, the class of subsequential transducers, are presented. The capabilities of subsequential transductions are illustrated through a series of experiments that also show the high effectiveness of the proposed learning method in obtaining very accurate and compact transducers for the corresponding tasks.

••

TL;DR: Reconstruction formulae formally similar to the convolved backprojection ones are derived, and an iterative reconstruction technique is found to converge after a finite number of steps.

Abstract: A model of finite Radon transforms composed of Radon projections is presented. The model generalizes to finite group projections in the classical Radon transform theory. The Radon projector averages a function on a group over cosets of a subgroup. Reconstruction formulae that are formally similar to the convolved backprojection ones are derived, and an iterative reconstruction technique is found to converge after a finite number of steps. Applying these results to the group Z_2^p, new computationally favorable image representations have been obtained. A numerical study of the transform coding aspects is attached.

••

TL;DR: A method for feature extraction directly from gray-scale images of scanned documents without the usual step of binarization is presented, and its advantages and effectiveness are shown theoretically and demonstrated through preliminary experiments.

Abstract: A method for feature extraction directly from gray-scale images of scanned documents without the usual step of binarization is presented. This approach eliminates binarization by extracting features directly from gray-scale images. In this method, a digitized gray-scale image is treated as a noisy sampling of the underlying continuous surface and desired features are obtained by extracting and assembling topographic characteristics of this surface. The advantages and effectiveness of the approach are both shown theoretically and demonstrated through preliminary experiments with the proposed method.

••

TL;DR: The variation, with respect to view, of 2D features defined for projections of 3D point sets and line segments is studied and it is established that general-case view-invariants do not exist for any number of points, given true perspective, weak perspective, or orthographic projection models.

Abstract: The variation, with respect to view, of 2D features defined for projections of 3D point sets and line segments is studied. It is established that general-case view-invariants do not exist for any number of points, given true perspective, weak perspective, or orthographic projection models. Feature variation under the weak perspective approximation is then addressed. Though there are no general-case weak-perspective invariants, there are special-case invariants of practical importance. Those cited in the literature are derived from linear dependence relations and the invariance of this type of relation to linear transformation. The variation with respect to view is studied for an important set of 2D line segment features: the relative orientation, size, and position of one line segment with respect to another. The analysis includes an evaluation criterion for feature utility in terms of view-variation, which is a function of both the feature and the particular configuration of 3D line segments. The use of this information in object recognition is demonstrated for difficult discrimination tasks.
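
The relative orientation, size, and position features named above can be sketched as follows; the framing and the function name are illustrative, not taken from the paper:

```python
import numpy as np

def relative_segment_features(seg_a, seg_b):
    """Relative orientation, size, and position of segment b w.r.t. a.

    Segments are ((x1, y1), (x2, y2)) endpoint pairs. Returns the angle
    between the segments, the length ratio |b|/|a|, and b's midpoint
    offset expressed in a frame aligned with and scaled by a, so the
    triple is invariant to 2D translation, rotation, and uniform scaling.
    """
    a, b = np.asarray(seg_a, float), np.asarray(seg_b, float)
    da, db = a[1] - a[0], b[1] - b[0]
    theta_a = np.arctan2(da[1], da[0])
    angle = np.arctan2(db[1], db[0]) - theta_a          # relative orientation
    ratio = np.linalg.norm(db) / np.linalg.norm(da)     # relative size
    mid = (b[0] + b[1]) / 2 - (a[0] + a[1]) / 2         # midpoint offset
    rot = np.array([[np.cos(-theta_a), -np.sin(-theta_a)],
                    [np.sin(-theta_a),  np.cos(-theta_a)]])
    offset = rot @ mid / np.linalg.norm(da)             # relative position
    return angle, ratio, offset
```

The paper's point is precisely that such features are not view-invariant in general: the triple changes as the underlying 3D configuration is reprojected from a different viewpoint.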

••

TL;DR: Fast algorithms for computing min, median, max, or any other order-statistic filter transforms are described, and a logarithmic time-per-pixel lower bound for the computation of the median filter is shown.

Abstract: Fast algorithms for computing min, median, max, or any other order-statistic filter transforms are described. The algorithms take constant time per pixel to compute min or max filters and polylogarithmic time per pixel, in the size of the filter, to compute the median filter. A logarithmic time-per-pixel lower bound for the computation of the median filter is also shown.
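
One well-known way to achieve a constant number of comparisons per sample for the max filter, regardless of window size, is the block prefix/suffix-maxima scheme. The sketch below is a generic reconstruction of that idea in one dimension, not necessarily the paper's exact algorithm:

```python
def max_filter(x, w):
    """1-D sliding-window maximum via block prefix/suffix maxima.

    Splitting x into blocks of size w, precomputing suffix maxima and
    prefix maxima inside each block, and combining one value from each
    gives every output with O(1) comparisons per sample, independent
    of w. Window at i covers x[i : i + w]; output has len(x) - w + 1
    samples.
    """
    n = len(x)
    suffix = list(x)
    prefix = list(x)
    for i in range(n - 2, -1, -1):      # suffix maxima within each block
        if (i + 1) % w != 0:            # skip block boundaries
            suffix[i] = max(suffix[i], suffix[i + 1])
    for i in range(1, n):               # prefix maxima within each block
        if i % w != 0:
            prefix[i] = max(prefix[i], prefix[i - 1])
    # each window spans at most two blocks: combine one value from each
    return [max(suffix[i], prefix[i + w - 1]) for i in range(n - w + 1)]
```

The min filter follows by replacing max with min; 2-D filters follow by running the 1-D pass over rows and then columns, since min and max are separable.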

••

TL;DR: The main result is that the expansion coefficients of the approximation are obtained by linear filtering and sampling, and the results are applied to construct an L_2 polynomial spline pyramid that is a parametric multiresolution representation of a signal.

Abstract: The authors are concerned with the derivation of general methods for the L_2 approximation of signals by polynomial splines. The main result is that the expansion coefficients of the approximation are obtained by linear filtering and sampling. The authors apply these results to construct an L_2 polynomial spline pyramid that is a parametric multiresolution representation of a signal. This hierarchical data structure is generated by repeated application of a REDUCE function (prefilter and down-sampler). A complementary EXPAND function (up-sampler and post-filter) allows a finer-resolution mapping of any coarser level of the pyramid. Four equivalent representations of this pyramid are considered, and the corresponding REDUCE and EXPAND filters are determined explicitly for polynomial splines of any odd order n. Some image processing examples are presented. It is demonstrated that the performance of the Laplacian pyramid can be improved significantly by using a modified EXPAND function associated with the dual representation of a cubic spline pyramid.
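
The REDUCE/EXPAND machinery can be sketched in one dimension. The binomial kernels below are placeholders of my own: the paper derives the exact spline prefilters and postfilters for each odd order n, which this sketch does not reproduce:

```python
import numpy as np

def reduce_(signal, h=(0.25, 0.5, 0.25)):
    """REDUCE: prefilter, then down-sample by 2.

    The binomial kernel h is a stand-in for the order-dependent spline
    prefilter derived in the paper.
    """
    filtered = np.convolve(signal, h, mode='same')
    return filtered[::2]

def expand(signal, g=(0.5, 1.0, 0.5)):
    """EXPAND: up-sample by 2 (zero insertion), then post-filter."""
    up = np.zeros(2 * len(signal))
    up[::2] = signal
    return np.convolve(up, g, mode='same')
```

Iterating `reduce_` generates the pyramid levels; `expand` maps a coarse level back to a finer grid, which is the step the paper improves by using the dual-representation EXPAND filter.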

••

TL;DR: This paper provides a precise formalization of the consequences entailed by a defeasible knowledge base, develops the computational machinery necessary for deriving these consequences, and compares the behavior of the maximum entropy approach to that of ε-semantics and rational closure.

Abstract: An approach to nonmonotonic reasoning that combines the principle of infinitesimal probabilities with that of maximum entropy, thus extending the inferential power of the probabilistic interpretation of defaults, is proposed. A precise formalization of the consequences entailed by a conditional knowledge base is provided, the computational machinery necessary for drawing these consequences is developed, and the behavior of the maximum entropy approach is compared to related work in default reasoning. The resulting formalism offers a compromise between two extremes: the cautious approach based on the conditional interpretations of defaults and the bold approach based on minimizing abnormalities.