
Showing papers on "Segmentation-based object categorization" published in 1996


Journal Article
TL;DR: A methodology for evaluating range image segmentation algorithms is proposed, and four research groups have each contributed an evaluation of their own algorithm for segmenting a range image into planar patches.
Abstract: A methodology for evaluating range image segmentation algorithms is proposed. This methodology involves (1) a common set of 40 laser range finder images and 40 structured light scanner images that have manually specified ground truth and (2) a set of defined performance metrics for instances of correctly segmented, missed, and noise regions, over- and under-segmentation, and accuracy of the recovered geometry. A tool is used to objectively compare a machine generated segmentation against the specified ground truth. Four research groups have contributed to evaluate their own algorithm for segmenting a range image into planar patches.
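
The region-level metrics named in the abstract can be illustrated with a small overlap-based classifier. This is a simplified sketch, not the paper's exact definitions: the `tolerance` threshold, the function names, and the label-grid representation are all assumptions made for illustration. A ground-truth region and a machine region count as a correct detection when their overlap covers most of both; a ground-truth region counts as over-segmented when several machine regions lie mostly inside it and together cover most of it.

```python
from collections import Counter

def region_overlaps(gt, ms):
    """Count pixels per ground-truth label, per machine label, and per (gt, ms) pair."""
    gt_size, ms_size, pair = Counter(), Counter(), Counter()
    for gt_row, ms_row in zip(gt, ms):
        for g, m in zip(gt_row, ms_row):
            gt_size[g] += 1
            ms_size[m] += 1
            pair[(g, m)] += 1
    return gt_size, ms_size, pair

def correct_detections(gt, ms, tolerance=0.8):
    """Return (gt, ms) label pairs whose overlap covers >= tolerance of both regions."""
    gt_size, ms_size, pair = region_overlaps(gt, ms)
    return sorted(
        (g, m) for (g, m), n in pair.items()
        if n >= tolerance * gt_size[g] and n >= tolerance * ms_size[m]
    )

def over_segmentations(gt, ms, tolerance=0.8):
    """Map each over-segmented ground-truth label to the machine labels splitting it."""
    gt_size, ms_size, pair = region_overlaps(gt, ms)
    result = {}
    for g in gt_size:
        # Machine regions that lie mostly inside ground-truth region g.
        parts = sorted(m for (g2, m), n in pair.items()
                       if g2 == g and n >= tolerance * ms_size[m])
        covered = sum(pair[(g, m)] for m in parts)
        if len(parts) >= 2 and covered >= tolerance * gt_size[g]:
            result[g] = parts
    return result

# Hypothetical 2x4 label images: region 1 is recovered, region 2 is split in two.
ground_truth = [[1, 1, 2, 2],
                [1, 1, 2, 2]]
machine = [[7, 7, 8, 9],
           [7, 7, 8, 9]]
print(correct_detections(ground_truth, machine))
print(over_segmentations(ground_truth, machine))
```

Under-segmentation could be sketched the same way with the roles of the two label images swapped; missed and noise regions are then the ones left unaccounted for on either side.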

895 citations


Journal Article
TL;DR: A review of character segmentation methods under four headings: classical dissection into classifiable subimages, segmentation by classification without dissection, hybrid strategies, and holistic approaches that avoid segmentation by recognizing entire character strings as units.
Abstract: Character segmentation has long been a critical area of the OCR process. The higher recognition rates for isolated characters vs. those obtained for words and connected character strings well illustrate this fact. A good part of recent progress in reading unconstrained printed and written text may be ascribed to more insightful handling of segmentation. This paper provides a review of these advances. The aim is to provide an appreciation for the range of techniques that have been developed, rather than to simply list sources. Segmentation methods are listed under four main headings. What may be termed the "classical" approach consists of methods that partition the input image into subimages, which are then classified. The operation of attempting to decompose the image into classifiable units is called "dissection." The second class of methods avoids dissection, and segments the image either explicitly, by classification of prespecified windows, or implicitly by classification of subsets of spatial features collected from the image as a whole. The third strategy is a hybrid of the first two, employing dissection together with recombination rules to define potential segments, but using classification to select from the range of admissible segmentation possibilities offered by these subimages. Finally, holistic approaches that avoid segmentation by recognizing entire character strings as units are described.

880 citations


Journal Article
TL;DR: A hierarchical segmentation process is derived, which gives a compact description of the image, containing all the segmentations one can obtain by the notion of dynamics, by means of a simple thresholding.
Abstract: The watershed is one of the latest segmentation tools developed in mathematical morphology. In order to prevent its oversegmentation, the notion of dynamics of a minimum, based on geodesic reconstruction, has been proposed. In this paper, we extend the notion of dynamics to the contour arcs. This notion acts as a measure of the saliency of the contour. Contrary to the dynamics of minima, our concept reflects the extension and shape of the corresponding object in the image. This representation is also much more natural, because it is expressed in terms of partitions of the plane, i.e., segmentations. A hierarchical segmentation process is then derived, which gives a compact description of the image, containing all the segmentations one can obtain by the notion of dynamics, by means of a simple thresholding. Finally, efficient algorithms for computing the geodesic reconstruction as well as the dynamics of contours are presented.

552 citations


Journal Article
TL;DR: In this paper, a multi-pass, pair-wise, region-growing algorithm is proposed for image segmentation, which uses a texture channel created by a simple adaptive-window texture program.
Abstract: Image segmentation is a method of defining discrete objects or classes of objects in images. Addition of a spatial attribute, i.e., image texture, improves the segmentation process in most areas where there are differences in texture between classes in the image. Such areas include sparsely vegetated areas and highly textured human-generated areas, such as the urban-suburban interface. A simple adaptive-window texture program creates a texture channel useful in image segmentation. The segmentation algorithm is a multi-pass, pair-wise, region-growing algorithm. The test sites include a simulated conifer forest, a natural vegetation area, and a mixed-use suburban area. The simulated image is especially useful because polygon boundaries are unambiguous. Both the weighting of textural data relative to the spectral data, and the effects of the degree of segmentation, are explored. The use of texture improves segmentations for most areas. It is apparent that the addition of texture, at worst, has no influence on the accuracy of the segmentation, and can improve the accuracy in areas where the features of interest exhibit differences in local variance. Results indicate that, for most uses, segmentation schemes should include both a minimum and maximum region size to ensure the greatest accuracy.
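
Since the abstract ties the benefit of texture to differences in local variance, a texture channel can be pictured as a per-pixel variance map. The sketch below is a simplification under stated assumptions: it uses a fixed square window (the paper's window is adaptive, and its adaptation rule is not given here), and the function name `local_variance` and the `radius` parameter are hypothetical.

```python
def local_variance(image, radius=1):
    """Variance of pixel values in a (2*radius+1)-wide square window.

    Windows are clipped at the image border, so edge pixels use fewer samples.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [image[i][j]
                      for i in range(max(0, r - radius), min(rows, r + radius + 1))
                      for j in range(max(0, c - radius), min(cols, c + radius + 1))]
            mean = sum(window) / len(window)
            out[r][c] = sum((v - mean) ** 2 for v in window) / len(window)
    return out

# A smooth patch yields zero texture; a checkerboard patch yields high texture.
smooth = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
checker = [[0, 10], [10, 0]]
print(local_variance(smooth))
print(local_variance(checker))
```

The resulting channel could then be weighted against the spectral bands before region growing, which is the trade-off the abstract says was explored.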

245 citations


28 Jul 1996
TL;DR: This paper describes an approach to using GP for image analysis based on the idea that image enhancement, feature detection and image segmentation can be re-framed as filtering problems and proposes terminals, functions and fitness functions that satisfy these requirements.
Abstract: This paper describes an approach to using GP for image analysis based on the idea that image enhancement, feature detection and image segmentation can be re-framed as filtering problems. GP can discover efficient optimal filters which solve such problems but in order to make the search feasible and effective, terminal sets, function sets and fitness functions have to meet some requirements. We describe these requirements and we propose terminals, functions and fitness functions that satisfy them. Experiments are reported in which GP is applied to the segmentation of the brain in medical images and is compared with artificial neural nets.

184 citations


Journal Article
TL;DR: This paper presents an overview on the most important techniques used in segmenting characters from handwritten words, and summarizes the terms and measurements commonly used in handwritten character segmentation.

174 citations


Journal Article
TL;DR: A new methodology for character segmentation and recognition which makes the best use of the characteristics of gray-scale images and a recognition-based segmentation method is adopted.
Abstract: Generally speaking, through the binarization of gray-scale images, useful information for the segmentation of touching or overlapping characters may be lost in many cases. If we analyze gray-scale images, however, specific topographic features and the variation of intensities can be observed in the character boundaries. In this paper, we propose a new methodology for character segmentation and recognition which makes the best use of the characteristics of gray-scale images. In the proposed methodology, the character segmentation regions are determined by using projection profiles and topographic features extracted from the gray-scale images. Then a nonlinear character segmentation path in each character segmentation region is found by using a multi-stage graph search algorithm. Finally, in order to confirm the nonlinear character segmentation paths and recognition results, a recognition-based segmentation method is adopted. Through experiments with various kinds of printed documents, we conclude that the proposed methodology is very effective for the segmentation and recognition of touching and overlapping characters.
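
The projection-profile step mentioned in the abstract can be sketched on a gray-scale image without binarizing it first. This is only an illustration under assumptions: the ink measure (`background` minus intensity), the `max_ink` threshold, and both function names are hypothetical, and the topographic features and graph search of the paper are not modeled.

```python
def projection_profile(gray, background=255):
    """Total ink per column, where ink is (background - intensity)."""
    return [sum(background - row[c] for row in gray) for c in range(len(gray[0]))]

def candidate_cut_regions(profile, max_ink):
    """Return (start, end) column runs whose ink does not exceed max_ink."""
    runs, start = [], None
    for c, ink in enumerate(profile):
        if ink <= max_ink and start is None:
            start = c                      # a low-ink run begins
        elif ink > max_ink and start is not None:
            runs.append((start, c - 1))    # the run ended at the previous column
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs

# Two dark strokes separated by a faint, nearly white gap (columns 2 and 3).
gray = [[0, 0, 200, 255, 0],
        [0, 0, 255, 255, 0]]
profile = projection_profile(gray)
print(profile)
print(candidate_cut_regions(profile, max_ink=60))
```

Because the profile keeps gray levels, a faint bridge between touching characters still lowers the column total, whereas a binarized image would report the column as either fully inked or empty.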

154 citations


Journal Article
TL;DR: A scheme that automatically selects the optimal features for each pixel using wavelet analysis is proposed, leading to a robust segmentation algorithm.
Abstract: The optimal features with which to discriminate between regions and, thus, segment an image often differ depending on the nature of the image. Many real images are made up of both smooth and textured regions and are best segmented using different features in different areas. A scheme that automatically selects the optimal features for each pixel using wavelet analysis is proposed, leading to a robust segmentation algorithm. An automatic method for determining the optimal number of regions for segmentation is also developed.

142 citations


Proceedings Article
16 Sep 1996
TL;DR: A new method to extract the plate region using a distributed genetic algorithm that offers robustness in dealing with deformation of vehicle images and inherent parallelism to improve the processing time is proposed.
Abstract: Extracting a license plate is an important stage in automatic vehicle identification. It is very difficult because vehicle images are usually degraded and processing the images is computationally intensive. We propose a new method to extract the plate region using a distributed genetic algorithm. The algorithm offers robustness in dealing with deformation of vehicle images and inherent parallelism to improve the processing time. A test with seventy images shows an extraction rate of 92.8%, working well in real-world situations. These results suggest that the proposed method is suitable for practical use.

130 citations


Patent
Chuang Gu
30 Sep 1996
TL;DR: In this paper, a joint motion and spatial segmentation of image features is proposed for representing moving image features, and the video image features are jointly segmented as a weighted combination of the motion-segmented video image feature and the spatial image feature.
Abstract: Homogeneous moving objects of arbitrary shapes are segmented and tracked with respect to the motion of the objects. In an intraframe mode of operation, a segmentation method includes obtaining a motion representation of corresponding pixels in the selected video image frame and a preceding video image frame to form motion-segmented video image features. Video image features are also segmented according to their spatial image characteristics (e.g., color) to form spatially-segmented video image features. Finally, the video image features are jointly segmented as a weighted combination of the motion-segmented video image features and the spatially-segmented video image features. The joint motion and spatial segmentation of image features provides enhanced accuracy in representing moving image features. This enhanced accuracy is particularly beneficial because the motion of image features is a significant display characteristic for human observers.

124 citations


Journal Article
TL;DR: A fuzzy validity function is proposed which measures a degree of separation and compactness between and within finely segmented regions, and an edge strength along boundaries of all regions, using a genetic algorithm with a fuzzy measure.

Journal Article
TL;DR: The texture-based segmentation method presented here is a pixel classifier based on four texture energy measures associated with each pixel in the image based on an automated clustering procedure.

Book Chapter
01 Apr 1996
TL;DR: A set of terminals and functions for the parse trees handled by genetic programming which enable it to develop effective image filters which can either be used to highly enhance and detect features of interest or to build pixel-classification-based segmentation algorithms.
Abstract: Genetic Programming is a method of program discovery/optimisation consisting of a special kind of genetic algorithm capable of operating on nonlinear chromosomes (parse trees) representing programs and an interpreter which can run the programs being optimised. In this paper we describe a set of terminals and functions for the parse trees handled by genetic programming which enable it to develop effective image filters. These filters can either be used to highly enhance and detect features of interest or to build pixel-classification-based segmentation algorithms. Some experiments with medical images which show the efficacy of the approach are reported.

Proceedings Article
18 Jun 1996
TL;DR: This paper presents a prediction-and-verification segmentation scheme that can handle a large number of different deformable objects presented in complex backgrounds and is relatively efficient since the segmentation is guided by past knowledge through a prediction-and-verification scheme.
Abstract: This paper presents a prediction-and-verification segmentation scheme using attention images from multiple fixations. A major advantage of this scheme is that it can handle a large number of different deformable objects presented in complex backgrounds. The scheme is also relatively efficient since the segmentation is guided by past knowledge through a prediction-and-verification scheme. The system has been tested to segment hands in sequences of intensity images, where each sequence represents a hand sign. The experimental result showed a 95% correct segmentation rate with a 3% false rejection rate.

Proceedings Article
01 Jan 1996
TL;DR: Two new methods, based on the EM algorithm, are proposed to perform robust motion segmentation on image sequences that contain IMOs and tracks depth-structure over time and evaluates rigidity allowing IMOs to be identified as outliers.
Abstract: Motion in image sequences can result from the motion of the observer (egomotion) and from the presence of independently moving objects (IMOs) within the field of view of the observer. Any vision system intended for an observer capable of motion needs the ability to distinguish between these two possibilities in order to successfully perform navigation and collision avoidance tasks. One approach to motion segmentation is to perform a statistical clustering on a set of local constraints on 3-D motion in the image. This thesis proposes two new methods, based on the EM algorithm, to perform robust motion segmentation on image sequences that contain IMOs. The first method uses statistical clustering of linear and bilinear constraints (derived from computed optical flow using subspace methods) on 3-D translation and rotation. The problems of outlier detection and determining the number of processes and their initial parameters for the EM algorithm are considered. Also, analysis of the effects of IMO boundaries on linear constraints, as well as a derivation for the removal of bias inherent in translation estimates from linear constraints, are presented. Effects of fixation on detection of IMOs are considered. A framework for hypothesizing about motions underlying a set of constraint clusters is detailed. There exist situations in which 3-D motion constraints are not sufficient to perform segmentation. The second method tracks depth-structure over time and evaluates rigidity, allowing IMOs to be identified as outliers. Results obtained from four image sequences are presented. The first sequence is synthetic optic flow generated from a depth map and contains one IMO. The second sequence was captured from a robot moving in an industrial environment. The third sequence is similar to the second, except the flow has been generated using a regularly spaced grid which assumes no prior segmentation of the image. The fourth sequence illustrates a case in which the 3-D constraints are insufficient to perform the segmentation, and the estimation of depth structure is needed to solve the problem. Finally, directions for future research into this problem are presented.

Proceedings Article
16 Sep 1996
TL;DR: A segmentation scheme that takes account of multiple image characteristics, developing a multi-modal statistical model of regions based on a small amount of user-supplied training data is described.
Abstract: Researchers have shown the compression advantages of coding video as a set of regions that can be defined by motion or texture models. Whether for purposes of compression efficiency, image editing, interactive multimedia authoring, or database search, it is often useful to be able to segment images or image sequences into regions corresponding to objects. We describe a segmentation scheme that takes account of multiple image characteristics, developing a multi-modal statistical model of regions based on a small amount of user-supplied training data.

Proceedings Article
07 May 1996
TL;DR: The development of the design method and mathematical models provides new insight into the design of multiple Gabor filters for texture segmentation, which offers the potential to improve the segmentation performance or to reduce the number of filters.
Abstract: This paper presents a method for the design of multiple Gabor filters for segmenting multi-textured images. Although design methods for a single Gabor filter have been presented previously, the development of general multi-filter multi-texture design methods largely remains an open problem. Previous multi-filter design approaches required one filter per texture or were constrained to pairs of textures. Other approaches employed ad hoc banks of Gabor filters for texture segmentation, where the parameters of the constituent filters were restricted to fixed values and were not necessarily tuned for a specific texture-segmentation problem. The proposed method removes these restrictions on the number of filters and the number of textures. This offers the potential to improve the segmentation performance or to reduce the number of filters. Further, the development of the design method and mathematical models provides new insight into the design of multiple Gabor filters for texture segmentation. Results are presented that confirm the efficacy of our filter-design method and support the underlying mathematical models.

Proceedings Article
02 Dec 1996
TL;DR: A novel algorithm for very fast segmentation of range images into both planar and curved surface patches that makes use of high-level features (curve segments) as segmentation primitives instead of individual pixels.
Abstract: In this paper we present a novel algorithm for very fast segmentation of range images into both planar and curved surface patches. In contrast to other known segmentation methods our approach makes use of high-level features (curve segments) as segmentation primitives instead of individual pixels. This way the amount of data can be significantly reduced and a very fast segmentation algorithm is obtained. The proposed algorithm has been tested on a large number of real range images and demonstrated good results. With an optimized implementation our method has the potential to operate in quasi real-time (a few range images per second).

Proceedings Article
25 Aug 1996
TL;DR: A segmentation method and a handwritten word coding method by human observation are proposed for automatic document processing in Arabic; their results are used in the recognition level, which is presented as a perspective in this paper.
Abstract: We propose a segmentation method and a handwritten word coding method by human observation for automatic document processing in Arabic. The system is composed of three levels. The first level deals with the segmentation of the word into portions of characters called graphemes. The second level analyses these graphemes and codes the word by a sequence of observations similar to human perception. The results of these two levels are used in the recognition level (the third level), which is presented as a perspective in this paper.

Proceedings Article
18 Jun 1996
TL;DR: Current computer vision systems whose basic methodology is open-loop or filter type typically use image segmentation followed by object recognition algorithms, but the system presented here achieves robust performance by using reinforcement learning to induce a mapping from input images to corresponding segmentation parameters.
Abstract: Current computer vision systems whose basic methodology is open-loop or filter type typically use image segmentation followed by object recognition algorithms. These systems are not robust for most real-world applications. In contrast, the system presented here achieves robust performance by using reinforcement learning to induce a mapping from input images to corresponding segmentation parameters. This is accomplished by using the confidence level of model matching as a reinforcement signal for a team of learning automata to search for segmentation parameters during training. The use of the recognition algorithm as part of the evaluation function for image segmentation gives rise to significant improvement of the system performance by automatic generation of recognition strategies. The system is verified through experiments on sequences of color images with varying external conditions.
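
To make the idea of a learning automaton driven by a recognition-confidence signal concrete, here is a minimal sketch of the classical linear reward-inaction update. The abstract does not state which update rule the authors use, so this rule is an assumption, as are the function name `reward_inaction_update`, the `rate` parameter, and the four candidate parameter values. Each automaton holds a probability for every candidate value of one segmentation parameter; the chosen value is reinforced in proportion to the confidence returned by model matching.

```python
def reward_inaction_update(probs, chosen, reward, rate=0.1):
    """Linear reward-inaction update for one learning automaton.

    probs  -- action probabilities (one per candidate parameter value)
    chosen -- index of the action that was tried
    reward -- reinforcement in [0, 1], e.g. a model-matching confidence
    A reward of 0 leaves the probabilities unchanged (the "inaction" part).
    """
    step = rate * reward
    return [p + step * (1.0 - p) if i == chosen else p * (1.0 - step)
            for i, p in enumerate(probs)]

# Hypothetical automaton choosing among four threshold values.
probs = [0.25, 0.25, 0.25, 0.25]
probs = reward_inaction_update(probs, chosen=2, reward=0.8)
print(probs, sum(probs))
```

In a team of automata, one such update would run per segmentation parameter after each trial, all sharing the same confidence value as reinforcement.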

Journal Article
TL;DR: A predictive coding scheme is proposed which is based on the temporal coherence of the segmentation of time-varying image sequences whose goal is object-oriented image coding.
Abstract: This paper describes a new method of segmentation of time-varying image sequences whose goal is object-oriented image coding. The segmentation represents a partition of each frame of the sequence into a set of regions which are homogeneous with regard to a motion criterion. The region borders correspond to spatial contours of objects in the frame. Each spatio-temporal region is characterized by its temporal component, which is a model-dependent vector of motion parameters, and a structural component representing the polygonal approximation of the spatial contour of the region. The construction of the spatio-temporal segmentation includes two phases: the initialization step and temporal tracking. The initialization step is based on the spatial segmentation of the first frame of the sequence. Then homogeneous spatial regions are merged through motion estimation in accordance with a motion-based criterion. The temporal tracking consists of the projection of the segmentation along the time axis, and its adjustment. Special attention is paid to the processing of occlusions. A predictive coding scheme is proposed which is based on the temporal coherence of the segmentation. This scheme is promising for low bit-rate image compression. The results for teleconference and TV sequences show the high visual quality of images reconstructed by prediction only. Moreover, the bit-rates for motion coding are very low: from 0.002 to 0.007 bit/pixel for the teleconference sequence and from 0.004 to 0.021 bit/pixel for the complex TV sequence. A scheme for encoding the structural information is proposed which requires 0.083 to 0.17 bit per pixel depending on the content of the sequence.

Journal Article
TL;DR: This paper presents a method for spatiotemporal segmentation of long image sequences of scenes which include multiple independently moving objects, based on the minimum description length (MDL) principle, and shows the validity of this method.
Abstract: This paper presents a method for spatiotemporal segmentation of long image sequences of scenes which include multiple independently moving objects, based on the minimum description length (MDL) principle. First, a family of motion models is constructed, each of which corresponds to a physically meaningful motion such as translation with constant velocity or a combination of translation and rotation. Then, the motion description length is formulated. When an object changes the type of the motion or a new part of an object appears, the corresponding temporal or spatial segmentation is carried out. Ambiguous segmentation of two consecutive images can be resolved by minimizing the motion description length in a long sequence of images. Experiments on several real image sequences show the validity of our method.

Journal Article
TL;DR: A new image objective function based on directional gradient information derived from Gaussian smoothed derivatives of the image data is proposed and designed to accurately locate an object boundary even in the case of a conflicting object positioned close to the object of interest.

Journal Article
TL;DR: The data suggest that segmentation by motion proceeds first via a cooperative linking over space of local motion signals, generating almost immediate perceptual coherence even of physically incoherent signals, providing further evidence for the existence of two motion processes with distinct dynamic properties.
Abstract: Theories of image segmentation suggest that the human visual system may use two distinct processes to segregate figure from background: a local process that uses local feature contrasts to mark borders of coherent regions and a global process that groups similar features over a larger spatial scale. We performed psychophysical experiments to determine whether and to what extent the global similarity process contributes to image segmentation by motion and color. Our results show that for color, as well as for motion, segmentation occurs first by an integrative process on a coarse spatial scale, demonstrating that for both modalities the global process is faster than one based on local feature contrasts. Segmentation by motion builds up over time, whereas segmentation by color does not, indicating a fundamental difference between the modalities. Our data suggest that segmentation by motion proceeds first via a cooperative linking over space of local motion signals, generating almost immediate perceptual coherence even of physically incoherent signals. This global segmentation process occurs faster than the detection of absolute motion, providing further evidence for the existence of two motion processes with distinct dynamic properties.

Proceedings Article
31 Oct 1996
TL;DR: With one exception, the method performs well and yields a skin-air interface with sufficient fidelity to preserve a nipple in profile and the method has been tested on 58 mammograms of two views from two digital mammogram databases.
Abstract: The breast and background on a mammogram form complementary, connected sets. Generally, the intensities comprising the background are spatially continuous, low in value and lie within a closed interval. The background may therefore be approximated by a polynomial in x and y on the basis of the Weierstrass approximation theorem. The authors include the whole background and a small portion of the breast in the region being modelled. The modelled background is subtracted from the original image, the resulting image thresholded, and the largest low intensity region taken to be the background. Connected regions are identified, labelled and merged. The background is floodfilled, and inclusions removed from the object, to yield a breast-background binary image. The method has been tested on 58 mammograms of two views from two digital mammogram databases. With one exception, it performs well and yields a skin-air interface with sufficient fidelity to preserve a nipple in profile.
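
The fit-and-subtract step of this pipeline can be sketched with a least-squares polynomial in x and y. The version below is a simplification under assumptions: it fits only a first-order polynomial (a plane) for brevity, whereas the abstract does not state the polynomial order, and the names `fit_plane`, `solve`, and `subtract_background` and the mask representation are hypothetical.

```python
def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):                       # back substitution
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x

def fit_plane(image, mask):
    """Least-squares fit of a + b*x + c*y to the pixels where mask is True."""
    ata = [[0.0] * 3 for _ in range(3)]                # normal equations A^T A
    atz = [0.0] * 3                                    # right-hand side A^T z
    for y, row in enumerate(image):
        for x, z in enumerate(row):
            if mask[y][x]:
                basis = (1.0, float(x), float(y))
                for i in range(3):
                    atz[i] += basis[i] * z
                    for j in range(3):
                        ata[i][j] += basis[i] * basis[j]
    return solve(ata, atz)

def subtract_background(image, coeffs):
    """Return the image minus the modelled background a + b*x + c*y."""
    a, b, c = coeffs
    return [[z - (a + b * x + c * y) for x, z in enumerate(row)]
            for y, row in enumerate(image)]

# Synthetic 4x4 example: a sloping background with one bright "object" pixel.
image = [[2 + 0.5 * x + y for x in range(4)] for y in range(4)]
image[1][2] += 50
mask = [[True] * 4 for _ in range(4)]
mask[1][2] = False                                     # exclude the object from the fit
residual = subtract_background(image, fit_plane(image, mask))
print(residual[1][2])
```

Thresholding the residual and keeping the largest low-intensity connected region would then follow, as the abstract describes; those steps are not shown here.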

Proceedings Article
25 Aug 1996
TL;DR: The limitations of boundary based shape features, and an alternative shape characterization technique based on orientation radiograms are discussed, and a working image retrieval system based on this approach is described.
Abstract: For content based image retrieval using shape descriptors, most approaches so far extract shape information from a segmentation of the image. Shape features derived based on a specific segmentation are not suitable for images containing complex structures. Further, static segmentation based approaches are useful only for a small set of queries. In this paper we discuss the limitations of such boundary based shape features, and propose an alternative shape characterization technique based on orientation radiograms. A working image retrieval system based on this approach is described and sample results are presented for a full-image query.

Proceedings Article
25 Aug 1996
TL;DR: A prediction-and-verification segmentation scheme that efficiently utilizes the attention images from multiple fixations is proposed; hand sign recognition based on the segmentation results shows that the system achieves good performance on this very difficult vision task.
Abstract: In this paper, we present a three-stage framework to analyze time-varying image sequences. The focus of this paper is the second stage: segmentation. We propose a prediction-and-verification segmentation scheme which efficiently utilizes the attention images from multiple fixations. The experimental results show a 95% correct segmentation rate with a 3% false rejection rate on 805 testing images. The recognition of hand signs based on the segmentation results has shown that the system achieves good performance for this very difficult vision task.

Patent
13 Jun 1996
TL;DR: An automatic contextual segmentation method which can be used to identify features in QCT images of femora, tibiae and vertebrae is proposed. The algorithms are constructed to provide successful segmentations for known classes of bones with known topological constraints.
Abstract: An automatic contextual segmentation method which can be used to identify features in QCT images of femora, tibiae and vertebrae. The principal advantages of this automatic approach over traditional techniques such as histomorphometry are, 1) the algorithms can be implemented in a fast, uniform, non-subjective manner across many images allowing unbiased comparisons of therapeutic efficacy, 2) much larger volumes in the region of interest can be analyzed, and 3) QCT can be used longitudinally. Two automatic contextual segmentation algorithms relate to a cortical bone algorithm (CBA) and a whole bone algorithm (WBA). These methods include a preprocessing step, a threshold selection step, a segmentation step satisfying logical constraints, a pixel wise label image updating step, and a feature extraction step; with the WBA including whole bone segmentation, cortical segmentation, spine segmentation, and centrum segmentation. The algorithms are constructed to provide successful segmentations for known classes of bones with known topological constraints.

Journal Article
Jianping Fan, Rong Wang, Liming Zhang, Dingjia Xing, Fuxi Gan
TL;DR: Experimental results show that this segmentation-based coding scheme is more efficient than normal fixed-size coding algorithms.

Proceedings Article
25 Aug 1996
TL;DR: The proposed method is capable of segmenting pages with non-rectangular layout as well as with various angles of skew, and the advantages and limitations of the proposed method are discussed.
Abstract: This paper presents a new method of page segmentation based on the analysis of background (white areas). The proposed method is capable of segmenting pages with non-rectangular layout as well as with various angles of skew. The characteristics of the method are as follows: (1) thinning of the background enables us to represent white areas of any shape as connected thin lines or chains and the robustness for tilted page images is also achieved by the representation; and (2) based on this representation, the task of page segmentation is defined as to find the loops enclosing printed areas. The task is achieved by eliminating unnecessary chains using not only a feature of white areas, but also a feature of black areas divided by a chain. Based on the experimental results and the comparison with previous methods, we discuss the advantages and limitations of the proposed method.