Showing papers on "Segmentation-based object categorization" published in 2005


Proceedings ArticleDOI
17 Oct 2005
TL;DR: This work treats object categories as topics, so that an image containing instances of several categories is modeled as a mixture of topics, and applies a model developed in the statistical text literature: probabilistic latent semantic analysis (pLSA).
Abstract: We seek to discover the object categories depicted in a set of unlabelled images. We achieve this using a model developed in the statistical text literature: probabilistic latent semantic analysis (pLSA). In text analysis, this is used to discover topics in a corpus using the bag-of-words document representation. Here we treat object categories as topics, so that an image containing instances of several categories is modeled as a mixture of topics. The model is applied to images by using a visual analogue of a word, formed by vector quantizing SIFT-like region descriptors. The topic discovery approach successfully translates to the visual domain: for a small set of objects, we show that both the object categories and their approximate spatial layout are found without supervision. Performance of this unsupervised method is compared to the supervised approach of Fergus et al. (2003) on a set of unseen images containing only one object per image. We also extend the bag-of-words vocabulary to include 'doublets' which encode spatially local co-occurring regions. It is demonstrated that this extended vocabulary gives a cleaner image segmentation. Finally, the classification and segmentation methods are applied to a set of images containing multiple objects per image. These results demonstrate that we can successfully build object class models from an unsupervised analysis of images.

1,129 citations
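The pLSA model named in the abstract above can be illustrated with a minimal EM sketch. This is not the authors' implementation: the function name `plsa`, the toy count matrix, the random initialisation and the small smoothing constant are all assumptions made for illustration. Each image is treated as a document of visual-word counts, and EM alternates between the topic posterior P(z|d,w) and the estimates of P(w|z) and P(z|d).

```python
import random

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM. counts[d][w] = occurrences of visual word w in image d.

    Returns (p_w_z, p_z_d): p_w_z[z][w] = P(w|z), p_z_d[d][z] = P(z|d).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalized(row):
        total = sum(row)
        return [v / total for v in row]

    # Random positive initialisation (an assumption of this sketch).
    p_w_z = [normalized([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [normalized([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]

    for _ in range(n_iter):
        # Tiny constant keeps every row normalisable (hypothetical smoothing).
        new_w_z = [[1e-12] * n_words for _ in range(n_topics)]
        new_z_d = [[1e-12] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                if counts[d][w] == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(w|z) * P(z|d).
                post = normalized([p_w_z[z][w] * p_z_d[d][z]
                                   for z in range(n_topics)])
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    r = counts[d][w] * post[z]
                    new_w_z[z][w] += r
                    new_z_d[d][z] += r
        # M-step: renormalise the expected counts.
        p_w_z = [normalized(row) for row in new_w_z]
        p_z_d = [normalized(row) for row in new_z_d]
    return p_w_z, p_z_d

# Toy data: four "images" over a four-word visual vocabulary.
toy_counts = [[5, 5, 0, 0], [4, 6, 0, 0], [0, 0, 5, 5], [0, 0, 6, 4]]
topic_words, image_topics = plsa(toy_counts, n_topics=2)
```

In this reading, `image_topics[d]` plays the role of the per-image mixture over object categories; in the paper the counts would come from vector-quantized region descriptors rather than a hand-written matrix.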


Proceedings ArticleDOI
17 Oct 2005
TL;DR: An optimally compact visual dictionary is learned by pair-wise merging of visual words from an initially large dictionary, and a novel statistical measure of discrimination is proposed which is optimized by each merge operation.
Abstract: This paper presents a new algorithm for the automatic recognition of object classes from images (categorization). Compact and yet discriminative appearance-based object class models are automatically learned from a set of training images. The method is simple and extremely fast, making it suitable for many applications such as semantic image retrieval, Web search, and interactive image editing. It classifies a region according to the proportions of different visual words (clusters in feature space). The specific visual words and the typical proportions in each object are learned from a segmented training set. The main contribution of this paper is twofold: i) an optimally compact visual dictionary is learned by pair-wise merging of visual words from an initially large dictionary. The final visual words are described by GMMs. ii) A novel statistical measure of discrimination is proposed which is optimized by each merge operation. High classification accuracy is demonstrated for nine object classes on photographs of real objects viewed under general lighting conditions, poses and viewpoints. The set of test images used for validation comprises: i) photographs acquired by us, ii) images from the Web and iii) images from the recently released Pascal dataset. The proposed algorithm performs well on both texture-rich objects (e.g. grass, sky, trees) and structure-rich ones (e.g. cars, bikes, planes).

968 citations


Proceedings ArticleDOI
20 Jun 2005
TL;DR: The segmentation algorithm works simultaneously across the graph scales, with an inter-scale constraint to ensure communication and consistency between the segmentations at each scale, and incorporates long-range connections with linear-time complexity, providing high-quality segmentations efficiently.
Abstract: We present a multiscale spectral image segmentation algorithm. In contrast to most multiscale image processing, this algorithm works on multiple scales of the image in parallel, without iteration, to capture both coarse and fine level details. The algorithm is computationally efficient, allowing large images to be segmented. We use the normalized cut graph partitioning framework of image segmentation. We construct a graph encoding pairwise pixel affinity, and partition the graph for image segmentation. We demonstrate that large image graphs can be compressed into multiple scales capturing image structure at increasingly large neighborhoods. We show that the decomposition of the image segmentation graph into different scales can be determined by ecological statistics on the image grouping cues. Our segmentation algorithm works simultaneously across the graph scales, with an inter-scale constraint to ensure communication and consistency between the segmentations at each scale. As the results show, we incorporate long-range connections with linear-time complexity, providing high-quality segmentations efficiently. Images that previously could not be processed because of their size have been accurately segmented thanks to this method.

635 citations
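The normalized cut criterion that the abstract above builds on can be made concrete with a short sketch. It only evaluates the standard two-way criterion Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V) for a given partition of a small affinity graph; it does not reproduce the paper's multiscale spectral solver. The function name and the toy affinity matrix are assumptions for illustration.

```python
def normalized_cut(weights, in_a):
    """Ncut value of a two-way partition of a weighted graph.

    weights: symmetric affinity matrix as a list of lists.
    in_a:    in_a[i] is True if node i belongs to segment A, else B.
    Both segments must be non-empty and connected to the graph,
    otherwise a denominator below is zero.
    """
    n = len(weights)
    cut = assoc_a = assoc_b = 0.0
    for i in range(n):
        for j in range(n):
            w = weights[i][j]
            if in_a[i]:
                assoc_a += w      # connection from A to every node
                if not in_a[j]:
                    cut += w      # edge crossing from A to B
            else:
                assoc_b += w      # connection from B to every node
    return cut / assoc_a + cut / assoc_b

# Toy graph: nodes {0, 1} and {2, 3} are tightly linked, weakly linked across.
affinity = [
    [0.0, 1.0, 0.1, 0.1],
    [1.0, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 1.0],
    [0.1, 0.1, 1.0, 0.0],
]
natural_split = normalized_cut(affinity, [True, True, False, False])
poor_split = normalized_cut(affinity, [True, False, True, False])
```

A lower value indicates a better partition, so `natural_split` should come out smaller than `poor_split` on this toy graph.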


Journal ArticleDOI
TL;DR: In this paper, a Bayesian framework for parsing images into their constituent visual patterns is presented, which optimizes the posterior probability and outputs a scene representation as a "parsing graph", in a spirit similar to parsing sentences in speech and natural language.
Abstract: In this paper we present a Bayesian framework for parsing images into their constituent visual patterns. The parsing algorithm optimizes the posterior probability and outputs a scene representation as a "parsing graph", in a spirit similar to parsing sentences in speech and natural language. The algorithm constructs the parsing graph and re-configures it dynamically using a set of moves, which are mostly reversible Markov chain jumps. This computational framework integrates two popular inference approaches--generative (top-down) methods and discriminative (bottom-up) methods. The former formulates the posterior probability in terms of generative models for images defined by likelihood functions and priors. The latter computes discriminative probabilities based on a sequence (cascade) of bottom-up tests/filters. In our Markov chain algorithm design, the posterior probability, defined by the generative models, is the invariant (target) probability for the Markov chain, and the discriminative probabilities are used to construct proposal probabilities to drive the Markov chain. Intuitively, the bottom-up discriminative probabilities activate top-down generative models. In this paper, we focus on two types of visual patterns--generic visual patterns, such as texture and shading, and object patterns including human faces and text. These types of patterns compete and cooperate to explain the image and so image parsing unifies image segmentation, object detection, and recognition (if we use generic visual patterns only then image parsing will correspond to image segmentation (Tu and Zhu, 2002. IEEE Trans. PAMI, 24(5):657--673)). We illustrate our algorithm on natural images of complex city scenes and show examples where image segmentation can be improved by allowing object specific knowledge to disambiguate low-level segmentation cues, and conversely where object detection can be improved by using generic visual patterns to explain away shadows and occlusions.

463 citations


Proceedings ArticleDOI
John Winn1, Nebojsa Jojic1
17 Oct 2005
TL;DR: LOCUS (learning object classes with unsupervised segmentation) is introduced which uses a generative probabilistic model to combine bottom-up cues of color and edge with top-down cues of shape and pose, allowing for significant within-class variation.
Abstract: We address the problem of learning object class models and object segmentations from unannotated images. We introduce LOCUS (learning object classes with unsupervised segmentation) which uses a generative probabilistic model to combine bottom-up cues of color and edge with top-down cues of shape and pose. A key aspect of this model is that the object appearance is allowed to vary from image to image, allowing for significant within-class variation. By iteratively updating the belief in the object's position, size, segmentation and pose, LOCUS avoids making hard decisions about any of these quantities and so allows for each to be refined at any stage. We show that LOCUS successfully learns an object class model from unlabeled images, whilst also giving segmentation accuracies that rival existing supervised methods. Finally, we demonstrate simultaneous recognition and segmentation in novel images using the learned models for a number of object classes, as well as unsupervised object discovery and tracking in video.

456 citations


Proceedings ArticleDOI
20 Jun 2005
TL;DR: This work devises a graph cut algorithm for interactive segmentation which incorporates shape priors, and positive results on both medical and natural images are demonstrated.
Abstract: Interactive or semi-automatic segmentation is a useful alternative to pure automatic segmentation in many applications. While automatic segmentation can be very challenging, a small amount of user input can often resolve ambiguous decisions on the part of the algorithm. In this work, we devise a graph cut algorithm for interactive segmentation which incorporates shape priors. While traditional graph cut approaches to interactive segmentation are often quite successful, they may fail in cases where there are diffuse edges, or multiple similar objects in close proximity to one another. Incorporation of shape priors within this framework mitigates these problems. Positive results on both medical and natural images are demonstrated.

437 citations


Proceedings ArticleDOI
17 Oct 2005
TL;DR: Probabilistic latent semantic analysis generates a compact scene representation, discriminative for accurate classification, and significantly more robust when less training data are available, and the ability of PLSA to automatically extract visually meaningful aspects is exploited to propose new algorithms for aspect-based image ranking and context-sensitive image segmentation.
Abstract: We present a new approach to model visual scenes in image collections, based on local invariant features and probabilistic latent space models. Our formulation provides answers to three open questions: (1) whether the invariant local features are suitable for scene (rather than object) classification; (2) whether unsupervised latent space models can be used for feature extraction in the classification task; and (3) whether the latent space formulation can discover visual co-occurrence patterns, motivating novel approaches for image organization and segmentation. Using a 9500-image dataset, our approach is validated on each of these issues. First, we show with extensive experiments on binary and multi-class scene classification tasks, that a bag-of-visterm representation, derived from local invariant descriptors, consistently outperforms state-of-the-art approaches. Second, we show that probabilistic latent semantic analysis (PLSA) generates a compact scene representation, discriminative for accurate classification, and significantly more robust when less training data are available. Third, we have exploited the ability of PLSA to automatically extract visually meaningful aspects, to propose new algorithms for aspect-based image ranking and context-sensitive image segmentation.

410 citations


Journal ArticleDOI
01 Sep 2005
TL;DR: A robust segmentation technique based on an extension to the traditional fuzzy c-means (FCM) clustering algorithm is proposed and a neighborhood attraction, which is dependent on the relative location and features of neighboring pixels, is shown to improve the segmentation performance dramatically.
Abstract: Image segmentation is an indispensable process in the visualization of human tissues, particularly during clinical analysis of magnetic resonance (MR) images. Unfortunately, MR images always contain a significant amount of noise caused by operator performance, equipment, and the environment, which can lead to serious inaccuracies with segmentation. A robust segmentation technique based on an extension to the traditional fuzzy c-means (FCM) clustering algorithm is proposed in this paper. A neighborhood attraction, which is dependent on the relative location and features of neighboring pixels, is shown to improve the segmentation performance dramatically. The degree of attraction is optimized by a neural-network model. Simulated and real brain MR images with different noise levels are segmented to demonstrate the superiority of the proposed technique compared to other FCM-based methods. This segmentation method is a key component of an MR image-based classification system for brain tumors, currently being developed.

397 citations
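For readers unfamiliar with the baseline that the abstract above extends, the following is a minimal sketch of standard fuzzy c-means on scalar pixel intensities. It does not include the paper's neighborhood attraction term or its neural-network optimisation; the function name, the fuzzifier `m = 2`, the fixed iteration count and the toy intensities are assumptions for illustration.

```python
def fuzzy_c_means(values, centers, m=2.0, n_iter=50):
    """Standard FCM on scalar intensities.

    values:  list of pixel intensities.
    centers: initial cluster centres (one per cluster).
    Returns (centers, memberships) with memberships[k][i] = degree to
    which pixel k belongs to cluster i; each row sums to 1.
    """
    centers = list(centers)
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(n_iter):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)).
        memberships = []
        for x in values:
            # Small offset avoids division by zero when x equals a centre.
            dists = [abs(x - c) + 1e-12 for c in centers]
            row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                   for d_i in dists]
            memberships.append(row)
        # Centre update: mean of the data weighted by u_ik^m.
        for i in range(len(centers)):
            weights = [row[i] ** m for row in memberships]
            centers[i] = (sum(w * x for w, x in zip(weights, values))
                          / sum(weights))
    return centers, memberships

# Toy intensities: a dark group and a bright group.
pixels = [10, 12, 11, 200, 205, 198]
final_centers, final_memberships = fuzzy_c_means(pixels, centers=[0.0, 255.0])
```

A hard segmentation can be read off by assigning each pixel to the cluster with the largest membership. Because the update uses intensity alone, noisy pixels are labelled independently of their neighbours, which is the weakness the neighborhood attraction described in the abstract is meant to address.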


Journal ArticleDOI
Yin Li1, Jian Sun1, Heung-Yeung Shum1
01 Jul 2005
TL;DR: This paper presents a system for cutting a moving object out from a video clip using a new 3D graph cut based segmentation approach on the spatial-temporal video volume and provides brush tools for the user to control the object boundary precisely wherever needed.
Abstract: In this paper, we present a system for cutting a moving object out from a video clip. The cutout object sequence can be pasted onto another video or a background image. To achieve this, we first apply a new 3D graph cut based segmentation approach on the spatial-temporal video volume. Our algorithm partitions watershed presegmentation regions into foreground and background while preserving temporal coherence. Then, the initial segmentation result is refined locally. Given two frames in the video sequence, we specify two respective windows of interest which are then tracked using a bi-directional feature tracking algorithm. For each frame in between these two given frames, the segmentation in each tracked window is refined using a 2D graph cut that utilizes a local color model. Moreover, we provide brush tools for the user to control the object boundary precisely wherever needed. Based on the accurate binary segmentation result, we apply coherent matting to extract the alpha mattes and foreground colors of the object.

391 citations


Journal ArticleDOI
TL;DR: This paper solves the information-theoretic optimization problem by deriving the associated gradient flows and applying curve evolution techniques and uses level-set methods to implement the resulting evolution.
Abstract: In this paper, we present a new information-theoretic approach to image segmentation. We cast the segmentation problem as the maximization of the mutual information between the region labels and the image pixel intensities, subject to a constraint on the total length of the region boundaries. We assume that the probability densities associated with the image pixel intensities within each region are completely unknown a priori, and we formulate the problem based on nonparametric density estimates. Due to the nonparametric structure, our method does not require the image regions to have a particular type of probability distribution and does not require the extraction and use of a particular statistic. We solve the information-theoretic optimization problem by deriving the associated gradient flows and applying curve evolution techniques. We use level-set methods to implement the resulting evolution. The experimental results based on both synthetic and real images demonstrate that the proposed technique can solve a variety of challenging image segmentation problems. Furthermore, our method, which does not require any training, performs as well as methods based on training.

335 citations
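The objective in the abstract above, the mutual information between region labels and pixel intensities, can be illustrated with a small sketch. It uses a simple histogram (plug-in) estimate on discrete intensities, which is an assumption made here for brevity; the paper instead relies on nonparametric density estimates, a boundary-length constraint and curve evolution, none of which are reproduced. The function name and toy data are hypothetical.

```python
import math
from collections import Counter

def mutual_information(labels, intensities):
    """Plug-in estimate of I(L; X) in bits from paired discrete samples.

    labels[k] is the region label of pixel k, intensities[k] its value.
    """
    n = len(labels)
    joint = Counter(zip(labels, intensities))
    count_l = Counter(labels)
    count_x = Counter(intensities)
    mi = 0.0
    for (l, x), c in joint.items():
        # p(l, x) * log2( p(l, x) / (p(l) * p(x)) ), written with counts.
        mi += (c / n) * math.log2(c * n / (count_l[l] * count_x[x]))
    return mi

# Toy image: four dark pixels followed by four bright pixels.
image = [0, 0, 0, 0, 255, 255, 255, 255]
aligned_labels = [0, 0, 0, 0, 1, 1, 1, 1]      # labels follow the intensities
unrelated_labels = [0, 1, 0, 1, 0, 1, 0, 1]    # labels ignore the intensities
mi_aligned = mutual_information(aligned_labels, image)
mi_unrelated = mutual_information(unrelated_labels, image)
```

A labelling that separates the two intensity populations carries more information about the pixel values than one that ignores them, which is why maximizing this quantity favours meaningful regions.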


Journal ArticleDOI
TL;DR: In this paper, four algorithms from the two main groups of segmentation algorithms (boundary-based and region-based) were evaluated and compared and an evaluation of each algorithm was carried out with empirical discrepancy evaluation methods.
Abstract: Since 1999, very high spatial resolution satellite data represent the surface of the Earth with more detail. However, information extraction by per pixel multispectral classification techniques proves to be very complex owing to the internal variability increase in land-cover units and to the weakness of spectral resolution. Image segmentation before classification was proposed as an alternative approach, but a large variety of segmentation algorithms were developed during the last 20 years, and a comparison of their implementation on very high spatial resolution images is necessary. In this study, four algorithms from the two main groups of segmentation algorithms (boundary-based and region-based) were evaluated and compared. In order to compare the algorithms, an evaluation of each algorithm was carried out with empirical discrepancy evaluation methods. This evaluation is carried out with a visual segmentation of Ikonos panchromatic images. The results show that the choice of parameters is very important and has a great influence on the segmentation results. The selected boundary-based algorithms are sensitive to the noise or texture. Better results are obtained with region-based algorithms, but a problem with the transition zones between the contrasted objects can be present.

Proceedings ArticleDOI
17 Oct 2005
TL;DR: A multilevel banded heuristic for computation of graph cuts that is motivated by the well-known narrow band algorithm in level set computation is introduced that drastically reduces both the running time and the memory consumption of graph cut while producing nearly the same segmentation result as the conventional graph cuts.
Abstract: In the short time since publication of Boykov and Jolly's seminal paper [2001], graph cuts have become well established as a leading method in 2D and 3D semi-automated image segmentation. Although this approach is computationally feasible for many tasks, the memory overhead and supralinear time complexity of leading algorithms result in an excessive computational burden for high-resolution data. In this paper, we introduce a multilevel banded heuristic for computation of graph cuts that is motivated by the well-known narrow band algorithm in level set computation. We perform a number of numerical experiments to show that this heuristic drastically reduces both the running time and the memory consumption of graph cuts while producing nearly the same segmentation result as the conventional graph cuts. Additionally, we are able to characterize the type of segmentation target for which our multilevel banded heuristic yields different results from the conventional graph cuts. The proposed method has been applied to both 2D and 3D images with promising results.

Proceedings ArticleDOI
20 Jun 2005
TL;DR: This paper demonstrates how a modification of the Rand index, the Normalized Probabilistic Rand (NPR) index, meets the requirements of large-scale performance evaluation of image segmentation.
Abstract: Despite significant advances in image segmentation techniques, evaluation of these techniques thus far has been largely subjective. Typically, the effectiveness of a new algorithm is demonstrated only by the presentation of a few segmented images and is otherwise left to subjective evaluation by the reader. Little effort has been spent on the design of perceptually correct measures to compare an automatic segmentation of an image to a set of hand-segmented examples of the same image. This paper demonstrates how a modification of the Rand index, the Normalized Probabilistic Rand (NPR) index, meets the requirements of large-scale performance evaluation of image segmentation. We show that the measure has a clear probabilistic interpretation as the maximum likelihood estimator of an underlying Gibbs model, can be correctly normalized to account for the inherent similarity in a set of ground truth images, and can be computed efficiently for large datasets. Results are presented on images from the publicly available Berkeley Segmentation dataset.
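As background for the abstract above, here is a minimal sketch of the plain Rand index between two labelings, plus its average over several hand segmentations. The normalization step that defines the NPR index is not reproduced, and the quadratic pair loop is for clarity only, not the efficient computation the paper refers to. Function names and toy labelings are assumptions for illustration.

```python
from itertools import combinations

def rand_index(seg_a, seg_b):
    """Fraction of pixel pairs on which two labelings agree.

    A pair agrees if both labelings put the two pixels in the same
    segment, or both put them in different segments. Label values
    themselves do not need to match between the two labelings.
    """
    agree = total = 0
    for i, j in combinations(range(len(seg_a)), 2):
        same_a = seg_a[i] == seg_a[j]
        same_b = seg_b[i] == seg_b[j]
        agree += same_a == same_b
        total += 1
    return agree / total

def mean_rand_index(seg, ground_truths):
    """Average Rand index of one segmentation against several references."""
    return sum(rand_index(seg, gt) for gt in ground_truths) / len(ground_truths)

# Flattened toy labelings of a four-pixel image.
machine = [0, 0, 1, 1]
hand_segmentations = [[5, 5, 7, 7], [0, 0, 0, 1]]
score = mean_rand_index(machine, hand_segmentations)
```

Because only pairwise same/different relations are compared, the measure is invariant to how segments are numbered, which makes it usable across human annotators who label regions differently.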

Journal ArticleDOI
TL;DR: The experimental results show that the proposed image segmentation system has the desired ability for the segmentation of color images in a variety of vision tasks.
Abstract: An image segmentation system is proposed for the segmentation of color images based on neural networks. In order to measure the color difference properly, image colors are represented in a modified L*u*v* color space. The segmentation system comprises unsupervised segmentation and supervised segmentation. The unsupervised segmentation is achieved by a two-level approach, i.e., color reduction and color clustering. In color reduction, image colors are projected into a small set of prototypes using self-organizing map (SOM) learning. In color clustering, simulated annealing (SA) seeks the optimal clusters from SOM prototypes. This two-level approach takes advantage of both SOM and SA, and can achieve near-optimal segmentation at a low computational cost. The supervised segmentation involves color learning and pixel classification. In color learning, a color prototype is defined to represent a spherical region in color space. A procedure of hierarchical prototype learning (HPL) is used to generate color prototypes of different sizes from the sample of object colors. These color prototypes provide a good estimate for object colors. The image pixels are classified by the matching of color prototypes. The experimental results show that the system has the desired ability for the segmentation of color images in a variety of vision tasks.

Proceedings ArticleDOI
17 Oct 2005
TL;DR: An improved iris segmentation and eyelid detection stage of the algorithm is developed and implemented, which leads to an increase of over 6% in the rank-one recognition rate.
Abstract: Iris is claimed to be one of the best biometrics. We have collected a large data set of iris images, intentionally sampling a range of quality broader than that used by current commercial iris recognition systems. We have re-implemented the Daugman-like iris recognition algorithm developed by Masek. We have also developed and implemented an improved iris segmentation and eyelid detection stage of the algorithm, and experimentally verified the improvement in recognition performance using the collected dataset. Compared to Masek's original segmentation approach, our improved segmentation algorithm leads to an increase of over 6% in the rank-one recognition rate.

Journal ArticleDOI
TL;DR: A generic framework for segmentation evaluation is introduced and a metric based on the distance between segmentation partitions is proposed to overcome some of the limitations of existing approaches.
Abstract: Image segmentation plays a major role in a broad range of applications. Evaluating the adequacy of a segmentation algorithm for a given application is a requisite both to allow the appropriate selection of segmentation algorithms as well as to tune their parameters for optimal performance. However, objective segmentation quality evaluation is far from being a solved problem. In this paper, a generic framework for segmentation evaluation is introduced after a brief review of previous work. A metric based on the distance between segmentation partitions is proposed to overcome some of the limitations of existing approaches. Symmetric and asymmetric distance metric alternatives are presented to meet the specificities of a wide class of applications. Experimental results confirm the potential of the proposed measures.

Proceedings ArticleDOI
05 Jan 2005
TL;DR: This paper proposes a measure that addresses the above concerns and has desirable properties such as accommodation of labeling errors at segment boundaries, region sensitive refinement, and compensation for differences in segment ambiguity between images.
Abstract: Quantitative evaluation and comparison of image segmentation algorithms is now feasible owing to the recent availability of collections of hand-labeled images. However, little attention has been paid to the design of measures to compare one segmentation result to one or more manual segmentations of the same image. Existing measures in statistics and computer vision literature suffer either from intolerance to labeling refinement, making them unsuitable for image segmentation, or from the existence of degenerate cases, making the process of training algorithms using the measures prone to failure. This paper surveys previous work on measures of similarity and illustrates scenarios where they are applicable for performance evaluation in computer vision. For the image segmentation problem, we propose a measure that addresses the above concerns and has desirable properties such as accommodation of labeling errors at segment boundaries, region sensitive refinement, and compensation for differences in segment ambiguity between images.

Journal ArticleDOI
TL;DR: A two-stage method for general image segmentation is proposed, which is capable of processing both textured and nontextured objects in a meaningful fashion, and introduces the weighted mean cut cost function for graph partitioning.
Abstract: The goal of segmentation is to partition an image into disjoint regions, in a manner consistent with human perception of the content. For unsupervised segmentation of general images, however, there is the competing requirement not to make prior assumptions about the scene. Here, a two-stage method for general image segmentation is proposed, which is capable of processing both textured and nontextured objects in a meaningful fashion. The first stage extracts texture features from the subbands of the dual-tree complex wavelet transform. Oriented median filtering is employed, to circumvent the problem of texture feature response at step edges in the image. From the processed feature images, a perceptual gradient function is synthesised, whose watershed transform provides an initial segmentation. The second stage of the algorithm groups together these primitive regions into meaningful objects. To achieve this, a novel spectral clustering technique is proposed, which introduces the weighted mean cut cost function for graph partitioning. The ability of the proposed algorithm to generalize across a variety of image types is demonstrated.

Journal ArticleDOI
TL;DR: The proposed algorithm divides the image into homogeneous regions using local thresholds, whose number and values are adapted by an automatic process that takes local information into consideration.

Proceedings ArticleDOI
20 Jun 2005
TL;DR: The main focus of this work is the integration of feature grouping and model based segmentation into one consistent framework based on partitioning a given set of image features using a likelihood function that is parameterized on the shape and location of potential individuals in the scene.
Abstract: The main focus of this work is the integration of feature grouping and model based segmentation into one consistent framework. The algorithm is based on partitioning a given set of image features using a likelihood function that is parameterized on the shape and location of potential individuals in the scene. Using a variant of the EM formulation, maximum likelihood estimates of both the model parameters and the grouping are obtained simultaneously. The resulting algorithm performs global optimization and generates accurate results even when decisions cannot be made using local context alone. An important feature of the algorithm is that the number of people in the scene is not modeled explicitly. As a result, no prior knowledge or assumed distributions are required. The approach is shown to be robust with respect to partial occlusion, shadows, and clutter, and can operate over a large range of challenging view angles including those that are parallel to the ground plane. Comparisons with existing crowd segmentation systems are made and the utility of coupling crowd segmentation with a temporal tracking system is demonstrated.

Proceedings ArticleDOI
29 Jul 2005
TL;DR: The theoretical foundation and difficulties of hand vein recognition are introduced, a distance-based matching method is used to match vein images, and the matching experiments indicate that this method is efficient.
Abstract: In this paper, the theoretical foundation and difficulties of hand vein recognition are first introduced. Then, the threshold segmentation method and thinning method for hand vein images are studied in depth, and a new threshold segmentation method and an improved conditional thinning method are proposed. A method of hand vein image feature extraction based on end points and crossing points is studied initially, and a matching method based on distances is used to match vein images. The matching experiments indicated that this method is efficient.

01 Jan 2005
TL;DR: This paper presents an evaluation of two popular segmentation algorithms, the mean shift-based segmentation algorithm and a graph-based segmentation scheme, and considers a hybrid method which combines the other two methods.
Abstract: Unsupervised image segmentation algorithms have matured to the point where they generate reasonable segmentations, and thus can begin to be incorporated into larger systems. A system designer now has an array of available algorithm choices; however, few objective numerical evaluations of these segmentation algorithms exist. As a first step towards filling this gap, this paper presents an evaluation of two popular segmentation algorithms, the mean shift-based segmentation algorithm and a graph-based segmentation scheme. We also consider a hybrid method which combines the other two methods. This quantitative evaluation is made possible by the recently proposed measure of segmentation correctness, the Normalized Probabilistic Rand (NPR) index, which allows a principled comparison between segmentations created by different algorithms, as well as segmentations on different images. For each algorithm, we consider its correctness as measured by the NPR index, as well as its stability with respect to changes in parameter settings and with respect to different images. An algorithm which produces correct segmentation results with a wide array of parameters on any one image, as well as correct segmentation results on multiple images with the same parameters, will be a useful, predictable and easily adjustable preprocessing step in a larger system. Our results are presented on the Berkeley image segmentation database, which contains 300 natural images along with several ground truth hand segmentations for each image. As opposed to previous results presented on this database, the algorithms we compare all use the same image features (position and colour) for segmentation, thereby making their outputs directly comparable.

Proceedings ArticleDOI
14 Nov 2005
TL;DR: A graph cuts-based image segmentation technique that incorporates an elliptical shape prior that is effective in segmenting vessels and lymph nodes from pelvic magnetic resonance images, as well as human faces.
Abstract: We present a graph cuts-based image segmentation technique that incorporates an elliptical shape prior. Inclusion of this shape constraint restricts the solution space of the segmentation result, increasing robustness to misleading information that results from noise, weak boundaries, and clutter. We argue that combining a shape prior with a graph cuts method suggests an iterative approach that updates an intermediate result to the desired solution. We first present the details of our method and then demonstrate its effectiveness in segmenting vessels and lymph nodes from pelvic magnetic resonance images, as well as human faces.

Journal ArticleDOI
TL;DR: It is concluded that the (multivariate) LBP texture model in combination with a hierarchical splitting segmentation framework is suitable for identifying objects and for quantifying their uncertainty.
Abstract: In this study, a segmentation procedure is proposed, based on grey-level and multivariate texture to extract spatial objects from an image scene. Object uncertainty was quantified to identify transition zones of objects with indeterminate boundaries. The Local Binary Pattern (LBP) operator, modelling texture, was integrated into a hierarchical splitting segmentation to identify homogeneous texture regions in an image. We proposed a multivariate extension of the standard univariate LBP operator to describe colour texture. The paper is illustrated with two case studies. The first considers an image with a composite of texture regions. The two LBP operators provided good segmentation results on both grey-scale and colour textures, depicted by accuracy values of 96% and 98%, respectively. The second case study involved segmentation of coastal land cover objects from a multi-spectral Compact Airborne Spectral Imager (CASI) image, of a coastal area in the UK. Segmentation based on the univariate LBP measure provided unsatisfactory segmentation results from a single CASI band (70% accuracy). A multivariate LBP-based segmentation of three CASI bands improved segmentation results considerably (77% accuracy). Uncertainty values for object building blocks provided valuable information for identification of object transition zones. We conclude that the (multivariate) LBP texture model in combination with a hierarchical splitting segmentation framework is suitable for identifying objects and for quantifying their uncertainty.
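For readers unfamiliar with the univariate LBP operator mentioned above, here is a minimal sketch of the basic 8-neighbour variant for a single pixel. The neighbour ordering, the `>=` comparison and the function name are conventions assumed for this example. The multivariate extension and the hierarchical splitting from the paper are not shown.

```python
# Clockwise 8-neighbourhood starting at the top-left pixel (an assumed convention).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """Basic 3x3 Local Binary Pattern code of the interior pixel (r, c).

    image is a 2D list of grey levels. Each neighbour that is at least as
    bright as the centre sets one bit of the 8-bit code.
    """
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


patch = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(lbp_code(patch, 1, 1))  # neighbours 6, 9, 8, 7 set bits 3..6 -> 120
```

A texture region is then typically summarized by the histogram of these codes over its pixels, which is what a region-splitting procedure would compare.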

01 Jan 2005
TL;DR: In this paper, a framework for image object-based change detection is proposed, which breaks down the n-dimensional problem to two main aspects, geometry and thematic content, which can be associated with the following questions: did a certain classified object change geometrically, class-wise, or both.
Abstract: With the advent of high resolution satellite imagery and airborne digital camera data, approaches that include contextual information are more commonly utilized. One way to include spatial dimensions in image analysis is to identify relatively homogeneous regions and to treat them as objects. Although segmentation is not a new concept, the number of image segmentation based applications has recently increased significantly. Concurrently, new methodological challenges arise. Standard change detection and accuracy assessment techniques mainly rely on statistically assessing individual pixels. Such assessments are not satisfactory for image objects which exhibit shape, boundary, homogeneity or topological information. These additional dimensions of information describing real world objects have to be assessed in multitemporal object-based image analysis. In this paper, problems associated with multitemporal object recognition are identified and a framework for image object-based change detection is suggested. For simplicity, this framework breaks down the n-dimensional problem to two main aspects, geometry and thematic content. These two aspects can be associated with the following questions: did a certain classified object change geometrically, class-wise, or both? When can we identify an object in one data set as being the same object in another data set? Do we need user-defined or application-specific thresholds for geometric overlap, shape-area relations, centroid movements, etc? This paper elucidates some specific challenges to change detection of objects and incorporates GIS-functionality into image analysis.
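The split into geometric and thematic change can be made concrete with a small sketch. Everything below is an assumption for illustration, not the framework from the paper: objects are represented as sets of pixel coordinates plus a class label, geometric overlap is measured as intersection over union, and the `min_overlap` threshold of 0.8 is an arbitrary, user-defined value of the kind the abstract asks about.

```python
import math


def centroid(pixels):
    """Mean (row, col) position of a set of pixel coordinates."""
    n = len(pixels)
    return (sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n)


def compare_objects(pixels_t1, class_t1, pixels_t2, class_t2, min_overlap=0.8):
    """Label the change of one image object between two dates (hypothetical helper)."""
    # Geometric overlap as intersection over union of the two footprints.
    overlap = len(pixels_t1 & pixels_t2) / len(pixels_t1 | pixels_t2)
    (r1, c1), (r2, c2) = centroid(pixels_t1), centroid(pixels_t2)
    geometric = overlap < min_overlap
    thematic = class_t1 != class_t2
    if geometric and thematic:
        change = "both"
    elif geometric:
        change = "geometric"
    elif thematic:
        change = "class"
    else:
        change = "unchanged"
    return {"change": change,
            "overlap": overlap,
            "centroid_shift": math.hypot(r2 - r1, c2 - c1)}


field = {(0, 0), (0, 1), (1, 0), (1, 1)}
shifted = {(0, 1), (0, 2), (1, 1), (1, 2)}
print(compare_objects(field, "crop", shifted, "crop"))
```

In practice the harder question raised in the abstract, deciding which object at the second date corresponds to which object at the first, has to be answered before a comparison like this can be applied.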

Journal ArticleDOI
TL;DR: An approach to optimal object segmentation in the geodesic active contour framework is presented with application to automated image segmentation and an efficient algorithm is presented for the computation of globally optimal segmentations.
Abstract: An approach to optimal object segmentation in the geodesic active contour framework is presented with application to automated image segmentation. The new segmentation scheme seeks the geodesic active contour of globally minimal energy under the sole restriction that it contains a specified internal point p_int. This internal point selects the object of interest and may be used as the only input parameter to yield a highly automated segmentation scheme. The image to be segmented is represented as a Riemannian space S with an associated metric induced by the image. The metric is an isotropic and decreasing function of the local image gradient at each point in the image, encoding the local homogeneity of image features. Optimal segmentations are then the closed geodesics which partition the object from the background with minimal similarity across the partitioning. An efficient algorithm is presented for the computation of globally optimal segmentations and applied to cell microscopy, x-ray, magnetic resonance and cDNA microarray images.
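To make the role of the image-induced metric concrete, the sketch below evaluates the energy of a given closed contour under one common choice of decreasing function, g = 1 / (1 + |grad I|^2). This particular g, the central-difference gradient and all function names are assumptions. The abstract does not state the exact metric, and the sketch only scores a contour; it does not perform the globally optimal search the paper describes.

```python
import math


def gradient_magnitude(image, r, c):
    """Central-difference gradient magnitude, with indices clamped at the border."""
    rows, cols = len(image), len(image[0])
    d_r = (image[min(r + 1, rows - 1)][c] - image[max(r - 1, 0)][c]) / 2.0
    d_c = (image[r][min(c + 1, cols - 1)] - image[r][max(c - 1, 0)]) / 2.0
    return math.hypot(d_r, d_c)


def metric(image, r, c):
    """Isotropic weight that decreases as the local gradient grows (assumed form)."""
    return 1.0 / (1.0 + gradient_magnitude(image, r, c) ** 2)


def contour_energy(image, contour):
    """Metric-weighted length of a closed polygonal contour given as (row, col) vertices."""
    energy = 0.0
    for k, (r0, c0) in enumerate(contour):
        r1, c1 = contour[(k + 1) % len(contour)]  # wrap around to close the contour
        length = math.hypot(r1 - r0, c1 - c0)
        # Trapezoidal rule: average the metric at the two segment endpoints.
        energy += 0.5 * (metric(image, r0, c0) + metric(image, r1, c1)) * length
    return energy


flat = [[0] * 5 for _ in range(5)]
step = [[0, 0, 10, 10, 10] for _ in range(5)]  # vertical edge between columns 1 and 2
square = [(1, 1), (1, 3), (3, 3), (3, 1)]
print(contour_energy(flat, square), contour_energy(step, square))
```

On the flat image the energy equals the Euclidean perimeter, while on the step image the side of the square that runs along the edge becomes cheap, which is why minimal-energy contours are drawn towards object boundaries.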

BookDOI
01 Jan 2005
TL;DR: A Basic Model for IVUS Image Simulation and State of the Art of Level Set Methods in Segmentation and Registration of Medical Imaging Modalities are presented.
Abstract: A Basic Model for IVUS Image Simulation.- Quantitative Functional Imaging with Positron Emission Tomography.- Advances in Magnetic Resonance Angiography and Physical Principles.- Recent Advances in the Level Set Method - Shape from Shading Models.- Wavelets in Medical Image Processing - Improving the Initialization, Convergence, and Memory Utilization for Deformable Models.- Level Set Segmentation of Biological Volume Database.- Advanced Segmentation (Level Set) Techniques.- A Regional-aided Color Geometric Snake.- Co-Volume Level Set Method in Subjective Surface Based Medical Image Segmentation.- Model-Based Brain Tissue Classification.- Supervised Texture Classification for Intravascular Tissue Characterization.- Medical Image Segmentation: Methods and Applications in Functional Imaging.- Automatic Segmentation of Pancreatic Tumors in Computed Tomography.- Computerized Analysis and Vasodilation Parameterization in Flow-Mediated Dilation Tests from Ultrasonic Image Sequences.- Statistical and Adaptive Approaches for Optimal Segmentation in Medical Images.- Automatic Analysis of Color Fundus Photographs and its Application to the Diagnosis of Diabetic Retinopathy.- Segmentation Issues in Carotid Artery Atherosclerotic Plaque Analysis with MRI.- Accurate Lumen Identification, Detection, and Quantification in MR Plaque Volumes.- Hessian-based Multiscale Enhancement, Description, and Quantification of Second-Order 3D Local Structures from Medical Volume Data.- A Knowledge-Based Scheme for Digital Mammography - Simultaneous Fuzzy Segmentation of Medical Images.- Computer Aided Diagnosis of Mammographic Calcification Clusters: Impact of Segmentation.- Computer Supported Segmentation of Radiological Data.- Medical Image Registration: Theory, Algorithms, and Case Studies.- State of the Art of Level Set Methods in Segmentation and Registration of Medical Imaging Modalities.- Three-Dimensional Rigid and Non-Rigid Image Registration for the Pelvis and Prostate.- Stereo and Temporal Retinal Image Registration by Mutual Information Maximization.- Quantification of Brain Aneurysm Dimensions from CTA for Surgical Planning of Coiling Intervention.- Inverse Consistent Image Registration.- A Computer-Aided Design System for Segmentation of Volumetric Images.- Inter-subject Non-Rigid Registration: an Overview with Classification and the Romeo Algorithm.- Elastic Registration for Biomedical Applications.- Cross-entropy, reversed cross-entropy, and symmetric divergence similarity measures for 3D image registration: a comparative study.- Quo Vadis, Atlas-Based Segmentation?.- Index.

Proceedings ArticleDOI
12 Dec 2005
TL;DR: An approach for interactive foreground extraction in still images that is currently being integrated into the GIMP is presented, derived from color signatures, a technique originating from image retrieval.
Abstract: The following article presents an approach for interactive foreground extraction in still images that is currently being integrated into the GIMP. The presented approach has been derived from color signatures, a technique originating from image retrieval. The article explains the algorithm and presents some benchmark results to show the improvements in speed and accuracy compared to state-of-the-art solutions. The article also describes how the algorithm can easily be adapted for video segmentation tasks.

Journal Article
TL;DR: In this paper, a new method of improved Pulse Coupled Neural Networks (PCNN) image segmentation based on the criterion of minimum cross-entropy is put forward according to the image processing and the improved PCNN model.
Abstract: The Pulse Coupled Neural Network (PCNN) is a new neural network model that was developed in the 1990s. In order to perform accurate image segmentation automatically, a new method of improved PCNN image segmentation based on the criterion of minimum cross-entropy is put forward, according to the image processing and the improved PCNN model. This approach is also based on the objective of the original image and the segmented image, and the differences of backgrounds. Computer simulation shows that the new method can determine the number of cyclic iterations and also select the best threshold automatically. A comparison is made with the PCNN segmentation method based on maximum Shannon entropy. The experimental results show that the proposed method is superior to the PCNN segmentation method based on the criterion of Shannon entropy. The method has fine adaptability and high accuracy for image segmentation.
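The minimum cross-entropy criterion mentioned above can be illustrated on its own, outside any PCNN. The sketch below is a plain exhaustive search over a grey-level histogram using the standard criterion, the sum over both classes of g * h(g) * log(g / class mean). It is not the improved PCNN method from the paper, and the function names and the toy histogram are assumptions.

```python
import math


def cross_entropy(histogram, t):
    """Cross-entropy criterion for threshold t; classes are g < t and g >= t.

    histogram[g] is the number of pixels with grey level g.
    Returns None when one of the two classes is empty.
    """
    eta = 0.0
    for levels in (range(0, t), range(t, len(histogram))):
        count = sum(histogram[g] for g in levels)
        if count == 0:
            return None  # empty class, so t is not a usable threshold
        mean = sum(g * histogram[g] for g in levels) / count
        # Grey level 0 and empty bins contribute nothing to the sum.
        eta += sum(g * histogram[g] * math.log(g / mean)
                   for g in levels if g > 0 and histogram[g] > 0)
    return eta


def min_cross_entropy_threshold(histogram):
    """Exhaustively pick the threshold with the smallest cross-entropy."""
    best_t, best_eta = None, math.inf
    for t in range(1, len(histogram)):
        eta = cross_entropy(histogram, t)
        if eta is not None and eta < best_eta:
            best_t, best_eta = t, eta
    return best_t


# Toy bimodal histogram with modes around grey levels 2-3 and 7-8.
toy_histogram = [0, 0, 10, 10, 0, 0, 0, 10, 10, 0]
print(min_cross_entropy_threshold(toy_histogram))
```

Any threshold that falls in the empty gap between the two modes gives the same criterion value here, so the search returns the first of them.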

Journal ArticleDOI
TL;DR: A combined key-frame extraction and object-based segmentation method is developed based on state-of-the-art video segmentation algorithms and statistical clustering approaches to provide a new combined shot-based and object-based framework for a variety of advanced video analysis tasks.
Abstract: Video segmentation has been an important and challenging issue for many video applications. Usually there are two different video segmentation approaches, i.e., shot-based segmentation that uses a set of key-frames to represent a video shot and object-based segmentation that partitions a video shot into objects and background. Representing a video shot at different semantic levels, two segmentation processes are usually implemented separately or independently for video analysis. In this paper, we propose a new approach to combine two video segmentation techniques together. Specifically, a combined key-frame extraction and object-based segmentation method is developed based on state-of-the-art video segmentation algorithms and statistical clustering approaches. On the one hand, shot-based segmentation can dramatically facilitate and enhance object-based segmentation by using key-frame extraction to select a few key-frames for statistical model training. On the other hand, object-based segmentation can be used to improve shot-based segmentation results by using model-based key-frame refinement. The proposed approach is able to integrate advantages of these two segmentation methods and provide a new combined shot-based and object-based framework for a variety of advanced video analysis tasks. Experimental results validate effectiveness and flexibility of the proposed video segmentation algorithm.