
Showing papers on "Image segmentation published in 2006"


Proceedings ArticleDOI
17 Jun 2006
TL;DR: This paper presents a method for recognizing scene categories based on approximate global geometric correspondence that exceeds the state of the art on the Caltech-101 database and achieves high accuracy on a large database of fifteen natural scene categories.
Abstract: This paper presents a method for recognizing scene categories based on approximate global geometric correspondence. This technique works by partitioning the image into increasingly fine sub-regions and computing histograms of local features found inside each sub-region. The resulting "spatial pyramid" is a simple and computationally efficient extension of an orderless bag-of-features image representation, and it shows significantly improved performance on challenging scene categorization tasks. Specifically, our proposed method exceeds the state of the art on the Caltech-101 database and achieves high accuracy on a large database of fifteen natural scene categories. The spatial pyramid framework also offers insights into the success of several recently proposed image descriptions, including Torralba’s "gist" and Lowe’s SIFT descriptors.
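The partition-and-histogram step described above can be sketched in a few lines. The grid sizes (2^l x 2^l cells at level l) and the pyramid-match level weights follow the description; the function name and the `word_map` input (a hypothetical H x W array of quantized visual-word indices) are illustrative assumptions:

```python
import numpy as np

def spatial_pyramid_histogram(word_map, n_words, levels=2):
    """Concatenate weighted histograms of quantized local features over
    increasingly fine grids (2^l x 2^l cells at level l)."""
    H, W = word_map.shape
    feats = []
    for level in range(levels + 1):
        cells = 2 ** level
        # pyramid-match weights: 1/2^L at level 0, 1/2^(L-l+1) above it
        weight = 2.0 ** -levels if level == 0 else 2.0 ** (level - levels - 1)
        for i in range(cells):
            for j in range(cells):
                block = word_map[i * H // cells:(i + 1) * H // cells,
                                 j * W // cells:(j + 1) * W // cells]
                hist = np.bincount(block.ravel(), minlength=n_words)
                feats.append(weight * hist)
    return np.concatenate(feats)
```

With `levels=0` this reduces to the orderless bag-of-features histogram; higher levels append the spatially localized cell histograms.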

8,736 citations


Journal ArticleDOI
TL;DR: The methods and software engineering philosophy behind this new tool, ITK-SNAP, are described and the results of validation experiments performed in the context of an ongoing child autism neuroimaging study are provided, finding that SNAP is a highly reliable and efficient alternative to manual tracing.

6,669 citations


Journal ArticleDOI
Leo Grady1
TL;DR: A novel method is proposed for performing multilabel, interactive image segmentation using combinatorial analogues of standard operators and principles from continuous potential theory, allowing it to be applied in arbitrary dimension on arbitrary graphs.
Abstract: A novel method is proposed for performing multilabel, interactive image segmentation. Given a small number of pixels with user-defined (or predefined) labels, one can analytically and quickly determine the probability that a random walker starting at each unlabeled pixel will first reach one of the prelabeled pixels. By assigning each pixel to the label for which the greatest probability is calculated, a high-quality image segmentation may be obtained. Theoretical properties of this algorithm are developed along with the corresponding connections to discrete potential theory and electrical circuits. This algorithm is formulated in discrete space (i.e., on a graph) using combinatorial analogues of standard operators and principles from continuous potential theory, allowing it to be applied in arbitrary dimension on arbitrary graphs.
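The probabilities above are obtained by solving the combinatorial Dirichlet problem L_U x = -B^T m on the graph Laplacian. A minimal dense numpy sketch for a 1-D chain of pixels (the function name and two-seed setup are our assumptions; a real implementation would use sparse solvers):

```python
import numpy as np

def random_walker_1d(weights, seeds):
    """Solve the combinatorial Dirichlet problem L_U x = -B^T m on a
    1-D chain graph: weights[i] joins nodes i and i+1, and `seeds`
    maps node index -> label probability (1.0 for the foreground seed,
    0.0 for the background seed)."""
    n = len(weights) + 1
    L = np.zeros((n, n))                      # combinatorial Laplacian
    for i, w in enumerate(weights):
        L[i, i] += w; L[i + 1, i + 1] += w
        L[i, i + 1] -= w; L[i + 1, i] -= w
    seeded = sorted(seeds)
    unseeded = [i for i in range(n) if i not in seeds]
    m = np.array([seeds[i] for i in seeded])
    L_U = L[np.ix_(unseeded, unseeded)]       # unseeded block
    B = L[np.ix_(seeded, unseeded)]           # seeded-unseeded block
    x = np.linalg.solve(L_U, -B.T @ m)        # harmonic extension
    probs = np.empty(n)
    probs[seeded] = m
    probs[unseeded] = x
    return probs
```

With unit weights the solution is the harmonic (linear) interpolation between the seeds, matching the intuition that a walker halfway along the chain is equally likely to reach either seed first.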

2,610 citations


Journal ArticleDOI
TL;DR: This application epitomizes the best features of combinatorial graph cuts methods in vision: global optima, practical efficiency, numerical robustness, ability to fuse a wide range of visual cues and constraints, unrestricted topological properties of segments, and applicability to N-D problems.
Abstract: Combinatorial graph cut algorithms have been successfully applied to a wide range of problems in vision and graphics. This paper focuses on possibly the simplest application of graph-cuts: segmentation of objects in image data. Despite its simplicity, this application epitomizes the best features of combinatorial graph cuts methods in vision: global optima, practical efficiency, numerical robustness, ability to fuse a wide range of visual cues and constraints, unrestricted topological properties of segments, and applicability to N-D problems. Graph cuts based approaches to object extraction have also been shown to have interesting connections with earlier segmentation methods such as snakes, geodesic active contours, and level-sets. The segmentation energies optimized by graph cuts combine boundary regularization with region-based properties in the same fashion as Mumford-Shah style functionals. We present motivation and detailed technical description of the basic combinatorial optimization framework for image segmentation via s/t graph cuts. After the general concept of using binary graph cut algorithms for object segmentation was first proposed and tested in Boykov and Jolly (2001), this idea was widely studied in computer vision and graphics communities. We provide links to a large number of known extensions based on iterative parameter re-estimation and learning, multi-scale or hierarchical approaches, narrow bands, and other techniques for demanding photo, video, and medical applications.

2,076 citations


Journal ArticleDOI
TL;DR: In this paper, a method for automated segmentation of the vasculature in retinal images is presented, which produces segmentations by classifying each image pixel as vessel or non-vessel, based on the pixel's feature vector.
Abstract: We present a method for automated segmentation of the vasculature in retinal images. The method produces segmentations by classifying each image pixel as vessel or nonvessel, based on the pixel's feature vector. Feature vectors are composed of the pixel's intensity and two-dimensional Gabor wavelet transform responses taken at multiple scales. The Gabor wavelet is capable of tuning to specific frequencies, thus allowing noise filtering and vessel enhancement in a single step. We use a Bayesian classifier with class-conditional probability density functions (likelihoods) described as Gaussian mixtures, yielding a fast classification, while being able to model complex decision surfaces. The probability distributions are estimated based on a training set of labeled pixels obtained from manual segmentations. The method's performance is evaluated on the publicly available DRIVE (Staal et al., 2004) and STARE (Hoover et al., 2000) databases of manually labeled images. On the DRIVE database, it achieves an area under the receiver operating characteristic curve of 0.9614, slightly superior to those presented by state-of-the-art approaches. We are making our implementation available as open-source MATLAB scripts for researchers interested in implementation details, evaluation, or development of methods.

1,435 citations


Book ChapterDOI
07 May 2006
TL;DR: A new approach to learning a discriminative model of object classes, incorporating appearance, shape and context information efficiently, is proposed, which is used for automatic visual recognition and semantic segmentation of photographs.
Abstract: This paper proposes a new approach to learning a discriminative model of object classes, incorporating appearance, shape and context information efficiently. The learned model is used for automatic visual recognition and semantic segmentation of photographs. Our discriminative model exploits novel features, based on textons, which jointly model shape and texture. Unary classification and feature selection are achieved using shared boosting to give an efficient classifier which can be applied to a large number of classes. Accurate image segmentation is achieved by incorporating these classifiers in a conditional random field. Efficient training of the model on very large datasets is achieved by exploiting both random feature selection and piecewise training methods. High classification and segmentation accuracy are demonstrated on three different databases: i) our own 21-object class database of photographs of real objects viewed under general lighting conditions, poses and viewpoints, ii) the 7-class Corel subset and iii) the 7-class Sowerby database used in [1]. The proposed algorithm gives competitive results for highly textured (e.g. grass, trees), highly structured (e.g. cars, faces, bikes, aeroplanes), and articulated objects (e.g. body, cow).

1,343 citations


Journal ArticleDOI
TL;DR: This paper presents a fuzzy c-means (FCM) algorithm that incorporates spatial information into the membership function for clustering and yields regions more homogeneous than those of other methods.

1,296 citations


Journal ArticleDOI
TL;DR: This paper reviews ultrasound segmentation methods, in a broad sense, focusing on techniques developed for medical B-mode ultrasound images, and presents a classification of methodology in terms of use of prior information.
Abstract: This paper reviews ultrasound segmentation methods, in a broad sense, focusing on techniques developed for medical B-mode ultrasound images. First, we present a review of articles by clinical application to highlight the approaches that have been investigated and degree of validation that has been done in different clinical domains. Then, we present a classification of methodology in terms of use of prior information. We conclude by selecting ten papers which have presented original ideas that have demonstrated particular clinical usefulness or potential specific to the ultrasound segmentation problem.

1,150 citations


Journal ArticleDOI
TL;DR: It is shown how certain nonconvex optimization problems that arise in image processing and computer vision can be restated as convex minimization problems, which allows, in particular, the finding of global minimizers via standard convex minimization schemes.
Abstract: We show how certain nonconvex optimization problems that arise in image processing and computer vision can be restated as convex minimization problems. This allows, in particular, the finding of global minimizers via standard convex minimization schemes.

1,142 citations


Proceedings ArticleDOI
20 Aug 2006
TL;DR: A novel stereo matching algorithm is proposed that utilizes color segmentation on the reference image and a self-adapting matching score that maximizes the number of reliable correspondences that is more robust to outliers.
Abstract: A novel stereo matching algorithm is proposed that utilizes color segmentation on the reference image and a self-adapting matching score that maximizes the number of reliable correspondences. The scene structure is modeled by a set of planar surface patches which are estimated using a new technique that is more robust to outliers. Instead of assigning a disparity value to each pixel, a disparity plane is assigned to each segment. The optimal disparity plane labeling is approximated by applying belief propagation. Experimental results using the Middlebury stereo test bed demonstrate the superior performance of the proposed method.

969 citations


Journal ArticleDOI
TL;DR: An automated method for the segmentation of the vascular network in retinal images that outperforms other solutions and approximates the average accuracy of a human observer without a significant degradation of sensitivity and specificity is presented.
Abstract: This paper presents an automated method for the segmentation of the vascular network in retinal images. The algorithm starts with the extraction of vessel centerlines, which are used as guidelines for the subsequent vessel filling phase. For this purpose, the outputs of four directional differential operators are processed in order to select connected sets of candidate points to be further classified as centerline pixels using vessel derived features. The final segmentation is obtained using an iterative region growing method that integrates the contents of several binary images resulting from vessel width dependent morphological filters. Our approach was tested on two publicly available databases and its results are compared with recently published methods. The results demonstrate that our algorithm outperforms other solutions and approximates the average accuracy of a human observer without a significant degradation of sensitivity and specificity.

Journal ArticleDOI
TL;DR: A learning-based method for recovering 3D human body pose from single images and monocular image sequences, embedded in a novel regressive tracking framework, using dynamics from the previous state estimate together with a learned regression value to disambiguate the pose.
Abstract: We describe a learning-based method for recovering 3D human body pose from single images and monocular image sequences. Our approach requires neither an explicit body model nor prior labeling of body parts in the image. Instead, it recovers pose by direct nonlinear regression against shape descriptor vectors extracted automatically from image silhouettes. For robustness against local silhouette segmentation errors, silhouette shape is encoded by histogram-of-shape-contexts descriptors. We evaluate several different regression methods: ridge regression, relevance vector machine (RVM) regression, and support vector machine (SVM) regression over both linear and kernel bases. The RVMs provide much sparser regressors without compromising performance, and kernel bases give a small but worthwhile improvement in performance. The loss of depth and limb labeling information often makes the recovery of 3D pose from single silhouettes ambiguous. To handle this, the method is embedded in a novel regressive tracking framework, using dynamics from the previous state estimate together with a learned regression value to disambiguate the pose. We show that the resulting system tracks long sequences stably. For realism and good generalization over a wide range of viewpoints, we train the regressors on images resynthesized from real human motion capture data. The method is demonstrated for several representations of full body pose, both quantitatively on independent but similar test data and qualitatively on real image sequences. Mean angular errors of 4-6° are obtained for a variety of walking motions.

Journal ArticleDOI
TL;DR: A review of the related literature presented in this paper reveals that better performance has been reported when limits are placed on distance, angle of view, and illumination conditions, and background complexity is low.
Abstract: In this paper, a new algorithm for vehicle license plate identification is proposed, on the basis of a novel adaptive image segmentation technique (sliding concentric windows) and connected component analysis in conjunction with a character recognition neural network. The algorithm was tested with 1334 natural-scene gray-level vehicle images of different backgrounds and ambient illumination. The camera focused on the plate, while the angle of view and the distance from the vehicle varied according to the experimental setup. The plates were properly segmented in 1287 of the 1334 input images (96.5%). The optical character recognition system is a two-layer probabilistic neural network (PNN) with topology 108-180-36, whose performance for entire-plate recognition reached 89.1%. The PNN is trained to identify alphanumeric characters from car license plates based on data obtained from algorithmic image processing. Combining the above two rates, the overall rate of success for the license-plate-recognition algorithm is 86.0%. A review of the related literature presented in this paper reveals that better performance (90% up to 95%) has been reported when limits are placed on distance, angle of view, and illumination conditions, and background complexity is low.

Proceedings ArticleDOI
17 Jun 2006
TL;DR: This work computes multiple segmentations of each image, then learns the object classes and chooses the correct segmentations, demonstrating that such an algorithm succeeds in automatically discovering many familiar objects in a variety of image datasets, including those from Caltech, MSRC and LabelMe.
Abstract: Given a large dataset of images, we seek to automatically determine the visually similar object and scene classes together with their image segmentation. To achieve this we combine two ideas: (i) that a set of segmented objects can be partitioned into visual object classes using topic discovery models from statistical text analysis; and (ii) that visual object classes can be used to assess the accuracy of a segmentation. To tie these ideas together we compute multiple segmentations of each image and then: (i) learn the object classes; and (ii) choose the correct segmentations. We demonstrate that such an algorithm succeeds in automatically discovering many familiar objects in a variety of image datasets, including those from Caltech, MSRC and LabelMe.

Journal ArticleDOI
TL;DR: An optimal surface detection method capable of simultaneously detecting multiple interacting surfaces, in which the optimality is controlled by the cost functions designed for individual surfaces and by several geometric constraints defining the surface smoothness and interrelations is developed.
Abstract: Efficient segmentation of globally optimal surfaces representing object boundaries in volumetric data sets is important and challenging in many medical image analysis applications. We have developed an optimal surface detection method capable of simultaneously detecting multiple interacting surfaces, in which the optimality is controlled by the cost functions designed for individual surfaces and by several geometric constraints defining the surface smoothness and interrelations. The method solves the surface segmentation problem by transforming it into computing a minimum s-t cut in a derived arc-weighted directed graph. The proposed algorithm has a low-order polynomial time complexity and is computationally efficient. It has been extensively validated on more than 300 computer-synthetic volumetric images, 72 CT-scanned data sets of different-sized plexiglas tubes, and tens of medical images spanning various imaging modalities. In all cases, the approach yielded highly accurate results. Our approach can be readily extended to higher-dimensional image segmentation.

Journal ArticleDOI
TL;DR: Common overlap measures are generalized to measure the total overlap of ensembles of labels defined on multiple test images and account for fractional labels using fuzzy set theory to allow a single "figure-of-merit" to be reported which summarises the results of a complex experiment by image pair, by label or overall.
Abstract: Measures of overlap of labelled regions of images, such as the Dice and Tanimoto coefficients, have been extensively used to evaluate image registration and segmentation algorithms. Modern studies can include multiple labels defined on multiple images yet most evaluation schemes report one overlap per labelled region, simply averaged over multiple images. In this paper, common overlap measures are generalized to measure the total overlap of ensembles of labels defined on multiple test images and account for fractional labels using fuzzy set theory. This framework allows a single "figure-of-merit" to be reported which summarises the results of a complex experiment by image pair, by label or overall. A complementary measure of error, the overlap distance, is defined which captures the spatial extent of the nonoverlapping part and is related to the Hausdorff distance computed on grey level images. The generalized overlap measures are validated on synthetic images for which the overlap can be computed analytically and used as similarity measures in nonrigid registration of three-dimensional magnetic resonance imaging (MRI) brain images. Finally, a pragmatic segmentation ground truth is constructed by registering a magnetic resonance atlas brain to 20 individual scans, and used with the overlap measures to evaluate publicly available brain segmentation algorithms.
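The fuzzy generalization of these overlaps reduces to a couple of numpy reductions: the intersection of fractional label maps is the voxelwise minimum, the union the voxelwise maximum (the function name and toy label maps below are illustrative):

```python
import numpy as np

def fuzzy_overlap(a, b):
    """Fuzzy Tanimoto (Jaccard) and Dice overlaps of two fractional
    label maps in [0, 1]: intersection = voxelwise min,
    union = voxelwise max."""
    inter = np.minimum(a, b).sum()
    tanimoto = inter / np.maximum(a, b).sum()
    dice = 2 * inter / (a.sum() + b.sum())
    return tanimoto, dice
```

On crisp (0/1) labels these formulas coincide with the classical Tanimoto and Dice coefficients, so the fuzzy versions are a strict generalization.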

Book ChapterDOI
07 May 2006
TL;DR: In this paper, the authors cast the problem of motion segmentation of feature trajectories as linear manifold finding problems, and propose a general framework to segment a wide range of motions including independent, articulated, rigid, nonrigid, degenerate, non-degenerate or any combination of them.
Abstract: We cast the problem of motion segmentation of feature trajectories as linear manifold finding problems and propose a general framework for motion segmentation under affine projections which utilizes two properties of trajectory data: geometric constraint and locality. The geometric constraint states that the trajectories of the same motion lie in a low dimensional linear manifold and different motions result in different linear manifolds; locality, by which we mean that in a transformed space a data point and its neighbors tend to lie in the same linear manifold, provides a cue for efficient estimation of these manifolds. Our algorithm estimates a number of linear manifolds, whose dimensions are unknown beforehand, and segments the trajectories accordingly. It first transforms and normalizes the trajectories; second, for each trajectory it estimates a local linear manifold through local sampling; then it derives the affinity matrix based on principal subspace angles between these estimated linear manifolds; finally, spectral clustering is applied to the matrix and gives the segmentation result. Our algorithm is general without restriction on the number of linear manifolds and without prior knowledge of the dimensions of the linear manifolds. We demonstrate in our experiments that it can segment a wide range of motions including independent, articulated, rigid, non-rigid, degenerate, non-degenerate or any combination of them. In some highly challenging cases where other state-of-the-art motion segmentation algorithms may fail, our algorithm gives expected results.
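The affinity between two estimated local manifolds rests on principal subspace angles, which can be computed from the SVD of the product of orthonormal bases. The QR-then-SVD recipe below is the standard numerical approach, sketched here for illustration rather than taken from the paper:

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles between the column spans of A and B: the
    cosines are the singular values of Qa^T Qb, where Qa and Qb are
    orthonormal bases obtained by QR factorization."""
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # clip to guard against round-off pushing cosines outside [-1, 1]
    return np.arccos(np.clip(s, -1.0, 1.0))
```

Identical subspaces yield all-zero angles, orthogonal ones yield pi/2, so a similarity such as exp(-sum(angles**2)) can serve as an affinity entry.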

Journal ArticleDOI
TL;DR: A novel 3D model-based algorithm is presented which performs viewpoint independent recognition of free-form objects and their segmentation in the presence of clutter and occlusions automatically and efficiently and is superior in terms of recognition rate and efficiency.
Abstract: Viewpoint independent recognition of free-form objects and their segmentation in the presence of clutter and occlusions is a challenging task. We present a novel 3D model-based algorithm which performs this task automatically and efficiently. A 3D model of an object is automatically constructed offline from its multiple unordered range images (views). These views are converted into multidimensional table representations (which we refer to as tensors). Correspondences are automatically established between these views by simultaneously matching the tensors of a view with those of the remaining views using a hash table-based voting scheme. This results in a graph of relative transformations used to register the views before they are integrated into a seamless 3D model. These models and their tensor representations constitute the model library. During online recognition, a tensor from the scene is simultaneously matched with those in the library by casting votes. Similarity measures are calculated for the model tensors which receive the most votes. The model with the highest similarity is transformed to the scene and, if it aligns accurately with an object in the scene, that object is declared as recognized and is segmented. This process is repeated until the scene is completely segmented. Experiments were performed on real and synthetic data comprised of 55 models and 610 scenes and an overall recognition rate of 95 percent was achieved. Comparison with the spin images revealed that our algorithm is superior in terms of recognition rate and efficiency.

Proceedings ArticleDOI
17 Jun 2006
TL;DR: It is demonstrated that this generative model for cosegmentation has the potential to improve a wide range of research: Object driven image retrieval, video tracking and segmentation, and interactive image editing.
Abstract: We introduce the term cosegmentation which denotes the task of segmenting simultaneously the common parts of an image pair. A generative model for cosegmentation is presented. Inference in the model leads to minimizing an energy with an MRF term encoding spatial coherency and a global constraint which attempts to match the appearance histograms of the common parts. This energy has not been proposed previously and its optimization is challenging and NP-hard. For this problem a novel optimization scheme which we call trust region graph cuts is presented. We demonstrate that this framework has the potential to improve a wide range of research: Object driven image retrieval, video tracking and segmentation, and interactive image editing. The power of the framework lies in its generality, the common part can be a rigid/non-rigid object (or scene), observed from different viewpoints or even similar objects of the same class.

Journal ArticleDOI
TL;DR: Based on the concepts of luminance-weighted chrominance blending and fast intrinsic distance computations, high-quality colorization results for still images and video are obtained at a fraction of the complexity and computational cost of previously reported techniques.
Abstract: Colorization, the task of coloring a grayscale image or video, involves assigning from the single dimension of intensity or luminance a quantity that varies in three dimensions, such as red, green, and blue channels. Mapping between intensity and color is, therefore, not unique, and colorization is ambiguous in nature and requires some amount of human interaction or external information. A computationally simple, yet effective, approach of colorization is presented in this paper. The method is fast and it can be conveniently used "on the fly," permitting the user to interactively get the desired results promptly after providing a reduced set of chrominance scribbles. Based on the concepts of luminance-weighted chrominance blending and fast intrinsic distance computations, high-quality colorization results for still images and video are obtained at a fraction of the complexity and computational cost of previously reported techniques. Possible extensions of the algorithm introduced here include the capability of changing the colors of an existing color image or video, as well as changing the underlying luminance, and many other special effects demonstrated here.

Journal ArticleDOI
01 Jul 2006
TL;DR: A new algorithm is presented that processes geometric models and efficiently discovers and extracts a compact representation of their Euclidean symmetries, which captures important high-level information about the structure of a geometric model which enables a large set of further processing operations.
Abstract: "Symmetry is a complexity-reducing concept [...]; seek it everywhere." - Alan J. Perlis

Many natural and man-made objects exhibit significant symmetries or contain repeated substructures. This paper presents a new algorithm that processes geometric models and efficiently discovers and extracts a compact representation of their Euclidean symmetries. These symmetries can be partial, approximate, or both. The method is based on matching simple local shape signatures in pairs and using these matches to accumulate evidence for symmetries in an appropriate transformation space. A clustering stage extracts potential significant symmetries of the object, followed by a verification step. Based on a statistical sampling analysis, we provide theoretical guarantees on the success rate of our algorithm. The extracted symmetry graph representation captures important high-level information about the structure of a geometric model which in turn enables a large set of further processing operations, including shape compression, segmentation, consistent editing, symmetrization, indexing for retrieval, etc.

Journal ArticleDOI
TL;DR: High detection accuracy and overall Kappa were achieved by the OB-Reflectance method in temperate forests using three SPOT-HRV images covering a 10-year period.

Book ChapterDOI
07 May 2006
TL;DR: A set of energy minimization benchmarks is described, which are used to compare the solution quality and running time of several common energy minimization algorithms, along with a general-purpose software interface that allows vision researchers to easily switch between optimization methods with minimal overhead.
Abstract: One of the most exciting advances in early vision has been the development of efficient energy minimization algorithms. Many early vision tasks require labeling each pixel with some quantity such as depth or texture. While many such problems can be elegantly expressed in the language of Markov Random Fields (MRF's), the resulting energy minimization problems were widely viewed as intractable. Recently, algorithms such as graph cuts and loopy belief propagation (LBP) have proven to be very powerful: for example, such methods form the basis for almost all the top-performing stereo methods. Unfortunately, most papers define their own energy function, which is minimized with a specific algorithm of their choice. As a result, the tradeoffs among different energy minimization algorithms are not well understood. In this paper we describe a set of energy minimization benchmarks, which we use to compare the solution quality and running time of several common energy minimization algorithms. We investigate three promising recent methods—graph cuts, LBP, and tree-reweighted message passing—as well as the well-known older iterated conditional modes (ICM) algorithm. Our benchmark problems are drawn from published energy functions used for stereo, image stitching and interactive segmentation. We also provide a general-purpose software interface that allows vision researchers to easily switch between optimization methods with minimal overhead. We expect that the availability of our benchmarks and interface will make it significantly easier for vision researchers to adopt the best method for their specific problems. Benchmarks, code, results and images are available at http://vision.middlebury.edu/MRF.
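Of the compared methods, ICM is simple enough to sketch directly: sweep the pixels, greedily re-labeling each to minimize its local energy, here a unary data cost plus a Potts smoothness cost over the 4-neighborhood. The binary-denoising setup and unit weights below are our own illustration, not the benchmark's energy functions:

```python
import numpy as np

def icm_denoise(obs, beta=1.0, iters=5):
    """Iterated conditional modes on a binary Potts MRF: repeatedly
    sweep the image, setting each pixel to the label that minimizes
    its data cost plus beta times the number of disagreeing
    4-neighbors."""
    labels = obs.copy()
    H, W = obs.shape
    for _ in range(iters):
        for i in range(H):
            for j in range(W):
                best, best_e = labels[i, j], float("inf")
                for lab in (0, 1):
                    e = float(lab != obs[i, j])          # unary (data) term
                    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < H and 0 <= nj < W:
                            e += beta * (lab != labels[ni, nj])
                    if e < best_e:
                        best, best_e = lab, e
                labels[i, j] = best
    return labels
```

Because each update can only lower the total energy, ICM converges quickly but only to a local minimum, which is exactly why the benchmark contrasts it with graph cuts, LBP, and tree-reweighted message passing.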

Journal ArticleDOI
TL;DR: A fast and robust hybrid method of super-resolution and demosaicing, based on a maximum a posteriori estimation technique by minimizing a multiterm cost function is proposed.
Abstract: In the last two decades, two related categories of problems have been studied independently in image restoration literature: super-resolution and demosaicing. A closer look at these problems reveals the relation between them, and, as conventional color digital cameras suffer from both low-spatial resolution and color-filtering, it is reasonable to address them in a unified context. In this paper, we propose a fast and robust hybrid method of super-resolution and demosaicing, based on a maximum a posteriori estimation technique by minimizing a multiterm cost function. The L1 norm is used for measuring the difference between the projected estimate of the high-resolution image and each low-resolution image, removing outliers in the data and errors due to possibly inaccurate motion estimation. Bilateral regularization is used for spatially regularizing the luminance component, resulting in sharp edges and forcing interpolation along the edges and not across them. Simultaneously, Tikhonov regularization is used to smooth the chrominance components. Finally, an additional regularization term is used to force similar edge location and orientation in different color channels. We show that the minimization of the total cost function is relatively easy and fast. Experimental results on synthetic and real data sets confirm the effectiveness of our method.

Journal ArticleDOI
TL;DR: This paper proposes shape dissimilarity measures on the space of level set functions which are analytically invariant under the action of certain transformation groups, and a statistical shape prior that accurately encodes multiple fairly distinct training shapes, allowing it to be applied in arbitrary dimension on arbitrary graphs.
Abstract: In this paper, we make two contributions to the field of level set based image segmentation. Firstly, we propose shape dissimilarity measures on the space of level set functions which are analytically invariant under the action of certain transformation groups. The invariance is obtained by an intrinsic registration of the evolving level set function. In contrast to existing approaches to invariance in the level set framework, this closed-form solution removes the need to iteratively optimize explicit pose parameters. The resulting shape gradient is more accurate in that it takes into account the effect of boundary variation on the object's pose. Secondly, based on these invariant shape dissimilarity measures, we propose a statistical shape prior that allows multiple fairly distinct training shapes to be accurately encoded. This prior constitutes an extension of kernel density estimators to the level set domain. In contrast to the commonly employed Gaussian distribution, such nonparametric density estimators are suited to model arbitrary distributions. We demonstrate the advantages of this multi-modal shape prior applied to the segmentation and tracking of a partially occluded walking person in a video sequence, and on the segmentation of the left ventricle in cardiac ultrasound images. We give quantitative results on segmentation accuracy and on the dependency of segmentation results on the number of training shapes.

Journal ArticleDOI
TL;DR: A framework for evaluating image segmentation algorithms is described, based on three factors (precision, accuracy, and efficiency) that need to be considered for both recognition and delineation.

Journal ArticleDOI
TL;DR: An automated system that integrates a series of advanced analysis methods that can be used to segment, classify, and track individual cells in a living cell population over a few days is presented.
Abstract: Quantitative measurement of cell cycle progression in individual cells over time is important in understanding drug treatment effects on cancer cells. Recent advances in time-lapse fluorescence microscopy imaging have provided an important tool to study the cell cycle process under different conditions of perturbation. However, existing computational imaging methods are rather limited in analyzing and tracking such time-lapse datasets, and manual analysis is unreasonably time-consuming and subject to observer variance. This paper presents an automated system that integrates a series of advanced analysis methods to fill this gap. The cellular image analysis methods can be used to segment, classify, and track individual cells in a living cell population over a few days. Experimental results show that the proposed method is efficient and effective in cell tracking and phase identification.

Proceedings ArticleDOI
26 Mar 2006
TL;DR: This work addresses the drawbacks of the conventional watershed algorithm when it is applied to medical images by using k-means clustering to produce a primary segmentation of the image before the authors apply their improved watershed segmentation algorithm to it.
Abstract: We propose a methodology that incorporates k-means and an improved watershed segmentation algorithm for medical image segmentation. The use of the conventional watershed algorithm for medical image analysis is widespread because of its advantages, such as always being able to produce a complete division of the image. However, its drawbacks include over-segmentation and sensitivity to false edges. We address the drawbacks of the conventional watershed algorithm when it is applied to medical images by using k-means clustering to produce a primary segmentation of the image before we apply our improved watershed segmentation algorithm to it. The k-means clustering is an unsupervised learning algorithm, while the improved watershed segmentation algorithm makes use of automated thresholding on the gradient magnitude map and post-segmentation merging on the initial partitions to reduce the number of false edges and over-segmentation. By comparing the number of partitions in the segmentation maps of 50 images, we showed that our proposed methodology produced segmentation maps that have 92% fewer partitions than the segmentation maps produced by the conventional watershed algorithm.
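The two-stage idea above (an unsupervised primary segmentation, then a watershed on the gradient magnitude seeded from it) can be sketched as follows. This is an illustrative approximation, not the authors' pipeline: it uses a plain 1-D k-means on intensities, `scipy.ndimage.watershed_ift` as the watershed, and eroded cluster regions as markers; the automated gradient thresholding and post-segmentation merging steps are omitted, and all helper names are ours.

```python
import numpy as np
from scipy import ndimage

def kmeans_1d(values, k, n_iter=20):
    """Plain k-means on scalar intensities, with quantile initialization for determinism."""
    centers = np.quantile(values.astype(float), np.linspace(0, 1, k))
    labels = np.zeros(len(values), dtype=int)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    return labels, centers

def kmeans_watershed(image, k=2):
    """Primary k-means segmentation, then marker-based watershed on the gradient magnitude."""
    labels, _ = kmeans_1d(image.ravel(), k)
    primary = labels.reshape(image.shape)
    # Gradient magnitude map (Sobel), quantized to uint8 as watershed_ift requires.
    gx = ndimage.sobel(image.astype(float), axis=0)
    gy = ndimage.sobel(image.astype(float), axis=1)
    grad8 = (255 * np.hypot(gx, gy) / (np.hypot(gx, gy).max() + 1e-12)).astype(np.uint8)
    # Seed markers from eroded cluster interiors, i.e., pixels away from cluster edges.
    markers = np.zeros(image.shape, dtype=np.int16)
    for j in range(k):
        markers[ndimage.binary_erosion(primary == j)] = j + 1
    return ndimage.watershed_ift(grad8, markers)
```

Seeding the watershed only from cluster interiors is what curbs over-segmentation: instead of flooding from every local minimum of the gradient map, the flood starts from one marker region per intensity cluster.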

Journal ArticleDOI
TL;DR: A novel marker-controlled watershed based on mathematical morphology is proposed, which can effectively segment clustered cells with less oversegmentation and design a tracking method based on modified mean shift algorithm, in which several kernels with adaptive scale, shape, and direction are designed.
Abstract: It is important to observe and study cancer cells' cycle progression in order to better understand drug effects on cancer cells. Time-lapse microscopy imaging serves as an important method to measure the cycle progression of individual cells in a large population. Since manual analysis is unreasonably time consuming for the large volumes of time-lapse image data, automated image analysis is proposed. Existing approaches dealing with time-lapse image data are rather limited and often give inaccurate analysis results, especially in segmenting and tracking individual cells in a cell population. In this paper, we present a new approach to segment and track cell nuclei in time-lapse fluorescence image sequences. First, we propose a novel marker-controlled watershed based on mathematical morphology, which can effectively segment clustered cells with less oversegmentation. To further segment undersegmented cells or to merge oversegmented cells, context information among neighboring frames is employed, which proves to be an effective strategy. Then, we design a tracking method based on a modified mean shift algorithm, in which several kernels with adaptive scale, shape, and direction are designed. Finally, we combine mean shift and Kalman filtering to achieve a more robust cell nuclei tracking method than existing ones. Experimental results show that our method can obtain 98.8% segmentation accuracy, 97.4% cell division tracking accuracy, and 97.6% cell tracking accuracy.
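The mean-shift step at the heart of such a tracker can be illustrated in isolation: starting from the previous position, the tracker repeatedly moves to the kernel-weighted centroid of a per-pixel likelihood map (e.g., the probability that a pixel belongs to the tracked nucleus) until the position stops changing. This toy uses a single fixed isotropic Gaussian kernel; the paper's adaptive scale/shape/direction kernels and the Kalman filter combination are omitted, and all names are ours.

```python
import numpy as np

def mean_shift(likelihood, start, bandwidth=4.0, n_iter=30, tol=1e-3):
    """Iterate to the kernel-weighted centroid of a 2-D likelihood map (toy tracker step)."""
    h, w = likelihood.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.asarray(start, dtype=float)
    for _ in range(n_iter):
        d2 = (ys - pos[0]) ** 2 + (xs - pos[1]) ** 2
        weights = likelihood * np.exp(-d2 / (2 * bandwidth**2))
        total = weights.sum()
        if total <= 0:  # kernel window fell on empty support; keep the old position
            break
        new = np.array([(weights * ys).sum(), (weights * xs).sum()]) / total
        if np.linalg.norm(new - pos) < tol:
            pos = new
            break
        pos = new
    return pos
```

Each iteration is a hill-climbing move on the kernel-smoothed likelihood, so the tracker locks onto the nearest mode, which is why a good initialization from the previous frame (or a Kalman prediction) matters.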

Journal Article
TL;DR: A Bayesian framework for parsing images into their constituent visual patterns that optimizes the posterior probability and outputs a scene representation as a “parsing graph”, in a spirit similar to parsing sentences in speech and natural language is presented.
Abstract: In this chapter we present a Bayesian framework for parsing images into their constituent visual patterns. The parsing algorithm optimizes the posterior probability and outputs a scene representation as a parsing graph, in a spirit similar to parsing sentences in speech and natural language. The algorithm constructs the parsing graph and re-configures it dynamically using a set of moves, which are mostly reversible Markov chain jumps. This computational framework integrates two popular inference approaches - generative (top-down) methods and discriminative (bottom-up) methods. The former formulates the posterior probability in terms of generative models for images defined by likelihood functions and priors. The latter computes discriminative probabilities based on a sequence (cascade) of bottom-up tests/filters. In our Markov chain algorithm design, the posterior probability, defined by the generative models, is the invariant (target) probability for the Markov chain, and the discriminative probabilities are used to construct proposal probabilities to drive the Markov chain. Intuitively, the bottom-up discriminative probabilities activate top-down generative models. In this chapter, we focus on two types of visual patterns - generic visual patterns, such as texture and shading, and object patterns including human faces and text. These types of patterns compete and cooperate to explain the image, and so image parsing unifies image segmentation, object detection, and recognition (if we use generic visual patterns only, image parsing corresponds to image segmentation [48]). We illustrate our algorithm on natural images of complex city scenes and show examples where image segmentation can be improved by allowing object-specific knowledge to disambiguate low-level segmentation cues, and conversely where object detection can be improved by using generic visual patterns to explain away shadows and occlusions.
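The division of labor described above (a generative posterior as the chain's invariant distribution, discriminative probabilities as its proposals) can be illustrated with a minimal independence Metropolis-Hastings sampler. The real algorithm operates on parsing graphs with reversible split/merge and model-switching jumps; this toy discrete state space and all names are ours.

```python
import numpy as np

def metropolis_hastings(p, q, n_steps=20000, seed=0):
    """Independence MH: target p (generative posterior), proposal q (bottom-up approximation).

    Returns the empirical visit frequencies, which converge to p regardless of q
    as long as q gives every state positive probability.
    """
    rng = np.random.default_rng(seed)
    states = np.arange(len(p))
    x = rng.choice(states, p=q)          # initialize from the bottom-up proposal
    counts = np.zeros(len(p))
    for _ in range(n_steps):
        x_new = rng.choice(states, p=q)  # proposal does not depend on the current state
        accept = min(1.0, (p[x_new] * q[x]) / (p[x] * q[x_new]))
        if rng.random() < accept:
            x = x_new
        counts[x] += 1
    return counts / n_steps
```

The acceptance ratio corrects the proposal toward the target, so the better q approximates p (i.e., the sharper the bottom-up tests), the higher the acceptance rate and the faster the chain mixes; this is the "discriminative probabilities drive the Markov chain" intuition in miniature.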