
Showing papers on "Image segmentation" published in 2011


Journal ArticleDOI
TL;DR: This paper investigates two fundamental problems in computer vision: contour detection and image segmentation and presents state-of-the-art algorithms for both of these tasks.
Abstract: This paper investigates two fundamental problems in computer vision: contour detection and image segmentation. We present state-of-the-art algorithms for both of these tasks. Our contour detector combines multiple local cues into a globalization framework based on spectral clustering. Our segmentation algorithm consists of generic machinery for transforming the output of any contour detector into a hierarchical region tree. In this manner, we reduce the problem of image segmentation to that of contour detection. Extensive experimental evaluation demonstrates that both our contour detection and segmentation methods significantly outperform competing algorithms. The automatically generated hierarchical segmentations can be interactively refined by user-specified annotations. Computation at multiple image resolutions provides a means of coupling our system to recognition applications.

5,068 citations
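To illustrate the paper's reduction of segmentation to contour detection, here is a minimal sketch (not the authors' gPb-owt-ucm implementation; the threshold levels and the random stand-in contour map are assumptions) that builds a crude region hierarchy by labeling the connected components left after removing contour pixels at increasing strength thresholds:

```python
import numpy as np
from scipy import ndimage as ndi

def hierarchy_from_contours(contour_map, levels=(0.1, 0.3, 0.5, 0.7)):
    """Label the connected regions left after removing contour pixels at
    each strength threshold; raising the threshold removes fewer contour
    pixels, so regions merge and the levels run from fine to coarse."""
    hierarchy = []
    for t in levels:
        interior = contour_map < t            # pixels not on a strong contour
        labels, _ = ndi.label(interior)       # connected components = regions
        hierarchy.append(labels)
    return hierarchy

# usage: feed in any contour detector's output scaled to [0, 1]
contours = np.random.rand(64, 64)             # stand-in for a real contour map
region_levels = hierarchy_from_contours(contours)
```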


Journal ArticleDOI
TL;DR: A first-order primal-dual algorithm for non-smooth convex optimization problems with known saddle-point structure can achieve O(1/N^2) convergence on problems where the primal or the dual objective is uniformly convex, and can show linear convergence, i.e. O(ω^N) for some ω ∈ (0,1), on smooth problems.
Abstract: In this paper we study a first-order primal-dual algorithm for non-smooth convex optimization problems with known saddle-point structure. We prove convergence to a saddle-point with rate O(1/N) in finite dimensions for the complete class of problems. We further show accelerations of the proposed algorithm to yield improved rates on problems with some degree of smoothness. In particular we show that we can achieve O(1/N^2) convergence on problems where the primal or the dual objective is uniformly convex, and we can show linear convergence, i.e. O(ω^N) for some ω ∈ (0,1), on smooth problems. The wide applicability of the proposed algorithm is demonstrated on several imaging problems such as image denoising, image deconvolution, image inpainting, motion estimation and multi-label image segmentation.

4,487 citations
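As a concrete instance of the algorithm, the sketch below (a minimal NumPy implementation for the TV/ROF denoising case, one of the imaging problems listed; the step sizes and λ are illustrative) runs the primal-dual iteration with the over-relaxation step θ = 1 and τσ‖∇‖² ≤ 1, using ‖∇‖² ≤ 8:

```python
import numpy as np

def grad(u):
    """Forward differences with Neumann boundary (zero at the last row/col)."""
    gx, gy = np.zeros_like(u), np.zeros_like(u)
    gx[:, :-1] = u[:, 1:] - u[:, :-1]
    gy[:-1, :] = u[1:, :] - u[:-1, :]
    return gx, gy

def div(px, py):
    """Discrete divergence, the negative adjoint of grad."""
    dx, dy = np.zeros_like(px), np.zeros_like(py)
    dx[:, 0], dx[:, 1:-1], dx[:, -1] = px[:, 0], px[:, 1:-1] - px[:, :-2], -px[:, -2]
    dy[0, :], dy[1:-1, :], dy[-1, :] = py[0, :], py[1:-1, :] - py[:-2, :], -py[-2, :]
    return dx + dy

def tv_denoise_primal_dual(f, lam=8.0, n_iter=200):
    """min_u ||grad u||_1 + lam/2 ||u - f||^2 via the primal-dual iteration."""
    tau = sigma = 1.0 / np.sqrt(8.0)          # tau * sigma * ||grad||^2 <= 1
    u, u_bar = f.copy(), f.copy()
    px, py = np.zeros_like(f), np.zeros_like(f)
    for _ in range(n_iter):
        gx, gy = grad(u_bar)                  # dual ascent on p
        px, py = px + sigma * gx, py + sigma * gy
        norm = np.maximum(1.0, np.sqrt(px**2 + py**2))
        px, py = px / norm, py / norm         # project onto ||p|| <= 1
        u_old = u                             # primal descent on u (prox of G)
        u = (u + tau * div(px, py) + tau * lam * f) / (1.0 + tau * lam)
        u_bar = 2 * u - u_old                 # over-relaxation, theta = 1
    return u
```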


Proceedings ArticleDOI
20 Jun 2011
TL;DR: This work proposes a regional contrast based saliency extraction algorithm, which simultaneously evaluates global contrast differences and spatial coherence, and consistently outperforms existing saliency detection methods.
Abstract: Automatic estimation of salient object regions across images, without any prior assumption or knowledge of the contents of the corresponding scenes, enhances many computer vision and computer graphics applications. We introduce a regional contrast based salient object detection algorithm, which simultaneously evaluates global contrast differences and spatial weighted coherence scores. The proposed algorithm is simple, efficient, naturally multi-scale, and produces full-resolution, high-quality saliency maps. These saliency maps are further used to initialize a novel iterative version of GrabCut, namely SaliencyCut, for high quality unsupervised salient object segmentation. We extensively evaluated our algorithm using traditional salient object detection datasets, as well as a more challenging Internet image dataset. Our experimental results demonstrate that our algorithm consistently outperforms 15 existing salient object detection and segmentation methods, yielding higher precision and better recall rates. We also show that our algorithm can be used to efficiently extract salient object masks from Internet images, enabling effective sketch-based image retrieval (SBIR) via simple shape comparisons. Despite such noisy internet images, where the saliency regions are ambiguous, our saliency guided image retrieval achieves a superior retrieval rate compared with state-of-the-art SBIR methods, and additionally provides important target object region information.

3,653 citations
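The global-contrast idea can be made concrete with a tiny histogram-based variant (grayscale only; the paper's actual method works on color regions with spatially weighted coherence, so treat this as an HC-style simplification):

```python
import numpy as np

def histogram_contrast_saliency(gray):
    """Saliency of each pixel = summed intensity distance to all other
    pixels, computed in O(256^2) through the intensity histogram."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    vals = np.arange(256)
    # contrast of each intensity level against the whole image
    sal_per_level = (np.abs(vals[:, None] - vals[None, :]) * hist[None, :]).sum(axis=1)
    sal = sal_per_level[gray.astype(np.uint8)]
    return sal / sal.max()                    # normalize to [0, 1]
```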


Proceedings Article
12 Dec 2011
TL;DR: This paper considers fully connected CRF models defined on the complete set of pixels in an image and proposes a highly efficient approximate inference algorithm in which the pairwise edge potentials are defined by a linear combination of Gaussian kernels.
Abstract: Most state-of-the-art techniques for multi-class image segmentation and labeling use conditional random fields defined over pixels or image regions. While region-level models often feature dense pairwise connectivity, pixel-level models are considerably larger and have only permitted sparse graph structures. In this paper, we consider fully connected CRF models defined on the complete set of pixels in an image. The resulting graphs have billions of edges, making traditional inference algorithms impractical. Our main contribution is a highly efficient approximate inference algorithm for fully connected CRF models in which the pairwise edge potentials are defined by a linear combination of Gaussian kernels. Our experiments demonstrate that dense connectivity at the pixel level substantially improves segmentation and labeling accuracy.

3,233 citations
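A drastically simplified sketch of the mean-field inference follows: it keeps only a spatial Gaussian kernel (the paper additionally uses a bilateral color-dependent kernel and a fast high-dimensional filter) with Potts compatibility, so the kernel width and weight are assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def meanfield_dense_crf(unary, sigma=3.0, w=3.0, n_iter=5):
    """unary: (H, W, L) negative log-probabilities per label. Mean-field
    updates where message passing over the fully connected graph is
    approximated by Gaussian filtering of the current marginals."""
    q = np.exp(-unary)
    q /= q.sum(-1, keepdims=True)
    for _ in range(n_iter):
        # per-label message: aggregate nearby marginals, excluding self
        msg = np.stack([gaussian_filter(q[..., l], sigma) - q[..., l]
                        for l in range(q.shape[-1])], axis=-1)
        # Potts compatibility: each label is penalized by mass on other labels
        pairwise = w * (msg.sum(-1, keepdims=True) - msg)
        q = np.exp(-unary - pairwise)
        q /= q.sum(-1, keepdims=True)
    return q
```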


Journal ArticleDOI
TL;DR: Efficiency figures show that the proposed technique for motion detection outperforms recent and proven state-of-the-art methods in terms of both computation speed and detection rate.
Abstract: This paper presents a technique for motion detection that incorporates several innovative mechanisms. For example, our proposed technique stores, for each pixel, a set of values taken in the past at the same location or in the neighborhood. It then compares this set to the current pixel value in order to determine whether that pixel belongs to the background, and adapts the model by choosing randomly which values to substitute from the background model. This approach differs from those based upon the classical belief that the oldest values should be replaced first. Finally, when the pixel is found to be part of the background, its value is propagated into the background model of a neighboring pixel. We describe our method in full detail (including pseudo-code and the parameter values used) and compare it to other background subtraction techniques. Efficiency figures show that our method outperforms recent and proven state-of-the-art methods in terms of both computation speed and detection rate. We also analyze the performance of a version of our algorithm downscaled to the absolute minimum of one comparison and one byte of memory per pixel. It appears that even such a simplified version of our algorithm performs better than mainstream techniques.

1,777 citations
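The core of the update scheme can be sketched in a few lines of NumPy (grayscale frames; the constants follow common descriptions of this family of methods, and the propagation step into neighboring pixels' models is omitted for brevity, so this is an approximation, not the authors' code):

```python
import numpy as np

N_SAMPLES, RADIUS, MIN_MATCHES, SUBSAMPLE = 20, 20, 2, 16
rng = np.random.default_rng(0)

class SampleBasedBackground:
    def __init__(self, first_frame):
        # each pixel keeps a set of past values, seeded from the first frame
        self.samples = np.repeat(first_frame[None], N_SAMPLES, axis=0)

    def apply(self, frame):
        dist = np.abs(self.samples.astype(int) - frame.astype(int))
        background = (dist < RADIUS).sum(axis=0) >= MIN_MATCHES
        # conservative update: only background pixels, with probability
        # 1/SUBSAMPLE, overwrite a *randomly chosen* sample (not the oldest)
        update = background & (rng.integers(0, SUBSAMPLE, frame.shape) == 0)
        which = rng.integers(0, N_SAMPLES, frame.shape)
        ys, xs = np.nonzero(update)
        self.samples[which[ys, xs], ys, xs] = frame[ys, xs]
        return ~background                    # foreground mask
```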


Journal ArticleDOI
TL;DR: A novel region-based method for image segmentation, which is able to simultaneously segment the image and estimate the bias field, and the estimated bias field can be used for intensity inhomogeneity correction (or bias correction).
Abstract: Intensity inhomogeneity often occurs in real-world images, which presents a considerable challenge in image segmentation. The most widely used image segmentation algorithms are region-based and typically rely on the homogeneity of the image intensities in the regions of interest, which often fail to provide accurate segmentation results due to the intensity inhomogeneity. This paper proposes a novel region-based method for image segmentation, which is able to deal with intensity inhomogeneities in the segmentation. First, based on the model of images with intensity inhomogeneities, we derive a local intensity clustering property of the image intensities, and define a local clustering criterion function for the image intensities in a neighborhood of each point. This local clustering criterion function is then integrated with respect to the neighborhood center to give a global criterion of image segmentation. In a level set formulation, this criterion defines an energy in terms of the level set functions that represent a partition of the image domain and a bias field that accounts for the intensity inhomogeneity of the image. Therefore, by minimizing this energy, our method is able to simultaneously segment the image and estimate the bias field, and the estimated bias field can be used for intensity inhomogeneity correction (or bias correction). Our method has been validated on synthetic images and real images of various modalities, with desirable performance in the presence of intensity inhomogeneities. Experiments show that our method is more robust to initialization, faster and more accurate than the well-known piecewise smooth model. As an application, our method has been used for segmentation and bias correction of magnetic resonance (MR) images with promising results.

1,201 citations
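In symbols, the energy described above has roughly the following shape (a paraphrase, with notation assumed rather than copied from the paper: $I$ the image, $K_\sigma$ a Gaussian window, $b$ the bias field, $c_i$ the cluster constants, and $M_i(\phi)$ the region membership functions defined by the level set $\phi$); minimizing it alternately over $\phi$, the $c_i$, and $b$ yields the segmentation and the bias correction simultaneously:

```latex
\mathcal{E}(\phi,\mathbf{c},b)=\int_\Omega \sum_{i=1}^{N}
  \left( \int_\Omega K_\sigma(\mathbf{y}-\mathbf{x})\,
  \bigl| I(\mathbf{y}) - b(\mathbf{x})\,c_i \bigr|^2
  M_i(\phi(\mathbf{y}))\, d\mathbf{y} \right) d\mathbf{x}
```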


Proceedings ArticleDOI
09 Jun 2011
TL;DR: This paper presents ilastik, an easy-to-use tool that allows users without expertise in image processing to perform segmentation and classification in a unified way, based on labels provided by the user through a convenient mouse interface.
Abstract: Segmentation is the process of partitioning digital images into meaningful regions. The analysis of biological high content images often requires segmentation as a first step. We propose ilastik as an easy-to-use tool which allows the user without expertise in image processing to perform segmentation and classification in a unified way. ilastik learns from labels provided by the user through a convenient mouse interface. Based on these labels, ilastik infers a problem specific segmentation. A random forest classifier is used in the learning step, in which each pixel's neighborhood is characterized by a set of generic (nonlinear) features. ilastik supports up to three spatial plus one spectral dimension and makes use of all dimensions in the feature calculation. ilastik provides realtime feedback that enables the user to interactively refine the segmentation result and hence further fine-tune the classifier. An uncertainty measure guides the user to ambiguous regions in the images. Real time performance is achieved by multi-threading which fully exploits the capabilities of modern multi-core machines. Once a classifier has been trained on a set of representative images, it can be exported and used to automatically process a very large number of images (e.g. using the CellProfiler pipeline). ilastik is an open source project and released under the BSD license at www.ilastik.org.

1,158 citations
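The learning step generalizes beyond ilastik itself; a minimal stand-in (illustrative feature stack and scikit-learn classifier, not ilastik's actual feature set or GUI pipeline) looks like this:

```python
import numpy as np
from scipy import ndimage as ndi
from sklearn.ensemble import RandomForestClassifier

def pixel_features(img):
    """A small stack of generic (nonlinear) per-pixel features at several
    scales, in the spirit of interactive pixel classification."""
    feats = [img]
    for s in (1.0, 2.0, 4.0):
        feats += [ndi.gaussian_filter(img, s),
                  ndi.gaussian_gradient_magnitude(img, s),
                  ndi.gaussian_laplace(img, s)]
    return np.stack(feats, axis=-1)

def train_and_segment(img, scribbles):
    """scribbles: integer label image, 0 = unlabeled, 1..K = user labels."""
    feats = pixel_features(img)
    labeled = scribbles > 0
    clf = RandomForestClassifier(n_estimators=100)
    clf.fit(feats[labeled], scribbles[labeled])
    flat = feats.reshape(-1, feats.shape[-1])
    return clf.predict(flat).reshape(img.shape)
```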


Journal ArticleDOI
TL;DR: A new supervised method uses a neural network scheme for pixel classification and computes a 7-D vector composed of gray-level and moment invariants-based features for pixel representation; it is suitable for retinal image computer analyses such as automated screening for early diabetic retinopathy detection.
Abstract: This paper presents a new supervised method for blood vessel detection in digital retinal images. This method uses a neural network (NN) scheme for pixel classification and computes a 7-D vector composed of gray-level and moment invariants-based features for pixel representation. The method was evaluated on the publicly available DRIVE and STARE databases, widely used for this purpose, since they contain retinal images where the vascular structure has been precisely marked by experts. Method performance on both sets of test images is better than other existing solutions in the literature. The method proves especially accurate for vessel detection in STARE images. Its application to this database (even when the NN was trained on the DRIVE database) outperforms all analyzed segmentation approaches. Its effectiveness and robustness with different image conditions, together with its simplicity and fast implementation, make this blood vessel segmentation proposal suitable for retinal image computer analyses such as automated screening for early diabetic retinopathy detection.

913 citations


Book
14 Dec 2011
TL;DR: This book surveys level set methods, deformable models, and variational techniques for image segmentation and related vision tasks, ranging from fast implicit active contour models to topology-preserving geometric deformable models for brain reconstruction.
Abstract: * Level set methods * Deformable models * Fast methods for implicit active contour models * Fast edge integration * Variational snake theory * Multiplicative denoising and deblurring * Total variation minimization for scalar/vector regularization * Morphological global reconstruction and levelings * Fast marching techniques for visual grouping and segmentation * Multiphase object detection and image segmentation * Adaptive segmentation of vector-valued images * Mumford-Shah for segmentation and stereo * Shape analysis toward model-based segmentation * Joint image registration and segmentation * Image alignment * Variational principles in optical flow estimation and tracking * Region matching and tracking under deformations or occlusions * Computational stereo * Visualization, analysis and shape reconstruction of sparse data * Variational problems and partial differential equations on implicit surfaces * Knowledge-based segmentation of medical images * Topology preserving geometric deformable models for brain reconstruction * Editing geometric models * Simulating natural phenomena

899 citations


Proceedings ArticleDOI
20 Jun 2011
TL;DR: This paper proposes a generic and simple framework comprising three steps: constructing a cost volume, fast cost volume filtering, and winner-take-all label selection; it achieves state-of-the-art results, including real-time disparity maps and optical flow fields with very fine structures as well as large displacements.
Abstract: Many computer vision tasks can be formulated as labeling problems. The desired solution is often a spatially smooth labeling where label transitions are aligned with color edges of the input image. We show that such solutions can be efficiently achieved by smoothing the label costs with a very fast edge preserving filter. In this paper we propose a generic and simple framework comprising three steps: (i) constructing a cost volume (ii) fast cost volume filtering and (iii) winner-take-all label selection. Our main contribution is to show that with such a simple framework state-of-the-art results can be achieved for several computer vision applications. In particular, we achieve (i) disparity maps in real-time, whose quality exceeds those of all other fast (local) approaches on the Middlebury stereo benchmark, and (ii) optical flow fields with very fine structures as well as large displacements. To demonstrate robustness, the few parameters of our framework are set to nearly identical values for both applications. Also, competitive results for interactive image segmentation are presented. With this work, we hope to inspire other researchers to leverage this framework to other application areas.

898 citations
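The three steps are simple enough to sketch for the stereo case (a box filter stands in for the paper's edge-preserving guided filter, and the cost is plain absolute difference, so treat both as assumptions):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def filtered_cost_volume_stereo(left, right, max_disp=32, radius=4):
    """(i) build a cost volume, (ii) filter each cost slice spatially,
    (iii) winner-take-all label (disparity) selection."""
    h, w = left.shape
    volume = np.full((max_disp, h, w), np.inf)
    for d in range(max_disp):
        cost = np.abs(left[:, d:] - right[:, :w - d])                 # (i)
        volume[d, :, d:] = uniform_filter(cost, size=2 * radius + 1)  # (ii)
    return volume.argmin(axis=0)                                      # (iii)
```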


Proceedings ArticleDOI
20 Jun 2011
TL;DR: An efficient greedy algorithm for superpixel segmentation is developed by exploiting submodular and monotonic properties of the objective function, with a proven approximation bound of ½ for the optimality of the solution.
Abstract: We propose a new objective function for superpixel segmentation. This objective function consists of two components: entropy rate of a random walk on a graph and a balancing term. The entropy rate favors formation of compact and homogeneous clusters, while the balancing function encourages clusters with similar sizes. We present a novel graph construction for images and show that this construction induces a matroid — a combinatorial structure that generalizes the concept of linear independence in vector spaces. The segmentation is then given by the graph topology that maximizes the objective function under the matroid constraint. By exploiting submodular and monotonic properties of the objective function, we develop an efficient greedy algorithm. Furthermore, we prove an approximation bound of ½ for the optimality of the solution. Extensive experiments on the Berkeley segmentation benchmark show that the proposed algorithm outperforms the state of the art in all the standard evaluation metrics.
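The greedy step relies only on monotonicity and submodularity, so its engine can be shown generically; the sketch below (a standard lazy-greedy loop, with the problem-specific gain function left abstract) is an assumption about structure, not the authors' implementation:

```python
import heapq

def lazy_greedy(elements, gain, budget):
    """Maximize a monotone submodular objective greedily. Because marginal
    gains only shrink as the selection grows, a stale heap entry whose
    recomputed gain still beats the runner-up can be accepted immediately."""
    selected = []
    heap = [(-gain(selected, e), i, e) for i, e in enumerate(elements)]
    heapq.heapify(heap)
    while heap and len(selected) < budget:
        _, i, e = heapq.heappop(heap)
        g = gain(selected, e)                 # re-evaluate against current set
        if heap and -heap[0][0] > g:
            heapq.heappush(heap, (-g, i, e))  # stale and beaten: try later
        elif g > 0:
            selected.append(e)                # still the best: accept
    return selected
```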

Proceedings ArticleDOI
06 Nov 2011
TL;DR: This work adapts segmentation as a selective search: it generates many approximate locations over few and precise object delineations, because (1) an object whose location is never generated cannot be recognised and (2) appearance and immediate nearby context are most effective for object recognition.
Abstract: For object recognition, the current state-of-the-art is based on exhaustive search. However, to enable the use of more expensive features and classifiers and thereby progress beyond the state-of-the-art, a selective search strategy is needed. Therefore, we adapt segmentation as a selective search by reconsidering segmentation: We propose to generate many approximate locations over few and precise object delineations because (1) an object whose location is never generated can not be recognised and (2) appearance and immediate nearby context are most effective for object recognition. Our method is class-independent and is shown to cover 96.7% of all objects in the Pascal VOC 2007 test set using only 1,536 locations per image. Our selective search enables the use of the more expensive bag-of-words method which we use to substantially improve the state-of-the-art by up to 8.5% for 8 out of 20 classes on the Pascal VOC 2010 detection challenge.
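The grouping strategy behind such a selective search can be sketched as follows (regions as frozensets of pixel indices and a user-supplied similarity function are our assumptions; the paper diversifies over several color spaces and similarity measures):

```python
def hierarchical_grouping(regions, neighbors, similarity):
    """Start from an oversegmentation, repeatedly merge the most similar
    neighboring pair, and keep *every* region ever formed as a candidate
    object location. regions: frozensets; neighbors: iterable of pairs."""
    candidates = list(regions)
    sims = {(a, b): similarity(a, b) for a, b in neighbors}
    while sims:
        a, b = max(sims, key=sims.get)        # most similar adjacent pair
        merged = a | b
        candidates.append(merged)
        touched = [p for p in sims if a in p or b in p]
        partners = {r for p in touched for r in p} - {a, b}
        for p in touched:                     # retire pairs that touched a, b
            del sims[p]
        for r in partners:                    # their partners now touch merged
            sims[(merged, r)] = similarity(merged, r)
    return candidates
```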

Journal ArticleDOI
TL;DR: Inspired by recent work in image denoising, the proposed nonlocal patch-based label fusion produces accurate and robust segmentation in quantitative magnetic resonance analysis.

Proceedings ArticleDOI
06 Nov 2011
TL;DR: This paper proposes to construct the dictionary by using both observed and unobserved, hidden data, and shows that the effects of the hidden data can be approximately recovered by solving a nuclear norm minimization problem, which is convex and can be solved efficiently.
Abstract: Low-Rank Representation (LRR) [16, 17] is an effective method for exploring the multiple subspace structures of data. Usually, the observed data matrix itself is chosen as the dictionary, which is a key aspect of LRR. However, such a strategy may depress the performance, especially when the observations are insufficient and/or grossly corrupted. In this paper we therefore propose to construct the dictionary by using both observed and unobserved, hidden data. We show that the effects of the hidden data can be approximately recovered by solving a nuclear norm minimization problem, which is convex and can be solved efficiently. The formulation of the proposed method, called Latent Low-Rank Representation (LatLRR), seamlessly integrates subspace segmentation and feature extraction into a unified framework, and thus provides us with a solution for both subspace segmentation and feature extraction. As a subspace segmentation algorithm, LatLRR is an enhanced version of LRR and outperforms the state-of-the-art algorithms. Being an unsupervised feature extraction algorithm, LatLRR is able to robustly extract salient features from corrupted data, and thus can work much better than the benchmark that utilizes the original data vectors as features for classification. Compared to dimension reduction based methods, LatLRR is more robust to noise.
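Stated in symbols (our transcription; the weighting λ and norm choices follow the description above), LatLRR recovers both the observed-data representation $Z$ and the hidden-data effect $L$ from a convex program:

```latex
\min_{Z,\,L,\,E}\; \|Z\|_* + \|L\|_* + \lambda \|E\|_1
\quad \text{s.t.} \quad X = XZ + LX + E
```

where $X$ is the observed data matrix, $\|\cdot\|_*$ the nuclear norm, and $E$ a sparse residual; $Z$ drives subspace segmentation while $LX$ yields the salient features.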

Proceedings ArticleDOI
20 Jun 2011
TL;DR: This paper proposes a model based approach, which detects humans using a 2-D head contour model and a 3-D head surface model, and proposes a segmentation scheme to segment the human from his/her surroundings and extract the whole contours of the figure based on the authors' detection point.
Abstract: Conventional human detection is mostly done in images taken by visible-light cameras. These methods imitate the detection process that humans use. They use features based on gradients, such as histograms of oriented gradients (HOG), or extract interest points in the image, such as scale-invariant feature transform (SIFT), etc. In this paper, we present a novel human detection method using depth information taken by the Kinect for Xbox 360. We propose a model-based approach, which detects humans using a 2-D head contour model and a 3-D head surface model. We propose a segmentation scheme to segment the human from his/her surroundings and extract the whole contours of the figure based on our detection point. We also explore the tracking algorithm based on our detection result. The methods are tested on our database taken by the Kinect in our lab and present superior results.

Journal ArticleDOI
TL;DR: A supervised workflow is proposed in this study to reduce manual labor and objectify the choice of significant object features and classification thresholds; it resulted in accuracies between 73% and 87% for the affected areas, with approximately balanced commission and omission errors.

Proceedings ArticleDOI
01 Nov 2011
TL;DR: This work uses a state-of-the-art big and deep neural network combining convolution and max-pooling for supervised feature learning and classification of hand gestures given by humans to mobile robots using colored gloves.
Abstract: Automatic recognition of gestures using computer vision is important for many real-world applications such as sign language recognition and human-robot interaction (HRI). Our goal is a real-time hand gesture-based HRI interface for mobile robots. We use a state-of-the-art big and deep neural network (NN) combining convolution and max-pooling (MPCNN) for supervised feature learning and classification of hand gestures given by humans to mobile robots using colored gloves. The hand contour is retrieved by color segmentation, then smoothed by morphological image processing, which eliminates noisy edges. Our big and deep MPCNN classifies 6 gesture classes with 96% accuracy, nearly three times better than the nearest competitor. Experiments with mobile robots using an ARM 11 533MHz processor achieve real-time gesture recognition performance.
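A minimal convolution-plus-max-pooling classifier of the kind described might look as follows in PyTorch (layer sizes, activations, and input resolution are illustrative assumptions, not the authors' architecture; only the 6 output classes come from the paper):

```python
import torch
import torch.nn as nn

# alternating convolution and max-pooling stages, then a small classifier
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=5), nn.Tanh(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=5), nn.Tanh(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.LazyLinear(128), nn.Tanh(),
    nn.Linear(128, 6),                          # 6 gesture classes
)

logits = model(torch.randn(1, 1, 32, 32))       # e.g. a 32x32 hand-mask crop
```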

Journal ArticleDOI
TL;DR: This paper introduces a robust, learning-based brain extraction system (ROBEX), which combines a discriminative and a generative model to achieve the final result and shows that ROBEX provides significantly improved performance measures for almost every method/dataset combination.
Abstract: Automatic whole-brain extraction from magnetic resonance images (MRI), also known as skull stripping, is a key component in most neuroimage pipelines. As the first element in the chain, its robustness is critical for the overall performance of the system. Many skull stripping methods have been proposed, but the problem is not considered to be completely solved yet. Many systems in the literature have good performance on certain datasets (mostly the datasets they were trained/tuned on), but fail to produce satisfactory results when the acquisition conditions or study populations are different. In this paper we introduce a robust, learning-based brain extraction system (ROBEX). The method combines a discriminative and a generative model to achieve the final result. The discriminative model is a Random Forest classifier trained to detect the brain boundary; the generative model is a point distribution model that ensures that the result is plausible. When a new image is presented to the system, the generative model is explored to find the contour with highest likelihood according to the discriminative model. Because the target shape is in general not perfectly represented by the generative model, the contour is refined using graph cuts to obtain the final segmentation. Both models were trained using 92 scans from a proprietary dataset but they achieve a high degree of robustness on a variety of other datasets. ROBEX was compared with six other popular, publicly available methods (BET, BSE, FreeSurfer, AFNI, BridgeBurner, and GCUT) on three publicly available datasets (IBSR, LPBA40, and OASIS, 137 scans in total) that include a wide range of acquisition hardware and a highly variable population (different age groups, healthy/diseased). The results show that ROBEX provides significantly improved performance measures for almost every method/dataset combination.

Proceedings ArticleDOI
01 Nov 2011
TL;DR: This paper uses a CRF-based model to evaluate a range of different representations for depth information and proposes a novel prior on 3D location, revealing that the combination of depth and intensity images gives dramatic performance gains over intensity images alone.
Abstract: In this paper we explore how a structured light depth sensor, in the form of the Microsoft Kinect, can assist with indoor scene segmentation. We use a CRF-based model to evaluate a range of different representations for depth information and propose a novel prior on 3D location. We introduce a new and challenging indoor scene dataset, complete with accurate depth maps and dense label coverage. Evaluating our model on this dataset reveals that the combination of depth and intensity images gives dramatic performance gains over intensity images alone. Our results clearly demonstrate the utility of structured light sensors for scene understanding.

Proceedings ArticleDOI
06 Nov 2011
TL;DR: The method first identifies object-like regions in any frame according to both static and dynamic cues, then computes a series of binary partitions among candidate "key-segments" to discover hypothesis groups with persistent appearance and motion.
Abstract: We present an approach to discover and segment foreground object(s) in video. Given an unannotated video sequence, the method first identifies object-like regions in any frame according to both static and dynamic cues. We then compute a series of binary partitions among those candidate “key-segments” to discover hypothesis groups with persistent appearance and motion. Finally, using each ranked hypothesis in turn, we estimate a pixel-level object labeling across all frames, where (a) the foreground likelihood depends on both the hypothesis's appearance as well as a novel localization prior based on partial shape matching, and (b) the background likelihood depends on cues pulled from the key-segments' (possibly diverse) surroundings observed across the sequence. Compared to existing methods, our approach automatically focuses on the persistent foreground regions of interest while resisting oversegmentation. We apply our method to challenging benchmark videos, and show competitive or better results than the state-of-the-art.

Proceedings ArticleDOI
09 May 2011
TL;DR: This paper presents a set of segmentation methods for various types of 3D point clouds; sparse data is addressed using ground models of non-constant resolution that provide either a continuous probabilistic surface or a terrain mesh built from the structure of a range image, both representations offering close to real-time performance.
Abstract: This paper presents a set of segmentation methods for various types of 3D point clouds. Segmentation of dense 3D data (e.g. Riegl scans) is optimised via a simple yet efficient voxelisation of the space. Prior ground extraction is empirically shown to significantly improve segmentation performance. Segmentation of sparse 3D data (e.g. Velodyne scans) is addressed using ground models of non-constant resolution either providing a continuous probabilistic surface or a terrain mesh built from the structure of a range image, both representations providing close to real-time performance. All the algorithms are tested on several hand labeled data sets using two novel metrics for segmentation evaluation.
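The voxelisation idea for dense scans reduces to a few lines (the voxel size is an assumption; connected occupied voxels would still need to be merged into final segments, and ground removal is assumed to have happened first):

```python
import numpy as np

def voxel_ids(points, voxel=0.2):
    """Quantize a (N, 3) point cloud onto a regular grid and give every
    point the id of its occupied voxel; clustering then operates on the
    (much smaller) set of occupied voxels instead of raw points."""
    keys = np.floor(points / voxel).astype(np.int64)
    _, ids = np.unique(keys, axis=0, return_inverse=True)
    return ids
```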

Proceedings ArticleDOI
20 Jun 2011
TL;DR: Using a new collection of fifty hyperspectral images of indoor and outdoor scenes, an optimized "spatio-spectral basis" is derived for representing hyperspectral image patches, and statistical models for the coefficients in this basis are explored.
Abstract: Hyperspectral images provide higher spectral resolution than typical RGB images by including per-pixel irradiance measurements in a number of narrow bands of wavelength in the visible spectrum. The additional spectral resolution may be useful for many visual tasks, including segmentation, recognition, and relighting. Vision systems that seek to capture and exploit hyperspectral data should benefit from statistical models of natural hyperspectral images, but at present, relatively little is known about their structure. Using a new collection of fifty hyperspectral images of indoor and outdoor scenes, we derive an optimized “spatio-spectral basis” for representing hyperspectral image patches, and explore statistical models for the coefficients in this basis.
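As a stand-in for the optimized basis (the paper derives its own; plain PCA over patches is the simplest comparable construction), one can compute a spatio-spectral basis like this:

```python
import numpy as np

def pca_spatio_spectral_basis(cube, patch=8, n_components=32):
    """cube: (H, W, B) hyperspectral image. Collect non-overlapping
    spatio-spectral patches, center them, and return the top principal
    directions as basis vectors (rows)."""
    H, W, B = cube.shape
    patches = [cube[i:i + patch, j:j + patch, :].ravel()
               for i in range(0, H - patch + 1, patch)
               for j in range(0, W - patch + 1, patch)]
    X = np.asarray(patches, dtype=float)
    X -= X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:n_components]
```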

Journal ArticleDOI
TL;DR: This work describes the technical and implementation aspects of Atropos, an ITK-based multivariate n-class open source segmentation algorithm distributed with ANTs and evaluates its performance on two different ground-truth datasets.
Abstract: We introduce Atropos, an ITK-based multivariate n-class open source segmentation algorithm distributed with ANTs (http://www.picsl.upenn.edu/ANTs). The Bayesian formulation of the segmentation problem is solved using the Expectation Maximization (EM) algorithm with the modeling of the class intensities based on either parametric or non-parametric finite mixtures. Atropos is capable of incorporating spatial prior probability maps (sparse), prior label maps and/or Markov Random Field (MRF) modeling. Atropos has also been efficiently implemented to handle large quantities of possible labelings (in the experimental section, we use up to 69 classes) with a minimal memory footprint. This work describes the technical and implementation aspects of Atropos and evaluates its performance on two different ground-truth datasets. First, we use the BrainWeb dataset from Montreal Neurological Institute to evaluate three-tissue segmentation performance via (1) K-means segmentation without use of template data; (2) MRF segmentation with initialization by prior probability maps derived from a group template; (3) Prior-based segmentation with use of spatial prior probability maps derived from a group template. We also evaluate Atropos performance by using spatial priors to drive a 69-class EM segmentation problem derived from the Hammers atlas from University College London. These evaluation studies, combined with illustrative examples that exercise Atropos options, demonstrate both performance and wide applicability of this new platform-independent open source segmentation tool.
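The parametric core of such an EM segmentation, stripped of Atropos's MRF and spatial priors, fits in a few lines with scikit-learn (a sketch, not the ITK implementation):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def em_intensity_segmentation(img, n_classes=3):
    """Fit a finite Gaussian mixture to the intensities by EM and assign
    each voxel its maximum-posterior class."""
    x = img.reshape(-1, 1).astype(float)
    gmm = GaussianMixture(n_components=n_classes, max_iter=100).fit(x)
    return gmm.predict(x).reshape(img.shape)
```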

Journal ArticleDOI
TL;DR: In this paper an attempt is made to study the performance of the most commonly used edge detection techniques for image segmentation, and a comparison of these techniques is carried out experimentally using MATLAB software.
Abstract: Interpretation of image contents is one of the objectives in computer vision, specifically in image processing, and it has received much attention from researchers in recent years. In image interpretation, the partition of the image into object and background is a critical step. Segmentation separates an image into its component regions or objects. Image segmentation needs to separate the object from the background so that the image can be read properly and its content identified carefully. In this context, edge detection is a fundamental tool for image segmentation. In this paper an attempt is made to study the performance of the most commonly used edge detection techniques for image segmentation, and a comparison of these techniques is carried out experimentally using MATLAB software.
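The paper runs its comparison in MATLAB; an equivalent side-by-side in Python (with illustrative parameter choices) is:

```python
from skimage import filters, feature

def edge_maps(gray):
    """Compute the edge detectors most commonly compared in such studies."""
    return {
        "sobel": filters.sobel(gray),
        "prewitt": filters.prewitt(gray),
        "roberts": filters.roberts(gray),
        "canny": feature.canny(gray, sigma=2.0),
    }
```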

Journal ArticleDOI
TL;DR: A new fuzzy level set algorithm is proposed in this paper to facilitate medical image segmentation; it is able to evolve directly from an initial segmentation obtained by spatial fuzzy clustering and is enhanced with locally regularized evolution.

Journal ArticleDOI
TL;DR: A new supervised Bayesian approach to hyperspectral image segmentation with active learning, which consists of a multinomial logistic regression model to learn the class posterior probability distributions and a new active sampling approach, called modified breaking ties, which is able to provide an unbiased sampling.
Abstract: This paper introduces a new supervised Bayesian approach to hyperspectral image segmentation with active learning, which consists of two main steps. First, we use a multinomial logistic regression (MLR) model to learn the class posterior probability distributions. This is done by using a recently introduced logistic regression via splitting and augmented Lagrangian algorithm. Second, we use the information acquired in the previous step to segment the hyperspectral image using a multilevel logistic prior that encodes the spatial information. In order to reduce the cost of acquiring large training sets, active learning is performed based on the MLR posterior probabilities. Another contribution of this paper is the introduction of a new active sampling approach, called modified breaking ties, which is able to provide an unbiased sampling. Furthermore, we have implemented our proposed method in an efficient way. For instance, in order to obtain the time-consuming maximum a posteriori segmentation, we use the α-expansion min-cut-based integer optimization algorithm. The state-of-the-art performance of the proposed approach is illustrated using both simulated and real hyperspectral data sets in a number of experimental comparisons with recently introduced hyperspectral image analysis methods.
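The plain breaking-ties score is easy to state (sketch below; the paper's *modified* variant additionally spreads the queries across classes to keep the sampling unbiased, which is omitted here):

```python
import numpy as np

def breaking_ties_scores(posteriors):
    """posteriors: (n_samples, n_classes) class probabilities. The score is
    the gap between the two largest posteriors; small gaps mark the most
    ambiguous samples, which active learning queries first."""
    s = np.sort(posteriors, axis=1)
    return s[:, -1] - s[:, -2]

# usage: indices of the k most ambiguous samples to label next
# query = np.argsort(breaking_ties_scores(P))[:k]
```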

Journal ArticleDOI
TL;DR: An automatic OD parameterization technique based on segmented OD and cup regions obtained from monocular retinal images is presented, along with a novel cup segmentation method based on anatomical evidence, such as vessel bends at the cup boundary, considered relevant by glaucoma experts.
Abstract: Automatic retinal image analysis is emerging as an important screening tool for early detection of eye diseases. Glaucoma is one of the most common causes of blindness. The manual examination of optic disk (OD) is a standard procedure used for detecting glaucoma. In this paper, we present an automatic OD parameterization technique based on segmented OD and cup regions obtained from monocular retinal images. A novel OD segmentation method is proposed which integrates the local image information around each point of interest in multidimensional feature space to provide robustness against variations found in and around the OD region. We also propose a novel cup segmentation method which is based on anatomical evidence such as vessel bends at the cup boundary, considered relevant by glaucoma experts. Bends in a vessel are robustly detected using a region of support concept, which automatically selects the right scale for analysis. A multi-stage strategy is employed to derive a reliable subset of vessel bends called r-bends followed by a local spline fitting to derive the desired cup boundary. The method has been evaluated on 138 images comprising 33 normal and 105 glaucomatous images against three glaucoma experts. The obtained segmentation results show consistency in handling various geometric and photometric variations found across the dataset. The estimation error of the method for vertical cup-to-disk diameter ratio is 0.09/0.08 (mean/standard deviation) while for cup-to-disk area ratio it is 0.12/0.10. Overall, the obtained qualitative and quantitative results show effectiveness in both segmentation and subsequent OD parameterization for glaucoma assessment.

Journal ArticleDOI
TL;DR: A hybrid approach to robustly detect and localize texts in natural scene images, using a text region detector, a conditional random field model, and a learning-based energy minimization method, is presented.
Abstract: Text detection and localization in natural scene images is important for content-based image analysis. This problem is challenging due to the complex background, the non-uniform illumination, the variations of text font, size and line orientation. In this paper, we present a hybrid approach to robustly detect and localize texts in natural scene images. A text region detector is designed to estimate the text existing confidence and scale information in image pyramid, which help segment candidate text components by local binarization. To efficiently filter out the non-text components, a conditional random field (CRF) model considering unary component properties and binary contextual component relationships with supervised parameter learning is proposed. Finally, text components are grouped into text lines/words with a learning-based energy minimization method. Since all the three stages are learning-based, there are very few parameters requiring manual tuning. Experimental results evaluated on the ICDAR 2005 competition dataset show that our approach yields higher precision and recall performance compared with state-of-the-art methods. We also evaluated our approach on a multilingual image dataset with promising results.

Proceedings ArticleDOI
29 Dec 2011
TL;DR: A new public dataset of blood samples is proposed, specifically designed for the evaluation and the comparison of algorithms for segmentation and classification, to offer a new test tool to the image processing and pattern matching communities.
Abstract: The visual analysis of peripheral blood samples is an important test in the procedures for the diagnosis of leukemia. Automated systems based on artificial vision methods can speed up this operation and increase the accuracy and homogeneity of the response, also in telemedicine applications. Unfortunately, no public image datasets are available to test and compare such algorithms. In this paper, we propose a new public dataset of blood samples, specifically designed for the evaluation and the comparison of algorithms for segmentation and classification. For each image in the dataset, the classification of the cells is given, as well as a specific set of figures of merit to fairly compare the performances of different algorithms. This initiative offers a new test tool to the image processing and pattern matching communities, aimed at stimulating new studies in this important field of research.

Reference BookDOI
01 Jan 2011
TL;DR: This reference work surveys mathematical methods in imaging, from inverse problems, regularization, and tomography to variational models such as the Mumford and Shah model and its applications in total variation image restoration.
Abstract: Linear Inverse Problems.- Large-Scale Inverse Problems in Imaging.- Regularization Methods for Ill-Posed Problems.- Distance Measures and Applications to Multi-Modal Variational Imaging.- Energy Minimization Methods.- Compressive Sensing.- Duality and Convex Programming.- EM Algorithms.- Iterative Solution Methods.- Level Set Methods for Structural Inversion and Image Reconstructions.- Expansion Methods.- Sampling Methods.- Inverse Scattering.- Electrical Impedance Tomography.- Synthetic Aperture Radar Imaging.- Tomography.- Optical Imaging.- Photoacoustic and Thermoacoustic Tomography: Image Formation Principles.- Mathematics of Photoacoustic and Thermoacoustic Tomography.- Wave Phenomena.- Statistical Methods in Imaging.- Supervised Learning by Support Vector Machines.- Total Variation in Imaging.- Numerical Methods and Applications in Total Variation Image Restoration.- Mumford and Shah Model and its Applications in Total Variation Image Restoration.- Local Smoothing Neighbourhood Filters.- Neighbourhood Filters and the Recovery of 3D Information.- Splines and Multiresolution Analysis.- Gabor Analysis for Imaging.- Shape Spaces.- Variational Methods in Shape Analysis.- Manifold Intrinsic Similarity.- Image Segmentation with Shape Priors: Explicit Versus Implicit Representations.- Starlet Transform in Astronomical Data Processing.- Differential Methods for Multi-Dimensional Visual Data Analysis.- Wave fronts in Imaging, Quinto.- Ultrasound Tomography, Natterer.- Optical Flow, Schnoerr.- Morphology, Petros Maragos.- PDEs, Weickert.- Registration, Modersitzki.- Discrete Geometry in Imaging, Bobenko, Pottmann.- Visualization, Hege.- Fast Marching and Level Sets, Osher.- Coupled Physics Imaging, Arridge.- Imaging in Random Media, Borcea.- Conformal Methods, Gu.- Texture, Peyre.- Graph Cuts, Darbon.- Imaging in Physics with Fourier Transform (i.e. Phase Retrieval, e.g. Dark field imaging), J. R. Fienup.- Electron Microscopy, Oktem Ozan.- Mathematical Imaging OCT (this is also FFT based), Mark E. Brezinski.- Spect, PET, Faukas, Louis.