
Showing papers on "Image segmentation" published in 2012


Journal ArticleDOI
TL;DR: A new superpixel algorithm, simple linear iterative clustering (SLIC), adapts a k-means clustering approach to generate superpixels efficiently; it is faster and more memory efficient than previous methods, improves segmentation performance, and is straightforward to extend to supervoxel generation.
Abstract: Computer vision applications have come to rely increasingly on superpixels in recent years, but it is not always clear what constitutes a good superpixel algorithm. In an effort to understand the benefits and drawbacks of existing methods, we empirically compare five state-of-the-art superpixel algorithms for their ability to adhere to image boundaries, speed, memory efficiency, and their impact on segmentation performance. We then introduce a new superpixel algorithm, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels. Despite its simplicity, SLIC adheres to boundaries as well as or better than previous methods. At the same time, it is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.

7,849 citations
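
The k-means adaptation described in the SLIC abstract can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows (the published method works in a color space), and it assumes the commonly described distance that combines intensity difference with spatial distance scaled by the grid step. The names `slic_like_superpixels`, `step`, and `compactness` are illustrative, not taken from the paper's code.

```python
import math


def slic_like_superpixels(image, step, compactness=10.0, iterations=5):
    """Label each pixel of a 2D grayscale image with a superpixel index."""
    height, width = len(image), len(image[0])
    # Seed cluster centers on a regular grid: [row, col, intensity].
    centers = [
        [float(r), float(c), float(image[r][c])]
        for r in range(step // 2, height, step)
        for c in range(step // 2, width, step)
    ]
    labels = [[-1] * width for _ in range(height)]
    for _ in range(iterations):
        best = [[math.inf] * width for _ in range(height)]
        # Assignment: each center competes only for pixels in a local window.
        for k, (cr, cc, ci) in enumerate(centers):
            r0, r1 = max(0, int(cr) - step), min(height, int(cr) + step + 1)
            c0, c1 = max(0, int(cc) - step), min(width, int(cc) + step + 1)
            for r in range(r0, r1):
                for c in range(c0, c1):
                    d_color = image[r][c] - ci
                    d_space = math.hypot(r - cr, c - cc)
                    # Assumed combined distance: intensity plus scaled position.
                    d = math.sqrt(d_color ** 2 + (compactness * d_space / step) ** 2)
                    if d < best[r][c]:
                        best[r][c] = d
                        labels[r][c] = k
        # Update: move each center to the mean of its assigned pixels.
        sums = [[0.0, 0.0, 0.0, 0] for _ in centers]
        for r in range(height):
            for c in range(width):
                k = labels[r][c]
                if k >= 0:
                    sums[k][0] += r
                    sums[k][1] += c
                    sums[k][2] += image[r][c]
                    sums[k][3] += 1
        for k, (sr, sc, si, n) in enumerate(sums):
            if n:
                centers[k] = [sr / n, sc / n, si / n]
    return labels
```

A larger `compactness` favors square, regular superpixels; a smaller value lets them follow intensity boundaries more closely.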


Proceedings ArticleDOI
16 Jun 2012
TL;DR: A conceptually clear and intuitive algorithm for contrast-based saliency estimation that outperforms all state-of-the-art approaches and can be formulated in a unified way using high-dimensional Gaussian filters.
Abstract: Saliency estimation has become a valuable tool in image processing. Yet, existing approaches exhibit considerable variation in methodology, and it is often difficult to attribute improvements in result quality to specific algorithm properties. In this paper we reconsider some of the design choices of previous methods and propose a conceptually clear and intuitive algorithm for contrast-based saliency estimation. Our algorithm consists of four basic steps. First, our method decomposes a given image into compact, perceptually homogeneous elements that abstract unnecessary detail. Based on this abstraction we compute two measures of contrast that rate the uniqueness and the spatial distribution of these elements. From the element contrast we then derive a saliency measure that produces a pixel-accurate saliency map which uniformly covers the objects of interest and consistently separates fore- and background. We show that the complete contrast and saliency estimation can be formulated in a unified way using high-dimensional Gaussian filters. This contributes to the conceptual simplicity of our method and lends itself to a highly efficient implementation with linear complexity. In a detailed experimental evaluation we analyze the contribution of each individual feature and show that our method outperforms all state-of-the-art approaches.

1,711 citations


Journal ArticleDOI
TL;DR: "Radiomics" refers to the extraction and analysis of large amounts of advanced quantitative imaging features with high throughput from medical images obtained with computed tomography, positron emission tomography or magnetic resonance imaging, leading to a very large potential subject pool.

1,608 citations


Journal ArticleDOI
TL;DR: In this paper, a generic objectness measure is proposed to quantify how likely an image window is to contain an object of any class; it is trained to distinguish objects with a well-defined boundary, such as cows and telephones, from amorphous background elements such as grass and road.
Abstract: We present a generic objectness measure, quantifying how likely it is for an image window to contain an object of any class. We explicitly train it to distinguish objects with a well-defined boundary in space, such as cows and telephones, from amorphous background elements, such as grass and road. The measure combines in a Bayesian framework several image cues measuring characteristics of objects, such as appearing different from their surroundings and having a closed boundary. These include an innovative cue to measure the closed boundary characteristic. In experiments on the challenging PASCAL VOC 07 dataset, we show this new cue to outperform a state-of-the-art saliency measure, and the combined objectness measure to perform better than any cue alone. We also compare to interest point operators, a HOG detector, and three recent works aiming at automatic object segmentation. Finally, we present two applications of objectness. In the first, we sample a small number of windows according to their objectness probability and give an algorithm to employ them as location priors for modern class-specific object detectors. As we show experimentally, this greatly reduces the number of windows evaluated by the expensive class-specific model. In the second application, we use objectness as a complementary score in addition to the class-specific model, which leads to fewer false positives. As shown in several recent papers, objectness can act as a valuable focus of attention mechanism in many other applications operating on image windows, including weakly supervised learning of object categories, unsupervised pixelwise segmentation, and object tracking in video. Computing objectness is very efficient and takes only about 4 sec. per image.

1,223 citations


Journal ArticleDOI
TL;DR: The aim of this paper is to review, analyze and categorize the retinal vessel extraction algorithms, techniques and methodologies, giving a brief description, highlighting the key points and the performance measures.

890 citations


Proceedings ArticleDOI
16 Jun 2012
TL;DR: The proposed end-to-end real-time scene text localization and recognition method achieves state-of-the-art text localization results among published methods and is the first to report results for end-to-end text recognition.
Abstract: An end-to-end real-time scene text localization and recognition method is presented. The real-time performance is achieved by posing the character detection problem as an efficient sequential selection from the set of Extremal Regions (ERs). The ER detector is robust to blur, illumination, color and texture variation and handles low-contrast text. In the first classification stage, the probability of each ER being a character is estimated using novel features calculated with O(1) complexity per region tested. Only ERs with locally maximal probability are selected for the second stage, where the classification is improved using more computationally expensive features. A highly efficient exhaustive search with feedback loops is then applied to group ERs into words and to select the most probable character segmentation. Finally, text is recognized in an OCR stage trained using synthetic fonts. The method was evaluated on two public datasets. On the ICDAR 2011 dataset, the method achieves state-of-the-art text localization results amongst published methods and it is the first one to report results for end-to-end text recognition. On the more challenging Street View Text dataset, the method achieves state-of-the-art recall. The robustness of the proposed method against noise and low contrast of characters is demonstrated by “false positives” caused by detected watermark text in the dataset.

862 citations


Journal ArticleDOI
TL;DR: A novel systematic approach is proposed to enhance underwater images by a dehazing algorithm, to compensate the attenuation discrepancy along the propagation path, and to take the influence of the possible presence of an artificial light source into consideration.
Abstract: Light scattering and color change are two major sources of distortion for underwater photography. Light scattering is caused by light incident on objects reflected and deflected multiple times by particles present in the water before reaching the camera. This in turn lowers the visibility and contrast of the image captured. Color change corresponds to the varying degrees of attenuation encountered by light traveling in the water with different wavelengths, rendering ambient underwater environments dominated by a bluish tone. No existing underwater processing techniques can handle light scattering and color change distortions suffered by underwater images, and the possible presence of artificial lighting simultaneously. This paper proposes a novel systematic approach to enhance underwater images by a dehazing algorithm, to compensate the attenuation discrepancy along the propagation path, and to take the influence of the possible presence of an artificial light source into consideration. Once the depth map, i.e., distances between the objects and the camera, is estimated, the foreground and background within a scene are segmented. The light intensities of foreground and background are compared to determine whether an artificial light source is employed during the image capturing process. After compensating the effect of artificial light, the haze phenomenon and discrepancy in wavelength attenuation along the underwater propagation path to camera are corrected. Next, the water depth in the image scene is estimated according to the residual energy ratios of different color channels existing in the background light. Based on the amount of attenuation corresponding to each light wavelength, color change compensation is conducted to restore color balance.
The performance of the proposed algorithm for wavelength compensation and image dehazing (WCID) is evaluated both objectively and subjectively by utilizing ground-truth color patches and video downloaded from the YouTube website. Both results demonstrate that images with significantly enhanced visibility and superior color fidelity are obtained by the proposed WCID.

782 citations


Journal ArticleDOI
TL;DR: This paper introduces a new supervised segmentation algorithm for remotely sensed hyperspectral image data which integrates the spectral and spatial information in a Bayesian framework and represents an innovative contribution in the literature.
Abstract: This paper introduces a new supervised segmentation algorithm for remotely sensed hyperspectral image data which integrates the spectral and spatial information in a Bayesian framework. A multinomial logistic regression (MLR) algorithm is first used to learn the posterior probability distributions from the spectral information, using a subspace projection method to better characterize noise and highly mixed pixels. Then, contextual information is included using a multilevel logistic Markov-Gibbs Markov random field prior. Finally, a maximum a posteriori segmentation is efficiently computed by the min-cut-based integer optimization algorithm. The proposed segmentation approach is experimentally evaluated using both simulated and real hyperspectral data sets, exhibiting state-of-the-art performance when compared with recently introduced hyperspectral image classification methods. The integration of subspace projection methods with the MLR algorithm, combined with the use of spatial-contextual information, represents an innovative contribution in the literature. This approach is shown to provide accurate characterization of hyperspectral imagery in both the spectral and the spatial domain.

678 citations


Journal ArticleDOI
TL;DR: A novel framework generates and ranks plausible hypotheses for the spatial extent of objects in images using bottom-up computational processes and mid-level selection cues; the algorithm can also be used successfully in a segmentation-based visual object category recognition pipeline.
Abstract: We present a novel framework to generate and rank plausible hypotheses for the spatial extent of objects in images using bottom-up computational processes and mid-level selection cues. The object hypotheses are represented as figure-ground segmentations, and are extracted automatically, without prior knowledge of the properties of individual object classes, by solving a sequence of Constrained Parametric Min-Cut problems (CPMC) on a regular image grid. In a subsequent step, we learn to rank the corresponding segments by training a continuous model to predict how likely they are to exhibit real-world regularities (expressed as putative overlap with ground truth) based on their mid-level region properties, then diversify the estimated overlap score using maximum marginal relevance measures. We show that this algorithm significantly outperforms the state of the art for low-level segmentation in the VOC 2009 and 2010 data sets. In our companion papers [1], [2], we show that the algorithm can be used, successfully, in a segmentation-based visual object category recognition pipeline. This architecture ranked first in the VOC2009 and VOC2010 image segmentation and labeling challenges.

671 citations


Proceedings ArticleDOI
16 Jun 2012
TL;DR: In this paper, a novel method for foreground segmentation is presented that follows a non-parametric background modeling paradigm, thus the background is modeled by a history of recently observed pixel values and the background update is based on a learning parameter.
Abstract: In this paper we present a novel method for foreground segmentation. Our proposed approach follows a non-parametric background modeling paradigm, thus the background is modeled by a history of recently observed pixel values. The foreground decision depends on a decision threshold. The background update is based on a learning parameter. We extend both of these parameters to dynamic per-pixel state variables and introduce dynamic controllers for each of them. Furthermore, both controllers are steered by an estimate of the background dynamics. In our experiments, the proposed Pixel-Based Adaptive Segmenter (PBAS) outperforms most state-of-the-art methods.

583 citations
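
The non-parametric background model in the PBAS abstract can be sketched for a single grayscale pixel as follows. This is a simplified illustration, not the published method: the decision threshold and learning rate are fixed constants here, whereas the paper turns both into dynamic per-pixel state variables with their own controllers. The class name, parameter names, and default values are assumptions made for the example.

```python
import random


class PixelBackgroundModel:
    """Background model for one pixel: a history of recently observed values."""

    def __init__(self, history_size=20, threshold=15.0, min_matches=2,
                 learning_rate=0.1, seed=0):
        self.history_size = history_size
        self.threshold = threshold          # fixed decision threshold (assumption)
        self.min_matches = min_matches
        self.learning_rate = learning_rate  # fixed update probability (assumption)
        self.samples = []
        self.rng = random.Random(seed)

    def update(self, value):
        """Classify `value`; return True for foreground, False for background."""
        if len(self.samples) < self.history_size:
            # Still filling the history: treat the observation as background.
            self.samples.append(value)
            return False
        matches = sum(1 for s in self.samples if abs(value - s) < self.threshold)
        is_foreground = matches < self.min_matches
        # Only background observations may overwrite a random history sample.
        if not is_foreground and self.rng.random() < self.learning_rate:
            self.samples[self.rng.randrange(len(self.samples))] = value
        return is_foreground
```

A full segmenter would keep one such model per pixel and feed it each new frame's value.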


Proceedings Article
01 Nov 2012
TL;DR: The proposed approach is based on statistics of natural images and this improves its modeling capacity, and the experimental results show that the method improves accuracy in texture recognition tasks compared to the state-of-the-art.
Abstract: This paper proposes a method for constructing local image descriptors which efficiently encode texture information and are suitable for histogram based representation of image regions. The method computes a binary code for each pixel by linearly projecting local image patches onto a subspace, whose basis vectors are learnt from natural images via independent component analysis, and by binarizing the coordinates in this basis via thresholding. The length of the binary code string is determined by the number of basis vectors. Image regions can be conveniently represented by histograms of pixels' binary codes. Our method is inspired by other descriptors which produce binary codes, such as local binary pattern and local phase quantization. However, instead of heuristic code constructions, the proposed approach is based on statistics of natural images and this improves its modeling capacity. The experimental results show that our method improves accuracy in texture recognition tasks compared to the state-of-the-art.
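
The per-pixel binary coding described in this abstract can be sketched as follows. The filters below are hand-picked placeholders; in the paper the basis vectors are learnt from natural images via independent component analysis, which this sketch does not attempt. The function names and the threshold at zero are assumptions made for illustration.

```python
def binary_codes(image, filters):
    """Compute one integer code per patch position from thresholded filter responses."""
    size = len(filters[0])  # filters are assumed square, size x size
    height, width = len(image), len(image[0])
    codes = []
    for r in range(height - size + 1):
        row = []
        for c in range(width - size + 1):
            code = 0
            for bit, f in enumerate(filters):
                # Linear projection of the local patch onto one basis vector.
                response = sum(
                    f[i][j] * image[r + i][c + j]
                    for i in range(size)
                    for j in range(size)
                )
                # Binarize by thresholding; one bit per basis vector.
                if response > 0:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


def code_histogram(codes, n_filters):
    """Represent a region by the histogram of its binary codes."""
    hist = [0] * (1 << n_filters)
    for row in codes:
        for code in row:
            hist[code] += 1
    return hist
```

With `n` filters the code length is `n` bits, so the histogram has `2**n` bins, matching the abstract's statement that the code length is set by the number of basis vectors.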

Proceedings ArticleDOI
16 Jun 2012
TL;DR: It is shown that training from a combination of weakly annotated videos and fully annotated still images using domain adaptation improves the performance of a detector trained from still images alone.
Abstract: Object detectors are typically trained on a large set of still images annotated by bounding-boxes. This paper introduces an approach for learning object detectors from real-world web videos known only to contain objects of a target class. We propose a fully automatic pipeline that localizes objects in a set of videos of the class and learns a detector for it. The approach extracts candidate spatio-temporal tubes based on motion segmentation and then selects one tube per video jointly over all videos. To compare to the state of the art, we test our detector on still images, i.e., Pascal VOC 2007. We observe that frames extracted from web videos can differ significantly in terms of quality to still images taken by a good camera. Thus, we formulate the learning from videos as a domain adaptation task. We show that training from a combination of weakly annotated videos and fully annotated still images using domain adaptation improves the performance of a detector trained from still images alone.

Journal ArticleDOI
TL;DR: A bottom-up aggregation approach to image segmentation that takes into account intensity and texture distributions in a local area around each region and incorporates priors based on the geometry of the regions, providing a complete hierarchical segmentation of the image.
Abstract: We present a bottom-up aggregation approach to image segmentation. Beginning with an image, we execute a sequence of steps in which pixels are gradually merged to produce larger and larger regions. In each step, we consider pairs of adjacent regions and provide a probability measure to assess whether or not they should be included in the same segment. Our probabilistic formulation takes into account intensity and texture distributions in a local area around each region. It further incorporates priors based on the geometry of the regions. Finally, posteriors based on intensity and texture cues are combined using “a mixture of experts” formulation. This probabilistic approach is integrated into a graph coarsening scheme, providing a complete hierarchical segmentation of the image. The algorithm complexity is linear in the number of the image pixels and it requires almost no user-tuned parameters. In addition, we provide a novel evaluation scheme for image segmentation algorithms, attempting to avoid human semantic considerations that are out of scope for segmentation algorithms. Using this novel evaluation scheme, we test our method and provide a comparison to several existing segmentation algorithms.

Proceedings ArticleDOI
16 Jun 2012
TL;DR: The main objective is to empirically understand the promises and challenges of scene labeling with RGB-D; the framework of kernel descriptors, which converts local similarities (kernels) to patch descriptors, is adapted to capture appearance (RGB) and shape (D) similarities.
Abstract: Scene labeling research has mostly focused on outdoor scenes, leaving the harder case of indoor scenes poorly understood. Microsoft Kinect dramatically changed the landscape, showing great potential for RGB-D perception (color+depth). Our main objective is to empirically understand the promises and challenges of scene labeling with RGB-D. We use the NYU Depth Dataset as collected and analyzed by Silberman and Fergus [30]. For RGB-D features, we adapt the framework of kernel descriptors that converts local similarities (kernels) to patch descriptors. For contextual modeling, we combine two lines of approaches, one using a superpixel MRF, and the other using a segmentation tree. We find that (1) kernel descriptors are very effective in capturing appearance (RGB) and shape (D) similarities; (2) both superpixel MRF and segmentation tree are useful in modeling context; and (3) the key to labeling accuracy is the ability to efficiently train and test with large-scale data. We improve labeling accuracy on the NYU Dataset from 56.6% to 76.1%. We also apply our approach to image-only scene labeling and improve the accuracy on the Stanford Background Dataset from 79.4% to 82.9%.

Proceedings ArticleDOI
16 Jun 2012
TL;DR: An approach to holistic scene understanding that reasons jointly about regions, location, class and spatial extent of objects, presence of a class in the image, as well as the scene type; it outperforms the state of the art on the MSRC-21 benchmark while being much faster.
Abstract: In this paper we propose an approach to holistic scene understanding that reasons jointly about regions, location, class and spatial extent of objects, presence of a class in the image, as well as the scene type. Learning and inference in our model are efficient as we reason at the segment level, and introduce auxiliary variables that allow us to decompose the inherent high-order potentials into pairwise potentials between a few variables with small number of states (at most the number of classes). Inference is done via a convergent message-passing algorithm, which, unlike graph-cuts inference, has no submodularity restrictions and does not require potential specific moves. We believe this is very important, as it allows us to encode our ideas and prior knowledge about the problem without the need to change the inference engine every time we introduce a new potential. Our approach outperforms the state-of-the-art on the MSRC-21 benchmark, while being much faster. Importantly, our holistic model is able to improve performance in all tasks.

Journal ArticleDOI
TL;DR: A new robust method dedicated to produce consistent and accurate brain extraction based on nonlocal segmentation embedded in a multi-resolution framework, which provides results comparable to a recent label fusion approach, while being 40 times faster and requiring a much smaller library of priors.

Journal ArticleDOI
TL;DR: Nonparametric Bayesian methods are considered for recovery of imagery based upon compressive, incomplete, and/or noisy measurements and significant improvements in image recovery are manifested using learned dictionaries, relative to using standard orthonormal image expansions.
Abstract: Nonparametric Bayesian methods are considered for recovery of imagery based upon compressive, incomplete, and/or noisy measurements. A truncated beta-Bernoulli process is employed to infer an appropriate dictionary for the data under test and also for image recovery. In the context of compressive sensing, significant improvements in image recovery are manifested using learned dictionaries, relative to using standard orthonormal image expansions. The compressive-measurement projections are also optimized for the learned dictionary. Additionally, we consider simpler (incomplete) measurements, defined by measuring a subset of image pixels, uniformly selected at random. Spatial interrelationships within imagery are exploited through use of the Dirichlet and probit stick-breaking processes. Several example results are presented, with comparisons to other methods in the literature.

Journal ArticleDOI
TL;DR: In this article, an extension of the α-expansion algorithm is proposed that optimizes label costs with well-characterized optimality bounds, which is useful for multi-model fitting.
Abstract: The α-expansion algorithm has had a significant impact in computer vision due to its generality, effectiveness, and speed. It is commonly used to minimize energies that involve unary, pairwise, and specialized higher-order terms. Our main algorithmic contribution is an extension of α-expansion that also optimizes "label costs" with well-characterized optimality bounds. Label costs penalize a solution based on the set of labels that appear in it, for example by simply penalizing the number of labels in the solution. Our energy has a natural interpretation as minimizing description length (MDL) and sheds light on classical algorithms like K-means and expectation-maximization (EM). Label costs are useful for multi-model fitting and we demonstrate several such applications: homography detection, motion segmentation, image segmentation, and compression. Our C++ and MATLAB code is publicly available http://vision.csd.uwo.ca/code/ .
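
To make the notion of label costs concrete, the sketch below evaluates an energy with unary, pairwise, and per-label terms for a given labeling. It only computes the energy and does not implement the expansion-move optimization from the paper. The Potts pairwise penalty, the function name, and all parameter names are assumptions chosen for illustration; the authors' publicly available code is not reproduced here.

```python
def labeling_energy(labels, unary, neighbors, smooth_weight, label_cost):
    """Energy of a labeling: data term + Potts smoothness + cost of each label used.

    labels[p]      label assigned to site p
    unary[p][l]    cost of giving site p the label l
    neighbors      list of (p, q) index pairs that should prefer equal labels
    label_cost[l]  fixed penalty paid once if label l appears anywhere
    """
    data = sum(unary[p][l] for p, l in enumerate(labels))
    # Potts model (assumption): constant penalty for each disagreeing pair.
    smooth = smooth_weight * sum(1 for p, q in neighbors if labels[p] != labels[q])
    # Label costs depend only on the set of labels present in the solution.
    used = sum(label_cost[l] for l in set(labels))
    return data + smooth + used
```

In a three-site chain with `unary = [[0, 5], [0, 5], [4, 1]]`, unit smoothness weight, and `label_cost = [2, 3]`, the labeling `[0, 0, 1]` has energy 7 while `[0, 0, 0]` has energy 6, so the label cost tips the balance toward the solution with fewer labels.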

Journal ArticleDOI
TL;DR: A forensic tool able to discriminate between original and forged regions in an image captured by a digital camera is presented, based on a new feature measuring the presence of demosaicking artifacts at a local level and a new statistical model that yields the tampering probability of each 2 × 2 image block without a priori knowledge of the position of the forged region.
Abstract: In this paper, a forensic tool able to discriminate between original and forged regions in an image captured by a digital camera is presented. We make the assumption that the image is acquired using a Color Filter Array, and that tampering removes the artifacts due to the demosaicking algorithm. The proposed method is based on a new feature measuring the presence of demosaicking artifacts at a local level, and on a new statistical model that yields the tampering probability of each 2 × 2 image block without a priori knowledge of the position of the forged region. Experimental results on different cameras equipped with different demosaicking algorithms demonstrate both the validity of the theoretical model and the effectiveness of our scheme.

Proceedings ArticleDOI
16 Jun 2012
TL;DR: A novel design for region-based object detectors is proposed that efficiently integrates top-down information from scanning-windows part models and global appearance cues; the detectors produce class-specific scores for bottom-up regions and then aggregate the votes of multiple overlapping candidates through pixel classification.
Abstract: We address the problem of segmenting and recognizing objects in real world images, focusing on challenging articulated categories such as humans and other animals. For this purpose, we propose a novel design for region-based object detectors that integrates efficiently top-down information from scanning-windows part models and global appearance cues. Our detectors produce class-specific scores for bottom-up regions, and then aggregate the votes of multiple overlapping candidates through pixel classification. We evaluate our approach on the PASCAL segmentation challenge, and report competitive performance with respect to current leading techniques. On VOC2010, our method obtains the best results in 6/20 categories and the highest performance on articulated objects.

Proceedings ArticleDOI
16 Jun 2012
TL;DR: A novel energy-minimization approach to cosegmentation is proposed that can handle multiple classes and a significantly larger number of images, and that admits a probabilistic interpretation.
Abstract: Bottom-up, fully unsupervised segmentation remains a daunting challenge for computer vision. In the cosegmentation context, on the other hand, the availability of multiple images assumed to contain instances of the same object classes provides a weak form of supervision that can be exploited by discriminative approaches. Unfortunately, most existing algorithms are limited to a very small number of images and/or object classes (typically two of each). This paper proposes a novel energy-minimization approach to cosegmentation that can handle multiple classes and a significantly larger number of images. The proposed cost function combines spectral- and discriminative-clustering terms, and it admits a probabilistic interpretation. It is optimized using an efficient EM method, initialized using a convex quadratic approximation of the energy. Comparative experiments show that the proposed approach matches or improves the state of the art on several standard datasets.

Proceedings ArticleDOI
16 Jun 2012
TL;DR: The boosting model outperforms 27 state-of-the-art models and is so far the closest model to the accuracy of human model for fixation prediction, and successfully detects the most salient object in a scene without sophisticated image processing such as region segmentation.
Abstract: Despite significant recent progress, the best available visual saliency models still lag behind human performance in predicting eye fixations in free-viewing of natural scenes. The majority of models are based on low-level visual features and the importance of top-down factors has not yet been fully explored or modeled. Here, we combine low-level features such as orientation, color, intensity, saliency maps of previous best bottom-up models with top-down cognitive visual features (e.g., faces, humans, cars, etc.) and learn a direct mapping from those features to eye fixations using Regression, SVM, and AdaBoost classifiers. By extensive experimenting over three benchmark eye-tracking datasets using three popular evaluation scores, we show that our boosting model outperforms 27 state-of-the-art models and is so far the closest model to the accuracy of human model for fixation prediction. Furthermore, our model successfully detects the most salient object in a scene without sophisticated image processing such as region segmentation.

Proceedings ArticleDOI
16 Jun 2012
TL;DR: A novel segmentation framework based on bipartite graph partitioning is proposed, which is able to aggregate multi-layer superpixels in a principled and very effective manner and leads to a highly efficient, linear-time spectral algorithm.
Abstract: Grouping cues can affect the performance of segmentation greatly. In this paper, we show that superpixels (image segments) can provide powerful grouping cues to guide segmentation, where superpixels can be collected easily by (over)-segmenting the image using any reasonable existing segmentation algorithms. Generated by different algorithms with varying parameters, superpixels can capture diverse and multi-scale visual patterns of a natural image. Successful integration of the cues from a large multitude of superpixels presents a promising yet not fully explored direction. In this paper, we propose a novel segmentation framework based on bipartite graph partitioning, which is able to aggregate multi-layer superpixels in a principled and very effective manner. Computationally, it is tailored to unbalanced bipartite graph structure and leads to a highly efficient, linear-time spectral algorithm. Our method achieves significantly better performance on the Berkeley Segmentation Database compared to state-of-the-art techniques.

Proceedings ArticleDOI
09 Jul 2012
TL;DR: A feature-fusion-based food recognition method for bounding boxes of the candidate regions with various kinds of visual features including bag-of-features of SIFT and CSIFT with spatial pyramid, histogram of oriented gradient (HoG), and Gabor texture features is applied.
Abstract: In this paper, we propose a two-step method to recognize multiple-food images by detecting candidate regions with several methods and classifying them with various kinds of features. In the first step, we detect several candidate regions by fusing outputs of several region detectors including Felzenszwalb's deformable part model (DPM) [1], a circle detector and the JSEG region segmentation. In the second step, we apply a feature-fusion-based food recognition method for bounding boxes of the candidate regions with various kinds of visual features including bag-of-features of SIFT and CSIFT with spatial pyramid (SP-BoF), histogram of oriented gradient (HoG), and Gabor texture features. In the experiments, we estimated ten food candidates for multiple-food images in descending order of confidence score. As a result, we achieved a 55.8% classification rate on a multiple-food image data set, improving on the baseline that uses only DPM by 14.3 points. This demonstrates that the proposed two-step method is effective for recognition of multiple-food images.

Book ChapterDOI
07 Oct 2012
TL;DR: This work proposes an approximation framework for streaming hierarchical video segmentation motivated by data stream algorithms: each video frame is processed only once and does not change the segmentation of previous frames.
Abstract: The use of video segmentation as an early processing step in video analysis lags behind the use of image segmentation for image analysis, despite many available video segmentation methods. A major reason for this lag is simply that videos are an order of magnitude bigger than images; yet most methods require all voxels in the video to be loaded into memory, which is clearly prohibitive for even medium length videos. We address this limitation by proposing an approximation framework for streaming hierarchical video segmentation motivated by data stream algorithms: each video frame is processed only once and does not change the segmentation of previous frames. We implement the graph-based hierarchical segmentation method within our streaming framework; our method is the first streaming hierarchical video segmentation method proposed. We perform thorough experimental analysis on a benchmark video data set and longer videos. Our results indicate the graph-based streaming hierarchical method outperforms other streaming video segmentation methods and performs nearly as well as the full-video hierarchical graph-based method.

Journal ArticleDOI
TL;DR: An overview of the image analysis techniques in the domain of histopathology, specifically, for the objective of automated carcinoma detection and classification is presented and emphasis is given to state-of-the-art image segmentation methods for feature extraction and disease classification.

Journal ArticleDOI
Andac Hamamci, N. Kucuk, Kutlay Karaman, Kayihan Engin, Gozde Unal
TL;DR: A cellular automata (CA) based seeded tumor segmentation method for contrast enhanced T1 weighted magnetic resonance images is proposed, which standardizes the volume of interest (VOI) and seed selection; a CA-based algorithm is also presented to differentiate necrotic and enhancing tumor tissue content, which gains importance for a detailed assessment of radiation therapy response.
Abstract: In this paper, we present a fast and robust practical tool for segmentation of solid tumors with minimal user interaction to assist clinicians and researchers in radiosurgery planning and assessment of the response to the therapy. Particularly, a cellular automata (CA) based seeded tumor segmentation method on contrast enhanced T1 weighted magnetic resonance (MR) images, which standardizes the volume of interest (VOI) and seed selection, is proposed. First, we establish the connection of the CA-based segmentation to the graph-theoretic methods to show that the iterative CA framework solves the shortest path problem. In that regard, we modify the state transition function of the CA to calculate the exact shortest path solution. Furthermore, a sensitivity parameter is introduced to adapt to the heterogeneous tumor segmentation problem, and an implicit level set surface is evolved on a tumor probability map constructed from CA states to impose spatial smoothness. Sufficient information to initialize the algorithm is gathered from the user simply by a line drawn on the maximum diameter of the tumor, in line with the clinical practice. Furthermore, an algorithm based on CA is presented to differentiate necrotic and enhancing tumor tissue content, which gains importance for a detailed assessment of radiation therapy response. Validation studies on both clinical and synthetic brain tumor datasets demonstrate 80%-90% overlap performance of the proposed algorithm with an emphasis on less sensitivity to seed initialization, robustness with respect to different and heterogeneous tumor types, and its efficiency in terms of computation time.
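
The abstract above states that the iterative CA framework solves a shortest path problem. As a rough illustration of that connection only, the sketch below labels each pixel with the seed it can reach most cheaply, using a multi-source Dijkstra search in place of the CA iteration. The function name, the 4-connected neighborhood, and the absolute intensity difference as edge cost are all assumptions made for this example, not details taken from the paper.

```python
import heapq


def seeded_shortest_path_labels(image, seeds):
    """Label every pixel with the seed that reaches it at the lowest path cost.

    image: 2D list of intensities.
    seeds: dict mapping (row, col) -> label, e.g. "tumor" or "background".
    Edge cost between 4-connected neighbors is the absolute intensity
    difference (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    dist = {pos: 0.0 for pos in seeds}
    labels = dict(seeds)
    heap = [(0.0, pos) for pos in seeds]
    heapq.heapify(heap)

    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[(r, c)]:
            continue  # stale queue entry, a cheaper path was already found
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + abs(image[r][c] - image[nr][nc])
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    labels[(nr, nc)] = labels[(r, c)]
                    heapq.heappush(heap, (nd, (nr, nc)))
    return labels


# Tiny example: dark region on the left, bright region on the right.
image = [
    [10, 10, 90, 90],
    [10, 12, 88, 90],
]
seeds = {(0, 0): "background", (0, 3): "tumor"}
labels = seeded_shortest_path_labels(image, seeds)
```

In this toy image the two left columns end up with the "background" label and the two right columns with "tumor", because crossing the intensity step between the regions is far more expensive than moving within either one.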

Proceedings ArticleDOI
22 Aug 2012
TL;DR: This paper evaluated SFTA for the tasks of content-based image retrieval (CBIR) and image classification, comparing its performance to that of other widely employed feature extraction methods such as Haralick and Gabor filter banks, and found that SFTA achieved higher precision and accuracy for CBIR and image classification.
Abstract: In this paper we propose a new and efficient texture feature extraction method: the Segmentation-based Fractal Texture Analysis, or SFTA. The extraction algorithm consists in decomposing the input image into a set of binary images from which the fractal dimensions of the resulting regions are computed in order to describe segmented texture patterns. The decomposition of the input image is achieved by the Two-Threshold Binary Decomposition (TTBD) algorithm, which we also propose in this work. We evaluated SFTA for the tasks of content-based image retrieval (CBIR) and image classification, comparing its performance to that of other widely employed feature extraction methods such as Haralick and Gabor filter banks. SFTA achieved higher precision and accuracy for CBIR and image classification. Additionally, SFTA was at least 3.7 times faster than Gabor and 1.6 times faster than Haralick with respect to feature extraction time.
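
The abstract above describes computing fractal dimensions from binary images. One common estimator for this is box counting, sketched below for a binary image stored as a list of lists. Whether SFTA uses exactly this estimator is not stated in the abstract, so the choice of box counting, the function names, and the default box sizes are assumptions of this example.

```python
import math


def box_count(image, size):
    """Count the size x size boxes that contain at least one foreground pixel."""
    rows, cols = len(image), len(image[0])
    count = 0
    for r in range(0, rows, size):
        for c in range(0, cols, size):
            if any(
                image[i][j]
                for i in range(r, min(r + size, rows))
                for j in range(c, min(c + size, cols))
            ):
                count += 1
    return count


def box_counting_dimension(image, sizes=(1, 2, 4, 8)):
    """Estimate the fractal dimension as the least-squares slope of
    log(box count) against log(1 / box size).

    Assumes the image contains at least one foreground pixel.
    """
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(image, s)) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Sanity checks on shapes with known dimension.
filled_square = [[1] * 16 for _ in range(16)]
single_line = [[1] * 16] + [[0] * 16 for _ in range(15)]
square_dim = box_counting_dimension(filled_square)
line_dim = box_counting_dimension(single_line)
```

A filled square gives an estimate close to 2 and a straight line gives an estimate close to 1, which is the behavior expected of a dimension estimate on these simple shapes.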

Proceedings Article
Ren Xiaofeng, Liefeng Bo
03 Dec 2012
TL;DR: This work shows that contour detection accuracy can be significantly improved by computing Sparse Code Gradients (SCG), which measure contrast using patch representations automatically learned through sparse coding, which is verified on the NYU Depth Dataset.
Abstract: Finding contours in natural images is a fundamental problem that serves as the basis of many tasks such as image segmentation and object recognition. At the core of contour detection technologies are a set of hand-designed gradient features, used by most approaches including the state-of-the-art Global Pb (gPb) operator. In this work, we show that contour detection accuracy can be significantly improved by computing Sparse Code Gradients (SCG), which measure contrast using patch representations automatically learned through sparse coding. We use K-SVD for dictionary learning and Orthogonal Matching Pursuit for computing sparse codes on oriented local neighborhoods, and apply multi-scale pooling and power transforms before classifying them with linear SVMs. By extracting rich representations from pixels and avoiding collapsing them prematurely, Sparse Code Gradients effectively learn how to measure local contrasts and find contours. We improve the F-measure metric on the BSDS500 benchmark to 0.74 (up from 0.71 of gPb contours). Moreover, our learning approach can easily adapt to novel sensor data such as Kinect-style RGB-D cameras: Sparse Code Gradients on depth maps and surface normals lead to promising contour detection using depth and depth+color, as verified on the NYU Depth Dataset.

Journal ArticleDOI
TL;DR: This paper presents two novel methods for segmentation of images based on the Fractional-Order Darwinian Particle Swarm Optimization (FODPSO) and Darwinian Particle Swarm Optimization (DPSO) for determining the n-1 optimal n-level threshold on a given image.
Abstract: Image segmentation has been widely used in document image analysis for extraction of printed characters, map processing in order to find lines, legends, and characters, topological features extraction for extraction of geographical information, and quality inspection of materials where defective parts must be delineated among many other applications. In image analysis, the efficient segmentation of images into meaningful objects is important for classification and object recognition. This paper presents two novel methods for segmentation of images based on the Fractional-Order Darwinian Particle Swarm Optimization (FODPSO) and Darwinian Particle Swarm Optimization (DPSO) for determining the n-1 optimal n-level threshold on a given image. The efficiency of the proposed methods is compared with other well-known thresholding segmentation methods. Experimental results show that the proposed methods perform better than other methods when considering a number of different measures.
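
To make the n-level thresholding problem in the abstract above concrete, the sketch below searches for the n-1 thresholds that split a histogram into n classes. It is not the FODPSO or DPSO method: it uses a plain exhaustive search, which is only practical for small histograms, and it assumes Otsu's between-class variance as the objective, which the abstract does not specify. All function names are hypothetical.

```python
from itertools import combinations


def between_class_variance(hist, thresholds):
    """Otsu-style between-class variance of a histogram split at the thresholds.

    A threshold t puts intensities <= t in the lower class.
    """
    total = sum(hist)
    global_mean = sum(i * h for i, h in enumerate(hist)) / total
    edges = [0] + [t + 1 for t in sorted(thresholds)] + [len(hist)]
    variance = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        class_count = sum(hist[lo:hi])
        if class_count == 0:
            continue  # an empty class contributes nothing
        class_mean = sum(i * hist[i] for i in range(lo, hi)) / class_count
        variance += (class_count / total) * (class_mean - global_mean) ** 2
    return variance


def best_thresholds(hist, n_levels):
    """Exhaustively search the n_levels - 1 thresholds that maximize the variance.

    Swarm-based optimizers aim to find such thresholds without
    enumerating every combination.
    """
    candidates = range(len(hist) - 1)
    return max(
        combinations(candidates, n_levels - 1),
        key=lambda ts: between_class_variance(hist, ts),
    )


# Histogram with three well-separated intensity clusters: {0, 1}, {3, 4}, {6, 7}.
hist = [5, 5, 0, 5, 5, 0, 5, 5]
t1, t2 = best_thresholds(hist, n_levels=3)
```

For this histogram the two thresholds fall in the gaps between the clusters. The number of candidate combinations grows quickly with the number of levels and gray values, which is the motivation for using an optimizer rather than enumeration.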