Author

Kevin Smith

Bio: Kevin Smith is an academic researcher from Royal Institute of Technology. The author has contributed to research in topics: Image segmentation & Breast cancer. The author has an h-index of 25 and has co-authored 60 publications receiving 8802 citations. Previous affiliations of Kevin Smith include Karolinska Institutet & École Polytechnique Fédérale de Lausanne.


Papers
Journal ArticleDOI
TL;DR: A new superpixel algorithm is introduced, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels and is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.
Abstract: Computer vision applications have come to rely increasingly on superpixels in recent years, but it is not always clear what constitutes a good superpixel algorithm. In an effort to understand the benefits and drawbacks of existing methods, we empirically compare five state-of-the-art superpixel algorithms for their ability to adhere to image boundaries, speed, memory efficiency, and their impact on segmentation performance. We then introduce a new superpixel algorithm, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels. Despite its simplicity, SLIC adheres to boundaries as well as or better than previous methods. At the same time, it is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.

7,849 citations
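
The k-means adaptation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a grayscale image stored as a list of lists, and it assumes the commonly cited SLIC distance that combines an intensity term with a spatial term scaled by the grid spacing. The names `slic_gray`, `S` (grid spacing) and `m` (compactness weight) are illustrative choices, and the colour-space handling, seed perturbation and connectivity enforcement of a full implementation are omitted.

```python
import math

def slic_gray(img, k, m=10.0, iters=5):
    """Simplified SLIC-style clustering on a grayscale image (list of lists)."""
    h, w = len(img), len(img[0])
    S = max(1, int(math.sqrt(h * w / k)))  # assumed grid spacing between seeds
    # Seed cluster centers [intensity, y, x] on a regular grid.
    centers = [[img[y][x], float(y), float(x)]
               for y in range(S // 2, h, S) for x in range(S // 2, w, S)]
    labels = [[-1] * w for _ in range(h)]
    for _ in range(iters):
        dist = [[math.inf] * w for _ in range(h)]
        for ci, (cv, cy, cx) in enumerate(centers):
            # Localized search: only pixels within S of the center are compared,
            # which is what keeps the method cheaper than plain k-means.
            for y in range(max(0, int(cy) - S), min(h, int(cy) + S + 1)):
                for x in range(max(0, int(cx) - S), min(w, int(cx) + S + 1)):
                    dc = img[y][x] - cv                # intensity distance
                    ds = math.hypot(y - cy, x - cx)    # spatial distance
                    d = math.sqrt(dc * dc + (ds / S) ** 2 * m * m)
                    if d < dist[y][x]:
                        dist[y][x] = d
                        labels[y][x] = ci
        # Update step: move each center to the mean of its assigned pixels.
        sums = [[0.0, 0.0, 0.0, 0] for _ in centers]
        for y in range(h):
            for x in range(w):
                acc = sums[labels[y][x]]
                acc[0] += img[y][x]
                acc[1] += y
                acc[2] += x
                acc[3] += 1
        for ci, (sv, sy, sx, n) in enumerate(sums):
            if n:
                centers[ci] = [sv / n, sy / n, sx / n]
    return labels

# Toy 8x8 image: dark left half, bright right half.
image = [[0 if x < 4 else 100 for x in range(8)] for y in range(8)]
labels = slic_gray(image, k=4)
```

On this toy image each superpixel should stay on one side of the vertical edge, which is the boundary-adherence property the abstract refers to. A larger `m` would favour compact, grid-like regions over edge-following ones.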

Journal ArticleDOI
TL;DR: This work proposes an automated graph partitioning scheme that is able to segment mitochondria at a performance level close to that of a human annotator, and outperforms a state-of-the-art 3-D segmentation technique.
Abstract: It is becoming increasingly clear that mitochondria play an important role in neural function. Recent studies show mitochondrial morphology to be crucial to cellular physiology and synaptic function, and a link between mitochondrial defects and neurodegenerative diseases is strongly suspected. Electron microscopy (EM), with its very high resolution in all three directions, is one of the key tools to look more closely into these issues, but the huge amounts of data it produces make automated analysis necessary. State-of-the-art computer vision algorithms designed to operate on natural 2-D images tend to perform poorly when applied to EM data for a number of reasons. First, the sheer size of a typical EM volume renders most modern segmentation schemes intractable. Furthermore, most approaches ignore important shape cues, relying only on local statistics that easily become confused when confronted with noise and textures inherent in the data. Finally, the conventional assumption that strong image gradients always correspond to object boundaries is violated by the clutter of distracting membranes. In this work, we propose an automated graph partitioning scheme that addresses these issues. It reduces the computational complexity by operating on supervoxels instead of voxels, incorporates shape features capable of describing the 3-D shape of the target objects, and learns to recognize the distinctive appearance of true boundaries. Our experiments demonstrate that our approach is able to segment mitochondria at a performance level close to that of a human annotator, and outperforms a state-of-the-art 3-D segmentation technique.

265 citations

Proceedings ArticleDOI
20 Jun 2005
TL;DR: A Bayesian framework is presented for the fully automatic tracking of a variable number of interacting targets using a fixed camera; it uses a trans-dimensional Markov Chain Monte Carlo particle filter to recursively estimate the multi-object configuration and efficiently search the state-space.
Abstract: In this paper, we present a Bayesian framework for the fully automatic tracking of a variable number of interacting targets using a fixed camera. This framework uses a joint multi-object state-space formulation and a trans-dimensional Markov Chain Monte Carlo (MCMC) particle filter to recursively estimate the multi-object configuration and efficiently search the state-space. We also define a global observation model comprised of color and binary measurements capable of discriminating between different numbers of objects in the scene. We present results which show that our method is capable of tracking varying numbers of people through several challenging real-world tracking situations such as full/partial occlusion and entering/leaving the scene.

251 citations

Journal ArticleDOI
TL;DR: The use of AI and deep learning in diagnostic breast pathology, and other recent developments in digital image analysis are covered.

197 citations

Proceedings ArticleDOI
20 Jun 2005
TL;DR: The tracking characteristics important to measure in a real-life application are explored, focusing on configuration and identification, and a set of measures and a protocol to objectively evaluate these characteristics are defined.
Abstract: Multiple object tracking (MOT) is an active and challenging research topic. Many different approaches to the MOT problem exist, yet there is little agreement amongst the community on how to evaluate or compare these methods, and the amount of literature addressing this problem is limited. The goal of this paper is to address this issue by providing a comprehensive approach to the empirical evaluation of tracking performance. To that end, we explore the tracking characteristics important to measure in a real-life application, focusing on configuration (the number and location of objects in a scene) and identification (the consistent labeling of objects over time), and define a set of measures and a protocol to objectively evaluate these characteristics.

181 citations


Cited by
Journal ArticleDOI
TL;DR: This work addresses the task of semantic image segmentation with Deep Learning and proposes atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales; it also improves the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models.
Abstract: In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit. First, we highlight convolution with upsampled filters, or ‘atrous convolution’, as a powerful tool in dense prediction tasks. Atrous convolution allows us to explicitly control the resolution at which feature responses are computed within Deep Convolutional Neural Networks. It also allows us to effectively enlarge the field of view of filters to incorporate larger context without increasing the number of parameters or the amount of computation. Second, we propose atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales. ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields-of-views, thus capturing objects as well as image context at multiple scales. Third, we improve the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models. The commonly deployed combination of max-pooling and downsampling in DCNNs achieves invariance but has a toll on localization accuracy. We overcome this by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF), which is shown both qualitatively and quantitatively to improve localization performance. Our proposed “DeepLab” system sets the new state-of-art at the PASCAL VOC-2012 semantic image segmentation task, reaching 79.7 percent mIOU in the test set, and advances the results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and Cityscapes. All of our code is made publicly available online.

11,856 citations
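
The claim that atrous convolution enlarges the field of view without adding parameters can be checked on a one-dimensional toy case. The sketch below is illustrative only and is not taken from the DeepLab code: the function name `atrous_conv1d` and the `rate` argument are assumptions, and the operation is written as a valid-mode correlation in plain Python.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """Valid-mode 1-D correlation whose kernel taps are `rate` samples apart."""
    # Effective field of view grows with the rate; the kernel itself does not.
    span = (len(kernel) - 1) * rate + 1
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(wgt * signal[start + i * rate]
                       for i, wgt in enumerate(kernel)))
    return out

signal = [1, 2, 3, 4, 5, 6, 7, 8]
kernel = [1, 0, -1]                                # three weights in both cases
dense = atrous_conv1d(signal, kernel, rate=1)      # each output sees 3 samples
dilated = atrous_conv1d(signal, kernel, rate=2)    # each output sees 5 samples
```

With `rate=2` the same three weights cover five input samples, so the context per output grows while the parameter count stays fixed.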

Proceedings ArticleDOI
21 Jul 2017
TL;DR: This paper exploits the capability of global context information by different-region-based context aggregation through the pyramid pooling module together with the proposed pyramid scene parsing network (PSPNet) to produce good quality results on the scene parsing task.
Abstract: Scene parsing is challenging for unrestricted open vocabulary and diverse scenes. In this paper, we exploit the capability of global context information by different-region-based context aggregation through our pyramid pooling module together with the proposed pyramid scene parsing network (PSPNet). Our global prior representation is effective to produce good quality results on the scene parsing task, while PSPNet provides a superior framework for pixel-level prediction. The proposed approach achieves state-of-the-art performance on various datasets. It came first in ImageNet scene parsing challenge 2016, PASCAL VOC 2012 benchmark and Cityscapes benchmark. A single PSPNet yields the new record of mIoU accuracy 85.4% on PASCAL VOC 2012 and accuracy 80.2% on Cityscapes.

10,189 citations
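
The region-based context aggregation mentioned in this abstract can be sketched as average pooling of one feature map at several grid resolutions. This is a simplified illustration, not PSPNet itself: the helper names are hypothetical, the bin sizes `(1, 2, 3, 6)` are an assumption, and the convolution, upsampling and concatenation steps of a full network are left out.

```python
def avg_pool_bins(fmap, bins):
    """Average-pool a 2-D list into a bins x bins grid (sizes must divide evenly)."""
    h, w = len(fmap), len(fmap[0])
    bh, bw = h // bins, w // bins
    pooled = []
    for by in range(bins):
        row = []
        for bx in range(bins):
            cells = [fmap[y][x]
                     for y in range(by * bh, (by + 1) * bh)
                     for x in range(bx * bw, (bx + 1) * bw)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled

def pyramid_pool(fmap, bin_sizes=(1, 2, 3, 6)):
    """Return one pooled map per pyramid level, coarsest (global) to finest."""
    return {b: avg_pool_bins(fmap, b) for b in bin_sizes}

# Toy 6x6 feature map with values 0..35.
fmap = [[float(y * 6 + x) for x in range(6)] for y in range(6)]
pyramid = pyramid_pool(fmap)
```

The 1x1 level summarises the whole map as a single global value, while finer levels keep progressively more local detail.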

Posted Content
TL;DR: DeepLab proposes atrous spatial pyramid pooling (ASPP) to segment objects at multiple scales by probing an incoming convolutional feature layer with filters at multiple sampling rates and effective fields-of-views.
Abstract: In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit. First, we highlight convolution with upsampled filters, or 'atrous convolution', as a powerful tool in dense prediction tasks. Atrous convolution allows us to explicitly control the resolution at which feature responses are computed within Deep Convolutional Neural Networks. It also allows us to effectively enlarge the field of view of filters to incorporate larger context without increasing the number of parameters or the amount of computation. Second, we propose atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales. ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields-of-views, thus capturing objects as well as image context at multiple scales. Third, we improve the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models. The commonly deployed combination of max-pooling and downsampling in DCNNs achieves invariance but has a toll on localization accuracy. We overcome this by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF), which is shown both qualitatively and quantitatively to improve localization performance. Our proposed "DeepLab" system sets the new state-of-art at the PASCAL VOC-2012 semantic image segmentation task, reaching 79.7% mIOU in the test set, and advances the results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and Cityscapes. All of our code is made publicly available online.

10,120 citations

Journal ArticleDOI
TL;DR: A new superpixel algorithm is introduced, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels and is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.
Abstract: Computer vision applications have come to rely increasingly on superpixels in recent years, but it is not always clear what constitutes a good superpixel algorithm. In an effort to understand the benefits and drawbacks of existing methods, we empirically compare five state-of-the-art superpixel algorithms for their ability to adhere to image boundaries, speed, memory efficiency, and their impact on segmentation performance. We then introduce a new superpixel algorithm, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels. Despite its simplicity, SLIC adheres to boundaries as well as or better than previous methods. At the same time, it is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.

7,849 citations

Journal ArticleDOI
TL;DR: The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS) was organized in conjunction with the MICCAI 2012 and 2013 conferences; twenty state-of-the-art tumor segmentation algorithms were applied to a set of 65 multi-contrast MR scans of low- and high-grade glioma patients.
Abstract: In this paper we report the set-up and results of the Multimodal Brain Tumor Image Segmentation Benchmark (BRATS) organized in conjunction with the MICCAI 2012 and 2013 conferences. Twenty state-of-the-art tumor segmentation algorithms were applied to a set of 65 multi-contrast MR scans of low- and high-grade glioma patients (manually annotated by up to four raters) and to 65 comparable scans generated using tumor image simulation software. Quantitative evaluations revealed considerable disagreement between the human raters in segmenting various tumor sub-regions (Dice scores in the range 74%–85%), illustrating the difficulty of this task. We found that different algorithms worked best for different sub-regions (reaching performance comparable to human inter-rater variability), but that no single algorithm ranked in the top for all sub-regions simultaneously. Fusing several good algorithms using a hierarchical majority vote yielded segmentations that consistently ranked above all individual algorithms, indicating remaining opportunities for further methodological improvements. The BRATS image data and manual annotations continue to be publicly available through an online evaluation system as an ongoing benchmarking resource.

3,699 citations
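
The Dice scores and the fusion by voting described in this abstract can be illustrated on toy binary masks. The sketch below makes two loud simplifications: it uses a plain per-pixel majority vote rather than the hierarchical vote used in the benchmark, and the masks and the names `algo_a`, `algo_b`, `algo_c` are invented for illustration, not benchmark data.

```python
def dice(a, b):
    """Dice overlap 2|A and B| / (|A| + |B|) for two flat binary masks."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    return 1.0 if total == 0 else 2.0 * inter / total

def majority_vote(masks):
    """Label a pixel 1 when more than half of the input masks label it 1."""
    n = len(masks)
    return [1 if 2 * sum(votes) > n else 0 for votes in zip(*masks)]

# Hypothetical flattened masks: one reference and three imperfect algorithms.
reference = [0, 1, 1, 1, 0, 0]
algo_a = [0, 1, 1, 0, 0, 0]   # misses one tumor pixel
algo_b = [0, 0, 1, 1, 1, 0]   # shifted by one pixel
algo_c = [1, 1, 1, 1, 0, 0]   # one false positive
fused = majority_vote([algo_a, algo_b, algo_c])

scores = {name: dice(mask, reference)
          for name, mask in [("a", algo_a), ("b", algo_b),
                             ("c", algo_c), ("fused", fused)]}
```

In this toy case the three algorithms make different errors, so the vote cancels them out and the fused mask scores higher than any single input, which mirrors the effect reported above.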