Author

Dorin Comaniciu

Other affiliations: Siemens, Rutgers University
Bio: Dorin Comaniciu is an academic researcher from Princeton University. The author has contributed to research in the topics Object detection and Segmentation, has an h-index of 74, and has co-authored 622 publications receiving 40,541 citations. Previous affiliations of Dorin Comaniciu include Siemens and Rutgers University.
Papers

Journal ArticleDOI
TL;DR: The convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function is proved, establishing its utility in detecting the modes of the density.
Abstract: A general non-parametric technique is proposed for the analysis of a complex multimodal feature space and for delineating arbitrarily shaped clusters in it. The basic computational module of the technique is an old pattern recognition procedure: the mean shift. For discrete data, we prove the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and, thus, its utility in detecting the modes of the density. The relation of the mean shift procedure to the Nadaraya-Watson estimator from kernel regression and to the robust M-estimators of location is also established. Algorithms for two low-level vision tasks, discontinuity-preserving smoothing and image segmentation, are described as applications. In these algorithms, the only user-set parameter is the resolution of the analysis, and either gray-level or color images are accepted as input. Extensive experimental results illustrate their excellent performance.

11,014 citations
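
The procedure the abstract describes can be sketched compactly. Below is a minimal NumPy sketch of the recursive mean shift iteration with a flat (uniform) kernel, plus mode-based clustering; the function names and the bandwidth/2 mode-merging rule are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def mean_shift_mode(x, data, bandwidth, tol=1e-5, max_iter=500):
    """Run the recursive mean shift procedure from a starting point x.

    Each step moves x to the mean of all samples within `bandwidth`,
    a step in the direction of the density gradient; the sequence
    converges to a stationary point (mode) of the density estimate.
    """
    x = np.asarray(x, dtype=float)
    for _ in range(max_iter):
        # Samples inside the window of the flat kernel centered at x.
        in_window = np.linalg.norm(data - x, axis=1) < bandwidth
        if not in_window.any():
            break
        shifted = data[in_window].mean(axis=0)  # the mean shift target
        if np.linalg.norm(shifted - x) < tol:   # reached a stationary point
            return shifted
        x = shifted
    return x

def mean_shift_cluster(data, bandwidth):
    """Cluster by the mode each sample converges to (mode detection)."""
    modes = np.array([mean_shift_mode(p, data, bandwidth) for p in data])
    labels = -np.ones(len(data), dtype=int)
    centers = []
    # Samples whose modes nearly coincide belong to one cluster.
    for i, m in enumerate(modes):
        for j, c in enumerate(centers):
            if np.linalg.norm(m - c) < bandwidth / 2:
                labels[i] = j
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    return np.array(centers), labels
```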


Journal ArticleDOI
TL;DR: A new approach toward target representation and localization, the central component in visual tracking of nonrigid objects, is proposed, which employs a metric derived from the Bhattacharyya coefficient as similarity measure, and uses the mean shift procedure to perform the optimization.
Abstract: A new approach toward target representation and localization, the central component in visual tracking of nonrigid objects, is proposed. The feature histogram-based target representations are regularized by spatial masking with an isotropic kernel. The masking induces spatially-smooth similarity functions suitable for gradient-based optimization, hence, the target localization problem can be formulated using the basin of attraction of the local maxima. We employ a metric derived from the Bhattacharyya coefficient as similarity measure, and use the mean shift procedure to perform the optimization. In the presented tracking examples, the new method successfully coped with camera motion, partial occlusions, clutter, and target scale variations. Integration with motion filters and data association techniques is also discussed. We describe only a few of the potential applications: exploitation of background information, Kalman tracking using motion models, and face tracking.

4,901 citations
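
The similarity measure the tracker optimizes is easy to state in code. A minimal sketch, assuming p and q are normalized m-bin color histograms (NumPy arrays); the function names are illustrative:

```python
import numpy as np

def bhattacharyya_coefficient(p, q):
    """rho(p, q) = sum_u sqrt(p_u * q_u) for normalized histograms."""
    return np.sum(np.sqrt(p * q))

def bhattacharyya_distance(p, q):
    """The metric d = sqrt(1 - rho) used to compare the target model
    against a candidate; smaller means more similar."""
    return np.sqrt(max(0.0, 1.0 - bhattacharyya_coefficient(p, q)))
```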


Proceedings ArticleDOI
14 Feb 2000
TL;DR: The theoretical analysis of the approach shows that it relates to the Bayesian framework while providing a practical, fast and efficient solution for real time tracking of non-rigid objects seen from a moving camera.
Abstract: A new method for real time tracking of non-rigid objects seen from a moving camera is proposed. The central computational module is based on the mean shift iterations and finds the most probable target position in the current frame. The dissimilarity between the target model (its color distribution) and the target candidates is expressed by a metric derived from the Bhattacharyya coefficient. The theoretical analysis of the approach shows that it relates to the Bayesian framework while providing a practical, fast and efficient solution. The capability of the tracker to handle in real time partial occlusions, significant clutter, and target scale variations, is demonstrated for several image sequences.

3,295 citations
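
One localization step of such a tracker can be sketched as follows, assuming an Epanechnikov profile (for which the update reduces to a weighted mean), pixel coordinates pixel_xy and their histogram-bin indices pixel_bins inside the candidate window, and normalized histograms p_candidate and q_model; all names are illustrative:

```python
import numpy as np

def mean_shift_track_step(pixel_xy, pixel_bins, p_candidate, q_model):
    """One location update: move the window center to the weighted mean
    of pixel positions, with weights w_i = sqrt(q[b_i] / p[b_i]) that
    favor colors under-represented in the candidate. Iterating this
    step climbs toward a maximum of the Bhattacharyya similarity."""
    eps = 1e-12  # guard against empty candidate bins
    w = np.sqrt(q_model[pixel_bins] / (p_candidate[pixel_bins] + eps))
    return (w[:, None] * pixel_xy).sum(axis=0) / w.sum()
```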


Proceedings ArticleDOI
20 Sep 1999
TL;DR: A nonparametric estimator of density gradient, the mean shift, is employed in the joint, spatial-range (value) domain of gray level and color images for discontinuity preserving filtering and image segmentation and its convergence on lattices is proven.
Abstract: A nonparametric estimator of density gradient, the mean shift, is employed in the joint, spatial-range (value) domain of gray level and color images for discontinuity preserving filtering and image segmentation. Properties of the mean shift are reviewed and its convergence on lattices is proven. The proposed filtering method associates with each pixel in the image the closest local mode in the density distribution of the joint domain. Segmentation into a piecewise constant structure requires only one more step, fusion of the regions associated with nearby modes. The proposed technique has two parameters controlling the resolution in the spatial and range domains. Since convergence is guaranteed, the technique does not require the intervention of the user to stop the filtering at the desired image quality. Several examples, for gray and color images, show the versatility of the method and compare favorably with results described in the literature for the same images.

1,035 citations
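
A minimal sketch of the filtering step for a gray-level image, assuming flat kernels in both domains; hs and hr stand for the two resolution parameters the abstract mentions (spatial and range bandwidths). This is a didactic per-pixel loop, not an optimized implementation:

```python
import numpy as np

def mean_shift_filter_gray(image, hs, hr, max_iter=20, tol=0.5):
    """Discontinuity-preserving smoothing in the joint spatial-range domain.

    Each pixel (x, y, v) shifts toward the mean of neighbors that are
    within hs in space AND within hr in gray value, until it settles at
    a local mode; the pixel then takes the mode's range value."""
    h, w = image.shape
    img = image.astype(float)
    out = np.empty_like(img)
    for y in range(h):
        for x in range(w):
            cx, cy, cv = float(x), float(y), img[y, x]
            for _ in range(max_iter):
                x0, x1 = int(max(cx - hs, 0)), int(min(cx + hs + 1, w))
                y0, y1 = int(max(cy - hs, 0)), int(min(cy + hs + 1, h))
                patch = img[y0:y1, x0:x1]
                ys, xs = np.mgrid[y0:y1, x0:x1]
                # Flat kernel in the joint domain: close in space and range.
                mask = ((xs - cx) ** 2 + (ys - cy) ** 2 <= hs ** 2) & \
                       (np.abs(patch - cv) <= hr)
                if not mask.any():
                    break
                nx, ny, nv = xs[mask].mean(), ys[mask].mean(), patch[mask].mean()
                if abs(nx - cx) + abs(ny - cy) + abs(nv - cv) < tol:
                    break
                cx, cy, cv = nx, ny, nv
            out[y, x] = cv
    return out
```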


Proceedings ArticleDOI
17 Jun 1997
TL;DR: A general technique for the recovery of significant image features is presented, based on the mean shift algorithm, a simple nonparametric procedure for estimating density gradients.
Abstract: A general technique for the recovery of significant image features is presented. The technique is based on the mean shift algorithm, a simple nonparametric procedure for estimating density gradients. Drawbacks of the current methods (including robust clustering) are avoided. Feature space of any nature can be processed, and as an example, color image segmentation is discussed. The segmentation is completely autonomous; only its class is chosen by the user. Thus, the same program can produce a high-quality edge image or provide, by extracting all the significant colors, a preprocessor for content-based query systems. A 512×512 color image is analyzed in less than 10 seconds on a standard workstation. Gray-level images are handled as color images having only the lightness coordinate.

777 citations
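
The autonomous segmentation the abstract describes ends with region fusion. A hedged sketch of that last step, assuming the joint-domain filtering output from the previous example: spatially connected pixels whose converged (mode) values differ by less than the range bandwidth are fused into one region. The flood-fill labeling below is an illustrative choice:

```python
import numpy as np

def segment_from_filtered(filtered, hr):
    """Fuse pixels into regions: 4-connected pixels whose filtered
    (mode) values differ by at most the range bandwidth hr share a
    label. Returns the label map and the number of regions."""
    h, w = filtered.shape
    labels = -np.ones((h, w), dtype=int)
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy, sx] != -1:
                continue
            stack = [(sy, sx)]          # flood fill from an unlabeled seed
            labels[sy, sx] = next_label
            while stack:
                y, x = stack.pop()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] == -1 \
                            and abs(filtered[ny, nx] - filtered[y, x]) <= hr:
                        labels[ny, nx] = next_label
                        stack.append((ny, nx))
            next_label += 1
    return labels, next_label
```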


Cited by

Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories.

First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules.

Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, handwriting recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs.

Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules.

Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up to date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically.

Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

12,323 citations
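
The mail-filtering example above can be made concrete. A minimal sketch of a learned filter, assuming a bag-of-words, naive-Bayes-style model trained on messages the user has kept or rejected; nothing here is tied to a particular system:

```python
from collections import Counter
import math

def train_filter(spam_msgs, ham_msgs):
    """Learn per-word log-likelihood ratios from labeled examples,
    with Laplace (+1) smoothing so unseen words are handled."""
    spam_counts = Counter(w for m in spam_msgs for w in m.lower().split())
    ham_counts = Counter(w for m in ham_msgs for w in m.lower().split())
    vocab = set(spam_counts) | set(ham_counts)
    n_spam = sum(spam_counts.values()) + len(vocab)
    n_ham = sum(ham_counts.values()) + len(vocab)
    return {w: math.log((spam_counts[w] + 1) / n_spam)
              - math.log((ham_counts[w] + 1) / n_ham) for w in vocab}

def is_spam(model, message, threshold=0.0):
    """Classify by summing learned word scores; retraining on the
    user's latest keep/reject decisions keeps the rules up to date."""
    score = sum(model.get(w, 0.0) for w in message.lower().split())
    return score > threshold
```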


Journal ArticleDOI
TL;DR: It is proved the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and, thus, its utility in detecting the modes of the density.
Abstract: A general non-parametric technique is proposed for the analysis of a complex multimodal feature space and to delineate arbitrarily shaped clusters in it. The basic computational module of the technique is an old pattern recognition procedure: the mean shift. For discrete data, we prove the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and, thus, its utility in detecting the modes of the density. The relation of the mean shift procedure to the Nadaraya-Watson estimator from kernel regression and the robust M-estimators; of location is also established. Algorithms for two low-level vision tasks discontinuity-preserving smoothing and image segmentation - are described as applications. In these algorithms, the only user-set parameter is the resolution of the analysis, and either gray-level or color images are accepted as input. Extensive experimental results illustrate their excellent performance.

11,014 citations


Christopher M. Bishop
01 Jan 2006
Abstract: Probability Distributions; Linear Models for Regression; Linear Models for Classification; Neural Networks; Kernel Methods; Sparse Kernel Machines; Graphical Models; Mixture Models and EM; Approximate Inference; Sampling Methods; Continuous Latent Variables; Sequential Data; Combining Models.

10,141 citations


Journal ArticleDOI
TL;DR: A new superpixel algorithm is introduced, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels and is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.
Abstract: Computer vision applications have come to rely increasingly on superpixels in recent years, but it is not always clear what constitutes a good superpixel algorithm. In an effort to understand the benefits and drawbacks of existing methods, we empirically compare five state-of-the-art superpixel algorithms for their ability to adhere to image boundaries, speed, memory efficiency, and their impact on segmentation performance. We then introduce a new superpixel algorithm, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels. Despite its simplicity, SLIC adheres to boundaries as well as or better than previous methods. At the same time, it is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.

6,470 citations
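
The k-means adaptation that SLIC describes can be illustrated in simplified form: cluster pixels in a joint 5-D space of color and position, with positions scaled so a compactness parameter trades color fidelity against spatial regularity. The sketch below omits SLIC's localized 2S x 2S search and connectivity enforcement, so it is far slower than the real algorithm; it only shows the clustering idea:

```python
import numpy as np

def slic_like_superpixels(image, n_segments=100, compactness=10.0, n_iter=10):
    """Simplified SLIC-style clustering: k-means on (r, g, b, x, y),
    with spatial coordinates weighted by compactness / grid_step so
    color and space distances are commensurate. Suited to small images
    only, since it assigns every pixel against every center."""
    h, w, _ = image.shape
    step = int(np.sqrt(h * w / n_segments))   # approximate grid spacing S
    ys, xs = np.mgrid[0:h, 0:w]
    feats = np.column_stack([
        image.reshape(-1, 3).astype(float),
        (compactness / step) * xs.ravel(),
        (compactness / step) * ys.ravel(),
    ])
    # Initialize cluster centers on a regular grid.
    cy, cx = np.mgrid[step // 2:h:step, step // 2:w:step]
    centers = feats[cy.ravel() * w + cx.ravel()]
    for _ in range(n_iter):
        # Assign each pixel to the nearest center in the joint space.
        d = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Recompute each center as the mean of its assigned pixels.
        for k in range(len(centers)):
            members = feats[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    return labels.reshape(h, w)
```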


Proceedings ArticleDOI
07 Jul 2001
Abstract: This paper presents a database containing 'ground truth' segmentations produced by humans for images of a wide variety of natural scenes. We define an error measure which quantifies the consistency between segmentations of differing granularities and find that different human segmentations of the same image are highly consistent. Use of this dataset is demonstrated in two applications: (1) evaluating the performance of segmentation algorithms and (2) measuring probability distributions associated with Gestalt grouping factors as well as statistics of image region properties.

6,077 citations
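
The consistency measure can be sketched along the lines of the local refinement error used with this dataset: at each pixel, compare the segment containing it in one segmentation against the other, so the error is zero wherever one segmentation simply refines the other. A sketch assuming two non-negative integer label maps of equal shape:

```python
import numpy as np

def local_consistency_error(seg1, seg2):
    """Local refinement error at pixel i:
        E(S1, S2, i) = |R(S1, i) \ R(S2, i)| / |R(S1, i)|
    where R(S, i) is the set of pixels sharing i's segment in S.
    LCE averages min(E(S1, S2, i), E(S2, S1, i)) over all pixels,
    so it vanishes when either segmentation refines the other."""
    s1, s2 = seg1.ravel(), seg2.ravel()
    # Size of the segment containing each pixel, in each segmentation.
    size1 = np.bincount(s1)[s1].astype(float)
    size2 = np.bincount(s2)[s2].astype(float)
    # Size of the intersection cell (same segment in BOTH) at each pixel.
    pair = s1 * (s2.max() + 1) + s2
    inter = np.bincount(pair)[pair].astype(float)
    e12 = (size1 - inter) / size1
    e21 = (size2 - inter) / size2
    return np.minimum(e12, e21).mean()
```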


Network Information
Related Authors (5)
Bogdan Georgescu

261 papers, 9.5K citations

97% related
Yefeng Zheng

373 papers, 8.9K citations

95% related
Sasa Grbic

87 papers, 1.5K citations

93% related
Viorel Mihalef

68 papers, 1.3K citations

93% related
Lucian Mihai Itu

97 papers, 1.3K citations

93% related
Performance Metrics

Author's H-index: 74

No. of papers from the Author in previous years
Year    Papers
2021    13
2020    20
2019    11
2018    17
2017    27
2016    25


Author's top 5 most impactful journals

Proceedings of SPIE

22 papers, 452 citations

IEEE Transactions on Medical Imaging

14 papers, 1.7K citations

Medical Image Analysis

12 papers, 505 citations