Journal•ISSN: 1077-3142
Computer Vision and Image Understanding
About: Computer Vision and Image Understanding is an academic journal. The journal publishes majorly in the area(s): Image processing & Image segmentation. It has an ISSN identifier of 1077-3142. Over the lifetime, 2562 publication(s) have been published receiving 162226 citation(s).
Topics: Image processing, Image segmentation, Motion estimation, Segmentation, Feature (computer vision)
Papers published on a yearly basis
Papers
More filters
TL;DR: A novel scale- and rotation-invariant detector and descriptor, coined SURF (Speeded-Up Robust Features), which approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster.
Abstract: This article presents a novel scale- and rotation-invariant detector and descriptor, coined SURF (Speeded-Up Robust Features). SURF approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster. This is achieved by relying on integral images for image convolutions; by building on the strengths of the leading existing detectors and descriptors (specifically, using a Hessian matrix-based measure for the detector, and a distribution-based descriptor); and by simplifying these methods to the essential. This leads to a combination of novel detection, description, and matching steps. The paper encompasses a detailed description of the detector and descriptor and then explores the effects of the most important parameters. We conclude the article with SURF's application to two challenging, yet converse goals: camera calibration as a special case of image registration, and object recognition. Our experiments underline SURF's usefulness in a broad range of topics in computer vision.
11,276 citations
TL;DR: This work describes a method for building models by learning patterns of variability from a training set of correctly annotated images that can be used for image search in an iterative refinement algorithm analogous to that employed by Active Contour Models (Snakes).
Abstract: !, Model-based vision is firmly established as a robust approach to recognizing and locating known rigid objects in the presence of noise, clutter, and occlusion It is more problematic to apply modelbased methods to images of objects whose appearance can vary, though a number of approaches based on the use of flexible templates have been proposed The problem with existing methods is that they sacrifice model specificity in order to accommodate variability, thereby compromising robustness during image interpretation We argue that a model should only be able to deform in ways characteristic of the class of objects it represents We describe a method for building models by learning patterns of variability from a training set of correctly annotated images These models can be used for image search in an iterative refinement algorithm analogous to that employed by Active Contour Models (Snakes) The key difference is that our Active Shape Models can only deform to fit the data in ways consistent with the training set We show several practical examples where we have built such models and used them to locate partially occluded objects in noisy, cluttered images Q 199s A&& prrss, IN
7,675 citations
TL;DR: This survey reviews recent trends in video-based human capture and analysis, as well as discussing open problems for future research to achieve automatic visual analysis of human movement.
Abstract: This survey reviews advances in human motion capture and analysis from 2000 to 2006, following a previous survey of papers up to 2000 [T.B. Moeslund, E. Granum, A survey of computer vision-based human motion capture, Computer Vision and Image Understanding, 81(3) (2001) 231-268.]. Human motion capture continues to be an increasingly active research area in computer vision with over 350 publications over this period. A number of significant research advances are identified together with novel methodologies for automatic initialization, tracking, pose estimation, and movement recognition. Recent research has addressed reliable tracking and pose estimation in natural scenes. Progress has also been made towards automatic understanding of human actions and behavior. This survey reviews recent trends in video-based human capture and analysis, as well as discussing open problems for future research to achieve automatic visual analysis of human movement.
2,631 citations
TL;DR: The incremental algorithm is compared experimentally to an earlier batch Bayesian algorithm, as well as to one based on maximum-likelihood, which have comparable classification performance on small training sets, but incremental learning is significantly faster, making real-time learning feasible.
Abstract: Current computational approaches to learning visual object categories require thousands of training images, are slow, cannot learn in an incremental manner and cannot incorporate prior information into the learning process. In addition, no algorithm presented in the literature has been tested on more than a handful of object categories. We present an method for learning object categories from just a few training images. It is quick and it uses prior information in a principled way. We test it on a dataset composed of images of objects belonging to 101 widely varied categories. Our proposed method is based on making use of prior information, assembled from (unrelated) object categories which were previously learnt. A generative probabilistic model is used, which represents the shape and appearance of a constellation of features belonging to the object. The parameters of the model are learnt incrementally in a Bayesian manner. Our incremental algorithm is compared experimentally to an earlier batch Bayesian algorithm, as well as to one based on maximum likelihood. The incremental and batch versions have comparable classification performance on small training sets, but incremental learning is significantly faster, making real-time learning feasible. Both Bayesian methods outperform maximum likelihood on small training sets.
2,529 citations
TL;DR: A new robust estimator MLESAC is presented which is a generalization of the RANSAC estimator which adopts the same sampling strategy as RANSac to generate putative solutions, but chooses the solution that maximizes the likelihood rather than just the number of inliers.
Abstract: A new method is presented for robustly estimating multiple view relations from point correspondences. The method comprises two parts. The first is a new robust estimator MLESAC which is a generalization of the RANSAC estimator. It adopts the same sampling strategy as RANSAC to generate putative solutions, but chooses the solution that maximizes the likelihood rather than just the number of inliers. The second part of the algorithm is a general purpose method for automatically parameterizing these relations, using the output of MLESAC. A difficulty with multiview image relations is that there are often nonlinear constraints between the parameters, making optimization a difficult task. The parameterization method overcomes the difficulty of nonlinear constraints and conducts a constrained optimization. The method is general and its use is illustrated for the estimation of fundamental matrices, image–image homographies, and quadratic transformations. Results are given for both synthetic and real images. It is demonstrated that the method gives results equal or superior to those of previous approaches.
2,021 citations