Showing papers in "arXiv: Computer Vision and Pattern Recognition in 2011"

Posted Content•

Moving Object Detection by Detecting Contiguous Outliers in the Low-Rank Representation

Xiaowei Zhou¹, Can Yang¹, Weichuan Yu¹•Institutions (1)

Hong Kong University of Science and Technology¹

05 Sep 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper presents a unified framework named DEtecting Contiguous Outliers in the LOw-rank Representation (DECOLOR), which integrates object detection and background learning into a single process of optimization, which can be solved by an alternating algorithm efficiently.

Abstract: Object detection is a fundamental step for automated video analysis in many vision applications. Object detection in a video is usually performed by object detectors or background subtraction techniques. Often, an object detector requires manually labeled examples to train a binary classifier, while background subtraction needs a training sequence that contains no objects to build a background model. To automate the analysis, object detection without a separate training phase becomes a critical task. People have tried to tackle this task by using motion information. But existing motion-based methods are usually limited when coping with complex scenarios such as nonrigid motion and dynamic background. In this paper, we show that the above challenges can be addressed in a unified framework named DEtecting Contiguous Outliers in the LOw-rank Representation (DECOLOR). This formulation integrates object detection and background learning into a single process of optimization, which can be solved efficiently by an alternating algorithm. We explain the relations between DECOLOR and other sparsity-based methods. Experiments on both simulated data and real sequences demonstrate that DECOLOR outperforms the state-of-the-art approaches and that it can work effectively on a wide range of complex scenarios.

509 citations

Posted Content•

3D Terrestrial lidar data classification of complex natural scenes using a multi-scale dimensionality criterion: applications in geomorphology

Nicolas Brodu, Dimitri Lague

04 Jul 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, a multi-scale measure of the point cloud dimensionality around each point is defined, which characterizes the local 3D organization, and a probabilistic confidence is given at each point, allowing the user to remove the points for which the classification is uncertain.

Abstract: 3D point clouds of natural environments relevant to problems in geomorphology often require classification of the data into elementary relevant classes. A typical example is the separation of riparian vegetation from ground in fluvial environments, the distinction between fresh surfaces and rockfall in cliff environments, or more generally the classification of surfaces according to their morphology. Natural surfaces are heterogeneous and their distinctive properties are seldom defined at a unique scale, prompting the use of multi-scale criteria to achieve a high degree of classification success. We have thus defined a multi-scale measure of the point cloud dimensionality around each point, which characterizes the local 3D organization. We can thus monitor how the local cloud geometry behaves across scales. We present the technique and illustrate its efficiency in separating riparian vegetation from ground and classifying a mountain stream as vegetation, rock, gravel or water surface. In these two cases, separating the vegetation from ground or other classes achieves an accuracy above 98%. Comparison with a single scale approach shows the superiority of the multi-scale analysis in enhancing class separability and spatial resolution. The technique is robust to missing data, shadow zones and changes in point density within the scene. The classification is fast and accurate and can account for some degree of intra-class morphological variability such as different vegetation types. A probabilistic confidence in the classification result is given at each point, allowing the user to remove the points for which the classification is uncertain. The process can be either fully automated or fully customized by the user, including a graphical definition of the classifiers. Although developed for fully 3D data, the method can be readily applied to 2.5D airborne lidar data.
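As an illustration of what a multi-scale dimensionality measure around a point could look like, here is a minimal sketch. It assumes (the abstract does not say so) that dimensionality at each scale is summarized by the proportions of the PCA eigenvalues of the neighbors inside a sphere; the function name `dimensionality_signature` and the synthetic clouds are hypothetical.

```python
import numpy as np

def dimensionality_signature(points, center, radii):
    """For each radius, return the PCA eigenvalue proportions (largest first)
    of the neighbors of `center`: ~(1, 0, 0) looks 1D, ~(0.5, 0.5, 0) looks 2D."""
    signature = []
    for radius in radii:
        distances = np.linalg.norm(points - center, axis=1)
        neighbors = points[distances <= radius]
        if len(neighbors) < 3:
            raise ValueError(f"too few neighbors at radius {radius}")
        # Eigenvalues of the 3x3 covariance matrix, sorted in descending order.
        eigenvalues = np.linalg.eigvalsh(np.cov(neighbors.T))[::-1]
        eigenvalues = np.clip(eigenvalues, 0.0, None)  # remove tiny negatives
        signature.append(eigenvalues / eigenvalues.sum())
    return np.array(signature)

# Synthetic clouds (assumptions for illustration): a straight line and a flat grid.
line = np.array([[t, 0.0, 0.0] for t in np.arange(-5.0, 5.5, 0.5)])
grid = np.array([[x, y, 0.0] for x in range(-5, 6) for y in range(-5, 6)], dtype=float)
origin = np.zeros(3)
radii = [2.0, 4.0]

line_signature = dimensionality_signature(line, origin, radii)
plane_signature = dimensionality_signature(grid, origin, radii)
print(line_signature)
print(plane_signature)
```

Stacking the per-scale proportions gives one feature vector per point, which a classifier could then use to separate classes whose geometry differs across scales.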

386 citations

Posted Content•

Re-initialization Free Level Set Evolution via Reaction Diffusion

Kaihua Zhang, Lei Zhang, Huihui Song, David Zhang

07 Dec 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: By successfully applying diffusion to LSE, the RD-LSE model is stable by means of the simple finite difference method, which is very easy to implement.

Abstract: This paper presents a novel reaction-diffusion (RD) method for implicit active contours, which is completely free of the costly re-initialization procedure in level set evolution (LSE). A diffusion term is introduced into LSE, resulting in an RD-LSE equation, to which a piecewise constant solution can be derived. In order to have a stable numerical solution of the RD-based LSE, we propose a two-step splitting method (TSSM) to iteratively solve the RD-LSE equation: first iterating the LSE equation, and then solving the diffusion equation. The second step regularizes the level set function obtained in the first step to ensure stability, and thus the complex and costly re-initialization procedure is completely eliminated from LSE. By successfully applying diffusion to LSE, the RD-LSE model is stable by means of the simple finite difference method, which is very easy to implement. The proposed RD method can be generalized to solve the LSE for both the variational level set method and the PDE-based level set method. The RD-LSE method shows very good performance on boundary anti-leakage, and it can be readily extended to high dimensional level set methods. The extensive and promising experimental results on synthetic and real images validate the effectiveness of the proposed RD-LSE approach.

174 citations

Posted Content•

Introduction to the bag of features paradigm for image classification and retrieval

Stephen O'Hara, Bruce A. Draper

17 Jan 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper presents an introduction to BoF image representations, describes critical design choices, and surveys the BoF literature, placing emphasis on recent techniques that mitigate quantization errors, improve feature detection, and speed up image retrieval.

Abstract: The past decade has seen the growing popularity of Bag of Features (BoF) approaches to many computer vision tasks, including image classification, video search, robot localization, and texture recognition. Part of the appeal is simplicity. BoF methods are based on orderless collections of quantized local image descriptors; they discard spatial information and are therefore conceptually and computationally simpler than many alternative methods. Despite this, or perhaps because of this, BoF-based systems have set new performance standards on popular image classification benchmarks and have achieved scalability breakthroughs in image retrieval. This paper presents an introduction to BoF image representations, describes critical design choices, and surveys the BoF literature. Emphasis is placed on recent techniques that mitigate quantization errors, improve feature detection, and speed up image retrieval. At the same time, unresolved issues and fundamental challenges are raised. Among the unresolved issues are determining the best techniques for sampling images, describing local image features, and evaluating system performance. Among the more fundamental challenges are how and whether BoF methods can contribute to localizing objects in complex images, or to associating high-level semantics with natural images. This survey should be useful both for introducing new investigators to the field and for providing existing researchers with a consolidated reference to related work.
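The phrase "orderless collections of quantized local image descriptors" can be made concrete with a small sketch. The codebook and descriptors below are made-up 2D toy values (real descriptors are higher-dimensional and the codebook is usually learned by clustering), and the function names are hypothetical.

```python
from collections import Counter

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantize(descriptor, codebook):
    """Return the index of the codeword nearest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))

def bof_histogram(descriptors, codebook):
    """Normalized histogram of codeword assignments; descriptor order is ignored."""
    if not descriptors:
        raise ValueError("at least one descriptor is required")
    counts = Counter(quantize(d, codebook) for d in descriptors)
    return [counts[k] / len(descriptors) for k in range(len(codebook))]

# Toy data (assumed for illustration only).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.2), (0.9, 1.2), (1.1, 0.8), (4.7, 5.3)]

histogram = bof_histogram(descriptors, codebook)
print(histogram)  # [0.25, 0.5, 0.25]
```

Because only the counts survive, shuffling the descriptors leaves the histogram unchanged, which is exactly the loss of spatial information the abstract refers to.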

169 citations

Journal Article•DOI•

Curved Gabor Filters for Fingerprint Image Enhancement

Carsten Gottschlich¹•Institutions (1)

University of Göttingen¹

21 Apr 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, curved Gabor filters, which locally adapt their shape to the direction of flow, are applied to the curved ridge and valley structure of low-quality fingerprint images in order to enhance curved structures in noisy images.

Abstract: Gabor filters play an important role in many application areas for the enhancement of various types of images and the extraction of Gabor features. For the purpose of enhancing curved structures in noisy images, we introduce curved Gabor filters which locally adapt their shape to the direction of flow. These curved Gabor filters enable the choice of filter parameters which increase the smoothing power without creating artifacts in the enhanced image. In this paper, curved Gabor filters are applied to the curved ridge and valley structure of low-quality fingerprint images. First, we combine two orientation field estimation methods in order to obtain a more robust estimation for very noisy images. Next, curved regions are constructed by following the respective local orientation and they are used for estimating the local ridge frequency. Lastly, curved Gabor filters are defined based on curved regions and they are applied for the enhancement of low-quality fingerprint images. Experimental results on the FVC2004 databases show improvements of this approach in comparison to state-of-the-art enhancement methods.

116 citations

Posted Content•

Minutiae Extraction from Fingerprint Images - a Review

Roli Bansal, Priti Sehgal, Punam Bedi

29 Nov 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper presents a review of a large number of techniques present in the literature for extracting fingerprint minutiae, broadly classified as those working on binarized images and those that work on gray scale images directly.

Abstract: Fingerprints are the oldest and most widely used form of biometric identification. Everyone is known to have unique, immutable fingerprints. As most Automatic Fingerprint Recognition Systems are based on local ridge features known as minutiae, marking minutiae accurately and rejecting false ones is very important. However, fingerprint images get degraded and corrupted due to variations in skin and impression conditions. Thus, image enhancement techniques are employed prior to minutiae extraction. A critical step in automatic fingerprint matching is to reliably extract minutiae from the input fingerprint images. This paper presents a review of a large number of techniques present in the literature for extracting fingerprint minutiae. The techniques are broadly classified as those working on binarized images and those that work on gray scale images directly.

102 citations

Posted Content•

Towards Holistic Scene Understanding: Feedback Enabled Cascaded Classification Models

Congcong Li¹, Adarsh Kowdle¹, Ashutosh Saxena¹, Tsuhan Chen¹•Institutions (1)

Cornell University¹

24 Oct 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, a feedback enabled cascaded classification model (FE-CCM) is proposed to jointly optimize all the sub-tasks, while requiring only a black-box interface to the original classifier for each sub-task.

Abstract: Scene understanding includes many related sub-tasks, such as scene categorization, depth estimation, object detection, etc. Each of these sub-tasks is often notoriously hard, and state-of-the-art classifiers already exist for many of them. These classifiers operate on the same raw image and provide correlated outputs. It is desirable to have an algorithm that can capture such correlation without requiring any changes to the inner workings of any classifier. We propose Feedback Enabled Cascaded Classification Models (FE-CCM), which jointly optimize all the sub-tasks while requiring only a 'black-box' interface to the original classifier for each sub-task. We use a two-layer cascade of classifiers, which are repeated instantiations of the original ones, with the output of the first layer fed into the second layer as input. Our training method involves a feedback step that allows later classifiers to provide earlier classifiers information about which error modes to focus on. We show that our method significantly improves performance in all the sub-tasks in the domain of scene understanding, where we consider depth estimation, scene categorization, event categorization, object detection, geometric labeling and saliency detection. Our method also improves performance in two robotic applications: an object-grasping robot and an object-finding robot.

94 citations

Journal Article•DOI•

Total variation regularization for fMRI-based prediction of behaviour

Vincent Michel¹, Alexandre Gramfort¹, Gaël Varoquaux, Evelyn Eger, Bertrand Thirion¹•Institutions (1)

French Institute for Research in Computer Science and Automation¹

05 Feb 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: This article applies total variation (TV) regularization to fMRI data for the first time, and shows that TV regularization is well suited to the purpose of brain mapping while being a powerful tool for brain decoding.

Abstract: While medical imaging typically provides massive amounts of data, the extraction of relevant information for predictive diagnosis remains a difficult challenge. Functional MRI (fMRI) data, which provide an indirect measure of task-related or spontaneous neuronal activity, are classically analyzed in a mass-univariate procedure yielding statistical parametric maps. This analysis framework disregards some important principles of brain organization: population coding, distributed and overlapping representations. Multivariate pattern analysis, i.e., the prediction of behavioural variables from brain activation patterns, better captures this structure. To cope with the high dimensionality of the data, the learning method has to be regularized. However, the spatial structure of the image is not taken into account in standard regularization methods, so that the extracted features are often hard to interpret. More informative and interpretable results can be obtained with the l_1 norm of the image gradient, a.k.a. its Total Variation (TV), as regularization. We apply this method to fMRI data for the first time, and show that TV regularization is well suited to the purpose of brain mapping while being a powerful tool for brain decoding. Moreover, this article presents the first use of TV regularization for classification.
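To make "the l_1 norm of the image gradient" concrete, the sketch below computes a total variation value for a small 2D array. It assumes the anisotropic form (sum of absolute forward differences along each axis) on a 2D grid; the abstract does not specify which discretization is used, and real fMRI weight maps are 3D. The example maps are invented.

```python
def total_variation(image):
    """Anisotropic total variation: sum of absolute differences between
    vertically and horizontally adjacent entries of a 2D list of numbers."""
    rows, cols = len(image), len(image[0])
    tv = 0.0
    for i in range(rows):
        for j in range(cols):
            if i + 1 < rows:
                tv += abs(image[i + 1][j] - image[i][j])  # vertical difference
            if j + 1 < cols:
                tv += abs(image[i][j + 1] - image[i][j])  # horizontal difference
    return tv

# Two maps with the same number of non-zero entries (eight each).
blocky = [[0, 0, 1, 1],
          [0, 0, 1, 1],
          [0, 0, 1, 1],
          [0, 0, 1, 1]]
scattered = [[0, 1, 0, 1],
             [1, 0, 1, 0],
             [0, 1, 0, 1],
             [1, 0, 1, 0]]

tv_blocky = total_variation(blocky)
tv_scattered = total_variation(scattered)
print(tv_blocky, tv_scattered)  # 4.0 24.0
```

Both maps have the same plain l_1 norm, but the spatially coherent one has a much lower TV, which is why a TV penalty favors contiguous, more interpretable regions.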

90 citations

Posted Content•

An Axis-Based Representation for Recognition

Cagri Aslan, Sibel Tari

14 Apr 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper presents a new axis-based shape representation scheme along with a matching framework for generic shape recognition; the representation captures the perceptual qualities of shapes well, which makes it easier to find the similarities and the differences among shapes.

Abstract: This paper presents a new axis-based shape representation scheme along with a matching framework to address the problem of generic shape recognition. The main idea is to define the relative spatial arrangement of local symmetry axes and their metric properties in a shape centered coordinate frame. The resulting descriptions are invariant to scale, rotation, small changes in viewpoint and articulations. Symmetry points are extracted from a surface whose level curves roughly mimic the motion by curvature. By increasing the amount of smoothing on the evolving curve, only those symmetry axes that correspond to the most prominent parts of a shape are extracted. The representation does not suffer from the common instability problems of the traditional connected skeletons. It captures the perceptual qualities of shapes well. Therefore finding the similarities and the differences among shapes becomes easier. The matching process gives highly successful results on a diverse database of 2D shapes.

89 citations

Posted Content•

Positive Semidefinite Metric Learning Using Boosting-like Algorithms

Chunhua Shen¹, Junae Kim², Lei Wang³, Anton van den Hengel¹•Institutions (3)

University of Adelaide¹, NICTA², University of Wollongong³

25 Apr 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: BoostMetric uses rank-one positive semidefinite matrices as weak learners within an efficient and scalable boosting-based learning process to learn a valid Mahalanobis distance metric.

Abstract: The success of many machine learning and pattern recognition methods relies heavily upon the identification of an appropriate distance metric on the input data. It is often beneficial to learn such a metric from the input training data, instead of using a default one such as the Euclidean distance. In this work, we propose a boosting-based technique, termed BoostMetric, for learning a quadratic Mahalanobis distance metric. Learning a valid Mahalanobis distance metric requires enforcing the constraint that the matrix parameter to the metric remains positive definite. Semidefinite programming is often used to enforce this constraint, but does not scale well and is not easy to implement. BoostMetric is instead based on the observation that any positive semidefinite matrix can be decomposed into a linear combination of trace-one rank-one matrices. BoostMetric thus uses rank-one positive semidefinite matrices as weak learners within an efficient and scalable boosting-based learning process. The resulting methods are easy to implement, efficient, and can accommodate various types of constraints. We extend traditional boosting algorithms in that the weak learner is a positive semidefinite matrix with trace and rank being one rather than a classifier or regressor. Experiments on various datasets demonstrate that the proposed algorithms compare favorably to state-of-the-art methods in terms of classification accuracy and running time.
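The observation that a positive semidefinite matrix can be written as a non-negative combination of trace-one rank-one matrices can be checked numerically. The sketch below uses an eigendecomposition to exhibit one such decomposition; this is only a demonstration that the decomposition exists, not the boosting procedure by which BoostMetric selects its terms. The function name and the example matrix are hypothetical.

```python
import numpy as np

def trace_one_rank_one_terms(M, tol=1e-10):
    """Decompose a symmetric PSD matrix as sum_i w_i * Z_i, where each
    Z_i = u_i u_i^T has trace one and rank one and each w_i > 0."""
    eigenvalues, eigenvectors = np.linalg.eigh(M)
    terms = []
    for weight, u in zip(eigenvalues, eigenvectors.T):
        if weight > tol:  # skip directions with (numerically) zero weight
            terms.append((weight, np.outer(u, u)))
    return terms

# A PSD matrix built as A A^T (illustrative values).
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 3.0]])
M = A @ A.T

terms = trace_one_rank_one_terms(M)
reconstructed = sum(weight * Z for weight, Z in terms)
print(np.allclose(reconstructed, M))
print([round(float(np.trace(Z)), 6) for _, Z in terms])
```

Since the eigenvectors returned by `np.linalg.eigh` have unit norm, each outer product has trace one, and the weights are the eigenvalues of the matrix.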

73 citations

Posted Content•

A Review of Research on Devnagari Character Recognition

Vikas J. Dongre, Vijay H. Mankar

13 Jan 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: This article presents an overview of DOCR systems, reviews the available DOCR techniques, discusses the current status of DOCR, and suggests directions for future research.

Abstract: English Character Recognition (CR) has been extensively studied in the last half century and has progressed to a level sufficient to produce technology-driven applications. But the same is not the case for Indian languages, which are complicated in terms of structure and computations. Rapidly growing computational power may enable the implementation of Indic CR methodologies. Digital document processing is gaining popularity for application to office and library automation, bank and postal services, publishing houses and communication technology. Devnagari being the national language of India, spoken by more than 500 million people, should be given special attention so that document retrieval and analysis of rich ancient and modern Indian literature can be effectively done. This article is intended to serve as a guide and update for readers working in the Devnagari Optical Character Recognition (DOCR) area. An overview of DOCR systems is presented and the available DOCR techniques are reviewed. The current status of DOCR is discussed and directions for future research are suggested.

Posted Content•

Steps Towards a Theory of Visual Information: Active Perception, Signal-to-Symbol Conversion and the Interplay Between Sensing and Control

Stefano Soatto

10 Oct 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: It is shown that the "actionable information gap" between the two can be reduced by exercising control on the sensing process, and therefore sensing, control and information are inextricably tied.

Abstract: This manuscript describes the elements of a theory of information tailored to control and decision tasks and specifically to visual data. The concept of Actionable Information is described, that relates to a notion of information championed by J. Gibson, and a notion of "complete information" that relates to the minimal sufficient statistics of a complete representation. It is shown that the "actionable information gap" between the two can be reduced by exercising control on the sensing process. Thus, sensing, control and information are inextricably tied. This has consequences in the so-called "signal-to-symbol barrier" problem, as well as in the analysis and design of active sensing systems. It has ramifications in vision-based control, navigation, 3-D reconstruction and rendering, as well as detection, localization, recognition and categorization of objects and scenes in live video. This manuscript has been developed from a set of lecture notes for a summer course at the First International Computer Vision Summer School (ICVSS) in Scicli, Italy, in July of 2008. They were later expanded and amended for subsequent lectures in the same School in July 2009. Starting on November 1, 2009, they were further expanded for a special topics course, CS269, taught at UCLA in the Spring term of 2010.

Posted Content•

Design of an Optical Character Recognition System for Camera-based Handheld Devices

Ayatullah Faruk Mollah, Nabamita Majumder, Subhadip Basu, Mita Nasipuri

15 Sep 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: A complete Optical Character Recognition system for camera captured image/graphics embedded textual documents for handheld devices that is computationally efficient and consumes low memory so as to be applicable on handheld devices.

Abstract: This paper presents a complete Optical Character Recognition (OCR) system for camera captured image/graphics embedded textual documents for handheld devices. At first, text regions are extracted and skew corrected. Then, these regions are binarized and segmented into lines and characters. Characters are passed into the recognition module. Experimenting with a set of 100 business card images, captured by cell phone camera, we have achieved a maximum recognition accuracy of 92.74%. Compared to Tesseract, an open source desktop-based powerful OCR engine, present recognition accuracy is worth contributing. Moreover, the developed technique is computationally efficient and consumes low memory so as to be applicable on handheld devices.

Posted Content•

Large Scale Correlation Clustering Optimization

Shai Bagon, Meirav Galun

13 Dec 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: A theoretic analysis provides a probabilistic generative interpretation for the Correlation Clustering functional, and justifies its intrinsic "model-selection" capability, and suggests several new optimization algorithms which can cope with large scale problems (>100K variables) that are infeasible using existing methods.

Abstract: Clustering is a fundamental task in unsupervised learning. The focus of this paper is the Correlation Clustering functional, which combines positive and negative affinities between the data points. The contribution of this paper is twofold: (i) a theoretic analysis of the functional, and (ii) new optimization algorithms which can cope with large scale problems (>100K variables) that are infeasible using existing methods. Our theoretic analysis provides a probabilistic generative interpretation for the functional, and justifies its intrinsic "model-selection" capability. Furthermore, we draw an analogy between optimizing this functional and the well known Potts energy minimization. This analogy allows us to suggest several new optimization algorithms, which exploit the intrinsic "model-selection" capability of the functional to automatically recover the underlying number of clusters. We compare our algorithms to existing methods on both synthetic and real data. In addition, we suggest two new applications that are made possible by our algorithms: unsupervised face identification and interactive multi-object segmentation by rough boundary delineation.

Posted Content•

A linear framework for region-based image segmentation and inpainting involving curvature penalization

Thomas Schoenemann¹, Fredrik Kahl¹, Simon Masnou², Daniel Cremers³•Institutions (3)

Lund University¹, University of Lyon², Technische Universität München³

18 Feb 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, the authors present the first method to handle curvature regularity in region-based image segmentation and inpainting that is independent of initialization, which is based on a cell complex and considers basic regions and boundary elements.

Abstract: We present the first method to handle curvature regularity in region-based image segmentation and inpainting that is independent of initialization. To this end we start from a new formulation of length-based optimization schemes, based on surface continuation constraints, and discuss the connections to existing schemes. The formulation is based on a cell complex and considers basic regions and boundary elements. The corresponding optimization problem is cast as an integer linear program. We then show how the method can be extended to include curvature regularity, again cast as an integer linear program. Here, we are considering pairs of boundary elements to reflect curvature. Moreover, a constraint set is derived to ensure that the boundary variables indeed reflect the boundary of the regions described by the region variables. We show that by solving the linear programming relaxation one gets quite close to the global optimum, and that curvature regularity is indeed much better suited in the presence of long and thin objects compared to standard length regularity.

Posted Content•

Leveraging Billions of Faces to Overcome Performance Barriers in Unconstrained Face Recognition

Yaniv Taigman, Lior Wolf

04 Aug 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: The face recognition technology developed in house at face.com is applied to a well-accepted benchmark, and it is shown that without any tuning the system is able to considerably surpass state-of-the-art results.

Abstract: We apply the face recognition technology developed in house at face.com to a well-accepted benchmark and show that without any tuning we are able to considerably surpass state-of-the-art results. Much of the improvement is concentrated in the high-valued performance point of zero false positive matches, where the obtained recall rate almost doubles the best reported result to date. We discuss the various components and innovations of our system that enable this significant performance gap. These components include extensive utilization of an accurate 3D reconstructed shape model dealing with challenges arising from pose and illumination. In addition, discriminative models based on billions of faces are used in order to overcome aging and facial expression as well as low light and overexposure. Finally, we identify a challenging set of identification queries that might provide useful focus for future research.

Posted Content•

Analysis and Improvement of Low Rank Representation for Subspace segmentation

Siming Wei, Zhouchen Lin¹•Institutions (1)

Microsoft¹

08 Jul 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: It is shown that, for the noisy case, LRR can be approximated as a factorization method combined with noise removal by column sparse robust PCA; an improved version of LRR, called Robust Shape Interaction (RSI), is proposed, which uses the corrected data as the dictionary instead of the noisy data.

Abstract: We analyze and improve low rank representation (LRR), the state-of-the-art algorithm for subspace segmentation of data. We prove that for the noiseless case, the optimization model of LRR has a unique solution, which is the shape interaction matrix (SIM) of the data matrix. So in essence LRR is equivalent to factorization methods. We also prove that the minimum value of the optimization model of LRR is equal to the rank of the data matrix. For the noisy case, we show that LRR can be approximated as a factorization method that combines noise removal by column sparse robust PCA. We further propose an improved version of LRR, called Robust Shape Interaction (RSI), which uses the corrected data as the dictionary instead of the noisy data. RSI is more robust than LRR when the corruption in data is heavy. Experiments on both synthetic and real data testify to the improved robustness of RSI.
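The noiseless-case statements can be illustrated numerically. The sketch below assumes the standard definition of the shape interaction matrix as V_r V_r^T, where V_r holds the right singular vectors of the rank-r data matrix X (columns are data points). It checks that this matrix satisfies X = XZ and has nuclear norm equal to rank(X); it does not prove optimality or uniqueness. The synthetic data and names are hypothetical.

```python
import numpy as np

def shape_interaction_matrix(X, tol=1e-10):
    """Return (V_r V_r^T, r) computed from the skinny SVD of X."""
    _, singular_values, Vt = np.linalg.svd(X, full_matrices=False)
    rank = int(np.sum(singular_values > tol * singular_values[0]))
    Vr = Vt[:rank].T  # right singular vectors spanning the row space of X
    return Vr @ Vr.T, rank

# Synthetic data: 4 points from each of two random 2D subspaces of R^5.
rng = np.random.default_rng(0)
basis_a = rng.standard_normal((5, 2))
basis_b = rng.standard_normal((5, 2))
X = np.hstack([basis_a @ rng.standard_normal((2, 4)),
               basis_b @ rng.standard_normal((2, 4))])

Z, rank = shape_interaction_matrix(X)
print(rank)                                  # numerical rank of X
print(np.allclose(X @ Z, X))                 # Z is feasible: X = XZ
print(np.linalg.norm(Z, "nuc"))              # nuclear norm, close to the rank
print(np.abs(Z[:4, 4:]).max())               # cross-subspace entries, close to 0
```

For independent subspaces the matrix is block diagonal up to numerical error, which is what makes it usable as an affinity for subspace segmentation.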

Journal Article•DOI•

Estimating 3D Human Shapes from Measurements

Stefanie Wuhrer, Chang Shu¹•Institutions (1)

National Research Council¹

06 Sep 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper introduces a technique that extrapolates the statistically inferred shape to fit the measurement data using non-linear optimization and ensures that the generated shape is both human-like and satisfies the measurement conditions.

Abstract: The recent advances in 3-D imaging technologies give rise to databases of human shapes, from which statistical shape models can be built. These statistical models represent prior knowledge of the human shape and enable us to solve shape reconstruction problems from partial information. Generating human shape from traditional anthropometric measurements is such a problem, since these 1-D measurements encode 3-D shape information. Combined with a statistical shape model, these easy-to-obtain measurements can be leveraged to create 3D human shapes. However, existing methods limit the creation of the shapes to the space spanned by the database and thus require a large amount of training data. In this paper, we introduce a technique that extrapolates the statistically inferred shape to fit the measurement data using nonlinear optimization. This method ensures that the generated shape is both human-like and satisfies the measurement conditions. We demonstrate the effectiveness of the method and compare it to existing approaches through extensive experiments, using both synthetic data and real human measurements.

Posted Content•

Aorta Segmentation for Stent Simulation

Jan Egger, Bernd Freisleben, Randolph M. Setser, Rahul Renapuraar, Christina Biermann, Thomas F. O'Donnell¹•Institutions (1)

Siemens¹

09 Mar 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this paper, the authors presented a framework for facilitating virtual aortic stenting from a contrast computer tomography (CT) scan, which may be employed in determining both the appropriateness of intervention as well as the selection and localization of the device.

Abstract: Simulation of arterial stenting procedures prior to intervention allows for appropriate device selection as well as highlights potential complications. To this end, we present a framework for facilitating virtual aortic stenting from a contrast computer tomography (CT) scan. More specifically, we present a method for both lumen and outer wall segmentation that may be employed in determining both the appropriateness of intervention as well as the selection and localization of the device. The more challenging recovery of the outer wall is based on a novel minimal closure tracking algorithm. Our aortic segmentation method has been validated on over 3000 multiplanar reformatting (MPR) planes from 50 CT angiography data sets yielding a Dice Similarity Coefficient (DSC) of 90.67%.
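For readers unfamiliar with the reported metric, the Dice Similarity Coefficient between two segmentations A and B is 2|A ∩ B| / (|A| + |B|). A minimal sketch follows, with segmentations represented as sets of pixel coordinates; the toy masks are invented and unrelated to the paper's data.

```python
def dice_coefficient(segmentation_a, segmentation_b):
    """Dice Similarity Coefficient between two collections of pixel coordinates."""
    a, b = set(segmentation_a), set(segmentation_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))

# Toy masks: two 2x2 squares overlapping in one column.
predicted = {(0, 0), (0, 1), (1, 0), (1, 1)}
reference = {(0, 1), (0, 2), (1, 1), (1, 2)}

dsc = dice_coefficient(predicted, reference)
print(dsc)  # 0.5
```

A value of 1.0 means the two masks coincide and 0.0 means they do not overlap at all.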

Journal Article•DOI•

A Comparative Experiment of Several Shape Methods in Recognizing Plants

Abdul Kadir, Lukito Edi Nugroho, Adhi Susanto, Paulus Insap Santosa

07 Oct 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this research, a comparative experiment of four methods to identify plants using shape features was carried out; Polar Fourier Transform gave the best performance, with 64% accuracy, and outperformed the other methods.

Abstract: Shape is an important aspect in recognizing plants. Several approaches have been introduced to identify objects, including plants. Combinations of geometric features such as aspect ratio, compactness, and dispersion, or moments such as moment invariants, were usually used to identify plants. In this research, a comparative experiment of four methods to identify plants using shape features was carried out. Two approaches that have not previously been used in plant identification, Zernike moments and Polar Fourier Transform (PFT), were included. The experimental comparison was done on 52 kinds of plants with various shapes. As a result, PFT gave the best performance, with 64% accuracy, and outperformed the other methods.

Posted Content•

Application of Freeman Chain Codes: An Alternative Recognition Technique for Malaysian Car Plates

Nor Amizam Jusoh, Jasni Mohamad Zain

08 Jan 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper is mainly focused on conducting an experiment using the chain code technique to perform recognition for different types of fonts used in Malaysian car plates.

Abstract: Various applications of car plate recognition systems have been developed using various kinds of methods and techniques by researchers all over the world. The applications developed were only suitable for a specific country, because each follows the standard specification endorsed by the transport department of that country. The Road Transport Department of Malaysia has also endorsed a specification for car plates that includes the font and size of characters that must be followed by car owners. However, there are cases where this specification is not followed. Several applications have been developed in Malaysia to overcome this problem. However, there is still a problem in achieving 100% recognition accuracy. This paper is mainly focused on conducting an experiment using the chain code technique to perform recognition for different types of fonts used in Malaysian car plates.
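
As a rough illustration of the technique named in the title, a Freeman chain code describes a boundary as a sequence of direction codes between consecutive boundary pixels. The sketch below assumes the common 8-direction convention (0 = east, numbered counter-clockwise) and points given as (x, y) with y increasing upward. The abstract does not say which variant the authors use, and the function name `freeman_chain_code` is hypothetical.

```python
# Assumed 8-direction convention: code -> (dx, dy), with y increasing upward.
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}

def freeman_chain_code(points):
    """Return the 8-direction chain code for an ordered list of boundary points.

    Consecutive points must be 8-connected neighbours; otherwise a
    ValueError is raised.
    """
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"points {(x0, y0)} and {(x1, y1)} are not neighbours")
        codes.append(DIRECTIONS[step])
    return codes

# Closed unit square traversed counter-clockwise from the origin.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
square_code = freeman_chain_code(square)
print(square_code)  # [0, 2, 4, 6]
```

Because the code stores only relative moves, it does not depend on where the character sits in the image, which is one reason chain codes are used as shape features for character recognition.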

Posted Content•

POCS Based Super-Resolution Image Reconstruction Using an Adaptive Regularization Parameter

Sudam Sekhar Panda, M. S. R. S. Prasad, Gunamani Jena

07 Dec 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: An adaptive regularization approach based on the fact that the regularization parameter should be a linear function of noise variance is proposed and the obtained results demonstrate the superiority of the approach compared with existing methods.

Abstract: Crucial information barely visible to the human eye is often embedded in a series of low-resolution images taken of the same scene. Super-resolution enables the extraction of this information by reconstructing a single image at a higher resolution than is present in any of the individual images. This is particularly useful in forensic imaging, where the extraction of minute details in an image can help to solve a crime. Super-resolution image restoration has been one of the most important research areas in recent years; its goal is to obtain a high-resolution (HR) image from several low-resolution (LR) images that are blurred, noisy, undersampled, and displaced. The relation between the HR image and the LR images can be modeled by a linear system using a transformation matrix and additive noise. However, a unique solution may not be available because of the singularity of the transformation matrix. To overcome this problem, the POCS method has been used. However, its performance is not good because the effect of noise energy has been ignored. In this paper, we propose an adaptive regularization approach based on the fact that the regularization parameter should be a linear function of noise variance. The performance of the proposed approach has been tested on several images and the obtained results demonstrate the superiority of our approach compared with existing methods.

Book Chapter•DOI•

Visual Speech Recognition

Ahmad B. A. Hassanat

21 Jun 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: Visual speech recognition (VSR) deals with the visual domain of speech and involves image processing, artificial intelligence, object detection, pattern recognition, statistical modelling, etc. It has received a great deal of attention in the last decade.

Abstract: Lip reading is used to understand or interpret speech without hearing it, a technique especially mastered by people with hearing difficulties. The ability to lip read enables a person with a hearing impairment to communicate with others and to engage in social activities, which otherwise would be difficult. Recent advances in the fields of computer vision, pattern recognition, and signal processing have led to a growing interest in automating this challenging task of lip reading. Indeed, automating the human ability to lip read, a process referred to as visual speech recognition (VSR) (or sometimes speech reading), could open the door for other novel related applications. VSR has received a great deal of attention in the last decade for its potential use in applications such as human-computer interaction (HCI), audio-visual speech recognition (AVSR), speaker recognition, talking heads, sign language recognition and video surveillance. Its main aim is to recognise spoken word(s) by using only the visual signal that is produced during speech. Hence, VSR deals with the visual domain of speech and involves image processing, artificial intelligence, object detection, pattern recognition, statistical modelling, etc.

Journal Article•DOI•

Foliage Plant Retrieval using Polar Fourier Transform, Color Moments and Vein Features

Abdul Kadir, Lukito Edi Nugroho, Adhi Susanto, Paulus Insap Santosa

07 Oct 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: A method that combines Polar Fourier Transform, color moments, and vein features to retrieve leaf images based on a leaf image is proposed and shows that the method gave better performance than PNN, SVM, and Fourier transform.

Abstract: This paper proposes a method that combines Polar Fourier Transform, color moments, and vein features to retrieve leaf images based on a leaf image. The method is very useful for helping people recognize foliage plants. Foliage plants are plants that have various colors and unique patterns in the leaf. Therefore, the colors and their patterns are information that should be taken into account in plant identification. To compare the performance of the retrieval system with other results, the experiments used the Flavia dataset, which is very popular in plant recognition. The results show that the method gave better performance than PNN, SVM, and Fourier Transform. The method was also tested using foliage plants with various colors. The accuracy was 90.80% for 50 kinds of plants.

Posted Content•

A Facial Expression Classification System Integrating Canny, Principal Component Analysis and Artificial Neural Network

Le Hoang Thai, Nguyen Do Thai Nguyen, Tran Son Hai

17 Nov 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, a novel approach using Canny, Principal Component Analysis (PCA) and Artificial Neural Network (ANN) was proposed for facial expression classification using the JAFFE database.

Abstract: Facial Expression Classification has been an interesting research problem in recent years. There are many methods to solve this problem. In this research, we propose a novel approach using Canny, Principal Component Analysis (PCA) and Artificial Neural Network (ANN). Firstly, in the preprocessing phase, we use Canny for local region detection in facial images. Then the features of each local region are represented using Principal Component Analysis (PCA). Finally, an Artificial Neural Network (ANN) is applied for Facial Expression Classification. We apply our proposed method (Canny_PCA_ANN) to the recognition of six basic facial expressions on the JAFFE database, consisting of 213 images posed by 10 Japanese female models. The experimental result shows the feasibility of our proposed method.

Posted Content•

Face Recognition Based on SVM and 2DPCA

Thai Hoang Le¹, Len Bui²•Institutions (2)

Ho Chi Minh City University of Science¹, University of Canberra²

25 Oct 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: This method combines 2D Principal Component Analysis (2DPCA), one of the prominent methods for extracting feature vectors, and Support Vector Machine (SVM), the most powerful discriminative method for classification.

Abstract: The paper presents a novel approach for solving the face recognition problem. Our method combines 2D Principal Component Analysis (2DPCA), one of the prominent methods for extracting feature vectors, and Support Vector Machine (SVM), the most powerful discriminative method for classification. Experiments based on the proposed method have been conducted on two public data sets, FERET and ATT; the results show that the proposed method could improve the classification rates.

Posted Content•

Combining Neural Networks for Skin Detection

Chelsia Amy Doukim, Jamal Ahmad Dargham, Ali Chekima, Sigeru Omatu¹•Institutions (1)

Osaka Prefecture University¹

02 Jan 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this paper, two types of combining strategies were evaluated, namely combining skin features and combining skin classifiers, where the outputs of the skin classifier are combined using binary operators such as the AND and the OR operators, "Voting", "Sum of Weights" and a new neural network.

Abstract: Two types of combining strategies were evaluated, namely combining skin features and combining skin classifiers. Several combining rules were applied where the outputs of the skin classifiers are combined using binary operators such as the AND and the OR operators, "Voting", "Sum of Weights" and a new neural network. Three chrominance components from the YCbCr colour space that gave the highest correct detection on their single feature MLP were selected as the combining parameters. A major issue in designing an MLP neural network is to determine the optimal number of hidden units given a set of training patterns. Therefore, a "coarse to fine search" method to find the number of neurons in the hidden layer is proposed. The strategy of combining Cb/Cr and Cr features improved the correct detection by 3.01% compared to the best single feature MLP given by Cb-Cr. The strategy of combining the outputs of three skin classifiers using the "Sum of Weights" rule further improved the correct detection by 4.38% compared to the best single feature MLP.
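
The combining rules named in the abstract can be sketched generically as follows. This is an illustrative Python sketch, not the authors' implementation: the function names, the normalised weighted-sum form used for "Sum of Weights", the 0.5 threshold, and the toy classifier outputs are all assumptions.

```python
import numpy as np

def combine_and(decisions):
    """Skin only if every classifier says skin."""
    return np.all(np.asarray(decisions, dtype=bool), axis=0)

def combine_or(decisions):
    """Skin if at least one classifier says skin."""
    return np.any(np.asarray(decisions, dtype=bool), axis=0)

def combine_vote(decisions):
    """Skin if a strict majority of classifiers says skin."""
    decisions = np.asarray(decisions, dtype=bool)
    return decisions.sum(axis=0) * 2 > decisions.shape[0]

def combine_weighted_sum(scores, weights, threshold=0.5):
    """Skin if the weighted average of classifier scores reaches the threshold.

    The weights are normalised to sum to one; both the weights and the
    threshold are hypothetical choices for this sketch.
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return np.tensordot(weights, scores, axes=1) >= threshold

# Rows: three classifiers; columns: four pixels (toy values).
decisions = [[1, 1, 0, 0],
             [1, 0, 1, 0],
             [1, 1, 0, 0]]
scores = [[0.9, 0.6, 0.2, 0.1],
          [0.8, 0.3, 0.7, 0.2],
          [0.7, 0.8, 0.1, 0.3]]

and_mask = combine_and(decisions)
or_mask = combine_or(decisions)
vote_mask = combine_vote(decisions)
weighted_mask = combine_weighted_sum(scores, weights=[0.5, 0.2, 0.3])
```

On the toy data the AND rule is the strictest (only the first pixel is labelled skin) and the OR rule the most permissive (three pixels), with voting and the weighted sum falling in between.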

Posted Content•DOI•

Devnagari document segmentation using histogram approach

Vikas J. Dongre, Vijay H. Mankar

06 Sep 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: A simple histogram-based approach to segment Devnagari documents is proposed in this paper and various challenges in segmentation of Devnagari script are discussed.

Abstract: Document segmentation is one of the critical phases in machine recognition of any language. Correct segmentation of individual symbols determines the accuracy of the character recognition technique. It is used to decompose an image of a sequence of characters into sub-images of individual symbols by segmenting lines and words. Devnagari is the most popular script in India. It is used for writing Hindi, Marathi, Sanskrit and Nepali languages. Moreover, Hindi is the third most popular language in the world. Devnagari documents consist of vowels, consonants and various modifiers. Hence proper segmentation of Devnagari words is challenging. A simple histogram-based approach to segment Devnagari documents is proposed in this paper. Various challenges in segmentation of Devnagari script are also discussed.
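
A histogram-based segmentation of this kind is commonly implemented with projection profiles: the row-wise ink histogram separates text lines, and the column-wise histogram within a line separates words. The sketch below shows that general idea in Python. It assumes a binarised page where ink pixels are 1 and background pixels are 0, and the function names are hypothetical. The paper's exact procedure may differ.

```python
import numpy as np

def find_runs(profile):
    """Return (start, end) pairs, end exclusive, for runs where profile > 0."""
    active = np.asarray(profile) > 0
    padded = np.concatenate(([False], active, [False]))
    # Indices where the profile switches between empty and non-empty.
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(changes[0::2].tolist(), changes[1::2].tolist()))

def segment_lines(image):
    """Split a binary page into text lines using the row histogram."""
    return find_runs(np.asarray(image).sum(axis=1))

def segment_words(image, line):
    """Split one text line into words using the column histogram."""
    top, bottom = line
    return find_runs(np.asarray(image)[top:bottom].sum(axis=0))

# Toy binary page: two text lines, the first containing two words.
page = np.array([
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
])

lines = segment_lines(page)
words_in_first_line = segment_words(page, lines[0])
print(lines)                # [(1, 3), (4, 5)]
print(words_in_first_line)  # [(1, 3), (5, 7)]
```

This simple scheme relies on empty rows and columns between units, so it breaks down where modifiers above or below the characters touch neighbouring lines, which is the kind of difficulty the abstract points to.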

Journal Article•DOI•

Constant-time filtering using shiftable kernels

Kunal N. Chaudhury¹•Institutions (1)

Princeton University¹

22 Jul 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: This letter identifies a central property of trigonometric functions, called shiftability, that allows us to exploit the redundancy inherent in the filtering operations and shows how certain complex filtering can be reduced to simply that of computing the moving sum of a stack of images.

Abstract: It was recently demonstrated in [5] that the non-linear bilateral filter [14] can be efficiently implemented using a constant-time or O(1) algorithm. At the heart of this algorithm was the idea of approximating the Gaussian range kernel of the bilateral filter using trigonometric functions. In this letter, we explain how the idea in [5] can be extended to a few other linear and non-linear filters [14, 17, 2]. While some of these filters have received a lot of attention in recent years, they are known to be computationally intensive. To extend the idea in [5], we identify a central property of trigonometric functions, called shiftability, that allows us to exploit the redundancy inherent in the filtering operations. In particular, using shiftable kernels, we show how certain complex filtering can be reduced to simply that of computing the moving sum of a stack of images. Each image in the stack is obtained through an elementary pointwise transform of the input image. This has a two-fold advantage. First, we can use fast recursive algorithms for computing the moving sum [15, 6], and, secondly, we can use parallel computation to further speed up the computation. We also show how shiftable kernels can be used to approximate the (non-shiftable) Gaussian kernel that is ubiquitously used in image filtering.
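
The shiftability idea can be illustrated with the identity cos(γ(a − b)) = cos(γa)cos(γb) + sin(γa)sin(γb). A sum of cosine range weights over a window then splits into two moving sums of pointwise-transformed signals. The 1-D Python sketch below checks this against a direct evaluation. It is a simplified illustration under assumed names (`moving_sum`, `cosine_range_sum_fast`) and an assumed border handling (windows clipped at the ends), not the algorithm from the letter.

```python
import numpy as np

def moving_sum(x, radius):
    """Sum of x over [i - radius, i + radius], clipped at the borders.

    Uses one cumulative sum, so the cost per sample does not depend on radius.
    """
    n = len(x)
    csum = np.concatenate(([0.0], np.cumsum(x)))
    idx = np.arange(n)
    lo = np.maximum(idx - radius, 0)
    hi = np.minimum(idx + radius + 1, n)
    return csum[hi] - csum[lo]

def cosine_range_sum_fast(f, gamma, radius):
    """Window sum of cos(gamma * (f[j] - f[i])) via two moving sums."""
    c, s = np.cos(gamma * f), np.sin(gamma * f)
    return c * moving_sum(c, radius) + s * moving_sum(s, radius)

def cosine_range_sum_direct(f, gamma, radius):
    """Same quantity evaluated window by window, for comparison."""
    n = len(f)
    out = np.empty(n)
    for i in range(n):
        lo, hi = max(i - radius, 0), min(i + radius + 1, n)
        out[i] = np.sum(np.cos(gamma * (f[lo:hi] - f[i])))
    return out

rng = np.random.default_rng(0)
f = rng.random(200)  # toy 1-D signal with values in [0, 1)
fast = cosine_range_sum_fast(f, gamma=2.0, radius=10)
direct = cosine_range_sum_direct(f, gamma=2.0, radius=10)
max_error = np.max(np.abs(fast - direct))
print(f"max |fast - direct| = {max_error:.2e}")
```

The direct version costs time proportional to the window size per sample, while the shiftable version needs only two pointwise transforms and two moving sums regardless of the radius, which is the redundancy the abstract refers to.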

Posted Content•

Spectral descriptors for deformable shapes

Alexander M. Bronstein¹•Institutions (1)

Tel Aviv University¹

23 Oct 2011-arXiv: Computer Vision and Pattern Recognition

TL;DR: It is argued that in order to be optimal for a specific task, the descriptor should take into account the statistics of the corpus of shapes to which it is applied and those of the class of transformations to which it is made insensitive (the "noise").

Abstract: Informative and discriminative feature descriptors play a fundamental role in deformable shape analysis. For example, they have been successfully employed in correspondence, registration, and retrieval tasks. In recent years, significant attention has been devoted to descriptors obtained from the spectral decomposition of the Laplace-Beltrami operator associated with the shape. Notable examples in this family are the heat kernel signature (HKS) and the wave kernel signature (WKS). Laplacian-based descriptors achieve state-of-the-art performance in numerous shape analysis tasks; they are computationally efficient, isometry-invariant by construction, and can gracefully cope with a variety of transformations. In this paper, we formulate a generic family of parametric spectral descriptors. We argue that in order to be optimal for a specific task, the descriptor should take into account the statistics of the corpus of shapes to which it is applied (the "signal") and those of the class of transformations to which it is made insensitive (the "noise"). While such statistics are hard to model axiomatically, they can be learned from examples. Following the spirit of the Wiener filter in signal processing, we show a learning scheme for the construction of optimal spectral descriptors and relate it to Mahalanobis metric learning. The superiority of the proposed approach is demonstrated on the SHREC'10 benchmark.
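
To make the spectral family concrete, the heat kernel signature mentioned above is HKS(x, t) = Σ_k exp(−λ_k t) φ_k(x)², where λ_k and φ_k are the eigenvalues and eigenfunctions of the Laplacian. The Python sketch below evaluates it on a small path graph, using the graph Laplacian as a stand-in for the Laplace-Beltrami operator of a real mesh. That substitution, the function names, and the chosen time values are assumptions. The sketch shows only the fixed HKS, not the learned descriptors proposed in the paper.

```python
import numpy as np

def path_graph_laplacian(n):
    """Combinatorial Laplacian L = D - A of a path graph with n vertices."""
    adj = np.zeros((n, n))
    i = np.arange(n - 1)
    adj[i, i + 1] = 1.0
    adj[i + 1, i] = 1.0
    return np.diag(adj.sum(axis=1)) - adj

def heat_kernel_signature(laplacian, times):
    """Return an (n_vertices, n_times) array of HKS values.

    HKS(x, t) = sum_k exp(-lambda_k * t) * phi_k(x)**2, using the full
    eigendecomposition of a symmetric Laplacian.
    """
    evals, evecs = np.linalg.eigh(laplacian)
    decay = np.exp(-np.outer(evals, times))  # shape (n_eigs, n_times)
    return (evecs ** 2) @ decay

laplacian = path_graph_laplacian(6)
times = np.array([0.0, 0.5, 2.0])  # hypothetical time scales
hks = heat_kernel_signature(laplacian, times)
print(np.round(hks, 3))
```

Vertices that are mirror images of each other on the path receive identical signatures, a small-scale analogue of the invariance to isometries noted in the abstract. The family studied in the paper replaces the fixed exponential decay with a function of the eigenvalues that is learned from examples.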
