01 Oct 2002-IEEE Transactions on Image Processing (University of Groningen, Johann Bernoulli Institute for Mathematics and Computer Science)-Vol. 11, Iss: 10, pp 1160-1167
TL;DR: Gabor energy, complex moments, and grating cell operator features are compared for texture discrimination and detection; the grating cell operator is the only one that responds selectively to texture and gives no false response to nontexture features such as object contours.
Abstract: Texture features that are based on the local power spectrum obtained by a bank of Gabor filters are compared. The features differ in the type of nonlinear post-processing which is applied to the local power spectrum. The following features are considered: Gabor energy, complex moments, and grating cell operator features. The capability of the corresponding operators to produce distinct feature vector clusters for different textures is compared using two methods: the Fisher (1923) criterion and the classification result comparison. Both methods give consistent results. The grating cell operator gives the best discrimination and segmentation results. The texture detection capabilities of the operators and their robustness to nontexture features are also compared. The grating cell operator is the only one that selectively responds only to texture and does not give false response to nontexture features such as object contours.
At this point, the question arises of how to measure the usefulness of different features.
None of the aforementioned evaluation methods can be generally considered as superior because each of them is informative in its own way and each has its limitations.
In Section IV, a number of texture segmentation experiments are carried out and the properties of the considered operators are assessed using the classification result comparison method.
B. Gabor Energy Features
The outputs of a symmetric and an antisymmetric kernel filter in each image point can be combined in a single quantity that is called the Gabor energy.
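This feature is related to the model of a specific type of orientation selective neuron in the primary visual cortex called the complex cell [35] and is defined in the following way: $e_{\lambda,\Theta}(x,y) = \sqrt{r_{\lambda,\Theta,0}^{2}(x,y) + r_{\lambda,\Theta,-\pi/2}^{2}(x,y)}$ (3), where $r_{\lambda,\Theta,0}(x,y)$ and $r_{\lambda,\Theta,-\pi/2}(x,y)$ are the responses of the linear symmetric and antisymmetric Gabor filters, respectively.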
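The combination of a symmetric and an antisymmetric filter response into one energy value can be illustrated with a small sketch. It assumes the common form of the Gabor energy (square root of the sum of the two squared responses) and a generic Gabor kernel; the function names, the aspect ratio `gamma`, the ratio `sigma = 0.56 * wavelength`, and the synthetic grating image are assumptions made for illustration, not values taken from the paper.

```python
import math

def gabor_kernel(half, wavelength, theta, phase, sigma, gamma=0.5):
    """Sample a 2-D Gabor function on a (2*half+1) x (2*half+1) grid."""
    rows = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the cosine carrier varies along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength + phase))
        rows.append(row)
    return rows

def filter_response(image, kernel, cx, cy):
    """Linear filter response at pixel (cx, cy) by direct summation."""
    half = len(kernel) // 2
    return sum(image[cy + dy][cx + dx] * kernel[dy + half][dx + half]
               for dy in range(-half, half + 1)
               for dx in range(-half, half + 1))

def gabor_energy(image, cx, cy, wavelength, theta, sigma, half):
    """Combine symmetric (phase 0) and antisymmetric (phase -pi/2) responses."""
    r_sym = filter_response(
        image, gabor_kernel(half, wavelength, theta, 0.0, sigma), cx, cy)
    r_anti = filter_response(
        image, gabor_kernel(half, wavelength, theta, -math.pi / 2, sigma), cx, cy)
    return math.hypot(r_sym, r_anti)

# Synthetic test image (assumption): a grating that varies along x only.
wavelength = 8.0
image = [[math.cos(2 * math.pi * x / wavelength) for x in range(48)]
         for _ in range(48)]
sigma = 0.56 * wavelength
e_matched = gabor_energy(image, 24, 24, wavelength, 0.0, sigma, 16)
e_orthogonal = gabor_energy(image, 24, 24, wavelength, math.pi / 2, sigma, 16)
e_shifted = gabor_energy(image, 26, 24, wavelength, 0.0, sigma, 16)
```

In this sketch the energy is large when the filter orientation matches the grating and small for the orthogonal orientation, and it changes very little when the evaluation point is shifted by a quarter period, which is the phase insensitivity that motivates combining the two filters.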
The Gabor energy is closely related to the local power spectrum.
The local power spectrum associated with a pixel in an image is defined as the squared modulus of the Fourier transform of the product of the image function and a window function that restricts the Fourier analysis to a neighborhood of the pixel of interest.
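Written out, the verbal definition above corresponds to a windowed Fourier transform. The symbols are assumptions chosen for this sketch and are not necessarily the paper's notation: $f$ is the image function, $w$ the window function centered on the pixel $(x, y)$, and $(u, v)$ the spatial frequency.

```latex
p_{x,y}(u, v) =
  \left|
    \iint f(x', y')\, w(x' - x,\, y' - y)\,
    e^{-2\pi i\,(u x' + v y')}\, dx'\, dy'
  \right|^{2}
```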
Using a Gaussian windowing function such as the one used in (2), and taking into account (1) and (3), the following relation between the local power spectrum and the Gabor energy features can be proven: (4).
C. Complex Moments Features
In [9] and [36], the real and imaginary parts of the complex moments of the local power spectrum were proposed as features that give information about the presence or absence of dominant texture orientations.
In [36], the authors discuss the advantages of using the real and imaginary parts of the complex moments as features instead of their moduli and arguments.
In their experiments, the authors use as features the nonzero real and imaginary parts of the complex moments of the local power spectrum.
This amounts to 43 nonzero real values out of which only 24 are linearly independent because .
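How complex moments summarize a sampled power spectrum can be sketched as follows. The sketch assumes the standard definition of a complex moment, $C_{mn} = \sum (u + iv)^m (u - iv)^n\, p(u, v)$ over the sampled frequencies, and a hypothetical filter bank whose samples sit at frequency $1/\lambda$ and orientation $\theta$; the function names and sample values are illustrative only.

```python
import math

def bank_samples(wavelengths, orientations, power):
    """Turn power[i][j] (wavelength i, orientation j) into (u, v, p) samples.

    Each sample is mirrored to (-u, -v) because the power spectrum of a
    real-valued image is symmetric about the origin.
    """
    samples = []
    for i, lam in enumerate(wavelengths):
        for j, theta in enumerate(orientations):
            u = math.cos(theta) / lam
            v = math.sin(theta) / lam
            samples.append((u, v, power[i][j]))
            samples.append((-u, -v, power[i][j]))
    return samples

def complex_moment(samples, m, n):
    """Discrete complex moment C_mn of a sampled power spectrum."""
    total = 0j
    for u, v, p in samples:
        z = complex(u, v)
        total += (z ** m) * (z.conjugate() ** n) * p
    return total

# Hypothetical bank: 3 wavelengths x 4 orientations with arbitrary power values.
wavelengths = [4.0, 8.0, 16.0]
orientations = [k * math.pi / 4 for k in range(4)]
power = [[1.0, 0.2, 0.1, 0.3],
         [0.5, 2.0, 0.4, 0.1],
         [0.3, 0.1, 0.9, 0.6]]
samples = bank_samples(wavelengths, orientations, power)

c21 = complex_moment(samples, 2, 1)
c12 = complex_moment(samples, 1, 2)
c11 = complex_moment(samples, 1, 1)
c20 = complex_moment(samples, 2, 0)
```

Under these assumptions `c12` is the complex conjugate of `c21`, moments with odd `m + n` vanish because of the mirror symmetry, and `c11` is purely real, so only part of the real and imaginary parts carry independent information.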
Such a step can improve the separability of the feature clusters, but then this step should also be applied to the other features.
D. Grating Cell Operator Features
Grating cells are selective for orientation but differ from the majority of orientation selective cells found in the mentioned cortical areas in that they do not react to single lines or edges, as for instance simple cells (modeled by Gabor filters) or complex cells (modeled by Gabor energy operators) do.
The response increases with the number of bars that cross the receptive field of the cell and saturates at about ten bars.
The grating cell operator was conceived to reproduce the properties of grating cells as known from electrophysiological research [11]–[14].
Essentially, this operator signals the presence of one-dimensional (1-D) periodicity of certain preferred spatial frequency and orientation in 2-D images.
A. Comparison Method
The feature vectors computed in different points of a texture image are not identical; they rather form a cluster in the multidimensional feature space.
A linear transform that, under certain conditions, realizes such a projection was first introduced by Fisher [39] and is called the Fisher linear discriminant function.
This projection of the feature vectors into the 1-D space maximizes the Fisher criterion [40], which measures the separability of the two concerned clusters in the reduced space (7). The criterion is expressed in terms of the standard deviations of the distributions of the projected feature vectors of the two clusters and the projections of the two cluster means.
Strictly speaking, the transform given by (6) need not necessarily maximize the value of the Fisher criterion (7) for arbitrary distributions.
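The projection-based comparison can be sketched for two small clusters of 2-D feature vectors. Two assumptions are made here because (6) and (7) are not reproduced in this text: the projection direction is the textbook Fisher linear discriminant, $w = (\Sigma_a + \Sigma_b)^{-1}(\mu_a - \mu_b)$, and the separability is measured as $|\eta_a - \eta_b| / \sqrt{\sigma_a^2 + \sigma_b^2}$, one commonly used form that may differ from the paper's. All function names and data points are hypothetical.

```python
import math

def mean2(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def covariance2(points):
    """Population covariance (sxx, sxy, syy) of a list of 2-D points."""
    mx, my = mean2(points)
    n = len(points)
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return sxx, sxy, syy

def fisher_direction(cluster_a, cluster_b):
    """Textbook Fisher discriminant direction: inverse pooled covariance
    applied to the difference of the cluster means."""
    axx, axy, ayy = covariance2(cluster_a)
    bxx, bxy, byy = covariance2(cluster_b)
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy
    det = sxx * syy - sxy * sxy
    (ax, ay), (bx, by) = mean2(cluster_a), mean2(cluster_b)
    dx, dy = ax - bx, ay - by
    return ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)

def separability(cluster_a, cluster_b, w):
    """Assumed criterion: distance of projected means over the root of the
    summed projected variances."""
    def projected_stats(points):
        proj = [w[0] * x + w[1] * y for x, y in points]
        m = sum(proj) / len(proj)
        var = sum((v - m) ** 2 for v in proj) / len(proj)
        return m, var
    ma, va = projected_stats(cluster_a)
    mb, vb = projected_stats(cluster_b)
    return abs(ma - mb) / math.sqrt(va + vb)

# Two elongated, overlapping clusters (made-up data).
cluster_a = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.5), (3.0, 3.0), (1.0, 2.0)]
cluster_b = [(2.0, 0.0), (3.0, 1.0), (4.0, 2.5), (5.0, 3.0), (3.0, 2.0)]

w = fisher_direction(cluster_a, cluster_b)
f_fisher = separability(cluster_a, cluster_b, w)
f_x_axis = separability(cluster_a, cluster_b, (1.0, 0.0))
f_y_axis = separability(cluster_a, cluster_b, (0.0, 1.0))
```

For these clusters the Fisher direction separates the projections noticeably better than either coordinate axis, which is why a single number computed in the projected 1-D space can stand in for the distance between clusters in the full feature space.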
Only recently, this criterion has been applied to the evaluation of texture feature extraction operators [13].
B. Results
The authors evaluated the performance of the operators presented in Section II according to the Fisher criterion by looking at the pair-wise separability of the feature clusters corresponding to nine test textures (Fig. 1).
The separability achieved for the complex moments features is smaller than the one achieved with the Gabor energy features.
Similarly, in [14], the authors stress that the grating cell operator was conceived to respond only to a given orientation and frequency of the input stimuli.
TL;DR: This work comprehensively surveys the existing methods and applications for the fusion of infrared and visible images and can serve as a reference for researchers in infrared and visible image fusion and related fields.
Abstract: Infrared images can distinguish targets from their backgrounds based on the radiation difference, which works well in all-weather and all-day/night conditions. By contrast, visible images can provide texture details with high spatial resolution and definition in a manner consistent with the human visual system. Therefore, it is desirable to fuse these two types of images, which can combine the advantages of thermal radiation information in infrared images and detailed texture information in visible images. In this work, we comprehensively survey the existing methods and applications for the fusion of infrared and visible images. First, infrared and visible image fusion methods are reviewed in detail. Meanwhile, image registration, as a prerequisite of image fusion, is briefly introduced. Second, we provide an overview of the main applications of infrared and visible image fusion. Third, the evaluation metrics of fusion performance are discussed and summarized. Fourth, we select eighteen representative methods and nine assessment metrics to conduct qualitative and quantitative experiments, which can provide an objective performance reference for different fusion methods and thus support relative engineering with credible and solid evidence. Finally, we conclude with the current status of infrared and visible image fusion and deliver insightful discussions and prospects for future work. This survey can serve as a reference for researchers in infrared and visible image fusion and related fields.
TL;DR: A biologically motivated method, called nonclassical receptive field (non-CRF) inhibition (more generally, surround inhibition or suppression), is proposed to improve contour detection in machine vision; it is more useful for contour-based object recognition tasks than traditional edge detectors, which do not distinguish between contour and texture edges.
Abstract: We propose a biologically motivated method, called nonclassical receptive field (non-CRF) inhibition (more generally, surround inhibition or suppression), to improve contour detection in machine vision. Non-CRF inhibition is exhibited by 80% of the orientation-selective neurons in the primary visual cortex of monkeys and has been shown to influence human visual perception as well. Essentially, the response of an edge detector at a certain point is suppressed by the responses of the operator in the region outside the supported area. We combine classical edge detection with isotropic and anisotropic inhibition, both of which have counterparts in biology. We also use a biologically motivated method (the Gabor energy operator) for edge detection. The resulting operator responds strongly to isolated lines, edges, and contours, but exhibits weak or no response to edges that are part of texture. We use natural images with associated ground truth contour maps to assess the performance of the proposed operator for detecting contours while suppressing texture edges. Our method enhances contour detection in cluttered visual scenes more effectively than classical edge detectors used in machine vision (Canny edge detector). Therefore, the proposed operator is more useful for contour-based object recognition tasks, such as shape comparison, than traditional edge detectors, which do not distinguish between contour and texture edges. Traditional edge detection algorithms can, however, also be extended with surround suppression. This study contributes also to the understanding of inhibitory mechanisms in biology.
TL;DR: A comprehensive survey in a systematic approach about the state-of-the-art on-road vision-based vehicle detection and tracking systems for collision avoidance systems (CASs).
Abstract: Over the past decade, vision-based vehicle detection techniques for road safety improvement have gained an increasing amount of attention. Unfortunately, the techniques suffer from a lack of robustness due to huge variability in vehicle shape (particularly for motorcycles), cluttered environments, various illumination conditions, and driving behavior. In this paper, we provide a comprehensive and systematic survey of the state-of-the-art on-road vision-based vehicle detection and tracking systems for collision avoidance systems (CASs). This paper is structured according to the vehicle detection process, starting from sensor selection and proceeding to vehicle detection and tracking. Techniques in each process/step are reviewed and analyzed individually. Two main contributions in this paper are the following: a survey on motorcycle detection techniques and a sensor comparison in terms of cost and range parameters. Finally, the survey provides an optimal choice with a low cost and reliable CAS design in vehicle industries.
TL;DR: A novel mobile telephone food record that will provide an accurate account of daily food and nutrient intake and the approach to image analysis that includes the segmentation of food items, features used to identify foods, a method for automatic portion estimation, and the overall system architecture for collecting the food intake information are described.
Abstract: There is a growing concern about chronic diseases and other health problems related to diet including obesity and cancer. The need to accurately measure diet (what foods a person consumes) becomes imperative. Dietary intake provides valuable insights for mounting intervention programs for prevention of chronic diseases. Measuring accurate dietary intake is considered to be an open research problem in the nutrition and health fields. In this paper, we describe a novel mobile telephone food record that will provide an accurate account of daily food and nutrient intake. Our approach includes the use of image analysis tools for identification and quantification of food that is consumed at a meal. Images obtained before and after foods are eaten are used to estimate the amount and type of food consumed. The mobile device provides a unique vehicle for collecting dietary information that reduces the burden on respondents that are obtained using more classical approaches for dietary assessment. We describe our approach to image analysis that includes the segmentation of food items, features used to identify foods, a method for automatic portion estimation, and our overall system architecture for collecting the food intake information.
TL;DR: A novel method for constructing a smooth direction field that preserves the flow of the salient image features and the notion of flow-guided anisotropic filtering for detecting highly coherent lines while suppressing noise is introduced.
Abstract: This paper presents a non-photorealistic rendering technique that automatically generates a line drawing from a photograph. We aim at extracting a set of coherent, smooth, and stylistic lines that effectively capture and convey important shapes in the image. We first develop a novel method for constructing a smooth direction field that preserves the flow of the salient image features. We then introduce the notion of flow-guided anisotropic filtering for detecting highly coherent lines while suppressing noise. Our method is simple and easy to implement. A variety of experimental results are presented to show the effectiveness of our method in producing self-contained, high-quality line illustrations.
315 citations
Cites background from "Comparison of texture features base..."
...Gabor filters [Grigorescu et al. 2002] and other steerable filters [Freeman and Adelson 1991] typically employ a set of oriented, elliptic kernels to help analyze parallel structures or texture patterns....
TL;DR: This completely revised second edition presents an introduction to statistical pattern recognition, which is appropriate as a text for introductory courses in pattern recognition and as a reference book for workers in the field.
Abstract: This completely revised second edition presents an introduction to statistical pattern recognition. Pattern recognition in general covers a wide range of problems: it is applied to engineering problems, such as character readers and wave form analysis, as well as to brain modeling in biology and psychology. Statistical decision and estimation, which are the main subjects of this book, are regarded as fundamental to the study of pattern recognition. This book is appropriate as a text for introductory courses in pattern recognition and as a reference book for workers in the field. Each chapter contains computer projects as well as exercises.
10,526 citations
Additional excerpts
...[8]) [...] where [...] are the standard deviations...
TL;DR: This paper evaluates the performance both of some texture measures which have been successfully used in various applications and of some new promising approaches proposed recently.
Abstract: This paper evaluates the performance both of some texture measures which have been successfully used in various applications and of some new promising approaches proposed recently. For classification a method based on Kullback discrimination of sample and prototype distributions is used. The classification results for single features with one-dimensional feature value distributions and for pairs of complementary features with two-dimensional distributions are presented.
TL;DR: A method for rapid visual recognition of personal identity is described, based on the failure of a statistical test of independence, which implies a theoretical "cross-over" error rate of one in 131000 when a decision criterion is adopted that would equalize the false accept and false reject error rates.
Abstract: A method for rapid visual recognition of personal identity is described, based on the failure of a statistical test of independence. The most unique phenotypic feature visible in a person's face is the detailed texture of each eye's iris. The visible texture of a person's iris in a real-time video image is encoded into a compact sequence of multi-scale quadrature 2-D Gabor wavelet coefficients, whose most-significant bits comprise a 256-byte "iris code". Statistical decision theory generates identification decisions from Exclusive-OR comparisons of complete iris codes at the rate of 4000 per second, including calculation of decision confidence levels. The distributions observed empirically in such comparisons imply a theoretical "cross-over" error rate of one in 131000 when a decision criterion is adopted that would equalize the false accept and false reject error rates. In the typical recognition case, given the mean observed degree of iris code agreement, the decision confidence levels correspond formally to a conditional false accept probability of one in about 10^31.
3,399 citations
"Comparison of texture features base..." refers methods in this paper
...If needed, scale and orientation invariance can be added to the methods in a way similar to the one used in other applications [61], [62]....
TL;DR: Evidence is presented that the 2D receptive-field profiles of simple cells in mammalian visual cortex are well described by members of this optimal 2D filter family, and thus such visual neurons could be said to optimize the general uncertainty relations for joint 2D-spatial-2D-spectral information resolution.
Abstract: Two-dimensional spatial linear filters are constrained by general uncertainty relations that limit their attainable information resolution for orientation, spatial frequency, and two-dimensional (2D) spatial position. The theoretical lower limit for the joint entropy, or uncertainty, of these variables is achieved by an optimal 2D filter family whose spatial weighting functions are generated by exponentiated bivariate second-order polynomials with complex coefficients, the elliptic generalization of the one-dimensional elementary functions proposed in Gabor’s famous theory of communication [J. Inst. Electr. Eng. 93, 429 (1946)]. The set includes filters with various orientation bandwidths, spatial-frequency bandwidths, and spatial dimensions, favoring the extraction of various kinds of information from an image. Each such filter occupies an irreducible quantal volume (corresponding to an independent datum) in a four-dimensional information hyperspace whose axes are interpretable as 2D visual space, orientation, and spatial frequency, and thus such a filter set could subserve an optimally efficient sampling of these variables. Evidence is presented that the 2D receptive-field profiles of simple cells in mammalian visual cortex are well described by members of this optimal 2D filter family, and thus such visual neurons could be said to optimize the general uncertainty relations for joint 2D-spatial–2D-spectral information resolution. The variety of their receptive-field dimensions and orientation and spatial-frequency bandwidths, and the correlations among these, reveal several underlying constraints, particularly in width/length aspect ratio and principal axis organization, suggesting a polar division of labor in occupying the quantal volumes of information hyperspace.
Such an ensemble of 2D neural receptive fields in visual cortex could locally embed coarse polar mappings of the orientation–frequency plane piecewise within the global retinotopic mapping of visual space, thus efficiently representing 2D spatial visual information by localized 2D spectral signatures.
TL;DR: A texture segmentation algorithm inspired by the multi-channel filtering theory for visual information processing in the early stages of human visual system is presented, which is based on reconstruction of the input image from the filtered images.
Abstract: This paper presents a texture segmentation algorithm inspired by the multi-channel filtering theory for visual information processing in the early stages of human visual system. The channels are characterized by a bank of Gabor filters that nearly uniformly covers the spatial-frequency domain, and a systematic filter selection scheme is proposed, which is based on reconstruction of the input image from the filtered images. Texture features are obtained by subjecting each (selected) filtered image to a nonlinear transformation and computing a measure of “energy” in a window around each pixel. A square-error clustering algorithm is then used to integrate the feature images and produce a segmentation. A simple procedure to incorporate spatial information in the clustering process is proposed. A relative index is used to estimate the “true” number of texture categories.
2,351 citations
Additional excerpts
...compares them with the linear Gabor features or with the thresholded Gabor features [47], [48]....
Q1. What is the reason why the grating cell features are spatially more extended?
One should note that the grating cell features are spatially more extended, a property that is due to the weighted averaging step.
Q2. Why did the authors not include textures at different scales and orientations?
The authors did not include textures at different scales and orientations because the operators compared here are not scaling and rotation invariant, a property that is mainly due to the frequency and orientation selectivity of the Gabor filters.
Q3. What is the Fisher criterion for determining the distance between two clusters of feature?
In order to determine the distance between two clusters of feature vectors, it is sufficient to look at their projections onto a 1-D space, i.e., a line, under the assumption that this projection maximizes the separability of the clusters in the 1-D space.
Q4. What is the effect of local averaging on the texture operators?
The texture operators were also tested for their ability to detect texture in an image and to separate texture information from other image features like edges and contours of objects.
Q5. Why was the weighted local averaging part included in the grating cell operator?
It has been included in the model in order to reproduce a specific property of grating cells, namely, that a grating cell starts to respond when at least three parallel bars are present in its receptive field and that its response grows linearly with the addition of further bars to the grating, reaching saturation at about ten bars [37], [38].
Q6. What is the Fisher criterion for determining the separability of two clusters?
The authors evaluated the performance of the operators presented in Section II according to the Fisher criterion by looking at the pair-wise separability of the feature clusters corresponding to nine test textures (Fig. 1). While the number of test images used is limited, one has to point out that the only aspect that was taken into account in selecting them is that the textures show a certain degree of “orientedness” which is to guarantee that (some of) the Gabor filters employed will respond.
Q7. How many linearly independent values are used for each point in the image?
The authors use this set of 24 linearly independent values computed for each point in the image as a feature vector associated with that point.
Q8. How was the interclass texture discrimination evaluated?
The interclass texture discrimination properties of different features were assessed by Fisher linear discriminant analysis and by the (classical) classification result comparison method.