Journal Article

3D Texture Recognition Using Bidirectional Feature Histograms

01 Aug 2004-International Journal of Computer Vision (Kluwer Academic Publishers)-Vol. 59, Iss: 1, pp 33-60
TL;DR: A 3D texture recognition method is designed which employs the BFH as the surface model, and classifies surfaces based on a single novel texture image of unknown imaging parameters, and a computational method for quantitatively evaluating the relative significance of texture images within the BTF is developed.
Abstract: Textured surfaces are an inherent constituent of the natural surroundings, therefore efficient real-world applications of computer vision algorithms require precise surface descriptors. Often textured surfaces present not only variations of color or reflectance, but also local height variations. This type of surface is referred to as a 3D texture. As the lighting and viewing conditions are varied, effects such as shadowing, foreshortening and occlusions, give rise to significant changes in texture appearance. Accounting for the variation of texture appearance due to changes in imaging parameters is a key issue in developing accurate 3D texture models. The bidirectional texture function (BTF) is observed image texture as a function of viewing and illumination directions. In this work, we construct a BTF-based surface model which captures the variation of the underlying statistical distribution of local structural image features, as the viewing and illumination conditions are changed. This 3D texture representation is called the bidirectional feature histogram (BFH). Based on the BFH, we design a 3D texture recognition method which employs the BFH as the surface model, and classifies surfaces based on a single novel texture image of unknown imaging parameters. Also, we develop a computational method for quantitatively evaluating the relative significance of texture images within the BTF. The performance of our methods is evaluated by employing over 6200 texture images corresponding to 40 real-world surface samples from the CUReT (Columbia-Utrecht reflectance and texture) database. Our experiments produce excellent classification results, which validate the strong descriptive properties of the BFH as a 3D texture representation.
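The recognition step described in the abstract (a BFH per surface, matched against a single novel image of unknown imaging parameters) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the surface names, the three-bin histograms and the `classify` helper are hypothetical, and plain Euclidean distance between histograms stands in for the PCA-based comparison mentioned in the excerpts further down the page.

```python
import math
from typing import Dict, List, Sequence, Tuple

Histogram = Sequence[float]


def euclidean(p: Histogram, q: Histogram) -> float:
    """Euclidean distance between two feature histograms of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def classify(novel: Histogram, bfh_models: Dict[str, List[Histogram]]) -> Tuple[str, int]:
    """Return the surface label and the index of the closest stored histogram.

    Each model is a list of histograms, one per viewing/illumination condition,
    so the imaging parameters of the novel image do not need to be known.
    """
    best = None
    for label, histograms in bfh_models.items():
        for index, histogram in enumerate(histograms):
            distance = euclidean(novel, histogram)
            if best is None or distance < best[0]:
                best = (distance, label, index)
    if best is None:
        raise ValueError("no models supplied")
    return best[1], best[2]


# Hypothetical toy models: two surfaces, two imaging conditions each.
bfh_models = {
    "felt": [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]],
    "gravel": [[0.1, 0.3, 0.6], [0.2, 0.5, 0.3]],
}
label, condition = classify([0.25, 0.45, 0.30], bfh_models)
print(label, condition)
```

The search runs over every stored condition of every surface, which is what lets a single image be classified without calibration; a practical system would likely compress the histograms first.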
Citations
Journal Article
TL;DR: A method of reliably measuring relative orientation co-occurrence statistics in a rotationally invariant manner is presented, and whether incorporating such information can enhance the classifier’s performance is discussed.
Abstract: We investigate texture classification from single images obtained under unknown viewpoint and illumination. A statistical approach is developed where textures are modelled by the joint probability distribution of filter responses. This distribution is represented by the frequency histogram of filter response cluster centres (textons). Recognition proceeds from single, uncalibrated images and the novelty here is that rotationally invariant filters are used and the filter response space is low dimensional.
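The texton histogram mentioned in this abstract (a frequency histogram of filter response cluster centres) can be illustrated with a short sketch. The two-dimensional toy responses, the three textons and the function names are assumptions made for illustration; the filter bank and the clustering used in the cited paper are not reproduced here.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two response vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_texton(response: Vector, textons: List[Vector]) -> int:
    """Index of the cluster centre (texton) closest to a filter response."""
    return min(range(len(textons)), key=lambda k: squared_distance(response, textons[k]))


def texton_histogram(responses: List[Vector], textons: List[Vector]) -> List[float]:
    """Normalised frequency of each texton label over all responses of one image."""
    if not responses:
        raise ValueError("at least one filter response is required")
    counts = [0] * len(textons)
    for response in responses:
        counts[nearest_texton(response, textons)] += 1
    return [count / len(responses) for count in counts]


# Hypothetical data: three textons and four per-pixel filter responses.
textons = [(0.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
responses = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (0.2, 1.9)]
histogram = texton_histogram(responses, textons)
print(histogram)
```

Because only label frequencies are kept, the histogram discards where in the image each texton occurred, which is what makes it usable as a statistical model of the texture.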

1,145 citations


Cites background or methods or result from "3D Texture Recognition Using Bidire..."

  • ..., 1999), the same database used by Cula and Dana (2004) and Leung and Malik (2001). It is demonstrated that the classifier developed here achieves performance superior to that of Cula and Dana (2004) and Leung and Malik (2001), while requiring only a single image as input and with no information (implicit or explicit) about the illumination and viewing conditions. The CUReT database contains 61 textures, and each texture has 205 images obtained under different viewing and illumination conditions. The variety of textures in this database is shown in Fig. 3. Results are reported for all 61 textures. A preliminary version of these results appeared in Varma and Zisserman (2002)....

    [...]

  • ...The first experiment, where we classify images from 20 textures, corresponds to the setup employed by Cula and Dana (2004)....

    [...]

  • ...Cula and Dana (2004) presented an algorithm based on Leung and Malik’s framework but capable of classifying single images without requiring any a priori information. Using much the same filter bank as Leung and Malik, they showed how to achieve results comparable to Leung and Malik (2001) but using 2D textons generated from single images instead of registered image stacks....

    [...]

  • ...Our approach is most closely related to those of Leung and Malik (2001), Schmid (2001) and Cula and Dana (2004). Leung and Malik’s method is not rotationally invariant and requires as input a set of registered images acquired under a (implicitly) known set of imaging conditions. Schmid’s approach is rotationally invariant but the invariance is achieved in a different manner to ours, and texton clustering is in a higher dimensional space. More importantly, only a single model is used to characterise each texture class rather than having multiple models to account for the variations in imaging conditions. Cula and Dana classify from single images, but the method is not rotationally invariant and their algorithm for model selection differs from the one developed in this paper. These points are discussed in more detail subsequently. The paper is organised as follows: in Section 2, the basic classification algorithm is developed within a rotationally invariant framework. The clustering, learning and classification steps of the algorithm are described, and the performance of four filter sets is compared. The sets include those used by Schmid (2001) and Leung and Malik (2001), and two rotationally invariant sets based on maximal filter responses....

    [...]

Book
15 Aug 2013
TL;DR: Metric Learning: A Review presents an overview of existing research in this topic, including recent progress on scaling to high-dimensional feature spaces and to data sets with an extremely large number of data points.
Abstract: The metric learning problem is concerned with learning a distance function tuned to a particular task, and has been shown to be useful when used in conjunction with nearest-neighbor methods and other techniques that rely on distances or similarities. Metric Learning: A Review presents an overview of existing research in this topic, including recent progress on scaling to high-dimensional feature spaces and to data sets with an extremely large number of data points. It presents as unified a framework as possible under which existing research on metric learning can be cast. The monograph starts out by focusing on linear metric learning approaches, and mainly concentrates on the class of Mahalanobis distance learning methods. It then discusses nonlinear metric learning approaches, focusing on the connections between the non-linear and linear approaches. Finally, it discusses extensions of metric learning, as well as applications to a variety of problems in computer vision, text analysis, program analysis, and multimedia. Metric Learning: A Review is an ideal reference for anyone interested in the metric learning problem. It synthesizes much of the recent work in the area and it is hoped that it will inspire new algorithms and applications.
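For readers unfamiliar with the Mahalanobis family mentioned above, the distance has the form d_M(x, y) = sqrt((x - y)^T M (x - y)) for a positive semidefinite matrix M. The sketch below only evaluates such a distance for a fixed, hand-picked M; learning M from data, which is the subject of the monograph, is not attempted, and the matrices and points are illustrative assumptions.

```python
import math
from typing import Sequence


def mahalanobis(x: Sequence[float], y: Sequence[float],
                M: Sequence[Sequence[float]]) -> float:
    """Distance sqrt((x - y)^T M (x - y)); M is assumed positive semidefinite."""
    diff = [a - b for a, b in zip(x, y)]
    # Matrix-vector product M @ diff, written out explicitly.
    m_diff = [sum(M[i][j] * diff[j] for j in range(len(diff))) for i in range(len(diff))]
    return math.sqrt(sum(d * m for d, m in zip(diff, m_diff)))


identity = [[1.0, 0.0], [0.0, 1.0]]    # reduces to the Euclidean distance
stretched = [[4.0, 0.0], [0.0, 1.0]]   # weights the first coordinate more heavily
x, y = (0.0, 0.0), (3.0, 4.0)
d_euclid = mahalanobis(x, y, identity)
d_stretched = mahalanobis(x, y, stretched)
print(d_euclid, d_stretched)
```

With M equal to the identity the result is the ordinary Euclidean distance, so a learned M can be read as a task-specific reweighting and rotation of the feature space.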

810 citations


Additional excerpts

  • ..., [16, 73, 75]....

    [...]

Journal Article
TL;DR: It is demonstrated that materials can be classified using the joint distribution of intensity values over extremely compact neighborhoods (starting from as small as 3×3 pixels square) and that this can outperform classification using filter banks with large support.
Abstract: In this paper, we investigate material classification from single images obtained under unknown viewpoint and illumination. It is demonstrated that materials can be classified using the joint distribution of intensity values over extremely compact neighborhoods (starting from as small as 3×3 pixels square) and that this can outperform classification using filter banks with large support. It is also shown that the performance of filter banks is inferior to that of image patches with equivalent neighborhoods. We develop novel texton-based representations which are suited to modeling this joint neighborhood distribution for Markov random fields. The representations are learned from training images and then used to classify novel images (with unknown viewpoint and lighting) into texture classes. Three such representations are proposed and their performance is assessed and compared to that of filter banks. The power of the method is demonstrated by classifying 2,806 images of all 61 materials present in the Columbia-Utrecht database. The classification performance surpasses that of recent state-of-the-art filter bank-based classifiers such as Leung and Malik (IJCV 01), Cula and Dana (IJCV 04), and Varma and Zisserman (IJCV 05). We also benchmark performance by classifying all of the textures present in the UIUC, Microsoft Textile, and San Francisco outdoor data sets. We conclude with discussions on why features based on compact neighborhoods can correctly discriminate between textures with large global structure and why the performance of filter banks is not superior to that of the source image patches from which they were derived.
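The compact-neighbourhood idea in this abstract starts from raw intensity patches rather than filter responses. A minimal sketch of that first step, assuming a grayscale image stored as a list of rows, is shown below; `extract_patches` and the 4×4 toy image are hypothetical, and the subsequent texton learning and MRF modelling described in the abstract are not covered.

```python
from typing import List, Tuple

Image = List[List[float]]


def extract_patches(image: Image, size: int = 3) -> List[Tuple[float, ...]]:
    """Return every size-by-size neighbourhood, flattened row by row into a vector."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(rows - size + 1):
        for c in range(cols - size + 1):
            patches.append(tuple(image[r + dr][c + dc]
                                 for dr in range(size) for dc in range(size)))
    return patches


# Hypothetical 4x4 image whose pixel value encodes its position (4 * row + column).
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
patches = extract_patches(image)
print(len(patches), patches[0])
```

Each 3×3 patch becomes a 9-dimensional vector, so the feature space is of the same order as a small filter bank while covering a much smaller support.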

649 citations


Cites background or methods from "3D Texture Recognition Using Bidire..."

  • ...The power of the method is demonstrated by classifying 2806 images of all 61 materials present in the Columbia-Utrecht database....

    [...]

  • ...Each of the materials in the database has been imaged under 205 different viewing and illumination conditions....

    [...]

  • ...Out of these, 46 images per class are randomly chosen for training and the remaining 46 per class are chosen for testing....

    [...]

Journal Article
TL;DR: The proposed unconventional random feature extraction is simple, yet by leveraging the sparse nature of texture images, the approach outperforms traditional feature extraction methods which involve careful design and complex steps and leads to significant improvements in classification accuracy and reductions in feature dimensionality.
Abstract: Inspired by theories of sparse representation and compressed sensing, this paper presents a simple, novel, yet very powerful approach for texture classification based on random projection, suitable for large texture database applications. At the feature extraction stage, a small set of random features is extracted from local image patches. The random features are embedded into a bag-of-words model to perform texture classification; thus, learning and classification are carried out in a compressed domain. The proposed unconventional random feature extraction is simple, yet by leveraging the sparse nature of texture images, our approach outperforms traditional feature extraction methods which involve careful design and complex steps. We have conducted extensive experiments on each of the CUReT, the Brodatz, and the MSRC databases, comparing the proposed approach to four state-of-the-art texture classification methods: Patch, Patch-MRF, MR8, and LBP. We show that our approach leads to significant improvements in classification accuracy and reductions in feature dimensionality.
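The random feature extraction described above amounts to multiplying each flattened patch by a fixed random matrix. The sketch below assumes Gaussian entries scaled by 1/sqrt(m), a common choice in compressed sensing that the abstract itself does not specify; the matrix size, seed and patch values are likewise illustrative.

```python
import random
from typing import List, Sequence


def random_matrix(m: int, n: int, seed: int = 0) -> List[List[float]]:
    """m-by-n matrix of Gaussian entries, scaled by 1/sqrt(m) (an assumed convention)."""
    rng = random.Random(seed)
    scale = 1.0 / (m ** 0.5)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def project(phi: List[List[float]], patch: Sequence[float]) -> List[float]:
    """Compress an n-dimensional patch vector to m random features."""
    return [sum(w * v for w, v in zip(row, patch)) for row in phi]


phi = random_matrix(3, 9, seed=42)          # 9-pixel patch -> 3 random features
patch = [0.1 * i for i in range(9)]         # hypothetical flattened 3x3 patch
features = project(phi, patch)
print(features)
```

The same matrix must be reused for every patch in training and testing; the resulting low-dimensional vectors would then feed the bag-of-words stage in place of the raw patches.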

310 citations



Book Chapter
05 Sep 2010
TL;DR: It is shown that the new QC members outperform state of the art distances for these tasks, while having a short running time, and the experimental results show that both the cross-bin property and the normalization are important.
Abstract: We present a new histogram distance family, the Quadratic-Chi (QC). QC members are Quadratic-Form distances with a cross-bin χ²-like normalization. The cross-bin χ²-like normalization reduces the effect of large bins having undue influence. Normalization was shown to be helpful in many cases, where the χ² histogram distance outperformed the L2 norm. However, χ² is sensitive to quantization effects, such as those caused by light changes, shape deformations, etc. The Quadratic-Form part of QC members takes care of cross-bin relationships (e.g., red and orange), alleviating the quantization problem. We present two new cross-bin histogram distance properties, Similarity-Matrix-Quantization-Invariance and Sparseness-Invariance, and show that QC distances have these properties. We also show experimentally that they boost performance. The computation time complexity of QC distances is linear in the number of non-zero entries in the bin-similarity matrix and histograms, and it can easily be parallelized. We present results for image retrieval using the Scale Invariant Feature Transform (SIFT) and color image descriptors. In addition, we present results for shape classification using Shape Context (SC) and Inner Distance Shape Context (IDSC). We show that the new QC members outperform state of the art distances for these tasks, while having a short running time. The experimental results show that both the cross-bin property and the normalization are important.
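The point about normalization reducing the influence of large bins can be seen by comparing the plain L2 norm with the χ² histogram distance on a toy example. The sketch below covers only these two baseline distances, not the QC family itself; the factor of one half in `chi_square` is one common convention, and the histograms are made up for illustration.

```python
import math
from typing import Sequence


def l2(p: Sequence[float], q: Sequence[float]) -> float:
    """Euclidean (L2) distance between two histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def chi_square(p: Sequence[float], q: Sequence[float]) -> float:
    """Chi-square distance 0.5 * sum((p_i - q_i)^2 / (p_i + q_i)); empty bins are skipped."""
    total = 0.0
    for a, b in zip(p, q):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


# The same absolute change of 10 counts, once in a large bin and once in a small bin.
change_in_large_bin = ([100.0, 1.0], [110.0, 1.0])
change_in_small_bin = ([100.0, 1.0], [100.0, 11.0])

print(l2(*change_in_large_bin), l2(*change_in_small_bin))
print(chi_square(*change_in_large_bin), chi_square(*change_in_small_bin))
```

L2 treats both changes identically, whereas χ² scores the change in the small bin as far more significant, which is the normalization effect the abstract refers to.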

273 citations


Cites methods from "3D Texture Recognition Using Bidire..."

  • ...χ² was successfully used for texture and object categories classification [5,6,7], near duplicate image identification [8], local descriptors matching [9], shape classification [10,11] and boundary detection [12]....

    [...]

References
Journal Article
TL;DR: A near real-time recognition system with 20 complex objects in the database has been developed and a compact representation of object appearance is proposed that is parametrized by pose and illumination.
Abstract: The problem of automatically learning object models for recognition and pose estimation is addressed. In contrast to the traditional approach, the recognition problem is formulated as one of matching appearance rather than shape. The appearance of an object in a two-dimensional image depends on its shape, reflectance properties, pose in the scene, and the illumination conditions. While shape and reflectance are intrinsic properties and constant for a rigid object, pose and illumination vary from scene to scene. A compact representation of object appearance is proposed that is parametrized by pose and illumination. For each object of interest, a large set of images is obtained by automatically varying pose and illumination. This image set is compressed to obtain a low-dimensional subspace, called the eigenspace, in which the object is represented as a manifold. Given an unknown input image, the recognition system projects the image to eigenspace. The object is recognized based on the manifold it lies on. The exact position of the projection on the manifold determines the object's pose in the image. A variety of experiments are conducted using objects with complex appearance characteristics. The performance of the recognition and pose estimation algorithms is studied using over a thousand input images of sample objects. Sensitivity of recognition to the number of eigenspace dimensions and the number of learning samples is analyzed. For the objects used, appearance representation in eigenspaces with less than 20 dimensions produces accurate recognition results with an average pose estimation error of about 1.0 degree. A near real-time recognition system with 20 complex objects in the database has been developed. The paper is concluded with a discussion on various issues related to the proposed learning and recognition methodology.
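The eigenspace idea in this abstract (compress the training set to a low-dimensional subspace, project an unknown input, and look up its position) can be sketched in a heavily simplified form. The example below assumes a one-dimensional eigenspace found by power iteration and replaces the manifold lookup with a nearest-training-sample search; the two-pixel "images" and all function names are hypothetical, and the SLAM library mentioned in the excerpts below is not used.

```python
import math
from typing import List, Sequence, Tuple


def principal_axis(samples: List[Sequence[float]],
                   iterations: int = 100) -> Tuple[List[float], List[float]]:
    """Return the sample mean and the unit vector of the dominant principal component."""
    n, dim = len(samples), len(samples[0])
    mean = [sum(s[d] for s in samples) / n for d in range(dim)]
    centred = [[s[d] - mean[d] for d in range(dim)] for s in samples]
    cov = [[sum(c[i] * c[j] for c in centred) / n for j in range(dim)] for i in range(dim)]
    # Arbitrary start vector; assumed not to be orthogonal to the dominant component.
    axis = [1.0] + [0.0] * (dim - 1)
    for _ in range(iterations):
        product = [sum(cov[i][j] * axis[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in product))
        if norm == 0.0:
            break
        axis = [v / norm for v in product]
    return mean, axis


def project(sample: Sequence[float], mean: Sequence[float], axis: Sequence[float]) -> float:
    """Coordinate of a sample in the one-dimensional eigenspace."""
    return sum((x - m) * a for x, m, a in zip(sample, mean, axis))


# Hypothetical training "images" with two pixels each, ordered by pose.
training = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
mean, axis = principal_axis(training)
coordinates = [project(s, mean, axis) for s in training]

query = (2.1, 3.9)
q = project(query, mean, axis)
nearest = min(range(len(training)), key=lambda i: abs(coordinates[i] - q))
print(axis, nearest)
```

In the toy data the index of the nearest training sample plays the role of the recovered pose; the cited system interpolates along a manifold rather than snapping to a stored sample.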

2,037 citations


"3D Texture Recognition Using Bidire..." refers methods in this paper

  • ...Principal component analysis is performed on the histogram of filter outputs and recognition is done using the SLAM library (Nene et al., 1994; Murase and Nayar, 1995)....

    [...]

  • ...This approach has been inspired by (Murase and Nayar, 1995), where a similar problem is treated, specifically an object is represented by set of images taken from various poses, and PCA is used to obtain a compact lower-dimensional representation....

    [...]

  • ...In all our experiments, PCA is accomplished by employing the SLAM software library (Nene et al., 1994; Murase and Nayar, 1995)....

    [...]

Journal Article
TL;DR: A unified model to construct a vocabulary of prototype tiny surface patches with associated local geometric and photometric properties, represented as a set of linear Gaussian derivative filter outputs, under different lighting and viewing conditions is provided.
Abstract: We study the recognition of surfaces made from different materials such as concrete, rug, marble, or leather on the basis of their textural appearance. Such natural textures arise from spatial variation of two surface attributes: (1) reflectance and (2) surface normal. In this paper, we provide a unified model to address both these aspects of natural texture. The main idea is to construct a vocabulary of prototype tiny surface patches with associated local geometric and photometric properties. We call these 3D textons. Examples might be ridges, grooves, spots or stripes or combinations thereof. Associated with each texton is an appearance vector, which characterizes the local irradiance distribution, represented as a set of linear Gaussian derivative filter outputs, under different lighting and viewing conditions. Given a large collection of images of different materials, a clustering approach is used to acquire a small (on the order of 100) 3D texton vocabulary. Given a few (1 to 4) images of any material, it can be characterized using these textons. We demonstrate the application of this representation for recognition of the material viewed under novel lighting and viewing conditions. We also illustrate how the 3D texton model can be used to predict the appearance of materials under novel conditions.
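The clustering step that yields a texton vocabulary can be illustrated with a small k-means sketch. The abstract says only that "a clustering approach" is used, so k-means, the deterministic initialisation from the first k points, and the two-dimensional toy responses are all assumptions made for illustration.

```python
from typing import List, Tuple

Vector = Tuple[float, ...]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two response vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Vector], k: int, iterations: int = 20) -> List[Vector]:
    """Plain k-means; centres are initialised from the first k points (an assumption)."""
    centres = list(points[:k])
    for _ in range(iterations):
        clusters: List[List[Vector]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centres[i]))
            clusters[nearest].append(p)
        new_centres = []
        for i, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centres.append(tuple(sum(m[d] for m in members) / len(members)
                                         for d in range(dim)))
            else:
                new_centres.append(centres[i])  # keep the centre of an empty cluster
        if new_centres == centres:
            break
        centres = new_centres
    return centres


# Hypothetical filter responses forming two well-separated groups.
responses = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]
textons = kmeans(responses, k=2)
print(textons)
```

Each returned centre plays the role of one texton; in the cited work the vectors being clustered are much longer, since they concatenate responses across lighting and viewing conditions.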

1,762 citations


"3D Texture Recognition Using Bidire..." refers background or methods in this paper

  • ...A widely used computational approach for encoding the local structural attributes of textures is based on multichannel filtering (Bovik et al., 1990) (Jain et al., 1999) (Randen and Husoy, 1999) (Leung and Malik, 2001)....

    [...]

  • ...As in many approaches in texture literature (Leung and Malik, 2001) (Aksoy and Haralick, 1999) (Puzicha et al., 1999) (Ma and Manjunath, 1996), we cluster the feature space to determine the set of prototypes among the population....

    [...]

  • ...Based on the classic concept of textons, and also based on modern generalizations (Leung and Malik, 2001), we obtain a finite set of local structural features that can be found within a large collection of texture images from various samples....

    [...]

  • ...…appearance and it has been used in numerous recent studies (Koenderink et al., 1999) (van Ginneken et al., 1998) (Leung and Malik, 1999) (Leung and Malik, 2001) (Suen and Healey, 1998) (Dana and Nayar, 1998) (Dana and Nayar, 1999b) (Dana and Nayar, 1999a) (Suen and Healey, 2000) (Liu et al.,…...

    [...]

  • ...The most notable prior work is the 3D texton method (Leung and Malik, 1999) (Leung and Malik, 2001)....

    [...]

Journal Article
Bela Julesz
12 Mar 1981-Nature
TL;DR: Research with texture pairs having identical second-order statistics has revealed that the pre-attentive texture discrimination system cannot globally process third- and higher- order statistics, and that discrimination is the result of a few local conspicuous features, called textons.
Abstract: Research with texture pairs having identical second-order statistics has revealed that the pre-attentive texture discrimination system cannot globally process third- and higher-order statistics, and that discrimination is the result of a few local conspicuous features, called textons. It seems that only the first-order statistics of these textons have perceptual significance, and the relative phase between textons cannot be perceived without detailed scrutiny by focal attention.

1,757 citations


"3D Texture Recognition Using Bidire..." refers background in this paper

  • ...Early psychophysical studies brought evidence that discrimination of black and white textures is based on pre-attentively discriminable conspicuous local features called textons (Julesz, 1981)....

    [...]

Journal Article
TL;DR: An interpretation of image texture as a region code, or carrier of region information, is emphasized and examples are given of both types of texture processing using a variety of real and synthetic textures.
Abstract: A computational approach for analyzing visible textures is described. Textures are modeled as irradiance patterns containing a limited range of spatial frequencies, where mutually distinct textures differ significantly in their dominant characterizing frequencies. By encoding images into multiple narrow spatial frequency and orientation channels, the slowly varying channel envelopes (amplitude and phase) are used to segregate textural regions of different spatial frequency, orientation, or phase characteristics. Thus, an interpretation of image texture as a region code, or carrier of region information, is emphasized. The channel filters used, known as the two-dimensional Gabor functions, are useful for these purposes in several senses: they have tunable orientation and radial frequency bandwidths and tunable center frequencies, and they optimally achieve joint resolution in space and in spatial frequency. By comparing the channel amplitude responses, one can detect boundaries between textures. Locating large variations in the channel phase responses allows discontinuities in the texture phase to be detected. Examples are given of both types of texture processing using a variety of real and synthetic textures.
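A two-dimensional Gabor function is a sinusoidal carrier multiplied by a Gaussian envelope, with the carrier's orientation and frequency acting as the tunable channel parameters. The sketch below builds the real (cosine) part of such a kernel; the isotropic envelope, the parameter names and the specific values are simplifying assumptions and are not taken from the cited paper.

```python
import math
from typing import List


def gabor_kernel(size: int, wavelength: float, theta: float,
                 sigma: float, phase: float = 0.0) -> List[List[float]]:
    """Real part of a Gabor function sampled on a size-by-size grid (size should be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier oscillates along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + yr * yr) / (2.0 * sigma * sigma))
            carrier = math.cos(2.0 * math.pi * xr / wavelength + phase)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


kernel = gabor_kernel(size=7, wavelength=4.0, theta=0.0, sigma=2.0)
print([round(v, 3) for v in kernel[3]])
```

Varying `theta` and `wavelength` over a small set of values would give a bank of orientation and frequency channels of the kind the abstract describes.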

1,582 citations


"3D Texture Recognition Using Bidire..." refers methods in this paper

  • ...A widely used computational approach for encoding the local structural attributes of textures is based on multichannel filtering (Bovik et al., 1990) (Jain et al., 1999) (Randen and Husoy, 1999) (Leung and Malik, 2001)....

    [...]

Journal Article
TL;DR: Most major filtering approaches to texture feature extraction are reviewed and ranked on the basis of extensive experiments; the effect of the filtering is highlighted by keeping the local energy function and the classification algorithm identical for most approaches.
Abstract: In this paper, we review most major filtering approaches to texture feature extraction and perform a comparative study. Filtering approaches included are Laws masks (1980), ring/wedge filters, dyadic Gabor filter banks, wavelet transforms, wavelet packets and wavelet frames, quadrature mirror filters, discrete cosine transform, eigenfilters, optimized Gabor filters, linear predictors, and optimized finite impulse response filters. The features are computed as the local energy of the filter responses. The effect of the filtering is highlighted, keeping the local energy function and the classification algorithm identical for most approaches. For reference, comparisons with two classical nonfiltering approaches, co-occurrence (statistical) and autoregressive (model based) features, are given. We present a ranking of the tested approaches based on extensive experiments.
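The phrase "local energy of the filter responses" can be made concrete with a one-dimensional sketch: filter the signal, square the responses, and average them over a small window. The difference filter, the window length and the toy signals are illustrative assumptions; the review itself compares many filters and does not prescribe this particular combination.

```python
from typing import List, Sequence


def correlate_valid(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Slide the kernel over the signal, keeping only fully overlapping positions."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def local_energy(responses: Sequence[float], window: int) -> List[float]:
    """Square the filter responses, then smooth them with a moving average."""
    squared = [r * r for r in responses]
    return [sum(squared[i:i + window]) / window
            for i in range(len(squared) - window + 1)]


difference_filter = [-1.0, 1.0]                              # responds to local change
textured = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]        # oscillating signal
flat = [2.0] * 8                                             # constant signal

energy_textured = local_energy(correlate_valid(textured, difference_filter), window=3)
energy_flat = local_energy(correlate_valid(flat, difference_filter), window=3)
print(energy_textured, energy_flat)
```

The oscillating signal yields a constant nonzero energy while the flat signal yields zero, so the energy acts as a feature that separates the two regardless of the sign of the individual responses.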

1,567 citations


"3D Texture Recognition Using Bidire..." refers methods in this paper

  • ...A widely used computational approach for encoding the local structural attributes of textures is based on multichannel filtering (Bovik et al., 1990; Jain et al., 1999; Randen and Husoy, 1999; Leung and Malik, 2001)....

    [...]
