
Showing papers on "Contextual image classification published in 2000"


Book ChapterDOI
TL;DR: The SIMPLIcity system represents an image by a set of regions, roughly corresponding to objects and characterized by color, texture, shape, and location, and classifies images into categories intended to distinguish semantically meaningful differences.
Abstract: We present here SIMPLIcity (Semantics-sensitive Integrated Matching for Picture LIbraries), an image retrieval system using semantics classification and integrated region matching (IRM) based upon image segmentation. The SIMPLIcity system represents an image by a set of regions, roughly corresponding to objects, which are characterized by color, texture, shape, and location. The system classifies images into categories which are intended to distinguish semantically meaningful differences, such as textured versus nontextured, indoor versus outdoor, and graph versus photograph. Retrieval is enhanced by narrowing down the searching range in a database to a particular category and exploiting semantically-adaptive searching methods. A measure for the overall similarity between images, the IRM distance, is defined by a region-matching scheme that integrates properties of all the regions in the images. This overall similarity approach reduces the adverse effect of inaccurate segmentation, helps to clarify the semantics of a particular region, and enables a simple querying interface for region-based image retrieval systems. The application of SIMPLIcity to a database of about 200,000 general-purpose images demonstrates accurate retrieval at high speed. The system is also robust to image alterations.

1,475 citations


Book ChapterDOI
26 Jun 2000
TL;DR: A method to learn object class models from unlabeled and unsegmented cluttered scenes for the purpose of visual object recognition; it achieves very good classification results on human faces and rear views of cars.
Abstract: We present a method to learn object class models from unlabeled and unsegmented cluttered scenes for the purpose of visual object recognition. We focus on a particular type of model where objects are represented as flexible constellations of rigid parts (features). The variability within a class is represented by a joint probability density function (pdf) on the shape of the constellation and the output of part detectors. In a first stage, the method automatically identifies distinctive parts in the training set by applying a clustering algorithm to patterns selected by an interest operator. It then learns the statistical shape model using expectation maximization. The method achieves very good classification results on human faces and rear views of cars.

736 citations


Book ChapterDOI
26 Jun 2000
TL;DR: This paper presents a theoretically very simple yet efficient approach for gray scale and rotation invariant texture classification based on local binary patterns and nonparametric discrimination of sample and prototype distributions, which is very robust in terms of gray scale variations, since the operators are by definition invariant against any monotonic transformation of the gray scale.
Abstract: This paper presents a theoretically very simple yet efficient approach for gray scale and rotation invariant texture classification based on local binary patterns and nonparametric discrimination of sample and prototype distributions. The proposed approach is very robust in terms of gray scale variations, since the operators are by definition invariant against any monotonic transformation of the gray scale. Another advantage is computational simplicity, as the operators can be realized with a few operations in a small neighborhood and a lookup table. Excellent experimental results obtained in two true problems of rotation invariance, where the classifier is trained at one particular rotation angle and tested with samples from other rotation angles, demonstrate that good discrimination can be achieved with the statistics of simple rotation invariant local binary patterns. These operators characterize the spatial configuration of local image texture, and the performance can be further improved by combining them with rotation invariant variance measures that characterize the contrast of local image texture. The joint distributions of these orthogonal measures are shown to be very powerful tools for rotation invariant texture analysis.
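As a rough illustration (not the authors' exact operator set), the basic 8-neighbor LBP and a minimal rotation-invariant mapping can be sketched in a few lines of NumPy; thresholding at the center pixel is what makes the code invariant to monotonic gray-scale transforms:

```python
import numpy as np

def lbp_code(patch):
    """Basic 8-neighbor LBP code for a 3x3 patch (clockwise from top-left)."""
    center = patch[1, 1]
    neighbors = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                 patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    code = 0
    for i, n in enumerate(neighbors):
        if n >= center:          # threshold at the center value
            code |= (1 << i)
    return code

def rotation_invariant(code, bits=8):
    """Map an LBP code to the minimum over all circular bit rotations."""
    best = code
    for _ in range(bits - 1):
        code = ((code >> 1) | ((code & 1) << (bits - 1))) & ((1 << bits) - 1)
        best = min(best, code)
    return best
```

Texture description then reduces to histogramming these codes over an image region and comparing histograms nonparametrically.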

646 citations


Proceedings ArticleDOI
24 Apr 2000
TL;DR: This paper presents a new appearance-based place recognition system for topological localization that uses a panoramic vision system to sense the environment and correctly classified between 87% and 98% of the input color images.
Abstract: This paper presents a new appearance-based place recognition system for topological localization. The method uses a panoramic vision system to sense the environment. Color images are classified in real-time based on nearest-neighbor learning, image histogram matching, and a simple voting scheme. The system has been evaluated with eight cross-sequence tests in four unmodified environments, three indoors and one outdoors. In all eight cases, the system successfully tracked the mobile robot's position. The system correctly classified between 87% and 98% of the input color images. For the remaining images, the system was either momentarily confused or uncertain, but never classified an image incorrectly.
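The histogram-matching and nearest-neighbor components can be sketched as follows; the per-channel binning, the intersection measure, and the `classify_place` helper are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def rgb_histogram(image, bins=8):
    """Concatenated per-channel histogram, normalized to sum to 1 per channel."""
    hists = []
    for c in range(3):
        h, _ = np.histogram(image[..., c], bins=bins, range=(0, 256))
        hists.append(h / max(h.sum(), 1))
    return np.concatenate(hists)

def intersection(h1, h2):
    """Histogram intersection similarity: higher means more alike."""
    return np.minimum(h1, h2).sum()

def classify_place(image, references):
    """Nearest-neighbor decision: return the label of the best-matching reference."""
    h = rgb_histogram(image)
    return max(references,
               key=lambda label: intersection(h, rgb_histogram(references[label])))
```

In the real system a voting scheme over several recent frames smooths such per-image decisions.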

629 citations


Proceedings ArticleDOI
15 Jun 2000
TL;DR: An approach for image retrieval using a very large number of highly selective features and efficient online learning based on the assumption that each image is generated by a sparse set of visual "causes" and that images which are visually similar share causes.
Abstract: We present an approach for image retrieval using a very large number of highly selective features and efficient online learning. Our approach is predicated on the assumption that each image is generated by a sparse set of visual "causes" and that images which are visually similar share causes. We propose a mechanism for computing a very large number of highly selective features which capture some aspects of this causal structure (in our implementation there are over 45,000 highly selective features). At query time a user selects a few example images, and a technique known as "boosting" is used to learn a classification function in this feature space. By construction, the boosting procedure learns a simple classifier which only relies on 20 of the features. As a result a very large database of images can be scanned rapidly, perhaps a million images per second. Finally we will describe a set of experiments performed using our retrieval system on a database of 3000 images.

504 citations


Journal ArticleDOI
TL;DR: Experimental results, with up to a 97 percent success rate in classification, show the possibility of using this biometric system in medium/high security environments with full acceptance from all users.
Abstract: This paper presents the definition and implementation of a biometric system based on hand geometry identification. Hand features are extracted from a color photograph taken when the user has placed his hand on a platform designed for such a task. Different pattern recognition techniques, from Euclidean distance to neural networks, have been tested for use in classification and/or verification. Experimental results, with up to a 97 percent success rate in classification, show the possibility of using this system in medium/high security environments with full acceptance from all users.

504 citations


Proceedings ArticleDOI
01 Sep 2000
TL;DR: A new system for personal identification based on iris patterns is presented, which employs the rich 2D information of the iris and is translation, rotation, and scale invariant.
Abstract: A new system for personal identification based on iris patterns is presented in this paper. It is composed of iris image acquisition, image preprocessing, feature extraction and classifier design. The algorithm for iris feature extraction is based on texture analysis using multichannel Gabor filtering and wavelet transform. Compared with existing methods, our method employs the rich 2D information of the iris and is translation, rotation, and scale invariant.
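One stage of such multichannel Gabor filtering can be sketched with plain NumPy; the kernel size, sigma, orientations, and frequencies below are illustrative choices, not the paper's parameters:

```python
import numpy as np

def gabor_kernel(ksize, sigma, theta, freq):
    """Real (even) Gabor kernel: Gaussian envelope times a cosine carrier."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # rotate coordinates to the filter orientation
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    env = np.exp(-(xr**2 + yr**2) / (2 * sigma**2))
    return env * np.cos(2 * np.pi * freq * xr)

def gabor_features(image, thetas=(0, np.pi/4, np.pi/2, 3*np.pi/4),
                   freqs=(0.1, 0.25)):
    """Mean absolute filter response per (orientation, frequency) channel."""
    feats = []
    for theta in thetas:
        for freq in freqs:
            k = gabor_kernel(15, sigma=3.0, theta=theta, freq=freq)
            # circular 2-D convolution via FFT (kernel zero-padded to image size)
            resp = np.fft.ifft2(np.fft.fft2(image) *
                                np.fft.fft2(k, image.shape)).real
            feats.append(np.abs(resp).mean())
    return np.array(feats)
```

A vertically striped texture excites the channel whose carrier frequency and orientation match it far more strongly than the orthogonal channel, which is the property the iris code exploits.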

400 citations


Journal ArticleDOI
TL;DR: The usefulness of the proposed tissue classification method is demonstrated by comparisons with conventional single-channel classification using both synthesized data and clinical data acquired with CT (computed tomography) and MRI (magnetic resonance imaging) scanners.
Abstract: This paper describes a novel approach to tissue classification using three-dimensional (3D) derivative features in the volume rendering pipeline. In conventional tissue classification for a scalar volume, tissues of interest are characterized by an opacity transfer function defined as a one-dimensional (1D) function of the original volume intensity. To overcome the limitations inherent in conventional 1D opacity functions, we propose a tissue classification method that employs a multidimensional opacity function, which is a function of the 3D derivative features calculated from a scalar volume as well as the volume intensity. Tissues of interest are characterized by explicitly defined classification rules based on 3D filter responses highlighting local structures, such as edge, sheet, line, and blob, which typically correspond to tissue boundaries, cortices, vessels, and nodules, respectively, in medical volume data. The 3D local structure filters are formulated using the gradient vector and Hessian matrix of the volume intensity function combined with isotropic Gaussian blurring. These filter responses and the original intensity define a multidimensional feature space in which multichannel tissue classification strategies are designed. The usefulness of the proposed method is demonstrated by comparisons with conventional single-channel classification using both synthesized data and clinical data acquired with CT (computed tomography) and MRI (magnetic resonance imaging) scanners. The improvement in image quality obtained using multichannel classification is confirmed by evaluating the contrast and contrast-to-noise ratio in the resultant volume-rendered images with variable opacity values.

378 citations


Proceedings ArticleDOI
26 Mar 2000
TL;DR: Support vector machines (SVM) are investigated for visual gender classification with low-resolution "thumbnail" faces processed from 1755 images from the FERET face database, demonstrating robustness and relative scale invariance for visual classification.
Abstract: Support vector machines (SVM) are investigated for visual gender classification with low-resolution "thumbnail" faces (21-by-12 pixels) processed from 1755 images from the FERET face database. The performance of SVM (3.4% error) is shown to be superior to traditional pattern classifiers (linear, quadratic, Fisher linear discriminant, nearest-neighbor) as well as more modern techniques such as radial basis function (RBF) classifiers and large ensemble-RBF networks. SVM also out-performed human test subjects at the same task: in a perception study with 30 human test subjects, ranging in age from mid-20s to mid-40s, the average error rate was found to be 32% for the "thumbnails" and 6.7% with higher resolution images. The difference in performance between low- and high-resolution tests with SVM was only 1%, demonstrating robustness and relative scale invariance for visual classification.
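The study's best results used nonlinear-kernel SVMs on thumbnail pixels; as a self-contained sketch of the underlying large-margin idea only, a linear SVM trained with Pegasos-style sub-gradient descent looks like this (all names and parameters are illustrative, and the bias is simply absorbed into the weight vector):

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style sub-gradient training of a linear SVM.
    X: (n, d) features; y: labels in {-1, +1}. Returns weights (d+1,) incl. bias."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])      # absorb bias into weights
    w = np.zeros(d + 1)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)             # decreasing step size
            if y[i] * (w @ Xb[i]) < 1:        # margin violation: hinge gradient
                w = (1 - eta * lam) * w + eta * y[i] * Xb[i]
            else:
                w = (1 - eta * lam) * w
    return w

def predict(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.where(Xb @ w >= 0, 1, -1)
```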

339 citations


Journal ArticleDOI
TL;DR: An algorithm is proposed that models images by two-dimensional (2-D) hidden Markov models (HMMs) and outperforms CART™, LVQ, and Bayes VQ in classification by context.
Abstract: For block-based classification, an image is divided into blocks, and a feature vector is formed for each block by grouping statistics extracted from the block. Conventional block-based classification algorithms decide the class of a block by examining only the feature vector of this block and ignoring context information. In order to improve classification by context, an algorithm is proposed that models images by two dimensional (2-D) hidden Markov models (HMMs). The HMM considers feature vectors statistically dependent through an underlying state process assumed to be a Markov mesh, which has transition probabilities conditioned on the states of neighboring blocks from both horizontal and vertical directions. Thus, the dependency in two dimensions is reflected simultaneously. The HMM parameters are estimated by the EM algorithm. To classify an image, the classes with maximum a posteriori probability are searched jointly for all the blocks. Applications of the HMM algorithm to document and aerial image segmentation show that the algorithm outperforms CART™, LVQ, and Bayes VQ.

296 citations


Journal Article
TL;DR: In this article, relative radiometric normalization (RRN) is used to reduce radiometric differences among images caused by inconsistencies of acquisition conditions rather than changes in surface reflectance.
Abstract: Relative radiometric normalization (RRN) minimizes radiometric differences among images caused by inconsistencies of acquisition conditions rather than changes in surface reflectance. Five methods of RRN have been applied to 1973, 1983, and 1988 Landsat MSS images of the Atlanta area for evaluating their performance in relation to change detection. These methods include pseudoinvariant features (PIF), radiometric control set (RCS), image regression (IR), no-change set determined from scattergrams (NC), and histogram matching (HM), all requiring the use of a reference-subject image pair. They were compared in terms of their capability to improve visual image quality and statistical robustness. The way in which different RRN methods affect the results of information extraction in change detection was explored. It was found that RRN methods which employed a large sample size to relate targets of subject images to the reference image exhibited a better overall performance, but tended to reduce the dynamic range and coefficient of variation of the images, thus undermining the accuracy of image classification. It was also found that visually and statistically robust RRN methods tended to substantially reduce the magnitude of spectral differences which can be linked to meaningful changes in landscapes. Finally, factors affecting the performance of relative radiometric normalization were identified, which include land-use/land-cover distribution, water-land proportion, topographic relief, similarity between the subject and reference images, and sample size.
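The image-regression style of normalization amounts to a least-squares fit of the subject band to the reference band, then applying the fitted gain and offset everywhere. A minimal sketch (the `invariant_mask` argument is an illustrative stand-in for however invariant targets are chosen):

```python
import numpy as np

def regression_normalize(subject, reference, invariant_mask=None):
    """Fit reference = a*subject + b over (pseudo-)invariant pixels by least
    squares, then apply the gain/offset to the whole subject band."""
    if invariant_mask is None:
        invariant_mask = np.ones(subject.shape, dtype=bool)
    a, b = np.polyfit(subject[invariant_mask].ravel(),
                      reference[invariant_mask].ravel(), 1)
    return a * subject + b
```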

Journal ArticleDOI
TL;DR: Computer-aided classification of benign and malignant masses on mammograms is attempted in this study by computing gradient-based and texture-based features based on posterior probabilities computed from Mahalanobis distances.
Abstract: Computer-aided classification of benign and malignant masses on mammograms is attempted in this study by computing gradient-based and texture-based features. Features computed based on gray-level co-occurrence matrices (GCMs) are used to evaluate the effectiveness of textural information possessed by mass regions in comparison with the textural information present in mass margins. A method involving polygonal modeling of boundaries is proposed for the extraction of a ribbon of pixels across mass margins. Two gradient-based features are developed to estimate the sharpness of mass boundaries in the ribbons of pixels extracted from their margins. A total of 54 images (28 benign and 26 malignant) containing 39 images from the Mammographic Image Analysis Society (MIAS) database and 15 images from a local database are analyzed. The best benign versus malignant classification of 82.1%, with an area (A/sub z/) of 0.85 under the receiver operating characteristics (ROC) curve, was obtained with the images from the MIAS database by using GCM-based texture features computed from mass margins. The classification method used is based on posterior probabilities computed from Mahalanobis distances. The corresponding accuracy using jack-knife classification was observed to be 74.4%, with A/sub z/=0.67. Gradient-based features achieved A/sub z/=0.6 on the MIAS database and A/sub z/=0.76 on the combined database. The corresponding values obtained using jack-knife classification were observed to be 0.52 and 0.73 for the MIAS and combined databases, respectively.
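A minimal gray-level co-occurrence sketch, showing how a normalized matrix and a few classic texture statistics are computed (the quantization level count and single displacement here are illustrative choices, not the study's):

```python
import numpy as np

def glcm(image, dx=1, dy=0, levels=8):
    """Gray-level co-occurrence matrix for one displacement, normalized to sum 1.
    Assumes image.max() > 0 for the quantization step."""
    q = (image.astype(float) / image.max() * (levels - 1)).astype(int)
    m = np.zeros((levels, levels))
    h, w = q.shape
    for i in range(max(0, -dy), h - max(0, dy)):
        for j in range(max(0, -dx), w - max(0, dx)):
            m[q[i, j], q[i + dy, j + dx]] += 1
    return m / m.sum()

def glcm_features(m):
    """Contrast, energy, and homogeneity from a normalized GLCM."""
    i, j = np.indices(m.shape)
    return {"contrast": ((i - j) ** 2 * m).sum(),
            "energy": (m ** 2).sum(),
            "homogeneity": (m / (1.0 + np.abs(i - j))).sum()}
```

A constant region gives zero contrast and maximal energy, while a high-frequency stripe pattern gives large contrast, which is why such statistics separate smooth mass interiors from textured margins.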

Proceedings ArticleDOI
26 Mar 2000
TL;DR: The estimation of head pose, which is achieved by using the support vector regression technique, provides crucial information for choosing the appropriate face detector, which helps to improve the accuracy and reduce the computation in multi-view face detection compared to other methods.
Abstract: A support vector machine-based multi-view face detection and recognition framework is described. Face detection is carried out by constructing several detectors, each of them in charge of one specific view. The symmetrical property of face images is employed to simplify the complexity of the modelling. The estimation of head pose, which is achieved by using the support vector regression technique, provides crucial information for choosing the appropriate face detector. This helps to improve the accuracy and reduce the computation in multi-view face detection compared to other methods. For video sequences, further computational reduction can be achieved by using a pose change smoothing strategy. When face detectors find a face in frontal view, a support vector machine-based multi-class classifier is activated for face recognition. All the above issues are integrated under a support vector machine framework. Test results on four video sequences are presented, among them the detection rate is above 95%, recognition accuracy is above 90%, average pose estimation error is around 10°, and the full detection and recognition speed is up to 4 frames/second on a Pentium II 300 PC.

Journal ArticleDOI
TL;DR: The performance of a classifier based only on change is assessed on a range of test sites in the UK, Finland, and Poland, and the possibility of improving performance by including radiometric information in the mapping strategy is discussed.
Abstract: Examination of the physical background underlying the ERS response of forest and analysis of time series of ERS data indicates that the greater temporal stability of forest compared with many other types of land cover presents a means of mapping forest area. The processing chain necessary to make such area estimations involves reconstruction of an optimal estimate of the backscattering coefficient at each pixel using temporal and spatial filtering so that classification rules derived from large scale averaging are applicable. The rationale behind the filtering strategy and the level of averaging needed is explained in terms of the observed multitemporal behavior of forest and nonforest areas; much of this analysis is generic and applicable to a wide range of situations in which significant information is carried by multitemporal features of the data. The choice of decision rules is based on the forest observations, with the added requirement for robustness. The performance of a classifier based only on change is assessed on a range of test sites in the UK, Finland, and Poland. Error sources in this classifier are identified, and the possibility of improving performance by including radiometric information in the mapping strategy is discussed. Brief discussions of how the classification is affected by the addition of coherence and how the processing chain would need to be modified for other forms of satellite data are included.

Proceedings ArticleDOI
24 Sep 2000
TL;DR: An image classification technique that uses the Bayes decision rule for minimum cost to classify pixels into skin color and non-skin color is addressed, and it is robust against different skin colors.
Abstract: This paper addresses an image classification technique that uses the Bayes decision rule for minimum cost to classify pixels into skin color and non-skin color. Color statistics are collected from YCbCr color space. The Bayesian approach to skin color classification is discussed along with an overview of YCbCr color space. Experimental results demonstrate that this approach can achieve good classification outcomes, and it is robust against different skin colors.
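A minimal sketch of the Bayes minimum-cost rule over (Cb, Cr) histograms; the bin count, priors, misclassification costs, and Laplace smoothing below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

BINS = 32
CBCR_RANGE = ((16, 240), (16, 240))   # nominal YCbCr chroma range

def train_skin_model(skin_cbcr, nonskin_cbcr):
    """Class-conditional (Cb, Cr) histograms estimated from labeled pixels."""
    h_skin, _, _ = np.histogram2d(skin_cbcr[:, 0], skin_cbcr[:, 1],
                                  bins=BINS, range=CBCR_RANGE)
    h_non, _, _ = np.histogram2d(nonskin_cbcr[:, 0], nonskin_cbcr[:, 1],
                                 bins=BINS, range=CBCR_RANGE)
    # Laplace smoothing so unseen cells do not yield zero likelihood
    return ((h_skin + 1) / (h_skin.sum() + BINS**2),
            (h_non + 1) / (h_non.sum() + BINS**2))

def classify_pixel(cb, cr, p_skin, p_non,
                   prior_skin=0.3, cost_fp=1.0, cost_fn=1.0):
    """Bayes minimum-cost rule: decide 'skin' when the cost-weighted term wins."""
    b = lambda v: min(int((v - 16) / (240 - 16) * BINS), BINS - 1)
    ls = p_skin[b(cb), b(cr)] * prior_skin * cost_fn
    ln = p_non[b(cb), b(cr)] * (1 - prior_skin) * cost_fp
    return bool(ls > ln)
```

Raising `cost_fn` relative to `cost_fp` shifts the decision boundary toward labeling more pixels as skin, which is the knob the minimum-cost formulation exposes.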

Journal ArticleDOI
TL;DR: It is shown that in nonmetric spaces, boundary points are less significant for capturing the structure of a class than in Euclidean spaces, and it is suggested that atypical points may be more important in describing classes.
Abstract: A key problem in appearance-based vision is understanding how to use a set of labeled images to classify new images. Systems that model human performance, or that use robust image matching methods, often use nonmetric similarity judgments; but when the triangle inequality is not obeyed, most pattern recognition techniques are not applicable. Exemplar-based (nearest-neighbor) methods can be applied to a wide class of nonmetric similarity functions. The key issue, however, is to find methods for choosing good representatives of a class that accurately characterize it. We show that existing condensing techniques are ill-suited to deal with nonmetric dataspaces. We develop techniques for solving this problem, emphasizing two points: First, we show that the distance between images is not a good measure of how well one image can represent another in nonmetric spaces. Instead, we use the vector correlation between the distances from each image to other previously seen images. Second, we show that in nonmetric spaces, boundary points are less significant for capturing the structure of a class than in Euclidean spaces. We suggest that atypical points may be more important in describing classes. We demonstrate the importance of these ideas to learning that generalizes from experience by improving performance. We also suggest ways of applying parametric techniques to supervised learning problems that involve a specific nonmetric distance function, showing how to generalize the idea of linear discriminant functions in a way that may be more useful in nonmetric spaces.

Proceedings ArticleDOI
01 Sep 2000
TL;DR: Two methods for fingerprint image enhancement are proposed, the second using a unique anisotropic filter for direct grayscale enhancement; both show some improvement in the minutiae detection process in terms of either efficiency or time required.
Abstract: Extracting minutiae from fingerprint images is one of the most important steps in automatic fingerprint identification and classification. Minutiae are local discontinuities in the fingerprint pattern, mainly terminations and bifurcations. In this work we propose two methods for fingerprint image enhancement. The first one is carried out using local histogram equalization, Wiener filtering, and image binarization. The second method uses a unique anisotropic filter for direct grayscale enhancement. The results achieved are compared with those obtained through some other methods. Both methods show some improvement in the minutiae detection process in terms of either efficiency or time required.
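Two of the first method's building blocks can be sketched in NumPy; note this uses global histogram equalization and Otsu thresholding as stand-ins for the paper's local equalization and Wiener filtering, so it is an approximation of the pipeline, not the authors' implementation:

```python
import numpy as np

def hist_equalize(image):
    """Global histogram equalization of an 8-bit grayscale image via the CDF.
    Assumes the image is not constant."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
    return lut[image]

def otsu_binarize(image):
    """Binarize with Otsu's threshold (maximum between-class variance)."""
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = p[:t].sum(), p[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        m0 = (np.arange(t) * p[:t]).sum() / w0
        m1 = (np.arange(t, 256) * p[t:]).sum() / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return (image >= best_t).astype(np.uint8)
```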

Journal ArticleDOI
TL;DR: The algorithm estimates the density of each class and is able to model class distributions with non-Gaussian structure and can improve classification accuracy compared with standard Gaussian mixture models.
Abstract: An unsupervised classification algorithm is derived by modeling observed data as a mixture of several mutually exclusive classes that are each described by linear combinations of independent, non-Gaussian densities. The algorithm estimates the density of each class and is able to model class distributions with non-Gaussian structure. The new algorithm can improve classification accuracy compared with standard Gaussian mixture models. When applied to blind source separation in nonstationary environments, the method can switch automatically between classes, which correspond to contexts with different mixing properties. The algorithm can learn efficient codes for images containing both natural scenes and text. This method shows promise for modeling non-Gaussian structure in high-dimensional data and has many potential applications.

Journal ArticleDOI
TL;DR: A methodology is presented for computing a set of univariate and multivariate textural measures of spatial variability based on several variogram estimators, together with an application of this methodology to lithological discrimination using a Landsat-5 TM image.

Journal ArticleDOI
TL;DR: In this article, a variational model devoted to image classification coupled with an edge-preserving regularization process is presented, which helps to produce images composed of homogeneous regions with regularized boundaries.
Abstract: We present a variational model devoted to image classification coupled with an edge-preserving regularization process. The discrete nature of classification (i.e., to attribute a label to each pixel) has led to the development of many probabilistic image classification models, but rarely to variational ones. In the last decade, the variational approach has proven its efficiency in the field of edge-preserving restoration. We add a classification capability which helps to produce images composed of homogeneous regions with regularized boundaries, a region being defined as a set of pixels belonging to the same class. The soundness of our model is based on the works developed on the phase transition theory in mechanics. The proposed algorithm is fast, easy to implement, and efficient. We compare our results on both synthetic and satellite images with the ones obtained by a stochastic model using a Potts regularization.

Journal ArticleDOI
01 Apr 2000
TL;DR: In this paper, a spatial fuzzy clustering algorithm that exploits the spatial contextual information in image data is presented, which is adaptive to the image content in the sense that influence from the neighbouring pixels is suppressed in nonhomogeneous regions in the image.
Abstract: The authors present a spatial fuzzy clustering algorithm that exploits the spatial contextual information in image data. The objective functional of their method utilises a new dissimilarity index that takes into account the influence of the neighbouring pixels on the centre pixel in a 3×1 window. The algorithm is adaptive to the image content in the sense that influence from the neighbouring pixels is suppressed in nonhomogeneous regions in the image. A cluster merging scheme that merges two clusters based on their closeness and their degree of overlap is presented. Through this merging scheme, an 'optimal' number of clusters can be determined automatically as iteration proceeds. Experimental results with synthetic and real images indicate that the proposed algorithm is more tolerant to noise, better at resolving classification ambiguity and coping with different cluster shape and size than the conventional fuzzy c-means algorithm.
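For reference, the conventional fuzzy c-means baseline that the proposed spatial variant extends can be sketched as follows; this is plain FCM with no spatial dissimilarity term or cluster merging:

```python
import numpy as np

def fcm(X, c=2, m=2.0, iters=100, seed=0):
    """Standard fuzzy c-means: alternate membership and centroid updates.
    X: (n, d) data; returns (memberships (n, c), centroids (c, d))."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(X), c))
    u /= u.sum(axis=1, keepdims=True)          # memberships sum to 1 per point
    for _ in range(iters):
        um = u ** m                            # fuzzified memberships
        centers = (um.T @ X) / um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)               # avoid division by zero
        inv = d ** (-2.0 / (m - 1))            # classic FCM membership update
        u = inv / inv.sum(axis=1, keepdims=True)
    return u, centers
```

The spatial variant modifies the distance term `d` with a neighbourhood-dependent dissimilarity, which is what suppresses noise in homogeneous regions.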

Proceedings ArticleDOI
01 Jan 2000
TL;DR: The possibility of including an unlabeled data set to compensate for the insufficiency of labeled data is investigated; the proposed algorithm, Discriminant EM (D-EM), not only estimates the parameters of a generative model but also finds a linear transformation to relax the assumption of probabilistic structure of data distributions as well as to select good features automatically.
Abstract: In many vision applications, the practice of supervised learning faces several difficulties, one of which is that insufficient labeled training data result in poor generalization. In image retrieval, we have very few labeled images from query and relevance feedback, so it is hard to automatically weight image features and select similarity metrics for image classification. This paper investigates the possibility of including an unlabeled data set to compensate for the insufficiency of labeled data. Different from most current research in image retrieval, the proposed approach tries to cast image retrieval as a transductive learning problem, in which the generalization of an image classifier is only defined on a set of images such as the given image database. Formulating this transductive problem in a probabilistic framework, the proposed algorithm, Discriminant EM (D-EM), not only estimates the parameters of a generative model but also finds a linear transformation to relax the assumption of probabilistic structure of data distributions as well as to select good features automatically. Our experiments show that D-EM has a satisfactory performance in image retrieval applications. The D-EM algorithm also has potential in many other applications.

Journal ArticleDOI
TL;DR: This paper provides an overview of current research in image information retrieval and an outline of areas for future research; the approach is broad and interdisciplinary and focuses on three aspects of image retrieval (IR): text-based retrieval, content-based retrieval, and user interactions with image information retrieval systems.
Abstract: Introduction Interest in image retrieval has increased in large part due to the rapid growth of the World Wide Web. According to a recent study (Lawrence & Giles, 1999) there are 180 million images on the publicly indexable Web, a total amount of image data of about 3Tb [terabytes], and an astounding one million or more digital images are being produced every day (Jain, 1993). The need to find a desired image from a collection is shared by many groups, including journalists, engineers, historians, designers, teachers, artists, and advertising agencies. Image needs and uses across users in these groups vary considerably. Users may require access to images based on primitive features such as color, texture or shape, or users may require access to images based on abstract concepts and symbolic imagery. The technology to access these images has also accelerated phenomenally and at present surpasses our understanding of how users interact with visual information. This paper provides an overview of current research in image information retrieval and provides an outline of areas for future research. The approach is broad and interdisciplinary and focuses on three aspects of image retrieval (IR): text-based retrieval, content-based retrieval, and user interactions with image information retrieval systems. The review concludes with a call for image retrieval evaluation studies similar to TREC. Text-Based Image Retrieval Research Most existing IR systems are text-based, but images frequently have little or no accompanying textual information. The solution historically has been to develop text-based ontologies and classification schemes for image description. Text-based indexing has many strengths including the ability to represent both general and specific instantiations of an object at varying levels of complexity. Reviews of the literature pertaining primarily to text-based approaches include Rasmussen (1997), Lancaster (1998), Lunin (1987), and Cawkell (1993).
Long before images could be digitized, access to image collections was provided by librarians, curators, and archivists through text descriptors or classification codes. These indexing schemes were often developed in-house and reflect the unique characteristics of a particular collection or clientele. This is still common practice and recently Zheng (1999) and Goodrum & Martin (1997) have reported on the hybridization of multiple schemas for classifying collections of historic costume collections. Hourihane (1989) has also reviewed a number of these unique systems for image classification. To date, very little research has been conducted on the relative effectiveness of these various approaches to image indexing in electronic environments. Attempts to provide general systems for image indexing include the Getty's Art and Architecture Thesaurus (AAT), which consists of over 120,000 terms for the description of art, art history, architecture, and other cultural objects, and the Library of Congress Thesaurus of Graphic Materials (LCTGM). The AAT currently provides access to thirty-three hierarchical categories of image description using seven broad facets (Associated Concepts, Physical Attributes, Styles and Periods, Agents, Activities, Materials, and Objects). The approach in many collections, particularly general library environments, has been to apply an existing cataloging system like the Dewey Decimal System to image description using the LCTGM, or ICONCLASS. Assignment of terms to describe images is not solved entirely by the use of controlled vocabularies or classification schemes however. The textual representation of images is problematic because images convey information relating to what is actually depicted in the image as well as what the image is about. Shatford (1986) posits this discussion within a framework based on Panofsky's approach to analyzing iconographical levels of meaning in images (1955). …

Proceedings ArticleDOI
10 Sep 2000
TL;DR: This paper proposes an approach to utilize both positive and negative feedbacks for image retrieval that releases the user from manually providing preference weight for each positive example.
Abstract: By using relevance feedback, content-based image retrieval (CBIR) allows the user to retrieve images interactively. Beginning with a coarse query, the user can select the most relevant images and provide a weight of preference for each relevant image to refine the query. The high level concept borne by the user and perception subjectivity of the user can be automatically captured by the system to some degree. This paper proposes an approach to utilize both positive and negative feedbacks for image retrieval. Support vector machines (SVM) is applied to classifying the positive and negative images. The SVM learning results are used to update the preference weights for the relevant images. This approach releases the user from manually providing preference weight for each positive example. Experimental results show that the proposed approach has improvement over the previous approach (Rui et al. 1997) that uses positive examples only.
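The feedback loop described above can be sketched as follows. Everything here is illustrative: the 8-D colour/texture features and the feedback sets are synthetic, and the rule of turning SVM decision values into preference weights is one plausible reading of the idea, not the authors' implementation.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical 8-D colour/texture features for user-marked feedback images.
positives = rng.normal(loc=1.0, scale=0.5, size=(10, 8))   # marked relevant
negatives = rng.normal(loc=-1.0, scale=0.5, size=(10, 8))  # marked non-relevant

X = np.vstack([positives, negatives])
y = np.array([1] * 10 + [0] * 10)

# Learn the positive/negative boundary from this feedback round.
clf = SVC(kernel="rbf", gamma="scale").fit(X, y)

# Use the signed distance to the boundary as an automatic preference
# weight for each positive example (larger margin -> stronger weight),
# so the user never has to enter weights by hand.
margins = clf.decision_function(positives)
weights = margins / margins.sum()
```

A retrieval front end would then use `weights` to re-rank the database for the next query round, which is the step the abstract says no longer needs manual input.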

Proceedings ArticleDOI
01 Sep 2000
TL;DR: This paper shows that a properly selected subset of patterns encoded in LBP forms an efficient and robust texture description which can achieve better classification rates in comparison with the whole LBP histogram.
Abstract: Recently, a nonparametric approach to texture analysis has been developed, in which the distributions of simple texture measures based on local binary patterns (LBP) are used for texture description. The basic LBP encodes 256 simple feature detectors in a single 3/spl times/3 operator. This paper shows that a properly selected subset of patterns encoded in LBP forms an efficient and robust texture description which can achieve better classification rates in comparison with the whole LBP histogram. Experiments on classification of textures from the Columbia-Utrecht (CURET) database demonstrate the robustness of the approach.
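The basic 3x3 LBP operator the abstract builds on can be sketched directly in NumPy: each interior pixel is compared with its eight neighbours, and the resulting bits form a code in 0-255 whose histogram describes the texture. The toy image and neighbour ordering are illustrative; the paper's contribution, selecting a robust subset of the 256 patterns, is not reproduced here.

```python
import numpy as np

def lbp_3x3(img):
    """Basic 256-bin local binary pattern over a 2-D grayscale array."""
    img = np.asarray(img, dtype=float)
    c = img[1:-1, 1:-1]  # centre pixels (borders have no full neighbourhood)
    # Eight neighbours in a fixed clockwise order, one bit each.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        nb = img[1 + dy: img.shape[0] - 1 + dy,
                 1 + dx: img.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.uint8) << bit)
    return code

# Toy 4x4 image: every interior pixel sees all eight neighbours >= itself,
# so every code is 255 (all bits set).
img = np.array([[9, 9, 9, 9],
                [9, 1, 1, 9],
                [9, 1, 1, 9],
                [9, 9, 9, 9]])
codes = lbp_3x3(img)
hist = np.bincount(codes.ravel(), minlength=256)
```

The 256-bin `hist` is the "whole LBP histogram" of the abstract; the paper's method would keep only a selected subset of its bins.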

Proceedings ArticleDOI
01 Jul 2000 - Versus
TL;DR: It is shown that analysis of trajectories may be carried out in a model-free fashion, using self-organising feature map neural networks to learn the characteristics of normal trajectories, and to detect novel ones.
Abstract: This paper presents an approach to the problem of automatically classifying events detected by video surveillance systems; specifically, of detecting unusual or suspicious movements. Approaches to this problem typically involve building complex 3D-models in real-world coordinates to provide trajectory information for the classifier. We show that analysis of trajectories may be carried out in a model-free fashion, using self-organising feature map neural networks to learn the characteristics of normal trajectories, and to detect novel ones. Trajectories are represented in 2D image coordinates. First and second order motion information is also generated, with moving-average smoothing. This allows novelty detection to be applied on a point-by-point basis in real time, and permits both instantaneous motion and whole trajectory motion to be subjected to novelty detection.
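A model-free novelty detector in this spirit can be sketched with a tiny hand-rolled 1-D self-organising map: train the map on 2-D points from normal trajectories, then flag any point whose quantisation error exceeds the worst error seen on normal data. The trajectory data, map size, learning schedule, and threshold rule are all illustrative assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(1)

# "Normal" trajectory points in 2-D image coordinates: a noisy diagonal path.
t = rng.uniform(0, 1, size=400)
normal = np.column_stack([t, t]) + rng.normal(0, 0.02, size=(400, 2))

# Minimal 1-D self-organising map: a chain of 10 units trained on the points.
units = np.linspace(0, 1, 10)[:, None] * np.ones((1, 2))  # initial chain
for epoch in range(20):
    lr = 0.5 * (1 - epoch / 20)                  # decaying learning rate
    for p in normal[rng.permutation(400)]:
        w = np.argmin(np.linalg.norm(units - p, axis=1))   # winning unit
        for j in range(10):                      # Gaussian neighbourhood update
            h = np.exp(-((j - w) ** 2) / 2.0)
            units[j] += lr * h * (p - units[j])

def qerr(p):
    """Quantisation error: distance from a point to its nearest map unit."""
    return np.linalg.norm(units - p, axis=1).min()

# Calibrate the novelty threshold on the normal data itself.
threshold = max(qerr(p) for p in normal)
novel_point = np.array([0.1, 0.9])   # far off the learned trajectory
is_novel = qerr(novel_point) > threshold
```

Because `qerr` is evaluated per point, this matches the abstract's point-by-point, real-time novelty test; whole-trajectory novelty would aggregate the per-point errors.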

Journal ArticleDOI
TL;DR: Presents a fully automatic three-dimensional classification of brain tissues for Magnetic Resonance (MR) images using Markov random field (MRF) models and the multifractal dimension, describing the topology of the brain, is added to the MRFs to improve discrimination of the mixclasses.
Abstract: Presents a fully automatic three-dimensional classification of brain tissues for Magnetic Resonance (MR) images. An MR image volume may be composed of a mixture of several tissue types due to partial volume effects. Therefore, the authors consider that in a brain dataset there are not only the three main types of brain tissue: gray matter, white matter, and cerebrospinal fluid, called pure classes, but also mixtures, called mixclasses. A statistical model of the mixtures is proposed and studied by means of simulations. It is shown that it can be approximated by a Gaussian function under some conditions. The D'Agostino-Pearson normality test is used to assess the risk α of the approximation. In order to classify a brain into three types of brain tissue and deal with the problem of partial volume effects, the proposed algorithm uses two steps: (1) segmentation of the brain into pure and mixclasses using the mixture model; (2) reclassification of the mixclasses into the pure classes using knowledge about the obtained pure classes. Both steps use Markov random field (MRF) models. The multifractal dimension, describing the topology of the brain, is added to the MRFs to improve discrimination of the mixclasses. The algorithm is evaluated using both simulated images and real MR images with different T1-weighted acquisition sequences.
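The two-stage idea (an intensity likelihood model, then an MRF smoothness prior) can be illustrated on a synthetic 1-D "slice" with three pure classes. The mixclass machinery and the multifractal term are omitted, the tissue means and `beta` are made-up values, and the single ICM sweep stands in for a full MRF optimisation.

```python
import numpy as np

rng = np.random.default_rng(2)

# Three pure tissue classes with Gaussian intensities (stand-ins for
# CSF, gray matter, and white matter).
means = np.array([30.0, 90.0, 150.0])
sigma = 12.0
true_labels = np.repeat([0, 1, 2], 50)
intensity = rng.normal(means[true_labels], sigma)

# Step 1: maximum-likelihood labelling from the intensity model alone.
ll = -((intensity[:, None] - means[None, :]) ** 2) / (2 * sigma ** 2)
labels = ll.argmax(axis=1)

# Step 2: one ICM sweep with a Potts-style MRF prior that penalises
# disagreement with the two spatial neighbours (beta is hypothetical).
beta = 2.0
for i in range(1, len(labels) - 1):
    neigh = [labels[i - 1], labels[i + 1]]
    energy = -ll[i] + beta * np.array(
        [sum(k != c for k in neigh) for c in range(3)])
    labels[i] = energy.argmin()

accuracy = (labels == true_labels).mean()
```

The MRF term mostly flips isolated noisy voxels back to the label of their neighbourhood, which is why accuracy improves over the purely intensity-based step.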

Proceedings ArticleDOI
26 Mar 2000
TL;DR: A novel statistical approach to hand segmentation based on Bayes decision theory that generates a hand color model and a background color model for a given image, and uses these models to classify each pixel in the image as either a hand pixel or a background pixel.
Abstract: Hand segmentation is a prerequisite for many gesture recognition tasks. Color has been widely used for hand segmentation. However, many approaches rely on predefined skin color models. It is very difficult to predefine a color model in a mobile application where the light condition may change dramatically over time. We propose a novel statistical approach to hand segmentation based on Bayes decision theory. The proposed method requires no predefined skin color model. Instead it generates a hand color model and a background color model for a given image, and uses these models to classify each pixel in the image as either a hand pixel or a background pixel. Models are generated using a Gaussian mixture model with the restricted EM algorithm. Our method is capable of segmenting hands of arbitrary color in a complex scene. It performs well even when there is a significant overlap between hand and background colors, or when the user wears gloves. We show that the Bayes decision method is superior to a commonly used method by comparing their upper bound performance. Experimental results demonstrate the feasibility of the proposed method.
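A stripped-down version of the Bayes decision rule can be sketched with one Gaussian colour model per class. The paper fits Gaussian mixtures with a restricted EM algorithm; a single Gaussian is used here only to keep the sketch short, and all colour values and the prior `p_hand` are invented.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical RGB samples drawn from labelled training regions.
hand = rng.normal([180, 120, 100], 15, size=(300, 3))  # reddish hand pixels
bg = rng.normal([60, 110, 150], 20, size=(300, 3))     # bluish background

def fit_gaussian(samples):
    """One Gaussian colour model per class (a mixture in the actual paper)."""
    mu = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)
    return mu, np.linalg.inv(cov), np.linalg.slogdet(cov)[1]

def loglik(pixel, model):
    mu, inv, logdet = model
    d = pixel - mu
    return -0.5 * (d @ inv @ d + logdet)

hand_model, bg_model = fit_gaussian(hand), fit_gaussian(bg)

def is_hand(pixel, p_hand=0.3):
    """Bayes decision rule: pick the class with the larger posterior."""
    return (loglik(pixel, hand_model) + np.log(p_hand)
            > loglik(pixel, bg_model) + np.log(1 - p_hand))

label = is_hand(np.array([175.0, 125.0, 95.0]))
```

Because both models are estimated from the given image rather than predefined, this is the property that lets the method cope with arbitrary hand colours, including gloves.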

Proceedings ArticleDOI
28 Mar 2000
TL;DR: The system is trained from examples to classify faces on the basis of high-level attributes, such as sex, "race", and expression, using linear discriminant analysis (LDA) on the Gabor representation.
Abstract: A method for automatically classifying facial images is proposed. Faces are represented using elastic graphs labelled with 2D Gabor wavelet features. The system is trained from examples to classify faces on the basis of high-level attributes, such as sex, "race", and expression, using linear discriminant analysis (LDA). Use of the Gabor representation relaxes the requirement for precise normalization of the face: approximate registration of a facial graph is sufficient. LDA allows simple and rapid training from examples, as well as straightforward interpretation of the role of the input features for classification. The algorithm is tested on three different facial image datasets, one of which was acquired under relatively uncontrolled conditions, on tasks of sex, "race" and expression classification. Results of these tests are presented. The discriminant vectors may be interpreted in terms of the saliency of the input features for the different classification tasks, which we portray visually with feature saliency maps for node position as well as filter spatial frequency and orientation.
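The LDA stage can be sketched with a hand-computed Fisher discriminant on synthetic stand-ins for the Gabor features; the absolute values of the discriminant vector then play the role of the feature-saliency map described above. The dimensionality and class structure are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Stand-in for Gabor-jet responses at graph nodes: 20-D features where
# only the first two dimensions carry the class difference (e.g. "sex").
n = 100
X0 = rng.normal(0, 1, size=(n, 20)); X0[:, :2] += 2.0   # class 0
X1 = rng.normal(0, 1, size=(n, 20))                     # class 1

# Fisher linear discriminant: w = Sw^{-1} (mu0 - mu1).
mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)
w = np.linalg.solve(Sw, mu0 - mu1)

# |w| acts as a feature-saliency map: large entries mark the inputs
# that drive the classification.
saliency = np.abs(w)
top_two = set(np.argsort(saliency)[-2:].tolist())

# Classify by projecting onto w and thresholding at the projected midpoint.
threshold = w @ (mu0 + mu1) / 2
pred0 = (X0 @ w > threshold).mean()   # fraction of class 0 labelled correctly
```

In the paper the entries of `w` are grouped by graph node and by Gabor filter frequency/orientation to produce the visual saliency maps; here `top_two` simply recovers the two planted discriminative dimensions.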

Journal ArticleDOI
TL;DR: Since untrained classes are commonly encountered, it may be more appropriate to use approaches such as the PCM in addition to, or instead of, the FCM to enhance the extraction of land cover information from remotely sensed data.
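A minimal fuzzy c-means (FCM) loop makes the issue concrete: because memberships are forced to sum to one across classes, a pixel from an untrained class still receives high membership in some trained class, which is the behaviour the possibilistic c-means (PCM) relaxes. The two-band spectral data and all parameters below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy spectral data: two land-cover clusters in a 2-band feature space.
X = np.vstack([rng.normal([0.2, 0.3], 0.05, size=(50, 2)),
               rng.normal([0.7, 0.6], 0.05, size=(50, 2))])

# Fuzzy c-means: alternate membership and centroid updates (fuzzifier m = 2).
c, m = 2, 2.0
centroids = X[rng.choice(len(X), c, replace=False)]
for _ in range(30):
    # Distance of every pixel to every centroid (small epsilon avoids 0).
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2) + 1e-12
    # Standard FCM membership: u_ik = d_ik^(-2/(m-1)) / sum_j d_ij^(-2/(m-1)).
    u = 1.0 / (d ** (2 / (m - 1)) *
               np.sum(d ** (-2 / (m - 1)), axis=1, keepdims=True))
    # Centroids as membership-weighted means.
    centroids = (u.T ** m @ X) / np.sum(u.T ** m, axis=1, keepdims=True)

# The constraint the TL;DR alludes to: memberships sum to one per pixel,
# so even a pixel from an untrained class must be shared among the
# trained classes. PCM drops this constraint.
row_sums = u.sum(axis=1)
```

A PCM variant would replace the normalised membership update with a typicality update per class, so an outlying pixel could receive low values in every class at once.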