
Showing papers on "Feature vector published in 1999"


Proceedings Article
29 Nov 1999
TL;DR: An algorithm, DAGSVM, is presented that operates in a kernel-induced feature space and uses two-class maximal margin hyperplanes at each decision node of the DDAG; it is substantially faster to train and evaluate than either the standard algorithm or Max Wins, while maintaining comparable accuracy to both.
Abstract: We present a new learning architecture: the Decision Directed Acyclic Graph (DDAG), which is used to combine many two-class classifiers into a multiclass classifier. For an N-class problem, the DDAG contains N(N - 1)/2 classifiers, one for each pair of classes. We present a VC analysis of the case when the node classifiers are hyperplanes; the resulting bound on the test error depends on N and on the margin achieved at the nodes, but not on the dimension of the space. This motivates an algorithm, DAGSVM, which operates in a kernel-induced feature space and uses two-class maximal margin hyperplanes at each decision-node of the DDAG. The DAGSVM is substantially faster to train and evaluate than either the standard algorithm or Max Wins, while maintaining comparable accuracy to both of these algorithms.
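As an illustration of the DDAG evaluation scheme described above, the following Python sketch walks a list of candidate classes through pairwise decisions, eliminating one class per node; the `pairwise` classifier interface is an assumption for illustration, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code) of evaluating a Decision DAG
# built from pairwise classifiers. `pairwise` is assumed to map a class
# pair (i, j) to a binary decision function returning i or j for a sample x.

def ddag_predict(x, classes, pairwise):
    """Return the predicted class for x using the DDAG elimination scheme.

    classes  -- list of the N class labels
    pairwise -- dict {(i, j): f} where f(x) returns either i or j
    """
    remaining = list(classes)
    # Each node compares the first and last surviving classes and eliminates
    # the loser; N - 1 node evaluations reach a leaf.
    while len(remaining) > 1:
        i, j = remaining[0], remaining[-1]
        f = pairwise[(i, j)] if (i, j) in pairwise else pairwise[(j, i)]
        winner = f(x)
        if winner == i:
            remaining.pop()       # class j eliminated
        else:
            remaining.pop(0)      # class i eliminated
    return remaining[0]
```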

1,857 citations


Proceedings Article
29 Nov 1999
TL;DR: The algorithm is a natural extension of the support vector algorithm to the case of unlabelled data and is regularized by controlling the length of the weight vector in an associated feature space.
Abstract: Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified ν between 0 and 1. We propose a method to approach this problem by trying to estimate a function f which is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. We provide a theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabelled data.
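A minimal usage sketch of this single-class formulation, using scikit-learn's OneClassSVM (which implements the ν-parameterised one-class SVM); the data and parameter values here are purely illustrative.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(200, 2))   # unlabelled data drawn from P

# nu upper-bounds the fraction of training points treated as outliers
# (and lower-bounds the fraction of support vectors).
clf = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale").fit(X_train)

X_test = np.array([[0.1, -0.2], [4.0, 4.0]])
print(clf.predict(X_test))            # +1 inside the estimated region S, -1 outside
print(clf.decision_function(X_test))  # signed value of the learned function f
```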

1,851 citations


Journal ArticleDOI
TL;DR: It is observed that a simple remapping of the input x_i → x_i^a improves the performance of linear SVMs to such an extent that it makes them, for this problem, a valid alternative to RBF kernels.
Abstract: Traditional classification approaches generalize poorly on image classification tasks, because of the high dimensionality of the feature space. This paper shows that support vector machines (SVM) can generalize well on difficult image classification problems where the only features are high dimensional histograms. Heavy-tailed RBF kernels of the form K(x, y) = exp(−ρ Σ_i |x_i^a − y_i^a|^b) with a ≤ 1 and b ≤ 2 are evaluated on the classification of images extracted from the Corel stock photo collection and shown to far outperform traditional polynomial or Gaussian radial basis function (RBF) kernels. Moreover, we observed that a simple remapping of the input x_i → x_i^a improves the performance of linear SVMs to such an extent that it makes them, for this problem, a valid alternative to RBF kernels.
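The kernel from the abstract can be written down directly; the sketch below is an illustrative Python implementation of K(x, y) = exp(−ρ Σ_i |x_i^a − y_i^a|^b), with parameter values chosen only as examples.

```python
import numpy as np

def heavy_tailed_rbf(x, y, rho=1.0, a=0.5, b=1.0):
    """Heavy-tailed RBF kernel with a <= 1 and b <= 2.

    a = 1, b = 2 recovers the usual Gaussian RBF; a = 1, b = 1 gives a
    Laplacian-style kernel. Histogram inputs are assumed non-negative so
    the power x**a is well defined.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.exp(-rho * np.sum(np.abs(x**a - y**a) ** b))

# The remapping x_i -> x_i^a alone (e.g. a = 0.5, a square root) is what the
# paper reports as making *linear* SVMs competitive on histogram inputs.
hist1 = np.array([0.2, 0.5, 0.3])
hist2 = np.array([0.1, 0.6, 0.3])
print(heavy_tailed_rbf(hist1, hist2, rho=1.0, a=0.5, b=1.0))
```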

1,510 citations


Proceedings ArticleDOI
01 Aug 1999
TL;DR: An unsupervised, near-linear time text clustering system that offers a number of algorithm choices for each phase, and a refinement to center adjustment, “vector average damping,” that further improves cluster quality.
Abstract: Clustering is a powerful technique for large-scale topic discovery from text. It involves two phases: first, feature extraction maps each document or record to a point in high-dimensional space, then clustering algorithms automatically group the points into a hierarchy of clusters. We describe an unsupervised, near-linear time text clustering system that offers a number of algorithm choices for each phase. We introduce a methodology for measuring the quality of a cluster hierarchy in terms of F-measure, and present the results of experiments comparing different algorithms. The evaluation considers some feature selection parameters (tf-idf and feature vector length) but focuses on the clustering algorithms, namely techniques from Scatter/Gather (buckshot, fractionation, and split/join) and k-means. Our experiments suggest that continuous center adjustment contributes more to cluster quality than seed selection does. It follows that using a simpler seed selection algorithm gives a better time/quality tradeoff. We describe a refinement to center adjustment, "vector average damping," that further improves cluster quality. We also compare the near-linear time algorithms to a group average greedy agglomerative clustering algorithm to demonstrate the time/quality tradeoff quantitatively.
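As a rough illustration of continuous center adjustment with damping, the sketch below blends each centroid with the mean of its assigned points on every pass; the blend rule and the damping value are assumptions, not the paper's exact "vector average damping" formula.

```python
import numpy as np

def kmeans_damped(X, k, iters=20, damping=0.5, seed=0):
    """k-means-style clustering with damped center adjustment (illustrative)."""
    X = np.asarray(X, float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # assign each document vector to its nearest center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # damped update: keep part of the old center, move partly to the members' mean
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = damping * centers[j] + (1 - damping) * members.mean(axis=0)
    return labels, centers
```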

958 citations


Book ChapterDOI
TL;DR: This work indexes the blob descriptions using a lower-rank approximation to the high-dimensional distance to make large-scale retrieval feasible, and shows encouraging results for both querying and indexing.
Abstract: Blobworld is a system for image retrieval based on finding coherent image regions which roughly correspond to objects. Each image is automatically segmented into regions ("blobs") with associated color and texture descriptors. Querying is based on the attributes of one or two regions of interest, rather than a description of the entire image. In order to make large-scale retrieval feasible, we index the blob descriptions using a tree. Because indexing in the high-dimensional feature space is computationally prohibitive, we use a lower-rank approximation to the high-dimensional distance. Experiments show encouraging results for both querying and indexing.

896 citations


Proceedings Article
01 Jan 1999
TL;DR: A formulation of the SVM is proposed that enables a multi-class pattern recognition problem to be solved in a single optimisation and a similar generalization of linear programming machines is proposed.
Abstract: The solution of binary classification problems using support vector machines (SVMs) is well developed, but multi-class problems with more than two classes have typically been solved by combining independently produced binary classifiers. We propose a formulation of the SVM that enables a multi-class pattern recognition problem to be solved in a single optimisation. We also propose a similar generalization of linear programming machines. We report experiments using benchmark datasets in which these two methods achieve a reduction in the number of support vectors and kernel calculations needed. 1. k-Class Pattern Recognition The k-class pattern recognition problem is to construct a decision function given ℓ iid (independent and identically distributed) samples (points) of an unknown function, typically with noise: (x_1, y_1), ..., (x_ℓ, y_ℓ), where x_i, i = 1, ..., ℓ, is a vector of length d and y_i ∈ {1, ..., k} represents the class of the sample. A natural loss function is the number of mistakes made. 2. Solving k-Class Problems with Binary SVMs For the binary pattern recognition problem (the case k = 2), the support vector approach has been well developed [3, 5]. The classical approach to solving k-class pattern recognition problems is to consider the problem as a collection of binary classification problems. In the one-versus-rest method one constructs k classifiers, one for each class. The n-th classifier constructs a hyperplane between class n and the k − 1 other classes. A particular point is assigned to the class for which the distance from the margin, in the positive direction (i.e. in the direction in which class "one" lies rather than class "rest"), is maximal. This method has been used widely in ...
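The one-versus-rest decision rule described in this extract can be illustrated with standard tooling; the sketch below uses scikit-learn's LinearSVC decision values as the "distance from the margin" and does not implement the paper's single-optimisation multi-class SVM.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.datasets import make_classification

# Synthetic 3-class data, purely for illustration.
X, y = make_classification(n_samples=300, n_features=10, n_informative=6,
                           n_classes=3, n_clusters_per_class=1, random_state=0)

# One binary classifier per class: class n vs. the rest.
classifiers = []
for n in sorted(set(y)):
    clf = LinearSVC(max_iter=5000).fit(X, (y == n).astype(int))
    classifiers.append(clf)

# Assign each point to the class whose hyperplane gives the largest signed distance.
scores = np.column_stack([c.decision_function(X) for c in classifiers])
y_pred = scores.argmax(axis=1)
print((y_pred == y).mean())
```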

873 citations


Journal ArticleDOI
TL;DR: An efficient and reliable probabilistic metric derived from the Bhattacharyya distance is used in order to classify the extracted feature vectors into face or nonface areas, using some prototype face area vectors, acquired in a previous training stage.
Abstract: Detecting and recognizing human faces automatically in digital images strongly enhance content-based video indexing systems. In this paper, a novel scheme for human faces detection in color images under nonconstrained scene conditions, such as the presence of a complex background and uncontrolled illumination, is presented. Color clustering and filtering using approximations of the YCbCr and HSV skin color subspaces are applied on the original image, providing quantized skin color regions. A merging stage is then iteratively performed on the set of homogeneous skin color regions in the color quantized image, in order to provide a set of potential face areas. Constraints related to shape and size of faces are applied, and face intensity texture is analyzed by performing a wavelet packet decomposition on each face area candidate in order to detect human faces. The wavelet coefficients of the band filtered images characterize the face texture and a set of simple statistical deviations is extracted in order to form compact and meaningful feature vectors. Then, an efficient and reliable probabilistic metric derived from the Bhattacharyya distance is used in order to classify the extracted feature vectors into face or nonface areas, using some prototype face area vectors, acquired in a previous training stage.

641 citations


Journal ArticleDOI
TL;DR: In this paper, a multivariate, nonparametric time series simulation method is provided to generate random sequences of daily weather variables that "honor" the statistical properties of the historical data of the same weather variables at the site.
Abstract: A multivariate, nonparametric time series simulation method is provided to generate random sequences of daily weather variables that "honor" the statistical properties of the historical data of the same weather variables at the site. A vector of weather variables (solar radiation, maximum temperature, minimum temperature, average dew point temperature, average wind speed, and precipitation) on a day of interest is resampled from the historical data by conditioning on the vector of the same variables (feature vector) on the preceding day. The resampling is done from the k nearest neighbors in state space of the feature vector using a weight function. This approach is equivalent to a nonparametric approximation of a multivariate, lag 1 Markov process. It does not require prior assumptions as to the form of the joint probability density function of the variables. An application of the resampling scheme with 30 years of daily weather data at Salt Lake City, Utah, is provided. Results are compared with those from the application of a multivariate autoregressive model similar to that of Richardson (1981).
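A hedged sketch of the conditional k-nearest-neighbour resampling step: the next day's weather vector is drawn from the successors of the k historical days whose feature vectors are closest to today's, using a decreasing weight function. The 1/j rank weights are a common choice and may differ from the paper's exact kernel.

```python
import numpy as np

def knn_resample_day(today, history, k=10, rng=None):
    """Resample tomorrow's weather vector conditioned on today's.

    today   -- feature vector for the current day, shape (d,)
    history -- array (T, d) of consecutive daily historical feature vectors
    """
    if rng is None:
        rng = np.random.default_rng()
    # Only days that have a recorded successor are candidates.
    dists = np.linalg.norm(history[:-1] - today, axis=1)
    nearest = np.argsort(dists)[:k]          # indices of the k closest days
    w = 1.0 / np.arange(1, k + 1)            # decreasing weight by rank
    w /= w.sum()
    chosen = rng.choice(nearest, p=w)
    return history[chosen + 1]               # the day that followed the chosen neighbour
```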

419 citations


Proceedings Article
29 Nov 1999
TL;DR: New functionals for parameter (model) selection of Support Vector Machines are introduced based on the concepts of the span of support vectors and rescaling of the feature space and it is shown that using these functionals one can both predict the best choice of parameters of the model and the relative quality of performance for any value of parameter.
Abstract: New functionals for parameter (model) selection of Support Vector Machines are introduced based on the concepts of the span of support vectors and rescaling of the feature space. It is shown that using these functionals, one can both predict the best choice of parameters of the model and the relative quality of performance for any value of parameter.

392 citations


Journal ArticleDOI
01 Jun 1999-Test
TL;DR: A method for exploring the structure of populations of complex objects, such as images, is considered, and endemic outliers motivate the development of a bounded influence approach to PCA.
Abstract: A method for exploring the structure of populations of complex objects, such as images, is considered. The objects are summarized by feature vectors. The statistical backbone is Principal Component Analysis in the space of feature vectors. Visual insights come from representing the results in the original data space. In an ophthalmological example, endemic outliers motivate the development of a bounded influence approach to PCA.

345 citations


Proceedings Article
29 Nov 1999
TL;DR: Experimental results on commonly used benchmark data sets of a wide range of face images show that the SNoW-based approach outperforms methods that use neural networks, Bayesian methods, support vector machines and others.
Abstract: A novel learning approach for human face detection using a network of linear units is presented. The SNoW learning architecture is a sparse network of linear functions over a pre-defined or incrementally learned feature space and is specifically tailored for learning in the presence of a very large number of features. A wide range of face images in different poses, with different expressions and under different lighting conditions are used as a training set to capture the variations of human faces. Experimental results on commonly used benchmark data sets of a wide range of face images show that the SNoW-based approach outperforms methods that use neural networks, Bayesian methods, support vector machines and others. Furthermore, learning and evaluation using the SNoW-based method are significantly more efficient than with other methods.

Journal ArticleDOI
TL;DR: A new feature-based approach to automated image-to-image registration that combines an invariant-moment shape descriptor with improved chain-code matching to establish correspondences between the potentially matched regions detected from the two images is presented.
Abstract: A new feature-based approach to automated image-to-image registration is presented. The characteristic of this approach is that it combines an invariant-moment shape descriptor with improved chain-code matching to establish correspondences between the potentially matched regions detected from the two images. It is robust in that it overcomes the difficulties of control-point correspondence by matching the images both in the feature space, using the principle of minimum distance classifier (based on the combined criteria), and sequentially in the image space, using the rule of root mean-square error (RMSE). In image segmentation, the performance of the Laplacian of Gaussian operators is improved by introducing a new algorithm called thin and robust zero crossing. After the detected edge points are refined and sorted, regions are defined. Region correspondences are then performed by an image-matching algorithm developed in this research. The centers of gravity are then extracted from the matched regions and are used as control points. Transformation parameters are estimated based on the final matched control-point pairs. The algorithm proposed is automated, robust, and of significant value in an operational context. Experimental results using multitemporal Landsat TM imagery are presented.

Journal ArticleDOI
TL;DR: This work presents an algorithm combining variants of Winnow and weighted-majority voting, and applies it to a problem in the aforementioned class: context-sensitive spelling correction, and finds that WinSpell achieves accuracies significantly higher than BaySpell was able to achieve in either the pruned or unpruned condition.
Abstract: A large class of machine-learning problems in natural language require the characterization of linguistic context. Two characteristic properties of such problems are that their feature space is of very high dimensionality, and their target concepts depend on only a small subset of the features in the space. Under such conditions, multiplicative weight-update algorithms such as Winnow have been shown to have exceptionally good theoretical properties. In the work reported here, we present an algorithm combining variants of Winnow and weighted-majority voting, and apply it to a problem in the aforementioned class: context-sensitive spelling correction. This is the task of fixing spelling errors that happen to result in valid words, such as substituting to for too, casual for causal, and so on. We evaluate our algorithm, WinSpell, by comparing it against BaySpell, a statistics-based method representing the state of the art for this task. We find: (1) When run with a full (unpruned) set of features, WinSpell achieves accuracies significantly higher than BaySpell was able to achieve in either the pruned or unpruned condition; (2) When compared with other systems in the literature, WinSpell exhibits the highest performance; (3) While several aspects of WinSpell's architecture contribute to its superiority over BaySpell, the primary factor is that it is able to learn a better linear separator than BaySpell learns; (4) When run on a test set drawn from a different corpus than the training set was drawn from, WinSpell is better able than BaySpell to adapt, using a strategy we will present that combines supervised learning on the training set with unsupervised learning on the (noisy) test set.
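For reference, the multiplicative Winnow update that underlies this family of algorithms looks roughly as follows; the promotion/demotion factors and threshold are illustrative defaults, not WinSpell's tuned configuration.

```python
import numpy as np

def winnow_train(X, y, alpha=1.5, beta=0.5, passes=5):
    """Basic Winnow on binary features.

    X -- (n, d) 0/1 feature matrix, y -- 0/1 labels.
    Returns the learned weights and the fixed threshold.
    """
    n, d = X.shape
    w = np.ones(d)
    theta = d / 2.0
    for _ in range(passes):
        for xi, yi in zip(X, y):
            pred = int(w @ xi >= theta)
            if pred == 1 and yi == 0:
                w[xi == 1] *= beta    # demote the active features
            elif pred == 0 and yi == 1:
                w[xi == 1] *= alpha   # promote the active features
    return w, theta
```

Because only weights of active features change, each update costs time proportional to the (small) number of active features, which is what makes the approach attractive in the very high-dimensional, sparse feature spaces described above.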

Journal ArticleDOI
01 Dec 1999
TL;DR: Two new clustering algorithms are introduced that can effectively cluster documents, even in the presence of a very high dimensional feature space, and do not require pre-specified ad hoc distance functions and are capable of automatically discovering document similarities or associations.
Abstract: Clustering techniques have been used by many intelligent software agents in order to retrieve, filter, and categorize documents available on the World Wide Web. Clustering is also useful in extracting salient features of related Web documents to automatically formulate queries and search for other similar documents on the Web. Traditional clustering algorithms either use a priori knowledge of document structures to define a distance or similarity among these documents, or use probabilistic techniques such as Bayesian classification. Many of these traditional algorithms, however, falter when the dimensionality of the feature space becomes high relative to the size of the document space. In this paper, we introduce two new clustering algorithms that can effectively cluster documents, even in the presence of a very high dimensional feature space. These clustering techniques, which are based on generalizations of graph partitioning, do not require pre-specified ad hoc distance functions, and are capable of automatically discovering document similarities or associations. We conduct several experiments on real Web data using various feature selection heuristics, and compare our clustering schemes to standard distance-based techniques, such as hierarchical agglomeration clustering, and Bayesian classification methods, such as AutoClass.

Journal Article
TL;DR: This paper shows that Support Vector Machines (SVM) can generalize well on difficult image classification problems where the only features are high dimensional histograms, and observes that a simple remapping of the input x_i → x_i^a improves the performance of linear SVMs to such an extent that it makes them a valid alternative to RBF kernels.
Abstract: Traditional classification approaches generalize poorly on image classification tasks, because of the high dimensionality of the feature space. This paper shows that Support Vector Machines (SVM) can generalize well on difficult image classification problems where the only features are high dimensional histograms. Heavy-tailed RBF kernels of the form K(x, y) = exp(−ρ Σ_i |x_i^a − y_i^a|^b) with a ≤ 1 and b ≤ 2 are evaluated on the classification of images extracted from the Corel Stock Photo Collection and shown to far outperform traditional polynomial or Gaussian RBF kernels. Moreover, we observed that a simple remapping of the input x_i → x_i^a improves the performance of linear SVMs to such an extent that it makes them, for this problem, a valid alternative to RBF kernels. Keywords: Support Vector Machines, Radial Basis Functions, Image Histogram, Image Classification, Corel

Proceedings ArticleDOI
20 Sep 1999
TL;DR: A novel variational method for supervised texture segmentation that is guided by boundary and region based segmentation forces, and is constrained by a regularity force is presented.
Abstract: The paper presents a novel variational method for supervised texture segmentation. The textured feature space is generated by filtering the given textured images using isotropic and anisotropic filters, and analyzing their responses as multi-component conditional probability density functions. The texture segmentation is obtained by unifying region and boundary based information as an improved Geodesic Active Contour Model. The defined objective function is minimized using a gradient-descent method where a level set approach is used to implement the obtained PDE. According to this PDE, the curve propagation towards the final solution is guided by boundary and region based segmentation forces, and is constrained by a regularity force. The level set implementation is performed using a fast front propagation algorithm where topological changes are naturally handled. The performance of our method is demonstrated on a variety of synthetic and real textured frames.

Journal ArticleDOI
TL;DR: Curvature scale space (CSS) image representation along with a small number of global parameters are used for this purpose and the results show the promising performance of the method and its superiority over Fourier descriptors and moment invariants.
Abstract: In many applications, the user of an image database system points to an image, and wishes to retrieve similar images from the database. Computer vision researchers aim to capture image information in feature vectors which describe shape, texture and color properties of the image. These vectors are indexed or compared to one another during query processing to find images from the database. This paper is concerned with the problem of shape similarity retrieval in image databases. Curvature scale space (CSS) image representation along with a small number of global parameters are used for this purpose. The CSS image consists of several arch-shape contours representing the inflection points of the shape as it is smoothed. The maxima of these contours are used to represent a shape. The method is then tested on a database of 1100 images of marine creatures. A classified subset of this database is used to evaluate the method and compare it with other methods. The results show the promising performance of the method and its superiority over Fourier descriptors and moment invariants.

Proceedings ArticleDOI
23 Mar 1999
TL;DR: The hybrid tree is a multidimensional data structure for indexing high-dimensional feature spaces that combines the positive aspects of pure data-partitioning and pure space-partitioning index structures into a single data structure, achieving search performance that is more scalable to high dimensionalities than either technique alone.
Abstract: Feature-based similarity searching is emerging as an important search paradigm in database systems. The technique used is to map the data items as points into a high-dimensional feature space which is indexed using a multidimensional data structure. Similarity searching then corresponds to a range search over the data structure. Although several data structures have been proposed for feature indexing, none of them is known to scale beyond 10-15 dimensional spaces. This paper introduces the hybrid tree, a multidimensional data structure for indexing high-dimensional feature spaces. Unlike other multidimensional data structures, the hybrid tree cannot be classified as either a pure data partitioning (DP) index structure (such as the R-tree, SS-tree or SR-tree) or a pure space partitioning (SP) one (such as the KDB-tree or hB-tree); rather it combines the positive aspects of the two types of index structures into a single data structure to achieve a search performance which is more scalable to high dimensionalities than either of the above techniques. Furthermore, unlike many data structures (e.g. distance-based index structures like the SS-tree and SR-tree), the hybrid tree can support queries based on arbitrary distance functions. Our experiments on "real" high-dimensional large-size feature databases demonstrate that the hybrid tree scales well to high dimensionality and large database sizes. It significantly outperforms both purely DP-based and SP-based index mechanisms as well as linear scans at all dimensionalities for large-sized databases.

Proceedings ArticleDOI
15 Mar 1999
TL;DR: A new approach to content-based video indexing using hidden Markov models (HMMs), in which one feature vector is calculated for each image of the video sequence and a video model allows the classification of complex video sequences.
Abstract: This paper presents a new approach to content-based video indexing using hidden Markov models (HMMs). In this approach one feature vector is calculated for each image of the video sequence. These feature vectors are modeled and classified using HMMs. This approach has many advantages compared to other video indexing approaches. The system has automatic learning capabilities. It is trained by presenting manually indexed video sequences. To improve the system we use a video model, that allows the classification of complex video sequences. The presented approach works three times faster than real-time. We tested our system on TV broadcast news. The rate of 97.3% correctly classified frames shows the efficiency of our system.
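A hedged sketch of the per-class HMM idea using the hmmlearn library: one Gaussian HMM is trained per content class on sequences of per-frame feature vectors, and a new sequence is labelled by the highest-scoring model. The feature extraction, number of states and library choice are assumptions, not the paper's setup.

```python
import numpy as np
from hmmlearn import hmm

def train_class_hmms(sequences_by_class, n_states=4):
    """sequences_by_class: {label: list of (T_i, d) frame feature-vector arrays}."""
    models = {}
    for label, seqs in sequences_by_class.items():
        X = np.vstack(seqs)                  # concatenated observations
        lengths = [len(s) for s in seqs]     # per-sequence lengths for hmmlearn
        m = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
        m.fit(X, lengths)
        models[label] = m
    return models

def classify_sequence(seq, models):
    # label the sequence by the class model with the highest log-likelihood
    return max(models, key=lambda label: models[label].score(seq))
```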

Journal ArticleDOI
TL;DR: The Self-Organizing Map (SOM) and Learning Vector Quantization (LVQ) algorithms are constructed in this work for variable-length and warped feature sequences and good results have been obtained in speaker-independent speech recognition.
Abstract: The Self-Organizing Map (SOM) and Learning Vector Quantization (LVQ) algorithms are constructed in this work for variable-length and warped feature sequences. The novelty is to associate an entire feature vector sequence, instead of a single feature vector, as a model with each SOM node. Dynamic time warping is used to obtain time-normalized distances between sequences with different lengths. Starting with random initialization, ordered feature sequence maps then ensue, and Learning Vector Quantization can be used to fine tune the prototype sequences for optimal class separation. The resulting SOM models, the prototype sequences, can then be used for the recognition as well as synthesis of patterns. Good results have been obtained in speaker-independent speech recognition.
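The time-normalised distance between variable-length feature sequences can be illustrated with a textbook dynamic time warping routine; the normalisation by combined sequence length below is one common convention and may differ in detail from the paper.

```python
import numpy as np

def dtw_distance(a, b):
    """DTW distance between feature-vector sequences a (m, d) and b (n, d)."""
    m, n = len(a), len(b)
    D = np.full((m + 1, n + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])   # local frame distance
            D[i, j] = cost + min(D[i - 1, j],             # insertion
                                 D[i, j - 1],             # deletion
                                 D[i - 1, j - 1])         # match
    # normalise by the combined length so sequences of different
    # lengths remain comparable
    return D[m, n] / (m + n)
```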

Journal ArticleDOI
TL;DR: The RBFNN classifier appears to be well suited to classifying the arrhythmias, owing to the feature vectors' linear inseparability and tendency to cluster, and the results indicate the potential for wavelet-based energy descriptors to distinguish the main features of the signal and thereby enhance the classification scheme.
Abstract: Automatic detection and classification of arrhythmias based on ECG signals are important to cardiac-disease diagnostics. The ability of the ECG classifier to identify arrhythmias accurately is based on the development of robust techniques for both feature extraction and classification. A classifier is developed based on using wavelet transforms for extracting features and then using a radial basis function neural network (RBFNN) to classify the arrhythmia. Six energy descriptors are derived from the wavelet coefficients over a single-beat interval from the ECG signal. Nine different continuous and discrete wavelet transforms are considered for obtaining the feature vector. An RBFNN adapted to detect and classify life-threatening arrhythmias is then used to classify the feature vector. Classification results are based on 159 arrhythmia files obtained from three different sources. Classification results indicate the potential for wavelet-based energy descriptors to distinguish the main features of the signal and thereby enhance the classification scheme. The RBFNN classifier appears to be well suited to classifying the arrhythmias, owing to the feature vectors' linear inseparability and tendency to cluster. Utilising the Daubechies wavelet transform, an overall correct classification of 97.5% is obtained, with 100% correct classification for both ventricular fibrillation and ventricular tachycardia.
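A sketch of wavelet-based energy descriptors for a single-beat ECG segment using PyWavelets; the wavelet ('db4'), decomposition depth and normalisation are assumptions rather than the paper's precise recipe (a 5-level decomposition happens to yield six sub-band energies, matching the six descriptors mentioned).

```python
import numpy as np
import pywt

def wavelet_energy_features(beat, wavelet="db4", level=5):
    """Return one energy value per sub-band of a discrete wavelet decomposition.

    beat -- 1-D array of ECG samples covering a single-beat interval.
    """
    coeffs = pywt.wavedec(np.asarray(beat, float), wavelet, level=level)
    energies = np.array([np.sum(c ** 2) for c in coeffs])
    return energies / energies.sum()     # normalised energy distribution over sub-bands

# A feature vector like this would then be fed to an RBF neural network classifier.
```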

Proceedings ArticleDOI
19 Apr 1999
TL;DR: Four dimensionality reduction techniques are proposed and compared for reducing the feature space to a much lower-dimensional input space for the neural network classifier; the results showed that the proposed model was able to achieve high categorization effectiveness as measured by precision and recall.
Abstract: In a text categorization model using an artificial neural network as the text classifier, scalability is poor if the neural network is trained using the raw feature space, since textual data has a very high-dimensional feature space. We proposed and compared four dimensionality reduction techniques to reduce the feature space into an input space of much lower dimension for the neural network classifier. To test the effectiveness of the proposed model, experiments were conducted using a subset of the Reuters-22173 test collection for text categorization. The results showed that the proposed model was able to achieve high categorization effectiveness as measured by precision and recall. Among the four dimensionality reduction techniques proposed, principal component analysis was found to be the most effective in reducing the dimensionality of the feature space.
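A minimal sketch of the reduction step the paper found most effective, principal component analysis, feeding a small neural classifier; the libraries, matrix sizes and random data are illustrative only.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.random((500, 5000))           # stand-in for a high-dimensional term-feature matrix
y = rng.integers(0, 5, size=500)      # stand-in category labels

# Reduce the 5000-dimensional feature space to a 100-dimensional input space.
X_reduced = PCA(n_components=100).fit_transform(X)

# Train a small neural network classifier on the reduced input space.
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300).fit(X_reduced, y)
print(clf.score(X_reduced, y))
```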

Patent
11 Mar 1999
TL;DR: In this article, a technique for classifying video frames using statistical models of transform coefficients is disclosed, where image frames are transformed using a discrete cosine transform or Hadamard transform and the resulting transform matrices are reduced using truncation, principal component analysis, or linear discriminant analysis to produce feature vectors.
Abstract: Techniques for classifying video frames using statistical models of transform coefficients are disclosed. After optionally being decimated in time and space, image frames are transformed using a discrete cosine transform or Hadamard transform. The methods disclosed model image composition and operate on grayscale images. The resulting transform matrices are reduced using truncation, principal component analysis, or linear discriminant analysis to produce feature vectors. Feature vectors of training images for image classes are used to compute image class statistical models. Once image class statistical models are derived, individual frames are classified by the maximum likelihood resulting from the image class statistical models. Thus, the probabilities that a feature vector derived from a frame would be produced from each of the image class statistical models are computed. The frame is classified into the image class corresponding to the image class statistical model which produced the highest probability for the feature vector derived from the frame. Optionally, frame sequence information is taken into account by applying a hidden Markov model to represent image class transitions from the previous frame to the current frame. After computing all class probabilities for all frames in the video or sequence of frames using the image class statistical models and the image class transition probabilities, the final class is selected as having the maximum likelihood. Previous frames are selected in reverse order based upon their likelihood given determined current states.
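A hedged sketch of the frame-classification pipeline described in this patent abstract: DCT-transform a grayscale frame, keep a low-frequency block as the feature vector, fit a per-class Gaussian model, and label new frames by maximum likelihood. The block size and diagonal-covariance choice are illustrative assumptions, and the hidden Markov smoothing of class transitions is omitted.

```python
import numpy as np
from scipy.fft import dctn

def dct_feature(frame, block=8):
    """2-D grayscale frame -> flattened low-frequency DCT block as feature vector."""
    coeffs = dctn(np.asarray(frame, float), norm="ortho")
    return coeffs[:block, :block].ravel()

def fit_class_model(feature_vectors):
    """Per-class statistical model: mean and (diagonal) variance of its feature vectors."""
    F = np.vstack(feature_vectors)
    return F.mean(axis=0), F.var(axis=0) + 1e-6

def log_likelihood(x, model):
    mu, var = model
    return -0.5 * np.sum((x - mu) ** 2 / var + np.log(2 * np.pi * var))

def classify_frame(frame, models):
    """models: {class_label: (mean, variance)} -> most likely class for the frame."""
    x = dct_feature(frame)
    return max(models, key=lambda c: log_likelihood(x, models[c]))
```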

PatentDOI
Raimo Bakis, Ellen Eide
TL;DR: A system and method for rescoring the N-best hypotheses from an automatic speech recognition system by comparing an original speech waveform to synthetic speech waveforms that are generated for each text sequence of the N -best hypotheses.
Abstract: A system and method for rescoring the N-best hypotheses from an automatic speech recognition system by comparing an original speech waveform to synthetic speech waveforms that are generated for each text sequence of the N-best hypotheses. A distance is calculated from the original speech waveform to each of the synthesized waveforms, and the text associated with the synthesized waveform that is determined to be closest to the original waveform is selected as the final hypothesis. The original waveform and each synthesized waveform are aligned to a corresponding text sequence on a phoneme level. The mean of the feature vectors which align to each phoneme is computed for the original waveform as well as for each of the synthesized hypotheses. The distance of a synthesized hypothesis to the original speech signal is then computed as the sum over all phonemes in the hypothesis of the Euclidean distance between the means of the feature vectors of the frames aligning to that phoneme for the original and the synthesized signals. The text of the hypothesis which is closest under the above metric to the original waveform is chosen as the final system output.
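The rescoring distance can be sketched directly from the description: per-phoneme mean feature vectors are compared between the original and each synthesised waveform, and the Euclidean distances are summed. The data structures below (phoneme → list of frame vectors) are assumptions for illustration.

```python
import numpy as np

def phoneme_means(aligned_frames):
    """{phoneme: list of frame feature vectors} -> {phoneme: mean feature vector}"""
    return {p: np.mean(np.vstack(frames), axis=0) for p, frames in aligned_frames.items()}

def hypothesis_distance(original_aligned, synth_aligned):
    """Sum over phonemes of the Euclidean distance between mean feature vectors."""
    mu_orig = phoneme_means(original_aligned)
    mu_syn = phoneme_means(synth_aligned)
    return sum(np.linalg.norm(mu_orig[p] - mu_syn[p])
               for p in mu_syn if p in mu_orig)

# The hypothesis whose synthesised waveform yields the smallest distance to the
# original waveform is selected as the final output.
```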

Journal ArticleDOI
Zhengyou Zhang
TL;DR: Experiments show that facial expression recognition is mainly a low frequency process, and a spatial resolution of 64 pixels × 64 pixels is probably enough to represent the space of facial expressions.
Abstract: In this paper, we report our experiments on feature-based facial expression recognition within an architecture based on a two-layer perceptron. We investigate the use of two types of features extracted from face images: the geometric positions of a set of fiducial points on a face, and a set of multiscale and multiorientation Gabor wavelet coefficients at these points. They can be used either independently or jointly. The recognition performance with different types of features has been compared, which shows that Gabor wavelet coefficients are much more powerful than geometric positions. Furthermore, since the first layer of the perceptron actually performs a nonlinear reduction of the dimensionality of the feature space, we have also studied the desired number of hidden units, i.e. the appropriate dimension to represent a facial expression in order to achieve a good recognition rate. It turns out that five to seven hidden units are probably enough to represent the space of facial expressions. Then, we have investigated the importance of each individual fiducial point to facial expression recognition. Sensitivity analysis reveals that points on cheeks and on forehead carry little useful information. After discarding them, not only the computational efficiency increases, but also the generalization performance slightly improves. Finally, we have studied the significance of image scales. Experiments show that facial expression recognition is mainly a low frequency process, and a spatial resolution of 64 pixels × 64 pixels is probably enough.

Journal ArticleDOI
TL;DR: A novel probabilistic method is presented that enables image retrieval procedures to automatically capture feature relevance based on user's feedback and that is highly adaptive to query locations.

Proceedings ArticleDOI
23 Aug 1999
TL;DR: A new method is presented, which applies the idea of capacity control in SVM but tries to make the machine less sensitive to noises and outliers, and can be called a central support vector machine or CSVM, for it uses the class centers in building the support vectors.
Abstract: A support vector machine builds the final classification function on only a small part of the training samples (the support vectors). It is believed that all the information about classification in the training set can be represented by these samples. However, this is actually not always true when the training set is polluted by noises (training data are not i.i.d.). We present a different method for the problem, which applies the idea of capacity control in SVM but tries to make the machine less sensitive to noises and outliers. The new method can be called a central support vector machine or CSVM, for it uses the class centers in building the support vector machine.

Patent
01 Mar 1999
TL;DR: In this article, a method and system for indexing and retrieving database objects, such as images, include a database manager which initializes database objects based on vectors for values of quantified features associated with the database objects. Similar database objects are grouped into common clusters that are based on system-perceived relationships among the objects.
Abstract: A method and system for indexing and retrieving database objects, such as images, include a database manager which initializes database objects based on vectors for values of quantified features associated with the database objects. Similar database objects are grouped into common clusters that are based on system-perceived relationships among the objects. For each search session, a vector for a search query is calculated and database objects from the closest cluster within feature space are selected for presentation at a user device. The user indicates which of the selected objects are relevant to the search session and which of the objects are irrelevant. If one of the clusters includes both relevant and irrelevant objects, the cluster is split into two clusters, so that one of the resulting clusters includes the relevant objects and the other cluster includes the irrelevant objects. The correlation matrix is updated to indicate that the resulting clusters have a weak correlation. If two of the clusters include database objects which were indicated to be relevant to the search session, the correlation matrix is updated to indicate that the two clusters have a strong correlation. To avoid an excessive proliferation of database clusters, mergers are performed on clusters which are closely located within the feature space and share a strong correlation within the correlation matrix. Following continued use, the groupings of objects into clusters and the cluster-to-cluster correlations will reflect user-perceived relationships.

Patent
26 Oct 1999
TL;DR: A system, method, and computer program product for computer-aided detection of suspicious lesions (204) in digital mammograms, wherein single-view feature vectors (704) from a first mammogram are processed in a classification algorithm (706), along with information computed from a plurality of related mammograms, to assign an overall probability of suspiciousness (711) to potentially suspicious lesions in the first digital mammogram.
Abstract: A system, method, and computer program product for computer-aided detection of suspicious lesions (204) in digital mammograms wherein single view feature vectors (704) from a first mammogram are processed in a classification algorithm (706) along with information computed from a plurality of related digital mammograms to assign an overall probability of suspiciousness (711) to potentially suspicious lesions in the first digital mammogram.

Journal ArticleDOI
TL;DR: A new approach to computer-supported diagnosis of skin tumors in dermatology is presented, using neural networks with error back-propagation as the learning paradigm to optimize the classification performance of the neural classifiers.