
Showing papers on "Feature (machine learning)" published in 2006


Book
01 Aug 2006
TL;DR: Looking for competent reading resources?
Abstract: Looking for competent reading resources? We have Pattern Recognition and Machine Learning (Information Science and Statistics) to read, and not only to read but also to download or check out online. Locate this book here. Obtain the files in txt, zip, kindle, word, ppt, pdf, and rar formats. Once again, do not miss the chance to read online and download this book on our site.

8,923 citations


Journal ArticleDOI
TL;DR: This research presents a genetic algorithm approach for feature selection and parameters optimization to solve the problem of optimizing parameters and feature subset without degrading the SVM classification accuracy.
Abstract: Support Vector Machines, one of the new techniques for pattern classification, have been widely used in many application areas. The kernel parameter settings for SVM in the training process affect the classification accuracy. Feature selection is another factor that affects classification accuracy. The objective of this research is to simultaneously optimize the parameters and feature subset without degrading the SVM classification accuracy. We present a genetic algorithm approach for feature selection and parameter optimization to solve this kind of problem. We tried several real-world datasets using the proposed GA-based approach and the Grid algorithm, a traditional method of performing parameter searching. Compared with the Grid algorithm, our proposed GA-based approach significantly improves the classification accuracy and has fewer input features for support vector machines. © 2005 Elsevier Ltd. All rights reserved.
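The joint search described in this abstract can be sketched as a genetic algorithm over one bit string that holds both the feature mask and an encoded parameter. This is a minimal illustration, not the authors' implementation: the chromosome layout, the single parameter C = 2**exponent, the operators, and especially `toy_fitness` (a hypothetical stand-in for cross-validated SVM accuracy) are all assumptions made to keep the example standard-library only.

```python
import math
import random

def decode(chrom, n_features, c_bits, lo=-5.0, hi=15.0):
    """Split a bit list into (feature mask, parameter C = 2**exponent)."""
    mask = chrom[:n_features]
    raw = int("".join(str(b) for b in chrom[n_features:]), 2)
    exponent = lo + (hi - lo) * raw / (2 ** c_bits - 1)
    return mask, 2.0 ** exponent

def toy_fitness(mask, c):
    """Hypothetical stand-in for cross-validated SVM accuracy:
    features 0 and 2 are 'informative', every selected feature costs a
    little, and C near 2**3 is best."""
    gain = 0.2 * mask[0] + 0.2 * mask[2]
    cost = 0.01 * sum(mask) + 0.005 * abs(math.log2(c) - 3.0)
    return 0.5 + gain - cost

def run_ga(fitness, n_features, c_bits=8, pop_size=20, generations=30,
           p_mut=0.02, seed=0):
    rng = random.Random(seed)
    length = n_features + c_bits
    score = lambda ch: fitness(*decode(ch, n_features, c_bits))
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        history.append(score(pop[0]))
        nxt = [pop[0][:], pop[1][:]]  # elitism: keep the two best unchanged
        while len(nxt) < pop_size:
            # Tournament of 3 on the sorted population: lowest index wins.
            a = pop[min(rng.sample(range(pop_size), 3))]
            b = pop[min(rng.sample(range(pop_size), 3))]
            cut = rng.randint(1, length - 1)  # single-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=score)
    history.append(score(best))
    mask, c = decode(best, n_features, c_bits)
    return mask, c, history

best_mask, best_c, history = run_ga(toy_fitness, n_features=6)
```

In a real experiment the fitness function would train and validate a classifier on the features selected by `mask` with parameter `c`; because the two best individuals are copied unchanged, the best score recorded in `history` never decreases.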

1,316 citations


Proceedings ArticleDOI
01 Jan 2006
TL;DR: The approach is not only able to classify different actions, but also to localize different actions simultaneously in a novel and complex video sequence.
Abstract: We present a novel unsupervised learning method for human action categories. A video sequence is represented as a collection of spatial-temporal words by extracting space-time interest points. The algorithm automatically learns the probability distributions of the spatial-temporal words and the intermediate topics corresponding to human action categories. This is achieved by using latent topic models such as the probabilistic Latent Semantic Analysis (pLSA) model and Latent Dirichlet Allocation (LDA). Our approach can handle noisy feature points arising from dynamic backgrounds and moving cameras due to the application of the probabilistic models. Given a novel video sequence, the algorithm can categorize and localize the human action(s) contained in the video. We test our algorithm on three challenging datasets: the KTH human motion dataset, the Weizmann human action dataset, and a recent dataset of figure skating actions. Our results reflect the promise of such a simple approach. In addition, our algorithm can recognize and localize multiple actions in long and complex video sequences containing multiple motions.

927 citations


Proceedings ArticleDOI
03 Apr 2006
TL;DR: The design of an activity recognition and monitoring system based on the eWatch, a multi-sensor platform worn on different body positions, is presented and the tradeoff between recognition accuracy and computational complexity is analyzed.
Abstract: The design of an activity recognition and monitoring system based on the eWatch, a multi-sensor platform worn on different body positions, is presented in this paper. The system identifies the user's activity in real time using multiple sensors and records the classification results during a day. We compare multiple time domain feature sets and sampling rates, and analyze the tradeoff between recognition accuracy and computational complexity. The classification accuracy on different body positions used for wearing electronic devices was evaluated.
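As a rough illustration of what a time-domain feature set over a sensor stream can look like, the sketch below computes a few common statistics on sliding windows. The particular features (mean, variance, RMS, mean absolute deviation), the window sizes, and the function name are assumptions for illustration; the abstract does not list the features the authors actually compared.

```python
import math

def window_features(samples, window, step):
    """Compute simple time-domain features on each sliding window."""
    feats = []
    for start in range(0, len(samples) - window + 1, step):
        w = samples[start:start + window]
        mean = sum(w) / window
        var = sum((x - mean) ** 2 for x in w) / window
        rms = math.sqrt(sum(x * x for x in w) / window)
        mad = sum(abs(x - mean) for x in w) / window
        feats.append({"mean": mean, "var": var, "rms": rms, "mad": mad})
    return feats

# Synthetic single-axis signal (hypothetical data): 16 samples.
signal = [0, 1, 0, -1] * 4
features = window_features(signal, window=8, step=4)

# A lower sampling rate can be emulated by keeping every second sample.
features_half_rate = window_features(signal[::2], window=4, step=2)
```

Cheaper features and lower sampling rates reduce computation per window, which is the kind of accuracy versus complexity tradeoff the abstract refers to.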

740 citations


Journal IssueDOI
TL;DR: A framework for authorship identification of online messages to address the identity-tracing problem is developed; four types of writing-style features are extracted and inductive learning algorithms are used to build feature-based classification models to identify authorship of online messages.
Abstract: With the rapid proliferation of Internet technologies and applications, misuse of online messages for inappropriate or illegal purposes has become a major concern for society. The anonymous nature of online-message distribution makes identity tracing a critical problem. We developed a framework for authorship identification of online messages to address the identity-tracing problem. In this framework, four types of writing-style features (lexical, syntactic, structural, and content-specific features) are extracted and inductive learning algorithms are used to build feature-based classification models to identify authorship of online messages. To examine this framework, we conducted experiments on English and Chinese online-newsgroup messages. We compared the discriminating power of the four types of features and of three classification techniques: decision trees, backpropagation neural networks, and support vector machines. The experimental results showed that the proposed approach was able to identify authors of online messages with satisfactory accuracy of 70 to 95%. All four types of message features contributed to discriminating authors of online messages. Support vector machines outperformed the other two classification techniques in our experiments. The high performance we achieved for both the English and Chinese datasets showed the potential of this approach in a multiple-language context. © 2006 Wiley Periodicals, Inc.

619 citations


Proceedings Article
01 Jan 2006
TL;DR: A practical procedure for applying WCCN to an SVM-based speaker recognition system where the input feature vectors reside in a high-dimensional space and achieves improvements of up to 22% in EER and 28% in minimum decision cost function (DCF) over the previous baseline.
Abstract: This paper extends the within-class covariance normalization (WCCN) technique described in [1, 2] for training generalized linear kernels. We describe a practical procedure for applying WCCN to an SVM-based speaker recognition system where the input feature vectors reside in a high-dimensional space. Our approach involves using principal component analysis (PCA) to split the original feature space into two subspaces: a low-dimensional “PCA space” and a high-dimensional “PCA-complement space.” After performing WCCN in the PCA space, we concatenate the resulting feature vectors with a weighted version of their PCA-complements. When applied to a state-of-the-art MLLR-SVM speaker recognition system, this approach achieves improvements of up to 22% in EER and 28% in minimum decision cost function (DCF) over our previous baseline. We also achieve substantial improvements over an MLLR-SVM system that performs WCCN in the PCA space but discards the PCA-complement.

461 citations


Journal ArticleDOI
TL;DR: This paper investigates the feasibility of an audio-based context recognition system, which is developed and compared to human listeners on the same task, with particular emphasis on the computational complexity of the methods.
Abstract: The aim of this paper is to investigate the feasibility of an audio-based context recognition system. Here, context recognition refers to the automatic classification of the context or an environment around a device. A system is developed and compared to the accuracy of human listeners in the same task. Particular emphasis is placed on the computational complexity of the methods, since the application is of particular interest in resource-constrained portable devices. Simplistic low-dimensional feature vectors are evaluated against more standard spectral features. Using discriminative training, competitive recognition accuracies are achieved with very low-order hidden Markov models (1-3 Gaussian components). Slight improvement in recognition accuracy is observed when linear data-driven feature transformations are applied to mel-cepstral features. The recognition rate of the system as a function of the test sequence length appears to converge only after about 30 to 60 s. Some degree of accuracy can be achieved even with less than 1-s test sequence lengths. The average reaction time of the human listeners was 14 s, i.e., somewhat smaller, but of the same order as that of the system. The average recognition accuracy of the system was 58% against 69%, obtained in the listening tests in recognizing between 24 everyday contexts. The accuracies in recognizing six high-level classes were 82% for the system and 88% for the subjects.

436 citations


Journal ArticleDOI
TL;DR: This paper proposed two empirical heuristics: per-document text normalization and feature weighting method, which performed very well in the standard benchmark collections, competing with state-of-the-art text classifiers based on a highly complex learning method such as SVM.
Abstract: While naive Bayes is quite effective in various data mining tasks, it shows a disappointing result in the automatic text classification problem. Based on the observation of naive Bayes for the natural language text, we found a serious problem in the parameter estimation process, which causes poor results in text classification domain. In this paper, we propose two empirical heuristics: per-document text normalization and feature weighting method. While these are somewhat ad hoc methods, our proposed naive Bayes text classifier performs very well in the standard benchmark collections, competing with state-of-the-art text classifiers based on a highly complex learning method such as SVM.
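To make the idea of per-document normalization concrete, here is a minimal multinomial naive Bayes sketch in which each document's term counts are divided by the document length before they are accumulated, so that long documents do not dominate the parameter estimates. The exact normalization and the Laplace smoothing used here are assumptions for illustration, not the formulas from the paper, and the feature weighting heuristic is not shown.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels, alpha=1.0):
    """Train naive Bayes on token lists with length-normalized counts."""
    class_docs = Counter(labels)
    term_mass = defaultdict(Counter)
    vocab = set()
    for tokens, y in zip(docs, labels):
        if not tokens:
            continue  # an empty document contributes no term evidence
        for term, count in Counter(tokens).items():
            term_mass[y][term] += count / len(tokens)  # per-document normalization
            vocab.add(term)
    model = {}
    for y in class_docs:
        total = sum(term_mass[y].values()) + alpha * len(vocab)
        log_prior = math.log(class_docs[y] / len(docs))
        log_like = {t: math.log((term_mass[y][t] + alpha) / total) for t in vocab}
        model[y] = (log_prior, log_like)
    return model

def predict_nb(model, tokens):
    """Return the class with the highest posterior log score."""
    def score(y):
        log_prior, log_like = model[y]
        # Terms unseen in training are ignored.
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(model, key=score)

# Tiny hypothetical corpus.
docs = [["goal", "match", "team"],
        ["team", "score", "goal", "goal"],
        ["stock", "market", "price"],
        ["market", "trade", "price", "price"]]
labels = ["sport", "sport", "finance", "finance"]
model = train_nb(docs, labels)
```

With normalized counts each document contributes a total mass of 1, so a smoothing constant of 1.0 is relatively heavy; a smaller `alpha` may be more appropriate on real collections.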

430 citations


Journal ArticleDOI
TL;DR: This review covers aspects of analysis from data normalisation methods to pattern recognition and classification techniques, and focuses on the use of artificial intelligence techniques such as neural networks and fuzzy logic for classification and genetic algorithms for feature (sensor) selection.
Abstract: Electronic noses (e-noses) employ an array of chemical gas sensors and have been widely used for the analysis of volatile organic compounds. Pattern recognition provides a higher degree of selectivity and reversibility to the systems leading to an extensive range of applications. These range from the food and medical industry to environmental monitoring and process control. Many types of data analysis techniques have been used on the data produced. This review covers aspects of analysis from data normalisation methods to pattern recognition and classification techniques. An overview of data visualisation such as non-linear mapping and multivariate statistical techniques is given. Focus is then on the use of artificial intelligence techniques such as neural networks and fuzzy logic for classification and genetic algorithms for feature (sensor) selection. Application areas are covered with examples of the types of systems and analysis methods currently in use. Future trends in the analysis of sensor array data are discussed.

423 citations


Journal ArticleDOI
TL;DR: This paper proposes a classification system based on a genetic optimization framework formulated in such a way as to detect the best discriminative features without requiring the a priori setting of their number by the user and to estimate the best SVM parameters in a completely automatic way.
Abstract: Recent remote sensing literature has shown that support vector machine (SVM) methods generally outperform traditional statistical and neural methods in classification problems involving hyperspectral images. However, there are still open issues that, if suitably addressed, could allow further improvement of their performances in terms of classification accuracy. Two especially critical issues are: 1) the determination of the most appropriate feature subspace where to carry out the classification task and 2) model selection. In this paper, these two issues are addressed through a classification system that optimizes the SVM classifier accuracy for this kind of imagery. This system is based on a genetic optimization framework formulated in such a way as to detect the best discriminative features without requiring the a priori setting of their number by the user and to estimate the best SVM parameters (i.e., regularization and kernel parameters) in a completely automatic way. For these purposes, it exploits fitness criteria intrinsically related to the generalization capabilities of SVM classifiers. In particular, two criteria are explored, namely: 1) the simple support vector count and 2) the radius margin bound. The effectiveness of the proposed classification system in general and of these two criteria in particular is assessed both by simulated and real experiments. In addition, a comparison with classification approaches based on three different feature selection methods is reported, i.e., the steepest ascent (SA) algorithm and two other methods explicitly developed for SVM classifiers, namely: 1) the recursive feature elimination technique and 2) the radius margin bound minimization method.

421 citations


Proceedings ArticleDOI
17 Jun 2006
TL;DR: This paper investigates the application of the SIFT approach in the context of face authentication, and proposes and tests different matching schemes using the BANCA database and protocol, showing promising results.
Abstract: Several pattern recognition and classification techniques have been applied to the biometrics domain. Among them, an interesting technique is the Scale Invariant Feature Transform (SIFT), originally devised for object recognition. Even though SIFT features have emerged as very powerful image descriptors, their employment in the face analysis context has never been systematically investigated. This paper investigates the application of the SIFT approach in the context of face authentication. In order to determine the real potential and applicability of the method, different matching schemes are proposed and tested using the BANCA database and protocol, showing promising results.

Book ChapterDOI
07 May 2006
TL;DR: A method of gait recognition from various view directions using frequency-domain features and a view transformation model to solve the problem of appearance changes due to view direction changes.
Abstract: Gait analyses have recently gained attention as methods of identification of individuals at a distance from a camera. However, appearance changes due to view direction changes cause difficulties for gait recognition systems. Here, we propose a method of gait recognition from various view directions using frequency-domain features and a view transformation model. We first construct a spatio-temporal silhouette volume of a walking person and then extract frequency-domain features of the volume by Fourier analysis based on gait periodicity. Next, our view transformation model is obtained with a training set of multiple persons from multiple view directions. In a recognition phase, the model transforms gallery features into the same view direction as that of an input feature, and so the features match each other. Experiments involving gait recognition from 24 view directions demonstrate the effectiveness of the proposed method.
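The Fourier step mentioned in this abstract can be illustrated on a single pixel: given that pixel's silhouette value over one gait period, a discrete Fourier transform yields amplitudes at multiples of the gait frequency, which can serve as features. The sketch below is an assumption-laden toy (one pixel, three harmonics, a hypothetical function name), not the authors' pipeline, which operates on a full spatio-temporal silhouette volume.

```python
import cmath
import math

def harmonic_amplitudes(series, n_harmonics=3):
    """Amplitudes of the first DFT coefficients of one period of a signal."""
    n = len(series)
    amps = []
    for k in range(n_harmonics):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(series))
        amps.append(abs(coeff) / n)
    return amps

# Hypothetical pixel intensity over one gait period of 8 frames.
period = 8
pixel_series = [math.cos(2 * math.pi * t / period) for t in range(period)]
amps = harmonic_amplitudes(pixel_series)
```

For this pure cosine the energy sits entirely in the first harmonic, so `amps` is approximately `[0.0, 0.5, 0.0]`.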

Journal ArticleDOI
TL;DR: This study proposes methods for improving SVM performance in two aspects: feature subset selection and parameter optimization.
Abstract: Bankruptcy prediction is an important and widely studied topic since it can have significant impact on bank lending decisions and profitability. Recently, the support vector machine (SVM) has been applied to the problem of bankruptcy prediction. The SVM-based method has been compared with other methods such as the neural network (NN) and logistic regression, and has shown good results. The genetic algorithm (GA) has been increasingly applied in conjunction with other AI techniques such as NN and Case-based reasoning (CBR). However, few studies have dealt with the integration of GA and SVM, though there is a great potential for useful applications in this area. This study proposes methods for improving SVM performance in two aspects: feature subset selection and parameter optimization. GA is used to optimize both a feature subset and parameters of SVM simultaneously for bankruptcy prediction.

Proceedings ArticleDOI
17 Jun 2006
TL;DR: This paper proposes a novel approach to extract primitive 3D facial expression features, and then applies the feature distribution to classify the prototypic facial expressions, and demonstrates the advantages of the 3D geometric based approach over 2D texture based approaches in terms of various head poses.
Abstract: The creation of facial range models by 3D imaging systems has led to extensive work on 3D face recognition [19]. However, little work has been done to study the usefulness of such data for recognizing and understanding facial expressions. Psychological research shows that the shape of a human face, a highly mobile facial surface, is critical to facial expression perception. In this paper, we investigate the importance and usefulness of 3D facial geometric shapes to represent and recognize facial expressions using 3D facial expression range data. We propose a novel approach to extract primitive 3D facial expression features, and then apply the feature distribution to classify the prototypic facial expressions. In order to validate our proposed approach, we have conducted experiments for person-independent facial expression recognition using our newly created 3D facial expression database. We also demonstrate the advantages of our 3D geometric based approach over 2D texture based approaches in terms of various head poses.

Journal ArticleDOI
TL;DR: Both the classification and the verification performances are found to be very satisfactory as it was shown that, at least for groups of about five hundred subjects, hand-based recognition is a viable secure access control scheme.
Abstract: The problem of person recognition and verification based on their hand images has been addressed. The system is based on the images of the right hands of the subjects, captured by a flatbed scanner in an unconstrained pose at 45 dpi. In a preprocessing stage of the algorithm, the silhouettes of hand images are registered to a fixed pose, which involves both rotation and translation of the hand and, separately, of the individual fingers. Two feature sets have been comparatively assessed, Hausdorff distance of the hand contours and independent component features of the hand silhouette images. Both the classification and the verification performances are found to be very satisfactory as it was shown that, at least for groups of about five hundred subjects, hand-based recognition is a viable secure access control scheme.
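One of the two feature sets named here is the Hausdorff distance between hand contours. The sketch below computes the classical symmetric Hausdorff distance between two point sets; the paper may use a modified variant, and the contour points shown are made-up illustrative data.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from a point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Two tiny hypothetical contours given as (x, y) points.
contour_a = [(0, 0), (1, 0)]
contour_b = [(0, 0), (1, 0), (1, 3)]
distance = hausdorff(contour_a, contour_b)  # 3.0
```

The distance is driven by the single worst-matched point, here `(1, 3)`, which is why contours are registered to a fixed pose before comparison, as the abstract describes.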

Journal ArticleDOI
01 Feb 2006
TL;DR: This paper presents an online feature selection algorithm using genetic programming (GP) that simultaneously selects a good subset of features and constructs a classifier using the selected features and produces a feature ranking scheme.
Abstract: This paper presents an online feature selection algorithm using genetic programming (GP). The proposed GP methodology simultaneously selects a good subset of features and constructs a classifier using the selected features. For a c-class problem, it provides a classifier having c trees. In this context, we introduce two new crossover operations to suit the feature selection process. As a byproduct, our algorithm produces a feature ranking scheme. We tested our method on several data sets having dimensions varying from 4 to 7129. We compared the performance of our method with results available in the literature and found that the proposed method produces consistently good results. To demonstrate the robustness of the scheme, we studied its effectiveness on data sets with known (synthetically added) redundant/bad features.

Journal ArticleDOI
TL;DR: The proposed R-SVM method is suitable for analyzing noisy high-throughput proteomics and microarray data and it outperforms SVM-RFE in the robustness to noise and in the ability to recover informative features.
Abstract: Background: Like microarray-based investigations, high-throughput proteomics techniques require machine learning algorithms to identify biomarkers that are informative for biological classification problems. Feature selection and classification algorithms need to be robust to noise and outliers in the data. Results: We developed a recursive support vector machine (R-SVM) algorithm to select important genes/biomarkers for the classification of noisy data. We compared its performance to a similar, state-of-the-art method (SVM recursive feature elimination or SVM-RFE), paying special attention to the ability of recovering the true informative genes/biomarkers and the robustness to outliers in the data. Simulation experiments show that a 5%~20% improvement over SVM-RFE can be achieved with regard to these properties. The SVM-based methods are also compared with a conventional univariate method and their respective strengths and weaknesses are discussed. R-SVM was applied to two sets of SELDI-TOF-MS proteomics data, one from a human breast cancer study and the other from a study on rat liver cirrhosis. Important biomarkers found by the algorithm were validated by follow-up biological experiments. Conclusion: The proposed R-SVM method is suitable for analyzing noisy high-throughput proteomics and microarray data and it outperforms SVM-RFE in the robustness to noise and in the ability to recover informative features. The multivariate SVM-based method outperforms the univariate method in the classification performance, but univariate methods can reveal more of the differentially expressed features especially when there are correlations between the features.

Proceedings ArticleDOI
17 Jul 2006
TL;DR: A perceptron-style discriminative approach to machine translation in which large feature sets can be exploited and a novel way to introduce learning into the initial phrase extraction process, which has previously been entirely heuristic.
Abstract: We present a perceptron-style discriminative approach to machine translation in which large feature sets can be exploited. Unlike discriminative reranking approaches, our system can take advantage of learned features in all stages of decoding. We first discuss several challenges to error-driven discriminative approaches. In particular, we explore different ways of updating parameters given a training example. We find that making frequent but smaller updates is preferable to making fewer but larger updates. Then, we discuss an array of features and show both how they quantitatively increase BLEU score and how they qualitatively interact on specific examples. One particular feature we investigate is a novel way to introduce learning into the initial phrase extraction process, which has previously been entirely heuristic.

Journal ArticleDOI
TL;DR: The experiments suggest that discriminant analysis provides a fast, efficient yet accurate alternative for general multi-class classification problems.
Abstract: Many supervised machine learning tasks can be cast as multi-class classification problems. Support vector machines (SVMs) excel at binary classification problems, but the elegant theory behind large-margin hyperplanes cannot be easily extended to their multi-class counterparts. On the other hand, it was shown that the decision hyperplanes for binary classification obtained by SVMs are equivalent to the solutions obtained by Fisher's linear discriminant on the set of support vectors. Discriminant analysis approaches are well known to learn discriminative feature transformations in the statistical pattern recognition literature and can be easily extended to multi-class cases. The use of discriminant analysis, however, has not been fully explored in the data mining literature. In this paper, we explore the use of discriminant analysis for multi-class classification problems. We evaluate the performance of discriminant analysis on a large collection of benchmark datasets and investigate its usage in text categorization. Our experiments suggest that discriminant analysis provides a fast, efficient yet accurate alternative for general multi-class classification problems.

Book ChapterDOI
12 Sep 2006
TL;DR: The authors proposed a variable-length n-gram approach inspired by previous work for selecting variable length word sequences for authorship identification, using a subset of the new Reuters corpus, consisting of texts on the same topic by 50 different authors.
Abstract: Automatic authorship identification offers a valuable tool for supporting crime investigation and security. It can be seen as a multi-class, single-label text categorization task. Character n-grams are a very successful approach to represent text for stylistic purposes since they are able to capture nuances in lexical, syntactical, and structural level. So far, character n-grams of fixed length have been used for authorship identification. In this paper, we propose a variable-length n-gram approach inspired by previous work for selecting variable-length word sequences. Using a subset of the new Reuters corpus, consisting of texts on the same topic by 50 different authors, we show that the proposed approach is at least as effective as information gain for selecting the most significant n-grams although the feature sets produced by the two methods have few common members. Moreover, we explore the significance of digits for distinguishing between authors showing that an increase in performance can be achieved using simple text pre-processing.
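As a small illustration of character n-gram features of several lengths, the sketch below counts all n-grams between a minimum and maximum length and keeps the most frequent ones. Selecting by raw frequency is a simplification chosen here; it is neither the selection method proposed in the paper nor information gain. The `mask_digits` helper, which replaces each digit with `@`, is one assumed example of the simple text pre-processing the abstract alludes to.

```python
import re
from collections import Counter

def mask_digits(text, symbol="@"):
    """Replace every digit with a placeholder symbol (assumed pre-processing)."""
    return re.sub(r"\d", symbol, text)

def char_ngrams(text, n_min=3, n_max=5):
    """Count all character n-grams with n_min <= n <= n_max."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts

def top_ngrams(texts, k, n_min=3, n_max=5):
    """Return the k most frequent n-grams over a collection of texts."""
    total = Counter()
    for text in texts:
        total.update(char_ngrams(text, n_min, n_max))
    return [gram for gram, _ in total.most_common(k)]

counts = char_ngrams("abcab", n_min=2, n_max=3)
masked = mask_digits("in 2006, 50 authors")
```

Each text can then be represented as a vector of counts over the selected n-grams and passed to any multi-class classifier.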

Journal ArticleDOI
TL;DR: A Bayesian method for mixture model training that simultaneously treats the feature selection and the model selection problem and can simultaneously optimize over the number of components, the saliency of the features, and the parameters of the mixture model is presented.
Abstract: We present a Bayesian method for mixture model training that simultaneously treats the feature selection and the model selection problem. The method is based on the integration of a mixture model formulation that takes into account the saliency of the features and a Bayesian approach to mixture learning that can be used to estimate the number of mixture components. The proposed learning algorithm follows the variational framework and can simultaneously optimize over the number of components, the saliency of the features, and the parameters of the mixture model. Experimental results using high-dimensional artificial and real data illustrate the effectiveness of the method.

Proceedings Article
27 Mar 2006
TL;DR: This paper describes experiments on blog identification using Support Vector Machines (SVM), compares results of using different feature sets and introduces new features for blog identification, and reports preliminary results on splog detection.
Abstract: Weblogs, or blogs have become an important new way to publish information, engage in discussions and form communities. The increasing popularity of blogs has given rise to search and analysis engines focusing on the 'blogosphere'. A key requirement of such systems is to identify blogs as they crawl the Web. While this ensures that only blogs are indexed, blog search engines are also often overwhelmed by spam blogs (splogs). Splogs not only incur computational overheads but also reduce user satisfaction. In this paper we first describe our experiments on blog identification using Support Vector Machines (SVM). We compare results of using different feature sets and introduce new features for blog identification. We then report preliminary results on splog detection and identify future work.

Journal ArticleDOI
TL;DR: A system for human behaviour recognition in video sequences that combines Bayesian networks and belief propagation, non-parametric sampling from a previously learned database of actions, and Hidden Markov Models, which encode scene rules and are used to smooth sequences of actions.

Journal ArticleDOI
TL;DR: An efficient approach for face image feature extraction, namely the (2D)^2LDA method, is presented, which obtains good recognition accuracy despite using fewer coefficients.

Proceedings ArticleDOI
27 Oct 2006
TL;DR: This paper presents an HMM-based methodology for action recognition using star skeleton as a representative descriptor of human posture, and implements a system to automatically recognize ten different types of actions.
Abstract: This paper presents an HMM-based methodology for action recognition using star skeleton as a representative descriptor of human posture. Star skeleton is a fast skeletonization technique that connects the centroid of the target object to its contour extremes. To use star skeleton as a feature for action recognition, we clearly define the feature as a five-dimensional vector in star fashion because the head and four limbs are usually local extremes of human shape. In our proposed method, an action is composed of a series of star skeletons over time. Therefore, time-sequential images expressing human action are transformed into a feature vector sequence. Then the feature vector sequence must be transformed into a symbol sequence so that an HMM can model the action. We design a posture codebook, which contains representative star skeletons of each action type, and define a star distance to measure the similarity between feature vectors. Each feature vector of the sequence is matched against the codebook and is assigned to the symbol that is most similar. Consequently, the time-sequential images are converted to a symbol posture sequence. We use HMMs to model each action type to be recognized. In the training phase, the model parameters of the HMM of each category are optimized so as to best describe the training symbol sequences. For human action recognition, the model which best matches the observed symbol sequence is selected as the recognized category. We implement a system to automatically recognize ten different types of actions, and the system has been tested on real human action videos in two cases. One case is the classification of 100 video clips, each containing a single action type. A 98% recognition rate is obtained. The other case is a more realistic situation in which a human performs a series of combined actions. Action-series recognition is achieved by referring to a period of posture history using a sliding window scheme. The experimental results show promising performance.
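The recognition stage described above (match each feature vector to its nearest codebook entry, then pick the HMM that best explains the resulting symbol sequence) can be sketched as follows. This is an illustration under stated assumptions: Euclidean distance stands in for the paper's star distance, the vectors are two-dimensional rather than five-dimensional, the model parameters are hand-written rather than trained, and the action names are hypothetical.

```python
import math

def quantize(vectors, codebook, distance=math.dist):
    """Map each feature vector to the index of its nearest codebook entry."""
    return [min(range(len(codebook)), key=lambda k: distance(v, codebook[k]))
            for v in vectors]

def log_forward(symbols, start, trans, emit):
    """Scaled forward algorithm: log P(symbols | HMM) for a non-empty sequence."""
    n = len(start)
    alpha = [start[i] * emit[i][symbols[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(symbols)):
        if t > 0:
            alpha = [sum(alpha[i] * trans[i][j] for i in range(n))
                     * emit[j][symbols[t]] for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under this model
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p

def classify(symbols, models):
    """Pick the model name with the highest sequence log-likelihood."""
    return max(models, key=lambda name: log_forward(symbols, *models[name]))

# Hypothetical codebook and two hand-written 2-state HMMs
# given as (start, transition, emission) probabilities.
codebook = [(0.0, 0.0), (10.0, 10.0)]
uniform = [[0.5, 0.5], [0.5, 0.5]]
models = {
    "walk": ([0.5, 0.5], uniform, [[0.9, 0.1], [0.9, 0.1]]),
    "wave": ([0.5, 0.5], uniform, [[0.1, 0.9], [0.1, 0.9]]),
}
symbols = quantize([(1, 1), (0, 2), (1, 0), (9, 8)], codebook)
label = classify(symbols, models)
```

In practice each model's parameters would be estimated from training symbol sequences (for example with Baum-Welch), and a sliding window over the symbol stream would allow a series of actions to be labelled.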

Journal ArticleDOI
TL;DR: This matrix-based scheme demonstrates a much better gait recognition performance than state-of-the-art algorithms on the standard USF HumanID Gait database.
Abstract: Human gait is an important biometric feature. It can be perceived from a great distance and has recently attracted greater attention in video-surveillance-related applications, such as closed-circuit television. We explore gait recognition based on a matrix representation in this paper. First, binary silhouettes over one gait cycle are averaged. As a result, each gait video sequence, containing a number of gait cycles, is represented by a series of gray-level averaged images. Then, a matrix-based unsupervised algorithm, namely coupled subspace analysis (CSA), is employed as a preprocessing step to remove noise and retain the most representative information. Finally, a supervised algorithm, namely discriminant analysis with tensor representation, is applied to further improve classification ability. This matrix-based scheme demonstrates a much better gait recognition performance than state-of-the-art algorithms on the standard USF HumanID Gait database.

Journal ArticleDOI
TL;DR: This paper proposes a new measure called the maximal participation ratio (maxPR) and shows that a co-location pattern with a relatively high maxPR value corresponds to a co-location pattern containing rare spatial events, and identifies a weak monotonicity property of the maxPR measure.
Abstract: A co-location pattern is a group of spatial features/events that are frequently co-located in the same region. For example, human cases of West Nile Virus often occur in regions with poor mosquito control and the presence of birds. For co-location pattern mining, previous studies often emphasize the equal participation of every spatial feature. As a result, interesting patterns involving events with substantially different frequency cannot be captured. In this paper, we address the problem of mining co-location patterns with rare spatial features. Specifically, we first propose a new measure called the maximal participation ratio (maxPR) and show that a co-location pattern with a relatively high maxPR value corresponds to a co-location pattern containing rare spatial events. Furthermore, we identify a weak monotonicity property of the maxPR measure. This property can help to develop an efficient algorithm to mine patterns with high maxPR values. As demonstrated by our experiments, our approach is effective in identifying co-location patterns with rare events, and is efficient and scalable for large-scale data sets.
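
The abstract does not spell out how maxPR is computed. The sketch below assumes the usual definition from the co-location mining literature: the participation ratio of a feature in a pattern is the fraction of that feature's instances that take part in at least one instance of the pattern, and maxPR is the largest such ratio over the pattern's features. The feature names, instance identifiers, and counts are invented for illustration.

```python
def participation_ratios(pattern, row_instances, feature_counts):
    """Participation ratio of each feature in a co-location pattern.

    pattern        : tuple of feature names, e.g. ("virus", "birds")
    row_instances  : list of tuples, one instance id per feature in `pattern`
    feature_counts : total number of instances of each feature in the data set
    """
    ratios = {}
    for pos, feature in enumerate(pattern):
        # Distinct instances of this feature that appear in some pattern instance.
        participating = {row[pos] for row in row_instances}
        ratios[feature] = len(participating) / feature_counts[feature]
    return ratios

def max_participation_ratio(pattern, row_instances, feature_counts):
    """maxPR: the largest participation ratio among the pattern's features."""
    return max(participation_ratios(pattern, row_instances, feature_counts).values())

# Toy data (hypothetical): a rare event ("virus") next to a frequent one ("birds").
pattern = ("virus", "birds")
feature_counts = {"virus": 2, "birds": 100}
row_instances = [("v1", "b7"), ("v2", "b7"), ("v1", "b12")]

ratios = participation_ratios(pattern, row_instances, feature_counts)
max_pr = max_participation_ratio(pattern, row_instances, feature_counts)
min_pr = min(ratios.values())
```

In this toy case the minimum ratio is low because only two of the hundred bird instances participate, so a measure based on the minimum would discard the pattern, whereas the rare feature drives maxPR to its highest possible value.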

Journal ArticleDOI
TL;DR: It is suggested that learning has no direct impact on the strength or resistance of bindings or on the speed with which features are bound; however, learning does affect the amount of attention particular feature dimensions attract, which again can influence which features are considered in binding.
Abstract: Four experiments were conducted to investigate the relationship between the binding of visual features (as measured by their aftereffects on subsequent binding) and the learning of feature-conjunction probabilities. Both binding and learning effects were obtained, but they did not interact. Interestingly, (shape-color) binding effects disappeared with increasing practice, presumably because of the fact that only 1 of the features involved was relevant to the task. However, this instability was only observed for arbitrary, not highly overlearned combinations of simple geometric features and not for real objects (colored pictures of a banana and strawberry), where binding effects were strong and resistant to practice. These findings suggest that learning has no direct impact on the strength or resistance of bindings or on speed with which features are bound; however, learning does affect the amount of attention particular feature dimensions attract, which again can influence which features are considered in binding.

Journal ArticleDOI
TL;DR: A novel Gabor-based kernel principal component analysis (PCA) with doubly nonlinear mapping with Eigenmask is proposed for human face recognition, which not only considers the statistical property of the input features, but also adopts an eigenmask to emphasize those important facial feature points.
Abstract: In this paper, a novel Gabor-based kernel principal component analysis (PCA) with doubly nonlinear mapping is proposed for human face recognition. In our approach, Gabor wavelets are used to extract facial features, and then a doubly nonlinear mapping kernel PCA (DKPCA) is proposed to perform feature transformation and face recognition. The conventional kernel PCA nonlinearly maps an input image into a high-dimensional feature space in order to make the mapped features linearly separable. However, this method does not consider the structural characteristics of the face images, and it is difficult to determine which nonlinear mapping is more effective for face recognition. In this paper, a new method of nonlinear mapping, which is performed in the original feature space, is defined. The proposed nonlinear mapping not only considers the statistical property of the input features, but also adopts an eigenmask to emphasize the important facial feature points. Therefore, after this mapping, the transformed features have a higher discriminating power, and the relative importance of the features adapts to the spatial importance of the face images. This new nonlinear mapping is combined with the conventional kernel PCA, and the combination is called "doubly" nonlinear mapping kernel PCA. The proposed algorithm is evaluated on the Yale database, the AR database, the ORL database and the YaleB database against different face recognition methods such as PCA, Gabor wavelets plus PCA, and Gabor wavelets plus kernel PCA with fractional power polynomial models. Experiments show that consistent and promising results are obtained.

Proceedings Article
01 Oct 2006
TL;DR: It is shown that there is a significant performance difference as different tag sets are selected, and the proposed method gives the state-of-the-art performance.
Abstract: This paper is concerned with Chinese word segmentation, which is treated as a character-based tagging problem under the conditional random field framework. Unlike existing work, which focuses only on feature template selection, our method considers both feature template selection and tag set selection. We therefore present an empirical comparison of performance among different tag sets. We show that there is a significant performance difference as different tag sets are selected. Based on the proposed method, our system gives state-of-the-art performance.
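
To make the idea of a tag set concrete, the sketch below converts a segmented sentence into per-character tags under two commonly used schemes: a two-tag B/I set and a four-tag B/M/E/S set. These are illustrative choices; the abstract does not list which tag sets the paper compares, and the example sentence is invented.

```python
def tag_word(word, tag_set):
    """Return one tag per character of a single segmented word."""
    n = len(word)
    if tag_set == "BI":
        # B = first character of a word, I = any later character.
        return ["B"] + ["I"] * (n - 1)
    if tag_set == "BMES":
        # S = single-character word; otherwise Begin, Middle..., End.
        if n == 1:
            return ["S"]
        return ["B"] + ["M"] * (n - 2) + ["E"]
    raise ValueError(f"unknown tag set: {tag_set}")

def tag_sentence(words, tag_set="BMES"):
    """Flatten a list of words into (character, tag) training pairs."""
    return [
        (ch, tag)
        for word in words
        for ch, tag in zip(word, tag_word(word, tag_set))
    ]

# Hypothetical segmented sentence: "我 / 喜欢 / 机器学习".
words = ["我", "喜欢", "机器学习"]
bmes_tags = [tag for _, tag in tag_sentence(words, "BMES")]
bi_tags = [tag for _, tag in tag_sentence(words, "BI")]
```

A sequence labeler such as a CRF is then trained to predict these tags from character-level features, so the choice of tag set changes what the model has to distinguish.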