
Showing papers on "Linear discriminant analysis published in 2010"


Journal ArticleDOI
01 Jun 2010
TL;DR: A brief overview of clustering is provided, well known clustering methods are summarized, the major challenges and key issues in designing clustering algorithms are discussed, and some of the emerging and useful research directions are pointed out.
Abstract: Organizing data into sensible groupings is one of the most fundamental modes of understanding and learning. As an example, a common scheme of scientific classification puts organisms into a system of ranked taxa: domain, kingdom, phylum, class, etc. Cluster analysis is the formal study of methods and algorithms for grouping, or clustering, objects according to measured or perceived intrinsic characteristics or similarity. Cluster analysis does not use category labels that tag objects with prior identifiers, i.e., class labels. The absence of category information distinguishes data clustering (unsupervised learning) from classification or discriminant analysis (supervised learning). The aim of clustering is to find structure in data and is therefore exploratory in nature. Clustering has a long and rich history in a variety of scientific fields. One of the most popular and simple clustering algorithms, K-means, was first published in 1955. In spite of the fact that K-means was proposed over 50 years ago and thousands of clustering algorithms have been published since then, K-means is still widely used. This speaks to the difficulty in designing a general purpose clustering algorithm and the ill-posed problem of clustering. We provide a brief overview of clustering, summarize well known clustering methods, discuss the major challenges and key issues in designing clustering algorithms, and point out some of the emerging and useful research directions, including semi-supervised clustering, ensemble clustering, simultaneous feature selection during data clustering, and large scale data clustering.
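
Since the abstract centers on K-means, a minimal NumPy sketch of Lloyd's algorithm may be useful; the function and variable names here are illustrative, not from the paper:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and
    centroid update until the centroids stop moving."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points
        # (assumes no cluster empties out, a known failure mode of K-means).
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```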

6,601 citations


Book
01 Oct 2010
TL;DR: Partial least squares (PLS) was not originally designed as a tool for statistical discrimination as discussed by the authors, but applied scientists routinely use PLS for classification and there is substantial empirical evidence to suggest that it performs well in that role.
Abstract: Partial least squares (PLS) was not originally designed as a tool for statistical discrimination. In spite of this, applied scientists routinely use PLS for classification and there is substantial empirical evidence to suggest that it performs well in that role. The interesting question is: why can a procedure that is principally designed for overdetermined regression problems locate and emphasize group structure? Using PLS in this manner has heuristic support owing to the relationship between PLS and canonical correlation analysis (CCA) and the relationship, in turn, between CCA and linear discriminant analysis (LDA). This paper replaces the heuristics with a formal statistical explanation. As a consequence, it will become clear that PLS is to be preferred over PCA when discrimination is the goal and dimension reduction is needed. Copyright © 2003 John Wiley & Sons, Ltd.
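
As a sketch of how PLS is used for discrimination in practice (often called PLS-DA), one common recipe regresses a class-indicator matrix on X and assigns each test point to the class with the largest predicted score. This is a generic scikit-learn illustration, not the paper's formal development:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def pls_da_fit_predict(X_train, y_train, X_test, n_components=2):
    """PLS-DA: regress a one-hot class-indicator matrix on X, then
    classify test points by the largest predicted indicator score."""
    classes = np.unique(y_train)
    Y = (y_train[:, None] == classes[None, :]).astype(float)  # one-hot targets
    pls = PLSRegression(n_components=n_components).fit(X_train, Y)
    scores = pls.predict(X_test)
    return classes[scores.argmax(axis=1)]
```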

2,067 citations


Journal ArticleDOI
TL;DR: In this work, a versatile signal processing and analysis framework for Electroencephalogram (EEG) was proposed and a set of statistical features was extracted from the sub-bands to represent the distribution of wavelet coefficients.
Abstract: In this work, we proposed a versatile signal processing and analysis framework for the electroencephalogram (EEG). Within this framework, the signals were decomposed into frequency sub-bands using the discrete wavelet transform (DWT), and a set of statistical features was extracted from the sub-bands to represent the distribution of wavelet coefficients. Principal component analysis (PCA), independent component analysis (ICA) and linear discriminant analysis (LDA) were used to reduce the dimension of the data. These features were then used as input to a support vector machine (SVM) with two discrete outputs: epileptic seizure or not. The classification performance of the different methods is presented and compared to demonstrate the effectiveness of the classification process. These findings are presented as an example of a method for training and testing a seizure prediction method on data from individual petit mal epileptic patients. Given the heterogeneity of epilepsy, it is likely that methods of this type will be required to configure intelligent devices for treating epilepsy to each individual's neurophysiology prior to clinical operation.
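
A rough sketch of this pipeline with PyWavelets and scikit-learn (the exact sub-band statistics, decomposition level, and SVM settings are placeholders, not the paper's choices):

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA
from sklearn.svm import SVC

def dwt_features(signal, wavelet="db4", level=4):
    """Statistical features per DWT sub-band: mean, standard deviation,
    and mean absolute value of the wavelet coefficients."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    feats = []
    for band in coeffs:
        feats += [band.mean(), band.std(), np.abs(band).mean()]
    return np.array(feats)

# Hypothetical usage: epochs is (n_trials, n_samples), y holds 0/1 seizure labels.
# X = np.array([dwt_features(e) for e in epochs])
# X_red = PCA(n_components=10).fit_transform(X)   # or ICA / LDA
# clf = SVC(kernel="rbf").fit(X_red, y)
```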

1,010 citations


Book ChapterDOI
01 Jan 2010
TL;DR: In this article, the assumption of Gaussianity for the measurement error combined with the maximum likelihood principle could be emphasized to promote the least squares criterion for nonlinear regression problems; considering classification as a regression problem towards estimating class posterior probabilities, least squares has been employed to train neural network and other classifier topologies to approximate correct labels.
Abstract: INTRODUCTION Learning systems depend on three interrelated components: topologies, cost/performance functions, and learning algorithms. Topologies provide the constraints for the mapping, and the learning algorithms offer the means to find an optimal solution; but the solution is optimal with respect to what? Optimality is characterized by the criterion and in neural network literature, this is the least addressed component, yet it has a decisive influence in generalization performance. Certainly, the assumptions behind the selection of a criterion should be better understood and investigated. Traditionally, least squares has been the benchmark criterion for regression problems; considering classification as a regression problem towards estimating class posterior probabilities, least squares has been employed to train neural network and other classifier topologies to approximate correct labels. The main motivation to utilize least squares in regression simply comes from the intellectual comfort this criterion provides due to its success in traditional linear least squares regression applications, which can be reduced to solving a system of linear equations. For nonlinear regression, the assumption of Gaussianity for the measurement error combined with the maximum likelihood principle could be emphasized to promote this criterion. In nonparametric regression, the least squares principle leads to the conditional expectation solution, which is intuitively appealing. Although these are good reasons to use the mean squared error as the cost, it is inherently linked to the assumptions and habits stated above. Consequently, there is information in the error signal that is not captured during the training of nonlinear adaptive systems under non-Gaussian distribution conditions when one insists on second-order statistical criteria. This argument extends to other linear-second-order techniques such as principal component analysis (PCA), linear discriminant analysis (LDA), and canonical correlation analysis (CCA). Recent work tries to generalize these techniques to nonlinear scenarios by utilizing kernel techniques or other heuristics. This raises the question: what other alternative cost functions could be used to train adaptive systems and how could we establish rigorous techniques for extending useful concepts from linear and second-order statistical techniques to nonlinear and higher-order statistical learning methodologies?
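
The Gaussianity argument mentioned above is the standard one-line derivation; for i.i.d. Gaussian measurement errors, the negative log-likelihood is (textbook material, not the chapter's own notation):

```latex
% Negative log-likelihood with e_i = y_i - f(x_i; \theta),
% e_i \sim \mathcal{N}(0, \sigma^2) i.i.d.:
-\log L(\theta)
  = \frac{N}{2}\log\!\left(2\pi\sigma^{2}\right)
  + \frac{1}{2\sigma^{2}} \sum_{i=1}^{N} \bigl(y_i - f(x_i;\theta)\bigr)^{2}
% For fixed \sigma, maximizing the likelihood over \theta is therefore
% exactly minimizing the mean squared error.
```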

615 citations


Journal ArticleDOI
01 Mar 2010
TL;DR: A novel emotion evocation and EEG-based feature extraction technique is presented, in which the mirror neuron system concept was adapted to efficiently foster emotion induction by the process of imitation, justifying the efficiency of the proposed approach.
Abstract: Electroencephalogram (EEG)-based emotion recognition is a relatively new field in the affective computing area with challenging issues regarding the induction of the emotional states and the extraction of the features in order to achieve optimum classification performance. In this paper, a novel emotion evocation and EEG-based feature extraction technique is presented. In particular, the mirror neuron system concept was adapted to efficiently foster emotion induction by the process of imitation. In addition, higher order crossings (HOC) analysis was employed for the feature extraction scheme and a robust classification method, namely HOC-emotion classifier (HOC-EC), was implemented testing four different classifiers [quadratic discriminant analysis (QDA), k-nearest neighbor, Mahalanobis distance, and support vector machines (SVMs)], in order to accomplish efficient emotion recognition. Through a series of facial expression image projections, EEG data were collected from 16 healthy subjects using only 3 EEG channels, namely Fp1, Fp2, and a bipolar channel of the F3 and F4 positions according to the 10-20 system. Two scenarios were examined, using EEG data from a single channel and from combined channels, respectively. Compared with other feature extraction methods, HOC-EC appears to outperform them, achieving 62.3% (using QDA) and 83.33% (using SVM) classification accuracy for the single-channel and combined-channel cases, respectively, differentiating among the six basic emotions, i.e., happiness, surprise, anger, fear, disgust, and sadness. As the emotion class-set reduces its dimension, HOC-EC converges toward the maximum classification rate (100% for five or fewer emotions), justifying the efficiency of the proposed approach. This could facilitate the integration of HOC-EC into human-machine interfaces, such as pervasive healthcare systems, enhancing their affective character and providing information about the user's emotional status (e.g., identifying the user's emotion experiences, recurring affective states, time-dependent emotional trends).
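
Higher order crossings count the zero crossings of increasingly differenced versions of a zero-mean signal; a rough NumPy sketch of the feature extraction (the paper's exact filter family and order are simplified here):

```python
import numpy as np

def hoc_features(x, order=10):
    """Higher order crossings: zero-crossing counts of the signal after
    applying the difference filter 0, 1, ..., order-1 times."""
    z = np.asarray(x, dtype=float) - np.mean(x)
    feats = []
    for _ in range(order):
        signs = np.sign(z)
        feats.append(int(np.count_nonzero(signs[1:] != signs[:-1])))
        z = np.diff(z)  # next-order difference filter
    return np.array(feats)
```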

542 citations


Journal ArticleDOI
TL;DR: Classifiers' accuracy at decoding the category of visual objects from response patterns in human early visual and inferior temporal cortex, acquired in an event-related design with BOLD fMRI at 3T, is compared; linear decoders based on t-value patterns may perform best.

465 citations


Journal ArticleDOI
TL;DR: A unified manifold learning framework for semi-supervised and unsupervised dimension reduction that employs a simple but effective linear regression function to map new data points, modeling the mismatch between h(X) and F.
Abstract: We propose a unified manifold learning framework for semi-supervised and unsupervised dimension reduction by employing a simple but effective linear regression function to map the new data points. For semi-supervised dimension reduction, we aim to find the optimal prediction labels F for all the training samples X, the linear regression function h(X) and the regression residue F0 = F - h(X) simultaneously. Our new objective function integrates two terms related to label fitness and manifold smoothness as well as a flexible penalty term defined on the residue F0. Our Semi-Supervised learning framework, referred to as flexible manifold embedding (FME), can effectively utilize label information from labeled data as well as a manifold structure from both labeled and unlabeled data. By modeling the mismatch between h(X) and F, we show that FME relaxes the hard linear constraint F = h(X) in manifold regularization (MR), making it better cope with the data sampled from a nonlinear manifold. In addition, we propose a simplified version (referred to as FME/U) for unsupervised dimension reduction. We also show that our proposed framework provides a unified view to explain and understand many semi-supervised, supervised and unsupervised dimension reduction techniques. Comprehensive experiments on several benchmark databases demonstrate the significant improvement over existing dimension reduction algorithms.
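
Schematically, and up to weighting details, the FME objective described above can be written as follows, where h(X) = XᵀW + 1bᵀ is the linear regression function and Y holds the given labels (notation reconstructed from the abstract, so treat it as a sketch):

```latex
\min_{F,\,W,\,b}\;
  \underbrace{\operatorname{tr}\bigl((F - Y)^{\top} U (F - Y)\bigr)}_{\text{label fitness}}
  + \underbrace{\operatorname{tr}\bigl(F^{\top} L F\bigr)}_{\text{manifold smoothness}}
  + \mu\Bigl(\lVert W\rVert^{2}
  + \gamma\,\underbrace{\bigl\lVert X^{\top}W + \mathbf{1}b^{\top} - F\bigr\rVert^{2}}_{\text{residue } F_0}\Bigr)
% Letting \gamma \to \infty recovers the hard constraint F = h(X) of
% manifold regularization (MR).
```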

435 citations


Journal ArticleDOI
TL;DR: This paper presents a family of subspace learning algorithms based on a new form of regularization, which transfers the knowledge gained in training samples to testing samples, and minimizes the Bregman divergence between the distribution of training samples and that of testing samples in the selected subspace.
Abstract: Regularization principles [31] lead to approximation schemes for dealing with various learning problems, e.g., regularization of the norm in a reproducing kernel Hilbert space for ill-posed problems. In this paper, we present a family of subspace learning algorithms based on a new form of regularization, which transfers the knowledge gained from training samples to testing samples. In particular, the new regularization minimizes the Bregman divergence between the distribution of training samples and that of testing samples in the selected subspace, so it boosts the performance when training and testing samples are not independent and identically distributed. To test the effectiveness of the proposed regularization, we introduce it to popular subspace learning algorithms, e.g., principal components analysis (PCA) for cross-domain face modeling; and Fisher's linear discriminant analysis (FLDA), locality preserving projections (LPP), marginal Fisher's analysis (MFA), and discriminative locality alignment (DLA) for cross-domain face recognition and text categorization. Finally, we present experimental evidence on both face image data sets and text data sets, suggesting that the proposed Bregman divergence-based regularization is effective in dealing with cross-domain learning problems.

430 citations


Journal ArticleDOI
TL;DR: The average classification rate and subsets of emotions classification rate of two simple pattern classification methods, K Nearest Neighbor (KNN) and Linear Discriminant Analysis (LDA), are presented for justifying the performance of the emotion recognition system.
Abstract: In this paper, we summarize human emotion recognition using different sets of electroencephalogram (EEG) channels and the discrete wavelet transform. An audio-visual induction based protocol has been designed with more dynamic emotional content for inducing discrete emotions (disgust, happy, surprise, fear and neutral). EEG signals are collected from 20 subjects using 64 electrodes placed over the entire scalp according to the International 10-10 system. The raw EEG signals are preprocessed using the Surface Laplacian (SL) filtering method and decomposed into three different frequency bands (alpha, beta and gamma) using the Discrete Wavelet Transform (DWT). We have used the “db4” wavelet function for deriving a set of conventional and modified energy based features from the EEG signals for classifying emotions. Two simple pattern classification methods, K Nearest Neighbor (KNN) and Linear Discriminant Analysis (LDA), are used and their performances are compared for emotional state classification. The experimental results indicate that one of the proposed features (ALREE) gives the maximum average classification rate of 83.26% using KNN and 75.21% using LDA, compared to those of the conventional features. Finally, we present the average classification rate and the subsets-of-emotions classification rate of these two classifiers to justify the performance of our emotion recognition system.

408 citations


Journal ArticleDOI
TL;DR: This paper proposes local Gabor XOR patterns (LGXP), which encodes the Gabor phase by using the local XOR pattern (LXP) operator, and introduces block-based Fisher's linear discriminant (BFLD) to reduce the dimensionality of the proposed descriptor and at the same time enhance its discriminative power.
Abstract: Gabor features have been known to be effective for face recognition. However, only a few approaches utilize phase feature and they usually perform worse than those using magnitude feature. To investigate the potential of Gabor phase and its fusion with magnitude for face recognition, in this paper, we first propose local Gabor XOR patterns (LGXP), which encodes the Gabor phase by using the local XOR pattern (LXP) operator. Then, we introduce block-based Fisher's linear discriminant (BFLD) to reduce the dimensionality of the proposed descriptor and at the same time enhance its discriminative power. Finally, by using BFLD, we fuse local patterns of Gabor magnitude and phase for face recognition. We evaluate our approach on FERET and FRGC 2.0 databases. In particular, we perform comparative experimental studies of different local Gabor patterns. We also make a detailed comparison of their combinations with BFLD, as well as the fusion of different descriptors by using BFLD. Extensive experimental results verify the effectiveness of our LGXP descriptor and also show that our fusion approach outperforms most of the state-of-the-art approaches.
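
To make the LXP operator concrete, here is a small NumPy sketch that quantizes a Gabor phase map into discrete ranges and XORs each pixel's code against its eight neighbours (the number of phase ranges and the neighbour ordering are illustrative):

```python
import numpy as np

def local_xor_pattern(phase, n_ranges=4):
    """LXP on a Gabor phase map: quantize the phase, then XOR
    (inequality-test) each pixel's code against its 8 neighbours and
    pack the comparisons into an 8-bit code."""
    two_pi = 2 * np.pi
    q = np.floor(n_ranges * (phase % two_pi) / two_pi).astype(np.int32)
    h, w = q.shape
    center = q[1:-1, 1:-1]
    pattern = np.zeros((h - 2, w - 2), dtype=np.int32)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = q[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        pattern += (neighbour != center).astype(np.int32) << bit
    return pattern  # block-wise histograms of these codes form the descriptor
```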

390 citations


Journal ArticleDOI
TL;DR: This paper proposes a new image clustering algorithm, referred to as clustering using local discriminant models and global integration (LDMGI), and shows that LDMGI shares a similar objective function with the spectral clustering (SC) algorithms, e.g., normalized cut (NCut).
Abstract: In this paper, we propose a new image clustering algorithm, referred to as clustering using local discriminant models and global integration (LDMGI). To deal with the data points sampled from a nonlinear manifold, for each data point, we construct a local clique comprising this data point and its neighboring data points. Inspired by the Fisher criterion, we use a local discriminant model for each local clique to evaluate the clustering performance of samples within the local clique. To obtain the clustering result, we further propose a unified objective function to globally integrate the local models of all the local cliques. With the unified objective function, spectral relaxation and spectral rotation are used to obtain the binary cluster indicator matrix for all the samples. We show that LDMGI shares a similar objective function with the spectral clustering (SC) algorithms, e.g., normalized cut (NCut). In contrast to NCut, in which the Laplacian matrix is directly calculated based upon a Gaussian function, a new Laplacian matrix is learnt in LDMGI by exploiting both manifold structure and local discriminant information. We also prove that K-means and discriminative K-means (DisKmeans) are both special cases of LDMGI. Extensive experiments on several benchmark image datasets demonstrate the effectiveness of LDMGI. We observe in the experiments that LDMGI is more robust to algorithmic parameters than NCut. Thus, LDMGI is more appealing for real image clustering applications in which the ground truth is generally not available for tuning algorithmic parameters.

Journal ArticleDOI
TL;DR: A multimodal affect detector that combines conversational cues, gross body language, and facial features, and linear discriminant analyses to discriminate between naturally occurring experiences of boredom, engagement/flow, confusion, frustration, delight, and neutral is developed and evaluated.
Abstract: We developed and evaluated a multimodal affect detector that combines conversational cues, gross body language, and facial features. The multimodal affect detector uses feature-level fusion to combine the sensory channels and linear discriminant analyses to discriminate between naturally occurring experiences of boredom, engagement/flow, confusion, frustration, delight, and neutral. Training and validation data for the affect detector were collected in a study where 28 learners completed a 32- min. tutorial session with AutoTutor, an intelligent tutoring system with conversational dialogue. Classification results supported a channel × judgment type interaction, where the face was the most diagnostic channel for spontaneous affect judgments (i.e., at any time in the tutorial session), while conversational cues were superior for fixed judgments (i.e., every 20 s in the session). The analyses also indicated that the accuracy of the multichannel model (face, dialogue, and posture) was statistically higher than the best single-channel model for the fixed but not spontaneous affect expressions. However, multichannel models reduced the discrepancy (i.e., variance in the precision of the different emotions) of the discriminant models for both judgment types. The results also indicated that the combination of channels yielded superadditive effects for some affective states, but additive, redundant, and inhibitory effects for others. We explore the structure of the multimodal linear discriminant models and discuss the implications of some of our major findings.

Journal ArticleDOI
TL;DR: A simulation study using data models and analysis of real microarray data shows that for small samples the root mean square differences of the estimated and true metrics are considerable, and even for large samples, there is only weak correlation between the true and estimated metrics.
Abstract: Motivation: The receiver operator characteristic (ROC) curves are commonly used in biomedical applications to judge the performance of a discriminant across varying decision thresholds. The estimated ROC curve depends on the true positive rate (TPR) and false positive rate (FPR), with the key metric being the area under the curve (AUC). With small samples these rates need to be estimated from the training data, so a natural question arises: How well do the estimates of the AUC, TPR and FPR compare with the true metrics? Results: Through a simulation study using data models and analysis of real microarray data, we show that (i) for small samples the root mean square differences of the estimated and true metrics are considerable; (ii) even for large samples, there is only weak correlation between the true and estimated metrics; and (iii) generally, there is weak regression of the true metric on the estimated metric. For classification rules, we consider linear discriminant analysis, linear support vector machine (SVM) and radial basis function SVM. For error estimation, we consider resubstitution, three kinds of cross-validation and bootstrap. Using resampling, we show the unreliability of some published ROC results. Availability: Companion web site at http://compbio.tgen.org/paper_supp/ROC/roc.html Contact: edward@mail.ece.tamu.edu
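
The flavor of the experiment is easy to reproduce with scikit-learn; a toy sketch (synthetic Gaussian class-conditional model, LDA as the discriminant, a large hold-out set standing in for the true distribution):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def sample(n_per_class, d=10, shift=0.5):
    """Two Gaussian classes separated by `shift` along every dimension."""
    X = np.vstack([rng.normal(0.0, 1.0, (n_per_class, d)),
                   rng.normal(shift, 1.0, (n_per_class, d))])
    y = np.repeat([0, 1], n_per_class)
    return X, y

X_small, y_small = sample(15)     # small training sample, as in the study
X_big, y_big = sample(5000)       # proxy for the true distribution

clf = LinearDiscriminantAnalysis().fit(X_small, y_small)
true_auc = roc_auc_score(y_big, clf.decision_function(X_big))
cv_auc = cross_val_score(LinearDiscriminantAnalysis(), X_small, y_small,
                         cv=5, scoring="roc_auc").mean()
print(f"true AUC ~ {true_auc:.3f}, cross-validated estimate ~ {cv_auc:.3f}")
```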

Book
09 Aug 2010
TL;DR: This chapter discusses the need for Statistics in Experimental Planning and Analysis, and some Basic Properties of a Distribution (Mean, Variance and Standard Deviation) and the importance of Relationships between Two or More Variables.
Abstract: Preface. Acknowledgements.
1 Introduction: 1.1 The Distinction between Trained Sensory Panels and Consumer Panels. 1.2 The Need for Statistics in Experimental Planning and Analysis. 1.3 Scales and Data Types. 1.4 Organisation of the Book.
2 Important Data Collection Techniques for Sensory and Consumer Studies: 2.1 Sensory Panel Methodologies. 2.2 Consumer Tests.
PART I: PROBLEM DRIVEN.
3 Quality Control of Sensory Profile Data: 3.1 General Introduction. 3.2 Visual Inspection of Raw Data. 3.3 Mixed Model ANOVA for Assessing the Importance of the Sensory Attributes. 3.4 Overall Assessment of Assessor Differences Using All Variables Simultaneously. 3.5 Methods for Detecting Differences in Use of the Scale. 3.6 Comparing the Assessors' Ability to Detect Differences between the Products. 3.7 Relations between Individual Assessor Ratings and the Panel Average. 3.8 Individual Line Plots for Detailed Inspection of Assessors. 3.9 Miscellaneous Methods.
4 Correction Methods and Other Remedies for Improving Sensory Profile Data: 4.1 Introduction. 4.2 Correcting for Different Use of the Scale. 4.3 Computing Improved Panel Averages. 4.4 Pre-processing of Data for Three-Way Analysis.
5 Detecting and Studying Sensory Differences and Similarities between Products: 5.1 Introduction. 5.2 Analysing Sensory Profile Data: Univariate Case. 5.3 Analysing Sensory Profile Data: Multivariate Case.
6 Relating Sensory Data to Other Measurements: 6.1 Introduction. 6.2 Estimating Relations between Consensus Profiles and External Data. 6.3 Estimating Relations between Individual Sensory Profiles and External Data.
7 Discrimination and Similarity Testing: 7.1 Introduction. 7.2 Analysis of Data from Basic Sensory Discrimination Tests. 7.3 Examples of Basic Discrimination Testing. 7.4 Power Calculations in Discrimination Tests. 7.5 Thurstonian Modelling: What Is It Really? 7.6 Similarity versus Difference Testing. 7.7 Replications: What to Do? 7.8 Designed Experiments, Extended Analysis and Other Test Protocols.
8 Investigating Important Factors Influencing Food Acceptance and Choice: 8.1 Introduction. 8.2 Preliminary Analysis of Consumer Data Sets (Raw Data Overview). 8.3 Experimental Designs for Rating Based Consumer Studies. 8.4 Analysis of Categorical Effect Variables. 8.5 Incorporating Additional Information about Consumers. 8.6 Modelling of Factors as Continuous Variables. 8.7 Reliability/Validity Testing for Rating Based Methods. 8.8 Rank Based Methodology. 8.9 Choice Based Conjoint Analysis. 8.10 Market Share Simulation.
9 Preference Mapping for Understanding Relations between Sensory Product Attributes and Consumer Acceptance: 9.1 Introduction. 9.2 External and Internal Preference Mapping. 9.3 Examples of Linear Preference Mapping. 9.4 Ideal Point Preference Mapping. 9.5 Selecting Samples for Preference Mapping. 9.6 Incorporating Additional Consumer Attributes. 9.7 Combining Preference Mapping with Additional Information about the Samples.
10 Segmentation of Consumer Data: 10.1 Introduction. 10.2 Segmentation of Rating Data. 10.3 Relating Segments to Consumer Attributes.
PART II: METHOD ORIENTED.
11 Basic Statistics: 11.1 Basic Concepts and Principles. 11.2 Histogram, Frequency and Probability. 11.3 Some Basic Properties of a Distribution (Mean, Variance and Standard Deviation). 11.4 Hypothesis Testing and Confidence Intervals for the Mean. 11.5 Statistical Process Control. 11.6 Relationships between Two or More Variables. 11.7 Simple Linear Regression. 11.8 Binomial Distribution and Tests. 11.9 Contingency Tables and Homogeneity Testing.
12 Design of Experiments for Sensory and Consumer Data: 12.1 Introduction. 12.2 Important Concepts and Distinctions. 12.3 Full Factorial Designs. 12.4 Fractional Factorial Designs: Screening Designs. 12.5 Randomised Blocks and Incomplete Block Designs. 12.6 Split-Plot and Nested Designs. 12.7 Power of Experiments.
13 ANOVA for Sensory and Consumer Data: 13.1 Introduction. 13.2 One-Way ANOVA. 13.3 Single Replicate Two-Way ANOVA. 13.4 Two-Way ANOVA with Randomised Replications. 13.5 Multi-Way ANOVA. 13.6 ANOVA for Fractional Factorial Designs. 13.7 Fixed and Random Effects in ANOVA: Mixed Models. 13.8 Nested and Split-Plot Models. 13.9 Post Hoc Testing.
14 Principal Component Analysis: 14.1 Interpretation of Complex Data Sets by PCA. 14.2 Data Structures for the PCA. 14.3 PCA: Description of the Method. 14.4 Projections and Linear Combinations. 14.5 The Scores and Loadings Plots. 14.6 Correlation Loadings Plot. 14.7 Standardisation. 14.8 Calculations and Missing Values. 14.9 Validation. 14.10 Outlier Diagnostics. 14.11 Tucker-1. 14.12 The Relation between PCA and Factor Analysis (FA).
15 Multiple Regression, Principal Components Regression and Partial Least Squares Regression: 15.1 Introduction. 15.2 Multivariate Linear Regression. 15.3 The Relation between ANOVA and Regression Analysis. 15.4 Linear Regression Used for Estimating Polynomial Models. 15.5 Combining Continuous and Categorical Variables. 15.6 Variable Selection for Multiple Linear Regression. 15.7 Principal Components Regression (PCR). 15.8 Partial Least Squares (PLS) Regression. 15.9 Model Validation: Prediction Performance. 15.10 Model Diagnostics and Outlier Detection. 15.11 Discriminant Analysis. 15.12 Generalised Linear Models, Logistic Regression and Multinomial Regression.
16 Cluster Analysis: Unsupervised Classification: 16.1 Introduction. 16.2 Hierarchical Clustering. 16.3 Partitioning Methods. 16.4 Cluster Analysis for Matrices.
17 Miscellaneous Methodologies: 17.1 Three-Way Analysis of Sensory Data. 17.2 Relating Three-Way Data to Two-Way Data. 17.3 Path Modelling. 17.4 MDS-Multidimensional Scaling. 17.5 Analysing Rank Data. 17.6 The L-PLS Method. 17.7 Missing Value Estimation.
Nomenclature, Symbols and Abbreviations. Index.

Journal ArticleDOI
TL;DR: In this paper, the authors exploited environmental and multi-temporal landslide information for an area in Umbria, Italy, to produce four single and two combined landslide susceptibility zonations.

Journal ArticleDOI
TL;DR: An efficient road sign recognition system is built, based on a conventional nearest neighbour classifier and a simple temporal integration scheme, which demonstrates a competitive performance in the experiments involving real traffic video.

Journal ArticleDOI
TL;DR: This paper proposes a semi-supervised dimensionality reduction method which preserves the global structure of unlabeled samples in addition to separating labeled samples in different classes from each other and shows the usefulness of SELF through experiments with benchmark and real-world document classification datasets.
Abstract: When only a small number of labeled samples are available, supervised dimensionality reduction methods tend to perform poorly because of overfitting. In such cases, unlabeled samples could be useful in improving the performance. In this paper, we propose a semi-supervised dimensionality reduction method which preserves the global structure of unlabeled samples in addition to separating labeled samples in different classes from each other. The proposed method, which we call SEmi-supervised Local Fisher discriminant analysis (SELF), has an analytic form of the globally optimal solution and it can be computed based on eigen-decomposition. We show the usefulness of SELF through experiments with benchmark and real-world document classification datasets.
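
A rough sketch of the idea, using global Fisher scatter rather than SELF's local formulation (so this is a simplification of the published method): blend labeled-data scatter with the total scatter of all samples and solve a single generalized eigenproblem.

```python
import numpy as np
from scipy.linalg import eigh

def self_like_embedding(X_lab, y, X_unlab, beta=0.5, dim=2):
    """beta = 0 gives (global) Fisher discriminant analysis on the labeled
    data; beta = 1 gives PCA on all data; 0 < beta < 1 trades them off."""
    X_all = np.vstack([X_lab, X_unlab])
    Xc_all = X_all - X_all.mean(axis=0)
    S_t = Xc_all.T @ Xc_all                         # total scatter (PCA part)
    d = X_lab.shape[1]
    S_b, S_w = np.zeros((d, d)), np.zeros((d, d))
    m = X_lab.mean(axis=0)
    for c in np.unique(y):
        Xc = X_lab[y == c]
        mc = Xc.mean(axis=0)
        S_b += len(Xc) * np.outer(mc - m, mc - m)   # between-class scatter
        S_w += (Xc - mc).T @ (Xc - mc)              # within-class scatter
    A = (1 - beta) * S_b + beta * S_t
    B = (1 - beta) * S_w + beta * np.eye(d)
    vals, vecs = eigh(A, B)                         # generalized eigenproblem
    return vecs[:, np.argsort(vals)[::-1][:dim]]    # top-`dim` directions
```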

Journal ArticleDOI
TL;DR: NIR spectroscopy was found to be effective for gasoline classification purposes when compared with nuclear magnetic resonance (NMR) spectroscopy or gas chromatography (GC), and the KNN, SVM, and PNN classification techniques were found to be among the most effective ones.

Journal ArticleDOI
TL;DR: This work proposes algorithms for feature extraction and classification based on orthogonal or nonnegative tensor (multi-array) decompositions, and higher order (multilinear) discriminant analysis (HODA), whereby input data are considered as tensors instead of more conventional vector or matrix representations.
Abstract: Feature extraction and selection are key factors in model reduction, classification and pattern recognition problems. This is especially important for input data with large dimensions such as brain recording or multiview images, where appropriate feature extraction is a prerequisite to classification. To ensure that the reduced dataset contains maximum information about input data we propose algorithms for feature extraction and classification. This is achieved based on orthogonal or nonnegative tensor (multi-array) decompositions, and higher order (multilinear) discriminant analysis (HODA), whereby input data are considered as tensors instead of more conventional vector or matrix representations. The developed algorithms are verified on benchmark datasets, using constraints imposed on tensors and/or factor matrices such as orthogonality and nonnegativity.

Journal ArticleDOI
12 Jan 2010
TL;DR: A novel pattern recognition based myoelectric control system that uses parallel binary classification and class specific thresholds that is robust, easily configured, and highly usable is described.
Abstract: This paper describes a novel pattern recognition based myoelectric control system that uses parallel binary classification and class specific thresholds. The system was designed with an intuitive configuration interface, similar to existing conventional myoelectric control systems. The system was assessed quantitatively with a classification error metric and functionally with a clothespin test implemented in a virtual environment. For each case, the proposed system was compared to a state-of-the-art pattern recognition system based on linear discriminant analysis and a conventional myoelectric control scheme with mode switching. These assessments showed that the proposed control system had a higher classification error (p < 0.001) but yielded a more controllable myoelectric control system (p < 0.001) as measured through a clothespin usability test implemented in a virtual environment. Furthermore, the system was computationally simple and applicable for real-time embedded implementation. This work provides the basis for a clinically viable pattern recognition based myoelectric control system which is robust, easily configured, and highly usable.
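
The control structure (not the authors' exact base classifiers or threshold rules) might look like the following sketch: one binary LDA per motion class, each with its own activation threshold, and a "no movement" output whenever zero or several classes fire.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

class ParallelBinaryController:
    """One binary classifier per motion class with a class-specific
    threshold; integer class labels are assumed, -1 means 'no movement'."""

    def __init__(self, thresholds):
        self.thresholds = thresholds   # dict: class label -> threshold
        self.models = {}

    def fit(self, X, y):
        for c in np.unique(y):
            # One-vs-rest binary problem for each motion class.
            self.models[c] = LinearDiscriminantAnalysis().fit(X, (y == c).astype(int))
        return self

    def predict(self, X):
        scores = {c: m.predict_proba(X)[:, 1] for c, m in self.models.items()}
        out = np.full(len(X), -1)
        for i in range(len(X)):
            active = [c for c, s in scores.items() if s[i] >= self.thresholds[c]]
            if len(active) == 1:       # act only on an unambiguous winner
                out[i] = active[0]
        return out
```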

Journal ArticleDOI
TL;DR: This work shows that with an appropriate combination of kernels a significant boost in classification performance is possible, and indicates the utility of active learning with probabilistic predictive models, especially when the amount of training data labels that may be sought for a category is ultimately very small.
Abstract: Discriminative methods for visual object category recognition are typically non-probabilistic, predicting class labels but not directly providing an estimate of uncertainty. Gaussian Processes (GPs) provide a framework for deriving regression techniques with explicit uncertainty models; we show here how Gaussian Processes with covariance functions defined based on a Pyramid Match Kernel (PMK) can be used for probabilistic object category recognition. Our probabilistic formulation provides a principled way to learn hyperparameters, which we utilize to learn an optimal combination of multiple covariance functions. It also offers confidence estimates at test points, and naturally allows for an active learning paradigm in which points are optimally selected for interactive labeling. We show that with an appropriate combination of kernels a significant boost in classification performance is possible. Further, our experiments indicate the utility of active learning with probabilistic predictive models, especially when the amount of training data labels that may be sought for a category is ultimately very small.

Journal ArticleDOI
TL;DR: This paper takes advantage of the functional nature of the data-set and proposes a forecasting methodology based on functional statistics, using a functional clustering procedure to classify the daily load curves and defines a family of functional linear regression models.

Journal ArticleDOI
TL;DR: A novel anthropometric three dimensional (Anthroface 3D) face recognition algorithm, which is based on a systematically selected set of discriminatory structural characteristics of the human face derived from the existing scientific literature on facial anthropometry, is presented.
Abstract: We present a novel anthropometric three dimensional (Anthroface 3D) face recognition algorithm, which is based on a systematically selected set of discriminatory structural characteristics of the human face derived from the existing scientific literature on facial anthropometry. We propose a novel technique for automatically detecting 10 anthropometric facial fiducial points that are associated with these discriminatory anthropometric features. We isolate and employ unique textural and/or structural characteristics of these fiducial points, along with the established anthropometric facial proportions of the human face, for detecting them. Lastly, we develop a completely automatic face recognition algorithm that employs facial 3D Euclidean and geodesic distances between these 10 automatically located anthropometric facial fiducial points and a linear discriminant classifier. On a database of 1149 facial images of 118 subjects, we show that the standard deviation of the Euclidean distance of each automatically detected fiducial point from its manually identified position is less than 2.54 mm. We further show that the proposed Anthroface 3D recognition algorithm performs well (equal error rate of 1.98% and a rank 1 recognition rate of 96.8%), outperforms three of the existing benchmark 3D face recognition algorithms, and is robust to the observed fiducial point localization errors.
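
A minimal sketch of the final recognition stage, assuming the 10 fiducial points have already been detected (geodesic distances are omitted here; all names are hypothetical):

```python
import numpy as np
from itertools import combinations
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def distance_features(landmarks):
    """Pairwise Euclidean distances between fiducial points.
    landmarks: (n_points, 3) array of 3D fiducial coordinates."""
    return np.array([np.linalg.norm(landmarks[i] - landmarks[j])
                     for i, j in combinations(range(len(landmarks)), 2)])

# Hypothetical usage: faces is (n_faces, 10, 3), identities holds labels.
# X = np.array([distance_features(f) for f in faces])
# clf = LinearDiscriminantAnalysis().fit(X, identities)
```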

Proceedings ArticleDOI
14 Mar 2010
TL;DR: An algorithm to regularize the Common Spatial Patterns (CSP) and Linear Discriminant Analysis (LDA) algorithms based on the data from a subset of automatically selected subjects is proposed.
Abstract: A major limitation of Brain-Computer Interfaces (BCI) is their long calibration time, as much data from the user must be collected in order to tune the BCI for this target user. In this paper, we propose a new method to reduce this calibration time by using data from other subjects. More precisely, we propose an algorithm to regularize the Common Spatial Patterns (CSP) and Linear Discriminant Analysis (LDA) algorithms based on the data from a subset of automatically selected subjects. An evaluation of our approach showed that our method significantly outperformed the standard BCI design especially when the amount of data from the target user is small. Thus, our approach helps in reducing the amount of data needed to achieve a given performance level.
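
One simple instance of this kind of subject-to-subject regularization, shown for LDA only (the paper's method also regularizes CSP and selects the donor subjects automatically, which this covariance-shrinkage sketch omits):

```python
import numpy as np

def regularized_lda(X, y, cov_other, lam=0.5):
    """Binary LDA whose pooled covariance is shrunk toward a covariance
    pooled from other subjects' data; lam = 0 recovers subject-specific LDA."""
    X0, X1 = X[y == 0], X[y == 1]
    cov_subject = 0.5 * (np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False))
    cov = (1 - lam) * cov_subject + lam * cov_other
    w = np.linalg.solve(cov, X1.mean(axis=0) - X0.mean(axis=0))
    b = -w @ (X0.mean(axis=0) + X1.mean(axis=0)) / 2
    return w, b   # decide class 1 when X @ w + b > 0
```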

Journal ArticleDOI
TL;DR: A nonlinear SVM technique is applied in a highly heterogeneous sandstone reservoir to classify electrofacies and predict permeability distributions; statistical error analysis shows that the SVM method yields classification of the lithology and estimates of the permeability that are comparable or superior to those of the neural network methods.

Journal ArticleDOI
Taiping Zhang1, Bin Fang1, Yuan Yan Tang1, Zhaowei Shang1, Bin Xu1 
01 Feb 2010
TL;DR: Comparisons of experimental results on different data sets are given with respect to existing LDA extensions, including PCA + LDA, LDA via generalized singular value decomposition, regularized LDA, NLDA, and LDA via QR decomposition, which demonstrate the effectiveness of the proposed EDA method.
Abstract: Linear discriminant analysis (LDA) is well known as a powerful tool for discriminant analysis. In the case of a small training data set, however, it cannot directly be applied to high-dimensional data. This case is the so-called small-sample-size or undersampled problem. In this paper, we propose an exponential discriminant analysis (EDA) technique to overcome the undersampled problem. The advantages of EDA are that, compared with principal component analysis (PCA) + LDA, the EDA method can extract the most discriminant information that was contained in the null space of a within-class scatter matrix, and compared with another LDA extension, i.e., null-space LDA (NLDA), the discriminant information that was contained in the non-null space of the within-class scatter matrix is not discarded. Furthermore, EDA is equivalent to transforming original data into a new space by distance diffusion mapping, and then, LDA is applied in such a new space. As a result of diffusion mapping, the margin between different classes is enlarged, which is helpful in improving classification accuracy. Comparisons of experimental results on different data sets are given with respect to existing LDA extensions, including PCA + LDA, LDA via generalized singular value decomposition, regularized LDA, NLDA, and LDA via QR decomposition, which demonstrate the effectiveness of the proposed EDA method.
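
A compact sketch of EDA as described, with SciPy's matrix exponential applied to the scatter matrices (in practice the scatter matrices may need scaling before exponentiation to avoid overflow; that step is omitted here):

```python
import numpy as np
from scipy.linalg import expm, eigh

def eda_projection(X, y, dim):
    """Exponential discriminant analysis: solve the Fisher eigenproblem on
    expm(S_b) and expm(S_w); expm(S_w) is always full rank, so the
    undersampled (small-sample-size) case is handled."""
    d = X.shape[1]
    m = X.mean(axis=0)
    S_b, S_w = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        S_b += len(Xc) * np.outer(mc - m, mc - m)   # between-class scatter
        S_w += (Xc - mc).T @ (Xc - mc)              # within-class scatter
    vals, vecs = eigh(expm(S_b), expm(S_w))         # generalized eigenproblem
    return vecs[:, np.argsort(vals)[::-1][:dim]]
```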

Journal ArticleDOI
01 Jun 2010
TL;DR: The objective is to regulate the LPP space in a parametric manner and extract useful discriminant information from the whole feature space rather than a reduced projection subspace of principal component analysis, which results in better locality preserving power and higher recognition accuracy than the original LPP method.
Abstract: We propose in this paper a parametric regularized locality preserving projections (LPP) method for face recognition. Our objective is to regulate the LPP space in a parametric manner and extract useful discriminant information from the whole feature space rather than a reduced projection subspace of principal component analysis. This results in better locality preserving power and higher recognition accuracy than the original LPP method. Moreover, the proposed regularization method can easily be extended to other manifold learning algorithms and to effectively address the small sample size problem. Experimental results on two widely used face databases demonstrate the efficacy of the proposed method.

Journal ArticleDOI
TL;DR: Characteristics such as time effort, classifier comprehensibility and method intricacy are evaluated—aspects that determine the success of a classification technique among ecologists and conservation biologists as well as for the communication with managers and decision makers.

Journal Article
TL;DR: In this article, the authors propose a new parsimonious version of the classical multivariate normal linear model, yielding a maximum likelihood estimator (MLE) that is asymptotically less variable than the MLE based on the usual model.
Abstract: We propose a new parsimonious version of the classical multivariate normal linear model, yielding a maximum likelihood estimator (MLE) that is asymptotically less variable than the MLE based on the usual model. Our approach is based on the construction of a link between the mean function and the covariance matrix, using the minimal reducing subspace of the latter that accommodates the former. This leads to a multivariate regression model that we call the envelope model, where the number of parameters is maximally reduced. The MLE from the envelope model can be substantially less variable than the usual MLE, especially when the mean function varies in directions that are orthogonal to the directions of maximum variation for the covariance matrix.

Proceedings ArticleDOI
10 Dec 2010
TL;DR: iVisClassifier fully interacts with all the reduced dimensions obtained by LDA through parallel coordinates and a scatter plot, which significantly improves the interactivity and interpretability of LDA.
Abstract: We present an interactive visual analytics system for classification, iVisClassifier, based on a supervised dimension reduction method, linear discriminant analysis (LDA). Given high-dimensional data and associated cluster labels, LDA gives their reduced dimensional representation, which provides a good overview about the cluster structure. Instead of a single two- or three-dimensional scatter plot, iVisClassifier fully interacts with all the reduced dimensions obtained by LDA through parallel coordinates and a scatter plot. Furthermore, it significantly improves the interactivity and interpretability of LDA. LDA enables users to understand each of the reduced dimensions and how they influence the data by reconstructing the basis vector into the original data domain. By using heat maps, iVisClassifier gives an overview about the cluster relationship in terms of pairwise distances between cluster centroids both in the original space and in the reduced dimensional space. Equipped with these functionalities, iVisClassifier supports users' classification tasks in an efficient way. Using several facial image data, we show how the above analysis is performed.