
Showing papers on "Linear discriminant analysis published in 2010"


Journal ArticleDOI
01 Jun 2010
TL;DR: A brief overview of clustering is provided, well known clustering methods are summarized, the major challenges and key issues in designing clustering algorithms are discussed, and some of the emerging and useful research directions are pointed out.
Abstract: Organizing data into sensible groupings is one of the most fundamental modes of understanding and learning. As an example, a common scheme of scientific classification puts organisms into a system of ranked taxa: domain, kingdom, phylum, class, etc. Cluster analysis is the formal study of methods and algorithms for grouping, or clustering, objects according to measured or perceived intrinsic characteristics or similarity. Cluster analysis does not use category labels that tag objects with prior identifiers, i.e., class labels. The absence of category information distinguishes data clustering (unsupervised learning) from classification or discriminant analysis (supervised learning). The aim of clustering is to find structure in data and is therefore exploratory in nature. Clustering has a long and rich history in a variety of scientific fields. One of the most popular and simple clustering algorithms, K-means, was first published in 1955. In spite of the fact that K-means was proposed over 50 years ago and thousands of clustering algorithms have been published since then, K-means is still widely used. This speaks to the difficulty in designing a general purpose clustering algorithm and the ill-posed problem of clustering. We provide a brief overview of clustering, summarize well known clustering methods, discuss the major challenges and key issues in designing clustering algorithms, and point out some of the emerging and useful research directions, including semi-supervised clustering, ensemble clustering, simultaneous feature selection during data clustering, and large scale data clustering.
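
Since the abstract centers on K-means, a minimal NumPy sketch of Lloyd's algorithm may be useful; the function and variable names here are illustrative, not from the paper:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and
    centroid update until the centroids stop moving."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points
        # (assumes no cluster empties out, a known failure mode of K-means).
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```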

6,601 citations


Book
01 Oct 2010
TL;DR: Partial least squares (PLS) was not originally designed as a tool for statistical discrimination as discussed by the authors, but applied scientists routinely use PLS for classification and there is substantial empirical evidence to suggest that it performs well in that role.
Abstract: Partial least squares (PLS) was not originally designed as a tool for statistical discrimination. In spite of this, applied scientists routinely use PLS for classification and there is substantial empirical evidence to suggest that it performs well in that role. The interesting question is: why can a procedure that is principally designed for overdetermined regression problems locate and emphasize group structure? Using PLS in this manner has heuristic support owing to the relationship between PLS and canonical correlation analysis (CCA) and the relationship, in turn, between CCA and linear discriminant analysis (LDA). This paper replaces the heuristics with a formal statistical explanation. As a consequence, it will become clear that PLS is to be preferred over PCA when discrimination is the goal and dimension reduction is needed. Copyright © 2003 John Wiley & Sons, Ltd.
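
As a sketch of how PLS is used for discrimination in practice (often called PLS-DA), one common recipe regresses a class-indicator matrix on X and assigns each test point to the class with the largest predicted score. This is a generic scikit-learn illustration, not the paper's formal development:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def pls_da_fit_predict(X_train, y_train, X_test, n_components=2):
    """PLS-DA: regress a one-hot class-indicator matrix on X, then
    classify test points by the largest predicted indicator score."""
    classes = np.unique(y_train)
    Y = (y_train[:, None] == classes[None, :]).astype(float)  # one-hot targets
    pls = PLSRegression(n_components=n_components).fit(X_train, Y)
    scores = pls.predict(X_test)
    return classes[scores.argmax(axis=1)]
```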

2,067 citations


Journal ArticleDOI
TL;DR: In this work, a versatile signal processing and analysis framework for Electroencephalogram (EEG) was proposed and a set of statistical features was extracted from the sub-bands to represent the distribution of wavelet coefficients.
Abstract: In this work, we proposed a versatile signal processing and analysis framework for the electroencephalogram (EEG). Within this framework, the signals were decomposed into frequency sub-bands using the discrete wavelet transform (DWT), and a set of statistical features was extracted from the sub-bands to represent the distribution of wavelet coefficients. Principal component analysis (PCA), independent component analysis (ICA) and linear discriminant analysis (LDA) were used to reduce the dimension of the data. These features were then used as input to a support vector machine (SVM) with two discrete outputs: epileptic seizure or not. The classification performance of the different methods is presented and compared to demonstrate the effectiveness of the classification process. These findings are presented as an example of a method for training and testing a seizure prediction method on data from individual petit mal epileptic patients. Given the heterogeneity of epilepsy, it is likely that methods of this type will be required to configure intelligent devices for treating epilepsy to each individual's neurophysiology prior to clinical operation.
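
A rough sketch of this pipeline with PyWavelets and scikit-learn (the exact sub-band statistics, decomposition level, and SVM settings are placeholders, not the paper's choices):

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA
from sklearn.svm import SVC

def dwt_features(signal, wavelet="db4", level=4):
    """Statistical features per DWT sub-band: mean, standard deviation,
    and mean absolute value of the wavelet coefficients."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    feats = []
    for band in coeffs:
        feats += [band.mean(), band.std(), np.abs(band).mean()]
    return np.array(feats)

# Hypothetical usage: epochs is (n_trials, n_samples), y holds 0/1 seizure labels.
# X = np.array([dwt_features(e) for e in epochs])
# X_red = PCA(n_components=10).fit_transform(X)   # or ICA / LDA
# clf = SVC(kernel="rbf").fit(X_red, y)
```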

1,010 citations


Book ChapterDOI
01 Jan 2010
TL;DR: In this article, the assumption of Gaussianity for the measurement error combined with the maximum likelihood principle could be emphasized to promote the least squares criterion for nonlinear regression problems; considering classification as a regression problem towards estimating class posterior probabilities, least squares has been employed to train neural network and other classifier topologies to approximate correct labels.
Abstract: INTRODUCTION Learning systems depend on three interrelated components: topologies, cost/performance functions, and learning algorithms. Topologies provide the constraints for the mapping, and the learning algorithms offer the means to find an optimal solution; but the solution is optimal with respect to what? Optimality is characterized by the criterion and in neural network literature, this is the least addressed component, yet it has a decisive influence in generalization performance. Certainly, the assumptions behind the selection of a criterion should be better understood and investigated. Traditionally, least squares has been the benchmark criterion for regression problems; considering classification as a regression problem towards estimating class posterior probabilities, least squares has been employed to train neural network and other classifier topologies to approximate correct labels. The main motivation to utilize least squares in regression simply comes from the intellectual comfort this criterion provides due to its success in traditional linear least squares regression applications, which can be reduced to solving a system of linear equations. For nonlinear regression, the assumption of Gaussianity for the measurement error combined with the maximum likelihood principle could be emphasized to promote this criterion. In nonparametric regression, the least squares principle leads to the conditional expectation solution, which is intuitively appealing. Although these are good reasons to use the mean squared error as the cost, it is inherently linked to the assumptions and habits stated above. Consequently, there is information in the error signal that is not captured during the training of nonlinear adaptive systems under non-Gaussian distribution conditions when one insists on second-order statistical criteria. This argument extends to other linear-second-order techniques such as principal component analysis (PCA), linear discriminant analysis (LDA), and canonical correlation analysis (CCA). Recent work tries to generalize these techniques to nonlinear scenarios by utilizing kernel techniques or other heuristics. This raises the question: what other alternative cost functions could be used to train adaptive systems and how could we establish rigorous techniques for extending useful concepts from linear and second-order statistical techniques to nonlinear and higher-order statistical learning methodologies?
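
The Gaussianity argument mentioned above is the standard one-line derivation; for i.i.d. Gaussian measurement errors, the negative log-likelihood is (textbook material, not the chapter's own notation):

```latex
% Negative log-likelihood with e_i = y_i - f(x_i; \theta),
% e_i \sim \mathcal{N}(0, \sigma^2) i.i.d.:
-\log L(\theta)
  = \frac{N}{2}\log\!\left(2\pi\sigma^{2}\right)
  + \frac{1}{2\sigma^{2}} \sum_{i=1}^{N} \bigl(y_i - f(x_i;\theta)\bigr)^{2}
% For fixed \sigma, maximizing the likelihood over \theta is therefore
% exactly minimizing the mean squared error.
```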

615 citations


Journal ArticleDOI
01 Mar 2010
TL;DR: A novel emotion evocation and EEG-based feature extraction technique is presented, in which the mirror neuron system concept was adapted to efficiently foster emotion induction by the process of imitation, justifying the efficiency of the proposed approach.
Abstract: Electroencephalogram (EEG)-based emotion recognition is a relatively new field in the affective computing area with challenging issues regarding the induction of the emotional states and the extraction of the features in order to achieve optimum classification performance. In this paper, a novel emotion evocation and EEG-based feature extraction technique is presented. In particular, the mirror neuron system concept was adapted to efficiently foster emotion induction by the process of imitation. In addition, higher order crossings (HOC) analysis was employed for the feature extraction scheme and a robust classification method, namely HOC-emotion classifier (HOC-EC), was implemented testing four different classifiers [quadratic discriminant analysis (QDA), k-nearest neighbor, Mahalanobis distance, and support vector machines (SVMs)], in order to accomplish efficient emotion recognition. Through a series of facial expression image projections, EEG data were collected from 16 healthy subjects using only 3 EEG channels, namely Fp1, Fp2, and a bipolar channel of the F3 and F4 positions according to the 10-20 system. Two scenarios were examined, using EEG data from a single channel and from combined channels, respectively. Compared with other feature extraction methods, HOC-EC appears to outperform them, achieving 62.3% (using QDA) and 83.33% (using SVM) classification accuracy for the single-channel and combined-channel cases, respectively, differentiating among the six basic emotions, i.e., happiness, surprise, anger, fear, disgust, and sadness. As the emotion class-set reduces its dimension, HOC-EC converges toward the maximum classification rate (100% for five or fewer emotions), justifying the efficiency of the proposed approach. This could facilitate the integration of HOC-EC into human-machine interfaces, such as pervasive healthcare systems, enhancing their affective character and providing information about the user's emotional status (e.g., identifying the user's emotion experiences, recurring affective states, time-dependent emotional trends).
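
Higher order crossings count the zero crossings of increasingly differenced versions of a zero-mean signal; a rough NumPy sketch of the feature extraction (the paper's exact filter family and order are simplified here):

```python
import numpy as np

def hoc_features(x, order=10):
    """Higher order crossings: zero-crossing counts of the signal after
    applying the difference filter 0, 1, ..., order-1 times."""
    z = np.asarray(x, dtype=float) - np.mean(x)
    feats = []
    for _ in range(order):
        signs = np.sign(z)
        feats.append(int(np.count_nonzero(signs[1:] != signs[:-1])))
        z = np.diff(z)  # next-order difference filter
    return np.array(feats)
```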

542 citations


Journal ArticleDOI
TL;DR: Classifiers' accuracy at decoding the category of visual objects from response patterns in human early visual and inferior temporal cortex, acquired in an event-related design with BOLD fMRI at 3T, is compared; linear decoders based on t-value patterns may perform best.

465 citations


Journal ArticleDOI
TL;DR: A unified manifold learning framework for semi-supervised and unsupervised dimension reduction that employs a simple but effective linear regression function to map new data points, modeling the mismatch between h(X) and F.
Abstract: We propose a unified manifold learning framework for semi-supervised and unsupervised dimension reduction by employing a simple but effective linear regression function to map the new data points. For semi-supervised dimension reduction, we aim to find the optimal prediction labels F for all the training samples X, the linear regression function h(X) and the regression residue F0 = F - h(X) simultaneously. Our new objective function integrates two terms related to label fitness and manifold smoothness as well as a flexible penalty term defined on the residue F0. Our Semi-Supervised learning framework, referred to as flexible manifold embedding (FME), can effectively utilize label information from labeled data as well as a manifold structure from both labeled and unlabeled data. By modeling the mismatch between h(X) and F, we show that FME relaxes the hard linear constraint F = h(X) in manifold regularization (MR), making it better cope with the data sampled from a nonlinear manifold. In addition, we propose a simplified version (referred to as FME/U) for unsupervised dimension reduction. We also show that our proposed framework provides a unified view to explain and understand many semi-supervised, supervised and unsupervised dimension reduction techniques. Comprehensive experiments on several benchmark databases demonstrate the significant improvement over existing dimension reduction algorithms.
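
Schematically, and up to weighting details, the FME objective described above can be written as follows, where h(X) = XᵀW + 1bᵀ is the linear regression function and Y holds the given labels (notation reconstructed from the abstract, so treat it as a sketch):

```latex
\min_{F,\,W,\,b}\;
  \underbrace{\operatorname{tr}\bigl((F - Y)^{\top} U (F - Y)\bigr)}_{\text{label fitness}}
  + \underbrace{\operatorname{tr}\bigl(F^{\top} L F\bigr)}_{\text{manifold smoothness}}
  + \mu\Bigl(\lVert W\rVert^{2}
  + \gamma\,\underbrace{\bigl\lVert X^{\top}W + \mathbf{1}b^{\top} - F\bigr\rVert^{2}}_{\text{residue } F_0}\Bigr)
% Letting \gamma \to \infty recovers the hard constraint F = h(X) of
% manifold regularization (MR).
```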

435 citations


Journal ArticleDOI
TL;DR: This paper presents a family of subspace learning algorithms based on a new form of regularization, which transfers the knowledge gained in training samples to testing samples, and minimizes the Bregman divergence between the distribution of training samples and that of testing samples in the selected subspace.
Abstract: Regularization principles [31] lead to approximation schemes for dealing with various learning problems, e.g., regularization of the norm in a reproducing kernel Hilbert space for ill-posed problems. In this paper, we present a family of subspace learning algorithms based on a new form of regularization, which transfers the knowledge gained from training samples to testing samples. In particular, the new regularization minimizes the Bregman divergence between the distribution of training samples and that of testing samples in the selected subspace, so it boosts the performance when training and testing samples are not independent and identically distributed. To test the effectiveness of the proposed regularization, we introduce it to popular subspace learning algorithms, e.g., principal components analysis (PCA) for cross-domain face modeling; and Fisher's linear discriminant analysis (FLDA), locality preserving projections (LPP), marginal Fisher's analysis (MFA), and discriminative locality alignment (DLA) for cross-domain face recognition and text categorization. Finally, we present experimental evidence on both face image data sets and text data sets, suggesting that the proposed Bregman divergence-based regularization is effective in dealing with cross-domain learning problems.

430 citations


Journal ArticleDOI
TL;DR: The average classification rate and subsets of emotions classification rate of two simple pattern classification methods, K Nearest Neighbor (KNN) and Linear Discriminant Analysis (LDA), are presented for justifying the performance of the emotion recognition system.
Abstract: In this paper, we summarize human emotion recognition using different sets of electroencephalogram (EEG) channels and the discrete wavelet transform. An audio-visual induction based protocol has been designed with more dynamic emotional content for inducing discrete emotions (disgust, happy, surprise, fear and neutral). EEG signals are collected from 20 subjects using 64 electrodes placed over the entire scalp according to the International 10-10 system. The raw EEG signals are preprocessed using the Surface Laplacian (SL) filtering method and decomposed into three different frequency bands (alpha, beta and gamma) using the Discrete Wavelet Transform (DWT). We have used the “db4” wavelet function for deriving a set of conventional and modified energy based features from the EEG signals for classifying emotions. Two simple pattern classification methods, K Nearest Neighbor (KNN) and Linear Discriminant Analysis (LDA), are used and their performances are compared for emotional state classification. The experimental results indicate that one of the proposed features (ALREE) gives the maximum average classification rate of 83.26% using KNN and 75.21% using LDA, compared to those of the conventional features. Finally, we present the average classification rate and the subsets-of-emotions classification rate of these two classifiers to justify the performance of our emotion recognition system.

408 citations


Journal ArticleDOI
TL;DR: This paper proposes local Gabor XOR patterns (LGXP), which encodes the Gabor phase by using the local XOR pattern (LXP) operator, and introduces block-based Fisher's linear discriminant (BFLD) to reduce the dimensionality of the proposed descriptor and at the same time enhance its discriminative power.
Abstract: Gabor features have been known to be effective for face recognition. However, only a few approaches utilize phase feature and they usually perform worse than those using magnitude feature. To investigate the potential of Gabor phase and its fusion with magnitude for face recognition, in this paper, we first propose local Gabor XOR patterns (LGXP), which encodes the Gabor phase by using the local XOR pattern (LXP) operator. Then, we introduce block-based Fisher's linear discriminant (BFLD) to reduce the dimensionality of the proposed descriptor and at the same time enhance its discriminative power. Finally, by using BFLD, we fuse local patterns of Gabor magnitude and phase for face recognition. We evaluate our approach on FERET and FRGC 2.0 databases. In particular, we perform comparative experimental studies of different local Gabor patterns. We also make a detailed comparison of their combinations with BFLD, as well as the fusion of different descriptors by using BFLD. Extensive experimental results verify the effectiveness of our LGXP descriptor and also show that our fusion approach outperforms most of the state-of-the-art approaches.
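
To make the LXP operator concrete, here is a small NumPy sketch that quantizes a Gabor phase map into discrete ranges and XORs each pixel's code against its eight neighbours (the number of phase ranges and the neighbour ordering are illustrative):

```python
import numpy as np

def local_xor_pattern(phase, n_ranges=4):
    """LXP on a Gabor phase map: quantize the phase, then XOR
    (inequality-test) each pixel's code against its 8 neighbours and
    pack the comparisons into an 8-bit code."""
    two_pi = 2 * np.pi
    q = np.floor(n_ranges * (phase % two_pi) / two_pi).astype(np.int32)
    h, w = q.shape
    center = q[1:-1, 1:-1]
    pattern = np.zeros((h - 2, w - 2), dtype=np.int32)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = q[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        pattern += (neighbour != center).astype(np.int32) << bit
    return pattern  # block-wise histograms of these codes form the descriptor
```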

390 citations


Journal ArticleDOI
TL;DR: This paper proposes a new image clustering algorithm, referred to as clustering using local discriminant models and global integration (LDMGI), and shows that LDMGI shares a similar objective function with the spectral clustering (SC) algorithms, e.g., normalized cut (NCut).
Abstract: In this paper, we propose a new image clustering algorithm, referred to as clustering using local discriminant models and global integration (LDMGI). To deal with the data points sampled from a nonlinear manifold, for each data point, we construct a local clique comprising this data point and its neighboring data points. Inspired by the Fisher criterion, we use a local discriminant model for each local clique to evaluate the clustering performance of samples within the local clique. To obtain the clustering result, we further propose a unified objective function to globally integrate the local models of all the local cliques. With the unified objective function, spectral relaxation and spectral rotation are used to obtain the binary cluster indicator matrix for all the samples. We show that LDMGI shares a similar objective function with the spectral clustering (SC) algorithms, e.g., normalized cut (NCut). In contrast to NCut, in which the Laplacian matrix is directly calculated based upon a Gaussian function, a new Laplacian matrix is learnt in LDMGI by exploiting both manifold structure and local discriminant information. We also prove that K-means and discriminative K-means (DisKmeans) are both special cases of LDMGI. Extensive experiments on several benchmark image datasets demonstrate the effectiveness of LDMGI. We observe in the experiments that LDMGI is more robust to algorithmic parameters than NCut. Thus, LDMGI is more appealing for real image clustering applications in which the ground truth is generally not available for tuning algorithmic parameters.

Journal ArticleDOI
TL;DR: A multimodal affect detector that combines conversational cues, gross body language, and facial features, and linear discriminant analyses to discriminate between naturally occurring experiences of boredom, engagement/flow, confusion, frustration, delight, and neutral is developed and evaluated.
Abstract: We developed and evaluated a multimodal affect detector that combines conversational cues, gross body language, and facial features. The multimodal affect detector uses feature-level fusion to combine the sensory channels and linear discriminant analyses to discriminate between naturally occurring experiences of boredom, engagement/flow, confusion, frustration, delight, and neutral. Training and validation data for the affect detector were collected in a study where 28 learners completed a 32- min. tutorial session with AutoTutor, an intelligent tutoring system with conversational dialogue. Classification results supported a channel × judgment type interaction, where the face was the most diagnostic channel for spontaneous affect judgments (i.e., at any time in the tutorial session), while conversational cues were superior for fixed judgments (i.e., every 20 s in the session). The analyses also indicated that the accuracy of the multichannel model (face, dialogue, and posture) was statistically higher than the best single-channel model for the fixed but not spontaneous affect expressions. However, multichannel models reduced the discrepancy (i.e., variance in the precision of the different emotions) of the discriminant models for both judgment types. The results also indicated that the combination of channels yielded superadditive effects for some affective states, but additive, redundant, and inhibitory effects for others. We explore the structure of the multimodal linear discriminant models and discuss the implications of some of our major findings.

Journal ArticleDOI
TL;DR: A simulation study using data models and analysis of real microarray data shows that for small samples the root mean square differences of the estimated and true metrics are considerable, and even for large samples, there is only weak correlation between the true and estimated metrics.
Abstract: Motivation: The receiver operator characteristic (ROC) curves are commonly used in biomedical applications to judge the performance of a discriminant across varying decision thresholds. The estimated ROC curve depends on the true positive rate (TPR) and false positive rate (FPR), with the key metric being the area under the curve (AUC). With small samples these rates need to be estimated from the training data, so a natural question arises: How well do the estimates of the AUC, TPR and FPR compare with the true metrics? Results: Through a simulation study using data models and analysis of real microarray data, we show that (i) for small samples the root mean square differences of the estimated and true metrics are considerable; (ii) even for large samples, there is only weak correlation between the true and estimated metrics; and (iii) generally, there is weak regression of the true metric on the estimated metric. For classification rules, we consider linear discriminant analysis, linear support vector machine (SVM) and radial basis function SVM. For error estimation, we consider resubstitution, three kinds of cross-validation and bootstrap. Using resampling, we show the unreliability of some published ROC results. Availability: Companion web site at http://compbio.tgen.org/paper_supp/ROC/roc.html Contact: edward@mail.ece.tamu.edu
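
The flavor of the experiment is easy to reproduce with scikit-learn; a toy sketch (synthetic Gaussian class-conditional model, LDA as the discriminant, a large hold-out set standing in for the true distribution):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def sample(n_per_class, d=10, shift=0.5):
    """Two Gaussian classes separated by `shift` along every dimension."""
    X = np.vstack([rng.normal(0.0, 1.0, (n_per_class, d)),
                   rng.normal(shift, 1.0, (n_per_class, d))])
    y = np.repeat([0, 1], n_per_class)
    return X, y

X_small, y_small = sample(15)     # small training sample, as in the study
X_big, y_big = sample(5000)       # proxy for the true distribution

clf = LinearDiscriminantAnalysis().fit(X_small, y_small)
true_auc = roc_auc_score(y_big, clf.decision_function(X_big))
cv_auc = cross_val_score(LinearDiscriminantAnalysis(), X_small, y_small,
                         cv=5, scoring="roc_auc").mean()
print(f"true AUC ~ {true_auc:.3f}, cross-validated estimate ~ {cv_auc:.3f}")
```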

Book
09 Aug 2010
TL;DR: This chapter discusses the need for Statistics in Experimental Planning and Analysis, and some Basic Properties of a Distribution (Mean, Variance and Standard Deviation) and the importance of Relationships between Two or More Variables.
Abstract: Preface. Acknowledgements.
1 Introduction: 1.1 The Distinction between Trained Sensory Panels and Consumer Panels. 1.2 The Need for Statistics in Experimental Planning and Analysis. 1.3 Scales and Data Types. 1.4 Organisation of the Book.
2 Important Data Collection Techniques for Sensory and Consumer Studies: 2.1 Sensory Panel Methodologies. 2.2 Consumer Tests.
PART I: PROBLEM DRIVEN.
3 Quality Control of Sensory Profile Data: 3.1 General Introduction. 3.2 Visual Inspection of Raw Data. 3.3 Mixed Model ANOVA for Assessing the Importance of the Sensory Attributes. 3.4 Overall Assessment of Assessor Differences Using All Variables Simultaneously. 3.5 Methods for Detecting Differences in Use of the Scale. 3.6 Comparing the Assessors' Ability to Detect Differences between the Products. 3.7 Relations between Individual Assessor Ratings and the Panel Average. 3.8 Individual Line Plots for Detailed Inspection of Assessors. 3.9 Miscellaneous Methods.
4 Correction Methods and Other Remedies for Improving Sensory Profile Data: 4.1 Introduction. 4.2 Correcting for Different Use of the Scale. 4.3 Computing Improved Panel Averages. 4.4 Pre-processing of Data for Three-Way Analysis.
5 Detecting and Studying Sensory Differences and Similarities between Products: 5.1 Introduction. 5.2 Analysing Sensory Profile Data: Univariate Case. 5.3 Analysing Sensory Profile Data: Multivariate Case.
6 Relating Sensory Data to Other Measurements: 6.1 Introduction. 6.2 Estimating Relations between Consensus Profiles and External Data. 6.3 Estimating Relations between Individual Sensory Profiles and External Data.
7 Discrimination and Similarity Testing: 7.1 Introduction. 7.2 Analysis of Data from Basic Sensory Discrimination Tests. 7.3 Examples of Basic Discrimination Testing. 7.4 Power Calculations in Discrimination Tests. 7.5 Thurstonian Modelling: What Is It Really? 7.6 Similarity versus Difference Testing. 7.7 Replications: What to Do? 7.8 Designed Experiments, Extended Analysis and Other Test Protocols.
8 Investigating Important Factors Influencing Food Acceptance and Choice: 8.1 Introduction. 8.2 Preliminary Analysis of Consumer Data Sets (Raw Data Overview). 8.3 Experimental Designs for Rating Based Consumer Studies. 8.4 Analysis of Categorical Effect Variables. 8.5 Incorporating Additional Information about Consumers. 8.6 Modelling of Factors as Continuous Variables. 8.7 Reliability/Validity Testing for Rating Based Methods. 8.8 Rank Based Methodology. 8.9 Choice Based Conjoint Analysis. 8.10 Market Share Simulation.
9 Preference Mapping for Understanding Relations between Sensory Product Attributes and Consumer Acceptance: 9.1 Introduction. 9.2 External and Internal Preference Mapping. 9.3 Examples of Linear Preference Mapping. 9.4 Ideal Point Preference Mapping. 9.5 Selecting Samples for Preference Mapping. 9.6 Incorporating Additional Consumer Attributes. 9.7 Combining Preference Mapping with Additional Information about the Samples.
10 Segmentation of Consumer Data: 10.1 Introduction. 10.2 Segmentation of Rating Data. 10.3 Relating Segments to Consumer Attributes.
PART II: METHOD ORIENTED.
11 Basic Statistics: 11.1 Basic Concepts and Principles. 11.2 Histogram, Frequency and Probability. 11.3 Some Basic Properties of a Distribution (Mean, Variance and Standard Deviation). 11.4 Hypothesis Testing and Confidence Intervals for the Mean. 11.5 Statistical Process Control. 11.6 Relationships between Two or More Variables. 11.7 Simple Linear Regression. 11.8 Binomial Distribution and Tests. 11.9 Contingency Tables and Homogeneity Testing.
12 Design of Experiments for Sensory and Consumer Data: 12.1 Introduction. 12.2 Important Concepts and Distinctions. 12.3 Full Factorial Designs. 12.4 Fractional Factorial Designs: Screening Designs. 12.5 Randomised Blocks and Incomplete Block Designs. 12.6 Split-Plot and Nested Designs. 12.7 Power of Experiments.
13 ANOVA for Sensory and Consumer Data: 13.1 Introduction. 13.2 One-Way ANOVA. 13.3 Single Replicate Two-Way ANOVA. 13.4 Two-Way ANOVA with Randomised Replications. 13.5 Multi-Way ANOVA. 13.6 ANOVA for Fractional Factorial Designs. 13.7 Fixed and Random Effects in ANOVA: Mixed Models. 13.8 Nested and Split-Plot Models. 13.9 Post Hoc Testing.
14 Principal Component Analysis: 14.1 Interpretation of Complex Data Sets by PCA. 14.2 Data Structures for the PCA. 14.3 PCA: Description of the Method. 14.4 Projections and Linear Combinations. 14.5 The Scores and Loadings Plots. 14.6 Correlation Loadings Plot. 14.7 Standardisation. 14.8 Calculations and Missing Values. 14.9 Validation. 14.10 Outlier Diagnostics. 14.11 Tucker-1. 14.12 The Relation between PCA and Factor Analysis (FA).
15 Multiple Regression, Principal Components Regression and Partial Least Squares Regression: 15.1 Introduction. 15.2 Multivariate Linear Regression. 15.3 The Relation between ANOVA and Regression Analysis. 15.4 Linear Regression Used for Estimating Polynomial Models. 15.5 Combining Continuous and Categorical Variables. 15.6 Variable Selection for Multiple Linear Regression. 15.7 Principal Components Regression (PCR). 15.8 Partial Least Squares (PLS) Regression. 15.9 Model Validation: Prediction Performance. 15.10 Model Diagnostics and Outlier Detection. 15.11 Discriminant Analysis. 15.12 Generalised Linear Models, Logistic Regression and Multinomial Regression.
16 Cluster Analysis: Unsupervised Classification: 16.1 Introduction. 16.2 Hierarchical Clustering. 16.3 Partitioning Methods. 16.4 Cluster Analysis for Matrices.
17 Miscellaneous Methodologies: 17.1 Three-Way Analysis of Sensory Data. 17.2 Relating Three-Way Data to Two-Way Data. 17.3 Path Modelling. 17.4 MDS-Multidimensional Scaling. 17.5 Analysing Rank Data. 17.6 The L-PLS Method. 17.7 Missing Value Estimation.
Nomenclature, Symbols and Abbreviations. Index.

Journal ArticleDOI
TL;DR: In this paper, the authors exploited environmental and multi-temporal landslide information for an area in Umbria, Italy, to produce four single and two combined landslide susceptibility zonations.

Journal ArticleDOI
TL;DR: An efficient road sign recognition system is built, based on a conventional nearest neighbour classifier and a simple temporal integration scheme, which demonstrates a competitive performance in the experiments involving real traffic video.

Journal ArticleDOI
TL;DR: This paper proposes a semi-supervised dimensionality reduction method which preserves the global structure of unlabeled samples in addition to separating labeled samples in different classes from each other and shows the usefulness of SELF through experiments with benchmark and real-world document classification datasets.
Abstract: When only a small number of labeled samples are available, supervised dimensionality reduction methods tend to perform poorly because of overfitting. In such cases, unlabeled samples could be useful in improving the performance. In this paper, we propose a semi-supervised dimensionality reduction method which preserves the global structure of unlabeled samples in addition to separating labeled samples in different classes from each other. The proposed method, which we call SEmi-supervised Local Fisher discriminant analysis (SELF), has an analytic form of the globally optimal solution and it can be computed based on eigen-decomposition. We show the usefulness of SELF through experiments with benchmark and real-world document classification datasets.
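
A rough sketch of the idea, using global Fisher scatter rather than SELF's local formulation (so this is a simplification of the published method): blend labeled-data scatter with the total scatter of all samples and solve a single generalized eigenproblem.

```python
import numpy as np
from scipy.linalg import eigh

def self_like_embedding(X_lab, y, X_unlab, beta=0.5, dim=2):
    """beta = 0 gives (global) Fisher discriminant analysis on the labeled
    data; beta = 1 gives PCA on all data; 0 < beta < 1 trades them off."""
    X_all = np.vstack([X_lab, X_unlab])
    Xc_all = X_all - X_all.mean(axis=0)
    S_t = Xc_all.T @ Xc_all                         # total scatter (PCA part)
    d = X_lab.shape[1]
    S_b, S_w = np.zeros((d, d)), np.zeros((d, d))
    m = X_lab.mean(axis=0)
    for c in np.unique(y):
        Xc = X_lab[y == c]
        mc = Xc.mean(axis=0)
        S_b += len(Xc) * np.outer(mc - m, mc - m)   # between-class scatter
        S_w += (Xc - mc).T @ (Xc - mc)              # within-class scatter
    A = (1 - beta) * S_b + beta * S_t
    B = (1 - beta) * S_w + beta * np.eye(d)
    vals, vecs = eigh(A, B)                         # generalized eigenproblem
    return vecs[:, np.argsort(vals)[::-1][:dim]]    # top-`dim` directions
```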

Journal ArticleDOI
TL;DR: NIR spectroscopy was found to be effective for gasoline classification purposes when compared with nuclear magnetic resonance (NMR) spectroscopy or gas chromatography (GC), and the KNN, SVM, and PNN classification techniques were found to be among the most effective ones.

Journal ArticleDOI
TL;DR: This work proposes algorithms for feature extraction and classification based on orthogonal or nonnegative tensor (multi-array) decompositions, and higher order (multilinear) discriminant analysis (HODA), whereby input data are considered as tensors instead of more conventional vector or matrix representations.
Abstract: Feature extraction and selection are key factors in model reduction, classification and pattern recognition problems. This is especially important for input data with large dimensions such as brain recording or multiview images, where appropriate feature extraction is a prerequisite to classification. To ensure that the reduced dataset contains maximum information about input data we propose algorithms for feature extraction and classification. This is achieved based on orthogonal or nonnegative tensor (multi-array) decompositions, and higher order (multilinear) discriminant analysis (HODA), whereby input data are considered as tensors instead of more conventional vector or matrix representations. The developed algorithms are verified on benchmark datasets, using constraints imposed on tensors and/or factor matrices such as orthogonality and nonnegativity.

Journal ArticleDOI
12 Jan 2010
TL;DR: A novel pattern recognition based myoelectric control system that uses parallel binary classification and class specific thresholds that is robust, easily configured, and highly usable is described.
Abstract: This paper describes a novel pattern recognition based myoelectric control system that uses parallel binary classification and class specific thresholds. The system was designed with an intuitive configuration interface, similar to existing conventional myoelectric control systems. The system was assessed quantitatively with a classification error metric and functionally with a clothespin test implemented in a virtual environment. For each case, the proposed system was compared to a state-of-the-art pattern recognition system based on linear discriminant analysis and a conventional myoelectric control scheme with mode switching. These assessments showed that the proposed control system had a higher classification error (p < 0.001) but yielded a more controllable myoelectric control system (p < 0.001) as measured through a clothespin usability test implemented in a virtual environment. Furthermore, the system was computationally simple and applicable for real-time embedded implementation. This work provides the basis for a clinically viable pattern recognition based myoelectric control system which is robust, easily configured, and highly usable.
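
The control structure (not the authors' exact base classifiers or threshold rules) might look like the following sketch: one binary LDA per motion class, each with its own activation threshold, and a "no movement" output whenever zero or several classes fire.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

class ParallelBinaryController:
    """One binary classifier per motion class with a class-specific
    threshold; integer class labels are assumed, -1 means 'no movement'."""

    def __init__(self, thresholds):
        self.thresholds = thresholds   # dict: class label -> threshold
        self.models = {}

    def fit(self, X, y):
        for c in np.unique(y):
            # One-vs-rest binary problem for each motion class.
            self.models[c] = LinearDiscriminantAnalysis().fit(X, (y == c).astype(int))
        return self

    def predict(self, X):
        scores = {c: m.predict_proba(X)[:, 1] for c, m in self.models.items()}
        out = np.full(len(X), -1)
        for i in range(len(X)):
            active = [c for c, s in scores.items() if s[i] >= self.thresholds[c]]
            if len(active) == 1:       # act only on an unambiguous winner
                out[i] = active[0]
        return out
```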

Journal ArticleDOI
TL;DR: This work shows that with an appropriate combination of kernels a significant boost in classification performance is possible, and indicates the utility of active learning with probabilistic predictive models, especially when the amount of training data labels that may be sought for a category is ultimately very small.
Abstract: Discriminative methods for visual object category recognition are typically non-probabilistic, predicting class labels but not directly providing an estimate of uncertainty. Gaussian Processes (GPs) provide a framework for deriving regression techniques with explicit uncertainty models; we show here how Gaussian Processes with covariance functions defined based on a Pyramid Match Kernel (PMK) can be used for probabilistic object category recognition. Our probabilistic formulation provides a principled way to learn hyperparameters, which we utilize to learn an optimal combination of multiple covariance functions. It also offers confidence estimates at test points, and naturally allows for an active learning paradigm in which points are optimally selected for interactive labeling. We show that with an appropriate combination of kernels a significant boost in classification performance is possible. Further, our experiments indicate the utility of active learning with probabilistic predictive models, especially when the amount of training data labels that may be sought for a category is ultimately very small.

Journal ArticleDOI
TL;DR: This paper takes advantage of the functional nature of the data-set and proposes a forecasting methodology based on functional statistics, using a functional clustering procedure to classify the daily load curves and defines a family of functional linear regression models.

Journal ArticleDOI
TL;DR: A novel anthropometric three dimensional (Anthroface 3D) face recognition algorithm, which is based on a systematically selected set of discriminatory structural characteristics of the human face derived from the existing scientific literature on facial anthropometry, is presented.
Abstract: We present a novel anthropometric three dimensional (Anthroface 3D) face recognition algorithm, which is based on a systematically selected set of discriminatory structural characteristics of the human face derived from the existing scientific literature on facial anthropometry. We propose a novel technique for automatically detecting 10 anthropometric facial fiducial points that are associated with these discriminatory anthropometric features. We isolate and employ unique textural and/or structural characteristics of these fiducial points, along with the established anthropometric facial proportions of the human face, for detecting them. Lastly, we develop a completely automatic face recognition algorithm that employs facial 3D Euclidean and geodesic distances between these 10 automatically located anthropometric facial fiducial points and a linear discriminant classifier. On a database of 1149 facial images of 118 subjects, we show that the standard deviation of the Euclidean distance of each automatically detected fiducial point from its manually identified position is less than 2.54 mm. We further show that the proposed Anthroface 3D recognition algorithm performs well (equal error rate of 1.98% and a rank 1 recognition rate of 96.8%), outperforms three of the existing benchmark 3D face recognition algorithms, and is robust to the observed fiducial point localization errors.
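
A minimal sketch of the final recognition stage, assuming the 10 fiducial points have already been detected (geodesic distances are omitted here; all names are hypothetical):

```python
import numpy as np
from itertools import combinations
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def distance_features(landmarks):
    """Pairwise Euclidean distances between fiducial points.
    landmarks: (n_points, 3) array of 3D fiducial coordinates."""
    return np.array([np.linalg.norm(landmarks[i] - landmarks[j])
                     for i, j in combinations(range(len(landmarks)), 2)])

# Hypothetical usage: faces is (n_faces, 10, 3), identities holds labels.
# X = np.array([distance_features(f) for f in faces])
# clf = LinearDiscriminantAnalysis().fit(X, identities)
```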

Proceedings ArticleDOI
14 Mar 2010
TL;DR: An algorithm to regularize the Common Spatial Patterns (CSP) and Linear Discriminant Analysis (LDA) algorithms based on the data from a subset of automatically selected subjects is proposed.
Abstract: A major limitation of Brain-Computer Interfaces (BCI) is their long calibration time, as much data from the user must be collected in order to tune the BCI for this target user. In this paper, we propose a new method to reduce this calibration time by using data from other subjects. More precisely, we propose an algorithm to regularize the Common Spatial Patterns (CSP) and Linear Discriminant Analysis (LDA) algorithms based on the data from a subset of automatically selected subjects. An evaluation of our approach showed that our method significantly outperformed the standard BCI design especially when the amount of data from the target user is small. Thus, our approach helps in reducing the amount of data needed to achieve a given performance level.
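
One simple instance of this kind of subject-to-subject regularization, shown for LDA only (the paper's method also regularizes CSP and selects the donor subjects automatically, which this covariance-shrinkage sketch omits):

```python
import numpy as np

def regularized_lda(X, y, cov_other, lam=0.5):
    """Binary LDA whose pooled covariance is shrunk toward a covariance
    pooled from other subjects' data; lam = 0 recovers subject-specific LDA."""
    X0, X1 = X[y == 0], X[y == 1]
    cov_subject = 0.5 * (np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False))
    cov = (1 - lam) * cov_subject + lam * cov_other
    w = np.linalg.solve(cov, X1.mean(axis=0) - X0.mean(axis=0))
    b = -w @ (X0.mean(axis=0) + X1.mean(axis=0)) / 2
    return w, b   # decide class 1 when X @ w + b > 0
```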

Journal ArticleDOI
TL;DR: A nonlinear SVM technique is applied in a highly heterogeneous sandstone reservoir to classify electrofacies and predict permeability distributions; statistical error analysis shows that the SVM method yields classification of the lithology and estimates of the permeability that are comparable or superior to those of the neural network methods.

Journal ArticleDOI
Taiping Zhang1, Bin Fang1, Yuan Yan Tang1, Zhaowei Shang1, Bin Xu1 
01 Feb 2010
TL;DR: Comparisons of experimental results on different data sets are given with respect to existing LDA extensions, including PCA + LDA, LDA via generalized singular value decomposition, regularized LDA, NLDA, and LDA via QR decomposition, which demonstrate the effectiveness of the proposed EDA method.
Abstract: Linear discriminant analysis (LDA) is well known as a powerful tool for discriminant analysis. In the case of a small training data set, however, it cannot directly be applied to high-dimensional data. This case is the so-called small-sample-size or undersampled problem. In this paper, we propose an exponential discriminant analysis (EDA) technique to overcome the undersampled problem. The advantages of EDA are that, compared with principal component analysis (PCA) + LDA, the EDA method can extract the most discriminant information that was contained in the null space of a within-class scatter matrix, and compared with another LDA extension, i.e., null-space LDA (NLDA), the discriminant information that was contained in the non-null space of the within-class scatter matrix is not discarded. Furthermore, EDA is equivalent to transforming original data into a new space by distance diffusion mapping, and then, LDA is applied in such a new space. As a result of diffusion mapping, the margin between different classes is enlarged, which is helpful in improving classification accuracy. Comparisons of experimental results on different data sets are given with respect to existing LDA extensions, including PCA + LDA, LDA via generalized singular value decomposition, regularized LDA, NLDA, and LDA via QR decomposition, which demonstrate the effectiveness of the proposed EDA method.
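
A compact sketch of EDA as described, with SciPy's matrix exponential applied to the scatter matrices (in practice the scatter matrices may need scaling before exponentiation to avoid overflow; that step is omitted here):

```python
import numpy as np
from scipy.linalg import expm, eigh

def eda_projection(X, y, dim):
    """Exponential discriminant analysis: solve the Fisher eigenproblem on
    expm(S_b) and expm(S_w); expm(S_w) is always full rank, so the
    undersampled (small-sample-size) case is handled."""
    d = X.shape[1]
    m = X.mean(axis=0)
    S_b, S_w = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        S_b += len(Xc) * np.outer(mc - m, mc - m)   # between-class scatter
        S_w += (Xc - mc).T @ (Xc - mc)              # within-class scatter
    vals, vecs = eigh(expm(S_b), expm(S_w))         # generalized eigenproblem
    return vecs[:, np.argsort(vals)[::-1][:dim]]
```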

Journal ArticleDOI
01 Jun 2010
TL;DR: The objective is to regulate the LPP space in a parametric manner and extract useful discriminant information from the whole feature space rather than a reduced projection subspace of principal component analysis, which results in better locality preserving power and higher recognition accuracy than the original LPP method.
Abstract: We propose in this paper a parametric regularized locality preserving projections (LPP) method for face recognition. Our objective is to regulate the LPP space in a parametric manner and extract useful discriminant information from the whole feature space rather than a reduced projection subspace of principal component analysis. This results in better locality preserving power and higher recognition accuracy than the original LPP method. Moreover, the proposed regularization method can easily be extended to other manifold learning algorithms and to effectively address the small sample size problem. Experimental results on two widely used face databases demonstrate the efficacy of the proposed method.

Journal ArticleDOI
TL;DR: Characteristics such as time effort, classifier comprehensibility and method intricacy are evaluated—aspects that determine the success of a classification technique among ecologists and conservation biologists as well as for the communication with managers and decision makers.

Journal Article
TL;DR: In this article, the authors propose a new parsimonious version of the classical multivariate normal linear model, yielding a maximum likelihood estimator (MLE) that is asymptotically less variable than the MLE based on the usual model.
Abstract: We propose a new parsimonious version of the classical multivariate normal linear model, yielding a maximum likelihood estimator (MLE) that is asymptotically less variable than the MLE based on the usual model. Our approach is based on the construction of a link between the mean function and the covariance matrix, using the minimal reducing subspace of the latter that accommodates the former. This leads to a multivariate regression model that we call the envelope model, where the number of parameters is maximally reduced. The MLE from the envelope model can be substantially less variable than the usual MLE, especially when the mean function varies in directions that are orthogonal to the directions of maximum variation for the covariance matrix.

Proceedings ArticleDOI
10 Dec 2010
TL;DR: iVisClassifier fully interacts with all the reduced dimensions obtained by LDA through parallel coordinates and a scatter plot, which significantly improves the interactivity and interpretability of LDA.
Abstract: We present an interactive visual analytics system for classification, iVisClassifier, based on a supervised dimension reduction method, linear discriminant analysis (LDA). Given high-dimensional data and associated cluster labels, LDA gives their reduced dimensional representation, which provides a good overview about the cluster structure. Instead of a single two- or three-dimensional scatter plot, iVisClassifier fully interacts with all the reduced dimensions obtained by LDA through parallel coordinates and a scatter plot. Furthermore, it significantly improves the interactivity and interpretability of LDA. LDA enables users to understand each of the reduced dimensions and how they influence the data by reconstructing the basis vector into the original data domain. By using heat maps, iVisClassifier gives an overview about the cluster relationship in terms of pairwise distances between cluster centroids both in the original space and in the reduced dimensional space. Equipped with these functionalities, iVisClassifier supports users' classification tasks in an efficient way. Using several facial image data, we show how the above analysis is performed.