
Showing papers on "Dimensionality reduction published in 2013"


Proceedings ArticleDOI
01 Dec 2013
TL;DR: JDA aims to jointly adapt both the marginal and conditional distributions in a principled dimensionality reduction procedure, constructing a new feature representation that is effective and robust under substantial distribution differences.
Abstract: Transfer learning is established as an effective technology in computer vision for leveraging rich labeled data in the source domain to build an accurate classifier for the target domain. However, most prior methods have not simultaneously reduced the difference in both the marginal distribution and conditional distribution between domains. In this paper, we put forward a novel transfer learning approach, referred to as Joint Distribution Adaptation (JDA). Specifically, JDA aims to jointly adapt both the marginal distribution and conditional distribution in a principled dimensionality reduction procedure, and constructs a new feature representation that is effective and robust under substantial distribution differences. Extensive experiments verify that JDA can significantly outperform several state-of-the-art methods on four types of cross-domain image classification problems.
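For intuition, here is a minimal numpy sketch of the marginal-distribution half of a JDA-style adaptation: minimize the maximum mean discrepancy (MMD) between the projected domains via a generalized eigenproblem. The function name and the regularizer `lam` are illustrative assumptions, not the authors' reference implementation, and the class-conditional MMD terms built from pseudo-labels are omitted for brevity.

```python
import numpy as np
from scipy.linalg import eigh

def jda_projection(Xs, Xt, k=20, lam=1.0):
    """Align source Xs and target Xt (rows = samples) in a k-dim subspace."""
    X = np.vstack([Xs, Xt]).T              # d x n data matrix, columns are samples
    ns, nt = len(Xs), len(Xt)
    n = ns + nt
    # MMD vector: matches the empirical means of the two domains
    e = np.concatenate([np.full(ns, 1.0 / ns), np.full(nt, -1.0 / nt)])
    M = np.outer(e, e)                     # marginal MMD matrix
    H = np.eye(n) - np.ones((n, n)) / n    # centering matrix (variance constraint)
    A = X @ M @ X.T + lam * np.eye(X.shape[0])
    B = X @ H @ X.T + 1e-6 * np.eye(X.shape[0])
    # small generalized eigenvalues = directions with small domain discrepancy
    vals, vecs = eigh(A, B)
    W = vecs[:, :k]                        # d x k projection
    return Xs @ W, Xt @ W
```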

1,542 citations


Proceedings ArticleDOI
23 Jun 2013
TL;DR: A novel approach to the pedestrian re-identification problem that uses metric learning to improve on state-of-the-art performance on standard public datasets; the method is an effective way to process observations comprising multiple shots, and it is non-iterative, so computation times are relatively modest.
Abstract: Metric learning methods, for person re-identification, estimate a scaling for distances in a vector space that is optimized for picking out observations of the same individual. This paper presents a novel approach to the pedestrian re-identification problem that uses metric learning to improve on state-of-the-art performance on standard public datasets. Very high dimensional features are extracted from the source color image. A first processing stage performs unsupervised PCA dimensionality reduction, constrained to maintain the redundancy in color-space representation. A second stage further reduces the dimensionality, using a Local Fisher Discriminant Analysis defined by a training set. A regularization step is introduced to avoid singular matrices during this stage. The experiments conducted on three publicly available datasets confirm that the proposed method outperforms the state of the art, including all other known metric learning methods. Furthermore, the method is an effective way to process observations comprising multiple shots, and it is non-iterative: the computation times are relatively modest. Finally, a novel statistic is derived to characterize the Match Characteristic: the normalized entropy reduction can be used to define the 'Proportion of Uncertainty Removed' (PUR). This measure is invariant to test set size and provides an intuitive indication of performance.
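As a rough sketch of the two-stage reduction described above, scikit-learn's `LinearDiscriminantAnalysis` with shrinkage can stand in for the paper's regularized Local Fisher Discriminant Analysis; the high-dimensional color features themselves are assumed given (random data here).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 1000))   # stand-in for high-dimensional color features
y = rng.integers(0, 20, size=400)  # person identities

Z = PCA(n_components=100).fit_transform(X)  # stage 1: unsupervised reduction
# stage 2: supervised reduction; shrinkage regularizes the scatter matrices,
# playing the role of the paper's regularization step
lda = LinearDiscriminantAnalysis(solver="eigen", shrinkage="auto").fit(Z, y)
Z2 = lda.transform(Z)
# Euclidean distances in Z2 approximate the learned metric for ranking matches
```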

607 citations


Journal ArticleDOI
TL;DR: Five beat classes of arrhythmia, as recommended by the Association for the Advancement of Medical Instrumentation (AAMI), were analyzed, and dimensionality-reduced features were fed to Support Vector Machine, neural network, and probabilistic neural network (PNN) classifiers for automated diagnosis.

586 citations


Journal ArticleDOI
TL;DR: A new family of tensor regression models is proposed that efficiently exploits the special structure of tensor covariates, reducing the ultrahigh dimensionality of high-throughput imaging data to a manageable level for efficient estimation and prediction.
Abstract: Classical regression methods treat covariates as a vector and estimate a corresponding vector of regression coefficients. Modern applications in medical imaging generate covariates of more complex form, such as multidimensional arrays (tensors). Traditional statistical and computational methods are proving insufficient for analysis of these high-throughput data due to their ultrahigh dimensionality as well as complex structure. In this article, we propose a new family of tensor regression models that efficiently exploit the special structure of tensor covariates. Under this framework, ultrahigh dimensionality is reduced to a manageable level, resulting in efficient estimation and prediction. A fast and highly scalable estimation algorithm is proposed for maximum likelihood estimation, and its associated asymptotic properties are studied. Effectiveness of the new methods is demonstrated on both synthetic and real MRI imaging data. Supplementary materials for this article are available online.
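The following toy sketch conveys the core computational idea for the simplest case: a rank-1 matrix (2-D tensor) coefficient fit by alternating least squares. The paper's framework is far more general (higher ranks and orders, GLM links), and all names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
p1, p2, n = 8, 8, 500
b1_true, b2_true = rng.normal(size=p1), rng.normal(size=p2)
M = rng.normal(size=(n, p1, p2))   # matrix-valued covariates (e.g., images)
y = np.einsum("nij,i,j->n", M, b1_true, b2_true) + 0.1 * rng.normal(size=n)

b1, b2 = np.ones(p1), np.ones(p2)
for _ in range(50):                            # alternating least squares
    Z1 = np.einsum("nij,j->ni", M, b2)         # design matrix for b1 given b2
    b1 = np.linalg.lstsq(Z1, y, rcond=None)[0]
    Z2 = np.einsum("nij,i->nj", M, b1)         # design matrix for b2 given b1
    b2 = np.linalg.lstsq(Z2, y, rcond=None)[0]
# outer(b1, b2) estimates p1*p2 coefficients with only p1 + p2 parameters
```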

425 citations


Journal ArticleDOI
TL;DR: This article provides a comprehensive review and comparison of the performance of the principal methods of dimension reduction proposed in the ABC literature, split into three nonmutually exclusive classes consisting of best subset selection methods, projection techniques and regularization.
Abstract: Approximate Bayesian computation (ABC) methods make use of comparisons between simulated and observed summary statistics to overcome the problem of computationally intractable likelihood functions. As the practical implementation of ABC requires computations based on vectors of summary statistics, rather than full data sets, a central question is how to derive low-dimensional summary statistics from the observed data with minimal loss of information. In this article we provide a comprehensive review and comparison of the performance of the principal methods of dimension reduction proposed in the ABC literature. The methods are split into three nonmutually exclusive classes consisting of best subset selection methods, projection techniques and regularization. In addition, we introduce two new methods of dimension reduction. The first is a best subset selection method based on Akaike and Bayesian information criteria, and the second uses ridge regression as a regularization procedure. We illustrate the performance of these dimension reduction techniques through the analysis of three challenging models and data sets.
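As a hedged illustration of the regularization class, the snippet below ridge-regresses parameters on raw summaries from pilot simulations and uses the fitted linear predictor as a one-dimensional summary statistic; the data and names are purely illustrative, and the paper's procedure involves further choices (e.g., penalty selection) not shown here.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
theta = rng.normal(size=(5000, 1))     # pilot parameter draws
# 30 noisy candidate summaries, each weakly informative about theta
S = np.hstack([theta + 0.5 * rng.normal(size=(5000, 1)) for _ in range(30)])

proj = Ridge(alpha=10.0).fit(S, theta) # regularized projection of summaries
s_reduced = proj.predict(S)            # one low-dimensional summary per draw
# ABC would now compare s_reduced for observed vs. simulated data
```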

393 citations


Posted Content
Sanjoy Dasgupta
TL;DR: Theoretical results identifying random projection as a promising dimensionality reduction technique for learning mixtures of Gaussians are summarized and illustrated by a wide variety of experiments on synthetic and real data.
Abstract: Recent theoretical work has identified random projection as a promising dimensionality reduction technique for learning mixtures of Gaussians. Here we summarize these results and illustrate them by a wide variety of experiments on synthetic and real data.
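A quick scikit-learn illustration of the idea (synthetic data; all settings are arbitrary): project a high-dimensional Gaussian mixture to a low dimension at random, then fit the mixture there.

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
centers = rng.normal(scale=5.0, size=(3, 500))  # 3 separated means in R^500
X = np.vstack([c + rng.normal(size=(200, 500)) for c in centers])

Z = GaussianRandomProjection(n_components=20, random_state=0).fit_transform(X)
gmm = GaussianMixture(n_components=3, random_state=0).fit(Z)
labels = gmm.predict(Z)   # the cluster structure survives the projection
```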

329 citations


Journal ArticleDOI
TL;DR: A new iterative thresholding approach is proposed for estimating principal subspaces in the setting where the leading eigenvectors are sparse; it recovers the principal subspace and leading eigenvectors consistently, and even optimally, in a range of high-dimensional sparse settings.
Abstract: Principal component analysis (PCA) is a classical dimension reduction method which projects data onto the principal subspace spanned by the leading eigenvectors of the covariance matrix. However, it behaves poorly when the number of features $p$ is comparable to, or even much larger than, the sample size $n$. In this paper, we propose a new iterative thresholding approach for estimating principal subspaces in the setting where the leading eigenvectors are sparse. Under a spiked covariance model, we find that the new approach recovers the principal subspace and leading eigenvectors consistently, and even optimally, in a range of high-dimensional sparse settings. Simulated examples also demonstrate its competitive performance.
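A bare-bones sketch of the mechanics for a single sparse leading eigenvector, assuming a power step followed by hard thresholding; the paper's procedure handles whole subspaces with guarantees under the spiked model, which this toy version does not attempt.

```python
import numpy as np

def sparse_leading_eigvec(X, thresh=0.1, iters=100, seed=0):
    S = np.cov(X, rowvar=False)        # p x p sample covariance
    v = np.random.default_rng(seed).normal(size=S.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = S @ v                      # power iteration step
        v[np.abs(v) < thresh * np.abs(v).max()] = 0.0  # hard thresholding
        v /= np.linalg.norm(v)
    return v
```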

298 citations


Journal ArticleDOI
TL;DR: It is demonstrated in this paper that PCCA+ always delivers an optimal fuzzy clustering for nearly uncoupled, not necessarily reversible, Markov chains with transition states.
Abstract: Given a row-stochastic matrix describing pairwise similarities between data objects, spectral clustering makes use of the eigenvectors of this matrix to perform dimensionality reduction for clustering in fewer dimensions. One example from this class of algorithms is the Robust Perron Cluster Analysis (PCCA+), which delivers a fuzzy clustering. Originally developed for clustering the state space of Markov chains, the method became popular as a versatile tool for general data classification problems. The robustness of PCCA+, however, cannot be explained by previous perturbation results, because the matrices in typical applications do not comply with the two main requirements: reversibility and nearly decomposability. We therefore demonstrate in this paper that PCCA+ always delivers an optimal fuzzy clustering for nearly uncoupled, not necessarily reversible, Markov chains with transition states.

288 citations


Journal ArticleDOI
TL;DR: A tensor organization scheme is defined for representing a pixel's spectral-spatial feature, and tensor discriminative locality alignment (TDLA) is developed for removing redundant information for subsequent classification.
Abstract: In this paper, we propose a method for the dimensionality reduction (DR) of spectral-spatial features in hyperspectral images (HSIs), under the umbrella of multilinear algebra, i.e., the algebra of tensors. The proposed approach is a tensor extension of conventional supervised manifold-learning-based DR. In particular, we define a tensor organization scheme for representing a pixel's spectral-spatial feature and develop tensor discriminative locality alignment (TDLA) for removing redundant information for subsequent classification. The optimal solution of TDLA is obtained by alternately optimizing each mode of the input tensors. The methods are tested on three public real HSI data sets collected by the Hyperspectral Digital Imagery Collection Experiment, the Reflective Optics System Imaging Spectrometer, and the Airborne Visible/Infrared Imaging Spectrometer. The classification results show significant improvements in classification accuracy while using a small number of features.

283 citations


Journal ArticleDOI
TL;DR: A novel hierarchical PCA-EELM (principal component analysis-ensemble extreme learning machine) model is presented to predict protein-protein interactions using only protein sequence information.
Abstract: Protein-protein interactions (PPIs) play crucial roles in the execution of various cellular processes and form the basis of biological mechanisms. Although large amounts of PPI data for different species have been generated by high-throughput experimental techniques, current PPI pairs obtained with experimental methods cover only a fraction of the complete PPI networks, and the experimental methods for identifying PPIs are both time-consuming and expensive. Hence, it is urgent and challenging to develop automated computational methods to efficiently and accurately predict PPIs. We present here a novel hierarchical PCA-EELM (principal component analysis-ensemble extreme learning machine) model to predict protein-protein interactions using only protein sequence information. In the proposed method, 11188 protein pairs retrieved from the DIP database were encoded into feature vectors by using four kinds of protein sequence information. Focusing on dimension reduction, an effective feature extraction method, PCA, was then employed to construct the most discriminative new feature set. Finally, multiple extreme learning machines were trained and then aggregated into a consensus classifier by majority voting. The ensembling of extreme learning machines removes the dependence of results on initial random weights and improves the prediction performance. When applied to the PPI data of Saccharomyces cerevisiae, the proposed method achieved 87.00% prediction accuracy with 86.15% sensitivity at a precision of 87.59%. Extensive experiments were performed to compare our method with the state-of-the-art technique, the Support Vector Machine (SVM). Experimental results demonstrate that the proposed PCA-EELM outperforms the SVM method under 5-fold cross-validation. Moreover, PCA-EELM runs faster than the PCA-SVM based method. Consequently, the proposed approach can be considered a promising and powerful new tool for predicting PPIs, with excellent performance and less time.
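A compact sketch of the PCA-EELM pipeline on synthetic data: PCA for dimension reduction, then an ensemble of extreme learning machines (random hidden layer, least-squares readout) combined by majority vote. Hidden sizes, ensemble size, and all names are illustrative assumptions rather than the authors' settings.

```python
import numpy as np
from sklearn.decomposition import PCA

def elm_train(X, y, hidden=100, rng=None):
    W = rng.normal(size=(X.shape[1], hidden))  # random, untrained input weights
    H = np.tanh(X @ W)                         # hidden-layer activations
    T = np.eye(2)[y]                           # one-hot targets (binary case)
    beta = np.linalg.pinv(H) @ T               # least-squares output weights
    return W, beta

def elm_predict(X, W, beta):
    return np.argmax(np.tanh(X @ W) @ beta, axis=1)

rng = np.random.default_rng(4)
X = rng.normal(size=(600, 400))
y = rng.integers(0, 2, size=600)               # interacting / non-interacting
Z = PCA(n_components=50).fit_transform(X)      # discriminative reduced features
models = [elm_train(Z, y, rng=rng) for _ in range(15)]
votes = np.stack([elm_predict(Z, W, b) for W, b in models])
pred = (votes.mean(axis=0) > 0.5).astype(int)  # majority vote of the ensemble
```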

275 citations


Journal ArticleDOI
TL;DR: This work proposes multi-label feature selection methods that use the filter approach and employ ReliefF and Information Gain to measure the goodness of features.

Journal ArticleDOI
TL;DR: A new multi-task feature selection algorithm is proposed and applied to multimedia (e.g., video and image) analysis; it uses the common knowledge of multiple tasks as supplementary information to facilitate decision making.
Abstract: While much progress has been made on multi-task classification and subspace learning, multi-task feature selection has long been largely unaddressed. In this paper, we propose a new multi-task feature selection algorithm and apply it to multimedia (e.g., video and image) analysis. Instead of evaluating the importance of each feature individually, our algorithm selects features in a batch mode, by which the feature correlation is considered. While feature selection has received much research attention, less effort has been made on improving the performance of feature selection by leveraging the shared knowledge from multiple related tasks. Our algorithm builds upon the assumption that different related tasks have common structures. Multiple feature selection functions of different tasks are simultaneously learned in a joint framework, which enables our algorithm to utilize the common knowledge of multiple tasks as supplementary information to facilitate decision making. An efficient iterative algorithm with guaranteed convergence is proposed for the optimization. Experiments on different databases have demonstrated the effectiveness of the proposed algorithm.

Journal ArticleDOI
TL;DR: A dimensionality reduction method that fits SRC well is presented; it maximizes the ratio of between-class reconstruction residual to within-class reconstruction residual in the projected space, and thus enables SRC to achieve better performance.
Abstract: The sparse representation-based classifier (SRC) has been developed and shows great potential for real-world face recognition. This paper presents a dimensionality reduction method that fits SRC well. Since SRC adopts a class reconstruction residual-based decision rule, we use this rule as a criterion to steer the design of a feature extraction method. The method is thus called SRC steered discriminative projection (SRC-DP). SRC-DP maximizes the ratio of between-class reconstruction residual to within-class reconstruction residual in the projected space, and thus enables SRC to achieve better performance. SRC-DP provides low-dimensional representations of human faces to make the SRC-based face recognition system more efficient. Experiments are conducted on the AR, extended Yale B, and PIE face image databases, and the results demonstrate that the proposed method is more effective than other feature extraction methods based on SRC.

Proceedings ArticleDOI
01 Dec 2013
TL;DR: This paper advances descriptor-based face recognition by suggesting a novel usage of descriptors to form an over-complete representation, and by proposing a new metric learning pipeline within the same/not-same framework.
Abstract: This paper advances descriptor-based face recognition by suggesting a novel usage of descriptors to form an over-complete representation, and by proposing a new metric learning pipeline within the same/not-same framework. First, the Over-Complete Local Binary Patterns (OCLBP) face representation scheme is introduced as a multi-scale modified version of the Local Binary Patterns (LBP) scheme. Second, we propose an efficient matrix-vector multiplication-based recognition system. The system is based on Linear Discriminant Analysis (LDA) coupled with Within Class Covariance Normalization (WCCN). This is further extended to the unsupervised case by proposing an unsupervised variant of WCCN. Lastly, we introduce Diffusion Maps (DM) for non-linear dimensionality reduction as an alternative to the Whitened Principal Component Analysis (WPCA) method which is often used in face recognition. We evaluate the proposed framework on the LFW face recognition dataset under the restricted, unrestricted and unsupervised protocols. In all three cases we achieve very competitive results.
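Of the pipeline's components, Within Class Covariance Normalization is the easiest to convey in a few lines: whiten the (e.g., LDA-projected) feature space by the inverse within-class covariance so that directions of high intra-person variability are down-weighted. A minimal sketch, with an assumed ridge term for invertibility:

```python
import numpy as np

def wccn(X, y, ridge=1e-3):
    d = X.shape[1]
    Sw = np.zeros((d, d))
    for c in np.unique(y):                     # pool per-identity covariances
        Xc = X[y == c] - X[y == c].mean(axis=0)
        Sw += Xc.T @ Xc
    Sw = Sw / len(X) + ridge * np.eye(d)       # regularized within-class covariance
    L = np.linalg.cholesky(np.linalg.inv(Sw))  # whitening transform
    return X @ L                               # normalized features for scoring
```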

Journal ArticleDOI
TL;DR: This paper proposes a simple but effective robust LDA version based on L1-norm maximization, which learns a set of local optimal projection vectors by maximizing the ratio of the L1-norm-based between-class dispersion to the L1-norm-based within-class dispersion.
Abstract: Linear discriminant analysis (LDA) is a well-known dimensionality reduction technique, which is widely used for many purposes. However, conventional LDA is sensitive to outliers because its objective function is based on the distance criterion using L2-norm. This paper proposes a simple but effective robust LDA version based on L1-norm maximization, which learns a set of local optimal projection vectors by maximizing the ratio of the L1-norm-based between-class dispersion to the L1-norm-based within-class dispersion. The proposed method is theoretically proved to be feasible and robust to outliers, while overcoming the singularity problem of the within-class scatter matrix in conventional LDA. Experiments on artificial datasets, standard classification datasets and three popular image databases demonstrate the efficacy of the proposed method.

Journal ArticleDOI
TL;DR: The underlying idea is to design an optimal projection matrix, which preserves the local neighborhood information inferred from unlabeled samples, while simultaneously maximizing the class discrimination of the data inferred from the labeled samples.
Abstract: We propose a novel semisupervised local discriminant analysis method for feature extraction in hyperspectral remote sensing imagery, with improved performance in both ill-posed and poor-posed conditions. The proposed method combines unsupervised methods (local linear feature extraction) and a supervised method (linear discriminant analysis) in a novel framework without any free parameters. The underlying idea is to design an optimal projection matrix, which preserves the local neighborhood information inferred from unlabeled samples, while simultaneously maximizing the class discrimination of the data inferred from the labeled samples. Experimental results on four real hyperspectral images demonstrate that the proposed method compares favorably with conventional feature extraction methods.

Journal ArticleDOI
TL;DR: This paper combines distance metric learning and dimensionality reduction to better explore the connections between facial features and age labels, and presents an age-oriented local regression to capture the complicated facial aging process for age determination.

Journal ArticleDOI
TL;DR: In this paper, the authors investigate the theory and empirical performance of differentially private approximations to PCA and propose a new method which explicitly optimizes the utility of the output.
Abstract: The principal components analysis (PCA) algorithm is a standard tool for identifying good low-dimensional approximations to high-dimensional data. Many data sets of interest contain private or sensitive information about individuals. Algorithms which operate on such data should be sensitive to the privacy risks in publishing their outputs. Differential privacy is a framework for developing tradeoffs between privacy and the utility of these outputs. In this paper we investigate the theory and empirical performance of differentially private approximations to PCA and propose a new method which explicitly optimizes the utility of the output. We show that the sample complexity of the proposed method differs from the existing procedure in the scaling with the data dimension, and that our method is nearly optimal in terms of this scaling. We furthermore illustrate our results, showing that on real data there is a large performance gap between the existing method and our method.
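For orientation, here is a sketch of the input-perturbation baseline discussed in this literature (add symmetric noise to the second-moment matrix, then eigendecompose), not the authors' utility-optimized mechanism; the noise scale is schematic rather than calibrated to a privacy budget.

```python
import numpy as np

def noisy_pca(X, k, noise_scale=0.1, seed=0):
    n, d = X.shape
    A = (X.T @ X) / n                      # sample second-moment matrix
    E = np.random.default_rng(seed).normal(scale=noise_scale, size=(d, d))
    A_noisy = A + (E + E.T) / 2            # symmetrize so eigenvalues stay real
    vals, vecs = np.linalg.eigh(A_noisy)
    return vecs[:, np.argsort(vals)[::-1][:k]]  # top-k noisy directions
```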

Posted Content
TL;DR: The Manopt toolbox, available at www.manopt.org, is a user-friendly, documented piece of software dedicated to simplifying experimentation with state-of-the-art Riemannian optimization algorithms, aimed particularly at lowering the entry barrier.
Abstract: Optimization on manifolds is a rapidly developing branch of nonlinear optimization. Its focus is on problems where the smooth geometry of the search space can be leveraged to design efficient numerical algorithms. In particular, optimization on manifolds is well-suited to deal with rank and orthogonality constraints. Such structured constraints appear pervasively in machine learning applications, including low-rank matrix completion, sensor network localization, camera network registration, independent component analysis, metric learning, dimensionality reduction and so on. The Manopt toolbox, available at www.manopt.org, is a user-friendly, documented piece of software dedicated to simplifying experimentation with state-of-the-art Riemannian optimization algorithms. We aim particularly at reaching practitioners outside our field.
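Manopt itself is a MATLAB toolbox; the numpy sketch below only conveys the core loop of Riemannian optimization that such toolboxes automate: compute a Euclidean gradient, project it onto the tangent space of the constraint manifold (here the unit sphere), take a step, and retract back onto the manifold.

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.normal(size=(50, 50))
A = A + A.T                            # maximize the Rayleigh quotient x^T A x
x = rng.normal(size=50)
x /= np.linalg.norm(x)
for _ in range(500):
    g = 2 * A @ x                      # Euclidean gradient
    g_tan = g - (g @ x) * x            # projection onto the sphere's tangent space
    x = x + 0.01 * g_tan               # fixed-step ascent (illustrative)
    x /= np.linalg.norm(x)             # retraction: back onto the sphere
# x now approximates the leading eigenvector of A
```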

Book ChapterDOI
01 Jan 2013
TL;DR: This chapter discusses another popular data mining algorithm that can be used for supervised or unsupervised learning, Linear Discriminant Analysis, and presents the robust counterpart scheme originally proposed by Kim and Boyd.
Abstract: In this chapter we discuss another popular data mining algorithm that can be used for supervised or unsupervised learning. Linear Discriminant Analysis (LDA) was proposed by R. Fisher in 1936. It consists of finding the projection hyperplane that minimizes the intraclass variance and maximizes the distance between the projected means of the classes. As with PCA, these two objectives can be addressed by solving an eigenvalue problem, with the corresponding eigenvector defining the hyperplane of interest. This hyperplane can be used for classification, for dimensionality reduction, and for interpreting the importance of the given features. In the first part of the chapter we discuss the generic formulation of LDA, whereas in the second we present the robust counterpart scheme originally proposed by Kim and Boyd. We also discuss the nonlinear extension of LDA through the kernel transformation.
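A minimal numpy sketch of the eigenvalue formulation the chapter describes: the discriminant directions solve the generalized eigenproblem S_b w = lambda S_w w, with a small ridge term (an assumption here) guarding against a singular within-class scatter matrix.

```python
import numpy as np
from scipy.linalg import eigh

def lda_directions(X, y, k=1, ridge=1e-6):
    d = X.shape[1]
    mu = X.mean(axis=0)
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)                # within-class scatter
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)   # between-class scatter
    vals, vecs = eigh(Sb, Sw + ridge * np.eye(d))    # generalized eigenproblem
    return vecs[:, np.argsort(vals)[::-1][:k]]       # top discriminant directions
```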

Journal ArticleDOI
TL;DR: A novel algorithm is proposed in which the artificial ants traverse a directed graph with only O(2n) arcs; it incorporates classification performance and feature set size into the heuristic guidance, and selects a feature set of small size and high classification accuracy.

Book
04 Jun 2013
TL;DR: This book is devoted to a novel approach to dimensionality reduction based on the well-known nearest neighbor method, a powerful classification and regression approach, and compares various optimization approaches, from evolutionary to swarm-based heuristics.
Abstract: This book is devoted to a novel approach to dimensionality reduction based on the well-known nearest neighbor method, a powerful classification and regression approach. It starts with an introduction to machine learning concepts and a real-world application from the energy domain. Then, unsupervised nearest neighbors (UNN) is introduced as an efficient iterative method for dimensionality reduction. Various UNN models are developed step by step, reaching from a simple iterative strategy for discrete latent spaces to a stochastic kernel-based algorithm for learning submanifolds with independent parameterizations. Extensions that allow the embedding of incomplete and noisy patterns are introduced. Various optimization approaches are compared, from evolutionary to swarm-based heuristics. Experimental comparisons to related methodologies, taking into account artificial test data sets and also real-world data, demonstrate the behavior of UNN in practical scenarios. The book contains numerous color figures to illustrate the introduced concepts and to highlight the experimental results.

Journal ArticleDOI
TL;DR: Experimental results on various types of datasets show that the proposed STDR outperforms the state-of-the-art algorithms in terms of k-means clustering performance, and the objective function of the proposed model is theoretically guaranteed to converge to the global optimum.

Posted Content
TL;DR: In this article, a nonparametric independence screening (NIS) method is proposed to select variables by ranking a measure of the nonparametric marginal contributions of each covariate given the exposure variable.
Abstract: The varying-coefficient model is an important nonparametric statistical model that allows us to examine how the effects of covariates vary with exposure variables. When the number of covariates is large, the issue of variable selection arises. In this paper, we propose and investigate marginal nonparametric screening methods to screen variables in ultra-high dimensional sparse varying-coefficient models. The proposed nonparametric independence screening (NIS) selects variables by ranking a measure of the nonparametric marginal contributions of each covariate given the exposure variable. The sure independent screening property is established under some mild technical conditions when the dimensionality is of nonpolynomial order, and the dimensionality reduction of NIS is quantified. To enhance practical utility and finite sample performance, two data-driven iterative NIS methods are proposed for selecting thresholding parameters and variables: conditional permutation and greedy methods, resulting in Conditional-INIS and Greedy-INIS. The effectiveness and flexibility of the proposed methods are further illustrated by simulation studies and real data applications.
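In simplified form, marginal screening ranks each covariate by the size of its fitted marginal contribution. The sketch below uses a cubic polynomial fit of the response on each covariate alone; the paper's NIS uses B-spline regressions conditional on the exposure variable, which is omitted here for brevity, and all names are illustrative.

```python
import numpy as np

def marginal_screen(X, y, degree=3):
    scores = []
    for j in range(X.shape[1]):
        coeffs = np.polyfit(X[:, j], y, degree)   # marginal nonparametric fit
        fitted = np.polyval(coeffs, X[:, j])
        scores.append(np.mean(fitted ** 2))       # magnitude of the contribution
    return np.argsort(scores)[::-1]               # covariates, most relevant first

rng = np.random.default_rng(7)
X = rng.normal(size=(300, 50))
y = X[:, 0] ** 2 - X[:, 3] + 0.1 * rng.normal(size=300)
top = marginal_screen(X, y)[:5]                   # should surface columns 0 and 3
```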

Journal ArticleDOI
TL;DR: In this article, the authors propose an alternative approach that involves linear projection of all the data points onto a lower-dimensional subspace, and demonstrate the superiority of this approach from a theoretical perspective and through simulated and real data examples.
Abstract: Gaussian processes are widely used in nonparametric regression, classification and spatiotemporal modelling, facilitated in part by a rich literature on their theoretical properties. However, one of their practical limitations is expensive computation, typically on the order of n^3, where n is the number of data points, in performing the necessary matrix inversions. For large datasets, storage and processing also lead to computational bottlenecks, and numerical stability of the estimates and predicted values degrades with increasing n. Various methods have been proposed to address these problems, including predictive processes in spatial data analysis and the subset-of-regressors technique in machine learning. The idea underlying these approaches is to use a subset of the data, but this raises questions concerning sensitivity to the choice of subset and limitations in estimating fine-scale structure in regions that are not well covered by the subset. Motivated by the literature on compressive sensing, we propose an alternative approach that involves linear projection of all the data points onto a lower-dimensional subspace. We demonstrate the superiority of this approach from a theoretical perspective and through simulated and real data examples.
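A rough sketch of the projection idea on synthetic data: compress n observations into m << n random linear combinations, then do exact GP algebra in the compressed space. For clarity this toy version still forms the full n x n kernel matrix, which a practical implementation would avoid; kernel and noise settings are illustrative.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

rng = np.random.default_rng(6)
n, m = 2000, 100
X = rng.uniform(-3, 3, size=(n, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)

Phi = rng.normal(size=(m, n)) / np.sqrt(m)     # random projection of data points
K = rbf(X, X) + 0.01 * np.eye(n)               # kernel matrix plus noise
C = Phi @ K @ Phi.T                            # m x m compressed covariance
alpha = Phi.T @ np.linalg.solve(C, Phi @ y)    # solve costs O(m^3), not O(n^3)
Xstar = np.linspace(-3, 3, 50)[:, None]
mean = rbf(Xstar, X) @ alpha                   # predictive mean at test inputs
```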

Journal ArticleDOI
TL;DR: A novel multiview dimensionality reduction method for scene classification that takes both intraclass and interclass geometries into consideration and achieves the best performance in scene classification among the compared methods.

Proceedings ArticleDOI
19 Oct 2013
TL;DR: This work proposes a new multi-label feature selection algorithm, RF-ML, by extending the single-label ReliefF algorithm; RF-ML takes into account the effect of interacting attributes and deals directly with multi-label data without any data transformation.
Abstract: The feature selection process aims to select a subset of relevant features to be used in model construction, reducing data dimensionality by removing irrelevant and redundant features. Although effective feature selection methods to support single-label learning abound, this is not the case for multi-label learning. Furthermore, most of the multi-label feature selection methods proposed initially transform the multi-label data to single-label data, to which a traditional feature selection method is then applied. However, applying single-label feature selection methods after transforming the data can hinder exploring label dependence, an important issue in multi-label learning. This work proposes a new multi-label feature selection algorithm, RF-ML, by extending the single-label ReliefF algorithm. RF-ML, unlike strictly univariate measures for feature ranking, takes into account the effect of interacting attributes and deals directly with multi-label data without any data transformation. Using synthetic datasets, the proposed algorithm is experimentally compared to the ReliefF algorithm, in which the multi-label data has been previously transformed to single-label data using two well-known data transformation approaches. Results show that the proposed algorithm stands out by ranking the relevant features as the best ones more often.
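For reference, here is a condensed sketch of the single-label ReliefF-style scoring that RF-ML extends: features are rewarded for separating a sample from its nearest miss and penalized for distance to its nearest hit. This is a simplification (one nearest neighbor per class, Manhattan distance) with illustrative names.

```python
import numpy as np

def relieff_scores(X, y, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for i in rng.integers(0, len(X), size=n_iter):
        d = np.abs(X - X[i]).sum(axis=1)           # Manhattan distances to X[i]
        d[i] = np.inf                              # exclude the sample itself
        hit = np.where(y == y[i], d, np.inf).argmin()   # nearest same-class
        miss = np.where(y != y[i], d, np.inf).argmin()  # nearest other-class
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w / n_iter                              # higher score = more relevant
```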

Journal ArticleDOI
TL;DR: This work proposes a high-dimensional robust principal component analysis algorithm that is efficient, robust to contaminated points, and easily kernelizable; it achieves maximal robustness.
Abstract: Principal component analysis plays a central role in statistics, engineering, and science. Because of the prevalence of corrupted data in real-world applications, much research has focused on developing robust algorithms. Perhaps surprisingly, these algorithms are unequipped (indeed, unable) to deal with outliers in the high-dimensional setting where the number of observations is of the same magnitude as the number of variables of each observation, and the dataset contains some (arbitrarily) corrupted observations. We propose a high-dimensional robust principal component analysis algorithm that is efficient, robust to contaminated points, and easily kernelizable. In particular, our algorithm achieves maximal robustness: it has a breakdown point of 50% (the best possible), while all existing algorithms have a breakdown point of zero. Moreover, our algorithm recovers the optimal solution exactly in the case where the number of corrupted points grows sublinearly in the dimension.

Journal ArticleDOI
Zhizhao Feng, Meng Yang, Lei Zhang, Yan Liu, David Zhang
TL;DR: The proposed algorithm is evaluated on benchmark face databases in comparison with existing linear representation based methods, and the results show that the joint learning improves the FR rate, particularly when the number of training samples per class is small.

Journal ArticleDOI
TL;DR: This approach can use only ECG signals to recognize driving stress conditions effectively, with very good recognition performance; the combination of KBCS, LDA, and PCA achieves satisfactory recognition rates for the features generated by both trend-based and parameter-based methods.