
Showing papers on "Sufficient dimension reduction" published in 2015


Posted Content
TL;DR: In this article, the authors studied a family of randomized dimension reduction maps and a large class of data sets, and they showed that there is a phase transition in the success probability of the dimension reduction map as the embedding dimension increases.
Abstract: Dimension reduction is the process of embedding high-dimensional data into a lower dimensional space to facilitate its analysis. In the Euclidean setting, one fundamental technique for dimension reduction is to apply a random linear map to the data. This dimension reduction procedure succeeds when it preserves certain geometric features of the set. The question is how large the embedding dimension must be to ensure that randomized dimension reduction succeeds with high probability. This paper studies a natural family of randomized dimension reduction maps and a large class of data sets. It proves that there is a phase transition in the success probability of the dimension reduction map as the embedding dimension increases. For a given data set, the location of the phase transition is the same for all maps in this family. Furthermore, each map has the same stability properties, as quantified through the restricted minimum singular value. These results can be viewed as new universality laws in high-dimensional stochastic geometry. Universality laws for randomized dimension reduction have many applications in applied mathematics, signal processing, and statistics. They yield design principles for numerical linear algebra algorithms, for compressed sensing measurement ensembles, and for random linear codes. Furthermore, these results have implications for the performance of statistical estimation methods under a large class of random experimental designs.
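
The basic operation is simple to state. As a minimal sketch (synthetic data and a Gaussian map, which is one member of the family of random linear maps considered; not the paper's construction), the snippet below embeds a point set into successively larger dimensions and records how well pairwise distances are preserved, the quantity whose sharp improvement with the embedding dimension is the phase transition studied here.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)

# Synthetic data set: n points in ambient dimension p (illustrative sizes).
n, p = 200, 1000
X = rng.standard_normal((n, p))

def random_linear_map(X, m, rng):
    """Embed the rows of X into R^m with a scaled Gaussian random linear map."""
    A = rng.standard_normal((X.shape[1], m)) / np.sqrt(m)
    return X @ A

# How well are pairwise distances preserved as the embedding dimension m grows?
d_orig = pdist(X)
for m in (10, 50, 200):
    d_proj = pdist(random_linear_map(X, m, rng))
    distortion = np.max(np.abs(d_proj / d_orig - 1.0))
    print(f"m = {m:3d}: worst relative distortion of pairwise distances = {distortion:.3f}")
```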

75 citations


Journal ArticleDOI
TL;DR: A new and simple framework for dimension reduction in the large p, small n setting is proposed; it decomposes the data into pieces, thereby enabling existing approaches developed for n>p to be adapted to the n<p case.
Abstract: We propose a new and simple framework for dimension reduction in the large p, small n setting. The framework decomposes the data into pieces, thereby enabling existing approaches for n>p to be adapted to the n<p case.

66 citations


Journal ArticleDOI
TL;DR: In this paper, the predictors are randomly compressed prior to analysis; the approach performs well when the predictors can be projected to a low-dimensional linear subspace with minimal loss of information about the response, and the exact posterior distribution conditional on the compressed data is available analytically.
Abstract: As an alternative to variable selection or shrinkage in high-dimensional regression, we propose to randomly compress the predictors prior to analysis. This dramatically reduces storage and computational bottlenecks, performing well when the predictors can be projected to a low-dimensional linear subspace with minimal loss of information about the response. As opposed to existing Bayesian dimensionality reduction approaches, the exact posterior distribution conditional on the compressed data is available analytically, speeding up computation by many orders of magnitude while also bypassing robustness issues due to convergence and mixing problems with MCMC. Model averaging is used to reduce sensitivity to the random projection matrix, while accommodating uncertainty in the subspace dimension. Strong theoretical support is provided for the approach by showing near parametric convergence rates for the predictive density in the large p small n asymptotic paradigm. Practical performance relative to competitors ...
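
A minimal sketch of the compression idea under simplifying assumptions (known noise variance, a plain Gaussian prior, synthetic data); this is not the authors' implementation, but it shows why the posterior over the compressed coefficients is available in closed form and how predictions can be averaged over several random projection matrices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, m = 100, 5000, 20          # n << p; m is the compressed dimension (illustrative)
X = rng.standard_normal((n, p))
beta = np.zeros(p); beta[:3] = [2.0, -1.0, 0.5]
y = X @ beta + rng.standard_normal(n)

def compressed_fit(X, y, m, rng, tau2=1.0, sigma2=1.0):
    """Randomly compress the predictors to m dimensions and return the projection
    together with the closed-form posterior mean of the compressed coefficients
    under a conjugate N(0, tau2 I) prior and N(0, sigma2) noise."""
    Phi = rng.standard_normal((X.shape[1], m)) / np.sqrt(m)   # random compression matrix
    Z = X @ Phi
    A = Z.T @ Z / sigma2 + np.eye(m) / tau2                   # posterior precision
    mu = np.linalg.solve(A, Z.T @ y / sigma2)                 # posterior mean
    return Phi, mu

# Model averaging over independent projections reduces sensitivity to any single Phi.
preds = [X @ Phi @ mu for Phi, mu in (compressed_fit(X, y, m, rng) for _ in range(10))]
print("in-sample RMSE of the averaged predictor:",
      np.sqrt(np.mean((np.mean(preds, axis=0) - y) ** 2)))
```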

48 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed a novel approach that combines clustering techniques with angular/spectral measure analysis to find groups of variables exhibiting asymptotic dependence, thereby reducing the dimension of the initial problem.
Abstract: Non-parametric assessment of extreme dependence structures between an arbitrary number of variables, though quite well-established in dimension $2$ and recently extended to moderate dimensions such as $5$, still represents a statistical challenge in larger dimensions. Here, we propose a novel approach that combines clustering techniques with angular/spectral measure analysis to find groups of variables (not necessarily disjoint) exhibiting asymptotic dependence, thereby reducing the dimension of the initial problem. A heuristic criterion is proposed to choose the threshold over which it is acceptable to consider observations as extreme and the appropriate number of clusters. When empirically evaluated through numerical experiments, the approach we promote here is found to be very efficient under some regularity constraints, even in dimension $20$. For illustration purpose, we also carry out a case study in dietary risk assessment.

43 citations


Journal ArticleDOI
TL;DR: In this article, the authors show that two different dimensional estimators in causal set theory display the same behavior, and argue that a third dimension, the spectral dimension, may exhibit a related phenomenon of "asymptotic silence".
Abstract: Results from a number of different approaches to quantum gravity suggest that the effective dimension of spacetime may drop to d = 2 at small scales. I show that two different dimensional estimators in causal set theory display the same behavior, and argue that a third, the spectral dimension, may exhibit a related phenomenon of “asymptotic silence.”

36 citations


Posted Content
TL;DR: In this paper, the authors develop a sufficient forecasting method that provides a set of sufficient predictive indices, inferred from high-dimensional predictors, to deliver additional predictive power; the method is also applicable to cross-sectional sufficient regression using extracted factors.
Abstract: We consider forecasting a single time series when there is a large number of predictors and a possible nonlinear effect. The dimensionality is first reduced via a high-dimensional (approximate) factor model implemented by principal component analysis. Using the extracted factors, we develop a novel forecasting method called the sufficient forecasting, which provides a set of sufficient predictive indices, inferred from high-dimensional predictors, to deliver additional predictive power. The projected principal component analysis will be employed to enhance the accuracy of inferred factors when a semi-parametric (approximate) factor model is assumed. Our method is also applicable to cross-sectional sufficient regression using extracted factors. The connection between the sufficient forecasting and the deep learning architecture is explicitly stated. The sufficient forecasting correctly estimates projection indices of the underlying factors even in the presence of a nonparametric forecasting function. The proposed method extends sufficient dimension reduction to high-dimensional regimes by condensing the cross-sectional information through factor models. We derive asymptotic properties for the estimate of the central subspace spanned by these projection directions as well as the estimates of the sufficient predictive indices. We further show that the natural method of running multiple regression of the target on estimated factors yields a linear estimate that actually falls into this central subspace. Our method and theory allow the number of predictors to be larger than the number of observations. We finally demonstrate that the sufficient forecasting improves upon the linear forecasting in both simulation studies and an empirical study of forecasting macroeconomic variables.
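
A rough sketch of the two-stage construction on synthetic data (the factor count, number of slices, and data-generating process are illustrative assumptions, and the projected PCA refinement is omitted): factors are extracted by principal components, and sliced inverse regression on those factors yields the sufficient predictive indices.

```python
import numpy as np

rng = np.random.default_rng(2)
T, p, K = 300, 200, 5            # time points, predictors, number of factors (assumed)

# Synthetic panel with a genuine factor structure; the target depends
# nonlinearly (but monotonically) on the first latent factor.
F_true = rng.standard_normal((T, K))
X = F_true @ rng.standard_normal((K, p)) + 0.5 * rng.standard_normal((T, p))
y = np.tanh(F_true[:, 0]) + 0.1 * rng.standard_normal(T)

# Stage 1: factor extraction by principal component analysis.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
F = U[:, :K] * np.sqrt(T)        # estimated factors (one common normalization)

# Stage 2: sliced inverse regression of the target on the estimated factors.
def sir_directions(F, y, n_slices=10, n_dirs=1):
    """SIR directions: leading eigenvectors of the between-slice covariance of
    slice means.  PCA factors are uncorrelated, so marginal standardization suffices."""
    Fs = (F - F.mean(axis=0)) / F.std(axis=0)
    slices = np.array_split(np.argsort(y), n_slices)
    M = sum(len(s) * np.outer(Fs[s].mean(axis=0), Fs[s].mean(axis=0))
            for s in slices) / len(y)
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, ::-1][:, :n_dirs]

indices = F @ sir_directions(F, y)   # sufficient predictive indices for forecasting
print("|correlation| of the first index with the target:",
      round(abs(np.corrcoef(indices[:, 0], y)[0, 1]), 2))
```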

35 citations


Journal ArticleDOI
TL;DR: This work proposes a novel information criterion-based model selection method that breaks free from dependence on the connection between dimension reduction models and their corresponding matrix eigenstructures, which relies heavily on a linearity condition that it no longer assumes.
Abstract: A crucial component of performing sufficient dimension reduction is to determine the structural dimension of the reduction model. We propose a novel information criterion-based method for this purpose, a special feature of which is that when examining the goodness-of-fit of the current model, one needs to perform model evaluation by using an enlarged candidate model. Although the procedure does not require estimation under the enlarged model of dimension k + 1, the decision as to how well the current model of dimension k fits relies on the validation provided by the enlarged model; thus we call this procedure the validated information criterion, VIC(k). Our method is different from existing information criterion-based model selection methods; it breaks free from dependence on the connection between dimension reduction models and their corresponding matrix eigenstructures, which relies heavily on a linearity condition that we no longer assume. We prove consistency of the proposed method, and its finite-sample performance is demonstrated numerically.

34 citations


Journal ArticleDOI
TL;DR: This work studies higher-order sufficient dimension reduction, mainly by extending SIR to general tensor-valued predictors (referred to as tensor SIR), which provides fast and efficient estimation and circumvents the high-dimensional covariance matrix inversion that researchers often face when dealing with such data.

34 citations


Journal ArticleDOI
TL;DR: The theoretical study reveals a bias-variance trade-off associated with the regularizing truncation and decaying structures of the predictor process and the effective dimension reduction space.
Abstract: We propose a method of effective dimension reduction for functional data, emphasizing the sparse design where one observes only a few noisy and irregular measurements for some or all of the subjects. The proposed method borrows strength across the entire sample and provides a way to characterize the effective dimension reduction space, via functional cumulative slicing. Our theoretical study reveals a bias-variance trade-off associated with the regularizing truncation and decaying structures of the predictor process and the effective dimension reduction space. A simulation study and an application illustrate the superior finite-sample performance of the method.

31 citations


Journal ArticleDOI
TL;DR: In this paper, it was shown that if X|Y is elliptically contoured with parameters (μY, Δ) and density gY, there is no linear nontrivial suf...
Abstract: There are two general approaches based on inverse regression for estimating the linear sufficient reductions for the regression of Y on X: the moment-based approach such as SIR, PIR, SAVE, and DR, and the likelihood-based approach such as principal fitted components (PFC) and likelihood acquired directions (LAD) when the inverse predictors, X|Y, are normal. By construction, these methods extract information from the first two conditional moments of X|Y; they can only estimate linear reductions and thus form the linear sufficient dimension reduction (SDR) methodology. When var(X|Y) is constant, E(X|Y) contains the reduction and it can be estimated using PFC. When var(X|Y) is nonconstant, PFC misses the information in the variance and second moment based methods (SAVE, DR, LAD) are used instead, resulting in efficiency loss in the estimation of the mean-based reduction. In this article we prove that (a) if X|Y is elliptically contoured with parameters (μY,Δ) and density gY, there is no linear nontrivial suf...

30 citations


Journal ArticleDOI
TL;DR: This article considers linear dimension reduction first and describes robust principal component analysis (PCA) using three approaches: the first uses a singular value decomposition of a robust covariance matrix, the second employs robust measures of dispersion to realize PCA as a robust projection pursuit, and the third uses a low-rank plus sparse decomposition of the data matrix.
Abstract: Information in the data often has far fewer degrees of freedom than the number of variables encoding the data. Dimensionality reduction attempts to reduce the number of variables used to describe the data. In this article, we shall survey some dimension reduction techniques that are robust. We consider linear dimension reduction first and describe robust principal component analysis (PCA) using three approaches. The first approach uses a singular value decomposition of a robust covariance matrix. The second approach employs robust measures of dispersion to realize PCA as a robust projection pursuit. The third approach uses a low-rank plus sparse decomposition of the data matrix. We also survey robust approaches to nonlinear dimension reduction under a unifying framework of kernel PCA. By using a kernel trick, the robust methods available for PCA can be extended to nonlinear cases. WIREs Comput Stat 2015, 7:63–69. doi: 10.1002/wics.1331
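
As a concrete illustration of the first approach only (a sketch with synthetic contaminated data; the minimum covariance determinant is just one possible choice of robust covariance estimator), robust loadings can be obtained by eigendecomposing a robust covariance estimate:

```python
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(3)

# Clean two-dimensional structure embedded in five variables, plus gross outliers.
n, p = 200, 5
scores = rng.standard_normal((n, 2))
X = scores @ rng.standard_normal((2, p)) + 0.1 * rng.standard_normal((n, p))
X[:10] += 20.0                                   # contaminate 5% of the rows

# Approach 1: eigendecomposition of a robust covariance matrix (here, the MCD).
S_robust = MinCovDet(random_state=0).fit(X).covariance_
vals, vecs = np.linalg.eigh(S_robust)
robust_loadings = vecs[:, ::-1][:, :2]           # leading robust principal axes

# Classical PCA for comparison: its leading eigenvalue is inflated by the outliers.
vals_classical = np.linalg.eigvalsh(np.cov(X, rowvar=False))
print("top eigenvalue, robust vs classical:",
      round(vals[-1], 2), "vs", round(vals_classical[-1], 2))
```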

Journal ArticleDOI
TL;DR: This article aims at dimension reduction that recovers full regression information while preserving the predictor group structure, and introduces a systematic way to incorporate the group information in most existing SDR estimators.
Abstract: The family of sufficient dimension reduction (SDR) methods that produce informative combinations of predictors, or indices, are particularly useful for high-dimensional regression analysis. In many such analyses, it becomes increasingly common that there is available a priori subject knowledge of the predictors; for example, they belong to different groups. While many recent SDR proposals have greatly expanded the scope of the methods' applicability, how to effectively incorporate the prior predictor structure information remains a challenge. In this article, we aim at dimension reduction that recovers full regression information while preserving the predictor group structure. Built upon a new concept of the direct sum envelope, we introduce a systematic way to incorporate the group information in most existing SDR estimators. As a result, the reduction outcomes are much easier to interpret. Moreover, the envelope method provides a principled way to build a variety of prior structures into dimension reduction.

Journal ArticleDOI
TL;DR: A dimension reduction technique for Monte Carlo (MC) methods, referred to as drMC, that exploits one-way coupling for pricing plain-vanilla European options under an N-dimensional one-way coupled model, where N is arbitrary.
Abstract: One-way coupling often occurs in multi-dimensional models in finance. In this paper, we present a dimension reduction technique for Monte Carlo (MC) methods, referred to as drMC, that exploits this structure for pricing plain-vanilla European options under an N-dimensional one-way coupled model, where N is arbitrary. The dimension reduction also often produces a significant variance reduction.The drMC method is a dimension reduction technique built upon (i) the conditional MC technique applied to one of the factors which does not depend on any other factors in the model, and (ii) the derivation of a closed-form solution to the conditional partial differential equation (PDE) that arises via Fourier transforms. In the drMC approach, the option price can be computed simply by taking the expectation of this closed-form solution. Hence, the approach results in a powerful dimension reduction from N to one, which often results in a significant variance reduction as well, since the variance associated wit...

Journal ArticleDOI
TL;DR: A general single-index model with high-dimensional predictors is considered and the consistency of predictor selection and estimation is investigated, resulting in robustness of the new method against outliers in the response variable.
Abstract: A general single-index model with high-dimensional predictors is considered. Additive structure of the unknown link function and the error is not assumed in this model. The consistency of predictor selection and estimation is investigated in this model. The index is formulated in the sufficient dimension reduction framework. A distribution-based LASSO estimation is then suggested. When the dimension of predictors can diverge at a polynomial rate of the sample size, the consistency holds under an irrepresentable condition and mild conditions on the predictors. The new method has no requirement, other than independence from the predictors, for the distribution of the error. This property results in robustness of the new method against outliers in the response variable. The conventional consistency of index estimation is provided after the dimension is brought down to a value smaller than the sample size. The importance of the irrepresentable condition for the consistency, and the robustness are examined by a simulation study and two real-data examples.

Journal ArticleDOI
TL;DR: Two robust inverse regression methods that are insensitive to data contamination are proposed; they produce unbiased estimates of the central space when the predictors follow an elliptically contoured distribution.

Journal ArticleDOI
Xiangrong Zhang, Yudi He, Licheng Jiao, Ruochen Liu, Jie Feng, Sisi Zhou
TL;DR: The optimal dimension scaling cut criterion is proposed, which can automatically select the optimal dimension for the dimension reduction methods, and the approaches have shown a better and efficient performance compared with other linear and nonlinear dimension reduction techniques.
Abstract: Dimension reduction has always been a major problem in many applications of machine learning and pattern recognition. In this paper, scaling cut criterion-based supervised dimension reduction methods for data analysis are proposed. The scaling cut criterion can eliminate the limit of the hypothesis that the data distribution of each class is homoscedastic Gaussian. To obtain a more reasonable mapping matrix and reduce the computational complexity, local scaling cut criterion-based dimension reduction is raised, which utilizes the localization strategy of the input data. The localized $k$-nearest neighbor graph is introduced, which relaxes the within-class variance and enlarges the between-class margin. Moreover, by kernelizing the scaling cut criterion and local scaling cut criterion, both methods are extended to efficiently model the nonlinear variability of the data. Furthermore, the optimal dimension scaling cut criterion is proposed, which can automatically select the optimal dimension for the dimension reduction methods. The approaches have been tested on several datasets, and the results have shown a better and efficient performance compared with other linear and nonlinear dimension reduction techniques.

Journal ArticleDOI
TL;DR: A new formulation is proposed that is based on the Hellinger integral of order two, introduced as a natural measure of the regression information contained in the predictor subspace, and an efficient local estimation algorithm is proposed.
Abstract: Sufficient dimension reduction is a useful tool for studying the dependence between a response and a multi-dimensional predictor. In this article, a new formulation is proposed that is based on the Hellinger integral of order two, introduced as a natural measure of the regression information contained in the predictor subspace. The response may be either continuous or discrete. We establish links between local and global central subspaces, and propose an efficient local estimation algorithm. Simulations and an application show that our method compares favourably with existing approaches.

Journal ArticleDOI
TL;DR: Methods to check goodness-of-fit for a given dimension reduction subspace are introduced; the key idea is to extend the so-called distance correlation to measure the conditional dependence between the covariates and the response given a reduction subspace.
Abstract: Sufficient dimension reduction in regression aims to reduce the predictor dimension by replacing the original predictors with some set of linear combinations of them without loss of information. Numerous dimension reduction methods have been developed based on this paradigm. However, little effort has been devoted to diagnostic studies within the context of dimension reduction. In this paper we introduce methods to check goodness-of-fit for a given dimension reduction subspace. The key idea is to extend the so-called distance correlation to measure the conditional dependence relationship between the covariates and the response given a reduction subspace. Our methods require minimal assumptions, which are usually much less restrictive than the conditions needed to justify the original methods. Asymptotic properties of the test statistic are studied. Numerical examples demonstrate the effectiveness of the proposed approach.
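
A small sketch of the underlying idea (not the authors' test statistic or its calibration): compute an empirical distance correlation between the response and the part of the predictors lying outside a candidate reduction subspace; a value near zero is consistent with the subspace having captured the regression information.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def distance_correlation(x, y):
    """Empirical (V-statistic) distance correlation; rows are observations."""
    def centered(a):
        a = a.reshape(len(a), -1)
        D = squareform(pdist(a))
        return D - D.mean(axis=0) - D.mean(axis=1)[:, None] + D.mean()
    A, B = centered(x), centered(y)
    dcov2 = (A * B).mean()
    return np.sqrt(dcov2 / np.sqrt((A * A).mean() * (B * B).mean()))

rng = np.random.default_rng(4)
n, p = 300, 6
X = rng.standard_normal((n, p))
beta = np.zeros(p); beta[0] = 1.0
y = (X @ beta) ** 2 + 0.1 * rng.standard_normal(n)

# Candidate reduction subspace spanned by beta; remove it from the predictors.
B = (beta / np.linalg.norm(beta))[:, None]
X_resid = X - X @ B @ B.T
print("dCor(residual predictors, response):", round(distance_correlation(X_resid, y), 3))
print("dCor(all predictors, response):     ", round(distance_correlation(X, y), 3))
```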

Journal ArticleDOI
TL;DR: In this paper, a group-wise minimum average variance estimator with LASSO penalty is proposed to select significant variables and estimate the corresponding coefficients in multiple-index models with a group structure.
Abstract: In this paper, we propose a novel method to select significant variables and estimate the corresponding coefficients in multiple-index models with a group structure. All existing approaches for single-index models cannot be extended directly to handle this issue with several indices. This method integrates a popularly used shrinkage penalty such as LASSO with the group-wise minimum average variance estimation. It is capable of simultaneous dimension reduction and variable selection, while incorporating the group structure in predictors. Interestingly, the proposed estimator with the LASSO penalty then behaves like an estimator with an adaptive LASSO penalty. The estimator achieves consistency of variable selection without sacrificing the root-$n$ consistency of basis estimation. Simulation studies and a real-data example illustrate the effectiveness and efficiency of the new method.

Journal ArticleDOI
TL;DR: In this article, the authors apply the quadratic inference function to incorporate the correlation information and apply the transformation method to recover the central subspace, which is shown to be consistent and more efficient than the ones assuming independence.
Abstract: Correlation structure contains important information about longitudinal data. Existing sufficient dimension reduction approaches assuming independence may lead to substantial loss of efficiency. We apply the quadratic inference function to incorporate the correlation information and apply the transformation method to recover the central subspace. The proposed estimators are shown to be consistent and more efficient than the ones assuming independence. In addition, the estimated central subspace is also efficient when the correlation information is taken into account. We compare the proposed method with other dimension reduction approaches through simulation studies, and apply this new approach to longitudinal data for an environmental health study.

Journal ArticleDOI
TL;DR: In this paper, the authors considered the functional version of SIR and SAVE via a Tikhonov regularization approach and showed that their convergence rates are the same as the minimax rates for functional linear regression.

Journal ArticleDOI
TL;DR: To test heteroscedasticity in single index models, two test statistics are proposed via quadratic conditional moments that are not robust against the violation of dimension reduction structure, but can detect local alternative hypotheses distinct from the null at a much faster rate.

Journal ArticleDOI
TL;DR: Recently developed higher-order approaches to SDR for regressions with matrix- or array-valued predictors, with a special focus on sliced inverse regression, can reduce an array-valued predictor's multiple dimensions simultaneously without losing much, if any, information for prediction and classification.
Abstract: With the advancement of modern technology, array-valued data are often encountered in application. Such data can exhibit both high dimensionality and complex structures. Traditional methods for sufficient dimension reduction (SDR) are generally inefficient for array-valued data as they cannot adequately capture the underlying structure. In this article, we discuss recently developed higher-order approaches to SDR for regressions with matrix- or array-valued predictors, with a special focus on sliced inverse regressions. These methods can reduce an array-valued predictor's multiple dimensions simultaneously without losing much/any information for prediction and classification. We briefly discuss the implementation procedure for each method. WIREs Comput Stat 2015, 7:249–257. doi: 10.1002/wics.1354

Journal ArticleDOI
TL;DR: A tensor dimension reduction model, assuming nonlinear dependence between a response and a projection of all the tensor predictors, is proposed; it can greatly improve the sensitivity and specificity of the CSA technique.
Abstract: A tensor is a multiway array. With the rapid development of science and technology in the past decades, large amounts of tensor observations are routinely collected, processed, and stored in many scientific research and commercial activities nowadays. The colorimetric sensor array (CSA) data is such an example. Driven by the need to address data analysis challenges that arise in CSA data, we propose a tensor dimension reduction model, a model assuming nonlinear dependence between a response and a projection of all the tensor predictors. The tensor dimension reduction models are estimated in a sequential iterative fashion. The proposed method is applied to CSA data collected for 150 pathogenic bacteria coming from 10 bacterial species and 14 bacteria from one control species. Empirical performance demonstrates that our proposed method can greatly improve the sensitivity and specificity of the CSA technique.

Journal ArticleDOI
TL;DR: In this paper, a novel weighted correlation dimension (WCD) approach is proposed, and the vertex degree of an undirected graph is invoked to measure the contribution of each point to the intrinsic dimension estimation.
Abstract: Dimension reduction is an important tool for feature extraction and has been widely used in many fields including image processing, discrete-time systems, and fault diagnosis. As a key parameter of dimension reduction, the intrinsic dimension represents the smallest number of variables needed to describe a complete dataset. Among all the dimension estimation methods, the correlation dimension (CD) method is one of the most popular; it assumes that the effect of every point on the intrinsic dimension estimation is identical. However, this differs when the distribution of a dataset is nonuniform: the intrinsic dimension estimated from high-density areas is more reliable than that estimated from low-density or boundary areas. In this paper, a novel weighted correlation dimension (WCD) approach is proposed. The vertex degree of an undirected graph is invoked to measure the contribution of each point to the intrinsic dimension estimation. In order to improve the adaptability of WCD estimation, the $k$-means clustering algorithm is adopted to adaptively select the linear portion of the log-log sequence. Various factors that affect the performance of WCD are studied. Experiments on synthetic and real datasets show the validity and the advantages of the developed technique.
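
For orientation, here is a bare-bones version of the classical, unweighted correlation dimension estimate that WCD refines: compute the correlation integral C(r) from pairwise distances and take the slope of log C(r) against log r over a roughly linear range (chosen here by simple percentile bounds rather than the paper's $k$-means-based selection, and with no weighting of points).

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(5)

# Points lying (up to small noise) on a 2-dimensional surface embedded in R^10.
n = 2000
u, v = rng.uniform(size=n), rng.uniform(size=n)
X = np.column_stack([u, v, u + v, u * v] + [0.01 * rng.standard_normal(n) for _ in range(6)])

d = pdist(X)
radii = np.logspace(np.log10(np.percentile(d, 1)), np.log10(np.percentile(d, 50)), 20)
C = np.array([np.mean(d < r) for r in radii])       # correlation integral C(r)

# The slope of log C(r) versus log r estimates the intrinsic (correlation) dimension.
slope, _ = np.polyfit(np.log(radii), np.log(C), 1)
print("estimated correlation dimension:", round(slope, 2))   # expect a value near 2
```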

Posted Content
TL;DR: In this article, the random projections are divided into disjoint groups, and within each group the projection yielding the smallest estimate of the test error is selected; the results of applying the base classifier on the selected projections are then aggregated, with a data-driven voting threshold determining the final assignment.
Abstract: We introduce a very general method for high-dimensional classification, based on careful combination of the results of applying an arbitrary base classifier to random projections of the feature vectors into a lower-dimensional space. In one special case that we study in detail, the random projections are divided into disjoint groups, and within each group we select the projection yielding the smallest estimate of the test error. Our random projection ensemble classifier then aggregates the results of applying the base classifier on the selected projections, with a data-driven voting threshold to determine the final assignment. Our theoretical results elucidate the effect on performance of increasing the number of projections. Moreover, under a boundary condition implied by the sufficient dimension reduction assumption, we show that the test excess risk of the random projection ensemble classifier can be controlled by terms that do not depend on the original data dimension and a term that becomes negligible as the number of projections increases. The classifier is also compared empirically with several other popular high-dimensional classifiers via an extensive simulation study, which reveals its excellent finite-sample performance.
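
A compact sketch of the scheme (the base classifier, group sizes, projected dimension, and voting threshold below are illustrative choices, not the paper's recommendations):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(6)
n, p = 300, 100
X = rng.standard_normal((n, p))
y = (X[:, 0] + X[:, 1] ** 2 > 1).astype(int)     # depends on a 2-dimensional subspace

def rp_ensemble_predict(X, y, X_new, B1=10, B2=5, d=5, threshold=0.5):
    """For each of B1 groups, draw B2 Gaussian projections into R^d, keep the one
    whose projected base classifier has the lowest cross-validated error, and
    aggregate the B1 selected classifiers by thresholded voting."""
    votes = np.zeros(len(X_new))
    for _ in range(B1):
        best_err, best_A = np.inf, None
        for _ in range(B2):
            A = rng.standard_normal((X.shape[1], d)) / np.sqrt(d)
            err = 1.0 - cross_val_score(KNeighborsClassifier(), X @ A, y, cv=3).mean()
            if err < best_err:
                best_err, best_A = err, A
        clf = KNeighborsClassifier().fit(X @ best_A, y)
        votes += clf.predict(X_new @ best_A)
    return (votes / B1 > threshold).astype(int)

pred = rp_ensemble_predict(X, y, X)
print("training-set accuracy of the random projection ensemble:", np.mean(pred == y))
```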

Journal ArticleDOI
TL;DR: This paper presents an alternative and potentially more attractive approach: summarizing x as it relates to y, so that all the information in the conditional distribution of y|x is preserved.
Abstract: Factor models have been successfully employed in summarizing large datasets with few underlying latent factors and in building time series forecasting models for economic variables. When the objective is to forecast a target variable y with a large set of predictors x, the construction of the summary of the xs should be driven by how informative on y it is. Most existing methods first reduce the predictors and then forecast y in independent phases of the modeling process. In this paper we present an alternative and potentially more attractive approach: summarizing x as it relates to y, so that all the information in the conditional distribution of y|x is preserved. These y-targeted reductions of the predictors are obtained using Sufficient Dimension Reduction techniques. We show in simulations and real data analysis that forecasting models based on sufficient reductions have the potential of significantly improved performance.

Journal ArticleDOI
TL;DR: This work proposes a nonparametric variable transformation method after which the predictors become normal and combines this flexible transformation method with two well-established SDR techniques, sliced inverse regression (SIR) and inverse regression estimator (IRE).
Abstract: Sufficient dimension reduction (SDR) techniques have proven to be very useful data analysis tools in various applications. Underlying many SDR techniques is a critical assumption that the predictors are elliptically contoured. When this assumption appears to be wrong, practitioners usually try variable transformation such that the transformed predictors become (nearly) normal. The transformation function is often chosen from the log and power transformation family, as suggested in the celebrated Box–Cox model. However, any parametric transformation can be too restrictive, causing the danger of model misspecification. We suggest a nonparametric variable transformation method after which the predictors become normal. To demonstrate the main idea, we combine this flexible transformation method with two well-established SDR techniques, sliced inverse regression (SIR) and inverse regression estimator (IRE). The resulting SDR techniques are referred to as TSIR and TIRE, respectively. Both simulation and real da...
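
A minimal sketch of the transformation-then-SIR idea on synthetic skewed predictors (the marginal rank-based normal-score map below is a crude stand-in for the paper's nonparametric transformation estimator, and the SIR routine is a textbook version, not the authors' code):

```python
import numpy as np
from scipy.stats import norm, rankdata

rng = np.random.default_rng(7)
n, p = 500, 4
Z = rng.standard_normal((n, p))
X = np.exp(Z)                                   # heavily skewed, non-elliptical predictors
y = Z[:, 0] + 0.2 * rng.standard_normal(n)      # response driven by one latent direction

def normal_scores(X):
    """Marginal rank-based transform making each column approximately N(0, 1)."""
    ranks = np.apply_along_axis(rankdata, 0, X)
    return norm.ppf((ranks - 0.5) / len(X))

def sir(X, y, n_slices=10, n_dirs=1):
    """Basic sliced inverse regression: eigenvectors of the between-slice
    covariance of slice means, computed on whitened predictors."""
    Xc = X - X.mean(axis=0)
    L = np.linalg.cholesky(np.cov(Xc, rowvar=False))
    Zw = Xc @ np.linalg.inv(L).T                # whitened predictors
    slices = np.array_split(np.argsort(y), n_slices)
    M = sum(len(s) * np.outer(Zw[s].mean(axis=0), Zw[s].mean(axis=0))
            for s in slices) / len(y)
    vals, vecs = np.linalg.eigh(M)
    return np.linalg.inv(L).T @ vecs[:, ::-1][:, :n_dirs]   # back to original coordinates

B = sir(normal_scores(X), y)                    # "TSIR"-style estimate (sketch)
print("leading direction on the transformed scale:", (B[:, 0] / np.linalg.norm(B)).round(2))
```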

Journal ArticleDOI
TL;DR: It is demonstrated theoretically and empirically that the properly weighted profile least squares approach is more efficient than its unweighted counterpart.

Journal ArticleDOI
TL;DR: Building on the estimating equation-based sufficient dimension reduction method in the literature, a robust version is proposed to alleviate the impact of outliers; to achieve this, a robust nonparametric regression estimator is suggested.