
Showing papers in "Biometrika in 2012"


Journal ArticleDOI
TL;DR: In this article, the authors proposed a scaled lasso method to jointly estimate the regression coefficients and the noise level in a linear model, formulated as the convex minimization of a penalized joint loss function.
Abstract: Scaled sparse linear regression jointly estimates the regression coefficients and noise level in a linear model. It chooses an equilibrium with a sparse regression method by iteratively estimating the noise level via the mean residual square and scaling the penalty in proportion to the estimated noise level. The iterative algorithm costs little beyond the computation of a path or grid of the sparse regression estimator for penalty levels above a proper threshold. For the scaled lasso, the algorithm is a gradient descent in a convex minimization of a penalized joint loss function for the regression coefficients and noise level. Under mild regularity conditions, we prove that the scaled lasso simultaneously yields an estimator for the noise level and an estimated coefficient vector satisfying certain oracle inequalities for prediction, the estimation of the noise level and the regression coefficients. These inequalities provide sufficient conditions for the consistency and asymptotic normality of the noise-level estimator, including certain cases where the number of variables is of greater order than the sample size. Parallel results are provided for least-squares estimation after model selection by the scaled lasso. Numerical results demonstrate the superior performance of the proposed methods over an earlier proposal of joint convex minimization. Some key words: Convex minimization; Estimation after model selection; Iterative algorithm; Linear regression; Oracle inequality; Penalized least squares; Scale invariance; Variance estimation.
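The iteration described above is straightforward to sketch. Below is a minimal illustration in Python, assuming scikit-learn's Lasso as the sparse regression solver and the universal penalty level sqrt(2 log p / n) as a default; neither choice is taken from the paper itself.

```python
import numpy as np
from sklearn.linear_model import Lasso

def scaled_lasso(X, y, lam0=None, max_iter=50, tol=1e-6):
    """Jointly estimate coefficients and noise level by alternating between
    lasso fits and mean-residual-square noise estimates (a sketch)."""
    n, p = X.shape
    if lam0 is None:
        lam0 = np.sqrt(2 * np.log(p) / n)  # assumed default penalty level
    sigma = np.std(y)                      # crude initial noise level
    beta = np.zeros(p)
    for _ in range(max_iter):
        # scikit-learn minimizes (1/(2n))||y - Xb||^2 + alpha*||b||_1,
        # so scaling alpha by sigma matches the scaled-penalty update
        fit = Lasso(alpha=lam0 * sigma, fit_intercept=False).fit(X, y)
        beta = fit.coef_
        resid = y - X @ beta
        sigma_new = np.sqrt(np.mean(resid ** 2))  # mean residual square
        if abs(sigma_new - sigma) < tol:
            sigma = sigma_new
            break
        sigma = sigma_new
    return beta, sigma
```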

454 citations


Journal ArticleDOI
TL;DR: The theory shows that the method proposed can consistently identify the subset of discriminative features contributing to the Bayes rule and at the same time consistently estimate the Bayes classification direction, even when the dimension can grow faster than any polynomial order of the sample size.
Abstract: Sparse discriminant methods based on independence rules, such as the nearest shrunken centroids classifier (Tibshirani et al., 2002) and features annealed independence rules (Fan & Fan, 2008), have been proposed as computationally attractive tools for feature selection and classification with high-dimensional data. A fundamental drawback of these rules is that they ignore correlations among features and thus could produce misleading feature selection and inferior classification. We propose a new procedure for sparse discriminant analysis, motivated by the least squares formulation of linear discriminant analysis. To demonstrate our proposal, we study the numerical and theoretical properties of discriminant analysis constructed via lasso penalized least squares. Our theory shows that the method proposed can consistently identify the subset of discriminative features contributing to the Bayes rule and at the same time consistently estimate the Bayes classification direction, even when the dimension can grow faster than any polynomial order of the sample size. The theory allows for general dependence among features. Simulated and real data examples show that lassoed discriminant analysis compares favourably with other popular sparse discriminant proposals.
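The least-squares route to a sparse discriminant direction can be sketched compactly. The class coding below is one standard choice and the penalty level is arbitrary; both are assumptions, not details taken from the paper.

```python
import numpy as np
from sklearn.linear_model import Lasso

def lassoed_lda_direction(X, labels, alpha=0.05):
    # Two-class LDA via its least-squares formulation: regress a coded class
    # indicator on the features with a lasso penalty. The sparse coefficient
    # vector estimates the discriminant direction, and its support gives the
    # selected discriminative features.
    n = len(labels)
    n1, n2 = np.sum(labels == 1), np.sum(labels == 0)
    y = np.where(labels == 1, n / n1, -n / n2)  # one standard LDA coding
    return Lasso(alpha=alpha).fit(X, y).coef_
```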

227 citations


Journal ArticleDOI
TL;DR: In this paper, the authors propose a flexible class of models for spatial extremes that exhibit dependence at observable levels but are asymptotically independent, and apply it to spatially referenced significant wave height data from the North Sea, finding evidence that the extremal structure of these data is not compatible with a limiting dependence model.
Abstract: Current dependence models for spatial extremes are based upon max-stable processes. Within this class, there are few inferentially viable models available, and we propose one further model. More problematic are the restrictive assumptions that must be made when using max-stable processes to model dependence for spatial extremes: it must be assumed that the dependence structure of the observed extremes is compatible with a limiting model that holds for all events more extreme than those that have already occurred. This problem has long been acknowledged in the context of finite-dimensional multivariate extremes, in particular when data display dependence at observable levels, but are independent in the limit. We propose a flexible class of models that is suitable for such data in a spatial context. In addition, we consider the situation where the extremal dependence structure may vary with distance. We apply our models to spatially referenced significant wave height data from the North Sea, finding evidence that their extremal structure is not compatible with a limiting dependence model.

187 citations


Journal ArticleDOI
TL;DR: A new class of double-robust estimators is derived for the parameters of regression models with incomplete cross-sectional or longitudinal data, and of marginal structural mean models for cross-sectional data, with similar efficiency properties.
Abstract: Recently proposed double-robust estimators for a population mean from incomplete data and for a finite number of counterfactual means can have much higher efficiency than the usual double-robust estimators under misspecification of the outcome model. In this paper, we derive a new class of double-robust estimators for the parameters of regression models with incomplete cross-sectional or longitudinal data, and of marginal structural mean models for cross-sectional data with similar efficiency properties. Unlike the recent proposals, our estimators solve outcome regression estimating equations. In a simulation study, the new estimator shows improvements in variance relative to the standard double-robust estimator that are in agreement with those suggested by asymptotic theory.
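For orientation, here is the usual double-robust (augmented inverse-probability-weighted) estimator of a population mean from incomplete data, the baseline that the paper's new class improves upon; the logistic and linear working models are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def aipw_mean(X, y, r):
    """Standard AIPW mean: r[i] = 1 if y[i] is observed, else 0."""
    ps = LogisticRegression().fit(X, r).predict_proba(X)[:, 1]   # missingness model
    m = LinearRegression().fit(X[r == 1], y[r == 1]).predict(X)  # outcome model
    y0 = np.where(r == 1, y, 0.0)  # placeholder for unobserved outcomes
    # doubly robust: consistent if either working model is correctly specified
    return np.mean(r * (y0 - m) / ps + m)
```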

161 citations


Journal ArticleDOI
TL;DR: Analysis of principal nested spheres provides an intuitive and flexible decomposition of the high-dimensional sphere; an interesting special case of the analysis results in finding principal geodesics, similar to those from previous approaches to manifold principal component analysis.
Abstract: A general framework for a novel non-geodesic decomposition of high-dimensional spheres or high-dimensional shape spaces for planar landmarks is discussed. The decomposition, principal nested spheres, leads to a sequence of submanifolds with decreasing intrinsic dimensions, which can be interpreted as an analogue of principal component analysis. In a number of real datasets, an apparent one-dimensional mode of variation curving through more than one geodesic component is captured in the one-dimensional component of principal nested spheres. While analysis of principal nested spheres provides an intuitive and flexible decomposition of the high-dimensional sphere, an interesting special case of the analysis results in finding principal geodesics, similar to those from previous approaches to manifold principal component analysis. An adaptation of our method to Kendall’s shape space is discussed, and a computational algorithm for fitting principal nested spheres is proposed. The result provides a coordinate system to visualize the data structure and an intuitive summary of principal modes of variation, as exemplified by several datasets.

149 citations


Journal ArticleDOI
TL;DR: Using convex optimization, a sparse estimator of the covariance matrix is constructed that is positive definite and performs well in high-dimensional settings and an efficient computational algorithm is developed.
Abstract: Using convex optimization, we construct a sparse estimator of the covariance matrix that is positive definite and performs well in high-dimensional settings. A lasso-type penalty is used to encourage sparsity and a logarithmic barrier function is used to enforce positive definiteness. Consistency and convergence rate bounds are established as both the number of variables and sample size diverge. An efficient computational algorithm is developed and the merits of the approach are illustrated with simulations and a speech signal classification example.
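An objective of this shape can be attacked with a simple proximal-gradient scheme: take a gradient step on the smooth Frobenius and log-barrier terms, then soft-threshold the off-diagonal entries. The sketch below is an assumed minimal implementation, not the paper's algorithm, and it presumes a small enough step size.

```python
import numpy as np

def soft_threshold(A, t):
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def sparse_pd_cov(S, lam=0.1, tau=1e-4, step=0.05, n_iter=500):
    """Roughly minimize 0.5||Sigma-S||_F^2 + lam*|offdiag(Sigma)|_1 - tau*logdet(Sigma)."""
    Sigma = np.diag(np.diag(S)).astype(float)        # positive definite start
    for _ in range(n_iter):
        grad = (Sigma - S) - tau * np.linalg.inv(Sigma)  # gradient of smooth part
        Sigma = Sigma - step * grad
        diag = np.diag(Sigma).copy()
        Sigma = soft_threshold(Sigma, step * lam)    # lasso-type proximal step
        np.fill_diagonal(Sigma, diag)                # penalize off-diagonals only
        Sigma = (Sigma + Sigma.T) / 2
        w, V = np.linalg.eigh(Sigma)                 # guard against leaving the PD cone
        Sigma = (V * np.maximum(w, 1e-8)) @ V.T
    return Sigma
```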

132 citations


Journal ArticleDOI
TL;DR: In this article, the authors generalize the Dunnett test to derive efficacy and futility boundaries for a flexible multi-arm multi-stage clinical trial for a normally distributed endpoint with known variance.
Abstract: We generalize the Dunnett test to derive efficacy and futility boundaries for a flexible multi-arm multi-stage clinical trial for a normally distributed endpoint with known variance. We show that the boundaries control the familywise error rate in the strong sense. The method is applicable for any number of treatment arms, number of stages and number of patients per treatment per stage. It can be used for a wide variety of boundary types or rules derived from α-spending functions. Additionally, we show how sample size can be computed under a least favourable configuration power requirement and derive formulae for expected sample sizes.
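With equal per-arm sample sizes, the single-stage Dunnett critical value is easy to approximate by Monte Carlo, since the arm-versus-control statistics are equicorrelated standard normals with correlation 1/2. This small check is only a sketch of the starting point, not the paper's multi-stage construction.

```python
import numpy as np

def dunnett_critical_value(n_arms, alpha=0.05, n_sim=500_000, seed=0):
    """Monte Carlo upper critical value controlling the familywise error
    rate when comparing n_arms treatments to one shared control."""
    rng = np.random.default_rng(seed)
    z0 = rng.standard_normal(n_sim)              # control group contribution
    zk = rng.standard_normal((n_sim, n_arms))    # treatment contributions
    stats = (zk - z0[:, None]) / np.sqrt(2.0)    # correlation 1/2 via the shared control
    return np.quantile(stats.max(axis=1), 1.0 - alpha)

# e.g. dunnett_critical_value(3) is about 2.06, versus 1.64 with no adjustment
```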

127 citations


Journal ArticleDOI
TL;DR: A unified estimating equation approach is developed to prove asymptotic independence between a filtering statistic and an interaction test statistic in a range of situations, including marginal association and interaction in a generalized linear model with a canonical link.
Abstract: Several two-stage multiple testing procedures have been proposed to detect gene-environment interaction in genome-wide association studies. In this article, we elucidate general conditions that are required for validity and power of these procedures, and we propose extensions of two-stage procedures using the case-only estimator of gene-treatment interaction in randomized clinical trials. We develop a unified estimating equation approach to proving asymptotic independence between a filtering statistic and an interaction test statistic in a range of situations, including marginal association and interaction in a generalized linear model with a canonical link. We assess the performance of various two-stage procedures in simulations and in genetic studies from Women’s Health Initiative clinical trials.
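The two-stage logic is simple: screen with a filtering statistic, then test interaction only on the survivors, relying on the asymptotic independence of the two statistics so that the screen does not distort the second-stage error rate. A minimal sketch, with the stage-1 threshold as an arbitrary assumption:

```python
import numpy as np

def two_stage_interaction_scan(p_filter, p_interaction, alpha=0.05, alpha1=1e-3):
    """Stage 1: keep markers whose filtering p-value passes alpha1.
    Stage 2: Bonferroni-test interaction only among the survivors."""
    passed = p_filter < alpha1
    m = max(int(passed.sum()), 1)                 # stage-2 Bonferroni denominator
    hits = passed & (p_interaction < alpha / m)
    return np.flatnonzero(hits)                   # indices of declared interactions
```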

100 citations


Journal ArticleDOI
TL;DR: In this paper, a Gaussian prior measure is chosen in the function space by specifying its precision operator as an appropriate differential operator for the unknown functional, and a Bayesian-Gaussian conjugate analysis is applied to estimate the drift of models used in molecular dynamics and financial econometrics.
Abstract: We consider estimation of scalar functions that determine the dynamics of diffusion processes. It has been recently shown that nonparametric maximum likelihood estimation is ill-posed in this context. We adopt a probabilistic approach to regularize the problem by the adoption of a prior distribution for the unknown functional. A Gaussian prior measure is chosen in the function space by specifying its precision operator as an appropriate differential operator. We establish that a Bayesian–Gaussian conjugate analysis for the drift of one-dimensional nonlinear diffusions is feasible using high-frequency data, by expressing the loglikelihood as a quadratic function of the drift, with sufficient statistics given by the local time process and the end points of the observed path. Computationally efficient posterior inference is carried out using a finite element method. We embed this technology in partially observed situations and adopt a data augmentation approach whereby we iteratively generate missing data paths and draws from the unknown functional. Our methodology is applied to estimate the drift of models used in molecular dynamics and financial econometrics using high- and low-frequency observations. We discuss extensions to other partially observed schemes and connections to other types of nonparametric inference.

86 citations


Journal ArticleDOI
TL;DR: The authors develop approaches to adaptively choose components, enabling classification and clustering of functional data to be reduced to finite-dimensional problems, and to determine regions that are relevant to one of these analyses but not the other.
Abstract: The infinite dimension of functional data can challenge conventional methods for classification and clustering. A variety of techniques have been introduced to address this problem, particularly in the case of prediction, but the structural models that they involve can be too inaccurate, or too abstract, or too difficult to interpret, for practitioners. In this paper, we develop approaches to adaptively choose components, enabling classification and clustering to be reduced to finite-dimensional problems. We explore and discuss properties of these methodologies. Our techniques involve methods for estimating classifier error rate and cluster tightness, and for choosing both the number of components, and their locations, to optimize these quantities. A major attraction of this approach is that it allows identification of parts of the function domain that convey important information for classification and clustering. It also permits us to determine regions that are relevant to one of these analyses but not the other.
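One concrete instance of this idea: choose the number of leading components by minimizing a cross-validated classifier error rate. The sketch below uses ordinary principal components and linear discriminant analysis as stand-ins; the paper's procedure also optimizes the components' locations, which this sketch does not attempt.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

def choose_n_components(X, y, max_components=20, cv=5):
    """Pick the number of components minimizing estimated classification error."""
    best_k, best_err = 1, np.inf
    for k in range(1, max_components + 1):
        scores = PCA(n_components=k).fit_transform(X)   # finite-dimensional reduction
        err = 1.0 - cross_val_score(LinearDiscriminantAnalysis(),
                                    scores, y, cv=cv).mean()
        if err < best_err:
            best_k, best_err = k, err
    return best_k, best_err
```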

76 citations


Journal ArticleDOI
TL;DR: The asymptotic properties of empirical likelihood and its penalized version are quantified and it is shown that penalized empirical likelihood has the oracle property.
Abstract: When a parametric likelihood function is not specified for a model, estimating equations may provide an instrument for statistical inference. Qin and Lawless (1994) illustrated that empirical likelihood makes optimal use of these equations in inferences for fixed low-dimensional unknown parameters. In this paper, we study empirical likelihood for general estimating equations with growing high dimensionality and propose a penalized empirical likelihood approach for parameter estimation and variable selection. We quantify the asymptotic properties of empirical likelihood and its penalized version, and show that penalized empirical likelihood has the oracle property. The performance of the proposed method is illustrated via simulated applications and a data analysis.
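As a reminder of the machinery involved, here is the classical empirical likelihood computation for a scalar mean, where the profile log-likelihood ratio is obtained by solving a one-dimensional Lagrange-multiplier equation; the penalized, high-dimensional version in the paper builds on this same construction.

```python
import numpy as np
from scipy.optimize import brentq

def el_log_ratio_mean(x, mu):
    """-2 log empirical likelihood ratio for H0: E[X] = mu (scalar case)."""
    z = x - mu
    if not (z.min() < 0 < z.max()):
        return np.inf  # mu outside the convex hull of the data
    def score(t):      # derivative of the profile log-likelihood in t
        return np.sum(z / (1.0 + t * z))
    lo, hi = -1.0 / z.max(), -1.0 / z.min()  # keep all weights positive
    eps = 1e-10 * (hi - lo)
    t = brentq(score, lo + eps, hi - eps)    # weights w_i proportional to 1/(1 + t z_i)
    return 2.0 * np.sum(np.log1p(t * z))     # compare to a chi-squared(1) quantile
```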

Journal ArticleDOI
TL;DR: In this paper, the authors introduce the notion of a dispersion operator, investigate its use in probing the second-order structure of functional data, and develop a test for comparing the second-order characteristics of two functional samples that is resistant to atypical observations and departures from normality.
Abstract: Inferences related to the second-order properties of functional data, as expressed by covariance structure, can become unreliable when the data are non-Gaussian or contain unusual observations. In the functional setting, it is often difficult to identify atypical observations, as their distinguishing characteristics can be manifold but subtle. In this paper, we introduce the notion of a dispersion operator, investigate its use in probing the second-order structure of functional data, and develop a test for comparing the second-order characteristics of two functional samples that is resistant to atypical observations and departures from normality. The proposed test is a regularized M-test based on a spectrally truncated version of the Hilbert-Schmidt norm of a score operator defined via the dispersion operator. We derive the asymptotic distribution of the test statistic, investigate the behaviour of the test in a simulation study and illustrate the method on a structural biology dataset.

Journal ArticleDOI
TL;DR: This article proposes a model-assisted projection method of estimation, based on a working model but with a design-based reference distribution, for combining two independent surveys in which a large sample from survey 1 collects only auxiliary information and a much smaller sample from survey 2 provides information on both the variables of interest and the auxiliary variables.
Abstract: Combining information from two or more independent surveys is a problem frequently encountered in survey sampling. We consider the case of two independent surveys, where a large sample from survey 1 collects only auxiliary information and a much smaller sample from survey 2 provides information on both the variables of interest and the auxiliary variables. We propose a model-assisted projection method of estimation based on a working model, but the reference distribution is design-based. We generate synthetic or proxy values of a variable of interest by first fitting the working model, relating the variable of interest to the auxiliary variables, to the data from survey 2 and then predicting the variable of interest associated with the auxiliary variables observed in survey 1. The projection estimator of a total is simply obtained from the survey 1 weights and associated synthetic values. We identify the conditions for the projection estimator to be asymptotically unbiased. Domain estimation using the projection method is also considered. Replication variance estimators are obtained by augmenting the synthetic data file for survey 1 with additional synthetic columns associated with the columns of replicate weights. Results from a simulation study are presented.
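The projection estimator itself is a short computation once a working model is chosen. A sketch with a linear working model (an assumption; any model fitted to survey 2 could play this role):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def projection_total(X1, w1, X2, y2):
    """Estimate a population total: fit the working model on survey 2,
    predict synthetic y-values for survey 1, and apply survey 1's weights."""
    working = LinearRegression().fit(X2, y2)   # working model on the small survey
    synthetic = working.predict(X1)            # proxy values for the large survey
    return np.sum(w1 * synthetic)              # design-weighted projection total
```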

Journal ArticleDOI
TL;DR: A multiple imputation estimator for parameter estimation in a quantile regression model when some covariates are missing at random is proposed, and the resulting coefficient estimators are root-n consistent and asymptotically normal.
Abstract: We propose a multiple imputation estimator for parameter estimation in a quantile regression model when some covariates are missing at random. The estimation procedure fully utilizes the entire dataset to achieve increased efficiency, and the resulting coefficient estimators are root-n consistent and asymptotically normal. To protect against possible model misspecification, we further propose a shrinkage estimator, which automatically adjusts for possible bias. The finite sample performance of our estimator is investigated in a simulation study. Finally, we apply our methodology to part of the Eating at America's Table Study data, investigating the association between two measures of dietary intake.

Journal ArticleDOI
TL;DR: In this article, a new regression model for parameterizing covariance structures in longitudinal data analysis is proposed, and the entries in this decomposition have a moving average and log-innovation interpretation and are modelled as linear functions of covariates.
Abstract: We propose new regression models for parameterizing covariance structures in longitudinal data analysis. Using a novel Cholesky factor, the entries in this decomposition have a moving average and log-innovation interpretation and are modelled as linear functions of covariates. We propose efficient maximum likelihood estimates for joint mean-covariance analysis based on this decomposition and derive the asymptotic distributions of the coefficient estimates. Furthermore, we study a local search algorithm, computationally more efficient than traditional all subset selection, based on BIC for model selection, and show its model selection consistency. Thus, a conjecture of Pan & MacKenzie (2003) is verified. We demonstrate the finite-sample performance of the method via analysis of data on CD4 trajectories and through simulations.
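The appeal of this decomposition is that it is unconstrained: a unit lower-triangular factor holds the moving-average coefficients and a diagonal factor holds exponentiated log-innovation variances, so any real-valued entries yield a valid covariance matrix. The parameterization below is an illustrative reading of that structure, not the paper's fitting procedure.

```python
import numpy as np

def ma_cholesky_covariance(ma_coefs, log_innovations):
    """Build Sigma = T D T' from below-diagonal moving-average coefficients
    and log-innovation variances; both may be modelled as linear in covariates."""
    p = len(log_innovations)
    T = np.eye(p)
    T[np.tril_indices(p, -1)] = ma_coefs       # unconstrained MA entries
    D = np.diag(np.exp(log_innovations))       # positive innovation variances
    return T @ D @ T.T                         # always a valid covariance matrix
```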

Journal ArticleDOI
Hansheng Wang
TL;DR: The new method assumes that the correlation structure of the high-dimensional data can be well represented by a set of low-dimensional latent factors, which can be estimated consistently by eigenvalue-eigenvector decomposition, and produces uncorrelated predictors.
Abstract: We propose a method of factor profiled sure independence screening for ultrahigh-dimensional variable selection. The objective of this method is to identify nonzero components consistently from a sparse coefficient vector. The new method assumes that the correlation structure of the high-dimensional data can be well represented by a set of low-dimensional latent factors, which can be estimated consistently by eigenvalue-eigenvector decomposition. The estimated latent factors should then be profiled out from both the response and the predictors. Such an operation, referred to as factor profiling, produces uncorrelated predictors. Therefore, sure independence screening can be applied subsequently and the resulting screening result is consistent for model selection, a major advantage that standard sure independence screening does not share. We refer to the new method as factor profiled sure independence screening. Numerical studies confirm its outstanding performance.
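The profiling step is concrete enough to sketch: estimate factors by the leading principal components, project them out of both response and predictors, then rank predictors by marginal correlation as in ordinary sure independence screening. Details such as centering and the factor count are assumptions here.

```python
import numpy as np

def factor_profiled_screening(X, y, n_factors, n_keep):
    """Profile out estimated latent factors, then screen by marginal correlation."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    F = U[:, :n_factors]                       # estimated factor scores (orthonormal)
    Xp = Xc - F @ (F.T @ Xc)                   # residualize predictors
    yp = yc - F @ (F.T @ yc)                   # residualize response
    corr = np.abs(Xp.T @ yp) / (np.linalg.norm(Xp, axis=0)
                                * np.linalg.norm(yp) + 1e-12)
    return np.argsort(corr)[::-1][:n_keep]     # indices of top-ranked predictors
```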

Journal ArticleDOI
TL;DR: Several optimality properties of Dorfman's (1943) group testing procedure are derived for estimation of the prevalence of a rare disease whose status is classified with error.
Abstract: Several optimality properties of Dorfman's (1943) group testing procedure are derived for estimation of the prevalence of a rare disease whose status is classified with error. Exact ranges of disease prevalence are obtained for which group testing provides more efficient estimation when group size increases.
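The efficiency comparison reduces to Fisher information. Under a standard misclassification model with known sensitivity and specificity, the information about the prevalence per tested individual has a closed form, so the prevalence ranges favouring larger groups can be mapped out numerically; the sketch below encodes that calculation under these assumptions.

```python
import numpy as np

def info_per_individual(p, k, sens=0.95, spec=0.98):
    """Fisher information about prevalence p per individual when pools of
    size k are tested with the given sensitivity and specificity."""
    q = (1.0 - p) ** k                          # pool is truly negative
    pi = sens * (1.0 - q) + (1.0 - spec) * q    # pool tests positive
    dpi = (sens + spec - 1.0) * k * (1.0 - p) ** (k - 1)
    return dpi ** 2 / (pi * (1.0 - pi)) / k

# comparing info_per_individual(0.01, k) over k shows pooling can beat k = 1
```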

Journal ArticleDOI
TL;DR: A new estimation approach based on corrected scores to account for a class of covariate measurement errors in quantile regression, which requires no parametric assumptions on the regression error distributions and is simple to implement.
Abstract: We study estimation in quantile regression when covariates are measured with errors. Existing methods require stringent assumptions, such as spherically symmetric joint distribution of the regression and measurement error variables, or linearity of all quantile functions, which restrict model flexibility and complicate computation. In this paper, we develop a new estimation approach based on corrected scores to account for a class of covariate measurement errors in quantile regression. The proposed method is simple to implement. Its validity requires only linearity of the particular quantile function of interest, and it requires no parametric assumptions on the regression error distributions. Finite-sample results demonstrate that the proposed estimators are more efficient than the existing methods in various models considered.

Journal ArticleDOI
TL;DR: In this paper, an alternative model is proposed for the distribution of the cluster maxima that accounts for the sub-asymptotic theory of extremes of a stationary process; the model is a product of two terms, one for the marginal distribution of exceedances and the other for the dependence structure of the exceedance values within a cluster.
Abstract: A standard approach to model the extreme values of a stationary process is the peaks over threshold method, which consists of imposing a high threshold, identifying clusters of exceedances of this threshold, and fitting the maximum value from each cluster using the generalised Pareto distribution. This approach is strongly justified by underlying asymptotic theory. We propose an alternative model for the distribution of the cluster maxima which accounts for the sub-asymptotic theory of extremes of a stationary process. This new distribution is a product of two terms, one for the marginal distribution of exceedances and the other for the dependence structure of the exceedance values within a cluster. We illustrate the improvement in fit, measured by the root mean square error of the estimated quantiles, offered by the new distribution over the peaks over thresholds analysis using simulated and hydrological data, and we suggest a diagnostic tool to help identify when the proposed model is likely to lead to such an improvement in fit.
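For reference, the baseline peaks-over-threshold analysis that the new distribution refines looks like this: decluster exceedances with a simple runs rule, then fit a generalised Pareto distribution to the cluster maxima. The runs-rule gap and the use of scipy's fitter are assumptions of this sketch.

```python
import numpy as np
from scipy.stats import genpareto

def cluster_maxima(x, threshold, gap=3):
    """Runs declustering: a cluster ends after `gap` consecutive
    non-exceedances; return the maximum of each cluster."""
    maxima, current, below = [], None, 0
    for v in x:
        if v > threshold:
            current = v if current is None else max(current, v)
            below = 0
        elif current is not None:
            below += 1
            if below >= gap:
                maxima.append(current)
                current, below = None, 0
    if current is not None:
        maxima.append(current)
    return np.array(maxima)

def fit_pot(x, threshold, gap=3):
    """Classical peaks-over-threshold fit to cluster maxima."""
    m = cluster_maxima(x, threshold, gap)
    shape, _, scale = genpareto.fit(m - threshold, floc=0)
    return shape, scale
```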

Journal ArticleDOI
TL;DR: A simple method is proposed for the marginal analysis of longitudinal data with time-varying covariates, some of which are measured with error, while the response is subject to missingness; the method has a number of appealing properties.
Abstract: Covariate measurement error and missing responses are typical features in longitudinal data analysis. There has been extensive research on either covariate measurement error or missing responses, but relatively little work has been done to address both simultaneously. In this paper, we propose a simple method for the marginal analysis of longitudinal data with time-varying covariates, some of which are measured with error, while the response is subject to missingness. Our method has a number of appealing properties: assumptions on the model are minimal, with none needed about the distribution of the mismeasured covariate; implementation is straightforward and its applicability is broad. We provide both theoretical justification and numerical results.

Journal ArticleDOI
TL;DR: The inner envelope model is based on a different construction from the envelope model and can produce substantial efficiency gains in situations where the envelope model offers no gains, opening a new frontier in the way reducing subspaces can be used to improve efficiency in multivariate problems.
Abstract: In this article we propose a new model, called the inner envelope model, which leads to efficient estimation in the context of multivariate normal linear regression. The asymptotic distribution and the consistency of its maximum likelihood estimators are established. Theoretical results, simulation studies and examples all show that the efficiency gains can be substantial relative to standard methods and to the maximum likelihood estimators from the envelope model introduced recently by Cook et al. (2010). Compared to the envelope model, the inner envelope model is based on a different construction and it can produce substantial efficiency gains in situations where the envelope model offers no gains. In effect, inner envelopes open a new frontier to the way in which reducing subspaces can be used to improve efficiency in multivariate problems.

Journal ArticleDOI
TL;DR: In this paper, a semiparametric proportional likelihood ratio model is proposed for modeling a nonlinear monotonic relationship between the outcome variable and a covariate, and a maximum likelihood estimator is obtained for the new model.
Abstract: We propose a semiparametric proportional likelihood ratio model which is particularly suitable for modelling a nonlinear monotonic relationship between the outcome variable and a covariate. This model extends the generalized linear model by leaving the distribution unspecified, and has a strong connection with semiparametric models such as the selection bias model (Gilbert et al., 1999), the density ratio model (Qin, 1998; Fokianos & Kaimi, 2006), the single-index model (Ichimura, 1993) and the exponential tilt regression model (Rathouz & Gao, 2009). A maximum likelihood estimator is obtained for the new model and its asymptotic properties are derived. An example and simulation study illustrate the use of the model.

Journal ArticleDOI
TL;DR: In this paper, the normalized inverse Gaussian process is considered and a completely explicit stick-breaking representation for it is provided, a result of interest both from a theoretical viewpoint and for statistical practice.
Abstract: Random probability measures are the main tool for Bayesian nonparametric inference, with their laws acting as prior distributions. Many well-known priors used in practice admit different, though equivalent, representations. In terms of computational convenience, stick-breaking representations stand out. In this paper we focus on the normalized inverse Gaussian process and provide a completely explicit stick-breaking representation for it. This result is of interest both from a theoretical viewpoint and for statistical practice.
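To make "stick-breaking representation" concrete, here is the generic construction in the familiar Dirichlet-process case, where the stick proportions are independent Beta(1, alpha) variables; the paper's contribution is the explicit, non-Beta law of the proportions for the normalized inverse Gaussian process, which this sketch does not reproduce.

```python
import numpy as np

def stick_breaking_weights(alpha, n_atoms, rng=None):
    """First n_atoms weights of a Dirichlet-process stick-breaking prior."""
    rng = rng or np.random.default_rng()
    v = rng.beta(1.0, alpha, size=n_atoms)               # stick proportions
    leftover = np.cumprod(np.concatenate(([1.0], 1.0 - v[:-1])))
    return v * leftover                                  # w_k = v_k * prod_{j<k}(1 - v_j)
```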

Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of fitting a generalized linear model to overdispersed data, focussing on a quasilikelihood approach in which the variance is assumed to be proportional to that specified by the model, and the constant of proportionality, φ, is used to obtain appropriate standard errors and model comparisons.
Abstract: We consider the problem of fitting a generalized linear model to overdispersed data, focussing on a quasilikelihood approach in which the variance is assumed to be proportional to that specified by the model, and the constant of proportionality, φ, is used to obtain appropriate standard errors and model comparisons. It is common practice to base an estimate of φ on Pearson’s lack-of-fit statistic, with or without Farrington’s modification. We propose a new estimator that has a smaller variance, subject to a condition on the third moment of the response variable. We conjecture that this condition is likely to be achieved for the important special cases of count and binomial data. We illustrate the benefits of the new estimator using simulations for both count and binomial data.
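The Pearson baseline the paper improves on is one line of algebra: divide the lack-of-fit statistic by the residual degrees of freedom. A sketch, with count data (variance function V(mu) = mu) as the usage example:

```python
import numpy as np

def pearson_dispersion(y, mu, var_fun, n_params):
    """Pearson estimate of phi: X^2 / (n - p) for a quasilikelihood fit."""
    x2 = np.sum((y - mu) ** 2 / var_fun(mu))   # Pearson lack-of-fit statistic
    return x2 / (len(y) - n_params)

# for overdispersed counts: phi_hat = pearson_dispersion(y, mu_hat, lambda m: m, p)
```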

Journal ArticleDOI
TL;DR: This article proposes a regression method for simultaneous supervised clustering and feature selection over a given undirected graph, where homogeneous groups or clusters are estimated as well as informative predictors, with each predictor corresponding to one node in the graph and a connecting path indicating a priori possible grouping among the corresponding predictors.
Abstract: In this article, we propose a regression method for simultaneous supervised clustering and feature selection over a given undirected graph, where homogeneous groups or clusters are estimated as well as informative predictors, with each predictor corresponding to one node in the graph and a connecting path indicating a priori possible grouping among the corresponding predictors. The method seeks a parsimonious model with high predictive power through identifying and collapsing homogeneous groups of regression coefficients. To address computational challenges, we present an efficient algorithm integrating the augmented Lagrange multipliers, coordinate descent and difference convex methods. We prove that the proposed method not only identifies the true homogeneous groups and informative features consistently but also leads to accurate parameter estimation. A gene network dataset is analysed to demonstrate that the method can make a difference by exploring dependency structures among the genes.

Journal ArticleDOI
TL;DR: Sufficient conditions under which a local minimizer is unique are given, and it is shown that the oracle estimator becomes the unique local minimizer with probability tending to one.
Abstract: Nonconvex penalties such as the smoothly clipped absolute deviation or minimax concave penalties have desirable properties such as the oracle property, even when the dimension of the predictive variables is large. However, checking whether a given local minimizer has such properties is not easy since there can be many local minimizers. In this paper, we give sufficient conditions under which a local minimizer is unique, and show that the oracle estimator becomes the unique local minimizer with probability tending to one.

Journal ArticleDOI
TL;DR: In this article, a flexible class of semiparametric mean residual life models where some effects may be time-varying and some may be constant over time is proposed.
Abstract: The mean residual life provides the remaining life expectancy of a subject who has survived to a certain time-point. When covariates are present, regression models are needed to study the association between the mean residual life function and potential regression covariates. In this paper, we propose a flexible class of semiparametric mean residual life models where some effects may be time-varying and some may be constant over time. In the presence of right censoring, we use the inverse probability of censoring weighting approach and develop inference procedures for estimating the model parameters. In addition, we provide graphical and numerical methods for model checking and tests for examining whether or not the covariate effects vary with time. Asymptotic and finite sample properties of the proposed estimators are established and the approach is applied to real life datasets collected from clinical trials.

Journal ArticleDOI
TL;DR: In a matched observational study of treatment effects, a sensitivity analysis asks about the magnitude of the departure from random assignment that would need to be present to alter the conclusions of an analysis that assumes that matching for measured covariates removes all bias as discussed by the authors.
Abstract: In a matched observational study of treatment effects, a sensitivity analysis asks about the magnitude of the departure from random assignment that would need to be present to alter the conclusions of an analysis that assumes that matching for measured covariates removes all bias. The reported degree of sensitivity to unmeasured biases depends on both the process that generated the data and the chosen methods of analysis, so a poor choice of method may lead to an exaggerated report of sensitivity to bias. This suggests the possibility of performing more than one analysis with a correction for multiple inference, say testing one null hypothesis using two or three different tests. In theory and in an example, it is shown that, in large samples, the gains from testing twice will often be large, because testing twice has the larger of the two design sensitivities of the component tests, and the losses due to correcting for two tests will often be small, because two tests of one hypothesis will typically be highly correlated, so a correction for multiple testing that takes this into account will be small. An illustration uses data from the U.S. National Health and Nutrition Examination Survey concerning lead in the blood of cigarette smokers.

Journal ArticleDOI
TL;DR: In this paper, the authors investigate the statistical properties of multilinear principal component analysis, which serves a function analogous to principal component analysis for tensor-structured data and has empirically been shown effective in reducing dimensionality.
Abstract: Principal component analysis is commonly used for dimension reduction in analysing high-dimensional data. Multilinear principal component analysis aims to serve a similar function for analysing tensor structure data, and has empirically been shown effective in reducing dimensionality. In this paper, we investigate its statistical properties and demonstrate its advantages. Conventional principal component analysis, which vectorizes the tensor data, may lead to inefficient and unstable prediction due to the often extremely large dimensionality involved. Multilinear principal component analysis, in trying to preserve the data structure, searches for low-dimensional projections and, thereby, decreases dimensionality more efficiently. The asymptotic theory of order-two multilinear principal component analysis, including asymptotic efficiency and distributions of principal components, associated projections, and the explained variance, is developed. A test of dimensionality is also proposed. Finally, multilinear principal component analysis is shown to improve conventional principal component analysis in analysing the Olivetti faces dataset, which is achieved by extracting a more modularly oriented basis set in reconstructing the test faces.
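Order-two multilinear principal component analysis is commonly computed by alternating eigendecompositions of the two mode-wise scatter matrices, avoiding the p1 x p2-dimensional vectorized problem. The initialization and iteration count below are assumptions of this sketch, not details from the paper.

```python
import numpy as np

def mpca_order2(X, k1, k2, n_iter=10):
    """X: (n, p1, p2) tensor data. Returns mode bases U1 (p1, k1), U2 (p2, k2)
    and the projected (n, k1, k2) scores."""
    Xc = X - X.mean(axis=0)
    p2 = X.shape[2]
    U2 = np.eye(p2)[:, :k2]                            # simple initialization
    for _ in range(n_iter):
        S1 = sum(xi @ U2 @ U2.T @ xi.T for xi in Xc)   # mode-1 scatter
        U1 = np.linalg.eigh(S1)[1][:, ::-1][:, :k1]    # top eigenvectors
        S2 = sum(xi.T @ U1 @ U1.T @ xi for xi in Xc)   # mode-2 scatter
        U2 = np.linalg.eigh(S2)[1][:, ::-1][:, :k2]
    scores = np.einsum('ab,nbc,cd->nad', U1.T, Xc, U2) # low-dimensional projections
    return U1, U2, scores
```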

Journal ArticleDOI
TL;DR: This paper presents a simple rule that substantially reduces computation by allowing resampling to terminate early on a subset of tests, and proves that the method has a low probability of obtaining a set of rejected hypotheses different from those rejected without early stopping.
Abstract: Resampling-based methods for multiple hypothesis testing often lead to long run times when the number of tests is large. This paper presents a simple rule that substantially reduces computation by allowing resampling to terminate early on a subset of tests. We prove that the method has a low probability of obtaining a set of rejected hypotheses different from those rejected without early stopping, and obtain error bounds for multiple hypothesis testing. Simulation shows that our approach saves more computation than other available procedures.
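The idea of terminating resampling early can be illustrated with a sequential Monte Carlo p-value in the style of Besag and Clifford: stop drawing permutations for a hypothesis once so many resampled statistics exceed the observed one that rejection at stringent levels is no longer possible. The paper's rule is tailored to the multiple-testing procedure as a whole; this per-test sketch only conveys the mechanism.

```python
import numpy as np

def sequential_perm_pvalue(stat_obs, draw_stat, max_draws=10_000, cap=100, rng=None):
    """Resample until `cap` exceedances accumulate (early stop) or the
    budget is exhausted; returns an estimated permutation p-value.
    `draw_stat` is a user-supplied callable generating one resampled statistic."""
    rng = rng or np.random.default_rng()
    exceed = 0
    for b in range(1, max_draws + 1):
        if draw_stat(rng) >= stat_obs:
            exceed += 1
            if exceed >= cap:          # the p-value can no longer fall below
                return exceed / b      # cap/max_draws, so stop early
    return (exceed + 1) / (max_draws + 1)
```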