
Showing papers in "Statistica Sinica in 2010"


Journal Article
TL;DR: In this paper, a brief account of the recent developments of theory, methods, and implementations for high-dimensional variable selection is presented, with emphasis on independence screening and two-scale methods.
Abstract: High dimensional statistical problems arise from diverse fields of scientific research and technological development. Variable selection plays a pivotal role in contemporary statistical learning and scientific discoveries. The traditional idea of best subset selection methods, which can be regarded as a specific form of penalized likelihood, is computationally too expensive for many modern statistical applications. Other forms of penalized likelihood methods have been successfully developed over the last decade to cope with high dimensionality. They have been widely applied for simultaneously selecting important variables and estimating their effects in high dimensional statistical inference. In this article, we present a brief account of the recent developments of theory, methods, and implementations for high dimensional variable selection. Questions about the limits of dimensionality such methods can handle, the role of penalty functions, and the resulting statistical properties rapidly drive the advances of the field. The properties of non-concave penalized likelihood and its roles in high dimensional statistical modeling are emphasized. We also review some recent advances in ultra-high dimensional variable selection, with emphasis on independence screening and two-scale methods.
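For readers unfamiliar with the two ingredients emphasized above, the following minimal Python sketch illustrates a non-concave penalty (the SCAD penalty of Fan and Li) and correlation-based independence screening. The function names, the toy data, and the choice a = 3.7 are conventional defaults for illustration, not specifics taken from this paper.

```python
import numpy as np

def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty evaluated elementwise at |beta| (a = 3.7 is the usual default)."""
    b = np.abs(beta)
    small = lam * b                                           # |beta| <= lam: L1-like
    mid = (2 * a * lam * b - b**2 - lam**2) / (2 * (a - 1))   # lam < |beta| <= a*lam
    large = lam**2 * (a + 1) / 2                              # |beta| > a*lam: constant
    return np.where(b <= lam, small, np.where(b <= a * lam, mid, large))

def sure_independence_screening(X, y, d):
    """Keep the d predictors with the largest absolute marginal correlation with y."""
    Xc = (X - X.mean(0)) / X.std(0)
    yc = (y - y.mean()) / y.std()
    marginal = np.abs(Xc.T @ yc) / len(y)
    return np.argsort(marginal)[::-1][:d]

# toy usage: p >> n, screen 2000 predictors down to d = 20 candidates
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2000))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(size=100)
kept = sure_independence_screening(X, y, d=20)
```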

892 citations


Journal Article
TL;DR: In this article, the authors propose a new parsimonious version of the classical multivariate normal linear model, yielding a maximum likelihood estimator (MLE) that is asymptotically less variable than the MLE based on the usual model.
Abstract: We propose a new parsimonious version of the classical multivariate normal linear model, yielding a maximum likelihood estimator (MLE) that is asymptotically less variable than the MLE based on the usual model. Our approach is based on the construction of a link between the mean function and the covariance matrix, using the minimal reducing subspace of the latter that accommodates the former. This leads to a multivariate regression model that we call the envelope model, where the number of parameters is maximally reduced. The MLE from the envelope model can be substantially less variable than the usual MLE, especially when the mean function varies in directions that are orthogonal to the directions of maximum variation for the covariance matrix.

140 citations


Journal Article
TL;DR: A methodology is detailed for estimating parameters from such large and/or incomplete data sets, illustrated with the UK 2001 foot-and-mouth disease (FMD) epidemic.
Abstract: Individual Level Models (ILMs), a new class of models, are being applied to infectious epidemic data to aid in the understanding of the spatio-temporal dynamics of infectious diseases. These models are highly flexible and intuitive, and can be parameterised under a Bayesian framework via Markov chain Monte Carlo (MCMC) methods. Unfortunately, this parameterisation can be difficult to implement due to intense computational requirements when calculating the full posterior for large, or even moderately large, susceptible populations, or when missing data are present. Here we detail a methodology that can be used to estimate parameters for such large, and/or incomplete, data sets. This is done in the context of a study of the UK 2001 foot-and-mouth disease (FMD) epidemic.

126 citations



Journal Article
TL;DR: A generalized product partition model (GPPM) in which the partition process is predictor-dependent is derived, which generalizes DP clustering to relax the exchangeability assumption through the incorporation of predictors, resulting in a generalized Polya urn scheme.
Abstract: Starting with a carefully formulated Dirichlet process (DP) mixture model, we derive a generalized product partition model (GPPM) in which the partition process is predictor-dependent. The GPPM generalizes DP clustering to relax the exchangeability assumption through the incorporation of predictors, resulting in a generalized Polya urn scheme. In addition, the GPPM can be used for formulating flexible semiparametric Bayes models for conditional distribution estimation, bypassing the need for expensive computation of large numbers of unknowns characterizing priors for dependent collections of random probability measures. A variety of special cases are considered, and an efficient Gibbs sampling algorithm is developed for posterior computation. The methods are illustrated using simulation examples and an epidemiologic application.

83 citations


Journal Article
TL;DR: The Wang-Landau algorithm (Wang and Landau, 2001) is a recent Monte Carlo method that has generated much interest in the physics literature due to some spectacular simulation performances.
Abstract: The Wang-Landau algorithm (Wang and Landau (2001)) is a recent Monte Carlo method that has generated much interest in the physics literature due to some spectacular simulation performances. The objective of this paper is two-fold. First, we show that the algorithm can be naturally extended to more general state spaces and used to improve on Markov chain Monte Carlo schemes of more interest in statistics. In the second part, we study asymptotic behaviors of the algorithm. We show that with an appropriate choice of the step-size, the algorithm is consistent and a strong law of large numbers holds under some fairly mild conditions. We also show by simulations the potential advantage of the WL algorithm for problems in Bayesian inference.
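As a rough illustration of the algorithm's generalized, statistical form, the sketch below runs a random-walk Metropolis sampler whose target is reweighted by adaptive bin weights updated with a decreasing step size. The bimodal toy target, the binning of the state space, and the 1/t step-size schedule are illustrative assumptions, not the specific choices studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_target(x):
    # bimodal toy target: unnormalized mixture of two well-separated normals
    return np.logaddexp(-0.5 * (x - 4)**2, -0.5 * (x + 4)**2)

bins = np.linspace(-8, 8, 17)          # partition of the state space into regions
log_theta = np.zeros(len(bins) - 1)    # adaptive log-weights, one per region

def bin_index(x):
    return int(np.clip(np.digitize(x, bins) - 1, 0, len(log_theta) - 1))

x = 0.0
for t in range(1, 50001):
    prop = x + rng.normal(scale=1.0)                        # random-walk proposal
    log_acc = (log_target(prop) - log_theta[bin_index(prop)]) \
              - (log_target(x) - log_theta[bin_index(x)])   # reweighted target pi/theta
    if np.log(rng.uniform()) < log_acc:
        x = prop
    log_theta[bin_index(x)] += 1.0 / t   # decreasing step size (flat-histogram update)
```

With the weights learned this way, low-probability regions are visited far more often than under plain Metropolis sampling, which is the feature that makes the algorithm attractive for multimodal targets.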

73 citations


Journal Article
TL;DR: It is shown that the variable selection procedure based on ICQ automatically and consistently selects the important covariates and leads to efficient estimates with oracle properties, and it can be applied to numerous situations involving missing data.
Abstract: We consider the variable selection problem for a class of statistical models with missing data, including missing covariate and/or response data. We investigate the smoothly clipped absolute deviation penalty (SCAD) and adaptive LASSO and propose a unified model selection and estimation procedure for use in the presence of missing data. We develop a computationally attractive algorithm for simultaneously optimizing the penalized likelihood function and estimating the penalty parameters. Particularly, we propose to use a model selection criterion, called the ICQ statistic, for selecting the penalty parameters. We show that the variable selection procedure based on ICQ automatically and consistently selects the important covariates and leads to efficient estimates with oracle properties. The methodology is very general and can be applied to numerous situations involving missing data, from covariates missing at random in arbitrary regression models to nonignorably missing longitudinal responses and/or covariates. Simulations are given to demonstrate the methodology and examine the finite sample performance of the variable selection procedures. Melanoma data from a cancer clinical trial are presented to illustrate the proposed methodology.

65 citations


Journal Article
TL;DR: An EM algorithm is developed to calculate the estimates for both the regression parameters and the unknown error density, in which a kernel-smoothed conditional profile likelihood is maximized in the M-step, and shows that with a proper choice of the kernel bandwidth parameter, the resulting estimates are consistent and asymptotically normal.
Abstract: We study the accelerated failure time model with a cure fraction via kernel-based nonparametric maximum likelihood estimation. An EM algorithm is developed to calculate the estimates for both the regression parameters and the unknown error density, in which a kernel-smoothed conditional profile likelihood is maximized in the M-step. We show that with a proper choice of the kernel bandwidth parameter, the resulting estimates are consistent and asymptotically normal. The asymptotic covariance matrix can be consistently estimated by inverting the empirical Fisher information matrix obtained from the profile likelihood using the EM algorithm. Numerical examples are used to illustrate the finite-sample performance of the proposed estimates.

60 citations


Journal Article
TL;DR: In this article, a two-stage approach is proposed to generate synthetic data that enables agencies to release different numbers of imputations for different variables, which can reduce computational burdens, decrease disclosure risk, and increase inferential accuracy relative to generation in one stage.
Abstract: To protect the confidentiality of survey respondents' identities and sensitive attributes, statistical agencies can release data in which confidential values are replaced with multiple imputations. These are called synthetic data. We propose a two-stage approach to generating synthetic data that enables agencies to release different numbers of imputations for different variables. Generation in two stages can reduce computational burdens, decrease disclosure risk, and increase inferential accuracy relative to generation in one stage. We present methods for obtaining inferences from such data. We describe the application of two-stage synthesis to creating a public use file for a German business database.

50 citations


Journal Article
TL;DR: A general asymptotic theory for nonparametric maximum likelihood estimation in semiparametric regression models with right censored data is established and the usefulness of this powerful theory is demonstrated through a variety of examples.
Abstract: We establish a general asymptotic theory for nonparametric maximum likelihood estimation in semiparametric regression models with right censored data. We identify a set of regularity conditions under which the nonparametric maximum likelihood estimators are consistent, asymptotically normal, and asymptotically efficient with a covariance matrix that can be consistently estimated by the inverse information matrix or the profile likelihood method. The general theory allows one to obtain the desired asymptotic properties of the nonparametric maximum likelihood estimators for any specific problem by verifying a set of conditions rather than by proving technical results from first principles. We demonstrate the usefulness of this powerful theory through a variety of examples.

50 citations


Journal Article
TL;DR: In this article, the authors propose a one-step estimator that has the oracle property in variable selection and estimation; it has a much simpler implementation and gives better performance than the ordinary gSCAD estimator.
Abstract: Nonparametric varying coefficient models are useful for the analysis of repeated measurements. While many procedures have been developed for estimating varying-coefficients, there have been few results on variable selection for such models. Recently, Wang, Chen and Li (2007) proposed a group SCAD procedure for model selection in varying-coefficient models, and Wang, Li and Huang (2008) established the existence of a local minimizer of the group SCAD criterion that has the oracle property. However, whether the final estimator from the gSCAD procedure via local quadratic approximation always finds the desired local minimizer is not clear. In this paper, by linearizing the gSCAD penalty we propose a one-step estimator that has the oracle property in variable selection and estimation. The proposed estimator has a much simpler implementation and gives better performance in variable selection and estimation than the ordinary gSCAD estimator.

Journal Article
TL;DR: In this article, a nonparametric model for conditional covariance matrix is proposed, and a kernel estimator is developed accordingly, its asymptotic bias and variance are derived, and its normality is established.
Abstract: There has been considerable attention on estimation of the conditional variance function in the literature. We propose here a nonparametric model for the conditional covariance matrix. A kernel estimator is developed accordingly, its asymptotic bias and variance are derived, and its asymptotic normality is established. A real data example is used to illustrate the proposed estimation procedure.
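To make the estimand concrete, here is a minimal Nadaraya-Watson-style sketch: the conditional mean is estimated with kernel weights and the conditional covariance is a kernel-weighted average of outer products of residuals. The Gaussian kernel, the scalar covariate, and this particular plug-in form are illustrative assumptions, not necessarily the estimator studied in the paper.

```python
import numpy as np

def gaussian_kernel(u):
    return np.exp(-0.5 * u**2)

def conditional_covariance(x0, X, Y, h):
    """Kernel estimate of Cov(Y | X = x0) for scalar X and vector-valued Y."""
    w = gaussian_kernel((X - x0) / h)
    w = w / w.sum()
    m = w @ Y                                  # kernel estimate of E[Y | X = x0]
    resid = Y - m
    return (w[:, None] * resid).T @ resid      # weighted average of outer products

# toy usage: 2-dimensional response whose covariance changes with x
rng = np.random.default_rng(2)
X = rng.uniform(0, 1, 500)
Y = np.column_stack([rng.normal(scale=1 + X), rng.normal(scale=2 - X)])
Sigma_hat = conditional_covariance(0.5, X, Y, h=0.1)
```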

Journal Article
TL;DR: An estimating equation-based approach for regression analysis of interval-censored failure time data with the additive hazards model is proposed and is robust and applies to both noninformative and informative censoring cases.
Abstract: Interval-censored failure time data often arise in clinical trials and medical follow-up studies, and a few methods have been proposed for their regression analysis using various regression models (Finkelstein (1986); Huang (1996); Lin, Oakes, and Ying (1998); Sun (2006)). This paper proposes an estimating equation-based approach for regression analysis of interval-censored failure time data with the additive hazards model. The proposed approach is robust and applies to both noninformative and informative censoring cases. A major advantage of the proposed method is that it does not involve estimation of any baseline hazard function. The implementation of the proposed approach is easy and fast. Asymptotic properties of the proposed estimates are established and some simulation results and an application are provided.

Journal Article
TL;DR: This work proposes a strategy for performing constrained variable selection, and discusses hierarchical and grouping constraints, and introduces anti-hierarchical constraints in which the inclusion of a variable forces another to be excluded from the model.
Abstract: By building on the stochastic search approach (George and McCulloch (1993)), we propose a strategy for performing constrained variable selection. We discuss hierarchical and grouping constraints, and introduce anti-hierarchical constraints in which the inclusion of a variable forces another to be excluded from the model. We prove consistency results about models receiving maximal posterior probability, and about the median model (Barbieri and Berger (2004)), and discuss extensions to generalized linear models.
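A small sketch of how such constraints can be enforced inside a stochastic-search sweep: candidate inclusion vectors that violate a hierarchical constraint (an interaction needs its main effects) or an anti-hierarchical constraint (including one variable forces another out) are rejected before the usual model comparison. The constraint encoding and variable names below are a hypothetical illustration, not the paper's notation.

```python
# Hypothetical constraint encoding: gamma is a dict of 0/1 inclusion indicators.
hierarchical = {"x1*x2": ["x1", "x2"]}   # interaction requires both main effects
anti_hierarchical = {"x3": ["x4"]}       # including x3 forces x4 out of the model

def satisfies_constraints(gamma):
    for child, parents in hierarchical.items():
        if gamma.get(child, 0) and not all(gamma.get(p, 0) for p in parents):
            return False
    for var, excluded in anti_hierarchical.items():
        if gamma.get(var, 0) and any(gamma.get(e, 0) for e in excluded):
            return False
    return True

# In a stochastic-search step, a proposed flip of one indicator would only be
# evaluated further if the resulting model passes this check:
gamma = {"x1": 1, "x2": 0, "x1*x2": 1, "x3": 1, "x4": 0}
print(satisfies_constraints(gamma))   # False: the interaction lacks main effect x2
```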

Journal Article
TL;DR: A data-driven method is proposed that models the penalty function nonparametrically by a step function whose segmentation is data driven, and estimates it by maximizing the generalized likelihood.
Abstract: In classical smoothing splines, the smoothness is controlled by a single smoothing parameter that penalizes the roughness uniformly across the whole domain. Adaptive smoothing splines extend this framework to allow the smoothing parameter to change in the domain, adapting to the change of roughness. In this article we propose a data driven method to nonparametrically model the penalty function. We propose to approximate the penalty function by a step function whose segmentation is data driven, and to estimate it by maximizing the generalized likelihood. A complexity penalty is added to the generalized likelihood in selecting the best step function from a collection of candidates. A state space representation for the adaptive smoothing splines is derived to ease the computational demand. To allow for fast search among the candidate models, we impose a binary tree structure on the penalty function and propose an efficient search algorithm. We show the consistency of the final estimate. We demonstrate the effectiveness of the method through simulations and a data example.

Journal Article
TL;DR: This work proposes an approach to constructing new NSFDs based on powerful (t, s)-sequences, which is simple, easy to implement, and quite general, and can also construct NSFDs for categorical and mixed factors.
Abstract: Multi-fidelity computer experiments are widely used in many engineering and scientific fields. Nested space-filling designs (NSFDs) are suitable for conducting such experiments. Two classes of NSFDs are currently available. One class is based on special orthogonal arrays of strength two, and the other consists of nested Latin hypercube designs; both of them assume all factors are continuous. We propose an approach to constructing new NSFDs based on powerful (t, s)-sequences. The method is simple, easy to implement, and quite general. For continuous factors, this approach produces NSFDs with better space-filling properties than existing ones. Unlike the previous methods, it can also construct NSFDs for categorical and mixed factors. Some illustrative examples are given. Other applications of the constructed designs are briefly discussed.

Journal Article
TL;DR: In this paper, a nonparametric prior for two-dimensional vectors of survival functions (S1, S2) is proposed based on the Levy copula, and it is used to model, in a nonparametric Bayesian framework, two-sample survival data.
Abstract: The paper proposes a new nonparametric prior for two-dimensional vectors of survival functions (S1, S2). The definition is based on the Levy copula and it is used to model, in a nonparametric Bayesian framework, two-sample survival data. Such an application yields a natural extension of the more familiar neutral to the right process of Doksum (1974) adopted for drawing inferences on single survival functions. We then obtain a description of the posterior distribution of (S1, S2), conditionally on possibly right-censored data. As a by-product, we find that the marginal distribution of a pair of observations from the two samples coincides with the Marshall-Olkin or the Weibull distribution according to specific choices of the marginal Levy measures.

Journal Article
TL;DR: In this paper, a unified method that can be regarded as either an inverse regression approach or a forward regression method is proposed to recover the central dimension reduction subspace for regressions with multivariate responses on high-dimensional predictors.
Abstract: This paper is concerned with dimension reduction in regressions with multivariate responses on high-dimensional predictors. A unified method that can be regarded as either an inverse regression approach or a forward regression method is proposed to recover the central dimension reduction subspace. By using Stein's Lemma, the forward regression estimates the first derivative of the conditional characteristic function of the response given the predictors; by using the Fourier method, the inverse regression estimates the subspace spanned by the conditional mean of the predictors given the responses. Both methods lead to an identical kernel matrix, while preserving as much regression information as possible. Illustrative examples of a data set and comprehensive simulations are used to demonstrate the application of our methods.

Journal Article
TL;DR: A threshold model extending the generalized Pareto distribution for exceedances over a threshold is proposed and is shown to be super-consistent under the maximum product of spacings estimation method.
Abstract: We propose a threshold model extending the generalized Pareto distribu- tion for exceedances over a threshold. The threshold is solely determined within the model and is shown to be super-consistent under the maximum product of spacings estimation method. We apply the model to some insurance data and demonstrate the merit of having a full parametric model for the entire data set.
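To make the estimation idea concrete, here is a sketch of maximum product of spacings (MPS) estimation for a generalized Pareto fit to exceedances over a candidate threshold. Treating the threshold as a grid-searched tuning value, the toy data, and the Nelder-Mead starting values are all illustrative assumptions; the paper's full threshold model, in which the threshold is determined within the model, differs.

```python
import numpy as np
from scipy.optimize import minimize

def gpd_cdf(y, sigma, xi):
    """Generalized Pareto CDF for exceedances y >= 0."""
    y = np.maximum(y, 0.0)
    if abs(xi) < 1e-8:
        return 1.0 - np.exp(-y / sigma)
    return 1.0 - np.maximum(1.0 + xi * y / sigma, 0.0) ** (-1.0 / xi)

def neg_log_spacings(params, exceedances):
    """Negative MPS criterion: minus the sum of log spacings of the fitted CDF."""
    sigma, xi = params
    if sigma <= 0:
        return np.inf
    u = np.sort(gpd_cdf(np.sort(exceedances), sigma, xi))
    spacings = np.diff(np.concatenate(([0.0], u, [1.0])))
    if np.any(spacings <= 0):
        return np.inf
    return -np.sum(np.log(spacings))

def fit_mps(data, threshold):
    exc = data[data > threshold] - threshold
    res = minimize(neg_log_spacings, x0=[np.std(exc), 0.1], args=(exc,),
                   method="Nelder-Mead")
    return res.x, res.fun

# illustrative only: compare fits over a few candidate thresholds
rng = np.random.default_rng(3)
data = rng.pareto(2.0, size=2000)
fits = {float(u): fit_mps(data, u) for u in np.quantile(data, [0.80, 0.90, 0.95])}
```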

Journal ArticleDOI
TL;DR: In this paper, the authors show how poor the performance of score tests can be in comparison to the performances of Wald and likelihood ratio (LR) tests through a simulation study and prove consistency and asymptotic normality of the maximum likelihood estimates in ZGMP regression models.
Abstract: Count data often exhibit overdispersion and/or require an adjustment for zero outcomes with respect to a Poisson model. Zero-modified Poisson (ZMP) and zero-modified generalized Poisson (ZMGP) regression models are useful classes of models for such data. In the literature so far only score tests are used for testing the necessity of this adjustment. We address this problem by using Wald and likelihood ratio tests. We show how poor the performance of the score tests can be in comparison to the performance of Wald and likelihood ratio (LR) tests through a simulation study. In particular, the score test in the ZMP case results in a power loss of 47% compared to the Wald test in the worst case, while in the ZMGP case the worst loss is 87%. Therefore, regardless of the computational advantage of score tests, the loss in power compared to the Wald and LR tests should not be neglected and these much more powerful alternatives should be used instead. We prove consistency and asymptotic normality of the maximum likelihood estimates in ZMGP regression models, on which Wald and likelihood ratio tests rely. The usefulness of ZMGP models is illustrated in a real data example.
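The testing problem can be made concrete with a small sketch: fit a zero-modified Poisson model by maximum likelihood and test the necessity of the zero adjustment with a likelihood ratio statistic (the Wald version would additionally use the estimated standard error of the zero-modification parameter). The intercept-only, zero-inflation-style parameterization below is a simplified stand-in for the regression models studied in the paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson, chi2

def zmp_negloglik(params, y):
    """P(Y=0) = p + (1-p)e^{-lam}; P(Y=k) = (1-p) Poisson(k; lam) for k >= 1."""
    p, lam = params
    if not (0 <= p < 1) or lam <= 0:
        return np.inf
    ll_zero = np.log(p + (1 - p) * np.exp(-lam))
    ll_pos = np.log(1 - p) + poisson.logpmf(y, lam)
    return -np.sum(np.where(y == 0, ll_zero, ll_pos))

# toy data with excess zeros relative to a plain Poisson model
y = np.concatenate([np.zeros(60, dtype=int),
                    np.random.default_rng(4).poisson(2.5, 140)])

fit = minimize(zmp_negloglik, x0=[0.2, np.mean(y)], args=(y,), method="Nelder-Mead")
p_hat, lam_hat = fit.x

# LR test of H0: p = 0 (plain Poisson) against the zero-modified model
ll_poisson = np.sum(poisson.logpmf(y, np.mean(y)))
lr_stat = 2 * (-fit.fun - ll_poisson)
p_value_lr = chi2.sf(lr_stat, df=1)   # boundary issues for p = 0 ignored in this sketch
```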

Journal Article
TL;DR: In this paper, a multivariate control chart for phase II process monitoring is proposed as a supplementary tool to the usual monitoring schemes designed for detecting general changes in the covariance matrix.
Abstract: For signalling alarms sooner when the dispersion of a multivariate process is “increased”, a multivariate control chart for Phase II process monitoring is proposed as a supplementary tool to the usual monitoring schemes designed for detecting general changes in the covariance matrix. The proposed chart is constructed based on the one-sided likelihood ratio test (LRT) for testing the hypothesis that the covariance matrix of the quality characteristic vector of the current process, Σ, is “larger” than that of the in-control process, Σ0, in the sense that Σ − Σ0 is positive semidefinite and Σ ≠ Σ0. Assuming Σ0 is known, the LRT statistic is derived and then used to construct the control chart. A simulation study shows that the proposed control chart indeed outperforms three existing two-sided-test-based control charts under comparison in terms of the average run length. The applicability and effectiveness of the proposed control chart are demonstrated through a semiconductor example and two simulations.

Journal Article
TL;DR: In this paper, the problem of fitting a parametric model to the nonparametric component in partially linear regression models when covariates in parametric and non-parametric parts are subject to Berkson measurement errors is discussed.
Abstract: This paper discusses the problem of fitting a parametric model to the nonparametric component in partially linear regression models when covariates in parametric and nonparametric parts are subject to Berkson measurement errors. The proposed test is based on the supremum of a martingale transform of a certain partial sum process of calibrated residuals. The asymptotic null distribution of this transformed process is shown to be the same as that of a time transformed standard Brownian motion. Consistency of this sequence of tests against some fixed alternatives and asymptotic power under some local nonparametric alternatives are also discussed. A simulation study is conducted to assess the finite sample performance of the proposed test. A Monte Carlo power comparison with some existing tests shows some superiority of the proposed test at the chosen alternatives.

Journal Article
TL;DR: In this paper, it is shown that Ferre and Yao's approach does not, in general, give an estimator of the SIR subspace, and necessary and sufficient conditions for it to do so are given.
Abstract: Ferre and Yao (2005, 2007) proposed a method to estimate the Effective Dimension Reduction space in functional sliced inverse regression. Their approach did not require the inversion of the variance-covariance operator of the explanatory variables, and it allowed them to get √n-consistent estimators in the functional case. In those papers there is a mistake. In this note we show that, in general, the approach does not give an estimator of the SIR subspace. We also give necessary and sufficient conditions for this to be true.

Journal Article
TL;DR: This work proposes a working independent profile likelihood method for the semiparametric time-varying coefficient model with correlation, evaluates the performance of proposed nonparametric kernel estimator and the profile estimator, and applies the method to the western Kenya parasitemia data.
Abstract: We propose a working independent profile likelihood method for the semiparametric time-varying coefficient model with correlation. Kernel likelihood is used to estimate the time-varying coefficients. Profile likelihood for the parametric coefficient is formed by plugging in the nonparametric estimator. For independent data, the estimator is asymptotically normal and achieves the asymptotic semiparametric efficiency bound. We evaluate the performance of the proposed nonparametric kernel estimator and the profile estimator, and apply the method to the western Kenya parasitemia data.

Journal Article
TL;DR: In this paper, the authors proposed a new block bootstrap procedure for time series, called the extended tapered block bootstrap, to estimate the variance and approximate the sampling distribution of a large class of approximately linear statistics.
Abstract: We propose a new block bootstrap procedure for time series, called the extended tapered block bootstrap, to estimate the variance and approximate the sampling distribution of a large class of approximately linear statistics. Our proposal differs from the existing tapered block bootstrap (Paparoditis and Politis (2001, 2002)) in that the tapering is applied to the random weights in the bootstrapped empirical distribution. Under the smooth function model, we obtain asymptotic bias and variance expansions for the variance estimator and establish the consistency of the distribution approximation. The extended tapered block bootstrap has wider applicability than the tapered block bootstrap, while preserving the favorable bias and mean squared error properties of the tapered block bootstrap over the moving block bootstrap. A small simulation study is performed to compare the finite-sample performance of the block-based bootstrap methods.
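For orientation, here is a schematic of the block-bootstrap idea with tapering, applied to the variance of a sample mean: blocks are drawn with random starts, and centered observations within each block are multiplied by a taper window before being recombined. The sine taper and the sqrt(l)/||w|| rescaling are common conventions assumed here for illustration; the extended version proposed in the paper instead applies the taper to the random resampling weights.

```python
import numpy as np

def tapered_block_bootstrap_var(x, block_len, n_boot=2000, seed=0):
    """Bootstrap variance of the sample mean using tapered, centered blocks."""
    rng = np.random.default_rng(seed)
    n = len(x)
    k = n // block_len
    # sine taper over the block, rescaled to be (approximately) variance-neutral
    w = np.sin(np.pi * (np.arange(1, block_len + 1) - 0.5) / block_len)
    w = w * np.sqrt(block_len) / np.linalg.norm(w)
    xc = x - x.mean()
    means = np.empty(n_boot)
    for b in range(n_boot):
        starts = rng.integers(0, n - block_len + 1, size=k)
        blocks = np.stack([w * xc[s:s + block_len] for s in starts])
        means[b] = x.mean() + blocks.mean()
    return means.var()

# toy usage on an AR(1) series with positive dependence
rng = np.random.default_rng(5)
e = rng.normal(size=500)
x = np.zeros(500)
for t in range(1, 500):
    x[t] = 0.6 * x[t - 1] + e[t]
var_hat = tapered_block_bootstrap_var(x, block_len=25)
```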

Journal Article
TL;DR: In this paper, a multivariate quantile function model and a Bayesian method to estimate the model parameters are presented, without direct use of the joint probability distribution or density functions of the random variables of interest.
Abstract: Multivariate quantiles have been defined by a number of researchers and can be estimated by different methods. However, little work can be found in the literature about Bayesian estimation of joint quantiles of multivariate random variables. In this paper we present a multivariate quantile function model and propose a Bayesian method to estimate the model parameters. The methodology developed here enables us to estimate the multivariate quantile surfaces and the joint probability without direct use of the joint probability distribution or density functions of the random variables of interest. Furthermore, simulation studies and applications of the methodology to bivariate economics data sets show that the method works well both theoretically and practically.

Journal Article
TL;DR: In this paper, the authors established asymptotic equivalences between the D-test and three likelihood ratio-type tests for homogeneity in finite mixtures, and provided a simple limiting null distribution under contiguous local alternatives.
Abstract: The D-test for homogeneity in finite mixtures is appealing because the D-test statistic depends on the data solely through parameter estimates, whereas likelihood ratio-type test statistics require both parameter estimates and the full data set. In this paper we establish asymptotic equivalences between the D-test and three likelihood ratio-type tests for homogeneity. The first two equivalences, under maximum likelihood and Bayesian estimation frameworks respectively, apply to mixtures from a one-dimensional exponential family; the second equivalence yields a simple limiting null distribution for the D-test statistic as well as a simple limiting distribution under contiguous local alternatives, revealing that the D-test is asymptotically locally most powerful. The third equivalence, under an empirical Bayesian estimation framework, pertains to mixtures from a normal location family with unknown structural parameter; the third equivalence also yields a simple limiting null distribution for the D-test statistic. Simulation studies are provided to investigate finite-sample accuracy of critical values based on the limiting null distributions and to compare the D-test to its competitors regarding power to detect heterogeneity. We conclude with an application to medical data and a discussion emphasizing computational advantages of the D-test.

Journal Article
Antai Wang
TL;DR: Two tests for parametric models belonging to the Archimedean copula family are proposed, one for uncensored bivariate data and the other for right-censored bivariate data, and both perform well when the sample size is large.
Abstract: In this paper, we propose two tests for parametric models belonging to the Archimedean copula family, one for uncensored bivariate data and the other one for right-censored bivariate data. Our test procedures are based on the Fisher transform of the correlation coefficient of a bivariate (U, V), which is a one-to-one transform of the original random pair (T1, T2) that can be modeled by an Archimedean copula model. A multiple imputation technique is applied to establish our test for censored data and its p value is computed by combining test statistics obtained from multiply imputed data sets. Simulation studies suggest that both procedures perform well when the sample size is large. The test for censored data is carried out for a medical data example.

Journal Article
TL;DR: In this article, the authors proposed a penalized joint likelihood method for nonparametric estimation of hazard function, which combines the functional ANOVA decomposition and the Kullback-Leibler geometry, and derive a model selection tool to assess the covariate effects.
Abstract: Frailty has been introduced as a group-wise random effect to describe the within-group dependence for correlated survival data. In this article, we propose a penalized joint likelihood method for nonparametric estimation of the hazard function. With the proposed method, the frailty variance component and the smoothing parameters become the tuning parameters that are selected to minimize a loss function derived from the Kullback-Leibler distance through delete-one cross-validation. Confidence intervals for the hazard function are constructed using the Bayes model of the penalized likelihood. Combining the functional ANOVA decomposition and the Kullback-Leibler geometry, we also derive a model selection tool to assess the covariate effects. We establish that our estimate is consistent and its nonparametric part achieves the optimal convergence rate. We investigate finite sample performance of the proposed method with simulations and data analysis.

Journal Article
TL;DR: In this article, a new class of optimum design criteria for the linear regression model with r responses based on the volume of the predictive ellipsoid is proposed, referred to as I r L-optimality.
Abstract: This paper proposes a new class of optimum design criteria for the linear regression model with r responses based on the volume of the predictive ellipsoid. This is referred to as I^r_L-optimality. The I^r_L-optimality criterion is invariant with respect to different parameterizations of the model, and reduces to I_L-optimality as proposed by Dette and O'Brien (1999) in single response situations. An equivalence theorem for I^r_L-optimality is provided and used to verify I^r_L-optimality of designs,