
Showing papers in "Biometrika in 2007"


Journal ArticleDOI
TL;DR: The implementation of the penalized likelihood methods for estimating the concentration matrix in the Gaussian graphical model is nontrivial, but it is shown that the computation can be done effectively by taking advantage of the efficient maxdet algorithm developed in convex optimization.
Abstract: SUMMARY We propose penalized likelihood methods for estimating the concentration matrix in the Gaussian graphical model. The methods lead to a sparse and shrinkage estimator of the concentration matrix that is positive definite, and thus conduct model selection and estimation simultaneously. The implementation of the methods is nontrivial because of the positive definite constraint on the concentration matrix, but we show that the computation can be done effectively by taking advantage of the efficient maxdet algorithm developed in convex optimization. We propose a BIC-type criterion for the selection of the tuning parameter in the penalized likelihood methods. The connection between our methods and existing methods is illustrated. Simulations and real examples demonstrate the competitive performance of the new methods.

1,824 citations
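A minimal sketch of the kind of L1-penalised concentration-matrix estimation described in the entry above, assuming a penalty on the entries of the concentration matrix; scikit-learn's GraphicalLasso is used here as a stand-in solver rather than the maxdet algorithm the authors employ.

```python
# Sketch of an L1-penalised concentration-matrix estimate, assuming an objective
# of the form  -log det(Omega) + tr(S Omega) + lambda * sum |omega_ij|.
# The paper solves its penalised problem with the maxdet algorithm; here
# scikit-learn's graphical lasso is used as a stand-in solver for illustration.
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))     # n = 200 observations, p = 10 variables

model = GraphicalLasso(alpha=0.1)      # alpha plays the role of the tuning parameter
model.fit(X)

Omega_hat = model.precision_           # sparse, positive-definite concentration matrix
print(np.round(Omega_hat, 2))
```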


Journal ArticleDOI
TL;DR: This work shows that the commonly used generalized crossvalidation cannot select the tuning parameter satisfactorily, with a nonignorable overfitting effect in the resulting model, and proposes a BIC tuning parameter selector, which is shown to be able to identify the true model consistently.
Abstract: SUMMARY The penalized least squares approach with smoothly clipped absolute deviation penalty has been consistently demonstrated to be an attractive regression shrinkage and selection method. It not only automatically and consistently selects the important variables, but also produces estimators which are as efficient as the oracle estimator. However, these attractive features depend on appropriate choice of the tuning parameter. We show that the commonly used generalized crossvalidation cannot select the tuning parameter satisfactorily, with a nonignorable overfitting effect in the resulting model. In addition, we propose a BIC tuning parameter selector, which is shown to be able to identify the true model consistently. Simulation studies are presented to support theoretical findings, and an empirical example is given to illustrate its use in the Female Labor Supply data.

730 citations
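A hedged sketch of the BIC-type tuning-parameter selection idea from the entry above, with an ordinary lasso standing in for the smoothly clipped absolute deviation penalty; taking the degrees of freedom as the number of nonzero coefficients is an illustrative assumption, not the paper's exact construction.

```python
# Minimal sketch of a BIC-type tuning-parameter selector, using a lasso penalty
# as a stand-in for the SCAD penalty studied in the paper.
# BIC(lambda) = n * log(RSS/n) + df(lambda) * log(n), with df taken here as the
# number of nonzero estimated coefficients.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, p = 100, 8
X = rng.standard_normal((n, p))
beta_true = np.array([3.0, 1.5, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0])
y = X @ beta_true + rng.standard_normal(n)

def bic_for(lam):
    fit = Lasso(alpha=lam).fit(X, y)
    rss = np.sum((y - fit.predict(X)) ** 2)
    df = np.count_nonzero(fit.coef_)
    return n * np.log(rss / n) + df * np.log(n), fit

grid = np.logspace(-3, 0, 30)
scores = [bic_for(lam)[0] for lam in grid]
best_lam = grid[int(np.argmin(scores))]
print("selected tuning parameter:", best_lam)
print("selected coefficients:", np.round(bic_for(best_lam)[1].coef_, 2))
```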


Journal ArticleDOI
TL;DR: In this paper, the adaptive Lasso estimator is proposed for Cox's proportional hazards model, which is based on a penalized log partial likelihood with the adaptively weighted L 1 penalty on regression coefficients.
Abstract: SUMMARY We investigate the variable selection problem for Cox's proportional hazards model, and propose a unified model selection and estimation procedure with desired theoretical properties and computational convenience. The new method is based on a penalized log partial likelihood with the adaptively weighted L1 penalty on regression coefficients, providing what we call the adaptive Lasso estimator. The method incorporates different penalties for different coefficients: unimportant variables receive larger penalties than important ones, so that important variables tend to be retained in the selection process, whereas unimportant variables are more likely to be dropped. Theoretical properties, such as consistency and rate of convergence of the estimator, are studied. We also show that, with proper choice of regularization parameters, the proposed estimator has the oracle properties. The convex optimization nature of the method leads to an efficient algorithm. Both simulated and real examples show that the method performs competitively.

587 citations
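The entry above applies an adaptively weighted L1 penalty to the Cox partial likelihood; the sketch below illustrates only the weighting device, on a linear model, via the standard trick of rescaling columns by the reciprocal weights, fitting an ordinary lasso and rescaling back. The initial estimator, weights and tuning value are illustrative assumptions, not the paper's choices.

```python
# Sketch of the adaptive-lasso weighting idea in a linear model; the paper
# applies the same adaptively weighted L1 penalty to Cox's partial likelihood.
# Weights w_j = 1/|beta_init_j| give unimportant variables larger penalties.
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(2)
n, p = 200, 6
X = rng.standard_normal((n, p))
beta_true = np.array([2.0, 0.0, 0.0, 1.0, 0.0, 0.0])
y = X @ beta_true + rng.standard_normal(n)

beta_init = LinearRegression().fit(X, y).coef_   # initial consistent estimate
w = 1.0 / np.maximum(np.abs(beta_init), 1e-8)    # adaptive weights

X_scaled = X / w                                 # column j divided by w_j
fit = Lasso(alpha=0.05).fit(X_scaled, y)         # ordinary lasso on the rescaled design
beta_adaptive = fit.coef_ / w                    # map back to the original scale
print(np.round(beta_adaptive, 2))
```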


Journal ArticleDOI
Tomohiro Ando1
TL;DR: In this article, a Bayesian predictive information criterion is proposed as an estimator of the posterior mean of the expected loglikelihood of the predictive distribution when the specified family of probability distributions does not contain the true distribution.
Abstract: SUMMARY The problem of evaluating the goodness of the predictive distributions of hierarchical Bayesian and empirical Bayes models is investigated. A Bayesian predictive information criterion is proposed as an estimator of the posterior mean of the expected loglikelihood of the predictive distribution when the specified family of probability distributions does not contain the true distribution. The proposed criterion is developed by correcting the asymptotic bias of the posterior mean of the loglikelihood as an estimator of its expected loglikelihood. In the evaluation of hierarchical Bayesian models with random effects, regardless of our parametric focus, the proposed criterion considers the bias correction of the posterior mean of the marginal loglikelihood because it requires a consistent parameter estimator. The use of the bootstrap in model evaluation is also discussed.

218 citations


Journal ArticleDOI
TL;DR: In this article, the authors derived a general result for exponential mixtures and explore its implications for the specification and empirical analysis of univariate and multivariate duration models, and showed that the distribution of the heterogeneity among survivors converges to a gamma distribution.
Abstract: SUMMARY In a large class of hazard models with proportional unobserved heterogeneity, the distribution of the heterogeneity among survivors converges to a gamma distribution. This convergence is often rapid. We derive this result as a general result for exponential mixtures and explore its implications for the specification and empirical analysis of univariate and multivariate duration models.

192 citations


Journal ArticleDOI
TL;DR: In this article, a Bayesian nonparametric approach is used to evaluate the probability of discovering a certain number of new species in a new sample of population units, conditional on the number of species recorded in a basic sample.
Abstract: SUMMARY We consider the problem of evaluating the probability of discovering a certain number of new species in a new sample of population units, conditional on the number of species recorded in a basic sample. We use a Bayesian nonparametric approach. The different species proportions are assumed to be random and the observations from the population exchangeable. We provide a Bayesian estimator, under quadratic loss, for the probability of discovering new species which can be compared with well-known frequentist estimators. The results we obtain are illustrated through a numerical example and an application to a genomic dataset concerning the discovery of new genes by sequencing additional single-read sequences of cDNA fragments.

191 citations
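As a rough illustration of the discovery-probability idea in the entry above, the sketch below works out the simplest Bayesian nonparametric special case, a Dirichlet process with total mass α, under which the expected number of new species in m additional units has a closed form; the paper's estimator is more general, so this is only an assumed illustrative case.

```python
# Simplest special case for illustration: under a Dirichlet process with total
# mass alpha, after n observed units the next unit is a new species with
# probability alpha / (alpha + n), so the expected number of new species among
# m additional units is the sum of these probabilities.
import numpy as np

def expected_new_species(n, m, alpha):
    """Expected number of new species among m additional units, given n units
    already observed, under a Dirichlet process prior with total mass alpha."""
    return float(np.sum(alpha / (alpha + n + np.arange(m))))

print(expected_new_species(n=100, m=50, alpha=10.0))
```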


Journal ArticleDOI
TL;DR: In this article, a generalized spatial Dirichlet process is proposed for point-referenced data, which allows different surface selection at different sites, and the marginal distribution of the effect at each site still comes from a Gaussian process.
Abstract: SUMMARY Many models for the study of point-referenced data explicitly introduce spatial random effects to capture residual spatial association. These spatial effects are customarily modelled as a zero-mean stationary Gaussian process. The spatial Dirichlet process introduced by Gelfand et al. (2005) produces a random spatial process which is neither Gaussian nor stationary. Rather, it varies about a process that is assumed to be stationary and Gaussian. The spatial Dirichlet process arises as a probability-weighted collection of random surfaces. This can be limiting for modelling and inferential purposes since it insists that a process realization must be one of these surfaces. We introduce a random distribution for the spatial effects that allows different surface selection at different sites. Moreover, we can specify the model so that the marginal distribution of the effect at each site still comes from a Dirichlet process. The development is offered constructively, providing a multivariate extension of the stick-breaking representation of the weights. We then introduce mixing using this generalized spatial Dirichlet process. We illustrate with a simulated dataset of independent replications and note that we can embed the generalized process within a dynamic model specification to eliminate the independence assumption.

188 citations
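For background on the stick-breaking representation mentioned in the entry above, here is the ordinary (single-site) construction of Dirichlet process weights; the paper's contribution is a multivariate, spatially varying extension that is not reproduced here.

```python
# Background sketch: the ordinary stick-breaking construction of Dirichlet
# process weights, p_k = V_k * prod_{j<k} (1 - V_j) with V_k ~ Beta(1, alpha),
# truncated at k_max terms.
import numpy as np

def stick_breaking_weights(alpha, k_max, seed=4):
    rng = np.random.default_rng(seed)
    v = rng.beta(1.0, alpha, size=k_max)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    return v * remaining

w = stick_breaking_weights(alpha=2.0, k_max=20)
print(np.round(w, 3), "sum =", round(w.sum(), 3))   # sum approaches 1 as k_max grows
```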


Journal ArticleDOI
TL;DR: In this article, the authors use the properties of independence estimating equations to adjust the "independence" loglikelihood function in the presence of clustering, which relies on the robust sandwich estimator of the parameter covariance matrix, which is easily calculated.
Abstract: We use the properties of independence estimating equations to adjust the 'independence' loglikelihood function in the presence of clustering. The proposed adjustment relies on the robust sandwich estimator of the parameter covariance matrix, which is easily calculated. The methodology competes favourably with established techniques based on independence estimating equations; we provide some insight as to why this is so. The adjustment is applied to examples relating to the modelling of wind speed in Europe and annual maximum temperatures in the U.K.

186 citations


Journal ArticleDOI
TL;DR: In this article, the covariance matrix of a multivariate random vector under the constraint that certain covariances are zero is estimated using an iterative conditional fitting (ICF) approach.
Abstract: SUMMARY We consider estimation of the covariance matrix of a multivariate random vector under the constraint that certain covariances are zero. We first present an algorithm, which we call iterative conditional fitting, for computing the maximum likelihood estimate of the constrained covariance matrix, under the assumption of multivariate normality. In contrast to previous approaches, this algorithm has guaranteed convergence properties. Dropping the assumption of multivariate normality, we show how to estimate the covariance matrix in an empirical likelihood approach. These approaches are then compared via simulation and on an example of gene expression.

162 citations


Journal ArticleDOI
TL;DR: In this paper, a geometric representation of high-dimension, low-sample-size datasets is established under milder conditions using asymptotic properties of sample covariance matrices.
Abstract: SUMMARY High-dimension, low-sample-size datasets have different geometrical properties from those of traditional low-dimensional data. In their asymptotic study regarding increasing dimensionality with a fixed sample size, Hall et al. (2005) showed that each data vector is approximately located on the vertices of a regular simplex in a high-dimensional space. A perhaps unappealing aspect of their result is the underlying assumption which requires the variables, viewed as a time series, to be almost independent. We establish an equivalent geometric representation under much milder conditions using asymptotic properties of sample covariance matrices. We discuss implications of the results, such as the use of principal component analysis in a high-dimensional space, extension to the case of nonindependent samples and also the binary classification problem.

158 citations
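A quick numerical illustration of the simplex geometry described in the entry above, in the easiest case of independent standard normal coordinates (the paper's point is that much weaker conditions give the same picture):

```python
# Numerical illustration of the high-dimension, low-sample-size geometry: with
# iid standard normal coordinates, every vector has length close to sqrt(d) and
# every pairwise distance is close to sqrt(2d), so the n points sit near the
# vertices of a regular simplex.
import numpy as np

rng = np.random.default_rng(5)
n, d = 5, 20000
X = rng.standard_normal((n, d))

norms = np.linalg.norm(X, axis=1)
dists = [np.linalg.norm(X[i] - X[j]) for i in range(n) for j in range(i + 1, n)]

print("norms / sqrt(d):      ", np.round(norms / np.sqrt(d), 3))
print("distances / sqrt(2d): ", np.round(np.array(dists) / np.sqrt(2 * d), 3))
```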


Journal ArticleDOI
TL;DR: A unified estimation strategy is proposed, which combines a regression-type formulation of sufficient dimension reduction methods and shrinkage estimation, to produce sparse and accurate solutions.
Abstract: Existing sufficient dimension reduction methods suffer from the fact that each dimension reduction component is a linear combination of all the original predictors, so that it is difficult to interpret the resulting estimates. We propose a unified estimation strategy, which combines a regression-type formulation of sufficient dimension reduction methods and shrinkage estimation, to produce sparse and accurate solutions. The method can be applied to most existing sufficient dimension reduction methods such as sliced inverse regression, sliced average variance estimation and principal Hessian directions. We demonstrate the effectiveness of the proposed method by both simulations and real data analysis.
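The entry above builds on sufficient dimension reduction methods such as sliced inverse regression; as a reference point, here is a minimal sliced-inverse-regression sketch. The paper's regression-type reformulation and shrinkage step, which produce sparse directions, are not reproduced, and the simulated model is an assumed toy example.

```python
# Minimal sketch of sliced inverse regression (SIR), one of the sufficient
# dimension reduction methods the paper builds on.
import numpy as np

def sir_directions(X, y, n_slices=10, n_directions=1):
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    Sigma = Xc.T @ Xc / n
    # whiten the predictors
    evals, evecs = np.linalg.eigh(Sigma)
    Sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ Sigma_inv_sqrt
    # slice on the ordered response and average the whitened predictors per slice
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m_h = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m_h, m_h)
    # leading eigenvectors, mapped back to the original predictor scale
    vals, vecs = np.linalg.eigh(M)
    eta = vecs[:, ::-1][:, :n_directions]
    beta = Sigma_inv_sqrt @ eta
    return beta / np.linalg.norm(beta, axis=0)

rng = np.random.default_rng(6)
X = rng.standard_normal((500, 5))
index = X @ np.array([1.0, 2.0, 0.0, 0.0, 0.0])
y = index ** 3 + 0.5 * rng.standard_normal(500)
print(np.round(sir_directions(X, y), 2))   # roughly +/- (1, 2, 0, 0, 0) / sqrt(5)
```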

Journal ArticleDOI
TL;DR: In this paper, a new class of statistical models is presented, designed for life history analysis of plants and animals, that allows joint analysis of data on survival and reproduction over multiple years, allows for variables having different probability distributions, and correctly accounts for the dependence of variables on earlier variables.
Abstract: SUMMARY We present a new class of statistical models, designed for life history analysis of plants and animals, that allow joint analysis of data on survival and reproduction over multiple years, allow for variables having different probability distributions, and correctly account for the dependence of variables on earlier variables. We illustrate their utility with an analysis of data taken from an experimental study of Echinacea angustifolia sampled from remnant prairie populations in western Minnesota. These models generalize both generalized linear models and survival analysis. The joint distribution is factorized as a product of conditional distributions, each an exponential family with the conditioning variable being the sample size of the conditional distribution. The model may be heterogeneous, each conditional distribution being from a different exponential family. We show that the joint distribution is from a flat exponential family and derive its canonical parameters, Fisher information and other properties. These models are implemented in an R package ‘aster’ available from the Comprehensive R Archive Network, CRAN.

Journal ArticleDOI
TL;DR: In this article, the authors introduce and exemplify an efficient method for direct sampling from hyper-inverse Wishart distributions, which relies very naturally on the use of standard junction-tree representation of graphs, and couples these with matrix results for inverse Wishart distributions.
Abstract: SUMMARY We introduce and exemplify an efficient method for direct sampling from hyper-inverse Wishart distributions. The method relies very naturally on the use of standard junction-tree representation of graphs, and couples these with matrix results for inverse Wishart distributions. We describe the theory and resulting computational algorithms for both decomposable and nondecomposable graphical models. An example drawn from financial time series demonstrates application in a context where inferences on a structured covariance model are required. We discuss and investigate questions of scalability of the simulation methods to higher-dimensional distributions. The paper concludes with general comments about the approach, including its use in connection with existing Markov chain Monte Carlo methods that deal with uncertainty about the graphical model structure.

Journal ArticleDOI
TL;DR: In this article, a groupwise empirical likelihood procedure was proposed to handle the inter-series dependence for the longitudinal semiparametric regression model, and employed bias correction to construct the empirical likelihood ratio functions for the parameters of interest.
Abstract: A semiparametric regression model for longitudinal data is considered. The empirical likelihood method is used to estimate the regression coefficients and the baseline function, and to construct confidence regions and intervals. It is proved that the maximum empirical likelihood estimator of the regression coefficients achieves asymptotic efficiency and the estimator of the baseline function attains asymptotic normality when a bias correction is made. Two calibrated empirical likelihood approaches to inference for the baseline function are developed. We propose a groupwise empirical likelihood procedure to handle the inter-series dependence for the longitudinal semiparametric regression model, and employ bias correction to construct the empirical likelihood ratio functions for the parameters of interest. This leads us to prove a nonparametric version of Wilks' theorem. Compared with methods based on normal approximations, the empirical likelihood does not require consistent estimators for the asymptotic variance and bias. A simulation compares the empirical likelihood and normal-based methods in terms of coverage accuracies and average areas/lengths of confidence regions/intervals.

Journal ArticleDOI
TL;DR: A class of semiparametric estimators is proposed for the parameter of interest β, as well as for the population mean E(Y), and the resulting estimators are shown to be consistent and asymptotically normal under general assumptions.
Abstract: We consider partially linear models of the form Y = Xᵀβ + ν(Z) + ε when the response variable Y is sometimes missing with missingness probability π depending on (X, Z), and the covariate X is measured with error, where ν(z) is an unspecified smooth function. The missingness structure is therefore missing not at random, rather than the usual missing at random. We propose a class of semiparametric estimators for the parameter of interest β, as well as for the population mean E(Y). The resulting estimators are shown to be consistent and asymptotically normal under general assumptions. To construct a confidence region for β, we also propose an empirical-likelihood-based statistic, which is shown to have a chi-squared distribution asymptotically. The proposed methods are applied to an AIDS clinical trial dataset. A simulation study is also reported.

Journal ArticleDOI
TL;DR: In this article, Chen and Dunson's modified Cholesky decomposition of the form Σ = DLL'D for a covariance matrix Σ is studied, where D is a diagonal matrix with entries proportional to the square roots of the diagonal entries of Σ and L is a unit lower-triangular matrix solely determining its correlation matrix.
Abstract: SUMMARY Chen & Dunson (2003) have proposed a modified Cholesky decomposition of the form Σ = DLL'D for a covariance matrix Σ, where D is a diagonal matrix with entries proportional to the square roots of the diagonal entries of Σ and L is a unit lower-triangular matrix solely determining its correlation matrix. This total separation of variance and correlation is definitely a major advantage over the more traditional modified Cholesky decomposition of the form LD²L' (Pourahmadi, 1999). We show that, though the variance and correlation parameters of the former decomposition are separate, they are not asymptotically orthogonal and that the estimation of the new parameters could be more demanding computationally. We also provide statistical interpretation for the entries of L and D as certain moving average parameters and innovation variances and indicate how the existing likelihood procedures can be employed to estimate the new parameters.
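A small numerical check of the variance-correlation separation described above, under the stated notation: with C the lower Cholesky factor of Σ, setting D = diag(diag(C)) and L = D^{-1}C gives a unit lower-triangular L with Σ = DLL'D, and LL' carries exactly the correlation structure of Σ.

```python
# Numerical check of the separation Sigma = D L L' D described above: with C
# the lower Cholesky factor of Sigma, put D = diag(diag(C)) and L = D^{-1} C,
# so L is unit lower-triangular.  Then Sigma = D L L' D and L L' has the same
# correlation matrix as Sigma.
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((4, 4))
Sigma = A @ A.T + 4 * np.eye(4)        # an arbitrary positive-definite matrix

C = np.linalg.cholesky(Sigma)          # Sigma = C C'
D = np.diag(np.diag(C))
L = np.linalg.inv(D) @ C               # unit lower-triangular

corr = lambda S: S / np.sqrt(np.outer(np.diag(S), np.diag(S)))
print(np.allclose(Sigma, D @ L @ L.T @ D))        # True
print(np.allclose(corr(Sigma), corr(L @ L.T)))    # True: L alone fixes the correlations
```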

Journal ArticleDOI
TL;DR: In this article, an extension of population-based Markov chain Monte Carlo to the transdimensional case is presented, which is applied to gene expression data of 1000 data points in six dimensions.
Abstract: We present an extension of population-based Markov chain Monte Carlo to the transdimensional case. A major challenge is that of simulating from high- and transdimensional target measures. In such cases, Markov chain Monte Carlo methods may not adequately traverse the support of the target; the simulation results will be unreliable. We develop population methods to deal with such problems, and give a result proving the uniform ergodicity of these population algorithms, under mild assumptions. This result is used to demonstrate the superiority, in terms of convergence rate, of a population transition kernel over a reversible jump sampler for a Bayesian variable selection problem. We also give an example of a population algorithm for a Bayesian multivariate mixture model with an unknown number of components. This is applied to gene expression data of 1000 data points in six dimensions and it is demonstrated that our algorithm outperforms some competing Markov chain samplers. In this example, we show how to combine the methods of parallel chains (Geyer, 1991), tempering (Geyer & Thompson, 1995), snooker algorithms (Gilks et al., 1994), constrained sampling and delayed rejection (Green & Mira, 2001).

Journal ArticleDOI
TL;DR: In this article, a nonparametric likelihood-based estimator of the mean function of counting processes with panel count data using monotone polynomial splines was proposed.
Abstract: We study nonparametric likelihood-based estimators of the mean function of counting processes with panel count data using monotone polynomial splines. The generalized Rosen algorithm, proposed by Zhang & Jamshidian (2004), is used to compute the estimators. We show that the proposed spline likelihood-based estimators are consistent and that their rate of convergence can be faster than n^{1/3}. Simulation studies with moderate samples show that the estimators have smaller variances and mean squared errors than their alternatives proposed by Wellner & Zhang (2000). A real example from a bladder tumour clinical trial is used to illustrate this method.

Journal ArticleDOI
TL;DR: In this paper, a new consistent variable selection method, called separated cross-validation, is proposed, which leads to single-index models with selected variables that have better prediction capability than models based on all the covariates.
Abstract: SUMMARY We consider variable selection in the single-index model. We prove that the popular leave-m-out crossvalidation method has different behaviour in the single-index model from that in linear regression models or nonparametric regression models. A new consistent variable selection method, called separated crossvalidation, is proposed. Further analysis suggests that the method has better finite-sample performance and is computationally easier than leave-m-out crossvalidation. Separated crossvalidation, applied to the Swiss banknotes data and the ozone concentration data, leads to single-index models with selected variables that have better prediction capability than models based on all the covariates.

Journal ArticleDOI
TL;DR: A new class of models is proposed for making inference about the mean of a vector of repeated outcomes when the outcome vector is incompletely observed in some study units and missingness is nonmonotone; these models are ideal for conducting sensitivity analyses aimed at evaluating the impact that different degrees of departure from sequential explainability have on inference about the marginal means of interest.
Abstract: We propose a new class of models for making inference about the mean of a vector of repeated outcomes when the outcome vector is incompletely observed in some study units and missingness is nonmonotone. Each model in our class is indexed by a set of unidentified selection-bias functions which quantify the residual association of the outcome at each occasion t and the probability that this outcome is missing after adjusting for variables observed prior to time t and for the past nonresponse pattern. In particular, selection-bias functions equal to zero encode the investigator's a priori belief that nonresponse of the next outcome does not depend on that outcome after adjusting for the observed past. We call this assumption sequential explainability. Since each model in our class is nonparametric, it fits the data perfectly well. As such, our models are ideal for conducting sensitivity analyses aimed at evaluating the impact that different degrees of departure from sequential explainability have on inference about the marginal means of interest. Although the marginal means are identified under each of our models, their estimation is not feasible in practice because it requires the auxiliary estimation of conditional expectations and probabilities given high-dimensional variables. We henceforth discuss the estimation of the marginal means under each model in our class assuming, additionally, that at each occasion either one of the following two models holds: a parametric model for the conditional probability of nonresponse given current outcomes and past recorded data or a parametric model for the conditional mean of the outcome on the nonrespondents given the past recorded data. We call the resulting procedure 2T-multiply robust as it protects at each of the T time points against misspecification of one of these two working models, although not against simultaneous misspecification of both. We extend our proposed class of models and estimators to incorporate data configurations which include baseline covariates and a parametric model for the conditional mean of the vector of repeated outcomes given the baseline covariates.

Journal ArticleDOI
TL;DR: In this article, a sufficient dimension reduction method is proposed that estimates the central subspace from d < n linear combinations of the original p predictors without inverting the sample predictor covariance matrix, making it applicable regardless of the (n, p) relationship.
Abstract: SUMMARY Regressions in which the fixed number of predictors p exceeds the number of independent observational units n occur in a variety of scientific fields. Sufficient dimension reduction provides a promising approach to such problems, by restricting attention to d < n linear combinations of the original p predictors. However, standard methods of sufficient dimension reduction require inversion of the sample predictor covariance matrix. We propose a method for estimating the central subspace that eliminates the need for such inversion and is applicable regardless of the (n, p) relationship. Simulations show that our method compares favourably with standard large sample techniques when the latter are applicable. We illustrate our method with a genomics application.

Journal ArticleDOI
TL;DR: In this paper, the authors present a nonparametric approach to identify the expert's probability distribution uniquely, and consider the issue of imprecision in the elicited probability judgements.
Abstract: SUMMARY A key task in the elicitation of expert knowledge is to construct a distribution from the finite, and usually small, number of statements that have been elicited from the expert. These statements typically specify some quantiles or moments of the distribution. Such statements are not enough to identify the expert's probability distribution uniquely, and the usual approach is to fit some member of a convenient parametric family. There are two clear deficiencies in this solution. First, the expert's beliefs are forced to fit the parametric family. Secondly, no account is then taken of the many other possible distributions that might have fitted the elicited statements equally well. We present a nonparametric approach which tackles both of these deficiencies. We also consider the issue of the imprecision in the elicited probability judgements.

Journal ArticleDOI
TL;DR: In this article, the authors consider the absolute prediction error, the expected value of the absolute difference between the future and predicted responses, as the model evaluation criterion and show that this prediction error is easier to interpret than the average squared error and is equivalent to the misclassification error for a binary outcome.
Abstract: The construction of a reliable, practically useful prediction rule for future responses is heavily dependent on the 'adequacy' of the fitted regression model. In this article, we consider the absolute prediction error, the expected value of the absolute difference between the future and predicted responses, as the model evaluation criterion. This prediction error is easier to interpret than the average squared error and is equivalent to the misclassification error for a binary outcome. We show that the prediction error can be consistently estimated via the resubstitution and crossvalidation methods even when the fitted model is not correctly specified. Furthermore, we show that the resulting estimators are asymptotically normal. When the prediction rule is 'nonsmooth', the variance of the above normal distribution can be estimated well with a perturbation-resampling method. With two real examples and an extensive simulation study, we demonstrate that the interval estimates obtained from the above normal approximation for the prediction errors provide much more information about model adequacy than their point-estimate counterparts.
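A hedged sketch of estimating the absolute prediction error by crossvalidation for a working linear model, as discussed in the entry above; the paper's interval estimates via normal approximation and perturbation resampling are not reproduced, and the simulated data are an assumed toy example.

```python
# Sketch: cross-validated point estimate of the absolute prediction error
# E|Y - Yhat| for a working linear regression model.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(8)
n, p = 300, 4
X = rng.standard_normal((n, p))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.standard_normal(n)

abs_errors = []
for train, test in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
    fit = LinearRegression().fit(X[train], y[train])
    abs_errors.append(np.abs(y[test] - fit.predict(X[test])))

print("cross-validated absolute prediction error:", np.concatenate(abs_errors).mean())
```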

Journal ArticleDOI
TL;DR: The new results allow the application of these methods to state space models where the observation density p(y|θ) is not log-concave; the methods are illustrated for the stochastic volatility model with leverage.
Abstract: We develop a proposal or importance density for state space models with a nonlinear non-Gaussian observation vector y ∼ p(y|θ) and an unobserved linear Gaussian signal vector θ ∼ p(θ). The proposal density is obtained from the Laplace approximation of the smoothing density p(θ|y). We present efficient algorithms to calculate the mode of p(θ|y) and to sample from the proposal density. The samples can be used for importance sampling and Markov chain Monte Carlo methods. The new results allow the application of these methods to state space models where the observation density p(y|θ) is not log-concave. Additional results are presented that lead to computationally efficient implementations. We illustrate the methods for the stochastic volatility model with leverage.
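The sketch below illustrates the general Laplace-approximation importance-sampling mechanism on an assumed scalar Poisson-normal example: locate the mode of log p(y|θ) + log p(θ), build a Gaussian proposal from the curvature there, and reweight draws. It is not the authors' state space algorithm, which exploits the linear Gaussian signal structure.

```python
# Generic sketch of a Laplace-approximation importance-sampling proposal: find
# the mode of log p(y|theta) + log p(theta), use a Gaussian proposal based on
# the curvature at the mode, and correct with importance weights.
# Toy model: y_i ~ Poisson(exp(theta)), theta ~ N(0, 1).
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(9)
y = rng.poisson(lam=3.0, size=50)                  # data, true log-rate = log 3

def neg_log_post(theta):                           # -[log p(y|theta) + log p(theta)] + const
    return -(np.sum(y * theta - np.exp(theta)) - 0.5 * theta**2)

mode = optimize.minimize_scalar(neg_log_post).x
curv = len(y) * np.exp(mode) + 1.0                 # negative second derivative at the mode
proposal = stats.norm(loc=mode, scale=curv ** -0.5)

draws = proposal.rvs(size=5000, random_state=1)
log_w = -np.array([neg_log_post(t) for t in draws]) - proposal.logpdf(draws)
w = np.exp(log_w - log_w.max())                    # self-normalised importance weights
post_mean_rate = np.sum(w * np.exp(draws)) / np.sum(w)
print("posterior mean of the Poisson rate:", round(post_mean_rate, 3))
```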

Journal ArticleDOI
TL;DR: In this article, the authors consider observations that come from a general normal linear model in which it is desirable to test a simplifying null hypothesis about the parameters, and approach the problem from an objective Bayesian, model-selection perspective.
Abstract: SUMMARY We consider that observations come from a general normal linear model and that it is desirable to test a simplifying null hypothesis about the parameters. We approach this problem from an objective Bayesian, model-selection perspective. Crucial ingredients for this approach are 'proper objective priors' to be used for deriving the Bayes factors. Jeffreys-Zellner-Siow priors have good properties for testing null hypotheses defined by specific values of the parameters in full-rank linear models. We extend these priors to deal with general hypotheses in general linear models, not necessarily of full rank. The resulting priors, which we call 'conventional priors', are expressed as a generalization of recently introduced 'partially informative distributions'. The corresponding Bayes factors are fully automatic, easily computed and very reasonable. The methodology is illustrated for the change-point problem and the equality of treatments effects problem. We compare the conventional priors derived for these problems with other objective Bayesian proposals like the intrinsic priors. It is concluded that both priors behave similarly although interesting subtle differences arise. We adapt the conventional priors to deal with nonnested model selection as well as multiple-model comparison. Finally, we briefly address a generalization of conventional priors to nonnormal scenarios.

Journal ArticleDOI
Jerome P. Reiter1
TL;DR: In this paper, the authors presented an alternative denominator degrees of freedom that is always less than or equal to the complete-data denominator degrees of freedom, and equals the currently employed denominator degrees of freedom for infinite sample sizes.
Abstract: SUMMARY When performing multi-component significance tests with multiply-imputed datasets, analysts can use a Wald-like test statistic and a reference F-distribution. The currently employed degrees of freedom in the denominator of this F-distribution are derived assuming an infinite sample size. For modest complete-data sample sizes, this degrees of freedom can be unrealistic; for example, it may exceed the complete-data degrees of freedom. This paper presents an alternative denominator degrees of freedom that is always less than or equal to the complete-data denominator degrees of freedom, and equals the currently employed denominator degrees of freedom for infinite sample sizes. Its advantages over the currently employed degrees of freedom are illustrated with a simulation.

Journal ArticleDOI
TL;DR: In this article, identifiability of both single-index models and partially linear single-index models is established assuming only continuity of the regression function, a condition much weaker than the differentiability conditions assumed in the existing literature, and the discussion is extended to additive-index models.
Abstract: SUMMARY We provide a proof for the identifiability for both single-index models and partially linear single index models assuming only the continuity of the regression function, a condition much weaker than the differentiability conditions assumed in the existing literature. Our discussion is then extended to the identifiability of the additive-index models.

Journal ArticleDOI
TL;DR: In this article, a method for fitting smooth curves through a series of shapes of landmarks in two dimensions using unrolling and unwrapping procedures in Riemannian manifolds is presented.
Abstract: A method is developed for fitting smooth curves through a series of shapes of landmarks in two dimensions using unrolling and unwrapping procedures in Riemannian manifolds. An explicit method of calculation is given which is analogous to that of Jupp & Kent (1987) for spherical data. The resulting splines are called shape-space smoothing splines. The method resembles that of fitting smoothing splines in real spaces in that, if the smoothing parameter is zero, the resulting curve interpolates the data points, and if it is infinitely large the curve is a geodesic line. The fitted path to the data is defined such that its unrolled version at the tangent space of the starting point is a cubic spline fitted to the unwrapped data with respect to that path. Computation of the fitted path consists of an iterative procedure which converges quickly, and the resulting path is given in a discretised form in terms of a piecewise geodesic path. The procedure is applied to the analysis of some human movement data, and a test for the appropriateness of a mean geodesic curve is given.

Journal ArticleDOI
TL;DR: In this article, the authors show that Holm's and Hochberg's methods are special cases of partition testing and recommend that, if the joint distribution of the test statistics is available, through modelling for example, partition step-down testing be used, with exact critical values set from that joint distribution.
Abstract: Holm's method and Hochberg's method for multiple testing can be viewed as step-down and step-up versions of the Bonferroni test. We show that both are special cases of partition testing. The difference is that, while Holm's method tests each partition hypothesis using the largest order statistic, setting a critical value based on the Bonferroni inequality, Hochberg's method tests each partition hypothesis using all the order statistics, setting a series of critical values based on Simes' inequality. Geometrically, Hochberg's step-up method ‘cuts corners’ off the acceptance regions of Holm's step-down method by making assumptions on the joint distribution of the test statistics. As can be expected, partition testing making use of the joint distribution of the test statistics is more powerful than partition testing using probabilistic inequalities. Thus, if the joint distribution of the test statistics is available, through modelling for example, we recommend partition step-down testing, setting exact critical values based on the joint distribution.
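For concreteness, a small implementation of the two procedures being compared, at family-wise level α; the example p-values are chosen so that Hochberg's step-up method rejects hypotheses that Holm's step-down method does not, illustrating the extra power bought by Simes' inequality.

```python
# Sketch of the two procedures compared above: Holm's step-down and Hochberg's
# step-up adjustments for m p-values at family-wise level alpha.
import numpy as np

def holm_reject(pvals, alpha=0.05):
    """Step-down: compare ordered p-values to alpha/(m - i), stop at the first failure."""
    m = len(pvals)
    order = np.argsort(pvals)
    reject = np.zeros(m, dtype=bool)
    for i, idx in enumerate(order):
        if pvals[idx] <= alpha / (m - i):
            reject[idx] = True
        else:
            break
    return reject

def hochberg_reject(pvals, alpha=0.05):
    """Step-up: find the largest i with p_(i) <= alpha/(m - i); reject it and all smaller."""
    m = len(pvals)
    order = np.argsort(pvals)
    reject = np.zeros(m, dtype=bool)
    for i in range(m - 1, -1, -1):            # scan from the largest p-value down
        if pvals[order[i]] <= alpha / (m - i):
            reject[order[: i + 1]] = True
            break
    return reject

p = np.array([0.020, 0.030, 0.040])
print("Holm:    ", holm_reject(p))            # rejects nothing
print("Hochberg:", hochberg_reject(p))        # rejects all three
```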

Journal ArticleDOI
TL;DR: In this paper, a new dimension reduction method, called partial inverse regression, is proposed; its performance is similar or superior to that of partial least squares when n < p, especially when the regression model is nonlinear or heteroscedastic.
Abstract: In regression with a vector of quantitative predictors, sufficient dimension reduction methods can effectively reduce the predictor dimension, while preserving full regression information and assuming no parametric model. However, all current reduction methods require the sample size n to be greater than the number of predictors p. It is well known that partial least squares can deal with problems with n < p. We first establish a link between partial least squares and sufficient dimension reduction. Motivated by this link, we then propose a new dimension reduction method, entitled partial inverse regression. We show that its sample estimator is consistent, and that its performance is similar to or superior to partial least squares when n < p, especially when the regression model is nonlinear or heteroscedastic. An example involving the spectroscopy analysis of biscuit dough is also given.
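A brief illustration of the n < p setting discussed above: the sample predictor covariance is singular, so methods that invert it break down, while partial least squares still returns a fitted coefficient vector. The paper's partial inverse regression estimator itself is not reproduced; scikit-learn's PLSRegression and the simulated model are used only as an assumed comparison point.

```python
# Illustration of the n < p setting: the sample covariance of the predictors is
# rank-deficient, so inverse-regression methods that invert it break down,
# while partial least squares still produces a fitted direction.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(10)
n, p = 40, 100                                   # fewer observations than predictors
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.0, 1.5]
y = X @ beta + 0.5 * rng.standard_normal(n)

print("rank of sample covariance:", np.linalg.matrix_rank(np.cov(X, rowvar=False)))  # < p

pls = PLSRegression(n_components=2).fit(X, y)
print("first few PLS coefficients:", np.round(pls.coef_.ravel()[:5], 2))
```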