
Showing papers in "Biometrika in 1994"


Journal ArticleDOI
TL;DR: In this article, the authors develop a spatially adaptive method, RiskShrink, which works by shrinkage of empirical wavelet coefficients; relying on the data alone, it comes within a factor of about 2 log n of ideal performance and within a factor log^2 n of the performance of piecewise-polynomial and variable-knot spline methods equipped with an oracle.
Abstract: SUMMARY With ideal spatial adaptation, an oracle furnishes information about how best to adapt a spatially variable estimator, whether piecewise constant, piecewise polynomial, variable knot spline, or variable bandwidth kernel, to the unknown function. Estimation with the aid of an oracle offers dramatic advantages over traditional linear estimation by nonadaptive kernels; however, it is a priori unclear whether such performance can be obtained by a procedure relying on the data alone. We describe a new principle for spatially-adaptive estimation: selective wavelet reconstruction. We show that variable-knot spline fits and piecewise-polynomial fits, when equipped with an oracle to select the knots, are not dramatically more powerful than selective wavelet reconstruction with an oracle. We develop a practical spatially adaptive method, RiskShrink, which works by shrinkage of empirical wavelet coefficients. RiskShrink mimics the performance of an oracle for selective wavelet reconstruction as well as it is possible to do so. A new inequality in multivariate normal decision theory which we call the oracle inequality shows that attained performance differs from ideal performance by at most a factor of approximately 2 log n, where n is the sample size. Moreover, no estimator can give a better guarantee than this. Within the class of spatially adaptive procedures, RiskShrink is essentially optimal. Relying only on the data, it comes within a factor log^2 n of the performance of piecewise polynomial and variable-knot spline methods equipped with an oracle. In contrast, it is unknown how or if piecewise polynomial methods could be made to function this well when denied access to an oracle and forced to rely on data alone.
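As a concrete (and heavily simplified) illustration of shrinkage of empirical wavelet coefficients, the sketch below uses a hand-rolled orthonormal Haar transform and soft thresholding at the universal threshold sigma * sqrt(2 log n). This is an assumption on our part for illustration only; it is not the paper's RiskShrink rule, which uses minimax thresholds derived from the oracle inequality.

```python
import numpy as np

def haar_transform(x):
    """Orthonormal Haar transform of a signal whose length is a power of two."""
    coeffs, s = [], np.asarray(x, dtype=float)
    while len(s) > 1:
        pairs = s.reshape(-1, 2)
        coeffs.append((pairs[:, 0] - pairs[:, 1]) / np.sqrt(2))  # detail coefficients
        s = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)             # smooth coefficients
    return s, coeffs

def inverse_haar(s, coeffs):
    for d in reversed(coeffs):
        out = np.empty(2 * len(s))
        out[0::2] = (s + d) / np.sqrt(2)
        out[1::2] = (s - d) / np.sqrt(2)
        s = out
    return s

def wavelet_shrink(y, sigma):
    """Soft-threshold the empirical wavelet coefficients at sigma * sqrt(2 log n)."""
    thr = sigma * np.sqrt(2.0 * np.log(len(y)))
    s, coeffs = haar_transform(y)
    shrunk = [np.sign(d) * np.maximum(np.abs(d) - thr, 0.0) for d in coeffs]
    return inverse_haar(s, shrunk)

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 1024)
f = np.where(t < 0.5, 2.0, -1.0) + np.sin(8 * np.pi * t)  # spatially inhomogeneous signal
y = f + 0.4 * rng.normal(size=t.size)
print(np.mean((y - f) ** 2), np.mean((wavelet_shrink(y, sigma=0.4) - f) ** 2))
```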

8,153 citations


Journal ArticleDOI
TL;DR: In this article, the authors show that a time-varying coefficient β(t) in the Cox model, such as a treatment effect that decreases with time, can be directly visualized by smoothing an appropriate residual plot, and that many tests of proportional hazards can be expressed as weighted least-squares lines fitted to that plot.
Abstract: SUMMARY Nonproportional hazards can often be expressed by extending the Cox model to include time-varying coefficients; e.g., for a single covariate, the hazard function for subject i is modelled as exp{β(t)Z_i(t)}. A common example is a treatment effect that decreases with time. We show that the function β(t) can be directly visualized by smoothing an appropriate residual plot. Also, many tests of proportional hazards, including those of Cox (1972), Gill & Schumacher (1987), Harrell (1986), Lin (1991), Moreau, O'Quigley & Mesbah (1985), Nagelkerke, Oosting & Hart (1984), O'Quigley & Pessione (1989), Schoenfeld (1980) and Wei (1984), are related to time-weighted score tests of the proportional hazards hypothesis, and can be visualized as a weighted least-squares line fitted to the residual plot.
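A hedged sketch of the construction (our notation; the paper's exact scaling may differ): with relative hazard exp{β(t)Z_i(t)}, the scaled Schoenfeld residual s*_k at the k-th event time t_k satisfies approximately

$$ E(s^{*}_{k}) + \hat{\beta} \;\approx\; \beta(t_k), $$

so a scatterplot smooth of s*_k + β̂ against t_k (or against a suitable transform g(t_k)) displays β(t) directly, while a weighted least-squares line fitted to the same plot corresponds to a time-weighted score test of the proportional hazards hypothesis.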

4,770 citations


Journal ArticleDOI
TL;DR: This work shows how to use the Gibbs sampler to carry out Bayesian inference on a linear state space model with errors that are a mixture of normals and coefficients that can switch over time.
Abstract: SUMMARY We show how to use the Gibbs sampler to carry out Bayesian inference on a linear state space model with errors that are a mixture of normals and coefficients that can switch over time. Our approach simultaneously generates the whole of the state vector given the mixture and coefficient indicator variables and simultaneously generates all the indicator variables conditional on the state vectors. The states are generated efficiently using the Kalman filter. We illustrate our approach by several examples and empirically compare its performance to another Gibbs sampler where the states are generated one at a time. The empirical results suggest that our approach is both practical to implement and dominates the Gibbs sampler that generates the states one at a time.
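Conditional on the mixture and coefficient indicator variables, the model is a linear Gaussian state space model, and drawing the whole state vector in one block is a forward-filtering, backward-sampling step. The sketch below shows such a block draw for the simplest possible case, a scalar local-level model with known variances; it is a minimal illustration under our own assumptions, not the paper's general algorithm.

```python
import numpy as np

def ffbs_local_level(y, V, W, rng):
    """Forward-filtering, backward-sampling for y_t = x_t + v_t, x_t = x_{t-1} + w_t,
    with var(v_t) = V and var(w_t) = W: draws the whole state path in one block."""
    T = len(y)
    m = np.zeros(T); C = np.zeros(T)    # filtered means / variances
    a = np.zeros(T); R = np.zeros(T)    # one-step-ahead means / variances
    m_prev, C_prev = 0.0, 1e6           # vague initial prior
    for t in range(T):
        a[t], R[t] = m_prev, C_prev + W          # prediction step
        K = R[t] / (R[t] + V)                    # Kalman gain
        m[t] = a[t] + K * (y[t] - a[t])
        C[t] = (1.0 - K) * R[t]
        m_prev, C_prev = m[t], C[t]
    x = np.zeros(T)
    x[T - 1] = rng.normal(m[T - 1], np.sqrt(C[T - 1]))
    for t in range(T - 2, -1, -1):               # backward sampling
        B = C[t] / R[t + 1]
        mean = m[t] + B * (x[t + 1] - a[t + 1])
        var = C[t] - B * C[t]                    # = C[t] - C[t]**2 / R[t+1]
        x[t] = rng.normal(mean, np.sqrt(var))
    return x

rng = np.random.default_rng(0)
truth = np.cumsum(rng.normal(scale=0.3, size=100))
y = truth + rng.normal(scale=1.0, size=100)
draw = ffbs_local_level(y, V=1.0, W=0.09, rng=rng)
print(np.mean((draw - truth) ** 2))
```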

2,146 citations


Journal ArticleDOI
TL;DR: In this article, the authors explore a class of quantile smoothing splines, defined as solutions to a penalized minimization problem, as nonparametric estimators of conditional quantile functions.
Abstract: SUMMARY Although nonparametric regression has traditionally focused on the estimation of conditional mean functions, nonparametric estimation of conditional quantile functions is often of substantial practical interest. We explore a class of quantile smoothing splines, defined as solutions to
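The abstract above is truncated just before the displayed optimisation problem. A commonly cited form of the quantile smoothing spline objective, stated here as an assumption since the excerpt does not show the paper's exact penalty, is

$$ \hat{g}_{\tau} = \arg\min_{g} \; \sum_{i=1}^{n} \rho_{\tau}\{y_i - g(x_i)\} + \lambda \, V(g'), $$

where ρ_τ(u) = u{τ − I(u < 0)} is the quantile check function, V(g') denotes the total variation of g', and λ governs the smoothness of the fitted conditional quantile function.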

610 citations


Journal ArticleDOI
TL;DR: It is proved that Rao-Blackwellization causes a one-lag delay for the autocovariances among dependent samples obtained from data augmentation, and consequently, the mixture approximation produces estimates with smaller variances than the empirical approximation.
Abstract: SUMMARY We study the covariance structure of a Markov chain generated by the Gibbs sampler, with emphasis on data augmentation. When applied to a Bayesian missing data problem, the Gibbs sampler produces two natural approximations for the posterior distribution of the parameter vector: the empirical distribution based on the sampled values of the parameter vector, and a mixture of complete data posteriors. We prove that Rao-Blackwellization causes a one-lag delay for the autocovariances among dependent samples obtained from data augmentation, and consequently, the mixture approximation produces estimates with smaller variances than the empirical approximation. The covariance structure results are used to compare different augmentation schemes. It is shown that collapsing and grouping random components in a Gibbs sampler with two or three components usually result in more efficient sampling schemes.
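A small numerical illustration of the Rao-Blackwellisation comparison, under our own toy setup rather than the paper's data-augmentation examples: a two-component Gibbs sampler for a bivariate normal with correlation rho, where E(X) is estimated either by the empirical average of the sampled X values or by averaging the conditional means E(X | Y_t) = rho * Y_t (the 'mixture' estimate).

```python
import numpy as np

rng = np.random.default_rng(1)
rho, T, reps = 0.9, 500, 200
sd = np.sqrt(1 - rho ** 2)

emp, rb = [], []
for _ in range(reps):
    x, y = 0.0, 0.0
    xs = np.empty(T); ys = np.empty(T)
    for t in range(T):
        x = rng.normal(rho * y, sd)     # draw X | Y
        y = rng.normal(rho * x, sd)     # draw Y | X
        xs[t], ys[t] = x, y
    emp.append(xs.mean())               # empirical estimate of E(X)
    rb.append((rho * ys).mean())        # Rao-Blackwellised: average of E(X | Y_t)

print("variance of empirical estimate:        ", np.var(emp))
print("variance of Rao-Blackwellised estimate:", np.var(rb))
```

Across replications the Rao-Blackwellised (mixture) estimate typically shows the smaller Monte Carlo variance, in line with the covariance-structure result stated in the abstract.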

606 citations


Journal ArticleDOI
TL;DR: ECME, as discussed by the authors, is a generalization of the ECM algorithm, itself an extension of the EM algorithm (Dempster, Laird & Rubin, 1977), obtained by replacing some CM-steps of ECM, which maximise the constrained expected complete-data log-likelihood function, with steps that maximise the correspondingly constrained actual likelihood function.
Abstract: A generalisation of the ECM algorithm (Meng & Rubin, 1993), which is itself an extension of the EM algorithm (Dempster, Laird & Rubin, 1977), can be obtained by replacing some CM-steps of ECM, which maximise the constrained expected complete-data log-likelihood function, with steps that maximise the correspondingly constrained actual likelihood function. This algorithm, which we call the ECME algorithm, for Expectation/Conditional Maximisation Either, shares with both EM and ECM their stable monotone convergence and basic simplicity of implementation relative to competing faster converging methods. Moreover, ECME can have a substantially faster convergence rate than either EM or ECM, measured using either the number of iterations or actual computer time.
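A schematic illustration of the ECM/ECME distinction, using the univariate t-distribution fit that is often used to motivate ECME; this is our own sketch, not the paper's general formulation. The E-step and the first CM-step are the usual weighted updates for the location and scale, while the second CM-step maximises the actual observed-data log-likelihood over the degrees of freedom (the 'Either' step).

```python
import numpy as np
from scipy import optimize, stats

def ecme_t(y, n_iter=100):
    """ECME-style fit of a univariate t distribution (location mu, scale sigma, df nu)."""
    y = np.asarray(y, dtype=float)
    mu, sigma, nu = np.median(y), y.std(), 10.0
    for _ in range(n_iter):
        # E-step: conditional expectations of the latent gamma weights
        d2 = ((y - mu) / sigma) ** 2
        w = (nu + 1.0) / (nu + d2)
        # CM-step 1: update mu and sigma given the weights
        mu = np.sum(w * y) / np.sum(w)
        sigma = np.sqrt(np.sum(w * (y - mu) ** 2) / y.size)
        # CM-step 2 (the ECME 'Either' step): maximise the actual log-likelihood over nu
        nll = lambda v: -np.sum(stats.t.logpdf(y, df=v, loc=mu, scale=sigma))
        nu = optimize.minimize_scalar(nll, bounds=(0.5, 200.0), method="bounded").x
    return mu, sigma, nu

y = stats.t.rvs(df=4, loc=1.0, scale=2.0, size=500, random_state=0)
print(ecme_t(y))
```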

604 citations


Journal ArticleDOI
TL;DR: In contrast to the proportional hazards model, the additive risk model specifies that the hazard function associated with a set of possibly time-varying covariates is the sum of, rather than the product of, the baseline hazard function and the regression function of covariates as mentioned in this paper.
Abstract: In contrast to the proportional hazards model, the additive risk model specifies that the hazard function associated with a set of possibly time-varying covariates is the sum of, rather than the product of, the baseline hazard function and the regression function of covariates. This formulation describes a different aspect of the association between covariates and the failure time than the proportional hazards model, and is more plausible than the latter for many applications. In the present paper, simple procedures with high efficiencies are developed for making inference about the regression parameters under the additive risk model with an unspecified baseline hazard function.
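The contrast drawn in the abstract can be written compactly. For a (possibly time-varying) covariate vector Z(t), the two formulations are

$$ \lambda(t \mid Z) = \lambda_0(t)\exp\{\beta' Z(t)\} \quad \text{(proportional hazards)}, \qquad \lambda(t \mid Z) = \lambda_0(t) + \beta' Z(t) \quad \text{(additive risk)}, $$

where λ_0(t) is an unspecified baseline hazard function and β is the regression parameter of interest.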

580 citations


Journal ArticleDOI
TL;DR: In this paper, a systematic method is proposed for transforming the variable of integration in Gauss-Hermite quadrature so that the integrand is sampled in an appropriate region; the resulting quadrature can be thought of as a higher-order Laplace approximation.
Abstract: SUMMARY For Gauss-Hermite quadrature, we consider a systematic method for transforming the variable of integration so that the integrand is sampled in an appropriate region. The effectiveness of the quadrature then depends on the ratio of the integrand to some Gaussian density being a smooth function, well approximated by a low-order polynomial. It is pointed out that, in this approach, order one Gauss-Hermite quadrature becomes the Laplace approximation. Thus the quadrature as implemented here can be thought of as a higher-order Laplace approximation.
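A small sketch of the transformation idea, under our own choice of example: centre and scale the Gauss-Hermite nodes at a mode mu and scale sigma of the integrand, so that the ratio of the integrand to the implied Gaussian density is smooth. With order 1 the rule collapses to the Laplace approximation sigma * sqrt(2*pi) * f(mu).

```python
import numpy as np

def adaptive_gauss_hermite(f, mu, sigma, order):
    """Approximate the integral of f over the real line after centring/scaling at (mu, sigma)."""
    t, w = np.polynomial.hermite.hermgauss(order)   # nodes/weights for weight exp(-t^2)
    x = mu + np.sqrt(2.0) * sigma * t
    return np.sqrt(2.0) * sigma * np.sum(w * np.exp(t ** 2) * f(x))

# Example: integrand with a dominant Gaussian peak at 2; exact value is sqrt(2*pi) * 1.5.
f = lambda x: (1 + 0.1 * x ** 2) * np.exp(-0.5 * (x - 2.0) ** 2)
print(adaptive_gauss_hermite(f, mu=2.0, sigma=1.0, order=10))
print(np.sqrt(2 * np.pi) * 1.5)
```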

423 citations


Journal ArticleDOI
TL;DR: The authors describe a new class of pattern-mixture models for the situation where missingness is assumed to depend on an arbitrary unspecified function of a linear combination of the two variables.
Abstract: SUMMARY Likelihood-based methods are developed for analyzing a random sample on two continuous variables when values of one of the variables are missing. Normal maximum likelihood estimates when values are missing completely at random were derived by Anderson (1957). They are also maximum likelihood providing the missing-data mechanism is ignorable, in Rubin's (1976) sense that the mechanism depends only on observed data. A new class of pattern-mixture models (Little, 1993) is described for the situation where missingness is assumed to depend on an arbitrary unspecified function of a linear combination of the two variables. Maximum likelihood for models in this class is straightforward, and yields the estimates of Anderson (1957) when missingness depends solely on the completely observed variable, and the estimates of Brown (1990) when missingness depends solely on the incompletely observed variable. Another choice of linear combination yields estimates from complete-case analysis. Large-sample and Bayesian methods are described for this model. The data do not supply information about the ratio of the coefficients of the linear combination that controls missingness. If this ratio is not well determined based on prior knowledge, a prior distribution can be specified, and Bayesian inference is then readily accomplished. Alternatively, sensitivity of inferences can be displayed for a variety of choices of the ratio.

395 citations


Journal ArticleDOI
Neil Shephard
TL;DR: The use of simulation techniques to extend the applicability of the usual Gaussian state space filtering and smoothing techniques to a class of non-Gaussian time series models allows a fully Bayesian or maximum likelihood analysis of some interesting models, including outlier models, discrete Markov chain components, multiplicative models and stochastic variance models.
Abstract: SUMMARY In this paper we suggest the use of simulation techniques to extend the applicability of the usual Gaussian state space filtering and smoothing techniques to a class of non-Gaussian time series models. This allows a fully Bayesian or maximum likelihood analysis of some interesting models, including outlier models, discrete Markov chain components, multiplicative models and stochastic variance models. Finally we discuss at some length the use of a non-Gaussian model to seasonally adjust the published money supply figures.

384 citations


Journal ArticleDOI
TL;DR: In this article, a general and simple resampling method for inferences about a finite-dimensional parameter vector based on pivotal estimating functions is proposed, which can be easily and efficiently implemented with existing statistical software.
Abstract: SUMMARY Suppose that, under a semiparametric model setting, one is interested in drawing inferences about a finite-dimensional parameter vector β based on an estimating function. Generally a consistent point estimator β̂ for β_0, the true value of β, can be easily obtained by finding a root of the corresponding estimating equation. To estimate the variance of β̂, however, may involve complicated and subjective nonparametric functional estimates. In this paper, a general and simple resampling method for inferences about β_0 based on pivotal estimating functions is proposed. The new procedure is illustrated with the quantile and rank regression models. For both cases, our proposal can be easily and efficiently implemented with existing statistical software.

Journal ArticleDOI
TL;DR: In this article, the authors proposed a model that takes the additive structure of Aalen's model and imposes parametric constraints to obtain a semiparametric submodel, which may be more appropriate in some applications.
Abstract: SUMMARY Aalen's additive risk model allows the influence of each covariate to vary separately over time. Although allowing greater flexibility of temporal structure than a Cox model, Aalen's model is more limited in the number of covariates it can handle. We introduce a partly parametric version of Aalen's model in which the influence of only a few covariates varies nonparametrically over time, and that of the remaining covariates is constant. Efficient procedures for fitting this new model are developed and studied. The approach is applied to data from the Medical Research Council's myelomatosis trials. Aalen's model can be viewed as the first step of a Taylor series expansion of a general hazard function about the zero of the covariate vector. However, in estimating the unknown functions in such a general model there is a variance-bias trade-off that may be critical in small and medium samples. Also, after fitting the model one does not have parameters or formulae that are easily reported. We propose a model that takes the additive structure of Aalen's model and imposes parametric constraints to obtain a semiparametric submodel, which may be more appropriate in some applications. The model will be illustrated with data from clinical trials on myelomatosis. Covariates include treatment, sex and four age strata, which will be treated parametrically, together with serum levels of haemoglobin and β2-microglobulin, whose effects will be investigated nonparametrically. The additive form can be interpreted loosely in terms of unobserved competing risks since the hazard function for the minimum of independent random variables is the sum of the hazard functions for the individual variables. Microglobulin levels are related to kidney function and tumour mass, whereas haemoglobin is unaffected by kidney function. Hence one might anticipate that the hazard function associated with each covariate represents a different cause of death.
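A hedged sketch of the partly parametric structure described above, in our notation rather than necessarily the paper's: splitting the covariates into X(t), whose effects are left nonparametric, and Z(t), whose effects are held constant,

$$ \lambda(t \mid X, Z) = \alpha_0(t) + \alpha(t)' X(t) + \beta' Z(t), $$

so that only the low-dimensional α(·) varies over time while β is an ordinary finite-dimensional regression parameter.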

Journal ArticleDOI
TL;DR: The technique of cross-validation is extended to the case where observations form a general stationary sequence: the training set is reduced by removing the h observations preceding and following the observation in the test set, with h taken to be a fixed fraction of the sample size.
Abstract: SUMMARY In this paper we extend the technique of cross-validation to the case where observations form a general stationary sequence. We call it h-block cross-validation, because the idea is to reduce the training set by removing the h observations preceding and following the observation in the test set. We propose taking h to be a fixed fraction of the sample size, and we add a term to our h-block cross-validated estimate to compensate for the underuse of the sample. The advantages of the proposed modification over the cross-validation technique are demonstrated via simulation.
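A minimal sketch of the h-block idea, under our own toy example of a one-lag autoregressive predictor; the compensation term that the paper adds for the underuse of the sample is omitted here.

```python
import numpy as np

def h_block_train_indices(n, t, h):
    """Training indices for test point t: drop t and the h observations on either side."""
    return [i for i in range(n) if abs(i - t) > h]

rng = np.random.default_rng(0)
n = 200
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.7 * y[t - 1] + rng.normal()       # a stationary AR(1) series

h = int(0.1 * n)                               # h taken as a fixed fraction of the sample size
errors = []
for t in range(1, n):
    train = set(h_block_train_indices(n, t, h))
    # keep only lag pairs (i-1, i) lying entirely inside the reduced training set
    idx = np.array([i for i in train if i >= 1 and (i - 1) in train])
    phi = np.sum(y[idx - 1] * y[idx]) / np.sum(y[idx - 1] ** 2)
    errors.append((y[t] - phi * y[t - 1]) ** 2)
print("h-block cross-validated prediction error:", np.mean(errors))
```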

Journal ArticleDOI
TL;DR: In this paper, the authors apply standard convex optimization techniques to the analysis of interval censored data and provide easily verifiable conditions for the selfconsistent estimator proposed by Turnbull (1976) to be a maximum likelihood estimator and for checking whether the maximum likelihood estimate is unique.
Abstract: SUMMARY Standard convex optimization techniques are applied to the analysis of interval censored data. These methods provide easily verifiable conditions for the self-consistent estimator proposed by Turnbull (1976) to be a maximum likelihood estimator and for checking whether the maximum likelihood estimate is unique. A sufficient condition is given for the almost sure convergence of the maximum likelihood estimator to the true underlying distribution function.

Journal ArticleDOI
TL;DR: In this article, simultaneous confidence bands for the subject-specific survival curve under the Cox proportional hazards model were constructed by using a zero-mean Gaussian process whose distribution can be easily generated through simulation.
Abstract: SUMMARY In this paper, we show how to construct simultaneous confidence bands for the subject-specific survival curve under the Cox proportional hazards model. The idea is to approximate the distribution of the normalized cumulative hazard estimator by a zero-mean Gaussian process whose distribution can be easily generated through simulation. Numerical studies indicate that the proposed bands are appropriate for practical use. A liver disease example is presented.

Journal ArticleDOI
TL;DR: In this article, the authors present two methods for stepwise selection of sampling units, and corresponding schemes for removal of units that can be used in connection with sample rotation, and describe practical, geometrically convergent algorithms for computing the weights w_i from the inclusion probabilities π_i.
Abstract: SUMMARY Attention is drawn to a method of sampling a finite population of N units with unequal probabilities and without replacement. The method was originally proposed by Stern & Cover (1989) as a model for lotteries. The method can be characterized as maximizing entropy given coverage probabilities π_i, or equivalently as having the probability of a selected sample proportional to the product of a set of 'weights' w_i. We show the essential uniqueness of the w_i given the π_i, and describe practical, geometrically convergent algorithms for computing the w_i from the π_i. We present two methods for stepwise selection of sampling units, and corresponding schemes for removal of units that can be used in connection with sample rotation. Inclusion probabilities of any order can be written explicitly in closed form. Second-order inclusion probabilities π_ij satisfy the condition 0 < π_ij < π_i π_j, which guarantees Yates & Grundy's variance estimator to be unbiased, definable for all samples and always nonnegative for any sample size.
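To make the weight/inclusion-probability relation concrete, the brute-force sketch below enumerates all samples of a tiny population and adjusts the weights by a naive fixed-point update until the implied inclusion probabilities match the targets. This is purely illustrative: it is not the paper's geometrically convergent algorithm, and the simple multiplicative update is not guaranteed to converge in general.

```python
import numpy as np
from itertools import combinations

def inclusion_probs(w, n):
    """Exact first-order inclusion probabilities when pr(sample s) is proportional to prod_{i in s} w_i."""
    N = len(w)
    probs, total = np.zeros(N), 0.0
    for s in combinations(range(N), n):
        p = np.prod([w[i] for i in s])
        total += p
        for i in s:
            probs[i] += p
    return probs / total

def fit_weights(pi_target, n, iters=500):
    """Naive fixed-point search for weights reproducing the target inclusion probabilities."""
    w = np.array(pi_target, dtype=float)
    for _ in range(iters):
        w *= pi_target / inclusion_probs(w, n)
        w /= w.sum()                     # weights are only identified up to scale
    return w

pi_target = np.array([0.8, 0.6, 0.3, 0.2, 0.1])   # must sum to the sample size n = 2
w = fit_weights(pi_target, n=2)
print(np.round(inclusion_probs(w, 2), 4))          # should be close to pi_target
```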

Journal ArticleDOI
TL;DR: In this paper, the joint distribution of p binary variables is studied in the quadratic exponential form containing only "main effects" and "two-factor interactions" in the log probabilities.
Abstract: SUMMARY The joint distribution of p binary variables is studied in the quadratic exponential form containing only 'main effects' and 'two-factor interactions' in the log probabilities. Approximate versions of marginalized forms of the distribution are studied based on Taylor expansion and a number of conclusions drawn.
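For concreteness, the quadratic exponential form referred to above is usually written (in our notation) as

$$ \operatorname{pr}(Y_1 = y_1, \ldots, Y_p = y_p) \;\propto\; \exp\Big( \sum_{k=1}^{p} \theta_k y_k + \sum_{j<k} \theta_{jk} y_j y_k \Big), \qquad y_k \in \{0, 1\}, $$

with the θ_k playing the role of 'main effects' and the θ_{jk} of 'two-factor interactions' in the log probabilities. Marginalising over a subset of the variables does not in general preserve this form, which is what motivates the approximate marginalised versions studied via Taylor expansion.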

Journal ArticleDOI
TL;DR: In this article, a stochastic model for the study of the influence of time-dependent covariates on the marginal distribution of the binary response in serially correlated binary data is proposed.
Abstract: SUMMARY A stochastic model is proposed for the study of the influence of time-dependent covariates on the marginal distribution of the binary response in serially correlated binary data. Markov chains are expressed in terms of transitional rather than marginal probabilities. We show how to construct the model so that the covariates relate only to the mean value of the process, independently of the association parameter. After formulating the stochastic model for a simple sequence of data with possibly missing data, the same approach is applied to a repeated measures setting and illustrated with a real data example.

Journal ArticleDOI
TL;DR: In this article, the authors extend the Heitjan-Rubin model by explicitly defining the observed degree of coarseness as a data element, which permits the development of a frequentist theory, including a generalisation of 'missing completely at random', the frequentist ignorability condition for missing data.
Abstract: SUMMARY Rubin (1976) defined ignorability conditions for frequentist and Bayes/likelihood analyses of data subject to missing observations. More recently, Heitjan & Rubin (1991) and Heitjan (1993) generalised the Rubin model to encompass other forms of incompleteness, establishing ignorability conditions for Bayes/likelihood inferences only. This paper extends the Heitjan-Rubin model by explicitly defining the observed degree of coarseness as a data element. This permits the development of a frequentist theory, including a generalisation of 'missing completely at random', the frequentist ignorability condition for missing data. The model is applied in a number of incomplete-data problems of general interest.

Journal ArticleDOI
TL;DR: The link function is modelled as an unknown strictly increasing cumulative distribution function, represented as a mixture of Beta cumulative distribution functions (a family dense within the collection of all continuous densities on [0, 1]); the model is fitted using sampling-based methods, in particular a tailored Metropolis-within-Gibbs algorithm.
Abstract: We model the link function flexibly by incorporating it as an unknown in the model. Since the link function is usually taken to be strictly increasing, by a strictly increasing transformation of its range to the unit interval we can model it as a strictly increasing cumulative distribution function. The transformation results in a domain which is [0, 1]. We model the cumulative distribution function as a mixture of Beta cumulative distribution functions, noting that the latter family is dense within the collection of all continuous densities on [0, 1]. For the fitting of the model we take a Bayesian approach, encouraging vague priors, to focus upon the likelihood. We discuss choices of such priors as well as the integrability of the resultant posteriors. Implementation of the Bayesian approach is carried out using sampling-based methods, in particular a tailored Metropolis-within-Gibbs algorithm. An illustrative example utilising data involving wave damage to cargo ships is provided.

Journal ArticleDOI
TL;DR: In this article, sampling properties of estimators of the mean of a positive random variable that Winsorize the largest or the two largest observations in the sample are investigated and approximate expressions for the mean squared errors of these estimators are derived.
Abstract: SUMMARY Sampling properties of estimators of the mean of a positive random variable that Winsorize the largest or the two largest observations in the sample are investigated. Exact and approximate expressions for the mean squared errors of these estimators are derived. Optimal Winsorization schemes are obtained for various skewed distributions. Efficiency comparisons between the sample mean and Winsorized means are presented for several families of skewed distributions. A nearly unbiased estimator of the mean squared error of the Winsorized mean is proposed.
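A minimal sketch of the estimators being compared, under arbitrary illustrative choices of distribution and sample size: the mean after Winsorizing the largest, or the two largest, observations of a skewed sample.

```python
import numpy as np

def winsorized_mean(x, k=1):
    """Mean after replacing the k largest observations by the (k+1)-th largest
    (one-sided Winsorization of the upper tail only)."""
    x = np.sort(np.asarray(x, dtype=float))
    x[-k:] = x[-(k + 1)]
    return x.mean()

rng = np.random.default_rng(0)
sample = rng.lognormal(mean=0.0, sigma=1.5, size=30)   # a heavily skewed positive sample
print(sample.mean(), winsorized_mean(sample, k=1), winsorized_mean(sample, k=2))
```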

Journal ArticleDOI
TL;DR: In this paper, a test based on residual partial autocorrelations is proposed which is particularly powerful in cases when the fitted model underestimates the order of the moving average component.
Abstract: SUMMARY This note proposes a test of goodness of fit for time series models based on the sum of the squared residual partial autocorrelations. The test statistic is asymptotically χ². Its small-sample performance is studied through a Monte Carlo experiment. It appears sensitive to erroneous specifications, especially when the fitted model understates the order of the moving average component. Residual analysis is a fundamental step in building empirical time series models. When checking the adequacy of the model one usually tests for the absence of residual autocorrelation. There are a number of tests designed for this purpose, both in the time and frequency domains; see, for instance, Quenouille (1947, 1949), Bartlett (1954), Box & Pierce (1970), Ljung & Box (1978), Ansley & Newbold (1979), Godfrey (1979). In this paper a test based on residual partial autocorrelations is proposed which is particularly powerful in cases when the fitted model underestimates the order of the moving average component. Let X_t be a zero-mean process generated by an ARMA model φ(B)X_t = θ(B)a_t, where B is the backshift operator, φ(B) is a polynomial of order p, θ(B) is a polynomial of order q and a_t is white noise. Let â_1, ..., â_n be the residuals obtained after estimating the model. A very popular goodness-of-fit test, proposed by Box & Pierce (1970) and improved by Ljung & Box (1978), is based on the statistic Q = n(n + 2) Σ_{k=1}^m r̂_k² / (n − k), where r̂_k is the lag-k autocorrelation of the residuals and m is the number of lags considered.

Journal ArticleDOI
TL;DR: The authors build connections that allow 1 − α confidence intervals to be constructed from α-level bioequivalence tests and vice versa; when applied to the α-level two one-sided tests, the resulting 1 − α confidence intervals are properly contained in Westlake's (1976) 1 − α symmetric interval.
Abstract: SUMMARY Previously, the connection between bioequivalence tests and their associated confidence intervals was not well understood. In fact, the α-level two one-sided tests approach was thought to be associated with a 1 − 2α confidence interval. In this paper, we build up connections which allow us to construct 1 − α confidence intervals from α-level tests and vice versa. When applied to the α-level two one-sided tests, the resultant 1 − α confidence intervals are properly contained in Westlake's 1 − α symmetric interval (1976). Our approach is readily generalized to different settings, including the nonparametric setting, the ratio parameter setting, and the repeated confidence interval setting.

Journal ArticleDOI
TL;DR: In this paper, the Laplace method for approximating integrals is applied to give a general approximation for the kth moment of a ratio of quadratic forms in random variables; the approximation entails only basic algebraic operations.
Abstract: SUMMARY The Laplace method for approximating integrals is applied to give a general approximation for the kth moment of a ratio of quadratic forms in random variables. The technique utilises the existence of a dominating peak at the boundary point on the range of integration. As closed form and tractable formulae do not exist in general, this simple approximation, which only entails basic algebraic operations, has evident practical appeal. We exploit the approximation to provide an approximate mean-bias function for the least squares estimator of the coefficient of the lag dependent variable in a first-order stochastic difference equation.

Journal ArticleDOI
TL;DR: A fast method of computation for a discretized version of the thin-plate spline for image data is described using the Discrete Cosine Transform, which, it is claimed, may profitably be used in place of the Discrete Fourier Transform in a variety of image processing applications besides spline smoothing.
Abstract: SUMMARY This paper describes a fast method of computation for a discretized version of the thin-plate spline for image data. This method uses the Discrete Cosine Transform and is contrasted with a similar approach based on the Discrete Fourier Transform. The two methods are similar from the point of view of speed, but the errors introduced near the edge of the image by use of the Discrete Fourier Transform are significantly reduced when the Discrete Cosine Transform is used. This is because, while the Discrete Fourier Transform implicitly assumes periodic boundary conditions, the Discrete Cosine Transform uses reflective boundary conditions. It is claimed that the Discrete Cosine Transform may profitably be used in place of the Discrete Fourier Transform in a variety of image processing applications besides spline smoothing.
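A rough sketch of DCT-based penalized smoothing for image data, under our own discretisation choices (it is not claimed to reproduce the paper's exact scheme): the Discrete Cosine Transform diagonalises a discrete Laplacian with reflective boundary conditions, so a thin-plate-style roughness penalty reduces to a simple filter on the transform coefficients.

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_smooth(img, lam):
    """Smooth an image by penalising the squared discrete Laplacian in the DCT domain."""
    m, n = img.shape
    # eigenvalues of the discrete Laplacian under reflective (Neumann) boundaries
    wi = 2.0 * (1.0 - np.cos(np.pi * np.arange(m) / m))
    wj = 2.0 * (1.0 - np.cos(np.pi * np.arange(n) / n))
    penalty = (wi[:, None] + wj[None, :]) ** 2      # squared Laplacian: thin-plate-like penalty
    coef = dctn(img, norm="ortho")
    return idctn(coef / (1.0 + lam * penalty), norm="ortho")

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 64)
truth = np.sin(2 * np.pi * x)[:, None] * np.cos(2 * np.pi * x)[None, :]
noisy = truth + 0.3 * rng.normal(size=truth.shape)
print(np.mean((noisy - truth) ** 2), np.mean((dct_smooth(noisy, lam=5.0) - truth) ** 2))
```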

Journal ArticleDOI
TL;DR: In this article, the authors describe a dynamic or state-space approach for analyzing discrete time or grouped survival data, which is based on maximization of posterior densities or, equivalently, a penalized likelihood.
Abstract: SUMMARY This paper describes a dynamic or state-space approach for analyzing discrete time or grouped survival data. Simultaneous estimation of baseline hazard functions and of time-varying covariate effects is based on maximization of posterior densities or, equivalently, a penalized likelihood, leading to Kalman-type smoothing algorithms. Data-driven choice of unknown smoothing parameters is possible via an EM-type procedure. The methods are illustrated by applications to real data.

Journal ArticleDOI
TL;DR: The optimal choice of the numbers of resamples in the two stages of the iterated bootstrap, when that technique is used to calibrate bootstrap confidence intervals, is discussed; it is shown to be optimal to take C to be approximately a constant multiple of B'.
Abstract: SUMMARY We discuss optimal choice of the numbers of resamples in the two stages of the iterated bootstrap, when that technique is used to calibrate bootstrap confidence intervals. If the numbers of resamples in the first and second stages are denoted by B and C respectively, we show that it is optimal to take C to be approximately a constant multiple of B'. The value of the constant is derived, and shown to depend only on the nominal coverage level of a confidence interval. However, it assumes different values in the cases of one- and two-sided intervals.
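A rough sketch of the two-stage (iterated) bootstrap calibration being analysed, with B first-stage and C second-stage resamples; the values of B, C and the search grid below are arbitrary illustrative choices, and the paper's asymptotic analysis of how C should grow with B is not reproduced here.

```python
import numpy as np

def upper_limit(data, level, n_resamples, rng):
    """Percentile-bootstrap upper confidence limit for the mean."""
    means = np.array([rng.choice(data, size=data.size, replace=True).mean()
                      for _ in range(n_resamples)])
    return np.quantile(means, level)

def calibrated_upper_limit(data, target, B, C, rng):
    """Calibrate the nominal level so the estimated coverage matches the target."""
    theta_hat = data.mean()
    grid = np.linspace(0.5, 0.999, 100)          # candidate nominal levels
    coverage = np.zeros_like(grid)
    for _ in range(B):                           # first-stage resamples
        x_star = rng.choice(data, size=data.size, replace=True)
        inner = np.array([rng.choice(x_star, size=x_star.size, replace=True).mean()
                          for _ in range(C)])    # second-stage resamples
        coverage += (np.quantile(inner, grid) >= theta_hat)
    coverage /= B
    adjusted = grid[np.argmin(np.abs(coverage - target))]
    return upper_limit(data, adjusted, B, rng)

rng = np.random.default_rng(0)
x = rng.exponential(size=50)
print(calibrated_upper_limit(x, target=0.95, B=300, C=200, rng=rng))
```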

Journal ArticleDOI
TL;DR: This note discusses an error in a test for Poisson overdispersion suggested by Tiago de Oliveira (1965): the limiting null distribution of the suggested statistic is neither pivotal nor standard normal, and a corrected version of the asymptotic standard error is given.
Abstract: SUMMARY This note discusses an error occurring in a test for Poisson overdispersion suggested by Tiago de Oliveira (1965). The limiting null distribution of the suggested statistic is neither pivotal nor is it standard normal. The error lies in the computation of the asymptotic standard error of the overdispersion estimate, for which a corrected version is given. The corrected version of the test statistic becomes equivalent to the normalized version of Fisher's index of dispersion.

Journal ArticleDOI
TL;DR: In this paper, it was shown that for logistic regression models with binary data, there is no estimator with a high finite sample breakdown point, provided the estimator has to fulfill a weak condition.
Abstract: SUMMARY One aim of robust regression is to find estimators with high finite sample breakdown points. Although various robust estimators have been proposed in logistic regression models, their breakdown points are not yet known. Here it is shown for logistic regression models with binary data that there is no estimator with a high finite sample breakdown point, provided the estimator has to fulfill a weak condition. In logistic regression models with large strata, however, a modification of Rousseeuw's least median of squares estimator is shown to have a finite sample breakdown point of approximately 1/2.

Journal ArticleDOI
TL;DR: In this paper, optimality of complete diallel crosses in incomplete blocks is investigated; optimal complete diallel crosses are characterized and derived using nested balanced incomplete block designs, and a table of optimal plans for up to 15 lines is provided.
Abstract: SUMMARY Optimality of complete diallel crosses in incomplete blocks is investigated. Optimal complete diallel crosses are characterized, and such binary plans are derived using nested balanced incomplete block designs. A table of optimal complete diallel crosses for up to 15 lines is also provided.