# Papers in "Biometrika", 1988

••

TL;DR: In this article, the authors propose new tests for detecting the presence of a unit root in quite general time series models; the tests accommodate models with a fitted drift and a time trend, so they may be used to discriminate between unit root nonstationarity and stationarity about a deterministic trend.

Abstract: SUMMARY This paper proposes new tests for detecting the presence of a unit root in quite general time series models. Our approach is nonparametric with respect to nuisance parameters and thereby allows for a very wide class of weakly dependent and possibly heterogeneously distributed data. The tests accommodate models with a fitted drift and a time trend so that they may be used to discriminate between unit root nonstationarity and stationarity about a deterministic trend. The limiting distributions of the statistics are obtained under both the unit root null and a sequence of local alternatives. The latter noncentral distribution theory yields local asymptotic power functions for the tests and facilitates comparisons with alternative procedures due to Dickey & Fuller. Simulations are reported on the performance of the new tests in finite samples.

16,874 citations
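
A minimal sketch of the Dickey-Fuller-style regression that unit root tests of this kind start from; the paper's own contribution, nonparametric corrections to the resulting statistics, is not shown here, and the function name `df_regression` is mine, not the paper's:

```python
def df_regression(y):
    """OLS fit of the Dickey-Fuller regression  dy_t = a + rho*y_{t-1} + e_t;
    returns (a_hat, rho_hat).  Under the unit-root null, rho = 0 in this
    parameterization; stationary series give rho < 0."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]   # first differences
    lag = y[:-1]                                       # lagged levels
    n = len(dy)
    mx = sum(lag) / n
    md = sum(dy) / n
    sxx = sum((v - mx) ** 2 for v in lag)
    sxd = sum((v - mx) * (d - md) for v, d in zip(lag, dy))
    rho = sxd / sxx
    return md - rho * mx, rho

# Noiseless drift path y_t = t: dy_t is constant, so rho_hat = 0 exactly.
a_hat, rho_hat = df_regression([float(t) for t in range(20)])
```

A strongly mean-reverting series, by contrast, yields a clearly negative `rho_hat`; the tests in the paper are built on studentized versions of such estimates.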

••

TL;DR: In this article, a simple procedure for multiple tests of significance based on individual p-values is derived, which is sharper than Holm's (1979) sequentially rejective procedure.

Abstract: SUMMARY A simple procedure for multiple tests of significance based on individual p-values is derived. This simple procedure is sharper than Holm's (1979) sequentially rejective procedure. Both procedures contrast the ordered p-values with the same set of critical values. Holm's procedure rejects an hypothesis only if its p-value and each of the smaller p-values are less than their corresponding critical values. The new procedure rejects all hypotheses with p-values smaller than or equal to that of any hypothesis found to be less than its critical value.

4,610 citations
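
The step-up procedure described above is short enough to sketch directly; this is a plain reading of the abstract (ordered p-values against the Holm critical values alpha/(n - i + 1), stepping up from the largest), with the function name `hochberg` my own label:

```python
def hochberg(pvalues, alpha=0.05):
    """Hochberg's step-up procedure: find the largest rank i with
    p_(i) <= alpha / (n - i + 1) and reject the hypotheses with the i
    smallest p-values.  Returns rejection flags in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    k = 0
    for rank in range(n, 0, -1):          # step up from the largest p-value
        if pvalues[order[rank - 1]] <= alpha / (n - rank + 1):
            k = rank
            break
    reject = [False] * n
    for rank in range(k):
        reject[order[rank]] = True
    return reject
```

The sharpening over Holm is visible on p-values (0.04, 0.045, 0.046, 0.047) at alpha = 0.05: Holm stops immediately (0.04 > 0.0125), while the step-up rule sees 0.047 <= 0.05 and rejects all four.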

••

TL;DR: In this article, the empirical distribution function based on a sample is used to define a likelihood ratio function for distributions, which can be used to construct confidence intervals for the sample mean, for a class of M-estimates that includes quantiles, and for differentiable statistical functionals.

Abstract: SUMMARY The empirical distribution function based on a sample is well known to be the maximum likelihood estimate of the distribution from which the sample was taken. In this paper the likelihood function for distributions is used to define a likelihood ratio function for distributions. It is shown that this empirical likelihood ratio function can be used to construct confidence intervals for the sample mean, for a class of M-estimates that includes quantiles, and for differentiable statistical functionals. The results are nonparametric extensions of Wilks's (1938) theorem for parametric likelihood ratios. The intervals are illustrated on some real data and compared in a simulation to some bootstrap confidence intervals and to intervals based on Student's t statistic. A hybrid method that uses the bootstrap to determine critical values of the likelihood ratio is introduced.

1,996 citations
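
The empirical likelihood ratio for a mean has a simple computational form: maximize the product of the weights subject to the mean constraint, which reduces to a one-dimensional search for a Lagrange multiplier. The sketch below is a standard textbook implementation of that calculation, not code from the paper; the function name is mine:

```python
import math

def neg2_log_elr_mean(x, mu, tol=1e-12):
    """-2 log empirical likelihood ratio for the mean: maximize prod(n*w_i)
    subject to sum w_i = 1 and sum w_i (x_i - mu) = 0.  The optimal weights
    are w_i = 1/(n*(1 + lam*(x_i - mu))), with lam found by bisection on the
    monotone (decreasing) score equation sum z_i/(1 + lam*z_i) = 0."""
    z = [xi - mu for xi in x]
    if min(z) >= 0 or max(z) <= 0:
        raise ValueError("mu must lie inside the range of the data")
    # lam must keep every 1 + lam*z_i positive
    lo = -1.0 / max(z) + 1e-10
    hi = -1.0 / min(z) - 1e-10

    def score(lam):
        return sum(zi / (1.0 + lam * zi) for zi in z)

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log(1.0 + lam * zi) for zi in z)
```

At the sample mean the statistic is zero (uniform weights are optimal); moving mu away from the sample mean makes it positive, and Wilks-type calibration against a chi-squared quantile gives the confidence interval.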

••

TL;DR: In this paper, a general univariate smooth transition autoregressive, STAR, model is studied and three tests for testing linearity against STAR models are presented; the power of the tests in small samples is investigated by simulation when the alternative is the logistic STAR model.

Abstract: SUMMARY We study a general univariate smooth transition autoregressive, STAR, model. It contains as a special case the self-exciting threshold autoregressive, SETAR, model. We present three tests for testing linearity against STAR models and discuss their properties. The power of the tests in small samples is investigated by simulation when the alternative is the logistic STAR model. One of the tests is identical to Tsay's (1986) test statistic and is recommended only in a special case. Of the two remaining tests with wider applicability, one seems superior to the other in small samples. It is also more powerful than the CUSUM test recently proposed for testing linearity against SETAR models.

1,446 citations
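
The logistic STAR alternative named above can be illustrated with its conditional mean for a first-order, two-regime model; the parameterization and function name below are my own illustrative choices, not taken from the paper:

```python
import math

def lstar_mean(y_lag, phi1, phi2, gamma, c):
    """Conditional mean of a two-regime logistic STAR(1) model:
    E[y_t | y_{t-1}] = phi1*y_{t-1} + (phi2 - phi1)*y_{t-1}*G(y_{t-1}),
    with logistic transition G(z) = 1/(1 + exp(-gamma*(z - c))).
    gamma = 0 gives G = 1/2 (a linear AR model); gamma -> infinity makes
    G a step function, recovering the SETAR special case."""
    t = -gamma * (y_lag - c)
    G = 0.0 if t > 700 else 1.0 / (1.0 + math.exp(t))  # guard against overflow
    return phi1 * y_lag + (phi2 - phi1) * y_lag * G
```

The linearity tests in the paper exploit exactly this structure: under gamma = 0 the transition term vanishes from the model, so linearity corresponds to a null hypothesis on the transition parameters.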

••

TL;DR: In this article, a multiple test procedure allowing statements on individual hypotheses is proposed; it is based on the principle of closed test procedures (Marcus, Peritz & Gabriel, 1976) and controls the multiple level α.

Abstract: SUMMARY Simes (1986) has proposed a modified Bonferroni procedure for the test of an overall hypothesis which is the combination of n individual hypotheses. In contrast to the classical Bonferroni procedure, it is not obvious how statements about individual hypotheses are to be made for this procedure. In the present paper a multiple test procedure allowing statements on individual hypotheses is proposed. It is based on the principle of closed test procedures (Marcus, Peritz & Gabriel, 1976) and controls the multiple level α.

1,154 citations
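
The building block of the closed procedure is the Simes (1986) global test itself, which is easy to state in code; this sketches only that global test, not the full closure over all intersection hypotheses, and the function name is mine:

```python
def simes_reject_global(pvalues, alpha=0.05):
    """Simes (1986) test of the global null: with p-values sorted ascending,
    reject if p_(i) <= i*alpha/n for at least one i.  This is never less
    powerful than the classical Bonferroni rule min(p) <= alpha/n."""
    n = len(pvalues)
    p_sorted = sorted(pvalues)
    return any(p_sorted[i] <= (i + 1) * alpha / n for i in range(n))
```

For p-values (0.02, 0.024, 0.9, 0.9) at alpha = 0.05, Bonferroni fails (0.02 > 0.0125) but Simes rejects via the second ordered p-value (0.024 <= 0.025); the paper's procedure applies this test to every intersection hypothesis containing a given individual hypothesis.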

••

TL;DR: In this paper, a model for regression analysis with a time series of counts is presented, where correlation is assumed to arise from an unobservable process added to the linear predictor in a log linear model.

Abstract: SUMMARY This paper discusses a model for regression analysis with a time series of counts. Correlation is assumed to arise from an unobservable process added to the linear predictor in a log linear model. An estimating equation approach used for parameter estimation leads to an iterative weighted and filtered least-squares algorithm. Asymptotic properties for the regression coefficients are presented. We illustrate the technique with an analysis of trends in U.S. polio incidence since 1970.

577 citations
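
The "iterative weighted least-squares" core of the algorithm can be sketched for the simpler case without the latent autocorrelation process, i.e. ordinary Poisson log-linear regression with one covariate; the filtering step that handles the correlated latent process is the paper's addition and is omitted here. Function name and parameterization are mine:

```python
import math

def poisson_irls(y, x, iters=100):
    """IRLS (Fisher scoring) for the log-linear model log E[y_i] = b0 + b1*x_i.
    Each step is weighted least squares of the working response
    z_i = eta_i + (y_i - mu_i)/mu_i with weights mu_i."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        z = [b0 + b1 * xi + (yi - mi) / mi for xi, yi, mi in zip(x, y, mu)]
        # closed-form 2x2 weighted least squares
        sw = sum(mu)
        swx = sum(m * xi for m, xi in zip(mu, x))
        swxx = sum(m * xi * xi for m, xi in zip(mu, x))
        swz = sum(m * zi for m, zi in zip(mu, z))
        swxz = sum(m * xi * zi for m, xi, zi in zip(mu, x, z))
        det = sw * swxx - swx * swx
        b0 = (swxx * swz - swx * swxz) / det
        b1 = (sw * swxz - swx * swz) / det
    return b0, b1
```

On data that the model fits exactly, e.g. y = (1, 2, 4) at x = (0, 1, 2), the iteration converges to b0 = 0 and b1 = log 2, since the score equations are satisfied at the perfect fit.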

••

TL;DR: In this paper, the authors consider modelling the dependence function with parametric models, for which two new models are presented; the estimation procedure and the flexibility of the new models are illustrated with an application to sea level data.

Abstract: SUMMARY Bivariate extreme value distributions arise as the limiting distributions of renormalized componentwise maxima. No natural parametric family exists for the dependence between the marginal distributions, but there are considerable restrictions on the dependence structure. We consider modelling the dependence function with parametric models, for which two new models are presented. Tests for independence, and discriminating between models, are also given. The estimation procedure, and the flexibility of the new models, are illustrated with an application to sea level data. Extreme value theory has recently been an area of much theoretical and practical work. Univariate theory is a well documented area, whereas bivariate/multivariate extreme value theory has, until recently, received surprisingly little attention. In the multivariate case, no natural parametric family exists for the dependence structure, so this must be modelled in some way. In the analysis of environmental extreme value data, there is a need for models of dependence between extremes from different sources: for example at various sea ports, or at various points of a river. In this paper we consider bivariate extreme value distributions. We assume, without loss of generality, that we have exponential marginal distributions with unit means. The class of bivariate exponential distributions in which we are interested satisfies a strong stability relation. Exponential variables (X, Y) satisfy the stability relation if and only if W = min (aX, bY) is also exponentially distributed for all a, b > 0 (Pickands, 1981). Therefore, the models we will consider have particular application in reliability and survival analysis. One approach to modelling the dependence structure is via parametric models. This requires a flexible family of models which satisfy certain constraints. Models are of two kinds: either differentiable, or nondifferentiable. 
All nondifferentiable models give distributions which are singular, with nonzero probability concentrated on a certain subspace. The differentiable models have densities, but the existing models are symmetric, which leads to the variables being exchangeable. Here, we present two new asymmetric differentiable models, which have increased flexibility. Properties of the differentiable models are examined. Estimation of the parametric models has previously been by ad hoc methods, because there is a nonregular estimation problem when the margins are independent. For the

575 citations
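
A concrete example of a parametric dependence function of the kind discussed above is the classical symmetric logistic family, the model that asymmetric extensions generalize; this is the standard textbook form, not the paper's new models, and the function name is mine:

```python
def logistic_A(w, alpha):
    """Pickands dependence function of the symmetric logistic model,
    A(w) = (w**(1/alpha) + (1-w)**(1/alpha))**alpha for 0 < alpha <= 1.
    alpha = 1 gives independence (A identically 1); alpha -> 0 gives
    complete dependence, A(w) = max(w, 1-w)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if w in (0.0, 1.0):
        return 1.0
    return (w ** (1 / alpha) + (1 - w) ** (1 / alpha)) ** alpha
```

Any valid dependence function must satisfy A(0) = A(1) = 1 and max(w, 1-w) <= A(w) <= 1; these are the "considerable restrictions on the dependence structure" the abstract refers to.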

••

TL;DR: In this paper, a modification of the usual logistic regression analysis yields consistent estimates of covariable adjusted relative risks and their standard errors and some efficiency may be gained over the usual single stage design, particularly when the exposure is rare and the relative risks associated with the covariables are large.

Abstract: SUMMARY Samples of diseased cases and nondiseased controls are drawn at random from the population at risk. After classification according to the exposure of interest, subsamples of cases and controls are selected for purposes of covariable ascertainment. A modification of the usual logistic regression analysis yields consistent estimates of covariable adjusted relative risks and their standard errors. By balancing the numbers of exposed and nonexposed for whom covariable information is ascertained within case and control samples, some efficiency may be gained over the usual single stage design, particularly when the exposure is rare and the relative risks associated with the covariables are large. The procedure may be useful also when covariable information is missing for a large part of the sample.

379 citations

••

TL;DR: Nonparametric methods for estimating and comparing the identifiable aspects of the induction distributions of several groups are developed for the induction period between infection with the AIDS virus and the onset of clinical AIDS.

Abstract: SUMMARY One source of data for the induction distribution of AIDS arises from persons infected by the AIDS virus from contaminated blood transfusions. Analyses of these data are complicated because the number of individuals infected by transfusion is unknown; information is available only for those who are infected and develop AIDS within a certain chronologic time interval. The statistical problem is one of making inferences about a stochastic process of infection and disease for which realizations are right truncated in chronologic time. By considering the process in reverse time, we transform the problem to one of analysing survival data that are left truncated in internal time. We develop nonparametric methods for estimating and comparing the identifiable aspects of the induction distributions of several groups. An important feature of the natural history and population dynamics of AIDS is the induction period between infection with the AIDS virus and the onset of clinical AIDS. This time is sometimes referred to as the latency period or incubation period. Follow-up studies of individuals at risk of being infected will provide direct observations from the induction distribution, but most of these studies will not be completed for several years. One alternative source of information arises from persons infected with the AIDS virus from a contaminated blood transfusion. Of persons infected in this way, only those who develop AIDS by a certain date can be identified. The total number who are infected is not known. A similar sampling distribution arises in the study of pediatric AIDS: children who contract AIDS as a result of being infected in utero or at birth can be identified, but the total number infected in this way is unknown. The statistical problem is one of making inferences about a stochastic process of infection and subsequent disease in which realizations are right truncated in chronologic time. 
By considering the process in reverse time, the problem can be transformed to one of survival data that are left truncated in internal time. We use this relationship to develop nonparametric methods for estimating and comparing the identifiable aspects of the induction distributions for

276 citations
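
The reverse-time device described above turns right-truncated data into left-truncated survival data, for which a product-limit estimator with a truncation-adjusted risk set is the natural nonparametric tool. The sketch below is that generic left-truncation estimator (for the simplest case of no censoring), not the authors' exact procedure; the function name is mine:

```python
def truncated_product_limit(entry, event):
    """Product-limit estimate of S(t) from left-truncated data: subject i
    enters the risk set at entry[i] and fails at event[i] (entry[i] <
    event[i], no censoring).  The risk set at t counts subjects with
    entry < t <= event, which is what truncation changes relative to the
    ordinary Kaplan-Meier estimator.  Returns (t, S(t)) at each distinct
    event time."""
    times = sorted(set(event))
    surv, s = [], 1.0
    for t in times:
        at_risk = sum(1 for e, f in zip(entry, event) if e < t <= f)
        d = sum(1 for f in event if f == t)
        s *= 1.0 - d / at_risk
        surv.append((t, s))
    return surv
```

With all entry times at zero this reduces to the ordinary empirical survival function; a late entry time removes that subject from earlier risk sets, which is exactly the correction truncation requires.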

••

TL;DR: In this article, several possible definitions of residuals are given for relative risk regression with time-varying covariates, each such residual has a representation as an estimator of a stochastic integral with respect to the martingale arising from a subject's failure time counting process.

Abstract: SUMMARY Several possible definitions of residuals are given for relative risk regression with time-varying covariates. Each such residual has a representation as an estimator of a stochastic integral with respect to the martingale arising from a subject's failure time counting process. Previously proposed residuals for individual study subjects and for specific time points are shown to be special cases of this definition, as are previously derived regression diagnostics. An illustration and various generalizations are also given. Suitable methods are required to detect various departures from modelling assumptions. Suitably defined residuals may play an important role in such identification. However, the nonparametric aspect of the model, the possibility that modelled regression variables may be varying with follow-up time and, most importantly, the usual presence of right censorship, imply that specialized residual definitions are required. The class of residuals considered here is most easily formulated using counting process notation for the failure time data. Let Ni(t), Yi(t) and Zi(t) represent, respectively, for the ith subject the values of counting, censoring and covariate processes at follow-up time t (i = 1, ..., n), while {Ni(u), Yi(u), Zi(u); 0 ≤ u < t} specifies the corresponding histories for the ith subject prior to time t. Thus in a typical univariate failure time application, Ni, with right-continuous sample paths, will take value zero prior to the time of failure on the ith subject and value one thereafter, while Yi, with left-continuous sample paths, will take value one at times at which the ith subject is 'at risk' for an observed failure, and value zero otherwise. The counting process Ni can be uniquely decomposed so that for all (t, i)

219 citations
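
The simplest member of this residual class is the martingale residual under a model with no covariates, observed event indicator minus estimated cumulative hazard at the subject's follow-up time; the sketch below computes it with the Nelson-Aalen estimator, assuming distinct times for simplicity, and the function name is mine:

```python
def martingale_residuals_null(time, status):
    """Martingale residuals r_i = status_i - H_hat(time_i) for the null model
    (no covariates), with H_hat the Nelson-Aalen cumulative hazard estimate;
    status[i] = 1 for a failure, 0 for censoring.  Assumes distinct times."""
    n = len(time)
    order = sorted(range(n), key=lambda i: time[i])
    resid = [0.0] * n
    cumhaz = 0.0
    for rank, i in enumerate(order):
        at_risk = n - rank                 # subjects still under observation
        cumhaz += status[i] / at_risk      # Nelson-Aalen increment
        resid[i] = status[i] - cumhaz
    return resid
```

The residuals sum to zero by construction (total observed events equal total estimated expected events), the counting-process analogue of ordinary least-squares residuals summing to zero.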

••

TL;DR: In this paper, a single unifying approach to bootstrap resampling, applicable to a very wide range of statistical problems, has been proposed, including bias reduction, shrinkage, hypothesis testing and confidence interval construction.

Abstract: SUMMARY We propose a single unifying approach to bootstrap resampling, applicable to a very wide range of statistical problems. It enables attention to be focused sharply on one or more characteristics which are of major importance in any particular problem, such as coverage error or length for confidence intervals, or bias for point estimation. Our approach leads easily and directly to a very general form of bootstrap iteration, unifying and generalizing present disparate accounts of this subject. It also provides simple solutions to relatively complex problems, such as a suggestion by Lehmann (1986) for 'conditionally' short confidence intervals. We set out a single unifying principle guiding the operation of bootstrap resampling, applicable to a very wide range of statistical problems including bias reduction, shrinkage, hypothesis testing and confidence interval construction. Our principle differs from other approaches in that it focuses attention directly on a measure of quality or accuracy, expressed in the form of an equation whose solution is sought. A very general form of bootstrap iteration is an immediate consequence of iterating the empirical solution to this equation so as to improve accuracy. When employed for bias reduction, iteration of the resampling principle yields a competitor to the generalized jackknife, enabling bias to be reduced to arbitrarily low levels. When applied to confidence intervals it produces the techniques of Hall (1986) and Beran (1987). The resampling principle leads easily to solutions of new, complex problems, such as empirical versions of confidence intervals proposed by Lehmann (1986). Lehmann argued that an 'ideal' confidence interval is one which is short when it covers the true parameter value but not necessarily otherwise. The resampling principle suggests a simple empirical means of constructing such intervals. Section 2 describes the general principle, and § 3 shows how it leads naturally to bootstrap iteration. There we show that in many problems of practical interest, such as bias reduction and coverage-error reduction in two-sided confidence intervals, each iteration reduces error by the factor n^{-1}, where n is sample size. In the case of confidence intervals our result sharpens one of Beran (1987), who showed that coverage error is reduced by the factor n^{-1/2} in two-sided intervals. The main exception to our n^{-1} rule is coverage error of one-sided intervals, where error is reduced by the factor n^{-1/2} at each iteration. Our approach to bootstrap iteration serves to unify not just the philosophy of iteration for different statistical problems, but also different techniques of iteration for the same
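
One level of the iteration applied to bias reduction can be sketched concretely: estimate the bias of an estimator by the bootstrap, subtract it, and (in the paper's framework) repeat with a nested bootstrap to drive the bias lower. The sketch below does a single level for the plug-in variance estimator, whose bias is known analytically, so the effect is checkable; names and the choice of estimator are mine:

```python
import random

def bootstrap_bias_corrected_var(x, B=2000, seed=0):
    """One level of bootstrap bias correction for the plug-in variance
    estimator theta(F_n) = mean((x - xbar)**2), which underestimates by the
    factor (n-1)/n.  The correction 2*theta_hat - mean(bootstrap replicates)
    removes the leading bias term; iterating the same step is the paper's
    general scheme."""
    rng = random.Random(seed)
    n = len(x)

    def plug_in_var(s):
        m = sum(s) / len(s)
        return sum((v - m) ** 2 for v in s) / len(s)

    theta = plug_in_var(x)
    boot = [plug_in_var([x[rng.randrange(n)] for _ in range(n)])
            for _ in range(B)]
    return 2 * theta - sum(boot) / B
```

For this estimator the exact bootstrap expectation is ((n-1)/n) * theta_hat, so the corrected value is close to ((n+1)/n) * theta_hat, with bias reduced from order 1/n to order 1/n^2, matching the n^{-1} reduction per iteration described in the abstract.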

••

TL;DR: In this paper, a two-stage procedure is proposed for identifying the best of K experimental treatments and determining whether it is superior to a standard control, in the binomial setting where patient response may be characterized as either success or failure, with θk the success probability for treatment Ek and k = 0 corresponding to the control C.

Abstract: SUMMARY A two-stage design which selects the best of several experimental treatments and compares it to a standard control is proposed. The design allows early termination with acceptance of the global null hypothesis. Optimal sample size and cut-off parameters are obtained by minimizing expected total sample size for fixed significance level and power. with many other factors. Moreover, the usual error rate computations associated with comparative testing do not account for the preliminary selection process. Consequently, the overall procedure may be neither effective nor efficient for identifying an experimental treatment which is an improvement over C. In this paper we propose a new approach to the problem of identifying the best of K experimental treatments and determining whether it is superior to a control. We deal with the binomial setting where patient response may be characterized as either success or failure, with θk the success probability for Ek; k = 0 corresponds to C. For ease of notation assume θ1 ≤ ... ≤ θK. We propose a two-stage procedure which allows early termination with acceptance of H0: θ0 = θ1 = ... = θK, with design parameters chosen to minimize expected total sample size. The design and accompanying generalized definitions of size α and power 1 − β are given in § 2. The algorithm used for optimization is described in § 3. Numerical results are presented in § 4, followed by a discussion of the relative merits of the proposed design.

••

TL;DR: In this paper, an outlier is defined as an observation with a large random error, generated by the linear model under consideration, and is detected by examining the posterior distribution of the random errors.

Abstract: SUMMARY An approach to detecting outliers in a linear model is developed. An outlier is defined to be an observation with a large random error, generated by the linear model under consideration. Outliers are detected by examining the posterior distribution of the random errors. An augmented residual plot is also suggested as a graphical aid in finding outliers. We propose a precise definition of an outlier in a linear model which appears to lead to simple ways of exploring data for the possibility of outliers. The definition is such that, if the parameters of the model are known, then it is also known which observations are outliers. Alternatively, if the parameters are unknown, the posterior distribution can be used to calculate the posterior probability that any observation is an outlier. In a linear model with normally distributed random errors εi, with mean zero and variance σ², we declare the ith observation to be an outlier if |εi| > kσ for some choice of k. The value of k can be chosen so that the prior probability of an outlier is small and thus outliers are observations which are more extreme than is usually expected. Realizations of normally distributed errors of more than about three standard deviations from the mean are certainly surprising, and worth further investigation. Such outlying observations can occur under the assumed model, however, and this should be taken into account when deciding what to do with outliers and in choosing k. Note that εi is the actual realization of the random error, not the usual estimated residual ε̂i. The problem of outliers is studied and thoroughly reviewed by Barnett & Lewis (1984), Hawkins (1980), Beckman & Cook (1983) and Pettit & Smith (1985). The usual Bayesian approach to outlier detection uses the definition given by Freeman (1980). Freeman defines an outlier to be 'any observation that has not been generated by the mechanism that generated the majority of observations in the data set'. 
Freeman's definition therefore requires that a model for the generation of outliers be specified and is implemented by, for example, Box & Tiao (1968), Guttman, Dutter & Freeman (1978) and Abraham & Box (1978). Our method differs in that we define outliers as arising from the model under consideration rather than arising from a separate, expanded, model. Our approach is similar to that described by Zellner & Moulton (1985) and is an extension of the philosophy

••

TL;DR: In the linear regression model with normal errors, the squared product-moment correlation provides the standard measure of dependence between the explanatory variable and the response variable; using the concept of information gain, an analogous measure can be defined for more general regression models used in survival analysis and can be conveniently estimated even when the response is subject to censoring, with a simple approximation available.

Abstract: In the linear regression model with normal errors the squared product-moment correlation provides the standard measure of dependence between the explanatory variable and the response variable. Using the concept of information gain, a measure of dependence can also be defined for more general regression models used in survival analysis, such as the Weibull regression model or Cox's proportional hazards model. Further, this measure of dependence can be conveniently estimated even when the response variable is subject to censoring, and a simple approximation is available.

••

TL;DR: In this paper, an expression for the likelihood for a state space model is derived with the Kalman filter initialized at a starting state estimate of zero and associated estimation error covariance matrix of zero.

Abstract: SUMMARY This paper derives an expression for the likelihood for a state space model. The expression can be evaluated with the Kalman filter initialized at a starting state estimate of zero and associated estimation error covariance matrix of zero. Adjustment for initial conditions can be made after filtering. Accordingly, initial conditions can be modelled without filtering implications. In particular initial conditions can be modelled as 'diffuse'. The connection between the 'diffuse' and concentrated likelihood is also displayed.
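
The likelihood the paper manipulates is the standard prediction-error decomposition computed by the Kalman filter; a minimal univariate example, for a local level model with a fixed (non-diffuse) starting prior, shows the ingredients involved. The diffuse-initialization adjustment that is the paper's contribution is not implemented here, and the function name is mine:

```python
import math

def kalman_loglik_local_level(y, var_eps, var_eta, a0, p0):
    """Gaussian log-likelihood of the local level model
        y_t = mu_t + eps_t,   mu_{t+1} = mu_t + eta_t,
    via the Kalman filter's prediction-error decomposition, starting from
    mu_1 ~ N(a0, p0).  A very large p0 mimics a diffuse start."""
    a, p = a0, p0
    ll = 0.0
    for yt in y:
        f = p + var_eps                  # prediction-error variance
        v = yt - a                       # one-step prediction error
        ll += -0.5 * (math.log(2 * math.pi * f) + v * v / f)
        k = p / f                        # Kalman gain
        a = a + k * v                    # filtered state mean
        p = p * (1 - k) + var_eta        # predicted variance for t+1
    return ll
```

With var_eta = 0 and p0 = 0 the state is the known constant a0, and the value reduces to a sum of ordinary normal log-densities, which gives a direct check on the recursion.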

••

TL;DR: Saddlepoint approximations are shown to be easy to use and accurate in a variety of simple bootstrap and randomization applications, such as mean estimation, ratio estimation, two-sample comparisons, and autoregressive estimation.

Abstract: SUMMARY Saddlepoint approximations are shown to be easy to use and accurate in a variety of simple bootstrap and randomization applications. Examples include mean estimation, ratio estimation, two-sample comparisons, and autoregressive estimation.
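
For the mean-estimation case, the saddlepoint approximation to the bootstrap distribution of the sample mean uses the cumulant generating function of the empirical distribution and needs no resampling at all; the sketch below is the standard density formula, with my own function name, not code from the paper:

```python
import math

def saddlepoint_density_mean(x, y):
    """Saddlepoint approximation to the density at y of the mean of n draws
    from the empirical distribution of x (the bootstrap distribution of the
    mean):  f(y) ~ sqrt(n/(2*pi*K''(t))) * exp(n*(K(t) - t*y)),
    where t solves K'(t) = y and K is the empirical cumulant generating
    function."""
    n = len(x)

    def K(t):
        return math.log(sum(math.exp(t * xi) for xi in x) / n)

    def K1(t):
        w = [math.exp(t * xi) for xi in x]
        return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)

    def K2(t):
        w = [math.exp(t * xi) for xi in x]
        sw = sum(w)
        m = sum(wi * xi for wi, xi in zip(w, x)) / sw
        return sum(wi * (xi - m) ** 2 for wi, xi in zip(w, x)) / sw

    lo, hi = -50.0, 50.0                 # K' is increasing: bisection works
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if K1(mid) < y:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    return math.sqrt(n / (2 * math.pi * K2(t))) * math.exp(n * (K(t) - t * y))
```

At y equal to the sample mean the saddlepoint is t = 0 and the formula collapses to the normal-approximation value sqrt(n/(2*pi*Var)), which provides a closed-form check.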

••

TL;DR: In this article, a test of the null hypothesis of no treatment effect in a randomized clinical trial that is based on the randomization distribution of residuals is proposed, where residuals result from regressing the response on covariates, but not treatment.

Abstract: SUMMARY We propose a test of the null hypothesis of no treatment effect in a randomized clinical trial that is based on the randomization distribution of residuals. These residuals result from regressing the response on covariates, but not treatment. In contrast to model-based score tests, this procedure maintains nominal size when the model is misspecified, and, in particular, when relevant covariates are omitted from the regression. The efficiency of the procedure is evaluated for regressions with some, but not all, required covariates. For many generalized linear models and survival models, conventional model-based score tests are shown to have supranominal size when relevant covariates are omitted, but logistic regression and the proportional hazards model are robust.
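
The test has a direct computational reading: regress the response on the covariates only, then refer a statistic built from the treated-group residuals to its randomization distribution over treatment assignments. The sketch below enumerates all assignments for small samples and uses simple linear regression with one covariate; statistic, names and details are my illustrative choices, not necessarily the authors':

```python
from itertools import combinations

def randomization_pvalue(y, x, treated):
    """Randomization test of no treatment effect: regress y on x alone by
    least squares, take the absolute sum of residuals over the treated
    indices as the statistic, and compare it with its full randomization
    distribution over all assignments of the same number of treated units
    (two-sided p-value)."""
    n = len(y)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    r = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    t_obs = abs(sum(r[i] for i in treated))
    k = len(treated)
    stats = [abs(sum(r[i] for i in comb))
             for comb in combinations(range(n), k)]
    return sum(1 for s in stats if s >= t_obs - 1e-12) / len(stats)
```

Because the reference distribution comes from the randomization itself rather than from the regression model, the size of the test does not depend on the covariate model being correct, which is the robustness property the abstract emphasizes.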

••

TL;DR: In this article, it is shown that the level error of the Bartlett-adjusted likelihood ratio statistic is actually of order n^{-2}, whereas the adjustment was previously known to reduce the level error from order n^{-1} to order n^{-3/2}.

Abstract: SUMMARY It is well known that Bartlett adjustment reduces level-error of the likelihood ratio statistic from order n^{-1} to order n^{-3/2}. In the present note we show that level-error of the adjusted statistic is actually order n^{-2}.

••

TL;DR: In this paper, a class of conditional logistic regression models for clustered binary data is considered, including the polychotomous logistic model of Rosner (1984) as a special case.

Abstract: SUMMARY A class of conditional logistic regression models for clustered binary data is considered. This includes the polychotomous logistic model of Rosner (1984) as a special case. Properties such as the joint distribution and pairwise odds ratio are investigated. A class of easily computed estimating functions is introduced which is shown to have high efficiency compared to the computationally intensive maximum likelihood approach. An example on chronic obstructive pulmonary disease among sibs is presented for illustration.
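
The pairwise odds ratio mentioned above is the natural dependence measure for two binary responses within a cluster; its empirical version from the 2x2 table of joint outcomes is one line, shown here for concreteness (the function name is mine, and the model-based version in the paper parameterizes this quantity rather than computing it from raw counts):

```python
def pairwise_odds_ratio(n11, n10, n01, n00):
    """Empirical pairwise odds ratio for two binary responses:
    psi = (n11 * n00) / (n10 * n01) from the 2x2 table of joint outcomes;
    psi = 1 corresponds to independence, psi > 1 to positive association."""
    return (n11 * n00) / (n10 * n01)
```

For sib-pair data of the kind in the paper's example, psi > 1 quantifies familial aggregation of disease.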

••

TL;DR: In this paper, a generalization of correspondence analysis to multivariate categorical data is proposed, where all two-way contingency tables of a set of categorical variables are simultaneously fitted by weighted least-squares.

Abstract: SUMMARY A generalization of correspondence analysis to multivariate categorical data is proposed, where all two-way contingency tables of a set of categorical variables are simultaneously fitted by weighted least-squares. An alternating least-squares algorithm is developed to perform the fitting. This technique has a number of advantages over the usual generalization known as multiple correspondence analysis. It is also an analogue of least-squares factor analysis for categorical data.

••

TL;DR: In this article, a wide class of estimators of the residual variance in nonparametric regression is considered, namely those that are quadratic in the data, unbiased for linear regression, and always nonnegative.

Abstract: SUMMARY A wide class of estimators of the residual variance in nonparametric regression is considered, namely those that are quadratic in the data, unbiased for linear regression, and always nonnegative. The minimax mean squared error estimator over a natural class of regression functions is derived. This optimal estimator has an interesting structure and is closely related to a minimax estimator of the regression curve itself.
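
A classical member of the class described above (quadratic in the data, unbiased under linear regression, nonnegative) is the second-difference estimator; it is not the paper's minimax estimator, but it makes the defining properties concrete. The function name is mine:

```python
def second_difference_var(y):
    """Difference-based residual-variance estimate for equally spaced data:
    sigma2_hat = sum_i (y[i-1] - 2*y[i] + y[i+1])**2 / (6*(n-2)).
    Quadratic in the data and nonnegative; unbiased whenever the regression
    function is linear, because second differences annihilate linear trends,
    and Var(eps[i-1] - 2*eps[i] + eps[i+1]) = 6*sigma**2 for iid errors."""
    n = len(y)
    d = [y[i - 1] - 2 * y[i] + y[i + 1] for i in range(1, n - 1)]
    return sum(di * di for di in d) / (6.0 * (n - 2))
```

On exactly linear data the estimate is zero, and on pure noise it is consistent for the error variance; the paper's contribution is the choice of quadratic form in this class that minimizes the maximum mean squared error.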

••

TL;DR: In this article, it is shown that the amplitude, frequency and phase of a discrete harmonic component of a time series can be estimated by least squares, but that unless the product of the amplitude and the sample size is quite large, the frequency estimate is much more variable than indicated by the asymptotic theory and the amplitude estimate is severely biased.

Abstract: SUMMARY This paper discusses a least-squares procedure and the use of the periodogram for isolating a discrete harmonic of a time series. It is shown that the usual asymptotics on estimation of frequency, amplitude and phase of such a harmonic have to be used with great caution from a moderate sample perspective. Computational issues are discussed and some illustrations are provided. Bolt & Brillinger (1979) make use of these asymptotic results. We consider a time series model of the form X_t = α cos(ωt + φ) + ε_t, where ε_t is a stationary noise sequence, and one is interested in estimating the amplitude, frequency and phase of the harmonic component. The asymptotic theory of the least-squares estimates of these parameters has a long history. Whittle (1951, 1953) obtained some of the earliest results. More recent results are by Hasan (1982), Hannan (1973) and Walker (1971), who formalize and extend Whittle's results. In these works it is shown that the asymptotic variance of the frequency estimate is of order n^{-3} and that the asymptotic variances of the other two components are of the more usual order n^{-1}. These results extend when there are several harmonic components. The rate for the estimate of ω seems almost unbelievably good, and our work was motivated by a desire to see how reliable the asymptotic theory is. In brief, we find that the product of the amplitude and the sample size, n, must be quite large in order for the asymptotic theory to be meaningful. If this product is not large, the frequency estimate is much more variable than indicated by the asymptotic theory and the amplitude estimate is severely biased. In applications in which the amplitude is small, giving rise to a small peak in the periodogram, these results suggest that naive application of the asymptotic theory to gauge resolution can be quite misleading. Section 2 of this paper is devoted to a review and examination of the asymptotic theory. 
We are also concerned with computational issues arising from the least-squares problem. This problem is nonlinear in the parameters, so that some sort of iterative search must be employed. Typically, search methods start from an initial guess and then proceed by a sequence of modified Newton-Raphson steps. For this nonlinear least-squares problem, it turns out that there are many local minima with a separation in frequency of about n^{-1}, which makes the stationary point to which the iterative scheme converges extremely sensitive to the starting values, and this problem gets worse as the sample size increases. Furthermore, it follows from the results of § 2 that the estimate of the amplitude is very biased unless the frequency is resolved to order o(n^{-1}), so that failure to converge to the global minimum may give a very poor estimate of amplitude. The problem becomes
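
The standard starting value for the iterative search described above is the frequency maximizing the periodogram over the Fourier frequencies; a direct (O(n^2), for clarity) implementation makes the idea concrete, with the function name my own:

```python
import math

def periodogram_peak(x):
    """Coarse frequency estimate: the Fourier frequency w_j = 2*pi*j/n
    maximizing the periodogram I(w) = |sum_t x_t exp(-i*w*t)|^2 / n
    over j = 1, ..., n//2."""
    n = len(x)
    best_j, best_I = 1, -1.0
    for j in range(1, n // 2 + 1):
        w = 2.0 * math.pi * j / n
        c = sum(xt * math.cos(w * t) for t, xt in enumerate(x))
        s = sum(xt * math.sin(w * t) for t, xt in enumerate(x))
        I = (c * c + s * s) / n
        if I > best_I:
            best_j, best_I = j, I
    return 2.0 * math.pi * best_j / n

# Noise-free cosine at an exact Fourier frequency: the peak recovers it.
n = 64
true_w = 2.0 * math.pi * 8 / n
x = [math.cos(true_w * t) for t in range(n)]
```

Because the Fourier grid has spacing 2*pi/n, this coarse estimate is only accurate to order n^{-1}, which is precisely the resolution issue the paper shows can leave the subsequent Newton-Raphson refinement trapped at a wrong local minimum.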

••

TL;DR: In this article, the ideas of Ansley and Kohn's proof are used to provide a new formula for the covariance between smoothed estimates at any two points in time.

Abstract: SUMMARY Ansley & Kohn (1982) presented a novel proof of the fixed interval smoothing algorithm for state space models. In this note the ideas of their proof are used to provide a new formula for the covariance between smoothed estimates at any two points in time. This equation is substantially simpler than existing special cases.

••

TL;DR: In this article, an extension of the growth curve model of Potthoff & Roy (1964) is presented for the analysis of longitudinal data from designed experiments, where the same number of observations at identical time points are made on all the experimental units.

Abstract: SUMMARY An extension of the growth curve model of Potthoff & Roy (1964) is presented. The extension arises naturally when parallel profiles are required, or profiles are concurrent at some known time point and may be of use for the analysis of longitudinal data from designed experiments. Linearization of a nonlinear model also results in this model. Under certain conditions, transformation of the model leads to seemingly unrelated multivariate regressions. In this case questions of identifiability and estimation are handled easily. An analysis of data on the growth of lambs is presented to illustrate the approach. In this paper we consider the analysis of repeated measurements on experimental units which may be grouped according to some experimental design. Usually successive measurements are taken over time, but this need not be the case. We restrict ourselves to the case where the same number of observations at identical time points are made on all the experimental units. This is likely to be the case in designed experiments, though not perhaps in observational studies. The approach given here extends that of Potthoff & Roy (1964), and we show below that the potential applications are quite widespread. We emphasize that in this paper we are attempting to present what we regard as a useful and practical extension to the classical growth curve model. It is likely to be useful if the aim of the analysis is the estimation of an explicit model for interpretative or predictive purposes. This is rather in contrast to the traditional approach to the analysis of such data where the aim is more directed towards testing for effects of treatments or covariates. If testing for effects is the primary purpose of analysis, then the simpler traditional methods, as given initially by Wishart (1938) and subsequently by many authors, such as Rowell & Walters (1976), may well suffice. 
In this approach selected linear functions of the response over time are calculated and treatment effects are assessed using the analysis of variance. Since this approach concentrates on individual linear functions seriatim and generally, but not always, ignores correlations between such linear functions, we would regard this as essentially a univariate technique. However if testing is the main concern, the loss of efficiency in ignoring the multivariate character of the data may not be severe. Our approach is more complex but the aim is more comprehensive. We aim to provide an explicit model for the mean response over time. If the model is then used to answer questions that could have been considered by separate tests, the answers will, of course, usually be essentially the same as those obtained by the simpler traditional approach.
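As a toy illustration of the "parallel profiles" special case described above (a common polynomial time trend with group-specific intercepts), the sketch below fits such a model by ordinary least squares. The function name, the OLS fit, and the data layout are illustrative assumptions; the paper itself works through generalized least squares and seemingly unrelated regressions rather than this simplified fit.

```python
import numpy as np

def fit_parallel_profiles(Y, groups, t, degree=2):
    """Fit y_ij = alpha_{g(i)} + beta_1 t_j + ... + beta_d t_j^d by OLS.

    Y: (n_units, n_times) responses, all units measured at the same times t;
    groups: length-n_units treatment labels (a toy stand-in for the design).
    """
    labels = sorted(set(groups))
    rows, ys = [], []
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            # one dummy per group (the group intercepts), then shared
            # polynomial terms in time with no separate constant column
            dummies = [1.0 if groups[i] == g else 0.0 for g in labels]
            rows.append(dummies + [t[j] ** k for k in range(1, degree + 1)])
            ys.append(Y[i, j])
    beta, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(ys), rcond=None)
    intercepts = dict(zip(labels, beta[:len(labels)]))
    return intercepts, beta[len(labels):]
```

With error-free simulated data the group intercepts and the shared polynomial coefficients are recovered exactly, which makes the "parallel profiles" constraint easy to see: groups differ only by a vertical shift.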

••

TL;DR: In this paper, the authors discuss several approaches to constructing confidence intervals for the parameter of interest in a group sequential trial after the trial has ended, and compare these methods to one already in the literature and point out interesting differences.

Abstract: SUMMARY With interest in designing group sequential clinical trials increasing, methodology for analysing the data arising in such trials is needed. We discuss several approaches to constructing confidence intervals for the parameter of interest in a group sequential trial after the trial has ended. We compare these methods to one already in the literature and point out interesting differences.

••

TL;DR: In this paper, the authors investigated the use of a variance stabilizing transformation for the computation of a bootstrap t confidence interval, which is estimated in an automatic manner through an initial bootstrap step.

Abstract: SUMMARY We investigate the use of a variance stabilizing transformation for the computation of a bootstrap t confidence interval. The transformation is estimated in an 'automatic' manner through an initial bootstrap step. A bootstrap t interval is then computed for the variance stabilized parameter and the interval is mapped back to the original scale. The resultant procedure is second-order correct in some settings, is invariant, and in a number of examples performs better than the usual untransformed bootstrap t interval. It also requires far less computation. The new interval is compared with Efron's BCa procedure and the two methods are seen to produce similar results.
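The procedure described can be sketched roughly as follows, with the sample mean as the parameter of interest. The bootstrap sizes, the trapezoidal approximation to the variance-stabilizing integral g(t) = ∫ s(u)^(-1/2) du, and the interpolation-based inverse mapping are all illustrative choices, not the paper's exact implementation.

```python
import numpy as np

def vst_bootstrap_t(x, B1=100, B2=25, B3=999, alpha=0.05, seed=0):
    """Sketch of a variance-stabilized bootstrap-t interval for the mean."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    theta_hat = x.mean()

    # Step 1: initial bootstrap pairs (theta*, estimated variance of theta*),
    # the "automatic" step that learns how variance changes with the parameter.
    thetas, variances = [], []
    for _ in range(B1):
        xb = rng.choice(x, n, replace=True)
        inner = [rng.choice(xb, n, replace=True).mean() for _ in range(B2)]
        thetas.append(xb.mean())
        variances.append(max(np.var(inner, ddof=1), 1e-12))

    order = np.argsort(thetas)
    t_grid = np.asarray(thetas)[order]
    s_grid = np.asarray(variances)[order]

    # Step 2: variance-stabilizing transform g(t) = integral of s(u)^(-1/2),
    # approximated by the trapezoid rule over the bootstrap grid.
    f = 1.0 / np.sqrt(s_grid)
    g_grid = np.concatenate(([0.0],
                             np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(t_grid))))
    g = lambda t: np.interp(t, t_grid, g_grid)
    g_inv = lambda y: np.interp(y, g_grid, t_grid)

    # Step 3: bootstrap-t on the stabilized scale, where the variance is
    # roughly 1 so no studentizing denominator is needed, then map back.
    z = np.array([g(rng.choice(x, n, replace=True).mean()) - g(theta_hat)
                  for _ in range(B3)])
    lo = g_inv(g(theta_hat) - np.quantile(z, 1.0 - alpha / 2))
    hi = g_inv(g(theta_hat) - np.quantile(z, alpha / 2))
    return float(lo), float(hi)
```

The computational saving claimed in the abstract is visible in Step 3: the nested (second-level) bootstrap is needed only once, in the initial step, rather than inside every bootstrap-t resample.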

••

TL;DR: In this article, the authors use the concepts of stochastic complexity, description length, and model selection to develop data-based methods for choosing smoothing parameters in nonparametric density estimation.

Abstract: SUMMARY We use the concepts of stochastic complexity, description length, and model selection to develop data-based methods for choosing smoothing parameters in nonparametric density estimation. In the case of histogram estimators, we derive a simple, exact formula for stochastic complexity when the prior distribution of cell probabilities is uniform over the class of all possible choices. The formula depends only on the data and the smoothing parameter, which is readily chosen according to the criterion of minimum stochastic complexity. Approaches based on stochastic complexity and description length are shown to be asymptotically equivalent in certain circumstances. They produce a degree of smoothing which is almost optimal from the viewpoint of minimizing L∞, or supremum, distance, but which smooths a little more than is optimal in the sense of minimizing Lr distance for any finite value of r.
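A minimal sketch of the minimum-stochastic-complexity criterion for equal-width histograms follows. It takes the exact formula to be the negative log of the standard Dirichlet(1, ..., 1) marginal likelihood of the cell counts, (m-1)! Π n_j! / (n+m-1)!, plus a within-cell uniform density term; this is an assumption about the form referred to in the abstract, and the function names and candidate grid are illustrative.

```python
import math
import numpy as np

def stochastic_complexity(x, m):
    """Stochastic complexity (in nats) of an equal-width m-cell histogram,
    assuming a uniform Dirichlet(1, ..., 1) prior on cell probabilities."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    lo, hi = x.min(), x.max()
    counts, _ = np.histogram(x, bins=m, range=(lo, hi))
    h = (hi - lo) / m  # common cell width
    # -log[(m-1)! * prod_j n_j! / (n+m-1)!], via log-gamma for stability
    neg_log_marginal = (math.lgamma(n + m) - math.lgamma(m)
                        - sum(math.lgamma(c + 1) for c in counts))
    # each observation also contributes a within-cell uniform density 1/h
    return neg_log_marginal + n * math.log(h)

def best_bin_count(x, candidates=range(2, 50)):
    """Choose the cell count minimizing stochastic complexity."""
    return min(candidates, key=lambda m: stochastic_complexity(x, m))
```

As the abstract notes, the criterion depends only on the data and the smoothing parameter m, so the search above is a direct one-dimensional minimization.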

••

TL;DR: In this paper, an efficient algorithm is provided to construct exact permutation tests for testing the equality of the two treatments with the randomized play-the-winner rule, which is applicable when patients have delayed responses to treatments.

Abstract: In comparing two treatments in a clinical trial, the randomized play-the-winner rule tends to assign more study subjects to the better treatment. It is applicable when patients have delayed responses to treatments. It is not deterministic and is less vulnerable to experimental bias than other adaptive designs. In this paper, an efficient algorithm is provided to construct exact permutation tests for testing the equality of the two treatments with the randomized play-the-winner rule. The test procedure is illustrated with a real-life example. The example shows that the design used in the trial should not be ignored in the analysis.
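For concreteness, the randomized play-the-winner allocation itself can be sketched as an urn scheme RPW(u, β): start with u balls of each colour, draw a ball (with replacement) to choose the arm, then add β balls of the drawn colour after a success or of the opposite colour after a failure. The sketch below assumes immediate binary responses; the delayed-response bookkeeping and the exact permutation test constructed in the paper are omitted.

```python
import random

def rpw_trial(p_success, n_patients, u=1, beta=1, seed=0):
    """Simulate RPW(u, beta) allocation for two arms "A" and "B".

    p_success: dict of true success probabilities per arm (simulation only).
    Returns the assignment sequence and the observed binary responses.
    """
    rng = random.Random(seed)
    urn = {"A": u, "B": u}
    assignments, outcomes = [], []
    for _ in range(n_patients):
        # draw a ball: the chance of arm A is its share of the urn
        arm = "A" if rng.random() < urn["A"] / (urn["A"] + urn["B"]) else "B"
        ok = rng.random() < p_success[arm]
        other = "B" if arm == "A" else "A"
        # success rewards the drawn arm; failure rewards the other arm
        urn[arm if ok else other] += beta
        assignments.append(arm)
        outcomes.append(ok)
    return assignments, outcomes
```

Because the urn composition adapts to observed responses, the better arm accumulates balls and, on average, receives more patients, which is exactly why the design cannot be ignored in the analysis.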

••

TL;DR: In this paper, the authors compare three standard methods of estimating the parameter theta due to Rodbard (1978), Raab (1981a), and Carroll and Ruppert (1982b).

Abstract: Assay data are often fit by a nonlinear regression model incorporating heterogeneity of variance, as in radioimmunoassay, for example. Typically, the standard deviation of the response is taken to be proportional to a power theta of the mean. There is considerable empirical evidence suggesting that for assays of a reasonable size, how one estimates the parameter theta does not greatly affect how well one estimates the mean regression function. An additional component of assay analysis is the estimation of auxiliary constructs such as the minimum detectable concentration, for which many definitions exist; we focus on one such definition. The minimum detectable concentration depends both on theta and on the mean regression function. We compare three standard methods of estimating the parameter theta due to Rodbard (1978), Raab (1981a) and Carroll & Ruppert (1982b). When duplicate counts are taken at each concentration, the first method is only 20% efficient asymptotically in comparison to the third, and the resulting estimate of the minimum detectable concentration is asymptotically 3.3 times more variable for the first method than for the third. Less dramatic results obtain for the second estimator compared to the third; this estimator is still not fully efficient, however. Simulation results and an example are supportive of the asymptotic theory. Keywords: Least squares method.
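As a toy version of the variance model sd = σ · mean^θ, the simplest estimator of θ (in the spirit of, though not identical to, Rodbard's replicate-based method) regresses log standard deviations of replicate counts on log means across concentrations. The function name and data below are illustrative, not the paper's:

```python
import numpy as np

def estimate_theta_logsd(means, sds):
    """Estimate theta in sd = sigma * mean**theta by regressing log(sd)
    on log(mean) across concentrations: a toy log-linear estimator."""
    means = np.asarray(means, dtype=float)
    sds = np.asarray(sds, dtype=float)
    X = np.column_stack([np.ones_like(means), np.log(means)])
    coef, *_ = np.linalg.lstsq(X, np.log(sds), rcond=None)
    return float(coef[1])  # slope is theta; the intercept is log(sigma)
```

The efficiency comparisons in the abstract concern precisely this kind of choice: a crude log-regression estimator of θ can be far less efficient than a likelihood-type estimator, and that inefficiency propagates into the estimated minimum detectable concentration.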