
Showing papers in "Biometrika in 1980"


Journal ArticleDOI
TL;DR: In this article, a class of omnibus chi-squared goodness-of-fit tests is presented for the model, relating failure time to covariate values, proposed by Cox (1972), which are based on the expected and observed frequency that a data point, representing a failure with associated covariates, falls into one of L mutually exclusive categories.
Abstract: SUMMARY A class of omnibus chi-squared goodness-of-fit tests is presented for the model, relating failure time to covariate values, proposed by Cox (1972). These tests are based on the expected and observed frequency that a data point, representing a failure with associated covariates, falls into one of L mutually exclusive categories. Different partitions of the space of covariate values and failure time will yield different tests. Certain partitions are suggested for the goodness-of-fit problems commonly encountered in clinical trials.

454 citations


Journal ArticleDOI
TL;DR: In this article, the logistic transformation applied to a d-dimensional normal distribution produces a distribution over the d-dimensional simplex which can sensibly be termed a logistic-normal distribution.
Abstract: SUMMARY The logistic transformation applied to a d-dimensional normal distribution produces a distribution over the d-dimensional simplex which can sensibly be termed a logistic-normal distribution. Such distributions, implicitly used in a number of recent applications, are here given a formal identity and some useful properties are recorded. A main aim is to extend the area of application from the restricted role as a substitute for the Dirichlet conjugate prior class in the analysis of multinomial and contingency table data to the direct statistical description and analysis of compositional and probabilistic data.

435 citations
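
As an illustration of the transformation described in this abstract, the following is a minimal sketch, assuming the additive form of the logistic transformation, in which a point y in d-dimensional real space is mapped to d + 1 positive parts that sum to one. The function names, the use of independent normal components, and the example means and standard deviations are assumptions made for the sketch and are not taken from the paper.

```python
import math
import random


def logistic_transform(y):
    """Map a point y in R^d to d + 1 positive parts that sum to one."""
    # Shift by the largest exponent (the last part has exponent 0) for stability.
    m = max(0.0, max(y))
    exps = [math.exp(v - m) for v in y]
    last = math.exp(-m)
    total = sum(exps) + last
    return [e / total for e in exps] + [last / total]


def sample_logistic_normal(mean, sd, rng):
    """Draw one composition: independent normals (an assumption), then transform."""
    y = [rng.gauss(mu, s) for mu, s in zip(mean, sd)]
    return logistic_transform(y)


rng = random.Random(0)
composition = sample_logistic_normal([0.0, 1.0], [1.0, 0.5], rng)
print(composition, sum(composition))
```

A correlated normal vector could be substituted for the independent draws without changing the transformation step.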


Journal ArticleDOI
TL;DR: In this article, the authors provide large-sample simultaneous confidence bands for $S^0$, centred at $\hat{S}_N$, using the weak convergence of $N^{1/2}\{\hat{S}_N(t) - S^0(t)\}$, on a finite interval, to a Gaussian process, a theorem of Breslow & Crowley (1974), and transform both the time and space axes of the limiting process to achieve a Brownian bridge limit.
Abstract: For arbitrarily right-censored data, the Kaplan-Meier product-limit estimator $\hat{S}_N$ provides a nonparametric estimate of the survival function $S^0 = 1 - F^0$. We provide large-sample simultaneous confidence bands for $S^0$, centred at $\hat{S}_N$. The derivation uses the weak convergence of $N^{1/2}\{\hat{S}_N(t) - S^0(t)\}$, on a finite interval, to a Gaussian process, a theorem of Breslow & Crowley (1974), and transforms both the time and space axes of the limiting process to achieve a Brownian bridge limit. Parameters in the transformation are replaced by uniformly consistent estimates to form the bands. The new bands reduce to the well-known Kolmogorov bands in the absence of censoring. Comparisons are made with recent bands of Gillespie & Fisher (1979) and V. N. Nair. The bands are illustrated by imposing some different kinds of random censorship on a set of uncensored data.

278 citations
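
The bands in this abstract are centred at the Kaplan-Meier product-limit estimate. The sketch below computes only that estimate, not the confidence bands themselves, for right-censored data. The function name, the input layout (parallel lists of times and 0/1 event indicators) and the toy data are assumptions for illustration.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  observed times (failure or censoring)
    events: 1 if the time is an observed failure, 0 if right-censored
    Returns a list of (failure time, estimated survival just after it).
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        removed = 0
        # Group all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            failures += data[i][1]
            removed += 1
            i += 1
        if failures > 0:
            survival *= 1.0 - failures / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
print(curve)
```

With no censoring the estimate reduces to the empirical survival function, which matches the abstract's remark that the bands then reduce to the Kolmogorov bands.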



Journal ArticleDOI
TL;DR: In this paper, the authors suggest some more precise methods of testing simple hypotheses about the reduced major axis than have hitherto been available, and some of their properties derived from a computer implementation of k statistics.
Abstract: SUMMARY In situations such as allometry where a line is to be fitted to a bivariate sample but where an asymmetric choice of one or other variable as regressor cannot be made, the reduced major axis is often used. Existing tests of the slope of this line, particularly between samples, are not sufficiently accurate in view of the scarcity of the material to which such methods are often applied. Alternative test statistics are suggested and some of their properties derived from a computer implementation of k statistics. One often wishes to describe the relationship between two observed random variables without, in the usual regression terminology, having to specify one as dependent on the other. A typical case, in fact the one which led to this paper, is in allometry where the variables are anatomical measurements, the relationship between which determines shape and may be used as a basis for comparison between species. After suitable transformations, usually logarithmic, have been applied, some measure of the slope of the bivariate scatter plot is required that treats both variables symmetrically. Unless there are sufficient grounds for specifying an underlying model with estimable parameters a possible choice is the line whose sum of squared perpendicular distances from the sample points is a minimum, and it is well known that this is given by the eigenvector corresponding to the larger eigenvalue of the sample dispersion matrix, the smaller eigenvalue in this two variable case being the minimized sum of squares. For the bivariate normal distribution this line is the major axis of the ellipses of constant probability, and so has come to be called the major axis of the bivariate sample. 
Although invariant under rotation the major axis is altered in a complicated way by changes of scale and in practice preference in the specialist literature on allometry has been given to the line obtained by normalizing the variables to unit standard deviations, finding the major axis, and transforming back to the original scales of measurement. This has come to be called the reduced major axis. The purpose of this paper is to suggest some more precise methods of testing simple hypotheses about the reduced major axis than have hitherto been available.

257 citations
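
The two lines described in this abstract can be sketched as follows: the major axis slope comes from the eigenvector of the larger eigenvalue of the 2 x 2 sample dispersion matrix, and the reduced major axis slope follows from standardizing, taking the major axis, and transforming back, which gives the ratio of standard deviations with the sign of the covariance. The function names and the toy data are assumptions; the tests proposed in the paper are not implemented here.

```python
import math


def dispersion(x, y):
    """Sample variances and covariance (divisor n - 1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return sxx, syy, sxy


def major_axis_slope(x, y):
    """Slope of the eigenvector belonging to the larger eigenvalue."""
    sxx, syy, sxy = dispersion(x, y)
    if sxy == 0:
        raise ValueError("slope is undefined when the covariance is zero")
    larger = 0.5 * ((sxx + syy) + math.sqrt((sxx - syy) ** 2 + 4 * sxy ** 2))
    return (larger - sxx) / sxy


def reduced_major_axis_slope(x, y):
    """Ratio of standard deviations, signed by the covariance."""
    sxx, syy, sxy = dispersion(x, y)
    if sxy == 0:
        raise ValueError("slope is undefined when the covariance is zero")
    return math.copysign(math.sqrt(syy / sxx), sxy)


x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
print(major_axis_slope(x, y), reduced_major_axis_slope(x, y))
```

Rescaling x by a constant rescales the reduced major axis slope by the reciprocal of that constant, whereas the major axis responds to changes of scale in the more complicated way the abstract mentions.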


Journal ArticleDOI
TL;DR: In this article, the choice of values for the multiplier is explored, with the value one corresponding to Akaike's information criterion, and a simple example is simulated to investigate the taxonomy of optimum values for prediction purposes.
Abstract: SUMMARY One way of selecting models is to choose that model for which the maximized log likelihood minus a multiple of the number of parameters estimated is a maximum. This note explores the choice of values for the multiplier, with the value one corresponding to Akaike's information criterion. The relationship with Bayesian procedures is mentioned. Suppose that there are a number of competing models which may be fitted to some data. If the log likelihood of the ith model maximized over $q_i$ parameters is $L_i$, the generalized information criterion is to choose the model for which $L_i - \tfrac{1}{2}\alpha q_i$ is a maximum. In the criterion suggested by Akaike $\alpha$ equals 2. Values of $\alpha$ in the range 1-4 are considered by Bhansali & Downham (1977) for the choice of a time series model. As have many other authors, including Geisser & Eddy (1979) and McClave (1978), Bhansali & Downham assessed their criteria by the frequency of choice of the correct model. One purpose of the present note is to suggest that this may not be an appropriate basis for choice: the objectives of the analysis need more explicit formulation. A similar point is made by Akaike (1979) who compares values of $\alpha$ on the basis of squared prediction error. His simulation is, however, uninformative about the conditions under which various values of $\alpha$ are optimum. In this note a simple example is simulated to investigate the taxonomy of optimum $\alpha$ values for prediction purposes. For simplicity of structure and ease of simulation we work with linear regression models, for which the information criterion reduces to a generalized $C_p$ statistic. The behaviour of this statistic as a function of $\alpha$ is investigated in the next section. Significance testing and Bayesian alternatives are discussed in § 3. Section 4 is concerned with asymptotics. The note closes with some general comments which allude to ridge regression and simulation.

251 citations
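
A minimal sketch of the selection rule in this abstract, reading the criterion as the maximized log likelihood minus one half of a multiplier times the number of parameters, so that a multiplier of 2 gives the "multiple of one" attributed to Akaike. The function names and the three candidate models with their log likelihoods are hypothetical values chosen only to show how the choice moves with the multiplier.

```python
def generalized_information_criterion(log_lik, n_params, alpha=2.0):
    """Maximized log likelihood penalized by alpha / 2 per parameter."""
    return log_lik - 0.5 * alpha * n_params


def select_model(candidates, alpha=2.0):
    """candidates maps a model name to (maximized log likelihood, n_params)."""
    return max(
        candidates,
        key=lambda name: generalized_information_criterion(
            *candidates[name], alpha=alpha
        ),
    )


# Hypothetical fits: larger models fit slightly better but use more parameters.
candidates = {
    "small": (-105.0, 2),
    "medium": (-102.5, 4),
    "large": (-101.8, 7),
}
for alpha in (0.2, 2.0, 4.0):
    print(alpha, select_model(candidates, alpha=alpha))
```

Small multipliers favour the largest model and large multipliers the smallest, which is the trade-off the note examines for prediction purposes.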


Journal ArticleDOI
TL;DR: In this article, a multivariate generalization of a one-sided test is proposed, in which the null hypothesis is that the mean vector μ lies on the boundary of a convex polyhedral cone determined by linear inequalities.
Abstract: SUMMARY In this paper we propose a new multivariate generalization of a one-sided test in a way different from that of Kudo (1963). Let X be a p-variate normal random variable with mean vector μ and a known covariance matrix. We consider the null hypothesis that μ lies on the boundary of a convex polyhedral cone determined by linear inequalities; the alternative is that μ lies in its interior. A two-sided version is also discussed. This paper provides likelihood ratio tests and some applications along with some discussion of the geometry of convex polyhedral cones.

230 citations


Journal ArticleDOI
TL;DR: It is shown how homogeneous autoregressive-moving average models may be mistakenly specified for series in which periodic properties are present.
Abstract: SUMMARY Some properties of a class of periodic models for characterizing seasonal time series are explored. The relationships between periodic models and multiple autoregressive-moving average models are developed and used to gain insight into the behaviour of periodic models. In particular it is shown how homogeneous autoregressive-moving average models may be mistakenly specified for series in which periodic properties are present. Consequences of such misspecification on forecasting and diagnostic checking are also derived.

214 citations


Journal ArticleDOI
TL;DR: The Dirichlet-multinomial distribution is used in this paper for contingency tables generated by cluster sampling schemes, and it is shown that the asymptotic distributions of the Pearson and likelihood ratio chi-squared statistics are multiples of chi-square random variables and this result permits a simple modification of the usual procedures when faced with a cluster sample.
Abstract: SUMMARY The Dirichlet-multinomial distribution is formulated as a model for contingency tables generated by cluster sampling schemes. This model provides an alternative justification for some results of Altham (1976) and enables an extension of her results to the case of unequal cluster sizes. The model also provides a convenient framework for the fitting and testing of general log linear models in contingency tables. The asymptotic distributions of the Pearson and likelihood ratio chi-squared statistics are shown to be multiples of chi-squared random variables and this result permits a simple modification of the usual procedures when faced with a cluster sample.

183 citations


Journal ArticleDOI
TL;DR: In this paper, a simple cumulative sum type statistic for the change point with zero-one observations is introduced, and a conditional test of no change against change is introduced and compared with a likelihood ratio test.
Abstract: SUMMARY A simple cumulative sum type statistic for the change-point with zero-one observations is introduced. A conditional test of no change against change is introduced and compared with a likelihood ratio test. The estimation of the change-point is also considered, using the simple statistic, and the method is shown to be asymptotically equivalent to the maximum likelihood estimator in certain circumstances and almost equivalent in others. To investigate the small sample behaviour, simulation experiments were carried out and these showed the new estimator to be generally superior to the maximum likelihood estimator.

170 citations
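
The abstract does not give the form of the cumulative sum statistic. As a sketch under that caveat, one common cumulative sum form for zero-one data compares the running count of ones with its expected share of the total and estimates the change-point where the absolute discrepancy is largest. The function name, the scaling of the statistic and the toy sequence are assumptions, not the paper's definitions.

```python
def cusum_change_point(x):
    """Estimate the change-point of a zero-one sequence.

    Returns (t, |U_t|) where t is the number of observations before the
    estimated change and U_t = n * (sum of first t values) - t * (total).
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    total = sum(x)
    best_t, best_abs = 1, -1
    running = 0
    for t in range(1, n):
        running += x[t - 1]
        u = n * running - t * total
        if abs(u) > best_abs:  # strict inequality keeps the earliest maximum
            best_t, best_abs = t, abs(u)
    return best_t, best_abs


sequence = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
print(cusum_change_point(sequence))
```

For the toy sequence the statistic peaks after the sixth observation, where the run of zeros ends.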


Journal ArticleDOI
TL;DR: In this paper, three possible modifications of the group sequential method are considered for sequential monitoring of clinical trials testing a one-sided hypothesis, where continuing a trial to determine whether the new therapy is inferior to the standard may be inappropriate or even unethical.
Abstract: SUMMARY Pocock (1977) has recently proposed group sequential methods for sequential analysis of clinical trials. However, these methods are intended to test a null hypothesis against a two-sided alternative, and many trials of a new therapy against a standard have one-sided alternatives. Continuation of the trial to determine whether the new therapy is actually inferior to the standard therapy may be inappropriate or even unethical. In the present paper, three possible modifications of the group sequential method for sequential monitoring of clinical trials testing a one-sided hypothesis are considered.

Journal ArticleDOI
TL;DR: In this article, the authors investigate a model with assumptions similar to the so-called general epidemic model, except that the latent and infectious periods are independently gamma distributed, and obtain a number of results concerning the progress of the epidemic.
Abstract: SUMMARY In this paper we investigate a model with assumptions as for the so-called general epidemic model, except that the latent and infectious periods are independently gamma distributed. For this model, we obtain a number of results concerning the progress of the epidemic. These results allow some assessment of the effects of the latent period and the infectious period on the behaviour of the epidemic model. Further we note that many of the results can be generalized. k large, and the degenerate distribution at /-1, as k → ∞. The latent period has density f10s and the infectious period density fm,. The model $G_{0,1}$ is in fact the standard epidemic model. For the model $G_{l,m}$ we obtain results concerning the behaviour of the path of the process by means of the deterministic approximation and a martingale central limit theorem result. Equations for the distribution of the size of the outbreak are specified and approximations are derived for the probability of a minor outbreak, the distribution of the size of a minor outbreak and the distribution of the size of a major outbreak. Further, we obtain an approximation for the initial rate of increase of the number of infected individuals and hence the mean time for the epidemic to reach its peak. These results give a reasonably detailed description of the behaviour of the model $G_{l,m}$, allowing an assessment of the effects of the parameters l and m, and also enabling a comparison of $G_{l,m}$ with the standard epidemic model $G_{0,1}$ to be made.

Journal ArticleDOI
TL;DR: In this article, Huber's M-estimates are adapted to robust tests of general linear hypotheses in linear models, termed likelihood ratio type tests, whose robustness and efficiency properties are the same as those of the M-estimates on which they are based.
Abstract: SUMMARY Robust tests of general linear hypotheses in linear models are developed. These are likelihood ratio type tests in the same sense that M-estimates are maximum likelihood type estimates. Construction of the tests suggests a decomposition of the data into terms analogous to classical sums of squares, providing a robust analysis of variance. Asymptotic efficiency and robustness properties of the tests are the same as those of the M-estimates upon which they are based. Parameter estimation is usually only a first step in the analysis of data arising from a linear model. A classical least squares analysis often focuses upon the analysis of variance, which tests simultaneous hypotheses on large subsets of the parameters. Since the terms in a classical analysis of variance are quadratic forms in least squares estimates, one would expect that the sensitivity of the estimates to departures from normality should be inherited by the tests. In fact, for moderate to heavy tailed error distributions or in the presence of outliers, it appears that the classical F test does lose power. Calculations of relative efficiency for procedures proposed in this paper substantiate on theoretical grounds the possible inefficiency and lack of power of classical F tests. In this paper, Huber's M-estimates are adapted to hypothesis tests which can be termed likelihood ratio type tests. These procedures naturally generalize and bear a striking resemblance to classical F tests. Robustness and efficiency properties of M-estimates apply directly to the proposed tests. Hence the case to be made for using likelihood ratio type tests rather than classical F tests is the same as that for using M-estimates in favour of least squares estimates: possible poor performance of the classical methods may be overcome with methods which perform well both when classical assumptions are met and when they are not. 
The proposed methods are natural, intuitive and as easily computed as M-estimates.

Journal ArticleDOI
TL;DR: In this paper, a goodness-of-fit test for the logistic regression model is proposed which is asymptotically chi-squared and is computed as a quadratic form of observed counts minus the expected counts.
Abstract: SUMMARY A goodness-of-fit test for the logistic regression model is proposed which is asymptotically chi-squared and is computed as a quadratic form of observed counts minus the expected counts.

Journal ArticleDOI
TL;DR: In this article, an asymptotic theory for canonical correlation analysis is given for multivariate populations with finite fourth moments, and a modified test statistic with a chi-squared approximation can be used for testing the hypothesis that some of the population coefficients are zero.
Abstract: SUMMARY An asymptotic theory for canonical correlation analysis is given for multivariate populations with finite fourth moments. The asymptotic distributions of the sample canonical correlation coefficients and of statistics used for testing hypotheses about the population coefficients involve the fourth order cumulants of the parent population and are sensitive to departures from normality. These asymptotic distributions have surprisingly simple forms in the case of elliptical populations; here a modified test statistic with a chi-squared approximation can be used for testing the hypothesis that some of the population coefficients are zero. Finally we note that, when sampling from elliptical populations, the asymptotic distributions of test statistics used in some other multivariate procedures are similarly simple.

Journal ArticleDOI
TL;DR: In this paper, a correlation coefficient $\rho^2$ is proposed for bivariate angular distributions and for bivariate distributions on general manifolds, and its properties are examined and compared with those of other bidirectional correlation coefficients.
Abstract: SUMMARY A correlation coefficient $\rho^2$ is proposed for bivariate angular distributions and for bivariate distributions on general manifolds. In the cylindrical case $\rho^2$ is the coefficient of Mardia (1976), and for the bivariate angular case it is a modified version of the correlation coefficient of Mardia & Puri (1978). Some properties of $\rho^2$ are examined and compared with those of other bidirectional correlation coefficients. In particular, this coefficient is found to be closely connected with important exponential families of distributions. Further, the asymptotic distribution of the sample version of $\rho^2$ under the hypothesis of independence does not depend on the marginal distributions. Thus it is asymptotically robust against concentration in the bivariate angular case. The regression models arising from complete dependence as measured by $\rho^2$ are examined. A numerical example is given.

Journal ArticleDOI
TL;DR: In this paper, π-inverse weights and best linear unbiased weights are compared for weighting of observations drawn by unequal probability sampling methods, and the conclusion is that the two schemes are equally efficient as far as first-order efficiency goes.
Abstract: SUMMARY This paper deals with two schemes, π-inverse weights and best linear unbiased weights, for weighting of observations drawn by unequal probability sampling methods. The context is that of constructing an asymptotically design-unbiased estimate of the mean of a finite population. Estimation of regression coefficients is needed as a preliminary step, and it is here that the question of weighting enters. The conclusion is that the two schemes are equally efficient as far as first-order efficiency goes. The paper takes a step towards the structuring of model-based inference within the principles of probability sampling.

Journal ArticleDOI
TL;DR: In this article, it is shown that results on the distribution of the residual sum of squares, which has asymptotically a $\chi^2$ distribution when the degrees of freedom tend to infinity, can be considerably simplified for the general linear model by using the technique due to Welch (1951), and the results are extended to multivariate models and variance component models.
Abstract: If in a regression problem the variances are not equal it is common to use the reciprocal estimated variances as weights. The residual sum of squares Q has asymptotically a $\chi^2$ distribution when the degrees of freedom tend to infinity. Welch (1947, 1951) gave an approximation to the distribution of Q in the special case of the comparison of n means, by using a suitably chosen F distribution. James (1951, 1954) gave an improved approximation using the fractiles of a $\chi^2$ distribution and extended the results to the general linear model. We shall show here how the results for the general linear model can be considerably simplified by using the technique due to Welch (1951), and extend the results to multivariate models and variance component models. Results will also be given on the variance of the fitted value, thereby extending the results of Jacquez, Mather & Crawford (1968) to the general linear model.

Journal ArticleDOI
TL;DR: In this article, the null distribution of the Z test statistic is for practical purposes satisfactorily approximated by a normal distribution for samples of size 5 up to 100, and the large sample null distribution and the consistency of the test are also obtained.
Abstract: SUMMARY The mean and the variance of a random sample are independently distributed if and only if the parent population is normal. This characterization is used as a basis for developing a test, termed the Z test, for the composite hypothesis of normality against asymmetric alternatives. The null distribution of the Z test statistic is for practical purposes satisfactorily approximated by a normal distribution for samples of size 5 up to 100. The large sample null distribution and the consistency of the test are also obtained. A Monte Carlo power study shows that the Z test has good power properties relative to some well-known competitors.

Journal ArticleDOI
TL;DR: In this article, an analogue of the Rayleigh test is developed for weighted vector data, which is sensitive to nonrandom concentration or scatter of vector angles, and can be used also to test certain group differences or interactions.
Abstract: SUMMARY An analogue of the Rayleigh test is developed for weighted vector data. It is sensitive to nonrandom concentration or scatter of vector angles, and can be used also to test certain group differences or interactions. The test is based upon rank transformation of vector weights and calculation of a resultant; it is nonparametric. Critical values of resultant lengths are tabulated.

Journal ArticleDOI
TL;DR: In this paper, the ratio method of estimation in sample surveys is considered, which uses supplementary information to increase precision and is most effective when the relationship between the response, y, and the subsidiary variate, x, is linear through the origin and the variance of y is proportional to x.
Abstract: In sample surveys supplementary information is often used for increasing the precision of estimators. A good example of this is the ratio method of estimation. This is most effective when the relationship between the response, y, and the subsidiary variate, x, is linear through the origin and the variance of y is proportional to x. The method can be used with simple random sampling, stratified random sampling or other types of survey designs. Let $\hat{Y}$ and $\hat{X}$ be unbiased estimators of the parameters Y and X corresponding to the variates y and x respectively, based on any probability sampling design. Examples of such parameters are population totals and means. It is assumed that X is known. For simplicity assume all measurements to be nonnegative and $\hat{X}$ and X to be positive. Let the correlation coefficient $\rho$ between $\hat{Y}$ and $\hat{X}$ be positive. Then the traditional ratio method of estimation uses $\hat{Y}_r = \hat{Y}X/\hat{X}$ to estimate Y. Let N and n < N be the population and the sample sizes respectively. Then clearly

Journal ArticleDOI
TL;DR: In this paper, the authors investigated the properties of a design constructed sequentially for a simple nonlinear problem and proved that the design measure corresponding to the sequential design converges to an optimal measure, even when the latter has a singular information matrix.
Abstract: SUMMARY Properties of a design constructed sequentially for a simple nonlinear problem are investigated both theoretically and by simulation. It is proved that the design measure corresponding to the sequential design converges to an optimal measure, even when the latter has singular information matrix. The empirical study suggests that in repeated-sampling inference we can effectively ignore the fact that the design is sequential. Some key words: Asymptotic efficiency; Fisher and sample information; Nonlinear design; Optimal design; Sequential design; Simulation; Singular design.

Journal ArticleDOI
TL;DR: In this paper, a simple method of obtaining asymptotic expansions for the densities of sufficient estimators is described, which is an extension of the one developed by O. Barndorff-Nielsen and D. R. Cox for exponential families.
Abstract: A simple method of obtaining asymptotic expansions for the densities of sufficient estimators is described. It is an extension of the one developed by O. Barndorff-Nielsen and D. R. Cox (1979) for exponential families. A series expansion in powers of $n^{-1}$ is derived, of which the first term has an error of order $n^{-1}$ which can effectively be reduced to $n^{-3/2}$ by renormalization. The results obtained are similar to those given by H. E. Daniels's (1954) saddlepoint method but the derivations are simpler. A brief treatment of approximations to conditional densities is given. Theorems are proved which extend the validity of the multivariate Edgeworth expansion to parametric families of densities of statistics which need not be standardized sums of independent and identically distributed vectors. These extensions permit the treatment of problems arising in time series analysis. The technique is used by J. Durbin (1980) to obtain approximations to the densities of partial serial correlation coefficients.

Journal ArticleDOI
Robert E. Wheeler

Journal ArticleDOI
TL;DR: In this article, it was shown that the normal, gamma and inverse normal densities are the only possible densities for which the renormalized saddlepoint approximation reproduces exactly the density of the mean.
Abstract: SUMMARY The renormalized saddlepoint approximation to the probability density of an estimator often has a surprisingly low relative error over the whole admissible range of the parameter. In particular it is known to be exact for certain densities. This raises the question of how to characterize the class of such exact cases. The density of the mean of a univariate random sample is discussed. It is shown that the normal, gamma and inverse normal are the only possible densities for which the renormalized saddlepoint approximation reproduces exactly the density of the mean.


Journal ArticleDOI
TL;DR: In this article, rank procedures are discussed for testing location in matched pairs data containing censored as well as uncensored observations, and test statistics are derived under the assumption that within-pair differences are symmetrically and identically distributed about a common median.
Abstract: SUMMARY Rank procedures are discussed for testing location in matched pairs data. The tests may be applied to samples containing censored as well as uncensored observations. It is assumed that members of a given pair have equal censoring times while members from distinct pairs may have different times of censoring. In addition, it is assumed that censoring is independent of the random variables under study, and that censoring, if it occurs, is on the right. The test statistics are derived under the assumption that within-pair differences are symmetrically and identically distributed about a common median. The application of the techniques to problems in survival data analysis is also discussed.

Journal ArticleDOI
TL;DR: In this article, the authors investigate the construction of draws for round robin tournaments with the aim of distributing the carry-over effects due to any team as evenly as possible amongst the other teams.
Abstract: SUMMARY If team A played team B in its previous match of a round robin tournament and is now playing team C, team C is said to receive a carry-over effect due to team B. This paper investigates the construction of draws for round robin tournaments with the aim of distributing the carry-over effects due to any team as evenly as possible amongst the other teams. A balanced distribution is shown to occur when the number of teams is a power of two, and a method of construction of draws is given for these cases. It is conjectured that balanced draws do not exist for other numbers of teams, and the most effective method of construction yet found for this latter situation is presented. As considered here, a round robin tournament is made up of t teams, which play every other team n times. The number of places in the draw is required to be even; if only (t - 1) teams enter the competition, the remaining place is called a bye, and a team drawn to play a bye has a rest day. For the purpose of this discussion, a bye may be regarded as an ordinary team. In each of the n rounds, $\tfrac{1}{2}t$ matches are played on each of (t - 1) occasions, with a team meeting every other team once per round. The draw for every round subsequent to the first is the same as for the first round, except that venues may be altered, to allow for 'home' and 'away' matches, for example. Each team is considered to have an effect on its opponents which carries over to the next match. If team A meets team B in one match and team C in the next, then it is reasonable that team A's performance against team C will have been affected by team B. Particularly in body-contact sports, if team B is a strong, hard-playing side, then team A is likely to enter the match against team C bruised in both body and morale. Conversely, if team B is relatively weak, then team C can anticipate that team A will be confident and fit for their match. Team C is said to receive a 'carry-over effect' due to team B. 
The aim of this paper is to obtain draws, for various values of t, which spread as evenly as possible the carry-over effects of each team. No carry-over effect is present in the first match, so all teams will both pass on, and receive, n(t - 1) - 1 carry-over effects in all. A draw will be called balanced with respect to carry-over effects if every team receives carry-over effects n times from each of (t -2) teams, and (n - 1) times from the remaining team.
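
The counting of carry-over effects described above can be sketched as follows for a single round (n = 1). The draw here is built with the familiar circle method, which is an assumption chosen for convenience and is not the construction given in the paper; the function names are likewise hypothetical.

```python
def circle_method_draw(t):
    """One round of a round robin for an even number t of teams (0 .. t-1).

    Returns a list of occasions, each a list of (team, team) matches.
    """
    if t < 2 or t % 2:
        raise ValueError("the number of places in the draw must be even")
    m = t - 1
    draw = []
    for r in range(m):
        matches = [(r, m)]  # team m stays fixed while the others rotate
        for k in range(1, t // 2):
            matches.append(((r + k) % m, (r - k) % m))
        draw.append(matches)
    return draw


def carry_over_counts(draw, t):
    """counts[b][c] is the number of carry-over effects c receives due to b."""
    opponents = [[None] * len(draw) for _ in range(t)]
    for j, matches in enumerate(draw):
        for a, b in matches:
            opponents[a][j] = b
            opponents[b][j] = a
    counts = [[0] * t for _ in range(t)]
    for a in range(t):
        for j in range(len(draw) - 1):
            b, c = opponents[a][j], opponents[a][j + 1]
            counts[b][c] += 1  # a met b, then meets c
    return counts


counts = carry_over_counts(circle_method_draw(6), 6)
for row in counts:
    print(row)
```

With n = 1 every team passes on and receives t - 2 effects in all, so each row and each column of the matrix sums to t - 2; a draw is balanced in the sense above when no entry exceeds one.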

Journal ArticleDOI
TL;DR: In this paper, the distribution of the latent roots of the sample covariance matrix is studied when the parent population is nonnormal, and asymptotic expansions of the marginal and joint distributions of the sample roots, and of the distribution of a function of the sample roots, are given by finding the Edgeworth expansions.
Abstract: SUMMARY The distribution of the latent roots of the sample covariance matrix is studied when the parent population is nonnormal. Asymptotic expansions of the marginal and joint distributions of the sample roots and the distribution of a function of the sample roots are given, by finding the Edgeworth expansions.