
Showing papers in "Biometrika in 1983"


Journal ArticleDOI
TL;DR: The authors discuss the central role of propensity scores and balancing scores in the analysis of observational studies and show that adjustment for the scalar propensity score is sufficient to remove bias due to all observed covariates.
Abstract: The results of observational studies are often disputed because of nonrandom treatment assignment. For example, patients at greater risk may be overrepresented in some treatment group. This paper discusses the central role of propensity scores and balancing scores in the analysis of observational studies. The propensity score is the (estimated) conditional probability of assignment to a particular treatment given a vector of observed covariates. Both large and small sample theory show that adjustment for the scalar propensity score is sufficient to remove bias due to all observed covariates. Applications include: matched sampling on the univariate propensity score, which is equal percent bias reducing under more general conditions than required for discriminant matching; multivariate adjustment by subclassification on balancing scores, where the same subclasses are used to estimate treatment effects for all outcome variables and in all subpopulations; and visual representation of multivariate adjustment by a two-dimensional plot.
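As a rough illustration of the idea (not the paper's own software), the sketch below estimates a scalar propensity score by logistic regression and subclassifies on its quintiles; the simulated data, variable names and number of subclasses are all hypothetical choices.

```python
# Hedged sketch: propensity-score subclassification on simulated data.
# Variable names and the simulated data-generating process are illustrative only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=(n, 3))                       # observed covariates
logit_p = 0.8 * x[:, 0] - 0.5 * x[:, 1]           # true assignment model
treat = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))
y = 2.0 * treat + x @ np.array([1.0, 1.0, 0.5]) + rng.normal(size=n)

# Estimated propensity score: P(treatment = 1 | x) from a logistic regression.
ps_model = sm.Logit(treat, sm.add_constant(x)).fit(disp=0)
ps = ps_model.predict(sm.add_constant(x))

# Subclassify on propensity-score quintiles and average within-stratum contrasts.
edges = np.quantile(ps, [0.2, 0.4, 0.6, 0.8])
strata = np.digitize(ps, edges)
effects = [y[(strata == s) & (treat == 1)].mean() - y[(strata == s) & (treat == 0)].mean()
           for s in range(5)]
print("naive difference:", y[treat == 1].mean() - y[treat == 0].mean())
print("subclassified estimate:", np.mean(effects))
```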

23,744 citations


Journal ArticleDOI
TL;DR: In this article, the authors propose a more flexible method to construct discrete sequential boundaries based on the choice of a function, α*(t), which characterizes the rate at which the error level α is spent.
Abstract: SUMMARY Pocock (1977), O'Brien & Fleming (1979) and Slud & Wei (1982) have proposed different methods to construct discrete sequential boundaries for clinical trials. These methods require that the total number of decision times be specified in advance. In the present paper, we propose a more flexible way to construct discrete sequential boundaries. The method is based on the choice of a function, α*(t), which characterizes the rate at which the error level α is spent. The boundary at a decision time is determined by α*(t) and by past and current decision times, but does not depend on the future decision times or the total number of decision times.
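To make the construction concrete, here is a hedged sketch (an illustration of the general approach, not the authors' code) of two commonly used spending functions α*(t) and the increments of type I error they allocate at interim looks; the specific functional forms and information fractions below are assumptions of the example.

```python
# Hedged sketch: error-spending functions alpha*(t) evaluated at interim
# information fractions.  The functional forms below are standard examples;
# the boundary values themselves would require a further numerical step.
import numpy as np
from scipy.stats import norm

alpha = 0.05

def spend_pocock(t, alpha=alpha):
    # Pocock-type spending: alpha * log(1 + (e - 1) t)
    return alpha * np.log(1 + (np.e - 1) * t)

def spend_obf(t, alpha=alpha):
    # O'Brien-Fleming-type spending: 2 * (1 - Phi(z_{alpha/2} / sqrt(t)))
    return 2 * (1 - norm.cdf(norm.ppf(1 - alpha / 2) / np.sqrt(t)))

# Information fractions of the looks need not be equally spaced or prespecified.
t = np.array([0.3, 0.55, 0.8, 1.0])
for name, f in [("Pocock-type", spend_pocock), ("OBF-type", spend_obf)]:
    cumulative = f(t)
    increments = np.diff(np.concatenate([[0.0], cumulative]))
    print(name, "cumulative:", np.round(cumulative, 4),
          "spent per look:", np.round(increments, 4))
```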

1,913 citations


Journal ArticleDOI
TL;DR: The purpose here is to provide appropriate diagnostic techniques to aid in an assessment of the validity of the usual assumption of homoscedasticity when little or no replication is present.
Abstract: SUMMARY For the usual regression model without replication, we provide a diagnostic test for heteroscedasticity based on the score statistic. A graphical procedure to complement the score test is also presented. Some key words: Influence; Linear model; Residual; Score test. Diagnostic methods in linear regression are used to examine the appropriateness of assumptions underlying the modelling process and to locate unusual characteristics of the data that may influence conclusions. The recent literature on diagnostics is dominated by studies of methods for the detection of influential observations. Cook & Weisberg (1982) provide a review. Diagnostics for the relevance of specific assumptions, however, have not received the same degree of attention, even though these may be of equal importance. Our purpose here is to provide appropriate diagnostic techniques to aid in an assessment of the validity of the usual assumption of homoscedasticity when little or no replication is present. Available methods for studying this assumption include both graphical and nongraphical procedures. The usual graphical procedure consists of plotting the ordinary least squares residuals against fitted values or an explanatory variable. A megaphone-shaped pattern is taken as evidence that the variance depends on the quantity plotted on the abscissa (Weisberg, 1980, Chapter 6). In § 3 we suggest several ways in which this graphical procedure can be extended.
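A convenient modern route to a score test of this kind is the Breusch-Pagan/Cook-Weisberg test available in statsmodels; the sketch below, on simulated data, is offered only as an illustration of the diagnostic and of the companion residual plot, not as the paper's own procedure.

```python
# Hedged sketch: score test for heteroscedasticity plus the usual residual plot.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0, 10, n)
y = 1 + 2 * x + rng.normal(scale=0.5 + 0.3 * x, size=n)   # variance grows with x

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()

# Score (Lagrange multiplier) test of constant variance against variance
# depending on the regressors.
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(fit.resid, X)
print(f"score statistic = {lm_stat:.2f}, p-value = {lm_pvalue:.4f}")

# The complementary graphical check: residuals against fitted values; a
# megaphone-shaped pattern suggests non-constant variance.
# import matplotlib.pyplot as plt
# plt.scatter(fit.fittedvalues, fit.resid); plt.show()
```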

813 citations


Journal ArticleDOI

490 citations


Journal ArticleDOI
TL;DR: In this article, a log linear contrast form of principal component analysis is proposed for compositional data, developed by adapting recently introduced transformation techniques for compositional data analysis.
Abstract: SUMMARY Compositional data, consisting of vectors of proportions, have proved difficult to handle statistically because of the awkward constraint that the components of each vector must sum to unity. Moreover such data sets frequently display marked curvature so that linear techniques such as standard principal component analysis are likely to prove inadequate. From a critical reexamination of previous approaches we evolve, through adaptation of recently introduced transformation techniques for compositional data analysis, a log linear contrast form of principal component analysis and illustrate its advantages in applications.
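As a hedged sketch of the general idea (the data and the particular log-ratio transform are my own choices, not necessarily the paper's exact construction), one can apply a centred log-ratio transformation to the compositions and then run ordinary principal component analysis on the transformed rows, so that each component is a log contrast of the parts.

```python
# Hedged sketch: log-ratio principal component analysis of compositional data.
# The centred log-ratio (clr) transform used here is one standard choice.
import numpy as np

rng = np.random.default_rng(2)
raw = rng.gamma(shape=[2.0, 5.0, 1.0, 3.0], size=(100, 4))
comps = raw / raw.sum(axis=1, keepdims=True)      # rows sum to one

# Centred log-ratio transform removes the unit-sum constraint's curvature.
logc = np.log(comps)
clr = logc - logc.mean(axis=1, keepdims=True)

# Ordinary PCA on the clr-transformed data via the SVD of the centred matrix.
centred = clr - clr.mean(axis=0)
U, s, Vt = np.linalg.svd(centred, full_matrices=False)
explained = s**2 / np.sum(s**2)
print("proportion of variance by component:", np.round(explained, 3))
print("first log-contrast (loadings sum to ~0):", np.round(Vt[0], 3))
```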

461 citations



Journal ArticleDOI
John T. Kent
TL;DR: An important motivating feature for this information-based correlation coefficient is the fact that it generalizes both the usual product-moment correlation coefficient for the bivariate normal model and the usual multiple correlation coefficient for the standard multiple regression model with normal errors.
Abstract: SUMMARY Given a parametric model of dependence between two random quantities, X and Y, the notion of information gain can be used to define a measure of correlation. This definition of correlation generalizes both the usual product-moment correlation coefficient for the bivariate normal model and the multiple correlation coefficient in the standard linear regression model. The use of this information-based correlation in a descriptive statistical analysis is examined and several examples are given. If the dependence between two random quantities, X and Y, is modelled parametrically, then the concept of information gain can be used to define a measure of correlation. This correlation coefficient can appear in two possible contexts, depending on whether one models the joint distribution of X and Y, or just the conditional distribution of Y given X. An important motivating feature for this information-based correlation coefficient is the fact that it generalizes both the usual product-moment correlation coefficient for the bivariate normal model and the usual multiple correlation coefficient for the standard multiple regression model with normal errors. Our intuition is well developed for these usual correlation coefficients, and hopefully our intuition will still be applicable for the information-based correlation in more general modelling situations of parametric dependence. Further, since our correlation coefficient is based on information gain, we might hope to extend our intuition to interpret the information gain in any statistical modelling situation where we want to assess how much better a more complicated model is than a simpler model. The concept of information gain for general statistical models is described in § 2. This concept is then used to define an information-based measure of correlation; the joint case is covered in § 3 and the conditional case in § 4. Estimation of the correlation coefficient is carried out by estimating the corresponding information gain; see §§ 5-7. The use of information gain for the purpose of model choice in a descriptive statistical analysis is discussed in § 8 and a comparison between our approach and Akaike's information criterion is given in § 9. Some examples of the use of this information-based correlation are given in §§ 10 and 11.
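To fix ideas, here is a hedged sketch under the normal linear model: taking the information gain to be the per-observation difference in maximized log likelihood between a model with and without the regressors, the quantity 1 - exp(-2 × gain) reproduces the usual squared multiple correlation, consistent with the generalization described above. The specific formula and variable names are assumptions of this illustration, not a transcription of the paper.

```python
# Hedged sketch: an information-gain-based correlation for a normal linear model.
# Under this model 1 - exp(-2 * gain) coincides with the usual R^2.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 300
x = rng.normal(size=(n, 2))
y = 1.0 + x @ np.array([0.8, -0.4]) + rng.normal(size=n)

full = sm.OLS(y, sm.add_constant(x)).fit()        # conditional model with X
null = sm.OLS(y, np.ones((n, 1))).fit()           # intercept-only model

gain = (full.llf - null.llf) / n                  # information gain per observation
rho_info = np.sqrt(1 - np.exp(-2 * gain))

print(f"information-based correlation: {rho_info:.4f}")
print(f"usual multiple correlation sqrt(R^2): {np.sqrt(full.rsquared):.4f}")
```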

376 citations


Journal ArticleDOI
TL;DR: In this paper, the distribution of b2 is approximated, once its first three moments have been determined, by fitting a linear function of the reciprocal of a χ² variable and then using the Wilson-Hilferty transformation.
Abstract: SUMMARY D'Agostino & Pearson (1973) gave percentage points of the distribution of b2 for independent observations from a common univariate normal distribution. Their results can be adequately approximated, when the first three moments of the distribution of b2 have been determined, by fitting a linear function of the reciprocal of a χ² variable and then using the Wilson-Hilferty transformation. Evidence is presented suggesting that the same method of approximation is satisfactory for the b2 statistic calculated from any set of linear-least-squares residuals, on the hypothesis of normal homoscedastic errors.

269 citations


Journal ArticleDOI
TL;DR: In this paper, the authors discuss two classes of models for contingency tables, graphical and recursive models, both arising from restrictions expressible as conditional independencies of variable pairs, and derive decomposable or multiplicative models as the intersecting class.
Abstract: SUMMARY We discuss two classes of models for contingency tables, graphical and recursive models, both of which arise from restrictions that are expressible as conditional independencies of variable pairs. The first of these is a subclass of hierarchical log linear models. Each of its models can be represented by an undirected graph. In the second class each model corresponds to a particular kind of a directed graph instead and can be characterized by a nontrivial factorization of the joint distribution in terms of response variables. We derive decomposable or multiplicative models as the intersecting class. This result has useful consequences for exploratory types of analysis as well as for the model interpretation: we can give an aid for detecting well-fitting decomposable models in a transformation of the observed contingency table and each decomposable model may be interpreted with the help of an undirected or directed graph.

248 citations


Journal ArticleDOI
TL;DR: An omnibus test of normality is proposed that has high power against many alternative hypotheses; the test uses a weighted integral of the squared modulus of the difference between the empirical characteristic function of the sample and the characteristic function of the normal distribution.
Abstract: SUMMARY An omnibus test of normality is proposed, which has high power against many alternative hypotheses. The test uses a weighted integral of the squared modulus of the difference between the characteristic functions of the sample and of the normal distribution. Properties of the empirical characteristic function φn(t), together with the one-to-one correspondence between φ(t) and F(x), suggest utilizing φn(t) in statistical inference. Here we use φn(t) to test the composite hypothesis that F(x) is normal. Let φ0(t) = exp(itμ - ½t²σ²) be the characteristic function under the null hypothesis, with μ and σ² the unspecified mean and variance. The test introduced here is based on a weighted integral over t of the squared modulus of φn(t) - φ̂0(t), where φ̂0(t) depends on sample estimates of μ and σ². We compare the power of the test with that of prominent tests based on order statistics and on sample moments. Heathcote (1972) and Feigin & Heathcote (1977) have previously considered using either the real or imaginary part alone of φn(t) in tests of simple hypotheses. Their tests relied on the fact that, for given t, either component of φn(t) is asymptotically normal with mean given by its population counterpart. If the alternative hypothesis is also simple, it may be possible to find a value of t that maximizes the power of the test in large samples. Epps, Singleton & Pulley (1982) showed how the sample moment generating function may be employed to test composite hypotheses that the data come from one or the other of two separate families of distributions. Murota & Takeuchi (1981) have recently proposed a location and scale invariant test based on the statistic an(t) = |φn(t/S)|², where S is the sample standard deviation. Applied to the problem of testing normality, an(t) was found to have high power in the vicinity of t = 1.0 against members of six families of symmetric alternatives.
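The sketch below (an illustration only; the weight function, its scale and the absence of tabulated critical values are all assumptions of the example) computes a weighted-integral statistic of this general form by numerical quadrature on standardized data.

```python
# Hedged sketch: a weighted integral of |phi_n(t) - phi_0(t)|^2, where phi_n is
# the empirical characteristic function of the standardized sample and phi_0 the
# standard normal characteristic function.  Weight choice and scaling are illustrative.
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def ecf_statistic(x, weight_sd=1.0):
    n = len(x)
    mu, sigma = x.mean(), x.std(ddof=0)
    z = (x - mu) / sigma                      # reduce to standardized data

    def integrand(t):
        phi_n = np.mean(np.exp(1j * t * z))   # empirical characteristic function
        phi_0 = np.exp(-0.5 * t**2)           # standard normal cf
        return abs(phi_n - phi_0) ** 2 * norm.pdf(t, scale=weight_sd)

    value, _ = quad(integrand, -np.inf, np.inf, limit=200)
    return n * value

rng = np.random.default_rng(4)
print("normal sample:     ", round(ecf_statistic(rng.normal(size=200)), 5))
print("exponential sample:", round(ecf_statistic(rng.exponential(size=200)), 5))
```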

214 citations


Journal ArticleDOI
TL;DR: In this article, a coefficient to measure association between angular variables is discussed, its asymptotic distribution is found, and its properties are compared with other statistics in current use.
Abstract: SUMMARY A coefficient to measure association between angular variables is discussed, its asymptotic distribution found, and its properties developed. Comparisons with other statistics in current use are made, and some examples given.

Journal ArticleDOI
TL;DR: In this paper, the robustness and efficiency properties of likelihood ratio tests for functions of the population covariance matrix are studied and an alternative class of tests based upon affine-invariant M-estimates of scatter is proposed whenever the function of the covariance matrices is invariant under a common scale change.
Abstract: The robustness and efficiency properties of likelihood ratio tests for functions of the population covariance matrix are studied. An alternative class of tests based upon affine-invariant M-estimates of scatter is proposed whenever the function of the covariance matrix is invariant under a common scale change. For such inferences, it is shown that a single scalar-valued index of efficiency is sufficient.

Journal ArticleDOI
TL;DR: In this paper, the bias of Yule-Walker and least squares estimates for univariate and multivariate autoregressive processes was studied. And they showed that for strongly autocorrelated processes, Yule Walker estimates can be severely biased even for comparatively large sample sizes.
Abstract: SUMMARY We study the bias of Yule-Walker and least squares estimates for univariate and multivariate autoregressive processes. We obtain explicit formulae for the large-sample bias of Yule-Walker estimates in the scalar first- and second-order cases and for least squares estimates in the general case. Both simulations and theory indicate that Yule-Walker estimates are inferior to least squares estimates. For strongly autocorrelated processes, Yule-Walker estimates can be severely biased even for comparatively large sample sizes.
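A quick Monte Carlo illustration of the effect described, with arbitrary settings of my own choosing: for a strongly autocorrelated AR(1), compare the average Yule-Walker and least squares estimates of the autoregressive coefficient.

```python
# Hedged sketch: bias of Yule-Walker versus least squares for an AR(1)
# with coefficient close to one.  Settings are illustrative.
import numpy as np
from statsmodels.regression.linear_model import yule_walker

rng = np.random.default_rng(5)
phi_true, n, reps = 0.95, 50, 2000
yw_est, ls_est = [], []

for _ in range(reps):
    e = rng.normal(size=n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi_true * x[t - 1] + e[t]

    rho, _ = yule_walker(x, order=1)                  # Yule-Walker estimate
    yw_est.append(rho[0])

    xl, xc = x[:-1], x[1:]                            # least squares (OLS) estimate
    xl = xl - xl.mean()
    ls_est.append(np.sum(xl * (xc - xc.mean())) / np.sum(xl**2))

print(f"true coefficient      : {phi_true}")
print(f"mean Yule-Walker est. : {np.mean(yw_est):.3f}")
print(f"mean least squares est: {np.mean(ls_est):.3f}")
```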

Journal ArticleDOI
TL;DR: In this paper, the authors introduced the stabilized probability plot, which is a new and powerful goodness-of-fit statistic, analogous to the standard Kolmogorov-Smirnov statistic D, defined to be the maximum deviation of the plotted points from their theoretical values.
Abstract: SUMMARY The stabilized probability plot is introduced. An attractive feature of the plot that enhances its interpretability is that the variances of the plotted points are approximately equal. This prompts the definition of a new and powerful goodness-of-fit statistic Dsp which, analogous to the standard Kolmogorov-Smirnov statistic D, is defined to be the maximum deviation of the plotted points from their theoretical values. Using either D or Dsp it is shown how to construct acceptance regions for Q-Q, P-P and the new plots. Acceptance regions can help remove much of the subjectivity from the interpretation of these probability plots.
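One way to realize such a plot (stated as an assumption of this illustration: the plotting positions and the arcsine square-root transform below are common choices, not necessarily the paper's exact construction) is to transform both the uniform plotting positions and the fitted probabilities so that the plotted points have roughly equal variance, and then take the maximum deviation as a Dsp-type statistic.

```python
# Hedged sketch: a stabilized probability plot and a D_sp-type statistic.
# Plotting positions (i - 0.5)/n and the arcsine square-root transform are
# assumptions of this illustration.
import numpy as np
from scipy.stats import norm

def stabilized_plot_points(x):
    x = np.sort(np.asarray(x))
    n = len(x)
    p = (np.arange(1, n + 1) - 0.5) / n               # uniform plotting positions
    fitted = norm.cdf(x, loc=x.mean(), scale=x.std(ddof=1))
    r = (2 / np.pi) * np.arcsin(np.sqrt(p))           # stabilized abscissa
    s = (2 / np.pi) * np.arcsin(np.sqrt(fitted))      # stabilized ordinate
    return r, s

rng = np.random.default_rng(6)
r, s = stabilized_plot_points(rng.normal(size=100))
d_sp = np.max(np.abs(r - s))                          # analogue of Kolmogorov-Smirnov D
print(f"D_sp-type statistic: {d_sp:.4f}")
# A plot of (r, s) should hug the 45-degree line under the hypothesized model:
# import matplotlib.pyplot as plt; plt.plot(r, s, "."); plt.plot([0, 1], [0, 1]); plt.show()
```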

Journal ArticleDOI
TL;DR: In this article, a large sample theory for the proportional hazards model of survival analysis is developed for cases of staggered entry and sequential analysis, and the principal techniques involve an approximation of the score process by a suitable martingale and a random rescaling of time based on the observed Fisher information.
Abstract: For the proportional hazards model of survival analysis, an appropriate large sample theory is developed for cases of staggered entry and sequential analysis. The principal techniques involve an approximation of the score process by a suitable martingale and a random rescaling of time based on the observed Fisher information. As a result we show that the maximum partial likelihood estimator behaves asymptotically like Brownian motion.

Journal ArticleDOI
TL;DR: In this paper, it is shown that log linear models are appropriate for tables with response and explanatory variables if and only if they are collapsible onto the explanatory variables, and necessary and sufficient conditions for collapsibility are found in terms of the generating class.
Abstract: SUMMARY Various definitions of the collapsibility of a hierarchical log linear model for a multidimensional contingency table are considered and shown to be equivalent. Necessary and sufficient conditions for collapsibility are found in terms of the generating class. It is shown that log linear models are appropriate for tables with response and explanatory variables if and only if they are collapsible onto the explanatory variables. Some key words: Collapsibility; Contingency table; Graphical model; Interaction graph; Log linear model; Response variable; S-sufficiency. Some models have the property that relations between a set of the classifying factors may be studied by examination of the table of marginal totals formed by summing over the remaining factors. Such models are said to be collapsible onto the given set of factors. Collapsibility has important consequences for hypothesis testing and model selection, and can be useful in data reduction. We consider various definitions of collapsibility and show their equivalence. Furthermore, necessary and sufficient conditions for collapsibility are found in terms of the generating class. Many tables analysed in practice involve response variables. Simple examples, one of which is given in § 3, suffice to show the importance of distinguishing between response and explanatory variables: first, so that inappropriate models may be avoided, and second so that natural and relevant models that are not log linear may be considered. This paper characterizes appropriate and inappropriate log linear models for tables with response variables, and some alternative approaches for the analysis of such tables are briefly considered. We consider a multidimensional contingency table N based on a set of classifying factors F. For a given subset a of F we are interested in the table of marginal totals Na, that is to say the table of cell counts summed over the remaining factors, the complement of a in F. We identify a hierarchical log linear model L, that is the set of probabilities p ∈ L, with its generating class, whose elements, the generators, are given in square brackets: thus, for example, the model [AB][BCD] for a four-way table corresponds to the generating class with generators AB and BCD.

Journal ArticleDOI
TL;DR: In this article, a simple iterative procedure is found for the exact null and alternative distributions of likelihood ratio, cumulative sum and related statistics for testing for a change in probability of a sequence of independent binomial random variables.
Abstract: SUMMARY A simple iterative procedure is found for the exact null and alternative distributions of likelihood ratio, cumulative sum and related statistics for testing for a change in probability of a sequence of independent binomial random variables. It is concluded that the likelihood ratio test is slightly less powerful than the cumulative sum test near the middle of the sequence, but that the likelihood ratio test is much more powerful near the ends. An example from epidemiology is used to illustrate the results. Under certain conditions the cumulative sum test is identical to a two-sample Kolmogorov-Smirnov test and so the iterative procedure can be used to find the null distribution of Kolmogorov-Smirnov and related statistics.

Journal ArticleDOI
TL;DR: In this article, the authors extend the class of generalized linear models to allow for correlated observations, nonlinear models and error distributions not of the exponential family form, and generalize the analysis of deviance to the extended class of models.
Abstract: SUMMARY The class of generalized linear models is extended to allow for correlated observations, nonlinear models and error distributions not of the exponential family form. The extended class of models includes a number of important examples, particularly of the composite transformational type. Large-sample inference and maximum likelihood estimation for the extended class of generalized linear models are discussed, and the analysis of deviance is generalized to the extended class of models. Calculation of the maximum likelihood estimate for a general likelihood by Fisher's scoring method and a related method is considered, and the relation with the Gauss-Newton method is discussed.

Journal ArticleDOI
TL;DR: This paper points out that, by using the Kalman filter with nonconstant coefficients, one can compute the exact likelihood of an autoregressive-moving average process observed with noise when some observations are either missing or aggregated.
Abstract: SUMMARY This note points out that by using the Kalman filter with nonconstant coefficients, we can compute the exact likelihood of an autoregressive-moving average process observed with noise, when some of our observations are either missing or aggregated.
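In current software the same computation is available through state-space ARMA routines that accept missing values; the following is a hedged sketch using statsmodels (my own example, not the note's code), with the ARMA order and missingness pattern chosen arbitrarily.

```python
# Hedged sketch: exact Gaussian likelihood of an ARMA(1,1) series with missing
# observations, computed by the Kalman filter inside a state-space model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 300
e = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.7 * y[t - 1] + e[t] + 0.4 * e[t - 1]     # ARMA(1,1) simulation

y_missing = y.copy()
y_missing[rng.choice(n, size=30, replace=False)] = np.nan   # drop 10% at random

model = sm.tsa.statespace.SARIMAX(y_missing, order=(1, 0, 1))  # Kalman-filter ARMA
result = model.fit(disp=False)
print("estimated (ar, ma, sigma2):", np.round(result.params, 3))
print("exact log likelihood with missing data:", round(result.llf, 2))
```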

Journal ArticleDOI
TL;DR: In this paper, a general form of the location model is considered for mixed continuous and categorical variables observed in a number of different populations, and some special cases of practical interest are cited.
Abstract: SUMMARY A general form of the location model is considered for mixed continuous and categorical variables observed in a number of different populations, and some special cases of practical interest are cited. The distance between any two populations is derived for each of these models. Estimation of parameters in these distance measures is discussed, and the methods are illustrated by application to previously published data.

Journal ArticleDOI
TL;DR: In this paper, the authors present a new class of nonparametric assumptions on the conditional distribution of T given C and show how they lead to consistent generalizations of the Kaplan & Meier (1958) survival curve estimator.
Abstract: SUMMARY In many contexts where there is interest in inferring the marginal distribution of a survival time T subject to censoring embodied in a latent waiting time C, the times T and C may not be independent. This paper presents a new class of nonparametric assumptions on the conditional distribution of T given C and shows how they lead to consistent generalizations of the Kaplan & Meier (1958) survival curve estimator. The new survival curve estimators are used under weak assumptions to construct bounds on the marginal survival which can be much narrower than those of Peterson (1976). In stratified populations where T and C are independent only within strata examples indicate that the Kaplan-Meier estimator is often approximately consistent.

Journal ArticleDOI
TL;DR: In this paper, the authors discuss methods for modelling multivariate autoregressive time series in terms of a smaller number of index series which are chosen to provide as complete a summary as possible of the past information contained in the original series necessary for prediction purposes.
Abstract: SUMMARY We discuss methods for modelling multivariate autoregressive time series in terms of a smaller number of index series which are chosen to provide as complete a summary as possible of the past information contained in the original series necessary for prediction purposes. The maximum likelihood method of estimation and asymptotic properties of estimators of the coefficients which determine the index variables, as well as the corresponding autoregressive coefficients, are discussed. A numerical example is presented to illustrate the use of the autoregressive index models.

Journal ArticleDOI
TL;DR: In this paper, a simple test of the composite hypothesis of normality against the alternative that the underlying distribution is long tailed is proposed, based on the behaviour of the empirical characteristic function in the neighbourhood of the origin.
Abstract: Our aim in the present paper is to suggest a simple test of the composite hypothesis of normality against the alternative that the underlying distribution is long tailed. The test is based on the behaviour of the empirical characteristic function in the neighbourhood of the origin, and is very competitive with several well-known tests for normality. Only a table of critical points is required for its implementation. We begin by describing the philosophy behind the test, letting φ denote the characteristic function of the sampling distribution.

Journal ArticleDOI
TL;DR: In this paper, the saddlepoint method is used to approximate to the distribution of an estimator defined by an estimating equation, and two different approaches are available, one of which is shown to be equivalent to the technique of Field & Hampel (1982).
Abstract: SUMMARY The saddlepoint method is used to approximate to the distribution of an estimator defined by an estimating equation. Two different approaches are available, one of which is shown to be equivalent to the technique of Field & Hampel (1982). Two recent formulae for the tail probability, due to Lugannani & Rice (1980) and to Robinson (1982) respectively, which are uniformly accurate over the whole range of the estimator, are compared numerically with the exact results and those computed by Field & Hampel. They are found to be of comparable accuracy while avoiding the use of numerical integration. The most accurate is that of Lugannani & Rice. Hampel (1974) introduced a new technique for approximating to the probability density of an estimator defined by an estimating equation. It is an example of what he called 'small sample asymptotics', where high accuracy is achieved for quite small sample sizes n, even down to single figures. In the original version it gave an approximation to the logarithmic derivative of the density function which was integrated numerically to get the density. The distribution function was obtained by a second numerical integration, which could then be used to renormalize both the density and the distribution function. Field & Hampel (1982) develop the technique in detail and compare its performance with that of other approximation methods. As Hampel had pointed out, his approach is closely related to the saddlepoint method of Daniels (1954), which was applied to sample means and ratios of means. Following the private communication referred to by Field & Hampel (1982, p. 31) it was realized that the first numerical integration was unnecessary and that Hampel's approach could be shortened to give a direct approximation to the density which is in fact a saddlepoint approximation. The purpose of the present paper is to extend the use of the saddlepoint method to estimating equations. There are two distinct ways of doing this which lead to different approximations of similar accuracy. One appears to be more convenient for approximating to tail probabilities by numerical integration, and was used to compute the saddlepoint approximations quoted in Field & Hampel's Tables 1 and 2. The other gives the density directly in a form equivalent to their equation (4.3), which is then integrated to obtain the distribution function.
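For orientation, here is a hedged sketch of the Lugannani & Rice tail formula in its simplest setting, the mean of independent exponentials, where the exact tail is available for comparison; the estimating-equation extension developed in the paper is not reproduced here.

```python
# Hedged sketch: Lugannani-Rice saddlepoint tail approximation for the mean of
# n iid Exponential(1) variables, compared with the exact gamma tail.
import numpy as np
from scipy.stats import norm, gamma

def lugannani_rice_exp_mean(xbar, n):
    # Cumulant generating function of Exp(1): K(t) = -log(1 - t), K'(t) = 1/(1 - t).
    t_hat = 1 - 1 / xbar                      # saddlepoint solving K'(t) = xbar
    K = -np.log(1 - t_hat)
    K2 = 1 / (1 - t_hat) ** 2                 # K''(t_hat)
    w = np.sign(t_hat) * np.sqrt(2 * n * (t_hat * xbar - K))
    u = t_hat * np.sqrt(n * K2)
    return norm.sf(w) + norm.pdf(w) * (1 / u - 1 / w)   # valid away from the mean

n, xbar = 5, 2.0                              # small sample, upper-tail point
approx = lugannani_rice_exp_mean(xbar, n)
exact = gamma.sf(n * xbar, a=n)               # sum of n Exp(1) is Gamma(n, 1)
print(f"saddlepoint tail approx: {approx:.6f}")
print(f"exact tail probability : {exact:.6f}")
```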

Journal ArticleDOI
TL;DR: In this article, a function of the Wilcoxon rank sum statistic is proposed for testing the equality of the marginal distributions when sampling from a bivariate population, and the test statistic is asymptotically nonparametric.
Abstract: SUMMARY A function of the Wilcoxon rank sum statistic is proposed for testing the equality of the marginal distributions when sampling from a bivariate population. The test statistic is asymptotically nonparametric. A comparison is made between the test statistic and several parametric and nonparametric tests in terms of their Pitman relative efficiency and small-sample performance. The test statistic has power comparable to the paired t test when sampling from the bivariate normal distribution and greater power in several important cases.

Journal ArticleDOI
TL;DR: In this article, the authors examined the asymptotic behaviour of some standard test statistics in the presence of Markov dependence in a sequence of observations, and examined the standard Pearson x2 goodness-of-fit test, and the 2 x 2 contingency table test for independence.
Abstract: SUMMARY The asymptotic behaviour of the Pearson goodness-of-fit test and of the test of independence in 2 x 2 tables is examined when the data are generated by Markov dependent sequences. The results allow simple quantification of the effects of such serial dependence on some standard test statistics. Some key words: Chi-squared tests; Contingency tables; Dependence; Goodness of fit; Markov chains; Reversibility. 1. INTRODUCTION In this paper, we discuss the asymptotic behaviour of some standard test statistics in the presence of Markov dependence in a sequence of observations. In particular, we examine the standard Pearson χ² goodness-of-fit test, and the 2 x 2 contingency table test for independence.

Journal ArticleDOI
William J. Welch
TL;DR: In this article, the authors propose a criterion for experimental design that protects against bias resulting from a large class of deviations from the assumed model, together with two algorithms for constructing such designs and an example.
Abstract: In estimating a response surface over a design region of interest, mean squared error can arise both from sampling variance and bias introduced by model inadequacy. The criterion adopted here for experimental design attempts to protect against bias resulting from a large class of deviations from the assumed model. Two algorithms are proposed for the construction of either approximate or exact designs and an example is given.

Journal ArticleDOI
TL;DR: In this article, aspects of the analysis of multidimensional contingency tables in practice are considered: the class of graphical models is described, strategies for model selection based on this class are discussed, and an example is given.
Abstract: SUMMARY Some aspects of the analysis of multidimensional contingency tables in practice are considered. The class of graphical models defined by Darroch, Lauritzen & Speed (1980) is described, strategies for model selection based on this class are considered and an example is given.

Journal ArticleDOI
TL;DR: In this paper, the authors present a method of analysis for interlaboratory studies that is robust to the existence of outliers and long-tailed distributions of random effects, using Monte Carlo analysis.
Abstract: SUMMARY A common procedure in testing analytical methods is to send a portion of each of a number of samples to each of several laboratories. The results of such a study are submitted to statistical analysis to determine the two important variance components in the problem: replication error and laboratory bias. Outliers are relatively common in these data both among laboratory effects and among the residuals. This paper presents a method of analysis for interlaboratory studies that is robust to the existence of outliers and long-tailed distributions of random effects. Theoretical considerations as well as a Monte Carlo study are adduced as support for this new technique.

Journal ArticleDOI
TL;DR: A conceptual repeated sampling experiment is considered for evaluating a predictive distribution used to describe future observations; it leads to an asymptotic likelihood principle that gives a small-sample justification for the use of entropy in evaluating parameter estimation as well as model order and structure determination procedures.
Abstract: SUMMARY The objective of inferring stochastic models from a set of data is to obtain the best description, by using a probability model, of the statistical behaviour of future samples of the process. A conceptual repeated sampling experiment is considered for evaluating a predictive distribution used to describe such future observations and leads to an asymptotic likelihood principle. Considerations of likelihood and sufficiency lead to the use of entropy or the Kullback-Leibler information as the natural measure of approximation to the actual distribution by a predictive distribution in repeated samples. This gives a small-sample justification for the use of entropy for evaluating parameter estimation as well as model order and structure determination procedures.