
Showing papers in "Journal of the Royal Statistical Society, Series B (Methodological)" in 1996


Journal ArticleDOI
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Abstract: SUMMARY We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.
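
The constrained form above is equivalent to L1-penalized least squares, so the behaviour described can be reproduced with any lasso implementation. A minimal sketch using scikit-learn's Lasso (a modern implementation, not the paper's software; the data and penalty value are hypothetical):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
beta = np.array([3.0, 1.5, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0])   # sparse truth
y = X @ beta + rng.normal(size=100)

# alpha plays the role of the Lagrange multiplier for the L1 constraint
fit = Lasso(alpha=0.5).fit(X, y)
print(fit.coef_)   # several coefficients come out exactly 0
```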

40,785 citations


Journal ArticleDOI
TL;DR: In this article, a generalization of Henderson's joint likelihood, called a hierarchical or h-likelihood, is proposed for inferences from hierarchical generalized linear models; the distribution of the extra random components is not restricted to be normal, which allows a broader class of models that includes generalized linear mixed models.
Abstract: We consider hierarchical generalized linear models which allow extra error components in the linear predictors of generalized linear models. The distribution of these components is not restricted to be normal; this allows a broader class of models, which includes generalized linear mixed models. We use a generalization of Henderson's joint likelihood, called a hierarchical or h-likelihood, for inferences from hierarchical generalized linear models. This avoids the integration that is necessary when marginal likelihood is used. Under appropriate conditions maximizing the h-likelihood gives fixed effect estimators that are asymptotically equivalent to those obtained from the use of marginal likelihood; at the same time we obtain the random effect estimates that are asymptotically best unbiased predictors. An adjusted profile h-likelihood is shown to give the required generalization of restricted maximum likelihood for the estimation of dispersion components. A scaled deviance test for the goodness of fit, a model selection criterion for choosing between various dispersion models and a graphical method for checking the distributional assumption of random effects are proposed. The ideas of quasi-likelihood and extended quasi-likelihood are generalized to the new class. We give examples of the Poisson-gamma, binomial-beta and gamma-inverse gamma hierarchical generalized linear models. A resolution is proposed for the apparent difference between population-averaged and subject-specific models. A unified framework is provided for viewing and extending many existing methods.

825 citations


Journal ArticleDOI
TL;DR: This paper fits Gaussian mixtures to each class to facilitate effective classification in non-normal settings, especially when the classes are clustered.
Abstract: Fisher-Rao linear discriminant analysis (LDA) is a valuable tool for multigroup classification. LDA is equivalent to maximum likelihood classification assuming Gaussian distributions for each class. In this paper, we fit Gaussian mixtures to each class to facilitate effective classification in non-normal settings, especially when the classes are clustered. Low dimensional views are an important by-product of LDA-our new techniques inherit this feature. We can control the within-class spread of the subclass centres relative to the between-class spread. Our technique for fitting these models permits a natural blend with nonparametric versions of LDA.
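A minimal sketch of the core idea, assuming scikit-learn's GaussianMixture as the per-class fitter; the subclass-spread control and nonparametric blend described above are not shown, and the data are synthetic:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# two classes; the second is itself clustered into two subgroups
X0 = rng.normal(loc=(0.0, 0.0), size=(100, 2))
X1 = np.vstack([rng.normal(loc=(4.0, 0.0), size=(50, 2)),
                rng.normal(loc=(0.0, 4.0), size=(50, 2))])
X, y = np.vstack([X0, X1]), np.repeat([0, 1], 100)

# fit a two-component Gaussian mixture within each class
mixtures = [GaussianMixture(n_components=2, random_state=0).fit(X[y == k])
            for k in (0, 1)]
log_prior = np.log(np.bincount(y) / len(y))
scores = np.column_stack([m.score_samples(X) for m in mixtures]) + log_prior
print("training accuracy:", np.mean(scores.argmax(axis=1) == y))
```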

791 citations


Journal ArticleDOI
TL;DR: In this article, the imprecise Dirichlet model is proposed for multinomial data in cases where there is no prior information, with inferences expressed in terms of posterior upper and lower probabilities.
Abstract: A new method is proposed for making inferences from multinomial data in cases where there is no prior information. A paradigm is the problem of predicting the colour of the next marble to be drawn from a bag whose contents are (initially) completely unknown. In such problems we may be unable to formulate a sample space because we do not know what outcomes are possible. This suggests an invariance principle: inferences based on observations should not depend on the sample space in which the observations and future events of interest are represented. Objective Bayesian methods do not satisfy this principle. This paper describes a statistical model, called the imprecise Dirichlet model, for drawing coherent inferences from multinomial data. Inferences are expressed in terms of posterior upper and lower probabilities. The probabilities are initially vacuous, reflecting prior ignorance, but they become more precise as the number of observations increases. This model does satisfy the invariance principle. Two sets of data are analysed in detail. In the first example one red marble is observed in six drawings from a bag. Inferences from the imprecise Dirichlet model are compared with objective Bayesian and frequentist inferences. The second example is an analysis of data from medical trials which compared two treatments for cardiorespiratory failure in newborn babies. There are two problems: to draw conclusions about which treatment is more effective and to decide when the randomized trials should be terminated. This example shows how the imprecise Dirichlet model can be used to analyse data in the form of a contingency table.
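
For the imprecise Dirichlet model with prior-strength parameter s, the posterior lower and upper probabilities that the next observation is of type j take the closed form n_j/(N+s) and (n_j+s)/(N+s). A sketch of the marble example above (one red in six drawings), using s = 2 as one conventional choice:

```python
def idm_bounds(n_j, N, s=2.0):
    """Posterior lower/upper probability that the next draw is of type j."""
    return n_j / (N + s), (n_j + s) / (N + s)

# one red marble observed in six drawings, prior-strength s = 2
lower, upper = idm_bounds(n_j=1, N=6, s=2.0)
print(f"P(next marble is red) lies in [{lower:.3f}, {upper:.3f}]")
# with no data (N = 0) the interval is the vacuous [0, 1], i.e. prior ignorance
```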

505 citations


Journal ArticleDOI
TL;DR: The vector generalized additive models (VGAMs) as discussed by the authors extend GAMs to include a class of multivariate regression models, such as the continuation ratio model and the proportional and non-proportional odds models for ordinal responses.
Abstract: Vector smoothing is used to extend the class of generalized additive models in a very natural way to include a class of multivariate regression models. The resulting models are called 'vector generalized additive models'. The class of models for which the methodology gives generalized additive extensions includes the multiple logistic regression model for nominal responses, the continuation ratio model and the proportional and non-proportional odds models for ordinal responses, and the bivariate probit and bivariate logistic models for correlated binary responses. They may also be applied to generalized estimating equations.

387 citations


Journal ArticleDOI
TL;DR: A modified form of twofold cross-validation is introduced to choose a threshold for wavelet shrinkage estimators operating on data sets of length a power of 2.
Abstract: SUMMARY Wavelets are orthonormal basis functions with special properties that show potential in many areas of mathematics and statistics. This paper concentrates on the estimation of functions and images from noisy data by using wavelet shrinkage. A modified form of twofold cross-validation is introduced to choose a threshold for wavelet shrinkage estimators operating on data sets of length a power of 2. The cross-validation algorithm is then extended to data sets of any length and to multidimensional data sets. The algorithms are compared with established threshold choosers by using simulation. An application to a real data set arising from anaesthesia is presented.
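
A simplified sketch of the twofold cross-validation idea, assuming PyWavelets for the shrinkage step (not the paper's implementation): each half of the data is denoised at a trial threshold and compared with an interpolation of the other half, and the threshold minimizing the combined error is chosen. Edge handling here is crude (circular), purely for illustration:

```python
import numpy as np
import pywt

def denoise(x, t, wavelet="db4"):
    coeffs = pywt.wavedec(x, wavelet)
    coeffs = [coeffs[0]] + [pywt.threshold(c, t, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[:len(x)]

def cv_score(y, t):
    odd, even = y[1::2], y[0::2]
    interp_odd = 0.5 * (even + np.roll(even, -1))   # interpolates the odd half
    interp_even = 0.5 * (odd + np.roll(odd, 1))     # interpolates the even half
    return (np.sum((denoise(even, t) - interp_even) ** 2) +
            np.sum((denoise(odd, t) - interp_odd) ** 2))

rng = np.random.default_rng(2)
x = np.linspace(0, 1, 512)                          # length a power of 2
y = np.sin(4 * np.pi * x) + rng.normal(scale=0.3, size=512)
grid = np.linspace(0.01, 2.0, 50)
print("chosen threshold:", grid[np.argmin([cv_score(y, t) for t in grid])])
```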

349 citations


Journal ArticleDOI
TL;DR: A combination of system decomposition using a sparse matrix method, experimental design and modelling is applied to one example of an electrical circuit simulator, producing a usable emulator of the circuit for use in optimization and sensitivity analysis.
Abstract: Large systems require new methods of experimental design suitable for the highly adaptive models which are employed to cope with complex non-linear responses and high dimensionality of input spaces. The area of computer experiments has started to provide such designs, especially Latin hypercube and lattice designs. System decomposition, prevalent in several branches of engineering, can be employed to decrease complexity. A combination of system decomposition using a sparse matrix method, experimental design and modelling is applied to one example of an electrical circuit simulator, producing a usable emulator of the circuit for use in optimization and sensitivity analysis.
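
A Latin hypercube design of the kind mentioned is easy to generate with SciPy's qmc module; a sketch for a hypothetical 6-input simulator with 40 runs:

```python
from scipy.stats import qmc

sampler = qmc.LatinHypercube(d=6, seed=0)
design = sampler.random(n=40)   # 40 runs; each column stratifies [0, 1) into 40 cells
design = qmc.scale(design, l_bounds=[0.0] * 6, u_bounds=[10.0] * 6)  # to input ranges
print(design.shape)
```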

192 citations


Journal ArticleDOI
TL;DR: In this article, the authors address the problem of making inferences about the extremal properties of the aggregated process from the pointwise data, and derive a model in which the resulting distribution is determined by the marginal tail behaviour and spatial dependence at extreme levels of the process.
Abstract: Risk assessment for many hydrological structures requires an estimate of the extremal behaviour of the rainfall regime within a specified catchment region. In most cases it is the spatially aggregated rainfall which is the key process, though in practice only pointwise rainfall measurements from a network of sites over the region are available. In this paper we address the problem of making inferences about the extremal properties of the aggregated process from the pointwise data. Working within the usual extreme value paradigm, a model is derived in which the resulting distribution is determined by the marginal tail behaviour and spatial dependence at extreme levels of the process. Data collected from a region in the south-west of England are used to illustrate the procedure.
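
The marginal-tail ingredient of the usual extreme value paradigm can be illustrated with a standard generalized extreme value fit in SciPy; the annual-maxima data below are simulated, and the paper's actual contribution, linking such pointwise margins to the aggregated process through spatial dependence, is not reproduced here:

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(3)
annual_max = genextreme.rvs(c=-0.1, loc=30.0, scale=10.0, size=60, random_state=rng)
c, loc, scale = genextreme.fit(annual_max)          # maximum likelihood GEV fit
print("100-year return level:", genextreme.ppf(1 - 1 / 100, c, loc=loc, scale=scale))
```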

173 citations


Journal ArticleDOI
TL;DR: The maximum likelihood estimator is shown to converge to the subset characterized by the same density function, and connection is made to the bootstrap method proposed by Aitkin and co-workers and McLachlan for testing the number of components in a finite mixture and deriving confidence regions in a finite mixture.
Abstract: Statistical inference using the likelihood ratio statistic for the number of components in a mixture model is complicated when the true number of components is less than that of the proposed model since this represents a non-regular problem: the true parameter is on the boundary of the parameter space and in some cases the true parameter is in a non-identifiable subset of the parameter space. The maximum likelihood estimator is shown to converge to the subset characterized by the same density function, and connection is made to the bootstrap method proposed by Aitkin and co-workers and McLachlan for testing the number of components in a finite mixture and deriving confidence regions in a finite mixture.
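
A sketch of the bootstrap test attributed to Aitkin and co-workers and McLachlan, for one versus two normal components, using scikit-learn mixtures on synthetic data (an illustrative stand-in, not the paper's code):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def lrt_stat(x):
    g1 = GaussianMixture(n_components=1, random_state=0).fit(x)
    g2 = GaussianMixture(n_components=2, n_init=5, random_state=0).fit(x)
    return 2 * len(x) * (g2.score(x) - g1.score(x))   # score() is mean log-likelihood

rng = np.random.default_rng(4)
x = rng.normal(size=(200, 1))                         # truly one component
t_obs = lrt_stat(x)

# simulate the null distribution from the fitted one-component model
g1 = GaussianMixture(n_components=1, random_state=0).fit(x)
t_boot = [lrt_stat(g1.sample(200)[0]) for _ in range(99)]
print("bootstrap p-value:", (1 + sum(t >= t_obs for t in t_boot)) / 100)
```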

143 citations


Journal ArticleDOI
TL;DR: A method is suggested for boundary correcting kernel density estimators, based on generating pseudodata beyond the extremities of the density's support; it was found to have lower mean-squared error at x = 0 than the boundary kernel method, especially for small sample sizes.
Abstract: We suggest a method for boundary correcting kernel density estimators, based on generating pseudodata beyond the extremities of the density's support. The estimator produced in this way enjoys optimal orders of bias and variance right up to the ends of the support, and it may be used with kernels of arbitrary order. Our method is considerably more adaptive than the common data reflection approach, which is not really appropriate for kernels of order 2 or more since it does not adequately correct boundary bias. In a simulation study, for densities on [0,1] with positive slope at x=0, our method was found to have lower mean-squared error at x = 0 than the boundary kernel method, especially for small sample sizes. Our technique may be used in conjunction with plug-in or least squares cross-validation methods of bandwidth selection and has an analogue in the context of estimating a point process intensity.

129 citations


Journal ArticleDOI
TL;DR: An approach to Bayesian sensitivity analysis that uses an influence statistic and an outlier statistic to assess the sensitivity of a model to perturbations, and two alternative divergences are proposed and shown to be interpretable.
Abstract: This paper describes an approach to Bayesian sensitivity analysis that uses an influence statistic and an outlier statistic to assess the sensitivity of a model to perturbations. The basic outlier statistic is a Bayes factor, whereas the influence statistic depends strongly on the purpose of the analysis. The task of influence analysis is aided by having an interpretable influence statistic. Two alternative divergences, an L₁-distance and a χ²-divergence, are proposed and shown to be interpretable. The Bayes factor and the proposed influence measures are shown to be summaries of the posterior of a perturbation function.

Journal ArticleDOI
TL;DR: This paper considers the asymptotic distribution of the likelihood ratio statistic T for testing a subset of the parameter of interest, θ = (γ, η) with H₀: γ = γ₀, based on the pseudolikelihood L, where φ̂ is a consistent estimator of φ, the nuisance parameter.
Abstract: This paper concerns the asymptotic distribution of the likelihood ratio statistic T for testing H₀: θ = θ₀ based on the pseudolikelihood L(θ, φ̂), where φ̂ is a consistent estimator of φ, the nuisance parameter. We show that the asymptotic distribution of T under H₀ is a weighted sum of independent χ₁²-variables, where the weights involve the asymptotic joint covariance matrix of φ̂ and the score function for θ. Some sufficient conditions are provided for the limiting distribution to be χ². The result is extended to allow θ₀ to be a boundary value of the θ parameter space, and φ to be misspecified in L(θ, φ̂). We also examine the issue of power loss when φ is misspecified in L(θ, φ̂). Several examples including variance component models, multivariate survival models, genetic linkage analysis and the Behrens-Fisher problem are presented to demonstrate the scope of the problems considered and to illustrate the results.
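
Quantiles of such a weighted sum of independent χ₁² variables are easily obtained by Monte Carlo; the weights below are hypothetical, whereas in practice they come from the covariance matrix described above:

```python
import numpy as np

rng = np.random.default_rng(5)
weights = np.array([1.0, 0.6, 0.3])          # hypothetical eigen-weights
z = rng.normal(size=(100_000, weights.size))
T = (weights * z**2).sum(axis=1)             # weighted sum of chi-squared(1) draws
print("approximate 95% critical value:", np.quantile(T, 0.95))
```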

Journal ArticleDOI
TL;DR: In this paper, a method for estimating the variance (and other moments) of s( ) by using only the data is proposed, and a rate of convergence is also given, giving guidance into the appropriate choice of subshape size.
Abstract: SUMMARY A statistic s(·) is computed on spatially indexed data {X_i : i ∈ D}, where D is a finite subset of the integer lattice Z². We propose a method for estimating the variance (and other moments) of s(·) by using only the data. The set D may be irregularly shaped, the statistic s(·) may be arbitrarily complicated and no distributional assumptions (marginal or joint) are necessary. The method uses the statistic computed on overlapping 'subshapes' of D as replicates of s(·). The estimator is simply the sample variance of the (standardized) replicates. We demonstrate L₂-consistency of the estimator (under mild conditions on s(·) and the strength of spatial dependence). A rate of convergence is also given, giving guidance on the appropriate choice of subshape size, and the estimator is illustrated in two data examples.
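
A sketch of the estimator for the simplest case, a rectangular D with s(·) the sample mean and √(block size) standardization; the subshape size b is chosen arbitrarily here, whereas the paper's rate result guides that choice:

```python
import numpy as np

def subshape_variance(X, b):
    """Sample variance of sqrt(b*b) * (subblock mean) over all b-by-b subblocks."""
    n1, n2 = X.shape
    reps = [np.sqrt(b * b) * X[i:i + b, j:j + b].mean()
            for i in range(n1 - b + 1) for j in range(n2 - b + 1)]
    return np.var(reps, ddof=1)

rng = np.random.default_rng(6)
X = rng.normal(size=(40, 40))
print(subshape_variance(X, b=8))   # near Var(X_i) = 1 for independent data
```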

Journal ArticleDOI
TL;DR: It is shown that an asymptotically precise one-term correction to the asymptotic distribution function of the classical Cramér-von Mises statistic approximates the exact distribution function remarkably closely for sample sizes as small as 7 or even smaller.
Abstract: It is shown that an asymptotically precise one-term correction to the asymptotic distribution function of the classical Cramér-von Mises statistic approximates the exact distribution function remarkably closely for sample sizes as small as 7 or even smaller. This correction can be quickly evaluated, and hence it is suitable for the computation of practically exact p-values when testing simple goodness of fit. Similar findings hold for Watson's rotationally invariant modification, where a sample size of 4 appears to suffice.
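
For reference, the statistic itself has a standard computing formula, sketched below for a simple null; the paper's one-term correction to its distribution function, which is the actual contribution, is not reproduced:

```python
import numpy as np
from scipy.stats import norm

def cramer_von_mises(x, cdf):
    u = np.sort(cdf(x))
    n = len(u)
    i = np.arange(1, n + 1)
    return 1 / (12 * n) + np.sum((u - (2 * i - 1) / (2 * n)) ** 2)

rng = np.random.default_rng(7)
print(cramer_von_mises(rng.normal(size=7), norm.cdf))   # n = 7, as in the paper
```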

Journal ArticleDOI
TL;DR: Principal differential analysis (PDA), as discussed by the authors, identifies a linear differential operator L = w_0 I + w_1 D + ... + w_{m-1} D^{m-1} + D^m that comes as close as possible to annihilating a sample of functions.
Abstract: Functional data are observations that are either themselves functions or are naturally representable as functions. When these functions can be considered smooth, it is natural to use their derivatives in exploring their variation. Principal differential analysis (PDA) identifies a linear differential operator L = w_0 I + w_1 D + ... + w_{m-1} D^{m-1} + D^m that comes as close as possible to annihilating a sample of functions. Convenient procedures for estimating the m weighting functions w_j are developed. The estimated differential operator L is analogous to the projection operator used as the data annihilator in principal components analysis and thus can be viewed as a type of data reduction or exploration tool. The corresponding linear differential equation may also have a useful substantive interpretation. Modelling and regularization features can also be incorporated into PDA.
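
A rough sketch of the least squares idea for m = 2 with constant weights (the paper estimates weighting functions w_j(t), which is not attempted here): given curves with x'' + x = 0, solving w_0 x + w_1 Dx ≈ -D²x in least squares recovers w_0 = 1, w_1 = 0:

```python
import numpy as np

rng = np.random.default_rng(8)
t = np.linspace(0, 2 * np.pi, 200)
# curves satisfying x'' + x = 0, i.e. annihilated by L = I + D^2
curves = [a * np.sin(t) + b * np.cos(t) for a, b in rng.normal(size=(20, 2))]

rows, rhs = [], []
for x in curves:
    dx = np.gradient(x, t)            # finite-difference derivatives
    d2x = np.gradient(dx, t)
    rows.append(np.column_stack([x, dx]))
    rhs.append(-d2x)
w0, w1 = np.linalg.lstsq(np.vstack(rows), np.concatenate(rhs), rcond=None)[0]
print(w0, w1)   # close to 1 and 0: the operator D^2 + I is recovered
```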

Journal ArticleDOI
TL;DR: In this paper, it was shown that residual maximum likelihood (REML) has an exact conditional likelihood interpretation, where the conditioning is on an appropriate sufficient statistic to remove dependence on the nuisance parameters.
Abstract: SUMMARY Residual maximum likelihood (REML) estimation is often preferred to maximum likelihood estimation as a method of estimating covariance parameters in linear models because it takes account of the loss of degrees of freedom in estimating the mean and produces unbiased estimating equations for the variance parameters. In this paper it is shown that REML has an exact conditional likelihood interpretation, where the conditioning is on an appropriate sufficient statistic to remove dependence on the nuisance parameters. This interpretation clarifies the motivation for REML and generalizes directly to non-normal models in which there is a low dimensional sufficient statistic for the fitted values. The conditional likelihood is shown to be well defined and to satisfy the properties of a likelihood function, even though this is not generally true when conditioning on statistics which depend on parameters of interest. Using the conditional likelihood representation, the concept of REML is extended to generalized linear models with varying dispersion and canonical link. Explicit calculation of the conditional likelihood is given for the one-way layout. A saddlepoint approximation for the conditional likelihood is also derived.
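
The degrees-of-freedom point can be seen in the simplest case, y_i = μ + e_i, where maximum likelihood divides the residual sum of squares by n while REML divides by n - 1 (toy data below):

```python
import numpy as np

y = np.array([4.1, 5.3, 4.8, 6.0, 5.1])
n = len(y)
resid = y - y.mean()
print("ML estimate of sigma^2:  ", np.sum(resid**2) / n)        # divisor n
print("REML estimate of sigma^2:", np.sum(resid**2) / (n - 1))  # divisor n - 1
```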

Journal ArticleDOI
TL;DR: In this paper, the authors present a survey of recent developments in response surface models, focusing on the potential or actual usefulness of response surface designs for a wide range of applications, including life testing and models in which time is a factor.
Abstract: Optimum experimental designs were originally developed by Kiefer, mainly for response surface models. This survey of recent developments emphasizes potential or actual usefulness. For linear models the construction of exact designs, particularly over irregular design regions, is stressed, as is the blocking of response surface designs. Other important areas include systematic designs that are robust against trend and designs for mixtures with irregular design regions: several industrial examples are mentioned. Both D- and c-optimum designs are found for a non-linear model of the economic response of cereal production to fertilizer level, the c-optimum design being for the conditions of maximum economic return. Locally optimum and Bayesian designs are both described. Similar results for generalized linear models lead to designs for the LD95 in a logistic model in which male and female subjects respond differently. Designs with structure in the variance suggest alternatives to the potentially wasteful product designs of Taguchi. Designs for sequential clinical trials to include random balance are presented. The last section outlines some applications, including life testing and models in which time is a factor.
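
D-optimality in miniature, for a quadratic response surface in one factor over a grid of candidate points; this brute-force search is illustrative only, standing in for the exact-design algorithms surveyed:

```python
import numpy as np
from itertools import combinations

candidates = np.linspace(-1, 1, 9)                 # candidate factor levels
X = lambda pts: np.column_stack([np.ones(len(pts)), pts, np.square(pts)])
best = max(combinations(candidates, 3),
           key=lambda p: np.linalg.det(X(np.array(p)).T @ X(np.array(p))))
print(best)   # the classical support {-1, 0, 1} for the quadratic model
```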

Journal ArticleDOI
TL;DR: In this article, a sample reuse method for dependent data, based on a cross between the block bootstrap and Richardson extrapolation, is proposed, where instead of simulating a same size resample by resampling blocks and placing them end-to-end, it analyses the blocks directly and employs a variant of Richardson extrapolated to adjust for block size.
Abstract: We suggest a sample reuse method for dependent data, based on a cross between the block bootstrap and Richardson extrapolation. Instead of simulating a same-size resample by resampling blocks and placing them end to end, it analyses the blocks directly and employs a variant of Richardson extrapolation to adjust for block size. A simple empirical rule, also based on Richardson extrapolation, is suggested for empirically selecting the block size. Performance in the contexts of distribution and bias estimation is discussed via theoretical analysis and numerical simulation.
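
A sketch of the Richardson-extrapolation step alone, using non-overlapping block means of a simulated AR(1) series: if a block-based estimate v(b) of the long-run variance has bias of order 1/b, then 2v(2b) - v(b) cancels the leading bias term. This illustrates the principle, not the paper's full method:

```python
import numpy as np

def block_var(x, b):
    m = len(x) // b
    means = x[:m * b].reshape(m, b).mean(axis=1)
    return b * means.var(ddof=1)          # block estimate of the long-run variance

rng = np.random.default_rng(9)
e = rng.normal(size=20_000)
x = np.empty_like(e)
x[0] = e[0]
for i in range(1, len(e)):
    x[i] = 0.5 * x[i - 1] + e[i]          # AR(1); long-run variance = 1/(1-0.5)^2 = 4

v_b, v_2b = block_var(x, 50), block_var(x, 100)
print("plain:", v_2b, "  extrapolated:", 2 * v_2b - v_b)
```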

Journal ArticleDOI
TL;DR: In this paper, the influence of the tail weight of the error distribution is addressed in the setting of choosing threshold and truncation parameters, and different approaches to correction for stochastic design are suggested.
Abstract: SUMMARY Concise asymptotic theory is developed for non-linear wavelet estimators of regression means, in the context of general error distributions, general designs, general normalizations in the case of stochastic design, and non-structural assumptions about the mean. The influence of the tail weight of the error distribution is addressed in the setting of choosing threshold and truncation parameters. Mainly, the tail weight is described in an extremely simple way, by a moment condition; previous work on this topic has generally imposed the much more stringent assumption that the error distribution be normal. Different approaches to correction for stochastic design are suggested. These include conventional kernel estimation of the design density, in which case the interaction between the smoothing parameters of the non-linear wavelet estimator and the linear kernel method is described.

Journal ArticleDOI
TL;DR: The problem of assessing who is the guilty party is discussed, taking particular account of the following complicating features: the evidence at the scene of the crime may be unreliable; the suspects examined may have been chosen by means of a search process, which might itself be informative.
Abstract: A murder has been committed and there is a known population of possible suspects. The identification evidence available, based on information at the scene of the crime, is that the criminal may have a certain characteristic. Information may also be available on a set of suspects about which of them have the characteristic. We discuss the problem of assessing who is the guilty party, taking particular account of the following complicating features: the evidence at the scene of the crime may be unreliable; the suspects examined may have been chosen by means of a search process, which might itself be informative. We also examine the effect of the assumed population dependence structure for the relevant identification characteristics.

Journal ArticleDOI
TL;DR: In this paper, a Gibbs sampling approach for generating from the conditional distribution is proposed, which enables Monte Carlo exact conditional tests to be performed, including tests for goodness of fit of the all-two-way interaction model for a 2⁸ table and of a simple logistic model.
Abstract: SUMMARY The form of the exact conditional distribution of a sufficient statistic for the interest parameters, given a sufficient statistic for the nuisance parameters, is derived for a generalized linear model with canonical link. General results for log-linear and logistic models are given. A Gibbs sampling approach for generating from the conditional distribution is proposed, which enables Monte Carlo exact conditional tests to be performed. Examples include tests for goodness of fit of the all-two-way interaction model for a 2⁸ table and of a simple logistic model. Tests against non-saturated alternatives are also considered.

Journal ArticleDOI
TL;DR: In this paper, the Poisson and multinomial large sample distributions of log-linear model parameter estimators are derived and compared within this constraint equation context; reparameterizations are thereby avoided.
Abstract: SUMMARY We introduce a method for comparing multinomial and Poisson log-linear models which affords an explicit description of their equivalences and differences. The method involves specifying the model in terms of constraint equations, rather than the more common freedom equations. The Poisson and multinomial large sample distributions of log-linear model parameter estimators are derived and compared within this constraint equation context; reparameterizations are thereby avoided. As a by-product, the method provides the practitioner with the adjustment that is necessary to make valid inferences about all multinomial log-linear parameters when, as a matter of convenience, the Poisson log-linear model is fitted. This implies that valid large sample inferences about the multinomial cell probabilities can be made directly by using the Poisson log-linear model. To illustrate the utility of this approach, several examples are considered.

Journal ArticleDOI
TL;DR: In this article, an analytical formula is obtained for the information bias of a general bias-adjusted profile score statistic, which is used to compare bias reducing adjustments to the profile score statistic, as well as to construct further adjustments that reduce the information bias to O(n⁻¹).
Abstract: SUMMARY The bias and information bias of the ordinary profile score statistic are both typically of order O(1). Several additive adjustments to the profile score statistic that reduce its bias to order O(n⁻¹) have been proposed. In certain situations, the information bias of these adjusted profile score statistics is also reduced to order O(n⁻¹); specifically, the modified profile likelihood of Barndorff-Nielsen yields an adjusted profile score statistic having both reduced bias and reduced information bias. In general, however, a bias reducing adjustment to the profile score statistic will not automatically reduce the order of the information bias as well. In this paper, an analytical formula is obtained for the information bias of a general bias-adjusted profile score statistic. This formula is used to compare bias reducing adjustments to the profile score statistic, as well as to construct further adjustments that reduce the information bias to O(n⁻¹). Several examples are presented to illustrate use of the formula for information bias. In particular, the information bias formula may be utilized in a criterion for choosing between orthogonal parameterizations for the conditional profile likelihood of Cox and Reid.

Journal ArticleDOI
TL;DR: In this article, the methods underlying vector generalized additive models are extended to provide additive extensions to the generalized estimating equations approaches to multivariate regression problems of Liang and Zeger and the subsequent literature, illustrated by two examples using correlated binary data concerning the presence or absence of several plant species at the same geographical site and chest pain in male workers from a workforce study.
Abstract: The methods underlying vector generalized additive models are extended to provide additive extensions to the generalized estimating equations approaches to multivariate regression problems of Liang and Zeger and the subsequent literature. The methods are illustrated by two examples using correlated binary data concerning the presence or absence of several plant species at the same geographical site and chest pain in male workers from a workforce study. With minor modification, the extensions are shown to apply to longitudinal data.

Journal ArticleDOI
TL;DR: This work derives models for infection rates that incorporate contact rates between individuals and variables affecting susceptibility to infection, and develops thinned counting process models that reduce to a proportional hazards model when the contact process is not observable.
Abstract: SUMMARY Differences in infection rates among types of individuals within a population can arise from differences in amount of exposure to infection or from differences in susceptibility to infection. We derive models for infection rates that incorporate contact rates between individuals and variables affecting susceptibility to infection. We emphasize the distinction between controlling for exposure opportunity (expected exposure) and actual exposure. We present a marked counting process model for the combined contact and infection transmission processes. When the contact process is not observable, we develop thinned counting process models that reduce to a proportional hazards model. We show that the different commonly used parameters for evaluating covariate effects, such as vaccine efficacy, form a hierarchy depending on the amount of information available about the components of the transmission system.

Journal ArticleDOI
TL;DR: In this article, the problem of simultaneously testing a non-hierarchical finite family of hypotheses is considered, and several definitions have been introduced, including four types of test.
Abstract: This paper considers the problem of simultaneously testing a non-hierarchical finite family of hypotheses. Several definitions are introduced and used to shed interesting light on the interrelationship between the four types of test that have previously been proposed.

Journal ArticleDOI
TL;DR: A unified treatment of switching regression models driven by a general binary process is presented, and a Bayesian testing procedure is developed that can be generalized to accommodate other Bayesian-like procedures.
Abstract: We propose algorithms based on random draws from predictive distributions of unknown quantities (missing values, for instance). This procedure can either be iterative, which is a special variation of the Gibbs sampler, or be sequential, which is a variation of sequential imputation. In the latter case one can update the posterior distribution with new observations easily. The methods proposed have intuitive statistical implications and can be generalized to accommodate other Bayesian-like procedures. We display some applications of the method in connection with the Bayesian bootstrap, classification, hierarchical models and selection of variables. In particular, as an application of the method, we present a unified treatment of switching regression models driven by a general binary process, and we develop a Bayesian testing procedure. Some simulations and a real example are used to illustrate the methods proposed.
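
One of the applications mentioned, the Bayesian bootstrap, is especially compact: posterior draws for a functional of the distribution are obtained by reweighting the observations with Dirichlet(1, ..., 1) weights rather than resampling them. A sketch for the mean of synthetic data:

```python
import numpy as np

rng = np.random.default_rng(10)
x = rng.normal(loc=2.0, size=50)
# each posterior draw reweights the data with flat Dirichlet weights
draws = [rng.dirichlet(np.ones(len(x))) @ x for _ in range(2000)]
print("posterior mean:", np.mean(draws), " posterior sd:", np.std(draws))
```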

Journal ArticleDOI
TL;DR: In this article, a graphical tool for checking normality of independently and identically distributed observations based on the third derivative T₃ of the logarithm of the empirical moment-generating function is proposed.
Abstract: A new graphical tool for checking normality of independently and identically distributed observations based on the third derivative T₃ of the logarithm of the empirical moment-generating function is proposed. A significant departure of this function from the horizontal zero line is indicative of non-normality. Behaviour in the neighbourhood of 0 indicates the type of departure from the normal distribution. Examples show that the T₃-method, when used in conjunction with the classical probability plot, can provide valuable insight into the data and be a powerful graphical tool.
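
A sketch of the raw T₃ curve via numerical differentiation of the empirical cumulant generating function K(t) = log mean exp(tX); the paper's version adds standardization and graphical guide bands, which are omitted here:

```python
import numpy as np

def T3(x, t, h=1e-2):
    K = lambda s: np.log(np.mean(np.exp(s * x)))   # empirical cumulant generating fn
    # central finite-difference approximation to K'''(t)
    return (K(t + 2*h) - 2*K(t + h) + 2*K(t - h) - K(t - 2*h)) / (2 * h**3)

rng = np.random.default_rng(11)
x = rng.normal(size=500)
x = (x - x.mean()) / x.std()                       # standardize first
print([round(T3(x, t), 2) for t in np.linspace(-1, 1, 9)])  # hovers near 0 if normal
```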

Journal ArticleDOI
TL;DR: In this article, a first-order modification to Pearson's statistic is proposed which induces local orthogonality with the regression parameters, resulting in substantial simplifications and increased power.
Abstract: SUMMARY Approximations to the first three moments of Pearson's statistic are obtained for noncanonical generalized linear models, extending the results of McCullagh. A first-order modification to Pearson's statistic is proposed which induces local orthogonality with the regression parameters, resulting in substantial simplifications and increased power. Accurate and easily computed approximations to the moments of the modified Pearson statistic conditional on the estimated regression parameters are obtained for testing goodness of fit to sparse data. Both the Pearson statistic and its modification are shown to be asymptotically independent of the regression parameters. Simulation studies and examples are given.

Journal ArticleDOI
Abstract: SUMMARY This paper discusses design issues in 'ecological studies'- epidemiological studies in which the relationship between disease and behavioural and environmental determinants is studied at the population rather than the individual level. The number of study populations has little relevance beyond a certain point, the power and precision being limited by the total number of disease events and by the size of the sample surveys used to estimate the distributions of determinants within populations. In most circumstances, optimal design requires the size of the sample surveys in each population to be related to the number of disease events which will occur in it, and for sampling to be stratified by age and/or sex.