
Showing papers in "Biometrika in 1991"


Journal ArticleDOI
TL;DR: In this article, a generalization of the coefficient of determination R2 to general regression models is discussed, and a modification of an earlier definition to allow for discrete models is proposed.
Abstract: SUMMARY A generalization of the coefficient of determination R2 to general regression models is discussed. A modification of an earlier definition to allow for discrete models is proposed.

5,085 citations
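The R² in question can be computed from just two log-likelihoods. A minimal sketch, assuming the definition being generalized is the likelihood-based R² = 1 − (L(0)/L(θ̂))^{2/n} and the modification for discrete models is division by its maximum attainable value (function name and interface are ours):

```python
import numpy as np

def generalized_r2(loglik_null, loglik_fitted, n):
    """Likelihood-based R^2, rescaled by its maximum attainable value.
    Sketch of the construction described in the abstract."""
    # Generalized R^2: 1 - (L0/L1)^(2/n), computed on the log scale
    r2 = 1.0 - np.exp((2.0 / n) * (loglik_null - loglik_fitted))
    # For discrete models the likelihood is bounded above by 1, so R^2
    # cannot reach 1; dividing by its maximum value repairs this.
    r2_max = 1.0 - np.exp((2.0 / n) * loglik_null)
    return r2 / r2_max
```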


Journal ArticleDOI
TL;DR: In this paper, a conceptually simple but general algorithm for the estimation of the fixed effects, random effects, and components of dispersion in generalized linear models with random effects is proposed.
Abstract: SUMMARY A conceptually very simple but general algorithm for the estimation of the fixed effects, random effects, and components of dispersion in generalized linear models with random effects is proposed. Conditions are described under which the algorithm yields approximate maximum likelihood or quasi-maximum likelihood estimates of the fixed effects and dispersion components, and approximate empirical Bayes estimates of the random effects. The algorithm is applied to two data sets to illustrate the estimation of components of dispersion and the modelling of overdispersion.

1,197 citations


Journal ArticleDOI
TL;DR: In this article, the authors modify the estimating equations of Prentice to estimate the odds ratios and show that the parameter estimates for the logistic regression model for the marginal probabilities appear slightly more efficient when using the odds ratio parameterization.
Abstract: SUMMARY Moment methods for analyzing repeated binary responses have been proposed by Liang & Zeger (1986), and extended by Prentice (1988). In their generalized estimating equations, both Liang & Zeger (1986) and Prentice (1988) estimate the parameters associated with the expected value of an individual's vector of binary responses as well as the correlations between pairs of binary responses. Because the odds ratio has many desirable properties, and some investigators may find the odds ratio is easier to interpret, we discuss modelling the association between binary responses at pairs of times with the odds ratio. We then modify the estimating equations of Prentice to estimate the odds ratios. In simulations, the parameter estimates for the logistic regression model for the marginal probabilities appear slightly more efficient when using the odds ratio parameterization.

429 citations
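Modelling pairwise association through the odds ratio requires converting two marginal probabilities plus an odds ratio back into a joint success probability for each pair of responses. A sketch of the standard inversion (the Plackett construction; that the paper uses exactly this parameterization is our assumption):

```python
import math

def joint_prob(p1, p2, psi):
    """P(Y1 = 1, Y2 = 1) for two binary responses with marginal success
    probabilities p1, p2 and odds ratio psi (Plackett construction)."""
    if psi == 1.0:
        return p1 * p2                      # independence
    a = 1.0 + (p1 + p2) * (psi - 1.0)
    disc = a * a - 4.0 * psi * (psi - 1.0) * p1 * p2
    return (a - math.sqrt(disc)) / (2.0 * (psi - 1.0))

# Example: marginals 0.5 and 0.5 with odds ratio 2 give p11 ~ 0.293.
print(joint_prob(0.5, 0.5, 2.0))
```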


Journal ArticleDOI
TL;DR: In this article, a procedure is proposed for the analysis of multilevel nonlinear models using a linearization, and the case of log linear models for discrete response data is studied in detail.
Abstract: SUMMARY A procedure is proposed for the analysis of multilevel nonlinear models using a linearization. The case of log linear models for discrete response data is studied in detail. Nonlinear models arise in a number of circumstances, notably when modelling discrete data. In this paper we consider the multilevel nonlinear model. As in linear multilevel models, we shall consider the general case where any of the model coefficients can be random at any level, and where the random parameters may also be specified functions of the fixed parameter estimates, discussed by H. Goldstein, R. Prosser and J. Rasbash in an as yet unpublished report. In the next two sections we set out the model and define notation; this is followed by a section on estimation and then some examples.

395 citations


Journal ArticleDOI
TL;DR: The result is a bandwidth selector with the, by nonparametric standards, extremely fast asymptotic rate of convergence of n^{-1/2}, where n denotes the sample size.
Abstract: A bandwidth selection method is proposed for kernel density estimation. This is based on the straightforward idea of plugging estimates into the usual asymptotic representation for the optimal bandwidth, but with two important modifications. The result is a bandwidth selector with the, by nonparametric standards, extremely fast asymptotic rate of convergence of n^{-1/2}, where n denotes the sample size. Comparison is given to other bandwidth selection methods, and small sample impact is investigated.

282 citations
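The "plugging in" step is simple once the unknown curvature functional R(f″) has been estimated; the paper's contribution lies in how that functional is estimated. A toy version that uses a normal reference for R(f″) instead, so it reduces to the familiar rule of thumb rather than the authors' selector:

```python
import numpy as np

def plug_in_bandwidth(x):
    """Plug-in bandwidth for a Gaussian kernel: insert an estimate of
    the curvature functional R(f'') into the optimal-bandwidth formula.
    Here R(f'') comes from a normal reference; the actual method
    estimates it from the data with pilot bandwidths."""
    n = len(x)
    sigma = np.std(x, ddof=1)
    r_k = 1.0 / (2.0 * np.sqrt(np.pi))              # roughness of the kernel
    mu2 = 1.0                                       # its second moment
    r_f2 = 3.0 / (8.0 * np.sqrt(np.pi) * sigma**5)  # normal-reference R(f'')
    return (r_k / (mu2**2 * r_f2 * n)) ** 0.2       # AMISE-optimal h

x = np.random.default_rng(0).normal(size=500)
print(plug_in_bandwidth(x))   # ~ 1.06 * sigma * n**(-1/5)
```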


Journal ArticleDOI
TL;DR: In this paper, the construction of boundary kernels as solutions of a variational problem is addressed and representations in orthogonal polynomials are given, including explicit solutions for the most important cases.
Abstract: SUMMARY Kernel estimators for smooth curves like density, spectral density or regression functions require modifications when estimating near endpoints of the support, both for practical and asymptotic reasons. The construction of such boundary kernels as solutions of a variational problem is addressed and representations in orthogonal polynomials are given, including explicit solutions for the most important cases. Based on explicit formulae for certain functionals of the kernels, it is shown that local bandwidth variation might be indicated near boundaries. Various bandwidth variation schemes are discussed.

273 citations


Journal ArticleDOI
TL;DR: A simulation study in which the true model is an infinite-order autoregression shows that, even in moderate sample sizes, AICC provides substantially better model selections than AIC.
Abstract: SUMMARY The Akaike Information Criterion, AIC (Akaike, 1973), and a bias-corrected version, AICC (Sugiura, 1978; Hurvich & Tsai, 1989) are two methods for selection of regression and autoregressive models. Both criteria may be viewed as estimators of the expected Kullback-Leibler information. The bias of AIC and AICC is studied in the underfitting case, where none of the candidate models includes the true model (Shibata, 1980, 1981; Parzen, 1978). Both normal linear regression and autoregressive candidate models are considered. The bias of AICC is typically smaller, often dramatically smaller, than that of AIC. A simulation study in which the true model is an infinite-order autoregression shows that, even in moderate sample sizes, AICC provides substantially better model selections than AIC.

272 citations
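For reference, the two criteria differ only by a penalty term that explodes as the parameter count k approaches the sample size n, which is what curbs overfitting in the simulations described. A minimal sketch in the generic log-likelihood form (our notation):

```python
def aic(loglik, k):
    """Akaike information criterion for a model with k parameters."""
    return -2.0 * loglik + 2.0 * k

def aicc(loglik, k, n):
    """Bias-corrected AIC: the extra penalty 2k(k+1)/(n-k-1) is
    negligible for n >> k but large when k is close to n."""
    return aic(loglik, k) + 2.0 * k * (k + 1.0) / (n - k - 1.0)
```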


Journal ArticleDOI
TL;DR: In this paper, the signed log likelihood ratio r for a one-dimensional interest parameter can be modified as r* = r + r^{-1} log(u/r) so that r* is asymptotically standard normally distributed with error of order O(n^{-3/2}), u being interpretable as a test statistic.
Abstract: SUMMARY The signed log likelihood ratio r for a one-dimensional interest parameter can be modified as r* = r + r^{-1} log(u/r) so that r* is asymptotically standard normally distributed with error of order O(n^{-3/2}), u being interpretable as a test statistic. A new, more direct derivation of this result is given, which leads to a more explicit expression for u and provides a better handle on the error term. The relation of the tail area approximation determined by r* to a generalization of the Lugannani-Rice approximation is discussed, and the applicability of r* to the analysis of residual variation is indicated.

247 citations
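The modification itself is a one-line transformation once r and u are in hand; deriving an explicit u is the substance of the paper. A direct transcription, with illustrative numbers only:

```python
import math
from statistics import NormalDist

def r_star(r, u):
    """Modified signed root of the log likelihood ratio; standard
    normal to O(n^{-3/2}).  Computing u is model-specific and is the
    hard part the paper addresses."""
    return r + math.log(u / r) / r

# A one-sided p-value then follows from the normal tail (toy values).
print(1.0 - NormalDist().cdf(r_star(2.1, 1.9)))
```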


Journal ArticleDOI
TL;DR: In this paper, a number of ways of calculating exact Monte Carlo p-values by sequential sampling are discussed. In particular, a sequential method is proposed for dealing with situations in which values can only be conveniently generated using a Markov chain, conditioned to pass through the observed data.
Abstract: SUMMARY The assessment of statistical significance by Monte Carlo simulation may be costly in computer time. This paper looks at a number of ways of calculating exact Monte Carlo p-values by sequential sampling. Such p-values are shown to have properties similar to those obtained by sampling with a fixed sample size. Both standard and generalized Monte Carlo procedures are discussed and, in particular, a sequential method is proposed for dealing with situations in which values can only be conveniently generated using a Markov chain, conditioned to pass through the observed data.

213 citations
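A common form of the sequential scheme, in the spirit of the abstract (the stopping rule below is a standard variant, not necessarily the authors' exact rule): simulate until either m null statistics reach the observed value or a cap is hit, so clearly non-significant cases stop early and save computer time.

```python
def sequential_mc_pvalue(t_obs, simulate_statistic, m=10, n_max=999):
    """Sequential Monte Carlo p-value.  simulate_statistic() draws one
    test statistic under the null hypothesis."""
    exceed = n = 0
    while exceed < m and n < n_max:
        n += 1
        if simulate_statistic() >= t_obs:
            exceed += 1
    if exceed == m:
        return m / n                      # stopped early
    return (exceed + 1) / (n_max + 1)     # fixed-sample estimate
```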


Journal ArticleDOI
TL;DR: In this article, the authors provide small error variance approximations to the distribution of variates subject to contamination by measurement error, which provide the basis for the analysis of the properties of estimators and tests and less formal procedures when measurement error is present.
Abstract: SUMMARY Measurement error causes the distribution generating data to differ from the distribution that is of substantive interest. To understand the effect of measurement error on the information produced by statistical procedures it is thus necessary to understand the distortions induced by measurement error. This paper sheds light on this by providing small error variance approximations to the distribution of variates subject to contamination by measurement error. These provide the basis for the analysis of the properties of estimators and tests and less formal procedures when measurement error is present, and the basis for the development of tests to detect measurement error and of models which incorporate its influence.

197 citations


Journal ArticleDOI
TL;DR: In this article, several types of estimators are developed which are unbiased for the population mean or total under stratified adaptive cluster sampling.
Abstract: SUMMARY Stratified adaptive cluster sampling refers to designs in which, following an initial stratified sample, additional units are added to the sample from the neighbourhood of any selected unit with an observed value that satisfies a condition of interest. If any of the added units in turn satisfies the condition, still more units are added to the sample. Estimation of the population mean or total with the stratified adaptive cluster designs is complicated by the possibility that a selection in one stratum may result in the addition of units from other strata to the sample, so that observations in separate strata are not independent. Since conventional estimators such as the stratified sample mean are biased with the adaptive designs of this paper, several types of estimators are developed which are unbiased for the population mean or total with stratified adaptive cluster sampling.

Journal ArticleDOI
TL;DR: In this article, a general modified score test statistic whose null asymptotic distribution is chi-squared to order n^{-1}, where n is the sample size, is given.
Abstract: SUMMARY We give a general modified score test statistic whose null distribution is chi-squared to order n^{-1}, where n is the sample size. The modified statistic depends on the joint cumulants of log likelihood derivatives for the full data. Some applications are discussed. Following Cox & Reid (1987a) we also derive a general formula for Bartlett-type corrections to improve test statistics whose null asymptotic distributions are chi-squared.
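The shape of a Bartlett-type correction is a polynomial rescaling of the statistic; the difficulty sits entirely in computing its O(n^{-1}) coefficients from the joint cumulants mentioned above. A sketch of the adjustment step only, with the coefficients as placeholder inputs:

```python
def bartlett_type_adjusted(s, c1, c2, c3):
    """Bartlett-type correction of a score-like statistic S whose null
    distribution is chi-squared only to first order.  c1, c2, c3 are
    model-specific O(1/n) coefficients built from cumulants of
    log-likelihood derivatives (not computed here)."""
    return s * (1.0 - (c1 + c2 * s + c3 * s * s))
```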

Journal ArticleDOI
TL;DR: In this paper, a kernel density estimator for length biased data which derives from smoothing the nonparametric maximum likelihood estimator is proposed and investigated, which has various advantages over an alternative method suggested by Bhattacharyya, Franklin & Richardson (1988).
Abstract: A new kernel density estimator for length biased data which derives from smoothing the nonparametric maximum likelihood estimator is proposed and investigated. It has various advantages over an alternative method suggested by Bhattacharyya, Franklin & Richardson (1988): it is necessarily a probability density, it is particularly better behaved near zero, it has better asymptotic mean integrated squared error properties and it is more readily extendable to related problems such as density derivative estimation.
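Smoothing the nonparametric MLE under length-biased sampling gives a weighted kernel estimate: each observation is downweighted by its value, since it was proportionally that much more likely to be sampled. A sketch under that reading (Gaussian kernel; our implementation):

```python
import numpy as np

def length_biased_kde(x_grid, data, h):
    """Kernel density estimate for length-biased data obtained by
    smoothing the NPMLE, which puts mass proportional to 1/x_i on each
    observation x_i.  Inputs are numpy arrays of positive values."""
    data = np.asarray(data, dtype=float)
    x_grid = np.asarray(x_grid, dtype=float)
    w = 1.0 / data
    w /= w.sum()                                    # NPMLE weights
    z = (x_grid[:, None] - data[None, :]) / h
    kern = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return (kern * w[None, :]).sum(axis=1) / h
```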

Journal ArticleDOI
TL;DR: In this paper, an exact expression for Fisher's information matrix, based upon the moment generating function of the distribution of covariates, is calculated for the Poisson regression model, and the resulting asymptotic variance of the maximum likelihood estimate of the parameters is used to calculate the sample size required to test hypotheses about the parameters at a specified significance and power.
Abstract: SUMMARY For the Poisson regression model, an exact expression for Fisher's information matrix, based upon the moment generating function of the distribution of covariates, is calculated. This parallels a similar, approximate, calculation by Whittemore (1981) for logistic regression. The resulting asymptotic variance of the maximum likelihood estimate of the parameters is used to calculate the sample size required to test hypotheses about the parameters at a specified significance and power. Methods for calculating sample size are derived for various distributions of a single covariate, and for a family of multivariate exponential-type distributions of multiple covariates. The procedures are illustrated with two examples.
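The recipe implied by the abstract: form the expected information from the covariate distribution, invert it for the per-observation variance of the slope estimate, and solve the usual power equation for n. A Monte Carlo sketch for one standard normal covariate (the paper does the expectation analytically through the moment generating function; the Wald-type power equation here is our simplification):

```python
import numpy as np
from statistics import NormalDist

def poisson_sample_size(beta0, beta1, alpha=0.05, power=0.8, n_mc=200_000):
    """Sample size to detect slope beta1 in Poisson regression with a
    single standard normal covariate."""
    rng = np.random.default_rng(1)
    x = rng.normal(size=n_mc)
    mu = np.exp(beta0 + beta1 * x)
    # Per-observation expected Fisher information for (beta0, beta1)
    info = np.array([[np.mean(mu),     np.mean(mu * x)],
                     [np.mean(mu * x), np.mean(mu * x * x)]])
    v = np.linalg.inv(info)[1, 1]       # asymptotic variance of beta1-hat
    z = NormalDist().inv_cdf
    return int(np.ceil((z(1 - alpha / 2) + z(power)) ** 2 * v / beta1 ** 2))

print(poisson_sample_size(beta0=0.1, beta1=0.3))
```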

Journal ArticleDOI
TL;DR: In this article, a technique is presented for the estimation of the frequency of a sinusoid in the presence of noise, based on fitting ARMA (2, 2) models iteratively, in a special way.
Abstract: SUMMARY A technique is presented for the estimation of the frequency of a sinusoid in the presence of noise. The technique is based on fitting ARMA (2, 2) models iteratively, in a special way. The estimator is shown to be strongly consistent and as efficient as the least squares estimator of the frequency, or the periodogram maximizer. A simple accelerated version of the technique is shown to converge in a small number of iterations, the number depending on the accuracy of the initiator of the procedure. The results of a number of simulations are reported.
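The benchmark the technique is measured against, the periodogram maximizer, is easy to write down; the iterative ARMA(2, 2) scheme itself is the paper's contribution and is not reproduced here. A grid-search sketch:

```python
import numpy as np

def periodogram_frequency(y, oversample=8):
    """Estimate a sinusoid's frequency (in cycles per sample) by
    maximizing a zero-padded periodogram."""
    m = oversample * len(y)                # zero-padding refines the grid
    spec = np.abs(np.fft.rfft(y - y.mean(), n=m)) ** 2
    return np.fft.rfftfreq(m)[np.argmax(spec)]

t = np.arange(500)
y = np.cos(2 * np.pi * 0.123 * t) + np.random.default_rng(2).normal(size=500)
print(periodogram_frequency(y))            # ~ 0.123
```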

Journal ArticleDOI
TL;DR: In this article, the authors present statistical procedures for deciding if periodicities exist in the autocorrelation function of a seasonal time series after removal of seasonal means and standard deviations.
Abstract: SUMMARY This paper presents statistical procedures for deciding if periodicities exist in the autocorrelation function of a seasonal time series after removal of seasonal means and standard deviations. The procedures are based on asymptotic properties of a Fourier-transformed version of the estimated periodic autocorrelation function. Analyses of simulated and actual data sets illustrate the usefulness of the proposed methods.

Journal ArticleDOI
TL;DR: In this paper, a sampling-based approach is used to develop desired marginal posterior distributions and their features and a useful extension is presented which treats the case of ordered polytomous response.
Abstract: Previous attempts at implementing fully Bayesian nonparametric bioassay have enjoyed limited success due to computational difficulties. We show here how this problem may be generally handled using a sampling based approach to develop desired marginal posterior distributions and their features. A useful extension is presented which treats the case of ordered polytomous response. Illustrative examples are provided.

Journal ArticleDOI
Clive R. Loader1
TL;DR: In this article, a random change of time scale transforms the empirical process into a Poisson process which enables us to derive large deviation approximations to the significance level of the likelihood ratio test.
Abstract: SUMMARY We discuss inference based on the likelihood ratio process for a hazard rate change point. A random change of time scale transforms the empirical process into a Poisson process which enables us to derive large deviation approximations to the significance level of the likelihood ratio test. We derive approximate confidence regions for the change point and joint confidence regions for the change point and size of change. The effect of censorship is also discussed. The methods are illustrated using Stanford heart transplant data, for which the 70 days following the transplant are found to be most critical.
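For fully observed survival times and a piecewise-constant hazard with one change, the profile log likelihood ratio over candidate change points has a closed form. A simplified sketch (no censoring, which the paper does handle):

```python
import numpy as np

def hazard_changepoint_lrt(times):
    """Profile likelihood ratio test for a single change point in a
    piecewise-constant hazard; complete (uncensored) positive survival
    times only."""
    t = np.sort(np.asarray(times, dtype=float))
    n = len(t)
    ll_null = n * np.log(n / t.sum()) - n      # constant-hazard fit
    best_lr, best_tau = -np.inf, None
    for tau in t[1:-1]:                        # candidate change points
        d1 = np.sum(t <= tau)                  # events before tau
        d2 = n - d1                            # events after tau
        t1 = np.minimum(t, tau).sum()          # exposure before tau
        t2 = np.maximum(t - tau, 0.0).sum()    # exposure after tau
        if d2 == 0 or t2 == 0:
            continue
        ll = (d1 * np.log(d1 / t1) - d1) + (d2 * np.log(d2 / t2) - d2)
        lr = 2.0 * (ll - ll_null)
        if lr > best_lr:
            best_lr, best_tau = lr, tau
    return best_lr, best_tau
```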

Journal ArticleDOI
TL;DR: In this article, sequential and group sequential procedures are proposed for monitoring repeated t, χ² or F statistics. These procedures can be used to construct sequential hypothesis tests or repeated confidence intervals when the parameter of interest is a normal mean with unknown variance or a multivariate normal mean with variance matrix known or known up to a scale factor.
Abstract: SUMMARY Sequential and group sequential procedures are proposed for monitoring repeated t, χ² or F statistics. These can be used to construct sequential hypothesis tests or repeated confidence intervals when the parameter of interest is a normal mean with unknown variance or a multivariate normal mean with variance matrix known or known up to a scale factor. Exact methods for calculating error probabilities and sample size distributions are described and tables of critical values needed to implement the procedures are provided.

Journal ArticleDOI
David Scott1
TL;DR: In this article, the authors compared the optimal pointwise and global window widths for mean absolute and mean squared errors for multivariate data and showed that the optimal window width is nearly equal for all dimensions.
Abstract: SUMMARY The 'curse of dimensionality' has been interpreted as suggesting that kernel methods have limited applicability in more than several dimensions. In this note, qualitative and quantitative performance measures for multivariate density estimates are examined. Optimal pointwise and global window widths for mean absolute and mean squared errors are compared for multivariate data. One result is that the optimal pointwise absolute and squared error window widths are nearly equal for all dimensions. We also show that sample size requirements predicted by absolute rather than squared error criterion are substantially less. Further reductions are realized by using a coefficient of variation criterion. Finally, an example of a 10-dimensional kernel density estimate is given. It is suggested that the true nature of the curse of dimensionality is as much the lack of full rank as sparseness of the data.
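The window widths being compared reduce, under a normal reference, to expressions like h_i = σ_i n^{-1/(d+4)}; the exponent is the curse of dimensionality in formula form. A sketch of that reference rule (our implementation of the standard formula, not the note's tables):

```python
import numpy as np

def normal_reference_bandwidths(x):
    """Per-coordinate normal-reference bandwidths for a Gaussian-kernel
    density estimate of d-dimensional data: h_i = sigma_i * n**(-1/(d+4))."""
    n, d = x.shape
    return x.std(axis=0, ddof=1) * n ** (-1.0 / (d + 4))

# With d = 10 the exponent is -1/14: halving every bandwidth would
# require roughly 2**14 times as much data.
x = np.random.default_rng(3).normal(size=(1000, 10))
print(normal_reference_bandwidths(x))
```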

Journal ArticleDOI
TL;DR: In this paper, a saddlepoint technique is used to approximate the density and tail probability of the studentized mean of a random sample, and the tail probability is similarly evaluated either by a further numerical integration or by a Laplace approximation of the Temme type.
Abstract: A saddlepoint technique is used to approximate to the density and tail probability of the studentized mean of a random sample. The motivation was to replace bootstrapping of the studentized mean in the way Davison & Hinkley (1988) used the saddlepoint approximation for the unstudentized mean. The method involves first obtaining a bivariate saddlepoint approximation, then, after a nonlinear transformation, integrating out an unwanted variable either numerically or by a Laplace approximation. The tail probability is similarly evaluated either by a further numerical integration or by a Laplace approximation of the Temme type. Two difficulties arise. (i) The nonlinearity of the transformation may result in Laplace approximations failing in the tail when the sample is not large. But numerical integration always works. (ii) In the bootstrap application the saddlepoint approximation may itself break down when the data set contains an outlier.

Journal ArticleDOI
TL;DR: In this paper, the problem of estimating the frequency and other parameters of a cyclical oscillation is considered, where the data consists of a periodic function observed subject to stationary additive noise.
Abstract: SUMMARY The problem of estimating the frequency and other parameters of a cyclical oscillation is considered, where the data consists of a periodic function observed subject to stationary additive noise. An estimation procedure is proposed and the asymptotic properties of the estimators established. Tests for unknown frequencies to be harmonics of a fundamental frequency are also developed and their asymptotic properties investigated. The procedures are applied to observations on the variable star S. Carinae.

Journal ArticleDOI
TL;DR: In this article, a hierarchical Bayes formulation for the smoothing problem is presented, where the prior distribution for the density function is the logistic normal process, which is a logistic transform of a Gaussian process and has parameters that control the degree of smoothness.
Abstract: SUMMARY Nonparametric density estimators smooth the empirical distribution function and are sensitive to the choice of smoothing parameters. This paper develops an hierarchical Bayes formulation for the smoothing problem. The prior distribution for the density function is the logistic normal process, which is a logistic transform of a Gaussian process. The covariance of the Gaussian process is a smoothing kernel and has parameters that control the degree of smoothness. The likelihood function for the smoothing parameters and their posterior density are computed from an approximation of the joint moments of the logistic normal process. The marginal predictive density mixes the conditional predictive density given the smoothing parameters with their posterior distribution. This hierarchical Bayes analysis provides a fully automated, data-dependent method for smoothing and selects the amount of smoothing that is coherent with its prior specification. A Bayesian approach to nonparametric problems is to construct a probability distribution on the relevant space of distribution functions and to update this distribution as observations become available. Families of prior distributions must not only have stochastic properties that can adequately model prior information, but also must provide tractable models. The Dirichlet process (Ferguson, 1973) has been quite successful on both counts. However, it is not directly applicable to absolutely continuous distributions because the posterior probability of the space of absolutely continuous distributions is zero. Lo (1984) convolves the Dirichlet process with a smoothing kernel to model a density function. The computational complexity of the posterior mean increases exponentially with the sample size. An alternative to the Dirichlet model is provided by the logistic normal process.

Journal ArticleDOI
TL;DR: This paper presents an asymptotic approximation of marginal tail probabilities for a real-valued function of a random vector, where the function has continuous gradient that does not vanish at the mode of the joint density of the random vector.
Abstract: SUMMARY This paper presents an asymptotic approximation of marginal tail probabilities for a real-valued function of a random vector, where the function has continuous gradient that does not vanish at the mode of the joint density of the random vector. This approximation has error O(n^{-3/2}) and improves upon a related standard normal approximation which has error O(n^{-1}). Derivation involves the application of a tail probability formula given by DiCiccio, Field & Fraser (1990) to an approximation of a marginal density derived by Tierney, Kass & Kadane (1989). The approximation can be applied for Bayesian and conditional inference as well as for approximating sampling distributions, and the accuracy of the approximation is illustrated through several numerical examples related to such applications. In the context of conditional inference, we develop refinements of the standard normal approximation to the distribution of two different signed root likelihood ratio statistics for a component of the natural parameter in exponential families.

Journal ArticleDOI
TL;DR: In this article, the asymptotic properties of the rank transform test statistic for testing for interaction in a balanced two-way classification were studied and necessary and sufficient conditions were obtained for the distribution of this rank transform statistic to be chi-squared under the null hypothesis of no interaction.
Abstract: SUMMARY The asymptotic properties of the rank transform statistic for testing for interaction in a balanced two-way classification are studied. Necessary and sufficient conditions are obtained for the asymptotic distribution of this rank transform statistic to be chi-squared under the null hypothesis of no interaction. It is shown that the rank transform test statistic for interaction is asymptotically chi-squared, divided by its degrees of freedom, when there are exactly two levels of both main effects or when there is only one main effect. In the latter case, the test detects a nested effect instead of an interaction. In all other two-way layouts, there exist values for the main effects such that, under the null hypothesis of no interaction, the expected value of the rank transform test statistic goes to infinity as the sample size increases.
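The statistic under study is the ordinary interaction F statistic computed after replacing the pooled observations by their ranks. A sketch for a balanced a × b layout with r replicates per cell (the array layout is our choice):

```python
import numpy as np
from scipy.stats import rankdata

def rank_transform_interaction_F(y):
    """Interaction F statistic of a balanced two-way ANOVA applied to
    the ranks of the pooled observations.  y has shape (a, b, r)."""
    a, b, r = y.shape
    rk = rankdata(y.ravel()).reshape(a, b, r)
    cell = rk.mean(axis=2)                       # cell means of the ranks
    row = cell.mean(axis=1, keepdims=True)
    col = cell.mean(axis=0, keepdims=True)
    grand = cell.mean()
    ss_int = r * ((cell - row - col + grand) ** 2).sum()
    ss_err = ((rk - cell[:, :, None]) ** 2).sum()
    df_int = (a - 1) * (b - 1)
    df_err = a * b * (r - 1)
    return (ss_int / df_int) / (ss_err / df_err)
```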

Journal ArticleDOI
TL;DR: In this article, Bayes, empirical Bayes and Bayes empirical Bayes solutions are given to the problems of interval estimation, decision making, and point estimation of the population size N under the model M_t, in which the probability of capture varies between sampling occasions.
Abstract: SUMMARY In multiple capture-recapture surveys, the probability of capture can vary between sampling occasions. The model accounting for this variation is known as M_t. Bayes, empirical Bayes, and Bayes empirical Bayes solutions are given to the problems of interval estimation, decision making, and point estimation of the population size N. When the number of sampling occasions is small to moderate and the number of recaptured units observed on each sampling occasion is moderate, estimates obtained from empirical Bayes and Bayes empirical Bayes methods compare closely to Bayesian methods using a reference prior distribution for the capture probabilities. However, when the number of sampling occasions is large and the number of recaptured units observed on each sampling occasion is small, inferences obtained using different reference priors can differ considerably.

Journal ArticleDOI
TL;DR: In this paper, the relative efficiencies of the methods are explored in the special case of estimating the parameters of the proportional odds model for ordinal responses, and the method Hsieh, Manski & McFadden (1985) call "conditional maximum likelihood" is shown to be essentially as efficient as maximum likelihood; the latter is considerably more difficult to implement.
Abstract: SUMMARY We consider fitting prospective regression models to data obtained by case-control or response selective sampling from a finite population with known population totals in each response category. Maximum likelihood estimation is developed and compared with two pseudo-likelihood approaches. The relative efficiencies of the methods are explored in the special case of estimating the parameters of the proportional odds model for ordinal responses. For such applications the method Hsieh, Manski & McFadden (1985) call 'conditional maximum likelihood' is shown to be essentially as efficient as maximum likelihood; the latter is considerably more difficult to implement. In contrast the use of a weighted estimate of the prospective likelihood can lead to a substantial loss of efficiency.

Journal ArticleDOI
TL;DR: Cox & Reid (1987) proposed the technique of orthogonalizing parameters to deal with the general problem of nuisance parameters within fully parametric models, obtaining a large-sample approximation to the conditional likelihood; the present paper extends this to a semiparametric setup using optimum orthogonal estimating functions (Godambe & Thompson, 1989).
Abstract: SUMMARY Cox & Reid (1987) proposed the technique of orthogonalizing parameters, to deal with the general problem of nuisance parameters, within fully parametric models. They obtained a large-sample approximation to the conditional likelihood. Along the same lines Davison (1988) studied generalized linear models. In the present paper we deal with the problem of nuisance parameters, within a semiparametric setup which includes the class of distributions associated with generalized linear models. The technique used is that of optimum orthogonal estimating functions (Godambe & Thompson, 1989). The results are related to those of Cox & Reid (1987).

Journal ArticleDOI
TL;DR: In this paper, the equivalence theorem for T-optimum designs with prior distributions is extended to situations in which there is a specified prior probability that each model is true and, conditionally on this probability, prior distributions for the parameters in the two models are specified.
Abstract: SUMMARY The design of experiments for discriminating between two rival models in the presence of prior information is analyzed. The method of Atkinson & Fedorov is extended. A theorem derived from the Kiefer-Wolfowitz General Equivalence Theorem is used to construct and check optimal designs. Some examples are provided. This paper is concerned with the design of experiments for discriminating between two regression models, one or both of which may be nonlinear in the parameters. Atkinson & Fedorov (1975a) describe T-optimum designs for this purpose which are optimum when it is known which one of the models is true. The designs, which satisfy an equivalence theorem of optimum design theory, are locally optimum, in the sense that they depend upon the values of the unknown parameters of the true model. In the present paper we extend the theory to situations in which there is a specified prior probability that each model is true and, conditionally on this probability, prior distributions for the parameters in the two models are specified. Our central result is that such designs again satisfy an equivalence theorem which can be used both for the construction of designs and for checking the optimality of a proposed design. In the next section we give the background to the problem and introduce our notation. The equivalence theorem for T-optimum designs with prior distributions is presented in § 3. Examples are in § 4. The aim of the experiment is to maximize the expected noncentrality parameter of the false model, the expectation being taken over models and over the prior distributions of the parameters; the notation is based on that of Silvey (1980, Ch. 3).

Journal ArticleDOI
TL;DR: In this paper, two approximate maximum likelihood estimates of the fractal dimension are suggested for random point patterns and planar curves, based on the distributions of the Palm probability and the spectrum.
Abstract: SUMMARY Two approximate maximum likelihood estimates of the fractal dimension are suggested for random point patterns and planar curves, based on the distributions of the Palm probability and the spectrum. For estimation of the fractal dimension of self-similar patterns in space, we usually measure the slope of a linear log-log relationship between pairs of points. The box-counting method plots the number of pixels which intersect the pattern under consideration versus length of the pixel unit, and the walking-dividers method plots the total length of polygons which approximate the considered curves versus length of the polygon's side, and so on; see Mandelbrot (1982, p. 33), for example. Using these methods, there has been a large number of papers reporting the dimensions of fractal sets, especially since Mandelbrot (1977). Second-order properties of a stochastic process such as the auto-covariance and the spectrum are also available for examining self-similarity and measuring its indices. This is especially easy for stochastic processes on a real line. For instance, Ogata (1988) and Ogata & Abe (1991) applied one-dimensional point processes to the occurrence times of major earthquakes in the world and Japan for a time span of about one century, investigating the self-similarity by the auto-covariance, spectrum, dispersion-time curves and so-called R/S statistic. In particular, fractal dimension and the Hurst number, or the self-similarity index, are estimated objectively by maximizing a spectral log likelihood. The maximum likelihood method is generally expected to provide an efficient estimate together with its standard error. Therefore, in this paper, we develop maximum likelihood methods for planar point and curve patterns. Two types of approximate likelihood functions are considered. One is equivalent to an isotropic Poisson likelihood by modelling an intensity function of points under the Palm probability, and the other is the so-called spectral likelihood based on the distribution of the periodogram, which is an estimate of the spectrum. These two independent estimation methods are compared by using a certain artificially generated clustering point pattern; then a map of epicentres of earthquakes and a topographic contour line are analyzed to compare the two methods.
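For contrast with the likelihood approach proposed here, the box-counting recipe described above fits a slope to log N(ε) against log(1/ε). A sketch for a planar point pattern (scales must stay coarse enough that the boxes are not exhausted by a finite sample):

```python
import numpy as np

def box_counting_dimension(points, n_scales=5):
    """Box-counting estimate of fractal dimension for points scaled to
    the unit square: slope of log N(eps) versus log(1/eps)."""
    sizes = 0.5 ** np.arange(1, n_scales + 1)
    counts = [len(np.unique(np.floor(points / eps), axis=0)) for eps in sizes]
    slope, _ = np.polyfit(np.log(1.0 / sizes), np.log(counts), 1)
    return slope

pts = np.random.default_rng(4).random((5000, 2))   # uniform points: slope ~ 2
print(box_counting_dimension(pts))
```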