
Showing papers in "Journal of the Royal Statistical Society, Series B (Methodological)" in 1991


Journal ArticleDOI
Simon J. Sheather, M. C. Jones
TL;DR: The key to the success of the current procedure is the reintroduction of a non-stochastic term which was previously omitted, together with use of the bandwidth to reduce bias in estimation without inflating variance.
Abstract: We present a new method for data-based selection of the bandwidth in kernel density estimation which has excellent properties. It improves on a recent procedure of Park and Marron (which itself is a good method) in various ways. First, the new method has superior theoretical performance; second, it also has a computational advantage; third, the new method has reliably good performance for smooth densities in simulations, performance that is second to none in the existing literature. These methods are based on choosing the bandwidth to (approximately) minimize good quality estimates of the mean integrated squared error. The key to the success of the current procedure is the reintroduction of a non-stochastic term which was previously omitted, together with use of the bandwidth to reduce bias in estimation without inflating variance.

2,475 citations
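
The core idea, choosing the bandwidth h to (approximately) minimise an estimate of the mean integrated squared error, can be sketched as follows. This is a minimal Python illustration that plugs a normal-reference estimate of the roughness R(f'') into the AMISE-optimal bandwidth formula; the actual Sheather-Jones selector instead estimates R(f'') with a kernel-based pilot stage, which is omitted here, and all data are simulated.

```python
import numpy as np

def plug_in_bandwidth(x):
    """AMISE-optimal bandwidth with a normal-reference estimate of R(f'').

    The Sheather-Jones selector replaces this normal-reference step by a
    kernel estimate of R(f'') computed with its own pilot bandwidth; that
    stage is omitted in this sketch.
    """
    n = len(x)
    iqr = np.percentile(x, 75) - np.percentile(x, 25)
    sigma = min(np.std(x, ddof=1), iqr / 1.349)          # robust scale estimate
    r_kernel = 1.0 / (2.0 * np.sqrt(np.pi))              # roughness R(K) of the Gaussian kernel
    r_f2 = 3.0 / (8.0 * np.sqrt(np.pi) * sigma ** 5)     # normal-reference R(f'')
    return (r_kernel / (n * r_f2)) ** 0.2                # mu_2(K) = 1 for the Gaussian kernel

def kde(x, grid, h):
    """Gaussian kernel density estimate evaluated on a grid."""
    z = (grid[:, None] - x[None, :]) / h
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (len(x) * h * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(22)
x = rng.normal(size=300)
grid = np.linspace(-4.0, 4.0, 200)
f_hat = kde(x, grid, plug_in_bandwidth(x))
```

With the normal-reference plug-in the formula collapses to the familiar 1.06 sigma n^(-1/5) rule; library routines such as scipy.stats.gaussian_kde offer comparable rule-of-thumb bandwidths rather than the Sheather-Jones choice.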




Journal ArticleDOI
TL;DR: In this paper, the theory of polynomial splines is applied to multivariate data analysis, where spline smoothing relies on a partition of a function space into two orthogonal subspaces, one containing the obvious or structural components of variation among a set of observed functions, and the other of which contains residual components.
Abstract: Multivariate data analysis permits the study of observations which are finite sets of numbers, but modern data collection situations can involve data, or the processes giving rise to them, which are functions. Functional data analysis involves infinite dimensional processes and/or data. The paper shows how the theory of L-splines can support generalizations of linear modelling and principal components analysis to samples drawn from random functions. Spline smoothing rests on a partition of a function space into two orthogonal subspaces, one of which contains the obvious or structural components of variation among a set of observed functions, and the other of which contains residual components. This partitioning is achieved through the use of a linear differential operator, and we show how the theory of polynomial splines can be applied more generally with an arbitrary operator and associated boundary constraints. These data analysis tools are illustrated by a study of variation in temperature-precipitation patterns among some Canadian weather-stations.

833 citations
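
As a rough illustration of the smoothing step only (not the L-spline, differential-operator machinery of the paper), the sketch below smooths a handful of simulated curves with ordinary cubic smoothing splines and extracts a leading mode of variation by an SVD; the data and smoothing parameter are arbitrary.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(13)
t = np.linspace(0.0, 1.0, 101)
curves = [np.sin(2 * np.pi * t) + 0.2 * rng.normal(size=t.size) for _ in range(5)]

# Smooth each observed function with a cubic smoothing spline.
smooth = np.array([UnivariateSpline(t, y, s=0.5)(t) for y in curves])

# A crude functional PCA: SVD of the centred, smoothed curves.
centred = smooth - smooth.mean(axis=0)
_, singular_values, vt = np.linalg.svd(centred, full_matrices=False)
leading_mode = vt[0]          # dominant mode of variation across the sample of curves
```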


Journal ArticleDOI
TL;DR: In this article, the point process of observations which are extreme in at least one component is considered; two new techniques for generating such models are presented, and the statistical estimation of the resulting models is discussed and illustrated with an application to oceanographic data.
Abstract: SUMMARY The classical treatment of multivariate extreme values is through componentwise ordering, though in practice most interest is in actual extreme events. Here the point process of observations which are extreme in at least one component is considered. Parametric models for the dependence between components must satisfy certain constraints. Two new techniques for generating such models are presented. Aspects of the statistical estimation of the resulting models are discussed and are illustrated with an application to oceanographic data.

525 citations


Journal ArticleDOI
TL;DR: In this article, general formulae for first-order biases of maximum likelihood estimates of the linear parameters, linear predictors, the dispersion parameter and fitted values in generalized linear models were derived.
Abstract: SUMMARY In this paper we derive general formulae for first-order biases of maximum likelihood estimates of the linear parameters, linear predictors, the dispersion parameter and fitted values in generalized linear models. These formulae may be implemented in the GLIM program to compute bias-corrected maximum likelihood estimates to order $n^{-1}$, where n is the sample size, with minimal effort by means of a supplementary weighted regression. For linear logistic models it is shown that the asymptotic bias vector of $\hat{\beta}$ is almost collinear with $\beta$. The approximate formula $\beta p/m_+$ for the bias of $\hat{\beta}$ in logistic models, where $p = \dim(\beta)$ and $m_+ = \sum_i m_i$ is the sum of the binomial indices, is derived and checked numerically.

370 citations
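
A rough numerical illustration of the approximate bias formula $\beta p/m_+$ for logistic models with Bernoulli responses (so every $m_i = 1$ and $m_+ = n$); the simulated data are arbitrary, and the paper's supplementary weighted-regression correction in GLIM is not reproduced.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(19)
n, p = 60, 3                                   # small n so the O(1/n) bias is visible
X = sm.add_constant(rng.normal(size=(n, p - 1)))
beta = np.array([-0.5, 1.0, 1.5])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta)))   # Bernoulli responses, all m_i = 1

beta_hat = sm.GLM(y, X, family=sm.families.Binomial()).fit().params
m_plus = n                                     # sum of the binomial indices
bias_approx = p * beta_hat / m_plus            # beta_hat plugged into beta*p/m_+
beta_corrected = beta_hat - bias_approx        # first-order bias-corrected estimate
```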



Journal ArticleDOI
TL;DR: The authors identify the structure of an optimal subclassification, i.e., one in which the treated and control subjects in the same subclass are, on average, as similar as possible with respect to observed covariates.
Abstract: SUMMARY An empirical investigation of the effects of a treatment is an observational study if it involves the comparison of treated and control groups that were not formed by randomization. In such studies, treated and control groups may differ systematically with respect to pretreatment measures or covariates, and addressing these pretreatment differences is a central concern. Matching and subclassification are two standard methods of adjusting for observed pretreatment differences; they may be used alone (e.g. Cochran (1968)) or in conjunction with analytical or model-based adjustments (e.g. Rubin (1973, 1979), Holford (1978), Rosenbaum and Rubin (1984), section 3.3, and Rosenbaum (1987, 1988a)). In particular, Rubin's simulation studies suggest that model-based adjustments applied to matched or subclassified samples are more robust than model-based adjustments applied to unmatched samples. The purpose of this paper is to identify the structure of an optimal subclassification, i.e. one in which the treated and control subjects in the same subclass are, on average, as similar as possible with respect to observed covariates. It turns out that this structure is simple and intuitive. While adjustments may control imbalances in observed covariates, they cannot

287 citations
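
The paper characterises optimal subclassification itself (solved in practice with network-flow methods); as a loosely related sketch only, the snippet below performs optimal 1-1 pair matching by minimising total within-pair covariate distance with an assignment solver. The covariates and group sizes are made up.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

rng = np.random.default_rng(18)
treated = rng.normal(loc=0.5, size=(20, 2))    # covariates of treated subjects
control = rng.normal(loc=0.0, size=(60, 2))    # larger reservoir of controls

cost = cdist(treated, control)                 # pairwise covariate distances
t_idx, c_idx = linear_sum_assignment(cost)     # minimises the total within-pair distance
pairs = list(zip(t_idx, c_idx))
total_distance = cost[t_idx, c_idx].sum()
```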



Journal ArticleDOI
TL;DR: In this article, semiparametric estimation and inference in a logistic regression model with measurement error in the predictors are described; the proposed parameter estimate relies on kernel regression techniques and is asymptotically normally distributed and computationally feasible.
Abstract: SUMMARY We describe semiparametric estimation and inference in a logistic regression model with measurement error in the predictors. The particular measurement error model consists of a primary data set in which only the response Y and a fallible surrogate W of the true predictor X are observed, plus a smaller validation data set for which (Y, X, W) are observed. Except for the underlying assumption of a logistic model in the true predictor, no parametric distributional assumption is made about the true predictor or its surrogate. We develop a semiparametric parameter estimate of the logistic regression parameter which is asymptotically normally distributed and computationally feasible. The estimate relies on kernel regression techniques. For scalar predictors, by a detailed analysis of the mean-squared error of the parameter estimate, we obtain a representation for an optimal bandwidth.

160 citations


Journal ArticleDOI
TL;DR: In this paper, the authors tabulated the approximate upper percentage points for the general case via the Poisson clumping heuristic developed by Aldous, and applied the test to two data sets.
Abstract: SUMMARY Likelihood ratio tests for threshold autoregression have been considered by Chan, and Chan and Tong. However, except for the simplest case, percentage points of the (asymptotic) null distribution of the test statistic have not been tabulated. The purpose of this paper is to tabulate the approximate upper percentage points for the general case. This is done via the Poisson clumping heuristic developed by Aldous. Monte Carlo experiments show that the approximation is quite good. As an illustration, the test is then applied to two data sets.

Journal ArticleDOI
TL;DR: In this article, the authors characterize the class of bivariate distributions such that the conditional distributions belong to any specified exponential families, leading to several bivariate families derived by Castillo and Galambos and by Arnold and Strauss, as well as to distributions with Poisson, geometric and other conditionals.
Abstract: We characterize the class of bivariate distributions such that the conditional distributions belong to any specified exponential families. The result leads at once to several bivariate families derived by Castillo and Galambos and by Arnold and Strauss, as well as to distributions with Poisson, geometric and other conditionals. We indicate methods to simulate samples from the distributions and estimate their parameters.
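
For one family covered by this characterisation, the Arnold-Strauss density with exponential conditionals, $f(x, y) \propto \exp\{-(ax + by + cxy)\}$ for $x, y > 0$, simulation is straightforward because both conditionals are exponential. The Gibbs sampler below is one natural way to simulate such samples; the parameter values are arbitrary, and this is not necessarily the scheme the authors indicate.

```python
import numpy as np

# f(x, y) proportional to exp(-(a*x + b*y + c*x*y)) for x, y > 0, so
# X | Y = y ~ Exponential(rate a + c*y) and Y | X = x ~ Exponential(rate b + c*x).
a, b, c = 1.0, 2.0, 0.5
rng = np.random.default_rng(16)

def gibbs(n_draws, burn_in=500):
    x, y = 1.0, 1.0
    out = np.empty((n_draws, 2))
    for i in range(n_draws + burn_in):
        x = rng.exponential(1.0 / (a + c * y))   # numpy uses the scale = 1/rate convention
        y = rng.exponential(1.0 / (b + c * x))
        if i >= burn_in:
            out[i - burn_in] = (x, y)
    return out

sample = gibbs(5000)
```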

Journal ArticleDOI
TL;DR: In this paper, a cumulative weighted difference in the Kaplan-Meier estimates is proposed as a test statistic for equality of distributions in the two-sample censored data survival analysis problem.
Abstract: SUMMARY We propose a cumulative weighted difference in the Kaplan-Meier estimates as a test statistic for equality of distributions in the two-sample censored data survival analysis problem. For stability of such a statistic, the absolute value of the possibly random weight function must be bounded above by a multiple of $(C^-)^{1/2+\delta}$, where $1 - C^-$ is the left-continuous censoring distribution function and $\delta > 0$. For these weighted Kaplan-Meier (WKM) statistics, asymptotic distribution theory is presented along with expressions for the efficacy under a sequence of local alternatives. A simple censored data generalization of the two-sample difference in means test (z-test) is a member of this class and in large samples is seen to be quite efficient relative to the popular log-rank test under a range of alternatives including the proportional hazards alternative. Optimal weight functions are also calculated. The optimal WKM statistic is as efficient as the optimal weighted log-rank statistic for any particular sequence of local alternatives. Stratified statistics and trend statistics are also presented.
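
A sketch of the unstandardised WKM contrast with unit weight (the censored-data analogue of a difference in means mentioned in the abstract): compute the two Kaplan-Meier curves and integrate their difference over the common follow-up period. The simulated data are arbitrary, ties are ignored, and the censoring-dependent weights and the variance estimate needed to turn this into a test statistic are omitted.

```python
import numpy as np

def km_curve(time, event, grid):
    """Kaplan-Meier survival estimate evaluated on a common time grid
    (distinct event and censoring times assumed for simplicity)."""
    order = np.argsort(time)
    t, d = time[order], event[order].astype(float)
    at_risk = len(t) - np.arange(len(t))                 # n, n-1, ..., 1
    surv = np.cumprod(1.0 - d / at_risk)                 # S(t) just after each observed time
    idx = np.searchsorted(t, grid, side="right") - 1
    return np.where(idx < 0, 1.0, surv[np.clip(idx, 0, len(t) - 1)])

rng = np.random.default_rng(20)
t1, t2 = rng.exponential(1.0, 100), rng.exponential(1.3, 100)   # latent survival times
c1, c2 = rng.exponential(2.0, 100), rng.exponential(2.0, 100)   # censoring times
x1, d1 = np.minimum(t1, c1), t1 <= c1                           # observed time, event indicator
x2, d2 = np.minimum(t2, c2), t2 <= c2

tau = min(x1.max(), x2.max())                 # common follow-up limit
grid = np.linspace(0.0, tau, 400)
dt = grid[1] - grid[0]
wkm_contrast = ((km_curve(x1, d1, grid) - km_curve(x2, d2, grid)) * dt).sum()
```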

Journal ArticleDOI
TL;DR: In this paper, the authors suggest refinements of the box counting method which address the obvious problems caused by the incomplete information and inaccessibility of the limit and a method for the statistical analysis of these corrected data is developed and tested on simulated and real data.
Abstract: SUMMARY We suggest refinements of the box counting method which address the obvious problems caused by the incomplete information and inaccessibility of the limit. A method for the statistical analysis of these corrected data is developed and tested on simulated and real data.
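
For reference, the uncorrected box counting estimate that the paper refines looks like this: count the occupied boxes at several scales and regress log N(eps) on log(1/eps). The example data (points on a line segment, true dimension 1) and the scales are arbitrary, and none of the paper's corrections for incomplete information or the inaccessible limit are applied.

```python
import numpy as np

def box_count_dimension(points, epsilons):
    """Basic box counting estimate: regress log N(eps) on log(1/eps)."""
    counts = [len(np.unique(np.floor(points / eps), axis=0)) for eps in epsilons]
    slope, _ = np.polyfit(np.log(1.0 / epsilons), np.log(counts), 1)
    return slope

rng = np.random.default_rng(21)
u = rng.random(5000)
points = np.column_stack([u, 0.3 + 0.4 * u])        # points on a line segment (dimension 1)
epsilons = np.array([1 / 4, 1 / 8, 1 / 16, 1 / 32, 1 / 64])
dim_hat = box_count_dimension(points, epsilons)
```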

Journal ArticleDOI
TL;DR: The theoretical results of this paper explain and quantify the empirical observation that the cross-validation function often has multiple local minima, and show that spurious local minima are more likely to occur at too small values of the bandwidth than at too large values.
Abstract: SUMMARY The method of least squares cross-validation for choosing the bandwidth of a kernel density estimator has been the object of considerable research, through both theoretical analysis and simulation studies. The method involves the minimization of a certain function of the bandwidth. One of the less attractive features of this method, which has been observed in simulation studies but has not previously been understood theoretically, is that rather often the cross-validation function has multiple local minima. The theoretical results of this paper provide an explanation and quantification of this empirical observation, through modelling the cross-validation function as a Gaussian stochastic process. Asymptotic analysis reveals that the degree of wiggliness of the cross-validation function depends on the underlying density through a fairly simple functional, but dependence on the kernel function is much more complicated. A simulation study explores the extent to which the asymptotic analysis describes the actual situation. Our techniques may also be used to obtain other related results-e.g. to show that spurious local minima of the cross-validation function are more likely to occur at too small values of the bandwidth, rather than at too large values.
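
A small sketch of the object under study: the least squares cross-validation criterion for a Gaussian kernel density estimator, evaluated on a bandwidth grid so that any multiple local minima become visible. The mixture data and the grid are arbitrary choices for illustration.

```python
import numpy as np

def lscv(x, h):
    """Least squares cross-validation criterion for a Gaussian kernel density estimate."""
    n = len(x)
    diff = x[:, None] - x[None, :]

    def phi(u, s):                                               # N(0, s^2) density
        return np.exp(-0.5 * (u / s) ** 2) / (s * np.sqrt(2 * np.pi))

    int_fhat_sq = phi(diff, np.sqrt(2.0) * h).sum() / n ** 2     # integral of f_hat^2
    loo = (phi(diff, h).sum(axis=1) - phi(0.0, h)) / (n - 1)     # leave-one-out f_hat(X_i)
    return int_fhat_sq - 2.0 * loo.mean()

rng = np.random.default_rng(15)
x = np.concatenate([rng.normal(-2.0, 0.5, 100), rng.normal(2.0, 1.0, 100)])

hs = np.linspace(0.02, 1.0, 200)
scores = np.array([lscv(x, h) for h in hs])
is_min = (scores[1:-1] < scores[:-2]) & (scores[1:-1] < scores[2:])
local_minima = hs[1:-1][is_min]      # frequently contains more than one bandwidth
```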

Journal ArticleDOI
TL;DR: In this article, the bandwidths are locally chosen by a data-driven method based on the minimization of a local cross-validation criterion, which is shown to be asymptotically optimal with respect to local quadratic measures of errors.
Abstract: SUMMARY Kernel estimators of a regression function are investigated. The bandwidths are locally chosen by a data-driven method based on the minimization of a local cross-validation criterion. This method is shown to be asymptotically optimal with respect to local quadratic measures of errors. Monte Carlo experiments are presented, and finally the method is applied to some data of medical interest.
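
A simplified sketch using a single global bandwidth chosen by leave-one-out cross-validation for the Nadaraya-Watson estimator; the paper's method instead minimises a local cross-validation criterion so that the bandwidth varies with location. The data and grids are arbitrary.

```python
import numpy as np

def nw(x, y, grid, h):
    """Nadaraya-Watson kernel regression estimate with a Gaussian kernel."""
    w = np.exp(-0.5 * ((grid[:, None] - x[None, :]) / h) ** 2)
    return (w * y).sum(axis=1) / w.sum(axis=1)

def loo_cv(x, y, h):
    """Global leave-one-out cross-validation score for bandwidth h."""
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    np.fill_diagonal(w, 0.0)                      # leave each observation out of its own fit
    y_hat = (w * y).sum(axis=1) / w.sum(axis=1)
    return np.mean((y - y_hat) ** 2)

rng = np.random.default_rng(14)
x = np.sort(rng.uniform(0.0, 1.0, 200))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=200)

hs = np.linspace(0.01, 0.2, 60)
h_cv = hs[np.argmin([loo_cv(x, y, h) for h in hs])]
fit = nw(x, y, np.linspace(0.0, 1.0, 100), h_cv)
```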

Journal ArticleDOI
TL;DR: In this article, the authors developed large sample Bayesian methods for assessing the accuracy of screening tests that are used to detect antibodies to the human immunodeficiency virus in donated blood, and assessing the prevalence of the disease in the population sampled from.
Abstract: In this paper, we develop large sample Bayesian methods for assessing the accuracy of screening tests that are used to detect antibodies to the human immunodeficiency virus in donated blood, and for assessing the prevalence of the disease in the population sampled from. We obtain approximate joint and marginal posterior distributions for the predictive values positive and negative of a test and, additionally, we obtain approximate predictive distributions for the number of future individuals that will test positively or be truly positive out of a new sample or population of interest. We illustrate our methods with data from Canada and the UK.
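
A small simulation-based sketch of the same quantities under conjugate Beta models, with made-up validation counts rather than the Canadian and UK data: draw sensitivity, specificity and prevalence from their posteriors and propagate them through Bayes' theorem to obtain posterior draws of the predictive values positive and negative. The paper itself develops large-sample approximations rather than this simulation.

```python
import numpy as np

rng = np.random.default_rng(17)
draws = 10_000

# Hypothetical validation counts, not the data of the paper:
# 196/200 known positives flagged, 992/1000 known negatives cleared,
# 50 truly positive donors out of 10_000 sampled.
sens = rng.beta(1 + 196, 1 + 4, draws)        # posterior draws for sensitivity
spec = rng.beta(1 + 992, 1 + 8, draws)        # posterior draws for specificity
prev = rng.beta(1 + 50, 1 + 9950, draws)      # posterior draws for prevalence

ppv = sens * prev / (sens * prev + (1 - spec) * (1 - prev))          # P(D+ | T+)
npv = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)    # P(D- | T-)

print(np.percentile(ppv, [2.5, 50, 97.5]))
print(np.percentile(npv, [2.5, 50, 97.5]))
```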

Journal ArticleDOI
TL;DR: In this article, the authors explore Fisher's conception of statistical inference, with special attention to the importance he placed on choosing an appropriate frame of reference to define the inferential model, and investigate inferential models which respect the likelihood principle or the prequential principle.
Abstract: SUMMARY In celebration of the centenary of the birth of Sir Ronald Fisher, this paper explores Fisher's conception of statistical inference, with special attention to the importance he placed on choosing an appropriate frame of reference to define the inferential model. In particular, we investigate inferential models which respect the likelihood principle or the prequential principle, and argue that these will typically have an asymptotic sampling theory justification.

Journal ArticleDOI
TL;DR: In this paper, the authors present estimators for the infection rate in the general stochastic epidemic model, obtain their asymptotic distributions and apply them to data on a smallpox outbreak.
Abstract: SUMMARY Some estimators for the infection rate in the general stochastic epidemic model are presented. The first estimator follows the approach of maximum likelihood. However, this approach requires us to observe most if not all of the epidemic process. As an alternative, an estimator which uses less detailed data is derived from a suitable martingale. Both estimators are shown to be consistent only when a major outbreak of disease is observed. Minor outbreaks do not provide enough information on the infection rate. The asymptotic distributions of these estimators are also obtained. Asymptotic normality holds only for a major outbreak. Finally, the estimators are applied to data on a smallpox outbreak.
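
A sketch, under assumptions, of the maximum likelihood route for a completely observed epidemic: with infection intensity beta*S(t)*I(t), the MLE of beta is the number of infections divided by the integral of S(t)*I(t) over time. The Gillespie simulation below (arbitrary parameters) illustrates this; the martingale estimator for less detailed data is not shown, and, as the abstract notes, the estimate is only reliable when a major outbreak is observed.

```python
import numpy as np

rng = np.random.default_rng(11)
beta_true, gamma = 0.004, 1.0          # per-pair infection rate, removal rate
S, I = 500, 5                          # initial susceptibles and infectives
infections, exposure = 0, 0.0          # exposure accumulates the integral of S*I dt

while I > 0:
    rate_inf, rate_rem = beta_true * S * I, gamma * I
    total = rate_inf + rate_rem
    dt = rng.exponential(1.0 / total)  # time to the next event (Gillespie step)
    exposure += S * I * dt
    if rng.random() < rate_inf / total:
        S, I, infections = S - 1, I + 1, infections + 1
    else:
        I -= 1

beta_hat = infections / exposure       # MLE under complete observation
```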

Journal ArticleDOI
TL;DR: In this article, the authors derived asymptotic expansions for the posterior probability of confidence regions based on the likelihood ratio test statistic and for the coverage probability of highest posterior density regions.
Abstract: SUMMARY Let $Y_1, \ldots, Y_n$ denote independent observations, each distributed according to a density depending on a scalar parameter $\theta$. Suppose that we are interested in constructing an interval estimate for $\theta$. One approach is to construct a confidence region with a specified coverage probability based on the likelihood ratio test statistic. Another approach is to construct a highest posterior density region with a specified posterior probability. The goal of this paper is to study the relationship between these two approaches. In particular, we derive asymptotic expansions for the posterior probability of confidence regions based on the likelihood ratio test statistic and for the coverage probability of highest posterior density regions. Conditions under which the two methods lead to identical regions, at least approximately, are also given.



Journal ArticleDOI
TL;DR: In this paper, the authors give a jackknife estimator of the asymptotic variance V of an estimator defined by an estimating equation, such as the maximum pseudolikelihood estimator in spatial processes, and show that it is weakly consistent; the pseudovalues are obtained by deleting one estimating equation at a time.
Abstract: Let $(X_1, X_2, \ldots, X_n)$ be a vector of (possibly dependent) random variables having distribution $F(X, \theta)$. Let $G(X, \theta) = \sum_{i=1}^{n} g_i(X, \theta) = 0$ be an estimating equation for $\theta$, e.g. the score function or the maximum pseudolikelihood estimating equation in spatial processes. Let $\hat{\theta}_n$ be the estimator obtained from $G$ such that $\hat{\theta}_n \to \theta_0$ in probability and $n^{1/2}(\hat{\theta}_n - \theta_0) \to N(0, V)$ in distribution. In many situations it is difficult to derive an analytical expression for $V$, e.g. for maximum pseudolikelihood estimators in spatial processes. In this paper, we give a jackknife estimator of $V$ and show that it is weakly consistent. The method consists of deleting one estimating equation (instead of one observation) at a time and thus obtaining the pseudovalues. The method of proof and conditions are similar to those of Reeds, with some modifications. The method applies equally to independent and identically distributed random variables, independent but not identically distributed random variables, and time- or space-dependent stochastic processes. Our conditions are less severe than Carlstein's, who deals with a similar problem of estimating $V$ for dependent observations. We also give some simulation results.
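
A minimal sketch for the simplest i.i.d. case, assuming the exponential-rate score equation $g_i(\theta) = 1/\theta - X_i$ so that every quantity has a closed form: delete one estimating equation at a time, form the pseudovalues, and compare the jackknife variance with the asymptotic value $\theta^2/n$. The data are simulated for illustration.

```python
import numpy as np

rng = np.random.default_rng(12)
x = rng.exponential(scale=2.0, size=200)        # Exp(rate 0.5) data
n = len(x)

theta_hat = 1.0 / x.mean()                      # solves sum_i (1/theta - x_i) = 0
theta_del = (n - 1) / (x.sum() - x)             # solution after deleting equation j
pseudo = n * theta_hat - (n - 1) * theta_del    # pseudovalues

var_jack = pseudo.var(ddof=1) / n               # jackknife estimate of Var(theta_hat), i.e. V_hat/n
var_asym = theta_hat ** 2 / n                   # asymptotic variance for comparison
```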

Journal ArticleDOI
TL;DR: In this paper, the saddlepoint method is used to construct a conditional density for a real parameter in an exponential linear model, using only a two-pass calculation on the observed likelihood function for the original data.
Abstract: For an exponential linear model, the saddlepoint method gives accurate approximations for the density of the minimal sufficient statistic or maximum likelihood estimate, and for the corresponding distribution functions. In this paper we describe a simple numerical procedure that constructs such approximations for a real parameter in an exponential linear model, using only a two-pass calculation on the observed likelihood function for the original data. Simple examples of the numerical procedure are discussed, but we take the general accuracy of the saddlepoint procedure as given. An immediate application of this is to exponential family models, where inference for a component of the canonical parameter is to be based on the conditional density of the corresponding component of the sufficient statistic, given the remaining components. This conditional density is also of exponential family form, but its functional form and cumulant-generating function may not be accessible. The procedure is applied to the corresponding likelihood, approximated as the full likelihood divided by an approximate marginal likelihood obtained from Barndorff-Nielsen's formula. A double saddlepoint approximation provides another means of bypassing this difficulty. The computational procedure is also examined as a numerical procedure for obtaining the saddlepoint approximation to the Fourier inversion of a characteristic function. As such it is a two-pass calculation on a table of the cumulant-generating function.
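
For orientation only, here is the standard saddlepoint density approximation in a case where the cumulant-generating function is available in closed form (the mean of n unit exponentials, with $K(s) = -\log(1-s)$); the paper's contribution is a two-pass numerical procedure that works from the observed likelihood when the CGF is not accessible, which is not reproduced here.

```python
import numpy as np
from scipy.stats import gamma

n = 10
xbar = np.linspace(0.2, 3.0, 200)          # points at which to approximate the density

s_hat = 1.0 - 1.0 / xbar                   # saddlepoint: K'(s) = 1/(1 - s) = xbar
K = -np.log(1.0 - s_hat)                   # K(s_hat)
K2 = 1.0 / (1.0 - s_hat) ** 2              # K''(s_hat)

f_saddle = np.sqrt(n / (2 * np.pi * K2)) * np.exp(n * (K - s_hat * xbar))
f_exact = gamma.pdf(xbar, a=n, scale=1.0 / n)   # mean of n Exp(1) variables is Gamma(n, 1/n)
max_rel_error = np.max(np.abs(f_saddle / f_exact - 1.0))
```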




Journal ArticleDOI
TL;DR: In this paper, a double saddlepoint approximation to a conditional distribution function is introduced, which is uniformly valid for a two-dimensional log-concave density, and the results are illustrated with the log-normal distribution and the gamma distribution.
Abstract: SUMMARY For a one-dimensional variable with a log-concave density it is shown that the saddlepoint approximations to the density and the distribution function of the mean are uniformly valid. A double-saddlepoint approximation to a conditional distribution function is introduced, which is uniformly valid for a two-dimensional log-concave density. The results are illustrated with the log-normal distribution and the gamma distribution. This paper is a continuation of the work in Jensen (1988) concerning the uniform validity of the saddlepoint approximation for either densities or distribution functions in the extreme tails of the distribution. In that paper the uniform validity of the expansions was proved for classes of densities for which the conjugate distribution tended to either a normal distribution or a gamma distribution. These classes were originally introduced in Daniels's (1954) pioneering paper. Here we take a more fundamental view and only require the conjugate density to be well behaved, i.e. after a location and scale transformation the conjugate density must be within specified limits. In doing so we arrive at the very appealing result that the saddlepoint approximations are uniformly valid for any log-concave density. The results were originally considered in the one-dimensional case, but there is a natural generalization to the multidimensional setting. We consider in particular the two-dimensional case. The uniform validity of the saddlepoint approximation is of interest in itself, in that for practical applications one does not have to worry about a breakdown of the approximation in the extreme tail. Furthermore, when approximating conditional densities the uniformity is sometimes of crucial importance. In Jensen and Johansson (1988) a uniform saddlepoint approximation was used in establishing the extremal family generated by the Yule process. In much the same setting such results were used in Diaconis and Freedman (1988), and one consequence of the results in this paper is that the result of Diaconis and Freedman (1988) holds for any log-concave density. Approximations to conditional distribution functions that are uniformly valid are of interest in connection with the construction of similar tests in exponential families. An example of this is given in Jensen (1986a) for the gamma distribution. A systematic approach to approximations to conditional distributions is developed here and again it is shown that for log-concave densities the approximation is uniformly valid.


Journal ArticleDOI
TL;DR: In this article, a method for comparing tests of normality using their "isotones", i.e., contours on the surfaces representing P-values, is proposed.
Abstract: SUMMARY A method for comparing tests of normality using their 'isotones', i.e. contours on the surfaces representing P-values, is proposed. Select a family of distributions labelled by two parameters $\lambda_1$ and $\lambda_2$ to represent the null and alternative hypotheses. For each $(\lambda_1, \lambda_2)$ take an 'ideal sample' from that distribution, calculate the value of the goodness-of-fit statistic and determine the corresponding P-value under the normality assumption. This generates a surface of the P-values of the test statistic in the $(\lambda_1, \lambda_2)$ plane. In this paper the construction and interpretation of the isotones, the contours on the above surfaces, for competing tests are illustrated using the Shapiro-Wilk W, Vasicek's entropy test and Lin and Mudholkar's $Z_p$-test of the composite hypothesis of normality.
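
A sketch of the surface construction, assuming an illustrative two-parameter family (Johnson SU, which is not necessarily the family used in the paper) and only the Shapiro-Wilk test: take an 'ideal sample' of quantiles for each $(\lambda_1, \lambda_2)$, compute the P-value under normality, and contour the resulting grid to obtain the isotones.

```python
import numpy as np
from scipy import stats

n = 50
u = (np.arange(1, n + 1) - 0.5) / n                 # plotting positions
lam1 = np.linspace(-2.0, 2.0, 21)                   # skewness-type parameter
lam2 = np.linspace(0.5, 3.0, 21)                    # tail-weight-type parameter

pvals = np.empty((lam1.size, lam2.size))
for i, a in enumerate(lam1):
    for j, b in enumerate(lam2):
        ideal = stats.johnsonsu.ppf(u, a, b)        # 'ideal sample' = quantiles
        pvals[i, j] = stats.shapiro(ideal).pvalue   # P-value under normality

# Contours of pvals over the (lam1, lam2) plane are the Shapiro-Wilk isotones,
# e.g. via matplotlib.pyplot.contour(lam2, lam1, pvals).
```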