
Showing papers in "Journal of the Royal Statistical Society: Series B (Methodological)", 1990



Journal ArticleDOI
Jonathan R. M. Hosking1
TL;DR: The author defines L-moments as the expectations of certain linear combinations of order statistics; they can be defined for any random variable whose mean exists and form the basis of a general theory covering the summarization and description of theoretical probability distributions.
Abstract: L-moments are expectations of certain linear combinations of order statistics. They can be defined for any random variable whose mean exists and form the basis of a general theory which covers the summarization and description of theoretical probability distributions, the summarization and description of observed data samples, estimation of parameters and quantiles of probability distributions, and hypothesis tests for probability distributions. The theory involves such established procedures as the use of order statistics and Gini's mean difference statistic, and gives rise to some promising innovations such as the measures of skewness and kurtosis and new methods of parameter estimation.

2,668 citations
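
As a brief illustration (my own sketch, not code from the paper), the first four sample L-moments can be computed from probability-weighted moments of the ordered sample; the Gumbel data below are simulated.

```python
import numpy as np

def sample_l_moments(x):
    """First four sample L-moments via probability-weighted moments.

    Uses the unbiased estimators b_r of beta_r = E[X F(X)^r] and the standard
    relations l1 = b0, l2 = 2 b1 - b0, etc. (Hosking, 1990).
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    j = np.arange(1, n + 1)                      # ranks 1..n
    b0 = x.mean()
    b1 = np.sum((j - 1) * x) / (n * (n - 1))
    b2 = np.sum((j - 1) * (j - 2) * x) / (n * (n - 1) * (n - 2))
    b3 = np.sum((j - 1) * (j - 2) * (j - 3) * x) / (n * (n - 1) * (n - 2) * (n - 3))
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return l1, l2, l3 / l2, l4 / l2              # mean, L-scale, L-skewness, L-kurtosis

rng = np.random.default_rng(0)
print(sample_l_moments(rng.gumbel(size=2000)))   # Gumbel: L-skewness ~ 0.17, L-kurtosis ~ 0.15
```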


Journal ArticleDOI
TL;DR: In this article, the authors discuss the analysis of the extremes of data by modelling the sizes and occurrence of exceedances over high thresholds, and the natural distribution for such exceedances, the generalized Pareto distribution, is described and its properties elucidated.
Abstract: We discuss the analysis of the extremes of data by modelling the sizes and occurrence of exceedances over high thresholds. The natural distribution for such exceedances, the generalized Pareto distribution, is described and its properties elucidated. Estimation and model-checking procedures for univariate and regression data are developed, and the influence of and information contained in the most extreme observations in a sample are studied. Models for seasonality and serial dependence in the point process of exceedances are described. Sets of data on river flows and wave heights are discussed, and an application to the siting of nuclear installations is described.

1,503 citations
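
A minimal sketch of threshold-exceedance fitting, assuming scipy's genpareto and simulated heavy-tailed data; the threshold choice is purely illustrative and not from the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = rng.standard_t(df=4, size=5000)          # heavy-tailed sample

u = np.quantile(data, 0.95)                     # illustrative high threshold
excesses = data[data > u] - u                   # exceedances over the threshold

# Fit the generalized Pareto distribution to the excesses (location fixed at 0).
shape, loc, scale = stats.genpareto.fit(excesses, floc=0)
print(f"threshold={u:.3f}  shape (xi)={shape:.3f}  scale={scale:.3f}")

# Implied tail probability beyond a higher level x0:
x0 = u + 2.0
p_exceed_u = np.mean(data > u)
print("P(X >", x0, ") ~", p_exceed_u * stats.genpareto.sf(x0 - u, shape, loc=0, scale=scale))
```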


Journal ArticleDOI
TL;DR: The authors show that ordinary least squares and principal components regression occupy opposite ends of a continuous spectrum, with partial least squares lying in between; two adjustable 'parameters' control the procedure: 'alpha', in the continuum [0, 1], and 'omega', the number of regressors finally accepted.
Abstract: [Read before The Royal Statistical Society at a meeting organized by the Research Section on Wednesday, October 25th, 1989, Professor D. V. Hinkley in the Chair] SUMMARY The paper addresses the evergreen problem of construction of regressors for use in least squares multiple regression. In the context of a general sequential procedure for doing this, it is shown that, with a particular objective criterion for the construction, the procedures of ordinary least squares and principal components regression occupy the opposite ends of a continuous spectrum, with partial least squares lying in between. There are two adjustable 'parameters' controlling the procedure: 'alpha', in the continuum [0, 1], and 'omega', the number of regressors finally accepted. These control parameters are chosen by cross-validation. The method is illustrated by a range of examples of its application.

445 citations
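
A hedged numerical sketch of the continuum criterion described above, maximizing T(c) = (c'X'y)^2 (c'X'Xc)^(alpha/(1-alpha)-1) over unit vectors c: alpha = 0 recovers the ordinary least squares direction and alpha = 1/2 the first partial least squares direction. The data, optimizer, and checks are my own assumptions, not the authors' algorithm.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n, p = 200, 5
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, 0.5, 0.0, 0.0, -0.5]) + rng.normal(size=n)

S, s = X.T @ X, X.T @ y

def first_direction(alpha):
    """Maximize T(c) = (c's)^2 (c'Sc)^gamma over unit c, gamma = alpha/(1-alpha) - 1."""
    gamma = alpha / (1 - alpha) - 1
    def neg_log_T(v):
        c = v / np.linalg.norm(v)                # impose the unit-norm constraint
        return -(2 * np.log(abs(c @ s)) + gamma * np.log(c @ S @ c))
    v = minimize(neg_log_T, s / np.linalg.norm(s), method="Nelder-Mead",
                 options={"xatol": 1e-10, "fatol": 1e-12, "maxiter": 20000}).x
    return v / np.linalg.norm(v)

ols_dir = np.linalg.solve(S, s)
ols_dir /= np.linalg.norm(ols_dir)
print(np.abs(first_direction(0.0) @ ols_dir))                  # ~1: alpha=0 is the OLS direction
print(np.abs(first_direction(0.5) @ (s / np.linalg.norm(s))))  # ~1: alpha=1/2 is the PLS direction
```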


Journal ArticleDOI
TL;DR: A new method for detecting spatial clustering of events in populations with non-uniform density is proposed, based on selecting controls from the population at risk and computing interpoint distances for the combined sample.
Abstract: A new method for detecting spatial clustering of events in populations with non-uniform density is proposed. The method is based on selecting controls from the population at risk and computing interpoint distances for the combined sample. Nonparametric tests are developed which are based on the number of cases among the k nearest neighbours of each case and the number of cases nearer than the k nearest control. The performance of these tests is evaluated analytically and by simulation and the method is applied to a data set on the locations of cases of childhood leukaemia and lymphoma in a defined geographical area.

444 citations
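
An illustrative sketch of the k-nearest-neighbour case-control statistic with a Monte Carlo permutation null; the coordinates are simulated and all implementation details are assumptions, not the paper's code.

```python
import numpy as np
from scipy.spatial import cKDTree

def t_k(coords, is_case, k):
    """T_k = total number of cases among the k nearest neighbours of each case."""
    tree = cKDTree(coords)
    # query k+1 neighbours because each point is its own nearest neighbour
    _, idx = tree.query(coords[is_case], k=k + 1)
    neighbours = idx[:, 1:]                     # drop self
    return int(is_case[neighbours].sum())

rng = np.random.default_rng(3)
cases = rng.normal(loc=0.0, scale=0.5, size=(60, 2))     # spatially clustered cases
controls = rng.normal(loc=0.0, scale=1.0, size=(120, 2)) # controls from the population at risk
coords = np.vstack([cases, controls])
labels = np.r_[np.ones(60, bool), np.zeros(120, bool)]

observed = t_k(coords, labels, k=5)
perm = [t_k(coords, rng.permutation(labels), k=5) for _ in range(999)]
p_value = (1 + sum(t >= observed for t in perm)) / 1000
print(observed, p_value)
```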


Journal ArticleDOI
TL;DR: Properties of the EM algorithm in such contexts are discussed, concentrating on rates of convergence, and an alternative that is usually more practical and converges at least as quickly is presented.
Abstract: SUMMARY The EM algorithm is a popular approach to maximum likelihood estimation but has not been much used for penalized likelihood or maximum a posteriori estimation. This paper discusses properties of the EM algorithm in such contexts, concentrating on rates of convergence, and presents an alternative that is usually more practical and converges at least as quickly. The EM algorithm is a general approach to maximum likelihood estimation, rather than a specific algorithm. Dempster et al. (1977) discussed the method and derived basic properties, demonstrating that a variety of procedures previously developed rather informally could be unified. The common strand to problems where the approach is applicable is a notion of 'incomplete data'; this includes the conventional sense of 'missing data' but is much broader than that. The EM algorithm demonstrates its strength in situations where some hypothetical experiment yields data from which estimation is particularly convenient and economical: the 'incomplete' data actually at hand are regarded as observable functions of these 'complete' data. The resulting algorithms, while usually slow to converge, are often extremely simple and remain practical in large problems where no other approaches may be feasible. Dempster et al. (1977) briefly refer to the use of the same approach to the problem of finding the posterior mode (maximum a posteriori estimate) in Bayesian estimation.

385 citations
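
To illustrate the rates of convergence at issue, here is the classic genetic-linkage EM example of Dempster et al. (1977) (not data from this paper); successive errors shrink by a near-constant factor, the linear rate the paper studies.

```python
import numpy as np

# Classic genetic-linkage data of Dempster et al. (1977): multinomial counts
# with cell probabilities (1/2 + t/4, (1-t)/4, (1-t)/4, t/4).
y = np.array([125, 18, 20, 34])

def em_step(t):
    # E-step: expected split of the first cell into its two latent components
    e = y[0] * (t / 4) / (1 / 2 + t / 4)
    # M-step: maximizer given the completed counts
    return (e + y[3]) / (e + y[1] + y[2] + y[3])

t, history = 0.5, []
for _ in range(40):
    t = em_step(t)
    history.append(t)

t_star = history[-1]                            # effectively the MLE (~0.6268)
errors = np.abs(np.array(history[:-1]) - t_star)
print("MLE ~", round(t_star, 6))
print("ratio of successive errors:", errors[1:6] / errors[:5])  # ~constant: linear rate
```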


Journal ArticleDOI
TL;DR: In this paper, the authors modify the maximum likelihood-EM approach by introducing a simple smoothing step at each EM iteration, which converges in relatively few iterations to good estimates of g that do not depend on the choice of starting configuration.
Abstract: There are many practical problems where the observed data are not drawn directly from the density g of real interest, but rather from another distribution derived from g by the application of an integral operator. The estimation of g then entails both statistical and numerical difficulties. A natural statistical approach is by maximum likelihood, conveniently implemented using the EM algorithm, but this provides unsatisfactory reconstructions of g. In this paper, we modify the maximum likelihood-EM approach by introducing a simple smoothing step at each EM iteration. In our experience, this algorithm converges in relatively few iterations to good estimates of g that do not depend on the choice of starting configuration. Some theoretical background is given that relates this smoothed EM algorithm to a maximum penalized likelihood approach. Two applications are considered in detail. The first is the classical stereology problem of determining particle size distributions from data collected on a plane section through a composite medium. The second concerns the recovery of the structure of a section of the human body from external observations obtained by positron emission tomography; for this problem, we also suggest several technical improvements on existing methodology.

264 citations
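
A minimal sketch of the smoothed-EM idea for indirectly observed Poisson data: a Richardson-Lucy-type EM step followed by a simple local smoothing step, as the abstract describes. The grid, blurring operator, and smoothing kernel are my own choices, not the paper's applications.

```python
import numpy as np

rng = np.random.default_rng(4)
m = 80                                          # grid size
x = np.linspace(0, 1, m)
g_true = np.exp(-0.5 * ((x - 0.4) / 0.08) ** 2) # true (discretized) intensity
A = np.exp(-0.5 * ((x[:, None] - x[None, :]) / 0.05) ** 2)  # integral operator, discretized
A /= A.sum(axis=0)
y = rng.poisson(200 * A @ g_true)               # indirect Poisson observations

def ems(y, A, n_iter=200, smooth=np.array([0.25, 0.5, 0.25])):
    g = np.full(A.shape[1], y.mean())
    for _ in range(n_iter):
        # E/M step (Richardson-Lucy form of EM for Poisson indirect data)
        g = g * (A.T @ (y / np.maximum(A @ g, 1e-12))) / A.sum(axis=0)
        # S step: a simple smoothing pass after each EM iteration
        g = np.convolve(g, smooth, mode="same")
    return g

g_hat = ems(y, A)
print("correlation with truth:", np.corrcoef(g_hat, 200 * g_true)[0, 1])
```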


Journal ArticleDOI
TL;DR: Graphical chain models correspond to special types of graphs which are interpreted in this fashion; examples are used to illustrate how conditional independences are reflected in summary statistics derived from the models and how the graphs help to identify analogies and equivalences between different models.
Abstract: SUMMARY Graphs consisting of points, and lines or arrows as connections between selected pairs of points, are used to formulate hypotheses about relations between variables. Points stand for variables, connections represent associations. When a missing connection is interpreted as a conditional independence, the graph characterizes a conditional independence structure as well. Statistical models, called graphical chain models, correspond to special types of graphs which are interpreted in this fashion. Examples are used to illustrate how conditional independences are reflected in summary statistics derived from the models and how the graphs help to identify analogies and equivalences between different models. Graphical chain models are shown to provide a unifying concept for many statistical techniques that in the past have proven to be useful in analyses of data. They also provide tools for new types of analysis.

234 citations
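
A small numerical illustration (simulation mine) of reading a missing connection as a conditional independence: in the chain X1 -> X2 -> X3 the partial correlation of X1 and X3 given X2 vanishes, visible as a zero in the concentration matrix.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100_000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)              # X1 -> X2
x3 = 0.8 * x2 + rng.normal(size=n)              # X2 -> X3 (no X1 -> X3 connection)

X = np.column_stack([x1, x2, x3])
K = np.linalg.inv(np.cov(X, rowvar=False))      # concentration (inverse covariance) matrix
partial_corr_13 = -K[0, 2] / np.sqrt(K[0, 0] * K[2, 2])

print("marginal corr(X1, X3):", np.corrcoef(x1, x3)[0, 1])   # clearly nonzero
print("partial corr(X1, X3 | X2):", partial_corr_13)          # ~0: the missing edge
```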


Journal ArticleDOI
TL;DR: In this article, a simple adjustment for profile likelihoods is proposed to alleviate some of the problems inherent in the use of profile likelihood, such as bias, inconsistency and overoptimistic variance estimates.
Abstract: SUMMARY We propose a simple adjustment for profile likelihoods. The aim of the adjustment is to alleviate some of the problems inherent in the use of profile likelihoods, such as bias, inconsistency and overoptimistic variance estimates. The adjustment is applied to the profile loglikelihood score function at each parameter value so that its mean is zero and its variance is the negative expected derivative matrix of the adjusted score function. For cases in which explicit calculation of the adjustments is difficult, we give two methods to simplify their computation: an 'automatic' simulation method that requires as input only the profile loglikelihood and its first few derivatives; first-order asymptotic expressions. Some examples are provided and a comparison is made with the conditional profile log-likelihood of Cox and Reid.

213 citations
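
A hedged special case of the problem being addressed: for a normal sample, the profile likelihood of the variance maximizes at the biased SS/n, while a REML-type adjusted likelihood (a well-known special case, not the paper's general construction) maximizes at SS/(n-1).

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.normal(loc=3.0, scale=2.0, size=10)
n, ss = len(x), np.sum((x - x.mean()) ** 2)

def profile_loglik(s2):
    # mean profiled out at its MLE, the sample mean
    return -0.5 * n * np.log(s2) - ss / (2 * s2)

def adjusted_loglik(s2):
    # REML-type adjustment: one degree of freedom absorbed by the mean
    return -0.5 * (n - 1) * np.log(s2) - ss / (2 * s2)

grid = np.linspace(0.1, 15, 100_000)
print("profile max: ", grid[np.argmax(profile_loglik(grid))], " (= SS/n     =", ss / n, ")")
print("adjusted max:", grid[np.argmax(adjusted_loglik(grid))], " (= SS/(n-1) =", ss / (n - 1), ")")
```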


Journal ArticleDOI
TL;DR: In this article, it is shown that the most surprising observation must lie at one of the vertices of the convex hull and that the observation with the maximum Mahalanobis distance from the sample mean must also lie on the convex hull.
Abstract: SUMMARY The conditional predictive ordinate (CPO) is a Bayesian diagnostic which detects surprising observations. It has been used in a variety of situations such as univariate samples, the multivariate normal distribution and regression models. Results are presented about the most surprising observation which has minimum CPO. For the multivariate normal distribution it is shown that the most surprising observation must lie at one of the vertices of the convex hull. It is also shown that the observation with maximum Mahalanobis distance from the sample mean must lie on the convex hull. Results are given for the expected number of vertices on the convex hull when the sample is contaminated. An alternative, closely related diagnostic, the ratio ordinate measure, is presented. A numerical comparison of the two measures is given.

151 citations
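
A quick simulation check of the convex-hull result (sketch mine, using scipy.spatial.ConvexHull):

```python
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(7)
X = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 2.0]], size=500)

mu = X.mean(axis=0)
S_inv = np.linalg.inv(np.cov(X, rowvar=False))
d2 = np.einsum("ij,jk,ik->i", X - mu, S_inv, X - mu)   # squared Mahalanobis distances

hull = ConvexHull(X)
print("index of max Mahalanobis distance:", d2.argmax())
print("on the convex hull?", d2.argmax() in hull.vertices)  # True, as the paper proves
```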


Journal ArticleDOI
TL;DR: In this paper, the null distribution of the likelihood ratio statistic for threshold autoregression with normally distributed noise is addressed. The problem is non-standard because the threshold parameter is a nuisance parameter which is absent under the null hypothesis.
Abstract: SUMMARY This paper addresses the null distribution of the likelihood ratio statistic for threshold autoregression with normally distributed noise. The problem is non-standard because the threshold parameter is a nuisance parameter which is absent under the null hypothesis. We reduce the problem to the first-passage probability associated with a Gaussian process which, in some special cases, turns out to be a Brownian bridge. It is also shown that, in some specific cases, the asymptotic null distribution of the test statistic depends only on the 'degrees of freedom' and not on the exact null joint distribution of the time series.
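
An illustrative simulation of the non-standard testing problem: because the threshold is absent under the null, the likelihood ratio is maximized over a grid of candidate thresholds. All details below are assumptions for illustration, not the paper's procedure.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 400
y = np.zeros(n)
for t in range(1, n):                           # simulate under the null: linear AR(1)
    y[t] = 0.5 * y[t - 1] + rng.normal()

x, z = y[:-1], y[1:]                            # lagged regressor and response

def rss(mask):
    """Residual sum of squares of an AR(1) fit (with intercept) on a subsample."""
    X = np.column_stack([np.ones(mask.sum()), x[mask]])
    beta, *_ = np.linalg.lstsq(X, z[mask], rcond=None)
    return np.sum((z[mask] - X @ beta) ** 2)

rss0 = rss(np.ones_like(x, dtype=bool))         # one-regime null fit

# Likelihood ratio maximized over candidate thresholds (the nuisance parameter
# absent under the null, which is what makes the distribution non-standard):
lr = max(
    len(z) * np.log(rss0 / (rss(x <= r) + rss(x > r)))
    for r in np.quantile(x, np.linspace(0.15, 0.85, 50))
)
print("sup-LR statistic:", lr)
```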

Journal ArticleDOI
TL;DR: In this article, the authors developed a model disaggregation method to derive a disaggregate model from a given aggregate model, which is then used to perform data disaggregation, where the time series aggregates are the non-overlapping sums of m consecutive disaggregated observations.
Abstract: We develop a model disaggregation method to derive a disaggregate model from a given aggregate model, which is then used to perform data disaggregation. Let the time series aggregates be the non-overlapping sums of m consecutive disaggregated observations. Given an aggregate autoregressive integrated moving average ARIMA(p,d,r) model with r≤p+d+1, assume that there is no hidden periodicity of order m. It is shown that, if m is odd or if m is even but all the real roots of the autoregressive polynomial of the given aggregate model are positive, then there exists a disaggregate model whose autocovariances can be uniquely derived from the autocovariances of the given aggregate model.
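
A small sketch of the aggregation scheme the abstract assumes, non-overlapping sums of m consecutive observations of an AR(1) series; the disaggregation derivation itself is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(9)
n, m = 3000, 3                                  # m consecutive values per aggregate
x = np.zeros(n)
for t in range(1, n):                           # disaggregate series: AR(1)
    x[t] = 0.7 * x[t - 1] + rng.normal()

agg = x.reshape(-1, m).sum(axis=1)              # non-overlapping m-sums

def acf(s, k):
    s = s - s.mean()
    return (s[:-k] * s[k:]).sum() / (s * s).sum()

print("lag-1 ACF of disaggregate series:", acf(x, 1))    # ~0.7
print("lag-1 ACF of aggregate series:  ", acf(agg, 1))   # different, flatter memory
```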

Journal ArticleDOI
TL;DR: In this article, two parameterization invariant approximations to the conditional distribution function of the maximum likelihood estimator are considered, one based on partial integration of a formula for the conditional density given an ancillary statistic, a technique which in addition yields a useful expression for the error term.
Abstract: For models of parametric dimension 1 two parameterization invariant approximations to the conditional distribution function of the maximum likelihood estimator are considered. The first is derived by partial integration of a formula for the conditional density given an ancillary statistic, a technique which in addition yields a useful expression for the error term. The second approximation is based on an adjusted version of the signed log-likelihood ratio statistic. A third approximation is a modification of the first that avoids the specification of an ancillary statistic.

Journal ArticleDOI
TL;DR: In this article, Bayes estimators of the parameters of the Marshall-Olkin exponential distribution are obtained when random samples from series and parallel systems are available, with respect to the quadratic loss function, and the prior distribution allows for prior dependence among the components of the parameter vector.
Abstract: SUMMARY Bayes estimators of the parameters of the Marshall-Olkin exponential distribution are obtained when random samples from series and parallel systems are available. The estimators are with respect to the quadratic loss function, and the prior distribution allows for prior dependence among the components of the parameter vector. Exact and approximate highest posterior density credible ellipsoids for the parameters are also obtained. In contrast with series sampling, the Bayes estimators under parallel sampling are not in closed form, and numerical procedures are required to obtain estimates. Bayes estimators of the reliability functions are also given. The gain in asymptotic precision of parallel estimates over series estimates is also ascertained theoretically.
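
A simulation sketch of the Marshall-Olkin construction via competing exponential shocks, checking the atom P(X = Y); the Bayes estimators of the paper are not reproduced, and all settings are mine.

```python
import numpy as np

rng = np.random.default_rng(10)
lam1, lam2, lam12 = 1.0, 2.0, 0.5
n = 200_000

z1 = rng.exponential(1 / lam1, n)               # shock affecting component 1 only
z2 = rng.exponential(1 / lam2, n)               # shock affecting component 2 only
z12 = rng.exponential(1 / lam12, n)             # common shock affecting both

x, y = np.minimum(z1, z12), np.minimum(z2, z12) # Marshall-Olkin exponential pair

print("P(X = Y) empirical:", np.mean(x == y))
print("P(X = Y) theory:   ", lam12 / (lam1 + lam2 + lam12))   # common-shock atom
print("series-system rate: ", 1 / np.minimum(x, y).mean())    # ~ lam1 + lam2 + lam12
```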

Journal ArticleDOI
TL;DR: In this article, an adaptation of least squares cross-validation is proposed for bandwidth choice in the kernel estimation of the derivatives of a probability density, which is demonstrated by an example and a simulation study.
Abstract: An adaptation of least squares cross-validation is proposed for bandwidth choice in the kernel estimation of the derivatives of a probability density. The practicality of the method is demonstrated by an example and a simulation study. Theoretical justification is provided by an asymptotic optimality result.
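
One hedged way to implement the idea for the first derivative with a Gaussian kernel, using the integration-by-parts identity that the cross term integral of fhat' times f' equals minus the expectation of fhat''(X); the constants, grid, and data are my own assumptions, not the paper's construction.

```python
import numpy as np

def phi_dd(u, h):
    """Second derivative of the N(0, h^2) density evaluated at u."""
    z = u / h
    return (z ** 2 - 1) * np.exp(-0.5 * z ** 2) / (np.sqrt(2 * np.pi) * h ** 3)

def cv_derivative(x, h):
    """LSCV criterion for the kernel estimate of f': integral of fhat'^2
    plus (2/n) * leave-one-out sum of fhat'' at the data points."""
    d = x[:, None] - x[None, :]
    n = len(x)
    # integral of (fhat')^2: pairwise Gaussian convolution, bandwidth h*sqrt(2)
    term1 = -phi_dd(d, h * np.sqrt(2)).sum() / n ** 2
    # leave-one-out sum of fhat''(X_i)
    off = phi_dd(d, h)
    np.fill_diagonal(off, 0.0)
    term2 = 2 * off.sum() / (n * (n - 1))
    return term1 + term2

rng = np.random.default_rng(11)
x = rng.normal(size=300)
hs = np.linspace(0.15, 1.2, 60)
print("CV bandwidth for f':", hs[np.argmin([cv_derivative(x, h) for h in hs])])
```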

Journal ArticleDOI
TL;DR: In this article, a hierarchical interaction model for mixed qualitative and continuous data is proposed, defined by two properties: that the continuous variables are normally distributed given the qualitative variables and that a set of conditional independence relations hold between specified pairs of variables.
Abstract: Lauritzen and Wermuth have proposed a class of models for mixed qualitative and continuous data, defined by two properties: that the continuous variables are normally distributed given the qualitative variables and that a set of conditional independence relations hold between specified pairs of variables. The present paper examines an extension to this class called hierarchical interaction models. A compact form for model representation is described and an estimation algorithm is given. Some properties of the models concerning marginalization and conditioning are examined. The class includes and generalizes hierarchical log-linear models, standard fixed effect ANOVA, multivariate ANOVA and multivariate regression models.

Journal ArticleDOI
TL;DR: In this paper, it is shown that within the three-parameter model there is an embedded two-parameter special case which corresponds to infinite parameter values in the original model, and the problem arises when this embedded model is the best fit to the data.
Abstract: SUMMARY Two distinct problems can arise in maximum likelihood estimation in certain three-parameter problems. One problem, associated with the fitting of a threshold parameter, occurs if the fitted distribution has to be very positively skewed. This problem is well known. However, there is a second and unrelated difficulty which occurs quite often in practice and which arises when the distribution is not at all skew. This problem is not so well understood. It is shown in this paper that, within the three-parameter model, there is an embedded two-parameter special case which corresponds to infinite parameter values in the original model. The problem arises when this embedded model is the best fit to the data. The problem is shown to be easily resolved by first carrying out a check to see whether the embedded model should be fitted instead of the three-parameter model. Formal tests for this are discussed. The gamma, inverse Gaussian, log-normal and Weibull distributions are examples where the problem occurs. Numerical examples are provided for illustration.
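
A numerical illustration of the embedded-model phenomenon for the three-parameter log-normal: as the threshold recedes to minus infinity, the profile log-likelihood approaches that of an embedded normal model, so there need be no interior maximum. The data and grid are my own assumptions.

```python
import numpy as np

def normal_loglik(y):
    n, s2 = len(y), np.var(y)
    return -0.5 * n * (np.log(2 * np.pi * s2) + 1)

def profile_loglik_3p_lognormal(x, tau):
    """Log-likelihood of the three-parameter log-normal, with mu and sigma
    profiled out analytically for a fixed threshold tau < min(x)."""
    y = np.log(x - tau)
    return normal_loglik(y) - y.sum()           # Jacobian of the log(x - tau) transform

rng = np.random.default_rng(12)
x = rng.normal(loc=10.0, scale=1.0, size=200)   # data with essentially no skewness

taus = x.min() - np.geomspace(0.05, 1000, 40)   # thresholds receding towards -infinity
prof = [profile_loglik_3p_lognormal(x, t) for t in taus]

print("embedded normal log-lik:          ", normal_loglik(x))
print("3p log-normal profile, tau -> -inf:", prof[-1])   # approaches the embedded model
print("best finite-threshold fit minus embedded:", max(prof) - normal_loglik(x))
```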

Journal ArticleDOI
TL;DR: In this article, the authors investigate two new test statistics appropriate when w = x + z where z is an independent measurement error, and show that they are both asymptotically efficient for normal errors and approximately efficient when the measurement error variance is small.
Abstract: SUMMARY Hypothesis tests in generalized linear models are studied under the condition that a surrogate w is observed in place of the true predictor x. The efficient score test for the hypothesis of no association depends on the conditional expectation E(x | w) which is generally unknown. The usual test substitutes w for E(x | w) and is asymptotically valid but not efficient. We investigate two new test statistics appropriate when w = x + z where z is an independent measurement error. The first is a Wald test based on estimators corrected for measurement error. Despite the correction for attenuation in the estimator, this test has the same local power as the usual test. The second test employs an estimator of E(x | w) and is both asymptotically efficient for normal errors and approximately efficient when the measurement error variance is small.
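
A hedged linear-model illustration of the attenuation the abstract mentions: regressing on the surrogate w = x + z shrinks the slope by the reliability ratio, and substituting E(x | w) (known in this simulated normal case) removes the bias. The setting is mine, not the paper's generalized linear models.

```python
import numpy as np

rng = np.random.default_rng(13)
n, beta = 50_000, 2.0
sx2, sz2 = 1.0, 0.5                             # variances of true x and error z

x = rng.normal(0, np.sqrt(sx2), n)
w = x + rng.normal(0, np.sqrt(sz2), n)          # surrogate: x plus measurement error
y = beta * x + rng.normal(size=n)

slope = lambda a, b: np.polyfit(a, b, 1)[0]
reliability = sx2 / (sx2 + sz2)                 # attenuation factor

print("naive slope on w:  ", slope(w, y))       # ~ beta * reliability = 1.33
print("corrected slope:   ", slope(w, y) / reliability)
print("slope on E(x | w): ", slope(reliability * w, y))  # normal case: E(x|w) = reliability * w
```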

Journal ArticleDOI
TL;DR: In this paper, the methodology of Davison et al. (1986) is extended to second-order balance, which principally affects bootstrap estimation of variance; the techniques involve Latin square and balanced incomplete block designs.
Abstract: SUMMARY Davison et al. (1986) have shown that finite bootstrap simulations can be improved by forcing balance in the aggregate of simulated data sets. Their methods yield first-order balance, which principally affects bootstrap estimation of bias. Here we extend the methodology to second-order balance, which principally affects bootstrap estimation of variance. The particular techniques involve Latin square and balanced incomplete block designs. Numerical examples are given to illustrate both the positive and the negative features of the balanced simulations.
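
A sketch of the first-order balance of Davison et al. (1986) that this paper takes as its starting point: permute B concatenated copies of the sample and split, so each observation appears exactly B times in aggregate. The second-order constructions via Latin squares are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(14)
x = rng.exponential(size=25)
B = 1000

# First-order balanced bootstrap: permute B concatenated copies and split.
perm = rng.permutation(np.tile(np.arange(len(x)), B))
balanced = x[perm].reshape(B, len(x))
assert np.all(np.bincount(perm) == B)           # each observation used exactly B times

stat = lambda s: s.mean(axis=-1)
bias_balanced = stat(balanced).mean() - stat(x)
ordinary = x[rng.integers(0, len(x), size=(B, len(x)))]
bias_ordinary = stat(ordinary).mean() - stat(x)

# For the sample mean the true bootstrap bias is exactly 0; balance enforces it.
print("balanced bias estimate:", bias_balanced)    # zero up to rounding
print("ordinary bias estimate:", bias_ordinary)    # pure Monte Carlo noise
```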

Journal ArticleDOI
Abstract: SUMMARY A parametric model is presented for the analysis of square contingency tables where there is a one-to-one correspondence between the categories of the row variable and the categories of the column variable. The model is applicable to cross-classifications where the categories are unordered, partially ordered or completely ordered, but the interpretation of the model and associated parameters is most straightforward when the categories are at least partially ordered. Connections with the mover-stayer model and the usual model of quasisymmetry are described. Two data sets are used to illustrate the utility of the model.

Journal ArticleDOI
TL;DR: In this paper, a new method for estimating the discrete frequencies based on amplification of the sine waves in the time series and on minimization of least squares regressions on amplified waves was proposed.
Abstract: SUMMARY For stationary time series having a mixed spectrum, we introduce a new method for estimating the discrete frequencies based on amplification of the sine waves in the time series and on minimization of least squares regressions on amplified waves. Under the assumption that the continuous part of the spectrum arises from a stationary autoregressive moving average process, we show that the estimators for the discrete frequencies obtained with this method are strongly consistent: more precisely, as the serial length n tends to infinity, their bias almost surely converges to zero at the rate n^(-3/2)(log n)^δ, δ > 2. We also establish a central limit theorem which shows that these estimators have exactly the same asymptotic variance as that given by Whittle's method, which is known to yield the most accurate frequency estimates. However, the new technique has two advantages over Whittle's method. First, it yields an algorithm that is both robust against improper starting estimates and computationally efficient. Secondly, it gives rise to a simple tool based on its amplified harmonics for identifying the number of sine waves in a time series. A simulation study is also reported that gives statistical results on the properties of the new method on simulated series as well as comparative results with Whittle's method.
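
For orientation only: a classical periodogram estimate of a discrete frequency in autoregressive noise, the kind of baseline such estimators are compared against. This is not the authors' amplification method.

```python
import numpy as np

rng = np.random.default_rng(15)
n, f_true = 2048, 0.123                         # frequency in cycles per sample
t = np.arange(n)
noise = np.zeros(n)
for k in range(1, n):                           # AR(1) continuous spectral component
    noise[k] = 0.6 * noise[k - 1] + rng.normal()
y = 2.0 * np.cos(2 * np.pi * f_true * t + 1.0) + noise   # mixed spectrum

# Classical estimate: location of the periodogram peak (resolution 1/n),
# refined here by zero-padding the FFT.
pad = 16 * n
spec = np.abs(np.fft.rfft(y - y.mean(), pad)) ** 2
freqs = np.fft.rfftfreq(pad)
print("estimated frequency:", freqs[spec.argmax()], " true:", f_true)
```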

Journal ArticleDOI
TL;DR: In this paper, the smoothed bootstrap is compared with an alternative estimation procedure based on resampling ideas of Kendall and Kendall, and specific examples of the estimation problem are considered, showing that the competing estimators may display different qualitative behaviours.
Abstract: Bootstrap estimation of simple population functionals is considered. The smoothed bootstrap is compared with an alternative estimation procedure based on resampling ideas of Kendall and Kendall. Specific examples of the estimation problem are considered, showing that the competing estimators may display different qualitative behaviours.
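
A minimal sketch of the smoothed bootstrap in its usual form, resampling with replacement and then adding kernel noise; the functional (the median) and the bandwidth are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(16)
x = rng.normal(size=100)
B, h = 5000, 0.3                                # resamples and smoothing bandwidth

# Ordinary bootstrap: sample rows with replacement.
idx = rng.integers(0, len(x), size=(B, len(x)))
plain = np.median(x[idx], axis=1)

# Smoothed bootstrap: same resampling, then add N(0, h^2) kernel noise,
# i.e. draw from a kernel density estimate instead of the raw ECDF.
smooth = np.median(x[idx] + h * rng.normal(size=idx.shape), axis=1)

print("bootstrap SE of the median (plain):   ", plain.std())
print("bootstrap SE of the median (smoothed):", smooth.std())
```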

Journal ArticleDOI
TL;DR: In this paper, the construction of optimal or nearly optimal block designs, using the A-optimality criterion, is formulated as a non-linear 0-1 program, and its feasibility is examined by constructing block designs using the nonlinear programming computer package MINOS.
Abstract: SUMMARY The construction of optimal or nearly optimal block designs, using the A-optimality criterion, is formulated as a non-linear 0-1 programme. An approach which ignores the 0-1 constraint is suggested and its feasibility is examined by constructing block designs using the non-linear programming computer package MINOS.
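
A sketch that evaluates the objective such a programme optimizes: the A-criterion and average efficiency factor of a candidate binary design via the treatment information matrix. The optimization itself (and the MINOS package) is not reproduced; the example design is my own choice.

```python
import numpy as np

# Incidence matrix N (treatments x blocks) of a candidate binary design:
# the balanced incomplete block design with v=4 treatments, b=6 blocks of size 2.
N = np.array([[1, 1, 1, 0, 0, 0],
              [1, 0, 0, 1, 1, 0],
              [0, 1, 0, 1, 0, 1],
              [0, 0, 1, 0, 1, 1]])

r = N.sum(axis=1)                               # treatment replications (all 3)
k = N.sum(axis=0)                               # block sizes (all 2)
C = np.diag(r) - N @ np.diag(1.0 / k) @ N.T     # treatment information matrix

eig_C = np.sort(np.linalg.eigvalsh(C))[1:]      # nonzero eigenvalues (one zero: connected)
print("A-criterion (sum of reciprocal eigenvalues):", np.sum(1.0 / eig_C))

eff = np.sort(np.linalg.eigvalsh(C / np.sqrt(np.outer(r, r))))[1:]  # canonical efficiency factors
print("average efficiency factor (harmonic mean):", len(eff) / np.sum(1.0 / eff))  # 2/3 for this BIBD
```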

Journal ArticleDOI
TL;DR: The method proposed is more efficient than others suggested in the literature and, in particular, reduces the computational burden associated with exact maximum likelihood estimation of ARMA models.
Abstract: Matrix expressions relating the theoretical autocovariances of ARMA processes to their parameters are derived and used to design an efficient procedure for computing autocovariance sequences of multivariate ARMA processes. The method proposed is more efficient than others suggested in the literature and, in particular, reduces the computational burden associated with exact maximum likelihood estimation of ARMA models. The closed form expressions facilitate the implementation of algorithms for computing multivariate autocovariances.
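
Not the paper's matrix expressions: a simple truncated psi-weight computation of univariate ARMA autocovariances, useful as a cross-check, with the ARMA(1,1) closed form for comparison.

```python
import numpy as np

def arma_acov(phi, theta, sigma2=1.0, nlags=5, trunc=500):
    """Autocovariances of a stationary ARMA process via truncated psi-weights:
    x_t = sum_j psi_j e_{t-j},  gamma(k) = sigma2 * sum_j psi_j psi_{j+k}."""
    p, q = len(phi), len(theta)
    psi = np.zeros(trunc)
    psi[0] = 1.0
    for j in range(1, trunc):
        psi[j] = (theta[j - 1] if j <= q else 0.0)
        psi[j] += sum(phi[i] * psi[j - 1 - i] for i in range(min(p, j)))
    return np.array([sigma2 * np.dot(psi[: trunc - k], psi[k:]) for k in range(nlags)])

# ARMA(1,1): check against the closed form
phi1, th1 = 0.5, 0.3
g = arma_acov([phi1], [th1])
g0_exact = (1 + 2 * phi1 * th1 + th1 ** 2) / (1 - phi1 ** 2)
g1_exact = (1 + phi1 * th1) * (phi1 + th1) / (1 - phi1 ** 2)
print(g[:3])
print(g0_exact, g1_exact, phi1 * g1_exact)      # gamma(2) = phi * gamma(1) for k >= 2
```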

Journal ArticleDOI
TL;DR: In this article, a reconstruction approach is proposed to estimate the point process parameters of a sampled counting process using smoothed versions of the corresponding stationary time series parameters, and the reconstruction is applied to data from meteorology, entomology and particle physics.
Abstract: Sampled counting processes are often studied using methods for the analysis of stationary time series. We express some time series parameters as smoothed versions of corresponding point process parameters and use these relations to suggest estimates of the point process parameters. In addition, we propose a reconstructive approach to the estimation problem. We derive some properties of the estimators and apply them to data from meteorology, entomology and particle physics.

Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of assigning the probability density function of a random variable X to one of two separate families of distributions, and study the possibility of constructing best similar tests.
Abstract: We consider, from the viewpoint of hypothesis testing, the problem of assigning the probability density function of a random variable X to one of two separate families of distributions. Our aim is to study the possibility of constructing best similar tests. For this, we characterize sufficient statistics for the union of the two families; completeness is also studied. We state conditions under which the test statistics are easily obtainable when both families are of the generalized exponential type. The relative merits of the exact best test and of asymptotic tests are empirically investigated in small samples when testing for the log-normal versus the gamma distribution.

Journal ArticleDOI
TL;DR: In this article, an upper bound is given for the average efficiency factor of a connected multidimensional design, and it is shown that, when the design has adjusted orthogonality, its average efficiency factor is a simple function of the average efficiency factors of its component designs.
Abstract: SUMMARY An upper bound is given for the average efficiency factor of a connected multidimensional design. It is shown that, when the design has adjusted orthogonality, its average efficiency factor is a simple function of the average efficiency factors of its component designs.