
Showing papers in "Journal of the Royal Statistical Society, Series B (Methodological)", 1992


Journal ArticleDOI
TL;DR: In this paper, a Markov chain Monte Carlo method is used to approximate the whole likelihood function in autologistic models and other exponential family models for dependent data, and the parameter value (if any) maximizing this function approximates the MLE.
Abstract: Maximum likelihood estimates (MLEs) in autologistic models and other exponential family models for dependent data can be calculated with Markov chain Monte Carlo methods (the Metropolis algorithm or the Gibbs sampler), which simulate ergodic Markov chains having equilibrium distributions in the model. From one realization of such a Markov chain, a Monte Carlo approximant to the whole likelihood function can be constructed. The parameter value (if any) maximizing this function approximates the MLE.
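The construction can be checked on a toy exponential family where the normalizing constant is known in closed form. The sketch below (names and setup are ours, not the paper's) uses iid draws at a reference parameter psi = 0; the paper's point is that Metropolis or Gibbs output can play the same role in models where exact sampling is impossible.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy exponential family: T = number of successes in 20 Bernoulli trials,
# natural parameter theta (the log odds). At the reference value psi = 0
# the success probability is 1/2. The log-likelihood ratio is
#   l(theta) - l(psi) = (theta - psi) * t_obs - log E_psi[exp((theta - psi) T)],
# and the expectation is approximated by an average over simulated draws.
n_trials, psi = 20, 0.0
T_sim = rng.binomial(n_trials, 0.5, size=200_000).astype(float)

def mc_log_normalizer(theta):
    """Monte Carlo approximant of log E_psi[exp((theta - psi) T)]."""
    m = (theta - psi) * T_sim
    mmax = m.max()                         # shift to avoid overflow in exp
    return np.log(np.mean(np.exp(m - mmax))) + mmax

def exact_log_normalizer(theta):
    """Closed form for this toy model, used only to check the approximation."""
    return n_trials * (np.log1p(np.exp(theta)) - np.log(2.0))
```

The Monte Carlo log-likelihood ratio is then `(theta - psi) * t_obs - mc_log_normalizer(theta)`, and maximizing it over theta (e.g. on a grid) approximates the MLE, as the abstract describes.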

869 citations


Journal ArticleDOI
Ali S. Hadi
TL;DR: In this article, the authors propose a procedure for the detection of multiple outliers in multivariate data, where the data set is first ordered using an appropriately chosen robust measure of outlyingness, and then divided into two initial subsets: a "basic" subset which contains p + 1 "good" observations and a "non-basic" subset which contains the remaining n - p - 1 observations.
Abstract: SUMMARY We propose a procedure for the detection of multiple outliers in multivariate data. Let X be an n x p data matrix representing n observations on p variates. We first order the n observations, using an appropriately chosen robust measure of outlyingness, then divide the data set into two initial subsets: a 'basic' subset which contains p + 1 'good' observations and a 'non-basic' subset which contains the remaining n - p - 1 observations. Second, we compute the relative distance from each point in the data set to the centre of the basic subset, relative to the (possibly singular) covariance matrix of the basic subset. Third, we rearrange the n observations in ascending order accordingly, then divide the data set into two subsets: a basic subset which contains the first p + 2 observations and a non-basic subset which contains the remaining n - p - 2 observations. This process is repeated until an appropriately chosen stopping criterion is met. The final non-basic subset of observations is declared an outlying subset. The procedure proposed is illustrated and compared with existing methods by using several data sets. The procedure is simple, computationally inexpensive, suitable for automation, computable with widely available software packages, effective in dealing with masking and swamping problems and, most importantly, successful in identifying multivariate outliers.
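A minimal sketch of the forward-growing basic-subset idea, with simplifying assumptions that are ours rather than the paper's: the initial robust ordering uses coordinatewise median/MAD distances, the stopping rule is a fixed chi-squared cutoff supplied by the caller, and the basic subset grows one point per step.

```python
import numpy as np

def hadi_outliers(X, cutoff):
    """Sketch of the basic/non-basic forward search for multivariate outliers.

    cutoff: threshold on squared distances (e.g. a chi-squared(p) quantile);
    the paper's actual initial ordering and stopping rule are more refined.
    """
    n, p = X.shape
    # Initial robust ordering: scaled distance from the coordinatewise median.
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0) + 1e-12
    order = np.argsort(np.sum(((X - med) / mad) ** 2, axis=1))
    basic = order[: p + 1]                       # p + 1 'good' observations
    while True:
        mu = X[basic].mean(axis=0)
        S = np.cov(X[basic], rowvar=False)
        Sinv = np.linalg.pinv(S)                 # pinv guards against singular S
        diff = X - mu
        d2 = np.einsum('ij,jk,ik->i', diff, Sinv, diff)
        order = np.argsort(d2)                   # rearrange in ascending order
        r = len(basic)
        if r == n or d2[order[r]] > cutoff:      # next candidate too far out
            basic = order[:r]
            break
        basic = order[: r + 1]                   # grow the basic subset
    return np.setdiff1d(np.arange(n), basic)     # final non-basic = outliers
```

Points never absorbed into the basic subset are declared the outlying subset, mirroring the abstract's stopping-criterion description.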

792 citations


Journal ArticleDOI
TL;DR: In this paper, a class of models for the marginal expectations of each response and for pairwise associations are compared with log-linear models, and the robustness and efficiency of each model is discussed.
Abstract: SUMMARY It is common to observe a vector of discrete and/or continuous responses in scientific problems where the objective is to characterize the dependence of each response on explanatory variables and to account for the association between the outcomes. The response vector can comprise repeated observations on one variable, as in longitudinal studies or genetic studies of families, or can include observations for different variables. This paper discusses a class of models for the marginal expectations of each response and for pairwise associations. The marginal models are contrasted with log-linear models. Two generalized estimating equation approaches are compared for parameter estimation. The first focuses on the regression parameters; the second simultaneously estimates the regression and association parameters. The robustness and efficiency of each is discussed. The methods are illustrated with analyses of two data sets from public health research.

667 citations


Journal ArticleDOI
TL;DR: In this paper, it was shown that near-blackness is required both for signal-to-noise enhancements and for superresolution, and that minimum l1-norm reconstruction may exploit near-blackness to an even greater extent.
Abstract: SUMMARY Maximum entropy (ME) inversion is a non-linear inversion technique for inverse problems where the object to be recovered is known to be positive. It has been applied in areas ranging from radio astronomy to various forms of spectroscopy, sometimes with dramatic success. In some cases, ME has attained an order of magnitude finer resolution and/or an order of magnitude smaller noise level than that obtainable by standard linear methods. The dramatic successes all seem to occur in cases where the object to be recovered is 'nearly black': essentially zero in the vast majority of samples. We show that near-blackness is required, both for signal-to-noise enhancements and for superresolution. However, other methods, in particular minimum l1-norm reconstruction, may exploit near-blackness to an even greater extent.

392 citations


Journal ArticleDOI
TL;DR: In this paper, the dominant Lyapunov exponent (LE) is estimated for biological and economic systems that are subjected to random perturbations and observed over a limited amount of time.
Abstract: In the past twenty years there has been much interest in the physical and biological sciences in nonlinear dynamical systems that appear to have random, unpredictable behavior. One important parameter of a dynamic system is the dominant Lyapunov exponent (LE). When the behavior of the system is compared for two similar initial conditions, this exponent is related to the rate at which the subsequent trajectories diverge. A bounded system with a positive LE is one operational definition of chaotic behavior. Most methods for determining the LE have assumed thousands of observations generated from carefully controlled physical experiments. Less attention has been given to estimating the LE for biological and economic systems that are subjected to random perturbations and observed over a limited amount of time. Using nonparametric regression techniques (neural networks and thin plate splines) it is possible to consistently estimate the LE. The properties of these methods have been studied using simulated data and are applied to a biological time series: marten fur returns for the Hudson's Bay Company (1820-1900). Based on a nonparametric analysis there is little evidence for low-dimensional chaos in these data. Although these methods appear to work well for systems perturbed by small amounts of noise, finding chaos in a system with a significant stochastic component may be difficult.
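For orientation, the quantity being estimated can be computed directly when the map and its derivative are known. The sketch below evaluates the dominant LE of the logistic map from log|f'(x)| along an orbit (for r = 4 the exact value is log 2, so a positive result signals chaos); the paper's contribution is estimating the same quantity from short, noisy series via nonparametric regression, which this example does not attempt.

```python
import math

def logistic_lyapunov(r=4.0, x0=0.2, n=100_000, burn=1_000):
    """Dominant Lyapunov exponent of the logistic map x -> r x (1 - x),
    computed as the orbit average of log |f'(x)| with f'(x) = r (1 - 2x)."""
    x = x0
    for _ in range(burn):                    # discard the transient
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n):
        total += math.log(abs(r * (1.0 - 2.0 * x)))
        x = r * x * (1.0 - x)
    return total / n
```

A positive value for a bounded orbit is exactly the operational definition of chaos used in the abstract.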

331 citations


Journal ArticleDOI
TL;DR: In this article, an exploratory technique is introduced for investigating how much of the irregularity in an aperiodic time series is due to low-dimensional chaotic dynamics, as opposed to stochastic or high-dimensional dynamics.
Abstract: An exploratory technique is introduced for investigating how much of the irregularity in an aperiodic time series is due to low-dimensional chaotic dynamics, as opposed to stochastic or high-dimensional dynamics. Nonlinear models are constructed with a variable smoothing parameter which at one extreme defines a nonlinear deterministic model, and at the other extreme defines a linear stochastic model. The accuracy of the resulting short-term forecasts as a function of the smoothing parameter reveals much about the underlying dynamics generating the time series. The technique is applied to a variety of experimental and naturally occurring time series data, and the results are compared to dimension calculations.

315 citations


Journal ArticleDOI
TL;DR: In this article, it is shown that, for a certain class of generalized linear models, the design problem can be reduced to a canonical form; this simplifies the underlying problem, and designs are constructed for several contexts with a single variable using geometric and other arguments.
Abstract: Optimal experimental designs for non-linear problems depend on the values of the underlying unknown parameters in the model. For various reasons there is interest in providing explicit formulae for the optimal designs as a function of the unknown parameters. This paper shows that, for a certain class of generalized linear models, the problem can be reduced to a canonical form. This simplifies the underlying problem and designs are constructed for several contexts with a single variable using geometric and other arguments.

223 citations


Journal ArticleDOI
TL;DR: In this article, empirical transformations for removing most of the skewness of an asymmetric statistic were proposed, which can be used as the basis for accurate confidence procedures or hypothesis tests and can be employed in conjunction with either a normal approximation or the bootstrap.
Abstract: SUMMARY We suggest empirical transformations for removing most of the skewness of an asymmetric statistic. These transformations may be used as the basis for accurate confidence procedures or hypothesis tests and can be employed in conjunction with either a normal approximation or the bootstrap. Our approach differs from those of other researchers in that the transformation is monotone and invertible. In this paper we suggest elementary but general transformations for eliminating skewness from the distribution of a Studentized statistic. The transformations are monotone and have simple, explicit inversion formulae, and so are readily applied to confidence interval problems. When the transformations are used in conjunction with the bootstrap, they can produce confidence intervals with particularly low levels of coverage error. That confidence intervals for symmetric statistics have lower coverage error than their counterparts in the case of asymmetry may be deduced from Edgeworth expansions. The first term in an expansion, of size n^(-1/2) where n is the sample size, describes the error in the usual normal approximation and is due entirely to skewness. If it can be eliminated, for example by transformation, then the normal approximation will be in error by only O(n^(-1)). Thus, our method involves transforming one statistic into another, where the distribution is virtually symmetric, applying the normal approximation (or the bootstrap) to the new statistic and then regaining the asymmetry of the original problem by inverse transformation, at the same time retaining the high coverage accuracy conferred by applying confidence interval procedures to a symmetric statistic.
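The "monotone with a simple explicit inverse" property can be illustrated with a cubic transformation of the general form used in this literature; the constant a (in practice proportional to an estimated skewness divided by a power of the sample size) is left as an input here, and the paper's exact constants are not reproduced.

```python
import numpy as np

def g(t, a):
    """Monotone cubic transformation g(t) = t + a t^2 + a^2 t^3 / 3.
    Since g(t) = ((1 + a t)^3 - 1) / (3 a) and g'(t) = (1 + a t)^2 >= 0,
    it is monotone with an explicit inverse."""
    return t + a * t**2 + (a**2) * t**3 / 3.0

def g_inverse(y, a):
    """Explicit inverse: t = ((1 + 3 a y)^(1/3) - 1) / a."""
    return (np.cbrt(1.0 + 3.0 * a * y) - 1.0) / a
```

One applies g to the Studentized statistic, uses a normal (or bootstrap) approximation for the transformed statistic, and maps the resulting interval endpoints back through `g_inverse`, as the abstract outlines.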

218 citations


Journal ArticleDOI
TL;DR: In this paper, the authors extend Stein's work on Latin hypercube sampling to prove a central limit theorem, the extent of the variance reduction depending on the extent to which the integrand is additive.
Abstract: SUMMARY Latin hypercube sampling (LHS) is a technique for Monte Carlo integration, due to McKay, Conover and Beckman. M. Stein proved that LHS integrals have smaller variance than independent and identically distributed Monte Carlo integration, the extent of the variance reduction depending on the extent to which the integrand is additive. We extend Stein's work to prove a central limit theorem. Variance estimation methods for nonparametric regression can be adapted to provide n^(1/2)-consistent estimates of the asymptotic variance in LHS. Moreover the skewness can be estimated at this rate. The variance reduction may be explained in terms of certain control variates that cannot be directly measured. We also show how to combine control variates with LHS. Finally we show how these results lead to a frequentist approach to computer experimentation.
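A minimal sketch of the LHS scheme itself, together with a check of Stein's variance comparison for a purely additive integrand (the function and variable names are ours):

```python
import numpy as np

def latin_hypercube(n, d, rng):
    """One LHS draw of n points in [0, 1]^d: each coordinate axis is cut
    into n equal strata and every stratum contains exactly one point."""
    u = rng.random((n, d))                                   # position within stratum
    strata = np.column_stack([rng.permutation(n) for _ in range(d)])
    return (strata + u) / n
```

For an additive integrand such as f(x) = x_1 + ... + x_d, repeated LHS estimates of the integral are far less variable than iid Monte Carlo estimates of the same size, in line with Stein's result quoted in the abstract.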

208 citations



Journal ArticleDOI
TL;DR: In this article, an extension of the threshold method for extreme values is developed, to consider the joint distribution of extremes of two variables, based on the point process representation of bivariate extremes.
Abstract: SUMMARY An extension of the threshold method for extreme values is developed, to consider the joint distribution of extremes of two variables. The methodology is based on the point process representation of bivariate extremes. Both parametric and nonparametric models are considered. The simplest case to handle is that in which both marginal distributions are known. For the more realistic case in which the marginal distributions are unknown, a mixed parametric-nonparametric method is proposed. The techniques are illustrated with data on sulphate and nitrate levels taken from a major study of acid rain.

Journal ArticleDOI
TL;DR: A test of fit for exponentiality based on the estimated Kullback-Leibler information and a procedure for choosing m for various sample sizes is proposed and corresponding critical values are computed by Monte Carlo simulations.
Abstract: In this paper a test of fit for exponentiality based on the estimated Kullback-Leibler information is proposed. The procedure is applicable when the exponential parameter is or is not specified under the null hypothesis. The test uses the Vasicek entropy estimate, so to compute it a window size m must first be fixed. A procedure for choosing m for various sample sizes is proposed and corresponding critical values are computed by Monte Carlo simulations.
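A hedged sketch of the ingredients: the Vasicek spacing estimator of entropy and an estimated Kullback-Leibler distance to the fitted exponential. The normalization and window choice below are illustrative only; the paper's rule for choosing m and its tabulated critical values are not reproduced here.

```python
import numpy as np

def vasicek_entropy(x, m):
    """Vasicek's spacing estimate of differential entropy, with the usual
    boundary convention X_(j) = X_(1) for j < 1 and X_(j) = X_(n) for j > n."""
    n = len(x)
    xs = np.sort(x)
    i = np.arange(n)
    spacings = xs[np.minimum(i + m, n - 1)] - xs[np.maximum(i - m, 0)]
    return np.mean(np.log(n * spacings / (2 * m)))

def kl_exp_statistic(x, m):
    """Estimated Kullback-Leibler distance between the data and the
    exponential law with mean estimated by the sample mean (using the
    identity that an exponential with mean mu has entropy 1 + log mu);
    the null of exponentiality is rejected for large values."""
    return np.log(np.mean(x)) + 1.0 - vasicek_entropy(x, m)
```

In practice the statistic would be compared with simulated critical values for the chosen n and m, as the abstract describes.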


Journal ArticleDOI
TL;DR: In this article, the authors focus on tests and confidence intervals regarding a single parametric function which can be represented as a natural parameter of a full rank exponential family, and explore these approximations to exact conditional inferences.
Abstract: Recently developed asymptotics based on saddlepoint methods provide important practical methods for multiparameter exponential families, especially in generalized linear models. The aim here is to clarify and explore these. Attention is restricted to tests and confidence intervals regarding a single parametric function which can be represented as a natural parameter of a full rank exponential family. Excellent approximations to exact conditional inferences are often available, in terms of simple adjustments to the signed square root of the likelihood ratio statistic.


Journal ArticleDOI
TL;DR: In this paper, the authors compare the maximum extended quasi-likelihood estimators, the maximum pseudolikelihood estimator and the maximum likelihood estimator, if it exists.
Abstract: SUMMARY There is considerable interest in the fitting of models jointly to the mean and dispersion of a response. For the mean parameter, the Wedderburn estimating equations are widely accepted. However, there is some controversy about estimating the dispersion parameters. Finite sampling properties of several dispersion estimators are investigated for three models by simulation. We compare the maximum extended quasi-likelihood estimator, the maximum pseudolikelihood estimator and the maximum likelihood estimator, if it exists. Of these estimators, the maximum extended quasi-likelihood estimator is usually superior in minimizing the mean-squared error.

Journal ArticleDOI
TL;DR: In this article, the authors introduce the notion of a generalized partial autocorrelation and an order for the estimation of the embedding dimension via order determination of an unknown non-linear autoregression by cross-validation.
Abstract: SUMMARY We give a brief introduction to deterministic chaos and a link between chaotic deterministic models and stochastic time series models. We argue that it is often natural to determine the embedding dimension in a noisy environment first in any systematic study of chaos. Setting the stochastic models within the framework of non-linear autoregression, we introduce the notion of a generalized partial autocorrelation and an order. We approach the estimation of the embedding dimension via order determination of an unknown non-linear autoregression by cross-validation, and give justification by proving its consistency under global boundedness. As a by-product, we provide a theoretical justification of the final prediction error approach of Auestad and Tjøstheim. Some illustrations based on the Hénon map and several real data sets are given. The bias of the residual sum of squares as essentially a noise variance estimator is quantified.

Journal ArticleDOI
TL;DR: In this article, a class of partly exponential models is proposed for the regression analysis of multivariate response data, which is parameterized in terms of the response mean and a general shape parameter.
Abstract: SUMMARY A class of partly exponential models is proposed for the regression analysis of multivariate response data. The class is parameterized in terms of the response mean and a general shape parameter. It includes the generalized linear error model and exponential dispersion models as special cases. Maximum likelihood equations for mean parameters are shown to be of the same form as certain generalized estimating equations, and maximum likelihood estimates of mean and shape parameters are asymptotically independent. Some results are given on the efficiency of the estimating equation procedure under misspecification of the response covariance matrix.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a test statistic for goodness of fit in time series with slowly decaying serial correlations, which is suitable to detect lack of independence in the observations, or estimated residuals, if the first few correlations are small but the decay of the correlations is slow.
Abstract: We propose a test statistic for goodness of fit in time series with slowly decaying serial correlations. The asymptotic distribution of the test statistic, originally proposed by Milhøj for time series with smooth spectra, turns out to be the same, under the null hypothesis, even if the spectrum has a pole at 0. In particular, the test is suitable to detect lack of independence in the observations, or estimated residuals, if the first few correlations are small but the decay of the correlations is slow.

Journal ArticleDOI
TL;DR: In this paper, a simple estimator for β is proposed for the model y =x'β+g(1)+error, g smooth but unknown, and the bias and variance of the estimate are computed and compared against the least squares estimate with g known.
Abstract: A simple estimator for β is proposed for the model y = x'β + g(t) + error, g smooth but unknown. The approach is to approximate the estimating equation obtained from a semiparametric likelihood and in the simplest case reduces to minimizing the distance between the pseudoresiduals y - x'β and a local linear cross-validated estimate of them. When the errors are independent with finite variance, the bias and variance of the estimate are computed and compared against the least squares estimate with g known.


Journal ArticleDOI
TL;DR: In this article, the forms of first and second posterior moments for a normal location parameter are identified for a rather general class of prior distributions, where the prior distribution is double exponential or Student t respectively.
Abstract: The forms of first and second posterior moments for a normal location parameter are identified for a rather general class of prior distributions. Exact and approximate illustrations are given where the prior distribution is double exponential or Student t respectively.

Journal ArticleDOI
TL;DR: In this paper, it was shown that a generic, finite order, non-recursive filter leaves invariant all the quantities that can be estimated by using embedding techniques such as the method of delays.
Abstract: It has been asserted in the literature that the low pass filtering of time series data may lead to erroneous results when calculating attractor dimensions. Here we prove that finite order, non-recursive filters do not have this effect. In fact, a generic, finite order, non-recursive filter leaves invariant all the quantities that can be estimated by using embedding techniques such as the method of delays.



Journal ArticleDOI
TL;DR: In this paper, a symbolic formula for the square-root-free Cholesky decomposition of the variance-covariance matrix of the multinomial distribution is given, which is particularly useful when the elements of a probability vector are of quite different orders of magnitude.
Abstract: A symbolic formula is given for the square-root-free Cholesky decomposition of the variance-covariance matrix of the multinomial distribution. The evaluation of the symbolic Cholesky factors requires much fewer arithmetic operations than does the general Cholesky algorithm. Since the symbolic formula is not affected by an ill-conditioned matrix, it is particularly useful when the elements of a probability vector are of quite different orders of magnitude. A simpler formula is obtained for Pederson's procedure of sampling from a multinomial population. An explicit formula of the Moore-Penrose inverse of the variance-covariance matrix is given as well as a symmetric representation of a multinormal density approximation to the multinomial distribution. These formulae facilitate symmetric manipulation of the matrix and are useful in statistical modelling and computation involving the logistic density transformation of the multinomial distribution and in computer simulations of dynamic models in population genetics. Each element of the Cholesky factors is given interesting probabilistic interpretations.
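The flavour of the symbolic factorization can be shown with a square-root-free LDL' form of the covariance of the first k - 1 multinomial cells. The closed form and indexing convention below are our own reconstruction, verified numerically against the covariance matrix rather than taken from the paper.

```python
import numpy as np

def multinomial_ldl(p, N=1):
    """Square-root-free (LDL') Cholesky factors of the covariance of the
    first k-1 cells of a multinomial(N, p) vector, via the closed form

        D[i] = N * p[i] * q[i] / q[i-1],   L[j, i] = -p[j] / q[i]  (j > i),

    where q[i] = 1 - p[0] - ... - p[i] and q[-1] = 1. Only simple ratios of
    cell probabilities appear, so no square roots are evaluated."""
    p = np.asarray(p, dtype=float)
    k1 = len(p) - 1                          # work with the first k-1 cells
    q = 1.0 - np.cumsum(p[:k1])              # tail sums q[0], ..., q[k-2]
    qprev = np.concatenate(([1.0], q[:-1]))
    D = N * p[:k1] * q / qprev
    L = np.eye(k1)
    for i in range(k1):
        L[i + 1:, i] = -p[i + 1:k1] / q[i]
    return L, D
```

Restricting to k - 1 cells makes the covariance nonsingular; the last cell is determined by the others, which is why the full k x k matrix is singular.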

Journal ArticleDOI
TL;DR: In this paper, a self-consistent estimator of the parameters is defined and it is shown that the maximum likelihood estimator is also a self consistent estimator, and an algorithm based on selfconsistency equations is provided for the computation of the estimators.
Abstract: Estimation in a three-state Markov process with irreversible transitions in the presence of interval-censored data is considered. A nonparametric maximum likelihood procedure for the estimation of the cumulative transition intensities is presented. A self-consistent estimator of the parameters is defined and it is shown that the maximum likelihood estimator is a self-consistent estimator. This extends the idea of self-consistency introduced by Efron to the estimation of more than one parameter. An algorithm, based on selfconsistency equations, is provided for the computation of the estimators

Journal ArticleDOI
TL;DR: In this article, a joint multinormal distribution for all the variables as the starting point is proposed, and a hypothesis is formulated stating that only a fixed small number of components constructed from the explanatory variables is relevant for prediction, and maximum-likelihood-type estimates of the parameters in the model are developed under this hypothesis.
Abstract: SUMMARY Most regression methods for multicollinear data are only based on assumptions about the conditional distribution of the dependent variable, given the explanatory variables. Here we propose a new prediction method taking a joint multinormal distribution for all the variables as the starting point. A hypothesis is formulated stating that only a fixed small number of components constructed from the explanatory variables is relevant for prediction, and maximum-likelihood-type estimates of the parameters in the model are developed under this hypothesis.

Journal ArticleDOI
TL;DR: In this article, nonparametric information bounds are defined for the smoothing parameter h0, which minimizes the squared error of a kernel or smoothing spline estimator, and asymptotically efficient estimators of h0 are presented.
Abstract: A striking feature of curve estimation is that the smoothing parameter h0, which minimizes the squared error of a kernel or smoothing spline estimator, is very difficult to estimate. This is manifest both in slow rates of convergence and in high variability of standard methods such as cross-validation. We quantify this difficulty by describing nonparametric information bounds and exhibit asymptotically efficient estimators of h0 that attain the bounds. The efficient estimators are substantially less variable than cross-validation (and other current procedures) and simulations suggest that they may offer improvements at moderate sample sizes, at least in terms of minimizing the squared error.

Journal ArticleDOI
TL;DR: The authors compare two estimators of error variance, both based on quadratic forms in the residuals about smoothing spline fits to data, and show that the commonly used estimator of variance has the serious drawback of underestimating the error variance for small choices of the smoothing parameter.
Abstract: SUMMARY We compare two estimators of error variance, both based on quadratic forms in the residuals about smoothing spline fits to data. The estimators are compared over the whole range of values of the smoothing parameter as well as for data-based choices of the smoothing parameter. We show that the commonly used estimator of variance has the serious drawback of underestimating the error variance for small choices of the smoothing parameter. This drawback is not shared by a simple, but more computationally intensive, alternative.