
Showing papers in "Biometrika in 1985"


Journal ArticleDOI
TL;DR: In this article, the authors consider maximum likelihood estimation of the parameters of a probability density which is zero for x < θ; for α > 2 the information matrix is finite and the classical asymptotic properties continue to hold.
Abstract: SUMMARY We consider maximum likelihood estimation of the parameters of a probability density which is zero for x < θ. For α > 2, the information matrix is finite and the classical asymptotic properties continue to hold. For α = 2 the maximum likelihood estimators are asymptotically efficient and normally distributed, but with a different rate of convergence. For 1 < α < 2, the maximum likelihood estimators exist in general, but are not asymptotically normal, while the question of asymptotic efficiency is still unsolved. For α < 1, the maximum likelihood estimators may not exist at all, but alternatives are proposed. All these results are already known for the case of a single unknown location parameter θ, but are here extended to the case in which there are additional unknown parameters. The paper concludes with a discussion of the applications in extreme value theory.

826 citations



Journal ArticleDOI
TL;DR: In this article, an analogue of Tukey's (1949) one degree of freedom for nonadditivity test is proposed as a diagnostic for linearity versus a second-order Volterra expansion.
Abstract: The past few years have seen a revived interest in the study of nonlinear stationary time series. The most general form of a nonlinear stationary process is that referred to as a Volterra expansion. This is to a linear process what a higher order polynomial is to a linear function. The so-called bilinear model is a special case which is quite broad and is defined by a small number of parameters. Volterra expansions involve more than second-order theory and require higher-order cumulant spectra. Various tests for linearity have been proposed although none is totally satisfactory. Because of the similarity of Volterra expansions to polynomials, an analogue of Tukey's (1949) one degree of freedom for nonadditivity test seems quite reasonable as a diagnostic for linearity versus a second-order Volterra expansion. Such a test would be time domain based and computationally less complex than the frequency domain based alternatives, directly generalizable to higher order than second and possibly suggestive of a power transformation toward linearity.
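As a concrete, hypothetical illustration of the kind of time-domain diagnostic described above, the sketch below fits an AR(p) regression and then tests whether the squared fitted values carry additional explanatory power, in the spirit of Tukey's one degree of freedom for nonadditivity. The lag order, the F approximation and all function names are my own choices, not the authors' exact statistic.

```python
import numpy as np

def ar_design(y, p):
    """Response y_t and lagged design [1, y_{t-1}, ..., y_{t-p}] for t = p, ..., n-1."""
    n = len(y)
    cols = [np.ones(n - p)] + [y[p - j:n - j] for j in range(1, p + 1)]
    return y[p:], np.column_stack(cols)

def ols(X, z):
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    return beta, z - X @ beta

def tukey_style_linearity_test(y, p=2):
    """F statistic for adding (fitted values)^2 to an AR(p) regression."""
    z, X = ar_design(np.asarray(y, dtype=float), p)
    beta, resid0 = ols(X, z)
    fitted = X @ beta
    Xa = np.column_stack([X, fitted ** 2])      # the single added "nonadditivity" term
    _, resid1 = ols(Xa, z)
    rss0, rss1 = resid0 @ resid0, resid1 @ resid1
    df = len(z) - Xa.shape[1]
    return (rss0 - rss1) / (rss1 / df)          # approximately F(1, df) under linearity

# a linear AR(1) series: the statistic should be unremarkable
rng = np.random.default_rng(0)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.5 * y[t - 1] + rng.standard_normal()
print(tukey_style_linearity_test(y, p=2))
```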

328 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of finding an approximate confidence interval for a real-valued function θ = t(η) of the unknown mean vector η of a multivariate normal data vector y with covariance matrix the identity, and show that the standard approximation based on maximum likelihood theory, θ̂ ± z^(α)σ̂, can be quite misleading when θ is nonlinear in η.
Abstract: SUMMARY We consider the following class of problems: having observed a multivariate normal data vector y with unknown mean vector η, covariance matrix the identity, find an approximate confidence interval for θ = t(η), a real-valued function of η. A simple geometric construction is given which leads to highly accurate solutions. This construction shows that the standard approximation based on maximum likelihood theory, θ̂ ± z^(α)σ̂, can be quite misleading when θ is nonlinear in η. We discuss bootstrap-based confidence intervals which remove most of the error in the standard approximation, at the expense of considerably more calculation. The bootstrap intervals are invariant under transformation of both y and θ, and so they automatically produce accurate solutions in problems which can be transformed to multivariate normality, without requiring knowledge of the normalizing transformation.
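The following toy sketch (mine, not the paper's geometric construction) contrasts the standard interval θ̂ ± zσ̂ with a simple parametric-bootstrap percentile interval for the nonlinear parameter θ = ||η|| when y ~ N(η, I); the choice of t(η) and all names are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
eta = np.array([2.0, 1.0, 0.0, 0.0, 0.0])        # true mean vector (unknown in practice)
y = rng.normal(eta, 1.0)                          # one observation of y ~ N(eta, I)

theta_hat = np.linalg.norm(y)                     # plug-in estimate of theta = ||eta||
# delta method: grad t(eta) = eta/||eta||, covariance I, hence sigma_hat = 1
z = norm.ppf(0.95)
standard_90 = (theta_hat - z, theta_hat + z)

# parametric bootstrap: resample y* ~ N(y, I) and recompute theta*
boot = np.linalg.norm(rng.normal(y, 1.0, size=(4000, len(y))), axis=1)
percentile_90 = tuple(np.percentile(boot, [5, 95]))

print(standard_90, percentile_90)
```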

306 citations


Journal ArticleDOI
TL;DR: In this paper, a generalized linear model is extended to provide estimates of location and variance parameters for mixed models fitted to binomial data formed by classifying samples from an underlying normal distribution.
Abstract: SUMMARY Methods for generalized linear models are extended to provide estimates of location and variance parameters for mixed models fitted to binomial data formed by classifying samples from an underlying normal distribution. The method estimates the parameters directly on the underlying scale. For a balanced one-way random effects model, the variance estimator simplifies to the usual analysis of variance one. The estimation of variances and the prediction of random effects for binomial traits is required by animal breeders. The predictors given are analogous to best linear unbiased predictors (Henderson, 1973) but differ from those presented by Harville & Mee (1984).

280 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider the case where the population value of the parameter vector is a boundary point of the feasible region and show that the asymptotic distribution of test statistic is a mixture of chi-squared distributions.
Abstract: SUMMARY The analysis of moment structural models has become an important tool of investigation in behavioural, educational and economic studies. The chi-squared large-sample test is routinely employed to assess the goodness of fit of the model considered. However, in order to invoke the standard asymptotic distribution theory certain regularity conditions have to be met. Here we consider the case where the population value of the parameter vector is a boundary point of the feasible region. We show that in this case the asymptotic distribution of the test statistic is a mixture of chi-squared distributions. The problem of finding the corresponding weights is discussed.
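A minimal simulation of the boundary effect, in the simplest case I can think of (not one of the paper's moment-structure models): testing θ = 0 against θ ≥ 0 from a single N(θ, 1) observation, where the likelihood ratio statistic has null distribution 0.5 χ²₀ + 0.5 χ²₁ rather than χ²₁.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)          # data generated under H0: theta = 0
lr = np.maximum(x, 0.0) ** 2              # likelihood ratio statistic for H0 vs theta >= 0

crit_naive = chi2.ppf(0.95, df=1)         # 3.84: pretends the null law is chi2_1
crit_mixture = chi2.ppf(0.90, df=1)       # 2.71: correct for 0.5*chi2_0 + 0.5*chi2_1
print("rejection rate, naive chi2_1 cutoff :", np.mean(lr > crit_naive))    # about 0.025
print("rejection rate, mixture cutoff      :", np.mean(lr > crit_mixture))  # about 0.05
```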

269 citations


Journal ArticleDOI
TL;DR: In this paper, the authors establish finite-sample optimal estimation results for a discrete stochastic process observed in a sample of finite size n.
Abstract: Generally, in the literature on stochastic processes, estimation is investigated in terms of asymptotic properties. In this paper we establish some finite sample optimal estimation. Let {Y1, Y2, ...} be a discrete stochastic process. Since we will be discussing results for samples of finite size n, we restrict the process to R^n. In the following, the term 'parameter' is used in the same broad sense as by Godambe & Thompson (1984). Let ℱ be a class of probability distributions F on R^n and let θ = θ(F), F ∈ ℱ, be a real parameter. Let h_i be a real function of y_1, ..., y_i and θ such that ...

256 citations


Journal ArticleDOI
TL;DR: In this paper, the general location model of Olkin & Tate (1961) and extensions introduced by Krzanowski (1980, 1982) form the basis for the maximum likelihood procedures for analyzing mixed continuous and categorical data with missing values.
Abstract: SUMMARY Maximum likelihood procedures for analysing mixed continuous and categorical data with missing values are presented. The general location model of Olkin & Tate (1961) and extensions introduced by Krzanowski (1980, 1982) form the basis for our methods. Maximum likelihood estimation with incomplete data is achieved by an application of the EM algorithm (Dempster, Laird & Rubin, 1977). Special cases of the algorithm include Orchard & Woodbury's (1972) algorithm for incomplete normal samples, Fuchs's (1982) algorithms for log linear modelling of partially classified contingency tables, and Day's (1969) algorithm for multivariate normal mixtures. Applications include: (a) imputation of missing values, (b) logistic regression and discriminant analysis with missing predictors and unclassified observations, (c) linear regression with missing continuous and categorical predictors, and (d) parametric cluster analysis with incomplete data. Methods are illustrated using data from the St Louis Risk Research Project. Some key words: Cluster analysis; Discriminant analysis; EM algorithm; Incomplete data; Linear regression; Logistic regression; Log linear model; Mixture model.
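As one self-contained illustration of the EM machinery invoked above, here is a sketch of the univariate normal-mixture special case (Day, 1969) with a common variance; the starting values, iteration count and variable names are arbitrary choices of mine.

```python
import numpy as np

def em_two_normals(y, n_iter=200):
    """EM for a two-component normal mixture with a common variance."""
    y = np.asarray(y, dtype=float)
    pi, mu1, mu2, sd = 0.5, y.min(), y.max(), y.std()   # crude starting values
    for _ in range(n_iter):
        # E-step: posterior probability of component 1 (common-sd constants cancel)
        d1 = pi * np.exp(-0.5 * ((y - mu1) / sd) ** 2)
        d2 = (1 - pi) * np.exp(-0.5 * ((y - mu2) / sd) ** 2)
        w = d1 / (d1 + d2)
        # M-step: weighted maximum likelihood updates
        pi = w.mean()
        mu1 = np.sum(w * y) / np.sum(w)
        mu2 = np.sum((1 - w) * y) / np.sum(1 - w)
        sd = np.sqrt(np.sum(w * (y - mu1) ** 2 + (1 - w) * (y - mu2) ** 2) / len(y))
    return pi, mu1, mu2, sd

rng = np.random.default_rng(2)
y = np.concatenate([rng.normal(0, 1, 300), rng.normal(3, 1, 200)])
print(em_two_normals(y))
```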

240 citations


Journal ArticleDOI
TL;DR: In this article, a locally best linear combination of individual test statistics obtained under different conditions is proposed to test the equality of the two treatments, allowing different patterns of missing observations for the two groups to be compared.
Abstract: SUMMARY In comparing the effectiveness of two treatments, repeated measurements of the same characteristic are often taken under two or more distinct conditions for each experimental subject. A locally best linear combination of individual test statistics obtained under different conditions is proposed to test the equality of the two treatments. The test procedure allows different patterns of missing observations for the two groups to be compared. Special cases such as combining Wilcoxon tests, t tests and 2 x 2 tables are discussed in detail with real examples.

226 citations


Journal ArticleDOI

197 citations


Journal ArticleDOI
TL;DR: In this paper, a procedure to select the simplest acceptable models for a multidimensional contingency table is proposed, based on two rules: first, if a model is accepted, then all models that include it are considered to be accepted, and secondly, if the model is rejected, all its submodels are rejected.
Abstract: SUMMARY A procedure to select the simplest acceptable models for a multidimensional contingency table is proposed. It is based on two rules: first, that if a model is accepted, then all models that include it are considered to be accepted, and secondly, that if a model is rejected, then all its submodels are considered to be rejected. Two versions are described, one for the class of graphical models, and the other for the class of hierarchical log linear models. Application of both versions to a six-dimensional table is illustrated. The procedure can be regarded as an alternative to fitting all possible models, made computationally feasible by application of the two rules. It is a generalization of the procedure proposed by Havránek (1984), but is in many cases considerably faster.

Journal ArticleDOI
TL;DR: In this paper, the perturbation theory of real symmetric matrices is used to detect influential observations in principal component analysis (PCA) and the influence function is used as a diagnostic tool.
Abstract: SUMMARY In linear regression, the theoretical influence function and the various sample versions of it have an established place as diagnostic tools. These same functions are developed here to provide methods for the detection of influential observations in principal components analysis. The perturbation theory of real symmetric matrices unifies this development. Some interesting points of contrast with the regression case are noted and explained theoretically.

Journal ArticleDOI
TL;DR: In this article, the Shiryayev-Roberts and Page procedures are compared in the context of continuous time in order to use the machinery of diffusion processes to perform explicitly certain calculations, which seem impossible in discrete time.
Abstract: The purpose of the present paper is to make a quantitative comparison of the Shiryayev-Roberts and Page procedures. We do this in the context of continuous time in order to use the machinery of diffusion processes to perform explicitly certain calculations, which seem impossible in discrete time. Although the continuous time results are not especially good approximations to the corresponding quantities in discrete time, they provide very useful comparative information on which to base selection of a stopping rule. This paper is organized as follows. The Shiryayev-Roberts process is defined and shown to be a novel diffusion process with some surprising properties. We also specify more precisely the basis for our comparison of the two procedures and give the results of some elementary calculations. These developments contain an asymptotic evaluation. We define a modification of our basic procedure and give an asymptotic evaluation of its average run length. Numerical comparisons and a discussion of their significance are contained next. Our conclusions are roughly these. In simple situations where the two procedures can be directly compared, neither seems dramatically superior to the other. However, the Shiryayev-Roberts procedure is more easily adapted to complex circumstances and consequently warrants additional study.
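For readers who want to experiment, the standard discrete-time analogues of the two procedures are easy to state; the sketch below is my own, with an assumed N(0,1) to N(δ,1) mean shift and arbitrary thresholds, and is not the continuous-time analysis of the paper.

```python
import numpy as np

def change_detection(x, delta=1.0, cusum_threshold=5.0, sr_threshold=100.0):
    """Run Page (CUSUM) and Shiryayev-Roberts detectors on a sequence x."""
    lam = np.exp(delta * x - 0.5 * delta ** 2)   # per-observation likelihood ratios f1/f0
    w, r = 0.0, 0.0
    t_page = t_sr = None
    for n, l in enumerate(lam, start=1):
        w = max(0.0, w + np.log(l))              # Page: W_n = max(0, W_{n-1} + log L_n)
        r = (1.0 + r) * l                        # Shiryayev-Roberts: R_n = (1 + R_{n-1}) L_n
        if t_page is None and w > cusum_threshold:
            t_page = n
        if t_sr is None and r > sr_threshold:
            t_sr = n
    return t_page, t_sr

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(0, 1, 200), rng.normal(1.0, 1, 100)])  # change at n = 201
print(change_detection(x))
```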

Journal ArticleDOI
TL;DR: In this paper, a graphical method to investigate possible association of two variates as manifested in a sample of bivariate measurements is presented, which is designed so that the plot is approximately horizontal under independence and under various forms of association it produces corresponding characteristic patterns.
Abstract: SUMMARY This paper presents a graphical method to be used, in conjunction with a scatterplot, to investigate possible association of two variates as manifested in a sample of bivariate measurements. The method is designed so that the plot is approximately horizontal under independence, and under various forms of association it produces corresponding characteristic patterns. Examples given include application to study of regression residuals, dependence of two spatial point processes, serial association of time series, and comparison of two time series.

Journal ArticleDOI
TL;DR: In this paper, the authors considered extensions of logistic regression to the case where the binary outcome variable is observed repeatedly for each subject and proposed two working models that lead to consistent estimates of the regression parameters and of their variances under mild assumptions about the time dependence within each subject's data.
Abstract: SUMMARY This paper considers extensions of logistic regression to the case where the binary outcome variable is observed repeatedly for each subject. We propose two working models that lead to consistent estimates of the regression parameters and of their variances under mild assumptions about the time dependence within each subject's data. The efficiency of the proposed estimators is examined. An analysis of stress in mothers with infants is presented to illustrate the proposed method.
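A rough sketch of the "working model plus robust variance" idea: fit ordinary logistic regression as if the repeated binary responses were independent, then use cluster-level sandwich standard errors that remain valid under within-subject dependence. The implementation and toy data are mine and only convey the flavour of the approach, not the authors' specific working models or their efficiency results.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Ordinary (working independence) logistic regression by Newton iterations."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        hess = X.T @ (X * (p * (1 - p))[:, None])
        beta = beta + np.linalg.solve(hess, X.T @ (y - p))
    return beta

def sandwich_se(X, y, groups, beta):
    """Cluster-robust (sandwich) standard errors for the working-independence fit."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    bread = np.linalg.inv(X.T @ (X * (p * (1 - p))[:, None]))
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        idx = groups == g
        u = X[idx].T @ (y[idx] - p[idx])   # score contribution of one subject
        meat += np.outer(u, u)
    return np.sqrt(np.diag(bread @ meat @ bread))

# toy data: 100 subjects, 4 correlated binary responses each (shared random effect)
rng = np.random.default_rng(4)
subjects = np.repeat(np.arange(100), 4)
x1 = rng.standard_normal(400)
b = np.repeat(rng.normal(0, 1, 100), 4)        # induces within-subject correlation
y = (rng.random(400) < 1 / (1 + np.exp(-(0.5 * x1 + b)))).astype(float)
X = np.column_stack([np.ones(400), x1])
beta = fit_logistic(X, y)
print(beta, sandwich_se(X, y, subjects, beta))
```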

Journal ArticleDOI
TL;DR: In this article, power transformations for achieving distributional symmetry are discussed and the Box-Cox method or a robust adaptation of it is found to be the generally most suitable method.
Abstract: SUMMARY Power transformations for achieving distributional symmetry are discussed. Estimates of the transformation power are based on general measures of symmetry. They are shown to be consistent and asymptotically normal. Use of the skewness coefficient as a measure of symmetry is shown to be optimal in an important special case. The methods are compared to the likelihood methods of Box & Cox (1964) and alternative methods of Hinkley (1975, 1977). The Box-Cox method or a robust adaptation of it (Carroll, 1980; Bickel & Doksum, 1981) is found to be the generally most suitable method.
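The skewness-based estimator mentioned above can be sketched very simply: choose the Box-Cox power λ that drives the sample skewness coefficient of the transformed data to zero. The grid search and function names below are my own illustrative choices.

```python
import numpy as np

def boxcox(y, lam):
    """Box-Cox transform for positive data."""
    return np.log(y) if abs(lam) < 1e-8 else (y ** lam - 1.0) / lam

def skewness(z):
    z = z - z.mean()
    return np.mean(z ** 3) / np.mean(z ** 2) ** 1.5

def symmetry_power(y, grid=np.linspace(-2, 2, 401)):
    """Power whose transformed data have skewness closest to zero."""
    skews = np.array([skewness(boxcox(y, lam)) for lam in grid])
    return grid[np.argmin(np.abs(skews))]

rng = np.random.default_rng(5)
y = rng.lognormal(mean=0.0, sigma=0.7, size=500)   # the log (lambda = 0) symmetrizes
print(symmetry_power(y))                            # should be near 0
```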

Journal ArticleDOI
TL;DR: In this paper, the second-order moment structure of time series models is used to derive a canonical analysis in time series modelling and a canonical correlation approach for tentative order determination in building autoregressive moving average models is proposed.
Abstract: SUMMARY The second-order moment structure of time series models is used to derive a canonical analysis in time series modelling. Consistency properties of certain canonical correlations and the corresponding eigenvectors are shown. Based on these properties, a canonical correlation approach for tentative order determination in building autoregressivemoving average models is proposed. This approach can handle directly nonstationary as well as stationary processes and it also provides consistent estimates of the autoregressive parameters involved. The asymptotic distribution of the identification statistic is discussed.

Journal ArticleDOI
TL;DR: In this article, influence functions for the regression parameters in the proportional hazards model are presented, and it is suggested that empirical influence functions, computed for each observation and each covariate, can be useful in an informal way to identify influential observations.
Abstract: SUMMARY Influence functions for the regression parameters in the proportional hazards model are presented. It is suggested that empirical influence functions, computed for each observation and each covariate, can be useful in an informal way to identify influential observations. This is illustrated on the Stanford heart transplant data and two other examples.

Journal ArticleDOI
TL;DR: In this article, a modification of the heterogeneity score test is presented wherein a consistent but inefficient estimator of the parameter of interest is substituted for the overall maximum likelihood estimator, using the C(α) theory of Neyman (1959).
Abstract: SUMMARY For cases in which a parameter may be estimated from several independent data sets, Mather (1935) proposed a heterogeneity test based on the efficient scores for the individual data sets evaluated at the overall maximum likelihood estimator. Following Fisher (1946), a modification of the heterogeneity score test is presented wherein a consistent but inefficient estimator of the parameter of interest is substituted for the overall maximum likelihood estimator. The modified test is also derived using the C(α) theory of Neyman (1959). Using the modified procedure, we derive a test for a common odds ratio in several 2 x 2 tables based on the Mantel-Haenszel estimator.
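For reference, the Mantel-Haenszel estimator mentioned above has a simple closed form; the sketch below computes it for a list of 2 x 2 tables (the modified heterogeneity test itself is not reproduced here).

```python
import numpy as np

def mantel_haenszel_or(tables):
    """Mantel-Haenszel common odds ratio for 2x2 tables [[a, b], [c, d]]."""
    num = den = 0.0
    for t in tables:
        (a, b), (c, d) = np.asarray(t, dtype=float)
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den

tables = [[[10, 5], [4, 11]], [[7, 8], [3, 12]], [[12, 6], [5, 9]]]
print(mantel_haenszel_or(tables))
```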

Journal ArticleDOI
TL;DR: In this article, it is shown that confidence regions for nonlinear parameters constructed by the repeated sampling principle are asymptotically valid for sequential designs in general linear models, and related questions of consistency of parameter estimators and convergence of sequential design to an optimal design are answered positively.
Abstract: SUMMARY It is shown that confidence regions for nonlinear parameters constructed by the repeated-sampling principle are asymptotically valid for sequential designs in general linear models. The related questions of consistency of parameter estimators and convergence of sequential design to an optimal design are answered positively. An empirical finding of Ford & Silvey (1980) is given a theoretical justification.

Journal ArticleDOI
TL;DR: In this paper, an asymptotic expansion for the null distribution of the efficient score statistic for testing a composite hypothesis in the presence of nuisance parameters is derived, and an interpretation of the terms occurring in the expansion is given.
Abstract: SUMMARY An asymptotic expansion for the null distribution of the efficient score statistic for testing a composite hypothesis in the presence of nuisance parameters is derived, and an interpretation of the terms occurring in the expansion is given. The use of the expansion to modify the percentage points of the large-sample reference χ² distribution is discussed. The first three moments of the null distribution are obtained and are used as a check on the accuracy of the algebra via a comparison with the first three moments of the index of dispersion test for homogeneity of Poisson parameters.

Journal ArticleDOI
M. Lundy
TL;DR: The annealing algorithm described in this paper is a widely applicable stochastic search procedure which can escape local optima; the evolutionary tree problem is used to illustrate the method of application.
Abstract: SUMMARY There are several problems in statistics which can be formulated so that the desired solution is the global minimum of some explicitly defined objective function. In many cases the number of candidate solutions increases exponentially with the size of the problem making exhaustive search impossible, but descent procedures, devised to reduce the number of solutions examined, can terminate with local minima. In this paper we describe the annealing algorithm, a widely applicable stochastic search procedure which can escape local optima, and we use the evolutionary tree problem to illustrate the method of application.
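A generic sketch of the annealing idea follows. The paper's application is to evolutionary trees; the objective, the neighbourhood move and the geometric cooling schedule below are arbitrary choices of mine, applied to a toy travelling-salesman-style problem.

```python
import numpy as np

def anneal(objective, neighbour, x0, t0=1.0, cooling=0.999, n_steps=20_000, seed=0):
    """Minimise objective by simulated annealing from starting state x0."""
    rng = np.random.default_rng(seed)
    x, fx, t = x0, objective(x0), t0
    best, fbest = x, fx
    for _ in range(n_steps):
        y = neighbour(x, rng)
        fy = objective(y)
        # accept uphill moves with probability exp(-(fy - fx)/t); this is what
        # lets the search escape local minima
        if fy <= fx or rng.random() < np.exp(-(fy - fx) / t):
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
        t *= cooling
    return best, fbest

# toy combinatorial example: order 30 random points to shorten the closed tour
pts = np.random.default_rng(1).random((30, 2))

def tour_length(perm):
    p = pts[list(perm)]
    return np.sum(np.linalg.norm(p - np.roll(p, -1, axis=0), axis=1))

def swap_two(perm, rng):
    i, j = rng.choice(len(perm), 2, replace=False)
    perm = list(perm)
    perm[i], perm[j] = perm[j], perm[i]
    return tuple(perm)

print(anneal(tour_length, swap_two, tuple(range(30)))[1])
```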

Journal ArticleDOI
TL;DR: In this article, the authors deal with inference for the common odds ratio i/l when the binomial assumption is invalid and derive the consistency and asymptotic normality of the Mantel & Haenszel (1959) estimator.
Abstract: SUMMARY When data can be presented as a series of k 2 x 2 tables with cell counts (xi, ni-xi; yi, mi - yi), it is often assumed that xi and yi are binomially distributed. This paper deals with inference for the common odds ratio i/l when the binomial assumption is invalid. When k increases, the consistency and asymptotic normality of the Mantel & Haenszel (1959) estimator is derived. The conditional maximum likelihood estimator is shown to be inconsistent and the asymptotic bias is computed when either the first-order Markov chain or beta-binomial model is assumed. The Mantel-Haenszel test for testing f = 1 is also shown to be inappropriate through some simulation studies. Two consistent test statistics are proposed and shown to be comparable to each other in terms of size and efficiency. Some possible further work is described.

Journal ArticleDOI
TL;DR: In this paper, the Mahalanobis distance is shown to be an appropriate measure of distance between two elliptic distributions having different locations but a common shape, which extends a result long familiar in multivariate analysis to a class of nonnormal distributions.
Abstract: SUMMARY The Mahalanobis distance is shown to be an appropriate measure of distance between two elliptic distributions having different locations but a common shape. This extends a result long familiar in multivariate analysis to a class of nonnormal distributions. It can also be used to show that the sample version of the Mahalanobis distance is appropriate under both estimative and predictive approaches to estimation for the family of multivariate normal distributions differing only in location.
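A small numerical illustration of the sample version referred to above, using a pooled covariance matrix; it is not tied to the elliptic-family theory developed in the paper.

```python
import numpy as np

def mahalanobis_between(x1, x2):
    """Sample Mahalanobis distance between two groups with pooled covariance."""
    m1, m2 = x1.mean(axis=0), x2.mean(axis=0)
    s1, s2 = np.cov(x1, rowvar=False), np.cov(x2, rowvar=False)
    n1, n2 = len(x1), len(x2)
    pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
    diff = m1 - m2
    return np.sqrt(diff @ np.linalg.solve(pooled, diff))

rng = np.random.default_rng(6)
cov = [[1.0, 0.5], [0.5, 1.0]]
x1 = rng.multivariate_normal([0, 0], cov, size=200)
x2 = rng.multivariate_normal([1, 0], cov, size=200)
print(mahalanobis_between(x1, x2))   # population value is about 1.15 here
```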

Journal ArticleDOI
TL;DR: In this paper, group sequential tests for the analysis of survival data that compare two treatments and allow for the adjustment of other concomitant variables are discussed, which allow early termination of a study if large treatment differences occur.
Abstract: SUMMARY Group sequential tests for the analysis of survival data that compare two treatments and allow for the adjustment of other concomitant variables are discussed. These methods allow early termination of a study if large treatment differences occur. Based on simulation results, we compare the operating characteristics of these adjusted tests with the unadjusted sequentially-computed log rank test.

Journal ArticleDOI
TL;DR: In this article, a model for repeated measurements designs where the measurements on the same unit are assumed to be correlated with known correlation coefficient is considered and it is shown that Latin squares with an additional balancing property are E-optimal for the weighted least squares estimate.
Abstract: SUMMARY We consider a model for repeated measurements designs where the measurements on the same unit are assumed to be correlated with known correlation coefficient. It is shown that Latin squares with an additional balancing property are E-optimal for the weighted least squares estimate.

Journal ArticleDOI
TL;DR: In this paper, several measures of influence for logistic regression have been suggested for the purpose of identifying observations which are influential relative to the estimation of the regression coefficients vector and the deviance.
Abstract: SUMMARY Several measures of influence for logistic regression have been suggested. These measures have been developed for the purpose of identifying observations which are influential relative to the estimation of the regression coefficients vector and the deviance. We propose measures for detecting influence relative to the determination of probabilities and the classification of future observations. The relationships among measures are indicated.
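The building blocks behind most such measures (leverages from the weighted hat matrix and standardized Pearson residuals) can be sketched as below; the Cook's-distance-style summary at the end is a common generic choice, not the probability- or classification-oriented measures proposed in the paper.

```python
import numpy as np

def logistic_diagnostics(X, y, beta):
    """Leverages, standardized Pearson residuals and a Cook-type summary."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    w = p * (1 - p)
    xw = X * np.sqrt(w)[:, None]
    # leverages: diagonal of H = W^{1/2} X (X'WX)^{-1} X' W^{1/2}
    h = np.einsum('ij,ij->i', xw @ np.linalg.inv(X.T @ (X * w[:, None])), xw)
    pearson = (y - p) / np.sqrt(w)
    std_pearson = pearson / np.sqrt(1 - h)
    cook = (std_pearson ** 2) * h / ((1 - h) * X.shape[1])   # Cook's-distance-like
    return h, std_pearson, cook

# toy usage with a crude Newton fit for beta
rng = np.random.default_rng(7)
X = np.column_stack([np.ones(200), rng.standard_normal(200)])
y = (rng.random(200) < 1 / (1 + np.exp(-(X @ np.array([-0.3, 1.0]))))).astype(float)
beta = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    beta += np.linalg.solve(X.T @ (X * (p * (1 - p))[:, None]), X.T @ (y - p))
print(logistic_diagnostics(X, y, beta)[2][:5])
```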

Journal ArticleDOI
TL;DR: In this article, the sequential nature of design can be ignored asymptotically, and links are forged with inference for stochastic processes and missing data problems for fixed designs.
Abstract: SUMMARY If an experiment is designed sequentially, repeated-sampling inference may not necessarily be made using distributional results that are valid for fixed designs. A few simple illustrative examples and alternative approaches are given. Sometimes the sequential nature of design can be ignored asymptotically. Links are forged with inference for stochastic processes and missing data problems.

Journal ArticleDOI
TL;DR: In this paper, the authors investigate nearest neighbour balanced block designs for autoregressive correlation models under generalized least squares estimation; the efficiency of the proposed designs relative to randomized block designs is computed and tabulated for some cases.
Abstract: SUMMARY Block designs for observations correlated in one dimension are investigated. Nearest neighbour balanced block designs turn out to be optimal for autoregressive correlation models when generalized least squares estimation is used. Performance of these designs for other correlation models is shown to be quite satisfactory. The efficiency of the proposed designs in comparison to the randomized block designs is computed and tabulated for some cases.

Journal ArticleDOI
TL;DR: In this article, a regression model with autoregressive moving average residuals is considered, and the authors show how to compute efficiently the likelihood, predict future values and interpolate missing values.
Abstract: SUMMARY For a time series regression model with correlated residuals we show how to compute efficiently the likelihood, predict future values and interpolate missing values. As an illustration we consider a regression model with autoregressive moving average residuals.
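As a present-day illustration (not the authors' algorithm), the three tasks of likelihood evaluation, prediction and interpolation of missing values can all be handled by a state-space (Kalman filter) implementation of regression with ARMA errors, for example via statsmodels' SARIMAX; the simulated data and settings below are assumptions made for the example.

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

# simulate a regression with ARMA(1, 1) residuals
rng = np.random.default_rng(8)
n = 300
x = rng.standard_normal(n)
u = rng.standard_normal(n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.6 * e[t - 1] + u[t] + 0.3 * u[t - 1]
y = 1.0 + 2.0 * x + e
y[100:110] = np.nan                                   # an interior gap to interpolate

model = SARIMAX(y, exog=x, order=(1, 0, 1), trend='c')
res = model.fit(disp=False)
print(res.llf)                                                      # Gaussian log-likelihood
print(res.get_prediction(start=100, end=109).predicted_mean)        # interpolated values
print(res.get_forecast(steps=5, exog=np.zeros((5, 1))).predicted_mean)  # future values
```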