
Showing papers on "Likelihood principle published in 1982"



Journal ArticleDOI
TL;DR: In this paper, the authors present EM algorithms for both exploratory and confirmatory models for maximum likelihood factor analysis; the algorithm is essentially the same in both cases and involves only simple least squares regression operations, with the largest matrix inversion required being for a q × q symmetric matrix, where q is the number of factors.
Abstract: The details of EM algorithms for maximum likelihood factor analysis are presented for both the exploratory and confirmatory models. The algorithm is essentially the same for both cases and involves only simple least squares regression operations; the largest matrix inversion required is for a q × q symmetric matrix, where q is the number of factors. The example that is used demonstrates that the likelihood for the factor analysis model may have multiple modes that are not simply rotations of each other; such behavior should concern users of maximum likelihood factor analysis and certainly should cast doubt on the general utility of second derivatives of the log likelihood as measures of precision of estimation.

608 citations
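The regression-based EM updates described above are easy to sketch. The following is a rough illustration on simulated data (my own construction, not the authors' code; the dimensions, starting values, and convergence check are arbitrary choices), using the Woodbury identity so that the only matrix inverted is q × q:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: p = 6 observed variables, q = 2 factors (illustrative sizes).
p, q, n = 6, 2, 2000
Lam_true = rng.normal(size=(p, q))
psi_true = rng.uniform(0.5, 1.0, size=p)
Z = rng.normal(size=(n, q))
X = Z @ Lam_true.T + rng.normal(size=(n, p)) * np.sqrt(psi_true)
S = np.cov(X, rowvar=False, bias=True)          # sample covariance matrix

Lam, psi = rng.normal(size=(p, q)), np.ones(p)  # starting values
lls = []
for _ in range(200):
    # E-step: regression coefficients beta = Lam'(Lam Lam' + Psi)^{-1},
    # computed via the Woodbury identity so only a q x q matrix is inverted.
    PsiInvLam = Lam / psi[:, None]                       # Psi^{-1} Lam
    M = np.linalg.inv(np.eye(q) + Lam.T @ PsiInvLam)     # q x q symmetric
    beta = M @ PsiInvLam.T
    Ezz = np.eye(q) - beta @ Lam + beta @ S @ beta.T     # E[zz' | x]
    # M-step: simple least-squares updates for loadings and uniquenesses.
    Lam = S @ beta.T @ np.linalg.inv(Ezz)
    psi = np.maximum(np.diag(S - Lam @ beta @ S), 1e-8)  # guard underflow
    # Track the log-likelihood (up to a constant): EM never decreases it.
    sign, logdet = np.linalg.slogdet(Lam @ Lam.T + np.diag(psi))
    lls.append(-0.5 * n * (logdet + np.trace(
        np.linalg.solve(Lam @ Lam.T + np.diag(psi), S))))

print(f"log-likelihood: {lls[0]:.1f} -> {lls[-1]:.1f}")
```

The multiple-modes warning in the abstract can be probed by rerunning the loop from different random starting values and comparing the final log-likelihoods.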


Journal ArticleDOI
TL;DR: In this paper, maximum likelihood estimation over an infinite-dimensional parameter space, where the maximum of the likelihood is not attained by any density because the parameter space is too big, is salvaged by Grenander's method of sieves: maximize over a constrained subspace and relax the constraint as the sample size grows.
Abstract: Maximum likelihood estimation often fails when the parameter takes values in an infinite dimensional space. For example, the maximum likelihood method cannot be applied to the completely nonparametric estimation of a density function from an $\operatorname{iid}$ sample; the maximum of the likelihood is not attained by any density. In this example, as in many other examples, the parameter space (positive functions with area one) is too big. But the likelihood method can often be salvaged if we first maximize over a constrained subspace of the parameter space and then relax the constraint as the sample size grows. This is Grenander's "method of sieves." Application of the method sometimes leads to new estimators for familiar problems, or to a new motivation for an already well-studied technique. We will establish some general consistency results for the method, and then we will focus on three applications.

375 citations
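The density-estimation example in the abstract can be made concrete with a minimal sketch (my own construction, not from the paper): take the sieve to be the set of histograms with m equal bins on [0, 1], within which the likelihood is attained by the empirical bin proportions, and let m grow with the sample size (the cube-root rate below is one common choice, not a claim about the paper's results):

```python
import numpy as np

rng = np.random.default_rng(1)

def sieve_mle_density(x, m):
    """ML estimate over the sieve of histograms with m equal bins on [0, 1].

    Within this constrained family the likelihood IS attained: the maximizer
    is the histogram with empirical bin proportions.
    """
    counts, edges = np.histogram(x, bins=m, range=(0.0, 1.0))
    heights = counts * m / len(x)               # proportion / bin width
    return edges, heights

# True density f(t) = 2t on [0, 1]; sample by inverse-CDF transform.
errs = {}
for n in (100, 10_000):
    x = np.sqrt(rng.uniform(size=n))
    m = max(2, round(n ** (1 / 3)))             # relax the sieve as n grows
    edges, heights = sieve_mle_density(x, m)
    mids = (edges[:-1] + edges[1:]) / 2
    errs[n] = np.mean((heights - 2 * mids) ** 2)  # crude error at midpoints
    print(n, m, round(errs[n], 4))
```

Without the constraint, the likelihood could be driven to infinity by densities spiking at the observed points; the sieve rules such estimators out at every finite n.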


Journal ArticleDOI
A. Buse1
TL;DR: In this paper, it was shown that if the log-likelihood function is quadratic then the three test statistics are numerically identical and have χ2 distributions for all sample sizes under the null hypothesis.
Abstract: By means of simple diagrams this note gives an intuitive account of the likelihood ratio, the Lagrange multiplier, and Wald test procedures. It is also demonstrated that if the log-likelihood function is quadratic then the three test statistics are numerically identical and have χ2 distributions for all sample sizes under the null hypothesis.

317 citations
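The numerical identity of the three tests under a quadratic log-likelihood is easy to verify directly. Here is a small sketch (my own, not from the note) for a normal mean with known unit variance, a case where the log-likelihood is exactly quadratic:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(loc=0.3, scale=1.0, size=50)    # variance known to be 1
n, xbar, mu0 = len(x), x.mean(), 0.0           # test H0: mu = 0

def loglik(mu):
    return -0.5 * np.sum((x - mu) ** 2)        # up to an additive constant

wald = n * (xbar - mu0) ** 2                   # needs only the unrestricted fit
lr = 2 * (loglik(xbar) - loglik(mu0))          # needs both fits
score = np.sum(x - mu0)                        # d loglik / d mu at mu0
lm = score ** 2 / n                            # score^2 / Fisher information

print(wald, lr, lm)                            # all three coincide exactly
```

For non-quadratic log-likelihoods the three statistics differ in finite samples, which is what makes the diagrammatic comparison in the paper informative.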


Journal ArticleDOI
John T. Kent1
TL;DR: In this paper, the authors examined the distribution of the likelihood ratio statistic when the data do not come from the parametric model, but when the 'nearest' member of the parametric family still satisfies the null hypothesis.
Abstract: SUMMARY The usual asymptotic chi-squared distribution for the likelihood ratio test statistic is based on the assumptions that the data come from the parametric model under consideration and that the parameter satisfies the null hypothesis. In this paper we examine the distribution of the likelihood ratio statistic when the data do not come from the parametric model, but when the 'nearest' member of the parametric family still satisfies the null hypothesis. In general, the likelihood ratio statistic no longer follows an asymptotic chi-squared distribution, and an alternative statistic based on the union-intersection approach is proposed.

272 citations


Journal ArticleDOI
TL;DR: In this article, the authors show that the likelihood function always has a stationary point at one particular set of parameter values, and a condition is given when this point is a local maximum and when it is not.

192 citations


Journal ArticleDOI
TL;DR: In this paper, the authors unify the classical maximum likelihood arguments and apply them to the nonlinear econometric literature, where the objective function of the optimization problem can be treated as if it were the likelihood to derive the Wald test statistic, the likelihood ratio test statistic, and Rao's efficient score statistic.
Abstract: After reading a few articles in the nonlinear econometric literature one begins to notice that each discussion follows roughly the same lines as the classical treatment of maximum likelihood estimation. There are some technical problems, having to do with simultaneously conditioning on the exogenous variables and subjecting the true parameter to a Pitman drift, which prevent the use of the classical methods of proof, but the basic impression of similarity is correct. An estimator, be it nonlinear least squares, three-stage nonlinear least squares, or whatever, is the solution of an optimization problem. And the objective function of the optimization problem can be treated as if it were the likelihood to derive the Wald test statistic, the likelihood ratio test statistic, and Rao's efficient score statistic. Their asymptotic null and non-null distributions can be found using arguments fairly similar to the classical maximum likelihood arguments. In this article we exploit these observations and unify...

160 citations


Book ChapterDOI
TL;DR: The chapter discusses the efficiency of the mixture approach for k = 2 normal subpopulations, contrasting the asymptotic theory with small sample results available from simulation.
Abstract: Publisher Summary This chapter discusses the classification and mixture maximum likelihood approaches to cluster analysis. In principle the maximization process for the classification maximum likelihood procedure can be carried out since it is just a matter of computing the maximum value of the likelihood over all possible partitions of the n observations to the k subpopulations. However, unless n is quite small, searching over all possible partitions is prohibitive. The chapter discusses the efficiency of the mixture approach for k = 2 normal subpopulations, contrasting the asymptotic theory with small sample results available from simulation.

110 citations
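A minimal sketch of the mixture maximum likelihood approach for k = 2 normal subpopulations (my own toy example; unit variances are assumed known for brevity, which the chapter does not require). The classification approach would instead hard-assign each observation to one subpopulation at every step, which is what makes its exhaustive search over partitions prohibitive:

```python
import numpy as np

rng = np.random.default_rng(3)
# Two normal subpopulations with unit variance and mixing weight 0.3.
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(2, 1, 700)])

pi_, mu = 0.5, np.array([-1.0, 1.0])           # starting values
for _ in range(200):
    # E-step: posterior probability that each point came from component 0.
    d0 = pi_ * np.exp(-0.5 * (x - mu[0]) ** 2)
    d1 = (1 - pi_) * np.exp(-0.5 * (x - mu[1]) ** 2)
    w = d0 / (d0 + d1)
    # M-step: weighted mixing proportion and component means.
    pi_ = w.mean()
    mu = np.array([np.sum(w * x) / np.sum(w),
                   np.sum((1 - w) * x) / np.sum(1 - w)])

print(round(pi_, 2), mu.round(2))   # near the true values 0.3 and (-2, 2)
```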


Journal ArticleDOI
TL;DR: In this paper, the authors discuss five questions concerning maximum likelihood estimation: What kind of theory is maximum likelihood, how it is used in practice, to what extent can this theory and practice be justified from a decision-theoretic viewpoint, what are maximum likelihood's principal virtues and defects, and what improvements have been suggested by decision theory.
Abstract: This paper discusses five questions concerning maximum likelihood estimation: What kind of theory is maximum likelihood? How is maximum likelihood used in practice? To what extent can this theory and practice be justified from a decision-theoretic viewpoint? What are maximum likelihood's principal virtues and defects? What improvements have been suggested by decision theory?

104 citations


Journal ArticleDOI
TL;DR: In this article, a family of distributions on [−1, 1] with a continuous one-dimensional parameterization is given that joins the triangular distribution (when Θ = 0) to the uniform distribution (when Θ = 1), for which the maximum likelihood estimates exist and converge strongly to Θ = 1 as the sample size tends to infinity, whatever the true value of the parameter.
Abstract: An example is given of a family of distributions on [−1, 1] with a continuous one-dimensional parameterization that joins the triangular distribution (when Θ = 0) to the uniform (when Θ = 1), for which the maximum likelihood estimates exist and converge strongly to Θ = 1 as the sample size tends to infinity, whatever be the true value of the parameter. A modification that satisfies Cramér's conditions is also given.

97 citations


Journal ArticleDOI
TL;DR: In this paper, an iteration procedure for finding the maximum likelihood estimate of a compound Poisson process was proposed, including assumptions to ensure convergence to an optimum, and it was shown that Simar's functional fulfills these assumptions.
Abstract: Simar (1976) suggested an iteration procedure for finding the maximum likelihood estimate of a compound Poisson process, but he could not show convergence. Here the more general case of maximizing a concave functional on the set of all probability measures is treated. As a generalization of Simar's procedure, an algorithm is given for solving this problem, including assumptions to ensure convergence to an optimum. Finally, it is shown that Simar's functional fulfills these assumptions.
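The generalized problem, maximizing a concave log-likelihood functional over all probability measures, can be sketched with the classic EM-type fixed-point update on a fixed grid of support points. This is my own illustration in the spirit of, not a transcription of, the algorithm in the paper, using a mixed Poisson sample:

```python
import numpy as np

rng = np.random.default_rng(7)
# Mixed Poisson sample: rates drawn from {1, 5} with weights (0.4, 0.6).
rates = rng.choice([1.0, 5.0], p=[0.4, 0.6], size=5000)
x = rng.poisson(rates)

# Candidate support points for the mixing measure; the log-likelihood is a
# concave functional of the weight vector w on this grid.
grid = np.linspace(0.1, 10.0, 100)
logfact = np.array([np.sum(np.log(np.arange(1, xi + 1))) for xi in x])
pmf = np.exp(x[:, None] * np.log(grid) - grid - logfact[:, None])

w = np.full(len(grid), 1.0 / len(grid))        # uniform starting measure
lls = []
for _ in range(300):
    mix = pmf @ w                              # mixture likelihood at each x_i
    lls.append(np.sum(np.log(mix)))
    w = w * (pmf / mix[:, None]).mean(axis=0)  # EM-type fixed-point update

print(round(lls[0], 1), "->", round(lls[-1], 1))
```

The update preserves total mass and never decreases the log-likelihood, which is the kind of convergence guarantee the paper supplies assumptions for.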

Journal ArticleDOI
P.V. de Souza1, P.J. Thomson
TL;DR: In this paper, it is shown that when Itakura's measure is used to compare two estimated LPC vectors, it is not the log likelihood ratio at all, and the true likelihood ratio for these conditions is derived.
Abstract: Several LPC distance measures and statistical tests have been proposed for use in speech processing, the most popular of which is Itakura's log likelihood ratio statistic, and some simple variants thereof. In this paper it is shown that these statistics share some undesirable properties. It is argued that there are more tractable and more sensitive measures available including other relevant likelihood ratio statistics. It is also shown that when Itakura's measure is used to compare two estimated LPC vectors, it is not the log likelihood ratio at all, and the true likelihood ratio for these conditions is derived.

Journal ArticleDOI
TL;DR: In this paper, the statistical behavior of some common parameter estimation methods is investigated by simulation and it is shown that the methods based on the maximum likelihood principle give the best estimates of parameters in a given model.
Abstract: The statistical behavior of some common parameter estimation methods is investigated by simulation. The methods based on the maximum likelihood principle give the best estimates of parameters in a given model. An analysis of the possibilities for detecting systematic errors by means of the maximum likelihood method showed that this method gives at best the same information as other well-known tests.

Journal ArticleDOI
01 Jan 1982
TL;DR: In this article, it is argued that randomization has no role to play in the design or analysis of an experiment and that conditionality and similarity lead, via the likelihood principle, to the same conclusion.
Abstract: It is argued that randomization has no role to play in the design or analysis of an experiment. If a Bayesian approach is adopted this conclusion is easily demonstrated. Outside that approach two principles, of conditionality and similarity, lead, via the likelihood principle, to the same conclusion. In the case of design, however, it is important to avoid confounding the effect of interest with an unexpected factor and this consideration leads to a principle of haphazardness that is clearly related to, but not identical with, randomization. The role of exchangeability is discussed.

Journal ArticleDOI
TL;DR: In this article, the authors apply known results for testing and estimation problems for patterned means and covariance matrices with explicit linear maximum likelihood estimates to the block compound symmetry problem.
Abstract: Known results for testing and estimation problems for patterned means and covariance matrices with explicit linear maximum likelihood estimates are applied to the block compound symmetry problem. New results given include the constants for G.P.E. Box’s approximate null distribution of the likelihood ratio statistic. These techniques are applied to the analysis of an educational testing problem.

Journal ArticleDOI
TL;DR: In this article, an iterative procedure for solving the likelihood equations is derived which includes the covariance determinant and has many of the computational advantages of the direct representation of Godolphin (1977, 1978).
Abstract: SUMMARY Several authors have described methods for computing the exact maximum likelihood estimates for the parameters of a Gaussian moving average process. In this paper, an iterative procedure for solving the likelihood equations is derived which includes the covariance determinant. The procedure expresses the estimator of each parameter as a linear combination of a suitably large set of sample serial correlations which has many of the computational advantages of the direct representation of Godolphin (1977, 1978). It is also shown that the two procedures lead to virtually the same set of estimates for sample sizes likely to be encountered in practice. These results imply that the additional effort required to compute exact maximum likelihood estimates is hard to justify. Illustrations of all three procedures are given separately for the moving average model of order one: the iterative procedure for maximizing the exact likelihood function is less stable than the approximate procedures in certain cases.
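The flavour of expressing a moving average estimate through sample serial correlations can be seen in a much-simplified sketch (my own; a one-step moment inversion for an MA(1), not Godolphin's representation or the exact maximum likelihood procedure discussed above):

```python
import numpy as np

rng = np.random.default_rng(4)
theta_true, n = 0.5, 200_000
e = rng.normal(size=n + 1)
x = e[1:] + theta_true * e[:-1]                # MA(1): x_t = e_t + theta e_{t-1}

# Lag-1 sample serial correlation; for an MA(1), rho_1 = theta / (1 + theta^2).
xc = x - x.mean()
r1 = np.sum(xc[1:] * xc[:-1]) / np.sum(xc ** 2)

# Invert the relation, taking the invertible root (|theta| < 1).
theta_hat = (1 - np.sqrt(1 - 4 * r1 ** 2)) / (2 * r1)
print(round(theta_hat, 3))                     # close to theta_true
```

Exact maximum likelihood improves on such moment inversions mainly through the covariance determinant term, which is the extra effort the abstract argues is hard to justify in practice.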


Book ChapterDOI
TL;DR: In this article, it was argued that there is a situation in which Birnbaum's basic axiom of mathematical equivalence renders the likelihood principle a tautology, and that this axiom disqualifies the probability basis of Bayesian statistics.

Journal ArticleDOI
TL;DR: In this article, the authors showed that the likelihood function associated with an ARMA model asymptotically has a unique stationary point which is a global maximum corresponding to the true parameter values.
Abstract: Under mild conditions, the likelihood function associated with an ARMA model asymptotically has a unique stationary point which is a global maximum corresponding to the true parameter values. This note gives a new proof of this fact. The proof is considerably simpler than that previously published.

Journal ArticleDOI
TL;DR: Using a formalism very similar to that of Birnbaum (1962), it is demonstrated, in relation to a practice of randomization called balanced sampling, that the process of eliminating nuisance parameters contradicts a very basic principle of statistical inference subsequently defined as the ancillarity principle.
Abstract: Among the many reasons underlying the practice of randomization some of the main ones can be described as averaging out or elimination of the effects of nuisance parameters. It is already well known (Godambe 1966) that averaging over all the possible results of the adopted randomization is directly in conflict with the likelihood principle. The process of elimination of nuisance parameters has possibly deeper intuitive appeal. But this process contradicts a very basic principle of statistical inference subsequently defined as the ancillarity principle. This is demonstrated in relation to a practice of randomization called balanced sampling. We would use a formalism very similar to that of Birnbaum (1962).

Journal ArticleDOI
TL;DR: In this article, a suitable maximum-likelihood based sequential testing procedure for functions of unknown parameters is developed for independent and identically distributed observations from an underlying distribution of known form, where the theoretical Operating Characteristic (OC) and Average Sample Number (ASN) functions are derived for local alternatives by approximating the distribution of the test statistic with linear combinations of the standard Wiener process.
Abstract: In statistical inference it is often desired to test a specified function of unknown parameters from an underlying distribution. Sequential procedures utilize information from the already collected observations and allow for a possible early termination of experimentation with a concurrent savings in time and cost. In the present work a suitable maximum-likelihood based sequential testing procedure for functions of unknown parameters is developed for independent and identically distributed observations from an underlying distribution of known form. The theoretical Operating Characteristic (OC) and Average Sample Number (ASN) functions are derived for local alternatives by approximating the distribution of the test statistic with linear combinations of the standard Wiener process. Simulation studies were utilized to investigate the goodness of the asymptotic results in finite samples.

Journal ArticleDOI
TL;DR: In this article, maximum likelihood parameter estimation in the Pareto distribution for multicensored samples is discussed, and the modality of the associated conditional log-likelihood function is investigated in order to resolve questions concerning the existence and uniqueness of the maximum likelihood estimates.
Abstract: This paper discusses maximum likelihood parameter estimation in the Pareto distribution for multicensored samples. In particular, the modality of the associated conditional log-likelihood function is investigated in order to resolve questions concerning the existence and uniqueness of the maximum likelihood estimates. For the cases with one parameter known, the maximum likelihood estimates of the remaining unknown parameters are shown to exist and to be unique. When both parameters are unknown, the maximum likelihood estimates may or may not exist and be unique. That is, their existence and uniqueness would seem to depend solely upon the information inherent in the sample data. In view of the possible nonexistence and/or non-uniqueness of the maximum likelihood estimates when both parameters are unknown, alternatives to standard iterative numerical methods are explored.
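For the one-parameter-known case the closed form is simple. A sketch for the uncensored, known-scale Pareto sample (my own; the paper's multicensored setting is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(5)
alpha_true, xm, n = 2.5, 1.0, 100_000          # scale xm treated as known
x = xm * (1 - rng.uniform(size=n)) ** (-1 / alpha_true)  # inverse-CDF sample

# With the scale known, the log-likelihood
#   l(a) = n log(a) + a*n*log(xm) - (a + 1) * sum(log x)
# is strictly concave in a, and the unique ML estimate has a closed form:
alpha_hat = n / np.sum(np.log(x / xm))
print(round(alpha_hat, 2))                     # close to alpha_true
```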

Journal ArticleDOI
TL;DR: In this paper, the distribution of the likelihood ratio statistic for testing the general linear hypothesis is shown to depend only on a "standarised error distribution" and a standarised hypothesis which makes it fessible to simulate the distribution.
Abstract: Linear models with non-normal or dependent errors are shown by example to occur frequently in some areas of application. Pivotal functions based on maximum likelihood estimation are given for making inferences about the parameters of the model. The distribution of the likelihood ratio statistic for testing the general linear hypothesis is shown to depend only on a "standardised error distribution" and a standardised hypothesis, which makes it feasible to simulate the distribution of this statistic.

Book ChapterDOI
TL;DR: In this paper, the point of change in a regression model with an autoregressive error process of order one is considered; expressions for the exact and for an approximate joint maximum likelihood function for the point of change are derived, and iterative procedures for the estimation of n1 and ρ are given.
Abstract: Inference about the point of change in a regression model with an autoregressive error process of order one is considered. Expressions for the exact and for an approximate joint maximum likelihood function for the point of change n1 and the autocorrelation coefficient ρ are derived, and iterative procedures for the estimation of n1 and ρ are given. The two likelihood functions are shown to give approximately the same results when the number of observations is large. When the number of observations is odd, a conditional procedure is used to eliminate the dependency between the random variables, and thus to permit the derivation of a likelihood function for n1 alone. The performance of the above procedures and the likelihood function obtained assuming no autocorrelation are compared in a Monte Carlo study, where the samples generated mimic a change in mean level of the volume of discharge of a river.

Journal ArticleDOI
TL;DR: In this paper, the relative bias and relative mean square error related to the discriminant function and maximum likelihood procedures are explored using a sampling experiment following an epidemiologic study, and the relative estimating properties of these two approaches with respect to the specific logistic coefficients are investigated.
Abstract: Using a sampling experiment following an epidemiologic study, the relative bias and relative mean square error related to the discriminant function and maximum likelihood procedures are explored. Interest was in investigating the relative estimating properties of these two approaches with respect to the specific logistic coefficients. In epidemiologic research these terms can be used as approximations to the odds ratio. Under the various conditions explored in the present paper, the maximum likelihood approach appears to be preferable when studying categorical data.

Journal ArticleDOI
TL;DR: In this paper, the problem of testing a general parametric hypothsis following a preliminary test on some other parametric restraints is considered, which is based on appropriate likelihood ratio statistics.
Abstract: The problem of testing a general parametric hypothsis following a preliminary test on some other parametric restraints is considered. These tests are based on appropriate likelihood ratio statistics. The effect of the preliminary test on the size and power of the ultimate test is studied. In this context, some asymptotic distributional properties of some likelihood ratio statistics are studied and incorporated in the study of the main results. 1 . Introduction In a parametric model, the underlying distribution are of assumed forms and the parameter θ belongs to a parameter * Work supported by the National Heart, Lung and Blood Institute, Contract NIH-NHLBI-71-2243-L from the National Institutes of Health. AMS 1980 classifications. 62 Ε 20


Journal ArticleDOI
TL;DR: The small sample powers of two statistics, the likelihood ratio test and a test based on the asymptotic normality of maximum likelihood estimators (z-test), were compared in a simulation experiment as discussed by the authors.
Abstract: The small sample powers of two statistics, the likelihood ratio test and a test based on the asymptotic normality of maximum likelihood estimators (z-test), were compared in a simulation experiment. Two models were specified, one containing the Box-Cox transformation on the dependent variable only, and one containing the Box-Cox transformation on both the dependent and independent variables. The transformation parameter λ was estimated 200 times, for each of six different values of λ in each of three sample sizes, for both models. At each replication 17 hypotheses were tested using both a likelihood ratio test and a z-test. Results indicate that while both likelihood ratio tests and z-tests are unbiased, in small samples the z-test is generally preferable to the likelihood ratio test.
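The mechanics of such a small-sample comparison are easy to sketch. The following is my own example on a simpler model (an exponential rate rather than the Box-Cox parameter of the paper): simulate under the null, apply both the likelihood ratio test and the z-test at the same asymptotic critical value, and compare empirical rejection rates to the nominal level.

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps, lam0 = 10, 20_000, 1.0                # small samples, H0: rate = 1
crit = 3.841                                   # chi-square(1), 5% level

rej_lr = rej_z = 0
for _ in range(reps):
    x = rng.exponential(scale=1 / lam0, size=n)
    lam_hat = 1 / x.mean()                     # MLE of the exponential rate
    # Likelihood ratio statistic 2[l(lam_hat) - l(lam0)], l(a) = n log a - a sum(x).
    lr = 2 * ((n * np.log(lam_hat) - n) - (n * np.log(lam0) - lam0 * x.sum()))
    # z-test from asymptotic normality: (lam_hat - lam0) / (lam_hat / sqrt(n)).
    z2 = ((lam_hat - lam0) / (lam_hat / np.sqrt(n))) ** 2
    rej_lr += lr > crit
    rej_z += z2 > crit

size_lr, size_z = rej_lr / reps, rej_z / reps
print(size_lr, size_z)                         # empirical sizes vs nominal 0.05
```

Power comparisons work the same way, simulating instead under alternatives and counting rejections.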