
Showing papers on "Expectation–maximization algorithm" published in 1973


Book ChapterDOI
01 Jan 1973
TL;DR: In this paper, it is shown that the classical maximum likelihood principle can be considered to be a method of asymptotic realization of an optimum estimate with respect to a very general information theoretic criterion.
Abstract: In this paper it is shown that the classical maximum likelihood principle can be considered to be a method of asymptotic realization of an optimum estimate with respect to a very general information theoretic criterion. This observation shows an extension of the principle to provide answers to many practical problems of statistical model fitting.

15,424 citations
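
This is the paper that introduced what is now called the Akaike information criterion (AIC). As a hedged illustration (the penalty form −2 log L + 2k below is my summary of the criterion, not quoted from the abstract, and the data are made up), a minimal sketch of using such a criterion to compare fitted models:

```python
import numpy as np
from scipy import stats

def aic(log_likelihood, n_params):
    # Akaike's criterion: twice the negative maximized log-likelihood
    # plus twice the number of estimated parameters.
    return -2.0 * log_likelihood + 2.0 * n_params

# Example: compare a normal model against a heavier-tailed t model
# on the same (illustrative) data.
rng = np.random.default_rng(0)
x = rng.standard_t(df=4, size=200)

mu, sigma = np.mean(x), np.std(x)
ll_normal = np.sum(stats.norm.logpdf(x, mu, sigma))

df_t, loc_t, scale_t = stats.t.fit(x)
ll_t = np.sum(stats.t.logpdf(x, df_t, loc_t, scale_t))

print("AIC normal:", aic(ll_normal, 2))
print("AIC t:     ", aic(ll_t, 3))   # the lower AIC is preferred
```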


Journal ArticleDOI
TL;DR: In this paper, a reliability coefficient is proposed to indicate quality of representation of interrelations among attributes in a battery by a maximum likelihood factor analysis, which can indicate that an otherwise acceptable factor model does not exactly represent the interrelations between the attributes for a population.
Abstract: Maximum likelihood factor analysis provides an effective method for estimation of factor matrices and a useful test statistic in the likelihood ratio for rejection of overly simple factor models. A reliability coefficient is proposed to indicate quality of representation of interrelations among attributes in a battery by a maximum likelihood factor analysis. Usually, for a large sample of individuals or objects, the likelihood ratio statistic could indicate that an otherwise acceptable factor model does not exactly represent the interrelations among the attributes for a population. The reliability coefficient could indicate a very close representation in this case and be a better indication as to whether to accept or reject the factor solution.

6,359 citations
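
The proposed reliability coefficient is what is now usually reported as the Tucker-Lewis index. A small sketch, assuming the now-standard form based on the chi-square/degrees-of-freedom ratios of the zero-factor (null) and fitted factor models; the notation and the numbers are mine, not taken from the abstract:

```python
def tucker_lewis(chi2_null, df_null, chi2_model, df_model):
    # Reliability coefficient comparing the fitted factor model with
    # the "no common factors" baseline; values near 1 indicate that the
    # model represents the interrelations among attributes very closely.
    ratio_null = chi2_null / df_null
    ratio_model = chi2_model / df_model
    return (ratio_null - ratio_model) / (ratio_null - 1.0)

# Hypothetical likelihood-ratio statistics from a large sample: the model
# chi-square is "significant", yet the coefficient is close to 1,
# suggesting a very close (if not exact) representation.
print(tucker_lewis(chi2_null=2400.0, df_null=45, chi2_model=52.0, df_model=25))
```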



Journal ArticleDOI
TL;DR: The application of maximum likelihood methods to discrete characters is examined, and it is shown that parsimony methods are not maximum likelihood methods under the assumptions made by Farris, and an algorithm which enables rapid calculation of the likelihood of a phylogeny is described.
Abstract: Felsenstein, J. (Department of Genetics SK-50, University of Washington, Seattle, Washington 98195). 1973. Maximum likelihood and minimum-steps methods for estimating evolutionary trees from data on discrete characters. Syst. Zool. 22:240-249. The general maximum likelihood approach to the statistical estimation of phylogenies is outlined, for data in which there are a number of discrete states for each character. The details of the maximum likelihood method will depend on the details of the probabilistic model of evolution assumed. There are a very large number of possible models of evolution. For a few of the simpler models, the calculation of the likelihood of an evolutionary tree is outlined. For these models, the maximum likelihood tree will be the same as the "most parsimonious" (or minimum-steps) tree if the probability of change during the evolution of the group is assumed a priori to be very small. However, most sets of data require too many assumed state changes per character to be compatible with this assumption. Farris (1973) has argued that maximum likelihood and parsimony methods are identical under a much less restrictive set of assumptions. It is argued that the present methods are preferable to his, and a counterexample to his argument is presented. An algorithm which enables rapid calculation of the likelihood of a phylogeny is described. [Evolutionary trees: maximum likelihood.] The first systematic attempt to apply standard statistical inference procedures to the estimation of evolutionary trees was the work of Edwards and Cavalli-Sforza (1964; see also Cavalli-Sforza and Edwards, 1967). At about the same time, the "parsimony" or minimum evolutionary steps method of Camin and Sokal (1965) gave a great impetus to the development of well-defined procedures for obtaining evolutionary trees. Edwards and Cavalli-Sforza concerned themselves with data from continuous variables such as gene frequencies and quantitative characters. The Camin-Sokal approach, on the other hand, was developed for characters which are recorded as a series of discrete states. Although some taxonomists have declared that the problem of guessing phylogenies should be viewed as a problem of statistical inference (Farris, 1967, 1968; Throckmorton, 1968), until recently there have been no attempts to explore the relationship between the statistical inference and minimum-steps approaches. Recently, Farris (1973) has presented a detailed argument that, under certain reasonable assumptions, the maximum-likelihood method of statistical inference appropriate to discrete-character data is precisely the parsimony method of Camin and Sokal. In this paper, I will examine the application of maximum likelihood methods to discrete characters, and will show that parsimony methods are not maximum likelihood methods under the assumptions made by Farris. They are maximum likelihood methods under considerably more restrictive assumptions about evolution. METHODS OF MAXIMUM LIKELIHOOD Suppose that we want to estimate the evolutionary tree, T, which is to be specified by the topological form of the tree and the times of branching. We are given a set of data, D, and a model of evolution, M, which incorporates not only the evolutionary processes, but also the processes of sampling by which we obtained the data. This model will usually be probabilistic, involving random events such as changes of the environment, occurrence of favorable

641 citations
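
The "algorithm which enables rapid calculation of the likelihood of a phylogeny" is what is now known as Felsenstein's pruning algorithm. A minimal sketch for a fixed four-tip tree under a symmetric two-state model; the tree shape, rate, branch lengths, and character states below are made-up assumptions for illustration:

```python
import numpy as np

def branch_matrix(rate, t):
    # Two-state symmetric model: probability of differing across a branch.
    q = 0.5 * (1.0 - np.exp(-2.0 * rate * t))
    return np.array([[1 - q, q], [q, 1 - q]])

def tip_vector(state):
    v = np.zeros(2)
    v[state] = 1.0
    return v

def node_likelihood(children, rate):
    # children: list of (partial likelihood vector, branch length).
    # Pruning step: for each state at this node, multiply the probabilities
    # of each child subtree conditional on that state.
    out = np.ones(2)
    for vec, t in children:
        out *= branch_matrix(rate, t) @ vec
    return out

# Tree ((A,B),(C,D)); one binary character with states A=0, B=0, C=1, D=1.
rate = 0.3
left = node_likelihood([(tip_vector(0), 0.2), (tip_vector(0), 0.2)], rate)
right = node_likelihood([(tip_vector(1), 0.2), (tip_vector(1), 0.2)], rate)
root = node_likelihood([(left, 0.5), (right, 0.5)], rate)
likelihood = float(0.5 * root[0] + 0.5 * root[1])   # uniform root prior
print(likelihood)
```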



Journal ArticleDOI
TL;DR: In this paper, the authors established the inadmissibility of the traditional maximum likelihood estimators by exhibiting explicit procedures having lower risk than the corresponding maximum likelihood procedure; these results are given in Theorems 1 and 2 of Section 3.
Abstract: Consider a multiple regression problem in which the dependent variable and (3 or more) independent variables have a joint normal distribution. This problem was investigated some time ago by Charles Stein, who proposed reasonable loss functions for various problems involving estimation of the regression coefficients and who obtained various minimax and admissibility results. In this paper we continue this investigation and establish the inadmissibility of the traditional maximum likelihood estimators. Inadmissibility is proved by exhibiting explicit procedures having lower risk than the corresponding maximum likelihood procedure. These results are given in Theorems 1 and 2 of Section 3.

83 citations
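
The explicit procedures in the paper are Stein-type shrinkage rules for the regression setting. As a hedged illustration of the general phenomenon, here is the classic James-Stein estimator for a multivariate normal mean (not the paper's specific regression procedures), with a small risk comparison against the maximum likelihood estimate; the dimension and true mean are assumptions for the simulation:

```python
import numpy as np

def james_stein(x, sigma2=1.0):
    # Shrink the usual (maximum likelihood) estimate of a normal mean
    # vector toward the origin; dominates the MLE when the dimension >= 3.
    p = x.size
    shrink = max(0.0, 1.0 - (p - 2) * sigma2 / np.sum(x ** 2))
    return shrink * x

rng = np.random.default_rng(1)
theta = np.full(10, 0.5)            # true mean vector (illustrative)
mle_risk = js_risk = 0.0
for _ in range(5000):
    x = theta + rng.standard_normal(theta.size)
    mle_risk += np.sum((x - theta) ** 2)
    js_risk += np.sum((james_stein(x) - theta) ** 2)
print("MLE risk:        ", mle_risk / 5000)
print("James-Stein risk:", js_risk / 5000)
```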


Journal ArticleDOI
TL;DR: In this paper, the Monte Carlo method was used to study the maximum likelihood estimates of the parameters of a mixture of two univariate normal distributions when the components are not well separated and the sample size is less than 300.
Abstract: There are few results in the literature on the properties of the maximum likelihood estimates of the parameters of a mixture of two normal distributions when the components are not well separated and the sample size is small. In the present investigation, mixtures of two univariate normal distributions with sample sizes less than 300 are studied by the Monte Carlo method. For the cases considered, empirical evidence is given that the method of maximum likelihood should be used with extreme caution or not at all.

68 citations
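
The maximum likelihood estimates examined in this study are the ones that the expectation-maximization (EM) algorithm, formalized a few years later, computes for a two-component normal mixture. A minimal EM sketch (the data, starting values, and iteration count are illustrative assumptions, not the paper's procedure):

```python
import numpy as np
from scipy.stats import norm

def em_two_normals(x, p, mu1, mu2, s1, s2, iters=200):
    for _ in range(iters):
        # E-step: posterior probability that each point came from component 1.
        d1 = p * norm.pdf(x, mu1, s1)
        d2 = (1 - p) * norm.pdf(x, mu2, s2)
        r = d1 / (d1 + d2)
        # M-step: weighted maximum likelihood updates of the five parameters.
        p = r.mean()
        mu1 = np.sum(r * x) / np.sum(r)
        mu2 = np.sum((1 - r) * x) / np.sum(1 - r)
        s1 = np.sqrt(np.sum(r * (x - mu1) ** 2) / np.sum(r))
        s2 = np.sqrt(np.sum((1 - r) * (x - mu2) ** 2) / np.sum(1 - r))
    return p, mu1, mu2, s1, s2

rng = np.random.default_rng(2)
# Poorly separated components and a small sample, as in the study's setting.
x = np.concatenate([rng.normal(0.0, 1.0, 60), rng.normal(1.0, 1.0, 40)])
print(em_two_normals(x, p=0.5, mu1=-0.5, mu2=0.5, s1=1.0, s2=1.0))
```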


Journal ArticleDOI
TL;DR: In this paper, a new, statistically and computationally efficient maximum likelihood procedure is described for determining the period and damping values of linear multi-degree-of-freedom structures, and estimates of their statistical reliability, from records of their response to random wind or earthquake excitation.

66 citations


Journal ArticleDOI
TL;DR: In this paper, the location, shape, and scale parameters of the Weibull distribution are estimated from Type I progressively censored samples by the method of maximum likelihood, and the approximate asymptotic variance-covariance matrix for the maximum likelihood parameter estimates is given.
Abstract: The location, shape, and scale parameters of the Weibull distribution are estimated from Type I progressively censored samples by the method of maximum likelihood. Nonlinear logarithmic likelihood estimating equations are derived, and the approximate asymptotic variance-covariance matrix for the maximum likelihood parameter estimates is given. The iterative procedure to solve the likelihood equations is a stable and rapidly convergent constrained modified quasilinearization algorithm which is applicable to the general case in which all three parameters are unknown. The numerical results indicate that, in terms of the number of iterations required for convergence and in the accuracy of the solution, the proposed algorithm is a very effective technique for solving systems of logarithmic likelihood equations for which all iterative approximations to the solution vector must satisfy certain intrinsic constraints on the parameters. A FORTRAN IV program implementing the maximum likelihood estimation procedure is included.

54 citations
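
The paper solves the full three-parameter problem with a constrained modified quasilinearization algorithm. As a simpler hedged sketch of the same likelihood structure (not the paper's algorithm), here is maximum likelihood for a two-parameter Weibull under Type I censoring using a general-purpose optimizer; the data and censoring time are made up:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_lik(params, t, censored):
    # Parameters are log(scale), log(shape) so both stay positive.
    b, c = np.exp(params)
    z = (t / b) ** c
    # Observed failures contribute log f(t); censored units contribute log S(t).
    log_f = np.log(c) - np.log(b) + (c - 1) * (np.log(t) - np.log(b)) - z
    log_s = -z
    return -np.sum(np.where(censored, log_s, log_f))

rng = np.random.default_rng(3)
t_true = rng.weibull(1.8, 200) * 10.0   # shape 1.8, scale 10 (assumed truth)
t_cens = 12.0                           # Type I censoring time
t = np.minimum(t_true, t_cens)
censored = t_true > t_cens

res = minimize(neg_log_lik, x0=np.log([np.mean(t), 1.0]), args=(t, censored))
print("scale, shape estimates:", np.exp(res.x))
```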


Journal ArticleDOI
TL;DR: In this article, it was shown that weighted least squares estimators for the coefficients of a multiple regression model are the same as the maximum likelihood estimators when the dependent observations are from a member of the regular exponential class of distributions.
Abstract: In this article it is shown that weighted least squares estimators for the coefficients of a multiple regression model are the same as the maximum likelihood estimators when the dependent observations are from a member of the regular exponential class of distributions.

54 citations
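
This equivalence is what underlies fitting exponential-family regressions by iteratively reweighted least squares: each step is a weighted least squares fit, and the fixed point is the maximum likelihood estimate. A hedged sketch for the Poisson case with a log link (the model and data are illustrative, not from the article):

```python
import numpy as np

def poisson_irls(X, y, iters=25):
    # Iteratively reweighted least squares for Poisson regression.
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        eta = X @ beta
        mu = np.exp(eta)
        W = mu                       # Poisson working weights
        z = eta + (y - mu) / mu      # working response
        WX = X * W[:, None]
        beta = np.linalg.solve(X.T @ WX, X.T @ (W * z))
    return beta

rng = np.random.default_rng(4)
X = np.column_stack([np.ones(500), rng.normal(size=500)])
y = rng.poisson(np.exp(0.3 + 0.7 * X[:, 1]))
print(poisson_irls(X, y))   # should be close to (0.3, 0.7)
```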


Journal ArticleDOI
TL;DR: In this paper, the estimation of the parameters μ₁, σ₁², μ₂, σ₂², and p in the mixture of two normal distributions when independent sample information is available from one of the populations is studied.
Abstract: This paper is primarily concerned with estimation of the parameters μ₁, σ₁², μ₂, σ₂², and p in the mixture of two normal distributions when independent sample information is available from one of the populations. The solution to the maximum likelihood (ML) equations was obtained using Newton's iterative method. Some interesting results for the moment estimates were obtained for the case when independent sample observations are available from one population. Extensive Monte Carlo simulation was employed to obtain the sample variances of the estimates as well as the estimated asymptotic variances. The variances of the estimates are influenced by the separation of the two means with respect to the variances, the mixture proportion (p), and, of course, the size of the sample. When the number of observations is small and the means are not well separated, the sample variance of the estimates can be as much as three times greater than the estimated asymptotic variances.


Journal ArticleDOI
TL;DR: In this paper, a modified Newton method was applied to the computation of full-information maximum likelihood estimates of parameters of a system of linear structural equations to the case of nonlinear structural equations.
Abstract: In this paper, I will generalize the modified Newton method previously applied in Chow (1968) to the computation of full-information maximum likelihood estimates of parameters of a system of linear structural equations to the case of a system of nonlinear structural equations. The success of that method for linear systems has stimulated my present attempt to generalize it for nonlinear systems. The subject of maximum likelihood estimation of nonlinear simultaneous equation systems has been studied by Eisenpress and Greenstadt (1966). There are three main differences between their approach and ours. First, their basic formulation is more general, assuming that all parameters in the system may appear in every equation, whereas we assume as the basic setup that there is a distinct set of parameters belonging to each equation. Second, partly because of the first, we are able to obtain simpler and more explicit expressions for the derivatives of the likelihood function required in the calculations. Third, and also partly because of the first, we can conveniently deal with the important problem of linear restrictions on the parameters in the same equation or in different equations. A fourth feature of this paper, and a feature which has partly motivated it, is the contrast of the linear with the nonlinear case. As it will be shown, there are many similarities in the computations of both. This demonstration can enhance our understanding of the nature of the estimation equations. Two additional features of this paper are the treatments of identities in the system and of residuals which may follow an autoregressive scheme. We will derive in section II the estimation equations for nonlinear systems, under the assumptions that each structural equation contains a distinct set of parameters, that the parameters are not subject to any linear restrictions, and that the (additive) residuals are serially uncorrelated. Section III treats the special case when some equations are linear, and contrasts this case with the nonlinear case. Section IV deals with identities and linear restrictions on the parameters. Section V is concerned with the problem of autoregressive residuals.
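
Chow's method is a modified Newton iteration on the full-information likelihood; the paper's estimation equations for simultaneous systems are not reproduced here. Instead, a generic hedged sketch of a damped ("modified") Newton step for maximizing a log-likelihood, using a simple Cauchy location model as a stand-in problem:

```python
import numpy as np

def newton_mle_cauchy(x, theta, iters=50):
    # Damped Newton: take the Newton step, halving it until the
    # log-likelihood actually increases.
    def loglik(t):
        return -np.sum(np.log1p((x - t) ** 2))
    for _ in range(iters):
        u = x - theta
        grad = np.sum(2 * u / (1 + u ** 2))
        hess = np.sum(2 * (u ** 2 - 1) / (1 + u ** 2) ** 2)
        step = -grad / hess if hess < 0 else grad   # fall back to gradient ascent
        lam = 1.0
        while loglik(theta + lam * step) < loglik(theta) and lam > 1e-8:
            lam *= 0.5
        theta += lam * step
    return theta

rng = np.random.default_rng(5)
sample = rng.standard_cauchy(200) + 3.0
print(newton_mle_cauchy(sample, theta=np.median(sample)))
```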

Journal ArticleDOI
TL;DR: Given a likelihood function generated from an additive model, this paper gives sufficient conditions for log-concavity and strict log-concavity of the likelihood function.


Journal ArticleDOI
TL;DR: A method is developed for producing a maximum-likelihood estimate of an optical object scene that simultaneously is bounded above and below by known radiance levels.
Abstract: A method is developed for producing a maximum-likelihood estimate of an optical object scene that simultaneously is bounded above and below by known radiance levels. The a priori object statistics are assumed unknown and so must be estimated if a maximum-likelihood estimator is to be found. Recourse is made to a counting procedure in the spirit of Jaynes' [2] approach to estimating a "prior probability" distribution.

Journal ArticleDOI
TL;DR: In this paper, it was shown that the distribution of the maximum likelihood estimator of γ is independent of the location and scale parameters, α and β, and some examples of problems in which this result may have useful application are discussed.
Abstract: Consider a three-parameter density function of the form f(x; α, β, γ) = β⁻¹ g((x − α)/β; γ), where g is a known function, −∞ < α < ∞, and β > 0. It is shown that the distribution of the maximum likelihood estimator of γ is independent of the location and scale parameters, α and β. Some examples of problems in which this result may have useful application are discussed.

Journal ArticleDOI
TL;DR: In this paper, the problem of obtaining the maximum likelihood estimates of the parameters from a special type of linear combination of discrete probability functions is discussed, and it is shown that when the sample is completely categorized the estimation problem is no more complicated that that of estimating the parameters of each of the component probability functions separately.
Abstract: In this article the problem of obtaining the maximum likelihood estimates of the parameters from a special type of linear combination of discrete probability functions is discussed. It is shown that when the sample is completely categorized the estimation problem is no more complicated that that of estimating the parameters of each of the component probability functions separately. When the sample is less than completely classified an iterative procedure must be used to obtain solutions to the likelihood equations and it is shown how the problem reduces to that of the full data case. A discussion of the asymptotic properties of the resulting estimators follows.

Journal ArticleDOI
TL;DR: In this article, a simple derivation of the likelihood function for a linear dynamic model with rational pulse transfer function and excited by Gaussian signal is given, based on a simple linear model.
Abstract: Whittle [1] proposed a method of obtaining the likelihood function for a linear dynamic model (with rational pulse transfer function and excited by Gaussian signal). In this note a simple derivation of his result is given.
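
Whittle's likelihood approximates the exact Gaussian likelihood of a stationary process by matching the periodogram to the model spectral density at the Fourier frequencies. A hedged sketch for an AR(1) model, using the standard form of the Whittle approximation rather than reproducing this note's own derivation; the simulated data are illustrative:

```python
import numpy as np
from scipy.optimize import minimize

def whittle_neg_loglik(params, x):
    phi, sigma2 = params
    n = x.size
    freqs = 2 * np.pi * np.arange(1, (n - 1) // 2 + 1) / n
    # Periodogram at the positive Fourier frequencies.
    dft = np.fft.fft(x - x.mean())[1:(n - 1) // 2 + 1]
    I = np.abs(dft) ** 2 / (2 * np.pi * n)
    # AR(1) spectral density: sigma^2 / (2*pi*|1 - phi*exp(-i*w)|^2).
    f = sigma2 / (2 * np.pi * (1 - 2 * phi * np.cos(freqs) + phi ** 2))
    return np.sum(np.log(f) + I / f)

rng = np.random.default_rng(6)
x = np.zeros(1024)
for t in range(1, x.size):               # simulate an AR(1) with phi = 0.6
    x[t] = 0.6 * x[t - 1] + rng.standard_normal()

res = minimize(whittle_neg_loglik, x0=[0.0, 1.0], args=(x,),
               bounds=[(-0.99, 0.99), (1e-6, None)])
print(res.x)   # roughly (0.6, 1.0)
```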


01 Jan 1973
TL;DR: In this paper, the standard maximum likelihood classifier is reformulated so that, in most cases, only a small number of density functions need be computed each time a data point is to be classified.
Abstract: The standard maximum-likelihood classifier is reformulated so that, in most cases, only a small number of density functions need be computed each time a data point is to be classified. The technique relies upon class thresholds which are obtained at the beginning of the classification process and which remain fixed thereafter. The result of the reformulation is that a significant reduction in classification processing time is obtained while retaining complete consistency with the standard maximum-likelihood classifier.
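
A hedged sketch of the general idea as I read the abstract (not the paper's exact scheme): evaluate class densities in a fixed order and stop as soon as one exceeds its precomputed acceptance threshold, falling back to the full maximum-likelihood comparison otherwise. The class parameters and thresholds below are assumptions:

```python
import numpy as np
from scipy.stats import multivariate_normal

class ThresholdedMLClassifier:
    def __init__(self, means, covs, thresholds):
        # thresholds[k]: density level above which class k is accepted
        # without evaluating the remaining class densities.
        self.dists = [multivariate_normal(m, c) for m, c in zip(means, covs)]
        self.thresholds = thresholds

    def classify(self, x):
        densities = []
        for k, dist in enumerate(self.dists):
            d = dist.pdf(x)
            if d >= self.thresholds[k]:
                return k                       # early acceptance: remaining pdfs skipped
            densities.append(d)
        return int(np.argmax(densities))       # fall back to the standard ML rule

means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
covs = [np.eye(2), np.eye(2)]
clf = ThresholdedMLClassifier(means, covs, thresholds=[0.05, 0.05])
print(clf.classify(np.array([0.2, -0.1])), clf.classify(np.array([2.9, 3.2])))
```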

Journal ArticleDOI
TL;DR: A computational procedure is presented for calculating the maximum likelihood estimates of the parameters of the gamma and beta distributions, and a self-contained FORTRAN IV computer code has been included to carry out the necessary operations.
Abstract: A computational procedure is presented for calculating the maximum likelihood estimates of the parameters of the gamma and beta distributions. The procedure employed is both computationally efficient and easy to use. In order to facilitate immediate application, a self-contained FORTRAN IV computer code has been included to carry out the necessary operations.
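
For the gamma case, the maximum likelihood equations reduce to solving log α − ψ(α) = log x̄ − mean(log x) for the shape α, with the scale then given by x̄/α. A hedged sketch of that calculation (a stand-in for the article's FORTRAN IV code, not a transcription of it):

```python
import numpy as np
from scipy.special import digamma, polygamma

def gamma_mle(x, iters=50):
    s = np.log(np.mean(x)) - np.mean(np.log(x))
    alpha = 0.5 / s                        # rough starting value
    for _ in range(iters):
        # Newton step for g(alpha) = log(alpha) - digamma(alpha) - s = 0.
        g = np.log(alpha) - digamma(alpha) - s
        g_prime = 1.0 / alpha - polygamma(1, alpha)
        alpha -= g / g_prime
    return alpha, np.mean(x) / alpha       # shape, scale

rng = np.random.default_rng(7)
x = rng.gamma(shape=2.5, scale=1.7, size=1000)
print(gamma_mle(x))   # close to (2.5, 1.7)
```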

Proceedings ArticleDOI
01 Dec 1973
TL;DR: In this article, the authors consider families of stochastic processes indexed by a finite number of alternative parameter values and show that maximum likelihood estimates converge almost surely to the correct parameter value.
Abstract: We consider families of stochastic processes indexed by a finite number of alternative parameter values. For general classes of stochastic processes it is shown that maximum likelihood estimates converge almost surely to the correct parameter value. This is established by use of a submartingale property of the sequence of maximized likelihood ratios together with a technique first employed by Wald [3] in the case of independent identically distributed random variables.

Proceedings ArticleDOI
01 Dec 1973
TL;DR: In this article, a method for estimating the parameters of a fixed order autoregressive moving average model, based on maximization of an appropriate likelihood function, is presented, and the resulting static optimization is accomplished with a modified Newton numerical algorithm.
Abstract: A method is presented for estimating the parameters of a fixed order autoregressive moving average model, based on maximization of an appropriate likelihood function. The resulting static optimization is accomplished with a modified Newton numerical algorithm. Under suitable initial conditions for the model, the gradient and Hessian matrices at each iteration can be computed analytically. Because of the highly nonlinear character of the likelihood function, starting values for the algorithm are important. Extensions of the method to more complicated models are described. Some numerical examples illustrate the properties of the method.


Journal ArticleDOI
TL;DR: In this article, a method of ranking a set of objects which are presented in pairs in a preference testing experiment is discussed, based on the maximum likelihood estimates of the parameters of a probability model for paired comparisons.
Abstract: This paper discusses a method of ranking a set of objects which are presented in pairs in a preference testing experiment. The ranking to be considered is based on the maximum likelihood estimates of the parameters of a probability model for paired comparisons. It is shown, under a weak assumption, that the maximum likelihood estimates exist and are unique. It is further noted that one can employ an iterative procedure which converges monotonically to the unique estimates. A property of the ranking in the case of a partially balanced paired comparison experiment is presented.
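
The abstract does not name the model, but the standard paired-comparison probability model with unique maximum likelihood estimates and a monotonically converging iteration is the Bradley-Terry model, for which the classic Zermelo-Ford iteration looks like this (a hedged sketch; the win counts are made up):

```python
import numpy as np

def bradley_terry_mle(wins, iters=200):
    # wins[i, j] = number of times object i was preferred to object j.
    n = wins.shape[0]
    w = wins.sum(axis=1)                  # total wins of each object
    m = wins + wins.T                     # comparisons between each pair
    p = np.ones(n)
    for _ in range(iters):
        # Each update increases the likelihood, so the iteration converges
        # monotonically to the unique ML estimates (when they exist).
        denom = np.array([np.sum(m[i] / (p[i] + p)) for i in range(n)])
        p = w / denom
        p /= p.sum()                      # fix the scale of the parameters
    return p

wins = np.array([[0, 7, 8],
                 [3, 0, 6],
                 [2, 4, 0]])
scores = bradley_terry_mle(wins)
print(np.argsort(-scores), scores)        # ranking and estimated "worths"
```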



Journal ArticleDOI
TL;DR: The equivalence of maximum likelihood and least squares estimation for sequentially generated dependent normal observations was shown in this paper, and the relevance of this equivalence to response surface analyses was discussed.
Abstract: The equivalence of maximum likelihood and least squares estimation for sequentially generated dependent normal observations is indicated. The relevance of this equivalence to response surface analyses is discussed.