Showing papers on "Expectation–maximization algorithm" published in 1978


Journal ArticleDOI
Nan M. Laird
TL;DR: In this article, the author shows that the nonparametric maximum likelihood estimate of a mixing distribution is self-consistent and that, under various conditions, it is a step function with a finite number of steps.
Abstract: The nonparametric maximum likelihood estimate of a mixing distribution is shown to be self-consistent, a property which characterizes the nonparametric maximum likelihood estimate of a distribution function in incomplete data problems. Under various conditions the estimate is a step function, with a finite number of steps. Its computation is illustrated with a small example.
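
The self-consistency property described above can be illustrated with a small sketch. It assumes a Poisson kernel and a fixed grid of candidate support points, both chosen here for illustration only (the abstract specifies neither), and the names `npmle_weights` and `log_likelihood` are hypothetical. Each pass replaces the weights by their average posterior probabilities, so a fixed point of the loop is a self-consistent estimate on that grid.

```python
import math

def poisson_pmf(x, lam):
    """Poisson probability mass at count x with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def log_likelihood(xs, grid, w):
    """Log-likelihood of the counts under the discrete mixing weights w."""
    return sum(
        math.log(sum(wj * poisson_pmf(x, lam) for wj, lam in zip(w, grid)))
        for x in xs
    )

def npmle_weights(xs, grid, iters=200):
    """EM updates for mixing weights on a fixed grid (illustrative assumption)."""
    dens = [[poisson_pmf(x, lam) for lam in grid] for x in xs]
    w = [1.0 / len(grid)] * len(grid)  # start from uniform weights
    for _ in range(iters):
        new = [0.0] * len(grid)
        for row in dens:
            mix = sum(wj * fj for wj, fj in zip(w, row))
            for j, fj in enumerate(row):
                # posterior probability that this observation came from grid[j]
                new[j] += w[j] * fj / mix
        w = [v / len(xs) for v in new]  # average posterior = updated weight
    return w

counts = [0, 1, 0, 2, 7, 8, 9, 1, 0, 10]
grid = [0.5, 1.0, 2.0, 4.0, 8.0, 10.0]
weights = npmle_weights(counts, grid)
```

With data like these, most of the mass would be expected to concentrate on a few grid points, which is in the spirit of the step-function result, although a fixed grid only approximates the unrestricted estimate.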

763 citations



Journal ArticleDOI
TL;DR: In this paper, the covariance matrix of the disturbances depends upon a finite number of unknown parameters θ1, ..., θm, and it is proved that β is unbiased if its mean exists.

159 citations



Journal ArticleDOI
TL;DR: In this paper, the authors examined maximum likelihood techniques as applied to classification and clustering problems, and showed that the classification maximum likelihood technique, in which individual observations are assigned on an "all-or-nothing" basis to one of several classes as part of the maximization process, gives results which are asymptotically biased.
Abstract: This paper examines maximum likelihood techniques as applied to classification and clustering problems, and shows that the classification maximum likelihood technique, in which individual observations are assigned on an "all-or-nothing" basis to one of several classes as part of the maximization process, gives results which are asymptotically biased. This extends Marriott's (1975) work for normal component distributions. Numerical examples are presented for normal component distributions and for a problem in genetics. The results indicate that biases can be severe, though determining in simple form when the biases will and will not be severe seems difficult.
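
A rough feel for the size of this bias can be had from a simplified case that is not taken from the paper: an equal mixture of N(-mu, 1) and N(mu, 1) with known unit variances, where all-or-nothing assignment is assumed to settle on a cut at zero. The estimated upper mean then converges to the mean of the mixture truncated to positive values rather than to mu. The function name `hard_assignment_limit` is hypothetical.

```python
import math

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def partial_mean_above_zero(m):
    """E[X * 1{X > 0}] for X ~ N(m, 1)."""
    return m * normal_cdf(m) + normal_pdf(m)

def hard_assignment_limit(mu):
    """Large-sample limit of the upper class mean under a cut at zero.

    Assumes the mixture 0.5*N(-mu, 1) + 0.5*N(mu, 1); by symmetry
    P(X > 0) = 0.5, which is the divisor below.
    """
    partial = 0.5 * (partial_mean_above_zero(mu) + partial_mean_above_zero(-mu))
    return partial / 0.5

limit_overlapping = hard_assignment_limit(1.0)  # components overlap heavily
limit_separated = hard_assignment_limit(3.0)    # components well separated
```

In this sketch the limit exceeds mu when the components overlap and approaches mu as they separate, which is consistent with the abstract's remark that the bias can be severe in some cases and negligible in others.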

108 citations


Journal ArticleDOI
TL;DR: The problem of estimating density functions using data from different distributions, and a mixture of them, is considered and maximum likelihood and Bayesian parametric techniques are summarized and various approaches using distribution‐free kernel methods are expounded.
Abstract: The problem of estimating density functions using data from different distributions, and a mixture of them, is considered. Maximum likelihood and Bayesian parametric techniques are summarized and various approaches using distribution‐free kernel methods are expounded. A comparative study is made using the halibut data of Hosmer (1973) and the problem of incomplete data is briefly discussed.

57 citations


Journal ArticleDOI
TL;DR: In this paper, expressions are given for the joint maximum likelihood estimates of the location and scale parameters of a Cauchy distribution based on samples of size 3 and 4.
Abstract: Expressions are given for the joint maximum likelihood estimates of the location and scale parameters of a Cauchy distribution based on samples of size 3 and 4.
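
The closed-form expressions themselves are not reproduced in this listing. As a numerical cross-check, the joint estimates can be approximated with the standard EM iteration for a Student t distribution with one degree of freedom, which is the Cauchy distribution. This is a sketch of that generic iteration, not the paper's method; the function name `cauchy_mle` and the starting values are assumptions.

```python
import statistics

def cauchy_mle(xs, iters=500):
    """Approximate joint MLE of Cauchy location and scale by EM.

    Uses the t-distribution EM with one degree of freedom:
    weights w_i = 2 / (1 + z_i^2), then weighted updates of mu and sigma^2.
    """
    n = len(xs)
    mu = statistics.median(xs)                      # starting location (assumption)
    sigma2 = max(statistics.pvariance(xs), 1e-12)   # crude positive starting scale
    for _ in range(iters):
        w = [2.0 / (1.0 + (x - mu) ** 2 / sigma2) for x in xs]
        mu = sum(wi * x for wi, x in zip(w, xs)) / sum(w)
        sigma2 = sum(wi * (x - mu) ** 2 for wi, x in zip(w, xs)) / n
    return mu, sigma2 ** 0.5

location, scale = cauchy_mle([-1.0, 0.0, 1.0])
```

For the symmetric sample of size 3 used here, the location estimate is 0 by symmetry, and solving the scale likelihood equation by hand gives a scale of 1/sqrt(3), which the iteration should approach.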

46 citations


Journal ArticleDOI
TL;DR: In this paper, explicit expressions for the maximum likelihood estimators of the parameters of the two-parameter exponential distribution, when a doubly censored sample is available, are given.
Abstract: In this note explicit expressions are given for the maximum likelihood estimators of the parameters of the two-parameter exponential distribution, when a doubly censored sample is available.

35 citations


Journal ArticleDOI
TL;DR: In this paper, the quartic exponential distribution is examined in detail, and the problem of obtaining maximum likelihood point estimates of the population parameters is shown to reduce to that of identifying the α as functions of the population moments μ′_r, r = 1, 2, 3, 4.
Abstract: The quartic exponential (QE) distribution defined by the probability density function of the type is examined in detail. The problem of obtaining maximum likelihood point estimates of the population parameters reduces to that of identifying the α as functions of the population moments μ′_r, r = 1, 2, 3, 4. The invalidity of methods proposed by previous authors to deal with the nonlinear relationships involved is explained, and a new algorithm is developed which overcomes these objections. The new algorithm is applied to practical data, and the resulting distributions fitted to observed frequencies are shown to compare favourably with those obtained by previous methods.

29 citations


01 Jan 1978
TL;DR: In this paper, the authors present a unified tutorial on maximum likelihood identification of state space models for linear dynamic systems, giving a compact user-oriented presentation of results scattered in the literature for computing the likelihood function, maximizing it, evaluating the Fisher information matrix and finding the asymptotic properties of ML parameter estimates.
Abstract: Maximum likelihood (ML) identification of state space models for linear dynamic systems is presented in a unified tutorial form. First linear filtering theory and classical maximum likelihood theory are reviewed. Then ML identification of linear state space models is discussed. A compact user-oriented presentation of results scattered in the literature is given for computing the likelihood function, maximizing it, evaluating the Fisher information matrix and finding the asymptotic properties of ML parameter estimates. The practically important case where a system is described by a simpler model is also briefly discussed.

28 citations


Journal ArticleDOI
TL;DR: In this paper, the parameters of Lazarsfeld's latent class analysis are transformed so that they respect their restriction to the interval [0, 1], the estimation of the transformed parameters according to the maximum likelihood principle is outlined, and a numerical example is given for which the basis solution and the usual maximum likelihood method failed.
Abstract: As the literature indicates, no method is presently available which takes explicitly into account that the parameters of Lazarsfeld's latent class analysis are defined as probabilities and are therefore restricted to the interval [0, 1]. In the present paper an appropriate transform on the parameters is performed in order to satisfy this constraint, and the estimation of the transformed parameters according to the maximum likelihood principle is outlined. In the sequel, a numerical example is given for which the basis solution and the usual maximum likelihood method failed. The different results are compared and the advantages of the proposed method discussed.

Journal ArticleDOI
TL;DR: In this paper, the problem of finding a suitable (asymptotic) efficiency criterion for inference concerning parameters of stochastic processes is addressed, and a contiguity calculation is used to show that a previously suggested criterion is inadequate and itself provides a partial solution to the problem.

Journal ArticleDOI
TL;DR: MFIT was designed, in part, to motivate the maximum likelihood method by making it as accessible to the scientifically oriented user as nonlinear regression or least squares analysis.

Journal ArticleDOI
TL;DR: In this paper, a recursive algorithm is presented for computing approximate maximum likelihood estimates of autoregressive models for increasing model orders; for high-order model fitting it is more economical than standard solutions such as Gaussian elimination, and the Levinson–Durbin recursions for the Yule-Walker estimates can be regarded as a special case of it.
Abstract: Estimates for autoregressive models are obtained by approximating the maximum likelihood estimates in two ways. A recursive algorithm for computing the resulting estimates for increasing model orders is presented. To calculate a pth order estimate, O(p^2) arithmetic operations are required; hence for high order model fitting, the method is more economical than standard solutions using Gaussian elimination, for example. The Levinson–Durbin recursions for the Yule-Walker estimates can be regarded as a special case of the algorithm presented here.
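
For orientation, the special case named in the abstract, the Levinson–Durbin recursion for the Yule-Walker equations, can be sketched as follows. This is the textbook recursion rather than the paper's more general algorithm, and the function name `levinson_durbin` is hypothetical. Each order reuses the previous order's coefficients, which is where the O(p^2) operation count comes from.

```python
def levinson_durbin(r, order):
    """Solve the Yule-Walker equations recursively.

    r holds autocovariances r[0], ..., r[order]. Returns (coeffs, err) for the
    model x[t] = sum_k coeffs[k-1] * x[t-k] + e[t], with err the variance of e.
    """
    coeffs = []   # AR coefficients at the current order
    err = r[0]    # prediction error variance at order 0
    for m in range(1, order + 1):
        # reflection coefficient for order m
        acc = r[m] - sum(coeffs[k] * r[m - 1 - k] for k in range(m - 1))
        kappa = acc / err
        # update lower-order coefficients, then append the new one
        coeffs = [coeffs[k] - kappa * coeffs[m - 2 - k] for k in range(m - 1)]
        coeffs.append(kappa)
        err *= 1.0 - kappa * kappa
    return coeffs, err

# Autocovariances of an AR(1) process with coefficient 0.5, scaled so r[0] = 1.
ar_coeffs, noise_var = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

Fitting order 2 to AR(1) autocovariances should return a first coefficient of 0.5 and a second coefficient of 0, since the extra lag carries no additional information.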

Journal ArticleDOI
TL;DR: In this paper, the method of marginal likelihood is presented for the elimination of nuisance parameters from a model which represents a wide class of single equation distributed lag models; the technique relies on the ability to divide the information on the nuisance parameters given by the data into sufficient and ancillary statistics for the nuisance parameters.
Abstract: The method of marginal likelihood is presented for the elimination of nuisance parameters from a model which represents a wide class of single equation distributed lag models. The technique relies on the ability to divide the information on the nuisance parameters given by the data into sufficient and ancillary statistics for the nuisance parameters. The marginal likelihood for the parameters of interest is obtained through the marginal distribution of the ancillary statistics. The marginal likelihood that is obtained is compared to the concentrated likelihood function obtained by substituting the solution of the likelihood equations of the nuisance parameters for their parameter values in the full likelihood. Three special cases of the general model are considered as examples: models with structural disturbances, lagged variables models, and a polynomial distributed lag model.

Journal ArticleDOI
TL;DR: In this article, a relatively simple estimation technique based on the sufficient statistics is given and shown to be best asymptotically normal for certain multivariate normal models with known variances.
Abstract: It is well-known that in certain multivariate normal models with known variances, maximum likelihood estimation of correlations is arithmetically cumbersome. For certain of these models, a relatively simple estimation technique based on the sufficient statistics is given and shown to be best asymptotically normal. The procedure uses the fact that many of the likelihood equations for these models are essentially cubic equations. A consistent, explicit solution for the associated cubic likelihood equation is found and shown to have the appropriate efficiency and normality properties. Considered in detail are the intraclass, autoregressive, and moving average models.

Proceedings ArticleDOI
Fang-kuo Sun, T. Lee
01 Jan 1978
TL;DR: Maximum likelihood estimates of the mean and the covariance of a normal random variable, based on a set of independently, but nonidentically distributed observations, are discussed and an efficient algorithm for computing MLEs is introduced.
Abstract: In this paper, maximum likelihood estimates of the mean and the covariance of a normal random variable, based on a set of independently, but nonidentically distributed observations, are discussed. An efficient algorithm for computing MLEs is introduced. The asymptotic properties such as strong consistency and asymptotic normality are examined.

Journal ArticleDOI
TL;DR: It is shown that several maximum likelihood survivor paths cannot occur simultaneously, and hence that searching some of the nodes of the state trellis diagram can be avoided.
Abstract: A further analysis of maximum likelihood sequence estimation algorithms for Gaussian channels with finite intersymbol interference is presented. It is shown that several maximum likelihood survivor paths cannot occur simultaneously, and hence that searching some of the nodes of the state trellis diagram can be avoided. An efficient algorithm is given that updates the metrics while avoiding redundant parts of the search.

01 May 1978
TL;DR: In this article, the Two-Parameter Beta Method for estimating the operating characteristics of a test item is shown to be as efficient as the Normal Approximation Method on simulated data; it approximates the conditional distribution of the latent trait, given its maximum likelihood estimate, by a Beta distribution fitted by the method of moments.
Abstract: The Two-Parameter Beta Method, introduced in the previous study as a method of estimating the operating characteristics of a test item, has proved to be as efficient as the Normal Approximation Method for a set of simulated data of 500 hypothetical examinees having a uniform latent trait distribution between -2.475 and 2.475. Both methods are characterized by: (1) the use of a relatively small number of subjects, such as 500, in the whole procedure of estimation; (2) the absence of any assumed prior mathematical model; and (3) the use of the estimated joint distribution of the latent trait and its maximum likelihood estimate. In the Two-Parameter Beta Method, the method of moments is adopted to approximate the probability density function of the maximum likelihood estimate, using polynomials of degree 3 and 4. The first two conditional moments of the latent trait, given the maximum likelihood estimate, are derived from theory and computed for the data for each value of the maximum likelihood estimate. The conditional distribution of the latent trait, given the maximum likelihood estimate, is approximated by a Beta distribution using the method of moments, with two a priori set parameters and two estimated parameters from the conditional moments.


Journal ArticleDOI
TL;DR: In this paper, a Monte Carlo study showed that one of these estimators is less absolutely biased and has smaller variance than the exact maximum likelihood estimator, and this estimator exhibits greater power in tests of autocorrelation when used in place of the standard Durbin-Watson statistic.
Abstract: Two approximations to the maximum likelihood estimator of the autocorrelation coefficient in first-order Markov disturbances in a linear model have been previously suggested by Durbin & Watson. A Monte Carlo study shows that one of these estimators is less absolutely biased and has smaller variance than the exact maximum likelihood estimator. Further, this estimator exhibits greater power in tests of autocorrelation when used in place of the standard Durbin-Watson statistic. The second estimator, while having very small variance, is severely biased and of little use in power considerations.

20 Dec 1978
TL;DR: In this article, a Bivariate P.D.F. approach for estimating the operating characteristics of item response categories is introduced, and used in conjunction with the Normal Approach Method, assuming a normal distribution for the conditional distribution of ability, given its maximum likelihood estimate.
Abstract: The Bivariate P.D.F. Approach for estimating the operating characteristics of item response categories is introduced, and used in conjunction with the Normal Approach Method, assuming a normal distribution for the conditional distribution of ability, given its maximum likelihood estimate. In this approach, the total set of the maximum likelihood estimates is divided into the item score groups, and for each score group the density function of the maximum likelihood estimate is approximated by a polynomial. It is tried on the same hypothetical data, i.e., the maximum likelihood estimates of ability of the five hundred hypothetical subjects and their responses to the ten binary items following the normal ogive model. Three different degrees of polynomials are used in approximating the density functions of the subsets of the maximum likelihood estimates, and they are called Degree 3, 4 and 5 Cases. The mean square errors are used for evaluating the resultant estimated item characteristic functions, and the two item parameters in the normal ogive model are also estimated.

Book ChapterDOI
TL;DR: A historical review of the use of the lognormal distribution as a model for species abundance is given in this article, which argues that the maximum likelihood method, not nonlinear regression, is appropriate for estimating the parameters of this species abundance model.
Abstract: The lognormal distribution has been used as a model for species abundance. A historical review of the use of this model is given. The method of maximum likelihood described by Cohen (1959, 1961) and a nonlinear regression method developed by Gauch and Chase (1974) have been used to estimate the parameters of the model. These two methods are described and appropriate applications of each method are discussed. The maximum likelihood method, not the nonlinear regression, is appropriate for estimating the parameters of this species abundance model. Results of a Monte Carlo simulation which demonstrates the advantage of maximum likelihood are presented.

Posted Content
TL;DR: In this paper, the authors suggest that maximum likelihood should be preferred to other estimation schemes not only because of its optimal large-sample statistical properties, but also because of its ability to incorporate certain a priori restrictions from economic theory.
Abstract: Because of the presence of Jacobian terms, determinants which arise as a result of a transformation of variables, many common likelihood functions have singularities. This fact has several implications for maximum likelihood estimation. The most interesting of these is that singularities often correspond with economically meaningful restrictions, and can be used to impose the latter. Several applications of this principle are presented. They suggest that maximum likelihood should be preferred to other estimation schemes not only because of its optimal large-sample statistical properties, but also because of its ability to incorporate certain a priori restrictions from economic theory.


Proceedings ArticleDOI
01 Jan 1978
TL;DR: This paper surveys direct search optimization methods and compares them to the algorithms requiring a direct computation of the gradients to show the class of problems for which such methods are likely to be useful.
Abstract: Though the theory of the maximum likelihood method for parameter estimation in dynamic systems is well developed, its application to complex systems has been limited by the unavailability of fast and reliable computational algorithms to maximize the likelihood function. Gradient-based algorithms have been mostly used for this purpose until now. A summary of such techniques was given by Gupta and Mehra (1974). Recent experience with algorithms which do not explicitly compute the gradients of the innovations or the likelihood function indicates that such algorithms offer potential benefits over gradient-based algorithms. This paper surveys direct search optimization methods and compares them to the algorithms requiring a direct computation of the gradients. An example problem is presented to show the class of problems for which such methods are likely to be useful.