
Showing papers on "Likelihood principle published in 2006"


Journal ArticleDOI
TL;DR: Nested sampling as mentioned in this paper estimates directly how the likelihood function relates to prior mass, and the evidence (alternatively the marginal likelihood, marginal density of the data, or the prior predictive) is immediately obtained by summation.
Abstract: Nested sampling estimates directly how the likelihood function relates to prior mass. The evidence (alternatively the marginal likelihood, marginal density of the data, or the prior predictive) is immediately obtained by summation. It is the prime result of the computation, and is accompanied by an estimate of numerical uncertainty. Samples from the posterior distribution are an optional by-product, obtainable for any temperature. The method relies on sampling within a hard constraint on likelihood value, as opposed to the softened likelihood of annealing methods. Progress depends only on the shape of the "nested" contours of likelihood, and not on the likelihood values. This invariance (over monotonic relabelling) allows the method to deal with a class of phase-change problems which effectively defeat thermal annealing.
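The iteration described in the abstract is compact enough to sketch. Below is a minimal, illustrative Python implementation, assuming a user-supplied log-likelihood and a prior that can be sampled directly; the replacement step uses naive rejection sampling, whereas practical implementations explore inside the hard likelihood constraint by MCMC or slice sampling, and the function name and defaults are invented for the example.

```python
import numpy as np

def nested_sampling(loglike, sample_prior, n_live=100, n_iter=2000, rng=None):
    """Minimal nested-sampling sketch: estimate the log-evidence log Z.

    loglike(theta)    -> float log-likelihood
    sample_prior(rng) -> one draw from the prior
    The prior mass enclosed at iteration i is taken to shrink
    geometrically, X_i = exp(-i / n_live).
    """
    rng = rng or np.random.default_rng()
    live = [sample_prior(rng) for _ in range(n_live)]
    live_logL = np.array([loglike(t) for t in live])

    logZ = -np.inf          # running log-evidence
    logX_prev = 0.0         # log prior mass remaining (starts at 1)
    for i in range(1, n_iter + 1):
        worst = np.argmin(live_logL)
        logL_worst = live_logL[worst]
        logX = -i / n_live                               # deterministic shrinkage
        log_dX = np.log(np.exp(logX_prev) - np.exp(logX))
        logZ = np.logaddexp(logZ, logL_worst + log_dX)   # accumulate evidence
        logX_prev = logX
        # Replace the worst point by a prior draw inside the hard
        # likelihood constraint L > L_worst (rejection sampling here;
        # real implementations use constrained MCMC or slice sampling).
        while True:
            cand = sample_prior(rng)
            if loglike(cand) > logL_worst:
                break
        live[worst] = cand
        live_logL[worst] = loglike(cand)
    # (the contribution of the remaining live points is omitted in this sketch)
    return logZ
```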

1,118 citations


Book
13 Jul 2006
TL;DR: In this book, the authors develop an extended likelihood framework based on the h-likelihood for joint inference about fixed and random parameters, spanning classical likelihood theory, generalized linear models, quasi-likelihood, hierarchical GLMs, structured dispersion, smoothing, random-effect survival models, and double HGLMs.
Abstract (table of contents):
LIST OF NOTATIONS; PREFACE; INTRODUCTION
CLASSICAL LIKELIHOOD THEORY: Definition; Quantities derived from the likelihood; Profile likelihood; Distribution of the likelihood-ratio statistic; Distribution of the MLE and the Wald statistic; Model selection; Marginal and conditional likelihoods; Higher-order approximations; Adjusted profile likelihood; Bayesian and likelihood methods; Jacobian in likelihood methods
GENERALIZED LINEAR MODELS: Linear models; Generalized linear models; Model checking; Examples
QUASI-LIKELIHOOD: Examples; Iterative weighted least squares; Asymptotic inference; Dispersion models; Extended quasi-likelihood; Joint GLM of mean and dispersion; Joint GLMs for quality improvement
EXTENDED LIKELIHOOD INFERENCES: Two kinds of likelihood; Inference about the fixed parameters; Inference about the random parameters; Optimality in random-parameter estimation; Canonical scale, h-likelihood and joint inference; Statistical prediction; Regression as an extended model; Missing or incomplete-data problems; Is marginal likelihood enough for inference about fixed parameters?; Summary: likelihoods in extended framework
NORMAL LINEAR MIXED MODELS: Developments of normal mixed linear models; Likelihood estimation of fixed parameters; Classical estimation of random effects; H-likelihood approach; Example; Invariance and likelihood inference
HIERARCHICAL GLMS: HGLMs; H-likelihood; Inferential procedures using h-likelihood; Penalized quasi-likelihood; Deviances in HGLMs; Examples; Choice of random-effect scale
HGLMS WITH STRUCTURED DISPERSION: HGLMs with structured dispersion; Quasi-HGLMs; Examples
CORRELATED RANDOM EFFECTS FOR HGLMS: HGLMs with correlated random effects; Random effects described by fixed L matrices; Random effects described by a covariance matrix; Random effects described by a precision matrix; Fitting and model-checking; Examples; Twin and family data; Ascertainment problem
SMOOTHING: Spline models; Mixed model framework; Automatic smoothing; Non-Gaussian smoothing
RANDOM-EFFECT MODELS FOR SURVIVAL DATA: Proportional-hazard model; Frailty models and the associated h-likelihood; *Mixed linear models with censoring; Extensions; Proofs
DOUBLE HGLMS: DHGLMs; Models for finance data; H-likelihood procedure for fitting DHGLMs; Random effects in the ? component; Examples
FURTHER TOPICS: Model for multivariate responses; Joint model for continuous and binary data; Joint model for repeated measures and survival time; Missing data in longitudinal studies; Denoising signals by imputation
REFERENCES; DATA INDEX; AUTHOR INDEX; SUBJECT INDEX
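The central object of the extended framework is the h-likelihood. As a brief reminder, in standard notation (not necessarily the book's exact symbols), for data y, unobserved random effects v, fixed effects beta and dispersion parameters theta:

```latex
% h-likelihood: joint log-density of the data and the random effects
h(\beta,\theta,v) \;=\; \log f(y \mid v;\,\beta,\theta) \;+\; \log f(v;\,\theta)
% The marginal likelihood for the fixed parameters is recovered by
% integrating out v:
%   L(\beta,\theta) = \int \exp\{h(\beta,\theta,v)\}\, dv,
% typically approximated by Laplace / adjusted-profile arguments.
```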

495 citations


Journal ArticleDOI
TL;DR: In this paper, the authors describe a discrete-time, stochastic population model with density dependence, environmental-type process noise, and lognormal observation or sampling error.
Abstract: We describe a discrete-time, stochastic population model with density dependence, environmental-type process noise, and lognormal observation or sampling error. The model, a stochastic version of the Gompertz model, can be transformed into a linear Gaussian state-space model (Kalman filter) for convenient fitting to time series data. The model has a multivariate normal likelihood function and is simple enough for a variety of uses ranging from theoretical study of parameter estimation issues to routine data analyses in population monitoring. A special case of the model is the discrete-time, stochastic exponential growth model (density independence) with environmental-type process error and lognormal observation error. We describe two methods for estimating parameters in the Gompertz state-space model, and we compare the statistical qualities of the methods with computer simulations. The methods are maximum likelihood based on observations and restricted maximum likelihood based on first differences. Both offer adequate statistical properties. Because the likelihood function is identical to a repeated-measures analysis of variance model with a random time effect, parameter estimates can be calculated using PROC MIXED of SAS. We use the model to analyze a data set from the Breeding Bird Survey. The fitted model suggests that over 70% of the noise in the population's growth rate is due to observation error. The model describes the autocovariance properties of the data especially well. While observation error and process noise variance parameters can both be estimated from one time series, multimodal likelihood functions can and do occur. For data arising from the model, the statistically consistent parameter estimates do not necessarily correspond to the global maximum in the likelihood function. Maximization, simulation, and bootstrapping programs must accommodate the phenomenon of multimodal likelihood functions to produce statistically valid results.
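As a sketch of how the transformed model can be fitted, the Python fragment below evaluates the log-likelihood of the log-scale Gompertz state-space model by the Kalman filter prediction-error decomposition and hands it to a numerical optimizer. The parameterization, starting values, and function name are illustrative, not the authors' code (the paper notes the likelihood can also be handled through PROC MIXED of SAS).

```python
import numpy as np
from scipy.optimize import minimize

def gompertz_ss_loglik(params, y):
    """Kalman-filter log-likelihood for the log-scale Gompertz state-space
    model  x_t = a + c*x_{t-1} + e_t,  y_t = x_t + f_t,
    with e_t ~ N(0, q) process noise and f_t ~ N(0, r) observation error.
    params = (a, c, log q, log r); y is the log-abundance series.
    Requires |c| < 1 (density dependence) for the stationary start.
    """
    a, c, logq, logr = params
    q, r = np.exp(logq), np.exp(logr)
    # initialise at the stationary distribution of the state process
    m = a / (1.0 - c)
    p = q / (1.0 - c ** 2)
    ll = 0.0
    for yt in y:
        # prediction error decomposition
        v = yt - m
        s = p + r
        ll += -0.5 * (np.log(2.0 * np.pi * s) + v * v / s)
        # Kalman update, then propagate one step ahead
        k = p / s
        m_f, p_f = m + k * v, (1.0 - k) * p
        m, p = a + c * m_f, c * c * p_f + q
    return ll

# Maximise over (a, c, log q, log r); y_log is the log of the observed counts:
# res = minimize(lambda th: -gompertz_ss_loglik(th, y_log),
#                x0=np.array([0.5, 0.8, np.log(0.1), np.log(0.1)]))
```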

409 citations


Journal ArticleDOI
TL;DR: Insight is provided into the robustness of the MLEs against departures from the normal random-effects assumption, and bootstrap procedures are suggested to overcome the difficulty of obtaining reliable estimates of their standard errors.
Abstract: The maximum likelihood approach to jointly modelling the survival time and its longitudinal covariates has been successful in modelling both processes in longitudinal studies. Random effects in the longitudinal process are often used to model the survival times through a proportional hazards model, and this invokes an EM algorithm to search for the maximum likelihood estimates (MLEs). Several intriguing issues are examined here, including the robustness of the MLEs against departures from the normal random-effects assumption, and the difficulty of obtaining reliable estimates of the standard errors of the MLEs from the profile likelihood approach. We provide insight into the robustness property and suggest using bootstrap procedures to overcome the difficulty with standard errors. Numerical studies and a data analysis illustrate our points.

211 citations


Journal ArticleDOI
TL;DR: In this article, the authors assess the strengths and weaknesses of the frequentist and Bayes systems of inference and suggest that calibrated Bayes-a compromise based on the works of Box, Rubin, and others-captures the strengths of both approaches and provides a roadmap for future advances.
Abstract: The lack of an agreed inferential basis for statistics makes life "interesting" for academic statisticians, but at the price of negative implications for the status of statistics in industry, science, and government. The practice of our discipline will mature only when we can come to a basic agreement about how to apply statistics to real problems. Simple and more general illustrations are given of the negative consequences of the existing schism between frequentists and Bayesians. An assessment of strengths and weaknesses of the frequentist and Bayes systems of inference suggests that calibrated Bayes-a compromise based on the works of Box, Rubin, and others-captures the strengths of both approaches and provides a roadmap for future advances. The approach asserts that inferences under a particular model should be Bayesian, but model assessment can and should involve frequentist ideas. This article also discusses some implications of this proposed compromise for the teaching and practice of statistics.

197 citations


Journal ArticleDOI
TL;DR: Using the relationship between least squares and maximum likelihood estimators for balanced designs, it is shown why the asymptotic distribution of the likelihood ratio test for variance components does not follow a χ² distribution with degrees of freedom equal to the number of parameters tested when the null hypothesis is true.
Abstract: When using maximum likelihood methods to estimate genetic and environmental components of (co)variance, it is common to test hypotheses using likelihood ratio tests, since such tests have desirable asymptotic properties. In particular, the standard likelihood ratio test statistic is assumed asymptotically to follow a χ² distribution with degrees of freedom equal to the number of parameters tested. Using the relationship between least squares and maximum likelihood estimators for balanced designs, it is shown why the asymptotic distribution of the likelihood ratio test for variance components does not follow a χ² distribution with degrees of freedom equal to the number of parameters tested when the null hypothesis is true. Instead, the distribution of the likelihood ratio test is a mixture of χ² distributions with different degrees of freedom. Implications for testing variance components in twin designs and for quantitative trait loci mapping are discussed. The appropriate distribution of the likelihood ratio test statistic should be used in hypothesis testing and model selection.
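For the common case of testing a single variance component against zero, the corrected reference distribution is the familiar 50:50 mixture of χ²₀ (a point mass at zero) and χ²₁. The snippet below, with an invented function name, shows how the p-value changes relative to the naive χ²₁ calibration.

```python
from scipy.stats import chi2

def boundary_lrt_pvalue(lrt_stat):
    """P-value for testing one variance component equal to zero.

    Under H0 the parameter lies on the boundary of the parameter space,
    so the LRT statistic follows a 50:50 mixture of a point mass at 0
    (chi-square with 0 df) and a chi-square with 1 df, not chi2(1).
    """
    if lrt_stat <= 0:
        return 1.0
    return 0.5 * chi2.sf(lrt_stat, df=1)

# Example: a naive chi2(1) p-value of about 0.08 becomes about 0.04 under
# the mixture, so the wrong reference distribution is conservative here.
print(boundary_lrt_pvalue(3.06))   # ~0.040
```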

158 citations


Journal Article
TL;DR: A formalization of the concept of uncertainty in statistical matching when the variables are categorical is given, and a consistent maximum likelihood estimator of the elements characterizing uncertainty is suggested.
Abstract: Statistical matching is a technique for combining information from different sources. It can be used in situations when variables of interest are not jointly observed and conclusions must be drawn on the basis of partial knowledge of the phenomenon. Uncertainty regarding conclusions arises naturally unless strong and nontestable hypotheses are assumed. Hence, the main goal of statistical matching can be reinterpreted as the study of the key aspects of uncertainty, and what conclusions can be drawn. In this article we give a formalization of the concept of uncertainty in statistical matching when the variables are categorical, and formalize the key elements to be investigated. A consistent maximum likelihood estimator of the elements characterizing uncertainty is suggested. Furthermore, the introduction of logical constraints and their effect on uncertainty are studied. All the analyses have been performed according to the likelihood principle. An example with real data is presented and a comparison with other approaches already defined is performed.

50 citations


Journal Article
TL;DR: Motivated by the proportional hazards mixed effects model (PHMM), which incorporates general random effects of arbitrary covariates and includes the frailty model as a special case, this work uses the recently established asymptotic properties of its parameter estimates, which enable the quadratic expansion of the log profile likelihood.
Abstract: We consider selection of nested and non-nested semiparametric models. Using profile likelihood we can define both a likelihood ratio statistic and an Akaike information for models with nuisance parameters. Asymptotic quadratic expansion of the log profile likelihood allows derivation of the asymptotic null distribution of the likelihood ratio statistic including the boundary cases, as well as unbiased estimation of the Akaike information by an Akaike information criterion. Our work was motivated by the proportional hazards mixed effects model (PHMM), which incorporates general random effects of arbitrary covariates and includes the frailty model as a special case. The asymptotic properties of its parameter estimates have recently been established, which enables the quadratic expansion of the log profile likelihood. For computation of the (profile) likelihood under PHMM we apply three algorithms: Laplace approximation, reciprocal importance sampling, and bridge sampling. We compare the three algorithms under different data structures, and apply the methods to a multi-center lung cancer clinical trial.

38 citations


Journal ArticleDOI
TL;DR: In this article, the authors re-examine the endogeneity issue in light of the likelihood principle and show that, once the data are collected, adhering to the likelihood principle leads to analysis where endogeneity becomes ignorable for estimation.
Abstract: The use of adaptive designs in conjoint analysis has been shown to lead to an endogeneity bias in part-worth estimates using sampling experiments. In this paper, we re-examine the endogeneity issue in light of the likelihood principle. The likelihood principle asserts that all relevant information in the data about model parameters is contained in the likelihood function. We show that, once the data are collected, adhering to the likelihood principle leads to analysis where endogeneity becomes ignorable for estimation. The likelihood principle is implicit to Bayesian analysis, and discussion is offered for detecting and dealing with endogeneity bias in marketing.

31 citations


Journal ArticleDOI
TL;DR: In this article, the authors study the large deviation principle for M-estimators (and maximum likelihood estimators in particular) and obtain the rate function of the large deviation principle for M-estimators.
Abstract: We study the large deviation principle for M-estimators (and maximum likelihood estimators in particular). We obtain the rate function of the large deviation principle for M-estimators. For exponential families, this rate function agrees with the Kullback–Leibler information number. However, for location or scale families this rate function is smaller than the Kullback–Leibler information number. We apply our results to obtain confidence regions of minimum size whose coverage probability converges to one exponentially. In the case of full exponential families, the constructed confidence regions agree with the ones obtained by inverting the likelihood ratio test with a simple null hypothesis.
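In schematic form, and for the full exponential family case in which the abstract notes that the rate function equals the Kullback–Leibler number, the large deviation statement for the MLE reads as below; this is a reconstruction for orientation, not the paper's exact result.

```latex
% LDP for the MLE \hat\theta_n in a full exponential family: for a set A
% bounded away from the true value \theta_0,
\Pr_{\theta_0}\!\left(\hat\theta_n \in A\right)
  \;=\; \exp\!\Big(\!-n \inf_{\theta \in A} K(\theta,\theta_0) \;+\; o(n)\Big),
\qquad
K(\theta,\theta_0) \;=\; \mathrm{E}_{\theta}\!\left[\log \frac{f_\theta(X)}{f_{\theta_0}(X)}\right].
```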

31 citations


Journal ArticleDOI
TL;DR: In this paper, Chen et al. developed an empirical likelihood inference for censored survival data under the linear transformation models, which generalize Cox's [Regression models and life tables (with Discussion), J Roy Statist Soc Ser B 34 (1972) 187-220] proportional hazards model, and showed that the limiting distribution of the empirical likelihood ratio is a weighted sum of standard chi-squared distributions.

Posted Content
TL;DR: mvprobit as discussed by the authors uses the Geweke-Hajivassiliou-Keane (GHK) simulator to evaluate the M-dimensional Normal integrals in the likelihood function.
Abstract: mvprobit estimates M-equation probit models, by the method of simulated maximum likelihood (SML). (Cf. probit and biprobit which estimate 1-equation and 2-equation probit models by maximum likelihood.) The variance-covariance matrix of the cross-equation error terms has values of 1 on the leading diagonal, and the off-diagonal elements are correlations to be estimated. mvprobit uses the Geweke-Hajivassiliou-Keane (GHK) simulator to evaluate the M-dimensional Normal integrals in the likelihood function. For each observation, a likelihood contribution is calculated for each replication, and the simulated likelihood contribution is the average of the values derived from all the replications. The simulated likelihood function for the sample as a whole is then maximized using standard methods (ml in this case).
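The GHK recursion itself is short. The following Python sketch computes one simulated M-dimensional normal probability of the kind that forms an observation's likelihood contribution; it is an illustrative translation, not the Stata implementation, and the function name and defaults are invented.

```python
import numpy as np
from scipy.stats import norm

def ghk_probability(upper, corr, n_draws=1000, rng=None):
    """GHK simulator for P(Z_1 < upper_1, ..., Z_M < upper_M), Z ~ N(0, corr).

    This is the kind of M-dimensional normal probability that enters each
    observation's simulated likelihood contribution.
    """
    rng = rng or np.random.default_rng(0)
    L = np.linalg.cholesky(corr)          # corr = L L'
    M = len(upper)
    prob = np.ones(n_draws)
    draws = np.zeros((n_draws, M))
    for j in range(M):
        # conditional upper bound for the j-th recursively sampled term
        cond = (upper[j] - draws[:, :j] @ L[j, :j]) / L[j, j]
        p_j = norm.cdf(cond)
        prob *= p_j
        # draw the truncated standard normal via the inverse CDF
        u = rng.uniform(size=n_draws)
        draws[:, j] = norm.ppf(u * p_j)
    return prob.mean()        # simulated likelihood contribution

# e.g. a trivariate probit likelihood term, with bounds derived from X*beta
# and an estimated error correlation matrix R:
# ghk_probability(upper=np.array([0.3, -0.1, 0.8]), corr=R)
```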

Journal ArticleDOI
TL;DR: In this paper, a consistent sequence of roots of the likelihood equation that is asymptotically normal with the inverse of the Fisher information as its variance is established under some regularity conditions.
Abstract: Motivated by studying asymptotic properties of the maximum likelihood estimator (MLE) in stochastic volatility (SV) models, in this paper we investigate likelihood estimation in state space models. We first prove, under some regularity conditions, that there is a consistent sequence of roots of the likelihood equation that is asymptotically normal with the inverse of the Fisher information as its variance. Under the extra assumption that the likelihood equation has a unique root for each n, there is a consistent sequence of estimators of the unknown parameters. If, in addition, the supremum of the log likelihood function is integrable, the MLE exists and is strongly consistent. An Edgeworth expansion of the approximate solution of the likelihood equation is also established. Several examples, including Markov switching models, ARMA models, (G)ARCH models and stochastic volatility (SV) models, are given for illustration.
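The first result quoted above is the classical statement; in symbols, with I(θ₀) the Fisher information at the true value θ₀, the consistent sequence of roots θ̂ₙ satisfies:

```latex
\sqrt{n}\,\big(\hat\theta_n - \theta_0\big) \;\xrightarrow{\;d\;}\; N\!\big(0,\; I(\theta_0)^{-1}\big).
```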

Journal ArticleDOI
TL;DR: The typical inference method derived under the assumption of independence between genotype and exposure is extended to that under a more general assumption of conditional independence and can be reduced to simply fitting a multinomial logistic model when the authors have case‐only data.
Abstract: Given the biomedical interest in gene-environment interactions along with the difficulties inherent in gathering genetic data from controls, epidemiologists need methodologies that can increase precision of estimating interactions while minimizing the genotyping of controls. To achieve this purpose, many epidemiologists suggested that one can use case-only design. In this paper, we present a maximum likelihood method for making inference about gene-environment interactions using case-only data. The probability of disease development is described by a logistic risk model. Thus the interactions are model parameters measuring the departure of joint effects of exposure and genotype from multiplicative odds ratios. We extend the typical inference method derived under the assumption of independence between genotype and exposure to that under a more general assumption of conditional independence. Our maximum likelihood method can be applied to analyse both categorical and continuous environmental factors, and generalized to make inference about gene-gene-environment interactions. Moreover, the application of this method can be reduced to simply fitting a multinomial logistic model when we have case-only data. As a consequence, the maximum likelihood estimates of interactions and likelihood ratio tests for hypotheses concerning interactions can be easily computed. The methodology is illustrated through an example based on a study about the joint effects of XRCC1 polymorphisms and smoking on bladder cancer. We also give two simulation studies to show that the proposed method is reliable in finite sample situation.
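A minimal sketch of the case-only idea in the simplest setting (binary genotype and exposure, gene-environment independence in the source population): the multiplicative interaction parameter of the logistic risk model is estimated by the genotype-exposure odds ratio among cases, which with a binary genotype is just an ordinary logistic fit; a multinomial logit, as the abstract notes, handles more genotype categories. The data, variable names, and seeds below are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical case-only data: one row per case, binary genotype G and
# binary exposure E.  Under gene-environment independence (and a logistic
# risk model), the G-E odds ratio among cases estimates the multiplicative
# interaction parameter.
cases = pd.DataFrame({
    "G": np.random.default_rng(1).integers(0, 2, 500),
    "E": np.random.default_rng(2).integers(0, 2, 500),
})

X = sm.add_constant(cases["E"])
fit = sm.Logit(cases["G"], X).fit(disp=0)
interaction_or = np.exp(fit.params["E"])   # case-only interaction estimate
print(interaction_or)
```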

Journal ArticleDOI
TL;DR: In this paper, the authors employ second-order likelihood asymptotics to investigate how ideal frequentist inferences depend on the probability model for the data through more than the likelihood function, referring to this as the effect of the reference set.
Abstract: We employ second-order likelihood asymptotics to investigate how ideal frequentist inferences depend on the probability model for the data through more than the likelihood function, referring to this as the effect of the reference set. There are two aspects of higher-order corrections to first-order likelihood methods, namely (i) that involving effects of fitting nuisance parameters and leading to the modified profile likelihood, and (ii) another part pertaining to limitation in adjusted information. Generally, each of these involves a first-order adjustment depending on the reference set. However, we show that, for some important settings, likelihood-irrelevant model specifications have a second-order effect on both of these adjustments; this result includes specification of the censoring model for survival data. On the other hand, for sequential experiments the likelihood-irrelevant specification of the stopping rule has a second-order effect on adjustment (i) but a first-order effect on adjustment (ii). These matters raise the issue of what are 'ideal' frequentist inferences, since consideration of 'exact' frequentist inferences will not suffice. We indicate that to second order ideal frequentist inferences may be based on the distribution of the ordinary likelihood ratio statistic, without commonly considered adjustments thereto.

Journal ArticleDOI
TL;DR: This paper studies several more complicated seemingly unrelated regression models, and shows how all stationary points of the likelihood function can be computed using algebraic geometry.

Journal ArticleDOI
TL;DR: Bunouf and Lecoutre as discussed by the authors extended the classical Jeffreys priors for the Binomial and Pascal sampling models to more general stopping rules and showed that the correction induced on the posterior is proportional to the bias induced by the stopping rule on the maximum likelihood estimator.

Book ChapterDOI
TL;DR: The stochastic complexity criterion is applied to estimation of the order in AR and ARMA models and exact asymptotic formulas for the Fisher information matrix are derived.
Abstract: In this paper the stochastic complexity criterion is applied to estimation of the order in AR and ARMA models. The power of the criterion for short strings is illustrated by simulations. The criterion requires an integral of the square root of the Fisher information, which is evaluated by a Monte Carlo technique. The stochastic complexity, which is the negative logarithm of the Normalized Maximum Likelihood universal density function, is given. Exact asymptotic formulas for the Fisher information matrix are also derived.
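For orientation, the stochastic complexity of a data string x^n under a k-parameter model class is the negative logarithm of the NML density, and Rissanen's asymptotic expansion of its normalizing constant is where the integral of the square root of the Fisher information enters. A standard form, which may differ from the paper's exact notation, is:

```latex
\mathrm{SC}(x^n) \;=\; -\log \frac{f\big(x^n;\,\hat\theta(x^n)\big)}{C_n},
\qquad
C_n \;=\; \int f\big(y^n;\,\hat\theta(y^n)\big)\, dy^n,
% with Rissanen's asymptotic expansion of the normalizing constant:
\log C_n \;=\; \frac{k}{2}\,\log\frac{n}{2\pi}
  \;+\; \log \int_{\Theta}\!\sqrt{\det I(\theta)}\; d\theta \;+\; o(1).
```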

Journal ArticleDOI
01 Apr 2006
TL;DR: In this paper, two celebrated statistical principles, the principle of maximum likelihood and the principle of maximum entropy, were merged, establishing a novel estimation scheme for statistical inversion. However, the scheme is not suitable for cases with large numbers of variables.
Abstract: Two celebrated statistical principles, the Principle of Maximum Likelihood and the Principle of Maximum Entropy, are merged, establishing a novel estimation scheme for statistical inversion.

Journal ArticleDOI
TL;DR: In this article, the empirical likelihood method for a parametric model which parameterizes the conditional density of a response given covariate is studied, and the adjusted empirical log-likelihood ratio is shown to be asymptotically standard χ² when missing responses are imputed using the maximum likelihood estimate.
Abstract: In the present paper, we study the empirical likelihood method for a parametric model which parameterizes the conditional density of a response given covariate. It is shown that the adjusted empirical log-likelihood ratio is asymptotically standard χ² when missing responses are imputed using the maximum likelihood estimate.

Journal ArticleDOI
TL;DR: In this article, two different approaches for calculating the maximum likelihood estimates (MLE) are given and examined for left-censored data with two different detection limits: DL1 and DL2.
Abstract: Left-censored data often arise in environmental contexts with one or more detection limits, DLs. Estimators of the parameters are derived for left-censored data having two detection limits, DL1 and DL2, assuming an underlying normal distribution. Two different approaches for calculating the maximum likelihood estimates (MLE) are given and examined. These methods also apply to lognormally distributed environmental data with two distinct detection limits. The performance of the new estimators is compared utilizing many simulated data sets. Examples are given illustrating the use of these methods utilizing a computer program given in the Appendix.
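One direct route to the MLE described above is to maximize the censored normal log-likelihood numerically; the sketch below (invented function name, generic scipy optimizer) treats the two groups of non-detects through their normal CDF contributions, and applies to lognormal data after log-transforming the detects and detection limits. The paper compares two specific approaches and supplies its own program, so this is only an illustration of the likelihood being maximized.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def censored_normal_mle(detects, n_below_dl1, dl1, n_below_dl2, dl2):
    """MLE of (mu, sigma) for normal data left-censored at two detection limits.

    `detects` are the observed (uncensored) values; n_below_dl1 / n_below_dl2
    are counts of non-detects reported only as "< dl1" / "< dl2".
    For lognormal data, pass log-transformed detects and log DLs.
    """
    detects = np.asarray(detects, dtype=float)

    def negloglik(params):
        mu, logsig = params
        sig = np.exp(logsig)
        ll = norm.logpdf(detects, mu, sig).sum()          # uncensored part
        ll += n_below_dl1 * norm.logcdf((dl1 - mu) / sig)  # censored at DL1
        ll += n_below_dl2 * norm.logcdf((dl2 - mu) / sig)  # censored at DL2
        return -ll

    start = np.array([detects.mean(), np.log(detects.std() + 1e-6)])
    res = minimize(negloglik, start, method="Nelder-Mead")
    return res.x[0], np.exp(res.x[1])   # (mu_hat, sigma_hat)
```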

Journal ArticleDOI
TL;DR: In this article, an approximate likelihood function for panel data with an autoregressive moving-average (ARMA) model remainder disturbance is presented and Whittle's approximate maximum likelihood estimator (MLE) is used to yield an asymptotic estimator.
Abstract: An approximate likelihood function for panel data with an autoregressive moving-average ARMA(p, q) remainder disturbance is presented, and Whittle's approximate maximum likelihood estimator (MLE) is used to yield an asymptotic estimator. Although the approach is asymptotic, it is quite successful for estimation and testing, with good power. In this approach, we do not need to calculate the transformation matrix in exact form. Through the Riemann sum approach, we can construct a simple approximate concentrated likelihood function. In addition, the model is also extended to the restricted maximum likelihood (REML) function, in which the package of Gilmour, Thompson and Cullis [Biometrics (1995) Vol. 51, pp. 1440–1450] is applied without difficulty. In the case study, we implement the model on the characteristic line for the investment analysis of Taiwanese computer motherboard makers.
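As background on the approximation being used (stated here for a single series; the paper adapts the idea to the panel setting), Whittle's likelihood replaces the exact Gaussian log-likelihood by a sum over Fourier frequencies involving the periodogram I(ω_j) and the model spectral density f_θ(ω_j); up to constants:

```latex
\ell_W(\theta) \;\approx\; -\tfrac{1}{2}\sum_{j}
  \left\{ \log f_\theta(\omega_j) \;+\; \frac{I(\omega_j)}{f_\theta(\omega_j)} \right\},
\qquad \omega_j = \frac{2\pi j}{T}.
```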

Journal ArticleDOI
TL;DR: In this article, the asymptotic properties of the likelihood ratio statistic for testing homogeneity in a bivariate normal mixture model with known covariance were investigated, and the results of a small simulation study to approximate the null distribution were presented.

Journal ArticleDOI
TL;DR: It is shown that maximum likelihood estimation can be alternatively performed by employing misclassification probabilities and a missing data specification, and a quasi-likelihood parameterisation of the misclassification model is proposed as an alternative to maximum likelihood estimation.
Abstract: We discuss alternative approaches for estimating from cross-sectional categorical data in the presence of misclassification. Two parameterisations of the misclassification model are reviewed. The first employs misclassification probabilities and leads to moment-based inference. The second employs calibration probabilities and leads to maximum likelihood inference. We show that maximum likelihood estimation can be alternatively performed by employing misclassification probabilities and a missing data specification. As an alternative to maximum likelihood estimation we propose a quasi-likelihood parameterisation of the misclassification model. In this context an explicit definition of the likelihood function is avoided and a different way of resolving a missing data problem is provided. Variance estimation for the alternative point estimators is considered. The different approaches are illustrated using real data from the UK Labour Force Survey and simulated data.
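The moment-based route mentioned above rests on a simple identity: the observed category proportions equal the transpose of the misclassification matrix times the true proportions, so with known (or separately estimated) misclassification probabilities the true proportions can be recovered by solving a linear system. The matrix and proportions below are purely illustrative.

```python
import numpy as np

# Moment-type correction with known misclassification probabilities:
# M[i, j] = P(observed category j | true category i).  The observed
# category proportions satisfy p_obs = M' p_true, so an estimate of the
# true proportions is obtained by inverting the misclassification matrix.
M = np.array([[0.9, 0.1],
              [0.2, 0.8]])          # assumed (illustrative) probabilities
p_obs = np.array([0.62, 0.38])      # observed category proportions

p_true = np.linalg.solve(M.T, p_obs)
print(p_true)                        # moment-based estimate; may need
                                     # truncation to [0, 1] in practice
```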

01 Jan 2006
TL;DR: In this paper, a precise definition of likelihood statistic is given, and simple and easy to use criteria are proposed to establish under weak conditions that minimal sufficiency in statistics emerges from observed likelihood functions.
Abstract: The likelihood statistic guides statistical analysis in almost all areas of application. A precise definition of likelihood statistic is given, and simple and easy to use criteria are proposed to establish under weak conditions that minimal sufficiency in statistics emerges from observed likelihood functions. Some examples are presented.

Journal ArticleDOI
22 Feb 2006-Metrika
TL;DR: In this article, the usual likelihood ratio test, with the maximum likelihood estimator for the unspecified parameters, is generalized to tests based on φ-divergence statistics, using the minimum φ-divergence estimator.
Abstract: Consider the loglinear model for categorical data under the assumption of multinomial sampling. We are interested in testing between various hypotheses on the parameter space when we have some hypotheses relating to the parameters of the models that can be written in terms of constraints on the frequencies. The usual likelihood ratio test, with the maximum likelihood estimator for the unspecified parameters, is generalized to tests based on φ-divergence statistics, using the minimum φ-divergence estimator. These tests yield the classical likelihood ratio test as a special case. Asymptotic distributions for the new φ-divergence test statistics are derived under the null hypothesis.
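The divergence symbol is garbled in this listing; assuming it stands for a φ-divergence (the usual choice in this literature), the general form of such a test statistic for multinomial data is, schematically, as below, with observed proportions p̂ and fitted probabilities p(θ̃) under the constrained model:

```latex
T_{\phi} \;=\; \frac{2n}{\phi''(1)}\,
  \sum_{j} p_j(\tilde\theta)\;
  \phi\!\left(\frac{\hat p_j}{p_j(\tilde\theta)}\right),
% the choice \phi(x) = x\log x - x + 1 recovers the likelihood ratio
% statistic  G^2 = 2n \sum_j \hat p_j \log\big(\hat p_j / p_j(\tilde\theta)\big).
```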

01 Jan 2006
TL;DR: For a simplified structural equation/IV regression model with one right-side endogenous variable, the authors obtained the exact conditional distribution function for Moreira's (2003) conditional likelihood ratio (CLR) test, which is then used to obtain the critical value function needed to implement the CLR test.
Abstract: For a simplified structural equation/IV regression model with one right-side endogenous variable, we obtain the exact conditional distribution function for Moreira's (2003) conditional likelihood ratio (CLR) test. This is then used to obtain the critical value function needed to implement the CLR test, and reasonably comprehensive graphical versions of the function are provided for practical use. The analogous functions are also obtained for the case of testing more than one right-side endogenous coefficient, but only for an approximation to the true likelihood ratio test. We then go on to provide an exact analysis of the power functions of the CLR test, the Anderson-Rubin test, and the LM test suggested by Kleibergen (2002). The CLR test is shown to clearly conditionally dominate the other two tests for virtually all parameter configurations, but none of these tests is either inadmissible or uniformly superior to the other two.

Journal Article
TL;DR: In this article, the authors investigate the test for serial correlation in partial linear model and derive the asymptotic distribution of the test statistics under null hypothesis, and show that their test has good power.
Abstract: In this paper, we investigate tests for serial correlation in the partial linear model. We use the empirical likelihood method to construct test statistics, and derive the asymptotic distribution of the test statistics under the null hypothesis. Our method tests not only first-order autocorrelation but also higher-order autocorrelation. Simulation results show that our test has good power.

Journal ArticleDOI
22 Apr 2006-Metrika
TL;DR: In this article, the authors studied the likelihood ratio test for and against the hypothesis that the parameter is restricted by some nonlinear inequalities, and derived the asymptotic null distributions of the likelihood ratios by using the limits of the related optimization problems.
Abstract: In applied statistics a finite dimensional parameter involved in the distribution function of the observed random variable is very often constrained by a number of nonlinear inequalities. This paper is devoted to studying the likelihood ratio test for and against the hypothesis that the parameter is restricted by some nonlinear inequalities. The asymptotic null distributions of the likelihood ratio statistics are derived by using the limits of the related optimization problems. The author also shows how to compute critical values for the tests.

Journal ArticleDOI
TL;DR: In this paper, a simple errors-in-variables regression model is given for illustrating the method of marginal maximum likelihood (MML) and given suitable estimates of reliability, error variables, as nuisance variables, can be integrated out of likelihood equations.
Abstract: A simple errors-in-variables regression model is given in this article for illustrating the method of marginal maximum likelihood (MML). Given suitable estimates of reliability, error variables, as nuisance variables, can be integrated out of likelihood equations. Given the closed-form expression of the resulting marginal likelihood, the effects of error can be more clearly demonstrated. Derivations are given in detail to provide a worked example of the marginalization strategy, and to prepare students for understanding more advanced applications of MML.
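A compact version of the marginalization for the simplest case (variables centered at zero, reliability λ known) shows why the observed-score slope is attenuated and how the marginal MLE corrects it; this is a schematic reconstruction rather than the article's exact notation:

```latex
% True-score model and observed score:
%   Y = \beta X + e,   W = X + U,   X \perp U \perp e,
%   \lambda = \sigma_X^2 / (\sigma_X^2 + \sigma_U^2)  (reliability, known).
% Integrating the unobserved X out of the joint density leaves a normal
% marginal likelihood in which
\mathrm{E}(Y \mid W) = \beta\,\lambda\,W,
\qquad
\operatorname{Var}(Y \mid W) = \sigma_e^2 + \beta^2\,\lambda\,\sigma_U^2,
% so the naive slope of Y on W estimates \beta\lambda, and the marginal
% MLE corrects it:  \hat\beta_{\mathrm{MML}} = \hat\beta_{Y\cdot W} / \lambda.
```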