
Showing papers in "Statistics and Computing in 1997"


Journal ArticleDOI
TL;DR: The user interface is simple and homogeneous among all the programs; this contributes to making the use of ADE-4 very easy for non-specialists in statistics, data analysis or computer science.
Abstract: We present ADE-4, a multivariate analysis and graphical display software package. Multivariate analysis methods available in ADE-4 include usual one-table methods like principal component analysis and correspondence analysis, spatial data analysis methods (using a total variance decomposition into local and global components, analogous to Moran and Geary indices), discriminant analysis and within/between groups analyses, many linear regression methods including lowess and polynomial regression, multiple and PLS (partial least squares) regression and orthogonal regression (principal component regression), projection methods like principal component analysis on instrumental variables, canonical correspondence analysis and many other variants, coinertia analysis and the RLQ method, and several three-way table (k-table) analysis methods. Graphical display techniques include an automatic collection of elementary graphics corresponding to groups of rows or to columns in the data table, thus providing a very efficient way for automatic k-table graphics and geographical mapping options. A dynamic graphic module allows interactive operations like searching, zooming, selection of points, and display of data values on factor maps. The user interface is simple and homogeneous among all the programs; this contributes to making the use of ADE-4 very easy for non-specialists in statistics, data analysis or computer science.

1,651 citations


Journal ArticleDOI
TL;DR: This paper presents a new simulation-based methodology, using Markov chain Monte Carlo techniques, for Bayesian inference about the random effects in generalized linear mixed models, which unify exponential family regression models, overdispersed data and longitudinal studies.
Abstract: Generalized linear mixed models provide a unified framework for treatment of exponential family regression models, overdispersed data and longitudinal studies. These problems typically involve the presence of random effects and this paper presents a new methodology for making Bayesian inference about them. The approach is simulation-based and involves the use of Markov chain Monte Carlo techniques. The usual iterative weighted least squares algorithm is extended to include a sampling step based on the Metropolis–Hastings algorithm thus providing a unified iterative scheme. Non-normal prior distributions for the regression coefficients and for the random effects distribution are considered. Random effect structures with nesting required by longitudinal studies are also considered. Particular interests concern the significance of regression coefficients and assessment of the form of the random effects. Extensions to unknown scale parameters, unknown link functions, survival and frailty models are outlined.
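To make the sampling step concrete, here is a minimal Python sketch of a Gaussian random-walk Metropolis-Hastings update of the kind that can be embedded in such an iterative scheme; the function names, proposal scale and toy target below are illustrative assumptions, not the paper's exact algorithm.

    import numpy as np

    def metropolis_hastings_step(current, log_post, step_sd, rng):
        """One Gaussian random-walk Metropolis-Hastings update of a parameter vector."""
        proposal = current + rng.normal(scale=step_sd, size=current.shape)
        # Accept with probability min(1, posterior ratio); work on the log scale.
        log_ratio = log_post(proposal) - log_post(current)
        if np.log(rng.uniform()) < log_ratio:
            return proposal, True
        return current, False

    # Illustrative use: sample from a standard bivariate normal "posterior".
    rng = np.random.default_rng(0)
    x = np.zeros(2)
    draws = []
    for _ in range(5000):
        x, _ = metropolis_hastings_step(x, lambda b: -0.5 * b @ b, 0.5, rng)
        draws.append(x.copy())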

329 citations


Journal ArticleDOI
TL;DR: An approach to significance testing by the direct interpretation of likelihood is defined, developed and distinguished from the traditional forms of tail-area testing and Bayesian testing.
Abstract: An approach to significance testing by the direct interpretation of likelihood is defined, developed and distinguished from the traditional forms of tail-area testing and Bayesian testing. The emphasis is on conceptual issues. Some theoretical aspects of the new approach are sketched in the two cases of simple vs. simple hypotheses and simple vs. composite hypotheses.

225 citations


Journal ArticleDOI
TL;DR: This work proposes a new approach to cluster analysis which consists of exact Bayesian inference via Gibbs sampling, and the calculation of Bayes factors from the output using the Laplace–Metropolis estimator, which works well in several real and simulated examples.
Abstract: A new approach to cluster analysis has been introduced based on parsimonious geometric modelling of the within-group covariance matrices in a mixture of multivariate normal distributions, using hierarchical agglomeration and iterative relocation. It works well and is widely used via the MCLUST software available in S-PLUS and StatLib. However, it has several limitations: there is no assessment of the uncertainty about the classification, the partition can be suboptimal, parameter estimates are biased, the shape matrix has to be specified by the user, prior group probabilities are assumed to be equal, the method for choosing the number of groups is based on a crude approximation, and no formal way of choosing between the various possible models is included. Here, we propose a new approach which overcomes all these difficulties. It consists of exact Bayesian inference via Gibbs sampling, and the calculation of Bayes factors (for choosing the model and the number of groups) from the output using the Laplace–Metropolis estimator. It works well in several real and simulated examples.
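For concreteness, the following Python sketch shows one ingredient of a Gibbs sampler for a multivariate normal mixture, the step that draws each observation's group label from its full conditional; the function and variable names are illustrative, and the paper's full sampler (and its Laplace-Metropolis Bayes factor computation) involves considerably more.

    import numpy as np
    from scipy.stats import multivariate_normal

    def sample_allocations(X, weights, means, covs, rng):
        """Gibbs step: draw each observation's group label from its full conditional."""
        n, G = X.shape[0], len(weights)
        logp = np.empty((n, G))
        for g in range(G):
            logp[:, g] = np.log(weights[g]) + multivariate_normal.logpdf(X, means[g], covs[g])
        # Normalize on the log scale, then sample one label per observation.
        logp -= logp.max(axis=1, keepdims=True)
        probs = np.exp(logp)
        probs /= probs.sum(axis=1, keepdims=True)
        return np.array([rng.choice(G, p=p) for p in probs])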

215 citations


Journal ArticleDOI
TL;DR: The optimal decomposition of Bayesian networks is considered and the applicability of genetic algorithms to the problem of the triangulation of moral graphs is examined empirically.
Abstract: In this paper we consider the optimal decomposition of Bayesian networks. More concretely, we examine empirically the applicability of genetic algorithms to the problem of the triangulation of moral graphs. This problem constitutes the only difficult step in the evidence propagation algorithm of Lauritzen and Spiegelhalter (1988) and is known to be NP-hard (Wen, 1991). We carry out experiments with distinct crossover and mutation operators and with different population sizes, mutation rates and selection biases. The results are analysed statistically. They turn out to improve the results obtained with most other known triangulation methods (Kjærulff, 1990) and are comparable to results obtained with simulated annealing (Kjærulff, 1990; Kjærulff, 1992).

98 citations


Journal ArticleDOI
TL;DR: This paper identifies, categorizes and discusses the various problem-specific factors that influence the process of developing classifiers for real-world classification problems, and presents a case study of a large-scale classification application using the process framework described.
Abstract: In this paper we present a perspective on the overall process of developing classifiers for real-world classification problems. Specifically, we identify, categorize and discuss the various problem-specific factors that influence the development process. Illustrative examples are provided to demonstrate the iterative nature of the process of applying classification algorithms in practice. In addition, we present a case study of a large scale classification application using the process framework described, providing an end-to-end example of the iterative nature of the application process. The paper concludes that the process of developing classification applications for operational use involves many factors not normally considered in the typical discussion of classification models and algorithms.

68 citations


Journal ArticleDOI
TL;DR: The predictability index τ is proposed as a splitting rule that grows the same classification tree as CART does when the Gini index of heterogeneity is used as the impurity measure, while yielding a substantial saving in the time required to generate the tree.
Abstract: This paper provides a faster method to find the best split at each node when using the CART methodology. The predictability index τ is proposed as a splitting rule for growing the same classification tree as CART does when using the Gini index of heterogeneity as an impurity measure. A theorem is introduced to show a new property of the index τ: the τ for a given predictor has a value not lower than the τ for any split generated by the predictor. This property is used to make a substantial saving in the time required to generate a classification tree. Three simulation studies are presented in order to show the computational gain in terms of both the number of splits analysed at each node and the CPU time. The proposed splitting algorithm also proves computationally efficient on real data sets, as shown in an example.
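As background, the brute-force Gini split search whose cost the τ-based rule reduces can be sketched in a few lines of Python; the helper names are illustrative and the code is not the paper's algorithm.

    import numpy as np

    def gini(labels):
        """Gini heterogeneity index of a set of class labels."""
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return 1.0 - np.sum(p ** 2)

    def best_gini_split(x, y):
        """Exhaustive search over splits x <= c for the largest Gini impurity decrease."""
        order = np.argsort(x)
        x, y = x[order], y[order]
        n, parent = len(y), gini(y)
        best_cut, best_gain = None, -np.inf
        for i in range(1, n):
            if x[i] == x[i - 1]:
                continue  # no split point between tied predictor values
            child = (i * gini(y[:i]) + (n - i) * gini(y[i:])) / n
            gain = parent - child
            if gain > best_gain:
                best_cut, best_gain = (x[i - 1] + x[i]) / 2, gain
        return best_cut, best_gain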

58 citations


Journal ArticleDOI
TL;DR: It is shown that in the context of moderate to low signal-to-noise ratios, this ‘block thresholding’ approach does indeed improve performance, by allowing greater adaptivity and reducing mean squared error.
Abstract: Usually, methods for thresholding wavelet estimators are implemented term by term, with empirical coefficients included or excluded depending on whether their absolute values exceed a level that reflects plausible moderate deviations of the noise. We argue that performance may be improved by pooling coefficients into groups and thresholding them together. This procedure exploits the information that coefficients convey about the sizes of their neighbours. In the present paper we show that in the context of moderate to low signal-to-noise ratios, this 'block thresholding' approach does indeed improve performance, by allowing greater adaptivity and reducing mean squared error. Block thresholded estimators are less biased than term-by-term thresholded ones, and so react more rapidly to sudden changes in the frequency of the underlying signal. They also suffer less from spurious aberrations of Gibbs type, produced by excessive bias. On the other hand, they are more susceptible to spurious features produced by noise, and are more sensitive to selection of the truncation parameter.
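A minimal Python sketch of blockwise thresholding, assuming an array of empirical wavelet coefficients and a known noise level; the block length, threshold constant and the James-Stein-style shrinkage used here are illustrative choices rather than the rule analysed in the paper.

    import numpy as np

    def block_threshold(coeffs, block_len, sigma, lam=3.0):
        """Blockwise shrinkage of empirical wavelet coefficients.

        Blocks whose mean squared coefficient is below lam * sigma**2 are set to
        zero; surviving blocks are shrunk as a whole (a James-Stein-style rule).
        The constant lam is illustrative.
        """
        out = np.zeros_like(coeffs, dtype=float)
        for start in range(0, len(coeffs), block_len):
            block = coeffs[start:start + block_len]
            energy = np.mean(block ** 2)
            if energy > lam * sigma ** 2:
                out[start:start + block_len] = block * (1.0 - lam * sigma ** 2 / energy)
        return out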

56 citations


Journal ArticleDOI
TL;DR: The algorithm proposed here has an acceptance probability greater than e/4; its efficiency is compared with the previous method, and the improvement in terms of minimum acceptance probability is shown.
Abstract: We study the properties of truncated gamma distributions and we derive simulation algorithms which dominate the standard algorithms for these distributions. For the right truncated gamma distribution, an optimal accept–reject algorithm is based on the fact that its density can be expressed as an infinite mixture of beta distributions. For integer values of the parameters, the density of the left truncated distributions can be rewritten as a mixture which can be easily generated. We give an optimal accept–reject algorithm for the other values of the parameter. We compare the efficiency of our algorithm with the previous method and show the improvement in terms of minimum acceptance probability. The algorithm proposed here has an acceptance probability which is greater than e/4.
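The generic accept-reject scheme underlying such algorithms can be sketched as follows in Python; the naive proposal shown (simulate from the untruncated gamma and reject draws beyond the truncation point) is the baseline that the paper's beta-mixture proposal is designed to dominate, and all names are illustrative.

    import numpy as np

    def accept_reject(sample_proposal, density_ratio, rng, max_iter=10_000):
        """Generic accept-reject: density_ratio(x) = f(x) / (M * g(x)) must lie in [0, 1]."""
        for _ in range(max_iter):
            x = sample_proposal(rng)
            if rng.uniform() <= density_ratio(x):
                return x
        raise RuntimeError("acceptance probability appears to be too low")

    # Baseline sampler for a Gamma(shape=a) restricted to (0, t): propose from the
    # untruncated gamma and accept only draws below t.
    rng = np.random.default_rng(1)
    a, t = 3.0, 2.0
    draw = accept_reject(
        sample_proposal=lambda r: r.gamma(a),
        density_ratio=lambda x: 1.0 if x < t else 0.0,
        rng=rng,
    )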

47 citations


Journal ArticleDOI
TL;DR: Under very general conditions, the authors have shown that the attractive stationary points of the SAEM algorithm correspond to the global and local maxima of the observed likelihood.
Abstract: The Expectation–Maximization (EM) algorithm is a very popular technique for maximum likelihood estimation in incomplete data models. When the expectation step cannot be performed in closed form, a stochastic approximation of EM (SAEM) can be used. Under very general conditions, the authors have shown that the attractive stationary points of the SAEM algorithm correspond to the global and local maxima of the observed likelihood. In order to avoid convergence towards a local maximum, a simulated annealing version of SAEM is proposed. An illustrative application to the convolution model for estimating the coefficients of the filter is given.

41 citations


Journal ArticleDOI
TL;DR: The posterior distribution of the likelihood is used to interpret the evidential meaning of P-values, posterior Bayes factors and Akaike's information criterion when comparing point null hypotheses with composite alternatives.
Abstract: The posterior distribution of the likelihood is used to interpret the evidential meaning of P-values, posterior Bayes factors and Akaike's information criterion when comparing point null hypotheses with composite alternatives. Asymptotic arguments lead to simple re-calibrations of these criteria in terms of posterior tail probabilities of the likelihood ratio. ('Prior') Bayes factors cannot be calibrated in this way as they are model-specific.

Journal ArticleDOI
TL;DR: A hybrid algorithm is proposed in which crossover is used to combine subsections of image reconstructions obtained using SA and it is shown that this algorithm is more effective and efficient than SA or a GA individually.
Abstract: Genetic algorithms (GAs) are adaptive search techniques designed to find near-optimal solutions of large scale optimization problems with multiple local maxima. Standard versions of the GA are defined for objective functions which depend on a vector of binary variables. The problem of finding the maximum a posteriori (MAP) estimate of a binary image in Bayesian image analysis appears to be well suited to a GA as images have a natural binary representation and the posterior image probability is a multi-modal objective function. We use the numerical optimization problem posed in MAP image estimation as a test-bed on which to compare GAs with simulated annealing (SA), another all-purpose global optimization method. Our conclusions are that the GAs we have applied perform poorly, even after adaptation to this problem. This is somewhat unexpected, given the widespread claims of GAs' effectiveness, but it is in keeping with work by Jennison and Sheehan (1995) which suggests that GAs are not adept at handling problems involving a great many variables of roughly equal influence. We reach more positive conclusions concerning the use of the GA's crossover operation in recombining near-optimal solutions obtained by other methods. We propose a hybrid algorithm in which crossover is used to combine subsections of image reconstructions obtained using SA and we show that this algorithm is more effective and efficient than SA or a GA individually.
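The recombination idea can be illustrated with a short Python sketch in which crossover exchanges a random rectangular subsection between two binary reconstructions; this shows only the crossover operation, with illustrative names, not the full hybrid algorithm or its acceptance rule.

    import numpy as np

    def crossover_images(img_a, img_b, rng):
        """Combine two binary reconstructions by exchanging a random rectangular block."""
        child = img_a.copy()
        h, w = img_a.shape
        r0, c0 = rng.integers(0, h), rng.integers(0, w)
        r1, c1 = rng.integers(r0 + 1, h + 1), rng.integers(c0 + 1, w + 1)
        child[r0:r1, c0:c1] = img_b[r0:r1, c0:c1]
        return child

    # Example: recombine two 64x64 binary images obtained from separate SA runs.
    rng = np.random.default_rng(2)
    recon_a = rng.integers(0, 2, size=(64, 64))
    recon_b = rng.integers(0, 2, size=(64, 64))
    child = crossover_images(recon_a, recon_b, rng)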

Journal ArticleDOI
TL;DR: This paper presents various properties which allow the computation time to be drastically reduced, thus enabling one to use not only the more traditional and simple versions given by McDonald et al. (1977) and Garside and Mack (1967), but also the more complex original version of Barnard (1947).
Abstract: Unconditional non-asymptotic methods for comparing two independent binomial proportions have the drawback that they take a rather long time to compute. This problem is especially acute in the most powerful version of the method (Barnard, 1947). Thus, despite being the version which originated the method, it has hardly ever been used. This paper presents various properties which allow the computation time to be drastically reduced, thus enabling one to use not only the more traditional and simple versions given by McDonald et al. (1977) and Garside and Mack (1967), but also the more complex original version of Barnard (1947).

Journal ArticleDOI
Merrilee Hurn
TL;DR: The results suggest that adapting the auxiliary variables to the specific application is beneficial; however, the form of adaptation needed and the extent of the resulting benefits are not always clear-cut.
Abstract: Markov chain Monte Carlo (MCMC) methods are now widely used in a diverse range of application areas to tackle previously intractable problems. Difficult questions remain, however, in designing MCMC samplers for problems exhibiting severe multimodality, where standard methods may exhibit prohibitively slow movement around the state space. Auxiliary variable methods, sometimes together with multigrid ideas, have been proposed as one possible way forward. Initial disappointing experiments have led to data-driven modifications of the methods. In this paper, these suggestions are investigated for lattice data such as is found in imaging and some spatial applications. The results suggest that adapting the auxiliary variables to the specific application is beneficial. However, the form of adaptation needed and the extent of the resulting benefits are not always clear-cut.

Journal ArticleDOI
TL;DR: This article traces some of the developments in computational multivariate methodology in the past decade and highlights those trends that may prove most fruitful for future practical implementation.
Abstract: Many traditional multivariate techniques such as ordination, clustering, classification and discriminant analysis are now routinely used in most fields of application. However, the past decade has seen considerable new developments, particularly in computational multivariate methodology. This article traces some of these developments and highlights those trends that may prove most fruitful for future practical implementation.

Journal ArticleDOI
TL;DR: This paper proposes a general form of shrinkage, and suggests that, in practice, shrinkage be towards a proper curve estimator, and proposes a local linear estimator based on an infinitely supported kernel.
Abstract: Local linear curve estimators are typically constructed using a compactly supported kernel, which minimizes edge effects and (in the case of the Epanechnikov kernel) optimizes asymptotic performance in a mean square sense. The use of compactly supported kernels can produce numerical problems, however. A common remedy is ‘ridging’, which may be viewed as shrinkage of the local linear estimator towards the origin. In this paper we propose a general form of shrinkage, and suggest that, in practice, shrinkage be towards a proper curve estimator. For the latter we propose a local linear estimator based on an infinitely supported kernel. This approach is resistant against selection of too large a shrinkage parameter, which can impair performance when shrinkage is towards the origin. It also removes problems of numerical instability resulting from using a compactly supported kernel, and enjoys very good mean squared error properties.
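For orientation, a basic ridged local linear estimator with a compactly supported (Epanechnikov) kernel looks roughly as follows in Python; the ridge constant is illustrative, and this is the shrinkage-towards-the-origin form that the paper argues should be replaced by shrinkage towards an estimator built from an infinitely supported kernel.

    import numpy as np

    def local_linear(x0, x, y, h, ridge=1e-6):
        """Local linear estimate at x0 with an Epanechnikov kernel and a small ridge term.

        The ridge keeps the 2x2 weighted design matrix invertible when few design
        points receive positive kernel weight (the numerical problem 'ridging' addresses).
        """
        u = (x - x0) / h
        w = np.where(np.abs(u) < 1, 0.75 * (1 - u ** 2), 0.0)  # Epanechnikov weights
        X = np.column_stack([np.ones_like(x), x - x0])
        A = X.T @ (w[:, None] * X) + ridge * np.eye(2)
        b = X.T @ (w * y)
        beta = np.linalg.solve(A, b)
        return beta[0]  # intercept = fitted value at x0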

Journal ArticleDOI
TL;DR: A polynomial security test is provided for the problem of testing a given table of censored data for security; the problem of searching for a secure suppression pattern of minimum size for a given table is shown to be intractable in general, but solvable in linear time when only sensitive cells are to be protected.
Abstract: The technique of data suppression for protecting sensitive information in a two-dimensional table from exact disclosure raises the computational problems of testing a given table of censored data for security, and searching for a secure suppression pattern of minimum size for a given table. We provide a polynomial security test to solve the former problem, and prove that the latter problem is intractable in the general case, but can be solved in linear time in the special case in which only sensitive cells are to be protected.

Journal ArticleDOI
TL;DR: Comments on the recent reformulation of the Aitkin paper on P-values and the message of the Lindley paradox, which suggest that a subhypothesis strongly rejected by a significance test may be strongly supported in posterior probability if the prior puts insufficient weight on the hypotheses of non-negligible likelihood.
Abstract: Professor Aitkin acknowledges that the seeds of his work were sown over two decades ago in the paper of Professor Dempster here reproduced (pp. 247–252). Since we can also read Dempster's present views on the key idea (pp. 256–269), which may well have changed since 1973, I will try to confine my comments to its recent reformulation by Aitkin. There is a nice evenhandedness in the Aitkin paper and its forerunner (Aitkin, 1991): Bayesian commitment to what might be described as a bedrock prior comes under as much fire as naive interpretation of P-values. As far as the latter is concerned, I believe that the naivety may stem from overweening ambition, rather than from the simplicity with which naivety is usually associated. As deployed in scientific investigation, P-values were originally intended to provide 'simple tests of significance, in which the only available expectations are those which flow from the null hypothesis being true' (Fisher, 1951, p. 17). When not misused, they still provide some sort of control over the pursuit of weak clues, not a measure of faith in some alternative hypothesis. A P-value is a P-value is a P-value! That some users like to misinterpret it as a posterior probability or odds ratio or other inferential measure, or may join Aitkin in an elaborate 'calibration' programme, should not detract from the P-value's intrinsic, if uninterpretable, value. Aitkin is harder on 'uncalibrated' P-values than Dempster was. This is slightly paradoxical given his recommendation of the posterior Bayes factor in his 1991 paper, on the grounds that it avoids the Lindley paradox. The latter, after all, did exhibit a striking recalibration of P-values. It is the fate of some paradoxes to be transformed beyond recognition by multiple reanalyses, or deployed for purposes beyond their authors' imagining. A hilarious illustration of this is E. T. Jaynes's treatment (1994) of Dawid et al.'s marginalization 'paradox' and the Flatland 'paradox' (Stone, 1976), a treatment that elicited a rebuttal in Dawid et al. (1996) where, incidentally, it is explained that both 'paradoxes' are really inconsistencies, irremovable by any logically acceptable analysis. My own generalization and interpretation of the message of the Lindley paradox may likewise offend: a subhypothesis strongly rejected by a significance test may be strongly supported in posterior probability if the prior puts insufficient weight on the hypotheses of non-negligible likelihood.

Journal ArticleDOI
TL;DR: Commenting on the related current paper of Aitkin and the Lindley paradox, which draws attention to sometimes sharp discrepancies that can arise between two approaches, it is maintained that any scientific use of probability rests upon both subjective and objective bases.
Abstract: I much appreciate the opportunity provided by David Hand and the journal Statistics and Computing to reprint my Aarhus Conference paper and to comment on the related current paper of Aitkin. For readers wishing to think again about the tangled issues surrounding the concept of 'significance', a good starting point is the 'Lindley paradox' that draws attention to sometimes sharp discrepancies that can arise between two approaches, namely R. A. Fisher's 'P-value' significance tests based on sampling distributions, and Harold Jeffreys's position that Bayesian hypothesis testing also has a stake in the term 'significance'. Relevant references to both early and recent literature may be found in the review paper of Kass and Raftery (1995) on the hybrid 'Bayes factor' approach that is also evaluated by Aitkin. Mervyn Stone in his discussion of Aitkin describes the Lindley paradox as 'not easily resolved', in particular not via the lopsided Bayesian preference often advocated by Lindley. I agree with Mervyn. I start with related comments that may help to create a context for the techniques assessed in Aitkin's interesting contribution. Controversy and confusion are easily created by attempts to make opposites from concepts that are inseparable and complementary, as well as by conflating ideas that truly need to be distinguished. Both types of fallacy suffuse the Lindley paradox. Proponents of the Fisherian approach, or of its more frequentist Neyman–Pearson cousin, make much of claimed virtues of scientific objectivity. On the other side, subjectivist statisticians point to the high principle of Bayesian coherence as justifying their position. Both arguments have merit, but are not really in conflict, so may obscure other real differences. I maintain that any scientific use of probability rests upon both subjective and objective bases. Meanwhile, another basic distinction regarding the logical interpretation of probability fails to receive the recognition it deserves, namely, the distinction between 'postdictive' (Dempster, 1964) and predictive interpretation. Another type of confusion results from imprecision in stating the questions that probabilistic reasoning is called upon to answer. Fisher (1956 and later editions) is insufficiently recognized for drawing attention to logical varieties of statistical inference in relation to the different questions being asked. Fisher's ideas deserve elaboration, and perhaps revision, in today's environment of increased complexities of data structures and associated probability models, such as graphical, hierarchical, temporal, and spatial systems. Inferential applications of formal probabilities involve simultaneous subjective and objective considerations. Apart from a few ideologues who insist that probability has no meaning other than describing an identified long-run frequency, it is generally accepted that, whatever else probability may or may not be, it surely is a measure of uncertainty about individual trials that make up long-run frequencies. From there, it is a short step to considering that every applicable probability has a subjective interpretation as someone's idealized predictive uncertainty, given an implied or assumed state of knowledge. The most common vehicle for applicable probability is a stochastic model based on an analogy between an empirical situation and a game of chance. We habitually pay lip service to the idea that such models, which usually have undetermined parameters in them, are objectively based.
Certainly they do include designed mathematical representations of objective reality. But it is not easy to explain convincingly, and indeed is ultimately implausible in principle, that the complexities of the real world are ever captured precisely by rarefied mathematical idealizations. Models assumed in practice are in fact human constructions in search of an ideal that may or may not exist. Model-building involves subjective choices that invariably complement objective reality. Assessing model validity means comparing what is actual with predictions from the ideal, so implies subjective activities of choosing test statistics. The concept of predictive validity is closely bound up with the subjective interpretation of probability as 'your' (Savage, 1962) measure of uncertainty. It is never possible to 'verify' the probability of an individual uncertain event, but by locating the event in one or more aggregates whose frequencies are approximately verified,

Journal ArticleDOI
TL;DR: Model selection based on conditional predictive ordinates from cross-validated data is developed and Bayesian methods for estimating the dose response curves with the one-hit model, the gamma multi-hit model and their modified versions with Abbott's correction are studied.
Abstract: Bayesian methods for estimating the dose response curves with the one-hit model, the gamma multi-hit model, and their modified versions with Abbott's correction are studied. The Gibbs sampling approach with data augmentation and with the Metropolis algorithm is employed to compute the Bayes estimates of the potency curves. In addition, estimation of the 'relative additional risk' and the 'virtually safe dose' is studied. Model selection based on conditional predictive ordinates from cross-validated data is developed.
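For reference, the standard textbook forms of these dose-response models can be written in a few lines of Python; the parameter names and example values are illustrative, and the paper's parametrizations may differ.

    import math
    from scipy.stats import gamma

    def one_hit(d, lam):
        """One-hit dose-response: P(d) = 1 - exp(-lam * d)."""
        return 1.0 - math.exp(-lam * d)

    def gamma_multi_hit(d, k, lam):
        """Gamma multi-hit: P(d) is the incomplete gamma integral up to lam * d."""
        return gamma.cdf(lam * d, a=k)

    def abbott(p, background):
        """Abbott's correction for a background response rate."""
        return background + (1.0 - background) * p

    # Example: corrected one-hit response at dose 2.0 with lam = 0.3 and 5% background.
    print(abbott(one_hit(2.0, 0.3), 0.05))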

Journal ArticleDOI
TL;DR: A simplified robust logistic model is proposed which does not have any such convergence problems and which takes a generalized linear model form; it is used to predict accurately which method of feeding a woman will eventually use: breast feeding or bottle feeding.
Abstract: Logistic discrimination is a well documented method for classifying observations to two or more groups. However, estimation of the discriminant rule can be seriously affected by outliers. To overcome this, Cox and Ferry produced a robust logistic discrimination technique. Although their method worked in practice, parameter estimation was sometimes prone to convergence problems. This paper proposes a simplified robust logistic model which does not have any such problems and which takes a generalized linear model form. Misclassification rates calculated in a simulation exercise are used to compare the new method with ordinary logistic discrimination. Model diagnostics are also presented. The newly proposed model is then used on data collected from pregnant women at two district general hospitals. A robust logistic discriminant is calculated which can be used to predict accurately which method of feeding a woman will eventually use: breast feeding or bottle feeding.

Journal ArticleDOI
TL;DR: This work presents a manually-adaptive extension of QMC for approximating marginal densities when the joint density is known up to a normalization constant and demonstrates by examples that adaptive QMC can be a viable alternative to the Metropolis algorithm.
Abstract: We first review quasi Monte Carlo (QMC) integration for approximating integrals, which we believe is a useful tool often overlooked by statistics researchers. We then present a manually-adaptive extension of QMC for approximating marginal densities when the joint density is known up to a normalization constant. Randomization and a batch-wise approach involving (0,s)-sequences are the cornerstones of our method. By incorporating a variety of graphical diagnostics the method allows the user to adaptively allocate points for joint density function evaluations. Through intelligent allocation of resources to different regions of the marginal space, the method can quickly produce reliable marginal density approximations in moderate dimensions. We demonstrate by examples that adaptive QMC can be a viable alternative to the Metropolis algorithm.
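As a minimal illustration of QMC integration, the following Python sketch builds a Halton sequence by hand and uses it to approximate an integral over the unit square; the paper works with randomized (0, s)-sequences and marginal densities, so this is only a readily coded stand-in.

    import numpy as np

    def van_der_corput(n, base):
        """First n terms of the van der Corput low-discrepancy sequence in a given base."""
        seq = np.zeros(n)
        for i in range(n):
            k, f, x = i + 1, 1.0, 0.0
            while k > 0:
                f /= base
                x += f * (k % base)
                k //= base
            seq[i] = x
        return seq

    def halton(n, dim, bases=(2, 3, 5, 7, 11, 13)):
        """n points of a Halton sequence in [0, 1)^dim (dim <= 6 with these bases)."""
        return np.column_stack([van_der_corput(n, bases[d]) for d in range(dim)])

    # QMC estimate of the integral of the bivariate standard normal density over [0, 1]^2.
    pts = halton(4096, 2)
    vals = np.exp(-0.5 * (pts ** 2).sum(axis=1)) / (2 * np.pi)
    print(vals.mean())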

Journal ArticleDOI
TL;DR: An efficient set of statistical methods for analysing the security of block ciphers under the black-box approach is described; the procedures can be fully automated, providing the designer or user of a block cipher with a useful set of tools for security analysis.
Abstract: A block cipher is one of the most common forms of algorithms used for data encryption. This paper describes an efficient set of statistical methods for analysing the security of these algorithms under the black-box approach. The procedures can be fully automated, which provides the designer or user of a block cipher with a useful set of tools for security analysis.
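As one example of the kind of black-box statistic such a suite might contain, here is a simple monobit frequency test in Python applied to a stream of ciphertext bits; it is an illustrative randomness test, not one of the paper's specific procedures.

    import math

    def monobit_test(bits):
        """Frequency (monobit) test: under the null of independent fair bits, the
        standardized difference between counts of ones and zeros is ~ N(0, 1)."""
        n = len(bits)
        s = sum(1 if b else -1 for b in bits)
        z = s / math.sqrt(n)
        p_value = math.erfc(abs(z) / math.sqrt(2))
        return z, p_value

    # Example: test a bit stream supplied by the user, e.g. the bits of ciphertext
    # blocks produced by encrypting a counter under a fixed key.
    z, p = monobit_test([1, 0, 1, 1, 0, 0, 1, 0] * 128)
    print(z, p)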

Journal ArticleDOI
TL;DR: ODA, a software system for discriminant analysis, is reviewed not from the perspective of a user of statistical software packages, but from that of a developer of techniques with particular application in the engineering sciences.
Abstract: ODA is a software system for discriminant analysis. I am reviewing this system not as one who is a user of statistical software packages, but more as a developer of techniques with particular application in the engineering sciences. Therefore, I will not be making a comparative assessment, but more an absolute assessment; that is, I am primarily seeking to answer the question: Is it good enough for me and for the types of problem I am interested in? These are problems in sensor signal processing (radar, infrared, sonar, chemical or biological sensors) where the data are often high dimensional (10 to hundreds of variables) and there are many samples (up to tens of thousands). This is rather a selfish approach and I am aware that the readership of Statistics and Computing comes from a wide variety of disciplines. Therefore, I have also set out to assess performance on some small data sets in addition to an automatic target recognition problem.

Journal ArticleDOI
TL;DR: Interval analysis, which uses interval elements throughout the computation and produces intervals as output with the guarantee that the true results are contained in them, can obtain results to any arbitrary accuracy.
Abstract: Conventional computations use real numbers as input and produce real numbers as results without any indication of the accuracy. Interval analysis, instead, uses interval elements throughout the computation and produces intervals as output with the guarantee that the true results are contained in them. One major use for interval analysis in statistics is to get results of high-dimensional multivariate probabilities. With the efforts to decrease the length of the intervals that contain the theoretically true answers, we can obtain results to any arbitrary accuracy, which is demonstrated by multivariate normal and multivariate t integrations. This is an advantage over the approximation methods that are currently in use. Since interval analysis is more computationally intensive than traditional computing, a MasPar parallel computer is used in this research to improve performance.
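The basic idea of interval arithmetic can be sketched with a small Python class; note that a genuine implementation would apply outward (directed) rounding at every operation to preserve the enclosure guarantee, which this illustration omits.

    from dataclasses import dataclass

    @dataclass
    class Interval:
        lo: float
        hi: float

        def __add__(self, other):
            return Interval(self.lo + other.lo, self.hi + other.hi)

        def __sub__(self, other):
            return Interval(self.lo - other.hi, self.hi - other.lo)

        def __mul__(self, other):
            products = [self.lo * other.lo, self.lo * other.hi,
                        self.hi * other.lo, self.hi * other.hi]
            return Interval(min(products), max(products))

    # The true value of (x + y) * x for x in [1, 2] and y in [0.1, 0.2] lies in the result.
    x, y = Interval(1.0, 2.0), Interval(0.1, 0.2)
    print((x + y) * x)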

Journal ArticleDOI
TL;DR: A new approach for fitting multivariate mixture models, based on dependent univariate GLMs, is presented as a multivariate generalization of the method for univariate mixtures of Hinde (1982); combined with Gaussian quadrature, it outperforms the alternative methods considered.
Abstract: This paper introduces a new approach, based on dependent univariate GLMs, for fitting multivariate mixture models. This approach is a multivariate generalization of the method for univariate mixtures presented by Hinde (1982). Its accuracy and efficiency are compared with direct maximization of the log-likelihood. Using a simulation study, we also compare the efficiency of Monte Carlo and Gaussian quadrature methods for approximating the mixture distribution. The new approach with Gaussian quadrature outperforms the alternative methods considered. The work is motivated by the multivariate mixture models which have been proposed for modelling changes of employment states at an individual level. Similar formulations are of interest for modelling movement between other social and economic states and multivariate mixture models also occur in biostatistics and epidemiology.
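To illustrate the Gaussian quadrature ingredient, the following Python sketch approximates an integral against a standard normal mixing distribution by Gauss-Hermite quadrature; the Poisson likelihood and all names are illustrative assumptions rather than the paper's model.

    import numpy as np
    from scipy.stats import poisson

    def gauss_hermite_normal_integral(log_integrand, n_points=20):
        """Approximate the integral of exp(log_integrand(u)) * N(u; 0, 1) du.

        numpy's hermgauss nodes target the weight exp(-t**2), so substitute
        u = sqrt(2) * t and divide by sqrt(pi).
        """
        t, w = np.polynomial.hermite.hermgauss(n_points)
        u = np.sqrt(2.0) * t
        return np.sum(w * np.exp(log_integrand(u))) / np.sqrt(np.pi)

    # Illustrative mixture term: a Poisson(exp(0.5 + u)) likelihood at y = 2,
    # integrated over a standard normal random effect u.
    value = gauss_hermite_normal_integral(lambda u: poisson.logpmf(2, np.exp(0.5 + u)))
    print(value)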

Journal ArticleDOI
TL;DR: The role of an authoring tool in providing a graphical interface to a strategy for solving simple statistical problems in the context of teaching is discussed and specific examples, including the use of dynamic graphical displays in exploring data and in communicating the meaning of a model are proposed.
Abstract: Software which allows interactive exploration of graphical displays is widely available. In addition there now exist sophisticated 'authoring tools' which allow more general textual and graphical material to be presented in computer-based form. The role of an authoring tool in providing a graphical interface to a strategy for solving simple statistical problems in the context of teaching is discussed. This interface allows a variety of resources to be integrated. Specific examples, including the use of dynamic graphical displays in exploring data and in communicating the meaning of a model, are proposed. These ideas are illustrated by a problem involving the identification of the sex of a herring gull.