
Showing papers in "Statistics and Computing in 2008"


Journal ArticleDOI
TL;DR: This work reviews criteria and the useful framework of stochastic approximation, which allows one to systematically optimise generally used criteria, and proposes a series of novel adaptive algorithms which prove to be robust and reliable in practice.
Abstract: We review adaptive Markov chain Monte Carlo (MCMC) algorithms as a means to optimise their performance. Using simple toy examples we review their theoretical underpinnings, and in particular show why adaptive MCMC algorithms might fail when some fundamental properties are not satisfied. This leads to guidelines concerning the design of correct algorithms. We then review criteria and the useful framework of stochastic approximation, which allows one to systematically optimise generally used criteria, but also to analyse the properties of adaptive MCMC algorithms. We then propose a series of novel adaptive algorithms which prove to be robust and reliable in practice. These algorithms are applied to artificial and high-dimensional scenarios, but also to the classic mine disaster dataset inference problem.

957 citations
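
As a rough illustration of the stochastic-approximation framework mentioned above, the following sketch (ours, not one of the paper's algorithms) tunes the scale of a random-walk Metropolis proposal by a Robbins-Monro recursion towards a target acceptance rate, one of the "generally used criteria" the abstract refers to; the decaying step size gives the diminishing adaptation needed for correctness.

```python
import numpy as np

def adaptive_rwm(log_target, x0, n_iter=20000, target_acc=0.44, seed=0):
    """Random-walk Metropolis whose proposal scale is tuned by a
    Robbins-Monro (stochastic approximation) recursion towards a
    target acceptance rate.  Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    x, log_px = x0, log_target(x0)
    log_sigma = 0.0                          # log proposal standard deviation
    samples = np.empty(n_iter)
    for t in range(n_iter):
        prop = x + np.exp(log_sigma) * rng.standard_normal()
        log_pprop = log_target(prop)
        acc = np.exp(min(0.0, log_pprop - log_px))   # acceptance probability
        if rng.random() < acc:
            x, log_px = prop, log_pprop
        # stochastic approximation step: gamma_t -> 0 gives diminishing adaptation
        gamma_t = 1.0 / (t + 1) ** 0.6
        log_sigma += gamma_t * (acc - target_acc)
        samples[t] = x
    return samples, np.exp(log_sigma)

# toy example: standard normal target
samples, sigma = adaptive_rwm(lambda x: -0.5 * x * x, x0=0.0)
```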


Journal ArticleDOI
TL;DR: The degree distribution and the clustering coefficient associated with this model are given, together with a variational method to estimate its parameters and a model selection criterion to choose the number of classes; this estimation procedure allows us to deal with large networks containing thousands of vertices.
Abstract: The Erdos-Renyi model of a network is simple and possesses many explicit expressions for average and asymptotic properties, but it does not fit well to real-world networks. The vertices of those networks are often structured in unknown classes (functionally related proteins or social communities) with different connectivity properties. The stochastic block structures model was proposed for this purpose in the context of social sciences, using a Bayesian approach. We consider the same model in a frequentist statistical framework. We give the degree distribution and the clustering coefficient associated with this model, a variational method to estimate its parameters and a model selection criterion to select the number of classes. This estimation procedure allows us to deal with large networks containing thousands of vertices. The method is used to uncover the modular structure of a network of enzymatic reactions.

498 citations
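
A minimal illustration of the model itself (not of the variational estimation the paper develops): simulating an undirected network from a stochastic block model with latent classes. Function names and parameter values below are illustrative.

```python
import numpy as np

def simulate_sbm(n, alpha, pi, seed=0):
    """Simulate an undirected graph from a stochastic block model.
    alpha: class proportions (length Q); pi: Q x Q connection probabilities."""
    rng = np.random.default_rng(seed)
    z = rng.choice(len(alpha), size=n, p=alpha)        # latent class labels
    adj = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            adj[i, j] = adj[j, i] = rng.random() < pi[z[i], z[j]]
    return z, adj

# two communities: dense within, sparse between
z, adj = simulate_sbm(200, alpha=[0.6, 0.4],
                      pi=np.array([[0.10, 0.01], [0.01, 0.10]]))
mean_degree = adj.sum(axis=1).mean()
```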


Journal ArticleDOI
TL;DR: This approach extends the practical applicability of DE-MC with a snooker updater and is shown to be about 5–26 times more efficient than the optimal Normal random walk Metropolis sampler for the 97.5% point of a variable from a 25–50 dimensional Student t3 distribution.
Abstract: Differential Evolution Markov Chain (DE-MC) is an adaptive MCMC algorithm in which multiple chains are run in parallel. Standard DE-MC requires at least N=2d chains to be run in parallel, where d is the dimensionality of the posterior. This paper extends DE-MC with a snooker updater and shows by simulation and real examples that DE-MC can work for d up to 50-100 with fewer parallel chains (e.g. N=3) by exploiting information from their past, generating jumps from differences of pairs of past states. This approach extends the practical applicability of DE-MC and is shown to be about 5-26 times more efficient than the optimal Normal random walk Metropolis sampler for the 97.5% point of a variable from a 25-50 dimensional Student t3 distribution. In a nonlinear mixed effects model example the approach outperformed a block updater geared to the specific features of the model.

451 citations
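
For orientation, the basic DE-MC proposal is sketched below: each chain jumps along the difference of two other chains' current states. This is a plain-vanilla illustration only; the snooker updater and the sampling from past states introduced in the paper are refinements not shown here.

```python
import numpy as np

def demc_step(chains, log_target, gamma=None, eps=1e-4, rng=None):
    """One sweep of basic Differential Evolution MCMC over N parallel chains.
    Proposal for chain i: x_i + gamma * (x_r1 - x_r2) + small noise."""
    rng = rng or np.random.default_rng()
    N, d = chains.shape
    gamma = gamma or 2.38 / np.sqrt(2 * d)       # standard DE-MC scaling
    for i in range(N):
        r1, r2 = rng.choice([j for j in range(N) if j != i], size=2, replace=False)
        prop = chains[i] + gamma * (chains[r1] - chains[r2]) \
               + eps * rng.standard_normal(d)
        if np.log(rng.random()) < log_target(prop) - log_target(chains[i]):
            chains[i] = prop
    return chains

# toy run: 5 chains on a 10-dimensional standard normal
rng = np.random.default_rng(1)
chains = rng.standard_normal((5, 10))
for _ in range(1000):
    chains = demc_step(chains, lambda x: -0.5 * x @ x, rng=rng)
```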


Journal ArticleDOI
TL;DR: A class of eight parsimonious Gaussian mixture models which are based on the mixtures of factor analyzers model is introduced, and the maximum likelihood estimates for the parameters in these models are found using an AECM algorithm.
Abstract: Parsimonious Gaussian mixture models are developed using a latent Gaussian model which is closely related to the factor analysis model. These models provide a unified modeling framework which includes the mixtures of probabilistic principal component analyzers and mixtures of factor analyzers models as special cases. In particular, a class of eight parsimonious Gaussian mixture models which are based on the mixtures of factor analyzers model is introduced, and the maximum likelihood estimates for the parameters in these models are found using an AECM algorithm. The class of models includes parsimonious models that have not previously been developed. These models are applied to the analysis of chemical and physical properties of Italian wines and the chemical properties of coffee; the models are shown to give excellent clustering performance.

337 citations


Journal ArticleDOI
TL;DR: An adaptive algorithm that iteratively updates both the weights and component parameters of a mixture importance sampling density so as to optimise the performance of importance sampling, as measured by an entropy criterion is proposed.
Abstract: In this paper, we propose an adaptive algorithm that iteratively updates both the weights and component parameters of a mixture importance sampling density so as to optimise the performance of importance sampling, as measured by an entropy criterion. The method, called M-PMC, is shown to be applicable to a wide class of importance sampling densities, which includes in particular mixtures of multivariate Student t distributions. The performance of the proposed scheme is studied on both artificial and real examples, highlighting in particular the benefit of a novel Rao-Blackwellisation device which can be easily incorporated in the updating scheme.

302 citations
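
The following sketch shows the flavour of adaptive mixture importance sampling: only the mixture weights are updated, via Rao-Blackwellised responsibilities weighted by the importance weights, for a fixed bank of Gaussian components. The entropy criterion, the component-parameter updates and the Student t mixtures of M-PMC are not reproduced; all names and settings below are illustrative.

```python
import numpy as np
from scipy import stats

def adapt_mixture_weights(log_target, means, cov, alphas, n=5000, iters=10, seed=0):
    """Adaptive importance sampling with a fixed bank of Gaussian components:
    only the mixture weights alphas are updated, using Rao-Blackwellised
    responsibilities weighted by the importance weights."""
    rng = np.random.default_rng(seed)
    K, d = means.shape
    for _ in range(iters):
        comp = rng.choice(K, size=n, p=alphas)
        x = means[comp] + rng.multivariate_normal(np.zeros(d), cov, size=n)
        # mixture density and per-component densities
        dens = np.array([stats.multivariate_normal.pdf(x, m, cov) for m in means])  # K x n
        q = alphas @ dens
        w = np.exp(log_target(x)) / q           # importance weights
        resp = (alphas[:, None] * dens) / q     # Rao-Blackwellised P(component | x)
        alphas = (resp * w).sum(axis=1)
        alphas /= alphas.sum()
    return alphas

# toy target: equal mixture of N(-3,1) and N(3,1) in one dimension
log_target = lambda x: np.log(0.5 * stats.norm.pdf(x[:, 0], -3) + 0.5 * stats.norm.pdf(x[:, 0], 3))
alphas = adapt_mixture_weights(log_target, means=np.array([[-3.], [0.], [3.]]),
                               cov=np.eye(1), alphas=np.ones(3) / 3)
```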


Journal ArticleDOI
TL;DR: This paper explores how to make Bayesian inference for the kinetic rate constants of regulatory networks, using the stochastic kinetic Lotka-Volterra system as a model.
Abstract: The ability to infer parameters of gene regulatory networks is emerging as a key problem in systems biology. The biochemical data are intrinsically stochastic and tend to be observed by means of discrete-time sampling systems, which are often limited in their completeness. In this paper we explore how to make Bayesian inference for the kinetic rate constants of regulatory networks, using the stochastic kinetic Lotka-Volterra system as a model. This simple model describes behaviour typical of many biochemical networks which exhibit auto-regulatory behaviour. Various MCMC algorithms are described and their performance evaluated in several data-poor scenarios. An algorithm based on an approximating process is shown to be particularly efficient.

296 citations
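
The stochastic kinetic Lotka-Volterra system mentioned above can be simulated exactly with Gillespie's algorithm, which is how the kind of discretely observed, data-poor datasets considered in the paper are typically generated; the MCMC inference algorithms themselves are not shown, and the rate constants below are illustrative.

```python
import numpy as np

def gillespie_lv(x0, y0, c, t_max, seed=0):
    """Exact (Gillespie) simulation of the stochastic Lotka-Volterra model.
    Reactions: prey birth X->2X (rate c1*X), predation X+Y->2Y (rate c2*X*Y),
    predator death Y->0 (rate c3*Y)."""
    rng = np.random.default_rng(seed)
    t, x, y = 0.0, x0, y0
    times, states = [t], [(x, y)]
    while t < t_max:
        h = np.array([c[0] * x, c[1] * x * y, c[2] * y])   # reaction hazards
        h0 = h.sum()
        if h0 == 0:                                        # extinction: no reaction possible
            break
        t += rng.exponential(1.0 / h0)
        r = rng.choice(3, p=h / h0)
        if r == 0:
            x += 1
        elif r == 1:
            x -= 1; y += 1
        else:
            y -= 1
        times.append(t); states.append((x, y))
    return np.array(times), np.array(states)

times, states = gillespie_lv(x0=100, y0=100, c=(0.5, 0.0025, 0.3), t_max=40.0)
```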


Journal ArticleDOI
TL;DR: Three alternatives to MCMC methods are reviewed, including importance sampling, the forward-backward algorithm, and sequential Monte Carlo (SMC), which are demonstrated on a range of examples, including estimating the transition density of a diffusion and of a discrete-state continuous-time Markov chain; inferring structure in population genetics; and segmenting genetic divergence data.
Abstract: We consider analysis of complex stochastic models based upon partial information. MCMC and reversible jump MCMC are often the methods of choice for such problems, but in some situations they can be difficult to implement and can suffer from problems such as poor mixing and the difficulty of diagnosing convergence. Here we review three alternatives to MCMC methods: importance sampling, the forward-backward algorithm, and sequential Monte Carlo (SMC). We discuss how to design good proposal densities for importance sampling, show some of the range of models for which the forward-backward algorithm can be applied, and show how resampling ideas from SMC can be used to improve the efficiency of the other two methods. We demonstrate these methods on a range of examples, including estimating the transition density of a diffusion and of a discrete-state continuous-time Markov chain; inferring structure in population genetics; and segmenting genetic divergence data.

131 citations


Journal ArticleDOI
TL;DR: The Fourier inversion method is compared to a series evaluation method and the two methods are found to be complementary in that they perform well in different regions of the parameter space.
Abstract: The Tweedie family of distributions is a family of exponential dispersion models with power variance functions $V(\mu)=\mu^p$ for $p\notin(0,1)$. These distributions do not generally have density functions that can be written in closed form. However, they have simple moment generating functions, so the densities can be evaluated numerically by Fourier inversion of the characteristic functions. This paper develops numerical methods to make this inversion fast and accurate. Acceleration techniques are used to handle oscillating integrands. A range of analytic results are used to ensure convergent computations and to reduce the complexity of the parameter space. The Fourier inversion method is compared to a series evaluation method and the two methods are found to be complementary in that they perform well in different regions of the parameter space.

119 citations
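
A generic illustration of density evaluation by Fourier inversion of a characteristic function, checked against the standard normal; the Tweedie-specific characteristic function, the acceleration techniques and the analytic devices developed in the paper are not reproduced here.

```python
import numpy as np
from scipy.integrate import quad

def density_by_inversion(cf, y):
    """Evaluate a real-valued density at y by Fourier inversion of its
    characteristic function cf(t):
    f(y) = (1/pi) * integral_0^inf Re{ exp(-i*t*y) * cf(t) } dt."""
    integrand = lambda t: np.real(np.exp(-1j * t * y) * cf(t))
    val, _ = quad(integrand, 0.0, np.inf, limit=200)
    return val / np.pi

# sanity check with the standard normal characteristic function exp(-t^2/2)
cf_normal = lambda t: np.exp(-0.5 * t * t)
approx = density_by_inversion(cf_normal, 1.0)       # ~ 0.24197 = phi(1)
```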


Journal ArticleDOI
TL;DR: Two Monte Carlo studies are conducted to show that model identification improves when the model parameters are jointly estimated and the joint estimation takes into account all dependence structure of the parameters’ posterior distributions.
Abstract: Copula functions and marginal distributions are combined to produce multivariate distributions. We show advantages of estimating all parameters of these models using the Bayesian approach, which can be done with standard Markov chain Monte Carlo algorithms. Deviance-based model selection criteria are also discussed when applied to copula models since they are invariant under monotone increasing transformations of the marginals. We focus on the deviance information criterion. The joint estimation takes into account all dependence structure of the parameters' posterior distributions in our chosen model selection criteria. Two Monte Carlo studies are conducted to show that model identification improves when the model parameters are jointly estimated. We study the Bayesian estimation of all unknown quantities at once considering bivariate copula functions and three known marginal distributions.

98 citations


Journal ArticleDOI
TL;DR: A new method for modelling functional data with ‘spatially’ indexed data is presented, which takes advantage of both GPFR and mixture models and therefore improves the accuracy of predictions.
Abstract: Shi, Wang, Murray-Smith and Titterington (Biometrics 63:714-723, 2007) proposed a Gaussian process functional regression (GPFR) model to model functional response curves with a set of functional covariates. Two main problems are addressed by their method: modelling nonlinear and nonparametric regression relationship and modelling covariance structure and mean structure simultaneously. The method gives very good results for curve fitting and prediction but side-steps the problem of heterogeneity. In this paper we present a new method for modelling functional data with 'spatially' indexed data, i.e., the heterogeneity is dependent on factors such as region and individual patient's information. For data collected from different sources, we assume that the data corresponding to each curve (or batch) follows a Gaussian process functional regression model as a lower-level model, and introduce an allocation model for the latent indicator variables as a higher-level model. This higher-level model is dependent on the information related to each batch. This method takes advantage of both GPFR and mixture models and therefore improves the accuracy of predictions. The mixture model has also been used for curve clustering, but focusing on the problem of clustering functional relationships between response curve and covariates, i.e. the clustering is based on the surface shape of the functional response against the set of functional covariates. The model is examined on simulated data and real data.

80 citations


Journal ArticleDOI
TL;DR: This work builds a sequence of artificial distributions whose support concentrates itself on the set of maximum likelihood estimates using a sequential Monte Carlo approach and demonstrates state-of-the-art performance for several applications of the proposed approach.
Abstract: Standard methods for maximum likelihood parameter estimation in latent variable models rely on the Expectation-Maximization algorithm and its Monte Carlo variants. Our approach is different and motivated by similar considerations to simulated annealing; that is we build a sequence of artificial distributions whose support concentrates itself on the set of maximum likelihood estimates. We sample from these distributions using a sequential Monte Carlo approach. We demonstrate state-of-the-art performance for several applications of the proposed approach.

Journal ArticleDOI
TL;DR: An important result is the significantly higher ability of local polynomial regression with orthogonal fit to accurately approximate the target regression, even though it may hardly be visible when calculating error criteria against corrupted data.
Abstract: Short-term forecasting of wind generation requires a model of the function for the conversion of meteorological variables (mainly wind speed) to power production. Such a power curve is nonlinear and bounded, in addition to being nonstationary. Local linear regression is an appealing nonparametric approach for power curve estimation, for which the model coefficients can be tracked with recursive Least Squares (LS) methods. This may lead to an inaccurate estimate of the true power curve, owing to the assumption that a noise component is present on the response variable axis only. Therefore, this assumption is relaxed here, by describing a local linear regression with orthogonal fit. Local linear coefficients are defined as those which minimize a weighted Total Least Squares (TLS) criterion. An adaptive estimation method is introduced in order to accommodate nonstationarity. This has the additional benefit of lowering the computational costs of updating local coefficients every time new observations become available. The estimation method is based on tracking the left-most eigenvector of the augmented covariance matrix. A robustification of the estimation method is also proposed. Simulations on semi-artificial datasets (for which the true power curve is available) underline the properties of the proposed regression and related estimation methods. An important result is the significantly higher ability of local polynomial regression with orthogonal fit to accurately approximate the target regression, even though this improvement may hardly be visible when error criteria are calculated against corrupted data.
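
A batch (non-recursive, unweighted) version of the orthogonal fit is easy to sketch: the total least squares line is obtained from the eigenvector of the covariance matrix of (x, y) with the smallest eigenvalue, which is normal to the fitted line. The locally weighted, recursively tracked estimator of the paper is more involved; this is only a minimal illustration with made-up data.

```python
import numpy as np

def tls_line_fit(x, y):
    """Total least squares (orthogonal) fit of y ~ a + b*x.
    The eigenvector of the 2x2 covariance matrix of (x, y) with the smallest
    eigenvalue is normal to the fitted line."""
    xc, yc = x - x.mean(), y - y.mean()
    cov = np.cov(np.vstack([xc, yc]))
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    n = eigvecs[:, 0]                           # normal vector of the fitted line
    b = -n[0] / n[1]                            # slope
    a = y.mean() - b * x.mean()                 # intercept
    return a, b

# noisy example with errors on both axes
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200) + 0.3 * rng.standard_normal(200)
y = 1.0 + 2.0 * np.linspace(0, 10, 200) + 0.3 * rng.standard_normal(200)
a, b = tls_line_fit(x, y)
```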

Journal ArticleDOI
TL;DR: It is shown that the maximum likelihood approach on the raw data using the G0 law is the best technique for boundary detection in speckled imagery.
Abstract: We compare the accuracy of five approaches for contour detection in speckled imagery. Some of these methods take advantage of the statistical properties of speckled data, and all of them employ active contours using B-spline curves. Images obtained with coherent illumination are affected by a noise called speckle, which is inherent to the imaging process. These data have been statistically modeled by a multiplicative model using the G0 distribution, under which regions with different degrees of roughness can be characterized by the value of a parameter. We use this information to find boundaries between regions with different textures. We propose and compare five strategies for boundary detection: three based on the data (maximum discontinuity on raw data, fractal dimension and maximum likelihood) and two based on estimates of the roughness parameter (maximum discontinuity and anisotropic smoothed roughness estimates). In order to compare these strategies, a Monte Carlo experiment was performed to assess the accuracy of fitting a curve to a region. The probability of finding the correct edge with less than a specified error is estimated and used to compare the techniques. The two best procedures are then compared in terms of their computational cost and, finally, we show that the maximum likelihood approach on the raw data using the G0 law is the best technique.

Journal ArticleDOI
TL;DR: Two novel algorithms based on Metropolis–Hastings-within-Gibbs sampling using mixtures of triangular and trapezoidal densities are presented as improved versions of the all-purpose adaptive rejection Metropolis sampling (ARMS) algorithm to sample from non-logconcave univariate densities.
Abstract: Different strategies have been proposed to improve mixing and convergence properties of Markov Chain Monte Carlo algorithms. These are mainly concerned with customizing the proposal density in the Metropolis-Hastings algorithm to the specific target density and require a detailed exploratory analysis of the stationary distribution and/or some preliminary experiments to determine an efficient proposal. Various Metropolis-Hastings algorithms have been suggested that make use of previously sampled states in defining an adaptive proposal density. Here we propose a general class of adaptive Metropolis-Hastings algorithms based on Metropolis-Hastings-within-Gibbs sampling. For the case of a one-dimensional target distribution, we present two novel algorithms using mixtures of triangular and trapezoidal densities. These can also be seen as improved versions of the all-purpose adaptive rejection Metropolis sampling (ARMS) algorithm to sample from non-logconcave univariate densities. Using various different examples, we demonstrate their properties and efficiencies and point out their advantages over ARMS and other adaptive alternatives such as the Normal Kernel Coupler.

Journal ArticleDOI
TL;DR: It is somewhat surprising that in scenarios with low information the fitting of a linear model, even with stepwise variable selection, has little advantage over the fitting of an additive model when the true underlying structure is linear.
Abstract: There are several procedures for fitting generalized additive models, i.e. regression models for an exponential family response where the influence of each single covariate is assumed to have unknown, potentially non-linear shape. Simulated data are used to compare a smoothing parameter optimization approach for selection of smoothness and of covariates, a stepwise approach, a mixed model approach, and a procedure based on boosting techniques. In particular it is investigated how the performance of procedures is linked to the amount of information, type of response, total number of covariates, number of influential covariates, and extent of non-linearity. Measures for comparison are prediction performance, identification of influential covariates, and smoothness of fitted functions. One result is that the mixed model approach returns sparse fits with frequently over-smoothed functions, while the functions are less smooth for the boosting approach and variable selection is less strict. The other approaches are in between with respect to these measures. The boosting procedure is seen to perform very well when little information is available and/or when a large number of covariates is to be investigated. It is somewhat surprising that in scenarios with low information the fitting of a linear model, even with stepwise variable selection, has little advantage over the fitting of an additive model when the true underlying structure is linear. In cases with more information the prediction performance of all procedures is very similar. So, in difficult data situations the boosting approach can be recommended, in others the procedures can be chosen conditional on the aim of the analysis.

Journal ArticleDOI
TL;DR: Algorithms for computing the Tukey depth of a point in various dimensions are considered, making them suited to situations, such as outlier removal, where the value of the output is typically small.
Abstract: The Tukey depth (Proceedings of the International Congress of Mathematicians, vol. 2, pp. 523---531, 1975) of a point p with respect to a finite set S of points is the minimum number of elements of S contained in any closed halfspace that contains p. Algorithms for computing the Tukey depth of a point in various dimensions are considered. The running times of these algorithms depend on the value of the output, making them suited to situations, such as outlier removal, where the value of the output is typically small.
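
The definition lends itself to a brute-force check in two dimensions: scan closed halfplanes whose boundary passes through p and take the minimum count of points they contain. The sketch below approximates the depth by scanning a dense grid of directions; the output-sensitive algorithms studied in the paper are far more efficient.

```python
import numpy as np

def tukey_depth_2d(p, points, n_dirs=3600):
    """Approximate Tukey depth of p w.r.t. a 2-d point set: the minimum,
    over closed halfplanes whose boundary passes through p, of the number
    of points contained in the halfplane.  Brute force over a direction grid."""
    diffs = points - p
    angles = np.linspace(0.0, np.pi, n_dirs, endpoint=False)
    normals = np.column_stack([np.cos(angles), np.sin(angles)])
    proj = diffs @ normals.T                     # n_points x n_dirs
    counts_pos = (proj >= 0).sum(axis=0)         # halfplane on one side
    counts_neg = (proj <= 0).sum(axis=0)         # and on the other side
    return int(min(counts_pos.min(), counts_neg.min()))

rng = np.random.default_rng(0)
pts = rng.standard_normal((500, 2))
deep = tukey_depth_2d(np.array([0.0, 0.0]), pts)     # large depth near the centre
shallow = tukey_depth_2d(np.array([4.0, 4.0]), pts)  # small depth for an outlying point
```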

Journal ArticleDOI
TL;DR: Bayesian forecasting methods are employed in a Value-at-Risk study of the international return series and generally favour the proposed smooth transition model and highlight explosive and smooth nonlinear behaviour in financial markets.
Abstract: Inference, quantile forecasting and model comparison for an asymmetric double smooth transition heteroskedastic model are investigated. A Bayesian framework is employed and an adaptive Markov chain Monte Carlo scheme is designed. A mixture prior is proposed that alleviates the usual identifiability problem as the speed of transition parameter tends to zero, and an informative prior for this parameter is suggested that allows for reliable inference and a proper posterior, despite the non-integrability of the likelihood function. A formal Bayesian posterior model comparison procedure is employed to compare the proposed model with its two limiting cases: the double threshold GARCH and symmetric ARX GARCH models. The proposed methods are illustrated using both simulated and international stock market return series. Some illustrations of the advantages of an adaptive sampling scheme for these models are also provided. Finally, Bayesian forecasting methods are employed in a Value-at-Risk study of the international return series. The results generally favour the proposed smooth transition model and highlight explosive and smooth nonlinear behaviour in financial markets.

Journal ArticleDOI
TL;DR: Two new adaptive MCMC algorithms based on the Independent Metropolis–Hastings algorithm are proposed, one of which provides a general technique for deriving natural adaptive formulae and the other addresses a realistic problem arising in Comparative Genomics.
Abstract: Markov chain Monte Carlo (MCMC) is an important computational technique for generating samples from non-standard probability distributions. A major challenge in the design of practical MCMC samplers is to achieve efficient convergence and mixing properties. One way to accelerate convergence and mixing is to adapt the proposal distribution in light of previously sampled points, thus increasing the probability of acceptance. In this paper, we propose two new adaptive MCMC algorithms based on the Independent Metropolis-Hastings algorithm. In the first, we adjust the proposal to minimize an estimate of the cross-entropy between the target and proposal distributions, using the experience of pre-runs. This approach provides a general technique for deriving natural adaptive formulae. The second approach uses multiple parallel chains, and involves updating chains individually, then updating a proposal density by fitting a Bayesian model to the population. An important feature of this approach is that adapting the proposal does not change the limiting distributions of the chains. Consequently, the adaptive phase of the sampler can be continued indefinitely. We include results of numerical experiments indicating that the new algorithms compete well with traditional Metropolis-Hastings algorithms. We also demonstrate the method for a realistic problem arising in Comparative Genomics.

Journal ArticleDOI
TL;DR: A non-centered parameterization of the standard random-effects model, based on the Cholesky decomposition of the variance-covariance matrix, is considered; it allows the authors to learn from the data, for each effect, whether it is random or not, and whether covariances among random effects are zero.
Abstract: We consider a non-centered parameterization of the standard random-effects model, which is based on the Cholesky decomposition of the variance-covariance matrix. The regression type structure of the non-centered parameterization allows us to use Bayesian variable selection methods for covariance selection. We search for a parsimonious variance-covariance matrix by identifying the non-zero elements of the Cholesky factors. With this method we are able to learn from the data for each effect whether it is random or not, and whether covariances among random effects are zero. An application in marketing shows a substantial reduction of the number of free elements in the variance-covariance matrix.

Journal ArticleDOI
TL;DR: A novel and fast conditional maximization algorithm with quadratic and monotone convergence, consisting of a sequence of CM log-likelihood steps, is proposed; it outperforms EM and ECME substantially in all situations, whether assessed by CPU time or by the number of iterations.
Abstract: To obtain maximum likelihood (ML) estimation in factor analysis (FA), we propose in this paper a novel and fast conditional maximization (CM) algorithm, which has quadratic and monotone convergence, consisting of a sequence of CM log-likelihood (CML) steps. The main contribution of this algorithm is that the closed form expression for the parameter to be updated in each step can be obtained explicitly, without resorting to any numerical optimization methods. In addition, a new ECME algorithm similar to that of Liu (Biometrika 81, 633-648, 1994) is obtained as a by-product, which turns out to be very close to the simple iteration algorithm proposed by Lawley (Proc. R. Soc. Edinb. 60, 64-82, 1940), but our algorithm is guaranteed to increase the log-likelihood at every iteration and hence to converge. Both algorithms inherit the simplicity and stability of EM but their convergence behaviors are much different, as revealed in our extensive simulations: (1) in most situations, ECME and EM perform similarly; (2) CM outperforms EM and ECME substantially in all situations, whether assessed by CPU time or by the number of iterations. Especially for cases close to the well-known Heywood case, it accelerates EM by factors of around 100 or more. Also, CM is much more insensitive to the choice of starting values than EM and ECME.

Journal ArticleDOI
TL;DR: A model for image segmentation based on a finite mixture of Gaussian distributions for prior probabilities of class memberships is proposed, where association between labels of adjacent pixels is modeled by a class-specific term allowing for different interaction strengths across classes.
Abstract: In this paper, we propose a model for image segmentation based on a finite mixture of Gaussian distributions. For each pixel of the image, prior probabilities of class memberships are specified through a Gibbs distribution, where association between labels of adjacent pixels is modeled by a class-specific term allowing for different interaction strengths across classes. We show how model parameters can be estimated in a maximum likelihood framework using Mean Field theory. Experimental performance on perturbed phantom and on real benchmark images shows that the proposed method performs well in a wide variety of empirical situations.

Journal ArticleDOI
TL;DR: The generalized von Mises distribution provides a flexible model for circular data allowing for symmetry, asymmetry, unimodality and bimodality, and the equivalence between the trigonometric method of moments and the maximum likelihood estimators is shown.
Abstract: This article deals with some important computational aspects of the generalized von Mises distribution in relation with parameter estimation, model selection and simulation. The generalized von Mises distribution provides a flexible model for circular data allowing for symmetry, asymmetry, unimodality and bimodality. For this model, we show the equivalence between the trigonometric method of moments and the maximum likelihood estimators, we give their asymptotic distribution, we provide bias-corrected estimators of the entropy, the Akaike information criterion and the measured entropy for model selection, and we implement the ratio-of-uniforms method of simulation.
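
The ratio-of-uniforms method mentioned above is a generic rejection device; a minimal sketch for an unnormalised density on a bounded interval is given below. The generalised von Mises kernel used in the example, exp(k1*cos(t - m1) + k2*cos(2*(t - m2))), is our assumption of the model's form, and the parameter values are purely illustrative.

```python
import numpy as np

def ratio_of_uniforms(logf, support, n, seed=0):
    """Generic ratio-of-uniforms sampler for an unnormalised density f on a
    bounded interval: draw (u, v) uniformly on a box enclosing the region
    {(u, v): 0 < u <= sqrt(f(v/u))} and accept x = v/u when the inequality holds."""
    lo, hi = support
    rng = np.random.default_rng(seed)
    grid = np.linspace(lo, hi, 4096)
    a = np.sqrt(np.exp(logf(grid)).max())            # u-bound: sqrt of (approximate) sup of f
    out = []
    while len(out) < n:
        u = a * (1.0 - rng.random(4 * n))            # uniform on (0, a]
        v = rng.uniform(lo * a, hi * a, size=4 * n)  # crude v-bounds for support [lo, hi]
        x = v / u
        ok = (x >= lo) & (x <= hi) & (np.log(u) <= 0.5 * logf(np.clip(x, lo, hi)))
        out.extend(x[ok])
    return np.array(out[:n])

# assumed generalised von Mises kernel with illustrative parameters
logf = lambda t: 1.5 * np.cos(t - 1.0) + 0.8 * np.cos(2.0 * (t - 2.5))
theta = ratio_of_uniforms(logf, support=(0.0, 2 * np.pi), n=10000)
```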

Journal ArticleDOI
TL;DR: It is shown that the joint posterior distributions are finite mixtures of conditionally independent gamma distributions for which their full form can be easily deduced by a recursively updating scheme.
Abstract: Bivariate count data arise in several different disciplines (epidemiology, marketing, sports statistics, just to name a few) and the bivariate Poisson distribution, being a generalization of the Poisson distribution, plays an important role in modelling such data. In the present paper we present a Bayesian estimation approach for the parameters of the bivariate Poisson model and provide the posterior distributions in closed forms. It is shown that the joint posterior distributions are finite mixtures of conditionally independent gamma distributions for which their full form can be easily deduced by a recursively updating scheme. Thus, the need to apply computationally demanding MCMC schemes for Bayesian inference in such models is removed, since direct sampling from the posterior becomes available, even in cases where the posterior distribution of functions of the parameters is not available in closed form. In addition, we define a class of prior distributions that possess an interesting conjugacy property which extends the typical notion of conjugacy, in the sense that both prior and posterior belong to the same family of finite mixture models but with a different number of components. Extensions to certain other models, including multivariate models or models with other marginal distributions, are discussed.
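
The bivariate Poisson model referred to above is usually built by trivariate reduction; a quick simulation sketch (ours, assuming that construction) shows that the covariance of the two counts equals the mean of the shared component.

```python
import numpy as np

def rbivariate_poisson(n, lam1, lam2, lam0, seed=0):
    """Trivariate reduction: X = Z1 + Z0, Y = Z2 + Z0 with independent
    Poisson(lam1), Poisson(lam2), Poisson(lam0); Cov(X, Y) = lam0."""
    rng = np.random.default_rng(seed)
    z0 = rng.poisson(lam0, n)
    x = rng.poisson(lam1, n) + z0
    y = rng.poisson(lam2, n) + z0
    return x, y

x, y = rbivariate_poisson(100000, lam1=2.0, lam2=1.5, lam0=0.7)
print(np.cov(x, y)[0, 1])     # close to lam0 = 0.7
```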

Journal ArticleDOI
TL;DR: The simplicity and flexibility of SiZer Map for the authors' purposes are highlighted by the empirical study performed with several real datasets, and the conclusions derived from SiZer analysis are compared with the global results derived from standard tests.
Abstract: SiZer Map is proposed as a graphical tool for assistance in nonparametric additive regression testing problems. Four problems have been analyzed by using SiZer Map: testing for additivity, testing the components' significance, testing parametric models for the components and testing for interactions. The simplicity and flexibility of SiZer Map for our purposes are highlighted by the empirical study performed with several real datasets. With these data, we compare the conclusions derived from SiZer analysis with the global results derived from standard tests previously proposed in the literature.

Journal ArticleDOI
TL;DR: The AEMC algorithm is presented, which combines a tree-based predictive model with an evolutionary Monte Carlo sampling procedure for the purpose of global optimization and is the first adaptive MCMC algorithm that simulates multiple Markov chains in parallel.
Abstract: In this paper, we present an adaptive evolutionary Monte Carlo algorithm (AEMC), which combines a tree-based predictive model with an evolutionary Monte Carlo sampling procedure for the purpose of global optimization. Our development is motivated by sensor placement applications in engineering, which require optimizing a complicated "black-box" objective function. The proposed method is able to enhance the optimization efficiency and effectiveness as compared to a few alternative strategies. AEMC falls into the category of adaptive Markov chain Monte Carlo (MCMC) algorithms and is the first adaptive MCMC algorithm that simulates multiple Markov chains in parallel. A theorem about the ergodicity property of the AEMC algorithm is stated and proven. We demonstrate the advantages of the proposed method by applying it to a sensor placement problem in a manufacturing process, as well as to a standard Griewank test function.
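
The standard Griewank test function used as a benchmark in the paper is reproduced below for reference; the AEMC algorithm itself is not sketched. Its many shallow, regularly spaced local minima are what make it a stress test for global optimisers.

```python
import numpy as np

def griewank(x):
    """Griewank test function: global minimum 0 at the origin, with many
    shallow local minima produced by the cosine product term."""
    x = np.asarray(x, dtype=float)
    i = np.arange(1, x.size + 1)
    return 1.0 + np.sum(x ** 2) / 4000.0 - np.prod(np.cos(x / np.sqrt(i)))

print(griewank(np.zeros(10)))        # 0.0
print(griewank(np.full(10, 5.0)))    # > 0
```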

Journal ArticleDOI
TL;DR: A variable screening step is proposed for a frequentist model averaging procedure that leads to more applicable models without eliminating models that are more strongly supported by the data.
Abstract: In many studies a large number of variables is measured and the identification of relevant variables influencing an outcome is an important task. For variable selection several procedures are available. However, focusing on one model only neglects that there usually exist other equally appropriate models. Bayesian or frequentist model averaging approaches have been proposed to improve the development of a predictor. With a larger number of variables (say more than ten variables) the resulting class of models can be very large. For Bayesian model averaging Occam's window is a popular approach to reduce the model space. As this approach may not eliminate any variables, a variable screening step was proposed for a frequentist model averaging procedure. Based on the results of selected models in bootstrap samples, variables are eliminated before deriving a model averaging predictor. As a simple alternative screening procedure, backward elimination can be used. Through two examples and by means of simulation we investigate some properties of the screening step. In the simulation study we consider situations with 15 and 25 variables, respectively, of which seven have an influence on the outcome. With the screening step most of the uninfluential variables will be eliminated, but also some variables with a weak effect. Variable screening leads to more applicable models without eliminating models that are more strongly supported by the data. Furthermore, we give recommendations for important parameters of the screening step.

Journal ArticleDOI
TL;DR: The resulting semi-parametric transformation function is shown empirically, via a Monte Carlo study, to perform at least as well as any parametric transformation currently available in the literature.
Abstract: A non-parametric transformation function is introduced to transform data to any continuous distribution. When transformation of data to normality is desired, the use of a suitable parametric pre-transformation function improves the performance of the proposed non-parametric transformation function. The resulting semi-parametric transformation function is shown empirically, via a Monte Carlo study, to perform at least as well as any parametric transformation currently available in the literature.

Journal ArticleDOI
TL;DR: Simulations show that for small and medium sample sizes, parametric bootstrap tests appear to work well for determining whether data arise from a normal mixture with equal variances or anormal mixture with unequal variances.
Abstract: It is generally assumed that the likelihood ratio statistic for testing the null hypothesis that data arise from a homoscedastic normal mixture distribution versus the alternative hypothesis that data arise from a heteroscedastic normal mixture distribution has an asymptotic χ2 reference distribution with degrees of freedom equal to the difference in the number of parameters being estimated under the alternative and null models under some regularity conditions. Simulations show that the χ2 reference distribution will give a reasonable approximation for the likelihood ratio test only when the sample size is 2000 or more and the mixture components are well separated when the restrictions suggested by Hathaway (Ann. Stat. 13:795-800, 1985) are imposed on the component variances to ensure that the likelihood is bounded under the alternative distribution. For small and medium sample sizes, parametric bootstrap tests appear to work well for determining whether data arise from a normal mixture with equal variances or a normal mixture with unequal variances.
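
A minimal parametric bootstrap version of such a test can be sketched with scikit-learn, using the "tied" and "full" covariance structures as stand-ins for the homoscedastic and heteroscedastic fits of a univariate two-component mixture. The Hathaway-type variance restrictions discussed in the paper are not imposed, and all settings below are illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def lrt_stat(x):
    """2 * (loglik heteroscedastic - loglik homoscedastic) for a 2-component
    univariate normal mixture; 'tied' shares one variance across components."""
    x = x.reshape(-1, 1)
    null = GaussianMixture(n_components=2, covariance_type="tied", n_init=5).fit(x)
    alt = GaussianMixture(n_components=2, covariance_type="full", n_init=5).fit(x)
    return 2 * len(x) * (alt.score(x) - null.score(x)), null

def bootstrap_pvalue(x, n_boot=199, seed=0):
    """Parametric bootstrap: simulate from the fitted homoscedastic mixture
    and compare the observed statistic with its bootstrap distribution."""
    obs, null = lrt_stat(x)
    boot = []
    for _ in range(n_boot):
        xb, _ = null.sample(len(x))
        boot.append(lrt_stat(xb.ravel())[0])
    return (1 + sum(b >= obs for b in boot)) / (n_boot + 1)

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 150), rng.normal(4, 1, 150)])
p = bootstrap_pvalue(x, n_boot=49)     # small n_boot keeps the example quick
```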

Journal ArticleDOI
TL;DR: A simple stochastic model is proposed, based on which support and confidence are reasonable estimates for certain probabilities of the model, and two new precision-type measures are described that rely on statistical properties of the corresponding estimators.
Abstract: This article utilizes stochastic ideas for reasoning about association rule mining, and provides a formal statistical view of this discipline. A simple stochastic model is proposed, based on which support and confidence are reasonable estimates for certain probabilities of the model. Statistical properties of the corresponding estimators, like moments and confidence intervals, are derived, and items and itemsets are examined for correlations. After a brief review of measures of interest of association rules, with the main focus on interestingness measures motivated by statistical principles, two new measures are described. These measures, called ?- and ?-precision, respectively, rely on statistical properties of the estimators discussed before. Experimental results demonstrate the effectiveness of both measures.
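
Support and confidence as relative-frequency estimates of P(A and B) and P(B | A), together with a simple normal-approximation confidence interval for the support probability, can be computed as below; the paper's own precision measures are not reproduced, and the transactions shown are made up.

```python
import numpy as np

def support_confidence(transactions, antecedent, consequent):
    """Relative-frequency estimates: support = P(A and B), confidence = P(B | A)."""
    n = len(transactions)
    a = sum(antecedent <= t for t in transactions)                   # transactions containing A
    ab = sum((antecedent | consequent) <= t for t in transactions)   # containing both A and B
    support = ab / n
    confidence = ab / a if a else float("nan")
    # normal-approximation (Wald) confidence interval for the support probability
    se = np.sqrt(support * (1 - support) / n)
    return support, confidence, (support - 1.96 * se, support + 1.96 * se)

transactions = [{"bread", "butter"}, {"bread", "milk"}, {"bread", "butter", "milk"},
                {"milk"}, {"bread", "butter"}]
print(support_confidence(transactions, {"bread"}, {"butter"}))
```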

Journal ArticleDOI
TL;DR: The main conclusion is that arithmetic spacing of the values of the characteristic function, coupled with appropriately limiting the range for these values, improves the overall performance of the regression-type method of Koutrouvelis, which is the standard procedure for estimating general stable law parameters.
Abstract: Fitting general stable laws to data by maximum likelihood is important but difficult. This is why much research has considered alternative procedures based on empirical characteristic functions. Two problems then are how many values of the characteristic function to select, and how to position them. We provide recommendations for both of these topics. We propose an arithmetic spacing of transform variables, coupled with a recommendation for the location of the variables. It is shown that arithmetic spacing, which is far simpler to implement, closely approximates optimum spacing. The new methods that result are compared in simulation studies with existing methods, including maximum-likelihood. The main conclusion is that arithmetic spacing of the values of the characteristic function, coupled with appropriately limiting the range for these values, improves the overall performance of the regression-type method of Koutrouvelis, which is the standard procedure for estimating general stable law parameters.
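
The raw ingredient of such regression-type estimators is the empirical characteristic function evaluated at arithmetically spaced transform variables; the sketch below computes it for simulated Cauchy (stable, alpha = 1) data and forms the quantity whose approximate linearity in log t the Koutrouvelis regression exploits. The regression step itself and the recommended range limits are not reproduced, and the spacing chosen here is purely illustrative.

```python
import numpy as np

def empirical_cf(x, t):
    """Empirical characteristic function phi_hat(t) = mean(exp(i*t*x))."""
    return np.exp(1j * np.outer(t, x)).mean(axis=1)

rng = np.random.default_rng(0)
x = rng.standard_cauchy(5000)            # a stable law with alpha = 1
t = 0.1 * np.arange(1, 21)               # arithmetically spaced transform variables
phi = empirical_cf(x, t)
# for symmetric stable data, log(-log|phi(t)|) is roughly linear in log(t),
# which is what the regression-type estimator exploits
yreg = np.log(-np.log(np.abs(phi)))
```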