
Showing papers in "Communications in Statistics - Simulation and Computation in 2009"


Journal ArticleDOI
Peter C. Austin
TL;DR: The utility and interpretation of the standardized difference for comparing the prevalence of dichotomous variables between two groups are explored, and a standardized difference of 10% is equivalent to having a phi coefficient of 0.05 for the correlation between treatment group and the binary variable.
Abstract: Researchers are increasingly using the standardized difference to compare the distribution of baseline covariates between treatment groups in observational studies. Standardized differences were initially developed in the context of comparing the mean of continuous variables between two groups. However, in medical research, many baseline covariates are dichotomous. In this article, we explore the utility and interpretation of the standardized difference for comparing the prevalence of dichotomous variables between two groups. We examined the relationship between the standardized difference and each of the following: the maximal difference in the prevalence of the binary variable between the two groups, the relative risk relating the prevalence of the binary variable in one group to the prevalence in the other group, and the phi coefficient measuring the correlation between the treatment group and the binary variable. We found that a standardized difference of 10% (or 0.1) is equivalent to having a phi coefficient of 0.05 for the correlation between treatment group and the binary variable.
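
For reference (a standard formula in this literature, not quoted from the article), the standardized difference for a dichotomous covariate with observed prevalences p̂₁ and p̂₂ in the two groups is

$$ d = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\bigl[\hat{p}_1(1-\hat{p}_1) + \hat{p}_2(1-\hat{p}_2)\bigr]/2}}, $$

and it is on this scale that the article reports d = 0.1 as corresponding to a phi coefficient of roughly 0.05.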

1,532 citations


Journal ArticleDOI
TL;DR: This article reviewed and proposed some estimators based on Kibria (2003) and Khalaf and Shukur (2005); some of them performed well compared to the ordinary least squares (OLS) estimator and some existing popular estimators.
Abstract: In ridge regression analysis, the estimation of the ridge parameter k is an important problem. Many methods are available for estimating such a parameter. This article reviewed and proposed some estimators based on Kibria (2003) and Khalaf and Shukur (2005). A simulation study was conducted, and the mean squared error (MSE) criterion was used to compare the performances of the estimators. We observed that under certain conditions some of the proposed estimators performed well compared to the ordinary least squares (OLS) estimator and some existing popular estimators. Finally, a numerical example has been considered to illustrate the performance of the estimators.
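
The article's specific estimators are not reproduced here. As a minimal sketch of the general setup only, the Python code below computes the ridge estimator (X'X + kI)^{-1}X'y together with the classical Hoerl–Kennard–Baldwin choice k = p·σ̂²/β̂'β̂, which many estimators of this type modify; the data and parameters are purely illustrative.

```python
import numpy as np

def ridge_fit(X, y, k):
    """Ridge estimator (X'X + k I)^{-1} X'y for a given ridge parameter k."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

def hkb_k(X, y):
    """Hoerl-Kennard-Baldwin ridge parameter k = p * sigma^2 / (beta' beta),
    computed from the OLS fit (illustrative; not the article's proposals)."""
    n, p = X.shape
    beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta_ols
    sigma2 = resid @ resid / (n - p)
    return p * sigma2 / (beta_ols @ beta_ols)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
X[:, 3] = X[:, 0] + 0.05 * rng.normal(size=100)   # induce collinearity
y = X @ np.array([1.0, 0.5, -0.5, 1.0]) + rng.normal(size=100)
k = hkb_k(X, y)
print("k =", k, "ridge beta =", ridge_fit(X, y, k))
```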

249 citations


Journal ArticleDOI
TL;DR: A simulation study compares the performance of four major hierarchical methods for clustering functional data and yields concrete suggestions to help future researchers determine the best method for clustering their functional data.
Abstract: Functional data analysis (FDA)—the analysis of data that can be considered a set of observed continuous functions—is an increasingly common class of statistical analysis. One of the most widely used FDA methods is the cluster analysis of functional data; however, little work has been done to compare the performance of clustering methods on functional data. In this article, a simulation study compares the performance of four major hierarchical methods for clustering functional data. The simulated data varied in three ways: the nature of the signal functions (periodic, non-periodic, or mixed), the amount of noise added to the signal functions, and the pattern of the true cluster sizes. The Rand index was used to compare the performance of each clustering method. As a secondary goal, clustering methods were also compared when the number of clusters was misspecified. To illustrate the results, a real set of functional data was clustered where the true clustering structure is believed to be known. Compari...
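
The article's simulation design is not reproduced here; the sketch below only illustrates the basic workflow under simple, assumed settings: noisy sinusoidal curves from two groups are clustered with several hierarchical linkage methods and agreement with the true labels is scored with the (unadjusted) Rand index.

```python
import numpy as np
from itertools import combinations
from scipy.cluster.hierarchy import linkage, fcluster

def rand_index(a, b):
    """Plain (unadjusted) Rand index between two labelings."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 50)
# Two groups of noisy curves with different signal functions (toy example).
curves = np.vstack([np.sin(2 * np.pi * t) + 0.3 * rng.normal(size=(20, 50)),
                    np.cos(2 * np.pi * t) + 0.3 * rng.normal(size=(20, 50))])
truth = np.repeat([0, 1], 20)

for method in ["single", "complete", "average", "ward"]:
    labels = fcluster(linkage(curves, method=method), t=2, criterion="maxclust")
    print(method, round(rand_index(truth, labels), 3))
```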

210 citations


Journal ArticleDOI
TL;DR: In this article, acceptance sampling plans are developed for the Birnbaum–Saunders distribution percentiles when the life test is truncated at a pre-specified time to ensure the specified life percentile is obtained under a given customer's risk.
Abstract: Time to failure due to fatigue is one of the common quality characteristics in material engineering applications. In this article, acceptance sampling plans are developed for the Birnbaum–Saunders distribution percentiles when the life test is truncated at a pre-specified time. The minimum sample size necessary to ensure the specified life percentile is obtained under a given customer's risk. The operating characteristic values (and curves) of the sampling plans as well as the producer's risk are presented. The R package named spbsq is developed to implement the developed sampling plans. Two examples with real data sets are also given as illustration.
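
The article's tables and percentile-based plans are not reproduced here; the sketch below only illustrates the generic logic of a truncated-life-test plan under stated assumptions: the probability p that an item fails before the truncation time is computed from scipy's Birnbaum–Saunders (fatiguelife) distribution with assumed shape and scale, and the smallest sample size n is found such that the probability of acceptance with at most c failures does not exceed the consumer's risk β.

```python
from scipy import stats

def min_sample_size(p_fail, c, beta):
    """Smallest n with P(at most c failures among n) <= beta, so the
    consumer's risk is controlled when the true failure probability is p_fail."""
    n = c + 1
    while stats.binom.cdf(c, n, p_fail) > beta:
        n += 1
    return n

# Assumed (illustrative) Birnbaum-Saunders parameters and truncation time.
shape, scale, t0 = 0.5, 1.0, 1.0
p_fail = stats.fatiguelife.cdf(t0, shape, loc=0.0, scale=scale)
print("failure prob:", round(p_fail, 3),
      "min n:", min_sample_size(p_fail, c=2, beta=0.10))
```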

109 citations


Journal ArticleDOI
TL;DR: An algorithm is obtained to compute all the coherent systems with n components and their signatures; using it, the authors show that there exist 180 coherent systems with 5 components and compute their signatures.
Abstract: The signatures of coherent systems are useful tools to compute the system reliability functions, the system expected lifetimes and to compare different systems using stochastic orderings. It is well known that there exist 2, 5, and 20 different coherent systems with 2, 3, and 4 components, respectively. The signatures for these systems were given in Shaked and Suarez-Llorens (2003). In this article, we obtain an algorithm to compute all the coherent systems with n components and their signatures. Using this algorithm we show that there exist 180 coherent systems with 5 components and we compute their signatures.
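
The enumeration algorithm of the article is not reproduced here. As a brief illustration of what a signature is, the sketch below computes the signature of a given coherent system by brute force over all orderings of component failures: s_i is the proportion of orderings in which the system fails exactly at the i-th component failure (the example structure function is an assumption chosen for illustration).

```python
from itertools import permutations
from fractions import Fraction

def signature(n, works):
    """Signature of a coherent system with n components.
    `works(up)` returns True when the system functions given the set `up`
    of working components; s[i] = P(system fails at the (i+1)-th failure)."""
    counts = [0] * n
    for order in permutations(range(1, n + 1)):        # order of component failures
        up = set(range(1, n + 1))
        for i, comp in enumerate(order):
            up.discard(comp)
            if not works(up):                           # system dies at failure i+1
                counts[i] += 1
                break
    return [Fraction(c, len(counts) and sum(counts)) for c in counts]

# Example: component 1 in series with the parallel pair {2, 3}.
print(signature(3, lambda up: 1 in up and (2 in up or 3 in up)))
# -> [Fraction(1, 3), Fraction(2, 3), Fraction(0, 1)], i.e. signature (1/3, 2/3, 0)
```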

87 citations


Journal ArticleDOI
TL;DR: A series of different copula models providing various residual dependence structures are considered for vectors of count response variables whose marginal distributions depend on covariates through negative binomial regressions.
Abstract: Multivariate count data occur in several different disciplines. However, existing models do not offer great flexibility for dependence modeling. Models based on copulas nowadays are widely used for continuous data dependence modeling. Modeling count data via copulas is still in its infancy; see the recent article of Genest and Neslehova (2007). A series of different copula models providing various residual dependence structures are considered for vectors of count response variables whose marginal distributions depend on covariates through negative binomial regressions. A real data application related to the number of purchases of different products is provided.
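
The copula families and regression structure studied in the article are not reproduced here. As a minimal sketch of the general idea, the code below simulates a bivariate count vector by pushing Gaussian-copula uniforms through negative binomial quantile functions; the Gaussian copula and the fixed (non-regression) margins are assumptions made purely for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, rho = 5000, 0.6

# Gaussian copula: correlated normals -> uniforms.
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)
u = stats.norm.cdf(z)

# Negative binomial margins (size and success probability chosen for illustration).
y1 = stats.nbinom.ppf(u[:, 0], 3, 0.4)
y2 = stats.nbinom.ppf(u[:, 1], 5, 0.6)

print("sample means:", y1.mean(), y2.mean())
print("Spearman correlation:", stats.spearmanr(y1, y2)[0])
```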

77 citations


Journal ArticleDOI
TL;DR: A spectral domain method for handling time series of unequal length is proposed to make the spectral estimates comparable by producing statistics at the same frequency.
Abstract: In statistical data analysis it is often important to compare, classify, and cluster different time series. For these purposes various methods have been proposed in the literature, but they usually assume time series with the same sample size. In this article, we propose a spectral domain method for handling time series of unequal length. The method makes the spectral estimates comparable by producing statistics at the same frequencies. The procedure is compared with other methods proposed in the literature by a Monte Carlo simulation study. As an illustrative example, the proposed spectral method is applied to cluster industrial production series of some developed countries.
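
The article's exact construction is not reproduced here; the sketch below shows one simple way (an assumption chosen for illustration, not the proposed procedure) to make spectra of unequal-length series comparable: estimate each series' smoothed periodogram and interpolate it onto a common grid of frequencies before computing a distance.

```python
import numpy as np
from scipy.signal import welch

def spectrum_on_grid(x, grid, fs=1.0):
    """Smoothed periodogram of x interpolated onto a common frequency grid."""
    f, pxx = welch(x, fs=fs, nperseg=min(64, len(x)))
    return np.interp(grid, f, pxx)

rng = np.random.default_rng(3)

def ar1(n, phi):
    """Simulate a simple AR(1) series of length n."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    return x

x1, x2 = ar1(300, 0.8), ar1(500, 0.8)                # unequal lengths
grid = np.linspace(0.01, 0.5, 50)                    # common frequencies
s1, s2 = spectrum_on_grid(x1, grid), spectrum_on_grid(x2, grid)
# A simple log-spectral distance between the two series.
print("distance:", np.sqrt(np.mean((np.log(s1) - np.log(s2)) ** 2)))
```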

59 citations


Journal ArticleDOI
TL;DR: Variations of a method derived by Zou performed relatively well in simulations for the dependent case; for the independent case, the Wilcox–Muska method performed best.
Abstract: Methods for computing a confidence interval for the difference between two Pearson correlations are compared when dealing with nonnormality and heteroscedasticity. Variations of a method derived by Zou performed relatively well in simulations for the dependent case. For the independent case, the Wilcox–Muska method performed best.
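
For the independent case, Zou's (2007) approach combines the Fisher-z confidence limits of the two correlations; the sketch below is a minimal implementation of that idea only (the dependent-case variations studied in the article and the Wilcox–Muska method are not shown).

```python
import numpy as np
from scipy import stats

def fisher_ci(r, n, alpha=0.05):
    """Fisher-z confidence interval for a single Pearson correlation."""
    z = np.arctanh(r)
    half = stats.norm.ppf(1 - alpha / 2) / np.sqrt(n - 3)
    return np.tanh(z - half), np.tanh(z + half)

def zou_diff_ci(r1, n1, r2, n2, alpha=0.05):
    """Zou-style CI for rho1 - rho2 with two independent samples (illustrative)."""
    l1, u1 = fisher_ci(r1, n1, alpha)
    l2, u2 = fisher_ci(r2, n2, alpha)
    d = r1 - r2
    lower = d - np.sqrt((r1 - l1) ** 2 + (u2 - r2) ** 2)
    upper = d + np.sqrt((u1 - r1) ** 2 + (r2 - l2) ** 2)
    return lower, upper

print(zou_diff_ci(0.5, 60, 0.2, 80))
```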

58 citations


Journal ArticleDOI
TL;DR: A variables sampling plan based on L_e is proposed to handle processes requiring low process loss and provides a feasible policy, which can be applied to products requiring low process loss where classical sampling plans cannot be applied.
Abstract: For the implementation of an acceptance sampling plan, a problem quality practitioners have to deal with is the determination of the critical acceptance values and inspection sample sizes that provide the desired levels of protection to both vendors and buyers. Traditionally, most acceptance sampling plans focus on the percentage of defective products rather than on the process loss, and therefore do not distinguish among products that fall within the specification limits. However, the quality of products that fall within the specification limits may still differ considerably, so it is necessary to design acceptance sampling plans that take process loss into consideration. In this article, a variables sampling plan based on L_e is proposed to handle processes requiring low process loss. The required sample sizes n and the critical acceptance value c for various combinations of acceptance quality levels are tabulated. The proposed sampling plan provides a feasible policy that can be applied to products requiring low process loss, where classical sampling plans cannot be applied.
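
For background (a commonly used definition of the process expected loss index; the article's notation may differ), with target value T and half specification width d the index is

$$ L_e = \frac{E\bigl[(X-T)^2\bigr]}{d^2} = \frac{(\mu - T)^2 + \sigma^2}{d^2}, $$

so smaller values indicate lower expected loss, and a plan of this type accepts a lot when the estimated index does not exceed the critical acceptance value c.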

48 citations


Journal ArticleDOI
TL;DR: The Archimedean two-parameter BB7 copula is adopted to describe the underlying dependence structure between two consecutive returns, while the log-Dagum distribution is employed to model the margins marked by skewness and kurtosis.
Abstract: In financial analysis it is useful to study the dependence between two or more time series as well as the temporal dependence in a univariate time series. This article is concerned with the statistical modeling of the dependence structure in a univariate financial time series using the concept of copula. We treat the series of financial returns as a first order Markov process. The Archimedean two-parameter BB7 copula is adopted to describe the underlying dependence structure between two consecutive returns, while the log-Dagum distribution is employed to model the margins marked by skewness and kurtosis. A simulation study is carried out to evaluate the performance of the maximum likelihood estimates. Furthermore, we apply the model to the daily returns of four stocks and, finally, we illustrate how its fitting to data can be improved when the dependence between consecutive returns is described through a copula function.
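
For reference (a standard form from the copula literature, not quoted from the article, whose parameterization may differ), the BB7 (Joe–Clayton) copula with parameters θ ≥ 1 and δ > 0 can be written as

$$ C(u,v) = 1 - \Bigl\{ 1 - \bigl[ \bigl(1-(1-u)^{\theta}\bigr)^{-\delta} + \bigl(1-(1-v)^{\theta}\bigr)^{-\delta} - 1 \bigr]^{-1/\delta} \Bigr\}^{1/\theta}, $$

where θ governs upper-tail dependence and δ governs lower-tail dependence, which is why it is attractive for asymmetric dependence between consecutive returns.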

37 citations


Journal ArticleDOI
TL;DR: The comparison shows that the DS chart performs as well as the VP chart and is more effective than the Shewhart chart in detecting small process mean shifts.
Abstract: In this study, the performance of the double sampling (DS) chart under non-normality is presented and compared with the Shewhart chart and the VP chart. The comparison shows that the DS chart performs as well as the VP chart and is more effective than the Shewhart chart in detecting small process mean shifts.

Journal ArticleDOI
TL;DR: The main contribution of this study lies in showing that the Kenward–Roger method corrects the liberal Type I error rates obtained with the Between–Within and Satterthwaite approaches, especially with positive pairings between group sizes and covariance matrices.
Abstract: This research examines the Type I error rates obtained when using the mixed model with the Kenward–Roger correction and compares them with the Between–Within and Satterthwaite approaches in split-plot designs. A simulation study was conducted to generate repeated measures data with small samples under normal distribution conditions. The data were obtained via three covariance matrices (unstructured, heterogeneous first-order auto-regressive, and random coefficients), the one with the best fit being selected according to the Akaike criterion. The results of the simulation study showed the Kenward–Roger test to be more robust, particularly when the population covariance matrices were unstructured or heterogeneous first-order auto-regressive. The main contribution of this study lies in showing that the Kenward–Roger method corrects the liberal Type I error rates obtained with the Between–Within and Satterthwaite approaches, especially with positive pairings between group sizes and covariance matrices.

Journal ArticleDOI
TL;DR: Pitman closeness of sample order statistics to population quantiles of a location-scale family of distributions is discussed and explicit expressions are derived for some specific families such as uniform, exponential, and power function.
Abstract: In this article, Pitman closeness of sample order statistics to population quantiles of a location-scale family of distributions is discussed. Explicit expressions are derived for some specific families such as uniform, exponential, and power function. Numerical results are then presented for these families for sample sizes n = 10, 15, and for the choices of p = 0.10, 0.25, 0.75, 0.90. The Pitman-closest order statistic is also determined in these cases and presented.
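
For context (the standard definition of the criterion, not quoted from the article), the order statistic X_(i) is Pitman-closer to the population quantile ξ_p than X_(j) if

$$ \Pr\bigl( \lvert X_{(i)} - \xi_p \rvert < \lvert X_{(j)} - \xi_p \rvert \bigr) \ge \tfrac{1}{2}, $$

and the Pitman-closest order statistic is the one that is Pitman-closer than every other order statistic in the sample.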

Journal ArticleDOI
TL;DR: This study considers using an estimate of the variance of sample median and applying the bootstrap methods to determine the control limits of the median control chart.
Abstract: In this study, we propose a median control chart. In order to determine the control limits, we consider using an estimate of the variance of the sample median. We also consider applying bootstrap methods. We then illustrate the proposed median control chart with an example and compare the bootstrap methods by a simulation study. Finally, we discuss some peculiar features of the median control chart as concluding remarks.
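
The article's construction of the limits is not reproduced here; the sketch below shows one simple percentile-bootstrap way (an illustrative assumption, not the proposed chart) to set control limits for subgroup medians from an in-control reference sample.

```python
import numpy as np

def bootstrap_median_limits(reference, n_sub, alpha=0.0027, B=20000, seed=0):
    """Percentile-bootstrap control limits for the median of subgroups of size n_sub,
    resampled from an in-control reference sample (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    meds = np.median(rng.choice(reference, size=(B, n_sub), replace=True), axis=1)
    return np.quantile(meds, [alpha / 2, 1 - alpha / 2])

rng = np.random.default_rng(4)
reference = rng.normal(10.0, 2.0, size=200)           # in-control data
lcl, ucl = bootstrap_median_limits(reference, n_sub=5)
new_subgroup = rng.normal(11.5, 2.0, size=5)          # possibly shifted process
print("limits:", lcl, ucl, "signal:", not lcl <= np.median(new_subgroup) <= ucl)
```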

Journal ArticleDOI
TL;DR: Power and size of some tests for exogeneity of a binary explanatory variable in count models are investigated by conducting extensive Monte Carlo simulations; the results indicate that the tests that are simpler to estimate often outperform tests that are more demanding.
Abstract: This article investigates power and size of some tests for exogeneity of a binary explanatory variable in count models by conducting extensive Monte Carlo simulations. The tests under consideration are Hausman contrast tests as well as univariate Wald tests, including a new test of notably easy implementation. Performance of the tests is explored under misspecification of the underlying model and under different conditions regarding the instruments. The results indicate that often the tests that are simpler to estimate outperform tests that are more demanding. This is especially the case for the new test.

Journal ArticleDOI
TL;DR: Simulation results show that the proposed parametric bootstrap (PB) approach for testing equality of several inverse Gaussian means with unknown and arbitrary variances performs very satisfactorily regardless of the number of samples and sample sizes.
Abstract: The inverse Gaussian distribution provides a flexible model for analyzing positive, right-skewed data. The generalized variable test for equality of several inverse Gaussian means with unknown and arbitrary variances has satisfactory Type-I error rate when the number of samples (k) is small (Tian, 2006). However, the Type-I error rate tends to be inflated when k goes up. In this article, we propose a parametric bootstrap (PB) approach for this problem. Simulation results show that the proposed test performs very satisfactorily regardless of the number of samples and sample sizes. This method is illustrated by an example.

Journal ArticleDOI
TL;DR: A synthetic scaled weighted variance (synthetic SWV-X̄) control chart is proposed to monitor the process mean of skewed populations and to improve the detection of a negative shift in the mean.
Abstract: In this article, a synthetic scaled weighted variance (synthetic SWV-X̄) control chart is proposed to monitor the process mean of skewed populations. This control chart is an improvement over the synthetic weighted variance (synthetic WV-X̄) chart suggested by Khoo et al. (2008) in the detection of a negative shift in the mean. A comparison between the performances of the synthetic SWV-X̄ and synthetic WV-X̄ charts is made in terms of the average run length (ARL) values for various levels of skewness as well as different magnitudes of positive and negative shifts in the mean. A method to construct the synthetic SWV-X̄ chart is explained in detail. An illustrative example is also given to show the implementation of the synthetic SWV-X̄ chart.

Journal ArticleDOI
TL;DR: It is numerically shown that the biases of variance component estimates by PQL are systematically related to the biases of regression coefficient estimates by PQL, and that the biases increase as the random effects become more heterogeneous.
Abstract: The penalized quasi-likelihood (PQL) approach is the most common estimation procedure for the generalized linear mixed model (GLMM). However, it has been noticed that the PQL tends to underestimate variance components as well as regression coefficients in the previous literature. In this article, we numerically show that the biases of variance component estimates by PQL are systematically related to the biases of regression coefficient estimates by PQL, and also show that the biases of variance component estimates by PQL increase as random effects become more heterogeneous.

Journal ArticleDOI
TL;DR: Assessment of the performance of asymptotic confidence intervals for Zenga's new inequality measure shows that the coverage accuracy and the size of the confidence intervals for the two measures are very similar in samples from economic size distributions.
Abstract: This work aims at assessing, by simulation methods, the performance of asymptotic confidence intervals for Zenga's new inequality measure. The results are compared with those obtained on Gini's measure, perhaps the most widely used index for measuring inequality in income and wealth distributions. Our findings show that the coverage accuracy and the size of the confidence intervals for the two measures are very similar in samples from economic size distributions.

Journal ArticleDOI
TL;DR: The VP control scheme generally has better performance in detecting small mean shifts than the standard and the other adaptive charts, but when the observations are highly autocorrelated, the complexity of the VP chart has a negative effect on its performance.
Abstract: The variable parameters (VP) control chart varies all control parameters from the current sample information, and results in more effective monitoring based on statistical and economic criteria. The usual assumption for designing a control chart is that the observations from the process are independent. However, for many processes, such as chemical processes, consecutive measurements are often highly correlated. In the present article, the observations are modeled as an AR(1) process plus a random error, and the properties of the VP charts are evaluated and studied under this model. Based on the study, the VP control scheme generally has better performance in detecting small mean shifts than the standard and the other adaptive charts. However, when the observations are highly autocorrelated, the complexity of the VP chart has a negative effect on its performance.

Journal ArticleDOI
TL;DR: This article shows that the two alternative optimal thresholds obtained by using the true rate are identical, and that this single threshold coincides with the score corresponding to the Kolmogorov–Smirnov statistic used to test the homogeneity of the distribution functions of the defaults and non-defaults.
Abstract: Receiver Operating Characteristic (ROC) and Cumulative Accuracy Profile (CAP) curves are used to assess the discriminatory power of different credit-rating approaches. The thresholds of optimal classification accuracy on an ROC curve and of maximal profit on a CAP curve can be found by using iso-performance tangent lines, which are based on the standard notion of accuracy. In this article, we propose another accuracy measure called the true rate. Using this rate, one can obtain alternative optimal thresholds on both ROC and CAP curves. For most real populations of borrowers, the number of defaults is much smaller than the number of non-defaults, and in such cases using the true rate may be more efficient than using the accuracy rate in terms of cost functions. Moreover, it is shown that the two alternative optimal thresholds obtained using the true rate are identical, and that this single threshold coincides with the score corresponding to the Kolmogorov–Smirnov statistic used to test the homogeneity of the distribution functions of the defaults and non-defaults.
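
The cost-function analysis of the article is not reproduced here; the sketch below only illustrates the stated connection, using simulated scores: the score at which the Kolmogorov–Smirnov distance between the default and non-default score distributions is attained is the threshold that maximizes TPR − FPR.

```python
import numpy as np

def ks_threshold(scores_default, scores_nondefault):
    """Score maximizing the KS distance between the two empirical CDFs,
    equivalently maximizing TPR - FPR over all thresholds."""
    grid = np.unique(np.concatenate([scores_default, scores_nondefault]))
    # Low scores are treated as 'bad': TPR = share of defaults at or below the threshold.
    tpr = np.searchsorted(np.sort(scores_default), grid, side="right") / len(scores_default)
    fpr = np.searchsorted(np.sort(scores_nondefault), grid, side="right") / len(scores_nondefault)
    best = np.argmax(tpr - fpr)
    return grid[best], (tpr - fpr)[best]

rng = np.random.default_rng(5)
defaults = rng.normal(-1.0, 1.0, size=200)            # defaults receive lower scores
nondefaults = rng.normal(1.0, 1.0, size=2000)
print(ks_threshold(defaults, nondefaults))
```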

Journal ArticleDOI
TL;DR: This article proposes the application of the Chen (1995) t-test modification to the EL ratio test and shows that the Chen approach leads to a location change of the observed data, whereas the classical Bartlett method is known to be a scale correction of the data distribution.
Abstract: The empirical likelihood (EL) technique has been well addressed in both the theoretical and applied literature in the context of powerful nonparametric statistical methods for testing and interval estimation. A nonparametric version of Wilks' theorem (Wilks, 1938) can usually provide an asymptotic evaluation of the Type I error of EL ratio-type tests. In this article, we examine the performance of this asymptotic result when the EL is based on finite samples that are from various distributions. In the context of Type I error control, we show that the classical EL procedure and the Student's t-test have asymptotically a similar structure. Thus, we conclude that modifications of t-type tests can be adopted to improve the EL ratio test. We propose the application of the Chen (1995) t-test modification to the EL ratio test. We show that the Chen approach leads to a location change of the observed data, whereas the classical Bartlett method is known to be a scale correction of the data distribution. Finally,...

Journal ArticleDOI
TL;DR: In general, the simulation results show that the proposed chart performs better than the existing multivariate charts for skewed populations and the standard T² chart, in terms of false alarm rates as well as moderate and large mean shift detection rates for various degrees of skewness.
Abstract: This article proposes a multivariate synthetic control chart for skewed populations based on the weighted standard deviation method. The proposed chart incorporates the weighted standard deviation method into the standard multivariate synthetic control chart. The standard multivariate synthetic chart consists of the Hotelling's T² chart and the conforming run length chart. The weighted standard deviation method adjusts the variance–covariance matrix of the quality characteristics and approximates the probability density function using several multivariate normal distributions. The proposed chart reduces to the standard multivariate synthetic chart when the underlying distribution is symmetric. In general, the simulation results show that the proposed chart performs better than the existing multivariate charts for skewed populations and the standard T² chart, in terms of false alarm rates as well as moderate and large mean shift detection rates for various degrees of skewness.

Journal ArticleDOI
TL;DR: A nonlinear logistic discriminant model is introduced based on Gaussian basis functions constructed by the self-organizing map based on information-theoretic and Bayesian approaches for multi-class classification methods for analyzing data with complex structure.
Abstract: We consider the problem of constructing multi-class classification methods for analyzing data with complex structure. A nonlinear logistic discriminant model is introduced based on Gaussian basis functions constructed by the self-organizing map. In order to select adjusted parameters, we employ model selection criteria derived from information-theoretic and Bayesian approaches. Numerical examples are conducted to investigate the performance of the proposed multi-class discriminant procedure. Our modeling procedure is also applied to protein structure recognition in life science. The results indicate the effectiveness of our strategy in terms of prediction accuracy.

Journal ArticleDOI
TL;DR: This work uses a simulation study to investigate the performance of several sequential regression imputation methods when the error distribution is flat or heavy tailed, and suggests that all methods can perform poorly for the regression coefficient because they cannot accommodate extreme values well.
Abstract: Sequential regression multiple imputation has emerged as a popular approach for handling incomplete data with complex features. In this approach, imputations for each missing variable are produced based on a regression model using the other variables as predictors, in a cyclic manner. A normality assumption is frequently imposed on the error distributions in the conditional regression models for continuous variables, even though it rarely holds in real scenarios. We use a simulation study to investigate the performance of several sequential regression imputation methods when the error distribution is flat or heavy tailed. The methods evaluated include the sequential normal imputation and several of its extensions that adjust for non-normal error terms. The results show that all methods perform well for estimating the marginal mean and proportion, as well as the regression coefficient when the error distribution is flat or moderately heavy tailed. When the error distribution is strongly heavy tailed, all methods ...
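
The specific adjustments studied in the article are not reproduced here; the sketch below is a bare-bones sequential (chained) normal-error imputation loop under illustrative assumptions: each variable with missing values is regressed on the others in turn, and missing entries are replaced by predictions plus a normal draw.

```python
import numpy as np

def sequential_impute(X, n_cycles=10, seed=0):
    """Very small sketch of sequential regression imputation with normal errors.
    X is an (n, p) array with np.nan marking missing entries."""
    rng = np.random.default_rng(seed)
    X = X.copy()
    miss = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    X[miss] = np.take(col_means, np.where(miss)[1])   # initial fill with column means
    for _ in range(n_cycles):
        for j in range(X.shape[1]):
            if not miss[:, j].any():
                continue
            obs = ~miss[:, j]
            Z = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
            beta, *_ = np.linalg.lstsq(Z[obs], X[obs, j], rcond=None)
            resid = X[obs, j] - Z[obs] @ beta
            sigma = resid.std(ddof=Z.shape[1])
            X[miss[:, j], j] = Z[miss[:, j]] @ beta + rng.normal(0, sigma, miss[:, j].sum())
    return X

rng = np.random.default_rng(6)
data = rng.multivariate_normal([0, 0, 0], np.eye(3) + 0.5, size=300)
data[rng.random(data.shape) < 0.15] = np.nan          # 15% missing completely at random
print(np.isnan(sequential_impute(data)).sum())        # 0 -> all entries imputed
```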

Journal ArticleDOI
TL;DR: Characteristics of the data, such as the covariance structure, parameter values, and sample size, greatly impacted the performance of the various model selection criteria, and no single criterion was consistently better than the others.
Abstract: Predictive criteria, including the adjusted squared multiple correlation coefficient, the adjusted concordance correlation coefficient, and the predictive error sum of squares, are available for model selection in the linear mixed model. These criteria all involve some sort of comparison of observed values and predicted values, adjusted for the complexity of the model. The predicted values can be conditional on the random effects or marginal, i.e., based on averages over the random effects. These criteria have not been investigated for model selection success. We used simulations to investigate selection success rates for several versions of these predictive criteria as well as several versions of Akaike's information criterion and the Bayesian information criterion, and the pseudo F-test. The simulations involved the simple scenario of selection of a fixed parameter when the covariance structure is known. Several variance–covariance structures were used. For compound symmetry structures, higher success r...

Journal ArticleDOI
TL;DR: This article presents a parametric bootstrap (PB) approach and compares its performance to that of another simulation-based approach, namely, the generalized variable approach to testing equality of regression coefficients in several regression models.
Abstract: Testing equality of regression coefficients in several regression models is a common problem encountered in many applied fields. This article presents a parametric bootstrap (PB) approach and compares its performance to that of another simulation-based approach, namely, the generalized variable approach. Simulation studies indicate that the PB approach controls the Type I error rates satisfactorily regardless of the number of regression models and sample sizes, whereas the generalized variable approach tends to be very liberal as the number of regression models goes up. The proposed PB approach is illustrated using a data set from a stability study.

Journal ArticleDOI
TL;DR: The simulation results show that if the SDs are missing under the Missing Completely at Random and Missing at Random mechanisms, imputation is recommended; with non-random missingness, imputation can lead to overestimation of the SE of the estimate.
Abstract: A common problem in the meta-analysis of continuous data is that some studies do not report sufficient information to calculate the standard deviations (SDs) of the treatment effect. One approach to handling this problem is imputation. This article examines the empirical implications of imputing the missing SDs on the standard error (SE) of the overall meta-analysis estimate. The simulation results show that if the SDs are missing under the Missing Completely at Random and Missing at Random mechanisms, imputation is recommended. With non-random missingness, imputation can lead to overestimation of the SE of the estimate.

Journal ArticleDOI
TL;DR: An alternative test of discordancy in samples of univariate circular data is presented, based on the effect an outlier has on the sum of circular distances from the point of interest to all other points; it performs relatively better than other known tests.
Abstract: In this article, we present an alternative test of discordancy in samples of univariate circular data. The new technique is based on the effect the existence of an outlier has on the sum of circular distances from the point of interest to all other points. The percentage points are calculated and the performance is examined. We compare the performance of the test in detecting an outlier with that of other tests and show that the new approach performs relatively better than other known tests. As an illustration, a practical example is presented.
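
The percentage points of the article are not reproduced here; the sketch below only computes the underlying quantity as described: for each observation, sum the circular distances to all other points, and flag the observation with the largest sum as the candidate outlier (the von Mises sample and the planted outlier are illustrative assumptions).

```python
import numpy as np

def circular_distance_sums(theta):
    """Sum of circular distances (in radians) from each point to all others."""
    diff = np.abs(theta[:, None] - theta[None, :])
    d = np.pi - np.abs(np.pi - diff)                  # circular distance in [0, pi]
    return d.sum(axis=1)

rng = np.random.default_rng(7)
sample = rng.vonmises(mu=0.0, kappa=5.0, size=20)     # concentrated circular data
sample = np.append(sample, np.pi)                     # plant a point on the opposite side
sums = circular_distance_sums(sample)
print("candidate outlier index:", np.argmax(sums), "statistic:", sums.max())
```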

Journal ArticleDOI
Marco Bee
TL;DR: A defensive mixture is used and a method of choosing its parameters via the EM algorithm is developed; the technique that assumes the importance sampling density belongs to the same parametric family as the random variables to be summed is also considered.
Abstract: In this paper we use Importance Sampling to estimate tail probabilities for a finite sum of lognormal distributions. We use a defensive mixture, and develop a method of choosing the parameters via the EM algorithm; we also consider the technique which assumes the importance sampling density to belong to the same parametric family as the random variables to be summed. In both cases, the instrumental density is found by minimizing Cross-Entropy. A comparison based on several simulation experiments shows that the defensive mixture has the best performance. Finally, we study the Poisson-lognormal compound distribution framework and present a real-data application.
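
The EM-based parameter choice and the cross-entropy minimization are not reproduced here; the sketch below only illustrates the defensive-mixture idea under illustrative parameters: each summand is drawn from a mixture αp + (1−α)g, where p is the nominal lognormal density and g is a shifted, heavier proposal, and each draw is weighted by the ratio of nominal to mixture density.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
d, mu, sigma = 5, 0.0, 1.0                            # nominal: sum of 5 LN(0,1) terms
threshold, n = 50.0, 200_000
alpha, mu_g = 0.3, 2.0                                # defensive weight and shifted proposal

p = stats.lognorm(s=sigma, scale=np.exp(mu))          # nominal component density
g = stats.lognorm(s=sigma, scale=np.exp(mu_g))        # heavier proposal (illustrative)

# Sample each component from the mixture alpha*p + (1 - alpha)*g.
from_p = rng.random((n, d)) < alpha
x = np.where(from_p, p.rvs((n, d), random_state=rng), g.rvs((n, d), random_state=rng))

# Importance weights: product over components of p(x) / (alpha*p(x) + (1 - alpha)*g(x)).
w = np.prod(p.pdf(x) / (alpha * p.pdf(x) + (1 - alpha) * g.pdf(x)), axis=1)
est = np.mean(w * (x.sum(axis=1) > threshold))

crude = np.mean(p.rvs((n, d), random_state=rng).sum(axis=1) > threshold)
print("IS estimate:", est, "crude Monte Carlo:", crude)
```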