
Showing papers on "Goodness of fit published in 2019"


Book
02 Dec 2019
TL;DR: In this book, the authors present a series of tests for univariate and multivariate normality, including plots, probability plots and regression tests, tests with censored data, and robust estimation of location and scale.
Abstract:
1. Introduction
Part 1: Testing for Univariate Normality
2. Plots, Probability Plots and Regression Tests
3. Tests Using Moments
4. Other Tests for Univariate Normality
5. Goodness of Fit Tests
6. Tests for Outliers
7. Power Comparisons for Univariate Tests for Normality
8. Testing for Normality with Censored Data
Part 2: Testing for Multivariate Normality
9. Assessing Multivariate Normality
10. Testing for Multivariate Outliers
Part 3: Additional Topics
11. Testing for Normal Mixtures
12. Robust Estimation of Location and Scale
13. Computational Issues

768 citations


Journal ArticleDOI
TL;DR: The results showed that DWLS and ULS lead to smaller RMSEA and larger CFI and TLI values than does ML for all manipulated conditions, regardless of whether or not the indices are scaled.
Abstract: In structural equation modeling, application of the root mean square error of approximation (RMSEA), comparative fit index (CFI), and Tucker–Lewis index (TLI) highly relies on the conventional cutoff values developed under normal-theory maximum likelihood (ML) with continuous data. For ordered categorical data, unweighted least squares (ULS) and diagonally weighted least squares (DWLS) based on polychoric correlation matrices have been recommended in previous studies. Although no clear suggestions exist regarding the application of these fit indices when analyzing ordered categorical variables, practitioners are still tempted to adopt the conventional cutoff rules. The purpose of our research was to answer the question: Given a population polychoric correlation matrix and a hypothesized model, if ML results in a specific RMSEA value (e.g., .08), what is the RMSEA value when ULS or DWLS is applied? CFI and TLI were investigated in the same fashion. Both simulated and empirical polychoric correlation matrices with various degrees of model misspecification were employed to address the above question. The results showed that DWLS and ULS lead to smaller RMSEA and larger CFI and TLI values than does ML for all manipulated conditions, regardless of whether or not the indices are scaled. Applying the conventional cutoffs to DWLS and ULS, therefore, has a pronounced tendency not to discover model–data misfit. Discussions regarding the use of RMSEA, CFI, and TLI for ordered categorical data are given.
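For readers who want to reproduce the kind of comparison described above, the three indices have conventional closed-form definitions in terms of the target-model and baseline-model chi-squares. The sketch below uses those standard textbook formulas, not code from the study; the numbers in the example call are placeholders.

```python
import math

def fit_indices(chi2_m, df_m, chi2_b, df_b, n):
    """RMSEA, CFI, and TLI from the target model (m) and the baseline
    (independence) model (b) chi-square statistics; n is the sample size."""
    rmsea = math.sqrt(max(chi2_m - df_m, 0.0) / (df_m * (n - 1)))
    d_m = max(chi2_m - df_m, 0.0)      # model noncentrality estimate
    d_b = max(chi2_b - df_b, d_m)      # baseline noncentrality estimate
    cfi = 1.0 - d_m / d_b if d_b > 0 else 1.0
    tli = ((chi2_b / df_b) - (chi2_m / df_m)) / ((chi2_b / df_b) - 1.0)
    return rmsea, cfi, tli

# Placeholder values: different estimators (ML vs. DWLS/ULS) can produce
# different chi-squares for the same misspecified model, shifting all
# three indices even though the underlying misfit is identical.
print(fit_indices(chi2_m=120.0, df_m=48, chi2_b=900.0, df_b=66, n=500))
```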

475 citations


Journal ArticleDOI
TL;DR: The results showed that the effect of p on the population CFI and TLI depended on the type of specification error, whereas a higher p was associated with lower values of the population RMSEA regardless of the type of model misspecification.
Abstract: This study investigated the effect the number of observed variables (p) has on three structural equation modeling indices: the comparative fit index (CFI), the Tucker-Lewis index (TLI), and the root mean square error of approximation (RMSEA). The behaviors of the population fit indices and their sample estimates were compared under various conditions created by manipulating the number of observed variables, the types of model misspecification, the sample size, and the magnitude of factor loadings. The results showed that the effect of p on the population CFI and TLI depended on the type of specification error, whereas a higher p was associated with lower values of the population RMSEA regardless of the type of model misspecification. In finite samples, all three fit indices tended to yield estimates that suggested a worse fit than their population counterparts, which was more pronounced with a smaller sample size, higher p, and lower factor loading.

323 citations


Journal ArticleDOI
TL;DR: A snapshot of the main concepts involved in Wasserstein distances and optimal transportation is provided, along with a succinct overview of some of their many statistical aspects.
Abstract: Wasserstein distances are metrics on probability distributions inspired by the problem of optimal mass transportation. Roughly speaking, they measure the minimal effort required to reconfigure the ...
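For univariate samples, the 1-Wasserstein distance has a closed form in terms of quantile functions and is available directly in SciPy; a minimal illustration with simulated placeholder data:

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=1000)   # placeholder sample 1
y = rng.normal(loc=0.5, scale=1.2, size=1000)   # placeholder sample 2

# In one dimension, the 1-Wasserstein distance equals the L1 distance
# between the two empirical quantile functions.
print(wasserstein_distance(x, y))
```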

137 citations


Journal ArticleDOI
TL;DR: It is found, using both simulated and real-world complex data, that constraint-based algorithms are often less accurate than score-based algorithms, but are seldom faster (even at large sample sizes); and that hybrid algorithms are neither faster nor more accurate than constraint-based algorithms.

135 citations


Journal ArticleDOI
TL;DR: Comparisons of bifactor models to alternatives using fit indices may be misleading and call into question the evidentiary meaning of previous studies that identified the bifactor model as superior based on fit.
Abstract: Structural models of psychopathology provide dimensional alternatives to traditional categorical classification systems. Competing models, such as the bifactor and correlated factors models, are typically compared via statistical indices to assess how well each model fits the same data. However, simulation studies have found evidence for pro-bifactor fit index bias in several psychological research domains. The present study sought to extend this research to models of psychopathology, wherein the bifactor model has received much attention, but its susceptibility to bias is not well characterized. We used Monte Carlo simulations to examine how various model misspecifications produced fit index bias for 2 commonly used estimators, WLSMV and MLR. We simulated binary indicators to represent psychiatric diagnoses and positively skewed continuous indicators to represent symptom counts. Across combinations of estimators, indicator distributions, and misspecifications, complex patterns of bias emerged, with fit indices more often than not failing to correctly identify the correlated factors model as the data-generating model. No fit index emerged as reliably unbiased across all misspecification scenarios. However, tests of model equivalence indicated that in one instance the fit indices were not biased: they favored the bifactor model, but not unfairly. Overall, results suggest that comparisons of bifactor models to alternatives using fit indices may be misleading and call into question the evidentiary meaning of previous studies that identified the bifactor model as superior based on fit. We highlight the importance of comparing models based on substantive interpretability and their utility for addressing study aims, the methodological significance of model equivalence, and the need to implement statistical metrics that evaluate model quality.

88 citations


Journal ArticleDOI
TL;DR: A new approach to account for heteroscedasticity and covariance among observations present in residual error or induced by random effects is proposed and is universally applicable for arbitrary variance-covariance structures including spatial models and repeated measures.
Abstract: Extensions of linear models are very commonly used in the analysis of biological data. Whereas goodness of fit measures such as the coefficient of determination (R2) or the adjusted R2 are well established for linear models, it is not obvious how such measures should be defined for generalized linear and mixed models. There are by now several proposals but no consensus has yet emerged as to the best unified approach in these settings. In particular, it is an open question how to best account for heteroscedasticity and for covariance among observations present in residual error or induced by random effects. This paper proposes a new approach that addresses this issue and is universally applicable for arbitrary variance-covariance structures including spatial models and repeated measures. It is exemplified using three biological examples.
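The paper proposes its own unified measure; as a point of reference, the widely used Nakagawa-Schielzeth marginal and conditional R2 for a random-intercept model can be sketched as follows. The data are simulated placeholders, and this is the established variance-partition approach, not the authors' new estimator.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated random-intercept data: 30 groups of 10 observations each.
rng = np.random.default_rng(1)
groups = np.repeat(np.arange(30), 10)
x = rng.normal(size=300)
y = (2.0 + 1.5 * x
     + rng.normal(scale=0.8, size=30)[groups]   # group effects
     + rng.normal(scale=1.0, size=300))         # residual noise
df = pd.DataFrame({"y": y, "x": x, "g": groups})

fit = smf.mixedlm("y ~ x", df, groups=df["g"]).fit()

var_f = np.var(fit.predict(df))           # variance of fixed-effect predictions
var_re = float(fit.cov_re.iloc[0, 0])     # random-intercept variance
var_e = fit.scale                         # residual variance

r2_marginal = var_f / (var_f + var_re + var_e)          # fixed effects only
r2_conditional = (var_f + var_re) / (var_f + var_re + var_e)
print(r2_marginal, r2_conditional)
```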

63 citations


Journal ArticleDOI
TL;DR: Overall, it is found that covariance ratio effect sizes are useful for comparing patterns of modular signal across datasets or for evaluating alternative modular hypotheses for the same dataset.
Abstract: The study of modularity is paramount for understanding trends of phenotypic evolution, and for determining the extent to which covariation patterns are conserved across taxa and levels of biological organization. However, biologists currently lack quantitative methods for statistically comparing the strength of modular signal across datasets, and a robust approach for evaluating alternative modular hypotheses for the same dataset. As a solution to these challenges, we propose an effect size measure (Z_CR) derived from the covariance ratio, and develop hypothesis-testing procedures for their comparison. Computer simulations demonstrate that Z_CR displays appropriate statistical properties and low levels of mis-specification, implying that it correctly identifies modular signal when present. By contrast, alternative methods based on likelihood (EMMLi) and goodness of fit (MINT) suffer from high false positive rates and high model mis-specification rates. An empirical example in sigmodontine rodent mandibles is provided to illustrate the utility of Z_CR for comparing modular hypotheses. Overall, we find that covariance ratio effect sizes are useful for comparing patterns of modular signal across datasets or for evaluating alternative modular hypotheses for the same dataset. Finally, the statistical philosophy for pairwise model comparisons using effect sizes should accommodate any future analytical developments for characterizing modular signal.

60 citations


Journal ArticleDOI
TL;DR: The development of a spatial back-propagation neural network model designed specifically to make spatial correlations implicit by incorporating a spatial lag variable (SLV) as a virtual input variable is reported on.
Abstract: Methods for estimating the spatial distribution of PM2.5 concentrations have been developed but have not yet been able to effectively include spatial correlation. We report on the development of a spatial back-propagation neural network (S-BPNN) model designed specifically to make such correlations implicit by incorporating a spatial lag variable (SLV) as a virtual input variable. The S-BPNN fits the nonlinear relationship between ground-based air quality monitoring station measurements of PM2.5, satellite observations of aerosol optical depth, meteorological synoptic conditions data and emissions data that include auxiliary geographical parameters such as land use, normalized difference vegetation index, elevation, and population density. We trained and validated the S-BPNN for both yearly and seasonal mean PM2.5 concentrations. In addition, principal components analysis was employed to reduce the dimensionality of the data and a grid of neural network models was run to optimize the model design. The S-BPNN was cross-validated against an analogous but SLV-free BPNN model using the coefficient of determination (R2) and root mean squared error (RMSE) as statistical measures of goodness of fit. The inclusion of the SLV led to demonstrably superior performance of the S-BPNN over the BPNN with R2 values increasing from 0.80 to 0.89 and with the RMSE decreasing from 8.1 to 5.8 μg/m3. The yearly mean PM2.5 concentration in China during the study period was found to be 41.8 μg/m3 and the model estimated spatial distribution was found to exceed Level 2 of the China Ambient Air Quality Standards (CAAQS) enacted in 2012 (>35 μg/m3) in more than 70% of the Chinese territory. The inclusion of spatial correlation upgrades the performance of conventional BPNN models and provides a more accurate estimation of PM2.5 concentrations for air quality monitoring.
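The abstract does not spell out the exact construction of the spatial lag variable; one common choice, assumed here purely for illustration, is an inverse-distance-weighted average of the other stations' measurements:

```python
import numpy as np

def spatial_lag(coords, values, power=2.0):
    """Inverse-distance-weighted mean of all *other* stations' values,
    one common way to build a spatial lag variable (SLV)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    w = np.where(d > 0, 1.0 / d ** power, 0.0)   # zero weight on self
    return (w @ values) / w.sum(axis=1)

# Example: 5 monitoring stations (x, y in km) and their PM2.5 readings.
coords = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
pm25 = [40.0, 42.0, 39.0, 70.0, 68.0]
print(spatial_lag(coords, pm25))   # SLV feeds the network as a virtual input
```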

52 citations


Journal ArticleDOI
TL;DR: This review provides several in-depth concepts regarding a survival analysis and several codes for specific survival analysis are listed to enhance the understanding of such an analysis and to provide an applicable survival analysis method.
Abstract: As a follow-up to a previous article, this review provides several in-depth concepts regarding survival analysis. Several codes for specific survival analyses are also listed to enhance the understanding of such analyses and to provide an applicable survival analysis method. The proportional hazards assumption is an important concept in survival analysis, and validation of this assumption is crucial. For this purpose, a graphical analysis method and a goodness-of-fit test are introduced, along with detailed codes and examples. In the case of a violated proportional hazards assumption, extended Cox regression models are required. Simplified concepts of a stratified Cox proportional hazards model and time-dependent Cox regression are also described. The source code for an actual analysis using an available statistical package, with a detailed interpretation of the results, can enable the realization of survival analysis with personal data. To enhance the statistical power of survival analysis, an evaluation of the basic assumptions and of the interaction between variables and time is important. In doing so, survival analysis can provide reliable scientific results with a high level of confidence.
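The review's own code listings target a specific statistical package; an analogous check of the proportional hazards assumption can be sketched in Python with the lifelines library, using its bundled Rossi recidivism dataset as a stand-in:

```python
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi
from lifelines.statistics import proportional_hazard_test

df = load_rossi()   # recidivism data bundled with lifelines
cph = CoxPHFitter().fit(df, duration_col="week", event_col="arrest")

# Score test based on scaled Schoenfeld residuals: a small p-value for a
# covariate suggests its effect drifts with time, i.e. the proportional
# hazards assumption is violated for that covariate.
results = proportional_hazard_test(cph, df, time_transform="rank")
results.print_summary()
```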

49 citations


Journal ArticleDOI
26 Dec 2019-Symmetry
TL;DR: A new univariate version of the Lomax model is introduced, as well as simple copula-based constructions, via the Morgenstern family and via the Clayton copula, for building new bivariate and multivariate extensions of the new model.
Abstract: In this paper, we introduce a new univariate version of the Lomax model, as well as simple copula-based constructions, via the Morgenstern family and via the Clayton copula, for introducing new bivariate and multivariate extensions of the new model. The new density has a strong physical interpretation and can be a symmetric function and unimodal with a heavy tail with positive skewness. The new failure rate function can be “upside-down”, “decreasing” with many different shapes, and “decreasing-constant”. Some mathematical and statistical properties of the new model are derived. The model parameters are estimated using different estimation methods. For comparing the estimation methods, Markov Chain Monte Carlo (MCMC) simulations are performed. The applicability of the new model is illustrated via four real data applications; these data sets are symmetric and right-skewed. We construct a modified Chi-square goodness-of-fit test based on the Nikulin-Rao-Robson test for the new model, in the case of complete and censored samples. Different simulation studies are performed, along with applications to real data, for validation purposes.

Journal ArticleDOI
TL;DR: The situation, common in the current literature, is that a whole family of location-scale/scale invariant test statistics, indexed by a parameter λ∈Λ, is available to test the goodness o... as mentioned in this paper.
Abstract: The situation, common in the current literature, is that a whole family of location-scale/scale invariant test statistics, indexed by a parameter λ∈Λ, is available to test the goodness o...

Journal ArticleDOI
19 Dec 2019-Forests
TL;DR: In this paper, support vector regression (SVR) and random forest (RF) were used to predict the aboveground biomass (AGB) of the Sierra Madre Occidental in Mexico.
Abstract: An accurate estimation of forests’ aboveground biomass (AGB) is required because of its relevance to the carbon cycle, and because of its economic and ecological importance. The selection of appropriate variables from satellite information and physical variables is important for precise AGB prediction mapping. Because of the complex relationships involved in AGB prediction, non-parametric machine-learning techniques represent potentially useful tools for AGB estimation, but their use and comparison in forest remote-sensing applications is still relatively limited. The objective of the present study was to evaluate the performance of two automatic learning techniques, support vector regression (SVR) and random forest (RF), in predicting the observed AGB (from 318 permanent sampling plots) from the Landsat 8 Operational Land Imager (OLI) sensor, spectral indexes, texture indexes and physical variables in the Sierra Madre Occidental in Mexico. The results showed that the best SVR model explained 80% of the total variance (root mean square error (RMSE) = 8.20 Mg ha−1). The variables that best predicted AGB were, in order of importance, the bands in the red and near- and middle-infrared regions, and the average temperature. The results show that the SVR technique has good potential for the estimation of AGB and that the selection of the model hyperparameters has important implications for optimizing the goodness of fit.
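A hedged sketch of the SVR-with-hyperparameter-search setup described above, using scikit-learn; the predictors and AGB values below are random placeholders, not the study's data, and the grid is illustrative:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV

# X: plot-level predictors (spectral bands, indices, physical variables),
# y: observed AGB in Mg/ha. Both are random placeholders here.
rng = np.random.default_rng(2)
X = rng.normal(size=(318, 12))
y = rng.gamma(shape=4.0, scale=20.0, size=318)

pipe = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
grid = GridSearchCV(
    pipe,
    param_grid={"svr__C": [1, 10, 100],
                "svr__gamma": ["scale", 0.01, 0.1],
                "svr__epsilon": [0.1, 1.0]},
    scoring="neg_root_mean_squared_error",
    cv=10,
)
grid.fit(X, y)
print(grid.best_params_, -grid.best_score_)   # RMSE of the best setting
```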

Journal ArticleDOI
TL;DR: A Mahalanobis distance–based Monte Carlo goodness of fit testing procedure for the family of stochastic actor-oriented models for social network evolution and a modified model distance estimator is proposed to help the researcher identify model extensions that will remediate poor fit.
Abstract: We propose a Mahalanobis distance–based Monte Carlo goodness of fit testing procedure for the family of stochastic actor-oriented models for social network evolution. A modified model distance esti...
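The generic scheme behind a Mahalanobis distance-based Monte Carlo goodness-of-fit test can be sketched as follows; this illustrates the general idea (distance of observed auxiliary statistics from their simulated distribution), not the authors' exact procedure for stochastic actor-oriented models:

```python
import numpy as np

def mahalanobis_gof_pvalue(obs_stats, sim_stats):
    """Monte Carlo goodness-of-fit p-value based on the squared
    Mahalanobis distance of the observed auxiliary statistics from
    the cloud of statistics simulated under the fitted model.

    obs_stats : (k,) observed statistics
    sim_stats : (B, k) statistics from B model simulations
    """
    mu = sim_stats.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(sim_stats, rowvar=False))

    def d2(s):
        diff = s - mu
        return diff @ cov_inv @ diff

    d_obs = d2(obs_stats)
    d_sim = np.apply_along_axis(d2, 1, sim_stats)
    return np.mean(d_sim >= d_obs)   # share of simulations at least as extreme

rng = np.random.default_rng(3)
sims = rng.normal(size=(1000, 4))          # placeholder simulated statistics
obs = np.array([0.5, -0.2, 1.8, 0.0])      # placeholder observed statistics
print(mahalanobis_gof_pvalue(obs, sims))
```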

Journal ArticleDOI
TL;DR: In this paper, a structural equation model and test confirmatory factor analysis system was designed to better explain how students could utilize social networking system (Facebook) for educational purposes, and examined the attitude, perception and behaviour of Japanese students towards social-networking sites, and how students from non-English speaking backgrounds (especially Japanese students) at the University of Toyama perceive the use of Facebook for learning English as a foreign language.
Abstract: The objective of this study is to design a structural equation model and test a confirmatory factor analysis system in order to better explain how students could utilize a social networking system (Facebook) for educational purposes. Thus, this paper seeks to examine the attitude, perception and behaviour of Japanese students towards social-networking sites, and how students from non-English speaking backgrounds (especially Japanese students) at the University of Toyama perceive the use of Facebook for learning English as a foreign language. Our Structural Equation Modelling based Facebook model outlines the relations among the different independent and dependent variables and constructs. We tested our model using established fit indices: the Goodness of Fit Index (GFI), Adjusted Goodness of Fit Index (AGFI), Root Mean Square Error of Approximation (RMSEA), Comparative Fit Index (CFI), Non-Normed Fit Index/Tucker-Lewis Index (NNFI/TLI) and Incremental Fit Index (IFI). The results of the proposed model confirmed the hypothesized latent structures and the theoretical validity of the probed factors. Conclusions drawn from this study might be useful for better understanding the use of social networking tools in an educational context.


Posted Content
TL;DR: A Glivenko-Cantelli type theorem is proved that shows the asymptotic stability of the empirical rank map in any direction, and proposes multivariate (nonparametric) goodness-of-fit tests based on the notion of quantiles and ranks.
Abstract: In this paper we study multivariate ranks and quantiles, defined using the theory of optimal transportation, and build on the work of Chernozhukov et al. (2017) and del Barrio et al. (2018). We study the characterization, computation and properties of the multivariate rank and quantile functions and their empirical counterparts. We derive the uniform consistency of these empirical estimates to their population versions, under certain assumptions. In fact, we prove a Glivenko-Cantelli type theorem that shows the asymptotic stability of the empirical rank map in any direction. We provide easily verifiable sufficient conditions that guarantee the existence of a continuous and invertible population quantile map, a crucial assumption for our main consistency result. We provide a framework to derive the local uniform rate of convergence of the estimated quantile and rank functions and explicitly illustrate the technique in a special case. Further, we propose multivariate (nonparametric) goodness-of-fit tests, namely a two-sample test and a test for mutual independence, based on our notion of quantiles and ranks. Asymptotic consistency of these tests is also shown. Additionally, we derive many properties of (sub)gradients of convex functions and their Legendre-Fenchel duals that may be of independent interest.
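Empirical multivariate ranks of this optimal-transport flavor can be computed as the cost-minimizing assignment of sample points to a fixed set of reference points; a small sketch, in which random uniform reference points stand in for the quantile grid used in the theory:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

rng = np.random.default_rng(4)
n = 200
x = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=n)

# Reference points in [0, 1]^2, standing in for the fixed quantile grid.
u = rng.uniform(size=(n, 2))

# Empirical rank map: the bijection sample -> reference points that
# minimizes the total squared transport cost.
cost = cdist(x, u, metric="sqeuclidean")
_, cols = linear_sum_assignment(cost)   # row i is matched to u[cols[i]]
ranks = u[cols]                         # ranks[i] is the rank of x[i]
print(ranks[:5])
```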

Journal ArticleDOI
TL;DR: In this article, the authors use the relative entropy or Kullback-Leibler (KL) divergence between different data sets to assess the goodness of fit of different models.
Abstract: With the high-precision data from current and upcoming experiments, it becomes increasingly important to perform consistency tests of the standard cosmological model. In this work, we focus on consistency measures between different data sets and methods that allow us to assess the goodness of fit of different models. We address both of these questions using the relative entropy or Kullback-Leibler (KL) divergence [1]. First, we revisit the relative entropy as a consistency measure between data sets and further investigate some of its key properties, such as asymmetry and path dependence. We then introduce a novel model rejection framework, which is based on the relative entropy and the posterior predictive distribution. We validate the method on several toy models and apply it to Type Ia supernovae data from the JLA and CMB constraints from Planck 2015, testing the consistency of the data with six different cosmological models.
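For the special case of Gaussian approximations to two distributions, the KL divergence has a closed form that also makes the asymmetry discussed above explicit; a minimal sketch of that textbook formula (not the paper's general machinery):

```python
import numpy as np

def kl_gauss(mu0, cov0, mu1, cov1):
    """KL(N0 || N1) for multivariate Gaussians. Note the asymmetry:
    kl_gauss(a, A, b, B) != kl_gauss(b, B, a, A) in general."""
    k = len(mu0)
    inv1 = np.linalg.inv(cov1)
    diff = np.asarray(mu1) - np.asarray(mu0)
    return 0.5 * (np.trace(inv1 @ cov0) + diff @ inv1 @ diff - k
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

mu0, cov0 = np.zeros(2), np.eye(2)
mu1, cov1 = np.array([0.3, 0.0]), np.diag([1.5, 0.7])
print(kl_gauss(mu0, cov0, mu1, cov1))   # differs from the reverse direction
print(kl_gauss(mu1, cov1, mu0, cov0))
```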

Journal ArticleDOI
TL;DR: In this paper, a modified Chi-squared goodness-of-fit test based on the Nikulin-Rao-Robson statistic in the presence of censored and complete data is proposed.
Abstract: In this paper, we first introduce a new extension of the exponentiated exponential distribution along with several of its mathematical properties. Second, we construct a modified Chi-squared goodness-of-fit test based on the Nikulin-Rao-Robson statistic in the presence of censored and complete data. We describe the theory and the mechanism of the Y_n^2 test statistic, which can be used in survival and reliability data analysis. We use the maximum likelihood estimators based on the initial non-grouped data sets. Then, we conduct numerical simulations to reinforce the results. To show the applicability of our model in various fields, we illustrate it and the proposed test with applications to two real data sets in the complete-data case and two other right-censored data sets.
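To see why a correction such as the Nikulin-Rao-Robson statistic is needed, consider the naive Pearson chi-square built from ungrouped maximum likelihood estimates; the sketch below uses a Gamma model as a stand-in for the paper's new distribution and computes only that naive statistic:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x = rng.gamma(shape=2.0, scale=1.5, size=500)   # placeholder lifetimes

# Fit by maximum likelihood on the ungrouped data, then form k cells
# that are equiprobable under the fitted model.
a, loc, scale = stats.gamma.fit(x, floc=0)
k = 10
edges = stats.gamma.ppf(np.linspace(0, 1, k + 1), a, loc, scale)
observed, _ = np.histogram(x, bins=edges)
expected = np.full(k, len(x) / k)

chi2 = ((observed - expected) ** 2 / expected).sum()
print(chi2)
# Because the parameters were estimated from the ungrouped data, this
# naive statistic is NOT chi-square distributed with k-1 degrees of
# freedom; the Nikulin-Rao-Robson statistic adds a quadratic correction
# term that restores a valid chi-square reference distribution.
```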

Journal ArticleDOI
TL;DR: The proposed framework allows for a parsimonious specification without compromising the model's explanatory power, and provides performance similar to that of the traditional multivariate NB model for analyzing different crash dimensions.

Journal ArticleDOI
TL;DR: Based on concerns about the item response theory (IRT) linking approach used in the Programme for International Student Assessment (PISA) until 2012 as well as the desire to include new, more compl...
Abstract: Based on concerns about the item response theory (IRT) linking approach used in the Programme for International Student Assessment (PISA) until 2012 as well as the desire to include new, more compl...

Journal ArticleDOI
TL;DR: The purpose of this study was to explore the psychometric properties of the GHQ-28 when applied to the stroke population included in the randomized controlled trial “Psychosocial well-being following stroke”, by evaluating the internal consistency and exploring the factor structure, construct validity and measurement invariance.
Abstract: Several studies have documented the variety of post-stroke psychosocial challenges, which are complex, multifaceted, and affect a patient’s rehabilitation and recovery. Due to the consequences of these challenges, psychosocial well-being should be considered an important outcome of stroke rehabilitation. Thus, a valid and reliable instrument that is appropriate for the stroke population is required. The factor structure of the Norwegian version of the GHQ-28 has not previously been examined when applied to a stroke population. The purpose of this study was to explore the psychometric properties of the GHQ-28 when applied to the stroke population included in the randomized controlled trial “Psychosocial well-being following stroke”, by evaluating the internal consistency and exploring the factor structure, construct validity and measurement invariance. Data were obtained from 322 individuals with a stroke onset within the past month. The Kaiser-Meyer-Olkin (KMO) test was used to test the sampling adequacy for exploratory factor analysis, and Bartlett’s test of sphericity was used to test equal variances. Internal consistency was analysed using Cronbach’s alpha. The factor structure of the GHQ-28 was evaluated by exploratory factor analysis (EFA), and a confirmatory factor analysis (CFA) was used to determine the goodness of fit to the original structure of the outcome measurement. Measurement invariance across two time points was evaluated by configural, metric and scalar invariance. The results from the EFA supported the four-factor dimensionality, but some of the items loaded on different factors compared to those of the original structure. The differences resulted in a reduced goodness of fit in the CFA. Measurement invariance at the two time points was confirmed. The change in mean score from one to six months on the GHQ-28 and the factor composition are assumed to be affected by characteristics of the stroke population. The results when applying the GHQ-28 in a stroke population, and sub-factor analyses based on the original factor structure, should be interpreted with caution. ClinicalTrials.gov, NCT02338869, registered 10/04/2014.

Journal ArticleDOI
TL;DR: The gradient-enhanced damage model is found to be the most probable model class, with the lowest total model uncertainty, and can serve as a platform for future investigations of the uncertainties associated with damage modelling and the corresponding countermeasures.

Journal ArticleDOI
TL;DR: In this article, a novel approach for finding and evaluating structural models of small metallic nanoparticles is presented, where libraries of clusters from multiple structural motifs are built algorithmically and individually refined against experimental pair distribution functions.
Abstract: A novel approach for finding and evaluating structural models of small metallic nanoparticles is presented. Rather than fitting a single model with many degrees of freedom, libraries of clusters from multiple structural motifs are built algorithmically and individually refined against experimental pair distribution functions. Each cluster fit is highly constrained. The approach, called cluster-mining, returns all candidate structure models that are consistent with the data as measured by a goodness of fit. It is highly automated, easy to use, and yields models that are more physically realistic and in better agreement with the data than models based on cubic close-packed crystallographic cores, often reported in the literature for metallic nanoparticles.

Journal ArticleDOI
TL;DR: A new class of (probability) distributions, based on a cosine-sine transformation, obtained by compounding a baseline distribution with cosine and sine functions is introduced, showing a better fit in comparison to some existing distributions based on some goodness-of-fit tests.
Abstract: In this paper, we introduce a new class of (probability) distributions, based on a cosine-sine transformation, obtained by compounding a baseline distribution with cosine and sine functions. Some of its properties are explored. A special focus is given to a particular cosine-sine transformation using the exponential distribution as baseline. Estimations of parameters of a particular cosine-sine exponential distribution are performed via the maximum likelihood estimation method. A simulation study investigates the performances of these estimates. Applications are given for four real data sets, showing a better fit in comparison to some existing distributions based on some goodness-of-fit tests.
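The abstract does not reproduce the exact cosine-sine transform; to illustrate the compounding idea, here is the simpler, well-known sine-G map G(x) = sin((pi/2)F(x)) applied to an exponential baseline. The specific map is assumed for illustration only and is not the paper's transformation.

```python
import numpy as np

def sine_exp_cdf(x, lam):
    """Sine-G transform of an exponential baseline:
    G(x) = sin((pi/2) * F(x)) with F(x) = 1 - exp(-lam * x)."""
    big_f = 1.0 - np.exp(-lam * np.asarray(x, dtype=float))
    return np.sin(0.5 * np.pi * big_f)

def sine_exp_pdf(x, lam):
    # g(x) = (pi/2) * f(x) * cos((pi/2) * F(x)), by the chain rule
    x = np.asarray(x, dtype=float)
    big_f = 1.0 - np.exp(-lam * x)
    little_f = lam * np.exp(-lam * x)
    return 0.5 * np.pi * little_f * np.cos(0.5 * np.pi * big_f)

grid = np.linspace(0.0, 30.0, 300001)
print(sine_exp_cdf([0.0, 1.0, 5.0], 1.0))   # monotone, rises from 0 toward 1
print((sine_exp_pdf(grid, 1.0) * (grid[1] - grid[0])).sum())  # integrates to ~1
```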

Journal ArticleDOI
TL;DR: It is suggested that nursing organizations should strive for effective multidisciplinary cooperation with active support for patient-centered care and openness to change by using strategies to improve self-leadership and empathy.
Abstract: PURPOSE: Patient-centered care is a widely utilized concept in nursing and health care. However, the key components of patient-centered nursing have not yet been reported. Moreover, previous studies on patient-centered care have mostly focused on components of nursing rather than organizational factors. Therefore, a comprehensive understanding of the influential factors of patient-centered care is required. METHODS: The purpose of this study was to develop a theoretical model based on person-centered care theory and the relevant literature, and to test the developed model with covariance structure analysis in order to determine the causal paths among the variables. RESULTS: The model fit indices for the hypothetical model met the recommended levels (goodness of fit index=.87, standardized root mean residual=.01, root mean square error of approximation=.06, Tucker-Lewis index=.90, comparative fit index=.92, parsimonious normed fit index=.75). In this study, five of the six paths established in the initial hypothetical model were supported. The variables of teamwork, self-leadership, and empathy accounted for 56.4% of hospital nurses' patient-centered care. Among these, empathy was the strongest predictor of patient-centered care. CONCLUSION: These results suggest that it is necessary to use strategies to improve self-leadership and empathy. In addition to enhancing the personal factors of nurses, nursing organizations should strive for effective multidisciplinary cooperation with active support for patient-centered care and openness to change.

Journal ArticleDOI
TL;DR: In this paper, the authors focus on statistical inference and model evaluation in possibly misspecified and unidentified linear asset pricing models estimated by maximum likelihood, and show that when spurious factors are present, the model exhibits perfect fit, as measured by the squared correlation between the model's fitted expected returns and the average realized returns.

Journal ArticleDOI
TL;DR: In this article, a new goodness-of-fit test for regular vine (R-vine) copula models, a very flexible class of multivariate copulas based on a pair-copula construction (PCC), was introduced.
Abstract: We introduce a new goodness-of-fit test for regular vine (R-vine) copula models, a very flexible class of multivariate copulas based on a pair-copula construction (PCC). The test arises from White’s information matrix test and extends an existing goodness-of-fit test for copulas. The corresponding critical value can be approximated by asymptotic theory or simulation. The simulation based test shows excellent performance with regard to observed size and power in an extensive simulation study, while the asymptotic theory based test is inadequate for n≤10,000 for a 5-dimensional model (for d = 8, even 20,000 observations are not enough). The simulation based test is applied to select among different R-vine specifications modeling the dependency among exchange rates.

Journal ArticleDOI
29 Jan 2019-Metrika
TL;DR: In this article, a class of weighted L^2-type tests of fit to the Gamma distribution is proposed. The procedure is based on a fixed point property of a new transformation connected to a Steinian characterization of the family of Gamma distributions, and the weak limits of the statistic are derived under the null hypothesis and under contiguous alternatives.
Abstract: We propose a class of weighted L^2-type tests of fit to the Gamma distribution. Our novel procedure is based on a fixed point property of a new transformation connected to a Steinian characterization of the family of Gamma distributions. We derive the weak limits of the statistic under the null hypothesis and under contiguous alternatives. The result on the limit null distribution is used to prove the asymptotic validity of the parametric bootstrap that is implemented to run the tests. Further, we establish the global consistency of our tests in this bootstrap setting, and conduct a Monte Carlo simulation study to show the competitiveness with existing test procedures.
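The parametric bootstrap mentioned above re-estimates the Gamma parameters on every synthetic sample before recomputing the test statistic; a generic sketch of that scheme, with a Cramer-von Mises statistic standing in for the paper's weighted L^2 statistic:

```python
import numpy as np
from scipy import stats

def cvm(x, a, scale):
    """Cramer-von Mises distance between the sample and the fitted Gamma."""
    u = np.sort(stats.gamma.cdf(x, a, scale=scale))
    n = len(u)
    return 1 / (12 * n) + np.sum((u - (2 * np.arange(1, n + 1) - 1) / (2 * n)) ** 2)

def bootstrap_pvalue(x, statistic, b=500, seed=0):
    """Parametric-bootstrap p-value: refit the Gamma model on each
    synthetic sample, as the asymptotic validity result requires."""
    rng = np.random.default_rng(seed)
    a, _, scale = stats.gamma.fit(x, floc=0)
    t_obs = statistic(x, a, scale)
    t_boot = np.empty(b)
    for i in range(b):
        xb = rng.gamma(a, scale, size=len(x))
        ab, _, sb = stats.gamma.fit(xb, floc=0)   # re-estimate on each draw
        t_boot[i] = statistic(xb, ab, sb)
    return np.mean(t_boot >= t_obs)

x = np.random.default_rng(1).gamma(2.0, 3.0, size=200)   # placeholder data
print(bootstrap_pvalue(x, cvm))
```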

Journal ArticleDOI
TL;DR: This paper proposes an algorithm to assist analysts in the search of an appropriate specification in terms of explanatory power and goodness of fit for mixed logit models and suggests that the proposed algorithm can find adequate model specifications, thereby supporting the analyst in the modeling process.
Abstract: Mixed logit is a widely used discrete outcome model that requires for the analyst to make three important decisions that affect the quality of the model specification. These decisions are: 1) what variables are considered in the analysis, 2) which variables are to be modeled with random parameters, and; 3) what density function do these parameters follow. The literature provides guidance; however, a strong statistical background and an ad hoc search process are required to obtain an adequate model specification. Knowledge and data about the problem context are required; also, the process is time consuming, and there is no certainty that the specified model is the best available. This paper proposes an algorithm to assist analysts in the search of an appropriate specification in terms of explanatory power and goodness of fit for mixed logit models. The specification includes the variables that should be considered as well as the random and deterministic parameters and their corresponding distributions. Three experiments were performed to test the effectiveness of the proposed algorithm. Comparison with existing model specifications for the same datasets were performed. The results suggest that the proposed algorithm can find adequate model specifications, thereby supporting the analyst in the modeling process.