
Showing papers in "Open Journal of Statistics in 2014"


Journal ArticleDOI
Abstract: Psychometric theory requires unidimensionality (i.e., scale items should represent a common latent variable). One advocated approach to test unidimensionality within the Rasch model is to identify two item sets from a Principal Component Analysis (PCA) of residuals, estimate separate person measures based on the two item sets, compare the two estimates on a person-by-person basis using t-tests and determine the number of cases that differ significantly at the 0.05-level; if ≤5% of tests are significant, or the lower bound of a binomial 95% confidence interval (CI) of the observed proportion overlaps 5%, then it is suggested that strict unidimensionality can be inferred; otherwise the scale is multidimensional. Given its proposed significance and potential implications, this procedure needs detailed scrutiny. This paper explores the impact of sample size and method of estimating the 95% binomial CI upon conclusions according to recommended conventions. Normal approximation, “exact”, Wilson, Agresti-Coull, and Jeffreys binomial CIs were calculated for observed proportions of 0.06, 0.08 and 0.10 and sample sizes from n = 100 to n = 2500. Lower 95% CI boundaries were inspected regarding coverage of the 5% threshold. Results showed that all binomial 95% CIs included as well as excluded 5% as an effect of sample size for all three investigated proportions, except for the Wilson, Agresti-Coull, and Jeffreys CIs, which did not include 5% for any sample size with a 10% observed proportion. The normal approximation CI was most sensitive to sample size. These data illustrate that the PCA/t-test protocol should be used and interpreted as any hypothesis testing procedure and is dependent on sample size as well as the binomial CI estimation procedure. The PCA/t-test protocol should not be viewed as a “definite” test of unidimensionality and does not replace an integrated quantitative/qualitative interpretation based on an explicit variable definition in view of the perspective, context and purpose of measurement.
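A minimal sketch of the comparison described in the abstract, assuming statsmodels is available: it computes the lower bound of several 95% binomial CIs for an observed proportion of significant person t-tests and checks whether that bound covers the 5% threshold. Sample sizes and proportions follow the abstract; the mapping of method names is statsmodels' ("beta" is the Clopper-Pearson "exact" interval).

```python
# Sketch: lower bounds of 95% binomial CIs for an observed proportion of
# significant t-tests, checked against the 5% unidimensionality threshold.
from statsmodels.stats.proportion import proportion_confint

methods = ["normal", "beta", "wilson", "agresti_coull", "jeffreys"]  # "beta" = exact
for p_obs in (0.06, 0.08, 0.10):
    for n in (100, 250, 500, 1000, 2500):
        count = round(p_obs * n)              # number of significant person t-tests
        for m in methods:
            lo, hi = proportion_confint(count, n, alpha=0.05, method=m)
            covers_5pct = lo <= 0.05          # lower bound overlaps the 5% threshold?
            print(f"p={p_obs:.2f} n={n:>4} {m:<13} lower={lo:.4f} covers 5%: {covers_5pct}")
```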

74 citations


Journal ArticleDOI
TL;DR: In this article, the authors evaluated the performance of the four most commonly used methods in practice, namely, complete case (CC), mean substitution (MS), last observation carried forward (LOCF), and multiple imputation (MI), and concluded that MI is more reliable and a better grounded statistical method to be used under MAR.
Abstract: Missing data can frequently occur in a longitudinal data analysis. In the literature, many methods have been proposed to handle such an issue. Complete case (CC), mean substitution (MS), last observation carried forward (LOCF), and multiple imputation (MI) are the four most frequently used methods in practice. In a real-world data analysis, the missing data can be MCAR, MAR, or MNAR depending on the reasons that lead to the missingness. In this paper, simulations under various situations (including missing mechanisms, missing rates, and slope sizes) were conducted to evaluate the performance of the four methods using bias, RMSE, and 95% coverage probability as evaluation criteria. The results showed that LOCF has the largest bias and the poorest 95% coverage probability in most cases under both MAR and MCAR missing mechanisms. Hence, LOCF should not be used in a longitudinal data analysis. Under the MCAR missing mechanism, the CC and MI methods perform equally well. Under the MAR missing mechanism, MI has the smallest bias, smallest RMSE, and best 95% coverage probability. Therefore, the CC or MI method is appropriate under MCAR, while the MI method is a more reliable and better grounded statistical method under MAR.
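The following sketch illustrates the type of simulation reported above on a toy design (5 visits, linear trend, 30% MCAR dropout), comparing the bias and RMSE of complete case analysis, mean substitution and LOCF for the final-visit mean. It is not the paper's exact setup, and multiple imputation is omitted for brevity.

```python
# Sketch: bias of complete case (CC), mean substitution (MS) and LOCF for the
# final-visit mean of a simulated linear-trend longitudinal outcome under MCAR.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_sub, n_vis, slope, miss_rate = 200, 5, 1.0, 0.30
true_final_mean = slope * (n_vis - 1)

bias = {"CC": [], "MS": [], "LOCF": []}
for _ in range(500):
    y = slope * np.arange(n_vis) + rng.normal(0, 1, size=(n_sub, n_vis))
    df = pd.DataFrame(y)
    mask = rng.random((n_sub, n_vis)) < miss_rate   # MCAR: drop values at random
    mask[:, 0] = False                              # baseline always observed
    df = df.mask(mask)

    cc = df.dropna().iloc[:, -1].mean()             # complete cases only
    ms = df.fillna(df.mean()).iloc[:, -1].mean()    # column-mean substitution
    locf = df.ffill(axis=1).iloc[:, -1].mean()      # last observation carried forward
    for k, v in zip(("CC", "MS", "LOCF"), (cc, ms, locf)):
        bias[k].append(v - true_final_mean)

for k, v in bias.items():
    print(f"{k}: mean bias = {np.mean(v):+.3f}, RMSE = {np.sqrt(np.mean(np.square(v))):.3f}")
```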

33 citations


Journal ArticleDOI
F. Brouers
TL;DR: In this article, it was shown that most of the empirical or semi-empirical isotherms proposed to extend the Langmuir formula to sorption (adsorption, chemisorption and biosorption) on heterogeneous surfaces in the gaseous and liquid phase belong to the family and subfamily of the Burr XII cumulative distribution functions.
Abstract: We show that most of the empirical or semi-empirical isotherms proposed to extend the Langmuir formula to sorption (adsorption, chemisorption and biosorption) on heterogeneous surfaces in the gaseous and liquid phase belong to the family and subfamily of the Burr XII cumulative distribution functions. As a consequence they obey relatively simple differential equations which describe birth and death phenomena resulting from mesoscopic and microscopic physicochemical processes. Using probability theory, it is thus possible to give a physical meaning to their empirical coefficients, to calculate well defined quantities and to compare the results obtained from different isotherms. Another interesting consequence of this finding is that it is possible to relate the shape of the isotherm to the distribution of sorption energies, which we have calculated for each isotherm. In particular, we show that the energy distribution corresponding to the Brouers-Sotolongo (BS) isotherm [1] is the Gumbel extreme value distribution. We propose a generalized GBS isotherm, calculate its relevant statistical properties and recover all the previous results by giving well defined values to its coefficients. Finally we show that the Langmuir, the Hill-Sips, the BS and GBS isotherms satisfy the maximum Boltzmann-Shannon entropy principle and therefore should be favoured.
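A small numerical check of the central claim, assuming scipy is available: the Burr XII CDF, F(x) = 1 - (1 + (x/s)^c)^(-d), reduces to a Hill-Sips-type isotherm for d = 1 and to a Langmuir-type isotherm for c = d = 1. Parameter values are illustrative only.

```python
# Sketch: Hill-Sips and Langmuir isotherm shapes as special cases of the Burr XII CDF.
import numpy as np
from scipy.stats import burr12

x = np.linspace(0.0, 10.0, 6)
c, s = 2.0, 1.5                                    # illustrative shape and scale

hill_sips = (x / s) ** c / (1.0 + (x / s) ** c)    # theta = K x^c / (1 + K x^c), K = s^-c
langmuir = (x / s) / (1.0 + x / s)                 # theta = K x / (1 + K x)

print(np.allclose(burr12.cdf(x, c, 1.0, scale=s), hill_sips))   # True (d = 1)
print(np.allclose(burr12.cdf(x, 1.0, 1.0, scale=s), langmuir))  # True (c = d = 1)
```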

28 citations


Journal ArticleDOI
TL;DR: In this paper, a maximum ranked set sampling procedure with unequal samples (MRSSU) is proposed and its properties are studied under the exponential distribution under both perfect and imperfect ranking (with errors in ranking).
Abstract: In this paper, a maximum ranked set sampling procedure with unequal samples (MRSSU) is proposed. The maximum likelihood estimator and a modified maximum likelihood estimator are obtained and their properties are studied under the exponential distribution. These methods are studied under both perfect and imperfect ranking (with errors in ranking). These estimators are then compared with estimators based on simple random sampling (SRS) and ranked set sampling (RSS) procedures. It is shown that the relative efficiencies of the estimators based on MRSSU are better than those of the estimator based on SRS. Simulation results show that the efficiency of the proposed estimator is better than that of the estimator based on RSS under ranking error.
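For readers unfamiliar with ranked set sampling, the sketch below estimates the relative efficiency of the standard RSS mean versus the SRS mean for an exponential population under perfect ranking. It illustrates the baseline comparison only; the paper's MRSSU procedure and its (modified) maximum likelihood estimators are not reproduced here.

```python
# Sketch: relative efficiency of the ranked set sampling (RSS) mean vs. the simple
# random sampling (SRS) mean for an exponential population, perfect ranking.
import numpy as np

rng = np.random.default_rng(1)
m, cycles, reps, scale = 4, 5, 20000, 2.0    # set size, cycles, Monte Carlo reps
n = m * cycles                               # measured units per sample

srs_means, rss_means = [], []
for _ in range(reps):
    srs_means.append(rng.exponential(scale, size=n).mean())
    meas = []
    for _ in range(cycles):
        sets = rng.exponential(scale, size=(m, m))
        ranked = np.sort(sets, axis=1)       # perfect ranking within each set
        meas.extend(ranked[i, i] for i in range(m))  # i-th order statistic of set i
    rss_means.append(np.mean(meas))

re = np.var(srs_means) / np.var(rss_means)
print(f"Relative efficiency RSS vs SRS: {re:.2f}")   # expected to exceed 1
```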

20 citations


Journal ArticleDOI
TL;DR: In this paper, the authors compare sample quality across two probability samples and one that uses probabilistic cluster sampling combined with random route and quota sampling within the selected clusters in order to define the ultimate survey units.
Abstract: The aim of this paper is to compare sample quality across two probability samples and one that uses probabilistic cluster sampling combined with random route and quota sampling within the selected clusters in order to define the ultimate survey units. All of them use the face-to-face interview as the survey procedure. The hypothesis to be tested is that it is possible to achieve the same degree of representativeness using a combination of random route sampling and quota sampling (with substitution) as can be achieved by means of household sampling (without substitution) based on the municipal register of inhabitants. We found marked differences in the age and gender distributions with respect to the probability samples, with deviations exceeding 6%. A different picture emerges when it comes to comparing the employment variables, where the quota sampling overestimates the economic activity rate (2.5%) and the unemployment rate (8%) and underestimates the employment rate (3.46%).

20 citations


Journal ArticleDOI
TL;DR: In this article, the authors evaluated the effect of item inversion on the construct validity and reliability of psychometric scales and proposed a theoretical framework for the evaluation of the psychometric properties of data gathered with psychometric instruments.
Abstract: This study evaluated the effect of item inversion on the construct validity and reliability of psychometric scales and proposed a theoretical framework for the evaluation of the psychometric properties of data gathered with psychometric instruments. To this purpose, we used the Maslach Burnout Inventory, which is the most widely used psychometric inventory to measure burnout in different professional contexts (Students, Teachers, Police, Doctors, Nurses, etc.). The version of the MBI used was the MBI-Student Survey (MBI-SS). This inventory is composed of three key dimensions: Exhaustion, Cynicism and Professional Efficacy. The first two dimensions—which have positively worded items—are moderately to strongly positively correlated, and show moderate to strong negative correlations with the 3rd dimension—which has negatively worded items. We tested the hypothesis that, in college students, formulating the 3rd dimension of burnout as Inefficacy (reverting the negatively worded items in the Efficacy dimension) improves the correlation of the 3rd dimension with the other two dimensions, improves its internal consistency, and improves the overall MBI-SS’ construct validity and reliability. Confirmatory factor analysis results, estimated by Maximum Likelihood, revealed adequate factorial fit for both forms of the MBI-SS (with Efficacy) vs. the MBI-SSi (with Inefficacy). Both forms also showed adequate convergent and discriminant related validity. However, reliability and convergent validity were higher for the MBI-SSi. There were also stronger (positive) correlations between the 3 factors in the MBI-SSi than those observed in the MBI-SS. Results show that positive rewording of the 3rd dimension of the MBI-SS improves its validity and reliability. We therefore propose that the 3rd dimension of the MBI-SS should be named Professional Inefficacy and its items should be positively worded.

18 citations


Journal ArticleDOI
TL;DR: A simulation study to investigate the efficiency of four typical imputation methods with longitudinal data setting under missing completely at random concludes that MI method is the most effective imputation method in the authors' MCAR simulation study.
Abstract: In analyzing data from clinical trials and longitudinal studies, the issue of missing values is always a fundamental challenge, since the missing data could introduce bias and lead to erroneous statistical inferences. To deal with this challenge, several imputation methods have been developed in the literature, of which the most commonly used are the complete case method, the mean imputation method, the last observation carried forward (LOCF) method, and the multiple imputation (MI) method. In this paper, we conduct a simulation study to investigate the efficiency of these four typical imputation methods in a longitudinal data setting under missing completely at random (MCAR). We categorize missingness into three cases, from a lower percentage of 5% to higher percentages of 30% and 50%. With this simulation study, we conclude that the LOCF method has more bias than the other three methods in most situations, while the MI method has the least bias with the best coverage probability. Thus, the MI method is the most effective imputation method in our MCAR simulation study.

18 citations


Journal ArticleDOI
TL;DR: In this paper, the exact expression of the distribution of the sample matrix of correlations R, with the sample variances acting as parameters, is given for the case where the multivariate normal population does not have null correlations, and applications to the concept of system dependence in Reliability Theory are presented.
Abstract: For the case where the multivariate normal population does not have null correlations, we give the exact expression of the distribution of the sample matrix of correlations R, with the sample variances acting as parameters. Also, the distribution of its determinant is established in terms of Meijer G-functions in the null-correlation case. Several numerical examples are given, and applications to the concept of system dependence in Reliability Theory are presented.
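As a quick numerical companion to the exact result, the sketch below approximates the null-correlation distribution of det(R) by Monte Carlo simulation; dimension and sample size are illustrative, and the Meijer G-function expression itself is not implemented.

```python
# Sketch: Monte Carlo approximation of the distribution of det(R), the determinant
# of the sample correlation matrix, for a p-variate normal with identity correlation.
import numpy as np

rng = np.random.default_rng(2)
p, n, reps = 4, 30, 10000

dets = np.empty(reps)
for k in range(reps):
    x = rng.standard_normal((n, p))      # independent coordinates -> null correlations
    r = np.corrcoef(x, rowvar=False)     # sample correlation matrix R
    dets[k] = np.linalg.det(r)

print(f"E[det R] ~= {dets.mean():.3f}, 5th/95th percentiles: "
      f"{np.percentile(dets, 5):.3f}, {np.percentile(dets, 95):.3f}")
```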

16 citations


Journal ArticleDOI
TL;DR: In this paper, the ROC curves for Bi-Pareto and Bi-two parameter exponential distributions were calculated using simulations and compared in terms of root mean square and mean absolute errors.
Abstract: In this paper, we find the ROC curves for Bi-Pareto and Bi-two parameter exponential distributions. Theoretical, parametric and non-parametric values of area under receiver operating characteristic (AUROC) curve for different parametric combinations have been calculated using simulations. These values are compared in terms of root mean square and mean absolute errors. The results are demonstrated for two real data sets.
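A hedged sketch of the parametric-versus-nonparametric comparison for the bi-exponential case: if case scores X and control scores Y are exponential with scales a and b, the theoretical AUROC is P(X > Y) = a/(a + b), which the Mann-Whitney form of the empirical AUC should recover. Parameters and sample sizes are illustrative, not those of the paper's data sets.

```python
# Sketch: theoretical vs. empirical AUROC for exponential case and control scores.
import numpy as np

rng = np.random.default_rng(3)
a, b, n = 3.0, 1.0, 2000                 # case scale, control scale, sample size

x = rng.exponential(a, n)                # case scores
y = rng.exponential(b, n)                # control scores

auc_theory = a / (a + b)
auc_empirical = (x[:, None] > y[None, :]).mean()   # Mann-Whitney form of the AUC

print(f"theoretical AUC = {auc_theory:.3f}, empirical AUC = {auc_empirical:.3f}")
```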

13 citations


Journal ArticleDOI
TL;DR: The opportunity of using the most innovative spatial sampling designs in business surveys, in order to produce samples that are well spread in space, is here tested by means of Monte Carlo experiments.
Abstract: An innovative use of spatial sampling designs is here presented. Sampling methods which consider spatial locations of statistical units are already used in agricultural and environmental contexts, while they have never been exploited for establishment surveys. However, the rapidly increasing availability of geo-referenced information about business units makes that possible. In business studies, it may indeed be important to take into account the presence of spatial autocorrelation or spatial trends in the variables of interest, in order to have more precise and efficient estimates. The opportunity of using the most innovative spatial sampling designs in business surveys, in order to produce samples that are well spread in space, is here tested by means of Monte Carlo experiments. For all designs, the Horvitz-Thompson estimator of the population total is used both with equal and unequal inclusion probabilities. The efficiency of sampling designs is evaluated in terms of relative RMSE and efficiency gain compared with designs ignoring the spatial information. Furthermore, an evaluation of spatially balanced samples is also conducted.
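A minimal sketch of the estimator used throughout the paper, the Horvitz-Thompson estimator of a population total under unequal inclusion probabilities. Poisson sampling is used purely for simplicity; the spatially balanced designs studied in the paper would change only how the sample is drawn, not how the estimator is computed.

```python
# Sketch: Horvitz-Thompson estimator of a population total with unequal inclusion
# probabilities, drawn here by Poisson sampling with probabilities proportional to size.
import numpy as np

rng = np.random.default_rng(4)
N = 1000
size = rng.lognormal(mean=2.0, sigma=0.7, size=N)    # auxiliary size measure
y = 5.0 * size + rng.normal(0, 5, N)                 # survey variable, related to size

n_target = 100
pi = np.minimum(1.0, n_target * size / size.sum())   # inclusion probabilities ~ size

sample = rng.random(N) < pi                          # Poisson sampling
ht_total = np.sum(y[sample] / pi[sample])            # Horvitz-Thompson estimator

print(f"true total = {y.sum():.0f}, HT estimate = {ht_total:.0f}, "
      f"realised sample size = {sample.sum()}")
```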

13 citations


Journal ArticleDOI
TL;DR: An algorithm involving Mel-Frequency Cepstral Coefficients (MFCCs) is provided to perform signal feature extraction for the task of speaker accent recognition, and k-nearest neighbors yield the highest average test accuracy.
Abstract: An algorithm involving Mel-Frequency Cepstral Coefficients (MFCCs) is provided to perform signal feature extraction for the task of speaker accent recognition. Then different classifiers are compared based on the MFCC feature. For each signal, the mean vector of the MFCC matrix is used as an input vector for pattern recognition. A sample of 330 signals, containing 165 US voices and 165 non-US voices, is analyzed. By comparison, k-nearest neighbors yields the highest average test accuracy, after using a cross-validation of size 500, and requires the least computation time.

Journal ArticleDOI
TL;DR: Findings of the study suggest that this simulation-based power analysis method can be used to estimate sample size and statistical power for Guastello’s polynomial regression method in cusp catastrophe modeling.
Abstract: Guastello’s polynomial regression method for solving the cusp catastrophe model has been widely applied to analyze nonlinear behavior outcomes. However, no statistical power analysis for this modeling approach has been reported, probably due to the complex nature of the cusp catastrophe model. Since statistical power analysis is essential for research design, we propose a novel method in this paper to fill in the gap. The method is simulation-based and can be used to calculate statistical power and sample size when Guastello’s polynomial regression method is used in cusp catastrophe modeling analysis. With this novel approach, a power curve is produced first to depict the relationship between statistical power and sample size under different model specifications. This power curve is then used to determine the sample size required for a specified statistical power. We verify the method first through four scenarios generated through Monte Carlo simulations, followed by an application of the method with real published data on modeling early sexual initiation among young adolescents. Findings of our study suggest that this simulation-based power analysis method can be used to estimate sample size and statistical power for Guastello’s polynomial regression method in cusp catastrophe modeling.
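The general recipe of a simulation-based power curve is easy to sketch. The example below generates data from a simple cubic regression as a stand-in for Guastello's cusp polynomial (which is not reproduced here), and traces the proportion of simulations in which the cubic term is significant over a grid of sample sizes.

```python
# Sketch: simulation-based power analysis for a polynomial regression term.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)

def power_for_n(n, beta3=0.3, reps=500, alpha=0.05):
    hits = 0
    for _ in range(reps):
        x = rng.normal(size=n)
        y = 0.5 * x + beta3 * x**3 + rng.normal(size=n)
        X = sm.add_constant(np.column_stack([x, x**2, x**3]))
        res = sm.OLS(y, X).fit()
        hits += res.pvalues[3] < alpha        # p-value of the cubic coefficient
    return hits / reps

for n in (30, 50, 80, 120, 200):
    print(n, power_for_n(n))                  # read off the n reaching, e.g., 0.80 power
```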

Journal ArticleDOI
TL;DR: In this article, the shape parameter of Weibull distribution is used to calculate PCIs for the verification and validation purpose of two data sets for verification purpose and the effectiveness of the technique is assessed by bootstrapping the results of estimate and standard error of shape parameter.
Abstract: Process capability analysis is used to determine the process performance as capable or incapable within a specified tolerance. Basic indices Cp, Cpk, Cpm, Cpmk initially developed for normally distributed processes showed inappropriate for processes with non-normal distributions. A number of authors worked on non-normal distributions which were most notably those of Clements, Pearn and Chen, Montgomery and Johnson-Kotz-Pearn (JKP). Obtaining PCIs based on the parameters of non-normal distributions are completely disregarded and ignored. However parameters of some non-normal distributions have significance for knowing the status of process as capable or incapable. In this article we intend to work on the shape parameter of Weibull distribution to calculate PCIs. We work on two data sets for verification and validation purpose. Efficacy of the technique is assessed by bootstrapping the results of estimate and standard error of shape parameter.
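A hedged sketch of the bootstrap step named in the abstract: estimate the Weibull shape parameter by maximum likelihood and obtain its bootstrap standard error. The data are simulated placeholders, not the paper's data sets, and no capability index is computed here.

```python
# Sketch: ML estimate of the Weibull shape parameter and its bootstrap standard error.
import numpy as np
from scipy.stats import weibull_min

rng = np.random.default_rng(6)
data = weibull_min.rvs(c=1.8, scale=2.0, size=200, random_state=rng)   # placeholder data

shape_hat, loc_hat, scale_hat = weibull_min.fit(data, floc=0)   # fix location at 0

boot = []
for _ in range(1000):
    resample = rng.choice(data, size=data.size, replace=True)
    boot.append(weibull_min.fit(resample, floc=0)[0])           # bootstrap shape estimate

print(f"shape = {shape_hat:.3f}, bootstrap SE = {np.std(boot, ddof=1):.3f}")
```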

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a method for estimating the duration of the hiatus that is robust to unknown forms of heteroskedasticity and autocorrelation (HAC) in the temperature series and to cherry-picking of endpoints.
Abstract: The IPCC has drawn attention to an apparent leveling-off of globally-averaged temperatures over the past 15 years or so. Measuring the duration of the hiatus has implications for determining if the underlying trend has changed, and for evaluating climate models. Here, I propose a method for estimating the duration of the hiatus that is robust to unknown forms of heteroskedasticity and autocorrelation (HAC) in the temperature series and to cherry-picking of endpoints. For the specific case of global average temperatures I also add the requirement of spatial consistency between hemispheres. The method makes use of the Vogelsang-Franses (2005) HAC-robust trend variance estimator which is valid as long as the underlying series is trend stationary, which is the case for the data used herein. Application of the method shows that there is now a trendless interval of 19 years duration at the end of the HadCRUT4 surface temperature series, and of 16 - 26 years in the lower troposphere. Use of a simple AR1 trend model suggests a shorter hiatus of 14 - 20 years but is likely unreliable.
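The idea of HAC-robust trend inference can be illustrated with ordinary Newey-West standard errors, as below; this is not the Vogelsang-Franses estimator used in the paper, and the series is simulated rather than HadCRUT4 or satellite data.

```python
# Sketch: testing for a zero trend over a candidate "hiatus" window using OLS with
# HAC (Newey-West) standard errors on a simulated, trendless AR(1) anomaly series.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
T = 19 * 12                                      # e.g. 19 years of monthly anomalies
e = np.zeros(T)
for t in range(1, T):                            # AR(1) noise, no underlying trend
    e[t] = 0.6 * e[t - 1] + rng.normal(0, 0.1)
y = 0.2 + e                                      # simulated temperature anomalies

X = sm.add_constant(np.arange(T) / 12.0)         # trend in degrees per year
res = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 24})
print(f"trend = {res.params[1]:+.4f} C/yr, HAC p-value = {res.pvalues[1]:.3f}")
```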

Journal ArticleDOI
TL;DR: In this paper, a new nonparametric test based on the rank difference between the paired sample for testing the equality of the marginal distributions from a bivariate distribution was proposed, which has comparable power to the paired t test for the data simulated from bivariate normal distributions.
Abstract: We propose a new nonparametric test based on the rank difference between the paired samples for testing the equality of the marginal distributions from a bivariate distribution. We also consider a modification of the novel nonparametric test based on the test proposed by Baumgartner, Weiß, and Schindler (1998). An extensive numerical power comparison for various parametric and nonparametric tests was conducted under a wide range of bivariate distributions for small sample sizes. The two new nonparametric tests have comparable power to the paired t test for data simulated from bivariate normal distributions, and are generally more powerful than the paired t test and other commonly used nonparametric tests in several important bivariate distributions.
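The kind of power comparison described above can be sketched for two standard paired tests; the paper's new rank-difference test and its Baumgartner-Weiss-Schindler modification are not reproduced here, and the bivariate-normal setting and sample size are illustrative.

```python
# Sketch: simulated power of the paired t test and the Wilcoxon signed-rank test
# for a location shift in one margin of a bivariate normal sample.
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
n, rho, delta, reps, alpha = 20, 0.5, 0.5, 2000, 0.05
cov = [[1.0, rho], [rho, 1.0]]

power = {"paired t": 0, "wilcoxon": 0}
for _ in range(reps):
    xy = rng.multivariate_normal([0.0, delta], cov, size=n)   # shifted second margin
    d = xy[:, 1] - xy[:, 0]
    power["paired t"] += stats.ttest_rel(xy[:, 1], xy[:, 0]).pvalue < alpha
    power["wilcoxon"] += stats.wilcoxon(d).pvalue < alpha

for k, v in power.items():
    print(f"{k}: power ~= {v / reps:.2f}")
```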

Journal ArticleDOI
TL;DR: In this article, the Weibull kernel is used to estimate the hazard rate and the probability density function for independent and identically distributed (iid) data, and the performance of the proposed estimator is tested using simulation study and real data.
Abstract: In this paper, we define the Weibull kernel and use it for nonparametric estimation of the probability density function (pdf) and the hazard rate function for independent and identically distributed (iid) data. The bias, variance and the optimal bandwidth of the proposed estimator are investigated. Moreover, the asymptotic normality of the proposed estimator is investigated. The performance of the proposed estimator is tested using a simulation study and real data.
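A sketch of a Weibull-kernel density estimate for nonnegative data, under an assumed parametrization: the kernel at evaluation point x is a Weibull density with shape 1/h and scale x/Γ(1+h), so that its mean equals x. This construction is for illustration only and may differ from the parametrization used in the paper.

```python
# Sketch: asymmetric Weibull-kernel density estimate (assumed parametrization).
import numpy as np
from scipy.stats import weibull_min, gamma as gamma_dist
from scipy.special import gamma as gamma_fn

rng = np.random.default_rng(9)
data = gamma_dist.rvs(a=2.0, scale=1.5, size=300, random_state=rng)  # iid positive data
h = 0.2                                                              # bandwidth

def weibull_kde(x, sample, h):
    """Estimate f(x) for x > 0 by averaging Weibull kernels whose mean equals x."""
    shape = 1.0 / h
    scale = x / gamma_fn(1.0 + h)          # assumption: scale chosen so kernel mean = x
    return weibull_min.pdf(sample, shape, scale=scale).mean()

grid = np.linspace(0.5, 8.0, 6)
print([round(weibull_kde(x, data, h), 3) for x in grid])
```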

Journal ArticleDOI
TL;DR: In this article, the authors focus on the detection and estimation of changes in patients' failure rates, which is important for the evaluation and comparison of treatments and prediction of their effects.
Abstract: Effects of many medical procedures appear after a time lag, when a significant change occurs in subjects’ failure rate. This paper focuses on the detection and estimation of such changes which is important for the evaluation and comparison of treatments and prediction of their effects. Unlike the classical change-point model, measurements may still be identically distributed, and the change point is a parameter of their common survival function. Some of the classical change-point detection techniques can still be used but the results are different. Contrary to the classical model, the maximum likelihood estimator of a change point appears consistent, even in presence of nuisance parameters. However, a more efficient procedure can be derived from Kaplan-Meier estimation of the survival function followed by the least-squares estimation of the change point. Strong consistency of these estimation schemes is proved. The finite-sample properties are examined by a Monte Carlo study. Proposed methods are applied to a recent clinical trial of the treatment program for strong drug dependence.

Journal ArticleDOI
TL;DR: In this paper, the authors determine the time lag of GCM data and build a statistical downscaling model using the PCR method with the time-lagged GCM precipitation data.
Abstract: Statistical downscaling (SD) analyzes the relationship between a local-scale response and global-scale predictors. The SD model can be used to forecast rainfall (local-scale) using global-scale precipitation from global circulation model (GCM) output. The objectives of this research were to determine the time lag of the GCM data and to build an SD model using the PCR method with the time-lagged GCM precipitation data. Rainfall observations in Indramayu taken from 1979 to 2007 showed patterns similar to the GCM data on the 1st to 64th grids after a time shift (time lag). The time lag was determined using the cross-correlation function. However, the GCM data for the 64 grids showed a multicollinearity problem. This problem was solved by principal component regression (PCR), but the PCR model resulted in heterogeneous errors. The PCR model was modified to overcome these errors by adding dummy variables to the model. Dummy variables were determined based on partial least squares regression (PLSR). The PCR model with dummy variables improved the rainfall prediction. The SD model with lag-GCM predictors was also better than the SD model without lag-GCM predictors.
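The downscaling pipeline can be sketched as: pick a time lag per GCM grid cell by cross-correlation, then regress local rainfall on principal components of the lagged predictors. The data below are simulated placeholders (not the Indramayu/GCM series) and the dummy-variable correction step is omitted.

```python
# Sketch: lag selection by cross-correlation followed by principal component regression.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(10)
T, n_grid, max_lag = 348, 64, 6                       # monthly series, 64 grid cells
rain = rng.gamma(2.0, 50.0, size=T)                   # local rainfall (placeholder)
gcm = np.column_stack([np.roll(rain, -2) + rng.normal(0, 20, T) for _ in range(n_grid)])

def best_lag(y, x, max_lag):
    """Lag (0..max_lag) maximizing |corr(y_t, x_{t+lag})|."""
    n = len(y)
    corrs = [abs(np.corrcoef(y[: n - L], x[L:])[0, 1]) for L in range(max_lag + 1)]
    return int(np.argmax(corrs))

lags = [best_lag(rain, gcm[:, j], max_lag) for j in range(n_grid)]
X_lag = np.column_stack([gcm[L:, j][: T - max_lag] for j, L in enumerate(lags)])
y = rain[: T - max_lag]

pcs = PCA(n_components=5).fit_transform(X_lag)        # PCR handles the multicollinearity
model = LinearRegression().fit(pcs, y)
print("in-sample R^2 of the PCR downscaling model:", round(model.score(pcs, y), 3))
```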

Journal ArticleDOI
TL;DR: In this paper, the authors use the Cox proportional hazards model to determine the appropriate method for modeling the birth of the first child in Indonesia, considering that newly married couples tend to want a baby as soon as possible and that this desire weakens with increasing age at marriage.
Abstract: The first birth interval is one example of survival data. One of the characteristics of survival data is that the observation period may not be fully observed, i.e., it is censored. Analyzing censored data using ordinary methods leads to bias, so reducing such bias requires a specific approach called survival analysis. There are two approaches used in survival analysis: parametric and non-parametric methods. The exponential model with the inclusion of covariates is used as the parametric method, considering that newly married couples tend to want a baby as soon as possible and that this desire weakens with increasing age at marriage. The data analyzed were taken from the Indonesia Demographic and Health Survey (IDHS) 2012. The analysis shows that the first-birth data are not exponentially distributed; thus the Cox proportional hazards method is used. Because non-proportional covariates were suspected, a proportional hazards test was conducted and showed that the covariate age is not proportional, so a generalized Cox proportional hazards method is used, namely the extended Cox model, which allows the inclusion of non-proportional covariates. The analysis using the extended Cox model indicates that the factors affecting the birth of the first child in Indonesia are area of residence, educational history and age.
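A sketch of the modelling steps using the lifelines package (assumed installed): fit a Cox proportional hazards model for time to first birth, then check the proportional hazards assumption, whose violation would motivate the extended (time-varying) Cox model. The tiny data frame and column names below are hypothetical, and the extended Cox fit itself is not shown.

```python
# Sketch: Cox PH fit and proportional hazards check on a hypothetical IDHS-style extract.
import pandas as pd
from lifelines import CoxPHFitter

# duration = months from marriage to first birth; event = 1 if a birth was observed.
df = pd.DataFrame({
    "duration": [14, 26, 9, 40, 18, 31, 22, 12, 35, 16],
    "event":    [1, 1, 1, 0, 1, 0, 1, 1, 0, 1],
    "urban":    [1, 0, 1, 0, 1, 1, 0, 0, 1, 0],
    "education_years": [12, 6, 16, 9, 12, 15, 6, 9, 16, 12],
    "age_at_marriage": [22, 19, 27, 18, 24, 29, 20, 21, 26, 23],
})

cph = CoxPHFitter()
cph.fit(df, duration_col="duration", event_col="event")
cph.print_summary()

# A proportionality violation for age_at_marriage would motivate an extended Cox
# model with a time-varying effect for that covariate, as in the paper.
cph.check_assumptions(df, p_value_threshold=0.05)
```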

Journal ArticleDOI
TL;DR: An inference framework on the modality of a KDE in a multivariate setting using the Gaussian kernel is developed, and the modal clustering method proposed by [1] is applied for mode hunting.
Abstract: The number of modes (also known as the modality) of a kernel density estimator (KDE) draws a lot of interest and is important in practice. In this paper, we develop an inference framework on the modality of a KDE in a multivariate setting using the Gaussian kernel. We apply the modal clustering method proposed by [1] for mode hunting. A test statistic and its asymptotic distribution are derived to assess the significance of each mode. The inference procedure is applied to both simulated and real data sets.
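The basic object of study, the mode count of a Gaussian-kernel KDE, is easy to illustrate in one dimension; the paper's multivariate modal clustering and the significance test for each mode are not reproduced in this sketch.

```python
# Sketch: counting the modes of a Gaussian-kernel KDE on a grid for several bandwidths.
import numpy as np
from scipy.stats import gaussian_kde
from scipy.signal import find_peaks

rng = np.random.default_rng(11)
data = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.8, 200)])  # two groups

for bw in (0.1, 0.3, 1.0):                       # the bandwidth controls the mode count
    kde = gaussian_kde(data, bw_method=bw)
    grid = np.linspace(data.min() - 1, data.max() + 1, 1000)
    dens = kde(grid)
    modes, _ = find_peaks(dens)                  # local maxima of the estimated density
    print(f"bandwidth factor {bw}: {len(modes)} mode(s) at {np.round(grid[modes], 2)}")
```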

Journal ArticleDOI
TL;DR: In this article, two different tools to evaluate quantile regression and predictions are proposed: MAD, to summarize forecast errors, and a fluctuation test to evaluate in-sample predictions.
Abstract: Two different tools to evaluate quantile regression forecasts are proposed: MAD, to summarize forecast errors, and a fluctuation test to evaluate in-sample predictions. The scores of the PISA test to evaluate students’ proficiency are considered. Growth analysis relates school attainment to economic growth. The analysis is complemented by investigating the estimated regression and predictions not only at the centre but also in the tails. For out-of-sample forecasts, the estimates in one wave are employed to forecast the following waves. The reliability of in-sample forecasts is controlled by excluding the part of the sample selected by a specific rule: boys to predict girls, public schools to forecast private ones, vocational schools to predict non-vocational, etc. The gradient computed in the subset is compared to its analogue computed in the full sample in order to verify the validity of the estimated equation and thus of the in-sample predictions.
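A hedged sketch of the evaluation idea: fit quantile regressions at the centre and in the tails and score out-of-sample forecasts by the MAD of the errors. The data are a simulated stand-in (not PISA scores) and the fluctuation test is not implemented.

```python
# Sketch: quantile regression forecasts evaluated by the MAD of out-of-sample errors.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(12)
n = 1000
x = rng.normal(size=n)
y = 500 + 30 * x + rng.normal(0, 40, size=n) * (1 + 0.5 * (x > 0))  # heteroskedastic scores
df = pd.DataFrame({"score": y, "x": x})
train, test = df.iloc[:800], df.iloc[800:]

for q in (0.1, 0.5, 0.9):
    fit = smf.quantreg("score ~ x", train).fit(q=q)
    pred = fit.predict(test)
    mad = np.mean(np.abs(test["score"] - pred))      # MAD of forecast errors
    print(f"q = {q}: slope = {fit.params['x']:.1f}, out-of-sample MAD = {mad:.1f}")
```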

Journal ArticleDOI
TL;DR: In this paper, the analysis of market sentiments in exchange rates is discussed, which are of great interest to trading individuals and institutional investors, and a multinomial probability model is built to capture the uncertainties in market sentiments.
Abstract: The paper deals with the analysis of market sentiments in exchange rates, which are of great interest to trading individuals and institutional investors. For example, an institutional investor or a trading individual makes better investments and minimizes losses when equipped with an understanding of market sentiments in weekly or monthly exchange returns. In the approach suggested here, a typical market sentiment is defined on the basis of a certain function of the mean and the standard error of the logarithm of the ratio of successive daily exchange rates. Based on this surmise, the market sentiments are classified into various states, whereby states are defined according to the perceptions of the market player. A multinomial probability model is built to capture the uncertainties in market sentiments. Two asymptotically distribution-free tests, namely the chi-square and the likelihood ratio test of goodness of fit for the hypothesis of symmetry in market sentiments, are suggested. Two different measures of market sentiments are proposed. The approach advocated here will be of interest to researchers, exchange rate traders and financial analysts. As an application of the proposed line of approach, we analyze weekly market sentiments governing the exchange rates of the major global currencies—EUR, GBP, SDR, YEN, ZAR, USD—using data from 2001-2012. Some interesting conclusions are revealed based on the data analysis.
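The sketch below illustrates one way such an analysis could proceed: classify weekly log-returns into bearish/neutral/bullish states using the mean plus or minus one standard error as thresholds (an illustrative rule, not necessarily the paper's), then run a chi-square goodness-of-fit test for symmetry between bullish and bearish counts. The exchange-rate series is simulated.

```python
# Sketch: sentiment states from weekly log-returns and a chi-square symmetry test.
import numpy as np
from scipy.stats import chisquare

rng = np.random.default_rng(13)
daily_rates = 1.2 * np.exp(np.cumsum(rng.normal(0, 0.004, size=5 * 252)))  # placeholder
log_ret = np.diff(np.log(daily_rates))
weekly = log_ret[: len(log_ret) // 5 * 5].reshape(-1, 5).sum(axis=1)       # weekly returns

m, se = weekly.mean(), weekly.std(ddof=1) / np.sqrt(len(weekly))
states = np.where(weekly > m + se, "bullish",
                  np.where(weekly < m - se, "bearish", "neutral"))

n_bull, n_bear = np.sum(states == "bullish"), np.sum(states == "bearish")
stat, p = chisquare([n_bull, n_bear])            # H0: bullish and bearish equally likely
print(f"bullish={n_bull}, bearish={n_bear}, chi-square p-value = {p:.3f}")
```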

Journal ArticleDOI
TL;DR: In this article, two extensions of the stochastic logistic model for fish growth have been examined and the basic features of a logistic growth rate are deeply influenced by the carrying capacity of the system and the changes are periodical with time.
Abstract: Two extensions of stochastic logistic model for fish growth have been examined. The basic features of a logistic growth rate are deeply influenced by the carrying capacity of the system and the changes are periodical with time. Introduction of a new parameter , enlarges the scope of investing the growthof different fish species. For rapid growth lying between 1 and 2 and for slowly growing.

Journal ArticleDOI
TL;DR: Results suggest that sparse Bayesian Multinomial Probit model applied to cancer progression data allows for better subclass prediction and produces more functionally relevant gene sets.
Abstract: A major limitation of expression profiling is caused by the large number of variables assessed compared to relatively small sample sizes. In this study, we developed a multinomial Probit Bayesian model which utilizes the double exponential prior to induce shrinkage and reduce the number of covariates in the model [1]. A hierarchical Sparse Bayesian Generalized Linear Model (SBGLM) was developed in order to facilitate Gibbs sampling which takes into account the progressive nature of the response variable. The method was evaluated using a published dataset (GSE6099) which contained 99 prostate cancer cell types in four different progressive stages [2]. Initially, 398 genes were selected using ordinal logistic regression with a cutoff value of 0.05 after Benjamini and Hochberg FDR correction. The dataset was randomly divided into training (N = 50) and test (N = 49) groups such that each group contained equal number of each cancer subtype. In order to obtain more robust results we performed 50 re-samplings of the training and test groups. Using the top ten genes obtained from SBGLM, we were able to achieve an average classification accuracy of 85% and 80% in training and test groups, respectively. To functionally evaluate the model performance, we used a literature mining approach called Geneset Cohesion Analysis Tool [3]. Examination of the top 100 genes produced an average functional cohesion p-value of 0.007 compared to 0.047 and 0.131 produced by classical multi-category logistic regression and Random Forest approaches, respectively. In addition, 96 percent of the SBGLM runs resulted in a GCAT literature cohesion p-value smaller than 0.047. Taken together, these results suggest that sparse Bayesian Multinomial Probit model applied to cancer progression data allows for better subclass prediction and produces more functionally relevant gene sets.

Journal ArticleDOI
TL;DR: In this paper, the authors study the performance of two specific modifications of the Weibull distribution: the exponentiated Weibull distribution and the additive Weibull distribution.
Abstract: Proposed by the Swedish engineer and mathematician Ernst Hjalmar Waloddi Weibull (1887-1979), the Weibull distribution is a probability distribution that is widely used to model lifetime data. Because of its flexibility, several modifications of the Weibull distribution have been proposed by researchers in order to best fit non-monotonic shapes. This paper studies the performance of two specific modifications of the Weibull distribution: the exponentiated Weibull distribution and the additive Weibull distribution.
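A quick sketch of how such a comparison might be run in practice: fit the exponentiated Weibull (scipy's `exponweib`) and the plain Weibull to lifetime data and compare them by AIC. The data are simulated placeholders; the additive Weibull distribution has no scipy implementation and is not shown.

```python
# Sketch: exponentiated Weibull vs. plain Weibull fit compared by AIC.
import numpy as np
from scipy.stats import exponweib, weibull_min

rng = np.random.default_rng(14)
data = exponweib.rvs(a=2.0, c=1.5, scale=3.0, size=400, random_state=rng)

ew_params = exponweib.fit(data, floc=0)                 # (a, c, loc, scale)
w_params = weibull_min.fit(data, floc=0)                # (c, loc, scale)

ll_ew = exponweib.logpdf(data, *ew_params).sum()
ll_w = weibull_min.logpdf(data, *w_params).sum()
aic = lambda ll, k: 2 * k - 2 * ll                      # k = number of free parameters
print(f"AIC exponentiated Weibull = {aic(ll_ew, 3):.1f}, AIC Weibull = {aic(ll_w, 2):.1f}")
```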

Journal ArticleDOI
TL;DR: In this article, a set of variables relative to socio-economic class, urban environment, and travel characteristics was applied to a sample consisting of workers of the São Paulo Metropolitan Area, based on the origin-destination home interview survey carried out in 1997, in order to examine the interdependence between travel patterns and the set of socioeconomic and urban environment variables.
Abstract: The main objective of this study is to analyze work travel-related behavior through a set of variables relative to socio-economic class, urban environment and travel characteristics. Principal Component Analysis was applied to a sample consisting of workers of the São Paulo Metropolitan Area, based on the origin-destination home interview survey carried out in 1997, in order to: 1) examine the interdependence between travel patterns and a set of socioeconomic and urban environment variables; 2) determine if the original database can be synthesized into components. The results enabled us to observe relations between the individual’s socio-economic class and car usage, characteristics of the urban environment and destination choices, as well as age and non-motorized travel mode choice. It is then concluded that the database can be adequately summarized in three components for subsequent analysis: 1) urban environment; 2) socio-economic class; and 3) family structure.

Journal ArticleDOI
TL;DR: In this article, the relative strength and rotational robustness of some SWT-based normality tests are investigated for multidimensional normality, including Royston's H-test and the SWT-based test proposed by Villaseñor-Alva and Gonzalez-Estrada.
Abstract: The Shapiro-Wilk test (SWT) for normality is well known for its competitive power against numerous one-dimensional alternatives. Several extensions of the SWT to multi-dimensions have also been proposed. This paper investigates the relative strength and rotational robustness of some SWT-based normality tests. In particular, Royston’s H-test and the SWT-based test proposed by Villaseñor-Alva and Gonzalez-Estrada have R packages available for testing multivariate normality; thus they are user friendly but lack rotational robustness compared to the test proposed by Fattorini. Numerical power comparison is provided for illustration along with some practical guidelines on the choice of these SWT-type tests in practice.

Journal ArticleDOI
TL;DR: In this paper, a general framework for large scale modeling of macroeconomic and financial time series is introduced, which is characterized by simplicity of implementation and performs well independently of persistence and heteroskedasticity properties, accounting for common deterministic and============stochastic factors.
Abstract: In the paper, a general framework for large scale modeling of macroeconomic and financial time series is introduced. The proposed approach is characterized by simplicity of implementation, performing well independently of persistence and heteroskedasticity properties, accounting for common deterministic and stochastic factors. Monte Carlo results strongly support the proposed methodology, validating its use also for relatively small cross-sectional and temporal samples.

Journal ArticleDOI
TL;DR: Performance on the Trail Making Test B did not correlate with pain, fatigue, depression, anxiety, or sensation of rest, and TMT-B cannot be considered fully validated.
Abstract: Introduction: Cognitive impairment is common in patients with cancer; however, studies examining the adaptation and validation of instruments for use in patients with cancer are scarce. Purpose: The purpose of this study was to validate the Trail Making Test B (TMT-B) for use in patients with cancer. Methods: Ninety-four outpatients receiving palliative treatment and 39 healthy companions were assessed. Patients were tested with the TMT-B and answered questions regarding the presence and intensity of pain, fatigue, quality of sleep, anxiety, and depression, at two time points with a 7-day inter-assessment interval. Results: The instrument discriminated between patients, who were slower, and healthy companions with respect to the time required to complete the test, but not in terms of the number of errors. The test was stable for the healthy companions across the two assessments in terms of time to complete the TMT-B and the number of errors; for patients, the instrument was stable only for the number of errors. Performance on the TMT-B did not correlate with pain, fatigue, depression, anxiety, or sensation of rest. Conclusions: TMT-B cannot be considered fully validated. Further studies incorporating and comparing other instruments evaluating executive function and mental flexibility are needed.

Journal ArticleDOI
TL;DR: It is shown that many of the common choices in hypothesis testing lead to a severely underpowered form of theory evaluation, that confirmatory methods are required in the context of theory evaluation, and that the scientific literature would benefit from a clearer distinction between confirmatory and exploratory findings.
Abstract: Experimental studies are usually designed with specific expectations about the results in mind. However, most researchers apply some form of omnibus test to test for any differences, with follow-up tests like pairwise comparisons or simple effects analyses for further investigation of the effects. The power to find full support for the theory with such an exploratory approach, which is usually based on multiple testing, is, however, rather disappointing. With the simulations in this paper we showed that many of the common choices in hypothesis testing led to a severely underpowered form of theory evaluation. Furthermore, some less commonly used approaches were presented and a comparison of results in terms of power to find support for the theory was made. We concluded that confirmatory methods are required in the context of theory evaluation and that the scientific literature would benefit from a clearer distinction between confirmatory and exploratory findings. Also, we emphasize the importance of reporting all tests, significant or not, including the appropriate sample statistics like means and standard deviations. Another recommendation is related to the fact that researchers, when they discuss the conclusions of their own study, seem to underestimate the role of sampling variability. The execution of more replication studies, in combination with proper reporting of all results, provides insight into between-study variability and the amount of chance findings.
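A toy version of the power contrast discussed above, under illustrative effect sizes: with three groups expected to satisfy mu1 < mu2 < mu3, the exploratory route (omnibus F test plus all pairwise comparisons significant in the predicted direction) is compared with a single confirmatory linear-contrast test. This is only a sketch of the general point, not the paper's simulation design.

```python
# Sketch: power of "full support via omnibus + pairwise tests" vs. one contrast test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(15)
n, mus, reps, alpha = 20, (0.0, 0.3, 0.6), 2000, 0.05
w = np.array([-1.0, 0.0, 1.0])                     # contrast weights for the ordering

full_support = contrast_power = 0
for _ in range(reps):
    g = [rng.normal(m, 1.0, n) for m in mus]
    # (a) omnibus F, then every pairwise comparison significant and correctly ordered
    omni_p = stats.f_oneway(*g).pvalue
    pairs = [(0, 1), (1, 2), (0, 2)]
    pair_ok = all(
        stats.ttest_ind(g[j], g[i]).pvalue < alpha and g[j].mean() > g[i].mean()
        for i, j in pairs
    )
    full_support += (omni_p < alpha) and pair_ok
    # (b) one confirmatory linear-contrast test (one-sided)
    means = np.array([x.mean() for x in g])
    mse = np.mean([x.var(ddof=1) for x in g])
    se = np.sqrt(mse * np.sum(w**2 / n))
    t = w @ means / se
    contrast_power += stats.t.sf(t, df=3 * (n - 1)) < alpha

print(f"power, exploratory 'full support': {full_support / reps:.2f}")
print(f"power, confirmatory contrast:      {contrast_power / reps:.2f}")
```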