
Showing papers on "Sample size determination published in 2008"


Journal ArticleDOI
TL;DR: In this article, a broad suite of algorithms with independent presence-absence data from multiple species and regions were evaluated for 46 species (from six different regions of the world) at three sample sizes (100, 30 and 10 records).
Abstract: A wide range of modelling algorithms is used by ecologists, conservation practitioners, and others to predict species ranges from point locality data. Unfortunately, the amount of data available is limited for many taxa and regions, making it essential to quantify the sensitivity of these algorithms to sample size. This is the first study to address this need by rigorously evaluating a broad suite of algorithms with independent presence‐absence data from multiple species and regions. We evaluated predictions from 12 algorithms for 46 species (from six different regions of the world) at three sample sizes (100, 30, and 10 records). We used data from natural history collections to run the models, and evaluated the quality of model predictions with area under the receiver operating characteristic curve (AUC). With decreasing sample size, model accuracy decreased and variability increased across species and between models. Novel modelling methods that incorporate both interactions between predictor variables and complex response shapes (i.e. GBM, MARS-INT, BRUTO) performed better than most methods at large sample sizes but not at the smallest sample sizes. Other algorithms were much less sensitive to sample size, including an algorithm based on maximum entropy (MAXENT) that had among the best predictive power across all sample sizes. Relative to other algorithms, a distance metric algorithm (DOMAIN) and a genetic algorithm (OM-GARP) had intermediate performance at the largest sample size and among the best performance at the lowest sample size. No algorithm predicted consistently well with small sample size ( n < 30) and this should encourage highly conservative use of predictions based on small sample size and restrict their use to exploratory modelling.

1,906 citations


Journal ArticleDOI
TL;DR: In this paper, confidence intervals constructed around a desired or anticipated value are shown to help determine the sample size needed for a pilot study, and sample sizes ranging from 10 to 40 per group are evaluated for their adequacy in providing estimates precise enough to meet a variety of possible aims.
Abstract: There is little published guidance concerning how large a pilot study should be. General guidelines, for example using 10% of the sample required for a full study, may be inadequate for aims such as assessment of the adequacy of instrumentation or providing statistical estimates for a larger study. This article illustrates how confidence intervals constructed around a desired or anticipated value can help determine the sample size needed. Samples ranging in size from 10 to 40 per group are evaluated for their adequacy in providing estimates precise enough to meet a variety of possible aims. General sample size guidelines by type of aim are offered.
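As a rough companion to the article's approach, the sketch below (an illustration, not the article's own tables) shows how the half-width of a 95% confidence interval around a pilot-study mean shrinks as the per-group sample size grows from 10 to 40; the assumed standard deviation of 1.0 is an arbitrary value.

```python
# Sketch (not the article's own tables): how the half-width of a 95% CI
# around a pilot-study mean shrinks as the per-group sample size grows.
# The assumed SD of 1.0 is an arbitrary illustration value.
from scipy import stats

sd = 1.0  # assumed outcome standard deviation (illustrative)
for n in (10, 20, 30, 40):
    t_crit = stats.t.ppf(0.975, n - 1)      # two-sided 95% critical value
    half_width = t_crit * sd / n ** 0.5     # CI half-width for the mean
    print(f"n = {n:2d}: mean +/- {half_width:.2f} SD units")
```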

1,449 citations


Journal ArticleDOI
TL;DR: There is little empirical support for the use of .05 or any other value as universal cutoff values to determine adequate model fit, regardless of whether the point estimate is used alone or jointly with the confidence interval.
Abstract: This article is an empirical evaluation of the choice of fixed cutoff points in assessing the root mean square error of approximation (RMSEA) test statistic as a measure of goodness-of-fit in Structural Equation Models. Using simulation data, the authors first examine whether there is any empirical evidence for the use of a universal cutoff, and then compare the practice of using the point estimate of the RMSEA alone versus that of using it jointly with its related confidence interval. The results of the study demonstrate that there is little empirical support for the use of .05 or any other value as universal cutoff values to determine adequate model fit, regardless of whether the point estimate is used alone or jointly with the confidence interval. The authors' analyses suggest that to achieve a certain level of power or Type I error rate, the choice of cutoff values depends on model specifications, degrees of freedom, and sample size.
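For readers unfamiliar with the statistic, the following minimal sketch computes the standard RMSEA point estimate from a chi-square value, its degrees of freedom and the sample size; the input numbers are invented for illustration and are not from the article's simulations.

```python
# Minimal sketch of the RMSEA point estimate (standard formula; the
# chi-square value, df and N below are made-up illustration numbers).
import math

def rmsea(chi2: float, df: int, n: int) -> float:
    """Root mean square error of approximation."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

print(rmsea(chi2=85.3, df=40, n=250))   # same chi2/df ratio gives a smaller
print(rmsea(chi2=85.3, df=40, n=1000))  # RMSEA as the sample size grows
```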

1,159 citations


Journal ArticleDOI
TL;DR: A meta-analysis of experimental studies to determine the effectiveness of stress management interventions in occupational settings suggested that intervention type played a moderating role, but if additional treatment components were added the effect was reduced.
Abstract: A meta-analysis was conducted to determine the effectiveness of stress management interventions in occupational settings. Thirty-six experimental studies were included, representing 55 interventions. Total sample size was 2,847. Of the participants, 59% were female, mean age was 35.4, and average length of intervention was 7.4 weeks. The overall weighted effect size (Cohen's d) for all studies was 0.526 (95% confidence interval = 0.364, 0.687), a significant medium to large effect. Interventions were coded as cognitive-behavioral, relaxation, organizational, multimodal, or alternative. Analyses based on these subgroups suggested that intervention type played a moderating role. Cognitive-behavioral programs consistently produced larger effects than other types of interventions, but if additional treatment components were added the effect was reduced. Within the sample of studies, relaxation interventions were most frequently used, and organizational interventions continued to be scarce. Effects were based mainly on psychological outcome variables, as opposed to physiological or organizational measures. The examination of additional moderators such as treatment length, outcome variable, and occupation did not reveal significant variations in effect size by intervention type.

1,125 citations


Journal ArticleDOI
TL;DR: As precision increases, while estimates of the heterogeneity variance τ2 remain unchanged on average, estimates of I2 increase rapidly to nearly 100%.
Abstract: The heterogeneity statistic I2, interpreted as the percentage of variability due to heterogeneity between studies rather than sampling error, depends on precision, that is, the size of the studies included. Based on a real meta-analysis, we simulate artificially 'inflating' the sample size under the random effects model. For a given inflation factor M = 1, 2, 3,... and for each trial i, we create an M-inflated trial by drawing a treatment effect estimate from the random effects model, using the original within-trial sampling variance divided by M as the within-trial sampling variance. As precision increases, while estimates of the heterogeneity variance τ2 remain unchanged on average, estimates of I2 increase rapidly to nearly 100%. A similar phenomenon is apparent in a sample of 157 meta-analyses. When deciding whether or not to pool treatment estimates in a meta-analysis, the yardstick should be the clinical relevance of any heterogeneity present. τ2, rather than I2, is the appropriate measure for this purpose.
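The inflation argument can be mimicked with a small simulation (illustrative only, not the authors' data): holding τ2 fixed while dividing the within-trial variances by M drives I2 toward 100%.

```python
# Sketch of the paper's "inflation" idea (illustrative simulation, not the
# authors' data): within-trial variances are divided by M, tau^2 is held
# fixed, and I^2 = max(0, (Q - df)/Q) climbs toward 100%.
import numpy as np

rng = np.random.default_rng(0)
k, mu, tau2 = 15, 0.2, 0.04           # trials, overall effect, heterogeneity
base_var = rng.uniform(0.02, 0.2, k)  # assumed within-trial variances

for M in (1, 10, 100):
    v = base_var / M                           # M-inflated trials
    y = rng.normal(mu, np.sqrt(tau2 + v))      # observed treatment effects
    w = 1.0 / v
    q = np.sum(w * (y - np.average(y, weights=w)) ** 2)
    i2 = max(0.0, (q - (k - 1)) / q)
    print(f"M = {M:3d}: I^2 = {100 * i2:5.1f}%")
```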

808 citations


Journal ArticleDOI
TL;DR: The limitations and usefulness of each method are addressed in order to give researchers guidance in constructing appropriate estimates of biomarkers' true discriminating capabilities.
Abstract: The receiver operating characteristic (ROC) curve is used to evaluate a biomarker's ability for classifying disease status. The Youden Index (J), the maximum potential effectiveness of a biomarker, is a common summary measure of the ROC curve. In biomarker development, levels may be unquantifiable below a limit of detection (LOD) and missing from the overall dataset. Disregarding these observations may negatively bias the ROC curve and thus J. Several correction methods have been suggested for mean estimation and testing; however, little has been written about the ROC curve or its summary measures. We adapt non-parametric (empirical) and semi-parametric (ROC-GLM [generalized linear model]) methods and propose parametric methods (maximum likelihood, ML) to estimate J and the optimal cut-point (c*) for a biomarker affected by a LOD. We develop unbiased estimators of J and c* via ML for normally and gamma distributed biomarkers. Alpha-level confidence intervals are proposed using delta and bootstrap methods for the ML, semi-parametric, and non-parametric approaches respectively. Simulation studies are conducted over a range of distributional scenarios and sample sizes evaluating estimators' bias, root mean square error, and coverage probability; the average bias was less than one percent for ML and GLM methods across scenarios and decreased with increased sample size. An example using polychlorinated biphenyl levels to classify women with and without endometriosis illustrates the potential benefits of these methods. We address the limitations and usefulness of each method in order to give researchers guidance in constructing appropriate estimates of biomarkers' true discriminating capabilities.
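A minimal sketch of the underlying quantity, assuming normally distributed biomarker values and ignoring the limit-of-detection problem the paper actually addresses: the Youden index J and the cut-point c* can be found by a simple grid search. The group means and standard deviations are illustration values.

```python
# Sketch (assumed normal biomarker, no limit-of-detection correction):
# grid-search the Youden index J = max_c [sens(c) + spec(c) - 1] and its
# optimal cut-point c*; the means/SDs are illustration values.
import numpy as np
from scipy.stats import norm

mu0, sd0 = 0.0, 1.0   # healthy group (assumed)
mu1, sd1 = 1.5, 1.2   # diseased group (assumed)

grid = np.linspace(-4, 6, 2001)
sens = 1 - norm.cdf(grid, mu1, sd1)   # P(marker > c | diseased)
spec = norm.cdf(grid, mu0, sd0)       # P(marker <= c | healthy)
j = sens + spec - 1
best = np.argmax(j)
print(f"J = {j[best]:.3f} at cut-point c* = {grid[best]:.3f}")
```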

801 citations


Journal ArticleDOI
TL;DR: Methods that standardize A on the basis of the smallest sample size in a comparison are preferable to other approaches, and rarefaction and repeated random subsampling provide unbiased estimates of A with the greatest precision.

Abstract: Although differences in sampling intensity can bias comparisons of allelic richness (A) among populations, investigators often fail to correct estimates of A for differences in sample size. Methods that standardize A on the basis of the smallest sample size in a comparison are preferable to other approaches. Rarefaction and repeated random subsampling provide unbiased estimates of A with the greatest precision and thus provide greatest statistical power to detect differences in variation. Less promising approaches, in terms of bias or precision, include single random subsampling, eliminating very small samples, using sample size as a covariate or extrapolating estimates obtained from small samples to a larger number of individuals.
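A sketch of the recommended standardization by repeated random subsampling, using made-up allele lists: each population is repeatedly subsampled (without replacement) down to the smallest sample size and the number of distinct alleles is averaged over draws.

```python
# Sketch of standardizing allelic richness by repeated random subsampling:
# every population is subsampled (without replacement) down to the size of
# the smallest sample, and the distinct-allele count is averaged over draws.
# The allele data below are made up for illustration.
import random

populations = {
    "pop_A": ["a1", "a1", "a2", "a3", "a3", "a4", "a5", "a5", "a6", "a7"],
    "pop_B": ["a1", "a2", "a2", "a3", "a8"],
}
g = min(len(alleles) for alleles in populations.values())  # smallest sample

random.seed(1)
for name, alleles in populations.items():
    draws = [len(set(random.sample(alleles, g))) for _ in range(1000)]
    print(f"{name}: richness standardized to n = {g}: {sum(draws) / len(draws):.2f}")
```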

570 citations


Journal ArticleDOI
TL;DR: This work studies approximations of optimization problems with probabilistic constraints in which the original distribution of the underlying random vector is replaced with an empirical distribution obtained from a random sample to obtain a lower bound to the true optimal value.
Abstract: We study approximations of optimization problems with probabilistic constraints in which the original distribution of the underlying random vector is replaced with an empirical distribution obtained from a random sample. We show that such a sample approximation problem with a risk level larger than the required risk level will yield a lower bound to the true optimal value with probability approaching one exponentially fast. This leads to an a priori estimate of the sample size required to have high confidence that the sample approximation will yield a lower bound. We then provide conditions under which solving a sample approximation problem with a risk level smaller than the required risk level will yield feasible solutions to the original problem with high probability. Once again, we obtain a priori estimates on the sample size required to obtain high confidence that the sample approximation problem will yield a feasible solution to the original problem. Finally, we present numerical illustrations of how these results can be used to obtain feasible solutions and optimality bounds for optimization problems with probabilistic constraints.
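The core of the sample approximation can be sketched as follows (hypothetical constraint g and candidate solution x; not the paper's algorithm): the probabilistic constraint P(g(x, ξ) ≤ 0) ≥ 1 − ε is replaced by the requirement that at most ⌊γN⌋ of N sampled scenarios are violated, with γ chosen below ε to favour feasibility of the resulting solution.

```python
# Sketch of a sample (scenario) approximation of a probabilistic constraint:
# P(g(x, xi) <= 0) >= 1 - eps is replaced by "at most floor(gamma * N) of N
# sampled scenarios may be violated". The constraint g and the candidate x
# below are made-up illustrations.
import numpy as np

rng = np.random.default_rng(42)

def g(x: np.ndarray, xi: np.ndarray) -> float:
    """Hypothetical constraint: random demand xi must not exceed capacity x.sum()."""
    return xi.sum() - x.sum()

x_candidate = np.array([1.0, 1.2])
eps, gamma, n_scenarios = 0.05, 0.03, 2000   # gamma < eps, per the paper's idea

scenarios = rng.normal(loc=0.9, scale=0.3, size=(n_scenarios, 2))
violations = sum(g(x_candidate, xi) > 0 for xi in scenarios)
feasible = violations <= int(gamma * n_scenarios)
print(f"violated {violations}/{n_scenarios} scenarios -> "
      f"{'accept' if feasible else 'reject'} x under the sample approximation")
```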

568 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider Bayesian regression with normal and double-exponential priors as forecasting methods based on large panels of time series and show that these forecasts are highly correlated with principal component forecasts and that they perform equally well for a wide range of prior choices.

372 citations


Journal ArticleDOI
TL;DR: A large number of clinical research studies are conducted, including audits of patient data, observational studies, clinical trials and those based on laboratory analyses; there needs to be a balance between studies that can be performed quickly and those that should be based on more subjects and hence may take several years to complete.
Abstract: A large number of clinical research studies are conducted, including audits of patient data, observational studies, clinical trials and those based on laboratory analyses. While small studies can be published over a short time-frame, there needs to be a balance between those that can be performed quickly and those that should be based on more subjects and hence may take several years to complete. The present article provides an overview of the main considerations associated with small studies. The definition of “small” depends on the main study objective. When simply describing the characteristics of a single group of subjects, for example the prevalence of smoking, the larger the study the more reliable the results. The main results should have 95% confidence intervals (CI), and the width of these depends directly on the sample size: large studies produce narrow intervals and, therefore, more precise results. A study of 20 subjects, for example, is likely to be too small for most investigations. For example, imagine that the proportion of smokers among a particular group of 20 individuals is 25%. The associated 95% CI is 9–49%. This means that the true prevalence in these subjects could be anywhere from a low to a high value, which is not a useful result. When comparing characteristics between two or more groups of subjects (e.g. examining risk factors or treatments for disease), the size of the study depends on the magnitude of the expected effect size, which is usually quantified by a relative risk, odds ratio, absolute risk difference, hazard ratio, or difference between two means or medians. The smaller the true effect size, the larger the study needs to be 1, 2. This is because it is more difficult to distinguish between a real effect and random variation. Consider mortality as the end-point in a …
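The worked example in the abstract (5 smokers out of 20, i.e. 25%) can be reproduced with an exact Clopper-Pearson interval computed from beta quantiles; the article does not say which interval method it used, but the exact interval gives roughly 9–49%.

```python
# Reproducing the abstract's worked example (5 smokers out of 20, i.e. 25%)
# with an exact Clopper-Pearson 95% interval via beta quantiles; the interval
# is roughly 9-49%, matching the text.
from scipy.stats import beta

k, n, alpha = 5, 20, 0.05
lower = beta.ppf(alpha / 2, k, n - k + 1) if k > 0 else 0.0
upper = beta.ppf(1 - alpha / 2, k + 1, n - k) if k < n else 1.0
print(f"95% CI: {100 * lower:.0f}% to {100 * upper:.0f}%")  # ~9% to ~49%
```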

362 citations


Journal ArticleDOI
TL;DR: In this paper, a review examines recent advances in sample size planning, not only from the perspective of an individual researcher, but also with regard to the goal of developing cumulative knowledge.
Abstract: This review examines recent advances in sample size planning, not only from the perspective of an individual researcher, but also with regard to the goal of developing cumulative knowledge. Psychologists have traditionally thought of sample size planning in terms of power analysis. Although we review recent advances in power analysis, our main focus is the desirability of achieving accurate parameter estimates, either instead of or in addition to obtaining sufficient power. Accuracy in parameter estimation (AIPE) has taken on increasing importance in light of recent emphasis on effect size estimation and formation of confidence intervals. The review provides an overview of the logic behind sample size planning for AIPE and summarizes recent advances in implementing this approach in designs commonly used in psychological research.
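A minimal sketch of the AIPE logic for the simplest case, a single mean: choose n so that the expected 95% confidence interval half-width does not exceed a target ω. The assumed σ and ω are illustration values; the review itself covers far more designs and refinements.

```python
# Minimal sketch of the AIPE idea for a single mean: pick n so that the
# expected 95% CI half-width is no wider than a target omega. (Illustrative
# only; the review covers many more designs and refinements.)
from scipy import stats

sigma, omega = 10.0, 2.0                 # assumed SD and desired half-width
n = int((stats.norm.ppf(0.975) * sigma / omega) ** 2) + 1   # normal-based start
while stats.t.ppf(0.975, n - 1) * sigma / n ** 0.5 > omega:
    n += 1                               # refine using the t critical value
print(f"planned n for half-width <= {omega}: {n}")
```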

Journal ArticleDOI
TL;DR: The exact likelihood approach is the preferred method and should be used whenever feasible, because it consistently performs better than the approximate approach and gives unbiased estimates.

Journal ArticleDOI
TL;DR: When planning a study reporting differences among groups of patients or describing some variable in a single group, sample size should be considered because it allows the researcher to control for the risk of reporting a false-negative finding or to estimate the precision his or her experiment will yield.
Abstract: The increasing volume of research by the medical community often leads to increasing numbers of contradictory findings and conclusions. Although the differences observed may represent true differences, the results also may differ because of sampling variability as all studies are performed on a limited number of specimens or patients. When planning a study reporting differences among groups of patients or describing some variable in a single group, sample size should be considered because it allows the researcher to control for the risk of reporting a false-negative finding (Type II error) or to estimate the precision his or her experiment will yield. Equally important, readers of medical journals should understand sample size because such understanding is essential to interpret the relevance of a finding with regard to their own patients. At the time of planning, the investigator must establish (1) a justifiable level of statistical significance, (2) the chances of detecting a difference of given magnitude between the groups compared, ie, the power, (3) this targeted difference (ie, effect size), and (4) the variability of the data (for quantitative data). We believe correct planning of experiments is an ethical issue of concern to the entire community.
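A sketch of the textbook normal-approximation calculation that combines the four planning inputs listed above for a two-group comparison of means; the standard deviation and targeted difference are illustration values.

```python
# Sketch of the textbook normal-approximation sample-size formula for
# comparing two means: n per group = 2 * (z_{1-a/2} + z_{1-b})^2 * sigma^2 / delta^2.
# The effect size and SD below are illustration values.
import math
from scipy.stats import norm

alpha, power = 0.05, 0.80
sigma, delta = 15.0, 10.0                      # assumed SD and targeted difference
z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
n_per_group = math.ceil(2 * z ** 2 * sigma ** 2 / delta ** 2)
print(f"~{n_per_group} subjects per group")
```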

Journal ArticleDOI
TL;DR: It was concluded that mean square statistics were relatively independent of sample size for polytomous data and that misfit to the model could be identified using published recommended ranges.
Abstract: BACKGROUND: Previous research on educational data has demonstrated that Rasch fit statistics (mean squares and t-statistics) are highly susceptible to sample size variation for dichotomously scored rating data, although little is known about this relationship for polytomous data. These statistics help inform researchers about how well items fit to a unidimensional latent trait, and are an important adjunct to modern psychometrics. Given the increasing use of Rasch models in health research the purpose of this study was therefore to explore the relationship between fit statistics and sample size for polytomous data. METHODS: Data were collated from a heterogeneous sample of cancer patients (n = 4072) who had completed both the Patient Health Questionnaire - 9 and the Hospital Anxiety and Depression Scale. Ten samples were drawn with replacement for each of eight sample sizes (n = 25 to n = 3200). The Rating and Partial Credit Models were applied and the mean square and t-fit statistics (infit/outfit) derived for each model. RESULTS: The results demonstrated that t-statistics were highly sensitive to sample size, whereas mean square statistics remained relatively stable for polytomous data. CONCLUSION: It was concluded that mean square statistics were relatively independent of sample size for polytomous data and that misfit to the model could be identified using published recommended ranges.

Journal ArticleDOI
TL;DR: A fundamental asymptotic limit of sample-eigenvalue-based detection of weak or closely spaced high-dimensional signals from a limited sample size is highlighted; this motivates the heuristic definition of the effective number of identifiable signals which is equal to the number of "signal" eigenvalues of the population covariance matrix.
Abstract: The detection and estimation of signals in noisy, limited data is a problem of interest to many scientific and engineering communities. We present a mathematically justifiable, computationally simple, sample-eigenvalue-based procedure for estimating the number of high-dimensional signals in white noise using relatively few samples. The main motivation for considering a sample-eigenvalue-based scheme is the computational simplicity and the robustness to eigenvector modelling errors which can adversely impact the performance of estimators that exploit information in the sample eigenvectors. There is, however, a price we pay by discarding the information in the sample eigenvectors; we highlight a fundamental asymptotic limit of sample-eigenvalue-based detection of weak or closely spaced high-dimensional signals from a limited sample size. This motivates our heuristic definition of the effective number of identifiable signals which is equal to the number of "signal" eigenvalues of the population covariance matrix which exceed the noise variance by a factor strictly greater than a certain threshold. The fundamental asymptotic limit brings into sharp focus why, when there are too few samples available so that the effective number of signals is less than the actual number of signals, underestimation of the model order is unavoidable (in an asymptotic sense) when using any sample-eigenvalue-based detection scheme, including the one proposed herein. The analysis reveals why adding more sensors can only exacerbate the situation. Numerical simulations are used to demonstrate that the proposed estimator, like Wax and Kailath's MDL-based estimator, consistently estimates the true number of signals in the dimension fixed, large sample size limit and the effective number of identifiable signals, unlike Wax and Kailath's MDL-based estimator, in the large dimension, (relatively) large sample size limit.
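The flavour of the limit can be illustrated with a small simulation (not the paper's estimator): pure-noise sample eigenvalues spread up to roughly the Marchenko-Pastur edge σ²(1 + √(p/n))², which is used here as an assumed detection boundary, so a weak population spike fails to separate from the noise bulk even though it is a genuine signal.

```python
# Illustrative simulation (not the paper's estimator): with p comparable to n,
# pure-noise sample eigenvalues spread up to roughly the Marchenko-Pastur edge
# sigma^2 * (1 + sqrt(p/n))^2, so weak "signal" eigenvalues can hide in the bulk.
import numpy as np

rng = np.random.default_rng(0)
p, n, sigma2 = 100, 200, 1.0
edge = sigma2 * (1 + np.sqrt(p / n)) ** 2        # noise bulk upper edge (~2.91)

# Population covariance: identity noise plus two "signal" spikes.
pop_eigs = np.ones(p) * sigma2
pop_eigs[:2] = [6.0, 1.5]                        # one strong, one weak spike

x = rng.standard_normal((n, p)) * np.sqrt(pop_eigs)   # samples with that covariance
sample_eigs = np.sort(np.linalg.eigvalsh(x.T @ x / n))[::-1]
print(f"MP edge ~ {edge:.2f}; top sample eigenvalues: {sample_eigs[:4].round(2)}")
print(f"eigenvalues above the edge: {(sample_eigs > edge).sum()} (signals present: 2)")
```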

Journal ArticleDOI
TL;DR: In this article, a matrix perturbation approach was used to study the nonasymptotic relation between the eigenvalues and eigenvectors of PCA computed on a finite sample of size n and those of the limiting population PCA as n → ∞.
Abstract: Principal component analysis (PCA) is a standard tool for dimensional reduction of a set of n observations (samples), each with p variables. In this paper, using a matrix perturbation approach, we study the nonasymptotic relation between the eigenvalues and eigenvectors of PCA computed on a finite sample of size n, and those of the limiting population PCA as n → ∞. As in machine learning, we present a finite sample theorem which holds with high probability for the closeness between the leading eigenvalue and eigenvector of sample PCA and population PCA under a spiked covariance model. In addition, we also consider the relation between finite sample PCA and the asymptotic results in the joint limit p, n → ∞, with p/n = c. We present a matrix perturbation view of the "phase transition phenomenon," and a simple linear-algebra based derivation of the eigenvalue and eigenvector overlap in this asymptotic limit. Moreover, our analysis also applies for finite p, n where we show that although there is no sharp phase transition as in the infinite case, either as a function of noise level or as a function of sample size n, the eigenvector of sample PCA may exhibit a sharp "loss of tracking," suddenly losing its relation to the (true) eigenvector of the population PCA matrix. This occurs due to a crossover between the eigenvalue due to the signal and the largest eigenvalue due to noise, whose eigenvector points in a random direction.
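A sketch of the "loss of tracking" effect under a spiked covariance model with illustrative values: the overlap between the leading sample and population eigenvectors is large for a strong spike and collapses for a weak one.

```python
# Sketch of the "loss of tracking" effect under a spiked covariance model
# (illustrative values, not the paper's analysis): the overlap between the
# leading sample and population eigenvectors collapses once the spike is weak
# relative to the dimension-to-sample-size ratio p/n.
import numpy as np

rng = np.random.default_rng(1)
p, n = 200, 400
v_pop = np.zeros(p)
v_pop[0] = 1.0                                   # population leading eigenvector

for spike in (4.0, 1.2):                         # strong vs weak signal eigenvalue
    eigs = np.ones(p)
    eigs[0] = spike
    x = rng.standard_normal((n, p)) * np.sqrt(eigs)
    _, vecs = np.linalg.eigh(x.T @ x / n)        # eigenvalues in ascending order
    overlap = abs(vecs[:, -1] @ v_pop)           # leading sample eigenvector
    print(f"spike {spike}: |overlap with true eigenvector| = {overlap:.2f}")
```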

Journal ArticleDOI
TL;DR: Data-driven selection of the optimal cutoff value can lead to overly optimistic estimates of sensitivity and specificity, especially in small studies; alternative methods can reduce this bias, but finding robust estimates for cutoff values and accuracy requires considerable sample sizes.
Abstract: Background: Optimal cutoff values for tests results involving continuous variables are often derived in a data-driven way. This approach, however, may lead to overly optimistic measures of diagnostic accuracy. We evaluated the magnitude of the bias in sensitivity and specificity associated with data-driven selection of cutoff values and examined potential solutions to reduce this bias. Methods: Different sample sizes, distributions, and prevalences were used in a simulation study. We compared data-driven estimates of accuracy based on the Youden index with the true values and calculated the median bias. Three alternative approaches (assuming a specific distribution, leave-one-out, smoothed ROC curve) were examined for their ability to reduce this bias. Results: The magnitude of bias caused by data-driven optimization of cutoff values was inversely related to sample size. If the true values for sensitivity and specificity are both 84%, the estimates in studies with a sample size of 40 will be approximately 90%. If the sample size increases to 200, the estimates will be 86%. The distribution of the test results had little impact on the amount of bias when sample size was held constant. More robust methods of optimizing cutoff values were less prone to bias, but the performance deteriorated if the underlying assumptions were not met. Conclusions: Data-driven selection of the optimal cutoff value can lead to overly optimistic estimates of sensitivity and specificity, especially in small studies. Alternative methods can reduce this bias, but finding robust estimates for cutoff values and accuracy requires considerable sample sizes.
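A small simulation in the spirit of the study, though not its exact design: choosing the cutoff that maximizes the Youden index in a sample of 40 (20 per group) makes the apparent accuracy noticeably higher than the accuracy of that same cutoff under the true generating distributions, which are set so that true sensitivity and specificity are about 84%.

```python
# Small simulation in the spirit of the study (not its exact design): pick the
# cutoff that maximizes the Youden index in a sample of 40, then compare the
# apparent sensitivity/specificity with the true values under the generating
# normal distributions (true sens = spec ~ 84% when means are 2 SD apart).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
n_sims, n_per_group = 2000, 20
apparent, true = [], []

for _ in range(n_sims):
    healthy = rng.normal(0.0, 1.0, n_per_group)
    diseased = rng.normal(2.0, 1.0, n_per_group)
    cuts = np.sort(np.concatenate([healthy, diseased]))
    sens = (diseased[None, :] > cuts[:, None]).mean(axis=1)
    spec = (healthy[None, :] <= cuts[:, None]).mean(axis=1)
    best = np.argmax(sens + spec - 1)            # data-driven Youden cutoff
    apparent.append((sens[best] + spec[best]) / 2)
    true.append((1 - norm.cdf(cuts[best], 2, 1) + norm.cdf(cuts[best], 0, 1)) / 2)

print(f"apparent accuracy ~ {100 * np.mean(apparent):.0f}%, "
      f"true accuracy at the chosen cutoff ~ {100 * np.mean(true):.0f}%")
```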

Journal ArticleDOI
TL;DR: The log-binomial method results in less bias in most common situations, and because it fits the correct model and obtains maximum likelihood estimates, it generally results in slightly higher power, smaller standard errors, and, unlike the Robust Poisson, it always yields estimated prevalences between zero and one.
Abstract: It is usually preferable to model and estimate prevalence ratios instead of odds ratios in cross-sectional studies when diseases or injuries are not rare. Problems with existing methods of modeling prevalence ratios include lack of convergence, overestimated standard errors, and extrapolation of simple univariate formulas to multivariable models. We compare two of the newer methods using simulated data and real data from SAS online examples. The Robust Poisson method, which uses the Poisson distribution and a sandwich variance estimator, is compared to the log-binomial method, which uses the binomial distribution to obtain maximum likelihood estimates, using computer simulations and real data. For very high prevalences and moderate sample size, the Robust Poisson method yields less biased estimates of the prevalence ratios than the log-binomial method. However, for moderate prevalences and moderate sample size, the log-binomial method yields slightly less biased estimates than the Robust Poisson method. In nearly all cases, the log-binomial method yielded slightly higher power and smaller standard errors than the Robust Poisson method. Although the Robust Poisson often gives reasonable estimates of the prevalence ratio and is very easy to use, the log-binomial method results in less bias in most common situations, and because it fits the correct model and obtains maximum likelihood estimates, it generally results in slightly higher power, smaller standard errors, and, unlike the Robust Poisson, it always yields estimated prevalences between zero and one.
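A hedged sketch of fitting both models with statsmodels on made-up data; the API spellings follow recent statsmodels releases (older versions write the log link as links.log()), so treat the exact calls as assumptions rather than a definitive recipe.

```python
# Hedged sketch of the two approaches with statsmodels (API names per recent
# versions; older releases spell the log link as links.log()). Data are made up.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
x = rng.binomial(1, 0.5, n)                       # exposure indicator
p = 0.4 * np.exp(0.5 * x)                         # true prevalence ratio exp(0.5)
y = rng.binomial(1, p)
X = sm.add_constant(x)

robust_poisson = sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC0")
log_binomial = sm.GLM(y, X, family=sm.families.Binomial(link=sm.families.links.Log())).fit()

for name, fit in [("Robust Poisson", robust_poisson), ("Log-binomial", log_binomial)]:
    pr = np.exp(fit.params[1])                    # estimated prevalence ratio
    print(f"{name}: PR = {pr:.2f} (SE of log PR = {fit.bse[1]:.3f})")
```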

Journal ArticleDOI
TL;DR: In this paper, the authors examined the relationship between the squared multiple correlation coefficients and minimum necessary sample sizes and found a definite relationship, similar to a negative exponential relationship, and provided guidelines for sample size needed for accurate predictions.
Abstract: When using multiple regression for prediction purposes, the issue of minimum required sample size often needs to be addressed. Using a Monte Carlo simulation, models with varying numbers of independent variables were examined and minimum sample sizes were determined for multiple scenarios at each number of independent variables. The scenarios arise from varying the levels of correlations between the criterion variable and predictor variables as well as among predictor variables. Two minimum sample sizes were determined for each scenario, a good and an excellent prediction level. The relationship between the squared multiple correlation coefficients and minimum necessary sample sizes was examined. A definite relationship, similar to a negative exponential relationship, was found between the squared multiple correlation coefficient and the minimum sample size. As the squared multiple correlation coefficient decreased, the sample size increased at an increasing rate. This study provides guidelines for sample size needed for accurate predictions.

Journal ArticleDOI
TL;DR: Overall, BCT shows better outcomes than more typical individual-based treatment for married or cohabiting individuals who seek help for alcohol dependence or drug dependence problems; the benefit of BCT for low-severity problem drinkers has received little attention.

Journal ArticleDOI
TL;DR: The comet or single-cell gel electrophoresis assay is now widely used in regulatory, mechanistic and biomonitoring studies using a range of in vitro and in vivo systems. Statistical issues associated with the design and subsequent analyses of current validation studies for the comet assay include the identification of acceptable levels of intra- and inter-laboratory repeatability and reproducibility.
Abstract: The comet or single-cell gel electrophoresis assay is now widely used in regulatory, mechanistic and biomonitoring studies using a range of in vitro and in vivo systems. Each of these has issues associated with the experimental design which determine to a large extent the statistical analyses that can be used. A key concept is that the experimental unit is the smallest 'amount' of experimental material that can be randomly assigned to a treatment: the animal for in vivo studies and the culture for in vitro studies. Biomonitoring studies, being observational rather than experimental, are vulnerable to confounding and biases. Critical factors in any statistical analysis include the identification of suitable end points, the choice of measure to represent the distribution of the comet end point in a sample of cells, estimates of variability between experimental units and the identification of the size of effects that could be considered biologically important. Power and sample size calculations can be used in conjunction with this information to identify optimum experimental sizes and provide help in combining the results of statistical analyses with other information to aid interpretation. Interpretation based upon the size of effects and their confidence intervals is preferred to that based solely upon statistical significance tests. Statistical issues associated with the design and subsequent analyses of current validation studies for the comet assay include the identification of acceptable levels of intra- and inter-laboratory repeatability and reproducibility and criteria for dichotomizing results into positive or negative.

Journal ArticleDOI
TL;DR: New estimators of the eigenvalues and eigenvectors of the covariance matrix are derived, that are shown to be consistent in a more general asymptotic setting than the traditional one and have an excellent performance in small sample size scenarios.
Abstract: The problem of estimating the eigenvalues and eigenvectors of the covariance matrix associated with a multivariate stochastic process is considered. The focus is on finite sample size situations, whereby the number of observations is limited and comparable in magnitude to the observation dimension. Using tools from random matrix theory, and assuming a certain eigenvalue splitting condition, new estimators of the eigenvalues and eigenvectors of the covariance matrix are derived, that are shown to be consistent in a more general asymptotic setting than the traditional one. Indeed, these estimators are proven to be consistent, not only when the sample size increases without bound for a fixed observation dimension, but also when the observation dimension increases to infinity at the same rate as the sample size. Numerical evaluations indicate that the estimators have an excellent performance in small sample size scenarios, where the observation dimension and the sample size are comparable in magnitude.

Journal ArticleDOI
TL;DR: Definitions of degrees of freedom range from "the number of values in a distribution that are free to vary for any particular statistic" to highly technical formulations; this note comprehensively defines degrees of freedom, explains how they are calculated, and illustrates them in commonly used analyses.
Abstract: As we were teaching a multivariate statistics course for doctoral students, one of the students in the class asked, "What are degrees of freedom? I know it is not good to lose degrees of freedom, but what are they?" Other students in the class waited for a clear-cut response. As we tried to give a textbook answer, we were not satisfied and we did not get the sense that our students understood. We looked through our statistics books to determine whether we could find a more clear way to explain this term to social work students. The wide variety of language used to define degrees of freedom is enough to confuse any social worker! Definitions range from the broad, "Degrees of freedom are the number of values in a distribution that are free to vary for any particular statistic" (Healey, 1990, p. 214), to the technical: Statisticians start with the number of terms in the sum [of squares], then subtract the number of mean values that were calculated along the way. The result is called the degrees of freedom, for reasons that reside, believe it or not, in the theory of thermodynamics. (Norman & Streiner, 2003, p. 43) Authors who have tried to be more specific have defined degrees of freedom in relation to sample size (Trochim, 2005; Weinbach & Grinnell, 2004), cell size (Salkind, 2004), the number of relationships in the data (Walker, 1940), and the difference in dimensionalities of the parameter spaces (Good, 1973). The most common definition includes the number or pieces of information that are free to vary (Healey, 1990; Jaccard & Becker, 1990; Pagano, 2004; Warner, 2008; Wonnacott & Wonnacott, 1990). These specifications do not seem to augment students' understanding of this term. Hence, degrees of freedom are conceptually difficult but are important to report to understand statistical analysis. For example, without degrees of freedom, we are unable to calculate or to understand any underlying population variability. Also, in a bivariate and multivariate analysis, degrees of freedom are a function of sample size, number of variables, and number of parameters to be estimated; therefore, degrees of freedom are also associated with statistical power. This research note is intended to comprehensively define degrees of freedom, to explain how they are calculated, and to give examples of the different types of degrees of freedom in some commonly used analyses. DEGREES OF FREEDOM DEFINED In any statistical analysis the goal is to understand how the variables (or parameters to be estimated) and observations are linked. Hence, degrees of freedom are a function of both sample size (N) (Trochim, 2005) and the number of independent variables (k) in one's model (Toothaker & Miller, 1996; Walker, 1940; Yu, 1997). The degrees of freedom are equal to the number of independent observations (N), or the number of subjects in the data, minus the number of parameters (k) estimated (Toothaker & Miller, 1996; Walker, 1940). A parameter (for example, slope) to be estimated is related to the value of an independent variable and included in a statistical equation (an additional parameter is estimated for an intercept in a general linear model). A researcher may estimate parameters using different amounts or pieces of information, and the number of independent pieces of information he or she uses to estimate a statistic or a parameter are called the degrees of freedom (df) (HyperStat Online, n.d.). For example, a researcher records income of N number of individuals from a community. 
Here he or she has N independent pieces of information (that is, N points of incomes) and one variable called income (k); in subsequent analysis of this data set, degrees of freedom are associated with both N and k. For instance, if this researcher wants to calculate sample variance to understand the extent to which incomes vary in this community, the degrees of freedom equal N - k. …
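A tiny worked version of the closing income example, with made-up values: estimating the mean uses one piece of information, so the sample variance is computed with N − 1 degrees of freedom.

```python
# Tiny worked version of the closing income example: estimating the mean uses
# up one piece of information, so the sample variance has N - 1 degrees of
# freedom. The income values are made up.
incomes = [32_000, 45_500, 28_750, 51_200, 39_900]
n = len(incomes)
mean = sum(incomes) / n
df = n - 1                                        # one parameter (the mean) estimated
sample_var = sum((x - mean) ** 2 for x in incomes) / df
print(f"N = {n}, df = {df}, sample variance = {sample_var:,.0f}")
```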

Journal ArticleDOI
TL;DR: The probability-based measure A, the nonparametric generalization of what K. O. McGraw and S. P. Wong (1992) called the common language effect size statistic, is insensitive to base rates and more robust to several other factors (e.g., extreme scores, nonlinear transformations).
Abstract: Calculating and reporting appropriate measures of effect size are becoming standard practice in psychological research. One of the most common scenarios encountered involves the comparison of 2 groups, which includes research designs that are experimental (e.g., random assignment to treatment vs. placebo conditions) and nonexperimental (e.g., testing for gender differences). Familiar measures such as the standardized mean difference (d) or the point-biserial correlation (rpb) characterize the magnitude of the difference between groups, but these effect size measures are sensitive to a number of additional influences. For example, R. E. McGrath and G. J. Meyer (2006) showed that rpb is sensitive to sample base rates, and extending their analysis to situations of unequal variances reveals that d is, too. The probability-based measure A, the nonparametric generalization of what K. O. McGraw and S. P. Wong (1992) called the common language effect size statistic, is insensitive to base rates and more robust to several other factors (e.g., extreme scores, nonlinear transformations). In addition to its excellent generalizability across contexts, A is easy to understand and can be obtained from standard computer output or through simple hand calculations.
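A sketch of estimating A from two groups of scores (made-up data): A is the proportion of cross-group pairs in which the first group's score exceeds the second's, counting ties as one half.

```python
# Sketch of the probability-based effect size A (common-language effect size):
# A = P(X > Y) + 0.5 * P(X = Y), estimated by comparing every cross-group pair.
# The two score lists are made up.
from itertools import product

group_x = [12, 15, 11, 18, 14, 16]
group_y = [10, 13, 9, 14, 12, 11]

pairs = list(product(group_x, group_y))
a_stat = sum((x > y) + 0.5 * (x == y) for x, y in pairs) / len(pairs)
print(f"A = {a_stat:.2f}")   # 0.5 = no effect, 1.0 = complete separation
```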

Journal ArticleDOI
TL;DR: As RDS is increasingly used for HIV surveillance, it is important to learn from past practical, theoretical and analytical challenges to maximize the utility of this method.
Abstract: Using respondent-driven sampling (RDS), we gathered data from 128 HIV surveillance studies conducted outside the United States through October 1, 2007. We examined predictors of poor study outcomes, reviewed operational, design and analytical challenges associated with conducting RDS in international settings and offer recommendations to improve HIV surveillance. We explored factors for poor study outcomes using differences in mean sample size ratios (recruited/calculated sample size) as the outcome variable. Ninety-two percent of studies reported both calculated and recruited sample sizes. Studies of injecting drug users had a higher sample size ratio compared with other risk groups. Study challenges included appropriately defining eligibility criteria, structuring social network size questions, selecting design effects and conducting statistical analysis. As RDS is increasingly used for HIV surveillance, it is important to learn from past practical, theoretical and analytical challenges to maximize the utility of this method.

Journal ArticleDOI
TL;DR: In this article, a simulation framework is developed to generate data from the Poisson-gamma distributions using different values describing the mean, the dispersion parameter, the sample size, and the prior specification.

Journal ArticleDOI
TL;DR: A flexible power calculation model is introduced that makes fewer simplifying assumptions, leading to a more accurate power analysis that can be used on a wide variety of study designs and can be use to increase understanding of the data as well as calculate power for a future study.

Journal ArticleDOI
TL;DR: The approach first constructs a prior chosen to be vague in a suitable sense, updates this prior to obtain a sequence of posteriors corresponding to each of a range of sample sizes, and computes a distance between each posterior and the parametric prior.
Abstract: We present a definition for the effective sample size of a parametric prior distribution in a Bayesian model, and propose methods for computing the effective sample size in a variety of settings. Our approach first constructs a prior chosen to be vague in a suitable sense, and updates this prior to obtain a sequence of posteriors corresponding to each of a range of sample sizes. We then compute a distance between each posterior and the parametric prior, defined in terms of the curvature of the logarithm of each distribution, and the posterior minimizing the distance defines the effective sample size of the prior. For cases where the distance cannot be computed analytically, we provide a numerical approximation based on Monte Carlo simulation. We provide general guidelines for application, illustrate the method in several standard cases where the answer seems obvious, and then apply it to some nonstandard settings.
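One of the "standard cases where the answer seems obvious" can be written down directly: a Beta(a, b) prior on a binomial proportion acts like a + b pseudo-observations, so its effective sample size is a + b. The sketch below only shows the conjugate update behind that intuition, not the paper's curvature-based computation.

```python
# The "obvious answer" case: a Beta(a, b) prior on a binomial proportion
# behaves like a + b pseudo-observations, so its effective sample size is
# a + b. Sketch of the conjugate update that motivates this; the numbers
# are illustrative.
a, b = 3, 7                       # illustrative prior
successes, n = 12, 40             # illustrative data
post_a, post_b = a + successes, b + n - successes
print(f"prior ESS ~ {a + b}; posterior is Beta({post_a}, {post_b}), "
      f"i.e. worth {post_a + post_b} observations' information")
```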

Journal ArticleDOI
TL;DR: In this paper, the authors provide an in-depth examination of country-of-origin (COO) perceptions of consumers in a multinational setting, and show how explanatory factors like demographics, familiarity with a country's products, purchase behaviour and psychological variables jointly work to explain consumers' COO perceptions.
Abstract: Purpose – The purpose of this paper is to provide an in‐depth examination of country‐of‐origin (COO) perceptions of consumers in a multinational setting. It shows how explanatory factors like demographics, familiarity with a country's products, purchase behaviour and psychological variables jointly work to explain consumers' COO perceptions. Design/methodology/approach – This is a quantitative study using a drop‐off and pick‐up survey among three samples of consumers in Canada, Morocco and Taiwan. The final sample comprised 506 male consumers. The data were analyzed using factor analysis to group countries of origin and analyses of variance to relate COO perceptions to the explanatory variables. Findings – The familiarity with products made in a country was the strongest predictor of country perceptions, followed by nationality and the manufacturing process and product complexity dimensions of country evaluation. Canadians had the highest propensity to distinguish between countries of origin on ...