Topic

Sample size determination

About: Sample size determination is a research topic. Over its lifetime, 21,300 publications have been published within this topic, receiving 961,457 citations.


Papers
Journal Article
TL;DR: Whereas the Bayesian Information Criterion performed the best of the ICs, the bootstrap likelihood ratio test proved to be a very consistent indicator of classes across all of the models considered.
Abstract: Mixture modeling is a widely applied data analysis technique used to identify unobserved heterogeneity in a population. Despite mixture models' usefulness in practice, one unresolved issue in their application is that there is no single commonly accepted statistical indicator for deciding on the number of classes in a study population. This article presents the results of a simulation study that examines the performance of likelihood-based tests and the traditionally used information criteria (ICs) for determining the number of classes in mixture modeling. We look at the performance of these tests and indexes for three types of mixture models: latent class analysis (LCA), a factor mixture model (FMA), and a growth mixture model (GMM). We evaluate the ability of the tests and indexes to correctly identify the number of classes at three different sample sizes (n = 200, 500, 1,000). Whereas the Bayesian Information Criterion performed the best of the ICs, the bootstrap likelihood ratio test ...

7,716 citations
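
The Bayesian Information Criterion named in the abstract above is commonly computed as BIC = k ln(n) - 2 ln(L), where k is the number of free parameters, n the sample size, and L the maximized likelihood; the candidate with the lowest BIC is preferred. The following is a minimal sketch of that comparison. The function names and the log-likelihood and parameter values are hypothetical and invented for illustration; they are not taken from the study.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def best_by_bic(candidates, n_obs):
    """Return (class count with lowest BIC, dict of all BIC scores).

    `candidates` maps a number of classes to a tuple of
    (maximized log-likelihood, number of free parameters).
    """
    scores = {k: bic(ll, p, n_obs) for k, (ll, p) in candidates.items()}
    return min(scores, key=scores.get), scores


# Hypothetical fits for 1, 2, and 3 classes at n = 500 (made-up numbers).
candidates = {1: (-1500.0, 5), 2: (-1420.0, 11), 3: (-1415.0, 17)}
best_k, scores = best_by_bic(candidates, n_obs=500)
print(best_k, scores)
```

In this made-up example the third class improves the log-likelihood only slightly, so the ln(n) penalty on its extra parameters outweighs the gain.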

Journal Article
TL;DR: Results from a limited simulation study indicate that this approach is very reliable even with total sample sizes as small as 100, and the method is illustrated with two data sets.
Abstract: Relative risk is usually the parameter of interest in epidemiologic and medical studies. In this paper, the author proposes a modified Poisson regression approach (i.e., Poisson regression with a robust error variance) to estimate this effect measure directly. A simple 2-by-2 table is used to justify the validity of this approach. Results from a limited simulation study indicate that this approach is very reliable even with total sample sizes as small as 100. The method is illustrated with two data sets.

7,045 citations
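
For a single binary exposure summarized in a 2-by-2 table, the relative risk discussed in the abstract above is the ratio of the two group risks. The sketch below computes that ratio with a standard large-sample confidence interval on the log scale. It is not the modified Poisson regression itself, only the quantity the regression targets in the simplest case; the function name, argument names, and example counts are assumptions made for illustration.

```python
import math


def relative_risk(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Relative risk from a 2-by-2 table with a log-scale Wald interval.

    Assumes all event counts are positive so the logarithm and the
    variance terms are defined.
    """
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed
    # Large-sample (delta-method) variance of log(RR).
    var_log_rr = (1.0 / events_exposed - 1.0 / n_exposed
                  + 1.0 / events_unexposed - 1.0 / n_unexposed)
    se = math.sqrt(var_log_rr)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


# Hypothetical counts: 20/100 events among exposed, 10/100 among unexposed.
rr, lower, upper = relative_risk(20, 100, 10, 100)
print(rr, lower, upper)
```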

Journal Article
TL;DR: Findings indicate that a low number of events per variable (EPV) can lead to major problems: regression coefficients were biased in both positive and negative directions, and paradoxical associations (significance in the wrong direction) increased.

6,490 citations

Journal Article
TL;DR: A more progressive resource for sample-based studies, meta-analyses, and case studies in sports medicine and exercise science is presented, and forthright advice on controversial or novel issues is offered.
Abstract: Statistical guidelines and expert statements are now available to assist in the analysis and reporting of studies in some biomedical disciplines. We present here a more progressive resource for sample-based studies, meta-analyses, and case studies in sports medicine and exercise science. We offer forthright advice on the following controversial or novel issues: using precision of estimation for inferences about population effects in preference to null-hypothesis testing, which is inadequate for assessing clinical or practical importance; justifying sample size via acceptable precision or confidence for clinical decisions rather than via adequate power for statistical significance; showing SD rather than SEM, to better communicate the magnitude of differences in means and nonuniformity of error; avoiding purely nonparametric analyses, which cannot provide inferences about magnitude and are unnecessary; using regression statistics in validity studies, in preference to the impractical and biased limits of agreement; making greater use of qualitative methods to enrich sample-based quantitative projects; and seeking ethics approval for public access to the depersonalized raw data of a study, to address the need for more scrutiny of research and better meta-analyses. Advice on less contentious issues includes the following: using covariates in linear models to adjust for confounders, to account for individual differences, and to identify potential mechanisms of an effect; using log transformation to deal with nonuniformity of effects and error; identifying and deleting outliers; presenting descriptive, effect, and inferential statistics in appropriate formats; and contending with bias arising from problems with sampling, assignment, blinding, measurement error, and researchers' prejudices. This article should advance the field by stimulating debate, promoting innovative approaches, and serving as a useful checklist for authors, reviewers, and editors.

6,467 citations

Journal Article
TL;DR: Two simple formulas are found that estimate the mean using the median, the low and high ends of the range, and n (the sample size); the authors hope these will help meta-analysts use clinical trials in their analyses even when not all of the information is available and/or reported.
Abstract: Researchers performing meta-analyses of continuous outcomes from clinical trials usually need the mean and the variance (or standard deviation) in order to pool data. However, published reports of clinical trials sometimes give only the median, the range, and the size of the trial. In this article we use simple and elementary inequalities and approximations to estimate the mean and the variance for such trials. Our estimation is distribution-free, i.e., it makes no assumption about the distribution of the underlying data. We found two simple formulas that estimate the mean using the values of the median (m), the low and high ends of the range (a and b, respectively), and n (the sample size). Using simulations, we show that the median can be used to estimate the mean when the sample size is larger than 25. For smaller samples our new formula, devised in this paper, should be used. We also estimated the variance of an unknown sample using the median, the low and high ends of the range, and the sample size. Our estimate performs best in our simulations for very small samples (n ≤ 15). For moderately sized samples (15 < n ≤ 70), our simulations show that the formula range/4 is the best estimator for the standard deviation (variance). For large samples (n > 70), the formula range/6 gives the best estimator for the standard deviation (variance). We also include an illustrative example of the potential value of our method using reports from the Cochrane review on the role of erythropoietin in anemia due to malignancy. Using these formulas, we hope to help meta-analysts use clinical trials in their analysis even when not all of the information is available and/or reported.

6,384 citations
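
The estimation rules described in the abstract above can be sketched as follows. The abstract does not spell out the formulas, so the small-sample expressions used here, mean ≈ (a + 2m + b)/4 and variance ≈ [(a - 2m + b)²/4 + (b - a)²]/12, together with the range/4 rule for moderately sized samples, are assumptions based on how the paper's results are commonly cited. The function names are hypothetical.

```python
import math


def estimate_mean(low, median, high, n):
    """Estimate the mean from the median and range of a sample of size n.

    For n > 25 the median itself is used, as the abstract suggests;
    for smaller samples the assumed formula (a + 2m + b) / 4 is used.
    """
    if n > 25:
        return median
    return (low + 2 * median + high) / 4


def estimate_sd(low, median, high, n):
    """Estimate the standard deviation from the median and range.

    Thresholds (n <= 15, 15 < n <= 70, n > 70) follow the abstract;
    the small-sample variance formula is an assumption (see lead-in).
    """
    if n <= 15:
        variance = ((low - 2 * median + high) ** 2 / 4 + (high - low) ** 2) / 12
        return math.sqrt(variance)
    if n <= 70:
        return (high - low) / 4
    return (high - low) / 6


# Hypothetical trial report: median 5, range 0 to 20, 10 participants.
print(estimate_mean(0, 5, 20, 10), estimate_sd(0, 5, 20, 10))
```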


Network Information
Related Topics (5)
Regression analysis: 31K papers, 1.7M citations, 91% related
Statistical hypothesis testing: 19.5K papers, 1M citations, 89% related
Linear regression: 21.3K papers, 1.2M citations, 88% related
Linear model: 19K papers, 1M citations, 86% related
Estimator: 97.3K papers, 2.6M citations, 82% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    834
2022    1,603
2021    1,004
2020    950
2019    974