Topic

Sample size determination

About: Sample size determination is a research topic. Over its lifetime, 21,300 publications have been published within this topic, receiving 961,457 citations.


Papers
Journal Article
Jacob Cohen
TL;DR: A convenient, although not comprehensive, presentation of required sample sizes is provided. The sample sizes necessary for .80 power to detect small, medium, and large effects are tabled for eight standard statistical tests.
Abstract: One possible reason for the continued neglect of statistical power analysis in research in the behavioral sciences is the inaccessibility of or difficulty with the standard material. A convenient, although not comprehensive, presentation of required sample sizes is provided here. Effect-size indexes and conventional values for these are given for operationally defined small, medium, and large effects. The sample sizes necessary for .80 power to detect effects at these levels are tabled for eight standard statistical tests: (a) the difference between independent means, (b) the significance of a product-moment correlation, (c) the difference between independent rs, (d) the sign test, (e) the difference between independent proportions, (f) chi-square tests for goodness of fit and contingency tables, (g) one-way analysis of variance, and (h) the significance of a multiple or multiple partial correlation.
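As a rough illustration of test (a), the difference between independent means, the sketch below estimates the per-group sample size for .80 power using the standard normal approximation n = 2((z_{1-α/2} + z_{power}) / d)^2. It is not the paper's tabled procedure: the function name `n_per_group`, the two-sided α = .05 default, and the use of d = 0.2, 0.5, 0.8 as the conventional small, medium, and large values are assumptions made for this example. Because it uses the normal rather than the t distribution, it tends to give values slightly below exact t-based results.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided test comparing two
    independent means, given a standardized effect size d.

    Normal approximation only (an assumption of this sketch).
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for two-sided alpha
    z_power = z.inv_cdf(power)            # quantile matching the target power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Conventional small, medium, and large values of d (assumed here).
for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    print(label, d, n_per_group(d))
```

Halving the effect size roughly quadruples the required sample, which is why small effects demand far larger studies than large ones.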

33,656 citations

Book
01 Jan 1969
Abstract: 1. Introduction 2. Data in Biology 3. Computers and Data Analysis 4. Descriptive Statistics 5. Introduction to Probability Distributions 6. The Normal Probability Distribution 7. Hypothesis Testing and Interval Estimation 8. Introduction to Analysis of Variance 9. Single-Classification Analysis of Variance 10. Nested Analysis of Variance 11. Two-Way and Multiway Analysis of Variance 12. Statistical Power and Sample Size in the Analysis of Variance 13. Assumptions of Analysis of Variance 14. Linear Regression 15. Correlation 16. Multiple and Curvilinear Regression 17. Analysis of Frequencies 18. Meta-Analysis and Miscellaneous Methods

21,263 citations

Journal Article
TL;DR: The purpose of this discussion is to offer some unity to various estimation formulae and to point out that correlations of genes in structured populations, with which F-statistics are concerned, are expressed very conveniently with a set of parameters treated by Cockerham (1969, 1973).
Abstract: This journal frequently contains papers that report values of F-statistics estimated from genetic data collected from several populations. These parameters, FST, FIT, and FIS, were introduced by Wright (1951), and offer a convenient means of summarizing population structure. While there is some disagreement about the interpretation of the quantities, there is considerably more disagreement on the method of evaluating them. Different authors make different assumptions about sample sizes or numbers of populations and handle the difficulties of multiple alleles and unequal sample sizes in different ways. Wright himself, for example, did not consider the effects of finite sample size. The purpose of this discussion is to offer some unity to various estimation formulae and to point out that correlations of genes in structured populations, with which F-statistics are concerned, are expressed very conveniently with a set of parameters treated by Cockerham (1969, 1973). We start with the parameters and construct appropriate estimators for them, rather than beginning the discussion with various data functions. The extension of Cockerham's work to multiple alleles and loci will be made explicit, and the use of jackknife procedures for estimating variances will be advocated. All of this may be regarded as an extension of a recent treatment of estimating the coancestry coefficient to serve as a mea-

16,821 citations

Journal Article
TL;DR: It is found that in most cases the estimated ‘log probability of data’ does not provide a correct estimation of the number of clusters, K, and using an ad hoc statistic ΔK based on the rate of change in the log probability between successive K values, structure accurately detects the uppermost hierarchical level of structure for the scenarios the authors tested.
Abstract: The identification of genetically homogeneous groups of individuals is a long-standing issue in population genetics. A recent Bayesian algorithm implemented in the software STRUCTURE allows the identification of such groups. However, the ability of this algorithm to detect the true number of clusters (K) in a sample of individuals when patterns of dispersal among populations are not homogeneous has not been tested. The goal of this study is to carry out such tests, using various dispersal scenarios from data generated with an individual-based model. We found that in most cases the estimated 'log probability of data' does not provide a correct estimation of the number of clusters, K. However, using an ad hoc statistic ΔK based on the rate of change in the log probability of data between successive K values, we found that STRUCTURE accurately detects the uppermost hierarchical level of structure for the scenarios we tested. As might be expected, the results are sensitive to the type of genetic marker used (AFLP vs. microsatellite), the number of loci scored, the number of populations sampled, and the number of individuals typed in each sample.
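The rate-of-change idea behind ΔK can be sketched as follows. This is a simplified illustration, not the authors' code: it assumes ΔK at each K is the absolute second difference of the mean log probability across replicate runs, divided by the standard deviation of the replicate log probabilities at that K. The function name `delta_k` and the toy log-probability values are hypothetical and chosen only to make the example run.

```python
from statistics import mean, stdev


def delta_k(log_probs):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    log_probs maps each K to a list of log probabilities of the data,
    one per replicate run (at least two replicates per K, with nonzero
    spread, so that the standard deviation is defined and positive).
    """
    means = {k: mean(values) for k, values in log_probs.items()}
    result = {}
    for k in sorted(log_probs):
        if k - 1 in means and k + 1 in means:
            # Absolute second-order rate of change of the mean log probability.
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / stdev(log_probs[k])
    return result


# Hypothetical replicate log probabilities for K = 1..4 (toy values).
toy_runs = {
    1: [-1000.0, -1002.0],
    2: [-800.0, -804.0],
    3: [-790.0, -792.0],
    4: [-788.0, -790.0],
}
dk = delta_k(toy_runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

In the toy data the log probability keeps creeping upward past K = 2, but the sharpest change in slope occurs at K = 2, which is the value the statistic picks out. Note that ΔK is undefined for the smallest and largest K tested, since each lacks a neighbour.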

16,374 citations

Book
01 Jan 1981
Abstract: Preface. Preface to the Second Edition. Preface to the First Edition. 1. An Introduction to Applied Probability. 2. Statistical Inference for a Single Proportion. 3. Assessing Significance in a Fourfold Table. 4. Determining Sample Sizes Needed to Detect a Difference Between Two Proportions. 5. How to Randomize. 6. Comparative Studies: Cross-Sectional, Naturalistic, or Multinomial Sampling. 7. Comparative Studies: Prospective and Retrospective Sampling. 8. Randomized Controlled Trials. 9. The Comparison of Proportions from Several Independent Samples. 10. Combining Evidence from Fourfold Tables. 11. Logistic Regression. 12. Poisson Regression. 13. Analysis of Data from Matched Samples. 14. Regression Models for Matched Samples. 15. Analysis of Correlated Binary Data. 16. Missing Data. 17. Misclassification Errors: Effects, Control, and Adjustment. 18. The Measurement of Interrater Agreement. 19. The Standardization of Rates. Appendix A. Numerical Tables. Appendix B. The Basic Theory of Maximum Likelihood Estimation. Appendix C. Answers to Selected Problems. Author Index. Subject Index.

16,098 citations


Network Information
Related Topics (5)
Regression analysis: 31K papers, 1.7M citations, 91% related
Statistical hypothesis testing: 19.5K papers, 1M citations, 89% related
Linear regression: 21.3K papers, 1.2M citations, 88% related
Linear model: 19K papers, 1M citations, 86% related
Estimator: 97.3K papers, 2.6M citations, 82% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2022    8
2021    973
2020    941
2019    971
2018    955
2017    975