Topic

Statistical hypothesis testing

About: Statistical hypothesis testing is a research topic. Over its lifetime, 19,580 publications have been published within this topic, receiving 1,037,815 citations. The topic is also known as confirmatory data analysis.


Papers
Journal Article
TL;DR: Tests five metrics commonly used as quantitative descriptors of sample similarity in detrital geochronology: the Kolmogorov-Smirnov and Kuiper tests, and the Cross-correlation, Likeness, and Similarity coefficients of probability density plots, kernel density estimates (KDEs), and locally adaptive, variable-bandwidth KDEs.
Abstract: The increase in detrital geochronological data presents challenges to existing approaches to data visualization and comparison, and highlights the need for quantitative techniques able to evaluate and compare multiple large data sets. We test five metrics commonly used as quantitative descriptors of sample similarity in detrital geochronology: the Kolmogorov-Smirnov (K-S) and Kuiper tests, as well as Cross-correlation, Likeness, and Similarity coefficients of probability density plots (PDPs), kernel density estimates (KDEs), and locally adaptive, variable-bandwidth KDEs (LA-KDEs). We assess these metrics by applying them to 20 large synthetic data sets and one large empirical data set, and evaluate their utility in terms of sample similarity based on the following three criteria. (1) Similarity of samples from the same population should systematically increase with increasing sample size. (2) Metrics should maximize sensitivity by using the full range of possible coefficients. (3) Metrics should minimize artifacts resulting from sample-specific complexity. K-S and Kuiper test p-values passed only one criterion, indicating that they are poorly suited as quantitative descriptors of sample similarity. Likeness and Similarity coefficients of PDPs, as well as K-S and Kuiper test D and V values, performed better by passing two of the criteria. Cross-correlation of PDPs passed all three criteria. All coefficients calculated from KDEs and LA-KDEs failed at least two of the criteria. As hypothesis tests of derivation from a common source, individual K-S and Kuiper p-values too frequently reject the null hypothesis that samples come from a common source when they are identical. However, mean p-values calculated by repeated subsampling and comparison (minimum of 4 trials) consistently yield a binary discrimination of identical versus different source populations. 
Cross-correlation and Likeness of PDPs and Cross-correlation of KDEs yield the widest divergence in coefficients and thus a consistent discrimination between identical and different source populations, with Cross-correlation of PDPs requiring the smallest sample size. In light of this, we recommend acquisition of large detrital geochronology data sets for quantitative comparison. We also recommend repeated subsampling of detrital geochronology data sets and calculation of the mean and standard deviation of the comparison metric in order to capture the variability inherent in sampling a multimodal population. These statistical tools are implemented using DZstats, a MATLAB-based code that can be accessed via an executable file graphical user interface. It implements all of the statistical tests discussed in this paper, and exports the results both as spreadsheets and as graphic files.
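The repeated-subsampling recommendation above can be sketched in a few lines of Python. This is a minimal illustration and not DZstats: the function names, the synthetic age populations, and the subsample size are all assumptions made for the example, and it summarizes the K-S D value rather than p-values so that it needs only the standard library.

```python
import bisect
import random
import statistics


def ks_statistic(a, b):
    """Two-sample K-S D value: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The supremum of |F_a - F_b| is attained at one of the observed values.
    for x in a_sorted + b_sorted:
        cdf_a = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def subsampled_ks(a, b, n_sub, trials=4, seed=0):
    """Mean and standard deviation of D over repeated random subsamples."""
    rng = random.Random(seed)
    values = [
        ks_statistic(rng.sample(a, n_sub), rng.sample(b, n_sub))
        for _ in range(trials)
    ]
    return statistics.mean(values), statistics.stdev(values)


def synthetic_ages(peaks, n, rng):
    """Hypothetical multimodal age population (Ma): equal-weight Gaussian peaks."""
    return [rng.gauss(rng.choice(peaks), 30.0) for _ in range(n)]


rng = random.Random(42)
source_1a = synthetic_ages([500, 1100, 1700], 1000, rng)
source_1b = synthetic_ages([500, 1100, 1700], 1000, rng)  # same population
source_2 = synthetic_ages([300, 1100, 2500], 1000, rng)   # different population

same_mean, same_sd = subsampled_ks(source_1a, source_1b, n_sub=300)
diff_mean, diff_sd = subsampled_ks(source_1a, source_2, n_sub=300)
print(f"same source:      D = {same_mean:.3f} +/- {same_sd:.3f}")
print(f"different source: D = {diff_mean:.3f} +/- {diff_sd:.3f}")
```

Reporting the spread alongside the mean is the point of the exercise: a single comparison of two subsamples from a multimodal population can be misleading, whereas the mean over several trials is more stable.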

195 citations

Journal Article
TL;DR: This tutorial study clarifies the relationships between traditional methods based on allele frequency differentiation and EA methods and provides a unified framework for their underlying statistical tests, and demonstrates how techniques developed in the area of genomewide association studies, such as inflation factors and linear mixed models, benefit genome scan methods.
Abstract: Population differentiation (PD) and ecological association (EA) tests have recently emerged as prominent statistical methods to investigate signatures of local adaptation using population genomic data. Based on statistical models, these genomewide testing procedures have attracted considerable attention as tools to identify loci potentially targeted by natural selection. An important issue with PD and EA tests is that incorrect model specification can generate large numbers of false-positive associations. Spurious association may indeed arise when shared demographic history, patterns of isolation by distance, cryptic relatedness or genetic background are ignored. Recent work on PD and EA tests has largely focused on improving test corrections for those confounding effects. Despite significant algorithmic improvements, there are still a number of open questions on how to check that false discoveries are under control and implement test corrections, or how to combine statistical tests from multiple genome scan methods. This tutorial study provides a detailed answer to these questions. It clarifies the relationships between traditional methods based on allele frequency differentiation and EA methods and provides a unified framework for their underlying statistical tests. We demonstrate how techniques developed in the area of genomewide association studies, such as inflation factors and linear mixed models, benefit genome scan methods and provide guidelines for good practice while conducting statistical tests in landscape and population genomic applications. Finally, we highlight how the combination of several well-calibrated statistical tests can increase the power to reject neutrality, improving our ability to infer patterns of local adaptation in large population genomic data sets.
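As a rough illustration of the inflation-factor idea mentioned in this abstract, the sketch below uses the standard genomic-control definition: the median of the observed one-degree-of-freedom chi-square statistics divided by the median expected under the null (about 0.455). It is not taken from the tutorial itself; the function names and the simulated z-scores are assumptions for the example.

```python
import math
import random
import statistics

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 d.f.


def inflation_factor(z_scores):
    """Genomic inflation factor: median observed chi-square over its null median."""
    return statistics.median(z * z for z in z_scores) / CHI2_1DF_MEDIAN


def recalibrated_p_values(z_scores):
    """Divide each chi-square statistic by the inflation factor, then get p-values."""
    lam = inflation_factor(z_scores)
    # For 1 d.f., P(chi-square > x) = erfc(sqrt(x / 2)).
    return [math.erfc(math.sqrt(z * z / lam / 2.0)) for z in z_scores]


# Hypothetical genome scan: z-scores inflated by unmodelled confounding
# (standard deviation 1.5 instead of the 1.0 expected under the null).
rng = random.Random(1)
z_scores = [rng.gauss(0.0, 1.5) for _ in range(5000)]

lam = inflation_factor(z_scores)
p_values = recalibrated_p_values(z_scores)
print(f"inflation factor: {lam:.2f}")
print(f"median recalibrated p-value: {statistics.median(p_values):.3f}")
```

An inflation factor well above 1 signals that the test statistics are larger than a well-calibrated null would produce; after rescaling, the median p-value should sit near 0.5.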

195 citations

Book
08 Jun 2009

194 citations

Journal Article
TL;DR: A general model of human judgment is introduced aimed at describing how people generate hypotheses from memory and how these hypotheses serve as the basis of probability judgment and hypothesis testing.
Abstract: Diagnostic hypothesis-generation processes are ubiquitous in human reasoning. For example, clinicians generate disease hypotheses to explain symptoms and help guide treatment, auditors generate hypotheses for identifying sources of accounting errors, and laypeople generate hypotheses to explain patterns of information (i.e., data) in the environment. The authors introduce a general model of human judgment aimed at describing how people generate hypotheses from memory and how these hypotheses serve as the basis of probability judgment and hypothesis testing. In 3 simulation studies, the authors illustrate the properties of the model, as well as its applicability to explaining several common findings in judgment and decision making, including how errors and biases in hypothesis generation can cascade into errors and biases in judgment.

194 citations

Journal Article
TL;DR: Proposes SLIDE, an accurate and efficient method for multiple testing correction in genome-wide association studies that accounts for all correlation within a sliding window and corrects for the departure of the true null distribution of the statistic from the asymptotic distribution.
Abstract: With the development of high-throughput sequencing and genotyping technologies, the number of markers collected in genetic association studies is growing rapidly, increasing the importance of methods for correcting for multiple hypothesis testing. The permutation test is widely considered the gold standard for accurate multiple testing correction, but it is often computationally impractical for these large datasets. Recently, several studies proposed efficient alternative approaches to the permutation test based on the multivariate normal distribution (MVN). However, they cannot accurately correct for multiple testing in genome-wide association studies for two reasons. First, these methods require partitioning of the genome into many disjoint blocks and ignore all correlations between markers from different blocks. Second, the true null distribution of the test statistic often fails to follow the asymptotic distribution at the tails of the distribution. We propose an accurate and efficient method for multiple testing correction in genome-wide association studies—SLIDE. Our method accounts for all correlation within a sliding window and corrects for the departure of the true null distribution of the statistic from the asymptotic distribution. In simulations using the Wellcome Trust Case Control Consortium data, the error rate of SLIDE's corrected p-values is more than 20 times smaller than the error rate of the previous MVN-based methods' corrected p-values, while SLIDE is orders of magnitude faster than the permutation test and other competing methods. We also extend the MVN framework to the problem of estimating the statistical power of an association study with correlated markers and propose an efficient and accurate power estimation method SLIP. SLIP and SLIDE are available at http://slide.cs.ucla.edu.
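For context, the permutation approach that this abstract treats as the gold standard can be sketched as follows. This is not SLIDE or SLIP: it is a plain max-statistic permutation correction, and the mean-difference statistic, function names, and simulated genotypes are assumptions chosen to keep the example short.

```python
import random


def marker_stats(genotypes, labels):
    """Absolute difference in mean genotype between cases (1) and controls (0)."""
    n_case = sum(labels)
    n_ctrl = len(labels) - n_case
    stats = []
    for marker in genotypes:  # one list of per-individual genotypes per marker
        case_sum = sum(g for g, y in zip(marker, labels) if y == 1)
        ctrl_sum = sum(marker) - case_sum
        stats.append(abs(case_sum / n_case - ctrl_sum / n_ctrl))
    return stats


def max_stat_corrected_p(genotypes, labels, n_perm=200, seed=0):
    """Corrected p-value per marker: how often the permuted maximum reaches it."""
    rng = random.Random(seed)
    observed = marker_stats(genotypes, labels)
    exceed = [0] * len(observed)
    shuffled = list(labels)
    for _ in range(n_perm):
        # Shuffling labels keeps the correlation between markers intact.
        rng.shuffle(shuffled)
        null_max = max(marker_stats(genotypes, shuffled))
        for i, stat in enumerate(observed):
            if null_max >= stat:
                exceed[i] += 1
    # Add-one estimate so that no corrected p-value is exactly zero.
    return [(e + 1) / (n_perm + 1) for e in exceed]


# Hypothetical study: 60 individuals, 20 markers, marker 0 associated with status.
rng = random.Random(7)
labels = [1] * 30 + [0] * 30
genotypes = [[rng.choice([0, 1, 2]) for _ in labels] for _ in range(20)]
genotypes[0] = [rng.choice([1, 2]) if y == 1 else rng.choice([0, 1]) for y in labels]

p_corrected = max_stat_corrected_p(genotypes, labels)
print("corrected p-values for the first three markers:", p_corrected[:3])
```

Each permutation recomputes every marker's statistic, so the cost grows with both the number of markers and the number of permutations, which is the computational burden the abstract describes.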

194 citations


Network Information
Related Topics (5)
Estimator: 97.3K papers, 2.6M citations, 88% related
Linear model: 19K papers, 1M citations, 88% related
Inference: 36.8K papers, 1.3M citations, 87% related
Regression analysis: 31K papers, 1.7M citations, 86% related
Sampling (statistics): 65.3K papers, 1.2M citations, 83% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    267
2022    696
2021    959
2020    998
2019    1,033
2018    943