False positive paradox
About: False positive paradox is a research topic. Over its lifetime, 3,497 publications have been published within this topic, receiving 117,570 citations.
Papers published on a yearly basis
TL;DR: This work proposes an approach to measuring statistical significance in genomewide studies based on the concept of the false discovery rate, which offers a sensible balance between the number of true and false positives that is automatically calibrated and easily interpreted.
Abstract: With the increase in genomewide experiments and the sequencing of multiple genomes, the analysis of large data sets has become commonplace in biology. It is often the case that thousands of features in a genomewide data set are tested against some null hypothesis, where a number of features are expected to be significant. Here we propose an approach to measuring statistical significance in these genomewide studies based on the concept of the false discovery rate. This approach offers a sensible balance between the number of true and false positives that is automatically calibrated and easily interpreted. In doing so, a measure of statistical significance called the q value is associated with each tested feature. The q value is similar to the well known p value, except it is a measure of significance in terms of the false discovery rate rather than the false positive rate. Our approach avoids a flood of false positive results, while offering a more liberal criterion than what has been used in genome scans for linkage.
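The abstract's q-value approach (which estimates the proportion of true nulls) is not reproduced here, but the underlying false-discovery-rate idea can be illustrated with the simpler Benjamini-Hochberg step-up procedure, a minimal sketch assuming all hypotheses are null-calibrated (i.e., no π₀ estimate):

```python
import numpy as np

def bh_qvalues(pvals):
    """Benjamini-Hochberg step-up q-values (assumes the proportion of
    true null hypotheses is 1; the q-value method of the abstract
    additionally estimates that proportion to gain power)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # raw BH ratio: p_(i) * m / i for the i-th smallest p-value
    ranked = p[order] * m / np.arange(1, m + 1)
    # enforce monotonicity from the largest p-value downward
    q = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.clip(q, 0.0, 1.0)
    return out

# toy example: a few small p-values among seven tests
print(bh_qvalues([0.001, 0.008, 0.039, 0.2, 0.4, 0.6, 0.9]))
```

A feature with q-value 0.05 can be called significant with the interpretation that roughly 5% of features at or below that threshold are expected to be false discoveries, which is the "sensible balance" the abstract describes.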
TL;DR: The exquisite sensitivity of the polymerase chain reaction means DNA contamination can ruin an entire experiment, but adherence to a strict set of protocols can avoid disaster.
Abstract: The exquisite sensitivity of the polymerase chain reaction means DNA contamination can ruin an entire experiment. Tidiness and adherence to a strict set of protocols can avoid disaster.
Abstract: The typical functional magnetic resonance imaging (fMRI) study presents a formidable problem of multiple statistical comparisons (i.e., > 10,000 in a 128 × 128 image). To protect against false positives, investigators have typically relied on decreasing the per-pixel false-positive probability. This approach incurs an inevitable loss of power to detect statistically significant activity. An alternative approach, which relies on the assumption that areas of true neural activity will tend to stimulate signal changes over contiguous pixels, is presented. If one knows the probability distribution of such cluster sizes as a function of per-pixel false-positive probability, one can use cluster-size thresholds independently to reject false positives. Both Monte Carlo simulations and fMRI studies of human subjects have been used to verify that this approach can improve statistical power by as much as fivefold over techniques that rely solely on adjusting per-pixel false-positive probabilities.
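The Monte Carlo idea in that abstract can be sketched as follows. This is a simplified illustration using spatially independent Gaussian noise (real fMRI noise is spatially smooth, which the published method accounts for); the function name and parameters are illustrative, not the paper's:

```python
import numpy as np
from scipy import ndimage, stats

def cluster_threshold(shape=(64, 64), per_pixel_p=0.01,
                      n_sims=200, alpha=0.05, seed=0):
    """Monte Carlo estimate of a cluster-size threshold: simulate
    pure-noise images, threshold each pixel at per_pixel_p, and record
    the largest contiguous supra-threshold cluster arising by chance."""
    rng = np.random.default_rng(seed)
    z_crit = stats.norm.ppf(1.0 - per_pixel_p)
    max_sizes = []
    for _ in range(n_sims):
        noise = rng.standard_normal(shape)
        labels, n = ndimage.label(noise > z_crit)
        # cluster sizes = pixel counts per label (label 0 is background)
        sizes = np.bincount(labels.ravel())[1:] if n else np.array([0])
        max_sizes.append(int(sizes.max()))
    # clusters larger than this survive at family-wise level alpha
    return float(np.quantile(max_sizes, 1.0 - alpha))
```

Only activation clusters larger than the returned size are retained, which lets each pixel keep a relatively liberal threshold while still controlling false positives at the image level.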
TL;DR: The performance of the genomic control method is quite good for plausible effects of liability genes, which bodes well for future genetic analyses of complex disorders.
Abstract: A dense set of single nucleotide polymorphisms (SNPs) covering the genome and an efficient method to assess SNP genotypes are expected to be available in the near future. An outstanding question is how to use these technologies efficiently to identify genes affecting liability to complex disorders. To achieve this goal, we propose a statistical method that has several optimal properties: It can be used with case-control data and yet, like family-based designs, controls for population heterogeneity; it is insensitive to the usual violations of model assumptions, such as cases failing to be strictly independent; and, by using Bayesian outlier methods, it circumvents the need for Bonferroni correction for multiple tests, leading to better performance in many settings while still constraining risk for false positives. The performance of our genomic control method is quite good for plausible effects of liability genes, which bodes well for future genetic analyses of complex disorders.
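The paper's full Bayesian outlier machinery is not reproduced here; the sketch below shows only the inflation-factor correction commonly associated with genomic control, under the assumption of 1-df chi-square association statistics:

```python
import numpy as np
from scipy import stats

def genomic_control(chi2_stats):
    """Inflation-factor correction in the spirit of genomic control.

    Under the null, 1-df chi-square statistics have median ~0.455, so
    the ratio of the observed median to that value estimates the
    inflation (lambda) induced by population heterogeneity; dividing
    every statistic by lambda deflates the tests accordingly."""
    chi2 = np.asarray(chi2_stats, dtype=float)
    null_median = stats.chi2.ppf(0.5, df=1)  # ~0.4549
    lam = max(np.median(chi2) / null_median, 1.0)  # never inflate tests
    return chi2 / lam, lam
```

In a well-behaved sample lambda is close to 1 and the correction is a no-op; population stratification pushes the median statistic up, and dividing by lambda restrains the resulting flood of false positives.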
TL;DR: It is concluded that there are probably many common variants in the human genome with modest but real effects on common disease risk, and that studies using large samples will convincingly identify such variants.
Abstract: Association studies offer a potentially powerful approach to identify genetic variants that influence susceptibility to common disease [1-4], but are plagued by the impression that they are not consistently reproducible [5,6]. In principle, the inconsistency may be due to false positive studies, false negative studies or true variability in association among different populations [4-8]. The critical question is whether false positives overwhelmingly explain the inconsistency. We analyzed 301 published studies covering 25 different reported associations. There was a large excess of studies replicating the first positive reports, inconsistent with the hypothesis of no true positive associations (P < 10⁻¹⁴). This excess of replications could not be reasonably explained by publication bias and was concentrated among 11 of the 25 associations. For 8 of these 11 associations, pooled analysis of follow-up studies yielded statistically significant replication of the first report, with modest estimated genetic effects. Thus, a sizable fraction (but under half) of reported associations have strong evidence of replication; for these, false negative, underpowered studies probably contribute to inconsistent replication. We conclude that there are probably many common variants in the human genome with modest but real effects on common disease risk, and that studies using large samples will convincingly identify such variants.
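The excess-replication argument can be illustrated with a one-sided binomial test. The counts below are hypothetical, chosen only to show the mechanics, and this is not the pooled analysis actually performed in the study:

```python
from scipy.stats import binomtest

# Hypothetical illustration (not the paper's data): if all first
# reports were false positives, follow-up studies should "replicate"
# only at roughly the nominal significance level.
n_followups = 60    # hypothetical number of follow-up studies
n_replicated = 20   # hypothetical significant replications observed
null_rate = 0.05    # expected replication rate if no true associations

result = binomtest(n_replicated, n_followups, null_rate,
                   alternative='greater')
print(f"P(this many replications under the null) = {result.pvalue:.2e}")
```

An excess of replications far beyond the nominal rate, as in this toy setup, is exactly the pattern that is inconsistent with the hypothesis that every first report was a false positive.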