
Showing papers on "Bonferroni correction published in 2015"


Journal ArticleDOI
TL;DR: Dunn's test is the appropriate nonparametric pairwise multiple-comparison procedure when a Kruskal–Wallis test is rejected, and, as this paper describes, it is now implemented for Stata in the dunntest command.
Abstract: Dunn's test is the appropriate nonparametric pairwise multiple-comparison procedure when a Kruskal–Wallis test is rejected, and it is now implemented for Stata in the dunntest command. dunntest pro...
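For readers working outside Stata, here is a minimal Python sketch of Dunn's pairwise z-tests with a Bonferroni adjustment; the tie correction is omitted for brevity and the toy data are invented, so this is not the Stata implementation itself (which also offers other adjustments).

```python
# A minimal sketch of Dunn's pairwise z-tests with Bonferroni adjustment,
# ignoring the tie correction for brevity; toy data are illustrative.
from itertools import combinations

import numpy as np
from scipy import stats

def dunn_bonferroni(groups):
    data = np.concatenate(groups)
    ranks = stats.rankdata(data)                 # ranks over the pooled sample
    n = data.size
    bounds = np.cumsum([0] + [len(g) for g in groups])
    mean_ranks = [ranks[bounds[i]:bounds[i + 1]].mean() for i in range(len(groups))]
    pairs = list(combinations(range(len(groups)), 2))
    out = {}
    for i, j in pairs:
        se = np.sqrt(n * (n + 1) / 12.0 * (1 / len(groups[i]) + 1 / len(groups[j])))
        z = (mean_ranks[i] - mean_ranks[j]) / se
        p = 2 * stats.norm.sf(abs(z))            # two-sided normal p-value
        out[(i, j)] = (z, min(1.0, p * len(pairs)))  # Bonferroni-adjusted p
    return out

# Only run the pairwise tests if the Kruskal-Wallis test rejects.
g = [np.array([2.1, 3.4, 1.9]), np.array([4.2, 5.1, 4.8]), np.array([2.9, 3.0, 3.3])]
if stats.kruskal(*g).pvalue < 0.05:
    print(dunn_bonferroni(g))
```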

433 citations


Journal ArticleDOI
TL;DR: A strategy is proposed that balances the optimal imputation method, k-means nearest neighbour, against the best approximation for positioning real zeros; it was also observed that as little as 40% missing data could be truly missing.
Abstract: Missing values can arise for different reasons, and depending on their origin they should be considered and dealt with differently. In this research, four methods of imputation were compared with respect to their effects on the normality and variance of data, on statistical significance, and on the approximation of a suitable threshold for accepting missing data as truly missing. Additionally, the effects of different strategies for controlling the familywise error rate or false discovery rate, and how they interact with the different strategies for missing value imputation, were evaluated. Missing values were found to affect the normality and variance of data, and k-means nearest neighbour imputation was the best method tested for restoring this. Bonferroni correction was the best method for maximizing true positives and minimizing false positives, and it was observed that as little as 40% missing data could be truly missing. The range between 40 and 70% missing values was defined as a "gray area"; a strategy is therefore proposed that balances the optimal imputation strategy, k-means nearest neighbour, against the best approximation for positioning real zeros.
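As a concrete illustration of nearest-neighbour imputation of the kind the paper favours, here is a minimal sketch using scikit-learn's KNNImputer; the toy matrix and the choice of k are assumptions, and the study's exact k-means nearest neighbour variant may differ.

```python
# A minimal sketch of k-nearest-neighbour imputation; the toy data and
# n_neighbors=2 are illustrative assumptions, not the paper's settings.
import numpy as np
from sklearn.impute import KNNImputer

X = np.array([[1.0, 2.0, np.nan],
              [3.0, np.nan, 6.0],
              [2.5, 5.0, 1.5],
              [4.0, 6.0, 2.0]])

imputer = KNNImputer(n_neighbors=2)    # average the 2 most similar rows
X_imputed = imputer.fit_transform(X)   # NaNs replaced by neighbour means
print(X_imputed)
```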

110 citations


Journal ArticleDOI
TL;DR: The detailed steps of multiple attribute decision making with the presented operators under an intuitionistic fuzzy environment are investigated, and an example shows the validity and feasibility of the new approach.
Abstract: The Bonferroni mean (BM) was originally presented by Bonferroni and has been generalized by many researchers for its capacity to capture the interrelationship between input arguments. Nevertheless, the existing intuitionistic fuzzy BMs only consider the effects of the membership function or nonmembership function of different intuitionistic fuzzy sets (IFSs). As a complement to the existing generalizations of BM under an intuitionistic fuzzy environment, this paper also considers the interactions between the membership function and nonmembership function of different IFSs and develops the intuitionistic fuzzy interaction BM and the weighted intuitionistic fuzzy interaction BM. We investigate the properties of these new extensions of BM and discuss their special cases. Furthermore, the detailed steps of multiple attribute decision making with the presented operators under an intuitionistic fuzzy environment are investigated, and an example is given to show the validity and feasibility of the new approach.
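For reference, the classical Bonferroni mean that these intuitionistic fuzzy operators generalize is, for nonnegative arguments a_1, …, a_n and parameters p, q ≥ 0:

```latex
BM^{p,q}(a_1,\dots,a_n)
  = \left( \frac{1}{n(n-1)} \sum_{\substack{i,j=1 \\ i \neq j}}^{n} a_i^{p} a_j^{q} \right)^{\frac{1}{p+q}}
```

The cross terms a_i^p a_j^q are what let the operator capture interrelationships between pairs of input arguments.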

101 citations


Journal ArticleDOI
TL;DR: This work improves one such method, JTK_CYCLE, by explicitly calculating the null distribution such that it accounts for multiple hypothesis testing and by including non-sinusoidal reference waveforms, and finds that it gives the greatest sensitivity while controlling for the false discovery rate.
Abstract: Robust methods for identifying patterns of expression in genome-wide data are important for generating hypotheses regarding gene function. To this end, several analytic methods have been developed for detecting periodic patterns. We improve one such method, JTK_CYCLE, by explicitly calculating the null distribution such that it accounts for multiple hypothesis testing and by including non-sinusoidal reference waveforms. We term this method empirical JTK_CYCLE with asymmetry search, and we compare its performance to JTK_CYCLE with Bonferroni and Benjamini-Hochberg multiple hypothesis testing correction, as well as to five other methods: cyclohedron test, address reduction, stable persistence, ANOVA, and F24. We find that ANOVA, F24, and JTK_CYCLE consistently outperform the other three methods when data are limited and noisy; empirical JTK_CYCLE with asymmetry search gives the greatest sensitivity while controlling for the false discovery rate. Our analysis also provides insight into experimental design, and we find that, for a fixed number of samples, better sensitivity and specificity are achieved with higher numbers of replicates than with higher sampling density. Application of the methods to detecting circadian rhythms in a metadataset of microarrays that quantify time-dependent gene expression in whole heads of Drosophila melanogaster reveals annotations that are enriched among genes with highly asymmetric waveforms. These include a wide range of oxidation reduction and metabolic genes, as well as genes with transcripts that have multiple splice forms.
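As a small illustration of the two correction baselines the method is compared against, here is a sketch using statsmodels; the p-values are invented for illustration, not taken from the JTK_CYCLE study.

```python
# Bonferroni (FWER) vs. Benjamini-Hochberg (FDR) on a toy p-value vector.
from statsmodels.stats.multitest import multipletests

pvals = [0.0004, 0.009, 0.012, 0.041, 0.27, 0.61]
reject_bonf, p_bonf, _, _ = multipletests(pvals, alpha=0.05, method="bonferroni")
reject_bh, p_bh, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
print(reject_bonf)  # FWER control: typically fewer rejections
print(reject_bh)    # FDR control: typically more rejections
```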

97 citations


Journal ArticleDOI
TL;DR: This work demonstrates a novel extension of the phenome-wide association study approach, using genotypic instruments to automatically screen for causal associations amongst any number of phenotypic outcomes, and finds novel evidence of effects of BMI on a global self-worth score.
Abstract: Observational cohort studies can provide rich datasets with a diverse range of phenotypic variables. However, hypothesis-driven epidemiological analyses by definition only test particular hypotheses chosen by researchers. Furthermore, observational analyses may not provide robust evidence of causality, as they are susceptible to confounding, reverse causation and measurement error. Using body mass index (BMI) as an exemplar, we demonstrate a novel extension to the phenome-wide association study (pheWAS) approach, using automated screening with genotypic instruments to screen for causal associations amongst any number of phenotypic outcomes. We used a sample of 8,121 children from the ALSPAC dataset, and tested the linear association of a BMI-associated allele score with 172 phenotypic outcomes (with variable sample sizes). We also performed an instrumental variable analysis to estimate the causal effect of BMI on each phenotype. We found 21 of the 172 outcomes were associated with the allele score at an unadjusted p < 0.05 threshold, and use Bonferroni corrections, permutation testing and estimates of the false discovery rate to consider the strength of results given the number of tests performed. The most strongly associated outcomes included leptin, lipid profile, and blood pressure. We also found novel evidence of effects of BMI on a global self-worth score.
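For concreteness, the Bonferroni-corrected per-test level implied by the 172 outcome tests at the study's α = 0.05 is:

```latex
\alpha_{\text{per-test}} = \frac{0.05}{172} \approx 2.9 \times 10^{-4}
```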

86 citations


Journal ArticleDOI
TL;DR: In this article, the authors studied test procedures that are robust in the sense that their asymptotic null distributions are invariant to the persistence of the predictor, that is, the limiting distribution is the same irrespective of whether the regressors are stationary or (nearly) integrated.

47 citations


Journal ArticleDOI
TL;DR: A new method is developed for large-scale frequentist multiple testing with Bayesian prior information; it can discover new loci in genome-wide association studies and compares favourably to competitors.
Abstract: We develop a new method for large-scale frequentist multiple testing with Bayesian prior information. We find optimal p-value weights that maximize the average power of the weighted Bonferroni method. Due to the nonconvexity of the optimization problem, previous methods that account for uncertain prior information are suitable for only a small number of tests. For a Gaussian prior on the effect sizes, we give an efficient algorithm that is guaranteed to find the optimal weights nearly exactly. Our method can discover new loci in genome-wide association studies and compares favourably to competitors. An open-source implementation is available.
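A sketch of the weighted Bonferroni rule the optimization targets (standard formulation; the paper's notation may differ): given weights w_i ≥ 0, hypothesis H_i is rejected when

```latex
p_i \le \frac{w_i \,\alpha}{m}, \qquad \text{with } \sum_{i=1}^{m} w_i = m,
```

so that, by the union bound, the familywise error rate is at most Σ_i w_i α/m = α regardless of the weights; the method then chooses the weights that maximize average power under the Gaussian prior.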

41 citations


Journal ArticleDOI
TL;DR: An adaptive resampling test (ART) is proposed that provides an alternative to the popular (yet conservative) Bonferroni method of controlling family-wise error rates and is evaluated using a simulation study and applied to gene expression data and HIV drug resistance data.
Abstract: This article investigates marginal screening for detecting the presence of significant predictors in high-dimensional regression. Screening large numbers of predictors is a challenging problem due to the nonstandard limiting behavior of post-model-selected estimators. There is a common misconception that the oracle property for such estimators is a panacea, but the oracle property only holds away from the null hypothesis of interest in marginal screening. To address this difficulty, we propose an adaptive resampling test (ART). Our approach provides an alternative to the popular (yet conservative) Bonferroni method of controlling family-wise error rates. ART is adaptive in the sense that thresholding is used to decide whether the centered percentile bootstrap applies, and otherwise adapts to the nonstandard asymptotics in the tightest way possible. The performance of the approach is evaluated using a simulation study and applied to gene expression data and HIV drug resistance data.
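The abstract does not spell out the ART's thresholding rule, so the sketch below shows only the conservative Bonferroni baseline it is compared against: marginal screening that tests each predictor's univariate association, with invented toy data.

```python
# A sketch of the Bonferroni baseline for marginal screening (not the ART):
# test each predictor's marginal correlation with y at level alpha / p.
import numpy as np
from scipy import stats

def marginal_screen_bonferroni(X, y, alpha=0.05):
    n, p = X.shape
    hits = []
    for j in range(p):
        r, pval = stats.pearsonr(X[:, j], y)  # marginal association test
        if pval < alpha / p:                  # familywise-error threshold
            hits.append(j)
    return hits

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 500))
y = X[:, 3] + rng.normal(size=200)            # predictor 3 is truly active
print(marginal_screen_bonferroni(X, y))       # typically recovers [3]
```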

40 citations


Journal ArticleDOI
TL;DR: Based on the GFNIFWBM operator, an approach to deal with multiattribute decision‐making problems under fuzzy number intuitionistic fuzzy environment is developed.
Abstract: The Bonferroni mean BM was originally introduced by Bonferroni in 1950. A prominent characteristic of BM is its capability to capture the interrelationship between input arguments. This makes BM useful in various application fields, such as decision making, information retrieval, pattern recognition, and data mining. In this paper, we examine the issue of fuzzy number intuitionistic fuzzy information fusion. We first propose a new generalized Bonferroni mean operator called generalized fuzzy number intuitionistic fuzzy weighted Bonferroni mean GFNIFWBM operator for aggregating fuzzy number intuitionistic fuzzy information. The properties of the new aggregation operator are studied and their special cases are examined. Furthermore, based on the GFNIFWBM operator, an approach to deal with multiattribute decision-making problems under fuzzy number intuitionistic fuzzy environment is developed. Finally, a practical example is provided to illustrate the multiattribute decision-making process.

39 citations


Journal ArticleDOI
TL;DR: Which multiple-comparisons procedure is preferable depends on the number of outcome variables, the importance of the PFER, the necessity of confidence intervals, and the extent to which significance in multiple variables is more valuable than significance in one variable.
Abstract: Simulations were conducted to evaluate the statistical power and Type I error control provided by several multiple-comparisons procedures in two-group designs. Stepwise Bonferroni-based procedures, which are known to control the familywise Type I error rate, tended to be more powerful than other methods but did not control the per-family Type I error rate (PFER). It is proposed that more attention should be given to the PFER, particularly with regard to these procedures. Only two methods controlled the PFER: the classical Bonferroni procedure and a modified version of MANOVA-protection. Which of these two procedures was more powerful depended on multiple factors that this article describes in detail and illustrates graphically. It is concluded that which multiple-comparisons procedure is preferable depends on the number of outcome variables, the importance of the PFER, the necessity of confidence intervals, and the extent to which significance in multiple variables is more valuable than significance in one variable.
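The two error rates at issue can be stated compactly: with V the number of Type I errors in a family of m tests,

```latex
\text{FWER} = P(V \ge 1), \qquad \text{PFER} = E[V],
\qquad P(V \ge 1) \;\le\; E[V] .
```

The classical Bonferroni procedure, testing each hypothesis at level α/m, bounds the PFER by m·(α/m) = α and hence the FWER as well, whereas the stepwise Bonferroni-based procedures control only the FWER, which is why the article finds them more powerful but not PFER-controlling.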

31 citations


Book ChapterDOI
22 Dec 2015
TL;DR: This paper discusses a transductive version of conformal predictors, which is computationally inefficient for big test sets, but it turns out that apparently crude “Bonferroni predictors” are about as good in their information efficiency and vastly superior in computational efficiency.
Abstract: This paper discusses a transductive version of conformal predictors. This version is computationally inefficient for big test sets, but it turns out that apparently crude “Bonferroni predictors” are about as good in their information efficiency and vastly superior in computational efficiency.

Journal ArticleDOI
25 Jun 2015 - eLife
TL;DR: The Reproducibility Project: Cancer Biology seeks to address growing concerns about reproducibility in scientific research by replicating selected results from a substantial number of high-profile papers in the field of cancer biology published between 2010 and 2012; this entry concerns the replication of results from 'BET bromodomain inhibition as a therapeutic strategy to target c-Myc'.
Abstract: Protocol 1: One reviewer noted that one of the main strengths of the published study (Delmore et al., 2011) was that the downregulation of Myc by JQ1 treatment was not limited to MM1.S cells, and that the effect was observed in several MM cell lines (Figures 3H and 3I); this being a key observation, the reviewer recommended that the qPCR analyses be extended to the MM cell types indicated in Figures 2A and 3I. We agree that testing additional cell types (KMS11, OPM1, LR5, and INA6) provided additional evidence that the downregulation of Myc was not limited to MM1.S cells; however, the Reproducibility Project: Cancer Biology aims to perform direct replications using the same methodology reported in the original paper. The additional cell types would be a conceptual replication, which we agree is a useful approach to test the experiment's underlying hypothesis, but which is not an aim of the project. Aspects of an experiment not included in the original study are occasionally added to ensure the quality of the research, but this is by no means a requirement of this project; rather, it is an extension of the original work. Adding aspects not included in the original study can be of scientific interest, and they can be included if it is possible to balance them with the main aim of this project: to perform a direct replication of the original experiment(s). As such, we will restrict our analysis to the experiments being replicated and will not discuss experiments not being replicated in this study.

The addition of one more time point, the 1 h treatment with 500 nM (+)-JQ1, was recommended to indicate whether the Myc downregulation by JQ1 is as dynamic as reported. Thank you for the recommendation; we have updated the manuscript to reflect this additional time point.

It was also suggested that, in addition to the qPCR primers for Myc and GAPDH mentioned, the exact qPCR primers used by Delmore et al. (2011) should be included. The qPCR primers included in this experimental design were reported in Delmore et al., 2011 (Supplemental information, Extended Materials and methods, Expression analysis).

One reviewer requested scripts and a detailed description of the calculations performed with R, suggesting for example that in Protocol 1, in the subsection headed "Test family", the following sentence could be added: the F test statistic (interaction) has been calculated following Cohen (2002) and the partial η2 has been calculated following Lakens (2013). Thank you for this recommendation. We have included a link to the scripts (https://osf.io/bjrpc/?view_only=737ba0f51c474aa1bc2782a44fba34d5). Additionally, we have added descriptions as suggested to more clearly describe the approach.

This reviewer disagreed with the choice of a simple two-way ANOVA: in the original paper (Figure 3B) paired Student's t-tests were used, and since the replicated experiments are similar to the original ones, a repeated measures ANOVA is more appropriate. A carefully chosen repeated measures ANOVA is a natural extension of the paired t-test and can simplify the implementation of the proposed meta-analysis. A drawback of a repeated measures ANOVA is that it can be more difficult to set the parameters that determine the power; in such a case a sensitivity analysis can be performed with the G*Power software. We thank the reviewer for catching the original analysis method, and we have adjusted the planned analysis to include this approach.

As the reviewer suggested, we conducted a sensitivity analysis for the repeated measures ANOVA using the planned sample size and assuming zero correlation among repeated measures and a nonsphericity correction of 1 to allow for a conservative estimate. We included the two planned comparisons (paired t-tests) assuming a correlation between groups of 0, because we do not have access to the original raw data. Additionally, it is not clear what df was used in the original paper, since the reported p values suggest a larger effect size estimate than one obtained using the estimated means and standard deviations reported in the figure. This is likely due to the technical replicates (3) being combined with the biological replicates (2), giving a df of 5, as opposed to using only the biological replicates (what is proposed in this manuscript), which gives a df of 2. Thus, we used the point estimate from G*Power using the estimated values from the graph reported in Figure 3B. We also propose to analyze the data as originally planned (as a between-subjects design), since the experimental set-up suggests this analysis: the sample comes from multiple random dishes of cells treated with or without drug, so it is possible to have a different number of data points in one group (vehicle) than the other (JQ1), making matched samples difficult. We think it is reasonable to use the independent test, and both analysis designs are reported in the literature. This will be considered additional exploratory analysis since it was not originally reported.

Protocol 2: The weight of mice should be recorded at day 0, the day of injection. This parameter is included in the manuscript (Protocol 2, Step 5b): the weight of mice will be recorded on each day of injection during the course of the experiment.

In this protocol the analyses following an ANOVA were performed using Fisher's LSD correction and an alpha error of 0.05. In its basic form (the one used in the protocol), the LSD correction does not take into account that multiple comparisons will be performed, and therefore a Bonferroni correction (or another correction) must be employed; for Protocol 2 this brings alpha to 0.025 and in practice does not dramatically change the power calculations. As an alternative to Fisher's LSD followed by Bonferroni, the Hayter–Fisher LSD procedure (Hayter, 1986) controls the MFWER (maximum familywise error rate). We agree with the reviewer's comment that corrections such as Bonferroni or Hayter's modification of the LSD are ways to control the MFWER; however, as Hayter describes in his 1986 paper, this applies in situations where the ANOVA is unbalanced or in a balanced design with four or more populations. Since the proposed analysis is balanced with three population groups, the LSD is sufficiently conservative and powerful to account for the multiple comparisons in this specific situation. This is further explained by Levin et al., 1994 and discussed in Maxwell and Delaney, 2004 (Chapter 5) and Cohen, 2001 (Chapter 12).

Fisher's LSD correction has also been reported for survival data, but it does not apply to this kind of data; here too a (Bonferroni) correction is needed. For the survival data, power calculations were performed with the Sample Size Calculator, but no clear link to the software was given, and the reviewer asked that all the parameters used and references be provided. Thank you for this correction. We have updated the power calculations to reflect this adjustment. The link to the online calculator used is included as a hyperlink in the manuscript and should direct you here: http://www.sample-size.net/sample-size-survival-analysis/, which includes the reference (Schoenfeld, 1983) for the formulas used. Additionally, screenshots of the input and output parameters are included on the project page on the Open Science Framework.

References: Levin, J.R., Serlin, R.C., & Seaman, M.A. (1994). A controlled, powerful multiple-comparison strategy for several situations. Psychological Bulletin, 115, 153–159. Maxwell, S.E. & Delaney, H.D. (2004). Designing experiments and analyzing data: a model comparison perspective. Lawrence Erlbaum Associates, Mahwah, NJ, second edition. Cohen, B.H. (2001). Explaining psychological statistics. John Wiley and Sons, New York, second edition.

Proceedings ArticleDOI
01 Jan 2015
TL;DR: In this paper, a new efficient search strategy was proposed, which always returns the same solution as the state-of-the-art approach and is approximately two orders of magnitude faster.
Abstract: The problem of finding itemsets that are statistically significantly enriched in a class of transactions is complicated by the need to correct for multiple hypothesis testing. Pruning untestable hypotheses was recently proposed as a strategy for this task of significant itemset mining. It was shown to lead to greater statistical power, the discovery of more truly significant itemsets, than the standard Bonferroni correction on real-world datasets. An open question, however, is whether this strategy of excluding untestable hypotheses also leads to greater statistical power in subgraph mining, in which the number of hypotheses is much larger than in itemset mining. Here we answer this question by an empirical investigation on eight popular graph benchmark datasets. We propose a new efficient search strategy, which always returns the same solution as the state-of-the-art approach and is approximately two orders of magnitude faster. Moreover, we exploit the dependence between subgraphs by considering the effective number of tests and thereby further increase the statistical power.
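The pruning of untestable hypotheses referred to here rests on a testability argument of the kind introduced by Tarone (1990) for discrete test statistics; a sketch:

```latex
% Each discrete hypothesis H_i has a minimum attainable p-value p_i^{*}.
% Let m(k) = \#\{\, i : p_i^{*} \le \alpha / k \,\} and choose the smallest
% integer K with m(K) \le K. Rejecting H_i whenever
p_i \le \frac{\alpha}{K}
% controls the FWER, since FWER \le m(K) \cdot \alpha / K \le \alpha,
% while all hypotheses with p_i^{*} > \alpha / K can be pruned outright.
```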

Journal ArticleDOI
TL;DR: The detailed steps of multiple attribute group decision making with the developed IFPGBM or WIFPGBM are listed, and a comparison of this paper's new extensions of the Bonferroni mean with the corresponding existing intuitionistic fuzzy Bonferroni means is given.
Abstract: The geometric Bonferroni mean (GBM) can capture the interrelationships between input arguments, which is an important generalization of Bonferroni mean (BM). In this paper, we combine geometric Bonferroni mean (GBM) with the power geometric average (PGA) operator under intuitionistic fuzzy environment and present the intuitionistic fuzzy geometric power Bonferroni mean (IFPGBM) and the weighted intuitionistic fuzzy power geometric Bonferroni mean (WIFPGBM). The desirable properties of these new extensions of Bonferroni mean and their special cases are investigated. We list the detailed steps of multiple attribute group decision making with the developed IFPGBM or WIFPGBM, and give a comparison of the new extensions of Bonferroni mean by this paper with the corresponding existing intuitionistic fuzzy Bonferroni means. Finally, examples are illustrated to show the validity and feasibility of the new approaches.

Journal ArticleDOI
TL;DR: In this paper, it is argued that, by using the Bonferroni method, a band can often be obtained that is smaller than the Wald band, which takes the joint bootstrap distribution of the impulse response coefficient estimators into account and maps it into the band.
Abstract: In impulse response analysis estimation uncertainty is typically displayed by constructing bands around estimated impulse response functions. If they are based on the joint asymptotic distribution possibly constructed with bootstrap methods in a frequentist framework, often individual confidence intervals are simply connected to obtain the bands. Such bands are known to be too narrow and have a joint coverage probability lower than the desired one. If instead the Wald statistic is used and the joint bootstrap distribution of the impulse response coefficient estimators is taken into account and mapped into the band, it is shown that such a band is typically rather conservative. It is argued that, by using the Bonferroni method, a band can often be obtained which is smaller than the Wald band.
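A sketch of the Bonferroni band construction in this setting: connect pointwise intervals CI_h for the response θ_h at each horizon h = 1, …, H, each with coverage 1 − γ/H, so that by the union bound

```latex
P\left( \theta_h \in CI_h \ \text{for all } h = 1,\dots,H \right) \ge 1 - \gamma .
```

This guarantees joint coverage without requiring the joint distribution, which is why the resulting band can be narrower than the conservative Wald band while avoiding the undercoverage of naively connected individual intervals.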

Journal ArticleDOI
TL;DR: This work proposes an alternative approach for multiplicity adjustment that incorporates dependence between outcomes, resulting in an appreciably less conservative evaluation and is demonstrated in two examples from the literature.
Abstract: Evaluation of intervention effects on multiple outcomes is a common scenario in clinical studies. In longitudinal studies, such evaluation is a challenge if one wishes to adequately capture simultaneous data behavior. In this situation, a common approach is to analyze each outcome separately. As a result, multiple statistical statements describing the intervention effect need to be reported and an adjustment for multiple testing is necessary. This is typically done by means of the Bonferroni procedure, which does not take into account the correlation between outcomes, thus resulting in overly conservative conclusions. We propose an alternative approach for multiplicity adjustment that incorporates dependence between outcomes, resulting in an appreciably less conservative evaluation. The ability of the proposed method to control the familywise error rate is evaluated in a simulation study, and the applicability of the method is demonstrated in two examples from the literature.

Journal ArticleDOI
TL;DR: The area of focus in this study was a methodology of quantitative analysis for monitoring treatment progression and clinical outcome with 19ZNF, which included repeated measures analyses of variance (rANOVA) and t-tests for z-scores; it was conducted on 10 cases in a single subject design.
Abstract: 19-Channel Z-Score Neurofeedback (19ZNF) is a modality using 19 electrodes with real-time normative-database z-scores, suggesting effective clinical outcomes in fewer sessions than traditional neurofeedback. Thus, monitoring treatment progression and clinical outcome is necessary. The area of focus in this study was a methodology of quantitative analysis for monitoring treatment progression and clinical outcome with 19ZNF. This methodology is noted as the Sites-of-Interest, which included repeated measures analyses of variance (rANOVA) and t-tests for z-scores; it was conducted on 10 cases in a single-subject design. To avoid selection bias, the 10 sample cases were randomly selected from a pool of 17 cases that met the inclusion criteria. Available client outcome measures (including self-report) are briefly discussed. The results showed 90% of the pre-post comparisons moved in the targeted direction (z = 0) and, of those, 96% (80% Bonferroni corrected) of the t-tests and 96% (91% Bonferroni corrected) of the rANOVAs were statistically significant, thus indicating a progression towards the mean in 15 or fewer 19ZNF sessions. All cases showed and reported improvement in all outcome measures (including quantitative electroencephalography assessment) at case termination.

Journal ArticleDOI
01 Apr 2015
TL;DR: In this article, three alternative inequality curves are considered as competitors of the classical Lorenz curve as descriptors of income inequality; the Bonferroni curve and the Zenga-07 curve appear to be essentially equivalent to the Lorenz curve.
Abstract: Three alternative inequality curves are considered as competitors of the classical Lorenz curve as descriptors of income inequality. The Bonferroni curve and the Zenga-07 curve appear to be essentially equivalent to the Lorenz curve. They each determine the parent distribution up to scale factor, and they each yield an inequality partial order that is equivalent to the Lorenz order. The Zenga-84 curve is more problematic. It is scale invariant, but it is possible that different distributions can have the same Zenga-84 curve. Thus it fails to identify the parent distribution up to a scale factor.
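For reference, the Bonferroni curve is a simple transform of the Lorenz curve: for a distribution with mean μ and quantile function F⁻¹,

```latex
L(p) = \frac{1}{\mu} \int_{0}^{p} F^{-1}(t)\, dt, \qquad
B(p) = \frac{L(p)}{p}, \quad 0 < p \le 1 ,
```

which is why it determines the parent distribution up to a scale factor whenever the Lorenz curve does.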

Journal ArticleDOI
TL;DR: The aim of this review was to quantify multiple testing in recent large clinical studies in the otolaryngology literature and to discuss strategies to address this potential problem.
Abstract: Objectives/Hypothesis: Multiple hypothesis testing (or multiple testing) refers to testing more than one hypothesis within a single analysis, and can inflate the type I error rate (false positives) within a study. The aim of this review was to quantify multiple testing in recent large clinical studies in the otolaryngology literature and to discuss strategies to address this potential problem. Data Sources: Original clinical research articles with >100 subjects published in 2012 in the four general otolaryngology journals with the highest Journal Citation Reports 5-year impact factors. Review Methods: Articles were reviewed to determine whether the authors tested greater than five hypotheses in at least one family of inferences. For the articles meeting this criterion for multiple testing, type I error rates were calculated, and statistical correction was applied to the reported results. Results: Of the 195 original clinical research articles reviewed, 72% met the criterion for multiple testing. Within these studies, there was a mean 41% chance of a type I error and, on average, 18% of significant results were likely to be false positives. After the Bonferroni correction was applied, only 57% of significant results reported within the articles remained significant. Conclusions: Multiple testing is common in recent large clinical studies in otolaryngology and deserves closer attention from researchers, reviewers, and editors. Strategies for adjusting for multiple testing are discussed. Level of Evidence: NA. Laryngoscope, 125:599–603, 2015.
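The inflation the review quantifies follows from the standard formula for m independent tests at α = 0.05:

```latex
P(\text{at least one false positive}) = 1 - (1 - 0.05)^{m},
```

for example, m = 10 gives 1 − 0.95¹⁰ ≈ 0.40, in line with the mean 41% chance of a Type I error reported across the reviewed studies.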

Journal ArticleDOI
TL;DR: In this article, the authors provide the profile of elite young soccer players and analyze the contribution to performance indicators of independent variables such as complete anthropometry, chronological age (CA), age at peak height velocity, a 15-m sprint test, an agility test, the Yo-yo IR1, counter-movement jump and hand dynamometry.
Abstract: The aim of this study was to provide the profile of elite young soccer players. Fifty-five players of the Under-14 category of Athletic Club Bilbao participated in this study. Players were classified into 4 playing positions: forwards (n=30), midfielders (n=15), defenders (n=37) and goalkeepers (n=15). Complete anthropometry, chronological age (CA), age at peak height velocity, 15-m sprint test, agility test, Yo-yo IT level 1 (Yo-yo IR1), counter-movement jump and hand dynamometry were measured. Results were transformed into z-scores and summed up to make two performance composites (SCORE and SCOREHG). One-way analysis of variance and a Bonferroni post hoc test were used to examine the differences between playing positions. Multiple linear regression analysis was performed to estimate the contribution of independent variables to performance indicators. Significant differences were observed between playing positions in body mass and height (P<0.05); CA, maturity offset and muscle % (P<0.01); sum of skinfolds, fat %, endomorphy, sprint and agility tests (P<0.001). Stepwise regression analysis revealed that CA and the sum of skinfolds were the most important predictors of performance. Collectively, playing positions were characterised by specific anthropometrical characteristics, whereas no significant positional differences were observed in performance. This study provides further insight concerning coaches' practice of selecting young soccer players because of physical advantages. However, other components like technical and tactical skills and cognitive and psychological factors may be important to excel in soccer.

Journal ArticleDOI
TL;DR: It is found that PC-LR and PLS-LR had comparable power and both outperformed LR in simulations, especially when the causal SNP was in high linkage disequilibrium with the genotyped SNPs and had a small effect size.

Journal ArticleDOI
TL;DR: A high variability in false discovery rate control is found for typical genomic studies, and researchers are advised to present the bootstrapped standard errors alongside the false discovery rate indices.
Abstract: Multiple hypothesis testing is a pervasive problem in genomic data analysis. The conventional Bonferroni method, which controls the family-wise error rate, is conservative and has low power. The current paradigm is to control the false discovery rate. We characterize the variability of the false discovery rate indices (local false discovery rates, q-value and false discovery proportion) using the bootstrap method. A colon cancer gene-expression dataset and a visual refractive errors genome-wide association study dataset are analyzed as demonstrations. We found high variability in false discovery rate control for typical genomic studies. We advise researchers to present the bootstrapped standard errors alongside the false discovery rate indices.
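A minimal sketch of the bootstrapping idea, here applied to the number of Benjamini-Hochberg discoveries rather than the paper's full set of FDR indices; the p-value mixture and the number of resamples are invented for illustration.

```python
# Bootstrap the count of BH discoveries to gauge variability of FDR control;
# the p-values (50 "signals" + 950 nulls) and B = 200 are assumptions.
import numpy as np
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
pvals = np.concatenate([rng.beta(0.1, 1.0, size=50),   # signal: p near 0
                        rng.uniform(size=950)])        # null: uniform p
counts = []
for _ in range(200):
    boot = rng.choice(pvals, size=pvals.size, replace=True)
    reject, _, _, _ = multipletests(boot, alpha=0.05, method="fdr_bh")
    counts.append(reject.sum())
print(np.mean(counts), np.std(counts, ddof=1))  # mean and bootstrap SE
```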

Journal ArticleDOI
TL;DR: A novel statistic based on smoothed functional principal component analysis (SFPCA) is developed for pathway association tests with next-generation sequencing data; it can capture position-level variant information and account for gametic phase disequilibrium.
Abstract: Although pathway analysis methods have been developed and successfully applied to association studies of common variants, the statistical methods for pathway-based association analysis of rare variants have not been well developed. Many investigators observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. The inflated false-positive rates and low true-positive rates of the current methods are mainly due to their lack of ability to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic that is based on the smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The developed statistic has the ability to capture position-level variant information and account for gametic phase disequilibrium. By intensive simulations, we demonstrate that the SFPCA-based statistic for testing pathway association with either rare or common or both rare and common variants has the correct type 1 error rates. Also the power of the SFPCA-based statistic and 22 additional existing statistics are evaluated. We found that the SFPCA-based statistic has a much higher power than other existing statistics in all the scenarios considered. To further evaluate its performance, the SFPCA-based statistic is applied to pathway analysis of exome sequencing data in the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after the Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic has much smaller P-values to identify pathway association than other existing methods.

Journal ArticleDOI
TL;DR: A generalized sequential Bonferroni (smooth-GSB) procedure that incorporates smoothed weights calculated from admixture mapping tests into association tests that correct for local ancestry is described; it can outperform several existing methods for GWAS in admixed populations.
Abstract: Objective: To develop effective methods for GWAS in admixed populations such as African Americans. Methods: We show that, wh

Journal ArticleDOI
TL;DR: In this paper, a sequential method based on constrained Bayesian methods is developed for testing multiple hypotheses; it controls the family-wise error rate and the family-wise power more accurately than the Bonferroni or intersection schemes, using ideas from step-up and step-down methods for multiple comparisons in sequential designs.
Abstract: A sequential method based on constrained Bayesian methods is developed for testing multiple hypotheses. It controls the family-wise error rate and the family-wise power more accurately than the Bonferroni or intersection schemes, using the ideas of step-up and step-down methods for multiple comparisons in sequential designs. The new method surpasses the existing testing methods proposed earlier with a substantial reduction of the expected sample size.

01 Jan 2015
TL;DR: In this paper, the authors show that data analysis techniques that ignore the multivariate structure of the data may lead to inaccurate estimates and erroneous conclusions. They also show that performing multiple univariate tests on a multivariate dataset inflates Type-I error rates and risks erroneously concluding that there is a distracting effect when there may not be.
Abstract: Over the past decade, a number of bodies, including government agencies, traffic safety advocacy groups and law enforcement agencies, have successfully increased the public awareness level of traffic safety risks from distracted driving. The driving simulator continues to be popular with researchers in collecting data on dependent variables that provide scientific knowledge of the effects of distracted driving. Several of these dependent variables can be used to quantify a single distracting effect, resulting in a multivariate dataset. A literature review of current studies revealed that researchers overwhelmingly use univariate (single and multiple) tests to analyze the resulting dataset. Performing multiple univariate tests on a multivariate dataset results in inflated Type-I error rates, and erroneously concluding that there is a distracting effect when there may not be. The primary objective of this research study is to show that data analysis techniques that ignore the multivariate structure of the data may lead to inaccurate estimates and erroneous conclusions. This is demonstrated with a pilot study where 13 drivers participated in a 2 (drive) x 1 (operating a navigation device) repeated measures driving simulator experiment. Six commonly used dependent variables that are often used to quantify lateral and longitudinal control were used as the multivariate response variables. The corresponding data were analyzed using univariate tests, Bonferroni adjustment tests, and multivariate gate-keeper tests. The results indicate that ignoring the multivariate structure and performing multiple univariate tests, as has been found to be prevalent in past studies, will lead to inflated Type-I error rates and potentially misleading conclusions.

Journal ArticleDOI
TL;DR: It is demonstrated that the Meff approach can be an effective alternative to Bonferroni-based methods for multichannel fNIRS studies; the number of significantly activated channels remained almost constant regardless of the number of measured channels.
Abstract: Recent advances in multichannel functional near-infrared spectroscopy (fNIRS) allow wide coverage of cortical areas while entailing the necessity to control family-wise errors (FWEs) due to increased multiplicity. Conventionally, the Bonferroni method has been used to control FWE. While Type I errors (false positives) can be strictly controlled, the application of a large number of channel settings may inflate the chance of Type II errors (false negatives). The Bonferroni-based methods are especially stringent in controlling Type I errors of the most activated channel with the smallest p value. To maintain a balance between Types I and II errors, effective multiplicity (Meff) derived from the eigenvalues of correlation matrices is a method that has been introduced in genetic studies. Thus, we explored its feasibility in multichannel fNIRS studies. Applying the Meff method to three kinds of experimental data with different activation profiles, we performed resampling simulations and found that Meff was controlled at 10 to 15 in a 44-channel setting. Consequently, the number of significantly activated channels remained almost constant regardless of the number of measured channels. We demonstrated that the Meff approach can be an effective alternative to Bonferroni-based methods for multichannel fNIRS studies.
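One widely used way to compute an effective number of tests from the eigenvalues of a correlation matrix is Nyholt's (2004) formula; a sketch follows, with the caveat that the random channel data are invented and the fNIRS study's exact Meff variant may differ.

```python
# Effective number of tests via Nyholt's eigenvalue-variance formula;
# the simulated 44-channel data are an illustrative assumption.
import numpy as np

def meff_nyholt(corr):
    lam = np.linalg.eigvalsh(corr)       # eigenvalues of the M x M matrix
    M = corr.shape[0]
    return 1 + (M - 1) * (1 - np.var(lam, ddof=1) / M)

rng = np.random.default_rng(0)
signals = rng.normal(size=(500, 44))     # 44 channels, 500 time points
corr = np.corrcoef(signals, rowvar=False)
meff = meff_nyholt(corr)
alpha_adj = 1 - (1 - 0.05) ** (1 / meff) # Sidak-style per-channel alpha
print(meff, alpha_adj)
```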

Journal ArticleDOI
TL;DR: In this article, an approximate upper percentile of the Hotelling's T2-type statistic is proposed for the case in which each dataset has a three-step monotone missing data pattern and the population covariance matrices are equal.
Abstract: In this article, we consider the problem of testing the equality of two mean vectors when the data have a three-step monotone pattern of missing observations. We propose an approximate upper percentile of the Hotelling's T2-type statistic in which each dataset has a three-step monotone missing data pattern and the population covariance matrices are equal. Further, we obtain the Hotelling's T2-type statistics and their approximate upper percentiles in the case of data with unequal two-step monotone missing data patterns. We also consider multivariate multiple comparisons for mean vectors with three-step monotone missing data. Approximate simultaneous confidence intervals for pairwise comparisons among mean vectors and comparisons with a control are obtained using Bonferroni's approximate upper percentiles of the T2max·p and T2max·c statistics, respectively. Finally, the accuracy of the approximations is investigated via Monte Carlo simulation.

Journal ArticleDOI
TL;DR: This paper modifies the combined p-value test of Zhang et al., which is based on the adaptive rank truncated product (ARTP) method, by using a different critical percentile parameter τ_j for the statistic W_j = ∏_{i=1}^{j} p_{(i)} for each j = 1, …, n.
Abstract: Multiple testing based on the familywise error rate is a classical problem in statistical inference: testing the null hypothesis H0 = H1∩H2∩···∩Hn against the alternative that at least one hypothesis among H1, H2, …, Hn is false. Classical tests include the Bonferroni test and the Simes test. So far, the most efficient methods for this problem are the combined p-value tests, such as Fisher's product test, the truncated product method, the rank truncated product test, the adaptive rank truncated product (ARTP) method and the test of Zhang et al. [A combined p-value test for multiple hypothesis testing. J Stat Plan Inference. 2013;143:764–770], which extends the ARTP method. Our method is based on Zhang et al.'s test and modifies it by using a different critical percentile parameter τ_j for the statistic W_j = ∏_{i=1}^{j} p_{(i)} for each j = 1, …, n. Simulation studies show that the sizes of our test are stable at the sign...

Posted Content
TL;DR: The authors showed that false positives are increased to unacceptable levels when no corrections are applied and that countermeasures like the Bonferroni correction can keep false positives in check while reducing statistical power only moderately.
Abstract: In research on eye movements in reading, it is common to analyze a number of canonical dependent measures to study how the effects of a manipulation unfold over time. Although this gives rise to the well-known multiple comparisons problem, i.e. an inflated probability that the null hypothesis is incorrectly rejected (Type I error), it is accepted standard practice not to apply any correction procedures. Instead, there appears to be a widespread belief that corrections are not necessary because the increase in false positives is too small to matter. To our knowledge, no formal argument has ever been presented to justify this assumption. Here, we report a computational investigation of this issue using Monte Carlo simulations. Our results show that, contrary to conventional wisdom, false positives are increased to unacceptable levels when no corrections are applied. Our simulations also show that counter-measures like the Bonferroni correction keep false positives in check while reducing statistical power only moderately. Hence, there is little reason why such corrections should not be made a standard requirement. Further, we discuss three statistical illusions that can arise when statistical power is low, and we show how power can be improved to prevent these illusions. In sum, our work renders a detailed picture of the various types of statistical errors that can occur in studies of reading behavior, and we provide concrete guidance about how these errors can be avoided.
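The flavour of the reported simulations can be conveyed in a few lines; everything below (four correlated null "reading measures", ρ = 0.5, two conditions, 40 items) is an illustrative assumption rather than the paper's actual design.

```python
# Monte Carlo sketch of family-wise false-positive inflation across several
# correlated null measures, with and without Bonferroni; settings are assumed.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_sims, n_items, n_measures, rho = 2000, 40, 4, 0.5
cov = np.full((n_measures, n_measures), rho) + (1 - rho) * np.eye(n_measures)

any_sig_raw = any_sig_bonf = 0
for _ in range(n_sims):
    # two conditions with no true effect on any measure
    a = rng.multivariate_normal(np.zeros(n_measures), cov, size=n_items)
    b = rng.multivariate_normal(np.zeros(n_measures), cov, size=n_items)
    p = np.array([stats.ttest_ind(a[:, j], b[:, j]).pvalue
                  for j in range(n_measures)])
    any_sig_raw += (p < 0.05).any()
    any_sig_bonf += (p < 0.05 / n_measures).any()

# uncorrected family-wise rate well above 0.05; Bonferroni holds it near 0.05
print(any_sig_raw / n_sims, any_sig_bonf / n_sims)
```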