
Showing papers on "Statistical hypothesis testing published in 2009"


Journal ArticleDOI
TL;DR: In this paper, the estimation and testing of long-run relations in economic modeling are addressed; starting with a vector autoregressive (VAR) model, the hypothesis of cointegration is formulated as the hypothesis of reduced rank of the long-run impact matrix.
Abstract: The estimation and testing of long-run relations in economic modeling are addressed. Starting with a vector autoregressive (VAR) model, the hypothesis of cointegration is formulated as the hypothesis of reduced rank of the long-run impact matrix. This is given in a simple parametric form that allows the application of the method of maximum likelihood and likelihood ratio tests. In this way, one can derive estimates and test statistics for the hypothesis of a given number of cointegration vectors, as well as estimates and tests for linear hypotheses about the cointegration vectors and their weights. The asymptotic inferences concerning the number of cointegrating vectors involve nonstandard distributions. Inference concerning linear restrictions on the cointegration vectors and their weights can be performed using the usual chi squared methods. In the case of linear restrictions on beta, a Wald test procedure is suggested. The proposed methods are illustrated by money demand data from the Danish and Finnish economies.

12,449 citations
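
A minimal sketch of how a reduced-rank (trace) test of this kind can be run in practice, assuming the statsmodels implementation coint_johansen; the simulated two-variable system and the lag order are illustrative, not the Danish/Finnish money-demand data from the paper.

```python
# Sketch: Johansen-style trace test for the cointegration rank of a VAR, using
# statsmodels. The simulated two-variable system below is only an illustration.
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

rng = np.random.default_rng(0)
n = 500
common_trend = np.cumsum(rng.normal(size=n))               # shared stochastic trend
y1 = common_trend + rng.normal(scale=0.5, size=n)          # two I(1) series that share
y2 = 0.8 * common_trend + rng.normal(scale=0.5, size=n)    # the trend -> cointegrated
data = np.column_stack([y1, y2])

# det_order=0: constant term; k_ar_diff=1: one lagged difference (assumed here)
res = coint_johansen(data, det_order=0, k_ar_diff=1)

for r, (trace_stat, cvals) in enumerate(zip(res.lr1, res.cvt)):
    # cvt columns hold the 90%, 95% and 99% critical values
    reject = trace_stat > cvals[1]
    print(f"H0: rank <= {r}: trace = {trace_stat:.2f}, "
          f"95% cv = {cvals[1]:.2f}, reject = {reject}")
```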


Journal ArticleDOI
TL;DR: The use (and misuse) of GLMMs in ecology and evolution are reviewed, estimation and inference are discussed, and 'best-practice' data analysis procedures for scientists facing this challenge are summarized.
Abstract: How should ecologists and evolutionary biologists analyze nonnormal data that involve random effects? Nonnormal data such as counts or proportions often defy classical statistical procedures. Generalized linear mixed models (GLMMs) provide a more flexible approach for analyzing nonnormal data when random effects are present. The explosion of research on GLMMs in the last decade has generated considerable uncertainty for practitioners in ecology and evolution. Despite the availability of accurate techniques for estimating GLMM parameters in simple cases, complex GLMMs are challenging to fit and statistical inference such as hypothesis testing remains difficult. We review the use (and misuse) of GLMMs in ecology and evolution, discuss estimation and inference and summarize 'best-practice' data analysis procedures for scientists facing this challenge.

7,207 citations


Journal ArticleDOI
TL;DR: To facilitate use of the Bayes factor, an easy-to-use, Web-based program is provided that performs the necessary calculations and has better properties than other methods of inference that have been advocated in the psychological literature.
Abstract: Progress in science often comes from discovering invariances in relationships among variables; these invariances often correspond to null hypotheses. As is commonly known, it is not possible to state evidence for the null hypothesis in conventional significance testing. Here we highlight a Bayes factor alternative to the conventional t test that will allow researchers to express preference for either the null hypothesis or the alternative. The Bayes factor has a natural and straightforward interpretation, is based on reasonable assumptions, and has better properties than other methods of inference that have been advocated in the psychological literature. To facilitate use of the Bayes factor, we provide an easy-to-use, Web-based program that performs the necessary calculations.

3,012 citations
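
The paper points readers to a Web-based calculator; as a rough stand-in, the sketch below numerically evaluates the JZS Bayes factor for a one-sample t test using the integral form given in the paper, assuming the default unit-information prior (scale r = 1). The t value and sample size are invented.

```python
# Sketch: JZS Bayes factor for a one-sample t test (Rouder et al., 2009), assuming
# the default unit-information prior (scale r = 1) on effect size.
import numpy as np
from scipy import integrate

def jzs_bf01(t, n):
    """Bayes factor BF01 (evidence for the null) from a t statistic and sample size n."""
    v = n - 1                                    # degrees of freedom
    numerator = (1 + t**2 / v) ** (-(v + 1) / 2)

    def integrand(g):
        return ((1 + n * g) ** -0.5
                * (1 + t**2 / ((1 + n * g) * v)) ** (-(v + 1) / 2)
                * (2 * np.pi) ** -0.5 * g ** -1.5 * np.exp(-1 / (2 * g)))

    denominator, _ = integrate.quad(integrand, 0, np.inf)
    return numerator / denominator

# Example: a "non-significant" t(49) = 1.0 yields BF01 > 1, i.e. positive evidence
# *for* the null, something a p value cannot express.
print(jzs_bf01(t=1.0, n=50))
```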


Journal ArticleDOI
TL;DR: It is suggested that random-effects meta-analyses as currently conducted often fail to provide the key results, and the extent to which distribution-free, classical and Bayesian approaches can provide satisfactory methods is investigated.
Abstract: Meta-analysis in the presence of unexplained heterogeneity is frequently undertaken by using a random-effects model, in which the effects underlying different studies are assumed to be drawn from a normal distribution. Here we discuss the justification and interpretation of such models, by addressing in turn the aims of estimation, prediction and hypothesis testing. A particular issue that we consider is the distinction between inference on the mean of the random-effects distribution and inference on the whole distribution. We suggest that random-effects meta-analyses as currently conducted often fail to provide the key results, and we investigate the extent to which distribution-free, classical and Bayesian approaches can provide satisfactory methods. We conclude that the Bayesian approach has the advantage of naturally allowing for full uncertainty, especially for prediction. However, it is not without problems, including computational intensity and sensitivity to a priori judgements. We propose a simple prediction interval for classical meta-analysis and offer extensions to standard practice of Bayesian meta-analysis, making use of an example of studies of 'set shifting' ability in people with eating disorders.

1,792 citations
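
A minimal sketch of the simple prediction interval proposed in the paper, layered on a standard DerSimonian-Laird random-effects fit; the effect estimates and variances below are invented, not the 'set shifting' data.

```python
# Sketch: random-effects meta-analysis with the approximate prediction interval
# proposed in the paper: mu_hat +/- t_{k-2} * sqrt(tau2_hat + se(mu_hat)^2).
import numpy as np
from scipy import stats

y = np.array([0.30, 0.12, 0.45, 0.21, 0.60, 0.05, 0.33])   # study effect estimates (invented)
v = np.array([0.02, 0.03, 0.05, 0.01, 0.04, 0.02, 0.03])   # within-study variances (invented)

k = len(y)
w = 1 / v                                                   # fixed-effect weights
y_fe = np.sum(w * y) / np.sum(w)
Q = np.sum(w * (y - y_fe) ** 2)                             # Cochran's Q
tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))  # DerSimonian-Laird

w_re = 1 / (v + tau2)                                       # random-effects weights
mu = np.sum(w_re * y) / np.sum(w_re)
se_mu = np.sqrt(1 / np.sum(w_re))

# 95% CI for the mean effect vs. 95% prediction interval for the effect in a new study
ci = mu + np.array([-1, 1]) * stats.norm.ppf(0.975) * se_mu
pi = mu + np.array([-1, 1]) * stats.t.ppf(0.975, df=k - 2) * np.sqrt(tau2 + se_mu**2)

print(f"mean effect {mu:.3f}, 95% CI {ci}, 95% prediction interval {pi}")
```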


Journal ArticleDOI
TL;DR: This study analyzes the published results for the algorithms presented in the CEC’2005 Special Session on Real Parameter Optimization by using non-parametric test procedures and finds that a parametric statistical analysis may not be appropriate, especially when dealing with multiple-problem results.
Abstract: In recent years, there has been growing interest in the experimental analysis of evolutionary algorithms, as evidenced by the numerous papers that address the basis for experimental comparisons of algorithms, propose comparison methodologies, or propose the use of different statistical techniques in algorithm comparison. In this paper, we focus our study on the use of statistical techniques in the analysis of evolutionary algorithms' behaviour over optimization problems. A study of the conditions required for statistical analysis of the results is presented using several models of evolutionary algorithms for real-coded optimization. This study is conducted in two ways: single-problem analysis and multiple-problem analysis. The results obtained indicate that a parametric statistical analysis may not be appropriate, especially when we deal with multiple-problem results. For multiple-problem analysis, we propose the use of non-parametric statistical tests, given that they are less restrictive than parametric ones and can be used on small samples of results. As a case study, we analyze the published results for the algorithms presented in the CEC'2005 Special Session on Real Parameter Optimization by using non-parametric test procedures.

1,543 citations
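
A small sketch of the multiple-problem, non-parametric analysis advocated above, using SciPy's Wilcoxon signed-rank and Friedman tests on invented benchmark results.

```python
# Sketch: non-parametric comparison of evolutionary algorithms over a set of
# benchmark problems (multiple-problem analysis). The error values are invented;
# rows = problems, columns = algorithms.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
errors = rng.lognormal(mean=0.0, sigma=1.0, size=(25, 3))   # 25 problems x 3 algorithms
errors[:, 0] *= 0.7                                         # pretend algorithm 0 is better

# Pairwise comparison of two algorithms: Wilcoxon signed-rank test on paired results
w_stat, p_pair = stats.wilcoxon(errors[:, 0], errors[:, 1])
print(f"Wilcoxon signed-rank: statistic={w_stat:.1f}, p={p_pair:.4f}")

# Comparison of all algorithms at once: Friedman test on the per-problem rankings
chi2, p_friedman = stats.friedmanchisquare(*[errors[:, j] for j in range(errors.shape[1])])
print(f"Friedman test: chi2={chi2:.2f}, p={p_friedman:.4f}")
```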


Journal ArticleDOI
TL;DR: It is concluded that the Bayesian phylogeographic framework will be an important asset in molecular epidemiology that can easily be generalized to infer biogeography from genetic data for many organisms.
Abstract: As a key factor in endemic and epidemic dynamics, the geographical distribution of viruses has been frequently interpreted in the light of their genetic histories. Unfortunately, inference of historical dispersal or migration patterns of viruses has mainly been restricted to model-free heuristic approaches that provide little insight into the temporal setting of the spatial dynamics. The introduction of probabilistic models of evolution, however, offers unique opportunities to engage in this statistical endeavor. Here we introduce a Bayesian framework for inference, visualization and hypothesis testing of phylogeographic history. By implementing character mapping in Bayesian software that samples time-scaled phylogenies, we enable the reconstruction of timed viral dispersal patterns while accommodating phylogenetic uncertainty. Standard Markov model inference is extended with a stochastic search variable selection procedure that identifies the parsimonious descriptions of the diffusion process. In addition, we propose priors that can incorporate geographical sampling distributions or characterize alternative hypotheses about the spatial dynamics. To visualize the spatial and temporal information, we summarize inferences using virtual globe software. We describe how Bayesian phylogeography compares with previous parsimony analysis in the investigation of the influenza A H5N1 origin and H5N1 epidemiological linkage among sampling localities. Analysis of rabies in West African dog populations reveals how virus diffusion may enable endemic maintenance through continuous epidemic cycles. From these analyses, we conclude that our phylogeographic framework will be an important asset in molecular epidemiology that can be easily generalized to infer biogeography from genetic data for many organisms.

1,535 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present a critical review of the blanket test procedures and suggest new ones for goodness-of-fit testing of copula models, and describe and interpret the results of a large Monte Carlo experiment designed to assess the effect of the sample size and the strength of dependence on the level and power of blanket tests for various combinations of Copula models under the null hypothesis and the alternative.
Abstract: Many proposals have been made recently for goodness-of-fit testing of copula models. After reviewing them briefly, the authors concentrate on “blanket tests”, i.e., those whose implementation requires neither an arbitrary categorization of the data nor any strategic choice of smoothing parameter, weight function, kernel, window, etc. The authors present a critical review of these procedures and suggest new ones. They describe and interpret the results of a large Monte Carlo experiment designed to assess the effect of the sample size and the strength of dependence on the level and power of the blanket tests for various combinations of copula models under the null hypothesis and the alternative. To circumvent problems in the determination of the limiting distribution of the test statistics under composite null hypotheses, they recommend the use of a double parametric bootstrap procedure, whose implementation is detailed. They conclude with a number of practical recommendations.

995 citations
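
A simplified sketch of a blanket test of the kind reviewed above: the Cramér-von Mises statistic S_n comparing the empirical copula with a fitted Gaussian copula, calibrated by a single parametric bootstrap rather than the double bootstrap detailed in the paper. Data, copula family and the small replicate count are illustrative assumptions.

```python
# Sketch: "blanket" goodness-of-fit test for a bivariate Gaussian copula. The statistic
# is the Cramer-von Mises distance S_n between the empirical copula and the fitted
# copula, calibrated by a (single) parametric bootstrap.
import numpy as np
from scipy import stats

def pseudo_obs(x):
    """Rank-based pseudo-observations in (0, 1)."""
    return stats.rankdata(x, axis=0) / (x.shape[0] + 1)

def empirical_copula(u, points):
    """Empirical copula C_n evaluated at each row of `points`."""
    return np.array([np.mean(np.all(u <= p, axis=1)) for p in points])

def gaussian_copula_cdf(u, rho):
    z = stats.norm.ppf(u)
    return stats.multivariate_normal(mean=[0, 0], cov=[[1, rho], [rho, 1]]).cdf(z)

def cvm_statistic(u):
    """Fit rho by inversion of Kendall's tau, then compute S_n."""
    rho = np.sin(np.pi * stats.kendalltau(u[:, 0], u[:, 1])[0] / 2)
    s_n = np.sum((empirical_copula(u, u) - gaussian_copula_cdf(u, rho)) ** 2)
    return s_n, rho

rng = np.random.default_rng(2)
n = 150
x = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=n)   # data truly Gaussian here
s_obs, rho_hat = cvm_statistic(pseudo_obs(x))

s_boot = []
for _ in range(100):                                                # parametric bootstrap
    z = rng.multivariate_normal([0, 0], [[1, rho_hat], [rho_hat, 1]], size=n)
    s_boot.append(cvm_statistic(pseudo_obs(z))[0])

p_value = np.mean(np.array(s_boot) >= s_obs)
print(f"S_n = {s_obs:.4f}, bootstrap p-value = {p_value:.3f}")
```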


Journal ArticleDOI
TL;DR: In this article, a measure of accounting conservatism, C_score, was proposed to capture variation in conservatism and also predict asymmetric earnings timeliness at horizons of up to three years ahead.
Abstract: We estimate a firm-year measure of accounting conservatism, examine its empirical properties as a metric, and illustrate applications by testing new hypotheses that shed further light on the nature and effects of conservatism. The results are consistent with the measure, C_Score, capturing variation in conservatism and also predicting asymmetric earnings timeliness at horizons of up to three years ahead. Cross-sectional hypothesis tests suggest firms with longer investment cycles, higher idiosyncratic uncertainty and higher information asymmetry have higher accounting conservatism. Event studies suggest increased conservatism is a response to increases in information asymmetry and idiosyncratic uncertainty.

943 citations


Book
03 Sep 2009
TL;DR: This valuable book shows second language researchers how to use the statistical program SPSS to conduct statistical tests frequently done in second language research, including chi-square, t-tests, correlation, multiple regression, ANOVA and non-parametric analogs to these tests.
Abstract: This valuable book shows second language researchers how to use the statistical program SPSS to conduct statistical tests frequently done in SLA research. Using data sets from real SLA studies, A Guide to Doing Statistics in Second Language Research Using SPSS shows newcomers to both statistics and SPSS how to generate descriptive statistics, how to choose a statistical test, and how to conduct and interpret a variety of basic statistical tests. The author covers the statistical tests that are most commonly used in second language research, including chi-square, t-tests, correlation, multiple regression, ANOVA and non-parametric analogs to these tests. The text is abundantly illustrated with graphs and tables depicting actual data sets, and exercises throughout the book help readers understand concepts (such as the difference between independent and dependent variables) and work out statistical analyses. Answers to all exercises are provided on the book's companion website, along with sample data sets and other supplementary material.

754 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider an alternative explanation, which adds the hypothesis that people like to be perceived as fair, which has additional testable implications, the validity of which they confirm through new experiments.
Abstract: A norm of 50-50 division appears to have considerable force in a wide range of economic environments, both in the real world and in the laboratory. Even in settings where one party unilaterally determines the allocation of a prize (the dictator game), many subjects voluntarily cede exactly half to another individual. The hypothesis that people care about fairness does not by itself account for key experimental patterns. We consider an alternative explanation, which adds the hypothesis that people like to be perceived as fair. The properties of equilibria for the resulting signaling game correspond closely to laboratory observations. The theory has additional testable implications, the validity of which we confirm through new experiments.

733 citations


Journal ArticleDOI
01 Jan 2009-Oikos
TL;DR: It is observed that traditional ‘gap-counting’ metrics are biased towards species loss among columns (occupied sites) and that many metrics are not invariant to basic matrix properties; the study of nestedness should therefore be combined with an appropriate gradient analysis to infer possible causes of the observed presence–absence sequence.
Abstract: Nestedness analysis has become increasingly popular in the study of biogeographic patterns of species occurrence. Nested patterns are those in which the species composition of small assemblages is a nested subset of larger assemblages. For species interaction networks such as plant–pollinator webs, nestedness analysis has also proven a valuable tool for revealing ecological and evolutionary constraints. Despite this popularity, there has been substantial controversy in the literature over the best methods to define and quantify nestedness, and how to test for patterns of nestedness against an appropriate statistical null hypothesis. Here we review this rapidly developing literature and provide suggestions and guidelines for proper analyses. We focus on the logic and the performance of different metrics and the proper choice of null models for statistical inference. We observe that traditional ‘gap-counting’ metrics are biased towards species loss among columns (occupied sites) and that many metrics are not invariant to basic matrix properties. The study of nestedness should be combined with an appropriate gradient analysis to infer possible causes of the observed presence–absence sequence. In our view, statistical inference should be based on a null model in which row and column sums are fixed. Under this model, only a relatively small number of published empirical matrices are significantly nested. We call for a critical reassessment of previous studies that have used biased metrics and unconstrained null models for statistical inference.
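
A simplified sketch of the recommended analysis: a nestedness metric (here NODF, one of several in use) compared against a null model that fixes both row and column sums, generated by checkerboard swaps. The incidence matrix and the swap/replicate counts are invented for illustration.

```python
# Sketch: test for nestedness with the NODF metric against a "fixed-fixed" null model
# (row and column sums preserved), generated by random checkerboard swaps.
import numpy as np

def nodf(m):
    """NODF nestedness (0-100) of a binary presence-absence matrix."""
    def pair_score(a, b):
        fa, fb = a.sum(), b.sum()
        if fa == fb or min(fa, fb) == 0:
            return 0.0                                   # no decreasing fill -> no contribution
        poorer = a if fa < fb else b
        richer = b if fa < fb else a
        overlap = np.sum((poorer == 1) & (richer == 1))  # shared presences
        return 100.0 * overlap / poorer.sum()
    n_rows, n_cols = m.shape
    total = sum(pair_score(m[:, i], m[:, j]) for i in range(n_cols) for j in range(i + 1, n_cols))
    total += sum(pair_score(m[i, :], m[j, :]) for i in range(n_rows) for j in range(i + 1, n_rows))
    return total / (n_cols * (n_cols - 1) / 2 + n_rows * (n_rows - 1) / 2)

def swap_randomize(m, attempts, rng):
    """Checkerboard swaps preserve all row and column sums."""
    m = m.copy()
    for _ in range(attempts):
        r = rng.choice(m.shape[0], 2, replace=False)
        c = rng.choice(m.shape[1], 2, replace=False)
        sub = m[np.ix_(r, c)]
        if sub[0, 0] == sub[1, 1] and sub[0, 1] == sub[1, 0] and sub[0, 0] != sub[0, 1]:
            m[np.ix_(r, c)] = 1 - sub
    return m

rng = np.random.default_rng(3)
n_rows, n_cols = 15, 10
ii, jj = np.meshgrid(np.arange(n_rows), np.arange(n_cols), indexing="ij")
p_occur = 1 / (1 + np.exp(4 * (ii / n_rows + jj / n_cols - 1)))   # nested tendency
observed = (rng.random((n_rows, n_cols)) < p_occur).astype(int)

obs_nodf = nodf(observed)
null_nodf = [nodf(swap_randomize(observed, attempts=2000, rng=rng)) for _ in range(200)]
p_value = np.mean(np.array(null_nodf) >= obs_nodf)
print(f"observed NODF = {obs_nodf:.1f}, fixed-fixed null p-value = {p_value:.3f}")
```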

Journal ArticleDOI
02 Apr 2009
TL;DR: This paper presents a study of a set of techniques that can be used for making rigorous comparisons among algorithms, in terms of obtaining successful classification models, and proposes the use of the most powerful non-parametric statistical tests to carry out multiple comparisons.
Abstract: The experimental analysis of the performance of a proposed method is a crucial and necessary task in research. This paper focuses on the statistical analysis of results in the field of genetics-based machine learning. It presents a study of a set of techniques that can be used for making rigorous comparisons among algorithms in terms of obtaining successful classification models. Two accuracy measures for multi-class problems have been employed: classification rate and Cohen’s kappa. Furthermore, two interpretability measures have been employed: size of the rule set and number of antecedents. We have studied whether the samples of results obtained by genetics-based classifiers, using the performance measures cited above, satisfy the necessary conditions for being analysed by means of parametric tests. The results obtained indicate that the fulfilment of these conditions is problem-dependent and cannot be taken for granted, which supports the use of non-parametric statistics in the experimental analysis. In addition, non-parametric tests can be satisfactorily employed for comparing generic classifiers over various data sets considering any performance measure. Accordingly, we propose the use of the most powerful non-parametric statistical tests to carry out multiple comparisons. However, the statistical analysis conducted on interpretability must be carefully considered.
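
A common recipe in this literature for the proposed multiple comparisons is a Friedman test followed by Holm-corrected pairwise Wilcoxon tests against a control method; the sketch below illustrates that recipe on invented accuracy results (it is not tied to the specific genetics-based classifiers studied in the paper).

```python
# Sketch: Friedman test across several classifiers over many data sets, followed by
# Holm-corrected pairwise Wilcoxon tests against a control classifier (column 0).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
acc = rng.uniform(0.70, 0.90, size=(20, 4))          # 20 data sets x 4 classifiers (invented)
acc[:, 0] += 0.03                                    # pretend the control is slightly better

chi2, p = stats.friedmanchisquare(*[acc[:, j] for j in range(acc.shape[1])])
print(f"Friedman: chi2={chi2:.2f}, p={p:.4f}")

# Pairwise Wilcoxon tests of the control vs. every other classifier
p_raw = np.array([stats.wilcoxon(acc[:, 0], acc[:, j]).pvalue for j in range(1, acc.shape[1])])

# Holm step-down adjustment (with monotonicity enforced)
order = np.argsort(p_raw)
m = len(p_raw)
adjusted = np.empty(m)
running_max = 0.0
for rank, idx in enumerate(order):
    running_max = max(running_max, (m - rank) * p_raw[idx])
    adjusted[idx] = min(1.0, running_max)

for j in range(m):
    print(f"control vs classifier {j + 1}: raw p={p_raw[j]:.4f}, Holm-adjusted p={adjusted[j]:.4f}")
```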

Book Chapter
01 Dec 2009
TL;DR: A novel test of the independence hypothesis is provided for one particular kernel independence measure, the Hilbert-Schmidt independence criterion (HSIC); it outperforms established contingency table and functional correlation-based tests, and this advantage is greater for multivariate data.
Abstract: Although kernel measures of independence have been widely applied in machine learning (notably in kernel ICA), there is as yet no method to determine whether they have detected statistically significant dependence. We provide a novel test of the independence hypothesis for one particular kernel independence measure, the Hilbert-Schmidt independence criterion (HSIC). The resulting test costs O(m²), where m is the sample size. We demonstrate that this test outperforms established contingency table and functional correlation-based tests, and that this advantage is greater for multivariate data. Finally, we show the HSIC test also applies to text (and to structured data more generally), for which no other independence test presently exists.
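
A minimal sketch of an HSIC-based independence test with Gaussian kernels; the null distribution is approximated here by permutation, which is simpler (though more expensive) than the asymptotic approximation developed in the paper, and the data are invented.

```python
# Sketch: permutation-based independence test using the Hilbert-Schmidt independence
# criterion (HSIC) with Gaussian kernels on invented, nonlinearly dependent data.
import numpy as np

def gaussian_kernel(x, sigma):
    d2 = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)   # pairwise squared distances
    return np.exp(-d2 / (2 * sigma ** 2))

def hsic(K, L):
    m = K.shape[0]
    H = np.eye(m) - np.ones((m, m)) / m                          # centering matrix
    return np.trace(K @ H @ L @ H) / m ** 2                      # biased HSIC estimate

rng = np.random.default_rng(5)
m = 200
x = rng.normal(size=(m, 1))
y = np.sin(3 * x) + 0.3 * rng.normal(size=(m, 1))                # y depends nonlinearly on x

K = gaussian_kernel(x, sigma=np.median(np.abs(x - x.T)) + 1e-12)  # median-heuristic bandwidth
L = gaussian_kernel(y, sigma=np.median(np.abs(y - y.T)) + 1e-12)

stat = hsic(K, L)
null = []
for _ in range(200):
    perm = rng.permutation(m)                                     # break the pairing of (x, y)
    null.append(hsic(K, L[np.ix_(perm, perm)]))
p_value = np.mean(np.array(null) >= stat)
print(f"HSIC = {stat:.5f}, permutation p-value = {p_value:.3f}")
```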

Journal ArticleDOI
TL;DR: This paper looks at the error rates and power of some multi-stage regression methods and considers three screening methods: the lasso, marginal regression, and forward stepwise regression.
Abstract: This paper explores the following question: what kind of statistical guarantees can be given when doing variable selection in high dimensional models? In particular, we look at the error rates and power of some multi-stage regression methods. In the first stage we fit a set of candidate models. In the second stage we select one model by cross-validation. In the third stage we use hypothesis testing to eliminate some variables. We refer to the first two stages as "screening" and the last stage as "cleaning." We consider three screening methods: the lasso, marginal regression, and forward stepwise regression. Our method gives consistent variable selection under certain conditions.
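
A minimal sketch of the screen-and-clean idea with the lasso as the screening method, assuming scikit-learn and statsmodels; the data split, simulated design and Bonferroni cleaning rule are illustrative choices, not the exact procedure or conditions analysed in the paper.

```python
# Sketch of screen-and-clean: screen variables with the lasso on one half of the data,
# then "clean" by fitting least squares on the other half and keeping only coefficients
# that survive a (Bonferroni-corrected) t-test.
import numpy as np
from sklearn.linear_model import LassoCV
import statsmodels.api as sm

rng = np.random.default_rng(6)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]                      # only the first 3 variables matter
y = X @ beta + rng.normal(size=n)

# Screening: lasso with cross-validation on the first half of the sample
half = n // 2
lasso = LassoCV(cv=5).fit(X[:half], y[:half])
screened = np.flatnonzero(lasso.coef_ != 0)

# Cleaning: OLS + hypothesis tests on the held-out half
ols = sm.OLS(y[half:], sm.add_constant(X[half:, screened])).fit()
alpha = 0.05 / max(len(screened), 1)             # Bonferroni over the screened set
selected = screened[ols.pvalues[1:] < alpha]     # skip the intercept's p-value
print("screened:", screened, "selected after cleaning:", selected)
```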

Journal ArticleDOI
TL;DR: In this paper, the authors propose a new definition of depth for functional observations based on the graphic representation of the curves, which establishes the centrality of an observation and provides a natural center-outward ordering of the sample curves.
Abstract: The statistical analysis of functional data is a growing need in many research areas. In particular, a robust methodology is important to study curves, which are the output of many experiments in applied statistics. As a starting point for this robust analysis, we propose, analyze, and apply a new definition of depth for functional observations based on the graphic representation of the curves. Given a collection of functions, it establishes the “centrality” of an observation and provides a natural center-outward ordering of the sample curves. Robust statistics, such as the median function or a trimmed mean function, can be defined from this depth definition. Its finite-dimensional version provides a new depth for multivariate data that is computationally feasible and useful for studying high-dimensional observations. Thus, this new depth is also suitable for complex observations such as microarray data, images, and those arising in some recent marketing and financial studies. Natural properties of these ...
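
A minimal sketch of a band-depth computation in the spirit of the definition described above (the modified band depth with bands formed by pairs of curves); the simulated curves and the outlier are invented.

```python
# Sketch: modified band depth (bands defined by pairs of curves) for a sample of
# functional observations, and the depth-based median curve.
import numpy as np
from itertools import combinations

def modified_band_depth(curves):
    """curves: array of shape (n_curves, n_timepoints). Returns one depth per curve."""
    n, _ = curves.shape
    depth = np.zeros(n)
    for i, j in combinations(range(n), 2):
        lower = np.minimum(curves[i], curves[j])        # band defined by curves i and j
        upper = np.maximum(curves[i], curves[j])
        inside = (curves >= lower) & (curves <= upper)  # where each curve lies in the band
        depth += inside.mean(axis=1)                    # proportion of time inside the band
    return depth / (n * (n - 1) / 2)

rng = np.random.default_rng(7)
grid = np.linspace(0, 1, 100)
curves = (np.sin(2 * np.pi * grid)
          + rng.normal(scale=0.3, size=(30, 1))         # random vertical shifts
          + 0.2 * rng.normal(size=(30, 100)))           # pointwise noise
curves[0] += 3.0                                        # make one curve an outlier

depths = modified_band_depth(curves)
print("median curve index:", int(np.argmax(depths)))    # deepest curve = functional median
print("depth of the outlier curve:", round(float(depths[0]), 3))
```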

Journal ArticleDOI
TL;DR: Spectrum-sensing algorithms are proposed based on the sample covariance matrix calculated from a limited number of received signal samples; two test statistics are then extracted from the sample covariance matrix.
Abstract: Spectrum sensing, i.e., detecting the presence of primary users in a licensed spectrum, is a fundamental problem in cognitive radio. Since the statistical covariances of the received signal and noise are usually different, they can be used to differentiate the case where the primary user's signal is present from the case where there is only noise. In this paper, spectrum-sensing algorithms are proposed based on the sample covariance matrix calculated from a limited number of received signal samples. Two test statistics are then extracted from the sample covariance matrix. A decision on the signal presence is made by comparing the two test statistics. Theoretical analysis for the proposed algorithms is given. Detection probability and the associated threshold are found based on the statistical theory. The methods do not need any information about the signal, channel, and noise power a priori. In addition, no synchronization is needed. Simulations based on narrow-band signals, captured digital television (DTV) signals, and multiple antenna signals are presented to verify the methods.
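
A simplified sketch of covariance-based sensing: compare the total absolute covariance of stacked received samples with its diagonal part, since noise alone leaves the off-diagonal entries near zero while a correlated primary signal does not. The smoothing length, the simulated signal and the fixed threshold are illustrative; the paper derives the threshold from the target false-alarm probability.

```python
# Sketch: covariance-based spectrum sensing. Stack L consecutive received samples,
# estimate their sample covariance matrix R, and compare the total absolute covariance
# T1 with its diagonal part T2: noise only gives T1/T2 near 1, a correlated signal does not.
import numpy as np

def covariance_ratio(x, L=8):
    """T1/T2 computed from the sample covariance of L-dimensional sample vectors."""
    n = len(x) - L + 1
    vectors = np.stack([x[i:i + L] for i in range(n)])     # overlapping length-L windows
    R = vectors.T @ vectors / n                            # sample covariance matrix (L x L)
    t1 = np.sum(np.abs(R)) / L
    t2 = np.trace(np.abs(R)) / L
    return t1 / t2

rng = np.random.default_rng(8)
noise_only = rng.normal(size=4000)

# A crude "primary user" signal: a filtered (hence correlated) waveform buried in noise
signal = 2.0 * np.convolve(rng.normal(size=4000), np.ones(5) / 5, mode="same")
signal_plus_noise = signal + rng.normal(size=4000)

threshold = 1.2                                            # illustrative threshold only
for name, x in [("noise only", noise_only), ("signal + noise", signal_plus_noise)]:
    ratio = covariance_ratio(x)
    print(f"{name}: T1/T2 = {ratio:.3f} -> primary user detected: {ratio > threshold}")
```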

Journal ArticleDOI
TL;DR: The retrieval effort hypothesis, as discussed by the authors, states that difficult but successful retrievals are better for memory than easier successful retrievals; as the difficulty of retrieval during practice increases, final test performance increases.

Journal ArticleDOI
TL;DR: The results of this study have major implications for all analyses that rely on accurate estimates of topology or branch lengths, including divergence time estimation, ancestral state reconstruction, tree-dependent comparative methods, rate variation analysis, phylogenetic hypothesis testing, and phylogeographic analysis.
Abstract: Although an increasing number of phylogenetic data sets are incomplete, the effect of ambiguous data on phylogenetic accuracy is not well understood. We use 4-taxon simulations to study the effects of ambiguous data (i.e., missing characters or gaps) in maximum likelihood (ML) and Bayesian frameworks. By introducing ambiguous data in a way that removes confounding factors, we provide the first clear understanding of 1 mechanism by which ambiguous data can mislead phylogenetic analyses. We find that in both ML and Bayesian frameworks, among-site rate variation can interact with ambiguous data to produce misleading estimates of topology and branch lengths. Furthermore, within a Bayesian framework, priors on branch lengths and rate heterogeneity parameters can exacerbate the effects of ambiguous data, resulting in strongly misleading bipartition posterior probabilities. The magnitude and direction of the ambiguous data bias are a function of the number and taxonomic distribution of ambiguous characters, the strength of topological support, and whether or not the model is correctly specified. The results of this study have major implications for all analyses that rely on accurate estimates of topology or branch lengths, including divergence time estimation, ancestral state reconstruction, tree-dependent comparative methods, rate variation analysis, phylogenetic hypothesis testing, and phylogeographic analysis. (Ambiguous characters; ambiguous data; Bayesian; bias; maximum likelihood; missing data; model misspecification; phylogenetics; posterior probabilities; prior.)

Journal ArticleDOI
TL;DR: It can be demonstrated that data closure must be overcome prior to calculating even simple statistical measures such as the mean or standard deviation, or plotting graphs of the data distribution, e.g. a histogram.
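
Closure refers to compositional data whose parts sum to a constant; a standard way to "open" such data before computing means, standard deviations or histograms is a log-ratio transformation. A minimal sketch of the centred log-ratio (clr) transform, assuming strictly positive parts and invented compositions:

```python
# Sketch: centred log-ratio (clr) transform to overcome data closure before applying
# ordinary statistics. Compositions (rows) are assumed strictly positive and invented.
import numpy as np

def clr(composition):
    """Centred log-ratio transform of compositions given as rows summing to a constant."""
    logx = np.log(composition)
    return logx - logx.mean(axis=1, keepdims=True)

rng = np.random.default_rng(9)
parts = rng.dirichlet(alpha=[4, 2, 1], size=500) * 100     # e.g. percentages of 3 components

# Means and standard deviations computed in clr space are free of the spurious
# correlation that closure induces on the raw percentages.
z = clr(parts)
print("clr means:", z.mean(axis=0).round(3))
print("clr standard deviations:", z.std(axis=0).round(3))
```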

Journal ArticleDOI
TL;DR: The simple computations and the intuitive graphic representation of the analysis are illustrated by the analysis of diverse examples from the current literature.
Abstract: Null hypotheses are simple, precise, and theoretically important. Conventional statistical analysis cannot support them; Bayesian analysis can. The challenge in a Bayesian analysis is to formulate a suitably vague alternative, because the vaguer the alternative is (the more it spreads out the unit mass of prior probability), the more the null is favored. A general solution is a sensitivity analysis: Compute the odds for or against the null as a function of the limit(s) on the vagueness of the alternative. If the odds on the null approach 1 from above as the hypothesized maximum size of the possible effect approaches 0, then the data favor the null over any vaguer alternative to it. The simple computations and the intuitive graphic representation of the analysis are illustrated by the analysis of diverse examples from the current literature. They pose 3 common experimental questions: (a) Are 2 means the same? (b) Is performance at chance? (c) Are factors additive?
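
A minimal sketch of the sensitivity analysis described above for question (b), "Is performance at chance?": the odds in favour of the null are computed as a function of the hypothesized maximum possible effect, with a uniform prior spreading the alternative over that range; the counts are invented.

```python
# Sketch: Bayes-factor sensitivity analysis for "is performance at chance?".
# Null: success probability p = 0.5. Alternative: p uniform on (0.5, 0.5 + delta_max),
# where delta_max is the hypothesized maximum possible effect.
import numpy as np
from scipy import stats, integrate

k, n = 27, 50                                            # 27 successes in 50 trials (invented)

def bf_null_vs_alternative(delta_max):
    null_lik = stats.binom.pmf(k, n, 0.5)
    # marginal likelihood under the alternative: average over p ~ Uniform(0.5, 0.5 + delta_max)
    alt_lik, _ = integrate.quad(lambda p: stats.binom.pmf(k, n, p) / delta_max,
                                0.5, 0.5 + delta_max)
    return null_lik / alt_lik

# Odds for the null as a function of the limit on the vagueness of the alternative
for delta_max in [0.01, 0.05, 0.1, 0.2, 0.4, 0.5]:
    print(f"max effect {delta_max:.2f}: odds for the null = {bf_null_vs_alternative(delta_max):.2f}")
```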

Journal ArticleDOI
TL;DR: Simulation experiments are provided that show the benefits of the proposed cyclostationary approach compared to energy detection, the importance of collaboration among spatially displaced secondary users for overcoming shadowing and fading effects, as well as the reliable performance of the suggested algorithms even in very low signal-to-noise ratio (SNR) regimes and under strict communication rate constraints for collaboration overhead.
Abstract: This paper proposes an energy efficient collaborative cyclostationary spectrum sensing approach for cognitive radio systems. An existing statistical hypothesis test for the presence of cyclostationarity is extended to multiple cyclic frequencies and its asymptotic distributions are established. Collaborative test statistics are proposed for the fusion of local test statistics of the secondary users, and a censoring technique in which only informative test statistics are transmitted to the fusion center (FC) during the collaborative detection is further proposed for improving energy efficiency in mobile applications. Moreover, a technique for numerical approximation of the asymptotic distribution of the censored FC test statistic is proposed. The proposed tests are nonparametric in the sense that no assumptions on data or noise distributions are required. In addition, the tests allow dichotomizing between the desired signal and interference. Simulation experiments are provided that show the benefits of the proposed cyclostationary approach compared to energy detection, the importance of collaboration among spatially displaced secondary users for overcoming shadowing and fading effects, as well as the reliable performance of the proposed algorithms even in very low signal-to-noise ratio (SNR) regimes and under strict communication rate constraints for collaboration overhead.

Journal ArticleDOI
TL;DR: An asymptotic test procedure is proposed to assess the stability of volatilities and cross-volatilities of linear and nonlinear multivariate time series models.
Abstract: In this paper, we introduce an asymptotic test procedure to assess the stability of volatilities and cross-volatilities of linear and nonlinear multivariate time series models. The test is very flexible as it can be applied, for example, to many of the multivariate GARCH models established in the literature, and also works well in the case of high dimensionality of the underlying data. Since it is nonparametric, the procedure avoids the difficulties associated with parametric model selection, model fitting and parameter estimation. We provide the theoretical foundation for the test and demonstrate its applicability via a simulation study and an analysis of financial data. Extensions to multiple changes and the case of infinite fourth moments are also discussed.

Journal ArticleDOI
TL;DR: In this article, the authors proposed a test procedure that allows a break under both the null and alternative hypotheses and, when a break is present, the limit distribution of the test is the same as in the case of a known break date, thereby allowing increased power while maintaining the correct size.

Proceedings ArticleDOI
28 Jun 2009
TL;DR: This paper proposes a general framework for assessing predictive stream learning algorithms, and defends the use of Predictive Sequential methods for error estimation: the prequential error.
Abstract: Learning from data streams is a research area of increasing importance. Nowadays, several stream learning algorithms have been developed. Most of them learn decision models that continuously evolve over time, run in resource-aware environments, and detect and react to changes in the environment generating the data. One important issue, not yet conveniently addressed, is the design of experimental work to evaluate and compare decision models that evolve over time. There are no gold standards for assessing performance in non-stationary environments. This paper proposes a general framework for assessing predictive stream learning algorithms. We defend the use of Predictive Sequential methods for error estimation: the prequential error. The prequential error allows us to monitor the evolution of the performance of models that evolve over time. Nevertheless, it is known to be a pessimistic estimator in comparison to holdout estimates. To obtain more reliable estimators, we need some forgetting mechanism. Two viable alternatives are: sliding windows and fading factors. We observe that the prequential error converges to a holdout estimator when estimated over a sliding window or using fading factors. We present illustrative examples of the use of prequential error estimators, using fading factors, for the tasks of: i) assessing performance of a learning algorithm; ii) comparing learning algorithms; iii) hypothesis testing using the McNemar test; and iv) change detection using the Page-Hinkley test. In these tasks, the prequential error estimated using fading factors provides reliable estimates. In comparison to sliding windows, fading factors are faster and memory-less, a requirement for streaming applications. This paper is a contribution to the discussion of good practices for performance assessment when learning dynamic models that evolve over time.
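
A minimal sketch of the prequential error with a fading factor: every example is first used to test the current model and then to train it, and the faded error E_alpha = S_alpha / N_alpha tracks performance over a simulated stream with a concept change. The toy sliding-window majority-class "learner", the fading-factor value and the stream itself are invented.

```python
# Sketch: prequential (predictive sequential) error with a fading factor.
# Each example is first used to test the current model, then to update it.
import numpy as np
from collections import deque

rng = np.random.default_rng(10)
stream = np.concatenate([rng.binomial(1, 0.8, 3000),    # concept 1: label 1 dominates
                         rng.binomial(1, 0.2, 3000)])   # concept 2: label 0 dominates

alpha = 0.995                                           # fading factor
s = n_acc = 0.0                                         # faded loss sum S_alpha and faded count N_alpha
window = deque(maxlen=100)                              # state of the toy majority-class learner
faded_error = []

for y in stream:
    y_hat = int(sum(window) * 2 >= len(window)) if window else 0   # test first...
    loss = int(y_hat != y)
    s = loss + alpha * s                                # S_alpha(t) = L(t) + alpha * S_alpha(t-1)
    n_acc = 1 + alpha * n_acc                           # N_alpha(t) = 1 + alpha * N_alpha(t-1)
    faded_error.append(s / n_acc)
    window.append(y)                                    # ...then train

print("faded error before the change :", round(faded_error[2999], 3))
print("faded error just after change :", round(faded_error[3100], 3))
print("faded error after adaptation  :", round(faded_error[-1], 3))
```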

Journal ArticleDOI
TL;DR: This paper presents a new algorithm for detection of the number of sources via a sequence of hypothesis tests, and theoretically analyzes the consistency and detection performance of the proposed algorithm, showing its superiority compared to the standard minimum description length (MDL)-based estimator.
Abstract: Detection of the number of signals embedded in noise is a fundamental problem in signal and array processing. This paper focuses on the non-parametric setting where no knowledge of the array manifold is assumed. First, we present a detailed statistical analysis of this problem, including an analysis of the signal strength required for detection with high probability, and the form of the optimal detection test under certain conditions where such a test exists. Second, combining this analysis with recent results from random matrix theory, we present a new algorithm for detection of the number of sources via a sequence of hypothesis tests. We theoretically analyze the consistency and detection performance of the proposed algorithm, showing its superiority compared to the standard minimum description length (MDL)-based estimator. A series of simulations confirm our theoretical analysis.
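
The paper benchmarks its test against the classical minimum description length (MDL) estimator of the number of sources; the sketch below implements that MDL baseline (not the paper's new random-matrix-theory test) on a simulated uniform linear array.

```python
# Sketch: the classical MDL estimator of the number of sources (the baseline referenced
# in the abstract), applied to simulated array data: p sensors, N snapshots, q sources.
import numpy as np

def mdl_num_sources(X):
    """X: complex data matrix of shape (p sensors, N snapshots). Returns estimated k."""
    p, N = X.shape
    eig = np.sort(np.linalg.eigvalsh(X @ X.conj().T / N))[::-1]   # sample covariance eigenvalues
    scores = []
    for k in range(p):
        tail = eig[k:]                                            # hypothesized noise eigenvalues
        geo = np.exp(np.mean(np.log(tail)))                       # geometric mean
        arith = np.mean(tail)                                     # arithmetic mean
        scores.append(-N * (p - k) * np.log(geo / arith)
                      + 0.5 * k * (2 * p - k) * np.log(N))
    return int(np.argmin(scores))

rng = np.random.default_rng(11)
p, N, q = 8, 500, 2
angles = np.deg2rad([10, 40])
steering = np.exp(1j * np.pi * np.outer(np.arange(p), np.sin(angles)))   # ULA steering vectors
signals = rng.normal(size=(q, N)) + 1j * rng.normal(size=(q, N))         # source waveforms
noise = 0.5 * (rng.normal(size=(p, N)) + 1j * rng.normal(size=(p, N)))
X = steering @ signals + noise

print("estimated number of sources:", mdl_num_sources(X))                # expect 2
```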

Journal ArticleDOI
TL;DR: This study compares recently developed Meff methods and validates them by the permutation test with 10,000 random shuffles using two real GWAS data sets, showing that the simpleM method produces the best approximation of the permutation threshold, and it does so in the shortest amount of time.
Abstract: A major challenge in genome-wide association studies (GWASs) is to derive the multiple testing threshold when hypothesis tests are conducted using a large number of single nucleotide polymorphisms. Permutation tests are considered the gold standard in multiple testing adjustment in genetic association studies. However, permutation testing is computationally intensive, especially for GWASs, and can be impractical if a large number of random shuffles are used to ensure accuracy. Many researchers have developed approximation algorithms to relieve the computing burden imposed by permutation. One particularly attractive alternative to permutation is to calculate the effective number of independent tests, Meff, which has been shown to be promising in genetic association studies. In this study, we compare recently developed Meff methods and validate them by the permutation test with 10,000 random shuffles using two real GWAS data sets: an Illumina 1M BeadChip and an Affymetrix GeneChip Human Mapping 500K Array Set. Our results show that the simpleM method produces the best approximation of the permutation threshold, and it does so in the shortest amount of time. We also show that Meff is indeed valid on a genome-wide scale in these data sets based on statistical theory and significance tests. The significance thresholds derived can provide practical guidelines for other studies using similar population samples and genotyping platforms.
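
A minimal sketch of the simpleM idea referenced above: estimate Meff as the number of principal components of the SNP correlation matrix needed to explain a fixed proportion of the variance (99.5% is used here, the value commonly associated with simpleM), then Bonferroni-correct with Meff. The genotype block is simulated.

```python
# Sketch: the simpleM idea. Estimate the effective number of independent tests (Meff)
# from the eigenvalues of the SNP correlation matrix, then use 0.05 / Meff instead of
# 0.05 / (number of SNPs) as the significance threshold.
import numpy as np

rng = np.random.default_rng(12)
n_individuals, n_snps = 500, 100
latent = rng.normal(size=(n_individuals, 10))                 # 10 underlying haplotype blocks
loadings = rng.normal(size=(10, n_snps))
genotypes = latent @ loadings + 2.0 * rng.normal(size=(n_individuals, n_snps))

corr = np.corrcoef(genotypes, rowvar=False)                   # SNP x SNP correlation matrix
eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]
explained = np.cumsum(eigvals) / np.sum(eigvals)
meff = int(np.searchsorted(explained, 0.995) + 1)             # components explaining 99.5%

print(f"SNPs: {n_snps}, Meff: {meff}")
print(f"Bonferroni threshold with Meff: {0.05 / meff:.2e} "
      f"(vs. {0.05 / n_snps:.2e} using all SNPs)")
```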

Journal ArticleDOI
TL;DR: Results suggest that some of the most commonly applied techniques in landscape genetics have high type-1 error rates, and that multivariate, non-linear methods are better suited for landscape genetic data analysis.
Abstract: The goal of landscape genetics is to detect and explain landscape effects on genetic diversity and structure. Despite the increasing popularity of landscape genetic approaches, the statistical methods for linking genetic and landscape data remain largely untested. This lack of method evaluation makes it difficult to compare studies utilizing different statistics, and compromises the future development and application of the field. To investigate the suitability and comparability of various statistical approaches used in landscape genetics, we simulated data sets corresponding to five landscape-genetic scenarios. We then analyzed these data with eleven methods, and compared the methods based on their statistical power, type-1 error rates, and their overall ability to lead researchers to accurate conclusions about landscape-genetic relationships. Results suggest that some of the most commonly applied techniques (e.g. Mantel and partial Mantel tests) have high type-1 error rates, and that multivariate, non-linear methods are better suited for landscape genetic data analysis. Furthermore, different methods generally show only moderate levels of agreement. Thus, analyzing a data set with only one method could yield method-dependent results, potentially leading to erroneous conclusions. Based on these findings, we give recommendations for choosing optimal combinations of statistical methods, and identify future research needs for landscape genetic data analyses.
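
For reference, a minimal sketch of one of the evaluated methods, the simple Mantel test: the correlation between a genetic and a landscape distance matrix is assessed by permuting the rows and columns of one matrix. All distances here are simulated; in a real study they would come from genetic data and landscape resistance surfaces.

```python
# Sketch: simple Mantel test relating a genetic distance matrix to a landscape distance
# matrix via a permutation test on the matrix correlation.
import numpy as np
from scipy.spatial.distance import squareform

def mantel(d1, d2, n_perm=999, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    v1, v2 = squareform(d1), squareform(d2)              # condensed upper-triangle vectors
    r_obs = np.corrcoef(v1, v2)[0, 1]
    count = 0
    n = d1.shape[0]
    for _ in range(n_perm):
        perm = rng.permutation(n)                        # permute rows/columns of one matrix
        count += np.corrcoef(squareform(d1[np.ix_(perm, perm)]), v2)[0, 1] >= r_obs
    return r_obs, (count + 1) / (n_perm + 1)

rng = np.random.default_rng(13)
n_pop = 20
coords = rng.uniform(0, 100, size=(n_pop, 2))
landscape_dist = np.sqrt(((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1))

raw = 0.01 * landscape_dist + rng.normal(scale=0.2, size=(n_pop, n_pop))
raw = (raw + raw.T) / 2                                   # symmetrize the noisy "genetic" distances
np.fill_diagonal(raw, 0)
genetic_dist = np.abs(raw)

r, p = mantel(genetic_dist, landscape_dist, rng=rng)
print(f"Mantel r = {r:.3f}, one-sided p = {p:.3f}")
```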

Journal ArticleDOI
TL;DR: The present paper discusses the methods of working up a good hypothesis and statistical concepts of hypothesis testing.
Abstract: Hypothesis testing is an important activity of empirical research and evidence-based medicine. A well worked up hypothesis is half the answer to the research question. For this, both knowledge of the subject derived from extensive review of the literature and working knowledge of basic statistical concepts are desirable. The present paper discusses the methods of working up a good hypothesis and statistical concepts of hypothesis testing.

Journal ArticleDOI
TL;DR: The application of Kullback–Leibler-distance based model selection to real data using the model generating data set for the Abrahamson and Silva (1997) ground-motion model demonstrates the superior performance of the information-theoretic perspective in comparison to earlier attempts at data-driven model selection.
Abstract: Although the methodological framework of probabilistic seismic hazard analysis is well established, the selection of models to predict the ground motion at the sites of interest remains a major challenge. Information theory provides a powerful theoretical framework that can guide this selection process in a consistent way. From an information-theoretic perspective, the appropriateness of models can be expressed in terms of their relative information loss (Kullback–Leibler distance) and hence in physically meaningful units (bits). In contrast to hypothesis testing, information-theoretic model selection does not require ad hoc decisions regarding significance levels nor does it require the models to be mutually exclusive and collectively exhaustive. The key ingredient, the Kullback–Leibler distance, can be estimated from the statistical expectation of log-likelihoods of observations for the models under consideration. In the present study, data-driven ground-motion model selection based on Kullback–Leibler-distance differences is illustrated for a set of simulated observations of response spectra and macroseismic intensities. Information theory allows for a unified treatment of both quantities. The application of Kullback–Leibler-distance based model selection to real data using the model generating data set for the Abrahamson and Silva (1997) ground-motion model demonstrates the superior performance of the information-theoretic perspective in comparison to earlier attempts at data-driven model selection (e.g., Scherbaum et al., 2004).
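
A minimal sketch of the key computation: the Kullback–Leibler ranking of candidate models can be estimated (up to a model-independent constant) from the average sample log-likelihood of the observations under each model, expressed in bits by using base-2 logarithms. The two Gaussian "ground-motion models" and the observations below are invented stand-ins.

```python
# Sketch: ranking two candidate ground-motion models by their average sample
# log-likelihood in bits per observation, which estimates their relative
# Kullback-Leibler distance to the data-generating process up to a common constant.
import numpy as np
from scipy import stats

rng = np.random.default_rng(14)
# "Observed" log spectral accelerations (invented)
observations = rng.normal(loc=-1.0, scale=0.6, size=300)

# Two candidate models: each predicts a median and an aleatory standard deviation
models = {
    "model A": {"mean": -1.05, "sigma": 0.65},   # close to the generating process
    "model B": {"mean": -0.60, "sigma": 0.80},   # biased and over-dispersed
}

for name, m in models.items():
    # average log2-likelihood per observation (larger value = smaller information loss)
    ll_bits = np.mean(stats.norm.logpdf(observations, m["mean"], m["sigma"])) / np.log(2)
    print(f"{name}: average log-likelihood = {ll_bits:.3f} bits/observation")
```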

BookDOI
17 Apr 2009
TL;DR: This book offers a guided tour of decision theory, from its foundations in coherence and utility, through statistical decision theory, to optimal design, including decision-theoretic approaches to sample size.
Abstract (table of contents): Preface. Acknowledgments.
1 Introduction: 1.1 Controversies. 1.2 A guided tour of decision theory.
Part One: Foundations.
2 Coherence: 2.1 The "Dutch Book" theorem. 2.2 Temporal coherence. 2.3 Scoring rules and the axioms of probabilities. 2.4 Exercises.
3 Utility: 3.1 St. Petersburg paradox. 3.2 Expected utility theory and the theory of means. 3.3 The expected utility principle. 3.4 The von Neumann-Morgenstern representation theorem. 3.5 Allais' criticism. 3.6 Extensions. 3.7 Exercises.
4 Utility in action: 4.1 The "standard gamble". 4.2 Utility of money. 4.3 Utility functions for medical decisions. 4.4 Exercises.
5 Ramsey and Savage: 5.1 Ramsey's theory. 5.2 Savage's theory. 5.3 Allais revisited. 5.4 Ellsberg paradox. 5.5 Exercises.
6 State independence: 6.1 Horse lotteries. 6.2 State-dependent utilities. 6.3 State-independent utilities. 6.4 Anscombe-Aumann representation theorem. 6.5 Exercises.
Part Two: Statistical Decision Theory.
7 Decision functions: 7.1 Basic concepts. 7.2 Data-based decisions. 7.3 The travel insurance example. 7.4 Randomized decision rules. 7.5 Classification and hypothesis tests. 7.6 Estimation. 7.7 Minimax-Bayes connections. 7.8 Exercises.
8 Admissibility: 8.1 Admissibility and completeness. 8.2 Admissibility and minimax. 8.3 Admissibility and Bayes. 8.4 Complete classes. 8.5 Using the same alpha level across studies with different sample sizes is inadmissible. 8.6 Exercises.
9 Shrinkage: 9.1 The Stein effect. 9.2 Geometric and empirical Bayes heuristics. 9.3 General shrinkage functions. 9.4 Shrinkage with different likelihood and losses. 9.5 Exercises.
10 Scoring rules: 10.1 Betting and forecasting. 10.2 Scoring rules. 10.3 Local scoring rules. 10.4 Calibration and refinement. 10.5 Exercises.
11 Choosing models: 11.1 The "true model" perspective. 11.2 Model elaborations. 11.3 Exercises.
Part Three: Optimal Design.
12 Dynamic programming: 12.1 History. 12.2 The travel insurance example revisited. 12.3 Dynamic programming. 12.4 Trading off immediate gains and information. 12.5 Sequential clinical trials. 12.6 Variable selection in multiple regression. 12.7 Computing. 12.8 Exercises.
13 Changes in utility as information: 13.1 Measuring the value of information. 13.2 Examples. 13.3 Lindley information. 13.4 Minimax and the value of information. 13.5 Exercises.
14 Sample size: 14.1 Decision-theoretic approaches to sample size. 14.2 Computing. 14.3 Examples. 14.4 Exercises.
15 Stopping: 15.1 Historical note. 15.2 A motivating example. 15.3 Bayesian optimal stopping. 15.4 Examples. 15.5 Sequential sampling to reduce uncertainty. 15.6 The stopping rule principle. 15.7 Exercises.
Appendix: A.1 Notation. A.2 Relations. A.3 Probability (density) functions of some distributions. A.4 Conjugate updating.
References. Index.