
Showing papers on "Statistical hypothesis testing" published in 2002


Book
21 Mar 2002
TL;DR: An essential textbook for any student or researcher in biology needing to design experiments, sampling programs or analyse the resulting data, covering both classical and Bayesian philosophies before advancing to the analysis of linear and generalized linear models. Topics covered include linear and logistic regression, simple and complex ANOVA models (for factorial, nested, block, split-plot and repeated measures and covariance designs), and log-linear models. Multivariate techniques, including classification and ordination, are then introduced.
Abstract: An essential textbook for any student or researcher in biology needing to design experiments, sampling programs or analyse the resulting data. The text begins with a revision of estimation and hypothesis testing methods, covering both classical and Bayesian philosophies, before advancing to the analysis of linear and generalized linear models. Topics covered include linear and logistic regression, simple and complex ANOVA models (for factorial, nested, block, split-plot and repeated measures and covariance designs), and log-linear models. Multivariate techniques, including classification and ordination, are then introduced. Special emphasis is placed on checking assumptions, exploratory data analysis and presentation of results. The main analyses are illustrated with many examples from published papers and there is an extensive reference list to both the statistical and biological literature. The book is supported by a website that provides all data sets, questions for each chapter and links to software.

9,509 citations


Journal ArticleDOI
TL;DR: The standard nonparametric randomization and permutation testing ideas are developed at an accessible level, using practical examples from functional neuroimaging, and the extensions for multiple comparisons described.
Abstract: Requiring only minimal assumptions for validity, nonparametric permutation testing provides a flexible and intuitive methodology for the statistical analysis of data from functional neuroimaging experiments, at some computational expense. Introduced into the functional neuroimaging literature by Holmes et al. ([1996]: J Cereb Blood Flow Metab 16:7-22), the permutation approach readily accounts for the multiple comparisons problem implicit in the standard voxel-by-voxel hypothesis testing framework. When the appropriate assumptions hold, the nonparametric permutation approach gives results similar to those obtained from a comparable Statistical Parametric Mapping approach using a general linear model with multiple comparisons corrections derived from random field theory. For analyses with low degrees of freedom, such as single subject PET/SPECT experiments or multi-subject PET/SPECT or fMRI designs assessed for population effects, the nonparametric approach employing a locally pooled (smoothed) variance estimate can outperform the comparable Statistical Parametric Mapping approach. Thus, these nonparametric techniques can be used to verify the validity of less computationally expensive parametric approaches. Although the theory and relative advantages of permutation approaches have been discussed by various authors, there has been no accessible explication of the method, and no freely distributed software implementing it. Consequently, there have been few practical applications of the technique. This article, and the accompanying MATLAB software, attempts to address these issues. The standard nonparametric randomization and permutation testing ideas are developed at an accessible level, using practical examples from functional neuroimaging, and the extensions for multiple comparisons described. Three worked examples from PET and fMRI are presented, with discussion, and comparisons with standard parametric approaches made where appropriate. Practical considerations are given throughout, and relevant statistical concepts are expounded in appendices.
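The following is a minimal sketch of the permutation idea described above, using sign-flipping of synthetic paired "voxel" differences and a maximum-statistic correction for multiple comparisons; the data, dimensions and thresholds are invented for illustration, and this is not the accompanying MATLAB software.

```python
# Permutation test with a maximum-statistic family-wise error correction,
# in the spirit of the nonparametric approach described above.
# Synthetic data stand in for voxel values.
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_voxels = 12, 500
# Paired differences (e.g., activation minus rest) per subject and voxel.
diff = rng.normal(0.0, 1.0, size=(n_subjects, n_voxels))
diff[:, :20] += 1.0                       # a few truly active voxels (assumed)

def t_stat(d):
    return d.mean(axis=0) / (d.std(axis=0, ddof=1) / np.sqrt(d.shape[0]))

t_obs = t_stat(diff)

# Under H0 each subject's difference has an exchangeable sign, so we randomly
# flip signs and record the maximum statistic over voxels.
n_perm = 5000
max_null = np.empty(n_perm)
for i in range(n_perm):
    signs = rng.choice([-1.0, 1.0], size=(n_subjects, 1))
    max_null[i] = t_stat(signs * diff).max()

# Corrected p-value per voxel: proportion of permutations whose maximum
# statistic reaches the observed value (controls family-wise error).
p_corrected = (1 + (max_null[:, None] >= t_obs[None, :]).sum(axis=0)) / (n_perm + 1)
print("voxels significant at FWE 0.05:", int((p_corrected < 0.05).sum()))
```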

5,777 citations


Journal ArticleDOI
TL;DR: This paper introduces to the neuroscience literature statistical procedures for controlling the false discovery rate (FDR) and demonstrates this approach using both simulations and functional magnetic resonance imaging data from two simple experiments.
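A minimal sketch of the Benjamini-Hochberg step-up procedure that underlies FDR control, applied to a made-up mixture of null and non-null p-values; the simulated p-values and the q level are illustrative assumptions, not values from the paper.

```python
# Benjamini-Hochberg step-up procedure for controlling the false discovery
# rate at level q over m p-values (e.g., one per voxel).
import numpy as np

def bh_threshold(pvals, q=0.05):
    """Return the BH p-value cutoff (0.0 if nothing is rejected)."""
    p = np.sort(np.asarray(pvals))
    m = p.size
    below = p <= q * np.arange(1, m + 1) / m      # p_(i) <= q * i / m
    if not below.any():
        return 0.0
    return p[np.nonzero(below)[0].max()]          # largest p_(i) meeting the bound

rng = np.random.default_rng(1)
pvals = np.concatenate([rng.uniform(size=900),          # true nulls
                        rng.beta(0.5, 20, size=100)])   # p-values from real effects
cut = bh_threshold(pvals, q=0.05)
print(f"BH cutoff = {cut:.4g}, rejections = {(pvals <= cut).sum()}")
```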

4,838 citations


Journal ArticleDOI
TL;DR: It is shown that the AU test is less biased than other methods in typical cases of tree selection, as well as in the analysis of mammalian mitochondrial protein sequences.
Abstract: An approximately unbiased (AU) test that uses a newly devised multiscale bootstrap technique was developed for general hypothesis testing of regions in an attempt to reduce test bias. It was applied to maximum-likelihood tree selection for obtaining the confidence set of trees. The AU test is based on the theory of Efron et al. (Proc. Natl. Acad. Sci. USA 93:13429-13434; 1996), but the new method provides higher-order accuracy yet simpler implementation. The AU test, like the Shimodaira-Hasegawa (SH) test, adjusts the selection bias overlooked in the standard use of the bootstrap probability and Kishino-Hasegawa tests. The selection bias comes from comparing many trees at the same time and often leads to overconfidence in the wrong trees. The SH test, though safe to use, may exhibit another type of bias such that it appears conservative. Here I show that the AU test is less biased than other methods in typical cases of tree selection. These points are illustrated in a simulation study as well as in the analysis of mammalian mitochondrial protein sequences. The theoretical argument provides a simple formula that covers the bootstrap probability test, the Kishino-Hasegawa test, the AU test, and the Zharkikh-Li test. A practical suggestion is provided as to which test should be used under particular circumstances.

2,452 citations


Book
04 Mar 2002
TL;DR: This chapter discusses statistical concepts in climate research, as well as time series and stochastic processes, and some of the techniques used to estimate covariance functions and spectra.
Abstract: 1. Introduction Part I. Fundamentals: 2. Probability theory 3. Distributions of climate variables 4. Concepts in statistical inference 5. Estimation Part II. Confirmation and Analysis: 6. The statistical test of a hypothesis 7. Analysis of atmospheric circulation problems Part III. Fitting Statistical Models: 8. Regression 9. Analysis of variance Part IV. Time Series: 10. Time series and stochastic processes 11. Parameters of univariate and bivariate time series 12. Estimating covariance functions and spectra Part V. Eigen Techniques: 13. Empirical orthogonal functions 14. Canonical correlation analysis 15. POP analysis 16. Complex eigentechniques Part VI. Other Topics: 17. Specific statistical concepts in climate research 18. Forecast quality evaluation Part VII. Appendices.

1,915 citations


Book
15 Apr 2002
TL;DR: A book-length treatment of subset selection in regression, covering least-squares computations, algorithms for finding subsets that fit well (forward selection, Efroymson's algorithm, backward elimination, sequential replacement, replacing two variables at a time, and exhaustive generation of subsets with branch-and-bound), hypothesis testing for whether one subset is better than another, stopping criteria, estimation of regression coefficients after selection, and Bayesian methods.
Abstract: OBJECTIVES: Prediction, Explanation, Elimination or What? How Many Variables in the Prediction Formula? Alternatives to Using Subsets. 'Black Box' Use of Best-Subsets Techniques. LEAST-SQUARES COMPUTATIONS: Using Sums of Squares and Products Matrices. Orthogonal Reduction Methods. Gauss-Jordan v. Orthogonal Reduction Methods. Interpretation of Projections. Appendix A: Operation Counts for All-Subsets Regression. FINDING SUBSETS WHICH FIT WELL: Objectives and Limitations of this Chapter. Forward Selection. Efroymson's Algorithm. Backward Elimination. Sequential Replacement Algorithm. Replacing Two Variables at a Time. Generating All Subsets. Using Branch-and-Bound Techniques. Grouping Variables. Ridge Regression and Other Alternatives. The Non-Negative Garrote and the Lasso. Some Examples. Conclusions and Recommendations. HYPOTHESIS TESTING: Is There any Information in the Remaining Variables? Is One Subset Better than Another? Appendix A: Spjøtvoll's Method - Detailed Description. WHEN TO STOP? What Criterion Should We Use? Prediction Criteria. Cross-Validation and the PRESS Statistic. Bootstrapping. Likelihood and Information-Based Stopping Rules. Appendix A: Approximate Equivalence of Stopping Rules. ESTIMATION OF REGRESSION COEFFICIENTS: Selection Bias. Choice Between Two Variables. Selection Bias in the General Case, and its Reduction. Conditional Likelihood Estimation. Estimation of Population Means. Estimating Least-Squares Projections. Appendix A: Changing Projections to Equate Sums of Squares. BAYESIAN METHODS: Bayesian Introduction. 'Spike and Slab' Prior. Normal Prior for Regression Coefficients. Model Averaging. Picking the Best Model. CONCLUSIONS AND SOME RECOMMENDATIONS. REFERENCES. INDEX.

1,722 citations


Book
08 Oct 2002
TL;DR: A book covering source coding, random number generation, channel coding, hypothesis testing, rate-distortion theory, identification codes and channel resolvability, and multi-terminal information theory.
Abstract: 1 Source Coding.- 2 Random Number Generation.- 3 Channel Coding.- 4 Hypothesis Testing.- 5 Rate-Distortion Theory.- 6 Identification Code and Channel Resolvability.- 7 Multi-Terminal Information Theory.- References.

837 citations


Journal ArticleDOI
TL;DR: This paper characterizes an important class of problems in which the LRT and the F-test fail, illustrates this nonstandard behavior, and briefly sketches several possible acceptable alternatives, focusing on Bayesian posterior predictive probability values.
Abstract: The likelihood ratio test (LRT) and the related F-test, popularized in astrophysics by Eadie and coworkers in 1971, Bevington in 1969, Lampton, Margon, & Bowyer, in 1976, Cash in 1979, and Avni in 1978, do not (even asymptotically) adhere to their nominal χ2 and F-distributions in many statistical tests common in astrophysics, thereby casting many marginal line or source detections and nondetections into doubt. Although the above authors illustrate the many legitimate uses of these statistics, in some important cases it can be impossible to compute the correct false positive rate. For example, it has become common practice to use the LRT or the F-test to detect a line in a spectral model or a source above background despite the lack of certain required regularity conditions. (These applications were not originally suggested by Cash or by Bevington.) In these and other settings that involve testing a hypothesis that is on the boundary of the parameter space, contrary to common practice, the nominal χ2 distribution for the LRT or the F-distribution for the F-test should not be used. In this paper, we characterize an important class of problems in which the LRT and the F-test fail and illustrate this nonstandard behavior. We briefly sketch several possible acceptable alternatives, focusing on Bayesian posterior predictive probability values. We present this method in some detail since it is a simple, robust, and intuitive approach. This alternative method is illustrated using the gamma-ray burst of 1997 May 8 (GRB 970508) to investigate the presence of an Fe K emission line during the initial phase of the observation. There are many legitimate uses of the LRT and the F-test in astrophysics, and even when these tests are inappropriate, there remain several statistical alternatives (e.g., judicious use of error bars and Bayes factors). Nevertheless, there are numerous cases of the inappropriate use of the LRT and similar tests in the literature, bringing substantive scientific results into question.
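A small simulation, assuming a toy Gaussian model rather than a spectral model, illustrating the boundary problem described above: when the tested parameter is constrained to be non-negative (like a line strength), the true null distribution of the LRT is a 50:50 mixture of a point mass at zero and a chi-square with one degree of freedom, so the nominal chi-square reference misstates the false-positive rate.

```python
# Toy illustration of the boundary problem: test H0: mu = 0 versus H1: mu >= 0
# for a Gaussian mean with known unit variance (a stand-in for a non-negative
# line intensity). The nominal chi^2_1 reference misstates tail probabilities
# because the true null distribution of the LRT is 0.5*delta_0 + 0.5*chi^2_1.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, n_sim = 50, 20000
xbar = rng.normal(0.0, 1.0 / np.sqrt(n), size=n_sim)    # sample means under H0
mu_hat = np.maximum(xbar, 0.0)                           # MLE respects mu >= 0
lrt = n * (xbar**2 - (xbar - mu_hat)**2)                 # 2*(loglik(mu_hat) - loglik(0))

crit = stats.chi2.ppf(0.95, df=1)                        # nominal 5% cutoff
print("empirical tail prob at nominal 5% cutoff:", (lrt > crit).mean())
print("mixture prediction 0.5*P(chi2_1 > crit):", 0.5 * stats.chi2.sf(crit, df=1))
```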

730 citations


Journal ArticleDOI
TL;DR: In this paper, the authors compare the effect of spatial autocorrelation on the statistical tests commonly used by ecologists to analyse field survey data and find that a broad-scale spatial structure present in the data has the same effect on the tests as spatial autocorrelation.
Abstract: In ecological field surveys, observations are gathered at different spatial locations. The purpose may be to relate biological response variables (e.g., species abundances) to explanatory environmental variables (e.g., soil characteristics). In the absence of prior knowledge, ecologists have been taught to rely on systematic or random sampling designs. If there is prior knowledge about the spatial patterning of the explanatory variables, obtained from either previous surveys or a pilot study, can we use this information to optimize the sampling design in order to maximize our ability to detect the relationships between the response and explanatory variables? The specific questions addressed in this paper are: a) What is the effect (type I error) of spatial autocorrelation on the statistical tests commonly used by ecologists to analyse field survey data? b) Can we eliminate, or at least minimize, the effect of spatial autocorrelation by the design of the survey? Are there designs that provide greater power for surveys, at least under certain circumstances? c) Can we eliminate or control for the effect of spatial autocorrelation during the analysis? To answer the last question, we compared regular regression analysis to a modified t-test developed by Dutilleul for correlation coefficients in the presence of spatial autocorrelation. Replicated surfaces (typically, 1000 of them) were simulated using different spatial parameters, and these surfaces were subjected to different sampling designs and methods of statistical analysis. The simulated surfaces may represent, for example, vegetation response to underlying environmental variation. This allowed us 1) to measure the frequency of type I error (the failure to reject the null hypothesis when in fact there is no effect of the environment on the response variable) and 2) to estimate the power of the different combinations of sampling designs and methods of statistical analysis (power is measured by the rate of rejection of the null hypothesis when an effect of the environment on the response variable has been created). Our results indicate that: 1) Spatial autocorrelation in both the response and environmental variables affects the classical tests of significance of correlation or regression coefficients. Spatial autocorrelation in only one of the two variables does not affect the test of significance. 2) A broad-scale spatial structure present in data has the same effect on the tests as spatial autocorrelation. When such a structure is present in one of the variables and autocorrelation is found in the other, or in both, the tests of significance have inflated rates of type I error. 3) Dutilleul's modified t-test for the correlation coefficient, corrected for spatial autocorrelation, effectively corrects for spatial autocorrelation in the data. It also effectively corrects for the presence of deterministic structures, with or without spatial autocorrelation. The presence of a broad-scale deterministic structure may, in some cases, reduce the power of the modified t-test.
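A small Monte Carlo in the spirit of result (1) above, using a one-dimensional AR(1) transect as a stand-in for the paper's simulated surfaces (the autocorrelation parameter and sample sizes are illustrative assumptions): when both variables are autocorrelated but truly unrelated, the classical test of the Pearson correlation rejects far too often.

```python
# When the response and the environmental variable are BOTH spatially
# autocorrelated but truly unrelated, the classical test of the Pearson
# correlation has an inflated type I error rate.
import numpy as np
from scipy import stats

def ar1(n, phi, rng):
    """Stationary AR(1) series of length n with coefficient phi."""
    x = np.empty(n)
    x[0] = rng.normal()
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal() * np.sqrt(1 - phi**2)
    return x

rng = np.random.default_rng(3)
n, n_sim, phi = 100, 2000, 0.8
rejections = 0
for _ in range(n_sim):
    y = ar1(n, phi, rng)      # "response", independent of env by construction
    env = ar1(n, phi, rng)    # "environment"
    _, p = stats.pearsonr(y, env)
    rejections += p < 0.05

print("empirical type I error:", rejections / n_sim)   # well above the nominal 0.05
```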

666 citations


Journal ArticleDOI
TL;DR: In this article, the authors investigate the operating characteristics of the Benjamini-Hochberg false discovery rate procedure for multiple testing, which is a distribution-free method that controls the expected fraction of falsely rejected null hypotheses among those rejected.
Abstract: Summary. We investigate the operating characteristics of the Benjamini–Hochberg false discovery rate procedure for multiple testing. This is a distribution-free method that controls the expected fraction of falsely rejected null hypotheses among those rejected. The paper provides a framework for understanding more about this procedure. We first study the asymptotic properties of the ‘deciding point’ D that determines the critical p-value. From this, we obtain explicit asymptotic expressions for a particular risk function. We introduce the dual notion of false non-rejections and we consider a risk function that combines the false discovery rate and false non-rejections. We also consider the optimal procedure with respect to a measure of conditional risk.

603 citations


Journal ArticleDOI
TL;DR: A new procedure is presented that takes the product of only those P-values less than some specified cut-off value and evaluates the probability of such a product, or a smaller value, under the overall hypothesis that all L hypotheses are true.
Abstract: We present a new procedure for combining P-values from a set of L hypothesis tests. Our procedure is to take the product of only those P-values less than some specified cut-off value and to evaluate the probability of such a product, or a smaller value, under the overall hypothesis that all L hypotheses are true. We give an explicit formulation for this P-value, and find by simulation that it can provide high power for detecting departures from the overall hypothesis. We extend the procedure to situations when tests are not independent. We present both real and simulated examples where the method is especially useful. These include exploratory analyses when L is large, such as genome-wide scans for marker-trait associations and meta-analytic applications that combine information from published studies, with potential for dealing with the "publication bias" phenomenon. Once the overall hypothesis is rejected, an adjustment procedure with strong family-wise error protection is available for smaller subsets of hypotheses, down to the individual tests.
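A sketch of the truncated-product idea described above. The paper derives an explicit formula and an extension to dependent tests; this illustration instead evaluates the null distribution by Monte Carlo under independence, with an arbitrary cut-off of 0.05.

```python
# Truncated product of p-values: combine only those p-values below a cutoff
# tau by taking their product, then evaluate the probability of so small a
# product under the overall null. The null distribution here is obtained by
# Monte Carlo under independence, which is only an illustrative stand-in for
# the paper's closed-form expression.
import numpy as np

def truncated_product(pvals, tau=0.05):
    p = np.asarray(pvals)
    keep = p[p <= tau]
    return np.prod(keep) if keep.size else 1.0

def tpm_pvalue(pvals, tau=0.05, n_mc=20000, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    L = len(pvals)
    w_obs = truncated_product(pvals, tau)
    w_null = np.array([truncated_product(rng.uniform(size=L), tau)
                       for _ in range(n_mc)])
    return (1 + (w_null <= w_obs).sum()) / (n_mc + 1)

rng = np.random.default_rng(4)
pvals = np.concatenate([rng.uniform(size=45), [0.01, 0.004, 0.03, 0.02, 0.0005]])
print("combined p-value:", tpm_pvalue(pvals, tau=0.05, rng=rng))
```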

Journal ArticleDOI
TL;DR: In this paper, a global smoothing procedure is developed using basis function approximations for estimating the parameters of a varying-coefficient model with repeated measurements, which applies whether or not the covariates are time invariant and does not require binning of the data when observations are sparse at distinct observation times.
Abstract: A global smoothing procedure is developed using basis function approximations for estimating the parameters of a varying-coefficient model with repeated measurements. Inference procedures based on a resampling subject bootstrap are proposed to construct confidence regions and to perform hypothesis testing. Conditional biases and variances of our estimators and their asymptotic consistency are developed explicitly. Finite sample properties of our procedures are investigated through a simulation study. Application of the proposed approach is demonstrated through an example in epidemiology. In contrast to the existing methods, this approach applies whether or not the covariates are time-invariant and does not require binning of the data when observations are sparse at distinct observation times.
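A minimal sketch of the basis-approximation step, assuming a simulated design and a simple polynomial basis (both placeholders): the time-varying coefficient in y = beta(t) x + error is expanded in basis functions and estimated by least squares; the resampling-subject bootstrap proposed above would then be wrapped around this fit for confidence regions.

```python
# Approximate the time-varying coefficient beta(t) in y_ij = beta(t_ij)*x_ij + e_ij
# by a low-order polynomial basis and estimate it by ordinary least squares.
# A resampling-subject bootstrap (resample whole subjects, refit) would then
# give pointwise confidence bands. The simulated design and basis order are
# illustrative assumptions.
import numpy as np

rng = np.random.default_rng(5)
n_subj, n_obs = 60, 8
t = rng.uniform(0, 1, size=(n_subj, n_obs))           # irregular observation times
x = rng.normal(size=(n_subj, n_obs))                   # time-varying covariate
beta_true = lambda s: np.sin(2 * np.pi * s)            # the unknown beta(t)
y = beta_true(t) * x + rng.normal(scale=0.5, size=t.shape)

# Basis expansion: beta(t) ~ sum_k c_k * t^k, so y ~ sum_k c_k * (t^k * x).
deg = 4
T, X, Y = t.ravel(), x.ravel(), y.ravel()
design = np.column_stack([T**k * X for k in range(deg + 1)])
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)

grid = np.linspace(0, 1, 5)
beta_hat = sum(c * grid**k for k, c in enumerate(coef))
print(np.round(np.c_[grid, beta_true(grid), beta_hat], 2))   # t, true, estimate
```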

Journal ArticleDOI
TL;DR: In this article, a generalization of the variance ratio statistic is suggested, which can be used to test the cointegration rank in the spirit of Johansen (J. Econ. Dyn. Econom. 15 (1992) 159) but assumes nonstationarity under the null hypothesis.

Journal ArticleDOI
TL;DR: In this paper, a general framework for identification, estimation, and hypothesis testing in cointegrated systems when the cointegrating coefficients are subject to (possibly) non-linear and cross-equation restrictions, obtained from economic theory or other relevant a priori information is developed.
Abstract: The paper develops a general framework for identification, estimation, and hypothesis testing in cointegrated systems when the cointegrating coefficients are subject to (possibly) non-linear and cross-equation restrictions, obtained from economic theory or other relevant a priori information. It provides a proof of the consistency of the quasi maximum likelihood estimators (QMLE), establishes the relative rates of convergence of the QMLE of the short-run and the long-run parameters, and derives their asymptotic distributions; thus generalizing the results already available in the literature for the linear case. The paper also develops tests of the over-identifying (possibly) non-linear restrictions on the cointegrating vectors. The estimation and hypothesis testing procedures are applied to an Almost Ideal Demand System estimated on U.K. quarterly observations. Unlike many other studies of consumer demand this application does not treat relative prices and real per capita expenditures as exogenously given.

Book ChapterDOI
27 Jul 2002
TL;DR: A model-independent procedure for verifying properties of discrete event systems, based on Monte Carlo simulation and statistical hypothesis testing, that is probabilistic in two senses and can be carried out in an anytime manner.
Abstract: We propose a model independent procedure for verifying properties of discrete event systems. The dynamics of such systems can be very complex, making them hard to analyze, so we resort to methods based on Monte Carlo simulation and statistical hypothesis testing. The verification is probabilistic in two senses. First, the properties, expressed as CSL formulas, can be probabilistic. Second, the result of the verification is probabilistic, and the probability of error is bounded by two parameters passed to the verification procedure. The verification of properties can be carried out in an anytime manner by starting off with loose error bounds, and gradually tightening these bounds.
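A sketch of one common realization of this kind of simulation-based, error-bounded verification: Wald's sequential probability ratio test with an indifference region around the probability threshold. The toy model, parameter names and indifference width are assumptions, not necessarily the exact procedure of the paper.

```python
# Sequential probability ratio test for "the property holds with probability
# at least theta", with an indifference region (theta - delta, theta + delta)
# and error bounds alpha, beta. Each call to run_satisfies() simulates one run
# of the system and reports whether the property held on that run.
import math
import random

def sprt_verify(run_satisfies, theta, delta=0.01, alpha=0.01, beta=0.01,
                max_runs=1_000_000):
    """Decide between H0: p >= theta + delta and H1: p <= theta - delta.

    Returns (decision, runs used); True means the property probability is
    accepted to be at least theta (up to the indifference region)."""
    p0, p1 = theta + delta, theta - delta
    accept_h0 = math.log(beta / (1 - alpha))      # lower boundary on the LLR
    accept_h1 = math.log((1 - beta) / alpha)      # upper boundary on the LLR
    llr, n = 0.0, 0
    while n < max_runs:
        n += 1
        sat = run_satisfies()                     # simulate one run
        llr += math.log((p1 if sat else 1 - p1) / (p0 if sat else 1 - p0))
        if llr <= accept_h0:
            return True, n
        if llr >= accept_h1:
            return False, n
    return None, n                                # undecided within the budget

# Toy stand-in for a discrete event system: each simulated run satisfies the
# property with probability 0.92; we ask whether that probability is >= 0.9.
random.seed(6)
print(sprt_verify(lambda: random.random() < 0.92, theta=0.9))
```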

Journal ArticleDOI
TL;DR: The proposed method, based on the EM algorithm, is shown to be superior to the standard one for a priori probability estimation, and the classifier with adjusted outputs always performs better than the original one in terms of classification accuracy when the a priori probability conditions differ from the training set to the real-world data.
Abstract: It sometimes happens (for instance in case control studies) that a classifier is trained on a data set that does not reflect the true a priori probabilities of the target classes on real-world data. This may have a negative effect on the classification accuracy obtained on the real-world data set, especially when the classifier's decisions are based on the a posteriori probabilities of class membership. Indeed, in this case, the trained classifier provides estimates of the a posteriori probabilities that are not valid for this real-world data set (they rely on the a priori probabilities of the training set). Applying the classifier as is (without correcting its outputs with respect to these new conditions) on this new data set may thus be suboptimal. In this note, we present a simple iterative procedure for adjusting the outputs of the trained classifier with respect to these new a priori probabilities without having to refit the model, even when these probabilities are not known in advance. As a by-product, estimates of the new a priori probabilities are also obtained. This iterative algorithm is a straightforward instance of the expectation-maximization (EM) algorithm and is shown to maximize the likelihood of the new data. Thereafter, we discuss a statistical test that can be applied to decide if the a priori class probabilities have changed from the training set to the real-world data. The procedure is illustrated on different classification problems involving a multilayer neural network, and comparisons with a standard procedure for a priori probability estimation are provided. Our original method, based on the EM algorithm, is shown to be superior to the standard one for a priori probability estimation. Experimental results also indicate that the classifier with adjusted outputs always performs better than the original one in terms of classification accuracy, when the a priori probability conditions differ from the training set to the real-world data. The gain in classification accuracy can be significant.
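A sketch of the iterative prior-adjustment idea described above: posteriors produced under the training priors are re-weighted for an unlabeled target set and the target priors are re-estimated until convergence. The toy "classifier" posteriors below are fabricated stand-ins for a trained network's outputs.

```python
# Iterative EM-style adjustment of class priors: given posterior estimates
# produced by a classifier trained under priors pi_train, re-weight them for
# an unlabeled target set and re-estimate the target priors until convergence.
import numpy as np

def adjust_priors(post_train, pi_train, n_iter=100, tol=1e-8):
    """post_train: (n_samples, n_classes) posteriors under the training priors."""
    pi_new = pi_train.copy()
    for _ in range(n_iter):
        # E-step: posteriors re-weighted under the current estimate of the new priors.
        w = post_train * (pi_new / pi_train)
        post_new = w / w.sum(axis=1, keepdims=True)
        # M-step: new priors are the average adjusted posterior per class.
        pi_next = post_new.mean(axis=0)
        if np.max(np.abs(pi_next - pi_new)) < tol:
            pi_new = pi_next
            break
        pi_new = pi_next
    return pi_new, post_new

rng = np.random.default_rng(7)
pi_train = np.array([0.5, 0.5])                      # balanced training set
true_labels = rng.random(2000) < 0.8                 # target data is 80/20 in reality
scores = np.where(true_labels, rng.beta(5, 2, 2000), rng.beta(2, 5, 2000))
post_train = np.column_stack([1 - scores, scores])   # toy "classifier" posteriors
pi_hat, _ = adjust_priors(post_train, pi_train)
print("estimated target priors:", np.round(pi_hat, 3))
```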

Journal ArticleDOI
TL;DR: In this article, the authors consider an approach to the Durbin problem involving a martingale transformation of the parametric empirical process suggested by Khmaladze (1981) and show that it can be adapted to a wide variety of inference problems involving quantile regression process.
Abstract: Tests based on the quantile regression process can be formulated like the classical Kolmogorov–Smirnov and Cramér–von Mises tests of goodness-of-fit employing the theory of Bessel processes as in Kiefer (1959). However, it is frequently desirable to formulate hypotheses involving unknown nuisance parameters, thereby jeopardizing the distribution free character of these tests. We characterize this situation as “the Durbin problem” since it was posed in Durbin (1973), for parametric empirical processes. In this paper we consider an approach to the Durbin problem involving a martingale transformation of the parametric empirical process suggested by Khmaladze (1981) and show that it can be adapted to a wide variety of inference problems involving the quantile regression process. In particular, we suggest new tests of the location shift and location–scale shift models that underlie much of classical econometric inference. The methods are illustrated with a reanalysis of data on unemployment durations from the Pennsylvania Reemployment Bonus Experiments. The Pennsylvania experiments, conducted in 1988–89, were designed to test the efficacy of cash bonuses paid for early reemployment in shortening the duration of insured unemployment spells.

Journal ArticleDOI
TL;DR: In this paper, a new specification test for IV estimators is developed, adopting a particular second-order approximation of Bekker; the test compares the forward (conventional) 2SLS estimator of the coefficient of the right-hand-side endogenous variable with the reverse 2SLS estimator of the same unknown parameter when the normalization is changed.
Abstract: We develop a new specification test for IV estimators adopting a particular second order approximation of Bekker. The new specification test compares the difference of the forward (conventional) 2SLS estimator of the coefficient of the right-hand side endogenous variable with the reverse 2SLS estimator of the same unknown parameter when the normalization is changed. Under the null hypothesis that conventional first order asymptotics provide a reliable guide to inference, the two estimates should be very similar. Our test sees whether the resulting difference in the two estimates satisfies the results of second order asymptotic theory. Essentially the same idea is applied to develop another new specification test using second-order unbiased estimators of the type first proposed by Nagar. If the forward and reverse Nagar-type estimators are not significantly different we recommend estimation by LIML, which we demonstrate is the optimal linear combination of the Nagar-type estimators (to second order). We also demonstrate the high degree of similarity for k-class estimators between the approach of Bekker and the Edgeworth expansion approach of Rothenberg. An empirical example and Monte Carlo evidence demonstrate the operation of the new specification test.

Book ChapterDOI
28 May 2002
TL;DR: This work addresses the problem of face recognition from a large set of images obtained over time - a task arising in many surveillance and authentication applications and proposes an information-theoretic algorithm that classifies sets of images using the relative entropy between the estimated density of the input set and that of stored collections of images for each class.
Abstract: We address the problem of face recognition from a large set of images obtained over time - a task arising in many surveillance and authentication applications. A set or a sequence of images provides information about the variability in the appearance of the face which can be used for more robust recognition. We discuss different approaches to the use of this information, and show that when cast as a statistical hypothesis testing problem, the classification task leads naturally to an information-theoretic algorithm that classifies sets of images using the relative entropy (Kullback-Leibler divergence) between the estimated density of the input set and that of stored collections of images for each class. We demonstrate the performance of the proposed algorithm on two medium-sized data sets of approximately frontal face images, and describe an application of the method as part of a view-independent recognition system.
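A sketch of the set-classification rule described above, under the simplifying assumption that each image set is summarized by a multivariate Gaussian fitted to feature vectors (feature extraction, dimensions and the Gaussian form are placeholders); relative entropy between Gaussians has a closed form, and a probe set is assigned to the closest stored class.

```python
# Classify a set of images by the relative entropy (Kullback-Leibler
# divergence) between the density estimated from the probe set and the stored
# density of each class, assuming Gaussian densities over feature vectors.
import numpy as np

def fit_gaussian(X, ridge=1e-3):
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + ridge * np.eye(X.shape[1])  # regularized
    return mu, cov

def kl_gauss(mu0, S0, mu1, S1):
    """KL( N(mu0,S0) || N(mu1,S1) ), closed form for multivariate Gaussians."""
    d = mu0.size
    S1_inv = np.linalg.inv(S1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(S1_inv @ S0) + diff @ S1_inv @ diff - d
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

rng = np.random.default_rng(8)
d = 5
# Stored galleries for two hypothetical identities (random features as stand-ins).
gallery = {"id_A": rng.normal(0.0, 1.0, (200, d)),
           "id_B": rng.normal(0.7, 1.0, (200, d))}
models = {k: fit_gaussian(v) for k, v in gallery.items()}

probe = rng.normal(0.7, 1.0, (60, d))                 # a new image set, really id_B
mu_p, S_p = fit_gaussian(probe)
scores = {k: kl_gauss(mu_p, S_p, mu, S) for k, (mu, S) in models.items()}
print(min(scores, key=scores.get))                    # smallest divergence wins
```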

Journal ArticleDOI
TL;DR: A unique combination of time series analysis, neural networks, and statistical inference techniques is developed for damage classification explicitly taking into account ambient variations of the system.
Abstract: Stated in its most basic form, the objective of damage diagnosis is to ascertain simply if damage is present or not based on measured dynamic characteristics of a system to be monitored. In reality...

Journal ArticleDOI
TL;DR: The results of Davies (1977, 1987) are extended to a linear model situation with unknown residual variance.
Abstract: The results of Davies (1977, 1987) are extended to a linear model situation with unknown residual variance.

Journal ArticleDOI
01 Sep 2002-Ecology
TL;DR: In this paper, the authors focus on two estimators of relative abundance, which assume that the probability that an individual is detected at least once in the survey is either equal or unequal for the two populations.
Abstract: Determination of the relative abundance of two populations, separated by time or space, is of interest in many ecological situations. We focus on two estimators of relative abundance, which assume that the probability that an individual is detected at least once in the survey is either equal or unequal for the two populations. We present three methods for incorporating the collected information into our inference. The first method, proposed previously, is a traditional hypothesis test for evidence that detection probabilities are unequal. However, we feel that, a priori, it is more likely that detection probabilities are actually different; hence, the burden of proof should be shifted, requiring evidence that detection probabilities are practically equivalent. The second method we present, equivalence testing, is one approach to doing so. Third, we suggest that model averaging could be used by combining the two estimators according to derived model weights. These differing approaches are applied to a mark–recapture experiment on Nuttall's cottontail rabbit (Sylvilagus nuttallii) conducted in central Oregon during 1974 and 1975, which has been previously analyzed by other authors.
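A sketch of the equivalence-testing idea (the second method above) using two one-sided tests for the difference of two detection probabilities with a normal approximation; the equivalence margin and the capture summaries are invented, and this is not the exact procedure applied to the cottontail data.

```python
# Two one-sided tests (TOST): instead of testing H0 "detection probabilities
# are equal", require evidence that they differ by less than a practical
# margin delta. The normal approximation, the margin, and the made-up capture
# counts are illustrative only.
import numpy as np
from scipy import stats

def tost_two_proportions(x1, n1, x2, n2, delta=0.10):
    p1, p2 = x1 / n1, x2 / n2
    se = np.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    # Two one-sided tests: H0a: diff <= -delta, H0b: diff >= +delta.
    p_lower = stats.norm.sf((diff + delta) / se)    # evidence that diff > -delta
    p_upper = stats.norm.cdf((diff - delta) / se)   # evidence that diff < +delta
    return max(p_lower, p_upper)                    # equivalence declared if small

# Hypothetical detection summaries for the two populations (detected at least once).
p_equiv = tost_two_proportions(x1=68, n1=120, x2=61, n2=110, delta=0.10)
print(f"TOST p-value for practical equivalence: {p_equiv:.3f}")
```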

Journal ArticleDOI
TL;DR: The results suggest that the SOWH test may accord overconfidence in the true topology when the null hypothesis is in fact correct, whereas the SH test is much more conservative, even under high substitution rates and branch length heterogeneity.
Abstract: Probabilistic tests of topology offer a powerful means of evaluating competing phylogenetic hypotheses. The performance of the nonparametric Shimodaira-Hasegawa (SH) test, the parametric Swofford-Olsen-Waddell-Hillis (SOWH) test, and Bayesian posterior probabilities were explored for five data sets for which all the phylogenetic relationships are known with a very high degree of certainty. These results are consistent with previous simulation studies that have indicated a tendency for the SOWH test to be prone to generating Type 1 errors because of model misspecification coupled with branch length heterogeneity. These results also suggest that the SOWH test may accord overconfidence in the true topology when the null hypothesis is in fact correct. In contrast, the SH test was observed to be much more conservative, even under high substitution rates and branch length heterogeneity. For some of those data sets where the SOWH test proved misleading, the Bayesian posterior probabilities were also misleading. The results of all tests were strongly influenced by the exact substitution model assumptions. Simple models, especially those that assume rate homogeneity among sites, had a higher Type 1 error rate and were more likely to generate misleading posterior probabilities. For some of these data sets, the commonly used substitution models appear to be inadequate for estimating appropriate levels of uncertainty with the SOWH test and Bayesian methods. Reasons for the differences in statistical power between the two maximum likelihood tests are discussed and are contrasted with the Bayesian approach.

Journal ArticleDOI
TL;DR: This article presents a Bayesian phylogenetic method that evaluates the adequacy of evolutionary models using posterior predictive distributions and, unlike the likelihood-ratio test and parametric bootstrap, accounts for uncertainty in the phylogeny and model parameters.
Abstract: Bayesian inference is becoming a common statistical approach to phylogenetic estimation because, among other reasons, it allows for rapid analysis of large data sets with complex evolutionary models. Conveniently, Bayesian phylogenetic methods use currently available stochastic models of sequence evolution. However, as with other model-based approaches, the results of Bayesian inference are conditional on the assumed model of evolution: inadequate models (models that poorly fit the data) may result in erroneous inferences. In this article, I present a Bayesian phylogenetic method that evaluates the adequacy of evolutionary models using posterior predictive distributions. By evaluating a model's posterior predictive performance, an adequate model can be selected for a Bayesian phylogenetic study. Although I present a single test statistic that assesses the overall (global) performance of a phylogenetic model, a variety of test statistics can be tailored to evaluate specific features (local performance) of evolutionary models to identify sources of failure. The method presented here, unlike the likelihood-ratio test and parametric bootstrap, accounts for uncertainty in the phylogeny and model parameters.

Journal ArticleDOI
TL;DR: Some popular methods for combining information are reviewed; the question of thresholding, which is critical here, is also explored, and the surveyed techniques are demonstrated on a sample data set.

Journal ArticleDOI
TL;DR: The purpose of this paper is to provide an introduction to the problem of clustered data in clinical research and to give guidance and examples of methods for analyzing clustered data and calculating sample sizes when planning studies.
Abstract: Sometimes interventions in randomized clinical trials are not allocated to individual patients, but rather to patients in groups. This is called cluster allocation, or cluster randomization, and is particularly common in health services research. Similarly, in some types of observational studies, patients (or observations) are found in naturally occurring groups, such as neighborhoods. In either situation, observations within a cluster tend to be more alike than observations selected entirely at random. This violates the assumption of independence that is at the heart of common methods of statistical estimation and hypothesis testing. Failure to account for the dependence between individual observations and the cluster to which they belong can have profound implications on the design and analysis of such studies. Their p-values will be too small, confidence intervals too narrow, and sample size estimates too small, sometimes to a dramatic degree. This problem is similar to that caused by the more familiar "unit of analysis error" seen when observations are repeated on the same subjects, but are treated as independent. The purpose of this paper is to provide an introduction to the problem of clustered data in clinical research. It provides guidance and examples of methods for analyzing clustered data and calculating sample sizes when planning studies. The article concludes with some general comments on statistical software for cluster data and principles for planning, analyzing, and presenting such studies.
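A sketch of the sample-size consequence described above, using the standard design effect DEFF = 1 + (m - 1) * rho for clusters of size m with intraclass correlation rho; the effect sizes, ICC and cluster size are illustrative assumptions.

```python
# Sample-size inflation for cluster-allocated studies: the number of subjects
# required under independence is multiplied by the design effect
# DEFF = 1 + (m - 1) * rho, where m is the cluster size and rho the
# intraclass correlation. The ICC, cluster size and proportions are made up.
import math
from scipy import stats

def n_per_arm_independent(p1, p2, alpha=0.05, power=0.80):
    """Two-proportion comparison, normal approximation."""
    z_a, z_b = stats.norm.ppf(1 - alpha / 2), stats.norm.ppf(power)
    pbar = (p1 + p2) / 2
    num = (z_a * math.sqrt(2 * pbar * (1 - pbar))
           + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p1 - p2) ** 2)

n_ind = n_per_arm_independent(0.30, 0.20)
m, rho = 25, 0.05                          # cluster size and intraclass correlation
deff = 1 + (m - 1) * rho
n_clustered = math.ceil(n_ind * deff)
print(f"per arm: {n_ind} patients if independent, "
      f"{n_clustered} (~{math.ceil(n_clustered / m)} clusters) with DEFF={deff:.2f}")
```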

Journal ArticleDOI
TL;DR: In this article, the ability of empirical downscaling models to resolve these changes is investigated with ensemble neural networks forced by synoptic-scale atmospheric conditions, using five-day averages of streamflow data from 21 watersheds in British Columbia, Canada.

Journal ArticleDOI
01 Nov 2002
TL;DR: A modified SSD method is applied, as well as the expectation-maximization and partition-ligation algorithms, to sequence data from eight loci spanning >1 Mb on the human X chromosome, and phase reconstructions are found to be highly accurate over regions with high linkage disequilibrium (LD).
Abstract: Contemporary genotyping and sequencing methods do not provide information on linkage phase in diploid organisms. The application of statistical methods to infer and reconstruct linkage phase in samples of diploid sequences is a potentially time- and labor-saving method. The Stephens-Smith-Donnelly (SSD) algorithm is one such method, which incorporates concepts from population genetics theory in a Markov chain Monte Carlo technique. We applied a modified SSD method, as well as the expectation-maximization and partition-ligation algorithms, to sequence data from eight loci spanning >1 Mb on the human X chromosome. We demonstrate that the accuracy of the modified SSD method is better than that of the other algorithms and is superior in terms of the number of sites that may be processed. Also, we find phase reconstructions by the modified SSD method to be highly accurate over regions with high linkage disequilibrium (LD). If only polymorphisms with a minor allele frequency >0.2 are analyzed and scored according to the fraction of neighbor relations correctly called, reconstructions are 95.2% accurate over entire 100-kb stretches and are 98.6% accurate within blocks of high LD.

Journal ArticleDOI
TL;DR: A confidence interval for a general linear function of population medians, which can be used to test 2-sided directional hypotheses and finite interval hypotheses, is proposed and sample size formulas are given.
Abstract: When the distribution of the response variable is skewed, the population median may be a more meaningful measure of centrality than the population mean, and when the population distribution of the response variable has heavy tails, the sample median may be a more efficient estimator of centrality than the sample mean. The authors propose a confidence interval for a general linear function of population medians. Linear functions have many important special cases including pairwise comparisons, main effects, interaction effects, simple main effects, curvature, and slope. The confidence interval can be used to test 2-sided directional hypotheses and finite interval hypotheses. Sample size formulas are given for both interval estimation and hypothesis testing problems.
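A generic sketch of interval estimation for a linear function of medians, here a pairwise contrast of two group medians, using a percentile bootstrap; the paper derives an analytic confidence interval and sample-size formulas, so the bootstrap and the simulated skewed data below are only stand-ins.

```python
# Percentile-bootstrap interval for a simple linear function of medians:
# the pairwise contrast c = median(group1) - median(group2). The skewed
# lognormal samples are simulated for illustration.
import numpy as np

rng = np.random.default_rng(9)
g1 = rng.lognormal(mean=1.0, sigma=0.8, size=40)    # skewed response, group 1
g2 = rng.lognormal(mean=1.3, sigma=0.8, size=45)    # skewed response, group 2

def boot_ci_median_contrast(a, b, n_boot=10000, level=0.95):
    stat = np.empty(n_boot)
    for i in range(n_boot):
        stat[i] = (np.median(rng.choice(a, size=a.size, replace=True))
                   - np.median(rng.choice(b, size=b.size, replace=True)))
    lo, hi = np.percentile(stat, [(1 - level) / 2 * 100, (1 + level) / 2 * 100])
    return lo, hi

point = np.median(g1) - np.median(g2)
lo, hi = boot_ci_median_contrast(g1, g2)
print(f"median difference {point:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
# A 2-sided directional or finite-interval hypothesis can then be judged by
# checking whether the interval excludes 0 or lies inside the stated interval.
```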

Journal ArticleDOI
TL;DR: A general framework is presented for data analysis of latent finite partially ordered classification models and it is demonstrated that sequential analytic methods can dramatically reduce the amount of testing that is needed to make accurate classifications.
Abstract: Summary. A general framework is presented for data analysis of latent finite partially ordered classification models. When the latent models are complex, data analytic validation of model fits and of the analysis of the statistical properties of the experiments is essential for obtaining reliable and accurate results. Empirical results are analysed from an application to cognitive modelling in educational testing. It is demonstrated that sequential analytic methods can dramatically reduce the amount of testing that is needed to make accurate classifications.