
Showing papers on "Statistical hypothesis testing published in 1998"


Journal ArticleDOI
18 Apr 1998-BMJ
TL;DR: This paper advances the view, widely held by epidemiologists, that Bonferroni adjustments are, at best, unnecessary and, at worst, deleterious to sound statistical inference.
Abstract: When more than one statistical test is performed in analysing the data from a clinical study, some statisticians and journal editors demand that a more stringent criterion be used for “statistical significance” than the conventional P<0.05. Many well meaning researchers, eager for methodological rigour, comply without fully grasping what is at stake. Recently, adjustments for multiple tests (or Bonferroni adjustments) have found their way into introductory texts on medical statistics, which has increased their apparent legitimacy. This paper advances the view, widely held by epidemiologists, that Bonferroni adjustments are, at best, unnecessary and, at worst, deleterious to sound statistical inference. Summary points: Adjusting statistical significance for the number of tests that have been performed on study data, the Bonferroni method, creates more problems than it solves. The Bonferroni method is concerned with the general null hypothesis (that all null hypotheses are true simultaneously), which is rarely of interest or use to researchers. The main weakness is that the interpretation of a finding depends on the number of other tests performed. The likelihood of type II errors is also increased, so that truly important differences are deemed non-significant. Simply describing what tests of significance have been performed, and why, is generally the best way of dealing with multiple comparisons. Bonferroni adjustments are based on the following reasoning. If a null hypothesis is true (for instance, two treatment groups in a randomised trial do not differ in terms of cure rates), a significant difference (P<0.05) will be observed by chance once in 20 trials. This is the type I error, or α. When 20 independent tests are performed (for example, study groups are compared with regard to 20 unrelated variables) and the null hypothesis holds for all 20 comparisons, the chance of at least one test being significant is no longer 0.05, but 0.64. …
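The 0.64 figure in the abstract is the familywise error rate 1 − (1 − α)^k for k independent tests; a minimal Python sketch of that arithmetic (illustrative only, not from the paper):

```python
# Familywise error rate for k independent tests at level alpha,
# and the Bonferroni-adjusted per-test threshold alpha / k.
alpha, k = 0.05, 20

fwer = 1 - (1 - alpha) ** k          # P(at least one false positive)
bonferroni_threshold = alpha / k     # per-test criterion under Bonferroni

print(f"FWER with {k} tests at alpha={alpha}: {fwer:.2f}")          # ~0.64
print(f"Bonferroni per-test threshold: {bonferroni_threshold:.4f}")  # 0.0025
```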

5,471 citations


Journal ArticleDOI
TL;DR: In this article, the authors developed the statistical theory for testing and estimating multiple change points in regression models, and several test statistics were proposed to determine the existence as well as the number of change points.
Abstract: This paper develops the statistical theory for testing and estimating multiple change points in regression models. The rate of convergence and limiting distribution for the estimated parameters are obtained. Several test statistics are proposed to determine the existence as well as the number of change points. A partial structural change model is considered. The authors study both fixed and shrinking magnitudes of shifts. In addition, the models allow for serially correlated disturbances (mixingales). An estimation strategy for which the location of the breaks need not be simultaneously determined is discussed. Instead, the authors' method successively estimates each break point.

4,820 citations


Posted Content
TL;DR: In this paper, the problem of estimating the number of break dates in a linear model with multiple structural changes has been studied and an efficient algorithm to obtain global minimizers of the sum of squared residuals has been proposed.
Abstract: In a recent paper, Bai and Perron (1998) considered theoretical issues related to the limiting distribution of estimators and test statistics in the linear model with multiple structural changes. In this companion paper, we consider practical issues for the empirical applications of the procedures. We first address the problem of estimation of the break dates and present an efficient algorithm to obtain global minimizers of the sum of squared residuals. This algorithm is based on the principle of dynamic programming and requires at most least-squares operations of order O(T 2) for any number of breaks. Our method can be applied to both pure and partial structural-change models. Secondly, we consider the problem of forming confidence intervals for the break dates under various hypotheses about the structure of the data and the errors across segments. Third, we address the issue of testing for structural changes under very general conditions on the data and the errors. Fourth, we address the issue of estimating the number of breaks. We present simulation results pertaining to the behavior of the estimators and tests in finite samples. Finally, a few empirical applications are presented to illustrate the usefulness of the procedures. All methods discussed are implemented in a GAUSS program available upon request for non-profit academic use.
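To make the break-date estimation idea concrete, here is a stripped-down sketch that estimates a single break in the mean by minimizing the sum of squared residuals over candidate break dates. This is a toy special case for orientation only; the Bai-Perron procedure handles multiple breaks in general regression models via dynamic programming, which is not implemented here.

```python
import numpy as np

def estimate_single_break(y, trim=5):
    """Break date minimizing total SSR of a one-break, mean-shift model.

    Each candidate break t splits the series into two segments, each fitted
    by its own mean; the estimated break date is the argmin of the combined
    sum of squared residuals.
    """
    y = np.asarray(y, float)
    T = len(y)
    best_ssr, best_t = np.inf, None
    for t in range(trim, T - trim):   # enforce a minimum segment length
        ssr = ((y[:t] - y[:t].mean()) ** 2).sum() + ((y[t:] - y[t:].mean()) ** 2).sum()
        if ssr < best_ssr:
            best_ssr, best_t = ssr, t
    return best_t, best_ssr

rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0, 1, 100), rng.normal(1.5, 1, 100)])  # true break at t = 100
print(estimate_single_break(y))
```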

3,836 citations


Journal ArticleDOI
TL;DR: In this paper, the authors evaluate the performance of confidence intervals and hypothesis tests when each type of statistical procedure is used for each kind of inference and confirm that each procedure is best for making the kind of inferences for which it was designed.
Abstract: There are 2 families of statistical procedures in meta-analysis: fixed- and randomeffects procedures. They were developed for somewhat different inference goals: making inferences about the effect parameters in the studies that have been observed versus making inferences about the distribution of effect parameters in a population of studies from a random sample of studies. The authors evaluate the performance of confidence intervals and hypothesis tests when each type of statistical procedure is used for each type of inference and confirm that each procedure is best for making the kind of inference for which it was designed. Conditionally random-effects procedures (a hybrid type) are shown to have properties in between those of fixed- and random-effects procedures. The use of quantitative methods to summarize the results of several empirical research studies, or metaanalysis, is now widely used in psychology, medicine, and the social sciences. Meta-analysis usually involves describing the results of each study by means of a numerical index (an estimate of effect size, such as a correlation coefficient, a standardized mean difference, or an odds ratio) and then combining these estimates across studies to obtain a summary. Two somewhat different statistical models have been developed for inference about average effect size from a collection of studies, called the fixed-effects and random-effects models. (A third alternative, the mixedeffects model, arises in conjunction with analyses involving study-level covariates or moderator variables, which we do not consider in this article; see Hedges, 1992.) Fixed-effects models treat the effect-size parameters as fixed but unknown constants to be estimated and usually (but not necessarily) are used in conjunction with assumptions about the homogeneity of effect parameters (see, e.g., Hedges, 1982; Rosenthal & Rubin, 1982). Random-effects models treat the effectsize parameters as if they were a random sample from
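For readers unfamiliar with the two families of procedures, a minimal sketch of fixed-effect and DerSimonian-Laird random-effects summary estimates computed from study effect sizes and their variances (illustrative, not the authors' code; the input numbers are hypothetical):

```python
import numpy as np

def meta_summary(effects, variances):
    """Fixed-effect and DerSimonian-Laird random-effects mean estimates."""
    y, v = np.asarray(effects, float), np.asarray(variances, float)
    w = 1.0 / v                                   # fixed-effect (inverse-variance) weights
    fixed = (w * y).sum() / w.sum()

    # DerSimonian-Laird estimate of the between-study variance tau^2
    k = len(y)
    Q = (w * (y - fixed) ** 2).sum()
    c = w.sum() - (w ** 2).sum() / w.sum()
    tau2 = max(0.0, (Q - (k - 1)) / c)

    w_star = 1.0 / (v + tau2)                     # random-effects weights
    random_effects = (w_star * y).sum() / w_star.sum()
    return fixed, random_effects, tau2

print(meta_summary([0.2, 0.5, 0.35, 0.8], [0.02, 0.04, 0.03, 0.05]))
```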

2,513 citations


Journal ArticleDOI
TL;DR: In this paper, the authors point out that one of these estimators is correct while the other is incorrect, which biases one's hypothesis test in favor of rejecting the null hypothesis that b1= b2.
Abstract: Criminologists are often interested in examining interactive effects within a regression context. For example, “holding other relevant factors constant, is the effect of delinquent peers on one's own delinquent conduct the same for males and females?” or “is the effect of a given treatment program comparable between first-time and repeat offenders?” A frequent strategy in examining such interactive effects is to test for the difference between two regression coefficients across independent samples. That is, does b1= b2? Traditionally, criminologists have employed a t or z test for the difference between slopes in making these coefficient comparisons. While there is considerable consensus as to the appropriateness of this strategy, there has been some confusion in the criminological literature as to the correct estimator of the standard error of the difference, the standard deviation of the sampling distribution of coefficient differences, in the t or z formula. Criminologists have employed two different estimators of this standard deviation in their empirical work. In this note, we point out that one of these estimators is correct while the other is incorrect. The incorrect estimator biases one's hypothesis test in favor of rejecting the null hypothesis that b1= b2. Unfortunately, the use of this incorrect estimator of the standard error of the difference has been fairly widespread in criminology. We provide the formula for the correct statistical test and illustrate with two examples from the literature how the biased estimator can lead to incorrect conclusions.
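The correct test the note advocates divides the coefficient difference by the square root of the sum of the two squared standard errors; a minimal sketch (the numeric inputs are hypothetical, and the incorrect competing estimator discussed in the paper is not reproduced here):

```python
import math
from scipy.stats import norm

def z_coef_difference(b1, se1, b2, se2):
    """z test for H0: b1 == b2 across two independent samples.

    The standard error of the difference is sqrt(se1**2 + se2**2).
    """
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = 2 * (1 - norm.cdf(abs(z)))    # two-sided p-value
    return z, p

# Hypothetical coefficients for, say, a "delinquent peers" effect by sex
print(z_coef_difference(b1=0.40, se1=0.08, b2=0.25, se2=0.07))
```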

2,346 citations


Journal ArticleDOI
TL;DR: In this paper, the effect of autocorrelation on the variance of the Mann-Kendall trend test statistic is discussed, and a modified non-parametric trend test is proposed.
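For context, a sketch of the classical Mann-Kendall test that the paper modifies; the paper's contribution is a rescaling of the variance of S to account for autocorrelation, which is not implemented in this illustration:

```python
import numpy as np
from scipy.stats import norm

def mann_kendall(x):
    """Classical Mann-Kendall trend test (no tie or autocorrelation correction).

    S sums the signs of all pairwise differences; under H0 (no trend,
    independent data) S is approximately normal with the variance below.
    """
    x = np.asarray(x, float)
    n = len(x)
    s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    z = (s - np.sign(s)) / np.sqrt(var_s) if s != 0 else 0.0  # continuity correction
    p = 2 * (1 - norm.cdf(abs(z)))
    return s, z, p

rng = np.random.default_rng(1)
print(mann_kendall(np.arange(30) * 0.1 + rng.normal(0, 1, 30)))
```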

2,252 citations


Book
01 Dec 1998
TL;DR: This book covers data acquisition and processing, statistical methods and error handling, the spatial analysis of data fields, and time-series analysis methods for oceanographic data.
Abstract: Chapter and section headings: Preface. Acknowledgments. Data Acquisition and Recording. Introduction. Basic sampling requirements. Temperature. Salinity. Depth or pressure. Sea-level measurement. Eulerian currents. Lagrangian current measurements. Wind. Precipitation. Chemical tracers. Transient chemical tracers. Data Processing and Presentation. Introduction. Calibration. Interpolation. Data presentation. Statistical Methods and Error Handling. Introduction. Sample distributions. Probability. Moments and expected values. Common probability density functions. Central limit theorem. Estimation. Confidence intervals. Selecting the sample size. Confidence intervals for altimeter bias estimators. Estimation methods. Linear estimation (regression). Relationship between regression and correlation. Hypothesis testing. Effective degrees of freedom. Editing and despiking techniques: the nature of errors. Interpolation: filling the data gaps. Covariance and the covariance matrix. Bootstrap and jackknife methods. The Spatial Analyses of Data Fields. Traditional block and bulk averaging. Objective analysis. Empirical orthogonal functions. Normal mode analysis. Inverse methods. Time-series Analysis Methods. Basic concepts. Stochastic processes and stationarity. Correlation functions. Fourier analysis. Harmonic analysis. Spectral analysis. Spectral analysis (parametric methods). Cross-spectral analysis. Wavelet analysis. Digital filters. Fractals. Appendices. References. Index. 8 illus., 135 line drawings.

1,357 citations


Book
01 Jan 1998
TL;DR: This book introduces probability functions, the Monte Carlo method, statistical tests, and methods of parameter estimation, including maximum likelihood and least squares, together with statistical errors, confidence intervals, and limits.
Abstract: Preface Notation 1. Fundamental Concepts 2. Examples of Probability Functions 3. The Monte Carlo Method 4. Statistical Tests 5. General Concepts of Parameter Estimation 6. The Method of Maximum Likelihood 7. The Method of Least Squares 8. The Method of Moments 9. Statistical Errors, Confidence Intervals and Limits 10. Characteristic Functions and Related Examples 11. Unfolding Bibliography Index

1,103 citations


Book
01 Oct 1998
TL;DR: This book covers the statistical description of the quality of processes and measurements, an introduction to hypothesis testing, analysis of variance, regression and calibration, experimental design, and related chemometric methods.
Abstract: Introduction. Statistical Description of the Quality of Processes and Measurements. The Normal Distribution. An Introduction to Hypothesis Testing. Some Important Hypothesis Tests. Analysis of Variance. Control Charts. Straight Line Regression and Calibration. Vectors and Matrices. Multiple and Polynomial Regression. Non-linear Regression. Robust Statistics. Internal Method Validation. Method Validation by Interlaboratory Studies. Other Distributions. The 2 x 2 Contingency Table. Principal Components. Information Theory. Fuzzy Methods. Process Modelling and Sampling. An Introduction to Experimental Design. Two-level Factorial Designs. Fractional Factorial Designs. Multi-level Designs. Mixture Designs. Other Optimization Methods. Genetic Algorithms and Other Global Search Strategies. Index.

928 citations


Journal ArticleDOI
TL;DR: In this article, the advantages and disadvantages of the preferred technique (weighted averaging partial least squares) are reviewed and the problems in model selection are discussed and the need for evaluation and validation of reconstructions is emphasised.
Abstract: In the last decade, palaeolimnology has shifted emphasis from being a predominantly qualitative, descriptive subject to being a quantitative, analytical science with the potential to address critical hypotheses concerning the impacts of environmental changes on limnic systems. This change has occurred because of (1) major developments in applied statistics, some of which have only become possible because of the extraordinary developments in computer technology, (2) increased concern about problem definition, research hypotheses, and project design, (3) the building up of high quality modern calibration data-sets, and (4) the narrowing of temporal resolution in palaeolimnological studies from centuries to decades or even single years or individual seasons. The most significant development in quantitative palaeolimnology has been the creation of many modern calibration data-sets of biotic assemblages and associated environmental data. Such calibration sets, when analysed by appropriate numerical procedures, have the potential to transform fossil biostratigraphical data into quantitative estimates of the past environment. The relevant numerical techniques are now well developed, widely tested, and perform remarkably well. The properties of these techniques are becoming better known as a result of simulation studies. The advantages and disadvantages of the preferred technique (weighted averaging partial least squares) are reviewed and the problems in model selection are discussed. The need for evaluation and validation of reconstructions is emphasised. Several statistical surprises have emerged from calibration studies. Outstanding problems remain the need for a detailed and consistent taxonomy in the calibration sets, the quality, representativeness, and inherent variability of the environmental variables of interest, and the inherent bias in the calibration models. Besides biological-environmental calibration sets, there is the potential to develop modern sediment-environment calibration sets to link sedimentary properties to catchment parameters. The adoption of a ‘dynamic calibration set’ approach may help to minimise the inherent bias in current calibration models. Modern regression techniques are available to explore the vast amount of unique ecological information about taxon-environment relationships in calibration data-sets. Hypothesis testing in palaeolimnology can be attempted directly by careful project design to take account of ‘natural experiments’ or indirectly by means of statistical testing, often involving computer-intensive permutation tests to evaluate specific null hypotheses. The validity of such tests depends on the type of permutation used in relation to the particular data-set being analysed, the sampling design, and the research questions being asked. Stratigraphical data require specific permutation tests. Several problems remain unsolved in devising permutation designs for fine-resolution stratigraphical data and for combined spatial and temporal data. Constrained linear or non-linear reduced rank regression techniques (e.g. redundancy analysis, canonical correspondence analysis and their partial counterparts) provide powerful tools for testing hypotheses in palaeolimnology. Work is needed, however, to extend their use to investigate and test for lag responses between biological assemblages and their environment.
Having developed modern calibration data-sets, many palaeolimnologists are returning to the sedimentary record and are studying stratigraphical changes. In contrast to much palynological data, palaeolimnological data are often fine-resolution and as a result are often noisy, large, and diverse. Recent developments for detecting and summarising patterns in such data are reviewed, including statistical evaluation of zones, summarisation by detrended correspondence analysis, and non-parametric regression (e.g. LOESS). Techniques of time-series analysis that are free of many of the assumptions of conventional time-series analysis due to the development of permutation tests to assess statistical significance are of considerable potential in analysing fine-resolution palaeolimnological data. Such data also contain a wealth of palaeopopulation information. Robust statistical techniques are needed to help identify non-linear deterministic dynamics (chaos) from noise or random effects in fine-resolution palaeolimnological data.
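As a generic illustration of the computer-intensive permutation testing mentioned above (not specific to stratigraphical data, which the authors note require specialised permutation schemes), a minimal two-group permutation test on hypothetical values:

```python
import numpy as np

def permutation_test(a, b, n_perm=9999, rng=None):
    """Permutation test for a difference in group means.

    The observed difference is compared with the distribution obtained by
    repeatedly shuffling the group labels; the p-value is the proportion of
    permuted statistics at least as extreme as the observed one.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    a, b = np.asarray(a, float), np.asarray(b, float)
    pooled = np.concatenate([a, b])
    observed = abs(a.mean() - b.mean())
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        count += abs(pooled[:len(a)].mean() - pooled[len(a):].mean()) >= observed
    return (count + 1) / (n_perm + 1)

print(permutation_test([2.1, 2.5, 2.8, 3.0], [1.2, 1.4, 1.9, 1.6]))
```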

846 citations


Book
01 Apr 1998
TL;DR: In this book, the authors present a simple and general model for statistical power analysis, covering minimum-effect tests, correlation and regression, t-tests and the analysis of variance, multi-factor designs, and the implications of power analysis.
Abstract: 1. The Power of Statistical Tests. 2. A Simple and General Model for Power Analysis. 3. Power Analyses for Minimum-Effect Tests. 4. Using Power Analyses. 5. Correlation and Regression. 6. t-Tests and the Analysis of Variance. 7. Multi-Factor ANOVA Designs. 8. Split-Plot Factorial and Multivariate Analyses. 9. The Implications of Power Analyses. Appendices.

Journal ArticleDOI
TL;DR: Autocorrelation in fish recruitment and environmental data can complicate statistical inference in correlation analyses; to address this problem, researchers often adjust their hypothesis-testing procedures, as discussed in this paper.
Abstract: Autocorrelation in fish recruitment and environmental data can complicate statistical inference in correlation analyses. To address this problem, researchers often either adjust hypothesis testing ...

Journal ArticleDOI
TL;DR: While the method of types is suitable primarily for discrete memoryless models, its extensions to certain models with memory are also discussed, and a wide selection of further applications are surveyed.
Abstract: The method of types is one of the key technical tools in Shannon theory, and this tool is valuable also in other fields. In this paper, some key applications are presented in sufficient detail enabling an interested nonspecialist to gain a working knowledge of the method, and a wide selection of further applications are surveyed. These range from hypothesis testing and large deviations theory through error exponents for discrete memoryless channels and capacity of arbitrarily varying channels to multiuser problems. While the method of types is suitable primarily for discrete memoryless models, its extensions to certain models with memory are also discussed.
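Two standard estimates underlying the method (stated here informally for orientation, with natural logarithms; see the paper for precise statements): the number of types grows only polynomially in the blocklength, while each type class has exponential size governed by entropy and exponentially small probability governed by relative entropy, from which hypothesis-testing error exponents follow by optimizing over types.

```latex
% Standard type-counting bounds for a finite alphabet X and blocklength n:
% P_n = set of types, T_P = type class of type P, Q^n = product distribution.
\begin{align*}
  |\mathcal{P}_n| &\le (n+1)^{|\mathcal{X}|}, \\
  (n+1)^{-|\mathcal{X}|}\, e^{n H(P)} &\le |T_P| \le e^{n H(P)}, \\
  (n+1)^{-|\mathcal{X}|}\, e^{-n D(P\|Q)} &\le Q^n(T_P) \le e^{-n D(P\|Q)}.
\end{align*}
```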

Posted Content
TL;DR: In this article, a brief overview of the class of models under study and central theoretical issues such as the curse of dimensionality, the bias-variance trade-off and rates of convergence are discussed.
Abstract: This introduction to nonparametric regression emphasizes techniques that might be most accessible and useful to the applied economist. The paper begins with a brief overview of the class of models under study and central theoretical issues such as the curse of dimensionality, the bias-variance trade-off and rates of convergence. The paper then focuses on kernel and nonparametric least squares estimation of the nonparametric regression model, and optimal differencing estimation of the partial linear model. Constrained estimation and hypothesis testing is also discussed. Empirical examples include returns to scale in electricity distribution and hedonic pricing of housing attributes.
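A minimal Nadaraya-Watson kernel regression sketch of the kind of estimator discussed (illustrative only; bandwidth selection and the paper's differencing estimators for the partial linear model are not addressed):

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_eval, bandwidth):
    """Nadaraya-Watson kernel regression with a Gaussian kernel.

    Each fitted value is a locally weighted average of the responses, with
    weights that decay with distance from the evaluation point.
    """
    x_train = np.asarray(x_train, float)
    y_train = np.asarray(y_train, float)
    u = (np.asarray(x_eval, float)[:, None] - x_train[None, :]) / bandwidth
    k = np.exp(-0.5 * u ** 2)                 # Gaussian kernel weights
    return (k * y_train).sum(axis=1) / k.sum(axis=1)

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 10, 200))
y = np.sin(x) + rng.normal(0, 0.3, 200)
print(nadaraya_watson(x, y, np.linspace(0, 10, 5), bandwidth=0.5))
```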

Book ChapterDOI
01 Jan 1998
TL;DR: Model building and data analysis in the biological sciences somewhat presupposes that the person has some advanced education in the quantitative sciences, and statistics in particular, and this requirement also implies that a person has substantial knowledge of statistical hypothesis-testing approaches.
Abstract: Model building and data analysis in the biological sciences somewhat presupposes that the person has some advanced education in the quantitative sciences, and statistics in particular. This requirement also implies that a person has substantial knowledge of statistical hypothesis-testing approaches. Such people, including ourselves over the past several years, often find it difficult to understand the information-theoretic approach, only because it is conceptually so very different from the testing approach that is so familiar. Relatively speaking, the concepts and practical use of the information-theoretic approach are much simpler than those of statistical hypothesis testing, and very much simpler than some of the various Bayesian approaches to data analysis (e.g., Laud and Ibrahim 1995 and Carlin and Chib 1995).
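A minimal sketch of the information-theoretic model comparison contrasted here with hypothesis testing: compute AIC for competing least-squares fits and rank them by their AIC differences (the data and candidate models are hypothetical, chosen only to make the example self-contained):

```python
import numpy as np

def aic_least_squares(rss, n, k):
    """AIC for a least-squares fit: n*log(rss/n) + 2k (k counts the residual variance)."""
    return n * np.log(rss / n) + 2 * k

rng = np.random.default_rng(3)
x = np.linspace(0, 1, 50)
y = 1.0 + 2.0 * x + rng.normal(0, 0.2, 50)

results = {}
for degree in (1, 2, 3):                      # competing polynomial models
    coefs = np.polyfit(x, y, degree)
    rss = ((y - np.polyval(coefs, x)) ** 2).sum()
    results[degree] = aic_least_squares(rss, len(x), degree + 2)

best = min(results.values())
for degree, aic in results.items():
    print(f"degree {degree}: AIC = {aic:.2f}, delta = {aic - best:.2f}")
```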

Journal ArticleDOI
TL;DR: The authors examined the results of nonparametric tests with small sample sizes published in a recent issue of Animal Behaviour and found that in more than half of the articles concerned, the asymptotic variant had apparently been inappropriately used and incorrect P values had been presented.
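The issue the authors raise can be seen directly by comparing exact and asymptotic (normal-approximation) P values for a small-sample nonparametric test; a sketch assuming SciPy ≥ 1.7, where `mannwhitneyu` exposes a `method` argument (the data are hypothetical):

```python
from scipy.stats import mannwhitneyu

# Two very small samples, as in many behavioural studies
a = [1.1, 2.3, 2.9, 3.4, 4.0]
b = [2.0, 3.8, 4.1, 4.5, 5.2]

exact = mannwhitneyu(a, b, alternative="two-sided", method="exact")
asymp = mannwhitneyu(a, b, alternative="two-sided", method="asymptotic")

# With samples this small the normal approximation can differ noticeably
print(f"exact P = {exact.pvalue:.4f}, asymptotic P = {asymp.pvalue:.4f}")
```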


Journal ArticleDOI
TL;DR: In this paper, test statistics are proposed that can be used to test hypotheses about the parameters of the deterministic trend function of a univariate time series, and the tests are valid for I(0) and I(1) errors.
Abstract: In this paper test statistics are proposed that can be used to test hypotheses about the parameters of the deterministic trend function of a univariate time series. The tests are valid in the presence of general forms of serial correlation in the errors and can be used without having to estimate the serial correlation parameters either parametrically or nonparametrically. The tests are valid for I(0) and I(1) errors. Trend functions that are permitted include general linear polynomial trend functions that may have breaks at either known or unknown locations. Asymptotic distributions are derived, and consistency of the tests is established. The general results are applied to a model with a simple linear trend. A local asymptotic analysis is used to compute asymptotic size and power of the tests for this example. Size is well controlled and is relatively unaffected by the variance of the initial condition. Asymptotic power curves are computed for the simple linear trend model and are compared to existing tests. It is shown that the new tests have nontrivial asymptotic power. A simulation study shows that the asymptotic approximations are adequate for sample sizes typically used in economics. The tests are used to construct confidence intervals for average GNP growth rates for eight industrialized countries using post-war data.

Journal ArticleDOI
TL;DR: The purpose of this paper is to explore some of the issues involved in estimating models with spatially autocorrelated error terms, comparing the two most common approaches, the weight matrix approach and direct specification of the correlation structure, and their resulting correlation structures.

Journal ArticleDOI
TL;DR: In this article, simple techniques for the graphical display of simulation evidence concerning the size and power of hypothesis tests are developed and illustrated, including P value plots, P value discrepancy plots, and size-power curves.
Abstract: Simple techniques for the graphical display of simulation evidence concerning the size and power of hypothesis tests are developed and illustrated. Three types of figures, called P value plots, P value discrepancy plots, and size-power curves, are discussed. Some Monte Carlo experiments on the properties of alternative forms of the information matrix test for linear regression models and probit models are used to illustrate these figures. Tests based on the OPG regression generally perform much worse in terms of both size and power than efficient score tests.
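A minimal Monte Carlo sketch of a P value plot of the kind described: simulate the test under the null hypothesis, then plot the empirical distribution function of the P values against the nominal level (a correctly sized test tracks the 45-degree line). The test and sample sizes here are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import ttest_ind

rng = np.random.default_rng(4)
pvalues = []
for _ in range(2000):                     # Monte Carlo replications under H0
    x = rng.normal(0, 1, 20)
    y = rng.normal(0, 1, 20)
    pvalues.append(ttest_ind(x, y).pvalue)

nominal = np.linspace(0.001, 1.0, 200)
empirical = [(np.array(pvalues) <= a).mean() for a in nominal]  # EDF of the P values

plt.plot(nominal, empirical, label="empirical rejection frequency")
plt.plot([0, 1], [0, 1], linestyle="--", label="45-degree line")
plt.xlabel("nominal level")
plt.ylabel("rejection frequency")
plt.legend()
plt.show()
```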

Proceedings ArticleDOI
18 May 1998
TL;DR: This paper presents the problem of identifying whether a received signal at a base station is due to a line-of-sight (LOS) transmission or not (NLOS), and solves the binary hypothesis test under several assumptions.
Abstract: This paper presents the problem of identifying whether a received signal at a base station is due to a line-of-sight (LOS) transmission or not (NLOS). This is a first step towards estimating the mobile station's location. We formulate the NLOS identification problem as a binary hypothesis test where the range measurements are modeled as being corrupted by additive noise, with different probability distributions depending on the hypothesis. We solve the binary hypothesis test under several assumptions, proposing appropriate decision criteria.
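As an illustration of the kind of binary hypothesis test described, a log-likelihood-ratio decision rule on ranging residuals; the specific noise models here (zero-mean Gaussian under LOS, positively biased and noisier Gaussian under NLOS) and all parameter values are assumptions for the sketch, not the paper's models:

```python
import numpy as np
from scipy.stats import norm

def nlos_log_likelihood_ratio(residuals, sigma_los=1.0, sigma_nlos=3.0, bias_nlos=5.0):
    """Log-likelihood ratio: H0 = LOS (zero-mean noise) vs H1 = NLOS
    (positively biased, larger-variance noise). Decide NLOS if the ratio > 0.
    """
    r = np.asarray(residuals, float)
    ll_los = norm.logpdf(r, loc=0.0, scale=sigma_los).sum()
    ll_nlos = norm.logpdf(r, loc=bias_nlos, scale=sigma_nlos).sum()
    return ll_nlos - ll_los

rng = np.random.default_rng(5)
los_resid = rng.normal(0, 1, 50)
nlos_resid = rng.normal(5, 3, 50)
print(nlos_log_likelihood_ratio(los_resid) > 0,    # expected: False (LOS)
      nlos_log_likelihood_ratio(nlos_resid) > 0)   # expected: True (NLOS)
```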

Journal ArticleDOI
TL;DR: This work sets out the general framework of quasi-likelihood and estimating functions, covering an alternative approach based on E-sufficiency, asymptotic confidence zones of minimum size, hypothesis testing, and a range of applications.
Abstract: The General Framework.- An Alternative Approach: E-Sufficiency.- Asymptotic Confidence Zones of Minimum Size.- Asymptotic Quasi-Likelihood.- Combining Estimating Functions.- Projected Quasi-Likelihood.- Bypassing the Likelihood.- Hypothesis Testing.- Infinite Dimensional Problems.- Miscellaneous Applications.- Consistency and Asymptotic Normality for Estimating Functions.- Complements and Strategies for Application.

Journal ArticleDOI
TL;DR: In this paper, a statistical inference is developed and applied to estimation of the error, validation of optimality of a calculated solution and statistically based stopping criteria for an iterative algorithm for two-stage stochastic programming with recourse where the random data have a continuous distribution.
Abstract: In this paper we consider stochastic programming problems where the objective function is given as an expected value function. We discuss Monte Carlo simulation based approaches to a numerical solution of such problems. In particular, we discuss in detail and present numerical results for two-stage stochastic programming with recourse where the random data have a continuous (multivariate normal) distribution. We think that the novelty of the numerical approach developed in this paper is twofold. First, various variance reduction techniques are applied in order to enhance the rate of convergence. Successful application of those techniques is what makes the whole approach numerically feasible. Second, a statistical inference is developed and applied to estimation of the error, validation of optimality of a calculated solution and statistically based stopping criteria for an iterative algorithm. © 1998 The Mathematical Programming Society, Inc. Published by Elsevier Science B.V.

Journal ArticleDOI
TL;DR: In this paper, a simple consistent test is considered and a bootstrap method is proposed for testing a parametric regression functional form, which gives a more accurate approximation to the null distribution of the test than the asymptotic normal theory result.

Journal ArticleDOI
TL;DR: This paper presents a survey of the literature on the information-theoretic problems of statistical inference under multiterminal data compression with rate constraints, and includes three new results, i.e., the converse theorems for all problems of multiterMinal hypothesis testing, multiter Minal parameter estimation, and multiterMINal pattern classification at the zero rate.
Abstract: This paper presents a survey of the literature on the information-theoretic problems of statistical inference under multiterminal data compression with rate constraints. Significant emphasis is put on problems: (1) multiterminal hypothesis testing, (2) multiterminal parameter estimation and (3) multiterminal pattern classification, in either case of positive rates or zero rates. In addition, the paper includes three new results, i.e., the converse theorems for all problems of multiterminal hypothesis testing, multiterminal parameter estimation, and multiterminal pattern classification at the zero rate.

Journal ArticleDOI
TL;DR: In this paper, the adaptive Neyman test and wavelet thresholding were used to detect differences between two sets of curves, resulting in an adaptive high-dimensional analysis of variance, called HANOVA.
Abstract: With modern technology, massive data can easily be collected in a form of multiple sets of curves. New statistical challenges include testing whether there is any statistically significant difference among these sets of curves. In this article we propose some new tests for comparing two groups of curves based on the adaptive Neyman test and the wavelet thresholding techniques introduced earlier by Fan. We demonstrate that these tests inherit the properties outlined by Fan and that they are simple and powerful for detecting differences between two sets of curves. We then further generalize the idea to compare multiple sets of curves, resulting in an adaptive high-dimensional analysis of variance, called HANOVA. These newly developed techniques are illustrated by using a dataset on pizza commercials where observations are curves and an analysis of cornea topography in ophthalmology where images of individuals are observed. A simulation example is also presented to illustrate the power of the adaptive Neyman test.

ReportDOI
TL;DR: In this paper, regression-based tests of hypotheses about out-of-sample prediction errors are developed for zero mean and zero correlation between a prediction error and a vector of predictors.
Abstract: We develop regression-based tests of hypotheses about out of sample prediction errors. Representative tests include ones for zero mean and zero correlation between a prediction error and a vector of predictors. The relevant environments are ones in which predictions depend on estimated parameters. We show that standard regression statistics generally fail to account for error introduced by estimation of these parameters. We propose computationally convenient test statistics that properly account for such error. Simulations indicate that the procedures can work well in samples of size typically available, although there sometimes are substantial size distortions.
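A stripped-down sketch of the regression-based idea: regress out-of-sample prediction errors on a constant (zero-mean test) or on predictors (zero-correlation test) and test that the coefficients are zero. As the abstract stresses, the naive standard errors below ignore the error from estimating the forecasting model's parameters, which the authors' proposed statistics correct for; the data here are hypothetical.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)

# Hypothetical out-of-sample prediction errors and one candidate predictor
errors = rng.normal(0.1, 1.0, 120)
predictor = rng.normal(0, 1, 120)

X = sm.add_constant(predictor)
fit = sm.OLS(errors, X).fit()

# Naive t statistics for "zero mean" (constant) and "zero correlation" (slope);
# these do NOT account for estimated-parameter uncertainty in the forecasts.
print(fit.tvalues, fit.pvalues)
```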

Journal ArticleDOI
TL;DR: This study uses cluster analysis because common modeling techniques developed to account for heterogeneity, such as the fixed and random effects models, are impractical for count data, and it indicates that separate models describe the data more efficiently than the joint model.

Journal ArticleDOI
TL;DR: The critical consideration for studies with covariance analyses planned as the primary method for comparing treatments is the specification of the covariables in the protocol (or in an amendment or formal plan prior to any unmasking of the study).
Abstract: Analysis of covariance is an effective method for addressing two considerations for randomized clinical trials. One is reduction of variance for estimates of treatment effects and thereby the production of narrower confidence intervals and more powerful statistical tests. The other is the clarification of the magnitude of treatment effects through adjustment of corresponding estimates for any random imbalances between the treatment groups with respect to the covariables. The statistical basis of covariance analysis can be either non-parametric, with reliance only on the randomization in the study design, or parametric through a statistical model for a postulated sampling process. For non-parametric methods, there are no formal assumptions for how a response variable is related to the covariables, but strong correlation between response and covariables is necessary for variance reduction. Computations for these methods are straightforward through the application of weighted least squares to fit linear models to the differences between treatment groups for the means of the response variable and the covariables jointly with a specification that has null values for the differences that correspond to the covariables. Moreover, such analysis is similarly applicable to dichotomous indicators, ranks or integers for ordered categories, and continuous measurements. Since non-parametric covariance analysis can have many forms, the ones which are planned for a clinical trial need careful specification in its protocol. A limitation of non-parametric analysis is that it does not directly address the magnitude of treatment effects within subgroups based on the covariables or the homogeneity of such effects. For this purpose, a statistical model is needed. When the response criterion is dichotomous or has ordered categories, such a model may have a non-linear nature which determines how covariance adjustment modifies results for treatment effects. Insight concerning such modifications can be gained through their evaluation relative to non-parametric counterparts. Such evaluation usually indicates that alternative ways to compare treatments for a response criterion with adjustment for a set of covariables mutually support the same conclusion about the strength of treatment effects. This robustness is noteworthy since the alternative methods for covariance analysis have substantially different rationales and assumptions. Since findings can differ in important ways across alternative choices for covariables (as opposed to methods for covariance adjustment), the critical consideration for studies with covariance analyses planned as the primary method for comparing treatments is the specification of the covariables in the protocol (or in an amendment or formal plan prior to any unmasking of the study).
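A minimal parametric analysis-of-covariance sketch of the kind contrasted above with the non-parametric approach: adjust a randomized treatment comparison for a baseline covariable with a linear model. The variable names and simulated data are hypothetical; the point is only that the adjusted treatment effect typically has a smaller standard error than the unadjusted one.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 100
baseline = rng.normal(50, 10, n)
treatment = rng.integers(0, 2, n)               # randomized 0/1 assignment
outcome = 0.8 * baseline + 5.0 * treatment + rng.normal(0, 5, n)

df = pd.DataFrame({"outcome": outcome, "treatment": treatment, "baseline": baseline})

adjusted = smf.ols("outcome ~ treatment + baseline", data=df).fit()
unadjusted = smf.ols("outcome ~ treatment", data=df).fit()

# Covariance-adjusted effect estimate and its (usually narrower) standard error
print(adjusted.params["treatment"], adjusted.bse["treatment"], unadjusted.bse["treatment"])
```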

Journal ArticleDOI
TL;DR: In this paper, a simulation and verification study assessing the performance of 10 censored data reconstitution methods was conducted to develop guidance for statistical comparisons among very small samples (n < 10) with below detection limit observations in dredged sediment testing.
Abstract: A simulation and verification study assessing the performance of 10 censored data reconstitution methods was conducted to develop guidance for statistical comparisons among very small samples (n < 10) with below detection limit observations in dredged sediment testing. Censored data methods were evaluated for preservation of power and nominal type I error rate in subsequent statistical comparisons. Method performance was influenced by amount of censoring, data transformation, population distribution, and variance characteristics. For nearly all situations examined, substitution of a constant such as one-half the detection limit equaled or outperformed more complicated methods. Regression order statistics and maximum likelihood techniques previously recommended for estimating population parameters from censored environmental samples generally performed poorly in very small-sample statistical hypothesis testing with more than minimal censoring, due to their inability to accurately infer distributional prope...
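A minimal sketch of the simple substitution approach the study found competitive: replace below-detection-limit observations with half the detection limit before a small-sample comparison. The detection limit, sample values, and censoring pattern here are hypothetical.

```python
import numpy as np
from scipy.stats import ttest_ind

detection_limit = 0.5

def substitute_half_dl(values, censored):
    """Replace censored (below detection limit) observations with DL/2."""
    values = np.asarray(values, float).copy()
    values[np.asarray(censored, bool)] = detection_limit / 2.0
    return values

# np.nan marks non-detects prior to substitution
site = substitute_half_dl([0.9, 1.2, np.nan, 0.7, np.nan], [0, 0, 1, 0, 1])
reference = substitute_half_dl([0.6, np.nan, 0.8, np.nan, 0.5], [0, 1, 0, 1, 0])
print(ttest_ind(site, reference, equal_var=False))
```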