
Showing papers on "Statistical hypothesis testing published in 1993"


Journal ArticleDOI
TL;DR: In this article, the authors considered tests for parameter instability and structural change with unknown change point, and the results apply to a wide class of parametric models that are suitable for estimation by generalized method of moments procedures.
Abstract: This paper considers tests for parameter instability and structural change with unknown change point. The results apply to a wide class of parametric models that are suitable for estimation by generalized method of moments procedures. The asymptotic distributions of the test statistics considered here are nonstandard because the change point parameter only appears under the alternative hypothesis and not under the null. The tests considered here are shown to have nontrivial asymptotic local power against all alternatives for which the parameters are nonconstant. The tests are found to perform quite well in a Monte Carlo experiment reported elsewhere. Copyright 1993 by The Econometric Society.
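
A minimal sketch of the general idea (not the paper's GMM framework): scan a trimmed range of candidate break points in a simple linear regression, compute a Chow-type F statistic at each, and take the supremum. The variable names, data-generating process, and 15% trimming are illustrative, and the resulting statistic has a nonstandard null distribution, so ordinary F critical values do not apply.

```python
# Sketch of a sup-F scan over an unknown break point in a linear regression.
import numpy as np

def ssr(y, X):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

def sup_f_statistic(y, X, trim=0.15):
    n, k = X.shape
    lo, hi = int(n * trim), int(n * (1 - trim))
    ssr_full = ssr(y, X)                          # restricted: no break
    best = -np.inf
    for t in range(lo, hi):                       # candidate break points
        ssr_split = ssr(y[:t], X[:t]) + ssr(y[t:], X[t:])
        f = ((ssr_full - ssr_split) / k) / (ssr_split / (n - 2 * k))
        best = max(best, f)
    return best

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)     # no break in this DGP
print("sup-F over trimmed break points:", sup_f_statistic(y, X))
```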

4,348 citations


Book
09 Jul 1993
TL;DR: The authors present statistical methods for analytical chemistry, covering errors in classical and instrumental analysis, significance tests, quality control and sampling, regression and correlation, non-parametric and robust methods, and experimental design, together with a summary of statistical tests.
Abstract: Introduction. Errors in Classical Analysis - Statistics of Repeated Measurements. Significance Tests. Quality Control and Sampling. Errors in Instrumental Analysis. Regression and Correlation. Non-parametric and Robust Methods. Experimental Design. Optimization and Pattern Recognition. Solutions to Exercises. Appendix 1: Summary of Statistical Tests. Appendix 2: Statistical Tests.

3,834 citations


Journal ArticleDOI
01 Sep 1993-Ecology
TL;DR: The paper first discusses how autocorrelation in ecological variables can be described and measured, then presents ways of explicitly introducing spatial structures into ecological models, for which two approaches are proposed.
Abstract: Autocorrelation is a very general statistical property of ecological variables observed across geographic space; its most common forms are patches and gradients. Spatial autocorrelation, which comes either from the physical forcing of environmental variables or from community processes, presents a problem for statistical testing because autocorrelated data violate the assumption of independence of most standard statistical procedures. The paper discusses first how autocorrelation in ecological variables can be described and measured, with emphasis on mapping techniques. Then, proper statistical testing in the presence of autocorrelation is briefly discussed. Finally, ways are presented of explicitly introducing spatial structures into ecological models. Two approaches are proposed: in the raw-data approach, the spatial structure takes the form of a polynomial of the x and y geographic coordinates of the sampling stations; in the matrix approach, the spatial structure is introduced in the form of a geographic distance matrix among locations. These two approaches are compared in the concluding section. A table provides a list of computer programs available for spatial analysis.
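
A minimal sketch of the raw-data approach mentioned above, on synthetic data: the spatial structure is modelled as a polynomial (here cubic) in the x and y coordinates of the sampling stations, i.e. a trend surface fitted by ordinary least squares. Variable names and the cubic degree are illustrative choices, not from the paper.

```python
# Fit a cubic trend surface in the x, y coordinates and report variance explained.
import numpy as np

rng = np.random.default_rng(1)
n = 100
x, y = rng.uniform(0, 10, n), rng.uniform(0, 10, n)
z = 2.0 + 0.5 * x - 0.3 * y + rng.normal(scale=1.0, size=n)   # ecological variable

# cubic trend-surface terms: x, y, x^2, xy, y^2, x^3, x^2*y, x*y^2, y^3
terms = [x, y, x**2, x*y, y**2, x**3, x**2*y, x*y**2, y**3]
X = np.column_stack([np.ones(n)] + terms)

beta, *_ = np.linalg.lstsq(X, z, rcond=None)
fitted = X @ beta
r2 = 1 - np.sum((z - fitted) ** 2) / np.sum((z - z.mean()) ** 2)
print(f"Trend-surface R^2: {r2:.3f}")
```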

3,491 citations


Journal ArticleDOI
TL;DR: A method for rapid visual recognition of personal identity is described, based on the failure of a statistical test of independence, which implies a theoretical "cross-over" error rate of one in 131000 when a decision criterion is adopted that would equalize the false accept and false reject error rates.
Abstract: A method for rapid visual recognition of personal identity is described, based on the failure of a statistical test of independence. The most unique phenotypic feature visible in a person's face is the detailed texture of each eye's iris. The visible texture of a person's iris in a real-time video image is encoded into a compact sequence of multi-scale quadrature 2-D Gabor wavelet coefficients, whose most-significant bits comprise a 256-byte "iris code". Statistical decision theory generates identification decisions from Exclusive-OR comparisons of complete iris codes at the rate of 4000 per second, including calculation of decision confidence levels. The distributions observed empirically in such comparisons imply a theoretical "cross-over" error rate of one in 131,000 when a decision criterion is adopted that would equalize the false accept and false reject error rates. In the typical recognition case, given the mean observed degree of iris code agreement, the decision confidence levels correspond formally to a conditional false accept probability of one in about 10^31.
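
A minimal sketch of the comparison step only, with random stand-in codes rather than real Gabor-wavelet encodings: two 256-byte iris codes are XOR-ed bit by bit and the normalized Hamming distance is compared against an illustrative decision criterion (the 0.32 threshold is an assumption, not a value from the paper).

```python
# Exclusive-OR comparison of two 256-byte codes and a threshold decision.
import numpy as np

rng = np.random.default_rng(42)
code_a = rng.integers(0, 256, size=256, dtype=np.uint8)   # 256-byte "iris code"
code_b = rng.integers(0, 256, size=256, dtype=np.uint8)

xor = np.bitwise_xor(code_a, code_b)
hamming = np.unpackbits(xor).mean()        # fraction of the 2048 bits that disagree

THRESHOLD = 0.32                            # hypothetical decision criterion
decision = "same iris" if hamming < THRESHOLD else "different irises"
print(f"normalized Hamming distance = {hamming:.3f} -> {decision}")
```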

3,399 citations


Book
29 Oct 1993
TL;DR: This book presents methods for the statistical analysis of circular data, including descriptive methods, models such as the von Mises distribution, the analysis of one or more samples of unimodal data, and modern techniques for testing and estimation such as bootstrap and randomisation tests.
Abstract: Preface 1. The purpose of the book 2. Survey of contents 3. How to use the book 4. Notation, terminology and conventions 5. Acknowledgements Part I. Introduction: Part II. Descriptive Methods: 2.1. Introduction 2.2. Data display 2.3. Simple summary quantities 2.4. Modifications for axial data Part III. Models: 3.1. Introduction 3.2. Notation trigonometric moments 3.3. Probability distributions on the circle Part IV. Analysis of a Single Sample of Data: 4.1. Introduction 4.2. Exploratory analysis 4.3. Testing a sample of unit vectors for uniformity 4.4. Nonparametric methods for unimodal data 4.5. Statistical analysis of a random sample of unit vectors from a von Mises distribution 4.6. Statistical analysis of a random sample of unit vectors from a multimodal distribution 4.7. Other topics Part V. Analysis of Two or More Samples, and of Other Experimental Layouts: 5.1. Introduction 5.2. Exploratory analysis 5.3. Nonparametric methods for analysing two or more samples of unimodal data 5.4. Analysis of two or more samples from von Mises distributions 5.5. Analysis of data from more complicated experimental designs Part VI. Correlation and Regression: 6.1. Introduction 6.2. Linear-circular association and circular-linear association 6.3. Circular-circular association 6.4. Regression models for a circular response variable Part VII. Analysis of Data with Temporal or Spatial Structure: 7.1. Introduction 7.2. Analysis of temporal data 7.3. Spatial analysis Part VIII. Some Modern Statistical Techniques for Testing and Estimation: 8.1. Introduction 8.2. Bootstrap methods for confidence intervals and hypothesis tests: general description 8.3. Bootstrap methods for circular data: confidence regions for the mean direction 8.4. Bootstrap methods for circular data: hypothesis tests for mean directions 8.5. Randomisation, or permutation, tests Appendix A. Tables Appendix B. Data sets References Index.
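
A minimal sketch, with simulated von Mises data, of two steps covered in Parts IV and VIII above: computing the sample mean direction and forming a basic percentile bootstrap confidence interval for it. This ignores the wrap-around refinements a full treatment of circular data would need; sample size and parameters are illustrative.

```python
# Mean direction of circular data plus a naive percentile bootstrap interval.
import numpy as np

def mean_direction(theta):
    # direction of the resultant of the unit vectors (cos theta, sin theta)
    return np.arctan2(np.sin(theta).sum(), np.cos(theta).sum())

rng = np.random.default_rng(7)
theta = rng.vonmises(mu=np.pi / 4, kappa=4.0, size=50)   # sample from a von Mises

boot = np.array([mean_direction(rng.choice(theta, size=theta.size, replace=True))
                 for _ in range(2000)])
# percentile interval; fine here because the angles stay well away from +/- pi
lo, hi = np.quantile(boot, [0.025, 0.975])
print(f"mean direction = {mean_direction(theta):.3f} rad, "
      f"bootstrap 95% CI ~ ({lo:.3f}, {hi:.3f})")
```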

2,323 citations


Posted Content
TL;DR: This book focuses on the exploration of relationships among integrated data series and the exploitation of these relationships in dynamic econometric modelling, and the asymptotic theory of integrated processes is described.
Abstract: This book provides a wide-ranging account of the literature on co-integration and the modelling of integrated processes (those which accumulate the effects of past shocks). Data series which display integrated behaviour are common in economics, although techniques appropriate to analysing such data are of recent origin and there are few existing expositions of the literature. This book focuses on the exploration of relationships among integrated data series and the exploitation of these relationships in dynamic econometric modelling. The concepts of co-integration and error-correction models are fundamental components of the modelling strategy. This area of time-series econometrics has grown in importance over the past decade and is of interest to econometric theorists and applied econometricians alike. By explaining the important concepts informally, but also presenting them formally, the book bridges the gap between purely descriptive and purely theoretical accounts of the literature. The asymptotic theory of integrated processes is described and the tools provided by this theory are used to develop the distributions of estimators and test statistics. Practical modelling advice, and the use of techniques for systems estimation, are also emphasized. A knowledge of econometrics, statistics, and matrix algebra at the level of a final-year undergraduate or first-year graduate course in econometrics is sufficient for most of the book. Other mathematical tools are described as they occur.
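
A minimal sketch, not taken from the book, of a residual-based (Engle-Granger style) co-integration test on two simulated integrated series that share a stochastic trend, using statsmodels. The data-generating process is illustrative.

```python
# Engle-Granger style co-integration test on two simulated random-walk series.
import numpy as np
from statsmodels.tsa.stattools import coint

rng = np.random.default_rng(3)
n = 500
common = np.cumsum(rng.normal(size=n))            # shared stochastic trend
y1 = common + rng.normal(scale=0.5, size=n)       # co-integrated with y2
y2 = 0.8 * common + rng.normal(scale=0.5, size=n)

t_stat, p_value, crit = coint(y1, y2)
print(f"Engle-Granger t = {t_stat:.2f}, p = {p_value:.3f}")
print("1%/5%/10% critical values:", crit)
```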

2,050 citations


Journal ArticleDOI
TL;DR: Using empirically scaled computer simulation models of continuous traits evolving along phylogenetic trees to obtain null distributions of F statistics for ANCOVA of comparative data sets is proposed.
Abstract: Biologists often compare average phenotypes of groups of species defined cladistically or on behavioral, ecological, or physiological criteria (e.g., carnivores vs. herbivores, social vs. nonsocial species, endotherms vs. ectotherms). Hypothesis testing typically is accomplished via analysis of variance (ANOVA) or covariance (ANCOVA; often with body size as a covariate). Because of the hierarchical nature of phylogenetic descent, however, species may not represent statistically independent data points, degrees of freedom may be inflated, and significance levels derived from conventional tests cannot be trusted. As one solution to this degrees of freedom problem, we propose using empirically scaled computer simulation models of continuous traits evolving along "known" phylogenetic trees to obtain null distributions of F statistics for ANCOVA of comparative data sets.

1,188 citations


Journal ArticleDOI
01 Oct 1993-Genetics
TL;DR: Simple statistical methods for testing the molecular evolutionary clock hypothesis are developed which can be applied to both nucleotide and amino acid sequences; their powers are similar to those of the likelihood ratio test and the relative rate test, even though the latter two tests assume that the pattern of substitution rates follows a certain model.
Abstract: Simple statistical methods for testing the molecular evolutionary clock hypothesis are developed which can be applied to both nucleotide and amino acid sequences. These methods are based on the chi-square test and are applicable even when the pattern of substitution rates is unknown and/or the substitution rate varies among different sites. Furthermore, some of the methods can be applied even when the outgroup is unknown. Using computer simulations, these methods were compared with the likelihood ratio test and the relative rate test. The results indicate that the powers of the present methods are similar to those of the likelihood ratio test and the relative rate test, in spite of the fact that the latter two tests assume that the pattern of substitution rates follows a certain model and that the substitution rate is the same among different sites, while such assumptions are not necessary to apply the present methods. Therefore, the present methods might be useful.
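
A minimal sketch in the spirit of the relative-rate idea, using toy sequences: with two ingroup sequences and an outgroup, sites implying extra change on lineage 1 versus lineage 2 should be equally frequent under a clock, which can be checked with a one-degree-of-freedom chi-square test. This is the simple counting version, not the exact tests developed in the paper.

```python
# One-degree-of-freedom chi-square check of rate equality between two lineages.
from scipy.stats import chi2

seq1 = "ACGTTGCAAGGCTTACGATCGTAC"   # toy ingroup sequence 1
seq2 = "ACGTTGCATGGCTTACGATCGTAC"   # toy ingroup sequence 2
outg = "ACGATGCAAGGCTAACGATCGTAC"   # toy outgroup sequence

m1 = sum(a != o and b == o for a, b, o in zip(seq1, seq2, outg))  # change on lineage 1
m2 = sum(b != o and a == o for a, b, o in zip(seq1, seq2, outg))  # change on lineage 2

stat = (m1 - m2) ** 2 / (m1 + m2) if (m1 + m2) > 0 else 0.0
p = chi2.sf(stat, df=1)
print(f"m1={m1}, m2={m2}, chi-square={stat:.2f}, p={p:.3f}")
```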

873 citations


Journal ArticleDOI
TL;DR: A test statistic suggested by Cox is employed to test the adequacy of some statistical models of DNA sequence evolution used in the phylogenetic inference method introduced by Felsenstein.
Abstract: Penny et al. have written that "The most fundamental criterion for a scientific method is that the data must, in principle, be able to reject the model. Hardly any [phylogenetic] tree-reconstruction methods meet this simple requirement." The ability to reject models is of such great importance because the results of all phylogenetic analyses depend on their underlying models--to have confidence in the inferences, it is necessary to have confidence in the models. In this paper, a test statistic suggested by Cox is employed to test the adequacy of some statistical models of DNA sequence evolution used in the phylogenetic inference method introduced by Felsenstein. Monte Carlo simulations are used to assess significance levels. The resulting statistical tests provide an objective and very general assessment of all the components of a DNA substitution model; more specific versions of the test are devised to test individual components of a model. In all cases, the new analyses have the additional advantage that values of phylogenetic parameters do not have to be assumed in order to perform the tests.

725 citations


Book
01 Jan 1993
TL;DR: In this book, the authors introduce statistical testing, give examples of test procedures, and provide a classified list of tests together with the accompanying tables.
Abstract: Introduction to Statistical Testing. Examples of Test Procedures. List of Tests. Classification of Tests. The Tests. List of Tables. Tables.

685 citations


Journal ArticleDOI
TL;DR: A family of statistical models termed random regression models were used that provide a more realistic approach to analysis of longitudinal psychiatric data and indicated that both person-specific effects and serial correlation play major roles in the longitudinal psychiatric response process.
Abstract: Longitudinal studies have a prominent role in psychiatric research; however, statistical methods for analyzing these data are rarely commensurate with the effort involved in their acquisition. Frequently the majority of data are discarded and a simple end-point analysis is performed. In other cases, so-called repeated-measures analysis of variance procedures are used with little regard to their restrictive and often unrealistic assumptions and the effect of missing data on the statistical properties of their estimates. We explored the unique features of longitudinal psychiatric data from both statistical and conceptual perspectives. We used a family of statistical models termed random regression models that provide a more realistic approach to analysis of longitudinal psychiatric data. Random regression models provide solutions to commonly observed problems of missing data, serial correlation, time-varying covariates, and irregular measurement occasions, and they accommodate systematic person-specific deviations from the average time trend. Properties of these models were compared with traditional approaches at a conceptual level. The approach was then illustrated in a new analysis of the National Institute of Mental Health Treatment of Depression Collaborative Research Program dataset, which investigated two forms of psychotherapy, pharmacotherapy with clinical management, and a placebo with clinical management control. Results indicated that both person-specific effects and serial correlation play major roles in the longitudinal psychiatric response process. Ignoring either of these effects produces misleading estimates of uncertainty that form the basis of statistical tests of hypotheses.
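
A minimal sketch of a random regression (mixed-effects) model of the general kind described above, fitted to simulated longitudinal data with statsmodels: each subject gets its own intercept and slope over time, so person-specific deviations from the average trend are modelled rather than discarded. Column names and the data-generating values are illustrative.

```python
# Random-intercept, random-slope model for simulated longitudinal scores.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(11)
subjects, weeks = 40, 6
rows = []
for s in range(subjects):
    b0 = 20 + rng.normal(scale=3)          # person-specific baseline severity
    b1 = -1.5 + rng.normal(scale=0.5)      # person-specific rate of improvement
    for t in range(weeks):
        rows.append({"subject": s, "week": t,
                     "score": b0 + b1 * t + rng.normal(scale=2)})
data = pd.DataFrame(rows)

# random intercept and random slope for week, grouped by subject
model = smf.mixedlm("score ~ week", data, groups=data["subject"], re_formula="~week")
result = model.fit()
print(result.summary())
```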

Proceedings ArticleDOI
01 Jul 1993
TL;DR: It is suggested that relevance feedback be evaluated from the perspective of the user and a number of different statistical tests are described for determining if differences in performance between retrieval methods are significant.
Abstract: The standard strategies for evaluation based on precision and recall are examined and their relative advantages and disadvantages are discussed. In particular, it is suggested that relevance feedback be evaluated from the perspective of the user. A number of different statistical tests are described for determining if differences in performance between retrieval methods are significant. These tests have often been ignored in the past because most are based on an assumption of normality which is not strictly valid for the standard performance measures. However, one can test this assumption using simple diagnostic plots, and if it is a poor approximation, there are a number of non-parametric alternatives.
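
A minimal sketch of the testing strategy described above, on simulated per-query scores: compare two retrieval methods with a paired t-test, check the normality assumption on the paired differences (here a Shapiro-Wilk test stands in for the diagnostic plots), and fall back to the non-parametric Wilcoxon signed-rank test if the assumption looks poor.

```python
# Paired parametric and non-parametric comparisons of two retrieval methods.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n_queries = 30
ap_method_a = rng.beta(2, 5, size=n_queries)                       # e.g. average precision
ap_method_b = np.clip(ap_method_a + rng.normal(0.03, 0.05, n_queries), 0, 1)

diff = ap_method_b - ap_method_a
print("paired t-test:          ", stats.ttest_rel(ap_method_b, ap_method_a))
print("Shapiro-Wilk on diffs:  ", stats.shapiro(diff))    # rough normality check
print("Wilcoxon signed-rank:   ", stats.wilcoxon(ap_method_b, ap_method_a))
```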

ReportDOI
TL;DR: In this paper, the authors introduce a class of statistical tests for the hypothesis that some feature that is present in each of several variables is common to them, which are data properties such as serial correlation, trends, seasonality, heteroscedasticity, auto-regression, and excess kurtosis.
Abstract: This article introduces a class of statistical tests for the hypothesis that some feature that is present in each of several variables is common to them. Features are data properties such as serial correlation, trends, seasonality, heteroscedasticity, autoregressive conditional hetero-scedasticity, and excess kurtosis. A feature is detected by a hypothesis test taking no feature as the null, and a common feature is detected by a test that finds linear combinations of variables with no feature. Often, an exact asymptotic critical value can be obtained that is simply a test of overidentifying restrictions in an instrumental variable regression. This article tests for a common international business cycle.

Journal ArticleDOI
TL;DR: Multivariate statistics avoid the problems of inflated Type I error and undetected multivariate patterns that arise when several dependent variables are tested one at a time, but there are costs associated with these benefits, such as increased complexity, decreased power, multiple ways of answering the same question, and ambiguity in the allocation of shared variance.
Abstract: The more commonly known statistical procedures, such as the t-test, analysis of variance, or chi-squared test, can handle only one dependent variable (DV) at a time. Two types of problems can arise when there is more than one DV: 1. a greater probability of erroneously concluding that there is a significant difference between the groups when in fact there is none (a Type I error); and 2. failure to detect differences between the groups in terms of the patterns of DVs (a Type II error). Multivariate statistics are designed to overcome both of these problems. However, there are costs associated with these benefits, such as increased complexity, decreased power, multiple ways of answering the same question, and ambiguity in the allocation of shared variance. This is the first of a series of articles on multivariate statistical tests which will address these issues and explain their possible uses.
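
A minimal sketch of the first problem noted above: testing k dependent variables separately at alpha = 0.05 inflates the chance of at least one false positive toward 1 − 0.95^k, which a small simulation under the null (no group difference) confirms. The numbers of variables, subjects, and replications are illustrative.

```python
# Simulated family-wise Type I error rate for k separate univariate t-tests.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
k, n, alpha, reps = 5, 30, 0.05, 2000
false_positive = 0
for _ in range(reps):
    group1 = rng.normal(size=(n, k))      # k independent DVs, no true difference
    group2 = rng.normal(size=(n, k))
    pvals = stats.ttest_ind(group1, group2, axis=0).pvalue
    false_positive += (pvals < alpha).any()

print(f"simulated family-wise error rate: {false_positive / reps:.3f}")
print(f"theoretical 1 - (1 - alpha)^k:    {1 - (1 - alpha) ** k:.3f}")
```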

01 Jan 1993
TL;DR: Equivalency testing, a statistical method often used in biostatistics to determine the equivalence of two experimental drugs, is introduced to social scientists; examples of equivalency testing are offered, and the usefulness of the method to the social scientist is discussed.
Abstract: Equivalency testing, a statistical method often used in biostatistics to determine the equivalence of 2 experimental drugs, is introduced to social scientists. Examples of equivalency testing are offered, and the usefulness of the method to the social scientist is discussed. Although the central limit theorem was developed to allow for the estimation of confidence bounds around an observed mean (see Adams, 1974, for a fascinating presentation), its major application in empirical science has been to test whether the absolute difference between two means is greater than zero. However, there has been a growing dissatisfaction with traditional tests of the null hypothesis, in which the difference between two population means is precisely zero. Somehow, the testing of a hypothesis of "no difference" has resulted in the cognitive illusion that the investigator did not actively choose this as a plausible alternative hypothesis—that the null hypothesis was just a given of nature. However, it has been long recognized that with very large sample sizes, this null hypothesis will be rejected in almost all cases, resulting in statistically significant differences that are substantively trivial. In response to this state of affairs, and after establishing the statistical reliability of results, investigators have turned to estimates of the "amount of variance accounted for" or effect sizes (ESs) to evaluate the substantive significance of their findings. With the recent popularity of power analysis, investigators now design their studies in such a way that statistical analyses will be relevant to preselected differences (e.g., small, moderate, or large ESs) of presumed substantive import. This has resulted in a more complex interplay between hypothesis testing and statistical analyses: one in which investigators are asked to select a meaningful difference before executing a study. Dissatisfaction with the traditional null hypothesis has also emerged in an area of research in which the aim is not to establish the superiority of one treatment or method over another, but rather to establish equality between the two methods. This type of research involves the testing of treatment innovations to
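
A minimal sketch of equivalency testing via the two one-sided tests (TOST) procedure, one standard way to operationalize it: the two groups are declared equivalent only if the mean difference is shown to lie inside a pre-chosen margin (−delta, +delta). The margin, the simulated data, and the simple degrees-of-freedom choice are illustrative.

```python
# Two one-sided tests (TOST) for equivalence of two group means.
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
a = rng.normal(loc=50.0, scale=10.0, size=60)
b = rng.normal(loc=51.0, scale=10.0, size=60)
delta = 5.0                                   # equivalence margin, chosen a priori

diff = a.mean() - b.mean()
se = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
df = a.size + b.size - 2                      # simple df approximation

p_lower = stats.t.sf((diff + delta) / se, df)    # H0: diff <= -delta
p_upper = stats.t.cdf((diff - delta) / se, df)   # H0: diff >= +delta
p_tost = max(p_lower, p_upper)                   # reject both to claim equivalence
verdict = "equivalent" if p_tost < 0.05 else "not shown equivalent"
print(f"difference = {diff:.2f}, TOST p = {p_tost:.4f} ({verdict} at margin +/-{delta})")
```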

Journal ArticleDOI
TL;DR: In this paper, an unbiased hit rate is proposed to measure the joint probability that a stimulus category is correctly identified given that it is presented at all and that a response is correctly used given that the response is used at all.
Abstract: Attention is drawn to three interrelated types of error that are committed with high frequencies in the description and analysis of studies of nonverbal behavior. The errors involve the calculation of inappropriate measures of accuracy, the use in statistical analyses of inappropriate chance levels, and misapplications of χ² and binomial statistical tests. Almost all papers published between 1979 and 1991 that reported performance separately for different stimulus and response classes suffer from one or more of these errors. The potential consequences of these errors are described, and a variety of proposed measures of performance is examined. Since all measures formerly proposed have weaknesses, a new and easily calculated measure, an unbiased hit rate (Hu), is proposed. This measure is the joint probability that a stimulus category is correctly identified given that it is presented at all and that a response is correctly used given that it is used at all. Two available data sets are reanalyzed using this measure, and the differences in the conclusions reached compared to those reached with an analysis of hit rates are described.
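
A minimal sketch of the unbiased hit rate computed from a toy confusion matrix (rows = stimulus categories, columns = responses): for each category, Hu is the product of the hit rate given that the stimulus was presented and the proportion correct given that the response was used. The matrix values are made up for illustration.

```python
# Ordinary hit rate versus unbiased hit rate (Hu) from a confusion matrix.
import numpy as np

confusion = np.array([[18,  4,  2],     # stimulus A
                      [ 3, 15,  6],     # stimulus B
                      [ 1,  5, 20]])    # stimulus C

hits = np.diag(confusion).astype(float)
row_totals = confusion.sum(axis=1)       # times each stimulus was presented
col_totals = confusion.sum(axis=0)       # times each response was used

hit_rate = hits / row_totals                                   # biased by response use
unbiased_hit_rate = (hits / row_totals) * (hits / col_totals)  # Hu, joint probability
print("hit rates:          ", np.round(hit_rate, 3))
print("unbiased hit rates: ", np.round(unbiased_hit_rate, 3))
```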

Book
01 Jan 1993
TL;DR: This book analyzes the consistency and asymptotic distribution of the quasi-maximum likelihood estimator (QMLE) and the information matrix equality, and covers supporting tools such as the Radon-Nikodym theorem and central limit theorems.
Abstract: 1. Introductory remarks 2. Probability densities, likelihood functions and the quasi-maximum likelihood estimator 3. Consistency of the QMLE 4. Correctly specified models of density 5. Correctly specified models of conditional expectation 6. The asymptotic distribution of the QMLE and the information matrix equality 7. Asymptotic efficiency 8. Hypothesis testing and asymptotic covariance matrix estimation 9. Specification testing via m-tests 10. Applications of m-testing 11. Information matrix testing 12. Conclusion Appendix 1. Elementary concepts of measure theory and the Radon-Nikodym theorem Appendix 2. Uniform laws of large numbers Appendix 3. Central limit theorems.

Journal ArticleDOI
TL;DR: This method serves three purposes: it accurately locates boundaries between changed and unchanged areas, it brings to bear a regularizing effect on these boundaries in order to smooth them, and it eliminates small regions if the original data permits this.

Book
01 Jan 1993
TL;DR: This book explains the theory underlying the classical statistical methods using confidence intervals, hypothesis testing, and the theory of tests; estimation (including maximum likelihood); goodness of fit; and non-parametric and rank tests.
Abstract: Aimed at a diverse scientific audience, including physicists, astronomers, chemists, geologists, and economists, this book explains the theory underlying the classical statistical methods. Its level is between introductory "how to" texts and intimidating mathematical monographs. A reader without previous exposure to statistics will finish the book with a sound working knowledge of statistical methods, while a reader already familiar with the standard tests will come away with an understanding of their strengths, weaknesses, and domains of applicability. The mathematical level is that of an advanced undergraduate; for example, matrices and Fourier analysis are used where appropriate. Among the topics covered are common probability distributions; sampling and the distribution of sampling statistics; confidence intervals, hypothesis testing, and the theory of tests; estimation (including maximum likelihood); goodness of fit (including χ² and Kolmogorov-Smirnov tests); and non-parametric and rank tests. There are nearly one hundred problems (with answers) designed to bring out points in the text and to cover topics slightly outside the main line of development.

Journal ArticleDOI
TL;DR: In this paper, the authors introduce tests of superexogeneity and invariance, which are sensitive to particular types of parameter nonconstancy, especially with changing variances and covariances.

Book ChapterDOI
TL;DR: The Fisher and Neyman-Pearson approaches to testing statistical hypotheses are compared with respect to their attitudes to the interpretation of the outcome, to power, to conditioning, and to the use of fixed significance levels as discussed by the authors.
Abstract: The Fisher and Neyman-Pearson approaches to testing statistical hypotheses are compared with respect to their attitudes to the interpretation of the outcome, to power, to conditioning, and to the use of fixed significance levels. It is argued that despite basic philosophical differences, in their main practical aspects the two theories are complementary rather than contradictory and that a unified approach is possible that combines the best features of both. As applications, the controversies about the Behrens-Fisher problem and the comparison of two binomials (2 × 2 tables) are considered from the present point of view.

Book
01 Jan 1993
TL;DR: This two-volume handbook discusses methodological issues in psychological research, including mathematical models, measurement, hypothesis testing, power, and effect size, as well as statistical topics such as analysis of variance, multiple regression, Bayesian statistics, and the analysis of categorical data.
Abstract: Volume I: Methodological Issues. Contents: Preface. Part I: Models and Measurement. W.K. Estes, Mathematical Models in Psychology. N.A. Macmillan, Signal Detection Theory as Data Analysis Method and Psychological Decision Model. N. Cliff, What Is and Isn't Measurement. L.E. Jones, L.M. Koehly, Multidimensional Scaling. G. Shafer, Can the Various Meanings of Probability Be Reconciled? Part II: Methodological Issues. R.C. Serlin, D.K. Lapsley, Rational Appraisal of Psychological Research and the Good-Enough Principle. D. MacKay, The Theoretical Epistemology: A New Perspective on Some Long-Standing Methodological Issues in Psychology. G. Keren, Between- or Within-Subjects Design: A Methodological Dilemma. P.W. Holland, Which Comes First, Cause or Effect? N. Brenner-Golomb, R.A. Fisher's Philosophical Approach to Inductive Inference. Part III: Intuitive Statistics. G. Gigerenzer, The Superego, the Ego, and the Id in Statistical Reasoning. A. Tversky, D. Kahneman, Belief in the Law of Small Numbers. R.M. Dawes, D. Faust, P.E. Meehl, Statistical Prediction Versus Clinical Prediction: Improving What Works. M. Bar-Hillel, W.A. Wagenaar, The Perception of Randomness. P.J. Pashley, On Generating Random Sequences. Part IV: Hypothesis Testing, Power, and Effect Size. A.G. Greenwald, Consequences of Prejudice Against the Null Hypothesis. P. Pollard, How Significant Is "Significance"? M. Tatsuoka, Effect Size. D.W. Zimmerman, B.D. Zumbo, The Relative Power of Parametric and Nonparametric Statistical Methods. R. Rosenthal, Cumulating Evidence. Volume 2: Statistical Issues. Contents: Preface. Part I: Analysis of Variance and Multiple Regression. M. Tatsuoka, Elements of the General Linear Model. R. Zwick, Pairwise Comparison Procedures for One-Way Analysis of Variance Designs. C. Lewis, Analyzing Means From Repeated Measures Data. G. Keren, A Balanced Approach to Unbalanced Designs. N.M. Timm, MANOVA and MANCOVA: An Overview. J. Cohen, Set Correlation. Part II: Bayesian Statistics. R.L. Winkler, Bayesian Statistics: An Overview. C. Lewis, Bayesian Methods for the Analysis of Variance. Part III: Categorical Data and the Analysis of Frequencies. S.S. Brier, Analysis of Categorical Data. K.L. Delucchi, On the Use and Misuse of Chi-Square. B.S. Everitt, Some Aspects of the Analysis of Categorical Data. Part IV: Other Topics. A.F. Smith, D.A. Prentice, Exploratory Data Analysis. H. Wainer, D. Thissen, Graphical Data Analysis. R.M. Church, Uses of Computers in Psychological Research. G.R. Loftus, Computer Simulation: Some Remarks on Theory in Psychology. R.H. Rushe, J.M. Gottman, Essentials in the Design and Analysis of Time-Series Experiments.

Journal ArticleDOI
Mark Broadie1
TL;DR: The tradeoff between estimation error and stationarity is investigated and a method for adjusting for the bias is suggested and a statistical test is proposed to check for nonstationarity in historical data.
Abstract: The mean-variance model for portfolio selection requires estimates of many parameters. This paper investigates the effect of errors in parameter estimates on the results of mean-variance analysis. Using a small amount of historical data to estimate parameters exposes the model to estimation errors. However, using a long time horizon to estimate parameters increases the possibility of nonstationarity in the parameters. This paper investigates the tradeoff between estimation error and stationarity. A simulation study shows that the effects of estimation error can be surprisingly large. The magnitude of the errors increases with the number of securities in the analysis. Due to the error maximization property of mean-variance analysis, estimates of portfolio performance are optimistically biased predictors of actual portfolio performance. It is important for users of mean-variance analysis to recognize and correct for this phenomenon in order to develop more realistic expectations of the future performance of a portfolio. This paper suggests a method for adjusting for the bias. A statistical test is proposed to check for nonstationarity in historical data.

Journal ArticleDOI
TL;DR: In this article, the effectiveness of the all-uses and all-edges test data adequacy criteria is discussed, and a large number of test sets was randomly generated for each of nine subject programs with subtle errors, and it was determined whether the test set exposed an error.
Abstract: An experiment comparing the effectiveness of the all-uses and all-edges test data adequacy criteria is discussed. The experiment was designed to overcome some of the deficiencies of previous software testing experiments. A large number of test sets was randomly generated for each of nine subject programs with subtle errors. For each test set, the percentages of executable edges and definition-use associations covered were measured, and it was determined whether the test set exposed an error. Hypothesis testing was used to investigate whether all-uses adequate test sets are more likely to expose errors than are all-edges adequate test sets. Logistic regression analysis was used to investigate whether the probability that a test set exposes an error increases as the percentage of definition-use associations or edges covered by it increases. Error exposing ability was shown to be strongly positively correlated to percentage of covered definition-use associations in only four of the nine subjects. Error exposing ability was also shown to be positively correlated to the percentage of covered edges in four different subjects, but the relationship was weaker.

Journal ArticleDOI
TL;DR: Two programs, CHIRXC and CHIHW, which estimate the significance of χ² statistics using pseudo-probability tests, permit analysis of sparse table data without pooling rare categories and prevent loss of information.
Abstract: The chi-square statistic is computed as χ² = Σ (o_i − e_i)² / e_i, where o_i and e_i are observed and expected (under the null hypothesis) numbers of the ith category (see any textbook on biometry or population genetics, e.g., Ayala and Kiger 1984; Sokal and Rohlf 1981). An unfortunate requirement of the test is that the expected numbers (e_i) should not be small. Different authors give different recommendations; common opinion is that e_i should not be less than 4 (see cited books). Introduction into population genetics of molecular techniques revealed a great wealth of genetic variation—allozymic, DNA restriction fragment length polymorphisms, etc.—in both plants and animals. Hence, samples of practical size (say, hundreds of individuals) will very often contain rare phenotypic or allelic categories. To obtain reliable χ² estimates one should then pool these rare categories, otherwise the calculated χ² will be inflated. Unfortunately, this causes loss of information, which is undesirable. An alternative is to use Fisher's exact probability test, but in the case of many categories and considerable total sample size, this is impractical. Roff and Bentzen (1989) suggested another practical alternative (without any pooling of data) for testing heterogeneity in R x C contingency tables containing rare categories. They used a Monte Carlo procedure, which was termed by Hernandez and Weir (1989) a pseudo-probability test, to test Hardy-Weinberg equilibrium. The procedure consists of (1) generating a sample of all possible data sets having the same marginal totals as the original data set and (2) computing χ² for each derived data set and counting all the sets for which χ² is larger than that of the original sample. The ratio of the obtained number to the overall number of generated data sets is the estimate of probability of the null hypothesis. We present here two programs, CHIRXC and CHIHW, which estimate the significance of χ² statistics using pseudo-probability tests. Thus, our programs permit analysis of sparse table data without pooling rare categories. This saves time and prevents loss of information. The CHIRXC analyzes R x C contingency tables. For the 2 x 2 case, it can perform Fisher's exact probability test as well. The CHIHW estimates conformity of genotypic proportions in a sample to Hardy-Weinberg expectations. It also computes an index of heterozygote deficiency or excess [D = (Ho − He)/He] (Koehn et al. 1973) and estimates its significance through the pseudo-probability test. The programs are written in C (Turbo C, ver. 2, Copyright Borland International 1988). They will run on an IBM PC and compatibles. Sample sizes and dimensionality of R x C tables under analysis are limited only by the available computer memory. The approach of Roff and Bentzen (1989) was used in the MONTE program in REAP (McElroy et al. 1991). Our algorithm of randomization is different from theirs and much faster (Pudovkin AI and Zaykin DV, unpublished). The programs are available from the authors. To receive a copy, send a nonformatted 5.25-in diskette, and we will supply the disk with the programs (executables and listings), README files with user instructions, and input file formats.
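
A minimal sketch of the pseudo-probability idea described above, written as a generic reimplementation rather than the CHIRXC program itself: compute χ² for the observed sparse table, then repeatedly shuffle one classification while keeping both marginal totals fixed and count how often the randomized χ² reaches the observed value. The table and the number of randomizations are illustrative.

```python
# Monte Carlo (pseudo-probability) test for a sparse R x C contingency table.
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[12, 1, 0, 3],
                     [ 9, 0, 2, 1],
                     [14, 2, 1, 0]])      # sparse table with rare categories

chi_obs = chi2_contingency(observed, correction=False)[0]

# expand the table back into individual (row, column) observations
rows = np.repeat(np.arange(observed.shape[0]), observed.sum(axis=1))
cols = np.concatenate([np.repeat(np.arange(observed.shape[1]), observed[r])
                       for r in range(observed.shape[0])])

rng = np.random.default_rng(0)
n_rand, exceed = 2000, 0
for _ in range(n_rand):
    shuffled = rng.permutation(cols)              # keeps both margins fixed
    table = np.zeros_like(observed)
    np.add.at(table, (rows, shuffled), 1)
    exceed += chi2_contingency(table, correction=False)[0] >= chi_obs

print(f"observed chi-square = {chi_obs:.2f}, Monte Carlo p ~ {exceed / n_rand:.4f}")
```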

Journal ArticleDOI
TL;DR: An analysis using another method promoted by Fisher, mathematical likelihood, shows that the p value substantially overstates the evidence against the null hypothesis.
Abstract: It is not generally appreciated that the p value, as conceived by R. A. Fisher, is not compatible with the Neyman-Pearson hypothesis test in which it has become embedded. The p value was meant to be a flexible inferential measure, whereas the hypothesis test was a rule for behavior, not inference. The combination of the two methods has led to a reinterpretation of the p value simultaneously as an "observed error rate" and as a measure of evidence. Both of these interpretations are problematic, and their combination has obscured the important differences between Neyman and Fisher on the nature of the scientific method and inhibited our understanding of the philosophic implications of the basic methods in use today. An analysis using another method promoted by Fisher, mathematical likelihood, shows that the p value substantially overstates the evidence against the null hypothesis. Likelihood makes clearer the distinction between error rates and inferential evidence and is a quantitative tool for expressing evidential strength that is more appropriate for the purposes of epidemiology than the p value.
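
A minimal sketch of the kind of comparison the article draws, for a Gaussian test statistic z: the two-sided p value is set against the minimum likelihood ratio exp(−z²/2), the likelihood of the null relative to the best-supported alternative, which is a far less extreme measure of the evidence against the null. The z values are illustrative.

```python
# Two-sided p value versus the minimum likelihood ratio exp(-z^2/2).
import numpy as np
from scipy.stats import norm

for z in (1.96, 2.58, 3.29):
    p_two_sided = 2 * norm.sf(z)
    min_likelihood_ratio = np.exp(-z ** 2 / 2)     # L(H0) / max L(H1)
    print(f"z = {z:4.2f}: p = {p_two_sided:.4f}, "
          f"minimum likelihood ratio = {min_likelihood_ratio:.4f}")
```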

Journal ArticleDOI
TL;DR: In this article, the authors present a new paradigm using experimental mathematics (often referred to as Monte Carlo simulation) to examine the claims made in the levels of measurement controversy, and link the approach advocated in the paper closely to representational measurement theory.
Abstract: The notion that nonparametric methods are required as a replacement of parametric statistical methods when the scale of measurement in a research study does not achieve a certain level was discussed in light of recent developments in representational measurement theory. A new approach to examining the problem via computer simulation was introduced. Some of the beliefs that have been widely held by psychologists for several decades were examined by means of a computer simulation study that mimicked measurement of an underlying empirical structure and performed two-sample Student t-tests on the resulting sample data. It was concluded that there is no need to replace parametric statistical tests by nonparametric methods when the scale of measurement is ordinal and not interval.

Stevens' (1946) classic paper on the theory of scales of measurement triggered one of the longest standing debates in behavioural science methodology. The debate -- referred to as the levels of measurement controversy, or measurement-statistics debate -- is over the use of parametric and nonparametric statistics and its relation to levels of measurement. Stevens (1946; 1951; 1959; 1968), Siegel (1956), and most recently Siegel and Castellan (1988) and Conover (1980) argue that parametric statistics should be restricted to data of interval scale or higher. Furthermore, nonparametric statistics should be used on data of ordinal scale. Of course, since each scale of measurement has all of the properties of the weaker measurement, statistical methods requiring only a weaker scale may be used with the stronger scales. A detailed historical review linking Stevens' work on scales of measurement to the acceptance of psychology as a science, and a pedagogical presentation of fundamental axiomatic (i.e., representational) measurement can be found in Zumbo and Zimmerman (1991).

Many modes of argumentation can be seen in the debate about levels of measurement and statistics. This paper focusses almost exclusively on an empirical form of rhetoric using experimental mathematics (Ripley, 1987). The term experimental mathematics comes from mathematical physics. It is loosely defined as the mimicking of the rules of a model of some kind via random processes. In the methodological literature this is often referred to as monte carlo simulation. However, for the purpose of this paper, the terms experimental mathematics or computer simulation are preferred to monte carlo because the latter is typically referred to when examining the robustness of a test in relation to particular statistical assumptions. Measurement level is not an assumption of the parametric statistical model (see Zumbo & Zimmerman, 1991 for a discussion of this issue) and to call the method used herein "monte carlo" would imply otherwise. The term experimental mathematics emphasizes the modelling aspect of the present approach to the debate.

The purpose of this paper is to present a new paradigm using experimental mathematics to examine the claims made in the levels of measurement controversy. As Michell (1986) demonstrated, the concern over levels of measurement is inextricably tied to the differing notions of measurement and scaling. Michell further argued that fundamental axiomatic measurement or representational theory (see, for example, Narens & Luce, 1986) is the only measurement theory which implies a relation between measurement scales and statistics. Therefore, the approach advocated in this paper is linked closely to representational theory.

The novelty of this approach, to the authors' knowledge, is in the use of experimental mathematics to mimic representational measurement. Before describing the methodology used in this paper, we will briefly review its motivation. Admissible Transformations: Representational theory began in the late 1950s with Scott and Suppes (1958) and later with Suppes and Zinnes (1963), Pfanzagl (1968), and Krantz, Luce, Suppes & Tversky (1971). …
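
A minimal sketch of the kind of experimental-mathematics study described above: draw two samples from identical populations, apply a strictly monotone (order-preserving) transformation to mimic ordinal measurement, and compare the Type I error rate of the two-sample t-test on the raw and transformed scores. The transformation and sample sizes are illustrative choices, not those used in the paper.

```python
# Type I error of the t-test on interval scores versus monotonically distorted scores.
import numpy as np
from scipy import stats

rng = np.random.default_rng(13)
reps, n, alpha = 5000, 25, 0.05
rej_raw = rej_ordinal = 0
for _ in range(reps):
    a, b = rng.normal(size=n), rng.normal(size=n)          # identical populations
    rej_raw += stats.ttest_ind(a, b).pvalue < alpha
    # strictly monotone distortion of the underlying interval scale
    rej_ordinal += stats.ttest_ind(np.exp(a), np.exp(b)).pvalue < alpha

print(f"Type I error, interval scores:                {rej_raw / reps:.3f}")
print(f"Type I error, transformed (ordinal-like):     {rej_ordinal / reps:.3f}")
```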

Journal ArticleDOI
TL;DR: This paper shows the general superiority of the "extended" nonconvergent methods compared to classical penalty term methods, simple stopped training, and methods which only vary the number of hidden units.

Book
01 Feb 1993
TL;DR: In this article, the authors introduce basic statistical concepts in geography, covering geographic data and measurement, descriptive and spatial statistics, probability and sampling, inferential hypothesis testing, and correlation and regression.
Abstract: Part 1 Basic Statistical Concepts in Geography: Introduction: The Context of Statistical Techniques The Role of Statistical Methods The Scientific Research Process/Statistical Concepts and Themes Geographic Data: Characteristics and Preparation Selected Dimensions of Geographic Data Levels of Measurement Measurement Concepts Basic Classification Methods Graphic Procedures. Part 2 Descriptive Problem Solving in Geography: Descriptive Statistics Measures of Central Tendency Measures of Dispersion and Variability Measures of Shape or Relative Position Spatial Data and Descriptive Statistics Descriptive Spatial Statistics Spatial Measures of Central Tendency Spatial Measures of Dispersion Locational Issues and Descriptive Spatial Statistics. Part 3 The Transition to Inferential Problem Solving: Probability/Deterministic and Probabilistic Processes The Concept of Probability The Binomial Distribution The Poisson Distribution The Normal Distribution Probability Mapping Basic Elements of Sampling Concepts in Sampling Types of Probability Sampling Spatial Sampling Estimation in Sampling Basic Concepts in Estimation Confidence Intervals and Estimation Sample Size Selection. Part 4 Inferential Problem Solving in Geography: Elements of Inferential Statistics Classical Traditional Hypothesis Testing P-Value or Prob-Value Hypothesis Testing Related One Sample Difference Tests Issues in Statistical Test Selection Multiple Sample Difference Tests Two Sample Difference Tests Matched Pairs (Dependent Sample) Difference Tests Three or More Sample Difference Tests Goodness-of-Fit and Categorical Difference Tests Goodness-of-Fit Tests Contingency Analysis Difference Test for Ordinal Categories Inferential Spatial Statistics Point Pattern Analysis Area Pattern Analysis. Part 5 Statistical Relationships Between Variables: Correlation The Nature of Correlation Association of Interval Ratio Variables Association of Ordinal Variables Association of Nominal Variables Use of Correlation Indices in Map Comparison Issues Regarding Correlation Regression Bivariate Regression Multivariate Regression Appendix

Journal ArticleDOI
TL;DR: In this article, a decentralized sequential detection problem is considered in which each one of a set of sensors receives a sequence of observations about the hypothesis, and each sensor sends a series of summary messages to the fusion center where a sequential test is carried out to determine the true hypothesis.
Abstract: A decentralized sequential detection problem is considered in which each one of a set of sensors receives a sequence of observations about the hypothesis. Each sensor sends a sequence of summary messages to the fusion center where a sequential test is carried out to determine the true hypothesis. A Bayesian framework for this problem is introduced, and for the case when the information structure in the system is quasi-classical, it is shown that the problem is tractable. A detailed analysis of this case is presented, along with some numerical results.
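
A minimal sketch of the overall structure described above, not of the paper's Bayesian solution: each sensor observation is quantized into a one-bit summary message, and the fusion center runs a sequential probability ratio test on the message stream until a threshold is crossed. The Gaussian means, the quantizer, and the error targets are illustrative assumptions.

```python
# One-bit sensor messages feeding a sequential probability ratio test (SPRT).
import numpy as np

rng = np.random.default_rng(4)
true_hypothesis = 1                 # H1: observations have mean 1 (H0: mean 0)
p0, p1 = 0.3085, 0.6915             # P(message = 1 | H0), P(message = 1 | H1) for threshold 0.5
alpha = beta = 0.01                 # target error probabilities
upper, lower = np.log((1 - beta) / alpha), np.log(beta / (1 - alpha))

llr, t = 0.0, 0
while lower < llr < upper:
    t += 1
    obs = rng.normal(loc=true_hypothesis, scale=1.0)   # one sensor observation
    message = int(obs > 0.5)                           # 1-bit summary sent to the fusion center
    llr += np.log(p1 / p0) if message else np.log((1 - p1) / (1 - p0))

decision = "accept H1" if llr >= upper else "accept H0"
print(f"stopped after {t} messages: {decision}")
```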