
Showing papers on "Statistical hypothesis testing published in 1972"


Journal ArticleDOI
TL;DR: In this paper, the authors report the level of power for recent statistical tests reported in the AERJ and propose alternative reporting schemes for hypothesis testing that include power and effect size as well as the traditional α.
Abstract: It is almost universally accepted by educational researchers that the power (the probability of rejecting H₀ when H₀ is false, that is, 1 − β) of a statistical test is important and should be substantial. What is not universally accepted or known is that power can and should be calculated and reported for every standard statistical test. The power of statistical tests already conducted in educational research is equally unknown. It is the purpose of this paper to report the level of power for recent statistical tests reported in the AERJ and to propose alternative reporting schemes for hypothesis testing that include power and effect size as well as the traditional α. Cohen (1962, 1969), Tversky and Kahneman (1971), Overall (1969) and others argue quite strongly that explicit computation of power relative to reasonable hypotheses should be made before any study is completed and subsequently reported. Tversky and Kahneman (1971) suggest three reasons why this computation is important: (1) such computations can lead the researcher to the conclusion that there is no point in running the study unless the sample size is materially increased; (2) the computation is essential to the interpretation of negative results, that is, failures to reject the null hypothesis; and (3) computed power gives the researcher an indication of the probability of a valid rejection of the null hypothesis.
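The kind of a priori power calculation recommended here is straightforward to carry out in software. The following sketch computes power for a two-sample t test at several effect sizes, and the sample size needed for a target power; it assumes the Python statsmodels package, and the effect sizes and sample sizes are hypothetical rather than drawn from the AERJ survey.

```python
# Minimal sketch of an a priori power calculation for a two-sample t test.
# Effect sizes and sample sizes here are hypothetical, not taken from the paper.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

for d in (0.2, 0.5, 0.8):          # Cohen's small, medium, large effects
    power = analysis.power(effect_size=d, nobs1=30, alpha=0.05,
                           ratio=1.0, alternative="two-sided")
    print(f"effect size {d:.1f}, n = 30 per group: power = {power:.2f}")

# Sample size needed per group to reach power 0.80 for a medium effect
n_needed = analysis.solve_power(effect_size=0.5, power=0.80, alpha=0.05)
print(f"n per group for power 0.80 at d = 0.5: {n_needed:.0f}")
```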

108 citations


Journal ArticleDOI
TL;DR: In this paper, the optimality criteria formulated in terms of the power functions of individual tests are given for problems where several hypotheses are tested simultaneously, subject to the constraint that the expected number of false rejections is less than a given constant gamma when all null hypotheses are true.
Abstract: Optimality criteria formulated in terms of the power functions of the individual tests are given for problems where several hypotheses are tested simultaneously. Subject to the constraint that the expected number of false rejections is less than a given constant gamma when all null hypotheses are true, tests are found which maximize the minimum average power and the minimum power of the individual tests over certain alternatives. In the common situations in the analysis of variance this leads to application of multiple t-tests. Recommendations for choosing the value of gamma are given by relating gamma to the probability of no false rejections if all hypotheses are true. Based upon the optimality of the tests, a similar optimality property of joint confidence sets is also derived.
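A minimal sketch of the per-comparison rule this criterion leads to in practice: if each of m individual t tests is carried out at level gamma/m, the expected number of false rejections is gamma when all null hypotheses are true. The data, m, and gamma below are hypothetical, and the sketch is not the paper's optimality derivation.

```python
# Sketch: m simultaneous t tests, each at level gamma/m, so that the expected
# number of false rejections is gamma under the complete null hypothesis.
# Data, m, and gamma are hypothetical.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
gamma = 1.0                    # tolerated expected number of false rejections
m = 10                         # number of hypotheses tested simultaneously
alpha_per_test = gamma / m     # level of each individual t test

# all null hypotheses true here: both groups share the same distribution
control = rng.normal(0.0, 1.0, size=(m, 20))
treatment = rng.normal(0.0, 1.0, size=(m, 20))

p_values = [stats.ttest_ind(treatment[j], control[j]).pvalue for j in range(m)]
rejections = sum(p < alpha_per_test for p in p_values)
print(f"per-test level = {alpha_per_test:.3f}, false rejections = {rejections}")
```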

96 citations




Journal ArticleDOI
TL;DR: In this paper, special cases of the factor analysis model are developed for four selection situations and methods are suggested whereby parameters in each case can be estimated using a maximum likelihood procedure recently developed by Joreskog.
Abstract: Special cases of the factor analysis model are developed for four selection situations. Methods are suggested whereby parameters in each case can be estimated using a maximum likelihood procedure recently developed by Joreskog. Also, a numerical illustration is presented for each case.

22 citations



Journal ArticleDOI
TL;DR: In this article, the problem of choosing between two simple hypotheses, H₀ and H₁, in terms of independent, identically distributed random variables, when observations can be taken in groups, is investigated.
Abstract: The paper investigates the problem of choosing between two simple hypotheses, H₀ and H₁, in terms of independent, identically distributed random variables, when observations can be taken in groups. At any stage in the decision procedure it must be decided whether to stop and take terminal action now or to continue, in which case the size of the next group of observations must be decided upon. The problem is to find an optimal procedure incorporating a stopping rule, a group-size (batch) rule, and a terminal action rule. It is shown that the optimal stopping rule is of the sequential probability ratio type. The special, but important, situation where the log likelihood can assume only a finite number of integral multiples of a constant is investigated. It is shown that optimum procedures can be obtained by proper formulation of the problem in terms of Markov sequential decision schemes and solved by linear programming. Finally, a policy-improvement type of routine is presented for the case in which the stopping rule is specified.
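Since the optimal stopping rule is of the sequential probability ratio type, the following sketch shows a plain one-observation-at-a-time SPRT for two simple hypotheses about a normal mean. The hypotheses, error rates, and simulated data are hypothetical, and the paper's optimal group-size (batch) rule and linear-programming solution are not reproduced.

```python
# Minimal sketch of Wald's sequential probability ratio test (SPRT) for
# H0: mu = 0 versus H1: mu = 1 with known sigma = 1. Hypothetical example;
# the paper's optimal group-size (batch) rule is not implemented here.
import numpy as np

def sprt(stream, mu0=0.0, mu1=1.0, sigma=1.0, alpha=0.05, beta=0.05):
    lower = np.log(beta / (1 - alpha))      # accept-H0 boundary
    upper = np.log((1 - beta) / alpha)      # accept-H1 boundary
    llr = 0.0
    n = 0
    for n, x in enumerate(stream, start=1):
        # log-likelihood-ratio increment for one normal observation
        llr += (x * (mu1 - mu0) - 0.5 * (mu1**2 - mu0**2)) / sigma**2
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "no decision", n

rng = np.random.default_rng(1)
decision, n_used = sprt(rng.normal(1.0, 1.0, size=1000))
print(decision, "after", n_used, "observations")
```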

17 citations


Journal ArticleDOI
TL;DR: In this article, the significance of observed differences in well yields with respect to variation in controlling hydrogeologic factors was analyzed using the Kruskal-Wallis One-Way Analysis of Variance and the Mann-Whitney U Test.
Abstract: Appropriate nonparametric or distribution-free statistical techniques are useful tools when data do not satisfy the conditions required by parametric statistical tests, and may be applied to a variety of hydrogeological problems. Two nonparametric tests, the Kruskal-Wallis One-Way Analysis of Variance and the Mann-Whitney U Test, were used to test the significance of observed differences in well yields with respect to variation in controlling hydrogeologic factors. This paper presents the steps involved in performing these two tests, with one example for each, and suggests other applications to water-related problems. To avoid computational errors and save time, a computer program was written for calculating the statistics used in these tests.
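Both tests can be run directly with scipy; the sketch below applies them to hypothetical well-yield data grouped by rock type, which stand in for the paper's examples.

```python
# Minimal sketch of the two nonparametric tests named in the paper, applied to
# hypothetical well-yield data (litres per minute) grouped by rock type; the
# numbers are invented for illustration, not taken from the study.
from scipy import stats

yields_sandstone = [12.0, 18.5, 9.3, 22.1, 15.4, 11.8]
yields_limestone = [30.2, 25.7, 41.0, 28.9, 35.6]
yields_shale     = [4.1, 6.8, 5.5, 3.9, 7.2, 5.0]

# Kruskal-Wallis one-way analysis of variance by ranks: do the k groups
# come from populations with the same location?
H, p_kw = stats.kruskal(yields_sandstone, yields_limestone, yields_shale)
print(f"Kruskal-Wallis H = {H:.2f}, p = {p_kw:.4f}")

# Mann-Whitney U test for a pairwise comparison of two of the groups
U, p_mw = stats.mannwhitneyu(yields_sandstone, yields_limestone,
                             alternative="two-sided")
print(f"Mann-Whitney U = {U:.1f}, p = {p_mw:.4f}")
```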

12 citations



Journal ArticleDOI
TL;DR: Within this consideration, the formal question arises of how to decide whether an observed G index is significantly different from its expectation of zero under the null hypothesis.
Abstract: The G index is the difference between the frequencies of the homonymously assigned cells (++ and −−) and the heteronymously assigned cells (+− and −+) in a four-fold contingency table. This name seems more descriptive than the name "index of agreement," which restricts the index to cases of agreement between placements. Whether such an index is more adequate than the phi or the tetrachoric correlation coefficient for describing the relationship between two characteristics in N individuals, or between two persons on n items (cf. Cliff, 1962; Holley, 1964), is a matter of methodological consideration. Within this consideration, the formal question arises of how to decide whether an observed G index is significantly different from its expectation of zero under the null hypothesis.
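One simple way to address that question, under the assumption that each of the N placements independently falls in an agreement cell with probability one half when the null hypothesis holds, is an exact binomial test on the number of agreements; the sketch below uses hypothetical cell counts and is not necessarily the test derived in the paper.

```python
# Sketch: significance test for the G index of a fourfold table.
# G = ((a + d) - (b + c)) / N, where a, d are the homonymous (++, --) cells
# and b, c the heteronymous (+-, -+) cells. Under H0 each placement is
# assumed to land in an agreement cell with probability 0.5, so a + d is
# Binomial(N, 0.5). Counts are hypothetical.
from scipy import stats

a, b, c, d = 18, 5, 4, 13          # ++, +-, -+, -- cell frequencies
N = a + b + c + d
G = ((a + d) - (b + c)) / N

result = stats.binomtest(a + d, n=N, p=0.5, alternative="two-sided")
print(f"G = {G:.2f}, two-sided p = {result.pvalue:.4f}")
```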

11 citations


Journal ArticleDOI
TL;DR: In this article, a simple confidence interval and hypothesis tests for the ratio of the proportions in two categories in a sampled population are presented, illustrated by examples on a consumer preference study, election poll results, and the life of coins.
Abstract: A simple confidence interval and hypothesis tests for the ratio of the proportions in two categories in a sampled population are presented. They are illustrated by examples on a consumer preference study, election poll results, and the life of coins.
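A minimal sketch of one simple construction for such an interval: conditional on the total count in the two categories, the count in the first category is binomial with success probability p1/(p1 + p2), and a confidence interval for that probability converts monotonically into one for the ratio p1/p2. Whether this matches the paper's construction is an assumption, and the counts below are hypothetical.

```python
# Sketch of a simple confidence interval for the ratio p1/p2 of the proportions
# in two categories of one sampled population. Conditional on m = n1 + n2,
# n1 is Binomial(m, pi) with pi = p1 / (p1 + p2), and p1/p2 = pi / (1 - pi).
# Whether this is the paper's construction is an assumption; counts are hypothetical.
from statsmodels.stats.proportion import proportion_confint

n1, n2 = 120, 80                  # counts in the two categories of interest
m = n1 + n2

lo, hi = proportion_confint(n1, m, alpha=0.05, method="wilson")
ratio_hat = n1 / n2
ci = (lo / (1 - lo), hi / (1 - hi))
print(f"estimated ratio p1/p2 = {ratio_hat:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```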

Journal ArticleDOI
TL;DR: The hypothesis-testing problem has recently been studied under a finite-memory constraint, but most work has been concerned with the large-sample theory, so here the small-sample theory for binary-valued observations is studied.
Abstract: The hypothesis-testing problem has recently been studied under a finite-memory constraint. However, most work has been concerned with the large-sample theory. Here we study the small-sample theory for binary-valued observations.

Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of applying the transformed likelihood ratio statistic and the contingency statistic to testing the assumptions of a particular order and of stationarity when the assumptions are not true.
Abstract: When applying the Markov model, it is often assumed that the transition matrix is stationary and that the chain is of order one. This article considers the problem of applying the transformed likelihood ratio statistic and the contingency statistic to testing the assumptions of a particular order and of stationarity when the assumptions are not true. A Monte Carlo study of the small-sample power of both test statistics shows that if the true alpha level is kept equal for the two tests, they have similar power. However, if the alpha level is set by reference to any standard table of the chi-square distribution, the likelihood ratio statistic has larger power.
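A minimal sketch of the two statistics in the simplest setting, testing first-order dependence against order zero (independence of successive states) from the table of one-step transition counts; the two-state chain is simulated purely for illustration, and the paper's stationarity test and power study are not reproduced.

```python
# Sketch: likelihood ratio (G^2) and Pearson chi-square statistics for testing
# first-order Markov dependence against order zero, computed from the table
# of one-step transition counts. The simulated chain is purely illustrative.
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(2)
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])                 # true first-order transition matrix

# simulate a two-state chain of length 500
states = [0]
for _ in range(499):
    states.append(rng.choice(2, p=P[states[-1]]))

# table of transition counts n_ij = #{t : X_t = i, X_{t+1} = j}
counts = np.zeros((2, 2), dtype=int)
for i, j in zip(states[:-1], states[1:]):
    counts[i, j] += 1

g2, p_lr, df, _ = chi2_contingency(counts, correction=False,
                                   lambda_="log-likelihood")
x2, p_pearson, _, _ = chi2_contingency(counts, correction=False)
print(f"G^2 = {g2:.2f} (p = {p_lr:.4f}), X^2 = {x2:.2f} (p = {p_pearson:.4f}), df = {df}")
```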


Journal ArticleDOI
TL;DR: In this paper, exact statistical tests are developed to test hypotheses that specific effects or sets of effects are zero, yielding procedures for exploring relationships among qualitative variables which are suitable for small samples.
Abstract: The log-linear model for contingency tables expresses the logarithm of a cell frequency as an additive function of main effects, interactions, etc., in a way formally identical with an analysis of variance model. Exact statistical tests are developed to test hypotheses that specific effects or sets of effects are zero, yielding procedures for exploring relationships among qualitative variables which are suitable for small samples. The tests are analogous to Fisher's exact test for a 2 × 2 contingency table. Given a hypothesis, the exact probability of the obtained table is determined, conditional on fixed marginals or other functions of the cell frequencies. The sum of the probabilities of the obtained table and of all less probable ones is the exact probability to be considered in testing the null hypothesis. Procedures for obtaining exact probabilities are explained in detail, with examples given.
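For the 2 × 2 base case mentioned in the abstract, the exact procedure reduces to Fisher's exact test, which the sketch below runs on a hypothetical table; the paper's general algorithm for larger and higher-dimensional tables is not reproduced.

```python
# Sketch: Fisher's exact test for a 2 x 2 contingency table, the special case
# to which the paper's general exact procedure reduces. Counts are hypothetical.
from scipy.stats import fisher_exact

table = [[8, 2],
         [1, 5]]
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(f"odds ratio = {odds_ratio:.2f}, exact two-sided p = {p_value:.4f}")
```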

01 Jan 1972
TL;DR: In this article, the exact probability of a table is determined, given a hypothesis, conditional on fixed marginals or other functions of the cell frequencies, and the sum of the probabilities of the obtained table and of all less probable ones is the exact probability to be considered in testing the null hypothesis.
Abstract: The log-linear model for contingency tables expresses the logarithm of a cell frequency as an additive function of main effects, interactions, etc., in a way formally identical with an analysis of variance model. Exact statistical tests are developed to test hypotheses that specific effects or sets of effects are zero, yielding procedures for exploring relationships among qualitative variables which are suitable for small samples. The tests are analogous to Fisher's exact test for a 2 × 2 contingency table. Given a hypothesis, the exact probability of the obtained table is determined, conditional on fixed marginals or other functions of the cell frequencies. The sum of the probabilities of the obtained table and of all less probable ones is the exact probability to be considered in testing the null hypothesis. Procedures for obtaining exact probabilities are explained in detail, with examples given. A multidimensional contingency table results when each observation in a sample is classified according to its value on two or more categorical variables. When there are two variables, the definition of independence, or lack of association, between them is relatively straightforward. The familiar chi-square test can be used as an approximate test of the hypothesis of independence. Fisher's exact test is an exact procedure for a 2 × 2 table; the analogous exact procedure for larger tables is a special case of the general method presented in this paper. When there are more than two variables, the definitions of relationships of varying degrees of complexity are not obvious, and different approaches have been taken. In a previous paper (Shaffer, 1972), it was argued that the log-linear model provides a framework suitable for a general-purpose analysis of the relationships in contingency tables. The properties of the model and reasons for its choice were discussed in detail, and it was pointed out that it had been advocated by many statisticians from a variety of perspectives.


Journal ArticleDOI
TL;DR: If ties exist among the k sample values, and the researcher has access to a program like that described in this paper, then the tests presented by Ferguson (1965, 1971) are a viable alternative and offer many statistical advantages.
Abstract: Many research situations involve an a priori ordering among the k treatment groups. For such situations, trend analysis is often an appropriate statistical technique. Comprehensive discussions of the use of trend analysis with parametric data are available in statistics textbooks commonly used by behavioral scientists (e.g., Ferguson, 1971; Hays, 1963; Kirk, 1968; Winer, 1962). Analogous procedures for ranked data have been presented by Ferguson (1965, 1971), Jonckheere (1954a, 1954b), Jonckheere and Bower (1967), May and Konkin (1970), and Page (1963). Those described by May and Konkin (1970), for testing ordered hypotheses for k independent samples, and Page (1963), for testing ordered hypotheses for k correlated samples, are accompanied by extensive tables and are computationally facile. However, if ties exist among the k sample values, and the researcher has access to a program like that described in this paper, then the tests presented by Ferguson (1965, 1971) are a viable alternative. The tests described by Ferguson employ the statistic S, which is related to Kendall's tau and offers many statistical advantages. For example, as n increases, the sampling distribution of S rapidly approaches the normal distribution. More precisely, for n > 10, the normal approximations and exact values based on the distribution of
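A minimal sketch of the quantity underlying Ferguson's tests: S is the count of concordant minus discordant pairs between the ordered treatment labels and the observed scores, and it forms the numerator of Kendall's tau-b. The data below are hypothetical, and the cited tests' tied-data adjustments and tables are not reproduced.

```python
# Sketch: monotonic trend across k ordered treatment groups via Kendall's tau
# between treatment order and score. The S statistic of Ferguson's tests is
# the concordant-minus-discordant pair count that forms the numerator of
# tau-b. Scores are hypothetical and include ties across groups.
from scipy.stats import kendalltau

treatment = [1, 1, 1, 2, 2, 2, 3, 3, 3]      # a priori ordered groups
scores    = [4, 6, 5, 7, 9, 6, 10, 12, 11]

tau, p_value = kendalltau(treatment, scores)
print(f"Kendall tau-b = {tau:.2f}, p = {p_value:.4f}")
```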

Journal ArticleDOI
TL;DR: In this article, a new statistical test based on the number and length of runs up and down was proposed to identify intermediate structure, with 206Pb and 115In (n, γ) as examples.
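A minimal sketch of the classical runs-up-and-down randomness test that statistics of this kind build on, using the large-sample normal approximation for the total number of runs; the combined number-and-length statistic proposed in the article, and the resonance data, are not reproduced.

```python
# Sketch: classical test of randomness based on the number of runs up and down
# in a sequence. For a random sequence of length n (no ties), the number of
# runs R has mean (2n - 1)/3 and variance (16n - 29)/90, and (R - E[R])/sd(R)
# is approximately standard normal. Data are simulated for illustration.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
x = rng.normal(size=200)                    # hypothetical sequence

signs = np.sign(np.diff(x))                 # +1 for a rise, -1 for a fall
runs = 1 + np.count_nonzero(signs[1:] != signs[:-1])

n = len(x)
mean_runs = (2 * n - 1) / 3
var_runs = (16 * n - 29) / 90
z = (runs - mean_runs) / np.sqrt(var_runs)
p_value = 2 * norm.sf(abs(z))
print(f"runs = {runs}, z = {z:.2f}, p = {p_value:.4f}")
```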


01 Mar 1972
TL;DR: The results of this study lead to the conclusion that the t test is not as robust as generally thought and that researchers should consider all of the basic assumptions before applying this test to their data.
Abstract: The purpose of this study was to empirically determine the effects of quantified violations of the underlying assumptions of parametric statistical tests commonly used in educational research, namely the correlation coefficient (r) and the t test. The effects of heterogeneity of variance, nonnormality, and nonlinear transformations of scales were studied separately and in all combinations. Monte Carlo procedures were followed to generate random digits which had the following shapes: normal, positively skewed, negatively skewed, and leptokurtic. Interval, ordinal, and percentile rank transformations were used for all of the computations, which were based on 5,000 sets of randomly generated numbers, each set containing either 5, 15, or 30 such numbers. A total of 1,332 combinations of differences in shape of distribution, variance, size of sample, and type of scale were studied. The results indicate that the distributions of r do not deviate significantly from the theoretical distributions even under the most severe combinations of violations. However, there were many significant discrepancies for the t test. The results of this study lead to the conclusion that the t test is not as robust as generally thought and that researchers should consider all of the basic assumptions before applying this test to their data.
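A minimal sketch of the kind of Monte Carlo check the study performed: estimating the empirical Type I error rate of the two-sample t test when its assumptions are violated. The distribution shapes, sample sizes, and replication count below are illustrative and do not reproduce the study's 1,332 conditions.

```python
# Sketch: empirical Type I error rate of the two-sample t test under a
# violation of its assumptions (skewed populations with unequal variances).
# Shapes, sample sizes, and number of replications are illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n_reps, n1, n2, alpha = 5000, 15, 15, 0.05
rejections = 0

for _ in range(n_reps):
    # both groups have mean 2.0 (H0 true), but skewed and with unequal spread
    g1 = rng.exponential(scale=2.0, size=n1)
    g2 = 2.0 + rng.normal(scale=4.0, size=n2)
    _, p = stats.ttest_ind(g1, g2)
    if p < alpha:
        rejections += 1

print(f"empirical Type I error rate: {rejections / n_reps:.3f} (nominal {alpha})")
```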




01 Dec 1972
TL;DR: This report provides an elementary outline, based largely on the monograph by Guest, of the theory underlying the least-squares analysis of polynomials and the statistical testing of the resulting parameters, as it relates to the program LEASTB currently in use.
Abstract: One of the most frequent mathematical routines encountered in experimental work is the fitting of observations to an equation, empirical or otherwise. The techniques for doing this are by now standard, and a number of computer programs are available for this task. Generally these are not adapted for use on a time-sharing system, or do not allow extensive testing of the statistical significance of the results. In order to establish the significance of experiments on field evaporation as well as on field emission of electrons, the author has developed LEASTB, a flexible program in BASIC designed for curve fitting of polynomials on a time-shared computer. It is adapted to general use in analyzing experimental parameters and has proved very satisfactory for testing statistical significance. This report provides an elementary outline, based largely on the monograph by Guest, of the theory underlying the least-squares analysis of polynomials and the statistical testing of the resulting parameters, as it relates to the program LEASTB currently in use.
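A minimal sketch of the kind of analysis LEASTB carries out: a least-squares polynomial fit followed by standard errors and t tests on the fitted coefficients. The data are simulated, and this is generic polynomial regression in Python rather than the BASIC program itself.

```python
# Sketch: least-squares polynomial fit with t tests on the coefficients,
# the kind of significance testing the LEASTB program provides. Simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x = np.linspace(0.0, 10.0, 30)
y = 1.0 + 0.5 * x - 0.03 * x**2 + rng.normal(scale=0.2, size=x.size)

degree = 2
coeffs, cov = np.polyfit(x, y, degree, cov=True)   # coefficients, highest power first
se = np.sqrt(np.diag(cov))
dof = x.size - (degree + 1)

for power, (c, s) in zip(range(degree, -1, -1), zip(coeffs, se)):
    t = c / s
    p = 2 * stats.t.sf(abs(t), dof)
    print(f"coefficient of x^{power}: {c:+.4f} (SE {s:.4f}), t = {t:.2f}, p = {p:.4f}")
```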

01 Oct 1972
TL;DR: In this article, the derivation and properties of sequential tests of composite hypotheses for families of distributions satisfying certain conditions are discussed, including discrimination among k > 2 hypotheses in the presence of nuisance parameters, and some tests with the usual decision boundaries.
Abstract: The report contains the derivation and properties of sequential tests of composite hypotheses for families of distributions satisfying certain conditions. The following topics are discussed: the sequential test 'S'; discrimination among k > 2 hypotheses in the presence of nuisance parameters; some tests with the usual decision boundaries; sequential tests for analysis of variance under random and mixed models; and some exact results.

Journal ArticleDOI
TL;DR: A general computer program is described that will compute asymptotic standard errors and carry out significance tests for an endless variety of (standard and) nonstandard large-sample statistical problems, without requiring the statistician to derive asymptotic standard error formulas.
Abstract: A general computer program is described that will compute asymptotic standard errors and carry out significance tests for an endless variety of (standard and) nonstandard large-sample statistical problems, without requiring the statistician to derive asymptotic standard error formulas. The program assumes that the observations have a multinormal distribution and that the null hypothesis to be tested has the form g = 0 where g is some function (to be specified by the user) of means, variances, and covariances. Only minor reprogramming is required to replace either or both of these assumptions.
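A minimal sketch of the delta-method computation such a program automates in its simplest case, where g is a function of population means only: the asymptotic standard error of g at the sample means is obtained from a numerical gradient and the estimated covariance matrix of the means. This is an illustrative reconstruction, not the program itself, and the full program also allows g to involve variances and covariances; the data and the example g are hypothetical.

```python
# Sketch of a delta-method significance test for a user-specified function
# g of population means, the simplest case of what such a program automates
# (the full program also lets g depend on variances and covariances).
# Data and the example g are hypothetical.
import numpy as np
from scipy import stats

def delta_method_test(X, g, eps=1e-6):
    """Test H0: g(mu) = 0 using sample means and a numerical gradient."""
    n, p = X.shape
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False)             # sample covariance matrix
    grad = np.empty(p)
    for j in range(p):                       # central-difference gradient of g
        step = np.zeros(p)
        step[j] = eps
        grad[j] = (g(xbar + step) - g(xbar - step)) / (2 * eps)
    se = np.sqrt(grad @ (S / n) @ grad)      # asymptotic standard error of g(xbar)
    z = g(xbar) / se
    return g(xbar), se, 2 * stats.norm.sf(abs(z))

rng = np.random.default_rng(6)
X = rng.multivariate_normal([1.0, 2.1], [[1.0, 0.3], [0.3, 2.0]], size=200)

# hypothetical null hypothesis: the two population means are equal, g = mu1 - mu2
value, se, p = delta_method_test(X, lambda m: m[0] - m[1])
print(f"g = {value:.3f}, SE = {se:.3f}, p = {p:.4f}")
```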

Journal ArticleDOI
TL;DR: A confidence model for finite-memory learning systems is advanced and the optimal rule for this model is deterministic, whereas the previous model required randomized rules to achieve minimum error probability.
Abstract: A confidence model for finite-memory learning systems is advanced in this correspondence. The primary difference between this and the previously used probability-of-error model is that a measure of confidence is associated with each decision and any incorrect decisions are weighted according to their confidence measure in figuring total loss. The optimal rule for this model is deterministic, whereas the previous model required randomized rules to achieve minimum error probability.


Journal ArticleDOI
TL;DR: In this article, a numerical procedure is outlined for obtaining an interval estimate of true score, and the procedure is applied to several sets of test data.
Abstract: A numerical procedure is outlined for obtaining an interval estimate of true score. The procedure is applied to several sets of test data.