
Showing papers on "Statistical hypothesis testing published in 1976"


Journal ArticleDOI
TL;DR: In this paper, a simple algorithm is constructed and shown to converge monotonically to yield a maximum likelihood estimate of a distribution function when the data are incomplete due to grouping, censoring and/or truncation.
Abstract: SUMMARY This paper is concerned with the non-parametric estimation of a distribution function F, when the data are incomplete due to grouping, censoring and/or truncation. Using the idea of self-consistency, a simple algorithm is constructed and shown to converge monotonically to yield a maximum likelihood estimate of F. An application to hypothesis testing is indicated.
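As a rough illustration of the self-consistency idea, the sketch below handles only the simplest special case of right-censored data, where the iteration converges to the familiar Kaplan-Meier estimate. It is not taken from the paper; the function name, convergence settings, and toy data are hypothetical.

```python
import numpy as np

def self_consistent_cdf(times, censored, tol=1e-10, max_iter=1000):
    """Self-consistency iteration for a CDF F with right-censored data (a sketch).

    times    : observation times
    censored : boolean array, True where the observation is right-censored
    Returns (grid, F) with F evaluated at the uncensored observation times.
    """
    times = np.asarray(times, float)
    censored = np.asarray(censored, bool)
    n = len(times)
    grid = np.unique(times[~censored])                            # evaluate F at uncensored times
    F = np.searchsorted(np.sort(times), grid, side="right") / n   # empirical CDF as starting value

    for _ in range(max_iter):
        F_new = np.empty_like(F)
        for j, t in enumerate(grid):
            total = 0.0
            for x, c in zip(times, censored):
                if not c:                                          # exact observation
                    total += 1.0 if x <= t else 0.0
                elif x < t:                                        # censored before t: redistribute mass
                    Fc = np.interp(x, grid, F, left=0.0)
                    if Fc < 1.0:
                        total += (F[j] - Fc) / (1.0 - Fc)
            F_new[j] = total / n
        if np.max(np.abs(F_new - F)) < tol:
            F = F_new
            break
        F = F_new
    return grid, F

# toy example: six observations, two right-censored
grid, F = self_consistent_cdf([2, 3, 3, 5, 7, 8], [False, True, False, False, True, False])
print(list(zip(grid, np.round(F, 3))))
```

Each pass redistributes the mass of the censored observations to the right according to the current estimate of F, which is the self-consistency fixed point the abstract refers to; grouping and truncation require the fuller treatment given in the paper.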

1,669 citations


Journal ArticleDOI
TL;DR: In this paper, the problems of model classification and parameter estimation are examined, with the objective of establishing the statistical reliability of inferences drawn from X-ray observations, and a procedure based on minimizing the χ² statistic is recommended; it provides a rejection criterion at any desired significance level.
Abstract: The problems of model classification and parameter estimation are examined, with the objective of establishing the statistical reliability of inferences drawn from X-ray observations. For testing the validities of classes of models, the procedure based on minimizing the χ² statistic is recommended; it provides a rejection criterion at any desired significance level. Once a class of models has been accepted, a related procedure based on the increase of χ² gives a confidence region for the values of the model's adjustable parameters. The procedure allows the confidence level to be chosen exactly, even for highly nonlinear models. Numerical experiments confirm the validity of the prescribed technique. The χ²_min + 1 error estimation method is evaluated and found unsuitable when several parameter ranges are to be derived, because it substantially underestimates their joint errors. The ratio of variances method, while formally correct, gives parameter confidence regions which are more variable than necessary.
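The two-step recipe described above, reject or accept a model class from the minimum χ², then bound the parameters by the increase of χ², can be sketched as follows. The power-law model, the data, and the 90% level are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

# Hypothetical measurements y with Gaussian errors sigma at points x,
# fitted by a simple power law (an illustrative stand-in for a spectral model).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([10.2, 5.3, 3.1, 2.4, 1.9])
sigma = np.array([0.5, 0.4, 0.3, 0.3, 0.3])

def model(x, norm, index):
    return norm * x ** (-index)

def chisq(params):
    norm, index = params
    return np.sum(((y - model(x, norm, index)) / sigma) ** 2)

# 1) Model-class rejection test: compare chi^2_min with its reference distribution.
fit = minimize(chisq, x0=[10.0, 1.0], method="Nelder-Mead")
chi2_min = fit.fun
dof = len(x) - 2                                  # data points minus fitted parameters
p_value = chi2.sf(chi2_min, dof)                  # reject the model class if p is small

# 2) Joint confidence region for the parameters: all theta with
#    chi^2(theta) <= chi^2_min + Delta, Delta being the chi^2 quantile
#    for the number of parameters estimated jointly (here 2).
delta = chi2.ppf(0.90, df=2)                      # 90% joint region
inside = chisq([fit.x[0] * 1.05, fit.x[1]]) <= chi2_min + delta
print(fit.x, chi2_min, p_value, inside)
```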

700 citations


Journal ArticleDOI
TL;DR: Fisher's contributions to statistics are surveyed in this article, where his background, skills, temperament, and style of thought and writing are sketched and his mathematical and methodological contributions are outlined.
Abstract: Fisher's contributions to statistics are surveyed. His background, skills, temperament, and style of thought and writing are sketched. His mathematical and methodological contributions are outlined. More attention is given to the technical concepts he introduced or emphasized, such as consistency, sufficiency, efficiency, information, and maximum likelihood. Still more attention is given to his conception and concepts of probability and inference, including likelihood, the fiducial argument, and hypothesis testing. Fisher is at once very near to and very far from modern statistical thought generally.

129 citations


DissertationDOI
01 Jan 1976

124 citations


Book
01 Jan 1976
TL;DR: In this book, the authors present sampling theory for hypothesis testing, one-way and two-way analysis of variance, the significance of the difference between means, and computer-assisted data analysis.
Abstract: 1. Some Thoughts on Measurement. 2. Frequency Distributions and Graphical Methods. 3. Central Tendency. 4. Variability. 5. The Normal Curve. 6. Sampling Theory for Hypothesis Testing. 7. Correlation. 8. Prediction and Regression. 9. The Significance of the Difference Between Means. 10. Decision Making, Power, and Effect Size. 11. One-Way Analysis of Variance. 12. Two-Way Analysis of Variance. 13. Some Nonparametric Statistical Tests. References. Appendix 1: Formulas. Appendix 2: Basic Mathematics Refresher. Appendix 3: Statistical Tables. Appendix 4: Computer-Assisted Data Analysis. Answers to Odd-Numbered Problems. Index.

102 citations


Journal ArticleDOI
J. Scott Long
TL;DR: This article reviews Joreskog's model for the analysis of covariance structures by first introducing the simpler case of confirmatory factor analysis. The mathematical results necessary for estimation and hypothesis testing are presented in a way which should be more accessible to sociologists than the original sources.
Abstract: This paper reviews Joreskog's model for the analysis of covariance structures by first introducing the simpler case of confirmatory factor analysis. The mathematical results necessary for estimation and hypothesis testing are presented in a way which should be more accessible to sociologists than the original sources. The usefulness of Joreskog's techniques is indicated by reformulating a series of models which have been estimated by sociologists using techniques without statistical justification in the format of covariance structures. Identification is considered in this context. The argument is made that these methods can greatly extend our ability to construct structural equation models containing measurement error.
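As a rough illustration of the covariance-structure machinery being reviewed, the sketch below fits a one-factor confirmatory model to a hypothetical correlation matrix by minimizing the maximum likelihood discrepancy function and forms the usual likelihood-ratio chi-square test of the structure. The sample matrix, sample size, and function names are invented for the example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

# Hypothetical sample covariance matrix S for p = 4 indicators of one factor.
S = np.array([[1.00, 0.56, 0.48, 0.40],
              [0.56, 1.00, 0.42, 0.35],
              [0.48, 0.42, 1.00, 0.30],
              [0.40, 0.35, 0.30, 1.00]])
n, p = 200, 4

def implied_sigma(theta):
    lam = theta[:p]                     # factor loadings
    psi = np.exp(theta[p:])             # unique variances, kept positive
    return np.outer(lam, lam) + np.diag(psi)

def f_ml(theta):
    # Maximum likelihood discrepancy function for covariance structures
    Sigma = implied_sigma(theta)
    _, logdet = np.linalg.slogdet(Sigma)
    return logdet + np.trace(S @ np.linalg.inv(Sigma)) - np.linalg.slogdet(S)[1] - p

start = np.concatenate([0.5 * np.ones(p), np.log(0.5 * np.ones(p))])
fit = minimize(f_ml, start, method="BFGS")

# Likelihood-ratio test of the one-factor structure against an unrestricted covariance
df = p * (p + 1) // 2 - 2 * p           # free parameters: p loadings + p unique variances
lr_stat = (n - 1) * fit.fun
print(fit.x[:p], np.exp(fit.x[p:]), lr_stat, chi2.sf(lr_stat, df))
```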

94 citations


Journal ArticleDOI
Milan Zeleny
TL;DR: It is suggested here that the traditional compensatory multi-attribute model is based on a number of significant conceptual simplifications; a new model is presented, designed to explain and predict changes in individual choice behavior under varying conditions of choice.
Abstract: The traditional compensatory multi-attribute model persists in being inconsistent with empirical findings. It is suggested here that the model is based on a number of significant conceptual simplifications. Although the individual choice behavior is partially characterized by randomness, the model attempts to reveal the underlying deterministic rationale of the choice behavior, abstracted from random fluctuations. If such a model is conceptually false then no stochastic refinement can validate it, although some statistical tests might be improved slightly. We need a different kind of deterministic choice rationale, a new paradigm, capable of explaining and predicting those changes, intransitivities and inconsistencies of the individual choice behavior for which we currently have only a stochastic explanation. Such a methodology can then be further refined via stochastic extension. The model presented here is designed to explain and predict changes in individual choice behavior under varying conditions of choice.

79 citations


Journal ArticleDOI
TL;DR: A general statistical model for the multivariate analysis of mean and covariance structures is described, which has common-factor loadings that are invariant with respect to variable scaling and unique variances that must be positive.
Abstract: A general statistical model for the multivariate analysis of mean and covariance structures is described. Various models, such as those of Bock and Bargmann; Joreskog; and Wiley, Schmidt and Bramble, are special cases. One specialization of the general model produces a class of factor analytic models. The simplest case represents an alternative conceptualization of the multiple-factor model developed by Thurstone. In contrast to the traditional model, the new model has common-factor loadings that are invariant with respect to variable scaling and unique variances that must be positive. A special feature of the model is that it does not allow the confounding of principal components analysis and factor analysis. Matrix calculus is used to develop statistical aspects of the model. Parameters are estimated by the maximum likelihood method with Newton-Raphson iterations.

73 citations


Journal ArticleDOI
TL;DR: In this article, the asymptotic power of the Cramer-von Mises test (CvM) when parameters are estimated from the data is studied under certain local (contiguous) alternatives.

58 citations


Journal ArticleDOI
TL;DR: The field of organizational behavior has begun to place undue emphasis on hypothesis testing and thereby to underemphasize hypothesis generation; after suggesting several reasons for this, the paper reviews how useful research ideas are discovered and the prerequisites for hypothesis creation, and then catalogues hypothesis generation.
Abstract: The field of organizational behavior has begun to place undue emphasis on hypothesis testing and thereby underemphasize hypothesis generation. After suggesting several reasons for this, this paper reviews how useful research ideas are discovered and the prerequisites for hypothesis creation, and then catalogues hypothesis generation. Suggested criteria for acting on new research ideas and questions are: extensity, professional encounter, mental experiments, and timing.

57 citations


Journal ArticleDOI
01 Mar 1976
TL;DR: In this paper, the authors introduce Probability Theory, the theory of confidence sets, and non-parametric theory of regression and sampling theory of Multidimensional Normal Distributions.
Abstract: Notation and Preliminary Remarks.- I Introduction to Probability Theory.- II Elementary Sampling Theory.- III Introduction to the Theory of Hypothesis Testing.- IV The Theory of Confidence Sets.- V Theory of Estimation.- VI Theory of Regression and the Sampling Theory of Multidimensional Normal Distributions.- VII Introduction to Non-parametric Theories.- Name and Subject Index.

Journal ArticleDOI
TL;DR: A design method for digital control systems which is optimally tolerant of failures in aircraft sensors is presented; the system can compensate for hardover as well as increased noise-type failures by computing the likelihood ratios as generalized likelihood ratios.
Abstract: A design method for digital control systems which is optimally tolerant of failures in aircraft sensors is presented. The functions of this system are accomplished with software instead of the popular and costly technique of hardware duplication. The approach taken, based on M-ary hypothesis testing, results in a bank of Kalman filters operating in parallel. A moving window of the innovations of each Kalman filter drives a detector that decides the failure state of the system. The detector calculates the likelihood ratio for each hypothesis corresponding to a specific failure state of the system. It also selects the most likely state estimate in the Bayesian sense from the bank of Kalman filters. The system can compensate for hardover as well as increased noise-type failures by computing the likelihood ratios as generalized likelihood ratios. The design method is applied to the design of a fault tolerant control system for a current configuration of the space shuttle orbiter at Mach 5 and 120,000 ft. The failure detection capabilities of the system are demonstrated using a real-time simulation of the system with noisy sensors.
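A stripped-down version of the detection idea, parallel Kalman filters matched to different failure hypotheses with a moving window of innovation log-likelihoods feeding a likelihood-ratio decision, might look as follows for a scalar state measured by two sensors, with one constant-bias failure hypothesis. All dynamics, noise levels, the bias magnitude, and the threshold are illustrative; in the paper's generalized-likelihood-ratio formulation the failure magnitude would be estimated rather than assumed, and a full bank would carry one filter per hypothesized failure state.

```python
import numpy as np

# Scalar random-walk state observed by two sensors. H0: both sensors healthy;
# H1: sensor 2 carries a constant bias b. All numbers are illustrative.
rng = np.random.default_rng(0)
T, q, r, b = 200, 0.01, 0.25, 2.0
x = np.cumsum(rng.normal(0, np.sqrt(q), T))                 # true state
Z = np.column_stack([x, x]) + rng.normal(0, np.sqrt(r), (T, 2))
Z[100:, 1] += b                                             # sensor 2 fails at t = 100

def innovation_loglik(Z, bias, q, r):
    """Kalman filter whose measurement model is [x, x + bias]; returns the
    per-step log-likelihood of the 2-d innovation under that hypothesis."""
    H = np.array([1.0, 1.0])
    R = np.eye(2) * r
    xhat, P = 0.0, 1.0
    ll = np.zeros(len(Z))
    for t, z in enumerate(Z):
        P = P + q                                           # time update
        S = np.outer(H, H) * P + R                          # innovation covariance
        nu = z - H * xhat - np.array([0.0, bias])           # innovation
        ll[t] = -0.5 * (np.log(np.linalg.det(2 * np.pi * S))
                        + nu @ np.linalg.solve(S, nu))
        K = P * H @ np.linalg.inv(S)                        # measurement update
        xhat = xhat + K @ nu
        P = (1.0 - K @ H) * P
    return ll

ll0 = innovation_loglik(Z, 0.0, q, r)                       # filter matched to H0
ll1 = innovation_loglik(Z, b, q, r)                         # filter matched to H1
window = 20
log_lr = np.convolve(ll1 - ll0, np.ones(window), "valid")   # moving-window log likelihood ratio
first_alarm = np.argmax(log_lr > np.log(100.0)) + window    # first confident detection
print("failure declared near t =", first_alarm)
```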

Journal ArticleDOI
TL;DR: In this paper, five procedures for testing the equality of two correlated means with incomplete data on both responses are compared by Monte Carlo studies; three procedures were suggested by Lin & Stivers (1974), and two are suggested in this paper.
Abstract: SUMMARY Five procedures for testing the equality of two correlated means with incomplete data on both responses are compared by using Monte Carlo studies; three procedures were suggested by Lin & Stivers (1974) and two are suggested in this paper. The results are that the two tests based on a modified maximum likelihood estimator are to be preferred, the one due to Lin & Stivers when the number of complete pairs is large and the one proposed in this paper otherwise, provided the variances of the two responses do not differ by much. When the correlation between the two responses is small, two other tests may be used: a test proposed in this paper when the homoscedasticity assumption is not strongly violated, and a Welch-type statistic suggested by Lin & Stivers otherwise. Some key words: Equality of means; Missing data; Monte Carlo study; Student's t test; Welch approximate test. 1. INTRODUCTION The problem of testing the equality of two correlated means with incomplete data has been treated by several authors. Ekbohm (1976) has carried out some simulation studies for the case with missing data on one response and recommends Morrison's (1973) statistic when the variances are not too different. A similar comparison is given by Lin & Stivers (1975). Naik (1975) has given a test which may be appropriate in the heteroscedastic case with negative or small positive values of the correlation coefficient. For the case with missing values on both responses Lin & Stivers (1974) have suggested some test statistics, and it is the purpose of this paper to suggest two more and to give the results of some simulation studies. In constructing test statistics two different approaches are commonly used. We may consider some appropriate linear model, regard all effects in the model as fixed in deriving the sums of squares, and then obtain the expected mean squares and a test ratio under the assumed mixed model. In the present case this procedure leads to the complete paired test, using only the complete pairs and discarding the extra observations. Another approach is to start by obtaining an appropriate estimator, find an estimator of its variance and then form a statistic which, at least for large samples, is approximately normally distributed. If the maximum likelihood estimator is easily obtainable it will as a rule be a good starting-point. In the present case there is no explicit expression for the maximum likelihood estimator (Ratkowsky, 1974), and so some modifications have to be made in order to construct a noniterative test statistic. Among the five statistics compared, two, one due to Lin & Stivers (1974) and one proposed in this paper, are based on a modified maximum likelihood estimator; the other three are based on the simple mean difference. The main results of the simulation studies

Journal ArticleDOI
TL;DR: In this paper, the authors discuss the role of significance testing in contrast to hypothesis testing, taking the view that the two processes must be differentiated because they are in fact different even though they possess some common mathematical features.
Abstract: This paper discusses the role of significance testing, in contrast to hypothesis testing. It takes the view that the two processes must be differentiated because they are in fact different even though they possess some common mathematical features. A supporting view of significance testing is presented. Difficulties and obscurities are discussed.

Journal ArticleDOI
TL;DR: This paper describes a comparative study of six binary inference systems and two search strategies employed in resolution-based problem solving systems, using a variety of performance measures to provide insight into the behavior of each inference system/search strategy combination.
Abstract: This paper describes a comparative study of six binary inference systems and two search strategies employed in resolution-based problem solving systems. A total of 152 problems, most of which were taken from the recent literature, were employed. Each of these problems was attempted under a standard set of conditions using each inference system and each search strategy, for a total of twelve attempts for each problem. Using a variety of performance measures, a large number of hypotheses were examined in an effort to provide insight into the behavior of each inference system/search strategy combination. Whenever possible, the authors employed distribution-free statistical tests to minimize the subjectivity of the comparisons and hypothesis testing. Conclusions are presented concerning the effectiveness of the binary inference systems and search strategies, some effects of employing different problem representations, and certain characteristics of problems found to be significant in the overall system performance. Suggestions are made as to additional techniques that might enable theorem provers to solve practical problems.

Journal ArticleDOI
TL;DR: The concept of statistical power and Type II error was introduced by Neyman and Pearson, as discussed by the authors, in response to a fundamental asymmetry in the hypothesis testing process, but did not become a regular textbook inclusion until some 30 years later.
Abstract: In 1933, Neyman and Pearson introduced the interrelated concepts of statistical power and Type II error in response to a fundamental asymmetry in the hypothesis testing process. With few exceptions, however, statistical power did not become a regular textbook inclusion until some 30 years later. Modern concern for power evolved naturally from the “significance test controversy,” and was further stimulated by Cohen’s (1962) review in the Journal of Abnormal and Social Psychology. To date, eight power-analytic surveys have been conducted. Generally, the average power estimates derived from these analyses have been quite low. Providing sufficient power serves to decrease the commission of Type II errors, and may prevent misinterpretations of nonsignificant results. Including statistical power in the design and analysis of an experiment requires an a priori estimate of the effect size, as well as calculating obtained effect size. The obtained effect size reflects the relationship between the independent and dependent variables, and as such provides a better characterization of the research effort than does reporting only the significance level.
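The a priori use of power described above can be illustrated with a simple calculation: given an assumed standardized effect size, compute the power at a planned sample size, or invert the calculation to find the sample size needed for a target power. The sketch below uses the normal approximation to the two-sided, two-sample t test; the effect size and sample sizes are illustrative.

```python
import numpy as np
from scipy.stats import norm

def approx_power_two_sample(d, n_per_group, alpha=0.05):
    """Normal-approximation power of a two-sided two-sample t test,
    given standardized effect size d (Cohen's d) and equal group sizes."""
    z_crit = norm.ppf(1 - alpha / 2)
    ncp = d * np.sqrt(n_per_group / 2)          # approximate noncentrality
    return norm.cdf(ncp - z_crit) + norm.cdf(-ncp - z_crit)

def n_for_power(d, target=0.80, alpha=0.05):
    """Smallest per-group n giving at least the target power."""
    n = 2
    while approx_power_two_sample(d, n, alpha) < target:
        n += 1
    return n

# A "medium" effect (d = 0.5) needs roughly 63 per group for 80% power under this
# approximation; with only 20 per group the power is only about 0.35.
print(n_for_power(0.5), round(approx_power_two_sample(0.5, 20), 2))
```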

Journal ArticleDOI
TL;DR: In this paper, the statistical tests and measures that are commonly used to evaluate the accuracy of the calibration of gravity models are discussed theoretically and a further method of analysis involving the likelihood is proposed.

Journal ArticleDOI
TL;DR: In this article, the Neyman-Pearson framework of hypothesis testing with fixed-error-level specifications was used to obtain two-stage hypothesis testing designs with known variance and binomially distributed variates, and it was shown that when the alternative hypothesis is true, these optimal twostage designs generally achieve between one-half and two-thirds of the ASN differential between the two extremes of analogous fixed-sample designs (maximum ASN) and item-by-item Wald SPRT design (minimum ASN).
Abstract: Within the Neyman-Pearson framework of hypothesis testing with fixed-error-level specifications, two-stage designs are obtained such that sample size is minimized when the alternative hypothesis is true. Normally distributed variates with known variance and binomially distributed variates are considered. It is shown that when the alternative hypothesis is true, these optimal two-stage designs generally achieve between one-half and two-thirds of the ASN differential between the two extremes of analogous fixed-sample designs (maximum ASN) and item-by-item Wald SPRT design (minimum ASN when alternative hypothesis is true).
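The two ASN extremes referred to in the abstract can be approximated quickly for a binomial example: the fixed-sample size from the usual normal approximation, and the Wald SPRT's expected sample size under the alternative from Wald's approximations. The hypothesized proportions and error levels below are illustrative; the paper's optimal two-stage designs would fall between these two numbers.

```python
import numpy as np
from scipy.stats import norm

# One-sided binomial test of p0 vs p1 with error levels alpha, beta (illustrative).
p0, p1, alpha, beta = 0.1, 0.2, 0.05, 0.10

# Fixed-sample design (maximum ASN): usual normal-approximation sample size.
za, zb = norm.ppf(1 - alpha), norm.ppf(1 - beta)
n_fixed = ((za * np.sqrt(p0 * (1 - p0)) + zb * np.sqrt(p1 * (1 - p1))) / (p1 - p0)) ** 2

# Item-by-item Wald SPRT (minimum ASN when H1 is true): Wald's approximations.
A, B = (1 - beta) / alpha, beta / (1 - alpha)
kl1 = p1 * np.log(p1 / p0) + (1 - p1) * np.log((1 - p1) / (1 - p0))   # E1[log LR per item]
asn_sprt_h1 = ((1 - beta) * np.log(A) + beta * np.log(B)) / kl1

print(round(n_fixed), round(asn_sprt_h1))   # roughly 101 vs 54 for these settings
```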

01 Jul 1976
TL;DR: In this article, the authors studied the asymptotic distribution theory and limiting efficiencies of families of test statistics for the null hypothesis, based on the interval counts $\{S_k\}$, and showed that tests of the symmetric type have poor asymptotic performance in the sense that they can only distinguish alternatives at a "distance" of $n^{-1/4}$ from the hypothesis.
Abstract: SUMMARY. Let $X_1, \ldots, X_{m-1}$ and $Y_1, \ldots, Y_n$ be independent random samples from two continuous distribution functions $F$ and $G$ respectively on the real line. We wish to test the null hypothesis that these two parent populations are identical. Let $X'_1 \le \ldots \le X'_{m-1}$ be the ordered X-observations. Denote by $S_k$ the number of Y-observations falling in the interval $[X'_{k-1}, X'_k)$, $k = 1, \ldots, m$. This paper studies the asymptotic distribution theory and limiting efficiencies of families of test statistics for the null hypothesis, based on these numbers $\{S_k\}$. Let $h(\cdot)$ and $\{h_k(\cdot),\ k = 1, \ldots, m\}$ be real-valued functions satisfying some simple regularity conditions. Asymptotic theory under the null hypothesis, as well as under a suitable sequence of alternatives, is studied for test statistics of the form $\sum_{k=1}^{m} h(S_k)$, based symmetrically on the $S_k$, and those of the form $\sum_{k=1}^{m} h_k(S_k)$, which are not symmetric in $\{S_k\}$. It is shown here that tests of the symmetric type have poor asymptotic performance in the sense that they can only distinguish alternatives at a "distance" of $n^{-1/4}$ from the hypothesis. Among this class of symmetric tests, which includes for instance the well-known run test and the Dixon test, it is shown that the Dixon test has the maximum asymptotic relative efficiency. On the other hand, tests of the nonsymmetric type can distinguish alternatives converging at the more standard rate of $n^{-1/2}$. The Wilcoxon-Mann-Whitney test is an example which belongs to this class. After investigating the asymptotic theory under such alternatives, methods are suggested which allow one to select an "optimal" test against any specific alternative, from among tests of the type $\sum_{k=1}^{m} h_k(S_k)$. Connections with rank tests are briefly explored and some illustrative examples provided.
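The interval counts $S_k$ on which these statistics are built are easy to compute, and both a symmetric and a nonsymmetric statistic can be formed from them; in particular, the Mann-Whitney count of (X, Y) pairs with Y < X is the linear nonsymmetric sum $\sum_k (m-k) S_k$. In the sketch below, the choice $h(s) = s^2$ for the symmetric statistic is purely illustrative and is not claimed to be the Dixon statistic; the simulated data are hypothetical.

```python
import numpy as np

def interval_counts(x, y):
    """Counts S_k of Y-observations in the m intervals cut by the ordered X-sample
    (k = 1 is everything below the smallest X, k = m everything at or above the largest)."""
    xs = np.sort(np.asarray(x, float))
    idx = np.searchsorted(xs, y, side="right")     # 0-based interval index for each y
    return np.bincount(idx, minlength=len(xs) + 1)

rng = np.random.default_rng(1)
x = rng.normal(size=19)                            # m - 1 = 19 X-observations
y = rng.normal(loc=0.3, size=25)                   # shifted alternative
S = interval_counts(x, y)
m = len(S)

symmetric_stat = np.sum(S ** 2)                    # sum h(S_k) with the illustrative h(s) = s^2
k = np.arange(1, m + 1)
mann_whitney_U = np.sum((m - k) * S)               # number of (X, Y) pairs with Y < X
print(S, symmetric_stat, mann_whitney_U)
```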

Journal ArticleDOI
TL;DR: In this paper, a class of locally optimal tests based on linear functions of certain transformations of independent test statistics is proposed and studied, which are useful in combining independent tests for certain multivariate or multiparameter hypotheses where local optimality is interpreted in a specialized sense.
Abstract: A class of locally optimal tests based on linear functions of certain transformations of independent test statistics is proposed and studied. These overall tests are useful in combining independent tests for certain multivariate or multiparameter hypotheses where local optimality is interpreted in a specialized sense. For the two-sample case, along with the expressions for the null distributions of the test statistics, tabulations of percentile points are provided.

Journal ArticleDOI
TL;DR: In this article, it was suggested that post hoc analysis problems may be eliminated if contrasts congruent with the mathematical model are defined, and these contrasts are subsequently assessed for statistical significance by means of Scheffe's (1953) post hoc multiple comparison procedure.
Abstract: A Type IV error, as introduced by Marascuilo and Levin (1970), refers to the incorrect interpretation of a correctly rejected hypothesis. It was suggested that Type IV errors may be found in abundance in factorial analysis of variance (ANOVA) designs, in particular in the interpretation of significant interactions. The source of such Type IV errors is two-fold: (1) researchers typically employ post hoc statistical procedures with an associated Type 1 error probability different from that of the omnibus interaction test; and (2) researchers typically define contrasts among cell means which confound interaction parameters with other parameters of the ANOVA model. Marascuilo and Levin have pointed out that both of these post hoc analysis problems may be eliminated if contrasts congruent with the mathematical model are defined, and these contrasts are subsequently assessed for statistical significance by means of Scheffe's (1953) post hoc multiple comparison procedure. This advice, though sound theoretically and statistically, has been questioned by Games (1973) on practical grounds. With regard to Point 1 above, Games correctly notes that Scheffe's procedure generally produces wide confidence intervals relative to other procedures, and consequently is not the most powerful method for researchers who are in search of significant differences. No argument with Games would exist, given that alternative procedures are suitably equipped to investigate contrasts of interest, at a familywise Type I error rate comparable to Scheffe's. In fact, Levin and Marascuilo (1972) have advocated the use of planned, rather than post hoc, interaction comparisons whenever possible. In such cases, no omnibus interaction test is performed since each

Journal ArticleDOI
TL;DR: In this article, the authors describe admissible hypothesis tests for Markov order, the tests being defined on certain conditioned variables, which have the following appealing properties: the power function is constant on the hypothesis set (and therefore the level is not pessimistic), and the test is asymptotically most powerful within the class of tests based on regular estimators.
Abstract: This paper describes admissible hypothesis tests for Markov order, the tests being defined on certain conditioned variables. In the spirit of classical hypothesis testing, the tests have the following appealing properties: The power function is constant on the hypothesis set (and, therefore, the level is not pessimistic), and the test is asymptotically most powerful within the class of tests based on regular estimators. Application of the tests herein derived to simulated and actual hydrologic data indicates that our test is superior to the usual chi-square test for Markov order, and that with moderate amounts of data, conclusions with significant physical import can be gleaned.
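For comparison, the "usual chi-square test for Markov order" that the paper benchmarks against can be sketched as a test of one-step transition counts against independence. This is the conventional procedure, not the admissible test derived in the paper; the binary sequence below is simulated for illustration and the function name is hypothetical.

```python
import numpy as np
from scipy.stats import chi2

def chi_square_markov_order1(seq, n_states):
    """Usual chi-square test of first-order Markov dependence against independence,
    based on the matrix of one-step transition counts."""
    seq = np.asarray(seq)
    N = np.zeros((n_states, n_states))
    for a, b in zip(seq[:-1], seq[1:]):
        N[a, b] += 1
    row, col, total = N.sum(1, keepdims=True), N.sum(0, keepdims=True), N.sum()
    expected = row * col / total                   # expected counts under independence
    stat = np.nansum((N - expected) ** 2 / expected)
    df = (n_states - 1) ** 2
    return stat, chi2.sf(stat, df)

# illustrative binary sequence with some persistence (stays in state w.p. 0.7)
rng = np.random.default_rng(2)
seq = [0]
for _ in range(400):
    seq.append(seq[-1] if rng.random() < 0.7 else 1 - seq[-1])
print(chi_square_markov_order1(seq, 2))
```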

Journal ArticleDOI
TL;DR: In this article, the author states that he is not aware of a conceptual difference between a "test of a statistical hypothesis" and a "test of significance" and uses these terms interchangeably; the use of statistical tests is illustrated on two examples: (i) Le Cam's (and associates') study of immunotherapy of cancer and (ii) a socio-economic experiment relating to low-income homeownership problems.
Abstract: Contrary to ideas suggested by the title of the conference at which the present paper was presented, the author is not aware of a conceptual difference between a “test of a statistical hypothesis” and a “test of significance” and uses these terms interchangeably. A study of any serious substantive problem involves a sequence of incidents at which one is forced to pause and consider what to do next. In an effort to reduce the frequency of misdirected activities one uses statistical tests. The procedure is illustrated on two examples: (i) Le Cam’s (and associates’) study of immunotherapy of cancer and (ii) a socio-economic experiment relating to low-income homeownership problems.


Journal ArticleDOI
TL;DR: In this article, an approach to hypothesis testing about measurement precisions when more than two instruments are involved is proposed; the proposed tests, based on large sample theory, are simple to apply.
Abstract: When two or more instruments or techniques are used to measure the same items, the measurement precisions may be estimated using a method proposed by Grubbs[1]. The problems of testing various hypotheses about the measurement precisions have been considered extensively by several authors for the case of two instruments, and some attention has been given to testing specific hypotheses involving three instruments. This paper proposes an approach to hypothesis testing when more than two instruments are involved. The proposed tests, based on large sample theory, are simple to apply.
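The estimation step referred to above can be sketched with Grubbs-type estimators for three or more instruments, where each instrument's error variance is estimated from covariances of pairwise differences; the large-sample tests proposed in the paper are built on estimates of this kind but are not reproduced here. The data are simulated and the function name is hypothetical.

```python
import numpy as np

def grubbs_precisions(X):
    """Grubbs-type estimates of instrument error variances when p >= 3 instruments
    measure the same n items: sigma_i^2 is estimated by cov(x_i - x_j, x_i - x_k),
    averaged over choices of j, k != i (a sketch, not the paper's test)."""
    X = np.asarray(X, float)                 # shape (n_items, p_instruments)
    n, p = X.shape
    est = np.zeros(p)
    for i in range(p):
        others = [j for j in range(p) if j != i]
        vals = []
        for a in range(len(others)):
            for c in range(a + 1, len(others)):
                j, k = others[a], others[c]
                d1, d2 = X[:, i] - X[:, j], X[:, i] - X[:, k]
                vals.append(np.cov(d1, d2, ddof=1)[0, 1])
        est[i] = np.mean(vals)
    return est

# illustrative data: true item values plus instrument-specific error
rng = np.random.default_rng(3)
true = rng.normal(50, 5, size=300)
X = np.column_stack([true + rng.normal(0, s, 300) for s in (0.5, 1.0, 2.0)])
print(np.round(grubbs_precisions(X), 2))      # roughly (0.25, 1.0, 4.0)
```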

Journal ArticleDOI
TL;DR: Two programs are described that may be used to find condensed descriptions for data available in a contingency table or in a covariance matrix in the case that these data follow a multinomial or a multivariate normal distribution.

Journal ArticleDOI
TL;DR: This paper shows how the problem of systemic hypothesis-testing (Sys HT) can be systematically expressed as a constrained maximization problem and how the error of the third kind (E III) is fundamental to the theory of Sys HT.
Abstract: Scientific ideas neither arise nor develop in a vacuum. They are always nurtured against a background of prior, partially conflicting ideas. Systemic hypothesis-testing is the problem of testing scientific hypotheses relative to various systems of background knowledge. This paper shows how the problem of systemic hypothesis-testing (Sys HT) can be systematically expressed as a constrained maximization problem. It is also shown how the error of the third kind (E III) is fundamental to the theory of Sys HT. The error of the third kind is defined as the probability of having solved the 'wrong' problem when one should have solved the 'right' problem. This paper shows how E III can be given both a systematic as well as a systemic treatment. Sys HT gives rise to a whole host of new decision problems, puzzles, and paradoxes.

Journal ArticleDOI
TL;DR: In this article, a method is described for constructing nonparametric tests for experiments arranged in blocks when the underlying distributions are of Lehmann's form, and the test derived by this method is a generalization of an existing well-known test.
Abstract: SUMMARY A method is described for constructing nonparametric tests for experiments arranged in blocks when the underlying distributions are of Lehmann's form. In simple cases the resulting test statistics are those that would be obtained by an analysis of variance following a transformation of the data into exponential scores. In the past considerable work has been done on examining the power of various standard nonparametric tests under the so-called Lehmann alternative, where the probability in the upper tail of the alternative distribution is a power of the probability in the upper tail of the null distribution. With the exception of certain two-sample tests little effort seems to have been devoted to deriving tests more appropriate to this underlying situation, which is of considerable importance in life testing. In this note it is shown how the conditional likelihood approach adopted by Cox (1972) may be used to derive nonparametric test statistics for a variety of situations. Because of their method of construction the resulting tests necessarily have asymptotically optimal properties under the Lehmann alternative, but their small-sample behaviour still needs investigation. In some cases the test derived by this method is a generalization of an existing well-known test.
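The "exponential scores" transformation mentioned above (Savage scores, i.e. expected exponential order statistics) followed by an ordinary analysis of variance can be sketched as follows for a single block with several treatment groups. The F reference distribution is used only as a rough guide, in line with the abstract's caution that small-sample behaviour still needs investigation; the data and function names are illustrative.

```python
import numpy as np
from scipy.stats import rankdata, f as f_dist

def exponential_scores(values):
    """Replace observations by expected exponential order statistics (Savage scores):
    the r-th smallest of N values gets sum_{j=1}^{r} 1/(N - j + 1)."""
    N = len(values)
    cum = np.cumsum(1.0 / np.arange(N, 0, -1))      # score for ranks 1..N
    ranks = rankdata(values).astype(int)            # ties ignored in this sketch
    return cum[ranks - 1]

def anova_on_scores(groups):
    """One-way ANOVA F statistic computed on the exponential scores
    (an illustrative single-block stand-in for the block designs in the abstract)."""
    all_vals = np.concatenate(groups)
    scores = exponential_scores(all_vals)
    split = np.split(scores, np.cumsum([len(g) for g in groups])[:-1])
    grand = scores.mean()
    ss_between = sum(len(s) * (s.mean() - grand) ** 2 for s in split)
    ss_within = sum(((s - s.mean()) ** 2).sum() for s in split)
    df1, df2 = len(groups) - 1, len(all_vals) - len(groups)
    F = (ss_between / df1) / (ss_within / df2)
    return F, f_dist.sf(F, df1, df2)

rng = np.random.default_rng(4)
g1, g2, g3 = rng.exponential(1.0, 15), rng.exponential(1.5, 15), rng.exponential(2.5, 15)
print(anova_on_scores([g1, g2, g3]))
```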

Journal ArticleDOI
John W. Pratt
TL;DR: In this article, the authors argue that tests provide a poor model of most real problems, usually so poor that their objectivity is tangential and often too poor to be useful.
Abstract: The author believes that tests provide a poor model of most real problems, usually so poor that their objectivity is tangential and often too poor to be useful.

Journal ArticleDOI
TL;DR: In this paper, a method for the approximate separation of exponentials of unknown decay rates is described; it is based on the well-known method of "stripping", which fits exponentials singly in increasing order of decay rate, until a statistical test rejects the hypothesis that the observations already included are well represented by the model.