
Showing papers on "Statistical hypothesis testing" published in 1971


Journal ArticleDOI
TL;DR: In this paper, the authors examine a secondary aspect of cumulative sum schemes: estimating the point in a sequence of normal random variables at which a departure from the initial conditions (known mean θ0 and variance σ2) has taken place.
Abstract: SUMMARY The point of change in mean in a sequence of normal random variables can be estimated from a cumulative sum test scheme. The asymptotic distribution of this estimate and associated test statistics are derived and numerical results given. The relation to likelihood inference is emphasized. Asymptotic results are compared with empirical sequential results, and some practical implications are discussed. The cumulative sum scheme for detecting distributional change in a sequence of random variables is a well-known technique in quality control, dating from the paper of Page (1954) to the recent expository account by van Dobben de Bruyn (1968). Throughout the literature on cumulative sum schemes the emphasis is placed on tests of departure from initial conditions. The purpose of this paper is to examine a secondary aspect: estimation of the index τ in a sequence {xt} at which the departure from initial conditions has taken place. The work is closely related to an earlier paper by Hinkley (1970), in which maximum likelihood estimation and inference were discussed. We consider specifically sequences of normal random variables x1, ..., xT, say, where initially the mean θ0 and the variance σ2 are known. A cumulative sum (cusum) scheme is used to detect possible change in mean from θ0, and for simplicity suppose that it is a one-sided scheme for detecting a decrease in mean. Then the procedure is to compute the cumulative sums St, t = 1, 2, ....

473 citations
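A minimal sketch of the one-sided cusum scheme described above, not of Hinkley's asymptotic analysis: deviations below the known mean theta0 are accumulated, an alarm is raised when the sum crosses a decision interval h, and the change point is estimated as the last index at which the cusum sat at zero. The reference value k, the threshold h, and the simulated data are illustrative assumptions.

    import numpy as np

    def one_sided_cusum(x, theta0, sigma, k=0.5, h=5.0):
        """One-sided cusum for detecting a decrease in mean below theta0.

        k and h are in units of sigma (illustrative choices, not the paper's
        settings). Returns (alarm_index, estimated_change_point), or
        (None, None) if no alarm is raised.
        """
        s = 0.0
        last_zero = 0                       # last index at which the cusum was at zero
        for t, xt in enumerate(x, start=1):
            s = max(0.0, s + (theta0 - xt) / sigma - k)
            if s == 0.0:
                last_zero = t
            if s > h:
                return t, last_zero         # alarm time, change-point estimate
        return None, None

    rng = np.random.default_rng(1)
    x = np.concatenate([rng.normal(0.0, 1.0, 50),     # in control, mean theta0 = 0
                        rng.normal(-1.0, 1.0, 30)])   # mean drops after t = 50
    print(one_sided_cusum(x, theta0=0.0, sigma=1.0))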


Journal ArticleDOI
TL;DR: In this paper, the Fisher-Irwin conditional treatment of a single 2 x 2 contingency table is extended to k 2 x 2 tables, giving exact and asymptotic tests of whether every table has the same relative risk, i.e. a constant difference between the two populations on the logistic scale.
Abstract: SUMMARY Consider data arranged into k 2 x 2 contingency tables. The principal result is the derivation of a statistical test for making an inference on whether each of the k contingency tables has the same relative risk. The test is based on a conditional reference set and can be regarded as an extension of the Fisher-Irwin treatment of a single 2 x 2 contingency table. Both exact and asymptotic procedures are presented. The analysis of k 2 x 2 contingency tables is required in several contexts. The two principal ones are (i) the comparison of binary response random variables, i.e. random variables taking on the values zero or one, for two treatments, over a spectrum of different conditions or populations; and (ii) the comparison of the degree of association between two binary random variables over k different populations. Cochran (1954) has investigated this problem with respect to testing if the success probability for each of two treatments is the same for every contingency table. Cochran's recommendation is that the equality of the two success probabilities should be tested using the total number, summed over all tables, of successes for one of the treatments. Cochran considers the asymptotic distribution of the total number of successes, for one of the treatments, conditional on all marginals being fixed in every table. He recommends this technique whenever the difference between the two populations on a logistic or probit scale is nearly constant for each contingency table. The constant logistic difference is equivalent to the relative risk being equal for each table. Mantel & Haenszel (1959), in an important paper discussing retrospective studies, have also proposed an asymptotic method for analysing several 2 x 2 contingency tables. Their work on this problem was evidently done independently of Cochran, for their method is exactly the same as Cochran's except for a modification dealing with the correction factor associated with a finite population. Birch (1964) and Cox (1966) clarified the problem by showing that, under the assumption of constant logistic differences for each table (the same relative risk), the conditional distribution of the total number of successes, for one of the treatments, leads to a uniformly most powerful unbiased test. Birch and Cox also derived the exact probability distribution of this conditional random variable under the given model. In this paper, we investigate the more general situation where the difference between the logits in each table is not necessarily constant. Procedures are derived for making an inference with regard to the hypothesis of constant logistic differences. Both the exact and asymptotic distributions are derived for the null and nonnull cases. This problem has been discussed by several investigators. A constant logistic difference corresponds to no interaction between the treatments and the k populations. For the case k = 2, Bartlett (1935) has derived both an exact and an asymptotic procedure. Norton (1945)

286 citations
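The exact conditional test derived in the paper is not reproduced here; as a rough asymptotic stand-in for the same null hypothesis, the sketch below checks constancy of the odds ratio (constant logistic difference) across k 2 x 2 tables with Woolf's inverse-variance homogeneity statistic. The cell layout, the half-count continuity correction, and the example counts are assumptions of the sketch.

    import numpy as np
    from scipy.stats import chi2

    def woolf_homogeneity(tables):
        """Asymptotic test that all 2x2 tables share one odds ratio.

        tables: iterable of (a, b, c, d) cell counts. 0.5 is added to every
        cell to keep the log odds ratio finite (a common ad hoc correction).
        """
        tables = np.asarray(tables, dtype=float) + 0.5
        a, b, c, d = tables.T
        log_or = np.log(a * d / (b * c))
        w = 1.0 / (1 / a + 1 / b + 1 / c + 1 / d)    # inverse-variance weights
        pooled = np.sum(w * log_or) / np.sum(w)
        stat = np.sum(w * (log_or - pooled) ** 2)    # ~ chi2 with k-1 df under H0
        df = len(tables) - 1
        return stat, chi2.sf(stat, df)

    print(woolf_homogeneity([(10, 20, 15, 25), (8, 12, 9, 30), (20, 10, 18, 12)]))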


Journal ArticleDOI
TL;DR: In this paper, a maximum likelihood method is described for fitting, and testing the fit of, a log-normal function to grouped particle-size data; it remains applicable even when the numbers of particles observed in some size groups are small or zero.

96 citations
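A sketch of the kind of grouped-data fit the TL;DR refers to, not necessarily the author's exact procedure: the class counts are treated as multinomial with cell probabilities given by the log-normal distribution evaluated at the class boundaries, and the parameters are chosen to maximize that likelihood. The bin edges, counts, and starting values are illustrative.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import norm

    def fit_grouped_lognormal(edges, counts):
        """ML fit of a log-normal to particle counts grouped into size classes.

        edges: class boundaries (len = len(counts) + 1); the first may be 0 and
        the last may be np.inf for open-ended classes.
        """
        edges, counts = np.asarray(edges, float), np.asarray(counts, float)
        with np.errstate(divide="ignore"):
            log_edges = np.log(edges)          # log(0) -> -inf is intended here

        def neg_loglik(params):
            mu, log_sigma = params
            p = np.diff(norm.cdf((log_edges - mu) / np.exp(log_sigma)))
            return -np.sum(counts * np.log(np.clip(p, 1e-300, None)))

        res = minimize(neg_loglik, x0=[np.log(np.median(edges[1:-1])), 0.0],
                       method="Nelder-Mead")
        mu, sigma = res.x[0], np.exp(res.x[1])
        expected = counts.sum() * np.diff(norm.cdf((log_edges - mu) / sigma))
        gof = np.sum((counts - expected) ** 2 / expected)   # ~ chi2, df = classes - 3
        return mu, sigma, gof

    print(fit_grouped_lognormal([0, 1, 2, 4, 8, np.inf], [3, 12, 30, 18, 5]))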



Journal ArticleDOI
TL;DR: The purpose of the present paper is to show that it is possible to view certain single-subject designs in such a way as to make the technique of Analysis of Variance applicable.
Abstract: IN Psychology, there has been a long-standing conflict between single-subject and multisubject researchers, which tends to center about the question of whether or not statistics is really useful in single-subject research. Such writers as Sidman (1961) and Skinner (1953) emphasize precisely controlled, single-subject experiments as the most fruitful experimental approach in Psychology, with the role of statistics being limited to the use of elementary descriptive statistics such as means and standard deviations. The powerful, multivariate methods of modern statistics are considered to be inapplicable because they are primarily designed to deal with groups instead of individuals and because their averaging out processes tend to obscure individual differences. Other writers, such as Underwood (1957), argue that the best experimental approach in Psychology is to study groups of subjects to which the modern statistical inference methods may be applied. The purpose of the present paper is to show that it is possible to view certain single-subject designs in such a way as to make the technique of Analysis of Variance applicable.

83 citations


Journal ArticleDOI
TL;DR: In this paper, contingency tables analogous to the well-known mixed model in the analysis of variance are considered, and the hypotheses of equality of the one-dimensional marginal distributions and, when the categories can be scaled, of the mean scores over the first-order marginals are tested.
Abstract: This paper is concerned with contingency tables which are analogous to the well-known mixed model in analysis of variance. The corresponding experimental situation involves exposing each of n subjects to each of the d levels of a given factor and classifying the d responses into one of r categories. The resulting data are represented in an r X r X ... X r contingency table of d dimensions. The hypothesis of principal interest is equality of the one-dimensional marginal distributions. Alternatively, if the r categories may be quantitatively scaled, then attention is directed at the hypothesis of equality of the mean scores over the d first-order marginals. Test statistics are developed in terms of minimum Neyman X2 or, equivalently, weighted least squares analysis of underlying linear models. As such, they bear a strong resemblance to the Hotelling T2 procedures used with continuous data in mixed models. Several numerical examples are given to illustrate the use of the various methods discussed.

68 citations
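The minimum Neyman chi-square machinery of the paper is not reproduced here; the sketch below illustrates the closely related Hotelling T2-style comparison the abstract mentions: each subject contributes d scored responses, and equality of the d marginal mean scores is tested through the d - 1 within-subject differences. The scores and simulated data are assumptions.

    import numpy as np
    from scipy.stats import f as f_dist

    def marginal_mean_score_test(scores):
        """Hotelling T2-type test that d marginal mean scores are equal.

        scores: (n subjects) x (d conditions) array of scored categorical
        responses. Works on the d-1 differences from the first condition.
        """
        scores = np.asarray(scores, float)
        n, d = scores.shape
        diffs = scores[:, 1:] - scores[:, [0]]             # n x (d-1)
        dbar = diffs.mean(axis=0)
        S = np.atleast_2d(np.cov(diffs, rowvar=False))
        t2 = n * dbar @ np.linalg.solve(S, dbar)
        f_stat = (n - d + 1) / ((n - 1) * (d - 1)) * t2    # F conversion, exact under normality only
        p = f_dist.sf(f_stat, d - 1, n - d + 1)
        return t2, p

    rng = np.random.default_rng(0)
    x = rng.integers(1, 4, size=(40, 3)).astype(float)     # scores 1..3, d = 3 conditions
    print(marginal_mean_score_test(x))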


Journal ArticleDOI
David F. Andrews1
TL;DR: In this article, the distribution of residuals from linear regression models is used to construct exact tests of significance, which are then applied to the problem of testing for the presence of one or more outliers.
Abstract: SUMMARY The known distribution of residuals from linear regression models may be used to construct exact tests of significance. New tests for the presence of one or more outliers are considered in detail. Applications of the theory to other tests are discussed. Exact results are worked out for the normal and exponential error distributions; formulae are given for other nonnormal cases. All statistical tests are based on some model specifying the form or structure of the response. Linear models are a large and important class of the models currently used. The statistical tests based on linear models fall generally into two categories: (i) tests within the model that are sensitive to departures from some hypothesis about the parameters of the model; and (ii) tests of the model that are sensitive to departures from the assumptions of the model regardless of the parameters within the model. Tests of the latter type are based on the normalized residual vector, or some function of it, which has a known marginal distribution independent of the parameters of the model. Tests within the model are made conditionally given this ancillary residual vector. However, the distinction between these tests, at least for normal models, exists more in theory than in practice. Section 2 contains some preliminary definitions and results which lead, in § 3, to the distribution of residuals in both normal and nonnormal cases. In § 4 a class of significance tests based on the structure of the regression problem is proposed and in § 5 it is shown that the common analysis of variance tests for normal models belong to this class. In § 6 this theory is applied to the problem of testing for the presence of one or more outliers. Examples of the derived tests are given for normal and exponential cases. Finally, in § 7 the relation to other tests for nonadditivity and nonnormality is discussed.

41 citations
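Not one of the exact tests derived in the paper, but a familiar approximate relative: the sketch below flags a single outlier in a normal-error linear model using the maximum externally studentized residual with a Bonferroni-adjusted t reference. The design matrix and the planted outlier are illustrative.

    import numpy as np
    from scipy import stats

    def max_studentized_residual_test(X, y):
        """Approximate single-outlier test for y = X b + normal error.

        Returns the index of the most extreme observation, its externally
        studentized residual, and a Bonferroni-adjusted two-sided p-value.
        """
        X, y = np.asarray(X, float), np.asarray(y, float)
        n, p = X.shape
        H = X @ np.linalg.pinv(X)                 # hat matrix
        e = y - H @ y                             # ordinary residuals
        h = np.diag(H)
        s2 = e @ e / (n - p)
        # externally studentized (leave-one-out) residuals
        s2_i = ((n - p) * s2 - e ** 2 / (1 - h)) / (n - p - 1)
        t = e / np.sqrt(s2_i * (1 - h))
        i = np.argmax(np.abs(t))
        p_adj = min(1.0, n * 2 * stats.t.sf(abs(t[i]), n - p - 1))
        return i, t[i], p_adj

    rng = np.random.default_rng(3)
    X = np.column_stack([np.ones(30), rng.normal(size=30)])
    y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=30)
    y[7] += 4.0                                   # planted outlier
    print(max_studentized_residual_test(X, y))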


Journal ArticleDOI
TL;DR: In this paper, it was shown that the energy/frequency properties of non-stationary processes may be described in terms of evolutionary power spectra which have physical interpretations as local energy spectra, and which may be used for statistical analyses in much the same way as the spectra of stationary processes.

17 citations
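Priestley's evolutionary spectral estimator is not reproduced here; as a loose illustration of a "local energy spectrum", the sketch below estimates a time-varying spectrum of a non-stationary signal with a short-time (windowed) periodogram. The signal, sampling rate, and window settings are assumptions.

    import numpy as np
    from scipy.signal import chirp, spectrogram

    rng = np.random.default_rng(2)
    fs = 100.0
    t = np.arange(0, 60, 1 / fs)
    # non-stationary signal: instantaneous frequency sweeps from 5 Hz to 15 Hz
    x = chirp(t, f0=5.0, t1=t[-1], f1=15.0) + 0.5 * rng.normal(size=t.size)

    f, tau, Sxx = spectrogram(x, fs=fs, nperseg=256, noverlap=128)
    # Sxx[i, j] estimates the local power near frequency f[i] at time tau[j]
    print(Sxx.shape, f[Sxx[:, 0].argmax()], f[Sxx[:, -1].argmax()])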


Journal ArticleDOI
TL;DR: In this paper, the authors investigate which of the well-known tests for homogeneity of variances are unbiased and demonstrate the unbiased character for several sub-families of the Laue family.
Abstract: Consider the classical homogeneity of variances model. That is, suppose one has a one-way layout, with independent random samples of equal sizes in each column. Assume the samples are from normal populations with unknown means and unknown variances. One wishes to test the hypothesis that all the variances are equal. R. V. Laue (1965) defined a two-parameter family of statistics to test the homogeneity hypothesis. The family is defined as a function of the ratio of mean value functions of the sample variances. Included in the family are many of the well-known tests for homogeneity. In this paper the authors investigate which of the tests are unbiased. Although the authors are unable to resolve the question for every test in the family, they can demonstrate the unbiased character for several subfamilies, which include some of the better known tests. (Author)

16 citations
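One of the better-known homogeneity-of-variance tests the abstract alludes to is Bartlett's test; a from-scratch sketch for the one-way layout follows (whether a given test of this kind is unbiased is precisely the question the paper studies). The simulated groups are illustrative.

    import numpy as np
    from scipy.stats import chi2

    def bartlett_statistic(samples):
        """Bartlett's test of equal variances across k independent normal samples."""
        k = len(samples)
        n = np.array([len(s) for s in samples])
        s2 = np.array([np.var(s, ddof=1) for s in samples])
        N = n.sum()
        sp2 = np.sum((n - 1) * s2) / (N - k)                  # pooled variance
        stat = (N - k) * np.log(sp2) - np.sum((n - 1) * np.log(s2))
        c = 1 + (np.sum(1 / (n - 1)) - 1 / (N - k)) / (3 * (k - 1))
        return stat / c, chi2.sf(stat / c, k - 1)

    rng = np.random.default_rng(5)
    groups = [rng.normal(0, sd, size=20) for sd in (1.0, 1.0, 2.0)]
    print(bartlett_statistic(groups))           # compare with scipy.stats.bartlett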


Journal ArticleDOI
TL;DR: In this paper, a concept-learning model, which includes assumptions about the focus of attention, decision making, and memory for stimulus information and prior decisions, was developed and tested in two conjunctive attribute identification tasks.

16 citations


Journal ArticleDOI
TL;DR: In this paper, the authors discuss the difficulty and the importance of developing training programs which adequately integrate theoretical and applied aspects of the analysis and interpretation of scientific research data, and the methodological challenges that matter in typical applications as well as in the theory of statistics.
Abstract: Everyone closely concerned with the field of statistics is familiar with recurrent discussions about the difficulty and the importance of developing training programs which adequately integrate the theoretical and applied aspects of the analysis and interpretation of scientific research data. It now is increasingly clear that the difficult problems of training in statistics are linked with live substantive problems involving several aspects of theoretical and applied statistics. As statistical concepts and techniques have found broader and deeper roles in various disciplines during recent years, they have also encountered more sophisticated scientific-methodological challenges. These challenges concern the work-a-day techniques of data analysis, and the “elementary” concepts of interpretation of statistical research data, in ways which do matter in typical applications as well as in the theory of statistics. The following represent only a few of the kinds of challenges referred to: 1. Can we give a coherent definition of a best test of a statistical hypothesis, which is compatible with some systematic theory and also represents adequately usual applications? What about: randomized best tests? conditional tests? interpreting power along with significance level in data analysis? subject-matter significance vs. formal statistical significance? fixed-level theory vs. variable-level practice?

Journal ArticleDOI
TL;DR: In this paper, the assumptions underlying derivations of various indices of error of measurement and of such coefficients of reliability as KR-20 and KR-21 are reviewed and discussed.
Abstract: WHAT are the assumptions underlying derivations of various indices of error of measurement and such coefficients of reliability as KR-20 and KR-21? Over the years there has been a great variety of instructive discussion relating to questions of this kind (cf. Cronbach, 1951; Cronbach, Rajaratnam and Gleser, 1963; Gulliksen, 1950; Henrysson, 1959; Hoyt, 1941; Kuder and Richardson, 1937; Lord, 1955a; 1955b; 1957; 1959a; 1959b; 1962; Novick, 1966; Novick and Lewis, 1967; Penfield, 1967; Winer, 1962; and others). Recent articles by Lord have been particularly helpful in indicating the minimal assumptions under which a reliability coefficient may be obtained and in showing the basis upon which one would need to justify use of one standard error of measurement (SEM) for all scores. Yet it is true that interesting and important relationships between various derivations still remain somewhat obscure, and the implications of using various ways of estimating a standard error of measurement are by no means clear.
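For concreteness, the KR-20 coefficient discussed above can be computed directly from a matrix of dichotomously scored item responses; a minimal sketch, with a simulated response matrix rather than real test data, follows.

    import numpy as np

    def kr20(responses):
        """Kuder-Richardson formula 20 for dichotomously scored items.

        responses: (persons x items) array of 0/1 scores.
        """
        responses = np.asarray(responses, float)
        k = responses.shape[1]
        p = responses.mean(axis=0)                  # item difficulties
        var_total = responses.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - np.sum(p * (1 - p)) / var_total)

    rng = np.random.default_rng(7)
    ability = rng.normal(size=200)[:, None]
    items = rng.normal(size=12)[None, :]
    data = (ability + items + rng.normal(size=(200, 12)) > 0).astype(int)
    print(round(kr20(data), 3))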

Journal ArticleDOI
TL;DR: In this article, the authors warn against discarding statistical tests when passing on analytical results: the analyst producing and forwarding information has the scientific obligation to provide each important result with data on its reliability, since he must consider the possibility that another person might otherwise misinterpret his results.
Abstract: In the preceding parts of the Green Pages dealing with the simplest statistical arithmetic rules for judging the random error of the mean values obtained from a few repetitive measurements, some statistical tests were given. Experience has shown that a laboratory assistant will be able to readily apply statistical tests if he has access to suitably programmed desk-top calculators reducing his own manual calculation work to almost zero. The necessary work can be reduced even further by using some nomograms. If no small desk computers are available, then the analyst tends to ignore statistical mathematics and the information it would provide concerning the value of a result. The intention of the following two figures and the table is to urgently warn against discarding the statistical tests when passing on analytical results. The analyst producing and forwarding information has the scientific obligation to provide each important result with data on its reliability, because he must consider the possibility that otherwise a third person might incorrectly interpret his results. Particularly in those cases where scientific, economic, or politically relevant decisions are to be based on data supplied by an analyst, special attention has to be paid to the reliability of the results. Fig. 1 shows the relationship between the precision of analytical data that can be achieved, and the expense in the necessary analytical technique and time. The ordinate shows the necessary number n of repetitive measurements. On the abscissa we have plotted the difference between an analytical mean value x̄ and a nominal value W which, at the 99% confidence level, just indicates an ascertained difference between the measured value and the nominal value. Naturally, the smaller this difference, the better the analytical procedure. The quality of the ana-
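A rough illustration of the kind of relation plotted in Fig. 1, computed from assumed values rather than the article's data: the smallest difference between the mean of n repeat measurements and a nominal value W that can be ascertained at the 99% confidence level shrinks roughly as 1 over the square root of n.

    import numpy as np
    from scipy.stats import t

    s = 0.10        # assumed standard deviation of a single measurement
    for n in (2, 3, 5, 10, 20):
        # two-sided 99% one-sample t test: |xbar - W| must exceed this to be "ascertained"
        d_min = t.ppf(0.995, df=n - 1) * s / np.sqrt(n)
        print(f"n = {n:2d}   minimum detectable |xbar - W| = {d_min:.3f}")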



Book ChapterDOI
01 Jan 1971
TL;DR: There has been much controversy over the question of what computational arrangements are best suited to statistical computing; many packages have been offered for this purpose, with the developers sometimes advocating their exclusive use.
Abstract: There has been much controversy over the question of what computational arrangements are best suited to statistical computing. Many packages have been offered for this purpose, with the developers sometimes advocating their exclusive use. Opponents, in turn, have argued that the idea of a statistical package is unsound per se since it encourages routine application of standard analyses for hypothesis testing, estimating, and so on.

Journal ArticleDOI
TL;DR: In this article, it was shown that the extreme stable laws have one-sided moment-generating functions with interesting mathematical forms, such as exp (cz log z) and exp (r log r) for a density with decreasing failure rate.
Abstract: : It is shown that the extreme stable laws have one-sided moment-generating functions with interesting mathematical forms. The fact that one of these forms, exp (cz log z), is a moment-generating function is used to establish two interesting statistical results: first, that exp (r log r) is a moment sequence for a density with decreasing failure rate, and secondly, that the likelihood ratio test for testing a simple null hypothesis in a multinomial distribution is admissible and Bayes. (Author)

Journal ArticleDOI
TL;DR: It is suggested that pattern analysis, in addition to its usual function of simplification of complex data, may contribute to the analysis of grazing experiments in the special case in which there is reason to suspect the existence of an external non-random environmental factor.
Abstract: The complementary functions of pattern analysis and statistical analysis are discussed, with particular reference to the analysis of agricultural experiments. It is suggested that pattern analysis, in addition to its usual function of simplification of complex data, may contribute to the analysis of grazing experiments in the special case in which there is reason to suspect the existence of an external non-random environmental factor. Such a case is analysed completely; it is shown that the existence of such a factor can be established by intrinsic classification of entire liveweight sequences. The factor can then be partitioned out by principal coordinate analysis; its spatial configuration can be elucidated, and the extent of its contribution to the overall results assessed. Its optimum correlation with parallel botanical data can be established by canonical coordinate analysis. It is then possible to formulate causative hypotheses as to the nature of the factor; in the present case the most plausible hypothesis was that a small systematic change in tree density caused a progressive reduction in quantity and quality of herbage as we passed from the centre of the area to the periphery. This hypothesis could be used for further experiment and statistical test. Standard programmes for the entire analysis exist on the Control Data 3600 computer at Canberra.
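The principal coordinate analysis step mentioned in the abstract (classical scaling of a distance matrix) can be sketched in a few lines; the distance matrix below is simulated, not the grazing-trial data.

    import numpy as np

    def principal_coordinates(D, k=2):
        """Classical (principal coordinate) scaling of a distance matrix D."""
        D = np.asarray(D, float)
        n = D.shape[0]
        J = np.eye(n) - np.ones((n, n)) / n
        B = -0.5 * J @ (D ** 2) @ J                 # double-centred squared distances
        vals, vecs = np.linalg.eigh(B)
        order = np.argsort(vals)[::-1][:k]          # largest eigenvalues first
        return vecs[:, order] * np.sqrt(np.clip(vals[order], 0, None))

    rng = np.random.default_rng(11)
    points = rng.normal(size=(10, 3))
    D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    print(principal_coordinates(D).shape)           # (10, 2)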

Journal ArticleDOI
TL;DR: In this paper, the first four moments for each test statistic, conditional on observed samples, are obtained under an hypothesis implying interchangeability of observation vectors between the two samples, and the approximate sampling distributions for the test statistics are fitted yielding the approximate test procedures.
Abstract: : The multivariate, two-sample, nonparametric, location problem is considered. Emphasis is on approximate significance tests, important in applications since generally useful tables are precluded because of inherent correlations between variates. Multivariate randomization, ranksum and normal scores tests are considered. The first four moments for each test statistic, conditional on observed samples, are obtained under an hypothesis implying interchangeability of observation vectors between the two samples. Approximate sampling distributions for the test statistics are fitted yielding the approximate test procedures. While there is no way to evaluate the goodness of these approximations generally, a specific example is studied where Monte Carlo results are used to check the approximate methods. (Author)
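Generally useful tables are precluded, as the abstract notes; Monte Carlo randomization offers a direct, if computationally heavier, alternative to the moment-fitted approximations studied in the paper. The sketch below permutes group labels for a rank-sum-type multivariate statistic; the statistic and the simulated samples are illustrative choices, not the paper's.

    import numpy as np
    from scipy.stats import rankdata

    def multivariate_ranksum_permutation(x, y, n_perm=2000, seed=0):
        """Two-sample multivariate rank test by randomization.

        Statistic: squared Euclidean norm of the difference in mean rank
        vectors (ranks taken within each variate over the pooled sample).
        """
        rng = np.random.default_rng(seed)
        pooled = np.vstack([x, y])
        ranks = np.apply_along_axis(rankdata, 0, pooled)   # rank each variate
        n1 = len(x)

        def stat(r):
            return np.sum((r[:n1].mean(axis=0) - r[n1:].mean(axis=0)) ** 2)

        observed = stat(ranks)
        count = sum(stat(ranks[rng.permutation(len(pooled))]) >= observed
                    for _ in range(n_perm))
        return observed, (count + 1) / (n_perm + 1)

    rng = np.random.default_rng(4)
    x = rng.normal(0.0, 1.0, size=(15, 3))
    y = rng.normal(0.8, 1.0, size=(15, 3))          # shifted location
    print(multivariate_ranksum_permutation(x, y))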

Journal ArticleDOI
TL;DR: Reliable and Valid Hierarchical Classification attempts to select, from the available indices of association, the particular set that produces the most reliable and valid hierarchical solution.
Abstract: SOME methods of hierarchical classification use only the highest index of association of every object with every other object; other methods use all indices of association (McQuitty, 1967, 1968; McQuitty and Clark, 1968). A problem is to use that particular set of indices of association which produces the most reliable and valid solution. This is what Reliable and Valid Hierarchical Classification attempts to accomplish.

Journal ArticleDOI
TL;DR: In this article, the authors evaluate the economic costs of Type I and Type II errors for simple and composite hypotheses and discuss the relationship of the hypothesis-testing problem to Bayesian decision theory; the latter is felt to offer a more comprehensive framework for the design and use of hydrologic data networks.
Abstract: The efficiency of hydrologic data collection systems is relevant to solution of environmental problems, scientific understanding of hydrologic processes, model-building and management of water resources. Because these goals may be overlapping and non-commensurate, design of data networks is not simple. Identified are four elements of error or risk in such networks: (a) choice of variables and mathematical model for the same process, (b) accuracy of model parameter estimates, (c) acceptance of wrong hypothesis or rejection of correct hypothesis and (d) economic losses associated with error. Of these four, the classical hypothesis testing problem is specifically evaluated in terms of costs of type I and II errors for simple and composite hypotheses; mathematical models for these economic analyses also include costs of sample data and costs of waiting while new data is obtained. An illustrative computational example focuses on the hypothesis that natural recharge might be augmented by a system of pumping wells along an ephemeral channel. The relationship of the hypothesis testing problem to Bayesian decision theory is discussed; it is felt that the latter theory offers a more comprehensive framework for design and use of hydrologic data networks.
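A toy version of the economic evaluation described, with assumed costs, a one-sided normal-mean test, and a single design alternative: total expected cost combines the cost of a Type I error, the cost of a Type II error, and the cost of collecting n samples, and scanning n exposes the trade-off.

    import numpy as np
    from scipy.stats import norm

    # assumed, illustrative inputs
    c1, c2, c_sample = 50_000.0, 120_000.0, 400.0   # cost of type I, type II, per sample
    alpha, delta, sigma = 0.05, 0.5, 1.0            # test level, design effect, std dev

    best = None
    for n in range(5, 201):
        z = norm.ppf(1 - alpha)
        beta = norm.cdf(z - delta * np.sqrt(n) / sigma)   # P(miss the shift delta)
        cost = c1 * alpha + c2 * beta + c_sample * n
        if best is None or cost < best[1]:
            best = (n, cost, beta)

    print("n = %d, expected cost = %.0f, type II risk = %.3f" % best)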

Journal ArticleDOI
TL;DR: In this paper, a mathematical model has been formulated, solely through the use of mathematical statistics, for the purpose of predicting river water quality without reference to the causal chemical, biological and physical relationships.
Abstract: A mathematical model has been formulated, solely through the use of mathematical statistics, for the purpose of predicting river water quality without reference to the causal chemical, biological and physical relationships. In a sense, this is a “black box” approach wherein, with a known input, one may reliably predict the output. The use of a main force statistical method for predicting river-water quality can provide accurate predictive information with a minimum of time and money expended if a sufficiently large data base is available for the river system in question. What has been lacking in the past is a model which is not only statistically significant but contains only those water quality parameters which contribute significantly to the estimation of the dependent variable. The description of the model given here covers the formulation procedure, data collection requirements, model hypothesis testing and significance procedures, and finally the validation methods employed in verifying the final model equations. A description of how the simulated results are employed in the forecasting procedure is also developed.
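A sketch in the spirit of the "retain only the significant water quality parameters" step described above: ordinary least squares with backward elimination on p-values, using statsmodels. The variable names, data, and 0.05 cut-off are placeholders, not the study's.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    def backward_eliminate(y, X, alpha=0.05):
        """Drop the least significant predictor until all remaining p-values are < alpha."""
        X = sm.add_constant(X)
        while True:
            fit = sm.OLS(y, X).fit()
            pvals = fit.pvalues.drop("const")
            if pvals.max() < alpha or len(pvals) == 1:
                return fit
            X = X.drop(columns=pvals.idxmax())      # remove the worst predictor

    rng = np.random.default_rng(8)
    X = pd.DataFrame(rng.normal(size=(120, 4)),
                     columns=["flow", "temp", "turbidity", "noise"])
    y = 2.0 * X["flow"] - 1.0 * X["temp"] + rng.normal(scale=0.5, size=120)
    print(backward_eliminate(y, X).params.round(2))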


Journal ArticleDOI
TL;DR: In this paper, the Johnson-Neyman technique is used to determine regions of significance and nonsignificance when the equal-slopes hypothesis is rejected; two programs have been reported which analyze aptitude-treatment interactions, but both are limited in scope.
Abstract: RECENT interest in aptitude-treatment research has revitalized procedures for determining interactions among group regressions. Among these procedures are the homogeneity of group regressions test and the Johnson-Neyman technique. The homogeneity of group regressions model (Walker and Lev, 1953; Edwards, 1968) tests the hypothesis that regression slopes are equal across treatments, while the Johnson-Neyman technique (1936) determines regions of significance and nonsignificance when the equal-slopes hypothesis is rejected. Two programs have been reported which analyze aptitude-treatment interactions; both, however, are limited in scope. Ter-
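The homogeneity-of-group-regressions test mentioned (equal slopes across treatments) amounts to testing the aptitude-by-treatment interaction in a linear model; a minimal statsmodels sketch follows, with placeholder variable names and simulated data. The Johnson-Neyman region computation itself is not shown.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    rng = np.random.default_rng(9)
    n = 80
    df = pd.DataFrame({
        "aptitude": rng.normal(size=n),
        "treatment": np.repeat(["A", "B"], n // 2),
    })
    # simulate different slopes in the two treatments
    slope = np.where(df["treatment"] == "A", 1.0, 0.4)
    df["outcome"] = slope * df["aptitude"] + rng.normal(scale=0.5, size=n)

    full = smf.ols("outcome ~ aptitude * treatment", data=df).fit()
    reduced = smf.ols("outcome ~ aptitude + treatment", data=df).fit()
    print(anova_lm(reduced, full))      # F test of the equal-slopes hypothesis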

Book ChapterDOI
Herman Rubin1
01 Jan 1971
TL;DR: In this article, the author considers testing the null hypothesis θ = 0 against the one-dimensional alternative θ ≠ 0, under the assumption that the problem is sufficiently regular, that is, that the likelihood function is sufficiently close to that of a sample from a normal distribution with mean θ and variance 1.
Abstract: The author considers the testing of the 'null hypothesis' θ = 0 against the one-dimensional alternative θ ≠ 0. In most problems, the investigator knows that θ = 0 is unreasonable, and would prefer to 'accept' θ = 0 if |θ| is sufficiently small. The assumption is made that the problem is sufficiently regular, that is, that the likelihood function is sufficiently close to that of a sample from a normal distribution with mean θ and variance 1, after normalization if necessary. A mathematical formulation of this problem is given and the solution is investigated. It is shown that a crude procedure based on a 'small sample' treatment and a 'very large sample' treatment can be very bad in the transition region; also, there is not enough information in those treatments to get robust results. (Author)

Journal ArticleDOI
TL;DR: Among the studies on the effect of test form on test scores reviewed in this paper is that of M. R. Dunstan (1963), who employed a method of factor analysis to examine the structure underlying scores obtained by Teachers' College students on expression and recognition tests of English skills.
Abstract: \"The use of tests\", wrote Dr. Romero of Argentine, \"permits the logical determination of aptitudes of the person to be educated, his abilities, his capacity-in short, it is a problem of measurement.\" (1968 p. 3.) Problems of measuring aptitudes and abilities are at present rather great. Errors of measurement within a test, coupled with fluctuations in a candidate's performance, necessitate that a test score be interpreted with caution. One of many possible variables influencing test scores was suggested by Ralph Tyler in 1930. He reported a correlation of '38 between two forms of the same test presented to 66 freshmen college students studying zoology. One form of the test required students to express, in their own words, suitable hypotheses that might be drawn from given facts. The other form, presented immediately after the expression test papers had been collected, entailed the selection of the most suitable hypotheses. After each set of given facts a number of hypotheses were presented from which students were required to select one. From his results, Tyler deduced that \"the ability to select the most reasonable inference from a given list is not the same as the ability to propose an original inference\" (pp. 477-8). A later study undertaken by A. S. Hurd (1932) showed, however, quite a high relationship, as indicated by a coefficient of '78 between parallel expression and recognition forms of a science test given to a large number of school children. As no details were included in the report regarding the age range and education level of the sample, these and other variables might have obscured the effect of test form in the ranking of pupils. M. R. Dunstan (1963) took a fresh approach to the study of test form. He employed a method of factor analysis to examine the structure underlying scores obtained by Teachers' College students on expression and recognition tests of English skills. Although his results showed evidence for an \"expression form\" factor in testing English, it should be pointed out that the materials on which the tests were based varied in content. Studies, such as those reported by J. M. Trenaman (1967) and M. Smilansky and L.

01 Oct 1971
TL;DR: It turns out that the inverted gamma distribution remains a good choice for a prior distribution, the Bayes tests are economical and easily implemented and tabulated, the Industry/Government interest in such cases is high, and the linear model is a plausible model for modifying prior distributions.
Abstract: : The final report is a result of a study performed as the second phase of a three phase effort to develop Bayesian Reliability Demonstration Tests (BRDT). The objectives of this, the second phase, were: Fit additional prior distributions; Develop a preliminary format for BRDT; Test the receptiveness to BRDT; Study methods of modifying prior distributions in the face of new data; Study the possible application of empirical Bayes methods to RDT. It turns out that the inverted gamma distribution remains a good choice for a prior distribution, the Bayes tests are economical and easily implemented and tabulated, the Industry/Government interest in such cases is high, and finally, the linear model is a plausible model for modifying prior distributions. (Author)
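A sketch of the kind of Bayesian reliability demonstration test described, under assumed numbers: with exponential lifetimes, an inverted gamma prior on the MTBF is conjugate, so the posterior after r failures in total test time T is available in closed form, and the item is accepted if the posterior probability of meeting the required MTBF is high enough.

    from scipy.stats import invgamma

    # assumed, illustrative numbers -- not the report's data
    alpha0, beta0 = 3.0, 2000.0      # prior on MTBF theta: inverted gamma(alpha0, beta0), hours
    r, T = 2, 3500.0                 # observed: r failures in T hours of testing
    theta_req = 800.0                # required MTBF
    accept_prob = 0.90               # demonstration requirement

    # exponential lifetimes => posterior is inverted gamma(alpha0 + r, beta0 + T)
    post = invgamma(alpha0 + r, scale=beta0 + T)
    p_meets = post.sf(theta_req)     # P(theta >= theta_req | data)
    print(f"posterior mean MTBF = {post.mean():.0f} h, "
          f"P(theta >= {theta_req:.0f}) = {p_meets:.3f}, "
          f"accept = {p_meets >= accept_prob}")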

Journal ArticleDOI
TL;DR: This chapter discusses analogous post hoc trend analysis procedures which may be employed following the rejection of the null hypothesis by a nonparametric test such as the Kruskal-Wallis one-way analysis of variance for rank data, the Friedman (1937) two-way analysis of variance for rank data, or the Cochran (1950) two-way analysis of variance for dichotomous data.
Abstract: A common procedure, following the rejection of the null hypothesis of equal expected values, is the use of orthogonal polynomials to estimate the magnitude of these trend comparisons. Recently, Marascuilo and McSweeney (1967) discussed analogous post hoc trend analysis procedures which may be employed following the rejection of the null hypothesis by a nonparametric test such as the Kruskal-Wallis (1952) one-way analysis of variance for rank data, the Friedman (1937) two-way analysis of variance for rank data, or the Cochran (1950) two-way analysis of variance for dichotomous data.
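A sketch in the spirit of the nonparametric trend comparisons discussed, though not necessarily Marascuilo and McSweeney's exact formulation: after a Kruskal-Wallis test, a linear contrast is applied to the group mean ranks, with its null variance taken as N(N+1)/12 times the sum of c_j^2/n_j. The contrast coefficients and simulated groups are illustrative.

    import numpy as np
    from scipy.stats import kruskal, norm, rankdata

    def ranked_linear_trend(groups, coeffs):
        """Post hoc trend contrast on mean ranks after a Kruskal-Wallis test.

        coeffs: contrast coefficients (e.g. orthogonal-polynomial values) summing
        to 0. No tie correction is applied to the null variance.
        """
        coeffs = np.asarray(coeffs, float)
        n = np.array([len(g) for g in groups])
        N = n.sum()
        ranks = rankdata(np.concatenate(groups))        # midranks handle ties
        idx = np.cumsum(np.r_[0, n])
        mean_ranks = np.array([ranks[idx[j]:idx[j + 1]].mean() for j in range(len(groups))])
        L = np.sum(coeffs * mean_ranks)
        var_L = N * (N + 1) / 12.0 * np.sum(coeffs ** 2 / n)
        z = L / np.sqrt(var_L)
        return z, 2 * norm.sf(abs(z))

    rng = np.random.default_rng(6)
    groups = [rng.normal(m, 1.0, 15) for m in (0.0, 0.4, 0.8, 1.2)]   # increasing location trend
    print("Kruskal-Wallis:", kruskal(*groups))
    print("linear trend on mean ranks:", ranked_linear_trend(groups, [-3, -1, 1, 3]))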

Journal ArticleDOI
TL;DR: Brief discussions concerning missing data, statistical tests, random number generation, and interpretation of results are presented, along with a review of the generation schemes that have been used in the stochastic generation of hydrologic data.
Abstract: The stochastic approach to watershed modeling refers to the techniques used to generate synthetic hydrologic data. These data may be used either for input to a parametric watershed model or to provide directly an estimate of the output of a hydrologic process. In both cases the basic techniques of the generation processes are the same. The type of process depends primarily on the purpose for which the data are being generated and on the quality and quantity of sample data. Techniques are presented which can be used to generate data for one or any number of variates. The data generated can be normal, skewed, or log normal, and include serial correlation. If two or more variates are involved, cross correlation may also be considered. Brief discussions concerning missing data, statistical tests, random number generation, and interpretation of results are presented along with a review of the generation schemes that have been used in the stochastic generation of hydrologic data.
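A minimal sketch of the kind of single-variate generation scheme reviewed: a lag-one autoregressive (Thomas-Fiering-type) model that preserves an assumed mean, standard deviation, and lag-one serial correlation. The parameter values are placeholders; exponentiating a generated normal series would give approximately log-normal data.

    import numpy as np

    def generate_ar1(n, mean, sd, rho, seed=0):
        """Generate a synthetic series with a given mean, sd and lag-1 correlation."""
        rng = np.random.default_rng(seed)
        x = np.empty(n)
        x[0] = rng.normal(mean, sd)
        for t in range(1, n):
            x[t] = mean + rho * (x[t - 1] - mean) + rng.normal() * sd * np.sqrt(1 - rho ** 2)
        return x

    flows = generate_ar1(500, mean=120.0, sd=35.0, rho=0.6, seed=42)
    print(round(flows.mean(), 1), round(flows.std(), 1),
          round(float(np.corrcoef(flows[:-1], flows[1:])[0, 1]), 2))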

Book ChapterDOI
Anthony D. Whalen1
01 Jan 1971