
Showing papers on "Statistical hypothesis testing" published in 1981


Journal ArticleDOI
TL;DR: In this paper, several procedures are proposed for testing the specification of an econometric model in the presence of one or more other models which purport to explain the same phenomenon.
Abstract: Several procedures are proposed for testing the specification of an econometric model in the presence of one or more other models which purport to explain the same phenomenon. These procedures are shown to be closely related, but not identical, to the non-nested hypothesis tests recently proposed by Pesaran and Deaton [7], and to have similar asymptotic properties. They are remarkably simple both conceptually and computationally, and, unlike earlier techniques, they may be used to test against several alternative models simultaneously. Some empirical results are presented which suggest that the ability of the tests to reject false hypotheses is likely to be rather good in practice.

1,599 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider some appropriate and inappropriate uses of coefficient kappa and alternative kappa-like statistics, restricting the discussion to the descriptive characteristics of these statistical methods.
Abstract: This paper considers some appropriate and inappropriate uses of coefficient kappa and alternative kappa-like statistics. Discussion is restricted to the descriptive characteristics of these statistics.

1,145 citations


Journal ArticleDOI
TL;DR: The effect of stratification and clustering on the asymptotic distributions of the standard Pearson chi-squared test statistics for goodness of fit and independence in a two-way contingency table, denoted $X^2$ and $X_I^2$ respectively, is investigated in this article.
Abstract: The effect of stratification and clustering on the asymptotic distributions of the standard Pearson chi-squared test statistics for goodness of fit (simple hypothesis) and independence in a two-way contingency table, denoted $X^2$ and $X_I^2$ respectively, is investigated. It is shown that both $X^2$ and $X_I^2$ are asymptotically distributed as weighted sums of independent $\chi^2_1$ random variables. The weights are then related to the familiar design effects (deffs) used by survey samplers. A simple correction to $X^2$, which requires only the knowledge of variance estimates (or deffs) for individual cells in the goodness-of-fit problem, is proposed, and empirical results on the performance of the corrected $X^2$ are provided. Empirical work on $X_I^2$ indicated that the distortion of the nominal significance level is substantially smaller with $X_I^2$ than with $X^2$. Some results under simple models for clustering are also given.
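
The first-order correction the abstract describes can be sketched as follows: compute the ordinary goodness-of-fit statistic, then deflate it by an average of the estimated cell design effects before referring it to the usual chi-squared reference distribution. The function, the cell counts, and the deff values below are illustrative assumptions, not material from the paper.

```python
import numpy as np

def corrected_gof_chisq(observed, expected_probs, deffs):
    """Pearson goodness-of-fit X^2 and a simple deff-based correction.

    observed       : observed cell counts from a complex survey sample
    expected_probs : cell probabilities under the null hypothesis
    deffs          : estimated design effects for the individual cells
    """
    observed = np.asarray(observed, dtype=float)
    n = observed.sum()
    expected = n * np.asarray(expected_probs, dtype=float)
    x2 = np.sum((observed - expected) ** 2 / expected)
    # First-order correction: deflate X^2 by an average design effect,
    # then compare with the usual chi-squared critical value.
    x2_corrected = x2 / np.mean(deffs)
    return x2, x2_corrected

# Illustrative numbers: 4 cells, clustering inflates the cell variances.
obs = [120, 95, 150, 135]
probs = [0.25, 0.25, 0.25, 0.25]
deffs = [1.8, 2.1, 1.6, 1.9]
print(corrected_gof_chisq(obs, probs, deffs))
```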

950 citations


Journal ArticleDOI
TL;DR: Test statistics, confidence intervals, and sample size calculations are discussed, and the required sample size may be larger for either null hypothesis formulation than for the other, depending on the specific assumptions made.

820 citations


Journal ArticleDOI
TL;DR: In this paper, a single-factor multivariate time series model is proposed to estimate the unobserved metropolitan wage rate for Los Angeles, based on observations of sectoral wages within the Standard Metropolitan Statistical Area.
Abstract: The paper formulates and estimates a single-factor multivariate time series model. The model is a dynamic generalization of the multiple indicator (or factor analysis) model. It is shown to be a special case of the general state space model and can be estimated by maximum likelihood methods using the Kalman filter algorithm. The model is used to obtain estimates of the unobserved metropolitan wage rate for Los Angeles, based on observations of sectoral wages within the Standard Metropolitan Statistical Area. Hypothesis tests, model diagnostics, and out-of-sample forecasts are used to evaluate the model.
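
A minimal numpy sketch of the kind of one-factor state-space model the abstract describes, with the Gaussian log-likelihood evaluated by a scalar-state Kalman filter. The parameterization (loadings lam, idiosyncratic variances sig2, AR coefficient phi, state innovation variance q) is a generic choice, not the paper's exact specification; maximum likelihood estimates would be obtained by numerically maximizing this function over the parameters.

```python
import numpy as np

def single_factor_loglik(y, lam, sig2, phi, q):
    """Gaussian log-likelihood of a one-factor state-space model
        y_t = lam * w_t + eps_t,   w_t = phi * w_{t-1} + eta_t,
    evaluated with a scalar-state Kalman filter.
    y: (T, k) array of sectoral series; lam, sig2: length-k arrays."""
    T, k = y.shape
    lam = np.asarray(lam, float)
    R = np.diag(np.asarray(sig2, float))
    w, P = 0.0, q / (1.0 - phi**2)          # stationary initial state
    ll = 0.0
    for t in range(T):
        y_hat = lam * w                      # one-step prediction
        F = np.outer(lam, lam) * P + R       # prediction covariance
        v = y[t] - y_hat                     # innovation
        Finv = np.linalg.inv(F)
        ll += -0.5 * (k * np.log(2 * np.pi)
                      + np.linalg.slogdet(F)[1] + v @ Finv @ v)
        # Kalman update of the scalar state, then time update
        K = P * lam @ Finv
        w = w + K @ v
        P = P - P * (lam @ Finv @ lam) * P
        w, P = phi * w, phi**2 * P + q
    return ll
```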

445 citations


Journal ArticleDOI
TL;DR: In this paper, the authors illustrate Bayesian and empirical Bayesian techniques that can be used to summarize the evidence in such data about differences among treatments, thereby obtaining improved estimates of the treatment effect in each experiment, including the one having the largest observed effect.
Abstract: Many studies comparing new treatments to standard treatments consist of parallel randomized experiments. In the example considered here, randomized experiments were conducted in eight schools to determine the effectiveness of special coaching programs for the SAT. The purpose here is to illustrate Bayesian and empirical Bayesian techniques that can be used to help summarize the evidence in such data about differences among treatments, thereby obtaining improved estimates of the treatment effect in each experiment, including the one having the largest observed effect. Three main tools are illustrated: 1) graphical techniques for displaying sensitivity within an empirical Bayes framework, 2) simple simulation techniques for generating Bayesian posterior distributions of individual effects and the largest effect, and 3) methods for monitoring the adequacy of the Bayesian model specification by simulating the posterior predictive distribution in hypothetical replications of the same treatments in the same eight schools.
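
In the normal hierarchical model usually used for such parallel experiments, the posterior-simulation idea mentioned in the abstract can be sketched in a few lines: draw the between-experiment standard deviation from its marginal posterior on a grid, then draw the common mean and the individual effects. The eight observed effects and standard errors below are placeholder numbers, not the SAT coaching data, and the flat priors are an assumption of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative observed effects and standard errors for 8 parallel experiments.
y     = np.array([22., 6., -4., 9., 0., 3., 15., 10.])
sigma = np.array([14., 10., 15., 11., 9., 12., 10., 17.])

def posterior_draws(y, sigma, n_draws=5000, tau_max=40.0):
    """Simulate the joint posterior of the per-experiment effects theta_j under
    y_j | theta_j ~ N(theta_j, sigma_j^2), theta_j | mu, tau ~ N(mu, tau^2),
    with flat priors on mu and tau (grid approximation for tau)."""
    taus = np.linspace(0.01, tau_max, 400)
    log_p = np.empty_like(taus)
    for i, tau in enumerate(taus):
        v = sigma**2 + tau**2
        mu_hat = np.sum(y / v) / np.sum(1 / v)
        v_mu = 1.0 / np.sum(1 / v)
        log_p[i] = (0.5 * np.log(v_mu) - 0.5 * np.sum(np.log(v))
                    - 0.5 * np.sum((y - mu_hat) ** 2 / v))
    p = np.exp(log_p - log_p.max())
    p /= p.sum()

    draws = np.empty((n_draws, len(y)))
    for d in range(n_draws):
        tau = taus[rng.choice(len(taus), p=p)]
        v = sigma**2 + tau**2
        mu_hat = np.sum(y / v) / np.sum(1 / v)
        mu = rng.normal(mu_hat, np.sqrt(1.0 / np.sum(1 / v)))
        b = sigma**2 / v                       # shrinkage toward the common mean
        draws[d] = rng.normal(b * mu + (1 - b) * y, np.sqrt(b * tau**2))
    return draws

theta = posterior_draws(y, sigma)
print("posterior means:", theta.mean(axis=0).round(1))
print("P(effect j is largest):",
      np.bincount(theta.argmax(axis=1), minlength=len(y)) / len(theta))
```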

263 citations


Posted Content
TL;DR: In this paper, the authors argue that if a theory which does not generate a complete specification of the regression test is nonetheless to have testable implications, those implications must be robust over the permissible alternative specifications.
Abstract: In most natural sciences (physics, chemistry, biology) theories are validated by controlled experiment. However, in other natural sciences (astronomy, meteorology), and in most social sciences, including economics, the data are characteristically generated not by experiment but by measurement of uncontrolled systems. In economics, theories take the form of restrictions on the models assumed to generate the data, and statistical methods replace experimental controls in testing these restrictions. And here is the difficulty: in economics, particularly macroeconomics, the theory used to derive tests ordinarily does not generate a complete specification of which variables are to be held constant when statistical tests are performed on the relation between the dependent variable and the independent variables of primary interest. Accordingly, in such cases there will be a set of often very different candidate regression-based tests, each of which has equal status with the others since each is based on a different projection of the same underlying multivariate model. Except in the unlikely event that the explanatory variables are mutually orthogonal, the conditional regression coefficients, which generally form the basis for the test statistic, will depend on the conditioning set. We conclude from this that, if a theory which does not generate a complete specification of the regression test is nonetheless to have testable implications, these implications must be robust over the permissible alternative specifications. If the restrictions indicated by the theory are satisfied in some projections, but not in others that have an equal claim to represent implications of the theory, one cannot conclude that the theory has been confirmed. The fact that the observable implications of valid theories must obtain over a broad (but usually incompletely specified) set of regressions rather than for a single regression introduces a large and unavoidable element of imprecision into hypothesis testing in macroeconomics. Generally it appears to be appropriate to weaken the statistical criterion for rejecting theories. Consider, for example, the theory of money demand, which will engage our attention in this paper. The Tobin-Baumol square root formula implies that the elasticity of money demand with respect to the interest rate is exactly one-half. But which interest rate? Should wealth be held constant? Inflation? In view of such uncertainties it would be inappropriate to insist in a literal-minded fashion on rejecting the Tobin-Baumol model if in some regression the measured interest elasticity differed from one-half by more than two standard deviations, and only then. Obviously a more flexible approach is called for. The practice has been to conclude that the statistical evidence is consistent with the Tobin-Baumol model as long as the interest rate coefficient is negative. If it is negative and significant, or negative and insignificantly different from minus one-half, that would provide somewhat stronger confirmation. But a positive coefficient, particularly a significantly positive coefficient, would be viewed as raising questions about the validity of the theory. In macroeconomics generally, as in the money demand application, the typical response to specification uncertainty has been to regard a theory as supported if the signs of the estimated coefficients agree with those expected from theory, and as disconfirmed otherwise. 
There is no theoretical justification for this procedure, but it seems to be a reasonable course to follow. The point that economic theory ordinarily generates incompletely specified statistical...
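
The informal criterion described above (regard the theory as supported if the coefficient of interest keeps the predicted sign across the permissible specifications) can be illustrated with a small exercise on synthetic data: fit the same regression of log money balances on an interest rate under every candidate conditioning set and inspect the sign of the interest-rate coefficient. All variable names and numbers below are fabricated for illustration; nothing here reproduces the paper's empirical work.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n = 200
# Synthetic data: log money balances depend on income, an interest rate,
# and possibly wealth and inflation; the regressors are correlated.
income, wealth, inflation = rng.normal(size=(3, n))
rate = 0.5 * income + rng.normal(size=n)
log_m = 1.0 + 0.8 * income - 0.5 * rate + 0.3 * wealth + rng.normal(scale=0.5, size=n)

controls = {"income": income, "wealth": wealth, "inflation": inflation}
for k in range(len(controls) + 1):
    for subset in itertools.combinations(controls, k):
        X = np.column_stack([np.ones(n), rate] + [controls[c] for c in subset])
        beta = np.linalg.lstsq(X, log_m, rcond=None)[0]
        # The sign-robustness criterion looks at beta[1] across all these fits.
        print(f"controls={subset or ('none',)}: rate coefficient = {beta[1]:+.3f}")
```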

243 citations


Book
01 Jan 1981
TL;DR: The Where, Why and How of Data Collection and how to describe data using Numerical Measures are explained.
Abstract: Chapter 1: The Where, Why and How of Data Collection Chapter 2: Graphs, Charts, and Tables - Describing Your Data Chapter 3: Describing Data Using Numerical Measures Chapter 4: Introduction to Probability Chapter 5: Introduction to Discrete Probability Distributions Chapter 6: Introduction to Continuous Probability Distributions Chapter 7: Introduction to Sampling Distributions Chapter 8: Estimating Single Population Parameters Chapter 9: Introduction to Hypothesis Testing Chapter 10: Estimation and Hypothesis Testing for Two Population Parameters Chapter 11: Hypothesis Tests for One and Two Population Variances Chapter 12: Analysis of Variance Chapter 13: Goodness-of-Fit Tests and Contingency Analysis Chapter 14: Introduction to Linear Regression and Correlation Analysis Chapter 15: Multiple Regression and Model Building Chapter 16: Analyzing and Forecasting Time-Series Data Chapter 17: Introduction to Nonparametric Statistics Chapter 18: Introduction to Quality and Statistical Process Control Chapter 19: Introduction to Decision Analysis

220 citations


Journal ArticleDOI
TL;DR: In this article, a class of new non-parametric test statistics is proposed for goodness-of-fit or two-sample hypothesis testing problems when dealing with randomly right censored survival data.
Abstract: This paper proposes a class of new non-parametric test statistics useful for goodness-of-fit or two-sample hypothesis testing problems when dealing with randomly right censored survival data. The procedures are especially useful when one desires sensitivity to differences in survival distributions that are particularly evident at at least one point in time. This class is also sufficiently rich to allow certain statistics to be chosen which are very sensitive to survival differences occurring over a specified period of interest. The asymptotic distribution of each test statistic is obtained and then employed in the formulation of the corresponding test procedure. Size and power of the new procedures are evaluated for small and moderate sample sizes using Monte Carlo simulations. The simulations, generated in the two sample situation, also allow comparisons to be made with the behavior of the Gehan-Wilcoxon and log-rank test procedures.
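
For orientation, here is a generic weighted log-rank statistic for two right-censored samples; with constant weights it reduces to the log-rank test, and with the number at risk as the weight it gives the Gehan-Wilcoxon test mentioned in the abstract. This is only a sketch of the family of statistics involved, not the specific class proposed in the paper.

```python
import numpy as np

def weighted_logrank(time, event, group, weights=None):
    """Generic weighted log-rank statistic for two right-censored samples.
    time, event (1=failure, 0=censored), group (0/1): equal-length arrays.
    weights: optional function of (event time, number at risk); constant
    weights give the log-rank test, n-at-risk gives Gehan-Wilcoxon."""
    time = np.asarray(time, float)
    event = np.asarray(event, int)
    group = np.asarray(group, int)
    if weights is None:
        weights = lambda t, n_at_risk: 1.0
    num, var = 0.0, 0.0
    for t in np.unique(time[event == 1]):
        at_risk = time >= t
        n = at_risk.sum()
        n1 = (at_risk & (group == 1)).sum()
        d = ((time == t) & (event == 1)).sum()
        d1 = ((time == t) & (event == 1) & (group == 1)).sum()
        w = weights(t, n)
        # observed minus expected deaths in group 1, hypergeometric variance
        num += w * (d1 - d * n1 / n)
        if n > 1:
            var += w**2 * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return num / np.sqrt(var)   # compare with a standard normal reference
```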

164 citations


Journal ArticleDOI
TL;DR: This paper illustrates how problems of canonical correlation can be resolved by expressing canonical correlation as a special case of a linear structural relations model.
Abstract: Canonical correlation analysis is commonly considered to be a general model for most parametric bivariate and multivariate statistical methods. Because of its capability for handling multiple criteria and multiple predictors simultaneously, canonical correlation analysis has a great deal of appeal and has also enjoyed increasing application in the behavioral sciences. However, it has also been plagued by several serious shortcomings. In particular, researchers have been unable to determine the statistical significance of individual parameter estimates or to relax assumptions of the canonical model that are inconsistent with theory and/or observed data. As a result, canonical correlation analysis has found more application in exploratory research than in theory testing. This paper illustrates how these problems can be resolved by expressing canonical correlation as a special case of a linear structural relations model.
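
For reference, the classical computation of canonical correlations (which, as the abstract notes, does not by itself yield significance tests for individual parameters) is the singular value decomposition of the whitened cross-covariance matrix. A minimal numpy sketch, not the structural-relations reformulation the paper develops:

```python
import numpy as np

def canonical_correlations(X, Y):
    """Sample canonical correlations between the columns of X and Y,
    computed as singular values of the whitened cross-covariance matrix."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1)
    Syy = Y.T @ Y / (n - 1)
    Sxy = X.T @ Y / (n - 1)

    def inv_sqrt(S):
        # inverse symmetric square root via an eigendecomposition
        vals, vecs = np.linalg.eigh(S)
        return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

    M = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    return np.linalg.svd(M, compute_uv=False)
```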

153 citations


Journal ArticleDOI
TL;DR: The most costly phase of statistical design, statistical simulation, may be carried out only once, and equivalent or superior designs for intermediate size networks are obtained with less computational effort than previously published methods.
Abstract: A new statistical circuit design centering and tolerancing methodology based on a synthesis of concepts from network analysis, recent optimization methods, sampling theory, and statistical estimation and hypothesis testing is presented. The method permits incorporation of such realistic manufacturing constraints as tuning, correlation, and end-of-life performance specifications. Changes in design specifications and component cost models can be handled with minimal additional computational requirements. A database containing the results of a few hundred network analyses is first constructed. As the nominal values and tolerances are changed by the optimizer, each new yield and its gradient are evaluated by a new method called Parametric sampling without resorting to additional network analyses. Thus the most costly phase of statistical design, statistical simulation, may be carried out only once, which leads to considerable computational efficiency. Equivalent or superior designs for intermediate size networks are obtained with less computational effort than previously published methods. For example, a worst-case design for an eleventh-order Chebychev filter gives a filter cost of 44 units, a centered worst-case design reduces the cost to 18 units and statistical design using Parametric sampling further reduces the cost to 5 units (800 analyses, 75 CPU seconds on an IBM 370/158).
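
The key computational idea, re-evaluating yield for new nominal values and tolerances from a fixed database of circuit analyses, can be sketched with importance weights: sample component values once from a broad density, store the pass/fail results, then reweight them under each candidate component density. The toy pass/fail function and distributions below are stand-ins for the expensive network analyses, and this is only the general reweighting idea, not the paper's Parametric sampling method in detail.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def expensive_circuit_passes(x):
    # Stand-in for a network analysis: accept if a fictitious performance
    # measure of the two component values meets specification.
    return (x[:, 0] + 0.5 * x[:, 1] > 1.0) & (x[:, 0] * x[:, 1] < 2.0)

# 1. Build the database once: sample from a broad density q, store results.
q = stats.norm(loc=[1.0, 1.0], scale=[0.3, 0.3])
db_x = q.rvs(size=(5000, 2), random_state=rng)
db_pass = expensive_circuit_passes(db_x)          # costly step, done once
log_q = q.logpdf(db_x).sum(axis=1)

# 2. Re-estimate yield for any candidate nominal/tolerance via importance weights.
def yield_estimate(nominal, tol):
    p = stats.norm(loc=nominal, scale=tol)
    w = np.exp(p.logpdf(db_x).sum(axis=1) - log_q)
    return np.mean(db_pass * w)

print(yield_estimate([1.1, 0.9], [0.10, 0.10]))
print(yield_estimate([1.2, 1.0], [0.05, 0.05]))
```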

Journal ArticleDOI
TL;DR: In this paper, the Lagrange multiplier approach is used to compare linear and log-linear models against the more general alternative of the extended Box-Cox (1964) regression model considered by Savin and White (1978).
Abstract: It is often a problem to know what functional form to choose when specifying an econometric model since economic theory does not usually provide a very precise guide. The choice of functional form may, however, have important implications for subsequent statistical tests, for forecasts and for policy analysis, e.g. see Hall (1978) and Mizon (1977). Due to their simplicity, the specifications most commonly used are the linear and log-linear models. Sometimes the estimates of these two variants are compared with a view to choosing one of them as the "correct" representation. Although this comparison may be of interest in certain cases, in others it may be more appropriate to test linearity or log-linearity against a more general functional form rather than against each other. In this paper, we discuss two approaches to testing the adequacy of the linear and log-linear specifications against the more general alternative of the extended Box-Cox (1964) regression model considered by Savin and White (1978). The first of these procedures is based on the Lagrange multiplier approach discussed by Breusch and Pagan (1980) and by Godfrey and Wickens (1980), while the second is derived from work by Andrews (1971) on the selection of data transformations. Both approaches lead to tests which are easy to compute and which can reject both models as well as being capable of selecting one form rather than the other. The paper is set out as follows. In Section 2 we compare some existing procedures for testing the functional form of linear and log-linear models. In Section 3 we derive new large sample tests which are based on the Lagrange multiplier approach. The possibility of using small sample tests is discussed in Section 4 and a numerical example to illustrate the use of our new tests is given in Section 5.
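
As a simplified illustration of testing the linear and log-linear specifications inside a Box-Cox family (using the concentrated likelihood and likelihood-ratio statistics rather than the paper's Lagrange multiplier tests, and transforming only the dependent variable), one might proceed as follows; the data-generating process is invented for the example.

```python
import numpy as np

def boxcox_profile_loglik(y, X, lam):
    """Concentrated log-likelihood of a Box-Cox regression y(lam) = X b + e,
    with y(lam) = (y**lam - 1)/lam (log y at lam = 0); y must be positive."""
    n = len(y)
    ylam = np.log(y) if lam == 0 else (y**lam - 1.0) / lam
    resid = ylam - X @ np.linalg.lstsq(X, ylam, rcond=None)[0]
    rss = resid @ resid
    # Jacobian term (lam - 1) * sum(log y) makes likelihoods comparable across lam.
    return -0.5 * n * np.log(rss / n) + (lam - 1.0) * np.log(y).sum()

# Compare linear (lam = 1) and log-linear (lam = 0) against the unrestricted
# maximum over a grid, via likelihood-ratio statistics.
rng = np.random.default_rng(3)
x = rng.uniform(1, 10, size=200)
y = np.exp(0.5 + 0.3 * np.log(x) + rng.normal(scale=0.2, size=200))
X = np.column_stack([np.ones_like(x), x])
grid = np.linspace(-1, 2, 61)
ll = np.array([boxcox_profile_loglik(y, X, l) for l in grid])
for lam0, name in [(1.0, "linear"), (0.0, "log-linear")]:
    lr = 2 * (ll.max() - boxcox_profile_loglik(y, X, lam0))
    print(f"{name}: LR statistic vs. chi-squared(1) = {lr:.2f}")
```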

Journal ArticleDOI
TL;DR: In this article, the authors developed tests for independence of rows and columns in an $r \times s$ contingency table from canonical correlation analysis and from models of linear-by-linear interaction.
Abstract: Tests for independence of rows and columns in an $r \times s$ contingency table are developed from canonical correlation analysis and from models of linear-by-linear interaction. The resulting test statistics are asymptotically equivalent under the null hypothesis. They are consistent and asymptotically unbiased. Approximate critical values are available from existing tables. The proposed tests are most appropriate when the matrix of joint probabilities is well approximated by a matrix of rank 2. Against some alternatives which may arise in such tables, the proposed statistics have greater asymptotic power than conventional chi-square tests of independence.

Journal ArticleDOI
TL;DR: Only recently have statistical models and methods been developed that begin to meet the need for proper handling of sampling variation, measurement errors, and other kinds of uncertainty in network data.
Abstract: Sociometric problems involving empirical sociograms and more general networks have been one of the major sources of an increasing interest in statistical graph models. The well-known book by Harary, Norman, and Cartwright (1965) has contributed much to the mathematical modeling of social networks, and we now find numerous methodological articles appearing in the literature, for instance in the Journal of Mathematical Sociology, Social Networks, and other professional journals. Nonetheless, while statistical testing and estimation problems are discussed in some reports, statistical issues are all too often ignored or treated in a very elementary way. Only recently have statistical models and methods been developed that begin to meet the need for proper handling of sampling variation, measurement errors, and other kinds of uncertainty in network data.


Journal ArticleDOI
TL;DR: The nature and form of the restrictions implied by the rational expectations hypothesis are examined in a variety of models with expectations and the properties of appropriate test statistics are analyzed with Monte Carlo evidence.

Journal ArticleDOI
TL;DR: In this article, a statistical model for asymmetric time series is developed and analyzed, and an estimation procedure is given as well as a statistical test of the hypothesis of symmetry versus the alternative of asymmetry.
Abstract: Asymmetric time series respond to innovations with one of two different rules according to whether the innovation is positive or negative. Quoted industrial prices are apparently such a series. It has been observed that when market conditions change, quoted prices are not revised immediately. This delay operates more strongly against reductions in price quotations than against increases. A statistical model for such asymmetric time series is developed and analyzed. An estimation procedure is given as well as a statistical test of the hypothesis of symmetry versus the alternative of asymmetry. Asymmetric time series models are fit to several economic time series.
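
A crude illustration of the idea (not the paper's estimator or test): proxy the innovations by autoregressive residuals, allow separate coefficients on their positive and negative lagged parts, and test whether the two coefficients are equal. All numbers below are simulated.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulate a series whose response to positive innovations is stronger
# than to negative ones (illustrative data-generating process).
T = 500
e = rng.normal(size=T)
x = np.zeros(T)
for t in range(1, T):
    x[t] = 0.4 * x[t - 1] + 0.8 * max(e[t - 1], 0) + 0.2 * min(e[t - 1], 0) + e[t]

# Two-step check of symmetry: AR(1) residuals stand in for the innovations,
# then positive and negative parts get separate coefficients.
X_ar = np.column_stack([np.ones(T - 1), x[:-1]])
resid = x[1:] - X_ar @ np.linalg.lstsq(X_ar, x[1:], rcond=None)[0]
yy = x[2:]
Z = np.column_stack([np.ones(T - 2), x[1:-1],
                     np.maximum(resid[:-1], 0), np.minimum(resid[:-1], 0)])
beta, res_ss = np.linalg.lstsq(Z, yy, rcond=None)[:2]
sigma2 = res_ss[0] / (len(yy) - Z.shape[1])
cov = sigma2 * np.linalg.inv(Z.T @ Z)
# Wald statistic for H0: equal response to positive and negative innovations
diff = beta[2] - beta[3]
wald = diff**2 / (cov[2, 2] + cov[3, 3] - 2 * cov[2, 3])
print("coefficients on e+ and e-:", beta[2].round(2), beta[3].round(2),
      "  Wald vs chi-squared(1):", round(wald, 2))
```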

Journal ArticleDOI
TL;DR: Three alternatives to the generally overconservative Bonferroni procedure are proposed, each of which may suit different conditions as set by the physician.

Journal ArticleDOI
TL;DR: In this article, the form of the Johnson-Neyman region of significance was determined by the statistic for testing the null hypothesis that the population within-group regressions are parallel.
Abstract: The form of the Johnson-Neyman region of significance is shown to be determined by the statistic for testing the null hypothesis that the population within-group regressions are parallel. Results are obtained for both simultaneous and nonsimultaneous regions of significance.


Journal ArticleDOI
TL;DR: In this article, the authors discuss a simple methodological point which provides the basis of one explanation of the rather surprising set of results suggesting that concentration is not a significant determinant of industry profitability in the UK.
Abstract: This paper discusses a simple methodological point which provides the basis of one explanation of the rather surprising set of results suggesting that concentration is not a significant determinant of industry profitability in the UK (see the survey by Hart and Clarke, 1979). The methodological point is the familiar one that, in fitting a regression, one is constraining the data to a specific functional form; thus, prior to hypothesis testing, one must ensure that the imposed form is an acceptable representation of the data. We shall apply this principle to the basic profits-concentration model, which is almost always unquestioningly taken to be linear. In Section I below, we shall outline the basic profits-concentration relationship, showing that linearity is not an obviously appealing assumption to make. In Section II, we shall discuss estimation and testing procedures given uncertainty concerning the appropriate functional form; Section III contains some experiments on UK data (which show that the relationship between profits and concentration is positive but not linear); and Section IV contains our principal conclusions.

Book ChapterDOI
Irving John Good
01 Jan 1981
TL;DR: The foundations of statistics are controversial, as foundations usually are, and the main controversy is between so-called Bayesian methods on the one hand and the non-Bayesian, or ‘orthodox’, or sampling-theory methods on the other.
Abstract: The foundations of statistics are controversial, as foundations usually are. The main controversy is between so-called Bayesian methods, or rather neo-Bayesian, on the one hand and the non-Bayesian, or ‘orthodox’, or sampling-theory methods on the other. The most essential distinction between these two methods is that the use of Bayesian methods is based on the assumption that you should try to make your subjective or personal probabilities more objective, whereas anti-Bayesians act as if they wished to sweep their subjective probabilities under the carpet. (See, for example, Good (1976).) Most anti-Bayesians will agree, if asked, that they use judgment when they apply statistical methods, and that these judgments must make use of intensities of conviction, but that they would prefer not to introduce numerical intensities of conviction into their formal and documented reports. They regard it as politically desirable to give their reports an air of objectivity and they therefore usually suppress some of the background judgments in each of their applications of statistical methods, where these judgments would be regarded as of potential importance by the Bayesian. Nevertheless, the anti-Bayesian will often be saved by his own common sense, if he has any. To clarify what I have just asserted, I shall give some examples in the present article.

Journal ArticleDOI
01 Jan 1981
TL;DR: In this article, a Pareto optimal solution is proposed to the problem of finding a joint decision procedure for a group of n persons, which maximizes the generalized Nash product over the set of jointly achievable utility n-vectors.
Abstract: A solution is proposed to the problem of finding a joint decision procedure for a group of n persons. It is any Pareto optimal solution which maximizes the generalized Nash product over the set of jointly achievable utility n-vectors. This result was originally proposed in the theory of bargaining but is readily adapted to the statistical context. The individuals involved need not have identical utility functions or identical prior (posterior) distributions. The solution may be a non-randomized rule but is randomized when the individual opinions or preferences are sufficiently diverse. Applications to hypothesis testing and estimation are included.

Journal ArticleDOI
TL;DR: In this article, an extension of the influence curve to non-Fisher-consistent functionals is proposed in order to investigate the infinitesimal robustness of more general statistics, e.g. those used in hypothesis testing.

Journal ArticleDOI
TL;DR: In this article, the problem of testing statistical hypotheses in nonlinear regression models with inequality constraints on the parameters is considered, and it is shown that the distributions of the Kuhn-Tucker, likelihood ratio, and Wald test statistics converge to the same mixture of chi-square distributions under the null hypothesis.

Journal ArticleDOI
TL;DR: In this paper, the asymptotic distribution theory of test statistics which are functions of spacings is studied, and the locally most powerful spacings tests are derived; for the two-sample problem, which is to test if two independent samples are from the same population, test statistics based on "spacing-frequencies" (i.e., the numbers of observations of one sample which fall in between the spacings made by the other sample) are utilized.
Abstract: The asymptotic distribution theory of test statistics which are functions of spacings is studied here. Distribution theory under appropriate close alternatives is also derived and used to find the locally most powerful spacing tests. For the two-sample problem, which is to test if two independent samples are from the same population, test statistics which are based on “spacing-frequencies” (i.e., the numbers of observations of one sample which fall in between the spacings made by the other sample) are utilized. The general asymptotic distribution theory of such statistics is studied both under the null hypothesis and under a sequence of close alternatives.
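
A classical example of a spacings statistic of the kind studied here is Greenwood's statistic, the sum of squared spacings of a sample that should be uniform under the null hypothesis; its null distribution is easy to approximate by Monte Carlo. This is only an illustration of the type of statistic involved, not the locally most powerful tests or the two-sample spacing-frequency statistics derived in the paper.

```python
import numpy as np

rng = np.random.default_rng(5)

def greenwood(u):
    """Greenwood's spacings statistic: sum of squared spacings of a sample
    that is uniform on (0, 1) under the null hypothesis."""
    d = np.diff(np.concatenate(([0.0], np.sort(u), [1.0])))
    return np.sum(d**2)

# Monte Carlo null distribution and a test of uniformity for one sample.
n = 50
null = np.array([greenwood(rng.uniform(size=n)) for _ in range(20000)])
sample = rng.beta(2, 2, size=n)       # clusters in the middle, so the two
stat = greenwood(sample)              # edge spacings inflate the statistic
p_value = np.mean(null >= stat)       # large values indicate uneven spacings
print(stat, p_value)
```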

Book ChapterDOI
TL;DR: In this article, the lowest rejected concentration tested (LRCT) is recommended as an experimental end point in place of the no-observed-effect level (NOEL), whose use violates the principle of negative inference, the logical basis of statistical hypothesis testing.
Abstract: Use of the no-observed-effect level (NOEL) as an experimental end point violates the principle of negative inference, the logical basis of statistical hypothesis testing. The NOEL is based on the nonrejection of the hypothesis of no toxic effect, whereas scientific methods stress conclusions based on the rejection of a hypothesis. Failure to reject this hypothesis may result either because the concentration is safe or because the experimental protocol to detect a toxic response is insensitive, but the cause is indeterminant and the error rate unknown. The lowest rejected concentration tested (LRCT) is recommended as an experimental end point. By the use of LRCT the rate of misclassification is known and is equal to α. Experimental designs that emphasize both the biological and the statistical significance of LRCT end points are recommended.

Journal ArticleDOI
TL;DR: In this article, pairwise preference data are represented as a monotone integral transformation of difference on the underlying stimulus-object or utility scale, and the parameters of the transformation and the underlying scale values or utilities are estimated by maximum likelihood with inequality constraints on the transformation parameters.
Abstract: Pairwise preference data are represented as a monotone integral transformation of difference on the underlying stimulus-object or utility scale. The class of monotone transformations considered is that in which the kernel of the integral is a linear combination of B-splines. Two types of data are analyzed: binary and continuous. The parameters of the transformation and the underlying scale values or utilities are estimated by maximum likelihood with inequality constraints on the transformation parameters. Various hypothesis tests and interval estimates are developed. Examples of artificial and real data are presented.

Journal ArticleDOI
TL;DR: In this article, an extension of the well-known beta binomial model, which incorporates non-stationarity of individuals' exposure probabilities, is used to account for these errors, providing insights into the sensitivity of various media statistics to nonstationarity.
Abstract: In the field of media research, the beta binomial has performed very well for estimating the distribution of the frequency of exposures to a media vehicle. However, long-term projections have shown consistent biases. The beta binomial geometric model, an extension of the well-known beta binomial model which incorporates non-stationarity of individuals' exposure probabilities, is able to account for these errors. In addition, this beta binomial geometric framework provides insights into the sensitivity of various media statistics to non-stationarity. This model is a particular operationalization of Howard's general Dynamic Inference Model (Howard, R. A. 1965. Dynamic inference. Oper. Res. 13(2) 712-733). The paper focuses on applications to some television viewing and magazine readership data. The properties of the model, estimation of the parameters and statistical tests are also presented. Finally, some future research possibilities are discussed.
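
For context, the stationary beta binomial baseline that the paper extends can be fitted by maximum likelihood in a few lines; the exposure-frequency counts below are invented, and the geometric non-stationarity component of the paper's model is not included.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import betaln, comb

def beta_binomial_pmf(k, n, a, b):
    """P(K = k) for a beta-binomial exposure model: each person's exposure
    probability p ~ Beta(a, b), exposures K | p ~ Binomial(n, p)."""
    return comb(n, k) * np.exp(betaln(k + a, n - k + b) - betaln(a, b))

def fit_beta_binomial(counts, n):
    """Maximum-likelihood fit of (a, b) to observed exposure frequencies.
    counts[k] = number of people exposed k times out of n opportunities."""
    k = np.arange(n + 1)

    def nll(log_params):
        a, b = np.exp(log_params)
        return -np.sum(counts * np.log(beta_binomial_pmf(k, n, a, b)))

    res = minimize(nll, x0=np.log([1.0, 1.0]), method="Nelder-Mead")
    return np.exp(res.x)

# Illustrative frequency-of-exposure data for n = 4 issues of a magazine.
counts = np.array([300, 120, 60, 40, 30])
a_hat, b_hat = fit_beta_binomial(counts, n=4)
print(a_hat, b_hat)
```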

Journal ArticleDOI
TL;DR: In this paper, a Bayesian approach to nonlinear regression is implemented and evaluated in the light of practical as well as theoretical considerations, and numerical integration methods and the calculation of confidence regions using the posterior distributions are given.
Abstract: A traditional approach to parameter estimation and hypothesis testing in nonlinear models is based on least squares procedures. Error analysis depends on large-sample theory; Bayesian analysis is an alternative approach that avoids substantial errors which could result from this dependence. This communication is concerned with the implementation of the Bayesian approach as an alternative to least squares nonlinear regression. Special attention is given to the numerical evaluation of multiple integrals and to the behavior of the parameter estimators and their estimated covariances. The Bayesian approach is evaluated in the light of practical as well as theoretical considerations. 1. Introduction. The traditional approach to the statistical analysis of nonlinear models is first to use some numerical method to minimize the sum-of-squares objective function in order to obtain least squares estimators of the parameters (Draper and Smith, 1966, pp. 267-275; Nelder and Mead, 1965), and then to apply linear regression theory to the linear part of the Taylor series approximation of the model expanded about these estimators in order to obtain the asymptotic covariance matrix (Bard, 1974, pp. 176-179). The distributions of the estimators obtained in this manner are known only in the limit as the sample size approaches infinity (Jennrich, 1969). Hence, analyses based on these statistics may be inappropriate for small-sample problems, such as those arising in pharmacokinetics (Wagner, 1975). An alternative approach to the statistical analysis of nonlinear models is to utilize methods based on Bayes's theorem (Box and Tiao, 1973, pp. 1-73). Parameters are regarded as random variables rather than as unknown constants. If a nonlinear model with known error distribution is assumed, a correct probability analysis follows and asymptotic theory is not involved. In this communication a Bayesian approach to nonlinear regression is implemented and evaluated. Particular attention is given to numerical integration methods and the calculation of confidence regions using the posterior distributions.
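
A minimal sketch of the Bayesian alternative described, for an invented one-compartment decay model: with flat priors on the curve parameters and the error variance integrated out under a Jeffreys prior, the posterior is proportional to RSS^(-n/2) and can be evaluated on a grid to give posterior means and interval estimates without any appeal to large-sample theory. The model, priors, and numbers here are illustrative assumptions, not the authors' numerical-integration scheme.

```python
import numpy as np

rng = np.random.default_rng(6)

# Illustrative decay model y = A * exp(-k t) + noise, observed at 12 times.
t = np.linspace(0.5, 8, 12)
y = 10.0 * np.exp(-0.6 * t) + rng.normal(scale=0.4, size=t.size)

# Posterior over (A, k) on a grid, with flat priors on (A, k) and the error
# variance integrated out under a Jeffreys prior: p(A, k | y) ~ RSS(A, k)^(-n/2).
A_grid = np.linspace(5, 15, 200)
k_grid = np.linspace(0.1, 1.2, 200)
A, K = np.meshgrid(A_grid, k_grid, indexing="ij")
rss = ((y - A[..., None] * np.exp(-K[..., None] * t)) ** 2).sum(axis=-1)
log_post = -0.5 * len(y) * np.log(rss)
post = np.exp(log_post - log_post.max())
post /= post.sum()

print("posterior mean of A:", np.sum(post * A))
print("posterior mean of k:", np.sum(post * K))
# marginal 95% interval for k, obtained by summing the grid over A
k_marg = post.sum(axis=0)
cdf = np.cumsum(k_marg)
print("95% interval for k:", k_grid[np.searchsorted(cdf, 0.025)],
      k_grid[np.searchsorted(cdf, 0.975)])
```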