Topic

Statistical hypothesis testing

About: Statistical hypothesis testing is a research topic. Over its lifetime, 19,580 publications have been published within this topic, receiving 1,037,815 citations. The topic is also known as: confirmatory data analysis.


Papers
Journal ArticleDOI
TL;DR: In this paper, test statistics are proposed that can be used to test hypotheses about the parameters of the deterministic trend function of a univariate time series, and the tests are valid for I(0) and I(1) errors.
Abstract: In this paper test statistics are proposed that can be used to test hypotheses about the parameters of the deterministic trend function of a univariate time series. The tests are valid in the presence of general forms of serial correlation in the errors and can be used without having to estimate the serial correlation parameters either parametrically or nonparametrically. The tests are valid for I(0) and I(1) errors. Trend functions that are permitted include general linear polynomial trend functions that may have breaks at either known or unknown locations. Asymptotic distributions are derived, and consistency of the tests is established. The general results are applied to a model with a simple linear trend. A local asymptotic analysis is used to compute asymptotic size and power of the tests for this example. Size is well controlled and is relatively unaffected by the variance of the initial condition. Asymptotic power curves are computed for the simple linear trend model and are compared to existing tests. It is shown that the new tests have nontrivial asymptotic power. A simulation study shows that the asymptotic approximations are adequate for sample sizes typically used in economics. The tests are used to construct confidence intervals for average GNP growth rates for eight industrialized countries using post-war data.

337 citations
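To make the setup concrete, here is a minimal sketch of the textbook version of the problem: fit a linear trend by OLS and test the slope with a conventional t-statistic. The naive test below assumes i.i.d. errors, so it does not implement the paper's serial-correlation-robust statistics; the data-generating step is purely illustrative.

```python
# Naive linear-trend test: y_t = b0 + b1*t + e_t, H0: b1 = 0.
# NOTE: assumes i.i.d. errors; the paper's tests remain valid under
# serial correlation and I(0)/I(1) errors, which this sketch does NOT.
import numpy as np

def naive_trend_test(y):
    T = len(y)
    t = np.arange(1, T + 1, dtype=float)
    X = np.column_stack([np.ones(T), t])          # [intercept, trend]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # OLS estimates
    resid = y - X @ beta
    s2 = resid @ resid / (T - 2)                  # error variance estimate
    cov = s2 * np.linalg.inv(X.T @ X)             # OLS covariance (iid errors)
    t_stat = beta[1] / np.sqrt(cov[1, 1])         # t-statistic for the slope
    return beta[1], t_stat

rng = np.random.default_rng(0)
y = 0.5 * np.arange(100) + rng.normal(size=100)   # series with a true trend
slope, t_stat = naive_trend_test(y)
print(f"slope = {slope:.3f}, t = {t_stat:.2f}")
```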

Journal ArticleDOI
TL;DR: This paper discusses, from a philosophical perspective, the reasons for considering the power of any statistical test used in environmental biomonitoring, because Type II errors can be more costly than Type I errors for environmental management.
Abstract: This paper discusses, from a philosophical perspective, the reasons for considering the power of any statistical test used in environmental biomonitoring. Power is inversely related to the probability of making a Type II error (i.e. low power indicates a high probability of Type II error). In the context of environmental monitoring, a Type II error is made when it is concluded that no environmental impact has occurred even though one has. Type II errors have been ignored relative to Type I errors (the mistake of concluding that there is an impact when one has not occurred), the rates of which are stipulated by the α values of the test. In contrast, power depends on the value of α, the sample size used in the test, the effect size to be detected, and the variability inherent in the data. Although power ideas have been known for years, only recently have these issues attracted the attention of ecologists and have methods been available for calculating power easily. Understanding statistical power gives three ways to improve environmental monitoring and to inform decisions about actions arising from monitoring. First, it allows the most sensitive tests to be chosen from among those applicable to the data. Second, preliminary power analysis can be used to indicate the sample sizes necessary to detect an environmental change. Third, power analysis should be used after any nonsignificant result is obtained in order to judge whether that result can be interpreted with confidence or the test was too weak to examine the null hypothesis properly. Power procedures are concerned with the statistical significance of tests of the null hypothesis, and they lend little insight, on their own, into the workings of nature. Power analyses are, however, essential to designing sensitive tests and correctly interpreting their results. The biological or environmental significance of any result, including whether the impact is beneficial or harmful, is a separate issue. The most compelling reason for considering power is that Type II errors can be more costly than Type I errors for environmental management. This is because the commitment of time, energy and people to fighting a false alarm (a Type I error) may continue only in the short term until the mistake is discovered. In contrast, the cost of not doing something when in fact it should be done (a Type II error) will have both short- and long-term costs (e.g. ensuing environmental degradation and the eventual cost of its rectification). Low power can be disastrous for environmental monitoring programmes.

335 citations
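As a concrete illustration of the prospective power analysis the abstract recommends, the sketch below computes the approximate power of a two-sided one-sample z-test as a function of sample size, using the normal approximation. The effect size, alpha, and sample sizes are illustrative values, not figures from the paper.

```python
# Approximate power of a two-sided one-sample z-test: the probability of
# detecting a standardized effect of size d with n observations at level alpha.
from scipy.stats import norm

def z_test_power(d, n, alpha=0.05):
    """Power = P(|Z + d*sqrt(n)| > z_crit) under the alternative."""
    z_crit = norm.ppf(1 - alpha / 2)      # two-sided critical value
    shift = d * n ** 0.5                  # noncentrality under the alternative
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)

# Power rises with sample size: use this to pick n before monitoring begins.
for n in (10, 25, 50, 100):
    print(n, round(z_test_power(d=0.5, n=n), 3))
```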

Journal ArticleDOI
TL;DR: Skewed distributions play an important role in the analysis of data from quality and reliability experiments, as discussed by the authors; very often, unknown parameters must be estimated from the sample data in order to test whether the data has come from a certain family of distributions.
Abstract: Skewed distributions play an important role in the analysis of data from quality and reliability experiments. Very often, unknown parameters must be estimated from the sample data in order to test whether the data has come from a certain family of distributions.

334 citations
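The complication the abstract raises (testing fit to a family whose parameters were estimated from the same sample) invalidates the plain Kolmogorov-Smirnov p-value; a parametric bootstrap is one standard remedy. The sketch below illustrates that remedy; the Weibull family, sample sizes, and bootstrap count are illustrative choices, not the paper's.

```python
# Goodness-of-fit with estimated parameters: re-estimate on each bootstrap
# sample to mimic the estimation step, then compare KS statistics.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = stats.weibull_min.rvs(1.5, scale=2.0, size=200, random_state=rng)

# Fit the hypothesized family to the data, then compute the KS statistic.
c, loc, scale = stats.weibull_min.fit(x, floc=0)
d_obs = stats.kstest(x, stats.weibull_min(c, loc, scale).cdf).statistic

# Parametric bootstrap: the fit is repeated on each simulated sample, so the
# reference distribution of the KS statistic reflects parameter estimation.
d_boot = []
for _ in range(200):
    xb = stats.weibull_min.rvs(c, loc, scale, size=len(x), random_state=rng)
    cb, lb, sb = stats.weibull_min.fit(xb, floc=0)
    d_boot.append(stats.kstest(xb, stats.weibull_min(cb, lb, sb).cdf).statistic)
p_value = np.mean(np.array(d_boot) >= d_obs)
print(f"KS statistic = {d_obs:.4f}, bootstrap p-value = {p_value:.3f}")
```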

Journal ArticleDOI
TL;DR: In this article, a new specification test for IV estimators is developed, adopting a particular second-order approximation of Bekker; the test compares the forward (conventional) 2SLS estimator of the coefficient of the right-hand-side endogenous variable with the reverse 2SLS estimator of the same unknown parameter when the normalization is changed.
Abstract: We develop a new specification test for IV estimators adopting a particular second-order approximation of Bekker. The new specification test compares the forward (conventional) 2SLS estimator of the coefficient of the right-hand-side endogenous variable with the reverse 2SLS estimator of the same unknown parameter when the normalization is changed. Under the null hypothesis that conventional first-order asymptotics provide a reliable guide to inference, the two estimates should be very similar. Our test checks whether the resulting difference in the two estimates satisfies the results of second-order asymptotic theory. Essentially the same idea is applied to develop another new specification test using second-order unbiased estimators of the type first proposed by Nagar. If the forward and reverse Nagar-type estimators are not significantly different, we recommend estimation by LIML, which we demonstrate is the optimal linear combination of the Nagar-type estimators (to second order). We also demonstrate the high degree of similarity for k-class estimators between the approach of Bekker and the Edgeworth expansion approach of Rothenberg. An empirical example and Monte Carlo evidence demonstrate the operation of the new specification test.

333 citations
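A stripped-down version of the forward-versus-reverse comparison at the heart of the test can be sketched as follows: estimate the structural coefficient from the 2SLS regression of y on x, and again from the reversed regression of x on y with the coefficient inverted, then compare. The paper's actual statistic weighs this difference using Bekker's second-order asymptotics, which the sketch omits; the simulated data and dimensions are illustrative.

```python
# Forward vs. reverse 2SLS with instruments Z (overidentified case, where
# the two estimates genuinely differ). A large gap between them signals
# weak instruments or misspecification.
import numpy as np

rng = np.random.default_rng(2)
n, k = 500, 3
Z = rng.normal(size=(n, k))                                 # instruments
u = rng.normal(size=n)                                      # common shock
x = Z @ np.array([1.0, 0.5, 0.5]) + u + rng.normal(size=n)  # endogenous regressor
y = 2.0 * x + u                                             # structural eq., beta = 2

P = Z @ np.linalg.solve(Z.T @ Z, Z.T)         # projection onto the instruments
beta_fwd = (x @ P @ y) / (x @ P @ x)          # forward 2SLS: y on x
beta_rev = (y @ P @ y) / (y @ P @ x)          # reverse 2SLS: x on y, inverted

# Under reliable first-order asymptotics the two estimates should be close.
print(f"forward = {beta_fwd:.3f}, reverse = {beta_rev:.3f}")
```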

Proceedings ArticleDOI
28 Jun 2009
TL;DR: This paper proposes a general framework for assessing predictive stream learning algorithms and defends the use of Predictive Sequential methods for error estimation: the prequential error.
Abstract: Learning from data streams is a research area of increasing importance. Nowadays, several stream learning algorithms have been developed. Most of them learn decision models that continuously evolve over time, run in resource-aware environments, and detect and react to changes in the environment generating the data. One important issue, not yet conveniently addressed, is the design of experimental work to evaluate and compare decision models that evolve over time. There are no golden standards for assessing performance in non-stationary environments. This paper proposes a general framework for assessing predictive stream learning algorithms. We defend the use of Predictive Sequential methods for error estimation: the prequential error. The prequential error allows us to monitor the evolution of the performance of models that evolve over time. Nevertheless, it is known to be a pessimistic estimator in comparison to holdout estimates. To obtain more reliable estimators we need some forgetting mechanism. Two viable alternatives are sliding windows and fading factors. We observe that the prequential error converges to a holdout estimator when estimated over a sliding window or using fading factors. We present illustrative examples of the use of prequential error estimators, using fading factors, for the tasks of: i) assessing the performance of a learning algorithm; ii) comparing learning algorithms; iii) hypothesis testing using the McNemar test; and iv) change detection using the Page-Hinkley test. In these tasks, the prequential error estimated using fading factors provides reliable estimates. In comparison to sliding windows, fading factors are faster and memoryless, a requirement for streaming applications. This paper is a contribution to the discussion of good practices in performance assessment when learning dynamic models that evolve over time.

333 citations
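The fading-factor recursion the abstract describes is simple to state: maintain a faded sum of losses and a faded count of examples, and report their ratio after each test-then-train step. The sketch below follows that recipe; the toy majority-class learner and the 0/1-loss stream are placeholder assumptions for illustration.

```python
# Prequential (test-then-train) error with a fading factor alpha:
# each example is first used to test the current model, then to update it.
def prequential_error(stream, model, alpha=0.995):
    """Yield the faded prequential error after each example."""
    s = b = 0.0
    for x, y in stream:
        loss = float(model.predict(x) != y)   # test first (0/1 loss)
        model.update(x, y)                    # then train on the example
        s = loss + alpha * s                  # faded sum of losses
        b = 1.0 + alpha * b                   # faded count of examples
        yield s / b                           # fading-factor estimate

class MajorityClass:
    """Toy incremental learner: predicts the most frequent label seen so far."""
    def __init__(self):
        self.counts = {}
    def predict(self, x):
        return max(self.counts, key=self.counts.get) if self.counts else 0
    def update(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1

# A stream whose majority label changes partway through: the faded error
# rises at the change and then recovers as old mistakes are forgotten.
stream = [(None, 1)] * 80 + [(None, 0)] * 20
for i, err in enumerate(prequential_error(iter(stream), MajorityClass())):
    if i % 25 == 24:
        print(f"after {i + 1} examples: faded error = {err:.3f}")
```

With alpha = 1 the recursion reduces to the plain prequential error; alpha < 1 discounts old losses, which is what lets the estimate track models that evolve over time without storing a window of past examples.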


Network Information
Related Topics (5)
Estimator: 97.3K papers, 2.6M citations (88% related)
Linear model: 19K papers, 1M citations (88% related)
Inference: 36.8K papers, 1.3M citations (87% related)
Regression analysis: 31K papers, 1.7M citations (86% related)
Sampling (statistics): 65.3K papers, 1.2M citations (83% related)
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2023    267
2022    696
2021    959
2020    998
2019    1,033
2018    943