
Showing papers on "Resampling published in 1988"


Book
01 Jan 1988
TL;DR: In this book, the author covers econometrics from the linear regression model and violations of its basic assumptions through special topics such as diagnostic checking, model selection, and specification testing, as well as nonlinear regression, models of expectations, nonnormality, and errors in variables.
Abstract: Foreword * Preface to the Second Edition * Preface to the Third Edition * Obituary * INTRODUCTION AND THE LINEAR REGRESSION MODEL: What is Econometrics?; Statistical Background and Matrix Algebra; Simple Regression; Multiple Regression * VIOLATION OF THE ASSUMPTIONS OF THE BASIC MODEL: Heteroskedasticity; Autocorrelation; Multicollinearity; Dummy Variables and Truncated Variables; Simultaneous Equations Models; Nonlinear Regression, Models of Expectations, and Nonnormality; Errors in Variables * SPECIAL TOPICS: Diagnostic Checking, Model Selection, and Specification Testing; Introduction to Time-Series Analysis; Vector Autoregressions, Unit Roots, and Cointegration; Panel Data Analysis; Large-Sample Theory; Small-Sample Inference: Resampling Methods * Appendix A: Data Sets * Appendix B: Data Sets on the Web * Appendix C: Computer Programs * Index

3,694 citations


Journal ArticleDOI
TL;DR: In this paper, the authors propose resampling methods based on the jackknife and balanced repeated replication (BRR) for stratified multistage designs with replacement, in particular for the case of two sampled clusters per stratum.
Abstract: Methods for standard errors and confidence intervals for nonlinear statistics—such as ratios, regression, and correlation coefficients—have been extensively studied for stratified multistage designs in which the clusters are sampled with replacement, in particular, the important special case of two sampled clusters per stratum. These methods include the customary linearization (or Taylor) method and resampling methods based on the jackknife and balanced repeated replication (BRR). Unlike the jackknife or the BRR, the linearization method is applicable to general sampling designs, but it involves a separate variance formula for each nonlinear statistic, thereby requiring additional programming efforts. Both the jackknife and the BRR use a single variance formula for all nonlinear statistics, but they are more computing-intensive. The resampling methods developed here retain these features of the jackknife and the BRR, yet permit extension to more complex designs involving sampling without replacement.
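
The delete-one-PSU jackknife that this line of work builds on is easy to state concretely. Below is a minimal sketch, assuming a with-replacement design with two PSUs per stratum and hypothetical cluster totals y and x; it is an illustration of the customary jackknife for a ratio, not the paper's new estimators.

```python
import numpy as np

def jackknife_ratio(y, x):
    """y, x: (H, n_h) arrays of PSU totals; returns the ratio estimate
    and its delete-one-PSU jackknife variance."""
    y, x = np.asarray(y, float), np.asarray(x, float)
    H, nh = y.shape
    theta = y.sum() / x.sum()
    v = 0.0
    for h in range(H):
        for i in range(nh):
            w = np.ones_like(y)
            w[h, :] = nh / (nh - 1.0)   # reweight remaining PSUs in stratum h
            w[h, i] = 0.0               # drop PSU i of stratum h
            theta_hi = (w * y).sum() / (w * x).sum()
            v += (nh - 1.0) / nh * (theta_hi - theta) ** 2
    return theta, v

# Usage with hypothetical totals: two strata, two PSUs each.
y = [[12.0, 15.0], [20.0, 18.0]]
x = [[100.0, 110.0], [150.0, 140.0]]
theta, v = jackknife_ratio(y, x)
```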

445 citations


Journal ArticleDOI
TL;DR: In this paper, a single unifying approach to bootstrap resampling is proposed that is applicable to a very wide range of statistical problems, including bias reduction, shrinkage, hypothesis testing and confidence interval construction.
Abstract: SUMMARY We propose a single unifying approach to bootstrap resampling, applicable to a very wide range of statistical problems. It enables attention to be focused sharply on one or more characteristics which are of major importance in any particular problem, such as coverage error or length for confidence intervals, or bias for point estimation. Our approach leads easily and directly to a very general form of bootstrap iteration, unifying and generalizing present disparate accounts of this subject. It also provides simple solutions to relatively complex problems, such as a suggestion by Lehmann (1986) for 'conditionally' short confidence intervals. We set out a single unifying principle guiding the operation of bootstrap resampling, applicable to a very wide range of statistical problems including bias reduction, shrinkage, hypothesis testing and confidence interval construction. Our principle differs from other approaches in that it focuses attention directly on a measure of quality or accuracy, expressed in the form of an equation whose solution is sought. A very general form of bootstrap iteration is an immediate consequence of iterating the empirical solution to this equation so as to improve accuracy. When employed for bias reduction, iteration of the resampling principle yields a competitor to the generalized jackknife, enabling bias to be reduced to arbitrarily low levels. When applied to confidence intervals it produces the techniques of Hall (1986) and Beran (1987). The resampling principle leads easily to solutions of new, complex problems, such as empirical versions of confidence intervals proposed by Lehmann (1986). Lehmann argued that an 'ideal' confidence interval is one which is short when it covers the true parameter value but not necessarily otherwise. The resampling principle suggests a simple empirical means of constructing such intervals. Section 2 describes the general principle, and Section 3 shows how it leads naturally to bootstrap iteration. There we show that in many problems of practical interest, such as bias reduction and coverage-error reduction in two-sided confidence intervals, each iteration reduces error by the factor n^{-1}, where n is sample size. In the case of confidence intervals our result sharpens one of Beran (1987), who showed that coverage error is reduced by the factor n^{-1/2} in two-sided intervals. The main exception to our n^{-1} rule is coverage error of one-sided intervals, where error is reduced by the factor n^{-1/2} at each iteration. Our approach to bootstrap iteration serves to unify not just the philosophy of iteration for different statistical problems, but also different techniques of iteration for the same problem.
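
One round of the bootstrap iteration described here can be sketched as coverage calibration of a percentile interval: use a second level of resampling to estimate the actual coverage of each nominal level, then build the interval at the level whose estimated coverage matches the target. A minimal double-bootstrap sketch for the sample mean; the grid of levels and the function names are our own choices, not the paper's notation.

```python
import numpy as np

rng = np.random.default_rng(0)

def boot_means(x, B):
    return np.array([rng.choice(x, x.size).mean() for _ in range(B)])

def calibrated_percentile_ci(x, target=0.90, B1=200, B2=200,
                             levels=np.linspace(0.80, 0.995, 40)):
    theta = x.mean()
    hits = np.zeros(levels.size)
    for _ in range(B1):
        xb = rng.choice(x, x.size)           # first-level resample
        mb = boot_means(xb, B2)              # second-level bootstrap
        for k, lv in enumerate(levels):
            a = (1.0 - lv) / 2.0
            lo, hi = np.quantile(mb, [a, 1.0 - a])
            hits[k] += (lo <= theta <= hi)   # covers the resample's "truth"?
    coverage = hits / B1
    # Pick the nominal level whose estimated coverage matches the target,
    # then build the interval from the original sample at that level.
    lv = levels[np.argmin(np.abs(coverage - target))]
    a = (1.0 - lv) / 2.0
    return np.quantile(boot_means(x, 2000), [a, 1.0 - a])
```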

204 citations


Journal ArticleDOI
TL;DR: Saddlepoint approximations are shown by the authors to be easy to use and accurate in a variety of simple bootstrap and randomization applications, such as mean estimation, ratio estimation, two-sample comparisons, and autoregressive estimation.
Abstract: SUMMARY Saddlepoint approximations are shown to be easy to use and accurate in a variety of simple bootstrap and randomization applications. Examples include mean estimation, ratio estimation, two-sample comparisons, and autoregressive estimation.
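
For the mean, the bootstrap distribution has an exact cumulant generating function, K(t) = n log((1/n) sum_i exp(t x_i / n)), so the saddlepoint route can be sketched directly: solve K'(s) = y and evaluate the usual density formula. A minimal sketch under that setup; the function names and root-finding bracket are our own, and ygrid must lie strictly inside the range of the data.

```python
import numpy as np
from scipy.optimize import brentq

def saddlepoint_boot_mean(x, ygrid):
    """Saddlepoint approximation to the density of the bootstrap mean."""
    x = np.asarray(x, float)
    n = x.size

    def weights(t):
        u = t * x / n
        w = np.exp(u - u.max())                       # stable exponentials
        return w / w.sum(), n * (u.max() + np.log(w.mean()))  # p_i(t), K(t)

    def K1(t):
        p, _ = weights(t)
        return p @ x                                  # tilted mean = K'(t)

    def K2(t):
        p, _ = weights(t)
        m = p @ x
        return (p @ (x - m) ** 2) / n                 # tilted var / n = K''(t)

    dens = []
    for y in ygrid:
        s = brentq(lambda t: K1(t) - y, -200.0, 200.0)
        _, Ks = weights(s)
        dens.append(np.exp(Ks - s * y) / np.sqrt(2.0 * np.pi * K2(s)))
    return np.array(dens)
```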

137 citations


Journal ArticleDOI
TL;DR: In this article, variance estimation techniques for nonlinear statistics, such as ratios and regression coefficients, and for functionals such as quantiles, are reviewed in the context of sampling from stratified populations.
Abstract: Variance estimation techniques for nonlinear statistics, such as ratios and regression and correlation coefficients, and functionals, such as quantiles, are reviewed in the context of sampling from stratified populations. In particular, resampling methods such as the bootstrap, the jackknife, and balanced repeated replication are compared with the traditional linearization method for nonlinear statistics and a method based on Woodruff's confidence intervals for the quantiles. Results of empirical studies are presented on the bias and stability of these variance estimators and on confidence‐interval coverage probabilities and lengths.
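
The Woodruff method mentioned above inverts a normal confidence interval for the estimated CDF value back through the empirical quantile function. A minimal sketch, assuming simple random sampling: the standard error sqrt(p(1-p)/n) is our simplifying assumption, and a design-based standard error would replace it in a stratified setting.

```python
import numpy as np
from scipy.stats import norm

def woodruff_ci(x, p=0.5, alpha=0.05):
    """Woodruff-style CI for the p-th quantile under simple random sampling."""
    x = np.asarray(x, float)
    n = x.size
    se = np.sqrt(p * (1.0 - p) / n)       # SRS standard error of F_hat
    z = norm.ppf(1.0 - alpha / 2.0)
    lo_p = np.clip(p - z * se, 0.0, 1.0)  # CI for the CDF value ...
    hi_p = np.clip(p + z * se, 0.0, 1.0)
    return np.quantile(x, [lo_p, hi_p])   # ... mapped through the quantiles
```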

113 citations


Journal ArticleDOI
TL;DR: In this paper, the importance sampling method of Hammersley and Handscomb (1964) is modified to apply to the estimation of quantiles, and a resampling procedure is introduced that recenters the bootstrap distribution of a robust estimator of location to produce estimated confidence limits that are much more accurate than those obtained by the usual bootstrap method (Efron 1982).
Abstract: The use of importance-sampling methods to substantially reduce the amount of resampling necessary for the construction of nonparametric bootstrap confidence intervals is investigated. The classical importance-sampling method of Hammersley and Handscomb (1964) is modified to apply to the estimation of quantiles. Based on this method, a resampling procedure is introduced that recenters the bootstrap distribution of a robust estimator of location to produce estimated confidence limits that are much more accurate (for a given amount of resampling) than those obtained by the usual bootstrap method (Efron 1982). The required recentering is accomplished with a suitable “exponential tilting” similar to that used in another connection by Field and Hampel (1982). This importance-sampling procedure is used to produce bootstrap confidence intervals for location based on a class of estimators that includes symmetric M estimators. These interval estimates are asymptotically optimal in a certain sense. Simulati...
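
The exponential-tilting idea can be sketched generically for the sample mean: draw resamples with tilted probabilities so that more of them land in the tail of interest, then weight each draw by its likelihood ratio so that weighted frequencies still estimate ordinary-bootstrap probabilities. A minimal sketch, not the paper's exact recentering procedure; lam and the function names are our own.

```python
import numpy as np

rng = np.random.default_rng(1)

def tilted_boot_cdf(x, lam, B=2000):
    """Importance resampling for the bootstrap CDF of the sample mean.

    Resamples use tilted weights p_i proportional to exp(lam * x_i); each
    draw is weighted by prod_i (1/(n p_i))^{N_i}, the likelihood ratio of
    uniform vs tilted resampling, so weighted tail frequencies estimate
    ordinary-bootstrap probabilities with far fewer draws.
    """
    x = np.asarray(x, float)
    n = x.size
    p = np.exp(lam * (x - x.mean()))
    p /= p.sum()
    stats, wts = np.empty(B), np.empty(B)
    for b in range(B):
        idx = rng.choice(n, size=n, p=p)
        counts = np.bincount(idx, minlength=n)
        stats[b] = x[idx].mean()
        wts[b] = np.exp(-np.sum(counts * np.log(n * p)))
    order = np.argsort(stats)
    return stats[order], np.cumsum(wts[order]) / B

# Usage: with lam < 0 the draws pile up in the lower tail; a 5th-percentile
# estimate is stats[np.searchsorted(cdf, 0.05)] for (stats, cdf) above.
```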

103 citations


Journal ArticleDOI
TL;DR: In this paper, three algorithms for balanced bootstrap sampling are described, and it is shown that a balanced bootstrap simulation costs little more than an ordinary unbalanced one.
Abstract: Efron's nonparametric bootstrap method simulates the distributional properties of a statistic by repeated resampling of a given sample. A balanced bootstrap simulation is one in which each sample observation is reused exactly equally often. Three algorithms for balanced bootstrap sampling are described, and it is shown that a balanced bootstrap simulation costs little more than an ordinary unbalanced one.
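
The simplest standard balanced-bootstrap algorithm is the permutation method: concatenate B copies of the index set, permute once, and cut into B blocks of n. A minimal sketch of that variant, which may or may not coincide with one of the three algorithms compared in the paper; note it holds all B*n indices in memory at once.

```python
import numpy as np

rng = np.random.default_rng(2)

def balanced_bootstrap(x, B):
    """Index matrix for B resamples in which every observation
    is reused exactly B times overall."""
    n = len(x)
    idx = np.tile(np.arange(n), B)   # B copies of each index
    rng.shuffle(idx)                 # one global permutation
    return idx.reshape(B, n)         # cut into B resamples of size n

# Usage: balanced bootstrap means.
x = rng.normal(size=25)
boot_means = x[balanced_bootstrap(x, B=1000)].mean(axis=1)
```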

68 citations


Journal ArticleDOI
TL;DR: In this paper, the authors investigate the asymptotic validity of bootstrap techniques for estimating the sampling distribution of mode estimates derived from kernel density estimates, and show that a straightforward application of a naive bootstrap yields invalid inferences.
Abstract: The problem of constructing bootstrap confidence intervals for the mode of a density is considered. Estimates of the mode are derived from kernel density estimates based on fixed and data-dependent bandwidths. The asymptotic validity of bootstrap techniques to estimate the sampling distribution of the estimates is investigated. In summary, the results are negative in the sense that a straightforward application of a naive bootstrap yields invalid inferences. In particular, the bootstrap fails if resampling is done from the kernel density estimate. On the other hand, if one resamples from a smoother kernel density estimate (which is necessarily different from the one which yields the original estimate of the mode), the bootstrap is consistent. The bootstrap also fails if resampling is done from the empirical distribution, unless the choice of bandwidth is suboptimal. Similar results hold when applying bootstrap techniques to other functionals of a density.
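
The consistent scheme described, resampling from a smoother kernel density estimate, amounts to resampling data points and adding kernel noise with the larger bandwidth. A minimal sketch for a Gaussian kernel; the bandwidths h and g > h, the grid resolution, and the function names are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)

def kde_mode(x, h, grid):
    """Mode of a Gaussian-kernel density estimate with bandwidth h
    (unnormalized density is enough for the argmax)."""
    dens = np.exp(-0.5 * ((grid[:, None] - x[None, :]) / h) ** 2).sum(axis=1)
    return grid[np.argmax(dens)]

def smoothed_boot_modes(x, h, g, B=500):
    """Resample from a KDE with the *larger* bandwidth g (resample a data
    point, add g * N(0,1) noise), then re-estimate the mode with h."""
    x = np.asarray(x, float)
    n = x.size
    grid = np.linspace(x.min(), x.max(), 400)
    modes = np.empty(B)
    for b in range(B):
        xb = x[rng.integers(0, n, n)] + g * rng.normal(size=n)
        modes[b] = kde_mode(xb, h, grid)
    return modes   # e.g. np.quantile(modes, [0.025, 0.975])
```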

47 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present a method for calculating the overall significance level associated with P1, the most extreme isolated trend P-value observed among the tumor sites encountered. The method is based on a multiresponse randomization procedure that recognizes the discrete nature of the data and accounts for the inherent dependencies that may exist between tumor sites.
Abstract: Because the evaluation of rodent carcinogenicity studies involves performing a statistical analysis at each tumor site encountered, it is important to understand the extent to which this multiplicity affects the false positive rate. It is equally important to apply methods of accounting for this multiplicity in the analysis. In this paper we discuss one such method, which involves calculating the overall significance level associated with P1, the most extreme isolated trend P-value observed among the tumor sites encountered. The method constructs the distribution of trend scores simultaneously for each tumor site using a multiresponse randomization procedure. As such, it recognizes the discrete nature of the data and incorporates inherent dependencies that may exist between the tumor sites. For small studies it is possible to perform a complete rerandomization and compute an exact adjusted trend P-value. However, for moderate or large studies the need exists for approximations based on efficient resampling plans. We report one such approximation proposed by Dr. John Tukey which involves correcting the exact Bonferroni upper bound. Also, we show that the independence assumption used in methods proposed by Mantel (1980) and Mantel et al. (1982) seems to be a reasonable approximation for the study discussed in the present report. This result needs to be supported further using additional studies.
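
The rerandomization adjustment can be sketched as a max-statistic permutation procedure: permute the dose assignments whole, so between-site dependence is preserved, and compare the observed extreme trend score against the permutation distribution of the extreme. A minimal sketch; the covariance-style trend score below is a stand-in for the exact trend test used in the paper.

```python
import numpy as np

rng = np.random.default_rng(4)

def adjusted_min_p(dose, tumors, n_perm=5000):
    """Permutation-adjusted significance of the most extreme trend score.

    dose   : (n_animals,) dose scores
    tumors : (n_animals, n_sites) 0/1 tumor indicators
    """
    dose = np.asarray(dose, float)
    T = np.asarray(tumors, float)
    Tc = T - T.mean(axis=0)

    def max_trend(d):
        # |dose-indicator covariance| per site, maximized over sites
        return np.abs((d - d.mean()) @ Tc).max()

    obs = max_trend(dose)
    hits = sum(max_trend(rng.permutation(dose)) >= obs for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)
```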

43 citations



Journal ArticleDOI
TL;DR: In this article, the variance and bias estimators produced by the weighted jackknife, the unweighted jackknife and the bootstrap are compared. The results indicate that the weighted jackknife estimators are asymptotically unbiased and consistent, and that their mean squared errors are of order $o(n^{-2})$ if the imbalance measure converges to zero as the sample size grows.
Abstract: Let $g$ be a nonlinear function of the regression parameters $\beta$ in a heteroscedastic linear model and $\hat{\beta}$ be the least squares estimator of $\beta.$ We consider the estimation of the variance and bias of $g(\hat{\beta})$ [as an estimator of $g(\beta)$] by using three resampling methods: the weighted jackknife, the unweighted jackknife and the bootstrap. The asymptotic orders of the mean squared errors and biases of the resampling variance and bias estimators are given in terms of an imbalance measure of the model. Consistency of the resampling estimators is also studied. The results indicate that the weighted jackknife variance and bias estimators are asymptotically unbiased and consistent and their mean squared errors are of order $o(n^{-2})$ if the imbalance measure converges to zero as the sample size $n \rightarrow \infty$. Furthermore, based on large sample properties, the weighted jackknife is better than the unweighted jackknife. The bootstrap method is shown to be asymptotically correct only under a homoscedastic error model. Bias reduction, a closely related problem, is also discussed.
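
For concreteness, the delete-one jackknife for a nonlinear function of the least squares estimator looks as follows. This is the unweighted version; the paper's weighted jackknife additionally adjusts for the design's imbalance (leverage), which we do not reproduce here.

```python
import numpy as np

def jackknife_g_of_beta(X, y, g):
    """Delete-one jackknife variance and bias estimates for g(beta_hat)
    in a linear model fit by least squares."""
    n = X.shape[0]
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    theta = g(beta)
    theta_i = np.array([
        g(np.linalg.lstsq(np.delete(X, i, 0), np.delete(y, i), rcond=None)[0])
        for i in range(n)
    ])
    theta_bar = theta_i.mean()
    var_jack = (n - 1) / n * np.sum((theta_i - theta_bar) ** 2)
    bias_jack = (n - 1) * (theta_bar - theta)
    return theta, var_jack, bias_jack

# Example of a nonlinear g: a ratio of coefficients, g = lambda b: b[0] / b[1].
```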

Journal ArticleDOI
TL;DR: In this paper, a test for second-degree stochastic dominance is proposed, which is a permutation test using only the sample data, exemplified using data from Kramer and Pope.
Abstract: A test for second-degree stochastic dominance is proposed. The test is a permutation test using only the sample data. It is exemplified using data from Kramer and Pope. The test conclusions differ substantially from standard practice in which no statistical test is conducted.
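
A generic permutation test of this kind can be sketched by measuring the worst violation of the integrated-CDF ordering that defines second-degree dominance and permuting the pooled sample. A minimal sketch; the violation statistic below is our stand-in and is not necessarily the statistic used in the paper.

```python
import numpy as np

rng = np.random.default_rng(5)

def ssd_violation(x, y, grid):
    """Largest violation of 'x second-degree dominates y': the maximum over t
    of the integral up to t of (F_x - F_y), positive values being violations."""
    Fx = (x[None, :] <= grid[:, None]).mean(axis=1)
    Fy = (y[None, :] <= grid[:, None]).mean(axis=1)
    return (np.cumsum(Fx - Fy) * (grid[1] - grid[0])).max()

def ssd_perm_test(x, y, n_perm=2000):
    x, y = np.asarray(x, float), np.asarray(y, float)
    pooled = np.concatenate([x, y])
    grid = np.linspace(pooled.min(), pooled.max(), 500)
    obs = ssd_violation(x, y, grid)
    hits = 0
    for _ in range(n_perm):
        z = rng.permutation(pooled)
        hits += ssd_violation(z[:x.size], z[x.size:], grid) >= obs
    return (hits + 1) / (n_perm + 1)
```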

Journal ArticleDOI
TL;DR: In this paper, a new framework for evaluating probability distribution models used in hydrologic frequency analysis is proposed, where the variance (standard deviation) in estimation of T-year events (quantiles) obtained by the model is incorporated as an evaluation criterion as well as some goodness-of-fit criteria; resampling methods such as the jackknife and the bootstrap are also incorporated to quantify the variance.
Abstract: This paper proposes a new framework for evaluating probability distribution models used in hydrologic frequency analysis. In the framework, the variance (standard deviation) in estimation of T-year events (quantiles) obtained by the model is incorporated as an evaluation criterion as well as some goodness-of-fit criteria; resampling methods such as the jackknife and the bootstrap are also incorporated to quantify the variance. Using the existing extreme data (annual maxima of k-day precipitation, k = 1, 2, 3), the authors reveal the insufficiency of the conventional model evaluation which is based on only the goodness of fit. The proposed framework evaluates ten distributions with two or three parameters. Additionally, the relation between the amount of data and the variance (estimation accuracy) is investigated through bootstrap-type resampling.
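
Quantifying the variance of a T-year event by resampling is direct to sketch: refit the distribution to each bootstrap resample and record the refitted quantile. A minimal sketch; Gumbel is an illustrative model choice here, not one of the paper's ten candidate distributions specifically.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

def t_year_event(x, T, dist=stats.gumbel_r):
    """T-year event: the (1 - 1/T) quantile of a fitted distribution."""
    params = dist.fit(x)
    return dist.ppf(1.0 - 1.0 / T, *params)

def bootstrap_se_of_quantile(x, T=100, B=1000):
    """Point estimate and bootstrap standard deviation of the T-year event,
    the variance criterion used alongside goodness of fit."""
    x = np.asarray(x, float)
    ests = np.array([t_year_event(rng.choice(x, x.size), T) for _ in range(B)])
    return t_year_event(x, T), ests.std(ddof=1)
```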

Journal ArticleDOI
TL;DR: In this paper, a branch-and-bound algorithm is described for finding the permutation (randomization) P value in matched-pairs designs without enumeration of the entire reference distribution.
Abstract: A branch-and-bound algorithm is described for finding the permutation (randomization) P value in matched-pairs designs without enumeration of the entire reference distribution. It is not restricted to test statistics that are linear in functions of the observations, and permutation tests based on trimmed means are investigated. We apply the algorithm to six examples, demonstrating that the use of a moderately trimmed, instead of an untrimmed, mean can sometimes lead to substantially smaller P values and shorter confidence intervals. Confidence intervals are obtained by trial-and-error inversion of the P value. Permutation tests arise in randomization inference, though they can be applied to nonrandomized studies. Under the randomization model, permutation tests are exact, giving the correct probability of a Type I error, without distributional assumptions. The observed test statistic is compared with the reference set of test statistics that would occur under all possible randomizations. Thus inf...
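
The reference set being searched is the 2^n sign assignments to the paired differences; for small n it can simply be enumerated. A minimal sketch of the exact P value for a trimmed-mean statistic, against which the paper's branch-and-bound algorithm reaches the same answer without full enumeration.

```python
import numpy as np
from itertools import product
from scipy.stats import trim_mean

def matched_pairs_perm_p(d, trim=0.1):
    """Exact two-sided permutation P value for matched-pairs differences d,
    using a trimmed mean and full enumeration (feasible for n up to ~20)."""
    d = np.asarray(d, float)
    n = d.size
    obs = abs(trim_mean(d, trim))
    hits = 0
    for signs in product([-1.0, 1.0], repeat=n):
        hits += abs(trim_mean(d * np.array(signs), trim)) >= obs
    return hits / 2 ** n
```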



01 Jan 1988
TL;DR: A treatise of statistical functionals in resampling plans is considered along with a review of some of these recent developments.
Abstract: SUMMARY. The jackknife, bootstrap and some other resampling plans are commonly used to estimate (and reduce) the bias and sampling error of statistical estimators. In these contexts, general (differentiable) statistical functionals crop up in a variety of models (especially in nonparametric setups). A treatise of statistical functionals in resampling plans is considered along with a review of some of these recent developments.

Journal ArticleDOI
TL;DR: In this article, simulation results comparing various resampling estimators of classification error rate for linear discriminant type classification algorithms are presented for three non-Gaussian multivariate populations, namely, exponential, Cauchy and uniform.
Abstract: This article presents simulation results comparing various resampling estimators of classification error rate for linear discriminant type classification algorithms. Three non-Gaussian multivariate populations are studied, namely exponential, Cauchy and uniform. Simulations are conducted for small sample sizes, two-class and three-class problems and 2-D, 3-D and 5-D distributions. Estimation procedures and sample sizes are the same as in our previous study of Gaussian populations; again 200 bootstrap replications are used for each simulation trial. For exponential and uniform distributions the 0.632 estimator generally performs best. However, for Cauchy distributions the convex bootstrap and the e0 estimator often outperform the 0.632 estimator.
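
The 0.632 estimator combines resubstitution error with the bootstrap out-of-sample error e0, i.e. the error on observations left out of each resample. A minimal sketch, assuming hypothetical user-supplied fit and predict callables for the classifier; the nearest-centroid example at the bottom is ours.

```python
import numpy as np

rng = np.random.default_rng(7)

def err632(X, y, fit, predict, B=200):
    """Efron's .632 estimator: 0.368 * resubstitution error
    + 0.632 * e0 (error on observations outside each bootstrap resample)."""
    n = len(y)
    err_resub = np.mean(predict(fit(X, y), X) != y)
    err_out, n_out = 0.0, 0
    for _ in range(B):
        idx = rng.integers(0, n, n)
        out = np.setdiff1d(np.arange(n), idx)
        if out.size == 0:
            continue
        m = fit(X[idx], y[idx])
        err_out += np.sum(predict(m, X[out]) != y[out])
        n_out += out.size
    return 0.368 * err_resub + 0.632 * (err_out / n_out)

# Hypothetical classifier interface: a nearest-centroid rule.
fit = lambda X, y: {c: X[y == c].mean(axis=0) for c in np.unique(y)}
predict = lambda m, X: np.array(
    [min(m, key=lambda c: np.sum((row - m[c]) ** 2)) for row in X])
```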

Book ChapterDOI
01 Jan 1988
TL;DR: A computer-based compromise between complete randomization and optimum design is provided, partially answering the question “how much randomization is enough?”
Abstract: We discuss three recent data analyses which illustrate making statistical inferences (finding significance levels, confidence intervals, and standard errors) with the critical assistance of a computer. The first example concerns a permutation test for a linear model situation with several covariates. We provide a computer-based compromise between complete randomization and optimum design, partially answering the question “how much randomization is enough?”


Journal ArticleDOI
TL;DR: In this paper, the use of a bootstrap adjustment to reduce the bias of the NS method results in an estimator which combines the advantages of small bias with low variance, and is therefore preferable to existing resampling estimators.
Abstract: The resubstitution estimator of classification error rates is known to have both an optimistic bias and a large variance. Modifications to this method have addressed these problems. The bootstrap estimator, for example, uses a resampling scheme to reduce bias, and the NS method uses a smoothing algorithm to reduce variance. In this paper we show that the use of a bootstrap adjustment to reduce the bias of the NS method results in an estimator which combines the advantages of small bias with low variance, and is therefore preferable to existing resampling estimators. In addition, a new smoothed estimator with reduced bias is introduced which may eliminate the need for resampling in some situations.

Journal ArticleDOI
Xiquan Shi, Jun Shao
TL;DR: In this paper, an alternative resampling procedure is proposed for estimating the distribution and variance of a function of the sample mean; the proposed estimators are shown to be strongly consistent for m-dependent, identically distributed random observations.
Abstract: For m-dependent, identically distributed random observations, the bootstrap method provides inconsistent estimators of the distribution and variance of the sample mean. This paper proposes an alternative resampling procedure. For estimating the distribution and variance of a function of the sample mean, the proposed resampling estimators are shown to be strongly consistent.
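
The paper's exact procedure is not reproduced here, but the flavor of resampling that remains valid under m-dependence can be sketched with non-overlapping blocks, which keep short-range dependence intact inside each block. A minimal sketch, assuming block_len exceeds the dependence range m.

```python
import numpy as np

rng = np.random.default_rng(8)

def block_boot_mean(x, block_len, B=1000):
    """Resample non-overlapping blocks and record the mean of each rebuilt
    series; a naive i.i.d. bootstrap would destroy the dependence these
    blocks preserve."""
    x = np.asarray(x, float)
    n_blocks = x.size // block_len
    blocks = x[: n_blocks * block_len].reshape(n_blocks, block_len)
    means = np.empty(B)
    for b in range(B):
        pick = rng.integers(0, n_blocks, n_blocks)
        means[b] = blocks[pick].mean()
    return means   # means.std(ddof=1) estimates the sd of the sample mean
```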


Journal ArticleDOI
Barry P. Katz
TL;DR: For a series of data points that follow a pattern due to natural ordering, a modified runs test for continuous data to detect such a pattern is developed.
Abstract: For a series of data points that follow a pattern due to natural ordering, I develop a modified runs test for continuous data to detect such a pattern. The test statistic is the number of runs in the series after having smoothed the data with use of a moving average. The testing procedure is based on permutations of the original data. My simulations indicate that the modified test is generally more powerful than the usual runs test.
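
The test can be sketched directly: smooth with a moving average, count runs, and compare against permutations of the original data. A minimal sketch; counting runs about the median of the smoothed series is our assumed detail, and small observed run counts indicate a pattern.

```python
import numpy as np

rng = np.random.default_rng(9)

def n_runs(z):
    """Number of runs of the sign of z about its median."""
    s = z > np.median(z)
    return 1 + np.count_nonzero(s[1:] != s[:-1])

def smoothed_runs_test(x, window=3, n_perm=5000):
    x = np.asarray(x, float)
    kernel = np.ones(window) / window
    smooth = lambda v: np.convolve(v, kernel, mode="valid")
    obs = n_runs(smooth(x))
    hits = sum(n_runs(smooth(rng.permutation(x))) <= obs
               for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)   # one-sided: few runs => pattern
```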

Book ChapterDOI
01 Jan 1988
TL;DR: In this paper, an uninhibited frontal view is taken of the part of randomization methodology generally known as the Fisher randomization test, no satisfactory answer having been found to the question "Why randomize?"
Abstract: Randomization is widely recognized as a basic principle of statistical experimentation. Yet we find no satisfactory answer to the question, Why randomize? In a previous paper (Basu 1978b) the question was examined from the point of view of survey statistics. In this article we take an uninhibited frontal view of a part of the randomization methodology generally known as the Fisher randomization test.


01 Jan 1988
TL;DR: The purpose of this report is to present some essential elements, as the author perceives them, of a unified approach to the statistical modelling and analysis of data arising from animal marking studies.
Abstract: Both capture-recapture and band-recovery studies can be conveniently modelled by one approach. The key is to view these methods as studies of cohort survival processes. Thus, in capture-recapture, as in band-recovery, attention is focused on survival rate, not population size. Probability models for the data are conditional on the known releases initiating each cohort. The first resampling of an individual after a release (e.g., live-capture, dead-recovery, live-resighting) effectively removes that individual from its release cohort. That individual may then be re-released as part of a new cohort. The resampling data are modelled as multinomial counts with a standardized form for the cell probabilities. The different resampling methods (e.g., live-recapture vs. band-recoveries) correspond to different model parameterizations. For capture-recapture, the complete model also requires the probability distribution of the unmarked animals caught on each occasion. That distribution, which is given here under the time-specific assumptions of the Jolly-Seber model, incorporates the recruitment parameters. The full distribution of the minimal sufficient statistic for the Jolly-Seber model can be given as a product of conditionally independent binomials. This representation of the minimal sufficient statistic greatly facilitates deriving variances and covariances. Goodness of fit tests for such cohort-survival data have two basic components. One component is identical for capture-recapture data and band-recovery data. The second component exists only for capture-recapture data; it uses information in subcohorts as defined by capture histories within each released cohort. The basic notation required in this modelling approach is much less than is typically used to develop capture-recapture models and the derivation of theory is more straightforward. Several special cases of the Jolly-Seber model are easily dealt with using this approach. Also considered here is the generalization of the time-specific model to allow different parameters to apply during the time interval i to i+1 for individuals released at time i as compared to individuals released before time i and not captured (resampled) at time i. PREFACE: The purpose of this report is to present some essential elements, as the author perceives them, of a unified approach to the statistical modelling and analysis of data arising from animal marking studies. This subject is often referred to as capture-recapture for open populations. The subject is, however, much broader than just capture-recapture taken literally. The essence of the methods is that animals are being sampled, somehow, from populations open to the dynamics of entry and loss. Sampled animals are "marked" in any …

Proceedings ArticleDOI
Arup Bose
01 Dec 1988
TL;DR: The bootstrap is a resampling method of estimating distributional properties of estimators that can be applied to time series models and directions of theoretical and applied research in this area are indicated.
Abstract: The bootstrap is a resampling method of estimating distributional properties of estimators. We discuss how this method can be applied to time series models and indicate directions of theoretical and applied research in this area.
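
A standard model-based example of bootstrapping a time series is the residual bootstrap for an autoregression: fit, resample centred residuals, rebuild the series, refit. A minimal AR(1) sketch, illustrative of the general approach rather than a specific proposal from the paper.

```python
import numpy as np

rng = np.random.default_rng(10)

def ar1_residual_bootstrap(x, B=1000):
    """Residual bootstrap for a zero-mean AR(1); returns bootstrap draws
    of the autoregressive coefficient."""
    x = np.asarray(x, float)
    n = x.size
    phi = (x[:-1] @ x[1:]) / (x[:-1] @ x[:-1])   # least squares estimate
    res = x[1:] - phi * x[:-1]
    res = res - res.mean()                       # centre the residuals
    phis = np.empty(B)
    for b in range(B):
        e = rng.choice(res, n)
        xb = np.empty(n)
        xb[0] = x[0]
        for t in range(1, n):                    # rebuild the series
            xb[t] = phi * xb[t - 1] + e[t]
        phis[b] = (xb[:-1] @ xb[1:]) / (xb[:-1] @ xb[:-1])
    return phis
```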

Journal ArticleDOI
TL;DR: In this paper, the bias associated with the bootstrap is investigated via an extensive Monte Carlo simulation study in the linear regression context; three new corrections for the bias in estimating the variance are considered, one of which is demonstrated to be an improvement over the usual correction of Bickel and Freedman.
Abstract: The bootstrap is a computer-based resampling procedure for estimating the correct variance of an estimator directly from the data obtained, rather than from assumptions on the underlying error distribution. The objective of the research is to study the bias associated with the bootstrap and to consider several alternative procedures for correcting this bias. This is accomplished via an extensive Monte Carlo simulation study in the linear regression context. This simulation involves a range of underlying error distributions, a variety of structures for the design matrix, and a range of sample sizes. Three new corrections for the bias in estimation of the variance are considered, and a significant contribution of this research is that one of these is demonstrated to be an improvement over the usual correction of Bickel and Freedman. The remaining two, which are based on an inner/outer-loop bootstrap procedure, are demonstrated to be less desirable.
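
The correction attributed to Bickel and Freedman rescales the centred residuals by sqrt(n/(n-p)) before resampling, compensating for the degrees of freedom absorbed by the fit. A minimal residual-bootstrap sketch with that rescaling as an option; the function names are our own.

```python
import numpy as np

rng = np.random.default_rng(11)

def boot_var_beta(X, y, B=1000, rescale=True):
    """Residual-bootstrap variance of the least squares coefficients; with
    rescale=True the centred residuals are inflated by sqrt(n/(n-p))."""
    n, p = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    res = y - X @ beta
    res = res - res.mean()
    if rescale:
        res = res * np.sqrt(n / (n - p))   # degrees-of-freedom correction
    betas = np.empty((B, p))
    for b in range(B):
        yb = X @ beta + rng.choice(res, n)
        betas[b] = np.linalg.lstsq(X, yb, rcond=None)[0]
    return betas.var(axis=0, ddof=1)
```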

Book ChapterDOI
01 Jan 1988
TL;DR: In this paper, a direct approach to statistical tests of significance is possible using bootstrap techniques to simulate the null distribution of the test statistic of interest; the issues involved in constructing such tests are considered in the context of testing the mean of a univariate population.
Abstract: A direct approach to statistical tests of significance is possible using bootstrap techniques to simulate the null distribution of the test statistic of interest. The approach is outlined by Hinkley (1988) and by Young (1986). In this paper, issues involved in the construction of such tests are considered in the context of testing the mean of a univariate population. The general method is summarized in Section 2, while questions relating to choice of reference distribution and test statistic are considered and illustrated empirically in Section 3. Section 4 discusses the importance of appropriate conditioning in resampling tests of significance.
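
The key construction issue, making the null hypothesis hold in the reference distribution, is handled for the mean by recentring the data before resampling. A minimal sketch with a studentized statistic; the names are our own, not the paper's notation.

```python
import numpy as np

rng = np.random.default_rng(12)

def boot_test_mean(x, mu0, B=5000):
    """Bootstrap test of H0: mean = mu0.  Resampling is done from the
    recentred data x - xbar + mu0, so the null holds in the reference
    distribution; the statistic is the usual studentized mean."""
    x = np.asarray(x, float)
    n = x.size
    t_obs = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(n))
    z = x - x.mean() + mu0                     # impose the null hypothesis
    t_star = np.empty(B)
    for b in range(B):
        xb = rng.choice(z, n)
        t_star[b] = (xb.mean() - mu0) / (xb.std(ddof=1) / np.sqrt(n))
    return np.mean(np.abs(t_star) >= abs(t_obs))   # two-sided P value
```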