Topic

Resampling

About: Resampling is a research topic. Over its lifetime, 5,428 publications have been published within this topic, receiving 242,291 citations.


Papers
Journal ArticleDOI
TL;DR: A simple procedure is proposed in which observed response rates from an adaptive experiment are input to a simulation program that generates sequences from the adaptive sampling scheme; a simple ranking of the simulated response rates yields a confidence interval approximation with coverage close to 1 - α in most cases.
Abstract: Adaptive designs generate dependent sequences of random variables that are not exchangeable. Therefore, it is not obvious how to employ a resampling scheme for confidence interval estimation. We propose a simple procedure where observed response rates from an adaptive experiment are input to a simulation program. The program then generates sequences from the adaptive sampling scheme. We compare, via simulation, three bootstrap confidence intervals with the asymptotic confidence interval for two adaptive designs useful for clinical trials. A simple ranking of simulated response rates yields a confidence interval approximation with coverage close to 1 - α in most cases. The method allows us to incorporate such complexities as staggered entry and delayed response. We give an example of its utility on a clinical trial of fluoxetine in depression.

35 citations
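
The simulation procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' program: the abstract does not name its two adaptive designs, so a randomized play-the-winner urn is assumed here, and the function names `simulate_rpw_trial` and `percentile_interval`, the starting urn composition, and the default parameters are all hypothetical. Staggered entry and delayed response are not modelled.

```python
import random

def simulate_rpw_trial(p_a, p_b, n_patients, rng):
    """Simulate one randomized play-the-winner trial; return (rate_a, rate_b)."""
    urn = {"A": 1, "B": 1}            # one ball per arm to start (assumed)
    successes = {"A": 0, "B": 0}
    assigned = {"A": 0, "B": 0}
    for _ in range(n_patients):
        # Draw an arm with probability proportional to its balls in the urn.
        arm = "A" if rng.random() < urn["A"] / (urn["A"] + urn["B"]) else "B"
        other = "B" if arm == "A" else "A"
        success = rng.random() < (p_a if arm == "A" else p_b)
        assigned[arm] += 1
        successes[arm] += int(success)
        # A success rewards the same arm; a failure rewards the other arm.
        urn[arm if success else other] += 1
    return tuple(
        successes[a] / assigned[a] if assigned[a] else None for a in ("A", "B")
    )

def percentile_interval(obs_a, obs_b, n_patients, n_sims=2000, alpha=0.05, seed=0):
    """Percentile interval for arm A's response rate by re-running the design."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_sims):
        rate_a, _rate_b = simulate_rpw_trial(obs_a, obs_b, n_patients, rng)
        if rate_a is not None:        # skip runs where arm A was never assigned
            rates.append(rate_a)
    rates.sort()                      # the "simple ranking" of simulated rates
    lo = rates[int(alpha / 2 * len(rates))]
    hi = rates[max(0, int((1 - alpha / 2) * len(rates)) - 1)]
    return lo, hi
```

Because the observed rates drive the same allocation rule that produced the data, the simulated sequences keep the dependence structure of the design, which is the point the abstract makes about non-exchangeable sequences.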

Journal ArticleDOI
TL;DR: It was shown that accuracy estimates from popular resampling methods, such as sample splitting and leave‐one‐out cross validation (Loo CV), have a higher mean square error than those from other methods, and that the large variability of the split‐sample and Loo CV estimates may make the resulting point estimates of accuracy unreliable, so they should be interpreted carefully.
Abstract: Resampling techniques are often used to provide an initial assessment of accuracy for prognostic prediction models developed using high-dimensional genomic data with binary outcomes. Risk prediction is most important, however, in medical applications and frequently the outcome measure is a right-censored time-to-event variable such as survival. Although several methods have been developed for survival risk prediction with high-dimensional genomic data, there has been little evaluation of the use of resampling techniques for the assessment of such models. Using real and simulated datasets, we compared several resampling techniques for their ability to estimate the accuracy of risk prediction models. Our study showed that accuracy estimates for popular resampling methods, such as sample splitting and leave-one-out cross validation (Loo CV), have a higher mean square error than for other methods. Moreover, the large variability of the split-sample and Loo CV may make the point estimates of accuracy obtained using these methods unreliable and hence should be interpreted carefully. A k-fold cross-validation with k = 5 or 10 was seen to provide a good balance between bias and variability for a wide range of data settings and should be more widely adopted in practice.

35 citations
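
The k-fold scheme recommended in this abstract (k = 5 or 10) can be illustrated with a generic sketch. It is an assumption-laden toy, not the study's code: the helper names `k_fold_indices` and `cross_validate` are hypothetical, and the `fit`, `predict`, and `loss` callables stand in for whatever survival model and accuracy measure a real analysis would use.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(xs, ys, fit, predict, loss, k=5, seed=0):
    """Average held-out loss over k folds; the model is refit for every fold."""
    fold_losses = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errs = [loss(ys[i], predict(model, xs[i])) for i in held_out]
        fold_losses.append(sum(errs) / len(errs))
    return sum(fold_losses) / k
```

Setting k equal to the sample size gives leave-one-out cross validation, and a single fixed split corresponds to the split-sample approach, the two methods the abstract reports as more variable.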

Journal Article
TL;DR: In this article, the power and error rates of statistical tests for detecting trends in raptor population count data collected from a single monitoring site were estimated by simulation, using 1,000 replications each for n = 10 and n = 50 years of count data.
Abstract: We conducted simulations that estimated power and Type I error rates of statistical tests for detecting trends in raptor population count data collected from a single monitoring site. Results of the simulations were used to help analyze count data of bald eagles (Haliaeetus leucocephalus) from 7 national forests in Michigan, Minnesota, and Wisconsin during 1980-1989. Seven statistical tests were evaluated, including simple linear regression on the log scale and linear regression with a permutation test. Using 1,000 replications each, we simulated n = 10 and n = 50 years of count data and trends ranging from -5 to 5% change/year. We evaluated the tests at 3 critical levels (α = 0.01, 0.05, and 0.10) for both upper- and lower-tailed tests. Exponential count data were simulated by adding sampling error with a coefficient of variation of 40% from either a log-normal or autocorrelated log-normal distribution. Not surprisingly, tests performed with 50 years of data were much more powerful than tests with 10 years of data. Positive autocorrelation inflated α-levels upward from their nominal levels, making the tests less conservative and more likely to reject the null hypothesis of no trend. Of the tests studied, Cox and Stuart's test and Pollard's test clearly had lower power than the others. Surprisingly, the linear regression t-test, Collins' linear regression permutation test, and the nonparametric Lehmann's and Mann's tests all had similar power in our simulations. Analyses of the count data suggested that bald eagles had increasing trends on at least 2 of the 7 national forests during 1980-1989.

35 citations
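
A linear regression permutation test on the log scale, one of the test types named in this abstract, might look like the sketch below. It is a generic two-sided version written for illustration and is not claimed to match Collins' procedure or the one-tailed tests the authors evaluated; the function names and the default number of permutations are assumptions, and all counts must be positive for the logarithm to be defined.

```python
import math
import random

def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def permutation_trend_test(counts, n_perm=2000, seed=0):
    """Return (observed slope of log counts vs. year, two-sided p-value)."""
    years = list(range(len(counts)))
    log_counts = [math.log(c) for c in counts]
    observed = slope(years, log_counts)
    rng = random.Random(seed)
    shuffled = log_counts[:]
    extreme = 0
    for _ in range(n_perm):
        # Under the null of no trend, any ordering of the counts is equally likely.
        rng.shuffle(shuffled)
        if abs(slope(years, shuffled)) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)
```

The shuffling step assumes the yearly counts are exchangeable under the null hypothesis, which is the assumption that positive autocorrelation violates and is consistent with the inflated error rates the abstract reports.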

Journal ArticleDOI
TL;DR: In this article, an antithetic variates method for the bootstrap is proposed and discussed, which is applicable quite generally, to bias estimation, distribution function estimation and quantile estimation, for example.
Abstract: An antithetic variates method for the bootstrap is proposed and discussed. It is applicable quite generally, to bias estimation, distribution function estimation and quantile estimation, for example. It is based on an 'antithetic permutation' of the sample, which amounts to ranking values of a certain function of the data. Once this has been done, B uniform resampling operations may be immediately converted into 2B 'effective' resampling operations, yielding greater statistical efficiency than 2B totally independent resampling operations. We show that antithetic resampling leads to positive nonnegligible gains in performance, for the same level of labour, when compared with ordinary uniform resampling.

35 citations
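
The pairing idea in this abstract can be sketched as below, under a simplifying assumption: the ranking function is taken to be the data values themselves, which suits statistics like the sample mean, whereas the paper allows a more general function of the data. The names `antithetic_bootstrap` and `mean` are hypothetical, and the code is an illustration rather than the paper's algorithm.

```python
import random

def antithetic_bootstrap(data, statistic, n_pairs=500, seed=0):
    """Return one pair-averaged statistic per antithetic pair of resamples."""
    rng = random.Random(seed)
    ordered = sorted(data)            # rank the sample (assumed ranking: by value)
    n = len(ordered)
    pair_averages = []
    for _ in range(n_pairs):
        idx = [rng.randrange(n) for _ in range(n)]
        resample = [ordered[i] for i in idx]
        # Antithetic partner: swap each drawn value for its mirror-image rank.
        mirrored = [ordered[n - 1 - i] for i in idx]
        pair_averages.append(0.5 * (statistic(resample) + statistic(mirrored)))
    return pair_averages

def mean(values):
    return sum(values) / len(values)
```

Each set of drawn indices is used twice, so B draws give 2B resamples, and the two members of a pair tend to err in opposite directions. For a perfectly symmetric sample and the mean, every pair average equals the sample mean exactly.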

Journal ArticleDOI
TL;DR: In this article, a non-parametric method is applied to quantify residual uncertainty in hydrologic streamflow forecasting, which acts as a post-processor on deterministic model forecasts and generates a residual uncertainty distribution.
Abstract: A non-parametric method is applied to quantify residual uncertainty in hydrologic streamflow forecasting. This method acts as a post-processor on deterministic model forecasts and generates a residual uncertainty distribution. Based on instance-based learning, it uses a k nearest-neighbour search for similar historical hydrometeorological conditions to determine uncertainty intervals from a set of historical errors, i.e. discrepancies between past forecast and observation. The performance of this method is assessed using test cases of hydrologic forecasting in two UK rivers: the Severn and Brue. Forecasts in retrospect were made and their uncertainties were estimated using kNN resampling and two alternative uncertainty estimators: quantile regression (QR) and uncertainty estimation based on local errors and clustering (UNEEC). Results show that kNN uncertainty estimation produces accurate and narrow uncertainty intervals with good probability coverage. Analysis also shows that the performance of this technique depends on the choice of search space. Nevertheless, the accuracy and reliability of uncertainty intervals generated using kNN resampling are at least comparable to those produced by QR and UNEEC. It is concluded that kNN uncertainty estimation is an interesting alternative to other post-processors, like QR and UNEEC, for estimating forecast uncertainty. Apart from its concept being simple and well understood, an advantage of this method is that it is relatively easy to implement.

35 citations
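
The kNN post-processing step described in this abstract can be sketched as follows. Several details are assumptions made for illustration: errors are defined as observation minus forecast, similarity is plain Euclidean distance on unscaled features, quantiles use a simple nearest-rank rule, and the function name `knn_uncertainty_interval` is hypothetical.

```python
def knn_uncertainty_interval(history, query, forecast, k=3, alpha=0.1):
    """Return (lower, upper) bounds around a deterministic forecast.

    history: list of (features, error) pairs, error = observation - forecast.
    query:   feature tuple describing the current conditions.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Find the k past situations most similar to the current one.
    nearest = sorted(history, key=lambda rec: sq_dist(rec[0], query))[:k]
    errors = sorted(err for _, err in nearest)
    # Empirical error quantiles from the neighbours (nearest-rank rule).
    lo = errors[int(alpha / 2 * len(errors))]
    hi = errors[min(len(errors) - 1, int((1 - alpha / 2) * len(errors)))]
    return forecast + lo, forecast + hi

history = [((0.0,), -1.0), ((1.0,), 0.0), ((2.0,), 1.0),
           ((10.0,), 5.0), ((11.0,), 6.0)]
interval = knn_uncertainty_interval(history, (1.0,), forecast=10.0, k=3)
```

Which variables go into the feature tuple corresponds to the "search space" that the abstract says affects performance.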


Network Information
Related Topics (5)
Estimator
97.3K papers, 2.6M citations
89% related
Inference
36.8K papers, 1.3M citations
87% related
Sampling (statistics)
65.3K papers, 1.2M citations
86% related
Regression analysis
31K papers, 1.7M citations
86% related
Markov chain
51.9K papers, 1.3M citations
83% related
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2025    1
2024    2
2023    377
2022    759
2021    275
2020    279