Topic

Resampling

About: Resampling is a research topic. Over its lifetime, 5,428 publications have been published on this topic, receiving 242,291 citations.


Papers
Journal Article
TL;DR: In this article, the authors combine the mutual information criterion with a forward feature selection strategy and propose to use resampling methods, a K-fold cross-validation and the permutation test, to address both issues, which can then be used to automatically set the parameter and calculate a threshold to stop the forward procedure.

132 citations
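
As a rough illustration of how a permutation test can supply a stopping threshold for feature selection, the sketch below scores one discrete feature against a target with a plug-in mutual information estimate, then compares that score with scores obtained after shuffling the target. The function names, the discrete plug-in estimator, and the default of 500 permutations are assumptions made for this example; they are not taken from the paper.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def permutation_p_value(xs, ys, n_perm=500, seed=0):
    """Return (observed MI, permutation p-value) for the pair (xs, ys).

    Shuffling ys breaks any dependence on xs, so the shuffled scores
    approximate the distribution of the statistic under independence.
    """
    rng = random.Random(seed)
    observed = mutual_information(xs, ys)
    shuffled = list(ys)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mutual_information(xs, shuffled) >= observed:
            exceed += 1
    # The +1 terms count the observed arrangement as one permutation.
    return observed, (exceed + 1) / (n_perm + 1)


# Hypothetical usage: a feature that determines the target exactly.
feature = [0, 1] * 50
target = list(feature)
print(permutation_p_value(feature, target))
```

In a forward procedure, one plausible rule is to stop adding features once the best remaining candidate has a p-value above a chosen level such as 0.05.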

Journal Article
TL;DR: The bootstrap is illustrated as an alternative method for estimating the standard errors when the theoretical calculation is complicated or not available in the current software.
Abstract: Bootstrapping is a nonparametric approach for evaluating the distribution of a statistic based on random resampling. This article illustrates the bootstrap as an alternative method for estimating

131 citations
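
The idea of a bootstrap standard error can be sketched in a few lines: resample the data with replacement many times, recompute the statistic on each resample, and take the standard deviation of those replicates. The function name `bootstrap_se`, the default of 2000 replicates, and the simulated data are assumptions for this example, not details from the article.

```python
import random
import statistics


def bootstrap_se(data, statistic, n_boot=2000, seed=0):
    """Estimate the standard error of `statistic` by nonparametric bootstrap."""
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Draw n observations with replacement from the original sample.
        sample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(sample))
    # The spread of the replicates approximates the sampling variability.
    return statistics.stdev(replicates)


# Hypothetical usage on simulated data.
rng = random.Random(42)
data = [rng.gauss(10.0, 2.0) for _ in range(200)]
print("bootstrap SE of the mean:  ", bootstrap_se(data, statistics.mean))
print("analytic SE of the mean:   ", statistics.stdev(data) / len(data) ** 0.5)
print("bootstrap SE of the median:", bootstrap_se(data, statistics.median))
```

For the mean, the bootstrap value should land close to the textbook formula s/sqrt(n). The median is the kind of statistic where a simple closed-form standard error is less convenient, which is where the bootstrap tends to be most useful.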

Journal Article
TL;DR: Three alternatives to MCMC methods are reviewed, including importance sampling, the forward-backward algorithm, and sequential Monte Carlo (SMC), which are demonstrated on a range of examples, including estimating the transition density of a diffusion and of a discrete-state continuous-time Markov chain; inferring structure in population genetics; and segmenting genetic divergence data.
Abstract: We consider analysis of complex stochastic models based upon partial information. MCMC and reversible jump MCMC are often the methods of choice for such problems, but in some situations they can be difficult to implement and can suffer from problems such as poor mixing and the difficulty of diagnosing convergence. Here we review three alternatives to MCMC methods: importance sampling, the forward-backward algorithm, and sequential Monte Carlo (SMC). We discuss how to design good proposal densities for importance sampling, show some of the range of models for which the forward-backward algorithm can be applied, and show how resampling ideas from SMC can be used to improve the efficiency of the other two methods. We demonstrate these methods on a range of examples, including estimating the transition density of a diffusion and of a discrete-state continuous-time Markov chain; inferring structure in population genetics; and segmenting genetic divergence data.

131 citations
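
To make the resampling step in SMC concrete, the sketch below computes the effective sample size of a set of importance weights and then applies systematic resampling, one common scheme among several. The abstract does not say which scheme the authors use, so the choice of systematic resampling, the function names, and the toy particles are assumptions for this example.

```python
import random


def effective_sample_size(weights):
    """ESS = 1 / sum of squared normalized weights; ranges from 1 to len(weights)."""
    total = sum(weights)
    normalized = [w / total for w in weights]
    return 1.0 / sum(w * w for w in normalized)


def systematic_resample(particles, weights, rng):
    """Draw len(particles) particles with probability proportional to weight.

    A single uniform offset is used, followed by evenly spaced points
    along the cumulative weight axis.
    """
    n = len(particles)
    step = sum(weights) / n
    offset = rng.random() * step
    resampled = []
    i = 0
    cumulative = weights[0]
    for k in range(n):
        target = offset + k * step
        # Advance to the particle whose cumulative-weight interval holds target.
        while cumulative <= target and i < n - 1:
            i += 1
            cumulative += weights[i]
        resampled.append(particles[i])
    return resampled


# Hypothetical usage: resample only when the weights have degenerated.
rng = random.Random(0)
particles = ["a", "b", "c", "d"]
weights = [0.05, 0.05, 0.85, 0.05]
if effective_sample_size(weights) < len(particles) / 2:
    particles = systematic_resample(particles, weights, rng)
print(particles)
```

Resampling only when the effective sample size drops below a threshold (half the particle count here, an arbitrary choice) is a widely used heuristic, since every resampling step adds some Monte Carlo noise of its own.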

Journal Article
TL;DR: In this paper, the authors present four methods that combine bootstrap estimation with multiple imputation to address missing data and show that three of the four approaches yield valid inference, but that the performance of the methods varies with respect to the number of imputed data sets and the extent of missingness.
Abstract: Many modern estimators require bootstrapping to calculate confidence intervals because either no analytic standard error is available or the distribution of the parameter of interest is nonsymmetric. It remains however unclear how to obtain valid bootstrap inference when dealing with multiple imputation to address missing data. We present 4 methods that are intuitively appealing, easy to implement, and combine bootstrap estimation with multiple imputation. We show that 3 of the 4 approaches yield valid inference, but that the performance of the methods varies with respect to the number of imputed data sets and the extent of missingness. Simulation studies reveal the behavior of our approaches in finite samples. A topical analysis from HIV treatment research, which determines the optimal timing of antiretroviral treatment initiation in young children, demonstrates the practical implications of the 4 methods in a sophisticated and realistic setting. This analysis suffers from missing data and uses the g-formula for inference, a method for which no standard errors are available.

131 citations

Journal Article
TL;DR: In this paper, the authors compare the traditional approach of a single split of data into a training set and test set (for accuracy assessment), to a resampling framework where the classification and accuracy assessment are repeated many times.

131 citations
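
The contrast between a single train/test split and a resampling framework can be sketched as follows: the split, fit, and accuracy assessment are repeated many times, giving a distribution of accuracies rather than one number. The 1-D nearest-class-mean classifier, the 30% test fraction, and the simulated two-class data are stand-ins chosen to keep the example self-contained; they are not the paper's classifier or data.

```python
import random
import statistics


def nearest_mean_accuracy(train, test):
    """Fit a 1-D nearest-class-mean classifier on train; return accuracy on test."""
    means = {}
    for label in {y for _, y in train}:
        means[label] = statistics.mean(x for x, y in train if y == label)
    correct = sum(
        1 for x, y in test
        if min(means, key=lambda c: abs(x - means[c])) == y
    )
    return correct / len(test)


def repeated_holdout(data, n_repeats=100, test_fraction=0.3, seed=0):
    """Repeat a random train/test split and return the list of test accuracies."""
    rng = random.Random(seed)
    n_test = max(1, int(len(data) * test_fraction))
    scores = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        scores.append(nearest_mean_accuracy(train, test))
    return scores


# Hypothetical usage: two overlapping classes on a single feature.
rng = random.Random(7)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(100)]
data += [(rng.gauss(2.0, 1.0), 1) for _ in range(100)]
scores = repeated_holdout(data)
print("single split accuracy:", scores[0])
print("mean over repeats:    ", statistics.mean(scores))
print("std over repeats:     ", statistics.stdev(scores))
```

The spread across repeats indicates how much a single split's accuracy could move simply because of which samples happened to land in the test set.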


Network Information
Related Topics (5)
Estimator: 97.3K papers, 2.6M citations, 89% related
Inference: 36.8K papers, 1.3M citations, 87% related
Sampling (statistics): 65.3K papers, 1.2M citations, 86% related
Regression analysis: 31K papers, 1.7M citations, 86% related
Markov chain: 51.9K papers, 1.3M citations, 83% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2025    1
2024    2
2023    377
2022    759
2021    275
2020    279