scispace - formally typeset

Resampling

About: Resampling is a research topic. Over its lifetime, 5,428 publications have been published within this topic, receiving 242,291 citations.


Papers
Journal ArticleDOI
TL;DR: In this paper, an ultrafast bootstrap approximation approach (UFBoot) is proposed to compute the support of phylogenetic groups in maximum likelihood (ML) based trees, which combines the resampling estimated log-likelihood method with a simple but effective collection scheme of candidate trees.
Abstract: Nonparametric bootstrap has been a widely used tool in phylogenetic analysis to assess the clade support of phylogenetic trees. However, with the rapidly growing amount of data, this task remains a computational bottleneck. Recently, approximation methods such as the RAxML rapid bootstrap (RBS) and the Shimodaira-Hasegawa-like approximate likelihood ratio test have been introduced to speed up the bootstrap. Here, we suggest an ultrafast bootstrap approximation approach (UFBoot) to compute the support of phylogenetic groups in maximum likelihood (ML) based trees. To achieve this, we combine the resampling estimated log-likelihood method with a simple but effective collection scheme of candidate trees. We also propose a stopping rule that assesses the convergence of branch support values to automatically determine when to stop collecting candidate trees. UFBoot achieves a median speed up of 3.1 (range: 0.66-33.3) to 10.2 (range: 1.32-41.4) compared with RAxML RBS for real DNA and amino acid alignments, respectively. Moreover, our extensive simulations show that UFBoot is robust against moderate model violations and the support values obtained appear to be relatively unbiased compared with the conservative standard bootstrap. This provides a more direct interpretation of the bootstrap support. We offer an efficient and easy-to-use software (available at http://www.cibiv.at/software/iqtree) to perform the UFBoot analysis with ML tree inference.
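The nonparametric bootstrap underlying this work resamples the data with replacement, recomputes the statistic of interest on each replicate, and reads support or uncertainty off the replicate distribution. The sketch below is a generic illustration of that idea (a percentile interval for the mean, on made-up data), not the UFBoot algorithm itself:

```python
import random
import statistics

def bootstrap_replicates(data, stat, n_boot=1000, seed=0):
    """Draw n_boot resamples (with replacement) and apply `stat` to each."""
    rng = random.Random(seed)
    n = len(data)
    return [stat([data[rng.randrange(n)] for _ in range(n)])
            for _ in range(n_boot)]

# Made-up data; a rough 95% percentile interval from the sorted replicates.
data = [2.1, 2.4, 1.9, 2.8, 2.2, 2.5, 2.0, 2.6]
reps = sorted(bootstrap_replicates(data, statistics.mean))
ci = (reps[24], reps[974])  # ~2.5th and ~97.5th percentiles of 1000 replicates
```

UFBoot's contribution is avoiding a full tree search per replicate by reusing a collection of candidate trees; the resampling step itself is the standard one shown here.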

723 citations

Journal ArticleDOI
TL;DR: In this paper, the authors evaluate the performance of the bootstrap resampling method for estimating model test statistic p values and parameter standard errors under nonnormal data conditions.
Abstract: Though the common default maximum likelihood estimator used in structural equation modeling is predicated on the assumption of multivariate normality, applied researchers often find themselves with data clearly violating this assumption and without sufficient sample size to utilize distribution-free estimation methods. Fortunately, promising alternatives are being integrated into popular software packages. Bootstrap resampling, which is offered in AMOS (Arbuckle, 1997), is one potential solution for estimating model test statistic p values and parameter standard errors under nonnormal data conditions. This study is an evaluation of the bootstrap method under varied conditions of nonnormality, sample size, model specification, and number of bootstrap samples drawn from the resampling space. Accuracy of the test statistic p values is evaluated in terms of model rejection rates, whereas accuracy of bootstrap standard error estimates takes the form of bias and variability of the standard error estimates themselves.
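A bootstrap standard error of the kind evaluated here is obtained by case resampling: resample whole observations with replacement and take the standard deviation of the replicated statistic. The sketch below does this for a Pearson correlation on invented (x, y) pairs; it is a generic illustration of the estimator, not the AMOS procedure:

```python
import math
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def bootstrap_se(pairs, stat, n_boot=2000, seed=1):
    """Case resampling: resample (x, y) pairs with replacement and
    report the standard deviation of the replicated statistic."""
    rng = random.Random(seed)
    n = len(pairs)
    reps = []
    for _ in range(n_boot):
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        xs, ys = zip(*sample)
        reps.append(stat(xs, ys))
    m = sum(reps) / n_boot
    return math.sqrt(sum((r - m) ** 2 for r in reps) / (n_boot - 1))

# Invented data for illustration only.
pairs = [(1, 2.0), (2, 2.9), (3, 4.2), (4, 3.8), (5, 5.1),
         (6, 5.9), (7, 7.2), (8, 6.8), (9, 9.1), (10, 9.7)]
se = bootstrap_se(pairs, pearson)
```

The study's question is precisely how biased and how variable such `se` values are across nonnormality and sample-size conditions.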

715 citations

Journal ArticleDOI
TL;DR: The motivation for this work comes from a desire to preserve the dependence structure of the time series while bootstrapping (resampling it with replacement), and the method is data driven and is preferred where the investigator is uncomfortable with prior assumptions.
Abstract: A nonparametric method for resampling scalar or vector-valued time series is introduced. Multivariate nearest neighbor probability density estimation provides the basis for the resampling scheme developed. The motivation for this work comes from a desire to preserve the dependence structure of the time series while bootstrapping (resampling it with replacement). The method is data driven and is preferred where the investigator is uncomfortable with prior assumptions as to the form (e.g., linear or nonlinear) of dependence and the form of the probability density function (e.g., Gaussian). Such prior assumptions are often made in an ad hoc manner for analyzing hydrologic data. Connections of the nearest neighbor bootstrap to Markov processes as well as its utility in a general Monte Carlo setting are discussed. Applications to resampling monthly streamflow and some synthetic data are presented. The method is shown to be effective with time series generated by linear and nonlinear autoregressive models. The utility of the method for resampling monthly streamflow sequences with asymmetric and bimodal marginal probability densities is also demonstrated.
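The nearest neighbor bootstrap can be sketched as follows: each new value is the historical successor of one of the k nearest neighbours of the current state, so the resampled series inherits the short-range dependence of the original. The code below is a simplified scalar illustration with a 1/rank resampling kernel and invented data, not the authors' full multivariate scheme:

```python
import random

def knn_bootstrap(series, length, k=3, seed=0):
    """Generate a synthetic series: at each step, find the k historical
    states nearest the current value and emit the successor of one of
    them, chosen with probability proportional to 1/rank."""
    rng = random.Random(seed)
    hist = list(series)
    # Start from a random historical state that has a successor.
    out = [hist[rng.randrange(len(hist) - 1)]]
    weights = [1.0 / j for j in range(1, k + 1)]  # 1/rank kernel
    while len(out) < length:
        x = out[-1]
        # k nearest historical states (exclude the last point: no successor).
        idx = sorted(range(len(hist) - 1), key=lambda i: abs(hist[i] - x))[:k]
        chosen = rng.choices(idx, weights=weights)[0]
        out.append(hist[chosen + 1])  # successor preserves dependence
    return out

synthetic = knn_bootstrap([1.0, 2.0, 3.0, 2.5, 1.5, 2.2, 3.1, 2.8], length=20)
```

Because every emitted value is an observed one, asymmetric or bimodal marginals are reproduced automatically, which is the property exploited for the streamflow application.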

713 citations

Journal ArticleDOI
TL;DR: In this article, it is shown that when some functionals of the distribution of the data are known, one can get sharper inferences on other functionals by imposing the known values as constraints on the optimization.
Abstract: Empirical likelihood is a nonparametric method of inference. It has sampling properties similar to the bootstrap, but where the bootstrap uses resampling, it profiles a multinomial likelihood supported on the sample. Its properties in i.i.d. settings have been investigated in works by Owen, by Hall and by DiCiccio, Hall and Romano. This article extends the method to regression problems. Fixed and random regressors are considered, as are robust and heteroscedastic regressions. To make the extension, three variations on the original idea are considered. It is shown that when some functionals of the distribution of the data are known, one can get sharper inferences on other functionals by imposing the known values as constraints on the optimization. The result is first order equivalent to conditioning on a sample value of the known functional. The use of a Euclidean alternative to the likelihood function is investigated. A triangular array version of the empirical likelihood theorem is given. The one-way ANOVA and heteroscedastic regression models are considered in detail. An example is given in which inferences are drawn on the parameters of both the regression function and the conditional variance model.
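The profiled multinomial likelihood at the heart of empirical likelihood can be computed explicitly in the simplest case, a hypothesis about the mean, by solving a one-dimensional equation for the Lagrange multiplier. The sketch below covers only that i.i.d.-mean case, not the regression extensions developed in the article; the function name is illustrative:

```python
import math

def el_log_ratio(data, mu, tol=1e-10):
    """-2 log empirical likelihood ratio for the mean mu.
    Maximizes prod(w_i) subject to sum(w_i) = 1 and sum(w_i * (x_i - mu)) = 0
    via bisection on the Lagrange multiplier lam.
    Requires min(data) < mu < max(data)."""
    d = [x - mu for x in data]
    # Feasible lam must keep 1 + lam * d_i > 0 for all i.
    lo = -1.0 / max(d) + 1e-9
    hi = -1.0 / min(d) - 1e-9

    def g(lam):  # decreasing in lam; root gives the profiling multiplier
        return sum(di / (1.0 + lam * di) for di in d)

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log(1.0 + lam * di) for di in d)
```

At the sample mean the statistic is zero (all weights equal 1/n, recovering the empirical distribution); away from it the statistic grows and is asymptotically chi-squared, which is what makes EL inference bootstrap-like without resampling.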

704 citations

Proceedings ArticleDOI
24 Oct 2005
TL;DR: It is first shown using simple arguments that the so-called residual and stratified methods do yield an improvement over the basic multinomial resampling approach, and a central limit theorem is established for the case where resampling is performed using the residual approach.
Abstract: This contribution is devoted to the comparison of various resampling approaches that have been proposed in the literature on particle filtering. It is first shown using simple arguments that the so-called residual and stratified methods do yield an improvement over the basic multinomial resampling approach. A simple counter-example showing that this property does not hold true for systematic resampling is given. Finally, some results on the large-sample behavior of the simple bootstrap filter algorithm are given. In particular, a central limit theorem is established for the case where resampling is performed using the residual approach.
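The multinomial and residual schemes compared in the paper can be sketched directly. Residual resampling keeps floor(N*w_i) deterministic copies of each particle and draws only the remainder multinomially, which is where its variance reduction over the basic scheme comes from. Function names below are illustrative, not from the paper:

```python
import random

def multinomial_resample(weights, rng):
    """Basic scheme: draw N particle indices i.i.d. from the weights."""
    n = len(weights)
    return rng.choices(range(n), weights=weights, k=n)

def residual_resample(weights, rng):
    """Residual scheme: keep floor(N * w_i) deterministic copies of each
    particle, then fill the remaining slots with multinomial draws on the
    leftover (residual) weights."""
    n = len(weights)
    copies = [int(n * w) for w in weights]
    idx = [i for i, c in enumerate(copies) for _ in range(c)]
    remaining = n - len(idx)
    if remaining > 0:
        residual = [n * w - c for w, c in zip(weights, copies)]
        idx += rng.choices(range(n), weights=residual, k=remaining)
    return idx
```

Stratified and systematic resampling differ only in how the uniform draws are placed; the paper's counter-example shows the systematic variant need not dominate multinomial resampling.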

692 citations


Network Information
Related Topics (5)
Estimator: 97.3K papers, 2.6M citations (89% related)
Inference: 36.8K papers, 1.3M citations (87% related)
Sampling (statistics): 65.3K papers, 1.2M citations (86% related)
Regression analysis: 31K papers, 1.7M citations (86% related)
Markov chain: 51.9K papers, 1.3M citations (83% related)
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2025    1
2024    2
2023    377
2022    759
2021    275
2020    279