Topic

Resampling

About: Resampling is a research topic. Over its lifetime, 5,428 publications have been published within this topic, receiving 242,291 citations.


Papers
Proceedings Article
28 Jul 2013
TL;DR: A large-scale study comprising nearly 60 million system comparisons showing that in practice the bootstrap, t-test and Wilcoxon test outperform the permutation test under different optimality criteria.
Abstract: Previous research has suggested the permutation test as the theoretically optimal statistical significance test for IR evaluation, and advocated for the discontinuation of the Wilcoxon and sign tests. We present a large-scale study comprising nearly 60 million system comparisons showing that in practice the bootstrap, t-test and Wilcoxon test outperform the permutation test under different optimality criteria. We also show that actual error rates seem to be lower than the theoretically expected 5%, further confirming that we may actually be underestimating significance.

33 citations
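
As a rough illustration of the kind of paired comparison discussed in the abstract above, the sketch below implements a two-sided sign-flip permutation (randomization) test on per-topic score differences between two systems. It is a minimal example under stated assumptions, not the study's actual procedure: the function name `paired_permutation_pvalue`, the number of resamples, and the add-one correction are all illustrative choices.

```python
import random


def paired_permutation_pvalue(a, b, n_resamples=10000, seed=0):
    """Two-sided sign-flip permutation test for paired scores.

    `a` and `b` are hypothetical per-topic effectiveness scores of two
    systems, aligned by topic. Returns a Monte Carlo p-value.
    """
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs) / len(diffs))
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the sign of each paired difference
        # is exchangeable, so flip each sign with probability 1/2.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / len(flipped)) >= observed:
            hits += 1
    # Add-one correction (an assumption of this sketch) keeps the
    # estimated p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)
```

A small p-value suggests that the mean difference between the two systems is unlikely under sign exchangeability; a bootstrap, t-test, or Wilcoxon test would be run on the same paired differences.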

Journal Article
TL;DR: This paper proposes a generalized significance testing method for global measures of spatial association by extending the Mantel test. The extended test is shown to apply to any form of spatial weights matrix, including matrices with nonzero diagonal elements, and to bivariate spatial association measures such as Cross-Moran and Lee's L.
Abstract: This research is concerned with providing a generalized significance testing method for global measures of spatial association by extending the Mantel test. Even though it has long been recognized that univariate spatial association measures such as Moran's I and Geary's c are special cases of Mantel's generalized association statistic, an intensive and comprehensive examination of the connections, particularly in terms of significance testing has never been undertaken. Furthermore, researchers have faced difficulties in dealing with spatial weights matrices with nonzero diagonal elements, and establishing the significance testing method for bivariate spatial association measures such as Cross–Moran and Lee's L. The author demonstrates that the proposed extended Mantel test can be applied to any global measure of spatial association with any form of spatial weights matrix in order to approximate the first two moments of the measures. A Monte Carlo simulation for each measure with various forms of spatial ...

33 citations
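
For orientation, the classical Mantel permutation test that the abstract above builds on can be sketched as follows. This is the textbook version only, assuming square matrices and ignoring diagonal entries; it is not the extended test proposed in the paper, which is described as handling nonzero diagonals and working through the first two moments. The function names and the one-sided, add-one p-value are assumptions of this sketch.

```python
import random


def mantel_statistic(A, B):
    """Mantel cross-product statistic over off-diagonal entries."""
    n = len(A)
    return sum(A[i][j] * B[i][j]
               for i in range(n) for j in range(n) if i != j)


def mantel_test(A, B, n_permutations=999, seed=0):
    """One-sided (upper-tail) Mantel permutation test.

    Rows and columns of B are permuted together, which preserves the
    structure of B while breaking its alignment with A.
    """
    rng = random.Random(seed)
    n = len(A)
    observed = mantel_statistic(A, B)
    idx = list(range(n))
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(idx)
        permuted = [[B[idx[i]][idx[j]] for j in range(n)]
                    for i in range(n)]
        if mantel_statistic(A, permuted) >= observed:
            count += 1
    # Add-one correction: the observed arrangement counts as one permutation.
    return observed, (count + 1) / (n_permutations + 1)
```

Here `A` could be a spatial weights or distance matrix and `B` a matrix of attribute dissimilarities; statistics such as Moran's I correspond to particular choices of these two matrices.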

Proceedings Article
09 Oct 2008
TL;DR: The results show that the prediction accuracy of Analogy-X is similar to that of ANGEL, and that Analogy-X additionally allows the use of Mantel statistics to select project features and detect abnormal data points, which provides a sound statistical basis for analogy-based systems.
Abstract: This paper reports on the empirical evaluation of a novel approach called Analogy-X, which is an extension to the classical analogy-based software cost estimation. The Analogy-X approach is a set of procedures that utilize the principles of the Mantel randomization test to provide inferential statistics to Analogy. Our previous studies have clearly demonstrated the novelty and effectiveness of this technique. This paper provides further empirical evaluation of Analogy-X using different kinds of datasets. Our results show that the prediction accuracy of Analogy-X is similar to the one of ANGEL. Analogy-X has the additional advantage of allowing the use of Mantel statistics to select project features and detect abnormal data points, which provides a sound statistical basis for analogy-based systems.

33 citations

Journal Article
TL;DR: It is shown that a class of SPRT boundaries is minimax with respect to resampling risk, and a truncated version of boundaries in that class is recommended by comparing their resampling risk (RR) to the RR of fixed boundaries with the same maximum resample size.
Abstract: When designing programs or software for the implementation of Monte Carlo (MC) hypothesis tests, we can save computation time by using sequential stopping boundaries. Such boundaries imply stopping resampling after relatively few replications if the early replications indicate a very large or a very small p value. We study a truncated sequential probability ratio test (SPRT) boundary and provide a tractable algorithm to implement it. We review two properties desired of any MC p value, the validity of the p value and a small resampling risk, where resampling risk is the probability that the accept/reject decision will be different than the decision from complete enumeration. We show how the algorithm can be used to calculate a valid p value and confidence intervals for any truncated SPRT boundary. We show that a class of SPRT boundaries is minimax with respect to resampling risk and recommend a truncated version of boundaries in that class by comparing their resampling risk (RR) to the RR of fixed boundari...

33 citations
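
The idea of stopping a Monte Carlo test early, as described in the abstract above, can be illustrated with a much simpler rule than the truncated SPRT boundary the paper studies: stop as soon as a fixed number of resampled statistics reach the observed one, since the p-value is then clearly not small. The sketch below is an assumption-laden illustration only; `sequential_mc_pvalue`, `draw_null_stat`, and the default limits are hypothetical names and values, and the stopping rule is not the paper's algorithm.

```python
def sequential_mc_pvalue(observed, draw_null_stat,
                         max_resamples=9999, max_exceed=20):
    """Monte Carlo p-value with a simple early-stopping rule.

    `draw_null_stat` is a caller-supplied function returning one test
    statistic resampled under the null hypothesis.
    Returns (p_value, number_of_resamples_used).
    """
    exceed = 0
    for n_drawn in range(1, max_resamples + 1):
        if draw_null_stat() >= observed:
            exceed += 1
            if exceed == max_exceed:
                # Enough exceedances seen: the p-value is clearly
                # large, so further resampling is unnecessary.
                return exceed / n_drawn, n_drawn
    # Resampling budget exhausted: use the usual add-one estimate.
    return (exceed + 1) / (max_resamples + 1), max_resamples
```

With an unremarkable observed statistic the loop ends after a few dozen draws, while an extreme statistic uses the full budget, which is where most of the computational saving comes from.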

Journal Article
TL;DR: To avoid the recalculation of time-consuming robust regression estimates, fast approximations for the robust estimates of the resampled data are used and this leads to time-efficient and robust estimators of prediction error.

33 citations


Network Information
Related Topics (5)
Estimator
97.3K papers, 2.6M citations
89% related
Inference
36.8K papers, 1.3M citations
87% related
Sampling (statistics)
65.3K papers, 1.2M citations
86% related
Regression analysis
31K papers, 1.7M citations
86% related
Markov chain
51.9K papers, 1.3M citations
83% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2025    1
2024    2
2023    377
2022    759
2021    275
2020    279