Topic
Resampling
About: Resampling is a research topic. Over its lifetime, 5428 publications have appeared within this topic, receiving 242291 citations.
Papers
06 Dec 2009
TL;DR: In this paper, the authors explore the framework of permutation-based p-values for assessing the behavior of the classification error, studying two simple permutation tests: the first estimates the null distribution by permuting the labels in the data, an approach used extensively in classification problems in computational biology; the second permutes the features within classes, inspired by restricted randomization techniques traditionally used in statistics.
Abstract: We explore the framework of permutation-based p-values for assessing the behavior of the classification error. In this paper we study two simple permutation tests. The first test estimates the null distribution by permuting the labels in the data; this has been used extensively in classification problems in computational biology. The second test produces permutations of the features within classes, inspired by restricted randomization techniques traditionally used in statistics. We study the properties of these tests and present an extensive empirical evaluation on real and synthetic data. Our analysis shows that studying the classification error via permutation tests is effective; in particular, the restricted permutation test clearly reveals whether the classifier exploits the interdependency between the features in the data.
392 citations
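The label-permutation test described in this abstract can be illustrated with a short sketch. The nearest-centroid classifier, the synthetic Gaussian data, and all parameter values below are illustrative assumptions, not the authors' experimental setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def error_rate(X, y):
    """Training error of a simple nearest-centroid classifier."""
    c0, c1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    pred = (np.linalg.norm(X - c1, axis=1) < np.linalg.norm(X - c0, axis=1)).astype(int)
    return float(np.mean(pred != y))

# Toy two-class data with a real mean shift between classes.
X = np.vstack([rng.normal(0.0, 1.0, (50, 5)), rng.normal(1.5, 1.0, (50, 5))])
y = np.repeat([0, 1], 50)

observed = error_rate(X, y)
# Null distribution: recompute the error after permuting the labels.
null = np.array([error_rate(X, rng.permutation(y)) for _ in range(999)])
# One-sided permutation p-value with the standard +1 correction.
p_value = (1 + np.sum(null <= observed)) / (1 + len(null))
```

A small p-value indicates the classifier's error is far below what chance labelings produce; the paper's restricted variant follows the same pattern but permutes feature columns within each class instead of the labels.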
TL;DR: In this paper, a broad class of rank-based monotone estimating functions is developed for the semiparametric accelerated failure time model with censored observations, which are shown to be consistent and asymptotically normal.
Abstract: A broad class of rank-based monotone estimating functions is developed for the semiparametric accelerated failure time model with censored observations. The corresponding estimators can be obtained via linear programming, and are shown to be consistent and asymptotically normal. The limiting covariance matrices can be estimated by a resampling technique, which does not involve nonparametric density estimation or numerical derivatives. The new estimators represent consistent roots of the non-monotone estimating equations based on the familiar weighted log-rank statistics. Simulation studies demonstrate that the proposed methods perform well in practical settings. Two real examples are provided.
382 citations
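The appeal of estimating a limiting covariance by resampling, rather than by nonparametric density estimation, can be illustrated on a much simpler estimator. The example below (a plain bootstrap standard error for a sample median, whose asymptotic variance involves an unknown density) is a generic illustration of that idea, not the paper's specific resampling scheme for rank-based estimating functions:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=500)

# The asymptotic variance of the median is 1 / (4 n f(m)^2), which requires
# estimating the density f at the median m; bootstrap resampling avoids this.
boot_medians = np.array([np.median(rng.choice(x, size=x.size, replace=True))
                         for _ in range(2000)])
se_boot = boot_medians.std(ddof=1)
```

For Exp(1) with n = 500 the asymptotic standard error works out to 1/sqrt(500) ≈ 0.045, and the bootstrap estimate lands close to it without ever touching a density estimate.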
20 Feb 2013
TL;DR: This Second edition is a practical guide to data analysis using the bootstrap, cross-validation, and permutation tests and is an essential resource for industrial statisticians, statistical consultants, and research professionals in science, engineering, and technology.
Abstract: The goal of this book is to introduce statistical methodology (estimation, hypothesis testing, and classification) to a wide applied audience through resampling from existing data via the bootstrap and cross-validation methods. The book provides an accessible introduction and practical guide to the power, simplicity, and versatility of the bootstrap, cross-validation, and permutation tests. Industrial statistical consultants, professionals, and researchers will find the book's methods and software immediately helpful. This second edition is a practical guide to data analysis using the bootstrap, cross-validation, and permutation tests. It is an essential resource for industrial statisticians, statistical consultants, and research professionals in science, engineering, and technology. Requiring only minimal mathematics beyond algebra, it provides a table-free introduction to data analysis utilizing numerous exercises, practical data sets, and freely available statistical shareware.
Topics and features:
* Thoroughly revised text featuring more practical examples, plus an additional chapter devoted to regression and data mining techniques and their limitations
* Uses a resampling approach to introduce statistics
* Practical presentation that covers all three sampling methods: bootstrap, density estimation, and permutation
* Includes a systematic guide to help select the correct procedure for a particular application
* Detailed coverage of all three statistical methodologies: classification, estimation, and hypothesis testing
* Suitable for classroom use and individual self-study
* Numerous practical examples using popular computer programs such as SAS, Stata, and StatXact
* Useful appendices with computer programs and code to develop one's own methods
* Downloadable freeware from the author's website: http://users.oco.net/drphilgood/resamp.htm
With its accessible style and intuitive topic development, the book is an excellent basic resource and guide to the power, simplicity, and versatility of bootstrap, cross-validation, and permutation tests. Students, professionals, and researchers will find it a particularly useful guide to modern resampling methods and their applications.
376 citations
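As a minimal example of the book's bootstrap theme, a percentile bootstrap confidence interval for a mean takes only a few lines; the data and parameter values below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(loc=10.0, scale=2.0, size=100)

# Resample the data with replacement and collect the statistic of interest.
boot_means = np.array([rng.choice(data, size=data.size, replace=True).mean()
                       for _ in range(5000)])
# Percentile method: take a 95% interval directly from the bootstrap distribution.
lo, hi = np.percentile(boot_means, [2.5, 97.5])
```

The percentile method is only the simplest of several bootstrap interval constructions; the same resample-and-recompute loop underlies the more refined intervals the book discusses.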
TL;DR: The proposed algorithms improve the scalability of particle filter architectures affected by the resampling process; communication through the interconnection network is reduced and made deterministic, which results in a simpler network structure and an increased sampling frequency.
Abstract: In this paper, we propose novel resampling algorithms with architectures for efficient distributed implementation of particle filters. The proposed algorithms improve the scalability of the filter architectures affected by the resampling process. Problems in the particle filter implementation due to resampling are described, and appropriate modifications of the resampling algorithms are proposed so that distributed implementations are developed and studied. Distributed resampling algorithms with proportional allocation (RPA) and nonproportional allocation (RNA) of particles are considered. The components of the filter architectures are the processing elements (PEs), a central unit (CU), and an interconnection network. One of the main advantages of the new resampling algorithms is that communication through the interconnection network is reduced and made deterministic, which results in a simpler network structure and increased sampling frequency. Particle filter performances are estimated for bearings-only tracking applications. In the architectural part of the analysis, the area and speed of the particle filter implementation are estimated for different numbers of particles and different levels of parallelism with field programmable gate array (FPGA) implementation. In this paper, only sampling importance resampling (SIR) particle filters are considered, but the analysis can be extended to any particle filters with resampling.
360 citations
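The resampling step inside an SIR particle filter is commonly implemented with systematic resampling, an O(N) low-variance scheme. The sketch below shows that standard textbook algorithm, not the distributed RPA/RNA variants proposed in the paper:

```python
import numpy as np

def systematic_resample(weights, rng):
    """Systematic resampling: one uniform draw, N evenly spaced positions."""
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n
    cumulative = np.cumsum(weights)
    cumulative[-1] = 1.0  # guard against floating-point round-off
    return np.searchsorted(cumulative, positions)

rng = np.random.default_rng(0)
weights = np.array([0.5, 0.3, 0.1, 0.05, 0.05])  # normalized particle weights
indices = systematic_resample(weights, rng)      # indices of surviving particles
```

High-weight particles are duplicated and low-weight ones dropped; each particle with weight w_i receives either floor(N*w_i) or ceil(N*w_i) copies, which is what makes the scheme low-variance.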
TL;DR: In this paper, the authors compared the performance of a variety of approaches for assessing the significance of eigenvector coefficients in terms of type I error rates and power, and two novel approaches based on the broken-stick model were also evaluated.
Abstract: Principal component analysis (PCA) is one of the most commonly used tools in the analysis of ecological data. This method reduces the effective dimensionality of a multivariate data set by producing linear combinations of the original variables (i.e., components) that summarize the predominant patterns in the data. In order to provide meaningful interpretations for principal components, it is important to determine which variables are associated with particular components. Some data analysts incorrectly test the statistical significance of the correlation between original variables and multivariate scores using standard statistical tables. Others interpret eigenvector coefficients larger than an arbitrary absolute value (e.g., 0.50). Resampling, randomization techniques, and parallel analysis have been applied in a few cases. In this study, we compared the performance of a variety of approaches for assessing the significance of eigenvector coefficients in terms of type I error rates and power. Two novel approaches based on the broken-stick model were also evaluated. We used a variety of simulated scenarios to examine the influence of the number of real dimensions in the data; unique versus complex variables; the magnitude of eigenvector coefficients; and the number of variables associated with a particular dimension. Our results revealed that bootstrap confidence intervals and a modified bootstrap confidence interval for the broken-stick model proved to be the most reliable techniques.
357 citations
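The broken-stick model mentioned in this abstract has a simple closed form: under the null, the expected proportion of variance for component i out of p is b_i = (1/p) * sum_{k=i}^{p} 1/k. The sketch below applies a plain point comparison against these values; the simulated data and cutoff rule are illustrative, not the authors' evaluation design:

```python
import numpy as np

def broken_stick(p):
    """Expected eigenvalue proportions b_i = (1/p) * sum_{k=i}^{p} 1/k."""
    return np.array([sum(1.0 / k for k in range(i, p + 1)) / p
                     for i in range(1, p + 1)])

# Toy data: four of five variables share one latent factor.
rng = np.random.default_rng(3)
factor = rng.normal(size=200)
X = rng.normal(size=(200, 5))
X[:, :4] = factor[:, None] + 0.4 * X[:, :4]

eigvals = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]  # descending
# Retain components whose variance proportion exceeds the broken-stick value.
retained = (eigvals / eigvals.sum()) > broken_stick(5)
```

The abstract's preferred approach goes further, bootstrapping confidence intervals around broken-stick values rather than using a single point comparison like the one above.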