Topic

Resampling

About: Resampling is a research topic. Over its lifetime, 5,428 publications have been published on this topic, receiving 242,291 citations.

Papers

Journal Article•DOI•

Optimally splitting cases for training and testing high dimensional classifiers

Kevin K. Dobbin¹, Richard M. Simon²•Institutions (2)

University of Georgia¹, National Institutes of Health²

08 Apr 2011-BMC Medical Genomics

TL;DR: A non-parametric algorithm is developed for determining the optimal proportion of samples to assign to the training set; it can be applied to any dataset, using any predictor development method, to determine the best split.

Abstract: We consider the problem of designing a study to develop a predictive classifier from high dimensional data. A common study design is to split the sample into a training set and an independent test set, where the former is used to develop the classifier and the latter to evaluate its performance. In this paper we address the question of what proportion of the samples should be devoted to the training set. How does this proportion impact the mean squared error (MSE) of the prediction accuracy estimate? We develop a non-parametric algorithm for determining an optimal splitting proportion that can be applied with a specific dataset and classifier algorithm. We also perform a broad simulation study for the purpose of better understanding the factors that determine the best split proportions and to evaluate commonly used splitting strategies (1/2 training or 2/3 training) under a wide variety of conditions. These methods are based on a decomposition of the MSE into three intuitive component parts. By applying these approaches to a number of synthetic and real microarray datasets we show that for linear classifiers the optimal proportion depends on the overall number of samples available and the degree of differential expression between the classes. The optimal proportion was found to depend on the full dataset size (n) and classification accuracy - with higher accuracy and smaller n resulting in more cases assigned to the training set. The commonly used strategy of allocating 2/3 of cases for training was close to optimal for reasonably sized datasets (n ≥ 100) with strong signals (i.e. 85% or greater full dataset accuracy). In general, we recommend use of our nonparametric resampling approach for determining the optimal split. This approach can be applied to any dataset, using any predictor development method, to determine the best split.

252 citations
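
As a rough illustration of the question this abstract poses, the sketch below repeats random train/test splits at a chosen training fraction and reports the mean and variance of the resulting test accuracy. It is not the authors' algorithm or their MSE decomposition: the 1-D nearest-centroid classifier, the function names, and the simulated two-class data are all assumptions made to keep the example self-contained.

```python
import random
import statistics


def nearest_centroid_accuracy(train, test):
    """Fit a 1-D nearest-centroid classifier on `train`; return accuracy on `test`.

    Each case is a (value, label) pair with label 0 or 1.
    Returns None if a class is missing from the training split.
    """
    centroids = {}
    for label in (0, 1):
        values = [x for x, y in train if y == label]
        if not values:
            return None
        centroids[label] = statistics.fmean(values)
    correct = sum(
        1
        for x, y in test
        if min(centroids, key=lambda c: abs(x - centroids[c])) == y
    )
    return correct / len(test)


def evaluate_split(data, train_fraction, n_repeats=200, seed=0):
    """Mean and variance of test accuracy over repeated random splits."""
    rng = random.Random(seed)
    # Keep at least two training cases and at least one test case.
    n_train = max(2, min(len(data) - 1, round(train_fraction * len(data))))
    accuracies = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        acc = nearest_centroid_accuracy(shuffled[:n_train], shuffled[n_train:])
        if acc is not None:
            accuracies.append(acc)
    return statistics.fmean(accuracies), statistics.pvariance(accuracies)


if __name__ == "__main__":
    # Simulated data (an assumption): two classes whose means differ by 2 SD.
    rng = random.Random(1)
    data = [(rng.gauss(0, 1), 0) for _ in range(50)]
    data += [(rng.gauss(2, 1), 1) for _ in range(50)]
    for fraction in (0.5, 2 / 3, 0.9):
        mean_acc, var_acc = evaluate_split(data, fraction)
        print(f"train fraction {fraction:.2f}: mean {mean_acc:.3f}, variance {var_acc:.5f}")
```

Larger training fractions tend to give a better classifier but leave fewer test cases, so the accuracy estimate becomes more variable; a trade-off of this kind is what makes the choice of split proportion non-trivial.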

Journal Article•DOI•

The Performance of Risk Prediction Models

Thomas A. Gerds, Tianxi Cai, Martin Schumacher¹•Institutions (1)

University of Freiburg¹

01 Aug 2008-Biometrical Journal

TL;DR: A systematic review of modern approaches to assessing risk prediction models, using measures of predictive performance derived from ROC methodology and from probability forecasting theory.

Abstract: For medical decision making and patient information, predictions of future status variables play an important role. Risk prediction models can be derived with many different statistical approaches. To compare them, measures of predictive performance are derived from ROC methodology and from probability forecasting theory. These tools can be applied to assess single markers, multivariable regression models and complex model selection algorithms. This article provides a systematic review of the modern way of assessing risk prediction models. Particular attention is put on proper benchmarks and resampling techniques that are important for the interpretation of measured performance. All methods are illustrated with data from a clinical study in head and neck cancer patients.

249 citations
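
To make the two families of measures named in this abstract concrete, the sketch below computes an AUC (from ROC methodology) and a Brier score (from probability forecasting theory) for a binary outcome, together with the Brier score of a null model that predicts the observed prevalence for everyone. The function names and the restriction to binary, uncensored outcomes are assumptions for illustration; the article itself may treat more general settings.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)


def auc(probs, outcomes):
    """Probability that a random case is ranked above a random control.

    Ties in predicted risk count as one half.
    """
    cases = [p for p, y in zip(probs, outcomes) if y == 1]
    controls = [p for p, y in zip(probs, outcomes) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def null_model_brier(outcomes):
    """Benchmark: Brier score when every subject gets the overall prevalence."""
    prevalence = sum(outcomes) / len(outcomes)
    return brier_score([prevalence] * len(outcomes), outcomes)
```

A model is only informative on the Brier scale if it scores below the null-model benchmark, which is one simple way to read the abstract's emphasis on proper benchmarks.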

Journal Article•DOI•

Resampling algorithms for particle filters: a computational complexity perspective

Miodrag Bolic¹, Petar M. Djuric¹, Sangjin Hong¹•Institutions (1)

Stony Brook University¹

01 Jan 2004-EURASIP Journal on Advances in Signal Processing

TL;DR: Describes newly developed resampling algorithms for particle filters that are suitable for real-time implementation and that reduce the complexity of both hardware and DSP realization by decreasing the number of operations and memory accesses.

Abstract: Newly developed resampling algorithms for particle filters suitable for real-time implementation are described and their analysis is presented. The new algorithms reduce the complexity of both hardware and DSP realization through addressing common issues such as decreasing the number of operations and memory access. Moreover, the algorithms allow for use of higher sampling frequencies by overlapping in time the resampling step with the other particle filtering steps. Since resampling is not dependent on any particular application, the analysis is appropriate for all types of particle filters that use resampling. The performance of the algorithms is evaluated on particle filters applied to bearings-only tracking and joint detection and estimation in wireless communications. We have demonstrated that the proposed algorithms reduce the complexity without performance degradation.

248 citations
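
For readers unfamiliar with the resampling step in a particle filter, the sketch below shows standard systematic resampling, a common baseline. It is not one of the new algorithms this paper proposes; the function name and interface are assumptions for illustration.

```python
import random


def systematic_resample(weights, rng=random):
    """Return the indices of the particles kept after systematic resampling.

    `weights` need not be normalized. A single uniform draw places n evenly
    spaced pointers on [0, 1); each pointer selects the particle whose
    cumulative-weight interval contains it.
    """
    n = len(weights)
    total = sum(weights)
    start = rng.random() / n  # offset in [0, 1/n)
    indices = []
    i = 0
    cumulative = weights[0] / total
    for j in range(n):
        pointer = start + j / n
        # Advance to the particle whose interval covers the pointer; the
        # bound on i guards against floating-point rounding at the end.
        while pointer > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i] / total
        indices.append(i)
    return indices
```

With this scheme a particle with normalized weight w is copied either floor(n·w) or ceil(n·w) times, and the whole step needs one random number and a single pass over the weights.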

Book•

When Does Bootstrap Work?: Asymptotic Results and Simulations

Enno Mammen

29 Jul 1992

TL;DR: This book considers the application of the bootstrap to the estimation of smooth functionals, non-parametric curve estimation, and linear models, and investigates the conditions under which the bootstrap works satisfactorily.

Abstract: Bootstrap methods are procedures for estimating or approximating the distribution of a statistic based on ideas from resampling and simulation methods. This volume is concerned with the asymptotic behaviour of the bootstrap and investigates the conditions under which the bootstrap works satisfactorily. In particular, the author considers the application of the bootstrap to the estimation of smooth functionals, non-parametric curve estimation, and to linear models. Readers are assumed to have a working familiarity with the basics of bootstrap methods.

246 citations
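
The basic idea behind this abstract's first sentence, approximating the distribution of a statistic by resampling, can be sketched as follows. This is a minimal nonparametric bootstrap using only the standard library; the function names are assumptions, and the example says nothing about the asymptotic conditions the book studies.

```python
import random
import statistics


def bootstrap_distribution(sample, statistic, n_boot=2000, seed=0):
    """Values of `statistic` on n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(n_boot)]


def bootstrap_standard_error(sample, statistic, n_boot=2000, seed=0):
    """Standard deviation of the bootstrap distribution of `statistic`."""
    return statistics.stdev(bootstrap_distribution(sample, statistic, n_boot, seed))
```

For the sample mean, which is a smooth functional, the bootstrap standard error should land close to the plug-in value sqrt(pvariance(sample) / n), which gives a quick sanity check on the resampling code.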

Journal Article•DOI•

The Detection of Density-Dependence from a Series of Annual Censuses

E. Pollard, K. H. Lakhani, P. Rothery

01 Dec 1987-Ecology

TL;DR: A distribution-free approach to the detection of density-dependence in the variation of population abundance, measured by a series of annual censuses, is reported, which shows that the randomization test is effective whether or not there is a marked trend in the observed data.

Abstract: We report a distribution-free approach to the detection of density-dependence in the variation of population abundance, measured by a series of annual censuses. The method uses the correlation coefficient between the observed population changes and population size and proposes a randomization procedure to define a rejection region for the hypothesis of density-independence. It is shown that the use of the proposed statistic under the randomization approach is equivalent to the likelihood ratio test for a particular family of time series models. The randomization test is compared with two other recently proposed tests. Using computer-generated density-independent and density-dependent data, it is shown that, unlike the other tests, the randomization test is effective whether or not there is a marked trend in the observed data. Arguments are presented showing how one of the other two tests can be further improved. Caution is urged in the use and interpretation of any test for detecting density-dependence in census data because (a) the tests depend on assumptions about population processes, (b) errors of measurement may lead to spurious detection of density-dependence.

244 citations
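
One possible reading of the randomization procedure in this abstract is sketched below. The test statistic is the correlation between each year's population size and the change to the following year, as the abstract states. The rest is assumed rather than taken from the abstract: the input is taken to be log counts, the null distribution is generated by shuffling the order of the observed changes and rebuilding the series from the first census, and strongly negative correlations are treated as evidence of density-dependence.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient (assumes neither input is constant)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def randomization_test(log_counts, n_perm=999, seed=0):
    """Return (observed correlation, one-sided randomization p-value)."""
    rng = random.Random(seed)
    changes = [b - a for a, b in zip(log_counts, log_counts[1:])]
    observed = pearson(log_counts[:-1], changes)
    hits = 1  # the observed ordering counts as one arrangement
    for _ in range(n_perm):
        shuffled = changes[:]
        rng.shuffle(shuffled)
        # Rebuild a series from the first census using the shuffled changes.
        series = [log_counts[0]]
        for change in shuffled:
            series.append(series[-1] + change)
        if pearson(series[:-1], shuffled) <= observed:
            hits += 1
    return observed, hits / (n_perm + 1)
```

As the abstract cautions, a small p-value from a test like this depends on assumptions about population processes and can be produced by measurement error alone.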

Metrics

Papers: 6,588

Citations: 269,186

No. of papers in the topic in previous years
Year	Papers
2025	1
2024	2
2023	377
2022	759
2021	275
2020	279
