Topic

Resampling

About: Resampling is a research topic. Over its lifetime, 5,428 publications have been published on this topic, receiving 242,291 citations.


Papers
Journal ArticleDOI
01 Feb 2004
TL;DR: The paper concludes that combining different expressions of the resampling approach is an effective solution to the tuning problem; the proposed combination scheme, evaluated on imbalanced subsets of the Reuters-21578 text collection, is shown to be quite effective for these problems.
Abstract: Resampling methods are commonly used for dealing with the class-imbalance problem. Their advantage over other methods is that they are external and thus, easily transportable. Although such approaches can be very simple to implement, tuning them most effectively is not an easy task. In particular, it is unclear whether oversampling is more effective than undersampling and which oversampling or undersampling rate should be used. This paper presents an experimental study of these questions and concludes that combining different expressions of the resampling approach is an effective solution to the tuning problem. The proposed combination scheme is evaluated on imbalanced subsets of the Reuters-21578 text collection and is shown to be quite effective for these problems.
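
The tuning question the abstract raises, how much to oversample the minority class versus undersample the majority class, can be made concrete with a small sketch. The following Python function is an illustrative assumption rather than the paper's combination scheme; the rates and names are invented for the example.

import numpy as np

def resample_binary(X, y, oversample_rate=2.0, undersample_rate=0.5, seed=0):
    # Rebalance a binary data set by duplicating minority-class rows
    # (random oversampling) and dropping majority-class rows
    # (random undersampling). Rates here are arbitrary placeholders.
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    majority = classes[np.argmax(counts)]
    min_idx = np.flatnonzero(y == minority)
    maj_idx = np.flatnonzero(y == majority)
    # Oversample: draw minority indices with replacement.
    min_idx = rng.choice(min_idx, size=int(len(min_idx) * oversample_rate), replace=True)
    # Undersample: keep a random subset of majority indices.
    maj_idx = rng.choice(maj_idx, size=int(len(maj_idx) * undersample_rate), replace=False)
    keep = np.concatenate([min_idx, maj_idx])
    rng.shuffle(keep)
    return X[keep], y[keep]

In practice the two rates would be chosen jointly, which is exactly the tuning problem the paper addresses by combining several resampling settings instead of picking one.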

904 citations

Book
01 Jan 2008
TL;DR: In this book, the authors present statistical and data-analysis methods for paleontological data, ranging from basic univariate tests and resampling through multivariate analysis, morphometrics, and phylogenetics to paleoecology, time series analysis, and quantitative biostratigraphy.
Abstract (table of contents):
Preface. Acknowledgments.
1 Introduction: 1.1 The nature of paleontological data. 1.2 Advantages and pitfalls of paleontological data analysis. 1.3 Software.
2 Basic statistical methods: 2.1 Introduction. 2.2 Statistical distributions. 2.3 Shapiro-Wilk test for normal distribution. 2.4 F test for equality of variances. 2.5 Student's t test and Welch test for equality of means. 2.6 Mann-Whitney U test for equality of medians. 2.7 Kolmogorov-Smirnov test for equality of distributions. 2.8 Permutation and resampling. 2.9 One-way ANOVA. 2.10 Kruskal-Wallis test. 2.11 Linear correlation. 2.12 Non-parametric tests for correlation. 2.13 Linear regression. 2.14 Reduced major axis regression. 2.15 Nonlinear curve fitting. 2.16 Chi-square test.
3 Introduction to multivariate data analysis: 3.1 Approaches to multivariate data analysis. 3.2 Multivariate distributions. 3.3 Parametric multivariate tests. 3.4 Non-parametric multivariate tests. 3.5 Hierarchical cluster analysis. 3.6 K-means cluster analysis.
4 Morphometrics: 4.1 Introduction. 4.2 The allometric equation. 4.3 Principal components analysis (PCA). 4.4 Multivariate allometry. 4.5 Discriminant analysis for two groups. 4.6 Canonical variate analysis (CVA). 4.7 MANOVA. 4.8 Fourier shape analysis. 4.9 Elliptic Fourier analysis. 4.10 Eigenshape analysis. 4.11 Landmarks and size measures. 4.12 Procrustean fitting. 4.13 PCA of landmark data. 4.14 Thin-plate spline deformations. 4.15 Principal and partial warps. 4.16 Relative warps. 4.17 Regression of partial warp scores. 4.18 Disparity measures. 4.19 Point distribution statistics. 4.20 Directional statistics. Case study: The ontogeny of a Silurian trilobite.
5 Phylogenetic analysis: 5.1 Introduction. 5.2 Characters. 5.3 Parsimony analysis. 5.4 Character state reconstruction. 5.5 Evaluation of characters and tree topologies. 5.6 Consensus trees. 5.7 Consistency index. 5.8 Retention index. 5.9 Bootstrapping. 5.10 Bremer support. 5.11 Stratigraphical congruency indices. 5.12 Phylogenetic analysis with Maximum Likelihood. Case study: The systematics of heterosporous ferns.
6 Paleobiogeography and paleoecology: 6.1 Introduction. 6.2 Diversity indices. 6.3 Taxonomic distinctness. 6.4 Comparison of diversity indices. 6.5 Abundance models. 6.6 Rarefaction. 6.7 Diversity curves. 6.8 Size-frequency and survivorship curves. 6.9 Association similarity indices for presence/absence data. 6.10 Association similarity indices for abundance data. 6.11 ANOSIM and NPMANOVA. 6.12 Correspondence analysis. 6.13 Principal Coordinates analysis (PCO). 6.14 Non-metric Multidimensional Scaling (NMDS). 6.15 Seriation. Case study: Ashgill brachiopod paleocommunities from East China.
7 Time series analysis: 7.1 Introduction. 7.2 Spectral analysis. 7.3 Autocorrelation. 7.4 Cross-correlation. 7.5 Wavelet analysis. 7.6 Smoothing and filtering. 7.7 Runs test. Case study: Sepkoski's generic diversity curve for the Phanerozoic.
8 Quantitative biostratigraphy: 8.1 Introduction. 8.2 Parametric confidence intervals on stratigraphic ranges. 8.3 Non-parametric confidence intervals on stratigraphic ranges. 8.4 Graphic correlation. 8.5 Constrained optimisation. 8.6 Ranking and scaling. 8.7 Unitary Associations. 8.8 Biostratigraphy by ordination. 8.9 What is the best method for quantitative biostratigraphy?
Appendix A: Plotting techniques. Appendix B: Mathematical concepts and notation. References. Index.
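
Several of the chapters above (2.8 Permutation and resampling, 5.9 Bootstrapping, 6.6 Rarefaction) rely on the same basic resampling idea. As a minimal sketch, assuming a simple two-sample comparison of means rather than any specific example from the book, a permutation test can be written as:

import numpy as np

def permutation_test(a, b, n_perm=9999, seed=0):
    # Two-sided p-value for the difference in means between samples a and b,
    # obtained by randomly relabelling the pooled observations.
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = pooled[:len(a)].mean() - pooled[len(a):].mean()
        if abs(diff) >= abs(observed):
            count += 1
    return (count + 1) / (n_perm + 1)   # add-one correction gives a valid p-value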

867 citations

Journal Article
TL;DR: In this article, a sampling-resampling perspective on Bayesian inference is presented that has pedagogic appeal and suggests easily implemented, sampling-based calculation strategies.
Abstract: Even to the initiated, statistical calculations based on Bayes's Theorem can be daunting because of the numerical integrations required in all but the simplest applications. Moreover, from a teaching perspective, introductions to Bayesian statistics—if they are given at all—are circumscribed by these apparent calculational difficulties. Here we offer a straightforward sampling-resampling perspective on Bayesian inference, which has both pedagogic appeal and suggests easily implemented calculation strategies.
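
The sampling-resampling idea can be illustrated with a short sketch: draw candidate parameter values from the prior, weight each draw by its likelihood, and resample in proportion to the weights to obtain approximate posterior draws. The beta-binomial setup below is an assumed toy example, not one taken from the article.

import numpy as np

rng = np.random.default_rng(1)

# Assumed observed data: 7 successes in 10 Bernoulli trials.
successes, trials = 7, 10

# 1. Sample candidate values of theta from a uniform prior on [0, 1].
prior_draws = rng.uniform(0.0, 1.0, size=100_000)

# 2. Weight each draw by its (unnormalised) binomial likelihood.
weights = prior_draws**successes * (1.0 - prior_draws)**(trials - successes)
weights /= weights.sum()

# 3. Resample with replacement in proportion to the weights.
posterior_draws = rng.choice(prior_draws, size=10_000, replace=True, p=weights)

print("posterior mean ~", posterior_draws.mean())   # close to the exact Beta(8, 4) mean of 2/3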

861 citations

Journal ArticleDOI
TL;DR: The paper's principal topic is the estimation of the variability of fitted parameters and derived quantities, such as thresholds and slopes; it introduces bias-corrected and accelerated confidence intervals that improve on the parametric and percentile-based bootstrap confidence intervals previously used.
Abstract: The psychometric function relates an observer's performance to an independent variable, usually a physical quantity of an experimental stimulus. Even if a model is successfully fit to the data and its goodness of fit is acceptable, experimenters require an estimate of the variability of the parameters to assess whether differences across conditions are significant. Accurate estimates of variability are difficult to obtain, however, given the typically small size of psychophysical data sets: Traditional statistical techniques are only asymptotically correct and can be shown to be unreliable in some common situations. Here and in our companion paper (Wichmann & Hill, 2001), we suggest alternative statistical techniques based on Monte Carlo resampling methods. The present paper's principal topic is the estimation of the variability of fitted parameters and derived quantities, such as thresholds and slopes. First, we outline the basic bootstrap procedure and argue in favor of the parametric, as opposed to the nonparametric, bootstrap. Second, we describe how the bootstrap bridging assumption, on which the validity of the procedure depends, can be tested. Third, we show how one's choice of sampling scheme (the placement of sample points on the stimulus axis) strongly affects the reliability of bootstrap confidence intervals, and we make recommendations on how to sample the psychometric function efficiently. Fourth, we show that, under certain circumstances, the (arbitrary) choice of the distribution function can exert an unwanted influence on the size of the bootstrap confidence intervals obtained, and we make recommendations on how to avoid this influence. Finally, we introduce improved confidence intervals (bias corrected and accelerated) that improve on the parametric and percentile-based bootstrap confidence intervals previously used. Software implementing our methods is available.
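
As a minimal sketch of the parametric bootstrap the abstract advocates, the snippet below fits a logistic psychometric function by maximum likelihood, simulates new data sets from the fitted model, refits each one, and reads percentile confidence intervals off the refitted parameters. The logistic form, the stimulus levels, and the response counts are assumptions for illustration, not the authors' data or software, and the bias-corrected and accelerated refinement they introduce is omitted.

import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # logistic sigmoid

# Assumed stimulus levels, trials per level, and correct responses.
x = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
n = np.array([40, 40, 40, 40, 40])
k = np.array([18, 22, 29, 36, 39])

def neg_log_lik(params, k_obs):
    alpha, beta = params  # threshold and slope of the logistic
    p = np.clip(expit(beta * (x - alpha)), 1e-9, 1 - 1e-9)
    return -np.sum(k_obs * np.log(p) + (n - k_obs) * np.log(1 - p))

def fit(k_obs):
    return minimize(neg_log_lik, x0=[2.0, 1.0], args=(k_obs,), method="Nelder-Mead").x

alpha_hat, beta_hat = fit(k)

# Parametric bootstrap: simulate binomial data from the fitted model and refit.
rng = np.random.default_rng(0)
p_hat = expit(beta_hat * (x - alpha_hat))
boot = np.array([fit(rng.binomial(n, p_hat)) for _ in range(999)])

# Percentile confidence intervals for threshold (alpha) and slope (beta).
print("alpha 95% CI:", np.percentile(boot[:, 0], [2.5, 97.5]))
print("beta  95% CI:", np.percentile(boot[:, 1], [2.5, 97.5]))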

838 citations


Network Information
Related Topics (5)
Estimator - 97.3K papers, 2.6M citations - 89% related
Inference - 36.8K papers, 1.3M citations - 87% related
Sampling (statistics) - 65.3K papers, 1.2M citations - 86% related
Regression analysis - 31K papers, 1.7M citations - 86% related
Markov chain - 51.9K papers, 1.3M citations - 83% related
Performance Metrics
Number of papers in the topic in previous years:
Year - Papers
2025 - 1
2024 - 2
2023 - 377
2022 - 759
2021 - 275
2020 - 279