Topic

Resampling

About: Resampling is a research topic. Over its lifetime, 5,428 publications have been published within this topic, receiving 242,291 citations.


Papers
Book
01 Jan 2006
TL;DR: A model-based approach to sampling rare populations and its applications to forest inventory in Finland, and a closer look at the three-level model structure.
Abstract: Preface. Acknowledgements. List of contributing authors. Part I: Theory. 1. Introduction A. Kangas et al. 1.1 General. 1.2 Historical background of sampling theory. 1.3 History of forest inventories. References.- 2. Design-based sampling and inference A. Kangas. 2.1 Basis for probability sampling. 2.2 Simple random sampling. 2.3 Determining the sample size. 2.4 Systematic sampling. 2.5 Stratified sampling. 2.6 Cluster sampling. 2.7 Ratio and regression estimators. 2.8 Sampling with probability proportional to size. 2.9 Non-linear estimators. 2.10 Resampling. 2.11 Selecting the sampling method. References.- 3. Model-based inference A. Kangas. 3.1 Foundations of model-based inference. 3.2 Models. 3.3 Applications of model-based methods to forest inventory. 3.4 Model-based versus design-based inference. References.- 4. Mensurational aspects A. Kangas. 4.1 Sample plots. 4.1.1 Plot size. 4.1.2 Plot shape. 4.2 Point sampling. 4.3 Comparison of fixed-sized plots and points. 4.4 Plots located on an edge or slope. 4.4.1 Edge corrections. 4.4.2 Slope corrections. References.- 5. Change monitoring with permanent sample plots S. Poso. 5.1 Concepts and notations. 5.2 Choice of sample plot type and tree measurement. 5.3 Estimating components of growth at the plot level. 5.4 Monitoring volume and volume increment over two or more measuring periods at the plot level. 5.5 Estimating population parameters. 5.6 Concluding remarks. References.- 6. Generalizing sample tree information J. Lappi et al. 6.1 Estimation of tally tree regression. 6.2 Generalizing sample tree information in a small subpopulation. 6.2.1 Mixed estimation. 6.2.2 Applying mixed models. 6.3 A closer look at the three-level model structure. References.- 7. Use of additional information J. Lappi, A. Kangas. 7.1 Calibration estimation. 7.2 Small area estimates. References.- 8. Sampling rare populations A. Kangas. 8.1 Methods for sampling rare populations. 8.1.1 Principles. 8.1.2 Strip sampling. 
8.1.3 Line intersect sampling. 8.1.4 Adaptive cluster sampling. 8.1.5 Transect and point relascope sampling. 8.1.6 Guided transect sampling. 8.2 Wildlife populations. 8.2.1 Line transect sampling. 8.2.2 Capture-recapture methods. 8.2.3 The wildlife triangle scheme. References.- 9. Inventories of vegetation, wild berries and mushrooms M. Maltamo. 9.1 Basic principles. 9.2 Vegetation inventories. 9.2.1 Approaches to the description of vegetation. 9.2.2 Recording of abundance. 9.2.3 Sampling methods for vegetation analysis. 9.3 Examples of vegetation surveys. 9.4 Inventories of mushrooms and wild berries. References.- 10. Assessment of uncertainty in spatially systematic sampling J. Heikkinen. 10.1 Introduction. 10.2 Notation, definitions and assumptions. 10.3 Variance estimators based on local differences. 10.3.1 Restrictions of SRS-estimator. 10.3.2 Development of estimators based on local differences. 10.4 Variance estimation in the national forest inventory in Finland. 10.5 Model-based approaches. 10.5.1 Modelling spatial variation. 10.5.2 Model-based variance and its estimation. 10.5.3 Descriptive versus analytic inference. 10.5.4 Kriging in inventories. 10.6 Other sources of uncertainty. References.- Part II: Applications. 11. The Finnish national forest inventory E. Tomppo. 11.1 Introduction. 11.2 Field sampling system used in NFI9. 11.3 Estimation based on field data. 11.3.1 Area estimation. 11.3.2 Volume estimation. 11.3.2.1 Predicting sample tree volumes and volumes by timber assortment classes. 11.3.2.2 Predicting volumes for tally trees. 11.3.3.3 Computing volumes for computation units. 11.4 Increment estimation. 11.5 Conclusions. References.-

183 citations

Proceedings ArticleDOI
07 Apr 2020
TL;DR: A key finding of this paper is that oversampling performs better than undersampling across different classifiers and obtains higher scores on different evaluation metrics.
Abstract: Data imbalance in Machine Learning refers to an unequal distribution of classes within a dataset. This issue is encountered mostly in classification tasks in which the distribution of classes or labels in a given dataset is not uniform. The straightforward way to address this problem is resampling: adding records to the minority class or deleting records from the majority class. In this paper, we have experimented with two widely adopted resampling techniques: oversampling and undersampling. To explore both techniques, we chose a public imbalanced dataset from the Kaggle website, Santander Customer Transaction Prediction, and applied a group of well-known machine learning algorithms with different hyperparameters that give the best results for both resampling techniques. One of the key findings of this paper is that oversampling performs better than undersampling for different classifiers and obtains higher scores on different evaluation metrics.
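The two resampling strategies named in this abstract can be illustrated with a minimal sketch. It is not the paper's code: the function names `random_oversample` and `random_undersample`, the toy data, and the choice of purely random duplication and deletion are assumptions made for illustration, and the paper may have used other variants.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw with replacement from this class to fill the gap.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class so every class has as many rows
    as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Sample without replacement, discarding the remaining rows.
        out_rows.extend(rng.sample(pool, target))
        out_labels.extend([cls] * target)
    return out_rows, out_labels


# Toy imbalanced dataset: 8 majority rows (label 0), 2 minority rows (label 1).
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)
print(Counter(over_labels))
print(Counter(under_labels))
```

In this toy case, oversampling grows the dataset to 16 rows (8 per class), while undersampling shrinks it to 4 rows (2 per class). In practice, resampling is usually applied only to the training split so that evaluation data keeps the original class distribution.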

182 citations

Journal ArticleDOI
TL;DR: This paper fills an apparent gap by detailing an algorithm for computing p-values adjusted for the resampling-based stepdown multiple testing procedures proposed in Romano and Wolf (2005a,b).

182 citations

Journal ArticleDOI
TL;DR: In this paper, the authors describe simple and reliable inference procedures based on the least-squares principle for the semiparametric accelerated failure time model with right-censored data; the proposed estimator is shown to be consistent and asymptotically normal.
Abstract: The semiparametric accelerated failure time model relates the logarithm of the failure time linearly to the covariates while leaving the error distribution unspecified. The present paper describes simple and reliable inference procedures based on the least-squares principle for this model with right-censored data. The proposed estimator of the vector-valued regression parameter is an iterative solution to the Buckley-James estimating equation with a preliminary consistent estimator as the starting value. The estimator is shown to be consistent and asymptotically normal. A novel resampling procedure is developed for the estimation of the limiting covariance matrix. Extensions to marginal models for multivariate failure time data are considered. The performance of the new inference procedures is assessed through simulation studies. Illustrations with medical studies are provided.

181 citations


Network Information
Related Topics (5)
Estimator
97.3K papers, 2.6M citations
89% related
Inference
36.8K papers, 1.3M citations
87% related
Sampling (statistics)
65.3K papers, 1.2M citations
86% related
Regression analysis
31K papers, 1.7M citations
86% related
Markov chain
51.9K papers, 1.3M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2025    1
2024    2
2023    377
2022    759
2021    275
2020    279