
Showing papers on "Resampling published in 1999"


Journal ArticleDOI
TL;DR: It is emphasized that the relevant estimation procedure depends on the sampling density, and the validity of the variance estimation is examined in a collection of data sets, obtained by systematic sampling.
Abstract: In the present paper, we summarize and further develop recent research in the estimation of the variance of stereological estimators based on systematic sampling. In particular, it is emphasized that the relevant estimation procedure depends on the sampling density. The validity of the variance estimation is examined in a collection of data sets obtained by systematic sampling. Practical recommendations are also provided in a separate section.

1,244 citations


Journal ArticleDOI
01 Jun 1999-Ecology
TL;DR: A number of considerations related to choosing methods for the meta-analysis of ecological data are outlined, including the choice of parametric vs. resampling methods, reasons for conducting weighted analyses where possible, and comparisons of fixed vs. mixed models in categorical and regression-type analyses.
Abstract: Meta-analysis is the use of statistical methods to summarize research findings across studies. Special statistical methods are usually needed for meta-analysis, both because effect-size indexes are typically highly heteroscedastic and because it is desirable to be able to distinguish between-study variance from within-study sampling-error variance. We outline a number of considerations related to choosing methods for the meta-analysis of ecological data, including the choice of parametric vs. resampling methods, reasons for conducting weighted analyses where possible, and comparisons of fixed vs. mixed models in categorical and regression-type analyses.

954 citations


Journal ArticleDOI
TL;DR: In this paper, a new false discovery rate controlling procedure is proposed for multiple hypotheses testing, which makes use of resampling-based p-value adjustment, and is designed to cope with correlated test statistics.
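For orientation, here is a minimal sketch of the general idea of combining resampling-based p-values (which respect correlation among test statistics) with an FDR step-up rule. It is not the authors' procedure; the data, function names, and the use of the Benjamini-Hochberg rule are illustrative assumptions.

```python
# A minimal sketch, not the authors' procedure: permutation-based p-values for
# correlated two-sample tests, followed by the Benjamini-Hochberg step-up rule.
# All data and names here are hypothetical and purely illustrative.
import numpy as np

rng = np.random.default_rng(0)

def permutation_pvalues(x, y, n_perm=2000):
    """Two-sided p-values for mean differences on m correlated variables.
    Permuting whole rows of the pooled data preserves the correlation
    structure among the m test statistics."""
    pooled = np.vstack([x, y])
    n_x = x.shape[0]
    obs = x.mean(axis=0) - y.mean(axis=0)
    exceed = np.zeros(x.shape[1])
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        diff = pooled[idx[:n_x]].mean(axis=0) - pooled[idx[n_x:]].mean(axis=0)
        exceed += np.abs(diff) >= np.abs(obs)
    return (exceed + 1) / (n_perm + 1)

def benjamini_hochberg(pvals, q=0.05):
    """Boolean mask of hypotheses rejected at FDR level q (step-up rule)."""
    m = len(pvals)
    order = np.argsort(pvals)
    below = pvals[order] <= q * np.arange(1, m + 1) / m
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True
    return rejected

# Toy example: 4 correlated variables, the first two with a real mean shift
cov = np.eye(4) * 0.5 + 0.5
x = rng.multivariate_normal([0.5, 0.5, 0.0, 0.0], cov, size=30)
y = rng.multivariate_normal([0.0, 0.0, 0.0, 0.0], cov, size=30)
p = permutation_pvalues(x, y)
print(p, benjamini_hochberg(p, q=0.05))
```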

681 citations


Book
16 Sep 1999
TL;DR: What is Bootstrapping?
Abstract: What is Bootstrapping? / Estimation / Confidence Sets and Hypothesis Testing / Regression Analysis / Forecasting and Time Series Analysis / Which Resampling Method Should You Use? / Efficient and Effective Simulation / Special Topics / When Does Bootstrapping Fail? / Bibliography / Indexes.

450 citations


Journal ArticleDOI
TL;DR: Several possible hypothesis test methods are evaluated: the paired t test, the nonparametric Wilcoxon signed-rank test, and two resampling tests; the results indicate that the more involved resampling test methodology is the most appropriate when testing threat scores from nonprobabilistic forecasts.
Abstract: When evaluating differences between competing precipitation forecasts, formal hypothesis testing is rarely performed. This may be due to the difficulty in applying common tests given the spatial correlation and non-normality of errors. Possible ways around these difficulties are explored here. Two datasets of precipitation forecasts are evaluated: a set of two competing gridded precipitation forecasts from operational weather prediction models, and sets of competing probabilistic quantitative precipitation forecasts from model output statistics and from an ensemble of forecasts. For each test, data from each competing forecast are collected into one sample for each case day to avoid problems with spatial correlation. Next, several possible hypothesis test methods are evaluated: the paired t test, the nonparametric Wilcoxon signed-rank test, and two resampling tests. The more involved resampling test methodology is the most appropriate when testing threat scores from nonprobabilistic forecasts. ...
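As a minimal sketch of the resampling idea used here, the snippet below runs a paired sign-flip permutation test on hypothetical per-case-day verification scores; the paper's own resampling test for threat scores is more involved, and all numbers are made up.

```python
# A minimal sketch, under assumptions, of a paired resampling (permutation) test
# on per-case-day verification scores for two competing forecasts. The data and
# the statistic (a simple score difference) are hypothetical.
import numpy as np

rng = np.random.default_rng(1)

# score_a[i], score_b[i]: verification scores of forecasts A and B on case day i
score_a = rng.normal(0.45, 0.10, size=40)
score_b = rng.normal(0.40, 0.10, size=40)

diffs = score_a - score_b
obs = diffs.mean()

# Under the null of no difference the A/B labels are exchangeable within a case
# day, so each paired difference may have its sign flipped at random.
n_resamples = 10000
signs = rng.choice([-1.0, 1.0], size=(n_resamples, diffs.size))
null = (signs * diffs).mean(axis=1)

p_value = (np.abs(null) >= abs(obs)).mean()
print(f"mean score difference = {obs:.3f}, two-sided resampling p = {p_value:.4f}")
```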

354 citations


Journal ArticleDOI
TL;DR: The problems of replication stability, model complexity, selection bias and an overoptimistic estimate of the predictive value of a model are discussed together with several proposals based on resampling methods, which favour greater simplicity of the final regression model.
Abstract: Summary. The number of variables in a regression model is often too large and a more parsimonious model may be preferred. Selection strategies (e.g. all-subset selection with various penalties for model complexity, or stepwise procedures) are widely used, but there are few analytical results about their properties. The problems of replication stability, model complexity, selection bias and an overoptimistic estimate of the predictive value of a model are discussed together with several proposals based on resampling methods. The methods are applied to data from a case-control study on atopic dermatitis and a clinical trial to compare two chemotherapy regimes by using a logistic regression and a Cox model. A recent proposal to use shrinkage factors to reduce the bias of parameter estimates caused by model building is extended to parameterwise shrinkage factors and is discussed as a further possibility to illustrate problems of models which are too complex. The results from the resampling approaches favour greater simplicity of the final regression model.

294 citations


Journal ArticleDOI
Volker Rasche1, Roland Proksa1, Ralph Sinkus1, Peter Börnert1, Holger Eggers1 
TL;DR: The authors introduce the application of convolution interpolation for resampling of data from one arbitrary grid onto another and show that the suggested approach for deriving the sampling density function is suitable even for arbitrary sampling patterns.
Abstract: For certain medical applications resampling of data is required. In magnetic resonance tomography (MRT) or computed tomography (CT), e.g., data may be sampled on nonrectilinear grids in the Fourier domain. For the image reconstruction a convolution-interpolation algorithm, often called gridding, can be applied for resampling of the data onto a rectilinear grid. Resampling of data from a rectilinear onto a nonrectilinear grid is needed, e.g., if projections of a given rectilinear data set are to be obtained. In this paper the authors introduce the application of convolution interpolation for resampling of data from one arbitrary grid onto another. The basic algorithm can be split into two steps. First, the data are resampled from the arbitrary input grid onto a rectilinear grid and, second, the rectilinear data are resampled onto the arbitrary output grid. Furthermore, the authors introduce a new technique to derive the sampling density function needed for the first step of the algorithm. For fast, sampling-pattern-independent determination of the sampling density function, the Voronoi diagram of the sample distribution is calculated. The volume of the Voronoi cell around each sample is used as a measure for the sampling density. It is shown that the introduced resampling technique allows fast resampling of data between arbitrary grids. Furthermore, it is shown that the suggested approach to derive the sampling density function is suitable even for arbitrary sampling patterns. Examples are given in which the proposed technique has been applied for the reconstruction of data acquired along spiral, radial, and arbitrary trajectories and for the fast calculation of projections of a given rectilinearly sampled image.
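The Voronoi-cell idea above can be sketched in a few lines; the snippet below is a simplified 2-D illustration under assumptions (unbounded boundary cells are simply skipped, whereas a real implementation has to treat them, e.g. by clipping), and the sampling pattern is invented.

```python
# A minimal sketch of the density-compensation idea above: the Voronoi cell area
# (2-D "volume") around each sample as a measure of local sampling density.
# Unbounded cells at the edge of the pattern are skipped here.
import numpy as np
from scipy.spatial import ConvexHull, Voronoi

def voronoi_cell_areas(points):
    vor = Voronoi(points)
    areas = np.full(len(points), np.nan)
    for i, region_index in enumerate(vor.point_region):
        region = vor.regions[region_index]
        if len(region) == 0 or -1 in region:
            continue  # unbounded boundary cell: no finite area without clipping
        areas[i] = ConvexHull(vor.vertices[region]).volume  # .volume is area in 2-D
    return areas

# Hypothetical non-uniform (roughly radial) 2-D sampling pattern
rng = np.random.default_rng(2)
r = rng.uniform(0, 1, 500) ** 2          # denser near the centre
theta = rng.uniform(0, 2 * np.pi, 500)
pts = np.column_stack([r * np.cos(theta), r * np.sin(theta)])

weights = voronoi_cell_areas(pts)        # small area = high local sampling density
print(np.nanmin(weights), np.nanmax(weights))
```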

253 citations


Journal ArticleDOI
TL;DR: The authors' adaptive resampling approach surpasses previous decision-tree performance and validates the effectiveness of small, pooled local dictionaries.
Abstract: The authors' adaptive resampling approach surpasses previous decision-tree performance and validates the effectiveness of small, pooled local dictionaries. They demonstrate their approach using the Reuters-21578 benchmark data and a real-world customer E-mail routing system.

227 citations


Book
29 Dec 1999
TL;DR: A textbook treatment of data analysis by resampling, covering sampling distributions, bootstrap estimates of standard error and bias, bootstrap-t and BCA confidence intervals, bootstrap hypothesis testing, randomization tests, and subsampling.
Abstract: PREFACE: DATA ANALYSIS BY RESAMPLING PART I: RESAMPLING CONCEPTS INTRODUCTION CONCEPTS 1: TERMS AND NOTATION Case, Attributes, Scores, and Treatments / Experimental and Observational Studies / Data Sets, Samples, and Populations / Parameters, Statistics, and Distributions / Distribution Functions APPLICATIONS 1: CASES, ATTRIBUTES, AND DISTRIBUTIONS Attributes, Scores, Groups, and Treatments / Distributions of Scores and Statistics / Exercises CONCEPTS 2: POPULATIONS AND RANDOM SAMPLES Varieties of Populations / Random Samples APPLICATIONS 2: RANDOM SAMPLING Simple Random Samples / Exercises CONCEPTS 3: STATISTICS AND SAMPLING DISTRIBUTIONS Statistics and Estimators / Accuracy of Estimation / The Sampling Distribution / Bias of an Estimator / Standard Error of a Statistic / RMS Error of an Estimator / Confidence Interval APPLICATIONS 3: SAMPLING DISTRIBUTION COMPUTATIONS Exercises CONCEPTS 4: TESTING POPULATION HYPOTHESES Population Statistical Hypotheses / Population Hypothesis Testing APPLICATIONS 4: NULL SAMPLING DISTRIBUTION P-VALUES The p-value of a Directional Test / The p-value of a Nondirectional Test / Exercises CONCEPTS 5: PARAMETRICS, PIVOTALS, AND ASYMPTOTICS The Unrealizable Sampling Distribution / Sampling Distribution of a Sample Mean / Parametric Population Distributions / Pivotal Form Statistics / Asymptotic Sampling Distributions / Limitations of the Mathematical Approach APPLICATIONS 5: CIs FOR NORMAL POPULATION MEAN AND VARIANCE CI for a Normal Population Mean / CI for a Normal Population Variance / Nonparametric CI Estimation / Exercises CONCEPTS 6: LIMITATIONS OF PARAMETRIC INFERENCE Range and Precision of Scores / Size of Population / Size of Sample / Roughness of Population Distribution / Parameters and Statistics of Interests / Scarcity of Random Samples / Resampling Inference APPLICATIONS 6: RESAMPLING APPROACHES TO INFERENCE Exercises CONCEPTS 7: THE REAL AND BOOTSTRAP WORLDS The Real World of Population Inference / The Bootstrap World of Population Inference / Real World Population Distribution Estimates / Nonparametric Population Estimates / Sample Size and Distribution Estimates APPLICATIONS 7: BOOTSTRAP POPULATION DISTRIBUTIONS Nonparametric Population Estimates / Exercises CONCEPTS 8: THE BOOTSTRAP SAMPLING DISTRIBUTION The Bootstrap Conjecture / Complete Bootstrap Sampling Distributions / Monte Carlo Bootstrap Estimate of Standard Error / The Bootstrap Estimate of Bias / Simple Bootstrap CI Estimates APPLICATIONS 8: BOOTSTRAP SE, BIAS, AND CI ESTIMATES Example / Exercises CONCEPTS 9: BETTER BOOTSTRAP CIs: THE BOOTSTRAP-T Pivotal Form Statistics / The Bootstrap-t Pivotal Transformation / Forming Bootstrap-t CIs / Estimating the Standard Error of an Estimate / Range of Applications of the Bootstrap-t / Iterated Bootstrap CIs APPLICATIONS 9: SE AND CIs FOR TRIMMED MEANS Definition of the Trimmed Mean / Importance of the Trimmed Mean / A Note on Outliers / Determining the Trimming Fraction / Sampling Distribution of the Trimmed Mean / Applications / Exercises CONCEPTS 10: BETTER BOOTSTRAP CIs: BCA INTERVALS Bias Corrected and Accelerated CI Estimates / Applications of BCA CI / Better Confidence Interval Estimates APPLICATIONS 10: USING CI CORRECTION FACTORS Requirements for a BCA CI / Implementations of the BCA Algorithm / Exercise CONCEPTS 11: BOOTSTRAP HYPOTHESIS TESTING CIs, Null Hypothesis Tests, and p-values / Bootstrap-t Hypothesis Testing / Bootstrap Hypothesis Testing Alternatives / CI Hypothesis Testing / Confidence Intervals or p-values? 
APPLICATIONS 11: BOOTSTRAP P-VALUES Computing a Bootstrap-t p-value / Fixed-alpha CIs and Hypothesis Testing / Computing a BCA CI p-Value / Exercise CONCEPTS 12: RANDOMIZED TREATMENT ASSIGNMENT Two Functions of Randomization / Randomization of Sampled Cases / Randomization of Two Available Cases / Statistical Basis for Local Causal Inference / Population Hypothesis Revisited APPLICATIONS 12: MONTE CARLO REFERENCE DISTRIBUTIONS Serum Albumin in Diabetic Mice / Resampling Stats Analysis / SC Analysis / S-Plus Analysis / Exercises CONCEPTS 13: STRATEGIES FOR RANDOMIZING CASES Independent Randomization of Cases / Completely Randomized Designs / Randomized Blocks Designs / Restricted Randomization / Constraints on Rerandomization APPLICATIONS 13: IMPLEMENTING CASE RERANDOMIZATION Completely Randomized Designs / Randomized Blocks Designs / Independent Randomization of Cases / Restricted Randomization / Exercises CONCEPTS 14: RANDOM TREATMENT SEQUENCES Between- and Within-Cases Designs / Randomizing the Sequence of Treatments / Causal Inference for Within-Cases Designs / Sequence of Randomization Strategies APPLICATIONS 14: RERANDOMIZING TREATMENT SEQUENCES Analysis of the AB-BA Design / Sequences of k > 2 Treatments / Exercises CONCEPTS 15: BETWEEN- AND WITHIN-CASE DECISIONS Between/Within Designs / Between/Within Resampling Strategies / Doubly Randomized Available Cases APPLICATIONS 15: INTERACTIONS AND SIMPLE EFFECTS Simple and Main Effects / Exercises CONCEPTS 16: SUBSAMPLES: STABILITY OF DESCRIPTION Nonrandom Studies and Data Sets / Local Descriptive Inference / Descriptive Stability and Case Homogeneity / Subsample Descriptions / Employing Subsample Descriptions / Subsamples and Randomized Studies APPLICATIONS 16: STRUCTURED & UNSTRUCTURED DATA Half-Samples of Unstructured Data / Subsamples of Source-Structured Cases / Exercises PART II: RESAMPLING APPLICATIONS INTRODUCTION APPLICATIONS 17: A SINGLE GROUP OF CASES Random Sample or Set of Available Cases / Typical Size of Score Distribution / Variability of Attribute Scores / Association Between Two Attributes / Exercises APPLICATIONS 18: TWO INDEPENDENT GROUPS OF CASES Constitution of Independent Groups / Location Comparisons for Samples / Magnitude Differences, CR and RB Designs / Magnitude Differences, Nonrandom Designs / Study Size / Exercises APPLICATIONS 19: MULTIPLE INDEPENDENT GROUPS Multiple Group Parametric Comparisons / Nonparametric K-group Comparison / Comparisons among Randomized Groups / Comparisons among Nonrandom Groups / Adjustment for Multiple Comparisons / Exercises APPLICATIONS 20: MULTIPLE FACTORS AND COVARIATES Two Treatment Factors / Treatment and Blocking Factors / Covariate Adjustment of Treatment Scores / Exercises APPLICATIONS 21: WITHIN-CASES TREATMENT COMPARISONS Normal Models, Univariate and Multivariate / Bootstrap Treatment Comparisons / Randomized Sequence of Treatments / Nonrandom Repeated Measures / Exercises APPLICATIONS 22: LINEAR MODELS: MEASURED RESPONSE The Parametric Linear Model / Nonparametric Linear Models / Prediction Accuracy / Linear Models for Randomized Cases / Linear Models for Nonrandom Studies / Exercises APPLICATIONS 23: CATEGORICAL RESPONSE ATTRIBUTES Cross-Classification of Cases / The 2 x 2 Table / Logistic Regression / Exercises POSTSCRIPT: GENERALITY, CAUSALITY & STABILITY Study Design and Resampling / Resampling Tools / REFERENCES / INDEX
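For readers who want to see the core bootstrap machinery listed in the contents above in action, here is a minimal sketch (not taken from the book) of the Monte Carlo bootstrap estimates of standard error and bias together with a simple percentile confidence interval; the data are made up.

```python
# A minimal sketch, not taken from the book, of the Monte Carlo bootstrap
# estimates of standard error and bias and a simple percentile confidence
# interval. All numbers are illustrative.
import numpy as np

rng = np.random.default_rng(3)
sample = rng.exponential(scale=2.0, size=50)        # one observed sample

def bootstrap_statistics(sample, statistic, n_boot=5000):
    n = len(sample)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        resample = rng.choice(sample, size=n, replace=True)   # bootstrap sample
        stats[b] = statistic(resample)
    return stats

boot_means = bootstrap_statistics(sample, np.mean)
se = boot_means.std(ddof=1)                      # bootstrap SE of the sample mean
bias = boot_means.mean() - sample.mean()         # bootstrap estimate of bias
ci = np.percentile(boot_means, [2.5, 97.5])      # simple percentile 95% CI
print(f"SE={se:.3f}  bias={bias:.3f}  CI=({ci[0]:.2f}, {ci[1]:.2f})")
```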

222 citations


Journal ArticleDOI
TL;DR: Empirical comparisons between model selection using VC-bounds and classical methods are performed for various noise levels, sample size, target functions and types of approximating functions, demonstrating the advantages of VC-based complexity control with finite samples.
Abstract: It is well known that for a given sample size there exists a model of optimal complexity corresponding to the smallest prediction (generalization) error. Hence, any method for learning from finite samples needs to have some provisions for complexity control. Existing implementations of complexity control include penalization (or regularization), weight decay (in neural networks), and various greedy procedures (aka constructive, growing, or pruning methods). There are numerous proposals for determining optimal model complexity (aka model selection) based on various (asymptotic) analytic estimates of the prediction risk and on resampling approaches. Nonasymptotic bounds on the prediction risk based on Vapnik-Chervonenkis (VC)-theory have been proposed by Vapnik. This paper describes application of VC-bounds to regression problems with the usual squared loss. An empirical study is performed for settings where the VC-bounds can be rigorously applied, i.e., linear models and penalized linear models where the VC-dimension can be accurately estimated, and the empirical risk can be reliably minimized. Empirical comparisons between model selection using VC-bounds and classical methods are performed for various noise levels, sample size, target functions and types of approximating functions. Our results demonstrate the advantages of VC-based complexity control with finite samples.
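For orientation, one commonly quoted form of a VC-based bound of this kind for regression with squared loss is sketched below; the constants and the exact penalization factor vary between presentations, and this is not claimed to be the paper's exact expression.

```latex
% Empirical risk inflated by a VC penalization factor; h is the (estimated)
% VC-dimension, n the sample size, and (.)_+ denotes the positive part.
% Illustrative form only; details differ between presentations.
R(\omega) \;\le\; \frac{R_{\mathrm{emp}}(\omega)}
  {\left(1 - \sqrt{\,p - p\ln p + \frac{\ln n}{2n}\,}\right)_{+}},
\qquad p = \frac{h}{n}.
```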

186 citations


Journal Article
TL;DR: It is demonstrated how to use these estimated standard errors for network statistics to compare statistics using an approximate t-test and how statistics can also be compared by another bootstrap approach that is not based on approximate normality.
Abstract: Two procedures are proposed for calculating standard errors for network statistics. Both are based on resampling of vertices: the first follows the bootstrap approach, the second the jackknife approach. In addition, we demonstrate how to use these estimated standard errors to compare statistics using an approximate t-test and how statistics can also be compared by another bootstrap approach that is not based on approximate normality.
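A minimal sketch of the vertex-resampling bootstrap idea is given below for a simple network statistic (density); the handling of duplicate vertices is a simplifying assumption and may differ from the procedures proposed in the paper, and the toy network is invented.

```python
# A minimal sketch, under assumptions, of a vertex-resampling bootstrap standard
# error for a network statistic (here, density of a directed graph). Duplicate
# vertices produced by sampling with replacement are handled naively.
import numpy as np

rng = np.random.default_rng(4)

def density(adj):
    n = adj.shape[0]
    return (adj.sum() - np.trace(adj)) / (n * (n - 1))

def vertex_bootstrap_se(adj, statistic, n_boot=2000):
    n = adj.shape[0]
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)       # resample vertices with replacement
        sub = adj[np.ix_(idx, idx)].copy()     # adjacency among resampled vertices
        np.fill_diagonal(sub, 0)               # no self-loops
        stats[b] = statistic(sub)
    return stats.std(ddof=1)

adj = (rng.random((30, 30)) < 0.2).astype(int)   # toy directed network
np.fill_diagonal(adj, 0)
print("density:", round(density(adj), 3),
      "bootstrap SE:", round(vertex_bootstrap_se(adj, density), 3))
```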

01 Jan 1999
TL;DR: In this paper, the authors compared three methods to obtain a confidence interval for size at 50% maturity, and in general for P% maturity: Fieller's analytical method, nonparametric bootstrap, and a Monte Carlo algorithm.
Abstract: Size at 50% maturity is commonly evaluated for wild populations, but the uncertainty involved in such computation has been frequently overlooked in the application to marine fisheries. Here we evaluate three procedures to obtain a confidence interval for size at 50% maturity, and in general for P% maturity: Fieller's analytical method, nonparametric bootstrap, and a Monte Carlo algorithm. The three methods are compared in estimating size at 50% maturity (l50%) by using simulated data from an age-structured population, with von Bertalanffy growth and constant natural mortality, for sample sizes of 500 to 10,000 individuals. Performance was assessed by using four criteria: 1) the proportion of times that the confidence interval did contain the true and known size at 50% maturity, 2) bias in estimating l50%, 3) length and 4) shape of the confidence interval around l50%. Judging from criteria 2-4, the three methods performed equally well, but in criterion 1, the Monte Carlo method outperformed the bootstrap and Fieller methods with a frequency remaining very close to the nominal 95% at all sample sizes. The Monte Carlo method was also robust to variations in natural mortality rate (M), although with lengthier and more asymmetric confidence intervals as M increased. This method was applied to two sets of real data. First, we used data from the squat lobster Pleuroncodes monodon with several levels of proportion mature, so that a confidence interval for the whole maturity curve could be outlined. Second, we compared two samples of the anchovy Engraulis ringens from different localities in central Chile to test the hypothesis that they differed in size at 50% maturity and concluded that they were not statistically different. Statistical uncertainty of the model-based l50% is ignored (Table 1). In this work, we show three alternative procedures: an analytical method derived from generalized linear models (McCullagh and Nelder, 1989), nonparametric bootstrap (Efron and Tibshirani, 1993), and a Monte Carlo algorithm developed in our study. We show by simulation the behavior of the three methods for sample sizes of 500 to 10,000 individuals, concluding that they are similar in terms of bias, length, and shape of confidence intervals but that the Monte Carlo method outperforms the other two methods in percentage of times that the confidence interval contains the true parameter, which remained close to the nominal 95% at all sample sizes.
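The nonparametric bootstrap option described above can be sketched as follows; the data are simulated, scikit-learn is used in place of whatever software the authors employed, and the sample sizes and parameters are invented.

```python
# A minimal sketch, with simulated data, of the nonparametric bootstrap option
# for a confidence interval on size at 50% maturity: refit a logistic maturity
# curve to resampled individuals and recompute l50 = -intercept/slope each time.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)

# Simulated individuals: body length (cm) and maturity status, true l50 = 30 cm
n = 1000
length = rng.uniform(15, 45, n)
mature = rng.binomial(1, 1.0 / (1.0 + np.exp(-(length - 30.0) / 2.0)))

def l50(lengths, status):
    # Large C approximates unpenalised maximum-likelihood logistic regression
    fit = LogisticRegression(C=1e6).fit(lengths.reshape(-1, 1), status)
    return -fit.intercept_[0] / fit.coef_[0, 0]

boot = np.empty(1000)
for b in range(boot.size):
    idx = rng.integers(0, n, n)                 # resample individuals
    boot[b] = l50(length[idx], mature[idx])

print("l50 estimate:", round(l50(length, mature), 2),
      "  95% percentile CI:", np.percentile(boot, [2.5, 97.5]).round(2))
```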

Journal Article
TL;DR: An algorithm called EMMIX is described that automatically undertakes the fitting of normal or t-component mixture models to multivariate data, using maximum likelihood via the EM algorithm, including the provision of suitable initial values if not supplied by the user.
Abstract: We consider the fitting of normal or t-component mixture models to multivariate data, using maximum likelihood via the EM algorithm. This approach requires the specification of an initial estimate of the vector of unknown parameters, or equivalently, of an initial classification of the data with respect to the components of the mixture model under fit. We describe an algorithm called EMMIX that automatically undertakes this fitting, including the provision of suitable initial values if not supplied by the user. The EMMIX algorithm has several options, including the option to carry out a resampling-based test for the number of components in the mixture model.
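A minimal sketch of the two ideas in the abstract is given below, using scikit-learn rather than EMMIX itself; the data, the number of bootstrap replicates, and the parametric-bootstrap likelihood-ratio test shown here are illustrative assumptions, and EMMIX's own options and defaults differ.

```python
# A minimal sketch: fit a normal mixture by EM and run a resampling-based
# (parametric bootstrap) likelihood-ratio test for g versus g+1 components.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(6)
data = np.vstack([rng.normal(0, 1, (150, 2)), rng.normal(3, 1, (150, 2))])

def total_loglik(model, x):
    return model.score(x) * len(x)              # score() is the mean log-likelihood

def bootstrap_lrt(x, g, n_boot=50):
    """Parametric bootstrap LRT of H0: g components vs H1: g + 1 components."""
    m0 = GaussianMixture(n_components=g, n_init=5, random_state=0).fit(x)
    m1 = GaussianMixture(n_components=g + 1, n_init=5, random_state=0).fit(x)
    observed = 2 * (total_loglik(m1, x) - total_loglik(m0, x))
    null = np.empty(n_boot)
    for b in range(n_boot):
        xb, _ = m0.sample(len(x))               # simulate a data set under H0
        b0 = GaussianMixture(n_components=g, n_init=5, random_state=0).fit(xb)
        b1 = GaussianMixture(n_components=g + 1, n_init=5, random_state=0).fit(xb)
        null[b] = 2 * (total_loglik(b1, xb) - total_loglik(b0, xb))
    return observed, (null >= observed).mean()

stat, p = bootstrap_lrt(data, g=1)
print(f"bootstrap LRT, 1 vs 2 components: statistic = {stat:.1f}, p = {p:.3f}")
```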

Journal ArticleDOI
TL;DR: Permutation and ad hoc methods are proposed for testing with the random effects model; the permutation method theoretically controls the type I error rate for typical meta-analysis scenarios.
Abstract: The standard approach to inference for random effects meta-analysis relies on approximating the null distribution of a test statistic by a standard normal distribution. This approximation is asymptotic in k, the number of studies, and can be substantially in error in medical meta-analyses, which often have only a few studies. This paper proposes permutation and ad hoc methods for testing with the random effects model. Under the group permutation method, we randomly switch the treatment and control group labels in each trial. This idea is similar to using a permutation distribution for a community intervention trial where communities are randomized in pairs. The permutation method theoretically controls the type I error rate for typical meta-analysis scenarios. We also suggest two ad hoc procedures. Our first suggestion is to use a t-reference distribution with k-1 degrees of freedom rather than a standard normal distribution for the usual random effects test statistic. We also investigate the use of a simple t-statistic on the reported treatment effects.
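Because switching the treatment and control labels within a trial simply flips the sign of that trial's effect estimate, the permutation null can be sketched with random sign flips; the statistic below is a standard DerSimonian-Laird random-effects z, the paper's exact statistic may differ, and the effect estimates are made up.

```python
# A minimal sketch of the group-label permutation idea via random sign flips of
# per-trial effect estimates. Illustrative data; not the authors' code.
import numpy as np

rng = np.random.default_rng(7)

effects = np.array([0.32, 0.15, 0.48, -0.05, 0.27])    # per-trial effect estimates
variances = np.array([0.04, 0.03, 0.09, 0.05, 0.02])   # their within-trial variances

def random_effects_z(effects, variances):
    """Standard DerSimonian-Laird random-effects z statistic for the pooled effect."""
    w = 1.0 / variances
    mu_fixed = np.sum(w * effects) / np.sum(w)
    q = np.sum(w * (effects - mu_fixed) ** 2)
    k = len(effects)
    tau2 = max(0.0, (q - (k - 1)) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))
    w_star = 1.0 / (variances + tau2)
    mu = np.sum(w_star * effects) / np.sum(w_star)
    return mu * np.sqrt(np.sum(w_star))

obs = random_effects_z(effects, variances)
# With k trials there are only 2**k label switches; random flips suffice here.
signs = rng.choice([-1.0, 1.0], size=(10000, len(effects)))
null = np.array([random_effects_z(s * effects, variances) for s in signs])
p = (np.abs(null) >= abs(obs)).mean()
print(f"observed z = {obs:.2f}, permutation p-value = {p:.3f}")
```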

Journal ArticleDOI
TL;DR: In this article, approximate parametric bootstrap confidence intervals for functions of multinomial proportions are discussed, which are obtained via an Edgeworth expansion approximation for the rectangular multi-parameter probabilities rather than the resampling approach.

Journal Article
TL;DR: In this paper, the authors evaluated three aggregation methods - averaging, central-pixel resampling, and median - using simulated images and found that the median method produces results almost identical to averaging because of the similarities between the averaged and median values of the simulated data.
Abstract: Spatial data aggregation is widely practiced for "scaling-up" environmental analyses and modeling from local to regional or global scales. Despite acknowledgments of the general effects of aggregation, there is a lack of systematic comparison between aggregation methods. The study evaluated three methods - averaging, central-pixel resampling, and median - using simulated images. Both the averaging and median methods can retain the mean and median values, respectively, but alter significantly the standard deviation. The central-pixel method alters both statistics. The statistical changes can be modified by the presence of spatial autocorrelation for all three methods. Spatially, the averaging method can reveal underlying spatial patterns at scales within the spatial autocorrelation ranges. The median method produces almost identical results because of the similarities between the averaged and median values of the simulated data. To a limited extent, the central-pixel method retains contrast and spatial patterns of the original images. At scales coarser than the autocorrelation range, the averaged and median images become homogeneous and do not differ significantly between these scales. The central-pixel method can induce severe spatially biased errors at coarse scales. Understanding these trends can help select appropriate aggregation methods and aggregation levels for particular applications.
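The three aggregation methods compared above can be sketched on a toy image as follows; the image, the aggregation factor, and the printed summary are illustrative only.

```python
# A minimal sketch of block averaging, block median, and central-pixel
# resampling with an aggregation factor of 4 on a toy image.
import numpy as np

rng = np.random.default_rng(8)
img = rng.normal(100, 20, size=(256, 256))     # toy "reflectance" image
f = 4                                          # aggregation factor

blocks = img.reshape(img.shape[0] // f, f, img.shape[1] // f, f)

averaged = blocks.mean(axis=(1, 3))            # block averaging
median = np.median(blocks, axis=(1, 3))        # block median
central = img[f // 2::f, f // 2::f]            # central-pixel resampling

for name, agg in (("average", averaged), ("median", median), ("central", central)):
    print(f"{name:8s} mean = {agg.mean():7.2f}   std = {agg.std():6.2f}")
# Averaging and median keep the mean/median but shrink the standard deviation;
# the central-pixel method keeps the contrast but alters both statistics.
```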

Patent
14 Apr 1999
TL;DR: In this article, the authors proposed a method of resampling medical imaging data from a first spatial distribution of data points onto a second spatial distribution, including determining a matrix of reverse interpolation coefficients.
Abstract: A method of resampling medical imaging data from a first spatial distribution of data points onto a second spatial distribution of data points, including determining a matrix of reverse interpolation coefficients for resampling data from said second spatial distribution onto said first spatial distribution, inverting a matrix based on said reverse interpolation matrix to determine forward resampling coefficients for resampling data from said first spatial distribution to said second spatial distribution, and resampling data from said first spatial distribution onto said second spatial distribution using said forward resampling coefficients.
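A 1-D sketch of the idea in the claim is given below: build the matrix of "reverse" linear-interpolation coefficients (second grid to first grid), then pseudo-invert it to obtain forward resampling coefficients. The grids, the interpolation kernel, and the test function are hypothetical; the patent covers the general medical-imaging setting.

```python
# A minimal 1-D sketch: reverse interpolation matrix, then its (pseudo-)inverse
# as the forward resampling operator.
import numpy as np

def interp_matrix(src, dst):
    """Matrix A such that A @ f(src) approximates f(dst) by linear interpolation."""
    A = np.zeros((len(dst), len(src)))
    for i, x in enumerate(dst):
        j = np.clip(np.searchsorted(src, x) - 1, 0, len(src) - 2)
        t = (x - src[j]) / (src[j + 1] - src[j])
        A[i, j], A[i, j + 1] = 1 - t, t
    return A

grid1 = np.linspace(0.0, 1.0, 32)                            # first (source) grid
grid2 = np.sort(np.random.default_rng(9).uniform(0, 1, 20))  # second (target) grid

R = interp_matrix(grid2, grid1)     # reverse coefficients: grid2 data -> grid1
F = np.linalg.pinv(R)               # forward coefficients:  grid1 data -> grid2

f1 = np.sin(2 * np.pi * grid1)      # data sampled on the first grid
f2 = F @ f1                         # resampled onto the second grid
print("max abs error:", np.max(np.abs(f2 - np.sin(2 * np.pi * grid2))))
```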

Journal ArticleDOI
TL;DR: A simple sampling taxonomy is defined that shows the differences between and relationships among the bootstrap, the jackknife, and the randomization test, and is useful for teaching the goals and purposes of resampling schemes.
Abstract: A simple sampling taxonomy is defined that shows the differences between and relationships among the bootstrap, the jackknife, and the randomization test. Each method has as its goal the creation of an empirical sampling distribution that can be used to test statistical hypotheses, estimate standard errors, and/or create confidence intervals. Distinctions between the methods can be made based on the sampling approach (with replacement versus without replacement) and the sample size (replacing the whole original sample versus replacing a subset of the original sample). The taxonomy is useful for teaching the goals and purposes of resampling schemes. An extension of the taxonomy implies other possible resampling approaches that have not previously been considered. Univariate and multivariate examples are presented.
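The distinctions the taxonomy draws (with vs. without replacement; full-size resample vs. subset) can be sketched on a single univariate example; the data below are invented and the statistics are deliberately simple.

```python
# A minimal sketch of the three schemes the taxonomy distinguishes: bootstrap
# (with replacement, full sample size), jackknife (without replacement, subsets
# of size n-1), and a randomization test (permuting group labels).
import numpy as np

rng = np.random.default_rng(10)
x = rng.normal(10, 2, 20)
y = rng.normal(11, 2, 20)

# Bootstrap: resample n cases WITH replacement to get a sampling distribution
boot = np.array([rng.choice(x, x.size, replace=True).mean() for _ in range(5000)])
print("bootstrap SE of mean(x):", round(boot.std(ddof=1), 3))

# Jackknife: leave one case out at a time (WITHOUT replacement, subsets of n-1)
jack = np.array([np.delete(x, i).mean() for i in range(x.size)])
jack_se = np.sqrt((x.size - 1) / x.size * np.sum((jack - jack.mean()) ** 2))
print("jackknife SE of mean(x):", round(jack_se, 3))

# Randomization test: permute group labels to test mean(x) == mean(y)
pooled, nx = np.concatenate([x, y]), x.size
obs = x.mean() - y.mean()
null = np.empty(5000)
for b in range(null.size):
    perm = rng.permutation(pooled)
    null[b] = perm[:nx].mean() - perm[nx:].mean()
print("randomization test p-value:", round(float((np.abs(null) >= abs(obs)).mean()), 4))
```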

Journal ArticleDOI
01 Dec 1999-Ecology
TL;DR: A method is proposed to test statistically for fuzziness of the partitions in cluster analysis of sampling units; it can be used with a wide range of data types and clustering methods and can run very fast on microcomputers.
Abstract: Ecologists often use cluster analysis as a tool in the classification and mapping of entities such as communities or landscapes. The problem is that the researcher has to choose an adequate group partition level. In addition, cluster analysis techniques will always reveal groups, even if the data set does not have a clear group structure. This paper offers a method to test statistically for fuzziness of the partitions in cluster analysis of sampling units that can be used with a wide range of data types and clustering methods. The method applies bootstrap resampling. In this, partitions found in bootstrap samples are compared to the observed partition by the similarity of the sampling units that form the groups. The method tests the null hypothesis that the clusters in the bootstrap samples are random samples of their most similar corresponding clusters mapped one-to-one into the observed data. The resulting probability indicates whether the groups in the partition are sharp enough to reappear consistently in resampling. Examples with artificial and vegetational field data show that the test gives consistent and useful results. Though the method is computationally demanding, its implementation in a C++ program can run very fast on microcomputers.

Journal ArticleDOI
TL;DR: It is shown how a generalization of the usual regression discontinuity design can be applied in a wider range of situations, focusing on the use of categorical treatment and response variables but also considering the more general case of any regression relationship.
Abstract: Published studies using the regression discontinuity design have been limited to cases in which linear regression is applied to a categorical treatment indicator and an equal interval outcome. This is unnecessarily narrow. We show here how a generalization of the usual regression discontinuity design can be applied in a wider range of situations. We focus on the use of categorical treatment and response variables, but we also consider the more general case of any regression relationship. We also show how a resampling sensitivity analysis may be used to address the credibility of the assumed assignment process. The broader formulation is applied to an evaluation of California's inmate classification system, which is used to allocate prisoners to different kinds of confinement.

Book
01 Jan 1999
TL;DR: A general approach to constructing confidence intervals by subsampling was presented in Politis and Romano (1994); the crux of the method is recomputing a statistic over subsamples of the data, and these recomputed values are used to build up an estimated sampling distribution.
Abstract: A general approach to constructing confidence intervals by subsampling was presented in Politis and Romano (1994). The crux of the method is recomputing a statistic over subsamples of the data, and these recomputed values are used to build up an estimated sampling distribution. The method works under extremely weak conditions; it applies to independent, identically distributed (i.i.d.) observations as well as to dependent data situations, such as time series (possibly nonstationary), random fields, and marked point processes. In this article, we present some theorems showing: a new construction for confidence intervals that removes a previous condition, a general theorem showing the validity of subsampling for data-dependent choices of the block size, and a general theorem for the construction of hypothesis tests (not necessarily derived from a confidence interval construction). The arguments apply to both the i.i.d. setting and the dependent data case.
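For a concrete picture of the recipe, here is a minimal i.i.d. sketch of a subsampling confidence interval for a mean; the data, the block size b, and the use of the root sqrt(n)*(mean - theta) are illustrative assumptions, and the article's results cover far more general settings (dependent data, data-driven block sizes, tests).

```python
# A minimal sketch of a subsampling confidence interval for an i.i.d. mean:
# recompute the statistic on subsamples of size b << n drawn without
# replacement, rescale, and invert the approximated distribution.
import numpy as np

rng = np.random.default_rng(11)
x = rng.exponential(2.0, size=400)
n, b = x.size, 40                       # subsample size b much smaller than n
theta_hat = x.mean()

roots = np.empty(2000)
for i in range(roots.size):
    sub = rng.choice(x, size=b, replace=False)     # subsample WITHOUT replacement
    roots[i] = np.sqrt(b) * (sub.mean() - theta_hat)

lo, hi = np.percentile(roots, [2.5, 97.5])
# Treat the subsampling quantiles as those of sqrt(n) * (mean - theta):
ci = (theta_hat - hi / np.sqrt(n), theta_hat - lo / np.sqrt(n))
print("95% subsampling CI for the mean:", np.round(ci, 3))
```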

Journal ArticleDOI
TL;DR: The method is probabilistic, based on bootstrap resampling, and the results suggest it is more reliable than other available methods in recovering the true intrinsic dimensionality in metric ordination of a sample.
Abstract: A method is described to determine the number of significant dimensions in metric ordination of a sample. The method is probabilistic, based on bootstrap resampling. An iterative algorithm takes bootstrap samples with replacement from the sample. It finds in each bootstrap sample ordination coordinates and computes, after Procrustean adjustments, the correlation between observed and bootstrap ordination scores. It compares this correlation to the same parameter generated in a parallel bootstrapped ordination of randomly permuted data, which upon many iterations will generate a probability. The method is assessed in principal coordinates analysis of simulated data sets that have varying numbers of variables and correlation levels, uniform or patterned correlation structure. The results suggest the method is more reliable than other available methods in recovering the true intrinsic dimensionality. Examples with grassland data illustrate utility.

Journal ArticleDOI
TL;DR: In this paper, the influence of sampling interval on the accuracy of estimates for selected trail impact problems was examined using a resampling simulation method; a complete census of four impact types on 70 backcountry trails in the Great Smoky Mountains National Park was utilized as the base dataset for the analyses.

Journal ArticleDOI
TL;DR: In this article, a bootstrap procedure for the periodogram of a weakly dependent stationary sequence is proposed, which does not require estimation of the spectral density or of frequency domain residuals obtained by means of initial smoothing.
Abstract: A bootstrap procedure for the periodogram of a weakly dependent stationary sequence is proposed. The method works by locally resampling the periodogram ordinates and does not require estimation of the spectral density or of frequency domain residuals obtained by means of initial smoothing. Asymptotic properties of the proposed bootstrap procedure are studied and consistency is proved for interesting classes of statistics including ratio statistics, kernel estimates of the spectral density and parameter estimates. Some practical aspects concerning the implementation of the method are also discussed.

Journal ArticleDOI
TL;DR: Several methods are found to be superior to the currently used quadratic discriminant method for biochemical-based DS risk prediction, based on data from a prospective multicentre prenatal screening study.
Abstract: Currently, prenatal screening for Down Syndrome (DS) uses the mother's age as well as three biochemical markers for risk prediction. Risk calculations for the biochemical markers use a quadratic discriminant function. In this paper we compare several classification procedures to quadratic discrimination methods for biochemical-based DS risk prediction, based on data from a prospective multicentre prenatal screening study. We investigate alternative methods including linear discriminant methods, logistic regression methods, neural network methods, and classification and regression-tree methods. Several experiments are performed, and in each experiment resampling methods are used to create training and testing data sets. The procedures on the test data set are summarized by the area under their receiver operating characteristic curves. In each experiment this process is repeated 500 times and then the classification procedures are compared. We find that several methods are superior to the currently used quadratic discriminant method for risk estimation for these data. The implications of these results for prenatal screening programs are discussed.

Book
25 Oct 1999
TL;DR: The generation of "random" numbers / random quadrature / Monte Carlo solutions of differential equations / Markov chains, Poisson processes and linear equations / SIMEST, SIMDAT, and pseudoreality / models for stocks and derivatives / simulation assessment of multivariate and robust procedures in statistical process control / noise and chaos / Bayesian approaches / resampling based tests / optimisation and estimation in a noisy world / modeling the USA AIDS epidemic.
Abstract: The generation of "random" numbers / random quadrature / Monte Carlo solutions of differential equations / Markov chains, Poisson processes and linear equations / SIMEST, SIMDAT, and pseudoreality / models for stocks and derivatives / simulation assessment of multivariate and robust procedures in statistical process control / noise and chaos / Bayesian approaches / resampling based tests / optimisation and estimation in a noisy world / modeling the USA AIDS epidemic - exploration, simulation and conjecture.

Journal ArticleDOI
TL;DR: In this article, a general method for density estimation under constraints is proposed, in which resampling weights are chosen so as to minimize distance from the empirical or uniform bootstrap distribution subject to the constraints being satisfied.
Abstract: We suggest a general method for tackling problems of density estimation under constraints. It is, in effect, a particular form of the weighted bootstrap, in which resampling weights are chosen so as to minimize distance from the empirical or uniform bootstrap distribution subject to the constraints being satisfied. A number of constraints are treated as examples. They include conditions on moments, quantiles, and entropy, the latter as a device for imposing qualitative conditions such as those of unimodality or "interestingness." For example, without altering the data or the amount of smoothing, we may construct a density estimator that enjoys the same mean, median, and quartiles as the data. Different measures of distance give rise to slightly different results.
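A minimal sketch of the weighted-bootstrap idea is shown below for a single mean constraint: exponential tilting (which minimises a Kullback-Leibler distance) pulls the weights as close to uniform as the constraint allows, and the weights are then used in a weighted kernel density estimate. The data, the target mean, and the root-search bracket are illustrative assumptions; the paper treats more general distances and constraints (quantiles, entropy, unimodality).

```python
# A minimal sketch: resampling weights close to uniform subject to a fixed mean,
# then a weighted kernel density estimate with the same data and smoothing.
import numpy as np
from scipy.optimize import brentq
from scipy.stats import gaussian_kde

rng = np.random.default_rng(12)
x = rng.gamma(2.0, 1.5, size=200)
target_mean = 3.5                        # constraint: weighted mean must equal this

def tilted_weights(x, target):
    """Weights proportional to exp(lam * x_i), with lam chosen so the
    weighted mean equals the target (minimum-KL / maximum-entropy tilt)."""
    def gap(lam):
        w = np.exp(lam * (x - x.mean()))
        w /= w.sum()
        return np.sum(w * x) - target
    lam = brentq(gap, -5.0, 5.0)
    w = np.exp(lam * (x - x.mean()))
    return w / w.sum()

w = tilted_weights(x, target_mean)
print("weighted mean:", round(float(np.sum(w * x)), 3))   # equals the target

# Same data and smoothing, different (constrained) resampling weights in the KDE
kde = gaussian_kde(x, weights=w)
grid = np.linspace(0.0, x.max(), 200)
density = kde(grid)
print("estimated density at the target mean:", round(float(kde(target_mean)[0]), 4))
```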

01 Jan 1999
TL;DR: It is concluded that, while resampling methods may be useful in some problems, there is little evidence of their usefulness as general purpose techniques for the analysis of complex surveys.
Abstract: Application of resampling methods in sample survey settings presents considerable practical and conceptual difficulties. Various potential solutions have recently been proffered in the statistical literature. This paper provides a brief critical review of these methods. Our main conclusion is that, while resampling methods may be useful in some problems, there is little evidence of their usefulness as general purpose techniques for the analysis of complex surveys.

Journal ArticleDOI
TL;DR: A SAS macro that implements simple nonparametric bootstrap statistical inference is presented with an example; it is easily generalized to any SAS procedure that includes a BY statement and to cases of clustered data.