
Showing papers on "Resampling" published in 1995


Book
21 Jul 1995
TL;DR: This book develops the theory of the jackknife and the bootstrap, from variance estimation for functions of means and statistical functionals to extensions for complex problems such as sample surveys, linear and nonlinear models, and dependent data.
Abstract (table of contents):
1. Introduction: 1.1 Statistics and Their Sampling Distributions. 1.2 The Traditional Approach. 1.3 The Jackknife. 1.4 The Bootstrap. 1.5 Extensions to Complex Problems. 1.6 Scope of Our Studies.
2. Theory for the Jackknife: 2.1 Variance Estimation for Functions of Means (2.1.1 Consistency. 2.1.2 Other properties. 2.1.3 Discussions and examples). 2.2 Variance Estimation for Functionals (2.2.1 Differentiability and consistency. 2.2.2 Examples. 2.2.3 Convergence rate. 2.2.4 Other differential approaches). 2.3 The Delete-d Jackknife (2.3.1 Variance estimation. 2.3.2 Jackknife histograms). 2.4 Other Applications (2.4.1 Bias estimation. 2.4.2 Bias reduction. 2.4.3 Miscellaneous results). 2.5 Conclusions and Discussions.
3. Theory for the Bootstrap: 3.1 Techniques in Proving Consistency (3.1.1 Bootstrap distribution estimators. 3.1.2 Mallows' distance. 3.1.3 Berry-Esseen's inequality. 3.1.4 Imitation. 3.1.5 Linearization. 3.1.6 Convergence in moments). 3.2 Consistency: Some Major Results (3.2.1 Distribution estimators. 3.2.2 Variance estimators). 3.3 Accuracy and Asymptotic Comparisons (3.3.1 Convergence rate. 3.3.2 Asymptotic minimaxity. 3.3.3 Asymptotic mean squared error. 3.3.4 Asymptotic relative error. 3.3.5 Conclusions). 3.4 Fixed Sample Performance (3.4.1 Moment estimators. 3.4.2 Distribution estimators. 3.4.3 Conclusions). 3.5 Smoothed Bootstrap (3.5.1 Empirical evidences and examples. 3.5.2 Sample quantiles. 3.5.3 Remarks). 3.6 Nonregular Cases. 3.7 Conclusions and Discussions.
4. Bootstrap Confidence Sets and Hypothesis Tests: 4.1 Bootstrap Confidence Sets (4.1.1 The bootstrap-t. 4.1.2 The bootstrap percentile. 4.1.3 The bootstrap bias-corrected percentile. 4.1.4 The bootstrap accelerated bias-corrected percentile. 4.1.5 The hybrid bootstrap). 4.2 Asymptotic Theory (4.2.1 Consistency. 4.2.2 Accuracy. 4.2.3 Other asymptotic comparisons). 4.3 The Iterative Bootstrap and Other Methods (4.3.1 The iterative bootstrap. 4.3.2 Bootstrap calibrating. 4.3.3 The automatic percentile and variance stabilizing. 4.3.4 Fixed width bootstrap confidence intervals. 4.3.5 Likelihood based bootstrap confidence sets). 4.4 Empirical Comparisons (4.4.1 The bootstrap-t, percentile, BC, and BCa. 4.4.2 The bootstrap and other asymptotic methods. 4.4.3 The iterative bootstrap and bootstrap calibration. 4.4.4 Summary). 4.5 Bootstrap Hypothesis Tests (4.5.1 General description. 4.5.2 Two-sided hypotheses with nuisance parameters. 4.5.3 Bootstrap distance tests. 4.5.4 Other results and discussions). 4.6 Conclusions and Discussions.
5. Computational Methods: 5.1 The Delete-1 Jackknife (5.1.1 The one-step jackknife. 5.1.2 Grouping and random subsampling). 5.2 The Delete-d Jackknife (5.2.1 Balanced subsampling. 5.2.2 Random subsampling). 5.3 Analytic Approaches for the Bootstrap (5.3.1 The delta method. 5.3.2 Jackknife approximations. 5.3.3 Saddle point approximations. 5.3.4 Remarks). 5.4 Simulation Approaches for the Bootstrap (5.4.1 The simple Monte Carlo method. 5.4.2 Balanced bootstrap resampling. 5.4.3 Centering after Monte Carlo. 5.4.4 The linear bootstrap. 5.4.5 Antithetic bootstrap resampling. 5.4.6 Importance bootstrap resampling. 5.4.7 The one-step bootstrap). 5.5 Conclusions and Discussions.
6. Applications to Sample Surveys: 6.1 Sampling Designs and Estimates. 6.2 Resampling Methods (6.2.1 The jackknife. 6.2.2 The balanced repeated replication. 6.2.3 Approximated BRR methods. 6.2.4 The bootstrap). 6.3 Comparisons by Simulation. 6.4 Asymptotic Results (6.4.1 Assumptions. 6.4.2 The jackknife and BRR for functions of averages. 6.4.3 The RGBRR and RSBRR for functions of averages. 6.4.4 The bootstrap for functions of averages. 6.4.5 The BRR and bootstrap for sample quantiles). 6.5 Resampling Under Imputation (6.5.1 Hot deck imputation. 6.5.2 An adjusted jackknife. 6.5.3 Multiple bootstrap hot deck imputation. 6.5.4 Bootstrapping under imputation). 6.6 Conclusions and Discussions.
7. Applications to Linear Models: 7.1 Linear Models and Regression Estimates. 7.2 Variance and Bias Estimation (7.2.1 Weighted and unweighted jackknives. 7.2.2 Three types of bootstraps. 7.2.3 Robustness and efficiency). 7.3 Inference and Prediction Using the Bootstrap (7.3.1 Confidence sets. 7.3.2 Simultaneous confidence intervals. 7.3.3 Hypothesis tests. 7.3.4 Prediction). 7.4 Model Selection (7.4.1 Cross-validation. 7.4.2 The bootstrap). 7.5 Asymptotic Theory (7.5.1 Variance estimators. 7.5.2 Bias estimators. 7.5.3 Bootstrap distribution estimators. 7.5.4 Inference and prediction. 7.5.5 Model selection). 7.6 Conclusions and Discussions.
8. Applications to Nonlinear, Nonparametric, and Multivariate Models: 8.1 Nonlinear Regression (8.1.1 Jackknife variance estimators. 8.1.2 Bootstrap distributions and confidence sets. 8.1.3 Cross-validation for model selection). 8.2 Generalized Linear Models (8.2.1 Jackknife variance estimators. 8.2.2 Bootstrap procedures. 8.2.3 Model selection by bootstrapping). 8.3 Cox's Regression Models (8.3.1 Jackknife variance estimators. 8.3.2 Bootstrap procedures). 8.4 Kernel Density Estimation (8.4.1 Bandwidth selection by cross-validation. 8.4.2 Bandwidth selection by bootstrapping. 8.4.3 Bootstrap confidence sets). 8.5 Nonparametric Regression (8.5.1 Kernel estimates for fixed design. 8.5.2 Kernel estimates for random regressor. 8.5.3 Nearest neighbor estimates. 8.5.4 Smoothing splines). 8.6 Multivariate Analysis (8.6.1 Analysis of covariance matrix. 8.6.2 Multivariate linear models. 8.6.3 Discriminant analysis. 8.6.4 Factor analysis and clustering). 8.7 Conclusions and Discussions.
9. Applications to Time Series and Other Dependent Data: 9.1 m-Dependent Data. 9.2 Markov Chains. 9.3 Autoregressive Time Series (9.3.1 Bootstrapping residuals. 9.3.2 Model selection). 9.4 Other Time Series (9.4.1 ARMA(p,q) models. 9.4.2 Linear regression with time series errors. 9.4.3 Dynamical linear regression). 9.5 Stationary Processes (9.5.1 Moving block and circular block. 9.5.2 Consistency of the bootstrap. 9.5.3 Accuracy of the bootstrap. 9.5.4 Remarks). 9.6 Conclusions and Discussions.
10. Bayesian Bootstrap and Random Weighting: 10.1 Bayesian Bootstrap (10.1.1 Bayesian bootstrap with a noninformative prior. 10.1.2 Bayesian bootstrap using prior information. 10.1.3 The weighted likelihood bootstrap. 10.1.4 Some remarks). 10.2 Random Weighting (10.2.1 Motivation. 10.2.2 Consistency. 10.2.3 Asymptotic accuracy). 10.3 Random Weighting for Functional and Linear Models (10.3.1 Statistical functionals. 10.3.2 Linear models). 10.4 Empirical Results for Random Weighting. 10.5 Conclusions and Discussions.
Appendix A. Asymptotic Results: A.1 Modes of Convergence. A.2 Convergence of Transformations. A.4 The Borel-Cantelli Lemma. A.5 The Law of Large Numbers. A.6 The Law of the Iterated Logarithm. A.7 Uniform Integrability. A.8 The Central Limit Theorem. A.9 The Berry-Esseen Theorem. A.10 Edgeworth Expansions. A.11 Cornish-Fisher Expansions.
Appendix B. Notation. References. Author Index.

1,660 citations


Journal Article
TL;DR: An expectation maximization (EM) algorithm to obtain allele frequencies, haplotype frequencies, and gametic disequilibrium coefficients for multiple-locus systems is given and a data resampling approach to estimate test statistic sampling distributions is suggested.
Abstract: This paper gives an expectation maximization (EM) algorithm to obtain allele frequencies, haplotype frequencies, and gametic disequilibrium coefficients for multiple-locus systems. It permits high polymorphism and null alleles at all loci. This approach effectively deals with the primary estimation problems associated with such systems; that is, there is not a one-to-one correspondence between phenotypic and genotypic categories, and sample sizes tend to be much smaller than the number of phenotypic categories. The EM method provides maximum-likelihood estimates and therefore allows hypothesis tests using likelihood-ratio statistics that have chi-square distributions in large samples. We also suggest a data resampling approach to estimate test statistic sampling distributions. The resampling approach is more computer intensive, but it is applicable to all sample sizes. A strategy to test hypotheses about aggregate groups of gametic disequilibrium coefficients is recommended. This strategy minimizes the number of necessary hypothesis tests while at the same time describing the structure of disequilibrium. These methods are applied to three unlinked dinucleotide repeat loci in Navajo Indians and to three linked HLA loci in Gila River (Pima) Indians. The likelihood functions of both data sets are shown to be maximized by the EM estimates, and the testing strategy provides a useful description of the structure of gametic disequilibrium. Following these applications, a number of simulation experiments are performed to test how well the likelihood-ratio statistic distributions are approximated by chi-square distributions. In most circumstances the chi-square approximation grossly underestimated the probability of type I errors, although at times it overestimated the type I error probability. Accordingly, we recommend hypothesis tests that use the resampling method.
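
The resampling idea in the abstract can be illustrated with a stripped-down sketch. This is not the authors' EM-based procedure: it assumes fully observed genotype codes at two loci (no haplotype ambiguity) and simply permutes one locus across individuals to generate the null distribution of a likelihood-ratio (G) statistic for inter-locus independence. The function names and toy data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def lr_statistic(locus_a, locus_b):
    """Likelihood-ratio (G) statistic for independence of two loci,
    computed from the two-way table of genotype counts."""
    table = np.zeros((locus_a.max() + 1, locus_b.max() + 1))
    for a, b in zip(locus_a, locus_b):
        table[a, b] += 1
    expected = np.outer(table.sum(1), table.sum(0)) / table.sum()
    mask = table > 0
    return 2.0 * np.sum(table[mask] * np.log(table[mask] / expected[mask]))

def resampled_p_value(locus_a, locus_b, n_resamples=2000):
    observed = lr_statistic(locus_a, locus_b)
    # Shuffling one locus across individuals breaks any inter-locus
    # association while preserving single-locus genotype frequencies.
    null = np.array([lr_statistic(rng.permutation(locus_a), locus_b)
                     for _ in range(n_resamples)])
    return (1 + np.sum(null >= observed)) / (1 + n_resamples)

# Toy data: genotype codes at two loci for 60 individuals.
locus_a = rng.integers(0, 3, 60)
locus_b = rng.integers(0, 4, 60)
print(resampled_p_value(locus_a, locus_b))
```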

580 citations


Journal ArticleDOI
TL;DR: In this paper, the authors introduce tests of linearity for time series based on nonparametric estimates of the conditional mean and the conditional variance, which are compared to a number of parametric tests and to non-parametric tests based on the bispectrum.
Abstract: SUMMARY We introduce tests of linearity for time series based on nonparametric estimates of the conditional mean and the conditional variance. The tests are compared to a number of parametric tests and to nonparametric tests based on the bispectrum. Asymptotic expressions give poor approximations, so the null distribution under linearity is constructed by resampling from the best linear approximation. The new tests perform well on the examples tested.
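
A minimal sketch of "resampling the best linear approximation": fit an AR model by least squares, resample its residuals to simulate surrogate linear series, and recompute the test statistic on each surrogate. The paper's statistics use nonparametric conditional mean and variance estimates; the squared-residual autocorrelation below is only a stand-in, and the function names and settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_ar(x, p):
    """Least-squares AR(p) fit; returns coefficients and residuals."""
    X = np.column_stack([x[p - k:len(x) - k] for k in range(1, p + 1)])
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

def simulate_ar(coef, resid, n, rng):
    """Surrogate series from the fitted linear model with resampled shocks."""
    p = len(coef)
    x = np.zeros(n + p)
    shocks = rng.choice(resid - resid.mean(), size=n + p)
    for t in range(p, n + p):
        x[t] = coef @ x[t - p:t][::-1] + shocks[t]
    return x[p:]

def linearity_stat(x, p=2):
    """Stand-in statistic: lag-1 autocorrelation of squared AR residuals."""
    _, e = fit_ar(x, p)
    e2 = e ** 2
    return abs(np.corrcoef(e2[1:], e2[:-1])[0, 1])

# Toy observed series (actually linear here, so the test should not reject).
x = np.zeros(300)
for t in range(1, 300):
    x[t] = 0.6 * x[t - 1] + rng.normal()

coef, resid = fit_ar(x, 2)
obs = linearity_stat(x)
null = np.array([linearity_stat(simulate_ar(coef, resid, len(x), rng))
                 for _ in range(500)])
print((1 + np.sum(null >= obs)) / (1 + len(null)))   # resampled p-value
```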

125 citations


Proceedings Article
27 Nov 1995
TL;DR: Methods of achieving error independence between the networks by training the networks with different resampling sets from the original training set are investigated.
Abstract: Central to the performance improvement of a committee relative to individual networks is the error correlation between networks in the committee. We investigated methods of achieving error independence between the networks by training the networks with different resampling sets from the original training set. The methods were tested on an artificial sine-wave task and the real-world problems of hepatoma (liver cancer) and breast cancer diagnoses.
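
A minimal sketch of the resampling component: each committee member is trained on a different bootstrap resample of the training set so that member errors decorrelate, and predictions are averaged. A polynomial fit stands in for the paper's neural networks, and the noisy sine data stand in for the sine-wave task; everything here is illustrative.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(2)

# Toy task: a noisy sine wave, standing in for the paper's sine-wave task.
x = rng.uniform(0, 2 * np.pi, 80)
y = np.sin(x) + rng.normal(0, 0.2, 80)

def train_member(x, y, rng, degree=7):
    """Train one committee member on a bootstrap resample of the data;
    a polynomial fit stands in for a neural network."""
    idx = rng.integers(0, len(x), len(x))     # resample with replacement
    return Polynomial.fit(x[idx], y[idx], degree)

committee = [train_member(x, y, rng) for _ in range(10)]

# Averaging members trained on different resamples reduces the error
# correlation between them, which is what improves the committee.
x_test = np.linspace(0.5, 5.5, 5)
pred = np.mean([m(x_test) for m in committee], axis=0)
print(np.column_stack([np.sin(x_test), pred]))
```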

105 citations


Journal ArticleDOI
TL;DR: In this article, the parameters of a bilinear model incorporating the singular value decomposition (SVD) are identified by the standard orthogonality relationships of the SVD, and 'filtering' techniques are proposed to remove resampling artefacts such as sign changes and reordering of singular vectors.
Abstract: SUMMARY Simulation is a standard technique for investigating the sampling distribution of parameter estimators. The bootstrap is a distribution-free method of assessing sampling variability based on resampling from the empirical distribution; the parametric bootstrap resamples from a fitted parametric model. However, if the parameters of the model are constrained, and the application of these constraints is a function of the realized sample, then the resampling distribution obtained from the parametric bootstrap may become badly biased and overdispersed. Here we discuss such problems in the context of estimating parameters from a bilinear model that incorporates the singular value decomposition (SVD) and in which the parameters are identified by the standard orthogonality relationships of the SVD. Possible effects of the SVD parameter identification are arbitrary changes in the sign of singular vectors, inversion of the order of singular values and rotation of the plotted co-ordinates. This paper proposes inverse transformation or 'filtering' techniques to avoid these problems. The ideas are illustrated by assessing the variability of the location of points in a principal co-ordinates diagram and in the marginal sampling distribution of singular values. An application to the analysis of a biological data set is described. In the discussion it is pointed out that several exploratory multivariate methods may benefit by using resampling with filtering.
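
A minimal sketch of the "filtering" idea for one of the artefacts named above, the arbitrary sign of singular vectors: align each resampled singular vector with the original sample's solution before summarizing variability. The paper treats a parametric bootstrap and also handles order inversion and rotation, which this case-resampling sketch omits; names and data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(30, 5))
U0, s0, V0t = np.linalg.svd(X - X.mean(0), full_matrices=False)

def aligned_svd(Xb, V0t):
    """SVD of a resampled matrix with each right singular vector
    sign-aligned to the original solution, so arbitrary sign flips do
    not inflate the apparent sampling variability."""
    U, s, Vt = np.linalg.svd(Xb - Xb.mean(0), full_matrices=False)
    signs = np.sign(np.sum(Vt * V0t, axis=1))   # +1/-1 per component
    signs[signs == 0] = 1.0
    return s, signs[:, None] * Vt

boot_v1 = []
for _ in range(500):
    idx = rng.integers(0, X.shape[0], X.shape[0])
    s, Vt = aligned_svd(X[idx], V0t)
    boot_v1.append(Vt[0])          # first right singular vector, filtered

# Without the sign filter these intervals would straddle zero artificially.
print(np.percentile(boot_v1, [2.5, 97.5], axis=0))
```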

82 citations


Journal ArticleDOI
TL;DR: It is shown that, provided block length or leave-out number, respectively, are chosen appropriately, both techniques produce first-order optimal bandwidths, and the block bootstrap has far better empirical properties, particularly under long-range dependence.
Abstract: We analyse methods based on the block bootstrap and leave-out cross-validation, for choosing the bandwidth in nonparametric regression when errors have an almost arbitrarily long range of dependence. A novel analytical device for modelling the dependence structure of errors is introduced. This allows a concise theoretical description of the way in which the range of dependence affects optimal bandwidth choice. It is shown that, provided block length or leave-out number, respectively, are chosen appropriately, both techniques produce first-order optimal bandwidths. Nevertheless, the block bootstrap has far better empirical properties, particularly under long-range dependence.
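
A minimal sketch of the moving block bootstrap used here: resample overlapping blocks of residuals so that short-range dependence within blocks is preserved. The paper's bandwidth-selection machinery is omitted; the block length, toy data, and function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

def moving_block_bootstrap(x, block_len, rng):
    """Concatenate randomly chosen overlapping blocks of x; dependence
    within each block is preserved, so the resample mimics the original
    dependence structure up to lags of about block_len."""
    n = len(x)
    n_blocks = int(np.ceil(n / block_len))
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    return np.concatenate([x[s:s + block_len] for s in starts])[:n]

# Toy regression with dependent errors around a smooth trend.
t = np.linspace(0, 1, 200)
e = np.convolve(rng.normal(size=220), np.ones(21) / 21, mode="valid")
y = np.sin(2 * np.pi * t) + e

# Residuals from the true trend (in practice, from a pilot nonparametric fit).
resid = y - np.sin(2 * np.pi * t)
boot_sd = [moving_block_bootstrap(resid, 20, rng).std() for _ in range(500)]
print(np.mean(boot_sd), resid.std())
```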

81 citations


Journal ArticleDOI
TL;DR: In this article, the authors introduced a method of multiple hypothesis testing that combines the idea of sequential multiple testing procedures with the structure of resampling methods, which can be seen as an alternative to the analytic method of Dunnett and Tamhane, which requires a specific distributional form.
Abstract: This article introduces a method of multiple hypothesis testing that combines the idea of sequential multiple testing procedures with the structure of resampling methods. The method can be seen as an alternative to the analytic method of Dunnett and Tamhane, which requires a specific distributional form. Resampling incorporates the covariance structure of the data without the need for distributional assumptions. Recent work by Westfall and Young has shown that a step-down resampling method is asymptotically consistent when adjusted p values can be obtained exactly for continuous data. This article shows that in the case of a comparison of two groups on multiple outcomes, those results are generalizable to discrete data where exact adjusted p values are not available. It is shown that the method asymptotically attains the desired level for controlling the experimentwise probability of a type I error.
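
A minimal sketch of a step-down resampling adjustment in the spirit of Westfall and Young: permute group labels, track successive maxima of |t| along the step-down ordering, and enforce monotone adjusted p-values. This is a generic maxT construction, not the article's exact procedure for discrete data; the data and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

def stepdown_adjusted_pvalues(x, y, n_resamples=2000):
    """Step-down resampling adjustment: the covariance structure of the
    outcomes enters through the joint permutation distribution, with no
    distributional assumptions."""
    def tstats(a, b):
        return np.abs(a.mean(0) - b.mean(0)) / np.sqrt(
            a.var(0, ddof=1) / len(a) + b.var(0, ddof=1) / len(b))

    t_obs = tstats(x, y)
    order = np.argsort(-t_obs)                # most significant first
    pooled = np.vstack([x, y])
    counts = np.zeros(x.shape[1])
    for _ in range(n_resamples):
        perm = rng.permutation(len(pooled))
        t_perm = tstats(pooled[perm[:len(x)]], pooled[perm[len(x):]])
        # Successive maxima of |t| over the remaining (less significant) set.
        succ_max = np.maximum.accumulate(t_perm[order][::-1])[::-1]
        counts[order] += succ_max >= t_obs[order]
    p_adj = counts / n_resamples
    # Enforce monotone adjusted p-values down the step-down ordering.
    p_adj[order] = np.maximum.accumulate(p_adj[order])
    return p_adj

x = rng.normal(0.0, 1.0, (20, 5))
x[:, 0] += 1.5                                # one true effect
y = rng.normal(0.0, 1.0, (25, 5))
print(stepdown_adjusted_pvalues(x, y))
```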

78 citations


Journal ArticleDOI
TL;DR: This article describes the application of resampling techniques to ROC data for which the binormal assumptions are not appropriate, and suggests that the bootstrap may be especially helpful in determining confidence intervals from small data samples.
Abstract: The methods most commonly used for analyzing receiver operating characteristic (ROC) data incorporate "binormal" assumptions about the latent frequency distributions of test results. Although these assumptions have proved robust to a wide variety of actual frequency distributions, some data sets do not "fit" the binormal model. In such cases, resampling techniques such as the jackknife and the bootstrap provide versatile, distribution-independent, and more appropriate methods for hypothesis testing. This article describes the application of resampling techniques to ROC data for which the binormal assumptions are not appropriate, and suggests that the bootstrap may be especially helpful in determining confidence intervals from small data samples. The widespread availability of ever-faster computers has made resampling methods increasingly accessible and convenient tools for data analysis. Key words: receiver operating characteristic; ROC; resampling; jackknife; bootstrap; diagnostic testing; diagnostic...
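
A minimal sketch of a distribution-free bootstrap for one common ROC summary, the area under the curve: resample diseased and non-diseased cases separately and take percentile limits. The binormal model is never invoked. The data and function names are illustrative, not taken from the article.

```python
import numpy as np

rng = np.random.default_rng(6)

def auc(scores_pos, scores_neg):
    """Nonparametric AUC: probability that a diseased case scores higher
    than a non-diseased case (ties count one half)."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

def bootstrap_auc_ci(pos, neg, n_boot=2000, alpha=0.05):
    """Percentile bootstrap interval: resample each group with replacement."""
    stats = [auc(rng.choice(pos, len(pos)), rng.choice(neg, len(neg)))
             for _ in range(n_boot)]
    return np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])

pos = rng.normal(1.0, 1.0, 40)   # test results for diseased cases
neg = rng.normal(0.0, 1.0, 60)   # test results for non-diseased cases
print(auc(pos, neg), bootstrap_auc_ci(pos, neg))
```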

60 citations


Journal ArticleDOI
TL;DR: In this article, the performance of a bootstrapping enhanced DEA to measure the relative structural efficiency of unbalanced subsamples is investigated, and it is shown that a reasampling approach to DEA can cope with this problem and also allows the use of pooled samples.
Abstract: This paper investigates the performance of a bootstrapping enhanced DEA to measure the relative structural efficiency of unbalanced subsamples. Although this issue plays an important role in applied DEA, it is often ignored, resulting in misleading conclusions concerning relative efficiency. It is shown, that a reasampling approach to DEA can cope with this problem and also allows the use of pooled samples. The distribution of a statistic to test the hypotheses of equal structural efficiency is derived from Monte Carlo simulations and compared with the corresponding statistic calculated from standard DEA results. While the resampling variant of DEA justifies the use of the normal approximation, this is not the case for standard DEA.

57 citations


Journal ArticleDOI
TL;DR: The cluster sample technique, presented here in the context of a logistic dose-response model, incorporates many of the advantages of quasi-likelihood methods, is valid for any underlying nested correlation structure, and is adaptable to a variety of analytical settings.

Abstract: SUMMARY This paper presents a model-free approach for evaluating teratology and developmental toxicity data involving clustered binary responses. In teratology studies, a major statistical problem arises from the effect of intralitter correlation, or the potential for littermates to respond similarly. Some statistical methods impose strict distributional assumptions to account for extra-binomial variation, while others rely on nonparametric resampling and empirical variance estimation techniques. Quasi-likelihood methods and generalized estimating equations (GEE), which model the marginal mean/variance relationship, also avoid strict distributional assumptions. The proposed approach, often used to analyze complex sample survey data, is based on a first-order Taylor series approximation and a between-cluster variance estimation procedure, yielding consistent variance estimates for binomial-based proportions and regression coefficients from dose-response models. The cluster sample technique, presented here in the context of a logistic dose-response model, incorporates many of the advantages of quasi-likelihood methods, is valid for any underlying nested correlation structure, and is adaptable to a variety of analytical settings. The results of a simulation study show the cluster sample technique to be a viable competitor to GEE methods currently receiving attention.
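
A minimal sketch of the between-cluster, Taylor-linearization variance estimate for an overall proportion from clustered binary data, treating litters as the sampling clusters. This is the standard ratio-estimator form from the survey literature that the paper adapts; the toy litter data and names are illustrative.

```python
import numpy as np

def cluster_proportion_variance(y, n):
    """Taylor-linearization variance of p = sum(y)/sum(n) for clustered
    binary data: y[i] = responders and n[i] = litter size in litter i.
    No distributional model for the intralitter correlation is assumed."""
    y, n = np.asarray(y, float), np.asarray(n, float)
    k = len(y)
    p = y.sum() / n.sum()
    z = y - p * n                      # linearized cluster residuals
    var_p = k / (k - 1) * np.sum(z ** 2) / n.sum() ** 2
    return p, var_p

# Toy teratology data: responders / litter size, one entry per litter.
y = [0, 2, 5, 1, 3, 0, 4, 2]
n = [8, 10, 12, 9, 11, 7, 12, 10]
p, v = cluster_proportion_variance(y, n)
print(p, v ** 0.5)                     # proportion and its standard error
```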

47 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed three procedures for testing additivity in nonlinear time series analysis, combining some smoothing techniques with analysis of variance, the second is a Lagrange multiplier test using nonparametric estimation, and the third is a permutation test which uses smoothing technique to obtain the test statistic and its reference distribution.
Abstract: SUMMARY Additivity is commonly used in the statistical literature to simplify data analysis, especially in analysis of variance and in multivariate smoothing. In this paper, we propose three procedures for testing additivity in nonlinear time series analysis. The first procedure combines some smoothing techniques with analysis of variance, the second is a Lagrange multiplier test using nonparametric estimation, and the third is a permutation test which uses smoothing techniques to obtain the test statistic and its reference distribution. We investigate properties of the proposed tests and use simulation to check their performance in finite samples. Applications of the tests to nonlinear time series analysis are discussed and illustrated by real examples.

Journal ArticleDOI
TL;DR: Case-resampling bootstrap provides some justification for the DBM and TG methods and gives evidence for a trade-off of readers and cases with regard to precision and power in this data set.

Journal ArticleDOI
TL;DR: In this paper, asymptotic expansions of the required correction and the iterated interval endpoints are used to provide two new computationally efficient methods for constructing an approximation to the iterated bootstrap confidence interval.
Abstract: An iterated bootstrap confidence interval requires an additive correction to be made to the nominal coverage level of an uncorrected interval. Such correction is usually performed using a computationally intensive Monte Carlo simulation involving two nested levels of bootstrap sampling. Asymptotic expansions of the required correction and the iterated interval endpoints are used to provide two new computationally efficient methods for constructing an approximation to the iterated bootstrap confidence interval. The first asymptotic interval replaces the need for a second level of bootstrap sampling with a series of preliminary analytic calculations, which are readily automated, and from which an approximation to the coverage correction is easily obtained. The second interval directly approximates the endpoints of the iterated interval and yields, for the first time, the possibility of constructing an approximation to an iterated bootstrap confidence interval which does not require any resampling. The theoretical properties of the two intervals are considered. The computation required for their construction is detailed and has been coded in a fully automatic user-friendly Fortran program which may be obtained by anonymous ftp. A simulation study which illustrates their effectiveness on three examples is presented.
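
For contrast with the paper's analytic shortcuts, here is a sketch of the brute-force nested scheme they are designed to avoid: an outer level of resamples estimates the true coverage of a nominal-level percentile interval, and the nominal level is then adjusted by the estimated coverage error. The one-step additive adjustment and all settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

def percentile_ci(sample, level, n_boot, rng):
    """Ordinary percentile bootstrap interval for the mean."""
    means = [rng.choice(sample, len(sample)).mean() for _ in range(n_boot)]
    a = (1 - level) / 2
    return np.percentile(means, [100 * a, 100 * (1 - a)])

def iterated_bootstrap_ci(sample, level=0.90, n_outer=200, n_inner=100):
    """Double bootstrap: estimate the actual coverage of the nominal-level
    interval by resampling, then additively correct the nominal level."""
    theta_hat = sample.mean()
    hits = 0
    for _ in range(n_outer):
        resample = rng.choice(sample, len(sample))
        lo, hi = percentile_ci(resample, level, n_inner, rng)
        hits += lo <= theta_hat <= hi      # does the inner CI cover?
    coverage = hits / n_outer
    # Crude one-step additive correction to the nominal coverage level.
    adjusted = min(level + (level - coverage), 0.999)
    return percentile_ci(sample, adjusted, 2000, rng)

sample = rng.exponential(1.0, 30)
print(iterated_bootstrap_ci(sample))
```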

Journal ArticleDOI
TL;DR: In this article, a method of bootstrap estimation for the linear model is presented, where instead of resampling from the original sample, as is conventional, the proposed method resamples summands in an estimating function used to produce the original estimate.
Abstract: SUMMARY This paper presents a method of bootstrap estimation for the linear model. Rather than resampling from the original sample, as is conventional, the proposed method resamples summands in an estimating function used to produce the original estimate. The method is computationally simpler than existing competitors. However, its main advantage lies in the robustness to heteroscedasticity of the estimators it produces. These estimators are shown to have desirable asymptotic properties under fairly general conditions. A simulation study provides small sample support for the new method.
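
A minimal sketch of the idea as described: resample the summands of the least-squares estimating function, x_i(y_i - x_i'b), rather than the observations, and map each resampled sum back through (X'X)^{-1}. Because each summand carries its own error scale, the resulting standard errors are robust to heteroscedasticity. The data and names are illustrative, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(8)

n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, 2.0, -1.0])
y = X @ beta + rng.normal(size=n) * (1 + np.abs(X[:, 1]))  # heteroscedastic

beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
scores = X * (y - X @ beta_hat)[:, None]   # summands of the estimating fn
XtX_inv = np.linalg.inv(X.T @ X)

boot = np.empty((2000, p))
for b in range(2000):
    # Resample the estimating-function summands, not the observations.
    s_star = scores[rng.integers(0, n, n)].sum(axis=0)
    boot[b] = beta_hat + XtX_inv @ s_star

print(boot.std(axis=0))   # heteroscedasticity-robust standard errors
```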

Journal ArticleDOI
01 Apr 1995-Heredity
TL;DR: The use of the bootstrap to estimate the distribution of statistics from allozyme data is examined; resampling over loci, in most practical cases, does not conform to the three basic assumptions of the bootstrap.
Abstract: The use of the bootstrap to estimate the distribution of statistics from allozyme data is examined. The different loci are often used as the unit of resampling. Since the interpretation and validity of the bootstrap is affected by the unit of resampling, and since resampling over loci, in most practical cases, does not conform to the three basic assumptions of the bootstrap, this method should be avoided. Resampling over individual genotypes may provide a valid alternative approach.

Journal ArticleDOI
TL;DR: Each stage of data processing is examined, detailing the advantages and drawbacks of different techniques in relation to their effects on turbulence statistics (variance, instantaneous shear stress, etc.).

Abstract: Laser Doppler Anemometry (LDA) has proved a powerful tool for quantifying fluid turbulence and is increasingly being applied in fields such as fluvial sedimentology and geomorphology. When operated in the burst-signal processing mode, high-frequency velocity fluctuations are measured at irregular time intervals. In many situations, there is a need to transform these data to obtain evenly spaced velocity values but at a lower frequency. However, clear guidelines for this type of data processing are lacking. Three steps are necessary in order to transform the original files into evenly spaced data: (1) resampling at the average sampling rate, (2) low-pass filtering with the half-power frequency adjusted to the final sampling frequency, and (3) decimating at the desired frequency. The decision taken at each step will affect the resulting signal and may cause, if not assessed carefully, severe problems in the signal such as aliasing errors. This paper examines each stage of data processing and details the advantages and drawbacks of different techniques in relation to their effects on turbulence statistics (variance, instantaneous shear stress, etc.). A standard method and specific guidelines are finally proposed.
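
A sketch of the three-step pipeline on synthetic burst data, assuming scipy is available: (1) resample onto a regular grid at the mean data rate, (2) low-pass filter with the half-power frequency matched to the final rate, (3) decimate. The interpolation scheme, filter order, and decimation factor are illustrative choices, not the paper's recommended settings.

```python
import numpy as np
from scipy.signal import butter, filtfilt

rng = np.random.default_rng(9)

# Hypothetical irregularly spaced LDA burst data: arrival times, velocities.
t = np.sort(rng.uniform(0.0, 10.0, 5000))
u = np.sin(2 * np.pi * 1.5 * t) + 0.3 * rng.normal(size=t.size)

# Step 1: resample onto a regular grid at the average sampling rate.
rate = t.size / (t[-1] - t[0])
t_reg = np.arange(t[0], t[-1], 1.0 / rate)
u_reg = np.interp(t_reg, t, u)

# Step 2: low-pass filter, half-power frequency set by the *final* rate,
# so that Step 3 does not alias high-frequency energy.
factor = 10                                   # decimation factor
b, a = butter(4, (rate / factor / 2) / (rate / 2))
u_filt = filtfilt(b, a, u_reg)

# Step 3: decimate to the desired output frequency.
u_out = u_filt[::factor]
print(u_reg.size, u_out.size)
```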

Journal ArticleDOI
01 Feb 1995
TL;DR: The authors propose the bootstrap as a valuable tool for the analysis of simulation output data since it can be used in situations in which either the distribution is not known or normal approximations are inappropriate.
Abstract: There are many situations in which parametric statistical techniques are less than ideal for evaluating a simulated system. For most simulation output, one must rely heavily on the central limit theorem in order to apply parametric statistical techniques. The bootstrap statistic is a nonparametric sample-resample technique that makes no distributional assumptions and may be used for estimation and hypothesis testing. The authors propose the bootstrap as a valuable tool for the analysis of simulation output data since it can be used in situations in which either the distribution is not known or normal approximations are inappropriate. Furthermore, since bootstrapping is itself a simulation technique it is inherently satisfying as a tool for the analysis of simulation output data. Illustrations are presented.

Journal ArticleDOI
TL;DR: In this article, the authors compare several test procedures for the two- or three-sample case, including the Birnbaum-Hall test and the k-sample Smirnov test as described in Conover (1980), for instance.

Journal ArticleDOI
TL;DR: It is shown how the moving block bootstrap can be applied successfully to a simple cubic spin-flip Ising model, to assess standard errors and confidence intervals for thermodynamic densities and response functions at temperatures both greater than and equal to the critical temperature.

Journal ArticleDOI
TL;DR: This review describes two alternatives to classical tests for distinguishing means and argues that if randomization rather than random sampling has been done, permutation tests are superior to the classical t and F tests for detecting differences between means and should replace them.
Abstract: This review describes two alternatives to classical tests for distinguishing means. These are called computer-intensive because they can only be performed on fast computers. Permutation procedures have the virtue that they are easy to understand, they can be employed to analyse small sets of experimental data, and under the randomization model of inference (though not the population model) they require no assumptions except that the experimental groups have been constructed by randomization. Bootstrap procedures are designed for use under the population model of inference (though not the randomization model) and are best suited to larger sets of experimental data. Non-parametric bootstrapping requires populations to be sampled randomly, but it depends on no prior assumptions about the distributions of those populations. It is argued that if randomization rather than random sampling has been done, permutation tests are superior to the classical t and F tests for detecting differences between means and should therefore replace them. If random sampling has been done, non-parametric bootstrap techniques may prove superior to classical tests for constructing population confidence intervals or testing hypotheses. However, their accuracy, especially for hypothesis testing and when samples are small, has yet to be firmly established, and there is a dearth of commercial software with which they can be executed on personal computers.
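
A minimal sketch of the permutation procedure described: under the null of no treatment effect, the observed group labels are one random reallocation among many, so the difference in means is referred to its permutation distribution. The group sizes and data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(10)

def permutation_test(a, b, n_perm=10000):
    """Permutation test for a difference between two group means, valid
    under the randomization model: reallocate the pooled responses to
    groups at random and compare to the observed difference."""
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        count += abs(perm[:len(a)].mean() - perm[len(a):].mean()) >= observed
    return (count + 1) / (n_perm + 1)

a = rng.normal(0.0, 1.0, 12)   # control group
b = rng.normal(0.8, 1.0, 12)   # treated group
print(permutation_test(a, b))
```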

Journal ArticleDOI
TL;DR: In this article, a resampling plan is introduced for bootstrapping regression parameter estimators for the Cox (1972) proportional hazards regression model when explanatory variables are nonrandom constants fixed by the design of the experiment.
Abstract: A resampling plan is introduced for bootstrapping regression parameter estimators for the Cox (1972) proportional hazards regression model when explanatory variables are nonrandom constants fixed by the design of the experiment. The plan is an analog to the residual-resampling method for regression introduced by Efron (1979) and is related to the resampling method proposed by Hjort (1985) for the Cox model. The resampled quantities are a form of generalized residuals which have a distribution that is independent of the explanatory variables. Hence, unlike some methods, this approach does not require resampling of explanatory variables, which would be contrary to the assumption that they are nonrandom. An invariance property of the Cox likelihood allows these residuals to be transformed into a convenient scale for generating a likelihood. Also, the method can incorporate many forms of censoring. A simulation study of the proposed procedure shows that it can be used to improve upon the usual estimation procedure...

Journal ArticleDOI
Harry Mager1, Gernot Göller1
TL;DR: Simulation studies based on different predefined pharmacokinetic models revealed that even the non-parametric pseudo-profile stratified 'bootstrap' (resampling with replacement per time point) performs quite satisfactorily.
Abstract: In general, pharmacokinetic model parameters, like rate constants, area under the curve (AUC), etc., are estimated via a two-stage procedure, where the values obtained from concentration-time relationships within one subject (experimental unit) are considered to be functionally related to the drug concentrations measured. In many cases 'mean' estimators and their respective standard errors are calculated afterwards. The determination of drug concentrations in organs as well as in the serum of small animals (mice, rats) as a function of time after administration often does not permit the establishment of reasonable within-subject profiles suited for conventional pharmacokinetic analyses and tolerability studies. Frequently, only one experimental value per organ or animal is recorded. The consequence is that most pharmacokinetic parameters must be estimated on the basis of mean concentrations rather than via the mean of individual parameter estimates. In all cases of a non-linear relationship between a target item and the concentration, the mean-concentration-based estimators and the two-stage profile-based estimators need not coincide. In addition, in the former case variance estimators may be either difficult to obtain or not deducible. In order to obtain variance estimators, and to enable comparisons between different treatment regimens as well as bioequivalence testing as a step towards human dose-finding studies, various resampling techniques (parametric and non-parametric bootstrap) were applied to generate pseudo-profiles from independent measurements and compared to their more conventional counterparts where applicable. Simulation studies based on different predefined pharmacokinetic models (first-order elimination after i.v. bolus, first-order elimination after first-order absorption, simple capacity-limited kinetics) revealed that even the non-parametric pseudo-profile stratified 'bootstrap' (resampling with replacement per time point) performs quite satisfactorily.
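
A minimal sketch of the stratified pseudo-profile bootstrap named in the abstract: with one concentration per animal and several animals per time point, resample within each time point to assemble pseudo-profiles and compute the AUC of each by the trapezoidal rule. The toy monoexponential data and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(11)

# Hypothetical destructive-sampling data: rows = time points, columns =
# animals; each animal contributes a single concentration measurement.
times = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 24.0])
conc = 10 * np.exp(-0.3 * times)[:, None] * rng.lognormal(0.0, 0.2, (6, 5))

def bootstrap_auc(times, conc, n_boot=2000, rng=rng):
    """Resample with replacement within each time point to build
    pseudo-profiles; return the bootstrap distribution of their AUCs."""
    n_t, n_a = conc.shape
    aucs = np.empty(n_boot)
    for b in range(n_boot):
        profile = conc[np.arange(n_t), rng.integers(0, n_a, n_t)]
        # AUC of the pseudo-profile by the trapezoidal rule.
        aucs[b] = np.sum((profile[1:] + profile[:-1]) / 2 * np.diff(times))
    return aucs

aucs = bootstrap_auc(times, conc)
print(aucs.mean(), np.percentile(aucs, [2.5, 97.5]))
```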

Journal Article
TL;DR: The four steps in the resampling method are defined, this procedure is compared with standard formulaic treatment of the same data, and the advantages of the resampling approach are discussed.
Abstract: The statistical techniques reported in medical journals are often poorly understood and misused by researchers, and difficult for readers to comprehend. Even the relatively simple t test embodies an intricate collection of mathematical formulas such as the Normal approximation, with unintuitive elements such as pi and e. The researcher must consult a body of rules indicating whether a technique is applicable or inapplicable. If he or she does select a sound method and use it correctly, its mathematical complexity will still leave most readers in the dark. In most situations the resampling approach is at least as efficient as the formulaic method. It is transparently clear to the researcher and reader, and it protects the researcher from using the wrong formulaic technique. In this paper we will define the four steps in the resampling method, compare this procedure with standard formulaic treatment of the same data, and discuss the advantages of the resampling approach.

Journal ArticleDOI
TL;DR: In this article, the authors compared three alternative procedures for testing the significance of coefficients in least absolute value (LAV) regression, in the context of small samples, and concluded that the bootstrap test used in this study performs well, compared to the other two tests.

Proceedings ArticleDOI
27 Nov 1995
TL;DR: This paper applies a statistical resampling technique, called the bootstrap method, to this estimation problem and shows that the variance of the bootstrap estimates can be smaller than that of the cross-validated estimates.

Abstract: We compare the cross-validation and bootstrap methods for estimating the expected error rates of feedforward neural network classifiers in small sample size situations. The cross-validation method, a commonly applied method, provides nearly unbiased classification error rates, using only the original samples. The cross-validated estimates, however, may suffer from a large variance. In this paper, we apply a statistical resampling technique, called the bootstrap method, to this estimation problem and compare the performances of these methods. Our results show that the variance of the bootstrap estimates can be smaller than that of the cross-validated estimates.
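
A minimal sketch of the comparison, with a nearest-mean rule standing in for the paper's feedforward networks: leave-one-out cross-validation versus a bootstrap estimate that trains on each resample and tests on the cases it left out. The sample sizes, the classifier, and the names are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(12)

def nearest_mean_classify(train_x, train_y, test_x):
    """Tiny stand-in classifier (the paper uses feedforward networks)."""
    m0 = train_x[train_y == 0].mean(0)
    m1 = train_x[train_y == 1].mean(0)
    d0 = ((test_x - m0) ** 2).sum(1)
    d1 = ((test_x - m1) ** 2).sum(1)
    return (d1 < d0).astype(int)

def loo_cv_error(x, y):
    """Leave-one-out cross-validation: nearly unbiased, possibly noisy."""
    errs = [nearest_mean_classify(np.delete(x, i, 0), np.delete(y, i),
                                  x[i:i + 1])[0] != y[i]
            for i in range(len(y))]
    return np.mean(errs)

def bootstrap_error(x, y, n_boot=200):
    """Train on each bootstrap sample; test on the cases it left out."""
    errs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))
        out = np.setdiff1d(np.arange(len(y)), idx)
        if len(out) == 0 or len(set(y[idx])) < 2:
            continue                     # skip degenerate resamples
        pred = nearest_mean_classify(x[idx], y[idx], x[out])
        errs.append(np.mean(pred != y[out]))
    return np.mean(errs)

x = np.vstack([rng.normal(0.0, 1, (15, 2)), rng.normal(1.2, 1, (15, 2))])
y = np.repeat([0, 1], 15)
print(loo_cv_error(x, y), bootstrap_error(x, y))
```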

Journal ArticleDOI
Rose Baker1
TL;DR: The F-ratio test for equality of dispersion in two samples is by no means robust, while non-parametric tests either assume a common median, or are not very powerful.
Abstract: The F-ratio test for equality of dispersion in two samples is by no means robust, while non-parametric tests either assume a common median or are not very powerful. Two new permutation tests are presented which do not suffer from either of these problems. Algorithms for Monte Carlo calculation of P values and confidence intervals are given, and the performance of the tests is studied and compared using Monte Carlo simulations for a range of distributional types. The methods used to speed up the Monte Carlo calculations, e.g. stratification, are of wider applicability.

Journal ArticleDOI
01 Feb 1995-Heredity
TL;DR: Two bootstrap procedures are proposed to perform one- and two-sample tests on inbreeding coefficients for single loci by resampling over the genotypes by Monte Carlo simulations, and a comparison with the classical chi-square test is made.
Abstract: Two bootstrap procedures are proposed to perform one- and two-sample tests on inbreeding coefficients for single loci by resampling over the genotypes. These tests allow testing against a broad range of new alternative hypotheses in addition to panmixia. Monte Carlo simulations show that the coverage probability of these tests behaves satisfactorily if the number of bootstrap resamples is at least 2500 and the sample size is at least 20, for the case of two alleles. As the fixation index of the underlying distribution becomes more extreme, larger sample sizes are required to obtain a reliable test. Two explicit formulae for the power of the two tests are estimated from Monte Carlo simulations, and a comparison with the classical chi-square test is made. A Turbo Pascal computer program is available to perform the two presented bootstrap tests.
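
A minimal sketch of a one-sample procedure of this kind for a single biallelic locus: compute the fixation index F = 1 - Hobs/Hexp, resample genotypes with replacement, and reject panmixia (F = 0) when 0 falls outside the percentile interval. The 2500 resamples echo the abstract's guideline; the genotype coding and data are illustrative, and this is not the authors' Turbo Pascal implementation.

```python
import numpy as np

rng = np.random.default_rng(13)

def fixation_index(genotypes):
    """F = 1 - Hobs/Hexp for one biallelic locus; genotypes coded as
    the count of one allele (0, 1 or 2)."""
    g = np.asarray(genotypes)
    p = g.mean() / 2                       # allele frequency
    h_obs = np.mean(g == 1)                # observed heterozygosity
    h_exp = 2 * p * (1 - p)                # expected under panmixia
    return 1 - h_obs / h_exp

def bootstrap_ci(genotypes, n_boot=2500, alpha=0.05):
    """Resample genotypes with replacement; reject panmixia (F = 0)
    if 0 lies outside the percentile interval."""
    f_star = [fixation_index(rng.choice(genotypes, len(genotypes)))
              for _ in range(n_boot)]
    return np.percentile(f_star, [100 * alpha / 2, 100 * (1 - alpha / 2)])

# Toy sample with a heterozygote deficit (positive F).
genotypes = rng.choice([0, 1, 2], size=60, p=[0.45, 0.2, 0.35])
print(fixation_index(genotypes), bootstrap_ci(genotypes))
```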

Journal ArticleDOI
TL;DR: The presented method extends readily to recursive identification of parameters in a chosen model structure and to improving the tracking ability of an ordinary recursive routine.

Journal ArticleDOI
TL;DR: The effects of higher-order resampling methods on AVHRR data are analysed and discussed, and an exemplar taken from Normalised Difference Vegetation Index (NDVI) data sets derived from NOAA AVHRR data is presented to show the possible artefacts introduced by higher-order resampling strategies.
Abstract: Before using remotely-sensed data for any scientific research, the data have to pass through several stages of pre-processing such as navigation and registration, rectification, replacement of dropouts and primary calibration. It is necessary to know the effects of these pre-processing operations on the information content in the raw data. In this article the effects of higher-order resampling methods on AVHRR data are analysed and discussed. An exemplar taken from Normalised Difference Vegetation Index (NDVI) data sets derived from NOAA AVHRR data is presented to show the possible artefacts introduced by higher-order resampling strategies.
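
A small sketch of the kind of artefact at issue, assuming scipy is available: zooming a step edge (say, a land/water boundary in an NDVI scene) with nearest-neighbour versus cubic resampling. The higher-order scheme overshoots the original data range, creating NDVI values with no physical meaning; the image and zoom factor are illustrative.

```python
import numpy as np
from scipy.ndimage import zoom

# A step edge, as at a land/water boundary in an NDVI image.
ndvi = np.zeros((20, 20), dtype=float)
ndvi[:, 10:] = 0.8

nearest = zoom(ndvi, 3, order=0)   # nearest neighbour: values preserved
cubic = zoom(ndvi, 3, order=3)     # cubic spline: ringing at the edge

# Higher-order resampling creates values outside the original data range,
# an artefact with no physical meaning in an NDVI product.
print(nearest.min(), nearest.max())   # 0.0, 0.8
print(cubic.min(), cubic.max())       # overshoots below 0 and above 0.8
```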

Journal ArticleDOI
TL;DR: In this article, the authors present a paired test of the equality of two means, allowing for incomplete pairs. But the test may be performed on original data or after transformation of the data.
Abstract: Summary The paper presents a paired test of the equality of two means, allowing for incomplete pairs. It is a permutation test depending on minimal distributional assumptions. The test may be performed on original data or after transformation of the data. Transformation to ranks is illustrated.
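
A minimal sketch of the complete-pairs component of such a test: under the null of equal means, each within-pair difference is symmetric about zero, so signs can be flipped at random to build the reference distribution. The paper's handling of incomplete pairs and of transformations (e.g., to ranks) is omitted; the data and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(14)

def signflip_test(x, y, n_perm=10000):
    """Sign-flip permutation test on complete pairs: randomly flip the
    sign of each within-pair difference and compare the mean difference
    to its permutation distribution."""
    d = x - y
    observed = abs(d.mean())
    flips = rng.choice([-1.0, 1.0], size=(n_perm, len(d)))
    perm_means = np.abs((flips * d).mean(axis=1))
    return (1 + np.sum(perm_means >= observed)) / (1 + n_perm)

x = rng.normal(0.5, 1.0, 15)   # first measurement of each pair
y = rng.normal(0.0, 1.0, 15)   # second measurement of each pair
print(signflip_test(x, y))
```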