
Showing papers on "Resampling published in 1997"


Journal ArticleDOI
01 Jun 1997-Ecology
TL;DR: Resampling methods should be incorporated in meta-analysis studies to ensure proper evaluation of main effects in ecological studies; confidence limits based on bootstrapping methods were found to be wider than standard confidence limits, implying that resampling estimates are more conservative.
Abstract: Meta-analysis is a statistical technique that allows one to combine the results from multiple studies to glean inferences on the overall importance of various phenomena. This method can prove to be more informative than common "vote counting," in which the number of significant results is compared to the number with nonsignificant results to determine whether the phenomenon of interest is globally important. While the use of meta-analysis is widespread in medicine and the social sciences, only recently has it been applied to ecological questions. We compared the results of parametric confidence limits and homogeneity statistics commonly obtained through meta-analysis to those obtained from resampling methods to ascertain the robustness of standard meta-analytic techniques. We found that confidence limits based on bootstrapping methods were wider than standard confidence limits, implying that resampling estimates are more conservative. In addition, we found that significance tests based on homogeneity statistics differed occasionally from results of randomization tests, implying that inferences based solely on chi-square significance tests may lead to erroneous conclusions. We conclude that resampling methods should be incorporated in meta-analysis studies, to ensure proper evaluation of main effects in ecological studies.

569 citations
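
The bootstrap confidence limits the study compares against parametric ones are easy to reproduce. Below is a minimal sketch of a percentile bootstrap for a mean effect size; the effect values, replicate count, and function name are hypothetical, not taken from the paper.

```python
import numpy as np

def bootstrap_ci(effect_sizes, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence limits for a mean effect size."""
    rng = np.random.default_rng(seed)
    es = np.asarray(effect_sizes, dtype=float)
    # Resample studies with replacement; recompute the mean effect each time.
    idx = rng.integers(0, len(es), size=(n_boot, len(es)))
    boot_means = es[idx].mean(axis=1)
    return tuple(np.percentile(boot_means, [100 * alpha / 2, 100 * (1 - alpha / 2)]))

# Hypothetical per-study effect sizes (e.g., standardized mean differences).
print(bootstrap_ci([0.42, 0.10, 0.55, -0.08, 0.31, 0.27, 0.64, 0.05]))
```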



Journal ArticleDOI
TL;DR: A resampling technique is developed to approximate the distribution of this function, which enables one to construct confidence bands for the cumulative incidence curve over the entire time span of interest and to perform Kolmogorov-Smirnov type tests for comparing two such curves.
Abstract: In the competing risks problem, a useful quantity is the cumulative incidence function, which is the probability of occurrence by time t for a particular type of failure in the presence of other risks. The estimator of this function as given by Kalbfleisch and Prentice is consistent, and, properly normalized, converges weakly to a zero-mean Gaussian process with a covariance function for which a consistent estimator is provided. A resampling technique is developed to approximate the distribution of this process, which enables one to construct confidence bands for the cumulative incidence curve over the entire time span of interest and to perform Kolmogorov-Smirnov type tests for comparing two such curves. An AIDS example is provided.

439 citations


Journal ArticleDOI
TL;DR: In this paper, a bootstrap method based on the method of sieves is proposed, where a linear process is approximated by a sequence of autoregressive processes of order p = p(n), with p(n) → ∞ and p(n) = o(n) as the sample size n → ∞.
Abstract: We study a bootstrap method which is based on the method of sieves. A linear process is approximated by a sequence of autoregressive processes of order p = p(n), where p(n) → ∞ and p(n) = o(n) as the sample size n → ∞. For given data, we then estimate such an AR(p(n)) model and generate a bootstrap sample by resampling from the residuals. This sieve bootstrap enjoys a nice nonparametric property, being model-free within a class of linear processes. We show its consistency for a class of nonlinear estimators and compare the procedure with the blockwise bootstrap.

416 citations
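
A minimal sketch of the sieve bootstrap idea described above, assuming a fixed, user-chosen order p (in practice p = p(n) would grow slowly with n, e.g., chosen by AIC); all names and defaults are illustrative.

```python
import numpy as np

def sieve_bootstrap(x, p, n_boot=1, seed=0):
    """Resample a time series by fitting an AR(p) and bootstrapping residuals."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Least-squares fit of x_t on (x_{t-1}, ..., x_{t-p}).
    X = np.column_stack([x[p - j - 1:n - j - 1] for j in range(p)])
    phi, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
    resid = x[p:] - X @ phi
    resid -= resid.mean()                       # centre the residuals
    out = []
    for _ in range(n_boot):
        eps = rng.choice(resid, size=n + 100)   # i.i.d. resampling, 100-step burn-in
        xb = np.zeros(n + 100)
        for t in range(p, n + 100):
            xb[t] = phi @ xb[t - p:t][::-1] + eps[t]
        out.append(xb[-n:])
    return np.array(out)

# Demo on a synthetic AR(1) series (fitted coefficients assumed stable).
rng0 = np.random.default_rng(9)
x0 = np.zeros(300)
for t in range(1, 300):
    x0[t] = 0.6 * x0[t - 1] + rng0.normal()
print(sieve_bootstrap(x0, p=5).shape)
```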


Journal ArticleDOI
TL;DR: In this article, the authors describe the construction of resampling tests for differences of means that account simultaneously for temporal and spatial correlation, using the relatively new concept of moving blocks.
Abstract: Presently employed hypothesis tests for multivariate geophysical data (e.g., climatic fields) require the assumption that either the data are serially uncorrelated, or spatially uncorrelated, or both. Good methods have been developed to deal with temporal correlation, but generalization of these methods to multivariate problems involving spatial correlation has been problematic, particularly when (as is often the case) sample sizes are small relative to the dimension of the data vectors. Spatial correlation has been handled successfully by resampling methods when the temporal correlation can be neglected, at least according to the null hypothesis. This paper describes the construction of resampling tests for differences of means that account simultaneously for temporal and spatial correlation. First, univariate tests are derived that respect temporal correlation in the data, using the relatively new concept of “moving blocks” bootstrap resampling. These tests perform accurately for small samples ...

315 citations
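
The "moving blocks" idea can be illustrated with a small sketch: overlapping blocks preserve short-range temporal correlation inside each resample. The block length, series, and test below are hypothetical stand-ins, not the authors' field data or their full multivariate procedure.

```python
import numpy as np

def moving_blocks_resample(x, block_len, rng):
    """One moving-blocks bootstrap replicate: glue together randomly chosen
    overlapping blocks until the original series length is reached."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    starts = rng.integers(0, n - block_len + 1, size=int(np.ceil(n / block_len)))
    return np.concatenate([x[s:s + block_len] for s in starts])[:n]

# Null distribution of a difference of means for two autocorrelated series,
# centring each series so the resampled world obeys the null hypothesis.
rng = np.random.default_rng(1)
a = 0.1 * np.cumsum(rng.normal(size=200)) + rng.normal(size=200)
b = 0.1 * np.cumsum(rng.normal(size=200)) + rng.normal(size=200)
obs = a.mean() - b.mean()
null = [moving_blocks_resample(a - a.mean(), 10, rng).mean()
        - moving_blocks_resample(b - b.mean(), 10, rng).mean()
        for _ in range(2000)]
print("two-sided p:", np.mean(np.abs(null) >= abs(obs)))
```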


Book ChapterDOI
TL;DR: The bootstrap, as discussed by the authors, is a method for estimating the distribution of an estimator or test statistic by resampling one's data; it can be used to substitute computation for mathematical analysis when calculating the asymptotic distribution of a statistic or estimator is difficult.
Abstract: The bootstrap is a method for estimating the distribution of an estimator or test statistic by resampling one's data. It amounts to treating the data as if they were the population for the purpose of evaluating the distribution of interest. Under mild regularity conditions, the bootstrap yields an approximation to the distribution of an estimator or test statistic that is at least as accurate as the approximation obtained from first-order asymptotic theory. Thus, the bootstrap provides a way to substitute computation for mathematical analysis if calculating the asymptotic distribution of an estimator or statistic is difficult. The maximum score estimator of Manski (1975, 1985), the statistic developed by Härdle et al. (1991) for testing positive-definiteness of income-effect matrices, and certain functions of time-series data (Blanchard and Quah 1989, Runkle 1987, West 1990) are examples in which evaluating the asymptotic distribution is difficult and bootstrapping has been used as an alternative. In fact, the bootstrap is often more accurate in finite samples than first-order asymptotic approximations but does not entail the algebraic complexity of higher-order expansions. Thus, it can provide a practical method for improving upon first-order approximations. First-order asymptotic theory often gives a poor approximation to the distributions of test statistics with the sample sizes available in applications. As a result, the nominal levels of tests based on asymptotic critical values can be very different from the true levels. The information matrix test of White (1982) is a well-known example of a test in which large finite-sample distortions of level can occur when asymptotic critical values are used (Horowitz 1994, Kennan and Neumann 1988, Orme 1990, Taylor 1987). Other illustrations are given later in this chapter. The bootstrap often provides a tractable way to reduce or eliminate finite-sample distortions of the levels of statistical tests.

210 citations
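
As a toy illustration of substituting computation for asymptotics, the sketch below bootstraps a studentized mean and compares the resulting critical value with the first-order asymptotic value of 1.96; the data and sample size are made up.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=2.0, size=25)        # a skewed, small sample

# Bootstrap the studentized mean, centring at the sample mean so each
# resampled data set satisfies its own null hypothesis.
boot_t = []
for _ in range(5000):
    xb = rng.choice(x, size=len(x))
    boot_t.append((xb.mean() - x.mean()) / (xb.std(ddof=1) / np.sqrt(len(xb))))
crit = np.percentile(np.abs(boot_t), 95)
print(f"bootstrap |t| critical value: {crit:.2f} (asymptotic value: 1.96)")
```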


Journal ArticleDOI
TL;DR: In this paper, it was shown that permutation tests based on studentized statistics are asymptotically exact of size α also under certain extended non-i.i.d. null hypotheses.

144 citations
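
A minimal sketch of a studentized two-sample permutation test in the spirit of this result, using the Welch statistic as the studentized statistic; the data and replicate counts are illustrative.

```python
import numpy as np

def studentized_perm_test(x, y, n_perm=5000, seed=0):
    """Two-sided permutation test using the Welch (studentized) statistic."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)

    def welch_t(a, b):
        return (a.mean() - b.mean()) / np.sqrt(a.var(ddof=1) / len(a)
                                               + b.var(ddof=1) / len(b))

    t_obs = abs(welch_t(x, y))
    pooled, nx = np.concatenate([x, y]), len(x)
    hits = sum(abs(welch_t(p[:nx], p[nx:])) >= t_obs
               for p in (rng.permutation(pooled) for _ in range(n_perm)))
    return (hits + 1) / (n_perm + 1)

rng0 = np.random.default_rng(8)
print(studentized_perm_test(rng0.normal(0, 1, 20), rng0.normal(0.8, 2, 15)))
```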


Journal ArticleDOI
TL;DR: The kernel estimator was robust to changes in the spatial resolution of the data, whereas the polygon estimates were severely biased upwards with decreasing spatial resolution (increasing grid cell size); therefore, comparative studies based on polygon methods must use the same spatial resolution.
Abstract: We compared 3 home range estimators (kernel estimator [Kernel], multiple polygons by clustering [Cluster], and minimum convex polygon [MCP]) and evaluated a measure of autocorrelation (Schoener's ratio), with respect to the effects of sampling frequency, spatial resolution of the sampling reference grid, and sample size. We also used Schoener's ratio as a descriptor of within home range movements. An extensive dataset from radiotracking of root voles (Microtus oeconomus) formed the basis for these comparisons. The degree of autocorrelation was sex specific. In particular, locations of reproductive females were significantly autocorrelated for a sampling interval equal to the period of the population's ultradian activity rhythm, indicating territory patrolling behavior in this sex. We assessed the effect of spatial resolution of animal location data on home range descriptors by manipulating the cell size of the sampling reference grids. The Kernel estimator was robust to changes in spatial resolution of the data. In contrast, the polygon estimates were severely biased upwards with decreasing spatial resolution (increasing grid cell size). Therefore, comparative studies based on polygon methods must use the same spatial resolution. The sampling frequency affected all estimators, but qualitative differences were found among the specific estimators. Numerical resampling methods indicated that home range sizes were underestimated, and that the precision of the estimators was generally low.

136 citations


Journal ArticleDOI
TL;DR: This paper illustrates the phenomenon of over-optimism with respect to the predictive ability of the 'final' regression model in a simple cutpoint model and explores to what extent bias can be reduced by using cross-validation and bootstrap resampling.
Abstract: The process of model building involved in the analysis of many medical studies may lead to a considerable amount of over-optimism with respect to the predictive ability of the 'final' regression model. In this paper we illustrate this phenomenon in a simple cutpoint model and explore to what extent bias can be reduced by using cross-validation and bootstrap resampling. These computer intensive methods are compared to an ad hoc approach and to a heuristic method. Besides illustrating all proposals with the data from a breast cancer study we perform a simulation study in order to assess the quality of the methods.

133 citations
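
One standard way to attack the over-optimism described above is Efron-style optimism-corrected bootstrapping. Whether this matches the paper's exact variants is not claimed here, so treat the sketch below, with its hypothetical cutpoint model and simulated data, as illustrative only.

```python
import numpy as np

def fit_cutpoint(x, y):
    """Cutpoint c maximizing the apparent accuracy of the rule 'y = 1 if x > c'."""
    cands = np.unique(x)
    return cands[np.argmax([np.mean((x > c) == y) for c in cands])]

def optimism_corrected_accuracy(x, y, n_boot=200, seed=0):
    rng = np.random.default_rng(seed)
    c_full = fit_cutpoint(x, y)
    apparent = np.mean((x > c_full) == y)
    optimism = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(x), len(x))
        c_b = fit_cutpoint(x[idx], y[idx])
        app_b = np.mean((x[idx] > c_b) == y[idx])   # apparent, on the bootstrap sample
        test_b = np.mean((x > c_b) == y)            # tested on the original data
        optimism.append(app_b - test_b)
    return apparent - np.mean(optimism)

rng = np.random.default_rng(1)
x = rng.normal(size=120)
y = (x + rng.normal(scale=1.5, size=120)) > 0
print(optimism_corrected_accuracy(x, y))
```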


Journal ArticleDOI
TL;DR: The objective is to produce two‐ or three‐dimensional brain maps that provide, at each pixel in the map, an estimated P value with absolute meaning, that is, each P value approximates the probability of having obtained by chance the observed signal effect at that pixel, given that the null hypothesis is true.
Abstract: Although functional magnetic resonance imaging (fMRI) methods yield rich temporal and spatial data for even a single subject, universally accepted data analysis techniques have not been developed that use all the potential information from fMRI of the brain. Specifically, temporal correlations and confounds are a problem in assessing change within pixels. Spatial correlations across pixels are a problem in determining regions of activation and in correcting for multiple significance tests. We propose methods that address these issues in the analysis of task-related changes in mean signal intensity for individual subjects. Our approach to temporally based problems within pixels is to employ a model based on autoregressive-moving average (ARMA, or "Box-Jenkins") time series methods, which we call CARMA (Contrasts and ARMA). To adjust for performing multiple significance tests across pixels, taking into account between-pixel correlations, we propose adjustment of P values with "resampling methods." Our objective is to produce two- or three-dimensional brain maps that provide, at each pixel in the map, an estimated P value with absolute meaning. That is, each P value approximates the probability of having obtained by chance the observed signal effect at that pixel, given that the null hypothesis is true. Simulated and real data examples are provided.

127 citations
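
For the across-pixel multiplicity adjustment, a common resampling scheme is the Westfall-Young maxT approach. The sketch below uses random sign flips and ignores the temporal (CARMA) modeling the paper pairs it with; array shapes and names are assumptions.

```python
import numpy as np

def maxT_adjusted_p(effects, n_perm=2000, seed=0):
    """Westfall-Young style adjusted P values for per-pixel one-sample
    t statistics, using random sign flips as the resampling scheme."""
    rng = np.random.default_rng(seed)
    X = np.asarray(effects, float)              # shape: (replicates, pixels)
    n = X.shape[0]

    def tstats(A):
        return A.mean(0) / (A.std(0, ddof=1) / np.sqrt(n))

    t_obs = np.abs(tstats(X))
    max_null = np.array([np.abs(tstats(X * rng.choice([-1.0, 1.0], size=(n, 1)))).max()
                         for _ in range(n_perm)])
    # Adjusted P: how often the permutation *maximum* beats each pixel's statistic.
    return np.array([(np.sum(max_null >= t) + 1) / (n_perm + 1) for t in t_obs])

rng0 = np.random.default_rng(6)
demo = rng0.normal(size=(12, 64))               # 12 replicates x 64 pixels
demo[:, :4] += 1.2                              # true signal in the first 4 pixels
print(np.round(maxT_adjusted_p(demo)[:6], 3))
```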


Journal ArticleDOI
TL;DR: The software presented here should facilitate a validation approach in which actual field data sets are resampled numerous times to arrive at average performance values and associated variances.
Abstract: Many sampling plans have been developed for a wide variety of arthropods of economic importance. However, relatively few plans have been tested adequately to gauge their utility in the field. The software presented here should facilitate a validation approach in which actual field data sets are resampled numerous times to arrive at average performance values and associated variances. The major strength of this resampling approach is that analyses are based on actual sampling distributions of arthropod populations, not those specified by a theoretical model. A limitation is that this approach does require additional planning and effort to collect an adequate number of independent data sets. The software (Resampling for Validation of Sample Plans [RVSP]) can be used to test 2 fixed-precision sequential sampling plans based on enumerative counts and 2 (1 sequential and 1 fixed) sampling plans based on binomial counts. The software is user friendly and permits easy entry of sample plan parameters and data sets. We present details of the required input data and output generated by RVSP. We further provide example analyses for 3 pest insect species to demonstrate the use of RVSP for evaluating several different sampling plans.
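
The resampling-for-validation idea reduces to drawing many simulated applications of a plan from a real field data set. The sketch below does this for a simple fixed-sample-size plan, with precision measured as SE/mean; RVSP itself also handles sequential and binomial plans, and the field counts here are invented.

```python
import numpy as np

def validate_fixed_n_plan(field_counts, n, n_resamples=1000, seed=0):
    """Resample a real field data set to estimate the precision (SE/mean)
    a fixed-sample-size plan actually achieves on that population."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(field_counts, float)
    prec = []
    for _ in range(n_resamples):
        s = rng.choice(counts, size=n)          # one simulated use of the plan
        if s.mean() > 0:
            prec.append((s.std(ddof=1) / np.sqrt(n)) / s.mean())
    prec = np.array(prec)
    return prec.mean(), prec.var(ddof=1)

# Hypothetical overdispersed field counts of an arthropod pest.
field = np.random.default_rng(3).negative_binomial(2, 0.3, size=400)
print(validate_fixed_n_plan(field, n=30))
```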

Journal ArticleDOI
TL;DR: Application of the bootstrap intervals to data from a Dutch follow-up study on preterm infants shows the corroborative usefulness of the intervals, while the intervals are seen to be a powerful diagnostic in studying annual measles data.
Abstract: We discuss and evaluate bootstrap algorithms for obtaining confidence intervals for parameters in Generalized Linear Models when the data are correlated. The methods are based on a stratified bootstrap and are suited to correlation occurring within “blocks” of data (e.g., individuals within a family, teeth within a mouth, etc.). Application of the intervals to data from a Dutch follow-up study on preterm infants shows the corroborative usefulness of the intervals, while the intervals are seen to be a powerful diagnostic in studying annual measles data. In a simulation study, we compare the coverage rates of the proposed intervals with existing methods (e.g., via Generalized Estimating Equations). In most cases, the bootstrap intervals are seen to perform better than current methods, and are produced in an automatic fashion, so that the user need not know (or have to guess) the dependence structure within a block.
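
A minimal sketch of the block/stratified bootstrap idea for correlated data: resample whole clusters so within-cluster dependence is preserved in every replicate. An ordinary least-squares model stands in for the paper's GLMs to keep the example dependency-free; all data below are simulated.

```python
import numpy as np

def cluster_bootstrap_ci(y, X, cluster_ids, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CIs for regression coefficients, resampling whole clusters
    ("blocks") so within-cluster correlation is carried into each replicate."""
    rng = np.random.default_rng(seed)
    ids = np.unique(cluster_ids)
    coefs = []
    for _ in range(n_boot):
        rows = np.concatenate([np.flatnonzero(cluster_ids == c)
                               for c in rng.choice(ids, size=len(ids))])
        beta, *_ = np.linalg.lstsq(X[rows], y[rows], rcond=None)
        coefs.append(beta)
    return np.percentile(coefs, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0)

rng = np.random.default_rng(2)
ids = np.repeat(np.arange(30), 4)               # 30 "families" of 4 members
u = rng.normal(size=30)[ids]                    # shared within-family effect
X = np.column_stack([np.ones(120), rng.normal(size=120)])
y = X @ np.array([1.0, 0.5]) + u + rng.normal(size=120)
print(cluster_bootstrap_ci(y, X, ids))
```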

Journal ArticleDOI
TL;DR: This work proposes an approach that dramatically decreases the computation time of the standard bootstrap filter and at the same time preserves its excellent performance.
Abstract: In discrete-time system analysis, nonlinear recursive state estimation is often addressed by a Bayesian approach using a resampling technique called the weighted bootstrap. Bayesian bootstrap filtering is a very powerful technique since it is not restricted by model assumptions of linearity and/or Gaussian noise. The standard implementation of the bootstrap filter, however, is not time efficient for large sample sizes, which often precludes its utilization. We propose an approach that dramatically decreases the computation time of the standard bootstrap filter and at the same time preserves its excellent performance. The time decrease is realized by resampling the prior into the posterior distribution at time instant k by using sampling blocks of varying size, rather than a sample at a time as in the standard approach. The size of each block resampled into the posterior in the algorithm proposed here depends on the product of the normalized weight determined by the likelihood function for each prior sample and the sample size N under consideration.
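
The block-resampling idea can be sketched as follows: each prior particle is copied in a block of size roughly N times its normalized weight, with the few remaining slots filled at random. This is essentially residual resampling under the block-size rule the abstract describes; the paper's exact algorithm may differ, and the data here are synthetic.

```python
import numpy as np

def block_resample(particles, weights, rng):
    """Resample a prior ensemble into the posterior by copying particle i in a
    block of size ~ N * w_i, topping up the remaining slots at random."""
    N = len(particles)
    w = np.asarray(weights, float)
    w = w / w.sum()
    counts = np.floor(N * w).astype(int)        # deterministic block sizes
    out = np.repeat(particles, counts)
    short = N - len(out)
    if short > 0:                               # residual slots, drawn at random
        resid = N * w - counts
        out = np.concatenate([out, rng.choice(particles, size=short,
                                              p=resid / resid.sum())])
    return out

rng = np.random.default_rng(3)
prior = rng.normal(size=1000)                   # N(0,1) prior samples
w = np.exp(-0.5 * (prior - 1.0) ** 2)           # likelihood of an observation at 1.0
post = block_resample(prior, w, rng)
print(post.mean())                              # near 0.5, the exact posterior mean
```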

Journal ArticleDOI
TL;DR: In this paper, Beran's extension of the Kaplan-Meier estimator for the situation of right censored observations at fixed covariate values is studied and an almost sure asymptotic representation is established.
Abstract: We study Beran's extension of the Kaplan-Meier estimator for the situation of right censored observations at fixed covariate values. This estimator for the conditional distribution function at a given value of the covariate involves smoothing with Gasser-Müller weights. We establish an almost sure asymptotic representation which provides a key tool for obtaining central limit results. To avoid complicated estimation of asymptotic bias and variance parameters, we propose a resampling method which takes the covariate information into account. An asymptotic representation for the bootstrapped estimator is proved and the strong consistency of the bootstrap approximation to the conditional distribution function is obtained.

Journal ArticleDOI
TL;DR: A new performance evaluation paradigm for computer vision systems is proposed that exploits a resampling technique recently introduced in statistics, the bootstrap, to derive confidence in the adequacy of the assumptions embedded into the computational procedure for the given input.
Abstract: A new performance evaluation paradigm for computer vision systems is proposed. In real situations, the complexity of the input data and/or of the computational procedure can make traditional error propagation methods infeasible. The new approach exploits a resampling technique recently introduced in statistics, the bootstrap. Distributions for the output variables are obtained by perturbing the nuisance properties of the input, i.e., properties with no relevance for the output under ideal conditions. From these bootstrap distributions, the confidence in the adequacy of the assumptions embedded into the computational procedure for the given input is derived. As an example, the new paradigm is applied to the task of edge detection. The performance of several edge detection methods is compared both for synthetic data and real images. The confidence in the output can be used to obtain an edge map independent of the gradient magnitude.

Journal ArticleDOI
TL;DR: This paper examines two Monte-Carlo variance reduction techniques, importance sampling and correlation, and proposes a method for using them in statistical tolerance synthesis.
Abstract: A statistical tolerance synthesis must analyse many sets of tolerances, each of which has a unique probability distribution. The Monte-Carlo technique that is typically used to evaluate the probability distribution must analyse large numbers of individual cases. The result is a huge number of individual analyses, which is computationally expensive. This paper examines two Monte-Carlo variance reduction techniques, importance sampling and correlation, and proposes a method for using them in statistical tolerance synthesis. Correlation is used to reduce the error in the tolerance analyses. Importance sampling is used to estimate the sensitivity of an analysis to the tolerances so that a gradient based optimization algorithm can be used.
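
A minimal sketch of the importance-sampling half of the proposal, applied to a one-dimensional tolerance stack: sample from a density shifted toward the failure region and reweight each draw by the likelihood ratio. The dimensions, spec limit, and sample sizes are invented.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 100_000
mu, sigma, limit = 10.0, 0.02, 10.07            # hypothetical stack mean/sd and spec

# Crude Monte Carlo: very few samples land in the failure tail.
x = rng.normal(mu, sigma, n)
p_crude = np.mean(x > limit)

# Importance sampling: draw from a density shifted toward the tail, then
# reweight each draw by the likelihood ratio f(x)/g(x).
g = rng.normal(limit, sigma, n)
lr = stats.norm.pdf(g, mu, sigma) / stats.norm.pdf(g, limit, sigma)
p_is = np.mean((g > limit) * lr)
print(p_crude, p_is)
```

Both estimators target the same tail probability, but the importance-sampling estimate typically has far lower variance because most of its draws land where failures actually occur.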

Journal ArticleDOI
TL;DR: In this article, the authors examined the situation of resampling of lots and derived the performance measures of a resampled scheme having a single sampling plan for inspection, and discussed the usefulness and limitations of resample of resubmitted lots.
Abstract: Lot resubmissions are allowed in situations where the original inspection results are suspected or when the supplier is allowed to opt for resampling as per the provisions of the contract, etc. This paper examines the situation of resampling of lots and derives the performance measures of a resampling scheme having a single sampling plan for inspection. The usefulness and limitations of resampling of resubmitted lots are also discussed.
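
Under the simplest reading of such a scheme — a rejected lot may be resubmitted and inspected again with the same single sampling plan (n, c), with independent samples — the scheme's acceptance probability follows directly, as in this sketch (parameter values are illustrative; the paper derives further measures):

```python
import numpy as np
from scipy import stats

def oc_with_resampling(p, n, c, m):
    """Acceptance probability when a rejected lot may be resampled up to m
    more times, each time with the same single sampling plan (n, c)."""
    p_a = stats.binom.cdf(c, n, p)              # single-plan acceptance probability
    return 1.0 - (1.0 - p_a) ** (m + 1)

for p in (0.01, 0.03, 0.05, 0.10):
    print(p, round(oc_with_resampling(p, n=50, c=1, m=2), 3))
```

As the loop shows, allowing resubmission raises the acceptance probability at every quality level, which is exactly the trade-off such schemes must weigh.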

Journal ArticleDOI
TL;DR: In this paper, non-parametric two-sample tests are compared with the t-test through Monte Carlo experiments, and testing structural changes is considered as an application in economics.
Abstract: Non-parametric tests that deal with two samples include scores tests (such as the Wilcoxon rank sum test, normal scores test, logistic scores test, Cauchy scores test, etc.) and Fisher's randomization test. Because the non-parametric tests generally require a large amount of computational work, there are few studies on small-sample properties, although asymptotic properties with regard to various aspects were studied in the past. In this paper, the non-parametric tests are compared with the t-test through Monte Carlo experiments. Also, we consider testing structural changes as an application in economics.
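
A small Monte Carlo experiment in the same spirit, comparing the t-test with the Wilcoxon/Mann-Whitney test under heavy-tailed data with a location shift; sample sizes and replication counts are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n_rep, n1, n2, alpha = 2000, 15, 15, 0.05
rej_t = rej_w = 0
for _ in range(n_rep):
    # Heavy-tailed samples with a location shift in the second group.
    x = rng.standard_cauchy(n1)
    y = rng.standard_cauchy(n2) + 1.0
    rej_t += stats.ttest_ind(x, y).pvalue < alpha
    rej_w += stats.mannwhitneyu(x, y, alternative="two-sided").pvalue < alpha
print("t-test power:", rej_t / n_rep, "| Wilcoxon power:", rej_w / n_rep)
```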

Journal ArticleDOI
01 Feb 1997
TL;DR: In this paper, a nonparametric resampling technique for generating daily weather variables at a site is presented, which can be thought of as a smoothed conditional bootstrap and is equivalent to simulation from a kernel density estimate of the multivariate conditional probability density function.
Abstract: A nonparametric resampling technique for generating daily weather variables at a site is presented. The method samples the original data with replacement while smoothing the empirical conditional distribution function. The technique can be thought of as a smoothed conditional bootstrap and is equivalent to simulation from a kernel density estimate of the multivariate conditional probability density function. This improves on the classical bootstrap technique by generating values that have not occurred exactly in the original sample and by alleviating the reproduction of fine spurious details in the data. Precipitation is generated from the nonparametric wet/dry spell model as described in Lall et al. [1995]. A vector of other variables (solar radiation, maximum temperature, minimum temperature, average dew point temperature, and average wind speed) is then simulated by conditioning on the vector of these variables on the preceding day and the precipitation amount on the day of interest. An application of the resampling scheme with 30 years of daily weather data at Salt Lake City, Utah, USA, is provided.
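
A heavily simplified stand-in for the conditional resampling step: pick among the nearest neighbors of today's weather vector, take the following day, and jitter it so unobserved values can occur. The paper's method is a proper kernel density estimate of the conditional pdf; the value of k, the jitter scale, and all names below are assumptions.

```python
import numpy as np

def knn_resample_day(history, today, k=5, rng=None):
    """Sample tomorrow's weather vector: find the k recorded days most similar
    to `today`, pick one at random, take its successor, and jitter it so
    values absent from the record can occur."""
    rng = np.random.default_rng() if rng is None else rng
    H = np.asarray(history, float)              # shape: (days, variables)
    d = np.linalg.norm(H[:-1] - np.asarray(today, float), axis=1)
    pick = rng.choice(np.argsort(d)[:k])
    nxt = H[pick + 1].copy()
    nxt += rng.normal(0.0, 0.1 * H.std(0))      # crude kernel-style smoothing
    return nxt

rng = np.random.default_rng(7)
hist = 0.05 * np.cumsum(rng.normal(size=(365, 3)), axis=0) + rng.normal(size=(365, 3))
print(knn_resample_day(hist, hist[-1], rng=rng))
```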


Journal ArticleDOI
TL;DR: In this paper, affine-invariant k-variate extensions of the one-sample signed-rank test and the Hodges-Lehmann estimate are considered; the necessary distribution theory is developed, and asymptotic Pitman efficiencies with respect to Hotelling's T² test under multivariate t distributions are tabulated.
Abstract: Brown and Hettmansperger introduced affine-invariant bivariate analogs of the sign, rank, and signed-rank tests based on the Oja median. In this article affine-invariant k-variate extensions of the one-sample signed-rank test and the Hodges-Lehmann estimate are considered. The necessary distribution theory is developed, and asymptotic Pitman efficiencies with respect to Hotelling's T² test under multivariate t distributions are tabulated. An application of the signed-rank tests to a repeated-measurement setting is presented.

Journal ArticleDOI
TL;DR: The authors used a bootstrap procedure, based on resampling the land cover error matrix, to generate confidence intervals for landscape diversity indexes; they note that indexes sensitive to the size, shape, and spatial arrangement of patches require more information about the spatial structure of error.
Abstract: Many landscape indexes with ecological relevance have been proposed, including diversity indexes, dominance, fractal dimension, and patch size distribution. Classified land cover data in a geographic information system (GIS) are frequently used to calculate these indexes. However, a lack of methods for quantifying uncertainty in these measures makes it difficult to test hypothesized relations among landscape indexes and ecological processes. One source of uncertainty in landscape indexes is classification error in land cover data, which can be reported in the form of an error matrix. Some researchers have used error matrices to adjust extent estimates derived from classified land cover data. Because landscape diversity indexes depend only on landscape composition – the extent of each cover in a landscape – adjusted extent estimates may be used to calculate diversity indexes. We used a bootstrap procedure to extend this approach and generate confidence intervals for diversity indexes. Bootstrapping is a technique that allows one to estimate sample variability by resampling from the empirical probability distribution defined by a single sample. Using the empirical distribution defined by an error matrix, we generated a bootstrap sample of error matrices. The sample of error matrices was used to generate a sample of adjusted diversity indexes from which estimated confidence intervals for the diversity indexes were calculated. We also note that present methods for accuracy assessment are not sufficient for quantifying the uncertainty in landscape indexes that are sensitive to the size, shape, and spatial arrangement of patches. More information about the spatial structure of error is needed to calculate uncertainty for these indexes. Alternative approaches should be considered, including combining traditional accuracy assessments with other probability data generated during the classification procedure.
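
A sketch of the bootstrap-the-error-matrix idea, assuming rows of the matrix index mapped classes and columns index reference (true) classes, and using Shannon diversity as the composition-based index; all counts and names below are hypothetical.

```python
import numpy as np

def shannon(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def diversity_ci(error_matrix, mapped_counts, n_boot=2000, alpha=0.05, seed=0):
    """CI for Shannon diversity of the *adjusted* landscape composition,
    bootstrapping the error matrix (rows = mapped class, cols = true class)."""
    rng = np.random.default_rng(seed)
    E = np.asarray(error_matrix, float)
    vals = []
    for _ in range(n_boot):
        # Resample each mapped class's reference labels from its own row.
        Eb = np.array([rng.multinomial(int(r.sum()), r / r.sum()) for r in E])
        rates = Eb / Eb.sum(axis=1, keepdims=True)   # P(true = j | mapped = i)
        true_counts = mapped_counts @ rates          # adjusted class extents
        vals.append(shannon(true_counts / true_counts.sum()))
    return np.percentile(vals, [100 * alpha / 2, 100 * (1 - alpha / 2)])

E = np.array([[80, 10, 10], [5, 90, 5], [10, 15, 75]], float)  # assessment counts
mapped = np.array([5000.0, 3000.0, 2000.0])                    # mapped pixels/class
print(diversity_ci(E, mapped))
```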

Journal ArticleDOI
TL;DR: This article showed that a straightforward extrapolation of the bootstrap distribution obtained by resampling without replacement, as considered by Politis and Romano, leads to second-order correct confidence intervals, provided that the resampling size is chosen adequately.
Abstract: This paper shows that a straightforward extrapolation of the bootstrap distribution obtained by resampling without replacement, as considered by Politis and Romano, leads to second-order correct confidence intervals, provided that the resampling size is chosen adequately. We assume only that the statistic of interest Tn, suitably renormalized by a regular sequence, is asymptotically pivotal and admits an Edgeworth expansion on some differentiable functions. The results are extended to a corrected version of the moving-block bootstrap without replacement introduced by Künsch for strong-mixing random fields. Moreover, we show that the generalized jackknife or the Richardson extrapolation of such bootstrap distributions, as considered by Bickel and Yahav, leads to better approximations.
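
The building block being extrapolated is the Politis-Romano subsampling distribution (resampling without replacement at size m). A basic version, assuming a root-n convergence rate and omitting the paper's extrapolation step, looks like this:

```python
import numpy as np

def subsample_ci(x, stat, m, n_sub=2000, alpha=0.05, seed=0):
    """CI from resampling *without replacement* at size m << n, assuming the
    statistic converges at rate sqrt(n); the root sqrt(n)*(theta_n - theta)
    is approximated by sqrt(m)*(theta_m - theta_n)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, float)
    n = len(x)
    theta = stat(x)
    roots = np.array([np.sqrt(m) * (stat(rng.choice(x, m, replace=False)) - theta)
                      for _ in range(n_sub)])
    q_lo, q_hi = np.percentile(roots, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return theta - q_hi / np.sqrt(n), theta - q_lo / np.sqrt(n)

x = np.random.default_rng(5).exponential(size=400)
print(subsample_ci(x, np.mean, m=60))           # compare with the true mean, 1.0
```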

Journal ArticleDOI
TL;DR: This work presents an alternative method and an associated computer program which use resampling (bootstrap) methods to estimate the NOEC without assuming a specific distribution.
Abstract: Recent estimations of NOEC (no observed effect concentration) values for communities use single species effect data to predict the concentration at which not more than some particular acceptable percentage of the species in a community will be affected. This method has a number of difficulties, not the least of which is obtaining effects data for enough of the right species to accurately represent the whole community. Typically one has to make do with existing data sets in which the choice of species tested has been made for convenience rather than representativeness. Usually the raw data alone are not sufficient to make reasonable estimates. Statistical methods have been proposed which deal with this problem by assigning a specific distribution to the data. But assumption of a specific distribution may not be valid. We present an alternative method and an associated computer program which use resampling (bootstrap) methods to estimate the NOEC without assuming a specific distribution. This method has the advantage that no underlying distribution is assumed. Simulated and published data sets were used to compare this approach with published methods. The use of this technique to assess representativeness was also demonstrated.
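
A minimal sketch of the distribution-free approach: bootstrap a low percentile of the single-species effect values (an HCp-style quantity). Whether this matches the paper's program in detail is not claimed; the values and names are invented.

```python
import numpy as np

def bootstrap_hcp(effect_values, p=0.05, n_boot=5000, seed=0):
    """Distribution-free estimate, with a percentile-bootstrap CI, of the
    concentration affecting no more than a fraction p of species."""
    rng = np.random.default_rng(seed)
    x = np.asarray(effect_values, float)
    est = np.percentile(x, 100 * p)
    boot = [np.percentile(rng.choice(x, len(x)), 100 * p) for _ in range(n_boot)]
    return est, tuple(np.percentile(boot, [2.5, 97.5]))

# Hypothetical single-species NOEC values (mg/L).
print(bootstrap_hcp([0.3, 0.8, 1.2, 2.0, 3.5, 4.1, 7.9, 12.0, 20.0, 33.0]))
```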

Journal ArticleDOI
01 Dec 1997
TL;DR: In this paper, some issues related to the selection and design of univariate kernel density estimators are reviewed; strategies for bandwidth and kernel selection are discussed in an applied context, and recommendations for parameter selection are offered.
Abstract: Kernel density estimators are useful building blocks for empirical statistical modeling of precipitation and other hydroclimatic variables. Data driven estimates of the marginal probability density function of these variables (which may have discrete or continuous arguments) provide a useful basis for Monte Carlo resampling and are also useful for posing and testing hypotheses (e.g., bimodality) as to the frequency distributions of the variable. In this paper, some issues related to the selection and design of univariate kernel density estimators are reviewed. Some strategies for bandwidth and kernel selection are discussed in an applied context and recommendations for parameter selection are offered. This paper complements the nonparametric wet/dry spell resampling methodology presented in Lall et al. (1996).
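
As a concrete starting point, the sketch below evaluates a Gaussian KDE with Silverman's reference bandwidth on skewed, rainfall-like data; the paper's point is precisely that such defaults can fail and that bandwidth and kernel should be chosen with care. The data and grid are invented.

```python
import numpy as np

def gaussian_kde_pdf(grid, data, h):
    """Gaussian kernel density estimate evaluated on `grid` with bandwidth h."""
    u = (grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(6)
data = rng.gamma(shape=2.0, scale=5.0, size=300)   # skewed, rainfall-like sample
iqr = np.subtract(*np.percentile(data, [75, 25]))
# Silverman's reference bandwidth: a common default, not always a good one.
h = 0.9 * min(data.std(ddof=1), iqr / 1.34) * len(data) ** (-0.2)
grid = np.linspace(0.0, data.max(), 200)
print(h, gaussian_kde_pdf(grid, data, h).max())
```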

Journal ArticleDOI
TL;DR: This paper considers methods of statistical analysis for highly skewed immune response data, using resampling techniques to assess the robustness of normal parametric methods (e.g., t-tests and linear regression) and illustrating how bootstrap resampling can be used to provide a valid alternative method of analysis.

Journal ArticleDOI
TL;DR: A test of correlation of the residuals in generalized linear models is proposed, generalizing the spatial autocorrelation test based on Moran's I; a formula is given to compute the weights according to the alternative hypothesis.
Abstract: We propose a test of correlation of the residuals in generalized linear models which is a generalization of the spatial autocorrelation test based on Moran's I. It allows adjustment for sizes of geographical areas and for explanatory variables. A formula is given to compute the weights according to the alternative hypothesis. We compare inference using the distribution in the model and using the permutation distribution. A simulation study showed that the model-based test may be very conservative, and this leads to a loss of power compared to the permutation test or to the model-based test with correction for estimated parameters. As the latter is intractable for very large samples when the model includes explanatory variables, we recommend the use of the permutation test. The permutation test is used to study geographical correlation of dyspnoea in the elderly.
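
A minimal sketch of the permutation version of a Moran's I test on residuals, one-sided for positive spatial autocorrelation; the weight matrix W, and the adjustments the paper makes for area size and covariates, are assumptions left out here.

```python
import numpy as np

def morans_i(z, W):
    z = z - z.mean()
    return len(z) / W.sum() * (z @ W @ z) / (z @ z)

def moran_perm_test(residuals, W, n_perm=5000, seed=0):
    """One-sided permutation test for positive spatial autocorrelation of
    residuals: shuffle them across areas to build the null distribution."""
    rng = np.random.default_rng(seed)
    i_obs = morans_i(residuals, W)
    null = np.array([morans_i(rng.permutation(residuals), W)
                     for _ in range(n_perm)])
    return i_obs, (np.sum(null >= i_obs) + 1) / (n_perm + 1)

rng = np.random.default_rng(7)
W = np.roll(np.eye(20), 1, axis=1) + np.roll(np.eye(20), -1, axis=1)  # ring graph
print(moran_perm_test(rng.normal(size=20), W))
```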

Journal ArticleDOI
TL;DR: In this paper, a permutation test is proposed for examining the significance of effects in factorial experiments, which is more flexible than other methods proposed for the same situation because it requires no a priori assumptions regarding the underlying distribution of the data nor does it impose any practical restriction on the number of potentially significant effects present.
Abstract: A permutation test is proposed for examining the significance of effects in unreplicated factorial experiments. The procedure tests each effect with a separate sampling distribution using a test statistic that is equivalent to the optimal invariant decision rule of Birnbaum. The proposed test is more flexible than other methods proposed for the same situation because it requires no a priori assumptions regarding the underlying distribution of the data, nor does it impose any practical restriction on the number of potentially significant effects present.

Journal ArticleDOI
TL;DR: In this article, a method of random resampling of residuals from stochastic models is used to generate a large number of 12-month-long traces of natural monthly runoff to be used in a position analysis model for a water-supply storage and delivery system.
Abstract: A method of random resampling of residuals from stochastic models is used to generate a large number of 12-month-long traces of natural monthly runoff to be used in a position analysis model for a water-supply storage and delivery system. Position analysis uses the traces to forecast the likelihood of specified outcomes such as reservoir levels falling below a specified level or streamflows falling below statutory passing flows conditioned on the current reservoir levels and streamflows. The advantages of this resampling scheme, called bootstrap position analysis, are that it does not rely on the unverifiable assumption of normality, fewer parameters need to be estimated directly from the data, and accounting for parameter uncertainty is easily done. For a given set of operating rules and water-use requirements for a system, water managers can use such a model as a decision-making tool to evaluate different operating rules.
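
A simplified sketch of the trace-generation step: resample residuals from a fitted seasonal model to produce many 12-month traces, then estimate the probability of an outcome of interest. The paper resamples residuals from stochastic (e.g., autoregressive) streamflow models and conditions on the current system state; the month-by-month mean model, record, and threshold below are placeholders.

```python
import numpy as np

def generate_traces(monthly_flows, n_traces=1000, seed=0):
    """12-month runoff traces obtained by resampling residuals from a fitted
    month-by-month mean model of the historical record."""
    rng = np.random.default_rng(seed)
    q = np.asarray(monthly_flows, float).reshape(-1, 12)   # years x months
    mean = q.mean(axis=0)
    resid = q - mean
    traces = np.empty((n_traces, 12))
    for m in range(12):
        traces[:, m] = mean[m] + rng.choice(resid[:, m], size=n_traces)
    return traces

# Hypothetical 30-year record; estimate P(annual total < threshold).
record = np.random.default_rng(8).gamma(4.0, 25.0, size=30 * 12)
tr = generate_traces(record)
print(np.mean(tr.sum(axis=1) < 1000.0))
```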

Proceedings ArticleDOI
25 Apr 1997
TL;DR: This work provides an application and extension of the analysis of the effect of finite-sample training and test sets on the bias and variance of the classical discriminants as given by Fukunaga, and examines the uncertainties in the resulting estimates.
Abstract: This work provides an application and extension of the analysis of the effect of finite-sample training and test sets on the bias and variance of the classical discriminants as given by Fukunaga. The extension includes new results for the area under the ROC curve, Az. An upper bound on Az is provided by the so-called resubstitution method, in which the classifier is trained and tested on the same patients; a lower bound is provided by the hold-out method, in which the patient pool is partitioned into trainers and testers. Both methods exhibit a bias in Az with a linear dependence on the inverse of the number of patients Nt used to train the classifier; this leads to the possibility of obtaining an unbiased estimate of the infinite-population performance by a simple regression procedure. We examine the uncertainties in the resulting estimates. Whereas the bias of classifier performance is determined by the finite size of the training sample, the variance is dominated by the finite size of the test sample. This variance is approximately given by the simple result for an equivalent binomial process. A number of applications to the linear classifier are presented in this paper. More general applications, including the quadratic classifier and some elementary neural-network classifiers, are presented in a companion paper.
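
The 1/Nt bias regression described above amounts to fitting a line and reading off its intercept. With hypothetical mean Az values from the two estimators at several training sizes, the extrapolation to infinite training data looks like this:

```python
import numpy as np

# Hypothetical mean Az estimates at several training sizes Nt: resubstitution
# biases upward, hold-out biases downward, both ~linearly in 1/Nt.
Nt = np.array([20.0, 40.0, 80.0, 160.0])
az = {"hold-out": np.array([0.780, 0.820, 0.840, 0.850]),
      "resubstitution": np.array([0.920, 0.890, 0.875, 0.867])}

for name, vals in az.items():
    slope, intercept = np.polyfit(1.0 / Nt, vals, 1)
    # The intercept is the extrapolated infinite-training-sample performance.
    print(f"{name}: extrapolated Az = {intercept:.3f}")
```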