
Showing papers on "Sampling (statistics)" published in 1990


Journal ArticleDOI
TL;DR: In this paper, three sampling-based approaches, namely stochastic substitution, the Gibbs sampler, and the sampling-importance-resampling algorithm, are compared and contrasted in relation to various joint probability structures frequently encountered in applications.
Abstract: Stochastic substitution, the Gibbs sampler, and the sampling-importance-resampling algorithm can be viewed as three alternative sampling- (or Monte Carlo-) based approaches to the calculation of numerical estimates of marginal probability distributions. The three approaches will be reviewed, compared, and contrasted in relation to various joint probability structures frequently encountered in applications. In particular, the relevance of the approaches to calculating Bayesian posterior densities for a variety of structured models will be discussed and illustrated.

6,294 citations
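
To make the Gibbs sampler discussed above concrete, here is a minimal sketch, assuming a bivariate normal target with unit variances and correlation rho, so that each full conditional is itself univariate normal. The function name and target distribution are illustrative assumptions, not from the paper.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_iter=10000, burn_in=1000):
    """Minimal Gibbs sampler for a bivariate normal with unit variances
    and correlation rho: each full conditional is univariate normal,
    x | y ~ N(rho * y, 1 - rho**2), and symmetrically for y | x."""
    x = y = 0.0
    cond_sd = math.sqrt(1.0 - rho * rho)
    draws = []
    for i in range(n_iter):
        x = random.gauss(rho * y, cond_sd)  # draw x from its full conditional
        y = random.gauss(rho * x, cond_sd)  # draw y from its full conditional
        if i >= burn_in:
            draws.append((x, y))
    return draws

# Marginal summaries come from the retained draws, e.g. the mean of x:
draws = gibbs_bivariate_normal(rho=0.8)
print(sum(x for x, _ in draws) / len(draws))  # close to 0
```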



Book ChapterDOI
01 Jan 1990
TL;DR: This chapter deals with the procedures required to successfully conduct a plant analysis or tissue test and the importance of following the proper sampling, preparation, and analysis procedures.
Abstract: Plant analysis (sometimes referred to as leaf analysis) is the determination of the total elemental content of a specified plant part. The emphasis in this chapter will be on the determination of those elements required for plant growth. Interpretation is normally based on the use of a "critical value" or "sufficiency range" (Smith, 1962) comparison between the elemental concentration found and a known norm (Goodall & Gregory, 1947; Chapman, 1966; Reuter & Robinson, 1986; Adriano, 1986; Martin-Prevel et al., 1987). An alternative method of interpretation is the Diagnosis and Recommendation Integrated System (DRIS), which interprets the ratios of elements (N/P, K/Ca, and K/Mg) as indicators of elemental status (Beaufils, 1973; Sumner, 1977, 1982). Most growers primarily use a plant analysis for diagnosing suspected elemental insufficiencies, while its most significant, yet little used, application is for evaluating the soil/plant elemental status. This is partially reflected in the relatively few plant tissue samples assayed for growers in the USA each year, about 500 000 (Jones, 1985). Tissue testing, an elemental assay of extracted cell sap by means of quick chemical tests in the field, seems to be regaining interest at levels equal to those observed several decades ago. A plant analysis is carried out in a series of steps as shown in Fig. 15-1. The results obtained are no better than the care taken in collecting, handling, preparing, and analyzing the collected tissue. An error made in one of these steps can result in an erroneous interpretation leading to recommendations that may be either unnecessary, costly, or even damaging to the crop. Therefore, it is important for those employing either a plant analysis or tissue test to follow the proper sampling, preparation, and analysis procedures. This chapter deals with the procedures required to successfully conduct a plant analysis or tissue test.

1,004 citations


Book
13 Dec 1990
TL;DR: This book covers the quantitative description of variable material: sampling and estimation; generalization, prediction, and classification; relations between variables (covariance, correlation, and regression); relations between individuals (similarity and ordination); analysis of dispersion and discrimination; hierarchical and non-hierarchical numerical classification; spatial dependence; nested sampling and analysis; and local estimation by kriging.
Abstract: Quantitative description of variable material; sampling and estimation; generalization, prediction, and classification; relations between variables - covariance and correlation; regression; relations between individuals - similarity; ordination; analysis of dispersion and discrimination; numerical classification - hierarchical systems; numerical classification - non-hierarchical methods; spatial dependence; nested sampling and analysis; local estimation - kriging.

775 citations


Book
14 Jan 1990
TL;DR: This book presents statistical methods for sample size determination in one- and two-sample problems, case-control studies, cohort studies, and sample surveys, together with the foundations of sampling and statistical theory: sampling distributions, estimation, hypothesis testing, confidence intervals, epidemiologic study design, and basic sampling concepts.
Abstract: Part 1 Statistical methods for sample size determination: the one-sample problem; the two-sample problem; sample size for case-control studies; sample size determination for cohort studies; lot quality assurance sampling; the incidence density; sample size for continuous response variables; sample size for sample surveys. Part 2 Foundations of sampling and statistical theory: the population; the sample; sampling distributions; characteristics of estimates of population parameters; hypothesis testing; two-sample confidence intervals and hypothesis tests; epidemiologic study design; basic sampling concepts.

748 citations
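
As a worked instance of the one-sample problem listed in Part 1, the standard formula for the sample size needed to estimate a proportion p to within +/- d at confidence 1 - alpha is n = z^2 p(1 - p)/d^2, with z the two-sided normal critical value. A minimal sketch using that textbook formula (function names are illustrative; this is not code from the book):

```python
import math
from statistics import NormalDist

def sample_size_proportion(p, d, alpha=0.05):
    """n = z^2 * p * (1 - p) / d^2: units needed to estimate a proportion
    near p to within +/- d with confidence 1 - alpha (normal approximation)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    return math.ceil(z * z * p * (1 - p) / (d * d))

print(sample_size_proportion(p=0.5, d=0.05))  # 385, the familiar worst case
```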


Journal ArticleDOI
TL;DR: Application of the technique to existing studies on mitochondrial DNA in several animal species and on several nuclear genes in Drosophila indicates that the standard errors of genetic diversity estimates are usually quite large, so comparative studies of nucleotide diversity need to be substantially larger than the current standards.
Abstract: A technique is presented for the partitioning of nucleotide diversity into within- and between-population components for the case in which multiple populations have been surveyed for restriction-site variation. This allows the estimation of an analogue of FST at the DNA level. Approximate expressions are given for the variance of these estimates resulting from nucleotide, individual, and population sampling. Application of the technique to existing studies on mitochondrial DNA in several animal species and on several nuclear genes in Drosophila indicates that the standard errors of genetic diversity estimates are usually quite large. Thus, comparative studies of nucleotide diversity need to be substantially larger than the current standards. Normally, only a very small fraction of the sampling variance is caused by sampling of individuals. Even when 20 or so restriction enzymes are employed, nucleotide sampling is a major source of error, and population sampling is often quite important. Generally, the degree of population subdivision at the nucleotide level is comparable with that at the haplotype level, but significant differences do arise as a result of inequalities in the genetic distances between haplotypes.

540 citations
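
A hedged sketch of the within/between partition behind such an F_ST analogue, sometimes written as 1 - (mean within-population diversity)/(total pooled diversity). This is a generic illustration of the idea, not the paper's exact estimator or its variance expressions, and all names are assumptions.

```python
from itertools import combinations

def mean_pairwise_diversity(seqs):
    """Average proportion of differing sites over all pairs of
    equal-length sequences (a crude stand-in for nucleotide diversity)."""
    pairs = list(combinations(seqs, 2))
    diffs = [sum(a != b for a, b in zip(s, t)) / len(s) for s, t in pairs]
    return sum(diffs) / len(diffs)

def fst_analogue(populations):
    """1 - (mean within-population diversity) / (total pooled diversity)."""
    pi_within = sum(mean_pairwise_diversity(p) for p in populations) / len(populations)
    pooled = [s for p in populations for s in p]
    return 1.0 - pi_within / mean_pairwise_diversity(pooled)

pop_a = ["ACGTACGT", "ACGTACGA"]
pop_b = ["TCGAACGT", "TCGTACGT"]
print(fst_analogue([pop_a, pop_b]))
```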



Book
06 Nov 1990
TL;DR: Regaining original signals from their samples, or assessing the information lost in the process, is the subject of this book, which attempts to understand, generalize, and extend the cardinal series of Shannon sampling theory.
Abstract: Regaining original signals transformed from analog to digital systems or assessing information lost in the process are the fundamental issues addressed by sampling and interpolation theory. This study attempts to understand, generalize and extend the cardinal series of Shannon sampling theory.

479 citations
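
The cardinal series in question reconstructs a bandlimited signal from its samples as x(t) = sum_n x(nT) sinc((t - nT)/T). A minimal numerical sketch with a finite truncation of the series (names are illustrative):

```python
import math

def sinc(u):
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def cardinal_series(samples, T, t):
    """Whittaker-Shannon interpolation: x(t) = sum_n x[n] sinc((t - nT)/T),
    truncated to the finite list of samples given."""
    return sum(x_n * sinc((t - n * T) / T) for n, x_n in enumerate(samples))

T = 0.125  # sample a 1 Hz sine at 8 Hz, well above the Nyquist rate
samples = [math.sin(2 * math.pi * n * T) for n in range(64)]
print(cardinal_series(samples, T, t=0.3))  # reconstruction between samples
print(math.sin(2 * math.pi * 0.3))         # true value, for comparison
```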


Journal ArticleDOI
TL;DR: In this article, the authors describe sampling designs in which, whenever an observed value of a selected unit satisfies a condition of interest, additional units are added to the sample from the neighborhood of that unit, if any of these additional units satisfies the condition, still more units may be added.
Abstract: In many real-world sampling situations, researchers would like to be able to adaptively increase sampling effort in the vicinity of observed values that are high or otherwise interesting. This article describes sampling designs in which, whenever an observed value of a selected unit satisfies a condition of interest, additional units are added to the sample from the neighborhood of that unit. If any of these additional units satisfies the condition, still more units may be added. Sampling designs such as these, in which the selection procedure is allowed to depend on observed values of the variable of interest, are in contrast to conventional designs, in which the entire selection of units to be included in the sample may be determined prior to making any observations. Because the adaptive selection procedure introduces biases into conventional estimators, several estimators are given that are design unbiased for the population mean with the adaptive cluster designs of this article; that is, the ...

420 citations
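
A minimal sketch of the adaptive selection procedure described above, assuming a 2-D grid of units, a four-cell neighborhood, and "value > 0" as the condition of interest; the design-unbiased estimators themselves are not shown, and all names are illustrative.

```python
import random

def adaptive_cluster_sample(grid, n_initial, condition):
    """Start from a simple random sample of cells; whenever a selected
    cell satisfies the condition, add its four neighbours to the sample,
    and keep expanding while the added cells satisfy it too."""
    rows, cols = len(grid), len(grid[0])
    frontier = random.sample([(r, c) for r in range(rows) for c in range(cols)],
                             n_initial)
    sampled = set()
    while frontier:
        r, c = frontier.pop()
        if (r, c) in sampled:
            continue
        sampled.add((r, c))
        if condition(grid[r][c]):  # interesting value: expand the neighbourhood
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    frontier.append((nr, nc))
    return sampled

grid = [[0, 0, 5, 7],
        [0, 0, 0, 6],
        [2, 0, 0, 0]]
print(sorted(adaptive_cluster_sample(grid, n_initial=3, condition=lambda v: v > 0)))
```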


Journal ArticleDOI
TL;DR: In this paper, the clay content of the topsoil in two regions of contrasting physiography was predicted from sample data using four different procedures: the means of mapped classes, the usual kriging estimator, a cubic spline interpolator and a kriged estimator within classes using a pooled within-class variogram.
Abstract: The clay content of the topsoil in two regions of contrasting physiography was predicted from sample data using four different procedures. The predictors were the means of mapped classes, the usual kriging estimator, a cubic spline interpolator and a kriging estimator within classes using a pooled within-class variogram. The performances of the procedures were evaluated and compared. In the first region, Sandford St Martin on Jurassic sediments where there were some abrupt changes in soil, the classification predicted best within those classes bounded by sharp change. Elsewhere the usual kriging performed somewhat better, and kriging within classes was still more precise. In the second region, Yenne on the alluvial plain of the Rhone where the soil varied gradually, kriging performed better than classification, though a small improvement resulted from combining kriging with classification. Both prediction by class means and kriging attempt to minimize the estimation variance, and their mean prediction variances were close to the theoretical values overall. Spline interpolation is more empirical, and though it followed the abrupt changes better than kriging, it fluctuated excessively elsewhere, and its overall performance was poorer than that of kriging.

365 citations


Journal ArticleDOI
TL;DR: The jitter of such practical sampling systems as analog-to-digital converters, sample-and-hold circuits, and samplers is discussed and a model for estimating jitter is proposed, based on sampling sine-wave signal-to-noise ratio calculations.
Abstract: The jitter of such practical sampling systems as analog-to-digital converters, sample-and-hold circuits, and samplers is discussed. A model for estimating jitter is proposed. In this model, total jitter is composed of sampling circuit jitter, analog input signal jitter, and sampling clock jitter. Using this model, jitter is broken up into three components. To evaluate the model, a precise method for measuring jitter is devised. This method, based on sampling sine-wave signal-to-noise ratio calculations, enables separation of jitter and amplitude noise. The performance limit of converters as evaluated by the model is discussed.
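
For a full-scale sine wave, timing jitter alone limits the signal-to-noise ratio to roughly -20 log10(2*pi*f*sigma_j) dB, which is the back-of-envelope relation behind SNR-based jitter measurement. A hedged sketch of that calculation (a standard approximation, not the paper's exact model):

```python
import math

def jitter_limited_snr(f_in, sigma_j):
    """SNR (dB) of a sampler whose only error is RMS aperture jitter
    sigma_j when digitizing a full-scale sine at frequency f_in."""
    return -20.0 * math.log10(2.0 * math.pi * f_in * sigma_j)

def jitter_from_snr(f_in, snr_db):
    """Inverse relation: RMS jitter implied by a measured sine-wave SNR,
    assuming jitter dominates all other noise sources."""
    return 10.0 ** (-snr_db / 20.0) / (2.0 * math.pi * f_in)

print(jitter_limited_snr(10e6, 10e-12))   # ~64 dB at 10 MHz, 10 ps RMS
print(jitter_from_snr(10e6, 64.0))        # ~10 ps, recovering the input
```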

Book
01 Jan 1990
TL;DR: This book contains the tools for analyzing genetic data on morphological characters, isozyme frequencies, restriction fragment patterns, and DNA sequences, and the importance of clarifying the sampling frame is stressed.
Abstract: The interpretation of discrete genetic data lies at the heart of population and evolutionary genetics, yet basic statistics courses and texts generally concentrate on continuous variables. In "Genetic Data Analysis" a full account of the methodology appropriate for count data is presented. Starting with the basic idea of estimating gene frequencies, and proceeding through a wide range of topics to building phylogenetic trees, the book contains the tools for analyzing genetic data on morphological characters, isozyme frequencies, restriction fragment patterns, and DNA sequences. Throughout the book, the importance of clarifying the sampling frame is stressed. For example, if conclusions are to be drawn about a single population, then only "statistical" sampling needs to be incorporated into expressions for variances of genetic parameter estimates. If conclusions are to be made on a wider basis, perhaps about a whole species, then the "genetical" sampling between generations that can cause local populations to differ must also be accommodated. This distinction is becoming increasingly important in the analysis of DNA sequence data, as the ability increases to generate multiple sequences of a particular region.

Proceedings ArticleDOI
01 May 1990
TL;DR: This paper extends the previous analysis to provide significantly improved bounds on the amount of sampling necessary for a given level of accuracy and provides “sanity bounds” to deal with queries for which the underlying data is extremely skewed or the query result is very small.
Abstract: Recently we have proposed an adaptive, random sampling algorithm for general query size estimation. In earlier work we analyzed the asymptotic efficiency and accuracy of the algorithm; in this paper we investigate its practicality as applied to selects and joins. First, we extend our previous analysis to provide significantly improved bounds on the amount of sampling necessary for a given level of accuracy. Next, we provide “sanity bounds” to deal with queries for which the underlying data is extremely skewed or the query result is very small. Finally, we report on the performance of the estimation algorithm as implemented in a host language on a commercial relational system. The results are encouraging, even with this loose coupling between the estimation algorithm and the DBMS.
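
The adaptive idea can be sketched generically: draw random partition elements of the query, accumulate their result-size contributions, and stop once the accumulated total or the number of draws crosses a bound, then scale the sample up to the population. An illustrative sketch only, not the authors' algorithm or its bounds; all names are assumptions.

```python
import random

def adaptive_size_estimate(contributions, total_bound, max_draws):
    """Draw random partition elements (with replacement), accumulating
    their result-size contributions, until the running total exceeds
    total_bound or max_draws is hit; scale the sample mean up to all n."""
    n = len(contributions)
    total = draws = 0
    while total < total_bound and draws < max_draws:
        total += random.choice(contributions)
        draws += 1
    return n * total / draws

# A skewed "query": most elements contribute nothing, a few contribute a lot.
contributions = [0] * 900 + [50] * 100
print(adaptive_size_estimate(contributions, total_bound=500, max_draws=200))
print(sum(contributions))  # true result size: 5000
```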

Journal ArticleDOI
TL;DR: The storage/accuracy trade-off of an adaptive sampling algorithm due to Wegman that makes it possible to evaluate probabilistically the number of distinct elements in a large file stored on disk is analyzed.
Abstract: We analyze the storage/accuracy trade-off of an adaptive sampling algorithm due to Wegman that makes it possible to evaluate probabilistically the number of distinct elements in a large file stored on disk.
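
Wegman's adaptive sampling can be sketched as follows: retain hashed values that survive an increasingly selective filter, doubling the selectivity whenever a fixed-capacity sample overflows, and estimate the distinct count as the sample size scaled by the selectivity. The hash and capacity choices below are illustrative assumptions.

```python
import hashlib

def adaptive_distinct_estimate(stream, capacity=64):
    """Keep hashed values whose low `depth` bits are all zero; when the
    retained set outgrows `capacity`, raise `depth` (halving the retained
    fraction) and re-filter. Estimate = len(sample) * 2**depth."""
    depth = 0
    sample = set()
    for item in stream:
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        if h & ((1 << depth) - 1) == 0:
            sample.add(h)
            while len(sample) > capacity:
                depth += 1
                sample = {v for v in sample if v & ((1 << depth) - 1) == 0}
    return len(sample) * (1 << depth)

data = [i % 1000 for i in range(100000)]  # 1000 distinct values
print(adaptive_distinct_estimate(data))   # roughly 1000
```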

Journal ArticleDOI
01 Aug 1990-Oikos
TL;DR: All of the evidence indicates that a complex interplay of local ecological interactions, latitude, disturbance, and sampling regime determines the elevation of maximum insect species richness.
Abstract: The distribution of insects along elevational gradients is controversial Recent longterm sampling studies have concluded that mid-elevational peaks in species richness identified previously may have come from the short-term sampling regimes employed, and from disturbance at lower elevations Long-term sampling seems likely to reveal peaks at lower elevations Analysis of 20 studies taken from the literature add the possibility that the latitude at which a study is undertaken influences the elevation of peak species richness A study of 12 open sites in the southeastern US, ranging in elevation from 100 m to 1700 m, reveals that both principal reasons advanced previously for mid-elevational peaks may be valid, if short-term sampling is employed Taken together, all of the evidence indicates that a complex interplay of local ecological interactions, latitude, disturbance, and sampling regime determines the elevation of maximum insect species richness The temporal and spatial scale employed strongly influences the evaluation of this ecological "pattern"

Book ChapterDOI
01 Jan 1990
TL;DR: This paper describes the evidence weighting mechanism for augmenting the logic sampling stochastic simulation algorithm, and an enhancement to the basic algorithm that uses the evidential integration technique [Chin and Cooper, 1987].
Abstract: Stochastic simulation approaches perform probabilistic inference in Bayesian networks by estimating the probability of an event based on the frequency that the event occurs in a set of simulation trials. This paper describes the evidence weighting mechanism, for augmenting the logic sampling stochastic simulation algorithm [Henrion, 1986]. Evidence weighting modifies the logic sampling algorithm by weighting each simulation trial by the likelihood of a network's evidence given the sampled state node values for that trial. We also describe an enhancement to the basic algorithm which uses the evidential integration technique [Chin and Cooper, 1987]. A comparison of the basic evidence weighting mechanism with the Markov blanket algorithm [Pearl, 1987], the logic sampling algorithm, and the evidence integration algorithm is presented. The comparison is aided by analyzing the performance of the algorithms in a simple example network.
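
A minimal sketch of evidence weighting (likelihood weighting) on a two-node network: sample the non-evidence node from its prior and weight each trial by the likelihood of the evidence given the sampled value. The network and probabilities are invented for illustration.

```python
import random

# Tiny network: Rain -> WetGrass, with evidence WetGrass = True.
P_RAIN = 0.2
P_WET_GIVEN_RAIN = {True: 0.9, False: 0.1}

def likelihood_weighting(n_trials=100000):
    """Sample Rain from its prior; weight each trial by the likelihood of
    the evidence, P(WetGrass=True | Rain); estimate the posterior of Rain
    from the weighted relative frequency."""
    weighted_rain = total_weight = 0.0
    for _ in range(n_trials):
        rain = random.random() < P_RAIN
        w = P_WET_GIVEN_RAIN[rain]
        total_weight += w
        if rain:
            weighted_rain += w
    return weighted_rain / total_weight

print(likelihood_weighting())  # exact: 0.18 / (0.18 + 0.08) ~= 0.692
```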

Journal ArticleDOI
TL;DR: In this article, sampling stochastic dynamic programming (SSDP) is used to capture the complex temporal and spatial structure of the streamflow process by using a large number of sample streamflow sequences.
Abstract: Most models for reservoir operation optimization have employed either deterministic optimization or stochastic dynamic programming algorithms. This paper develops sampling stochastic dynamic programming (SSDP), a technique that captures the complex temporal and spatial structure of the streamflow process by using a large number of sample streamflow sequences. The best inflow forecast can be included as a hydrologic state variable to improve the reservoir operating policy. A case study using the hydroelectric system on the North Fork of the Feather River in California illustrates the SSDP approach and its performance.
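
A much-simplified sketch of the backward recursion behind such a method, with the expectation over inflows taken across sampled streamflow sequences rather than a fitted distribution. It omits the hydrologic (forecast) state variable and other refinements of SSDP, and all names, grids, and the benefit function are illustrative assumptions.

```python
def sampling_sdp(levels, releases, inflow_samples, horizon, benefit):
    """Backward recursion: the value of storage s at stage t is the best
    release's benefit plus the value-to-go, with the inflow expectation
    taken across sampled streamflow sequences. Storage is clamped and
    snapped to the discrete grid; release feasibility is simplified."""
    value = {s: 0.0 for s in levels}
    for t in reversed(range(horizon)):
        new_value = {}
        for s in levels:
            best = float("-inf")
            for r in releases:
                vals = []
                for seq in inflow_samples:
                    raw = max(min(s - r + seq[t], max(levels)), min(levels))
                    s_next = min(levels, key=lambda lev: abs(lev - raw))
                    vals.append(benefit(r) + value[s_next])
                best = max(best, sum(vals) / len(vals))
            new_value[s] = best
        value = new_value
    return value

levels, releases = [0, 1, 2, 3, 4], [0, 1, 2]
inflows = [[1, 2, 0, 1], [2, 1, 1, 0], [0, 1, 2, 2]]  # sampled sequences
print(sampling_sdp(levels, releases, inflows, horizon=4, benefit=lambda r: r ** 0.5))
```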

Journal ArticleDOI
TL;DR: The variable sampling interval (VSI) CUSUM chart as mentioned in this paper uses short sampling intervals if there is an indication that the process mean may have shifted and long sampling intervals when there is no indication of a change in the mean.
Abstract: A standard cumulative sum (CUSUM) chart for controlling the process mean takes samples from the process at fixed-length sampling intervals and uses a control statistic based on a cumulative sum of differences between the sample means and the target value. This article proposes a modification of the standard CUSUM scheme that varies the time intervals between samples depending on the value of the CUSUM control statistic. The variable sampling interval (VSI) CUSUM chart uses short sampling intervals if there is an indication that the process mean may have shifted and long sampling intervals if there is no indication of a change in the mean. If the CUSUM statistic actually enters the signal region, then the VSI CUSUM chart signals in the same manner as the standard CUSUM chart. A Markov-chain approach is used to evaluate properties such as the average time to signal and the average number of samples to signal. Results show that the proposed VSI CUSUM chart is considerably more efficient than the standard CUS...
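
A hedged sketch of the VSI rule grafted onto a one-sided CUSUM: after each sample the statistic is updated as S = max(0, S + (xbar - target) - k), the chart signals when S > h, and the next sampling interval is short when S exceeds a warning limit and long otherwise. The thresholds and intervals below are illustrative assumptions, not the article's values.

```python
import random

def vsi_cusum(sample_means, target, k, h, warning, d_short=0.25, d_long=2.0):
    """One-sided CUSUM with variable sampling intervals: update
    S = max(0, S + (xbar - target) - k); signal when S > h; use the short
    interval when S > warning (possible shift), else the long interval.
    Returns the elapsed time at the signal, or None."""
    s = t = 0.0
    for xbar in sample_means:
        s = max(0.0, s + (xbar - target) - k)
        if s > h:
            return t
        t += d_short if s > warning else d_long
    return None

# In-control samples followed by a shifted mean.
means = [random.gauss(0.0, 1.0) for _ in range(20)] + \
        [random.gauss(1.5, 1.0) for _ in range(30)]
print(vsi_cusum(means, target=0.0, k=0.5, h=5.0, warning=1.0))
```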

Journal ArticleDOI
01 Feb 1990-Oikos
TL;DR: A relationship between the number of sampling units taken from the habitat, the rarity of the species, and the probability it will be detected in the sample is described; it follows from a simple application of probability theory.
Abstract: Rare species are very common; a significant proportion of every community is made up of species with small populations (e.g. Morse et al. 1988). Such species pose a real problem for ecologists studying diversity, species composition and turnover. The chief problem is: when is a rare species not there? Clearly this can only be determined absolutely by an exhaustive, 100% efficient, search of the entire habitat. This is usually impractical. If the search is incomplete, say by sampling the habitat, the absence of the species from the sample may be because the species is truly absent or because the worker did not look hard enough. Only a probabilistic statement is possible; the more complete the search, the firmer the statement can be. There is a relationship between the number of sampling units taken from the habitat, the rarity of the species, and the probability it will be detected in the sample. Let N be the number of sampling units taken randomly from the habitat, p be the probability of the species appearing in a single sampling unit, and a be the probability or confidence that the species will be detected in the sample of N units. For rare species p would usually be less than 0.05; that is, the species would be expected to appear in fewer than 5 out of every 100 sampling units. The relationship can be described by a simple application of probability theory (though the same result can be derived via the binomial or even the negative binomial distribution). (1 - p) is the probability of the species not appearing in a sampling unit, so (1 - p)^N is the probability of the sample not detecting that species. Thus the probability of the species appearing in the sample is a = 1 - (1 - p)^N, which can be solved for the number of sampling units required: N = ln(1 - a)/ln(1 - p).
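
Completing the arithmetic: with detection probability a = 1 - (1 - p)^N, the number of units needed for confidence a is N = ln(1 - a)/ln(1 - p). A worked sketch (function names are illustrative):

```python
import math

def detection_probability(p, n_units):
    """Chance that a species occurring in a fraction p of sampling units
    appears at least once in n_units random units: 1 - (1 - p)**N."""
    return 1.0 - (1.0 - p) ** n_units

def units_required(p, confidence):
    """Smallest N reaching the given detection confidence:
    N = ln(1 - a) / ln(1 - p), rounded up."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))

print(units_required(p=0.05, confidence=0.95))    # 59 sampling units
print(detection_probability(p=0.05, n_units=59))  # ~0.952
```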

Journal ArticleDOI
TL;DR: It is concluded that the sampling interval can have a major impact on gas exchange data during exercise and considerable variability exists in the slope of the change in VO2 with a consistent change in external work regardless of the sample used, suggesting that a plateau is not a reliable physiological marker for maximal effort.
Abstract: To evaluate the effect of the gas exchange sampling interval on variability and plateau in O2 uptake (VO2), 10 subjects underwent steady-state treadmill exercise at 50% maximal VO2 and 6 subjects underwent maximal testing using a ramp protocol. During steady-state exercise, gas exchange data were acquired by using 10 different sampling intervals. The variability in VO2 was greater as the sampling interval shortened (SD = 4.5 ml.kg-1.min-1 for breath-by-breath vs. 0.8 ml.kg-1.min-1 for 60-s samples). The breath-by-breath data suggested a Gaussian distribution, and most of the variability was attributable to tidal volume (51%). During ramp testing, the slope of the change in VO2 (for each sample) was regressed with time. Considerable variability in the slopes was observed throughout exercise, and in each subject the slopes varied about zero, demonstrating both positive and negative values throughout submaximal effort. These observations were made despite the use of large sampling intervals. Shortening the sample resulted in even greater variability. We conclude that 1) the sampling interval can have a major impact on gas exchange data during exercise and 2) considerable variability exists in the slope of the change in VO2 with a consistent change in external work regardless of the sample used, suggesting that a plateau (defined as the slope of a VO2 sample at peak exercise that does not differ significantly from a slope of zero) in VO2 is not a reliable physiological marker for maximal effort.

Journal ArticleDOI
TL;DR: This work demonstrates for an important class of multistage stochastic models that three techniques — namely nested decomposition, Monte Carlo importance sampling, and parallel computing — can be effectively combined to solve this fundamental problem of large-scale linear programming.
Abstract: Our goal is to demonstrate for an important class of multistage stochastic models that three techniques — namely nested decomposition, Monte Carlo importance sampling, and parallel computing — can be effectively combined to solve this fundamental problem of large-scale linear programming.

Journal ArticleDOI
TL;DR: Item response theory (IRT) models are now in common use for the analysis of dichotomous item responses; this paper examines the sampling theory foundations for statistical inference in these models.
Abstract: Item response theory (IRT) models are now in common use for the analysis of dichotomous item responses. This paper examines the sampling theory foundations for statistical inference in these models. The discussion includes: some history on the “stochastic subject” versus the random sampling interpretations of the probability in IRT models; the relationship between three versions of maximum likelihood estimation for IRT models; estimating θ versus estimating θ-predictors; IRT models and loglinear models; the identifiability of IRT models; and the role of robustness and Bayesian statistics from the sampling theory perspective.

Journal ArticleDOI
TL;DR: Analysis of estimates of home range and daily movement for radio-tagged pronghorns and coyotes based on subsamples of data collected at short time intervals during nonconsecutive 24-hour sampling sessions suggests that restricting sampling effort to statistically independent time intervals sacrifices biologically significant information.
Abstract: The authors compared estimates of home range and daily movement for radio-tagged pronghorns (Antilocapra americana) and coyotes (Canis latrans) based on subsamples of data collected at short time intervals during nonconsecutive 24-hour sampling sessions. Home-range size, calculated by either the minimum area method or the linked-cell grid method, and daily distance traveled were underestimated when sampling intervals were based on statistically independent data. Autocorrelated data provided a better estimate of true home-range sizes than independent data for all sampling intervals. Estimates of daily movement based on sampling intervals > 4 hours for pronghorns and > 3 hours for coyotes were not correlated with the actual distance traveled. These relationships suggest that restricting sampling effort to statistically independent time intervals sacrifices biologically significant information.

Journal ArticleDOI
TL;DR: Methods for measuring primary productivity of periphyton (O2 and 14C methods) and recent advances in microelectrode technology that allow microscale measurements of productivity a...
Abstract: I review recently published research (1970–89) on freshwater periphyton, with emphasis on epilithon and epiphyton. Brushing syringe-samplers are recommended for sampling epilithon, due to their low ...

Journal ArticleDOI
TL;DR: In this article, the authors compare the assumptions and use of classical sampling theory with those of geostatistical theory, and conclude that this view is both false and unfortunate, and that estimates of spatial means based on classical sampling designs require fewer assumptions for their validity.
Abstract: A commonly held view among geostatisticians is that classical sampling theory is inapplicable to spatial sampling because spatial data are dependent, whereas classical sampling theory requires them to be independent. By comparing the assumptions and use of classical sampling theory with those of geostatistical theory, we conclude that this view is both false and unfortunate. In particular, estimates of spatial means based on classical sampling designs require fewer assumptions for their validity, and are therefore more robust, than those based on a geostatistical model.

ReportDOI
TL;DR: In this report, a study region is subdivided into areal subsets that have a common spatial characteristic to stratify the population into several categories from which sampling sites are selected.
Abstract: Computer software was written to randomly select sites for a groundwater-quality sampling network. The software uses digital cartographic techniques and subroutines from a proprietary geographic information system. The report presents the approaches, computer software, and sample applications. It is often desirable to collect ground-water-quality samples from various areas in a study region that have different values of a spatial characteristic, such as land use or hydrogeologic setting. A stratified network can be used for testing hypotheses about relations between spatial characteristics and water quality, or for calculating statistical descriptions of water-quality data that account for variations that correspond to the spatial characteristic. In the software described, a study region is subdivided into areal subsets that have a common spatial characteristic to stratify the population into several categories from which sampling sites are selected. Different numbers of sites may be selected from each category of areal subsets. A population of potential sampling sites may be defined by either specifying a fixed population of existing sites, or by preparing an equally spaced population of potential sites. In either case, each site is identified with a single category, depending on the value of the spatial characteristic of the areal subset in which the site is located. Sites are selected from one category at a time. One of two approaches may be used to select sites. Sites may be selected randomly, or the areal subsets in the category can be grouped into cells and sites selected randomly from each cell.
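
A minimal sketch of the selection step, assuming each candidate site is already labeled with the category of the areal subset containing it. This illustrates the stratified random selection idea, not the report's GIS software; all names are assumptions.

```python
import random

def stratified_site_selection(site_category, n_per_category):
    """site_category maps each candidate site to the category of the areal
    subset containing it; n_per_category says how many sites to draw from
    each category. Sites are selected randomly, one category at a time."""
    by_category = {}
    for site, category in site_category.items():
        by_category.setdefault(category, []).append(site)
    return {cat: random.sample(by_category[cat], n)
            for cat, n in n_per_category.items()}

sites = {"W1": "urban", "W2": "urban", "W3": "agricultural",
         "W4": "agricultural", "W5": "agricultural", "W6": "forest"}
print(stratified_site_selection(sites, {"urban": 1, "agricultural": 2, "forest": 1}))
```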

Journal ArticleDOI
Yih-Chyun Jenq
TL;DR: An algorithm for estimating the sampling time offsets encountered at the parallel sampling paths of an ultra-high-speed waveform digitizing system which is realized by interleaving many sample-hold and A/D (analog/digital) converter modules is presented.
Abstract: The author presents an algorithm for estimating the sampling time offsets encountered at the parallel sampling paths of an ultra-high-speed waveform digitizing system which is realized by interleaving many sample-hold and A/D (analog/digital) converter modules. One obvious application of this algorithm is to feed these estimates back to adjustable delay units in all the sampling paths to compensate the sampling-time offsets. Simulation results indicate that with an 8-bit quantizer the residual timing error can be reduced to just about 0.05% of the sampling period.
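
One generic way to estimate such offsets, sketched below under simplifying assumptions (known input frequency, ideal amplitude, no quantization, a coherently sampled test tone): fit the phase of the sine in each channel's subsequence and convert the deviation from the ideal interleaving phase into a time offset. This illustrates the idea, not the author's algorithm; all names are assumptions.

```python
import math

def channel_phase(samples, f, fs_channel):
    """Phase of a known-frequency sine in one channel's subsequence, via
    correlation with quadrature references at the nominal sample times."""
    i_sum = sum(y * math.cos(2 * math.pi * f * k / fs_channel)
                for k, y in enumerate(samples))
    q_sum = sum(y * math.sin(2 * math.pi * f * k / fs_channel)
                for k, y in enumerate(samples))
    return math.atan2(i_sum, q_sum)

def timing_offsets(channels, f, fs_total):
    """Per-channel sampling-time offsets, relative to channel 0, from the
    deviation of each fitted phase from the ideal interleaving phase."""
    m = len(channels)
    base = channel_phase(channels[0], f, fs_total / m)
    offsets = []
    for i, ch in enumerate(channels):
        ideal = 2 * math.pi * f * i / fs_total
        err = channel_phase(ch, f, fs_total / m) - base - ideal
        err = (err + math.pi) % (2 * math.pi) - math.pi  # wrap to (-pi, pi]
        offsets.append(err / (2 * math.pi * f))          # radians -> seconds
    return offsets

# Two channels at fs_total = 2 GS/s; channel 1 has a +5 ps timing offset.
m, n, fs_total = 2, 4096, 2e9
f = 397 / n * (fs_total / m)  # coherent test tone, ~96.9 MHz
delta = 5e-12
ch0 = [math.sin(2 * math.pi * f * (k * m / fs_total)) for k in range(n)]
ch1 = [math.sin(2 * math.pi * f * ((k * m + 1) / fs_total + delta)) for k in range(n)]
print(timing_offsets([ch0, ch1], f, fs_total))  # second entry ~ 5e-12
```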

Journal ArticleDOI
TL;DR: In this paper, an alternative procedure, based on directing and correcting the importance sampling function as sampling is carried out, is presented; in particular, it is possible to have a multi-modal sampling function.

Journal ArticleDOI
TL;DR: Capture efficiency of a beach seine varies greatly depending on aspects of the littoral zone habitat and fish community; to address this sampling bias, the authors quantified seine efficiency and several ha...
Abstract: Capture efficiency of a beach seine varies greatly depending on aspects of the littoral zone habitat and fish community. To address this sampling bias, we quantified seine efficiency and several ha...