
Showing papers on "Sampling (statistics)" published in 1983



Book
01 Jan 1983
Abstract: Introduction; Simple Random Sampling; Systematic Sampling; Stratification; Cluster and Multistage Sampling; Probability Proportional to Size Sampling; Other Probability Designs; Sampling Frames; Nonresponse; Survey Analysis; Sample Size; Two Examples; Nonprobability Sampling; Concluding Remarks.

882 citations


Journal ArticleDOI
TL;DR: An exact expression is given for the jackknife estimate of the number of species in a community and for the variance of this number when quadrat sampling procedures are used.
Abstract: An exact expression is given for the jackknife estimate of the number of species in a community and for the variance of this number when quadrat sampling procedures are used. The jackknife estimate is a function of the number of species that occur in one and only one quadrat. The variance of the number of species can be constructed, as can approximate two-sided confidence intervals. The behavior of the jackknife estimate, as affected by quadrat size, sample size and sampling area, is investigated by simulation.
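As a rough illustration of the estimator described above, the sketch below computes the first-order jackknife estimate of species richness from a quadrat-by-species presence/absence matrix; the variance shown is the generic jackknife (pseudovalue) variance rather than the paper's exact expression, and the data are simulated.

```python
import numpy as np

def jackknife_richness(presence):
    """First-order jackknife species-richness estimate from an
    (n_quadrats, n_species) presence/absence matrix."""
    presence = np.asarray(presence, dtype=bool)
    n = presence.shape[0]
    s_obs = int(presence.any(axis=0).sum())            # species seen in at least one quadrat
    uniques = int((presence.sum(axis=0) == 1).sum())   # species seen in exactly one quadrat
    s_jack = s_obs + (n - 1) / n * uniques             # jackknife point estimate

    # Generic jackknife variance from leave-one-out pseudovalues.
    loo = np.array([presence[np.arange(n) != i].any(axis=0).sum() for i in range(n)])
    pseudovalues = n * s_obs - (n - 1) * loo
    var = pseudovalues.var(ddof=1) / n
    return s_jack, var

rng = np.random.default_rng(0)
quadrats = rng.random((25, 40)) < 0.08                 # toy presence/absence data
est, var = jackknife_richness(quadrats)
print(f"jackknife richness estimate: {est:.1f} (approx. SE {var ** 0.5:.2f})")
```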

653 citations


Journal ArticleDOI
01 Jun 1983
TL;DR: Geostatistics as discussed by the authors is a method for the analysis of the spatial and temporal properties in a data set and a method of interpolation between selected points, which can be used to estimate the spatial or temporal dependence of samples and from this knowledge to arrive at an estimate of the sampling procedures or structure of a field.
Abstract: In agronomic problems the sampling procedure may create some confusion and bias in the analysis. Geostatistics provides a method for the analysis of the spatial and temporal properties in a data set and a method of interpolation between selected points. This paper describes the theory of geostatistics and its application to selected agronomic problems. Geostatistics considers a set of data collected in either space or time at discrete intervals. These samples may be correlated with each other to provide some unique information about the parameters which would not be detected by classical statistical methods. Through the application of geostatistics to this type of problem, we can estimate the spatial or temporal dependence of samples and from this knowledge arrive at an estimate of the sampling procedures or structure of a field. The application of these techniques is shown for air temperature, surface temperature, yield, clay content, and fertilizer content in various fields and reveals the versatility of the techniques. Geostatistics also allows for the evaluation of the dependence between two parameters in either time or space. From this information it is possible to develop sampling procedures which would allow the more costly or time-consuming variable to be sampled less frequently and estimated from the other variable by the method of kriging. This report summarizes all of these techniques and provides several different examples of their utilization. Examples of the computer code are provided for the reader wishing to apply these techniques.
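Since the abstract does not reproduce the paper's code, here is a minimal, generic sketch of the basic geostatistical tool it relies on: an empirical semivariogram estimated from irregularly spaced observations along a transect (kriging itself is omitted). The data and lag choices are invented.

```python
import numpy as np

def empirical_semivariogram(x, z, lags, tol):
    """x: sample positions, z: measured values, lags: lag centers, tol: half-width of each lag bin."""
    x, z = np.asarray(x, float), np.asarray(z, float)
    d = np.abs(x[:, None] - x[None, :])           # pairwise distances
    sq = 0.5 * (z[:, None] - z[None, :]) ** 2     # half squared differences
    gamma = []
    for h in lags:
        mask = (d > h - tol) & (d <= h + tol) & (d > 0)
        gamma.append(sq[mask].mean() if mask.any() else np.nan)
    return np.array(gamma)

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 100, 200))               # sampling locations along a transect
z = np.sin(x / 15.0) + rng.normal(0, 0.3, x.size)   # spatially correlated "soil property"
lags = np.arange(2, 40, 4)
for h, g in zip(lags, empirical_semivariogram(x, z, lags, tol=2.0)):
    print(f"lag {h:2d}: gamma = {g:.3f}")
```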

560 citations


Book
01 Jan 1983
TL;DR: In this paper, Liebetrau reviews the properties of important contemporary measures of association and correlation, including measures for nominal, ordinal, and continuous (interval) data, paying special attention to the sampling distributions needed to determine levels of significance and confidence intervals.
Abstract: Clearly reviews the properties of important contemporary measures of association and correlation. Liebetrau devotes full chapters to measures for nominal, ordinal, and continuous (interval) data, paying special attention to the sampling distributions needed to determine levels of significance and confidence intervals. Valuable discussions also focus on the relationships between various measures, the sampling properties of their estimators and the comparative advantages and disadvantages of different approaches.

430 citations


Book
01 Jan 1983
Abstract: Introduction; Comparing Survey Methods; Sampling; Issues in Questionnaire Design and Question Writing; Administration; Final Considerations and the Future.

399 citations


Journal ArticleDOI
TL;DR: A practical example shows that the bias due to incomplete matching can be severe, and moreover, can be avoided entirely by using an appropriate multivariate nearest available matching algorithm, which, in the example, leaves only a small residual biasDue to inexact matching.
Abstract: Observational studies comparing groups of treated and control units are often used to estimate the effects caused by treatments. Matching is a method for sampling a large reservoir of potential controls to produce a control group of modest size that is ostensibly similar to the treated group. In practice, there is a trade-off between the desires to find matches for all treated units and to obtain matched treated-control pairs that are extremely similar to each other. We derive expressions for the bias in the average matched pair difference due to (1) the failure to match all treated units—incomplete matching, and (2) the failure to obtain exact matches—inexact matching. A practical example shows that the bias due to incomplete matching can be severe, and moreover, can be avoided entirely by using an appropriate multivariate nearest available matching algorithm, which, in the example, leaves only a small residual bias due to inexact matching.
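A minimal sketch of what a "nearest available" matching pass can look like: each treated unit is matched in turn to the closest not-yet-used control by Mahalanobis distance on the covariates. This illustrates the general idea rather than the authors' specific algorithm, and the covariate data are simulated.

```python
import numpy as np

def nearest_available_match(X_treated, X_control):
    """Greedy nearest-available matching by Mahalanobis distance; returns (treated, control) index pairs."""
    X_t, X_c = np.asarray(X_treated, float), np.asarray(X_control, float)
    cov_inv = np.linalg.inv(np.cov(np.vstack([X_t, X_c]).T))
    available = list(range(len(X_c)))
    pairs = []
    for i, xt in enumerate(X_t):
        diffs = X_c[available] - xt
        d2 = np.einsum("ij,jk,ik->i", diffs, cov_inv, diffs)   # squared Mahalanobis distances
        j = available.pop(int(np.argmin(d2)))                  # closest unused control
        pairs.append((i, j))
    return pairs

rng = np.random.default_rng(2)
treated = rng.normal(0.5, 1.0, size=(30, 3))    # toy covariates for treated units
controls = rng.normal(0.0, 1.0, size=(200, 3))  # large reservoir of potential controls
print(nearest_available_match(treated, controls)[:5])
```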

283 citations


Journal ArticleDOI
TL;DR: In this paper, a method for determining sample size (the number of observations) that takes account of spatial dependence is presented; it depends on knowing the semivariogram for the property of interest, which is used to calculate the variances in the neighborhood of each observation point.
Abstract: A common task in regional studies of soil is to determine the mean values of particular soil properties from samples. Estimates of the number of observations needed for this purpose have usually been based on classical sampling theory without regard to spatial dependence in the data. As a result they have been unduly exaggerated and have often daunted investigators from pursuing their aims. This paper demonstrates a method for determining sample size, that is, the number of observations, taking account of spatial dependence. The method depends on knowing the semivariogram for the property of interest, which is used to calculate the variances in the neighborhood of each observation point. The variances are then pooled to form the global variance from which the standard error can be calculated. The pooled value is minimized for a given sample size if all neighborhoods are of the same size, i.e., if the sampling points lie on a regular grid. If variation is isotropic, then an equilateral triangular grid is slightly better than a square one, though the latter will usually be preferred for convenience. Where there is simple anisotropy, a nonsquare rectangular grid aligned with its longer intervals in the direction of least variation is practically optimal. Examples show the relations between standard errors and sample sizes when sampling on regular grids, from which sample sizes can be chosen to achieve any desired precision. In all instances the sampling effort determined this way is less, and can be very much less, than would have been judged necessary using the classical approach.
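The sketch below mimics the flavor of this calculation under simplifying assumptions: an assumed spherical semivariogram, a square grid, the estimation ("extension") variance of each grid-cell mean obtained by Monte Carlo integration, and cell errors treated as roughly uncorrelated when pooled. The field size and semivariogram parameters are invented.

```python
import numpy as np

def spherical_gamma(h, nugget=0.05, sill=1.0, rng_a=40.0):
    """Assumed spherical semivariogram model (distance in metres)."""
    h = np.asarray(h, float)
    g = np.where(h < rng_a,
                 nugget + (sill - nugget) * (1.5 * h / rng_a - 0.5 * (h / rng_a) ** 3),
                 sill)
    return np.where(h == 0, 0.0, g)

def extension_variance(d, gamma, n_mc=20000, seed=0):
    """Variance of estimating a d x d cell mean by its central sampling point."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(-d / 2, d / 2, size=(n_mc, 2))   # random points in the cell
    v = rng.uniform(-d / 2, d / 2, size=(n_mc, 2))
    g_point_cell = gamma(np.hypot(u[:, 0], u[:, 1])).mean()   # mean gamma(centre, cell)
    g_cell_cell = gamma(np.hypot(*(u - v).T)).mean()          # mean gamma(cell, cell)
    return 2 * g_point_cell - g_cell_cell

for d in (10, 20, 40):
    n_cells = (200 / d) ** 2                         # hypothetical 200 m x 200 m field
    se_mean = np.sqrt(extension_variance(d, spherical_gamma) / n_cells)
    print(f"grid spacing {d:3d} m: n = {n_cells:4.0f}, approx. SE of regional mean = {se_mean:.3f}")
```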

242 citations


Journal ArticleDOI
TL;DR: A binomial model is presented which enables the clumping patterns of different species or categories of cotton arthropods and plant parts to be compared, accounting for the effect of their densities, and estimates of the proportion of infested sample units derived with this model are compared with those derived with three other binomial models.
Abstract: A binomial model is presented which enables the clumping patterns of different species or categories of cotton arthropods and plant parts to be compared, accounting for the effect of their densities. Estimates of the proportion of infested sample units derived with this model are compared with those derived with three other binomial models. Statistical comparison is made, using as a criterion the degree to which each model fit field values of the proportion of infested sample units collected by three sampling methods (visual whole-plant examination, a bag method, and sweep-net). Those models which fail to incorporate the effect of density on clumping behavior fit the data less well. Estimates of sample sizes derived by a binomial sample size equation and a numerical sample size equation both of which incorporate species clumping behavior are also compared. The sample size estimates from the two equations are most similar at low densities and for species whose distributions appear closest to random; and although binomial sampling requires a larger sample size at higher densities, less time is required to sample each unit (leaf, plant, etc.).
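One common way to build such a density-dependent binomial model (possibly differing from the authors' exact parameterization) is to take the proportion of infested units as the complement of the negative binomial zero term, with the clumping parameter tied to mean density through Taylor's power law; the coefficients below are made up.

```python
import numpy as np

def proportion_infested(m, a=2.5, b=1.4):
    """Proportion of sample units with at least one individual at mean density m,
    assuming a negative binomial count distribution with k tied to m via
    Taylor's power law (variance = a * m**b). Requires a * m**b > m."""
    m = np.asarray(m, float)
    var = a * m ** b                       # Taylor's power law
    k = m ** 2 / (var - m)                 # negative binomial clumping parameter
    return 1.0 - (1.0 + m / k) ** (-k)     # 1 - P(zero individuals in a unit)

densities = np.array([0.2, 0.5, 1.0, 3.0, 10.0])
for m, p in zip(densities, proportion_infested(densities)):
    print(f"mean density {m:5.2f} per unit -> proportion infested {p:.2f}")
```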

211 citations


Journal ArticleDOI
01 Jul 1983
TL;DR: Conditions for ignoring non-random selection mechanisms in a model-based approach to inference in an observational study, such as a sample survey, are examined.
Abstract: SUMMARY Random sampling schemes satisfy the conditions for ignoring the selection mechanism in a model-based approach to inference in an observational study, such as a sample survey. In many studies non-random sampling is employed. Conditions for ignoring non-random selection mechanisms are examined. Particular attention is paid to post-stratification and to quota sampling. Randomization, whether in the application of treatments to experimental material or in the selection of units to be observed in a sample survey, is one of the most important contributions of statistics to science. The arguments for randomization are twofold. The first, and most important for science, is that randomization eliminates personal choice and hence eliminates the possibility of subjective selection bias. The second is that the randomization distribution provides a basis for statistical inference. The question for scientists is whether such statistical inferences are relevant for scientific inference. In most of applied science any uncertainty in "nature" is represented by a stochastic model of the phenomenon under study. The problems of statistical inference are then the problems of testing the fit of alternative models and of making inferences about the parameters of a given model. Even when randomization is employed the randomization distribution plays no direct role in this type of statistical inference. However, the selection of the units to be studied can affect

180 citations


Journal ArticleDOI
01 Aug 1983-Ecology
TL;DR: This is a conceptual paper outlining a new methodology, not a definitive investigation of the best specific way to implement this method, but several alternative sampling and analysis methods are discussed and an example is given.
Abstract: Distance sampling methodology is adapted to enable animal density (number per unit of area) to be estimated from capture-recapture and removal data. A trapping web design provides the link between capture data and distance sampling theory. It is possible to check qualitatively the critical assumption on which the web design and the estimator are based. This is a conceptual paper outlining a new methodology, not a definitive investigation of the best specific way to implement this method. Several alternative sampling and analysis methods are possible within the general framework of distance sampling theory; a few alternatives are discussed and an example is given.


Journal ArticleDOI
Abstract: SUMMARY A theory for unbiased estimation of the total of arbitrary particle characteristics in line-intercept sampling, for transects of fixed and of random length, is presented. This theory unifies present line-intercept sampling results. Examples are given and variance estimation is discussed. 1. Introduction and Literature Review Line-intercept sampling (LIS) is a method of sampling particles in a region whereby, roughly, a particle is sampled if a chosen line segment, called a 'transect', intersects the particle. It has the advantage over 'quadrat sampling' in that there is no need to delineate the quadrats and determine which objects are in each quadrat. Examples of the economics of LIS versus quadrat sampling can be found in Canfield (1941), Bauer (1943), Warren and Olsen (1964), and Bailey (1970). The particles may represent plants, shrubs, tree crowns, nearly stationary animals, animal dens or signs, roads, logs, forest debris, particles on a microscope slide, particles in a plane section of a rock or metal sample, etc. In early biological applications, sampling with a transect appears to have been a purposive-sampling technique for studying how vegetation varies with changing environment, with the transect running perpendicular to the zonation (Weaver and Clements, 1929). In the study of range vegetation, Canfield (1941) incorporated random placement of the transect and, by taking the proportion of the sampled transect intercepted by the vegetation, obtained an unbiased estimator of coverage, that is, the ratio of the area covered by the vegetation to the area of the region of interest. However, he did not prove the unbiasedness of this estimator. Canfield called this method the 'line-interception method'. He also discussed such design questions as how many lines of what length are required and whether or not the area of interest should be stratified. Bauer (1943) compared transect sampling to quadrat sampling in an area of dense chaparral vegetation and in a laboratory experiment. He concluded that '. . . transect sampling deserves much wider use. . .'. McIntyre (1953) investigated the possibility of using data on intercept lengths in order to estimate not only coverage, but also density, that is, the ratio of the number of particles to the area of the region of interest. He was able to do this for populations which consisted of particles that were all magnifications of a known shape. Lucas and Seber (1977) presented and proved the unbiasedness of estimators of particle density and coverage for arbitrarily shaped and located particles when the transect is randomly placed. Their estimator of coverage is the same as that of Canfield (1941). Eberhardt (1978) reviewed three transect methods for use in ecological sampling: LIS, and two methods, 'line-transect sampling' and 'strip-transect sampling', in which the particles are points and the probability of observing a particle is a function of its perpendicular distance
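The unbiasedness argument can be viewed as Horvitz-Thompson weighting: with a transect placed uniformly at random across a baseline of width B, a particle whose projection on the baseline has width w is intercepted with probability w/B. The simulation below illustrates this for invented circular particles; it is a sketch of the logic, not the paper's estimator for random-length transects.

```python
import numpy as np

rng = np.random.default_rng(3)
B = 1000.0                                        # baseline width of the region
n = 500
centers = rng.uniform(10.0, B - 10.0, n)          # keep particles away from the edges
radii = rng.uniform(1.0, 5.0, n)
y = np.pi * radii ** 2                            # particle attribute of interest: area
w = 2 * radii                                     # projection width on the baseline

true_total = y.sum()
estimates = []
for _ in range(2000):                             # repeat independent random transects
    x = rng.uniform(0, B)                         # transect position on the baseline
    hit = np.abs(centers - x) <= radii            # particles intercepted by the transect
    estimates.append((y[hit] * B / w[hit]).sum()) # Horvitz-Thompson estimate of the total

print(f"true total {true_total:.0f}, mean of estimates {np.mean(estimates):.0f}")
```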

01 Jan 1983
TL;DR: In this article, the authors describe an optimum sampling design for detecting environmental impact, with two levels of replication (instantaneous, at a single time and place, and over time) and a power test procedure to determine the number of replicates necessary to detect a change of a predicted magnitude.
Abstract: This optimum sampling design for the detection of environmental impact stresses the importance of replication through time, particularly in variable environments. A statistical model and a rationale for incorporating the variability between sampling locations over time into the error term of the analysis, together with the instantaneous replicate variability, results in a determination of statistical significance that will correspond more closely to biological significance. The sampling design incorporates two levels of replication, instantaneous at a single time and place, and over time. These levels of replication must be optimized, depending on their relative variability and the marginal cost of collecting each kind of sample. Finally, the authors review a power test procedure to determine the number of replicates necessary to detect a change of a predicted magnitude. Environmental monitoring is an important source of feedback about the results of alternative strategies of resource utilization. As such, it is an essential part of any attempt to use or develop natural resources wisely. The concept of a predicted change is central to the design of successful and cost-effective monitoring programs. These programs should be designed, from inception, to have a specified probability of detecting a predicted change of specified magnitude. Without this design, monitoring programs run the risk, on the one hand, of having little or no chance of detecting anything but catastrophic changes; or, on the other, of sampling far in excess of what is necessary to test reasonable hypotheses. The authors have integrated relevant ecological, statistical, and management concerns in the presentation of methods to avoid these problems in the design of environmental monitoring programs.
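A generic power calculation of the kind reviewed (a normal approximation to a two-sample comparison, not the authors' exact procedure) looks like the sketch below; the effect sizes and replicate standard deviation are illustrative.

```python
from statistics import NormalDist

def replicates_needed(delta, sigma, alpha=0.05, power=0.90):
    """Approximate replicates per group needed to detect a mean change of size
    `delta` when the replicate standard deviation is `sigma`."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * ((z_alpha + z_beta) * sigma / delta) ** 2

for delta in (0.5, 1.0, 2.0):
    n = replicates_needed(delta, sigma=1.5)
    print(f"to detect a change of {delta}: about {n:.0f} replicates per group")
```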

Journal ArticleDOI
TL;DR: In this paper, the least squares estimator of a first-order non-explosive autoregressive process with unknown parameter beta ∈ (-1, 1) is shown to be asymptotically normally distributed uniformly in beta when data are collected according to a particular stopping rule.
Abstract: For a first-order non-explosive autoregressive process with unknown parameter beta ∈ (-1, 1), it is shown that if data are collected according to a particular stopping rule, the least squares estimator of beta is asymptotically normally distributed uniformly in beta. In the case of normal residuals, the stopping rule may be interpreted as sampling until the observed Fisher information reaches a preassigned level. The situation is contrasted with the fixed sample size case, where the estimator has a non-normal limiting distribution when |beta| = 1.
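A small simulation sketch of the stopping rule as described: observations from an AR(1) process are collected until the observed information, the sum of squared lagged values, reaches a preset level c, and the least squares estimate is then standardized by the square root of that sum. The parameter values and threshold below are arbitrary.

```python
import numpy as np

def sequential_ar1_estimate(beta, c, sigma=1.0, rng=None, max_n=100000):
    """Sample x_t = beta * x_{t-1} + e_t until sum(x_{t-1}**2) >= c,
    then return the least squares estimate, the information, and n."""
    rng = rng or np.random.default_rng()
    x_prev, sxx, sxy, n = 0.0, 0.0, 0.0, 0
    while sxx < c and n < max_n:
        x = beta * x_prev + rng.normal(0.0, sigma)
        sxx += x_prev ** 2
        sxy += x_prev * x
        x_prev = x
        n += 1
    return sxy / sxx, sxx, n

rng = np.random.default_rng(4)
for beta in (0.3, 0.9, 0.99):                      # values inside (-1, 1)
    z = []
    for _ in range(500):
        b_hat, sxx, _ = sequential_ar1_estimate(beta, c=400.0, rng=rng)
        z.append(np.sqrt(sxx) * (b_hat - beta))    # should look roughly N(0, 1) with sigma = 1
    print(f"beta = {beta}: mean {np.mean(z):+.3f}, sd {np.std(z):.3f}")
```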

Journal ArticleDOI
TL;DR: This study suggests that one essentially has to have a good approximation of the region of acceptability in order to achieve significant variance reduction, and that importance sampling is very useful when there are few parameters and the yield is very high or very low.
Abstract: The efficiency of several variance reduction techniques (in particular, importance sampling, stratified sampling, and control variates) is studied with respect to their application in estimating circuit yields. This study suggests that one essentially has to have a good approximation of the region of acceptability in order to achieve significant variance reduction. Further, all the methods considered are based, either explicitly or implicitly, on the use of a model. The control variate method appears to be more practical for implementation in a general purpose statistical circuit analysis program. Stratified sampling is the simplest to implement, but yields only very modest reductions in the variance of the yield estimator. Lastly, importance sampling is very useful when there are few parameters and the yield is very high or very low; however, a good practical technique for its implementation, in general, has not been found.
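A toy illustration of the control variate idea in yield estimation: the "circuit" and its acceptability region are invented, and the control variate is the indicator of a linearized region whose probability is known in closed form. This is a sketch of the technique, not the paper's implementation.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(5)
N = 20000
x = rng.normal(size=(N, 2))                              # toy process parameters

pass_true = (x[:, 0] + 0.5 * x[:, 1] + 0.1 * x[:, 0] * x[:, 1]) <= 1.0   # "circuit meets spec"
pass_lin = (x[:, 0] + 0.5 * x[:, 1]) <= 1.0              # linear model of the acceptance region
p_lin = NormalDist().cdf(1.0 / np.sqrt(1.0 + 0.25))      # exact P(pass_lin) under the Gaussian model

y, h = pass_true.astype(float), pass_lin.astype(float)
b = np.cov(y, h)[0, 1] / np.var(h, ddof=1)               # optimal control-variate coefficient
yield_cv = y.mean() - b * (h.mean() - p_lin)             # control-variate yield estimate

print(f"crude Monte Carlo yield:  {y.mean():.4f}")
print(f"control-variate yield:    {yield_cv:.4f}")
print(f"variance ratio (CV/crude): {1 - np.corrcoef(y, h)[0, 1] ** 2:.3f}")
```

The variance ratio printed at the end shows the usual control variate result: the better the model of the acceptability region correlates with the true pass/fail indicator, the larger the reduction.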

Journal ArticleDOI
TL;DR: In this article, the sampling errors of maximum likelihood estimation of item response theory parameters are studied in the case when both people and item parameters are estimated simultaneously, and a check on the validity of the standard error formulas is carried out.
Abstract: The sampling errors of maximum likelihood estimates of item response theory parameters are studied in the case when both people and item parameters are estimated simultaneously. A check on the validity of the standard error formulas is carried out. The effect of varying sample size, test length, and the shape of the ability distribution is investigated. Finally, the effect of anchor-test length on the standard error of item parameters is studied numerically for the situation, common in equating studies, when two groups of examinees each take a different test form together with the same anchor test. The results encourage the use of rectangular or bimodal ability distributions, and also the use of very short anchor tests.

Journal ArticleDOI
TL;DR: In this article, a method for constructing confidence intervals for a binomial parameter upon termination of a sequential or multistage test is described for use with the MIL-STD 105D multiple sampling plans for acceptance sampling.
Abstract: This paper describes a method for constructing confidence intervals for a binomial parameter upon termination of a sequential or multistage test. Tables are presented for use with the MIL-STD 105D multiple sampling plans for acceptance sampling. Also given are tables for use with some three-stage schemes that have been proposed in connection with biomedical trials. The results are compared with confidence intervals calculated as if the sampling plan had been one with a fixed sample size.
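For comparison purposes only, the baseline mentioned in the last sentence (an interval computed as if the sample size had been fixed) can be obtained with a standard Clopper-Pearson calculation; the counts below are hypothetical, and the paper's sequential-adjusted intervals additionally require the stopping boundaries of the plan.

```python
from scipy.stats import beta

def clopper_pearson(x, n, conf=0.95):
    """Exact (Clopper-Pearson) interval for a binomial proportion, fixed-n version."""
    a = 1 - conf
    lo = beta.ppf(a / 2, x, n - x + 1) if x > 0 else 0.0
    hi = beta.ppf(1 - a / 2, x + 1, n - x) if x < n else 1.0
    return lo, hi

# hypothetical example: 3 defectives observed by the time a multiple sampling plan stopped at n = 125
print(clopper_pearson(3, 125))
```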

Journal ArticleDOI
TL;DR: In this article, thermal and wet chemical methods of separating organic from elemental carbon in particulate samples were examined and it was concluded that none of them represents an ideal separation procedure and that only a method-dependent operational definition of organic and elemental carbon is possible at this time.

Journal ArticleDOI
TL;DR: In this article, the sample size required to produce a prescribed level of accuracy in grid-by-number sampling of river gravels is determined from statistical theory, and the equations needed are presented.
Abstract: Operator and sampling errors in the grid-by-number sampling technique for river gravels result in differences between sample and population parameters. Differences between operators occur because of slight differences in stone selection procedure and are independent of sample size. Differences between samples occur because of random errors and these decrease with increasing sample size. Consequently, as sample size increases differences between operators become statistically significant even though physically they remain the same. This is illustrated by data for eight operators. Samples of 30, 60 and 100 show no significant differences but those for 120, 180 and 300 do show significant differences. At small sample sizes, sampling error is large and the sample parameters only approximately define population values. The sample size required to produce a prescribed level of accuracy can be determined from statistical theory and the equations needed are presented here. Where samples larger than 100 stones...
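A generic version of the sample-size logic (not the paper's own equations) is the familiar n = (z s / e)^2 rule for holding the standard error of a mean within a tolerance e; the grain-size spread and tolerance below are hypothetical.

```python
from statistics import NormalDist

def stones_needed(sd, tolerance, conf=0.95):
    """Approximate number of stones so the mean is within +/- tolerance at the stated confidence."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return (z * sd / tolerance) ** 2

# hypothetical values: standard deviation 0.8 phi units, tolerance +/- 0.1 phi
print(f"about {stones_needed(0.8, 0.1):.0f} stones")
```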

Journal ArticleDOI
Ian G. Cowx
TL;DR: Of the methods described the Zippin maximum likelihood model was considered most convenient and satisfactory (primarily because of its ease of use) when the proportion of the population taken in successive catches remains constant throughout sampling.
Abstract: The paper describes methods of estimating population size based on survey removal or depletion data, discusses the assumptions upon which they are based and the relative merits of each method. Of the methods described, the Zippin maximum likelihood model was considered most convenient and satisfactory (primarily because of its ease of use) when the proportion of the population taken in successive catches remains constant throughout sampling. When this was not the case, as is often found in depletion sampling, the more robust maximum weighted likelihood model of Carle & Strub (1978) was found to be the only method which produced statistically reliable estimates.
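A sketch of constant-catchability removal estimation in the spirit of the Zippin model: the multinomial likelihood is profiled over the population size N, with the capture probability having a closed form given N. The three-pass catches below are invented, and the Carle & Strub weighted-likelihood refinement is not shown.

```python
from math import lgamma, log

def removal_mle(catches, n_max=10000):
    """Profile-likelihood estimate of population size N and capture probability p
    from k-pass removal catches, assuming constant catchability."""
    k, T = len(catches), sum(catches)
    best = None
    for N in range(T, n_max + 1):
        X = sum(i * c for i, c in enumerate(catches)) + k * (N - T)
        p = T / (T + X)                            # ML capture probability given N
        ll = (lgamma(N + 1) - lgamma(N - T + 1)    # multinomial coefficient (constant terms dropped)
              + T * log(p) + X * log(1 - p))
        if best is None or ll > best[0]:
            best = (ll, N, p)
    return best[1], best[2]

N_hat, p_hat = removal_mle([142, 87, 46])          # hypothetical three-pass depletion catches
print(f"estimated population: {N_hat}, capture probability: {p_hat:.2f}")
```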

Journal ArticleDOI
TL;DR: In this paper, the authors examine the effects that test-unit size, spacing, and patterning have on the discovery of archaeological sites of varying size and artifact density and present some simple procedures both for the efficient design of surveys and the evaluation of existing survey results.
Abstract: Shovel-test sampling, the excavation of small test units at regular intervals along survey transects, is a widely used technique for archaeological survey in heavily vegetated areas. In order to achieve efficiently the archaeological goals of a survey employing the technique, the survey should be designed with consideration of the statistical properties of shovel-test sampling. In this paper we examine the effects that test-unit size, spacing, and patterning have on the discovery of archaeological sites of varying size and artifact density. This examination presents some simple procedures both for the efficient design of surveys and the evaluation of existing survey results.
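The sort of discovery-probability calculation the paper motivates can be sketched by Monte Carlo: a circular site is dropped at random over a square grid of shovel tests, and detection requires at least one artifact in at least one intersecting test unit, assuming Poisson-distributed artifacts. All parameter values below are illustrative.

```python
import numpy as np

def discovery_probability(r, s, lam, a, n_sim=50000, seed=6):
    """Probability of discovering a circular site of radius r (m) on a square grid of
    spacing s (m), with artifact density lam (per m^2) and test-unit area a (m^2)."""
    rng = np.random.default_rng(seed)
    centers = rng.uniform(0, s, size=(n_sim, 2))          # site center within one grid cell
    k = np.arange(-2, 4) * s                              # nearby grid nodes (covers r < 2s)
    nodes = np.array([(gx, gy) for gx in k for gy in k])
    d = np.linalg.norm(centers[:, None, :] - nodes[None, :, :], axis=2)
    n_pits = (d <= r).sum(axis=1)                         # test units falling inside the site
    p_detect = 1.0 - np.exp(-lam * a * n_pits)            # Poisson chance of >= 1 artifact recovered
    return p_detect.mean()

for spacing in (5.0, 10.0, 20.0):
    p = discovery_probability(r=8.0, s=spacing, lam=0.5, a=0.25)
    print(f"spacing {spacing:5.1f} m -> discovery probability {p:.2f}")
```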

Journal ArticleDOI
TL;DR: A distinction is made between the uses of models in sample design and in survey analysis, where design-based analysis may provide greater protection against model misspecifications than model-based analysis.
Abstract: Summary A distinction is made between the uses of models in sample design and in survey analysis. Sampling practitioners regularly employ models to guide their choice of sample design, but seldom place complete reliance in a model (which would eliminate the need for probability sampling). With the large samples typical of most surveys, they are reluctant to use model-based estimators of descriptive parameters because of the bias resulting from any misspecifications of the model. With small samples, however, they may prefer a model-based estimator, accepting its unknown bias where its variance is much smaller than the design-based estimator; synthetic estimation for small areas illustrates this point. Models are essential for handling nonresponse and with the technique of statistical matching. The use of models for predicting sampling errors is noted. Causal analysis of survey data necessarily involves models; even so, in some circumstances design-based analysis may provide greater protection against model misspecifications than model-based analysis.

Journal ArticleDOI
TL;DR: A method for calculating plankton densities from estimates of mean interanimal distances is described, and several species were found to exist within unexpectedly narrow and sharply defined layers, often at densities greatly surpassing density estimates based on net samples.
Abstract: This study evaluates the usefulness of a small submersible for observations of the plankton. A method for calculating plankton densities from estimates of mean interanimal distances is described. Estimates made by this method were compared with estimates based on net sampling and were found to be in fair general agreement with them. Fragile gelatinous forms were better counted from the submersible, small organisms by netting. Some delicate species, known to be abundant from submersible observations, were never recognized in net samples. Submersible observations also gave important insights into vertical distribution of the plankton. Several species were found to exist within unexpectedly narrow and sharply defined layers, often at densities greatly surpassing density estimates based on net samples. In Saanich Inlet, B.C., plankton distribution was studied in relation to the seasonal formation and dispersion of the oxygen-deficient basin water. Other data deal with behavior, color change, bioluminescence, ...

Journal ArticleDOI
TL;DR: Building on the sequential estimation methods of Anscombe, Chow and Robbins, the authors describe a procedure that enables the number of sampling operations to be reduced by any predetermined factor, at the expense of only a slight increase in the expected sample size.
Abstract: The procedure builds on the sequential estimation methods of Anscombe, Chow and Robbins. It enables the number of sampling operations to be reduced by any predetermined factor, at the expense of only a slight increase in the expected sample size. The large sample theory is outlined, and some numerical computations are provided to demonstrate the practicality of the procedure for smaller samples.
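For context, here is a sketch of the classical Anscombe/Chow-Robbins fixed-width-interval stopping rule that this line of work builds on: sampling continues until n >= (z/d)^2 * (s_n^2 + 1/n), after which the sample mean is reported with half-width d. The batching refinement that reduces the number of sampling operations is not shown.

```python
import numpy as np
from statistics import NormalDist

def chow_robbins(draw, half_width, conf=0.95, n_min=2):
    """Sequential fixed-width confidence interval for a mean (classical rule, sketch only)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    data = [draw() for _ in range(n_min)]
    while True:
        n = len(data)
        s2 = np.var(data, ddof=1)
        if n >= (z / half_width) ** 2 * (s2 + 1.0 / n):   # stopping criterion
            return np.mean(data), n
        data.append(draw())                               # one sampling operation per step

rng = np.random.default_rng(7)
mean, n = chow_robbins(lambda: rng.normal(10.0, 3.0), half_width=0.5)
print(f"stopped at n = {n}, interval {mean - 0.5:.2f} .. {mean + 0.5:.2f}")
```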

Book ChapterDOI
01 Jan 1983
TL;DR: In this paper, the authors present the basic conceptual models that archaeologists labor under when they use probabilistic methods in survey sampling, and show explicitly how discovery probabilities and various statistical estimates are derived and what variables govern their efficiency and effectiveness.
Abstract: Publisher Summary Sampling is a valid and respectable archaeological concept. Statistical sampling theory occupies a central position in the discipline. This chapter discusses fundamentals and issues relating to how different statistical procedures function and focuses on the statistics of sampling. It discusses the fundamental expectations that archaeologists hold when they sample. The potential objectives of statistical survey in archaeology vary over a broad spectrum. These objectives range from the sometimes relatively simple task of estimating the average density of cultural remains in a region or sampling area to the perhaps more interesting one of estimating some substantively meaningful parameter of those remains. The chapter presents the basic conceptual models that archaeologists labor under when they use probabilistic methods in survey sampling. The concepts of discovery and statistical precision have been related to operational realities, showing explicitly how discovery probabilities and various statistical estimates are derived, and what variables govern their efficiency and effectiveness. Both element sampling and cluster sampling are regularly required in archaeological survey sampling. Both are most frequently implemented by subdividing the area to be sampled into arbitrary spatial units, quadrats or transects.

Journal ArticleDOI
TL;DR: In this paper, the authors view the labour market experience of individuals as a process of movement between the states of employment and unemployment and examine this dependence and its implications for the interpretation of estimates of models of labour market behaviour.
Abstract: In this paper we view the labour market experience of individuals as a process of movement between the states of employment and unemployment. We note that there are three main ways of sampling members of the labour force, namely sampling the members of a specific state, sampling the people entering or leaving a state, and sampling the population regardless of state. The joint distribution of observable and unobservable characteristics of individuals depends on the mode of sampling adopted. We examine this dependence and its implications for the interpretation of estimates of models of labour market behaviour.

Journal ArticleDOI
TL;DR: The most common ways of obtaining rare population subjects have been by screening households in known minority communities, recruitment from lists of persons known to belong to the group, such as minority group organization lists, and selecting subjects on the basis of definitive demographic characteristics such as physical appearance, language, and names as mentioned in this paper.
Abstract: Because members of minority groups constituting a very small proportion of a national population are difficult and costly to locate using standard probability sampling, social scientists interested in studying small minority groups have frequently had to rely on nonprobability sampling methods. Some of the most common ways of obtaining rare population subjects have been by screening households in known minority communities, recruitment from lists of persons known to belong to the group, such as minority group organization lists, and selecting subjects on the basis of definitive demographic characteristics, such as physical appearance, language, and names. Yet the use of such techniques makes the representativeness of the selected sample and the research results questionable. If research projects on small population groups must rely on nonprobability methods of sampling, therefore, an attempt should be made to determine which of those methods yields the most unbiased


Journal ArticleDOI
TL;DR: In this article, the sampling variability of spectra of wind-generated waves is tested against the predictions of the theory of waves as a stationary random quasi-Gaussian process, and it is demonstrated that the theory provides accurate estimates of the sample variability.
Abstract: The sampling variability of spectra of wind-generated waves is tested against the predictions of the theory of waves as a stationary random quasi-Gaussian process. Both laboratory data, in which stationarity was prescribed, and field data, in which the external conditions were remarkably steady, were treated in the same way. It is demonstrated that the theory of stationary Gaussian processes provides accurate estimates of the sampling variability. For a record length of 17 min, commonly used in wave monitoring at sea, the uncertainties in the significant height and peak frequency estimates are approximately ±12% and ±5% respectively at the 90% confidence level. Furthermore, the height of the peak of the spectrum is generally overestimated.