
Showing papers on "Sampling distribution" published in 1987



Book
01 Jan 1987
TL;DR: An introductory statistics textbook covering data description, probability, sampling distributions, estimation, hypothesis testing, regression, and related topics, with appendices on tables, notation, and statistical software.
Abstract: 1. Data and Statistics. 2. Descriptive Statistics: Tabular and Graphical Presentations. 3. Descriptive Statistics: Numerical Measures. 4. Introduction to Probability. 5. Discrete Probability Distributions. 6. Continuous Probability Distributions. 7. Sampling and Sampling Distributions. 8. Interval Estimation. 9. Hypothesis Tests. 10. Statistical Inference about Means and Proportions with Two Populations. 11. Inferences About Population Variances. 12. Test of Goodness of Fit and Independence. 13. Experimental Design and Analysis of Variance. 14. Simple Linear Regression. 15. Multiple Regression. 16. Regression Analysis: Model Building. 17. Index Numbers. 18. Forecasting. 19. Nonparametric Methods. 20. Statistical Methods for Quality Control. 21. Decision Analysis. 22. Sample Survey (on CD). Appendices. A: References and Bibliography. B: Tables. C: Summation Notation. D: Self-Test Solutions and Answers to Even-Numbered Exercises. E: Using Excel Functions. F: Computing p-values Using Minitab and Excel.

312 citations


Journal ArticleDOI
TL;DR: In this article, the root of a confidence set is transformed by its estimated bootstrap cumulative distribution function (prepivoting); this transformation can be iterated, and confidence sets built from a root prepivoted one or more times have smaller error in level than sets based on the original root.
Abstract: Approximate confidence sets for a parameter $\theta$ may be obtained by referring a function of $\theta$ and of the sample to an estimated quantile of that function's sampling distribution. We call this function the root of the confidence set. Either asymptotic theory or bootstrap methods can be used to estimate the desired quantile. When the root is not a pivot, in the sense of classical statistics, the actual level of the approximate confidence set may differ substantially from the intended level. Prepivoting is the transformation of a confidence set root by its estimated bootstrap cumulative distribution function. Prepivoting can be iterated. Bootstrap confidence sets generated from a root prepivoted one or more times have smaller error in level than do confidence sets based on the original root. The first prepivoting is nearly equivalent to studentizing, when that operation is appropriate. Further iterations of prepivoting make higher order corrections automatically.
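A minimal Python sketch of one round of prepivoting for the mean of a sample, using an absolute-error root and a double bootstrap to estimate the distribution of the transformed root. The root, resample counts, and function names are illustrative choices, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def root(sample, theta):
    # A deliberately non-pivotal root: the absolute centred sample mean.
    return abs(sample.mean() - theta)

def boot_roots(sample, n_boot):
    # Bootstrap draws of the root, with the sample mean playing
    # the role of the true parameter.
    n = len(sample)
    theta_hat = sample.mean()
    return np.array([root(rng.choice(sample, n, replace=True), theta_hat)
                     for _ in range(n_boot)])

def confidence_intervals(sample, alpha=0.05, b1=499, b2=100):
    theta_hat = sample.mean()
    draws = boot_roots(sample, b1)                 # estimates H_hat
    ecdf = lambda v, x: np.mean(v <= x)

    # Plain bootstrap set: invert H_hat at level 1 - alpha.
    q_plain = np.quantile(draws, 1 - alpha)

    # One round of prepivoting: the transformed root is H_hat(R);
    # estimate its distribution with a second bootstrap level.
    prepiv = []
    for _ in range(b1):
        star = rng.choice(sample, len(sample), replace=True)
        inner = boot_roots(star, b2)               # H_hat recomputed on X*
        prepiv.append(ecdf(inner, root(star, theta_hat)))
    u = np.quantile(prepiv, 1 - alpha)             # cut-off on the uniform scale
    q_prepiv = np.quantile(draws, u)               # map back through H_hat^{-1}

    return ((theta_hat - q_plain, theta_hat + q_plain),
            (theta_hat - q_prepiv, theta_hat + q_prepiv))

sample = rng.exponential(size=25)
print(confidence_intervals(sample))
```

The prepivoted cut-off is read off on the uniform scale and mapped back through the first-level bootstrap quantiles.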

309 citations


Journal ArticleDOI
TL;DR: Algorithms for generating the exact distribution of a finite sample drawn from a population in Hardy-Weinberg equilibrium are given for multiple alleles.
Abstract: Algorithms for generating the exact distribution of a finite sample drawn from a population in Hardy-Weinberg equilibrium are given for multiple alleles. The finite sampling distribution is derived analogously to Fisher's 2 × 2 exact distribution and is equivalent to Levene's conditional finite sampling distribution for Hardy-Weinberg populations. The algorithms presented are fast computationally and allow for quick alternatives to standard methods requiring corrections and approximations. Computation time is on the order of a few seconds for three-allele examples and up to 2 minutes for four-allele examples on an IBM 3081 machine.
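For the two-allele case, Levene's conditional distribution has a closed form, and the exact test is a short computation. The sketch below is illustrative only (the genotype counts are hypothetical, and the paper's algorithms handle multiple alleles); it sums the probabilities of all heterozygote counts no more likely than the observed one:

```python
from math import factorial

def levene_prob(h, n, nA):
    # Conditional probability of h heterozygotes among n individuals,
    # given nA copies of allele A, under Hardy-Weinberg (Levene's formula).
    nB = 2 * n - nA
    if h < 0 or h > min(nA, nB) or (nA - h) % 2:
        return 0.0
    naa, nbb = (nA - h) // 2, (nB - h) // 2
    return (factorial(n) / (factorial(naa) * factorial(h) * factorial(nbb))
            * 2 ** h * factorial(nA) * factorial(nB) / factorial(2 * n))

def exact_hw_pvalue(naa, nab, nbb):
    # Two-sided exact test: total probability of all configurations
    # no more likely than the observed one, given the allele counts.
    n, nA = naa + nab + nbb, 2 * naa + nab
    p_obs = levene_prob(nab, n, nA)
    return sum(p for h in range(min(nA, 2 * n - nA) + 1)
               if (p := levene_prob(h, n, nA)) <= p_obs + 1e-12)

print(exact_hw_pvalue(10, 3, 7))   # hypothetical genotype counts
```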

304 citations


Journal ArticleDOI
TL;DR: In this article, a second-order Taylor expansion yields an unbiased estimator of the mean squared error of autoregressive forecasts and accurate prediction intervals for Gaussian data, while bootstrap prediction intervals attain the nominal coverage asymptotically regardless of the sampling distribution.
Abstract: Forecasting requires estimates of the error of prediction; however, such estimates for autoregressive forecasts depend nonlinearly on unknown parameters and distributions. Substitution estimators of mean squared error (MSE) possess bias that varies with the underlying model, and Gaussian-based prediction intervals fail if the data are not normally distributed. This article proposes methods that avoid these problems. A second-order Taylor expansion produces an estimator of MSE that is unbiased and leads to accurate prediction intervals for Gaussian data. Bootstrapping also suggests an estimator of MSE, but it is approximately the problematic substitution estimator. Bootstrapping also yields prediction intervals, however, whose coverages are invariant to the sampling distribution and asymptotically approach the nominal content. Parameter estimation increases the error in autoregressive forecasts. This additional error inflates one-step prediction mean squared error (PMSE) by a factor of 1 + p/T, where p is the order of the autoregression and T is the length of the series.
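A minimal residual-bootstrap sketch of a one-step prediction interval for an AR(1) series, in the spirit of the bootstrap intervals discussed here; the fitting method, resample counts, and simulated series are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_ar1(x):
    # OLS estimate of the AR(1) coefficient and the centred residuals.
    phi = np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1])
    resid = x[1:] - phi * x[:-1]
    return phi, resid - resid.mean()

def bootstrap_forecast_interval(x, alpha=0.10, n_boot=2000):
    # Residual bootstrap: rebuild the series from resampled residuals,
    # re-estimate the model, and forecast with a fresh resampled shock,
    # so the interval reflects both parameter and innovation error.
    phi_hat, resid = fit_ar1(x)
    fcasts = np.empty(n_boot)
    for b in range(n_boot):
        e = rng.choice(resid, size=len(x), replace=True)
        xb = np.empty(len(x))
        xb[0] = x[0]
        for t in range(1, len(x)):
            xb[t] = phi_hat * xb[t - 1] + e[t]
        phi_b, _ = fit_ar1(xb)
        fcasts[b] = phi_b * x[-1] + rng.choice(resid)
    return np.quantile(fcasts, [alpha / 2, 1 - alpha / 2])

# Synthetic AR(1) data purely for demonstration.
x = np.empty(100)
x[0] = 0.0
for t in range(1, 100):
    x[t] = 0.6 * x[t - 1] + rng.standard_normal()
print(bootstrap_forecast_interval(x))
```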

195 citations


Journal ArticleDOI
TL;DR: This paper demonstrates under general conditions the robustness of the t-test in that the maximum actual level of significance is close to the declared level.
Abstract: One may encounter the application of the two independent samples t-test to ordinal scaled data (for example, data that assume only the values 0, 1, 2, 3) from small samples. This situation clearly violates the underlying normality assumption for the t-test and one cannot appeal to large sample theory for validity. In this paper we report the results of an investigation of the t-test's robustness when applied to data of this form for samples of sizes 5 to 20. Our approach consists of complete enumeration of the sampling distributions and comparison of actual levels of significance with the significance level expected if the data followed a normal distribution. We demonstrate under general conditions the robustness of the t-test in that the maximum actual level of significance is close to the declared level.
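A brute-force sketch of the enumeration approach for one configuration: list every possible pair of samples from a small ordinal population and accumulate the probability of rejection under the null to obtain the actual level. The population probabilities and the convention that degenerate (zero-variance) samples never reject are hypothetical choices; with n1 = n2 = 5 the loop visits 4^10 ≈ 10^6 cases:

```python
from itertools import product
from math import sqrt

probs = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}   # hypothetical ordinal population
n1 = n2 = 5
t_crit = 2.306                              # two-sided 5% point of t with 8 df

def t_stat(a, b):
    # Pooled two-sample t statistic.
    m1, m2 = sum(a) / n1, sum(b) / n2
    ss = sum((x - m1) ** 2 for x in a) + sum((y - m2) ** 2 for y in b)
    if ss == 0:
        return 0.0                          # degenerate samples never reject
    sp2 = ss / (n1 + n2 - 2)
    return (m1 - m2) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Complete enumeration: both samples come from the same population,
# so summing P(sample) over rejections gives the actual level.
level = 0.0
for xs in product(probs, repeat=n1 + n2):
    if abs(t_stat(xs[:n1], xs[n1:])) > t_crit:
        p = 1.0
        for v in xs:
            p *= probs[v]
        level += p
print(f"actual level = {level:.4f} vs declared 0.05")
```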

178 citations


Book
01 Jan 1987
TL;DR: A monograph on asymptotic statistical inference covering probability and stochastic processes, limit theorems, asymptotic estimation theory, linear parametric inference, martingale methods, non-linear regression, von Mises functionals, and the empirical characteristic function.
Abstract: Probability and Stochastic Processes. Limit Theorems for Some Statistics. Asymptotic Theory of Estimation. Linear Parametric Inference. Martingale Approach to Inference. Inference in Non-Linear Regression. Von Mises Functionals. Empirical Characteristic Function and Its Applications. Index.

168 citations


Book
01 Jan 1987
TL;DR: In this book, the authors explain how to select the correct statistical test and how to use statistics to evaluate the effectiveness of programs and of individual practitioners.
Abstract: All chapters conclude with "Concluding Thoughts" and "Study Questions." Preface. 1. Introduction to Statistical Analysis: Uses of Statistical Analysis; General Methodological Terms; Levels of Measurement; Levels of Measurement and Analysis of Data; Other Measurement Classifications; Categories of Statistical Analyses. 2. Frequency Distributions and Graphs: Frequency Distributions; Grouped Frequency Distributions; Using Frequency Distributions to Analyze Data; Misrepresentation of Data; Graphical Presentation of Data; A Common Mistake in Displaying Data. 3. Central Tendency and Variability: Central Tendency; Variability. 4. Normal Distributions: Skewed Distributions; Normal Distributions; Converting Raw Scores to Z Scores and Percentiles; Deriving Raw Scores From Percentiles. 5. Introduction to Hypothesis Testing: Alternative Explanations; Probability; Refuting Sampling Error; Research Hypotheses; Testing the Null Hypothesis; Statistical Significance; Errors in Drawing Conclusions About Relationships; Statistically Significant Relationships and Meaningful Findings. 6. Sampling Distributions and Hypothesis Testing: Sample Size and Sampling Error; Sampling Distributions and Inference; Sampling Distribution of Means; Estimating Parameters From Statistics; Other Distributions. 7. Selecting a Statistical Test: The Importance of Selecting the Correct Statistical Test; Factors to Consider When Selecting a Statistical Test; Parametric and Nonparametric Tests; Multivariate Statistical Tests; General Guidelines for Test Selection; Getting Help With Data Analyses. 8. Correlation: Uses of Correlation; Perfect Correlations; Nonperfect Correlations; Interpreting Linear Correlations; Using Correlation for Inference; Computation and Presentation of Pearson's r; Nonparametric Alternatives; Using Correlation With Three or More Variables; Other Multivariate Tests That Use Correlation. 9. Regression Analyses: What Is Prediction?; What Is Simple Linear Regression?; Computation of the Regression Equation; More About the Regression Line; Interpreting Results; Using Regression Analyses in Social Work Practice; Regression With Three or More Variables; Other Types of Regression Analyses. 10. Cross-Tabulation: The Chi-Square Test of Association; Using Chi-Square in Social Work Practice; Chi-Square With Three or More Variables; Special Applications of the Chi-Square Formula. 11. t Tests and Analysis of Variance: The Use of t Tests; The One-Sample t Test; The Dependent t Test; The Independent t Test; Simple Analysis of Variance (One-Way ANOVA). Appendix A. Using Statistics to Evaluate Practice Effectiveness: Evaluating Programs; Evaluating Individual Practitioner Effectiveness. Glossary. Index.

118 citations


Journal ArticleDOI
TL;DR: In this article, computer simulations are used to examine the accuracy of the increment-summation, instantaneous-growth, Allen curve, and size-frequency estimates of secondary production under various growth and mortality functions, patterns of prolonged recruitment, and sampling efforts, and to test the reliability of parametric and nonparametric (bootstrap) confidence intervals for production estimates.
Abstract: We used computer simulations to examine the accuracy of the increment-summation, instantaneous-growth, Allen curve, and size-frequency estimates for various growth and mortality functions, patterns of prolonged recruitment, and sampling efforts; describe the sampling distribution of the estimates for aggregated populations of stream benthos; and test the reliability of parametric and nonparametric (bootstrap) confidence intervals for production estimates. Sampling schedule is critical for synchronous populations, and all methods underestimate true production when sampling intervals do not cover periods of intense production. The size-frequency method tends to underestimate production of perfectly synchronous populations severely and is recommended only when cohorts are absolutely indistinguishable. Biases of cohort methods generally range from -30% to +10% of true production. Sampling error of production estimates can range from -60% to +300% of true production for highly aggregated populations sampled with low effort and from -10% to +10% at low aggregation level and high sampling effort. Published parametric and nonparametric confidence intervals are reliable only in the best circumstances (i.e. when aggregation is weak and sampling effort is high). Reliable confidence intervals can be obtained for the Allen curve estimate of production if the raw data can be transformed to stabilize the variance of density estimates and to linearize the relationship between density and mean individual mass. Studies of secondary production address fundamental questions in ecology: transfers of energy and nutrients within communities and ecosystems, rational management of biological resources, and detection of the effects of pollution (Downing 1984). Secondary production can be estimated by various methods (e.g. Downing 1984; Benke 1984), but estimates are useful and comparable only if their biases are known and if reliable confidence intervals can be calculated for the estimates. Computer simulations are appropriate tools to analyze the effect of critical assumptions on these estimates because they permit the investigation of bias in our conceptual approaches independent of sampling problems. Cushman et al. (1977) used a computer simulation to examine how the sampling regime, the growth curve, the variability in individual growth rate, and the sampling variability influence estimates of production obtained by the size-frequency, the instantaneous-growth, and the removal-summation methods (as described by Benke 1984).

92 citations


Journal ArticleDOI
TL;DR: In this paper, the authors compare four methods for estimating the 7-day, 10-year and 7-day, 20-year low flows for streams using the bootstrap method.
Abstract: Four methods for estimating the 7-day, 10-year and 7-day, 20-year low flows for streams are compared by the bootstrap method. The bootstrap method is a Monte Carlo technique in which random samples are drawn from an unspecified sampling distribution defined from observed data. The nonparametric nature of the bootstrap makes it suitable for comparing methods based on a flow series for which the true distribution is unknown. Results show that the two methods based on hypothetical distributions (Log-Pearson III and Weibull) had lower mean square errors than did the Box-Cox transformation method or the Log-Boughton method which is based on a fit of plotting positions.
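A toy Python sketch of the comparison device used here: treat the observed record as the population, bootstrap-resample it, and compare the mean squared error of competing low-flow estimators. The synthetic record, the log-normal fit (a stand-in for the Log-Pearson III and Weibull fits in the paper), and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def q10_empirical(flows):
    # Plotting-position style estimate: the empirical 0.1 quantile
    # of the annual 7-day minimum series.
    return np.quantile(flows, 0.1)

def q10_lognormal(flows):
    # Parametric alternative: fit a log-normal and read off the
    # 0.1 quantile; Phi^{-1}(0.1) = -1.2816.
    z = np.log(flows)
    return np.exp(z.mean() - 1.2816 * z.std(ddof=1))

def bootstrap_mse(flows, estimator, n_boot=5000):
    # The observed series plays the role of the population, so its
    # 0.1 quantile serves as the "true" 7Q10 for the comparison.
    true_q = np.quantile(flows, 0.1)
    ests = np.array([estimator(rng.choice(flows, len(flows), replace=True))
                     for _ in range(n_boot)])
    return np.mean((ests - true_q) ** 2)

flows = rng.lognormal(mean=3.0, sigma=0.5, size=40)   # synthetic record
print("empirical:", bootstrap_mse(flows, q10_empirical))
print("lognormal:", bootstrap_mse(flows, q10_lognormal))
```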

88 citations


Journal ArticleDOI
TL;DR: In this paper, a Bayes empirical Bayes approach to inference is presented, which allows the comparison of competing, perhaps nonnested, models for the distribution of the random variables in a natural way.
Abstract: Suppose that the first n order statistics from a random sample of N positive random variables are observed, where N is unknown. This, the general order statistic model, has been applied to the study of market penetration, capture-recapture, burn-in in repairable systems, software reliability growth, the estimation of the number of individuals exposed to radiation, and the estimation of the number of unseen species. Inference is to be made about the unknown parameters, especially N, and future observations are to be predicted. A Bayes empirical Bayes approach to inference is presented. This permits the comparison of competing, perhaps nonnested, models for the distribution of the random variables in a natural way. It also provides easily implemented inference and prediction procedures that avoid the difficulties of non-Bayesian methods. One such difficulty is that the maximum likelihood estimator of N may be infinite. Results are given for the case in which vague prior information about the model ...

Posted Content
TL;DR: In this article, the authors argue that Hall's t and F tests of whether consumption is predicted by lagged income, or by lags of consumption beyond the first, are asymptotically valid.
Abstract: We use recent research on estimation and testing in the presence of unit roots to argue that Hall's (1978) t and F tests of whether consumption is predicted by lagged income, or by lags of consumption beyond the first, are asymptotically valid. A Monte Carlo experiment suggests that the asymptotic t and F distributions provide a good approximation to the actual finite sample distribution.

Journal ArticleDOI
TL;DR: In this paper, the authors explain the phenomenon of partial $n−1/2}$-term correction by the bootstrap in the estimation of sampling distributions of nonstandardized statistics.
Abstract: The phenomenon of partial $n^{-1/2}$-term correction by the bootstrap in the estimation of sampling distributions of nonstandardized statistics is explained and studied in this note.

Journal ArticleDOI
TL;DR: In this paper, approximate conditional inference in location and scale parameter distributions, based on normal approximations to the distributions of signed square roots of likelihood ratio statistics, is discussed. And the accuracy of the approximate methods for small samples is illustrated by comparison with exact results in some examples.
Abstract: Methods of approximate conditional inference in location and scale parameter distributions, based on normal approximations to the distributions of signed square roots of likelihood ratio statistics, are discussed. These methods are applied to obtain approximate inference for the quantiles and scale parameter of the log generalized gamma distribution from uncensored samples, assuming that the index parameter of the distribution is known. As special cases, approximate inference in the extreme value and normal distributions from Type II censored samples is considered. The accuracy of the approximate methods for small samples is illustrated by comparison with exact results in some examples.
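In generic notation (assumed here; the paper's conditional versions are more refined), the signed square root of the likelihood ratio statistic for a scalar parameter $\theta$ with log-likelihood $\ell$ is

$$ r(\theta) = \operatorname{sgn}(\hat{\theta} - \theta)\left[2\{\ell(\hat{\theta}) - \ell(\theta)\}\right]^{1/2}, $$

and treating $r(\theta)$ as approximately $N(0,1)$ yields approximate confidence limits $\{\theta : |r(\theta)| \le z_{1-\alpha/2}\}$.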

Book
01 Jan 1987
TL;DR: A collection of papers in multivariate analysis honoring K. C. S. Pillai, including a hierarchy of relationships between covariance matrices and the effect of additional variables in principal component analysis, discriminant analysis, and canonical correlation analysis.
Abstract: Minimaxity of Empirical Bayes Estimators Derived from Subjective Hyperpriors; Quasi-Inner Products and Their Applications; A Hierarchy of Relationships Between Covariance Matrices; Effect of Additional Variables in Principal Component Analysis, Discriminant Analysis and Canonical Correlation Analysis; On a Locally Best Invariant and Locally Minimax Test in Symmetrical Multivariate Distributions; Confidence Intervals for the Slope in a Linear Errors-in-Variables Regression Model; Likelihood Ratio Test for Multisample Sphericity; Statistical Selection Procedures in Multivariate Models; Quadratic Forms to Have a Specified Distribution; Asymptotic Expansions for Errors of Misclassification: Nonnormal Situations; Transformations of Statistics in Multivariate Analysis; Error Rate Estimation in Discriminant Analysis: Recent Advances; Some Simple Optimal Tests in Multivariate Analysis; Developments in Eigenvalue Estimation; Asymptotic Non-null Distributions of a Statistic for Testing the Equality of Hermitian Covariance Matrices in the Complex Gaussian Case; A Model for Interlaboratory Differences; Bayes Estimators in Lognormal Regression Model; Multivariate Behrens-Fisher Problem by Heteroscedastic Method; Tests for Covariance Structure in Familial Data and Principal Component; Risk of Improved Estimators for Generalized Variance and Precision; Sampling Distributions of Dependent Quadratic Forms from Normal and Nonnormal Universes; Bibliography of Works by K. C. S. Pillai.

Journal ArticleDOI
TL;DR: In this article, it was shown that under essentially the same conditions the likelihood ration statistic, the Wald statistic, and the score statistic are asymptotically equivalent, i.e., they have the same limiting x 2-distribution under the general linear hypothesis as well as under suitable sequences of alternatives.
Abstract: Statistical inference in generalized linear models is based on the premises that the maximum likelihood estimator of unknown parameters is consistent and asymptotically normal, and that various test statistics have a limiting $\chi^2$-distribution. Fahrmeir and Kaufmann (1985) present mild conditions which assure consistency and asymptotic normality of the maximum likelihood estimator. In this paper it is shown that under essentially the same conditions the likelihood ratio statistic, the Wald statistic, and the score statistic are asymptotically equivalent, i.e., they have the same limiting $\chi^2$-distributions under the general linear hypothesis as well as under suitable sequences of alternatives. Thus, statistical inference in generalized linear models is asymptotically justified under rather weak requirements.
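In generic notation (assumed here, not quoted from the paper), for the general linear hypothesis $H_0\colon C\beta = \xi$ with $\operatorname{rank}(C) = r$, unrestricted MLE $\hat{\beta}$, restricted MLE $\tilde{\beta}$, score $s(\cdot)$, and Fisher information $F(\cdot)$, the three statistics are

$$ \lambda = 2\{\ell(\hat{\beta}) - \ell(\tilde{\beta})\}, \qquad w = (C\hat{\beta} - \xi)^{\top}\bigl[C\,F(\hat{\beta})^{-1}C^{\top}\bigr]^{-1}(C\hat{\beta} - \xi), \qquad u = s(\tilde{\beta})^{\top}F(\tilde{\beta})^{-1}s(\tilde{\beta}), $$

each converging under $H_0$ to a $\chi^2$-distribution with $r$ degrees of freedom.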

Book
23 Oct 1987
TL;DR: An introductory biostatistics text covering descriptive statistics, probability, sampling distributions, estimation, hypothesis tests, analysis of variance, regression, distribution-free methods, clinical trials, and vital statistics.
Abstract: Contents: Introduction. Descriptive Statistics. Probability. Probability Distributions. Sampling and Sampling Distributions. Estimation. Tests of Hypotheses. Comparison of Two Means and Two Variances. Analysis of Variance. The Chi-Square Test. Linear Regression and Correlation. Distribution-Free Methods. Clinical Trials. Vital Statistics. Observational Studies.

Journal ArticleDOI
TL;DR: The distribution function of the ith order statistic in random sampling from a distribution function F is obtained when the sample size is itself random.
Abstract: The distribution function of the ith order statistic in random sampling from a distribution function F is obtained when the sample size is random.
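One natural form of the result (assuming the sample size $N$ is independent of the observations and conditioning on $N \ge i$ so that the $i$th order statistic exists):

$$ P\bigl(X_{(i)} \le x \mid N \ge i\bigr) = \frac{1}{P(N \ge i)} \sum_{n \ge i} P(N = n) \sum_{k=i}^{n} \binom{n}{k} F(x)^{k}\{1 - F(x)\}^{n-k}, $$

since, given $N = n$, the event $\{X_{(i)} \le x\}$ is the event that at least $i$ of the $n$ observations fall at or below $x$.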

Journal ArticleDOI
TL;DR: Because the Zipf size-frequency distribution is often used as a model for bibliometric variables, the relationships among its parameters and its sampling properties should be understood by investigators in this field; tables for the sampling distribution of the maximal value of a finite Zipf distribution and an approximation formula for confidence intervals are provided.
Abstract: Because the Zipf size-frequency distribution is used so often as a mathematical model for bibliometric variables, it is important that the relationships among its parameters and its sampling properties be understood by investigators in this field. This paper examines these relationships and properties. In addition, it provides tables for the sampling distribution of the maximal value of a finite Zipf distribution and an approximation formula for confidence intervals. Confidence limits for the maximal value in a number of previous studies are determined.

Journal ArticleDOI
TL;DR: It is concluded that analysis of up to 2 or 3 binding sites presents few problems and that linear, normal statistical results are valid, while discrimination of 5 from 4 sites is an upper limit to the usefulness of the F test.

Journal ArticleDOI
TL;DR: The congruence coefficient was found to be positively biased for values of psi from .10 to .40 and negatively biased for the following values: .50, .60, .90, and .99.
Abstract: The effects of sample size, number of variables, and population value of the congruence coefficient on the sampling distribution of the congruence coefficient were examined. Sample data were generated on the basis of the common factor model, and principal axes factor analyses were performed. The results indicated that when the population congruence coefficient is zero, the sampling distribution of the congruence coefficient is relatively stable and is similar to the sampling distribution of the correlation coefficient. The congruence coefficient was found to be positively biased for values of psi from .10 to .40 and negatively biased for the following values of psi: .50, .60, .90, and .99. The amount of bias (a) depends on the sample size and the number of variables used and (b) tends to decrease as sample size increases.
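A crude Monte Carlo sketch of the null case described above (population congruence of zero), assuming independent standard normal loading vectors rather than the paper's full common-factor design; names and sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

def congruence(x, y):
    # Tucker's congruence coefficient: a cosine, i.e. a correlation
    # computed about the origin rather than about the mean.
    return x @ y / np.sqrt((x @ x) * (y @ y))

# Independent loading vectors correspond to a population coefficient of
# zero; the resulting distribution should resemble that of an ordinary
# correlation coefficient, as the abstract notes.
p, n_rep = 10, 20_000
draws = np.array([congruence(rng.standard_normal(p), rng.standard_normal(p))
                  for _ in range(n_rep)])
print("mean:", draws.mean(), "sd:", draws.std())
```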

Posted Content
TL;DR: In this article, the sampling distribution of the conventional t-ratio when the sample comprises independent draws from a standard Cauchy (0, 1) population is studied and it is shown that this distribution displays a striking bimodality for all sample sizes and that the bimmodality persists asymptotically.
Abstract: This paper studies the sampling distribution of the conventional t-ratio when the sample comprises independent draws from a standard Cauchy (0,1) population. It is shown that this distribution displays a striking bimodality for all sample sizes and that the bimodality persists asymptotically. An asymptotic theory is developed in terms of bivariate stable variates and the bimodality is explained by the statistical dependence between the numerator and denominator statistics of the t-ratio. This dependence also persists asymptotically. These results are in contrast to the classical t statistic constructed from a normal population, for which the numerator and denominator statistics are independent and the denominator, when suitably scaled, is a constant asymptotically. Our results are also in contrast to those that are known to apply for multivariate spherical populations. In particular, data from an n dimensional Cauchy population are well known to lead to a t-ratio statistic whose distribution is classical t with n-1 degrees of freedom. In this case the univariate marginals of the population are all standard Cauchy (0,1) but the sample data involves a special form of dependence associated with the multivariate spherical assumption. Our results therefore serve to highlight the effects of the dependence in component variates that is induced by a multivariate spherical population. Some extensions to symmetric stable populations with exponent parameter $\alpha \ne 1$ are also indicated. Simulation results suggest that the sampling distributions are well approximated by the asymptotic theory even for samples as small as n = 20.
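A quick Monte Carlo sketch (sample size and seed arbitrary) that reproduces the bimodality: t-ratios from i.i.d. standard Cauchy samples pile up near ±1 and dip near zero, unlike the classical unimodal Student t:

```python
import numpy as np

rng = np.random.default_rng(4)

# t-ratios computed from i.i.d. standard Cauchy data.
n, n_rep = 20, 100_000
x = rng.standard_cauchy(size=(n_rep, n))
t = x.mean(axis=1) / (x.std(axis=1, ddof=1) / np.sqrt(n))

# A crude text histogram of t on [-3, 3]: note the dip near zero and
# the two modes near -1 and +1.
counts, edges = np.histogram(t, bins=24, range=(-3, 3))
for c, lo in zip(counts, edges):
    print(f"{lo:5.2f} {'#' * (60 * c // counts.max())}")
```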


Journal ArticleDOI
TL;DR: In case-control studies where the outcome is not uncommon but the exposure is rare, inverse sampling may be used to reduce the total number of subjects required to find a fixed number of exposed cases and controls; the resulting sampling distribution is negative binomial rather than binomial.
Abstract: In case-control studies where the outcome is not uncommon but the exposure is rare, inverse sampling may be used to reduce the total number of subjects required to find a fixed number of exposed cases and controls. The sampling distribution is negative binomial rather than binomial. Logistic regression for adjustment of covariates may be implemented in the computer program GLIM by the appropriate use of macros. An example is given.

Journal ArticleDOI
TL;DR: In this paper, the authors introduced a class of tests by using the Kaplan-Meier estimator for the sample distribution in the uncensored model, under some regularity conditions, the asymptotic normality of statistics is derived by an application of von Mises' method.
Abstract: Under a model of random censorship, we consider the test $H_0$: a life distribution is exponential, versus $H_1$: it is new better than used, but not exponential. This paper introduces a class of tests by using the Kaplan-Meier estimator for the sample distribution in the uncensored model. Under some regularity conditions, the asymptotic normality of statistics is derived by an application of von Mises' method, and asymptotically valid tests are obtained by using estimators for the null standard deviations. The efficiency loss in the proportional censoring model is studied and a Monte Carlo study of power is performed.

01 Dec 1987
TL;DR: In this paper, the performance of two methods of ratio scaling (the analytic hierarchy process proposed by Saaty (1977, 1980) and the geometric mean procedure advocated by Williams and Crawford (1980)) when random data are supplied is compared.
Abstract: This research note evaluates and compares the performance of two methods of ratio scaling (the analytic hierarchy process proposed by Saaty (1977, 1980), and the geometric mean procedure advocated by Williams and Crawford (1980)) when random data are supplied. The two methods are examined in a series of Monte Carlo simulations for two response methods (direct estimation and constant sum) and for various stimuli and response scales. The sampling distributions of the measures of consistency of the two methods are tabulated, rules for detecting and rejecting inconsistent respondents are outlined, and approximation formulas for other designs are derived. Overall, there is a high level of agreement and correspondence between the results from the two scaling techniques. We conclude that the present results reinforce Williams and Crawford's claim for the superiority of the geometric mean procedure. Keywords: Psychometrics.
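A small Python sketch of the random-data device used in such studies: generate random reciprocal comparison matrices on Saaty's 1-9 scale and tabulate the sampling distribution of his consistency index. Matrix size, replication count, and function names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)

def random_reciprocal_matrix(n):
    # Random pairwise-comparison matrix on Saaty's 1-9 scale,
    # mimicking a "know-nothing" respondent.
    scale = np.array([1/9, 1/8, 1/7, 1/6, 1/5, 1/4, 1/3, 1/2,
                      1, 2, 3, 4, 5, 6, 7, 8, 9])
    A = np.ones((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            A[i, j] = rng.choice(scale)
            A[j, i] = 1 / A[i, j]
    return A

def consistency_index(A):
    # Saaty's CI = (lambda_max - n) / (n - 1); zero for a perfectly
    # consistent matrix.
    lam = max(np.linalg.eigvals(A).real)
    n = A.shape[0]
    return (lam - n) / (n - 1)

# Sampling distribution of CI under random responses; its mean is the
# random index used to normalise observed consistency values.
n, n_rep = 5, 2000
cis = np.array([consistency_index(random_reciprocal_matrix(n))
                for _ in range(n_rep)])
print("mean CI (random index) for n = 5:", cis.mean())
```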

Journal ArticleDOI
TL;DR: In this paper, it is shown that a logarithmic transformation of the semi-variogram gives approximate normality for moderately large sample sizes, which is theoretically expected under certain conditions.
Abstract: The semi-variogram is widely used in geostatistical data analyses. It is estimated by the sample semi-variogram. Although some theoretical work has been done on its sampling distribution for a Gaussian process, these results are not of immediate practical usage because of the problems induced by the presence of spatial inter-correlations. In this paper various simulation studies are reported which give overwhelming evidence that a logarithmic transformation of the semi-variogram gives approximate normality for moderately large sample sizes. This result is theoretically expected under certain conditions. A robust estimator of the semi-variogram was also studied using the simulations.
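A minimal sketch of the estimator in question, the classical (Matheron) sample semi-variogram, followed by the log transformation studied here. The grid, the field (spatially independent, purely to show the mechanics), the lag classes, and the tolerance are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)

def sample_semivariogram(coords, z, lags, tol):
    # Classical Matheron estimator: half the average squared difference
    # over all pairs whose separation falls in each lag class.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    iu = np.triu_indices(len(z), k=1)
    dist, sq = d[iu], (z[:, None] - z[None, :])[iu] ** 2
    return np.array([0.5 * sq[np.abs(dist - h) < tol].mean() for h in lags])

# Synthetic field on a 15 x 15 grid.
xy = np.array([(i, j) for i in range(15) for j in range(15)], dtype=float)
z = rng.standard_normal(len(xy))
lags = np.arange(1.0, 6.0)
gamma = sample_semivariogram(xy, z, lags, tol=0.5)
print(np.log(gamma))   # the log scale is where near-normality is claimed
```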

Book ChapterDOI
01 Jan 1987
TL;DR: A sampling distribution (WGNSD) for the importance sampling method in structural system reliability assessment is introduced, and it is shown that an independent sampling distribution produces poor estimates for correlated structural systems.
Abstract: This paper introduces a sampling distribution (WGNSD) for the importance sampling method, which can be used in structural system reliability assessment. Four numerical examples covering various cases verify the success of the approach. It is shown that an independent sampling distribution produces poor estimates for correlated structural systems.
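A generic Python sketch of importance sampling for a failure probability, with a plain Gaussian sampling density centred at the design point standing in for the paper's WGNSD; the limit-state function and all constants are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(7)

def g(x):
    # Hypothetical limit-state function: failure when g < 0.
    return 3.0 - (x[..., 0] + x[..., 1]) / np.sqrt(2.0)

def log_std_normal(x):
    # Log-density of the standard multivariate normal.
    return -0.5 * np.sum(x ** 2, axis=-1) - x.shape[-1] * 0.5 * np.log(2 * np.pi)

# Importance sampling: draw from a normal centred at the design point
# (here (3/sqrt(2), 3/sqrt(2))) instead of the origin, and reweight
# each draw by the density ratio phi(x) / h(x).
n = 20_000
centre = np.array([3.0, 3.0]) / np.sqrt(2.0)
x = rng.standard_normal((n, 2)) + centre
log_w = log_std_normal(x) - log_std_normal(x - centre)
pf = np.mean((g(x) < 0) * np.exp(log_w))
print("failure probability ~", pf)   # exact value is Phi(-3) ~ 1.35e-3
```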

Journal ArticleDOI
TL;DR: A test and its empirical distribution (n > 4) have been developed to examine whether a set of sample results arises from a population with a uniform distribution; the test statistic is easy to calculate and lends itself nicely to probability-plot analyses.
Abstract: A test and its empirical distribution (n > 4) have been developed to examine whether a set of sample results arises from a population which possesses a uniform distribution. The test statistic is easy to calculate and lends itself nicely to probability-plot analyses. The test statistic applies the concept of the Shapiro-Wilk W-test to the uniform distribution. The result allows one to examine distributional assumptions regarding any continuous distribution and thus provides practitioners, currently using traditional Chi-Square techniques, with a less subjective testing procedure.
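A sketch in the same spirit (not the paper's statistic, which adapts Shapiro-Wilk weights): a probability-plot correlation statistic for uniformity, with its null distribution and critical value obtained by simulation:

```python
import numpy as np

rng = np.random.default_rng(8)

def pp_correlation(sample):
    # Correlation between the ordered sample and the expected uniform
    # order statistics i/(n+1); near 1 when the data are uniform.
    n = len(sample)
    return np.corrcoef(np.sort(sample), np.arange(1, n + 1) / (n + 1))[0, 1]

def critical_value(n, alpha=0.05, n_rep=20_000):
    # Empirical null distribution of the statistic, by simulation;
    # reject uniformity when the correlation is small.
    stats = np.array([pp_correlation(rng.uniform(size=n))
                      for _ in range(n_rep)])
    return np.quantile(stats, alpha)

n = 20
print("5% critical value:", critical_value(n))
print("uniform sample stat:", pp_correlation(rng.uniform(size=n)))
print("skewed sample stat:", pp_correlation(rng.beta(0.5, 2.0, size=n)))
```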

Journal ArticleDOI
TL;DR: A computer simulation game to help teach spatial autocorrelation is presented; its use assumes that students have a minimal background in introductory statistics, and its focus is on efficiently and effectively searching through a sampling distribution, seeking map patterns with particular levels of spatial autocorrelation.
Abstract: A computer simulation game to help teach spatial autocorrelation is presented. Its use assumes that students have a minimal background in introductory statistics and its focus is on efficiently and effectively searching through a sampling distribution, seeking map patterns with particular levels of spatial autocorrelation. The sampling distribution is constructed using randomisation, and the search process is guided by calculations of Geary's contiguity ratio. Experiences with this simulation exercise at SUNY/Buffalo are briefly summarised.
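A compact Python sketch of the two ingredients named here: Geary's contiguity ratio on a gridded map, and a randomisation sampling distribution built by permuting the observed values across locations. Grid size, weights, and replication count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(9)

def gearys_c(x, W):
    # Geary's contiguity ratio: values near 1 indicate no spatial
    # autocorrelation, below 1 positive, above 1 negative.
    n = len(x)
    diff2 = (x[:, None] - x[None, :]) ** 2
    return ((n - 1) * np.sum(W * diff2)
            / (2 * W.sum() * np.sum((x - x.mean()) ** 2)))

def rook_weights(rows, cols):
    # 0/1 contiguity matrix for a rows x cols grid (rook adjacency).
    n = rows * cols
    W = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                W[i, i + 1] = W[i + 1, i] = 1
            if r + 1 < rows:
                W[i, i + cols] = W[i + cols, i] = 1
    return W

# Randomisation sampling distribution: permute the observed values
# across map locations and recompute the ratio each time.
W = rook_weights(5, 5)
x = rng.standard_normal(25)
observed = gearys_c(x, W)
null = np.array([gearys_c(rng.permutation(x), W) for _ in range(999)])
print(observed, np.quantile(null, [0.025, 0.975]))
```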