
Showing papers on "Sampling distribution published in 2003"


Journal ArticleDOI
TL;DR: Three case studies demonstrate that the adaptive capability of the SCEM‐UA algorithm significantly reduces the number of model simulations needed to infer the posterior distribution of the parameters when compared with the traditional Metropolis‐Hastings samplers.
Abstract: Markov Chain Monte Carlo (MCMC) methods have become increasingly popular for estimating the posterior probability distribution of parameters in hydrologic models. However, MCMC methods require the a priori definition of a proposal or sampling distribution, which determines the explorative capabilities and efficiency of the sampler and therefore the statistical properties of the Markov Chain and its rate of convergence. In this paper we present an MCMC sampler entitled the Shuffled Complex Evolution Metropolis algorithm (SCEM-UA), which is well suited to infer the posterior distribution of hydrologic model parameters. The SCEM-UA algorithm is a modified version of the original SCE-UA global optimization algorithm developed by Duan et al. [1992]. The SCEM-UA algorithm operates by merging the strengths of the Metropolis algorithm, controlled random search, competitive evolution, and complex shuffling in order to continuously update the proposal distribution and evolve the sampler to the posterior target distribution. Three case studies demonstrate that the adaptive capability of the SCEM-UA algorithm significantly reduces the number of model simulations needed to infer the posterior distribution of the parameters when compared with the traditional Metropolis-Hastings samplers.
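For orientation, the fixed-proposal baseline that adaptive samplers such as SCEM-UA are designed to improve on fits in a few lines. The following random-walk Metropolis sketch targets a toy Gaussian posterior; the target, step size, and starting point are assumptions for illustration, not part of the SCEM-UA algorithm itself:

```python
import numpy as np

def metropolis(log_post, x0, n_iter=5000, step=0.5, seed=0):
    """Random-walk Metropolis with a fixed proposal: the baseline that
    adaptive samplers such as SCEM-UA are designed to improve on."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    lp = log_post(x)
    chain = np.empty((n_iter, x.size))
    accepted = 0
    for i in range(n_iter):
        prop = x + step * rng.standard_normal(x.size)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # Metropolis accept/reject
            x, lp = prop, lp_prop
            accepted += 1
        chain[i] = x
    return chain, accepted / n_iter

# Toy posterior for illustration: standard bivariate Gaussian.
chain, acc = metropolis(lambda x: -0.5 * np.sum(x**2), x0=[5.0, -5.0])
print(f"acceptance rate: {acc:.2f}")
```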

1,094 citations


Journal ArticleDOI
TL;DR: The panel dynamic ordinary least squares (DOLS) estimator of a homogeneous cointegration vector for a balanced panel of N individuals observed over T time periods was studied in this article.
Abstract: We study the panel dynamic ordinary least squares (DOLS) estimator of a homogeneous cointegration vector for a balanced panel of N individuals observed over T time periods. Allowable heterogeneity across individuals includes individual-specific time trends, individual-specific fixed effects and time-specific effects. The estimator is fully parametric, computationally convenient, and more precise than the single equation estimator. For fixed N as T → ∞, the estimator converges to a function of Brownian motions and the Wald statistic for testing a set of s linear constraints has a limiting χ²(s) distribution. The estimator also has a Gaussian sequential limit distribution that is obtained first by letting T → ∞ and then letting N → ∞. In a series of Monte-Carlo experiments, we find that the asymptotic distribution theory provides a reasonably close approximation to the exact finite sample distribution. We use panel DOLS to estimate coefficients of the long-run money demand function from a panel of 19 countries with annual observations that span from 1957 to 1996. The estimated income elasticity is 1.08 (asymptotic s.e. = 0.26) and the estimated interest rate semi-elasticity is −0.02 (asymptotic s.e. = 0.01).
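The single-equation (time-series) DOLS that the panel estimator builds on regresses y_t on x_t plus leads and lags of Δx_t. A minimal sketch on simulated cointegrated data follows; the lag order p and the data-generating process are assumptions for illustration, and the paper's panel version pools such regressions across units:

```python
import numpy as np

def dols(y, x, p=2):
    """Dynamic OLS for a single cointegrated pair (single-equation form):
    regress y_t on a constant, x_t, and p leads and lags of the first
    difference of x_t."""
    dx = np.diff(x)                      # dx[k] = x[k+1] - x[k]
    T = len(y)
    rows = range(p + 1, T - p)           # rows where all leads/lags exist
    X = []
    for t in rows:
        row = [1.0, x[t]]
        row += [dx[t - 1 + j] for j in range(-p, p + 1)]  # Δx_{t+j}
        X.append(row)
    beta, *_ = np.linalg.lstsq(np.array(X), y[p + 1:T - p], rcond=None)
    return beta[1]                       # long-run coefficient on x

# Simulated cointegrated pair for illustration, true beta = 1.5.
rng = np.random.default_rng(1)
x = np.cumsum(rng.standard_normal(400))  # I(1) regressor
y = 1.5 * x + rng.standard_normal(400)
print(f"DOLS estimate of beta: {dols(y, x):.3f}")
```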

632 citations


Journal ArticleDOI
TL;DR: The computation of the exact distribution of a maximally selected rank statistic is discussed and a new lower bound of the distribution is derived based on an extension of an algorithm for the exact distribution of a linear rank statistic.

466 citations


Journal ArticleDOI
TL;DR: In this paper, a Bayesian framework for exploratory data analysis based on posterior predictive checks is presented; posterior predictive simulations can be used to create reference distributions for EDA graphs, and the approach resolves some theoretical problems in Bayesian data analysis.
Abstract: Exploratory data analysis (EDA) and Bayesian inference (or, more generally, complex statistical modeling)—which are generally considered as unrelated statistical paradigms—can be particularly effective in combination. In this paper, we present a Bayesian framework for EDA based on posterior predictive checks. We explain how posterior predictive simulations can be used to create reference distributions for EDA graphs, and how this approach resolves some theoretical problems in Bayesian data analysis. We show how the generalization of Bayesian inference to include replicated data yrep and replicated parameters θrep follows a long tradition of generalizations in Bayesian theory. On the theoretical level, we present a predictive Bayesian formulation of goodness-of-fit testing, distinguishing between p-values (posterior probabilities that specified antisymmetric discrepancy measures will exceed 0) and u-values (data summaries with uniform sampling distributions). We explain that p-values, unlike u-values, are Bayesian probability statements in that they condition on observed data. Having reviewed the general theoretical framework, we discuss the implications for statistical graphics and exploratory data analysis, with the goal being to unify exploratory data analysis with more formal statistical methods based on probability models. We interpret various graphical displays as posterior predictive checks and discuss how Bayesian inference can be used to determine reference distributions. The goal of this work is not to downgrade descriptive statistics, or to suggest they be replaced by Bayesian modeling, but rather to suggest how exploratory data analysis fits into the probability-modeling paradigm. We conclude with a discussion of the implications for practical Bayesian inference. In particular, we anticipate that Bayesian software can be generalized to draw simulations of replicated data and parameters from their posterior predictive distribution, and these can in turn be used to calibrate EDA graphs.
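A posterior predictive check of the kind described can be sketched compactly. Here a normal model with the standard noninformative prior is fitted to hypothetical heavy-tailed data, and max|y| serves as the discrepancy measure; all of these choices are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.standard_t(df=3, size=100)        # hypothetical heavy-tailed data

# Normal model with the standard noninformative prior p(mu, sigma^2) ∝ 1/sigma^2:
# sigma^2 | y ~ Inv-chi^2(n-1, s^2), mu | sigma^2, y ~ N(ybar, sigma^2/n).
n, ybar, s2 = len(y), y.mean(), y.var(ddof=1)
n_rep = 1000
sigma2 = (n - 1) * s2 / rng.chisquare(n - 1, size=n_rep)
mu = rng.normal(ybar, np.sqrt(sigma2 / n))

# Posterior predictive replicates and a discrepancy measure (max |y|).
T_obs = np.abs(y).max()
T_rep = np.array([np.abs(rng.normal(m, np.sqrt(v), size=n)).max()
                  for m, v in zip(mu, sigma2)])
p_value = (T_rep >= T_obs).mean()
print(f"posterior predictive p-value for max|y|: {p_value:.3f}")
```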

239 citations


Journal ArticleDOI
TL;DR: A key advantage of the root mean square error of approximation (RMSEA), as discussed in this paper, is that under certain assumptions its sample estimate has a known sampling distribution that allows for the computation of confidence intervals.
Abstract: A key advantage of the root mean square error of approximation (RMSEA) is that under certain assumptions, the sample estimate has a known sampling distribution that allows for the computation of co...
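For reference, the RMSEA point estimate is a simple function of the model chi-square statistic, its degrees of freedom, and the sample size. A minimal sketch using the standard formula follows; the numeric inputs are hypothetical:

```python
import math

def rmsea(chi2, df, n):
    """Point estimate of RMSEA from a model chi-square statistic, its
    degrees of freedom, and sample size n (standard formula)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# Illustrative values only.
print(f"RMSEA = {rmsea(chi2=85.3, df=40, n=300):.3f}")
```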

145 citations


Journal ArticleDOI
TL;DR: This article discusses the application of a certain class of Monte Carlo methods to stochastic optimization problems by studying a modification of the well-known pure random search method, adapting it to the variable-sample scheme, and showing conditions for convergence of the algorithm.
Abstract: In this article we discuss the application of a certain class of Monte Carlo methods to stochastic optimization problems. Particularly, we study variable-sample techniques, in which the objective function is replaced, at each iteration, by a sample average approximation. We first provide general results on the schedule of sample sizes, under which variable-sample methods yield consistent estimators as well as bounds on the estimation error. Because the convergence analysis is performed pathwisely, we are able to obtain our results in a flexible setting, which requires mild assumptions on the distributions and which includes the possibility of using different sampling distributions along the algorithm. We illustrate these ideas by studying a modification of the well-known pure random search method, adapting it to the variable-sample scheme, and show conditions for convergence of the algorithm. Implementation issues are discussed and numerical results are presented to illustrate the ideas.
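The flavor of a variable-sample method is easy to sketch: at each iteration the noisy objective is replaced by a sample average whose size grows with the iteration count. The toy pure-random-search sketch below assumes a quadratic objective with additive noise and a geometric sample-size schedule, all illustrative choices:

```python
import numpy as np

def variable_sample_search(sample_obj, candidates, schedule, seed=0):
    """Pure random search under the variable-sample scheme: at iteration
    k the objective is replaced by a sample average of N_k noisy
    evaluations, with N_k growing according to the schedule."""
    rng = np.random.default_rng(seed)
    best_x, best_val = None, np.inf
    for n_k in schedule:
        x = candidates[rng.integers(len(candidates))]
        val = np.mean([sample_obj(x, rng) for _ in range(n_k)])
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Toy problem: minimize E[(x - 2)^2 + noise] over a grid.
def noisy_obj(x, rng):
    return (x - 2.0) ** 2 + rng.standard_normal()

grid = np.linspace(-5, 5, 201)
schedule = [int(10 * 1.2 ** k) for k in range(30)]  # increasing sample sizes
x_star, _ = variable_sample_search(noisy_obj, grid, schedule)
print(f"estimated minimizer: {x_star:.2f}")
```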

134 citations


Journal ArticleDOI
01 Sep 2003-Genetics
TL;DR: Methodology for calculating sampling distributions of single-nucleotide polymorphism (SNP) frequencies in populations with time-varying size is presented, deriving analytical expressions whose coefficients do not explode as the genealogy size increases.
Abstract: We present new methodology for calculating sampling distributions of single-nucleotide polymorphism (SNP) frequencies in populations with time-varying size. Our approach is based on deriving analytical expressions for frequencies of SNPs. Analytical expressions allow for computations that are faster and more accurate than Monte Carlo simulations. In contrast to other articles showing analytical formulas for frequencies of SNPs, we derive expressions that contain coefficients that do not explode when the genealogy size increases. We also provide analytical formulas to describe the way in which the ascertainment procedure modifies SNP distributions. Using our methods, we study the power to test the hypothesis of exponential population expansion vs. the hypothesis of evolution with constant population size. We also analyze some of the available SNP data and we compare our results of demographic parameters estimation to those obtained in previous studies in population genetics. The analyzed data seem consistent with the hypothesis of past population growth of modern humans. The analysis of the data also shows a very strong sensitivity of estimated demographic parameters to changes of the model of the ascertainment procedure.
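As a baseline for the constant-population-size case that the paper generalizes, the classical neutral-theory result is that a segregating site carries i derived copies in a sample of n chromosomes with probability proportional to 1/i. A short sketch of that baseline distribution follows (this is the textbook constant-size result, not the paper's time-varying or ascertainment-adjusted formulas):

```python
import numpy as np

def neutral_sfs(n):
    """Sampling distribution of the derived-allele count at a SNP in a
    sample of n chromosomes under constant population size: P(i) ∝ 1/i."""
    i = np.arange(1, n)
    p = (1.0 / i) / np.sum(1.0 / i)
    return i, p

counts, probs = neutral_sfs(10)
for i, p in zip(counts, probs):
    print(f"P(derived count = {i}) = {p:.3f}")
```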

122 citations


Journal ArticleDOI
TL;DR: In this article, the asymptotic distribution of a post-model-selection estimator, both unconditional and conditional on selecting a correct model (minimal or not), has been derived.
Abstract: In Pötscher (1991, Econometric Theory 7, 163–185) the asymptotic distribution of a post-model-selection estimator, both unconditional and conditional on selecting a correct model (minimal or not), has been derived. Limitations of these results are (i) that they do not provide information on the distribution of the post-model-selection estimator conditional on selecting an incorrect model and (ii) that the quality of this asymptotic approximation to the finite-sample distribution is not uniform with respect to the underlying parameters. In the present paper we first obtain the unconditional and also the conditional finite-sample distribution of the post-model-selection estimator, which turn out to be complicated and difficult to interpret. Second, we obtain approximations to the finite-sample distributions that are as simple and easy to interpret as the asymptotic distributions obtained in Pötscher (1991) but at the same time are close to the finite-sample distributions uniformly with respect to the underlying parameters. As a by-product, we also obtain the asymptotic distribution conditional on selecting an incorrect model. We thank the co-editor Richard Smith and the two referees for helpful comments on a previous version of this paper. Hannes Leeb's research was supported by the Austrian Science Foundation (FWF), project P13868-MAT.

102 citations


Book
25 Jul 2003
TL;DR: This textbook introduces probability models, random variables and distributions, expectation, sampling distributions and limits, and statistical inference, covering likelihood and Bayesian approaches, optimal inference, model checking, relationships among variables, and stochastic processes.
Abstract: Probability Models - Random Variables and Distributions - Expectation - Sampling Distributions and Limits - Statistical Inference - Likelihood Inference - Bayesian Inference - Optimal Inferences - Model Checking - Relationships Among Variables - Advanced Topic - Stochastic Processes - Appendices: - 1. Mathematical Background - 2. Computations - 3. Common Distributions - 4. Tables - Index

83 citations


Journal ArticleDOI
TL;DR: A procedure for generalized monotonic curve fitting that is based on a Bayesian analysis of the isotonic regression model and uses Markov chain Monte Carlo simulation to draw samples from the unconstrained model space and retain only those samples for which the monotonic constraint holds.
Abstract: We introduce a procedure for generalized monotonic curve fitting that is based on a Bayesian analysis of the isotonic regression model. Conventional isotonic regression fits monotonically increasing step functions to data. In our approach we treat the number and location of the steps as random. For each step level we adopt the conjugate prior to the sampling distribution of the data as if the curve was unconstrained. We then propose to use Markov chain Monte Carlo simulation to draw samples from the unconstrained model space and retain only those samples for which the monotonic constraint holds. The proportion of the samples collected for which the constraint holds can be used to provide a value for the weight of evidence in terms of Bayes factors for monotonicity given the data. Using the samples, probability statements can be made about other quantities of interest such as the number of change points in the data and posterior distributions on the location of the change points can be provided. The method is illustrated throughout by a reanalysis of the leukaemia data studied by Schell and Singh.
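The retain-only-monotone-samples idea can be illustrated with a deliberately simplified model: group means with known variance and effectively flat conjugate priors, drawing from the unconstrained posterior and keeping the draws that satisfy the ordering. The data are hypothetical, and the number of steps is fixed here rather than random as in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical dose-response data grouped into 4 ordered bins.
y = [rng.normal(m, 1.0, size=20) for m in (0.2, 0.4, 1.1, 1.3)]

# Unconstrained posterior for each bin mean (known variance 1, flat
# prior): mu_j | y ~ N(ybar_j, 1/n_j). Draw jointly, then keep the
# draws for which the monotonic constraint holds.
n_draws = 20000
draws = np.column_stack([
    rng.normal(np.mean(yj), 1.0 / np.sqrt(len(yj)), size=n_draws) for yj in y
])
monotone = np.all(np.diff(draws, axis=1) >= 0, axis=1)

# The retained proportion feeds the Bayes-factor evidence for
# monotonicity; the retained draws form the constrained posterior.
print(f"P(monotone | data) under unconstrained posterior: {monotone.mean():.3f}")
print("constrained posterior means:", np.round(draws[monotone].mean(axis=0), 2))
```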

71 citations


Book
01 Jan 2003
TL;DR: This introductory statistics textbook covers data collection, descriptive statistics, probability and probability distributions, sampling distributions, and statistical inference, with chapter-level case studies and "Making an Informed Decision" exercises throughout.
Abstract: Preface to the Instructor Supplements Technology Resources An Introduction to the Applets Applications Index PART 1. GETTING THE INFORMATION YOU NEED 1. Data Collection 1.1 Introduction to the Practice of Statistics 1.2 Observational Studies versus Designed Experiments 1.3 Simple Random Sampling 1.4 Other Effective Sampling Methods 1.5 Bias in Sampling 1.6 The Design of Experiments Chapter 1 Review Chapter Test Making an Informed Decision: What Movie Should I Go To? Case Study: Chrysalises for Cash PART 2. DESCRIPTIVE STATISTICS 2. Organizing and Summarizing Data 2.1 Organizing Qualitative Data 2.2 Organizing Quantitative Data: The Popular Displays 2.3 Additional Displays of Quantitative Data 2.4 Graphical Misrepresentations of Data Chapter 2 Review Chapter Test Making an Informed Decision: Tables or Graphs? Case Study: The Day the Sky Roared 3. Numerically Summarizing Data 3.1 Measures of Central Tendency 3.2 Measures of Dispersion 3.3 Measures of Central Tendency and Dispersion from Grouped Data 3.4 Measures of Position and Outliers 3.5 The Five-Number Summary and Boxplots Chapter 3 Review Chapter Test Making an Informed Decision: What Car Should I Buy? Case Study: Who Was A Mourner? 4. Describing the Relation between Two Variables 4.1 Scatter Diagrams and Correlation 4.2 Least-Squares Regression 4.3 Diagnostics on the Least-Squares Regression Line 4.4 Contingency Tables and Association 4.5 Nonlinear Regression: Transformations (on CD) Chapter 4 Review Chapter Test Making an Informed Decision: What Car Should I Buy? Case Study: Thomas Malthus, Population, and Subsistence PART 3. PROBABILITY AND PROBABILITY DISTRIBUTIONS 5. Probability 5.1 Probability Rules 5.2 The Addition Rule and Complements 5.3 Independence and the Multiplication Rule 5.4 Conditional Probability and the General Multiplication Rule 5.5 Counting Techniques 5.6 Putting It Together: Which Method Do I Use? 5.7 Bayes's Rule (on CD) Chapter 5 Review Chapter Test Making an Informed Decision: Sports Probabilities Case Study: The Case of the Body in the Bag 6. Discrete Probability Distributions 6.1 Discrete Random Variables 6.2 The Binomial Probability Distribution 6.3 The Poisson Probability Distribution 6.4 The Hypergeometric Probability Distribution (on CD) Chapter 6 Review Chapter Test Making an Informed Decision: Should We Convict? Case Study: The Voyage of the St. Andrew 7. The Normal Probability Distribution 7.1 Properties of the Normal Distribution 7.2 The Standard Normal Distribution 7.3 Applications of the Normal Distribution 7.4 Assessing Normality 7.5 The Normal Approximation to the Binomial Probability Distribution Chapter 7 Review Chapter Test Making an Informed Decision: Join the Club Case Study: A Tale of Blood Chemistry and Health PART 4. INFERENCE: FROM SAMPLES TO POPULATION 8. Sampling Distributions 8.1 Distribution of the Sample Mean 8.2 Distribution of the Sample Proportion Chapter 8 Review Chapter Test Making an Informed Decision: How Much Time Do You Spend in a Day...? Case Study: Sampling Distribution of the Median 9. Estimating the Value of a Parameter Using Confidence Intervals 9.1 The Logic in Constructing Confidence Intervals for a Population Mean When the Population Standard Deviation Is Known 9.2 Confidence Intervals for a Population Mean When the Population Standard Deviation Is Unknown 9.3 Confidence Intervals for a Population Proportion 9.4 Confidence Intervals for a Population Standard Deviation 9.5 Putting It Together: Which Procedure Do I Use? Chapter 9 Review Chapter Test Making an Informed Decision: What's Your Major? Case Study: When Model Requirements Fail 10. Hypothesis Tests Regarding a Parameter 10.1 The Language of Hypothesis Testing 10.2 Hypothesis Tests for a Population Mean--Population Standard Deviation Known 10.3 Hypothesis Tests for a Population Mean--Population Standard Deviation Unknown 10.4 Hypothesis Tests for a Population Proportion 10.5 Hypothesis Tests for a Population Standard Deviation 10.6 Putting It Together: Which Method Do I Use? 10.7 The Probability of a Type II Error and the Power of the Test Chapter 10 Review Chapter Test Making an Informed Decision: What Does It Really Weigh? Case Study: How Old Is Stonehenge? 11. Inferences on Two Samples 11.1 Inference about Two Means: Dependent Samples 11.2 Inference about Two Means: Independent Samples 11.3 Inference about Two Population Proportions 11.4 Inference for Two Population Standard Deviations 11.5 Putting It Together: Which Method Do I Use? Chapter 11 Review Chapter Test Making an Informed Decision: Where Should I Invest? Case Study: Control in the Design of an Experiment 12. Inference on Categorical Data 12.1 Goodness-of-Fit Test 12.2 Tests for Independence and the Homogeneity of Proportions Chapter 12 Review Chapter Test Making an Informed Decision: Benefits of College Case Study: Feeling Lucky? Well, Are You? 13. Comparing Three or More Means 13.1 Comparing Three or More Means (One-Way Analysis of Variance) 13.2 Post Hoc Tests on One-Way Analysis of Variance 13.3 The Randomized Complete Block Design 13.4 Two-Way Analysis of Variance Chapter 13 Review Chapter Test Making an Informed Decision: Where Should I Invest? Part II Case Study: Hat Size and Intelligence 14. Inference on the Least-Squares Regression Model and Multiple Regression 14.1 Testing the Significance of the Least-Squares Regression Model 14.2 Confidence and Prediction Intervals 14.3 Multiple Regression Chapter 14 Review Chapter Test Making an Informed Decision: Buying a Car Case Study: Housing Boom 15. Nonparametric Statistics (on CD) 15.1 An Overview of Nonparametric Statistics 15.2 Runs Test for Randomness 15.3 Inferences about Measures of Central Tendency 15.4 Inferences about the Difference between Two Medians: Dependent Samples 15.5 Inferences about the Difference between Two Medians: Independent Samples 15.6 Spearman's Rank-Correlation Test 15.7 Kruskal--Wallis Test Chapter Test Making an Informed Decision: Where Should I Live? Case Study: Evaluating Alabama's 1891 House Bill 504 Appendix A Tables Appendix A Tables (on CD) Appendix B Lines (on CD) Photo Credits Answers Index

Journal ArticleDOI
TL;DR: Noting that no test of the identification condition in the nonlinear-in-parameters GMM model exists in the literature, this paper proposes such a test, analogous to the first-stage F-test used in the linear IV model.
Abstract: One of the key assumptions of the standard linear instrumental variables (IV) model is that the instruments and endogenous variables are correlated. This is the identification assumption, without which the usual IV estimator is neither consistent nor asymptotically normal. If the correlation between the instruments and the endogenous variables is nonzero, but slight, then the conventional Gaussian asymptotic theory for the IV model can nevertheless provide a very poor approximation to the actual sampling distribution of estimators and test statistics. Recognizing the identification assumption on which the IV model relies, it is quite common in the applied literature to test for instrument relevance by a first-stage F-test. The null hypothesis is one of a total lack of identification. A rejection of this hypothesis by no means implies that issues of weak instruments can be ignored (Staiger and Stock, 1997). But a failure to reject this hypothesis is a strong indication of identification difficulties. The first-stage F-test is an important and useful diagnostic in the IV model. The generalized method of moments (GMM) model (Hansen, 1982) nests the linear IV model as a special case. Not surprisingly, analogous issues arise in this model. Researchers have found that, in many contexts, the conventional Gaussian asymptotic theory provides a poor approximation to the sampling distribution of GMM estimators and test statistics. There are many possible reasons why this could happen, but they include identification problems. However, I am aware of no test of the identification condition in the nonlinear-in-parameters GMM model in the existing literature. This paper proposes such a test.
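In the linear IV model the diagnostic discussed here is straightforward to compute. A minimal sketch of a first-stage F-test follows, on simulated data with deliberately weak instruments; the restricted model is constant-only, and all data choices are illustrative:

```python
import numpy as np
from scipy import stats

def first_stage_F(endog, instruments):
    """First-stage F-test for instrument relevance: regress the
    endogenous regressor on the instruments (plus a constant) and test
    that all instrument coefficients are jointly zero."""
    n = len(endog)
    Z = np.column_stack([np.ones(n), instruments])
    k = instruments.shape[1]                    # number of instruments
    beta, *_ = np.linalg.lstsq(Z, endog, rcond=None)
    resid = endog - Z @ beta
    rss = resid @ resid
    tss = np.sum((endog - endog.mean()) ** 2)   # restricted model: constant only
    F = ((tss - rss) / k) / (rss / (n - k - 1))
    return F, stats.f.sf(F, k, n - k - 1)

# Simulated example with weakly relevant instruments.
rng = np.random.default_rng(0)
z = rng.standard_normal((500, 2))
x = 0.15 * z[:, 0] + rng.standard_normal(500)
F, p = first_stage_F(x, z)
print(f"first-stage F = {F:.2f} (p = {p:.4f})")
```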

Journal ArticleDOI
TL;DR: The authors discuss the basis behind analysis of continuous data with use of paired and unpaired t tests, the Bonferroni correction, and multivariate analysis of variance for readers of the radiology literature.
Abstract: Whenever means are reported in the literature, they are likely accompanied by tests to determine statistical significance. The t test is a common method for statistical evaluation of the difference between two sample means. It provides information on whether the means from two samples are likely to be different in the two populations from which the data originated. Similarly, paired t tests are common when comparing means from the same set of patients before and after an intervention. Analysis of variance techniques are used when a comparison involves more than two means. Each method serves a particular purpose, has its own computational formula, and uses a different sampling distribution to determine statistical significance. In this article, the authors discuss the basis behind analysis of continuous data with use of paired and unpaired t tests, the Bonferroni correction, and multivariate analysis of variance for readers of the radiology literature. © RSNA, 2003
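A short sketch of the procedures discussed, using simulated measurements; the group sizes, effect sizes, and the two-comparison Bonferroni correction are illustrative assumptions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Unpaired t test: simulated readings from two independent patient groups.
group_a = rng.normal(50, 8, size=30)
group_b = rng.normal(55, 8, size=30)
t_unpaired, p_unpaired = stats.ttest_ind(group_a, group_b)

# Paired t test: the same patients before and after an intervention.
before = rng.normal(50, 8, size=25)
after = before + rng.normal(3, 4, size=25)
t_paired, p_paired = stats.ttest_rel(before, after)

# Bonferroni correction for the two comparisons performed above.
alpha_corrected = 0.05 / 2
print(f"unpaired: t = {t_unpaired:.2f}, p = {p_unpaired:.4f}")
print(f"paired:   t = {t_paired:.2f}, p = {p_paired:.4f}")
print(f"compare each p against Bonferroni-corrected alpha = {alpha_corrected}")
```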

Journal ArticleDOI
TL;DR: In this article, the Weibull function has been used extensively for estimating a parametric probability density of the data in the field of forestry, and the problem of estimating equations for method of moments and maximum likelihood for two-and three-parameter weibull distributions are presented.
Abstract: Many of the most popular sampling schemes used in forestry are probability proportional to size methods. These methods are also referred to as size-biased because sampling is actually from a weighted form of the underlying population distribution. Length- and area-biased sampling are special cases of size-biased sampling where the probability weighting comes from a lineal or areal function of the random variable of interest, respectively. Often, interest is in estimating a parametric probability density of the data. In forestry, the Weibull function has been used extensively for such purposes. Estimating equations for method of moments and maximum likelihood for two- and three-parameter Weibull distributions are presented. Fitting is illustrated with an example from an area-biased angle-gauge sample of standing trees in a woodlot. Finally, some specific points concerning the form of the size-biased densities are reported.
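A sketch of maximum-likelihood fitting for the length-biased (weight proportional to x) two-parameter Weibull case: the size-biased density is f*(x) = x f(x)/μ with μ = b Γ(1 + 1/c), so each observation contributes log x − log μ on top of the ordinary log density. The data generation and starting values below are assumptions for illustration, not the paper's worked example:

```python
import numpy as np
from scipy import optimize, special, stats

def neg_loglik_length_biased(params, x):
    """Negative log-likelihood of a length-biased two-parameter Weibull:
    f*(x) = x f(x) / mu, with mu = b * Gamma(1 + 1/c)."""
    c, b = params
    if c <= 0 or b <= 0:
        return np.inf
    log_f = stats.weibull_min.logpdf(x, c, scale=b)
    log_mu = np.log(b) + special.gammaln(1 + 1 / c)
    return -np.sum(np.log(x) + log_f - log_mu)

# Hypothetical size-biased sample: draw many Weibull values, then
# resample with probability proportional to x.
rng = np.random.default_rng(0)
pop = stats.weibull_min.rvs(2.0, scale=10.0, size=20000, random_state=rng)
sample = rng.choice(pop, size=500, p=pop / pop.sum())

res = optimize.minimize(neg_loglik_length_biased, x0=[1.5, 8.0],
                        args=(sample,), method="Nelder-Mead")
print("ML estimates (shape, scale):", np.round(res.x, 2))
```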

Journal ArticleDOI
TL;DR: The sampling distribution is shown to be asymptotically normal to first order, and hence large-sample hypothesis tests and confidence intervals with estimates of the variances and correlation coefficients are proposed.
Abstract: Methods to unravel the genetic determinants of non-Mendelian diseases lie at the next frontier of statistical approaches for human genetics. It is generally agreed that, before proceeding with segregation or linkage analysis, the trait under study ought to be shown to exhibit familial correlation. By coding dichotomous traits as binary variables, a single robust approach in the estimation of pedigree correlations, rather than two distinct approaches, can be used to assess the potential heritability of a trait, and, latterly, to examine the mode of inheritance. The asymptotic theory to conduct hypothesis tests and confidence intervals for correlations among different members of nuclear families is well established but is applicable only if the nuclear families are independent. As a further contribution to the literature, we derive the asymptotic sampling distribution of correlations between random variables among arbitrary pairs of members in extended families for the Pearson product-moment estimator with generalized weights. This derivation is done without assuming normality of the traits. The sampling distribution is shown to be asymptotically normal to first order, and hence large-sample hypothesis tests and confidence intervals with estimates of the variances and correlation coefficients are proposed. Discussion concludes with an example and a suggestion for future research.
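For the familiar special case of independent pairs, large-sample normal theory gives the usual Fisher-z interval for a Pearson correlation. A minimal sketch follows; the paper's estimator with generalized weights for arbitrary relative pairs is more general than this, and the data here are simulated:

```python
import numpy as np
from scipy import stats

def pearson_ci(x, y, level=0.95):
    """Large-sample confidence interval for a Pearson correlation using
    the Fisher z transformation (independent observations assumed)."""
    n = len(x)
    r = np.corrcoef(x, y)[0, 1]
    z = np.arctanh(r)                        # Fisher z
    half = stats.norm.ppf(0.5 + level / 2) / np.sqrt(n - 3)
    return r, np.tanh(z - half), np.tanh(z + half)

# Simulated trait pairs for illustration.
rng = np.random.default_rng(0)
x = rng.standard_normal(200)
y = 0.5 * x + rng.standard_normal(200)
r, lo, hi = pearson_ci(x, y)
print(f"r = {r:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```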

Book
25 Jul 2003
TL;DR: This introductory social statistics textbook moves from description (organizing data, measures of central tendency and variability) through probability and sampling to decision making (tests of differences between means, analysis of variance, nonparametric tests) and measures of association (correlation, regression, and nonparametric measures of correlation).
Abstract: IN THIS SECTION 1) BRIEF 2) COMPREHENSIVE BRIEF TABLE OF CONTENTS Chapter 1 Why the Social Researcher Uses Statistics Part I Description Chapter 2 Organizing the Data Chapter 3 Measures of Central Tendency Chapter 4 Measures of Variability Part II From Description to Decision Making Chapter 5 Probability and the Normal Curve Chapter 6 Samples and Populations Part III Decision Making Chapter 7 Testing Differences Between Means Chapter 8 Analysis of Variance Chapter 9 Nonparametric Tests of Significance Part IV From Decision Making to Association Chapter 10 Correlation Chapter 11 Regression Analysis Chapter 12 Nonparametric Measures of Correlation Part V Applying Statistics Chapter 13 Applying Statistical Procedures to Research Problems COMPREHENSIVE TABLE OF CONTENTS Chapter 1 Why the Social Researcher Uses Statistics The Nature of Social Research Why Test Hypotheses? The Stages of Social Research Using Series of Numbers to Do Social Research The Functions of Statistics Looking at the Larger Picture: A Student Survey Part I Description Chapter 2 Organizing the Data Frequency Distributions of Nominal Data Comparing Distributions Proportions and Percentages Simple Frequency Distributions of Ordinal and Interval Data Grouped Frequency Distributions of Interval Data Cumulative Distributions Dealing with Decimal Data Flexible Class Intervals Cross-Tabulations Graphic Presentations Chapter 3 Measures of Central Tendency The Mode The Median The Mean Taking One Step at a Time Step-by-Step Illustration: Mode, Median and Mean Comparing the Mode, Median, and Mean Chapter 4 Measures of Variability The Range The Variance and Standard Deviation Step-by-Step Illustration: Standard deviation The Raw-Score Formula for Variance and Standard Deviation Step-by-Step Illustration: Variance and Standard Deviation Using Raw Scores The Meaning of the Standard Deviation Comparing Measures of Variability Looking at the Larger Picture: Describing Data Part II From Description to Decision Making Chapter 5 Probability and the Normal Curve Probability Probability Distributions The Normal Curve as a Probability Distribution Characteristics of the Normal Curve The Model and the Reality of the Normal Curve The Area Under the Normal Curve Finding Probability Under the Normal Curve Step-by-Step Illustration: Probability under the Normal Curve Chapter 6 Samples and Populations Random Sampling Sampling Error Sampling Distribution of Means Standard Error of the Mean Confidence Intervals The t Distribution Step-by-Step Illustration: Confidence Interval Using t Estimating Proportions Step-by-Step Illustration: Confidence Intervals for Proportions Looking at the Larger Picture: Generalizing From Samples to Populations Part III Decision Making Chapter 7 Testing Differences Between Means The Null Hypothesis: No Difference Between Means The Research Hypothesis: A Difference Between Means Sampling Distribution of Differences Between Means Testing Hypotheses with the Distribution of Differences Between Means Levels of Significance Standard Error of the Difference Between Means Testing the Difference Between Means Step-by-Step Illustration: Test of Difference Between Means Comparing the Same Sample Measured Twice Step-by-Step Illustration: Test of Difference between Means for Same Sample Measured Twice Two Sample Tests of Proportions Step-by-Step Illustration: Test of Difference Between Proportions Requirements for Testing the Difference Between Means Chapter 8 Analysis of Variance The Logic of Analysis of Variance The Sum of 
Squares Mean Square The F Ratio Step-by-Step Illustration: Analysis of Variance Requirements for Using the F Ratio Chapter 9 Nonparametric Tests of Significance The Chi-Square Test Step-by-Step Illustration: Chi-Square Test of Significance Step-by-Step Illustration: Comparing Several Groups The Median Test Step-by-Step Illustration: Median Test Looking at the Larger Picture: Testing Differences Part IV From Decision Making to Association Chapter 10 Correlation Strength of Correlation Direction of Correlation Curvilinear Correlation The Correlation Coefficient Pearson's Correlation Coefficient Step-by-Step Illustration: Pearson's Correlation Coefficient Chapter 11 Regression Analysis The Regression Model Interpreting the Regression Line Regression and Pearson's Correlation Step-by-Step Illustration: Regression Analysis Chapter 12 Nonparametric Measures of Correlation Spearman's Rank-Order Correlation Coefficient Step-by-Step Illustration: Spearman's Rank-Order Correlation Coefficient Goodman's and Kruskal's Gamma Step-by-Step Illustration: Goodman's and Kruskal's Gamma Correlation Coefficient for Nominal Data Arranged in a 2 x 2 Table Correlation Coefficients for Nominal Data in Larger Than 2 x 2 Tables Looking at the Larger Picture: Measuring Association Appendix A A Review of Some Fundamentals of Mathematics Appendix B Tables Appendix C List of Formulas Glossary Answers to Selected Problems Index

Journal ArticleDOI
TL;DR: In this paper, a random walk model was developed to test the assumption that the time-averaged Earth's magnetic field reduces to a geocentric axial dipole (GAD) when sufficiently sampled, for Mesozoic and earlier times.

Journal ArticleDOI
TL;DR: In this article, the authors compare three characterizations of the distribution of the sample coefficient alpha: the existing normal-theory-based distribution, a newly proposed distribution based on fourth-order moments, and the bootstrap empirical distribution.
Abstract: Sample coefficient alpha is commonly reported for psychological measurement scales. However, how to characterize the distribution of sample coefficient alpha with the Likert-type scales typically used in social and behavioral science research is not clear. Using the Hopkins Symptom Checklist, the authors compare three characterizations of the distribution of the sample coefficient alpha: the existing normal-theory-based distribution, a newly proposed distribution based on fourth-order moments, and the bootstrap empirical distribution. Their study indicates that the normal-theory-based distribution has a systematic bias in describing the behavior of the sample coefficient alpha. The distribution based on fourth-order moments is better than the normal-theory-based one but is still not good enough with finite samples. The bootstrap automatically takes the sampling distribution and sample size into account; thus it is recommended for characterizing the behavior of sample coefficient alpha with Likert-type scales.
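A sketch of the recommended bootstrap approach: compute sample coefficient alpha, then resample respondents with replacement to characterize its distribution. The Likert-type data below are simulated, not the Hopkins Symptom Checklist:

```python
import numpy as np

def cronbach_alpha(items):
    """Sample coefficient alpha for an (n_subjects x k_items) matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

# Hypothetical 5-point Likert data: 200 respondents, 10 items.
rng = np.random.default_rng(0)
latent = rng.standard_normal((200, 1))
items = np.clip(np.round(latent + rng.standard_normal((200, 10)) + 3), 1, 5)

alpha_hat = cronbach_alpha(items)
boot = np.array([
    cronbach_alpha(items[rng.integers(0, len(items), len(items))])
    for _ in range(2000)
])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"alpha = {alpha_hat:.3f}, bootstrap 95% CI = ({lo:.3f}, {hi:.3f})")
```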

Journal ArticleDOI
TL;DR: In this paper, the precision of a point estimator and the confidence of an interval estimator in frequentist inference are taken into account, and the problem of choice between alternative ancillary statistics is discussed.
Abstract: Summary. We argue that it can be fruitful to take a predictive view on notions such as the precision of a point estimator and the confidence of an interval estimator in frequentist inference. This predictive approach has implications for conditional inference, because it immediately allows a quantification of the concept of relevance for conditional inference. Conditioning on an ancillary statistic makes inference more relevant in this sense, provided that the ancillary is a precision index. Not all ancillary statistics satisfy this demand. We discuss the problem of choice between alternative ancillary statistics. The approach also has implications for the best choice of variance estimator, taking account of correlations with the squared error of estimation itself. The theory is illustrated by numerous examples, many of which are classical.

Journal ArticleDOI
TL;DR: In this paper, an approach based on the Sukhatme-Renyi representation of exponential order statistics is presented, which gives a new insight into the problem of selecting a winner.
Abstract: Let Mn be the maximum of a sample X1,...,Xn from a discrete distribution and let Wn be the number of i's, 1 ≤ i ≤ n, such that Xi = Mn. We discuss the asymptotic behavior of the distribution of Wn as n → ∞. The probability that the maximum is unique is of interest in diverse problems, for example, in connection with an algorithm for selecting a winner, and has been studied by several authors using mainly analytic tools. We present here an approach based on the Sukhatme-Renyi representation of exponential order statistics, which gives, as we think, a new insight into the problem.
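The behavior of Wn is easy to explore by simulation. The sketch below estimates the probability that the maximum is unique for geometric samples, a case where this probability is known to oscillate rather than converge as n grows:

```python
import numpy as np

def prob_unique_max(n, sampler, n_sim=10_000, seed=0):
    """Monte Carlo estimate of P(W_n = 1), the probability that the
    sample maximum of a discrete sample is attained exactly once."""
    rng = np.random.default_rng(seed)
    samples = sampler(rng, (n_sim, n))
    m = samples.max(axis=1)
    w = (samples == m[:, None]).sum(axis=1)
    return (w == 1).mean()

# Geometric(1/2) samples: P(unique maximum) oscillates with n.
for n in (10, 100, 1000):
    p = prob_unique_max(n, lambda rng, shape: rng.geometric(0.5, shape))
    print(f"n = {n:4d}: P(maximum unique) = {p:.3f}")
```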

Journal ArticleDOI
TL;DR: Six statistics for evaluating a structural equation model are extended from the conventional context to the multilevel context; these statistics are asymptotically distribution free, that is, their distributions do not depend on the sampling distribution when sample size at the highest level is large enough.

Journal ArticleDOI
TL;DR: In this paper, the authors formulated notions of both the experimental design for angular sampling and the kernel response function for an arbitrary set of angular measurements, and used the Fisher information matrix to circumscribe the optimal design problem of searching for the best angular sampling.

Journal ArticleDOI
TL;DR: In this article, reliability sampling plans for the Weibull distribution under Type II progressive censoring with random removals (PCR), where the number of units removed at each failure time follows a binomial distribution, are presented.
Abstract: This paper presents reliability sampling plans for the Weibull distribution under Type II progressive censoring with random removals (PCR), where the number of units removed at each failure time follows a binomial distribution. To construct the sampling plans, the sample size n and the acceptance constant k are determined based on asymptotic distribution theory. The resulting sampling plans are tabulated for selected specifications under the proposed censoring scheme. Furthermore, a Monte Carlo simulation is conducted to validate the true probability of acceptance for the designed sampling plans.
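Generating a Type II progressively censored Weibull sample with binomial removals can be sketched directly; the parameter values below are illustrative, and the paper's determination of the sample size n and acceptance constant k is not reproduced here:

```python
import numpy as np

def progressive_censored_weibull(n, m, p, shape, scale, rng):
    """Type II progressively censored Weibull sample with binomial
    removals: after each observed failure, R_i ~ Binomial(max removable, p)
    surviving units are withdrawn at random from the test."""
    survivors = list(scale * rng.weibull(shape, size=n))
    observed = []
    for i in range(m):
        t = min(survivors)                       # next failure on test
        observed.append(t)
        survivors.remove(t)
        if i < m - 1 and survivors:
            max_r = len(survivors) - (m - 1 - i) # keep enough units for m failures
            r = rng.binomial(max_r, p) if max_r > 0 else 0
            drop = set(rng.choice(len(survivors), size=r, replace=False))
            survivors = [x for j, x in enumerate(survivors) if j not in drop]
    return np.array(observed)

# Illustrative parameter values only.
rng = np.random.default_rng(0)
sample = progressive_censored_weibull(n=30, m=10, p=0.3,
                                      shape=2.0, scale=100.0, rng=rng)
print("observed failure times:", np.round(sample, 1))
```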


Book
01 Jan 2003
TL;DR: This business statistics textbook covers descriptive statistics, probability, discrete and continuous random variables, sampling and sampling distributions, confidence intervals, hypothesis testing, experimental design and analysis of variance, and regression analysis.
Abstract: 1. An Introduction to Business Statistics 2. Descriptive Statistics: Tabular and Graphical Methods 3. Descriptive Statistics: Numerical Methods 4. Probability 5. Discrete Random Variables 6. Continuous Random Variables 7. Sampling and Sampling Distributions 8. Confidence Intervals 9. Hypothesis Testing 10. Statistical Inferences Based on Two Samples 11. Experimental Design and Analysis of Variance 12. Chi-Square Tests 13. Simple Linear Regression Analysis 14. Multiple Regression and Model Building Appendix A: Statistical Tables Answers to Most Odd-Numbered Exercises References Photo Credits Index On the Website: 15. Process Improvement Using Control Charts Appendix B: Properties of the Mean and the Variance of a Random Variable and the Co-variance Appendix C: Derivatives of the Mean and Variance of x(bar) and p(hat) Appendix D: Confidence Intervals for Parameters of Finite Populations Appendix E: Logistic Regression

01 Jan 2003
TL;DR: In this paper, the authors show that the bootstrap procedure is asymptotically valid for a class of M-estimates provided the resample size mn satisfies mn → ∞ and mn/n → 0 as the original sample size n goes to infinity.
Abstract: The limiting distribution for M-estimates in a regression or autoregression model with heavy-tailed noise is generally intractable, which precludes its use for inference purposes. Alternatively, the bootstrap can be used to approximate the sampling distribution of the M-estimate. In this paper, we show that the bootstrap procedure is asymptotically valid for a class of M-estimates provided the bootstrap resample size mn satisfies mn → ∞ and mn/n → 0 as the original sample size n goes to infinity.
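A sketch of the m-out-of-n resampling scheme on a heavy-tailed example, using the sample median as a simple M-estimate. The paper's setting is regression and autoregression; the iid data and the choice m = √n below are common illustrative assumptions, not prescribed by the paper:

```python
import numpy as np

def m_out_of_n_bootstrap(data, estimator, m, n_boot=2000, seed=0):
    """m-out-of-n bootstrap: approximate the sampling distribution of an
    estimator by recomputing it on resamples of size m << n, the regime
    (m -> inf, m/n -> 0) under which the procedure is shown to be valid."""
    rng = np.random.default_rng(seed)
    n = len(data)
    return np.array([
        estimator(data[rng.integers(0, n, size=m)]) for _ in range(n_boot)
    ])

# Heavy-tailed sample where normal asymptotics are unreliable.
rng = np.random.default_rng(1)
x = rng.standard_cauchy(10_000)
m = int(len(x) ** 0.5)                 # e.g. m = sqrt(n)
boot = m_out_of_n_bootstrap(x, np.median, m)
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"median = {np.median(x):.3f}, central 95% of bootstrap "
      f"distribution: ({lo:.3f}, {hi:.3f})")
```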

Journal ArticleDOI
TL;DR: This article compares the properties of three sampling distributions—the independence chain, the random walk chain, and the Taylored chain suggested by Geweke and Tanizaki—and concludes that the independence chain is preferred.
Abstract: The Metropolis-Hastings algorithm has been important in the recent development of Bayes methods. This algorithm generates random draws from a target distribution utilizing a sampling (or proposal) distribution. This article compares the properties of three sampling distributions—the independence chain, the random walk chain, and the Taylored chain suggested by Geweke and Tanizaki (Geweke, J., Tanizaki, H. (1999). On Markov Chain Monte-Carlo methods for nonlinear and non-Gaussian state-space models. Communications in Statistics, Simulation and Computation 28(4):867–894, Geweke, J., Tanizaki, H. (2001). Bayesian estimation of state-space model using the Metropolis-Hastings algorithm within Gibbs sampling. Computational Statistics and Data Analysis 37(2):151–170).
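The difference between the first two proposal types is easy to demonstrate on a toy target. The sketch below runs a random-walk chain and an independence chain against a N(3, 1) target; the Taylored chain, which requires a problem-specific expansion of the target, is omitted, and all tuning constants are assumptions:

```python
import numpy as np

def mh(log_target, propose, log_q_ratio, x0, n_iter, rng):
    """Generic Metropolis-Hastings loop; log_q_ratio(x, y) must return
    log q(x|y) - log q(y|x), the proposal part of the acceptance ratio."""
    x, lp = x0, log_target(x0)
    out, acc = np.empty(n_iter), 0
    for i in range(n_iter):
        y = propose(x, rng)
        lpy = log_target(y)
        if np.log(rng.uniform()) < lpy - lp + log_q_ratio(x, y):
            x, lp = y, lpy
            acc += 1
        out[i] = x
    return out, acc / n_iter

log_target = lambda x: -0.5 * (x - 3.0) ** 2      # toy N(3, 1) target
rng = np.random.default_rng(0)

# Random walk chain: symmetric proposal, so the q-ratio cancels.
rw, acc_rw = mh(log_target, lambda x, r: x + r.normal(0.0, 1.0),
                lambda x, y: 0.0, 0.0, 10_000, rng)

# Independence chain: N(0, 3^2) proposal that ignores the current state.
log_q = lambda z: -0.5 * (z / 3.0) ** 2
ind, acc_ind = mh(log_target, lambda x, r: r.normal(0.0, 3.0),
                  lambda x, y: log_q(x) - log_q(y), 0.0, 10_000, rng)

print(f"random walk:  mean {rw.mean():.2f}, acceptance {acc_rw:.2f}")
print(f"independence: mean {ind.mean():.2f}, acceptance {acc_ind:.2f}")
```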

Journal ArticleDOI
TL;DR: In this paper, the role of the sampling distribution in student understanding of statistical inference is discussed, and recommendations concerning the content and conduct of teaching and learning strategies in this area are made.
Abstract: Many statistics educators believe that few students develop the level of conceptual understanding essential for them to apply correctly the statistical techniques at their disposal and to interpret their outcomes appropriately. It is also commonly believed that the sampling distribution plays an important role in developing this understanding. This study clarifies the role of the sampling distribution in student understanding of statistical inference, and makes recommendations concerning the content and conduct of teaching and learning strategies in this area.
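The kind of classroom demonstration this literature builds on: repeatedly draw samples from a skewed population and watch the spread of the sample means shrink like σ/√n. The population and sample sizes below are arbitrary illustrative choices:

```python
import numpy as np

# Classroom-style demonstration of a sampling distribution: repeatedly
# draw samples from a skewed population and summarize the sample means.
rng = np.random.default_rng(0)
population = rng.exponential(scale=2.0, size=1_000_000)  # skewed, sd = 2

for n in (2, 10, 50):
    idx = rng.integers(0, len(population), size=(5000, n))
    means = population[idx].mean(axis=1)
    print(f"n = {n:3d}: mean of sample means = {means.mean():.3f}, "
          f"SD = {means.std(ddof=1):.3f} (theory: {2.0 / np.sqrt(n):.3f})")
```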

Book ChapterDOI
01 Jan 2003
TL;DR: This monograph shall consider certain classes of dependent processes and point out situations where different types of bootstrap methods can be applied effectively, and also look at situations where these methods run into problems and point out possible remedies, if there is one known.
Abstract: The bootstrap is a computer-intensive method that provides answers to a large class of statistical inference problems without stringent structural assumptions on the underlying random process generating the data. Since its introduction by Efron (1979), the bootstrap has found its application to a number of statistical problems, including many standard ones, where it has outperformed the existing methodology as well as to many complex problems where conventional approaches failed to provide satisfactory answers. However, it is not a panacea for every problem of statistical inference, nor does it apply equally effectively to every type of random process in its simplest form. In this monograph, we shall consider certain classes of dependent processes and point out situations where different types of bootstrap methods can be applied effectively, and also look at situations where these methods run into problems and point out possible remedies, if there is one known.
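One standard bootstrap variant for dependent data is the moving block bootstrap. A minimal sketch on a simulated AR(1) series follows; the block length and series parameters are illustrative assumptions:

```python
import numpy as np

def moving_block_bootstrap(x, block_len, n_boot=2000, seed=0):
    """Moving block bootstrap: resample overlapping blocks of length
    block_len and concatenate them, preserving the short-range
    dependence that the ordinary iid bootstrap would destroy."""
    rng = np.random.default_rng(seed)
    n = len(x)
    n_blocks = int(np.ceil(n / block_len))
    starts = rng.integers(0, n - block_len + 1, size=(n_boot, n_blocks))
    out = np.empty(n_boot)
    for b in range(n_boot):
        series = np.concatenate([x[s:s + block_len] for s in starts[b]])[:n]
        out[b] = series.mean()
    return out

# Simulated AR(1) series: dependent data where block resampling applies.
rng = np.random.default_rng(2)
e = rng.standard_normal(500)
x = np.zeros(500)
for t in range(1, 500):
    x[t] = 0.7 * x[t - 1] + e[t]

boot_means = moving_block_bootstrap(x, block_len=20)
print(f"bootstrap SE of the mean: {boot_means.std(ddof=1):.3f}")
```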

Book
01 Jan 2003
TL;DR: This workbook offers a first look at MINITAB, beginning with launching the software, entering data into a worksheet, and creating a chart, and progressing through descriptive statistics, probability, inference, regression, forecasting, nonparametric tests, and statistical process control.
Abstract: Typographical Conventions. 1. A FIRST LOOK AT MINITAB. Objectives. Launching MINITAB. Entering Data into a Worksheet. Saving a Worksheet. Creating a Chart. Saving a Project. Getting Help. Printing in MINITAB. Quitting MINITAB. 2. TABLES AND GRAPHS FOR ONE VARIABLE. Objectives. Opening a Worksheet. A Dotplot. Exploring the Data with Stem-and-Leaf. Creating a Histogram. Frequency Distributions with Tally. Printing Session Output. Another Bar Chart. Moving On... 3. TABLES AND GRAPHS FOR TWO VARIABLES. Objectives. Cross-Tabulating Data. Editing Your Most Recent Dialog. More on Bar Charts. Comparing Two Distributions. Scatterplots to Detect Relationships. Moving On... 4. ONE-VARIABLE DESCRIPTIVE STATISTICS. Objectives. Computing One Summary Measure for a Variable. Computing Several Summary Measures. Generating a Box-and-Whiskers Plot. Standardizing a Variable. Moving On... 5. TWO-VARIABLE DESCRIPTIVE STATISTICS. Objectives. Comparing Dispersion with the Coefficient of Variation. Descriptive Measures for Subsamples. Measures of Association: Covariance and Correlation. Moving On... 6. ELEMENTARY PROBABILITY. Objectives. Simulation. A Classical Example. Observed Relative Frequency as Probability. Handling Alphanumeric Data. Moving On... 7. DISCRETE PROBABILITY DISTRIBUTIONS. Objectives. An Empirical Discrete Distribution. Graphing a Distribution. Transferring Session Output to the Worksheet. Computing the Expected Value of a Theoretical Distribution: The Binomial. Another Theoretical Distribution: The Poisson. Moving On... 8. PROBABILITY DENSITY FUNCTIONS. Objectives. Continuous Random Variables. Generating Normal Distributions. Finding Areas under a Normal Curve. Normal Curves as Models. Moving On... 9. SAMPLING DISTRIBUTIONS. Objectives. What is a Sampling Distribution? Sampling from a Normal Population. Central Limit Theorem. Sampling Distribution of the Proportion. Moving On... 10. CONFIDENCE INTERVALS. Objectives. The Concept of a Confidence Interval. Effect of Confidence Coefficient. Large Samples from a Non-normal (Known) Population. Dealing with Real Data. Small Samples from a Normal Population. Confidence Interval for a Population Proportion. Moving On... 11. ONE-SAMPLE HYPOTHESIS TESTS. Objectives. The Logic of Hypothesis Testing. An Artificial Example. A More Realistic Case: We Don't Know Sigma. A Small-Sample Example. A Test Involving Proportion. Moving On... 12. TWO-SAMPLE HYPOTHESIS TESTS. Objectives. Working with Two Samples. Matched vs. Independent Samples. Comparing Two Proportions. Moving On... 13. CHI-SQUARE TESTS. Objectives. Review of Qualitative vs. Quantitative Data. Goodness-of-Fit Testing. A First Example: Simple Genetics. Testing for Independence. Testing for Independence (Summary Data Only). Moving On... 14. ANALYSIS OF VARIANCE. Objectives. Comparing the Means of More than Two Samples. A Simple Example. ANOVA and the Two-Sample t-Tests. Another Example. Unstacked Data. A Two-Way ANOVA. Moving On... 15. LINEAR REGRESSION (I). Objectives. Linear Relationships. Another Example. Inference from Output. An Example of a Questionable Relationship. An Estimation Application. A Classic Example. Moving On... 16. LINEAR REGRESSION (II). Objectives. Assumptions for Least Squares Regression. Examining Residuals to Check Assumptions. A Time-Series Example. Issues in Forecasting and Prediction. A Caveat about "Mindless" Regression. Moving On... 17. MULTIPLE REGRESSION. Objectives. Going Beyond a Single Explanatory Variable. Significance Testing and Goodness of Fit.
Prediction and Residual Analysis. Adding More Variables. A New Concern. Another Example. Working with Qualitative Variables. Moving On... 18. NON-LINEAR MODELS. Objectives. When Relationships Are Not Linear. A Simple Example. Some Common Transformations. Another Quadratic Model. A Logarithmic Transformation. Adding More Variables. Moving On... 19. BASIC FORECASTING TECHNIQUES. Objectives. Detecting Patterns over Time. Some Illustrative Examples. Forecasting Using Moving Averages. Forecasting Using Exponential Smoothing. Forecasting Using Trend Analysis. Moving On... 20. NONPARAMETRIC TESTS. Objectives. Nonparametric Methods. A Sign Test. A Wilcoxon Signed Rank Test. Mann-Whitney U Test. Kruskal-Wallis Test. Spearman Rank Order Correlation. A Runs Test. Moving On... 21. STATISTICAL PROCESS CONTROL. Objectives. Processes and Variation. Charting a Process Mean. Charting a Process Range. Another Example. Charting a Process Proportion. Moving On... Appendix A: Dataset Descriptions. Appendix B: Working with Files. Objectives. Worksheets. Session and History Files. Graph Files. MINITAB Projects. Converting Other Data Files into MINITAB Worksheets. Appendix C: Organizing a Worksheet. Choices. Stacked Data. Unstacked Data. Summarized Data. Appendix D: Working with Other Minitab Releases. Objectives. Differences Between Release 14 and Earlier Versions. Issues for Student Version 14 Users. Summary of Sessions Where Differences Arise. Workarounds for Earlier Releases. Commands Without Equivalents in Earlier Releases. Index.