scispace - formally typeset

Showing papers on "Population proportion published in 2003"


Book
01 Jan 2003
TL;DR: In this book, the authors present an introductory statistics course spanning data collection, descriptive statistics, probability distributions, and statistical inference, with "Making an Informed Decision" activities and case studies in each chapter.
Abstract: Preface to the Instructor Supplements Technology Resources An Introduction to the Applets Applications Index PART 1. GETTING THE INFORMATION YOU NEED 1. Data Collection 1.1 Introduction to the Practice of Statistics 1.2 Observational Studies versus Designed Experiments 1.3 Simple Random Sampling 1.4 Other Effective Sampling Methods 1.5 Bias in Sampling 1.6 The Design of Experiments Chapter 1 Review Chapter Test Making an Informed Decision: What Movie Should I Go To? Case Study: Chrysalises for Cash PART 2. DESCRIPTIVE STATISTICS 2. Organizing and Summarizing Data 2.1 Organizing Qualitative Data 2.2 Organizing Quantitative Data: The Popular Displays 2.3 Additional Displays of Quantitative Data 2.4 Graphical Misrepresentations of Data Chapter 2 Review Chapter Test Making an Informed Decision: Tables or Graphs? Case Study: The Day the Sky Roared 3. Numerically Summarizing Data 3.1 Measures of Central Tendency 3.2 Measures of Dispersion 3.3 Measures of Central Tendency and Dispersion from Grouped Data 3.4 Measures of Position and Outliers 3.5 The Five-Number Summary and Boxplots Chapter 3 Review Chapter Test Making an Informed Decision: What Car Should I Buy? Case Study: Who Was A Mourner? 4. Describing the Relation between Two Variables 4.1 Scatter Diagrams and Correlation 4.2 Least-Squares Regression 4.3 Diagnostics on the Least-Squares Regression Line 4.4 Contingency Tables and Association 4.5 Nonlinear Regression: Transformations (on CD) Chapter 4 Review Chapter Test Making an Informed Decision: What Car Should I Buy? Case Study: Thomas Malthus, Population, and Subsistence PART 3. PROBABILITY AND PROBABILITY DISTRIBUTIONS 5. Probability 5.1 Probability Rules 5.2 The Addition Rule and Complements 5.3 Independence and the Multiplication Rule 5.4 Conditional Probability and the General Multiplication Rule 5.5 Counting Techniques 5.6 Putting It Together: Which Method Do I Use? 
5.7 Bayes's Rule (on CD) Chapter 5 Review Chapter Test Making an Informed Decision: Sports Probabilities Case Study: The Case of the Body in the Bag 6. Discrete Probability Distributions 6.1 Discrete Random Variables 6.2 The Binomial Probability Distribution 6.3 The Poisson Probability Distribution 6.4 The Hypergeometric Probability Distribution (on CD) Chapter 6 Review Chapter Test Making an Informed Decision: Should We Convict? Case Study: The Voyage of the St. Andrew 7. The Normal Probability Distribution 7.1 Properties of the Normal Distribution 7.2 The Standard Normal Distribution 7.3 Applications of the Normal Distribution 7.4 Assessing Normality 7.5 The Normal Approximation to the Binomial Probability Distribution Chapter 7 Review Chapter Test Making an Informed Decision: Join the Club Case Study: A Tale of Blood Chemistry and Health PART 4. INFERENCE: FROM SAMPLES TO POPULATION 8. Sampling Distributions 8.1 Distribution of the Sample Mean 8.2 Distribution of the Sample Proportion Chapter 8 Review Chapter Test Making an Informed Decision: How Much Time Do You Spend in a Day...? Case Study: Sampling Distribution of the Median 9. Estimating the Value of a Parameter Using Confidence Intervals 9.1 The Logic in Constructing Confidence Intervals for a Population Mean When the Population Standard Deviation Is Known 9.2 Confidence Intervals for a Population Mean When the Population Standard Deviation Is Unknown 9.3 Confidence Intervals for a Population Proportion 9.4 Confidence Intervals for a Population Standard Deviation 9.5 Putting It Together: Which Procedure Do I Use? Chapter 9 Review Chapter Test Making an Informed Decision: What's Your Major? Case Study: When Model Requirements Fail 10. 
Hypothesis Tests Regarding a Parameter 10.1 The Language of Hypothesis Testing 10.2 Hypothesis Tests for a Population Mean--Population Standard Deviation Known 10.3 Hypothesis Tests for a Population Mean--Population Standard Deviation Unknown 10.4 Hypothesis Tests for a Population Proportion 10.5 Hypothesis Tests for a Population Standard Deviation 10.6 Putting It Together: Which Method Do I Use? 10.7 The Probability of a Type II Error and the Power of the Test Chapter 10 Review Chapter Test Making an Informed Decision: What Does It Really Weigh? Case Study: How Old Is Stonehenge? 11. Inferences on Two Samples 11.1 Inference about Two Means: Dependent Samples 11.2 Inference about Two Means: Independent Samples 11.3 Inference about Two Population Proportions 11.4 Inference for Two Population Standard Deviations 11.5 Putting It Together: Which Method Do I Use? Chapter 11 Review Chapter Test Making an Informed Decision: Where Should I Invest? Case Study: Control in the Design of an Experiment 12. Inference on Categorical Data 12.1 Goodness-of-Fit Test 12.2 Tests for Independence and the Homogeneity of Proportions Chapter 12 Review Chapter Test Making an Informed Decision: Benefits of College Case Study: Feeling Lucky? Well, Are You? 13. Comparing Three or More Means 13.1 Comparing Three or More Means (One-Way Analysis of Variance) 13.2 Post Hoc Tests on One-Way Analysis of Variance 13.3 The Randomized Complete Block Design 13.4 Two-Way Analysis of Variance Chapter 13 Review Chapter Test Making an Informed Decision: Where Should I Invest? Part II Case Study: Hat Size and Intelligence 14. Inference on the Least-Squares Regression Model and Multiple Regression 14.1 Testing the Significance of the Least-Squares Regression Model 14.2 Confidence and Prediction Intervals 14.3 Multiple Regression Chapter 14 Review Chapter Test Making an Informed Decision: Buying a Car Case Study: Housing Boom 15. 
Nonparametric Statistics (on CD) 15.1 An Overview of Nonparametric Statistics 15.2 Runs Test for Randomness 15.3 Inferences about Measures of Central Tendency 15.4 Inferences about the Difference between Two Medians: Dependent Samples 15.5 Inferences about the Difference between Two Medians: Independent Samples 15.6 Spearman's Rank-Correlation Test 15.7 Kruskal--Wallis Test Chapter Test Making an Informed Decision: Where Should I Live? Case Study: Evaluating Alabama's 1891 House Bill 504 Appendix A Tables Appendix A Tables (on CD) Appendix B Lines (on CD) Photo Credits Answers Index

59 citations


Journal ArticleDOI
TL;DR: Proper understanding and use of fundamental statistics and their calculations will allow more reliable analysis, interpretation, and communication of clinical information among health care providers and between these providers and their patients.
Abstract: In radiology, appropriate diagnoses are often based on quantitative data. However, these data contain inherent variability. Radiologists often see P values in the literature but are less familiar with other ways of reporting statistics. Statistics such as the SD and standard error of the mean (SEM) are commonly used in radiology, whereas the CI is not often used. Because the SEM is smaller than the SD, it is often inappropriately used in order to make the variability of the data look tighter. However, unlike the SD, which quantifies the variability of the actual data for a single sample, the SEM represents the precision for an estimated mean of a general population taken from many sample means. Since readers are usually interested in knowing about the variability of the single sample, the SD often is the preferred statistic. Statistical calculations combine sample size and variability (ie, the SD) to generate a CI for a population proportion or population mean. CIs enable researchers to estimate populatio...
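The abstract's distinction between the SD, the SEM, and a CI can be made concrete with a short Python sketch (illustrative only; the function name and the large-sample z = 1.96 multiplier are our assumptions, not the article's):

```python
import math

def summarize(sample):
    """Return (mean, SD, SEM, 95% CI) for one sample of measurements."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample SD (n - 1 denominator): variability of the actual data.
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    # SEM: precision of the estimated mean; always smaller than the SD,
    # which is why substituting it for the SD makes data look tighter.
    sem = sd / math.sqrt(n)
    # Large-sample 95% CI for the population mean (z = 1.96).
    ci = (mean - 1.96 * sem, mean + 1.96 * sem)
    return mean, sd, sem, ci
```

For the five values 1 through 5 this returns a mean of 3.0, an SD of about 1.58, and an SEM of about 0.71, showing how much smaller the SEM already is at n = 5.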

55 citations


Journal ArticleDOI
TL;DR: This letter responds to North et al., who proposed the formula (r+1)/(n+1) to replace the conventional estimator r/n for empirical P values from Monte Carlo procedures; it argues that, contrary to their claims of being "correct" and "most accurate," the new formula is biased and the conventional formula is the correct one.
Abstract: To the Editor: North et al. (2002) propose a new formula for the empirical estimation of P values by Monte Carlo methods to replace a standard conventional estimator. They claim that their new formula is “correct” and “most accurate” and that the conventional formula is “not strictly correct,” repeating this claim many times in their letter. The claim, however, is incorrect, and the conventional formula is the correct one. The North et al. claim arises when a test statistic (called here “t”) takes a certain numerical value (called here “t*”) when calculated from data from some experiment, and it is required to find an unbiased estimate of the P value corresponding to t* by Monte Carlo simulation. This is done by performing n Monte Carlo simulations, all performed under the null hypothesis tested in the original experiment and with the same sample size and other characteristics as for the original experiment. Suppose, to be concrete, that sufficiently large positive values of the test statistic t are significant. Then, we define “r” as the number of simulations in which the simulation value of t is greater than or equal to the observed value t*. North et al. claim that an unbiased, and thus preferred, estimate of the P value arising from these simulations is (r+1)/(n+1) instead of the conventional estimate r/n. This claim is incorrect. Strangely, North et al. (2002) themselves show by algebra that the mean value of their estimator (r+1)/(n+1) is (nP+1)/(n+1), where “P” is the P value to be estimated. 
Since this is not equal to P, their P value estimator is biased. Further, their calculation also shows that the mean value of the conventional estimator r/n, whose use they do not recommend, is the desired value P. Thus, the conventional estimator is unbiased. Thus, there is an internal inconsistency in their argument, and their algebraic calculations contradict their claim and the argument leading to it. The algebraic calculations are correct. It is important to see why the argument given in North et al. (2002) is incorrect, since the reasoning involved relates to the theory and practice of Monte Carlo simulation procedures that are performed increasingly in genetics, in particular to questions surrounding P values and type 1 errors. The incorrect argument given by North et al. (2002) is that if the original data were generated under the null hypothesis tested, then, in all, n+1 “experiments” were conducted, of which one is real and n are simulations. With r as defined above, in r+1 of these, the value of the statistic t is either equal to the observed value t* or is greater than this value. It is then claimed that the estimator (r+1)/(n+1) is an unbiased estimator of the null hypothesis probability that the test statistic t exceeds t* when the null hypothesis is true. The error in this argument is, perhaps, best demonstrated by considering parallel reasoning used in the genetic ascertainment sampling context, exemplified as follows. 
Suppose that we wish to estimate the proportion of girls in a population, using a sample of families from that population. However, the sampling procedure is such that only families in which the oldest child is a girl are included in the sample. Clearly, using all children in the sample to estimate the proportion of girls in the population is incorrect, and the sample proportion of girls will overestimate the population proportion. The oldest child in each family, automatically included in the category of interest (girls), must be excluded in the estimation process. The analogy with the Monte Carlo case is that the observed value of the test statistic found from the actual data must be excluded in estimating a P value, since it is similarly automatically included in the category of interest (greater than or equal to itself). Any mathematical calculation concerning P values that does not take this into account will be incorrect. It now appears that North et al. (2002) used mistaken terminology, and that the claim that they wished to make does not concern P value estimation, but that use of (r+1)/(n+1) “provides the correct type 1 error rate.” More precisely, if the type 1 error is chosen to be α, then it is claimed that rejecting the null hypothesis when (r+1)/(n+1)<α leads to the desired type 1 error of α. To see this in formal statistical terms, the null hypothesis is rejected, with the notation and assumptions given above, if the value of r is “too low.” More specifically, with the chosen type 1 error of α, the null hypothesis is rejected if r
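The bias claim at the heart of the letter is easy to check numerically. A small Python sketch (ours, not the letter's) computes the exact expectation of both estimators when r ~ Binomial(n, P):

```python
from math import comb

def estimator_means(n, P):
    """Exact expected values of the two Monte Carlo P-value estimators,
    with r ~ Binomial(n, P) over repeated simulation experiments.

    Returns (E[r/n], E[(r+1)/(n+1)]).  Per the letter's algebra, the
    conventional estimator r/n has mean P (unbiased), while the
    North et al. estimator has mean (n*P + 1)/(n + 1) (biased).
    """
    pmf = [comb(n, r) * P**r * (1 - P) ** (n - r) for r in range(n + 1)]
    conventional = sum(p * (r / n) for r, p in enumerate(pmf))
    alternative = sum(p * ((r + 1) / (n + 1)) for r, p in enumerate(pmf))
    return conventional, alternative
```

With n = 100 and P = 0.05 this yields 0.05 for the conventional estimator and 6/101 ≈ 0.0594 for the alternative, matching the (nP+1)/(n+1) formula quoted above.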

27 citations


Journal ArticleDOI
TL;DR: The results of the simulation study indicated that the Score interval usually outperformed the Wald interval, suggesting that the Score interval is a viable method of constructing confidence intervals for the population mean of a rating scale item.
Abstract: This article presents a generalization of the Score method of constructing confidence intervals for the population proportion (E. B. Wilson, 1927) to the case of the population mean of a rating scale item. A simulation study was conducted to assess the properties of the Score confidence interval in relation to the traditional Wald (A. Wald, 1943) confidence interval under a variety of conditions, including sample size, number of response options, extremeness of the population mean, and kurtosis of the response distribution. The results of the simulation study indicated that the Score interval usually outperformed the Wald interval, suggesting that the Score interval is a viable method of constructing confidence intervals for the population mean of a rating scale item.
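For readers comparing the two procedures, here is a minimal Python sketch of the Wald and Wilson score intervals for a single proportion, the binary special case that the article generalizes (function names and the default z are our choices):

```python
import math

def wald_interval(x, n, z=1.96):
    """Traditional Wald confidence interval for a population proportion."""
    p = x / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

def score_interval(x, n, z=1.96):
    """Wilson (1927) score confidence interval for a population proportion."""
    p = x / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2))
    return center - half, center + half
```

With x = 0 successes the Wald interval collapses to a single point at zero, while the score interval retains positive width, one familiar reason the score interval tends to perform better in simulations like the one reported here.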

20 citations


Book
01 Jan 2003
TL;DR: This book is a hands-on lab manual for MINITAB, working from launching the software, entering data into a worksheet, and creating charts through inference, regression, forecasting, nonparametric tests, and statistical process control.
Abstract: Typographical Conventions. 1. A FIRST LOOK AT MINITAB. Objectives. Launching MINITAB. Entering Data into a Worksheet. Saving a Worksheet. Creating a Chart. Saving a Project. Getting Help. Printing in MINITAB. Quitting MINITAB. 2. TABLES AND GRAPHS FOR ONE VARIABLE. Objectives. Opening a Worksheet. A Dotplot. Exploring the Data with Stem-and-Leaf. Creating a Histogram. Frequency Distributions with Tally. Printing Session Output. Another Bar Chart. Moving On... 3. TABLES AND GRAPHS FOR TWO VARIABLES. Objectives. Cross-Tabulating Data. Editing Your Most Recent Dialog. More on Bar Charts. Comparing Two Distributions. Scatterplots to Detect Relationships. Moving On... 4. ONE-VARIABLE DESCRIPTIVE STATISTICS. Objectives. Computing One Summary Measure for a Variable. Computing Several Summary Measures. Generating a Box-and-Whiskers Plot. Standardizing a Variable. Moving On... 5. TWO-VARIABLE DESCRIPTIVE STATISTICS. Objectives. Comparing Dispersion with the Coefficient of Variation. Descriptive Measures for Subsamples. Measures of Association: Covariance and Correlation. Moving On... 6. ELEMENTARY PROBABILITY. Objectives. Simulation. A Classical Example. Observed Relative Frequency as Probability. Handling Alphanumeric Data. Moving On... 7. DISCRETE PROBABILITY DISTRIBUTIONS. Objectives. An Empirical Discrete Distribution. Graphing a Distribution. Transferring Session Output to the Worksheet. Computing the Expected Value of a Theoretical Distribution: The Binomial. Another Theoretical Distribution: The Poisson. Moving On... 8. PROBABILITY DENSITY FUNCTIONS. Objectives. Continuous Random Variables. Generating Normal Distributions. Finding Areas under a Normal Curve. Normal Curves as Models. Moving On... 9. SAMPLING DISTRIBUTIONS. Objectives. What is a Sampling Distribution? Sampling from a Normal Population. Central Limit Theorem. Sampling Distribution of the Proportion. Moving On... 10. CONFIDENCE INTERVALS. Objectives. The Concept of a Confidence Interval. 
Effect of Confidence Coefficient. Large Samples from a Non-normal (Known) Population. Dealing with Real Data. Small Samples from a Normal Population. Confidence Interval for a Population Proportion. Moving On... 11. ONE-SAMPLE HYPOTHESIS TESTS. Objectives. The Logic of Hypothesis Testing. An Artificial Example. A More Realistic Case: We Don't Know Sigma. A Small-Sample Example. A Test Involving Proportion. Moving On... 12. TWO-SAMPLE HYPOTHESIS TESTS. Objectives. Working with Two Samples. Matched vs. Independent Samples. Comparing Two Proportions. Moving On... 13. CHI-SQUARE TESTS. Objectives. Review of Qualitative vs. Quantitative Data. Goodness-of-Fit Testing. A First Example: Simple Genetics. Testing for Independence. Testing for Independence (Summary Data Only). Moving On... 14. ANALYSIS OF VARIANCE. Objectives. Comparing the Means of More than Two Samples. A Simple Example. ANOVA and the Two-Sample t-Tests. Another Example. Unstacked Data. A Two-Way ANOVA. Moving On... 15. LINEAR REGRESSION (I). Objectives. Linear Relationships. Another Example. Inference from Output. An Example of a Questionable Relationship. An Estimation Application. A Classic Example. Moving On... 16. LINEAR REGRESSION (II). Objectives. Assumptions for Least Squares Regression. Examining Residuals to Check Assumptions. A Time-Series Example. Issues in Forecasting and Prediction. A Caveat about "Mindless" Regression. Moving On... 17. MULTIPLE REGRESSION. Objectives. Going Beyond a Single Explanatory Variable. Significance Testing and Goodness of Fit. Prediction and Residual Analysis. Adding More Variables. A New Concern. Another Example. Working with Qualitative Variables. Moving On... 18. NON-LINEAR MODELS. Objectives. When Relationships Are Not Linear. A Simple Example. Some Common Transformations. Another Quadratic Model. A Logarithmic Transformation. Adding More Variables. Moving On... 19. BASIC FORECASTING TECHNIQUES. Objectives. Detecting Patterns over Time. Some Illustrative Examples. 
Forecasting Using Moving Averages. Forecasting Using Exponential Smoothing. Forecasting Using Trend Analysis. Moving On... 20. NONPARAMETRIC TESTS. Objectives. Nonparametric Methods. A Sign Test. A Wilcoxon Signed Rank Test. Mann-Whitney U Test. Kruskal-Wallis Test. Spearman Rank Order Correlation. A Runs Test. Moving On... 21. STATISTICAL PROCESS CONTROL. Objectives. Processes and Variation. Charting a Process Mean. Charting a Process Range. Another Example. Charting a Process Proportion. Moving On... Appendix A: Dataset Descriptions. Appendix B: Working with Files. Objectives. Worksheets. Session and History Files. Graph Files. MINITAB Projects. Converting Other Data Files into MINITAB Worksheets. Appendix C: Organizing a Worksheet. Choices. Stacked Data. Unstacked Data. Summarized Data. Appendix D: Working with Other Minitab Releases. Objectives. Differences Between Release 14 and Earlier Versions. Issues for Student Version 14 Users. Summary of Sessions Where Differences Arise. Workarounds for Earlier Releases. Commands Without Equivalents in Earlier Releases. Index.

17 citations


Hyuk-Joo Kim1
01 Jan 2003
TL;DR: In this paper, a two-stage procedure is proposed to estimate the population proportion of a sensitive group by combining the direct question method and a modified randomized response technique, and it is verified that the proposed procedure is more efficient than existing methods under some mild conditions.
Abstract: A two-stage procedure is proposed to estimate the population proportion of a sensitive group. The proposed procedure is obtained by combining the direct question method and a modified randomized response technique. It is verified that the proposed procedure is more efficient than existing methods under some mild conditions.

1 citation


01 Jan 2003
TL;DR: Wang et al. analyzed the relationships among the models used in the study of regional difference, pointed out the limitations of the Gini coefficient and other indicators for describing regional inequality under China's specific conditions, and examined the distinctive role of the Kuznets ratio.
Abstract: The study of regional difference and its change is one of the most important research topics in regional economics and geography. In this paper, we first discuss the relationships among the different models used in the study of regional difference and point out the limitations of the Gini coefficient and other indicators for describing regional inequality under China's specific conditions; we then examine the distinctive usefulness of the Kuznets ratio. By decomposing and calculating the Kuznets ratio, we identify the direct causes of the changes in China's regional difference. The results show that the evolution of China's regional difference since the reform and opening up can be divided into four stages: the first from 1978 to 1983, the second from 1984 to 1990, the third from 1991 to 1995, and the fourth from 1996 to the present. Each stage shows different outward characteristics and underlying factors of regional difference, driven mainly by the contrast between two quantities: the population proportion of low-income areas and the GDP proportion of high-income areas. The details are as follows: 1) In the first stage, the population proportion of low-income provinces gradually declines while their income proportion increases, and together these cause the overall inequality to decline slightly. 2) In the second stage, the overall inequality increases slightly, mainly because of the fluctuation of two factors: factor A, the variation in the inequality coefficient caused by the relative change of the low-income population, and factor B, the variation caused by the relative change of the high-income population. 3) In the third stage, the overall inequality increases rapidly, owing to the increase in both the population proportion and the income proportion of low-income provinces. 
4) In the fourth stage, the overall inequality declines slightly compared with the third stage, the result of an obvious increase in the population proportion and an obvious decrease in the income proportion of low-income provinces. The results also suggest that China's regional inequality may lessen to some extent in the future. Practice has shown that the Kuznets ratio has a unique role in describing regional difference, but using this method alone may not give good results, and attention should be paid to applying it in combination with other methods.
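As a hedged illustration of the kind of index the abstract relies on, one common textbook form of the Kuznets ratio sums the absolute gaps between each region's population share and its income share; the paper's own decomposition may differ in detail:

```python
def kuznets_ratio(pop_shares, income_shares):
    """One common form of the Kuznets ratio for regional inequality:
    the sum over regions of |population share - income share|.

    Zero means perfect equality.  The gap terms for low-income regions
    (population share above income share) and high-income regions
    (income share above population share) are the two kinds of
    quantities the abstract's decomposition tracks.  Illustrative
    definition only; the paper's exact formula is not reproduced here.
    """
    assert abs(sum(pop_shares) - 1) < 1e-9 and abs(sum(income_shares) - 1) < 1e-9
    return sum(abs(p - y) for p, y in zip(pop_shares, income_shares))
```

For example, two regions with equal populations but a 20/80 income split give a ratio of 0.6, while identical shares give 0.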

1 citation


01 Jan 2003
TL;DR: A simple and straightforward procedure is presented for estimating the population proportion of a sensitive group, combining direct questioning with a randomized response technique.
Abstract: In this paper, a simple and obvious procedure is presented that allows one to estimate the population proportion of a sensitive group. The suggested procedure combines the direct question method with a randomized response technique. It is found that the proposed procedure is more efficient than Warner's (1965).
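Since the paper benchmarks against Warner's (1965) technique, a minimal Python sketch of Warner's original estimator may help; the proposed combined procedure itself is not reproduced here:

```python
def warner_estimate(n_yes, n, p):
    """Warner's (1965) randomized response estimator of the proportion
    pi of a sensitive group.

    Each respondent is directed, with probability p, to answer the
    sensitive question directly and, with probability 1 - p, to answer
    its negation, so P(yes) = p*pi + (1 - p)*(1 - pi).  Solving for pi
    gives the estimator below; it requires p != 0.5.
    """
    lam = n_yes / n  # observed proportion of "yes" answers
    return (lam - (1 - p)) / (2 * p - 1)
```

If the true proportion is 0.20 and p = 0.7, the expected "yes" rate is 0.7(0.2) + 0.3(0.8) = 0.38, so observing exactly 38 yeses in 100 interviews recovers an estimate of 0.20.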

1 citation


Book ChapterDOI
01 Jan 2003
TL;DR: In this article, the authors present some advanced concepts related to counting of discrete objects to produce data, and show that the binomial distribution is approximately normal when a large proportion of the objects in the data meet some qualitative criterion.
Abstract: This chapter presents some advanced concepts related to the counting of discrete objects to produce data. The binomial distribution arises when the data consist of objects that are counted and the value reported is the fraction, or proportion, meeting some qualitative criterion. In such cases, when there are many objects and a fairly large proportion of them meets the criterion, the distribution will be approximately Normal. The Normal distribution is continuous, whereas the binomial distribution, being based on counting, is discrete. For the Normal approximation to hold, there must be enough cases that the distance from one discrete value to the next is unnoticeable, so that the distribution acts as though it were, in fact, at least quasi-continuous. All the properties of the binomial distribution for a given situation are determined by the values of N and P for that situation (where N is the sample size and P is the population proportion). In particular, the mean and the variance of the sampling distribution for samples of size N from a population containing a proportion P of items are determined by these two parameters.
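The quantities the chapter names can be sketched in a few lines of Python (illustrative; the rule-of-thumb cutoff of 10 for the Normal approximation is a common textbook convention, with some texts using 5):

```python
import math

def sample_proportion_sampling_dist(N, P):
    """Mean and SD of the sampling distribution of the sample proportion
    for samples of size N from a population with proportion P:
    mean P, variance P*(1 - P)/N (the SD is its square root).

    Also applies a common rule of thumb for when the Normal
    approximation to the binomial is adequate:
    N*P >= 10 and N*(1 - P) >= 10.
    """
    mean = P
    sd = math.sqrt(P * (1 - P) / N)
    normal_ok = N * P >= 10 and N * (1 - P) >= 10
    return mean, sd, normal_ok
```

For N = 100 and P = 0.5 this gives mean 0.5, SD 0.05, and a passing rule-of-thumb check; for N = 20 and P = 0.1 the check fails, signaling too few expected successes for the approximation.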

1 citation