
Showing papers on "Mathematical statistics" published in 1977


Book
01 Jan 1977
TL;DR: This introductory statistics text covers data collection, data description, probability, and inference for means, variances, proportions, regression, and designed experiments, including the chi-square test of homogeneity of proportions for comparing proportions across groups and the Wilcoxon signed-rank test as a nonparametric alternative for paired data.
Abstract: PART I: INTRODUCTION 1. WHAT IS STATISTICS? Introduction / Why Study Statistics? / Some Current Applications of Statistics / What Do Statisticians Do? / Quality and Process Improvement / A Note to the Student / Summary / Supplementary Exercises PART II: COLLECTING THE DATA 2. USING SURVEYS AND SCIENTIFIC STUDIES TO COLLECT DATA Introduction / Surveys / Scientific Studies / Observational Studies / Data Management: Preparing Data for Summarization and Analysis / Summary PART III: SUMMARIZING DATA 3. DATA DESCRIPTION Introduction / Describing Data on a Single Variable: Graphical Methods / Describing Data on a Single Variable: Measures of Central Tendency / Describing Data on a Single Variable: Measures of Variability / The Box Plot / Summarizing Data from More Than One Variable / Calculators, Computers, and Software Systems / Summary / Key Formulas / Supplementary Exercises PART IV: TOOLS AND CONCEPTS 4. PROBABILITY AND PROBABILITY DISTRIBUTIONS How Probability Can Be Used in Making Inferences / Finding the Probability of an Event / Basic Event Relations and Probability Laws / Conditional Probability and Independence / Bayes's Formula / Variables: Discrete and Continuous / Probability Distributions for Discrete Random Variables / A Useful Discrete Random Variable: The Binomial / Probability Distributions for Continuous Random Variables / A Useful Continuous Random Variable: The Normal Distribution / Random Sampling / Sampling Distributions / Normal Approximation to the Binomial / Summary / Key Formulas / Supplementary Exercises PART V: ANALYZING DATA: CENTRAL VALUES, VARIANCES, AND PROPORTIONS 5. 
INFERENCES ON A POPULATION CENTRAL VALUE Introduction and Case Study / Estimation of μ / Choosing the Sample Size for Estimating μ / A Statistical Test for μ / Choosing the Sample Size for Testing μ / The Level of Significance of a Statistical Test / Inferences about μ for a Normal Population, σ Unknown / Inferences about the Population Median / Summary / Key Formulas / Supplementary Exercises 6. COMPARING TWO POPULATION CENTRAL VALUES Introduction and Case Study / Inferences about μ1 - μ2: Independent Samples / A Nonparametric Alternative: The Wilcoxon Rank Sum Test / Inferences about μ1 - μ2: Paired Data / A Nonparametric Alternative: The Wilcoxon Signed-Rank Test / Choosing Sample Sizes for Inferences about μ1 - μ2 / Summary / Key Formulas / Supplementary Exercises 7. INFERENCES ABOUT POPULATION VARIANCES Introduction and Case Study / Estimation and Tests for a Population Variance / Estimation and Tests for Comparing Two Population Variances / Tests for Comparing k > 2 Population Variances / Summary / Key Formulas / Supplementary Exercises 8. INFERENCES ABOUT MORE THAN TWO POPULATION CENTRAL VALUES Introduction and Case Study / A Statistical Test About More Than Two Population Means / Checking on the Assumptions / Alternative When Assumptions are Violated: Transformations / A Nonparametric Alternative: The Kruskal-Wallis Test / Summary / Key Formulas / Supplementary Exercises 9. MULTIPLE COMPARISONS Introduction and Case Study / Planned Comparisons Among Treatments: Linear Contrasts / Which Error Rate Is Controlled? / Multiple Comparisons with the Best Treatment / Comparison of Treatments to a Control / Pairwise Comparison on All Treatments / Summary / Key Formulas / Supplementary Exercises 10.
CATEGORICAL DATA Introduction and Case Study / Inferences about a Population Proportion p / Comparing Two Population Proportions p1 - p2 / Probability Distributions for Discrete Random Variables / The Multinomial Experiment and Chi-Square Goodness-of-Fit Test / The Chi-Square Test of Homogeneity of Proportions / The Chi-Square Test of Independence of Two Nominal Level Variables / Fisher's Exact Test, a Permutation Test / Measures of Association / Combining Sets of Contingency Tables / Summary / Key Formulas / Supplementary Exercises PART VI: ANALYZING DATA: REGRESSION METHODS, MODEL BUILDING 11. SIMPLE LINEAR REGRESSION AND CORRELATION Linear Regression and the Method of Least Squares / Transformations to Linearize Data / Correlation / A Look Ahead: Multiple Regression / Summary / Key Formulas / Supplementary Exercises 12. INFERENCES RELATED TO LINEAR REGRESSION AND CORRELATION Introduction and Case Study / Diagnostics for Detecting Violations of Model Conditions / Inferences about the Intercept and Slope of the Regression Line / Inferences about the Population Mean for a Specified Value of the Explanatory Variable / Predictions and Prediction Intervals / Examining Lack of Fit in the Model / The Inverse Regression Problem (Calibration): Predicting Values for x for a Specified Value of y / Summary / Key Formulas / Supplementary Exercises 13. MULTIPLE REGRESSION AND THE GENERAL LINEAR MODEL Introduction and Case Study / The General Linear Model / Least Squares Estimates of Parameters in the General Linear Model / Inferences about the Parameters in the General Linear Model / Inferences about the Population Mean and Predictions from the General Linear Model / Comparing the Slope of Several Regression Lines / Logistic Regression / Matrix Formulation of the General Linear Model / Summary / Key Formulas / Supplementary Exercises 14.
BUILDING REGRESSION MODELS WITH DIAGNOSTICS Introduction and Case Study / Selecting the Variables (Step 1) / Model Formulation (Step 2) / Checking Model Conditions (Step 3) / Summary / Key Formulas / Supplementary Exercises PART VII: ANALYZING DATA: DESIGN OF EXPERIMENTS AND ANOVA 15. DESIGN CONCEPTS FOR EXPERIMENTS AND STUDIES Experiments, Treatments, Experimental Units, Blocking, Randomization, and Measurement Units / How Many Replications? / Studies for Comparing Means versus Studies for Comparing Variances / Summary / Key Formulas / Supplementary Exercises 16. ANALYSIS OF VARIANCE FOR STANDARD DESIGNS Introduction and Case Study / Completely Randomized Design with Single Factor / Randomized Block Design / Latin Square Design / Factorial Experiments in a Completely Randomized Design / The Estimation of Treatment Differences and Planned Comparisons in the Treatment Means / Checking Model Conditions / Alternative Analyses: Transformation and Friedman's Rank-Based Test / Summary / Key Formulas / Supplementary Exercises 17. ANALYSIS OF COVARIANCE Introduction and Case Study / A Completely Randomized Design with One Covariate / The Extrapolation Problem / Multiple Covariates and More Complicated Designs / Summary / Key Formulas / Supplementary Exercises 18. ANALYSIS OF VARIANCE FOR SOME UNBALANCED DESIGNS Introduction and Case Study / A Randomized Block Design with One or More Missing Observations / A Latin Square Design with Missing Data / Incomplete Block Designs / Summary / Key Formulas / Supplementary Exercises 19. ANALYSIS OF VARIANCE FOR SOME FIXED EFFECTS, RANDOM EFFECTS, AND MIXED EFFECTS MODELS Introduction and Case Study / A One-Factor Experiment with Random Treatment Effects / Extensions of Random-Effects Models / A Mixed Model: Experiments with Both Fixed and Random Treatment Effects / Models with Nested Factors / Rules for Obtaining Expected Mean Squares / Summary / Key Formulas / Supplementary Exercises 20. 
SPLIT-PLOT DESIGNS AND EXPERIMENTS WITH REPEATED MEASURES Introduction and Case Study / Split-Plot Designs / Single-Factor Experiments with Repeated Measures / Two-Factor Experiments with Repeated Measures on One of the Factors / Crossover Design / Summary / Key Formulas / Supplementary Exercises PART VIII: COMMUNICATING AND DOCUMENTING THE RESULTS OF A STUDY OR EXPERIMENT 21. COMMUNICATING AND DOCUMENTING THE RESULTS OF A STUDY OR EXPERIMENT Introduction / The Difficulty of Good Communication / Communication Hurdles: Graphical Distortions / Communication Hurdles: Biased Samples / Communication Hurdles: Sample Size / The Statistical Report / Documentation and Storage of Results / Summary / Supplementary Exercises
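The chi-square test of homogeneity of proportions listed in Chapter 10 above reduces to a short computation; here is a minimal sketch in Python with hypothetical counts for two groups over three response categories (the statistic would be referred to a chi-square distribution with (r-1)(c-1) degrees of freedom):

```python
# Hypothetical counts: rows are two groups, columns are three response categories.
table = [[30, 50, 20],
         [40, 45, 15]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand = sum(row_totals)

# Expected count under homogeneity: (row total * column total) / grand total.
chi_sq = 0.0
for i, row in enumerate(table):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / grand
        chi_sq += (obs - exp) ** 2 / exp

df = (len(table) - 1) * (len(table[0]) - 1)
print(round(chi_sq, 3), df)
```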

5,674 citations


Book
11 Jan 1977
TL;DR: This volume on data analysis and regression assumes the student has had a first course in statistics; it emphasizes that attitudes and approaches matter more than the techniques the book can teach.
Abstract: This volume on data analysis and regression assumes that the student has had a first course in statistics. Attitudes and approaches are more important than the techniques this book can teach. Readers can learn at least the following attitudes, understandings, and approaches: an approach to the formulation of statistical and data-analytical problems such that, for example, the student's shortcut to inference can be properly understood and the role of vague concepts becomes clear; the role of indications (pointers to behavior, not necessarily on prechosen scales) in contrast to conclusions or decisions about prechosen quantities or alternatives; the importance of displays and the value of graphs in forcing the unexpected upon the reader; the importance of re-expression; the need to seek out the real uncertainty as a nontrivial task; the importance of iterated calculation; how the ideas of robustness and resistance can change both what one does and what one thinks; what regression is all about; what regression coefficients can and cannot do; that the behavior of one's data can often be used to guide its analysis; the importance of looking at and drawing information from residuals; and the idea that data analysis can profit from repeated starts and fresh approaches, and that there is not just a single analysis for a substantial problem. The 16 chapters of this book include the following: some practical philosophy for data analysis; a background for simple linear regression; the nature and importance of re-expression; a method of direct assessment; the direct and flexible approach to two-way tables; a review of resistant/robust techniques in the simpler applications; standardization; regression and regression coefficients; a mathematical approach to understanding regression; guided regression; and examining regression residuals.
Among the special features of this volume are the following: an introduction to stem-and-leaf displays; the use of running medians for smoothing; the ladder of re-expression for straightening curves; methods of re-expression for analysis; special tables to make re-expression easy in hand calculations; robust and resistant measures of location and scale; and regression with errors of measurement.
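The running-median smoother mentioned among the special features is simple enough to sketch; this assumes a window of 3 and copies the endpoints through, one of several reasonable end-value conventions:

```python
from statistics import median

def smooth_3(xs):
    """Running median of 3: a resistant smoother; endpoints are copied through."""
    if len(xs) < 3:
        return list(xs)
    out = [xs[0]]
    for i in range(1, len(xs) - 1):
        out.append(median(xs[i - 1:i + 2]))
    out.append(xs[-1])
    return out

data = [1, 2, 9, 3, 4, 5, 1, 6]
print(smooth_3(data))
```

Note how the isolated spike at 9 is removed while the overall trend is preserved, which is the sense in which the median is "resistant".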

1,430 citations


Book
01 Jan 1977
TL;DR: In this article, the authors present an introductory treatment of probability and statistics, covering empirical and probability distributions, discrete and continuous distributions (including the gamma, chi-square, normal, and bivariate normal), sampling distribution theory, estimation, and hypothesis testing.
Abstract: 1. Empirical and Probability Distributions. Basic Concepts. The Mean, Variance, and Standard Deviation. Continuous-Type Data. Exploratory Data Analysis. Graphical Comparisons of Data Sets. Time Sequences. Probability Density and Mass Functions. 2. Probability. Properties of Probability. Methods of Enumeration. Conditional Probability. Independent Events. Bayes' Theorem. 3. Discrete Distributions. Random Variables of the Discrete Type. Mathematical Expectation. Bernoulli Trials and the Binomial Distribution. The Moment-Generating Function. The Poisson Distribution. 4. Continuous Distributions. Random Variables of the Continuous Type. The Uniform and Exponential Distributions. The Gamma and Chi-Square Distributions. The Normal Distribution. Distributions of Functions of a Random Variable. Mixed Distributions and Censoring. 5. Multivariable Distributions. Distributions of Two Random Variables. The Correlation Coefficient. Conditional Distributions. The Bivariate Normal Distribution. Transformations of Random Variables. Order Statistics. 6. Sampling Distribution Theory. Independent Random Variables. Distributions of Sums of Independent Random Variables. Random Functions Associated with Normal Distributions. The Central Limit Theorem. Approximations for Discrete Distributions. The t and F Distributions. Limiting Moment-Generating Functions. Chebyshev's Inequality and Convergence in Probability. Importance of Understanding Variability. 7. Estimation. Point Estimation. Confidence Intervals for Means. Confidence Intervals for Difference of Two Means. Confidence Intervals for Variances. Confidence Intervals for Proportions. Sample Size. Distribution-Free Confidence Intervals for Percentiles. A Simple Regression Problem. More Regression. 8. Tests of Statistical Hypotheses. Tests about Proportions. Tests about One Mean and One Variance. Tests of the Equality of Two Normal Distributions. Chi-Square Goodness of Fit Test. Contingency Tables. 
Tests of the Equality of Several Means. Two-Factor Analysis of Variance. Tests Concerning Regression and Correlation. The Wilcoxon Tests. Kolmogorov-Smirnov Goodness of Fit Test. Resampling Methods. Run Test and Test for Randomness. 9. Theory of Statistical Inference. Sufficient Statistics. Power of a Statistical Test. Best Critical Regions. Likelihood Ratio Tests. Bayesian Estimation. Asymptotic Distributions of Maximum Likelihood Estimators. 10. Quality Improvement through Statistical Methods. Statistical Quality Control. General Factorial and 2^k Factorial Designs. More on Design of Experiments. Epilogue. Appendix A. Review of Selected Mathematical Techniques. Algebra of Sets. Mathematical Tools for the Hypergeometric Distribution. Limits. Infinite Series. Integration. Multivariate Calculus. Appendix B. References. Appendix C. Tables. Appendix D. Answers to Odd-Numbered Exercises. Index.
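As a small illustration of the estimation material (confidence intervals for means), a rough large-sample 95% interval on hypothetical data:

```python
import math
from statistics import mean, stdev

# Hypothetical sample; a rough 95% large-sample confidence interval for the
# mean, x̄ ± 1.96 * s / sqrt(n), as in the Estimation chapter.
sample = [4.1, 5.0, 4.6, 4.8, 5.3, 4.4, 4.9, 5.1, 4.7, 4.5]
n = len(sample)
xbar = mean(sample)
se = stdev(sample) / math.sqrt(n)
lo, hi = xbar - 1.96 * se, xbar + 1.96 * se
print(round(lo, 3), round(hi, 3))
```

(For n this small, a t critical value rather than 1.96 would be the textbook's recommendation; the normal quantile keeps the sketch short.)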

990 citations


Journal ArticleDOI
TL;DR: In this paper, the authors introduce probability, conditional probability, random variables and their distributions, expectation, estimation, sampling distributions of estimators, hypothesis testing, and linear statistical models.
Abstract: 1. Introduction to Probability 2. Conditional Probability 3. Random Variables and Distributions 4. Expectation 5. Special Distributions 6. Large Random Samples 7. Estimation 8. Sampling Distributions of Estimators 9. Testing Hypotheses 10. Categorical Data and Nonparametric Methods 11. Linear Statistical Models 12. Simulation
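The conditional-probability chapter culminates in Bayes' theorem; a minimal numeric sketch with hypothetical sensitivity, specificity, and prevalence:

```python
# Bayes' theorem with hypothetical numbers: a test with 99% sensitivity and
# 95% specificity for a condition with 1% prevalence.
prior = 0.01
sens = 0.99          # P(positive | condition)
spec = 0.95          # P(negative | no condition)

# Total probability of a positive result, then the posterior by Bayes' theorem.
p_pos = sens * prior + (1 - spec) * (1 - prior)
posterior = sens * prior / p_pos
print(round(posterior, 4))
```

Despite the accurate test, the posterior is only about one in six, because true positives are swamped by false positives from the large unaffected group.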

207 citations



Book
01 Jun 1977
TL;DR: In this paper, the authors present tools from mathematical statistics, including the statistical description of random variables, stochastic processes, and point processes, and apply them to optical communication and spectroscopy.
Abstract: 1. Introduction.- I. Tools From Mathematical Statistics.- 2. Statistical Description of Random Variables and Stochastic Processes.- 3. Point Processes.- II. Theory.- 4. The Optical Field: A Stochastic Vector Field or, Classical Theory of Optical Coherence.- 5. Photoelectron Events: A Doubly Stochastic Poisson Process or Theory of Photoelectron Statistics.- III. Applications.- 6. Applications to Optical Communication.- 7. Applications to Spectroscopy.- References.
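The point processes of Part I center on the Poisson process; a minimal sketch of simulating a homogeneous Poisson process by its exponential interarrival times (the rate and horizon are arbitrary choices):

```python
import random

def poisson_process(rate, horizon, rng):
    """Arrival times of a homogeneous Poisson process on [0, horizon]:
    interarrival times are independent Exponential(rate) variables."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            return times
        times.append(t)

rng = random.Random(42)
arrivals = poisson_process(rate=2.0, horizon=10.0, rng=rng)
print(len(arrivals))   # count is Poisson with mean rate * horizon = 20
```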

65 citations


Journal ArticleDOI
TL;DR: In this paper, it is proved that maximum likelihood estimates are consistent and asymptotically normal in large samples, and formulae for the large-sample standard errors are given.
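The large-sample standard errors referred to here come from the Fisher information; a minimal sketch for the exponential model, where the MLE of the rate is 1/x̄ and its large-sample standard error is rate_hat / sqrt(n) (the data are hypothetical):

```python
import math
from statistics import mean

# Hypothetical data assumed Exponential(rate). The MLE is rate_hat = 1 / x̄,
# and since the Fisher information is I(rate) = 1 / rate^2, the large-sample
# standard error of the MLE is rate_hat / sqrt(n).
data = [0.3, 1.2, 0.7, 2.5, 0.4, 1.1, 0.9, 1.6]
n = len(data)
rate_hat = 1.0 / mean(data)
se = rate_hat / math.sqrt(n)
print(round(rate_hat, 4), round(se, 4))
```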

64 citations


Journal ArticleDOI
TL;DR: Although no one method was found to be statistically superior in all cases, the Ranking Method proposed by Smith (1967) appears to be the best method for individual assessment of probability distributions.

44 citations


01 Nov 1977
TL;DR: In this paper, a dual optimization framework for information theory and statistics is developed in the form of dual convex programming problems and their duality theory, which extends the work for finite discrete distributions to the case of general measures.
Abstract: A new dual optimization framework for some problems of information theory and statistics is developed in the form of dual convex programming problems and their duality theory. It extends the work for finite discrete distributions to the case of general measures. Although the primal problem (constrained relative entropy) is an infinite dimensional one, the dual problem is a finite dimensional one without constraints and involving only exponential and linear terms. Applications range from mathematical statistics and statistical mechanics to traffic engineering, marketing and economics.
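The structure of the dual problem (finite dimensional, unconstrained, with only exponential and linear terms) can be seen in a toy instance: the maximum-entropy distribution on {1,...,6} under a mean constraint has the exponential form p_k ∝ exp(theta * k), and the dual reduces to solving for the single multiplier theta (the target mean 4.5 is an arbitrary choice; bisection stands in for an unconstrained dual solver):

```python
import math

def mean_for(theta):
    """Mean of the exponential-family distribution p_k ∝ exp(theta * k) on 1..6."""
    w = [math.exp(theta * k) for k in range(1, 7)]
    z = sum(w)
    return sum(k * wk for k, wk in zip(range(1, 7), w)) / z

# mean_for is increasing in theta, so bisect for the multiplier matching mean 4.5.
lo, hi = -5.0, 5.0
for _ in range(200):
    mid = (lo + hi) / 2
    if mean_for(mid) < 4.5:
        lo = mid
    else:
        hi = mid
theta = (lo + hi) / 2
print(round(theta, 4))
```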

25 citations



Book ChapterDOI
W. R. van Zwet1
01 Jan 1977
TL;DR: Asymptotic expansions for sums of independent and identically distributed random variables have been studied for many years; this paper reviews the classical Edgeworth theory and discusses the main techniques for extending it to more general statistics.
Abstract: Publisher Summary This chapter discusses asymptotic expansions and explains their need. It reviews the classical theory of Edgeworth expansions for sums of independent and identically distributed random variables, and indicates the two main techniques for extending this theory to more general statistics. The chapter presents an account of as yet unpublished results of Bjerve and Helmers who establish Berry–Esseen type bounds for linear combinations of order statistics. For many years, mathematical statisticians have spent a great deal of effort and ingenuity toward applying the central limit theorem in statistics. The estimators and test statistics that interest statisticians are as a rule not sums of independent random variables, and much work went into showing that they can often be approximated sufficiently well by such sums to ensure asymptotic normality. This work can be traced throughout the development of mathematical statistics from the proof of the asymptotic normality of the maximum likelihood estimator to much of the recent work in nonparametric statistics.
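The flavor of an Edgeworth expansion can be shown in a few lines: the one-term correction for the standardized sum of n iid Exp(1) variables (skewness 2), checked against the exact Erlang CDF (the choice n = 10 is an arbitrary illustration):

```python
import math

n = 10
x = float(n)                    # evaluate P(S_n <= n), i.e., z = 0
z = (x - n) / math.sqrt(n)

phi = math.exp(-z * z / 2) / math.sqrt(2 * math.pi)   # standard normal density
Phi = 0.5 * (1 + math.erf(z / math.sqrt(2)))          # standard normal CDF

# One-term Edgeworth: F(z) ≈ Φ(z) - φ(z) * (skewness / (6 sqrt(n))) * (z^2 - 1).
normal_approx = Phi
edgeworth = Phi - phi * (2 / (6 * math.sqrt(n))) * (z * z - 1)

# Exact: P(S_n <= x) = 1 - exp(-x) * sum_{k<n} x^k / k!  (Erlang distribution)
exact = 1 - math.exp(-x) * sum(x ** k / math.factorial(k) for k in range(n))
print(round(normal_approx, 4), round(edgeworth, 4), round(exact, 4))
```

The skewness correction recovers most of the error the plain normal approximation makes at this point.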

Journal Article
01 Jun 1977
TL;DR: In this paper, the authors present tables dealing with the central multivariate Student t distribution in which there is a common variance estimate in the denominators of the variates and the numerators are equicorrelated.
Abstract: This volume presents tables dealing with the central multivariate Student t distribution in which there is a common variance estimate in the denominators of the variates and the numerators are equicorrelated. The tables contain one-sided and two-sided upper equicoordinate percentage points for this distribution. In addition, the volume provides tables based on the assumption that the variates have a certain block correlation structure. The entries have been computed to an accuracy of 5 decimal places. These tables, prepared under the aegis of the Institute of Mathematical Statistics, are considerably more comprehensive than previously published tables of this type. They have applications in many statistical settings, including selection among normal means using either the indifference-zone or the subset approach and in multiple comparisons involving contrasts among means. These and other applications are described in detail, and examples of the uses of the tables are given. In addition, the volume contains interpolation methods which extend the usefulness of the tables.


Journal ArticleDOI
01 Oct 1977-Synthese
TL;DR: Sham monism, as discussed in this paper, pays lip service to the monistic thesis that statistical inference can dispense with objective or physical probabilities while covertly reintroducing them under the guise of special personal probability distributions.
Abstract: Current systems of statistical inference which utilize personal probabilities (in short, personalistic systems) are either dualistic or monistic. Dualistic systems postulate the existence of objective or physical probabilities (which are usually identified with limiting relative frequencies of observable events), whereas monistic systems do not countenance such probabilities. The central thesis of monism is that statistics can get along quite well without physical probability and the related concepts of objective randomness and random process. Monistic systems may be either pure or sham. Sham monists pay lip service to monism but covertly introduce physical probabilities, and thus trivialize the central thesis. They accomplish this by introducing the same sort of probability models that dualist statisticians do, under the guise of 'personal' probability distributions of observable random variables conditional on the unknown value of a physical parameter. For example, a sham monist will treat problems that a dualist would describe as involving the unknown mean of a normally distributed population in the same way the dualist would, with conditionally independent trials governed by a normal law, except that he refuses to call the probabilities determined by the law 'physical probabilities'. He insists that they are merely special kinds of personal probabilities. The same sort of approach is used to treat all the standard problems of statistics, i.e., the probability models which govern the sham monist's observable random variables are going to be the same as the ones used by dualists and objectivists, except that they will be labelled differently. A notable adherent of sham monism was the late L. J. Savage, who advocated pure monism when he was theorizing on a foundational level, but who shifted ground when he tried to incorporate the standard problems of statistics into his theoretical framework (cf. [3], [13]).
The difference between sham monists and dualists is that the latter overtly postulate the existence of physical probabilities, whereas the former do so covertly.



01 Mar 1977
TL;DR: A discussion of well-known methodological issues that lead many social scientists to misuse statistical inference; it is shown how confidence bounds can be informative on both significance and practical importance, and Bayesian methods are discussed.
Abstract: A discussion of some well-known methodological issues leading to incorrect use of statistical inference by many social scientists. Frequent misuse of statistical techniques is partly due to a concentration in textbooks on technical issues instead of methodological foundations; such misuse should not be a reason for deleting statistics from social science curricula. The foundations & applications are far more controversial than the purely mathematical theory of statistics & probability. Introduced is the paradigm of "the two spans of the bridge of inference" (Cornfield, J., & Tukey, J. W. Annals of Mathematical Statistics, 1956, 27, 912) showing how a statistical generalization to the operational population (from which a sample was drawn) is often followed by a generalization to a wider target population. This second step is not statistical, but should be based on the substantive insight of the investigator or the reader. The pitfalls of applying results valid for simple random sampling to all kinds of cluster samples, systematic samples, or haphazard samples so frequently used in practice are warned against. Statistical decision theory, in which two decisions & their losses can be viewed symmetrically, is contrasted with hypothesis testing, in which the null hypothesis survives unless clearly contradicted. The issues of one-sided vs two-sided tests & of situations where it is desired to prove the null hypothesis are discussed. The confusion of "large enough for statistical significance" with "large enough to be of practical importance" is deplored. It is demonstrated how the use of confidence bounds can be informative on both significance & importance, & Bayesian methods are discussed. Overambitious projects in which too many variables are measured unreliably or invalidly, owing to a failure to concentrate on a few well-designed problems, are warned against, & some threats to validity commonly met in sociological research are discussed. Modified AA.
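The point about significance versus importance can be made numerically; in this hypothetical sketch a mean difference of 0.05 standard deviations is highly significant at n = 10,000, while the confidence bounds show the effect is practically negligible:

```python
import math

# Hypothetical illustration of "statistically significant" vs "practically
# important": a tiny effect, measured in standard-deviation units, at large n.
n = 10_000
diff = 0.05          # observed mean difference, in standard-deviation units
se = 1.0 / math.sqrt(n)

z = diff / se
ci = (diff - 1.96 * se, diff + 1.96 * se)
print(round(z, 2), tuple(round(b, 3) for b in ci))
```

The z statistic of 5 is far beyond any conventional significance threshold, yet the entire confidence interval lies below a tenth of a standard deviation.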

Journal ArticleDOI
01 Mar 1977


Book ChapterDOI
01 Jan 1977
TL;DR: In this article, the primary object of probability theory is the probability space (Ω, ℱ, P), i.e., a set Ω consisting of elementary events ω, with a distinguished system ℱ of its subsets (events) forming a σ-algebra, where P denotes a probability measure (probability) defined on sets in ℱ.
Abstract: According to Kolmogorov’s axiomatics the primary object of probability theory is the probability space (Ω, ℱ, P). Here (Ω, ℱ) denotes measurable space, i.e., a set Ω consisting of elementary events ω, with a distinguished system ℱ of its subsets (events), forming a σ-algebra, and P denotes a probability measure (probability) defined on sets in ℱ.
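The axiomatics can be made concrete in a few lines: a finite probability space for a (hypothetical) fair die, with Ω the sample space, events as subsets, and P additive over disjoint events:

```python
from fractions import Fraction

# A finite probability space in miniature: Omega is the sample space of a fair
# die, events are subsets of Omega, and P is the uniform probability measure.
Omega = frozenset(range(1, 7))

def P(event):
    return Fraction(len(event & Omega), len(Omega))

even = frozenset({2, 4, 6})
low = frozenset({1, 2})

print(P(even), P(low), P(even | low), P(even & low))
# Additivity in inclusion-exclusion form: P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
assert P(even | low) == P(even) + P(low) - P(even & low)
```

On a finite Ω the power set is trivially a σ-algebra, so the interesting content of the axioms here is the additivity checked in the last line.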

Book ChapterDOI
01 Jan 1977
TL;DR: The idea of machine learning was originally based on the application of variable elements which change their parametric values by the signals they transmit; later, learning processes have been discussed on a more abstract algorithmic level, and mathematical statistics has become the general setting of these problems.
Abstract: The idea of machine learning was originally based on the application of variable elements which change their parametric values by the signals they transmit; later, learning processes have been discussed on a more abstract algorithmic level, and mathematical statistics has become the general setting of these problems. Nevertheless, in its genuine form machine learning is a physical process.


Dissertation
01 Jan 1977
TL;DR: A thesis in statistics submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy to the Graduate School, Department of Statistics, The Pennsylvania State University.
Abstract: A Thesis in Statistics Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy to The Pennsylvania State University the Graduate School Department of Statistics

Journal ArticleDOI
TL;DR: In this paper, functional forms of order statistics, obtained as the solution of a system of equations, are studied; the case of the smaller and the larger of two random variables is discussed in detail, with applications to normal and binomial distributions.
Abstract: Functional forms of order statistics, as the solution of a system of equations, are studied. The case of the smaller and the larger of two random variables is discussed in detail. Some applications for normal and binomial distributions are presented.
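For the two-variable case the abstract describes, the standard facts are easy to check: for independent U(0,1) variables X and Y, the larger has CDF x^2 and the smaller has CDF 1 - (1 - x)^2. A seeded Monte Carlo sketch verifying both formulas (sample size and evaluation point are arbitrary choices):

```python
import random

# For independent U(0,1) variables X and Y, P(max <= x) = x^2 and
# P(min <= x) = 1 - (1 - x)^2; check both empirically at x0 = 0.7.
rng = random.Random(1977)
N = 100_000
x0 = 0.7

hits_max = hits_min = 0
for _ in range(N):
    x, y = rng.random(), rng.random()
    hits_max += max(x, y) <= x0
    hits_min += min(x, y) <= x0

print(round(hits_max / N, 3), round(hits_min / N, 3))
```

The empirical frequencies should land near the theoretical values 0.49 and 0.91 respectively.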