
Showing papers in "The American Statistician in 1956"



Journal Article
F. S. Beckman, D. A. Quarles
TL;DR: A procedure that is used to accomplish multiple linear regression and correlation analysis on the Type 701 Electronic Data Processing Machine is described, and some of the characteristics of a program, now being prepared, which is to perform a like function on the Type 704 calculator are indicated.
Abstract: Multiple linear regression and correlation analysis is one of the most fruitful statistical applications of digital computers. A basic objective in this analysis is the expression of a dependent variable, for purposes of prediction or fitting, as a suitable linear function of various independent variables. Geometrically interpreted, for a given data sample of n variables, the computational procedure is designed to determine that hyperplane in n-space which approximates the sample observations of the dependent variable in a least-squares sense. The approximation is regarded as suitable if, apart from what may reasonably be attributed to chance, the values predicted for the dependent variable by the hyperplane account for a large portion of this variable's sample variance. To decide whether the objective of a suitable linear approximation has been obtained, not only are certain statistics computed, but also significance tests are made. The wide range of usefulness of this technique is due to the fact that often, at least for appreciable ranges of the variables involved, the variable chosen as dependent can be approximately expressed as a linear function of various related independent variables, or of functions of such variables. The advent of large digital computers has made possible the convenient handling of the considerable computation required in performing such an analysis upon a substantial number of variables and observations. In this paper we shall describe a procedure that is used to accomplish this analysis on the Type 701 Electronic Data Processing Machine, and shall indicate some of the characteristics of a program, now being prepared, which is to perform a like function on the Type 704 calculator. 
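The least-squares hyperplane and the share of sample variance it accounts for can be illustrated with a minimal sketch. It is not the 701 program: the function names `solve_linear_system` and `fit_hyperplane`, the use of the normal equations, and the toy data are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_hyperplane(
    xs: Sequence[Sequence[float]], y: Sequence[float]
) -> Tuple[List[float], float]:
    """Return (coefficients, r_squared); coefficients[0] is the intercept."""
    rows = [[1.0] + list(x) for x in xs]  # prepend the constant term
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    predicted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    # Fraction of the dependent variable's sample variance accounted for.
    return beta, 1.0 - ss_resid / ss_total


# Hypothetical toy data generated exactly from y = 1 + 2*x1 + 3*x2.
xs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [1.0, 3.0, 4.0, 6.0, 8.0]
beta, r_squared = fit_hyperplane(xs, y)
```

Because the toy data lie exactly on a plane, the fitted coefficients recover the generating values and the explained fraction of variance is essentially one. With real data the fraction is smaller, and the significance tests mentioned in the abstract would be needed to judge it.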
The 701 program enables the computation and printing of the following statistics in the order stated: the means and standard deviations; the coefficients of simple and partial correlation, regression, and multiple correlation; and the standard error of estimate. As input data, the program accepts up to 1022 arbitrary five-decimal-digit observations of each of the variables where the array of observations for all variables has no missing entries. At this point, it is perhaps worth mentioning that we encountered no problem in this field of analysis where the five-digit limitation appeared too restrictive. The input data were directly used only in the computation of the sums, sums of squares and sums of cross-products for the variables, and from these three types of sums, the program developed the means, standard deviations, and simple correlation matrix. Two principal formats for the decimal input data cards were deserving of consideration. The data could be arranged by observation (so that the corresponding observations of all variables are grouped together), or by variable (so that all of the observations of a variable comprise one group). It seems that by far the more advantageous of these formats for our purposes was that which we have adopted as a standard, not only for the Type 701, but also for the Type 704 and Type 650 calculators. This standard format requires that the data be arranged by variable, i.e., the first fourteen observations of a variable on the first card, the second fourteen on the second card, etc., and has an important advantage in enabling large simple correlation matrices to be developed conveniently and efficiently, element by element, and row by row. Arrangement of the input data by observation would, for minimal machine time, require a large amount of high-speed storage to accommodate the double-precision partial sums accumulated in the formation of the matrix of cross-products. This would impose a restriction on the number of variables that could be handled efficiently. In our procedure, this restriction is avoided by computing the sums of cross-products serially rather than in parallel. From another viewpoint, the calculation of the simple correlation matrix involves essentially a matrix multiplication to obtain the sums of cross-products, and this input format enables direct formation of the pre-multiplier in such a way that the elements of the product matrix are conveniently generated serially rather than in parallel. The input data are initially read into the 701 from the decimal cards,
1 This article is based on a paper presented by the authors at the September 14-16, 1955 meeting at Philadelphia of the Association for Computing Machinery.
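The serial scheme described in the abstract, in which data arranged by variable yield the correlation matrix element by element and row by row from sums, sums of squares, and sums of cross-products, might look roughly like the sketch below. The function name `summary_from_sums`, the n - 1 divisor for the standard deviation, and the sample data are assumptions; the source does not state which divisor the 701 program used.

```python
import math
from typing import List, Sequence, Tuple


def summary_from_sums(
    data_by_variable: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float], List[List[float]]]:
    """data_by_variable[i] holds every observation of variable i.

    Returns (means, standard deviations, simple correlation matrix).
    """
    k = len(data_by_variable)
    n = len(data_by_variable[0])
    if any(len(v) != n for v in data_by_variable):
        raise ValueError("the array of observations must have no missing entries")

    sums = [sum(v) for v in data_by_variable]
    means = [s / n for s in sums]
    # Corrected sums of squares: sum(x^2) - (sum x)^2 / n.
    css = [
        sum(x * x for x in v) - sums[i] ** 2 / n
        for i, v in enumerate(data_by_variable)
    ]
    # Assumption: sample standard deviation with an n - 1 divisor.
    std_devs = [math.sqrt(c / (n - 1)) for c in css]

    corr = [[1.0] * k for _ in range(k)]
    for i in range(k):                 # row by row
        for j in range(i + 1, k):      # element by element
            cross = 0.0                # one running total at a time (serial)
            for a, b in zip(data_by_variable[i], data_by_variable[j]):
                cross += a * b
            corrected = cross - sums[i] * sums[j] / n
            corr[i][j] = corr[j][i] = corrected / math.sqrt(css[i] * css[j])
    return means, std_devs, corr


# Hypothetical sample: three variables, four observations each.
means, std_devs, corr = summary_from_sums(
    [
        [1.0, 2.0, 3.0, 4.0],
        [2.0, 4.0, 6.0, 8.0],
        [4.0, 3.0, 2.0, 1.0],
    ]
)
```

Only a single cross-product accumulator is live at any moment, which mirrors the storage argument in the text. An arrangement by observation would instead update all k(k+1)/2 partial sums on every record.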

8 citations




Journal Article
TL;DR: The authors outline a course plan intended to allay the "Great Panic Reaction" of students in introductory statistics through sound, systematic, and logical course organization, so that the course may become a tolerable, perhaps desirable, part of the liberal arts curriculum.
Abstract: Everyone who has taught a course in introductory statistics to majors in sociology, psychology, education, and the like has been faced by the Great Panic Reaction. In another course the instructor can say, "The occupational system is essentially the institutionalized differentiation of the adaptive aspect of the task-orientation area of the social system," and be met with only a soft murmur of discontent. Yet, at the mere mention of the number "three," student apprehension will rise to the panic level. Voicing an expression like "sigma-ex" will lead to a barrage of drop-slips and changes-in-major. Something in our cultural background seems to engender these anxieties; at some time in our formative years we are frightened by the magic of numbers. Genuine learning cannot take place in an atmosphere fraught with anxieties, and the fears must be allayed before students can attend to subject matter. And so the teacher, if he is to teach at all, must be concerned first with the fear and only secondly with the subject matter of statistics. He must be a religious confessor, a psychotherapist, and a numbers magician all rolled up into a neat package. It is not the intent of the present paper to explore the basis for the Panic Reaction. That is a problem for research in educational psychology and sociology. Rather, we hope to outline a device which may help to allay the anxiety. We are convinced that much of the fear engendered by a first experience with statistics may be eliminated by sound, systematic, and logical course organization. We shall present a plan, a course outline, designed to provide such an order. It is hoped that through the application of such a plan introductory statistics may become a tolerable, perhaps desirable, course in the liberal arts curriculum.

1 citation