
Showing papers in "Journal of the American Statistical Association in 1963"


Journal ArticleDOI
TL;DR: In this paper, a procedure for forming hierarchical groups of mutually exclusive subsets, each of which has members that are maximally similar with respect to specified characteristics, is suggested for use in large-scale (n > 100) studies when a precise optimal solution for a specified number of groups is not practical.
Abstract: A procedure for forming hierarchical groups of mutually exclusive subsets, each of which has members that are maximally similar with respect to specified characteristics, is suggested for use in large-scale (n > 100) studies when a precise optimal solution for a specified number of groups is not practical. Given n sets, this procedure permits their reduction to n − 1 mutually exclusive sets by considering the union of all possible n(n − 1)/2 pairs and selecting a union having a maximal value for the functional relation, or objective function, that reflects the criterion chosen by the investigator. By repeating this process until only one group remains, the complete hierarchical structure and a quantitative estimate of the loss associated with each stage in the grouping can be obtained. A general flowchart helpful in computer programming and a numerical example are included.

17,405 citations
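
A minimal sketch of the merging step described above, assuming the objective is to keep the within-group error sum of squares as small as possible (one common choice; the paper leaves the objective function to the investigator). The function and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def hierarchical_grouping(points):
    """Greedy hierarchical grouping: at each stage merge the pair of groups whose
    union increases the within-group error sum of squares the least, recording the
    loss incurred, until only one group remains."""
    X = np.asarray(points, dtype=float)
    groups = [[i] for i in range(len(X))]          # start with n singleton sets

    def ess(idx):
        sub = X[idx]                               # error sum of squares about the centroid
        return float(((sub - sub.mean(axis=0)) ** 2).sum())

    history = []
    while len(groups) > 1:
        best = None
        for a in range(len(groups)):               # all n(n-1)/2 candidate unions
            for b in range(a + 1, len(groups)):
                increase = ess(groups[a] + groups[b]) - ess(groups[a]) - ess(groups[b])
                if best is None or increase < best[0]:
                    best = (increase, a, b)
        loss, a, b = best
        merged = groups[a] + groups[b]
        groups = [g for k, g in enumerate(groups) if k not in (a, b)] + [merged]
        history.append((merged, loss))             # quantitative loss at each stage
    return history

print(hierarchical_grouping([[0.0], [0.1], [1.0], [5.0], [5.2]]))
```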


Book ChapterDOI
TL;DR: In this article, upper bounds for the probability that the sum S of n independent random variables exceeds its mean ES by a positive number nt are derived for certain sums of dependent random variables such as U statistics.
Abstract: Upper bounds are derived for the probability that the sum S of n independent random variables exceeds its mean ES by a positive number nt. It is assumed that the range of each summand of S is bounded or bounded above. The bounds for Pr {S – ES ≥ nt} depend only on the endpoints of the ranges of the summands and the mean, or the mean and the variance of S. These results are then used to obtain analogous inequalities for certain sums of dependent random variables such as U statistics and the sum of a random sample without replacement from a finite population.

8,655 citations
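
For reference, the most frequently cited of these bounds, as it is usually stated today for summands satisfying a_i ≤ X_i ≤ b_i (a reconstruction in modern notation, not a quotation from the paper):

```latex
\Pr\{S - \mathbb{E}S \ge nt\}
  \;\le\; \exp\!\left( \frac{-2 n^{2} t^{2}}{\sum_{i=1}^{n} (b_i - a_i)^{2}} \right),
\qquad S = X_1 + \cdots + X_n .
```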


Journal ArticleDOI
TL;DR: In this article, a method for analyzing multiple 2×2 contingency tables arising in retrospective studies of disease is extended in application and form, which includes comparisons of age-adjusted death rates, life-table analyses, comparisons of two sets of quantal dosage response data, and miscellaneous laboratory applications as appropriate.
Abstract: A published method for analyzing multiple 2×2 contingency tables arising in retrospective studies of disease is extended in application and form. Extensions of application include comparisons of age-adjusted death rates, life-table analyses, comparisons of two sets of quantal dosage-response data, and miscellaneous laboratory applications as appropriate. Extensions in form involve considering multiple contingency tables with arbitrarily many rows and/or columns, where rows and columns are orderable, and may even be on a continuous scale. The assignment of some score for each row or column is essential to use of the method. With scores assigned, a deviation of the sum of cross products from expectation, and its variance conditioned on all marginal totals, are computed for each table and a chi square is determined corresponding to the grand total of the deviations. For various specific instances and for various scoring procedures, the procedure extends or is equivalent to the asymptotic form of man...

2,654 citations
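
A compact sketch of the computation described above for a set of r × c tables, assuming row scores u and column scores v have already been assigned. The conditional variance used here is the standard one for a score statistic given all marginal totals and should be checked against the paper's formulas; the names and toy tables are illustrative.

```python
import numpy as np

def score_chi_square(tables, u, v):
    """One-degree-of-freedom chi square from the grand total, over tables, of the
    deviation of the sum of cross products from its expectation, with the variance
    of each deviation conditioned on that table's marginal totals."""
    total_dev, total_var = 0.0, 0.0
    for table in tables:
        T = np.asarray(table, dtype=float)
        N = T.sum()
        row, col = T.sum(axis=1), T.sum(axis=0)
        observed = u @ T @ v                              # sum of cross products
        expected = (u @ row) * (v @ col) / N
        var = ((N * (u**2 @ row) - (u @ row)**2) *
               (N * (v**2 @ col) - (v @ col)**2)) / (N**2 * (N - 1))
        total_dev += observed - expected
        total_var += var
    return total_dev**2 / total_var

tables = [[[10, 8, 4], [6, 9, 12]],                       # two 2 x 3 tables
          [[7, 7, 5], [4, 8, 10]]]
print(score_chi_square(tables, u=np.array([0.0, 1.0]), v=np.array([0.0, 1.0, 2.0])))
```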


Journal ArticleDOI
TL;DR: In this article, an approach to survey data is proposed which imposes no restrictions on interaction effects, focuses on importance in reducing predictive error, operates sequentially, and is independent of the extent of linearity in the classifications or the order in which the explanatory factors are introduced.
Abstract: Most of the problems of analyzing survey data have been reasonably well handled, except those revolving around the existence of interaction effects. Indeed, increased efficiency in handling multivariate analyses, even with non-numerical variables, has been achieved largely by assuming additivity. An approach to survey data is proposed which imposes no restrictions on interaction effects, focuses on importance in reducing predictive error, operates sequentially, and is independent of the extent of linearity in the classifications or the order in which the explanatory factors are introduced.

1,205 citations
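
A minimal sketch of one sequential step in the spirit of the approach described: for a given explanatory factor, find the binary grouping of its categories that most reduces the error sum of squares of the dependent variable, without assuming additivity or linearity. The names, the toy data, and the sort-by-mean shortcut are illustrative, not taken from the paper.

```python
import numpy as np

def best_binary_split(y, factor):
    """Binary grouping of a factor's categories that maximally reduces the error
    sum of squares of y; categories are ordered by their mean of y, which suffices
    when the criterion is least squares."""
    y = np.asarray(y, dtype=float)
    cats = sorted(set(factor), key=lambda c: y[[f == c for f in factor]].mean())
    total_ess = ((y - y.mean()) ** 2).sum()
    best = None
    for k in range(1, len(cats)):                       # contiguous cut points only
        left = np.array([f in cats[:k] for f in factor])
        ess = (((y[left] - y[left].mean()) ** 2).sum() +
               ((y[~left] - y[~left].mean()) ** 2).sum())
        reduction = total_ess - ess
        if best is None or reduction > best[0]:
            best = (reduction, cats[:k], cats[k:])
    return best                                         # (reduction, group 1, group 2)

spending = [4, 5, 9, 10, 3, 8, 11, 2]
region = ["N", "N", "S", "S", "E", "S", "W", "E"]
print(best_binary_split(spending, region))
```

Applying the same step repeatedly within each resulting subgroup, factor by factor, gives the kind of sequential analysis the abstract refers to.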


Journal ArticleDOI
TL;DR: In this paper, the problem of combining price relatives of repeat sales of properties to obtain a price index can be converted into a regression problem, and standard techniques of regression analysis can be used to estimate the index.
Abstract: Quality differences make estimation of price indexes for real properties difficult, but these can be largely avoided by basing an index on sales prices of the same property at different times. The problem of combining price relatives of repeat sales of properties to obtain a price index can be converted into a regression problem, and standard techniques of regression analysis can be used to estimate the index. This method of estimation is more efficient than others for combining price relatives in that it utilizes information about the price index for earlier periods contained in sales prices in later periods. Standard errors of the estimated index numbers can be readily computed using the regression method, and it permits certain effects on the value of real properties to be eliminated from the index.

1,100 citations
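
A small sketch of the regression formulation: each pair of sales of the same property contributes one observation whose log price ratio is regressed on period dummies equal to −1 in the first-sale period and +1 in the second-sale period, with the base period's index set to 1. The toy data and names are illustrative, and whether the paper works with logs or with price relatives directly should be checked against the text.

```python
import numpy as np

# each record: (first-sale period, second-sale period, first price, second price)
sales = [(0, 2, 100.0, 120.0), (1, 3, 200.0, 260.0),
         (0, 3, 150.0, 195.0), (1, 2, 80.0, 92.0)]
periods = 4

X = np.zeros((len(sales), periods - 1))       # period 0 is the base period
y = np.zeros(len(sales))
for row, (t1, t2, p1, p2) in enumerate(sales):
    if t1 > 0:
        X[row, t1 - 1] = -1.0                 # sold out of period t1 ...
    if t2 > 0:
        X[row, t2 - 1] = 1.0                  # ... and again in period t2
    y[row] = np.log(p2 / p1)                  # log price relative for the repeat sale

b, *_ = np.linalg.lstsq(X, y, rcond=None)     # standard least squares
index = np.concatenate(([1.0], np.exp(b)))    # estimated price index, one value per period
print(index.round(3))
```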


Journal ArticleDOI
TL;DR: The finite sample properties of an asymptotically efficient technique (JASA, June 1962) for estimating coefficients in certain generally encountered sets of regression equations are studied in this paper.
Abstract: The finite sample properties of an asymptotically efficient technique (JASA, June, 1962) for estimating coefficients in certain generally encountered sets of regression equations are studied in this paper. In particular, exact first and second moments of the asymptotically efficient coefficient estimator are derived and compared with those of the usual least squares estimator. Further, the exact probability density function of the new estimator is derived and studied as a function of sample size. It is found that the approach to asymptotic normality is fairly rapid and that for a wide range of conditions an appreciable part of the asymptotic gain in efficiency is realized in samples of finite size.

676 citations
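
The June 1962 technique referred to here is presumably the two-stage estimator for sets of seemingly unrelated regression equations (an identification based only on the JASA reference, so it should be checked). A minimal two-equation sketch on synthetic data, with illustrative names: equation-by-equation least squares supplies a disturbance covariance estimate, then generalized least squares on the stacked system gives the joint coefficient estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
X2 = np.column_stack([np.ones(n), rng.normal(size=n)])
e = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=n)  # correlated disturbances
y1 = X1 @ np.array([1.0, 2.0]) + e[:, 0]
y2 = X2 @ np.array([-1.0, 0.5]) + e[:, 1]

# stage 1: least squares equation by equation, to estimate the disturbance covariance
b1 = np.linalg.lstsq(X1, y1, rcond=None)[0]
b2 = np.linalg.lstsq(X2, y2, rcond=None)[0]
residuals = np.column_stack([y1 - X1 @ b1, y2 - X2 @ b2])
S = residuals.T @ residuals / n

# stage 2: generalized least squares on the stacked two-equation system
X = np.block([[X1, np.zeros_like(X2)], [np.zeros_like(X1), X2]])
y = np.concatenate([y1, y2])
W = np.kron(np.linalg.inv(S), np.eye(n))
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
print(beta.round(3))            # coefficients of both equations, estimated jointly
```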


Journal ArticleDOI
TL;DR: In this paper, a ranking statistic L is presented as a test of a monotonic relationship among the treatment groups in the two-way analysis of variance; the test combines considerable power with computational ease and assumes data of only ordinal strength.
Abstract: In many experiments the background evidence, theories, or conditions suggest an expected ordering among the treatment effects, yet in the analysis of variance such implicit hypotheses are typically neglected. A ranking statistic L is presented as a test of a monotonic relationship among the treatment groups in the two-way analysis of variance. Used with the accompanying table of L, it combines considerable power with computational ease, and assumes data of only ordinal strength. L is related to the test of the linear component of the treatment sum of squares in the parametric randomized-block design, to the product-moment correlation and regression, to the normal deviate test of Lyerly's average rho, and to Friedman's chi-square of ranks. Where either L or the Friedman test may be used, L is often more accurate and appropriate, and it has some advantages over other tests of trend and monotonicity.

671 citations
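
A small sketch of the statistic, assuming each block's observations are ranked within the block and the column rank sums are weighted by the ordering of the treatments predicted in advance; names are illustrative, and the published table of L (or a large-sample approximation) would be consulted for significance.

```python
import numpy as np
from scipy.stats import rankdata

def L_statistic(data):
    """data: blocks x treatments, with columns arranged in the hypothesized order
    (smallest expected effect first). Returns the ranking statistic L."""
    data = np.asarray(data, dtype=float)
    ranks = np.apply_along_axis(rankdata, 1, data)     # rank within each block
    column_rank_sums = ranks.sum(axis=0)
    weights = np.arange(1, data.shape[1] + 1)          # predicted ordering 1..k
    return float(weights @ column_rank_sums)

# four blocks, three treatments expected to increase from left to right
print(L_statistic([[1.2, 1.9, 2.5],
                   [0.8, 1.1, 1.0],
                   [2.0, 2.4, 3.1],
                   [1.5, 1.4, 2.2]]))
```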


Journal ArticleDOI
TL;DR: Machlup defined knowledge as "any human (or human-induced) activity designed to create, alter, or confirm in a human mind-one's own or anyone else's-a meaningful apperception, awareness, cognizance, or consciousness of whatever it may be" as discussed by the authors.
Abstract: by Fritz Machlup. Princeton, New Jersey: Princeton University Press, 1962. Pp. xx+416. $7.50. In this book Professor Machlup has twice defined knowledge as "any human (or human-induced) activity designed to create, alter, or confirm in a human mind-one's own or anyone else's-a meaningful apperception, awareness, cognizance, or consciousness of whatever it may be." Though the term "distribution" appears in his title, it is in fact subsumed under knowledge production; Machlup is using "distribution" in the sense of knowledge transmission or diffusion as a process, which becomes part of "production" in economic analysis through exactly the same logic that identifies retailing, for example, as an economically "productive" activity. Knowledge production is thus defined to include many types of activities: transporting (e.g., postal services), transforming, data-processing, interpreting, and analyzing, as well as the original creation of new knowledge by discovery or invention. Machlup's concern is with three tasks: 1. A reasoned introductory presentation of concepts and types of knowledge. This analysis provides the broad framework and criteria for selection of what is to be included as knowledge-producing. 2. The measurement of the components of knowledge production and its share in gross national product or national income. This task occupies the bulk of the book. 3. The presentation of a program of school reform. Although the definition of knowledge cuts an exceedingly wide swath, Machlup narrows his task to manageable proportions by ...

647 citations


Book ChapterDOI
TL;DR: In this paper, the authors derived large sample normal distributions with their associated standard errors for various measures of association and various methods of sampling and explained how the large sample normality may be used to test hypotheses about the measures and about differences between them, and to construct corresponding confidence intervals.
Abstract: The population measures of association for cross classifications, discussed in the authors' prior publications, have sample analogues that are approximately normally distributed for large samples. (Some qualifications and restrictions are necessary.) These large sample normal distributions, with their associated standard errors, are derived for various measures of association and various methods of sampling. It is explained how the large sample normality may be used to test hypotheses about the measures and about differences between them, and to construct corresponding confidence intervals. Numerical results are given about the adequacy of the large sample normal approximations. In order to facilitate extension of the large sample results to other measures of association, and to other modes of sampling than those treated here, the basic manipulative tools of large sample theory are explained and illustrated.

470 citations
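
As a concrete instance, one measure in this family for cross classifications with ordered rows and columns, the gamma coefficient, has the sample analogue sketched below (concordant minus discordant pairs over their sum). The asymptotic standard errors themselves follow the formulas derived in the paper and are not reproduced here; the table is illustrative.

```python
import numpy as np

def sample_gamma(table):
    """Sample gamma for an r x c table with ordered categories:
    (concordant - discordant) / (concordant + discordant) pairs."""
    T = np.asarray(table, dtype=float)
    r, c = T.shape
    concordant = discordant = 0.0
    for i in range(r):
        for j in range(c):
            concordant += T[i, j] * T[i + 1:, j + 1:].sum()   # cells below and to the right
            discordant += T[i, j] * T[i + 1:, :j].sum()       # cells below and to the left
    return (concordant - discordant) / (concordant + discordant)

print(sample_gamma([[20, 10, 5],
                    [8, 15, 12],
                    [3, 9, 25]]))
```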


Journal ArticleDOI
TL;DR: In this paper, the authorship question of The Federalist papers is investigated, with word counts as the variables used for discrimination; the conclusion about the authorship problem is that the disputed papers were written by Madison rather than Hamilton.
Abstract: This study has four purposes: to provide a comparison of discrimination methods; to explore the problems presented by techniques based strongly on Bayes' theorem when they are used in a data analysis of large scale; to solve the authorship question of The Federalist papers; and to propose routine methods for solving other authorship problems. Word counts are the variables used for discrimination. Since the topic written about heavily influences the rate with which a word is used, care in selection of words is necessary. The filler words of the language such as an, of, and upon, and, more generally, articles, prepositions, and conjunctions provide fairly stable rates, whereas more meaningful words like war, executive, and legislature do not. After an investigation of the distribution of these counts, the authors execute an analysis employing the usual discriminant function and an analysis based on Bayesian methods. The conclusions about the authorship problem are that Madison rather than Hamilton ...

392 citations
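
A toy sketch of discrimination on filler-word rates under a simple Poisson word-count model; the study itself uses the usual discriminant function and a far more careful Bayesian treatment, and every rate and count below is an invented placeholder rather than data from the papers.

```python
import math

# invented illustrative rates per 1000 words for a few filler words
hamilton_rates = {"upon": 3.0, "while": 0.3, "whilst": 0.0, "enough": 0.6}
madison_rates = {"upon": 0.2, "while": 0.1, "whilst": 0.5, "enough": 0.2}

def log_likelihood(counts, total_words, rates):
    """Poisson log likelihood of the observed word counts given rates per 1000 words."""
    ll = 0.0
    for word, rate in rates.items():
        lam = max(rate, 0.05) * total_words / 1000.0     # small floor avoids log(0)
        k = counts.get(word, 0)
        ll += k * math.log(lam) - lam - math.lgamma(k + 1)
    return ll

# invented counts from a hypothetical disputed paper of about 2000 words
disputed = {"upon": 0, "while": 0, "whilst": 1, "enough": 0}
log_ratio = (log_likelihood(disputed, 2000, madison_rates)
             - log_likelihood(disputed, 2000, hamilton_rates))
print("log likelihood ratio, Madison over Hamilton:", round(log_ratio, 2))
```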


Journal ArticleDOI
TL;DR: In this paper, the use of prior beliefs in the estimation of regression coefficients is considered; in particular, the problems that arise when the residual variance of the regression equation is unknown are examined, and a large-sample solution is offered.
Abstract: This article deals with the use of prior beliefs in the estimation of regression coefficients; in particular, it considers the problems that arise when the residual variance of the regression equation is unknown and it offers a large-sample solution. Additional contributions deal with testing the hypothesis that prior and sample information are compatible with each other; and with a scalar measure for the shares of these two kinds of information in the posterior precision.
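
A minimal sketch of one standard way to combine stochastic prior beliefs about coefficients with sample information: treat the beliefs as extra pseudo-observations r = Rβ + v and apply generalized least squares. The residual variance is taken as known here, which is precisely the complication the article addresses, so this is only the textbook starting point; all names and numbers are illustrative.

```python
import numpy as np

def mixed_estimate(X, y, sigma2, R, r, Omega):
    """Combine sample information (y = X b + e, Var e = sigma2 * I) with stochastic
    prior information (r = R b + v, Var v = Omega) by generalized least squares."""
    Oinv = np.linalg.inv(Omega)
    precision = X.T @ X / sigma2 + R.T @ Oinv @ R     # posterior precision: sample plus prior shares
    rhs = X.T @ y / sigma2 + R.T @ Oinv @ r
    return np.linalg.solve(precision, rhs), precision

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(40), rng.normal(size=40)])
y = X @ np.array([2.0, 0.8]) + rng.normal(scale=0.5, size=40)

# prior belief: the slope is about 1.0, held with standard deviation 0.2
beta, precision = mixed_estimate(X, y, sigma2=0.25,
                                 R=np.array([[0.0, 1.0]]),
                                 r=np.array([1.0]),
                                 Omega=np.array([[0.04]]))
print(beta.round(3))
```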


Journal ArticleDOI
TL;DR: It is suggested that the operating-characteristic concepts of the Neyman-Pearson theory of tests are inappropriate to the analysis and interpretation of experimental data; the likelihood principle should be followed instead.
Abstract: In an extended review of Sequential Medical Trials by P. Armitage, the statistical principles which should govern the analysis of experimental observations, and the planning of experiments, are discussed. It is suggested that the operating-characteristic concepts of the Neyman-Pearson theory of tests are inappropriate to the analysis and interpretation of experimental data; the likelihood principle should be followed instead. The planning of medical trials under an ethical injunction against unnecessary continuance of inferior treatments is studied in some detail. The propriety of such trials is considered.


Journal ArticleDOI
TL;DR: In this paper, it is shown that any sum-preserving technique of seasonal adjustment that satisfies the quite reasonable requirements of orthogonality and idempotency can be executed on the electronic computer by standard least squares regression procedures.
Abstract: The logical implications of certain simple consistency requirements for appraising alternative procedures for seasonal adjustment constitute the first problem considered in this paper. It is shown that any sum-preserving technique of seasonal adjustment that satisfies the quite reasonable requirements of orthogonality and idempotency can be executed on the electronic computer by standard least squares regression procedures. Problems involved in utilizing seasonally adjusted data when estimating the parameters of econometric models are examined. Extending the fundamental Frisch-Waugh theorem concerning trend and regression analysis to encompass problems of seasonality facilitates the comparison of the implications of running regressions on data subjected to prior seasonal adjustment with the effects of including dummy variables with unadjusted data. Complicated types of moving seasonal patterns that may be handled by the dummy variable procedure are considered. Although efficient estimates of the ...
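
A small sketch of the central point that such an adjustment can be run as a standard least squares regression: regressing the series on seasonal dummy variables and keeping the residuals (plus the overall mean) is a projection, hence idempotent, orthogonal to the removed seasonal component, and sum-preserving. The toy series and names are illustrative.

```python
import numpy as np

def seasonally_adjust(y, period=12):
    """Least squares seasonal adjustment: project y off monthly dummies and restore
    the overall mean. The operator M = I - D (D'D)^{-1} D' is symmetric and idempotent,
    and the adjusted series has the same sum as the original."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    D = np.zeros((n, period))
    D[np.arange(n), np.arange(n) % period] = 1.0          # monthly dummy variables
    M = np.eye(n) - D @ np.linalg.inv(D.T @ D) @ D.T
    return M @ y + y.mean()

rng = np.random.default_rng(2)
months = np.arange(48)
series = 10 + 0.1 * months + 2 * np.sin(2 * np.pi * months / 12) + rng.normal(scale=0.3, size=48)
adjusted = seasonally_adjust(series)
print(round(series.sum(), 3), round(adjusted.sum(), 3))   # equal sums: the adjustment is sum-preserving
```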

Journal ArticleDOI
TL;DR: In this paper, several discriminant and multiple regression analyses were performed on retail credit application data to develop a numerical scoring system for predicting credit risk in a finance company, and the results showed that equal weights for all significantly predictive items were as effective as weights from the more sophisticated techniques of discriminant analysis and stepwise multiple regression.
Abstract: Several discriminant and multiple regression analyses were performed on retail credit application data to develop a numerical scoring system for predicting credit risk in a finance company. Results showed that equal weights for all significantly predictive items were as effective as weights from the more sophisticated techniques of discriminant analysis and “stepwise multiple regression.” However, a variation of the basic discriminant analysis produced a better separation of groups at the lower score levels, where more potential losses could be eliminated with a minimum cost of potentially good accounts.


Journal ArticleDOI
TL;DR: In this article, a simple cost function approach is proposed for designing an optimal clinical trial when a total of N patients with a disease are to be treated with one of two medical treatments.
Abstract: A simple cost function approach is proposed for designing an optimal clinical trial when a total of N patients with a disease are to be treated with one of two medical treatments. The cost function is constructed with but one cost, the consequences of treating a patient with the superior or inferior of the two treatments. Fixed sample size and sequential trials are considered. Minimax, maximin, and Bayesian approaches are used for determining the optimal size of a fixed sample trial and the optimal position of the boundaries of a sequential trial. Comparisons of the different approaches are made as well as comparisons of the results for the fixed and sequential plans.
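
A numerical sketch of the fixed-sample version under simplifying assumptions: normally distributed responses with known common variance, all N − 2n patients outside the trial receiving whichever treatment had the higher observed mean, and the only cost being the expected number of patients given the inferior treatment. The objective function and names below are a reconstruction under those assumptions, not the paper's exact formulation.

```python
import numpy as np
from scipy.stats import norm

def expected_on_inferior(n, N, delta, sigma):
    """Expected number of patients on the inferior treatment when 2n patients enter
    the trial (n per arm) and the remaining N - 2n all receive the apparent winner;
    delta is the true difference in mean response, sigma the common standard deviation."""
    p_wrong = norm.cdf(-delta * np.sqrt(n / 2.0) / sigma)   # probability of picking the loser
    return n + (N - 2 * n) * p_wrong

N, delta, sigma = 10_000, 0.5, 1.0
sizes = np.arange(5, N // 2)
losses = [expected_on_inferior(n, N, delta, sigma) for n in sizes]
best = int(sizes[int(np.argmin(losses))])
print("patients per arm:", best, " expected number on inferior treatment:", round(min(losses), 1))
```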


Journal ArticleDOI
TL;DR: In this article, it is shown that there exist paths along which the likelihood function of any sample t_1, …, t_n tends to ∞ as (γ, μ, σ²) approaches (t_(1), −∞, +∞), where t_(1) is the smallest of the t_i, and hence that in a meaningful sense this is the maximum-likelihood estimate.
Abstract: Some unusual features of the likelihood function of the three-parameter lognormal distribution ln(t − γ) ∼ N(μ, σ²) are explored. In particular, it is shown that there exist paths along which the likelihood function of any sample t_1, …, t_n tends to ∞ as (γ, μ, σ²) approaches (t_(1), −∞, +∞), where t_(1) is the smallest of the t_i, and hence that in a meaningful sense this is the maximum-likelihood estimate. Estimation is then considered from a Bayesian point of view, and some natural posterior distributions are explored. A statistical model for a point-source epidemic is presented, and the theory developed is used in estimating the time of onset and other parameters.
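
A short numerical illustration of the unbounded likelihood: with μ and σ² replaced by their maximizers for fixed γ, the log likelihood grows without bound as γ approaches the smallest observation from below. The data and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.exp(rng.normal(size=30)) + 5.0          # three-parameter lognormal sample with gamma = 5

def profile_loglik(gamma, t):
    """Log likelihood with mu and sigma^2 set to their maximizers for the given gamma
    (additive constants dropped)."""
    z = np.log(t - gamma)
    return -0.5 * len(t) * np.log(z.var()) - z.sum()

t_min = t.min()
for eps in [1e-1, 1e-3, 1e-6, 1e-9, 1e-12]:
    print(f"gamma = t_(1) - {eps:g}:  log L = {profile_loglik(t_min - eps, t):.1f}")
```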



Journal ArticleDOI
TL;DR: In this paper, the authors used the Survey Research Center Index of Consumer Attitudes (SRCA) to predict discretionary spending by consumers and found that attitudes contribute significantly to account for fluctuations in durable goods spending, particularly spending on new cars, after allowance is made for changes in the financial situation of consumers.
Abstract: Since 1951 the Survey Research Center has conducted surveys of consumer optimism and confidence two to four times a year in an attempt to improve methods of forecasting discretionary spending by consumers. The predictive success of these attitude surveys is tested here in conjunction with a number of financial variables by means of time series regressions covering the years 1952–61. The results indicate that attitude measurements contain information not obtainable from a simple combination of financial and business cycle indicators. The explanatory value of the Survey Research Center Index of Consumer Attitudes is consistently good in the sense that a number of alternative formulations of the time series regressions lead to the same conclusion: Attitudes contribute significantly to our ability to account for fluctuations in durable goods spending, particularly spending on new cars, after allowance is made for changes in the financial situation of consumers.


Journal ArticleDOI
TL;DR: This article applies an old, though little used, statistical test of authorship, a word-length frequency test, to show that Mark Twain almost certainly did not write the 10 letters signed "Quintus Curtius Snodgrass" and published in 1861 in the New Orleans Daily Crescent.
Abstract: Mark Twain is widely credited with the authorship of 10 letters published in 1861 in the New Orleans Daily Crescent. The adventures described in these letters, which were signed “Quintus Curtius Snodgrass,” provide the historical basis of a main part of Twain's presumed role in the Civil War. This study applies an old, though little used, statistical test of authorship, a word-length frequency test, to show that Twain almost certainly did not write these 10 letters. The statistical analysis includes a visual comparison of several word-length frequency distributions and applications of the χ² and two-sample t tests.
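
A small sketch of the χ² comparison of word-length frequency distributions between two text samples; the texts here are short repeated placeholders, not the Twain or Snodgrass material, and pooling long words into one bin is an arbitrary illustrative choice.

```python
import re
from collections import Counter
from scipy.stats import chi2_contingency

def word_length_counts(text, max_len=12):
    """Counts of word lengths 1..max_len, with longer words pooled into the last bin."""
    lengths = [min(len(w), max_len) for w in re.findall(r"[A-Za-z]+", text)]
    counts = Counter(lengths)
    return [counts.get(k, 0) for k in range(1, max_len + 1)]

text_a = "The quick brown fox jumps over the lazy dog near the quiet riverbank every morning " * 40
text_b = "Unquestionably the magnificent steamboat proceeded downriver carrying distinguished passengers " * 40

table = [word_length_counts(text_a), word_length_counts(text_b)]
columns = [c for c in zip(*table) if sum(c) > 0]          # drop length bins empty in both samples
chi2, p, dof, _ = chi2_contingency(list(zip(*columns)))
print(f"chi-square = {chi2:.1f}, dof = {dof}, p = {p:.3g}")
```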

Journal ArticleDOI
TL;DR: In this paper, the Fisher information I(p; f_1, f_2) for estimating the proportion p in a mixture λ(x) = p f_1(x) + (1 − p) f_2(x) of two densities is investigated.
Abstract: The Fisher information I(p; f_1, f_2) for estimating the proportion p in a mixture λ(x) = p f_1(x) + (1 − p) f_2(x) of two densities is investigated. A general power series expansion is obtained, which is then explored in detail for the case of two exponential densities, and for the case of two normal densities with equal scale. Simple approximations are obtained, for example when (μ_1 − μ_2)/σ is near zero in a mixture of two normal distributions with means μ_1 and μ_2 and common variance σ², and when α/β is near unity in a mixture of two exponential distributions with mean lives α⁻¹ and β⁻¹, α < β. Brief tables based on the various approximations are presented, giving an overall picture of the information. The main qualitative conclusion is that extremely large, and often impractical, sample sizes are required to obtain even moderate precision in estimating p unless the mixed distributions are very well separated.
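
A direct numerical check of the kind of information values involved, using I(p) = ∫ (f_1(x) − f_2(x))² / λ(x) dx, evaluated here for two normal densities with equal scale at a few illustrative separations; the rough sample-size figures follow from var(p̂) ≈ 1/(n I(p)).

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def mixture_information(p, mu1, mu2, sigma=1.0):
    """Fisher information for the mixing proportion p in
    lambda(x) = p f1(x) + (1 - p) f2(x), by numerical integration."""
    def integrand(x):
        f1, f2 = norm.pdf(x, mu1, sigma), norm.pdf(x, mu2, sigma)
        lam = p * f1 + (1 - p) * f2
        return (f1 - f2) ** 2 / lam if lam > 0 else 0.0   # guard against underflow far in the tails
    info, _ = quad(integrand, -np.inf, np.inf)
    return info

for separation in [0.5, 1.0, 2.0, 4.0]:
    info = mixture_information(p=0.3, mu1=0.0, mu2=separation)
    n_needed = 1 / (0.05**2 * info)      # rough sample size for a standard error of about 0.05
    print(f"(mu1 - mu2)/sigma = {separation}:  I(p) = {info:.4f},  n for SE 0.05 ~ {n_needed:,.0f}")
```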


Journal ArticleDOI
TL;DR: In this article, the outcome of a statistical test based on the complete sample sometimes can be anticipated by employing the following procedure: as each observation is made calculate least upper and greatest lower bounds for any subsequent value of the statistic; if the critical value does not lie between these bounds then the outcome is determined and experimentation is terminated.
Abstract: If a fixed number of observations are made sequentially, the outcome of a statistical test based on the complete sample sometimes can be anticipated by employing the following procedure: As each observation is made calculate least upper and greatest lower bounds for any subsequent value of the statistic; if the critical value does not lie between these bounds then the outcome of the test is determined and experimentation is terminated. Use of this procedure in the case where the Wilcoxon two-sample test is applied to an ordered sequence of observations considerably reduces the average sample size.
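
A compact sketch of the curtailment idea using the Mann-Whitney form of the two-sample statistic (U counts pairs with x above y and is equivalent to the rank-sum statistic up to a constant): after each observation, compute the smallest and largest values the complete-sample statistic could still take, and stop once the critical value lies outside that interval. The data, sample sizes, and critical value are illustrative.

```python
def curtailed_mann_whitney(stream, n_x, n_y, critical):
    """stream: ('x', value) / ('y', value) pairs in order of observation.
    Reject when the final U is certain to reach the critical value, accept when it
    certainly cannot, and report how many observations that decision required."""
    xs, ys, u = [], [], 0
    taken = 0
    for taken, (label, value) in enumerate(stream, start=1):
        if label == "x":
            u += sum(value > y for y in ys)
            xs.append(value)
        else:
            u += sum(x > value for x in xs)
            ys.append(value)
        undecided = n_x * n_y - len(xs) * len(ys)      # pairs not yet fully observed
        lower, upper = u, u + undecided                # bounds on the final statistic
        if lower >= critical:
            return "reject", taken
        if upper < critical:
            return "accept", taken
    return "complete sample needed", taken

stream = [("x", 1.0), ("y", 9.0), ("x", 2.0), ("y", 8.0), ("x", 1.5),
          ("y", 7.0), ("x", 0.5), ("y", 9.5), ("x", 2.5), ("y", 8.5)]
print(curtailed_mann_whitney(stream, n_x=5, n_y=5, critical=21))   # decided well before all 10 arrive
```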