
Showing papers in "Journal of the American Statistical Association in 1967"


Journal ArticleDOI
TL;DR: In this paper, a table is given for use with the Kolmogorov-Smirnov statistic for testing whether a set of observations is from a normal population when the mean and variance are not specified but must be estimated from the sample, and the power of the test is briefly investigated by Monte Carlo.
Abstract: The standard tables used for the Kolmogorov-Smirnov test are valid when testing whether a set of observations are from a completely-specified continuous distribution. If one or more parameters must be estimated from the sample then the tables are no longer valid. A table is given in this note for use with the Kolmogorov-Smirnov statistic for testing whether a set of observations is from a normal population when the mean and variance are not specified but must be estimated from the sample. The table is obtained from a Monte Carlo calculation. A brief Monte Carlo investigation is made of the power of the test.

3,923 citations
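
The Monte Carlo calculation described in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's own procedure: the sample size, number of replications, seed, and function names are all assumptions chosen for the example.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def ks_stat_estimated_normal(sample):
    """KS distance between the empirical CDF and a normal CDF whose
    mean and standard deviation are estimated from the same sample."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    d = 0.0
    for i, x in enumerate(sorted(sample), start=1):
        f = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def monte_carlo_critical_value(n, alpha=0.05, reps=2000, seed=0):
    """Empirical (1 - alpha) quantile of the statistic under normality."""
    rng = random.Random(seed)
    stats = sorted(
        ks_stat_estimated_normal([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    )
    return stats[math.ceil((1 - alpha) * reps) - 1]


# Hypothetical settings: n = 20 observations, 5% level.
crit = monte_carlo_critical_value(20)
print(round(crit, 3))
```

Because the parameters are fitted to the data, the simulated critical value should come out noticeably smaller than the standard-table value for a fully specified distribution (about 0.29 for n = 20 at the 5% level), which is why the standard tables are no longer valid in this setting.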


Journal ArticleDOI
TL;DR: For rectangular confidence regions for the mean values of multivariate normal distributions, this paper proved that a confidence region constructed for independent coordinates is, at the same time, a conservative confidence region for any case of dependent coordinates.
Abstract: For rectangular confidence regions for the mean values of multivariate normal distributions the following conjecture of O. J. Dunn [3], [4] is proved: Such a confidence region constructed for the case of independent coordinates is, at the same time, a conservative confidence region for any case of dependent coordinates. This result is based on an inequality for the probabilities of rectangles in normal distributions, which permits one to factor out the probability for any single coordinate.

2,413 citations
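
As a hedged illustration of the result above: if the k coordinates were independent, each interval would need confidence (1 - alpha)^(1/k) for the rectangle to have joint confidence 1 - alpha, and the theorem says the same rectangle remains conservative under dependence. The sketch below assumes known coordinate variances (so normal quantiles apply); the function names are hypothetical.

```python
from statistics import NormalDist


def sidak_multiplier(k, alpha=0.05):
    """Per-coordinate confidence level and two-sided normal multiplier
    for a rectangular region built as if the k coordinates were independent."""
    per_coord_conf = (1 - alpha) ** (1 / k)
    z = NormalDist().inv_cdf(1 - (1 - per_coord_conf) / 2)
    return per_coord_conf, z


def bonferroni_multiplier(k, alpha=0.05):
    """Two-sided normal multiplier from the Bonferroni bound, for comparison."""
    return NormalDist().inv_cdf(1 - alpha / (2 * k))


conf5, z5 = sidak_multiplier(5)
print(conf5, z5, bonferroni_multiplier(5))
```

Each interval is then estimate ± z × (standard error). The product-form multiplier is never larger than the Bonferroni one, so the resulting rectangle is slightly tighter.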


Journal ArticleDOI
TL;DR: In this paper, the authors present some meaningful derivations of a multivariate exponential distribution that serve to indicate conditions under which the distribution is appropriate; two derivations are based on shock models and one on the requirement that residual life is independent of age.
Abstract: A number of multivariate exponential distributions are known, but they have not been obtained by methods that shed light on their applicability. This paper presents some meaningful derivations of a multivariate exponential distribution that serves to indicate conditions under which the distribution is appropriate. Two of these derivations are based on “shock models,” and one is based on the requirement that residual life is independent of age. It is significant that the derivations all lead to the same distribution. For this distribution, the moment generating function is obtained, comparison is made with the case of independence, the distribution of the minimum is discussed, and various other properties are investigated. A multivariate Weibull distribution is obtained through a change of variables.

1,481 citations
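
A shock-model derivation of the kind mentioned above can be sketched by simulation. The abstract does not spell out the construction, so the version below is an assumption: a two-component system in which each component fails at its own shock or at a shock common to both, all three shock times being independent exponentials. The rates are hypothetical.

```python
import random


def shock_model_pair(rng, lam1, lam2, lam12):
    """Lifetimes of two components under independent exponential shocks:
    shock 1 kills component 1, shock 2 kills component 2,
    and the common shock kills both."""
    z1 = rng.expovariate(lam1)
    z2 = rng.expovariate(lam2)
    z12 = rng.expovariate(lam12)
    return min(z1, z12), min(z2, z12)


rng = random.Random(1)
lam1, lam2, lam12 = 1.0, 2.0, 0.5  # hypothetical shock rates
pairs = [shock_model_pair(rng, lam1, lam2, lam12) for _ in range(20000)]

mean_x = sum(x for x, _ in pairs) / len(pairs)
mean_min = sum(min(x, y) for x, y in pairs) / len(pairs)
share_equal = sum(x == y for x, y in pairs) / len(pairs)
print(mean_x, mean_min, share_equal)
```

Under these assumptions the first lifetime is exponential with rate lam1 + lam12, the minimum is exponential with rate lam1 + lam2 + lam12, and the two lifetimes coincide with probability lam12 / (lam1 + lam2 + lam12), which is one way the distribution differs from the case of independence.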


Journal ArticleDOI
TL;DR: In this paper, the authors considered the problem of obtaining efficient estimates for the parameters of a system of M regression equations, where the disturbance terms of this system were assumed to be related by both serial and contemporaneous correlation.
Abstract: This paper considers the problem of obtaining efficient estimates for the parameters of a system of M regression equations. The disturbance terms of this system are assumed to be related by both serial and contemporaneous correlation. Under the further assumption that the serial correlation is a first order autoregressive process, the paper develops an estimator that is consistent and has the same asymptotic normal distribution as the Aitken estimator which assumes the covariance matrix to be known. The paper concludes with a discussion of some alternative covariance specifications and points out certain difficulties with the standard single equation procedures for handling auto-regressive schemes.

1,058 citations


Journal ArticleDOI
Hans Riedwyl
TL;DR: In this paper, the authors define a class of distribution free measures of goodness of fit; their exact distribution for small samples can be calculated by means of a computer and two of them have the same asymptotic distribution as the Kolmogorov-Smirnov statistic.
Abstract: This paper defines a class of distribution free measures of goodness of fit; their exact distribution for small samples can be calculated by means of a computer. Two of them have the same asymptotic distribution as the Kolmogorov-Smirnov statistic.

999 citations


Journal ArticleDOI
H. P. Friedman, J. Rubin
TL;DR: This paper attacks the problem of exploring the structure of multivariate data in search of “clusters” by using a computer procedure to obtain the “best” partition of n objects into g groups.
Abstract: This paper deals with methods of “cluster analysis”. In particular we attack the problem of exploring the structure of multivariate data in search of “clusters”. The approach taken is to use a computer procedure to obtain the “best” partition of n objects into g groups. A number of mathematical criteria for “best” are discussed and related to statistical theory. A procedure for optimizing the criteria is outlined. Some of the criteria are compared with respect to their behavior on actual data. Results of data analysis are presented and discussed.

586 citations


Journal ArticleDOI
TL;DR: In this paper, the authors compared three strategies of data collection, each combining personal interviews, telephone interviews, and mail questionnaires in different proportions, and found that the responses from the three strategies were highly comparable: rate of return and rate of completeness of questionnaires were high for all three, substantive findings were virtually interchangeable, and there was little difference in validity.
Abstract: Returns and findings from three strategies of data collection are compared. Each strategy contains personal interviews, telephone interviews, and mail questionnaires in different combinations: one mainly personal, one mainly telephone, and one mainly mail. All three strategies are based on area probability samples of households in Alameda County, California. The test was made on two separate studies, with identical questionnaires used in all strategies within each study. The responses from the three strategies were found to be highly comparable. Rate of return and rate of completeness of questionnaires were high for all three; substantive findings were virtually interchangeable; and there was little difference in validity. The only important difference was cost per interview, which varied considerably by strategy.

414 citations


Journal ArticleDOI
TL;DR: In this paper, a simple step-wise procedure for the clustering of variables is described, and two alternative criteria for the merger of groups at each pass are discussed: maximization of the pairwise correlation between the centroids of two groups, and minimization of Wilks' statistic to test the hypothesis of independence between two groups.
Abstract: A simple step-wise procedure for the clustering of variables is described. Two alternative criteria for the merger of groups at each pass are discussed: (1) maximization of the pairwise correlation between the centroids of two groups, and (2) minimization of Wilks’ statistic to test the hypothesis of independence between two groups. For a set of sample covariance matrices the step-wise solution for each criterion is compared with the optimal two-group separation of variables found by total enumeration of the possible groupings.

404 citations


Journal ArticleDOI
TL;DR: In this paper, a questionnaire is developed and used in a study in which people actually assess prior distributions, and the results indicate that, by and large, it is feasible to question people about subjective prior probability distributions, although this depends on t...
Abstract: In the Bayesian framework, quantified judgments about uncertainty are an indispensable input to methods of statistical inference and decision. Ultimately, all components of the formal mathematical models underlying inferential procedures represent quantified judgments. In this study, the focus is on just one component, the prior distribution, and on some of the problems of assessment that arise when a person tries to express prior distributions in quantitative form. The objective is to point toward assessment procedures that can actually be used. One particular type of statistical problem is considered and several techniques of assessment are presented, together with the necessary instruction so that these techniques can be understood and applied. A questionnaire is developed and used in a study in which people actually assess prior distributions. The results indicate that, by and large, it is feasible to question people about subjective prior probability distributions, although this depends on t...

402 citations




Journal ArticleDOI
TL;DR: In this article, the problem of comparing two or more populations with respect to a response variable Y in the presence of a (possibly multivariate) concomitant variable X is discussed.
Abstract: Various methods are discussed for the problem of comparing two or more populations with respect to a response variable Y in the presence of a (possibly multivariate) concomitant variable X—a situation in which the usual method is the standard one-way analysis of covariance. A method based on ranks is developed.

Journal ArticleDOI
TL;DR: If the full potential of the electronic computer is to be achieved, an understanding of the basic arithmetic operations and their effect on the a...
Abstract: Although there are many linear least squares programs available for use on the electronic computer, the algorithms specified in many of these programs are numerically more appropriate for the desk calculator than for the electronic computer. Routines which may be efficient for desk calculators may not be efficient for electronic computers. Since most computers carry about eight digits in the calculations, routines which do not take the problem of round-off errors and truncation into account may produce inaccurate numerical results. The difficulty is that the user will not know whether the results are accurate. Experiments with routine test problems using economic data indicated that either the data must be modified to fit the program or that the program must be altered to fit the data before numerical accuracy could be obtained on most programs tested. If the full potential of the electronic computer is to be achieved, an understanding of the basic arithmetic operations and their effect on the a...

Journal ArticleDOI
TL;DR: In this article, the authors discuss measures of industry concentration, including the concentration ratio, the Herfindahl index and a new measure developed by the authors, analytically and empirically.
Abstract: This paper discusses measures of industry concentration—the concentration ratio, the Herfindahl index and a new measure developed by the authors—both analytically and empirically. The analytical analysis consists of developing a set of properties which we argue all measures of concentration should possess. Although the concentration ratio is shown to be deficient on analytical grounds it appears to yield estimates of concentration not too different from the Herfindahl index and the measure devised by the authors.
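
For orientation, the two established measures named above can be computed as in this minimal sketch. It uses the usual textbook definitions (the k-firm concentration ratio as the combined share of the k largest firms, the Herfindahl index as the sum of squared shares); the sales figures are hypothetical, and the authors' new measure is not reproduced because the abstract does not define it.

```python
def market_shares(sales):
    """Convert firm sales to shares of the industry total."""
    total = sum(sales)
    return [s / total for s in sales]


def concentration_ratio(sales, k=4):
    """Combined market share of the k largest firms."""
    shares = sorted(market_shares(sales), reverse=True)
    return sum(shares[:k])


def herfindahl(sales):
    """Sum of squared market shares (1/n for n equal firms, 1 for a monopoly)."""
    return sum(s * s for s in market_shares(sales))


industry = [50, 20, 10, 10, 5, 5]  # hypothetical firm sales
print(concentration_ratio(industry), herfindahl(industry))
```

The concentration ratio ignores how share is distributed among the top k firms and among the remainder, whereas the Herfindahl index responds to every firm's share.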


Journal ArticleDOI
TL;DR: The construction technique is applied to voting behaviour of the 50 United States in the last 13 presidential elections, giving a tree clustering of the states.
Abstract: Suppose given a set of similarities (or dissimilarities) between pairs of objects from some set of objects, such as animal species, books, colours. We wish to construct from this similarity matrix a tree, or nested set of clusterings of the objects; graphs of trees provide a striking visual display of similarity groupings of the objects. The construction requires (1) a definition specifying when a similarity matrix has exact tree structure, (2) a measure of distance between any two similarity matrices, which yields (when combined with (1)) a measure of distance between any similarity matrix and any tree, (3) a family of local operations on a tree, which can be used to search out trees which best fit a given similarity matrix. The construction technique is applied to voting behaviour of the 50 United States in the last 13 presidential elections, giving a tree clustering of the states.

Journal ArticleDOI
TL;DR: In this paper, a modification of the Bradley-Terry model was proposed by introducing an additional parameter, called threshold parameter, into the model, which permits "ties" in the model and the problem of estimation and tests of hypotheses for the parameters of the modified model is also dealt with.
Abstract: The Bradley-Terry model for a paired-comparison experiment with $t$ treatments postulates a set of $t$ ‘true’ treatment ratings $\pi_1, \pi_2, \ldots, \pi_t$ such that $\pi_i \ge 0$, $\sum \pi_i = 1$ and the probability for preferring treatment $i$ to treatment $j$ is $\pi_i (\pi_i + \pi_j)^{-1}$. Thus, according to this model, every comparison of two treatments results in a definite preference for one of the two. This is an unrealistic restriction since when there is no difference between the responses due to two treatments, any method of expressing preference for one over the other is somewhat arbitrary. This paper considers a modification of the Bradley-Terry model by introducing an additional parameter, called threshold parameter, into the model. This permits ‘ties’ in the model. The problem of estimation and tests of hypotheses for the parameters of the modified model is also dealt with in the paper.
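
The abstract does not state the modified preference probabilities. The sketch below therefore assumes one common threshold parameterization, with a single parameter theta >= 1 that inflates the opponent's rating; the function name and the example ratings are hypothetical.

```python
def preference_probabilities(pi_i, pi_j, theta=1.0):
    """Return (P(i preferred), P(j preferred), P(tie)) under an assumed
    threshold extension of the Bradley-Terry model, with theta >= 1.
    theta = 1 recovers the original model, where ties have probability zero."""
    p_i = pi_i / (pi_i + theta * pi_j)
    p_j = pi_j / (theta * pi_i + pi_j)
    p_tie = ((theta ** 2 - 1) * pi_i * pi_j
             / ((pi_i + theta * pi_j) * (theta * pi_i + pi_j)))
    return p_i, p_j, p_tie


# Hypothetical ratings for two treatments, with and without a threshold.
print(preference_probabilities(0.6, 0.4, theta=1.0))
print(preference_probabilities(0.6, 0.4, theta=1.5))
```

In this parameterization the three probabilities always sum to one, and increasing theta shifts probability mass from definite preferences to ties.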

Journal ArticleDOI
TL;DR: In this paper, the problem of estimating a location parameter from a random sample when the form of distribution is unknown or there is contamination of the target distribution is attacked by deriving estimators which are efficient over a class of two or more forms (pencils) of continuous symmetric unimodal distributions.
Abstract: The problem of estimating a location parameter from a random sample when the form of distribution is unknown or there is contamination of the target distribution is attacked by deriving estimators which are efficient over a class of two or more forms (“pencils”) of continuous symmetric unimodal distributions. The pencils considered are the normal, double exponential, Cauchy, parabolic, triangular, and rectangular (a limiting case). The estimators considered are special symmetrical linear combinations of order statistics: trimmed means, Winsorized means, “linearly weighted” means, and a combination of the median and two other order statistics. These are also compared asymptotically with a Hodges-Lehmann estimator. The theory required for deriving asymptotic variances is outlined. Efficiencies are tabulated for sample sizes of 4 or 5, 8 or 9, 16 or 17, and ∞. Asymptotic efficiencies of at least 0.82 relative to the best estimator for any single pencil are achieved by using the best trimmed mean or li...
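
Two of the estimators named above can be sketched in a few lines. This is a minimal illustration using the usual symmetric definitions, with g observations treated at each end; the function names and data are hypothetical, and the paper's choice of trimming proportions is not reproduced.

```python
def trimmed_mean(data, g):
    """Mean after discarding the g smallest and g largest observations."""
    xs = sorted(data)
    kept = xs[g:len(xs) - g]
    return sum(kept) / len(kept)


def winsorized_mean(data, g):
    """Mean after replacing the g smallest observations by the (g+1)-th smallest
    and the g largest by the (g+1)-th largest."""
    xs = sorted(data)
    n = len(xs)
    ws = [xs[g]] * g + xs[g:n - g] + [xs[n - g - 1]] * g
    return sum(ws) / n


sample = [1, 2, 3, 5, 100]  # hypothetical sample with one wild value
print(trimmed_mean(sample, 1), winsorized_mean(sample, 1))
```

Both are symmetric linear combinations of order statistics: the trimmed mean gives zero weight to the extremes, while the Winsorized mean moves their weight onto the nearest retained order statistics.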


Journal ArticleDOI
TL;DR: This paper analyses the distributions of persons by per capita household consumer expenditure on all items, estimated from the 13th Round (Sept. 1957-May 1958) of the Indian National Sample Survey (NSS) separately for the rural and urban sectors of different states of India, and analyses the disparities in consumption into between-states and within-states components.
Abstract: This paper analyses the distributions of persons by per capita household consumer expenditure on all items estimated from the 13th Round (Sept. 1957-May 1958) of the Indian National Sample Survey (NSS) separately for the rural and urban sectors of the different states of India [19]. For rural India, urban India and all-India, the disparities in consumption are analysed into between states and within states components. This is easily done by an analysis of variance of logarithms. Greater attention is given to measures related to the Gini-Lorenz concentration curve; but while the ‘between states’ concentration curve could be defined in an interesting manner, the ‘within states’ component could not be defined with equal success.

Journal ArticleDOI
TL;DR: In this article, an ideal Assessor is hypothesized and his behavior is investigated under a number of such methods, including those suggested by de Finetti and others, and the implications of these methods for the theory of personal probability are discussed.
Abstract: The personalistic theory of probability prescribes that a person should use personal probability assessments in decision-making and that these assessments should correspond with his judgments. Since the judgments exist solely in the assessor's mind, there is no way to prove whether or not this requirement is satisfied. De Finetti has proposed the development of methods which should oblige the assessor to make his assessments correspond with his judgments. An ideal Assessor is hypothesized and his behavior is investigated under a number of such methods (including those suggested by de Finetti and others). The implications of these methods for the theory of personal probability are discussed. Finally, although the present interest is primarily normative, the practicability of the methods is also discussed.

Journal ArticleDOI
TL;DR: In this article, different types of matrix derivatives are defined and illustrated, and simple techniques are derived and shown to apply to a considerable collection of matrix functions, with applications to establishing matrix integrals from scalar ones, determining maximum likelihood estimates for complex likelihood functions, optimizing matrix functions when there are matrices of side conditions, and evaluating the Jacobians of certain classes of transformations.
Abstract: It is claimed that the reasons for using matrices of derivatives, in appropriate situations, are as compelling as those for using matrices. This paper provides basic material for such use. Different types of matrix derivatives are defined and illustrated. Simple and easy techniques are then derived and are shown to be applicable to a considerable collection of matrix functions. Applications are made to such problems as establishing matrix integrals from scalar ones, determining maximum likelihood estimates for complex likelihood functions, optimizing matrix functions when there are matrices of side conditions, and evaluating the Jacobians of certain classes of transformations. The emphasis is on simplicity of derivation and on breadth of application.

Journal ArticleDOI
TL;DR: In this paper, the authors demonstrate the unbiasedness of Zellner's seemingly unrelated regression equations estimators under fairly general conditions.
Abstract: The purpose of the present note is to demonstrate the unbiasedness of Zellner's seemingly unrelated regression equations estimators under fairly general conditions.

Journal ArticleDOI
TL;DR: In this paper, it is shown that a weighted mean $T = \sum W_i T_i$ of estimators $T_1, T_2, \ldots, T_m$, with weights summing to one, is an unbiased estimator of the center of every symmetric distribution when each $T_j$ is an odd location statistic and each weight $W_j$ is an even location-free statistic, provided certain expectations exist.
Abstract: Let $T_j$ be a reasonable estimator (for example, a minimum mean square error estimator) of the parameter $\theta$ of the family $D_j$ of distributions, $j = 1, 2, \ldots, m$. An estimator $T$, which is a weighted mean of $T_1, T_2, \ldots, T_m$, is found that has the same asymptotic distribution as that of $T_j$, when the sample comes from $D_j$, $j = 1, 2, \ldots, m$. Here the weights are functions of the sample items. Empirical evidence is given which indicates that $T$ is satisfactory for small sample sizes. It is proved that if $T_j$ and the weight $W_j$ are odd location and even location-free statistics, respectively, $j = 1, 2, \ldots, m$, then $T = \sum W_i T_i$, where $\sum W_i = 1$, is an unbiased estimator of the center of every symmetric distribution, provided certain expectations exist. This is useful in the construction of the weight function $W_j$.

Journal ArticleDOI
TL;DR: This entry lists course readings on the U.S. income tax, drawn largely from Slemrod and Bakija: an overview of the tax system, criteria for evaluating tax laws (fairness, economic prosperity, simplicity and enforceability), and the concept of income.
Abstract: 1. Overview of U.S. Income Tax (a) Slemrod & Bakija, Taxing Ourselves (4th ed.). Introduction and an Overview of the U.S. Tax System, pp. 1-5 2. Criteria for Evaluating Tax Laws (a) Slemrod & Bakija, pp. 56-98 (fairness); 99-158 (economic prosperity); 159-188 (simplicity and enforceability) (b) Nozick, Anarchy, State and Utopia, pp. 149-164, 167-174 Available in the course material (c) Notes on Slemrod and Bakija Available in the course material (d) Slemrod, Does Atlas Shrug?, pp. 3-26, The Economics of Taxing the Rich Available in the course material 3. The Concept of Income (a) H. Simons, Personal Income Taxation (1938), pp. 41-58 (definition of income); 110-125 (income in kind); 125-147 (gratuitous receipts); 148-169 (capital gains) Available in the course material

Journal ArticleDOI
TL;DR: In this article, it was shown that this coefficient is a weighted sum of the values of γ in the various strata defined by categories of C, where the weight in stratum i is its proportion of the total pairs which differ on A and B and are tied on C.
Abstract: Following Goodman and Kruskal's interpretation of their coefficient, $\gamma_{A,B}$, a partial coefficient, $\gamma_{A,B|C}$, is defined as “how much more probable it is to get like than unlike orders in measures A and B when pairs of individuals differing on A and on B and tied on C but unselected on any other measure are chosen at random from the population.” It is shown that this coefficient is a weighted sum of the values of $\gamma$ in the various strata defined by categories of C, where the weight in stratum $i$ is its proportion of the total pairs which differ on A and B and are tied on C. An empirical example illustrates the calculation of the coefficient.
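
The weighted-sum property stated above can be checked numerically with a small sketch. It assumes the ordinal measures A and B are coded as numbers and C as category labels; the records and function names are hypothetical.

```python
from itertools import combinations


def concordant_discordant(pairs_ab):
    """Count pairs ordered alike (concordant) and unlike (discordant) on A and B.
    Pairs tied on A or on B contribute to neither count."""
    c = d = 0
    for (a1, b1), (a2, b2) in combinations(pairs_ab, 2):
        s = (a1 - a2) * (b1 - b2)
        if s > 0:
            c += 1
        elif s < 0:
            d += 1
    return c, d


def partial_gamma(records):
    """records: iterable of (a, b, c). Returns the partial gamma computed two ways:
    pooled over pairs tied on C, and as the weighted sum of stratum gammas."""
    strata = {}
    for a, b, c in records:
        strata.setdefault(c, []).append((a, b))
    counts = [concordant_discordant(v) for v in strata.values()]
    total = sum(cc + dd for cc, dd in counts)
    pooled = sum(cc - dd for cc, dd in counts) / total
    weighted = sum(
        ((cc + dd) / total) * ((cc - dd) / (cc + dd))
        for cc, dd in counts
        if cc + dd > 0
    )
    return pooled, weighted


# Hypothetical records (A, B, C) with two categories of C.
records = [(1, 1, 0), (2, 2, 0), (3, 1, 0),
           (1, 2, 1), (2, 3, 1), (3, 3, 1), (2, 1, 1)]
print(partial_gamma(records))
```

The two values agree because each stratum's weight is its share of the pairs that differ on A and B and are tied on C, which is exactly the denominator of that stratum's gamma.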

Journal ArticleDOI
TL;DR: In this article, a Bayesian analysis of the exponential model, based on life tests that are terminated at preassigned time points or after pre-assigned number of failures, has been developed.
Abstract: Bayesian analysis of the exponential model, based on life tests that are terminated at preassigned time points or after preassigned number of failures, has been developed. For the prior distribution of the parameter involved, uniform, inverted gamma and exponential densities have been examined. The estimation of the reliability function has also been carried out by using Bayesian methods and the case of ‘attribute testing’ has been considered briefly. The role of prior quasi-densities when a life tester has no prior information has been illustrated and it has been observed that the reliability estimate for a diffuse prior which is uniform over the entire positive real line closely resembles the classical MVU estimate obtained by Pugh. It has also been noted that the Bayes estimate of the exponential parameter $\theta$ for a prior quasi-density of the form $1/\theta^2$ coincides with the classical MVU estimate of Epstein and Sobel. Further, it has been proved that in a wide class of prior densities, proper or im...

Journal ArticleDOI
TL;DR: In this article, it was shown that in a standard linear regression model, ordinary least-squares estimators are best linear unbiased if and only if the errors have the same variance and the same nonnegative coefficient of correlation between each pair.
Abstract: It is shown that in a standard linear regression model ordinary least-squares estimators are best linear unbiased if and only if the errors have the same variance and the same nonnegative coefficient of correlation between each pair.

Journal ArticleDOI
TL;DR: In this paper, the force of mortality for each individual is assumed to be a function of the covariable, and an approach to compare the mortality experience of two or more groups of individuals who are known not to be comparable with respect to a covariable is presented.
Abstract: In applied problems one often wishes to compare the mortality experience of two or more groups of individuals who are known not to be comparable with respect to a covariable. This paper presents an approach to this problem by assuming that the force of mortality for each individual is a function of the covariable. Extension to the case where more than one covariable is present is indicated. It is also suggested that the present method is adaptable to an actuarial type analysis.

Journal ArticleDOI
TL;DR: In this paper, the power of the F-test corresponding to the degrees of freedom $(f_1, f_2)$ and non-centrality parameter $\lambda$ is tabulated for Type I error $\alpha$ = 0.005, 0.01, 0.025, 0.05.
Abstract: The values of the power of the F-test corresponding to the degrees of freedom $(f_1, f_2)$ and non-centrality parameter $\lambda$ are tabulated for Type I error $\alpha$ = 0.005, 0.01, 0.025, 0.05, for the following values of the parameters: and