
Showing papers in "Journal of the American Statistical Association in 1968"


Journal ArticleDOI
TL;DR: In this article, a simple and robust estimator of the regression coefficient β based on Kendall's rank correlation tau is studied, where the point estimator is the median of the set of slopes (Yj − Yi)/(tj − ti) joining pairs of points with ti ≠ tj.
Abstract: The least squares estimator of a regression coefficient β is vulnerable to gross errors, and the associated confidence interval is, in addition, sensitive to non-normality of the parent distribution. In this paper, a simple and robust (point as well as interval) estimator of β based on Kendall's [6] rank correlation tau is studied. The point estimator is the median of the set of slopes (Yj − Yi)/(tj − ti) joining pairs of points with ti ≠ tj, and is unbiased. The confidence interval is also determined by two order statistics of this set of slopes. Various properties of these estimators are studied and compared with those of the least squares and some other nonparametric estimators.

8,409 citations
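The median-of-pairwise-slopes idea above (now usually called the Theil-Sen estimator) is simple enough to sketch directly. The data below are hypothetical, chosen so that one gross error barely moves the estimate:

```python
import itertools
import statistics

def theil_sen_slope(t, y):
    """Point estimate of beta: the median of the slopes (y_j - y_i)/(t_j - t_i)
    over all pairs of points with t_i != t_j."""
    slopes = [(y[j] - y[i]) / (t[j] - t[i])
              for i, j in itertools.combinations(range(len(t)), 2)
              if t[i] != t[j]]
    return statistics.median(slopes)

# Hypothetical data: true slope 2, with one gross error at t = 5.
t = [1, 2, 3, 4, 5, 6]
y = [2.1, 4.0, 6.2, 8.1, 50.0, 12.0]
print(theil_sen_slope(t, y))  # median slope stays near 2 despite the outlier
```

Least squares fitted to the same data is pulled far above 2, which is the vulnerability to gross errors the abstract describes.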


Journal ArticleDOI
TL;DR: In this paper, an empirical sampling study of the sensitivities of nine statistical procedures for evaluating the normality of a complete sample was carried out, covering W (Shapiro and Wilk, 1965), √b1 (standard third moment), b2 (standard fourth moment), KS (Kolmogorov-Smirnov), CM (Cramer-von Mises), WCM (weighted CM), D (modified KS), CS (chi-squared) and u (studentized range).
Abstract: Results are given of an empirical sampling study of the sensitivities of nine statistical procedures for evaluating the normality of a complete sample. The nine statistics are W (Shapiro and Wilk, 1965), √b1 (standard third moment), b2 (standard fourth moment), KS (Kolmogorov-Smirnov), CM (Cramer-von Mises), WCM (weighted CM), D (modified KS), CS (chi-squared) and u (studentized range). Forty-five alternative distributions in twelve families and five sample sizes were studied. Results are included on the comparison of the statistical procedures in relation to groupings of the alternative distributions, on means and variances of the statistics under the various alternatives, on dependence of sensitivities on sample size, on approach to normality as measured by the W statistic within some classes of distribution, and on the effect of misspecification of parameters on the performance of the simple hypothesis test statistics. The general findings include: (i) The W statistic provides a generally superio...

1,093 citations


Journal ArticleDOI
TL;DR: In this paper, the Lintner model was used to examine the dividend policies of individual firms and provided the best predictions of dividends on a year of data not used in fitting the regressions.
Abstract: Starting with the “partial adjustment model” suggested by Lintner [10, 11], this paper examines the dividend policies of individual firms. The Lintner model, in which the change in dividends from year t-1 to year t is regressed on a constant, the level of dividends for t-1, and the level of profits for t, explains dividend changes for individual firms fairly well relative to other models tested. But a model in which the constant term is suppressed and the level of earnings for t-1 is added, provides the best predictions of dividends on a year of data not used in fitting the regressions. Though the dividend policy of individual firms is certainly a subject of economic interest, perhaps much of the novelty of the paper is methodological: specifically, the way in which a validation sample, simulations, and prediction tests are used to investigate results obtained from a pilot sample. To avoid spurious results that could follow from the extensive data-dredging involved in finding “good-fitting” divid...

968 citations


Journal ArticleDOI
TL;DR: In this article, a linear model is considered, and a number of consistent estimators of the coefficients βk and of the variances of the errors are developed, with a few properties of the estimators noted.
Abstract: A linear model is considered. The βk represent average responses of the dependent variable yt to unit changes in the independent variables ztk. The vtk are independently distributed random errors. A number of consistent estimators of the coefficients βk and of the variances of the errors are developed, and a few properties of the estimators are noted. Further investigations of sampling properties are needed.

551 citations


Journal ArticleDOI
TL;DR: This paper analyzes cross-classified data, treating independence, quasi-independence, and interactions in contingency tables with or without missing entries.
Abstract: (1968). The Analysis of Cross-Classified Data: Independence, Quasi-Independence, and Interactions in Contingency Tables with or without Missing Entries. Journal of the American Statistical Association: Vol. 63, No. 324, pp. 1091-1131.

469 citations


Journal ArticleDOI
TL;DR: In this paper, a Monte Carlo experiment is carried out to examine the small sample properties of five alternative estimators of a set of linear regression equations with mutually correlated disturbances, and the results show that three of the five estimation methods lead to identical estimates for any sample size, that in many cases the two-stage Aitken estimator performs as well as or better than the other estimators, and most of the asymptotic properties of this estimator tend to hold in small samples as well.
Abstract: A Monte Carlo experiment is carried out to examine the small sample properties of five alternative estimators of a set of linear regression equations with mutually correlated disturbances. The estimators considered are ordinary least squares, Zellner's two-stage Aitken, Zellner's iterative Aitken, Telser's iterative, and maximum likelihood. The experiment, based on 100 samples, provides approximate sampling distributions for samples of size 10, 20 and 100 for various model specifications. The results show that three of the five estimation methods lead to identical estimates for any sample size, that in many cases the two-stage Aitken estimator performs as well as or better than the other estimators, and that most of the asymptotic properties of this estimator tend to hold in small samples as well.

465 citations


Journal ArticleDOI
TL;DR: This paper develops measures of association and methods of estimation for contingency tables.
Abstract: (1968). Association and Estimation in Contingency Tables. Journal of the American Statistical Association: Vol. 63, No. 321, pp. 1-28.

461 citations


Journal ArticleDOI
TL;DR: In this paper, numerical approximations to the distribution functions of symmetric stable distributions are developed, and a Monte Carlo study of truncated means as estimates of location shows that, in every case but the Gaussian, some truncated mean has smaller sampling dispersion than the full mean.
Abstract: This paper takes a few steps toward alleviating problems of data analysis that arise from the fact that elementary expressions for density and cumulative distribution functions (c.d.f.'s) for most stable distributions are unknown. In section 2 results of Bergstrom [3] are used to develop numerical approximations for the c.d.f.'s and the inverse functions of the c.d.f.'s of symmetric stable distributions. Tables of the c.d.f.'s and their inverse functions are presented for twelve values of the characteristic exponent. In section 3 the usefulness of the numerical c.d.f.'s and their inverse functions in estimating the parameters of stable distributions and testing linear models involving stable variables is discussed. Finally, section 4 presents a Monte Carlo study of truncated means as estimates of location. In every case but the Gaussian, some truncated mean is shown to have smaller sampling dispersion than the full mean.

443 citations
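The section-4 finding can be illustrated with a small Monte Carlo sketch under stated assumptions: the Cauchy law (the symmetric stable distribution with characteristic exponent 1) stands in for a non-Gaussian stable law, the trimming proportion and sample sizes are arbitrary choices, and dispersion is measured by the interquartile range of the replicated estimates:

```python
import math
import random
import statistics

random.seed(1)

def trimmed_mean(xs, prop):
    """Mean after discarding the lowest and highest `prop` fraction of the sample."""
    xs = sorted(xs)
    k = int(len(xs) * prop)
    return statistics.fmean(xs[k:len(xs) - k])

def cauchy():
    # Symmetric stable law with characteristic exponent 1 (inverse-c.d.f. method).
    return math.tan(math.pi * (random.random() - 0.5))

full, trimmed = [], []
for _ in range(2000):
    sample = [cauchy() for _ in range(25)]
    full.append(statistics.fmean(sample))
    trimmed.append(trimmed_mean(sample, 0.25))

def iqr(v):
    """Interquartile range: a dispersion measure that exists even for the Cauchy."""
    q = statistics.quantiles(v, n=4)
    return q[2] - q[0]

print(iqr(full), iqr(trimmed))  # the trimmed mean disperses far less
```

For the Cauchy the full mean is itself Cauchy-distributed, so trimming gives a dramatic reduction; for milder stable laws the gap narrows, vanishing only in the Gaussian case.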


Journal ArticleDOI
TL;DR: In this paper, a procedure for lowering the mean square error (MSE) of the minimum variance unbiased linear estimator (MVULE) of the mean of a population is considered, where the technique employed is shrinkage of the MVULE towards a natural origin, μ0, in the parameter space.
Abstract: A procedure for lowering the mean square error (MSE) of the minimum variance unbiased linear estimator (MVULE) of the mean of a population is considered. The technique employed is shrinkage of the MVULE towards a natural origin, μ0, in the parameter space. The suggested estimator is: The case where μ0 = 0 is given special consideration. This is the general case for the normal distribution. When |μ/σz| is small, shrinkage buys decreased MSE at the expense of increased MSE for |μ/σz| large.

259 citations
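The estimator's displayed formula did not survive this transcription, so the sketch below illustrates only the generic shrinkage idea (pulling x̄ toward an origin μ0 by a hypothetical factor c), not necessarily the paper's estimator. It reproduces the stated trade-off: lower MSE when |μ − μ0| is small, higher when it is large:

```python
import random
import statistics

random.seed(7)

def mse(estimates, mu):
    return statistics.fmean((e - mu) ** 2 for e in estimates)

def simulate(mu, mu0=0.0, c=0.8, n=10, reps=4000):
    """MSE of the plain sample mean vs. a mean shrunk toward mu0 by factor c,
    for N(mu, 1) samples of size n.  c and mu0 are illustrative choices only."""
    plain, shrunk = [], []
    for _ in range(reps):
        xbar = statistics.fmean(random.gauss(mu, 1.0) for _ in range(n))
        plain.append(xbar)
        shrunk.append(mu0 + c * (xbar - mu0))
    return mse(plain, mu), mse(shrunk, mu)

print(simulate(mu=0.1))  # shrinkage wins when mu is near mu0
print(simulate(mu=3.0))  # and loses when mu is far from mu0
```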


Journal ArticleDOI
TL;DR: In this paper, a class of Fourier estimators of the probability density function f and the associated cumulative distribution function F is considered, and simple expressions for their mean integrated square errors (M.I.S.E.) are derived.
Abstract: A class of estimators (referred to as the Fourier estimators f̂m and F̂m) of the probability density function f and the associated cumulative distribution function F is considered. Here f̂m = Σ^m_{k=0} âk ψk and F̂m = Σ^m_{k=0} Âk ψk, where the functions {ψk} comprise an orthogonal set with respect to the weight function w(x), and the statistics âk and Âk are formed from the n unordered observations. Simple expressions are found for the mean integrated square errors, M.I.S.E., of the estimators f̂m and F̂m, i.e., E ∫ {f(x) − f̂m(x)}² w(x) dx and E ∫ {F(x) − F̂m(x)}² w(x) dx, in terms of the variances of âk and Âk and the Fourier coefficients of f and F. For Fourier estimators based upon the trigonometric orthogonal functions, the âk are the sample trigonometric moments. The variances and covariances of the statistics âk and Âk for these special cases are shown to be linear functions of the density f's Fourier coefficients. Therefore, simple expressions are obtained which relate the M.I.S.E. of the Fourier estimators f̂m and F̂m to the Fourier coeffi...

259 citations


Journal ArticleDOI
TL;DR: In this article, it was shown that there are no countably additive exchangeable distributions on the space of observations which give ties probability 0 and for which a next observation is conditionally equally likely to fall in any of the open intervals between successive order statistics of a given sample.
Abstract: A Bayesian approach to inference about the percentiles and other characteristics of a finite population is proposed. The approach does not depend upon, though it need not exclude, the use of parametric models.Some related questions concerning the existence of exchangeable distributions are considered. It is shown that there are no countably additive exchangeable distributions on the space of observations which give ties probability 0 and for which a next observation is conditionally equally likely to fall in any of the open intervals between successive order statistics of a given sample.

Journal ArticleDOI
TL;DR: In this article, the authors examined the mean square error criterion for rejecting or adopting restrictions on the parameter space in a regression model, and developed a uniformly most powerful testing procedure for the criterion.
Abstract: The objectives of this paper are to examine the mean square error criterion for rejecting or adopting restrictions on the parameter space in a regression model, and to develop a uniformly most powerful testing procedure for the criterion. We present a tabulation of critical points for the test for one restriction and selected points of the power function. The mean square error criterion suggests a framework for thinking about the problem of multicollinearity in a linear model. To this end we present some examples to illustrate the linkage of the mean square error criterion with multicollinearity.

Journal ArticleDOI
TL;DR: In this paper, a general functional form is considered for which the linear and logarithmic functional forms are special cases for money stock defined as currency plus demand deposits, and the empirical evidence suggests that the log-factor functional model is appropriate for time deposits.
Abstract: A general functional form is considered for which the linear and logarithmic functional forms are special cases. The general functional form is a power transformation of each of the variables—each variable is raised to a λ power. This functional form is estimated for the demand for money using the maximum likelihood method. Real money demand is specified to be a function of real current income and a short-term interest rate. The empirical evidence suggests that the logarithmic functional form is appropriate for money stock defined as currency plus demand deposits. For money stock defined to also include time deposits, neither the linear nor logarithmic form seems appropriate. The estimates of λ are insensitive to expansion of the model explaining money demand, but functional form is important for discrimination among alternative hypotheses.
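The λ-power family described above can be sketched as follows. This uses the Box-Cox form (x^λ − 1)/λ, which may differ from the paper's exact parameterization; λ = 1 gives a (shifted) linear form and λ → 0 recovers the logarithm:

```python
import math

def power_transform(x, lam):
    """Box-Cox-style power transform of a positive variable.
    lam = 1 is a shifted linear form; lam -> 0 recovers the natural log."""
    if abs(lam) < 1e-12:
        return math.log(x)
    return (x ** lam - 1.0) / lam

print(power_transform(10.0, 1.0))   # 9.0: the (shifted) linear case
print(power_transform(10.0, 1e-8))  # ~2.302585: effectively ln(10)
```

In the paper's setting λ itself is estimated by maximum likelihood, and comparing the estimate with 1 and 0 is what discriminates between the linear and logarithmic specifications.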

Journal ArticleDOI
TL;DR: In this article, the use of a linear function for discriminating with dichotomous variables is discussed and evaluated, and four such functions are considered: Fisher's linear discriminant function, two functions based upon a logistic model, and a function based upon the assumption of mutual independence of the variables.
Abstract: The use of a linear function for discriminating with dichotomous variables is discussed and evaluated. Four such functions are considered: Fisher's linear discriminant function, two functions based upon a logistic model, and a function based upon the assumption of mutual independence of the variables. The evaluation of these functions as well as of a completely general multinomial procedure is carried out within the context of a 1st order interaction model by means of computer experiments. The product moment correlation of the optimal function with the linear function under evaluation plays a central role as a criterion for judging the relative merits of the procedures considered.


Journal ArticleDOI
John W. Pratt1
TL;DR: In this paper, a new Normal approximation to the beta distribution and its relatives, in particular the binomial, Pascal, negative binomial, F, t, Poisson, gamma, and chi square distributions, was proposed.
Abstract: This paper concerns a new Normal approximation to the beta distribution and its relatives, in particular, the binomial, Pascal, negative binomial, F, t, Poisson, gamma, and chi square distributions. The approximate Normal deviates are expressible in terms of algebraic functions and logarithms, but for desk calculation it is preferable in most cases to use an equivalent expression in terms of a function specially tabulated here. Graphs of the error are provided. They show that the approximation is good even in the extreme tails except for beta distributions which are J or U shaped or nearly so, and they permit correction to obtain still more accuracy. For use beyond the range of the graphs, some standard recursive relations and some classical continued fractions are listed, with some properties of the latter which seem to be partly new. Various Normal approximations are compared, with further graphs. The new approximation appears far more accurate than the others. Everything an ordinary user of th...

Journal ArticleDOI
TL;DR: Point estimators for the coefficients in orthogonal linear regression which are better than the ordinary least squares estimator are obtained when at least three coefficients are to be estimated.
Abstract: Point estimators for the coefficients in orthogonal linear regression which are better than the ordinary least squares estimator are obtained when at least three coefficients are to be estimated. The measure of goodness of an estimator is the sum, or weighted sum, of the componentwise mean squared errors. Some of the new estimators have interpretations as estimators which depend upon preliminary tests of significance. These estimators may be especially appropriate when the independent variables fall into two sets or are ordered, as in polynomial regression or regression on principal components. The extension of the results to the general case of nonorthogonal regression is given; here the measure of goodness of an estimator is the mean of a quadratic form in the componentwise errors.

Journal ArticleDOI
TL;DR: In this paper, the robustness of the F-test between means to its underlying assumptions (normally distributed populations with equal variances) is investigated using two nonnormal distributions (exponential and lognormal).
Abstract: In this study of robustness the insensitivity of the F-test between means to its underlying assumptions (normally distributed populations with equal variances) is investigated. Using two nonnormal distributions (exponential and lognormal), it is found that the test is fairly insensitive for moderate and equal sample size (n = 32) when the variances are equal. Further, for small samples (n < 32), the test is conservative with respect to Type I error. It is also conservative with respect to Type II error for a large range of φ (noncentrality), depending on the size of the sample and α. When the within-cell error variances are heterogeneous, the test continues to be conservative for the upper values of φ and slightly biased toward larger Type II errors for smaller values of φ, depending on the size of α. Analysis of the correlation between the numerator and denominator of F under the null hypothesis indicates that the robustness feature is largely due to this correlation. Analytic proofs under the non-null hyp...

Journal ArticleDOI
TL;DR: In this paper, the authors present the region of coverage of the curve-shape characteristics α3² and δ = (2α4 − 3α3² − 6)/(α4 + 3) for a certain general system of distributions, Burr (1942).
Abstract: This paper presents the region of coverage of the curve-shape characteristics, α3² and δ = (2α4 − 3α3² − 6)/(α4 + 3), for a certain general system of distributions, Burr (1942). These curve-shape characteristics were chosen for comparison with the Pearson system because of the simplicity with which they map the members of the latter system, Craig (1936). It is here shown that the present system covers almost all of the regions of the main Pearson Types IV and VI, and an important part of that of the main Type I (or beta distribution). The density function of the median for odd-sized samples from the present system is given in closed form. All finite moments of the median are linear combinations of beta functions. Important characteristics of the median, bias and efficiency relative to the sample mean, are given for samples of n = 3, 5, 7 and 11, for populations with α3:x = 0, .50, 1.00, 1.50 and, corresponding to each α3:x, two well-separated α4:x values. It also appears that for this system, the median be...

Journal ArticleDOI
TL;DR: In this paper, the relationship between the bias and the true parameter value is analyzed and the bias is tabulated for selected values of the parameters of the distribution, and it is shown that the estimator possesses finite moments up to order v where v is the number of over-identifying restrictions.
Abstract: The distribution studied is that of an estimator of a structural parameter appearing in a system of linear simultaneous equations. The relationship between the bias and the true parameter value is analysed and the bias is tabulated for selected values of the parameters of the distribution. It is also shown that the estimator possesses finite moments up to order v where v is the number of overidentifying restrictions and that the estimator converges to the true parameter value as one of the parameters of the distribution (not the sample size) increases indefinitely.

Journal ArticleDOI
TL;DR: This note presents a method of evaluating the distribution function of the inverse Gaussian distribution from the standard normal distribution.
Abstract: This note deals with a method of evaluating the distribution function of the Inverse Gaussian Distribution, from the Standard Normal Distribution.
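A standard identity of this kind expresses the IG(μ, λ) distribution function through the standard normal c.d.f. Φ as F(x) = Φ(√(λ/x)(x/μ − 1)) + e^(2λ/μ) Φ(−√(λ/x)(x/μ + 1)); the sketch below assumes that form rather than reproducing the note's derivation:

```python
import math

def phi(z):
    """Standard normal c.d.f. via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_gaussian_cdf(x, mu, lam):
    """IG(mu, lam) distribution function written in terms of phi (assumed identity)."""
    a = math.sqrt(lam / x)
    return phi(a * (x / mu - 1.0)) + math.exp(2.0 * lam / mu) * phi(-a * (x / mu + 1.0))

# Sanity checks for IG(1, 1): values rise monotonically from near 0 to near 1.
print(inverse_gaussian_cdf(0.1, 1.0, 1.0))
print(inverse_gaussian_cdf(1.0, 1.0, 1.0))
print(inverse_gaussian_cdf(10.0, 1.0, 1.0))
```

The appeal is practical: only a table (or routine) for the standard normal c.d.f. is needed, with no numerical integration of the inverse Gaussian density.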

Journal ArticleDOI
TL;DR: In this article, the bias and mean square error of the sometimes-pool estimator are given, and the relative efficiency of the always-pool estimation to the never-pool estimate is tabulated and tables can be used to determine a proper choice of the significance level of the preliminary test.
Abstract: Given two random samples from normal populations, the experimenter wishes to estimate the mean of the first population. Whether to pool the two samples or not is made to depend on the result of a preliminary test. The bias and mean square error of the sometimes-pool estimator are given. The relative efficiency of the sometimes-pool estimator to the never-pool estimator is tabulated and the tables can be used to determine a proper choice of the significance level of the preliminary test. A pooling procedure for means, based on prior information, is discussed when the prior distribution is normal.
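A minimal sketch of the sometimes-pool rule, assuming a rough z-type preliminary test with a hypothetical critical value (the paper's tables govern the real choice of significance level):

```python
import math
import statistics

def sometimes_pool(x1, x2, crit=2.0):
    """Estimate the first population mean, pooling the second sample only when a
    rough z-type preliminary test fails to separate the two means.
    `crit` is a hypothetical critical value, not one taken from the paper."""
    m1, m2 = statistics.fmean(x1), statistics.fmean(x2)
    se2 = statistics.variance(x1) / len(x1) + statistics.variance(x2) / len(x2)
    z = abs(m1 - m2) / math.sqrt(se2)
    return statistics.fmean(x1 + x2) if z < crit else m1

x1 = [1, 2, 3, 4, 5]
x2 = [1.5, 2.5, 3.5, 4.5, 5.5]   # close mean: the preliminary test pools
x3 = [11, 12, 13, 14, 15]        # distant mean: fall back to x1 alone

print(sometimes_pool(x1, x2))  # 3.25, the pooled mean
print(sometimes_pool(x1, x3))  # 3.0, the never-pool value
```

The bias the abstract analyzes comes from the first branch: pooling drags the estimate toward the second population's mean whenever the preliminary test fails to detect a real difference.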

Journal ArticleDOI
TL;DR: In this paper, the monetary evaluation of decisions under uncertainty and of associated opportunities to acquire information is generalized to a class of nonlinear utility functions by defining and interrelating minimum selling and maximum buying prices for decisions, certainty equivalents of uncertain costs of information, values of information and net gains of information.
Abstract: The monetary evaluation of decisions under uncertainty and of associated opportunities to acquire information is well known and conceptually straightforward when utility is linear in money. This analysis is generalized here to a class of nonlinear utility functions by defining and interrelating minimum selling and maximum buying prices for decisions (section 2), certainty equivalents of uncertain costs of information (section 3), values of information (section 4), and net gains of information (section 5). Preliminary material constitutes the first section, and section 6 summarizes results of particular usefulness in practice.

Journal ArticleDOI
TL;DR: In this paper, an algorithm for the determination of the economic design of -charts based on Duncan's model is described, which consists of solving an implicit equation in design variables n (sample size) and k (control limit factor) and an explicit equation for h (sampling interval).
Abstract: An algorithm for the determination of the economic design of -charts based on Duncan's model is described in this paper. This algorithm consists of solving an implicit equation in design variables n (sample size) and k (control limit factor) and an explicit equation for h (sampling interval). The use of this algorithm not only yields the exact optimum but also provides valuable information so that the sensitivity of the optimum loss-cost (L*) can be evaluated. Loss-cost contours are used to discuss the nature of the loss-cost surface and the effect of the design variables. The effect of two parameters, the delay factor (e), and the average time for an assignable cause to occur (1/λ), on the optimum design is evaluated. Numerical examples are used for illustrations.


Journal ArticleDOI
TL;DR: In this paper, a method is suggested for obtaining a multiple linear regression equation which permits the variance, as well as the mean, of normally distributed random variables Y to be a function of known constants X1 … Xp, and yields an approximation of maximum likelihood estimates of regression coefficients which may be used to construct confidence intervals for the parameters of the normal distribution and tolerance intervals for individual Y.
Abstract: A method is suggested for obtaining a multiple linear regression equation which permits the variance, as well as the mean, of normally distributed random variables, Y, to be a function of known constants X1 … Xp. The method is applicable to large samples, and yields an approximation of maximum likelihood estimates of regression coefficients, which may be used to construct confidence intervals for the parameters of the normal distribution and tolerance intervals for individual Y. A likelihood ratio test may be used to test this model against the usual homoscedastic least squares model. Two examples of this type of analysis on data from the literature are presented.

Journal ArticleDOI
TL;DR: In this paper, a method is developed for estimating the parameters of the multivariate normal distribution when the missing observations are not restricted to follow certain patterns, as in most previous papers.
Abstract: In this paper a method is developed for estimating the parameters in the multivariate normal distribution in which the missing observations are not restricted to follow certain patterns as in most previous papers. The large sample properties of the estimators are discussed. Equivalence with maximum likelihood estimators has been established for a subclass of problems. The results of some simulation studies are provided to support the theoretical development.

Journal ArticleDOI
TL;DR: For a finite universe of N items, it is proved that no observation can lie more than √(N − 1) standard deviations away from the mean.
Abstract: For a finite universe of N items, it is proved that no observation can lie more than √(N − 1) standard deviations away from the mean. This is an improvement over the result given by Tchebycheff's inequality, and a similar improvement is possible when speaking of how far from the mean any odd number r out of N observations can lie. However, the relative inefficiency of Tchebycheff's inequality as applied to a finite universe does go to zero as N goes to infinity.
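The bound here is Samuelson's inequality: with σ computed using divisor N, no observation can lie more than √(N − 1) standard deviations from the mean. A brute-force sketch that checks the bound and exhibits a sample attaining it:

```python
import math
import random
import statistics

random.seed(11)

def max_standardized_deviation(xs):
    """Largest |x_i - mean| / sigma, with sigma the divide-by-N standard deviation."""
    n = len(xs)
    mean = statistics.fmean(xs)
    sigma = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return max(abs(x - mean) for x in xs) / sigma

# The bound sqrt(N - 1) holds for every sample ...
for _ in range(1000):
    n = random.randint(2, 12)
    xs = [random.uniform(-5, 5) for _ in range(n)]
    assert max_standardized_deviation(xs) <= math.sqrt(n - 1) + 1e-9

# ... and N - 1 points bunched at one value attain it: here sqrt(5 - 1) = 2.
print(max_standardized_deviation([0, 0, 0, 0, 1]))
```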

Journal ArticleDOI
TL;DR: In this article, the first and second partial derivatives with respect to the parameters are worked out and the likelihood equations are obtained by equating to zero the first partial derivatives, but an iterative procedure for solving them on an electronic computer is described.
Abstract: Let X be a random variable having the first asymptotic distribution of smallest (largest) values, with location parameter u and scale parameter b, b > 0. The natural logarithm of the likelihood function of a sample of size n from such a distribution, the lowest r1 and the highest r2 sample values having been censored, is written down and its first and second partial derivatives with respect to the parameters are worked out. The likelihood equations, obtained by equating to zero the first partial derivatives, do not have explicit solutions, but an iterative procedure for solving them on an electronic computer is described. The asymptotic variances and covariances of the maximum-likelihood estimators of the parameters are obtained by inverting the information matrix, whose elements are the negatives of the limits, as n → ∞, of the expected values of the second partial derivatives, and tabulated for censoring proportions q1 = 0.0(0.1)0.9 from below and q2 = 0.0(0.1) (0.9 – q1) from above. The asymptotic vari...