
Showing papers in "Journal of the American Statistical Association in 1978"


Journal ArticleDOI
Joseph Waksberg1
TL;DR: A method of sample selection for household telephone interviewing via random digit dialing is developed which significantly reduces the cost of such surveys as compared to dialing numbers completely at random.
Abstract: A method of sample selection for household telephone interviewing via random digit dialing is developed which significantly reduces the cost of such surveys as compared to dialing numbers completely at random. The sampling is carried out through a two-stage design and has the unusual feature that although all units have the same probability of selection, it is not necessary to know the probabilities of selection of the first-stage or the second-stage units. Simple random sampling of possible telephone numbers, within existing telephone exchanges, is inefficient because only about 20 percent of these numbers are actually telephone numbers assigned to households. The method of selection proposed reduces the proportion of unused numbers sharply.

1,373 citations
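
The key idea is that residential numbers cluster within 100-number banks, so a bank is retained as a first-stage unit only if a randomly dialed number in it reaches a household. A minimal simulation sketch, assuming illustrative values for the number of banks, the hit rates, and the within-bank cluster size k (none of these figures come from the article):

```python
import random

random.seed(1)

N_BANKS = 10_000   # hypothetical 100-number banks (exchange + first two digits)
K = 5              # extra households sought per retained bank (assumed)

# Assumed frame: households cluster in 20% of banks; within a "live" bank,
# 60% of numbers are residential. This clustering is what the design exploits.
live_banks = set(random.sample(range(N_BANKS), N_BANKS // 5))

def is_household(bank):
    return bank in live_banks and random.random() < 0.6

def waksberg(n_clusters):
    households, dials = 0, 0
    while households < n_clusters * (K + 1):
        bank = random.randrange(N_BANKS)   # stage 1: dial one random number
        dials += 1
        if not is_household(bank):
            continue                       # miss: discard the bank entirely
        households += 1
        found = 1
        while found < K + 1:               # stage 2: redial within the bank
            dials += 1
            if is_household(bank):
                found += 1
                households += 1
    return households, dials

def pure_rdd(n_households):
    dials, found = 0, 0
    while found < n_households:
        dials += 1
        if is_household(random.randrange(N_BANKS)):
            found += 1
    return dials

n, d = waksberg(40)
print("two-stage: %d households in %d dials" % (n, d))
print("pure RDD : %d households in %d dials" % (n, pure_rdd(n)))
```

With these assumed rates the two-stage design reaches a household roughly every 2-3 dials, against about 8 dials per household for unrestricted random dialing.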


Journal ArticleDOI
TL;DR: In this article, the authors proposed a logistic regression model with maximum likelihood estimators for solving both problems and reported on their own supportive empirical studies, and summarized the related arguments.
Abstract: Classifying an observation into one of several populations is discriminant analysis, or classification. Relating qualitative variables to other variables through a logistic cdf functional form is logistic regression. Estimators generated for one of these problems are often used in the other. If the populations are normal with identical covariance matrices, discriminant analysis estimators are preferred to logistic regression estimators for the discriminant analysis problem. In most discriminant analysis applications, however, at least one variable is qualitative (ruling out multivariate normality). Under nonnormality, we prefer the logistic regression model with maximum likelihood estimators for solving both problems. In this article we summarize the related arguments, and report on our own supportive empirical studies.

1,134 citations
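
For readers who want to see the estimator the authors favor, here is a sketch of logistic maximum likelihood fitting via Newton-Raphson on simulated data with one qualitative predictor (the data-generating values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: one continuous and one binary (qualitative) predictor,
# so joint normality of the predictors cannot hold.
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.integers(0, 2, n)])
beta_true = np.array([-0.5, 1.0, 0.8])
y = (rng.random(n) < 1 / (1 + np.exp(-X @ beta_true))).astype(float)

# Newton-Raphson for the logistic log-likelihood.
beta = np.zeros(X.shape[1])
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    W = p * (1 - p)                       # IRLS weights
    grad = X.T @ (y - p)
    hess = X.T @ (X * W[:, None])
    step = np.linalg.solve(hess, grad)
    beta = beta + step
    if np.abs(step).max() < 1e-10:
        break

print("MLE:", beta.round(3))
```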


Journal ArticleDOI
TL;DR: This entry corresponds to Benoit B. Mandelbrot's book Fractals: Form, Chance, and Dimension.
Abstract: Not recoverable; the indexed text is reading-promotion boilerplate whose only substantive content is a reference to the book Fractals: Form, Chance, and Dimension.

1,041 citations


Journal ArticleDOI
Nan M. Laird1
TL;DR: In this article, the authors show that the nonparametric maximum likelihood estimate of a mixing distribution is self-consistent and that, under various conditions, it is a step function with a finite number of steps.
Abstract: The nonparametric maximum likelihood estimate of a mixing distribution is shown to be self-consistent, a property which characterizes the nonparametric maximum likelihood estimate of a distribution function in incomplete data problems. Under various conditions the estimate is a step function, with a finite number of steps. Its computation is illustrated with a small example.

763 citations
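
One standard way to compute such an estimate (a common approach for this problem, not necessarily the article's own algorithm) is to fix a fine grid of candidate support points and update the mixing weights by EM; the fitted weights then concentrate on a few grid points, reflecting the step-function form:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Data: a two-point mixture of unit-variance normals (means -2 and 2).
x = np.concatenate([rng.normal(-2, 1, 150), rng.normal(2, 1, 350)])

# Candidate support grid for the mixing distribution; only the weights
# are estimated, so the fitted mixing distribution is a step function.
grid = np.linspace(-6, 6, 121)
w = np.full(grid.size, 1 / grid.size)

f = norm.pdf(x[:, None], loc=grid[None, :], scale=1.0)  # n x m kernel matrix
for _ in range(500):
    post = f * w                              # E-step: posterior over support
    post /= post.sum(axis=1, keepdims=True)
    w = post.mean(axis=0)                     # M-step: update mixing weights

keep = w > 1e-3
print("support points with weight:", grid[keep].round(2))
print("weights:", w[keep].round(3))           # mass concentrates near -2 and 2
```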


Journal ArticleDOI
TL;DR: In this paper, the estimator which minimizes the sum of absolute residuals is demonstrated to be consistent and asymptotically Gaussian with covariance matrix ω²Q⁻¹.
Abstract: In the general linear model with independent and identically distributed errors and distribution function F, the estimator which minimizes the sum of absolute residuals is demonstrated to be consistent and asymptotically Gaussian with covariance matrix ω²Q⁻¹, where Q = lim T⁻¹X′X and ω² is the asymptotic variance of the ordinary sample median from samples with distribution F. Thus the least absolute error estimator has strictly smaller asymptotic confidence ellipsoids than the least squares estimator for linear models from any F for which the sample median is a more efficient estimator of location than the sample mean.

598 citations
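
Minimizing the sum of absolute residuals is a linear program: write each residual as u_i − v_i with u_i, v_i ≥ 0 and minimize Σ(u_i + v_i). A sketch using scipy's linprog on simulated heavy-tailed data (the data are illustrative, not from the article):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

n, p = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 2.0, -1.0])
y = X @ beta_true + rng.standard_t(df=3, size=n)   # heavy-tailed errors

# Variables (beta, u, v); minimize sum(u + v) subject to X beta + u - v = y.
c = np.concatenate([np.zeros(p), np.ones(2 * n)])
A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
bounds = [(None, None)] * p + [(0, None)] * (2 * n)
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")

print("LAD:", res.x[:p].round(3))
print("OLS:", np.linalg.lstsq(X, y, rcond=None)[0].round(3))
```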


Journal ArticleDOI
TL;DR: In this paper, the authors consider an estimation problem when only the k largest observations of a sample of size n are available, and present estimators for the location and scale parameters and for p-quantiles of F, where p is of the form 1 − c/n (c fixed).
Abstract: We consider an estimation problem when only the k largest observations of a sample of size n are available. It is assumed that the underlying distribution function F belongs to the domain of attraction of a known extreme-value distribution and that k remains fixed as n → ∞. We present estimators for the location and scale parameters and for p-quantiles of F, where p is of the form 1 − c/n (c fixed). These estimators are either asymptotically maximum likelihood or minimum variance.

591 citations
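
A rough sketch of a top-k estimator of this kind, assuming the Gumbel domain of attraction: the excesses of the k largest observations over the k-th largest are approximately iid exponential with mean equal to the scale, which yields scale and extreme-quantile estimates. The divisor k − 1 and the simulated Gumbel data are assumptions for illustration, not the article's exact formulas:

```python
import numpy as np

rng = np.random.default_rng(0)

n, k, c = 10_000, 50, 1.0            # target: the (1 - c/n) quantile
x = rng.gumbel(loc=0.0, scale=2.0, size=n)
top = np.sort(x)[-k:][::-1]          # X(1) >= X(2) >= ... >= X(k)

# Excesses over the k-th largest behave like iid exponentials with mean
# equal to the scale when F is in the Gumbel domain of attraction.
a_hat = (top[:-1] - top[-1]).mean()          # dividing by k is also used
q_hat = top[-1] + a_hat * np.log(k / c)      # extreme-quantile estimate

q_true = -2.0 * np.log(-np.log(1 - c / n))   # true Gumbel(0, 2) quantile
print("estimated: %.3f   true: %.3f" % (q_hat, q_true))
```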


Journal ArticleDOI
TL;DR: In this article, the authors introduced the "moment generating function estimator" which minimizes the sum of squares of differences between the theoretical and sample moment generating functions, and applied it to the Hamermesh model of wage bargain determination.
Abstract: Since the likelihood function corresponding to finite mixtures of normal distributions is unbounded, maximum likelihood estimation may break down in practice. The article introduces the “moment generating function estimator” defined as the estimator which minimizes the sum of squares of differences between the theoretical and sample moment generating functions. The consistency and asymptotic normality of the estimator are proved and its finite sample behavior is compared to that of the standard method of moments estimator by Monte Carlo experiments. The estimator is applied to the Hamermesh model of wage bargain determination.

533 citations
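
A sketch of the idea: match the empirical MGF, the mean of exp(t·x_i), to the mixture MGF Σ w_k exp(μ_k t + σ²t²/2) over a grid of t values by least squares. The two-component setup, common variance, grid, and optimizer below are illustrative assumptions, and the fit can be sensitive to the choice of grid:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Data from a two-component normal mixture (common variance, for simplicity).
x = np.concatenate([rng.normal(0, 1, 300), rng.normal(3, 1, 200)])
t = np.array([-0.5, -0.4, -0.3, -0.2, -0.1, 0.1, 0.2, 0.3, 0.4, 0.5])
emp = np.exp(t[None, :] * x[:, None]).mean(axis=0)   # sample MGF at each t

def theo(params):
    w, m1, m2, s = params
    return (w * np.exp(m1 * t + 0.5 * s**2 * t**2)
            + (1 - w) * np.exp(m2 * t + 0.5 * s**2 * t**2))

def objective(params):
    return ((theo(params) - emp) ** 2).sum()

res = minimize(objective, x0=[0.5, -1.0, 1.0, 2.0],
               bounds=[(0.01, 0.99), (-10, 10), (-10, 10), (0.1, 10)],
               method="L-BFGS-B")
print("w, mu1, mu2, sigma =", res.x.round(3))
```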


Journal ArticleDOI
TL;DR: In this article, a simple randomized treatment assignment rule is proposed and analyzed in a sequential medical trial, and on the average this rule assigns more patients to the better treatment, and it is applicable to the case where patients have delayed responses to treatments.
Abstract: In a sequential medical trial, a simple randomized treatment assignment rule is proposed and analyzed. On the average this rule assigns more patients to the better treatment, and it is applicable to the case where patients have delayed responses to treatments. This new assignment rule is studied for both a fixed sample size and an inverse stopping rule.

441 citations
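
A sketch of a randomized play-the-winner urn of the kind analyzed here (the urn parameters and success probabilities are illustrative): treatments are assigned by drawing from the urn, and each response, whenever it arrives, adds balls favoring the treatment the response supports, so assignment never has to wait for delayed responses:

```python
import random

random.seed(2)

P_SUCCESS = {"A": 0.7, "B": 0.4}   # A is the better treatment (unknown to the rule)
ALPHA, BETA = 1, 1                 # illustrative urn parameters

urn = {"A": ALPHA, "B": ALPHA}
assigned = {"A": 0, "B": 0}

for patient in range(200):
    # Draw with replacement: assign in proportion to urn composition.
    t = "A" if random.random() < urn["A"] / (urn["A"] + urn["B"]) else "B"
    assigned[t] += 1
    # When the response arrives (possibly later), a success adds BETA balls
    # of the same type and a failure adds BETA balls of the opposite type.
    success = random.random() < P_SUCCESS[t]
    other = "B" if t == "A" else "A"
    urn[t if success else other] += BETA

print("assigned:", assigned)   # on average more patients receive treatment A
```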


Journal ArticleDOI
TL;DR: In this paper, a procedure is proposed that reduces the effect of population skewness on the distribution of the t variable so that tests about the mean can be computed more accurately.
Abstract: This article considers a procedure that reduces the effect of population skewness on the distribution of the t variable so that tests about the mean can be computed more accurately. A modification of the t variable is obtained that is useful for distributions with skewness as severe as that of the exponential distribution. The procedure is generalized and applied to the jackknife t variable for a class of statistics with moments similar to those of the sample mean. Tests of the correlation coefficient obtained using this procedure are compared empirically with corresponding tests determined using Fisher's z transformation and the usual jackknife estimate.

367 citations
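
The distortion the procedure corrects is easy to reproduce: with data as skewed as the exponential, the ordinary one-sample t test misses its nominal tail levels badly at small n. A quick Monte Carlo check (this exhibits the problem only; the correction itself is in the article):

```python
import numpy as np
from scipy.stats import t as t_dist

rng = np.random.default_rng(0)

n, reps, alpha = 10, 20_000, 0.05
crit = t_dist.ppf(1 - alpha / 2, df=n - 1)

upper = lower = 0
for _ in range(reps):
    x = rng.exponential(scale=1.0, size=n)     # true mean is 1
    t_stat = (x.mean() - 1.0) / (x.std(ddof=1) / np.sqrt(n))
    upper += t_stat > crit
    lower += t_stat < -crit

# Right-skewed data make the t statistic left-skewed: the lower tail
# rejects far too often and the upper tail far too rarely.
print("nominal %.3f per tail; observed upper %.4f, lower %.4f"
      % (alpha / 2, upper / reps, lower / reps))
```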


Journal ArticleDOI
TL;DR: In this article, general measures of residual variation are considered, including ordinary squared error and prediction error as well as the log likelihood, and the relation of Goodman and Kruskal's measures of categorical association to the theory of penalty functions and probability elicitation is demonstrated.
Abstract: We consider regression situations for which the response variable is dichotomous. The most common analysis fits successively richer linear logistic models and measures the residual variation from the model by minus twice the maximized log likelihood. General measures of residual variation are considered here, including ordinary squared error and prediction error as well as the log likelihood. All of these are shown to be satisfactory in a certain primitive sense, unlike quantitative regression theory where only squared error is logically satisfactory. The relation of Goodman and Kruskal's measures of categorical association to the theory of penalty functions and probability elicitation is demonstrated.

337 citations
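
A sketch of the three measures of residual variation named above, computed for hypothetical fitted probabilities p_i and 0/1 outcomes (the stand-in fitted model is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fitted probabilities from a logistic model, plus outcomes.
n = 400
x = rng.normal(size=n)
p = 1 / (1 + np.exp(-(0.3 + 1.2 * x)))        # stand-in for fitted values
y = (rng.random(n) < p).astype(float)

eps = 1e-12
minus2loglik = -2 * np.sum(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
squared_error = np.sum((y - p) ** 2)          # Brier-type residual variation
prediction_error = np.sum(y != (p > 0.5))     # counting error at cutoff 1/2

print("-2 log L        :", round(minus2loglik, 1))
print("squared error   :", round(squared_error, 1))
print("prediction error:", int(prediction_error))
```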


Journal ArticleDOI
TL;DR: The small sample properties of three goodness-of-fit statistics for the analysis of categorical data are examined with respect to the adequacy of the asymptotic chi-squared approximation.
Abstract: The small-sample properties of three goodness-of-fit statistics for the analysis of categorical data are examined with respect to the adequacy of the asymptotic chi-squared approximation. The approximate tests based on the likelihood ratio and Freeman-Tukey statistics yield exact levels that are typically in excess of the nominal levels for moderate expected values. In contrast, the Pearson statistic attains exact levels that are quite close to the nominal values. The reason for the large number of rejections for the likelihood ratio and Freeman-Tukey statistics is related to their handling of small observed counts.
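
A small Monte Carlo sketch of the comparison (nominal 5% level, illustrative multinomial with modest expected counts): the Pearson statistic should track the nominal level, while the likelihood-ratio and Freeman-Tukey statistics reject too often:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)

probs = np.array([0.1, 0.2, 0.3, 0.4])
n, reps = 20, 20_000
E = n * probs
crit = chi2.ppf(0.95, df=len(probs) - 1)

O = rng.multinomial(n, probs, size=reps).astype(float)

pearson = ((O - E) ** 2 / E).sum(axis=1)
with np.errstate(divide="ignore", invalid="ignore"):
    terms = np.where(O > 0, O * np.log(O / E), 0.0)   # 0 log 0 = 0
lr = 2 * terms.sum(axis=1)
freeman_tukey = ((np.sqrt(O) + np.sqrt(O + 1) - np.sqrt(4 * E + 1)) ** 2).sum(axis=1)

for name, stat in [("Pearson", pearson), ("likelihood ratio", lr),
                   ("Freeman-Tukey", freeman_tukey)]:
    print("%-16s exact level: %.3f" % (name, (stat > crit).mean()))
```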


Journal ArticleDOI
TL;DR: In this paper, a treatment assignment rule is proposed that forces a small subtrial to be balanced, but tends toward the complete randomization scheme as the size of the subtrial increases.
Abstract: In the comparison of K treatments, assume that patients appear singly and must be treated immediately. Suppose that patients having the same combination of prognostic factor levels are grouped into the same stratum. If the number of different strata is small, we treat each stratum as a separate independent subtrial. A treatment assignment rule is proposed that forces a small subtrial to be balanced, but tends toward the complete randomization scheme as the size of the subtrial increases. When the number of strata is large, we propose an overall assignment rule which can achieve a degree of treatment balance simultaneously across all prognostic factors.
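
An urn scheme with this qualitative behavior (whether it matches the authors' rule exactly is an assumption; the parameters are illustrative) adds, after each assignment, balls for the opposite treatment:

```python
import random

random.seed(3)

ALPHA, BETA = 1, 1          # illustrative urn parameters

def urn_assign(n_patients):
    """Assign n_patients to two treatments within one stratum (subtrial)."""
    urn = [ALPHA, ALPHA]    # balls for treatments 0 and 1
    counts = [0, 0]
    for _ in range(n_patients):
        p0 = urn[0] / (urn[0] + urn[1])
        t = 0 if random.random() < p0 else 1
        counts[t] += 1
        urn[1 - t] += BETA  # a ball for the *other* arm nudges toward balance
    return counts

# Small subtrials are forced toward balance; as the subtrial grows, the
# draw probability drifts to 1/2, i.e., toward complete randomization.
for size in (6, 20, 200):
    print(size, urn_assign(size))
```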

Journal ArticleDOI
TL;DR: The problem of conflicting objectives is of paramount importance in both planned and market economies, and this book represents a cross-cultural mixture of approaches from many countries to the same class of problem.
Abstract: This book deals with quantitative approaches to making decisions when conflicting objectives are present. This problem is central to many applications of decision analysis, policy analysis, operational research, etc. in a wide range of fields, for example, business, economics, engineering, psychology, and planning. The book surveys different approaches to the same problem area, and each approach is discussed in considerable detail so that the coverage of the book is both broad and deep. The problem of conflicting objectives is of paramount importance in both planned and market economies, and this book represents a cross-cultural mixture of approaches from many countries to the same class of problem.

Journal ArticleDOI
TL;DR: In this paper, a model for the joint distribution of bivariate random variables when one variable is directional and one is scalar is proposed, based on the maximum entropy principle and by the specification of the marginal distributions.
Abstract: Parametric models are proposed for the joint distribution of bivariate random variables when one variable is directional and one is scalar. These distributions are developed on the basis of the maximum entropy principle and by the specification of the marginal distributions. The properties of these distributions and the statistical analysis of regression models based on these distributions are explored. One model is extended to several variables in a form that justifies the use of least squares for estimation of parameters, conditional on the observed angles.
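
The least squares remark can be illustrated directly: conditional on the observed angles, a scalar response linked to a direction through cosine and sine terms is an ordinary linear regression (the data-generating coefficients below are illustrative; the article's parametric families are richer):

```python
import numpy as np

rng = np.random.default_rng(0)

# Scalar response linked to a directional covariate through cos/sin terms.
n = 300
theta = rng.uniform(0, 2 * np.pi, n)
y = 2.0 + 1.5 * np.cos(theta) - 0.7 * np.sin(theta) + rng.normal(0, 0.5, n)

# Conditional on the observed angles, ordinary least squares applies.
D = np.column_stack([np.ones(n), np.cos(theta), np.sin(theta)])
coef, *_ = np.linalg.lstsq(D, y, rcond=None)
print("intercept, cos, sin coefficients:", coef.round(3))
```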


Journal ArticleDOI
TL;DR: In this paper, a simple procedure for multiple comparisons of means is proposed for unequally sized samples, where the critical points used are those of the Studentized Maximum Modulus and two inequalities are involved: a conservative probability inequality and a radical algebraic inequality.
Abstract: A simple procedure for multiple comparisons of means is proposed for unequally sized samples. The procedure is attractive for its simplicity and for a graphical display which allows the experimenter to evaluate all comparisons of interest at a glance. The critical points used are those of the Studentized Maximum Modulus, and two inequalities are involved: a conservative probability inequality and a radical algebraic inequality. Apparently, the joint effect of the two inequalities is conservative unless the imbalance in sample sizes is very large. For equally sized samples one can use critical points of the Studentized Range, and the procedure becomes equivalent to Tukey's well-known T-method.
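
A sketch of such intervals: the 95% critical point of the Studentized Maximum Modulus with parameters (k*, ν) is obtained here by Monte Carlo rather than from published tables, and applying it to all pairwise differences uses the conservative probability inequality mentioned above (the data are illustrative):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)

# Unequally sized samples.
samples = [rng.normal(mu, 1.0, n) for mu, n in [(0.0, 8), (0.5, 12), (1.5, 20)]]
k = len(samples)
ns = np.array([len(s) for s in samples])
means = np.array([s.mean() for s in samples])
nu = int(ns.sum() - k)                               # error df
s2 = sum(((s - s.mean()) ** 2).sum() for s in samples) / nu
kstar = k * (k - 1) // 2                             # number of pairwise comparisons

# Monte Carlo 5% point of the Studentized Maximum Modulus with (kstar, nu).
B = 100_000
z = np.abs(rng.standard_normal((B, kstar))).max(axis=1)
s = np.sqrt(rng.chisquare(nu, B) / nu)
m_crit = np.quantile(z / s, 0.95)

for i, j in combinations(range(k), 2):
    half = m_crit * np.sqrt(s2 * (1 / ns[i] + 1 / ns[j]))
    d = means[i] - means[j]
    print("mean%d - mean%d: %.2f +/- %.2f" % (i + 1, j + 1, d, half))
```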

Journal ArticleDOI
TL;DR: An alternative strategy for assigning pseudorandom numbers to experimental points in statistically designed simulation and distribution sampling experiments is devised and shown to improve upon existing recommendations for a wide class of problems.
Abstract: This research investigates various strategies for assigning pseudorandom numbers to experimental points in statistically designed simulation and distribution sampling experiments. Strategies studied include the widely advocated practices of (i) employing a common set of pseudorandom numbers for all experimental points, and (ii) assigning a unique set of pseudorandom numbers to each experimental point. An alternative, based upon blocking concepts in designed experiments, is devised and shown to improve upon existing recommendations for a wide class of problems. A small simulation, a pilot study of a hospital resource allocation problem, illustrates the new strategy.
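
Strategies (i) and (ii) are easy to contrast on a toy model: using a common stream of uniforms at two design points induces positive correlation and shrinks the variance of the estimated difference. The sketch below shows only this baseline contrast, not the article's blocking-based strategy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "simulation model": mean outcome as a function of the uniforms fed in.
def response(u, rate):
    return (-np.log(u) / rate).mean()     # mean of exponential draws

reps, n = 2000, 50
diffs_indep, diffs_common = [], []
for _ in range(reps):
    u1, u2, u3 = rng.random(n), rng.random(n), rng.random(n)
    # (ii) unique pseudorandom numbers per experimental point
    diffs_indep.append(response(u1, 1.0) - response(u2, 1.2))
    # (i) a common set of pseudorandom numbers for both points
    diffs_common.append(response(u3, 1.0) - response(u3, 1.2))

print("var of difference, independent streams:", np.var(diffs_indep).round(5))
print("var of difference, common streams     :", np.var(diffs_common).round(5))
```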



Journal ArticleDOI
TL;DR: In this article, it was shown that the ratio of relevant information contained in unclassified observations to that in classified observations varies from approximately one-fifth to two-thirds for the statistically interesting range of separation of the populations.
Abstract: Fisher's linear discriminant rule may be estimated by maximum likelihood estimation using unclassified observations. It is shown that the ratio of the relevant information contained in unclassified observations to that in classified observations varies from approximately one-fifth to two-thirds for the statistically interesting range of separation of the populations. Thus, more information may be obtained from large numbers of inexpensive unclassified observations than from a small classified sample. Also, all available unclassified and classified data should be used for estimating Fisher's linear discriminant rule.


Journal ArticleDOI
Philip H. Ramsey1
TL;DR: In this article, a number of multiple comparison procedures are studied whose maximum Type I error rate, experimentwise, is limited to a fixed value, the experimentwise level, and the procedures are compared for all-pairs power; i.e., for the probability of simultaneous significance for all pairs which are truly unequal.
Abstract: A number of multiple comparison procedures are studied whose maximum Type I error rate, experimentwise, is limited to a fixed value, the experimentwise level. Some of these procedures are slight revisions of existing methods. The procedures are compared for all-pairs power; i.e., for the probability of simultaneous significance for all pairs which are truly unequal. Results of Monte Carlo simulation show that a revision of Peritz's F method is uniformly the most powerful among all the procedures studied. The revised Peritz F method is substantially more powerful than Tukey's method with a power advantage as high as 0.50. A method due to Welsch (1972, 1977) is also quite powerful and may sometimes be recommended for greater simplicity.

Journal ArticleDOI
TL;DR: In this article, the authors review existing methods for analyzing experimental design models with unbalanced data and relate them to existing computer programs; the methods are distinguished by the hypotheses associated with the sums of squares they generate rather than by computational convenience or the orthogonality of the quadratic forms.
Abstract: The objective of this article is to review existing methods for analyzing experimental design models with unbalanced data and to relate them to existing computer programs. The methods are distinguished by the hypotheses associated with the sums of squares which are generated. The choice of a method should be based on the appropriateness of the hypothesis rather than on computational convenience or the orthogonality of the quadratic forms. The sums of squares are described using the R(·) notation as applied to the overparameterized linear model, but the hypotheses are stated in terms of the full-rank cell means model. The zero-cell-frequency situation is treated briefly.
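
A sketch of the R(·) notation on a small unbalanced two-way layout: R(β | μ, α) is the reduction in residual sum of squares from adding factor B's columns after the mean and factor A, and with unbalanced data it differs from the corresponding reduction for A after B (the data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Unbalanced two-way layout (factor A: 2 levels, factor B: 3 levels).
a = np.array([0] * 7 + [1] * 11)
b = np.array([0, 0, 1, 1, 1, 2, 2, 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2])
y = 1.0 + 0.5 * a + np.array([0.0, 1.0, -0.5])[b] + rng.normal(0, 0.3, a.size)

def sse(*factors):
    """Residual SS after fitting a mean plus dummy columns per factor."""
    cols = [np.ones(a.size)]
    for f in factors:
        for lev in np.unique(f)[1:]:
            cols.append((f == lev).astype(float))
    X = np.column_stack(cols)
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return float(resid @ resid)

r_b_given_a = sse(a) - sse(a, b)   # R(beta | mu, alpha)
r_a_given_b = sse(b) - sse(a, b)   # R(alpha | mu, beta)
print("R(B|A) =", round(r_b_given_a, 3), "  R(A|B) =", round(r_a_given_b, 3))
```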

Journal ArticleDOI
TL;DR: In this paper, estimation of the variance of the sample median from small samples is discussed, and short tables are provided to facilitate calculation of the estimates.
Abstract: Estimation of the variance of the sample median based on small samples is discussed, and short tables are provided to facilitate calculation of the estimates.
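
One bootstrap-free construction for this (a standard order-statistics approach; whether it matches the article's tables exactly is an assumption) weights the ordered observations by cell probabilities from the beta law of the median's uniform order statistic:

```python
import numpy as np
from scipy.stats import beta

def median_variance(x):
    """Estimate Var(sample median) from a single small odd-sized sample."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    assert n % 2 == 1, "sketch assumes an odd sample size"
    m = (n + 1) // 2
    i = np.arange(1, n + 1)
    # Weight on x_(i): the Beta(m, n-m+1) law of the median's uniform order
    # statistic, integrated over the cell ((i-1)/n, i/n].
    w = beta.cdf(i / n, m, n - m + 1) - beta.cdf((i - 1) / n, m, n - m + 1)
    c1 = (w * x).sum()
    c2 = (w * x**2).sum()
    return c2 - c1**2

rng = np.random.default_rng(0)
print("estimated Var(median):", round(median_variance(rng.normal(size=11)), 4))
```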

Journal ArticleDOI
TL;DR: Explanations for the discrepancy between survey data and production and sales data range from problems in the surveys' methodology to increased underreporting of cigarette consumption, and regardless of the explanation there is a need for caution in the design of surveys and the interpretation of their results.
Abstract: Data from four major surveys, spanning the years since the Surgeon General's Report, suggest a significant reduction in rates of cigarette smoking. These data, however, conflict with production and sales data which record only a slight reduction. Explanations for this discrepancy range from problems in the surveys' methodology to increased underreporting of cigarette consumption because of both a growing awareness of the threat to health and the social undesirability of smoking. This article emphasizes the need for caution in the design of surveys and interpretation of their results, regardless of the explanation of the discrepancy.

Journal ArticleDOI
TL;DR: It is shown that the clustering problem so defined is reducible to the problem of optimally coloring a sequence of graphs, and is NP-complete.
Abstract: A basic problem in cluster analysis is how to partition the entities of a given set into a preassigned number of homogeneous subsets called clusters. The homogeneity of the clusters is often expressed as a function of a dissimilarity measure between entities. The objective function considered here is the minimization of the maximum dissimilarity between entities in the same cluster. It is shown that the clustering problem so defined is reducible to the problem of optimally coloring a sequence of graphs, and is NP-complete. An efficient algorithm is proposed and computational experience with problems involving up to 270 entities is reported on.
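
A heuristic sketch of the threshold idea (the article's exact algorithm is more refined): pairs farther apart than a threshold d are joined by an edge, a proper K-coloring of that graph is a K-partition with all cluster diameters at most d, and one searches for the smallest workable d. Greedy coloring stands in for exact coloring here, so the search is only approximate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Entities and pairwise dissimilarities (Euclidean, for illustration).
pts = rng.random((30, 2))
D = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1))
K = 3                                   # preassigned number of clusters

def greedy_color(adj, k):
    """Try to k-color the graph; exact coloring is NP-complete, so a
    largest-degree-first heuristic is used in this sketch."""
    n = adj.shape[0]
    colors = -np.ones(n, dtype=int)
    for v in np.argsort(-adj.sum(axis=1)):
        used = {colors[u] for u in range(n) if adj[v, u] and colors[u] >= 0}
        free = [c for c in range(k) if c not in used]
        if not free:
            return None
        colors[v] = free[0]
    return colors

# Binary search over candidate thresholds: connect pairs farther apart than
# d; a proper K-coloring then gives clusters whose diameters are all <= d.
thresholds = np.unique(D[np.triu_indices_from(D, 1)])
lo, hi, best = 0, len(thresholds) - 1, None
while lo <= hi:
    mid = (lo + hi) // 2
    coloring = greedy_color(D > thresholds[mid], K)
    if coloring is not None:
        best, hi = (thresholds[mid], coloring), mid - 1
    else:
        lo = mid + 1

d_star, clusters = best
print("max within-cluster dissimilarity <= %.3f" % d_star)
print("cluster sizes:", np.bincount(clusters))
```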


Journal ArticleDOI
TL;DR: The general limited dependent variable model discussed in this paper permits skewness in a pretruncated variable by transforming it within the class of Box-Cox transformations, and as a byproduct this general model also provides a convenient nesting framework for statistically distinguishing between numerous limited dependent variables models.
Abstract: The general limited dependent variable model discussed in this article permits skewness in a pretruncated variable by transforming it within the class of Box-Cox transformations. As a by-product this general model also provides a convenient nesting framework for statistically distinguishing between numerous limited dependent variable models. An application to a model of the supply of bilateral foreign aid illustrates the ability of the general model to empirically distinguish between competing specifications.

Journal ArticleDOI
TL;DR: In this article, a robust variance estimator is derived, and its asymptotic properties are shown to compare favorably with those of the weighted least-squares variance estimators.
Abstract: Under a linear regression model, the best linear unbiased estimator (BLUE) for a finite population total can be obtained. The problem studied here is that of estimating the variance for setting large-sample confidence intervals about the BLUE when the model generating this estimate is inaccurate. A robust variance estimator is derived, and its asymptotic properties are shown to compare favorably with those of the weighted least-squares variance estimator. The robust variance estimator is shown to be asymptotically equivalent to the jackknife variance estimator under rather general conditions. These are extensions of results previously established for the ratio estimator by Royall and Eberhardt (1975).
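
Under the ratio model (y roughly proportional to x, with variance growing with x) the BLUE of the total is the classical ratio estimator, and the asymptotic equivalence mentioned above suggests the jackknife as a variance estimate. A sketch (population and sample sizes are illustrative; the finite-population correction is ignored):

```python
import numpy as np

rng = np.random.default_rng(0)

# Finite population where y is roughly proportional to x (ratio model).
N = 2000
x_pop = rng.uniform(1, 10, N)
y_pop = 2.0 * x_pop + rng.normal(0, np.sqrt(x_pop))   # variance grows with x
X_total = x_pop.sum()

n = 100
idx = rng.choice(N, n, replace=False)
xs, ys = x_pop[idx], y_pop[idx]

# BLUE of the total under the ratio model: X_total * (ybar / xbar).
t_hat = X_total * ys.mean() / xs.mean()

# Jackknife variance: recompute the estimator with each unit deleted.
pseudo = np.array([X_total * np.delete(ys, i).mean() / np.delete(xs, i).mean()
                   for i in range(n)])
v_jack = (n - 1) / n * ((pseudo - pseudo.mean()) ** 2).sum()

print("estimate: %.0f  (true total %.0f)" % (t_hat, y_pop.sum()))
print("jackknife SE: %.0f" % np.sqrt(v_jack))
```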