Showing papers in "Communications in Statistics-theory and Methods in 1974"


Journal ArticleDOI
TL;DR: A method for identifying clusters of points in a multidimensional Euclidean space is described and its application to taxonomy considered and an informal indicator of the "best number" of clusters is suggested.
Abstract: A method for identifying clusters of points in a multidimensional Euclidean space is described and its application to taxonomy considered. It reconciles, in a sense, two different approaches to the investigation of the spatial relationships between the points, viz., the agglomerative and the divisive methods. A graph, the shortest dendrite of Florek et al. (1951a), is constructed on a nearest neighbour basis and then divided into clusters by applying the criterion of minimum within cluster sum of squares. This procedure ensures an effective reduction of the number of possible splits. The method may be applied to a dichotomous division, but is perfectly suitable also for a global division into any number of clusters. An informal indicator of the "best number" of clusters is suggested. It is a "variance ratio criterion" giving some insight into the structure of the points. The method is illustrated by three examples, one of which is original. The results obtained by the dendrite method are compared with those...

5,772 citations
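
A minimal sketch of the "variance ratio criterion" named in the abstract above, assuming its usual form VRC = [BGSS / (k - 1)] / [WGSS / (n - k)], where BGSS and WGSS are the between-group and within-group sums of squares, n is the number of points and k the number of clusters. The function name and the toy data are illustrative only; the dendrite construction itself is not reproduced here.

```python
def variance_ratio_criterion(points, labels):
    """Return [BGSS / (k - 1)] / [WGSS / (n - k)] for a labelled partition."""
    n = len(points)
    dim = len(points[0])

    # Group the points by cluster label.
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    if not 1 < k < n:
        raise ValueError("the criterion needs 1 < k < n clusters")

    def centroid(pts):
        return [sum(p[j] for p in pts) / len(pts) for j in range(dim)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    grand = centroid(points)
    wgss = 0.0  # within-group sum of squares
    bgss = 0.0  # between-group sum of squares
    for pts in clusters.values():
        c = centroid(pts)
        wgss += sum(sq_dist(p, c) for p in pts)
        bgss += len(pts) * sq_dist(c, grand)

    # Note: wgss == 0 (every cluster a single repeated point) is not handled.
    return (bgss / (k - 1)) / (wgss / (n - k))


pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = variance_ratio_criterion(pts, [0, 0, 1, 1])  # split along the wide gap
bad = variance_ratio_criterion(pts, [0, 1, 0, 1])   # split across the gap
```

Larger values indicate tighter, better separated clusters, so comparing the criterion across candidate values of k gives the informal indicator of the "best number" of clusters.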


Journal ArticleDOI
K C Rao1, B S Robson1
TL;DR: In this article, it was shown that in the case of exponential family, the quadratic form of the asymptotic multinomial conditional distribution of the class frequencies given the parameter estimates can be used to test the goodness-of-fit, and the simulated distribution agreed with the χ²-distribution with degrees of freedom one less than the number of classes after grouping, regardless of the number of parameters estimated.
Abstract: When the class boundaries used in constructing a chi-square goodness-of-fit statistic are predetermined and the unknown parameters are estimated by maximum likelihood from the ungrouped data, the resulting statistic does not have a limiting χ²-distribution but instead is asymptotically distributed as a linear function of chi-square variables. The same result applies in the more realistic and useful case where only the number of classes and their probability content are predetermined. It is shown here that in both of the above cases, in the case of exponential family, the quadratic form of the asymptotic multinomial conditional distribution of the class frequencies given the parameter estimates can be used to test the goodness-of-fit. The simulated distribution of the statistic agrees with the χ²-distribution with degrees of freedom one less than the number of classes after grouping, regardless of the number of parameters estimated.

105 citations


Journal ArticleDOI
TL;DR: In this paper, the problem of estimating common parameters from two linear models under the assumption of normality is addressed and sufficient conditions are obtained under which one can uniformly improve upon the estimates obtained from one model.
Abstract: This paper deals with the problem of estimation of common parameters from two linear models under the assumption of normality. A set of sufficient conditions is obtained under which one can uniformly improve upon the estimates obtained from one model. Uniform improvement on both the models is also considered. We construct estimates which satisfy those conditions and apply these methods to (i) the problem of estimating the common mean of two normal populations and (ii) the problem of recovery of interblock information in incomplete block designs. Exact variances for these estimates and for some other estimates have been evaluated for some important special cases and have been computed for some designs.

97 citations


Journal ArticleDOI
TL;DR: In this article, the authors used the method of maximum likelihood to estimate the parameters of a mixture of two regression lines and found that when the sample size exceeds 250 and the regression lines are more than three standard deviations apart for at least one half of the data, the maximum likelihood estimates are reliable.
Abstract: The method of maximum likelihood is used to estimate the parameters of a mixture of two regression lines. The results of a small simulation study show that when the sample size exceeds 250 and the regression lines are more than three standard deviations apart for at least one half of the data, the maximum likelihood estimates are reliable. When this is not the case their sampling variances are so large that the estimates may not be reliable.

63 citations


Journal ArticleDOI
TL;DR: In this paper, a test is proposed for determining conditions under which stochastic linear prior information, which is incorrect on the average, may improve the parameter estimates for a linear model over conventional sample information estimates, in the sense of having the same or smaller mean square errors for all estimates.
Abstract: A test is proposed for determining conditions under which stochastic linear prior information, which is incorrect on the average, may improve the parameter estimates for a linear model over conventional sample information estimates, in the sense of having the same or smaller mean square errors for all estimates.

45 citations


Journal ArticleDOI
TL;DR: In this article, an approximation to the exact distribution of the Wilcoxon signed ranks test statistic based on the one-sample t-test applied to ranks is compared with the usual normal approximation.
Abstract: An approximation to the exact distribution of the Wilcoxon signed ranks test statistic based on the one-sample t-test applied to ranks is compared with the usual normal approximation. A second approximation based on a linear combination of the normal statistic and the t-statistic is introduced. The normal approximation tends to result in a conservative test in the tails, while the Student's t approximation tends to be liberal. The average of the two statistics provides a test that usually has an alpha level as close as possible to the α-levels .05, .025, .01 and .005, for values of n < 50 (only cases examined).

31 citations
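
A minimal sketch of the three statistics compared in the abstract above: the normal statistic, the one-sample t statistic applied to the signed ranks, and their average. The function names are illustrative, and two conventions are assumptions rather than details from the abstract: zero differences are dropped and tied absolute values receive average ranks. How the averaged statistic is referred to critical values is not stated in the abstract and is not attempted here.

```python
import math


def signed_ranks(diffs):
    """Signed ranks of the nonzero differences, with average ranks for ties."""
    nonzero = [d for d in diffs if d != 0]
    order = sorted(range(len(nonzero)), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * len(nonzero)
    i = 0
    while i < len(order):
        # Find the block of tied absolute values starting at position i.
        j = i
        while (j + 1 < len(order)
               and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]])):
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return [math.copysign(r, d) for r, d in zip(ranks, nonzero)]


def signed_rank_statistics(diffs):
    """Return (z, t, average) computed from the signed ranks."""
    r = signed_ranks(diffs)
    n = len(r)
    total = sum(r)                   # sum of signed ranks
    sum_sq = sum(x * x for x in r)   # sum of squared ranks
    z = total / math.sqrt(sum_sq)    # normal statistic
    # One-sample t statistic applied directly to the signed ranks.
    t = total / math.sqrt((n * sum_sq - total ** 2) / (n - 1))
    return z, t, (z + t) / 2


z, t, avg = signed_rank_statistics([1.5, -0.5, 2.0, 3.0, -1.0])
```

For these five differences the signed ranks are +3, -1, +4, +5, -2, so the signed-rank sum is 9 and the sum of squared ranks is 55.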


Journal ArticleDOI
TL;DR: In this paper, the relationship between α-unimodality of Olshen and Savage and a concept of generalized unimodality is studied, and it is shown that all multivariate stable distributions are generalized unimodal.
Abstract: Relationship between α-unimodality of Olshen and Savage and a concept of generalized unimodality is studied. It is shown that all multivariate stable distributions are generalized unimodal and the ...

30 citations


Journal ArticleDOI
TL;DR: In this article, the authors discuss multiple t tests and confidence intervals based on a Bonferroni inequality, and a slight improvement which can be made in some cases.
Abstract: This paper discusses multiple t tests and confidence intervals based on a Bonferroni Inequality and a slight improvement which can be made in some cases.

29 citations


Journal ArticleDOI
TL;DR: In this paper, the authors examined the family whose probability generating functions have the form of the generalized hypergeometric function, pFq [(a); (b); λ(s-1)].
Abstract: This paper examines the family whose probability generating functions have the form of the generalized hypergeometric function, pFq [(a); (b); λ(s-1)]. It includes a number of matching distributions as well as many classic discrete distributions. Properties may be derived from the differential equations satisfied by the various generating functions; e.g., useful recurrence formulae for probabilities, cumulants, and moments about an arbitrary point can be obtained.

28 citations


Journal ArticleDOI
TL;DR: In this paper, the authors investigate Bayesian procedures for estimating the time point at which a parameter change occurred in an observed sequence of independent random variables of the regular exponential class, in particular, binomial, exponential, and normal sequences.
Abstract: This study is designed to investigate Bayesian procedures for estimating the time point at which a parameter change occurred in an observed sequence of independent random variables of the regular exponential class. In particular, binomial, exponential, and normal sequences are considered and a generalization to the so-called two-phase regression problem is emphasized. In addition, inference to other parameters of the sequence is made.

27 citations
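
A minimal sketch of one Bayesian change-point calculation for the binomial (here 0/1) case mentioned in the abstract above. The modelling choices are assumptions for illustration, not details taken from the paper: a uniform prior on the change point, independent Beta(a, b) priors on the success probabilities before and after the change, and exactly one change. The function names are hypothetical.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def change_point_posterior(xs, a=1.0, b=1.0):
    """Posterior over m, the number of observations before the change.

    xs is a sequence of 0/1 outcomes; m ranges over 1 .. len(xs) - 1.
    """
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    log_w = []
    for m in range(1, n):
        s1 = sum(xs[:m])   # successes before the change
        f1 = m - s1        # failures before the change
        s2 = sum(xs[m:])   # successes after the change
        f2 = (n - m) - s2  # failures after the change
        # Marginal likelihood of each segment with its probability integrated
        # out; the constant 1 / B(a, b)^2 is the same for every m and cancels.
        log_w.append(log_beta(a + s1, b + f1) + log_beta(a + s2, b + f2))
    top = max(log_w)
    w = [math.exp(v - top) for v in log_w]  # subtract the max for stability
    total = sum(w)
    return {m: wi / total for m, wi in zip(range(1, n), w)}


post = change_point_posterior([0, 0, 0, 0, 1, 1, 1, 1])
best = max(post, key=post.get)
```

The posterior mode is only one possible point estimate; a posterior mean or a credible set for m can be read off the same dictionary.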


Journal ArticleDOI
TL;DR: In this paper, Khatri and Rao's theorem on a characterization of multivariate normal distribution through independence of linear functions of random vectors is extended to independence of more general functions satisfying an associativity equation.
Abstract: Khatri and Rao's theorem on a characterization of multivariate normal distribution through independence of linear functions of random vectors is extended to independence of more general functions satisfying an associativity equation.

Journal ArticleDOI
TL;DR: In this article, the authors show how the presence of simple equicorrelation in a sample from a multivariate normal population affects the confidence coefficients of the confidence set for the mean of a normal population and difference of means of two normal populations with equal dispersion matrices.
Abstract: This paper shows how the presence of simple equicorrelation in a sample from a multivariate normal population affects the confidence coefficients of the confidence set for the mean of a multivariate normal population and the difference of means of two multivariate normal populations with equal dispersion matrices.

Journal ArticleDOI
TL;DR: In this paper, the first two moments of an estimate of Shannon's measure of information are derived in closed form under an assumed multinomial distribution, and a table of the first two moments for varying sample size is given for a trinomial distribution.
Abstract: An underlying multinomial distribution is assumed and from this the first two moments of an estimate of Shannon's measure of information are derived in closed form. A table of the first two moments for varying sample size is given for a trinomial
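
A minimal sketch of how the first two moments of an entropy estimate can be checked numerically for a trinomial distribution. It assumes the plug-in estimate H = -Σ (x_i/n) log(x_i/n), which the abstract does not specify, and it uses brute-force enumeration of all outcomes rather than the paper's closed-form expressions. The function name is hypothetical.

```python
import math


def entropy_moments_trinomial(n, probs):
    """Return (E[H], E[H^2]) for the plug-in entropy estimate H.

    n is the sample size (n >= 1) and probs the three cell probabilities.
    """
    p1, p2, p3 = probs
    first = 0.0
    second = 0.0
    for x1 in range(n + 1):
        for x2 in range(n - x1 + 1):
            x3 = n - x1 - x2
            # Multinomial probability of the outcome (x1, x2, x3).
            coef = math.factorial(n) // (
                math.factorial(x1) * math.factorial(x2) * math.factorial(x3)
            )
            prob = coef * p1 ** x1 * p2 ** x2 * p3 ** x3
            # Plug-in entropy in natural-log units; empty cells contribute 0.
            h = -sum((x / n) * math.log(x / n) for x in (x1, x2, x3) if x > 0)
            first += prob * h
            second += prob * h * h
    return first, second


m1, m2 = entropy_moments_trinomial(2, (1 / 3, 1 / 3, 1 / 3))
variance = m2 - m1 ** 2
```

With n = 2 and equal cell probabilities the estimate is 0 with probability 1/3 and log 2 with probability 2/3, which gives a quick hand check of the enumeration.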

Journal ArticleDOI
TL;DR: In this paper, a goodness of fit test is presented for a completely specified distribution in which the single sample is censored at the kth ordered statistic (k < n), commonly known as Type II censored sample.
Abstract: The treatment of censored data problems has been restricted almost exclusively to estimation of parameters. In this paper a goodness of fit test is presented for a completely specified distribution in which the single sample is censored at the kth ordered statistic (k < n), commonly known as Type II censored sample (David, 1970). The technique is an extension of the Hartley and Pfaffenberger (1972) criterion. The approximate distribution of the test statistic is obtained via Pearson's Type VI and Type IV curves.

Journal ArticleDOI
TL;DR: In this article, the large sample behaviour of the maximum likelihood estimate is investigated for the folded normal distribution in the non-regular case μ = 0; the estimate of μ is consistent at a rate of order n^(-1/4) if σ is known and of order n^(-1/8) if σ is unknown.
Abstract: For the folded normal distribution, generated from a N(μ, σ²) by loss of the signs of observations, μ = 0 corresponds to a non-regular case of estimation and testing. The large sample behaviour of the maximum likelihood estimate in this non-regular case is investigated. If σ is known, the estimate of μ is consistent at a rate of convergence of order n^(-1/4); if σ is unknown the rate is of order n^(-1/8), as the sample size n tends to infinity. As a by-product it is shown that a moment method of estimation proposed by Elandt (1961) is locally asymptotically (as μ/σ → 0 and n → ∞) equivalent to the maximum likelihood method. Tests of μ = 0 are derived from the point of view of local optimality. In direct correspondence to the slow consistency of the estimates, the tests have very slowly increasing power functions.

Journal ArticleDOI
TL;DR: In this article, a series of 2×c contingency tables for different age groups were used to test the homogeneity of c binomial populations and the test statistic is based on the sum of the binomial observations through age for each dose group, the distribution of which is computed by the conditional distributions for given marginals in each contingency table.
Abstract: Suppose that we have a series of 2×c contingency tables for different age groups and that each 2×c table consists of c binomial observations, which for later use, may be regarded as the number of deaths from leukemia during some period for c radiation dose groups among atomic bomb survivors. After adjusting the age constitution in each dose group, we wish to test the homogeneity of c binomial populations. The test statistic is based on the sum of the binomial observations through age for each dose group, the distribution of which is computed by the conditional distributions for given marginals in each contingency table. Because of the singularity of the covariance matrix, we have several ways, including Armitage (1966), to make a χ² statistic with c-1 degrees of freedom. We can show, however, that all of these are the same, which may be regarded as an extension of the Cochran (1954) and Mantel-Haenszel (1959) procedures. As Birch (1964) and Zelen (1971) noted, this test procedure is supported by the logit mode...

Journal ArticleDOI
TL;DR: In this article, a new statistic T for testing for normality is proposed, which is easy to compute and, against skew distributions and symmetric distributions having large kurtosis, is generally more powerful than the Shapiro & Wilk statistic W. T is also both origin and scale invariant.
Abstract: A new statistic T for testing for normality is proposed. T is easy to compute and, against skew distributions and symmetric distributions having large kurtosis, is generally more powerful than the Shapiro & Wilk statistic W. T is also both origin and scale invariant. Besides, T tends to normality with increasing sample size.

Journal ArticleDOI
TL;DR: In this paper, linear combinations of the order statistics of a random sample of size n from the power-function distribution are used to derive the best linear unbiased estimators of (1) α when β and γ are known, (2) γ when α and γ are known, (3) α and β when γ is known.
Abstract: Suppose are the order statistics of a random sample of size n from the power-function distribution with distribution function . Using linear combinations of the order statistics, best linear unbiased estimators of (1) α when β and γ are known, (2) γ when α and γ are known, (3) α and β when γ is known, are derived.

Journal ArticleDOI
Harold Ruben1
TL;DR: In this article, the authors present recursion relationships for the probability density and distribution functions of non-central chi-square and gamma random variables, and derive sum and interpolation formulae for the distribution functions.
Abstract: The paper presents recursion relationships for the probability density and distribution functions of non-central chi-square and gamma random variables. Sum and interpolation formulae for the distribution functions are developed, and upper bounds obtained when the interpolation formulae are used in truncated form. Finally, simple finite expansions for the distribution function of non-central chi-square with odd degrees of freedom are developed, and a simple expression for the distribution function of non-central chi-square with even degrees of freedom is obtained in terms of integrals involving the standardized normal distribution function and derivatives of the standardized normal density function.

Journal ArticleDOI
TL;DR: In this paper, a brief review of explicit estimation procedures for normal moving average schemes is presented, along with an empirical comparison of these procedures using simulated first-order schemes with moderate sample sizes.
Abstract: Since explicit solution to the maximum likelihood equations for normal moving average schemes is intractable, several estimation procedures have been proposed which offer explicit estimators whose asymptotic properties are the same as those of the maximum likelihood estimator. Our objective is to present a brief review of all such procedures and an empirical comparison of these procedures using simulated first-order schemes with moderate sample sizes.

Journal ArticleDOI
Moti L. Tiku1
TL;DR: In this paper, the authors generalized the test-statistic T for testing normality and the test-statistic TE for testing exponentiality to test the normality and exponentiality of k independent random samples.
Abstract: The test-statistic T for testing normality (Tiku [8]) and the test-statistic TE for testing exponentiality (Tiku, Rai and Mead [17]) are generalized to test normality and exponentiality of k independent random samples. The distributions of these generalized statistics and their power properties are studied.

Journal ArticleDOI
TL;DR: For testing H0:μ 0 and against when the observations are independent normal with known variance, tests of the following structure are considered: for fixed n, stop with the first i≤n such that and reject H0,. Otherwise stop with n observations and accept H0.
Abstract: For testing H0:μ 0 and against when the observations are independent normal with known variance, tests of the following structure are considered: For fixed n , stop with the first i≤n such that and reject H0, . Otherwise stop with n observations and accept H0 . Bounds on the power function and expected sample size are obtained, and these become exact limits as n→∞. The power function of these tests never falls below 94% of that of the corresponding nonsequential UMP (UMPU) tests.

Journal ArticleDOI
TL;DR: In the days when analyses of variance were typically performed on a desk calculator, approximate methods such as unweighted means were generally preferred to exact least squares methods.
Abstract: In the days when analyses of variance were typically performed on a desk calculator, approximate methods such as unweighted means were generally preferred to exact least squares methods when dealin...

Journal ArticleDOI
TL;DR: In this paper, a modification of the method of moments which requires the estimators to be functions of the minimal sufficient statistic is discussed and it is shown that these modified estimators are in fact the maximum likelihood estimators.
Abstract: Some aspects of the Pearson-Fisher controversy concerning the method of moments and the method of maximum likelihood are reviewed. In the multiparameter exponential family, a modification of the method of moments which requires the estimators to be functions of the minimal sufficient statistic is discussed. It is shown that these modified estimators are in fact the maximum likelihood estimators. Although the mathematics underlying the result is widely available in the literature, the authors have not seen it stated in the present context.

Journal ArticleDOI
TL;DR: In this paper, examples are presented where the three basic criteria in common use for determining optimality of experimental designs (determinant, trace and maximum root) give rise to different designs.
Abstract: Three basic criteria, determinant, trace and maximum root, are in common use for determining optimality of experimental designs. Here examples are presented where the three criteria give rise to different designs. The examples are balanced resolution IV* of the 2^m series and are particularly insightful with respect to the dependence of the criteria on the correlation between estimators of the parameter.

Journal ArticleDOI
TL;DR: In this paper, a score is to be constructed as a linear combination of variables that will reflect an ordering based on expert judgement; the available data are a set of preference-ordered pairs together with values.
Abstract: A score is to be constructed as a linear combination of variables that will reflect an ordering based on expert judgement. Available data are a set of preference-ordered pairs together with values ...

Journal ArticleDOI
TL;DR: In this article, the problem of estimating variance components in the three-stage nested random-effects model from a Bayesian viewpoint is considered and some Bayes estimators of the variance components are developed using appropriate loss functions and adopting a non-informative reference prior distribution.
Abstract: The problem of estimation of the variance components in the three-stage nested random-effects model is considered from a Bayesian viewpoint. Under the usual assumptions of normality and independence of random effects some Bayes estimators of the variance components are developed using appropriate loss functions and adopting a non-informative reference prior distribution.

Journal ArticleDOI
TL;DR: In this paper, the joint density function of the sample mean and sample variance is recursively derived for samples from a population with density function f, where f(x) > 0 almost everywhere and f is everywhere continuous and has certain integral properties.
Abstract: The joint density function of the sample mean and sample variance is recursively derived for samples from a population with density function f, where f(x) > 0 almost everywhere and f is everywhere continuous and has certain integral properties. For populations where f does not have these integral properties, this joint density is an approximation. This joint density function is used to derive the density function of the t-statistic for samples from f. The family of generalized normal density functions is used for an example. The approximation for the t-density is given for that family. For some specific members of the family, the true α probabilities for the approximations are tabled and compared to the results of a simulation study.

Journal ArticleDOI
TL;DR: For a normal distribution with covariance matrix having intraclass structure of arbitrary order, the authors considered estimation and tests of hypotheses concerning various equalities of covariances and correlations.
Abstract: For a normal distribution with covariance matrix having intraclass structure of arbitrary order, we consider estimation and tests of hypotheses concerning various equalities of covariances and correlations. A numerical example is included.

Journal ArticleDOI
TL;DR: In this article, variable selection, lack of independence, analysis of repeated observations, alternate measurements of similar phenomena, and exclusion of variables unrelated to treatment are considered in the context of multivariate data obtained in a clinical therapy trial in multiple sclerosis (MS).
Abstract: The problems of analyzing data from large clinical trials are considered. Specifically, variable selection, lack of independence, analysis of repeated observations, alternate measurements of similar phenomena, and exclusion of variables unrelated to treatment are considered in the context of multivariate data obtained in a clinical therapy trial in multiple sclerosis (MS). The criteria considered are: 1) dropping variables having a low “signal to noise” ratio, 2) maximizing prediction, 3) maximizing separation between treatment groups, and 4) reducing the dimension of multivariate data. Analyses are described and illustrated in the selection of variables from the ACTH clinical trial. The selection of “important” variables was validated by means of split-half analyses. The analytical approach, which has general application, and the implications of the analysis on the reduced data set are discussed.