Showing papers in "Communications in Statistics-theory and Methods in 1990"


Journal ArticleDOI
TL;DR: In this article, the authors present a class of affine-invariant tests for the composite hypothesis H_d that the law of X_1 is a non-degenerate normal distribution; the tests are consistent against any fixed non-normal alternative distribution.
Abstract: Let X_1, ..., X_n be independent identically distributed random vectors in R^d, d ≥ 1, with sample mean X̄_n and sample covariance matrix S_n. We present a class of practicable affine-invariant tests for the composite hypothesis H_d that the law of X_1 is a non-degenerate normal distribution; these tests are consistent against any fixed non-normal alternative distribution. The test statistic is a weighted integral of the squared modulus of the difference between the empirical characteristic function of the scaled residuals and its pointwise limit under H_d. An alternative representation is given in terms of an L^2-distance between densities. The limiting null distribution of the test statistic is obtained. Power performance of the new tests is assessed in a Monte Carlo study.

570 citations


Journal ArticleDOI
TL;DR: This paper introduces a new information-theoretic measure of complexity called ICOMP as a decision rule for model selection and evaluation for multivariate linear models.
Abstract: This paper introduces a new information-theoretic measure of complexity called ICOMP as a decision rule for model selection and evaluation for multivariate linear models. The development of ICOMP is based on the generalization and utilization of the covariance complexity index of van Emden (1971) in estimation of the multivariate linear model. ICOMP is motivated by Akaike's (1973) Information Criterion (AIC), but it is a different procedure than AIC. In linear or nonlinear statistical models ICOMP uses an information-based characterization of: (i) the covariance matrix properties of the parameter estimates of a model starting from their finite sampling distributions, and (ii) the complexity of the inverse-Fisher information matrix (i-FIM) as a new criterion of achievable accuracy of the model. As a result, it provides a trade-off between the accuracy of the parameter estimates and the interaction of the residuals of a model via the measure of complexity of their respective covariances. It controls the risk...

177 citations


Journal ArticleDOI
TL;DR: In this paper, kernel-type estimators of the density of a copula function and of the Radon-Nikodym derivative of a bivariate distribution function with respect to the product of its marginal distribution functions are studied, and their strong uniform consistency and asymptotic normality are proved under various conditions on the bandwidth and on the smoothness of the kernel.
Abstract: This paper deals with estimation of the density of a copula function as well as with that of the Radon-Nikodym derivative of a bivariate distribution function with respect to the product of its marginal distribution functions. Strong uniform consistency and asymptotic normality of kernel-type estimators are proved under various conditions on the bandwidth and on the smoothness of the kernel. As an application, the estimation of Neyman-Pearson curves in the testing of independence problem is discussed.

141 citations


Journal ArticleDOI
TL;DR: The relationship between the weighted distributions and the parent distributions in the context of reliability and life testing depends on the nature of the weight function and gives rise to interesting connections between the different ageing criteria of the two distributions.
Abstract: C. R. Rao pointed out that “The role of statistical methodology is to extract the relevant information from a given sample to answer specific questions about the parent population” and raised the question “What population does a sample represent”? Wrong specification can lead to invalid inference giving rise to a third kind of error. Rao introduced the concept of weighted distributions as a method of adjustment applicable to many situations. In this paper, we study the relationship between the weighted distributions and the parent distributions in the context of reliability and life testing. These relationships depend on the nature of the weight function and give rise to interesting connections between the different ageing criteria of the two distributions. As special cases, the length biased distribution, the equilibrium distribution of the backward and forward recurrence times and the residual life distribution, which frequently arise in practice, are studied and their relationships with the original di...

139 citations


Journal ArticleDOI
TL;DR: In this paper, it is suggested that the data analytic tools of time series analysis should be more widely used, and many of the problems of two-dimensional modelling can be overcome by using separable processes.
Abstract: Although many spatial methods for analysing field trials have been proposed, most are for a one-dimensional layout, and most prescribe a single model for all data. In this paper it is suggested that the data analytic tools of time series analysis should be more widely used. Many of the problems of two-dimensional modelling can be overcome by using separable processes. This subclass of lattice processes can often be reasonably used, and has several advantages, including rapid fitting and simple extensions of many techniques developed and successfully used in time series. Some examples are given.

108 citations


Journal ArticleDOI
TL;DR: In this article, a multiple regression method based on distance analysis and metric scaling is proposed and studied; it predicts a continuous response variable from several explanatory variables, is compatible with the general linear model, and is found to be useful when the predictor variables are both continuous and categorical.
Abstract: A multiple regression method based on distance analysis and metric scaling is proposed and studied. This method allows us to predict a continuous response variable from several explanatory variables, is compatible with the general linear model and is found to be useful when the predictor variables are both continuous and categorical. Real data examples are given to illustrate the results obtained.

97 citations


Journal ArticleDOI
TL;DR: A wide selection of tests for exponentiality is discussed and compared in this article, where power computations, using simulations, were done for each procedure, and the score test presented in Cox and Oakes (1984) appears to be the best if one does not have a particular alternative in mind.
Abstract: A wide selection of tests for exponentiality is discussed and compared. Power computations, using simulations, were done for each procedure. Certain tests (e.g. Gnedenko (1969), Lin and Mudholkar (1980), Harris (1976), Cox and Oakes (1984), and Deshpande (1983)) performed well for alternative distributions with non-monotonic hazard rates, while others (e.g. Deshpande (1983), Gail and Gastwirth (1978), Kolmogorov-Smirnov (Lilliefors (1969)), Hahn and Shapiro (1967), Hollander and Proschan (1972), and Cox and Oakes (1984)) fared well for monotonic hazard rates. Of all the procedures compared, the score test presented in Cox and Oakes (1984) appears to be the best if one does not have a particular alternative in mind.

94 citations


Journal ArticleDOI
TL;DR: In this paper, the sampling properties of estimated divergence-type measures are investigated: approximate means and variances are derived, asymptotic distributions are obtained, and tests of goodness of fit of observed frequencies to expected ones and tests of equality of divergences based on two or more multinomial samples are constructed.
Abstract: φ-divergence statistics are obtained by either replacing both distributions involved in the argument of the φ-divergence measure by their sample estimates or replacing one distribution and considering the other as given. The sampling properties of estimated divergence-type measures are investigated. Approximate means and variances are derived and asymptotic distributions are obtained. Tests of goodness of fit of observed frequencies to expected ones and tests of equality of divergences based on two or more multinomial samples are constructed.

90 citations


Journal ArticleDOI
TL;DR: In this article, the authors developed a symmetric interval estimator for Cpk, and conducted a simulation study to explore its coverage probabilities, assuming that the data are normal, independent and identically distributed.
Abstract: Several sampling distribution properties of the estimator for Cpk are presented under the assumption that the data are normal, independent and identically distributed. In particular, the expectation, variance and skewness are derived. Since the sampling distribution is only weakly skewed, we concluded that a symmetric interval estimator for Cpk might be reasonable. We developed such a symmetric interval estimator and conducted a simulation study to explore its coverage probabilities.

73 citations
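
As a rough illustration of the quantity discussed in the Cpk entry above, the sketch below computes the usual point estimate of Cpk, min(USL - mean, mean - LSL) / (3 s). The function name `cpk_estimate`, the specification limits and the sample data are hypothetical choices made for this example; the paper's symmetric interval estimator is not reproduced here.

```python
import statistics


def cpk_estimate(data, lsl, usl):
    """Usual point estimate of Cpk for a sample and specification limits.

    lsl and usl are the lower and upper specification limits
    (hypothetical inputs for this sketch).
    """
    mean = statistics.fmean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    # Distance from the mean to the nearer limit, in units of 3 standard deviations.
    return min(usl - mean, mean - lsl) / (3 * s)


if __name__ == "__main__":
    sample = [9.8, 10.1, 10.0, 9.9, 10.2]  # made-up measurements
    print(cpk_estimate(sample, lsl=9.0, usl=11.0))
```

The estimate is driven by whichever specification limit lies closer to the sample mean, which is why an off-centre process scores lower than a centred one with the same spread.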


Journal ArticleDOI
TL;DR: In this paper, the authors introduce two modifications of the Anderson-Darling test statistic which are sensitive to departures of the fitted distribution from the true distribution in one or other of the tails.
Abstract: In this paper we introduce two modifications of the Anderson-Darling test statistic which are sensitive to departures of the fitted distribution from the true distribution in one or other of the tails. Simulated critical values are provided for the cases when the parameter values are known and when they have to be estimated from the data.

69 citations


Journal ArticleDOI
TL;DR: In this paper, the authors investigated the behavior of the optimal regularization parameter in the method of regularization for solving first kind integral equations with noisy data, under a range of definitions of "optimal", varying from mean square error in higher derivatives of the solution, to mean square error in the predicted data.
Abstract: We investigate the behavior of the optimal regularization parameter in the method of regularization for solving first kind integral equations with noisy data, under a range of definitions of “optimal”, varying from mean square error in higher derivatives of the solution, to mean square error in the predicted data. We study how the optimal regularization parameter changes when the optimality criterion changes, under a broad range of smoothness assumptions on the solution, the kernel of the integral operator, and the penalty functional. Although some of the calculations we present have been given elsewhere, we organize the results with a specific goal in mind. That is, we study a certain class of problems within which we can identify conditions on the solution, the kernel of the operator and the penalty functional for which the rate at which the optimal regularization parameter goes to zero is the same for both predictive mean square error and solution mean square error optimality criteria, and for which it i...

Journal ArticleDOI
TL;DR: In this article, a new process capability index is proposed that takes into account the location of the process mean between the two specification limits, the proximity to the target value, and the process variation when assessing process performance.
Abstract: A new process capability index is proposed that takes into account the location of the process mean between the two specification limits, the proximity to the target value, and the process variation when assessing process performance. The proposed index is compared to other indices on several properties. The proposed index is estimated based on a random sample of observations from the production process when the process is assumed to be normally distributed. The 95% lower confidence limits for the proposed index are derived for given sample sizes and its estimates.

Journal ArticleDOI
TL;DR: In this paper, a class of ratio-type estimators of finite population variance and variance ratio when the population variance of an auxiliary character is known is proposed, and their MSE's are compared with the MSE of usual ratio estimator t1 proposed by Isaki (1983).
Abstract: This paper proposes a class of ratio-type estimators of finite population variance and variance ratio when the population variance of an auxiliary character is known. Asymptotic expressions for bias and mean square error are derived and their MSE's are compared with the MSE of the usual ratio estimator t1 proposed by Isaki (1983) of the population variance of study character y. The regions are obtained under which the proposed estimator is superior to t1. When prior knowledge of the value of the coefficient of kurtosis, β2(y), of y is at hand, a ratio-type estimator, say t2, of the population variance is suggested. It is shown, under certain conditions, that t2 is more efficient than t1. Under some further knowledge another estimator of the population variance, better than t1 and t2, is proposed. Finally, one more class of estimators is proposed and its properties are discussed. The discussions are also made in bivariate normal population.

Journal ArticleDOI
TL;DR: Properties of two models of ‘within-dose’ dependence of efficacy and toxicity in parallel designs - one a bivariate analogue of the familiar univariate logistic model, and the other an adaptation of a general model developed by D.R. Cox - are explored.
Abstract: Methods for the simultaneous analysis of the relationships of binary variables for efficacy and toxicity to dosage of an experimental drug are developed. Properties of two models of ‘within-dose’ dependence of efficacy and toxicity in parallel designs - one a bivariate analogue of the familiar univariate logistic model, and the other an adaptation of a general model developed by D.R. Cox - are explored. The cell probabilities predicted by these models are often quite similar to those predicted by a model of independence of efficacy and toxicity, but large discrepancies can occur when there is approximate equality of the median effective and median toxic doses. Asymptotic variances of estimates of parameters involved in assessing correlation are large when there is little or no dependence in the data, but parameters can be estimated with good precision in at least some cases of moderate to strong dependence between efficacy and toxicity.

Journal ArticleDOI
TL;DR: In this paper, a multivariate affine-invariant family of rank tests is proposed for the two sample location problem, which is built upon Randles' multivariate one-sample sign statistic based on interdirections.
Abstract: A multivariate affine-invariant family of rank tests is proposed for the two sample location problem. The class of statistics introduced is built upon Randles' multivariate one-sample sign statistic based on interdirections and the multivariate one-sample signed-rank statistic of Peters and Randles. Asymptotic relative efficiencies are obtained which indicate that selected members of the class perform very well for a broad class of distributions. Further comparisons are made among several statistics using Monte Carlo results.

Journal ArticleDOI
TL;DR: In this article, the authors show that a ‘one-step’ jackknife estimator of variance is asymptotically equivalent to the Liang and Zeger variance estimator; they perform simulation runs to compare the small sample performance of the jackknife and Liang and Zeger's estimator.
Abstract: We discuss methods for analyzing data from repeated measures studies. The marginal distribution of the response at each time is assumed to be from the exponential family. The maximum likelihood regression estimators, assuming the repeated measurements on the same individual are independent, have been shown to be consistent by Liang and Zeger (1986). Liang and Zeger (1986) propose a ‘robust’ estimator of the asymptotic variance of the estimated regression parameters. We show that a ‘one-step’ jackknife estimator of variance is asymptotically equivalent to the Liang and Zeger variance estimator; we perform simulation runs to compare the small sample performance of the jackknife and Liang and Zeger's estimator. We also show that SAS Proc Jackreg (1986), which does jackknife regression for ordinary least squares, can be used to calculate jackknife estimates of the regression parameters and the estimated parameters' covariance matrix.

Journal ArticleDOI
TL;DR: In this paper, the authors considered the problem of estimating the mode of a conditional probability density function and showed that, under some regularity conditions, the estimate of the conditional mode obtained by maximizing a kernel estimate of the conditional probability density function is strongly consistent and asymptotically normally distributed.
Abstract: The problem of estimating the mode of a conditional probability density function is considered. It is shown that under some regularity conditions the estimate of the conditional mode obtained by maximizing a kernel estimate of the conditional probability density function is strongly consistent and asymptotically normally distributed.

Journal ArticleDOI
TL;DR: In this paper, the folded-t distribution, which arises when measurements from an underlying t distribution are recorded without their algebraic sign, is studied, and its relationship to a special case of the folded normal distribution is found.
Abstract: Measurements are frequently recorded without their algebraic sign. As a consequence the underlying distribution of measurements is replaced by a distribution of absolute measurements. When the underlying distribution is t the resulting distribution is called the “folded-t distribution”. Here we study this distribution, we find the relationship between the folded-t distribution and a special case of the folded normal distribution and we derive relationships of the folded-t distribution to other distributions pertaining to computer generation. Also tables are presented which give areas of the folded-t distribution.
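
To make the definition in the folded-t entry above concrete, here is a minimal sketch that assumes a central Student's t distribution. Because that distribution is symmetric about zero, the density of the absolute value |T| is twice the t density on x ≥ 0. The function names, the trapezoidal integration and the step count are illustrative choices for this example, not the paper's method for producing its tables.

```python
import math


def t_pdf(x, df):
    """Density of the central Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def folded_t_pdf(x, df):
    """Density of |T|: twice the t density for x >= 0, zero for x < 0."""
    return 2.0 * t_pdf(x, df) if x >= 0 else 0.0


def folded_t_area(upper, df, steps=20000):
    """Approximate P(|T| <= upper) with the composite trapezoidal rule."""
    h = upper / steps
    total = 0.5 * (folded_t_pdf(0.0, df) + folded_t_pdf(upper, df))
    total += sum(folded_t_pdf(i * h, df) for i in range(1, steps))
    return total * h


if __name__ == "__main__":
    # Area of the folded-t distribution up to 2.0 with 10 degrees of freedom.
    print(folded_t_area(2.0, df=10))
```

With one degree of freedom the t distribution is the Cauchy distribution, so the area up to 1 should be close to 0.5, which gives a quick sanity check on the sketch.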

Journal ArticleDOI
TL;DR: A cubic spline hazard model where the tails are linearly constrained (Stone and Koo, 1985) has considerable flexibility in describing data which has been generated from distributions having a variety of hazard function shapes as mentioned in this paper.
Abstract: A cubic spline hazard model where the tails are linearly constrained (Stone and Koo, 1985) has considerable flexibility in describing data which has been generated from distributions having a variety of hazard function shapes. This model is as efficient as the Kaplan-Meier (1958) estimator for estimating survival probabilities.

Journal ArticleDOI
TL;DR: In this article, upper bounds on the type I and type II errors of stochastic curtailment procedures are presented that are lower than the bounds α/γ and β/γ' of Lan, Simon and Halperin (1982), which are based on an unlimited number of interim looks; the new bounds use only the standard normal distribution table.
Abstract: Stochastic curtailment procedures are used to monitor accumulating data in long term clinical trials. These procedures allow one to calculate the conditional probability under H0 (or alternative hypothesis, Ha) given current data, of rejecting (accepting) H0 at the end of the trial. One might stop early and reject (accept) H0 if this probability is equal to or greater than γ (γ'). Lan, Simon and Halperin (1982) have shown that if α and β are the type I and type II errors, respectively, then the type I (type II) error using a stochastic curtailment method is equal to or less than α/γ (β/γ'). This upper bound is based on an unlimited number of interim looks. In fact, there are usually a small number of looks. Tighter upper bounds using only the standard normal distribution table are presented. A computer program to obtain these bounds is available from the authors.
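
The Lan, Simon and Halperin bounds quoted in the stochastic curtailment entry above are simple ratios, and the sketch below only evaluates them. The function name and the example values are hypothetical, capping the result at 1 is an added convenience because an error probability cannot exceed 1, and the paper's tighter bounds for a small number of looks are not implemented here.

```python
def curtailment_error_bounds(alpha, beta, gamma, gamma_prime):
    """Return the upper bounds (alpha / gamma, beta / gamma_prime).

    alpha, beta: type I and type II errors of the fixed-sample test.
    gamma, gamma_prime: stopping thresholds for rejecting / accepting H0.
    """
    if not (0 < gamma <= 1 and 0 < gamma_prime <= 1):
        raise ValueError("thresholds must lie in (0, 1]")
    # A probability bound above 1 carries no information, so cap it there.
    return min(1.0, alpha / gamma), min(1.0, beta / gamma_prime)


if __name__ == "__main__":
    type1_bound, type2_bound = curtailment_error_bounds(0.05, 0.20, 0.8, 0.8)
    print(type1_bound, type2_bound)
```

Thresholds closer to 1 make early stopping rarer and keep the bounds nearer to the nominal α and β.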

Journal ArticleDOI
TL;DR: The Chen-Stein method of Poisson approximation yields bounds on the error incurred when approximating the number of occurrences of possibly dependent events by a Poisson random variable of the same mean.
Abstract: The Poisson distribution is commonly used to model the number of occurrences of independent rare events. However, many instances arise where dependence exists, for example, in counting the length of long head runs in coin tossing, or matches between two DNA sequences. The Chen-Stein method of Poisson approximation yields bounds on the error incurred when approximating the number of occurrences of possibly dependent events by a Poisson random variable of the same mean. In addition to the problems related to the motivating examples from molecular biology involving runs and matches, the method may be applied to questions as varied as calculating probabilities involving extremes of sequences of random variables and approximating the probability of general birthday coincidences.

Journal ArticleDOI
TL;DR: In this paper, it is shown that, under general conditions, the distributions of the estimates of the process capability indices Cp, Cpk and Cpm are asymptotically normal.
Abstract: This paper shows that under general conditions, the distributions of the estimates of the process capability indices Cp, Cpk and Cpm are asymptotically normal.

Journal ArticleDOI
TL;DR: In this article, known results on repeated measurements designs are reviewed with emphasis on the impact of optimal design theory, some construction methods for these designs are presented, and a bibliography of these designs published after 1974 is provided.
Abstract: Any experiment in which one or more of the experimental units is used more than once is called a repeated measurements experiment. The associated design of a repeated measurements experiment is referred to as a repeated measurements design. This review covers some known results on repeated measurements designs. Emphasis is placed on the impact of optimal design theory. Some construction methods for these designs are presented. Hedayat and Afsarinejad (1975) has an extensive bibliography of earlier literature. A bibliography of these designs published after 1974 is provided at the end of this paper.

Journal ArticleDOI
TL;DR: In this article, the average run lengths of the zone control chart are compared with those of several Shewhart charts with and without runs rules, and it is shown that the standard zone control chart has performance similar to some even simpler charts and a much higher false alarm rate than the Shewhart chart with all of the common runs rules.
Abstract: Average run lengths of the zone control chart are presented. The performance of this chart is compared with that of several Shewhart charts with and without runs rules. It is shown that the standard zone control chart has performance similar to some even simpler charts and a much higher false alarm rate than the Shewhart chart with all of the common runs rules. It is also shown that a slightly modified zone control chart outperforms the Shewhart chart with the common runs rules.

Journal ArticleDOI
TL;DR: In this paper, it was shown that if the nth record value Rn of a sequence of i.i.d. r.v.s has an increasing failure rate (IFR) distribution then so does Rn+1. On the other hand, Rn-1 has a decreasing failure rate (DFR) distribution if Rn is DFR.
Abstract: It has been shown that if the nth record value Rn of a sequence of i.i.d. r.v.s has an increasing failure rate (IFR) distribution then so does Rn+1. On the other hand, Rn-1 has a decreasing failure rate (DFR) distribution if Rn is DFR. Some other partial ordering results for the record values have also been obtained.

Journal ArticleDOI
TL;DR: In this article, necessary and sufficient conditions are obtained under which a real function (x) is the conditional expectation E(h(X) | X ≥ x) of a random variable X with continuous distribution function, where h is a given real, continuous and strictly monotonic function.
Abstract: We obtain the necessary and sufficient conditions so that any real function (x) is the conditional expectation E(h(X) | X ≥ x) of a random variable X with continuous distribution function, where h is a given real, continuous and strictly monotonic function.

Journal ArticleDOI
TL;DR: In this article, five procedures for detecting outliers in linear regression are compared: sequential testing of the maximum internally studentized residual or maximum externally studentized (cross-validatory) residual, Marasinghe's multistage procedure, and two procedures based on recursive residuals, calculated on adaptively-ordered observations.
Abstract: Five procedures for detecting outliers in linear regression are compared: sequential testing of the maximum internally studentized residual or maximum externally studentized (cross-validatory) residual, Marasinghe's multistage procedure, and two procedures based on recursive residuals, calculated on adaptively-ordered observations. All of these procedures initially test a no-outliers hypothesis, and they have an underlying unity in their general approach to the outlier identification problem. Which procedure is most effective depends on the number and placement of outliers in the data. The multistage procedure is very effective in some cases, but requires prespecifying a value k, the maximum number of outliers one can then detect; the procedure can suffer severely if the chosen value for k is either larger or smaller than the number of outliers actually in the data.

Journal ArticleDOI
TL;DR: In this paper, the marginal density function of a truncated multivariate normal density function is derived in a form that can be evaluated using an available computer algorithm, and it is shown that marginal density functions are the truncated normal density functions multiplied by a "skew function".
Abstract: The single variable marginal density function of a truncated multivariate normal density function is derived in a form that can be evaluated using an available computer algorithm. It is shown that the marginal density function is a truncated normal density function multiplied by a “skew function”.

Journal ArticleDOI
TL;DR: In this article, the maximum asymptotic bias of two classes of robust estimates of the dispersion matrix V of a p-dimensional random vector x is studied under a point-mass contamination model.
Abstract: This paper deals with the maximum asymptotic bias of two classes of robust estimates of the dispersion matrix V of a p-dimensional random vector x, under a contamination model of the form P = (1 - ε)P_0 + εδ(x_0), where P is the distribution of x, P_0 is a spherical distribution, and δ(x_0) is a point mass at x_0. Estimators VQ,α of the first class minimize the α-quantile of x'V^{-1}x among all symmetric positive-definite matrices V for some α ∈ (0,1). The “minimum volume ellipsoid” estimator proposed by Rousseeuw belongs to this class with α = 0.5. These estimators have breakdown point min(α, 1-α) for all p. The second class of estimators consists of the M-estimators, from which the seemingly most robust member was chosen; namely the Tyler estimate defined as the solution VT of . This estimator has breakdown point 1/p. The numerical results show that except for ε very close to 1/p VT has in general a smaller maximum bias than VQ,α and that the maximum bias of the latter may be extremely large even for ε much smaller than its breakdown p...

Journal ArticleDOI
TL;DR: In this paper, a different approach based on some measures of closeness between the subspaces spanned by the initial eigenvectors and their corresponding version derived from an infinitesimal perturbation of the data distribution is proposed.
Abstract: In the context of sensitivity analysis in principal component analysis, Tanaka (1988) tackles the problem of the stability of the subspace spanned by dominant principal components. He derives the influence functions related to the projection operator on this subspace and to the spectral decomposition of the covariance or correlation matrix as sensitivity indicators. We suggest here a different approach based on some measures of closeness between the subspaces spanned by the initial eigenvectors and their corresponding version derived from an infinitesimal perturbation of the data distribution.