
Showing papers in "Canadian Journal of Statistics-revue Canadienne De Statistique in 1995"


Journal ArticleDOI
TL;DR: In this paper, a sampling-based approach using a Gibbs or Metropolis-within-Gibbs method is used to compute the posterior divergence measures, which convey the effects of a single observation or covariate on the posterior.
Abstract: A Bayesian approach is presented for detecting influential observations using general divergence measures on the posterior distributions. A sampling-based approach using a Gibbs or Metropolis-within-Gibbs method is used to compute the posterior divergence measures. Four specific measures are proposed, which convey the effects of a single observation or covariate on the posterior. The technique is applied to a generalized linear model with binary response data, an overdispersed model and a nonlinear model. An asymptotic approximation using the Laplace method to obtain the posterior divergence is also briefly discussed.
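As a toy illustration of the sampling-based idea, one can approximate the Kullback-Leibler divergence between the full-data posterior and a case-deleted posterior by Monte Carlo averaging of the log density ratio. The conjugate Beta-Bernoulli model below is an assumption chosen for illustration; the models in the abstract are nonconjugate and require Gibbs or Metropolis-within-Gibbs sampling.

```python
import math
import random

def log_beta_pdf(x, a, b):
    # log density of Beta(a, b) at x in (0, 1)
    return ((a - 1) * math.log(x) + (b - 1) * math.log(1 - x)
            + math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))

def kl_beta_mc(a1, b1, a2, b2, n_draws=50000, seed=1):
    # Monte Carlo estimate of KL(Beta(a1,b1) || Beta(a2,b2)):
    # average the log density ratio over draws from the first posterior
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        x = rng.betavariate(a1, b1)
        total += log_beta_pdf(x, a1, b1) - log_beta_pdf(x, a2, b2)
    return total / n_draws

# 20 Bernoulli trials with 15 successes and a flat Beta(1,1) prior:
# full posterior is Beta(16, 6); deleting one success gives Beta(15, 6)
divergence = kl_beta_mc(16, 6, 15, 6)
```

A large divergence flags the deleted case as influential; in a nonconjugate model the same average would be taken over MCMC draws instead of exact posterior draws.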

116 citations


Journal ArticleDOI
TL;DR: In this paper, the authors introduce empirical likelihood (Owen 1988) and compare it with the other methods, and present some simulations which indicate the need for small-sample corrections in some situations.
Abstract: Likelihood and estimating-equation methods provide the two most common approaches to parametric inference. With the former there are three main ways to deal with interval estimation or hypothesis testing; roughly speaking, these are based on the likelihood function, the score function, and the maximum likelihood estimator. For estimating equations which, unlike likelihood, do not require the specification of a full probability distribution for the data, there are analogous methods based on the pseudoscore (estimating function) and on the estimator obtained from the estimating equations. Boos (1992) has given a lucid review, with emphasis on problems that involve constraints on parameters (Aitchison and Silvey 1958). The purposes of this paper are to incorporate empirical likelihood (Owen 1988) into this setting, to compare it with the other methods, and to present some simulations which indicate the need for small-sample corrections in some situations.
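A minimal sketch of Owen-style empirical likelihood for a scalar mean, assuming the hypothesized mean lies strictly inside the range of the data: the Lagrange multiplier is found by bisection, and -2 log R(mu) is asymptotically chi-square with one degree of freedom (the calibration whose small-sample behaviour motivates the corrections mentioned above).

```python
import math

def el_statistic(x, mu):
    # empirical log-likelihood ratio statistic -2 log R(mu) for the mean,
    # via the multiplier lam solving sum d_i / (1 + lam * d_i) = 0
    d = [xi - mu for xi in x]
    if max(d) <= 0 or min(d) >= 0:
        raise ValueError("mu must lie strictly inside the data range")
    lo = -1.0 / max(d) + 1e-10   # keep every 1 + lam * d_i positive
    hi = -1.0 / min(d) - 1e-10
    def g(lam):
        return sum(di / (1 + lam * di) for di in d)
    for _ in range(200):          # g is strictly decreasing in lam
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2 * sum(math.log(1 + lam * di) for di in d)
```

At the sample mean the statistic is zero, and it grows as mu moves away; comparing it with a chi-square quantile gives the empirical-likelihood test.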

99 citations


Journal ArticleDOI
TL;DR: In this paper, the authors propose a new class of measures of rank correlation which are based on a notion of distance between incomplete rankings, which can be used to define rank correlation when the data are incomplete.
Abstract: The subject of rank correlation has had a rich history. It has been used in numerous applications in tests for trend and for independence. However, little has been said about how to define rank correlation when the data are incomplete. The practice has often been to ignore missing observations and to define rank correlation for the smaller complete record. We propose a new class of measures of rank correlation which are based on a notion of distance between incomplete rankings. There is the potential for a significant increase in efficiency over the approach which ignores missing observations as demonstrated by a specific case.
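One concrete way to build a distance in this spirit (a sketch, not the paper's exact class of measures): average the Kendall distance over all ways of completing the missing ranks. The brute-force enumeration below is only feasible for a handful of missing items.

```python
import itertools

def kendall_distance(r1, r2):
    # number of discordant pairs between two complete rankings (dicts item -> rank)
    items = list(r1)
    d = 0
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            if (r1[a] - r1[b]) * (r2[a] - r2[b]) < 0:
                d += 1
    return d

def expected_kendall_distance(partial, full):
    # average Kendall distance over all completions of the partial ranking,
    # assigning the unused rank values to the unranked items in every order
    known = {k: v for k, v in partial.items() if v is not None}
    missing = [k for k in partial if partial[k] is None]
    unused = sorted(set(range(1, len(partial) + 1)) - set(known.values()))
    total, count = 0, 0
    for perm in itertools.permutations(unused):
        completion = dict(known, **dict(zip(missing, perm)))
        total += kendall_distance(completion, full)
        count += 1
    return total / count
```

Unlike the ignore-missing approach, this distance uses the information that an unranked item still occupies one of the unused rank positions.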

50 citations


Journal ArticleDOI
TL;DR: In this article, a two-stage procedure is described for assessing subject-specific and marginal agreement for data from a test-retest reliability study of a binary classification procedure, where subject-specific agreement is parametrized through the log odds ratio, while marginal agreement is reflected by the log ratio of the off-diagonal Poisson means.
Abstract: A two-stage procedure is described for assessing subject-specific and marginal agreement for data from a test-retest reliability study of a binary classification procedure. Subject-specific agreement is parametrized through the log odds ratio, while marginal agreement is reflected by the log ratio of the off-diagonal Poisson means. A family of agreement measures in the interval [-1, 1] is presented for both types of agreement. The conditioning argument described facilitates exact inference. The proposed methodology is demonstrated by way of an example involving hypothetical data chosen for illustrative purposes, and data from a National Health Survey Study (Rogot and Goldberg 1966).
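For a 2x2 test-retest table with counts a (yes/yes), b (yes/no), c (no/yes) and d (no/no), the two log-scale parametrizations can be mapped into [-1, 1] with tanh(x/2). This particular map is an illustrative assumption (for the log odds ratio it reduces to Yule's Q); the paper defines a whole family of such measures and uses a conditioning argument for exact inference.

```python
import math

def agreement_measures(a, b, c, d):
    # subject-specific agreement from the log odds ratio,
    # marginal agreement from the log ratio of the off-diagonal counts;
    # tanh(x / 2) maps a log ratio x into the interval (-1, 1)
    subject_specific = math.tanh(math.log((a * d) / (b * c)) / 2)
    marginal = math.tanh(math.log(b / c) / 2)
    return subject_specific, marginal
```

For the odds ratio, tanh(log(OR)/2) equals (ad - bc)/(ad + bc), so strong diagonal concentration pushes the subject-specific measure toward 1, while balanced off-diagonals give zero marginal disagreement.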

43 citations


Journal ArticleDOI
TL;DR: In this article, the authors focus on the mean of the posterior distribution of the random distribution, which is the predictive distribution of a future observation given the sample, and learn about other features of this posterior distribution as well as about posteriors associated with functionals of the distribution of data.
Abstract: The nonparametric Bayesian approach for inference regarding the unknown distribution of a random sample customarily assumes that this distribution is random and arises through Dirichlet-process mixing. Previous work within this setting has focused on the mean of the posterior distribution of this random distribution, which is the predictive distribution of a future observation given the sample. Our interest here is in learning about other features of this posterior distribution as well as about posteriors associated with functionals of the distribution of the data. We indicate how to do this in the case of linear functionals. An illustration, with a sample from a Gamma distribution, utilizes Dirichlet-process mixtures of normals to recover this distribution and its features.

30 citations


Journal ArticleDOI
TL;DR: In this article, the authors show that the asymptotic distributions of the likelihood-ratio tests are of chi-bar-square type, and provide expressions for the weighting values.
Abstract: There are numerous situations in categorical data analysis where one wishes to test hypotheses involving a set of linear inequality constraints placed upon the cell probabilities. For example, it may be of interest to test for symmetry in k × k contingency tables against one-sided alternatives. In this case, the null hypothesis imposes a set of linear equalities on the cell probabilities (namely pij = pji for all i > j), whereas the alternative specifies directional inequalities. Another important application (Robertson, Wright, and Dykstra 1988) is testing for or against stochastic ordering between the marginals of a k × k contingency table when the variables are ordinal and independence holds. Here we extend existing likelihood-ratio results to cover more general situations. To be specific, we consider testing H0 against H1 − H0 and H1 against H2 − H1, where H0: Σi=1,…,k pi xji = 0, j = 1,…, s; H1: Σi=1,…,k pi xji ≥ 0, j = 1,…, s; and H2 does not impose any restrictions on p. The xji's are known constants, and s ≤ k − 1. We show that the asymptotic distributions of the likelihood-ratio tests are of chi-bar-square type, and provide expressions for the weighting values.
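In the simplest special case of s independent inequality constraints, the chi-bar-square null distribution is that of the squared projection of an N(0, I_s) vector onto the nonnegative orthant, and its tail probability can be simulated directly. This sketch sidesteps the closed-form weighting values the paper derives for the general case.

```python
import random

def chi_bar_square_pvalue(stat, s, n_sims=100000, seed=7):
    # Monte Carlo tail probability of sum_i max(Z_i, 0)^2 with Z ~ N(0, I_s):
    # a chi-bar-square distribution with binomial(s, 1/2) mixing weights
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_sims):
        t = sum(max(rng.gauss(0.0, 1.0), 0.0) ** 2 for _ in range(s))
        if t >= stat:
            exceed += 1
    return exceed / n_sims
```

The simulated statistic is a mixture of chi-square distributions with 0, 1, …, s degrees of freedom, which is why a plain chi-square calibration would be conservative here.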

26 citations


Journal ArticleDOI
TL;DR: In this article, the authors investigate the effect of the bandwidth on the mean squared errors and biases of the smooth estimators, and also provide a comparison of their performance with the analogous estimators obtained under random sampling for record values.
Abstract: Often, in industrial stress testing, meteorological data analysis, and other similar situations, measurements may be made sequentially and only values smaller than all previous ones are recorded. When the number of records is fixed in advance, the data are referred to as inversely sampled record-breaking data. This paper is concerned with nonparametric estimation of the distribution and density functions from such data (successive minima). For a single record-breaking sample, consistent estimation is not possible except in the extreme left tail of the distribution. Hence, replication is required, and for m such independent record-breaking samples, the estimators are shown to be strongly consistent and asymptotically normal as m → ∞. Computer simulations are used to investigate the effect of the bandwidth on the mean squared errors and biases of the smooth estimators, and are also used to provide a comparison of their performance with the analogous estimators obtained under random sampling for record values.
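Inversely sampled record-breaking data are easy to simulate: draw observations until a preset number r of lower records has occurred and keep only the successive minima. The Uniform(0, 1) parent distribution below is an arbitrary choice for illustration.

```python
import random

def record_breaking_sample(r, rng):
    # draw observations until r lower records have occurred;
    # only the records (successive minima) are retained
    records = []
    current_min = float('inf')
    while len(records) < r:
        x = rng.random()          # parent distribution: Uniform(0, 1)
        if x < current_min:
            current_min = x
            records.append(x)
    return records

# m replicated record-breaking samples, as required for consistent estimation
rng = random.Random(42)
samples = [record_breaking_sample(5, rng) for _ in range(200)]
```

Each replicate is a strictly decreasing sequence; pooling m such replicates is what makes consistent nonparametric estimation possible outside the extreme left tail.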

22 citations


Journal ArticleDOI
TL;DR: In this article, an admissible minimax estimator for scale-invariant squared-error loss is presented, which is the pointwise limit of a sequence of Bayes estimators.
Abstract: Let X have a gamma distribution with known shape parameter α and unknown scale parameter θ. Suppose it is known that θ ≥ a for some known a > 0. An admissible minimax estimator for scale-invariant squared-error loss is presented. This estimator is the pointwise limit of a sequence of Bayes estimators. Further, the class of truncated linear estimators C = {θρ | θρ(x) = max(a, ρx), ρ > 0} is studied. It is shown that each θρ is inadmissible and that exactly one of them is minimax. Finally, it is shown that Katz's [Ann. Math. Statist., 32, 136–142 (1961)] estimator of θ is not minimax for our loss function. Some further properties of and comparisons among these estimators are also presented.
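The risk of a truncated linear estimator max(a, ρx) under the scale-invariant squared-error loss L(θ, d) = (d/θ − 1)² is easy to approximate by simulation. The parameter values below are arbitrary illustrations; determining which ρ is actually minimax on θ ≥ a is what the paper does.

```python
import random

def mc_risk(rho, theta, alpha, a, n_sims=20000, seed=3):
    # Monte Carlo risk of d(X) = max(a, rho * X) under L(theta, d) = (d/theta - 1)^2,
    # where X ~ Gamma(shape alpha, scale theta)
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        x = rng.gammavariate(alpha, theta)
        d = max(a, rho * x)
        total += (d / theta - 1.0) ** 2
    return total / n_sims
```

When θ is far above the bound a the truncation rarely binds and the classical comparison reappears: ρ = 1/(α + 1) has risk near 1/(α + 1), beating ρ = 1/α with risk near 1/α.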

19 citations


Journal ArticleDOI
TL;DR: In this paper, the authors describe some computationally efficient methods for building proportional-hazards models with piecewise constant relative risk functions, which can be used to fit and assess a single step-function (changepoint) term or as flexible exploratory survival-data analysis tools.
Abstract: We describe some computationally efficient methods for building proportional-hazards models with piecewise constant relative risk functions. The proposed techniques can be used to fit and assess a single step-function (changepoint) term or as flexible exploratory survival-data analysis tools. In addition, these tools can be used to include step-function terms in more general proportional-hazards models such as tree-based models. An application to the development of prognostic groups based on data from a clinical trial for myeloma is presented.
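A brute-force version of the idea, profiling the Cox log partial likelihood over candidate changepoints for a single step term 1{z > cut}, can be written in a few lines. This grid search is illustrative only; the paper's contribution is precisely a computationally efficient alternative to it.

```python
import math

def step_loglik(times, events, z, cut, beta):
    # Cox log partial likelihood with the single covariate x_i = 1{z_i > cut}
    # (distinct event times assumed)
    x = [1.0 if zi > cut else 0.0 for zi in z]
    order = sorted(range(len(times)), key=lambda i: times[i])
    ll = 0.0
    for k, i in enumerate(order):
        if events[i]:
            risk_set = sum(math.exp(beta * x[j]) for j in order[k:])
            ll += beta * x[i] - math.log(risk_set)
    return ll

def best_step(times, events, z):
    # grid search over changepoints (observed z values) and beta
    betas = [b / 10.0 for b in range(-30, 31)]
    cuts = sorted(set(z))[:-1]
    return max(((c, b) for c in cuts for b in betas),
               key=lambda cb: step_loglik(times, events, z, cb[0], cb[1]))

# illustrative data: subjects with z > 5 fail much earlier
times = [10.0, 9.0, 8.0, 7.0, 1.0, 2.0, 3.0, 4.0]
events = [1, 1, 1, 1, 1, 1, 1, 1]
z = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
cut, beta = best_step(times, events, z)
```

With a clear separation in survival between the two z-groups, the fitted step term picks a cut between the groups and a positive log relative risk.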

19 citations


Journal ArticleDOI
TL;DR: In this article, the authors prove the asymptotic validity of bootstrap confidence bands for the influence curve from its usual estimator (the sensitivity curve), based on the use of Gill's generalized delta method for Hadamard differentiable operators.
Abstract: We prove the asymptotic validity of bootstrap confidence bands for the influence curve from its usual estimator (the sensitivity curve). The proof is based on the use of Gill's generalized delta method for Hadamard differentiable operators. Some statistical applications, in particular to the estimation of asymptotic variance, are given.
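A pointwise percentile-bootstrap sketch for the sensitivity curve of a statistic T, defined here as SC_n(x) = (n + 1)(T(x1,…,xn, x) − T(x1,…,xn)). The paper's result is what justifies bands of this bootstrap type; this sketch does not reproduce its simultaneous-band construction.

```python
import random
import statistics

def sensitivity_curve(sample, x, stat):
    # effect of adding one observation at x, scaled by n + 1
    n = len(sample)
    return (n + 1) * (stat(sample + [x]) - stat(sample))

def bootstrap_band(sample, grid, stat, n_boot=500, alpha=0.10, seed=2):
    # pointwise percentile bootstrap band for the sensitivity curve
    rng = random.Random(seed)
    band = []
    for x in grid:
        vals = sorted(
            sensitivity_curve([rng.choice(sample) for _ in sample], x, stat)
            for _ in range(n_boot))
        band.append((vals[int(alpha / 2 * n_boot)],
                     vals[int((1 - alpha / 2) * n_boot) - 1]))
    return band

# for the mean, SC_n(x) = x - mean(sample), so the band should cover that value
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
band = bootstrap_band(sample, [10.0], statistics.mean)
```

For the sample mean the influence curve is x minus the population mean, which the band above brackets at each grid point.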

16 citations


Journal ArticleDOI
TL;DR: In this article, a saddle-point approximation for the marginal density of a real-valued function p(), where is a general M-estimator of a p-dimensional parameter, that is, the solution of the system {n-1ljl (Yl) = 0}j = 1,
Abstract: We develop a saddle-point approximation for the marginal density of a real-valued function p(), where is a general M-estimator of a p-dimensional parameter, that is, the solution of the system {n-1ljl (Yl,) = 0}j=1,…,p. The approximation is applied to several regression problems and yields very good accuracy for small samples. This enables us to compare different classes of estimators according to their finite-sample properties and to determine when asymptotic approximations are useful in practice. Dans cet article, nous developpons une approximation utilisant la methode du col pour la densite marginale d'une fonction a valeur reelle p(), ou est un estimateur-M general d'un parametre p-dimensionnel, c'est a dire, la solution du systeme {n-1ljl (Yl, ) = 0}j =1,…,p. L'approximation est appliquee a plusieurs problemes de regression et demontre une grande exactitude pour les petits echantillons. Ceci nous permet de comparer differentes categories d'estimateurs selon leurs proprietes d'echantillon fini et de determiner dans quels cas l'approximation asymptotique peut ětre utile en pratique.

Journal ArticleDOI
TL;DR: In this paper, the authors consider automatic data-driven density, regression and autoregression estimates based on any random bandwidth selector ĥT, and show that in a first-order asymptotic approximation they behave as well as the related estimates obtained with the “optimal” bandwidth hT, as long as ĥT/hT → 1 in probability.
Abstract: We consider automatic data-driven density, regression and autoregression estimates, based on any random bandwidth selector ĥT. We show that in a first-order asymptotic approximation they behave as well as the related estimates obtained with the “optimal” bandwidth hT as long as ĥT/hT → 1 in probability. The results are obtained for dependent observations; some of them are also new for independent observations.

Journal ArticleDOI
TL;DR: In this paper, robust estimators of the scale parameters in the error-components model are described, which are based on the empirical characteristic functions of appropriate sets of residuals and are affine equivariant, consistent and asymptotically normal.
Abstract: Robust estimators of the scale parameters in the error-components model are described. The new estimators are based on the empirical characteristic functions of appropriate sets of residuals and are affine equivariant, consistent and asymptotically normal. The robustness of the new estimators is investigated via influence-function calculations. The results of Monte Carlo experiments and an example based on real data illustrate the usefulness of the estimators.

Journal ArticleDOI
TL;DR: In this paper, an exact confidence region centered at a generalized version of the well-known Graybill-deal estimator of m is developed, and a multiple comparison procedure based on this confidence region is outlined.
Abstract: Suppose that there are independent samples available from several multivariate normal populations with the same mean vector m but possibly different covariance matrices. The problem of developing a confidence region for the common mean vector based on all the samples is considered. An exact confidence region centered at a generalized version of the well-known Graybill-Deal estimator of m is developed, and a multiple comparison procedure based on this confidence region is outlined. Necessary percentile points for constructing the confidence region are given for the two-sample case. For more than two samples, a convenient method of approximating the percentile points is suggested. Also, a numerical example is presented to illustrate the methods. Further, for the bivariate case, the proposed confidence region and the ones based on individual samples are compared numerically with respect to their expected areas. The numerical results indicate that the new confidence region is preferable to the single-sample versions for practical use.
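In the univariate case the Graybill-Deal common-mean estimate weights each sample mean by its estimated precision n_i/s_i²; the generalization studied in the paper replaces these scalar weights by matrix weights built from sample covariance matrices.

```python
def graybill_deal(means, variances, sizes):
    # univariate Graybill-Deal estimator of a common mean:
    # weight sample mean i by n_i / s_i^2, the estimated inverse
    # variance of that sample mean
    weights = [n / s2 for n, s2 in zip(sizes, variances)]
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)
```

Equal precisions give the ordinary average, while a noisier sample is pulled toward the more precise one.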

Journal ArticleDOI
TL;DR: In this article, a new Laplacian approximation for the posterior density of the Fisher-Behrens problem is presented. But the approximation is not normalized, since the computation of their approximation cannot be normalized.
Abstract: This paper presents a new Laplacian approximation to the posterior density of η = g(θ). It has a simpler analytical form than that described by Leonard et al. (1989). The approximation derived by Leonard et al. requires a conditional information matrix Rη to be positive definite for every fixed η. However, in many cases, not all Rη are positive definite. In such cases, the computations of their approximations fail, since the approximation cannot be normalized. However, the new approximation may be modified so that the corresponding conditional information matrix can be made positive definite for every fixed η. In addition, a Bayesian procedure for contingency-table model checking is provided. An example of cross-classification between the educational level of a wife and fertility-planning status of couples is used for explanation. Various Laplacian approximations are computed and compared in this example and in an example of public school expenditures in the context of Bayesian analysis of the multiparameter Fisher-Behrens problem.

Journal ArticleDOI
TL;DR: In this paper, a family of higher-order kernels for estimation of a probability density is constructed by iterating the twicing procedure, with the attractive property that their Fourier transforms are simply 1 − {1 − K̂(·)}^(m+1), where K̂ is the Fourier transform of K.
Abstract: Classes of higher-order kernels for estimation of a probability density are constructed by iterating the twicing procedure. Given a kernel K of order l, we build a family of kernels Km of orders l(m + 1) with the attractive property that their Fourier transforms are simply 1 − {1 − K̂(·)}^(m+1), where K̂ is the Fourier transform of K. These families of higher-order kernels are well suited when the fast Fourier transform is used to speed up the calculation of the kernel estimate or the least-squares cross-validation procedure for selection of the window width. We also compare the theoretical performance of the optimal polynomial-based kernels with that of the iterative twicing kernels constructed from some popular second-order kernels.
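The twicing step K → 2K − K∗K can be checked numerically on a discretized Gaussian kernel: one iteration turns a second-order kernel into a fourth-order one (unit mass, vanishing second moment), consistent with the Fourier-domain identity K̂1 = 1 − (1 − K̂)².

```python
import math

def gaussian(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def twice(k, dx):
    # one twicing step on a kernel sampled on a symmetric grid: K -> 2K - K*K,
    # with the self-convolution computed by direct summation
    n, c = len(k), len(k) // 2          # c is the index of x = 0
    conv = []
    for j in range(n):
        s = sum(k[i] * k[j - i + c] for i in range(n) if 0 <= j - i + c < n)
        conv.append(s * dx)
    return [2 * ki - ci for ki, ci in zip(k, conv)]

dx, half_width = 0.05, 8.0
xs = [i * dx - half_width for i in range(int(2 * half_width / dx) + 1)]
k1 = twice([gaussian(x) for x in xs], dx)
mass = sum(k1) * dx                                          # ~ 1: still a kernel
second_moment = sum(x * x * v for x, v in zip(xs, k1)) * dx  # ~ 0: order four
```

In practice this direct convolution is exactly what the fast Fourier transform accelerates, which is why the families are convenient for FFT-based density estimation.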

Journal ArticleDOI
TL;DR: In this paper, a modified rank-based version of Tiao and Box's model specification procedure is proposed, which is likely to be more reliable under non-Gaussian conditions, and more robust against gross errors.
Abstract: Rank-based cross-covariance matrices, extending to the case of multivariate observed series the (univariate) rank autocorrelation coefficients introduced by Wald and Wolfowitz (1943), are considered. A permutational central limit theorem is established for the joint distribution of such matrices, under the null hypothesis of (multivariate) randomness as well as under contiguous alternatives of (multivariate) ARMA dependence. A rank-based, permutationally distribution-free test of the portmanteau type is derived, and its asymptotic local power is investigated. Finally, a modified rank-based version of Tiao and Box's model specification procedure is proposed, which is likely to be more reliable under non-Gaussian conditions, and more robust against gross errors.

Journal ArticleDOI
TL;DR: In this paper, the authors show that the saddlepoint expansion for the density of a sample mean is always uniformly valid on compact subsets in the interior of the domain of the mean, and that this uniform validity is the key for establishing the relation between saddlepoint expansions for density functions and Lugannani and Rice's expansion for tail probability.
Abstract: This paper shows that Daniels's (1954) saddlepoint expansion for the density of a sample mean is, for all practical purposes, always uniformly valid on compact subsets in the interior of the domain of the mean. This uniform validity is the key for establishing the relation between the saddlepoint expansion for the density function and Lugannani and Rice's expansion for the tail probability, and for establishing the validity of a high-order asymptotic expansion for the density of a standardized mean.
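For a concrete check of Daniels's expansion, take the mean of n i.i.d. Exp(1) variables: the cumulant generating function is K(s) = −log(1 − s), the saddlepoint solving K′(ŝ) = x is ŝ = 1 − 1/x, and the exact density of the mean is Gamma(n, scale 1/n). The ratio of the two densities is constant in x (the approximation is exact here after renormalization), which is what makes this a standard test case.

```python
import math

def saddlepoint_exp_mean(x, n):
    # Daniels's saddlepoint density for the mean of n i.i.d. Exp(1) variables:
    # f_hat(x) = sqrt(n / (2 pi K''(s))) * exp(n * (K(s) - s * x)) at s = 1 - 1/x
    s = 1.0 - 1.0 / x
    K = -math.log(1.0 - s)                # equals log x
    K2 = 1.0 / (1.0 - s) ** 2             # equals x^2
    return math.sqrt(n / (2 * math.pi * K2)) * math.exp(n * (K - s * x))

def exact_exp_mean(x, n):
    # the mean of n Exp(1) variables has a Gamma(n, scale 1/n) density
    return math.exp(n * math.log(n) + (n - 1) * math.log(x) - n * x - math.lgamma(n))
```

The constant ratio is Γ(n) eⁿ √(n/2π) / nⁿ, which tends to 1 by Stirling's formula as n grows.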

Journal ArticleDOI
TL;DR: In this article, the authors show that in a regression model with multivariate response, the least-distances method typically yields quantities that exhibit uniqueness properties that are similar to those obtained by the least squares method.
Abstract: In a regression model with univariate response, the quantities derived from the least-absolute-deviations method need not be unique. In this note, we show that, contrary to the univariate case, in a regression model with multivariate response, the least-distances method typically yields quantities that exhibit uniqueness properties that are similar to those obtained by the least-squares method.

Journal ArticleDOI
TL;DR: Brown and Gajek (1990) gave useful lower bounds on Bayes risks that improve on earlier bounds by various authors, many of which rely on the information inequality; this paper instead derives bounds from the convexity of appropriate functionals.
Abstract: Brown and Gajek (1990) gave useful lower bounds on Bayes risks, which improve on earlier bounds by various authors. Many of these use the information inequality. For estimating a normal variance using the invariant quadratic loss and any arbitrary prior on the reciprocal of the variance that is a mixture of Gamma distributions, we obtain lower bounds on Bayes risks that are different from Borovkov-Sakhanienko bounds. The main tool is convexity of appropriate functionals as opposed to the information inequality. The bounds are then applied to many specific examples, including the multi-Bayesian setup (Zidek and his coauthors). Subsequent use of moment theory and geometry gives a number of new results on efficiency of estimates which are linear in the sufficient statistic. These results complement earlier results of Donoho, Liu and MacGibbon (1990), Johnstone and MacGibbon (1992) and Vidakovic and DasGupta (1994) for the location case.

Journal ArticleDOI
TL;DR: Some recent work on estimation in survey sampling is analyzed and extended from the perspective of the theory of optimal estimating functions.
Abstract: Some recent works on estimation in survey sampling are analyzed and extended from the perspective of the theory of optimal estimating functions.

Journal ArticleDOI
TL;DR: In this paper, a strong invariance principle is established for triangular arrays of a broad class of weakly dependent real random variables; the functional central limit theorem and law of the iterated logarithm are shown for an approximating array of rowwise independent normals and thereby extended to the original array, with applications to rates of convergence of estimators in regression analysis.
Abstract: We establish a strong invariance principle for triangular arrays of a broad class of weakly dependent real random variables. We approximate the original array of dependent random variables by an array of rowwise independent standard normal variables. We demonstrate the functional central limit theorem and law of the iterated logarithm for the approximating array and thereby extend these results to the original array. Among several examples, we look at arrays used in describing the rate of convergence of estimators in regression analysis.

Journal ArticleDOI
TL;DR: In this article, a new control chart, called the θ chart, was proposed for monitoring the mean of a process with bivariate quality characteristics, which can identify a rotation, shift or alternation between the subgroups of the process mean.
Abstract: A new control chart, called the θ chart, for monitoring the mean of a process with bivariate quality characteristics is proposed. It can identify a rotation, shift or alternation between the subgroups of the process mean. The conventional application of the χ2 chart to identify a sudden shift of the process mean is also expanded to identify a change of the process mean or a change of the process dispersion. Furthermore, when used together, the θ and χ2 charts could provide further insight into the process.
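The abstract does not detail the construction of the θ chart, but the χ2 chart it extends plots, for each subgroup, the Hotelling-type quadratic form n(x̄ − μ0)ᵀΣ⁻¹(x̄ − μ0) against a chi-square control limit. A minimal sketch of that statistic for bivariate data, assuming a known in-control mean μ0 and covariance inverse:

```python
def chi2_statistic(xbar, mu0, sigma_inv, n):
    """Chi-square control-chart statistic for a bivariate subgroup mean:
    n * (xbar - mu0)' Sigma^{-1} (xbar - mu0), with known in-control
    covariance matrix Sigma (its inverse is passed in)."""
    d = [xbar[0] - mu0[0], xbar[1] - mu0[1]]
    # Quadratic form d' Sigma^{-1} d, written out for the 2x2 case.
    q = (d[0] * (sigma_inv[0][0] * d[0] + sigma_inv[0][1] * d[1])
         + d[1] * (sigma_inv[1][0] * d[0] + sigma_inv[1][1] * d[1]))
    return n * q

# Subgroup of size 4 whose mean sits at (1, 1) with identity covariance:
# the statistic is 4 * 2 = 8, to be compared with a chi-square(2) limit.
t2 = chi2_statistic((1.0, 1.0), (0.0, 0.0), [[1.0, 0.0], [0.0, 1.0]], 4)
```

Under an in-control bivariate normal process the statistic follows a chi-square distribution with 2 degrees of freedom, which supplies the control limit.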

Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of estimating a quantile of an exponential distribution with unknown location and scale parameters under Pitman's measure of closeness (PMC), where the loss function is required to satisfy some mild conditions but is otherwise arbitrary.
Abstract: We consider the problem of estimating a quantile of an exponential distribution with unknown location and scale parameters under Pitman's measure of closeness (PMC). The loss function is required to satisfy some mild conditions but is otherwise arbitrary. An optimal estimator is obtained in the class of location-scale-equivariant estimators, and its admissibility in the sense of PMC is investigated.

Journal ArticleDOI
TL;DR: In this article, a stochastic model for tumour growth and additive competing risks of death is proposed to estimate the incidence rate for occult tumours in carcinogenicity trials.
Abstract: In the development of many diseases there are often associated random variables which continuously reflect the progress of a subject towards the final expression of the disease (failure). At any given time these processes, which we call stochastic covariates, may provide information about the current hazard and the remaining time to failure. Likewise, in situations when the specific times of key prior events are not known, such as the time of onset of an occult tumour or the time of infection with HIV-1, it may be possible to identify a stochastic covariate which reveals, indirectly, when the event of interest occurred. The analysis of carcinogenicity trials which involve occult tumours is usually based on the time of death or sacrifice and an indicator of tumour presence for each animal in the experiment. However, the size of an occult tumour observed at the endpoint represents data concerning tumour development which may convey additional information concerning both the tumour incidence rate and the rate of death to which tumour-bearing animals are subject. We develop a stochastic model for tumour growth and suggest different ways in which the effect of this growth on the hazard of failure might be modelled. Using a combined model for tumour growth and additive competing risks of death, we show that if this tumour size information is used, assumptions concerning tumour lethality, the context of observation or multiple sacrifice times are no longer necessary in order to estimate the tumour incidence rate. Parametric estimation based on the method of maximum likelihood is outlined and is applied to simulated data from the combined model. The results of this limited study confirm that use of the stochastic covariate tumour size results in more precise estimation of the incidence rate for occult tumours. 

Journal ArticleDOI
Peide Shi
TL;DR: In this paper, the authors relax the independence assumption on the stationary sequence and show that the optimal global convergence rate can be achieved by the B-spline based estimators and their derivatives.
Abstract: Nonparametric regression quantiles provide informative and powerful alternatives and generalizations to the more traditional mean and median. The use of B-spline approximation in conditional quantile estimation has been studied recently by He and Shi (1994). The regression quantile splines are obtained by minimizing ∑_{i=1}^n ρ_α(Y_i − g(X_i)), where ρ_α(t) = |t| − (2α − 1)t is the check function and g is taken from the space spanned by normalized B-spline basis functions. This paper relaxes the independence assumption on the stationary sequence {(X_i, Y_i)}. If the true conditional quantile function is smooth up to order r and the observed sequence is β-mixing (or absolutely regular), it is shown, under suitable mixing conditions, that the optimal global convergence rates can be achieved by the B-spline-based estimators and their derivatives.
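The check function underlying regression quantiles can be demonstrated in the simplest possible setting: minimizing the check loss over a constant recovers a sample quantile. The sketch below uses the common Koenker-Bassett parameterization ρ_α(t) = t(α − 1{t<0}); conventions in the literature (including the |t| − (2α − 1)t form above) differ only by sign and a factor of 2. The function names are illustrative, and the B-spline version of the abstract would replace the constant c with a spline expansion g(X_i).

```python
def check_loss(t, alpha):
    """Check function rho_alpha(t) = t * (alpha - 1{t < 0}):
    positive residuals weighted alpha, negative ones (1 - alpha)."""
    return t * (alpha - (1.0 if t < 0 else 0.0))

def quantile_fit(y, alpha):
    """Minimize sum_i rho_alpha(y_i - c) over constants c.  The minimum
    is attained at a sample point, so searching the data suffices."""
    return min(y, key=lambda c: sum(check_loss(yi - c, alpha) for yi in y))

# alpha = 0.5 recovers the median; larger alpha moves to upper quantiles.
med = quantile_fit([1, 2, 3, 4, 5], 0.5)
hi = quantile_fit([1, 2, 3, 4, 5], 0.9)
```

Replacing the exhaustive search with linear programming, and the constant with a B-spline basis, yields the regression quantile splines the abstract studies.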

Journal ArticleDOI
TL;DR: In this paper, a set of n planar regions on the surface of a part is considered and the small-sample density of this estimator (on the unit sphere S2) is determined asymptotically as the variance of the CMM error approaches 0.
Abstract: Coordinate measuring machines (CMMs) are used to check the geometric integrity of component parts. The geometric constraints to which a part must conform, as defined, e.g., by the American National Standards Institute, assume the use of some type of gauging system when inspecting the part. Statistical issues arise in interpreting CMM data in the inspection of part tolerances. We consider a set of n planar regions on the surface of a part. The unit vector normal to each plane is estimated by orthogonal least squares. The small-sample density of this estimator (on the unit sphere S2) is determined asymptotically as the variance of the CMM error approaches 0. To a first-degree approximation, this density is Fisher-von Mises. Diagnostics are reviewed to test the geometric constraint that the n planar regions are oriented correctly with respect to one another, and to test the flatness of planar regions.
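A standard way to compute the orthogonal least-squares (total least-squares) normal to a planar point cloud — not necessarily the authors' exact procedure — is via the singular value decomposition of the centered data matrix: the right singular vector with the smallest singular value minimizes the sum of squared orthogonal distances to the fitted plane.

```python
import numpy as np

def plane_normal(points):
    """Orthogonal least-squares estimate of the unit normal to a planar
    point cloud: the right singular vector of the centered data matrix
    associated with the smallest singular value."""
    X = np.asarray(points, dtype=float)
    Xc = X - X.mean(axis=0)            # center the measured points
    _, _, vt = np.linalg.svd(Xc)       # rows of vt: right singular vectors
    n = vt[-1]                         # direction of least variation
    return n / np.linalg.norm(n)       # unit-length normal (sign arbitrary)

# Points measured on the plane z = 0: the estimated normal is +/-(0, 0, 1).
nrm = plane_normal([(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)])
```

The estimated normal is defined only up to sign; the paper's small-sample density on the sphere S2 describes how this estimator scatters around the true normal as the CMM error variance shrinks.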

Journal ArticleDOI
TL;DR: In this paper, a family of easily computable estimators of the odds ratio of a number of two-by-two contingency tables is proposed, including the well-known Mantel-Haenszel estimator as a special case.
Abstract: A family of easily computable estimators of the odds ratio of a number of two-by-two contingency tables is proposed. It includes the well-known Mantel-Haenszel estimator as a special case. A necessary and sufficient condition is given for the strong consistency of the estimators for the case when the tables are sparse and the number of tables becomes large. The condition also ensures the asymptotic normality of this family of estimators. A family of consistent estimators is proposed for the variances of the asymptotic distributions. In the case of the Mantel-Haenszel estimator, the validity of Breslow's (1981, Biometrika) condition for the consistency and asymptotic normality is questioned. Examples are given to demonstrate that it is neither necessary nor sufficient for the consistency of the Mantel-Haenszel estimator.
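The Mantel-Haenszel estimator that the proposed family generalizes is itself easy to compute. For 2×2 tables with cells (a_i, b_i, c_i, d_i) and totals n_i, it is OR_MH = Σ(a_i d_i / n_i) / Σ(b_i c_i / n_i). A minimal sketch (the function name is illustrative):

```python
def mantel_haenszel_or(tables):
    """Mantel-Haenszel common odds-ratio estimate across 2x2 tables.
    Each table is (a, b, c, d), laid out as rows (exposed, unexposed)
    by columns (case, control), with total n = a + b + c + d."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

# With a single table the estimate reduces to the ordinary odds ratio
# ad / bc; here 10 * 8 / (5 * 2) = 8.
or_hat = mantel_haenszel_or([(10, 5, 2, 8)])
```

The sparse-table asymptotics in the abstract concern the regime where each n_i stays small while the number of tables grows, which is exactly where this weighted-sum form remains well behaved.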

Journal ArticleDOI
TL;DR: In this paper, the authors prove a strong law of large numbers, a central limit theorem and a law of the iterated logarithm for the sequence {Sk, including both the situations where the sample sizes tend to infinity while m is fixed and those where sample sizes remain small while m tends to infinity.
Abstract: Consider a family of square-integrable R^d-valued statistics S_k = S_k(X_{1,k_1}; X_{2,k_2}; …; X_{m,k_m}), where the independent samples X_{i,k_i} respectively have k_i i.i.d. components valued in some separable metric space X_i. We prove a strong law of large numbers, a central limit theorem and a law of the iterated logarithm for the sequence {S_k}, including both the situations where the sample sizes tend to infinity while m is fixed and those where the sample sizes remain small while m tends to infinity. We also obtain two almost-sure convergence results in both these contexts, under the additional assumption that S_k is symmetric in the coordinates of each sample X_{i,k_i}. Some extensions to row-exchangeable and conditionally independent observations are provided. Applications to an estimator of the dimension of a data set and to the Henze-Schilling test statistic for equality of two densities are also presented.

Journal ArticleDOI
TL;DR: In this article, the authors considered the problem of growth inhibition in terms of the length of the terminal sprouts of a tree and the response variable was chosen to be the proportion of the sprouts on a tree that exceeded a specified cutoff length.
Abstract: In searching for the "best" growth inhibitor, we decided to consider growth inhibition in terms of the lengths of the terminal sprouts. For it is logical to infer that the trees with the longer sprouts (after a 20-month period) will most likely be the ones that will need trimming in the future. Additionally, we reasoned that if a particular treatment produced a smaller proportion of "long" sprouts, then it would be a more effective growth inhibitor. It was now necessary to define what was meant by "long". After consultation with foresters we chose cutoff lengths of 15.0, 25.0 and 35.0 cm. Hence the response variable was chosen to be the proportion of the terminal sprouts on a tree that exceeded a specified cutoff length. By varying the cutoff lengths, we would minimize the effect of the arbitrariness involved in choosing one particular length.
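The response variable described above is straightforward to compute per tree: for each cutoff, the fraction of that tree's terminal sprout lengths exceeding it. A minimal sketch (function and argument names are illustrative):

```python
def exceedance_proportions(sprout_lengths, cutoffs=(15.0, 25.0, 35.0)):
    """For one tree, the proportion of terminal sprout lengths (in cm)
    exceeding each cutoff length -- the abstract's response variable."""
    n = len(sprout_lengths)
    return {c: sum(1 for s in sprout_lengths if s > c) / n for c in cutoffs}

# A tree with four measured sprouts, evaluated at the three cutoffs
# chosen after consultation with foresters.
props = exceedance_proportions([10.0, 20.0, 30.0, 40.0])
```

Computing the proportion at all three cutoffs, rather than one, is what limits the arbitrariness of any single choice of "long".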