
Showing papers in "Canadian Journal of Statistics / Revue Canadienne de Statistique" in 2006


Journal ArticleDOI
TL;DR: The authors proposed a conservative prior distribution for variance components, which deliberately gives more weight to smaller values and is appropriate for investigators who are skeptical about the presence of variability in the second-stage parameters (random effects).
Abstract: Bayesian hierarchical models typically involve specifying prior distributions for one or more variance components. This is rather removed from the observed data, so specification based on expert knowledge can be difficult. While there are suggestions for "default" priors in the literature, often a conditionally conjugate inverse-gamma specification is used, despite documented drawbacks of this choice. The authors suggest "conservative" prior distributions for variance components, which deliberately give more weight to smaller values. These are appropriate for investigators who are skeptical about the presence of variability in the second-stage parameters (random effects) and want to particularly guard against inferring more structure than is really present. The suggested priors readily adapt to various hierarchical modelling settings, such as fitting smooth curves, modelling spatial variation and combining data from multiple sites.
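To make the contrast concrete, here is a minimal numerical sketch (not the authors' specific proposal) comparing the density that a conditionally conjugate inverse-gamma(0.001, 0.001) prior places on small variance values with that of a half-normal prior on the standard deviation, one common prior that, like the authors' conservative choices, favours smaller values; all parameter values and names below are illustrative assumptions.

```python
import numpy as np
from scipy.stats import invgamma, halfnorm

# Illustration only: the half-normal-on-sd prior places much more density on
# small variances than the inverse-gamma(0.001, 0.001) prior does, while the
# inverse-gamma spreads most of its mass over very large values.
sigma2 = np.array([0.001, 0.01, 0.1, 1.0])

ig_density = invgamma.pdf(sigma2, a=0.001, scale=0.001)
# half-normal prior on sigma (scale 1), transformed to a density on sigma^2
hn_density = halfnorm.pdf(np.sqrt(sigma2), scale=1.0) / (2.0 * np.sqrt(sigma2))

for s2, p_ig, p_hn in zip(sigma2, ig_density, hn_density):
    print(f"sigma^2 = {s2:6.3f}   inv-gamma: {p_ig:10.4g}   half-normal-on-sd: {p_hn:8.4g}")
```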

1,184 citations


Journal ArticleDOI
TL;DR: The authors study binary classification that allows for a reject option, in which no decision is made for observations whose conditional class probabilities are close and which are therefore hard to classify.
Abstract: This paper studies two-class (or binary) classification of elements X in R^k that allows for a reject option. Based on n independent copies of the pair of random variables (X, Y) with X ∈ R^k and Y ∈ {0, 1}, we consider classifiers f(X) that render three possible outputs: 0, 1 and R. The option R expresses doubt and is to be used for the few observations that are hard to classify in an automatic way. Chow (1970) derived the optimal rule minimizing the risk P{f(X) ≠ Y, f(X) ≠ R} + d P{f(X) = R}. This risk function assumes that the cost of making a wrong decision equals 1 and that of utilizing the reject option is d. We show that the classification problem hinges on the behaviour of the regression function η(x) = E(Y | X = x) near d and 1 − d. (Here d ∈ [0, 1/2], as the other cases turn out to be trivial.) Classification rules can be categorized into plug-in estimators and empirical risk minimizers. Both types are considered here and we prove that the rates of convergence of the risk of any estimate depend on P{|η(X) − d| ≤ ε} + P{|η(X) − (1 − d)| ≤ ε} and on the quality of the estimate of η, or on an appropriate measure of the size of the class of classifiers, in the case of plug-in rules and empirical risk minimizers, respectively. We extend the mathematical framework even further by differentiating between the costs associated with the two possible errors: predicting f(X) = 0 whilst Y = 1 and predicting f(X) = 1 whilst Y = 0. Such situations are common in, for instance, medical studies where misclassifying a sick patient as healthy is worse than the opposite.
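As a concrete illustration, the sketch below applies Chow's rule with a plug-in estimate of η(x) = P(Y = 1 | X = x); the logistic model, the simulated data and the function names are illustrative assumptions, not part of the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Chow's (1970) reject-option rule with a plug-in estimate of eta(x).
REJECT = -1  # code for the reject output "R"

def chow_plugin_classify(model, X, d):
    """Plug-in rule: predict 1 if eta_hat >= 1 - d, 0 if eta_hat <= d,
    otherwise reject (doubt region where eta_hat is close to 1/2)."""
    eta_hat = model.predict_proba(X)[:, 1]
    out = np.full(len(X), REJECT)
    out[eta_hat <= d] = 0
    out[eta_hat >= 1.0 - d] = 1
    return out

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(int)
clf = LogisticRegression().fit(X, y)
pred = chow_plugin_classify(clf, X, d=0.2)
print("rejected fraction:", np.mean(pred == REJECT))
```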

231 citations


Journal ArticleDOI
TL;DR: In this paper, a new definition of a selection distribution is proposed that encompasses many existing families of multivariate skewed distributions, such as the skew-normal and skew-elliptical distributions, and several methods of constructing selection distributions based on linear and nonlinear selection mechanisms are introduced.
Abstract: Parametric families of multivariate nonnormal distributions have received considerable attention in the past few decades. The authors propose a new definition of a selection distribution that encompasses many existing families of multivariate skewed distributions. Their work is motivated by examples that involve various forms of selection mechanisms and lead to skewed distributions. They give the main properties of selection distributions and show how various families of multivariate skewed distributions, such as the skew-normal and skew-elliptical distributions, arise as special cases. The authors further introduce several methods of constructing selection distributions based on linear and nonlinear selection mechanisms.
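The skew-normal special case admits a textbook selection-mechanism construction: condition one component of a bivariate normal on the other being positive. The sketch below illustrates only that case; the function name, correlation value and sample sizes are assumptions made for the example.

```python
import numpy as np

# Selection mechanism producing a skewed law: for (U, V) bivariate normal with
# correlation delta, the conditional law of U given V > 0 is skew-normal.
def skew_normal_by_selection(n, delta, rng=None):
    rng = np.random.default_rng(rng)
    cov = np.array([[1.0, delta], [delta, 1.0]])
    samples = []
    while len(samples) < n:
        u, v = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
        samples.extend(u[v > 0])          # keep U only when the selection event V > 0 holds
    return np.array(samples[:n])

x = skew_normal_by_selection(10_000, delta=0.9)
print("sample skewness:", ((x - x.mean()) ** 3).mean() / x.std() ** 3)
```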

193 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of estimating the density g of independent and identically distributed variables Xi from a sample Z1, ..., Zn such that Zi = Xi + σεi for i = 1, ..., n, where ε is noise independent of X and σε has a known distribution.
Abstract: The authors consider the problem of estimating the density g of independent and identically distributed variables Xi, from a sample Z1, ..., Zn, such that Zi = Xi + σεi for i = 1, ..., n, and ε is noise independent of X, with σε having a known distribution. They present a model selection procedure allowing one to construct an adaptive estimator of g and to find nonasymptotic risk bounds. The estimator achieves the minimax rate of convergence in most cases where lower bounds are available. A simulation study gives an illustration of the good practical performance of the method.
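A minimal numerical sketch of a generic deconvolution kernel estimator (Fourier inversion with a sinc kernel and known Gaussian noise) may help fix ideas; it is not the authors' adaptive model-selection estimator, and the function names, bandwidth and simulated data are assumptions made for illustration.

```python
import numpy as np

# Deconvolution kernel density estimator for g when Z = X + eps with Gaussian
# noise of known standard deviation sigma: divide the empirical characteristic
# function of Z by the noise characteristic function and invert numerically.
def deconvolution_kde(z, x_grid, h, sigma):
    t = np.linspace(-1.0 / h, 1.0 / h, 2001)            # sinc kernel: FT = 1 on [-1/h, 1/h]
    ecf = np.exp(1j * np.outer(t, z)).mean(axis=1)       # empirical characteristic function of Z
    noise_cf = np.exp(-0.5 * sigma**2 * t**2)            # characteristic function of the noise
    integrand = ecf / noise_cf                           # "divide out" the noise in Fourier space
    dens = [np.trapz(np.exp(-1j * t * x) * integrand, t).real / (2 * np.pi) for x in x_grid]
    return np.maximum(np.array(dens), 0.0)

rng = np.random.default_rng(1)
x = rng.normal(0, 1, size=2000)                          # unobserved X with density g
z = x + 0.3 * rng.normal(size=2000)                      # observed contaminated sample
grid = np.linspace(-4, 4, 81)
g_hat = deconvolution_kde(z, grid, h=0.4, sigma=0.3)
print("estimated density near 0:", round(g_hat[40], 3))
```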

117 citations


Journal ArticleDOI
TL;DR: In this article, an adjusted pseudo-empirical likelihood ratio statistic that is asymptotically distributed as a chi-square random variable is used to construct confidence intervals for a finite population mean or a finite population distribution function.
Abstract: The authors show how an adjusted pseudo-empirical likelihood ratio statistic that is asymptotically distributed as a chi-square random variable can be used to construct confidence intervals for a finite population mean or a finite population distribution function from complex survey samples. They consider both non-stratified and stratified sampling designs, with or without auxiliary information. They examine the behaviour of estimates of the mean and the distribution function at specific points using simulations calling on the Rao-Sampford method of unequal probability sampling without replacement. They conclude that the pseudo-empirical likelihood ratio confidence intervals are superior to those based on the normal approximation, whether in terms of coverage probability, tail error rates or average length of the intervals.
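For readers unfamiliar with empirical likelihood, the sketch below inverts Owen's classical EL ratio test for a mean of i.i.d. data; it omits the survey weights and pseudo-EL adjustment that are the point of the paper, and the data and names are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import chi2

# Profile empirical likelihood for a mean: -2 log R(mu) = 2 sum log(1 + lam*(x - mu)),
# where lam solves the estimating equation sum (x - mu) / (1 + lam*(x - mu)) = 0.
def el_log_ratio(x, mu):
    d = x - mu
    if d.max() <= 0 or d.min() >= 0:
        return np.inf                       # mu outside the convex hull of the data
    eps = 1e-10
    lo, hi = -1.0 / d.max() + eps, -1.0 / d.min() - eps
    lam = brentq(lambda t: np.sum(d / (1.0 + t * d)), lo, hi)
    return 2.0 * np.sum(np.log1p(lam * d))

rng = np.random.default_rng(2)
x = rng.exponential(scale=2.0, size=200)
grid = np.linspace(x.mean() - 0.8, x.mean() + 0.8, 401)
cutoff = chi2.ppf(0.95, df=1)               # chi-square calibration of the EL ratio
inside = [m for m in grid if el_log_ratio(x, m) <= cutoff]
print("95%% EL interval for the mean: (%.3f, %.3f)" % (min(inside), max(inside)))
```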

82 citations


Journal ArticleDOI
TL;DR: In this paper, the problem of testing whether two populations have the same law by comparing kernel estimators of the two density functions is studied, and the proposed test statistic is based on a local empirical likelihood approach.
Abstract: The authors study the problem of testing whether two populations have the same law by comparing kernel estimators of the two density functions. The proposed test statistic is based on a local empirical likelihood approach. They obtain the asymptotic distribution of the test statistic and propose a bootstrap approximation to calibrate the test. A simulation study is carried out in which the proposed method is compared with two competitors, and a procedure to select the bandwidth parameter is studied. The proposed test can be extended to more than two samples and to multivariate distributions.
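As a rough illustration of calibrating a density-comparison test by resampling, the sketch below uses an L2 distance between kernel estimates with a permutation calibration rather than the authors' local empirical likelihood statistic and bootstrap; the data and function names are invented for the example.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Compare two kernel density estimates and calibrate the distance by
# resampling from the pooled sample under the null of a common law.
def l2_kde_distance(x, y, grid):
    return np.trapz((gaussian_kde(x)(grid) - gaussian_kde(y)(grid)) ** 2, grid)

def two_sample_density_test(x, y, n_resample=500, seed=0):
    rng = np.random.default_rng(seed)
    grid = np.linspace(min(x.min(), y.min()), max(x.max(), y.max()), 200)
    obs = l2_kde_distance(x, y, grid)
    pooled = np.concatenate([x, y])
    stats = []
    for _ in range(n_resample):
        perm = rng.permutation(pooled)
        stats.append(l2_kde_distance(perm[:len(x)], perm[len(x):], grid))
    return np.mean(np.array(stats) >= obs)   # approximate p-value

rng = np.random.default_rng(3)
print(two_sample_density_test(rng.normal(0, 1, 150), rng.normal(0.5, 1, 150)))
```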

68 citations


Journal ArticleDOI
TL;DR: In this article, a block empirical likelihood procedure is proposed to accommodate the within-group correlation in longitudinal partially linear regression models, which leads them to prove a nonparametric version of the Wilks theorem.
Abstract: The authors propose a block empirical likelihood procedure to accommodate the within-group correlation in longitudinal partially linear regression models. This leads them to prove a nonparametric version of the Wilks theorem. In comparison with normal approximations, their method does not require a consistent estimator for the asymptotic covariance matrix, which makes it easier to conduct inference on the parametric component of the model. An application to a longitudinal study on fluctuations of progesterone level in a menstrual cycle is used to illustrate the procedure developed here.

56 citations


Journal ArticleDOI
TL;DR: In this paper, the authors study a varying-coefficient regression model in which some of the covariates are measured with additive errors, and they find that the usual local linear estimator (LLE) of the coefficient functions is biased and the usual correction for attenuation fails to work.
Abstract: The authors study a varying-coefficient regression model in which some of the covariates are measured with additive errors. They find that the usual local linear estimator (LLE) of the coefficient functions is biased and that the usual correction for attenuation fails to work. They propose a corrected LLE and show that it is consistent and asymptotically normal, and they also construct a consistent estimator for the model error variance. They then extend the generalized likelihood technique to develop a goodness of fit test for the model. They evaluate these various procedures through simulation studies and use them to analyze data from the Framingham Heart Study.

45 citations


Journal ArticleDOI
TL;DR: In this article, the authors propose a weighted maximum likelihood estimator of the Pareto tail index together with a robust prediction error criterion, called the RC-criterion, for choosing the number k of extreme order statistics used in the estimation.
Abstract: Estimation of the Pareto tail index from extreme order statistics is an important problem in many settings such as income distributions (for inequality measurement), finance (for the evaluation of the value at risk), and insurance (determination of loss probabilities) among others. The upper tail of the distribution in which the data are sparse is typically fitted with a model such as the Pareto model from which quantities such as probabilities associated with extreme events are deduced. The success of this procedure relies heavily not only on the choice of the estimator for the Pareto tail index but also on the procedure used to determine the number k of extreme order statistics that are used for the estimation. For the choice of k most of the known procedures are based on the minimization of (an estimate of) the asymptotic mean square error of the maximum likelihood (or Hill) estimator (MLE) which is the traditional choice for the estimator of the Pareto tail index. In this paper we question the choice of the estimator and the resulting procedure for the determination of k, because we believe that the model chosen to describe the behaviour of the tail distribution can only be considered as approximate. If the data in the tail are not exactly but only approximately Pareto, then the MLE can be biased, i.e. it is not robust, and consequently the choice of k is also biased. We propose instead a weighted MLE for the Pareto tail index that downweights data “far” from the model, where “far” will be measured by the size of standardized residuals constructed by viewing the Pareto model as a regression model. The data that are downweighted this way do not systematically correspond to the largest quantiles. Based on this estimator and proceeding as in Ronchetti and Staudte (1994), we develop a robust prediction error criterion, called RC-criterion, to choose k. In simulation studies, we will compare our estimator and criterion to classical ones with exact and/or approximate Pareto data. Moreover, the analysis of real data sets will show that a robust procedure for selection, and not just for estimation, is needed.
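For reference, the classical Hill (maximum likelihood) estimator, whose sensitivity to the choice of k motivates the paper, can be computed as follows; the weighted, robust version and the RC-criterion are not implemented here, and the simulated Pareto data are only illustrative.

```python
import numpy as np

# Classical Hill estimator of the Pareto tail index alpha computed for a
# range of k, the quantity whose choice the RC-criterion addresses.
def hill_estimator(x, k):
    """Hill estimate of alpha using the k largest observations."""
    xs = np.sort(x)[::-1]                       # descending order statistics
    logs = np.log(xs[:k]) - np.log(xs[k])       # log-excesses over the (k+1)-th largest
    return 1.0 / logs.mean()                    # gamma_hat = mean log-excess, alpha = 1/gamma

rng = np.random.default_rng(4)
alpha = 2.5
x = rng.pareto(alpha, size=5000) + 1.0          # exact Pareto(alpha) sample on [1, inf)
for k in (50, 200, 1000):
    print(k, round(hill_estimator(x, k), 3))
```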

45 citations




Journal ArticleDOI
TL;DR: This article proposed a new ratio imputation method using response probability, which can be justified under the response model or under the imputation model and is doubly protected against the failure of either of these models.
Abstract: The authors propose a new ratio imputation method using response probability. Their estimator can be justified either under the response model or under the imputation model; it is thus doubly protected against the failure of either of these models. The authors also propose a variance estimator that can be justified under the two models. Their methodology is applicable whether the response probabilities are estimated or known. A small simulation study illustrates their technique. Imputation is a commonly used method of compensating for item nonresponse in sample surveys. Reasons for conducting imputation are to facilitate analyses using complete data analysis methods, to ensure that the results obtained by different analyses are consistent with one another, and to reduce nonresponse bias. Kalton (1983) and Groves, Dillman, Eltinge & Little (2002) provide a comprehensive overview of imputation methods in survey sampling. Many imputation methods such as ratio imputation or regression imputation use auxiliary information that is observed throughout the sample. Such imputation methods require assumptions about the distribution of the study variable. The imputation model refers to the assumptions about the variables collected in the survey and the relationship among these variables. Another model, called the response model, is also commonly adopted in the analysis of missing data. The response model refers to the assumptions about the probability of obtaining responses from the sample for the item. One of the commonly used response models is the uniform response model, where the responses are assumed to be independent and identically distributed within the imputation cell. Rao & Shao (1992), Rao & Sitter (1995) and Shao & Steel (1999) discuss inference using the imputed estimator under the uniform response model. However, for other nonuniform response models such as the logistic response model, imputation methods incorporating the response model are relatively underdeveloped, although analyses incorporating the response model are quite popular in the nonimputation context. Examples include Rosenbaum (1987), Robins, Rotnitzky & Zhao (1994), and Lipsitz, Ibrahim & Zhao (1999). In this article, we provide an imputation methodology that combines the imputation model and the response model. The proposed method can be justified under either one of the two approaches. That is, it is justified if either a response model or an imputation model can be correctly specified. Thus, the resulting estimator is doubly protected against the failure of the assumed model (Scharfstein, Rotnitzky & Robins 1999). The basic proposal is introduced under the ratio imputation model in Section 2. The proposed method is further discussed in Section 3. In Section 4, we propose a replication variance estimator that can be justified under the two models. In Section 5, we discuss the proposed imputation method when the response probabilities are
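A minimal sketch of the double-protection idea in a simpler setting (an augmented inverse-probability-weighted mean rather than the authors' ratio imputation with replication variance estimation) is given below; the models, simulated data and names are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Double protection for a mean under item nonresponse via the standard AIPW
# construction: combine a response model (for the response probabilities) with
# an imputation model (for the study variable), so the estimator is consistent
# if either model is correctly specified.
rng = np.random.default_rng(5)
n = 2000
x = rng.normal(size=(n, 1))
y = 2.0 + 1.5 * x[:, 0] + rng.normal(size=n)          # study variable (simulated in full)
p = 1.0 / (1.0 + np.exp(-(0.5 + x[:, 0])))            # true response probabilities
r = rng.uniform(size=n) < p                            # response indicators

# Response model: P(respond | x); imputation model: regress y on x among respondents.
p_hat = LogisticRegression().fit(x, r).predict_proba(x)[:, 1]
m_hat = LinearRegression().fit(x[r], y[r]).predict(x)

# y only enters where r is True, so nonrespondent values are never used.
aipw = np.mean(m_hat + r * (y - m_hat) / p_hat)
print("AIPW estimate of E[y]:", round(aipw, 3), " true value: 2.0")
```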

Journal ArticleDOI
TL;DR: In this paper, the goodness-of-fit of a linear regression model when there are missing data in the response variable is tested using the L2 distance between nonparametric estimators of the regression function and a consistent estimator of the same function under the parametric model.
Abstract: The authors show how to test the goodness-of-fit of a linear regression model when there are missing data in the response variable. Their statistics are based on the L2 distance between nonparametric estimators of the regression function and a √n-consistent estimator of the same function under the parametric model. They obtain the limit distribution of the statistics and check the validity of their bootstrap version. Finally, a simulation study allows them to examine the behaviour of their tests, whether the samples are complete or not.

Journal ArticleDOI
TL;DR: In this article, the authors propose a maximin approach based on efficiencies which leads to designs that are simultaneously efficient with respect to various choices of link functions and parameter regions, and they deal with the problems of designing model-robust and percentile-robust experiments.
Abstract: For the problem of percentile estimation of a quantal response curve, the authors determine multiobjective designs which are robust with respect to misspecifications of the model assumptions. They propose a maximin approach based on efficiencies which leads to designs that are simultaneously efficient with respect to various choices of link functions and parameter regions. Furthermore, the authors deal with the problems of designing model-robust and percentile-robust experiments. They give various examples of such designs, which are calculated numerically.

Journal ArticleDOI
TL;DR: In this article, a regression model for orthogonal matrices was investigated for the special case of 3 × 3 rotation matrices and several specifications for the errors in this regression model were proposed.
Abstract: This paper investigates a regression model for orthogonal matrices introduced by Prentice (1989). It focuses on the special case of 3 × 3 rotation matrices. The model under study expresses the dependent rotation matrix V as A1 U A2^t perturbed by experimental errors, where A1 and A2 are unknown 3 × 3 rotation matrices and U is an explanatory 3 × 3 rotation matrix. Several specifications for the errors in this regression model are proposed. The asymptotic distributions, as the sample size n becomes large or as the experimental errors become small, of the least squares estimators for A1 and A2 are derived. A new algorithm for calculating the least squares estimates of A1 and A2 is presented. The independence model is not a submodel of Prentice's regression model, thus the independence between the U and the V samples cannot be tested when fitting Prentice's model. To overcome this difficulty, permutation tests of independence are investigated. Examples dealing with postural variations of subjects performing a drilling task and with the calibration of a camera system for motion analysis using a magnetic tracking device illustrate the methodology of this paper.
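The SVD-based least squares fit of a single rotation (the orthogonal Procrustes / Kabsch solution) is the basic building block behind such estimators; the sketch below shows only that one-sided fit, not the authors' iterative algorithm for the two-sided model V = A1 U A2^t, and the simulated rotations are illustrative.

```python
import numpy as np

# Least squares fit of a single rotation A minimizing sum_i ||V_i - A U_i||_F^2.
def fit_rotation(U_list, V_list):
    M = sum(V @ U.T for U, V in zip(U_list, V_list))          # cross-covariance of the two samples
    W, _, Zt = np.linalg.svd(M)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(W @ Zt))])   # force det(A) = +1 (a proper rotation)
    return W @ D @ Zt

def random_rotation(rng):
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    return Q * np.sign(np.linalg.det(Q))

rng = np.random.default_rng(6)
A_true = random_rotation(rng)
U_list = [random_rotation(rng) for _ in range(50)]
V_list = [A_true @ U for U in U_list]                          # noise-free for simplicity
A_hat = fit_rotation(U_list, V_list)
print("recovered the true rotation:", np.allclose(A_hat, A_true, atol=1e-8))
```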

Journal ArticleDOI
Xiaogang Wang
TL;DR: A weighted likelihood estimator that minimizes the empirical Bayes risk under relative entropy loss is proposed, and connections among the weighted likelihood, empirical Bayes and James-Stein estimators are discussed.
Abstract: The author proposes to use weighted likelihood to approximate Bayesian inference when no external or prior information is available. He proposes a weighted likelihood estimator that minimizes the empirical Bayes risk under relative entropy loss. He discusses connections among the weighted likelihood, empirical Bayes and James-Stein estimators. Both simulated and real data sets are used for illustration purposes.

Journal ArticleDOI
TL;DR: In this article, the authors demonstrate that relative surprise inferences possess a certain optimality property and develop computational techniques for implementing them, provided that algorithms are available to sample from the prior and posterior distributions.
Abstract: Relative surprise inferences are based on how beliefs change from a priori to a posteriori. As they are based on the posterior distribution of the integrated likelihood, inferences of this type are invariant under relabellings of the parameter of interest. The authors demonstrate that these inferences possess a certain optimality property. Further, they develop computational techniques for implementing them, provided that algorithms are available to sample from the prior and posterior distributions.

Journal ArticleDOI
TL;DR: In this paper, the authors define a class of partially linear single-index survival models that are more flexible than the classical proportional hazards regression models in their treatment of covariates, and develop a likelihood-based inference to estimate the model components via an iterative algorithm.
Abstract: The authors define a class of "partially linear single-index" survival models that are more flexible than the classical proportional hazards regression models in their treatment of covariates. The latter enter the proposed model either via a parametric linear form or a nonparametric single-index form. It is then possible to model both linear and functional effects of covariates on the logarithm of the hazard function and, if necessary, to reduce the dimensionality of multiple covariates via the single-index component. The partially linear hazards model and the single-index hazards model are special cases of the proposed model. The authors develop a likelihood-based inference to estimate the model components via an iterative algorithm. They establish an asymptotic distribution theory for the proposed estimators, examine their finite-sample behaviour through simulation, and use a set of real data to illustrate their approach.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a new method for testing the hypothesis of uniformity of the underlying distribution, which is based on the dis tance of every observation to the boundary of a given domain.
Abstract: Given a random sample taken on a compact domain S ⊂ Rd, the authors propose a new method for testing the hypothesis of uniformity of the underlying distribution. The test statistic is based on the distance of every observation to the boundary of S. The proposed test has a number of interesting properties. In particular, it is feasible and particularly suitable for high dimensional data; it is distribution free for a wide range of choices of S; it can be adapted to the case where S is unknown; and it also allows for one-sided versions. Moreover, the results suggest that, in some cases, this procedure does not suffer from the well-known curse of dimensionality. The authors study the properties of this test from both a theoretical and practical point of view. In particular, an extensive Monte Carlo simulation study allows them to compare their methods with some alternative procedures. They conclude that the proposed test provides quite a satisfactory balance between power, computational simplicity, and adaptability to different dimensions and supports.
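A minimal sketch of the distance-to-boundary idea on the unit hypercube follows: under uniformity each distance has an explicit distribution, so the transformed distances can be checked with a standard one-sample test. This only illustrates the idea; it is not the authors' statistic or calibration, and all names and data are assumed.

```python
import numpy as np
from scipy.stats import kstest

def boundary_distances(x):
    # distance from each point to the boundary of [0, 1]^d: min over coordinates of min(x, 1 - x)
    return np.minimum(x, 1.0 - x).min(axis=1)

def uniformity_pvalue(x):
    d = x.shape[1]
    dist = boundary_distances(x)
    # Under uniformity, P(distance <= t) = 1 - (1 - 2t)^d for t in [0, 1/2],
    # so this transform makes the distances uniform on [0, 1].
    u = 1.0 - (1.0 - 2.0 * dist) ** d
    return kstest(u, "uniform").pvalue

rng = np.random.default_rng(7)
print("uniform sample p-value:    ", round(uniformity_pvalue(rng.uniform(size=(500, 5))), 3))
print("beta-skewed sample p-value:", round(uniformity_pvalue(rng.beta(2, 2, size=(500, 5))), 4))
```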

Journal ArticleDOI
TL;DR: In this article, the authors study the application of the bootstrap to a class of estimators which converge at a non-standard rate to a nonstandard distribution and provide a theoretical framework to study its asymptotic behavior.
Abstract: The authors study the application of the bootstrap to a class of estimators which converge at a nonstandard rate to a nonstandard distribution. They provide a theoretical framework to study its asymptotic behaviour. A simulation study shows that in the case of an estimator such as Chernoff's estimator of the mode, usually the basic bootstrap confidence intervals drastically undercover while the percentile bootstrap intervals overcover. This is a rare instance where basic and percentile confidence intervals, which have exactly the same length, behave in a very different way. In the case of Chernoff's estimator, if the distribution is symmetric, it is possible to bootstrap from a smooth symmetric estimator of the distribution for which the basic bootstrap confidence intervals will have the claimed coverage probability while the percentile bootstrap interval will have an asymptotic coverage of 1!

Journal ArticleDOI
TL;DR: This paper proposed methods based on the stratified Cox proportional hazards model that account for the fact that the data have been collected according to a complex survey design, and illustrate their methodology by an analysis of jobless spells in Statistics Canada's Survey of Labour and Income Dynamics.
Abstract: The authors propose methods based on the stratified Cox proportional hazards model that account for the fact that the data have been collected according to a complex survey design. The methods they propose are based on the theory of estimating equations in conjunction with empirical process theory. The authors also discuss issues concerning ignorable sampling design, and the use of weighted and unweighted procedures. They illustrate their methodology by an analysis of jobless spells in Statistics Canada's Survey of Labour and Income Dynamics. They discuss briefly problems concerning weighting, model checking, and missing or mismeasured data. They also identify areas for further research.

Journal ArticleDOI
TL;DR: In this paper, the authors consider the issue of map positional error, or the difference between location as represented in a spatial database (i.e., a map) and the corresponding unobservable true location.
Abstract: The authors consider the issue of map positional error, or the difference between location as represented in a spatial database (i.e., a map) and the corresponding unobservable true location. They propose a fully model-based approach that incorporates aspects of the map registration process commonly performed by users of geographic information systems, including rubber-sheeting. They explain how estimates of positional error can be obtained, hence estimates of true location. They show that with multiple maps of varying accuracy along with ground truthing data, suitable model averaging offers a strategy for using all of the maps to learn about true location.

Journal ArticleDOI
TL;DR: In this paper, a new monotone nonparametric estimate for a regression function of two or more variables is proposed, which consists in applying successively one-dimensional isotonization procedures on an initial, unconstrained non-parametric regression estimate.
Abstract: The authors propose a new monotone nonparametric estimate for a regression function of two or more variables. Their method consists in applying successively one-dimensional isotonization procedures on an initial, unconstrained nonparametric regression estimate. In the case of a strictly monotone regression function, they show that the new estimate and the initial one are first-order asymptotic equivalent; they also establish asymptotic normality of an appropriate standardization of the new estimate. In addition, they show that if the regression function is not monotone in one of its arguments, the new estimate and the initial one have approximately the same Lp-norm. They illustrate their approach by means of a simulation study, and two data examples are analyzed.
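The one-dimensional building block, smoothing first and then isotonizing, can be sketched as follows; the Nadaraya-Watson smoother, the bandwidth and the simulated data are stand-ins chosen for illustration, not the authors' initial estimator or their successive multivariate scheme.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Step 1: unconstrained kernel estimate; step 2: isotonize it.
def nadaraya_watson(x, y, grid, h):
    w = np.exp(-0.5 * ((grid[:, None] - x[None, :]) / h) ** 2)   # Gaussian kernel weights
    return (w * y).sum(axis=1) / w.sum(axis=1)

rng = np.random.default_rng(8)
x = rng.uniform(0, 1, 300)
y = np.sin(1.5 * x) + 0.3 * rng.normal(size=300)       # increasing regression function plus noise

grid = np.linspace(0, 1, 101)
smooth = nadaraya_watson(x, y, grid, h=0.03)            # unconstrained estimate (may wiggle)
monotone = IsotonicRegression(increasing=True).fit_transform(grid, smooth)

print("initial estimate violates monotonicity:", bool(np.any(np.diff(smooth) < 0)))
print("isotonized estimate is monotone:       ", bool(np.all(np.diff(monotone) >= 0)))
```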

Journal ArticleDOI
TL;DR: In this paper, the authors developed empirical likelihood (EL) based methods of inference for a common mean using data from several independent but nonhomogeneous populations, and evaluated the performance of the MEL estimator and the weighted EL confidence interval.
Abstract: The authors develop empirical likelihood (EL) based methods of inference for a common mean using data from several independent but nonhomogeneous populations. For point estimation, they propose a maximum empirical likelihood (MEL) estimator and show that it is √n-consistent and asymptotically optimal. For confidence intervals, they consider two EL based methods and show that both intervals have approximately correct coverage probabilities under large samples. Finite-sample performances of the MEL estimator and the EL based confidence intervals are evaluated through a simulation study. The results indicate that overall the MEL estimator and the weighted EL confidence interval are superior alternatives to the existing methods.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a weighting method in which kernel regression is used to estimate the response probabilities and derived its asymptotic distribution through a replication-based technique.
Abstract: Non-response is a common problem in survey sampling and this phenomenon can only be ignored at the risk of invalidating inferences from a survey. In order to adjust for unit non-response, the authors propose a weighting method in which kernel regression is used to estimate the response probabilities. They show that the adjusted estimator is consistent and they derive its asymptotic distribution. They also suggest a means of estimating its variance through a replication-based technique. Furthermore, a Monte Carlo study allows them to illustrate the properties of the non-response adjustment and its variance estimator.
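A minimal sketch of the weighting idea, with a Nadaraya-Watson estimate of the response probabilities and a simple inverse-probability-weighted mean, is shown below on simulated data; it ignores the survey design and the replication variance estimator treated in the paper, and all names and values are illustrative.

```python
import numpy as np

# Estimate P(respond | auxiliary variable) by kernel regression of the response
# indicator, then inverse-weight the respondents' values.
def kernel_response_prob(x_aux, responded, h):
    w = np.exp(-0.5 * ((x_aux[:, None] - x_aux[None, :]) / h) ** 2)
    return (w * responded[None, :]).sum(axis=1) / w.sum(axis=1)

rng = np.random.default_rng(9)
n = 1500
x_aux = rng.normal(size=n)                               # auxiliary variable, always observed
y = 1.0 + x_aux + rng.normal(size=n)                     # study variable
p_true = 1.0 / (1.0 + np.exp(-(0.3 + x_aux)))            # response probability depends on x_aux
responded = rng.uniform(size=n) < p_true

p_hat = kernel_response_prob(x_aux, responded.astype(float), h=0.3)
adjusted = np.sum(y[responded] / p_hat[responded]) / np.sum(1.0 / p_hat[responded])
print("naive respondent mean:     ", round(y[responded].mean(), 3))
print("nonresponse-adjusted mean: ", round(adjusted, 3), " true mean: 1.0")
```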

Journal ArticleDOI
TL;DR: The authors use an SAEM‐type algorithm (a stochastic approximation of the EM algorithm) to select the densities of the mixture model and test the validity of their methodologies by forecasting short term travel times.
Abstract: The purpose of this work is, on the one hand, to study how to forecast road traffic on highway networks and, on the other hand, to describe future traffic events. Here, road traffic is measured by vehicle velocities. The authors propose two methodologies. The first is based on an empirical classification method, and the second on a probability mixture model. They use an SAEM-type algorithm (a stochastic approximation of the EM algorithm) to select the densities of the mixture model. Then, they test the validity of their methodologies by forecasting short term travel times.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed two multiple imputation methods for analyzing multiple-category recurrent event data under the proportional means/rates model, which leads to consistent estimation of regression parameters even when the missingness of event categories depends on covariates.
Abstract: Frequently in clinical and epidemiologic studies, the event of interest is recurrent (i.e., can occur more than once per subject). When the events are not of the same type, an analysis which accounts for the fact that events fall into different categories will often be more informative. Often, however, although event times may always be known, information through which events are categorized may potentially be missing. Complete-case methods (whose application may require, for example, that events be censored when their category cannot be determined) are valid only when event categories are missing completely at random. This assumption is rather restrictive. The authors propose two multiple imputation methods for analyzing multiple-category recurrent event data under the proportional means/rates model. The use of a proper or improper imputation technique distinguishes the two approaches. Both methods lead to consistent estimation of regression parameters even when the missingness of event categories depends on covariates. The authors derive the asymptotic properties of the estimators and examine their behaviour in finite samples through simulation. They illustrate their approach using data from an international study on dialysis.

Journal ArticleDOI
TL;DR: In this article, the cutoff points are specified in terms of a function of the target parameter rather than as constants; this so-called tail function, when suitably chosen, yields shorter confidence intervals and can also be used to improve the coverage properties of approximate confidence intervals.
Abstract: The authors describe a new method for constructing confidence intervals. Their idea consists in specifying the cutoff points in terms of a function of the target parameter rather than as constants. When it is suitably chosen, this so-called tail function yields shorter confidence intervals in the presence of prior information. It can also be used to improve the coverage properties of approximate confidence intervals. The authors illustrate their technique by application to interval estimation of the mean of Bernoulli and normal populations. They further suggest guidelines for choosing the optimal tail function and discuss the relationship with Bayesian inference.

Journal ArticleDOI
TL;DR: In this paper, a robust quasi-likelihood method is proposed for downweighting any influential data points when estimating the model parameters, and simulations are used to compare the resulting robust estimates with ordinary quasi-likelihood estimates when the data are contaminated with outliers.
Abstract: The author develops a robust quasi-likelihood method, which appears to be useful for down-weighting any influential data points when estimating the model parameters. He illustrates the computational issues of the method in an example. He uses simulations to study the behaviour of the robust estimates when data are contaminated with outliers, and he compares these estimates to those obtained by the ordinary quasi-likelihood method.

Journal ArticleDOI
TL;DR: In this article, the authors consider the subclass of purely bilinear and strictly superdiagonal time series models with periodic coefficients and numerically illustrate their theoretical results via Monte Carlo simulations.
Abstract: The authors consider the subclass of purely bilinear and strictly superdiagonal time series models with periodic coefficients. Indeed, thanks to their possible application to a wide variety of fields including economics and finance, bilinear time series models with time-dependent coefficients have recently been the object of attention in the statistical literature. The authors give conditions ensuring the existence of a causal solution in L2, the invertibility and the existence of higher-order moments. The problem of estimating the parameters is also investigated through an approach based on second and third empirical moments. The authors numerically illustrate their theoretical results via Monte Carlo simulations.

Journal ArticleDOI
TL;DR: In this paper, asymptotic results for the class of pseudo-likelihood estimators in the autoregressive conditional heteroscedastic models introduced by Engle (1982) are presented.
Abstract: The author presents asymptotic results for the class of pseudo-likelihood estimators in the autoregressive conditional heteroscedastic models introduced by Engle (1982). Unlike what is required for the quasi-likelihood estimator, some estimators in the class he considers do not require the finiteness of the fourth moment of the error density. Thus his method is applicable to heavy-tailed error distributions for which moments higher than two may not exist.
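For context, the sketch below simulates Engle's ARCH(1) model and fits it by the Gaussian quasi-likelihood, the traditional member of the pseudo-likelihood class; the alternative members of the class that avoid the fourth-moment condition are not implemented here, and the parameter values are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

# ARCH(1): sigma_t^2 = omega + alpha * y_{t-1}^2, fitted by the Gaussian
# (quasi-)likelihood; constants in the log-likelihood are dropped.
def simulate_arch1(n, omega, alpha, rng):
    y = np.zeros(n)
    for t in range(1, n):
        sigma2 = omega + alpha * y[t - 1] ** 2
        y[t] = np.sqrt(sigma2) * rng.standard_normal()
    return y

def neg_gaussian_loglik(params, y):
    omega, alpha = params
    sigma2 = omega + alpha * y[:-1] ** 2                 # conditional variances
    return 0.5 * np.sum(np.log(sigma2) + y[1:] ** 2 / sigma2)

rng = np.random.default_rng(10)
y = simulate_arch1(5000, omega=0.5, alpha=0.4, rng=rng)
fit = minimize(neg_gaussian_loglik, x0=[1.0, 0.1], args=(y,),
               bounds=[(1e-6, None), (0.0, 0.999)])
print("estimated (omega, alpha):", np.round(fit.x, 3))
```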