
Showing papers by "Yves Tillé" published in 2017


Journal Article
TL;DR: Three theoretical principles are formalized: randomization, overrepresentation and restriction; these principles are developed and used in choosing the sampling design in a systematic way and can be applied in order to improve inference.
Abstract: The aim of this paper is twofold. First, three theoretical principles are formalized: randomization, overrepresentation and restriction. We develop these principles and give a rationale for their use in choosing the sampling design in a systematic way. In the model-assisted framework, knowledge of the population is formalized by modelling the population and the sampling design is chosen accordingly. We show how the principles of overrepresentation and of restriction naturally arise from the modelling of the population. The balanced sampling then appears as a consequence of the modelling. Second, a review of probability balanced sampling is presented through the model-assisted framework. For some basic models, balanced sampling can be shown to be an optimal sampling design. Emphasis is placed on new spatial sampling methods and their related models. An illustrative example shows the advantages of the different methods. Throughout the paper, various examples illustrate how the three principles can be applied in order to improve inference.

43 citations


Posted Content
TL;DR: In this article, the authors introduce a new absolute measure of the spatial spreading of a sample using a normalized version of the Moran's $I$ index, which measures the degree of spatial spreading in absolute terms.
Abstract: Measuring the degree of spatial spreading of a sample can be of great interest when sampling from a spatial population. The commonly used spatial balance index by Grafström et al. (2012) is particularly effective in comparing the level of spatial spreading of different samples from the same population. However, its unbounded and uninterpretable scale of measurement does not allow one to assess the level of spatial spreading in absolute terms and confines its use to raw comparisons only. In this paper, we introduce a new absolute measure of the spatial spreading of a sample using a normalized version of Moran's $I$ index. The properties and behaviour of the proposed measure are analysed through two simulation experiments, one based on artificial populations and the other on a population of real business units located in the province of Siena (Italy).

13 citations
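
As a rough illustration of the quantity this abstract builds on, the sketch below computes the classical (unnormalized) Moran's $I$ on sample inclusion indicators. It is not the normalized measure proposed in the paper. The one-dimensional layout, the neighbour weights and the helper names `morans_i` and `line_adjacency` are assumptions made for the example.

```python
def morans_i(values, weights):
    """Classical Moran's I for values x_i and a spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    denom = sum(d * d for d in dev)
    return (n / total_weight) * (cross / denom)


def line_adjacency(n):
    """Hypothetical layout: n units on a line, each linked to its immediate neighbours."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


weights = line_adjacency(6)
spread_sample = [1, 0, 1, 0, 1, 0]     # inclusion indicators, well spread
clustered_sample = [1, 1, 1, 0, 0, 0]  # inclusion indicators, clustered

i_spread = morans_i(spread_sample, weights)
i_clustered = morans_i(clustered_sample, weights)
print(i_spread, i_clustered)
```

In this toy layout the well-spread sample yields a negative index and the clustered sample a positive one, which is the behaviour a spreading measure based on Moran's $I$ relies on.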


Journal Article
TL;DR: In this article, the authors consider different types of survey structures and develop design-based estimators that are calibrated on known as well as estimated totals of auxiliary variables, which can be viewed as extensions of the Montanari generalised regression estimator adapted to more complex situations.
Abstract: The use of auxiliary variables to improve the efficiency of estimators is a well-known strategy in survey sampling. Typically, the auxiliary variables used are the totals of appropriate measurements that are exactly known from registers or administrative sources. Increasingly, however, these totals are estimated from surveys and are then used to calibrate estimators and improve their efficiency. We consider different types of survey structures and develop design-based estimators that are calibrated on known as well as estimated totals of auxiliary variables. The optimality properties of these estimators are studied. These estimators can be viewed as extensions of the Montanari generalised regression estimator adapted to more complex situations. The paper studies interesting special cases to develop insights and guidelines for properly managing the survey-estimated auxiliary totals.

9 citations


Journal Article
TL;DR: Optimal sampling designs for audit, minimizing the mean squared error of the estimated amount of the misstatement, are proposed based on balanced sampling with unequal probabilities and are more efficient than monetary unit sampling.
Abstract: Optimal sampling designs for audit, minimizing the mean squared error of the estimated amount of the misstatement, are proposed. They are derived from a general statistical model that describes the error process with the help of available auxiliary information. We show that, if the model is adequate, these optimal designs based on balanced sampling with unequal probabilities are more efficient than monetary unit sampling. We discuss how to implement the optimal designs in practice. Monte Carlo simulations based on audit data from the Swiss hospital billing system confirm the benefits of the proposed method.

6 citations


Posted Content
TL;DR: In this paper, the rank-difference distribution is used to control the joint inclusion probabilities of units and especially the spreading of sampled units in the population, which is useful when neighboring units have similar characteristics or, on the contrary, are very different.
Abstract: We present new sampling methods for finite populations that allow one to control the joint inclusion probabilities of units and especially the spreading of sampled units in the population. They are based on the use of renewal chains and multivariate discrete distributions to generate the difference of population ranks between two successive selected units. With a Bernoulli sampling design, these differences follow a geometric distribution, and with a simple random sampling design they follow a negative hypergeometric distribution. We propose to use other distributions and introduce a large class of sampling designs with and without fixed sample size. The choice of the rank-difference distribution allows us to control the units' joint inclusion probabilities with a relatively simple method and a closed-form formula. Joint inclusion probabilities of neighboring units can be chosen to be larger, or smaller, compared to those of Bernoulli or simple random sampling, thus allowing the sample to be spread more or less over the population. This can be useful when neighboring units have similar characteristics or, on the contrary, are very different. A set of simulations illustrates the qualities of this method.

5 citations
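
The statement in this abstract that rank differences under Bernoulli sampling follow a geometric distribution can be checked empirically. The sketch below is only a simulation of that baseline case, not the renewal-chain designs of the paper. The population size, inclusion probability, seed and function names are arbitrary choices for the example.

```python
import random


def bernoulli_sample(population_size, p, rng):
    """Select each rank 1..N independently with probability p."""
    return [k for k in range(1, population_size + 1) if rng.random() < p]


def rank_differences(selected_ranks):
    """Differences of population ranks between successive selected units."""
    return [b - a for a, b in zip(selected_ranks, selected_ranks[1:])]


rng = random.Random(2017)  # arbitrary seed, chosen for reproducibility
p = 0.1                    # assumed inclusion probability
gaps = rank_differences(bernoulli_sample(200_000, p, rng))

# For a geometric distribution on {1, 2, ...} with parameter p,
# the mean is 1 / p and the probability of a gap of exactly 1 is p.
mean_gap = sum(gaps) / len(gaps)
share_of_ones = sum(1 for g in gaps if g == 1) / len(gaps)
print(mean_gap, share_of_ones)
```

Replacing the implicit geometric gaps with another discrete distribution is, per the abstract, what lets the designs spread or cluster the sample. That step is not attempted here.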


Journal Article
TL;DR: The assumption that wages are more equitable in the public sector is analysed and it is shown that in the public sector, discrimination occurs quite uniformly both in lower and in higher-paying jobs.
Abstract: Wage differences between women and men can be divided into an explained part and an unexplained part. The former encompasses differences in the observable characteristics of the members of groups, such as age, education or work experience. The latter includes the part of the difference that is not attributable to objective factors and represents an estimation of the discrimination level. We discuss the original method of Blinder (J Hum Resour 8(4):436–455, 1973) and Oaxaca (Int Econ Rev 14(3):693–709, 1973), the reweighting technique of DiNardo et al. (Econometrica 64(5):1001–1044, 1996) and our approach based on calibration. Using a Swiss dataset from 2012, we compare the estimated explained and unexplained parts of the difference in average wages in the private and public sectors obtained with the three methods. We show that for the private sector, all three methods yield similar results. For the public sector, the reweighting technique estimates a lower value of the unexplained part than the other two methods. The calibration approach and the reweighting technique allow us to estimate the explained and unexplained parts of the wage differences at points other than the mean. By using this, in this paper, the assumption that wages are more equitable in the public sector is analysed. Wage differences at different quantiles in both sectors are examined. We show that in the public sector, discrimination occurs quite uniformly both in lower and in higher-paying jobs. On the other hand, in the private sector, discrimination is greater in lower-paying jobs than in higher-paying jobs.

4 citations
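
The split into an explained and an unexplained part described in this abstract can be sketched with a twofold Blinder-Oaxaca decomposition at the mean. The example below is a minimal illustration with one covariate, not the calibration approach of the paper. The data are invented, and using the men's coefficients as the reference is an assumption (other reference choices give different splits).

```python
def ols_line(x, y):
    """Intercept and slope of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def twofold_decomposition(x_m, y_m, x_w, y_w):
    """Split the mean gap (men minus women), men's coefficients as reference."""
    a_m, b_m = ols_line(x_m, y_m)
    a_w, b_w = ols_line(x_w, y_w)
    mean_x_m = sum(x_m) / len(x_m)
    mean_x_w = sum(x_w) / len(x_w)
    # Explained: difference in average characteristics, valued at men's slope.
    explained = (mean_x_m - mean_x_w) * b_m
    # Unexplained: difference in coefficients, evaluated at women's averages.
    unexplained = (a_m - a_w) + mean_x_w * (b_m - b_w)
    return explained, unexplained


# Invented data: years of experience and hourly wage for each group.
experience_m = [1, 2, 3, 4, 5]
wage_m = [10.0, 12.0, 15.0, 17.0, 20.0]
experience_w = [1, 2, 3, 4]
wage_w = [9.0, 10.0, 12.0, 13.0]

explained, unexplained = twofold_decomposition(
    experience_m, wage_m, experience_w, wage_w)
gap = sum(wage_m) / len(wage_m) - sum(wage_w) / len(wage_w)
print(gap, explained, unexplained)
```

Because a least-squares line with an intercept passes through the point of group means, the two parts add up exactly to the gap in average wages.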


Journal Article
TL;DR: In this article, the variance and the estimated variance of the expanded estimator in the intersection of two independent samples can be decomposed in two ways, and it is generally more practical to compute the variance with one decomposition.