
Showing papers in "Journal of Official Statistics in 2011"


Journal Article
TL;DR: This work conceptualizes the existence of a single combined database containing all of the information for the individuals in the separate databases and for the union of the variables, and proposes an approach that performs the full statistical calculation on this combined database without actually combining the information sources.
Abstract: We consider the problem of linear regression where the data are split up and held by different parties. We conceptualize the existence of a single combined database containing all of the information for the individuals in the separate databases and for the union of the variables. We propose an approach that gives full statistical calculation on this combined database without actually combining information sources. We focus on computing linear regression and ridge regression estimates, as well as certain goodness of fit statistics. We make use of homomorphic encryption in constructing a protocol for regression analysis which adheres to the definitions of security laid out in the cryptography literature. Our approach provides only the final result of the calculations, in contrast with other methods that share intermediate values and thus present an opportunity for compromise of privacy. We perform an experiment on a dataset extracted from the Current Population Survey, with 51,016 cases and 22 covariates, to show that our approach is practical for moderate-sized problems.
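A minimal sketch, assuming horizontally partitioned data and no encryption, of the algebraic fact such protocols build on: the regression and ridge estimates depend on the data only through the cross-products X'X and X'y, which can be assembled from per-party contributions. This is illustration only, not the paper's homomorphic-encryption protocol; in the paper's setting, where parties hold different variables, even these cross-products must be computed securely.

import numpy as np

# Each party computes its local contribution to the sufficient statistics.
def party_statistics(X_part, y_part):
    return X_part.T @ X_part, X_part.T @ y_part

# Combine the contributions and solve (X'X + lam * I) beta = X'y.
def ridge_from_statistics(stats, lam=0.0):
    XtX = sum(s[0] for s in stats)
    Xty = sum(s[1] for s in stats)
    return np.linalg.solve(XtX + lam * np.eye(XtX.shape[0]), Xty)

# Hypothetical example with two parties holding disjoint sets of cases.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(100), rng.normal(size=(100, 3))])
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(size=100)
stats = [party_statistics(X[:60], y[:60]), party_statistics(X[60:], y[60:])]
print(ridge_from_statistics(stats, lam=0.1))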

137 citations


Journal Article
TL;DR: In this article, the authors consider assessment of nonresponse bias for the mean of a survey variable Y subject to nonresponse, assuming that there are a set of covariates observed for nonrespondents and respondents.
Abstract: We consider assessment of nonresponse bias for the mean of a survey variable Y subject to nonresponse. We assume that there are a set of covariates observed for nonrespondents and respondents. To reduce dimensionality and for simplicity we reduce the covariates to a proxy variable X that has the highest correlation with Y, estimated from a regression analysis of respondent data. We consider adjusted estimators of the mean of Y that are maximum likelihood for a pattern-mixture model with different mean and covariance matrix of Y and X for respondents and nonrespondents, assuming missingness is an arbitrary function of a known linear combination of X and Y. We propose a taxonomy for the evidence concerning bias based on the strength of the proxy and the deviation of the mean of X for respondents from its overall mean, propose a sensitivity analysis, and describe Bayesian versions of this approach. Methods are demonstrated through data from the third National Health and Nutrition Examination Survey (NHANES III).
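A rough sketch, in our own notation, of the quantities the taxonomy rests on (the paper's exact estimators may differ): the proxy is the fitted value from a respondent regression of Y on the covariates Z, and the evidence about bias is judged from the strength of the proxy and the deviation of its respondent mean from the overall mean,

$$ x_i = \hat\beta_0 + \hat\beta' z_i, \qquad \hat\rho = \widehat{\operatorname{corr}}_R(x, y), \qquad \hat d = \frac{\bar{x}_R - \bar{x}}{s_x}. $$

A strong proxy (large \hat\rho) with \hat d near zero suggests limited nonresponse bias, while a weak proxy leaves the bias poorly identified without further assumptions, which motivates the sensitivity analysis.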

76 citations


Journal Article
TL;DR: The concept of “balanced response set” introduced in this article extends the well-known idea of a “balanced sample”; a measure of lack of balance, a quadratic form in a multivariate auxiliary vector, is proposed and its statistical properties are explored.
Abstract: In dealing with survey nonresponse, statisticians need to consider (a) measures to be taken at the data collection stage, and (b) measures to be taken at the estimation stage. One may employ some form of responsive design. In the later stages of the data collection in particular, one tries to achieve an ultimate set of responding units that is “better balanced” or “more representative” than if no special effort is made. The concept of “balanced response set” introduced in this article extends the well-known idea of “balanced sample.” A measure of “lack of balance” is proposed; it is a quadratic form relating to a multivariate auxiliary vector; its statistical properties are explored. But whether or not good balance has been achieved in the data collection, a compelling question remains at the estimation stage: How to achieve the most effective reduction of nonresponse bias in the survey estimates. Balancing alone may not help. The nonresponse adjustment effort is aided by a bias indicator, a product of three factors involving selected powerful auxiliary variables.
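One natural quadratic form of the kind described, given here only as a hedged illustration in our notation (the article's exact definition may differ): with \bar{x}_r the respondent mean and \bar{x}_s the full-sample mean of the auxiliary vector, and \Sigma its sample covariance matrix,

$$ \mathrm{IMB} = (\bar{\mathbf{x}}_r - \bar{\mathbf{x}}_s)' \, \Sigma^{-1} \, (\bar{\mathbf{x}}_r - \bar{\mathbf{x}}_s), $$

so a perfectly balanced response set (\bar{x}_r = \bar{x}_s) gives zero and larger values indicate greater lack of balance.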

69 citations


Journal Article
TL;DR: In this article, the authors address the problem of performing a statistical survey of a group of individuals present in a certain population when information about the complete list of the members is missing or partially unknown.
Abstract: We address in this article the problem of performing a statistical survey of a group of individuals present in a certain population when information about the complete list of the members is missing or partially unknown. This problem is particularly relevant in immigration analysis, where many of the individuals are possibly illegal migrants and therefore not formally registered or accounted for in official statistics. We propose a sample method that integrates information provided by specific surveys and subjective knowledge available to the researcher about the geo-social reality of interest.

61 citations


Posted Content
TL;DR: In this article, the authors present and investigate indicators that support data collection monitoring and effective decisions in adaptive and responsive survey designs, termed partial R-indicators, and make a distinction between unconditional and conditional partial indicators.
Abstract: The increasing efforts and costs required to achieve survey response have led to a stronger focus on survey data collection monitoring by means of paradata and to the rise of adaptive and responsive survey designs. Indicators that support data collection monitoring, targeting and prioritizing in such designs are not yet available. Subgroup response rates come closest but do not account for subgroup size, are univariate and are not available at the variable level. We present and investigate indicators that support data collection monitoring and effective decisions in adaptive and responsive survey designs. As they are natural extensions of R-indicators, they are termed partial R-indicators. We make a distinction between unconditional and conditional partial R-indicators. Conditional partial R-indicators provide a multivariate assessment of the impact of register data and paradata variables on representativeness of response. We propose methods for estimating partial indicators and investigate their sampling properties in a simulation study. The use of partial indicators for monitoring and targeting nonresponse is illustrated for both a household survey and a business survey. Guidelines for the use of the indicators are given.
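For orientation, a sketch of the quantities involved, in standard R-indicator notation rather than necessarily the article's exact formulas: with estimated response propensities \rho_k, the R-indicator is R(\rho) = 1 - 2 S(\rho), where S(\rho) is the standard deviation of the propensities, and an unconditional partial R-indicator for a categorical variable Z contrasts subgroup propensity means with the overall mean,

$$ P_u(Z) = \sqrt{ \sum_{k} \frac{N_k}{N} \left( \bar{\rho}_k - \bar{\rho} \right)^2 }, $$

so larger values flag variables whose categories respond at markedly different rates.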

53 citations


Journal Article
TL;DR: The authors develop and evaluate a cross-national error source typology that identifies potential error sources that are either not present or less common in single-nation studies; tools that help identify these error sources better inform the survey researcher when improving a source questionnaire.
Abstract: This article evaluates a Cross National Error Source Typology that was developed as a tool for making cross-national questionnaire design more effective. Cross-national questionnaire design has a number of potential error sources that are either not present or are less common in single nation studies. Tools that help to identify these error sources better inform the survey researcher when improving a source questionnaire that serves as the basis for translation. This article outlines the theoretical and practical development of the typology and evaluates an attempt to apply it to cross-national cognitive interviewing findings from the European Social Survey.

47 citations


Journal Article
TL;DR: In this paper, the authors use survey respondents' own reasons for participating or not participating in surveys, as well as experiments carried out over many years, to propose a benefit-cost theory of survey participation.
Abstract: This article uses survey respondents’ own reasons for participating or not participating in surveys, as well as experiments carried out over many years, to propose a benefit-cost theory of survey participation. The argument is that people choose to act, in surveys as in life, when, in their subjective calculus, the benefits of doing so outweigh the costs. The process of reaching a decision may be carefully reasoned or it may proceed almost instantaneously, with the aid of heuristics. But regardless of the process, the outcome depends on a judgment that the benefits of acting outweigh the costs of doing so – even if, objectively speaking, the actors are badly informed and their decision leads to an undesirable outcome. The article reviews research on confidentiality assurances and risk perceptions with reference to a benefit-cost theory of behavior, and concludes by suggesting research to test the theory’s predictions and by drawing testable implications for survey practice.

46 citations


Journal Article
TL;DR: New methods for handling not-missing-at-random (NMAR) nonresponse are developed and applied; an algorithm is given for estimating the parameters governing the two models and for estimating the distributions of the missing covariates and outcomes.
Abstract: In this paper we develop and apply new methods for handling not missing at random (NMAR) nonresponse. We assume a model for the outcome variable under complete response and a model for the response probability, which is allowed to depend on the outcome and auxiliary variables. The two models define the model holding for the outcomes observed for the responding units, which can be tested. Our methods utilize information on the population totals of some or all of the auxiliary variables in the two models, but we do not require that the auxiliary variables are observed for the nonresponding units. We develop an algorithm for estimating the parameters governing the two models and show how to estimate the distributions of the missing covariates and outcomes, which are then used for imputing the missing values for the nonresponding units and for estimating population means and the variances of the estimators. We also consider several test statistics for testing the model fitted to the observed data and study their performance, thus validating the proposed procedure. The new developments are illustrated using simulated data and a real data set collected as part of the Household Expenditure Survey carried out by the Israel Central Bureau of Statistics in 2005.
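The model for the responding units that the abstract says can be tested follows from Bayes' theorem; in a generic notation (ours, not necessarily the paper's),

$$ f(y \mid x, R = 1) \;=\; \frac{\Pr(R = 1 \mid y, x)\, f(y \mid x)}{\int \Pr(R = 1 \mid t, x)\, f(t \mid x)\, dt}, $$

so specifying an outcome model f(y | x) and a response-probability model Pr(R = 1 | y, x) determines the distribution actually observed for respondents, which is what the goodness-of-fit tests examine.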

36 citations


Journal Article
TL;DR: In this paper, the authors compared Likert-scale responses obtained in a mixed mode survey using a telephone and a mail national crime victimization survey in Belgium and found that responses were significantly more positive in the telephone survey, but no evidence was found for differences in acquiescence across the survey modes.
Abstract: The study compared Likert-scale responses obtained in a mixed mode survey using a telephone and a mail national crime victimization survey in Belgium. Theoretically, more socially desirable responses and more acquiescence were expected in the telephone survey. Results showed that, consistent with the social desirability hypothesis, responses were significantly more positive in the telephone survey, but no evidence was found for differences in acquiescence across the survey modes. These results were obtained with structural equation models (SEM). In addition to accounting for differences in sample composition in nonexperimental data by including covariates, an SEM also allows dealing with a wide variety of mode effects not usually considered in empirical mixed mode research, such as interaction effects, differential item functioning, and the structure of measurement errors. Analyses detected an interpretable interaction effect between age and survey mode in this study, illustrating the usefulness of the SEM method in mixed mode research.

33 citations


Journal Article
TL;DR: In this paper, the authors argue that attempts to persuade or pressure respondents into participation might be counterproductive in the long term.
Abstract: Response rates to surveys are declining in most countries. Attempts to persuade or pressure respondents to increase response might be counterproductive in the long-term because they can negatively ...

29 citations


Journal Article
TL;DR: In this paper, the authors investigated properties of random noise multiplication as a data masking procedure, especially for tabular magnitude data, and studied the effects of data swapping, cell suppression, use of synthetic data and random noise perturbations.
Abstract: Most statistical agencies are concerned with the dual challenge of releasing quality data and reducing, if not totally eliminating, the risk of divulging private information. Various data masking procedures such as data swapping, cell suppression, use of synthetic data and random noise perturbations have been recommended and used in practice to meet these two objectives. This paper investigates properties of random noise multiplication as a data masking procedure, especially for tabular magnitude data. We study effects of ...
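A minimal illustration of multiplicative noise masking; the noise distribution below is purely an example and is not the scheme analysed in the paper.

import numpy as np

# Mask magnitude data by multiplying each value by random noise centred at 1.
# Illustrative only: the uniform noise on [1 - d, 1 + d] is an assumption made
# for this sketch, not the distribution studied in the paper.
rng = np.random.default_rng(42)

def mask_multiplicative(values, d=0.1):
    noise = rng.uniform(1.0 - d, 1.0 + d, size=len(values))
    return values * noise

cell_totals = np.array([1250.0, 87300.0, 412.0, 9940.0])
print(mask_multiplicative(cell_totals))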

Journal Article
Li-Chun Zhang
TL;DR: A unit-error theory is outlined which provides a framework for evaluating the statistical accuracy of these register-based household statistics, and its use is illustrated through an application to the Norwegian register household data.

Journal Article
TL;DR: In this article, the authors evaluated various methods for imputing income in a household-based longitudinal survey and found that the Little and Su method with the population carryover method performed the best overall.
Abstract: This article evaluates various methods for imputing income in a household-based longitudinal survey. Eight longitudinal methods are evaluated in a simulation study using five waves of data from the Household, Income and Labour Dynamics in Australia (HILDA) Survey. The quality of the imputed data is evaluated by considering eleven criteria that measure the predictive, distributional and estimation accuracy of the cross-sectional estimates and the estimates of change between waves. Many of the imputation methods perform well cross-sectionally, but when the methods are placed in a longitudinal context their strengths and weaknesses are more apparent. The method that combines the Little and Su method with the population carryover method performs the best overall.
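A sketch of the Little and Su (1989) row-by-column idea that the best-performing method builds on, in our notation (the evaluated variants, including the population carryover, add further detail): wave effects, person effects and a donor residual combine multiplicatively,

$$ c_t = \frac{\bar{y}_{\cdot t}}{\tfrac{1}{T}\sum_{s=1}^{T} \bar{y}_{\cdot s}}, \qquad r_i = \frac{1}{|T_i|} \sum_{t \in T_i} \frac{y_{it}}{c_t}, \qquad \hat{y}_{it} = r_i \, c_t \, \frac{y_{jt}}{r_j \, c_t}, $$

where T_i is the set of waves in which person i reported income and j is a donor with a similar person effect who responded in wave t.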

Journal Article
TL;DR: In this paper, a unified approach based on multiplicity-adjusted estimation is proposed to deal with mixed frame-level information, where some units may have only basic information (possibly due to privacy concerns or lack of memory on the part of the respondent) while others may have more than basic information.
Abstract: The available multiple frame estimation methods do not deal with the case of mixed frame level information where units from the same sample are allowed to have mixed information. That is, some units may have only basic (possibly due to privacy concerns or lack of memory on the part of the respondent) while others may have more than basic information, where basic is defined as having known selection probability for each unit from the sampled frame and the number of frames the unit could have been selected from but not knowing the frame identification except, of course, for the sampled frame. To address this new problem, we first propose a unified approach based on multiplicity-adjusted estimation which encompasses all the proposed estimators (classified in this article as either combined or separate) as well as new estimators obtained by combining simple and complex multiplicity estimators. We also propose hybrid multiplicity estimators to account for mixed information. The methods discussed here are limited to the combined frame approach only because of their ability to deal with the case of mixed information. Simulation results are presented to compare various methods in terms of relative bias and relative root mean squared error of point and variance estimators.
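As a hedged reference point, the simple multiplicity adjustment that the unified approach generalises can be written, in our notation, as

$$ \hat{Y}_{\mathrm{mult}} = \sum_{k \in s} \frac{y_k}{\pi_k \, m_k}, $$

where the sum is over the samples pooled across frames, \pi_k is unit k's selection probability in the frame it was sampled from, and m_k is the number of frames it belongs to; only the "basic" information described above is needed, not the identities of the other frames.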

Journal Article
TL;DR: It is suggested that non-narrative open questions can be designed to help guide respondents to provide answers in the desired format, and that separate input fields improve the quality of responses over single input fields.
Abstract: Web surveys often collect information such as frequencies, currency amounts, dates, or other items requiring short structured answers in an open-ended format, typically using text boxes for input. We report on several experiments exploring design features of such input fields. We find little effect of the size of the input field on whether frequency or dollar amount answers are well-formed or not. By contrast, the use of templates to guide formatting significantly improves the well-formedness of responses to questions eliciting currency amounts. For date questions (whether month/year or month/day/year), we find that separate input fields improve the quality of responses over single input fields, while drop boxes further reduce the proportion of ill-formed answers. Drop boxes also reduce completion time when the list of responses is short (e.g., months), but marginally increase completion time when the list is long (e.g., birth dates). These results suggest that non-narrative open questions can be designed to help guide respondents to provide answers in the desired format.

Journal Article
TL;DR: In this article, the authors examined the importance of change in characteristics and circumstances of households and household members for contact and cooperation patterns and found that such changes affect obtaining cooperation rather than obtaining contact, and tend to increase attrition.
Abstract: This study examines the importance of change in characteristics and circumstances of households and household members for contact and cooperation patterns. The literature suggests that there might be an underrepresentation of change in panel studies, because respondents facing more changes would be more likely to drop out. We approach this problem by analysing whether previous changes are predictive of later attrition or temporary drop-out, using eleven waves of the Swiss Household Panel (1999-2009). Our analyses support previous findings to some extent. Changes in household composition, employment status and social involvement as well as moving are associated mainly with attrition and less with temporary drop-out. These changes affect obtaining cooperation rather than obtaining contact, and tend to increase attrition.

Journal Article
TL;DR: The proposed methodology can be profitably adopted in record linkage and in capture-recapture problems where the size of a finite population is the main object of interest and the number of “recaptured” individuals is unknown.
Abstract: We propose a Bayesian approach for matching noisy multivariate continuous vectors observed on different occasions but originating from the same closed population. The proposed methodology can be profitably adopted in record linkage and in capture-recapture problems where the size of a finite population is the main object of interest and the number of “recaptured” individuals is unknown. A Gibbs sampling scheme is used to simulate from the posterior distribution of the model parameters. The performance of the proposed approach is evaluated with simulated data sets.

Journal Article
TL;DR: The authors examine whether the treatment effects differ by key characteristics of panel members, including likelihood of moving and anticipated difficulty in completing an interview, and provide recommendations for the use of contact update strategies in panel studies.
Abstract: The Panel Study of Income Dynamics (PSID) is a nationally representative longitudinal survey of approximately 9,000 families and their descendants that has been ongoing since 1968. Since 1969, families have been sent a mailing asking them to update or verify their contact information to keep track of their whereabouts between waves. Having updated contact information prior to data collection is associated with fewer call attempts and refusal conversion efforts, less tracking, and lower attrition. Given these apparent advantages, a study was designed in advance of the 2009 PSID field effort to improve the response rate of the contact update mailing. Families were randomly assigned to the following conditions: mailing design (traditional versus new), $10 as a prepaid versus postpaid incentive, timing and frequency of the mailing (July 2008 versus October 2008 versus both times) and whether or not they were sent a study newsletter. This paper reports on findings with regards to response rates to the mailing and the effect on production outcomes including tracking rates and number of calls during 2009 by these different conditions, examines whether the treatment effects differ by key characteristics of panel members including likelihood of moving and anticipated difficulty in completing an interview, and provides some recommendations for the use of contact update strategies in panel studies.

Journal Article
TL;DR: The authors compared breakoff and unit nonresponse in web surveys through response behavior for the same individuals across different surveys and found that nonrespondents to one survey were considerably more likely to be nonrespondent to subsequent surveys, but such consistency in response behavior was substantially lower for breakoffs.
Abstract: Sample members start web surveys but fail to complete them at relatively high rates compared to other modes. Existing theories and empirical findings on unit nonresponse may or may not apply to breakoff. This study contrasts breakoff and unit nonresponse in web surveys through response behavior for the same individuals across different surveys. Nonrespondents to one survey were considerably more likely to be nonrespondents to subsequent surveys, but such consistency in response behavior was substantially lower for breakoffs. There is a degree of transitioning between response behaviors, however, such as nonrespondents in one survey being more likely to be breakoffs than respondents in a subsequent survey, indicative of unmeasured common causes. Limited support for the common cause hypothesis is also found in demographic covariates, yet to a very limited degree; race and gender were associated with both breakoff and nonresponse, and some associations (e.g., year in school) were in the opposite direction. Subjects invited to multiple surveys in a short period of time were more likely than others to be nonrespondents, but were not more likely to break off.

Journal Article
TL;DR: In this article, the authors adapt influence diagnostics that were formulated for ordinary or weighted least squares with nonsurvey data for use with unclustered survey data, and compare the performance of ordinary least squares and survey-weighted diagnostics in an empirical study.
Abstract: Diagnostics for linear regression models have largely been developed to handle nonsurvey data. The models and the sampling plans used for finite populations often entail stratification, clustering, and survey weights. In this article we adapt some influence diagnostics that have been formulated for ordinary or weighted least squares for use with unclustered survey data. The statistics considered here include DFBETAS, DFFITS, and Cook’s D. The differences in the performance of ordinary least squares and survey-weighted diagnostics are compared in an empirical study where the values of weights, response variables, and covariates vary substantially.
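For reference, the ordinary least squares forms of the statistics named above (the article adapts survey-weighted analogues; notation is standard, with h_{ii} the leverage, e_i the residual, s_{(i)} the residual standard error with case i deleted, and p the number of coefficients):

$$ \mathrm{DFBETAS}_{ij} = \frac{\hat\beta_j - \hat\beta_{j(i)}}{s_{(i)} \sqrt{\left[(X'X)^{-1}\right]_{jj}}}, \qquad \mathrm{DFFITS}_i = \frac{\hat y_i - \hat y_{i(i)}}{s_{(i)} \sqrt{h_{ii}}}, \qquad D_i = \frac{e_i^2}{p \, s^2} \cdot \frac{h_{ii}}{(1 - h_{ii})^2}. $$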

Journal Article
TL;DR: This study examines methods for designing a reinterview sample with the goal of identifying more falsified cases; a logistic regression model is fit to predict the likelihood of falsification from the original interview data, and the predicted probabilities are used to construct alternative reinterview sampling designs.
Abstract: The U.S. Census Bureau relies on reinterview programs as the primary method to evaluate field work and monitor the work of the interviewers. One purpose of the reinterviews is to identify falsification. Since falsification is a rare occurrence, reinterview programs generally identify very few falsified cases even when the reinterview sample is reasonably large. This study examines methods for designing a reinterview sample with the goal of identifying more falsified cases. With the Current Population Survey (CPS) as an example, we explore data that could be used for reinterview sampling beyond that currently used in the CPS program. We fit a logistic regression model to predict the likelihood of falsification with the data from original interviews, and use the predicted probabilities to construct alternative reinterview sampling designs. The alternative designs are compared to the current sampling method through cross validation and simulation methods.
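A hedged sketch of the modelling step described above, using scikit-learn; the predictor names are hypothetical placeholders, not the CPS paradata actually used, and the sampling rule shown is only one way to turn predicted probabilities into a reinterview design.

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Illustrative only: fit a logistic model for (rare) falsification on features
# from the original interviews, then oversample high-risk cases for reinterview.
# Column names are hypothetical.
interviews = pd.DataFrame({
    "short_duration":    [1, 0, 0, 1, 0, 1, 0, 0],
    "high_item_missing": [1, 0, 1, 1, 0, 0, 0, 1],
    "evening_contact":   [0, 1, 0, 0, 1, 0, 1, 0],
    "falsified":         [1, 0, 0, 1, 0, 0, 0, 0],
})
X = interviews.drop(columns="falsified")
y = interviews["falsified"]
model = LogisticRegression().fit(X, y)

# Select reinterview cases with probability proportional to predicted risk.
risk = model.predict_proba(X)[:, 1]
rng = np.random.default_rng(0)
reinterview_idx = rng.choice(len(interviews), size=3, replace=False, p=risk / risk.sum())
print(sorted(reinterview_idx))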

Journal Article
TL;DR: A sensitivity analysis is described to assess the impact of departures from MAR for refusals, based on SRMI for a pattern-mixture model, which avoids the well-known problems of underidentification of parameters of missing not at random models.
Abstract: In a rotating panel survey, individuals are interviewed in some waves of the survey but are not interviewed in others. We consider the treatment of missing income data in the labor force survey of the Municipality of Florence in Italy, a survey with a rotating panel design where recipiency and amount of income are missing for waves where individuals are not interviewed, and amount of income is missing for waves where individuals are interviewed but refuse to answer the income amount question. It is thus a question of a multivariate missing data problem with two missing-data mechanisms, one by design and one by refusal, and varying sets of covariates for imputation depending on the wave of the survey. Existing methods for multivariate imputation such as sequential regression multiple imputation (SRMI) can be applied, but assume that the missing income values are missing at random (MAR). This assumption is reasonable when missing data arise from the rotating panel design, but less reasonable when the missing data arise from refusal to answer the income question, since in this case missingness of income is generally thought to be related to the value of income itself, after conditioning on available covariates. In this article we describe a sensitivity analysis to assess the impact of departures from MAR for refusals, based on SRMI for a pattern-mixture model. The sensitivity analysis avoids the well-known problems of underidentification of parameters of missing not at random models, is easy to carry out using existing sequential multiple imputation software, and takes into account the different mechanisms that lead to missing data.
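One common form of such a pattern-mixture sensitivity analysis, sketched here in our notation as an assumption-laden illustration rather than the article's exact implementation: impute the refusals from the MAR imputation model shifted by a sensitivity parameter delta,

$$ y^{\mathrm{imp}}_{\mathrm{refusal}} = \hat{y}_{\mathrm{MAR}} + \delta, \qquad \delta \in \{\delta_1, \ldots, \delta_K\}, $$

repeating the sequential multiple imputation and the analysis for each value of delta over a plausible range (for income a multiplicative shift is sometimes used instead) and reporting how the estimates change.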

Journal Article
TL;DR: In this article, the authors describe a geographic segmentation of survey and census response focused on the underlying constructs behind census tracts with historically low mail response rates, and perform a cluster analysis based on twelve demographic, housing, and socioeconomic variables used to calculate a "hard-to-count" score.
Abstract: The 2010 U.S. Census used a multimode response model with the first phase being a mailout/mailback and the second being a personal visit follow-up. Knowing which segments of the population are predisposed to mail back a form is essential to develop methods to maximize census participation and to plan for and monitor areas of nonresponse. In this article, we describe a geographic segmentation of survey and census response focused on the underlying constructs behind census tracts with historically low mail response rates. We perform a cluster analysis based on twelve demographic, housing, and socioeconomic variables used to calculate a “hard-to-count” score. This yielded eight mutually exclusive geographic clusters of the population that varied across the spectrum of mailback propensities. Each segment is distinguished by unique demographic, housing, and socioeconomic characteristics and several segments are closely aligned to three different hard-to-count profiles. To gauge how the segments performed in terms of recent mail response behavior, we examine several outcome measures with data from the 2010 Census and the American Community Survey collected in 2009 and 2010. To conclude, we discuss the usefulness of extending this geographic segmentation model beyond the census to targeted experiments and other applications in demographic surveys.
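A minimal sketch of the clustering step, using scikit-learn; the tract-level variable names are hypothetical stand-ins for the twelve demographic, housing, and socioeconomic variables used in the article.

import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Illustrative only: cluster census tracts on standardized variables to obtain
# mutually exclusive segments; the column names here are hypothetical.
rng = np.random.default_rng(1)
tracts = pd.DataFrame({
    "pct_renter_occupied":         rng.uniform(0, 100, 500),
    "pct_vacant_units":            rng.uniform(0, 40, 500),
    "pct_below_poverty":           rng.uniform(0, 60, 500),
    "pct_linguistically_isolated": rng.uniform(0, 30, 500),
})
X = StandardScaler().fit_transform(tracts)
segments = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(X)
print(pd.Series(segments).value_counts().sort_index())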

Journal Article
TL;DR: In this article, the authors investigated the relationship between busyness claims, indicators of busyness and the decline in survey participation in Flemish surveys conducted between 2002 and 2007, using paradata collected during fieldwork.
Abstract: As both time pressure (e.g., Gershuny 2005) and survey nonresponse (e.g., Curtin et al. 2005) increase in Western societies one can wonder whether the busiest people still have time for survey participation. This article investigates the relationship between busyness claims, indicators of busyness and the decline in survey participation in Flemish surveys conducted between 2002 and 2007. Using paradata collected during fieldwork, we investigate whether busyness related doorstep reactions have increased over the years and whether there is an empirical relationship between these busyness claims and indicators of busyness.


Journal Article
TL;DR: Index number theory prescribes a superlative index number formula for aggregating heterogeneous items and a unit value index for homogeneous ones; this article analyses why the formulas differ for the neglected case of broadly comparable items and proposes a solution to this index number problem.
Abstract: Index number theory informs us that if data on matched prices and quantities are available, a superlative index number formula is best to aggregate heterogeneous items, and a unit value index to aggregate homogeneous ones. The formulas can give very different results. Neglected is the practical case of broadly comparable items, for which price dispersion can be decomposed into a quality component, say due to product differentiation, and a component that is stochastic or due to price discrimination. This paper analyses why such formulas differ and proposes a solution to this index number problem. JEL Classification Numbers: C43, C81, E31, L11, L15.
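For reference, the two kinds of formula contrasted above, with p and q denoting prices and quantities in base period 0 and comparison period t (the Fisher ideal index is one example of a superlative formula):

$$ P_{UV} = \frac{\sum_i p_{it} q_{it} \,/\, \sum_i q_{it}}{\sum_i p_{i0} q_{i0} \,/\, \sum_i q_{i0}}, \qquad P_{F} = \sqrt{ \frac{\sum_i p_{it} q_{i0}}{\sum_i p_{i0} q_{i0}} \cdot \frac{\sum_i p_{it} q_{it}}{\sum_i p_{i0} q_{it}} }. $$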

Journal Article
TL;DR: In this paper, the authors evaluate the impact of proxy interviews on survey-based employment rate estimates by combining data from the Norwegian Labour Force Survey with register data over a relatively long time series (1997 to 2008), and find that proxy interviews probably result in a better employment rate estimate, even though they introduce some underreporting.
Abstract: We combine data from the Norwegian Labour Force Survey with register data in order to evaluate the impact of proxy interviews on the survey-based employment rate estimates. The method compares estimates under different models for proxy response and nonresponse models, over a relatively long time series from 1997 to 2008. Using register-based employment as an auxiliary variable, we try to differentiate between the effect of the measurement and the effect of the fact that proxy-interviewed people are not selected at random. We label these effects “proxy effect” and “selection effect” respectively, and suggest methods for estimating them. Our conclusion, after also including the impact of nonresponse, is that proxy interviews probably result in a better employment rate estimate, even though they introduce some underreporting. The reason is that proxy interviews provide data on some hard-to-reach people who have a labour-market situation more similar to that of those not reached at all. We find that including the proxy responses has approximately the same effect as post-stratification of the direct responses, using register-employment status as the auxiliary variable.

Journal Article
TL;DR: In this paper, the authors analyzed respondents' likelihood of complying with the mode switch request in a survey of university alumni and found that paradata derived from the screening interview are related to mode switch participation.
Abstract: Minimizing unit nonresponse and maximizing reporting accuracy about sensitive items are common goals among survey practitioners. In order to maximize reporting accuracy without compromising on response rates a common strategy is to recruit respondents over the phone and switch them to a self-administered mode (e.g., IVR, web) for answering the sensitive items. A drawback to the “recruit-and-switch” design is that a substantial portion of the sample (typically 20 percent or more) drop out during the mode switch. Recent evidence suggests that this form of nonresponse can introduce bias and offset gains in accuracy achieved by self-administration. We analyze respondents’ likelihood of complying with the mode switch request in a survey of university alumni. Results indicate that paradata derived from the screening interview are related to mode switch participation. Also, we find evidence that adding reluctant sample members into the mode switch respondent pool yields improved estimates with lower nonresponse bias.

Journal Article
TL;DR: This article describes two additional types of frequently occurring obvious inconsistencies, sign errors and rounding errors: simple algorithms are given that detect and correct these errors.
Abstract: Selective editing is often used for the data of structural business surveys. Records containing potentially influential errors are edited manually, whereas the other, noncritical records can be edited automatically. At Statistics Netherlands, the automatic editing is performed by an advanced software package called SLICE. Prior to this several types of obvious inconsistencies are detected and corrected deductively. This article describes two additional types of frequently occurring obvious inconsistencies, sign errors and rounding errors. Simple algorithms are given that detect and correct these errors. Correction of these errors in a separate step will increase the efficiency of the subsequent editing process, because more records will be eligible (and suitable) for automatic editing. By way of illustration, the algorithms are applied to real data from the Dutch structural business survey.
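A simplified sketch of deductive checks of this kind, for a single balance edit; this is an assumption-laden illustration, not the algorithms used at Statistics Netherlands, which operate on full systems of balance edits.

# Balance edit assumed for this sketch: profit = turnover - costs.
# A sign error means a value was reported with the wrong sign; a rounding
# error means the edit fails by a small amount.
def correct_record(turnover, costs, profit, rounding_tol=2):
    # Try sign flips on costs and profit so the balance edit holds exactly.
    for c_sign in (1, -1):
        for p_sign in (1, -1):
            if turnover - c_sign * costs == p_sign * profit:
                return turnover, c_sign * costs, p_sign * profit
    # Otherwise, if the edit fails by at most the tolerance, treat it as a
    # rounding error and adjust profit so the edit holds.
    diff = (turnover - costs) - profit
    if abs(diff) <= rounding_tol:
        return turnover, costs, profit + diff
    return None  # leave the record for the subsequent editing steps

print(correct_record(1000, 400, -600))  # sign error in profit
print(correct_record(1000, 401, 600))   # rounding error of one unit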

Journal Article
TL;DR: In this paper, a cycle-specific marginal hotdeck imputation method is proposed to fill in the missing responses and the pseudo-GEE method described in Carrillo et al. (2009) is applied to the imputed data set.
Abstract: This paper presents a pseudo-GEE approach to the analysis of longitudinal surveys when the response variable contains missing values. A cycle-specific marginal hotdeck imputation method is proposed to fill in the missing responses and the pseudo-GEE method described in Carrillo et al. (2009) is applied to the imputed data set. Consistency of the resulting pseudo-GEE estimators is established under a joint randomization framework. Linearization variance estimators are also developed for the pseudo-GEE estimators under the assumption that the finite population sampling fraction is small or negligible. Finite sample performances of the proposed estimators are investigated through an extensive simulation study using data from the National Longitudinal Survey of Children and Youth.