
Showing papers by "Frauke Kreuter published in 2015"


Journal ArticleDOI
TL;DR: This report provides examples of different types of Big Data and their potential for survey research; it also describes the Big Data process, discusses its main challenges, and considers solutions and research needs.
Abstract: Recent years have seen an increase in the amount of statistics describing different phenomena based on “Big Data.” This term includes data characterized not only by their large volume, but also by their variety and velocity, the organic way in which they are created, and the new types of processes needed to analyze them and make inference from them. The change in the nature of the new types of data, their availability, and the way in which they are collected and disseminated is fundamental. This change constitutes a paradigm shift for survey research. There is great potential in Big Data, but there are some fundamental challenges that have to be resolved before its full potential can be realized. This report provides examples of different types of Big Data and their potential for survey research; it also describes the Big Data process, discusses its main challenges, and considers solutions and research needs.

121 citations


Posted Content
TL;DR: In this paper, the authors introduce the "generalized multitrait-multimethod" (GMTMM) model, which can be seen as a general framework for evaluating the quality of administrative and survey data simultaneously.
Abstract: Administrative register data are increasingly important in statistics, but, like other types of data, may contain measurement errors. To prevent such errors from invalidating analyses of scientific interest, it is therefore essential to estimate the extent of measurement errors in administrative data. Currently, however, most approaches to evaluating such errors involve either prohibitively expensive audits or comparison with a survey that is assumed perfect. We introduce the "generalized multitrait-multimethod" (GMTMM) model, which can be seen as a general framework for evaluating the quality of administrative and survey data simultaneously. This framework allows both survey and register to contain random and systematic measurement errors. Moreover, it accommodates common features of administrative data such as discreteness, nonlinearity, and nonnormality, improving on similar existing models. The use of the GMTMM model is demonstrated by application to linked survey-register data from the German Federal Employment Agency on income from and duration of employment, and a simulation study evaluates the estimates obtained. Key words: measurement error, latent variable models, official statistics, register data, reliability

29 citations
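As background (not taken from the paper itself), the classical linear multitrait-multimethod decomposition that the GMTMM framework generalizes models each observed measure as a trait component plus a method component plus random error. A minimal sketch, assuming trait and method factors are uncorrelated with each other and with the errors:

\[
y_{tm} = \tau_{tm} + \lambda_{tm} T_t + \gamma_{tm} M_m + \varepsilon_{tm},
\qquad
\operatorname{Var}(y_{tm}) = \lambda_{tm}^2 \operatorname{Var}(T_t) + \gamma_{tm}^2 \operatorname{Var}(M_m) + \operatorname{Var}(\varepsilon_{tm}),
\]

where \(T_t\) is the latent trait (e.g., true income), \(M_m\) is a method factor for data source \(m\) (e.g., survey or register) capturing systematic error shared by measures from that source, and \(\varepsilon_{tm}\) is random error. Under these assumptions, the share of observed variance attributable to the trait, \(\lambda_{tm}^2 \operatorname{Var}(T_t) / \operatorname{Var}(y_{tm})\), can serve as a quality measure for each source; the paper's generalized model relaxes the linearity and normality built into this sketch.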


Posted Content
TL;DR: The change in the nature of the new types of data, their availability, and the way in which they are collected and disseminated is fundamental and constitutes a paradigm shift for survey research.
Abstract: In recent years we have seen an increase in the amount of statistics in society describing different phenomena based on so-called Big Data. The term Big Data is used for a variety of data, as explained in the report, many of them characterized not just by their large volume but also by their variety and velocity, the organic way in which they are created, and the new types of processes needed to analyze them and make inferences from them. The change in the nature of the new types of data, their availability, and the way in which they are collected and disseminated is fundamental. The change constitutes a paradigm shift for survey research.

19 citations


31 Aug 2015
TL;DR: Exploring the effects of survey question format and survey type on data quality as well as developments in the treatment of missing data, an international collection of contributors addresses such key topics as motivated misreporting, audio-recording of open-ended questions, framing effects, and multiple imputation.
Abstract: The accuracy of a survey is directly affected by how the survey is presented, how the questions are worded, and what the format is for responses. In addition, survey methods continue to develop at an accelerating rate to keep pace with technological demands. Consequently, research on survey methods themselves is essential to ensuring accurate data. Survey Measurements presents the most up-to-date findings in this field. Exploring the effects of survey question format and survey type on data quality as well as developments in the treatment of missing data, an international collection of contributors addresses such key topics as motivated misreporting; audio-recording of open-ended questions; framing effects; multitrait-multimethod matrix modeling; web, mobile web, and mixed-mode research; experience sampling; estimates of change; and multiple imputation. This book will be a vital resource for teachers and students of survey methodology, advanced data analysis, applied survey research, and a variety of disciplines including the social sciences, public health research, epidemiology, and psychology.

14 citations


Journal ArticleDOI
TL;DR: In this paper, a large-scale national survey in Germany showed modest efficiency gains, measured in the number of call attempts needed until first contact, but no efficiency gains in obtaining cooperation.
Abstract: Call scheduling is a challenge for surveys around the world. Unlike cross-sectional surveys, panel surveys can use information from prior waves to enhance call-scheduling algorithms. Past observational studies showed the benefit of calling panel cases at times that had been successful in the past. This article is the first to experimentally assign panel cases to previously beneficial call windows. The results from a large-scale national survey in Germany show modest efficiency gains, measured in the number of call attempts needed until first contact, but no efficiency gains in obtaining cooperation.

10 citations


DOI
13 Nov 2015
TL;DR: The authors conducted a gain-loss framing experiment in which they emphasized either the benefit (gain) of linking or the negative consequence (loss) of not linking one's data, as it related to the usefulness of respondents' survey responses.
Abstract: Many sample surveys ask respondents for consent to link their survey information with administrative sources. There is significant variation in how linkage requests are administered and little experimental evidence to suggest which approaches are useful for achieving high consent rates. A common approach is to emphasize the positive benefits of linkage to respondents. However, some evidence suggests that emphasizing the negative consequences of not consenting to linkage is a more effective strategy. To further examine this issue, we conducted a gain-loss framing experiment in which we emphasized the benefit (gain) of linking or the negative consequence (loss) of not linking one’s data as it related to the usefulness of their survey responses. In addition, we explored a sunk-prospective costs rationale by varying the emphasis on response usefulness for responses that the respondent had already provided prior to the linkage request (sunk costs) and responses that would be provided after the linkage request (prospective costs). We found a significant interaction between gain-loss framing and the sunk-prospective costs rationale: respondents in the gain-framing condition consented to linkage at a higher rate than those in the loss-framing condition when response usefulness was emphasized for responses to subsequent survey items. Conversely, the opposite pattern was observed when response usefulness was emphasized for responses that had already been provided: loss-framing resulted in a higher consent rate than the gain-framing, but this result did not reach statistical significance.

9 citations



Journal Article
TL;DR: This article analyzes an intervention designed to provide field interviewers with observable predictors of a key auxiliary variable for which they were recording observations; the analysis shows evidence of a significant improvement in the quality of the observations.
Abstract: Face-to-face household surveys sometimes ask field interviewers to record observations about selected characteristics of all sampled housing units. Some surveys ask interviewers to record judgments about potential respondents to serve as proxy measures of key variables. Past studies have shown that these judgments are prone to error, which has negative implications for survey estimators in terms of the bias and variance introduced by nonresponse adjustments based on the judgments. Practical techniques for reducing these errors are therefore needed. This article analyzes an intervention implemented in the 2006–2010 National Survey of Family Growth. The intervention was designed to provide field interviewers with observable predictors of a key auxiliary variable for which they were recording observations. The analysis shows evidence of a significant improvement in the quality of the observations. The article concludes with a discussion of directions for future work in this area.

8 citations


Journal ArticleDOI
TL;DR: Face-to-face household surveys sometimes ask field interviewers to record observations about selected characteristics of all sampled housing units, and some surveys ask interviewers to record judgments.
Abstract: Face-to-face household surveys sometimes ask field interviewers to record observations about selected characteristics of all sampled housing units. Some surveys ask interviewers to record judgments...

7 citations


Journal ArticleDOI
TL;DR: PracTools is an R package with functions that compute sample sizes for various types of finite population sampling designs when totals or means are estimated; it also includes specialized functions for estimating variance components and design effects.
Abstract: PracTools is an R package with functions that compute sample sizes for various types of finite population sampling designs when totals or means are estimated. One-, two-, and three-stage designs are covered as well as allocations for stratified sampling and probability proportional to size sampling. Sample allocations can be computed that minimize the variance of an estimator subject to a budget constraint or that minimize cost subject to a precision constraint. The package also contains some specialized functions for estimating variance components and design effects. Several finite populations are included that are useful for classroom instruction.

4 citations
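To illustrate the type of computation the abstract describes, here is a minimal Python sketch of a sample-size calculation for estimating a mean under simple random sampling with a margin-of-error target and a finite population correction. It is a generic illustration only: the function name and arguments are hypothetical and do not reproduce the PracTools API, which also covers multistage designs, stratified allocation, and probability proportional to size sampling.

```python
import math
from statistics import NormalDist

def srs_sample_size(margin_of_error: float, unit_sd: float,
                    pop_size: float = float("inf"), alpha: float = 0.05) -> int:
    """Sample size for estimating a population mean under simple random sampling.

    Solves z_{alpha/2} * sqrt(S^2 / n) <= margin_of_error for n, then applies the
    finite population correction n = n0 / (1 + n0 / N). This mirrors the kind of
    calculation PracTools automates; it is not the package's API.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)       # two-sided normal critical value
    n0 = (z * unit_sd / margin_of_error) ** 2     # size ignoring the population size
    n = n0 / (1 + n0 / pop_size)                  # finite population correction
    return math.ceil(n)                           # round up to a whole unit

# Example: estimate a mean to within +/- 2000 units with 95% confidence,
# assuming a unit standard deviation of 25000 and a frame of 50000 units.
print(srs_sample_size(margin_of_error=2000, unit_sd=25000, pop_size=50000))
```

The same precision-versus-cost logic extends to the package's multistage and allocation functions, where the optimization is subject to a budget or precision constraint rather than a single margin of error.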



Journal ArticleDOI
TL;DR: In this paper, the authors show that the relative sizes of the variance components in a cluster sample are dramatically affected by how much the clusters vary in size, by the type of sample design, and by the form of estimator used.
Abstract: Determining sample sizes in multistage samples requires variance components for each stage of selection. The relative sizes of the variance components in a cluster sample are dramatically affected by how much the clusters vary in size, by the type of sample design, and by the form of estimator used. Measures of the homogeneity of survey variables within clusters are related to the variance components and affect the numbers of sample units that should be selected at each stage to achieve the desired precision levels. Measures of homogeneity can be estimated using standard software for random-effects models but the model-based intracluster correlations may need to be transformed to be appropriate for use with the sample design. We illustrate these points and implications for sample size calculation for two-stage sample designs using a realistic population derived from household surveys and the decennial census in the U.S.
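As a point of reference (a textbook approximation, not the article's own derivation, and ignoring the finite population corrections and unequal cluster sizes the article treats), for a two-stage design that samples a clusters and b̄ elements per cluster with simple random sampling at both stages:

\[
\operatorname{Var}(\bar{y}) \approx \frac{S^2}{a\bar{b}}\left[1 + (\bar{b} - 1)\,\delta\right],
\qquad
\delta = \frac{\sigma^2_{b}}{\sigma^2_{b} + \sigma^2_{w}},
\qquad
\mathrm{deff} \approx 1 + (\bar{b} - 1)\,\delta,
\]

where \(\sigma^2_{b}\) and \(\sigma^2_{w}\) are the between- and within-cluster variance components and \(\delta\) is the measure of homogeneity (intraclass correlation). The variance components are what standard random-effects (multilevel) software estimates; as the abstract notes, the model-based intraclass correlation may need to be transformed before it is appropriate for a particular design and estimator.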