
Showing papers on "Imputation (statistics)" published in 1986


Journal ArticleDOI
TL;DR: In this paper, several multiple imputation techniques for simple random samples with ignorable nonresponse on a scalar outcome variable are compared using both analytic and Monte Carlo results concerning coverages of the resulting intervals for the population mean.
Abstract: Several multiple imputation techniques are described for simple random samples with ignorable nonresponse on a scalar outcome variable. The methods are compared using both analytic and Monte Carlo results concerning coverages of the resulting intervals for the population mean. Using m = 2 imputations per missing value gives accurate coverages in common cases and is clearly superior to single imputation (m = 1) in all cases. The performances of the methods for various m can be predicted well by linear interpolation in 1/(m - 1) between the results for m = 2 and m = ∞. As a rough guide, to assure coverages of interval estimates within 2% of the nominal level when using the preferred methods, the number of imputations per missing value should increase from 2 to 3 as the nonresponse rate increases from 10% to 60%.
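The combining step this abstract relies on can be sketched in a few lines. The following is an illustrative Python sketch, not the authors' code: it draws m imputations from a simplified normal model fit to respondents (without drawing the model parameters themselves, so it does not reproduce any one of the paper's preferred methods) and pools the m complete-data estimates of the mean with Rubin's combining rules. All data and names are assumed.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simple random sample with ignorable nonresponse on a scalar outcome y.
n, resp_rate = 200, 0.7
y = rng.normal(50, 10, size=n)
observed = rng.random(n) < resp_rate          # missingness unrelated to y (ignorable)
y_obs = y[observed]

def multiply_impute_mean(y_obs, n_missing, m=3):
    """Impute each missing value m times from a simple normal model fit to
    respondents, then combine the m complete-data estimates of the mean."""
    est, var = [], []
    for _ in range(m):
        # Draws from N(ybar_obs, s2_obs); an assumed, simplified imputation model.
        draws = rng.normal(y_obs.mean(), y_obs.std(ddof=1), size=n_missing)
        completed = np.concatenate([y_obs, draws])
        est.append(completed.mean())
        var.append(completed.var(ddof=1) / completed.size)
    est, var = np.array(est), np.array(var)
    qbar = est.mean()                          # combined point estimate
    ubar = var.mean()                          # within-imputation variance
    b = est.var(ddof=1)                        # between-imputation variance
    total = ubar + (1 + 1 / m) * b
    df = (m - 1) * (1 + ubar / ((1 + 1 / m) * b)) ** 2   # Rubin's df approximation
    half = stats.t.ppf(0.975, df) * np.sqrt(total)
    return qbar, (qbar - half, qbar + half)

print(multiply_impute_mean(y_obs, n_missing=(~observed).sum(), m=3))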

725 citations


Journal ArticleDOI
TL;DR: In this paper, the theoretical properties of nonresponse adjustments based on adjustment cells are studied, for estimates of means for the whole population and in subclasses that cut across adjustment cells.
Abstract: Theoretical properties of nonresponse adjustments based on adjustment cells are studied, for estimates of means for the whole population and in subclasses that cut across adjustment cells. Three forms of adjustment are considered: weighting by the inverse response rate within cells, post-stratification on known population cell counts, and mean imputation within adjustment cells. Two dimensions of covariate information x are distinguished as particularly useful for reducing nonresponse bias: the response propensity f(x) and the conditional mean ŷ(x) of the outcome variable y given x. Weighting within adjustment cells based on f̂(x) controls bias, but not necessarily variance. Imputation within adjustment cells based on ŷ(x) controls bias and variance. Post-stratification yields some gains in efficiency for overall population means, and smaller gains for means in subclasses of the population. A simulation study similar to that of Holt & Smith (1979) is described which explores the mean squared error properties of the estimators. Finally, some modifications of response propensity weighting to control variance are suggested.
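For concreteness, a small Python sketch (on assumed data, not from the paper) of two of the three adjustments compared: weighting respondents by the inverse response rate within adjustment cells, and mean imputation within cells.

import numpy as np

rng = np.random.default_rng(1)

# Toy sample: adjustment cells defined by a covariate, outcome y, response indicator r.
cells = rng.integers(0, 3, size=500)              # 3 adjustment cells
y = 10 + 5 * cells + rng.normal(0, 2, size=500)   # y depends on the cell
resp_prob = np.array([0.9, 0.6, 0.3])[cells]      # response propensity varies by cell
r = rng.random(500) < resp_prob

def weighted_mean(y, r, cells):
    """Weight each respondent by the inverse response rate of its adjustment cell."""
    w = np.zeros(r.sum())
    yr, cr = y[r], cells[r]
    for c in np.unique(cells):
        rate = r[cells == c].mean()
        w[cr == c] = 1.0 / rate
    return np.average(yr, weights=w)

def mean_imputed_mean(y, r, cells):
    """Replace each nonrespondent's y by the respondent mean of its cell."""
    y_imp = y.copy()
    for c in np.unique(cells):
        y_imp[(~r) & (cells == c)] = y[r & (cells == c)].mean()
    return y_imp.mean()

print("respondent mean (biased):", y[r].mean())
print("inverse-response-rate weighting:", weighted_mean(y, r, cells))
print("mean imputation within cells:", mean_imputed_mean(y, r, cells))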

526 citations


Journal ArticleDOI
TL;DR: In the most frequently used microdata sets, over a quarter of all respondents now refuse to answer some questions about their incomes; the Census Bureau has dealt with this problem, which has been increasing in severity over time, by imputing the incomes of nonrespondents, and the authors evaluate that imputation methodology.
Abstract: In the most frequently used microdata sets, over a quarter of all respondents now refuse to answer some questions about their incomes. The Census Bureau has dealt with this problem, which has been increasing in severity over time, by imputing incomes of non-respondents. Their imputation procedure, called the "hot deck," essentially matches nonrespondents with demographically similar donors. In this paper we evaluate the census imputation methodology and raise some questions. First, the census procedure is tied to commonality of events in the population rather than the more appropriate informational content of regressors. Clearly, the census procedure severely understates income in certain occupations. Because it is based on the apparently invalid assumption that income does not affect reporting propensities, it most likely understates average incomes as well.
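A minimal Python sketch of a cell-based hot deck in the spirit described, matching nonrespondents to randomly chosen donors within demographic cells; the cells, income model, and nonignorable missingness below are assumptions for illustration, not the Census Bureau's procedure.

import numpy as np

rng = np.random.default_rng(2)

def hot_deck_impute(income, cell, rng):
    """Fill each missing income with a value drawn from a randomly chosen donor
    in the same demographic cell (e.g., age group x occupation)."""
    income = income.copy()
    for c in np.unique(cell):
        in_cell = cell == c
        donors = income[in_cell & ~np.isnan(income)]
        n_miss = np.isnan(income[in_cell]).sum()
        if donors.size and n_miss:
            income[in_cell & np.isnan(income)] = rng.choice(donors, size=n_miss)
    return income

# Toy data: income missing more often for high earners (nonignorable), which is
# the situation the authors argue the hot deck handles poorly.
cell = rng.integers(0, 4, size=1000)
income = rng.lognormal(10 + 0.1 * cell, 0.5)
missing = rng.random(1000) < 0.15 + 0.3 * (income > np.quantile(income, 0.8))
reported = np.where(missing, np.nan, income)

imputed = hot_deck_impute(reported, cell, rng)
print("true mean:", income.mean(), "imputed-data mean:", np.nanmean(imputed))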

184 citations


Journal ArticleDOI
TL;DR: In this paper, the authors distinguish two basic types of sample outliers: representative outliers, whose values have been correctly recorded and cannot be assumed unique in the target population, and nonrepresentative outliers, whose data values are incorrect or unique in some sense.
Abstract: Outliers in sample data are a perennial problem for applied survey statisticians. Moreover, it is a problem for which traditional sample survey theory offers no real solution, beyond the sensible advice that such sample elements should not be weighted to their fullest extent in estimation. Sample outliers can be identified as of two basic types. Here we are concerned with the first type, which may conveniently be termed representative outliers. These are sample elements with values that have been correctly recorded and that cannot be assumed to be unique. That is, there is no good reason to assume there are no more similar outliers in the nonsampled part of the target population. The remaining sample outliers, which by default are termed nonrepresentative, are sample elements whose data values are incorrect or unique in some sense. Methods for dealing with these nonrepresentative outliers lie basically within the scope of survey editing and imputation theory and are, therefore, not considered in ...
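The advice that such elements "should not be weighted to their fullest extent in estimation" can be illustrated with a simple winsorized expansion estimator; the cutoff choice and simulated data below are assumptions, and this is not an estimator proposed in the paper.

import numpy as np

rng = np.random.default_rng(3)

# Skewed population total estimated from a simple random sample with weight N/n.
N, n = 10_000, 100
population = rng.lognormal(5, 1.2, size=N)
sample = rng.choice(population, size=n, replace=False)
w = N / n

# Fully weighted expansion estimate of the total.
fully_weighted_total = w * sample.sum()

# Winsorized estimate: values above a cutoff keep a weight of 1 for the excess,
# so representative outliers are not expanded to their full weight.
cutoff = np.quantile(sample, 0.98)           # assumed cutoff choice
excess = np.clip(sample - cutoff, 0, None)
winsorized_total = w * np.minimum(sample, cutoff).sum() + excess.sum()

print("true total:", population.sum())
print("fully weighted:", fully_weighted_total, "winsorized:", winsorized_total)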

158 citations


Journal ArticleDOI
TL;DR: In this article, the authors compare the CPS hot deck imputations of wages and salary amounts with alternatives based on regression models for the logarithm of wages, and for the wage rate.
Abstract: The U.S. Bureau of the Census imputes missing income items in the income supplement of the Current Population Survey (CPS) by a technique commonly known as the CPS hot deck. This article compares CPS hot deck imputations of wages and salary amounts with alternatives based on regression models for the logarithm of wages and salary and for the wage rate. Imputations are compared with an Internal Revenue Service (IRS) wages and salary amount obtained by an exact match of CPS data to IRS records. Although limitations in the matching and in the comparison variable preclude a definitive conclusion, we find that (a) the CPS hot deck does not underestimate income aggregates to any serious extent; (b) model-based alternatives have slightly smaller mean absolute error than the hot deck, when comparable data bases of respondents are used to carry out imputations; and (c) multivariate models for imputing recipiency, weeks and hours worked, and earnings need to be developed to provide re...
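A sketch of the regression-based alternative being compared: fit a model for log wages on covariates among respondents and impute nonrespondents with predictions plus random residuals. The covariates, coefficients, and noise model are illustrative assumptions, not the article's specification.

import numpy as np

rng = np.random.default_rng(4)

# Respondent data: covariates X (e.g., education, experience) and log wages.
n = 800
X = np.column_stack([np.ones(n), rng.integers(8, 21, n), rng.uniform(0, 40, n)])
beta = np.array([1.0, 0.08, 0.02])
log_wage = X @ beta + rng.normal(0, 0.4, n)
respondent = rng.random(n) < 0.75

def regression_impute_log_wage(X, log_wage, respondent, rng):
    """Fit OLS for log wages on respondents; impute nonrespondents with the
    prediction plus a normal residual so the imputed values keep some spread."""
    Xr, yr = X[respondent], log_wage[respondent]
    bhat, *_ = np.linalg.lstsq(Xr, yr, rcond=None)
    sigma = np.sqrt(((yr - Xr @ bhat) ** 2).sum() / (Xr.shape[0] - Xr.shape[1]))
    y_imp = log_wage.copy()
    miss = ~respondent
    y_imp[miss] = X[miss] @ bhat + rng.normal(0, sigma, miss.sum())
    return np.exp(y_imp)                      # back to the wage scale

wages = regression_impute_log_wage(X, log_wage, respondent, rng)
print("aggregate wages, true vs imputed-data:", np.exp(log_wage).sum(), wages.sum())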

131 citations


Journal ArticleDOI
TL;DR: It is concluded that pairwise deletion and listwise deletion are among the least effective methods in terms of approximating the results that would have been obtained had the data been complete, whereas replacing missing values with estimates based on correlational procedures generally produces the most accurate results.
Abstract: Although research conducted in applied settings is frequently hindered by missing data, there is surprisingly little practical advice concerning effective methods for dealing with the problem. The purpose of this article is to describe several alternative methods for dealing with incomplete multivariate data and to examine the effectiveness of these methods. It is concluded that pairwise deletion and listwise deletion are among the least effective methods in terms of approximating the results that would have been obtained had the data been complete, whereas replacing missing values with estimates based on correlational procedures generally produces the most accurate results. In addition, some descriptive statistical procedures are recommended that permit researchers to investigate the causes and consequences of incomplete data more fully.
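A toy Python comparison, on assumed data, of the methods examined: listwise deletion, pairwise deletion, and replacing missing values with regression-based (correlational) estimates.

import numpy as np

rng = np.random.default_rng(5)

# Complete bivariate data, then knock out some x2 values at random.
n = 300
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(scale=0.8, size=n)
miss = rng.random(n) < 0.3
x2_obs = np.where(miss, np.nan, x2)
complete = ~miss

# Listwise deletion: keep only complete cases for every statistic.
r_listwise = np.corrcoef(x1[complete], x2_obs[complete])[0, 1]

# Pairwise deletion: each statistic uses all cases available for that pair
# (identical to listwise with only two variables, but diverges with more).
r_pairwise = np.corrcoef(x1[~np.isnan(x2_obs)], x2_obs[~np.isnan(x2_obs)])[0, 1]

# Regression-based replacement: predict missing x2 from x1 using complete cases.
slope, intercept = np.polyfit(x1[complete], x2_obs[complete], 1)
x2_filled = np.where(miss, slope * x1 + intercept, x2_obs)
r_regression = np.corrcoef(x1, x2_filled)[0, 1]

print("complete-data r:", np.corrcoef(x1, x2)[0, 1])
print("listwise:", r_listwise, "pairwise:", r_pairwise, "regression:", r_regression)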

124 citations


Journal ArticleDOI
TL;DR: A model in which a response is modified to pass a set of edits with as little change as possible is developed; the resulting problem is NP-hard for categorical data and general edits.
Abstract: Responses to surveys often contain large amounts of incorrect information. One option for dealing with the problem is to revise those erroneous responses that can be detected. Fellegi and Holt developed a model in which a response is modified to pass a set of edits with as little change as possible. The model is called Minimum Weighted Fields to Impute (MWFI) and is NP-hard for categorical data and general edits. We develop two algorithms for MWFI, based on set covering, and present computational experience.
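A brute-force Python sketch of the minimum-change idea (not the set-covering algorithms the article develops): find the cheapest weighted set of fields whose values can be changed so the record passes every edit. The edits, domains, and weights below are assumed.

from itertools import combinations, product

# Each edit is a predicate that returns True when a record FAILS the edit.
edits = [
    lambda r: r["age"] == "child" and r["marital"] == "married",
    lambda r: r["age"] == "child" and r["employment"] == "employed",
]
domains = {"age": ["child", "adult"],
           "marital": ["single", "married"],
           "employment": ["employed", "unemployed"]}
weights = {"age": 2.0, "marital": 1.0, "employment": 1.0}   # assumed field weights

def min_weighted_fields_to_impute(record):
    """Return the cheapest (weight, fields) whose values can be changed so the
    record passes every edit, by brute force over field subsets."""
    fields = list(domains)
    best = None
    for k in range(len(fields) + 1):
        for subset in combinations(fields, k):
            for values in product(*(domains[f] for f in subset)):
                trial = {**record, **dict(zip(subset, values))}
                if not any(edit(trial) for edit in edits):
                    cost = sum(weights[f] for f in subset)
                    if best is None or cost < best[0]:
                        best = (cost, subset)
    return best

print(min_weighted_fields_to_impute(
    {"age": "child", "marital": "married", "employment": "employed"}))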

51 citations


Journal ArticleDOI
TL;DR: In this article, cash returns from farming are expected to be nonnormally distributed under a wide range of joint price-yield distributions, and a positive correlation between skewness and kurtosis is found to reduce the likelihood of decision errors associated with falsely imputing normality.
Abstract: Cash returns from farming are expected to be nonnormally distributed under a wide range of joint price-yield distributions. Adequate testing for such nonnormality requires use of proper whitening procedures as well as appropriate statistics. With tests and sample sizes commonly employed, a false imputation of normality often will be made. However, positive correlation between skewness and kurtosis reduces the likelihood of associated decision errors. These results are illustrated with data for
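A small sketch of the kind of moment-based normality check alluded to: sample skewness and excess kurtosis of simulated price-times-yield returns combined into a Jarque-Bera-type statistic. The return model and sample size are assumptions, not the article's data or test.

import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Simulated cash returns as price x yield with non-normal components.
n = 60                                     # a modest sample size, as in practice
price = rng.lognormal(0.0, 0.15, n)
yield_ = np.maximum(rng.normal(3.0, 0.8, n), 0)   # truncation induces skewness
returns = price * yield_

skew = stats.skew(returns)
kurt = stats.kurtosis(returns)             # excess kurtosis (0 for a normal)
jb = n / 6 * (skew ** 2 + kurt ** 2 / 4)   # Jarque-Bera statistic, ~ chi-square(2)
p_value = stats.chi2.sf(jb, df=2)

print(f"skewness={skew:.2f}, excess kurtosis={kurt:.2f}, JB={jb:.2f}, p={p_value:.3f}")
# With small n, such tests often fail to reject normality even for skewed returns,
# which is the "false imputation of normality" the article warns about.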

47 citations


Journal ArticleDOI
TL;DR: In this paper, a new method is proposed for imputation of missing values in sample survey data, which uses standard statistical methodology, permits a general specification of the nonresponse process, and does not impose specific model assumptions.
Abstract: A new method is proposed for imputation of missing values in sample survey data. The procedure uses standard statistical methodology, permits a general specification of the nonresponse process, and does not impose specific model assumptions. Prior information from past similar surveys or from other sources may be incorporated in a routine manner.

20 citations


Journal ArticleDOI
TL;DR: In this article, it is argued that the importance of imputations that arise in a representation scheme depends strongly on the use to which the scheme is put: whether it serves as part of a formal, objective account of natural language, or rather as a representational tool within an agent.

17 citations


Book ChapterDOI
01 Jan 1986
TL;DR: In this paper, the authors addressed the analysis issues of longitudinal vs. cross-sectional methods of imputation and adjustment for missing values, and the use of weights in longitudinal analyses to adjust for unequal probabilities of selection and nonresponse.
Abstract: Longitudinal survey data can arise in many different settings, e.g., from rotating panel surveys, in cohort studies, and in the context of field experiments that involve economic and social phenomena that change over time. In all of these settings the longitudinal feature implies repeated interviews of respondents from nonstationary populations, and both panel attrition and missing data present special concerns. The issues here are ones involving both design and analysis. Among the design issues in a longitudinal survey is how to achieve a high degree of data continuity by following movers, when the cost of such continuity is high. If the sampling units of interest are groups as opposed to individuals, there is often a critical need for operational definitions of “family” and “household”, because the concepts are dynamic and change over time. Among the analysis issues addressed in the paper are (i) the use of longitudinal vs. cross-sectional methods of imputation and adjustment for missing values, and (ii) the use of weights in longitudinal analyses to adjust for unequal probabilities of selection and nonresponse.
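A short Python sketch of point (ii): combining a base weight for unequal selection probabilities with a wave-nonresponse adjustment computed within weighting classes. The strata, classes, and rates are assumptions for illustration, not from the chapter.

import numpy as np

rng = np.random.default_rng(7)

n = 400
strata = rng.integers(0, 2, size=n)                     # two design strata
sel_prob = np.where(strata == 0, 0.02, 0.10)            # unequal selection probabilities
base_weight = 1.0 / sel_prob

# Wave-2 response depends on a weighting class (e.g., movers vs non-movers).
mover = rng.random(n) < 0.3
responded = rng.random(n) < np.where(mover, 0.6, 0.9)

adjusted_weight = base_weight.copy()
for cls in (True, False):
    in_cls = mover == cls
    # Ratio of weighted sample to weighted respondents within the class.
    factor = base_weight[in_cls].sum() / base_weight[in_cls & responded].sum()
    adjusted_weight[in_cls & responded] *= factor
adjusted_weight[~responded] = 0.0

y = 20 + 5 * mover + rng.normal(0, 3, n)                # outcome at wave 2
print("unadjusted respondent mean:", y[responded].mean())
print("weight-adjusted mean:",
      np.average(y[responded], weights=adjusted_weight[responded]))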

Journal ArticleDOI
TL;DR: Responses to surveys often contain large amounts of incorrect information and one option for dealing with the problem is to revise those erroneous responses that can be detected.
Abstract: Responses to surveys often contain large amounts of incorrect information. One option for dealing with the problem is to revise those erroneous responses that can be detected. Fellegi and Holt deve...

Posted Content
TL;DR: In this article, the authors focus on the missing income problem in analyses of Engel functions, statistically link particular demographic attributes to the probability of reporting income information, and discuss several techniques for overcoming the problem, namely regression imputation, the Heckman procedure, and item deletion.
Abstract: The empirical evidence from the extant literature in demand analysis points to the importance of income in food expenditure relationships. However, roughly 30 percent of all households in the 1977-78 Nationwide Food Consumption Survey do not report income figures. The focus of this paper is on the missing income problem in analyses of Engel functions. This analysis statistically links particular demographic attributes to the probability of reporting income information. Additionally, several techniques to overcome the missing income problem, namely, regression imputation, the Heckman procedure, and item deletion, are discussed. Empirical evidence suggests that the Heckman procedure is statistically superior to item deletion, and that regression imputation and the Heckman procedure yield similar results.
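A stylized Python sketch of the Heckman two-step correction contrasted with item deletion for an Engel-type equation; the covariates, the probit selection equation, and the simulated data are all assumptions, not the paper's specification or data.

import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(8)

# Toy household data: food expenditure depends on log income and household size;
# income reporting depends on demographics and on an unobserved factor u that
# also enters the food equation, so item deletion is subject to selection bias.
n = 1_000
educ = rng.integers(8, 18, n)
hh_size = rng.integers(1, 7, n)
u = rng.normal(0, 1, n)
log_income = 7 + 0.1 * educ + rng.normal(0, 0.5, n)
food = 2 + 0.35 * log_income + 0.1 * hh_size + 0.3 * u + rng.normal(0, 0.2, n)
report = rng.random(n) < norm.cdf(-1.0 + 0.12 * educ + 0.8 * u)

Z = sm.add_constant(np.column_stack([educ, hh_size]))   # selection covariates
X = sm.add_constant(np.column_stack([log_income, hh_size]))

# Step 1: probit for the probability of reporting income.
probit = sm.Probit(report.astype(float), Z).fit(disp=0)
xb = Z @ probit.params
imr = norm.pdf(xb) / norm.cdf(xb)                       # inverse Mills ratio

# Step 2: Engel equation on reporters, augmented with the inverse Mills ratio.
heckman = sm.OLS(food[report], np.column_stack([X[report], imr[report]])).fit()
deletion = sm.OLS(food[report], X[report]).fit()        # item deletion for contrast

print("item deletion coefficients:   ", deletion.params.round(3))
print("Heckman two-step coefficients:", heckman.params.round(3))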

Journal ArticleDOI
TL;DR: In this article, the authors report several experimental tests of the deterrence set, a solution concept for n-person games proposed by Laffond and Moulin (1977), which is distinctive because it specifies equilibria attained through the use of threats, where threats are conceptualized as costly not only to the target but also to the user.
Abstract: This paper reports several experimental tests of the deterrence set, a solution concept for n-person games proposed by Laffond and Moulin (1977, 1981). This solution concept is distinctive because it specifies equilibria attained through the use of threats, where threats are conceptualized as costly not only to the target but also to the user. The laboratory tests were conducted in the context of 3- and 4-person cooperative non-sidepayment matrix games. In the first experimental test, the deterrence set was juxtaposed against the imputation set. Results indicate that the deterrence set has greater predictive accuracy than the imputation set. In a series of further tests, the deterrence set was juxtaposed against both the imputation set and the von Neumann-Morgenstern stable set solution. Results again show that the deterrence set is more accurate than the imputation set, but neither the deterrence set nor the stable set is reliably more accurate than the other. Overall, these results indicate that deterrence is a viable basis for stability in cooperative non-sidepayment games.
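To make the benchmark concrete, a tiny Python sketch of checking membership in the imputation set of a 3-person characteristic-function game (efficiency plus individual rationality); the characteristic function below is an assumed example, not one of the experimental games.

# Characteristic function of an assumed 3-person game: value of each coalition.
v = {(1,): 1, (2,): 1, (3,): 2, (1, 2): 4, (1, 3): 5, (2, 3): 5, (1, 2, 3): 9}

def is_imputation(x, v, players=(1, 2, 3)):
    """x (a dict of payoffs) is in the imputation set if the payoffs are
    efficient (sum to v(N)) and individually rational (x_i >= v({i}))."""
    efficient = abs(sum(x.values()) - v[players]) < 1e-9
    rational = all(x[i] >= v[(i,)] for i in players)
    return efficient and rational

print(is_imputation({1: 3, 2: 3, 3: 3}, v))   # True: efficient, individually rational
print(is_imputation({1: 6, 2: 2, 3: 1}, v))   # False: player 3 gets less than v({3}) = 2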

Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of missing data in end-use energy demand models, noting that analysts often discard cases in which values are missing for variables required by their models (see, e.g., U.S. Government, 1983; Pacific Gas and Electric, 1983; Hirst and Carney, 1978; and EPRI, 1977) and that doing so can bias the resulting forecasts.
Abstract: Although missing data are found in all types of data sets, surveys are particularly prone to produce data sets in which values of some respondent variables are missing (see, e.g., Cochran, 1977; Ericson, 1967; Kalton, 1983; and Hutcheson and Prather, 1977). Survey data collected for end-use energy demand models are no exception; high frequencies of nonresponse occur for many variables. This issue is, however, generally disregarded in the end-use literature, and analysts working with end-use models often discard cases in which values are missing for variables required by their models (see, e.g., U.S. Government, 1983; Pacific Gas and Electric, 1983; Hirst and Carney, 1978; and EPRI, 1977). Discarding cases with missing values has important consequences. It implicitly assumes that the missing values occur randomly rather than systematically. If, however, missing values do not occur randomly, discarding cases with missing values will result in misspecified models and biased forecasts. Furthermore, by discarding cases, the detail appropriate for a given end-use model can be lost.
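A small simulated illustration, with assumed numbers, of the consequence described: when missingness is systematic (here, larger dwellings report floor area less often), complete-case estimates of an energy-demand relation can be misleading.

import numpy as np

rng = np.random.default_rng(9)

# Toy end-use data: annual electricity use (kWh) driven by floor area (m^2).
n = 2_000
area = rng.gamma(shape=4, scale=40, size=n)
kwh = 1_000 + 30 * area + rng.normal(0, 800, n)

# Systematic nonresponse: large dwellings are less likely to report floor area.
p_missing = np.clip(0.05 + 0.004 * (area - area.mean()), 0.02, 0.9)
missing = rng.random(n) < p_missing
keep = ~missing

# Complete-case (discarded) fit vs the full-data fit.
slope_cc, intercept_cc = np.polyfit(area[keep], kwh[keep], 1)
slope_full, intercept_full = np.polyfit(area, kwh, 1)

print("full data:      slope=%.1f, mean kWh=%.0f" % (slope_full, kwh.mean()))
print("complete cases: slope=%.1f, mean kWh=%.0f" % (slope_cc, kwh[keep].mean()))
# The complete-case mean understates demand because high-use cases drop out,
# even if the slope changes little when missingness depends only on floor area.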