Statistical Analysis with Missing Data
About: This article was published in the Journal of Marketing Research on 1989-08-01. It has received 7643 citations to date. The article focuses on the topics: Missing data & Exploratory data analysis.
Citations
TL;DR: This article proposes a single global test statistic for assessing whether multivariate data with missing values are missing completely at random (MCAR), that is, whether missingness is unrelated to the variables in the data set.
Abstract: A common concern when faced with multivariate data with missing values is whether the missing data are missing completely at random (MCAR); that is, whether missingness depends on the variables in the data set. One way of assessing this is to compare the means of recorded values of each variable between groups defined by whether other variables in the data set are missing or not. Although informative, this procedure yields potentially many correlated statistics for testing MCAR, resulting in multiple-comparison problems. This article proposes a single global test statistic for MCAR that uses all of the available data. The asymptotic null distribution is given, and the small-sample null distribution is derived for multivariate normal data with a monotone pattern of missing data. The test reduces to a standard t test when the data are bivariate with missing data confined to a single variable. A limited simulation study of empirical sizes for the test applied to normal and nonnormal data suggests th...
6,045 citations
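The bivariate special case noted in the abstract above, where missing values are confined to one variable, can be illustrated with a short sketch. It compares the mean of the fully observed variable between cases where the other variable is present and cases where it is missing. The pooled-variance form of the t statistic, the function name `mcar_t_statistic`, and the use of `None` to mark missing values are assumptions made for illustration; the abstract only says the test reduces to a standard t test.

```python
import math
from statistics import mean, variance


def mcar_t_statistic(x, y):
    """Pooled two-sample t statistic for x, split by whether y is missing.

    x is fully observed; y marks missing values with None (assumed convention).
    Returns (t, degrees_of_freedom).
    """
    observed = [xi for xi, yi in zip(x, y) if yi is not None]
    missing = [xi for xi, yi in zip(x, y) if yi is None]
    n1, n2 = len(observed), len(missing)
    if n1 < 2 or n2 < 2:
        raise ValueError("need at least two cases in each missingness group")
    # Pooled variance across the two missingness groups.
    pooled = ((n1 - 1) * variance(observed) + (n2 - 1) * variance(missing)) / (n1 + n2 - 2)
    if pooled == 0:
        raise ValueError("x has zero variance within both groups")
    t = (mean(observed) - mean(missing)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Toy data: y is missing for every second case.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0.3, None, 0.1, None, 0.7, None]
t, df = mcar_t_statistic(x, y)
```

A t statistic far from zero, judged against a t distribution with the returned degrees of freedom, would count as evidence against MCAR in this simple setting.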
TL;DR: Multitask learning (MTL) is an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias.
Abstract: Multitask Learning is an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias. It does this by learning tasks in parallel while using a shared representation; what is learned for each task can help other tasks be learned better. This paper reviews prior work on MTL, presents new evidence that MTL in backprop nets discovers task relatedness without the need of supervisory signals, and presents new results for MTL with k-nearest neighbor and kernel regression. In this paper we demonstrate multitask learning in three domains. We explain how multitask learning works, and show that there are many opportunities for multitask learning in real domains. We present an algorithm and results for multitask learning with case-based methods like k-nearest neighbor and kernel regression, and sketch an algorithm for multitask learning in decision trees. Because multitask learning works, can be applied to many different kinds of domains, and can be used with different learning algorithms, we conjecture there will be many opportunities for its use on real-world problems.
5,181 citations
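A toy sketch of the shared-representation idea in the abstract above: two tasks read the same scalar feature, so the shared weight receives gradient contributions from both tasks while each task head sees only its own error. The scalar linear model, the squared-error loss, and all names are illustrative assumptions, not the paper's networks or algorithms.

```python
def joint_gradients(x, targets, shared_w, head_ws):
    """Gradients of the summed loss 0.5 * (prediction - target)**2 over tasks.

    The shared feature is h = shared_w * x; task k predicts head_ws[k] * h.
    Returns (gradient for shared_w, list of gradients for each head weight).
    """
    h = shared_w * x  # representation shared by every task
    grad_shared = 0.0
    grad_heads = []
    for target, head_w in zip(targets, head_ws):
        error = head_w * h - target
        grad_heads.append(error * h)        # a head learns from its own task only
        grad_shared += error * head_w * x   # the shared weight learns from all tasks
    return grad_shared, grad_heads


# Task 1 is already fit; task 2 still pulls on the shared weight.
grad_shared, grad_heads = joint_gradients(2.0, [1.0, 3.0], 0.5, [1.0, 2.0])
```

In this example the first task contributes no gradient, yet the shared weight still moves because of the second task, which is the mechanism by which one task's training signal can shape what another task sees.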
TL;DR: This review presents a practical summary of the missing data literature, including a sketch of missing data theory and descriptions of normal-model multiple imputation (MI) and maximum likelihood methods, and strategies for reducing attrition bias.
Abstract: This review presents a practical summary of the missing data literature, including a sketch of missing data theory and descriptions of normal-model multiple imputation (MI) and maximum likelihood methods. Practical missing data analysis issues are discussed, most notably the inclusion of auxiliary variables for improving power and reducing bias. Solutions are given for missing data challenges such as handling longitudinal, categorical, and clustered data with normal-model MI; including interactions in the missing data model; and handling large numbers of variables. The discussion of attrition and nonignorable missingness emphasizes the need for longitudinal diagnostics and for reducing the uncertainty about the missing data mechanism under attrition. Strategies suggested for reducing attrition bias include using auxiliary variables, collecting follow-up data on a sample of those initially missing, and collecting data on intent to drop out. Suggestions are given for moving forward with research on missing data and attrition.
5,095 citations
TL;DR: This article describes the assumed context and objectives of multiple imputation and reviews the multiple imputation framework and its standard results.
Abstract: Multiple imputation was designed to handle the problem of missing data in public-use data bases where the data-base constructor and the ultimate user are distinct entities. The objective is valid frequency inference for ultimate users who in general have access only to complete-data software and possess limited knowledge of specific reasons and models for nonresponse. For this situation and objective, I believe that multiple imputation by the data-base constructor is the method of choice. This article first provides a description of the assumed context and objectives, and second, reviews the multiple imputation framework and its standard results. These preliminary discussions are especially important because some recent commentaries on multiple imputation have reflected either misunderstandings of the practical objectives of multiple imputation or misunderstandings of fundamental theoretical results. Then, criticisms of multiple imputation are considered, and, finally, comparisons are made to alt...
3,495 citations
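The abstract above refers to the framework's standard results without listing them. As a hedged illustration, the sketch below pools m completed-data estimates using the combining rules commonly attributed to Rubin: the pooled estimate is the average of the per-imputation estimates, and the total variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name and input layout are assumptions.

```python
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Combine m completed-data analyses of one scalar quantity.

    estimates: point estimate from each imputed data set
    variances: squared standard error from each imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need matching inputs from at least two imputations")
    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Three imputations of the same regression coefficient (made-up numbers).
q_bar, total = pool_estimates([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The (1 + 1/m) factor inflates the between-imputation term to account for using a finite number of imputations, so the pooled standard error is the square root of `total` rather than of the average within-imputation variance alone.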
References
TL;DR: In this article, the authors studied square integrable coefficients of an irreducible representation of the non-unimodular $ax + b$-group and obtained explicit expressions in the case of a particular analyzing family that plays a role analogous to that of coherent states (Gabor wavelets) in the usual $L_2$-theory.
Abstract: An arbitrary square integrable real-valued function (or, equivalently, the associated Hardy function) can be conveniently analyzed into a suitable family of square integrable wavelets of constant shape (i.e., obtained by shifts and dilations from any one of them). The resulting integral transform is isometric and self-reciprocal if the wavelets satisfy an “admissibility condition” given here. Explicit expressions are obtained in the case of a particular analyzing family that plays a role analogous to that of coherent states (Gabor wavelets) in the usual $L_2$-theory. They are written in terms of a modified $\Gamma$-function that is introduced and studied. From the point of view of group theory, this paper is concerned with square integrable coefficients of an irreducible representation of the nonunimodular $ax + b$-group.
3,423 citations
TL;DR: In this article, the authors derived a period determination technique that is well suited to nonsinusoidal time variation covered by only a few irregularly spaced observations and applied it to the double-mode Cepheid BK Cen.
Abstract: We derive a period determination technique that is well suited to the case of nonsinusoidal time variation covered by only a few irregularly spaced observations. A detailed statistical analysis allows comparison with other techniques and indicates the optimum choice of parameters for a given problem. Application to the double-mode Cepheid BK Cen demonstrates the applicability of these methods to difficult cases. Using 49 photoelectric points, we obtain the two primary oscillatory components as well as the principal mode-interaction term; the derived periods are in agreement with previous estimates.
1,536 citations
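The abstract above does not spell out its technique. One common way to search for a period in sparse, irregularly sampled, nonsinusoidal data is to fold the observations at a trial period and measure how much scatter remains within phase bins; the sketch below does that. Treating this phase-binning statistic as representative of the paper's method is an assumption, as are the function name, the bin count, and the synthetic sawtooth signal.

```python
import random
from statistics import variance


def phase_dispersion(times, values, period, n_bins=5):
    """Ratio of pooled within-bin variance to overall variance after folding.

    Values near 0 mean the folded curve is tight (a good trial period);
    values near 1 mean folding explains none of the scatter.
    """
    bins = [[] for _ in range(n_bins)]
    for t, v in zip(times, values):
        phase = (t / period) % 1.0
        bins[min(int(phase * n_bins), n_bins - 1)].append(v)
    usable = [b for b in bins if len(b) > 1]  # variance needs two points
    dof = sum(len(b) - 1 for b in usable)
    if dof == 0:
        raise ValueError("no phase bin holds two or more observations")
    pooled = sum((len(b) - 1) * variance(b) for b in usable) / dof
    return pooled / variance(values)


# Synthetic example: a sawtooth with period 3.0 sampled at irregular times.
rng = random.Random(0)
times = [rng.uniform(0.0, 100.0) for _ in range(60)]
values = [(t / 3.0) % 1.0 for t in times]
at_true_period = phase_dispersion(times, values, 3.0)
at_wrong_period = phase_dispersion(times, values, 2.2)
```

Scanning a grid of trial periods and keeping the one with the smallest ratio gives a period estimate; no sinusoidal shape is assumed, which is why binning approaches suit light curves like the one described.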
01 Jul 1987
TL;DR: This tutorial paper places bispectrum estimation in a digital signal processing framework to help engineers grasp the utility of the available bispectrum estimation techniques, discusses application problems that can directly benefit from the use of the bispectrum, and aims to motivate research in this area.
Abstract: It is the purpose of this tutorial paper to place bispectrum estimation in a digital signal processing framework in order to aid engineers in grasping the utility of the available bispectrum estimation techniques, to discuss application problems that can directly benefit from the use of the bispectrum, and to motivate research in this area. Three general reasons are behind the use of bispectrum in signal processing and are addressed in the paper: to extract information due to deviations from normality, to estimate the phase of parametric signals, and to detect and characterize the properties of nonlinear mechanisms that generate time series.
1,413 citations
TL;DR: This paper studies the higher-order spectra, or polyspectra, of multivariate stationary time series: their mathematical properties, estimates based on an observed stretch of the series, the statistical properties of those estimates, and several applications of the results.
Abstract: The subject of this paper is the higher-order spectra or polyspectra of multivariate stationary time series. The intent is to derive (i) certain mathematical properties of polyspectra, (ii) estimates of polyspectra based on an observed stretch of time series, (iii) certain statistical properties of the proposed estimates and (iv) several applications of the results obtained.
524 citations
TL;DR: The γ-ray burst detector Konus, on the Venera 11 and Venera 12 spacecraft, detected two hard X-ray bursts from the same source on 5 and 6 March 1979; the 5 March burst was very intense, particularly in its initial phase, while the 6 March burst was considerably weaker.
Abstract: The γ-ray burst detector Konus¹, on the Venera 11 and Venera 12 spacecraft, detected on 5 and 6 March, 1979 two bursts of hard X rays originating from the same source. These events are quite unusual and of considerable interest. The burst of 5 March was very intense, particularly in the initial phase. This event was also observed by several other spacecraft². The second burst on 6 March was considerably weaker. The observations reported here permitted us to obtain a detailed time structure of the bursts, to measure their energy spectra and to locate the source on the celestial sphere.
461 citations