
Showing papers on "Outlier published in 1978"


Journal ArticleDOI
TL;DR: In this paper, the influence function is used to develop criteria for detecting outliers in discriminant analysis; for Mahalanobis' D2 it is a quadratic function of the deviation of the discriminant score for the perturbed observation from the discriminant score for the mean of the corresponding group.
Abstract: The influence function is used to develop criteria for detecting outliers in discriminant analysis. For Mahalanobis' D2, the influence function is a quadratic function of the deviation of the discriminant score for the perturbed observation from the discriminant score for the mean of the corresponding group. A chi-squared approximation to the null distribution of the influence function values appears to be suitable for graphical representation.
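As a rough illustration of the quantities involved (a minimal sketch, not the paper's influence-function formula), the following Python snippet computes pooled-covariance discriminant coefficients, Mahalanobis' D2 between two simulated groups, and the squared deviation of each observation's discriminant score from its group-mean score as an outlier diagnostic; the data and the planted outlier are invented.

    import numpy as np

    rng = np.random.default_rng(0)

    # Two simulated groups with a common covariance; one planted outlier.
    g1 = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=50)
    g2 = rng.multivariate_normal([2, 1], [[1, 0.5], [0.5, 1]], size=50)
    g1[0] = [6.0, -4.0]                                   # planted outlier in group 1

    m1, m2 = g1.mean(axis=0), g2.mean(axis=0)
    S = ((len(g1) - 1) * np.cov(g1.T) + (len(g2) - 1) * np.cov(g2.T)) \
        / (len(g1) + len(g2) - 2)                         # pooled covariance matrix
    a = np.linalg.solve(S, m1 - m2)                       # discriminant coefficients
    D2 = (m1 - m2) @ a                                    # Mahalanobis D2 between groups

    # Diagnostic: squared deviation of each score from its group-mean score.
    scores = g1 @ a
    dev2 = (scores - m1 @ a) ** 2
    print("D2 =", round(float(D2), 3), "; most extreme in group 1:", int(np.argmax(dev2)))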

94 citations


Journal ArticleDOI
TL;DR: In this paper, the authors investigated cases in which disturbances are normally distributed with constant variance except for one or more outliers whose disturbances are taken from a normal distribution with a much larger variance.
Abstract: Previous research has indicated that minimum absolute deviations (MAD) estimators tend to be more efficient than ordinary least squares (OLS) estimators in the presence of large disturbances. Via Monte Carlo sampling this study investigates cases in which disturbances are normally distributed with constant variance except for one or more outliers whose disturbances are taken from a normal distribution with a much larger variance. It is found that MAD estimation retains its advantage over OLS through a wide range of conditions, including variations in outlier variance, number of regressors, number of observations, design matrix configuration, and number of outliers. When no outliers are present, the efficiency of MAD estimators relative to OLS exhibits remarkably slight variation.
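The design of this comparison is easy to reproduce in outline. The sketch below is a minimal version that assumes statsmodels is available and uses median regression (QuantReg at q = 0.5) as the minimum-absolute-deviations fit; the sample size, outlier variance, and replication count are arbitrary choices, not the paper's.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n, reps, true_slope = 50, 500, 2.0
    ols_err, mad_err = [], []

    for _ in range(reps):
        x = rng.uniform(0, 10, n)
        X = sm.add_constant(x)
        e = rng.normal(0, 1, n)
        e[:2] = rng.normal(0, 10, 2)                      # two outliers drawn from a wider normal
        y = 1.0 + true_slope * x + e
        ols_err.append((sm.OLS(y, X).fit().params[1] - true_slope) ** 2)
        mad_err.append((sm.QuantReg(y, X).fit(q=0.5).params[1] - true_slope) ** 2)

    print("slope MSE, OLS:", round(np.mean(ols_err), 4))
    print("slope MSE, MAD:", round(np.mean(mad_err), 4))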

52 citations


Journal ArticleDOI
TL;DR: The authors propose a two-stage test for the presence of one or two outliers in two-way tables; percentage points for the test statistics are estimated by Monte Carlo sampling, and approximations to the percentage points are suggested.
Abstract: Previous work by Gentleman and Wilk on outliers in two-way tables is summarized, and their statistic QK is discussed. Instead of a plot of QK values, we propose a two-stage test for the presence of two outliers or one outlier. Percentage points for the test statistics are estimated by Monte Carlo generations, and approximations to the percentage points are suggested. The 8 × 12 case is examined in detail and extensions to other cases are briefly discussed.
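A simplified analogue of the procedure is sketched below: an additive (row plus column effects) least-squares fit, the largest absolute residual as the test statistic, and its percentage points estimated by Monte Carlo, here for the 8 × 12 layout. This is not Gentleman and Wilk's QK statistic itself; it only illustrates the mechanics described in the abstract.

    import numpy as np

    rng = np.random.default_rng(2)
    r, c = 8, 12                                          # the 8 x 12 case

    def max_abs_residual(table):
        # Residuals from the additive (row + column effects) least-squares fit.
        fit = (table.mean(axis=1, keepdims=True)
               + table.mean(axis=0, keepdims=True) - table.mean())
        return np.max(np.abs(table - fit))

    # Monte Carlo percentage point under the null (no outliers, unit variance).
    null = np.array([max_abs_residual(rng.normal(size=(r, c))) for _ in range(2000)])
    crit95 = np.quantile(null, 0.95)

    y = rng.normal(size=(r, c))
    y[3, 7] += 6.0                                        # one planted outlier
    print("observed:", round(max_abs_residual(y), 2), " 95% point:", round(crit95, 2))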

39 citations


Journal ArticleDOI
TL;DR: In this article, it is proposed that bivariate data should be trimmed of those points which define the convex hull, for robust estimation of the product-moment correlation coefficient.
Abstract: Summary: It is proposed that bivariate data should be trimmed of those points which define the convex hull, for robust estimation of the product-moment correlation coefficient. Properties of this method are examined by a Monte Carlo investigation. Other applications are mentioned. The product-moment correlation coefficient, like many other parametric estimators, is sensitive to outliers and disturbances in the tails of the bivariate distribution of quantitative variables. For this reason there is much to recommend the routine application of a trimming procedure before this statistic is calculated. Previous authors, e.g. Nath (1971), have investigated what may be termed a rectangular trimming procedure: that is, each distribution independently is truncated in each tail. While this method has the virtue of simplicity, there may be certain objections to its use. Firstly, it takes no account of the multivariate structure of the data. This may well be important, particularly if the truncated sample will be used in more complex multivariate procedures. Secondly, the rectangular trimmed product-moment correlation coefficient is almost certain to be biased toward zero as an estimate of the population correlation. Nath (1971) gives an example of a correlation of 0.79 which reduced to 0.65 on single truncation of either distribution, resulting in 23 per cent of the sample being eliminated. A bias correction may be calculated assuming a bivariate normal distribution, but is computationally tedious, even though Dyer (1973) has proposed a method avoiding complex iteration.
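The trimming rule itself is simple to sketch with scipy's ConvexHull; the simulated data, correlation, and planted outlier below are illustrative only.

    import numpy as np
    from scipy.spatial import ConvexHull

    rng = np.random.default_rng(3)
    n = 100
    x = rng.normal(size=n)
    y = 0.8 * x + 0.6 * rng.normal(size=n)
    x[0], y[0] = 6.0, -6.0                                # one gross outlier

    xy = np.column_stack([x, y])
    hull = ConvexHull(xy)                                 # points defining the convex hull
    keep = np.setdiff1d(np.arange(n), hull.vertices)      # trim the hull points

    print("r, all points:  ", round(np.corrcoef(x, y)[0, 1], 3))
    print("r, hull trimmed:", round(np.corrcoef(x[keep], y[keep])[0, 1], 3))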

34 citations


Journal ArticleDOI
TL;DR: In this paper, the authors examined the effect of various correlation structures of observations on rules for estimating a mean which are designed to guard against the possibility of spurious observations (that is, observations generated in a manner not intended).
Abstract: This paper examines the effect of various correlation structures of observations on rules for estimating a mean which are designed to guard against the possibility of spurious observations (that is, observations generated in a manner not intended). The premium and protection of these rules are evaluated and discussed for the equi-correlation case and for the case of an autoregressive process of first order. It is shown that the premium and protection of a given rule, which is designed for the estimator of a general mean mu when spuriosity may exist and when the observations are independent, lack robustness to departures from independence. It is also shown that in moderate-sized samples a spurious observation could seriously bias the usual estimator of the autoregressive coefficient alpha. One application of these results is in the case of a first order autoregressive model, which can be used to represent many time series data encountered in business and economics.
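The last point is easy to illustrate by simulation; the sketch below uses an assumed alpha of 0.6, a sample of 30 observations, and a single spurious observation shifted by 8, none of which come from the paper.

    import numpy as np

    rng = np.random.default_rng(4)
    n, alpha, reps = 30, 0.6, 2000

    def ar1(n, alpha, rng):
        x = np.zeros(n)
        for t in range(1, n):
            x[t] = alpha * x[t - 1] + rng.normal()
        return x

    def alpha_hat(x):
        # Usual least-squares estimator of the first-order autoregressive coefficient.
        return np.sum(x[1:] * x[:-1]) / np.sum(x[:-1] ** 2)

    clean, contaminated = [], []
    for _ in range(reps):
        x = ar1(n, alpha, rng)
        clean.append(alpha_hat(x))
        x[n // 2] += 8.0                                  # one spurious observation
        contaminated.append(alpha_hat(x))

    print("mean estimate, clean:       ", round(np.mean(clean), 3))
    print("mean estimate, contaminated:", round(np.mean(contaminated), 3))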

32 citations


Book ChapterDOI
01 Jan 1978
Abstract: The bias and mean square error of various location estimators, expressible as linear functions of order statistics, are studied when an unidentified single outlier is present in a sample of size n. Specific attention is paid to the cases when the outlier comes from a population differing from the target population either in location or scale. When, in addition, the target population is normal, exact numerical results have been obtained for n = 5, 10, 20 and are presented here for n = 10. The estimators included are the mean, median, trimmed means, Winsorized means, linearly weighted means, and Gastwirth mean.
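A Monte Carlo version of this comparison, for n = 10 and a single outlier shifted in location, is sketched below. It covers only some of the estimators listed (mean, median, 10% trimmed mean, 10% Winsorized mean, Gastwirth mean), uses an assumed shift of four standard deviations, and approximates bias and mean square error by simulation rather than exactly.

    import numpy as np
    from scipy import stats
    from scipy.stats.mstats import winsorize

    rng = np.random.default_rng(5)
    n, shift, reps = 10, 4.0, 5000

    def gastwirth(x):
        # Gastwirth estimator: 0.3*(33rd pct) + 0.4*(median) + 0.3*(67th pct).
        q33, q50, q67 = np.percentile(x, [100 / 3, 50, 200 / 3])
        return 0.3 * q33 + 0.4 * q50 + 0.3 * q67

    estimators = {
        "mean":        np.mean,
        "median":      np.median,
        "10% trimmed": lambda x: stats.trim_mean(x, 0.1),
        "winsorized":  lambda x: winsorize(x, limits=(0.1, 0.1)).mean(),
        "gastwirth":   gastwirth,
    }

    results = {name: [] for name in estimators}
    for _ in range(reps):
        x = rng.normal(0.0, 1.0, n)                       # target population N(0, 1)
        x[0] += shift                                     # single location-shifted outlier
        for name, est in estimators.items():
            results[name].append(est(x))

    for name, vals in results.items():
        print(f"{name:12s} bias {np.mean(vals):+.3f}   MSE {np.mean(np.square(vals)):.3f}")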

31 citations


Journal ArticleDOI
TL;DR: A graphical method is described for the examination of the behaviour of tests for outliers when one or two outliers of varying magnitude are present in the data.
Abstract: A graphical method is described for the examination of the behaviour of tests for outliers when one or two outliers of varying magnitude are present in the data. The method involves computing a sensitivity surface for the test statistic and, from this surface, determining contours corresponding to selected critical values. The contour diagrams so formed give an indication of the behaviour of the test statistic for a variety of data configurations. The tests for the single sample case include one‐outlier tests, many‐outlier tests and sequentially applied tests for a specified maximum number of outliers.
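The kind of surface described can be computed directly for a simple test statistic. The sketch below uses a Grubbs-type maximum studentized deviation on a fixed simulated sample of 20 with two contaminants of varying magnitude; it stands in for, rather than reproduces, the tests studied in the paper.

    import numpy as np

    rng = np.random.default_rng(6)
    base = rng.normal(size=20)                            # fixed clean sample
    grid = np.linspace(0, 6, 61)                          # magnitudes of the two contaminants

    def test_stat(x):
        # Grubbs-type statistic: max |x - xbar| / s.
        return np.max(np.abs(x - x.mean())) / x.std(ddof=1)

    surface = np.empty((grid.size, grid.size))
    for i, d1 in enumerate(grid):
        for j, d2 in enumerate(grid):
            x = base.copy()
            x[0] += d1                                    # first outlier magnitude
            x[1] += d2                                    # second outlier magnitude
            surface[i, j] = test_stat(x)

    # Contouring `surface` at the test's critical values (e.g. with
    # matplotlib.pyplot.contour) gives the kind of diagram described above.
    print("surface range:", round(float(surface.min()), 2), "to", round(float(surface.max()), 2))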

16 citations


DissertationDOI
01 Jan 1978

14 citations


Journal ArticleDOI
TL;DR: In this article, three well-known test statistics for the presence of two outliers are studied from the points of view of exact null distribution and power, and a recursive version of the Pearson-Chandrasekar test is shown to perform well.
Abstract: Summary Three well-known test statistics for the presence of two outliers are studied from the points of view of exact null distribution and power. In a sample case, it is shown that the Murphy two-outlier statistic may suffer from masking. A recursive version of the Pearson-Chandrasekar test is shown to perform well.
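The masking effect mentioned here is easy to demonstrate with a generic single-outlier statistic (not the Murphy or Pearson-Chandrasekar statistics themselves): adding a second, similar outlier inflates the sample standard deviation and depresses the statistic. The numbers below are invented.

    import numpy as np

    rng = np.random.default_rng(7)
    clean = rng.normal(size=18)

    def one_outlier_stat(x):
        # Grubbs-type statistic for a single outlier: max |x - xbar| / s.
        return np.max(np.abs(x - x.mean())) / x.std(ddof=1)

    one = np.append(clean, 5.5)                           # one large value
    two = np.append(clean, [5.5, 5.8])                    # plus a second, similar large value

    print("statistic, one outlier: ", round(float(one_outlier_stat(one)), 2))
    print("statistic, two outliers:", round(float(one_outlier_stat(two)), 2))
    # The second outlier inflates the mean and s, so the single-outlier statistic
    # drops even though the sample is more contaminated: masking.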

8 citations


ReportDOI
01 May 1978
TL;DR: In this article, the general track smoothing program (MASM3DRJ) used at NUWES is examined; it uses linear, parabolic, and logarithmic functions to fit 3-D data files on torpedo paths by the method of least squares.
Abstract: The general track smoothing program (MASM3DRJ) in use at NUWES uses linear, parabolic, and logarithmic functions to fit 3-D data files on torpedo paths by the method of least squares. Polynomial functions of the first (linear), second (parabolic), third, and fourth orders were fitted to data for a variety of path segments of a torpedo run at NUWES using the method of least squares. Results suggest that expanding the program to include higher-order polynomials and fitting shorter path segments would provide a substantial reduction in residual errors. The method of sequential differences was tried on the data and can be incorporated in the smoothing program as a means of identifying outlier data points and of selecting the appropriate polynomial order for fitting the data.
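Neither the NUWES data nor MASM3DRJ is available here, but the two ideas, comparing least-squares polynomial fits of increasing order and using sequential differences to flag bad points, can be sketched on invented data:

    import numpy as np

    rng = np.random.default_rng(8)
    t = np.linspace(0, 60, 120)                           # time along a path segment
    depth = 50 + 0.8 * t - 0.01 * t**2 + rng.normal(0, 0.5, t.size)
    depth[40] += 8.0                                      # one bad data point

    # Least-squares polynomial fits of orders 1 (linear) through 4.
    for order in (1, 2, 3, 4):
        resid = depth - np.polyval(np.polyfit(t, depth, order), t)
        print(f"order {order}: RMS residual {np.sqrt(np.mean(resid ** 2)):.3f}")

    # Sequential (first) differences flag the outlier as two adjacent large jumps.
    d = np.diff(depth)
    print("largest |difference| at position:", int(np.argmax(np.abs(d))))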

5 citations


Journal ArticleDOI
TL;DR: In this paper, a modification of the rule suggested by Tiao and Guttman (1967) is considered for the estimation of the mean of a normal population using data with an unspecified number of outliers.
Abstract: A modification of the rule suggested by Tiao and Guttman (1967) is considered for the estimation of the mean of a normal population using data with an unspecified number of outliers. The properties of the modified rule are investigated by considering its premium and protection. For the case where the number of outliers, i, is specified, it is shown that the Tiao-Guttman tables for m = 2 are adequate for all values of m.
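Premium and protection lend themselves to simulation. The sketch below evaluates them for a simple, illustrative rejection rule, taking premium as the proportional increase in mean square error over the sample mean when no spurious observation is present and protection as the proportional reduction when one is present; the rule, the normalization, and all constants are assumptions here, not the paper's.

    import numpy as np

    rng = np.random.default_rng(9)
    n, reps, shift = 10, 20000, 4.0

    def reject_rule(x, c=2.0):
        # Illustrative rule: discard observations more than c sample SDs from the mean.
        keep = np.abs(x - x.mean()) <= c * x.std(ddof=1)
        return x[keep].mean()

    def mse(estimator, contaminated):
        errs = []
        for _ in range(reps):
            x = rng.normal(0.0, 1.0, n)
            if contaminated:
                x[0] += shift                             # one spurious observation
            errs.append(estimator(x) ** 2)                # true mean is 0
        return np.mean(errs)

    mse_mean_null = mse(np.mean, False)
    mse_mean_spur = mse(np.mean, True)
    premium = (mse(reject_rule, False) - mse_mean_null) / mse_mean_null
    protection = (mse_mean_spur - mse(reject_rule, True)) / mse_mean_spur
    print(f"premium    {premium:.3f}")
    print(f"protection {protection:.3f}")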

Book ChapterDOI
01 Jan 1978
TL;DR: In this article, the problem of outliers in the regression model is considered and the use of the maximum absolute studentized residual, Rn, for identification of the outlier has been suggested by a number of authors.
Abstract: The problem of outliers in the regression model is considered. For the case of one outlier at most, the use of the maximum absolute studentized residual, Rn, for identification of the outlier has been suggested by a number of authors. Simulation studies of the power of a conservative test based on Rn for identifying single outliers in regression models with one, two, and three independent variables are reported. The case of multiple outliers is also considered and techniques for their identification are discussed. A simulation study of a sequential procedure for handling two outliers is reported.
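Rn is straightforward to compute directly from the hat matrix. A minimal sketch on simulated data with two independent variables (plus a constant) and one planted outlier, using internally studentized residuals:

    import numpy as np

    rng = np.random.default_rng(10)
    n = 30
    X = np.column_stack([np.ones(n), rng.uniform(0, 10, n), rng.uniform(0, 5, n)])
    y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(0, 1, n)
    y[7] += 6.0                                           # one planted outlier

    # Hat matrix and internally studentized residuals.
    H = X @ np.linalg.inv(X.T @ X) @ X.T
    e = y - H @ y
    s2 = e @ e / (n - X.shape[1])
    r = e / np.sqrt(s2 * (1 - np.diag(H)))

    Rn = np.max(np.abs(r))                                # maximum absolute studentized residual
    print("Rn =", round(float(Rn), 3), "at observation", int(np.argmax(np.abs(r))))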

Journal ArticleDOI
TL;DR: In this paper, the presence of an outlier score (that is, an extreme score for an independent t test or an extreme difference score for a correlated t test) makes it difficult to get a t large enough for a t table to show significance.
Abstract: Summary The presence of an outlier score (that is, an extreme score for an independent t test or an extreme difference score for a correlated t test) makes it difficult to get a t large enough for a t table to show significance. When there are outlier scores, the use of a randomization test to determine significance is more likely to reveal a treatment effect than the use of t tables.
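The contrast is easy to demonstrate on invented data: one extreme score inflates the pooled standard deviation in the t denominator, while a randomization test that permutes group labels and compares differences in means is not affected in the same way.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(11)
    treated = np.array([3.1, 2.7, 3.4, 2.9, 3.2, 3.0, 2.8, 14.0])   # one outlier score
    control = np.array([2.1, 1.9, 2.3, 2.0, 2.2, 1.8, 2.4, 2.1])

    # Ordinary independent-samples t test.
    t_stat, p_t = stats.ttest_ind(treated, control)

    # Randomization test: permute group labels, use the difference in means.
    pooled = np.concatenate([treated, control])
    obs_diff = treated.mean() - control.mean()
    exceed, n_perm = 0, 10000
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        diff = perm[:treated.size].mean() - perm[treated.size:].mean()
        if abs(diff) >= abs(obs_diff):
            exceed += 1
    p_rand = exceed / n_perm

    print("t test p-value:       ", round(float(p_t), 4))
    print("randomization p-value:", round(p_rand, 4))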


Book ChapterDOI
TL;DR: In this article, the authors discuss the real-time validation of air quality data and propose to use robust statistical parameters to minimize the role of high and low data in the validation process.
Abstract: Publisher Summary This chapter discusses the real-time validation of air quality data. Collecting air quality data consists of the following steps: sampling, analysis, data processing, and data storage. The first two steps may entail many mistakes, which greatly influence the value of the collected data. Therefore, complementary methods for validation of air quality data are necessary. The need for validation is strong when the reliability of the chosen techniques is low, and it is especially strong when dealing with data outliers. One way to solve the problem of air quality data validation is to minimize the role of high and low data by using robust statistical parameters. Instead of an arithmetic mean, the median should be preferred in the case of homogeneous distributions. When data from two networks are compared, quartiles rather than extreme percentiles should be used. This solution of the outlier problem, however, has only limited value. From the point of view of community health, it is no solution at all. Another way to solve the problem of air quality data validation is to check a number of essential steps in the process of data acquisition and data transmission and to check a number of parameters in the monitors.
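A small numerical illustration of the recommendation, on invented hourly readings with two faulty spikes: the median and quartiles are barely moved, while the arithmetic mean and an extreme percentile are dominated by the spikes.

    import numpy as np

    # Hypothetical hourly concentration readings with two faulty spikes.
    readings = np.array([18, 22, 19, 25, 21, 23, 20, 24, 19, 410, 22, 20, 385, 21])

    print("arithmetic mean:", round(float(np.mean(readings)), 1))   # pulled up by the spikes
    print("median:         ", float(np.median(readings)))
    print("quartiles:      ", np.percentile(readings, [25, 75]))
    print("98th percentile:", round(float(np.percentile(readings, 98)), 1))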