
Showing papers on "Outlier published in 1996"


Journal ArticleDOI
TL;DR: It is shown how prior assumptions about the spatial structure of outliers can be expressed as constraints on the recovered analog outlier processes and how traditional continuation methods can be extended to the explicit outlier-process formulation.
Abstract: The modeling of spatial discontinuities for problems such as surface recovery, segmentation, image reconstruction, and optical flow has been intensely studied in computer vision. While “line-process” models of discontinuities have received a great deal of attention, there has been recent interest in the use of robust statistical techniques to account for discontinuities. This paper unifies the two approaches. To achieve this we generalize the notion of a “line process” to that of an analog “outlier process” and show how a problem formulated in terms of outlier processes can be viewed in terms of robust statistics. We also characterize a class of robust statistical problems for which an equivalent outlier-process formulation exists and give a straightforward method for converting a robust estimation problem into an outlier-process formulation. We show how prior assumptions about the spatial structure of outliers can be expressed as constraints on the recovered analog outlier processes and how traditional continuation methods can be extended to the explicit outlier-process formulation. These results indicate that the outlier-process approach provides a general framework which subsumes the traditional line-process approaches as well as a wide class of robust estimation problems. Examples in surface reconstruction, image segmentation, and optical flow are presented to illustrate the use of outlier processes and to show how the relationship between outlier processes and robust statistics can be exploited. An appendix provides a catalog of common robust error norms and their equivalent outlier-process formulations.
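As a small, self-contained illustration of the norm/outlier-process equivalence described above (not code from the paper), the sketch below uses a rescaled Geman-McClure error norm, one member of the family of robust norms catalogued in such appendices; the scaling, the penalty psi and the parameter sigma are chosen here so that the outlier variable z lies in [0, 1], and conventions differ across papers.

```python
import numpy as np

# Rescaled Geman-McClure norm: rho(x) = sigma^2 * x^2 / (sigma^2 + x^2).
def rho(x, sigma=1.0):
    return sigma**2 * x**2 / (sigma**2 + x**2)

# Penalty paid for "turning off" a residual; z -> 0 marks an outlier.
def psi(z, sigma=1.0):
    return sigma**2 * (np.sqrt(z) - 1.0)**2

# Closed-form minimizer of z*x^2 + psi(z): the analog outlier process.
def z_star(x, sigma=1.0):
    return (sigma**2 / (sigma**2 + x**2))**2

if __name__ == "__main__":
    sigma = 1.5
    zgrid = np.linspace(1e-6, 1.0, 20001)
    for x in np.linspace(-6.0, 6.0, 13):
        direct = rho(x, sigma)
        via_process = np.min(zgrid * x**2 + psi(zgrid, sigma))
        assert abs(direct - via_process) < 1e-4
    print("rho(x) == min_z [ z*x^2 + psi(z) ] verified on a grid")
```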

752 citations


Journal ArticleDOI
TL;DR: Alternative techniques drawn from the fields of resistant, robust and non-parametric statistics, which are usually much less affected by the presence of ‘outliers’ and other forms of non-normality, are presented.
Abstract: Basic traditional parametric statistical techniques are used widely in climatic studies for characterizing the level (central tendency) and variability of variables, assessing linear relationships (including trends), detection of climate change, quality control and assessment, identification of extreme events, etc. These techniques may involve estimation of parameters such as the mean (a measure of location), variance (a measure of scale) and correlation/regression coefficients (measures of linear association); in addition, it is often desirable to estimate the statistical significance of the difference between estimates of the mean from two different samples as well as the significance of estimated measures of association. The validity of these estimates is based on underlying assumptions that sometimes are not met by real climate data. Two of these assumptions are addressed here: normality and homogeneity (and as a special case statistical stationarity); in particular, contamination from a relatively few ‘outlying values’ may greatly distort the estimates. Sometimes these common techniques are used in order to identify outliers; ironically they may fail because of the presence of the outliers! Alternative techniques drawn from the fields of resistant, robust and non-parametric statistics are usually much less affected by the presence of ‘outliers’ and other forms of non-normality. Some of the theoretical basis for the alternative techniques is presented as motivation for their use and to provide quantitative measures for their performance as compared with the traditional techniques that they may replace. Although this work is by no means exhaustive, typically a couple of suitable alternatives are presented for each of the common statistical quantities/tests mentioned above. All of the technical details needed to apply these techniques are presented in an extensive appendix. With regard to the issue of homogeneity of the climate record, a powerful non-parametric technique is introduced for the objective identification of ‘change-points’ (discontinuities) in the mean. These may arise either naturally (abrupt climate change) or as the result of errors or changes in instruments, recording practices, data transmission, processing, etc. The change-point test is able to identify multiple discontinuities and requires no ‘metadata’ or comparison with neighbouring stations; these are important considerations because instrumental changes are not always documented and, particularly with regard to radiosonde observations, suitable neighbouring stations for ‘buddy checks’ may not exist. However, when such auxiliary information is available it may be used as independent confirmation of the artificial nature of the discontinuities. The application and practical advantages of these alternative techniques are demonstrated using primarily actual radiosonde station data and in a few cases using some simulated (artificial) data as well. The ease with which suitable examples were obtained from the radiosonde archive begs for serious consideration of these techniques in the analysis of climate data.
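As a minimal, made-up illustration of the resistant alternatives discussed above (median, MAD and trimmed mean versus mean and standard deviation), assuming scipy is available; the numbers are invented, not the radiosonde records used in the paper.

```python
import numpy as np
from scipy import stats

# A short series with one spurious value (e.g. a transcription error).
x = np.array([14.2, 13.8, 15.1, 14.6, 13.9, 14.4, 15.0, 14.1, 48.0])

print("mean            :", round(x.mean(), 2))                 # pulled toward the outlier
print("median          :", np.median(x))                       # resistant location
print("std             :", round(x.std(ddof=1), 2))            # inflated scale
print("MAD (normalized):", round(stats.median_abs_deviation(x, scale="normal"), 2))
print("10% trimmed mean:", round(stats.trim_mean(x, 0.1), 2))
```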

574 citations


Journal ArticleDOI
TL;DR: The question of what levels of contamination can be detected by this algorithm as a function of dimension, computation time, sample size, contamination fraction, and distance of the contamination from the main body of data is investigated.
Abstract: New insights are given into why the problem of detecting multivariate outliers can be difficult and why the difficulty increases with the dimension of the data. Significant improvements in methods for detecting outliers are described, and extensive simulation experiments demonstrate that a hybrid method extends the practical boundaries of outlier detection capabilities. Based on simulation results and examples from the literature, the question of what levels of contamination can be detected by this algorithm as a function of dimension, computation time, sample size, contamination fraction, and distance of the contamination from the main body of data is investigated. Software to implement the methods is available from the authors and STATLIB.
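The paper's hybrid algorithm is not reproduced here; as background, the sketch below shows the classical Mahalanobis-distance screen with a chi-square cutoff that such methods are designed to improve upon, since it is exactly this baseline that suffers from masking under heavy contamination. scipy and the synthetic data are assumptions of the sketch.

```python
import numpy as np
from scipy import stats

def mahalanobis_screen(X, alpha=0.025):
    """Classical multivariate outlier screen: flag points whose squared
    Mahalanobis distance exceeds a chi-square quantile."""
    mu = X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    d2 = np.einsum('ij,jk,ik->i', X - mu, np.linalg.inv(S), X - mu)
    cutoff = stats.chi2.ppf(1 - alpha, df=X.shape[1])
    return d2 > cutoff, d2

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:10] += 6.0                      # shifted contamination, 5% of the sample
flags, d2 = mahalanobis_screen(X)
print(flags[:10].sum(), "of 10 planted outliers flagged")
```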

342 citations


Journal ArticleDOI
TL;DR: In this article, a robust estimator and exploratory statistical methods for the detection of gross errors as the data reconciliation is performed are discussed; these methods are insensitive to departures from ideal statistical distributions and to the presence of outliers.
Abstract: Gross-error detection plays a vital role in parameter estimation and data reconciliation for dynamic and steady-state systems. Data errors due to miscalibrated or faulty sensors, or just random events nonrepresentative of the underlying statistical distribution, can induce heavy biases in parameter estimates and reconciled data. Robust estimators and exploratory statistical methods for the detection of gross errors as the data reconciliation is performed are discussed. These methods are insensitive to departures from ideal statistical distributions and to the presence of outliers. Once the regression is done, the outliers can be detected readily by using exploratory statistical techniques. The optimization algorithm and the reconciled data offer the ability to classify variables according to their observability and redundancy properties. In this article, an observable variable is an unmeasured quantity that can be estimated from the measured variables through the physical model, while a nonredundant variable is a measured variable that cannot be estimated other than through its measurement. Variable classification can be used to help design instrumentation schemes. An efficient method for this classification of dynamic systems is developed. Variable classification and gross-error detection have important connections, and gross-error detection on nonredundant variables has to be performed with caution.
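As a hedged sketch of the general idea (robust reweighting inside data reconciliation), not the paper's estimator: the toy below reconciles three redundant measurements of the same stream subject to linear balance constraints, using Huber-type weights so that a gross error has only limited influence on the reconciled values. All names, constants and data are invented for illustration.

```python
import numpy as np

def reconcile(y, A, sigma, n_iter=10, c=1.345):
    """Linear data reconciliation with Huber-type robust reweighting:
    adjust measurements y to satisfy A @ x = 0 exactly, while iteratively
    downweighting measurements with large standardized adjustments."""
    w = np.ones_like(y)
    for _ in range(n_iter):
        V = np.diag(sigma**2 / w)                      # effective measurement covariance
        lam = np.linalg.solve(A @ V @ A.T, A @ y)      # Lagrange multipliers
        x = y - V @ A.T @ lam                          # reconciled values
        u = np.abs((y - x) / sigma)                    # standardized adjustments
        w = np.minimum(1.0, c / np.maximum(u, 1e-12))  # Huber weights
    return x, w

# Three redundant sensors on the same stream: x1 = x2 and x2 = x3.
A = np.array([[1.0, -1.0, 0.0],
              [0.0,  1.0, -1.0]])
sigma = np.array([0.5, 0.5, 0.5])
y = np.array([100.2, 99.8, 112.0])     # the third sensor carries a gross error
x, w = reconcile(y, A, sigma)
print("reconciled:", np.round(x, 2), " weights:", np.round(w, 2))
```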

198 citations


Journal ArticleDOI
TL;DR: In this paper, the authors defined estimators that improve on known S-estimators in having all of the following properties: (1) maximal breakdown for the given sample size and dimension; (2) ability completely to reject as outliers points that are far from the main mass of points; (3) convergence to good solutions with a modest amount of computation from a nonrobust starting point for large (though not near 50%) contamination.
Abstract: For the problem of robust estimation of multivariate location and shape, defining S-estimators using scale transformations of a fixed $\rho$ function regardless of the dimension, as is usually done, leads to a perverse outcome: estimators in high dimension can have a breakdown point approaching 50%, but still fail to reject as outliers points that are large distances from the main mass of points. This leads to a form of nonrobustness that has important practical consequences. In this paper, estimators are defined that improve on known S-estimators in having all of the following properties: (1) maximal breakdown for the given sample size and dimension; (2) ability completely to reject as outliers points that are far from the main mass of points; (3) convergence to good solutions with a modest amount of computation from a nonrobust starting point for large (though not near 50%) contamination. However, to attain maximal breakdown, these estimates, like other known maximal breakdown estimators, require large amounts of computational effort. This greater ability of the new estimators to reject outliers comes at a modest cost in efficiency and gross error sensitivity and at a greater, but finite, cost in local shift sensitivity.
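For reference, the standard S-estimator of multivariate location and shape that the discussion above starts from can be written as below; the paper's contribution, letting the $\rho$ function depend on the dimension, is not reproduced here.

```latex
\[
(\hat{\mu}, \hat{\Sigma}) \;=\; \arg\min_{\mu,\;\Sigma \succ 0} \det(\Sigma)
\quad \text{subject to} \quad
\frac{1}{n}\sum_{i=1}^{n} \rho\!\left(\sqrt{(x_i-\mu)^{\top}\Sigma^{-1}(x_i-\mu)}\,\right) = b_0 ,
\]
% with \rho bounded and nondecreasing on [0, \infty) and
% b_0 = E_\Phi[\rho(\|Z\|)] chosen for consistency at the normal model.
```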

193 citations


Journal ArticleDOI
TL;DR: In this paper, the authors show that the previously recommended critical values {p(n-1)/(n-p)}F_{p,n-p-1}... more precisely, that {p(n-1)/(n-p)}F_{p,n-p} are unsuitable, and that p(n-1)²F_{p,n-p-1}/{n(n-p-1+pF_{p,n-p-1})} are the correct critical values when searching for a single outlier.
Abstract: The Mahalanobis distance is a well-known criterion which may be used for detecting outliers in multivariate data. However, there are some discrepancies about which critical values are suitable for this purpose. Following a comparison with Wilks's method, this paper shows that the previously recommended {p(n-1)/(n-p)}F_{p,n-p} are unsuitable, and that p(n-1)²F_{p,n-p-1}/{n(n-p-1+pF_{p,n-p-1})} are the correct critical values when searching for a single outlier. The importance of which critical values should be used is illustrated when searching for a single outlier in a clinical laboratory data set containing 10 patients and five variables. The jackknifed Mahalanobis distance is also discussed and the relevant critical values are given. Finally, upper bounds for the usual Mahalanobis distance and the jackknifed version are discussed.
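The two critical values quoted in the abstract are straightforward to compute; the sketch below does so for the n = 10, p = 5 case mentioned there, using scipy (an assumption, not the paper's software) and an illustrative significance level, since the paper's exact per-test level and any Bonferroni handling are not reproduced here.

```python
from scipy.stats import f

def critical_values(n, p, alpha=0.05):
    """Critical values for the Mahalanobis distance criterion when testing
    for a single outlier (alpha is illustrative)."""
    # Previously recommended value (argued above to be unsuitable):
    F1 = f.ppf(1 - alpha, p, n - p)
    old = p * (n - 1) / (n - p) * F1
    # Correct critical value according to the paper:
    F2 = f.ppf(1 - alpha, p, n - p - 1)
    new = p * (n - 1)**2 * F2 / (n * (n - p - 1 + p * F2))
    return old, new

print(critical_values(n=10, p=5))
```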

192 citations


Journal ArticleDOI
TL;DR: A neural-network architecture and an instant learning algorithm that rapidly decides the weights of the designed single-hidden layer neural network that is able to achieve "one-shot" training as opposed to most iterative training algorithms in the literature.
Abstract: This paper presents a neural-network architecture and an instant learning algorithm that rapidly decides the weights of the designed single-hidden-layer neural network. For an n-dimensional N-pattern training set, with a constant bias, a maximum of N-r-1 hidden nodes is required to learn the mapping within a given precision (where r is the rank, usually the dimension, of the input patterns). For off-line training, the proposed network and algorithm are able to achieve "one-shot" training, as opposed to most iterative training algorithms in the literature. An online training algorithm is also presented. Like most backpropagation-type learning algorithms, the given algorithm interpolates the training data. To eliminate outliers that may appear in erroneous training data, a robust weighted least squares method is proposed. The robust weighted least squares learning algorithm eliminates outlier samples and approximates the training data rather than interpolating them. The advantage of the designed network architecture is also mathematically proved. Several experiments show very promising results.
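The sketch below is not the authors' construction (which chooses the hidden layer so that N-r-1 nodes suffice); it only shows the generic idea of one-shot training, solving for the output weights of a single-hidden-layer network in one linear least-squares step, followed by an iteratively reweighted refit in the spirit of the robust weighted least squares mentioned above. The random hidden layer and all names are assumptions of the sketch.

```python
import numpy as np

def one_shot_train(X, y, n_hidden=30, seed=0):
    """Fit output weights of a single-hidden-layer net in one linear solve
    (generic sketch with random, fixed hidden weights)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                        # hidden activations
    H1 = np.column_stack([H, np.ones(len(X))])    # constant bias term
    beta, *_ = np.linalg.lstsq(H1, y, rcond=None)
    return W, b, H1, beta

def robust_refit(H1, y, beta, n_iter=10, c=2.5):
    """Iteratively reweighted least squares: downweight large residuals so
    that outlier samples are approximated rather than interpolated."""
    for _ in range(n_iter):
        r = y - H1 @ beta
        s = 1.4826 * np.median(np.abs(r - np.median(r))) + 1e-12
        u = np.abs(r) / s
        w = np.minimum(1.0, c / np.maximum(u, 1e-12))     # Huber-type weights
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(H1 * sw[:, None], y * sw, rcond=None)
    return beta

# Usage: W, b, H1, beta = one_shot_train(X, y); beta = robust_refit(H1, y, beta)
```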

161 citations


Journal ArticleDOI
TL;DR: The mean log squared error (MLSE) is proposed as an error criterion that can be easily adapted by most supervised learning algorithms, and simulation results indicate that the proposed method is robust against outliers.
Abstract: Most supervised neural networks (NNs) are trained by minimizing the mean squared error (MSE) of the training set. In the presence of outliers, the resulting NN model can differ significantly from the underlying system that generates the data. Two different approaches are used to study the mechanism by which outliers affect the resulting models: influence function and maximum likelihood. The mean log squared error (MLSE) is proposed as an error criterion that can be easily adapted by most supervised learning algorithms. Simulation results indicate that the proposed method is robust against outliers.
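A hedged sketch of a mean-log-squared-error style criterion: the exact form and constant used in the paper may differ, so log(1 + e²/2) is used here as one common robust choice, together with the per-sample weight it induces on a gradient-based learner.

```python
import numpy as np

def mse(e):
    return np.mean(e**2)

def mlse(e):
    """Mean log squared error style criterion (assumed form: log(1 + e^2/2))."""
    return np.mean(np.log1p(0.5 * e**2))

def mlse_weight(e):
    """Effective per-sample weight: d rho / d e divided by e. Large residuals
    receive weights approaching 0, so outliers barely move the fit."""
    return 1.0 / (1.0 + 0.5 * e**2)

e = np.array([0.1, -0.3, 0.2, 8.0])   # last residual is an outlier
print("MSE :", mse(e), " MLSE:", mlse(e))
print("weights:", mlse_weight(e))
```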

144 citations


Journal ArticleDOI
TL;DR: This article proposes a method for simultaneous variable selection and outlier identification based on the computation of posterior model probabilities, which avoids the problem that the selected model depends upon the order in which variable selection and outlier identification are carried out.

136 citations


Journal ArticleDOI
TL;DR: The class of mixture transition distribution (MTD) time series models is extended to general non-Gaussian time series and the stationarity and autocorrelation properties of the models are derived.
Abstract: The class of mixture transition distribution (MTD) time series models is extended to general non-Gaussian time series. In these models the conditional distribution of the current observation given the past is a mixture of conditional distributions given each one of the last p observations. They can capture non-Gaussian and nonlinear features such as flat stretches, bursts of activity, outliers and change-points in a single unified model class. They can also represent time series defined on arbitrary state spaces, univariate or multivariate, continuous, discrete or mixed, which need not even be Euclidean. They perform well in the usual case of Gaussian time series without obvious nonstandard behaviors. The models are simple, analytically tractable, easy to simulate and readily estimated. The stationarity and autocorrelation properties of the models are derived. A simple EM algorithm is given and shown to work well for estimation. The models are applied to several real and simulated datasets with satisfactory results.
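For reference, the defining mixture form of the MTD conditional distribution described above can be written, in generic notation, as:

```latex
% Mixture transition distribution (MTD) model of order p: the conditional
% density of x_t given the past is a mixture over lags,
\[
f\!\left(x_t \mid x_{t-1},\dots,x_{t-p}\right)
  \;=\; \sum_{g=1}^{p} w_g\, f\!\left(x_t \mid x_{t-g}\right),
\qquad w_g \ge 0,\quad \sum_{g=1}^{p} w_g = 1,
\]
% where each f(x_t | x_{t-g}) is a one-step conditional distribution (Gaussian
% or otherwise), so the same form covers discrete, continuous and mixed state spaces.
```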

124 citations


Journal ArticleDOI
TL;DR: An approach to Bayesian sensitivity analysis that uses an influence statistic and an outlier statistic to assess the sensitivity of a model to perturbations, and two alternative divergences are proposed and shown to be interpretable.
Abstract: This paper describes an approach to Bayesian sensitivity analysis that uses an influence statistic and an outlier statistic to assess the sensitivity of a model to perturbations. The basic outlier statistic is a Bayes factor, whereas the influence statistic depends strongly on the purpose of the analysis. The task of influence analysis is aided by having an interpretable influence statistic. Two alternative divergences, an L1-distance and a χ²-divergence, are proposed and shown to be interpretable. The Bayes factor and the proposed influence measures are shown to be summaries of the posterior of a perturbation function.

Journal ArticleDOI
TL;DR: A fairly general fuzzy regression technique is proposed based on the least-squares approach to estimate the modal value and the spreads separately, and suspicious outliers, that is, data points that are obviously and suspiciously lying outside the usual range, can be treated and their effects can be reduced.

Proceedings ArticleDOI
18 Jun 1996
TL;DR: A new operator, called MUSE (Minimum Unbiased Scale Estimator), evaluates a hypothesized fit over potential inlier sets via an objective function of unbiased scale estimates, and extracts the single best fit from the data by minimizing its objective function over a set of hypothesized fits.
Abstract: Despite many successful applications of robust statistics, they have yet to be completely adapted to many computer vision problems. Range reconstruction, particularly in unstructured environments, requires a robust estimator that not only tolerates a large outlier percentage but also tolerates several discontinuities, extracting multiple surfaces in an image region. Observing that random outliers and/or points from across discontinuities increase a hypothesized fit's scale estimate (standard deviation of the noise), our new operator, called MUSE (Minimum Unbiased Scale Estimator), evaluates a hypothesized fit over potential inlier sets via an objective function of unbiased scale estimates. MUSE extracts the single best fit from the data by minimizing its objective function over a set of hypothesized fits and can sequentially extract multiple surfaces from an image region. We show MUSE to be effective on synthetic data modelling small scale discontinuities and in preliminary experiments on complicated range data.

Proceedings ArticleDOI
25 Aug 1996
TL;DR: When reverse engineering a CAD model, it is necessary to integrate information from several views of an object into a common reference frame, using an improved version of the iterative closest point algorithm.
Abstract: When reverse engineering a CAD model, it is necessary to integrate information from several views of an object into a common reference frame. Given a rough initial alignment, further pose refinement here uses an improved version of the iterative closest point algorithm. Incremental adjustments are computed simultaneously for all data sets, resulting in a more globally optimal set of transformations. Also, thresholds for removing outlier correspondences are not needed, as the merging data sets are considered as a whole. Motion updates are computed through force-based optimization, using implied springs between data sets. Experiments indicate that even for very rough initial positionings, registration accuracy approaches 25% of the interpoint sampling resolution of the images.
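The paper's simultaneous, force-based multiview scheme is not reproduced here; below is a sketch of the single-pair iterative closest point building block (nearest-neighbour matching plus an SVD-based rigid update) that such schemes refine, assuming scipy for the nearest-neighbour search.

```python
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(P, Q):
    """Least-squares rotation R and translation t mapping P onto Q (Kabsch/SVD)."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    return R, cq - R @ cp

def icp(P, Q, n_iter=30):
    """Basic pairwise ICP: match each point of P to its nearest neighbour in Q,
    update the rigid transform, and repeat."""
    P = P.copy()
    tree = cKDTree(Q)
    for _ in range(n_iter):
        _, idx = tree.query(P)
        R, t = best_rigid_transform(P, Q[idx])
        P = P @ R.T + t
    return P
```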

Journal Article
TL;DR: In this paper, the authors define two types of measurement errors: systematic errors (predictable problems usually due to calibration) and random errors, which are quantified by experiments involving repeated measurements of standards or "true" values.
Abstract: Safely operating life support equipment and evaluating new technology both require some basic understanding of measurement theory. Measurement errors fall into two main categories: systematic errors (predictable problems usually due to calibration) and random errors (unpredictable). These two types of errors can be quantified by experiments involving repeated measurements of standards or "true" values. Systematic error (called bias) is usually expressed as the mean difference between measured and true values. Random error, called imprecision, can be expressed as the standard deviation of measured values. Total error can be expressed as an error interval, being the sum of bias and some multiple of imprecision. An error interval is a prediction about the error of some proportion of future measurements (e.g., 95%) at some level of confidence (e.g., 99%) based on the variability of the sample data and the sample size. Specifically, a tolerance interval gives an estimate of the true value of some variable given repeated measurements with an assumed valid measurement system. An inaccuracy interval predicts the validity of a measurement system with an estimate of the difference between measured true values (given that a standard or true value is available for measurement). An agreement interval evaluates whether or not one measurement system (e.g., a known valid system) can be used in place of another (e.g., a new unknown system). Statistical analyses such as correlation and linear regression are commonly seen in the literature, but not usually appropriate for evaluation of new equipment. Instrument performance evaluation studies should start out with a decision about the level of allowable error. Next, experiments are designed to obtain repeated measurements of known quantities (inaccuracy studies) or of unknown quantities by two different measurement systems (i.e., agreement studies). The first step in data analysis is to generate scatter plots of the raw data for review of validity (e.g., outliers). The next step is to make sure the data adhere to the assumption of normality. The third step is to calculate basic descriptive statistics, such as the mean and standard deviation. Finally, the data should be presented in graphic form with the differences plotted against the reference values and including numerical values for the calculated error intervals. The key idea to remember is that device evaluation and method agreement studies are based on the desire to know how much trust we should place in single measurements that may be used to make life support decisions.
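A minimal sketch of the bias/imprecision/error-interval arithmetic described above; the measurements and the coverage multiplier k are illustrative, and a real tolerance factor would depend on the sample size and the chosen confidence level.

```python
import numpy as np

# Repeated measurements of a known standard ("true" value); values invented.
true_value = 100.0
measured = np.array([101.2, 99.8, 100.9, 101.5, 100.4, 101.1, 100.7, 101.3])

errors = measured - true_value
bias = errors.mean()                  # systematic error
imprecision = errors.std(ddof=1)      # random error (standard deviation)

k = 2.0                               # coverage multiplier (illustrative only)
error_interval = (bias - k * imprecision, bias + k * imprecision)
print(f"bias={bias:.2f}, imprecision={imprecision:.2f}, interval={error_interval}")
```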

Journal ArticleDOI
TL;DR: The use of a robust PCA for modeling normal process behavior is proposed, and a kernel approach is suggested as an alternative method to define the normal region via the data.
Abstract: The procedure of multivariate statistical approaches (MSA) is as follows: (1) data representing normal process behavior are collected; (2) multivariate statistical methods, such as principal component analysis (PCA) and partial least squares (PLS), are utilized to compress the data and to extract the information, projecting the data into a low-dimensional space that summarizes all the important information; (3) a normal region or control chart is configured to monitor the process; (4) the fault sources are diagnosed and identified if any cause the process to move out of the normal region. This work focuses on steps 2 and 3. The first motivation of the current work is to reliably extract information despite the existence of outliers. The second is to define the normal region via the data. This has the advantage that no a priori assumption has to be made. The use of a robust PCA for modeling normal process behavior is proposed, and a kernel approach is suggested as an alternative method to define the normal region.
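Not the paper's robust PCA or kernel method; the sketch below is a conventional PCA monitor (T² and squared prediction error with empirical control limits) that illustrates the "normal region" idea which the paper makes robust and data-driven. The function names and the quantile-based limits are assumptions of the sketch.

```python
import numpy as np

def fit_pca_monitor(X_normal, n_comp=2, quantile=0.99):
    """Conventional (non-robust) PCA monitor fitted on normal operating data.
    Control limits are empirical quantiles of T^2 and SPE on the training data."""
    mu, sd = X_normal.mean(0), X_normal.std(0) + 1e-12
    Z = (X_normal - mu) / sd
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_comp].T                          # loadings
    lam = (S[:n_comp]**2) / (len(Z) - 1)       # component variances
    T = Z @ P
    t2 = np.sum(T**2 / lam, axis=1)
    spe = np.sum((Z - T @ P.T)**2, axis=1)
    limits = (np.quantile(t2, quantile), np.quantile(spe, quantile))
    return dict(mu=mu, sd=sd, P=P, lam=lam, limits=limits)

def monitor(model, X_new):
    """Return True for samples falling outside the fitted normal region."""
    Z = (X_new - model["mu"]) / model["sd"]
    T = Z @ model["P"]
    t2 = np.sum(T**2 / model["lam"], axis=1)
    spe = np.sum((Z - T @ model["P"].T)**2, axis=1)
    return (t2 > model["limits"][0]) | (spe > model["limits"][1])
```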

Journal ArticleDOI
TL;DR: Modifications of the Euclidean algorithm are presented for determining the period from a sparse set of noisy measurements, where the elements of the set are the noisy occurrence times of a periodic event with (perhaps very many) missing measurements.
Abstract: Modifications of the Euclidean algorithm are presented for determining the period from a sparse set of noisy measurements. The elements of the set are the noisy occurrence times of a periodic event with (perhaps very many) missing measurements. This problem arises in radar pulse repetition interval (PRI) analysis, in bit synchronization in communications, and in other scenarios. The proposed algorithms are computationally straightforward and converge quickly. A robust version is developed that is stable despite the presence of arbitrary outliers. The Euclidean algorithm approach is justified by a theorem that shows that, for a set of randomly chosen positive integers, the probability that they do not all share a common prime factor approaches one quickly as the cardinality of the set increases. In the noise-free case, this implies that the algorithm produces the correct answer with only 10 data samples, independent of the percentage of missing measurements. In the case of noisy data, simulation results show, for example, good estimation of the period from 100 data samples with 50% of the measurements missing and 25% of the data samples being arbitrary outliers.
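Not the authors' modified Euclidean algorithm; the sketch below only illustrates the underlying idea that the differences of the noisy occurrence times are all near integer multiples of the unknown period, so an approximate common divisor can be recovered and then refined. The parameters and data are invented.

```python
import numpy as np

def estimate_period(times, tol=0.25):
    """Sketch only: take the smallest time difference as a period candidate,
    snap every difference to its nearest integer multiple of that candidate,
    and average the implied periods over the differences that snap cleanly."""
    diffs = np.diff(np.sort(times))
    diffs = diffs[diffs > tol]
    candidate = diffs.min()                      # roughly one period (noisy)
    multiples = np.round(diffs / candidate)
    keep = np.abs(diffs - multiples * candidate) < tol
    return float(np.mean(diffs[keep] / multiples[keep]))

# Periodic events with ~80% of occurrences missing and small timing jitter.
rng = np.random.default_rng(1)
period = 2.37
slots = np.sort(rng.choice(np.arange(1, 2000), size=400, replace=False))
times = period * slots + rng.normal(scale=0.01, size=slots.size)
print("estimated period:", round(estimate_period(times), 3))
```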

Journal ArticleDOI
TL;DR: It is shown that the effect of leverage in regression models makes the convergence of the Gibbs sampling algorithm very difficult in data sets with strong masking.
Abstract: This article discusses the convergence of the Gibbs sampling algorithm when it is applied to the problem of outlier detection in regression models. Given any vector of initial conditions, theoretically, the algorithm converges to the true posterior distribution. However, the speed of convergence may slow down in a high-dimensional parameter space where the parameters are highly correlated. We show that the effect of leverage in regression models makes the convergence of the Gibbs sampling algorithm very difficult in data sets with strong masking. The problem is illustrated with examples.

Journal ArticleDOI
TL;DR: In this article, a new variable selection criterion is presented; it is based on the Wald test statistic and is defined by Tp = Wp - K + 2p, where K and p are the numbers of parameters in the full model and the submodel respectively, and Wp is the Wald statistic for testing whether the coefficients of the variables not in the submodel are 0.
Abstract: SUMMARY A new variable selection criterion is presented. It is based on the Wald test statistic and is defined by Tp = Wp - K + 2p, where K and p are the numbers of parameters in the full model and the submodel respectively, and Wp is the Wald statistic for testing whether the coefficients of the variables not in the submodel are 0. 'Good' submodels will have Tp-values that are close to or smaller than p, and, as with Mallows's Cp, they will be selected by graphical rather than stepwise methods. We first consider an application to the linear regression of the heat evolved in a cement mix on four explanatory variables; we use robust methods and obtain the same results as those from the more computer-intensive methods of Ronchetti and Staudte. Our later applications are to previously published data sets which use logistic regression to predict participation in the US federal food stamp program, myocardial infarction and prostatic cancer. The first data set was shown in previous analysis to contain an outlier and is considered for illustration. In the last two data sets our criterion applied to the maximum likelihood estimates selects the same model as do previously published stepwise analyses. However, for the food stamp data set, the application of our criterion using the robust logistic regression estimates of Carroll and Pederson suggests more parsimonious models than those arising from the likelihood analysis, and further suggests that interactions previously regarded as important may be due to outliers.
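A sketch of how Tp = Wp - K + 2p can be computed for a candidate submodel from a fitted full model, using statsmodels (an assumption of the sketch, not the authors' software) and its default parameter names; the synthetic logistic-regression data are purely illustrative.

```python
import numpy as np
import statsmodels.api as sm

def t_p(full_result, excluded):
    """T_p = W_p - K + 2p for the submodel dropping the named parameters,
    where W_p is the Wald statistic for those coefficients being zero."""
    names = list(full_result.model.exog_names)
    idx = [names.index(name) for name in excluded]
    b = np.asarray(full_result.params)[idx]
    C = np.asarray(full_result.cov_params())[np.ix_(idx, idx)]
    W_p = float(b @ np.linalg.solve(C, b))
    K = len(names)
    p = K - len(excluded)
    return W_p - K + 2 * p

# Illustrative usage on synthetic data (statsmodels names regressors x1..x4).
rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(200, 4)))
y = rng.binomial(1, 1 / (1 + np.exp(-(X[:, 1] - 0.5 * X[:, 2]))))
fit = sm.Logit(y, X).fit(disp=0)
print(t_p(fit, excluded=["x3", "x4"]))   # values close to or below p are "good"
```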

Journal ArticleDOI
TL;DR: A robust method for outlier detection together with the likelihood of transformed data is presented as a first approach to solve problems when the additive-logratio and multivariate Box-Cox transformations are used.
Abstract: The statistical analysis of compositional data is based on determining an appropriate transformation from the simplex to real space. Possible transformations and outliers strongly interact: parameters of transformations may be influenced particularly by outliers, and the result of goodness-of-fit tests will reflect their presence. Thus, the identification of outliers in compositional datasets and the selection of an appropriate transformation of the same data are problems that cannot be separated. A robust method for outlier detection together with the likelihood of transformed data is presented as a first approach to solve those problems when the additive-logratio and multivariate Box-Cox transformations are used. Three examples illustrate the proposed methodology.
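A hedged sketch of the additive log-ratio transform followed by a naive (non-robust) Mahalanobis screen on the transformed coordinates; the paper's robust detection and multivariate Box-Cox alternative are not reproduced, and the compositions below are invented.

```python
import numpy as np
from scipy import stats

def alr(X):
    """Additive log-ratio transform: log of each part relative to the last
    part, mapping D-part compositions to R^(D-1)."""
    X = np.asarray(X, dtype=float)
    return np.log(X[:, :-1] / X[:, -1:])

def flag_outliers(Y, alpha=0.01):
    """Naive Mahalanobis screen; a robust estimator of location and scatter
    should replace mean/cov in practice, as argued above."""
    mu, S = Y.mean(axis=0), np.cov(Y, rowvar=False)
    d2 = np.einsum('ij,jk,ik->i', Y - mu, np.linalg.inv(S), Y - mu)
    return d2 > stats.chi2.ppf(1 - alpha, df=Y.shape[1])

# Synthetic 3-part compositions plus one odd sample.
rng = np.random.default_rng(0)
Z = rng.normal([0.5, 1.2], 0.1, size=(50, 2))           # log-ratio coordinates
X = np.exp(np.column_stack([Z, np.zeros(50)]))
X = X / X.sum(axis=1, keepdims=True)                     # close to the simplex
X[0] = [0.05, 0.05, 0.90]                                # plant a suspicious composition
print("flagged rows:", flag_outliers(alr(X)).nonzero()[0])
```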

Journal ArticleDOI
TL;DR: In this article, the authors develop new plotting positions for normal plots, along with tests for various departures from normality, especially skewness and heavy tails; the tests can be considered as components of a Shapiro-Wilk type test that has been decomposed into different sources of nonnormality.
Abstract: In this article we develop new plotting positions for normal plots. The use of the plots usually centers on detecting irregular tail behavior or outliers. Along with the normal plot, we develop tests for various departures from normality, especially for skewness and heavy tails. The tests can be considered as components of a Shapiro-Wilk type test that has been decomposed into different sources of nonnormality. Convergence to the limiting distributions is slow, so finite sample corrections are included to make the tests useful for small sample sizes.

25 Jan 1996
TL;DR: It is argued that outlier robust methods provide useful tools for applied researchers as the methods disclose valuable additional information about the long-run behavior of economic processes.
Abstract: This book focuses on statistical methods for discriminating between competing models for the long-run behavior of economic time series. Traditional methods that are used in this context are sensitive to outliers in the data. Therefore, this book considers alternative methods that take into account the possibility that not all observations are generated by the postulated model. These methods are called outlier robust. The basic principle underlying outlier robust methods is that discordant observations are downweighted automatically. The use of weights has important consequences for the statistical properties of the methods discussed. These consequences are studied by means of asymptotic theory, Monte-Carlo simulations, and empirical illustrations. Based on the results of this study, it is argued that outlier robust methods provide useful tools for applied researchers as the methods disclose valuable additional information about the long-run behavior of economic processes.

Journal ArticleDOI
TL;DR: A new technique is proposed which estimates the number of outliers in a network by evaluating the redundancy contributions of the detected observations, which leads to higher efficiency in data snooping of geodetic networks.
Abstract: When applying single outlier detection techniques, such as the Tau (τ) test, to examine the residuals of observations for outliers, the number of detected observations in any iteration of adjustment is most often more numerous than the actual number of true outliers. A new technique is proposed which estimates the number of outliers in a network by evaluating the redundancy contributions of the detected observations. In this way, a number of potential outliers can be identified and eliminated in each iteration of an adjustment. This leads to higher efficiency in data snooping of geodetic networks. The technique is illustrated with some numerical examples.

Journal ArticleDOI
TL;DR: In this article, a Bayesian approach to estimating an additive semiparametric regression model which is robust to outliers is presented, where the unknown curves are estimated by posterior means and are shown to be smoothing splines.

Journal ArticleDOI
TL;DR: In this article, Bayesian prediction bounds for some order statistics of future observations from the Burr (c, k) distribution are obtained in the presence of single outlier arising from different members of the same family of distributions.

Journal ArticleDOI
TL;DR: A generalized version of the iterative conditional modes (ICM) method for image enhancement is developed, which utilizes the characteristic of Markov random fields (MRF) in modeling the contextual information embedded in image formation and preserves the details of the images well.
Abstract: A generalized version of the iterative conditional modes (ICM) method for image enhancement is developed. The proposed algorithm utilizes the characteristic of Markov random fields (MRF) in modeling the contextual information embedded in image formation. To cope with real images, a new local MRF model with a second-order neighborhood is introduced. This model extracts contextual information not only from the intensity levels but also from the relative position of neighboring cliques. Also, an outlier rejection method is presented. In this method, the rejection depends on each candidate's contribution to the local variance. To cope with a mixed noise case, a hypothesis test is implemented as part of the restoration procedure. The proposed algorithm performs signal adaptive, nonlinear, and recursive filtering. In comparing the performance of the new procedure with several well-known order statistic filters, the superiority of the proposed algorithm is demonstrated both in the mean-square-error (MSE) and the mean-absolute-error (MAE) senses. In addition, the new algorithm preserves the details of the images well. It should be noted that the blurring effect is not considered.
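The paper's generalized ICM, second-order clique model and hypothesis-test-based outlier rejection are not reproduced here; the sketch below is the basic ICM update for MRF-regularized restoration of an image quantized to a few grey levels, which is the scheme being generalized. The parameters and the tiny test image are invented.

```python
import numpy as np

def icm_denoise(y, beta=2.0, sigma=1.0, levels=None, n_iter=5):
    """Basic ICM: for each pixel pick the grey level minimizing a Gaussian data
    term plus a smoothness penalty counting disagreements with the
    4-neighbourhood (a first-order MRF; the paper uses a richer model)."""
    if levels is None:
        levels = np.unique(y)
    x = y.copy()
    H, W = y.shape
    for _ in range(n_iter):
        for i in range(H):
            for j in range(W):
                nbrs = [x[a, b] for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                        if 0 <= a < H and 0 <= b < W]
                costs = [(y[i, j] - g) ** 2 / (2 * sigma ** 2)
                         + beta * sum(g != v for v in nbrs)
                         for g in levels]
                x[i, j] = levels[int(np.argmin(costs))]
    return x

# Tiny demo on a synthetic two-level image with impulsive noise.
rng = np.random.default_rng(0)
img = np.zeros((20, 20)); img[:, 10:] = 1.0
noisy = img.copy()
mask = rng.random(img.shape) < 0.15
noisy[mask] = rng.integers(0, 2, size=int(mask.sum()))
restored = icm_denoise(noisy, levels=np.array([0.0, 1.0]))
print("pixels still differing from truth:", int((restored != img).sum()))
```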

Journal ArticleDOI
TL;DR: In this article, robust scale estimators such as the interquartile range (IR) and the median absolute deviation from the median (MAD) are used to scale statistics for detecting outliers, both for testing individual observations and for testing a no-outliers hypothesis.
Abstract: SUMMARY Statistics for detecting outliers generally suffer from masking when multiple outliers are present. One aspect of this masking is inflation by the outliers of estimates of scale. This shrinks test statistics and results in loss of power to identify the outliers. Two familiar robust scale estimators are considered: the interquartile range (IR) and the median absolute deviation from the median (MAD). They are used here to scale statistics both for testing individual observations and for testing a no-outliers hypothesis. Some of these statistics use ordinary least squares residuals, others use recursive residuals calculated on adaptively ordered observations. The more severe the masking problem, the more advantageous robust scale estimation was found to be. IR and MAD worked equally well. Test statistics based on the recursive residuals were more powerful than those based on ordinary residuals.
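A small, invented illustration of the masking effect described above: when several large outliers inflate the sample standard deviation, sd-scaled statistics shrink, while MAD- or IQR-scaled versions keep their size. scipy is assumed; the cutoffs and data are not from the paper.

```python
import numpy as np
from scipy import stats

x = np.concatenate([np.random.default_rng(0).normal(size=30), [9.0, 10.0, 11.0]])

dev = x - np.median(x)
z_sd  = dev / x.std(ddof=1)                                    # scale inflated by outliers
z_mad = dev / stats.median_abs_deviation(x, scale="normal")    # resistant scale (MAD)
z_iqr = dev / (stats.iqr(x) / 1.349)                           # IQR rescaled to sigma

for name, z in [("sd", z_sd), ("MAD", z_mad), ("IQR", z_iqr)]:
    print(name, "largest statistic:", round(float(np.max(np.abs(z))), 2))
```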

Journal ArticleDOI
TL;DR: Neural network approaches to robust TLS regression are reviewed and a new learning algorithm is introduced, based on a robust TLS criterion involving a nonlinear function, which outperforms the commonly used LS and TLS fitting methods in resisting both Gaussian noise and outliers.

Journal ArticleDOI
01 Jun 1996
TL;DR: The authors present an approach to range data processing designed to reconstruct the underlying shape of the surfaces in the scene, yet preserve the discontinuities between them, by using a lower complexity variation of the least median of squares estimator and robust smoothing by anisotropic diffusion.
Abstract: Algorithms for the segmentation and description of range images are very sensitive to errors in the source data caused by noise processes in the optoelectronic sensing, and outliers caused by incorrect signal detection, for example false peaks in an active laser triangulation system. The authors present an approach to range data processing designed to reconstruct the underlying shape of the surfaces in the scene, yet preserve the discontinuities between them. The approach has two stages, first outlier removal by a lower complexity variation of the least median of squares estimator, and second, robust smoothing by anisotropic diffusion. To evaluate the proposed methods, the authors quantify the improvement in depth, normal and curvature estimation, and show how preprocessing improves surface patch segmentation and classification.
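Not the authors' lower-complexity variant; as a sketch of the first stage, the code below is a standard random-sampling least-median-of-squares plane fit of the kind used for outlier removal in range data, with invented points.

```python
import numpy as np

def lmeds_plane(points, n_trials=500, seed=0):
    """Least-median-of-squares plane fit by random sampling: fit a plane to
    many random 3-point subsets and keep the one whose squared residuals have
    the smallest median (tolerates up to roughly 50% outliers)."""
    rng = np.random.default_rng(seed)
    best_med, best_model = np.inf, None
    for _ in range(n_trials):
        idx = rng.choice(len(points), 3, replace=False)
        p0, p1, p2 = points[idx]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-9:                       # degenerate (collinear) sample
            continue
        n = n / norm
        r = (points - p0) @ n                 # signed point-to-plane distances
        med = np.median(r**2)
        if med < best_med:
            best_med, best_model = med, (n, p0)
    return best_model, best_med

# Illustrative data: a noisy plane z = 0 with 30% gross outliers.
rng = np.random.default_rng(1)
pts = np.column_stack([rng.uniform(-1, 1, (300, 2)), rng.normal(0, 0.01, 300)])
pts[:90, 2] += rng.uniform(0.5, 2.0, 90)
(normal, p0), med = lmeds_plane(pts)
print("plane normal:", np.round(normal, 3), " median squared residual:", med)
```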