Author

Mia Hubert

Bio: Mia Hubert is an academic researcher from Katholieke Universiteit Leuven. The author has contributed to research in topics: Estimator & Outlier. The author has an h-index of 47 and has co-authored 157 publications receiving 9213 citations. Previous affiliations of Mia Hubert include Catholic University of Leuven & University of Antwerp.


Papers
Journal Article
TL;DR: The ROBPCA approach, which combines projection pursuit ideas with robust scatter matrix estimation, yields more accurate estimates on uncontaminated datasets and more robust estimates on contaminated data.
Abstract: We introduce a new method for robust principal component analysis (PCA). Classical PCA is based on the empirical covariance matrix of the data and hence is highly sensitive to outlying observations. Two robust approaches have been developed to date. The first approach is based on the eigenvectors of a robust scatter matrix such as the minimum covariance determinant or an S-estimator and is limited to relatively low-dimensional data. The second approach is based on projection pursuit and can handle high-dimensional data. Here we propose the ROBPCA approach, which combines projection pursuit ideas with robust scatter matrix estimation. ROBPCA yields more accurate estimates at noncontaminated datasets and more robust estimates at contaminated data. ROBPCA can be computed rapidly, and is able to detect exact-fit situations. As a by-product, ROBPCA produces a diagnostic plot that displays and classifies the outliers. We apply the algorithm to several datasets from chemometrics and engineering.

935 citations

Journal Article
TL;DR: In this article, an adjustment of the boxplot is presented that includes a robust measure of skewness in the determination of the whiskers, which results in a more accurate representation of the data and of possible outliers.

641 citations
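
The adjusted boxplot in the entry above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the robust skewness measure is the medcouple and that the whisker constants (-4 and 3 in the exponents) are those of the published method, recalled here rather than taken from this listing. The function names `medcouple` and `adjusted_fences` are hypothetical, and the medcouple is the naive quadratic-time version that simply skips pairs tied at the median.

```python
import math
from statistics import median, quantiles


def medcouple(xs):
    """Naive O(n^2) medcouple: median of a kernel over pairs xi <= m <= xj.

    Simplification (assumption): pairs with xi == xj, which can only occur
    when both equal the median, are skipped instead of using the special
    tie-handling kernel.
    """
    m = median(xs)
    lower = [x for x in xs if x <= m]
    upper = [x for x in xs if x >= m]
    kernel = [
        ((xj - m) - (m - xi)) / (xj - xi)
        for xi in lower
        for xj in upper
        if xj != xi
    ]
    return median(kernel)


def adjusted_fences(xs):
    """Return (lower fence, upper fence, medcouple) for a skewness-adjusted boxplot."""
    q1, _, q3 = quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    mc = medcouple(xs)
    if mc >= 0:
        # Right-skewed (or symmetric): shrink the lower whisker, stretch the upper.
        lo = q1 - 1.5 * math.exp(-4 * mc) * iqr
        hi = q3 + 1.5 * math.exp(3 * mc) * iqr
    else:
        # Left-skewed: the mirror-image adjustment.
        lo = q1 - 1.5 * math.exp(-3 * mc) * iqr
        hi = q3 + 1.5 * math.exp(4 * mc) * iqr
    return lo, hi, mc


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 3, 4, 6, 10, 20]
```

For the symmetric sample the medcouple is 0, so the fences reduce to the standard 1.5 × IQR rule. For the right-skewed sample the medcouple is positive and the upper fence moves outward, so fewer points in the long tail are marked as outliers.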

Journal Article
TL;DR: An overview is presented of several robust methods and outlier detection tools for univariate, low-dimensional, and high-dimensional data, covering estimation of location and scatter, linear regression, principal component analysis, and classification.
Abstract: When analyzing data, outlying observations cause problems because they may strongly influence the result. Robust statistics aims at detecting the outliers by searching for the model fitted by the majority of the data. We present an overview of several robust methods and outlier detection tools. We discuss robust procedures for univariate, low-dimensional, and high-dimensional data such as estimation of location and scatter, linear regression, principal component analysis, and classification. © 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011, 1: 73-79. DOI: 10.1002/widm.2. This article is categorized under: Algorithmic Development > Biological Data Mining; Algorithmic Development > Spatial and Temporal Data Mining; Application Areas > Health Care; Technologies > Structure Discovery and Clustering.

533 citations
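
To make the univariate case in the entry above concrete, here is a small sketch of outlier detection with a robust location and scale estimate. It is an illustration under stated assumptions, not code from the article. It uses the median and the median absolute deviation (MAD, scaled by 1.4826 for consistency at the normal distribution) and a cutoff of 3, which is a common convention assumed here. The function names are hypothetical.

```python
from statistics import mean, median, stdev


def mad(xs):
    """Median absolute deviation, scaled to be consistent with the
    standard deviation for normally distributed data."""
    m = median(xs)
    return 1.4826 * median([abs(x - m) for x in xs])


def robust_z(xs):
    """Robust z-scores: distance from the median in units of the MAD."""
    m, s = median(xs), mad(xs)
    if s == 0:
        raise ValueError("MAD is zero; more than half the values are identical")
    return [(x - m) / s for x in xs]


def classical_z(xs):
    """Classical z-scores based on the mean and standard deviation."""
    m, s = mean(xs), stdev(xs)
    return [(x - m) / s for x in xs]


def flag_outliers(xs, cutoff=3.0):
    """Values whose absolute robust z-score exceeds the cutoff (assumed 3)."""
    return [x for x, z in zip(xs, robust_z(xs)) if abs(z) > cutoff]


data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 55.0]
print(flag_outliers(data))
print(max(abs(z) for z in classical_z(data)))
```

On this sample the robust rule flags 55.0, while the largest classical z-score stays below 3. The single outlier inflates the mean and standard deviation enough to mask itself, which is the influence problem the abstract describes.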

Journal Article
TL;DR: A MATLAB library of robust statistical methods used by chemometricians, statisticians, chemists, and engineers is introduced and many graphical tools to detect and classify the outliers are provided.

407 citations

01 Jan 1996
TL;DR: This paper describes the incorporation of seven stand-alone clustering programs into S-PLUS, where they can now be used in a much more flexible way.
Abstract: This paper describes the incorporation of seven stand-alone clustering programs into S-PLUS, where they can now be used in a much more flexible way. The original Fortran programs carried out new cluster analysis algorithms introduced in the book of Kaufman and Rousseeuw (1990). These clustering methods were designed to be robust and to accept dissimilarity data as well as objects-by-variables data. Moreover, they each provide a graphical display and a quality index reflecting the strength of the clustering. The powerful graphics of S-PLUS made it possible to improve these graphical representations considerably. The integration of the clustering algorithms was performed according to the object-oriented principle supported by S-PLUS. The new functions have a uniform interface, and are compatible with existing S-PLUS functions. We will describe the basic idea and the use of each clustering method, together with its graphical features. Each function is briefly illustrated with an example.

352 citations
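
As an illustration of clustering that accepts dissimilarity data directly, the sketch below solves the k-medoids objective on a precomputed dissimilarity matrix. The listing does not name the seven programs, so k-medoids is an assumed example of the kind of method found in Kaufman and Rousseeuw (1990). The exhaustive search is a simplification suitable only for tiny inputs and is not the algorithm used by the S-PLUS functions. The function name `k_medoids_exhaustive` is hypothetical.

```python
from itertools import combinations


def k_medoids_exhaustive(d, k):
    """Find the k medoids that minimize the total dissimilarity of every
    object to its nearest medoid, by trying every k-subset of objects.

    d is a symmetric n x n dissimilarity matrix (list of lists); no
    coordinates are needed. Cost grows combinatorially with n.
    """
    n = len(d)
    best_cost, best_medoids = float("inf"), None
    for medoids in combinations(range(n), k):
        cost = sum(min(d[i][m] for m in medoids) for i in range(n))
        if cost < best_cost:
            best_cost, best_medoids = cost, medoids
    # Assign each object to its nearest medoid.
    labels = [min(best_medoids, key=lambda m: d[i][m]) for i in range(n)]
    return best_medoids, labels, best_cost


# Dissimilarities derived from points on a line, purely for illustration.
points = [0, 1, 2, 10, 11, 12]
d = [[abs(a - b) for b in points] for a in points]
medoids, labels, cost = k_medoids_exhaustive(d, k=2)
print(medoids, labels, cost)
```

Because medoids are actual objects and only pairwise dissimilarities are used, the same function works whether the matrix comes from objects-by-variables data or is supplied directly.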


Cited by
Journal Article
TL;DR: Bibliographic notice for the book Convergence of Probability Measures by P. Billingsley (Wiley, 1968).
Abstract: Convergence of Probability Measures. By P. Billingsley. Chichester, Sussex, Wiley, 1968. xii, 253 p. 9 1/4". 117s.

5,689 citations

01 Aug 2000
TL;DR: A bioentrepreneur course on assessing medical technology in the context of commercialization, covering many issues unique to biomedical products.
Abstract: BIOE 402. Medical Technology Assessment. 2 or 3 hours. Bioentrepreneur course. Assessment of medical technology in the context of commercialization. Objectives, competition, market share, funding, pricing, manufacturing, growth, and intellectual property; many issues unique to biomedical products. Course Information: 2 undergraduate hours. 3 graduate hours. Prerequisite(s): Junior standing or above and consent of the instructor.

4,833 citations

Journal Article
TL;DR: Using a Bayesian likelihood approach, the authors estimate a dynamic stochastic general equilibrium model for the US economy using seven macroeconomic time series, incorporating many types of real and nominal frictions and seven types of structural shocks.
Abstract: Using a Bayesian likelihood approach, we estimate a dynamic stochastic general equilibrium model for the US economy using seven macro-economic time series. The model incorporates many types of real and nominal frictions and seven types of structural shocks. We show that this model is able to compete with Bayesian Vector Autoregression models in out-of-sample prediction. We investigate the relative empirical importance of the various frictions. Finally, using the estimated model we address a number of key issues in business cycle analysis: What are the sources of business cycle fluctuations? Can the model explain the cross-correlation between output and inflation? What are the effects of productivity on hours worked? What are the sources of the "Great Moderation"?

3,155 citations

Journal Article
TL;DR: In this paper, a dynamic stochastic general equilibrium (DSGE) model with sticky prices and wages for the euro area was developed and estimated with Bayesian techniques using seven key macroeconomic variables: GDP, consumption, investment, prices, real wages, employment, and the nominal interest rate.
Abstract: This paper develops and estimates a dynamic stochastic general equilibrium (DSGE) model with sticky prices and wages for the euro area. The model incorporates various other features such as habit formation, costs of adjustment in capital accumulation and variable capacity utilization. It is estimated with Bayesian techniques using seven key macroeconomic variables: GDP, consumption, investment, prices, real wages, employment, and the nominal interest rate. The introduction of ten orthogonal structural shocks (including productivity, labor supply, investment, preference, cost-push, and monetary policy shocks) allows for an empirical investigation of the effects of such shocks and of their contribution to business cycle fluctuations in the euro area. Using the estimated model, we also analyze the output (real interest rate) gap, defined as the difference between the actual and model-based potential output (real interest rate). (JEL: E4, E5)

2,767 citations