Author

Marco Riani

Other affiliations: University of Rome Tor Vergata
Bio: Marco Riani is an academic researcher from the University of Parma. The author has contributed to research in topics: Outlier & Robust regression. The author has an h-index of 25 and has co-authored 121 publications receiving 2,410 citations. Previous affiliations of Marco Riani include the University of Rome Tor Vergata.


Papers
BookDOI
01 Jan 2000
TL;DR: This book uses regression diagnostics and computer graphics to understand the relationship between a regression model and the data to which it is fitted, showing how the fitted model depends on individual observations and on groups of observations.
Abstract: This book is about using graphs to understand the relationship between a regression model and the data to which it is fitted. Because of the way in which models are fitted, for example by least squares, we can lose information about the effect of individual observations on inferences about the form and parameters of the model. The methods developed in this book reveal how the fitted regression model depends on individual observations and on groups of observations. Robust procedures can sometimes reveal this structure, but downweight or discard some observations. The novelty in this book is to combine robustness and a "forward" search through the data with regression diagnostics and computer graphics.
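To make the forward search concrete, here is a minimal sketch in Python (NumPy only) of the idea described above: fit by least squares on a small subset, then repeatedly enlarge the subset with the observations closest to the current fit while monitoring the parameter estimates. The function name, the initial-subset choice and the monitored quantities are simplified placeholders, not the book's procedures.

```python
import numpy as np

def forward_search_regression(X, y, m0=None):
    """Sketch of a forward search for regression: grow a subset of
    observations, refitting by least squares at each step and keeping
    the observations with the smallest squared residuals."""
    n, p = X.shape
    m = m0 or (p + 1)                     # size of the initial subset
    # Crude starting subset: best-fitted points of a full-data fit
    # (the book uses more careful, robust choices).
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    subset = np.argsort((y - X @ beta) ** 2)[:m]
    beta_path = []                        # monitored parameter estimates
    while True:
        beta, *_ = np.linalg.lstsq(X[subset], y[subset], rcond=None)
        beta_path.append(beta)
        if m == n:
            return np.array(beta_path)    # one row per step of the search
        m += 1
        resid2 = (y - X @ beta) ** 2      # residuals from the current fit
        subset = np.argsort(resid2)[:m]   # m observations closest to the fit
```

Plotting the columns of the returned path against subset size is the kind of monitoring the book's graphics are built on: estimates that jump near the end of the search signal influential observations or outliers.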

377 citations

Book
21 Mar 2013
TL;DR: In this book, the authors develop the forward search for the analysis of multivariate data, covering, among other topics, multivariate transformations to normality, principal components analysis and discriminant analysis.
Abstract: Contents:
Preface. Notation.
1 Examples of Multivariate Data: 1.1 Influence, Outliers and Distances; 1.2 A Sketch of the Forward Search; 1.3 Multivariate Normality and our Examples; 1.4 Swiss Heads; 1.5 National Track Records for Women; 1.6 Municipalities in Emilia-Romagna; 1.7 Swiss Bank Notes; 1.8 Plan of the Book.
2 Multivariate Data and the Forward Search: 2.1 The Univariate Normal Distribution; 2.1.1 Estimation; 2.1.2 Distribution of Estimators; 2.2 Estimation and the Multivariate Normal Distribution; 2.2.1 The Multivariate Normal Distribution; 2.2.2 The Wishart Distribution; 2.2.3 Estimation of Σ; 2.3 Hypothesis Testing; 2.3.1 Hypotheses About the Mean; 2.3.2 Hypotheses About the Variance; 2.4 The Mahalanobis Distance; 2.5 Some Deletion Results; 2.5.1 The Deletion Mahalanobis Distance; 2.5.2 The (Bartlett)-Sherman-Morrison-Woodbury Formula; 2.5.3 Deletion Relationships Among Distances; 2.6 Distribution of the Squared Mahalanobis Distance; 2.7 Determinants of Dispersion Matrices and the Squared Mahalanobis Distance; 2.8 Regression; 2.9 Added Variables in Regression; 2.10 The Mean Shift Outlier Model; 2.11 Seemingly Unrelated Regression; 2.12 The Forward Search; 2.13 Starting the Search; 2.13.1 The Babyfood Data; 2.13.2 Robust Bivariate Boxplots from Peeling; 2.13.3 Bivariate Boxplots from Ellipses; 2.13.4 The Initial Subset; 2.14 Monitoring the Search; 2.15 The Forward Search for Regression Data; 2.15.1 Univariate Regression; 2.15.2 Multivariate Regression; 2.16 Further Reading; 2.17 Exercises; 2.18 Solutions.
3 Data from One Multivariate Distribution: 3.1 Swiss Heads; 3.2 National Track Records for Women; 3.3 Municipalities in Emilia-Romagna; 3.4 Swiss Bank Notes; 3.5 What Have We Seen?; 3.6 Exercises; 3.7 Solutions.
4 Multivariate Transformations to Normality: 4.1 Background; 4.2 An Introductory Example: the Babyfood Data; 4.3 Power Transformations to Approximate Normality; 4.3.1 Transformation of the Response in Regression; 4.3.2 Multivariate Transformations to Normality; 4.4 Score Tests for Transformations; 4.5 Graphics for Transformations; 4.6 Finding a Multivariate Transformation with the Forward Search; 4.7 Babyfood Data; 4.8 Swiss Heads; 4.9 Horse Mussels; 4.10 Municipalities in Emilia-Romagna; 4.10.1 Demographic Variables; 4.10.2 Wealth Variables; 4.10.3 Work Variables; 4.10.4 A Combined Analysis; 4.11 National Track Records for Women; 4.12 Dyestuff Data; 4.13 Babyfood Data and Variable Selection; 4.14 Suggestions for Further Reading; 4.15 Exercises; 4.16 Solutions.
5 Principal Components Analysis: 5.1 Background; 5.2 Principal Components and Eigenvectors; 5.2.1 Linear Transformations and Principal Components; 5.2.2 Lack of Scale Invariance and Standardized Variables; 5.2.3 The Number of Components; 5.3 Monitoring the Forward Search; 5.3.1 Principal Components and Variances; 5.3.2 Principal Component Scores; 5.3.3 Correlations Between Variables and Principal Components; 5.3.4 Elements of the Eigenvectors; 5.4 The Biplot and the Singular Value Decomposition; 5.5 Swiss Heads; 5.6 Milk Data; 5.7 Quality of Life; 5.8 Swiss Bank Notes; 5.8.1 Forgeries and Genuine Notes; 5.8.2 Forgeries Alone; 5.9 Municipalities in Emilia-Romagna; 5.10 Further Reading; 5.11 Exercises; 5.12 Solutions.
6 Discriminant Analysis: 6.1 Background; 6.2 An Outline of Discriminant Analysis; 6.2.1 Bayesian Discrimination; 6.2.2 Quadratic Discriminant Analysis; 6.2.3 Linear Discriminant Analysis; 6.2.4 Estimation of Means and Variances; 6.2.5 Canonical Variates; 6.2.6 Assessment of Discriminant Rules; 6.3 The Forward Search; 6.3.1 Step 1: Choice of the Initial Subset; 6.3.2 Step 2: Adding

202 citations

Journal ArticleDOI
TL;DR: In this article, the forward search is used to provide robust Mahalanobis distances for detecting outliers in a sample of multivariate normal data, with the distribution of the test statistic obtained from results on order statistics and on estimation in truncated samples.
Abstract: We use the forward search to provide robust Mahalanobis distances to detect the presence of outliers in a sample of multivariate normal data. Theoretical results on order statistics and on estimation in truncated samples provide the distribution of our test statistic. We also introduce several new robust distances with associated distributional results. Comparisons of our procedure with tests using other robust Mahalanobis distances show the good size and high power of our procedure. We also provide a unification of results on correction factors for estimation from truncated samples.
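The sketch below illustrates only the basic mechanism behind robust Mahalanobis distances: estimate location and scatter from a central subset of the data, then compare each point's squared robust distance with a chi-squared cutoff. The paper's calibration via order statistics and truncated-sample correction factors is considerably more refined; the function name and the trimming fraction here are illustrative.

```python
import numpy as np
from scipy.stats import chi2

def robust_mahalanobis_outliers(X, trim=0.5, alpha=0.01):
    """Flag outliers using Mahalanobis distances from a trimmed fit (sketch)."""
    n, v = X.shape
    # Ordinary distances from the classical mean and covariance.
    mu, S = X.mean(axis=0), np.cov(X, rowvar=False)
    d2 = np.einsum('ij,jk,ik->i', X - mu, np.linalg.inv(S), X - mu)
    core = np.argsort(d2)[: int(trim * n)]        # central subset of the data
    # Robust distances from the location/scatter of the central subset.
    mu_r, S_r = X[core].mean(axis=0), np.cov(X[core], rowvar=False)
    d2_r = np.einsum('ij,jk,ik->i', X - mu_r, np.linalg.inv(S_r), X - mu_r)
    return d2_r > chi2.ppf(1 - alpha, df=v)       # boolean outlier flags
```

Without the paper's correction factors, the naive chi-squared cutoff is mis-sized for estimates computed from a truncated subset; that mis-sizing is exactly what the distributional results in the article address.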

169 citations

Journal ArticleDOI
TL;DR: In this paper, a simple way of constructing a bivariate boxplot based on convex hull peeling and B-spline smoothing is proposed, leading to a natural inner region that is completely nonparametric and smooth.
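A rough illustration of the construction, using SciPy's ConvexHull for the peeling and a periodic B-spline for the smoothing; the function name, stopping rule and smoothing parameter are arbitrary choices for the sketch, not the authors' tuning.

```python
import numpy as np
from scipy.spatial import ConvexHull
from scipy.interpolate import splprep, splev

def peeled_inner_region(points, target_frac=0.5):
    """Sketch of a bivariate-boxplot inner region: peel convex hulls
    until roughly target_frac of the points remain, then smooth the
    remaining hull with a closed (periodic) B-spline."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    while len(pts) > target_frac * n:
        hull = ConvexHull(pts)
        keep = np.setdiff1d(np.arange(len(pts)), hull.vertices)
        if len(keep) < 4:                  # too few points to peel further
            break
        pts = pts[keep]                    # discard the current hull layer
    ring = pts[ConvexHull(pts).vertices]   # vertices of the innermost hull
    # Periodic B-spline through the hull vertices; s controls smoothing.
    tck, _ = splprep([ring[:, 0], ring[:, 1]],
                     s=len(ring), per=True, k=min(3, len(ring) - 1))
    xs, ys = splev(np.linspace(0, 1, 200), tck)
    return np.column_stack([xs, ys])       # smooth boundary of the inner region
```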

90 citations

Journal ArticleDOI
TL;DR: The Forward Search, as discussed by the authors, is a powerful general method, incorporating flexible data-driven trimming, for the detection of outliers and unsuspected structure in data, and so for building robust models: starting from small subsets of data, observations that are close to the fitted model are added to the observations used in parameter estimation.
Abstract: The Forward Search is a powerful general method, incorporating flexible data-driven trimming, for the detection of outliers and unsuspected structure in data and so for building robust models. Starting from small subsets of data, observations that are close to the fitted model are added to the observations used in parameter estimation. As this subset grows we monitor parameter estimates, test statistics and measures of fit such as residuals. The paper surveys theoretical development in work on the Forward Search over the last decade. The main illustration is a regression example with 330 observations and 9 potential explanatory variables. Mention is also made of procedures for multivariate data, including clustering, time series analysis and fraud detection.

88 citations


Cited by
Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined, and derive a measure pD for the effective number of parameters in a model as the difference between the posterior mean of the deviance and the deviance at the posterior means of the parameters of interest; adding pD to the posterior mean deviance gives a deviance information criterion that is related to other information criteria and has an approximate decision-theoretic justification.
Abstract: Summary. We consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined. Using an information theoretic argument we derive a measure pD for the effective number of parameters in a model as the difference between the posterior mean of the deviance and the deviance at the posterior means of the parameters of interest. In general pD approximately corresponds to the trace of the product of Fisher's information and the posterior covariance, which in normal models is the trace of the ‘hat’ matrix projecting observations onto fitted values. Its properties in exponential families are explored. The posterior mean deviance is suggested as a Bayesian measure of fit or adequacy, and the contributions of individual observations to the fit and complexity can give rise to a diagnostic plot of deviance residuals against leverages. Adding pD to the posterior mean deviance gives a deviance information criterion for comparing models, which is related to other information criteria and has an approximate decision theoretic justification. The procedure is illustrated in some examples, and comparisons are drawn with alternative Bayesian and classical proposals. Throughout it is emphasized that the quantities required are trivial to compute in a Markov chain Monte Carlo analysis.
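The quantities in the abstract are indeed trivial to compute from MCMC output. The sketch below follows the definitions directly (pD as mean posterior deviance minus deviance at the posterior mean, DIC as mean deviance plus pD); the toy normal example and all names are illustrative, not the paper's code.

```python
import numpy as np

def dic(deviance_draws, theta_draws, deviance_fn):
    """pD and DIC from MCMC output, following the definitions above:
    pD  = mean posterior deviance - deviance at the posterior mean,
    DIC = mean posterior deviance + pD."""
    dbar = np.mean(deviance_draws)                     # posterior mean deviance
    pd = dbar - deviance_fn(theta_draws.mean(axis=0))  # effective no. of parameters
    return pd, dbar + pd

# Toy example: i.i.d. normal data with known unit variance, unknown mean.
y = np.array([0.2, -0.5, 1.1, 0.7])
dev = lambda mu: float(np.sum((y - mu) ** 2))  # -2 log-likelihood up to a constant
draws = np.random.default_rng(0).normal(y.mean(), 0.5, size=(1000, 1))
pD, DIC = dic(np.array([dev(m) for m in draws[:, 0]]), draws, lambda th: dev(th[0]))
```

Smaller DIC indicates a better trade-off between fit (the mean deviance) and complexity (pD), which is how the criterion is used to compare models.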

11,691 citations

Journal ArticleDOI

6,278 citations

01 Jan 2016

5,249 citations

Posted Content
TL;DR: In this book, the author provides a unified and comprehensive theory of structural time series models, including a detailed treatment of the Kalman filter for modelling economic and social time series, and addresses the special problems which the treatment of such series poses.
Abstract: In this book, Andrew Harvey sets out to provide a unified and comprehensive theory of structural time series models. Unlike the traditional ARIMA models, structural time series models consist explicitly of unobserved components, such as trends and seasonals, which have a direct interpretation. As a result the model selection methodology associated with structural models is much closer to econometric methodology. The link with econometrics is made even closer by the natural way in which the models can be extended to include explanatory variables and to cope with multivariate time series. From the technical point of view, state space models and the Kalman filter play a key role in the statistical treatment of structural time series models. The book includes a detailed treatment of the Kalman filter. This technique was originally developed in control engineering, but is becoming increasingly important in fields such as economics and operations research. This book is concerned primarily with modelling economic and social time series, and with addressing the special problems which the treatment of such series poses. The properties of the models and the methodological techniques used to select them are illustrated with various applications. These range from the modelling of trends and cycles in US macroeconomic time series to an evaluation of the effects of seat belt legislation in the UK.
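As an illustration of the state-space machinery mentioned above, here is the Kalman filter for the simplest structural model, the local level model (a random-walk trend observed with noise). This is a textbook sketch, not Harvey's general formulation; the function name and the large diffuse prior variance p0 are arbitrary choices.

```python
import numpy as np

def local_level_filter(y, sigma_eps2, sigma_eta2, a0=0.0, p0=1e7):
    """Kalman filter for the local level model:
        y_t  = mu_t + eps_t,        eps_t ~ N(0, sigma_eps2)
        mu_t = mu_{t-1} + eta_t,    eta_t ~ N(0, sigma_eta2)
    Returns filtered state means and variances."""
    a, p = a0, p0                  # predicted mean/variance of the state
    means, variances = [], []
    for obs in y:
        f = p + sigma_eps2         # prediction variance of y_t
        k = p / f                  # Kalman gain
        a = a + k * (obs - a)      # filtered state mean
        p = p * (1 - k)            # filtered state variance
        means.append(a)
        variances.append(p)
        p = p + sigma_eta2         # predict the next state
    return np.array(means), np.array(variances)
```

The same recursion, written in matrix form, handles the richer structural models in the book (trend plus seasonal plus explanatory variables) with only larger state vectors.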

4,252 citations

Journal ArticleDOI

1,484 citations