Journal ArticleDOI

Bivariate extensions of the boxplot

01 Aug 1992-Technometrics (Taylor & Francis Group)-Vol. 34, Iss: 3, pp 307-320
TL;DR: Several options for bivariate boxplot-type constructions are discussed, including both elliptic and asymmetric plots, and alternative constructions are compared in terms of the efficiency of the relevant parameters.
Abstract: The boxplot has proven to be a very useful tool for summarizing univariate data. Several options of bivariate boxplot-type constructions are discussed. These include both elliptic and asymmetric plots. An inner region contains 50% of the data, and a fence identifies potential outliers. Such a robust plot shows location, scale, correlation, and a resistant regression line. Alternative constructions are compared in terms of efficiency of the relevant parameters. Additional properties are given and recommendations made. Emphasis is given to the bivariate biweight M estimator. Several practical examples illustrate that standard least squares ellipsoids can give graphically misleading summaries.
Citations
Journal ArticleDOI
TL;DR: A simple method for computing a highest density region from any given density f(x) that is bounded and continuous in x is proposed and a new form of boxplot is proposed based on highest density regions.
Abstract: Many statistical methods involve summarizing a probability distribution by a region of the sample space covering a specified probability. One method of selecting such a region is to require it to contain points of relatively high density. Highest density regions are particularly useful for displaying multimodal distributions and, in such cases, may consist of several disjoint subsets—one for each local mode. In this paper, I propose a simple method for computing a highest density region from any given (possibly multivariate) density f(x) that is bounded and continuous in x. Several examples of the use of highest density regions in statistical graphics are also given. A new form of boxplot is proposed based on highest density regions; versions in one and two dimensions are given. Highest density regions in higher dimensions are also discussed and plotted.

602 citations
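
The highest density region described in the abstract above can be illustrated with a short sketch. The abstract does not spell out the computation, so what follows is an assumption: a density-quantile approach, in which the cutoff for a region of given coverage is taken as a quantile of the density heights at random draws from the distribution. The bimodal mixture density, the function names and the grid scan are all hypothetical choices made for illustration.

```python
import math
import random


def normal_mixture_pdf(x):
    """Hypothetical bimodal density: equal mixture of N(-2, 0.5^2) and N(2, 0.5^2)."""
    def phi(z, mu, sigma):
        return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
    return 0.5 * phi(x, -2.0, 0.5) + 0.5 * phi(x, 2.0, 0.5)


def sample_mixture(rng):
    """Draw one value from the hypothetical mixture above."""
    mu = -2.0 if rng.random() < 0.5 else 2.0
    return rng.gauss(mu, 0.5)


def hdr_threshold(pdf, draws, coverage):
    """Density cutoff c such that {x : pdf(x) >= c} holds about `coverage` of the draws.

    Assumption: the cutoff is the (1 - coverage) sample quantile of pdf(draw).
    """
    heights = sorted(pdf(x) for x in draws)
    k = int(math.floor((1.0 - coverage) * len(heights)))
    k = min(max(k, 0), len(heights) - 1)
    return heights[k]


def hdr_intervals(pdf, cutoff, lo, hi, steps=4000):
    """Scan a grid on [lo, hi] and return the maximal runs where pdf >= cutoff."""
    intervals, start, prev = [], None, lo
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        inside = pdf(x) >= cutoff
        if inside and start is None:
            start = x                      # a new run begins here
        elif not inside and start is not None:
            intervals.append((start, prev))  # run ended at the previous grid point
            start = None
        prev = x
    if start is not None:
        intervals.append((start, hi))
    return intervals


rng = random.Random(0)
draws = [sample_mixture(rng) for _ in range(20000)]
cutoff = hdr_threshold(normal_mixture_pdf, draws, coverage=0.5)
regions = hdr_intervals(normal_mixture_pdf, cutoff, -6.0, 6.0)
```

For this bimodal example the 50% region comes out as two disjoint intervals, one around each mode, which is the behaviour the abstract highlights for multimodal distributions.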

Journal ArticleDOI
TL;DR: The bagplot is a bivariate generalization of the univariate boxplot, built on the half space location depth of a point relative to a bivariate dataset, which extends the univariate concept of rank.
Abstract: We propose the bagplot, a bivariate generalization of the univariate boxplot. The key notion is the half space location depth of a point relative to a bivariate dataset, which extends the univariate concept of rank. The “depth median” is the deepest location, and it is surrounded by a “bag” containing the n/2 observations with largest depth. Magnifying the bag by a factor 3 yields the “fence” (which is not plotted). Observations between the bag and the fence are marked by a light gray loop, whereas observations outside the fence are flagged as outliers. The bagplot visualizes the location, spread, correlation, skewness, and tails of the data. It is equivariant for linear transformations, and not limited to elliptical distributions. Software for drawing the bagplot is made available for the S-Plus and MATLAB environments. The bagplot is illustrated on several datasets—for example, in a scatterplot matrix of multivariate data.

524 citations
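
The half space location depth that the bagplot abstract above relies on can be sketched in a few lines. This is a brute-force illustration under stated assumptions, not the authors' algorithm or software: depth is taken as the smallest number of observations in any closed half-plane whose boundary passes through the point, computed in O(n²) by checking the lines through the point and each observation, tilted slightly either way. The function name and the toy dataset are hypothetical, and the deepest point is searched only among the observations, which is a simplification of the "depth median".

```python
def halfspace_depth(theta, points):
    """Half space depth of `theta` relative to a list of 2-D points.

    Assumed definition: the minimum number of points in any closed
    half-plane whose boundary line passes through `theta`.
    """
    tx, ty = theta
    shifted = [(px - tx, py - ty) for px, py in points]
    at_theta = sum(1 for dx, dy in shifted if dx == 0 and dy == 0)
    others = [(dx, dy) for dx, dy in shifted if dx != 0 or dy != 0]
    if not others:
        return at_theta

    best = len(others)
    for ax, ay in others:
        # Classify every point relative to the line through theta with direction a.
        left = right = same = opposite = 0
        for bx, by in others:
            cross = ax * by - ay * bx
            if cross > 0:
                left += 1
            elif cross < 0:
                right += 1
            elif ax * bx + ay * by > 0:
                same += 1        # on the line, same side of theta as a
            else:
                opposite += 1    # on the line, opposite side of theta
        # Tilting the line slightly either way sends the on-line points
        # to one side or the other; each side gives a candidate half-plane.
        best = min(best, left + same, left + opposite,
                   right + same, right + opposite)

    # Points coinciding with theta lie in every closed half-plane through it.
    return at_theta + best


# Hypothetical toy data: four corners of a square plus its centre.
pts = [(1, 1), (1, -1), (-1, 1), (-1, -1), (0, 0)]
deepest = max(pts, key=lambda p: halfspace_depth(p, pts))
```

With integer coordinates the cross-product tests are exact; with floating-point data, nearly collinear points would need a tolerance. Building the bag and the fence from depth contours is not attempted here.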

Journal ArticleDOI
TL;DR: Standard analysis of variance, Pearson product-moment correlations, and least squares regression can be highly misleading and can have relatively low power even under very small departures from normality; this article outlines modern robust techniques that address these problems.
Abstract: Hundreds of articles in statistical journals have pointed out that standard analysis of variance, Pearson product-moment correlations, and least squares regression can be highly misleading and can have relatively low power even under very small departures from normality. In practical terms, psychology journals are littered with nonsignificant results that would have been significant if a more modern method had been used. Modern robust techniques, developed during the past 30 years, provide very effective methods for dealing with nonnormality, and they compete very well with conventional procedures when standard assumptions are met. In addition, modern methods provide accurate confidence intervals for a much broader range of situations, they provide more effective methods for detecting and studying outliers, and they can be used to get a deeper understanding of how variables are related. This article outlines and illustrates these results.

297 citations

Book ChapterDOI
01 Jan 2014
TL;DR: This chapter outlines sources and models of uncertainty, gives an overview of the state-of-the-art, provides general guidelines, outlines small exemplary applications, and discusses open problems in uncertainty visualization.
Abstract: The goal of visualization is to effectively and accurately communicate data. Visualization research has often overlooked the errors and uncertainty which accompany the scientific process and describe key characteristics used to fully understand the data. The lack of these representations can be attributed, in part, to the inherent difficulty in defining, characterizing, and controlling this uncertainty, and in part, to the difficulty in including additional visual metaphors in a well designed, potent display. However, the exclusion of this information cripples the use of visualization as a decision making tool due to the fact that the display is no longer a true representation of the data. This systematic omission of uncertainty commands fundamental research within the visualization community to address, integrate, and expect uncertainty information. In this chapter, we outline sources and models of uncertainty, give an overview of the state-of-the-art, provide general guidelines, outline small exemplary applications, and finally, discuss open problems in uncertainty visualization.

189 citations

Journal ArticleDOI
TL;DR: A longitudinal study of complementary measures of Aβ pathology (PIB, CSF and plasma Aβ) and other biomarkers, in a cohort with an extensive neuropsychological battery, shows that plasma Aβ measurements have limited value for disease classification and modest value as prognostic factors over the 3-year follow-up.
Abstract: Previous studies of Aβ plasma as a biomarker for Alzheimer’s disease (AD) obtained conflicting results. We here included 715 subjects with baseline Aβ1-40 and Aβ1-42 plasma measurement (50% with 4 serial annual measurements): 205 cognitively normal controls (CN), 348 patients with mild cognitive impairment (MCI) and 162 with AD. We assessed the factors that modified their concentrations and correlated these values with PIB PET, MRI and tau and Aβ1-42 measures in cerebrospinal fluid (CSF). Association between Aβ and diagnosis (baseline and prospective) was assessed. A number of health conditions were associated with altered concentrations of plasma Aβ. The effect of age differed according to AD stage. Plasma Aβ1-42 showed mild correlation with other biomarkers of Aβ pathology and was associated with infarctions in MRI. Longitudinal measurements of Aβ1-40 and Aβ1-42 plasma levels showed modest value as a prognostic factor for clinical progression. Our longitudinal study of complementary measures of Aβ pathology (PIB, CSF and plasma Aβ) and other biomarkers in a cohort with an extensive neuropsychological battery is significant because it shows that plasma Aβ measurements have limited value for disease classification and modest value as prognostic factors over the 3-year follow-up. However, with longer follow-up, within subject plasma Aβ measurements could be used as a simple and minimally invasive screen to identify those at increased risk for AD. Our study emphasizes the need for a better understanding of the biology and dynamics of plasma Aβ as well as the need for longer term studies to determine the clinical utility of measuring plasma Aβ.

178 citations


Cites methods from "Bivariate extensions of the boxplot..."

  • ...were calculated (rP), but in the presence of bivariate outliers in the relplot representation [18] a percentage bend Acta Neuropathol (2011) 122:401–413 403...

    [...]

References
Book
31 Jan 1986
TL;DR: Numerical Recipes: The Art of Scientific Computing is a complete text and reference book on scientific computing; the Second Edition adds over 100 new routines (now well over 300 in all), upgraded versions of many of the original routines, and many new topics presented at the same accessible level.
Abstract: From the Publisher: This is the revised and greatly expanded Second Edition of the hugely popular Numerical Recipes: The Art of Scientific Computing. The product of a unique collaboration among four leading scientists in academic research and industry, Numerical Recipes is a complete text and reference book on scientific computing. In a self-contained manner it proceeds from mathematical and theoretical considerations to actual practical computer routines. With over 100 new routines (now well over 300 in all), plus upgraded versions of many of the original routines, this book is more than ever the most practical, comprehensive handbook of scientific computing available today. The book retains the informal, easy-to-read style that made the first edition so popular, with many new topics presented at the same accessible level. In addition, some sections of more advanced material have been introduced, set off in small type from the main body of the text. Numerical Recipes is an ideal textbook for scientists and engineers and an indispensable reference for anyone who works in scientific computing. Highlights of the new material include a new chapter on integral equations and inverse methods; multigrid methods for solving partial differential equations; improved random number routines; wavelet transforms; the statistical bootstrap method; a new chapter on "less-numerical" algorithms including compression coding and arbitrary precision arithmetic; band diagonal linear systems; linear algebra on sparse matrices; Cholesky and QR decomposition; calculation of numerical derivatives; Padé approximants, and rational Chebyshev approximation; new special functions; Monte Carlo integration in high-dimensional spaces; globally convergent methods for sets of nonlinear equations; an expanded chapter on fast Fourier methods; spectral analysis on unevenly sampled data; Savitzky-Golay smoothing filters; and two-dimensional Kolmogorov-Smirnov tests.
All this is in addition to material on such basic top

12,662 citations

Journal ArticleDOI

11,905 citations

Book
01 Jan 1987
TL;DR: This book covers simple and multiple regression, the special case of one-dimensional location, algorithms, outlier diagnostics, and related statistical techniques.
Abstract: 1. Introduction. 2. Simple Regression. 3. Multiple Regression. 4. The Special Case of One-Dimensional Location. 5. Algorithms. 6. Outlier Diagnostics. 7. Related Statistical Techniques. References. Table of Data Sets. Index.

6,955 citations

Journal ArticleDOI
TL;DR: In this article, a text designed to make multivariate techniques available to behavioural, social, biological and medical students is presented, which includes an approach to multivariate inference based on the union-intersection and generalized likelihood ratio principles.
Abstract: A text designed to make multivariate techniques available to behavioural, social, biological and medical students. Special features include an approach to multivariate inference based on the union-intersection and generalized likelihood ratio principles.

6,488 citations

Book
01 Jan 1976
TL;DR: A text designed to make multivariate techniques available to behavioural, social, biological and medical students, featuring an approach to multivariate inference based on the union-intersection and generalized likelihood ratio principles.
Abstract: A text designed to make multivariate techniques available to behavioural, social, biological and medical students. Special features include an approach to multivariate inference based on the union-intersection and generalized likelihood ratio principles.

5,807 citations