Topic

Scatterplot smoothing

About: Scatterplot smoothing is a research topic. Over its lifetime, 116 publications have appeared within this topic, receiving 17,642 citations.


Papers
Journal Article
William S. Cleveland
TL;DR: Robust locally weighted regression is a method for smoothing a scatterplot in which the fitted value at x_k is the value of a polynomial fit to the data using weighted least squares, where the weight for (x_i, y_i) is large if x_i is close to x_k and small if it is not.
Abstract: The visual information on a scatterplot can be greatly enhanced, with little additional cost, by computing and plotting smoothed points. Robust locally weighted regression is a method for smoothing a scatterplot, (x_i, y_i), i = 1, …, n, in which the fitted value at x_k is the value of a polynomial fit to the data using weighted least squares, where the weight for (x_i, y_i) is large if x_i is close to x_k and small if it is not. A robust fitting procedure is used that guards against deviant points distorting the smoothed points. Visual, computational, and statistical issues of robust locally weighted regression are discussed. Several examples, including data on lead intoxication, are used to illustrate the methodology.
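
The procedure described in this abstract can be illustrated with a minimal pure-Python sketch. The abstract does not fix the details, so the following are assumptions: a degree-1 local polynomial, tricube neighbourhood weights, bisquare robustness weights scaled by six times the median absolute residual, and the hypothetical names `robust_lowess`, `frac`, and `robust_iters`.

```python
import math
import statistics


def tricube(u):
    """Neighbourhood weight: close to 1 for small |u|, exactly 0 for |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def bisquare(u):
    """Robustness weight: shrinks for large scaled residuals, 0 for |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 2) ** 2 if u < 1.0 else 0.0


def weighted_line_at(xs, ys, ws, x0):
    """Evaluate at x0 the straight line fitted by weighted least squares."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    if sxx == 0.0:  # all weight sits on one x value: use the weighted mean
        return my
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    return my + sxy / sxx * (x0 - mx)


def robust_lowess(xs, ys, frac=0.5, robust_iters=2):
    """Smoothed values at each x_k (assumed details, see the text above)."""
    n = len(xs)
    r = min(n, max(2, math.ceil(frac * n)))  # points per neighbourhood
    delta = [1.0] * n                         # robustness weights, all 1 at first
    fitted = [0.0] * n
    for _ in range(robust_iters + 1):
        for k in range(n):
            # h is the distance from x_k to its r-th nearest neighbour
            h = sorted(abs(x - xs[k]) for x in xs)[r - 1] or 1e-12
            local = [tricube((x - xs[k]) / h) for x in xs]
            ws = [d * w for d, w in zip(delta, local)]
            if sum(ws) == 0.0:  # every neighbour was rejected as deviant
                ws = local
            fitted[k] = weighted_line_at(xs, ys, ws, xs[k])
        resid = [y - f for y, f in zip(ys, fitted)]
        s = statistics.median([abs(e) for e in resid])
        if s == 0.0:
            break
        delta = [bisquare(e / (6.0 * s)) for e in resid]
    return fitted
```

Calling `robust_lowess(xs, ys, robust_iters=0)` gives the plain locally weighted fit, which makes it easy to compare how much a single deviant point pulls the smoothed curve with and without the robustness step.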

10,225 citations

Journal Article
TL;DR: The authors propose using a relatively large number of knots together with a difference penalty on the coefficients of adjacent B-splines, and show connections to the familiar spline penalty on the integral of the squared second derivative.
Abstract: B-splines are attractive for nonparametric modelling, but choosing the optimal number and positions of knots is a complex task. Equidistant knots can be used, but their small and discrete number allows only limited control over smoothness and fit. We propose to use a relatively large number of knots and a difference penalty on coefficients of adjacent B-splines. We show connections to the familiar spline penalty on the integral of the squared second derivative. A short overview of B-splines, of their construction and of penalized likelihood is presented. We discuss properties of penalized B-splines and propose various criteria for the choice of an optimal penalty parameter. Nonparametric logistic regression, density estimation and scatterplot smoothing are used as examples. Some details of the computations are presented.
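
A minimal pure-Python sketch of the idea in this abstract, for the scatterplot smoothing case only: a B-spline basis on many equidistant knots, fitted by least squares with a difference penalty on adjacent coefficients. The function names, the default settings (`n_segments`, `degree`, `lam`, `order`), and the hand-written linear solver are all assumptions made for illustration; the paper's criteria for choosing the penalty parameter are not reproduced.

```python
def bspline_row(x, knots, degree):
    """Values of all B-splines of the given degree at x (Cox-de Boor recursion)."""
    b = [1.0 if knots[j] <= x < knots[j + 1] else 0.0
         for j in range(len(knots) - 1)]
    for d in range(1, degree + 1):
        b = [(x - knots[j]) / (knots[j + d] - knots[j]) * b[j]
             + (knots[j + d + 1] - x) / (knots[j + d + 1] - knots[j + 1]) * b[j + 1]
             for j in range(len(knots) - d - 1)]
    return b


def solve(a, rhs):
    """Solve a z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            for j in range(c, n + 1):
                m[i][j] -= f * m[c][j]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z


def pspline_fit(xs, ys, n_segments=20, degree=3, lam=1.0, order=2):
    """Penalized B-spline fit; returns (coefficients, fitted values)."""
    lo, hi = min(xs), max(xs)
    h = (hi - lo) / n_segments
    # Equidistant knots, extended `degree` steps beyond the data on each side.
    knots = [lo + (j - degree) * h for j in range(n_segments + 2 * degree + 1)]
    basis = [bspline_row(x, knots, degree) for x in xs]
    m = n_segments + degree  # number of B-splines
    # Difference matrix of the requested order, built by differencing the identity.
    diff = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for _ in range(order):
        diff = [[r2[j] - r1[j] for j in range(m)] for r1, r2 in zip(diff, diff[1:])]
    # Penalized normal equations: (B'B + lam * D'D) a = B'y
    lhs = [[sum(b[i] * b[j] for b in basis) + lam * sum(d[i] * d[j] for d in diff)
            for j in range(m)] for i in range(m)]
    rhs = [sum(b[i] * y for b, y in zip(basis, ys)) for i in range(m)]
    coef = solve(lhs, rhs)
    fitted = [sum(c * v for c, v in zip(coef, b)) for b in basis]
    return coef, fitted
```

With a second-order difference penalty, data that lie exactly on a straight line incur no penalty, so the fit reproduces them for any positive `lam`; increasing `lam` on noisy data pulls the curve toward a straight line.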

3,512 citations

Journal Article
TL;DR: The authors apply plug-in bandwidth selection to develop strategies for choosing the smoothing parameter of local linear least squares kernel estimators; the results apply to odd-degree local polynomial fits and can be extended to other settings, such as derivative estimation and multiple nonparametric regression.
Abstract: Local least squares kernel regression provides an appealing solution to the nonparametric regression, or “scatterplot smoothing,” problem, as demonstrated by Fan, for example. The practical implementation of any scatterplot smoother is greatly enhanced by the availability of a reliable rule for automatic selection of the smoothing parameter. In this article we apply the ideas of plug-in bandwidth selection to develop strategies for choosing the smoothing parameter of local linear least squares kernel estimators. Our results are applicable to odd-degree local polynomial fits and can be extended to other settings, such as derivative estimation and multiple nonparametric regression. An implementation in the important case of local linear fits with univariate predictors is shown to perform well in practice. A by-product of our work is the development of a class of nonparametric variance estimators, based on local least squares ideas, and plug-in rules for their implementation.
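
For orientation, the local linear estimator whose smoothing parameter is being selected can be written in the usual textbook form below. The notation (data (X_i, Y_i), kernel K, bandwidth h, weighted moments \hat s_r) is assumed here and does not come from the abstract, and the plug-in rule for choosing h is not reproduced.

```latex
% Local linear fit at x: weighted least squares with kernel weights
(\hat\beta_0, \hat\beta_1)
  = \arg\min_{\beta_0,\beta_1}
    \sum_{i=1}^{n} \bigl\{ Y_i - \beta_0 - \beta_1 (X_i - x) \bigr\}^2 K_h(X_i - x),
\qquad K_h(u) = h^{-1} K(u/h),
\qquad \hat m(x; h) = \hat\beta_0 .

% Closed form of the intercept
\hat m(x; h)
  = \frac{1}{n} \sum_{i=1}^{n}
    \frac{\bigl\{ \hat s_2(x; h) - \hat s_1(x; h)\,(X_i - x) \bigr\}\, K_h(X_i - x)\, Y_i}
         {\hat s_2(x; h)\, \hat s_0(x; h) - \hat s_1(x; h)^2},
\qquad
\hat s_r(x; h) = \frac{1}{n} \sum_{i=1}^{n} (X_i - x)^r K_h(X_i - x).
```

A small h makes the weights concentrate on points very close to x and gives a wiggly fit, while a large h approaches a single global straight line, which is why an automatic rule for h matters in practice.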

850 citations

Book Chapter
TL;DR: In this article, the authors use the local scoring algorithm to estimate the functions f_j(x_j) nonparametrically, using a scatterplot smoother as a building block.
Abstract: Generalized additive models have the form η(x) = α + Σ_j f_j(x_j), where η might be the regression function in a multiple regression or the logistic transformation of the posterior probability Pr(y = 1 | x) in a logistic regression. In fact, these models generalize the whole family of generalized linear models η(x) = β′x, where η(x) = g(μ(x)) is some transformation of the regression function. We use the local scoring algorithm to estimate the functions f_j(x_j) nonparametrically, using a scatterplot smoother as a building block. We demonstrate the models in two different analyses: a nonparametric analysis of covariance and a logistic regression. The procedure can be used as a diagnostic tool for identifying parametric transformations of the covariates in a standard linear analysis. A variety of inferential tools have been developed to aid the analyst in assessing the relevance and significance of the estimated functions: these include confidence curves, degrees of freedom estimates, and approximat...
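
To show how a scatterplot smoother can act as a building block for the additive form η(x) = α + Σ_j f_j(x_j), here is a minimal pure-Python sketch of backfitting for ordinary additive regression (identity link). It is not the local scoring algorithm itself, which the abstract does not spell out. The Gaussian kernel smoother, the single shared `bandwidth`, the fixed number of `sweeps`, and the function names are assumptions chosen for brevity.

```python
import math


def kernel_smooth(xs, rs, bandwidth):
    """Scatterplot smoother: Gaussian-weighted mean of rs at each value in xs."""
    out = []
    for x0 in xs:
        ws = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
        out.append(sum(w * r for w, r in zip(ws, rs)) / sum(ws))
    return out


def backfit(columns, ys, bandwidth=0.3, sweeps=20):
    """Estimate alpha and one centred function per covariate column."""
    n = len(ys)
    alpha = sum(ys) / n
    fs = [[0.0] * n for _ in columns]
    for _ in range(sweeps):
        for j, xj in enumerate(columns):
            # Partial residuals: remove alpha and every other fitted function.
            partial = [
                ys[i] - alpha - sum(f[i] for k, f in enumerate(fs) if k != j)
                for i in range(n)
            ]
            fj = kernel_smooth(xj, partial, bandwidth)
            mean_fj = sum(fj) / n
            fs[j] = [v - mean_fj for v in fj]  # centre so alpha stays identifiable
    return alpha, fs
```

The fitted value for observation i is `alpha + sum(f[i] for f in fs)`. Plotting each `fs[j]` against its covariate gives the kind of diagnostic picture the abstract mentions for spotting parametric transformations.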

637 citations

Journal Article
30 Aug 1985-Science
TL;DR: The computer graphics revolution has stimulated the invention of many graphical methods for analyzing and presenting scientific data, such as box plots, two-tiered error bars, scatterplot smoothing, dot charts, and graphing on a log base 2 scale.
Abstract: Graphical perception is the visual decoding of the quantitative and qualitative information encoded on graphs. Recent investigations have uncovered basic principles of human graphical perception that have important implications for the display of data. The computer graphics revolution has stimulated the invention of many graphical methods for analyzing and presenting scientific data, such as box plots, two-tiered error bars, scatterplot smoothing, dot charts, and graphing on a log base 2 scale.

561 citations


Network Information
Related Topics (5)
Multivariate statistics: 18.4K papers, 1M citations, 74% related
Regression analysis: 31K papers, 1.7M citations, 74% related
Statistical hypothesis testing: 19.5K papers, 1M citations, 72% related
Linear model: 19K papers, 1M citations, 71% related
Linear regression: 21.3K papers, 1.2M citations, 70% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2021    8
2020    8
2019    4
2018    3
2017    6
2016    8