
Showing papers in "Journal of Computational and Graphical Statistics in 1996"


Journal ArticleDOI
TL;DR: The authors discuss their experience designing and implementing a statistical computing language that combines what they felt were useful features from two existing computer languages; they argue the new language offers advantages in portability, computational efficiency, memory management, and scoping.
Abstract: In this article we discuss our experience designing and implementing a statistical computing language. In developing this new language, we sought to combine what we felt were useful features from two existing computer languages. We feel that the new language provides advantages in the areas of portability, computational efficiency, memory management, and scoping.

9,446 citations


Journal ArticleDOI
TL;DR: A new algorithm based on a Monte Carlo method is presented that applies to a broad class of nonlinear, non-Gaussian, higher-dimensional state space models, provided that the dimensions of the system noise and the observation noise are relatively low.
Abstract: A new algorithm for the prediction, filtering, and smoothing of non-Gaussian nonlinear state space models is presented. The algorithm is based on a Monte Carlo method in which the successive prediction, filtering, and smoothing conditional probability density functions are approximated by many of their realizations. The particular contribution of this algorithm is that it can be applied to a broad class of nonlinear, non-Gaussian, higher-dimensional state space models, provided that the dimensions of the system noise and the observation noise are relatively low. Several numerical examples are given.
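
As a rough illustration of the Monte Carlo filtering idea, the sketch below implements a bootstrap-style particle filter for a toy univariate nonlinear model; the transition and observation equations, noise choices, and particle count are illustrative assumptions, not the paper's specification.

```python
import numpy as np

rng = np.random.default_rng(0)

def state_transition(x):
    # toy nonlinear transition with heavy-tailed (non-Gaussian) system noise
    return 0.5 * x + 25.0 * x / (1.0 + x**2) + rng.standard_t(3, size=x.shape)

def obs_loglik(y, x):
    # observation y_t = x_t^2 / 20 + standard normal noise
    return -0.5 * (y - x**2 / 20.0) ** 2

def particle_filter(ys, n_particles=5000):
    particles = rng.normal(0.0, 5.0, n_particles)   # draws from an initial prior
    means = []
    for y in ys:
        particles = state_transition(particles)    # prediction: push realizations forward
        logw = obs_loglik(y, particles)             # filtering: weight by the observation
        w = np.exp(logw - logw.max())
        w /= w.sum()
        particles = particles[rng.choice(n_particles, n_particles, p=w)]  # resample
        means.append(particles.mean())
    return np.array(means)

# simulate a short series from the same toy model, then filter it
xs = [0.0]
for _ in range(50):
    xs.append(float(state_transition(np.array([xs[-1]]))[0]))
ys = np.array(xs[1:]) ** 2 / 20.0 + rng.normal(0.0, 1.0, 50)
print(particle_filter(ys)[:5])                      # filtered state means
```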

2,406 citations


Journal ArticleDOI
TL;DR: In this paper, a general definition of residuals for regression models with independent responses is given, which produces residuals that are exactly normal, apart from sampling variability in the estimated parameters, by inverting the fitted distribution function for each response value and finding the equivalent standard normal quantile.
Abstract: In this article we give a general definition of residuals for regression models with independent responses. Our definition produces residuals that are exactly normal, apart from sampling variability in the estimated parameters, by inverting the fitted distribution function for each response value and finding the equivalent standard normal quantile. Our definition includes some randomization to achieve continuous residuals when the response variable is discrete. Quantile residuals are easily computed in computer packages such as SAS, S-Plus, GLIM, or LispStat, and allow residual analyses to be carried out in many commonly occurring situations in which the customary definitions of residuals fail. Quantile residuals are applied in this article to three example data sets.
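
The construction is easy to reproduce. Below is a minimal sketch for a Poisson response, where the discrete distribution function is randomized between F(y-1) and F(y) before inverting the normal quantile; the Poisson case and all tuning choices are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def quantile_residuals_poisson(y, mu, rng=np.random.default_rng(0)):
    # For a discrete response, draw u uniformly between F(y-1) and F(y) so the
    # residuals are continuous and, under a correct model, close to N(0, 1)
    # apart from parameter-estimation variability.
    lower = stats.poisson.cdf(y - 1, mu)   # F(y-1); equals 0 when y == 0
    upper = stats.poisson.cdf(y, mu)       # F(y)
    u = rng.uniform(lower, upper)
    return stats.norm.ppf(u)

# Data drawn from the fitted model should give roughly standard normal residuals.
rng = np.random.default_rng(1)
mu = rng.uniform(1.0, 10.0, 500)           # fitted means from some model
y = rng.poisson(mu)
r = quantile_residuals_poisson(y, mu)
print(r.mean(), r.std())                   # approximately 0 and 1
```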

838 citations


Journal ArticleDOI
TL;DR: A rudimentary taxonomy of interactive data visualization is proposed, based on a triad of data analytic tasks (finding Gestalt, posing queries, and making comparisons) and illustrated with three XGobi tools: high-dimensional projections, linked scatterplot brushing, and matrices of conditional plots.
Abstract: We propose a rudimentary taxonomy of interactive data visualization based on a triad of data analytic tasks: finding Gestalt, posing queries, and making comparisons. These tasks are supported by three classes of interactive view manipulations: focusing, linking, and arranging views. This discussion extends earlier work on the principles of focusing and linking and sets them on a firmer base. Next, we give a high-level introduction to a particular system for multivariate data visualization—XGobi. This introduction is not comprehensive but emphasizes XGobi tools that are examples of focusing, linking, and arranging views; namely, high-dimensional projections, linked scatterplot brushing, and matrices of conditional plots. Finally, in a series of case studies in data visualization, we show the powers and limitations of particular focusing, linking, and arranging tools. The discussion is dominated by high-dimensional projections that form an extremely well-developed part of XGobi. Of particular inter...

389 citations


Journal ArticleDOI
Abstract: We consider the kernel estimator of conditional density and derive its asymptotic bias, variance, and mean squared error. Optimal bandwidths (with respect to integrated mean squared error) are found, and it is shown that the convergence rate of the density estimator is of order n^(-2/3). We also note that the conditional mean function obtained from the estimator is equivalent to a kernel smoother. Given the undesirable bias properties of kernel smoothers, we seek a modified conditional density estimator whose mean is equivalent to some other nonparametric regression smoother with better bias properties. It is also shown that our modified estimator has smaller mean squared error than the standard estimator in some commonly occurring situations. Finally, three graphical methods for visualizing conditional density estimators are discussed and applied to a data set consisting of maximum daily temperatures in Melbourne, Australia.
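
For concreteness, here is a minimal sketch of the standard kernel conditional density estimator that the article starts from, with product Gaussian kernels; the fixed bandwidths h and b are placeholder choices, not the optimal values derived in the article.

```python
import numpy as np

def gauss_kernel(u, bw):
    return np.exp(-0.5 * (u / bw) ** 2) / (bw * np.sqrt(2.0 * np.pi))

def cond_density(x0, y_grid, x, y, h=0.3, b=0.3):
    # f_hat(y | x0) = sum_i K_h(x0 - x_i) K_b(y - y_i) / sum_i K_h(x0 - x_i)
    wx = gauss_kernel(x0 - x, h)                        # weights in the x direction
    ky = gauss_kernel(y_grid[:, None] - y[None, :], b)  # kernels in the y direction
    return ky @ wx / wx.sum()

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = x + rng.normal(scale=0.5, size=500)
grid = np.linspace(-3.0, 3.0, 61)
dens = cond_density(1.0, grid, x, y)                    # estimate of f(y | x = 1)
print(dens.sum() * (grid[1] - grid[0]))                 # integrates to roughly 1
```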

384 citations


Journal ArticleDOI
TL;DR: Trellis display provides a powerful mechanism for understanding interactions in studies of how a response depends on explanatory variables; in three examples, important discoveries are made that were not appreciated in the original analyses.
Abstract: Trellis display is a framework for the visualization of data. Its most prominent aspect is an overall visual design, reminiscent of a garden trelliswork, in which panels are laid out into rows, columns, and pages. On each panel of the trellis, a subset of the data is graphed by a display method such as a scatterplot, curve plot, boxplot, 3-D wireframe, normal quantile plot, or dot plot. Each panel shows the relationship of certain variables conditional on the values of other variables. A number of display methods employed in the visual design of Trellis display enable it to succeed in uncovering the structure of data even when the structure is quite complicated. For example, Trellis display provides a powerful mechanism for understanding interactions in studies of how a response depends on explanatory variables. Three examples demonstrate this; in each case, we make important discoveries not appreciated in the original analyses. Several control methods are also essential to Trellis display. A con...
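
A rough imitation of the paneling idea (not the S-PLUS Trellis library itself): panels of y versus x conditional on intervals of a third variable z, where the synthetic data and the three-interval split are assumptions for illustration. Because the slope of y on x changes with z, the interaction is visible across panels.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 300)
z = rng.uniform(0, 1, 300)
y = np.sin(2 * np.pi * x) * z + rng.normal(0, 0.1, 300)  # x-z interaction

edges = np.quantile(z, [0.0, 1 / 3, 2 / 3, 1.0])         # condition on 3 z-intervals
fig, axes = plt.subplots(1, 3, figsize=(9, 3), sharex=True, sharey=True)
for ax, lo, hi in zip(axes, edges[:-1], edges[1:]):
    m = (z >= lo) & (z <= hi)
    ax.plot(x[m], y[m], "o", ms=3)
    ax.set_title(f"z in [{lo:.2f}, {hi:.2f}]")           # each panel: y vs x given z
plt.tight_layout()
plt.show()
```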

294 citations


Journal ArticleDOI
TL;DR: It is shown how the DWT breaks down an fdGn, and the exact correlation structure of the resulting coefficients is given for different wavelets (Daubechies' minimum-phase and least-asymmetric, and Haar).
Abstract: The discrete wavelet transform (DWT) can be interpreted as a filtering of a time series by a set of octave band filters such that the width of each band as a proportion of its center frequency is constant. A long-memory process having a power spectrum that plots as a straight line on log-frequency/log-power scales over many octaves of frequency is intrinsically related to such a structure. As an example of such processes, we focus on one class of discrete-time, stationary, long-memory processes, the fractionally differenced Gaussian white noise processes (fdGn). We show how the DWT breaks down an fdGn, and show the exact correlation structure of the resulting coefficients for different wavelets (Daubechies' minimum-phase and least-asymmetric, and Haar). The DWT is an impressive “whitening filter.” A discrete wavelet-based scheme for simulating fdGns is discussed and is shown to be equivalent to a spectral decomposition of the covariance matrix of the process; however, it can be carried out using o...
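
A small sketch of the whitening claim, assuming PyWavelets is available: an approximate fdGn path is generated from a truncated MA representation of (1-B)^(-d), and the lag-1 autocorrelation of the DWT detail coefficients is compared with that of the original series. The truncation length, sample size, and db4 wavelet are illustrative choices, not the paper's exact setup.

```python
import numpy as np
from scipy.special import gammaln
import pywt

rng = np.random.default_rng(0)
d, n, trunc = 0.4, 4096, 2000

k = np.arange(trunc)
# MA coefficients psi_k = Gamma(k + d) / (Gamma(k + 1) Gamma(d)) of (1 - B)^(-d)
psi = np.exp(gammaln(k + d) - gammaln(k + 1) - gammaln(d))
eps = rng.standard_normal(n + trunc)
x = np.convolve(eps, psi, mode="valid")[:n]        # approximate fdGn path

def lag1_corr(v):
    return np.corrcoef(v[:-1], v[1:])[0, 1]

print("series lag-1 autocorrelation:", round(lag1_corr(x), 3))  # strongly correlated
for level, coeffs in zip(range(5, 0, -1), pywt.wavedec(x, "db4", level=5)[1:]):
    # detail coefficients within each level should be nearly uncorrelated
    print(f"level-{level} details: lag-1 autocorrelation {lag1_corr(coeffs):.3f}")
```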

158 citations


Journal ArticleDOI
TL;DR: The MANET software has been developed for keeping track of missing values in interactive graphics analyses and for investigating new interactive graphics tools.
Abstract: Missing values are a problem for statistical methods. This applies just as much to modern methods such as interactive graphics as to more classical methods. The MANET software has been developed for keeping track of missing values in interactive graphics analyses and for investigating new interactive graphics tools.

115 citations


Journal ArticleDOI
TL;DR: This article describes a set of pixel-oriented visualization techniques that use each pixel of the display to visualize one data value and therefore allow the visualization of the largest amount of data possible.
Abstract: An important goal of visualization technology is to support the exploration and analysis of very large amounts of data. This article describes a set of pixel-oriented visualization techniques that use each pixel of the display to visualize one data value and therefore allow the visualization of the largest amount of data possible. Most of the techniques have been specifically designed for visualizing and querying large data bases. The techniques may be divided into query-independent techniques that directly visualize the data (or a certain portion of it) and query-dependent techniques that visualize the data in the context of a specific query. Examples for the class of query-independent techniques are the screen-filling curve and recursive pattern techniques. The screen-filling curve techniques are based on the well-known Morton and Peano–Hilbert curve algorithms, and the recursive pattern technique is based on a generic recursive scheme, which generalizes a wide range of pixel-oriented arrangeme...
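
A minimal sketch of the screen-filling-curve idea, using the standard Hilbert curve index-to-coordinate mapping to place one data value per pixel; the grid size and data are illustrative, and this is not the authors' implementation. Values adjacent in the sorted data stay adjacent on screen, which is the point of using a space-filling curve rather than row-major order.

```python
import numpy as np

def hilbert_d2xy(order, d):
    """Map position d along a Hilbert curve to (x, y) on a 2^order grid."""
    x = y = 0
    s, t = 1, d
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                       # rotate/reflect the quadrant as needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x, y = x + s * rx, y + s * ry
        t //= 4
        s *= 2
    return x, y

order = 5                                  # 32 x 32 = 1,024 pixels
side = 1 << order
values = np.sort(np.random.default_rng(0).normal(size=side * side))  # one value per pixel
img = np.empty((side, side))
for pos, v in enumerate(values):
    xx, yy = hilbert_d2xy(order, pos)
    img[yy, xx] = v                        # display with, e.g., plt.imshow(img)
```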

91 citations


Journal ArticleDOI
TL;DR: It is shown that leverage effects in regression models make convergence of the Gibbs sampling algorithm very difficult on data sets with strong masking.
Abstract: This article discusses the convergence of the Gibbs sampling algorithm when it is applied to the problem of outlier detection in regression models. Theoretically, given any vector of initial conditions, the algorithm converges to the true posterior distribution; however, the speed of convergence may slow down in a high-dimensional parameter space where the parameters are highly correlated. We show that leverage effects in regression models make convergence of the Gibbs sampling algorithm very difficult on data sets with strong masking. The problem is illustrated with examples.

44 citations


Journal ArticleDOI
TL;DR: The basic idea, which may be described as “sequential linearization of constraints,” is a very simple one, but it could have significant ramifications for the implementation and practical use of empirical likelihood methodology.
Abstract: Empirical likelihood for a mean is straightforward to compute, but for nonlinear statistics significant computational difficulties arise because of the presence of nonlinear constraints in the underlying optimization problem. It is certainly the case that these difficulties can be overcome with sufficient time, care, and programming effort. However, they do make it difficult to write general software for implementing empirical likelihood, and therefore these difficulties are likely to hinder the widespread use of empirical likelihood in applied work. The purpose of this article is to suggest an approximate approach that sidesteps the difficult computational issues. The basic idea, which may be described as “sequential linearization of constraints,” is a very simple one, but we believe it could have significant ramifications for the implementation and practical use of empirical likelihood methodology. One application of the linearization approach, which we consider in this article, is to the probl...
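
For reference, the "straightforward" base case mentioned above, empirical likelihood for a mean, reduces to a one-dimensional root-finding problem in the Lagrange multiplier. A sketch follows; this is the easy case, not the article's sequential linearization scheme, and the epsilon-padded bracketing interval is a stability assumption.

```python
import numpy as np
from scipy.optimize import brentq

def el_log_ratio(x, mu):
    """-2 log empirical likelihood ratio for the mean mu (asymptotically chi^2_1)."""
    z = x - mu
    if z.max() <= 0 or z.min() >= 0:
        return np.inf                      # mu outside the convex hull of the data
    eps = 1e-10
    # weights w_i = 1 / (n (1 + lam z_i)); lam solves sum z_i / (1 + lam z_i) = 0
    lam = brentq(lambda l: np.sum(z / (1.0 + l * z)),
                 -1.0 / z.max() + eps, -1.0 / z.min() - eps)
    return 2.0 * np.sum(np.log1p(lam * z))

x = np.random.default_rng(0).exponential(size=50)
print(el_log_ratio(x, x.mean()))           # 0 at the sample mean
print(el_log_ratio(x, x.mean() + 0.3))     # grows as mu moves away
```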

Journal ArticleDOI
Robert Gray1
TL;DR: In this paper, a method for nonparametric estimation of hazard rates as a function of time and possibly multiple covariates is proposed, based on dividing the time axis into intervals and calculating the number of events and the follow-up time contributed within each interval.
Abstract: This article proposes a method for nonparametric estimation of hazard rates as a function of time and possibly multiple covariates. The method is based on dividing the time axis into intervals and calculating the number of events and the follow-up time contributed within each interval. The event counts and follow-up times are then separately smoothed on time and the covariates, and the hazard rate estimator is obtained by taking the ratio. Pointwise consistency and asymptotic normality are shown for the hazard rate estimator for a certain class of smoothers, which includes some standard approaches to locally weighted regression and kernel regression. It is shown through simulation that a variance estimator based on this asymptotic distribution is reasonably reliable in practice. The problem of how to select the smoothing parameter is considered, but a satisfactory resolution to this problem has not been identified. The method is illustrated using data from several breast cancer clinical t...
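
A minimal sketch of the count/exposure idea in the time-only case (no covariates): bin the time axis, accumulate event counts and follow-up time per bin, smooth both with the same kernel weights, and take the ratio. The bin width, Gaussian kernel, and bandwidth are illustrative assumptions.

```python
import numpy as np

def hazard_estimate(times, events, grid, bin_width=0.25, bandwidth=1.0):
    edges = np.arange(0.0, times.max() + bin_width, bin_width)
    mids = 0.5 * (edges[:-1] + edges[1:])
    d = np.histogram(times[events == 1], bins=edges)[0]   # events per bin
    # follow-up time each subject contributes to each bin
    r = np.array([np.clip(times - lo, 0.0, bin_width).sum() for lo in edges[:-1]])
    K = np.exp(-0.5 * ((grid[:, None] - mids[None, :]) / bandwidth) ** 2)
    return (K @ d) / (K @ r)                              # smoothed counts / exposure

rng = np.random.default_rng(0)
t_event = rng.exponential(2.0, 300)           # true constant hazard 0.5
t_cens = rng.uniform(0.0, 6.0, 300)
times = np.minimum(t_event, t_cens)
events = (t_event <= t_cens).astype(int)
grid = np.linspace(0.5, 3.0, 6)
print(hazard_estimate(times, events, grid))   # roughly 0.5 across the grid
```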

Journal ArticleDOI
TL;DR: In this article, an algorithm for isotonic regression on ordered rectangular grids is presented, with running time no more than cubic in the number of grid points, which makes bivariate isotonic regression a practical choice in some data analyses.
Abstract: In this article, we give an algorithm for isotonic regression on ordered rectangular grids. The running time of the algorithm is no more than cubic in the number of grid points. This algorithm makes bivariate isotonic regression a practical choice in some data analyses.
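
The grid algorithm itself is not reproduced here; as background, the sketch below shows the classic pool-adjacent-violators algorithm for the one-dimensional case that the bivariate grid problem generalizes.

```python
def pava(y):
    """Least-squares fit of a nondecreasing sequence to y (1-D isotonic regression)."""
    values, weights = [], []
    for v in y:
        values.append(float(v))
        weights.append(1)
        # pool adjacent blocks while they violate monotonicity
        while len(values) > 1 and values[-2] > values[-1]:
            w = weights[-2] + weights[-1]
            values[-2] = (weights[-2] * values[-2] + weights[-1] * values[-1]) / w
            weights[-2] = w
            del values[-1], weights[-1]
    fit = []
    for v, w in zip(values, weights):
        fit.extend([v] * w)                # expand each pooled block
    return fit

print(pava([1, 3, 2, 2, 5, 4]))            # [1.0, 2.33.., 2.33.., 2.33.., 4.5, 4.5]
```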

Journal ArticleDOI
TL;DR: Voyager is an extensible data analysis system, based on Oberon, that exploits recent advances in software technology for statistical computing, in particular dynamic loading and type-safety across module boundaries, even at run time.
Abstract: Recent changes in software technology have opened new possibilities for statistical computing. Conditions for creating efficient and reliable extensible systems have been largely improved by programming languages and systems that provide dynamic loading and type-safety across module boundaries, even at run time. We introduce Voyager, an extensible data analysis system based on Oberon, which tries to exploit some of these possibilities.

Journal ArticleDOI
TL;DR: In this article, a polynomial multiplication algorithm is proposed to compute exact distributions and tail areas for the family of stratum-additive statistics, which includes the score, likelihood ratio, and other statistics.
Abstract: The investigation of interaction in a series of 2 × 2 tables is warranted in a variety of research endeavors. Though many large-sample approaches for such investigations are available, the exact analysis of the problem has been formulated for the probability statistic only. We present several alternative statistics applicable in this context. We also give an efficient polynomial multiplication algorithm to compute exact distributions and tail areas for the family of stratum-additive statistics. Besides the probability statistic, these include the score, likelihood ratio, and other statistics. In addition to comparing, in empirical terms, the diverse computational strategies for exact interaction analysis, we also explore the theoretical linkages between them. Data from published papers are used for illustration.
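
The polynomial multiplication idea can be sketched directly: represent each stratum's contribution as a mapping from statistic values to probabilities and convolve across strata. In the sketch below the statistic is simply the (1,1) cell count with hypergeometric stratum distributions, a deliberately simple stand-in for the paper's general stratum-additive family; the stratum margins are made-up numbers.

```python
from collections import defaultdict
from scipy.stats import hypergeom

def stratum_poly(n1, n2, m):
    """Value -> probability pairs for one 2x2 table with row sums n1, n2 and
    first-column sum m; the 'statistic' here is the (1,1) cell count."""
    lo, hi = max(0, m - n2), min(n1, m)
    return {x: hypergeom.pmf(x, n1 + n2, n1, m) for x in range(lo, hi + 1)}

def multiply(polys):
    dist = {0: 1.0}
    for poly in polys:                      # one polynomial multiplication per stratum
        new = defaultdict(float)
        for v1, p1 in dist.items():
            for v2, p2 in poly.items():
                new[v1 + v2] += p1 * p2     # exponents add, coefficients multiply
        dist = dict(new)
    return dist

strata = [(5, 7, 4), (6, 6, 5), (8, 4, 6)]  # (n1, n2, m) per stratum, illustrative
dist = multiply([stratum_poly(*s) for s in strata])
tail = sum(p for v, p in dist.items() if v >= 10)   # exact upper tail area
print(round(tail, 4))
```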

Journal ArticleDOI
TL;DR: In this article, the authors investigate the extension of binning methodology to fast computation of several auxiliary quantities that arise in local polynomial smoothing, such as degrees of freedom measures, cross-validation functions, variance estimates, and exact measures of error.
Abstract: We investigate the extension of binning methodology to fast computation of several auxiliary quantities that arise in local polynomial smoothing. Examples include degrees of freedom measures, cross-validation functions, variance estimates, and exact measures of error. It is shown that the computational effort required for such approximations is of the same order of magnitude as that required for a binned local polynomial smooth.
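
The flavor of binned computation can be seen in the degree-zero case: once the data are reduced to bin counts and bin sums, the numerator and denominator of the smooth at every bin center are two discrete convolutions. The sketch below shows this for a Nadaraya-Watson smooth (local polynomial of degree 0), with bin count and bandwidth as illustrative choices; it is not the authors' local polynomial code.

```python
import numpy as np

def binned_smooth(x, y, n_bins=400, bandwidth=0.05):
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins)              # bin counts
    sums = np.bincount(idx, weights=y, minlength=n_bins)     # bin sums of y
    mids = 0.5 * (edges[:-1] + edges[1:])
    delta = mids[1] - mids[0]
    half = int(np.ceil(4 * bandwidth / delta))               # kernel support in bins
    kern = np.exp(-0.5 * (np.arange(-half, half + 1) * delta / bandwidth) ** 2)
    num = np.convolve(sums, kern, mode="same")               # one pass over bins
    den = np.convolve(counts, kern, mode="same")
    return mids, num / np.maximum(den, 1e-12)

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 5000)
y = np.sin(4 * np.pi * x) + rng.normal(0, 0.3, 5000)
mids, fit = binned_smooth(x, y)            # smooth evaluated at all bin centers
```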

Journal ArticleDOI
TL;DR: It is shown that the use of an improper prior leads to an improper posterior even though the conditionals are proper, so a formal Gibbs sampler can still be constructed; the problem is solved by using a vague but proper prior.
Abstract: In this article we examine the use of Gibbs sampling to estimate the autocorrelation coefficient in a linear regression model. Researchers had previously experienced difficulty with moderately to highly positively autocorrelated errors; estimates could be unstable and sometimes failed to converge. We show that the cause of this problem is that the use of an improper prior leads to an improper posterior, even though the conditionals are proper and hence a formal Gibbs sampler can be constructed. The problem is solved by the use of a vague but proper prior. In this simple case many of the calculations can be done analytically, and the example serves as a warning against the uncritical use of improper priors with Gibbs sampling.
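
A stripped-down sketch of the vague-but-proper-prior fix, with the regression part omitted so that only an AR(1) series and its two conditionals remain; the N(0, 100) prior on rho and the InvGamma(0.01, 0.01) prior on sigma^2 are illustrative stand-ins for the paper's choices.

```python
import numpy as np

rng = np.random.default_rng(0)
rho_true, n = 0.9, 400
u = np.zeros(n)
for t in range(1, n):                       # simulate an AR(1) error series
    u[t] = rho_true * u[t - 1] + rng.standard_normal()

tau2, a0, b0 = 100.0, 0.01, 0.01            # vague but proper hyperparameters
rho, sig2 = 0.0, 1.0
draws = []
for it in range(3000):
    # rho | sigma^2: normal prior times normal likelihood gives a normal posterior
    sxx = np.sum(u[:-1] ** 2)
    sxy = np.sum(u[:-1] * u[1:])
    v = 1.0 / (1.0 / tau2 + sxx / sig2)
    rho = rng.normal(v * sxy / sig2, np.sqrt(v))
    # sigma^2 | rho: inverse gamma posterior
    sse = np.sum((u[1:] - rho * u[:-1]) ** 2)
    sig2 = 1.0 / rng.gamma(a0 + (n - 1) / 2.0, 1.0 / (b0 + sse / 2.0))
    if it >= 500:                           # discard burn-in
        draws.append(rho)
print(np.mean(draws))                       # close to 0.9
```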

Journal ArticleDOI
TL;DR: In this paper, a jump process is used to explain irregular fluctuations, such as discontinuities and nonlinear drift in the mean, that are not plausibly modeled by fractal processes.
Abstract: Arguably the best-known applications of fractal methods are in relatively homogeneous, stationary settings, where the environment is controllable by scientists or engineers. For example, in applications to surface science, an unblemished portion of a surface is selected for analysis; and in environmental science, an artificial soil bed of controlled homogeneity is subjected to uniformly distributed water droplets, to model the effect of actual rain on a real soil surface. In some applications, however, the environment is uncontrollable, with the result that measurements are subject to irregular fluctuations that are not so plausibly modeled by fractal processes. The fluctuations may include discontinuities and nonlinear drift in the mean. Some approaches to analysis do not distinguish between this nonstationary contamination and the “background,” with the result that a jump process may provide a significantly better explanation of the data than a fractal process. In this article we suggest decomp...

Journal ArticleDOI
TL;DR: An index is constructed for a collection of nearly 30,000 digital images; the software is regarded as an early development of statistical exploratory tools for studying collections of complex objects.
Abstract: We are interested in the exploratory analysis of large collections of complex objects. As an example, we are studying a large collection of digital images that has nearly 30,000 members. We regard each image in the collection as an individual observation. To facilitate our study we construct an index of the images in the collection. The index uses a small copy of each image (an icon or a “thumbnail”) to represent the full-size version. A large number of these thumbnails are laid out in a workstation window. We can interactively arrange and rearrange the thumbnails within the window. For example, we can sort the thumbnails by the values of a function computed from them or by the values of data associated with each of them. By the use of specialized equipment (a single-frame video disk recorder/player), we can instantly access any individual full-size image in the collection as a video image. We regard our software as an early development of statistical exploratory tools for studying collections of...

Journal ArticleDOI
TL;DR: In the TWI-Stat project, a computer-aided instruction course was developed to help students become more familiar with modern statistical analysis; the course presents itself as a dynamic, interactive, personal book.
Abstract: At Delft University of Technology many students experience difficulties in mastering basic concepts of probability and statistics. In the past few years the lectures have undergone a radical change—the lecture notes now contain modern data analysis techniques, like kernel density estimation, simulation, and bootstrapping. In the TWI-Stat project, a computer-aided instruction course was developed to help students become more familiar with modern statistical analysis. The course presents itself as a dynamic, interactive, personal book. Highly interactive analysis tools are available. The software will be available for MS-Windows.


Journal ArticleDOI
Allan R. Wilks1
TL;DR: Pictor describes graphs as graphical objects whose component pieces are related by several sorts of constraints; this article describes that constraint system in detail.
Abstract: Pictor is an environment for statistical graphics that promotes simple commands for common uses and offers the ability to experiment with whole new paradigms. Pictor describes graphs as graphical objects whose component pieces are related by several sorts of constraints. This article describes in detail the constraint system that Pictor uses.

Journal ArticleDOI
TL;DR: Lisp-Stat is an extensible statistical computing environment based on the Lisp language that is currently being revised on the basis of experience gained from several years of use.
Abstract: Lisp-Stat is an extensible statistical computing environment based on the Lisp language. The system is currently being revised on the basis of experience gained from several years of use. This article outlines some of the changes that have been completed and others that are under consideration.

Journal ArticleDOI
TL;DR: This article introduces the Oberon system from the perspective of an extensible system and discusses the hierarchies (modularity, the type system, runtime system organization, and persistency) through which its requirements are implemented.
Abstract: Extensible software systems play an important role in prototyping environments where a fast compile-and-test turnaround is required. Typically, extensible software systems combine ways to reuse code, an approach to object-oriented programming, and ways to preserve state from one session to another. In this article we introduce the Oberon system from the perspective of an extensible system. In Oberon, the stated requirements manifest themselves as separate hierarchies related to modularity, the type system, runtime system organization, and persistency. We discuss issues related to these hierarchies and the approaches selected in Oberon for their implementation. The article is mainly a short introduction to Oberon and a summary of what has been accomplished with this system.