
Showing papers in "Journal of Statistical Software in 2006"


Journal ArticleDOI
TL;DR: This paper describes the core features of the R package geepack, which implements the generalized estimating equations (GEE) approach for fitting marginal generalized linear models to clustered data, and illustrates the approach with an example of clustered binary data.
Abstract: This paper describes the core features of the R package geepack, which implements the generalized estimating equations (GEE) approach for fitting marginal generalized linear models to clustered data. Clustered data arise in many applications such as longitudinal data and repeated measures. The GEE approach focuses on models for the mean of the correlated observations within clusters without fully specifying the joint distribution of the observations. It has been widely used in statistical practice. This paper illustrates the application of the GEE approach with geepack through an example of clustered binary data.
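
As a minimal sketch of the interface (the data and variable names below are simulated for illustration, not taken from the paper):

```r
library(geepack)

## Simulated clustered binary data: 50 subjects, 4 visits each (hypothetical)
set.seed(1)
d <- data.frame(subject = rep(1:50, each = 4),
                visit   = rep(1:4, times = 50),
                trt     = rep(rbinom(50, 1, 0.5), each = 4))
d$resp <- rbinom(nrow(d), 1, plogis(-0.5 + 0.8 * d$trt + 0.1 * d$visit))

## GEE fit of the marginal mean model with an exchangeable working correlation
fit <- geeglm(resp ~ trt + visit, id = subject, data = d,
              family = binomial, corstr = "exchangeable")
summary(fit)
```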

1,785 citations


Journal Article
TL;DR: The R package ltm has been developed for the analysis of multivariate dichotomous and polytomous data using latent variable models, under the Item Response Theory approach.
Abstract: The R package ltm has been developed for the analysis of multivariate dichotomous and polytomous data using latent variable models, under the Item Response Theory approach. For dichotomous data the Rasch, the Two-Parameter Logistic, and Birnbaum's Three-Parameter models have been implemented, whereas for polytomous data Samejima's Graded Response model is available. Parameter estimates are obtained under marginal maximum likelihood using the Gauss-Hermite quadrature rule. The capabilities and features of the package are illustrated using two real data examples.
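
As a brief sketch of the interface, using the LSAT data that ships with the package:

```r
library(ltm)

## Rasch model for the five dichotomous LSAT items
fit.rasch <- rasch(LSAT)

## Two-parameter logistic model; z1 denotes the single latent trait
fit.2pl <- ltm(LSAT ~ z1)
coef(fit.2pl)
```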

835 citations


Journal ArticleDOI
TL;DR: The purpose of this paper is to present and compare the four R packages that contain SVM-related software; support vector machines are among the most popular and efficient classification and regression methods currently available.
Abstract: Being among the most popular and efficient classification and regression methods currently available, implementations of support vector machines exist in almost every popular programming language. Currently four R packages contain SVM related software. The purpose of this paper is to present and compare these implementations.
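
For instance, one of these implementations, svm() in package e1071, can be called as follows (a minimal sketch on the built-in iris data):

```r
library(e1071)

## C-classification SVM with a radial kernel on the iris data
data(iris)
fit <- svm(Species ~ ., data = iris, kernel = "radial", cost = 1)
table(predicted = predict(fit, iris), true = iris$Species)
```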

576 citations


Journal ArticleDOI
TL;DR: Conceptual tools and their translation to computational tools in the package sandwich are discussed, enabling the computation of sandwich estimators in general parametric models.
Abstract: Sandwich covariance matrix estimators are a popular tool in applied regression modeling for performing inference that is robust to certain types of model misspecification. Suitable implementations are available in the R system for statistical computing for certain model fitting functions only (in particular lm()), but not for other standard regression functions, such as glm(), nls(), or survreg(). Therefore, conceptual tools and their translation to computational tools in the package sandwich are discussed, enabling the computation of sandwich estimators in general parametric models. Object orientation can be achieved by providing a few extractor functions (most importantly for the empirical estimating functions) from which various types of sandwich estimators can be computed.
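
A minimal sketch of this object-oriented usage, computing robust standard errors for a glm() fit (coeftest() from the lmtest package performs the inference step):

```r
library(sandwich)
library(lmtest)

## Logistic regression on the built-in mtcars data
fit <- glm(vs ~ mpg + wt, data = mtcars, family = binomial)

## Wald tests using the sandwich covariance estimator
## in place of the model-based one
coeftest(fit, vcov = sandwich)
```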

447 citations


Journal ArticleDOI
TL;DR: The R function LDheatmap() is described which produces a graphical display, as a heat map, of pairwise linkage disequilibrium measurements between single nucleotide polymorphisms within a genomic region using the grid graphics system.
Abstract: We describe the R function LDheatmap() which produces a graphical display, as a heat map, of pairwise linkage disequilibrium measurements between single nucleotide polymorphisms within a genomic region. LDheatmap() uses the grid graphics system, an alternative to the traditional R graphics system. The features of the LDheatmap() function and the use of tools from the grid package to modify heat maps are illustrated by examples.
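
A sketch of a typical call; the example data objects named below (CEUSNP genotypes and CEUDist map positions) are assumed from the package's examples:

```r
library(LDheatmap)

## CEUSNP (SNP genotypes) and CEUDist (physical map positions) are assumed
## here to be the package's example data sets.
data(CEUSNP)
data(CEUDist)

## Heat map of pairwise LD, measured by the squared correlation r
LDheatmap(CEUSNP, genetic.distances = CEUDist, LDmeasure = "r")
```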

424 citations


Journal ArticleDOI
TL;DR: Strucplot displays include hierarchical conditional plots such as mosaic, association, and sieve plots, and can be combined into more complex, specialized plots for visualizing conditional independence, GLMs, and the results of independence tests.
Abstract: This paper describes the `strucplot' framework for the visualization of multi-way contingency tables. Strucplot displays include hierarchical conditional plots such as mosaic, association, and sieve plots, and can be combined into more complex, specialized plots for visualizing conditional independence, GLMs, and the results of independence tests. The framework's modular design allows flexible customization of the plots' graphical appearance, including shading, labeling, spacing, and legend, by means of graphical appearance control (`grapcon') functions. The framework is provided by the R package vcd.
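
For instance, a residual-shaded mosaic display of the built-in Titanic table takes a single call (a minimal sketch of the high-level interface):

```r
library(vcd)

## Mosaic plot with residual-based shading for the Titanic contingency
## table; unused dimensions of the table are marginalized out.
mosaic(~ Class + Sex + Survived, data = Titanic, shade = TRUE)
```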

333 citations


Journal ArticleDOI
TL;DR: This text is one of a series of five handbooks that present an overview of how to use a major statistical software package: S-PLUS, Stata, SPSS, SAS, and R.
Abstract: This text is one of a series of five handbooks that present an overview of how to use a major statistical software package. Handbooks cover S-PLUS, Stata, SPSS, SAS, and R. Although R is not strictly speaking a statistical package, it is a currently popular statistical language that can be downloaded to one's computer from various mirror sites. It is similar in logic to the S language of the 1980s, which was later commercialized as the S-PLUS package.

218 citations


Journal ArticleDOI
TL;DR: In this article, the authors provide software for evaluating the density and distribution functions of the ratio z/w for any two jointly normal variates z, w, and give details on methods for transforming a general ratio z/w into a standard form, (a+x)/(b+y), with x and y independent standard normal and a, b non-negative constants.
Abstract: This article extends and amplifies on results from a paper of over forty years ago. It provides software for evaluating the density and distribution functions of the ratio z/w for any two jointly normal variates z, w, and provides details on methods for transforming a general ratio z/w into a standard form, (a+x)/(b+y), with x and y independent standard normal and a, b non-negative constants. It discusses handling general ratios when, in theory, none of the moments exist yet practical considerations suggest there should be approximations whose adequacy can be verified by means of the included software. These approximations show that many of the ratios of normal variates encountered in practice can themselves be taken as normally distributed. A practical rule is developed: if a < 2.256 and 4 < b, then the ratio (a+x)/(b+y) is itself approximately normally distributed with mean μ = a/(1.01b - 0.2713) and variance σ² = (a² + 1)/(b² + 0.108b - 3.795) - μ².
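
The practical rule is easy to check by simulation; the sketch below (with values chosen to satisfy a < 2.256 and 4 < b, using the mean and variance formulas quoted above) compares the approximation with empirical moments:

```r
## Normal approximation to (a + x)/(b + y) per the practical rule above
a <- 1.5; b <- 5
mu <- a / (1.01 * b - 0.2713)
s2 <- (a^2 + 1) / (b^2 + 0.108 * b - 3.795) - mu^2

## Empirical check by simulation
set.seed(1)
r <- (a + rnorm(1e6)) / (b + rnorm(1e6))
c(mean_approx = mu, mean_sim = mean(r))
c(var_approx = s2, var_sim = var(r))
```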

186 citations


Journal ArticleDOI
TL;DR: ada is an R package that implements three popular variants of boosting, together with a version of stochastic gradient boosting, which incorporates a random mechanism at each boosting step, improving performance and speed in generating the ensemble.
Abstract: Boosting is an iterative algorithm that combines simple classification rules with "mediocre" performance in terms of misclassification error rate to produce a highly accurate classification rule. Stochastic gradient boosting provides an enhancement which incorporates a random mechanism at each boosting step showing an improvement in performance and speed in generating the ensemble. ada is an R package that implements three popular variants of boosting, together with a version of stochastic gradient boosting. In addition, useful plots for data analytic purposes are provided along with an extension to the multi-class case. The algorithms are illustrated with synthetic and real data sets.
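
A minimal sketch of fitting discrete AdaBoost with ada (the data below are simulated for illustration):

```r
library(ada)

## Hypothetical two-class data
set.seed(1)
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
d$y <- factor(ifelse(d$x1 + d$x2 + rnorm(200, sd = 0.5) > 0, "yes", "no"))

## Discrete AdaBoost with 50 boosting iterations
fit <- ada(y ~ x1 + x2, data = d, type = "discrete", iter = 50)
fit
plot(fit)  # training error by boosting iteration
```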

128 citations


Journal ArticleDOI
TL;DR: hapassoc is software for R implementing a likelihood approach to inference of haplotype and non-genetic effects in GLMs of trait associations; it offers the flexibility to specify dominant and recessive effects of genetic risk factors.
Abstract: Complex medical disorders, such as heart disease and diabetes, are thought to involve a number of genes which act in conjunction with lifestyle and environmental factors to increase disease susceptibility. Associations between complex traits and single nucleotide polymorphisms (SNPs) in candidate genomic regions can provide a useful tool for identifying genetic risk factors. However, analysis of trait associations with single SNPs ignores the potential for extra information from haplotypes, combinations of variants at multiple SNPs along a chromosome inherited from a parent. When haplotype-trait associations are of interest and haplotypes of individuals can be determined, generalized linear models (GLMs) may be used to investigate haplotype associations while adjusting for the effects of non-genetic cofactors or attributes. Unfortunately, haplotypes cannot always be determined cost-effectively when data is collected on unrelated subjects. Uncertain haplotypes may be inferred on the basis of data from single SNPs. However, subsequent analyses of risk factors must account for the resulting uncertainty in haplotype assignment in order to avoid potential errors in interpretation. To account for such uncertainty, we have developed hapassoc, software for R implementing a likelihood approach to inference of haplotype and non-genetic effects in GLMs of trait associations. We provide a description of the underlying statistical method and illustrate the use of hapassoc with examples that highlight the flexibility to specify dominant and recessive effects of genetic risk factors, a feature not shared by other software that restricts users to additive effects only. Additionally, hapassoc can accommodate missing SNP genotypes for limited numbers of subjects.
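
A sketch of the two-step workflow suggested by the package: pre.hapassoc() augments the data with compatible haplotypes, and hapassoc() fits the GLM. The function names are from the package, but the data set, column names, and arguments shown are assumptions for illustration:

```r
library(hapassoc)

## 'dat' is a hypothetical data frame: a disease trait, a non-genetic
## covariate, and unphased genotypes at several SNPs (two columns per SNP).
## Step 1 (assumed interface): expand subjects over their compatible
## haplotypes, with estimated haplotype frequencies.
haps <- pre.hapassoc(dat, numSNPs = 3)

## Step 2 (assumed interface): GLM of the trait on haplotypes plus
## cofactors; the EM-based likelihood accounts for haplotype uncertainty.
fit <- hapassoc(affected ~ attr + hAAA, haps, family = binomial())
summary(fit)
```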

116 citations



Journal ArticleDOI
TL;DR: In this article, the authors provide programs for computing six quantities of interest (probability density function, mean, variance, cumulative distribution function, quantile function and random numbers) for any truncated distribution: whether it is left truncated, right truncated or doubly truncated.
Abstract: Truncated distributions arise naturally in many practical situations. In this note, we provide programs for computing six quantities of interest (probability density function, mean, variance, cumulative distribution function, quantile function and random numbers) for any truncated distribution: whether it is left truncated, right truncated or doubly truncated. The programs are written in R: a freely downloadable statistical software.
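
The note's programs are not reproduced here, but the underlying renormalization idea is simple; a minimal sketch of our own for a standard normal distribution truncated to [lo, hi]:

```r
## Density, CDF, quantile function and random numbers for N(0,1)
## truncated to [lo, hi]: truncate, then renormalize by the mass of [lo, hi].
dtrunc <- function(x, lo, hi)
  ifelse(x < lo | x > hi, 0, dnorm(x) / (pnorm(hi) - pnorm(lo)))

ptrunc <- function(q, lo, hi) {
  q <- pmin(pmax(q, lo), hi)
  (pnorm(q) - pnorm(lo)) / (pnorm(hi) - pnorm(lo))
}

qtrunc <- function(p, lo, hi)
  qnorm(pnorm(lo) + p * (pnorm(hi) - pnorm(lo)))

rtrunc <- function(n, lo, hi) qtrunc(runif(n), lo, hi)  # inversion sampling

qtrunc(0.5, 0, 2)  # median of N(0,1) truncated to [0, 2]
```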

Journal ArticleDOI
TL;DR: This paper describes an integrated educational web-based framework for interactive distribution modeling, virtual online probability experimentation, statistical data analysis, visualization and integration, and presents evidence that SOCR resources build students' intuition and enhance their learning.
Abstract: The need for hands-on computer laboratory experience in undergraduate and graduate statistics education has been firmly established in the past decade. As a result a number of attempts have been undertaken to develop novel approaches for problem-driven statistical thinking, data analysis and result interpretation. In this paper we describe an integrated educational web-based framework for: interactive distribution modeling, virtual online probability experimentation, statistical data analysis, visualization and integration. Following years of experience in statistical teaching at all college levels using established licensed statistical software packages, like Stata, S-PLUS, R, SPSS, SAS, Systat, etc., we have attempted to engineer a new statistics education environment, the Statistics Online Computational Resource (SOCR). This resource performs many of the standard types of statistical analysis, much like other classical tools. In addition, it is designed in a plug-in object-oriented architecture and is completely platform independent, web-based, interactive, extensible and secure. Over the past 4 years we have tested, fine-tuned and reanalyzed the SOCR framework in many of our undergraduate and graduate probability and statistics courses and have evidence that SOCR resources build students' intuition and enhance their learning.

Journal ArticleDOI
TL;DR: There is a rapidly increasing number of books with titles “Something with R”, where “Something” is some area of statistics; it is good that these books use R, because R is the lingua franca of computational statistics.
Abstract: There is a rapidly increasing number of books with titles “Something with R”, where “Something” is some area of statistics. Clearly this is a good development from the point of view of JSS: statistical software gets more attention than it did in the “Without R” era. I think it is also good from a somewhat broader perspective: paying more attention to software blends applied and theoretical aspects of statistics, and illustrates the fact that statistics is properly defined as the development and study of techniques for data analysis. For those of us who are so inclined, source code for a working algorithm is a precise and reproducible way to explain what a technique actually does. And finally it is good that the books use R, and not something else, because R is the lingua franca of computational statistics.

Journal ArticleDOI
TL;DR: The package ggm has a few basic functions that find the essential graph, the induced concentration and covariance graphs, and several types of chain graphs implied by the directed acyclic graph (DAG) after grouping and reordering the variables.
Abstract: We describe some functions in the R package ggm to derive from a given Markov model, represented by a directed acyclic graph, different types of graphs induced after marginalizing over and conditioning on some of the variables. The package has a few basic functions that find the essential graph, the induced concentration and covariance graphs, and several types of chain graphs implied by the directed acyclic graph (DAG) after grouping and reordering the variables. These functions can be useful to explore the impact of latent variables or of selection effects on a chosen data generating model.

Journal ArticleDOI
TL;DR: The partitions package is a small R package of terse, efficient C code for numerical calculation of integer partitions, with support for unrestricted partitions, unequal partitions, and restricted partitions.
Abstract: This paper introduces the partitions package of R routines, for numerical calculation of integer partitions. Functionality for unrestricted partitions, unequal partitions, and restricted partitions is provided in a small package that accompanies this note; the emphasis is on terse, efficient C code. A simple combinatorial problem is solved using the package.
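
A quick sketch of the three kinds of functionality (function names as provided by the package):

```r
library(partitions)

parts(5)               # all unrestricted partitions of 5, one per column
diffparts(5)           # partitions of 5 into unequal (distinct) parts
restrictedparts(5, 3)  # partitions of 5 into at most 3 parts
```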

Journal ArticleDOI
TL;DR: In this paper, a state space model is specified similarly to a generalized linear model in R, and the time-varying terms are then marked in the formula, with special functions for specifying polynomial time trends, harmonic seasonal patterns, unstructured seasonal patterns and time-varying covariates.
Abstract: We provide a language for formulating a range of state space models with response densities within the exponential family. The described methodology is implemented in the R-package sspir. A state space model is specified similarly to a generalized linear model in R, and then the time-varying terms are marked in the formula. Special functions for specifying polynomial time trends, harmonic seasonal patterns, unstructured seasonal patterns and time-varying covariates can be used in the formula. The model is fitted to data using iterated extended Kalman filtering, but the formulation of models does not depend on the implemented method of inference. The package is demonstrated on three datasets.

Journal ArticleDOI
TL;DR: In this article, an algorithm based on a balanced binary search tree is presented for calculating concordance-discordance totals in a time of order N log N, where N is the number of observations.
Abstract: An algorithm is presented for calculating concordance-discordance totals in a time of order N log N , where N is the number of observations, using a balanced binary search tree. These totals can be used to calculate jackknife estimates and confidence limits in the same time order for a very wide range of rank statistics, including Kendall's tau, Somers' D, Harrell's c, the area under the receiver operating characteristic (ROC) curve, the Gini coefficient, and the parameters underlying the sign and rank-sum tests. A Stata package is introduced for calculating confidence intervals for these rank statistics using this algorithm, which has been implemented in the Mata compilable matrix programming language supplied with Stata.

Journal ArticleDOI
TL;DR: In this article, the authors describe graphical methods for multiple-response data within the framework of the multivariate linear model (MLM), aimed at understanding what is being tested in a multivariate test, and how factor/predictor effects are expressed across multiple response measures.
Abstract: This paper describes graphical methods for multiple-response data within the framework of the multivariate linear model (MLM), aimed at understanding what is being tested in a multivariate test, and how factor/predictor effects are expressed across multiple response measures. In particular, we describe and illustrate a collection of SAS macro programs for: (a) Data ellipses and low-rank biplots for multivariate data, (b) HE plots, showing the hypothesis and error covariance matrices for a given pair of responses, and a given effect, (c) HE plot matrices, showing all pairwise HE plots, and (d) low-rank analogs of HE plots, showing all observations, group means, and their relations to the response variables.

Journal ArticleDOI
TL;DR: This paper deals with the R-php statistical software, an environment for statistical analysis that is freely accessible through the World Wide Web and based on R; the tool could be particularly useful for teaching purposes.
Abstract: This paper deals with the R-php statistical software, an environment for statistical analysis, freely accessible through the World Wide Web and based on R. This software uses R, via PHP, as its "engine" for statistical analyses, and its design was inspired by a paper of de Leeuw (1997). R-php is based on two modules: a base module and a point-and-click module. R-php base allows the simple editing of R code in a form. R-php point-and-click allows some statistical analyses by means of a graphical user interface (GUI): to use this module it is not necessary for the user to know the R environment, since all the available analyses can be performed with the mouse. We think that this tool could be particularly useful for teaching purposes: one possible use could be in a university computer laboratory to give students a smooth introduction to R.

Journal ArticleDOI
TL;DR: An age-adjusted bootstrap-based method is developed to assess the significance of assumed asymptotic normal tests for animal carcinogenicity data and is applied to National Toxicology Program data sets to evaluate a dose-related trend of a test substance on the incidence of neoplasms.
Abstract: A computational tool for testing for a dose-related trend and/or a pairwise difference in the incidence of an occult tumor via an age-adjusted bootstrap-based poly-k test and the original poly-k test is presented in this paper. The poly-k test (Bailer and Portier 1988) is a survival-adjusted Cochran-Armitage test, which achieves robustness to effects of differential mortality across dose groups. The original poly-k test is asymptotically standard normal under the null hypothesis. However, the asymptotic normality is not valid if there is a deviation from the tumor onset distribution that is assumed in this test. Our age-adjusted bootstrap-based poly-k test assesses the significance of assumed asymptotic normal tests and investigates an empirical distribution of the original poly-k test statistic using an age-adjusted bootstrap method. A tumor of interest is an occult tumor for which the time to onset is not directly observable. Since most of the animal carcinogenicity studies are designed with a single terminal sacrifice, the present tool is applicable to rodent tumorigenicity assays that have a single terminal sacrifice. The present tool takes input information simply from a user screen and reports testing results back to the screen through a user-interface. The computational tool is implemented in C/C++ and is applied to analyze a real data set as an example. Our tool enables the FDA and the pharmaceutical industry to implement a statistical analysis of tumorigenicity data from animal bioassays via our age-adjusted bootstrap-based poly-k test and the original poly-k test which has been adopted by the National Toxicology Program as its standard statistical test.
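
The tool itself is implemented in C/C++; as an illustration of the poly-k adjustment at its core (our own sketch in R, not the authors' program), tumor-bearing animals count fully while each tumor-free animal dying at time t contributes weight (t/T)^k to the effective sample size:

```r
## Poly-3 adjusted tumor rate for one dose group (hypothetical data).
## time  = death/sacrifice time, tumor = 1 if the occult tumor was found,
## Tmax  = terminal sacrifice time, k = 3 for the standard poly-3 test.
polyk_rate <- function(time, tumor, Tmax, k = 3) {
  w <- ifelse(tumor == 1, 1, (time / Tmax)^k)  # survival-based weights
  c(adjusted_n = sum(w), rate = sum(tumor) / sum(w))
}

polyk_rate(time  = c(50, 70, 104, 104, 90, 104),
           tumor = c( 0,  1,   0,   1,  0,   1),
           Tmax  = 104)
```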

Journal ArticleDOI
TL;DR: The authors developed functions for proportional symbol mapping in R, including mathematical and perceptual scaling; an example demonstrates the new expressive power and options available in R, particularly for the visualization of conceptual point data.
Abstract: Visualization of spatial data on a map aids not only in data exploration but also in communication, imparting spatial conceptions or ideas to others. Although recent cartographic functions in R are rapidly becoming richer, proportional symbol mapping, one of the common mapping approaches, has not been packaged thus far. Based on the theories of proportional symbol mapping developed in cartography, the authors developed functions for proportional symbol mapping using R, including mathematical and perceptual scaling. An example of these functions demonstrates the new expressive power and options available in R, particularly for the visualization of conceptual point data.
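
The contrast between mathematical and perceptual scaling is easy to sketch in base R (hypothetical data; the 0.57 exponent is Flannery's classical perceptual compensation value, and the paper's own functions are not reproduced here):

```r
## Proportional circles: mathematical vs. perceptual (Flannery) scaling
set.seed(1)
x <- runif(12); y <- runif(12); v <- runif(12, 1, 100)

r_math <- sqrt(v / max(v))   # radius ~ sqrt(value): areas exactly proportional
r_perc <- (v / max(v))^0.57  # Flannery exponent compensates underestimation

op <- par(mfrow = c(1, 2), mar = c(2, 2, 2, 1))
symbols(x, y, circles = r_math / 15, inches = FALSE, main = "mathematical")
symbols(x, y, circles = r_perc / 15, inches = FALSE, main = "perceptual")
par(op)
```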

Journal ArticleDOI
TL;DR: A computer program called ITA 2.0 is described, which implements both of the algorithms available to perform an Item Tree Analysis; a concrete data set shows how the program can be used for the analysis of questionnaire data.
Abstract: Item Tree Analysis (ITA) is an explorative method of data analysis which can be used to establish a hierarchical structure on a set of dichotomous items from a questionnaire or test. There are currently two different algorithms available to perform an ITA. We describe a computer program called ITA 2.0 which implements both of these algorithms. In addition we show with a concrete data set how the program can be used for the analysis of questionnaire data.

Journal ArticleDOI
TL;DR: A set of FORTRAN subprograms is presented to compute density and cumulative distribution functions and critical values for the range ratio statistics of Dixon.
Abstract: A set of FORTRAN subprograms is presented to compute density and cumulative distribution functions and critical values for the range ratio statistics of Dixon (1951, The Annals of Mathematical Statistics). These statistics are useful for detection of outliers in small samples.

Journal ArticleDOI
TL;DR: The development and execution of a computer program that accurately calculates first- and second-stage short-run control chart factors for (X, MR) charts using the equations derived in the first paper is described.
Abstract: This paper is the second in a series of two papers that fully develops two-stage short-run (X, MR) control charts. This paper describes the development and execution of a computer program that accurately calculates first- and second-stage short-run control chart factors for (X, MR) charts using the equations derived in the first paper. The software used is Mathcad. The program accepts values for number of subgroups, α for the X chart, and α for the MR chart both above the upper control limit and below the lower control limit. Tables are generated for specific values of these inputs and the implications of the results are discussed. A numerical example illustrates the use of the program.

Journal ArticleDOI
TL;DR: The elliptic package of R routines provides numerical calculation of elliptic and related functions; the package illustrates many ideas of complex analysis numerically and visually, with a statistical application in fluid mechanics.
Abstract: This paper introduces the elliptic package of R routines, for numerical calculation of elliptic and related functions. Elliptic functions furnish interesting and instructive examples of many ideas of complex analysis, and the package illustrates these numerically and visually. A statistical application in fluid mechanics is presented.

Journal ArticleDOI
TL;DR: A computer program for estimating the Gompertz curve using the Gauss-Newton method of least squares is described in detail; it is an improved version of the program proposed in Dastidar (2005).
Abstract: A computer program for estimating the Gompertz curve using the Gauss-Newton method of least squares is described in detail. It is based on the estimation technique proposed in Reddy (1985). The program is developed using Scilab (version 3.1.1), a freely available scientific software package that can be downloaded from http://www.scilab.org/. Data are fed into the program from an external disk file, which should be in Microsoft Excel format. The output contains the sample size, tolerance limit, a list of the initial as well as the final estimates of the parameters, standard errors, the values of the Gauss-Newton equations GN1, GN2 and GN3, the number of iterations, the variance (σ²), the Durbin-Watson statistic, goodness-of-fit measures such as R² and the D value, the covariance matrix, and the residuals. It also displays a graphical output of the estimated curve vis-à-vis the observed curve. It is an improved version of the program proposed in Dastidar (2005).
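
The program itself is written in Scilab; as a rough cross-check in R, nls(), whose default algorithm is Gauss-Newton, fits the same three-parameter Gompertz curve y = a*exp(-b*exp(-c*t)) (data and starting values below are hypothetical):

```r
## Gompertz growth curve fit by Gauss-Newton least squares via nls()
set.seed(1)
t <- 1:20
y <- 100 * exp(-3 * exp(-0.3 * t)) + rnorm(20, sd = 2)

fit <- nls(y ~ a * exp(-b * exp(-c * t)),
           start = list(a = 90, b = 2, c = 0.2))
summary(fit)

plot(t, y)                # observed points
lines(t, fitted(fit))     # estimated curve vis-à-vis the observations
```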

Journal ArticleDOI
TL;DR: The exactLoglinTest package implements a sequentially rounded normal approximation and importance sampling to approximate probabilities from the conditional distribution; a Monte Carlo algorithm estimates P values from that distribution.
Abstract: This manuscript overviews exact testing of goodness of fit for log-linear models using the R package exactLoglinTest. This package evaluates model fit for Poisson log-linear models by conditioning on minimal sufficient statistics to remove nuisance parameters. A Monte Carlo algorithm is proposed to estimate P values from the resulting conditional distribution. In particular, this package implements a sequentially rounded normal approximation and importance sampling to approximate probabilities from the conditional distribution. Usually, this results in a high percentage of valid samples. However, in instances where this is not the case, a Metropolis Hastings algorithm can be implemented that makes more localized jumps within the reference set. The manuscript details how some conditional tests for binomial logit models can also be viewed as conditional Poisson log-linear models and hence can be performed via exactLoglinTest. A diverse battery of examples is considered to highlight use, features and extensions of the software. Notably, potential extensions to evaluating disclosure risk are also considered.
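
A sketch of a typical call; mcexact() is the package's main entry point, though the data, formula, and the nosim argument shown here are illustrative assumptions:

```r
library(exactLoglinTest)

## Hypothetical 2 x 3 table in data-frame (Poisson log-linear) form
d <- data.frame(y   = c(12, 5, 3, 7, 6, 10),
                row = gl(2, 3),
                col = gl(3, 1, 6))

## Monte Carlo exact goodness-of-fit test of the independence model,
## conditioning on the margins (the minimal sufficient statistics).
fit <- mcexact(y ~ row + col, data = d, nosim = 1e4)
summary(fit)
```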

Journal ArticleDOI
TL;DR: In this article, a new SAS/IML tool for performing a spatial sign based multivariate analysis of variance is introduced; the underlying methods have promising efficiency and robustness properties compared with the classical multivariate analysis of variance model.
Abstract: Recently, new nonparametric multivariate extensions of the univariate sign methods have been proposed. Randles (2000) introduced an affine invariant multivariate sign test for the multivariate location problem. Later on, Hettmansperger and Randles (2002) considered an affine equivariant multivariate median corresponding to this test. The new methods have promising efficiency and robustness properties. In this paper, we review these developments and compare them with the classical multivariate analysis of variance model. A new SAS/IML tool for performing a spatial sign based multivariate analysis of variance is introduced.

Journal ArticleDOI
TL;DR: A cross-validation method for the selection of the thresholding value in wavelet shrinkage of Oh, Kim, and Lee (2006) is reviewed, and the R package CVThresh implementing details of the calculations for the procedures are introduced.
Abstract: The core of the wavelet approach to nonparametric regression is thresholding of wavelet coefficients. This paper reviews a cross-validation method for the selection of the thresholding value in wavelet shrinkage of Oh, Kim, and Lee (2006), and introduces the R package CVThresh implementing details of the calculations for the procedures. The procedure couples conventional cross-validation with a fast imputation method, so that it overcomes the restriction of the data length to a power of 2. It can easily be applied to classical leave-one-out and K-fold cross-validation. Since the procedure is computationally fast, a level-dependent cross-validation can be developed for wavelet shrinkage of data with varying sparseness across levels.