Showing papers in "Journal of Statistical Software in 2016"

Open Access

Journal Article

Least-Squares Means: The R Package lsmeans

[...]

29 Jan 2016-Journal of Statistical Software

TL;DR: The lsmeans package (Lenth 2016) provides a simple way of obtaining least-squares means and contrasts thereof and supports many models fitted by R (R Core Team 2015) core packages that fit linear or mixed models.

Abstract: Least-squares means are predictions from a linear model, or averages thereof. They are useful in the analysis of experimental data for summarizing the effects of factors, and for testing linear contrasts among predictions. The lsmeans package (Lenth 2016) provides a simple way of obtaining least-squares means and contrasts thereof. It supports many models fitted by R (R Core Team 2015) core packages (as well as a few key contributed ones) that fit linear or mixed models, and provides a simple way of extending it to cover more model classes.

4,656 citations

Journal Article

Regression Modeling Strategies with Applications to Linear Models, Logistic and Ordinal Regression and Survival Analysis (2nd Edition)

[...]

James E. Helmreich

04 Apr 2016-Journal of Statistical Software

1,232 citations

Journal Article

missMDA: A Package for Handling Missing Values in Multivariate Data Analysis

[...]

Julie Josse, François Husson

04 Apr 2016-Journal of Statistical Software

TL;DR: The missMDA package performs principal component methods on incomplete data sets, aiming to obtain scores, loadings and graphical representations despite missing values, and can be used to perform single imputation to complete data involving continuous, categorical and mixed variables.

Abstract: We present the R package missMDA which performs principal component methods on incomplete data sets, aiming to obtain scores, loadings and graphical representations despite missing values. Package methods include principal component analysis for continuous variables, multiple correspondence analysis for categorical variables, factorial analysis on mixed data for both continuous and categorical variables, and multiple factor analysis for multi-table data. Furthermore, missMDA can be used to perform single imputation to complete data involving continuous, categorical and mixed variables. A multiple imputation method is also available. In the principal component analysis framework, variability across different imputations is represented by confidence areas around the row and column positions on the graphical outputs. This allows assessment of the credibility of results obtained from incomplete data sets.

758 citations

Journal Article

Spatial Point Patterns: Methodology and Applications with R

[...]

Virgilio Gómez-Rubio

05 Dec 2016-Journal of Statistical Software

580 citations

Journal Article

TMB: Automatic Differentiation and Laplace Approximation

[...]

Kasper Kristensen, Anders Nielsen, Casper Willestofte Berg, Hans J. Skaug¹, Brad Bell²

University of Bergen¹, University of Washington²

04 Apr 2016-Journal of Statistical Software

TL;DR: TMB is an open source R package that enables quick implementation of complex nonlinear random effect (latent variable) models in a manner similar to the established AD Model Builder package, and is designed to be fast for problems with many random effects and parameters.

Abstract: TMB is an open source R package that enables quick implementation of complex nonlinear random effects (latent variable) models in a manner similar to the established AD Model Builder package (ADMB, http://admb-project.org/; Fournier et al. 2011). In addition, it offers easy access to parallel computations. The user defines the joint likelihood for the data and the random effects as a C++ template function, while all the other operations are done in R; e.g., reading in the data. The package evaluates and maximizes the Laplace approximation of the marginal likelihood where the random effects are automatically integrated out. This approximation, and its derivatives, are obtained using automatic differentiation (up to order three) of the joint likelihood. The computations are designed to be fast for problems with many random effects (≈ 10⁶) and parameters (≈ 10³). Computation times using ADMB and TMB are compared on a suite of examples ranging from simple models to large spatial models where the random effects are a Gaussian random field. Speedups ranging from 1.5 to about 100 are obtained with increasing gains for large problems. The package and examples are available at http://tmb-project.org/.

533 citations

Journal Article

Meta-Analysis with R

[...]

James P. Howard II

04 Apr 2016-Journal of Statistical Software

TL;DR: This book provides a comprehensive introduction to performing meta-analysis using the statistical software R, and introduces the key concepts and models used in meta-analysis.

Abstract: This book provides a comprehensive introduction to performing meta-analysis using the statistical software R. It is intended for quantitative researchers and students in the medical and social sciences who wish to learn how to perform meta-analysis with R. As such, the book introduces the key concepts and models used in meta-analysis. It also includes chapters on the following advanced topics: publication bias and small study effects; missing data; multivariate meta-analysis, network meta-analysis; and meta-analysis of diagnostic studies.

507 citations

Journal Article

runjags: An R Package Providing Interface Utilities, Model Templates, Parallel Computing Methods and Additional Distributions for MCMC Models in JAGS

[...]

Matthew J. Denwood

26 Jul 2016-Journal of Statistical Software

TL;DR: The runjags package provides a set of interface functions to facilitate running Markov chain Monte Carlo models in JAGS from within R, and an illustration of a simulation study to assess the sensitivity of two equivalent model formulations to different prior distributions is given.

Abstract: The runjags package provides a set of interface functions to facilitate running Markov chain Monte Carlo models in JAGS from within R. Automated calculation of appropriate convergence and sample length diagnostics, user-friendly access to commonly used graphical outputs and summary statistics, and parallelized methods of running JAGS are provided. Template model specifications can be generated using a standard lme4-style formula interface to assist users less familiar with the BUGS syntax. Automated simulation study functions are implemented to facilitate model performance assessment, as well as drop-k type cross-validation studies, using high performance computing clusters such as those provided by parallel. A module extension for JAGS is also included within runjags, providing the Pareto family of distributions and a series of minimally-informative priors including the DuMouchel and half-Cauchy priors. This paper outlines the primary functions of this package, and gives an illustration of a simulation study to assess the sensitivity of two equivalent model formulations to different prior distributions.

499 citations

Journal Article

extRemes 2.0: An Extreme Value Analysis Package in R

[...]

Eric Gilleland, Richard W. Katz

30 Aug 2016-Journal of Statistical Software

TL;DR: The functions of the redesigned R package extRemes 2.0 primarily provide utilities for implementing univariate extreme value analysis (EVA), with a focus on weather and climate applications, including the incorporation of covariates, as well as some functionality for assessing bivariate tail dependence.

Abstract: This article describes the extreme value analysis (EVA) R package extRemes version 2.0, which is completely redesigned from previous versions. The functions primarily provide utilities for implementing univariate EVA, with a focus on weather and climate applications, including the incorporation of covariates, as well as some functionality for assessing bivariate tail dependence.

400 citations

Journal Article

cooccur: Probabilistic Species Co-Occurrence Analysis in R

[...]

Daniel M. Griffith, Joseph A. Veech, Charles J. Marsh

09 Feb 2016-Journal of Statistical Software

TL;DR: cooccur is an R implementation of a recently published model for community co-occurrence analysis that is metric-free, distribution-free, and randomization-free.

Abstract: The observation that species may be positively or negatively associated with each other is at least as old as the debate surrounding the nature of community structure which began in the early 1900's with Gleason and Clements. Since then investigating species co-occurrence patterns has taken a central role in understanding the causes and consequences of evolution, history, coexistence mechanisms, competition, and environment for community structure and assembly. This is because co-occurrence among species is a measurable metric in community datasets that, in the context of phylogeny, geography, traits, and environment, can sometimes indicate the degree of competition, displacement, and phylogenetic repulsion as weighed against biotic and environmental effects promoting correlated species distributions. Historically, a multitude of different co-occurrence metrics have been developed and most have depended on data randomization procedures to produce null distributions for significance testing. Here we improve upon and present an R implementation of a recently published model that is metric-free, distribution-free, and randomization-free. The R package, cooccur, is highly accessible, easily integrates into common analyses, and handles large datasets with high performance. In the article we develop the package's functionality and demonstrate aspects of co-occurrence analysis using three sample datasets.

389 citations

Journal Article

Imputation with the R Package VIM

[...]

Alexander Kowarik, Matthias Templ

20 Oct 2016-Journal of Statistical Software

TL;DR: The graphical user interface of VIM has been re-implemented from scratch resulting in the package VIMGUI to enable users without extensive R skills to access these imputation and visualization methods.

Abstract: The package VIM (Templ, Alfons, Kowarik, and Prantner 2016) is developed to explore and analyze the structure of missing values in data using visualization methods, to impute these missing values with the built-in imputation methods and to verify the imputation process using visualization tools, as well as to produce high-quality graphics for publications. This article focuses on the different imputation techniques available in the package. Four different imputation methods are currently implemented in VIM, namely hot-deck imputation, k-nearest neighbor imputation, regression imputation and iterative robust model-based imputation (Templ, Kowarik, and Filzmoser 2011). All of these methods are implemented in a flexible manner with many options for customization. Furthermore in this article practical examples are provided to highlight the use of the implemented methods on real-world applications. In addition, the graphical user interface of VIM has been re-implemented from scratch resulting in the package VIMGUI (Schopfhauser, Templ, Alfons, Kowarik, and Prantner 2016) to enable users without extensive R skills to access these imputation and visualization methods.

363 citations

Journal Article

robustlmm: An R Package for Robust Estimation of Linear Mixed-Effects Models

[...]

Manuel Koller

06 Dec 2016-Journal of Statistical Software

TL;DR: An R package, robustlmm, is introduced, designed to robustly fit linear mixed-effects models, to provide estimates where contamination has only little influence and to detect and flag contamination.

Abstract: As any real-life data, data modeled by linear mixed-effects models often contain outliers or other contamination. Even little contamination can drive the classic estimates far away from what they would be without the contamination. At the same time, datasets that require mixed-effects modeling are often complex and large. This makes it difficult to spot contamination. Robust estimation methods aim to solve both problems: to provide estimates where contamination has only little influence and to detect and flag contamination. We introduce an R package, robustlmm, to robustly fit linear mixed-effects models. The package's functions and methods are designed to closely equal those offered by lme4, the R package that implements classic linear mixed-effects model estimation in R. The robust estimation method in robustlmm is based on the random effects contamination model and the central contamination model. Contamination can be detected at all levels of the data. The estimation method does not make any assumption on the data's grouping structure except that the model parameters are estimable. robustlmm supports hierarchical and non-hierarchical (e.g., crossed) grouping structures. The robustness of the estimates and their asymptotic efficiency is fully controlled through the function interface. Individual parts (e.g., fixed effects and variance components) can be tuned independently. In this tutorial, we show how to fit robust linear mixed-effects models using robustlmm, how to assess the model fit, how to detect outliers, and how to compare different fits.

Journal Article

flexsurv: A Platform for Parametric Survival Modeling in R

[...]

Christopher Jackson

12 May 2016-Journal of Statistical Software

TL;DR: The methods and design principles of flexsurv, an R package for fully-parametric modeling of survival data, are explained, giving several worked examples of its use.

Abstract: flexsurv is an R package for fully-parametric modeling of survival data. Any parametric time-to-event distribution may be fitted if the user supplies a probability density or hazard function, and ideally also their cumulative versions. Standard survival distributions are built in, including the three and four-parameter generalized gamma and F distributions. Any parameter of any distribution can be modeled as a linear or log-linear function of covariates. The package also includes the spline model of Royston and Parmar (2002), in which both baseline survival and covariate effects can be arbitrarily flexible parametric functions of time. The main model-fitting function, flexsurvreg, uses the familiar syntax of survreg from the standard survival package (Therneau 2016). Censoring or left-truncation are specified in 'Surv' objects. The models are fitted by maximizing the full log-likelihood, and estimates and confidence intervals for any function of the model parameters can be printed or plotted. flexsurv also provides functions for fitting and predicting from fully-parametric multi-state models, and connects with the mstate package (de Wreede, Fiocco, and Putter 2011). This article explains the methods and design principles of the package, giving several worked examples of its use.

Journal Article

Statistical Inference for Partially Observed Markov Processes via the R Package pomp

[...]

Aaron A. King, Dao Nguyen¹, Edward L. Ionides¹

University of Michigan¹

29 Mar 2016-Journal of Statistical Software

TL;DR: The R package pomp provides a very flexible framework for Monte Carlo statistical investigations using nonlinear, non-Gaussian POMP models, including iterated filtering, particle Markov chain Monte Carlo, approximate Bayesian computation, maximum synthetic likelihood estimation and trajectory matching.

Abstract: Partially observed Markov process (POMP) models, also known as hidden Markov models or state space models, are ubiquitous tools for time series analysis. The R package pomp provides a very flexible framework for Monte Carlo statistical investigations using nonlinear, non-Gaussian POMP models. A range of modern statistical methods for POMP models have been implemented in this framework including sequential Monte Carlo, iterated filtering, particle Markov chain Monte Carlo, approximate Bayesian computation, maximum synthetic likelihood estimation, nonlinear forecasting, and trajectory matching. In this paper, we demonstrate the application of these methodologies using some simple toy problems. We also illustrate the specification of more complex POMP models, using a nonlinear epidemiological model with a discrete population, seasonality, and extra-demographic stochasticity. We discuss the specification of user-defined models and the development of additional methods within the programming environment provided by pomp.

Journal Article

bartMachine: Machine Learning with Bayesian Additive Regression Trees

[...]

Adam Kapelner, Justin Bleich

04 Apr 2016-Journal of Statistical Software

TL;DR: In this paper, the authors present a new package in R implementing Bayesian additive regression trees (BART), which introduces many new features for data analysis using BART such as variable selection, interaction detection, model diagnostic plots, incorporation of missing data and the ability to save trees for future prediction.

Abstract: We present a new package in R implementing Bayesian additive regression trees (BART). The package introduces many new features for data analysis using BART such as variable selection, interaction detection, model diagnostic plots, incorporation of missing data and the ability to save trees for future prediction. It is significantly faster than the current R implementation, parallelized, and capable of handling both large sample sizes and high-dimensional data.

Journal Article

synthpop: Bespoke Creation of Synthetic Data in R

[...]

Beata Nowok, Gillian M. Raab, Chris Dibben

28 Oct 2016-Journal of Statistical Software

TL;DR: The synthpop package for R provides routines to generate synthetic versions of original data sets that mimic the original observed data and preserve the relationships between variables but do not contain any disclosive records.

Abstract: In many contexts, confidentiality constraints severely restrict access to unique and valuable microdata. Synthetic data which mimic the original observed data and preserve the relationships between variables but do not contain any disclosive records are one possible solution to this problem. The synthpop package for R, introduced in this paper, provides routines to generate synthetic versions of original data sets. We describe the methodology and its consequences for the data characteristics. We illustrate the package features using a survey data example.

Journal Article

The R Package JMbayes for Fitting Joint Models for Longitudinal and Time-to-Event Data Using MCMC

[...]

Dimitris Rizopoulos

28 Aug 2016-Journal of Statistical Software

TL;DR: The capabilities of the R package JMbayes for fitting joint models under a Bayesian approach using Markov chain Monte Carlo algorithms are presented, including dynamic predictions for both outcomes and several tools to validate these predictions in terms of discrimination and calibration.

Abstract: Joint models for longitudinal and time-to-event data constitute an attractive modeling framework that has received a lot of interest in the recent years. This paper presents the capabilities of the R package JMbayes for fitting these models under a Bayesian approach using Markov chain Monte Carlo algorithms. JMbayes can fit a wide range of joint models, including among others joint models for continuous and categorical longitudinal responses, and provides several options for modeling the association structure between the two outcomes. In addition, this package can be used to derive dynamic predictions for both outcomes, and offers several tools to validate these predictions in terms of discrimination and calibration. All these features are illustrated using a real data example on patients with primary biliary cirrhosis.

Journal Article

R2GUESS: A Graphics Processing Unit-Based R Package for Bayesian Variable Selection Regression of Multivariate Responses

[...]

Benoit Liquet¹, Leonardo Bottolo², Gianluca Campanella², Sylvia Richardson, Marc Chadeau-Hyam²

University of Pau and Pays de l'Adour¹, Imperial College London²

29 Jan 2016-Journal of Statistical Software

TL;DR: R2GUESS, an R package wrapping the original C++ source code of GUESS, is proposed; it provides a user-friendly interface that automates parametrisation and data handling, and incorporates many features to explore the data and to extend statistical inferences from the native algorithm.

Abstract: Technological advances in molecular biology over the past decade have given rise to high dimensional and complex datasets offering the possibility to investigate biological associations between a range of genomic features and complex phenotypes. The analysis of this novel type of data generated unprecedented computational challenges which ultimately led to the definition and implementation of computationally efficient statistical models that were able to scale to genome-wide data, including Bayesian variable selection approaches. While extensive methodological work has been carried out in this area, only a few methods capable of handling hundreds of thousands of predictors were implemented and distributed. Among these we recently proposed GUESS, a computationally optimised algorithm making use of graphics processing unit capabilities, which can accommodate multiple outcomes. In this paper we propose R2GUESS, an R package wrapping the original C++ source code. In addition to providing a user-friendly interface of the original code automating its parametrisation, and data handling, R2GUESS also incorporates many features to explore the data, to extend statistical inferences from the native algorithm (e.g., effect size estimation, significance assessment), and to visualize outputs from the algorithm. We first detail the model and its parametrisation, and describe in detail its optimised implementation. Based on two examples we finally illustrate its statistical performance and flexibility.

Journal Article

laGP: Large-Scale Spatial Modeling via Local Approximate Gaussian Processes in R

[...]

Robert B. Gramacy

16 Aug 2016-Journal of Statistical Software

TL;DR: This work discusses an implementation of local approximate Gaussian process models, in the laGP package for R, that offers a particular sparse-matrix remedy uniquely positioned to leverage modern parallel computing architectures.

Abstract: Gaussian process (GP) regression models make for powerful predictors in out of sample exercises, but cubic runtimes for dense matrix decompositions severely limit the size of data - training and testing - on which they can be deployed. That means that in computer experiment, spatial/geo-physical, and machine learning contexts, GPs no longer enjoy privileged status as data sets continue to balloon in size. We discuss an implementation of local approximate Gaussian process models, in the laGP package for R, that offers a particular sparse-matrix remedy uniquely positioned to leverage modern parallel computing architectures. The laGP approach can be seen as an update on the spatial statistical method of local kriging neighborhoods. We briefly review the method, and provide extensive illustrations of the features in the package through worked-code examples. The appendix covers custom building options for symmetric multi-processor and graphical processing units, and built-in wrapper routines that automate distribution over a simple network of workstations.

Journal Article

Dealing with Stochastic Volatility in Time Series Using the R Package stochvol

[...]

Gregor Kastner

24 Feb 2016-Journal of Statistical Software

TL;DR: This paper shows the functionality of stochvol, an R package for fully Bayesian stochastic volatility modeling, and provides a brief mathematical description of the model, an overview of the sampling schemes used, and several illustrative examples using exchange rate data.

Abstract: The R package stochvol provides a fully Bayesian implementation of heteroskedasticity modeling within the framework of stochastic volatility. It utilizes Markov chain Monte Carlo (MCMC) samplers to conduct inference by obtaining draws from the posterior distribution of parameters and latent variables which can then be used for predicting future volatilities. The package can straightforwardly be employed as a stand-alone tool; moreover, it allows for easy incorporation into other MCMC samplers. The main focus of this paper is to show the functionality of stochvol. In addition, it provides a brief mathematical description of the model, an overview of the sampling schemes used, and several illustrative examples using exchange rate data.

Journal Article

Multivariate Dose-Response Meta-Analysis: The dosresmeta R Package

[...]

Alessio Crippa, Nicola Orsini

16 Aug 2016-Journal of Statistical Software

TL;DR: Specific topics covered are reconstructing covariances of correlated outcomes; pooling of study-specific trends; flexible modeling of the exposure; testing hypothesis; assessing statistical heterogeneity; and presenting in either a graphical or tabular way the overall dose-response association.

Abstract: An increasing number of quantitative reviews of epidemiological data includes a dose-response analysis. Aims of this paper are to describe the main aspects of the methodology and to illustrate the novel R package dosresmeta developed for multivariate dose-response meta-analysis of summarized data. Specific topics covered are reconstructing covariances of correlated outcomes; pooling of study-specific trends; flexible modeling of the exposure; testing hypothesis; assessing statistical heterogeneity; and presenting in either a graphical or tabular way the overall dose-response association.

Journal Article

The R Package CDM for Cognitive Diagnosis Models

[...]

Ann Cathrice George, Alexander Robitzsch, Thomas Kiefer, Jürgen Groß, Ali Ünlü

20 Oct 2016-Journal of Statistical Software

TL;DR: The theoretical aspects of implemented CDM frameworks are described and the usage of the R package CDM for cognitive diagnosis models is illustrated with empirical data of the common fraction subtraction test by Tatsuoka (1984).

Abstract: This paper introduces the R package CDM for cognitive diagnosis models (CDMs). The package implements parameter estimation procedures for two general CDM frameworks, the generalized-deterministic input noisy-and-gate (G-DINA) and the general diagnostic model (GDM). It contains additional functions for analyzing data under these frameworks, like tools for simulating and plotting data, or for evaluating global model and item fit. The paper describes the theoretical aspects of implemented CDM frameworks and it illustrates the usage of the package with empirical data of the common fraction subtraction test by Tatsuoka (1984).

Journal Article

Monitoring Count Time Series in R: Aberration Detection in Public Health Surveillance

[...]

Maëlle Salmon, Dirk Schumacher, Michael Höhle

18 May 2016-Journal of Statistical Software

TL;DR: The present article shows how well the R package surveillance can support automatic aberration detection in a public health surveillance context, illustrated by integrating it into the monitoring workflow of a public health institution.

Abstract: Public health surveillance aims at lessening disease burden by, e.g., timely recognizing emerging outbreaks in case of infectious diseases. Seen from a statistical perspective, this implies the use of appropriate methods for monitoring time series of aggregated case reports. This paper presents the tools for such automatic aberration detection offered by the R package surveillance. We introduce the functionalities for the visualization, modeling and monitoring of surveillance time series. With respect to modeling we focus on univariate time series modeling based on generalized linear models (GLMs), multivariate GLMs, generalized additive models and generalized additive models for location, shape and scale. Applications of such modeling include illustrating implementational improvements and extensions of the well-known Farrington algorithm, e.g., by spline-modeling or by treating it in a Bayesian context. Furthermore, we look at categorical time series and address overdispersion using beta-binomial or Dirichlet-multinomial modeling. With respect to monitoring we consider detectors based on either a Shewhart-like single timepoint comparison between the observed count and the predictive distribution or by likelihood-ratio based cumulative sum methods. Finally, we illustrate how surveillance can support aberration detection in practice by integrating it into the monitoring workflow of a public health institution. Altogether, the present article shows how well surveillance can support automatic aberration detection in a public health surveillance context.

Journal Article

equate: An R Package for Observed-Score Linking and Equating

[...]

Anthony D. Albano

21 Oct 2016-Journal of Statistical Software

TL;DR: The single-group, equivalent-groups, and nonequivalent-groups with anchor test(s) designs for observed-score linking and equating are introduced, and details about each of the methods supported by the R package equate are provided.

Abstract: The R package equate contains functions for observed-score linking and equating under single-group, equivalent-groups, and nonequivalent-groups with anchor test(s) designs. This paper introduces these designs and provides an overview of observed-score equating with details about each of the supported methods. Examples demonstrate the basic functionality of the equate package.

Journal Article

ggmcmc: Analysis of MCMC Samples and Bayesian Inference

[...]

Xavier Fernández-i-Marín

12 May 2016-Journal of Statistical Software

TL;DR: The article reviews the potential uses and options of ggmcmc, an R package for analyzing Markov chain Monte Carlo simulations from Bayesian inference, ranging from classical convergence tests to caterpillar plots or posterior predictive checks.

Abstract: ggmcmc is an R package for analyzing Markov chain Monte Carlo simulations from Bayesian inference. By using a well known example of hierarchical/multilevel modeling, the article reviews the potential uses and options of the package, ranging from classical convergence tests to caterpillar plots or posterior predictive checks.

Journal Article

Recovering a Basic Space from Issue Scales in R

[...]

Keith T. Poole, Jeffrey B. Lewis, Howard Rosenthal, James Lo, Royce Carroll

11 Mar 2016-Journal of Statistical Software

TL;DR: basicspace is an R package that conducts Aldrich-McKelvey and Blackbox scaling to recover estimates of the underlying latent dimensions of issue scale data; Monte Carlo tests demonstrate that the procedure can recover latent dimensions and reproduce the matrix of responses at moderate levels of error and missing data.


Abstract: basicspace is an R package that conducts Aldrich-McKelvey and Blackbox scaling to recover estimates of the underlying latent dimensions of issue scale data. We illustrate several applications of the package to survey data commonly used in the social sciences. Monte Carlo tests demonstrate that the procedure can recover latent dimensions and reproduce the matrix of responses at moderate levels of error and missing data.


Journal Article•DOI•

R2MLwiN: A Package to Run MLwiN From Within R


Zhengzheng Zhang, Richard M A Parker, Christopher M J Charlton, George Leckie, William J Browne

08 Sep 2016-Journal of Statistical Software

TL;DR: R2MLwiN is a new package designed to run the multilevel modeling software program MLwiN from within the R environment, and allows for a large range of models to be specified, including continuous, binary, proportion, count, ordinal and nominal responses for data structures which are nested, cross-classified and/or exhibit multiple membership.


Abstract: R2MLwiN is a new package designed to run the multilevel modeling software program MLwiN from within the R environment. It allows for a large range of models to be specified which take account of a multilevel structure, including continuous, binary, proportion, count, ordinal and nominal responses for data structures which are nested, cross-classified and/or exhibit multiple membership. Estimation is available via iterative generalized least squares (IGLS), which yields maximum likelihood estimates, and also via Markov chain Monte Carlo (MCMC) estimation for Bayesian inference. As well as employing MLwiN's own MCMC engine, users can request that MLwiN write BUGS model, data and initial values statements for use with WinBUGS or OpenBUGS (which R2MLwiN automatically calls via rbugs), employing IGLS starting values from MLwiN. Users can also take advantage of MLwiN's graphical user interface: for example to specify models and inspect plots via its interactive equations and graphics windows. R2MLwiN is supported by a large number of examples, reproducing all the analyses conducted in MLwiN's IGLS and MCMC manuals.


Journal Article•DOI•

gamboostLSS: An R Package for Model Building and Variable Selection in the GAMLSS Framework


Benjamin Hofner, Andreas Mayr, Matthias Schmid

20 Oct 2016-Journal of Statistical Software

TL;DR: This work provides a boosting method to fit generalized additive models for location, scale and shape (GAMLSS) with a variety of convenience functions, including methods for tuning parameter selection, prediction and visualization of results.


Abstract: Generalized additive models for location, scale and shape are a flexible class of regression models that allow multiple parameters of a distribution function, such as the mean and the standard deviation, to be modeled simultaneously. With the R package gamboostLSS, we provide a boosting method to fit these models. Variable selection and model choice are naturally available within this regularized regression framework. To introduce and illustrate the R package gamboostLSS and its infrastructure, we use a data set on stunted growth in India. In addition to the specification and application of the model itself, we present a variety of convenience functions, including methods for tuning parameter selection, prediction and visualization of results. The package gamboostLSS is available from the Comprehensive R Archive Network (CRAN) at https://CRAN.R-project.org/package=gamboostLSS.


Journal Article•DOI•

Generating Adaptive and Non-Adaptive Test Interfaces for Multidimensional Item Response Theory Applications


R. Philip Chalmers

27 Jul 2016-Journal of Statistical Software

TL;DR: This article describes a new R package for implementing unidimensional and multidimensional CATs using a wide variety of IRT models, which can be unique for each respective test item, and demonstrates how graphical user interfaces and Monte Carlo simulation designs can be constructed with the mirtCAT package.


Abstract: Computerized adaptive testing (CAT) is a powerful technique to help improve measurement precision and reduce the total number of items required in educational, psychological, and medical tests. In CATs, tailored test forms are progressively constructed by capitalizing on information available from responses to previous items. CAT applications primarily have relied on unidimensional item response theory (IRT) to help select which items should be administered during the session. However, multidimensional CATs may be constructed to improve measurement precision and further reduce the number of items required to measure multiple traits simultaneously. A small selection of CAT simulation packages exists for the R environment; namely, catR (Magis and Raiche 2012), catIrt (Nydick 2014), and MAT (Choi and King 2014). However, the ability to generate graphical user interfaces for administering CATs in real time has not been implemented in R to date, support for multidimensional CATs has been limited to the multidimensional three-parameter logistic model, and CAT designs were required to contain IRT models from the same modeling family. This article describes a new R package for implementing unidimensional and multidimensional CATs using a wide variety of IRT models, which can be unique for each respective test item, and demonstrates how graphical user interfaces and Monte Carlo simulation designs can be constructed with the mirtCAT package.


Journal Article•DOI•

PerFit: An R package for person-fit analysis in IRT


Jorge N. Tendeiro, Rob R. Meijer, A. Susan M. Niessen

20 Oct 2016-Journal of Statistical Software

TL;DR: The PerFit R package provides easy access to most person-fit statistics; the paper aims to show how these statistics can be easily applied to testing of questionnaire data.


Abstract: Checking the validity of test scores is important in both educational and psychological measurement. Person-fit analysis provides several statistics that help practitioners assess whether individual item score vectors conform to a prespecified item response theory model or, alternatively, to a group of test takers. Software enabling easy access to most person-fit statistics was lacking up to now. The PerFit R package was written in order to fill this void. A theoretical overview of relatively simple person-fit statistics is provided. A practical guide showing how the main functions of PerFit can be used is also given. Both numerical and graphical tools are described and illustrated using examples. The goal is to show how person-fit statistics can be easily applied to testing of questionnaire data.


Journal Article•DOI•

Just Another Gibbs Additive Modeler: Interfacing JAGS and mgcv


Simon N. Wood

20 Dec 2016-Journal of Statistical Software

TL;DR: In this article, the author describes an interface between mgcv and JAGS, based around an R function, jagam, which takes a generalized additive model (GAM) as specified in mgcv and automatically generates the JAGS model code and data required for inference about the model via Gibbs sampling.


Abstract: The BUGS language offers a very flexible way of specifying complex statistical models for the purposes of Gibbs sampling, while its JAGS variant offers very convenient R integration via the rjags package. However, including smoothers in JAGS models can involve some quite tedious coding, especially for multivariate or adaptive smoothers. Further, if an additive smooth structure is required then some care is needed, in order to centre smooths appropriately, and to find appropriate starting values. R package mgcv implements a wide range of smoothers, all in a manner appropriate for inclusion in JAGS code, and automates centring and other smooth setup tasks. The purpose of this note is to describe an interface between mgcv and JAGS, based around an R function, jagam, which takes a generalized additive model (GAM) as specified in mgcv and automatically generates the JAGS model code and data required for inference about the model via Gibbs sampling. Although the auto-generated JAGS code can be run as is, the expectation is that the user would wish to modify it in order to add complex stochastic model components readily specified in JAGS. A simple interface is also provided for visualisation and further inference about the estimated smooth components using standard mgcv functionality. The methods described here will be unnecessarily inefficient if all that is required is fully Bayesian inference about a standard GAM, rather than the full flexibility of JAGS. In that case the BayesX package would be more efficient.
