Showing papers in "Journal of Statistical Software in 2015"


Journal ArticleDOI
TL;DR: A model is described in an lmer call by a formula that includes both fixed- and random-effects terms; the formula and data together determine a numerical representation of the model from which the profiled deviance or the profiled REML criterion can be evaluated as a function of some of the model parameters.
Abstract: Maximum likelihood or restricted maximum likelihood (REML) estimates of the parameters in linear mixed-effects models can be determined using the lmer function in the lme4 package for R. As for most model-fitting functions in R, the model is described in an lmer call by a formula, in this case including both fixed- and random-effects terms. The formula and data together determine a numerical representation of the model from which the profiled deviance or the profiled REML criterion can be evaluated as a function of some of the model parameters. The appropriate criterion is optimized, using one of the constrained optimization functions in R, to provide the parameter estimates. We describe the structure of the model, the steps in evaluating the profiled deviance or REML criterion, and the structure of classes or types that represents such a model. Sufficient detail is included to allow specialization of these structures by users who wish to write functions to fit specialized linear mixed models, such as models incorporating pedigrees or smoothing splines, that are not easily expressible in the formula language used by lmer.

50,607 citations


Journal ArticleDOI
TL;DR: The fitdistrplus package provides functions for fitting univariate distributions to different types of data (continuous censored or non-censored data and discrete data) using different estimation methods (maximum likelihood, moment matching, quantile matching and maximum goodness-of-fit estimation).
Abstract: The package fitdistrplus provides functions for fitting univariate distributions to different types of data (continuous censored or non-censored data and discrete data) and allowing different estimation methods (maximum likelihood, moment matching, quantile matching and maximum goodness-of-fit estimation). Outputs of fitdist and fitdistcens functions are S3 objects, for which specific methods are provided, including summary, plot and quantile. This package also provides various functions to compare the fit of several distributions to the same data set and supports bootstrapping of parameter estimates. Detailed examples are given in food risk assessment, ecotoxicology and insurance contexts.

1,433 citations
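
To make one of the estimation methods named in the abstract concrete, the following is a minimal Python sketch of moment matching for a gamma distribution. It is not the fitdistrplus API: the function names `fit_gamma_moments` and `simulate_gamma` are hypothetical, the gamma family is chosen only for illustration, and the use of the population variance is an assumption made here.

```python
import random
from statistics import fmean, pvariance


def fit_gamma_moments(data):
    """Moment-matching estimates (shape, rate) for a gamma distribution.

    Matches the sample mean m and variance v to the theoretical moments
    mean = shape / rate and variance = shape / rate**2.
    """
    m = fmean(data)
    v = pvariance(data, mu=m)  # population variance: an assumption of this sketch
    if m <= 0 or v <= 0:
        raise ValueError("moment matching needs a positive mean and variance")
    return m * m / v, m / v


def simulate_gamma(n, shape, rate, seed=0):
    """Draw n gamma variates; random.gammavariate takes (shape, scale)."""
    rng = random.Random(seed)
    return [rng.gammavariate(shape, 1.0 / rate) for _ in range(n)]


if __name__ == "__main__":
    sample = simulate_gamma(50_000, shape=2.0, rate=0.5)
    shape_hat, rate_hat = fit_gamma_moments(sample)
    print(f"shape ~ {shape_hat:.3f}, rate ~ {rate_hat:.3f}")
```

Moment matching has a closed form here, which is why it is often used to obtain starting values before an iterative method such as maximum likelihood.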


Journal ArticleDOI
TL;DR: The principles behind the interface to continuous domain spatial models in the R-INLA software package for R are described. The integrated nested Laplace approximation (INLA) approach proposed by Rue, Martino, and Chopin (2009) is a computationally effective alternative to MCMC for Bayesian inference.
Abstract: The principles behind the interface to continuous domain spatial models in the R-INLA software package for R are described. The integrated nested Laplace approximation (INLA) approach proposed by Rue, Martino, and Chopin (2009) is a computationally effective alternative to MCMC for Bayesian inference. INLA is designed for latent Gaussian models, a very wide and flexible class of models ranging from (generalized) linear mixed to spatial and spatio-temporal models. Combined with the stochastic partial differential equation approach (SPDE, Lindgren, Rue, and Lindström 2011), one can accommodate all kinds of geographically referenced data, including areal and geostatistical ones, as well as spatial point process data. The implementation interface covers stationary spatial models, non-stationary spatial models, and also spatio-temporal models, and is applicable in epidemiology, ecology, environmental risk assessment, as well as general geostatistics.

829 citations


Journal ArticleDOI
TL;DR: This review constitutes an up-to-date comparison of generalized method of moments and maximum likelihood implementations now available, using the cross-sectional US county data set provided by Drukker, Prucha, and Raciborski (2013d).
Abstract: Recent advances in the implementation of spatial econometrics model estimation techniques have made it desirable to compare results, which should correspond between implementations across software applications for the same data. These model estimation techniques are associated with methods for estimating impacts (emanating effects), which are also presented and compared. This review constitutes an up-to-date comparison of generalized method of moments and maximum likelihood implementations now available. The comparison uses the cross-sectional US county data set provided by Drukker, Prucha, and Raciborski (2013d). The comparisons will be cast in the context of alternatives using the MATLAB Spatial Econometrics toolbox, Stata's user-written sppack commands, Python with PySAL and R packages including spdep, sphet and McSpatial.

828 citations


Journal ArticleDOI
TL;DR: A unified diagnostic framework with the R package nlstools is introduced and the various features of the package are presented and exemplified using a worked example from pulmonary medicine.
Abstract: Nonlinear regression models are applied in a broad variety of scientific fields. Various R functions are already dedicated to fitting such models, among which the function nls() has a prominent position. Unlike linear regression, fitting of nonlinear models relies on non-trivial assumptions and therefore users are required to carefully ensure and validate the entire modeling. Parameter estimation is carried out using some variant of the least-squares criterion involving an iterative process that ideally leads to the determination of the optimal parameter estimates. Therefore, users need to have a clear understanding of the model and its parameterization in the context of the application and data considered, an a priori idea about plausible values for parameter estimates, knowledge of model diagnostics procedures available for checking crucial assumptions, and, finally, an understanding of the limitations in the validity of the underlying hypotheses of the fitted model and its implication for the precision of parameter estimates. Current nonlinear regression modules lack dedicated diagnostic functionality. So there is a need to provide users with an extended toolbox of functions enabling a careful evaluation of nonlinear regression fits. To this end, we introduce a unified diagnostic framework with the R package nlstools. In this paper, the various features of the package are presented and exemplified using a worked example from pulmonary medicine.

491 citations


Journal ArticleDOI
TL;DR: Geographically weighted (GW) models use a moving window weighting technique in which localized models are found at target locations; outputs are mapped to provide a useful exploratory tool for examining spatial heterogeneity in the data.
Abstract: Spatial statistics is a growing discipline providing important analytical techniques in a wide range of disciplines in the natural and social sciences. In the R package GWmodel we present techniques from a particular branch of spatial statistics, termed geographically weighted (GW) models. GW models suit situations when data are not described well by some global model, but where there are spatial regions where a suitably localized calibration provides a better description. The approach uses a moving window weighting technique, where localized models are found at target locations. Outputs are mapped to provide a useful exploratory tool for examining the nature of spatial heterogeneity in the data. Currently, GWmodel includes functions for: GW summary statistics, GW principal components analysis, GW regression, and GW discriminant analysis; some of which are provided in basic and robust forms.

376 citations
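
The moving window weighting idea in the abstract can be illustrated with the simplest GW summary statistic, a geographically weighted mean. The Python sketch below is not GWmodel code: the names `gaussian_kernel` and `gw_mean` are hypothetical, and the Gaussian kernel with a fixed bandwidth is an assumed choice among several possible weighting schemes.

```python
import math


def gaussian_kernel(distance, bandwidth):
    """Weight that decays smoothly with distance from the target location."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def gw_mean(target, points, values, bandwidth):
    """Geographically weighted mean of `values` observed at `points`.

    Observations near `target` receive larger weights, so the result is a
    localized summary rather than a single global mean.
    """
    weights = [gaussian_kernel(math.dist(target, p), bandwidth) for p in points]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


if __name__ == "__main__":
    points = [(0.0, 0.0), (10.0, 0.0)]
    values = [1.0, 3.0]
    # Evaluating at several target locations gives a surface that can be mapped.
    for target in [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]:
        print(target, round(gw_mean(target, points, values, bandwidth=2.0), 4))
```

A small bandwidth makes the estimate follow local values closely, while a very large bandwidth recovers the ordinary global mean.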


Journal ArticleDOI
TL;DR: The R package TSclust aims to implement a large set of well-established peer-reviewed time series dissimilarity measures, including measures based on raw data, extracted features, underlying parametric models, complexity levels, and forecast behaviors.
Abstract: Time series clustering is an active research area with applications in a wide range of fields. One key component in cluster analysis is determining a proper dissimilarity measure between two data objects, and many criteria have been proposed in the literature to assess dissimilarity between two time series. The R package TSclust aims to implement a large set of well-established peer-reviewed time series dissimilarity measures, including measures based on raw data, extracted features, underlying parametric models, complexity levels, and forecast behaviors. Computation of these measures allows the user to perform clustering by using conventional clustering algorithms. TSclust also includes a clustering procedure based on p values from checking the equality of generating models, and some utilities to evaluate cluster solutions. The implemented dissimilarity functions are accessible individually for an easier extension and possible use out of the clustering context. The main features of TSclust are described and examples of its use are presented.

362 citations
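
The distinction between a dissimilarity based on raw data and one based on extracted features can be sketched as follows. This Python example is illustrative only and does not reproduce TSclust functions: the names are hypothetical, and the choice of sample autocorrelations up to an assumed maximum lag as the feature vector is one possibility among many.

```python
import math


def euclidean_distance(a, b):
    """Raw-data dissimilarity: pointwise Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def autocorrelations(series, max_lag):
    """Sample autocorrelations at lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]


def acf_distance(a, b, max_lag=5):
    """Feature-based dissimilarity: distance between autocorrelation vectors."""
    return euclidean_distance(autocorrelations(a, max_lag),
                              autocorrelations(b, max_lag))


if __name__ == "__main__":
    wave = [math.sin(0.3 * t) for t in range(50)]
    shifted = [10.0 + 3.0 * x for x in wave]      # same dynamics, new scale
    zigzag = [(-1.0) ** t for t in range(50)]     # different dynamics
    print("raw  wave vs shifted:", euclidean_distance(wave, shifted))
    print("acf  wave vs shifted:", acf_distance(wave, shifted))
    print("acf  wave vs zigzag :", acf_distance(wave, zigzag))
```

The rescaled series is far from the original in raw terms but essentially identical in autocorrelation structure, which is the kind of behaviour that motivates choosing the measure to suit the clustering goal.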


Journal ArticleDOI
TL;DR: The poweRlaw R package is described, which makes fitting power laws and other heavy-tailed distributions straightforward and provides a principled approach to power law fitting.
Abstract: Over the last few years, the power law distribution has been used as the data generating mechanism in many disparate fields. However, at times the techniques used to fit the power law distribution have been inappropriate. This paper describes the poweRlaw R package, which makes fitting power laws and other heavy-tailed distributions straightforward. This package contains R functions for fitting, comparing and visualizing heavy-tailed distributions. Overall, it provides a principled approach to power law fitting.

320 citations
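
As a small illustration of likelihood-based power law fitting, the Python sketch below uses the standard closed-form maximum likelihood estimator for the exponent of a continuous power law. It is not the poweRlaw interface: the function names are hypothetical, and the lower cutoff `xmin` is assumed to be known, whereas in practice it usually has to be estimated as well.

```python
import math
import random


def sample_power_law(n, alpha, xmin, seed=0):
    """Inverse-transform samples from p(x) proportional to x**(-alpha), x >= xmin."""
    rng = random.Random(seed)
    # 1 - U lies in (0, 1], so the power below is always finite.
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_alpha(data, xmin):
    """Maximum likelihood estimate of the exponent, given a known xmin.

    alpha_hat = 1 + n / sum(log(x_i / xmin)) over observations x_i >= xmin.
    """
    tail = [x for x in data if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum <= 0:
        raise ValueError("need observations strictly above xmin")
    return 1.0 + len(tail) / log_sum


if __name__ == "__main__":
    data = sample_power_law(20_000, alpha=2.5, xmin=1.0)
    print(f"estimated exponent: {fit_alpha(data, xmin=1.0):.3f}")
```

A common inappropriate alternative is a least-squares line through a log-log histogram, which tends to give biased exponent estimates; the likelihood-based estimator avoids the binning step altogether.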


Journal ArticleDOI
TL;DR: A new fully interactive R interface to BayesX is presented: the R package R2BayesX, which complements the already impressive capabilities for semiparametric regression in R by a comprehensive toolbox comprising in particular more complex response types and alternative inferential procedures such as simulation-based Bayesian inference.
Abstract: Structured additive regression (STAR) models provide a flexible framework for modeling possible nonlinear effects of covariates: They contain the well-established frameworks of generalized linear models and generalized additive models as special cases but also allow a wider class of effects, e.g., for geographical or spatio-temporal data, allowing for specification of complex and realistic models. BayesX is a standalone software package for fitting a general class of STAR models. Based on a comprehensive open-source regression toolbox written in C++, BayesX uses Bayesian inference for estimating STAR models based on Markov chain Monte Carlo simulation techniques, a mixed model representation of STAR models, or stepwise regression techniques combining penalized least squares estimation with model selection. BayesX not only covers models for responses from univariate exponential families, but also models from less-standard regression situations such as models for multi-categorical responses with either ordered or unordered categories, continuous time survival data, or continuous time multi-state models. This paper presents a new fully interactive R interface to BayesX: the R package R2BayesX. With the new package, STAR models can be conveniently specified using R’s formula language (with some extended terms), fitted using the BayesX binary, represented in R with objects of suitable classes, and finally printed/summarized/plotted. This makes BayesX much more accessible to users familiar with R and adds extensive graphics capabilities for visualizing fitted STAR models. Furthermore, R2BayesX complements the already impressive capabilities for semiparametric regression in R by a comprehensive toolbox comprising in particular more complex response types and alternative inferential procedures such as simulation-based Bayesian inference.

262 citations


Journal ArticleDOI
TL;DR: The ecp package is designed to perform multiple change point analysis while making as few assumptions as possible, and is suitable for both univariate and multivariate observations.
Abstract: There are many different ways in which change point analysis can be performed, from purely parametric methods to those that are distribution free. The ecp package is designed to perform multiple change point analysis while making as few assumptions as possible. While many other change point methods are applicable only for univariate data, this R package is suitable for both univariate and multivariate observations. Hierarchical estimation can be based upon either a divisive or agglomerative algorithm. Divisive estimation sequentially identifies change points via a bisection algorithm. The agglomerative algorithm estimates change point locations by determining an optimal segmentation. Both approaches are able to detect any type of distributional change within the data. This provides an advantage over many existing change point algorithms which are only able to detect changes within the marginal distributions.

259 citations
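
A single bisection step of a divisive procedure can be sketched as below. This is a simplified Python illustration, not the ecp algorithm: it assumes a univariate series, scores each candidate split with an energy-distance-style statistic based on mean absolute differences (an assumption of this sketch), and omits the significance testing a real procedure would need before accepting a split. All names are hypothetical.

```python
from itertools import combinations


def mean_abs_between(xs, ys):
    """Average absolute difference between members of two segments."""
    return sum(abs(x - y) for x in xs for y in ys) / (len(xs) * len(ys))


def mean_abs_within(xs):
    """Average absolute difference over all pairs inside one segment."""
    pairs = list(combinations(xs, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def split_statistic(left, right):
    """Scaled divergence between the two segments; larger means more distinct."""
    m, n = len(left), len(right)
    divergence = (2.0 * mean_abs_between(left, right)
                  - mean_abs_within(left) - mean_abs_within(right))
    return m * n / (m + n) * divergence


def best_split(series, min_size=5):
    """Return (index, statistic) of the split maximizing the statistic."""
    candidates = range(min_size, len(series) - min_size + 1)
    k = max(candidates, key=lambda i: split_statistic(series[:i], series[i:]))
    return k, split_statistic(series[:k], series[k:])


if __name__ == "__main__":
    # Hypothetical series with a level shift after the first 30 observations.
    series = ([0.1 * (i % 3) for i in range(30)]
              + [5.0 + 0.1 * (i % 3) for i in range(30)])
    print(best_split(series))
```

Applying `best_split` recursively to the resulting segments, and stopping when a split is no longer judged significant, gives the sequential character of divisive estimation described in the abstract.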


Journal ArticleDOI
TL;DR: A new R package nparcomp is introduced which provides easy and user-friendly access to rank-based methods for the analysis of unbalanced one-way layouts and provides procedures performing multiple comparisons and computing simultaneous confidence intervals for the estimated effects which can be easily visualized.
Abstract: One-way layouts, i.e., a single factor with several levels and multiple observations at each level, frequently arise in various fields. Usually not only a global hypothesis is of interest but also multiple comparisons between the different treatment levels. In most practical situations, the distribution of observed data is unknown and there may exist a number of atypical measurements and outliers. Hence, use of parametric and semiparametric procedures that impose restrictive distributional assumptions on observed samples becomes questionable. This, in turn, emphasizes the demand for statistical procedures that enable us to accurately and reliably analyze one-way layouts with minimal conditions on available data. Nonparametric methods offer such a possibility and thus become of particular practical importance. In this article, we introduce a new R package nparcomp which provides easy and user-friendly access to rank-based methods for the analysis of unbalanced one-way layouts. It provides procedures performing multiple comparisons and computing simultaneous confidence intervals for the estimated effects which can be easily visualized. The special case of two samples, the nonparametric Behrens-Fisher problem, is included. We illustrate the implemented procedures by examples from biology and medicine.

Journal ArticleDOI
TL;DR: entropart is a package for R designed to estimate diversity based on HCDT entropy or similarity-based entropy, which allows calculating species-neutral, phylogenetic and functional entropy and diversity, partitioning them and correcting them for estimation bias.
Abstract: entropart is a package for R designed to estimate diversity based on HCDT entropy or similarity-based entropy. It allows calculating species-neutral, phylogenetic and functional entropy and diversity, partitioning them and correcting them for estimation bias.

Journal ArticleDOI
TL;DR: kml and kml3d are R packages providing an implementation of k-means designed to work specifically on trajectories (kml) or on joint trajectories (kml3d), and they offer graphic facilities to “visualize” the trajectories, either in 2D or 3D (joint-trajectories).
Abstract: Longitudinal studies are essential tools in medical research. In these studies, variables are not restricted to single measurements but can be seen as variable-trajectories, either single or joint. Thus, an important question concerns the identification of homogeneous patient trajectories. kml and kml3d are R packages providing an implementation of k-means designed to work specifically on trajectories (kml) or on joint trajectories (kml3d). They provide various tools to work on longitudinal data: imputation methods for trajectories (nine classic and one original), methods to define starting conditions in k-means (four classic and three original) and quality criteria to choose the best number of clusters (four classic and one original). In addition, they offer graphic facilities to “visualize” the trajectories, either in 2D (single trajectory) or 3D (joint-trajectories). The 3D graph representing the mean joint-trajectories of each cluster can be exported through LaTeX in a 3D dynamic rotating PDF graph (Figures 1 and 9).
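
The core idea of running k-means on whole trajectories can be sketched in a few lines of Python. This is not the kml implementation: it assumes complete trajectories measured at the same time points (so no imputation is needed), uses plain Euclidean distance, starts from randomly chosen trajectories, and all names are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equally long trajectories."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_trajectory(trajectories):
    """Pointwise mean across trajectories (the cluster's mean trajectory)."""
    return [sum(col) / len(col) for col in zip(*trajectories)]


def kmeans_trajectories(trajectories, k, seed=0, max_iter=100):
    """Cluster trajectories with Lloyd's algorithm; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(t) for t in rng.sample(trajectories, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(t, centers[c]))
            for t in trajectories
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [t for t, lab in zip(trajectories, labels) if lab == c]
            if members:  # keep the previous center if a cluster empties
                centers[c] = mean_trajectory(members)
    return labels, centers


if __name__ == "__main__":
    # Hypothetical data: three rising and three flat patient trajectories.
    rising = [[0.0, 1.0, 2.0, 3.0], [0.1, 1.1, 2.1, 3.1], [-0.1, 0.9, 1.9, 2.9]]
    flat = [[5.0, 5.0, 5.0, 5.0], [5.1, 5.1, 5.1, 5.1], [4.9, 4.9, 4.9, 4.9]]
    labels, centers = kmeans_trajectories(rising + flat, k=2)
    print(labels)
    print(centers)
```

Because the result of k-means depends on the starting centers, the choice of starting conditions and of a quality criterion for the number of clusters matters in practice, which is where the tools listed in the abstract come in.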

Journal ArticleDOI
TL;DR: The R package RandomFields supports the simulation, the parameter estimation and the prediction in particular for the linear model of coregionalization, the multivariate Matérn models, the delay model, and a spectrum of physically motivated vector valued models.
Abstract: Modeling of and inference on multivariate data that have been measured in space, such as temperature and pressure, are challenging tasks in environmental sciences, physics and materials science. We give an overview of and some background on modeling with cross-covariance models. The R package RandomFields supports the simulation, the parameter estimation and the prediction in particular for the linear model of coregionalization, the multivariate Matérn models, the delay model, and a spectrum of physically motivated vector valued models. An example on weather data is considered, illustrating the use of RandomFields for parameter estimation and prediction.

Journal ArticleDOI
TL;DR: The R package cpm is described, which provides a fast implementation of parametric and nonparametric change point models in both batch (Phase I) and sequential (Phase II) settings, where the sequences may contain either a single or multiple change points.
Abstract: The change point model framework introduced in Hawkins, Qiu, and Kang (2003) and Hawkins and Zamba (2005a) provides an effective and computationally efficient method for detecting multiple mean or variance change points in sequences of Gaussian random variables, when no prior information is available regarding the parameters of the distribution in the various segments. It has since been extended in various ways by Hawkins and Deng (2010), Ross, Tasoulis, and Adams (2011), Ross and Adams (2012) to allow for fully nonparametric change detection in non-Gaussian sequences, when no knowledge is available regarding even the distributional form of the sequence. Another extension comes from Ross and Adams (2011) and Ross (2014) which allows change detection in streams of Bernoulli and Exponential random variables respectively, again when the values of the parameters are unknown. This paper describes the R package cpm, which provides a fast implementation of all the above change point models in both batch (Phase I) and sequential (Phase II) settings, where the sequences may contain either a single or multiple change points.

Journal ArticleDOI
TL;DR: The HardyWeinberg package offers the classical tests for equilibrium, functions for power computation and for the simulation of marker data under equilibrium and disequilibrium, and various graphical tools for exploring the equilibrium status of a large set of diallelic markers.
Abstract: Testing genetic markers for Hardy-Weinberg equilibrium is an important issue in genetic association studies. The HardyWeinberg package offers the classical tests for equilibrium, functions for power computation and for the simulation of marker data under equilibrium and disequilibrium. Functions for testing equilibrium in the presence of missing data by using multiple imputation are provided. The package also supplies various graphical tools such as ternary plots with acceptance regions, log-ratio plots and Q-Q plots for exploring the equilibrium status of a large set of diallelic markers. Classical tests for equilibrium and graphical representations for diallelic marker data are reviewed. Several data sets illustrate the use of the package.
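
One of the classical tests for a diallelic marker is the chi-square goodness-of-fit test, which compares observed genotype counts with those expected under equilibrium. The Python sketch below illustrates the arithmetic only and is not the HardyWeinberg package interface: the function name `hwe_chisq` is hypothetical, and no continuity correction is applied (an assumption of this sketch).

```python
import math


def hwe_chisq(n_aa, n_ab, n_bb):
    """Chi-square test for Hardy-Weinberg equilibrium at a diallelic marker.

    Returns (statistic, p_value), with the p-value taken from a chi-square
    distribution with one degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("marker is monomorphic; the test is undefined")
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chisq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-square(1): P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chisq / 2.0))
    return chisq, p_value


if __name__ == "__main__":
    print(hwe_chisq(25, 50, 25))  # counts exactly at equilibrium
    print(hwe_chisq(30, 40, 30))  # heterozygote deficit
```

With counts (30, 40, 30) the allele frequency is 0.5, the expected counts are (25, 50, 25), and the statistic is 1 + 2 + 1 = 4. For small samples or rare alleles an exact test is generally preferred over this approximation.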

Journal ArticleDOI
TL;DR: This paper presents the R package frbs, which implements the most widely used FRBS models, namely, Mamdani and Takagi Sugeno Kang (TSK) ones, as well as some common variants.
Abstract: Fuzzy rule-based systems (FRBSs) are a well-known method family within soft computing. They are based on fuzzy concepts to address complex real-world problems. We present the R package frbs which implements the most widely used FRBS models, namely, Mamdani and Takagi Sugeno Kang (TSK) ones, as well as some common variants. In addition, a host of learning methods for FRBSs, where the models are constructed from data, are implemented. In this way, accurate and interpretable systems can be built for data analysis and modeling tasks. In this paper, we also provide some examples on the usage of the package and a comparison with other common classification and regression methods available in R.

Journal ArticleDOI
TL;DR: The BMS (Bayesian model sampling) package for R that implements Bayesian model averaging for linear regression models excels in allowing for a variety of prior structures, among them the "binomial-beta" prior on the model space and the so-called "hyper-g" specifications for Zellner's g prior.
Abstract: This article describes the BMS (Bayesian model sampling) package for R that implements Bayesian model averaging for linear regression models. The package excels in allowing for a variety of prior structures, among them the "binomial-beta" prior on the model space and the so-called "hyper-g" specifications for Zellner's g prior. Furthermore, the BMS package allows the user to specify her own model priors and offers a possibility of subjective inference by setting "prior inclusion probabilities" according to the researcher's beliefs. In addition, graphical analysis of results is provided by numerous built-in plot functions of posterior densities, predictive densities and graphical illustrations to compare results under different prior settings. Finally, the package provides full enumeration of the model space for small scale problems as well as two efficient MCMC (Markov chain Monte Carlo) samplers that sort through the model space when the number of potential covariates is large.

Journal ArticleDOI
TL;DR: This paper shows how to fit a number of spatial models with R-INLA, including its interaction with other R packages for data analysis, and describes a novel method to extend the number of latent models available for the model parameters.
Abstract: The integrated nested Laplace approximation (INLA) provides an interesting way of approximating the posterior marginals of a wide range of Bayesian hierarchical models. This approximation is based on conducting a Laplace approximation of certain functions and numerical integration is extensively used to integrate some of the model parameters out. The R-INLA package offers an interface to INLA, providing a suitable framework for data analysis. Although the INLA methodology can deal with a large number of models, only the most relevant have been implemented within R-INLA. However, many other important models are not available for R-INLA yet. In this paper we show how to fit a number of spatial models with R-INLA, including its interaction with other R packages for data analysis. Secondly, we describe a novel method to extend the number of latent models available for the model parameters. Our approach is based on conditioning on one or several model parameters and fitting these conditioned models with R-INLA. Then these models are combined using Bayesian model averaging to provide the final approximations to the posterior marginals of the model. Finally, we show some examples of the application of this technique in spatial statistics. It is worth noting that our approach can be extended to a number of other fields, and not only spatial statistics.
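
The combination step described in the abstract, in which models fitted conditionally on fixed parameter values are merged by Bayesian model averaging, can be sketched generically. The Python example below is an assumption-laden illustration and does not call R-INLA: it supposes each conditioned fit has already returned a log marginal likelihood and a posterior marginal density evaluated on a shared grid, and the function names and numbers are hypothetical.

```python
import math


def bma_weights(log_marginal_likelihoods, log_priors=None):
    """Posterior weights of the conditioned models.

    Weight_i is proportional to exp(log_mlik_i + log_prior_i); the maximum
    is subtracted before exponentiating to avoid numerical underflow.
    """
    if log_priors is None:
        log_priors = [0.0] * len(log_marginal_likelihoods)  # flat prior
    log_post = [l + p for l, p in zip(log_marginal_likelihoods, log_priors)]
    peak = max(log_post)
    unnormalized = [math.exp(v - peak) for v in log_post]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def average_marginals(marginals, weights):
    """Weighted mixture of marginal densities evaluated on a common grid."""
    grid_size = len(marginals[0])
    return [
        sum(w * m[i] for w, m in zip(weights, marginals))
        for i in range(grid_size)
    ]


if __name__ == "__main__":
    # Hypothetical outputs of three fits, each conditioned on one parameter value.
    log_mliks = [-1203.4, -1201.9, -1202.6]
    marginals = [[0.2, 0.5, 0.3], [0.1, 0.6, 0.3], [0.3, 0.4, 0.3]]
    weights = bma_weights(log_mliks)
    print(weights)
    print(average_marginals(marginals, weights))
```

The grid of conditioning values plays the role of a discretized prior over the parameter that the base software cannot estimate directly, so a finer grid gives a better approximation at the cost of more model fits.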

Journal ArticleDOI
TL;DR: This paper introduces two R packages available on the Comprehensive R Archive Network, DiceDesign and DiceEval, dedicated to numerical design of experiments and the fit, the validation and the comparison of metamodels.
Abstract: This paper introduces two R packages available on the Comprehensive R Archive Network. The main application concerns the study of computer code output. Package DiceDesign is dedicated to numerical design of experiments, from the construction to the study of the design properties. Package DiceEval deals with the fit, the validation and the comparison of metamodels. After a brief presentation of the context, we focus on the architecture of these two packages. A two-dimensional test function will be a running example to illustrate the main functionalities of these packages and an industrial case study in five dimensions will also be detailed.

Journal ArticleDOI
TL;DR: The core functions of the spBayes R package have been reformulated and rewritten to improve computational efficiency, flexibility, and usability for point-referenced data models, resulting in improved sampler convergence rate and efficiency by reducing parameter space, decreased sampler run-time by avoiding expensive matrix computations, and increased scalability to large datasets through a class of predictive process models that represent spatial processes in terms of lower-dimensional realizations.
Abstract: In this paper we detail the reformulation and rewrite of core functions in the spBayes R package. These efforts have focused on improving computational efficiency, flexibility, and usability for point-referenced data models. Attention is given to algorithm and computing developments that result in improved sampler convergence rate and efficiency by reducing parameter space; decreased sampler run-time by avoiding expensive matrix computations; and increased scalability to large datasets by implementing a class of predictive process models that attempt to overcome computational hurdles by representing spatial processes in terms of lower-dimensional realizations. Beyond these general computational improvements for existing model functions, we detail new functions for modeling data indexed in both space and time. These new functions implement a class of dynamic spatio-temporal models for settings where space is viewed as continuous and time is taken as discrete.

Journal ArticleDOI
TL;DR: This paper implements a slightly modified version of the model proposed by Ranjan et al. (2011) in the R package GPfit, with a novel parameterization of the spatial correlation function and a clustering based multi-start gradient based optimization algorithm that yield robust optimization that is typically faster than the genetic algorithm based approach.
Abstract: Gaussian process (GP) models are commonly used statistical metamodels for emulating expensive computer simulators. Fitting a GP model can be numerically unstable if any pair of design points in the input space are close together. Ranjan, Haynes, and Karsten (2011) proposed a computationally stable approach for fitting GP models to deterministic computer simulators. They used a genetic algorithm based approach that is robust but computationally intensive for maximizing the likelihood. This paper implements a slightly modified version of the model proposed by Ranjan et al. (2011) in the R package GPfit. A novel parameterization of the spatial correlation function and a clustering based multi-start gradient based optimization algorithm yield robust optimization that is typically faster than the genetic algorithm based approach. We present two examples with R codes to illustrate the usage of the main functions in GPfit. Several test functions are used for performance comparison with the popular R package mlegp. We also use GPfit for a real application, i.e., for emulating the tidal kinetic energy model for the Bay of Fundy, Nova Scotia, Canada. GPfit is free software and distributed under the General Public License and available from the Comprehensive R Archive Network.

Journal ArticleDOI
TL;DR: PReMiuM is a recently developed R package for Bayesian clustering using a Dirichlet process mixture model, which allows binary, categorical, count and continuous response, as well as continuous and discrete covariates.
Abstract: PReMiuM is a recently developed R package for Bayesian clustering using a Dirichlet process mixture model. This model is an alternative to regression models, non-parametrically linking a response vector to covariate data through cluster membership (Molitor, Papathomas, Jerrett, and Richardson 2010). The package allows binary, categorical, count and continuous response, as well as continuous and discrete covariates. Additionally, predictions may be made for the response, and missing values for the covariates are handled. Several samplers and label switching moves are implemented along with diagnostic tools to assess convergence. A number of R functions for post-processing of the output are also provided. Beyond fitting mixtures, it may also be of interest to determine which covariates actively drive the mixture components. This is implemented in the package as variable selection.

Journal ArticleDOI
TL;DR: The package spTimer for hierarchical Bayesian modeling of stylized environmental space-time monitoring data is developed as a contributed software package in the R language that is fast becoming a very popular statistical computing platform.
Abstract: Hierarchical Bayesian modeling of large point-referenced space-time data is increasingly becoming feasible in many environmental applications due to the recent advances in both statistical methodology and computation power. Implementation of these methods using the Markov chain Monte Carlo (MCMC) computational techniques, however, requires development of problem-specific and user-written computer code, possibly in a low-level language. This programming requirement is hindering the widespread use of the Bayesian model-based methods among practitioners and hence there is an urgent need to develop high-level software that can analyze large data sets rich in both space and time. This paper develops the package spTimer for hierarchical Bayesian modeling of stylized environmental space-time monitoring data as a contributed software package in the R language that is fast becoming a very popular statistical computing platform. The package is able to fit, and to spatially and temporally predict, large amounts of space-time data using three recently developed Bayesian models. The user is given control over many options regarding covariance function selection, distance calculation, prior selection and tuning of the implemented MCMC algorithms, although suitable defaults are provided. The package has many other attractive features such as on-the-fly transformations and an ability to spatially predict temporally aggregated summaries on the original scale, which saves the problem of storage when using MCMC methods for large datasets. A simulation example, with more than a million observations, and a real life data example are used to validate the underlying code and to illustrate the software capabilities.

Journal ArticleDOI
TL;DR: An overview of the model-based clustering and classification methods implemented in Mixmod is given, and it is shown how the R package Rmixmod can be used for clustering and discriminant analysis.
Abstract: Mixmod is a well-established software package for fitting a mixture model of multivariate Gaussian or multinomial probability distribution functions to a given data set with either a clustering, a density estimation or a discriminant analysis purpose. The Rmixmod S4 package provides a bridge between the C++ core library of Mixmod (mixmodLib) and the R statistical computing environment. In this article, we give an overview of the model-based clustering and classification methods, and we show how the R package Rmixmod can be used for clustering and discriminant analysis.

Journal ArticleDOI
TL;DR: A suite of R functions provides an extensible framework for inferring covariate effects as well as the parameters of the latent field in log-Gaussian Cox processes, and methods are presented for Bayesian inference in two further classes of model based on the log-Gaussian Cox process.
Abstract: Log-Gaussian Cox processes are an important class of models for spatial and spatiotemporal point-pattern data. Delivering robust Bayesian inference for this class of models presents a substantial challenge, since Markov chain Monte Carlo (MCMC) algorithms require careful tuning in order to work well. To address this issue, we describe recent advances in MCMC methods for these models and their implementation in the R package lgcp. Our suite of R functions provides an extensible framework for inferring covariate effects as well as the parameters of the latent field. We also present methods for Bayesian inference in two further classes of model based on the log-Gaussian Cox process. The first of these concerns the case where we wish to fit a point process model to data consisting of event-counts aggregated to a set of spatial regions: we demonstrate how this can be achieved using data-augmentation. The second concerns Bayesian inference for a class of marked-point processes specified via a multivariate log-Gaussian Cox process model. For both of these extensions, we give details of their implementation in R.

Journal ArticleDOI
TL;DR: This review describes Crawley's text as a short introduction to statistical analysis with R. The first half covers standard introductory material (descriptive statistics, simple regression and inferential statistics) in a somewhat idiosyncratic way; the second half is a good introduction to modeling (multiple regression, ANOVA and ANCOVA, various general linear models, survival analysis).
Abstract: Michael J. Crawley’s text Statistics: An Introduction with R is indeed an introduction to statistical analysis as well as an excellent introduction to working in and with R. However, it is not for the faint of heart. The first half covers standard introductory material (descriptive statistics, simple regression and inferential statistics) in a somewhat idiosyncratic way. The second half of the text is a good introduction to modeling (multiple regression, ANOVA and ANCOVA, various general linear models, survival analysis). The treatments are thorough and yet short: only 300-odd pages. As you can imagine, he is extraordinarily succinct. There are many useful and thoughtful insights; it is well written and clear. It contains no exercises; there is an excellent appendix that serves well as a basic R tutorial. If you teach this material without a text or think you might like to try, you might find that it (paradoxically) will work as the structural backbone for a course that you fill out as you see fit. The text will serve well for an introductory course or for a second course in modeling.

Journal ArticleDOI
TL;DR: This paper briefly describes geostatistical models for Gaussian and non-Gaussian data and demonstrates the geostatsp and diseasemapping packages for performing inference using these models.
Abstract: This paper briefly describes geostatistical models for Gaussian and non-Gaussian data and demonstrates the geostatsp and diseasemapping packages for performing inference using these models. Making use of R’s spatial data types, and raster objects in particular, makes spatial analyses using geostatistical models simple and convenient. Examples using real data are shown for Gaussian spatial data, binomially distributed spatial data, a log-Gaussian Cox process, and an area-level model for case counts.

Journal ArticleDOI
TL;DR: The theory and application of generalized linear autoregressive moving average observation-driven models for time series of counts with explanatory variables and the estimation of these models using the R package glarma are reviewed.
Abstract: We review the theory and application of generalized linear autoregressive moving average observation-driven models for time series of counts with explanatory variables and describe the estimation of these models using the R package glarma. Forecasting, diagnostic and graphical methods are also illustrated by several examples.

Journal ArticleDOI
TL;DR: In this article, the authors present diffIRT, an R package that can be used to fit item response theory models that are based on a diffusion process, and illustrate the use of the package with two datasets pertaining to extraversion and mental rotation.
Abstract: In the psychometric literature, item response theory models have been proposed that explicitly take into account the decision process underlying the responses of subjects to psychometric test items. Application of these models is, however, hampered by the absence of general and flexible software to fit them. In this paper, we present diffIRT, an R package that can be used to fit item response theory models that are based on a diffusion process. We discuss parameter estimation and model fit assessment, show the viability of the package in a simulation study, and illustrate the use of the package with two datasets pertaining to extraversion and mental rotation. In addition, we illustrate how the package can be used to fit the traditional diffusion model (as it was originally developed in experimental psychology) to data.