
Showing papers in "Journal of Agricultural Biological and Environmental Statistics in 2017"


Journal ArticleDOI
TL;DR: It is demonstrated why well-established formal procedures for model selection, such as those based on standard information criteria, tend to favor models with numbers of states that are undesirably large in situations where states shall be meaningful entities.
Abstract: We discuss the notorious problem of order selection in hidden Markov models, that is of selecting an adequate number of states, highlighting typical pitfalls and practical challenges arising when analyzing real data. Extensive simulations are used to demonstrate the reasons that render order selection particularly challenging in practice despite the conceptual simplicity of the task. In particular, we demonstrate why well-established formal procedures for model selection, such as those based on standard information criteria, tend to favor models with numbers of states that are undesirably large in situations where states shall be meaningful entities. We also offer a pragmatic step-by-step approach together with comprehensive advice for how practitioners can implement order selection. Our proposed strategy is illustrated with a real-data case study on muskox movement. Supplementary materials accompanying this paper appear online.

181 citations
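
As a rough illustration of the information criteria mentioned in the abstract above, the sketch below computes AIC and BIC for HMMs with different numbers of states. The log-likelihood values, the sample size, the choice of two parameters per state, and the stationarity assumption (no separate initial-distribution parameters) are all hypothetical and are not taken from the paper.

```python
import math

def hmm_param_count(n_states: int, params_per_state: int) -> int:
    """Free parameters of an N-state HMM, assuming a stationary chain
    (so the initial distribution adds no extra parameters)."""
    transition = n_states * (n_states - 1)  # each row of the transition matrix sums to 1
    return transition + n_states * params_per_state

def aic(log_lik: float, k: int) -> float:
    return -2.0 * log_lik + 2.0 * k

def bic(log_lik: float, k: int, n_obs: int) -> float:
    return -2.0 * log_lik + k * math.log(n_obs)

# Hypothetical maximized log-likelihoods, keyed by number of states.
log_liks = {2: -5200.0, 3: -5100.0, 4: -5060.0, 5: -5035.0}
n_obs = 2000            # hypothetical series length
params_per_state = 2    # e.g. shape and scale of a step-length distribution (assumed)

scores = {}
for n_states, ll in log_liks.items():
    k = hmm_param_count(n_states, params_per_state)
    scores[n_states] = (aic(ll, k), bic(ll, k, n_obs))

best_aic = min(scores, key=lambda n: scores[n][0])
best_bic = min(scores, key=lambda n: scores[n][1])
print("AIC selects", best_aic, "states; BIC selects", best_bic, "states")
```

With these made-up numbers, the likelihood keeps improving as states are added, so both criteria settle on a comparatively large model, which is the kind of behaviour the abstract warns about.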


Journal ArticleDOI
TL;DR: This paper discusses how Bayesian multiple-regression methods that are used for whole-genome prediction can be adapted for GWAS and argues that controlling the posterior type I error rate is more suitable than controlling the genomewise error rate for controlling false positives in GWAS.
Abstract: Data that are collected for whole-genome prediction can also be used for genome-wide association studies (GWAS). This paper discusses how Bayesian multiple-regression methods that are used for whole-genome prediction can be adapted for GWAS. It is argued here that controlling the posterior type I error rate (PER) is more suitable than controlling the genomewise error rate (GER) for controlling false positives in GWAS. It is shown here that under ideal conditions, i.e., when the model is correctly specified, PER can be controlled by using Bayesian posterior probabilities that are easy to obtain. Computer simulation was used to examine the properties of this Bayesian approach when the ideal conditions were not met. Results indicate that even then useful inferences can be made.

72 citations


Journal ArticleDOI
TL;DR: This paper considers generalized linear latent variable models that can handle overdispersed counts and continuous but non-negative data and shows how estimation and inference for the considered models can be performed efficiently using the Laplace approximation method and use simulations to study the finite-sample properties of the resulting estimates.
Abstract: In this paper we consider generalized linear latent variable models that can handle overdispersed counts and continuous but non-negative data. Such data are common in ecological studies when modelling multivariate abundances or biomass. By extending the standard generalized linear modelling framework to include latent variables, we can account for any covariation between species not accounted for by the predictors, notably species interactions and correlations driven by missing covariates. We show how estimation and inference for the considered models can be performed efficiently using the Laplace approximation method and use simulations to study the finite-sample properties of the resulting estimates. In the overdispersed count data case, the Laplace-approximated estimates perform similarly to the estimates based on variational approximation method, which is another method that provides a closed form approximation of the likelihood. In the biomass data case, we show that ignoring the correlation between taxa affects the regression estimates unfavourably. To illustrate how our methods can be used in unconstrained ordination and in making inference on environmental variables, we apply them to two ecological datasets: abundances of bacterial species in three arctic locations in Europe and abundances of coral reef species in Indonesia. Supplementary materials accompanying this paper appear on-line.

58 citations


Journal ArticleDOI
TL;DR: In this article, a hierarchical hidden Markov model (HMM) is proposed to model animal behavior simultaneously at multiple time scales, opening new possibilities in the area of animal movement and behavior modeling.
Abstract: Hidden Markov models (HMMs) are commonly used to model animal movement data and infer aspects of animal behavior. An HMM assumes that each data point from a time series of observations stems from one of N possible states. The states are loosely connected to behavioral modes that manifest themselves at the temporal resolution at which observations are made. Due to advances in tag technology and tracking with digital video recordings, data can be collected at increasingly fine temporal resolutions. Yet, inferences at time scales cruder than those at which data are collected, and which correspond to larger-scale behavioral processes, are not yet answered via HMMs. We include additional hierarchical structures to the basic HMM framework, incorporating multiple Markov chains at various time scales. The hierarchically structured HMMs allow for behavioral inferences at multiple time scales and can also serve as a means to avoid coarsening data. Our proposed framework is one of the first that models animal behavior simultaneously at multiple time scales, opening new possibilities in the area of animal movement and behavior modeling. We illustrate the application of hierarchically structured HMMs in two real-data examples: (i) vertical movements of harbor porpoises observed in the field, and (ii) garter snake movement data collected as part of an experimental design. Supplementary materials accompanying this paper appear online.

50 citations


Journal ArticleDOI
TL;DR: This work investigates a practical alternative to incorporating measurement error and temporally irregular observations into HMMs based on multiple imputation of the position process drawn from a single-state continuous-time movement model and finds the two-stage multiple-imputation approach to be promising in terms of its ease of implementation, computation time, and performance.
Abstract: When data streams are observed without error and at regular time intervals, discrete-time hidden Markov models (HMMs) have become immensely popular for the analysis of animal location and auxiliary biotelemetry data. However, measurement error and temporally irregular data are often pervasive in telemetry studies, particularly in marine systems. While relatively small amounts of missing data that are missing-completely-at-random are not typically problematic in HMMs, temporal irregularity can result in few (if any) observations aligning with the regular time steps required by HMMs. Fitting HMMs that explicitly account for uncertainty attributable to location measurement error, temporally irregular observations, or other forms of missing data typically requires computationally demanding techniques, such as Markov chain Monte Carlo (MCMC). Using simulation and a real-world bearded seal (Erignathus barbatus) example, I investigate a practical alternative to incorporating measurement error and temporally irregular observations into HMMs based on multiple imputation of the position process drawn from a single-state continuous-time movement model. This two-stage approach is relatively simple, performed with existing software using efficient maximum likelihood methods, and completely parallelizable. I generally found the approach to perform well across a broad range of simulated measurement error and irregular sampling rates, with latent states and locations reliably recovered in nearly all simulated scenarios. However, high measurement error coupled with low sampling rates often induced bias in both the estimated probability distributions of data streams derived from the imputed position process and the estimated effects of spatial covariates on state transition probabilities. Results from the two-stage analysis of the bearded seal data were similar to a more computationally intensive single-stage MCMC analysis, but the two-stage analysis required much less computation time and no custom model-fitting algorithms. I thus found the two-stage multiple-imputation approach to be promising in terms of its ease of implementation, computation time, and performance. Code for implementing the approach using the R package “momentuHMM” is provided. Supplementary materials accompanying this paper appear online.

45 citations


Journal ArticleDOI
TL;DR: In this article, the authors used multiple imputation approaches to account for the uncertainty associated with our knowledge of the latent trajectory of an animal, and applied these methods to analyze a telemetry data set involving northern fur seals (Callorhinus ursinus) in the Bering Sea.
Abstract: The analysis of telemetry data is common in animal ecological studies. While the collection of telemetry data for individual animals has improved dramatically, the methods to properly account for inherent uncertainties (e.g., measurement error, dependence, barriers to movement) have lagged behind. Still, many new statistical approaches have been developed to infer unknown quantities affecting animal movement or predict movement based on telemetry data. Hierarchical statistical models are useful to account for some of the aforementioned uncertainties, as well as provide population-level inference, but they often come with an increased computational burden. For certain types of statistical models, it is straightforward to provide inference if the latent true animal trajectory is known, but challenging otherwise. In these cases, approaches related to multiple imputation have been employed to account for the uncertainty associated with our knowledge of the latent trajectory. Despite the increasing use of imputation approaches for modeling animal movement, the general sensitivity and accuracy of these methods have not been explored in detail. We provide an introduction to animal movement modeling and describe how imputation approaches may be helpful for certain types of models. We also assess the performance of imputation approaches in two simulation studies. Our simulation studies suggest that inference for model parameters directly related to the location of an individual may be more accurate than inference for parameters associated with higher-order processes such as velocity or acceleration. Finally, we apply these methods to analyze a telemetry data set involving northern fur seals (Callorhinus ursinus) in the Bering Sea. Supplementary materials accompanying this paper appear online.

35 citations


Journal ArticleDOI
TL;DR: The interpretable nature of the continuous-time model is demonstrated, finding clear differences in behaviour over time and insights into short-term behaviour that could not have been obtained in discrete time.
Abstract: Mechanistic modelling of animal movement is often formulated in discrete time despite problems with scale invariance, such as handling irregularly timed observations. A natural solution is to formulate in continuous time, yet uptake of this has been slow. This lack of implementation is often excused by a difficulty in interpretation. Here we aim to bolster usage by developing a continuous-time model with interpretable parameters, similar to those of popular discrete-time models that use turning angles and step lengths. Movement is defined by a joint bearing and speed process, with parameters dependent on a continuous-time behavioural switching process, creating a flexible class of movement models. Methodology is presented for Markov chain Monte Carlo inference given irregular observations, involving augmenting observed locations with a reconstruction of the underlying movement process. This is applied to well-known GPS data from elk (Cervus elaphus), which have previously been modelled in discrete time. We demonstrate the interpretable nature of the continuous-time model, finding clear differences in behaviour over time and insights into short-term behaviour that could not have been obtained in discrete time.

32 citations


Journal ArticleDOI
TL;DR: It is found that the group lasso prior resulted in roughly twice the effective sample size for MCMC samples of regression coefficients and can have higher and less variable predictive accuracy based on out-of-sample data when compared to the standard SGLMM.
Abstract: Generalized linear mixed models for spatial processes are widely used in applied statistics. In many applications of the spatial generalized linear mixed model (SGLMM), the goal is to obtain inference about regression coefficients while achieving optimal predictive ability. When implementing the SGLMM, multicollinearity among covariates and the spatial random effects can make computation challenging and influence inference. We present a Bayesian group lasso prior with a single tuning parameter that can be chosen to optimize predictive ability of the SGLMM and jointly regularize the regression coefficients and spatial random effect. We implement the group lasso SGLMM using efficient Markov chain Monte Carlo (MCMC) algorithms and demonstrate how multicollinearity among covariates and the spatial random effect can be monitored as a derived quantity. To test our method, we compared several parameterizations of the SGLMM using simulated data and two examples from plant ecology and disease ecology. In all examples, problematic levels of multicollinearity occurred and influenced sampling efficiency and inference. We found that the group lasso prior resulted in roughly twice the effective sample size for MCMC samples of regression coefficients and can have higher and less variable predictive accuracy based on out-of-sample data when compared to the standard SGLMM. Supplementary materials accompanying this paper appear online.

32 citations


Journal ArticleDOI
TL;DR: Methods for simulation and inference based on augmenting the constrained movement path with a latent unconstrained path are presented and illustrated with a simulation example and an analysis of telemetry data from a Steller sea lion in southeast Alaska.
Abstract: Movement for many animal species is constrained in space by barriers such as rivers, shorelines, or impassable cliffs. We develop an approach for modeling animal movement constrained in space by considering a class of constrained stochastic processes, reflected stochastic differential equations. Our approach generalizes existing methods for modeling unconstrained animal movement. We present methods for simulation and inference based on augmenting the constrained movement path with a latent unconstrained path and illustrate this augmentation with a simulation example and an analysis of telemetry data from a Steller sea lion (Eumetopias jubatus) in southeast Alaska.

18 citations


Journal ArticleDOI
TL;DR: A new estimator is proposed based on the bias correction method introduced by Firth, which uses a modification of the score function, and is provided with an easily computable, Newton–Raphson iterative formula for its computation.
Abstract: In the estimation of proportions by pooled testing, the MLE is biased, and several methods of correcting the bias have been presented in previous studies. We propose a new estimator based on the bias correction method introduced by Firth (Biometrika 80:27-38, 1993), which uses a modification of the score function, and we provide an easily computable, Newton-Raphson iterative formula for its computation. Our proposed estimator is almost unbiased across a range of problems, and superior to existing methods. We show that for equal pool sizes the new estimator is equivalent to the estimator proposed by Burrows (Phytopathology 77:363-365, 1987). The performance of our estimator is examined using pooled testing problems encountered in plant disease assessment and prevalence estimation of mosquito-borne viruses.

18 citations
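
To make the bias of the pooled-testing MLE mentioned in the abstract above concrete, here is a minimal sketch for equal pool sizes. It assumes a perfect test (a pool is positive exactly when it contains at least one positive unit) and uses hypothetical values for the prevalence, number of pools, and pool size. It shows only the uncorrected MLE, not the Firth-type estimator the paper proposes.

```python
from math import comb

def pooled_mle(positive_pools: int, n_pools: int, pool_size: int) -> float:
    """MLE of the individual-level proportion from equal-sized pools."""
    return 1.0 - (1.0 - positive_pools / n_pools) ** (1.0 / pool_size)

def expected_mle(p: float, n_pools: int, pool_size: int) -> float:
    """Exact expectation of the MLE, summing over the binomial
    distribution of the number of positive pools."""
    theta = 1.0 - (1.0 - p) ** pool_size  # probability that a pool tests positive
    return sum(
        comb(n_pools, x) * theta**x * (1.0 - theta) ** (n_pools - x)
        * pooled_mle(x, n_pools, pool_size)
        for x in range(n_pools + 1)
    )

p_true = 0.05  # hypothetical true prevalence
bias = expected_mle(p_true, n_pools=20, pool_size=10) - p_true
print(f"bias of the MLE: {bias:.5f}")
```

Because the map from the positive-pool fraction to the estimate is convex, the expectation exceeds the true value, so the MLE overestimates on average in this setting.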


Journal ArticleDOI
TL;DR: The proposed model for dependence among individuals’ behavioral states is combined with a spatially varying stochastic differential equation model to allow for spatially and temporally heterogeneous collective movement of all ants within the nest.
Abstract: Animal movement often exhibits changing behavior because animals often alternate between exploring, resting, feeding, or other potential states. Changes in these behavioral states are often driven by environmental conditions or the behavior of nearby individuals. We propose a model for dependence among individuals’ behavioral states. We couple this state switching with complex discrete-time animal movement models to analyze a large variety of animal movement types. To demonstrate this method of capturing dependence, we study the movements of ants in a nest. The behavioral interaction structure is combined with a spatially varying stochastic differential equation model to allow for spatially and temporally heterogeneous collective movement of all ants within the nest. Our results reveal behavioral tendencies that are related to nearby individuals, particularly the queen, and to different locations in the nest.

Journal ArticleDOI
TL;DR: In this article, a model is developed for the residual variance covariance structure firstly by considering a multivariate autoregressive model in one spatial direction and then extending this to two spatial directions.
Abstract: Field trials for variety selection often exhibit spatial correlation between plots. When multivariate data are analysed from these field trials, there is the added complication in having to simultaneously account for correlation between the traits at both the residual and genetic levels. This may be temporal correlation in the case of multi-harvest data from perennial crop field trials, or between-trait correlation in multi-trait data sets. Use of parsimonious yet plausible models for the variance–covariance structure of the residuals for such data is a key element to achieving an efficient and inferentially sound analysis. In this paper, a model is developed for the residual variance–covariance structure firstly by considering a multivariate autoregressive model in one spatial direction and then extending this to two spatial directions. Conditions for ensuring that the processes are directionally invariant are presented. Using a canonical decomposition, these directionally invariant processes can be transformed into a set of independent separable processes. This simplifies the estimation process. The new model allows for flexible modelling of the spatial and multivariate interaction and allows for different spatial correlation parameters for each harvest or trait. The methods are illustrated using data from lucerne breeding trials at several environments.

Journal ArticleDOI
TL;DR: In this paper, a Bayesian smoothing technique based on a conditionally autoregressive (CAR) prior distribution and Bayesian regression was used to estimate animal distributions and abundances over large regions.
Abstract: Estimating animal distributions and abundances over large regions is of primary interest in ecology and conservation. Specifically, integrating data from reliable but expensive surveys conducted at smaller scales with cost-effective but less reliable data generated from surveys at wider scales remains a central challenge in statistical ecology. In this study, we use a Bayesian smoothing technique based on a conditionally autoregressive (CAR) prior distribution and Bayesian regression to address this problem. We illustrate the utility of our proposed methodology by integrating (i) abundance estimates of tigers in wildlife reserves from intensive photographic capture–recapture methods, and (ii) estimates of tiger habitat occupancy from indirect sign surveys, conducted over a wider region. We also investigate whether the random effects which represent the spatial association due to the CAR structure have any confounding effect on the fixed effects of the regression coefficients.

Journal ArticleDOI
TL;DR: This paper proposes a general hierarchical framework for modeling collective movement behavior with multiple stages that allows for the discrete-time prediction of animal locations in the presence of missing observations, and develops an approximate Bayesian computation algorithm for estimation.
Abstract: Modeling complex collective animal movement presents distinct challenges. In particular, modeling the interactions between animals and the nonlinear behaviors associated with these interactions, while accounting for uncertainty in data, model, and parameters, requires a flexible modeling framework. To address these challenges, we propose a general hierarchical framework for modeling collective movement behavior with multiple stages. Each of these stages can be thought of as processes that are flexible enough to model a variety of complex behaviors. For example, self-propelled particle (SPP) models (e.g., Vicsek et al. in Phys Rev Lett 75:1226–1229, 1995) represent collective behavior and are often applied in the physics and biology literature. To date, the study and application of these models has almost exclusively focused on simulation studies, with less attention given to rigorously quantifying the uncertainty. Here, we demonstrate our general framework with a hierarchical version of the SPP model applied to collective animal movement. This structure allows us to make inference on potential covariates (e.g., habitat) that describe the behavior of agents and rigorously quantify uncertainty. Further, this framework allows for the discrete time prediction of animal locations in the presence of missing observations. Due to the computational challenges associated with the proposed model, we develop an approximate Bayesian computation algorithm for estimation. We illustrate the hierarchical SPP methodology with a simulation study and by modeling the movement of guppies. Supplementary materials accompanying this paper appear online.

Journal ArticleDOI
TL;DR: This paper combines demographic datasets with island-wide population counts to construct the first multi-species Integrated Population Model to consider synchrony, and extends the IPM concept to allow the simultaneous estimation of demographic parameters, adult abundance and multi-species synchrony in survival and productivity, within a robust statistical framework.
Abstract: Integrated population models (IPMs) combine data on different aspects of demography with time-series of population abundance. IPMs are becoming increasingly popular in the study of wildlife populations, but their application has largely been restricted to the analysis of single species. However, species exist within communities: sympatric species are exposed to the same abiotic environment, which may generate synchrony in the fluctuations of their demographic parameters over time. Given that in many environments conditions are changing rapidly, assessing whether species show similar demographic and population responses is fundamental to quantifying interspecific differences in environmental sensitivity and highlighting ecological interactions at risk of disruption. In this paper, we combine statistical approaches to study populations, integrating data along two different dimensions: across species (using a recently proposed framework to quantify multi-species synchrony in demography) and within each species (using IPMs with demographic and abundance data). We analyse data from three seabird species breeding at a nationally important long-term monitoring site. We combine demographic datasets with island-wide population counts to construct the first multi-species Integrated Population Model to consider synchrony. Our extension of the IPM concept allows the simultaneous estimation of demographic parameters, adult abundance and multi-species synchrony in survival and productivity, within a robust statistical framework. The approach is readily applicable to other taxa and habitats.

Journal ArticleDOI
TL;DR: In this paper, a flexible statistical modeling framework is presented to deal with multivariate count data along with longitudinal and repeated measures structures, where the covariance structure for each response variable is defined in terms of a covariance link function combined with a matrix linear predictor involving known matrices.
Abstract: The main goal of this article is to present a flexible statistical modelling framework to deal with multivariate count data along with longitudinal and repeated measures structures. The covariance structure for each response variable is defined in terms of a covariance link function combined with a matrix linear predictor involving known matrices. In order to specify the joint covariance matrix for the multivariate response vector, the generalized Kronecker product is employed. We take into account the count nature of the data by means of the power dispersion function associated with the Poisson–Tweedie distribution. Furthermore, the score information criterion is extended for selecting the components of the matrix linear predictor. We analyse a data set consisting of prey animals (the main hunted species, the blue duiker Philantomba monticola and other taxa) shot or snared for bushmeat by 52 commercial hunters over a 33-month period in Pico Basile, Bioko Island, Equatorial Guinea. By taking into account the severely unbalanced repeated measures and longitudinal structures induced by the hunters and a set of potential covariates (which in turn affect the mean and covariance structures), our method can be used to indicate whether there was statistical evidence of a decline in blue duikers and other species hunted during the study period. Determining whether observed drops in the number of animals hunted are indeed true is crucial to assess whether species depletion effects are taking place in exploited areas anywhere in the world. We suggest that our method can be used to more accurately understand the trajectories of animals hunted for commercial or subsistence purposes and establish clear policies to ensure sustainable hunting practices.

Journal ArticleDOI
TL;DR: In this introduction, a brief overview to statistical models for animal trajectories is provided and the set of invited articles that comprise the issue are summarized.
Abstract: In this introduction, we provide a brief overview to statistical models for animal trajectories and then summarize the set of invited articles that comprise the issue.

Journal ArticleDOI
TL;DR: For the 2012 US Census of Agriculture, a capture-recapture analysis was conducted to adjust for undercoverage, nonresponse, and misclassification.
Abstract: The Census of Agriculture is conducted every 5 years, in years ending in 2 and 7. The Census list frame is incomplete, resulting in undercoverage. Not all operations on the list frame respond, and, based on the response, some misclassification occurs. In 2012, a capture–recapture analysis was conducted to adjust for undercoverage, nonresponse, and misclassification. This was the first time capture–recapture methods were used to produce official statistics for an establishment survey. The number of records on the Census Mailing List that were classified as farms was 1,382,099, and the published estimate of the number of farms was 2,109,303, a 34.5% adjustment. The adjustment was greatest for farms with low production levels and for specialty farms, both of which are difficult to identify and add to the list. The methods used are described. Challenges that arose in the implementation process are discussed. Areas for enhancement being targeted for the 2017 Census of Agriculture are highlighted. Supplementary materials accompanying this paper appear online.
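
The 34.5% figure quoted above appears to be expressed relative to the published estimate rather than to the list count. A quick check of the arithmetic, using only the two counts given in the abstract (the variable names are illustrative):

```python
list_farms = 1_382_099          # records on the Census Mailing List classified as farms
published_estimate = 2_109_303  # published estimate of the number of farms

added = published_estimate - list_farms

# Adjustment as a share of the published estimate.
share_of_estimate = 100 * added / published_estimate

# The same adjustment expressed as an increase over the list count.
increase_over_list = 100 * added / list_farms

print(f"{share_of_estimate:.1f}% of the estimate, "
      f"{increase_over_list:.1f}% increase over the list")
```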

Journal ArticleDOI
TL;DR: This paper proposes a more flexible and easy-to-implement model for predicting aboveground biomass (stem, branches and total) as a smooth function of height and diameter, using smooth additive mixed models that preserve the additive property necessary to model the relationship within wood fractions and allow the inclusion of random effects and interaction terms.
Abstract: Aboveground biomass estimation in short-rotation forestry plantations is an essential step in the development of crop management strategies as well as allowing the economic viability of the crop to be determined prior to harvesting. Hence, it is important to develop new methodologies that improve the accuracy of predictions, using only a minimum set of easily obtainable information, i.e., diameter and height. Many existing models base their predictions only on diameter (mainly due to the complexity of including further covariates), or rely on complicated equations to obtain biomass predictions. However, in tree species, it is important to include height when estimating aboveground biomass because this will vary from one genotype to another. This work proposes the use of a more flexible and easy-to-implement model for predicting aboveground biomass (stem, branches and total) as a smooth function of height and diameter using smooth additive mixed models which preserve the additive property necessary to model the relationship within wood fractions, and allow the inclusion of random effects and interaction terms. The model is applied to the analysis of three trials carried out in Spain, where nine clones at three different sites are compared. Also, an analysis of slash pine data is carried out in order to compare with the approach proposed by Parresol (Can J For Res 31:865–878, 2001). Supplementary materials accompanying this paper appear on-line.

Journal ArticleDOI
TL;DR: This paper illustrates the application of exact p values, first developed for succession data, to construct a confidence set on a carrion insect’s age based only on its development stage.
Abstract: The age of a carrion insect associated with a corpse may represent a minimum postmortem interval. No method has been proposed before for constructing a confidence set on age based on development stage modeled as a categorical response. This paper illustrates the application of exact p values, first developed for succession data, to construct a confidence set on a carrion insect’s age based only on its development stage. It uses published development data for Lucilia sericata, with individuals reared at different temperatures pooled into sets of similar age as indexed in accumulated degree hours. Rates of coverage of true ages, assessed using each insect as a singleton holdout sample, were greater than the nominal 95% level.

Journal ArticleDOI
TL;DR: This paper presents a Bayesian nonparametric modeling approach to inference and risk assessment for developmental toxicity studies and uses data from a toxicity experiment that investigated the toxic effects of an organic solvent to demonstrate the range of inferences obtained from the nonparametric mixture model, including comparison with a parametric hierarchical model.
Abstract: We present a Bayesian nonparametric modeling approach to inference and risk assessment for developmental toxicity studies. The primary objective of these studies is to determine the relationship between the level of exposure to a toxic chemical and the probability of a physiological or biochemical response. We consider a general data setting involving clustered categorical responses on the number of prenatal deaths, the number of live pups, and the number of live malformed pups from each laboratory animal, as well as continuous outcomes (e.g., body weight) on each of the live pups. We utilize mixture modeling to provide flexibility in the functional form of both the multivariate response distribution and the various dose–response curves of interest. The nonparametric model is built from a structured mixture kernel and a dose-dependent Dirichlet process prior for the mixing distribution. The modeling framework enables general inference for the implied dose–response relationships and for dose-dependent correlations between the different endpoints, features which provide practical advances relative to traditional parametric models for developmental toxicology. We use data from a toxicity experiment that investigated the toxic effects of an organic solvent (diethylene glycol dimethyl ether) to demonstrate the range of inferences obtained from the nonparametric mixture model, including comparison with a parametric hierarchical model. Supplementary materials accompanying this paper appear on-line.

Journal ArticleDOI
TL;DR: In this paper, the odd log-logistic Student t distribution is proposed as an alternative to the normal and Student t distributions; the new distribution can be symmetric, platykurtic, mesokurtic or leptokurtic and may be unimodal or bimodal.
Abstract: The normal distribution is the most widely used distribution in the analysis of experiments. However, it is not suitable when the data show evidence of bimodality or heavier tails than the normal distribution. So, we propose a new four-parameter model called the odd log-logistic Student t distribution as an alternative to the normal and Student t distributions. The new distribution can be symmetric, platykurtic, mesokurtic or leptokurtic and may be unimodal or bimodal. Its various structural properties can be determined from the linear representation of its density function. The estimation of the model parameters is performed by maximum likelihood. The proposed distribution can be used as an alternative for randomized complete block designs, thus providing a more realistic analysis of real data than other special regression models. We perform a sensitivity analysis to detect influential or outlying observations, and construct generated envelopes from the residuals to select appropriate models. We illustrate the importance of the proposed model by means of three real data sets from experiments carried out in different regions of Brazil.
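
The abstract does not state the distribution function itself. As a sketch only, assuming the paper applies the standard odd log-logistic generator to a Student t baseline, the four parameters would be a location \mu, a scale \sigma, degrees of freedom \nu, and an extra shape parameter \alpha (these symbols are assumptions, not taken from the abstract):

```latex
% Baseline: Student t with location \mu, scale \sigma, and \nu degrees of freedom
G(x) = T_{\nu}\!\left(\frac{x-\mu}{\sigma}\right), \qquad
g(x) = \frac{1}{\sigma}\, t_{\nu}\!\left(\frac{x-\mu}{\sigma}\right)

% Odd log-logistic transform of the baseline, with shape \alpha > 0
F(x) = \frac{G(x)^{\alpha}}{G(x)^{\alpha} + \{1 - G(x)\}^{\alpha}}

% Density obtained by differentiating F(x)
f(x) = \frac{\alpha\, g(x)\, \{G(x)\,[1 - G(x)]\}^{\alpha - 1}}
            {\left( G(x)^{\alpha} + \{1 - G(x)\}^{\alpha} \right)^{2}}
```

Under this construction, \alpha = 1 recovers the Student t distribution, and letting \nu grow large approaches the normal case.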

Journal ArticleDOI
TL;DR: In this article, a zero-inflated spatial model was used to quantify variance components due to non-sampling factors, and the model was then used to calibrate the estimated abundance index and its variance using pseudo empirical likelihood.
Abstract: Abundance and standard error estimates in surveys of fishery resources typically employ classical design-based approaches, ignoring the influences of non-design factors such as varying catchability. We developed a Bayesian approach for estimating abundance and associated errors in a fishery survey by incorporating sampling and non-sampling variabilities. First, a zero-inflated spatial model was used to quantify variance components due to non-sampling factors; second, the model was used to calibrate the estimated abundance index and its variance using pseudo empirical likelihood. The approach was applied to a winter dredge survey conducted to estimate the abundance of blue crabs (Callinectes sapidus) in the Chesapeake Bay. We explored the properties of the calibration estimators through a limited simulation study. The variance estimator calibrated on the posterior sample performed well, and the mean estimator had performance comparable to the design-based approach, with slightly higher bias and about 15% lower mean squared error. The results suggest that application of this approach can improve estimation of abundance indices using data from design-based fishery surveys.

Journal ArticleDOI
TL;DR: An MCMC approach is presented for fitting models of space use in which individuals alternate between spending time in a patch and moving to other patches in the network, and ways to estimate time spent in the non-habitat matrix when going from patch to patch are provided.
Abstract: Landscape heterogeneity can often be represented as a series of discrete habitat or resource patches surrounded by a matrix of non-habitat. Understanding how animals move in such networks of patches is important for many theoretical and applied questions. The probability of going from one patch to another is affected in a non-trivial way by the characteristics and location of other patches in the network. Nearby patches can compete as possible destinations, and a particular patch can be shadowed by neighboring patches. We present a way to account for the effects of the spatial configuration of patches in models of space use where individuals alternate between spending time in a patch and moving to other patches in the network. The approach is based on the original derivation of Ovaskainen and Cornell (J Appl Probab 40:557–580, 2003) for a diffusion model that considered all possible ways in which an individual leaving a particular patch can eventually reach another patch before dying or leaving the patch network. By replacing the theoretical results of Ovaskainen and Cornell by other appropriate functions, we provide generality and thus make their approach useful in contexts where diffusion is not a good approximation of movement. Furthermore, we provide ways to estimate time spent in the non-habitat matrix when going from patch to patch and implement a method to incorporate the effect of the history of previous visits on future patch use. We present an MCMC way to fit these models to data and illustrate the approach with both simulated data and data from sheep moving among seasonally flooded meadows in northern Patagonia. Supplementary materials accompanying this paper appear online.


Journal ArticleDOI
TL;DR: In this article, a model for estimating the effects of a temporal sequence of a weather variable (such as mean temperatures from successive months) on annual species abundance indices is presented, where overfitting is avoided by constraining the regression coefficients to lie on a curve defined by a small number of parameters.
Abstract: Weather has often been associated with fluctuations in population sizes of species; however, it can be difficult to estimate the effects satisfactorily because population size is naturally measured by annual abundance indices whilst weather varies on much shorter timescales. We describe a novel method for estimating the effects of a temporal sequence of a weather variable (such as mean temperatures from successive months) on annual species abundance indices. The model we use has a separate regression coefficient for each covariate in the temporal sequence, and over-fitting is avoided by constraining the regression coefficients to lie on a curve defined by a small number of parameters. The constrained curve is the product of a periodic function, reflecting assumptions that associations with weather will vary smoothly throughout the year and tend to be repetitive across years, and an exponentially decaying term, reflecting an assumption that the weather from the most recent year will tend to have the greatest effect on the current population and that the effect of weather in previous years tends to diminish as the time lag increases. We have used this approach to model 501 species abundance indices from Great Britain and present detailed results for two contrasting species alongside an overall impression of the results across all species. We believe this approach provides an important advance to the challenge of robustly modelling relationships between weather and species population size.
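
The constrained-coefficient idea in this abstract can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the abstract says only that the curve is the product of a periodic function and an exponentially decaying term, so the cosine form, the 12-month period, and the parameter names `amplitude`, `phase`, and `decay` are all assumptions.

```python
import math


def constrained_coefficients(n_lags, amplitude, phase, decay, period=12):
    """Return one regression coefficient per monthly lag.

    Every coefficient lies on a curve with three free parameters:
    a cosine (assumed periodic form) times an exponential decay in the lag.
    """
    return [
        amplitude
        * math.cos(2 * math.pi * (lag - phase) / period)
        * math.exp(-decay * lag)
        for lag in range(n_lags)
    ]


def weather_effect(weather, coefs):
    """Linear predictor contribution: sum of coefficient times lagged covariate."""
    return sum(b * x for b, x in zip(coefs, weather))


# 36 monthly lags (three years) described by three parameters instead of 36.
coefs = constrained_coefficients(n_lags=36, amplitude=0.5, phase=3.0, decay=0.05)
```

In this sketch, a lag one full period later than another has the same sign but a smaller magnitude, which mirrors the assumption that older weather matters less.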

Journal ArticleDOI
TL;DR: In this article, the authors consider a class of N-mixture models that allows for detection heterogeneity over time through a flexibly defined time-to-detection distribution and allows for fixed and random effects for both abundance and detection.
Abstract: Abundance estimates from animal point-count surveys require accurate estimates of detection probabilities. The standard model for estimating detection from removal-sampled point-count surveys assumes that organisms at a survey site are detected at a constant rate; however, this assumption can often lead to biased estimates. We consider a class of N-mixture models that allows for detection heterogeneity over time through a flexibly defined time-to-detection distribution (TTDD) and allows for fixed and random effects for both abundance and detection. Our model is thus a combination of survival time-to-event analysis with unknown-N, unknown-p abundance estimation. We specifically explore two-parameter families of TTDDs, e.g., gamma, that can additionally include a mixture component to model increased probability of detection in the initial observation period. Based on simulation analyses, we find that modeling a TTDD by using a two-parameter family is necessary when data have a chance of arising from a distribution of this nature. In addition, models with a mixture component can outperform non-mixture models even when the truth is non-mixture. Finally, we analyze an Ovenbird data set from the Chippewa National Forest using mixed effect models for both abundance and detection. We demonstrate that the effects of explanatory variables on abundance and detection are consistent across mixture TTDDs but that flexible TTDDs result in lower estimated probabilities of detection and therefore higher estimates of abundance. Supplementary materials accompanying this paper appear on-line.
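
To make the role of the time-to-detection distribution concrete, the sketch below computes the probability that an animal is first detected in each interval of a removal-sampled count. It is an illustration under stated assumptions, not the paper's model: a Weibull distribution stands in for the two-parameter family because its CDF has a closed form (the abstract names the gamma as an example), and the mixture component is represented by a hypothetical fraction `p_initial` of animals detected immediately. The function and parameter names are invented for this example.

```python
import math


def weibull_cdf(t, rate, shape=1.0):
    """P(time to first detection <= t); shape=1 gives the constant-rate model."""
    return 1.0 - math.exp(-((rate * t) ** shape))


def interval_detection_probs(breaks, rate, shape=1.0, p_initial=0.0):
    """Probability of first detection in each interval.

    breaks: increasing interval end times, e.g. [2, 4, 6] minutes.
    p_initial: assumed mixture weight for animals detected at once,
               which raises the probability of the first interval only.
    """
    probs = []
    prev_cdf = 0.0
    for t in breaks:
        cdf = p_initial + (1.0 - p_initial) * weibull_cdf(t, rate, shape)
        probs.append(cdf - prev_cdf)
        prev_cdf = cdf
    return probs


constant_rate = interval_detection_probs([2, 4, 6], rate=0.5)
with_mixture = interval_detection_probs([2, 4, 6], rate=0.5, p_initial=0.3)
```

The sum of the interval probabilities is the overall probability of detection during the survey, which is the quantity that scales observed counts up to abundance.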

Journal ArticleDOI
TL;DR: A multivariate model for functional mapping is developed that can detect and characterize quantitative trait loci (QTLs) that simultaneously control multiple dynamic traits; the model can aid in the comprehension of the genetic control mechanisms of complex dynamic traits over time.
Abstract: Many biological phenomena undergo developmental changes in time and space. Functional mapping, which is aimed at mapping genes that affect developmental patterns, is instrumental for studying the genetic architecture of biological changes. Often biological processes are mediated by a network of developmental and physiological components and, therefore, are better described by multiple phenotypes. In this article, we develop a multivariate model for functional mapping that can detect and characterize quantitative trait loci (QTLs) that simultaneously control multiple dynamic traits. Because the true genotypes of QTLs are unknown, the measurements for the multiple dynamic traits are modeled using a mixture distribution. The functional means of the multiple dynamic traits are estimated using the nonparametric regression method, which avoids any parametric assumption on the functional means. We propose the profile likelihood method to estimate the mixture model. A likelihood ratio test is exploited to test for the existence of pleiotropic effects on distinct but developmentally correlated traits. A simulation study is implemented to illustrate the finite sample performance of our proposed method. We also demonstrate our method by identifying QTLs that simultaneously control three dynamic traits of soybeans. The three dynamic traits are the time-course biomass of the leaf, the stem, and the root of the whole soybean. The genetic linkage map is constructed with 950 microsatellite markers. The new model can aid in our comprehension of the genetic control mechanisms of complex dynamic traits over time.

Journal ArticleDOI
TL;DR: In this paper, a multiple binary trait simulation was carried out in order to implement and validate a new procedure for dealing with the consequences of the restrictions imposed on the residual variance using threshold models.
Abstract: Several discrete responses, such as health status, reproduction performance and meat quality, are routinely collected for several livestock species. These traits are often of binary or discrete nature. Genetic evaluation for these traits is frequently conducted using a single-trait threshold model, or they are considered continuous responses either in univariate or in multivariate context. Implementation of threshold models in the presence of several binary responses or a mixture of binary and continuous responses is far from simple. The complexity of such implementation is primarily due to the incomplete randomness of the residual (co)variance matrix. In the current study, a multiple binary trait simulation was carried out in order to implement and validate a new procedure for dealing with the consequences of the restrictions imposed on the residual variance using threshold models. Using three and eight binary responses, the proposed method was able to estimate all unknown parameters without any noticeable bias. In fact, for simulated residual correlations ranging from −0.8 to 0.8, the resulting HPD 95% intervals included the true values in all cases. The proposed procedure involved limited additional computational cost and is straightforward to implement independent of the number of binary responses involved in the analysis. Monitoring of the convergence of the procedure must be conducted at the identifiable scale, and special care must be placed on the selection of the prior of the non-identifiable model. The latter could have serious consequences on the final results due to potential truncation of the parameter space.