
Showing papers in "Methods in Ecology and Evolution in 2020"



Journal ArticleDOI
TL;DR: Hmsc 3.0 is introduced, a user‐friendly R implementation that makes JSDM fitting and post‐processing easily accessible to ecologists familiar with R, and demonstrates how to construct and fit models with different types of random effects and how to examine MCMC convergence.
Abstract: Joint Species Distribution Modelling (JSDM) is becoming an increasingly popular statistical method for analysing data in community ecology. Hierarchical Modelling of Species Communities (HMSC) is a general and flexible framework for fitting JSDMs. HMSC allows the integration of community ecology data with data on environmental covariates, species traits, phylogenetic relationships and the spatio-temporal context of the study, providing predictive insights into community assembly processes from non-manipulative observational data of species communities. The full range of functionality of HMSC has remained restricted to Matlab users only. To make HMSC accessible to the wider community of ecologists, we introduce Hmsc 3.0, a user-friendly R implementation. We illustrate the use of the package by applying Hmsc 3.0 to a range of case studies on real and simulated data. The real data consist of bird counts in a spatio-temporally structured dataset, environmental covariates, species traits and phylogenetic relationships. Vignettes on simulated data involve single-species models, models of small communities, models of large species communities and models for large spatial data. We demonstrate the estimation of species responses to environmental covariates and how these depend on species traits, as well as the estimation of residual species associations. We demonstrate how to construct and fit models with different types of random effects, how to examine MCMC convergence, how to examine the explanatory and predictive powers of the models, how to assess parameter estimates and how to make predictions. We further demonstrate how Hmsc 3.0 can be applied to normally distributed data, count data and presence–absence data. The package, along with the extended vignettes, makes JSDM fitting and post-processing easily accessible to ecologists familiar with R.
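The fitting-and-diagnostics workflow described above can be sketched in a few lines of R. The data objects (`Y`, `XData`, `TrData`, `phy`) and formulas are hypothetical placeholders; the function names follow Hmsc's documented interface, but treat the exact arguments as an illustrative assumption rather than a complete recipe:

```r
library(Hmsc)
library(coda)

# Hypothetical inputs: Y = sites x species count matrix, XData = data frame
# of environmental covariates, TrData = species traits, phy = phylogeny.
m <- Hmsc(Y = Y, XData = XData, XFormula = ~ habitat + climate,
          TrData = TrData, TrFormula = ~ migratory + log(mass),
          phyloTree = phy, distr = "lognormal poisson")

# Fit with MCMC, then inspect convergence of the coda-converted chains
m     <- sampleMcmc(m, thin = 10, samples = 1000, transient = 500, nChains = 2)
mpost <- convertToCodaObject(m)
gelman.diag(mpost$Beta)   # potential scale reduction factors for beta

# Explanatory power of the fitted model
preds <- computePredictedValues(m)
evaluateModelFit(hM = m, predY = preds)
```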

201 citations


Journal ArticleDOI
TL;DR: The metan R package is described, a collection of functions that implement a workflow‐based approach to check, manipulate and summarize typical MET data, and how they integrate into a workflow to explore and analyse MET data.
Abstract: Multi-environment trials (MET) are crucial steps in plant breeding programs that aim at increasing crop productivity to ensure global food security. The analysis of MET data requires the combination of several approaches including data manipulation, visualization and modeling. As new methods are proposed, analyzing MET data correctly and completely remains a challenge, often intractable with existing tools. Here we describe the metan R package, a collection of functions that implement a workflow-based approach to (a) check, manipulate and summarise typical MET data; (b) analyze individual environments using both fixed and mixed-effect models; (c) compute parametric and non-parametric stability statistics; (d) implement biometrical models widely used in MET analysis; and (e) plot typical MET data quickly. In this paper, we present a summary of the functions implemented in metan and how they integrate into a workflow to explore and analyze MET data. We guide the user along a gentle learning curve and show how, by adding only a few commands or options at a time, powerful analyses can be implemented. metan offers a flexible, intuitive and richly documented working environment with tools that will facilitate the implementation of a complete analysis of MET data sets.
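A condensed version of that workflow, as I understand metan's interface (the data frame `df` with columns `ENV`, `GEN`, `REP` and `Y` is hypothetical, and the exact argument names should be checked against the package documentation):

```r
library(metan)

# Hypothetical MET data: data frame `df` with factors ENV, GEN, REP
# and a numeric response Y.
inspect(df)                                          # check the data

# Mixed-effect (BLUP-based) analysis across environments
model <- waasb(df, env = ENV, gen = GEN, rep = REP, resp = Y)

# Parametric and non-parametric stability statistics
stab <- ge_stats(df, env = ENV, gen = GEN, rep = REP, resp = Y)
```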

171 citations



Journal ArticleDOI
TL;DR: In this article, the authors use an empirical subsampling approach involving 2225 camera deployments run at 41 study areas around the world to evaluate three aspects of camera trap study design (number of sites, duration and season of sampling) and their influence on the estimation of three ecological metrics (species richness, occupancy, detection rate) for mammals.
Abstract: 1. Camera traps deployed in grids or stratified random designs are a well-established survey tool for wildlife but there has been little evaluation of study design parameters. 2. We used an empirical subsampling approach involving 2225 camera deployments run at 41 study areas around the world to evaluate three aspects of camera trap study design (number of sites, duration and season of sampling) and their influence on the estimation of three ecological metrics (species richness, occupancy, detection rate) for mammals. 3. We found that 25-35 camera locations were needed for precise estimates of species richness, depending on scale of the study. The precision of species-level estimates of occupancy was highly sensitive to occupancy level: fewer sites were needed for common (>0.75) species, but more than 150 sites were likely needed for rare (<0.25) species. Species detection rates were more difficult to estimate precisely at the grid level due to spatial heterogeneity, presumably driven by unaccounted-for habitat variability within the study area. Running a camera at a site for 2 weeks was most efficient for detecting new species, but 3-4 weeks were needed for precise estimates of local detection rate, with no gains in precision observed after 1 month. Metrics for all mammal communities were sensitive to seasonality, with 37-50% of the species at the sites we examined fluctuating significantly in their occupancy or detection rates over the year. This effect was more pronounced in temperate sites, where seasonally sensitive species varied in relative abundance by an average factor of 4-5, and some species were completely absent in one season due to hibernation or migration. 4. We recommend the following guidelines to efficiently obtain precise estimates of species richness, occupancy and detection rates with camera trap arrays: run each camera for 3-5 weeks across 40-60 sites per array.
We recommend that comparisons of detection rates be model-based and include local covariates to help account for small-scale variation. Furthermore, comparisons across study areas or times must account for seasonality, which had strong impacts on mammal communities in both tropical and temperate sites.

We used camera trap data already available through repositories or collaborators. Most data came from the eMammal or TEAM repositories. We also used one dataset (China) from collaborators that was not already archived. All camera traps were set similarly, placed on a tree at 0.5 m facing parallel to the ground, with no bait. A variety of camera models were used, but all had infrared flashes and fast (<0.5 s) trigger times. Camera trap designs were either regular (grid) or stratified random. For this paper we wanted to assess the importance of three things to camera trap study design: the number of locations surveyed (spatial), the amount of time each survey ran (temporal) and whether season mattered (seasonal). We broke into three teams to analyze these data, and used three slightly different collections of data for each team. Thus, you will find three datasets labelled as to which analyses they were part of: spatial, temporal or seasonal. All data are presented as raw detection data, giving the date, time and species for each time a photograph was recorded. These are organized as 'deployments' representing a time period a camera was placed in a given location. We are including a TXT file with the Data Dictionary from eMammal that describes all the standard fields. A few files have additional fields we added that should be self-explanatory.

101 citations


Journal ArticleDOI
TL;DR: The Quantitative Colour Pattern Analysis (QCPA) is presented, which combines novel and existing pattern analysis frameworks into what the authors hope is a unified, free and open source toolbox and introduces a range of novel analytical and data‐visualization approaches.
Abstract: 1. To understand the function of colour signals in nature, we require robust quantitative analytical frameworks to enable us to estimate how animal and plant colour patterns appear against their natural background as viewed by ecologically relevant species. Due to the quantitative limitations of existing methods, colour and pattern are rarely analysed in conjunction with one another, despite a large body of literature and decades of research on the importance of spatio‐chromatic colour pattern analyses. Furthermore, key physiological limitations of animal visual systems such as spatial acuity, spectral sensitivities, photoreceptor abundances and receptor noise levels are rarely considered together in colour pattern analyses. 2. Here, we present a novel analytical framework, called the Quantitative Colour Pattern Analysis (QCPA). We have overcome many quantitative and qualitative limitations of existing colour pattern analyses by combining calibrated digital photography and visual modelling. We have integrated and updated existing spatio‐chromatic colour pattern analyses, including adjacency, visual contrast and boundary strength analysis, to be implemented using calibrated digital photography through the Multispectral Image Analysis and Calibration (MICA) Toolbox. 3. This combination of calibrated photography and spatio‐chromatic colour pattern analyses is enabled by the inclusion of psychophysical colour and luminance discrimination thresholds for image segmentation, which we call ‘Receptor Noise Limited Clustering’, used here for the first time. Furthermore, QCPA provides a novel psycho‐physiological approach to the modelling of spatial acuity using convolution in the spatial or frequency domains, followed by ‘Receptor Noise Limited Ranked Filtering’ to eliminate intermediate edge artefacts and recover sharp boundaries following smoothing. 
We also present a new type of colour pattern analysis, the ‘local edge intensity analysis’ as well as a range of novel psycho‐physiological approaches to the visualization of spatio‐chromatic data. 4. QCPA combines novel and existing pattern analysis frameworks into what we hope is a unified, free and open source toolbox and introduces a range of novel analytical and data‐visualization approaches. These analyses and tools have been seamlessly integrated into the MICA toolbox providing a dynamic and user‐friendly workflow.

96 citations


Journal ArticleDOI
TL;DR: In this article, the authors describe procedures for automating the collection of training data, generating training datasets, and training CNNs to allow identification of individual birds, including sociable weaver Philetairus socius, the great tit Parus major and the zebra finch Taeniopygia guttata.
Abstract: 1. Individual identification is a crucial step to answer many questions in evolutionary biology and is mostly performed by marking animals with tags. Such methods are well-established, but often make data collection and analyses time-consuming, or limit the contexts in which data can be collected. 2. Recent computational advances, specifically deep learning, can help overcome the limitations of collecting large-scale data across contexts. However, one of the bottlenecks preventing the application of deep learning for individual identification is the need to collect and identify hundreds to thousands of individually labelled pictures to train convolutional neural networks (CNNs). 3. Here we describe procedures for automating the collection of training data, generating training datasets, and training CNNs to allow identification of individual birds. We apply our procedures to three small bird species, the sociable weaver Philetairus socius, the great tit Parus major and the zebra finch Taeniopygia guttata, representing both wild and captive contexts. 4. We first show how the collection of individually labelled images can be automated, allowing the construction of training datasets consisting of hundreds of images per individual. Second, we describe how to train a CNN to uniquely re-identify each individual in new images. Third, we illustrate the general applicability of CNNs for studies in animal biology by showing that trained CNNs can re-identify individual birds in images collected in contexts that differ from the ones originally used to train the CNNs. Finally, we present a potential solution to solve the issues of new incoming individuals. 5. Overall, our work demonstrates the feasibility of applying state-of-the-art deep learning tools for individual identification of birds, both in the laboratory and in the wild. These techniques are made possible by our approaches that allow

92 citations


Journal ArticleDOI
TL;DR: The record for this article consists only of funding acknowledgements, including the Australian Government's National Environmental Science Program and the Gorgon Barrow Island Net Conservation Benefits Fund.
Abstract: Australian Government's National Environmental Science Program; Australian Research Data Commons; Gorgon Barrow Island Net Conservation Benefits Fund

78 citations


Journal ArticleDOI
TL;DR: In this paper, an approach for the R programming environment which integrates existing R packages for obtaining terrain and sub-daily atmospheric forcing data (elevatr and RNCEP), and two complementary microclimate modelling packages (NicheMapR and microclima) is presented.
Abstract: 1. Microclimates are the thermal and hydric environments organisms actually experience and estimates of them are increasingly needed in environmental research. The availability of global weather and terrain data sets, together with increasingly sophisticated microclimate modelling tools, makes the prospect of a global, web-based microclimate estimation procedure feasible. 2. We have developed such an approach for the R programming environment which integrates existing R packages for obtaining terrain and sub-daily atmospheric forcing data (elevatr and RNCEP), and two complementary microclimate modelling packages (NicheMapR and microclima). The procedure can be used to generate NicheMapR’s hourly time series outputs of above and below ground conditions, including convective and radiative environments, soil temperature, soil moisture and snow cover, for a single point, using microclima to account for local topographic and vegetation effects. Alternatively, it can use microclima to produce high-resolution grids of near-surface temperatures, using NicheMapR to derive calibration coefficients normally obtained from experimental data. 3. We validate this integrated approach against a series of microclimate observations used previously in the tests of the respective models and show equivalent performance. 4. It is thus now feasible to produce realistic estimates of microclimate at fine (<30 m) spatial and temporal scales anywhere on earth, from 1957 to present.

76 citations



Journal ArticleDOI
TL;DR: The authors compared conventional generalized linear models (GLMs) with more flexible machine learning (ML) models (Random Forest, Boosted Regression Trees, Deep Neural Networks, Convolutional Neural Networks, Support Vector Machines, naïve Bayes and k-Nearest Neighbour) for predicting species interactions in plant–hummingbird networks.
Abstract: Ecologists have long suspected that species are more likely to interact if their traits match in a particular way. For example, a pollination interaction may be more likely if the proportions of a bee's tongue fit a plant's flower shape. Empirical estimates of the importance of trait‐matching for determining species interactions, however, vary significantly among different types of ecological networks. Here, we show that ambiguity among empirical trait‐matching studies may have arisen at least in parts from using overly simple statistical models. Using simulated and real data, we contrast conventional generalized linear models (GLM) with more flexible Machine Learning (ML) models (Random Forest, Boosted Regression Trees, Deep Neural Networks, Convolutional Neural Networks, Support Vector Machines, naive Bayes, and k‐Nearest‐Neighbor), testing their ability to predict species interactions based on traits, and infer trait combinations causally responsible for species interactions. We found that the best ML models can successfully predict species interactions in plant–pollinator networks, outperforming GLMs by a substantial margin. Our results also demonstrate that ML models can better identify the causally responsible trait‐matching combinations than GLMs. In two case studies, the best ML models successfully predicted species interactions in a global plant–pollinator database and inferred ecologically plausible trait‐matching rules for a plant–hummingbird network from Costa Rica, without any prior assumptions about the system. We conclude that flexible ML models offer many advantages over traditional regression models for understanding interaction networks. We anticipate that these results extrapolate to other ecological network types. More generally, our results highlight the potential of machine learning and artificial intelligence for inference in ecology, beyond standard tasks such as image or pattern recognition.
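The paper's core contrast can be illustrated with a self-contained base-R simulation (my own toy example, not the authors' analysis): when an interaction depends on trait matching, a GLM with only additive main effects cannot represent the rule, while a model given the matching term recovers it easily.

```r
set.seed(1)
# Simulate 2000 potential pollinator-plant pairs: an interaction occurs
# when the pollinator trait (e.g. tongue length) matches the plant trait.
n  <- 2000
t1 <- runif(n); t2 <- runif(n)
p  <- plogis(3 - 12 * abs(t1 - t2))   # probability peaks where t1 == t2
y  <- rbinom(n, 1, p)

# A GLM with additive main effects cannot represent the matching rule...
m_add   <- glm(y ~ t1 + t2, family = binomial)
# ...whereas a model including the (here known) matching term recovers it.
m_match <- glm(y ~ abs(t1 - t2), family = binomial)

AIC(m_add) > AIC(m_match)   # the additive model fits markedly worse
```

Flexible ML learners make the same point without being told the matching term: they can discover |t1 − t2|-style rules from raw traits, which is exactly where additive GLMs fail.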

Journal ArticleDOI
Aud Helen Halbritter Rechsteiner1, Hans J. De Boeck2, Amy E. Eycott3, Amy E. Eycott4, Sabine Reinsch, David A. Robinson, Sara Vicca2, Bernd Josef Berauer5, Casper T. Christiansen1, Marc Estiarte6, José M. Grünzweig7, Ragnhild Gya1, Karin Hansen8, Anke Jentsch5, Hanna Lee1, Sune Linder9, John D. Marshall9, Josep Peñuelas6, Inger Kappel Schmidt10, Ellen Stuart-Haëntjens11, Peter A. Wilfahrt5, Vigdis Vandvik1, Nelson Abrantes3, María Almagro3, Inge H. J. Althuizen, Isabel C. Barrio12, Mariska te Beest, Claus Beier, Iilka Beil, Z. Carter Berry, Tone Birkemoe, Jarle W. Bjerke, Benjamin Blonder, Gesche Blume-Werry, Gil Bohrer, Isabel Campos, Lucas S. Cernusak, Bogdan H. Chojnicki, Bernhard J. Cosby, Lee T. Dickman, Ika Djukic, Iolanda Filella, Lucia Fuchslueger, Albert Gargallo-Garriga, Mark A. K. Gillespie, Gregory R. Goldsmith, Christopher M. Gough, Fletcher W. Halliday13, Stein Joar Hegland, Günter Hoch, Petr Holub, Francesca Jaroszynska, Daniel M. Johnson, Scott B. Jones, Paul Kardol, Jan Jacob Keizer, Karel Klem, Heidi Sjursen Konestabo, Jürgen Kreyling, György Kröel-Dulay, Simon M. Landhäusser, Klaus Steenberg Larsen, Niki I. W. Leblans, Inma Lebron, Marco M. Lehmann, Jonas J. Lembrechts, Armando Lenz, Anja Linstädter, Joan Llusià, Marc Macias-Fauria14, Andrey V. Malyshev, Pille Mänd12, Miles R. Marshall, Ashley M. Matheny, Nate G. McDowell, Ina C. Meier, Frederick C. Meinzer, Sean T. Michaletz, Megan L. Miller, Lena Muffler, Michal Oravec, Ivika Ostonen, Albert Porcar-Castell, Catherine Preece, Iain Colin Prentice, Dajana Radujković, Virve Ravolainen, Relena R. Ribbons, Jan C. Ruppert, Lawren Sack, Jordi Sardans, Andreas Schindlbacher, Christine Scoffoni, Bjarni D. Sigurdsson, Simon M. Smart, Stuart W. Smith, Fiona M. Soper, James D. M. Speed, Anne Sverdrup-Thygeson, Markus A. K. Sydenham, Arezoo Taghizadeh-Toosi, Richard J. 
Telford, Katja Tielbörger, Joachim Töpper, Otmar Urban, Martine van der Ploeg, Leandro Van Langenhove, Kristýna Večeřová, Arne Ven, Erik Verbruggen, Unni Vik, Robert Weigel, Thomas Wohlgemuth, Lauren K. Wood, Julie C. Zinnert, Kamal Zurba 
TL;DR: A minimum subset of variables that should be collected in all climate change studies to allow data re-use and synthesis, and guidance on additional variables critical for different types of synthesis and upscaling are recommended.
Abstract: Climate change is a world-wide threat to biodiversity and ecosystem structure, functioning and services. To understand the underlying drivers and mechanisms, and to predict the consequences for nature and people, we urgently need better understanding of the direction and magnitude of climate change impacts across the soil-plant-atmosphere continuum. An increasing number of climate change studies are creating new opportunities for meaningful and high-quality generalizations and improved process understanding. However, significant challenges exist related to data availability and/or compatibility across studies, compromising opportunities for data re-use, synthesis and upscaling. Many of these challenges relate to a lack of an established 'best practice' for measuring key impacts and responses. This restrains our current understanding of complex processes and mechanisms in terrestrial ecosystems related to climate change. To overcome these challenges, we collected best-practice methods emerging from major ecological research networks and experiments, as synthesized by 115 experts from across a wide range of scientific disciplines. Our handbook contains guidance on the selection of response variables for different purposes, protocols for standardized measurements of 66 such response variables and advice on data management. Specifically, we recommend a minimum subset of variables that should be collected in all climate change studies to allow data re-use and synthesis, and give guidance on additional variables critical for different types of synthesis and upscaling. The goal of this community effort is to facilitate awareness of the importance and broader application of standardized methods to promote data re-use, availability, compatibility and transparency. 
We envision improved research practices that will increase returns on investments in individual research projects, facilitate second-order research outputs and create opportunities for collaboration across scientific communities. Ultimately, this should significantly improve the quality and impact of the science, which is required to fulfil society's needs in a changing world.

Journal ArticleDOI
TL;DR: In this article, the authors develop a set of functions to calculate FD indices based on kernel density n-dimensional hypervolumes, including alpha (richness), beta (and respective components), dispersion, evenness, contribution and originality.
Abstract: The use of kernel density n-dimensional hypervolumes [Global Ecol. Biogeogr. 23(5):595-609] in trait-based ecology is rapidly increasing. By representing the functional space of a species or community as a Hutchinsonian niche space, this relatively new approach is showing great potential for the advance of functional ecology theory. Functions for calculating the standard set of functional diversity (FD) indices (richness, divergence and regularity) have not yet been developed in the context of kernel density hypervolumes. This gap is delaying a full exploitation of the kernel density n-dimensional hypervolumes framework in functional ecology and prevents comparison of its performance with that of other FD methods. We develop a set of functions to calculate FD indices based on kernel density n-dimensional hypervolumes, including alpha (richness), beta (and respective components), dispersion, evenness, contribution and originality. Altogether, these indices provide a coherent framework to explore the primary mathematical components of FD within a kernel density setting. These new functions can work with Hypervolume objects, HypervolumeList objects or raw data (species presence or abundance and their traits) as input, and are versatile in terms of computation speed, which is achieved by controlling the number of stochastic points to be used in the different estimations. These newly developed functions are implemented within the R package BAT, an open platform for biodiversity assessments. As a coherent corpus of functional indices based on a common algorithm, it opens up the possibility to fully explore the strengths of the Hutchinsonian niche concept in community ecology research.
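In BAT, the new functions share a `kernel.*` naming pattern; a minimal sketch, assuming a hypothetical sites × species abundance matrix `comm` and a species × traits matrix `traits` (the raw-data input mode the abstract describes — check the exact argument names against the package help pages):

```r
library(BAT)

# Hypothetical inputs: comm = sites x species abundances,
# traits = species x traits matrix.
ka <- kernel.alpha(comm = comm, trait = traits)        # functional richness
kb <- kernel.beta(comm = comm, trait = traits)         # beta + components
kd <- kernel.dispersion(comm = comm, trait = traits)   # functional dispersion
ke <- kernel.evenness(comm = comm, trait = traits)     # functional evenness
```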

Journal ArticleDOI
TL;DR: The r package phyr implements a suite of metrics, comparative methods and mixed models that use phylogenies to understand and predict community composition and other ecological and evolutionary phenomena, and provides an easy‐to‐use collection of tools that will ignite the use of phylogenies.
Abstract: Model-based approaches are increasingly popular in ecological studies. A good example of this trend is the use of joint species distribution models to ask questions about ecological communities. However, most current applications of model-based methods do not include phylogenies despite the well-known importance of phylogenetic relationships in shaping species distributions and community composition. In part, this is due to a lack of accessible tools allowing ecologists to fit phylogenetic species distribution models easily. To fill this gap, the R package phyr (pronounced fire) implements a suite of metrics, comparative methods and mixed models that use phylogenies to understand and predict community composition and other ecological and evolutionary phenomena. The phyr workhorse functions are implemented in C++, making all calculations and model estimations fast. phyr can fit a variety of models such as phylogenetic joint species distribution models, spatiotemporal-phylogenetic autocorrelation models and phylogenetic trait-based bipartite network models. phyr also estimates phylogenetically independent trait correlations with measurement error to test for adaptive syndromes and performs fast calculations of common alpha and beta phylogenetic diversity metrics. All phyr methods are united under Brownian motion or Ornstein-Uhlenbeck models of evolution, and phylogenetic terms are modelled as phylogenetic covariance matrices. The functions and model formula syntax we propose in phyr serve as a simple and unified framework that ignites the use of phylogenies to address a variety of ecological questions.
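A sketch of phyr's formula syntax for a phylogenetic JSDM-style model (the long-format data frame `dat` and the tree `phy` are hypothetical; the `sp__` convention for requesting a phylogenetically structured random effect follows the package documentation):

```r
library(phyr)

# Hypothetical long-format data: one row per species x site combination,
# with presence/absence `pres`, a covariate `env`, and factors sp, site.
# `sp__` adds both an i.i.d. and a phylogenetically structured species
# random effect, using the covariance implied by the tree `phy`.
mod <- pglmm(pres ~ env + (1 | sp__) + (1 | site) + (env | sp__),
             data = dat, cov_ranef = list(sp = phy), family = "binomial")
summary(mod)
```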

Journal ArticleDOI
TL;DR: LeWoS as discussed by the authors is a fully automatic tool to automate the separation of leaf and wood components, based only on geometric information at both the plot and individual tree scales, using recursive point cloud segmentation and regularization procedures.
Abstract: Leaf‐wood separation in terrestrial LiDAR data is a prerequisite for non‐destructively estimating biophysical forest properties such as standing wood volumes and leaf area distributions. Current methods have not been extensively applied and tested on tropical trees. Moreover, their impacts on the accuracy of subsequent wood volume retrieval have rarely been explored. We present LeWoS, a new fully automatic tool for separating leaf and wood components based only on geometric information, at both the plot and individual tree scales. This data‐driven method utilizes recursive point cloud segmentation and regularization procedures. Only one parameter is required, which makes our method easily and universally applicable to data from any LiDAR technology and forest type. We conducted a twofold evaluation of the LeWoS method on an extensive dataset of 61 tropical trees. We first assessed the point‐wise classification accuracy, yielding an average score of 0.91 ± 0.03. Second, we evaluated the impact of the proposed method on 3D tree models by cross‐comparing estimates of wood volume and branch length with those based on manually separated wood points. This comparison showed similar results, with relative biases of less than 9% and 21% in volume and length, respectively. LeWoS enables an automated processing chain for non‐destructive tree volume and biomass estimation when coupled with 3D modelling methods. The average processing time on a laptop was 90 s for 1 million points. We provide LeWoS as an open‐source tool with an end‐user interface, together with a large dataset of labelled 3D point clouds from contrasting forest structures. This study closes the gap for stand volume modelling in tropical forests, where leaf and wood separation remains a crucial challenge.

Journal ArticleDOI
TL;DR: ‘Marxan Connect’ is a new open source, open access Graphical User Interface (GUI) tool designed to assist conservation planners with the appropriate use of data on ecological connectivity in protected area network planning.
Abstract: 1. Globally, protected areas are being established to protect biodiversity and to promote ecosystem resilience. The typical spatial conservation planning process leading to the creation of these protected areas focuses on representation and replication of ecological features, often using decision support tools such as Marxan. Yet, despite the important role ecological connectivity has in metapopulation persistence and resilience, Marxan currently requires manual input or specialized scripts to explicitly consider connectivity. 2. ‘Marxan Connect’ is a new open source, open access Graphical User Interface (GUI) tool designed to assist conservation planners with the appropriate use of data on ecological connectivity in protected area network planning. 3. Marxan Connect can facilitate the use of estimates of demographic connectivity (e.g. derived from animal tracking data, dispersal models, or genetic tools) or structural landscape connectivity (e.g. isolation by resistance). This is accomplished by calculating metapopulation‐relevant connectivity metrics (e.g. eigenvector centrality) and treating those as conservation features or by including the connectivity data as a spatial dependency amongst sites in the prioritization process. 4. Marxan Connect allows a wide group of users to incorporate directional ecological connectivity into conservation planning with Marxan. The solutions provided by Marxan Connect, combined with ecologically relevant post‐hoc testing, are more likely to support persistent and resilient metapopulations (e.g. fish stocks) and provide better protection for biodiversity.

Journal ArticleDOI
TL;DR: An R package named adiv that provides additional methods to measure and analyse biodiversity, and aims to complement existing R packages to provide scientists with a wide variety of diversity indices, as each index reflects a very specific facet of biodiversity.
Abstract: R is an open-source programming environment for statistical computing and graphics structured by numerous contributed packages. The current packages used for biodiversity research focus on limited, particular aspects of biodiversity. Most packages focus on the number and abundance of species. I present an R package named adiv that provides additional methods to measure and analyse biodiversity. adiv contains approaches to quantify species-based, trait-based (functional) and phylogenetic diversity (a) within communities (α diversity), (b) between communities (β diversity) and (c) to partition it over space and time (α, β and γ levels of diversity). Partitioning approaches allow evaluating whether the levels of α and β diversity could have been obtained by chance. Moreover, groups of biological entities (e.g. species of the same clade or with similar biological characteristics) that drive each level of diversity (α, β and γ) can be identified via ordination analyses. Although the package focuses on interspecific diversity in its current state, the developed approaches can also be applied to analyse intraspecific diversity or, at another level, ecosystem diversity. More generally, the functions can be applied in any discipline interested in the concept of diversity, such as economics or linguistics. Indeed, all available approaches can be easily applied at other scales and to other disciplines provided that the data have the required format: a matrix of abundance or presence/absence data of some entities in some collections and information on the differences between the entities. adiv aims to complement existing R packages to provide scientists with a wide variety of diversity indices, as each index reflects a very specific facet of biodiversity. adiv will grow in the future to integrate as many validated approaches for biodiversity analysis as possible that are not yet available in R.
As it includes both traditional and recent viewpoints on how biodiversity should be evaluated, adiv offers a promising platform where methods to analyse biodiversity can be developed and compared in terms of their statistical behaviour and biological relevance. Applying the most relevant tools for a given study aim will ultimately improve research on human-driven variations in biodiversity.
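The α/β/γ partitioning described above can be illustrated with a minimal, self-contained sketch. The code below is plain Python rather than adiv's r API, and the function names (`hill_shannon`, `partition_diversity`) are invented for illustration; it computes Hill-number diversity (q = 1, the exponential of Shannon entropy) and a multiplicative partition γ = α × β, assuming equally weighted communities.

```python
import math

def hill_shannon(abund):
    """Effective number of species (Hill number, q = 1) from raw abundances."""
    total = sum(abund)
    ps = [a / total for a in abund if a > 0]
    return math.exp(-sum(p * math.log(p) for p in ps))

def partition_diversity(communities):
    """Multiplicative partition gamma = alpha * beta for equally weighted
    communities; beta is the effective number of distinct communities."""
    # alpha: exponential of the mean within-community Shannon entropy
    entropies = []
    for c in communities:
        total = sum(c)
        ps = [a / total for a in c if a > 0]
        entropies.append(-sum(p * math.log(p) for p in ps))
    alpha = math.exp(sum(entropies) / len(entropies))
    # gamma: diversity of the pooled metacommunity
    pooled = [sum(col) for col in zip(*communities)]
    gamma = hill_shannon(pooled)
    return alpha, gamma / alpha, gamma
```

Two completely distinct two-species communities give β = 2 (two effectively distinct communities), while identical communities give β = 1, matching the usual interpretation of multiplicative β diversity.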

Journal ArticleDOI
TL;DR: To show how embarcadero can be used by ecologists, this work illustrates a BART workflow for a virtual species distribution model and includes a more advanced vignette showing how BART can be used for mapping disease transmission risk, using the example of Crimean–Congo haemorrhagic fever in Africa.
Abstract: Classification and regression tree methods, like random forests (RF) or boosted regression trees (BRT), are one of the most popular methods of mapping species distributions. Bayesian additive regression trees (BARTs) are a relatively new alternative to other popular regression tree approaches. Whereas BRT iteratively fits an ensemble of trees each explaining smaller fractions of the total variance, BART starts by fitting a sum-of-trees model and then uses Bayesian backfitting with an MCMC algorithm to create a posterior draw. So far, BARTs have yet to be applied to species distribution modeling. embarcadero is an R package of convenience tools for researchers interested in species distribution modeling with BARTs. It includes functionality for spatial prediction, an automated variable selection and importance procedure, and other functionality for rapid implementation and data visualization. To show how embarcadero can be used by ecologists, we re-map the distribution of Crimean-Congo haemorrhagic fever and a likely vector, Hyalomma truncatum, in Africa.

Journal ArticleDOI
TL;DR: phyloregion will facilitate rapid biogeographical analyses, accommodating the ongoing mass production of species occurrence records and phylogenetic datasets at any scale and for any taxonomic group within completely reproducible R workflows.

Journal ArticleDOI
TL;DR: This work describes a robot-enabled image-based identification machine that can automate the process of invertebrate identification, biomass estimation and sample sorting, and tests the classification accuracy, i.e. how well the species identity of a specimen can be predicted from images taken by the machine.
Abstract: Understanding how biological communities respond to environmental changes is a key challenge in ecology and ecosystem management. The apparent decline of insect populations necessitates more biomonitoring, but the time-consuming sorting and identification of taxa pose strong limitations on how many insect samples can be processed. In turn, this affects the scale of efforts to map invertebrate diversity altogether. Given recent advances in computer vision, we propose to replace the standard manual approach of human expert-based sorting and identification with an automatic image-based technology. We describe a robot-enabled image-based identification machine, which can automate the process of invertebrate identification, biomass estimation and sample sorting. We use the imaging device to generate a comprehensive image database of terrestrial arthropod species. We use this database to test the classification accuracy, i.e. how well the species identity of a specimen can be predicted from images taken by the machine. We also test the sensitivity of the classification accuracy to the camera settings (aperture and exposure time) in order to move forward with the best possible image quality. We use state-of-the-art ResNet-50 and InceptionV3 CNNs for the classification task. The results for the initial dataset are very promising ($\overline{ACC}=0.980$). The system is general and can easily be used for other groups of invertebrates as well. As such, our results pave the way for generating more data on spatial and temporal variation in invertebrate abundance, diversity and biomass.
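The reported $\overline{ACC}$ is a mean classification accuracy over the image database. As a hedged illustration (plain Python, not the authors' pipeline; the function names are ours), overall accuracy and a macro-averaged per-class accuracy can be computed from paired labels and predictions; the latter is the more informative metric when species are unevenly represented:

```python
from collections import defaultdict

def overall_accuracy(y_true, y_pred):
    """Fraction of specimens whose predicted species matches the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_accuracy(y_true, y_pred):
    """Mean per-class accuracy: each species contributes equally,
    regardless of how many specimens of it were imaged."""
    correct = defaultdict(int)
    count = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        count[t] += 1
        correct[t] += (t == p)
    return sum(correct[c] / count[c] for c in count) / len(count)
```

With a class-imbalanced sample, the two metrics diverge: a classifier that always predicts the majority species scores well on overall accuracy but poorly on the macro average.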

Journal ArticleDOI
TL;DR: MGDrivE (Mosquito Gene Drive Explorer): a simulation framework designed to investigate the population dynamics of a variety of gene drive architectures and their spread through spatially explicit mosquito populations.
Abstract: Author(s): Sanchez C., HM; Wu, SL; Bennett, JB; Marshall, JM. Malaria, dengue, Zika and other mosquito-borne diseases continue to pose a major global health burden through much of the world, despite the widespread distribution of insecticide-based tools and antimalarial drugs. The advent of CRISPR/Cas9-based gene editing and its demonstrated ability to streamline the development of gene drive systems has reignited interest in the application of this technology to the control of mosquitoes and the diseases they transmit. The versatility of this technology has enabled a wide range of gene drive architectures to be realized, creating a need for their population-level and spatial dynamics to be explored. We present MGDrivE (Mosquito Gene Drive Explorer): a simulation framework designed to investigate the population dynamics of a variety of gene drive architectures and their spread through spatially explicit mosquito populations. A key strength of the MGDrivE framework is its modularity: (a) a genetic inheritance module accommodates the dynamics of gene drive systems displaying user-defined inheritance patterns, (b) a population dynamic module accommodates the life history of a variety of mosquito disease vectors and insect agricultural pests, and (c) a landscape module generates the metapopulation model by which insect populations are connected via migration over space. Example MGDrivE simulations are presented to demonstrate the application of the framework to CRISPR/Cas9-based homing gene drive for: (a) driving a disease-refractory gene into a population (i.e. population replacement), and (b) disrupting a gene required for female fertility (i.e. population suppression), incorporating homing-resistant alleles in both cases. Further documentation and use examples are provided at the project's GitHub repository. MGDrivE is an open-source r package freely available on CRAN.
We intend the package to provide a flexible tool capable of modelling novel inheritance-modifying constructs as they are proposed and become available. The field of gene drive is moving very quickly, and we welcome suggestions for future development.

Journal ArticleDOI
TL;DR: In this article, the authors discuss issues associated with the capture, handling, housing and experimental approaches for species occupying varied habitats, in both vertebrates and invertebrates (principally insects, crustaceans and molluscs).
Abstract: 1. Wild animals are used in scientific research in a wide variety of contexts both in situ and ex situ. Guidelines for best practice, where they exist, are not always clearly linked to animal welfare and may instead have their origins in practicality. This is complicated by a lack of clarity about indicators of welfare for wild animals, and about the extent to which a researcher should intervene in cases of compromised welfare. 2. This Primer highlights and discusses the broad topic of wild animal welfare and the ethics of using wild animals in scientific research, both in the wild and in controlled conditions. Throughout, we discuss issues associated with the capture, handling, housing and experimental approaches for species occupying varied habitats, in both vertebrates and invertebrates (principally insects, crustaceans and molluscs). 3. We highlight where data on the impacts of wild animal research are lacking and provide guidance to help direct, prepare for and mitigate potential welfare issues, including the consideration of end-points and the ethical framework around euthanasia. 4. We conclude with a series of recommendations for researchers to implement from the design stage of any study that uses animals, right through to publication, and discuss the role of journals in promoting better reporting of wild animal studies, ultimately to the benefit of wild animal welfare.

Journal ArticleDOI
TL;DR: LeafByte as discussed by the authors is a free iOS application for measuring leaf area and herbivory, which can save data automatically, read and record barcodes, handle both light- and dark-colored plant tissue, and be used non-destructively.
Abstract: In both basic and applied studies, quantification of herbivory on foliage is a key metric in characterizing plant-herbivore interactions, which underpin many ecological, evolutionary, and agricultural processes. Current methods of quantifying herbivory are slow or inaccurate. We present LeafByte, a free iOS application for measuring leaf area and herbivory. LeafByte can save data automatically, read and record barcodes, handle both light- and dark-colored plant tissue, and be used non-destructively. We evaluate its accuracy and efficiency relative to existing herbivory assessment tools. LeafByte has the same accuracy as ImageJ, the field standard, but is 50% faster. Other tools, such as BioLeaf and grid quantification, are quick and accurate, but limited in the information they can provide. Visual estimation is quickest, but it only provides a coarse measure of leaf damage and tends to overestimate herbivory. LeafByte is a quick and accurate means of measuring leaf area and herbivory, making it a useful tool for research in fields such as ecology, entomology, agronomy, and plant science.
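Herbivory quantification of the kind LeafByte, ImageJ and BioLeaf perform reduces to a pixel-counting problem once the leaf has been segmented. A minimal sketch of that final step (plain Python, not LeafByte's implementation; it assumes an upstream segmentation has already labelled remaining tissue versus holes, e.g. by flood-filling the background and marking enclosed empty regions):

```python
def herbivory_fraction(leaf_mask):
    """Fraction of original leaf area consumed, from a 2-D mask where
    1 = remaining leaf tissue, 2 = hole inside the leaf, 0 = background.
    The original leaf area is reconstructed as tissue + holes."""
    leaf = sum(row.count(1) for row in leaf_mask)
    holes = sum(row.count(2) for row in leaf_mask)
    return holes / (leaf + holes)
```

For example, a mask with three tissue pixels and one enclosed hole pixel yields 25% herbivory; real tools additionally convert pixel counts to physical area using a scale reference in the image.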

Journal ArticleDOI
TL;DR: DeepForest as mentioned in this paper detects individual trees in high-resolution RGB imagery using deep learning, with a model pre-trained on over 30 million algorithmically generated crowns from 22 forests and fine-tuned on 10,000 hand-labeled crowns from 6 forests.
Abstract: Remote sensing of forested landscapes can transform the speed, scale, and cost of forest research. The delineation of individual trees in remote sensing images is an essential task in forest analysis. Here we introduce a new Python package, DeepForest, that detects individual trees in high resolution RGB imagery using deep learning. While deep learning has proven highly effective in a range of computer vision tasks, it requires large amounts of training data that are typically difficult to obtain in ecological studies. DeepForest overcomes this limitation by including a model pre-trained on over 30 million algorithmically generated crowns from 22 forests and fine-tuned using 10,000 hand-labeled crowns from 6 forests. The package supports the application of this general model to new data, fine tuning the model to new datasets with user labeled crowns, training new models, and evaluating model predictions. This simplifies the process of using and retraining deep learning models for a range of forests, sensors, and spatial resolutions. We illustrate the workflow of DeepForest using data from the National Ecological Observatory Network, a tropical forest in French Guiana, and street trees from Portland, Oregon.
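Evaluating model predictions, as the abstract mentions, typically rests on intersection-over-union (IoU) matching between predicted and hand-labeled crown boxes. A self-contained sketch of that idea (plain Python, not DeepForest's API; the function names are ours):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def crown_recall(truth, preds, threshold=0.5):
    """Fraction of ground-truth crowns matched by at least one prediction
    with IoU at or above the threshold (0.5 is a common convention)."""
    matched = sum(any(iou(t, p) >= threshold for p in preds) for t in truth)
    return matched / len(truth)
```

A matching precision function follows symmetrically (fraction of predictions matching some ground-truth crown); together they summarise detection quality across forests and sensors.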

Journal ArticleDOI
TL;DR: This work built multi‐input neural network models that synthesize metadata and images to identify records to species level, and shows that machine learning models can effectively harness contextual information to improve the interpretation of images.
Abstract: The accurate identification of species in images submitted by citizen scientists is currently a bottleneck for many data uses. Machine learning tools offer the potential to provide rapid, objective and scalable species identification for the benefit of many aspects of ecological science. Currently, most approaches only make use of image pixel data for classification. However, an experienced naturalist would also use a wide variety of contextual information such as the location and date of recording. Here, we examine the automated identification of ladybird (Coccinellidae) records from the British Isles submitted to the UK Ladybird Survey, a volunteer-led mass participation recording scheme. Each image is associated with metadata: a date, location and recorder ID, which can be cross-referenced with other data sources to determine local weather at the time of recording, habitat types and the experience of the observer. We built multi-input neural network models that synthesise metadata and images to identify records to species level. We show that machine learning models can effectively harness contextual information to improve the interpretation of images. Against an image-only baseline of 48.2%, we observe a 9.1 percentage-point improvement in top-1 accuracy with a multi-input model, compared to only a 3.6% increase when using an ensemble of image and metadata models. This suggests that contextual data are being used to interpret an image, beyond just providing a prior expectation. We show that our neural network models appear to be utilising similar pieces of evidence as human naturalists to make identifications. Metadata is a key tool for human naturalists. We show it can also be harnessed by computer vision systems. Contextualisation offers considerable extra information, particularly for challenging species, even within small and relatively homogeneous areas such as the British Isles.
Although complex relationships between disparate sources of information can be profitably interpreted by simple neural network architectures, there is likely considerable room for further progress. Contextualising images has the potential to lead to a step change in the accuracy of automated identification tools, with considerable benefits for large-scale verification of submitted records.
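The "prior expectation" baseline that the multi-input model is contrasted against can be made concrete. The sketch below is ours, not the paper's architecture (which fuses metadata inside the network): it simply reweights image-classifier probabilities by a contextual prior, e.g. species frequencies for a given location and month, and renormalises.

```python
def combine_with_prior(image_probs, context_prior):
    """Bayes-style fusion: multiply image-classifier class probabilities
    by a contextual prior over species, then renormalise. Species absent
    from the prior are treated as locally impossible (prior 0)."""
    fused = {s: image_probs[s] * context_prior.get(s, 0.0) for s in image_probs}
    z = sum(fused.values())
    # Fall back to the image-only distribution if the prior rules out
    # every candidate (e.g. a record from an unsampled region).
    return {s: v / z for s, v in fused.items()} if z else dict(image_probs)
```

A strong local prior can flip the top-1 prediction of an ambiguous image; the paper's result that a learned multi-input model beats such prior-style combinations suggests the network uses metadata more richly than this.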


Journal ArticleDOI
TL;DR: A flexible model simulation tool is created to explore and quantify the bias in model parameter estimates that arises when an inaccurate transmission function is used; most experimental and observational studies reported that nonlinear transmission–density functions outperformed simple linear transmission–density functions.