scispace - formally typeset

Showing papers in "PeerJ in 2014"


Journal ArticleDOI
19 Jun 2014-PeerJ
TL;DR: The advantages of open source to achieve the goals of the scikit-image library are highlighted, and several real-world image processing applications that use scikit-image are showcased.
Abstract: scikit-image is an image processing library that implements algorithms and utilities for use in research, education and industry applications. It is released under the liberal Modified BSD open source license, provides a well-documented API in the Python programming language, and is developed by an active, international team of collaborators. In this paper we highlight the advantages of open source to achieve the goals of the scikit-image library, and we showcase several real-world image processing applications that use scikit-image. More information can be found on the project homepage, http://scikit-image.org.

3,903 citations


Journal ArticleDOI
04 Mar 2014-PeerJ
TL;DR: The R package poppr is developed providing unique tools for analysis of data from admixed, clonal, mixed, and/or sexual populations, and functions for genotypic diversity and clone censoring are specific for clonal populations.
Abstract: Many microbial, fungal, or oomycete populations violate assumptions for population genetic analysis because these populations are clonal, admixed, partially clonal, and/or sexual. Furthermore, few tools exist that are specifically designed for analyzing data from clonal populations, making analysis difficult and haphazard. We developed the R package poppr, which provides unique tools for analysis of data from admixed, clonal, mixed, and/or sexual populations. Currently, poppr can be used for dominant/codominant and haploid/diploid genetic data. Data can be imported from several formats including GenAlEx formatted text files and can be analyzed on a user-defined hierarchy that includes unlimited levels of subpopulation structure and clone censoring. New functions include calculation of Bruvo’s distance for microsatellites, batch-analysis of the index of association with several indices of genotypic diversity, and graphing including dendrograms with bootstrap support and minimum spanning networks. While functions for genotypic diversity and clone censoring are specific for clonal populations, several functions found in poppr are also valuable to analysis of any populations. A manual with documentation and examples is provided. Poppr is open source and major releases are available on CRAN: http://cran.r-project.org/package=poppr. More supporting documentation and tutorials can be found under ‘resources’ at: http://grunwaldlab.cgrb.oregonstate.edu/.

1,942 citations


Journal ArticleDOI
09 Oct 2014-PeerJ
TL;DR: Simulations show that in cases where overdispersion is caused by random extra-Poisson noise, or aggregation in the count data, observation-level random effects yield more accurate parameter estimates compared to when overdispersion is simply ignored, and that their ability to minimise bias is not uniform across all types of overdispersion, so they must be applied judiciously.
Abstract: Overdispersion is common in models of count data in ecology and evolutionary biology, and can occur due to missing covariates, non-independent (aggregated) data, or an excess frequency of zeroes (zero-inflation). Accounting for overdispersion in such models is vital, as failing to do so can lead to biased parameter estimates, and false conclusions regarding hypotheses of interest. Observation-level random effects (OLRE), where each data point receives a unique level of a random effect that models the extra-Poisson variation present in the data, are commonly employed to cope with overdispersion in count data. However, studies investigating the efficacy of observation-level random effects as a means to deal with overdispersion are scarce. Here I use simulations to show that in cases where overdispersion is caused by random extra-Poisson noise, or aggregation in the count data, observation-level random effects yield more accurate parameter estimates compared to when overdispersion is simply ignored. Conversely, OLRE fail to reduce bias in zero-inflated data, and in some cases increase bias at high levels of overdispersion. There was a positive relationship between the magnitude of overdispersion and the degree of bias in parameter estimates. Critically, the simulations reveal that failing to account for overdispersion in mixed models can erroneously inflate measures of explained variance (r²), which may lead to researchers overestimating the predictive power of variables of interest. This work suggests use of observation-level random effects provides a simple and robust means to account for overdispersion in count data, but also that their ability to minimise bias is not uniform across all types of overdispersion and must be applied judiciously.

845 citations
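
The overdispersion described in the abstract above is commonly screened for with a variance-to-mean ratio, which is close to 1 for Poisson-like counts and well above 1 for aggregated counts. The following is a minimal sketch of that check only; it does not fit observation-level random effects, and the function name `dispersion_index` and both example samples are hypothetical, not taken from the paper.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return the sample variance of the counts divided by their mean."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m


# Hypothetical samples with the same mean (4) but very different spread.
even_counts = [3, 4, 5, 4, 3, 5, 4, 4]
aggregated_counts = [0, 0, 1, 0, 12, 0, 15, 4]  # a few large values, many zeroes

print("even:", dispersion_index(even_counts))
print("aggregated:", dispersion_index(aggregated_counts))
```

A ratio far above 1, as in the second sample, is the kind of signal that would motivate a model with an extra variance term rather than a plain Poisson fit.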


Journal ArticleDOI
25 Sep 2014-PeerJ
TL;DR: In this paper, the authors proposed Swarm, a fast, scalable, and input-order independent approach for amplicon clustering that reduces the influence of clustering parameters and produces robust operational taxonomic units.
Abstract: Popular de novo amplicon clustering methods suffer from two fundamental flaws: arbitrary global clustering thresholds, and input-order dependency induced by centroid selection. Swarm was developed to address these issues by first clustering nearly identical amplicons iteratively using a local threshold, and then by using clusters’ internal structure and amplicon abundances to refine its results. This fast, scalable, and input-order independent approach reduces the influence of clustering parameters and produces robust operational taxonomic units.

699 citations
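
The first phase described in the Swarm abstract, iterative clustering with a local threshold, can be sketched as follows. This is an illustrative simplification and not Swarm's implementation: it omits the refinement step based on internal structure and abundances, uses a naive all-against-all comparison, and the function names and the example amplicons are hypothetical.

```python
from collections import deque


def edit_distance(a, b):
    """Levenshtein distance computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution or match
        prev = curr
    return prev[-1]


def local_threshold_clusters(abundances, d=1):
    """Grow each cluster by repeatedly linking amplicons within d differences."""
    # Seeds are visited by decreasing abundance, with ties broken by sequence,
    # so the result does not depend on the order of the input mapping.
    order = sorted(abundances, key=lambda s: (-abundances[s], s))
    unassigned = set(order)
    clusters = []
    for seed in order:
        if seed not in unassigned:
            continue
        unassigned.remove(seed)
        cluster, queue = [seed], deque([seed])
        while queue:
            current = queue.popleft()
            neighbours = sorted(s for s in unassigned
                                if edit_distance(current, s) <= d)
            for s in neighbours:
                unassigned.remove(s)
                cluster.append(s)
                queue.append(s)
        clusters.append(cluster)
    return clusters


# Hypothetical amplicons mapped to their abundances.
amplicons = {"ACGT": 50, "ACGA": 5, "ACCA": 2, "TTTT": 7, "TTTA": 1}
print(local_threshold_clusters(amplicons))
```

In this example `ACCA` differs from the seed `ACGT` by two positions but still joins its cluster through `ACGA`, which is how a local threshold differs from a single global distance cutoff around a centroid.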


Journal ArticleDOI
09 Jan 2014-PeerJ
TL;DR: This work presents an approach to leverage phylogenetic analysis of metagenomic sequence data to conduct phylogeny-driven Bayesian hypothesis tests for the presence of an organism in a sample and applies new tools to analyze the phylogenetic diversity of microbial communities.
Abstract: Like all organisms on the planet, environmental microbes are subject to the forces of molecular evolution. Metagenomic sequencing provides a means to access the DNA sequence of uncultured microbes. By combining DNA sequencing of microbial communities with evolutionary modeling and phylogenetic analysis we might obtain new insights into microbiology and also provide a basis for practical tools such as forensic pathogen detection. In this work we present an approach to leverage phylogenetic analysis of metagenomic sequence data to conduct several types of analysis. First, we present a method to conduct phylogeny-driven Bayesian hypothesis tests for the presence of an organism in a sample. Second, we present a means to compare community structure across a collection of many samples and develop direct associations between the abundance of certain organisms and sample metadata. Third, we apply new tools to analyze the phylogenetic diversity of microbial communities and again demonstrate how this can be associated to sample metadata. These analyses are implemented in an open source software pipeline called PhyloSift. As a pipeline, PhyloSift incorporates several other programs including LAST, HMMER, and pplacer to automate phylogenetic analysis of protein coding and RNA sequences in metagenomic datasets generated by modern sequencing platforms (e.g., Illumina, 454).

580 citations


Journal ArticleDOI
21 Aug 2014-PeerJ
TL;DR: A performance-optimized algorithm for assigning marker gene sequences generated on next-generation sequencing platforms to operational taxonomic units (OTUs) for microbial community analysis is presented and it is shown that subsampled open-reference OTU picking yields results that are highly correlated with those generated by “classic” open-reference OTU picking through comparisons on three well-studied datasets.
Abstract: We present a performance-optimized algorithm, subsampled open-reference OTU picking, for assigning marker gene (e.g., 16S rRNA) sequences generated on next-generation sequencing platforms to operational taxonomic units (OTUs) for microbial community analysis. This algorithm provides benefits over de novo OTU picking (clustering can be performed largely in parallel, reducing runtime) and closed-reference OTU picking (all reads are clustered, not only those that match a reference database sequence with high similarity). Because more of our algorithm can be run in parallel relative to "classic" open-reference OTU picking, it makes open-reference OTU picking tractable on massive amplicon sequence data sets (though on smaller data sets, "classic" open-reference OTU clustering is often faster). We illustrate that here by applying it to the first 15,000 samples sequenced for the Earth Microbiome Project (1.3 billion V4 16S rRNA amplicons). To the best of our knowledge, this is the largest OTU picking run ever performed, and we estimate that our new algorithm runs in less than 1/5 the time than would be required of "classic" open reference OTU picking. We show that subsampled open-reference OTU picking yields results that are highly correlated with those generated by "classic" open-reference OTU picking through comparisons on three well-studied datasets. An implementation of this algorithm is provided in the popular QIIME software package, which uses uclust for read clustering. All analyses were performed using QIIME's uclust wrappers, though we provide details (aided by the open-source code in our GitHub repository) that will allow implementation of subsampled open-reference OTU picking independently of QIIME (e.g., in a compiled programming language, where runtimes should be further reduced). Our analyses should generalize to other implementations of these OTU picking algorithms. 
Finally, we present a comparison of parameter settings in QIIME's OTU picking workflows and make recommendations on settings for these free parameters to optimize runtime without reducing the quality of the results. These optimized parameters can vastly decrease the runtime of uclust-based OTU picking in QIIME.

491 citations


Journal ArticleDOI
10 Jun 2014-PeerJ
TL;DR: dDocent consistently identified more SNPs shared across greater numbers of individuals and with higher levels of coverage, because dDocent quality trims instead of filtering and incorporates both forward and reverse reads in assembly, mapping, and SNP calling.
Abstract: Restriction-site associated DNA sequencing (RADseq) has become a powerful and useful approach for population genomics. Currently, no software exists that utilizes both paired-end reads from RADseq data to efficiently produce population-informative variant calls, especially for non-model organisms with large effective population sizes and high levels of genetic polymorphism. dDocent is an analysis pipeline with a user-friendly, command-line interface designed to process individually barcoded RADseq data (with double cut sites) into informative SNPs/Indels for population-level analyses. The pipeline, written in BASH, uses data reduction techniques and other stand-alone software packages to perform quality trimming and adapter removal, de novo assembly of RAD loci, read mapping, SNP and Indel calling, and baseline data filtering. Double-digest RAD data from population pairings of three different marine fishes were used to compare dDocent with Stacks, the first generally available, widely used pipeline for analysis of RADseq data. dDocent consistently identified more SNPs shared across greater numbers of individuals and with higher levels of coverage. This is due to the fact that dDocent quality trims instead of filtering, incorporates both forward and reverse reads (including reads with INDEL polymorphisms) in assembly, mapping, and SNP calling. The pipeline and a comprehensive user guide can be found at http://dDocent.wordpress.com.

350 citations


Journal ArticleDOI
27 Feb 2014-PeerJ
TL;DR: It is argued that differences in translation rates can play no role in determining the expression levels for the ∼40% of genes that are non-expressed, and that transcription plays a more important role than the earlier studies implied and translation a much smaller role.
Abstract: Large scale surveys in mammalian tissue culture cells suggest that the protein expressed at the median abundance is present at 8,000-16,000 molecules per cell and that differences in mRNA expression between genes explain only 10-40% of the differences in protein levels. We find, however, that these surveys have significantly underestimated protein abundances and the relative importance of transcription. Using individual measurements for 61 housekeeping proteins to rescale whole proteome data from Schwanhausser et al. (2011), we find that the median protein detected is expressed at 170,000 molecules per cell and that our corrected protein abundance estimates show a higher correlation with mRNA abundances than do the uncorrected protein data. In addition, we estimated the impact of further errors in mRNA and protein abundances using direct experimental measurements of these errors. The resulting analysis suggests that mRNA levels explain at least 56% of the differences in protein abundance for the 4,212 genes detected by Schwanhausser et al. (2011), though because one major source of error could not be estimated the true percent contribution should be higher. We also employed a second, independent strategy to determine the contribution of mRNA levels to protein expression. We show that the variance in translation rates directly measured by ribosome profiling is only 12% of that inferred by Schwanhausser et al. (2011), and that the measured and inferred translation rates correlate poorly (R² = 0.13). Based on this, our second strategy suggests that mRNA levels explain ∼81% of the variance in protein levels. We also determined the percent contributions of transcription, RNA degradation, translation and protein degradation to the variance in protein abundances using both of our strategies. 
While the magnitudes of the two estimates vary, they both suggest that transcription plays a more important role than the earlier studies implied and translation a much smaller role. Finally, the above estimates only apply to those genes whose mRNA and protein expression was detected. Based on a detailed analysis by Hebenstreit et al. (2012), we estimate that approximately 40% of genes in a given cell within a population express no mRNA. Since there can be no translation in the absence of mRNA, we argue that differences in translation rates can play no role in determining the expression levels for the ∼40% of genes that are non-expressed.

300 citations


Journal ArticleDOI
30 Sep 2014-PeerJ
TL;DR: GroopM is introduced, an automated binning tool that primarily uses differential coverage to obtain high fidelity population genomes from related metagenomes and it is shown that GroopM produces results comparable with more time consuming, labor-intensive methods.
Abstract: Metagenomic binning methods that leverage differential population abundances in microbial communities (differential coverage) are emerging as a complementary approach to conventional composition-based binning. Here we introduce GroopM, an automated binning tool that primarily uses differential coverage to obtain high fidelity population genomes from related metagenomes. We demonstrate the effectiveness of GroopM using synthetic and real-world metagenomes, and show that GroopM produces results comparable with more time consuming, labor-intensive methods.

275 citations


Journal ArticleDOI
17 Jul 2014-PeerJ
TL;DR: This work introduces a technique for feature learning from large volumes of bird sound recordings, inspired by techniques that have proven useful in other domains, and demonstrates that unsupervised feature learning provides a substantial boost over MFCCs and Mel spectra without adding computational complexity after the model has been trained.
Abstract: Automatic species classification of birds from their sound is a computational tool of increasing importance in ecology, conservation monitoring and vocal communication studies. To make classification useful in practice, it is crucial to improve its accuracy while ensuring that it can run at big data scales. Many approaches use acoustic measures based on spectrogram-type data, such as the Mel-frequency cepstral coefficient (MFCC) features which represent a manually-designed summary of spectral information. However, recent work in machine learning has demonstrated that features learnt automatically from data can often outperform manually-designed feature transforms. Feature learning can be performed at large scale and "unsupervised", meaning it requires no manual data labelling, yet it can improve performance on "supervised" tasks such as classification. In this work we introduce a technique for feature learning from large volumes of bird sound recordings, inspired by techniques that have proven useful in other domains. We experimentally compare twelve different feature representations derived from the Mel spectrum (of which six use this technique), using four large and diverse databases of bird vocalisations, classified using a random forest classifier. We demonstrate that in our classification tasks, MFCCs can often lead to worse performance than the raw Mel spectral data from which they are derived. Conversely, we demonstrate that unsupervised feature learning provides a substantial boost over MFCCs and Mel spectra without adding computational complexity after the model has been trained. The boost is particularly notable for single-label classification tasks at large scale. The spectro-temporal activations learned through our procedure resemble spectro-temporal receptive fields calculated from avian primary auditory forebrain. 
However, for one of our datasets, which contains substantial audio data but few annotations, increased performance is not discernible. We study the interaction between dataset characteristics and choice of feature representation through further empirical analysis.

238 citations


Journal ArticleDOI
01 Apr 2014-PeerJ
TL;DR: The large-scale BLAST score ratio (LS-BSR) pipeline is presented, which rapidly compares the genetic content of hundreds to thousands of bacterial genomes, and returns a matrix that describes the relatedness of all coding sequences (CDSs) in all genomes surveyed.
Abstract: Background. As whole genome sequence data from bacterial isolates becomes cheaper to generate, computational methods are needed to correlate sequence data with biological observations. Here we present the large-scale BLAST score ratio (LS-BSR) pipeline, which rapidly compares the genetic content of hundreds to thousands of bacterial genomes, and returns a matrix that describes the relatedness of all coding sequences (CDSs) in all genomes surveyed. This matrix can be easily parsed in order to identify genetic relationships between bacterial genomes. Although pipelines have been published that group peptides by sequence similarity, no other software performs the rapid, large-scale, full-genome comparative analyses carried out by LS-BSR. Results. To demonstrate the utility of the method, the LS-BSR pipeline was tested on 96 Escherichia coli and Shigella genomes; the pipeline ran in 163 min using 16 processors, which is a greater than 7-fold speedup compared to using a single processor. The BSR values for each CDS, which indicate a relative level of relatedness, were then mapped to each genome on an independent core genome single nucleotide polymorphism (SNP) based phylogeny. Comparisons were then used to identify clade specific CDS markers and validate the LS-BSR pipeline based on molecular markers that delineate between classical E. coli pathogenic variant (pathovar) designations. Scalability tests demonstrated that the LS-BSR pipeline can process 1,000 E. coli genomes in 27-57 h, depending upon the alignment method, using 16 processors. Conclusions. LS-BSR is an open-source, parallel implementation of the BSR algorithm, enabling rapid comparison of the genetic content of large numbers of genomes. The results of the pipeline can be used to identify specific markers between user-defined phylogenetic groups, and to identify the loss and/or acquisition of genetic information between bacterial isolates. 
Taxa-specific genetic markers can then be translated into clinical diagnostics, or can be used to identify broadly conserved putative therapeutic candidates.
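
The matrix described in the abstract above can be illustrated with the usual definition of a BLAST score ratio: the bit score of a CDS aligned against a genome, divided by the bit score of that CDS aligned against itself. The sketch below assumes the alignment scores have already been computed; it is not the LS-BSR pipeline, and the function name, identifiers and score values are hypothetical.

```python
def bsr_matrix(self_scores, genome_scores):
    """Build {cds: {genome: ratio}} from precomputed alignment bit scores.

    self_scores:   {cds: bit score of the CDS aligned against itself}
    genome_scores: {genome: {cds: best bit score of the CDS in that genome}}
    """
    matrix = {}
    for cds, self_score in self_scores.items():
        if self_score <= 0:
            raise ValueError(f"self score for {cds} must be positive")
        row = {}
        for genome, scores in genome_scores.items():
            # A CDS with no hit in a genome is given a ratio of 0.0.
            row[genome] = round(scores.get(cds, 0.0) / self_score, 3)
        matrix[cds] = row
    return matrix


# Hypothetical bit scores for two CDSs and two genomes.
self_scores = {"cdsA": 200.0, "cdsB": 150.0}
genome_scores = {
    "genome1": {"cdsA": 200.0, "cdsB": 75.0},
    "genome2": {"cdsA": 100.0},
}

for cds, row in bsr_matrix(self_scores, genome_scores).items():
    print(cds, row)
```

Ratios near 1 indicate a CDS that is highly conserved in a genome and ratios near 0 indicate absence, so rows of such a matrix can be filtered to find candidate markers for a user-defined group of genomes.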

Journal ArticleDOI
08 Jul 2014-PeerJ
TL;DR: This study showed that AR was applied across a wide range of topics in healthcare education, and that learners reported acceptance of AR as a learning technology as well as its potential for improving different types of competencies.
Abstract: Background. Developing healthcare competencies in students and professionals poses great educational challenges. A possible solution is to provide learning opportunities that utilize augmented reality (AR), where virtual learning experiences can be embedded within a real physical context. The aim of this study was to provide a comprehensive overview of the current state of AR in terms of user acceptance, the AR applications currently developed and the effect of AR on the development of competencies in healthcare. Methods. We conducted an integrative review, which is the broadest type of research review method allowing for the inclusion of various research designs. This allows us to more fully understand a phenomenon of interest. Our review included multi-disciplinary research publications in English reported until 2012. Results. We found 2 529 research papers from ERIC, CINAHL, Medline, PubMed, Web of Science and Springer-link. Three qualitative, twenty quantitative and two mixed-method studies were included. Using thematic analysis, we have described characteristics for research, technology and education. This study showed that AR was applied across a wide range of topics in healthcare education. Furthermore, acceptance for AR as a learning technology was reported among the learners, as well as its potential for improving different types of competencies. Discussion. AR is still considered a novelty in the literature, with most of the studies reporting early prototypes. Additionally, the designed AR applications lacked an explicit pedagogical theoretical framework. Instead, the learning strategies adopted were of the traditional style ‘see one, do one and teach one’ and do not integrate clinical competencies to ensure patients’ safety.

Journal ArticleDOI
27 Mar 2014-PeerJ
TL;DR: In this paper, the authors used pollination exclusion on flowers or inflorescences on a whole plant basis to assess the contribution of insect pollination to crop yield and quality in four flowering crops (spring oilseed rape, field bean, strawberry, and buckwheat).
Abstract: Background. Up to 75% of crop species benefit at least to some degree from animal pollination for fruit or seed set and yield. However, basic information on the level of pollinator dependence and pollinator contribution to yield is lacking for many crops. Even less is known about how insect pollination affects crop quality. Given that habitat loss and agricultural intensification are known to decrease pollinator richness and abundance, there is a need to assess the consequences for different components of crop production. Methods. We used pollination exclusion on flowers or inflorescences on a whole plant basis to assess the contribution of insect pollination to crop yield and quality in four flowering crops (spring oilseed rape, field bean, strawberry, and buckwheat) located in four regions of Europe. For each crop, we recorded abundance and species richness of flower visiting insects in ten fields located along a gradient from simple to heterogeneous landscapes. Results. Insect pollination enhanced average crop yield between 18 and 71% depending on the crop. Yield quality was also enhanced in most crops. For instance, oilseed rape had higher oil and lower chlorophyll contents when adequately pollinated, the proportion of empty seeds decreased in buckwheat, and strawberries' commercial grade improved; however, we did not find higher nitrogen content in open pollinated field beans. Complex landscapes had a higher overall species richness of wild pollinators across crops, but visitation rates were only higher in complex landscapes for some crops. On the contrary, the overall yield was consistently enhanced by higher visitation rates, but not by higher pollinator richness. Discussion. For the four crops in this study, there is clear benefit delivered by pollinators on yield quantity and/or quality, but it is not maximized under current agricultural intensification. 
Honeybees, the most abundant pollinator, might partially compensate for the loss of wild pollinators in some areas, but our results suggest the need for landscape-scale actions to enhance wild pollinator populations.

Journal ArticleDOI
11 Mar 2014-PeerJ
TL;DR: A study with 42 participants investigates the relationship between the affective states, creativity, and analytical problem-solving skills of software developers and offers support for the claim that happy developers are indeed better problem solvers in terms of their analytical abilities.
Abstract: For more than thirty years, it has been claimed that a way to improve software developers’ productivity and software quality is to focus on people and to provide incentives to make developers satisfied and happy. This claim has rarely been verified in software engineering research, which faces an additional challenge in comparison to more traditional engineering fields: software development is an intellectual activity and is dominated by often-neglected human factors (called human aspects in software engineering research). Among the many skills required for software development, developers must possess high analytical problem-solving skills and creativity for the software construction process. According to psychology research, affective states—emotions and moods—deeply influence the cognitive processing abilities and performance of workers, including creativity and analytical problem solving. Nonetheless, little research has investigated the correlation between the affective states, creativity, and analytical problem-solving performance of programmers. This article echoes the call to employ psychological measurements in software engineering research. We report a study with 42 participants to investigate the relationship between the affective states, creativity, and analytical problem-solving skills of software developers. The results offer support for the claim that happy developers are indeed better problem solvers in terms of their analytical abilities. The following contributions are made by this study: (1) providing a better understanding of the impact of affective states on the creativity and analytical problem-solving capacities of developers, (2) introducing and validating psychological measurements, theories, and concepts of affective states, creativity, and analytical-problem-solving skills in empirical software engineering, and (3) raising the need for studying the human factors of software engineering by employing a multidisciplinary viewpoint.

Journal ArticleDOI
20 Nov 2014-PeerJ
TL;DR: The draft assembly of domestic cow, Bos taurus, was scanned to identify 173 small contigs that appeared to derive from microbial contaminants, and it was discovered that one genome, Neisseria gonorrhoeae TCDC-NG08107, although putatively a complete genome, contained multiple sequences that actually derived from the cow and sheep genomes.
Abstract: The raw data from a genome sequencing project sometimes contains DNA from contaminating organisms, which may be introduced during sample collection or sequence preparation. In some instances, these contaminants remain in the sequence even after assembly and deposition of the genome into public databases. As a result, searches of these databases may yield erroneous and confusing results. We used efficient microbiome analysis software to scan the draft assembly of domestic cow, Bos taurus, and identify 173 small contigs that appeared to derive from microbial contaminants. In the course of verifying these findings, we discovered that one genome, Neisseria gonorrhoeae TCDC-NG08107, although putatively a complete genome, contained multiple sequences that actually derived from the cow and sheep genomes. Our findings illustrate the need to carefully validate findings of anomalous DNA that rely on comparisons to either draft or finished genomes.

Journal ArticleDOI
25 Mar 2014-PeerJ
TL;DR: New Zealand manuka-type honeys, at the concentrations at which they can be applied in wound dressings, are highly active in both preventing S. aureus biofilm formation and eradicating established biofilms, and do not result in bacteria becoming resistant.
Abstract: Chronic wounds are a major global health problem. Their management is difficult and costly, and the development of antibiotic resistance by both planktonic and biofilm-associated bacteria necessitates the use of alternative wound treatments. Honey is now being revisited as an alternative treatment due to its broad-spectrum antibacterial activity and the inability of bacteria to develop resistance to it. Many previous antibacterial studies have used honeys that are not well characterized, even in terms of quantifying the levels of the major antibacterial components present, making it difficult to build an evidence base for the efficacy of honey as an antibiofilm agent in chronic wound treatment. Here we show that a range of well-characterized New Zealand manuka-type honeys, in which two principal antibacterial components, methylglyoxal and hydrogen peroxide, were quantified, can eradicate biofilms of a range of Staphylococcus aureus strains that differ widely in their biofilm-forming abilities. Using crystal violet and viability assays, along with confocal laser scanning imaging, we demonstrate that in all S. aureus strains, including methicillin-resistant strains, the manuka-type honeys showed significantly higher anti-biofilm activity than clover honey and an isotonic sugar solution. We observed higher anti-biofilm activity as the proportion of manuka-derived honey, and thus methylglyoxal, in a honey blend increased. However, methylglyoxal on its own, or with sugar, was not able to effectively eradicate S. aureus biofilms. We also demonstrate that honey was able to penetrate through the biofilm matrix and kill the embedded cells in some cases. As has been reported for antibiotics, sub-inhibitory concentrations of honey improved biofilm formation by some S. aureus strains; however, biofilm cell suspensions recovered after honey treatment did not develop resistance towards manuka-type honeys. 
New Zealand manuka-type honeys, at the concentrations they can be applied in wound dressings are highly active in both preventing S. aureus biofilm formation and in their eradication, and do not result in bacteria becoming resistant. Methylglyoxal requires other components in manuka-type honeys for this anti-biofilm activity. Our findings support the use of well-defined manuka-type honeys as a topical anti-biofilm treatment for the effective management of wound healing.

Journal ArticleDOI
05 Aug 2014-PeerJ
TL;DR: A QIIME-compatible database designed for species-level taxonomic assignment of 16S rRNA gene amplicon data targeting methanogenic archaea from the rumen, and from animal and human intestinal tracts, and shows that taxonomic assignments with RIM-DB resulted in the most detailed assignment.
Abstract: Methane is formed by methanogenic archaea in the rumen as one of the end products of feed fermentation in the ruminant digestive tract. To develop strategies to mitigate anthropogenic methane emissions due to ruminant farming, and to understand rumen microbial differences in animal feed conversion efficiency, it is essential that methanogens can be identified and taxonomically classified with high accuracy. Currently available taxonomic frameworks offer only limited resolution beyond the genus level for taxonomic assignments of sequence data stemming from high throughput sequencing technologies. Therefore, we have developed a QIIME-compatible database (DB) designed for species-level taxonomic assignment of 16S rRNA gene amplicon data targeting methanogenic archaea from the rumen, and from animal and human intestinal tracts. Called RIM-DB (Rumen and Intestinal Methanogen-DB), it contains a set of 2,379 almost full-length chimera-checked 16S rRNA gene sequences, including 20 previously unpublished sequences from isolates from three different orders. The taxonomy encompasses the recently-proposed seventh order of methanogens, the Methanomassiliicoccales, and allows differentiation between defined groups within this order. Sequence reads from rumen contents from a range of ruminant-diet combinations were taxonomically assigned using RIM-DB, Greengenes and SILVA. This comparison clearly showed that taxonomic assignments with RIM-DB resulted in the most detailed assignment, and only RIM-DB taxonomic assignments allowed methanogens to be distinguished taxonomically at the species level. RIM-DB complements the use of comprehensive databases such as Greengenes and SILVA for community structure analysis of methanogens from the rumen and other intestinal environments, and allows identification of target species for methane mitigation strategies.

Journal ArticleDOI
24 Jun 2014-PeerJ
TL;DR: The utility of mobile phones to gather data about the personal microbiome — the collection of microorganisms associated with the personal effects of an individual — is investigated and suggests that mobile phones hold untapped potential as personal microbiome sensors.
Abstract: Most people on the planet own mobile phones, and these devices are increasingly being utilized to gather data relevant to our personal health, behavior, and environment. During an educational workshop, we investigated the utility of mobile phones to gather data about the personal microbiome - the collection of microorganisms associated with the personal effects of an individual. We characterized microbial communities on smartphone touchscreens to determine whether there was significant overlap with the skin microbiome sampled directly from their owners. We found that about 22% of the bacterial taxa on participants' fingers were also present on their own phones, as compared to 17% they shared on average with other people's phones. When considered as a group, bacterial communities on men's phones were significantly different from those on their fingers, while women's were not. Yet when considered on an individual level, men and women both shared significantly more of their bacterial communities with their own phones than with anyone else's. In fact, 82% of the OTUs were shared between a person's index and phone when considering the dominant taxa (OTUs with more than 0.1% of the sequences in an individual's dataset). Our results suggest that mobile phones hold untapped potential as personal microbiome sensors.
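The overlap figures quoted above can be read as the share of taxa found on a finger that also turn up on a phone. The following is a minimal sketch of that calculation, assuming simple presence/absence sets of OTUs. The function name `shared_fraction` and all OTU labels are hypothetical, and the study's actual analysis pipeline is not described in this abstract.

```python
def shared_fraction(finger_otus, phone_otus):
    """Fraction of the finger's OTUs that are also present on the phone."""
    finger = set(finger_otus)
    if not finger:
        raise ValueError("finger sample contains no OTUs")
    return len(finger & set(phone_otus)) / len(finger)


# Hypothetical presence/absence data, for illustration only.
finger = {"otu_1", "otu_2", "otu_3", "otu_4", "otu_5"}
own_phone = {"otu_1", "otu_2", "otu_9"}
other_phone = {"otu_2", "otu_7", "otu_8"}

own = shared_fraction(finger, own_phone)      # 2 of the 5 finger OTUs
other = shared_fraction(finger, other_phone)  # 1 of the 5 finger OTUs
print(f"own phone: {own:.0%}, other phone: {other:.0%}")
```

Restricting both sets to dominant taxa before calling the function would mirror the abstract's second comparison, which considers only OTUs above 0.1% of an individual's sequences.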

Journal ArticleDOI
02 Jan 2014-PeerJ
TL;DR: Three novel scoring systems to diagnose BRD in preweaned dairy calves were developed using data from a pair-matched case-control study on a California calf raising facility and had excellent agreement when compared to an earlier scoring system.
Abstract: Several clinical scoring systems for diagnosis of bovine respiratory disease (BRD) in calves have been proposed. However, such systems were based on subjective judgment, rather than statistical methods, to weight scores. Data from a pair-matched case-control study on a California calf raising facility was used to develop three novel scoring systems to diagnose BRD in preweaned dairy calves. Disease status was assigned using both clinical signs and diagnostic test results for BRD-associated pathogens. Regression coefficients were used to weight score values. The systems presented use nasal and ocular discharge, rectal temperature, ear and head carriage, coughing, and respiratory quality as predictors. The systems developed in this research utilize fewer severity categories of clinical signs, require less calf handling, and had excellent agreement (Kappa > 0.8) when compared to an earlier scoring system. The first scoring system dichotomized all clinical predictors but required inducing a cough. The second scoring system removed induced cough as a clinical abnormality but required distinguishing between three levels of nasal discharge severity. The third system removed induced cough and forced a dichotomized variable for nasal discharge. The first system presented in this study used the following predictors and assigned values: coughing (induced or spontaneous coughing, 2 points), nasal discharge (any discharge, 3 points), ocular discharge (any discharge, 2 points), ear and head carriage (ear droop or head tilt, 5 points), fever (≥39.2°C or 102.5°F, 2 points), and respiratory quality (abnormal respiration, 2 points). Calves were categorized "BRD positive" if their total score was ≥4. This system correctly classified 95.4% cases and 88.6% controls. 
The second presented system categorized the predictors and assigned weights as follows: coughing (spontaneous only, 2 points), mild nasal discharge (unilateral, serous, or watery discharge, 3 points), moderate to severe nasal discharge (bilateral, cloudy, mucoid, mucopurulent, or copious discharge, 5 points), ocular discharge (any discharge, 1 point), ear and head carriage (ear droop or head tilt, 5 points), fever (≥39.2°C, 2 points), and respiratory quality (abnormal respiration, 2 points). Calves were categorized "BRD positive" if their total score was ≥4. This system correctly classified 89.3% cases and 92.8% controls. The third presented system used the following predictors and scores: coughing (spontaneous only, 2 points), nasal discharge (any, 4 points), ocular discharge (any, 2 points), ear and head carriage (ear droop or head tilt, 5 points), fever (≥39.2°C, 2 points), and respiratory quality (abnormal respiration, 2 points). Calves were categorized "BRD positive" if their total score was ≥5. This system correctly classified 89.4% cases and 90.8% controls. Each of the proposed systems offers few levels of clinical signs and data-based weights for on-farm diagnosis of BRD in dairy calves.
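The three scoring systems described in this abstract amount to a weighted checklist with a cut-off. The sketch below encodes the point values and thresholds exactly as stated in the abstract. The dictionary keys and the function names `brd_score` and `is_brd_positive` are illustrative choices, not identifiers from the paper.

```python
# Point values and cut-offs as stated in the abstract. In system 1 "cough"
# means induced or spontaneous coughing; in systems 2 and 3 it means
# spontaneous coughing only. Key names are illustrative.
SYSTEMS = {
    1: {"cutoff": 4,
        "points": {"cough": 2, "nasal_discharge": 3, "ocular_discharge": 2,
                   "ear_head": 5, "fever": 2, "abnormal_respiration": 2}},
    2: {"cutoff": 4,
        "points": {"cough": 2, "nasal_mild": 3, "nasal_moderate_severe": 5,
                   "ocular_discharge": 1, "ear_head": 5, "fever": 2,
                   "abnormal_respiration": 2}},
    3: {"cutoff": 5,
        "points": {"cough": 2, "nasal_discharge": 4, "ocular_discharge": 2,
                   "ear_head": 5, "fever": 2, "abnormal_respiration": 2}},
}


def brd_score(system, signs):
    """Sum the points for the clinical signs observed in one calf."""
    points = SYSTEMS[system]["points"]
    observed = set(signs)
    unknown = observed - set(points)
    if unknown:
        raise ValueError(f"signs not used by system {system}: {sorted(unknown)}")
    if {"nasal_mild", "nasal_moderate_severe"} <= observed:
        raise ValueError("nasal discharge cannot be both mild and moderate-to-severe")
    return sum(points[sign] for sign in observed)


def is_brd_positive(system, signs):
    """Apply the system's cut-off to the total score."""
    return brd_score(system, signs) >= SYSTEMS[system]["cutoff"]


# A calf with fever and ocular discharge only, scored under each system.
calf = ["fever", "ocular_discharge"]
for system in (1, 2, 3):
    print(system, brd_score(system, calf), is_brd_positive(system, calf))
```

The example calf scores 4, 3, and 4 points under systems 1, 2, and 3 respectively, so only system 1 classifies it as BRD positive.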

Journal ArticleDOI
30 Oct 2014-PeerJ
TL;DR: Exposure to canola grown from seed treated with clothianidin poses low risk to honey bees, and overwintering success did not differ significantly between treatment and control hives, and was similar to overwintering colony loss rates reported for the winter of 2012–2013.
Abstract: In summer 2012, we initiated a large-scale field experiment in southern Ontario, Canada, to determine whether exposure to clothianidin seed-treated canola (oil seed rape) has any adverse impacts on honey bees. Colonies were placed in clothianidin seed-treated or control canola fields during bloom, and thereafter were moved to an apiary with no surrounding crops grown from seeds treated with neonicotinoids. Colony weight gain, honey production, pest incidence, bee mortality, number of adults, and amount of sealed brood were assessed in each colony throughout summer and autumn. Samples of honey, beeswax, pollen, and nectar were regularly collected, and samples were analyzed for clothianidin residues. Several of these endpoints were also measured in spring 2013. Overall, colonies were vigorous during and after the exposure period, and we found no effects of exposure to clothianidin seed-treated canola on any endpoint measures. Bees foraged heavily on the test fields during peak bloom and residue analysis indicated that honey bees were exposed to low levels (0.5–2 ppb) of clothianidin in pollen. Low levels of clothianidin were detected in a few pollen samples collected toward the end of the bloom from control hives, illustrating the difficulty of conducting a perfectly controlled field study with free-ranging honey bees in agricultural landscapes. Overwintering success did not differ significantly between treatment and control hives, and was similar to overwintering colony loss rates reported for the winter of 2012–2013 for beekeepers in Ontario and Canada. Our results suggest that exposure to canola grown from seed treated with clothianidin poses low risk to honey bees.

Journal ArticleDOI
23 Sep 2014-PeerJ
TL;DR: The potential of shotgun metagenomics, sequencing of DNA from samples without culture or target-specific amplification or capture, to detect and characterise strains from the Mycobacterium tuberculosis complex in smear-positive sputum samples obtained from The Gambia in West Africa is explored.
Abstract: Tuberculosis remains a major global health problem. Laboratory diagnostic methods that allow effective, early detection of cases are central to management of tuberculosis in the individual patient and in the community. Since the 1880s, laboratory diagnosis of tuberculosis has relied primarily on microscopy and culture. However, microscopy fails to provide species- or lineage-level identification and culture-based workflows for diagnosis of tuberculosis remain complex, expensive, slow, technically demanding and poorly able to handle mixed infections. We therefore explored the potential of shotgun metagenomics, sequencing of DNA from samples without culture or target-specific amplification or capture, to detect and characterise strains from the Mycobacterium tuberculosis complex in smear-positive sputum samples obtained from The Gambia in West Africa. Eight smear- and culture-positive sputum samples were investigated using a differential-lysis protocol followed by a kit-based DNA extraction method, with sequencing performed on a benchtop sequencing instrument, the Illumina MiSeq. The number of sequence reads in each sputum-derived metagenome ranged from 989,442 to 2,818,238. The proportion of reads in each metagenome mapping against the human genome ranged from 20% to 99%. We were able to detect sequences from the M. tuberculosis complex in all eight samples, with coverage of the H37Rv reference genome ranging from 0.002X to 0.7X. By analysing the distribution of large sequence polymorphisms (deletions and the locations of the insertion element IS6110) and single nucleotide polymorphisms (SNPs), we were able to assign seven of eight metagenome-derived genomes to a species and lineage within the M. tuberculosis complex. Two metagenome-derived mycobacterial genomes were assigned to M. africanum, a species largely confined to West Africa; the others that could be assigned belonged to lineages T, H or LAM within the clade of “modern” M. tuberculosis strains. 
We have provided proof of principle that shotgun metagenomics can be used to detect and characterise M. tuberculosis sequences from sputum samples without culture or target-specific amplification or capture, using an accessible benchtop-sequencing platform, the Illumina MiSeq, and relatively simple DNA extraction, sequencing and bioinformatics protocols. In our hands, sputum metagenomics does not yet deliver sufficient depth of coverage to allow sequence-based sensitivity testing; it remains to be determined whether improvements in DNA extraction protocols alone can deliver this or whether culture, capture or amplification steps will be required. Nonetheless, we can foresee a tipping point when a unified automated metagenomics-based workflow might start to compete with the plethora of methods currently in use in the diagnostic microbiology laboratory.
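Coverage figures such as 0.002X to 0.7X follow from a simple ratio: total mapped bases divided by the length of the reference genome. The sketch below shows that arithmetic. The H37Rv genome length (4,411,532 bp) is the published reference length rather than a figure from this abstract, and the read count, read length, and target depth are hypothetical values chosen only for illustration.

```python
import math

H37RV_GENOME_BP = 4_411_532  # M. tuberculosis H37Rv reference length (not from the abstract)


def mean_depth(mapped_reads, read_length_bp, genome_bp=H37RV_GENOME_BP):
    """Mean depth of coverage: mapped bases divided by genome length."""
    return mapped_reads * read_length_bp / genome_bp


def reads_for_depth(target_depth, read_length_bp, genome_bp=H37RV_GENOME_BP):
    """Smallest number of mapped reads that reaches the target mean depth."""
    return math.ceil(target_depth * genome_bp / read_length_bp)


# Hypothetical inputs: 10,000 mapped reads of 150 bp each.
depth = mean_depth(10_000, 150)
print(f"mean depth: {depth:.2f}X")

# Hypothetical target of 30X, chosen arbitrarily for the example.
print("reads needed for 30X:", reads_for_depth(30, 150))
```

Mean depth says nothing about how evenly the reads are distributed, so a usable assembly or variant call may need more reads than this estimate suggests.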

Journal ArticleDOI
20 Mar 2014-PeerJ
TL;DR: Passerine gut microbiota may be more variable and environmentally determined than other taxonomic groups examined to date and may be most strongly influenced by environmental factors.
Abstract: Brown-headed Cowbirds (Molothrus ater) are the most widespread avian brood parasite in North America, laying their eggs in the nests of approximately 250 host species that raise the cowbird nestlings as their own. It is currently unknown how these heterospecific hosts influence the cowbird gut microbiota relative to other factors, such as the local environment and genetics. We test a Nature Hypothesis (positing the importance of cowbird genetics) and a Nurture Hypothesis (where the host parents are most influential to cowbird gut microbiota) using the V6 region of 16S rRNA as a microbial fingerprint of the gut from 32 cowbird samples and 16 potential hosts from nine species. We test additional hypotheses regarding the influence of the local environment and age of the birds. We found no evidence for the Nature Hypothesis and little support for the Nurture Hypothesis. Cowbird gut microbiota did not form a clade, but neither did members of the host species. Rather, the physical location, diet and age of the bird, whether cowbird or host, were the most significant categorical variables. Thus, passerine gut microbiota may be most strongly influenced by environmental factors. To put this variation in a broader context, we compared the bird data to a fecal microbiota dataset of 38 mammal species and 22 insect species. Insects were always the most variable; on some axes, we found more variation within cowbirds than across all mammals. Taken together, passerine gut microbiota may be more variable and environmentally determined than other taxonomic groups examined to date.

Journal ArticleDOI
27 May 2014-PeerJ
TL;DR: This work estimates the intrinsic risk of extinction for a typical generic manta ray using a variant of the classic Euler–Lotka demographic model, and shows that it is possible to derive important insights into the demography and extinction risk of data-poor species using well-established life history theory.
Abstract: Background: The directed harvest and global trade in the gill plates of manta and devil rays has led to increased fishing pressure and steep population declines in some locations. The slow life history, particularly of the manta rays, is cited as a key reason why such species have little capacity to withstand directed fisheries. Here, we place their life history and demography within the context of other sharks and rays. Methods: Despite the limited availability of data, we use life history theory and comparative analysis to estimate the intrinsic risk of extinction (as indexed by the maximum intrinsic rate of population increase, r_max) for a typical generic manta ray using a variant of the classic Euler-Lotka demographic model. This model requires only three traits to calculate the maximum intrinsic population growth rate r_max: von Bertalanffy growth rate, annual pup production and age at maturity. To account for the uncertainty in life history parameters, we created plausible parameter ranges and propagated these uncertainties through the model to calculate a distribution of the plausible range of r_max values. Results: The maximum population growth rate r_max of manta rays is most sensitive to the length of the reproductive cycle, and the median r_max of 0.116 year⁻¹ (95th percentile: 0.089–0.139) is one of the lowest known of the 106 sharks and rays for which we have comparable demographic information. Discussion: In common with other unprotected, unmanaged, high-value large-bodied sharks and rays, the combination of very low population growth rates of manta rays with the high value of their gill rakers and the international nature of trade is highly likely to lead to rapid depletion and potential local extinction unless a rapid conservation management response occurs worldwide.
Furthermore, we show that it is possible to derive important insights into the demography and extinction risk of data-poor species using well-established life history theory.
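To illustrate how a maximum intrinsic growth rate can be obtained from a few life history traits, the sketch below solves the classic discrete Euler-Lotka equation, sum over ages x of l_x · b · exp(−r·x) = 1, by bisection. It is not the variant used in the paper. It assumes a constant annual survival probability supplied directly (the paper instead works from the von Bertalanffy growth rate) and a constant annual output of female pups from maturity to a maximum age. All parameter values are hypothetical and are not intended to reproduce the reported r_max.

```python
import math


def euler_lotka_residual(r, age_maturity, max_age, annual_survival, female_pups_per_year):
    """Left-hand side of the Euler-Lotka equation minus 1 for a trial r."""
    total = 0.0
    for age in range(age_maturity, max_age + 1):
        survivorship = annual_survival ** age  # l_x under constant survival (assumption)
        total += survivorship * female_pups_per_year * math.exp(-r * age)
    return total - 1.0


def solve_r(age_maturity, max_age, annual_survival, female_pups_per_year,
            lo=-1.0, hi=2.0, tol=1e-10):
    """Find r by bisection; the residual decreases monotonically with r."""
    args = (age_maturity, max_age, annual_survival, female_pups_per_year)
    if euler_lotka_residual(lo, *args) <= 0 or euler_lotka_residual(hi, *args) >= 0:
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if euler_lotka_residual(mid, *args) > 0:
            lo = mid  # trial r too small: discounted reproduction still exceeds 1
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical life history values, for illustration only.
r_est = solve_r(age_maturity=10, max_age=40, annual_survival=0.95,
                female_pups_per_year=0.25)
print(f"estimated r: {r_est:.3f} per year")
```

Uncertainty could be propagated in the spirit of the abstract by drawing each parameter from a plausible range and calling `solve_r` once per draw, which yields a distribution of r values rather than a single estimate.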

Journal ArticleDOI
27 May 2014-PeerJ
TL;DR: Hi-C sequencing data provide valuable information for metagenome analyses that are not currently obtainable by other methods, and it is demonstrated that Hi-C data provides a long-range signal of strain-specific genotypes, indicating such data may be useful for high-resolution genotyping of microbial populations.
Abstract: Metagenomics is a valuable tool for the study of microbial communities but has been limited by the difficulty of "binning" the resulting sequences into groups corresponding to the individual species and strains that constitute the community. Moreover, there are presently no methods to track the flow of mobile DNA elements such as plasmids through communities or to determine which of these are co-localized within the same cell. We address these limitations by applying Hi-C, a technology originally designed for the study of three-dimensional genome structure in eukaryotes, to measure the cellular co-localization of DNA sequences. We leveraged Hi-C data generated from a simple synthetic metagenome sample to accurately cluster metagenome assembly contigs into groups that contain nearly complete genomes of each species. The Hi-C data also reliably associated plasmids with the chromosomes of their host and with each other. We further demonstrated that Hi-C data provides a long-range signal of strain-specific genotypes, indicating such data may be useful for high-resolution genotyping of microbial populations. Our work demonstrates that Hi-C sequencing data provide valuable information for metagenome analyses that are not currently obtainable by other methods. This metagenomic Hi-C method could facilitate future studies of the fine-scale population structure of microbes, as well as studies of how antibiotic resistance plasmids (or other genetic elements) mobilize in microbial communities. The method is not limited to microbiology; the genetic architecture of other heterogeneous populations of cells could also be studied with this technique.

Journal ArticleDOI
06 May 2014-PeerJ
TL;DR: It is shown that overall consumption rates increase with temperature between 20 and 30 °C but do not increase further with increasing temperature, and there is substantial variation in thermal responses among individual herbivore-plant pairs at the highest temperatures.
Abstract: Rising temperatures can influence the top-down control of plant biomass by increasing herbivore metabolic demands. Unfortunately, we know relatively little about the effects of temperature on herbivory rates for most insect herbivores in a given community. Evolutionary history, adaptation to local environments, and dietary factors may lead to variable thermal response curves across different species. Here we characterized the effect of temperature on herbivory rates for 21 herbivore-plant pairs, encompassing 14 herbivore and 12 plant species. We show that overall consumption rates increase with temperature between 20 and 30 °C but do not increase further with increasing temperature. However, there is substantial variation in thermal responses among individual herbivore-plant pairs at the highest temperatures. Over one third of the herbivore-plant pairs showed declining consumption rates at high temperatures, while an approximately equal number showed increasing consumption rates. Such variation existed even within herbivore species, as some species exhibited idiosyncratic thermal response curves on different host plants. Thus, rising temperatures, particularly with respect to climate change, may have highly variable effects on plant-herbivore interactions and, ultimately, top-down control of plant biomass.

Journal ArticleDOI
20 May 2014-PeerJ
TL;DR: A multi-gene analysis highlights the potential power of assessing genome-wide evolutionary patterns using recent advances in sequencing technology and emphasizes the importance of integrating ecological data with more comprehensive sampling of free-living and symbiotic Symbiodinium in assessing the evolutionary adaptation of this enigmatic dinoflagellate.
Abstract: Symbiodinium, a large group of dinoflagellates, live in symbiosis with marine protists and invertebrate metazoans, and also occur free-living in the environment. Symbiodinium are functionally variable and play critical energetic roles in symbiosis. Our knowledge of Symbiodinium has been historically constrained by the limited number of molecular markers available to study evolution in the genus. Here we compare six functional genes, representing three cellular compartments, in the nine known Symbiodinium lineages. Despite striking similarities among the single gene phylogenies from distinct organelles, none were evolutionarily identical. A fully concatenated reconstruction, however, yielded a well-resolved topology identical to the current benchmark nr28S gene. Evolutionary rates differed among cellular compartments and clades, a pattern largely driven by higher rates of evolution in the chloroplast genes of Symbiodinium clades D2 and I. The rapid rates of evolution observed amongst these relatively uncommon Symbiodinium lineages in the functionally critical chloroplast may translate into potential innovation for the symbiosis. The multi-gene analysis highlights the potential power of assessing genome-wide evolutionary patterns using recent advances in sequencing technology and emphasizes the importance of integrating ecological data with more comprehensive sampling of free-living and symbiotic Symbiodinium in assessing the evolutionary adaptation of this enigmatic dinoflagellate.

Journal ArticleDOI
18 Feb 2014-PeerJ
TL;DR: It is found that elephants affiliated significantly more with other individuals through directed, physical contact and vocal communication following a distress event than in control periods, and is best classified with similar consolation responses by apes, possibly based on convergent evolution of empathic capacities.
Abstract: Contact directed by uninvolved bystanders toward others in distress, often termed consolation, is uncommon in the animal kingdom, thus far only demonstrated in the great apes, canines, and corvids. Whereas the typical agonistic context of such contact is relatively rare within natural elephant families, other causes of distress may trigger similar, other-regarding responses. In a study carried out at an elephant camp in Thailand, we found that elephants affiliated significantly more with other individuals through directed, physical contact and vocal communication following a distress event than in control periods. In addition, bystanders affiliated with each other, and matched the behavior and emotional state of the first distressed individual, suggesting emotional contagion. The initial distress responses were overwhelmingly directed toward ambiguous stimuli, thus making it difficult to determine if bystanders reacted to the distressed individual or showed a delayed response to the same stimulus. Nonetheless, the directionality of the contacts and their nature strongly suggest attention toward the emotional states of conspecifics. The elephants' behavior is therefore best classified with similar consolation responses by apes, possibly based on convergent evolution of empathic capacities.

Journal ArticleDOI
28 Oct 2014-PeerJ
TL;DR: The study supports the use of analysis of Twitter content to unobtrusively measure attitudes towards mental illness, both supportive and stigmatising.
Abstract: Introduction. The paper reports on an exploratory study of the usefulness of Twitter for unobtrusive assessment of stigmatizing attitudes in the community. Materials and Methods. Tweets with the hashtags #depression or #schizophrenia posted on Twitter during a 7-day period were collected. Tweets were categorised based on their content and user information and also on the extent to which they indicated a stigmatising attitude towards depression or schizophrenia (stigmatising, personal experience of stigma, supportive, neutral, or anti-stigma). Tweets that indicated stigmatising attitudes or personal experiences of stigma were further grouped into the following subthemes: social distance, dangerousness, snap out of it, personal weakness, inaccurate beliefs, mocking or trivializing, and self-stigma. Results and Discussion. Tweets on depression mostly related to resources for consumers (34%), or advertised services or products for individuals with depression (20%). The majority of schizophrenia tweets aimed to increase awareness of schizophrenia (29%) or reported on research findings (22%). Tweets on depression were largely supportive (65%) or neutral (27%). A number of tweets were specifically anti-stigma (7%). Less than 1% of tweets reflected stigmatising attitudes (0.7%) or personal experience of stigma (0.1%). More than one third of the tweets which reflected stigmatising attitudes were mocking or trivialising towards individuals with depression (37%). The attitude that individuals with depression should "snap out of it" was evident in 30% of the stigmatising tweets. The majority of tweets relating to schizophrenia were categorised as supportive (42%) or neutral (43%). Almost 10% of tweets were explicitly anti-stigma. The percentage of tweets showing stigmatising attitudes was 5%, while less than 1% of tweets described personal experiences of stigmatising attitudes towards individuals with schizophrenia. 
Of the tweets that indicated stigmatising attitudes, most reflected inaccurate beliefs about schizophrenia being multiple personality disorder (52%) or mocked or trivialised individuals with schizophrenia (33%). Conclusions. The study supports the use of analysis of Twitter content to unobtrusively measure attitudes towards mental illness, both supportive and stigmatising. The results of the study may be useful in assisting mental health promotion and advocacy organisations to provide information about resources and support, raise awareness and counter common stigmatising attitudes.

Journal ArticleDOI
12 Jun 2014-PeerJ
TL;DR: Chimpanzees spontaneously solved the task a total of 3,565 times in both dyadic and triadic combinations, demonstrating that in the midst of a complex social environment, chimpanzees spontaneously initiate and maintain a high level of cooperative behavior.
Abstract: The purpose of the present study was to push the boundaries of cooperation among captive chimpanzees (Pan troglodytes). There has been doubt about the level of cooperation that chimpanzees are able to spontaneously achieve or understand. Would they, without any pre-training or restrictions in partner choice, be able to develop successful joint action? And would they be able to extend cooperation to more than two partners, as they do in nature? Chimpanzees were given a chance to cooperate with multiple partners of their own choosing. All members of the group (N = 11) had simultaneous access to an apparatus that required two (dyadic condition) or three (triadic condition) individuals to pull in a tray baited with food. Without any training, the chimpanzees spontaneously solved the task a total of 3,565 times in both dyadic and triadic combinations. Their success rate and efficiency increased over time, whereas the amount of pulling in the absence of a partner decreased, demonstrating that they had learned the task contingencies. They preferentially approached the apparatus when kin or nonkin of similar rank were present, showing a preference for socially tolerant partners. The forced partner combinations typical of cooperation experiments cannot reveal these abilities, which demonstrate that in the midst of a complex social environment, chimpanzees spontaneously initiate and maintain a high level of cooperative behavior.

Journal ArticleDOI
25 Sep 2014-PeerJ
TL;DR: This study presents evidence of synergy from an experimental system of two phages and a mucoid E. coli host and offers mathematical models and simulations to understand the dynamics of synergy and the enhanced magnitude of bacterial control possible.
Abstract: Where phages are used to treat bacterial contaminations and infections, multiple phages are typically applied at once as a cocktail. When two or more phages in the cocktail attack the same bacterium, the combination may produce better killing than any single phage (synergy) or the combination may be worse than the best single phage (interference). Synergy is of obvious utility, especially if it can be predicted a priori, but it remains poorly documented with few examples known. This study addresses synergy in which one phage improves adsorption by a second phage. It first presents evidence of synergy from an experimental system of two phages and a mucoid E. coli host. The synergy likely stems from a tailspike enzyme produced by one of the phages. We then offer mathematical models and simulations to understand the dynamics of synergy and the enhanced magnitude of bacterial control possible. The models and observations complement each other and suggest that synergy may be of widespread utility and may be predictable from easily observed phenotypes.