Journal • ISSN: 2167-8359

PeerJ

PeerJ, Inc.

About: PeerJ is an open-access academic journal published by PeerJ, Inc. It publishes mainly in the areas of Population and Gene. Its ISSN is 2167-8359. Over its lifetime, the journal has published 13,545 papers, which have received 239,030 citations. The journal is also known as PeerJ Life & Environment.

Topics: Population, Gene, Genome, Species richness, Phylogenetic tree

Journal Article

VSEARCH: a versatile open source tool for metagenomics

Torbjørn Rognes¹,², Tomas Flouri³,⁴, Ben Nichols⁵, Christopher Quince⁵,⁶, Frédéric Mahé⁷

Oslo University Hospital¹, University of Oslo², Karlsruhe Institute of Technology³, Heidelberg Institute for Theoretical Studies⁴, University of Glasgow⁵, University of Warwick⁶, Kaiserslautern University of Technology⁷

18 Oct 2016-PeerJ

TL;DR: VSEARCH is here shown to be more accurate than USEARCH when performing searching, clustering, chimera detection and subsampling, while on a par with USEARCH for paired-ends read merging and dereplication.

Abstract: Background: VSEARCH is an open source and free of charge multithreaded 64-bit tool for processing and preparing metagenomics, genomics and population genomics nucleotide sequence data. It is designed as an alternative to the widely used USEARCH tool (Edgar, 2010) for which the source code is not publicly available, algorithm details are only rudimentarily described, and only a memory-confined 32-bit version is freely available for academic use. Methods: When searching nucleotide sequences, VSEARCH uses a fast heuristic based on words shared by the query and target sequences in order to quickly identify similar sequences; a similar strategy is probably used in USEARCH. VSEARCH then performs optimal global sequence alignment of the query against potential target sequences, using full dynamic programming instead of the seed-and-extend heuristic used by USEARCH. Pairwise alignments are computed in parallel using vectorisation and multiple threads. Results: VSEARCH includes most commands for analysing nucleotide sequences available in USEARCH version 7 and several of those available in USEARCH version 8, including searching (exact or based on global alignment), clustering by similarity (using length pre-sorting, abundance pre-sorting or a user-defined order), chimera detection (reference-based or de novo), dereplication (full length or prefix), pairwise alignment, reverse complementation, sorting, and subsampling. VSEARCH also includes commands for FASTQ file processing, i.e., format detection, filtering, read quality statistics, and merging of paired reads. Furthermore, VSEARCH extends functionality with several new commands and improvements, including shuffling, rereplication, masking of low-complexity sequences with the well-known DUST algorithm, a choice among different similarity definitions, and FASTQ file format conversion. VSEARCH is here shown to be more accurate than USEARCH when performing searching, clustering, chimera detection and subsampling, while on a par with USEARCH for paired-ends read merging. VSEARCH is slower than USEARCH when performing clustering and chimera detection, but significantly faster when performing paired-end reads merging and dereplication. VSEARCH is available at https://github.com/torognes/vsearch under either the BSD 2-clause license or the GNU General Public License version 3.0. Discussion: VSEARCH has been shown to be a fast, accurate and full-fledged alternative to USEARCH. A free and open-source versatile tool for sequence analysis is now available to the metagenomics community.
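
The Methods passage above describes a two-stage search: a fast prefilter based on words shared by query and target, followed by optimal global alignment using full dynamic programming. The following is a minimal Python sketch of that general idea, not VSEARCH's implementation. The word length `k`, the scoring values, the linear gap penalty, the candidate limit, and all function names are illustrative assumptions, and the vectorisation and multithreading mentioned in the abstract are left out.

```python
def kmers(seq, k):
    """Return the set of distinct length-k words in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_targets(query, targets, k=8, max_candidates=2):
    """Prefilter: rank targets by the number of distinct words shared with the query.

    `targets` maps a name to a sequence. Targets sharing no word are dropped.
    (Assumed heuristic for illustration only.)
    """
    query_words = kmers(query, k)
    scored = []
    for name, seq in targets.items():
        shared = len(query_words & kmers(seq, k))
        if shared > 0:
            scored.append((shared, name))
    # Most shared words first; ties broken by name for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:max_candidates]]


def global_align_score(a, b, match=2, mismatch=-4, gap=-5):
    """Optimal global alignment score (Needleman-Wunsch, linear gap penalty).

    Full dynamic programming over the whole matrix, kept one row at a time.
    The scoring values are assumptions chosen for the example.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def search(query, targets, k=8):
    """Return the name of the best-scoring candidate, or None if none pass the prefilter."""
    candidates = rank_targets(query, targets, k)
    if not candidates:
        return None
    return max(candidates, key=lambda name: global_align_score(query, targets[name]))


if __name__ == "__main__":
    db = {"t1": "ACGTACGTTGCA", "t2": "ACGTACGTAGCA", "t3": "TTTTTTTTTTTT"}
    print(search("ACGTACGTTGCA", db, k=4))
```

The prefilter keeps the expensive quadratic-time alignment away from targets that share no words with the query; only the few top-ranked candidates are aligned in full.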

5,850 citations

Journal Article

scikit-image: Image processing in Python

Stefan van der Walt¹, Johannes L. Schonberger², Juan Nunez-Iglesias³, François Boulogne⁴, Joshua D. Warner⁵, Neil Yager, Emmanuelle Gouillart⁶, Tony S. Yu

Stellenbosch University¹, University of North Carolina at Chapel Hill², Victorian Life Sciences Computation Initiative³, Princeton University⁴, Mayo Clinic⁵, Saint-Gobain⁶

19 Jun 2014-PeerJ

TL;DR: The advantages of open source to achieve the goals of the scikit-image library are highlighted, and several real-world image processing applications that use scikit-image are showcased.

Abstract: scikit-image is an image processing library that implements algorithms and utilities for use in research, education and industry applications. It is released under the liberal Modified BSD open source license, provides a well-documented API in the Python programming language, and is developed by an active, international team of collaborators. In this paper we highlight the advantages of open source to achieve the goals of the scikit-image library, and we showcase several real-world image processing applications that use scikit-image. More information can be found on the project homepage, http://scikit-image.org.

3,903 citations

Journal Article

Probabilistic Programming in Python using PyMC3

John Salvatier, Thomas V. Wiecki, Christopher Fonnesbeck¹

Vanderbilt University¹

06 Apr 2016-PeerJ

TL;DR: This paper is a tutorial-style introduction to PyMC3, a new open source Probabilistic Programming framework written in Python that uses Theano to compute gradients via automatic differentiation as well as compile probabilistic programs on-the-fly to C for increased speed.

Abstract: Probabilistic Programming allows for automatic Bayesian inference on user-defined probabilistic models. Recent advances in Markov chain Monte Carlo (MCMC) sampling allow inference on increasingly complex models. This class of MCMC, known as Hamiltonian Monte Carlo, requires gradient information which is often not readily available. PyMC3 is a new open source Probabilistic Programming framework written in Python that uses Theano to compute gradients via automatic differentiation as well as compile probabilistic programs on-the-fly to C for increased speed. Contrary to other Probabilistic Programming languages, PyMC3 allows model specification directly in Python code. The lack of a domain specific language allows for great flexibility and direct interaction with the model. This paper is a tutorial-style introduction to this software package.
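
To illustrate why Hamiltonian Monte Carlo needs gradient information, here is a minimal standard-library sketch of the sampler for a one-dimensional standard normal target. It does not use PyMC3 or Theano: the gradient is written by hand rather than obtained by automatic differentiation, and the target density, step size, number of leapfrog steps, and function names are all assumptions made for the example.

```python
import math
import random


def neg_log_density(x):
    """Potential energy U(x) = -log p(x) for a standard normal, up to a constant."""
    return 0.5 * x * x


def grad_neg_log_density(x):
    """Hand-coded gradient dU/dx. A framework would derive this automatically."""
    return x


def leapfrog(x, p, step, n_steps):
    """Simulate Hamiltonian dynamics with the leapfrog integrator."""
    p -= 0.5 * step * grad_neg_log_density(x)      # initial half step for momentum
    for i in range(n_steps):
        x += step * p                              # full step for position
        if i < n_steps - 1:
            p -= step * grad_neg_log_density(x)    # full step for momentum
    p -= 0.5 * step * grad_neg_log_density(x)      # final half step for momentum
    return x, p


def hmc_sample(n_samples, step=0.2, n_steps=10, seed=0):
    """Draw samples with basic HMC and a Metropolis accept/reject correction."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_samples):
        p0 = rng.gauss(0.0, 1.0)                   # resample momentum
        x_new, p_new = leapfrog(x, p0, step, n_steps)
        h_old = neg_log_density(x) + 0.5 * p0 * p0
        h_new = neg_log_density(x_new) + 0.5 * p_new * p_new
        # Accept with probability min(1, exp(h_old - h_new)).
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples


if __name__ == "__main__":
    draws = hmc_sample(5000)
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / len(draws)
    print(f"mean={mean:.3f} var={var:.3f}")
```

Every leapfrog step calls `grad_neg_log_density`, which is the gradient requirement the abstract refers to; for realistic models that function is impractical to write by hand, hence the appeal of automatic differentiation.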

1,969 citations

Journal Article

Poppr: an R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction.

Zhian N. Kamvar¹, Javier F. Tabima¹, Niklaus J. Grünwald¹

Oregon State University¹

04 Mar 2014-PeerJ

TL;DR: The R package poppr provides tools for analysis of data from admixed, clonal, mixed, and/or sexual populations; its functions for genotypic diversity and clone censoring are specific to clonal populations.

Abstract: Many microbial, fungal, or oomycete populations violate assumptions for population genetic analysis because these populations are clonal, admixed, partially clonal, and/or sexual. Furthermore, few tools exist that are specifically designed for analyzing data from clonal populations, making analysis difficult and haphazard. We developed the R package poppr providing unique tools for analysis of data from admixed, clonal, mixed, and/or sexual populations. Currently, poppr can be used for dominant/codominant and haploid/diploid genetic data. Data can be imported from several formats including GenAlEx formatted text files and can be analyzed on a user-defined hierarchy that includes unlimited levels of subpopulation structure and clone censoring. New functions include calculation of Bruvo’s distance for microsatellites, batch-analysis of the index of association with several indices of genotypic diversity, and graphing including dendrograms with bootstrap support and minimum spanning networks. While functions for genotypic diversity and clone censoring are specific for clonal populations, several functions found in poppr are also valuable to analysis of any populations. A manual with documentation and examples is provided. Poppr is open source and major releases are available on CRAN: http://cran.r-project.org/package=poppr. More supporting documentation and tutorials can be found under ‘resources’ at: http://grunwaldlab.cgrb.oregonstate.edu/.
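
The abstract mentions genotypic diversity indices and clone censoring without spelling them out. The sketch below shows what such calculations can look like, written in Python rather than R and not using poppr's API. The choice of indices (Shannon-Wiener H, Stoddart and Taylor's G, Simpson's lambda), the "keep the first individual per genotype" censoring rule, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def mlg_counts(genotypes):
    """Count how many individuals share each multilocus genotype (MLG)."""
    return Counter(tuple(g) for g in genotypes)


def diversity_indices(genotypes):
    """Compute simple genotypic diversity indices from MLG frequencies.

    H      = -sum(p_i * ln p_i)   (Shannon-Wiener index)
    G      = 1 / sum(p_i ** 2)    (Stoddart and Taylor's index)
    lambda = 1 - sum(p_i ** 2)    (Simpson's index)
    """
    counts = mlg_counts(genotypes)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("at least one genotype is required")
    freqs = [c / n for c in counts.values()]
    sum_sq = sum(p * p for p in freqs)
    return {
        "N": n,
        "MLG": len(counts),
        "H": -sum(p * math.log(p) for p in freqs),
        "G": 1.0 / sum_sq,
        "lambda": 1.0 - sum_sq,
    }


def clone_censor(genotypes):
    """Keep only the first individual observed for each MLG (assumed rule)."""
    seen = set()
    kept = []
    for g in genotypes:
        key = tuple(g)
        if key not in seen:
            seen.add(key)
            kept.append(g)
    return kept


if __name__ == "__main__":
    # Four individuals typed at two loci; three of them are clones.
    sample = [(1, 2), (1, 2), (1, 2), (3, 4)]
    print(diversity_indices(sample))
    print(diversity_indices(clone_censor(sample)))
```

Comparing the indices before and after censoring shows how strongly repeated clones depress apparent diversity: once each genotype is counted once, G equals the number of distinct genotypes.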

1,942 citations

Journal Article

MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities

Dongwan D. Kang¹, Jeff Froula¹,², Rob Egan¹,², Zhong Wang¹

Lawrence Berkeley National Laboratory¹, Joint Genome Institute²

27 Aug 2015-PeerJ

TL;DR: MetaBAT integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency for accurate metagenome binning, and automatically forms hundreds of high quality genome bins on a very large assembly consisting of millions of contigs.

Abstract: Grouping large genomic fragments assembled from shotgun metagenomic sequences to deconvolute complex microbial communities, or metagenome binning, enables the study of individual organisms and their interactions. Because of the complex nature of these communities, existing metagenome binning methods often miss a large number of microbial species. In addition, most of the tools are not scalable to large datasets. Here we introduce automated software called MetaBAT that integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency for accurate metagenome binning. MetaBAT outperforms alternative methods in accuracy and computational efficiency on both synthetic and real metagenome datasets. It automatically forms hundreds of high quality genome bins on a very large assembly consisting of millions of contigs in a matter of hours on a single node. MetaBAT is open source software and available at https://bitbucket.org/berkeleylab/metabat.
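
One of the two signals named in the abstract, tetranucleotide frequency, can be sketched as follows. This only computes a raw frequency vector per contig and compares two vectors with plain Euclidean distance as a stand-in. It is not MetaBAT's empirical probabilistic distance, it ignores the abundance signal entirely, and the function names and the handling of ambiguous bases are assumptions.

```python
import math
from itertools import product

# All 256 tetranucleotides over the alphabet A, C, G, T, in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]
INDEX = {tetramer: i for i, tetramer in enumerate(TETRAMERS)}


def tetranucleotide_frequency(seq):
    """Return a 256-element vector of tetranucleotide frequencies for seq.

    Windows containing characters other than A, C, G, T (for example N)
    are skipped. An all-zero vector is returned if no window is valid.
    """
    counts = [0] * len(TETRAMERS)
    total = 0
    seq = seq.upper()
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [c / total for c in counts]


def euclidean(u, v):
    """Plain Euclidean distance, used here only as a simple stand-in."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    contig_a = "ACGTACGTACGTACGT"
    contig_b = "ACGTACGTACGTACGA"
    contig_c = "GGGGCCCCGGGGCCCC"
    tnf_a = tetranucleotide_frequency(contig_a)
    tnf_b = tetranucleotide_frequency(contig_b)
    tnf_c = tetranucleotide_frequency(contig_c)
    # Contigs with similar composition should be closer than dissimilar ones.
    print(euclidean(tnf_a, tnf_b) < euclidean(tnf_a, tnf_c))
```

Contigs from the same genome tend to have similar composition, so a binning tool can use such vectors as one line of evidence when deciding which contigs belong together.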

1,406 citations

Performance metrics

Papers: 13,545
Citations: 239,030

No. of papers from the Journal in previous years
Year	Papers
2023	95
2022	6
2021	2,405
2020	2,408
2019	2,226
2018	1,967