Journal ArticleDOI

In Silico Detection and Typing of Plasmids using PlasmidFinder and Plasmid Multilocus Sequence Typing

TL;DR: Two easy-to-use Web tools for in silico detection and characterization of whole-genome sequence (WGS) and whole-plasmid sequence data from members of the family Enterobacteriaceae are designed and developed.
Abstract: In the work presented here, we designed and developed two easy-to-use Web tools for in silico detection and characterization of whole-genome sequence (WGS) and whole-plasmid sequence data from members of the family Enterobacteriaceae. These tools will facilitate bacterial typing based on draft genomes of multidrug-resistant Enterobacteriaceae species by the rapid detection of known plasmid types. Replicon sequences from 559 fully sequenced plasmids associated with the family Enterobacteriaceae in the NCBI nucleotide database were collected to build a consensus database for integration into a Web tool called PlasmidFinder that can be used for replicon sequence analysis of raw, contig group, or completely assembled and closed plasmid sequencing data. The PlasmidFinder database currently consists of 116 replicon sequences that match, with at least 80% nucleotide identity, all replicon sequences identified in the 559 fully sequenced plasmids. For plasmid multilocus sequence typing (pMLST) analysis, a database that is updated weekly was generated from www.pubmlst.org and integrated into a Web tool called pMLST. Both databases were evaluated using draft genomes from a collection of Salmonella enterica serovar Typhimurium isolates. PlasmidFinder identified a total of 103 replicons and between zero and five different plasmid replicons within each of 49 S. Typhimurium draft genomes tested. The pMLST Web tool was able to subtype genomic sequencing data of plasmids, revealing both known plasmid sequence types (STs) and new alleles and ST variants. In conclusion, testing of the two Web tools using both fully assembled plasmid sequences and WGS-generated draft genomes showed them to be able to detect a broad variety of plasmids that are often associated with antimicrobial resistance in clinically relevant bacterial pathogens.
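The identity-threshold idea in the abstract can be illustrated with a minimal sketch. It is not the PlasmidFinder implementation: the `Hit` record, the `filter_replicon_hits` function, and the coverage cutoff are hypothetical names and values chosen for illustration, and only the 80% identity figure comes from the abstract. The sketch assumes alignment hits against a replicon database have already been computed by some external aligner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One alignment of a query contig against a replicon reference (hypothetical record)."""
    replicon: str      # name of the matched replicon reference
    identity: float    # percent nucleotide identity of the alignment
    align_len: int     # number of aligned reference positions
    ref_len: int       # full length of the replicon reference


def filter_replicon_hits(hits, min_identity=80.0, min_coverage=0.6):
    """Keep the best hit per replicon that passes both thresholds.

    min_identity mirrors the 80% figure quoted in the abstract;
    min_coverage is an assumed value, not taken from the source.
    """
    best = {}
    for hit in hits:
        coverage = hit.align_len / hit.ref_len
        if hit.identity < min_identity or coverage < min_coverage:
            continue  # too divergent or too short to call the replicon
        previous = best.get(hit.replicon)
        if previous is None or hit.identity > previous.identity:
            best[hit.replicon] = hit
    # Sort by replicon name so the output order is deterministic.
    return sorted(best.values(), key=lambda h: h.replicon)


if __name__ == "__main__":
    example_hits = [
        Hit("IncFIB", 98.5, 680, 682),
        Hit("IncI1", 79.0, 140, 142),    # below the identity threshold
        Hit("IncFIB", 95.0, 682, 682),   # weaker duplicate of IncFIB
        Hit("ColRNAI", 90.0, 50, 130),   # below the assumed coverage cutoff
    ]
    for hit in filter_replicon_hits(example_hits):
        print(hit.replicon, hit.identity)
```

Keeping only the best hit per replicon avoids reporting the same replicon several times when a draft genome contains overlapping contigs; whether the real tool deduplicates this way is not stated in the abstract.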


Citations
Journal ArticleDOI
TL;DR: Light is shed on the correlation between population mobility and the importation of drug-resistant pathogens and the manifold mechanisms underlying antibiotic resistance in A. baumannii and on the role of refugees in the transmission of antimicrobial resistance in developing countries.

24 citations

Posted ContentDOI
12 Aug 2020-bioRxiv
TL;DR: SCAPP (Sequence Contents-Aware Plasmid Peeler) is an easy-to-use Python package that enables the assembly of full plasmid sequences from metagenomic samples and outperformed existing metagenomic plasmid assemblers in most cases.
Abstract: Background: Metagenomic sequencing has led to the identification and assembly of many new bacterial genome sequences. These bacteria often contain plasmids: usually small, circular double-stranded DNA molecules that may transfer across bacterial species and confer antibiotic resistance. These plasmids are generally less studied and understood than their bacterial hosts. Part of the reason for this is insufficient computational tools enabling the analysis of plasmids in metagenomic samples. Results: We developed SCAPP (Sequence Contents-Aware Plasmid Peeler) - an algorithm and tool to assemble plasmid sequences from metagenomic sequencing. SCAPP builds on some key ideas from the Recycler algorithm while improving plasmid assemblies by integrating biological knowledge about plasmids. We compared the performance of SCAPP to Recycler and metaplasmidSPAdes on simulated metagenomes, real human gut microbiome samples, and a human gut plasmidome dataset that we generated. We also created plasmidome and metagenome data from the same cow rumen sample and used the parallel sequencing data to create a novel assessment procedure. Overall, SCAPP outperformed Recycler and metaplasmidSPAdes across this wide range of datasets. Conclusions: SCAPP is an easy-to-use Python package that enables the assembly of full plasmid sequences from metagenomic samples. It outperformed existing metagenomic plasmid assemblers in most cases, and assembled novel and clinically relevant plasmids in samples we generated such as a human gut plasmidome. SCAPP is open-source software available from: https://github.com/Shamir-Lab/SCAPP.

24 citations


Cites methods from "In Silico Detection and Typing of P..."

  • ...There are a number of tools that can be used to detect plasmid sequences including PlasmidFinder [2], cBar [3], gPlas [4], PlasFlow [5], and others....

    [...]

Journal ArticleDOI
01 Sep 2017-Apmis
TL;DR: The detection of the plasmid borne mcr‐1 gene conferring colistin resistance in an extended‐spectrum β‐lactamase (ESBL) producing Escherichia coli ST10 strain retrieved from seawater at a public beach in Norway illustrates that E. coli strains carrying plasmid‐mediated colistin resistance genes have also reached areas where this drug is hardly used at all.
Abstract: We hereby report the detection of the plasmid borne mcr-1 gene conferring colistin resistance in an extended-spectrum β-lactamase (ESBL) producing Escherichia coli ST10 strain retrieved from seawater at a public beach in Norway. The sample was collected in September 2010 and was investigated by whole-genome sequencing in 2016. This report illustrates that E. coli strains carrying plasmid-mediated colistin resistance genes have also reached areas where this drug is hardly used at all. Surveillance of colistin resistance in environmental, veterinary, and human strains is warranted also in countries where colistin resistance is rare in clinical settings.

24 citations

Journal ArticleDOI
27 Oct 2020-Mbio
TL;DR: The genomes of 45 Pseudomonas aeruginosa lineages evolving in the lungs of cystic fibrosis patients were analyzed to identify genes that are lost or acquired during the first years of infection; a notable proportion of such genes are associated with virulence, a trait previously shown to be important for adaptation.
Abstract: Genome analyses have documented that there are differences in gene repertoire between evolutionary distant lineages of the same bacterial species; however, less is known about microevolutionary dynamics of gene loss and acquisition within bacterial lineages as they evolve over years. Here, we analyzed the genomes of 45 Pseudomonas aeruginosa lineages evolving in the lungs of cystic fibrosis (CF) patients to identify genes that are lost or acquired during the first years of infection. On average, lineage genome content changed by 88 genes (range, 0 to 473). Genes were more often lost than acquired, and prophage genes were more variable than bacterial genes. We identified convergent loss or acquisition of the same genes across lineages, suggesting selection for loss and acquisition of certain genes in the host environment. We found that a notable proportion of such genes are associated with virulence, a trait previously shown to be important for adaptation. Furthermore, we also compared the genomes across lineages to show that the within-lineage variable genes (i.e., genes that had been lost or acquired during the infection) often belonged to genomic content not shared across all lineages. In sum, our analysis adds to the knowledge on the pace and drivers of gene loss and acquisition in bacteria evolving over years in a human host environment and provides a basis to further understand how gene loss and acquisition play roles in lineage differentiation and host adaptation. IMPORTANCE: Bacterial airway infections, predominantly caused by P. aeruginosa, are a major cause of mortality and morbidity of CF patients. While short insertions and deletions as well as point mutations occurring during infection are well studied, there is a lack of understanding of how gene loss and acquisition play roles in bacterial adaptation to the human airways. Here, we investigated P. aeruginosa within-host evolution with regard to gene loss and acquisition. We show that during long-term infection P. aeruginosa genomes tend to lose genes, in particular, genes related to virulence. This adaptive strategy allows reduction of the genome size and evasion of the host's immune response. This knowledge is crucial to understand the basic mutational steps that, on the timescale of years, diversify lineages and adds to the identification of bacterial genetic determinants that have implications for CF disease.

24 citations


Cites methods from "In Silico Detection and Typing of P..."

  • ...In contrast, plasmid genes were not identified to be lost or acquired in any lineage (the PlasmidFinder database was used to define plasmid genes)....

    [...]

  • ...Finally, the absence of Pseudomonas plasmid annotations in the PlasmidFinder database could have led to a low number of identified plasmids among our isolates....

    [...]

  • ...The PlasmidFinder (61) database (263 sequences; retrieved 21 March 2018) was used for plasmid gene identification, the VFDB database (2,597 genes; retrieved 21 March 2018) (62) was used for virulence gene identification, the Resfinder database (2,280 genes; retrieved 21 March 2018) (63) was used for resistance gene identification, and the ACLAME database (54,945 genes; retrieved 7 June 2018) (64) was used for prophage origin sequence identification....

    [...]

Journal ArticleDOI
TL;DR: Investigating the genomic dynamics of a 10 year outbreak of blaIMP-4-containing organisms in a burns unit in a hospital in Sydney, Australia found that genetic backgrounds disseminating blaIMP-4 can persist, diversify and evolve amongst both human and environmental reservoirs during a prolonged outbreak despite intensive prevention efforts.
Abstract: Background: Hospital outbreaks of carbapenemase-producing organisms, such as blaIMP-4-containing organisms, are an increasing threat to patient safety. Objectives: To investigate the genomic dynamics of a 10 year (2006–15) outbreak of blaIMP-4-containing organisms in a burns unit in a hospital in Sydney, Australia. Methods: All carbapenem-non-susceptible or MDR clinical isolates (2006–15) and a random selection of equivalent or ESBL-producing environmental isolates (2012–15) were sequenced [short-read (Illumina), long-read (Oxford Nanopore Technology)]. Sequence data were used to assess genetic relatedness of isolates (Mash; mapping and recombination-adjusted phylogenies), perform in silico typing (MLST, resistance genes and plasmid replicons) and reconstruct a subset of blaIMP plasmids for comparative plasmid genomics. Results: A total of 46/58 clinical and 67/96 environmental isolates contained blaIMP-4. All blaIMP-4-positive organisms contained five or more other resistance genes. Enterobacter cloacae was the predominant organism, with 12 other species mainly found in either the environment or patients, some persisting despite several cleaning methods. On phylogenetic analysis there were three genetic clusters of E. cloacae containing both clinical and environmental isolates, and an additional four clusters restricted to either reservoir. blaIMP-4 was mostly found as part of a cassette array (blaIMP-4-qacG2-aacA4-catB3) in a class 1 integron within a previously described IncM2 plasmid (pEl1573), with almost complete conservation of this cassette across the species over the 10 years. Several other plasmids were also implicated, including an IncF plasmid backbone not previously widely described in association with blaIMP-4. Conclusions: Genetic backgrounds disseminating blaIMP-4 can persist, diversify and evolve amongst both human and environmental reservoirs during a prolonged outbreak despite intensive prevention efforts.

24 citations

References
Journal ArticleDOI
TL;DR: A web server providing a convenient way of identifying acquired antimicrobial resistance genes in completely sequenced isolates was created, and the method was evaluated on WGS chromosomes and plasmids of 30 isolates.
Abstract: Objectives Identification of antimicrobial resistance genes is important for understanding the underlying mechanisms and the epidemiology of antimicrobial resistance. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available in routine diagnostic laboratories and is anticipated to substitute traditional methods for resistance gene identification. Thus, the current challenge is to extract the relevant information from the large amount of generated data.

3,956 citations


"In Silico Detection and Typing of P..." refers methods in this paper

  • ...To extract the relevant information from the large amount of data generated, a Web-based tool, ResFinder, for the identification of acquired or intrinsically present antimicrobial resistance genes in whole-genome data was recently developed (15)....

    [...]

Journal ArticleDOI
TL;DR: NCBI’s Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints.
Abstract: NCBI's Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints. CDD includes manually curated domain models that make use of protein 3D structure to refine domain models and provide insights into sequence/structure/function relationships. Manually curated models are organized hierarchically if they describe domain families that are clearly related by common descent. As CDD also imports domain family models from a variety of external sources, it is a partially redundant collection. To simplify protein annotation, redundant models and models describing homologous families are clustered into superfamilies. By default, domain footprints are annotated with the corresponding superfamily designation, on top of which specific annotation may indicate high-confidence assignment of family membership. Pre-computed domain annotation is available for proteins in the Entrez/Protein dataset, and a novel interface, Batch CD-Search, allows the computation and download of annotation for large sets of protein queries. CDD can be accessed via http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.

2,934 citations


"In Silico Detection and Typing of P..." refers background in this paper

  • ...In particular, the replicase proteins showing the pfam02387 or pfam01051 conserved domains were assigned to the FII and FIB groups, respectively (31)....

    [...]

Journal ArticleDOI
TL;DR: Results indicated that the inc/rep PCR method demonstrates high specificity and sensitivity in detecting replicons on reference plasmids and also revealed the presence of recurrent and common plasmids in epidemiologically unrelated Salmonella isolates of different serotypes.

2,163 citations


"In Silico Detection and Typing of P..." refers methods in this paper

  • ...A collection of 24 previously characterized and fully [...] [FIG 1: Numbers of fully sequenced plasmids (y axis) classified into incompatibility groups occurring in the different bacterial species of the Enterobacteriaceae family.]...

    [...]

  • ...Since 2005, a PCR-based replicon typing (PBRT) scheme has been available that targets in multiplex PCRs the replicons of the major plasmid families occurring in members of the family Enterobacteriaceae (2)....

    [...]

  • ...Here, we present two free, easy-to-use Web tools, PlasmidFinder and pMLST, to analyze and classify plasmids from bacterial species of the family Enterobacteriaceae....

    [...]

  • ...Here, we describe the design of two new easy-to-use Web tools useful for the rapid identification of plasmids in Enterobacteriaceae species that are of interest for epidemiological and clinical microbiology investigations of the plasmid-associated spread of antimicrobial resistance....

    [...]

  • ...This method was initially developed to detect the replicons of plasmids belonging to the 18 major incompatibility (Inc) groups of Enterobacteriaceae species (3)....

    [...]

Journal ArticleDOI
TL;DR: The Bacterial Isolate Genome Sequence Database (BIGSDB) represents a freely available resource that will assist the broader community in the elucidation of the structure and function of bacteria by means of a population genomics approach.
Abstract: The opportunities for bacterial population genomics that are being realised by the application of parallel nucleotide sequencing require novel bioinformatics platforms. These must be capable of the storage, retrieval, and analysis of linked phenotypic and genotypic information in an accessible, scalable and computationally efficient manner. The Bacterial Isolate Genome Sequence Database (BIGSDB) is a scalable, open source, web-accessible database system that meets these needs, enabling phenotype and sequence data, which can range from a single sequence read to whole genome data, to be efficiently linked for a limitless number of bacterial specimens. The system builds on the widely used mlstdbNet software, developed for the storage and distribution of multilocus sequence typing (MLST) data, and incorporates the capacity to define and identify any number of loci and genetic variants at those loci within the stored nucleotide sequences. These loci can be further organised into 'schemes' for isolate characterisation or for evolutionary or functional analyses. Isolates and loci can be indexed by multiple names and any number of alternative schemes can be accommodated, enabling cross-referencing of different studies and approaches. LIMS functionality of the software enables linkage to and organisation of laboratory samples. The data are easily linked to external databases and fine-grained authentication of access permits multiple users to participate in community annotation by setting up or contributing to different schemes within the database. Some of the applications of BIGSDB are illustrated with the genera Neisseria and Streptococcus. The BIGSDB source code and documentation are available at http://pubmlst.org/software/database/bigsdb/. Genomic data can be used to characterise bacterial isolates in many different ways but it can also be efficiently exploited for evolutionary or functional studies. BIGSDB represents a freely available resource that will assist the broader community in the elucidation of the structure and function of bacteria by means of a population genomics approach.

1,943 citations

Journal ArticleDOI
TL;DR: A Web-based method for MLST of 66 bacterial species based on whole-genome sequencing data that enables investigators to determine the sequence types of their isolates on the basis of WGS data.
Abstract: Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the “gold standard” of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available to scientists and routine diagnostic laboratories. Currently, the cost is below that of traditional MLST. The new challenges will be how to extract the relevant information from the large amount of data so as to allow for comparison over time and between laboratories. Ideally, this information should also allow for comparison to historical data. We developed a Web-based method for MLST of 66 bacterial species based on WGS data. As input, the method uses short sequence reads from four sequencing platforms or preassembled genomes. Updates from the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types of their isolates on the basis of WGS data. This method is publicly available at www.cbs.dtu.dk/services/MLST.
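The final step described in this abstract, determining the sequence type from the combination of alleles, can be sketched as a simple profile lookup. This is an illustrative sketch only, not the published method: the function name `assign_sequence_type`, the locus names, and the ST labels are all hypothetical placeholders, and the BLAST-based allele ranking that precedes this step is assumed to have already produced one allele number per locus.

```python
def assign_sequence_type(profiles, observed):
    """Return the ST whose allele profile equals the observed one, else None.

    profiles: dict mapping an ST label to a dict of {locus: allele number}
    observed: dict of {locus: allele number} called for one isolate
    A None result stands for an allele combination absent from the scheme,
    i.e. a candidate new ST.
    """
    for st, alleles in profiles.items():
        if alleles == observed:  # every locus must carry the same allele
            return st
    return None


if __name__ == "__main__":
    # Hypothetical two-locus scheme; real schemes use more loci.
    scheme = {
        "ST1": {"locusA": 1, "locusB": 2},
        "ST2": {"locusA": 1, "locusB": 3},
    }
    print(assign_sequence_type(scheme, {"locusA": 1, "locusB": 3}))
    print(assign_sequence_type(scheme, {"locusA": 4, "locusB": 3}))
```

An exact-match lookup keeps known and unknown combinations clearly separated, which matches the distinction the PlasmidFinder/pMLST abstract draws between known STs and new alleles or ST variants.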

1,620 citations


"In Silico Detection and Typing of P..." refers methods in this paper

  • ...If raw sequence reads are uploaded, they are first assembled (after the sequencing platform is given by the user) as described previously (16)....

    [...]