Journal ArticleDOI

In Silico Detection and Typing of Plasmids using PlasmidFinder and Plasmid Multilocus Sequence Typing

TL;DR: Two easy-to-use Web tools for in silico detection and characterization of whole-genome sequence (WGS) and whole-plasmid sequence data from members of the family Enterobacteriaceae are designed and developed.
Abstract: In the work presented here, we designed and developed two easy-to-use Web tools for in silico detection and characterization of whole-genome sequence (WGS) and whole-plasmid sequence data from members of the family Enterobacteriaceae. These tools will facilitate bacterial typing based on draft genomes of multidrug-resistant Enterobacteriaceae species by the rapid detection of known plasmid types. Replicon sequences from 559 fully sequenced plasmids associated with the family Enterobacteriaceae in the NCBI nucleotide database were collected to build a consensus database for integration into a Web tool called PlasmidFinder that can be used for replicon sequence analysis of raw, contig group, or completely assembled and closed plasmid sequencing data. The PlasmidFinder database currently consists of 116 replicon sequences that match, with at least 80% nucleotide identity, all replicon sequences identified in the 559 fully sequenced plasmids. For plasmid multilocus sequence typing (pMLST) analysis, a database that is updated weekly was generated from www.pubmlst.org and integrated into a Web tool called pMLST. Both databases were evaluated using draft genomes from a collection of Salmonella enterica serovar Typhimurium isolates. PlasmidFinder identified a total of 103 replicons and between zero and five different plasmid replicons within each of 49 S. Typhimurium draft genomes tested. The pMLST Web tool was able to subtype genomic sequencing data of plasmids, revealing both known plasmid sequence types (STs) and new alleles and ST variants. In conclusion, testing of the two Web tools using both fully assembled plasmid sequences and WGS-generated draft genomes showed them to be able to detect a broad variety of plasmids that are often associated with antimicrobial resistance in clinically relevant bacterial pathogens.
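
The 80% nucleotide identity criterion mentioned in the abstract can be illustrated with a minimal sketch. This is not PlasmidFinder's implementation, since the abstract does not describe the search algorithm; it assumes a simple ungapped sliding-window comparison, and the function names, replicon names, and toy sequences are hypothetical.

```python
from typing import Dict, List, Tuple


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def replicon_hits(
    query: str,
    replicons: Dict[str, str],
    threshold: float = 80.0,  # identity cutoff quoted in the abstract
) -> List[Tuple[str, float]]:
    """Slide each replicon along the query and keep those whose best
    ungapped window reaches the identity threshold (assumed approach)."""
    hits = []
    for name, rep in replicons.items():
        best = 0.0
        # Windows exist only if the replicon is not longer than the query.
        for i in range(len(query) - len(rep) + 1):
            best = max(best, percent_identity(query[i:i + len(rep)], rep))
        if best >= threshold:
            hits.append((name, best))
    # Highest identity first.
    return sorted(hits, key=lambda h: h[1], reverse=True)


# Toy data, purely illustrative.
toy_replicons = {"repA_toy": "ACGTACGTAC", "repB_toy": "TTTTTTTTTT"}
toy_contig = "GGACGTACGTTCGG"
hits = replicon_hits(toy_contig, toy_replicons)
```

In this toy run only `repA_toy` passes the cutoff, because one window of the contig differs from it at a single position out of ten. A real search would need gapped alignment and a coverage criterion as well.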

Citations
Journal ArticleDOI
21 May 2021-PeerJ
TL;DR: ProkEvo as discussed by the authors is an automated, scalable, reproducible, and open-source framework for bacterial population genomics analyses using WGS data, which can be used for basic microbiology research, clinical microbiological diagnostics, and epidemiological surveillance.
Abstract: Whole Genome Sequence (WGS) data from bacterial species is used for a variety of applications ranging from basic microbiological research to diagnostics and epidemiological surveillance. The availability of WGS data from hundreds of thousands of individual isolates of individual microbial species poses a tremendous opportunity for discovery and hypothesis-generating research into ecology and evolution of these microorganisms. Flexibility, scalability, and user-friendliness of existing pipelines for population-scale inquiry, however, limit applications of systematic, population-scale approaches. Here, we present ProkEvo, an automated, scalable, reproducible, and open-source framework for bacterial population genomics analyses using WGS data. ProkEvo was specifically developed to achieve the following goals: (1) Automation and scaling of complex combinations of computational analyses for many thousands of bacterial genomes from inputs of raw Illumina paired-end sequence reads; (2) Use of workflow management systems (WMS) such as Pegasus WMS to ensure reproducibility, scalability, modularity, fault-tolerance, and robust file management throughout the process; (3) Use of high-performance and high-throughput computational platforms; (4) Generation of hierarchical-based population structure analysis based on combinations of multi-locus and Bayesian statistical approaches for classification for ecological and epidemiological inquiries; (5) Association of antimicrobial resistance (AMR) genes, putative virulence factors, and plasmids from curated databases with the hierarchically-related genotypic classifications; and (6) Production of pan-genome annotations and data compilation that can be utilized for downstream analysis such as identification of population-specific genomic signatures.
The scalability of ProkEvo was measured with two datasets comprising significantly different numbers of input genomes (one with ~2,400 genomes, and the second with ~23,000 genomes). Depending on the dataset and the computational platform used, the running time of ProkEvo varied from ~3 to 26 days. ProkEvo can be used with virtually any bacterial species, and the Pegasus WMS uniquely facilitates addition or removal of programs from the workflow or modification of options within them. To demonstrate versatility of the ProkEvo platform, we performed hierarchical-based population structure analyses from available genomes of three distinct pathogenic bacterial species as individual case studies. The specific case studies illustrate how hierarchical analyses of population structures, genotype frequencies, and distribution of specific gene functions can be integrated into an analysis. Collectively, our study shows that ProkEvo presents a practical, viable option for scalable, automated analyses of bacterial populations with direct applications for basic microbiology research, clinical microbiological diagnostics, and epidemiological surveillance.

4 citations

Journal ArticleDOI
TL;DR: In this article, a detailed molecular characterisation of mobile genetic elements (MGEs) and accessory genes could support and expand the current molecular typing of VREfm isolates sharing the same genetic background, enhancing the discriminatory power of the analysis.
Abstract: Background: Vancomycin-resistant Enterococcus faecium (VREfm) is a successful nosocomial pathogen. The current molecular method recommended in the Netherlands for VREfm typing is based on core genome multilocus sequence typing (cgMLST); however, the rapid emergence of specific VREfm lineages challenges distinguishing outbreak isolates solely based on their core genome. Here, we explored if a detailed molecular characterisation of mobile genetic elements (MGEs) and accessory genes could support and expand the current molecular typing of VREfm isolates sharing the same genetic background, enhancing the discriminatory power of the analysis. Materials/Methods: The genomes of 39 VREfm and three vancomycin-susceptible E. faecium (VSEfm) isolates belonging to ST117/CT24, as assessed by cgMLST, were retrospectively analysed. The isolates were collected from patients and environmental samples from 2011 to 2017, and their genomes were analysed using short-read sequencing. Pangenome analysis was performed on de novo assemblies, which were also screened for known predicted virulence factors, antimicrobial resistance genes, bacteriocins, and prophages. Two representative isolates were also sequenced using long-read sequencing, which allowed a detailed analysis of their plasmid content. Results: The cgMLST analysis showed that the isolates were closely related, with a minimal allelic difference of 10 between each cluster's closest related isolates. The vanB-carrying transposon Tn1549 was present in all VREfm isolates. However, in our data, we observed independent acquisitions of this transposon. The pangenome analysis revealed differences in the accessory genes related to prophages and bacteriocins content, whilst a similar profile was observed for known predicted virulence and resistance genes.
Conclusion: In the case of closely related isolates sharing a similar genetic background, a detailed analysis of MGEs and the integration point of the vanB-carrying transposon increases the discriminatory power compared with the use of cgMLST alone, thus enabling the identification of epidemiological links amongst hospitalised patients.

4 citations

Posted ContentDOI
17 Feb 2021-bioRxiv
TL;DR: This work compiled a dataset of over 2000 bacterial genomes harbouring the blaNDM gene including 112 new PacBio hybrid assemblies from clinical and livestock associated isolates across China and developed a novel computational approach to track structural variants in bacterial genomes, which correlated with plasmid backbones, bacterial host species and sampling locations.
Abstract: The mobile resistance gene blaNDM encodes the NDM enzyme which hydrolyses carbapenems, a class of antibiotics used to treat some of the most severe bacterial infections. blaNDM is globally distributed across a variety of Gram-negative bacteria on multiple plasmids, typically located within a highly recombining and transposon-rich genomic region. This complexity means the dynamics underlying the dissemination of blaNDM remain poorly resolved. In this work, we compile a dataset of over 6000 bacterial genomes harbouring the blaNDM gene including 104 newly generated PacBio hybrid assemblies from clinical and livestock associated isolates across China. We develop a novel computational approach to track structural variants surrounding blaNDM in bacterial genomes. This allows us to identify the prevalent genomic contexts of blaNDM and reconstruct the key mobile genetic elements and events in its global spread. We estimate that blaNDM emerged on a Tn125 transposon before 1985 but only reached a global prevalence around a decade after its first recorded observation in 2005. We find that the Tn125 transposon played an important role in early plasmid-mediated jumps of blaNDM but was overtaken by other elements in recent years including IS26-flanked pseudo-composite transposons and Tn3000. Lastly, we observe a notable correlation between plasmid backbones bearing blaNDM and the sampling location of isolates. This observation suggests that the dissemination of resistance genes is mainly driven by successive between-plasmid transposon jumps, with plasmid exchange much more restricted due to the adaptation of plasmids to specific bacterial hosts.

4 citations

Journal ArticleDOI
TL;DR: The use of colistin as a last resort antimicrobial is compromised by the emergence of resistant enterobacteria with acquired determinants like mcr genes, mutations that activate the PmrAB system and by still unknown mechanisms as discussed by the authors.
Abstract: The use of colistin as a last resort antimicrobial is compromised by the emergence of resistant enterobacteria with acquired determinants like mcr genes, mutations that activate the PmrAB system and by still unknown mechanisms. This work analyzed 74 E. coli isolates from healthy swine, turkey or bovine, characterizing their colistin resistance determinants. The mcr-1 gene, detected in 69 isolates, was the main determinant found, and 45% of these genes were carried by highly mobile plasmids. The remaining determinants were four strains lacking previously known resistance determinants and two with mcr-4 (one in addition to mcr-1), whose phenotypes were not transferred by conjugation. Although a fraction of isolates carrying mcr-1 or mcr-4 genes also presented missense polymorphisms in pmrA or pmrB, constitutive activation of PmrAB was not detected, in contrast to strains with mutations that confer colistin resistance. The expression of mcr genes negatively controls the transcription of the arnBCADTEF operon itself, a down-regulation that was also observed in the four isolates lacking known resistance determinants, three of them sharing the same macrorestriction and plasmid profiles. Genomic sequencing of one of these strains, isolated from a bovine in 2015, revealed an IncFII plasmid of 62.1 kb encoding an extra copy of the arnBCADTEF operon closely related to Kluyvera ascorbata homologs. This element, called pArnT1, was cured by ethidium bromide and the cells lost resistance to colistin in parallel. Furthermore, a susceptible E. coli strain acquired heteroresistance after transformation with pArnT1 or pBAD24 carrying the Kluyvera-like arnBCADTEF operon, revealing it as a new colistin resistance determinant.

4 citations

Journal ArticleDOI
TL;DR: In this paper, the clonality of Klebsiella pneumoniae isolates was examined in fecal samples taken from a healthy married couple and their pet animals from Sep. 2015 to Oct. 2016.

4 citations

References
Journal ArticleDOI
TL;DR: A web server providing a convenient way of identifying acquired antimicrobial resistance genes in completely sequenced isolates was created, and the method was evaluated on WGS chromosomes and plasmids of 30 isolates.
Abstract: Objectives: Identification of antimicrobial resistance genes is important for understanding the underlying mechanisms and the epidemiology of antimicrobial resistance. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available in routine diagnostic laboratories and is anticipated to substitute traditional methods for resistance gene identification. Thus, the current challenge is to extract the relevant information from the large amount of generated data.

3,956 citations


"In Silico Detection and Typing of P..." refers methods in this paper

  • ...To extract the relevant information from the large amount of data generated, a Web-based tool, ResFinder, for the identification of acquired or intrinsically present antimicrobial resistance genes in whole-genome data was recently developed (15)....

    [...]

Journal ArticleDOI
TL;DR: NCBI’s Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints.
Abstract: NCBI's Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints. CDD includes manually curated domain models that make use of protein 3D structure to refine domain models and provide insights into sequence/structure/function relationships. Manually curated models are organized hierarchically if they describe domain families that are clearly related by common descent. As CDD also imports domain family models from a variety of external sources, it is a partially redundant collection. To simplify protein annotation, redundant models and models describing homologous families are clustered into superfamilies. By default, domain footprints are annotated with the corresponding superfamily designation, on top of which specific annotation may indicate high-confidence assignment of family membership. Pre-computed domain annotation is available for proteins in the Entrez/Protein dataset, and a novel interface, Batch CD-Search, allows the computation and download of annotation for large sets of protein queries. CDD can be accessed via http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.

2,934 citations


"In Silico Detection and Typing of P..." refers background in this paper

  • ...In particular, the replicase proteins showing the pfam02387 or pfam01051 conserved domains were assigned to the FII and FIB groups, respectively (31)....

    [...]

Journal ArticleDOI
TL;DR: Results indicated that the inc/rep PCR method demonstrates high specificity and sensitivity in detecting replicons on reference plasmids and also revealed the presence of recurrent and common plasmids in epidemiologically unrelated Salmonella isolates of different serotypes.

2,163 citations


"In Silico Detection and Typing of P..." refers methods in this paper

  • ...A collection of 24 previously characterized and fully... [FIG 1: Numbers of fully sequenced plasmids (y axis) classified into incompatibility groups occurring in the different bacterial species of the Enterobacteriaceae family.]...

    [...]

  • ...Since 2005, a PCR-based replicon typing (PBRT) scheme has been available that targets in multiplex PCRs the replicons of the major plasmid families occurring in members of the family Enterobacteriaceae (2)....

    [...]

  • ...Here, we present two free, easy-to-use Web tools, PlasmidFinder and pMLST, to analyze and classify plasmids from bacterial species of the family Enterobacteriaceae....

    [...]

  • ...Here, we describe the design of two new easy-to-use Web tools useful for the rapid identification of plasmids in Enterobacteriaceae species that are of interest for epidemiological and clinical microbiology investigations of the plasmid-associated spread of antimicrobial resistance....

    [...]

  • ...This method was initially developed to detect the replicons of plasmids belonging to the 18 major incompatibility (Inc) groups of Enterobacteriaceae species (3)....

    [...]

Journal ArticleDOI
TL;DR: The Bacterial Isolate Genome Sequence Database (BIGSDB) represents a freely available resource that will assist the broader community in the elucidation of the structure and function of bacteria by means of a population genomics approach.
Abstract: The opportunities for bacterial population genomics that are being realised by the application of parallel nucleotide sequencing require novel bioinformatics platforms. These must be capable of the storage, retrieval, and analysis of linked phenotypic and genotypic information in an accessible, scalable and computationally efficient manner. The Bacterial Isolate Genome Sequence Database (BIGSDB) is a scalable, open source, web-accessible database system that meets these needs, enabling phenotype and sequence data, which can range from a single sequence read to whole genome data, to be efficiently linked for a limitless number of bacterial specimens. The system builds on the widely used mlstdbNet software, developed for the storage and distribution of multilocus sequence typing (MLST) data, and incorporates the capacity to define and identify any number of loci and genetic variants at those loci within the stored nucleotide sequences. These loci can be further organised into 'schemes' for isolate characterisation or for evolutionary or functional analyses. Isolates and loci can be indexed by multiple names and any number of alternative schemes can be accommodated, enabling cross-referencing of different studies and approaches. LIMS functionality of the software enables linkage to and organisation of laboratory samples. The data are easily linked to external databases and fine-grained authentication of access permits multiple users to participate in community annotation by setting up or contributing to different schemes within the database. Some of the applications of BIGSDB are illustrated with the genera Neisseria and Streptococcus. The BIGSDB source code and documentation are available at http://pubmlst.org/software/database/bigsdb/. Genomic data can be used to characterise bacterial isolates in many different ways but it can also be efficiently exploited for evolutionary or functional studies. BIGSDB represents a freely available resource that will assist the broader community in the elucidation of the structure and function of bacteria by means of a population genomics approach.

1,943 citations

Journal ArticleDOI
TL;DR: A Web-based method for MLST of 66 bacterial species based on whole-genome sequencing data that enables investigators to determine the sequence types of their isolates on the basis of WGS data.
Abstract: Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the “gold standard” of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available to scientists and routine diagnostic laboratories. Currently, the cost is below that of traditional MLST. The new challenges will be how to extract the relevant information from the large amount of data so as to allow for comparison over time and between laboratories. Ideally, this information should also allow for comparison to historical data. We developed a Web-based method for MLST of 66 bacterial species based on WGS data. As input, the method uses short sequence reads from four sequencing platforms or preassembled genomes. Updates from the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types of their isolates on the basis of WGS data. This method is publicly available at www.cbs.dtu.dk/services/MLST.
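
The last step described in this abstract, determining the sequence type from the combination of alleles identified, amounts to a table lookup. The following is a minimal sketch under stated assumptions: a toy three-locus scheme whose locus names, allele numbers, and ST labels are all hypothetical and not taken from any real MLST database. The BLAST-based ranking that selects the best-matching allele per locus is not shown.

```python
from typing import Dict, Optional, Sequence, Tuple

Profile = Tuple[int, ...]


def sequence_type(
    alleles: Dict[str, int],
    profiles: Dict[Profile, str],
    loci: Sequence[str],
) -> Optional[str]:
    """Return the ST whose allele combination matches, or None if the
    combination is absent from the profile table (a candidate new ST)."""
    try:
        # Order the allele numbers by the scheme's locus order.
        key = tuple(alleles[locus] for locus in loci)
    except KeyError:
        # A locus of the scheme had no allele call at all.
        return None
    return profiles.get(key)


# Hypothetical scheme and profile table, for illustration only.
toy_loci = ("locusA", "locusB", "locusC")
toy_profiles = {(1, 1, 2): "ST1", (2, 1, 2): "ST2"}

known = sequence_type({"locusA": 2, "locusB": 1, "locusC": 2}, toy_profiles, toy_loci)
novel = sequence_type({"locusA": 9, "locusB": 1, "locusC": 2}, toy_profiles, toy_loci)
```

Here `known` resolves to the toy label `ST2`, while `novel` is `None` because no profile in the table has that allele combination.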

1,620 citations


"In Silico Detection and Typing of P..." refers methods in this paper

  • ...If raw sequence reads are uploaded, they are first assembled (after the sequencing platform is given by the user) as described previously (16)....

    [...]