Journal ArticleDOI

In Silico Detection and Typing of Plasmids using PlasmidFinder and Plasmid Multilocus Sequence Typing

TL;DR: Two easy-to-use Web tools were designed and developed for in silico detection and characterization of whole-genome sequence (WGS) and whole-plasmid sequence data from members of the family Enterobacteriaceae.
Abstract: In the work presented here, we designed and developed two easy-to-use Web tools for in silico detection and characterization of whole-genome sequence (WGS) and whole-plasmid sequence data from members of the family Enterobacteriaceae. These tools will facilitate bacterial typing based on draft genomes of multidrug-resistant Enterobacteriaceae species by the rapid detection of known plasmid types. Replicon sequences from 559 fully sequenced plasmids associated with the family Enterobacteriaceae in the NCBI nucleotide database were collected to build a consensus database for integration into a Web tool called PlasmidFinder that can be used for replicon sequence analysis of raw, contig group, or completely assembled and closed plasmid sequencing data. The PlasmidFinder database currently consists of 116 replicon sequences that match, with at least 80% nucleotide identity, all replicon sequences identified in the 559 fully sequenced plasmids. For plasmid multilocus sequence typing (pMLST) analysis, a database that is updated weekly was generated from www.pubmlst.org and integrated into a Web tool called pMLST. Both databases were evaluated using draft genomes from a collection of Salmonella enterica serovar Typhimurium isolates. PlasmidFinder identified a total of 103 replicons and between zero and five different plasmid replicons within each of 49 S. Typhimurium draft genomes tested. The pMLST Web tool was able to subtype genomic sequencing data of plasmids, revealing both known plasmid sequence types (STs) and new alleles and ST variants. In conclusion, testing of the two Web tools using both fully assembled plasmid sequences and WGS-generated draft genomes showed them to be able to detect a broad variety of plasmids that are often associated with antimicrobial resistance in clinically relevant bacterial pathogens.
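
The 80% nucleotide identity criterion mentioned in the abstract can be illustrated with a minimal sketch. It assumes alignment hits have already been produced by some sequence search tool. The `Hit` record, the `filter_replicon_hits` function, and the example values are hypothetical illustrations and do not reproduce PlasmidFinder's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One alignment hit against a replicon sequence (hypothetical record)."""
    replicon: str
    identity: float  # percent nucleotide identity, 0-100


def filter_replicon_hits(hits, min_identity=80.0):
    """Keep the best hit per replicon among hits at or above min_identity."""
    best = {}
    for hit in hits:
        if hit.identity < min_identity:
            continue  # below the identity threshold: not reported
        current = best.get(hit.replicon)
        if current is None or hit.identity > current.identity:
            best[hit.replicon] = hit
    # Sort by replicon name for a stable, readable report.
    return sorted(best.values(), key=lambda h: h.replicon)


# Made-up example values, for illustration only.
example_hits = [
    Hit("IncFII", 97.5),
    Hit("IncFII", 88.0),
    Hit("IncI1", 79.9),
    Hit("ColE1", 80.0),
]
reported = filter_replicon_hits(example_hits)
print([(h.replicon, h.identity) for h in reported])
```

In this sketch the threshold is inclusive, so a hit at exactly 80% is kept; whether the real tool treats the boundary the same way is an assumption.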


Citations
Journal ArticleDOI
TL;DR: The draft genome sequence of a multidrug-resistant carbapenemase-producing Serratia marcescens isolate recovered from the bronchoalveolar lavage specimen of a patient suffering from chronic obstructive pulmonary disease is reported.
Abstract: The occurrence of multidrug-resistant Serratia marcescens strains producing metallo-β-lactamases or extended-spectrum β-lactamases represents a serious public health threat. Here, we report the draft genome sequence of a multidrug-resistant carbapenemase-producing Serratia marcescens isolate recovered from the bronchoalveolar lavage specimen of a patient suffering from chronic obstructive pulmonary disease (COPD).

12 citations

Journal ArticleDOI
TL;DR: In more rapid pathogen identification and antimicrobial susceptibility testing, advances have been made in the rapid detection of resistance in cultures, but progress in direct detection from specimens has been limited.

12 citations

Posted ContentDOI
15 May 2020-bioRxiv
TL;DR: The genomic diversity of E. coli in backyard poultry from rural Gambia is investigated to contextualise the potential risks of transmission of bacterial strains between humans and rural backyard poultry and suggest strains can be exchanged between poultry and livestock in this setting.
Abstract: Chickens and guinea fowl are commonly reared in Gambian homes as affordable sources of protein. Using standard microbiological techniques, we obtained 68 caecal isolates of Escherichia coli from ten chickens and nine guinea fowl in rural Gambia. After Illumina whole-genome sequencing, 28 sequence types were detected in the isolates (four of them novel), of which ST155 was the most common (22/68, 32%). These strains span four of the eight main phylogroups of E. coli, with phylogroups B1 and A being most prevalent. Nearly a third of the isolates harboured at least one antimicrobial resistance gene, while most of the ST155 isolates (14/22, 64%) encoded resistance to ≥3 classes of clinically relevant antibiotics, as well as putative virulence factors, suggesting pathogenic potential in humans. Furthermore, hierarchical clustering revealed that several Gambian poultry strains were closely related to isolates from humans. Although the ST155 lineage is common in poultry from Africa and South America, the Gambian ST155 isolates belong to a unique cgMLST cluster comprising closely related (38-39 allele differences) isolates from poultry and livestock from sub-Saharan Africa, suggesting that strains can be exchanged between poultry and livestock in this setting. Continued surveillance of E. coli and other potential pathogens in rural backyard poultry from sub-Saharan Africa is warranted. Author notes: All supporting data and protocols have been provided within the article or as supplementary data files. Eleven supplementary figures and eight supplementary files are available with the online version of this article. Data summary: The genomic assemblies for the isolates reported here are available for download from EnteroBase (http://enterobase.warwick.ac.uk/species/index/ecoli) and the EnteroBase assembly barcodes are provided in File S2. Sequences have been deposited in the NCBI SRA, under the BioProject ID: PRJNA616250 and accession numbers SAMN14485281 to SAMN14485348 (File S2). Assemblies have been deposited in GenBank under the BioProject ID: PRJNA616250 and accession numbers CP053258 and CP053259. Impact statement: Domestic birds play a crucial role in human society, in particular contributing to food security in low-income countries. Many households in sub-Saharan Africa rear free-range chickens and guinea fowl, which are often left to scavenge for feed in and around the family compound, where they are frequently exposed to humans, other animals and the environment. Such proximity between backyard poultry and humans is likely to facilitate transmission of pathogens such as Escherichia coli or antimicrobial resistance between the two host species. Little is known about the population structure of E. coli in rural chickens and guinea fowl, although this information is needed to contextualise the potential risks of transmission of bacterial strains between humans and rural backyard poultry. Thus, we sought to investigate the genomic diversity of E. coli in backyard poultry from rural Gambia.

12 citations


Cites methods from "In Silico Detection and Typing of P..."

  • ...Briefly, this tool scans the short-read sequences against the core Virulence Factors Database [45] (virulence factors), ResFinder (AMR) [46] and PlasmidFinder (plasmid-associated genes) [47] databases and generates customised outputs, based on a...


Journal ArticleDOI
TL;DR: The ontology of P. salmonis plasmids suggests a role in bacterial fitness and adaptation to the environment as they encode proteins related to mobilization, nutrient transport and utilization, and bacterial virulence.
Abstract: Four large cryptic plasmids were identified in the salmon pathogen Piscirickettsia salmonis reference strain LF-89. These plasmids appeared highly novel, with less than 7% nucleotidic identity to the nr plasmid database. Plasmid copy number analysis revealed that they are harbored in chromosome equivalent ratios. In addition to plasmid-related genes (plasmidial autonomous replication, partitioning, maintenance, and mobilization genes), mobile genetic elements such as transposases, integrases, and prophage sequences were also identified in P. salmonis plasmids. However, bacterial lysis was not observed upon the induction of prophages. A total of twelve putative virulence factors (VFs) were identified, in addition to two global transcriptional regulators, the widely conserved CsrA protein and the regulator Crp/Fnr. Eleven of the putative VFs were overexpressed during infection in two salmon-derived cellular infection models, supporting their role as VFs. The ubiquity of these plasmids was also confirmed by sequence similarity in the genomes of other P. salmonis strains. The ontology of P. salmonis plasmids suggests a role in bacterial fitness and adaptation to the environment as they encode proteins related to mobilization, nutrient transport and utilization, and bacterial virulence. Further functional characterization of P. salmonis plasmids may improve our knowledge regarding virulence and mobile elements in this intracellular pathogen.

12 citations


Cites background from "In Silico Detection and Typing of P..."

  • ...It should also be noted that plasmid replication genes and their associated incompatibility groups could not be identified using PlasmidFinder [55]....


Journal ArticleDOI
TL;DR: Investigation of next-generation short read sequencing between ten laboratories involved in food safety from Germany and Austria found Illumina short read data to be more accurate and consistent than Ion Torrent sequence data, with little variation between the different Illumina instruments.
Abstract: We compared the consistency, accuracy and reproducibility of next-generation short read sequencing between ten laboratories involved in food safety (research institutes, state laboratories, universities and companies) from Germany and Austria. Participants were asked to sequence six DNA samples of three bacterial species (Campylobacter jejuni, Listeria monocytogenes and Salmonella enterica) in duplicate, according to their routine in-house sequencing protocol. Four different types of Illumina sequencing platforms (MiSeq, NextSeq, iSeq, NovaSeq) and one Ion Torrent sequencing instrument (S5) were involved in the study. Sequence quality parameters were determined for all data sets and centrally compared between laboratories. SNP and cgMLST calling were performed to assess the reproducibility of sequence data collected for individual samples. Overall, we found Illumina short read data to be more accurate (higher base calling accuracy, fewer misassemblies) and consistent (little variability between independent sequencing runs within a laboratory) than Ion Torrent sequence data, with little variation between the different Illumina instruments. Two laboratories with Illumina instruments submitted sequence data with lower quality, probably due to the use of a library preparation kit that shows difficulty in sequencing low GC genome regions. Differences in data quality were more evident after assembling short reads into genome assemblies, with Ion Torrent assemblies featuring a great number of allele differences to Illumina assemblies. Clonality of samples was confirmed through SNP calling, which proved to be a more suitable method for an integrated data analysis of Illumina and Ion Torrent data sets in this study.

12 citations

References
Journal ArticleDOI
TL;DR: A web server providing a convenient way of identifying acquired antimicrobial resistance genes in completely sequenced isolates was created, and the method was evaluated on WGS chromosomes and plasmids of 30 isolates.
Abstract: Objectives Identification of antimicrobial resistance genes is important for understanding the underlying mechanisms and the epidemiology of antimicrobial resistance. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available in routine diagnostic laboratories and is anticipated to substitute traditional methods for resistance gene identification. Thus, the current challenge is to extract the relevant information from the large amount of generated data.

3,956 citations


"In Silico Detection and Typing of P..." refers methods in this paper

  • ...To extract the relevant information from the large amount of data generated, a Web-based tool, ResFinder, for the identification of acquired or intrinsically present antimicrobial resistance genes in whole-genome data was recently developed (15)....


Journal ArticleDOI
TL;DR: NCBI’s Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints.
Abstract: NCBI's Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints. CDD includes manually curated domain models that make use of protein 3D structure to refine domain models and provide insights into sequence/structure/function relationships. Manually curated models are organized hierarchically if they describe domain families that are clearly related by common descent. As CDD also imports domain family models from a variety of external sources, it is a partially redundant collection. To simplify protein annotation, redundant models and models describing homologous families are clustered into superfamilies. By default, domain footprints are annotated with the corresponding superfamily designation, on top of which specific annotation may indicate high-confidence assignment of family membership. Pre-computed domain annotation is available for proteins in the Entrez/Protein dataset, and a novel interface, Batch CD-Search, allows the computation and download of annotation for large sets of protein queries. CDD can be accessed via http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.

2,934 citations


"In Silico Detection and Typing of P..." refers background in this paper

  • ...In particular, the replicase proteins showing the pfam02387 or pfam01051 conserved domains were assigned to the FII and FIB groups, respectively (31)....


Journal ArticleDOI
TL;DR: Results indicated that the inc/rep PCR method demonstrates high specificity and sensitivity in detecting replicons on reference plasmids and also revealed the presence of recurrent and common plasmids in epidemiologically unrelated Salmonella isolates of different serotypes.

2,163 citations


"In Silico Detection and Typing of P..." refers methods in this paper

  • ...A collection of 24 previously characterized and fully... [FIG 1: Numbers of fully sequenced plasmids (y axis) classified into incompatibility groups occurring in the different bacterial species of the Enterobacteriaceae family.]...


  • ...Since 2005, a PCR-based replicon typing (PBRT) scheme has been available that targets in multiplex PCRs the replicons of the major plasmid families occurring in members of the family Enterobacteriaceae (2)....


  • ...Here, we present two free, easy-to-use Web tools, PlasmidFinder and pMLST, to analyze and classify plasmids from bacterial species of the family Enterobacteriaceae....


  • ...Here, we describe the design of two new easy-to-use Web tools useful for the rapid identification of plasmids in Enterobacteriaceae species that are of interest for epidemiological and clinical microbiology investigations of the plasmid-associated spread of antimicrobial resistance....


  • ...This method was initially developed to detect the replicons of plasmids belonging to the 18 major incompatibility (Inc) groups of Enterobacteriaceae species (3)....


Journal ArticleDOI
TL;DR: The Bacterial Isolate Genome Sequence Database (BIGSDB) represents a freely available resource that will assist the broader community in the elucidation of the structure and function of bacteria by means of a population genomics approach.
Abstract: The opportunities for bacterial population genomics that are being realised by the application of parallel nucleotide sequencing require novel bioinformatics platforms. These must be capable of the storage, retrieval, and analysis of linked phenotypic and genotypic information in an accessible, scalable and computationally efficient manner. The Bacterial Isolate Genome Sequence Database (BIGSDB) is a scalable, open source, web-accessible database system that meets these needs, enabling phenotype and sequence data, which can range from a single sequence read to whole genome data, to be efficiently linked for a limitless number of bacterial specimens. The system builds on the widely used mlstdbNet software, developed for the storage and distribution of multilocus sequence typing (MLST) data, and incorporates the capacity to define and identify any number of loci and genetic variants at those loci within the stored nucleotide sequences. These loci can be further organised into 'schemes' for isolate characterisation or for evolutionary or functional analyses. Isolates and loci can be indexed by multiple names and any number of alternative schemes can be accommodated, enabling cross-referencing of different studies and approaches. LIMS functionality of the software enables linkage to and organisation of laboratory samples. The data are easily linked to external databases and fine-grained authentication of access permits multiple users to participate in community annotation by setting up or contributing to different schemes within the database. Some of the applications of BIGSDB are illustrated with the genera Neisseria and Streptococcus. The BIGSDB source code and documentation are available at http://pubmlst.org/software/database/bigsdb/. Genomic data can be used to characterise bacterial isolates in many different ways but it can also be efficiently exploited for evolutionary or functional studies. BIGSDB represents a freely available resource that will assist the broader community in the elucidation of the structure and function of bacteria by means of a population genomics approach.

1,943 citations

Journal ArticleDOI
TL;DR: A Web-based method for MLST of 66 bacterial species based on whole-genome sequencing data that enables investigators to determine the sequence types of their isolates on the basis of WGS data.
Abstract: Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the “gold standard” of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available to scientists and routine diagnostic laboratories. Currently, the cost is below that of traditional MLST. The new challenges will be how to extract the relevant information from the large amount of data so as to allow for comparison over time and between laboratories. Ideally, this information should also allow for comparison to historical data. We developed a Web-based method for MLST of 66 bacterial species based on WGS data. As input, the method uses short sequence reads from four sequencing platforms or preassembled genomes. Updates from the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types of their isolates on the basis of WGS data. This method is publicly available at www.cbs.dtu.dk/services/MLST.
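
The final step described in this abstract, determining the sequence type from the combination of alleles identified, amounts to a table lookup. The following is a minimal sketch of that lookup only; it does not implement the BLAST-based ranking of alleles. The locus names, profile table, and `assign_sequence_type` function are hypothetical and are not taken from any real MLST scheme or from the published tool.

```python
# Hypothetical three-locus scheme; real schemes define their own loci.
LOCI = ("locusA", "locusB", "locusC")

# Hypothetical profile table: allele combination -> sequence type (ST).
PROFILES = {
    (1, 1, 2): 3,
    (2, 1, 2): 7,
}


def assign_sequence_type(alleles, loci=LOCI, profiles=PROFILES):
    """Return the ST for an allele combination, or None if it cannot be assigned.

    None is returned both when a locus has no allele call and when the
    combination of alleles is absent from the profile table.
    """
    try:
        key = tuple(alleles[locus] for locus in loci)
    except KeyError:
        return None  # at least one locus had no allele call
    return profiles.get(key)


print(assign_sequence_type({"locusA": 2, "locusB": 1, "locusC": 2}))
```

A combination that is absent from the table would, in a typing workflow, be a candidate for a new ST; this sketch reports it as `None` rather than guessing.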

1,620 citations


"In Silico Detection and Typing of P..." refers methods in this paper

  • ...If raw sequence reads are uploaded, they are first assembled (after the sequencing platform is given by the user) as described previously (16)....
