Journal ArticleDOI

In Silico Detection and Typing of Plasmids using PlasmidFinder and Plasmid Multilocus Sequence Typing

TL;DR: Two easy-to-use Web tools for in silico detection and characterization of whole-genome sequence (WGS) and whole-plasmid sequence data from members of the family Enterobacteriaceae are designed and developed.
Abstract: In the work presented here, we designed and developed two easy-to-use Web tools for in silico detection and characterization of whole-genome sequence (WGS) and whole-plasmid sequence data from members of the family Enterobacteriaceae. These tools will facilitate bacterial typing based on draft genomes of multidrug-resistant Enterobacteriaceae species by the rapid detection of known plasmid types. Replicon sequences from 559 fully sequenced plasmids associated with the family Enterobacteriaceae in the NCBI nucleotide database were collected to build a consensus database for integration into a Web tool called PlasmidFinder that can be used for replicon sequence analysis of raw, contig group, or completely assembled and closed plasmid sequencing data. The PlasmidFinder database currently consists of 116 replicon sequences that match, with at least 80% nucleotide identity, all replicon sequences identified in the 559 fully sequenced plasmids. For plasmid multilocus sequence typing (pMLST) analysis, a database that is updated weekly was generated from www.pubmlst.org and integrated into a Web tool called pMLST. Both databases were evaluated using draft genomes from a collection of Salmonella enterica serovar Typhimurium isolates. PlasmidFinder identified a total of 103 replicons and between zero and five different plasmid replicons within each of 49 S. Typhimurium draft genomes tested. The pMLST Web tool was able to subtype genomic sequencing data of plasmids, revealing both known plasmid sequence types (STs) and new alleles and ST variants. In conclusion, testing of the two Web tools using both fully assembled plasmid sequences and WGS-generated draft genomes showed them to be able to detect a broad variety of plasmids that are often associated with antimicrobial resistance in clinically relevant bacterial pathogens.
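
The 80% nucleotide identity criterion mentioned in the abstract can be illustrated with a small sketch. This is not the PlasmidFinder implementation: it assumes the query and each reference are already aligned and of equal length, and the function names and toy sequences are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Tuple


def percent_identity(query: str, reference: str) -> float:
    """Percentage of identical positions in two pre-aligned, equal-length sequences."""
    if not reference or len(query) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(q == r for q, r in zip(query.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


def matching_replicons(
    query: str, database: Dict[str, str], threshold: float = 80.0
) -> List[Tuple[str, float]]:
    """Return (name, identity) for references at or above the identity threshold."""
    hits = []
    for name, reference in database.items():
        if len(reference) != len(query):
            continue  # this toy sketch only compares equal-length sequences
        identity = percent_identity(query, reference)
        if identity >= threshold:
            hits.append((name, identity))
    # Best hit first
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical toy database; real replicon sequences are far longer.
toy_db = {"repA_toy": "ACGTACGTAC", "repB_toy": "TTTTTTTTTT"}
hits = matching_replicons("ACGTACGTAA", toy_db)
```

Here the query differs from `repA_toy` at one of ten positions, so it passes the 80% cutoff, while `repB_toy` does not. A real search would rely on a local alignment tool rather than position-by-position comparison.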


Citations
Journal ArticleDOI
TL;DR: In this paper, a total of 166 invasive non-typhoidal Salmonella (iNTS) isolates collected from a multi-centre surveillance in 10 African countries (2010-2014) and a fever study in Ghana (2007-2009) were genome sequenced to investigate the geographical distribution, antimicrobial genetic determinants and population structure of iNTS serotypes-genotypes.
Abstract: BACKGROUND Invasive non-typhoidal Salmonella (iNTS) is one of the leading causes of bacteraemia in sub-Saharan Africa. We aimed to provide a better understanding of the genetic characteristics and transmission patterns associated with multi-drug resistant (MDR) iNTS serovars across the continent. METHODS A total of 166 iNTS isolates collected from a multi-centre surveillance in 10 African countries (2010-2014) and a fever study in Ghana (2007-2009) were genome sequenced to investigate the geographical distribution, antimicrobial genetic determinants and population structure of iNTS serotypes-genotypes. Phylogenetic analyses were conducted in the context of the existing genomic frameworks for various iNTS serovars. Population-based incidence of MDR-iNTS disease was estimated in each study site. RESULTS Salmonella Typhimurium sequence-type (ST) 313 and Salmonella Enteritidis ST11 were predominant, and both exhibited high frequencies of MDR; Salmonella Dublin ST10 was identified in West Africa only. Mutations in the gyrA gene (fluoroquinolone resistance) were identified in S. Enteritidis and S. Typhimurium in Ghana; an ST313 isolate carrying blaCTX-M-15 was found in Kenya. International transmission of MDR ST313 (lineage II) and MDR ST11 (West African clade) was observed between Ghana and neighbouring West African countries. The incidence of MDR-iNTS disease exceeded 100/100 000 person-years-of-observation in children aged <5 years in several West African countries. CONCLUSIONS We identified the circulation of multiple MDR iNTS serovar STs in the sampled sub-Saharan African countries. Investment in the development and deployment of iNTS vaccines coupled with intensified antimicrobial resistance surveillance are essential to limit the impact of these pathogens in Africa.

10 citations

Journal ArticleDOI
TL;DR: In this paper, the genotypic diversity of S. enterica ser. Typhi strain 4.3.1 was investigated by whole-genome sequence analysis and the majority of the sequenced isolates were predicted to confer resistance to aminoglycosides, β-lactams, phenicols, sulphonamides, tetracycline and fluoroquinolones (including qnrS detection).
Abstract: Background Typhoid fever, caused by S. enterica ser. Typhi, continues to be a substantial health burden in developing countries. Little is known of the genotypic diversity of S. enterica ser. Typhi in Zimbabwe, but this is key for understanding the emergence and spread of this pathogen and devising interventions for its control. Objectives To report the molecular epidemiology of S. enterica ser. Typhi outbreak strains circulating from 2012 to 2019 in Zimbabwe, using comparative genomics. Methods A review of typhoid case records from 2012 to 2019 in Zimbabwe was performed. The phylogenetic relationship of outbreak isolates from 2012 to 2019 and emergence of antibiotic resistance was investigated by whole-genome sequence analysis. Results A total of 22 479 suspected typhoid cases and 760 confirmed cases were reported from 2012 to 2019, and 29 isolates were sequenced. The majority of the sequenced isolates were predicted to confer resistance to aminoglycosides, β-lactams, phenicols, sulphonamides, tetracycline and fluoroquinolones (including qnrS detection). The qnrS1 gene was associated with an IncN (subtype PST3) plasmid in 79% of the isolates. Whole-genome SNP analysis, SNP-based haplotyping and resistance determinant analysis showed that 93% of the isolates belonged to a single clade represented by multidrug-resistant H58 lineage I (4.3.1.1), with a maximum pair-wise distance of 22 SNPs. Conclusions This study has provided detailed genotypic characterization of the outbreak strain, identified as S. Typhi 4.3.1.1 (H58). The strain has reduced susceptibility to ciprofloxacin due to qnrS carried by an IncN (subtype PST3) plasmid resulting from ongoing evolution to full resistance.
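
As an illustration of what a "maximum pair-wise distance" in SNPs means, the following sketch counts differing positions between every pair of isolates in an alignment of variable sites and reports the largest count. It is an assumption-laden toy, not the study's pipeline: the isolate names and sequences are invented, and ambiguous bases are not treated specially.

```python
from itertools import combinations
from typing import Dict


def snp_distance(a: str, b: str) -> int:
    """Number of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def max_pairwise_distance(alignment: Dict[str, str]) -> int:
    """Largest SNP distance over all pairs of isolates (0 if fewer than two)."""
    return max(
        (snp_distance(a, b) for a, b in combinations(alignment.values(), 2)),
        default=0,
    )


# Hypothetical alignment of five variable sites in three isolates.
toy_alignment = {"iso1": "ACGTA", "iso2": "ACGTT", "iso3": "TCGAT"}
largest = max_pairwise_distance(toy_alignment)
```

In this toy alignment the most distant pair is `iso1` and `iso3`, which differ at three sites.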

10 citations

Journal ArticleDOI
TL;DR: In this paper, the authors used whole-genome sequencing (WGS) to obtain epidemiological information for the Salmonella enterica serovar Heidelberg lineage in Brazilian poultry farms.
Abstract: Salmonella enterica serovar Heidelberg is isolated from poultry-producing regions around the world. In Brazil, S. Heidelberg has been frequently detected in poultry flocks, slaughterhouses, and chicken meat. The goal of the present study was to assess the population structure, recent temporal evolution, and some important genetic characteristics of S. Heidelberg isolated from Brazilian poultry farms. Phylogenetic analysis of 68 S. Heidelberg genomes sequenced here and additional whole-genome data from NCBI demonstrated that all isolates from the Brazilian poultry production chain clustered into a monophyletic group, here called S. Heidelberg Brazilian poultry lineage (SH-BPL). Bayesian analysis defined the time of the most recent common ancestor (tMRCA) as 2004, and the overall population size (Ne) was constant until 2008, when an ∼10-fold Ne increase was observed until circa 2013. SH-BPL presented at least two plasmids with replicons ColpVC (n = 68; 100%), IncX1 (n = 66; 97%), IncA/C2 (n = 65; 95.5%), ColRNAI (n = 43; 63.2%), IncI1 (n = 32; 47%), ColMG828, Col156, IncHI2A, IncHI2, IncQ1, IncX4, IncY, and TrfA (each with n < 4; <4% each). Antibiotic resistance genes were found, with high frequencies of fosA7 (n = 68; 100%), mdf(A) (n = 68; 100%), tet(34) (n = 68; 100%), sul2 (n = 64; 94.1%), and blaCMY-2 (n = 56; 82.3%), along with an overall multidrug resistance (MDR) profile. Ten Salmonella pathogenicity islands (SPI1 to SPI5, SPI9, and SPI11 to SPI14) and 139 virulence genes were also detected. The SH-BPL profile was like those of other previous S. Heidelberg isolates from poultry around the world in the 1990s. In conclusion, the present study demonstrates the recent introduction (2004) and high level of dissemination of an MDR S. Heidelberg lineage in Brazilian poultry operations. IMPORTANCE S. Heidelberg is the most frequent serovar in several broiler farms from the main Brazilian poultry-producing regions. Therefore, avian-source foods (mainly chicken carcasses) commercialized in the country and exported to other continents are contaminated with this foodborne pathogen, generating several national and international economic losses. In addition, isolates of this serovar are usually resistant to antibiotics and can cause human invasive and septicemic infection, representing a public health concern. This study demonstrates the use of whole-genome sequencing (WGS) to obtain epidemiological information for one S. Heidelberg lineage highly spread among Brazilian poultry farms. This information will help to define biosecurity measures to control this important Salmonella serovar in Brazilian and worldwide poultry operations.

10 citations

Posted ContentDOI
01 Nov 2020-bioRxiv
TL;DR: The Plasmid Classification System (PCS) is reported, a machine learning classifier that recognizes plasmid sequences based on gene functions and outperforms the previous state-of-the-art approach based on k-mer decomposition of sequences.
Abstract: Plasmids play a critical role in rapid bacterial adaptation by encoding accessory functions that may increase the host's fitness. However, the diversity and ecology of plasmids is poorly understood due to computational and experimental challenges in plasmid identification. Here, we report the Plasmid Classification System (PCS), a machine learning classifier that recognizes plasmid sequences based on gene functions. To train PCS, we performed a large-scale discovery and comparison of gene functions in a reference set of >16,000 plasmids and >14,000 chromosomes. PCS accurately recognizes a diverse range of plasmid subtypes, and it outperforms the previous state-of-the-art approach based on k-mer decomposition of sequences. Armed with this model, we conducted, to our knowledge, the largest search for naturally occurring human gut plasmids in 406 publicly available metagenomes representing 5 countries. This search yielded 6,257 high-confidence predicted plasmids, of which 576 had evidence of a circular conformation based on pair-end mapping. These predicted plasmids were found to be highly prevalent across the metagenomes compared to the reference set of known plasmids, suggesting there is extensive and uncharacterized plasmid diversity in the human gut microbiome.

10 citations


Cites background from "In Silico Detection and Typing of P..."

  • ...Plasmidfinder (Carattoli et al. 2014) identifies genes related to plasmid replication, but it is trained on a limited number of plasmids of the family Enterobacteriaceae....


Journal ArticleDOI
TL;DR: In this article, the authors used de novo metagenome assembly to reconstruct 11 Methanobrevibacter genomes from the ancient calculus samples, and they found a high abundance of the archaeal genus Methanobrevibacter in the calculus.
Abstract: BACKGROUND Dental calculus (mineralised dental plaque) preserves many types of microfossils and biomolecules, including microbial and host DNA, and ancient calculus is thus an important source of information regarding our ancestral human oral microbiome. In this study, we taxonomically characterised the dental calculus microbiome from 20 ancient human skeletal remains originating from Trentino-South Tyrol, Italy, dating from the Neolithic (6000-3500 BCE) to the Early Middle Ages (400-1000 CE). RESULTS We found a high abundance of the archaeal genus Methanobrevibacter in the calculus. However, only a fraction of the sequences showed high similarity to Methanobrevibacter oralis, the only described Methanobrevibacter species in the human oral microbiome so far. To further investigate the diversity of this genus, we used de novo metagenome assembly to reconstruct 11 Methanobrevibacter genomes from the ancient calculus samples. Besides the presence of M. oralis in one of the samples, our phylogenetic analysis revealed two hitherto uncharacterised and unnamed oral Methanobrevibacter species that are prevalent in ancient calculus samples sampled from a broad range of geographical locations and time periods. CONCLUSIONS We have shown the potential of using de novo metagenomic assembly on ancient samples to explore microbial diversity and evolution. Our study suggests that there has been a possible shift in the human oral microbiome member Methanobrevibacter over the last millennia.

10 citations

References
Journal ArticleDOI
TL;DR: A web server providing a convenient way of identifying acquired antimicrobial resistance genes in completely sequenced isolates was created, and the method was evaluated on WGS chromosomes and plasmids of 30 isolates.
Abstract: Objectives Identification of antimicrobial resistance genes is important for understanding the underlying mechanisms and the epidemiology of antimicrobial resistance. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available in routine diagnostic laboratories and is anticipated to substitute traditional methods for resistance gene identification. Thus, the current challenge is to extract the relevant information from the large amount of generated data.

3,956 citations


"In Silico Detection and Typing of P..." refers methods in this paper

  • ...To extract the relevant information from the large amount of data generated, a Web-based tool, ResFinder, for the identification of acquired or intrinsically present antimicrobial resistance genes in whole-genome data was recently developed (15)....


Journal ArticleDOI
TL;DR: NCBI’s Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints.
Abstract: NCBI's Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints. CDD includes manually curated domain models that make use of protein 3D structure to refine domain models and provide insights into sequence/structure/function relationships. Manually curated models are organized hierarchically if they describe domain families that are clearly related by common descent. As CDD also imports domain family models from a variety of external sources, it is a partially redundant collection. To simplify protein annotation, redundant models and models describing homologous families are clustered into superfamilies. By default, domain footprints are annotated with the corresponding superfamily designation, on top of which specific annotation may indicate high-confidence assignment of family membership. Pre-computed domain annotation is available for proteins in the Entrez/Protein dataset, and a novel interface, Batch CD-Search, allows the computation and download of annotation for large sets of protein queries. CDD can be accessed via http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.

2,934 citations


"In Silico Detection and Typing of P..." refers background in this paper

  • ...In particular, the replicase proteins showing the pfam02387 or pfam01051 conserved domains were assigned to the FII and FIB groups, respectively (31)....


Journal ArticleDOI
TL;DR: Results indicated that the inc/rep PCR method demonstrates high specificity and sensitivity in detecting replicons on reference plasmids and also revealed the presence of recurrent and common plasmids in epidemiologically unrelated Salmonella isolates of different serotypes.

2,163 citations


"In Silico Detection and Typing of P..." refers methods in this paper

  • ...A collection of 24 previously characterized and fully... [FIG 1: Numbers of fully sequenced plasmids (y axis) classified into incompatibility groups occurring in the different bacterial species of the Enterobacteriaceae family.]


  • ...Since 2005, a PCR-based replicon typing (PBRT) scheme has been available that targets in multiplex PCRs the replicons of the major plasmid families occurring in members of the family Enterobacteriaceae (2)....


  • ...Here, we present two free, easy-to-use Web tools, PlasmidFinder and pMLST, to analyze and classify plasmids from bacterial species of the family Enterobacteriaceae....


  • ...Here, we describe the design of two new easy-to-use Web tools useful for the rapid identification of plasmids in Enterobacteriaceae species that are of interest for epidemiological and clinical microbiology investigations of the plasmid-associated spread of antimicrobial resistance....


  • ...This method was initially developed to detect the replicons of plasmids belonging to the 18 major incompatibility (Inc) groups of Enterobacteriaceae species (3)....


Journal ArticleDOI
TL;DR: The Bacterial Isolate Genome Sequence Database (BIGSDB) represents a freely available resource that will assist the broader community in the elucidation of the structure and function of bacteria by means of a population genomics approach.
Abstract: The opportunities for bacterial population genomics that are being realised by the application of parallel nucleotide sequencing require novel bioinformatics platforms. These must be capable of the storage, retrieval, and analysis of linked phenotypic and genotypic information in an accessible, scalable and computationally efficient manner. The Bacterial Isolate Genome Sequence Database (BIGSDB) is a scalable, open source, web-accessible database system that meets these needs, enabling phenotype and sequence data, which can range from a single sequence read to whole genome data, to be efficiently linked for a limitless number of bacterial specimens. The system builds on the widely used mlstdbNet software, developed for the storage and distribution of multilocus sequence typing (MLST) data, and incorporates the capacity to define and identify any number of loci and genetic variants at those loci within the stored nucleotide sequences. These loci can be further organised into 'schemes' for isolate characterisation or for evolutionary or functional analyses. Isolates and loci can be indexed by multiple names and any number of alternative schemes can be accommodated, enabling cross-referencing of different studies and approaches. LIMS functionality of the software enables linkage to and organisation of laboratory samples. The data are easily linked to external databases and fine-grained authentication of access permits multiple users to participate in community annotation by setting up or contributing to different schemes within the database. Some of the applications of BIGSDB are illustrated with the genera Neisseria and Streptococcus. The BIGSDB source code and documentation are available at http://pubmlst.org/software/database/bigsdb/. Genomic data can be used to characterise bacterial isolates in many different ways but it can also be efficiently exploited for evolutionary or functional studies. BIGSDB represents a freely available resource that will assist the broader community in the elucidation of the structure and function of bacteria by means of a population genomics approach.

1,943 citations

Journal ArticleDOI
TL;DR: A Web-based method for MLST of 66 bacterial species based on whole-genome sequencing data that enables investigators to determine the sequence types of their isolates on the basis of WGS data.
Abstract: Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the “gold standard” of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available to scientists and routine diagnostic laboratories. Currently, the cost is below that of traditional MLST. The new challenges will be how to extract the relevant information from the large amount of data so as to allow for comparison over time and between laboratories. Ideally, this information should also allow for comparison to historical data. We developed a Web-based method for MLST of 66 bacterial species based on WGS data. As input, the method uses short sequence reads from four sequencing platforms or preassembled genomes. Updates from the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types of their isolates on the basis of WGS data. This method is publicly available at www.cbs.dtu.dk/services/MLST.
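
The two steps described in this abstract, picking the best-matching allele per locus and then reading the sequence type from the allele combination, can be sketched as follows. This is a hedged illustration only: the abstract does not specify the ranking criteria, so ranking by identity and then coverage is an assumption, and the loci, allele numbers, and profile table are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical scheme: three loci and a small table of known profiles.
LOCI: Tuple[str, ...] = ("locusA", "locusB", "locusC")
PROFILES: Dict[Tuple[int, ...], int] = {
    (1, 1, 1): 1,
    (1, 2, 1): 2,
    (3, 2, 4): 3,
}

# One hit = (allele_number, percent_identity, percent_coverage)
Hit = Tuple[int, float, float]


def best_allele(hits: List[Hit]) -> Optional[int]:
    """Pick the allele with the highest identity, breaking ties by coverage.

    The ranking rule is an assumption made for this sketch.
    """
    if not hits:
        return None
    return max(hits, key=lambda hit: (hit[1], hit[2]))[0]


def sequence_type(alleles: Dict[str, int]) -> Optional[int]:
    """Look up the ST for a full allele profile; None if the combination is unknown."""
    profile = tuple(alleles[locus] for locus in LOCI)
    return PROFILES.get(profile)


# Toy hits per locus, as an alignment search might report them.
toy_hits: Dict[str, List[Hit]] = {
    "locusA": [(1, 100.0, 100.0), (3, 98.5, 100.0)],
    "locusB": [(1, 99.0, 100.0), (2, 100.0, 100.0)],
    "locusC": [(1, 100.0, 100.0)],
}
called = {locus: best_allele(hits) for locus, hits in toy_hits.items()}
st = sequence_type(called)
```

An allele combination absent from the table returns `None`, which corresponds to the situation where a new ST or allele variant would need to be reported rather than a known type.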

1,620 citations


"In Silico Detection and Typing of P..." refers methods in this paper

  • ...If raw sequence reads are uploaded, they are first assembled (after the sequencing platform is given by the user) as described previously (16)....
