Journal ArticleDOI

CARD 2020: antibiotic resistome surveillance with the comprehensive antibiotic resistance database

TL;DR: A new Resistomes & Variants module provides analysis and statistical summary of in silico predicted resistance variants from 82 pathogens and over 100 000 genomes, able to summarize predicted resistance using the information included in CARD, identify trends in AMR mobility and determine previously undescribed and novel resistance variants.
Abstract: The Comprehensive Antibiotic Resistance Database (CARD; https://card.mcmaster.ca) is a curated resource providing reference DNA and protein sequences, detection models and bioinformatics tools on the molecular basis of bacterial antimicrobial resistance (AMR). CARD focuses on providing high-quality reference data and molecular sequences within a controlled vocabulary, the Antibiotic Resistance Ontology (ARO), designed by the CARD biocuration team to integrate with software development efforts for resistome analysis and prediction, such as CARD's Resistance Gene Identifier (RGI) software. Since 2017, CARD has expanded through extensive curation of reference sequences, revision of the ontological structure, curation of over 500 new AMR detection models, development of a new classification paradigm and expansion of analytical tools. Most notably, a new Resistomes & Variants module provides analysis and statistical summary of in silico predicted resistance variants from 82 pathogens and over 100 000 genomes. By adding these resistance variants to CARD, we are able to summarize predicted resistance using the information included in CARD, identify trends in AMR mobility and determine previously undescribed and novel resistance variants. Here, we describe updates and recent expansions to CARD and its biocuration process, including new resources for community biocuration of AMR molecular reference data.


Citations
Journal ArticleDOI
TL;DR: Paraburkholderia aromaticivorans AR20-38 was examined for its potential to degrade six selected lignin monomers from different upper funneling aromatic pathways.
Abstract: Lignin bio-valorization is an emerging field of applied biotechnology and has not yet been studied at low temperatures. Paraburkholderia aromaticivorans AR20-38 was examined for its potential to degrade six selected lignin monomers (syringic acid, p-coumaric acid, 4-hydroxybenzoic acid, ferulic acid, vanillic acid, benzoic acid) from different upper funneling aromatic pathways. The strain degraded four of these compounds at 10°C, 20°C, and 30°C; syringic acid and vanillic acid were not utilized as sole carbon source. The degradation of 5 mM and 10 mM ferulic acid was accompanied by the stable accumulation of high amounts of the value-added product vanillic acid (85-89% molar yield; 760 and 1540 mg l⁻¹, respectively) over the whole temperature range tested. The presence of essential genes required for reactions in the upper funneling pathways was confirmed in the genome. This is the first report on biodegradation of lignin monomers and the stable vanillic acid production at low and moderate temperatures by P. aromaticivorans. KEY POINTS: • Paraburkholderia aromaticivorans AR20-38 successfully degrades four lignin monomers. • Successful degradation study at low (10°C) and moderate temperatures (20-30°C). • Biotechnological value: high yield of vanillic acid produced from ferulic acid.

16 citations

Posted ContentDOI
16 May 2020-bioRxiv
TL;DR: During its five years of existence, ARIES has grown into a well-established reality, not only as a web service but also as a workflow engine for the Integrated Rapid Infectious Disease Analysis (IRIDA) platform, allowing scientists to concentrate on what to do instead of how to do it.
Abstract: Background: With the introduction of Next Generation Sequencing (NGS) and Whole-Genome Sequencing (WGS) in microbiology and molecular epidemiology, the development of an information system for the collection of genomic and epidemiological data and subsequent transparent and reproducible data analysis became indispensable. Further requirements for the system included accessibility and ease of use for bioinformaticians as well as for scientists unfamiliar with the command line. Findings: The ARIES (Advanced Research Infrastructure for Experimentation in genomicS, https://aries.iss.it) platform was implemented in 2015 as an instance of the Galaxy framework specific for use of WGS in molecular epidemiology. Here, the experience with ARIES is reported. Conclusions: During its five years of existence, ARIES has grown into a well-established reality, not only as a web service but also as a workflow engine for the Integrated Rapid Infectious Disease Analysis (IRIDA) platform. In fact, an environment has been created with the implementation of complex bioinformatic tools in an easy-to-use context, allowing scientists to concentrate on what to do instead of how to do it.

15 citations


Cites methods from "CARD 2020: antibiotic resistome sur..."

  • ...General purpose tools for data upload/download/manipulation [12] NGS Trimming: Trimmomatic [29], FASTQ positional and quality trimming [19] NGS Assembly: SPAdes [20], SKESA [37], metaSPAdes [38], A5, INNUca [21], Shovill [39] NGS Mapping: BWA [40], Bowtie2 [41] NGS Alignment: BLAST [42], MMseqs2 [43], Diamond [44], MAFFT [45], MUMmer [46] NGS Quality Control: FastQC [17], QUAST [27] Phylogeny tools: PopPUNK [47], kSNP3 [48], FDA SNP Pipeline [49], MrBayes [50], PhyML [51], IQ-TREE [52] Allele Call tools: SRST2 [53], MentaLiST [8], MLST [54], chewBBACA [9] General Typing tools: Virulotyper [25], AMRFinderPlus [23], Resistance Gene Identifier [55] Specific Typing tools: EURL VTEC WGS PT (E....

    [...]

Journal ArticleDOI
TL;DR: UV-C radiation significantly reduced the abundance and prevalence of carbapenem-resistant bacteria in UWWTPs and detected CRE with blaGES-5, in integrons and plasmids, raising concern as horizontal gene transfer may occur within these systems.

15 citations

Journal ArticleDOI
TL;DR: It is found that IncP-1 plasmids do not always carry accessory genes in unpolluted rhizospheres, which is important for understanding the ecology and role of IncP-1 plasmids in the natural environment.
Abstract: IncP-1 plasmids, first isolated from clinical specimens (R751, RP4), are recognized as important vectors spreading antibiotic resistance genes. The abundance of IncP-1 plasmids in the environment, previously reported, suggested a correlation with anthropogenic pollution. Unexpectedly, qPCR-based detection of IncP-1 plasmids also revealed an increased relative abundance of IncP-1 plasmids in total community DNA from the rhizosphere of lettuce and tomato plants grown in non-polluted soil along with plant age. Here we report the successful isolation of IncP-1 plasmids by exploiting their ability to mobilize plasmid pSM1890. IncP-1 plasmids were captured from the rhizosphere but not from bulk soil, and a high diversity was revealed by sequencing 14 different plasmids that were assigned to IncP-1β, δ, and ε subgroups. Although backbone genes were highly conserved and mobile elements or remnants such as Tn501, IS1071, Tn402, or class 1 integron were carried by 13 of the sequenced IncP-1 plasmids, no antibiotic resistance genes were found. Instead, seven plasmids had a mer operon with Tn501-like transposon and five plasmids contained putative metabolic gene clusters linked to these mobile elements. In-depth sequence comparisons with previously known plasmids indicate that the IncP-1 plasmids captured from the rhizosphere are archetypes of those found in clinical isolates. Our findings that IncP-1 plasmids do not always carry accessory genes in unpolluted rhizospheres are important to understand the ecology and role of the IncP-1 plasmids in the natural environment.

15 citations


Additional excerpts

  • ...Similar to pKJK5, originally isolated from manure-treated soils (Bahl et al., 2007), all these plasmids carried class 1 integrons with tet(A) gene and their diversity was proposed to be driven by different sets of other gene cassettes (Heuer et al., 2012; Jechalke et al., 2014; Wolters et al., 2015)....

    [...]

Journal ArticleDOI
TL;DR: In this paper, a 1D CNN architecture was designed to integrate both sequential and non-sequential features to predict drug resistance for Mycobacterium tuberculosis (MTB) using a large and diverse cohort of MTB samples.
Abstract: Effective and timely antibiotic treatment depends on accurate and rapid in silico antimicrobial-resistant (AMR) predictions. Existing statistical rule-based Mycobacterium tuberculosis (MTB) drug resistance prediction methods using bacterial genomic sequencing data often achieve varying results: high accuracy on some antibiotics but relatively low accuracy on others. Traditional machine learning (ML) approaches have been applied to classify drug resistance for MTB and have shown more stable performance. However, there is no study that uses deep learning architecture like Convolutional Neural Network (CNN) on a large and diverse cohort of MTB samples for AMR prediction. We developed 24 binary classifiers of MTB drug resistance status across eight anti-MTB drugs and three different ML algorithms: logistic regression, random forest and 1D CNN using a training dataset of 10,575 MTB isolates collected from 16 countries across six continents, where an extended pan-genome reference was used for detecting genetic features. Our 1D CNN architecture was designed to integrate both sequential and non-sequential features. In terms of F1-scores, 1D CNN models are our best classifiers that are also more accurate and stable than the state-of-the-art rule-based tool Mykrobe predictor (81.1 to 93.8%, 93.7 to 96.2%, 93.1 to 94.8%, 95.9 to 97.2% and 97.1 to 98.2% for ethambutol, rifampicin, pyrazinamide, isoniazid and ofloxacin respectively). We applied filter-based feature selection to find AMR relevant features. All selected variant features are AMR-related ones in CARD database. 78.8% of them are also in the catalogue of MTB mutations that were recently identified as drug resistance-associated ones by WHO. To facilitate ML model development for AMR prediction, we packaged every step into an automated pipeline and shared the source code at https://github.com/KuangXY3/MTB-AMR-classification-CNN .

15 citations

References
Journal ArticleDOI
TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.

88,255 citations


"CARD 2020: antibiotic resistome sur..." refers background in this paper

  • ...The latter is described by CARD’s Model Ontology (MO, Supplementary Figure S1), which includes reference nucleotide and protein sequences, as well as additional search parameters including mutations conferring AMR (if applicable) and curated BLAST(P/N) (34,35) bit score cut-offs....

    [...]
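
To make the excerpt above concrete, a detection model can be pictured as a record that bundles a reference sequence, a curated bit score cut-off and, where applicable, resistance-conferring mutations. The following is a minimal sketch under that reading. The class name, field names, example values and the use of "greater than or equal" for the comparison are all hypothetical and do not reflect CARD's actual schema or RGI's code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DetectionModel:
    """Hypothetical stand-in for one AMR detection model record."""
    name: str
    reference_sequence: str   # nucleotide or protein reference sequence
    bitscore_cutoff: float    # curated BLAST bit score cut-off
    # Resistance-conferring mutations, if applicable (empty otherwise).
    mutations: frozenset = field(default_factory=frozenset)


def passes_cutoff(model: DetectionModel, hit_bitscore: float) -> bool:
    """Return True when an alignment hit meets the model's curated cut-off."""
    return hit_bitscore >= model.bitscore_cutoff


# Illustrative values only; none of these are taken from CARD.
example = DetectionModel(
    name="example_model",
    reference_sequence="MKTAYIAKQR",
    bitscore_cutoff=500.0,
)

print(passes_cutoff(example, 512.3))
print(passes_cutoff(example, 420.0))
```

Keeping the cut-off on the model record, rather than as a global threshold, mirrors the idea in the excerpt that each model carries its own curated search parameters.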

Journal ArticleDOI
TL;DR: Burrows-Wheeler Alignment tool (BWA) is implemented, a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.
Abstract: Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies calls for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ~10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package. Availability: http://maq.sourceforge.net

43,862 citations

Journal ArticleDOI
TL;DR: Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.
Abstract: As the rate of sequencing increases, greater throughput is demanded from read aligners. The full-text minute index is often used to make alignment very fast and memory-efficient, but the approach is ill-suited to finding longer, gapped alignments. Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

37,898 citations


"CARD 2020: antibiotic resistome sur..." refers methods in this paper

  • ...Metagenomics analysis (i.e. RGI bwt) uses Bowtie2 (40) or BWA (41) mapping of sequencing reads to CARD’s PHM reference sequences only, while annotation of genomes or assembly contigs predicts resistome using four of CARD’s AMR detection models: PHM, PVM, RVM and POM (note: RGI currently only scans for nonsynonymous substitutions; not frameshifts, deletions or insertions)....


    [...]
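
The note that RGI scans only for nonsynonymous substitutions can be illustrated with a position-by-position comparison of two protein sequences of equal length, which by construction cannot represent insertions, deletions or frameshifts. This is a hedged sketch, not RGI's implementation: the function name, the sequences and the curated mutation set are all hypothetical.

```python
def find_substitutions(reference: str, query: str) -> list[str]:
    """List substitutions as '<ref><1-based position><query>', e.g. 'S4L'."""
    if len(reference) != len(query):
        # A length difference implies an indel, which this scan cannot express.
        raise ValueError("sequences differ in length; indels are out of scope here")
    return [
        f"{ref_aa}{pos}{qry_aa}"
        for pos, (ref_aa, qry_aa) in enumerate(zip(reference, query), start=1)
        if ref_aa != qry_aa
    ]


# Made-up sequences and mutation list, for illustration only.
reference_protein = "MKTSAYIAKQ"
query_protein = "MKTLAYIAKQ"
curated_mutations = {"S4L"}

observed = find_substitutions(reference_protein, query_protein)
matched = [m for m in observed if m in curated_mutations]
print(observed, matched)
```

A query with an inserted or deleted residue would raise `ValueError` here rather than being reported, which is the sketch's analogue of the limitation stated in the excerpt.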

Journal ArticleDOI
TL;DR: This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and plans for the future development of the resource.
Abstract: The Protein Data Bank (PDB; http://www.rcsb.org/pdb/ ) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.

34,239 citations


"CARD 2020: antibiotic resistome sur..." refers methods in this paper

  • ...In 2017, we described the CARD*Shark text-mining algorithm (26) for computer-assisted literature triage, which we have expanded based on the new ARO Drug Class classification tags....

    [...]

Journal ArticleDOI
TL;DR: The new BLAST command-line applications, compared to the current BLAST tools, demonstrate substantial speed improvements for long queries as well as chromosome length database sequences.
Abstract: Sequence similarity searching is a very important bioinformatics task. While Basic Local Alignment Search Tool (BLAST) outperforms exact methods through its use of heuristics, the speed of the current BLAST software is suboptimal for very long queries or database sequences. There are also some shortcomings in the user-interface of the current command-line applications. We describe features and improvements of rewritten BLAST software and introduce new command-line applications. Long query sequences are broken into chunks for processing, in some cases leading to dramatically shorter run times. For long database sequences, it is possible to retrieve only the relevant parts of the sequence, reducing CPU time and memory usage for searches of short queries against databases of contigs or chromosomes. The program can now retrieve masking information for database sequences from the BLAST databases. A new modular software library can now access subject sequence data from arbitrary data sources. We introduce several new features, including strategy files that allow a user to save and reuse their favorite set of options. The strategy files can be uploaded to and downloaded from the NCBI BLAST web site. The new BLAST command-line applications, compared to the current BLAST tools, demonstrate substantial speed improvements for long queries as well as chromosome length database sequences. We have also improved the user interface of the command-line applications.

13,223 citations


"CARD 2020: antibiotic resistome sur..." refers background or methods in this paper

  • ...The website also includes a built-in BLAST instance for comparing sequences to CARD reference sequences and a web instance of RGI for resistome prediction with data visualization tools (https://card.mcmaster.ca/analyze)....

    [...]

  • ...The RVM is functionally similar to the PVM, except it works for rRNA mutations and therefore uses a nucleotide reference sequence and a BLASTN bit score cut-off....

    [...]

  • ...Briefly, RGI algorithmically predicts AMR genes and mutations from submitted genomes using a combination of open reading frame prediction with Prodigal (38), sequence alignment with BLAST (35) or DIAMOND (39), and curated resistance mutations included with the AMR detection model....

    [...]

  • ...In the same time period, the CARD website hosted ∼45 000 BLAST analyses, ∼220 000 RGI analyses, ∼64 000 data file downloads, and ∼10 000 RGI software downloads....

    [...]

  • ...We had determined that the asymptotic nature of the BLAST expectation value (E) gave it very low discriminatory power between different β-lactamase gene families (nearly 1/3 of CARD’s content), but that the linear nature of the BLAST bit score (S′) allowed this level of discrimination....

    [...]
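
The contrast between the expectation value and the bit score in the last excerpt can be illustrated with the standard Karlin-Altschul relation E = m · n · 2^(−S′), where m is the query length and n the database length. The sketch below is a minimal illustration, not CARD's or RGI's code; the lengths and scores are made-up values chosen only to show the behavior.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Karlin-Altschul expectation value: E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical search space: a 300-residue query against 2,000,000 residues.
QUERY_LEN = 300
DB_LEN = 2_000_000

for bit_score in (200.0, 400.0, 600.0, 1200.0, 1300.0):
    e = expect_value(bit_score, QUERY_LEN, DB_LEN)
    print(f"bit score {bit_score:6.1f} -> E = {e:.3e}")
```

With these inputs, the two highest-scoring hits both evaluate to E = 0.0 in double precision, so E can no longer rank them, while their bit scores (1200 and 1300) remain distinct. This is one way to picture why a cut-off on the linearly scaled bit score can separate closely related, high-scoring gene families where E cannot.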