Journal Article

Reevaluation of the Toxoplasma gondii and Neospora caninum genomes reveals misassembly, karyotype differences, and chromosomal rearrangements.

27 Apr 2021-Genome Research (Cold Spring Harbor Laboratory)-Vol. 31, Iss: 5, pp 823-833
TL;DR: The authors showed that the N. caninum genome was originally incorrectly assembled under the presumption of synteny with Toxoplasma gondii and showed that major chromosomal rearrangements have occurred between these species.
Abstract: Neospora caninum primarily infects cattle, causing abortions, with an estimated impact of a billion dollars on the worldwide economy annually. However, the study of its biology has been neglected because of the established paradigm that it is virtually identical to its close relative, the widely studied human pathogen Toxoplasma gondii. By revisiting the genome sequence, assembly, and annotation using third-generation sequencing technologies, here we show that the N. caninum genome was originally incorrectly assembled under the presumption of synteny with T. gondii. We show that major chromosomal rearrangements have occurred between these species. Importantly, we show that chromosomes originally named Chr VIIb and VIII are indeed fused, reducing the karyotype of both N. caninum and T. gondii to 13 chromosomes. We reannotate the N. caninum genome, revealing more than 500 new genes. We sequence and annotate the nonphotosynthetic plastid and mitochondrial genomes and show that although apicoplast genomes are virtually identical, high levels of gene fragmentation and reshuffling exist between species and strains. Our results correct assembly artifacts that are currently widely distributed in the genome database of N. caninum and T. gondii and, more importantly, highlight the mitochondria as a previously overlooked source of variability and pave the way for a change in the paradigm of synteny, encouraging rethinking the genome as the basis of the comparative unique biology of these pathogens.
Citations
Journal Article
TL;DR: This paper used the Oxford Nanopore MinION platform to generate near-complete de novo genome assemblies for multiple strains of T. gondii and its near relative, N. caninum.
Abstract: Toxoplasma gondii is a useful model for intracellular parasitism given its ease of culture in the laboratory and genomic resources. However, as for many other eukaryotes, the T. gondii genome contains hundreds of sequence gaps owing to repetitive and/or unclonable sequences that disrupt the assembly process. Here, we use the Oxford Nanopore MinION platform to generate near-complete de novo genome assemblies for multiple strains of T. gondii and its near relative, N. caninum. We significantly improved T. gondii genome contiguity (average N50 of ∼6.6 Mb) and added ∼2 Mb of newly assembled sequence. For all of the T. gondii strains that we sequenced (RH, ME49, CTG, II×III progeny clones CL13, S27, S21, S26, and D3X1), the largest contig ranged in size between 11.9 and 12.1 Mb, which is larger than any previously reported T. gondii chromosome, and was found to be due to a consistent fusion of Chromosomes VIIb and VIII. These data were validated by mapping existing T. gondii ME49 Hi-C data to our assembly, providing parallel lines of evidence that the T. gondii karyotype consists of 13, rather than 14, chromosomes. By using this technology, we also resolved hundreds of tandem repeats of varying lengths, including in well-known host-targeting effector loci like rhoptry protein 5 (ROP5) and ROP38. Finally, when we compared T. gondii with N. caninum, we found that although the 13-chromosome karyotype was conserved, extensive, previously unappreciated chromosome-scale rearrangements had occurred in T. gondii and N. caninum since their most recent common ancestry.

15 citations
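
The N50 statistic quoted in the abstract above is the length of the contig at which the cumulative length, counted from the largest contig down, first reaches half of the total assembly length. A minimal Python sketch of that calculation follows; the function name `n50` and the example contig lengths are hypothetical illustrations, not values or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in bp (illustrative only, not from the paper).
example_contigs = [
    12_000_000, 7_500_000, 6_600_000, 6_000_000,
    4_000_000, 2_500_000, 1_400_000,
]
print(n50(example_contigs))
```

A higher N50 for the same total length means the assembly is split into fewer, longer pieces, which is why it is commonly reported as a contiguity measure.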

Journal Article
Abstract: Although infections with Cyclospora cayetanensis are prevalent worldwide, many aspects of this parasite’s life cycle and transmission remain unknown. Humans are the only known hosts of this parasite. Existing information on its endogenous development has been derived from histological examination of only a few biopsy specimens. Its asexual and sexual stages occur in biliary-intestinal epithelium. In histological sections, its stages are less than 10 μm, making definitive identification difficult. Asexual (schizonts) and sexual (gamonts) stages are located in epithelial cells. Male microgamonts have two flagella; female macrogametes contain wall-forming bodies. Oocysts are excreted in feces unsporulated. Sporulation occurs in the environment, but there are many unanswered questions concerning dissemination and survival of C. cayetanensis oocysts. Biologically and phylogenetically, C. cayetanensis closely resembles Eimeria spp. that parasitize chickens; among them, E. acervulina most closely resembles C. cayetanensis in size. Here, we review known and unknown aspects of its life cycle and transmission and discuss the appropriateness of surrogates best capable of hastening progress in understanding its biology and developing mitigating strategies.

4 citations

Journal Article
TL;DR: In this article, the authors present the state of the art on mitochondrial genome structure, composition and organization in the apicomplexan phylum, revisiting topological and biochemical information gathered through classical techniques.
Abstract: Mitochondria are vital organelles of eukaryotic cells, participating in key metabolic pathways such as cellular respiration, thermogenesis, maintenance of cellular redox potential, calcium homeostasis, cell signaling, and cell death. The phylum Apicomplexa is entirely composed of obligate intracellular parasites, causing a plethora of severe diseases in humans, wild and domestic animals. These pathogens include the causative agents of malaria, cryptosporidiosis, neosporosis, East Coast fever and toxoplasmosis, among others. The mitochondrion in Apicomplexa has been put forward as a promising source of undiscovered drug targets, and it has been validated as the target of atovaquone, a drug currently used in the clinic to counter malaria. Apicomplexans present a single tubular mitochondrion that varies widely both in structure and in genomic content across the phylum. The organelle is characterized by massive gene migrations to the nucleus, sequence rearrangements and drastic functional reductions in some species. Recent third generation sequencing studies have reignited an interest in elucidating the extensive diversity displayed by the mitochondrial genomes of apicomplexans and their intriguing genomic features. The underlying mechanisms of gene transcription and translation are also poorly understood. In this review, we present the state of the art on mitochondrial genome structure, composition and organization in the apicomplexan phylum, revisiting topological and biochemical information gathered through classical techniques. We contextualize this in light of the genomic insight gained by second and, more recently, third generation sequencing technologies. We discuss the mitochondrial genomic and mechanistic features found in evolutionarily related alveolates, and discuss the common and distinct origins of the peculiarities of apicomplexan mitochondria.

3 citations

Journal Article
TL;DR: In this paper, a comparative mRNA expression analysis of the tachyzoite and bradyzoite stages of the Besnoitia besnoiti strain Lisbon14 isolated from an infected farm animal based on its annotated genome sequence is presented.
Abstract: Cyst-forming Apicomplexa (CFA) of the Sarcocystidae have a ubiquitous presence as pathogens of humans and farm animals transmitted through the food chain between hosts with few notable exceptions. The defining hallmark of this family of obligate intracellular protists consists of their ability to remain for very long periods as infectious tissue cysts in chronically infected intermediate hosts. Nevertheless, each closely related species has evolved unique strategies to maintain distinct reservoirs on global scales and to ensure efficient transmission to definitive hosts as well as between intermediate hosts. Here, we present an in-depth comparative mRNA expression analysis of the tachyzoite and bradyzoite stages of Besnoitia besnoiti strain Lisbon14 isolated from an infected farm animal based on its annotated genome sequence. The B. besnoiti genome is highly syntenic with that of other CFA and also retains the capacity to encode a large majority of known and inferred factors essential for completing a sexual cycle in a yet unknown definitive host. This work introduces Besnoitia besnoiti as a new model for comparative biology of coccidian tissue cysts which can be readily obtained in high purity. This model provides a framework for addressing fundamental questions about the evolution of tissue cysts and the biology of this pharmacologically intractable infectious parasite stage.

2 citations

Journal Article
TL;DR: Subtelomeric (ST) genes of T. gondii were associated with high interstrain plasticity and a role in the adaptability of T. gondii to environmental changes, and a re-analysis of previous transcriptomic data indicated that ST gene expression is strongly linked to the adaptation to different situations such as extracellular passage and changes in metabolism.
Abstract: Subtelomeres (STs) are chromosome regions that separate telomeres from euchromatin and play relevant roles in various biological processes of the cell. While their functions are conserved, ST structure and genetic compositions are unique to each species. This study aims to identify and characterize the subtelomeric regions of the 13 Toxoplasma gondii chromosomes of the Me49 strain. Here, STs were defined at chromosome ends based on poor gene density. The length of STs ranges from 8.1 to 232.4 kbp, with a gene density of 0.049 genes/kbp, lower than that of the Me49 genome (0.15 genes/kbp). Chromatin organization showed that H3K9me3, H2A.X, and H3.3 are highly enriched near telomeres and the 5′ end of silenced genes, decaying in intensity towards euchromatin. H3K4me3 and H2A.Z/H2B.Z are shown to be enriched in the 5′ end of the ST genes. Satellite DNA was detected in almost all STs, mainly the sat350 family and a novel satellite named sat240. Beyond the STs, only short dispersed fragments of sat240 and sat350 were found. Within STs, there were 12 functionally annotated genes, 59 with unknown functions (hypothetical proteins), 15 from multigene family FamB, and 13 from multigene family FamC. Some genes presented low interstrain synteny associated with the presence of satellite DNA. Orthologues of FamB and FamC were also detected in Neospora caninum and Hammondia hammondi. A re-analysis of previous transcriptomic data indicated that ST gene expression is strongly linked to the adaptation to different situations such as extracellular passage (evolve and resequencing study) and changes in metabolism (lack of acetyl-CoA cofactor). In conclusion, the ST region of the T. gondii chromosomes was defined, the ST genes were determined, and it was possible to associate them with high interstrain plasticity and a role in the adaptability of T. gondii to environmental changes.

2 citations

References
Journal Article
TL;DR: SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant caller and alignment viewer, and thus provides universal tools for processing read alignments.
Abstract: Summary: The Sequence Alignment/Map (SAM) format is a generic alignment format for storing read alignments against reference sequences, supporting short and long reads (up to 128 Mbp) produced by different sequencing platforms. It is flexible in style, compact in size, efficient in random access and is the format in which alignments from the 1000 Genomes Project are released. SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant caller and alignment viewer, and thus provides universal tools for processing read alignments. Availability: http://samtools.sourceforge.net

45,957 citations
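
To make the SAM format described in the abstract above more concrete, here is a minimal Python sketch that parses the first six of the eleven mandatory tab-separated fields of one alignment line and checks two standard FLAG bits (0x4 for unmapped, 0x10 for reverse strand). The names `SamRecord`, `parse_sam_line`, `is_unmapped`, and `is_reverse`, as well as the example read, are hypothetical; this is not part of SAMtools, which should be preferred for real data.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    """First six mandatory SAM fields (hypothetical helper type)."""
    qname: str   # read name
    flag: int    # bitwise FLAG
    rname: str   # reference sequence name
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # CIGAR string


def parse_sam_line(line):
    """Parse one SAM alignment line (not a header line starting with '@')."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line has 11 mandatory fields")
    return SamRecord(
        qname=fields[0],
        flag=int(fields[1]),
        rname=fields[2],
        pos=int(fields[3]),
        mapq=int(fields[4]),
        cigar=fields[5],
    )


def is_unmapped(record):
    """FLAG bit 0x4: the read is unmapped."""
    return bool(record.flag & 0x4)


def is_reverse(record):
    """FLAG bit 0x10: the read aligns to the reverse strand."""
    return bool(record.flag & 0x10)


# Made-up alignment line for illustration only.
example_line = "read1\t16\tchrVIII\t1042\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII"
record = parse_sam_line(example_line)
print(record.rname, record.pos, is_reverse(record), is_unmapped(record))
```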

Journal Article
TL;DR: Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.
Abstract: As the rate of sequencing increases, greater throughput is demanded from read aligners. The full-text minute index is often used to make alignment very fast and memory-efficient, but the approach is ill-suited to finding longer, gapped alignments. Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

37,898 citations

Journal Article
TL;DR: The results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.
Abstract: High-throughput mRNA sequencing (RNA-Seq) promises simultaneous transcript discovery and abundance estimation. However, this would require algorithms that are not restricted by prior gene annotations and that account for alternative transcription and splicing. Here we introduce such algorithms in an open-source software program called Cufflinks. To test Cufflinks, we sequenced and analyzed >430 million paired 75-bp RNA-Seq reads from a mouse myoblast cell line over a differentiation time series. We detected 13,692 known transcripts and 3,724 previously unannotated ones, 62% of which are supported by independent expression data or by homologous genes in other species. Over the time series, 330 genes showed complete switches in the dominant transcription start site (TSS) or splice isoform, and we observed more subtle shifts in 1,304 other genes. These results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.

13,337 citations

Journal Article
TL;DR: The TopHat pipeline is much faster than previous systems, mapping nearly 2.2 million reads per CPU hour, which is sufficient to process an entire RNA-Seq experiment in less than a day on a standard desktop computer.
Abstract: Motivation: A new protocol for sequencing the messenger RNA in a cell, known as RNA-Seq, generates millions of short sequence fragments in a single run. These fragments, or ‘reads’, can be used to measure levels of gene expression and to identify novel splice variants of genes. However, current software for aligning RNA-Seq data to a genome relies on known splice junctions and cannot identify novel ones. TopHat is an efficient read-mapping algorithm designed to align reads from an RNA-Seq experiment to a reference genome without relying on known splice sites. Results: We mapped the RNA-Seq reads from a recent mammalian RNA-Seq experiment and recovered more than 72% of the splice junctions reported by the annotation-based software from that study, along with nearly 20 000 previously unreported junctions. The TopHat pipeline is much faster than previous systems, mapping nearly 2.2 million reads per CPU hour, which is sufficient to process an entire RNA-Seq experiment in less than a day on a standard desktop computer. We describe several challenges unique to ab initio splice site discovery from RNA-Seq reads that will require further algorithm development. Availability: TopHat is free, open-source software available from http://tophat.cbcb.umd.edu Contact: cole@cs.umd.edu Supplementary information: Supplementary data are available at Bioinformatics online.

11,473 citations

Journal Article
TL;DR: Circos uses a circular ideogram layout to facilitate the display of relationships between pairs of positions by the use of ribbons, which encode the position, size, and orientation of related genomic elements.
Abstract: We created a visualization tool called Circos to facilitate the identification and analysis of similarities and differences arising from comparisons of genomes. Our tool is effective in displaying variation in genome structure and, generally, any other kind of positional relationships between genomic intervals. Such data are routinely produced by sequence alignments, hybridization arrays, genome mapping, and genotyping studies. Circos uses a circular ideogram layout to facilitate the display of relationships between pairs of positions by the use of ribbons, which encode the position, size, and orientation of related genomic elements. Circos is capable of displaying data as scatter, line, and histogram plots, heat maps, tiles, connectors, and text. Bitmap or vector images can be created from GFF-style data inputs and hierarchical configuration files, which can be easily generated by automated tools, making Circos suitable for rapid deployment in data analysis and reporting pipelines.

8,315 citations