Journal ArticleDOI

MITOS: Improved de novo metazoan mitochondrial genome annotation

TL;DR: The MITOS pipeline computes a consistent de novo annotation of mitogenomic sequences, and its results are shown to match RefSeq and MitoZoa in terms of annotation coverage and quality.
About: This article is published in Molecular Phylogenetics and Evolution. The article was published on 2013-11-01. It has received 3323 citations to date. The article focuses on the topic: RefSeq.
Citations
Journal ArticleDOI
TL;DR: The approach overcomes the limitations of traditional strategies for obtaining mitochondrial genomes for species with little or no mitochondrial sequence information at hand and represents a fast and highly efficient in silico alternative to laborious conventional strategies relying on initial long-range PCR.
Abstract: We present an in silico approach for the reconstruction of complete mitochondrial genomes of nonmodel organisms directly from next-generation sequencing (NGS) data: mitochondrial baiting and iterative mapping (MITObim). The method is straightforward even if only (i) distantly related mitochondrial genomes or (ii) mitochondrial barcode sequences are available as starting-reference sequences or seeds, respectively. We demonstrate the efficiency of the approach in case studies using real NGS data sets of the two monogenean ectoparasite species Gyrodactylus thymalli and Gyrodactylus derjavinoides including their respective teleost hosts European grayling (Thymallus thymallus) and Rainbow trout (Oncorhynchus mykiss). MITObim appeared superior to existing tools in terms of accuracy, runtime and memory requirements and fully automatically recovered mitochondrial genomes exceeding 99.5% accuracy from total genomic DNA derived NGS data sets in <24 h using a standard desktop computer. The approach overcomes the limitations of traditional strategies for obtaining mitochondrial genomes for species with little or no mitochondrial sequence information at hand and represents a fast and highly efficient in silico alternative to laborious conventional strategies relying on initial long-range PCR. We furthermore demonstrate the applicability of MITObim for metagenomic/pooled data sets using simulated data. MITObim is an easy-to-use tool even for biologists with modest bioinformatics experience. The software is made available as an open source pipeline under the MIT license at https://github.com/chrishah/MITObim.
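As a rough illustration of the baiting-and-iteration idea described in this abstract, the toy sketch below recruits reads that share a k-mer with a seed, then uses the recruited reads as new bait until no further reads are found. It is not MITObim's implementation and it omits the mapping and assembly steps entirely; the function names, the k-mer size, and the example sequences are assumptions made purely for illustration.

```python
def kmers(seq, k):
    """Return the set of all k-mers contained in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def iterative_bait(seed, reads, k=8, max_iter=20):
    """Toy baiting loop (hypothetical helper, not part of MITObim).

    Recruit every read that shares at least one k-mer with the seed or
    with a read recruited in an earlier round; stop when a round adds
    nothing new or after max_iter rounds.
    """
    bait = kmers(seed, k)
    recruited = set()
    for _ in range(max_iter):
        new = {r for r in reads if r not in recruited and kmers(r, k) & bait}
        if not new:
            break
        recruited |= new
        for read in new:
            bait |= kmers(read, k)  # recruited reads become bait for the next round
    return recruited


# Made-up 30 bp "genome", tiled into 12 bp reads that overlap by 6 bp,
# plus one unrelated poly-T read. The seed covers only the first 8 bp.
genome = "ACGTACGGATCCAGTCTGAAGCTTGCATGC"
reads = [genome[i:i + 12] for i in range(0, 19, 6)] + ["TTTTTTTTTTTT"]
seed = genome[:8]

recruited = iterative_bait(seed, reads, k=5)
print(sorted(recruited))
```

In this example the seed shares k-mers only with the first read; the remaining on-target reads are reached through successive rounds, and the poly-T read is never recruited.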

1,604 citations

Journal ArticleDOI
TL;DR: A new version of OGDRAW equipped with a new front end enables the user to easily visualize large sets of organellar genomes spanning entire taxonomic clades.
Abstract: Organellar (plastid and mitochondrial) genomes play an important role in resolving phylogenetic relationships, and next-generation sequencing technologies have led to a burst in their availability. The ongoing massive sequencing efforts require software tools for routine assembly and annotation of organellar genomes as well as their display as physical maps. OrganellarGenomeDRAW (OGDRAW) has become the standard tool to draw graphical maps of plastid and mitochondrial genomes. Here, we present a new version of OGDRAW equipped with a new front end. Besides several new features, OGDRAW now has access to a local copy of the organelle genome database of the NCBI RefSeq project. Together with batch processing of (multi-)GenBank files, this enables the user to easily visualize large sets of organellar genomes spanning entire taxonomic clades. The new OGDRAW server can be accessed at https://chlorobox.mpimp-golm.mpg.de/OGDraw.html.

888 citations


Cites methods from "MITOS: Improved de novo metazoan mi..."

  • ...These range from specialized applications such as MITOS (11), that were designed for a subset of organellar genomes and whose output requires little to no quality control or manual curation, to GeSeq, a flexible tool that allows the annotation of essentially any organellar genome (15)....

    [...]

Journal ArticleDOI
TL;DR: MitoFish contains re-annotations of previously sequenced fish mitogenomes, enabling researchers to refer to them when they find annotations that are likely to be erroneous or while conducting comparative mitogenomic analyses.
Abstract: MitoFish is a database of fish mitochondrial genomes (mitogenomes) that includes powerful and precise de novo annotations for mitogenome sequences. Fish occupy an important position in the evolution of vertebrates and the ecology of the hydrosphere, and mitogenomic sequence data have served as a rich source of information for resolving fish phylogenies and identifying new fish species. The importance of a mitogenomic database continues to grow at a rapid pace as massive amounts of mitogenomic data are generated with the advent of new sequencing technologies. A severe bottleneck seems likely to occur with regard to mitogenome annotation because of the overwhelming pace of data accumulation and the intrinsic difficulties in annotating sequences with degenerating transfer RNA structures, divergent start/stop codons of the coding elements, and the overlapping of adjacent elements. To ease this data backlog, we developed an annotation pipeline named MitoAnnotator. MitoAnnotator automatically annotates a fish mitogenome with a high degree of accuracy in approximately 5 min; thus, it is readily applicable to data sets of dozens of sequences. MitoFish also contains re-annotations of previously sequenced fish mitogenomes, enabling researchers to refer to them when they find annotations that are likely to be erroneous or while conducting comparative mitogenomic analyses. For users who need more information on the taxonomy, habitats, phenotypes, or life cycles of fish, MitoFish provides links to related databases. MitoFish and MitoAnnotator are freely available at http://mitofish.aori.u-tokyo.ac.jp/ (last accessed August 28, 2013); all of the data can be batch downloaded, and the annotation pipeline can be used via a web interface.

590 citations


Cites background or methods from "MITOS: Improved de novo metazoan mi..."

  • ...MITOS (Bernt et al. 2013) is an automated pipeline for the de novo annotation of metazoan mitogenomes and comes closest to what MitoAnnotator achieves....

    [...]

  • ...In addition, RefSeq (Pruitt et al. 2012), the most comprehensive database for mitogenomes, is known to contain many incorrect mitogenomic annotations (Bernt et al. 2013), which can lead to inaccurate research results....

    [...]

Journal ArticleDOI
20 Jun 2014-Science
TL;DR: Characterization of genomic differentiation in a classic example of hybridization between all-black carrion crows and gray-coated hooded crows identified genome-wide introgression extending far beyond the morphological hybrid zone, indicating localized genomic selection can cause marked heterogeneity in introgressive landscapes while maintaining phenotypic divergence.
Abstract: The importance, extent, and mode of interspecific gene flow for the evolution of species has long been debated. Characterization of genomic differentiation in a classic example of hybridization between all-black carrion crows and gray-coated hooded crows identified genome-wide introgression extending far beyond the morphological hybrid zone. Gene expression divergence was concentrated in pigmentation genes expressed in gray versus black feather follicles. Only a small number of narrow genomic islands exhibited resistance to gene flow. One prominent genomic region (<2 megabases) harbored 81 of all 82 fixed differences (of 8.4 million single-nucleotide polymorphisms in total) linking genes involved in pigmentation and in visual perception, a genomic signal reflecting color-mediated prezygotic isolation. Thus, localized genomic selection can cause marked heterogeneity in introgression landscapes while maintaining phenotypic divergence.

495 citations

Journal ArticleDOI
TL;DR: A mitogenome toolkit, MitoZ, is developed, consisting of independent modules for de novo assembly, findMitoScaf (find Mitochondrial Scaffolds), annotation and visualization, that can generate a mitogenome assembly together with annotation and visualization results from HTS raw reads.
Abstract: Mitochondrial genome (mitogenome) plays important roles in evolutionary and ecological studies. It becomes routine to utilize multiple genes on mitogenome or the entire mitogenomes to investigate phylogeny and biodiversity of focal groups with the onset of High Throughput Sequencing (HTS) technologies. We developed a mitogenome toolkit MitoZ, consisting of independent modules of de novo assembly, findMitoScaf (find Mitochondrial Scaffolds), annotation and visualization, that can generate mitogenome assembly together with annotation and visualization results from HTS raw reads. We evaluated its performance using a total of 50 samples of which mitogenomes are publicly available. The results showed that MitoZ can recover more full-length mitogenomes with higher accuracy compared to the other available mitogenome assemblers. Overall, MitoZ provides a one-click solution to construct the annotated mitogenome from HTS raw data and will facilitate large scale ecological and evolutionary studies. MitoZ is free open source software distributed under GPLv3 license and available at https://github.com/linzhi2013/MitoZ.

465 citations

References
Journal ArticleDOI
TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.

88,255 citations


"MITOS: Improved de novo metazoan mi..." refers methods in this paper

  • ...Systematic semi-automatic error screening using a list of rules based on tRNAscan-SE (Lowe and Eddy, 1997), ARWEN (Laslett and Canback, 2008), and BLAST (Altschul et al., 1990) searches as well as expert knowledge is used for MitoZoa (Lupi et al., 2010), a recently released new data base....

    [...]

  • ...Furthermore all raw data of BLAST and Infernal, a graphical representation of the structure of the ncRNAs predicted by the covariance models, and a file containing the gene order are available (lacking the anticodon information for tRNA-encoding genes)....

    [...]

  • ...It uses a novel strategy based on aggregating BLAST searches with previously annotated protein sequences to identify protein coding genes (Section 2.1), thereby avoiding the need for a built-in data base of specifically curated protein models....

    [...]

  • ...BLAST is used by MOSAS to search for open reading frames and rRNAs based on a local data base of query sequences (currently from insects only)....

    [...]

  • ...Protein coding genes are annotated by means of a sophisticated aggregation procedure based on BLAST searches, which allows for the detection of frameshifts, duplication events, and split genes....

    [...]
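The excerpts above mention that MITOS aggregates BLAST hits to annotate protein-coding genes, but they give no details of the procedure. As a loose toy illustration only (not the MITOS algorithm), overlapping hits to the same gene can be merged into candidate loci; two well-separated loci for one gene would then hint at a duplication or a split gene. The function name, the `max_gap` parameter, and the hit coordinates below are hypothetical.

```python
from collections import defaultdict


def merge_hits(hits, max_gap=0):
    """Merge overlapping hits per gene into candidate loci.

    hits: iterable of (gene, start, end) tuples with start <= end.
    Returns a dict mapping each gene to a sorted list of (start, end) loci.
    Hypothetical helper for illustration; not taken from MITOS.
    """
    by_gene = defaultdict(list)
    for gene, start, end in hits:
        by_gene[gene].append((start, end))

    loci = {}
    for gene, intervals in by_gene.items():
        intervals.sort()
        merged = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= merged[-1][1] + max_gap:
                # Overlaps (or lies within max_gap of) the current locus: extend it.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        loci[gene] = [tuple(m) for m in merged]
    return loci


# Made-up hit coordinates on a single strand.
hits = [
    ("cox1", 100, 700),
    ("cox1", 650, 1600),
    ("nad2", 2000, 2900),
    ("cox1", 5000, 5400),
]
loci = merge_hits(hits)
print(loci)
```

Here the two overlapping cox1 hits collapse into one locus, while the distant third cox1 hit remains a separate locus that would need further inspection.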

Journal ArticleDOI
TL;DR: A program is described, tRNAscan-SE, which identifies 99-100% of transfer RNA genes in DNA sequence while giving less than one false positive per 15 gigabases.
Abstract: We describe a program, tRNAscan-SE, which identifies 99-100% of transfer RNA genes in DNA sequence while giving less than one false positive per 15 gigabases. Two previously described tRNA detection programs are used as fast, first-pass prefilters to identify candidate tRNAs, which are then analyzed by a highly selective tRNA covariance model. This work represents a practical application of RNA covariance models, which are general, probabilistic secondary structure profiles based on stochastic context-free grammars. tRNAscan-SE searches at approximately 30 000 bp/s. Additional extensions to tRNAscan-SE detect unusual tRNA homologues such as selenocysteine tRNAs, tRNA-derived repetitive elements and tRNA pseudogenes.
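To put the figures quoted in this abstract into perspective, the short calculation below applies the stated scan rate (approximately 30 000 bp/s) and the stated false-positive bound (less than one per 15 gigabases) to a sequence of a given length. The constant and function names are assumptions for illustration, and the 16,569 bp example length is that of the human mitochondrial reference sequence, which is not taken from this page.

```python
SCAN_RATE_BP_PER_S = 30_000          # "approximately 30 000 bp/s" from the abstract
FALSE_POSITIVES_PER_BP = 1 / 15e9    # upper bound: "less than one ... per 15 gigabases"


def scan_seconds(length_bp):
    """Approximate scan time in seconds at the quoted rate."""
    return length_bp / SCAN_RATE_BP_PER_S


def false_positive_bound(length_bp):
    """Upper bound on the expected number of false positives."""
    return length_bp * FALSE_POSITIVES_PER_BP


mitogenome_bp = 16_569  # human mtDNA reference length, used as an example
print(f"scan time: {scan_seconds(mitogenome_bp):.2f} s")
print(f"expected false positives: < {false_positive_bound(mitogenome_bp):.1e}")
```

At these rates a single animal mitogenome is scanned in roughly half a second, and the expected number of false positives per genome is on the order of one in a million.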

9,629 citations


"MITOS: Improved de novo metazoan mi..." refers methods in this paper

  • ...Systematic semi-automatic error screening using a list of rules based on tRNAscan-SE (Lowe and Eddy, 1997), ARWEN (Laslett and Canback, 2008), and BLAST (Altschul et al., 1990) searches as well as expert knowledge is used for MitoZoa (Lupi et al., 2010), a recently released new data base....

    [...]

Journal ArticleDOI
TL;DR: The National Center for Biotechnology Information Reference Sequence (RefSeq) database provides a non-redundant collection of sequences representing genomic data, transcripts and proteins that pragmatically includes sequence data that are currently publicly available in the archival databases.
Abstract: The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database (http://www.ncbi.nlm.nih.gov/RefSeq/) provides a non-redundant collection of sequences representing genomic data, transcripts and proteins. Although the goal is to provide a comprehensive dataset representing the complete sequence information for any given species, the database pragmatically includes sequence data that are currently publicly available in the archival databases. The database incorporates data from over 2400 organisms and includes over one million proteins representing significant taxonomic diversity spanning prokaryotes, eukaryotes and viruses. Nucleotide and protein sequences are explicitly linked, and the sequences are linked to other resources including the NCBI Map Viewer and Gene. Sequences are annotated to include coding regions, conserved domains, variation, references, names, database cross-references, and other features using a combined approach of collaboration and other input from the scientific community, automated annotation, propagation from GenBank and curation by NCBI staff.

4,229 citations


"MITOS: Improved de novo metazoan mi..." refers background in this paper

  • ...The most comprehensive and up-to-date resource for mitochondrial genomes and their annotation is NCBI RefSeq (Pruitt et al., 2007)....

    [...]

  • ...About 2000 completely sequenced mitochondrial genomes are available from the NCBI RefSeq data base together with manually curated annotations of their protein-coding genes, rRNAs, and tRNAs....

    [...]

Journal ArticleDOI
TL;DR: Biopython includes modules for reading and writing different sequence file formats and multiple sequence alignments, dealing with 3D macromolecular structures, interacting with common tools such as BLAST, ClustalW and EMBOSS, accessing key online databases, as well as providing numerical methods for statistical learning.
Abstract: The Biopython project is a mature open source international collaboration of volunteer developers, providing Python libraries for a wide range of bioinformatics problems. Biopython includes modules for reading and writing different sequence file formats and multiple sequence alignments, dealing with 3D macromolecular structures, interacting with common tools such as BLAST, ClustalW and EMBOSS, accessing key online databases, as well as providing numerical methods for statistical learning.

3,855 citations


"MITOS: Improved de novo metazoan mi..." refers methods in this paper

  • ...We implemented specialised parser for mitogenome annotations based on biopython (Cock et al., 2009) for this task....

    [...]

  • ...In the following we provide a detailed description of the individual components of MITOS....

    [...]

Journal ArticleDOI
TL;DR: The Dual Organellar GenoMe Annotator (DOGMA) automates the annotation of organellar genomes and allows the use of BLAST searches against a custom database, and conservation of basepairing in the secondary structure of animal mitochondrial tRNAs to identify and annotate genes.
Abstract: Summary: The Dual Organellar GenoMe Annotator (DOGMA) automates the annotation of organellar (plant chloroplast and animal mitochondrial) genomes. It is a Web-based package that allows the use of BLAST searches against a custom database, and conservation of basepairing in the secondary structure of animal mitochondrial tRNAs to identify and annotate genes. DOGMA provides a graphical user interface for viewing and editing annotations. Annotations are stored on our password-protected server to enable repeated sessions of working on the same genome. Finished annotations can be extracted for direct submission to GenBank. Availability: http://phylocluster.biosci.utexas.edu/dogma/ Supplementary information: Detailed documentation and tutorials for annotating both animal mitochondrial and plant chloroplast genomes can be found on the DOGMA home page.

2,754 citations


"MITOS: Improved de novo metazoan mi..." refers methods in this paper

  • ...DOGMA (Wyman et al., 2004) is a semi-automated pipeline of methods dealing with both mitochondrial and chloroplast genomes....

    [...]

  • ...Automatic annotation of organellar genomes with DOGMA....

    [...]

  • ...COVE (Eddy and Durbin, 1994) is employed by DOGMA to identify tRNAs candidates based on secondary structure....

    [...]