Open Access Journal Article

OrthoMaM v10: Scaling-Up Orthologous Coding Sequence and Exon Alignments with More than One Hundred Mammalian Genomes

Abstract
We present version 10 of OrthoMaM, a database of orthologous mammalian markers. OrthoMaM is now 11 years old and has kept improving since its inception, providing high-quality alignments and phylogenetic trees computed with state-of-the-art methods on up-to-date data. The main contribution of this version is the increase in the number of taxa: 116 mammalian genomes for 14,509 one-to-one orthologous genes. This was made possible by combining genomic data deposited in Ensembl with additional good-quality genomes available only in NCBI. Users of version 10 will also benefit from pipeline improvements and a completely redesigned web interface.
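
The idea of one-to-one orthologous genes mentioned in the abstract can be illustrated with a minimal sketch. This is not the OrthoMaM pipeline: the function name, the input layout (a group ID mapped to species and gene pairs) and the toy data are all assumptions made for illustration. The sketch simply keeps the groups in which every species present contributes exactly one gene.

```python
from collections import Counter


def one_to_one_groups(groups):
    """Keep ortholog groups in which every species contributes exactly one gene.

    `groups` maps a hypothetical group ID to a list of (species, gene_id) pairs.
    Returns a dict of group ID -> {species: gene_id} for the retained groups.
    """
    kept = {}
    for group_id, members in groups.items():
        # Count how many gene copies each species has in this group.
        copies = Counter(species for species, _ in members)
        if copies and all(n == 1 for n in copies.values()):
            kept[group_id] = dict(members)
    return kept


# Toy data, invented for illustration only.
toy_groups = {
    "G1": [("human", "h1"), ("mouse", "m1"), ("dog", "d1")],
    "G2": [("human", "h2"), ("human", "h3"), ("mouse", "m2")],  # two human copies
    "G3": [("human", "h4"), ("dog", "d4")],
}

selected = one_to_one_groups(toy_groups)
print(sorted(selected))
```

In this toy example the group with two human copies is discarded, while groups missing a species are still retained; whether incomplete groups should be kept is a design choice that the abstract does not specify.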


Citations
Journal Article

Phylogenetic tree building in the genomic age.

TL;DR: The principles, steps and computational tools for phylogenetic tree building are discussed, including identification of orthologous genes or proteins, multiple sequence alignment, and choice of substitution models and inference methodologies.
Journal Article

Deep Residual Neural Networks Resolve Quartet Molecular Phylogenies.

TL;DR: The well-trained residual network predictors can outperform existing state-of-the-art inference methods such as the maximum likelihood method on diverse simulated test data, especially under extensive substitution heterogeneities, and conclude that deep learning represents a powerful new approach to phylogenetic reconstruction.
Posted Content

Endemic island songbirds as windows into evolution in small effective population sizes

TL;DR: These results provide robust evidence that the lower Ne experienced by island species has affected both the ability of natural selection to efficiently remove weakly deleterious mutations and also the adaptive potential of island species, therefore providing considerable empirical support for the nearly neutral theory.

To What Extent Current Limits of Phylogenomics Can Be Overcome

TL;DR: It is argued that data error is pervasive in modern datasets and models are still too simplistic compared to the complexity of biological and evolutionary processes, so the quality of datasets should be enhanced via numerous, rigorous checkpoints, while also boosting the capability of models to handle biological complexity by the development of better models, particularly through joint analyses.