Author
Bastien Boussau
Other affiliations: École normale supérieure de Lyon, Lyons, Uppsala University ...
Bio: Bastien Boussau is an academic researcher from University of Lyon. The author has contributed to research in topics: Phylogenetic tree & Genome. The author has an h-index of 35 and has co-authored 71 publications receiving 6443 citations. Previous affiliations of Bastien Boussau include École normale supérieure de Lyon & Lyons.
Topics: Phylogenetic tree, Genome, Genome evolution, Gene duplication, Biology
Papers published on a yearly basis
Papers
••
Duke University1, University of Texas at Austin2, Heidelberg Institute for Theoretical Studies3, American Museum of Natural History4, Beijing Genomics Institute5, Xi'an Jiaotong University6, New Mexico State University7, University of Sydney8, University of California9, Uppsala University10, University of Copenhagen11, Okinawa Institute of Science and Technology12, University of Georgia13, Griffith University14, Catalan Institution for Research and Advanced Studies15, Joint Institute for Nuclear Research16, Oak Ridge National Laboratory17, Aarhus University18, Washington University in St. Louis19, University of California, Santa Cruz20, Cardiff University21, Kunming Institute of Zoology22, China Agricultural University23, Tulane University24, Louisiana State University25, Copenhagen Zoo26, Federal University of Pará27, Oregon Health & Science University28, Technical University of Denmark29, Canterbury Museum30, Curtin University31, Novosibirsk State University32, Smithsonian Institution33, National University of Singapore34, National Museum of Natural History35, Nova Southeastern University36, Occidental College37, University of Edinburgh38, Harvard University39, University of California, San Francisco40, University of Florida41, University of Illinois at Urbana–Champaign42
TL;DR: A genome-scale phylogenetic analysis of 48 species representing all orders of Neoaves recovered a highly resolved tree that confirms previously controversial sister or close relationships and identifies the first divergence in Neoaves, two groups the authors named Passerea and Columbea.
Abstract: To better determine the history of modern birds, we performed a genome-scale phylogenetic analysis of 48 species representing all orders of Neoaves using phylogenomic methods created to handle genome-scale data. We recovered a highly resolved tree that confirms previously controversial sister or close relationships. We identified the first divergence in Neoaves, two groups we named Passerea and Columbea, representing independent lineages of diverse and convergently evolved land and water bird species. Among Passerea, we infer the common ancestor of core landbirds to have been an apex predator and confirm independent gains of vocal learning. Among Columbea, we identify pigeons and flamingoes as belonging to sister clades. Even with whole genomes, some of the earliest branches in Neoaves proved challenging to resolve, which was best explained by massive protein-coding sequence convergence and high levels of incomplete lineage sorting that occurred during a rapid radiation after the Cretaceous-Paleogene mass extinction event about 66 million years ago.
1,624 citations
••
TL;DR: It is shown that these mesophilic archaea are different from hyperthermophilic Crenarchaeota and branch deeper than was previously assumed, and should be considered as a third archaeal phylum, which the authors propose to name Thaumarchaeota.
Abstract: The archaeal domain is currently divided into two major phyla, the Euryarchaeota and Crenarchaeota. During the past few years, diverse groups of uncultivated mesophilic archaea have been discovered and affiliated with the Crenarchaeota. It was recently recognized that these archaea have a major role in geochemical cycles. Based on the first genome sequence of a crenarchaeote, Cenarchaeum symbiosum, we show that these mesophilic archaea are different from hyperthermophilic Crenarchaeota and branch deeper than was previously assumed. Our results indicate that C. symbiosum and its relatives are not Crenarchaeota, but should be considered as a third archaeal phylum, which we propose to name Thaumarchaeota (from the Greek 'thaumas', meaning wonder).
1,118 citations
••
TL;DR: RevBayes is a new open-source software package based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. It outperforms competing software for several standard analyses and requires users to explicitly specify each part of the model and analysis.
Abstract: Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com. [Bayesian inference; Graphical models; MCMC; statistical phylogenetics]
505 citations
••
TL;DR: A new probabilistic model is presented to jointly infer rooted species and gene trees for dozens of genomes and thousands of gene families and yields a more accurate picture of ancestral genomes than the trees available in the authoritative database Ensembl.
Abstract: Comparisons of gene trees and species trees are key to understanding major processes of genome evolution such as gene duplication and loss. Because current methods to reconstruct phylogenies fail to model the two-way dependency between gene trees and the species tree, they often misrepresent gene and species histories. We present a new probabilistic model to jointly infer rooted species and gene trees for dozens of genomes and thousands of gene families. We use simulations to show that this method accurately infers the species tree and gene trees, is robust to misspecification of the models of sequence and gene family evolution and provides a precise historic record of gene duplications and losses throughout genome evolution. We simultaneously reconstruct the history of mammalian species and their genes, based on 36 completely sequenced genomes, and use the reconstructed gene trees to infer the gene content and organization of ancestral mammalian genomes. We show that our method yields a more accurate picture of ancestral genomes than the trees available in the authoritative database Ensembl.
266 citations
••
TL;DR: A statistical binning technique to address gene tree estimation error is developed and used to produce the first genome-scale coalescent-based avian tree of life, which is helpful in providing more accurate estimations of ILS levels in biological data sets.
Abstract: Gene tree incongruence arising from incomplete lineage sorting (ILS) can reduce the accuracy of concatenation-based estimations of species trees. Although coalescent-based species tree estimation methods can have good accuracy in the presence of ILS, they are sensitive to gene tree estimation error. We propose a pipeline that uses bootstrapping to evaluate whether two genes are likely to have the same tree, then it groups genes into sets using a graph-theoretic optimization and estimates a tree on each subset using concatenation, and finally produces an estimated species tree from these trees using the preferred coalescent-based method. Statistical binning improves the accuracy of MP-EST, a popular coalescent-based method, and we use it to produce the first genome-scale coalescent-based avian tree of life.
248 citations
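The grouping step of the binning pipeline described in the abstract above (genes that are unlikely to share a tree are kept apart, and the rest are pooled into sets) can be illustrated with a small sketch. This is not the authors' implementation: the function name `bin_genes`, the `conflicts` input, and the greedy strategy are all assumptions made for illustration. The sketch assumes that pairs of genes whose trees conflict with high bootstrap support have already been identified, and it substitutes a simple greedy graph coloring for the graph-theoretic optimization that the abstract mentions.

```python
def bin_genes(genes, conflicts):
    """Group genes into bins so that no bin holds two conflicting genes.

    genes     -- list of gene identifiers (hypothetical input)
    conflicts -- pairs of genes judged unlikely to share a tree, e.g. from
                 bootstrap support (assumed to be computed beforehand)

    Greedy heuristic, not the optimization used in the paper.
    """
    # Build the incompatibility graph as an adjacency map.
    neighbors = {g: set() for g in genes}
    for a, b in conflicts:
        neighbors[a].add(b)
        neighbors[b].add(a)

    bins = []
    # Place the most-conflicted genes first; they are the hardest to fit.
    for g in sorted(genes, key=lambda x: -len(neighbors[x])):
        # Bins that contain none of this gene's conflicting partners.
        compatible = [b for b in bins if not (neighbors[g] & b)]
        if compatible:
            # Prefer the smallest compatible bin to keep bin sizes balanced.
            min(compatible, key=len).add(g)
        else:
            bins.append({g})
    return bins


genes = ["g1", "g2", "g3", "g4"]
conflicts = [("g1", "g2"), ("g2", "g3")]
bins = bin_genes(genes, conflicts)
```

In the pipeline the abstract outlines, the genes in each bin would then be concatenated to estimate one tree per bin, and those trees would be passed to a coalescent-based species tree method.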
Cited by
•
TL;DR: It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.
11,521 citations
••
TL;DR: The software package Tracer is presented, for visualizing and analyzing the MCMC trace files generated through Bayesian phylogenetic inference, which provides kernel density estimation, multivariate visualization, demographic trajectory reconstruction, conditional posterior distribution summary, and more.
Abstract: Bayesian inference of phylogeny using Markov chain Monte Carlo (MCMC) plays a central role in understanding evolutionary history from molecular sequence data. Visualizing and analyzing the MCMC-generated samples from the posterior distribution is a key step in any non-trivial Bayesian inference. We present the software package Tracer (version 1.7) for visualizing and analyzing the MCMC trace files generated through Bayesian phylogenetic inference. Tracer provides kernel density estimation, multivariate visualization, demographic trajectory reconstruction, conditional posterior distribution summary, and more. Tracer is open-source and available at http://beast.community/tracer.
5,492 citations
01 Aug 2000
TL;DR: Assessment of medical technology in the context of commercialization with Bioentrepreneur course, which addresses many issues unique to biomedical products.
Abstract: BIOE 402. Medical Technology Assessment. 2 or 3 hours. Bioentrepreneur course. Assessment of medical technology in the context of commercialization. Objectives, competition, market share, funding, pricing, manufacturing, growth, and intellectual property; many issues unique to biomedical products. Course Information: 2 undergraduate hours. 3 graduate hours. Prerequisite(s): Junior standing or above and consent of the instructor.
4,833 citations
••
TL;DR: Some notable features of IQ-TREE version 2 are described and the key advantages over other software are highlighted.
Abstract: IQ-TREE (http://www.iqtree.org, last accessed February 6, 2020) is a user-friendly and widely used software package for phylogenetic inference using maximum likelihood. Since the release of version 1 in 2014, we have continuously expanded IQ-TREE to integrate a plethora of new models of sequence evolution and efficient computational approaches of phylogenetic inference to deal with genomic data. Here, we describe notable features of IQ-TREE version 2 and highlight the key advantages over other software.
4,337 citations
••
TL;DR: The approach to utilizing available RNA-Seq and other data types in the authors' manual curation process for vertebrate, plant, and other species is summarized, and a new direction for prokaryotic genomes and protein name management is described.
Abstract: The RefSeq project at the National Center for Biotechnology Information (NCBI) maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records (http://www.ncbi.nlm.nih.gov/refseq/). The RefSeq project leverages the data submitted to the International Nucleotide Sequence Database Collaboration (INSDC) against a combination of computation, manual curation, and collaboration to produce a standard set of stable, non-redundant reference sequences. The RefSeq project augments these reference sequences with current knowledge including publications, functional features and informative nomenclature. The database currently represents sequences from more than 55,000 organisms (>4800 viruses, >40,000 prokaryotes and >10,000 eukaryotes; RefSeq release 71), ranging from a single record to complete genomes. This paper summarizes the current status of the viral, prokaryotic, and eukaryotic branches of the RefSeq project, reports on improvements to data access and details efforts to further expand the taxonomic representation of the collection. We also highlight diverse functional curation initiatives that support multiple uses of RefSeq data including taxonomic validation, genome annotation, comparative genomics, and clinical testing. We summarize our approach to utilizing available RNA-Seq and other data types in our manual curation process for vertebrate, plant, and other species, and describe a new direction for prokaryotic genomes and protein name management.
4,104 citations