Author
John M. Davis
Other affiliations: United States Forest Service, Michigan State University, University of Washington
Bio: John M. Davis is an academic researcher from the University of Florida. The author has contributed to research on the topics Population and Gene. The author has an h-index of 35 and has co-authored 84 publications receiving 8,334 citations. Previous affiliations of John M. Davis include the United States Forest Service and Michigan State University.
Topics: Population, Gene, Genome, Populus trichocarpa, Genomics
Papers
••
University of Tennessee1, Oak Ridge National Laboratory2, West Virginia University3, Umeå University4, University of British Columbia5, United States Department of Energy6, Ghent University7, Swedish University of Agricultural Sciences8, Institut national de la recherche agronomique9, Virginia Tech10, Michigan Technological University11, University of Toronto12, Pennsylvania State University13, University of Provence14, University of Georgia15, University of Florida16, University of California, Berkeley17, Lawrence Berkeley National Laboratory18, University of Arizona19, Purdue University20, Stanford University21, United States Department of Agriculture22, University of Helsinki23, University of Turku24, Massachusetts Institute of Technology25, University of Tennessee Health Science Center26, University of Tübingen27
TL;DR: This paper reports the draft genome of the black cottonwood tree, Populus trichocarpa, in which more than 45,000 putative protein-coding genes were identified.
Abstract: We report the draft genome of the black cottonwood tree, Populus trichocarpa. Integration of shotgun sequence assembly with genetic mapping enabled chromosome-scale reconstruction of the genome. More than 45,000 putative protein-coding genes were identified. Analysis of the assembled genome revealed a whole-genome duplication event; about 8000 pairs of duplicated genes from that event survived in the Populus genome. A second, older duplication event is indistinguishably coincident with the divergence of the Populus and Arabidopsis lineages. Nucleotide substitution, tandem gene duplication, and gross chromosomal rearrangement appear to proceed substantially more slowly in Populus than in Arabidopsis. Populus has more protein-coding genes than Arabidopsis, ranging on average from 1.4 to 1.6 putative Populus homologs for each Arabidopsis gene. However, the relative frequency of protein domains in the two genomes is similar. Overrepresented exceptions in Populus include genes associated with lignocellulosic wall biosynthesis, meristem development, disease resistance, and metabolite transport.
4,025 citations
••
University of California, Davis1, University of Maryland, College Park2, Johns Hopkins University3, Children's Hospital Oakland Research Institute4, Indiana University5, University of Utah6, University of Florida7, United States Forest Service8, University of Georgia9, North Carolina State University10, Washington State University11, Texas A&M University12
TL;DR: The authors use a whole-genome shotgun approach relying primarily on next-generation sequence data generated from a single haploid seed megagametophyte of a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding.
Abstract: The size and complexity of conifer genomes has, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early candidate for reference sequence determination. We develop a novel strategy to sequence the genome of loblolly pine that combines unique aspects of pine reproductive biology and genome assembly methodology. We use a whole genome shotgun approach relying primarily on next generation sequence generated from a single haploid seed megagametophyte from a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding. The resulting sequence and assembly was used to generate a draft genome spanning 23.2 Gbp and containing 20.1 Gbp with an N50 scaffold size of 66.9 kbp, making it a significant improvement over available conifer genomes. The long scaffold lengths allow the annotation of 50,172 gene models with intron lengths averaging over 2.7 kbp and sometimes exceeding 100 kbp in length. Analysis of orthologous gene sets identifies gene families that may be unique to conifers. We further characterize and expand the existing repeat library based on the de novo analysis of the repetitive content, estimated to encompass 82% of the genome. In addition to its value as a resource for researchers and breeders, the loblolly pine genome sequence and assembly reported here demonstrates a novel approach to sequencing the large and complex genomes of this important group of plants that can now be widely applied.
420 citations
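The N50 scaffold size quoted in the abstract above (66.9 kbp) is a standard assembly statistic: the scaffold length L such that scaffolds of length L or longer together contain at least half of the assembled bases. A minimal sketch of that calculation, assuming scaffold lengths are supplied as a plain list of integers (the function name `n50` and the toy lengths are illustrative and do not come from the paper):

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length of the scaffold at which the running total,
    taken from the longest scaffold downward, first reaches half of
    the summed length of all scaffolds.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not loblolly pine data).
toy_scaffolds = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_scaffolds)
```

Here the total is 54, so the running sum 10 + 9 + 8 = 27 first reaches half the total at the scaffold of length 8.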
••
TL;DR: A combination of genetics and physiology is being used to understand the detailed mechanisms of forest tree growth and development.
Abstract: Forest trees have tremendous economic and ecological value, as well as unique biological properties of basic scientific interest. The inherent difficulties of experimenting on very large long-lived organisms motivate the development of a model system for forest trees. Populus (poplars, cottonwoods, aspens) has several advantages as a model system, including rapid growth, prolific sexual reproduction, ease of cloning, small genome, facile transgenesis, and tight coupling between physiological traits and biomass productivity. A combination of genetics and physiology is being used to understand the detailed mechanisms of forest tree growth and development.
390 citations
••
TL;DR: Four methods of genomic selection that differ in their assumptions about the distribution of marker effects are compared: (i) ridge regression–best linear unbiased prediction (RR–BLUP), (ii) Bayes A, (iii) Bayes Cπ, and (iv) Bayesian LASSO. The results suggest that alternative genomic selection prediction models may perform differently for traits with distinct genetic properties.
Abstract: Genomic selection can increase genetic gain per generation through early selection. Genomic selection is expected to be particularly valuable for traits that are costly to phenotype and expressed late in the life cycle of long-lived species. Alternative approaches to genomic selection prediction models may perform differently for traits with distinct genetic properties. Here the performance of four different original methods of genomic selection that differ with respect to assumptions regarding distribution of marker effects, including (i) ridge regression–best linear unbiased prediction (RR–BLUP), (ii) Bayes A, (iii) Bayes Cπ, and (iv) Bayesian LASSO, is presented. In addition, a modified RR–BLUP (RR–BLUP B) that utilizes a selected subset of markers was evaluated. The accuracy of these methods was compared across 17 traits with distinct heritabilities and genetic architectures, including growth, development, and disease-resistance properties, measured in a Pinus taeda (loblolly pine) training population of 951 individuals genotyped with 4853 SNPs. The predictive ability of the methods was evaluated using a 10-fold cross-validation approach, and differed only marginally for most method/trait combinations. Interestingly, for fusiform rust disease-resistance traits, Bayes Cπ, Bayes A, and RR–BLUP B had higher predictive ability than RR–BLUP and Bayesian LASSO. Fusiform rust is controlled by few genes of large effect. A limitation of RR–BLUP is the assumption of equal contribution of all markers to the observed variation. However, RR–BLUP B performed as well as the Bayesian approaches. The genotypic and phenotypic data used in this study are publicly available for comparative analysis of genomic selection prediction models.
362 citations
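The 10-fold cross-validation described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: predictive ability is taken to be the Pearson correlation between held-out predictions and observed phenotypes (a common convention that the abstract does not spell out), predictions are pooled across folds, and a toy single-marker regression (`fit_single_marker`, a hypothetical placeholder) stands in for RR–BLUP or the Bayesian models. All function names and the toy data are illustrative.

```python
import random


def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def cv_predictive_ability(fit, X, y, k=10, seed=0):
    """Predict each individual from a model trained without its fold.

    `fit(train_X, train_y)` must return a function mapping one row of
    marker genotypes to a predicted phenotype.
    """
    predicted = [None] * len(y)
    for fold in kfold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            predicted[i] = predict(X[i])
    return pearson(predicted, y)


def fit_single_marker(train_X, train_y):
    """Placeholder model: least-squares regression on the first marker."""
    xs = [row[0] for row in train_X]
    mx, my = sum(xs) / len(xs), sum(train_y) / len(train_y)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (t - my) for x, t in zip(xs, train_y))
    slope = sxy / sxx if sxx else 0.0
    return lambda row: my + slope * (row[0] - mx)


# Toy data (hypothetical): one marker coded 0/1/2 with a purely additive effect.
X = [[i % 3] for i in range(30)]
y = [2.0 * row[0] + 1.0 for row in X]
ability = cv_predictive_ability(fit_single_marker, X, y, k=10)
```

Swapping in a different `fit` function is the only change needed to compare models on the same folds, which is the kind of comparison the abstract reports across methods and traits.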
••
TL;DR: Woody plants can detect and use z3HAC as a signal to prime defenses before actually experiencing damage, and GLVs may have important ecological functions in arboreal ecosystems.
Abstract: Herbivore-induced plant volatiles (HIPVs), in addition to attracting natural enemies of herbivores, can serve a signaling function within plants to induce or prime defenses. However, it is largely unknown, particularly in woody plants, which volatile compounds within HIPV blends can act as signaling molecules. Leaves of hybrid poplar saplings were exposed in vivo to naturally wound-emitted concentrations of the green leaf volatile (GLV) cis-3-hexenyl acetate (z3HAC) and then subsequently fed upon by gypsy moth larvae. Volatiles were collected throughout the experiments, and leaf tissue was collected to measure phytohormone concentrations and expression of defense-related genes. Relative to controls, z3HAC-exposed leaves had higher concentrations of jasmonic acid and linolenic acid following gypsy moth feeding. Furthermore, z3HAC primed transcripts of genes that mediate oxylipin signaling and direct defenses, as determined by both qRT-PCR and microarray analysis using the AspenDB 7 K expressed sequence tags (EST) microarray containing c. 5400 unique gene models. Moreover, z3HAC primed the release of terpene volatiles. The widespread priming response suggests an adaptive benefit to detecting z3HAC as a wound signal. Thus, woody plants can detect and use z3HAC as a signal to prime defenses before actually experiencing damage. GLVs may therefore have important ecological functions in arboreal ecosystems.
257 citations
Cited by
•
TL;DR: It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.
11,521 citations
••
TL;DR: The Carbohydrate-Active Enzyme (CAZy) database is a knowledge-based resource specialized in the enzymes that build and break down complex carbohydrates and glycoconjugates and has been used to improve the quality of functional predictions of a number of genome projects by providing expert annotation.
Abstract: The Carbohydrate-Active Enzyme (CAZy) database is a knowledge-based resource specialized in the enzymes that build and break down complex carbohydrates and glycoconjugates. As of September 2008, the database describes the present knowledge on 113 glycoside hydrolase, 91 glycosyltransferase, 19 polysaccharide lyase, 15 carbohydrate esterase and 52 carbohydrate-binding module families. These families are created based on experimentally characterized proteins and are populated by sequences from public databases with significant similarity. Protein biochemical information is continuously curated based on the available literature and structural information. Over 6400 proteins have assigned EC numbers and 700 proteins have a PDB structure. The classification (i) reflects the structural features of these enzymes better than their sole substrate specificity, (ii) helps to reveal the evolutionary relationships between these enzymes and (iii) provides a convenient framework to understand mechanistic properties. This resource has been available for over 10 years to the scientific community, contributing to information dissemination and providing a transversal nomenclature to glycobiologists. More recently, this resource has been used to improve the quality of functional predictions of a number of genome projects by providing expert annotation. The CAZy resource resides at URL: http://www.cazy.org/.
6,028 citations
•
TL;DR: This article presents the document drafted, voted on, and published by the IPCC (Intergovernmental Panel on Climate Change), which summarizes the research carried out on this important topic.
Abstract: Causes, consequences, and mitigation strategies. We offer the first in a series of articles addressing the current problem of climate change. We present the document drafted, voted on, and published by the IPCC (Intergovernmental Panel on Climate Change), which summarizes the research carried out on this important topic.
4,187 citations
••
Agricultural Research Service1, University of North Carolina at Charlotte2, Purdue University3, University of California, Berkeley4, University of Arizona5, University of Maryland, College Park6, University of Missouri7, Joint Genome Institute8, National Center for Genome Resources9, Iowa State University10, University of Wisconsin–Stevens Point11, University of Nebraska–Lincoln12
TL;DR: An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.
Abstract: Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.
3,743 citations
••
TL;DR: Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number of complete plant genomes.
Abstract: The number of sequenced plant genomes and associated genomic resources is growing rapidly with the advent of both an increased focus on plant genomics from funding agencies, and the application of inexpensive next generation sequencing. To interact with this increasing body of data, we have developed Phytozome (http://www.phytozome.net), a comparative hub for plant genome and gene family data and analysis. Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number (currently 25) of complete plant genomes, including all the land plants and selected algae sequenced at the Joint Genome Institute, as well as selected species sequenced elsewhere. Through a comprehensive plant genome database and web portal, these data and analyses are available to the broader plant science research community, providing powerful comparative genomics tools that help to link model systems with other plants of economic and ecological importance.
3,728 citations