Journal Article

The Sorghum bicolor genome and the diversification of grasses

29 Jan 2009-Nature (Nature Publishing Group)-Vol. 457, Iss: 7229, pp 551-556
TL;DR: An initial analysis of the ∼730-megabase Sorghum bicolor (L.) Moench genome is presented, placing ∼98% of genes in their chromosomal context using whole-genome shotgun sequence validated by genetic, physical and syntenic information.
Abstract: Sorghum, an African grass related to sugar cane and maize, is grown for food, feed, fibre and fuel. We present an initial analysis of the approximately 730-megabase Sorghum bicolor (L.) Moench genome, placing approximately 98% of genes in their chromosomal context using whole-genome shotgun sequence validated by genetic, physical and syntenic information. Genetic recombination is largely confined to about one-third of the sorghum genome with gene order and density similar to those of rice. Retrotransposon accumulation in recombinationally recalcitrant heterochromatin explains the approximately 75% larger genome size of sorghum compared with rice. Although gene and repetitive DNA distributions have been preserved since palaeopolyploidization approximately 70 million years ago, most duplicated gene sets lost one member before the sorghum-rice divergence. Concerted evolution makes one duplicated chromosomal segment appear to be only a few million years old. About 24% of genes are grass-specific and 7% are sorghum-specific. Recent gene and microRNA duplications may contribute to sorghum's drought tolerance.


Citations
Journal Article
Patrick S. Schnable, Doreen Ware, Robert S. Fulton, Joshua C. Stein (+156 more; 18 institutions)
20 Nov 2009-Science
TL;DR: The sequence of the maize genome reveals it to be the most complex genome known to date. The authors also report the correlation of methylation-poor regions with Mu transposon insertions and recombination, and how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state.
Abstract: We report an improved draft nucleotide sequence of the 2.3-gigabase genome of maize, an important crop plant and model for biological research. Over 32,000 genes were predicted, of which 99.8% were placed on reference chromosomes. Nearly 85% of the genome is composed of hundreds of families of transposable elements, dispersed nonuniformly across the genome. These were responsible for the capture and amplification of numerous gene fragments and affect the composition, sizes, and positions of centromeres. We also report on the correlation of methylation-poor regions with Mu transposon insertions and recombination, and copy number variants with insertions and/or deletions, as well as how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state. These analyses inform and set the stage for further investigations to improve our understanding of the domestication and agricultural improvements of maize.

3,761 citations

Journal Article
14 Jan 2010-Nature
TL;DR: An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.
Abstract: Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

3,743 citations

Journal Article
TL;DR: Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number of complete plant genomes.
Abstract: The number of sequenced plant genomes and associated genomic resources is growing rapidly with the advent of both an increased focus on plant genomics from funding agencies, and the application of inexpensive next generation sequencing. To interact with this increasing body of data, we have developed Phytozome (http://www.phytozome.net), a comparative hub for plant genome and gene family data and analysis. Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number (currently 25) of complete plant genomes, including all the land plants and selected algae sequenced at the Joint Genome Institute, as well as selected species sequenced elsewhere. Through a comprehensive plant genome database and web portal, these data and analyses are available to the broader plant science research community, providing powerful comparative genomics tools that help to link model systems with other plants of economic and ecological importance.

3,728 citations


Cites background from "The Sorghum bicolor genome and the ..."

  • ...4 models (22) Vitis vinifera Grapevine Genoscope March 2010 annotation on 12X assembly (23) Volvox carteri Volvox JGI v1....


Journal Article
TL;DR: The MCScanX toolkit implements an adjusted MCScan algorithm for detection of synteny and collinearity that extends the original software by incorporating 14 utility programs for visualization of results and additional downstream analyses.
Abstract: MCScan is an algorithm able to scan multiple genomes or subgenomes in order to identify putative homologous chromosomal regions, and align these regions using genes as anchors. The MCScanX toolkit implements an adjusted MCScan algorithm for detection of synteny and collinearity that extends the original software by incorporating 14 utility programs for visualization of results and additional downstream analyses. Applications of MCScanX to several sequenced plant genomes and gene families are shown as examples. MCScanX can be used to effectively analyze chromosome structural changes, and reveal the history of gene family expansions that might contribute to the adaptation of lineages and taxa. An integrated view of various modes of gene duplication can supplement the traditional gene tree analysis in specific families. The source code and documentation of MCScanX are freely available at http://chibba.pgml.uga.edu/mcscan2/.

3,388 citations


Cites background or methods from "The Sorghum bicolor genome and the ..."

  • ...by transposons may explain the wide spread of dispersed duplicates (27), often via pack-MULEs (55), helitrons (56), or CACTA elements (37) in plant genomes, or through ‘retropositions’ (57)....


  • ...The MCScan software package and PGDD database have been applied to a variety of research areas such as genome duplication and evolution (11,29–36), annotation of newly sequenced genomes (37) and the evolution of gene families (38–48)....


Journal Article
Shusei Sato, Satoshi Tabata, Hideki Hirakawa, Erika Asamizu (+320 more; 51 institutions)
31 May 2012-Nature
TL;DR: A high-quality genome sequence of domesticated tomato and a draft sequence of its closest wild relative, Solanum pimpinellifolium, are presented and compared to each other and to the potato genome.
Abstract: Tomato (Solanum lycopersicum) is a major crop plant and a model system for fruit development. Solanum is one of the largest angiosperm genera and includes annual and perennial plants from diverse habitats. Here we present a high-quality genome sequence of domesticated tomato, a draft sequence of its closest wild relative, Solanum pimpinellifolium, and compare them to each other and to the potato genome (Solanum tuberosum). The two tomato genomes show only 0.6% nucleotide divergence and signs of recent admixture, but show more than 8% divergence from potato, with nine large and several smaller inversions. In contrast to Arabidopsis, but similar to soybean, tomato and potato small RNAs map predominantly to gene-rich chromosomal regions, including gene promoters. The Solanum lineage has experienced two consecutive genome triplications: one that is ancient and shared with rosids, and a more recent one. These triplications set the stage for the neofunctionalization of genes controlling fruit characteristics, such as colour and fleshiness.

2,687 citations

References
Journal Article
01 Oct 1982-Gene
TL;DR: A series of plasmid vectors containing the multiple cloning site (MCS7) of M13mp7 has been constructed and a kanamycin-resistance marker has been inserted into the center of the symmetrical MCS7 to yield a restriction-site-mobilizing element (RSM).

5,719 citations

Journal Article
Takashi Matsumoto, Jianzhong Wu, Hiroyuki Kanamori, Yuichi Katayose (+262 more; 25 institutions)
11 Aug 2005-Nature
TL;DR: A map-based, finished-quality sequence covering 95% of the 389 Mb rice genome, including virtually all of the euchromatin and two complete centromeres, is presented, along with evidence for widespread and recurrent gene transfer from the organelles to the nuclear chromosomes.
Abstract: Rice, one of the world's most important food plants, has important syntenic relationships with the other cereal species and is a model plant for the grasses. Here we present a map-based, finished quality sequence that covers 95% of the 389 Mb genome, including virtually all of the euchromatin and two complete centromeres. A total of 37,544 non-transposable-element-related protein-coding genes were identified, of which 71% had a putative homologue in Arabidopsis. In a reciprocal analysis, 90% of the Arabidopsis proteins had a putative homologue in the predicted rice proteome. Twenty-nine per cent of the 37,544 predicted genes appear in clustered gene families. The number and classes of transposable elements found in the rice genome are consistent with the expansion of syntenic regions in the maize and sorghum genomes. We find evidence for widespread and recurrent gene transfer from the organelles to the nuclear chromosomes. The map-based sequence has proven useful for the identification of genes underlying agronomic traits. The additional single-nucleotide polymorphisms and simple sequence repeats identified in our study should accelerate improvements in rice production.

3,423 citations

Journal Article
TL;DR: This review integrates information on the chemical structure of individual polymers with data obtained from new techniques used to probe the arrangement of the polymers within the walls of individual cells, and provides structural models of two distinct types of walls in flowering plants consistent with the physical properties of the wall and its components.
Abstract: Advances in determination of polymer structure and in preservation of structure for electron microscopy provide the best view to date of how polysaccharides and structural proteins are organized into plant cell walls. The walls that form and partition dividing cells are modified chemically and structurally from the walls expanding to provide a cell with its functional form. In grasses, the chemical structure of the wall differs from that of all other flowering plant species that have been examined. Nevertheless, both types of wall must conform to the same physical laws. Cell expansion occurs via strictly regulated reorientation of each of the wall's components that first permits the wall to stretch in specific directions and then lock into final shape. This review integrates information on the chemical structure of individual polymers with data obtained from new techniques used to probe the arrangement of the polymers within the walls of individual cells. We provide structural models of two distinct types of walls in flowering plants consistent with the physical properties of the wall and its components.

3,417 citations

Journal Article
21 Nov 2003-Science
TL;DR: It is argued that many of the increases in genome complexity from prokaryotes to multicellular eukaryotes emerged passively in response to the long-term population-size reductions that accompanied increases in organism size, and provided novel substrates for the secondary evolution of phenotypic complexity by natural selection.
Abstract: Complete genomic sequences from diverse phylogenetic lineages reveal notable increases in genome complexity from prokaryotes to multicellular eukaryotes. The changes include gradual increases in gene number, resulting from the retention of duplicate genes, and more abrupt increases in the abundance of spliceosomal introns and mobile genetic elements. We argue that many of these modifications emerged passively in response to the long-term population-size reductions that accompanied increases in organism size. According to this model, much of the restructuring of eukaryotic genomes was initiated by nonadaptive processes, and this in turn provided novel substrates for the secondary evolution of phenotypic complexity by natural selection. The enormous long-term effective population sizes of prokaryotes may impose a substantial barrier to the evolution of complex genomes and morphologies.

1,521 citations

Journal Article
TL;DR: The observed diversity of these NBS-LRR proteins indicates the variety of recognition molecules available in an individual genotype to detect diverse biotic challenges.
Abstract: The Arabidopsis genome contains ∼200 genes that encode proteins with similarity to the nucleotide binding site and other domains characteristic of plant resistance proteins. Through a reiterative process of sequence analysis and reannotation, we identified 149 NBS-LRR–encoding genes in the Arabidopsis (ecotype Columbia) genomic sequence. Fifty-six of these genes were corrected from earlier annotations. At least 12 are predicted to be pseudogenes. As described previously, two distinct groups of sequences were identified: those that encoded an N-terminal domain with Toll/Interleukin-1 Receptor homology (TIR-NBS-LRR, or TNL), and those that encoded an N-terminal coiled-coil motif (CC-NBS-LRR, or CNL). The encoded proteins are distinct from the 58 predicted adapter proteins in the previously described TIR-X, TIR-NBS, and CC-NBS groups. Classification based on protein domains, intron positions, sequence conservation, and genome distribution defined four subgroups of CNL proteins, eight subgroups of TNL proteins, and a pair of divergent NL proteins that lack a defined N-terminal motif. CNL proteins generally were encoded in single exons, although two subclasses were identified that contained introns in unique positions. TNL proteins were encoded in modular exons, with conserved intron positions separating distinct protein domains. Conserved motifs were identified in the LRRs of both CNL and TNL proteins. In contrast to CNL proteins, TNL proteins contained large and variable C-terminal domains. The extant distribution and diversity of the NBS-LRR sequences has been generated by extensive duplication and ectopic rearrangements that involved segmental duplications as well as microscale events. The observed diversity of these NBS-LRR proteins indicates the variety of recognition molecules available in an individual genotype to detect diverse biotic challenges.

1,503 citations

Related Papers (5)
15 Sep 2006-Science
Gerald A. Tuskan, Stephen P. DiFazio, Stefan Jansson, Joerg Bohlmann, Igor V. Grigoriev, Uffe Hellsten, Nicholas H. Putnam, Steven G. Ralph, Stephane Rombauts, Asaf Salamov, Jacquie Schein, Lieven Sterck, Andrea Aerts, Rishikeshi Bhalerao, Rishikesh P. Bhalerao, Damien Blaudez, Wout Boerjan, Annick Brun, Amy M. Brunner, Victor Busov, Malcolm M. Campbell, John E. Carlson, Michel Chalot, Jarrod Chapman, G.-L. Chen, Dawn Cooper, Pedro M. Coutinho, Jérémy Couturier, Sarah F. Covert, Quentin C. B. Cronk, R. Cunningham, John M. Davis, Sven Degroeve, Annabelle Déjardin, Claude W. dePamphilis, John C. Detter, Bill Dirks, Inna Dubchak, Sébastien Duplessis, Jürgen Ehlting, Brian E. Ellis, Karla C Gendler, David Goodstein, Michael Gribskov, Jane Grimwood, Andrew Groover, Lee E. Gunter, Björn Hamberger, Berthold Heinze, Yrjö Helariutta, Bernard Henrissat, D. Holligan, Robert A. Holt, Wenyu Huang, N. Islam-Faridi, Steven J.M. Jones, M. Jones-Rhoades, Richard A. Jorgensen, Chandrashekhar P. Joshi, Jaakko Kangasjärvi, Jan Karlsson, Colin T. Kelleher, Robert Kirkpatrick, Matias Kirst, Annegret Kohler, Udaya C. Kalluri, Frank W. Larimer, Jim Leebens-Mack, Jean-Charles Leplé, Philip F. LoCascio, Y. Lou, Susan Lucas, Francis Martin, Barbara Montanini, Carolyn A. Napoli, David R. Nelson, C D Nelson, Kaisa Nieminen, Ove Nilsson, V. Pereda, Gary F. Peter, Ryan N. Philippe, Gilles Pilate, Alexander Poliakov, J. Razumovskaya, Paul G. Richardson, Cécile Rinaldi, Kermit Ritland, Pierre Rouzé, D. Ryaboy, Jeremy Schmutz, J. Schrader, Bo Segerman, H. Shin, Asim Siddiqui, Fredrik Sterky, Astrid Terry, Chung-Jui Tsai, Edward C. Uberbacher, Per Unneberg, Jorma Vahala, Kerr Wall, Susan R. Wessler, Guojun Yang, T. Yin, Carl J. Douglas, Marco A. Marra, Göran Sandberg, Y. Van de Peer, Daniel S. Rokhsar