Author

Gengyun Zhang

Bio: Gengyun Zhang is an academic researcher from Beijing Genomics Institute. The author has contributed to research in topics: Genome & Population. The author has an h-index of 27 and has co-authored 66 publications receiving 8,035 citations. Previous affiliations of Gengyun Zhang include Beijing Institute of Genomics & Chinese Ministry of Agriculture.


Papers
Journal Article
Xun Xu, Shengkai Pan, Shifeng Cheng, Bo Zhang, Mu D, Peixiang Ni, Gengyun Zhang, Shuang Yang, Ruiqiang Li, Jun Wang, Gisella Orjeda, Frank Guzman, Torres M, Roberto Lozano, Olga Ponce, Diana Martinez, De la Cruz G, Chakrabarti Sk, Patil Vu, Konstantin G. Skryabin, Boris B. Kuznetsov, Nikolai V. Ravin, Tatjana V. Kolganova, Alexey V. Beletsky, Andrey V. Mardanov, Di Genova A, Dan Bolser, David M. A. Martin, Li G, Yang Y, Hanhui Kuang, Hu Q, Xiong X, Gerard J. Bishop, Boris Sagredo, Nilo Mejía, Zagorski W, Robert Gromadka, Jan Gawor, Pawel Szczesny, Sanwen Huang, Zhang Z, Liang C, He J, Li Y, He Y, Xu J, Youjun Zhang, Xie B, Du Y, Qu D, Merideth Bonierbale, Marc Ghislain, Herrera Mdel R, Giovanni Giuliano, Marco Pietrella, Gaetano Perrotta, Paolo Facella, O'Brien K, Sergio Enrique Feingold, Barreiro Le, Massa Ga, Luis Aníbal Diambra, Brett R Whitty, Brieanne Vaillancourt, Lin H, Alicia N. Massa, Geoffroy M, Lundback S, Dean DellaPenna, Buell Cr, Sanjeev Kumar Sharma, David Marshall, Robbie Waugh, Glenn J. Bryan, Destefanis M, Istvan Nagy, Dan Milbourne, Susan Thomson, Mark Fiers, Jeanne M. E. Jacobs, Kåre Lehmann Nielsen, Mads Sønderkær, Marina Iovene, Giovana Augusta Torres, Jiming Jiang, Richard E. Veilleux, Christian W. B. Bachem, de Boer J, Theo Borm, Bjorn Kloosterman, van Eck H, Erwin Datema, Hekkert Bt, Aska Goverse, van Ham Rc, Richard G. F. Visser
10 Jul 2011-Nature
TL;DR: The potato genome sequence, with 39,031 predicted protein-coding genes and evidence for at least two genome duplication events indicative of a palaeopolyploid origin, provides a platform for genetic improvement of this vital crop.
Abstract: Potato (Solanum tuberosum L.) is the world's most important non-grain food crop and is central to global food security. It is clonally propagated, highly heterozygous, autotetraploid, and suffers acute inbreeding depression. Here we use a homozygous doubled-monoploid potato clone to sequence and assemble 86% of the 844-megabase genome. We predict 39,031 protein-coding genes and present evidence for at least two genome duplication events indicative of a palaeopolyploid origin. As the first genome sequence of an asterid, the potato genome reveals 2,642 genes specific to this large angiosperm clade. We also sequenced a heterozygous diploid clone and show that gene presence/absence variants and other potentially deleterious mutations occur frequently and are a likely cause of inbreeding depression. Gene family expansion, tissue-specific expression and recruitment of genes to new pathways contributed to the evolution of tuber development. The potato genome sequence provides a platform for genetic improvement of this vital crop.

1,813 citations

Journal Article
TL;DR: This work reports the ∼738-Mb draft whole genome shotgun sequence of CDC Frontier, a kabuli chickpea variety, which contains an estimated 28,269 genes, and identifies targets of both breeding-associated genetic sweeps and breeding-associated balancing selection.
Abstract: Chickpea (Cicer arietinum) is the second most widely grown legume crop after soybean, accounting for a substantial proportion of human dietary nitrogen intake and playing a crucial role in food security in developing countries. We report the ~738-Mb draft whole genome shotgun sequence of CDC Frontier, a kabuli chickpea variety, which contains an estimated 28,269 genes. Resequencing and analysis of 90 cultivated and wild genotypes from ten countries identifies targets of both breeding-associated genetic sweeps and breeding-associated balancing selection. Candidate genes for disease resistance and agronomic traits are highlighted, including traits that distinguish the two main market classes of cultivated chickpea—desi and kabuli. These data comprise a resource for chickpea improvement through molecular breeding and provide insights into both genome diversity and domestication.

1,014 citations

Journal Article
TL;DR: A high level of linkage disequilibrium is identified in the soybean genome, suggesting that marker-assisted breeding of soybean will be less challenging than map-based cloning; the data provide a resource for future breeding and quantitative trait analysis.
Abstract: We report a large-scale analysis of the patterns of genome-wide genetic variation in soybeans. We re-sequenced a total of 17 wild and 14 cultivated soybean genomes to an average of approximately 5× depth and >90% coverage using the Illumina Genome Analyzer II platform. We compared the patterns of genetic variation between wild and cultivated soybeans and identified higher allelic diversity in wild soybeans. We identified a high level of linkage disequilibrium in the soybean genome, suggesting that marker-assisted breeding of soybean will be less challenging than map-based cloning. We report linkage disequilibrium block location and distribution, and we identified a set of 205,614 tag SNPs that may be useful for QTL mapping and association studies. The data here provide a valuable resource for the analysis of wild soybeans and will facilitate future breeding and quantitative trait analysis.

936 citations

Journal Article
25 Apr 2018-Nature
TL;DR: Analyses of genetic variation and population structure based on over 3,000 cultivated rice (Oryza sativa) genomes reveal subpopulations that correlate with geographic location and patterns of introgression consistent with multiple rice domestication events.
Abstract: Here we analyse genetic variation, population structure and diversity among 3,010 diverse Asian cultivated rice (Oryza sativa L.) genomes from the 3,000 Rice Genomes Project. Our results are consistent with the five major groups previously recognized, but also suggest several unreported subpopulations that correlate with geographic location. We identified 29 million single nucleotide polymorphisms, 2.4 million small indels and over 90,000 structural variations that contribute to within- and between-population variation. Using pan-genome analyses, we identified more than 10,000 novel full-length protein-coding genes and a high number of presence-absence variations. The complex patterns of introgression observed in domestication genes are consistent with multiple independent rice domestication events. The public availability of data from the 3,000 Rice Genomes Project provides a resource for rice genomics research and breeding.

885 citations

Journal Article
TL;DR: A comprehensive assessment of the evolution of modern maize based on the genome-wide resequencing of 75 wild, landrace and improved maize lines finds evidence of recovery of diversity after domestication, likely introgression from wild relatives, and evidence for stronger selection during domestication than improvement.
Abstract: Domestication and plant breeding are ongoing 10,000-year-old evolutionary experiments that have radically altered wild species to meet human needs. Maize has undergone a particularly striking transformation. Researchers have sought for decades to identify the genes underlying maize evolution, but these efforts have been limited in scope. Here, we report a comprehensive assessment of the evolution of modern maize based on the genome-wide resequencing of 75 wild, landrace and improved maize lines. We find evidence of recovery of diversity after domestication, likely introgression from wild relatives, and evidence for stronger selection during domestication than improvement. We identify a number of genes with stronger signals of selection than those previously shown to underlie major morphological changes. Finally, through transcriptome-wide analysis of gene expression, we find evidence both consistent with removal of cis-acting variation during maize domestication and improvement and suggestive of modern breeding having increased dominance in expression while targeting highly expressed genes.

788 citations


Cited by
Journal Article
Fumio Tajima
30 Oct 1989-Genomics
TL;DR: It is suggested that natural selection against large insertions/deletions is so weak that a large amount of variation is maintained in a population.

11,521 citations

01 Jun 2012
TL;DR: SPAdes is a new assembler for both single-cell and standard (multicell) assembly that improves on the recently released E+V-SC assembler and on the popular assemblers Velvet and SoapDeNovo (for multicell data).
Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online ( http://bioinf.spbau.ru/spades ). It is distributed as open source software.

10,124 citations

Journal Article
TL;DR: Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number of complete plant genomes.
Abstract: The number of sequenced plant genomes and associated genomic resources is growing rapidly with the advent of both an increased focus on plant genomics from funding agencies, and the application of inexpensive next generation sequencing. To interact with this increasing body of data, we have developed Phytozome (http://www.phytozome.net), a comparative hub for plant genome and gene family data and analysis. Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number (currently 25) of complete plant genomes, including all the land plants and selected algae sequenced at the Joint Genome Institute, as well as selected species sequenced elsewhere. Through a comprehensive plant genome database and web portal, these data and analyses are available to the broader plant science research community, providing powerful comparative genomics tools that help to link model systems with other plants of economic and ecological importance.

3,728 citations

Journal Article
Shusei Sato, Satoshi Tabata, Hideki Hirakawa, Erika Asamizu, and 320 more authors (51 institutions)
31 May 2012-Nature
TL;DR: A high-quality genome sequence of domesticated tomato and a draft sequence of its closest wild relative, Solanum pimpinellifolium, are presented and compared to each other and to the potato genome.
Abstract: Tomato (Solanum lycopersicum) is a major crop plant and a model system for fruit development. Solanum is one of the largest angiosperm genera and includes annual and perennial plants from diverse habitats. Here we present a high-quality genome sequence of domesticated tomato, a draft sequence of its closest wild relative, Solanum pimpinellifolium, and compare them to each other and to the potato genome (Solanum tuberosum). The two tomato genomes show only 0.6% nucleotide divergence and signs of recent admixture, but show more than 8% divergence from potato, with nine large and several smaller inversions. In contrast to Arabidopsis, but similar to soybean, tomato and potato small RNAs map predominantly to gene-rich chromosomal regions, including gene promoters. The Solanum lineage has experienced two consecutive genome triplications: one that is ancient and shared with rosids, and a more recent one. These triplications set the stage for the neofunctionalization of genes controlling fruit characteristics, such as colour and fleshiness.

2,687 citations