Author

Haibao Tang

Bio: Haibao Tang is an academic researcher at Fujian Agriculture and Forestry University. The author has contributed to research on the topics Genome and Gene, has an h-index of 55, and has co-authored 137 publications receiving 22,306 citations. Previous affiliations of Haibao Tang include the University of Arizona and the J. Craig Venter Institute.


Papers
Journal Article
TL;DR: The MCScanX toolkit implements an adjusted MCScan algorithm for detection of synteny and collinearity that extends the original software by incorporating 14 utility programs for visualization of results and additional downstream analyses.
Abstract: MCScan is an algorithm able to scan multiple genomes or subgenomes in order to identify putative homologous chromosomal regions, and align these regions using genes as anchors. The MCScanX toolkit implements an adjusted MCScan algorithm for detection of synteny and collinearity that extends the original software by incorporating 14 utility programs for visualization of results and additional downstream analyses. Applications of MCScanX to several sequenced plant genomes and gene families are shown as examples. MCScanX can be used to effectively analyze chromosome structural changes, and reveal the history of gene family expansions that might contribute to the adaptation of lineages and taxa. An integrated view of various modes of gene duplication can supplement the traditional gene tree analysis in specific families. The source code and documentation of MCScanX are freely available at http://chibba.pgml.uga.edu/mcscan2/.

3,388 citations

Journal Article
29 Jan 2009-Nature
TL;DR: An initial analysis of the ∼730-megabase Sorghum bicolor (L.) Moench genome is presented, placing ∼98% of genes in their chromosomal context using whole-genome shotgun sequence validated by genetic, physical and syntenic information.
Abstract: Sorghum, an African grass related to sugar cane and maize, is grown for food, feed, fibre and fuel. We present an initial analysis of the approximately 730-megabase Sorghum bicolor (L.) Moench genome, placing approximately 98% of genes in their chromosomal context using whole-genome shotgun sequence validated by genetic, physical and syntenic information. Genetic recombination is largely confined to about one-third of the sorghum genome with gene order and density similar to those of rice. Retrotransposon accumulation in recombinationally recalcitrant heterochromatin explains the approximately 75% larger genome size of sorghum compared with rice. Although gene and repetitive DNA distributions have been preserved since palaeopolyploidization approximately 70 million years ago, most duplicated gene sets lost one member before the sorghum-rice divergence. Concerted evolution makes one duplicated chromosomal segment appear to be only a few million years old. About 24% of genes are grass-specific and 7% are sorghum-specific. Recent gene and microRNA duplications may contribute to sorghum's drought tolerance.

2,809 citations

Journal Article
Shusei Sato, Satoshi Tabata, Hideki Hirakawa, Erika Asamizu, +320 more (51 institutions)
31 May 2012-Nature
TL;DR: A high-quality genome sequence of domesticated tomato and a draft sequence of its closest wild relative, Solanum pimpinellifolium, are presented and compared to each other and to the potato genome.
Abstract: Tomato (Solanum lycopersicum) is a major crop plant and a model system for fruit development. Solanum is one of the largest angiosperm genera and includes annual and perennial plants from diverse habitats. Here we present a high-quality genome sequence of domesticated tomato, a draft sequence of its closest wild relative, Solanum pimpinellifolium, and compare them to each other and to the potato genome (Solanum tuberosum). The two tomato genomes show only 0.6% nucleotide divergence and signs of recent admixture, but show more than 8% divergence from potato, with nine large and several smaller inversions. In contrast to Arabidopsis, but similar to soybean, tomato and potato small RNAs map predominantly to gene-rich chromosomal regions, including gene promoters. The Solanum lineage has experienced two consecutive genome triplications: one that is ancient and shared with rosids, and a more recent one. These triplications set the stage for the neofunctionalization of genes controlling fruit characteristics, such as colour and fleshiness.

2,687 citations

Journal Article
Xiaowu Wang, Hanzhong Wang, Jun Wang, Rifei Sun, Jian Wu, Shengyi Liu, Yinqi Bai, Jeong-Hwan Mun, Ian Bancroft, Feng Cheng, Sanwen Huang, Xixiang Li, Wei Hua, Junyi Wang, Xiyin Wang, Michael Freeling, J. Chris Pires, Andrew H. Paterson, Boulos Chalhoub, Bo Wang, Alice Hayward, Andrew G. Sharpe, Beom-Seok Park, Bernd Weisshaar, Binghang Liu, Bo Li, Bo Liu, Chaobo Tong, Chi Song, Chris Duran, Chunfang Peng, Geng Chunyu, Chushin Koh, Chuyu Lin, David Edwards, Desheng Mu, Di Shen, Eleni Soumpourou, Fei Li, Fiona Fraser, Gavin C. Conant, Gilles Lassalle, Graham J.W. King, Guusje Bonnema, Haibao Tang, Haiping Wang, Harry Belcram, Heling Zhou, Hideki Hirakawa, Hiroshi Abe, Hui Guo, Hui Wang, Huizhe Jin, Isobel A. P. Parkin, Jacqueline Batley, Jeong-Sun Kim, Jérémy Just, Jianwen Li, Jiaohui Xu, Jie Deng, Jin A Kim, Jingping Li, Jingyin Yu, Jinling Meng, Jinpeng Wang, Jiumeng Min, Julie Poulain, Katsunori Hatakeyama, Kui Wu, Li Wang, Lu Fang, Martin Trick, Matthew G. Links, Meixia Zhao, Mina Jin, Nirala Ramchiary, Nizar Drou, Paul J. Berkman, Qingle Cai, Quanfei Huang, Ruiqiang Li, Satoshi Tabata, Shifeng Cheng, Shu Zhang, Shujiang Zhang, Shunmou Huang, Shusei Sato, Silong Sun, Soo-Jin Kwon, Su-Ryun Choi, Tae-Ho Lee, Wei Fan, Xiang Zhao, Xu Tan, Xun Xu, Yan Wang, Yang Qiu, Ye Yin, Yingrui Li, Yongchen Du, Yongcui Liao, Yong Pyo Lim, Yoshihiro Narusaka, Yupeng Wang, Zhenyi Wang, Zhenyu Li, Zhiwen Wang, Zhiyong Xiong, Zhonghua Zhang
TL;DR: The annotation and analysis of the draft genome sequence of Brassica rapa accession Chiifu-401-42, a Chinese cabbage, are reported, with Arabidopsis thaliana used as an outgroup to investigate the consequences of genome triplication, such as structural and functional evolution.
Abstract: We report the annotation and analysis of the draft genome sequence of Brassica rapa accession Chiifu-401-42, a Chinese cabbage. We modeled 41,174 protein coding genes in the B. rapa genome, which has undergone genome triplication. We used Arabidopsis thaliana as an outgroup for investigating the consequences of genome triplication, such as structural and functional evolution. The extent of gene loss (fractionation) among triplicated genome segments varies, with one of the three copies consistently retaining a disproportionately large fraction of the genes expected to have been present in its ancestor. Variation in the number of members of gene families present in the genome may contribute to the remarkable morphological plasticity of Brassica species. The B. rapa genome sequence provides an important resource for studying the evolution of polyploid genomes and underpins the genetic improvement of Brassica oil and vegetable crops.

1,811 citations

Journal Article
Boulos Chalhoub, Shengyi Liu, Isobel A. P. Parkin, Haibao Tang, Xiyin Wang, Julien Chiquet, Harry Belcram, Chaobo Tong, Birgit Samans, Margot Correa, Corinne Da Silva, Jérémy Just, Cyril Falentin, Chu Shin Koh, Isabelle Le Clainche, Maria Bernard, Pascal Bento, Benjamin Noel, Karine Labadie, Adriana Alberti, Mathieu Charles, Dominique Arnaud, Hui Guo, Christian Daviaud, Salman Alamery, Kamel Jabbari, Meixia Zhao, Patrick P. Edger, Houda Chelaifa, David C. Tack, Gilles Lassalle, Imen Mestiri, Nicolas Schnel, Marie-Christine Le Paslier, Guangyi Fan, Victor Renault, Philippe E. Bayer, Agnieszka A. Golicz, Sahana Manoli, Tae-Ho Lee, Vinh Ha Dinh Thi, Smahane Chalabi, Qiong Hu, Chuchuan Fan, Reece Tollenaere, Yunhai Lu, Christophe Battail, Jinxiong Shen, Christine Sidebottom, Xinfa Wang, Aurélie Canaguier, Aurélie Chauveau, Aurélie Bérard, G. Deniot, Mei Guan, Zhongsong Liu, Fengming Sun, Yong Pyo Lim, Eric Lyons, Christopher D. Town, Ian Bancroft, Xiaowu Wang, Jinling Meng, Jianxin Ma, J. Chris Pires, Graham J.W. King, Dominique Brunel, Régine Delourme, Michel Renard, Jean-Marc Aury, Keith L. Adams, Jacqueline Batley, Rod J. Snowdon, Jörg Tost, David Edwards, Yongming Zhou, Wei Hua, Andrew G. Sharpe, Andrew H. Paterson, Chunyun Guan, Patrick Wincker
22 Aug 2014-Science
TL;DR: The polyploid genome of Brassica napus, which originated from a combination of two distinct genomes approximately 7500 years ago and gave rise to the oilseed rape crop, is sequenced.
Abstract: Oilseed rape (Brassica napus L.) was formed ~7500 years ago by hybridization between B. rapa and B. oleracea, followed by chromosome doubling, a process known as allopolyploidy. Together with more ancient polyploidizations, this conferred an aggregate 72× genome multiplication since the origin of angiosperms and high gene content. We examined the B. napus genome and the consequences of its recent duplication. The constituent An and Cn subgenomes are engaged in subtle structural, functional, and epigenetic cross-talk, with abundant homeologous exchanges. Incipient gene loss and expression divergence have begun. Selection in B. napus oilseed types has accelerated the loss of glucosinolate genes, while preserving expansion of oil biosynthesis genes. These processes provide insights into allopolyploid evolution and its relationship with crop domestication and improvement.

1,743 citations


Cited by
01 Jun 2012
TL;DR: SPAdes is a new assembler for both single-cell and standard (multicell) assembly that improves on the recently released E+V-SC assembler and on the popular assemblers Velvet and SoapDeNovo (for multicell data).
Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online ( http://bioinf.spbau.ru/spades ). It is distributed as open source software.

10,124 citations

Journal Article
Patrick S. Schnable, Doreen Ware, Robert S. Fulton, Joshua C. Stein, +156 more (18 institutions)
20 Nov 2009-Science
TL;DR: The sequence of the maize genome reveals it to be the most complex genome known to date; the study also reports the correlation of methylation-poor regions with Mu transposon insertions and recombination, and how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state.
Abstract: We report an improved draft nucleotide sequence of the 2.3-gigabase genome of maize, an important crop plant and model for biological research. Over 32,000 genes were predicted, of which 99.8% were placed on reference chromosomes. Nearly 85% of the genome is composed of hundreds of families of transposable elements, dispersed nonuniformly across the genome. These were responsible for the capture and amplification of numerous gene fragments and affect the composition, sizes, and positions of centromeres. We also report on the correlation of methylation-poor regions with Mu transposon insertions and recombination, and copy number variants with insertions and/or deletions, as well as how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state. These analyses inform and set the stage for further investigations to improve our understanding of the domestication and agricultural improvements of maize.

3,761 citations

Journal Article
14 Jan 2010-Nature
TL;DR: An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.
Abstract: Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

3,743 citations

Journal Article

3,734 citations

Journal Article
TL;DR: Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number of complete plant genomes.
Abstract: The number of sequenced plant genomes and associated genomic resources is growing rapidly with the advent of both an increased focus on plant genomics from funding agencies, and the application of inexpensive next generation sequencing. To interact with this increasing body of data, we have developed Phytozome (http://www.phytozome.net), a comparative hub for plant genome and gene family data and analysis. Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number (currently 25) of complete plant genomes, including all the land plants and selected algae sequenced at the Joint Genome Institute, as well as selected species sequenced elsewhere. Through a comprehensive plant genome database and web portal, these data and analyses are available to the broader plant science research community, providing powerful comparative genomics tools that help to link model systems with other plants of economic and ecological importance.

3,728 citations