Author

Jianchang Du

Other affiliations: University College West
Bio: Jianchang Du is an academic researcher from Purdue University. The author has contributed to research in topics: Genome & Retrotransposon. The author has an h-index of 14 and has co-authored 16 publications that have received 4,102 citations. Previous affiliations of Jianchang Du include University College West.

Papers
Journal Article
14 Jan 2010-Nature
TL;DR: An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.
Abstract: Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

3,743 citations

Journal Article
TL;DR: Investigation of the distribution and structural variation of LTR-RTs in relation to the rates of local genetic recombination (GR) and gene densities in the rice genome suggests that GR and gene density play important roles in shaping the dynamic structure of the rice genome.
Abstract: In flowering plants, the accumulation of small deletions through unequal homologous recombination (UR) and illegitimate recombination (IR) is proposed to be the major process counteracting genome expansion, which is caused primarily by the periodic amplification of long terminal repeat retrotransposons (LTR-RTs). However, the full suite of evolutionary forces that govern the gain or loss of transposable elements (TEs) and their distribution within a genome remains unclear. Here, we investigated the distribution and structural variation of LTR-RTs in relation to the rates of local genetic recombination (GR) and gene densities in the rice (Oryza sativa) genome. Our data revealed a positive correlation between GR rates and gene densities and negative correlations between LTR-RT densities and both GR and gene densities. The data also indicate a tendency for LTR-RT elements and fragments to be shorter in regions with higher GR rates; the size reduction of LTR-RTs appears to be achieved primarily through solo LTR formation by UR. Comparison of indica and japonica rice revealed patterns and frequencies of LTR-RT gain and loss within different evolutionary timeframes. Different LTR-RT families exhibited variable distribution patterns and structural changes, but overall LTR-RT compositions and genes were organized according to the GR gradients of the genome. Further investigation of non-LTR-RTs and DNA transposons revealed a negative correlation between gene densities and the abundance of DNA transposons and a weak correlation between GR rates and the abundance of long interspersed nuclear elements (LINEs)/short interspersed nuclear elements (SINEs). Together, these observations suggest that GR and gene density play important roles in shaping the dynamic structure of the rice genome.

157 citations

Journal Article
TL;DR: In this article, the authors investigated 510 long terminal repeat-retrotransposon (LTR-RT) families comprising 32 370 elements in soybean (Glycine max (L.) Merr.).
Abstract: The availability of complete or nearly complete genome sequences from several plant species permits detailed discovery and cross-species comparison of transposable elements (TEs) at the whole genome level. We initially investigated 510 long terminal repeat-retrotransposon (LTR-RT) families comprising 32 370 elements in soybean (Glycine max (L.) Merr.). Approximately 87% of these elements were located in recombination-suppressed pericentromeric regions, where the ratio (1.26) of solo LTRs to intact elements (S/I) is significantly lower than that of chromosome arms (1.62). Further analysis revealed a significant positive correlation between S/I and LTR sizes, indicating that larger LTRs facilitate solo LTR formation. Phylogenetic analysis revealed seven Copia and five Gypsy evolutionary lineages that were present before the divergence of eudicot and monocot species, but the scales and timeframes within which they proliferated vary dramatically across families, lineages and species, and notably, a Copia lineage has been lost in soybean. Analysis of the physical association of LTR-RTs with centromere satellite repeats identified two putative centromere retrotransposon (CR) families of soybean, which were grouped into the CR (e.g. CRR and CRM) lineage found in grasses, indicating that the ‘functional specification’ of CR pre-dates the bifurcation of eudicots and monocots. However, a number of families of the CR lineage are not concentrated in centromeres, suggesting that their CR roles may now be defunct. Our data also suggest that the envelope-like genes in the putative Copia retrovirus-like family are probably derived from the Gypsy retrovirus-like lineage, and thus we propose the hypothesis of a single ancient origin of envelope-like genes in flowering plants.

155 citations

Journal Article
TL;DR: SoyTEdb provides resources and information related to transposable elements in the soybean genome, representing the most comprehensive and the largest manually curated transposable element database for any individual plant genome completely sequenced to date.
Abstract: Background: Transposable elements are the most abundant components of all characterized genomes of higher eukaryotes. It has been documented that these elements not only contribute to the shaping and reshaping of their host genomes, but also play significant roles in regulating gene expression, altering gene function, and creating new genes. Thus, complete identification of transposable elements in sequenced genomes and construction of comprehensive transposable element databases are essential for accurate annotation of genes and other genomic components, for investigation of potential functional interaction between transposable elements and genes, and for study of genome evolution. The recent availability of the soybean genome sequence has provided an unprecedented opportunity for discovery, and structural and functional characterization of transposable elements in this economically important legume crop. Description: Using a combination of structure-based and homology-based approaches, a total of 32,552 retrotransposons (Class I) and 6,029 DNA transposons (Class II) with clear boundaries and insertion sites were structurally annotated and clearly categorized, and a soybean transposable element database, SoyTEdb, was established. These transposable elements have been anchored in and integrated with the soybean physical map and genetic map, and are browsable and visualizable at any scale along the 20 soybean chromosomes, along with predicted genes and other sequence annotations. BLAST search and other infrastructure tools were implemented to facilitate annotation of transposable elements or fragments from soybean and other related legume species. The majority (> 95%) of these elements (particularly a few hundred low-copy-number families) are first described in this study.
Conclusion: SoyTEdb provides resources and information related to transposable elements in the soybean genome, representing the most comprehensive and the largest manually curated transposable element database for any individual plant genome completely sequenced to date. Transposable elements previously identified in legumes, the third largest family of flowering plants, are relatively scarce. Thus this database will facilitate structural, evolutionary, functional, and epigenetic analyses of transposable elements in soybean and other legume species.

125 citations

Journal Article
TL;DR: The de novo assembly and sequence analysis of a Chinese soybean genome for “Zhonghuang 13” by a combination of SMRT, Hi-C and optical mapping data is reported, finding more than 250,000 structure variations.
Abstract: Soybean was domesticated in China and has become one of the most important oilseed crops. Due to bottlenecks in their introduction and dissemination, soybeans from different geographic areas exhibit extensive genetic diversity. Asia is the largest soybean market; therefore, a high-quality soybean reference genome from this area is critical for soybean research and breeding. Here, we report the de novo assembly and sequence analysis of a Chinese soybean genome for “Zhonghuang 13” by a combination of SMRT, Hi-C and optical mapping data. The assembled genome size is 1.025 Gb with a contig N50 of 3.46 Mb and a scaffold N50 of 51.87 Mb. Comparisons between this genome and the previously reported reference genome (cv. Williams 82) uncovered more than 250,000 structure variations. A total of 52,051 protein-coding genes and 36,429 transposable elements were annotated for this genome, and a gene co-expression network including 39,967 genes was also established. This high-quality Chinese soybean genome and its sequence analysis will provide valuable information for soybean improvement in the future.

105 citations
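
Note on the N50 figures quoted in the Zhonghuang 13 abstract: N50 is the length L such that sequences of length L or longer together cover at least half of the total assembly. The sketch below shows that calculation under the common definition; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the paper or its data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Uses the common definition: sort lengths from longest to shortest and
    return the length at which the running total first reaches half of the
    summed length of all sequences.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths (not real assembly data): total is 300, so the
# running sum reaches 150 at the second-longest sequence.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70
```

A larger N50 means that fewer, longer sequences account for half of the assembly, which is why the scaffold N50 (51.87 Mb) exceeds the contig N50 (3.46 Mb) once contigs are joined into scaffolds.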


Cited by
Journal Article
TL;DR: Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number of complete plant genomes.
Abstract: The number of sequenced plant genomes and associated genomic resources is growing rapidly with the advent of both an increased focus on plant genomics from funding agencies, and the application of inexpensive next generation sequencing. To interact with this increasing body of data, we have developed Phytozome (http://www.phytozome.net), a comparative hub for plant genome and gene family data and analysis. Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number (currently 25) of complete plant genomes, including all the land plants and selected algae sequenced at the Joint Genome Institute, as well as selected species sequenced elsewhere. Through a comprehensive plant genome database and web portal, these data and analyses are available to the broader plant science research community, providing powerful comparative genomics tools that help to link model systems with other plants of economic and ecological importance.

3,728 citations

Journal Article
TL;DR: The MCScanX toolkit implements an adjusted MCScan algorithm for detection of synteny and collinearity that extends the original software by incorporating 14 utility programs for visualization of results and additional downstream analyses.
Abstract: MCScan is an algorithm able to scan multiple genomes or subgenomes in order to identify putative homologous chromosomal regions, and align these regions using genes as anchors. The MCScanX toolkit implements an adjusted MCScan algorithm for detection of synteny and collinearity that extends the original software by incorporating 14 utility programs for visualization of results and additional downstream analyses. Applications of MCScanX to several sequenced plant genomes and gene families are shown as examples. MCScanX can be used to effectively analyze chromosome structural changes, and reveal the history of gene family expansions that might contribute to the adaptation of lineages and taxa. An integrated view of various modes of gene duplication can supplement the traditional gene tree analysis in specific families. The source code and documentation of MCScanX are freely available at http://chibba.pgml.uga.edu/mcscan2/.

3,388 citations

Journal Article
TL;DR: It is becoming clear that a single WRKY transcription factor might be involved in regulating several seemingly disparate processes, and that members of the family play roles in both the repression and de-repression of important plant processes.

1,967 citations

Journal Article
Boulos Chalhoub, Shengyi Liu, Isobel A. P. Parkin, Haibao Tang, Xiyin Wang, Julien Chiquet, Harry Belcram, Chaobo Tong, Birgit Samans, Margot Correa, Corinne Da Silva, Jérémy Just, Cyril Falentin, Chu Shin Koh, Isabelle Le Clainche, Maria Bernard, Pascal Bento, Benjamin Noel, Karine Labadie, Adriana Alberti, Mathieu Charles, Dominique Arnaud, Hui Guo, Christian Daviaud, Salman Alamery, Kamel Jabbari, Meixia Zhao, Patrick P. Edger, Houda Chelaifa, David C. Tack, Gilles Lassalle, Imen Mestiri, Nicolas Schnel, Marie-Christine Le Paslier, Guangyi Fan, Victor Renault, Philippe E. Bayer, Agnieszka A. Golicz, Sahana Manoli, Tae-Ho Lee, Vinh Ha Dinh Thi, Smahane Chalabi, Qiong Hu, Chuchuan Fan, Reece Tollenaere, Yunhai Lu, Christophe Battail, Jinxiong Shen, Christine Sidebottom, Xinfa Wang, Aurélie Canaguier, Aurélie Chauveau, Aurélie Bérard, G. Deniot, Mei Guan, Zhongsong Liu, Fengming Sun, Yong Pyo Lim, Eric Lyons, Christopher D. Town, Ian Bancroft, Xiaowu Wang, Jinling Meng, Jianxin Ma, J. Chris Pires, Graham J.W. King, Dominique Brunel, Régine Delourme, Michel Renard, Jean-Marc Aury, Keith L. Adams, Jacqueline Batley, Rod J. Snowdon, Jörg Tost, David Edwards, Yongming Zhou, Wei Hua, Andrew G. Sharpe, Andrew H. Paterson, Chunyun Guan, Patrick Wincker
22 Aug 2014-Science
TL;DR: The polyploid genome of Brassica napus, which originated from a recent combination of two distinct genomes approximately 7500 years ago and gave rise to oilseed rape crops, is sequenced.
Abstract: Oilseed rape (Brassica napus L.) was formed ~7500 years ago by hybridization between B. rapa and B. oleracea, followed by chromosome doubling, a process known as allopolyploidy. Together with more ancient polyploidizations, this conferred an aggregate 72× genome multiplication since the origin of angiosperms and high gene content. We examined the B. napus genome and the consequences of its recent duplication. The constituent An and Cn subgenomes are engaged in subtle structural, functional, and epigenetic cross-talk, with abundant homeologous exchanges. Incipient gene loss and expression divergence have begun. Selection in B. napus oilseed types has accelerated the loss of glucosinolate genes, while preserving expansion of oil biosynthesis genes. These processes provide insights into allopolyploid evolution and its relationship with crop domestication and improvement.

1,743 citations

10 Dec 2007
TL;DR: The experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation.
Abstract: EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports eukaryotic gene structures as a weighted consensus of all available evidence. EVM, when combined with the Program to Assemble Spliced Alignments (PASA), yields a comprehensive, configurable annotation system that predicts protein-coding genes and alternatively spliced isoforms. Our experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation.

1,528 citations