Author

Xinming Liang

Bio: Xinming Liang is an academic researcher from Beijing Genomics Institute. The author has contributed to research in topics: Genome & Whole genome sequencing. The author has an h-index of 17 and has co-authored 29 publications receiving 4271 citations.

Papers
Journal Article
TL;DR: A draft genome sequence of Brassica oleracea is described, comparing it with that of its sister species B. rapa to reveal numerous chromosome rearrangements and asymmetrical gene loss in duplicated genomic blocks.
Abstract: Polyploidization has provided much genetic variation for plant adaptive evolution, but the mechanisms by which the molecular evolution of polyploid genomes establishes genetic architecture underlying species differentiation are unclear. Brassica is an ideal model to increase knowledge of polyploid evolution. Here we describe a draft genome sequence of Brassica oleracea, comparing it with that of its sister species B. rapa to reveal numerous chromosome rearrangements and asymmetrical gene loss in duplicated genomic blocks, asymmetrical amplification of transposable elements, differential gene co-retention for specific pathways and variation in gene expression, including alternative splicing, among a large number of paralogous and orthologous genes. Genes related to the production of anticancer phytochemicals and morphological variations illustrate consequences of genome duplication and gene divergence, imparting biochemical and morphological variation to B. oleracea. This study provides insights into Brassica genome evolution and will underpin research into the many important crops in this genus.

884 citations

Journal Article
TL;DR: A draft genome of G. hirsutum is produced using 181-fold paired-end sequences assisted by fivefold BAC-to-BAC sequences and a high-resolution genetic map. The assembly reveals conserved gene order, and concerted evolution of different regulatory mechanisms for Cellulose synthase and 1-Aminocyclopropane-1-carboxylic acid oxidase1 and 3 may be important for enhanced fiber production.
Abstract: Gossypium hirsutum has proven difficult to sequence owing to its complex allotetraploid (AtDt) genome. Here we produce a draft genome using 181-fold paired-end sequences assisted by fivefold BAC-to-BAC sequences and a high-resolution genetic map. In our assembly 88.5% of the 2,173-Mb scaffolds, which cover 89.6%–96.7% of the AtDt genome, are anchored and oriented to 26 pseudochromosomes. Comparison of this G. hirsutum AtDt genome with the already sequenced diploid Gossypium arboreum (AA) and Gossypium raimondii (DD) genomes revealed conserved gene order. Repeated sequences account for 67.2% of the AtDt genome, and transposable elements (TEs) originating from Dt seem more active than from At. Reduction in the AtDt genome size occurred after allopolyploidization. The A or At genome may have undergone positive selection for fiber traits. Concerted evolution of different regulatory mechanisms for Cellulose synthase (CesA) and 1-Aminocyclopropane-1-carboxylic acid oxidase1 and 3 (ACO1,3) may be important for enhanced fiber production in G. hirsutum.

836 citations

Journal Article
TL;DR: Comparative transcriptome studies showed the key role of the nucleotide binding site (NBS)-encoding gene family in resistance to Verticillium dahliae and the involvement of ethylene in the development of cotton fiber cells.
Abstract: Yu-Xian Zhu, Jun Wang, Shuxun Yu and colleagues report sequencing and assembly of the genome of cultivated cotton, Gossypium arboreum. Comparison with the Gossypium raimondii genome sequence provides insights into genome evolution and speciation, and identifies two shared whole-genome duplication events occurring before the speciation event around 2–13 million years ago.

729 citations

Journal Article
TL;DR: A high-quality draft genome sequence of the east Asia watermelon cultivar 97103 containing 23,440 predicted protein-coding genes is reported, which yielded important insights into aspects of phloem-based vascular signaling in common between watermelon and cucumber and identified genes crucial to valuable fruit-quality traits.
Abstract: Zhangjun Fei and colleagues report the draft genome of a Chinese elite watermelon inbred line 97103 and resequencing of 20 diverse accessions that represent the three subspecies of Citrullus lanatus. Comparative genome-wide analyses identify the extent of genetic diversity and population structure of watermelon germplasm.

646 citations

Journal Article
TL;DR: This work assembles a 280M genome by combining 101-fold next-generation sequencing and optical mapping data, and succeeds in reconstructing nine ancestral chromosomes of the Rosaceae family, as well as depicting chromosome fusion, fission and duplication history in its three major subfamilies.
Abstract: The genome of Prunus mume (mei), which was domesticated in China more than 3000 years ago as an important fruit and ornamental plant, was one of the first sequenced genomes among the Prunus subfamilies of Rosaceae. In this study, the 280 M genome was assembled into scaffolds by combining 101-fold NGS data and optical mapping data; 83.9% of these scaffolds were further anchored to eight chromosomes in a genetic map constructed by restriction site-associated DNA (RAD) sequencing. Combining the P. mume genome data with other available genome data, we reconstructed nine ancestral chromosomes of the Rosaceae family, depicting chromosome fusion, fission and duplication history in the three major Rosaceae subfamilies. We sequenced the transcriptome of various tissues and performed a genome-wide analysis to reveal the characteristics of P. mume, including its regulation of early blooming in endodormancy and biosynthesis of flower scent. The P. mume genome sequence adds to our understanding of Rosaceae evolution and provides important data for the improvement of fruit trees.

390 citations


Cited by
01 Jan 2011
TL;DR: The sheer volume and scope of this flood of genome-wide data pose a significant challenge to the development of efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data.
Abstract: Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools.

2,187 citations

Journal Article
Boulos Chalhoub, Shengyi Liu, Isobel A. P. Parkin, Haibao Tang, Xiyin Wang, Julien Chiquet, Harry Belcram, Chaobo Tong, Birgit Samans, Margot Correa, Corinne Da Silva, Jérémy Just, Cyril Falentin, Chu Shin Koh, Isabelle Le Clainche, Maria Bernard, Pascal Bento, Benjamin Noel, Karine Labadie, Adriana Alberti, Mathieu Charles, Dominique Arnaud, Hui Guo, Christian Daviaud, Salman Alamery, Kamel Jabbari, Meixia Zhao, Patrick P. Edger, Houda Chelaifa, David C. Tack, Gilles Lassalle, Imen Mestiri, Nicolas Schnel, Marie-Christine Le Paslier, Guangyi Fan, Victor Renault, Philippe E. Bayer, Agnieszka A. Golicz, Sahana Manoli, Tae-Ho Lee, Vinh Ha Dinh Thi, Smahane Chalabi, Qiong Hu, Chuchuan Fan, Reece Tollenaere, Yunhai Lu, Christophe Battail, Jinxiong Shen, Christine Sidebottom, Xinfa Wang, Aurélie Canaguier, Aurélie Chauveau, Aurélie Bérard, G. Deniot, Mei Guan, Zhongsong Liu, Fengming Sun, Yong Pyo Lim, Eric Lyons, Christopher D. Town, Ian Bancroft, Xiaowu Wang, Jinling Meng, Jianxin Ma, J. Chris Pires, Graham J.W. King, Dominique Brunel, Régine Delourme, Michel Renard, Jean-Marc Aury, Keith L. Adams, Jacqueline Batley, Rod J. Snowdon, Jörg Tost, David Edwards, Yongming Zhou, Wei Hua, Andrew G. Sharpe, Andrew H. Paterson, Chunyun Guan, Patrick Wincker
22 Aug 2014 - Science
TL;DR: The polyploid genome of Brassica napus (oilseed rape), which originated from a recent combination of two distinct genomes approximately 7500 years ago, is sequenced.
Abstract: Oilseed rape (Brassica napus L.) was formed ~7500 years ago by hybridization between B. rapa and B. oleracea, followed by chromosome doubling, a process known as allopolyploidy. Together with more ancient polyploidizations, this conferred an aggregate 72× genome multiplication since the origin of angiosperms and high gene content. We examined the B. napus genome and the consequences of its recent duplication. The constituent An and Cn subgenomes are engaged in subtle structural, functional, and epigenetic cross-talk, with abundant homeologous exchanges. Incipient gene loss and expression divergence have begun. Selection in B. napus oilseed types has accelerated the loss of glucosinolate genes, while preserving expansion of oil biosynthesis genes. These processes provide insights into allopolyploid evolution and its relationship with crop domestication and improvement.

1,743 citations

Journal Article
TL;DR: Genomic signatures of selection and domestication are associated with positively selected genes (PSGs) for fiber improvement in the A subgenome and for stress tolerance in the D subgenome, and structural differences between the subgenomes suggest asymmetric evolution.
Abstract: Upland cotton is a model for polyploid crop domestication and transgenic improvement. Here we sequenced the allotetraploid Gossypium hirsutum L. acc. TM-1 genome by integrating whole-genome shotgun reads, bacterial artificial chromosome (BAC)-end sequences and genotype-by-sequencing genetic maps. We assembled and annotated 32,032 A-subgenome genes and 34,402 D-subgenome genes. Structural rearrangements, gene loss, disrupted genes and sequence divergence were more common in the A subgenome than in the D subgenome, suggesting asymmetric evolution. However, no genome-wide expression dominance was found between the subgenomes. Genomic signatures of selection and domestication are associated with positively selected genes (PSGs) for fiber improvement in the A subgenome and for stress tolerance in the D subgenome. This draft genome sequence provides a resource for engineering superior cotton lines.

1,221 citations

Journal Article
TL;DR: SOAPnuke is demonstrated as a tool with abundant functions for a “QC-Preprocess-QC” workflow, together with a MapReduce acceleration framework that scales by distributing all the processing work across an entire compute cluster.
Abstract: Quality control (QC) and preprocessing are essential steps for sequencing data analysis to ensure the accuracy of results. However, existing tools cannot provide a satisfying solution with integrated comprehensive functions, proper architectures, and highly scalable acceleration. In this article, we demonstrate SOAPnuke as a tool with abundant functions for a "QC-Preprocess-QC" workflow and MapReduce acceleration framework. Four modules with different preprocessing functions are designed for processing datasets from genomic, small RNA, Digital Gene Expression, and metagenomic experiments, respectively. As a workflow-like tool, SOAPnuke centralizes processing functions into 1 executable and predefines their order to avoid the necessity of reformatting different files when switching tools. Furthermore, the MapReduce framework enables large scalability to distribute all the processing work to an entire compute cluster. We conducted a benchmark in which SOAPnuke and other tools were used to preprocess a ∼30× NA12878 dataset published by GIAB. The standalone operation of SOAPnuke struck a balance between resource occupancy and performance. When accelerated on 16 working nodes with MapReduce, SOAPnuke achieved ∼5.7 times the fastest speed of other tools.

1,043 citations