Author

Ming Wen

Bio: Ming Wen is an academic researcher from National University of Singapore. The author has contributed to research in topics: Gene & Genome. The author has an h-index of 13 and has co-authored 16 publications receiving 3349 citations. Previous affiliations of Ming Wen include Sun Yat-sen University & University of Ottawa.
Topics: Gene, Genome, Sequence assembly, Cucumis, Oryza sativa

Papers
Journal Article
TL;DR: This study establishes that five of the cucumber's seven chromosomes arose from fusions of ten ancestral chromosomes after divergence from Cucumis melo, and identifies 686 gene clusters related to phloem function.
Abstract: Cucumber is an economically important crop as well as a model system for sex determination studies and plant vascular biology. Here we report the draft genome sequence of Cucumis sativus var. sativus L., assembled using a novel combination of traditional Sanger and next-generation Illumina GA sequencing technologies to obtain 72.2-fold genome coverage. The absence of recent whole-genome duplication, along with the presence of few tandem duplications, explains the small number of genes in the cucumber. Our study establishes that five of the cucumber's seven chromosomes arose from fusions of ten ancestral chromosomes after divergence from Cucumis melo. The sequenced cucumber genome affords insight into traits such as its sex expression, disease resistance, biosynthesis of cucurbitacin and 'fresh green' odor. We also identify 686 gene clusters related to phloem function. The cucumber genome provides a valuable resource for developing elite cultivars and for studying the evolution and function of the plant vascular system.

1,289 citations

Journal Article
Ruiqiang Li, Wei Fan, Geng Tian, Hongmei Zhu, Lin He, Jing Cai, Quanfei Huang, Qingle Cai, Bo Li, Yinqi Bai, Zhihe Zhang, Ya-Ping Zhang, Wen Wang, Jun Li, Fuwen Wei, Heng Li, Min Jian, Jianwen Li, Zhaolei Zhang, Rasmus Nielsen, Dawei Li, Wanjun Gu, Zhentao Yang, Zhaoling Xuan, Oliver A. Ryder, Frederick C. Leung, Yan Zhou, Jianjun Cao, Xiao Sun, Yonggui Fu, Xiaodong Fang, Xiaosen Guo, Bo Wang, Rong Hou, Fujun Shen, Bo Mu, Peixiang Ni, Runmao Lin, Wubin Qian, Guo-Dong Wang, Chang Yu, Wenhui Nie, Jinhuan Wang, Zhigang Wu, Huiqing Liang, Jiumeng Min, Qi Wu, Shifeng Cheng, Jue Ruan, Mingwei Wang, Zhongbin Shi, Ming Wen, Binghang Liu, Xiaoli Ren, Huisong Zheng, Dong Dong, Kathleen Cook, Gao Shan, Hao Zhang, Carolin Kosiol, Xueying Xie, Zuhong Lu, Hancheng Zheng, Yingrui Li, Cynthia C. Steiner, Tommy Tsan-Yuk Lam, Siyuan Lin, Qinghui Zhang, Guoqing Li, Jing Tian, Timing Gong, Hongde Liu, Dejin Zhang, Lin Fang, Chen Ye, Juanbin Zhang, Wenbo Hu, Anlong Xu, Yuanyuan Ren, Guojie Zhang, Michael William Bruford, Qibin Li, Lijia Ma, Yiran Guo, Na An, Yujie Hu, Yang Zheng, Yongyong Shi, Zhiqiang Li, Qing Liu, Yanling Chen, Jing Zhao, Ning Qu, Shancen Zhao, Feng Tian, Xiaoling Wang, Haiyin Wang, Lizhi Xu, Xiao Liu, Tomas Vinar, Yajun Wang, Tak-Wah Lam, Siu-Ming Yiu, Shiping Liu, Hemin Zhang, Desheng Li, Yan Huang, Xia Wang, Guohua Yang, Zhi Jiang, Junyi Wang, Nan Qin, Li Li, Jingxiang Li, Lars Bolund, Karsten Kristiansen, Gane Ka-Shu Wong, Maynard V. Olson, Xiuqing Zhang, Songgang Li, Huanming Yang, Jing Wang, Jun Wang
21 Jan 2010-Nature
TL;DR: Using next-generation sequencing technology alone, a draft sequence of the giant panda genome is generated and assembled, indicating that its bamboo diet might be more dependent on its gut microbiome than its own genetic composition.
Abstract: Using next-generation sequencing technology alone, we have successfully generated and assembled a draft sequence of the giant panda genome. The assembled contigs (2.25 gigabases (Gb)) cover approximately 94% of the whole genome, and the remaining gaps (0.05 Gb) seem to contain carnivore-specific repeats and tandem repeats. Comparisons with the dog and human showed that the panda genome has a lower divergence rate. The assessment of panda genes potentially underlying some of its unique traits indicated that its bamboo diet might be more dependent on its gut microbiome than its own genetic composition. We also identified more than 2.7 million heterozygous single nucleotide polymorphisms in the diploid genome. Our data and analyses provide a foundation for promoting mammalian genetic research, and demonstrate the feasibility for using next-generation sequencing technologies for accurate, cost-effective and rapid de novo assembly of large eukaryotic genomes.

1,109 citations

Journal Article
09 Jun 2017-Science
TL;DR: Two unbiased high-dimensional technologies are employed to characterize the human DC lineage from bone marrow to blood and provide new markers that can be used to identify unambiguously pre-DC from pDC, including CD33, CX3CR1, CD2, CD5, and CD327.
Abstract: INTRODUCTION Dendritic cells (DC) are professional antigen-presenting cells that orchestrate immune responses. The human DC population comprises multiple subsets, including plasmacytoid DC (pDC) and two functionally specialized lineages of conventional DC (cDC1 and cDC2), whose origins and differentiation pathways remain incompletely defined. RATIONALE As DC are essential regulators of the immune response in health and disease, potential intervention strategies aiming at manipulation of these cells will require in-depth insights of their origins, the mechanisms that govern their homeostasis, and their functional properties. Here, we employed two unbiased high-dimensional technologies to characterize the human DC lineage from bone marrow to blood. RESULTS We isolated the DC-containing population (Lineage−HLA-DR+CD135+ cells) from human blood and defined the transcriptomes of 710 individual cells using massively parallel single-cell mRNA sequencing. By combining complementary bioinformatic approaches, we identified a small cluster of cells within this population as putative DC precursors (pre-DC). We then confirmed this finding using cytometry by time-of-flight (CyTOF) to simultaneously measure the expression of a panel of 38 different proteins at the single-cell level on Lineage−HLA-DR+ cells and found that pre-DC possessed a CD123+CD33+CD45RA+ phenotype. We confirmed the precursor potential of pre-DC by establishing their potential to differentiate in vitro into cDC1 and cDC2, but not pDC, in the known proportions found in vivo. Interestingly, pre-DC also express classical pDC markers, including CD123, CD303, and CD304. Thus, any previous studies using these markers to identify or isolate pDC will have inadvertently included CD123+CD33+ pre-DC. We provide here new markers that can be used to identify unambiguously pre-DC from pDC, including CD33, CX3CR1, CD2, CD5, and CD327.
When CD123+CD33+ pre-DC and CD123+CD33− pDC were isolated separately, we observed that pre-DC have unique functional properties that were previously attributed to pDC. Although pDC remain bona fide interferon-α–producing cells, their reported interleukin-12 (IL-12) production and CD4 T cell allostimulatory capacity can likely be attributed to “contaminating” pre-DC. We then asked whether the pre-DC population contained both uncommitted and committed pre-cDC1 and pre-cDC2 precursors, as recently shown in mice. Using microfluidic single-cell mRNA sequencing (scmRNAseq), we showed that the human pre-DC population contains cells exhibiting transcriptomic priming toward cDC1 and cDC2 lineages. Flow cytometry and in vitro DC differentiation experiments further identified CD123+CADM1−CD1c− putative uncommitted pre-DC, alongside CADM1+CD1c− pre-cDC1 and CADM1−CD1c+ pre-cDC2. Finally, we found that pre-DC subsets expressed T cell costimulatory molecules and induced comparable proliferation and polarization of naive CD4 T cells as adult DC. However, exposure to the Toll-like receptor 9 (TLR9) ligand CpG triggered IL-12p40 and tumor necrosis factor–α production by early pre-DC, pre-cDC1, and pre-cDC2, in contrast to differentiated cDC1 and cDC2, which do not express TLR9. CONCLUSION Using unsupervised scmRNAseq and CyTOF analyses, we have unraveled the complexity of the human DC lineage at the single-cell level, revealing a continuous process of differentiation that starts in the bone marrow (BM) with common DC progenitors (CDP), diverges at the point of emergence of pre-DC and pDC potential, and culminates in maturation of both lineages in the blood and spleen. The pre-DC compartment contains functionally and phenotypically distinct lineage-committed subpopulations, including one early uncommitted CD123+ pre-DC subset and two CD45RA+CD123lo lineage-committed subsets.
The discovery of multiple committed pre-DC populations with unique capabilities opens promising new avenues for the therapeutic exploitation of DC subset-specific targeting.

425 citations

Journal Article
TL;DR: miREvo, an integrated software platform with a graphical user interface (GUI), is presented for processing deep-sequencing data of small RNAs and for analyzing miRNA sequence and expression evolution across multiple species based on whole genome alignments (WGAs).
Abstract: Background: MicroRNAs (miRNAs) are small (~19-24nt) non-coding RNAs that play important roles in various biological processes. To date, the next-generation sequencing (NGS) technology has been widely used to discover miRNAs in plants and animals. Although evolutionary analysis is important to reveal the functional dynamics of miRNAs, few computational tools have been developed to analyze the evolution of miRNA sequence and expression across species, especially the newly emerged ones. Results: We developed miREvo, an integrated software platform with a graphical user interface (GUI), to process deep-sequencing data of small RNAs and to analyze miRNA sequence and expression evolution based on the multiple-species whole genome alignments (WGAs). Three major features are provided by miREvo: (i) to identify novel miRNAs in both plants and animals, based on a modified miRDeep algorithm, (ii) to detect miRNA homologs and measure their pairwise evolutionary distances among multiple species based on a WGA, and (iii) to profile miRNA expression abundances and analyze expression divergence across multiple species (small RNA libraries). Moreover, we demonstrated the utility of miREvo with Illumina data sets from Drosophila melanogaster and Arabidopsis, respectively. Conclusion: This work presents an integrated pipeline, miREvo, for exploring the expressional and evolutionary dynamics of miRNAs across multiple species. MiREvo is standalone, modular, and freely available at http://evolution.sysu.edu.cn/software/mirevo.htm under the GNU/GPL license.

372 citations

Journal Article
TL;DR: Dampened activation of the NLR family pyrin domain containing 3 (NLRP3) inflammasome in bat primary immune cells in response to infection with multiple zoonotic viruses is caused by decreased transcriptional priming, the presence of a unique splice variant and an altered leucine-rich repeat domain of bat NLRP3.
Abstract: Bats are special in their ability to host emerging viruses. As the only flying mammal, bats endure high metabolic rates yet exhibit elongated lifespans. It is currently unclear whether these unique features are interlinked. The important inflammasome sensor, NLR family pyrin domain containing 3 (NLRP3), has been linked to both viral-induced and age-related inflammation. Here, we report significantly dampened activation of the NLRP3 inflammasome in bat primary immune cells compared to human or mouse counterparts. Lower induction of apoptosis-associated speck-like protein containing a CARD (ASC) speck formation and secretion of interleukin-1β in response to both 'sterile' stimuli and infection with multiple zoonotic viruses including influenza A virus (-single-stranded (ss) RNA), Melaka virus (PRV3M, double-stranded RNA) and Middle East respiratory syndrome coronavirus (+ssRNA) was observed. Importantly, this reduction of inflammation had no impact on the overall viral loads. We identified dampened transcriptional priming, a novel splice variant and an altered leucine-rich repeat domain of bat NLRP3 as the cause. Our results elucidate an important mechanism through which bats dampen inflammation with implications for longevity and unique viral reservoir status.

203 citations


Cited by
Journal Article
Fumio Tajima
30 Oct 1989-Genomics
TL;DR: It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.

11,521 citations

Journal Article
24 Jun 2021-Cell
TL;DR: Weighted-nearest neighbor analysis is an unsupervised framework that learns the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities.

3,369 citations

Posted Content
12 Oct 2020-bioRxiv
TL;DR: ‘weighted-nearest neighbor’ analysis is introduced, an unsupervised framework to learn the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities.
Abstract: The simultaneous measurement of multiple modalities, known as multimodal analysis, represents an exciting frontier for single-cell genomics and necessitates new computational methods that can define cellular states based on multiple data types. Here, we introduce ‘weighted-nearest neighbor’ analysis, an unsupervised framework to learn the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities. We apply our procedure to a CITE-seq dataset of hundreds of thousands of human white blood cells alongside a panel of 228 antibodies to construct a multimodal reference atlas of the circulating immune system. We demonstrate that integrative analysis substantially improves our ability to resolve cell states and validate the presence of previously unreported lymphoid subpopulations. Moreover, we demonstrate how to leverage this reference to rapidly map new datasets, and to interpret immune responses to vaccination and COVID-19. Our approach represents a broadly applicable strategy to analyze single-cell multimodal datasets, including paired measurements of RNA and chromatin state, and to look beyond the transcriptome towards a unified and multimodal definition of cellular identity. Availability Installation instructions, documentation, tutorials, and CITE-seq datasets are available at http://www.satijalab.org/seurat

2,924 citations

Journal Article
TL;DR: This work proposes a new k-mer counting algorithm and associated implementation, called Jellyfish, which is fast and memory efficient, based on a multithreaded, lock-free hash table optimized for counting k-mers up to 31 bases in length.
Abstract: Motivation: Counting the number of occurrences of every k-mer (substring of length k) in a long string is a central subproblem in many applications, including genome assembly, error correction of sequencing reads, fast multiple sequence alignment and repeat detection. Recently, the deep sequence coverage generated by next-generation sequencing technologies has caused the amount of sequence to be processed during a genome project to grow rapidly, and has rendered current k-mer counting tools too slow and memory intensive. At the same time, large multicore computers have become commonplace in research facilities allowing for a new parallel computational paradigm. Results: We propose a new k-mer counting algorithm and associated implementation, called Jellyfish, which is fast and memory efficient. It is based on a multithreaded, lock-free hash table optimized for counting k-mers up to 31 bases in length. Due to their flexibility, suffix arrays have been the data structure of choice for solving many string problems. For the task of k-mer counting, important in many biological applications, Jellyfish offers a much faster and more memory-efficient solution. Availability: The Jellyfish software is written in C++ and is GPL licensed. It is available for download at http://www.cbcb.umd.edu/software/jellyfish. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
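To make the k-mer counting task in this abstract concrete, here is a minimal sketch using Python's standard `collections.Counter`. It is not the Jellyfish algorithm: Jellyfish relies on a multithreaded, lock-free hash table, while this is a plain single-threaded dictionary count. The function name `count_kmers` and the example sequence are hypothetical and chosen only for illustration.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every substring of length k in `sequence` (illustrative helper)."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    seq = sequence.upper()
    # Slide a window of width k across the sequence; a sequence shorter
    # than k yields an empty range and therefore an empty Counter.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


if __name__ == "__main__":
    counts = count_kmers("ACGTACGT", 4)
    for kmer, n in sorted(counts.items()):
        print(kmer, n)
```

A sequence of length n contains n - k + 1 overlapping k-mers, so memory grows with the number of distinct k-mers; that growth is the cost that dedicated tools such as Jellyfish aim to reduce.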

2,779 citations

Journal Article
TL;DR: The development of this de novo short read assembly method creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way.
Abstract: Next-generation massively parallel DNA sequencing technologies provide ultrahigh throughput at a substantially lower unit data cost; however, the data are very short read length sequences, making de novo assembly extremely challenging. Here, we describe a novel method for de novo assembly of large genomes from short read sequences. We successfully assembled both the Asian and African human genome sequences, achieving an N50 contig size of 7.4 and 5.9 kilobases (kb) and scaffold of 446.3 and 61.9 kb, respectively. The development of this de novo short read assembly method creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way.
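The N50 values quoted in this abstract are usually read as follows: sort the contig (or scaffold) lengths from longest to shortest, and N50 is the length at which the running total first reaches half of the total assembled length. The sketch below assumes that common definition; the function name `n50` and the example lengths are hypothetical and are not taken from the paper.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths (illustrative helper)."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running sum to at least half
        # of the total length defines N50.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    print(n50([2, 3, 4, 5, 6]))
```

Because N50 is weighted by length, a few long contigs raise it far more than many short ones, which is why it is reported alongside total assembly size rather than instead of it.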

2,760 citations