Author

Ming Wen

Other affiliations: Sun Yat-sen University, University of Ottawa

Bio: Ming Wen is an academic researcher from National University of Singapore. The author has contributed to research in topics: Gene & Genome. The author has an h-index of 13 and has co-authored 16 publications receiving 3349 citations. Previous affiliations of Ming Wen include Sun Yat-sen University & University of Ottawa.

Topics: Gene, Genome, Sequence assembly, Cucumis, Oryza sativa

Papers

Journal Article•DOI•

The genome of the cucumber, Cucumis sativus L.

Sanwen Huang, Ruiqiang Li¹, Zhonghua Zhang, Li Li, Xingfang Gu, Wei Fan, William J. Lucas², Xiaowu Wang, Bingyan Xie, Peixiang Ni, Yuanyuan Ren, Hongmei Zhu, Jun Li, Kui Lin³, Weiwei Jin⁴, Zhangjun Fei⁵, Guangcun Li, Jack E. Staub⁶, Andrzej Kilian, Edwin A. G. van der Vossen⁷, Yang Wu³, Jie Guo³, Jun He, Zhiqi Jia, Yi Ren, Geng Tian, Yao Lu, Jue Ruan⁸, Wubin Qian, Mingwei Wang, Quanfei Huang, Bo Li, Zhaoling Xuan, Jianjun Cao, Asan, Zhigang Wu, Juanbin Zhang, Qingle Cai, Yinqi Bai, Bowen Zhao⁹, Yonghua Han⁴, Ying Li, Xuefeng Li, Shenhao Wang, Qiuxiang Shi, Shiqiang Liu, Won Kyong Cho¹⁰, Jae-Yean Kim¹⁰, Yong Xu, Katarzyna Heller-Uszynska, Han Miao, Zhouchao Cheng, Shengping Zhang, Jian Wu, Yuhong Yang, Houxiang Kang, Man Li, Huiqing Liang, Xiaoli Ren, Zhongbin Shi, Ming Wen, Min Jian, Hailong Yang, Guojie Zhang⁸, Zhentao Yang, Rui Chen, Shifang Liu, Jianwen Li, Lijia Ma⁸, Hui Liu, Yan Zhou, Jing Zhao, Xiaodong Fang, Guoqing Li, Lin Fang, Yingrui Li⁸, Dongyuan Liu, Hongkun Zheng¹, Yong Zhang, Nan Qin, Zhuo Li, Guohua Yang, Shuang Yang, Lars Bolund¹¹, Karsten Kristiansen¹², Hancheng Zheng¹³, Shaochuan Li¹³, Xiuqing Zhang, Huanming Yang, Jing Wang, Rifei Sun, Zhang Baoxi, Shuzhi Jiang, Jun Wang¹², Yongchen Du, Songgang Li - Show less +92 more•Institutions (13)

University of Southern Denmark¹, University of Minnesota², Beijing Normal University³, China Agricultural University⁴, Boyce Thompson Institute for Plant Research⁵, University of Wisconsin-Madison⁶, Wageningen University and Research Centre⁷, Chinese Academy of Sciences⁸, Renmin University of China⁹, Gyeongsang National University¹⁰, Aarhus University¹¹, University of Copenhagen¹², South China University of Technology¹³

01 Dec 2009-Nature Genetics

TL;DR: This study establishes that five of the cucumber's seven chromosomes arose from fusions of ten ancestral chromosomes after divergence from Cucumis melo, and identifies 686 gene clusters related to phloem function.

Abstract: Cucumber is an economically important crop as well as a model system for sex determination studies and plant vascular biology. Here we report the draft genome sequence of Cucumis sativus var. sativus L., assembled using a novel combination of traditional Sanger and next-generation Illumina GA sequencing technologies to obtain 72.2-fold genome coverage. The absence of recent whole-genome duplication, along with the presence of few tandem duplications, explains the small number of genes in the cucumber. Our study establishes that five of the cucumber's seven chromosomes arose from fusions of ten ancestral chromosomes after divergence from Cucumis melo. The sequenced cucumber genome affords insight into traits such as its sex expression, disease resistance, biosynthesis of cucurbitacin and 'fresh green' odor. We also identify 686 gene clusters related to phloem function. The cucumber genome provides a valuable resource for developing elite cultivars and for studying the evolution and function of the plant vascular system.

1,289 citations

Journal Article•DOI•

The sequence and de novo assembly of the giant panda genome

Ruiqiang Li, Wei Fan, Geng Tian¹, Hongmei Zhu, Lin He², Lin He³, Jing Cai¹, Jing Cai⁴, Quanfei Huang, Qingle Cai⁵, Bo Li, Yinqi Bai, Zhihe Zhang⁶, Ya-Ping Zhang⁴, Wen Wang⁴, Jun Li, Fuwen Wei¹, Heng Li⁷, Min Jian, Jianwen Li, Zhaolei Zhang⁸, Rasmus Nielsen⁹, Dawei Li, Wanjun Gu¹⁰, Zhentao Yang, Zhaoling Xuan, Oliver A. Ryder, Frederick C. Leung¹¹, Yan Zhou, Jianjun Cao, Xiao Sun¹⁰, Yonggui Fu¹², Xiaodong Fang, Xiaosen Guo, Bo Wang, Rong Hou⁶, Fujun Shen⁶, Bo Mu, Peixiang Ni, Runmao Lin, Wubin Qian, Guo-Dong Wang⁴, Guo-Dong Wang¹, Chang Yu, Wenhui Nie⁴, Jinhuan Wang⁴, Zhigang Wu, Huiqing Liang, Jiumeng Min⁵, Qi Wu¹, Shifeng Cheng⁵, Jue Ruan¹, Mingwei Wang, Zhongbin Shi, Ming Wen, Binghang Liu, Xiaoli Ren, Huisong Zheng, Dong Dong⁸, Kathleen Cook⁸, Gao Shan, Hao Zhang, Carolin Kosiol¹³, Xueying Xie¹⁰, Zuhong Lu¹⁰, Hancheng Zheng, Yingrui Li¹, Cynthia C. Steiner, Tommy Tsan-Yuk Lam¹¹, Siyuan Lin, Qinghui Zhang, Guoqing Li, Jing Tian, Timing Gong, Hongde Liu¹⁰, Dejin Zhang¹⁰, Lin Fang, Chen Ye, Juanbin Zhang, Wenbo Hu¹², Anlong Xu¹², Yuanyuan Ren, Guojie Zhang¹, Guojie Zhang⁴, Michael William Bruford¹⁴, Qibin Li¹, Lijia Ma¹, Yiran Guo¹, Na An, Yujie Hu¹, Yang Zheng¹, Yongyong Shi³, Zhiqiang Li³, Qing Liu, Yanling Chen, Jing Zhao, Ning Qu⁵, Shancen Zhao, Feng Tian, Xiaoling Wang, Haiyin Wang, Lizhi Xu, Xiao Liu, Tomas Vinar¹⁵, Yajun Wang¹⁶, Tak-Wah Lam¹¹, Siu-Ming Yiu¹¹, Shiping Liu¹⁷, Hemin Zhang, Desheng Li, Yan Huang, Xia Wang, Guohua Yang, Zhi Jiang, Junyi Wang, Nan Qin, Li Li, Jingxiang Li, Lars Bolund, Karsten Kristiansen¹⁸, Gane Ka-Shu Wong¹⁹, Maynard V. Olson²⁰, Xiuqing Zhang, Songgang Li, Huanming Yang, Jing Wang, Jun Wang¹⁸ - Show less +123 more•Institutions (20)

Chinese Academy of Sciences¹, Fudan University², Shanghai Jiao Tong University³, Kunming Institute of Zoology⁴, Shenzhen University⁵, Chengdu Research Base of Giant Panda Breeding⁶, Wellcome Trust⁷, University of Toronto⁸, University of California, Berkeley⁹, Southeast University¹⁰, University of Hong Kong¹¹, Sun Yat-sen University¹², University of Vienna¹³, Cardiff University¹⁴, Comenius University in Bratislava¹⁵, Sichuan University¹⁶, South China University of Technology¹⁷, University of Copenhagen¹⁸, University of Alberta¹⁹, University of Washington²⁰

21 Jan 2010-Nature

TL;DR: Using next-generation sequencing technology alone, a draft sequence of the giant panda genome is generated and assembled, indicating that its bamboo diet might be more dependent on its gut microbiome than its own genetic composition.

Abstract: Using next-generation sequencing technology alone, we have successfully generated and assembled a draft sequence of the giant panda genome. The assembled contigs (2.25 gigabases (Gb)) cover approximately 94% of the whole genome, and the remaining gaps (0.05 Gb) seem to contain carnivore-specific repeats and tandem repeats. Comparisons with the dog and human showed that the panda genome has a lower divergence rate. The assessment of panda genes potentially underlying some of its unique traits indicated that its bamboo diet might be more dependent on its gut microbiome than its own genetic composition. We also identified more than 2.7 million heterozygous single nucleotide polymorphisms in the diploid genome. Our data and analyses provide a foundation for promoting mammalian genetic research, and demonstrate the feasibility for using next-generation sequencing technologies for accurate, cost-effective and rapid de novo assembly of large eukaryotic genomes.

1,109 citations

Journal Article•DOI•

Mapping the human DC lineage through the integration of high-dimensional techniques

Peter See¹, Charles-Antoine Dutertre¹, Charles-Antoine Dutertre², Jinmiao Chen¹, Patrick Günther³, Naomi McGovern¹, Sergio Erdal Irac², Merry Gunawan⁴, Marc Beyer⁵, Marc Beyer³, Kristian Händler³, Kaibo Duan¹, Hermi Sumatoh¹, Nicolas Ruffin⁶, Mabel Jouve⁶, Ester Gea-Mallorquí⁶, Raoul C.M. Hennekam⁷, Tony Kiat Hon Lim⁸, Chan Chung Yip⁸, Ming Wen², Benoit Malleret², Benoit Malleret¹, Ivy Low¹, Nurhidaya Binte Shadan¹, Charlene Foong Shu Fen⁹, Alicia Tay¹, Josephine Lum¹, Francesca Zolezzi¹, Anis Larbi¹, Michael Poidinger¹, Jerry Kok Yen Chan, Qingfeng Chen¹⁰, Laurent Rénia¹, Muzlifah Haniffa⁴, Philippe Benaroch⁶, Andreas Schlitzer¹, Andreas Schlitzer³, Joachim L. Schultze⁵, Joachim L. Schultze³, Evan W. Newell¹, Florent Ginhoux¹ - Show less +37 more•Institutions (10)

Singapore Immunology Network¹, National University of Singapore², University of Bonn³, Newcastle University⁴, German Center for Neurodegenerative Diseases⁵, PSL Research University⁶, University of Amsterdam⁷, Singapore General Hospital⁸, SingHealth⁹, Agency for Science, Technology and Research¹⁰

09 Jun 2017-Science

TL;DR: Two unbiased high-dimensional technologies are employed to characterize the human DC lineage from bone marrow to blood and provide new markers that can be used to identify unambiguously pre-DC from pDC, including CD33, CX3CR1, CD2, CD5, and CD327.

Abstract: INTRODUCTION: Dendritic cells (DC) are professional antigen-presenting cells that orchestrate immune responses. The human DC population comprises multiple subsets, including plasmacytoid DC (pDC) and two functionally specialized lineages of conventional DC (cDC1 and cDC2), whose origins and differentiation pathways remain incompletely defined. RATIONALE: As DC are essential regulators of the immune response in health and disease, potential intervention strategies aiming at manipulation of these cells will require in-depth insights of their origins, the mechanisms that govern their homeostasis, and their functional properties. Here, we employed two unbiased high-dimensional technologies to characterize the human DC lineage from bone marrow to blood. RESULTS: We isolated the DC-containing population (Lineage−HLA-DR+CD135+ cells) from human blood and defined the transcriptomes of 710 individual cells using massively parallel single-cell mRNA sequencing. By combining complementary bioinformatic approaches, we identified a small cluster of cells within this population as putative DC precursors (pre-DC). We then confirmed this finding using cytometry by time-of-flight (CyTOF) to simultaneously measure the expression of a panel of 38 different proteins at the single-cell level on Lineage−HLA-DR+ cells and found that pre-DC possessed a CD123+CD33+CD45RA+ phenotype. We confirmed the precursor potential of pre-DC by establishing their potential to differentiate in vitro into cDC1 and cDC2, but not pDC, in the known proportions found in vivo. Interestingly, pre-DC also express classical pDC markers, including CD123, CD303, and CD304. Thus, any previous studies using these markers to identify or isolate pDC will have inadvertently included CD123+CD33+ pre-DC. We provide here new markers that can be used to identify unambiguously pre-DC from pDC, including CD33, CX3CR1, CD2, CD5, and CD327.
When CD123+CD33+ pre-DC and CD123+CD33− pDC were isolated separately, we observed that pre-DC have unique functional properties that were previously attributed to pDC. Although pDC remain bona fide interferon-α–producing cells, their reported interleukin-12 (IL-12) production and CD4 T cell allostimulatory capacity can likely be attributed to “contaminating” pre-DC. We then asked whether the pre-DC population contained both uncommitted and committed pre-cDC1 and pre-cDC2 precursors, as recently shown in mice. Using microfluidic single-cell mRNA sequencing (scmRNAseq), we showed that the human pre-DC population contains cells exhibiting transcriptomic priming toward cDC1 and cDC2 lineages. Flow cytometry and in vitro DC differentiation experiments further identified CD123+CADM1−CD1c− putative uncommitted pre-DC, alongside CADM1+CD1c− pre-cDC1 and CADM1−CD1c+ pre-cDC2. Finally, we found that pre-DC subsets expressed T cell costimulatory molecules and induced comparable proliferation and polarization of naive CD4 T cells as adult DC. However, exposure to the Toll-like receptor 9 (TLR9) ligand CpG triggered IL-12p40 and tumor necrosis factor–α production by early pre-DC, pre-cDC1, and pre-cDC2, in contrast to differentiated cDC1 and cDC2, which do not express TLR9. CONCLUSION: Using unsupervised scmRNAseq and CyTOF analyses, we have unraveled the complexity of the human DC lineage at the single-cell level, revealing a continuous process of differentiation that starts in the bone marrow (BM) with common DC progenitors (CDP), diverges at the point of emergence of pre-DC and pDC potential, and culminates in maturation of both lineages in the blood and spleen. The pre-DC compartment contains functionally and phenotypically distinct lineage-committed subpopulations, including one early uncommitted CD123+ pre-DC subset and two CD45RA+CD123lo lineage-committed subsets.
The discovery of multiple committed pre-DC populations with unique capabilities opens promising new avenues for the therapeutic exploitation of DC subset-specific targeting.

425 citations

Journal Article•DOI•

miREvo: an integrative microRNA evolutionary analysis platform for next-generation sequencing experiments

Ming Wen¹, Yang Shen¹, Suhua Shi¹, Tian Tang¹•Institutions (1)

Sun Yat-sen University¹

21 Jun 2012-BMC Bioinformatics

TL;DR: This work presents miREvo, an integrated pipeline and software platform with a graphical user interface (GUI) that processes deep-sequencing data of small RNAs and analyzes miRNA sequence and expression evolution across multiple species, based on multiple-species whole genome alignments (WGAs).

Abstract: Background: MicroRNAs (miRNAs) are small (~19-24 nt) non-coding RNAs that play important roles in various biological processes. To date, next-generation sequencing (NGS) technology has been widely used to discover miRNAs in plants and animals. Although evolutionary analysis is important to reveal the functional dynamics of miRNAs, few computational tools have been developed to analyze the evolution of miRNA sequence and expression across species, especially the newly emerged ones. Results: We developed miREvo, an integrated software platform with a graphical user interface (GUI), to process deep-sequencing data of small RNAs and to analyze miRNA sequence and expression evolution based on the multiple-species whole genome alignments (WGAs). Three major features are provided by miREvo: (i) to identify novel miRNAs in both plants and animals, based on a modified miRDeep algorithm, (ii) to detect miRNA homologs and measure their pairwise evolutionary distances among multiple species based on a WGA, and (iii) to profile miRNA expression abundances and analyze expression divergence across multiple species (small RNA libraries). Moreover, we demonstrated the utility of miREvo with Illumina data sets from Drosophila melanogaster and Arabidopsis, respectively. Conclusion: This work presents an integrated pipeline, miREvo, for exploring the expressional and evolutionary dynamics of miRNAs across multiple species. miREvo is standalone, modular, and freely available at http://evolution.sysu.edu.cn/software/mirevo.htm under the GNU/GPL license.

372 citations

Journal Article•DOI•

Dampened NLRP3-mediated inflammation in bats and implications for a special viral reservoir host.

Matae Ahn¹, Danielle E. Anderson¹, Qian Zhang², Qian Zhang¹, Chee Wah Tan¹, Beng Lee Lim¹, Katarina Luko¹, Ming Wen¹, Wan Ni Chia¹, Shailendra Mani¹, Loo Chien Wang³, Justin H. J. Ng¹, Radoslaw M. Sobota³, Charles-Antoine Dutertre¹, Charles-Antoine Dutertre³, Florent Ginhoux³, Zhengli Shi², Aaron T. Irving¹, Lin-Fa Wang¹ - Show less +15 more•Institutions (3)

National University of Singapore¹, Chinese Academy of Sciences², Agency for Science, Technology and Research³

25 Feb 2019-Nature microbiology

TL;DR: Dampened activation of the NLR family pyrin domain containing 3 (NLRP3) inflammasome in bat primary immune cells in response to infection with multiple zoonotic viruses is caused by decreased transcriptional priming, the presence of a unique splice variant and an altered leucine-rich repeat domain of bat NLRP3.

Abstract: Bats are special in their ability to host emerging viruses. As the only flying mammal, bats endure high metabolic rates yet exhibit elongated lifespans. It is currently unclear whether these unique features are interlinked. The important inflammasome sensor, NLR family pyrin domain containing 3 (NLRP3), has been linked to both viral-induced and age-related inflammation. Here, we report significantly dampened activation of the NLRP3 inflammasome in bat primary immune cells compared to human or mouse counterparts. Lower induction of apoptosis-associated speck-like protein containing a CARD (ASC) speck formation and secretion of interleukin-1β in response to both 'sterile' stimuli and infection with multiple zoonotic viruses including influenza A virus (-single-stranded (ss) RNA), Melaka virus (PRV3M, double-stranded RNA) and Middle East respiratory syndrome coronavirus (+ssRNA) was observed. Importantly, this reduction of inflammation had no impact on the overall viral loads. We identified dampened transcriptional priming, a novel splice variant and an altered leucine-rich repeat domain of bat NLRP3 as the cause. Our results elucidate an important mechanism through which bats dampen inflammation with implications for longevity and unique viral reservoir status.

203 citations

Cited by

Journal Article•

Statistical method for testing the neutral mutation hypothesis by DNA polymorphism.

Fumio Tajima¹•Institutions (1)

Kyushu University¹

30 Oct 1989-Genomics

TL;DR: It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.

11,521 citations

Journal Article•DOI•

Integrated analysis of multimodal single-cell data

Yuhan Hao¹, Stephanie Hao², Erica Andersen-Nissen³, William M. Mauck¹, Shiwei Zheng¹, Andrew Butler¹, Maddie Jane Lee⁴, Aaron J. Wilk⁴, Charlotte A. Darby¹, Michael Zager³, Paul Hoffman¹, Marlon Stoeckius², Efthymia Papalexi¹, Eleni P. Mimitou², Jaison Jain¹, Avi Srivastava¹, Tim Stuart¹, Lamar M. Fleming³, Bertrand Z. Yeung, Angela J. Rogers⁴, Juliana M. McElrath³, Catherine A. Blish⁴, Raphael Gottardo³, Peter Smibert², Rahul Satija¹ - Show less +21 more•Institutions (4)

New York University¹, Harvard University², Fred Hutchinson Cancer Research Center³, Stanford University⁴

24 Jun 2021-Cell

TL;DR: This paper introduces weighted-nearest neighbor analysis, an unsupervised framework to learn the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities.

3,369 citations

Posted Content•DOI•

Integrated analysis of multimodal single-cell data

Yuhan Hao¹, Stephanie Hao², Erica Andersen-Nissen³, William M. Mauck¹, Shiwei Zheng¹, Andrew Butler¹, Maddie Jane Lee⁴, Aaron J. Wilk⁴, Charlotte A. Darby¹, Michael Zagar³, Paul Hoffman¹, Marlon Stoeckius², Efthymia Papalexi¹, Eleni P. Mimitou², Jaison Jain¹, Avi Srivastava¹, Tim Stuart¹, Lamar Ballweber Fleming³, Bertrand Z. Yeung, Angela J. Rogers⁴, Juliana M. McElrath³, Catherine A. Blish⁴, Raphael Gottardo³, Peter Smibert², Rahul Satija¹ - Show less +21 more•Institutions (4)

New York University¹, Harvard University², Fred Hutchinson Cancer Research Center³, Stanford University⁴

12 Oct 2020-bioRxiv

TL;DR: ‘weighted-nearest neighbor’ analysis is introduced, an unsupervised framework to learn the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities.

Abstract: The simultaneous measurement of multiple modalities, known as multimodal analysis, represents an exciting frontier for single-cell genomics and necessitates new computational methods that can define cellular states based on multiple data types. Here, we introduce ‘weighted-nearest neighbor’ analysis, an unsupervised framework to learn the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities. We apply our procedure to a CITE-seq dataset of hundreds of thousands of human white blood cells alongside a panel of 228 antibodies to construct a multimodal reference atlas of the circulating immune system. We demonstrate that integrative analysis substantially improves our ability to resolve cell states and validate the presence of previously unreported lymphoid subpopulations. Moreover, we demonstrate how to leverage this reference to rapidly map new datasets, and to interpret immune responses to vaccination and COVID-19. Our approach represents a broadly applicable strategy to analyze single-cell multimodal datasets, including paired measurements of RNA and chromatin state, and to look beyond the transcriptome towards a unified and multimodal definition of cellular identity. Availability: Installation instructions, documentation, tutorials, and CITE-seq datasets are available at http://www.satijalab.org/seurat

2,924 citations

Journal Article•DOI•

A fast, lock-free approach for efficient parallel counting of occurrences of k-mers

Guillaume Marçais¹, Carl Kingsford¹•Institutions (1)

University of Maryland, College Park¹

01 Mar 2011-Bioinformatics

TL;DR: This work proposes a new k-mer counting algorithm and associated implementation, called Jellyfish, which is fast and memory efficient, based on a multithreaded, lock-free hash table optimized for counting k-mers up to 31 bases in length.

Abstract: Motivation: Counting the number of occurrences of every k-mer (substring of length k) in a long string is a central subproblem in many applications, including genome assembly, error correction of sequencing reads, fast multiple sequence alignment and repeat detection. Recently, the deep sequence coverage generated by next-generation sequencing technologies has caused the amount of sequence to be processed during a genome project to grow rapidly, and has rendered current k-mer counting tools too slow and memory intensive. At the same time, large multicore computers have become commonplace in research facilities allowing for a new parallel computational paradigm. Results: We propose a new k-mer counting algorithm and associated implementation, called Jellyfish, which is fast and memory efficient. It is based on a multithreaded, lock-free hash table optimized for counting k-mers up to 31 bases in length. Due to their flexibility, suffix arrays have been the data structure of choice for solving many string problems. For the task of k-mer counting, important in many biological applications, Jellyfish offers a much faster and more memory-efficient solution. Availability: The Jellyfish software is written in C++ and is GPL licensed. It is available for download at http://www.cbcb.umd.edu/software/jellyfish. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
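
As a rough illustration of the problem this abstract describes (counting every substring of length k in a sequence), the following minimal sketch uses a plain single-threaded Python dictionary. It is not Jellyfish's algorithm: the lock-free, multithreaded hash table is not reproduced here, and the function name `count_kmers` and the sample sequence are assumptions made for the example.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count occurrences of every k-mer (substring of length k).

    Illustrative only: a single-threaded dictionary count, not the
    lock-free hash table described in the abstract.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    sequence = sequence.upper()
    # Slide a window of width k across the sequence, one base at a time.
    # If the sequence is shorter than k, the range is empty.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Hypothetical sample sequence, chosen only to show a repeated k-mer.
counts = count_kmers("ACGTACGT", 3)
print(counts["ACG"])  # -> 2
```

A sequence of length n yields n - k + 1 windows, so memory grows with the number of distinct k-mers, which is the cost that dedicated tools aim to reduce.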

2,779 citations

Journal Article•DOI•

De novo assembly of human genomes with massively parallel short read sequencing

Ruiqiang Li¹, Hongmei Zhu, Jue Ruan, Wubin Qian, Xiaodong Fang, Zhongbin Shi, Yingrui Li, Shengting Li², Gao Shan, Karsten Kristiansen, Songgang Li, Huanming Yang, Jing Wang, Jun Wang - Show less +10 more•Institutions (2)

Beijing Genomics Institute¹, Aarhus University²

01 Feb 2010-Genome Research

TL;DR: The development of this de novo short read assembly method creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way.

Abstract: Next-generation massively parallel DNA sequencing technologies provide ultrahigh throughput at a substantially lower unit data cost; however, the data are very short read length sequences, making de novo assembly extremely challenging. Here, we describe a novel method for de novo assembly of large genomes from short read sequences. We successfully assembled both the Asian and African human genome sequences, achieving an N50 contig size of 7.4 and 5.9 kilobases (kb) and scaffold of 446.3 and 61.9 kb, respectively. The development of this de novo short read assembly method creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way.
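
The abstract reports assembly quality as N50 contig and scaffold sizes. As a hedged sketch of how that statistic is commonly defined (the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length), the function below computes it for a list of lengths. The function name `n50` and the sample lengths are assumptions for illustration and are not data from the paper.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half of the
        # total assembly length defines N50.
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Hypothetical contig lengths in kilobases, not taken from the paper.
print(n50([80, 70, 50, 40, 30, 20]))  # -> 70
```

Here the total is 290, half is 145, and the running sum reaches 150 at the second-longest contig, so the N50 is 70.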

2,760 citations
