Author

Songgang Li

Other affiliations: Shenzhen University, Beijing Genomics Institute, Chinese Academy of Sciences

Bio: Songgang Li is an academic researcher from Peking University. The author has contributed to research on the topics Genome and Gene. The author has an h-index of 24 and has co-authored 35 publications receiving 15,015 citations. Previous affiliations of Songgang Li include Shenzhen University and Beijing Genomics Institute.

Topics: Genome, Gene, Reference genome, Human genome, Whole genome sequencing

Papers

Journal Article•DOI•

A metagenome-wide association study of gut microbiota in type 2 diabetes

Junjie Qin, Yingrui Li, Zhiming Cai¹, Shenghui Li, Jianfeng Zhu, Fan Zhang², Suisha Liang, Wenwei Zhang, Yuanlin Guan, Dongqian Shen, Yangqing Peng, Dongya Zhang, Zhuye Jie, Wenxian Wu, Youwen Qin, Wenbin Xue, Junhua Li, Lingchuan Han², Donghui Lu², Peixian Wu², Yali Dai², Xiaojuan Sun¹, Zesong Li¹, Aifa Tang¹, Shilong Zhong³, Xiaoping Li, Weineng Chen, Ran Xu, Mingbang Wang, Qiang Feng, Meihua Gong, Jing Yu, Yanyan Zhang, Ming Zhang, Torben Hansen⁴, Gaston Sanchez⁵, Jeroen Raes⁶, Gwen Falony⁶, Shujiro Okuda⁶, Mathieu Almeida⁷, Emmanuelle Le-chatelier⁷, Pierre Renault⁷, Nicolas Pons⁷, Jean-Michel Batto⁷, Zhaoxi Zhang, Hua Chen, Ruifu Yang, Wei-Mou Zheng, Songgang Li, Huanming Yang, Jian Wang, S. Dusko Ehrlich⁷, Rasmus Nielsen⁵, Oluf Pedersen⁴, Oluf Pedersen⁸, Karsten Kristiansen⁴, Jun Wang⁴ - Show less +53 more•Institutions (8)

Shenzhen University¹, Peking University², Guangdong General Hospital³, University of Copenhagen⁴, University of California, Berkeley⁵, Vrije Universiteit Brussel⁶, Institut national de la recherche agronomique⁷, Health Science University⁸

04 Oct 2012-Nature

TL;DR: MGWAS analysis showed that patients with type 2 diabetes were characterized by a moderate degree of gut microbial dysbiosis, a decrease in the abundance of some universal butyrate-producing bacteria and an increase in various opportunistic pathogens, as well as an enrichment of other microbial functions conferring sulphate reduction and oxidative stress resistance.

Abstract: Assessment and characterization of gut microbiota has become a major research area in human disease, including type 2 diabetes, the most prevalent endocrine disease worldwide. To carry out analysis on gut microbial content in patients with type 2 diabetes, we developed a protocol for a metagenome-wide association study (MGWAS) and undertook a two-stage MGWAS based on deep shotgun sequencing of the gut microbial DNA from 345 Chinese individuals. We identified and validated approximately 60,000 type-2-diabetes-associated markers and established the concept of a metagenomic linkage group, enabling taxonomic species-level analyses. MGWAS analysis showed that patients with type 2 diabetes were characterized by a moderate degree of gut microbial dysbiosis, a decrease in the abundance of some universal butyrate-producing bacteria and an increase in various opportunistic pathogens, as well as an enrichment of other microbial functions conferring sulphate reduction and oxidative stress resistance. An analysis of 23 additional individuals demonstrated that these gut microbial markers might be useful for classifying type 2 diabetes.

4,981 citations

Journal Article•DOI•

De novo assembly of human genomes with massively parallel short read sequencing

Ruiqiang Li¹, Hongmei Zhu, Jue Ruan, Wubin Qian, Xiaodong Fang, Zhongbin Shi, Yingrui Li, Shengting Li², Gao Shan, Karsten Kristiansen, Songgang Li, Huanming Yang, Jing Wang, Jun Wang - Show less +10 more•Institutions (2)

Beijing Genomics Institute¹, Aarhus University²

01 Feb 2010-Genome Research

TL;DR: The development of this de novo short read assembly method creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way.

Abstract: Next-generation massively parallel DNA sequencing technologies provide ultrahigh throughput at a substantially lower unit data cost; however, the data are very short read length sequences, making de novo assembly extremely challenging. Here, we describe a novel method for de novo assembly of large genomes from short read sequences. We successfully assembled both the Asian and African human genome sequences, achieving an N50 contig size of 7.4 and 5.9 kilobases (kb) and scaffold of 446.3 and 61.9 kb, respectively. The development of this de novo short read assembly method creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way.

2,760 citations
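
The abstract above reports assembly quality as N50 contig and scaffold sizes. As a minimal sketch of how that statistic is commonly computed (this is not the authors' code; the function name `n50` and the example lengths are hypothetical), N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length:

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    Hypothetical helper for illustration only: sort lengths from longest
    to shortest and return the length at which the running total first
    reaches at least half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to avoid fractional arithmetic.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for a non-empty list of positive lengths")


if __name__ == "__main__":
    # Illustrative lengths only; these are not values from the paper.
    print(n50([2, 3, 4, 5, 6]))
```

A larger N50 means that more of the assembly sits in long sequences, which is why the abstract quotes it separately for contigs and for scaffolds.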

Journal Article•DOI•

Sequencing of 50 Human Exomes Reveals Adaptation to High Altitude

Xin Yi, Yu Liang¹, Emilia Huerta-Sanchez², Xin Jin³, Zha Xi Ping Cuo¹, John E. Pool², John E. Pool⁴, Xun Xu, Hui Jiang, Nicolas Vinckenbosch², Thorfinn Sand Korneliussen⁵, Hancheng Zheng³, Tao Liu, Weiming He³, Kui Li¹, Ruibang Luo³, Xifang Nie, Honglong Wu⁶, Meiru Zhao, Hongzhi Cao⁶, Jing Zou, Ying Shan³, Shuzheng Li, Qi Yang, Asan¹, Peixiang Ni, Geng Tian¹, Junming Xu, Xiao Liu, Tao Jiang⁶, Renhua Wu, Guangyu Zhou, Meifang Tang, Junjie Qin, Tong Wang, Shuijian Feng, Guohong Li, Huasang, Jiangbai Luosang, Wei Wang, Fang Chen, Yading Wang, Xiaoguang Zheng¹, Zhuo Li, Zhuoma Bianba, Ge Yang, Xiznping Wang, Shuhui Tang, Guoyi Gao, Yong Chen, Zhen Luo, Lamu Gusang, Zheng Cao, Qinghui Zhang, Wei-Han OuYang, Xiaoli Ren, Huiqing Liang, Huisong Zheng, Yebo Huang, Jingxiang Li, Lars Bolund, Karsten Kristiansen⁵, Yingrui Li, Yong Zhang, Xiuqing Zhang, Ruiqiang Li⁵, Songgang Li, Huanming Yang, Rasmus Nielsen², Rasmus Nielsen⁵, Jun Wang⁵, Jing Wang - Show less +68 more•Institutions (6)

Chinese Academy of Sciences¹, University of California, Berkeley², South China University of Technology³, University of California, Davis⁴, University of Copenhagen⁵, Shenzhen University⁶

02 Jul 2010-Science

TL;DR: A population genomic survey has revealed a functionally important locus in genetic adaptation to high altitude, and the strongest signal of natural selection came from endothelial Per-Arnt-Sim domain protein 1 (EPAS1), a transcription factor involved in response to hypoxia.

Abstract: Residents of the Tibetan Plateau show heritable adaptations to extreme altitude. We sequenced 50 exomes of ethnic Tibetans, encompassing coding sequences of 92% of human genes, with an average coverage of 18x per individual. Genes showing population-specific allele frequency changes, which represent strong candidates for altitude adaptation, were identified. The strongest signal of natural selection came from endothelial Per-Arnt-Sim (PAS) domain protein 1 (EPAS1), a transcription factor involved in response to hypoxia. One single-nucleotide polymorphism (SNP) at EPAS1 shows a 78% frequency difference between Tibetan and Han samples, representing the fastest allele frequency change observed at any human gene to date. This SNP's association with erythrocyte abundance supports the role of EPAS1 in adaptation to hypoxia. Thus, a population genomic survey has revealed a functionally important locus in genetic adaptation to high altitude.

1,325 citations

Journal Article•DOI•

The genome of the cucumber, Cucumis sativus L.

Sanwen Huang, Ruiqiang Li¹, Zhonghua Zhang, Li Li, Xingfang Gu, Wei Fan, William J. Lucas², Xiaowu Wang, Bingyan Xie, Peixiang Ni, Yuanyuan Ren, Hongmei Zhu, Jun Li, Kui Lin³, Weiwei Jin⁴, Zhangjun Fei⁵, Guangcun Li, Jack E. Staub⁶, Andrzej Kilian, Edwin A. G. van der Vossen⁷, Yang Wu³, Jie Guo³, Jun He, Zhiqi Jia, Yi Ren, Geng Tian, Yao Lu, Jue Ruan⁸, Wubin Qian, Mingwei Wang, Quanfei Huang, Bo Li, Zhaoling Xuan, Jianjun Cao, Asan, Zhigang Wu, Juanbin Zhang, Qingle Cai, Yinqi Bai, Bowen Zhao⁹, Yonghua Han⁴, Ying Li, Xuefeng Li, Shenhao Wang, Qiuxiang Shi, Shiqiang Liu, Won Kyong Cho¹⁰, Jae-Yean Kim¹⁰, Yong Xu, Katarzyna Heller-Uszynska, Han Miao, Zhouchao Cheng, Shengping Zhang, Jian Wu, Yuhong Yang, Houxiang Kang, Man Li, Huiqing Liang, Xiaoli Ren, Zhongbin Shi, Ming Wen, Min Jian, Hailong Yang, Guojie Zhang⁸, Zhentao Yang, Rui Chen, Shifang Liu, Jianwen Li, Lijia Ma⁸, Hui Liu, Yan Zhou, Jing Zhao, Xiaodong Fang, Guoqing Li, Lin Fang, Yingrui Li⁸, Dongyuan Liu, Hongkun Zheng¹, Yong Zhang, Nan Qin, Zhuo Li, Guohua Yang, Shuang Yang, Lars Bolund¹¹, Karsten Kristiansen¹², Hancheng Zheng¹³, Shaochuan Li¹³, Xiuqing Zhang, Huanming Yang, Jing Wang, Rifei Sun, Zhang Baoxi, Shuzhi Jiang, Jun Wang¹², Yongchen Du, Songgang Li - Show less +92 more•Institutions (13)

University of Southern Denmark¹, University of Minnesota², Beijing Normal University³, China Agricultural University⁴, Boyce Thompson Institute for Plant Research⁵, University of Wisconsin-Madison⁶, Wageningen University and Research Centre⁷, Chinese Academy of Sciences⁸, Renmin University of China⁹, Gyeongsang National University¹⁰, Aarhus University¹¹, University of Copenhagen¹², South China University of Technology¹³

01 Dec 2009-Nature Genetics

TL;DR: This study establishes that five of the cucumber's seven chromosomes arose from fusions of ten ancestral chromosomes after divergence from Cucumis melo, and identifies 686 gene clusters related to phloem function.

Abstract: Cucumber is an economically important crop as well as a model system for sex determination studies and plant vascular biology. Here we report the draft genome sequence of Cucumis sativus var. sativus L., assembled using a novel combination of traditional Sanger and next-generation Illumina GA sequencing technologies to obtain 72.2-fold genome coverage. The absence of recent whole-genome duplication, along with the presence of few tandem duplications, explains the small number of genes in the cucumber. Our study establishes that five of the cucumber's seven chromosomes arose from fusions of ten ancestral chromosomes after divergence from Cucumis melo. The sequenced cucumber genome affords insight into traits such as its sex expression, disease resistance, biosynthesis of cucurbitacin and 'fresh green' odor. We also identify 686 gene clusters related to phloem function. The cucumber genome provides a valuable resource for developing elite cultivars and for studying the evolution and function of the plant vascular system.

1,289 citations

Journal Article•DOI•

The sequence and de novo assembly of the giant panda genome

Ruiqiang Li, Wei Fan, Geng Tian¹, Hongmei Zhu, Lin He², Lin He³, Jing Cai¹, Jing Cai⁴, Quanfei Huang, Qingle Cai⁵, Bo Li, Yinqi Bai, Zhihe Zhang⁶, Ya-Ping Zhang⁴, Wen Wang⁴, Jun Li, Fuwen Wei¹, Heng Li⁷, Min Jian, Jianwen Li, Zhaolei Zhang⁸, Rasmus Nielsen⁹, Dawei Li, Wanjun Gu¹⁰, Zhentao Yang, Zhaoling Xuan, Oliver A. Ryder, Frederick C. Leung¹¹, Yan Zhou, Jianjun Cao, Xiao Sun¹⁰, Yonggui Fu¹², Xiaodong Fang, Xiaosen Guo, Bo Wang, Rong Hou⁶, Fujun Shen⁶, Bo Mu, Peixiang Ni, Runmao Lin, Wubin Qian, Guo-Dong Wang⁴, Guo-Dong Wang¹, Chang Yu, Wenhui Nie⁴, Jinhuan Wang⁴, Zhigang Wu, Huiqing Liang, Jiumeng Min⁵, Qi Wu¹, Shifeng Cheng⁵, Jue Ruan¹, Mingwei Wang, Zhongbin Shi, Ming Wen, Binghang Liu, Xiaoli Ren, Huisong Zheng, Dong Dong⁸, Kathleen Cook⁸, Gao Shan, Hao Zhang, Carolin Kosiol¹³, Xueying Xie¹⁰, Zuhong Lu¹⁰, Hancheng Zheng, Yingrui Li¹, Cynthia C. Steiner, Tommy Tsan-Yuk Lam¹¹, Siyuan Lin, Qinghui Zhang, Guoqing Li, Jing Tian, Timing Gong, Hongde Liu¹⁰, Dejin Zhang¹⁰, Lin Fang, Chen Ye, Juanbin Zhang, Wenbo Hu¹², Anlong Xu¹², Yuanyuan Ren, Guojie Zhang⁴, Guojie Zhang¹, Michael William Bruford¹⁴, Qibin Li¹, Lijia Ma¹, Yiran Guo¹, Na An, Yujie Hu¹, Yang Zheng¹, Yongyong Shi², Zhiqiang Li², Qing Liu, Yanling Chen, Jing Zhao, Ning Qu⁵, Shancen Zhao, Feng Tian, Xiaoling Wang, Haiyin Wang, Lizhi Xu, Xiao Liu, Tomas Vinar¹⁵, Yajun Wang¹⁶, Tak-Wah Lam¹¹, Siu-Ming Yiu¹¹, Shiping Liu¹⁷, Hemin Zhang, Desheng Li, Yan Huang, Xia Wang, Guohua Yang, Zhi Jiang, Junyi Wang, Nan Qin, Li Li, Jingxiang Li, Lars Bolund, Karsten Kristiansen¹⁸, Gane Ka-Shu Wong¹⁹, Maynard V. Olson²⁰, Xiuqing Zhang, Songgang Li, Huanming Yang, Jing Wang, Jun Wang¹⁸ - Show less +123 more•Institutions (20)

Chinese Academy of Sciences¹, Shanghai Jiao Tong University², Fudan University³, Kunming Institute of Zoology⁴, Shenzhen University⁵, Chengdu Research Base of Giant Panda Breeding⁶, Wellcome Trust⁷, University of Toronto⁸, University of California, Berkeley⁹, Southeast University¹⁰, University of Hong Kong¹¹, Sun Yat-sen University¹², University of Vienna¹³, Cardiff University¹⁴, Comenius University in Bratislava¹⁵, Sichuan University¹⁶, South China University of Technology¹⁷, University of Copenhagen¹⁸, University of Alberta¹⁹, University of Washington²⁰

21 Jan 2010-Nature

TL;DR: Using next-generation sequencing technology alone, a draft sequence of the giant panda genome was generated and assembled; an assessment of panda genes indicated that its bamboo diet might be more dependent on its gut microbiome than its own genetic composition.

Abstract: Using next-generation sequencing technology alone, we have successfully generated and assembled a draft sequence of the giant panda genome. The assembled contigs (2.25 gigabases (Gb)) cover approximately 94% of the whole genome, and the remaining gaps (0.05 Gb) seem to contain carnivore-specific repeats and tandem repeats. Comparisons with the dog and human showed that the panda genome has a lower divergence rate. The assessment of panda genes potentially underlying some of its unique traits indicated that its bamboo diet might be more dependent on its gut microbiome than its own genetic composition. We also identified more than 2.7 million heterozygous single nucleotide polymorphisms in the diploid genome. Our data and analyses provide a foundation for promoting mammalian genetic research, and demonstrate the feasibility for using next-generation sequencing technologies for accurate, cost-effective and rapid de novo assembly of large eukaryotic genomes.

1,109 citations

Cited by

Journal Article•DOI•

The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data

Aaron McKenna¹, Matthew Hanna, Eric Banks, Andrey Sivachenko, Kristian Cibulskis, Andrew Kernytsky, Kiran V. Garimella, David Altshuler, Stacey Gabriel, Mark J. Daly, Mark A. DePristo - Show less +7 more•Institutions (1)

Broad Institute¹

01 Sep 2010-Genome Research

TL;DR: The GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.

Abstract: Next-generation DNA sequencing (NGS) projects, such as the 1000 Genomes Project, are already revolutionizing our understanding of genetic variation among individuals. However, the massive data sets generated by NGS—the 1000 Genome pilot alone includes nearly five terabases—make writing feature-rich, efficient, and robust analysis tools difficult for even computationally sophisticated individuals. Indeed, many professionals are limited in the scope and the ease with which they can answer scientific questions by the complexity of accessing and manipulating the data produced by these machines. Here, we discuss our Genome Analysis Toolkit (GATK), a structured programming framework designed to ease the development of efficient and robust analysis tools for next-generation DNA sequencers using the functional programming philosophy of MapReduce. The GATK provides a small but rich set of data access patterns that encompass the majority of analysis tool needs. Separating specific analysis calculations from common data management infrastructure enables us to optimize the GATK framework for correctness, stability, and CPU and memory efficiency and to enable distributed and shared memory parallelization. We highlight the capabilities of the GATK by describing the implementation and application of robust, scale-tolerant tools like coverage calculators and single nucleotide polymorphism (SNP) calling. We conclude that the GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.

20,557 citations

Journal Article•DOI•

Ultrafast and memory-efficient alignment of short DNA sequences to the human genome

Ben Langmead¹, Cole Trapnell¹, Mihai Pop¹, Steven L. Salzberg¹•Institutions (1)

University of Maryland, College Park¹

04 Mar 2009-Genome Biology

TL;DR: Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches, and multiple processor cores can be used simultaneously to achieve even greater alignment speeds.

Abstract: Bowtie is an ultrafast, memory-efficient alignment program for aligning short DNA sequence reads to large genomes. For the human genome, Burrows-Wheeler indexing allows Bowtie to align more than 25 million reads per CPU hour with a memory footprint of approximately 1.3 gigabytes. Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds. Bowtie is open source http://bowtie.cbcb.umd.edu.

20,335 citations
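
The abstract above credits Burrows-Wheeler indexing for Bowtie's speed and memory footprint. As a rough sketch of the underlying idea only (this is not Bowtie's implementation, it handles exact matches without the quality-aware backtracking, and the names `bwt` and `count_occurrences` are hypothetical), the following builds the transform naively and counts exact occurrences of a read with backward search:

```python
def bwt(text: str) -> str:
    """Burrows-Wheeler transform via sorted rotations (naive, for illustration)."""
    text += "$"  # sentinel; sorts before A, C, G, T in ASCII
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(text: str, pattern: str) -> int:
    """Count exact occurrences of pattern in text using backward search."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    last = bwt(text)

    # first_index[c]: number of characters in the text that sort before c.
    counts = {}
    for ch in last:
        counts[ch] = counts.get(ch, 0) + 1
    first_index = {}
    total = 0
    for ch in sorted(counts):
        first_index[ch] = total
        total += counts[ch]

    # occ[c][i]: number of times c appears in last[:i].
    occ = {ch: [0] * (len(last) + 1) for ch in counts}
    for i, ch in enumerate(last):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)

    # Narrow the half-open range [lo, hi) of matching sorted rotations,
    # consuming the pattern from its last character to its first.
    lo, hi = 0, len(last)
    for ch in reversed(pattern):
        if ch not in first_index:
            return 0
        lo = first_index[ch] + occ[ch][lo]
        hi = first_index[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    print(count_occurrences("ACGTACGT", "ACG"))
```

Each pattern character costs a constant number of table lookups once the index is built, which is the property that makes this family of aligners fast; a production tool would use a compressed occurrence table rather than the full arrays used here.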

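Variation in the switching rate of Plasmodium var genes leads to antigenic variation [in English] / Paul H, Robert P, Christodoulou Z, et al // Proc Natl Acad Sci U S A

宁北芳, 朱淮民

28 Jul 2005

TL;DR: PfPMP1 interacts with single or multiple receptors on infected erythrocytes, dendritic cells and the placenta, and plays a key role in adhesion and immune evasion.

Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte surface protein 1 (PfPMP1), expressed on the surface of infected erythrocytes, interacts with single or multiple receptors on infected erythrocytes, endothelial cells, dendritic cells and the placenta, and plays a key role in adhesion and immune evasion. The var gene family of each haploid genome encodes about 60 members, and initiating transcription of different var gene variants provides the molecular basis for antigenic variation.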

18,940 citations

Journal Article•DOI•

SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing

Anton Bankevich¹, Sergey Nurk, Dmitry Antipov, Alexey Gurevich, Mikhail Dvorkin, Alexander S. Kulikov, Valery M. Lesin, Sergey I. Nikolenko, Son Pham, Andrey D. Prjibelski, Alexey V. Pyshkin, Alexander Sirotkin, Nikolay Vyahhi, Glenn Tesler, Max A. Alekseyev, Pavel A. Pevzner - Show less +12 more•Institutions (1)

Saint Petersburg Academic University¹

07 May 2012-Journal of Computational Biology

TL;DR: SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies.

Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V−SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online (http://bioinf.spbau.ru/spades). It is distributed as open source software.

16,859 citations

Journal Article•DOI•

A global reference for human genetic variation.

Adam Auton¹, Gonçalo R. Abecasis², David Altshuler³, Richard Durbin⁴ +514 more•Institutions (90)

01 Oct 2015-Nature

TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.

Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

12,661 citations
