Author

Marc Sultan

Bio: Marc Sultan is an academic researcher at Novartis who has contributed to research on topics including the transcriptome and exome sequencing. The author has an h-index of 27 and has co-authored 43 publications receiving 19,048 citations. Previous affiliations of Marc Sultan include the Max Planck Society.

Topics: Transcriptome, Exome sequencing, Epigenetics, Gene, RNA-Seq ...

Papers

Journal Article

A global reference for human genetic variation.

Adam Auton¹, Gonçalo R. Abecasis², David Altshuler³, Richard Durbin⁴ +514 more•Institutions (90)

01 Oct 2015-Nature

TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.

Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

12,661 citations

A global reference for human genetic variation

Adam Auton, Gonçalo R. Abecasis, David Altshuler, Richard Durbin +476 more

01 Oct 2015

TL;DR: This paper reports the completion of the 1000 Genomes Project, which set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations; the genomes of 2,504 individuals from 26 populations were reconstructed using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.

3,247 citations

Journal Article

Transcriptome and genome sequencing uncovers functional variation in humans

Tuuli Lappalainen¹, Michael Sammeth, Marc R. Friedländer, Peter A C 't Hoen², Jean Monlong³, Manuel A. Rivas⁴, Mar Gonzàlez-Porta⁵, Natalja Kurbatova⁵, Thasso Griebel, Pedro G. Ferreira³, Matthias Barann⁶, Thomas Wieland, Liliana Greger⁵, Maarten van Iterson², Jonas Carlsson Almlöf⁷, Paolo Ribeca, Irina Pulyakhina², Daniela Esser⁶, Thomas Giger¹, Andrew Tikhonov⁵, Marc Sultan⁸, Gabrielle Bertier³, Daniel G. MacArthur⁹,¹⁰, Monkol Lek⁹,¹⁰, Esther Lizano, Henk P. J. Buermans², Ismael Padioleau¹,¹¹, Thomas Schwarzmayr, Olof Karlberg⁷, Halit Ongen¹,¹¹, Helena Kilpinen¹,¹¹, Sergi Beltran, Marta Gut, Katja Kahlem, Vyacheslav Amstislavskiy⁸, Oliver Stegle⁵, Matti Pirinen⁴, Stephen B. Montgomery¹,¹², Peter Donnelly⁴, Mark I. McCarthy⁴,¹³, Paul Flicek⁵, Tim M. Strom¹⁴, Hans Lehrach⁸, Stefan Schreiber⁶, Ralf Sudbrak⁸, Angel Carracedo¹⁵, Stylianos E. Antonarakis¹, Robert Häsler⁶, Ann-Christine Syvänen⁷, Gert-Jan B. van Ommen², Alvis Brazma⁵, Thomas Meitinger¹⁴, Philip Rosenstiel⁶, Roderic Guigó³, Ivo Gut, Xavier Estivill, Emmanouil T. Dermitzakis¹,¹¹•Institutions (15)

University of Geneva¹, Leiden University Medical Center², Pompeu Fabra University³, Wellcome Trust Centre for Human Genetics⁴, European Bioinformatics Institute⁵, University of Kiel⁶, Science for Life Laboratory⁷, Max Planck Society⁸, Broad Institute⁹, Harvard University¹⁰, Swiss Institute of Bioinformatics¹¹, Stanford University¹², University of Oxford¹³, Technische Universität München¹⁴, University of Santiago de Compostela¹⁵

26 Sep 2013-Nature

TL;DR: Sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project, the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences, reveals extremely widespread genetic variation affecting the regulation of most genes.

Abstract: Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project--the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences. We discover extremely widespread genetic variation affecting the regulation of most genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on the cellular mechanisms of regulatory and loss-of-function variation, and allows us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.

1,892 citations

Journal Article

A Global View of Gene Activity and Alternative Splicing by Deep Sequencing of the Human Transcriptome

Marc Sultan¹, Marcel H. Schulz¹,², Hugues Richard¹, Alon Magen¹, Andreas Klingenhoff³, Matthias Scherf³, Martin Seifert³, Tatjana Borodina¹, Aleksey V. Soldatov¹, Dmitri Parkhomchuk¹, Dominic Schmidt¹, Sean O'Keeffe¹, Stefan A. Haas¹, Martin Vingron¹, Hans Lehrach¹, Marie-Laure Yaspo¹•Institutions (3)

Max Planck Society¹, Free University of Berlin², Genomatix³

15 Aug 2008-Science

TL;DR: A global survey of messenger RNA splicing events identified 94,241 splice junctions and showed that exon skipping is the most prevalent form of alternative splicing.

Abstract: The functional complexity of the human transcriptome is not yet fully elucidated. We report a high-throughput sequence of the human transcriptome from a human embryonic kidney and a B cell line. We used shotgun sequencing of transcripts to generate randomly distributed reads. Of these, 50% mapped to unique genomic locations, of which 80% corresponded to known exons. We found that 66% of the polyadenylated transcriptome mapped to known genes and 34% to nonannotated genomic regions. On the basis of known transcripts, RNA-Seq can detect 25% more genes than can microarrays. A global survey of messenger RNA splicing events identified 94,241 splice junctions (4096 of which were previously unidentified) and showed that exon skipping is the most prevalent form of alternative splicing.

1,288 citations

Journal Article

The genome of a songbird

Wesley C. Warren¹, David F. Clayton², Hans Ellegren³, Arthur P. Arnold⁴, LaDeana W. Hillier¹, Axel Künstner³, Steve Searle⁵, Simon D. M. White⁵, Albert J. Vilella, Susan Fairley⁵, Andreas Heger⁶, Lesheng Kong⁶, Chris P. Ponting⁶, Erich D. Jarvis⁷, Claudio V. Mello, Patrick Minx¹, Peter V. Lovell, Tarciso A. F. Velho, Margaret Ferris², Christopher N. Balakrishnan², Saurabh Sinha², Charles Blatti², Sarah E. London², Yun Li², Ya-Chi Lin², Jimin George², Jonathan V. Sweedler², Bruce R. Southey², Preethi H. Gunaratne⁸, Michael E. Watson, Kiwoong Nam³, Niclas Backström³, Linnéa Smeds³, Benoit Nabholz³, Yuichiro Itoh⁴, Osceola Whitney⁷, Andreas R. Pfenning⁷, Jason T. Howard⁷, Martin Völker, Benjamin M. Skinner⁹, Darren K. Griffin⁹, Liang Ye¹, William M. McLaren, Paul Flicek, Víctor Quesada¹⁰, Gloria Velasco¹⁰, Carlos López-Otín¹⁰, Xose S. Puente¹⁰, Tsviya Olender¹¹, Doron Lancet¹¹, Arian F.A. Smit¹², Robert Hubley¹², Miriam K. Konkel¹³, Jerilyn A. Walker¹³, Mark A. Batzer¹³, Wanjun Gu¹⁴, David D. Pollock¹⁴, Lin Chen¹⁵, Ze Cheng¹⁵, Evan E. Eichler¹⁵, Jessica Stapley¹⁵, Jon Slate¹⁶, Robert Ekblom¹⁶, Tim R. Birkhead¹⁶, Terry Burke¹⁶, David W. Burt¹⁷, Constance Scharff¹⁸, Iris Adam¹⁹, Hugues Richard¹⁸, Marc Sultan¹⁸, Alexey Soldatov¹⁸, Hans Lehrach¹⁸, Scott V. Edwards²⁰, Shiaw-Pyng Yang²¹, XiaoChing Li¹³, Tina Graves¹, Lucinda Fulton¹, Joanne O. Nelson¹, Asif T. Chinwalla¹, Shunfeng Hou¹, Elaine R. Mardis¹, Richard K. Wilson¹•Institutions (21)

Washington University in St. Louis¹, University of Illinois at Urbana–Champaign², Uppsala University³, University of California, Los Angeles⁴, Wellcome Trust Sanger Institute⁵, University of Oxford⁶, Duke University⁷, University of Houston⁸, University of Kent⁹, University of Oviedo¹⁰, Weizmann Institute of Science¹¹, Institute for Systems Biology¹², Louisiana State University¹³, University of Colorado Denver¹⁴, University of Washington¹⁵, University of Sheffield¹⁶, University of Edinburgh¹⁷, Max Planck Society¹⁸, Free University of Berlin¹⁹, Harvard University²⁰, Monsanto²¹

01 Apr 2010-Nature

TL;DR: This work shows that song behaviour engages gene regulatory networks in the zebra finch brain, altering the expression of long non-coding RNAs, microRNAs, transcription factors and their targets and shows evidence for rapid molecular evolution in the songbird lineage of genes that are regulated during song experience.

Abstract: The zebra finch is an important model organism in several fields with unique relevance to human neuroscience. Like other songbirds, the zebra finch communicates through learned vocalizations, an ability otherwise documented only in humans and a few other animals and lacking in the chicken, the only bird with a sequenced genome until now. Here we present a structural, functional and comparative analysis of the genome sequence of the zebra finch (Taeniopygia guttata), which is a songbird belonging to the large avian order Passeriformes. We find that the overall structures of the genomes are similar in zebra finch and chicken, but they differ in many intrachromosomal rearrangements, lineage-specific gene family expansions, the number of long-terminal-repeat-based retrotransposons, and mechanisms of sex chromosome dosage compensation. We show that song behaviour engages gene regulatory networks in the zebra finch brain, altering the expression of long non-coding RNAs, microRNAs, transcription factors and their targets. We also show evidence for rapid molecular evolution in the songbird lineage of genes that are regulated during song experience. These results indicate an active involvement of the genome in neural processes underlying vocal communication and identify potential genetic substrates for the evolution and regulation of this behaviour.

837 citations

Cited by

Changes in the switching rate of Plasmodium var genes lead to antigenic variation [English] / Paul H, Robert P, Christodoulou Z, et al // Proc Natl Acad Sci U S A

宁北芳, 朱淮民

28 Jul 2005

TL;DR: PfEMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.

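Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. The var gene family in each haploid genome encodes about 60 members, and initiating transcription of different var gene variants provides the molecular basis for antigenic variation.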

18,940 citations

Journal Article

RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome

Bo Li¹, Colin N. Dewey¹•Institutions (1)

University of Wisconsin-Madison¹

04 Aug 2011-BMC Bioinformatics

TL;DR: It is shown that accurate gene-level abundance estimates are best obtained with large numbers of short single-end reads, and estimates of the relative frequencies of isoforms within single genes may be improved through the use of paired-end reads, depending on the number of possible splice forms for each gene.

Abstract: RNA-Seq is revolutionizing the way transcript abundances are measured. A key challenge in transcript quantification from RNA-Seq data is the handling of reads that map to multiple genes or isoforms. This issue is particularly important for quantification with de novo transcriptome assemblies in the absence of sequenced genomes, as it is difficult to determine which transcripts are isoforms of the same gene. A second significant issue is the design of RNA-Seq experiments, in terms of the number of reads, read length, and whether reads come from one or both ends of cDNA fragments. We present RSEM, a user-friendly software package for quantifying gene and isoform abundances from single-end or paired-end RNA-Seq data. RSEM outputs abundance estimates, 95% credibility intervals, and visualization files and can also simulate RNA-Seq data. In contrast to other existing tools, the software does not require a reference genome. Thus, in combination with a de novo transcriptome assembler, RSEM enables accurate transcript quantification for species without sequenced genomes. On simulated and real data sets, RSEM has superior or comparable performance to quantification methods that rely on a reference genome. Taking advantage of RSEM's ability to effectively use ambiguously-mapping reads, we show that accurate gene-level abundance estimates are best obtained with large numbers of short single-end reads. On the other hand, estimates of the relative frequencies of isoforms within single genes may be improved through the use of paired-end reads, depending on the number of possible splice forms for each gene. RSEM is an accurate and user-friendly software tool for quantifying transcript abundances from RNA-Seq data. As it does not rely on the existence of a reference genome, it is particularly useful for quantification with de novo transcriptome assemblies. In addition, RSEM has enabled valuable guidance for cost-efficient design of quantification experiments with RNA-Seq, which is currently relatively expensive.

14,524 citations

Journal Article

RNA-Seq: a revolutionary tool for transcriptomics

Zhong Wang¹, Mark Gerstein¹, Michael Snyder¹•Institutions (1)

Yale University¹

01 Jan 2009-Nature Reviews Genetics

TL;DR: The RNA-Seq approach to transcriptome profiling that uses deep-sequencing technologies provides a far more precise measurement of levels of transcripts and their isoforms than other methods.

Abstract: RNA-Seq is a recently developed approach to transcriptome profiling that uses deep-sequencing technologies. Studies using this method have already altered our view of the extent and complexity of eukaryotic transcriptomes. RNA-Seq also provides a far more precise measurement of levels of transcripts and their isoforms than other methods. This article describes the RNA-Seq approach, the challenges associated with its application, and the advances made so far in characterizing several eukaryote transcriptomes.

11,528 citations

Journal Article

TopHat: discovering splice junctions with RNA-Seq

Cole Trapnell¹, Lior Pachter¹, Steven L. Salzberg¹•Institutions (1)

University of Maryland, College Park¹

01 May 2009-Bioinformatics

TL;DR: The TopHat pipeline is much faster than previous systems, mapping nearly 2.2 million reads per CPU hour, which is sufficient to process an entire RNA-Seq experiment in less than a day on a standard desktop computer.

Abstract: Motivation: A new protocol for sequencing the messenger RNA in a cell, known as RNA-Seq, generates millions of short sequence fragments in a single run. These fragments, or ‘reads’, can be used to measure levels of gene expression and to identify novel splice variants of genes. However, current software for aligning RNA-Seq data to a genome relies on known splice junctions and cannot identify novel ones. TopHat is an efficient read-mapping algorithm designed to align reads from an RNA-Seq experiment to a reference genome without relying on known splice sites. Results: We mapped the RNA-Seq reads from a recent mammalian RNA-Seq experiment and recovered more than 72% of the splice junctions reported by the annotation-based software from that study, along with nearly 20 000 previously unreported junctions. The TopHat pipeline is much faster than previous systems, mapping nearly 2.2 million reads per CPU hour, which is sufficient to process an entire RNA-Seq experiment in less than a day on a standard desktop computer. We describe several challenges unique to ab initio splice site discovery from RNA-Seq reads that will require further algorithm development. Availability: TopHat is free, open-source software available from http://tophat.cbcb.umd.edu Contact: cole@cs.umd.edu Supplementary information: Supplementary data are available at Bioinformatics online.

11,473 citations

Journal Article

Analysis of protein-coding genetic variation in 60,706 humans

Monkol Lek, Konrad J. Karczewski¹,², Eric Vallabh Minikel¹,², Kaitlin E. Samocha, Eric Banks², Timothy Fennell², Anne H. O’Donnell-Luria¹,²,³, James S. Ware, Andrew J. Hill¹,²,⁴, Beryl B. Cummings¹,², Taru Tukiainen¹,², Daniel P. Birnbaum², Jack A. Kosmicki, Laramie E. Duncan¹,², Karol Estrada¹,², Fengmei Zhao¹,², James Zou², Emma Pierce-Hoffman¹,², Joanne Berghout⁵, David Neil Cooper⁶, Nicole A. Deflaux⁷, Mark A. DePristo², Ron Do, Jason Flannick¹,², Menachem Fromer, Laura D. Gauthier², Jackie Goldstein¹,², Namrata Gupta², Daniel P. Howrigan¹,², Adam Kiezun², Mitja I. Kurki¹,², Ami Levy Moonshine², Pradeep Natarajan, Lorena Orozco, Gina M. Peloso¹,², Ryan Poplin², Manuel A. Rivas², Valentin Ruano-Rubio², Samuel A. Rose², Douglas M. Ruderfer⁸, Khalid Shakir², Peter D. Stenson⁶, Christine Stevens², Brett Thomas¹,², Grace Tiao², María Teresa Tusié-Luna, Ben Weisburd², Hong-Hee Won⁹, Dongmei Yu, David Altshuler²,¹⁰, Diego Ardissino, Michael Boehnke¹¹, John Danesh¹², Stacey Donnelly², Roberto Elosua, Jose C. Florez¹,², Stacey Gabriel², Gad Getz¹,², Stephen J. Glatt¹³, Christina M. Hultman¹⁴, Sekar Kathiresan, Markku Laakso¹⁵, Steven A. McCarroll¹,², Mark I. McCarthy¹⁶,¹⁷, Dermot P.B. McGovern¹⁸, Ruth McPherson¹⁹, Benjamin M. Neale¹,², Aarno Palotie, Shaun Purcell⁸, Danish Saleheen²⁰, Jeremiah M. Scharf, Pamela Sklar, Patrick F. Sullivan¹⁴,²¹, Jaakko Tuomilehto²², Ming T. Tsuang²³, Hugh Watkins¹⁶,¹⁷, James G. Wilson²⁴, Mark J. Daly¹,², Daniel G. MacArthur¹,²•Institutions (24)

Harvard University¹, Broad Institute², Boston Children's Hospital³, University of Washington⁴, University of Arizona⁵, Cardiff University⁶, Google⁷, Icahn School of Medicine at Mount Sinai⁸, Samsung Medical Center⁹, Vertex Pharmaceuticals¹⁰, University of Michigan¹¹, University of Cambridge¹², State University of New York Upstate Medical University¹³, Karolinska Institutet¹⁴, University of Eastern Finland¹⁵, Wellcome Trust Centre for Human Genetics¹⁶, University of Oxford¹⁷, Cedars-Sinai Medical Center¹⁸, University of Ottawa¹⁹, University of Pennsylvania²⁰, University of North Carolina at Chapel Hill²¹, University of Helsinki²², University of California, San Diego²³, University of Mississippi Medical Center²⁴

18 Aug 2016-Nature

TL;DR: The aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC) provides direct evidence for the presence of widespread mutational recurrence.

Abstract: Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

8,758 citations
