Author

Simon Tavaré

Other affiliations: NHS Blood and Transplant, Children's Hospital Oakland Research Institute, Medical Research Council ...

Bio: Simon Tavaré is an academic researcher at Columbia University. He has contributed to research on the topics Population and Coalescent theory. He has an h-index of 82 and has co-authored 284 publications that have received 35,081 citations. His previous affiliations include NHS Blood and Transplant and Children's Hospital Oakland Research Institute.

Papers published on a yearly basis (chart not reproduced; x-axis years run from 1976 to 2023)

Papers

Journal Article

The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups

Christina Curtis¹,², Sohrab P. Shah³, Suet-Feung Chin¹, Gulisa Turashvili³, Oscar M. Rueda¹, Mark J. Dunning, Doug Speed¹,², Andy G. Lynch¹, Shamith A. Samarajiwa¹, Yinyin Yuan¹, Stefan Gräf¹, Gavin Ha³, Gholamreza Haffari³, Ali Bashashati³, Roslin Russell, Steven McKinney³, Anita Langerød⁴, Andrew R. Green⁵, Elena Provenzano¹, Gordon C. Wishart¹, Sarah E. Pinder⁶, Peter H. Watson³,⁷, Florian Markowetz¹, Leigh C. Murphy⁷, Ian O. Ellis⁵, Arnie Purushotham⁶,⁸, Anne Lise Børresen-Dale⁴,⁹, James D. Brenton, Simon Tavaré, Carlos Caldas, Samuel Aparicio³

Institutions (9): University of Cambridge¹, University of Southern California², University of British Columbia³, Oslo University Hospital⁴, University of Nottingham⁵, King's College London⁶, University of Manitoba⁷, Guy's and St Thomas' NHS Foundation Trust⁸, University of Oslo⁹

21 Jun 2012, Nature

TL;DR: The results provide a novel molecular stratification of the breast cancer population, derived from the impact of somatic CNAs on the transcriptome, and identify novel subgroups with distinct clinical outcomes, which reproduced in the validation cohort.

Abstract: The elucidation of breast cancer subgroups and their molecular drivers requires integrated views of the genome and transcriptome from representative numbers of patients. We present an integrated analysis of copy number and gene expression in a discovery and validation set of 997 and 995 primary breast tumours, respectively, with long-term clinical follow-up. Inherited variants (copy number variants and single nucleotide polymorphisms) and acquired somatic copy number aberrations (CNAs) were associated with expression in 40% of genes, with the landscape dominated by cis- and trans-acting CNAs. By delineating expression outlier genes driven in cis by CNAs, we identified putative cancer genes, including deletions in PPP2R2A, MTAP and MAP2K4. Unsupervised analysis of paired DNA–RNA profiles revealed novel subgroups with distinct clinical outcomes, which reproduced in the validation cohort. These include a high-risk, oestrogen-receptor-positive 11q13/14 cis-acting subgroup and a favourable prognosis subgroup devoid of CNAs. Trans-acting aberration hotspots were found to modulate subgroup-specific gene networks, including a TCR deletion-mediated adaptive immune response in the ‘CNA-devoid’ subgroup and a basal-specific chromosome 5 deletion-associated mitotic network. Our results provide a novel molecular stratification of the breast cancer population, derived from the impact of somatic CNAs on the transcriptome.

4,722 citations

Some probabilistic and statistical problems in the analysis of DNA sequences

Simon Tavaré

01 Jan 1986

2,780 citations

Journal Article

Relative Impact of Nucleotide and Copy Number Variation on Gene Expression Phenotypes

Barbara E. Stranger¹, Matthew S. Forrest¹, Mark J. Dunning², Catherine E. Ingle¹, Claude Beazley¹, Natalie P. Thorne², Richard Redon¹, Christine P. Bird¹, Anna De Grassi, Charles Lee³,⁴, Chris Tyler-Smith¹, Nigel P. Carter¹, Stephen W. Scherer⁵,⁶, Simon Tavaré²,⁷, Panagiotis Deloukas¹, Matthew E. Hurles¹, Emmanouil T. Dermitzakis¹

Institutions (7): Wellcome Trust Sanger Institute¹, University of Cambridge², Brigham and Women's Hospital³, Massachusetts Institute of Technology⁴, University of Toronto⁵, The Centre for Applied Genomics⁶, University of Southern California⁷

09 Feb 2007, Science

TL;DR: To determine the overall contribution of CNVs to complex phenotypes, the authors performed association analyses of expression levels with SNPs and CNVs in individuals who are part of the International HapMap project; the signals from the two types of variation showed little overlap.

Abstract: Extensive studies are currently being performed to associate disease susceptibility with one form of genetic variation, namely, single-nucleotide polymorphisms (SNPs). In recent years, another type of common genetic variation has been characterized, namely, structural variation, including copy number variants (CNVs). To determine the overall contribution of CNVs to complex phenotypes, we have performed association analyses of expression levels of 14,925 transcripts with SNPs and CNVs in individuals who are part of the International HapMap project. SNPs and CNVs captured 83.6% and 17.7% of the total detected genetic variation in gene expression, respectively, but the signals from the two types of variation had little overlap. Interrogation of the genome for both types of variants may be an effective way to elucidate the causes of complex phenotypes and disease in humans.

1,729 citations

Journal Article

Pan-cancer analysis of whole genomes

Peter J. Campbell, Gad Getz, Jan O. Korbel, Joshua M. Stuart, +1329 more (238 institutions)

06 Feb 2020, Nature

TL;DR: The flagship paper of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium describes the generation of the integrative analyses of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types, the structures for international data sharing and standardized analyses, and the main scientific findings from across the consortium studies.

Abstract: Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale [1,2,3]. Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4–5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter [4]; identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation [5,6]; analyses timings and patterns of tumour evolution [7]; describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity [8,9]; and evaluates a range of more-specialized features of cancer genomes [8,10,11,12,13,14,15,16,17,18].

1,600 citations

Journal Article

Intratumor heterogeneity in human glioblastoma reflects cancer evolutionary dynamics

Andrea Sottoriva¹,²,³, Inmaculada Spiteri¹, Sara Grazia Maria Piccirillo¹, Anestis Touloumis³, V. Peter Collins¹, John C. Marioni³, Christina Curtis, Colin Watts³, Simon Tavaré¹,²,³

Institutions (3): Cancer Research UK¹, University of Southern California², University of Cambridge³

05 Mar 2013, Proceedings of the National Academy of Sciences of the United States of America

TL;DR: The genome-wide architecture of intratumor variability in GB is revealed across multiple spatial scales and patient-specific patterns of cancer evolution, with consequences for treatment design.

Abstract: Glioblastoma (GB) is the most common and aggressive primary brain malignancy, with poor prognosis and a lack of effective therapeutic options. Accumulating evidence suggests that intratumor heterogeneity likely is the key to understanding treatment failure. However, the extent of intratumor heterogeneity as a result of tumor evolution is still poorly understood. To address this, we developed a unique surgical multisampling scheme to collect spatially distinct tumor fragments from 11 GB patients. We present an integrated genomic analysis that uncovers extensive intratumor heterogeneity, with most patients displaying different GB subtypes within the same tumor. Moreover, we reconstructed the phylogeny of the fragments for each patient, identifying copy number alterations in EGFR and CDKN2A/B/p14ARF as early events, and aberrations in PDGFRA and PTEN as later events during cancer progression. We also characterized the clonal organization of each tumor fragment at the single-molecule level, detecting multiple coexisting cell lineages. Our results reveal the genome-wide architecture of intratumor variability in GB across multiple spatial scales and patient-specific patterns of cancer evolution, with consequences for treatment design.

1,495 citations

Cited by

Journal Article

Ultrafast and memory-efficient alignment of short DNA sequences to the human genome

Ben Langmead¹, Cole Trapnell¹, Mihai Pop¹, Steven L. Salzberg¹

Institutions (1): University of Maryland, College Park¹

04 Mar 2009, Genome Biology

TL;DR: Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches; multiple processor cores can be used simultaneously to achieve even greater alignment speeds.

Abstract: Bowtie is an ultrafast, memory-efficient alignment program for aligning short DNA sequence reads to large genomes. For the human genome, Burrows-Wheeler indexing allows Bowtie to align more than 25 million reads per CPU hour with a memory footprint of approximately 1.3 gigabytes. Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds. Bowtie is open source http://bowtie.cbcb.umd.edu.

20,335 citations

Journal Article

The Theory of Island Biogeography

Jeff Swinebroad, Robert H. MacArthur, Edward O. Wilson

01 Oct 1969, Journal of Wildlife Management

Abstract (table of contents): Preface to the Princeton Landmarks in Biology Edition; Preface; Symbols Used; 1. The Importance of Islands; 2. Area and Number of Species; 3. Further Explanations of the Area-Diversity Pattern; 4. The Strategy of Colonization; 5. Invasibility and the Variable Niche; 6. Stepping Stones and Biotic Exchange; 7. Evolutionary Changes Following Colonization; 8. Prospect; Glossary; References; Index

14,171 citations

Journal Article

A global reference for human genetic variation.

Adam Auton, Gonçalo R. Abecasis, David Altshuler, Richard Durbin, +514 more (90 institutions)

01 Oct 2015, Nature

TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.

Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

12,661 citations

Journal Article

The sequence of the human genome.

J. Craig Venter, Mark Raymond Adams, Eugene W. Myers, Peter W. Li, +269 more (12 institutions)

16 Feb 2001, Science

TL;DR: Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems.

Abstract: A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies-a whole-genome assembly and a regional chromosome assembly-were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional approximately 12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. 
Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

12,098 citations

Journal Article

BEAST: Bayesian evolutionary analysis by sampling trees

Alexei J. Drummond¹, Andrew Rambaut²

Institutions (2): University of Auckland¹, University of Edinburgh²

08 Nov 2007, BMC Evolutionary Biology

TL;DR: BEAST is a fast, flexible software architecture for Bayesian analysis of molecular sequences related by an evolutionary tree that provides models for DNA and protein sequence evolution, highly parametric coalescent analysis, relaxed clock phylogenetics, non-contemporaneous sequence data, statistical alignment and a wide range of options for prior distributions.

Abstract: The evolutionary analysis of molecular sequence variation is a statistical enterprise. This is reflected in the increased use of probabilistic models for phylogenetic inference, multiple sequence alignment, and molecular population genetics. Here we present BEAST: a fast, flexible software architecture for Bayesian analysis of molecular sequences related by an evolutionary tree. A large number of popular stochastic models of sequence evolution are provided and tree-based models suitable for both within- and between-species sequence data are implemented. BEAST version 1.4.6 consists of 81000 lines of Java source code, 779 classes and 81 packages. It provides models for DNA and protein sequence evolution, highly parametric coalescent analysis, relaxed clock phylogenetics, non-contemporaneous sequence data, statistical alignment and a wide range of options for prior distributions. BEAST source code is object-oriented, modular in design and freely available at http://beast-mcmc.googlecode.com/ under the GNU LGPL license. BEAST is a powerful and flexible evolutionary analysis package for molecular sequence variation. It also provides a resource for the further development of new models and statistical methods of evolutionary analysis.

11,916 citations
