Author

Benjamin Raeder

Bio: Benjamin Raeder is an academic researcher from the European Bioinformatics Institute. The author has contributed to research in topics: Structural variation & Genome. The author has an h-index of 16 and has co-authored 20 publications receiving 16,023 citations.

Papers
Journal Article
Adam Auton1, Gonçalo R. Abecasis2, David Altshuler3, Richard Durbin4 +514 more, Institutions (90)
01 Oct 2015-Nature
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

12,661 citations

01 Oct 2015
TL;DR: This paper reports completion of the 1000 Genomes Project, which set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations; the genomes of 2,504 individuals from 26 populations were reconstructed using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

3,247 citations

Journal Article
Peter H. Sudmant1, Tobias Rausch, Eugene J. Gardner2, Robert E. Handsaker3, Robert E. Handsaker4, Alexej Abyzov5, John Huddleston1, Yan Zhang6, Kai Ye7, Goo Jun8, Goo Jun9, Markus Hsi-Yang Fritz, Miriam K. Konkel10, Ankit Malhotra, Adrian M. Stütz, Xinghua Shi11, Francesco Paolo Casale12, Jieming Chen6, Fereydoun Hormozdiari1, Gargi Dayama9, Ken Chen13, Maika Malig1, Mark Chaisson1, Klaudia Walter12, Sascha Meiers, Seva Kashin3, Seva Kashin4, Erik Garrison14, Adam Auton15, Hugo Y. K. Lam, Xinmeng Jasmine Mu6, Xinmeng Jasmine Mu3, Can Alkan16, Danny Antaki17, Taejeong Bae5, Eliza Cerveira, Peter S. Chines18, Zechen Chong13, Laura Clarke12, Elif Dal16, Li Ding7, S. Emery9, Xian Fan13, Madhusudan Gujral17, Fatma Kahveci16, Jeffrey M. Kidd9, Yu Kong15, Eric-Wubbo Lameijer19, Shane A. McCarthy12, Paul Flicek12, Richard A. Gibbs20, Gabor T. Marth14, Christopher E. Mason21, Androniki Menelaou22, Androniki Menelaou23, Donna M. Muzny24, Bradley J. Nelson1, Amina Noor17, Nicholas F. Parrish25, Matthew Pendleton24, Andrew Quitadamo11, Benjamin Raeder, Eric E. Schadt24, Mallory Romanovitch, Andreas Schlattl, Robert Sebra24, Andrey A. Shabalin26, Andreas Untergasser27, Jerilyn A. Walker10, Min Wang20, Fuli Yu20, Chengsheng Zhang, Jing Zhang6, Xiangqun Zheng-Bradley12, Wanding Zhou13, Thomas Zichner, Jonathan Sebat17, Mark A. Batzer10, Steven A. McCarroll3, Steven A. McCarroll4, Ryan E. Mills9, Mark Gerstein6, Ali Bashir24, Oliver Stegle12, Scott E. Devine2, Charles Lee28, Evan E. Eichler1, Jan O. Korbel12
01 Oct 2015-Nature
TL;DR: In this paper, the authors describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which are constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations.
Abstract: Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.

1,971 citations

Journal Article
Peter J. Campbell1, Gad Getz2, Jan O. Korbel3, Joshua M. Stuart4 +1329 more, Institutions (238)
06 Feb 2020-Nature
TL;DR: The flagship paper of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium describes the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types, the structures for international data sharing and standardized analyses, and the main scientific findings from across the consortium studies.
Abstract: Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale [1,2,3]. Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4–5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter [4]; identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation [5,6]; analyses timings and patterns of tumour evolution [7]; describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity [8,9]; and evaluates a range of more-specialized features of cancer genomes [8,10,11,12,13,14,15,16,17,18].

1,600 citations

Journal Article
David T.W. Jones1, Natalie Jäger1, Marcel Kool1, Thomas Zichner2, Barbara Hutter1, Marc Sultan3, Yoon Jae Cho4, Trevor J. Pugh5, Volker Hovestadt1, Adrian M. Stütz2, Tobias Rausch2, Hans-Jörg Warnatz3, Marina Ryzhova, Sebastian Bender1, Dominik Sturm1, Sabrina Pleier1, Huriye Cin1, Elke Pfaff1, Laura Sieber1, Andrea Wittmann1, Marc Remke1, Hendrik Witt1, Hendrik Witt6, Sonja Hutter1, Theophilos Tzaridis1, Joachim Weischenfeldt2, Benjamin Raeder2, Meryem Avci3, Vyacheslav Amstislavskiy3, Marc Zapatka1, Ursula D. Weber1, Qi Wang1, Bärbel Lasitschka1, Cynthia C. Bartholomae1, Manfred Schmidt1, Christof von Kalle1, Volker Ast1, Chris Lawerenz1, Jürgen Eils1, Rolf Kabbe1, Vladimir Benes2, Peter van Sluis7, Jan Koster7, Richard Volckmann7, David Shih, Matthew J. Betts6, Robert B. Russell6, Simona Coco, Gian Paolo Tonini, Ulrich Schüller8, Volkmar Hans, Norbert Graf9, Yoo-Jin Kim9, Camelia M. Monoranu, Wolfgang Roggendorf, Andreas Unterberg6, Christel Herold-Mende6, Till Milde6, Till Milde1, Andreas E. Kulozik6, Andreas von Deimling1, Andreas von Deimling6, Olaf Witt1, Olaf Witt6, Eberhard Maass, Jochen Rössler, Martin Ebinger, Martin U. Schuhmann, Michael C. Frühwald10, Martin Hasselblatt, Nada Jabado11, Stefan Rutkowski12, André O. von Bueren12, Daniel Williamson13, Steven C. Clifford13, Martin G. McCabe14, Martin G. McCabe15, V. Peter Collins14, Stephan Wolf1, Stefan Wiemann1, Hans Lehrach3, Benedikt Brors1, Wolfram Scheurlen10, Jörg Felsberg16, Guido Reifenberger16, Paul A. Northcott, Michael D. Taylor, Matthew Meyerson5, Matthew Meyerson17, Scott L. Pomeroy10, Scott L. Pomeroy5, Marie-Laure Yaspo3, Jan O. Korbel2, Andrey Korshunov1, Andrey Korshunov6, Roland Eils1, Roland Eils6, Stefan M. Pfister1, Stefan M. Pfister6, Peter Lichter1 
02 Aug 2012-Nature
TL;DR: An integrative deep-sequencing analysis of 125 tumour–normal pairs enhances the understanding of the genomic complexity and heterogeneity underlying medulloblastoma, and provides several potential targets for new therapeutics, especially for Group 3 and 4 patients.
Abstract: Medulloblastoma is an aggressively growing tumour, arising in the cerebellum or medulla/brain stem. It is the most common malignant brain tumour in children, and shows tremendous biological and clinical heterogeneity. Despite recent treatment advances, approximately 40% of children experience tumour recurrence, and 30% will die from their disease. Those who survive often have a significantly reduced quality of life. Four tumour subgroups with distinct clinical, biological and genetic profiles are currently identified. WNT tumours, showing activated wingless pathway signalling, carry a favourable prognosis under current treatment regimens. SHH tumours show hedgehog pathway activation, and have an intermediate prognosis. Group 3 and 4 tumours are molecularly less well characterized, and also present the greatest clinical challenges. The full repertoire of genetic events driving this distinction, however, remains unclear. Here we describe an integrative deep-sequencing analysis of 125 tumour-normal pairs, conducted as part of the International Cancer Genome Consortium (ICGC) PedBrain Tumor Project. Tetraploidy was identified as a frequent early event in Group 3 and 4 tumours, and a positive correlation between patient age and mutation rate was observed. Several recurrent mutations were identified, both in known medulloblastoma-related genes (CTNNB1, PTCH1, MLL2, SMARCA4) and in genes not previously linked to this tumour (DDX3X, CTDNEP1, KDM6A, TBR1), often in subgroup-specific patterns. RNA sequencing confirmed these alterations, and revealed the expression of what are, to our knowledge, the first medulloblastoma fusion genes identified. Chromatin modifiers were frequently altered across all subgroups. These findings enhance our understanding of the genomic complexity and heterogeneity underlying medulloblastoma, and provide several potential targets for new therapeutics, especially for Group 3 and 4 patients.

775 citations


Cited by
Journal Article
Adam Auton1, Gonçalo R. Abecasis2, David Altshuler3, Richard Durbin4 +514 more, Institutions (90)
01 Oct 2015-Nature
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

12,661 citations

Journal Article
Monkol Lek, Konrad J. Karczewski1, Konrad J. Karczewski2, Eric Vallabh Minikel2, Eric Vallabh Minikel1, Kaitlin E. Samocha, Eric Banks1, Timothy Fennell1, Anne H. O’Donnell-Luria3, Anne H. O’Donnell-Luria2, Anne H. O’Donnell-Luria1, James S. Ware, Andrew J. Hill2, Andrew J. Hill4, Andrew J. Hill1, Beryl B. Cummings1, Beryl B. Cummings2, Taru Tukiainen2, Taru Tukiainen1, Daniel P. Birnbaum1, Jack A. Kosmicki, Laramie E. Duncan2, Laramie E. Duncan1, Karol Estrada2, Karol Estrada1, Fengmei Zhao2, Fengmei Zhao1, James Zou1, Emma Pierce-Hoffman1, Emma Pierce-Hoffman2, Joanne Berghout5, David Neil Cooper6, Nicole A. Deflaux7, Mark A. DePristo1, Ron Do, Jason Flannick2, Jason Flannick1, Menachem Fromer, Laura D. Gauthier1, Jackie Goldstein1, Jackie Goldstein2, Namrata Gupta1, Daniel P. Howrigan2, Daniel P. Howrigan1, Adam Kiezun1, Mitja I. Kurki2, Mitja I. Kurki1, Ami Levy Moonshine1, Pradeep Natarajan, Lorena Orozco, Gina M. Peloso1, Gina M. Peloso2, Ryan Poplin1, Manuel A. Rivas1, Valentin Ruano-Rubio1, Samuel A. Rose1, Douglas M. Ruderfer8, Khalid Shakir1, Peter D. Stenson6, Christine Stevens1, Brett Thomas2, Brett Thomas1, Grace Tiao1, María Teresa Tusié-Luna, Ben Weisburd1, Hong-Hee Won9, Dongmei Yu, David Altshuler10, David Altshuler1, Diego Ardissino, Michael Boehnke11, John Danesh12, Stacey Donnelly1, Roberto Elosua, Jose C. Florez2, Jose C. Florez1, Stacey Gabriel1, Gad Getz2, Gad Getz1, Stephen J. Glatt13, Christina M. Hultman14, Sekar Kathiresan, Markku Laakso15, Steven A. McCarroll2, Steven A. McCarroll1, Mark I. McCarthy16, Mark I. McCarthy17, Dermot P.B. McGovern18, Ruth McPherson19, Benjamin M. Neale2, Benjamin M. Neale1, Aarno Palotie, Shaun Purcell8, Danish Saleheen20, Jeremiah M. Scharf, Pamela Sklar, Patrick F. Sullivan14, Patrick F. Sullivan21, Jaakko Tuomilehto22, Ming T. Tsuang23, Hugh Watkins16, Hugh Watkins17, James G. Wilson24, Mark J. Daly1, Mark J. Daly2, Daniel G. MacArthur2, Daniel G. MacArthur1 
18 Aug 2016-Nature
TL;DR: The aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC) provides direct evidence for the presence of widespread mutational recurrence.
Abstract: Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

8,758 citations

Journal Article
11 Oct 2018-Nature
TL;DR: Deep phenotype and genome-wide genetic data from approximately 500,000 individuals in the UK Biobank are described, including population structure and relatedness in the cohort, and imputation that increases the number of testable variants to around 96 million.
Abstract: The UK Biobank project is a prospective cohort study with deep genetic and phenotypic data collected on approximately 500,000 individuals from across the United Kingdom, aged between 40 and 69 at recruitment. The open resource is unique in its size and scope. A rich variety of phenotypic and health-related information is available on each participant, including biological measurements, lifestyle indicators, biomarkers in blood and urine, and imaging of the body and brain. Follow-up information is provided by linking health and medical records. Genome-wide genotype data have been collected on all participants, providing many opportunities for the discovery of new genetic associations and the genetic bases of complex traits. Here we describe the centralized analysis of the genetic data, including genotype quality, properties of population structure and relatedness of the genetic data, and efficient phasing and genotype imputation that increases the number of testable variants to around 96 million. Classical allelic variation at 11 human leukocyte antigen genes was imputed, resulting in the recovery of signals with known associations between human leukocyte antigen alleles and many diseases.

4,489 citations

01 Feb 2015
TL;DR: In this article, the authors describe the integrative analysis of 111 reference human epigenomes generated as part of the NIH Roadmap Epigenomics Consortium, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression.
Abstract: The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.

4,409 citations

Journal Article
12 Oct 2017-Nature
TL;DR: It is found that local genetic variation affects gene expression levels for the majority of genes, and inter-chromosomal genetic effects for 93 genes and 112 loci are identified, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.
Abstract: Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.

3,289 citations