Showing papers by "Lukas Forer" published in 2016

Open Access

Journal Article

Next-generation genotype imputation service and methods.

Sayantan Das¹, Lukas Forer², Sebastian Schönherr², Carlo Sidore¹, Carlo Sidore³, Adam E. Locke¹, Alan Kwong¹, Scott I. Vrieze⁴, Emily Y. Chew⁵, Shawn Levy, Matt McGue⁶, David Schlessinger⁵, Dwight Stambolian⁷, Po-Ru Loh⁸, William G. Iacono⁶, Anand Swaroop⁵, Laura J. Scott¹, Francesco Cucca³, Florian Kronenberg², Michael Boehnke¹, Gonçalo R. Abecasis¹, Christian Fuchsberger⁹, Christian Fuchsberger², Christian Fuchsberger¹

University of Michigan¹, Innsbruck Medical University², University of Sassari³, University of Colorado Boulder⁴, National Institutes of Health⁵, University of Minnesota⁶, University of Pennsylvania⁷, Harvard University⁸, University of Lübeck⁹

01 Oct 2016-Nature Genetics

TL;DR: Improvements to imputation machinery are described that reduce computational requirements by more than an order of magnitude with no loss of accuracy in comparison to standard imputation tools.

Abstract: Christian Fuchsberger, Gonçalo Abecasis and colleagues describe a new web-based imputation service that enables rapid imputation of large numbers of samples and allows convenient access to large reference panels of sequenced individuals. Their state space reduction provides a computationally efficient solution for genotype imputation with no loss in imputation accuracy.

2,556 citations

Journal Article

A reference panel of 64,976 haplotypes for genotype imputation

Shane A. McCarthy¹, Sayantan Das², Warren W. Kretzschmar³, Olivier Delaneau⁴, Andrew R. Wood⁵, Alexander Teumer⁶, Hyun Min Kang², Christian Fuchsberger², Petr Danecek¹, Kevin Sharp³, Yang Luo¹, C Sidore⁷, Alan Kwong², Nicholas J. Timpson⁸, Seppo Koskinen, Scott I. Vrieze⁹, Laura J. Scott², He Zhang², Anubha Mahajan³, Jan H. Veldink, Ulrike Peters¹⁰, Ulrike Peters¹¹, Carlos N. Pato¹², Cornelia M. van Duijn¹³, Christopher E. Gillies², Ilaria Gandin¹⁴, Massimo Mezzavilla, Arthur Gilly¹, Massimiliano Cocca¹⁴, Michela Traglia, Andrea Angius⁷, Jeffrey C. Barrett¹, D.I. Boomsma¹⁵, Kari Branham², Gerome Breen¹⁶, Gerome Breen¹⁷, Chad M. Brummett², Fabio Busonero⁷, Harry Campbell¹⁸, Andrew T. Chan¹⁹, Sai Chen², Emily Y. Chew²⁰, Francis S. Collins²⁰, Laura J Corbin⁸, George Davey Smith⁸, George Dedoussis²¹, Marcus Dörr⁶, Aliki-Eleni Farmaki²¹, Luigi Ferrucci²⁰, Lukas Forer²², Ross M. Fraser², Stacey Gabriel²³, Shawn Levy, Leif Groop²⁴, Leif Groop²⁵, Tabitha A. Harrison¹¹, Andrew T. Hattersley⁵, Oddgeir L. Holmen²⁶, Kristian Hveem²⁶, Matthias Kretzler², James Lee²⁷, Matt McGue²⁸, Thomas Meitinger²⁹, David Melzer⁵, Josine L. Min⁸, Karen L. Mohlke³⁰, John B. Vincent³¹, Matthias Nauck⁶, Deborah A. Nickerson¹⁰, Aarno Palotie²³, Aarno Palotie¹⁹, Michele T. Pato¹², Nicola Pirastu¹⁴, Melvin G. McInnis², J. Brent Richards³², J. Brent Richards¹⁶, Cinzia Sala, Veikko Salomaa, David Schlessinger²⁰, Sebastian Schoenherr²², P. Eline Slagboom³³, Kerrin S. Small¹⁶, Tim D. Spector¹⁶, Dwight Stambolian³⁴, Marcus A. Tuke⁵, Jaakko Tuomilehto, Leonard H. van den Berg, Wouter van Rheenen, Uwe Völker⁶, Cisca Wijmenga³⁵, Daniela Toniolo, Eleftheria Zeggini¹, Paolo Gasparini¹⁴, Matthew G. Sampson², James F. Wilson¹⁸, Timothy M. Frayling⁵, Paul I.W. de Bakker³⁶, Morris A. Swertz³⁵, Steven A. McCarroll¹⁹, Charles Kooperberg¹¹, Annelot M. Dekker, David Altshuler, Cristen J. Willer², William G. Iacono²⁸, Samuli Ripatti²⁵, Nicole Soranzo¹, Nicole Soranzo²⁷, Klaudia Walter¹, Anand Swaroop²⁰, Francesco Cucca⁷, Carl A. Anderson¹, Richard M. Myers, Michael Boehnke², Mark I. McCarthy³, Mark I. McCarthy³⁷, Richard Durbin¹, Gonçalo R. Abecasis², Jonathan Marchini³

Wellcome Trust Sanger Institute¹, University of Michigan², University of Oxford³, University of Geneva⁴, University of Exeter⁵, Greifswald University Hospital⁶, National Research Council⁷, University of Bristol⁸, University of Colorado Boulder⁹, University of Washington¹⁰, Fred Hutchinson Cancer Research Center¹¹, SUNY Downstate Medical Center¹², Erasmus University Rotterdam¹³, University of Trieste¹⁴, VU University Amsterdam¹⁵, King's College London¹⁶, South London and Maudsley NHS Foundation Trust¹⁷, University of Edinburgh¹⁸, Harvard University¹⁹, National Institutes of Health²⁰, Harokopio University²¹, Innsbruck Medical University²², Broad Institute²³, Lund University²⁴, University of Helsinki²⁵, Norwegian University of Science and Technology²⁶, University of Cambridge²⁷, University of Minnesota²⁸, Technische Universität München²⁹, University of North Carolina at Chapel Hill³⁰, University of Toronto³¹, McGill University³², Leiden University³³, University of Pennsylvania³⁴, University of Groningen³⁵, Utrecht University³⁶, Churchill Hospital³⁷

22 Aug 2016-Nature Genetics

TL;DR: A reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole-genome sequence data from 20 studies of predominantly European ancestry leads to accurate genotype imputation at minor allele frequencies as low as 0.1% and a large increase in the number of SNPs tested in association studies.

Abstract: We describe a reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole-genome sequence data from 20 studies of predominantly European ancestry. Using this resource leads to accurate genotype imputation at minor allele frequencies as low as 0.1% and a large increase in the number of SNPs tested in association studies, and it can help to discover and refine causal loci. We describe remote server resources that allow researchers to carry out imputation and phasing consistently and efficiently.

2,149 citations

A reference panel of 64,976 haplotypes for genotype imputation

Shane A. McCarthy, Sayantan Das, Warren W. Kretzschmar, Olivier Delaneau, Andrew R. Wood, Alexander Teumer, Hyun Min Kang, Christian Fuchsberger, Petr Danecek, Kevin Sharp, Yang Luo, Carlo Sidore, Alan Kwong, Nicholas J. Timpson, Seppo Koskinen, Scott I. Vrieze, Laura J. Scott, He Zhang, Anubha Mahajan, Jan H. Veldink, Ulrike Peters, Carlos N. Pato, Cornelia M. van Duijn, Christopher E. Gillies, Ilaria Gandin, Massimo Mezzavilla, Arthur Gilly, Massimiliano Cocca, Michela Traglia, Andrea Angius, Jeffrey C. Barrett, D.I. Boomsma, Kari Branham, Gerome Breen, Chad M. Brummett, Fabio Busonero, Harry Campbell, Andrew T. Chan, Sai Chen, Emily Y. Chew, Francis S. Collins, Laura J Corbin, George Davey Smith, George Dedoussis, Marcus Dörr, Aliki-Eleni Farmaki, Luigi Ferrucci, Lukas Forer, Ross M. Fraser, Stacey Gabriel, Shawn Levy, Leif Groop, Tabitha A. Harrison, Andrew T. Hattersley, Oddgeir L. Holmen, Kristian Hveem, Matthias Kretzler, James Lee, Matt McGue, Thomas Meitinger, David Melzer, Josine L. Min, Karen L. Mohlke, John B. Vincent, Matthias Nauck, Deborah A. Nickerson, Aarno Palotie, Michele T. Pato, Nicola Pirastu, Melvin G. McInnis, J. Brent Richards, Cinzia Sala, Veikko Salomaa, David Schlessinger, Sebastian Schoenherr, P. Eline Slagboom, Kerrin S. Small, Tim D. Spector, Dwight Stambolian, Marcus A. Tuke, Jaakko Tuomilehto, Leonard H. van den Berg, Wouter van Rheenen, Uwe Völker, Cisca Wijmenga, Daniela Toniolo, Eleftheria Zeggini, Paolo Gasparini, Matthew G. Sampson, James F. Wilson, Timothy M. Frayling, Paul I.W. de Bakker, Morris A. Swertz, Steven A. McCarroll, Charles Kooperberg, Annelot M. Dekker, David Altshuler, Cristen J. Willer, William G. Iacono, Samuli Ripatti, Nicole Soranzo, Klaudia Walter, Anand Swaroop, Francesco Cucca, Carl A. Anderson, Richard M. Myers, Michael Boehnke, Mark I. McCarthy, Richard Durbin, Gonçalo R. Abecasis, Jonathan Marchini

01 Jan 2016

TL;DR: In this article, a reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole-genome sequence data from 20 studies of predominantly European ancestry is presented.

1,261 citations

Journal Article

Reference-based phasing using the Haplotype Reference Consortium panel.

Po-Ru Loh¹, Po-Ru Loh², Petr Danecek³, Pier Francesco Palamara², Pier Francesco Palamara¹, Christian Fuchsberger⁴, Christian Fuchsberger⁵, Yakir A. Reshef¹, Hilary K. Finucane⁶, Hilary K. Finucane¹, Sebastian Schoenherr⁷, Lukas Forer⁷, Shane A. McCarthy³, Gonçalo R. Abecasis⁵, Richard Durbin³, Alkes L. Price¹, Alkes L. Price²

Harvard University¹, Broad Institute², Wellcome Trust Sanger Institute³, European Academy of Bozen⁴, University of Michigan⁵, Massachusetts Institute of Technology⁶, Innsbruck Medical University⁷

01 Nov 2016-Nature Genetics

TL;DR: A new phasing algorithm, Eagle2, is introduced that attains high accuracy across a broad range of cohort sizes by efficiently leveraging information from large external reference panels (such as the Haplotype Reference Consortium; HRC) using a new data structure based on the positional Burrows-Wheeler transform.

Abstract: Po-Ru Loh, Alkes Price and colleagues present Eagle2, a reference-based phasing algorithm that allows for highly accurate and efficient phasing of genotypes across a broad range of cohort sizes. They demonstrate an approximately 10% improvement in accuracy and an approximately 20-fold improvement in speed compared to a competing method, SHAPEIT2.

1,246 citations

Journal Article

HaploGrep 2: mitochondrial haplogroup classification in the era of high-throughput sequencing.

Hansi Weissensteiner¹, Dominic Pacher¹, Anita Kloss-Brandstätter¹, Lukas Forer¹, Günther Specht², Hans-Jürgen Bandelt³, Florian Kronenberg¹, Antonio Salas⁴, Sebastian Schönherr¹

Innsbruck Medical University¹, University of Innsbruck², University of Hamburg³, University of Santiago de Compostela⁴

08 Jul 2016-Nucleic Acids Research

TL;DR: This work presents the completely updated version HaploGrep 2 offering several advanced features, including a generic rule-based system for immediate quality control (QC), which allows detecting artificial recombinants and missing variants as well as annotating rare and phantom mutations.

Abstract: Mitochondrial DNA (mtDNA) profiles can be classified into phylogenetic clusters (haplogroups), which is of great relevance for evolutionary, forensic and medical genetics. With the extensive growth of the underlying phylogenetic tree summarizing the published mtDNA sequences, the manual process of haplogroup classification would be too time-consuming. The previously published classification tool HaploGrep provided an automatic way to address this issue. Here, we present the completely updated version HaploGrep 2 offering several advanced features, including a generic rule-based system for immediate quality control (QC). This allows detecting artificial recombinants and missing variants as well as annotating rare and phantom mutations. Furthermore, the handling of high-throughput data in the form of VCF files is now directly supported. For data output, several graphical reports are generated in real time, such as a multiple sequence alignment format, a VCF format and extended haplogroup QC reports, all viewable directly within the application. In addition, HaploGrep 2 generates a publication-ready phylogenetic tree of all input samples encoded relative to the revised Cambridge Reference Sequence. Finally, new distance measures and optimizations of the algorithm increase accuracy and speed up the application. HaploGrep 2 can be accessed freely and without any registration at http://haplogrep.uibk.ac.at.

612 citations

Journal Article

mtDNA-Server: next-generation sequencing data analysis of human mitochondrial DNA in the cloud

Hansi Weissensteiner¹, Lukas Forer¹, Christian Fuchsberger², Bernd Schöpf¹, Anita Kloss-Brandstätter¹, Günther Specht³, Florian Kronenberg¹, Sebastian Schönherr¹

Innsbruck Medical University¹, University of Michigan², University of Innsbruck³

08 Jul 2016-Nucleic Acids Research

TL;DR: mtDNA-Server is a scalable web server for the analysis of mtDNA studies of any size, with a special focus on usability as well as reliable identification and quantification of heteroplasmic variants.

Abstract: Next generation sequencing (NGS) allows investigating mitochondrial DNA (mtDNA) characteristics such as heteroplasmy (i.e. intra-individual sequence variation) to a higher level of detail. While several pipelines for analyzing heteroplasmies exist, issues in usability, accuracy of results and interpreting final data limit their usage. Here we present mtDNA-Server, a scalable web server for the analysis of mtDNA studies of any size with a special focus on usability as well as reliable identification and quantification of heteroplasmic variants. The mtDNA-Server workflow includes parallel read alignment, heteroplasmy detection, artefact or contamination identification, variant annotation as well as several quality control metrics, often neglected in current mtDNA NGS studies. All computational steps are parallelized with Hadoop MapReduce and executed graphically with Cloudgene. We validated the underlying heteroplasmy and contamination detection model by generating four artificial sample mix-ups on two different NGS devices. Our evaluation data shows that mtDNA-Server detects heteroplasmies and artificial recombinations down to the 1% level with perfect specificity and outperforms existing approaches regarding sensitivity. mtDNA-Server is currently able to analyze the 1000G Phase 3 data (n = 2,504) in less than 5 h and is freely accessible at https://mtdna-server.uibk.ac.at.
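
As a rough illustration of what quantifying a heteroplasmic variant involves, the sketch below computes the minor-allele fraction at a single position from per-base read counts and compares it against a 1% cut-off. This is not the mtDNA-Server detection model, which the abstract says also handles artefacts and contamination; the function names, the input format and the bare threshold test are assumptions made for this example.

```python
from collections import Counter


def heteroplasmy_level(base_counts):
    """Return (major_base, minor_base, minor_fraction) for one position.

    base_counts maps a base ("A", "C", "G", "T") to its read count.
    Hypothetical helper for illustration only.
    """
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("no coverage at this position")
    ranked = Counter(base_counts).most_common(2)
    major_base = ranked[0][0]
    # Only one base observed, or the runner-up has zero reads: homoplasmic.
    if len(ranked) < 2 or ranked[1][1] == 0:
        return major_base, None, 0.0
    minor_base, minor_count = ranked[1]
    return major_base, minor_base, minor_count / total


def is_heteroplasmic(base_counts, threshold=0.01):
    """Flag a position whose minor-allele fraction reaches the threshold (assumed 1%)."""
    return heteroplasmy_level(base_counts)[2] >= threshold
```

A real pipeline would additionally need per-strand counts, base-quality filtering and a statistical model, none of which are shown here.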

121 citations

Journal Article

A genome-wide association meta-analysis on apolipoprotein A-IV concentrations.

Claudia Lamina¹, Salome Friedel¹, Stefan Coassin¹, Rico Rueedi², Noha A. Yousri³, Ilkka Seppälä⁴, Christian Gieger, Sebastian Schönherr¹, Lukas Forer¹, Gertraud Erhart¹, Barbara Kollerits¹, Pedro Marques-Vidal⁵, Janina S. Ried, Gérard Waeber⁵, Sven Bergmann², Sven Bergmann⁶, Doreen Dähnhardt¹, Andrea Stöckl¹, Stefan Kiechl¹, Olli T. Raitakari⁷, Mika Kähönen⁴, Johann Willeit¹, Ludmilla Kedenko⁸, Bernhard Paulweber⁸, Annette Peters, Thomas Meitinger⁹, Konstantin Strauch¹⁰, Terho Lehtimäki⁴, Steven C. Hunt¹¹, Peter Vollenweider⁵, Florian Kronenberg¹

Innsbruck Medical University¹, University of Lausanne², Cornell University³, University of Tampere⁴, University Hospital of Lausanne⁵, Swiss Institute of Bioinformatics⁶, Turku University Hospital⁷, Paracelsus Private Medical University of Salzburg⁸, Technische Universität München⁹, Ludwig Maximilian University of Munich¹⁰, University of Utah¹¹

15 Aug 2016-Human Molecular Genetics

TL;DR: Two independent SNPs located in or next to the APOA4 gene and one SNP in KLKB1 are identified; the association of KLKB1 with apoA-IV suggests an involvement of apoA-IV in renal metabolism and/or an interaction within HDL particles.

Abstract: Apolipoprotein A-IV (apoA-IV) is a major component of HDL and chylomicron particles and is involved in reverse cholesterol transport. It is an early marker of impaired renal function. We aimed to identify genetic loci associated with apoA-IV concentrations and to investigate relationships with known susceptibility loci for kidney function and lipids. A genome-wide association meta-analysis on apoA-IV concentrations was conducted in five population-based cohorts (n = 13,813) followed by two additional replication studies (n = 2,267) including approximately 10 M SNPs. Three independent SNPs from two genomic regions were significantly associated with apoA-IV concentrations: rs1729407 near APOA4 (P = 6.77 × 10⁻⁴⁴), rs5104 in APOA4 (P = 1.79 × 10⁻²⁴) and rs4241819 in KLKB1 (P = 5.6 × 10⁻¹⁴). Additionally, a look-up of the replicated SNPs in downloadable GWAS meta-analysis results was performed on kidney function (defined by eGFR), HDL-cholesterol and triglycerides. Of the three SNPs mentioned above, only rs1729407 showed an association with HDL-cholesterol (P = 7.1 × 10⁻⁷). Moreover, weighted SNP-scores were built involving known susceptibility loci for the aforementioned traits (53, 70 and 38 SNPs, respectively) and were associated with apoA-IV concentrations. This analysis revealed a significant inverse association of kidney function with apoA-IV concentrations (P = 5.5 × 10⁻⁵). Furthermore, an increase of triglyceride-increasing alleles was found to decrease apoA-IV concentrations (P = 0.0078). In summary, we identified two independent SNPs located in or next to the APOA4 gene and one SNP in KLKB1. The association of KLKB1 with apoA-IV suggests an involvement of apoA-IV in renal metabolism and/or an interaction within HDL particles. Analyses of SNP-scores indicate potential causal effects of kidney function and, to a lesser extent, triglycerides on apoA-IV concentrations.
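
The weighted SNP-scores mentioned in the abstract can be pictured as a weighted sum of allele dosages per individual. The sketch below shows that general construction only: the abstract does not state how the weights were chosen, so the placeholder SNP names, the weights and the dosage coding (0 to 2 copies of the scored allele) are assumptions for illustration.

```python
def weighted_snp_score(dosages, weights):
    """Sum of weight * dosage over all SNPs in the score.

    dosages: SNP id -> allele dosage for one individual (assumed range 0..2).
    weights: SNP id -> per-allele weight (assumed to be a published effect size).
    Both inputs are hypothetical; this is not the authors' code.
    """
    missing = set(weights) - set(dosages)
    if missing:
        raise KeyError(f"missing dosages for: {sorted(missing)}")
    return sum(weights[snp] * dosages[snp] for snp in weights)


# Usage with made-up identifiers and weights:
example_weights = {"snp_a": 0.5, "snp_b": -0.25}
example_dosages = {"snp_a": 2, "snp_b": 1}
example_score = weighted_snp_score(example_dosages, example_weights)
```

The resulting per-individual score would then be tested for association with the trait of interest, here apoA-IV concentration.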

22 citations

Posted Content

Reference-based phasing using the Haplotype Reference Consortium panel

Po-Ru Loh¹, Petr Danecek², Pier Francesco Palamara¹, Christian Fuchsberger³, Yakir A. Reshef¹, Hilary K. Finucane⁴, Sebastian Schoenherr⁵, Lukas Forer⁵, Shane A. McCarthy², Gonçalo R. Abecasis⁶, Richard Durbin², Alkes L. Price¹

Harvard University¹, Wellcome Trust Sanger Institute², European Academy of Bozen³, Massachusetts Institute of Technology⁴, Innsbruck Medical University⁵, University of Michigan⁶

07 Jul 2016-bioRxiv

TL;DR: A new phasing algorithm, Eagle2, is introduced that attains high accuracy across a broad range of cohort sizes by efficiently leveraging information from large external reference panels (such as the Haplotype Reference Consortium, HRC) using a new data structure based on the positional Burrows-Wheeler transform.

Abstract: Haplotype phasing is a fundamental problem in medical and population genetics. Phasing is generally performed via statistical phasing within a genotyped cohort, an approach that can attain high accuracy in very large cohorts but attains lower accuracy in smaller cohorts. Here, we instead explore the paradigm of reference-based phasing. We introduce a new phasing algorithm, Eagle2, that attains high accuracy across a broad range of cohort sizes by efficiently leveraging information from large external reference panels (such as the Haplotype Reference Consortium, HRC) using a new data structure based on the positional Burrows-Wheeler transform. We demonstrate that Eagle2 attains a ≈20x speedup and ≈10% increase in accuracy compared to reference-based phasing using SHAPEIT2. On European-ancestry samples, Eagle2 with the HRC panel achieves >2x the accuracy of 1000 Genomes-based phasing. Eagle2 is open source and freely available for HRC-based phasing via the Sanger Imputation Service and the Michigan Imputation Server.
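
For readers unfamiliar with the positional Burrows-Wheeler transform (PBWT) that the abstract builds on, the sketch below shows its basic building block: at each site, haplotypes are kept sorted by their reversed prefixes, and the order is updated with one stable partition per site. This illustrates the general PBWT idea for biallelic 0/1 haplotypes only; it is not Eagle2's data structure or code, and the function name is made up for this example.

```python
def pbwt_prefix_orders(haplotypes):
    """Return the positional prefix orders for a panel of 0/1 haplotypes.

    haplotypes: list of equal-length lists of 0/1 alleles.
    Element k of the result lists haplotype indices sorted by their
    reversed prefixes (alleles at sites k-1, k-2, ..., 0).
    """
    n_sites = len(haplotypes[0]) if haplotypes else 0
    order = list(range(len(haplotypes)))
    orders = [list(order)]
    for k in range(n_sites):
        # Stable partition: haplotypes carrying 0 at site k keep their
        # relative order and go first, followed by those carrying 1.
        zeros = [i for i in order if haplotypes[i][k] == 0]
        ones = [i for i in order if haplotypes[i][k] == 1]
        order = zeros + ones
        orders.append(list(order))
    return orders


# Usage on a toy panel of four haplotypes and three sites:
toy_panel = [[0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1]]
toy_orders = pbwt_prefix_orders(toy_panel)
```

Haplotypes that share a long match ending at a site end up adjacent in that site's order, which is what makes searches against large reference panels efficient.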

13 citations

Journal Article

Cloudflow - enabling faster biomedical pipelines with MapReduce and Spark

Lukas Forer, Enis Afgan, Hansi Weissensteiner, Davor Davidović, Guenther Specht, Florian Kronenberg, Sebastian Schoenherr

05 Feb 2016-Scalable Computing: Practice and Experience

TL;DR: The extension of Cloudflow to support Apache Spark without any adaptations to already implemented pipelines is described, demonstrating that Spark can bring an additional boost for analysing next generation sequencing (NGS) data to the field of genetics.

Abstract: For many years Apache Hadoop has been used as a synonym for processing data in the MapReduce fashion. However, due to the complexity of developing MapReduce applications, adoption of this paradigm in genetics has been limited. To alleviate some of the issues, we have previously developed Cloudflow, a high-level pipeline framework that allows users to create sophisticated biomedical pipelines using predefined code blocks while the framework automatically translates those into the MapReduce execution model. With the introduction of the YARN resource management layer, new computational processing models such as Apache Spark are now pluggable into the Hadoop ecosystem. In this paper we describe the extension of Cloudflow to support Apache Spark without any adaptations to already implemented pipelines. The described performance evaluation demonstrates that Spark can bring an additional boost for analysing next generation sequencing (NGS) data to the field of genetics. The Cloudflow framework is open source and freely available at https://github.com/genepi/cloudflow.
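
To make the MapReduce execution model referred to in the abstract concrete, the sketch below runs the three phases (map, shuffle, reduce) in plain Python to count variants per chromosome from VCF-like lines. It does not use Cloudflow, Hadoop or Spark, and none of the function names correspond to the Cloudflow API; they are hypothetical and only illustrate the model that such a framework would generate from higher-level building blocks.

```python
from collections import defaultdict


def map_variant(line):
    """Map phase: emit (chromosome, 1) for each data line, nothing for headers."""
    if line.startswith("#") or not line.strip():
        return []
    chrom = line.split("\t", 1)[0]
    return [(chrom, 1)]


def shuffle(pairs):
    """Shuffle phase: group emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce phase: sum the counts collected for one key."""
    return key, sum(values)


def count_variants_per_chromosome(lines):
    """Run map, shuffle and reduce in sequence on an iterable of lines."""
    pairs = [pair for line in lines for pair in map_variant(line)]
    return dict(reduce_counts(key, values) for key, values in shuffle(pairs).items())


# Usage on a few made-up VCF-like records:
toy_lines = [
    "##fileformat=VCFv4.2",
    "1\t100\t.\tA\tG",
    "1\t200\t.\tC\tT",
    "MT\t73\t.\tA\tG",
]
toy_counts = count_variants_per_chromosome(toy_lines)
```

In a distributed setting the map and reduce calls would run in parallel on different nodes, with the framework performing the shuffle between them.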
