Showing papers by "Richard Durbin" published in 2016

Open Access

Journal Article

A reference panel of 64,976 haplotypes for genotype imputation

Shane A. McCarthy¹, Sayantan Das², Warren W. Kretzschmar³, Olivier Delaneau⁴, Andrew R. Wood⁵, Alexander Teumer⁶, Hyun Min Kang², Christian Fuchsberger², Petr Danecek¹, Kevin Sharp³, Yang Luo¹, C Sidore⁷, Alan Kwong², Nicholas J. Timpson⁸, Seppo Koskinen, Scott I. Vrieze⁹, Laura J. Scott², He Zhang², Anubha Mahajan³, Jan H. Veldink, Ulrike Peters¹⁰, Ulrike Peters¹¹, Carlos N. Pato¹², Cornelia M. van Duijn¹³, Christopher E. Gillies², Ilaria Gandin¹⁴, Massimo Mezzavilla, Arthur Gilly¹, Massimiliano Cocca¹⁴, Michela Traglia, Andrea Angius⁷, Jeffrey C. Barrett¹, D.I. Boomsma¹⁵, Kari Branham², Gerome Breen¹⁶, Gerome Breen¹⁷, Chad M. Brummett², Fabio Busonero⁷, Harry Campbell¹⁸, Andrew T. Chan¹⁹, Sai Chen², Emily Y. Chew²⁰, Francis S. Collins²⁰, Laura J Corbin⁸, George Davey Smith⁸, George Dedoussis²¹, Marcus Dörr⁶, Aliki-Eleni Farmaki²¹, Luigi Ferrucci²⁰, Lukas Forer²², Ross M. Fraser², Stacey Gabriel²³, Shawn Levy, Leif Groop²⁴, Leif Groop²⁵, Tabitha A. Harrison¹⁰, Andrew T. Hattersley⁵, Oddgeir L. Holmen²⁶, Kristian Hveem²⁶, Matthias Kretzler², James Lee²⁷, Matt McGue²⁸, Thomas Meitinger²⁹, David Melzer⁵, Josine L. Min⁸, Karen L. Mohlke³⁰, John B. Vincent³¹, Matthias Nauck⁶, Deborah A. Nickerson¹¹, Aarno Palotie²³, Aarno Palotie¹⁹, Michele T. Pato¹², Nicola Pirastu¹⁴, Melvin G. McInnis², J. Brent Richards¹⁶, J. Brent Richards³², Cinzia Sala, Veikko Salomaa, David Schlessinger²⁰, Sebastian Schoenherr²², P. Eline Slagboom³³, Kerrin S. Small¹⁶, Tim D. Spector¹⁶, Dwight Stambolian³⁴, Marcus A. Tuke⁵, Jaakko Tuomilehto, Leonard H. van den Berg, Wouter van Rheenen, Uwe Völker⁶, Cisca Wijmenga³⁵, Daniela Toniolo, Eleftheria Zeggini¹, Paolo Gasparini¹⁴, Matthew G. Sampson², James F. Wilson¹⁸, Timothy M. Frayling⁵, Paul I.W. de Bakker³⁶, Morris A. Swertz³⁵, Steven A. McCarroll¹⁹, Charles Kooperberg¹⁰, Annelot M. Dekker, David Altshuler, Cristen J. Willer², William G. Iacono²⁸, Samuli Ripatti²⁴, Nicole Soranzo²⁷, Nicole Soranzo¹, Klaudia Walter¹, Anand Swaroop²⁰, Francesco Cucca⁷, Carl A. Anderson¹, Richard M. Myers, Michael Boehnke², Mark I. McCarthy³⁷, Mark I. McCarthy³, Richard Durbin¹, Gonçalo R. Abecasis², Jonathan Marchini³

Wellcome Trust Sanger Institute¹, University of Michigan², University of Oxford³, University of Geneva⁴, University of Exeter⁵, Greifswald University Hospital⁶, National Research Council⁷, University of Bristol⁸, University of Colorado Boulder⁹, Fred Hutchinson Cancer Research Center¹⁰, University of Washington¹¹, SUNY Downstate Medical Center¹², Erasmus University Rotterdam¹³, University of Trieste¹⁴, VU University Amsterdam¹⁵, King's College London¹⁶, South London and Maudsley NHS Foundation Trust¹⁷, University of Edinburgh¹⁸, Harvard University¹⁹, National Institutes of Health²⁰, Harokopio University²¹, Innsbruck Medical University²², Broad Institute²³, University of Helsinki²⁴, Lund University²⁵, Norwegian University of Science and Technology²⁶, University of Cambridge²⁷, University of Minnesota²⁸, Technische Universität München²⁹, University of North Carolina at Chapel Hill³⁰, University of Toronto³¹, McGill University³², Leiden University³³, University of Pennsylvania³⁴, University of Groningen³⁵, Utrecht University³⁶, Churchill Hospital³⁷

22 Aug 2016-Nature Genetics

TL;DR: A reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole-genome sequence data from 20 studies of predominantly European ancestry leads to accurate genotype imputation at minor allele frequencies as low as 0.1% and a large increase in the number of SNPs tested in association studies.

Abstract: We describe a reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole-genome sequence data from 20 studies of predominantly European ancestry. Using this resource leads to accurate genotype imputation at minor allele frequencies as low as 0.1% and a large increase in the number of SNPs tested in association studies, and it can help to discover and refine causal loci. We describe remote server resources that allow researchers to carry out imputation and phasing consistently and efficiently.
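As a rough, back-of-the-envelope illustration of the scale these figures imply (a sketch under stated assumptions, not a calculation taken from the paper): if every sampled individual contributes two haplotypes, the panel size and the 0.1% frequency threshold can be related as below. The function name and the diploid assumption are illustrative.

```python
# Illustrative arithmetic only. Assumption: each individual is diploid and
# contributes exactly two haplotypes to the panel.
N_HAPLOTYPES = 64_976  # panel size stated in the abstract


def expected_minor_allele_copies(maf: float, n_haplotypes: int = N_HAPLOTYPES) -> float:
    """Expected number of panel haplotypes carrying an allele of frequency `maf`."""
    return maf * n_haplotypes


n_individuals = N_HAPLOTYPES // 2
copies_at_threshold = expected_minor_allele_copies(0.001)  # MAF of 0.1%

print(n_individuals)               # 32488
print(round(copies_at_threshold))  # 65
```

Under these assumptions, an allele at the 0.1% threshold would be expected on only about 65 haplotypes in the whole panel.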

2,149 citations

A reference panel of 64,976 haplotypes for genotype imputation

Shane A. McCarthy, Sayantan Das, Warren W. Kretzschmar, Olivier Delaneau, Andrew R. Wood, Alexander Teumer, Hyun Min Kang, Christian Fuchsberger, Petr Danecek, Kevin Sharp, Yang Luo, Carlo Sidore, Alan Kwong, Nicholas J. Timpson, Seppo Koskinen, Scott I. Vrieze, Laura J. Scott, He Zhang, Anubha Mahajan, Jan H. Veldink, Ulrike Peters, Carlos N. Pato, Cornelia M. van Duijn, Christopher E. Gillies, Ilaria Gandin, Massimo Mezzavilla, Arthur Gilly, Massimiliano Cocca, Michela Traglia, Andrea Angius, Jeffrey C. Barrett, D.I. Boomsma, Kari Branham, Gerome Breen, Chad M. Brummett, Fabio Busonero, Harry Campbell, Andrew T. Chan, Sai Chen, Emily Y. Chew, Francis S. Collins, Laura J Corbin, George Davey Smith, George Dedoussis, Marcus Dörr, Aliki-Eleni Farmaki, Luigi Ferrucci, Lukas Forer, Ross M. Fraser, Stacey Gabriel, Shawn Levy, Leif Groop, Tabitha A. Harrison, Andrew T. Hattersley, Oddgeir L. Holmen, Kristian Hveem, Matthias Kretzler, James Lee, Matt McGue, Thomas Meitinger, David Melzer, Josine L. Min, Karen L. Mohlke, John B. Vincent, Matthias Nauck, Deborah A. Nickerson, Aarno Palotie, Michele T. Pato, Nicola Pirastu, Melvin G. McInnis, J. Brent Richards, Cinzia Sala, Veikko Salomaa, David Schlessinger, Sebastian Schoenherr, P. Eline Slagboom, Kerrin S. Small, Tim D. Spector, Dwight Stambolian, Marcus A. Tuke, Jaakko Tuomilehto, Leonard H. van den Berg, Wouter van Rheenen, Uwe Völker, Cisca Wijmenga, Daniela Toniolo, Eleftheria Zeggini, Paolo Gasparini, Matthew G. Sampson, James F. Wilson, Timothy M. Frayling, Paul I.W. de Bakker, Morris A. Swertz, Steven A. McCarroll, Charles Kooperberg, Annelot M. Dekker, David Altshuler, Cristen J. Willer, William G. Iacono, Samuli Ripatti, Nicole Soranzo, Klaudia Walter, Anand Swaroop, Francesco Cucca, Carl A. Anderson, Richard M. Myers, Michael Boehnke, Mark I. McCarthy, Richard Durbin, Gonçalo R. Abecasis, Jonathan Marchini

01 Jan 2016

TL;DR: In this article, a reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole-genome sequence data from 20 studies of predominantly European ancestry is presented.

1,261 citations

Journal Article

Reference-based phasing using the Haplotype Reference Consortium panel.

Po-Ru Loh¹, Po-Ru Loh², Petr Danecek³, Pier Francesco Palamara², Pier Francesco Palamara¹, Christian Fuchsberger⁴, Christian Fuchsberger⁵, Yakir A. Reshef¹, Hilary K. Finucane¹, Hilary K. Finucane⁶, Sebastian Schoenherr⁷, Lukas Forer⁷, Shane A. McCarthy³, Gonçalo R. Abecasis⁵, Richard Durbin³, Alkes L. Price¹, Alkes L. Price²

Harvard University¹, Broad Institute², Wellcome Trust Sanger Institute³, European Academy of Bozen⁴, University of Michigan⁵, Massachusetts Institute of Technology⁶, Innsbruck Medical University⁷

01 Nov 2016-Nature Genetics

TL;DR: A new phasing algorithm, Eagle2, is introduced that attains high accuracy across a broad range of cohort sizes by efficiently leveraging information from large external reference panels (such as the Haplotype Reference Consortium; HRC) using a new data structure based on the positional Burrows-Wheeler transform.

Abstract: Po-Ru Loh, Alkes Price and colleagues present Eagle2, a reference-based phasing algorithm that allows for highly accurate and efficient phasing of genotypes across a broad range of cohort sizes. They demonstrate an approximately 10% improvement in accuracy and 20% improvement in speed compared to a competing method, SHAPEIT2.
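For readers unfamiliar with the positional Burrows-Wheeler transform (PBWT) named in the summary above, the following is a minimal sketch of its core ordering step for biallelic (0/1) haplotypes. It is not Eagle2's data structure or code; the function name `pbwt_prefix_arrays` and the toy haplotypes are hypothetical, chosen only to illustrate how haplotypes are kept sorted by their reversed prefixes as one sweeps along the sites.

```python
from typing import List, Sequence


def pbwt_prefix_arrays(haplotypes: Sequence[Sequence[int]]) -> List[List[int]]:
    """Return positional prefix arrays a_0 .. a_N for 0/1 haplotypes.

    a_k lists haplotype indices sorted by their first k alleles read
    backwards (the "reversed prefix"). Each update is a stable partition
    of the current order by the allele at site k: zeros first, then ones.
    """
    n_sites = len(haplotypes[0]) if haplotypes else 0
    order = list(range(len(haplotypes)))
    arrays = [list(order)]
    for k in range(n_sites):
        zeros = [i for i in order if haplotypes[i][k] == 0]
        ones = [i for i in order if haplotypes[i][k] == 1]
        order = zeros + ones
        arrays.append(list(order))
    return arrays


# Toy example (hypothetical data): four haplotypes over three sites.
haps = [[0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1]]
print(pbwt_prefix_arrays(haps)[2])  # [2, 3, 0, 1]
```

Because haplotypes that share a long match ending at a given site end up adjacent in that site's ordering, structures built on this idea can find close matches in a large reference panel without comparing every pair.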

1,246 citations

Journal Article

BCFtools/RoH: a hidden Markov model approach for detecting autozygosity from next-generation sequencing data

Vagheesh M. Narasimhan¹, Petr Danecek¹, Aylwyn Scally², Yali Xue¹, Chris Tyler-Smith¹, Richard Durbin¹

Wellcome Trust Sanger Institute¹, University of Cambridge²

01 Jun 2016-Bioinformatics

TL;DR: BCFtools/RoH, an extension to the BCFtools software package, detects regions of autozygosity in sequencing data, in particular exome data, using a hidden Markov model; it is shown to have higher sensitivity and specificity than existing methods under a range of sequencing error rates and levels of autozygosity.

Abstract: Summary: Runs of homozygosity (RoHs) are genomic stretches of a diploid genome that show identical alleles on both chromosomes. Longer RoHs are unlikely to have arisen by chance but are likely to denote autozygosity, whereby both copies of the genome descend from the same recent ancestor. Early tools to detect RoH used genotype array data, but substantially more information is available from sequencing data. Here, we present and evaluate BCFtools/RoH, an extension to the BCFtools software package, that detects regions of autozygosity in sequencing data, in particular exome data, using a hidden Markov model. By applying it to simulated data and real data from the 1000 Genomes Project we estimate its accuracy and show that it has higher sensitivity and specificity than existing methods under a range of sequencing error rates and levels of autozygosity. Availability and implementation: BCFtools/RoH and its associated binary/source files are freely available from https://github.com/samtools/BCFtools. Contact: vn2@sanger.ac.uk or pd3@sanger.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
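To make the hidden Markov model idea concrete, here is a deliberately simplified two-state Viterbi sketch. It does not reproduce the BCFtools/RoH model or its parameters: it assumes hard heterozygous/homozygous calls, a constant switch probability between sites, and Hardy-Weinberg heterozygosity 2p(1-p) outside autozygous regions. The function name, the state labels "HW" and "AZ", and the default values are all hypothetical.

```python
import math
from typing import List, Sequence


def viterbi_autozygosity(
    is_het: Sequence[bool],
    alt_freqs: Sequence[float],
    p_switch: float = 0.01,  # assumed constant per-site switch probability
    err: float = 0.01,       # assumed chance of a het call inside an autozygous run
) -> List[str]:
    """Label each site 'AZ' (autozygous) or 'HW' (not) by Viterbi decoding."""
    states = ("HW", "AZ")
    log_stay, log_switch = math.log(1 - p_switch), math.log(p_switch)

    def log_emit(state: str, het: bool, p: float) -> float:
        # Probability of a heterozygous call in each state, clamped away from 0 and 1.
        het_prob = 2 * p * (1 - p) if state == "HW" else err
        het_prob = min(max(het_prob, 1e-9), 1 - 1e-9)
        return math.log(het_prob if het else 1 - het_prob)

    def log_trans(prev: str, cur: str) -> float:
        return log_stay if prev == cur else log_switch

    if not is_het:
        return []

    score = {s: math.log(0.5) + log_emit(s, is_het[0], alt_freqs[0]) for s in states}
    backpointers = []
    for het, p in zip(is_het[1:], alt_freqs[1:]):
        new_score, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda r: score[r] + log_trans(r, s))
            new_score[s] = score[best_prev] + log_trans(best_prev, s) + log_emit(s, het, p)
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy input (hypothetical): a long homozygous stretch flanked by heterozygous sites.
calls = [True] * 5 + [False] * 40 + [True] * 5
labels = viterbi_autozygosity(calls, [0.5] * len(calls))
print(labels.count("AZ"))
```

The point of the HMM framing is that a single heterozygous call or a short homozygous stretch does not flip the state on its own; the path changes only when the accumulated evidence outweighs the cost of switching.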

452 citations

Journal Article

A genomic history of Aboriginal Australia

Anna-Sapfo Malaspinas¹, Anna-Sapfo Malaspinas², Anna-Sapfo Malaspinas³, Michael C. Westaway⁴, Craig Muller¹, Vitor C. Sousa³, Vitor C. Sousa², Oscar Lao⁵, Isabel Alves⁶, Isabel Alves², Isabel Alves³, Anders Bergström⁷, Georgios Athanasiadis⁸, Jade Yu Cheng⁹, Jade Yu Cheng⁸, Jacob E. Crawford⁹, Tim H. Heupink⁴, Enrico Macholdt¹⁰, Stephan Peischl³, Stephan Peischl², Simon Rasmussen¹¹, Stephan Schiffels¹⁰, Sankar Subramanian⁴, Joanne L. Wright⁴, Anders Albrechtsen¹, Chiara Barbieri¹⁰, Isabelle Dupanloup², Isabelle Dupanloup³, Anders Eriksson¹², Anders Eriksson¹³, Ashot Margaryan¹, Ida Moltke¹, Irina Pugach¹⁰, Thorfinn Sand Korneliussen¹, Ivan P. Levkivskyi¹⁴, J. Víctor Moreno-Mayar¹, Shengyu Ni¹⁰, Fernando Racimo⁹, Martin Sikora¹, Yali Xue⁷, Farhang Aghakhanian¹⁵, Nicolas Brucato¹⁶, Søren Brunak¹, Paula F. Campos¹, Paula F. Campos¹⁷, Warren Clark, Sturla Ellingvåg, Gudjugudju Fourmile, Pascale Gerbault¹⁸, Darren Injie, George Koki¹⁹, Matthew Leavesley²⁰, Betty Logan, Aubrey Lynch, Elizabeth Matisoo-Smith²¹, Peter McAllister, Alexander J. Mentzer²², Mait Metspalu²³, Andrea Bamberg Migliano¹⁸, Les Murgha, Maude E. Phipps¹⁵, William Pomat¹⁹, Doc Reynolds, François-Xavier Ricaut¹⁶, Peter Siba¹⁹, Mark G. Thomas¹⁸, Thomas Wales, Colleen Ma Run Wall, Stephen Oppenheimer²⁴, Chris Tyler-Smith⁷, Richard Durbin⁷, Joe Dortch²⁵, Andrea Manica¹², Mikkel H. Schierup⁸, Robert Foley¹, Robert Foley¹², Marta Mirazón Lahr¹², Marta Mirazón Lahr¹, Claire Bowern²⁶, Jeffrey D. Wall²⁷, Thomas Mailund⁸, Mark Stoneking¹⁰, Rasmus Nielsen¹, Rasmus Nielsen⁹, Manjinder S. Sandhu⁷, Laurent Excoffier², Laurent Excoffier³, David M. Lambert⁴, Eske Willerslev¹, Eske Willerslev¹², Eske Willerslev⁷

13 Oct 2016-Nature

TL;DR: A population expansion in northeast Australia during the Holocene epoch associated with limited gene flow from this region to the rest of Australia, consistent with the spread of the Pama–Nyungan languages is inferred.

Abstract: The population history of Aboriginal Australians remains largely uncharacterized. Here we generate high-coverage genomes for 83 Aboriginal Australians (speakers of Pama–Nyungan languages) and 25 Papuans from the New Guinea Highlands. We find that Papuan and Aboriginal Australian ancestors diversified 25–40 thousand years ago (kya), suggesting pre-Holocene population structure in the ancient continent of Sahul (Australia, New Guinea and Tasmania). However, all of the studied Aboriginal Australians descend from a single founding population that differentiated ~10–32 kya. We infer a population expansion in northeast Australia during the Holocene epoch (past 10,000 years) associated with limited gene flow from this region to the rest of Australia, consistent with the spread of the Pama–Nyungan languages. We estimate that Aboriginal Australians and Papuans diverged from Eurasians 51–72 kya, following a single out-of-Africa dispersal, and subsequently admixed with archaic populations. Finally, we report evidence of selection in Aboriginal Australians potentially associated with living in the desert.

389 citations

Journal Article

Health and population effects of rare gene knockouts in adult humans with related parents.

Vagheesh M. Narasimhan¹, Karen A. Hunt², Dan Mason³, Christopher L. Baker, Konrad J. Karczewski⁴, Konrad J. Karczewski⁵, Michael R. Barnes², Anthony H. Barnett⁶, Christopher M. Bates, Srikanth Bellary⁷, Nicholas A. Bockett², Kristina Giorda, Chris Griffiths², Harry Hemingway⁸, Jia Zhilong², Ann M. Kelly⁹, Hajrah A. Khawaja², Monkol Lek⁵, Monkol Lek⁴, Shane A. McCarthy¹, Rosie McEachan³, Anne H. O’Donnell-Luria⁵, Anne H. O’Donnell-Luria⁴, Kenneth Paigen, Constantinos A. Parisinos², Eamonn Sheridan³, Laura Southgate², Louise Tee⁹, Mark G. Thomas¹, Yali Xue¹, Michael Schnall-Levin, Petko M. Petkov, Chris Tyler-Smith¹, Eamonn R. Maher¹⁰, Eamonn R. Maher¹¹, Richard C. Trembath¹², Richard C. Trembath², Daniel G. MacArthur⁴, Daniel G. MacArthur⁵, John Wright³, Richard Durbin¹, David A. van Heel²

Wellcome Trust Sanger Institute¹, Queen Mary University of London², National Health Service³, Broad Institute⁴, Harvard University⁵, Heart of England NHS Foundation Trust⁶, Aston University⁷, University College London⁸, University of Birmingham⁹, Cambridge University Hospitals NHS Foundation Trust¹⁰, National Institute for Health Research¹¹, King's College London¹²

22 Apr 2016-Science

TL;DR: Phased genome sequencing of a healthy PRDM9-knockout mother, her child and controls shows that meiotic recombination sites are localized away from PRDM9-dependent hotspots. Thus, natural LOF variants inform on essential genetic loci and demonstrate PRDM9 redundancy in humans.

Abstract: Examining complete gene knockouts within a viable organism can inform on gene function. We sequenced the exomes of 3222 British Pakistani-heritage adults with high parental relatedness, discovering 1111 rare-variant homozygous genotypes with predicted loss of gene function (knockouts) in 781 genes. We observed 13.7% fewer than expected homozygous knockout genotypes, implying an average load of 1.6 recessive-lethal-equivalent LOF variants per adult. Linking genetic data to lifelong health records, knockouts were not associated with clinical consultation or prescription rate. In this dataset we identified a healthy PRDM9 knockout mother, and performed phased genome sequencing on her, her child and controls, which showed meiotic recombination sites localized away from PRDM9-dependent hotspots. Thus, natural LOF variants inform upon essential genetic loci, and demonstrate PRDM9 redundancy in humans.

266 citations

Posted Content

Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly

Valerie A. Schneider¹, Tina A. Graves-Lindsay², Kerstin Howe³, Nathan Bouk¹, Hsiu-Chuan Chen¹, Paul Kitts¹, Terence Murphy¹, Kim D. Pruitt¹, Françoise Thibaud-Nissen¹, Derek Albracht², Robert S. Fulton², Milinn Kremitzki², Vincent Magrini², Chris Markovic², Sean McGrath², Karyn Meltz Steinberg², Kate Auger³, William Chow³, Joanna Collins³, Glenn Harden³, Tim Hubbard⁴, Sarah Pelan³, Jared T. Simpson⁵, Glen Threadgold³, James Torrance³, Jonathan Wood³, Laura Clarke⁶, Sergey Koren¹, Matthew Boitano⁷, Heng Li⁸, Chen-Shan Chin⁷, Adam M. Phillippy¹, Richard Durbin³, Richard K. Wilson², Paul Flicek⁶, Deanna M. Church¹

National Institutes of Health¹, University of Washington², Wellcome Trust Sanger Institute³, Queen Mary University of London⁴, Ontario Institute for Cancer Research⁵, European Bioinformatics Institute⁶, Pacific Biosciences⁷, Broad Institute⁸

30 Aug 2016-bioRxiv

TL;DR: It is asserted that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote the understanding of human biology and advance the efforts to improve health.

Abstract: The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009 and reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype resources to identify and resolve larger assembly issues. For the first time, the reference assembly contains sequence-based representations for the centromeres. We also expanded the number of alternate loci to create a reference that provides a more robust representation of human population variation. We demonstrate that the updates render the reference an improved annotation substrate, alter read alignments in unchanged regions and impact variant interpretation at clinically relevant loci. We additionally evaluated a collection of new de novo long-read haploid assemblies and conclude that while the new assemblies compare favorably to the reference with respect to continuity, error rate, and gene completeness, the reference still provides the best representation for complex genomic regions and coding sequences. We assert that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote our understanding of human biology and advance our efforts to improve health.

194 citations

Journal Article

A federated ecosystem for sharing genomic, clinical data

10 Jun 2016-Science

TL;DR: Early data-sharing efforts have led to improved variant interpretation and development of treatments for rare diseases and some cancer types, but such benefits will only be available to the general population if researchers and clinicians can access and make comparisons across data from millions of individuals.

Abstract: Silos of genome data collection are being transformed into seamlessly connected, independent systems. Early data-sharing efforts have led to improved variant interpretation and development of treatments for rare diseases and some cancer types (1–3). However, such benefits will only be available to the general population if researchers and clinicians can access and make comparisons across data from millions of individuals.

173 citations

Journal Article

Deficient methylation and formylation of mt-tRNA Met wobble cytosine in a patient carrying mutations in NSUN3

Lindsey Van Haute¹, Sabine Dietmann², Laura S. Kremer³, Shobbir Hussain⁴, Sarah F. Pearce¹, Christopher A. Powell¹, Joanna Rorbach¹, Rebecca Lantaff², Sandra Blanco², Sascha Sauer⁵, Sascha Sauer⁶, Sascha Sauer⁷, Urania Kotzaeridou⁸, Georg F. Hoffmann⁸, Yasin Memari⁹, Anja Kolb-Kokocinski⁹, Richard Durbin⁹, Johannes A. Mayr¹⁰, Michaela Frye², Holger Prokisch³, Michal Minczuk¹

MRC Mitochondrial Biology Unit¹, University of Cambridge², Technische Universität München³, University of Bath⁴, University of Würzburg⁵, Max Planck Society⁶, Max Delbrück Center for Molecular Medicine⁷, Boston Children's Hospital⁸, Wellcome Trust Sanger Institute⁹, Paracelsus Private Medical University of Salzburg¹⁰

30 Jun 2016-Nature Communications

TL;DR: It is shown that NSun3 is required for deposition of m5C at the anticodon loop in the mitochondrially encoded transfer RNA methionine (mt-tRNAMet), and that f5C in human mitochondrial RNA is generated by oxidative processing of m5C.

Abstract: Epitranscriptome modifications are required for structure and function of RNA and defects in these pathways have been associated with human disease. Here we identify the RNA target for the previously uncharacterized 5-methylcytosine (m(5)C) methyltransferase NSun3 and link m(5)C RNA modifications with energy metabolism. Using whole-exome sequencing, we identified loss-of-function mutations in NSUN3 in a patient presenting with combined mitochondrial respiratory chain complex deficiency. Patient-derived fibroblasts exhibit severe defects in mitochondrial translation that can be rescued by exogenous expression of NSun3. We show that NSun3 is required for deposition of m(5)C at the anticodon loop in the mitochondrially encoded transfer RNA methionine (mt-tRNA(Met)). Further, we demonstrate that m(5)C deficiency in mt-tRNA(Met) results in the lack of 5-formylcytosine (f(5)C) at the same tRNA position. Our findings demonstrate that NSUN3 is necessary for efficient mitochondrial translation and reveal that f(5)C in human mitochondrial RNA is generated by oxidative processing of m(5)C.

172 citations

Journal Article

Iron Age and Anglo-Saxon genomes from East England reveal British migration history

Stephan Schiffels¹, Wolfgang Haak¹, Pirita Paajanen², Pirita Paajanen³, Bastien Llamas⁴, Elizabeth Popescu⁵, Louise Loe⁵, Rachel Clarke⁵, Alice Lyons⁵, Richard Mortimer⁵, Duncan Sayer⁶, Chris Tyler-Smith², Alan Cooper⁴, Richard Durbin²

Max Planck Society¹, Wellcome Trust Sanger Institute², Norwich Research Park³, University of Adelaide⁴, University of Oxford⁵, University of Central Lancashire⁶

19 Jan 2016-Nature Communications

TL;DR: By analysing shared rare variants, it is estimated that on average the contemporary East English population derives 38% of its ancestry from Anglo-Saxon migrations. Using rarecoal, a new method, the Anglo-Saxon samples are found to be closely related to modern Dutch and Danish populations, while the Iron Age samples share ancestors with multiple Northern European populations including Britain.

Abstract: British population history has been shaped by a series of immigrations, including the early Anglo-Saxon migrations after 400 CE. It remains an open question how these events affected the genetic composition of the current British population. Here, we present whole-genome sequences from 10 individuals excavated close to Cambridge in the East of England, ranging from the late Iron Age to the middle Anglo-Saxon period. By analysing shared rare variants with hundreds of modern samples from Britain and Europe, we estimate that on average the contemporary East English population derives 38% of its ancestry from Anglo-Saxon migrations. We gain further insight with a new method, rarecoal, which infers population history and identifies fine-scale genetic ancestry from rare variants. Using rarecoal we find that the Anglo-Saxon samples are closely related to modern Dutch and Danish populations, while the Iron Age samples share ancestors with multiple Northern European populations including Britain.

144 citations

Journal Article

DNAH11 Localization in the Proximal Region of Respiratory Cilia Defines Distinct Outer Dynein Arm Complexes.

Gerard W. Dougherty, Niki T. Loges, Judith A. Klinkenbusch, Heike Olbrich, Petra Pennekamp, Tabea Menchen, Johanna Raidt, Julia Wallmeier, Claudius Werner, Cordula Westermann, Christian Ruckert¹, Virginia Mirra², Rim Hjeij, Yasin Memari³, Richard Durbin³, Anja Kolb-Kokocinski³, Kavita Praveen⁴, Mohammad Amin Kashef⁵, Mohammad Amin Kashef⁶, Sara Kashef⁶, Fardin Eghtedari, Karsten Häffner⁷, Pekka Valmari, Gyorgy Baktai, Micha Aviram⁸, Lea Bentur, Israel Amirav⁹, Erica E. Davis⁴, Nicholas Katsanis⁴, Martina Brueckner¹⁰, Artem Shaposhnykov¹¹, Gaia Pigino¹¹, Bernd Dworniczak¹, Heymut Omran

University of Münster¹, University of Naples Federico II², Wellcome Trust Sanger Institute³, Duke University⁴, Baystate Medical Center⁵, Shiraz University of Medical Sciences⁶, University of Freiburg⁷, Soroka Medical Center⁸, University of Alberta⁹, Yale University¹⁰, Max Planck Society¹¹

01 Aug 2016-American Journal of Respiratory Cell and Molecular Biology

TL;DR: A monoclonal antibody specific to DNAH11 was designed and validated, and high-resolution IFM of both control and PCD-affected human respiratory cells, as well as samples from green fluorescent protein (GFP)-left-right dynein mice, was performed to determine the ciliary localization of DNAH11.

Abstract: Primary ciliary dyskinesia (PCD) is a recessively inherited disease that leads to chronic respiratory disorders owing to impaired mucociliary clearance. Conventional transmission electron microscopy (TEM) is a diagnostic standard to identify ultrastructural defects in respiratory cilia but is not useful in approximately 30% of PCD cases, which have normal ciliary ultrastructure. DNAH11 mutations are a common cause of PCD with normal ciliary ultrastructure and hyperkinetic ciliary beating, but its pathophysiology remains poorly understood. We therefore characterized DNAH11 in human respiratory cilia by immunofluorescence microscopy (IFM) in the context of PCD. We used whole-exome and targeted next-generation sequence analysis as well as Sanger sequencing to identify and confirm eight novel loss-of-function DNAH11 mutations. We designed and validated a monoclonal antibody specific to DNAH11 and performed high-resolution IFM of both control and PCD-affected human respiratory cells, as well as samples from green fluorescent protein (GFP)-left-right dynein mice, to determine the ciliary localization of DNAH11. IFM analysis demonstrated native DNAH11 localization in only the proximal region of wild-type human respiratory cilia and loss of DNAH11 in individuals with PCD with certain loss-of-function DNAH11 mutations. GFP-left-right dynein mice confirmed proximal DNAH11 localization in tracheal cilia. DNAH11 retained proximal localization in respiratory cilia of individuals with PCD with distinct ultrastructural defects, such as the absence of outer dynein arms (ODAs). TEM tomography detected a partial reduction of ODAs in DNAH11-deficient cilia. DNAH11 mutations result in a subtle ODA defect in only the proximal region of respiratory cilia, which is detectable by IFM and TEM tomography.

Journal Article

Bi-allelic Truncating Mutations in TANGO2 Cause Infancy-Onset Recurrent Metabolic Crises with Encephalocardiomyopathy

Laura S. Kremer¹, Felix Distelmaier², Bader Alhaddad¹, Maja Hempel³, Arcangela Iuso¹, Clemens Küpper⁴, Clemens Küpper⁵, Chris Mühlhausen³, Reka Kovacs-Nagy¹, Robin Satanovskij¹, Elisabeth Graf, Riccardo Berutti, Gertrud Eckstein, Richard Durbin⁶, Sascha Sauer⁷, Sascha Sauer⁸, Georg F. Hoffmann⁹, Tim M. Strom¹, René Santer³, Thomas Meitinger¹, Thomas Klopstock⁵, Thomas Klopstock⁴, Holger Prokisch¹, Tobias B. Haack¹

Technische Universität München¹, University of Düsseldorf², University of Hamburg³, Ludwig Maximilian University of Munich⁴, German Center for Neurodegenerative Diseases⁵, Wellcome Trust Sanger Institute⁶, University of Würzburg⁷, Max Planck Society⁸, University Hospital Heidelberg⁹

04 Feb 2016-American Journal of Human Genetics

TL;DR: The results establish TANGO2 deficiency as a clinically recognizable cause of pediatric disease with multi-organ involvement, and investigation of palmitate-dependent respiration in mutant fibroblasts showed evidence of a functional defect in mitochondrial β-oxidation.

Abstract: Molecular diagnosis of mitochondrial disorders is challenging because of extreme clinical and genetic heterogeneity. By exome sequencing, we identified three different bi-allelic truncating mutations in TANGO2 in three unrelated individuals with infancy-onset episodic metabolic crises characterized by encephalopathy, hypoglycemia, rhabdomyolysis, arrhythmias, and laboratory findings suggestive of a defect in mitochondrial fatty acid oxidation. Over the course of the disease, all individuals developed global brain atrophy with cognitive impairment and pyramidal signs. TANGO2 (transport and Golgi organization 2) encodes a protein with a putative function in redistribution of Golgi membranes into the endoplasmic reticulum in Drosophila and a mitochondrial localization has been confirmed in mice. Investigation of palmitate-dependent respiration in mutant fibroblasts showed evidence of a functional defect in mitochondrial β-oxidation. Our results establish TANGO2 deficiency as a clinically recognizable cause of pediatric disease with multi-organ involvement.

Journal Article

TTC25 Deficiency Results in Defects of the Outer Dynein Arm Docking Machinery and Primary Ciliary Dyskinesia with Left-Right Body Asymmetry Randomization

Julia Wallmeier, Hidetaka Shiratori¹, Gerard W. Dougherty, Christine Edelbusch, Rim Hjeij, Niki T. Loges, Tabea Menchen, Heike Olbrich, Petra Pennekamp, Johanna Raidt, Claudius Werner, Katsura Minegishi¹, Kyosuke Shinohara¹, Yasuko Asai¹, Katsuyoshi Takaoka¹, Chanjae Lee², Matthias Griese³, Yasin Memari⁴, Richard Durbin⁴, Anja Kolb-Kokocinski⁴, Sascha Sauer⁵, John B. Wallingford², Hiroshi Hamada¹, Heymut Omran

Osaka University¹, University of Texas at Austin², Ludwig Maximilian University of Munich³, Wellcome Trust Sanger Institute⁴, Max Delbrück Center for Molecular Medicine⁵

04 Aug 2016-American Journal of Human Genetics

TL;DR: TTC25 is reported as a new member of the ODA-DC machinery in humans and mice, and loss of the ciliary ODAs in affected humans is shown via TEM and immunofluorescence analyses.

Abstract: Multiprotein complexes referred to as outer dynein arms (ODAs) develop the main mechanical force to generate the ciliary and flagellar beat. ODA defects are the most common cause of primary ciliary dyskinesia (PCD), a congenital disorder of ciliary beating, characterized by recurrent infections of the upper and lower airways, as well as by progressive lung failure and randomization of left-right body asymmetry. Using a whole-exome sequencing approach, we identified recessive loss-of-function mutations within TTC25 in three individuals from two unrelated families affected by PCD. Mice generated by CRISPR/Cas9 technology and carrying a deletion of exons 2 and 3 in Ttc25 presented with laterality defects. Consistently, we observed immotile nodal cilia and missing leftward flow via particle image velocimetry. Furthermore, transmission electron microscopy (TEM) analysis in TTC25-deficient mice revealed an absence of ODAs. Consistent with our findings in mice, we were able to show loss of the ciliary ODAs in humans via TEM and immunofluorescence (IF) analyses. Additionally, IF analyses revealed an absence of the ODA docking complex (ODA-DC), along with its known components CCDC114, CCDC151, and ARMC4. Co-immunoprecipitation revealed interaction between the ODA-DC component CCDC114 and TTC25. Thus, here we report TTC25 as a new member of the ODA-DC machinery in humans and mice.

Journal Article•DOI•

A high-content platform to characterise human induced pluripotent stem cell lines.

[...]

Andreas Leha¹, Nathalie Moens², Ruta Meleckyte², Oliver J. Culley², Mia K. R. Gervasio², Maximilian Kerz², Andreas Reimer², Stuart A. Cain³, Ian Streeter⁴, Amos Folarin⁵, Oliver Stegle⁴, Cay M. Kielty³, Richard Durbin¹, Fiona M. Watt², Davide Danovi²•Institutions (5)

Wellcome Trust Sanger Institute¹, King's College London², Wellcome Trust Centre for Cell-Matrix Research³, European Bioinformatics Institute⁴, Centre for Mental Health⁵

01 Mar 2016-Methods

TL;DR: A high-content platform for phenotypic analysis of human induced pluripotent stem cell (iPSC) lines is described, in which cells are dissociated and seeded as single cells onto 96-well plates coated with fibronectin at three different concentrations.

Journal Article•DOI•

A Method for Checking Genomic Integrity in Cultured Cell Lines from SNP Genotyping Data.

[...]

Petr Danecek¹, Shane A. McCarthy¹, Richard Durbin¹•Institutions (1)

Wellcome Trust Sanger Institute¹

13 May 2016-PLOS ONE

TL;DR: A new method for sensitive detection of copy number alterations, aneuploidy, and contamination in cell lines using genome-wide SNP genotyping data is presented, with results from induced pluripotent stem cell lines obtained in the HipSci project.

Abstract: Genomic screening for chromosomal abnormalities is an important part of quality control when establishing and maintaining stem cell lines. We present a new method for sensitive detection of copy number alterations, aneuploidy, and contamination in cell lines using genome-wide SNP genotyping data. In contrast to other methods designed for identifying copy number variations in a single sample or in a sample composed of a mixture of normal and tumor cells, this new method is tailored for determining differences between cell lines and the starting material from which they were derived, which allows us to distinguish between normal and novel copy number variation. We implemented the method in the freely available BCFtools package and present results based on induced pluripotent stem cell lines obtained in the HipSci project.

Posted Content•DOI•

The rate of false polymorphisms introduced when imputing genotypes from global imputation panels

[...]

Ida Surakka¹, Antti-Pekka Sarin¹, Sanni Ruotsalainen¹, Richard Durbin², Veikko Salomaa³, Mark J. Daly⁴, Aarno Palotie¹, Samuli Ripatti¹•Institutions (4)

University of Helsinki¹, Wellcome Trust Sanger Institute², National Institute for Health and Welfare³, Harvard University⁴

13 Oct 2016-bioRxiv

TL;DR: The rate of false positive variants introduced by imputing Finnish genotype data with global reference panels (the Haplotype Reference Consortium, HRC, and the 1000 Genomes Project Phase I, 1000G) is evaluated, and the results are compared to a Finnish population-specific reference panel combining whole genome and exome sequenced samples.

Abstract: Previous studies [1,2] have shown that large multi-population imputation reference panels increase the number of well-imputed variants. However, to our knowledge, no previous studies have evaluated the rate of introduced variation in monomorphic sites of the study population when using imputation panels with admixed populations. In this study we evaluate the rate of false positive variants introduced by the imputation of Finnish genotype data using global reference panels (Haplotype Reference Consortium [1], HRC; and the 1000 Genomes Project Phase I [3], 1000G) and compare the results to a Finnish population-specific reference panel combining whole genome and exome sequenced samples. In sites that were monomorphic in our test set, we observed high false positive rates for the global reference panels (4.0% for 1000G and 2.6% for HRC) compared to the Finnish panel (0.26%). This rate was even higher (7.4%) when using a combination panel of 1000G and Finnish whole genome sequences with cross-panel imputation.

Posted Content•DOI•

trio-sga: facilitating de novo assembly of highly heterozygous genomes with parent-child trios

[...]

Milan Malinsky¹, Jared T. Simpson², Richard Durbin³•Institutions (3)

University of Cambridge¹, Ontario Institute for Cancer Research², Wellcome Trust Sanger Institute³

03 May 2016-bioRxiv

TL;DR: trio-sga, a set of three algorithms, takes advantage of mother-father-offspring trio sequencing to facilitate better quality genome assembly in organisms with moderate to high levels of heterozygosity.

Abstract: Motivation: Most DNA sequence in diploid organisms is found in two copies, one contributed by the mother and the other by the father. The high density of differences between the maternally and paternally contributed sequences (heterozygous sites) in some organisms makes de novo genome assembly very challenging, even for algorithms specifically designed to deal with these cases. Therefore, various approaches, most commonly inbreeding in the laboratory, are used to reduce heterozygosity in genomic data prior to assembly. However, many species are not amenable to these techniques. Results: We introduce trio-sga, a set of three algorithms designed to take advantage of mother-father-offspring trio sequencing to facilitate better quality genome assembly in organisms with moderate to high levels of heterozygosity. Two of the algorithms use haplotype phase information present in the trio data to eliminate the majority of heterozygous sites before the assembly commences. The third algorithm is designed to reduce sequencing costs by enabling the use of parents' reads in the assembly of the genome of the offspring. We test these algorithms on a 'simulated trio' from four haploid datasets, and further demonstrate their performance by assembling three highly heterozygous Heliconius butterfly genomes. While the implementation of trio-sga is tuned towards Illumina-generated data, we note that the trio approach to reducing heterozygosity is likely to have cross-platform utility for de novo assembly. Availability: trio-sga is an extension of the sga genome assembler. It is available at https://github.com/millanek/trio-sga, written in C++, and runs multithreaded on UNIX-based systems. Contact: millanek@gmail.com, rd@sanger.ac.uk

Posted Content•DOI•

A direct multi-generational estimate of the human mutation rate from autozygous segments seen in thousands of parentally related individuals

[...]

Vagheesh M. Narasimhan¹, Raheleh Rahbari¹, Aylwyn Scally², Arthur Wuster¹, Dan Mason³, Yali Xue¹, John Wright³, Richard C. Trembath⁴, Eamonn R. Maher², David A. van Heel⁵, Adam Auton⁶, Matthew E. Hurles¹, Chris Tyler-Smith¹, Richard Durbin¹•Institutions (6)

Wellcome Trust Sanger Institute¹, University of Cambridge², National Health Service³, King's College London⁴, Queen Mary University of London⁵, Albert Einstein College of Medicine⁶

17 Jun 2016-bioRxiv

TL;DR: Exome sequences from 3,222 British-Pakistani individuals with high parental relatedness are used to estimate exome mutation rates, finding frequent recurrence of mutations at polymorphic CpG sites, and an increase in C to T mutations in the Pakistani population compared to Europeans, suggesting that mutational processes have evolved rapidly between human populations.

Abstract: Heterozygous mutations within homozygous sequences descended from a recent common ancestor offer a way to ascertain de novo mutations (DNMs) across multiple generations. Using exome sequences from 3,222 British-Pakistani individuals with high parental relatedness, we estimate a mutation rate of 1.45 ± 0.05 × 10⁻⁸ per base pair per generation in autosomal coding sequence, with a corresponding non-crossover gene conversion rate of 8.75 ± 0.05 × 10⁻⁶ per base pair per generation. This is at the lower end of exome mutation rates previously estimated in parent-offspring trios, suggesting that post-zygotic mutations contribute little to the human germline mutation rate. We found frequent recurrence of mutations at polymorphic CpG sites, and an increase in C to T mutations in a 5′ CCG 3′ → 5′ CTG 3′ context in the Pakistani population compared to Europeans, suggesting that mutational processes have evolved rapidly between human populations.

Posted Content•DOI•

Reference-based phasing using the Haplotype Reference Consortium panel

[...]

Po-Ru Loh¹, Petr Danecek², Pier Francesco Palamara¹, Christian Fuchsberger³, Yakir A. Reshef¹, Hilary K. Finucane⁴, Sebastian Schoenherr⁵, Lukas Forer⁵, Shane A. McCarthy², Gonçalo R. Abecasis⁶, Richard Durbin², Alkes L. Price¹•Institutions (6)

Harvard University¹, Wellcome Trust Sanger Institute², European Academy of Bozen³, Massachusetts Institute of Technology⁴, Innsbruck Medical University⁵, University of Michigan⁶

07 Jul 2016-bioRxiv

TL;DR: A new phasing algorithm, Eagle2, is introduced that attains high accuracy across a broad range of cohort sizes by efficiently leveraging information from large external reference panels (such as the Haplotype Reference Consortium, HRC) using a new data structure based on the positional Burrows-Wheeler transform.

Abstract: Haplotype phasing is a fundamental problem in medical and population genetics. Phasing is generally performed via statistical phasing within a genotyped cohort, an approach that can attain high accuracy in very large cohorts but attains lower accuracy in smaller cohorts. Here, we instead explore the paradigm of reference-based phasing. We introduce a new phasing algorithm, Eagle2, that attains high accuracy across a broad range of cohort sizes by efficiently leveraging information from large external reference panels (such as the Haplotype Reference Consortium, HRC) using a new data structure based on the positional Burrows-Wheeler transform. We demonstrate that Eagle2 attains a ≈20x speedup and ≈10% increase in accuracy compared to reference-based phasing using SHAPEIT2. On European-ancestry samples, Eagle2 with the HRC panel achieves >2x the accuracy of 1000 Genomes-based phasing. Eagle2 is open source and freely available for HRC-based phasing via the Sanger Imputation Service and the Michigan Imputation Server.
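The positional Burrows-Wheeler transform mentioned in the abstract can be illustrated with a minimal sketch. This is not Eagle2's data structure or code; it only shows the basic idea of keeping haplotypes sorted by their reversed prefixes, site by site, for biallelic 0/1 data. The function name `pbwt_prefix_orders` and the toy panel are assumptions made for illustration.

```python
def pbwt_prefix_orders(haplotypes):
    """Return the haplotype ordering before each site and after the last one.

    haplotypes: list of equal-length lists of 0/1 alleles (hypothetical toy input).
    orders[k] lists haplotype indices sorted by their reversed prefix
    haplotypes[i][k-1], haplotypes[i][k-2], ..., haplotypes[i][0].
    """
    n_sites = len(haplotypes[0])
    order = list(range(len(haplotypes)))
    orders = [order]
    for k in range(n_sites):
        # Stable partition on the allele at site k: zeros first, then ones.
        # Stability preserves the ordering built from the earlier sites.
        zeros = [i for i in order if haplotypes[i][k] == 0]
        ones = [i for i in order if haplotypes[i][k] == 1]
        order = zeros + ones
        orders.append(order)
    return orders


if __name__ == "__main__":
    panel = [
        [0, 1, 0],
        [1, 1, 0],
        [0, 0, 1],
        [1, 0, 1],
    ]
    for k, order in enumerate(pbwt_prefix_orders(panel)):
        print(k, order)
```

Haplotypes that share a long match ending at a given site end up adjacent in the ordering for that site, which is what makes match finding against a large reference panel efficient in this family of methods.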

Posted Content•DOI•

Common genetic variation drives molecular heterogeneity in human iPSCs

[...]

Helena Kilpinen¹, Angela Goncalves², Andreas Leha², Vackar Afzal³, Sofie Ashford⁴, Sendu Bala², Dalila Bensaddek³, Francesco Paolo Casale¹, Oliver J. Culley⁵, Petr Danecek², Adam Faulconbridge¹, Peter W. Harrison¹, Davis J. McCarthy⁶, Davis J. McCarthy¹, Shane A. McCarthy², Ruta Meleckyte⁵, Yasin Memari², Nathalie Moens⁵, Filipa A.C. Soares⁴, Ian Streeter¹, Chukwuma A. Agu², Alex Alderton², Rachel Nelson², Sarah Harper², Minal Patel², Laura Clarke¹, Reena Halai², Christopher M. Kirton², Anja Kolb-Kokocinski², Philip L. Beales⁷, Ewan Birney¹, Davide Danovi⁵, Angus I. Lamond³, Willem H. Ouwehand⁴, Ludovic Vallier², Fiona M. Watt⁵, Richard Durbin², Oliver Stegle¹, Daniel J. Gaffney²•Institutions (7)

European Bioinformatics Institute¹, Wellcome Trust Sanger Institute², University of Dundee³, University of Cambridge⁴, King's College London⁵, St. Vincent's Institute of Medical Research⁶, University College London⁷

25 May 2016-bioRxiv

TL;DR: This study provides a comprehensive picture of the major sources of genetic and phenotypic variation in iPSCs and establishes their suitability for use in genetic studies of complex human traits and cancer.

Abstract: Induced pluripotent stem cell (iPSC) technology has enormous potential to provide improved cellular models of human disease. However, variable genetic and phenotypic characterisation of many existing iPSC lines limits their potential use for research and therapy. Here, we describe the systematic generation, genotyping and phenotyping of 522 open access human iPSCs derived from 189 healthy male and female individuals as part of the Human Induced Pluripotent Stem Cells Initiative (HipSci: http://www.hipsci.org). Our study provides a comprehensive picture of the major sources of genetic and phenotypic variation in iPSCs and establishes their suitability for use in genetic studies of complex human traits and cancer. Using a combination of genome-wide analyses we find that 5-25% of the variation in different iPSC phenotypes, including differentiation capacity and cellular morphology, arises from differences between individuals. We also assess the phenotypic effects of rare, genomic copy number mutations that are recurrently seen following iPSC reprogramming and present an initial map of common regulatory variants affecting the transcriptome of pluripotent cells in humans.

Journal Article•DOI•

Whole-exome sequencing in an isolated population from the Dalmatian island of Vis.

[...]

Ana Jerončić¹, Yasin Memari², Graham R. S. Ritchie², Audrey E. Hendricks³, Audrey E. Hendricks², Anja Kolb-Kokocinski², Angela Matchan², Veronique Vitart⁴, Caroline Hayward⁴, Ivana Kolcic¹, Dominik Glodzik⁴, Alan F. Wright⁴, Igor Rudan⁴, Harry Campbell⁴, Richard Durbin², Ozren Polasek⁴, Ozren Polasek¹, Eleftheria Zeggini², Vesna Boraska Perica¹, Vesna Boraska Perica²•Institutions (4)

University of Split¹, Wellcome Trust Sanger Institute², University of Colorado Denver³, University of Edinburgh⁴

06 Apr 2016-European Journal of Human Genetics

TL;DR: This work confirms the isolate status of Vis population by means of whole-exome sequence and reveals the pattern of loss-of-function mutations, which resembles the trails of adaptive evolution that were found in other species.

Abstract: We have whole-exome sequenced 176 individuals from the isolated population of the island of Vis in Croatia in order to describe exonic variation architecture. We found 290 577 single nucleotide variants (SNVs), 65% of which are singletons, low frequency or rare variants. A total of 25 430 (9%) SNVs are novel, previously not catalogued in NHLBI GO Exome Sequencing Project, UK10K-Generation Scotland, 1000Genomes Project, ExAC or NCBI Reference Assembly dbSNP. The majority of these variants (76%) are singletons. Comparable to data obtained from UK10K-Generation Scotland that were sequenced and analysed using the same protocols, we detected an enrichment of potentially damaging variants (non-synonymous and loss-of-function) in the low frequency and common variant categories. On average 115 (range 93–140) genotypes with loss-of-function variants, 23 (15–34) of which were homozygous, were identified per person. The landscape of loss-of-function variants across an exome revealed that variants mainly accumulated in genes on the xenobiotic-related pathways, of which majority coded for enzymes. The frequency of loss-of-function variants was additionally increased in Vis runs of homozygosity regions where variants mainly affected signalling pathways. This work confirms the isolate status of Vis population by means of whole-exome sequence and reveals the pattern of loss-of-function mutations, which resembles the trails of adaptive evolution that were found in other species. By cataloguing the exomic variants and describing the allelic structure of the Vis population, this study will serve as a valuable resource for future genetic studies of human diseases, population genetics and evolution in this population.

Posted Content•DOI•

Contrasting genome dynamics between domesticated and wild yeasts

[...]

Jia-Xing Yue¹, Jing Li¹, Louise Aigrain², Johan Hallin¹, Karl Persson³, Karen Oliver², Anders Bergström², Paul Coupland², Jonas Warringer³, Marco Cosentino Lagomarsino⁴, Gilles Fischer⁴, Richard Durbin², Gianni Liti¹•Institutions (4)

French Institute of Health and Medical Research¹, Wellcome Trust Sanger Institute², University of Gothenburg³, University of Paris⁴

22 Sep 2016-bioRxiv

TL;DR: A high-resolution view of structural dynamics reveals that, in chromosomal cores, S. paradoxus exhibits a higher accumulation rate of balanced structural rearrangements (inversions, translocations and transpositions), whereas S. cerevisiae accumulates unbalanced rearrangements more rapidly.

Abstract: Structural rearrangements have long been recognized as an important source of genetic variation with implications in phenotypic diversity and disease, yet their evolutionary dynamics are difficult to characterize with short-read sequencing. Here, we report long-read sequencing for 12 strains representing major subpopulations of the partially domesticated yeast Saccharomyces cerevisiae and its wild relative Saccharomyces paradoxus. Complete genome assemblies and annotations generate population-level reference genomes and allow for the first explicit definition of chromosome partitioning into cores, subtelomeres and chromosome-ends. High-resolution view of structural dynamics uncovers that, in chromosomal cores, S. paradoxus exhibits higher accumulation rate of balanced structural rearrangements (inversions, translocations and transpositions) whereas S. cerevisiae accumulates unbalanced rearrangements (large insertions, deletions and duplications) more rapidly. In subtelomeres, recurrent interchromosomal reshuffling was found in both species, with higher rate in S. cerevisiae. Such striking contrasts between wild and domesticated yeasts reveal the influence of human activities on structural genome evolution.

Posted Content•DOI•

Whole genome view of the consequences of a population bottleneck using 2926 genome sequences from Finland and United Kingdom

[...]

Himanshu Chheda¹, Priit Palta¹, Matti Pirinen¹, Shane A. McCarthy², Klaudia Walter², Seppo Koskinen³, Veikko Salomaa³, Mark J. Daly⁴, Richard Durbin², Aarno Palotie¹, Tero Aittokallio¹, Samuli Ripatti¹•Institutions (4)

University of Helsinki¹, Wellcome Trust Sanger Institute², National Institutes of Health³, Harvard University⁴

12 Jul 2016-bioRxiv

TL;DR: A significant depletion of variants in the rare frequency spectrum was observed in Finns when comparing the two populations and these functional categories represent the highest a priori power for downstream association studies of rare variants using population isolates.

Abstract: Isolated populations with enrichment of variants due to recent population bottlenecks provide a powerful resource for identifying disease-associated genetic variants and genes. As a model of an isolate population, we sequenced the genomes of 1463 Finnish individuals as part of the Sequencing Initiative Suomi (SISu) Project. We compared the genomic profiles of the 1463 Finns to a sample of 1463 British individuals that were sequenced in parallel as part of the UK10K Project. Whereas there were no major differences in the allele frequency of common variants, a significant depletion of variants in the rare frequency spectrum was observed in Finns when comparing the two populations. On the other hand, we observed >2.1 million variants that were twice as frequent among Finns compared to Britons and 800,000 variants that were more than 10 times more frequent in Finns. Furthermore, in Finns we observed a relative proportional enrichment of variants in the minor allele frequency range between 2 - 5% (p

Posted Content•DOI•

Using reference-free compressed data structures to analyse sequencing reads from thousands of human genomes

[...]

Dirk-Dominic Dolle¹, Zhicheng Liu¹, Matthew Cotten¹, Jared T. Simpson², Zamin Iqbal³, Richard Durbin¹, Shane A. McCarthy¹, Thomas M. Keane¹•Institutions (3)

Wellcome Trust Sanger Institute¹, Ontario Institute for Cancer Research², University of Oxford³

22 Jun 2016-bioRxiv

TL;DR: The concept of a population BWT is introduced and used to store and index the sequencing reads of 2,705 samples from the 1000 Genomes Project and it is shown that as more genomes are added, identical read sequences are increasingly observed and compression becomes more efficient.

Abstract: We are rapidly approaching the point where we have sequenced millions of human genomes. There is a pressing need for new data structures to store raw sequencing data and efficient algorithms for population scale analysis. Current reference based data formats do not fully exploit the redundancy in population sequencing nor take advantage of shared genetic variation. In recent years, the Burrows-Wheeler transform (BWT) and FM-index have been widely employed as a full text searchable index for read alignment and de novo assembly. We introduce the concept of a population BWT and use it to store and index the sequencing reads of 2,705 samples from the 1000 Genomes Project. A key feature is that as more genomes are added, identical read sequences are increasingly observed and compression becomes more efficient. We assess the support in the 1000 Genomes read data for every base position of two human reference assembly versions, identifying that 3.2 Mbp with population support was lost in the transition from GRCh37 with 13.7 Mbp added to GRCh38. We show that the vast majority of variant alleles can be uniquely described by overlapping 31-mers and show how rapid and accurate SNP and indel genotyping can be carried out across the genomes in the population BWT. We use the population BWT to carry out non-reference queries to search for the presence of all known viral genomes, and discover human T-lymphotropic virus 1 integrations in six samples in a recognised epidemiological distribution.
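As a rough illustration of how a BWT with an FM-index style backward search supports exact-match queries such as k-mer lookups, here is a minimal single-string sketch. It is not the population BWT described in the abstract, which indexes reads from thousands of samples; the function names `bwt` and `fm_count`, the `$` sentinel, and the naive rotation sort are assumptions chosen for brevity.

```python
def bwt(text):
    """Burrows-Wheeler transform via sorted rotations (fine for toy inputs only)."""
    text = text + "$"  # sentinel; assumed absent from the input and smallest in order
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def fm_count(bwt_string, pattern):
    """Count occurrences of pattern in the original text by backward search."""
    # first[c]: number of characters in the text that sort strictly before c.
    counts = {}
    for ch in bwt_string:
        counts[ch] = counts.get(ch, 0) + 1
    first, total = {}, 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]

    def occ(ch, i):
        # Occurrences of ch in bwt_string[:i]; a real index would precompute this.
        return bwt_string[:i].count(ch)

    lo, hi = 0, len(bwt_string)  # current half-open range of sorted rotations
    for ch in reversed(pattern):
        if ch not in first:
            return 0
        lo = first[ch] + occ(ch, lo)
        hi = first[ch] + occ(ch, hi)
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    index = bwt("ACGTACGTTACG")
    print(fm_count(index, "ACG"))
```

The search cost depends on the pattern length rather than on the size of the indexed text once rank queries are precomputed, which is one reason this kind of index is attractive for repeated queries over large read collections.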

Posted Content•DOI•

Recombination Suppression is Unlikely to Contribute to Speciation in Sympatric Heliconius Butterflies

[...]

John W. Davey¹, Sarah L. Barker¹, Pasi Rastas¹, Ana Pinharanda¹, Simon H. Martin¹, Richard Durbin², Richard M. Merrill¹, Chris D. Jiggins¹•Institutions (2)

University of Cambridge¹, Wellcome Trust Sanger Institute²

27 Oct 2016-bioRxiv

TL;DR: Deep sequencing of large crosses of butterflies is used to show that there are no long chromosome regions that are not broken up during hybridisation, and no long chromosome inversions anywhere between the two genomes, which suggests that hybridisation is rare enough and mate preference is strong enough that inversions are not necessary to maintain the species barrier.

Abstract: Mechanisms that suppress recombination are known to help maintain species barriers by preventing the breakup of co-adapted gene combinations. The sympatric butterfly species H. melpomene and H. cydno are separated by many strong barriers, but the species still hybridise infrequently in the wild, with around 40% of the genome influenced by introgression. We tested the hypothesis that genetic barriers between the species are reinforced by inversions or other mechanisms to reduce between-species recombination rate. We constructed fine-scale recombination maps for Panamanian populations of both species and hybrids to directly measure recombination rate between these species, and generated long sequence reads to detect inversions. We find no evidence for a systematic reduction in recombination rates in F1 hybrids, and also no evidence for inversions longer than 50 kb that might be involved in generating or maintaining species barriers. This suggests that mechanisms leading to global or local reduction in recombination do not play a significant role in the maintenance of species barriers between H. melpomene and H. cydno.
