Institution

Wellcome Trust Sanger Institute

Nonprofit, Cambridge, United Kingdom
About: Wellcome Trust Sanger Institute is a nonprofit organization based in Cambridge, United Kingdom. It is known for research contributions in the topics Population and Genome. The organization has 4,009 authors who have published 9,671 publications receiving 1,224,479 citations.


Papers
Journal Article
Anubha Mahajan, Daniel Taliun, Matthias Thurner, Neil R. Robertson, Jason M. Torres, N. William Rayner, Anthony Payne, Valgerdur Steinthorsdottir, Robert A. Scott, Niels Grarup, James P. Cook, Ellen M. Schmidt, Matthias Wuttke, Chloé Sarnowski, Reedik Mägi, Jana Nano, Christian Gieger, Stella Trompet, Cécile Lecoeur, Michael Preuss, Bram P. Prins, Xiuqing Guo, Lawrence F. Bielak, Jennifer E. Below, Donald W. Bowden, John C. Chambers, Young-Jin Kim, Maggie C.Y. Ng, Lauren E. Petty, Xueling Sim, Weihua Zhang, Amanda J. Bennett, Jette Bork-Jensen, Chad M. Brummett, Mickaël Canouil, Kai-Uwe Eckardt, Krista Fischer, Sharon L.R. Kardia, Florian Kronenberg, Kristi Läll, Ching-Ti Liu, Adam E. Locke, Jian'an Luan, Ioanna Ntalla, Vibe Nylander, Sebastian Schönherr, Claudia Schurmann, Loic Yengo, Erwin P. Bottinger, Ivan Brandslund, Cramer Christensen, George Dedoussis, Jose C. Florez, Ian Ford, Oscar H. Franco, Timothy M. Frayling, Vilmantas Giedraitis, Sophie Hackinger, Andrew T. Hattersley, Christian Herder, M. Arfan Ikram, Martin Ingelsson, Marit E. Jørgensen, Torben Jørgensen, Jennifer Kriebel, Johanna Kuusisto, Symen Ligthart, Cecilia M. Lindgren, Allan Linneberg, Valeriya Lyssenko, Vasiliki Mamakou, Thomas Meitinger, Karen L. Mohlke, Andrew D. Morris, Girish N. Nadkarni, James S. Pankow, Annette Peters, Naveed Sattar, Alena Stančáková, Konstantin Strauch, Kent D. Taylor, Barbara Thorand, Gudmar Thorleifsson, Unnur Thorsteinsdottir, Jaakko Tuomilehto, Daniel R. Witte, Josée Dupuis, Patricia A. Peyser, Eleftheria Zeggini, Ruth J. F. Loos, Philippe Froguel, Erik Ingelsson, Lars Lind, Leif Groop, Markku Laakso, Francis S. Collins, J. Wouter Jukema, Colin N. A. Palmer, Harald Grallert, Andres Metspalu, Abbas Dehghan, Anna Köttgen, Gonçalo R. Abecasis, James B. Meigs, Jerome I. Rotter, Jonathan Marchini, Oluf Pedersen, Torben Hansen, Claudia Langenberg, Nicholas J. Wareham, Kari Stefansson, Anna L. Gloyn, Andrew P. Morris, Michael Boehnke, Mark I. McCarthy
TL;DR: Combining 32 genome-wide association studies with high-density imputation provides a comprehensive view of the genetic contribution to type 2 diabetes in individuals of European ancestry with respect to locus discovery, causal-variant resolution, and mechanistic insight.
Abstract: We expanded GWAS discovery for type 2 diabetes (T2D) by combining data from 898,130 European-descent individuals (9% cases), after imputation to high-density reference panels. With these data, we (i) extend the inventory of T2D-risk variants (243 loci, 135 newly implicated in T2D predisposition, comprising 403 distinct association signals); (ii) enrich discovery of lower-frequency risk alleles (80 index variants with minor allele frequency <5%, 14 with estimated allelic odds ratio >2); (iii) substantially improve fine-mapping of causal variants (at 51 signals, one variant accounted for >80% posterior probability of association (PPA)); (iv) extend fine-mapping through integration of tissue-specific epigenomic information (islet regulatory annotations extend the number of variants with PPA >80% to 73); (v) highlight validated therapeutic targets (18 genes with associations attributable to coding variants); and (vi) demonstrate enhanced potential for clinical translation (genome-wide chip heritability explains 18% of T2D risk; individuals in the extremes of a T2D polygenic risk score differ more than ninefold in prevalence).

1,136 citations

Journal Article
Swapan Mallick, Heng Li, Mark Lipson, Iain Mathieson, Melissa Gymrek, Fernando Racimo, Mengyao Zhao, Niru Chennagiri, Susanne Nordenfelt, Arti Tandon, Pontus Skoglund, Iosif Lazaridis, Sriram Sankararaman, Qiaomei Fu, Nadin Rohland, Gabriel Renaud, Yaniv Erlich, Thomas Willems, Carla Gallo, Jeffrey P. Spence, Yun S. Song, Giovanni Poletti, Francois Balloux, George van Driem, Peter de Knijff, Irene Gallego Romero, Aashish R. Jha, Doron M. Behar, Claudio M. Bravi, Cristian Capelli, Tor Hervig, Andrés Moreno-Estrada, Olga L. Posukh, Elena Balanovska, Oleg Balanovsky, Sena Karachanak-Yankova, Hovhannes Sahakyan, Draga Toncheva, Levon Yepiskoposyan, Chris Tyler-Smith, Yali Xue, M. Syafiq Abdullah, Andres Ruiz-Linares, Cynthia M. Beall, Anna Di Rienzo, Choongwon Jeong, Elena B. Starikovskaya, Ene Metspalu, Jüri Parik, Richard Villems, Brenna M. Henn, Ugur Hodoglugil, Robert W. Mahley, Antti Sajantila, George Stamatoyannopoulos, Joseph Wee, Rita Khusainova, Elza Khusnutdinova, Sergey Litvinov, George Ayodo, David Comas, Michael F. Hammer, Toomas Kivisild, William Klitz, Cheryl A. Winkler, Damian Labuda, Michael J. Bamshad, Lynn B. Jorde, Sarah A. Tishkoff, W. Scott Watkins, Mait Metspalu, Stanislav Dryomov, Rem I. Sukernik, Lalji Singh, Kumarasamy Thangaraj, Svante Pääbo, Janet Kelso, Nick Patterson, David Reich
13 Oct 2016 - Nature
TL;DR: It is demonstrated that indigenous Australians, New Guineans and Andamanese do not derive substantial ancestry from an early dispersal of modern humans; instead, their modern human ancestry is consistent with coming from the same source as that of other non-Africans.
Abstract: Here we report the Simons Genome Diversity Project data set: high quality genomes from 300 individuals from 142 diverse populations. These genomes include at least 5.8 million base pairs that are not present in the human reference genome. Our analysis reveals key features of the landscape of human genome variation, including that the rate of accumulation of mutations has accelerated by about 5% in non-Africans compared to Africans since divergence. We show that the ancestors of some pairs of present-day human populations were substantially separated by 100,000 years ago, well before the archaeologically attested onset of behavioural modernity. We also demonstrate that indigenous Australians, New Guineans and Andamanese do not derive substantial ancestry from an early dispersal of modern humans; instead, their modern human ancestry is consistent with coming from the same source as that of other non-Africans.

1,133 citations

Journal Article
TL;DR: It is shown that the single-cell latent variable model (scLVM) allows the identification of otherwise undetectable subpopulations of cells that correspond to different stages during the differentiation of naive T cells into T helper 2 cells.
Abstract: Hidden cell subpopulations are detected by accounting for confounding variation in the analysis of single-cell RNA-seq data. Recent technical developments have enabled the transcriptomes of hundreds of cells to be assayed in an unbiased manner, opening up the possibility that new subpopulations of cells can be found. However, the effects of potential confounding factors, such as the cell cycle, on the heterogeneity of gene expression and therefore on the ability to robustly identify subpopulations remain unclear. We present and validate a computational approach that uses latent variable models to account for such hidden factors. We show that our single-cell latent variable model (scLVM) allows the identification of otherwise undetectable subpopulations of cells that correspond to different stages during the differentiation of naive T cells into T helper 2 cells. Our approach can be used not only to identify cellular subpopulations but also to tease apart different sources of gene expression heterogeneity in single-cell transcriptomes.

1,132 citations

Journal Article
TL;DR: This article describes a computational workflow for low-level analyses of scRNA-seq data, based primarily on software packages from the open-source Bioconductor project, which covers basic steps including quality control, data exploration and normalization, as well as more complex procedures such as cell cycle phase assignment.
Abstract: Single-cell RNA sequencing (scRNA-seq) is widely used to profile the transcriptome of individual cells. This provides biological resolution that cannot be matched by bulk RNA sequencing, at the cost of increased technical noise and data complexity. The differences between scRNA-seq and bulk RNA-seq data mean that the analysis of the former cannot be performed by recycling bioinformatics pipelines for the latter. Rather, dedicated single-cell methods are required at various steps to exploit the cellular resolution while accounting for technical noise. This article describes a computational workflow for low-level analyses of scRNA-seq data, based primarily on software packages from the open-source Bioconductor project. It covers basic steps including quality control, data exploration and normalization, as well as more complex procedures such as cell cycle phase assignment, identification of highly variable and correlated genes, clustering into subpopulations and marker gene detection. Analyses were demonstrated on gene-level count data from several publicly available datasets involving haematopoietic stem cells, brain-derived cells, T-helper cells and mouse embryonic stem cells. This will provide a range of usage scenarios from which readers can construct their own analysis pipelines.

1,128 citations

Posted Content
Konrad J. Karczewski, Laurent C. Francioli, Grace Tiao, Beryl B. Cummings, Jessica Alföldi, Qingbo Wang, Ryan L. Collins, Kristen M. Laricchia, Andrea Ganna, Daniel P. Birnbaum, Laura D. Gauthier, Harrison Brand, Matthew Solomonson, Nicholas A. Watts, Daniel R. Rhodes, Moriel Singer-Berk, Eleanor G. Seaby, Jack A. Kosmicki, Raymond K. Walters, Katherine Tashman, Yossi Farjoun, Eric Banks, Timothy Poterba, Arcturus Wang, Cotton Seed, Nicola Whiffin, Jessica X. Chong, Kaitlin E. Samocha, Emma Pierce-Hoffman, Zachary Zappala, Anne H. O’Donnell-Luria, Eric Vallabh Minikel, Ben Weisburd, Monkol Lek, James S. Ware, Christopher Vittal, Irina M. Armean, Louis Bergelson, Kristian Cibulskis, Kristen M. Connolly, Miguel Covarrubias, Stacey Donnelly, Steven Ferriera, Stacey Gabriel, Jeff Gentry, Namrata Gupta, Thibault Jeandet, Diane Kaplan, Christopher Llanwarne, Ruchi Munshi, Sam Novod, Nikelle Petrillo, David Roazen, Valentin Ruano-Rubio, Andrea Saltzman, Molly Schleicher, Jose Soto, Kathleen Tibbetts, Charlotte Tolonen, Gordon Wade, Michael E. Talkowski, Benjamin M. Neale, Mark J. Daly, Daniel G. MacArthur
30 Jan 2019 - bioRxiv
TL;DR: Using an improved human mutation rate model, the authors classify human protein-coding genes along a spectrum representing tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve gene discovery power for both common and rare diseases.
Abstract: Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes critical for an organism’s function will be depleted for such variants in natural populations, while non-essential genes will tolerate their accumulation. However, predicted loss-of-function (pLoF) variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes. Here, we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence pLoF variants in this cohort after filtering for sequencing and annotation artifacts. Using an improved model of human mutation, we classify human protein-coding genes along a spectrum representing intolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve gene discovery power for both common and rare diseases.

1,128 citations


Authors

Name                        H-index   Papers   Citations
Nicholas J. Wareham         212       1657     204896
Gonçalo R. Abecasis         179       595      230323
Panos Deloukas              162       410      154018
Michael R. Stratton         161       443      142586
David W. Johnson            160       2714     140778
Michael John Owen           160       1110     135795
Naveed Sattar               155       1326     116368
Robert E. W. Hancock        152       775      88481
Julian Parkhill             149       759      104736
Nilesh J. Samani            149       779      113545
Michael Conlon O'Donovan    142       736      118857
Jian Yang                   142       1818     111166
Christof Koch               141       712      105221
Andrew G. Clark             140       823      123333
Stylianos E. Antonarakis    138       746      93605

Network Information
Related Institutions (5)
Broad Institute: 11.6K papers, 1.5M citations, 96% related
Howard Hughes Medical Institute: 34.6K papers, 5.2M citations, 95% related
Laboratory of Molecular Biology: 24.2K papers, 2.1M citations, 94% related
Salk Institute for Biological Studies: 13.1K papers, 1.6M citations, 93% related
National Institutes of Health: 297.8K papers, 21.3M citations, 93% related

Performance
Metrics
No. of papers from the Institution in previous years
Year   Papers
2023   17
2022   70
2021   836
2020   810
2019   854
2018   764