Institution
Wellcome Trust Sanger Institute
Nonprofit • Cambridge, United Kingdom
About: Wellcome Trust Sanger Institute is a nonprofit organization based in Cambridge, United Kingdom. It is known for research contributions in the topics Population and Genome. The organization has 4009 authors who have published 9671 publications receiving 1224479 citations.
Topics: Population, Genome, Gene, Genome-wide association study, Genomics
Papers
University of Oxford; University of Michigan; Wellcome Trust Sanger Institute; Amgen; University of Cambridge; University of Copenhagen; University of Liverpool; University of Freiburg; Boston University; University of Tartu; Erasmus University Medical Center; Leiden University Medical Center; Pasteur Institute; Icahn School of Medicine at Mount Sinai; UCLA Medical Center; Vanderbilt University Medical Center; Wake Forest University; National University of Singapore; London North West Healthcare NHS Trust; Imperial College London; Charité; Innsbruck Medical University; Washington University in St. Louis; Queen Mary University of London; University of Southern Denmark; National and Kapodistrian University of Athens; Robertson Centre for Biostatistics; University of Exeter; Uppsala University; University of Düsseldorf; Steno Diabetes Center; Aalborg University; University of Eastern Finland; Broad Institute; Frederiksberg Hospital; University of Bergen; Lund University; Technische Universität München; University of North Carolina at Chapel Hill; University of Edinburgh; Ninewells Hospital; University of Minnesota; University of Glasgow; Ludwig Maximilian University of Munich; University of Iceland; Aarhus University; Science for Life Laboratory; Stanford University; University of Helsinki; National Institutes of Health; University of Dundee; Harvard University
TL;DR: Combining 32 genome-wide association studies with high-density imputation provides a comprehensive view of the genetic contribution to type 2 diabetes in individuals of European ancestry with respect to locus discovery, causal-variant resolution, and mechanistic insight.
Abstract: We expanded GWAS discovery for type 2 diabetes (T2D) by combining data from 898,130 European-descent individuals (9% cases), after imputation to high-density reference panels. With these data, we (i) extend the inventory of T2D-risk variants (243 loci, 135 newly implicated in T2D predisposition, comprising 403 distinct association signals); (ii) enrich discovery of lower-frequency risk alleles (80 index variants with minor allele frequency <5%, 14 with estimated allelic odds ratio >2); (iii) substantially improve fine-mapping of causal variants (at 51 signals, one variant accounted for >80% posterior probability of association (PPA)); (iv) extend fine-mapping through integration of tissue-specific epigenomic information (islet regulatory annotations extend the number of variants with PPA >80% to 73); (v) highlight validated therapeutic targets (18 genes with associations attributable to coding variants); and (vi) demonstrate enhanced potential for clinical translation (genome-wide chip heritability explains 18% of T2D risk; individuals in the extremes of a T2D polygenic risk score differ more than ninefold in prevalence).
1,136 citations
Howard Hughes Medical Institute; Broad Institute; Harvard University; University of California, Berkeley; University of California, Los Angeles; Chinese Academy of Sciences; Max Planck Society; Columbia University; Massachusetts Institute of Technology; Cayetano Heredia University; University of Pennsylvania; University College London; University of Bern; Leiden University; Nanyang Technological University; University of Chicago; Estonian Biocentre; National University of La Plata; University of Oxford; University of Bergen; Novosibirsk State University; Moscow Institute of Physics and Technology; Sofia Medical University; Armenian National Academy of Sciences; Wellcome Trust Sanger Institute; Raja Isteri Pengiran Anak Saleha Hospital; Case Western Reserve University; University of Tartu; Estonian Academy of Sciences; Stony Brook University; Illumina; Gladstone Institutes; University of Helsinki; University of Washington; Bashkir State University; Jaramogi Oginga Odinga University of Science and Technology; Pompeu Fabra University; University of Arizona; University of Cambridge; Leidos; Université de Montréal; University of Utah; Altai State University; Council of Scientific and Industrial Research
TL;DR: It is demonstrated that indigenous Australians, New Guineans and Andamanese do not derive substantial ancestry from an early dispersal of modern humans; instead, their modern human ancestry is consistent with coming from the same source as that of other non-Africans.
Abstract: Here we report the Simons Genome Diversity Project data set: high quality genomes from 300 individuals from 142 diverse populations. These genomes include at least 5.8 million base pairs that are not present in the human reference genome. Our analysis reveals key features of the landscape of human genome variation, including that the rate of accumulation of mutations has accelerated by about 5% in non-Africans compared to Africans since divergence. We show that the ancestors of some pairs of present-day human populations were substantially separated by 100,000 years ago, well before the archaeologically attested onset of behavioural modernity. We also demonstrate that indigenous Australians, New Guineans and Andamanese do not derive substantial ancestry from an early dispersal of modern humans; instead, their modern human ancestry is consistent with coming from the same source as that of other non-Africans.
1,133 citations
TL;DR: It is shown that the single-cell latent variable model (scLVM) allows the identification of otherwise undetectable subpopulations of cells that correspond to different stages during the differentiation of naive T cells into T helper 2 cells.
Abstract: Hidden cell subpopulations are detected by accounting for confounding variation in the analysis of single-cell RNA-seq data. Recent technical developments have enabled the transcriptomes of hundreds of cells to be assayed in an unbiased manner, opening up the possibility that new subpopulations of cells can be found. However, the effects of potential confounding factors, such as the cell cycle, on the heterogeneity of gene expression and therefore on the ability to robustly identify subpopulations remain unclear. We present and validate a computational approach that uses latent variable models to account for such hidden factors. We show that our single-cell latent variable model (scLVM) allows the identification of otherwise undetectable subpopulations of cells that correspond to different stages during the differentiation of naive T cells into T helper 2 cells. Our approach can be used not only to identify cellular subpopulations but also to tease apart different sources of gene expression heterogeneity in single-cell transcriptomes.
1,132 citations
TL;DR: This article describes a computational workflow for low-level analyses of scRNA-seq data, based primarily on software packages from the open-source Bioconductor project, which covers basic steps including quality control, data exploration and normalization, as well as more complex procedures such as cell cycle phase assignment.
Abstract: Single-cell RNA sequencing (scRNA-seq) is widely used to profile the transcriptome of individual cells. This provides biological resolution that cannot be matched by bulk RNA sequencing, at the cost of increased technical noise and data complexity. The differences between scRNA-seq and bulk RNA-seq data mean that the analysis of the former cannot be performed by recycling bioinformatics pipelines for the latter. Rather, dedicated single-cell methods are required at various steps to exploit the cellular resolution while accounting for technical noise. This article describes a computational workflow for low-level analyses of scRNA-seq data, based primarily on software packages from the open-source Bioconductor project. It covers basic steps including quality control, data exploration and normalization, as well as more complex procedures such as cell cycle phase assignment, identification of highly variable and correlated genes, clustering into subpopulations and marker gene detection. Analyses were demonstrated on gene-level count data from several publicly available datasets involving haematopoietic stem cells, brain-derived cells, T-helper cells and mouse embryonic stem cells. This will provide a range of usage scenarios from which readers can construct their own analysis pipelines.
1,128 citations
TL;DR: Using an improved human mutation rate model, the authors classify human protein-coding genes along a spectrum representing tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve gene discovery power for both common and rare diseases.
Abstract: Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes critical for an organism’s function will be depleted for such variants in natural populations, while non-essential genes will tolerate their accumulation. However, predicted loss-of-function (pLoF) variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes. Here, we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence pLoF variants in this cohort after filtering for sequencing and annotation artifacts. Using an improved model of human mutation, we classify human protein-coding genes along a spectrum representing intolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve gene discovery power for both common and rare diseases.
1,128 citations
Authors
Name | H-index | Papers | Citations |
---|---|---|---|
Nicholas J. Wareham | 212 | 1657 | 204896 |
Gonçalo R. Abecasis | 179 | 595 | 230323 |
Panos Deloukas | 162 | 410 | 154018 |
Michael R. Stratton | 161 | 443 | 142586 |
David W. Johnson | 160 | 2714 | 140778 |
Michael John Owen | 160 | 1110 | 135795 |
Naveed Sattar | 155 | 1326 | 116368 |
Robert E. W. Hancock | 152 | 775 | 88481 |
Julian Parkhill | 149 | 759 | 104736 |
Nilesh J. Samani | 149 | 779 | 113545 |
Michael Conlon O'Donovan | 142 | 736 | 118857 |
Jian Yang | 142 | 1818 | 111166 |
Christof Koch | 141 | 712 | 105221 |
Andrew G. Clark | 140 | 823 | 123333 |
Stylianos E. Antonarakis | 138 | 746 | 93605 |