Institution

Wellcome Trust Sanger Institute

Nonprofit, Cambridge, United Kingdom
About: Wellcome Trust Sanger Institute is a nonprofit organization based in Cambridge, United Kingdom. It is known for research contributions in the topics Population and Genome. The organization has 4009 authors who have published 9671 publications receiving 1224479 citations.


Papers
Journal Article
29 Aug 2014-Science
TL;DR: It is proposed that because of a truly complex genetic background, tame behavior in rabbits and other domestic animals evolved by shifts in allele frequencies at many loci, rather than by critical changes at only a few domestication loci.
Abstract: The genetic changes underlying the initial steps of animal domestication are still poorly understood. We generated a high-quality reference genome for the rabbit and compared it to resequencing data from populations of wild and domestic rabbits. We identified more than 100 selective sweeps specific to domestic rabbits but only a relatively small number of fixed (or nearly fixed) single-nucleotide polymorphisms (SNPs) for derived alleles. SNPs with marked allele frequency differences between wild and domestic rabbits were enriched for conserved noncoding sites. Enrichment analyses suggest that genes affecting brain and neuronal development have often been targeted during domestication. We propose that because of a truly complex genetic background, tame behavior in rabbits and other domestic animals evolved by shifts in allele frequencies at many loci, rather than by critical changes at only a few domestication loci.

328 citations

Journal Article
13 Oct 2017-Science
TL;DR: A strategy that can be used to explore the origin of cancer-associated mutational signatures is developed and it is found that mutation accumulation in organoids deficient in the mismatch repair gene MLH1 is driven by replication errors and accurately models the mutation profiles observed in mismatch repair–deficient colorectal cancers.
Abstract: Mutational processes underlie cancer initiation and progression. Signatures of these processes in cancer genomes may explain cancer etiology and could hold diagnostic and prognostic value. We developed a strategy that can be used to explore the origin of cancer-associated mutational signatures. We used CRISPR-Cas9 technology to delete key DNA repair genes in human colon organoids, followed by delayed subcloning and whole-genome sequencing. We found that mutation accumulation in organoids deficient in the mismatch repair gene MLH1 is driven by replication errors and accurately models the mutation profiles observed in mismatch repair–deficient colorectal cancers. Application of this strategy to the cancer predisposition gene NTHL1, which encodes a base excision repair protein, revealed a mutational footprint (signature 30) previously observed in a breast cancer cohort. We show that signature 30 can arise from germline NTHL1 mutations.

327 citations

Posted Content
20 Apr 2018-bioRxiv
TL;DR: This work presents a method, SoupX, for quantifying the extent of the contamination and estimating "background-corrected" cell expression profiles that can be integrated with existing downstream analysis tools, and shows that applying this method reduces batch effects, strengthens cell-specific quality control and improves biological interpretation.
Abstract: Droplet based single cell RNA sequence analyses assume all acquired RNAs are endogenous to cells. However, any cell free RNAs contained within the input solution are also captured by these assays. This sequencing of cell free RNA constitutes a background contamination that has the potential to confound the correct biological interpretation of single cell transcriptomic data. Here, we demonstrate that contamination from this "soup" of cell free RNAs is ubiquitous, experiment specific in its composition and magnitude, and can lead to erroneous biological conclusions. We present a method, SoupX, for quantifying the extent of the contamination and estimating "background corrected", cell expression profiles that can be integrated with existing downstream analysis tools. We apply this method to two data-sets and show that the application of this method reduces batch effects, strengthens cell-specific quality control and improves biological interpretation.

327 citations

Journal Article
01 Dec 2016
TL;DR: Kaptive, a novel software tool that automates the process of identifying K-loci based on full locus information extracted from whole genome sequences, is introduced, highlighting the extensive diversity of Klebsiella K-loci and the proteins that they encode.
Abstract: Klebsiella pneumoniae is a growing cause of healthcare-associated infections for which multi-drug resistance is a concern. Its polysaccharide capsule is a major virulence determinant and epidemiological marker. However, little is known about capsule epidemiology since serological typing is not widely accessible and many isolates are serologically non-typeable. Molecular typing techniques provide useful insights, but existing methods fail to take full advantage of the information in whole genome sequences. We investigated the diversity of the capsule synthesis loci (K-loci) among 2503 K. pneumoniae genomes. We incorporated analyses of full-length K-locus nucleotide sequences and also clustered protein-encoding sequences to identify, annotate and compare K-locus structures. We propose a standardized nomenclature for K-loci and present a curated reference database. A total of 134 distinct K-loci were identified, including 31 novel types. Comparative analyses indicated 508 unique protein-encoding gene clusters that appear to reassort via homologous recombination. Extensive intra- and inter-locus nucleotide diversity was detected among the wzi and wzc genes, indicating that current molecular typing schemes based on these genes are inadequate. As a solution, we introduce Kaptive, a novel software tool that automates the process of identifying K-loci based on full locus information extracted from whole genome sequences (https://github.com/katholt/Kaptive). This work highlights the extensive diversity of Klebsiella K-loci and the proteins that they encode. The nomenclature, reference database and novel typing method presented here will become essential resources for genomic surveillance and epidemiological investigations of this pathogen.

325 citations

Journal Article
Monika Karmin, Lauri Saag, Mário Vicente, Melissa A. Wilson Sayres, Mari Järve, Ulvi Gerst Talas, Siiri Rootsi, Anne-Mai Ilumäe, Reedik Mägi, Mario Mitt, Luca Pagani, Tarmo Puurand, Zuzana Faltyskova, Florian Clemente, Alexia Cardona, Ene Metspalu, Hovhannes Sahakyan, Bayazit Yunusbayev, Georgi Hudjashov, Michael DeGiorgio, Eva Liis Loogväli, Christina A. Eichstaedt, Mikk Eelmets, Gyaneshwer Chaubey, Kristiina Tambets, S. S. Litvinov, Maru Mormina, Yali Xue, Qasim Ayub, Grigor Zoraqi, Thorfinn Sand Korneliussen, Farida Akhatova, Joseph Lachance, Sarah A. Tishkoff, Kuvat T. Momynaliev, François-Xavier Ricaut, Pradiptajati Kusuma, Harilanto Razafindrazaka, Denis Pierron, Murray P. Cox, Gazi Nurun Nahar Sultana, Rane Willerslev, Craig Muller, Michael C. Westaway, David M. Lambert, Vedrana Škaro, Lejla Kovacevic, Shahlo Turdikulova, Dilbar Dalimova, Rita Khusainova, N. N. Trofimova, V. L. Akhmetova, I. M. Khidiyatova, Daria V. Lichman, Jainagul Isakova, Elvira Pocheshkhova, Zhaxylyk Sabitov, Nikolay A. Barashkov, Pagbajabyn Nymadawa, Evelin Mihailov, Joseph Wee Tien Seng, Irina Evseeva, Andrea Bamberg Migliano, S M Abdullah, George Andriadze, Dragan Primorac, L. A. Atramentova, Olga Utevska, Levon Yepiskoposyan, Damir Marjanović, Alena Kushniarevich, Doron M. Behar, Christian Gilissen, Lisenka E.L.M. Vissers, Joris A. Veltman, Elena Balanovska, Miroslava Derenko, Boris Malyarchuk, Andres Metspalu, Sardana A. Fedorova, Anders Eriksson, Andrea Manica, Fernando L. Mendez, Tatiana M. Karafet, Krishna R. Veeramah, Neil Bradman, Michael F. Hammer, Ludmila P. Osipova, Oleg Balanovsky, Elza Khusnutdinova, Knut Johnsen, Maido Remm, Mark G. Thomas, Chris Tyler-Smith, Peter A. Underhill, Eske Willerslev, Rasmus Nielsen, Mait Metspalu, Richard Villems, Toomas Kivisild
TL;DR: A study of 456 geographically diverse high-coverage Y chromosome sequences, including 299 newly reported samples, infers a second strong bottleneck in Y-chromosome lineages dating to the last 10 ky and hypothesizes that this bottleneck is caused by cultural changes affecting variance of reproductive success among males.
Abstract: It is commonly thought that human genetic diversity in non-African populations was shaped primarily by an out-of-Africa dispersal 50-100 thousand yr ago (kya). Here, we present a study of 456 geographically diverse high-coverage Y chromosome sequences, including 299 newly reported samples. Applying ancient DNA calibration, we date the Y-chromosomal most recent common ancestor (MRCA) in Africa at 254 (95% CI 192-307) kya and detect a cluster of major non-African founder haplogroups in a narrow time interval at 47-52 kya, consistent with a rapid initial colonization model of Eurasia and Oceania after the out-of-Africa bottleneck. In contrast to demographic reconstructions based on mtDNA, we infer a second strong bottleneck in Y-chromosome lineages dating to the last 10 ky. We hypothesize that this bottleneck is caused by cultural changes affecting variance of reproductive success among males.

325 citations


Authors

Showing all 4058 results

Name | H-index | Papers | Citations
Nicholas J. Wareham | 212 | 1657 | 204896
Gonçalo R. Abecasis | 179 | 595 | 230323
Panos Deloukas | 162 | 410 | 154018
Michael R. Stratton | 161 | 443 | 142586
David W. Johnson | 160 | 2714 | 140778
Michael John Owen | 160 | 1110 | 135795
Naveed Sattar | 155 | 1326 | 116368
Robert E. W. Hancock | 152 | 775 | 88481
Julian Parkhill | 149 | 759 | 104736
Nilesh J. Samani | 149 | 779 | 113545
Michael Conlon O'Donovan | 142 | 736 | 118857
Jian Yang | 142 | 1818 | 111166
Christof Koch | 141 | 712 | 105221
Andrew G. Clark | 140 | 823 | 123333
Stylianos E. Antonarakis | 138 | 746 | 93605
Network Information
Related Institutions (5)
Broad Institute
11.6K papers, 1.5M citations

96% related

Howard Hughes Medical Institute
34.6K papers, 5.2M citations

95% related

Laboratory of Molecular Biology
24.2K papers, 2.1M citations

94% related

Salk Institute for Biological Studies
13.1K papers, 1.6M citations

93% related

National Institutes of Health
297.8K papers, 21.3M citations

93% related

Performance
Metrics
No. of papers from the Institution in previous years
Year | Papers
2023 | 17
2022 | 70
2021 | 836
2020 | 810
2019 | 854
2018 | 764