Institution

Wellcome Trust Centre for Human Genetics

Facility · Oxford, United Kingdom
About: Wellcome Trust Centre for Human Genetics is a facility based in Oxford, United Kingdom. It is known for research contributions on the topics Population and Genome-wide association study. The organization has 2122 authors who have published 4269 publications receiving 433899 citations.


Papers
Journal Article
TL;DR: Presents Stampy, a read mapper that uses a hybrid mapping algorithm and a detailed statistical model to achieve both speed and sensitivity, particularly when reads include sequence variation, giving a higher useable sequence yield and improved accuracy compared to existing software.
Abstract: High-volume sequencing of DNA and RNA is now within reach of any research laboratory and is quickly becoming established as a key research tool. In many workflows, each of the short sequences ("reads") resulting from a sequencing run is first "mapped" (aligned) to a reference sequence to infer the genomic location from which the read derived, a challenging task because of the high data volumes and often large genomes. Existing read mapping software excels in either speed (e.g., BWA, Bowtie, ELAND) or sensitivity (e.g., Novoalign), but not in both. In addition, performance often deteriorates in the presence of sequence variation, particularly so for short insertions and deletions (indels). Here, we present a read mapper, Stampy, which uses a hybrid mapping algorithm and a detailed statistical model to achieve both speed and sensitivity, particularly when reads include sequence variation. This results in a higher useable sequence yield and improved accuracy compared to that of existing software.

1,184 citations

Journal Article
TL;DR: A general approach that can accommodate nuclear families of any size, with or without parental information, is constructed, and it is shown that, when siblings are available, the total number of genotypes required in order to achieve comparable power is smaller if parents are not genotyped.
Abstract: High-resolution mapping is an important step in the identification of complex disease genes. In outbred populations, linkage disequilibrium is expected to operate over short distances and could provide a powerful fine-mapping tool. Here we build on recently developed methods for linkage-disequilibrium mapping of quantitative traits to construct a general approach that can accommodate nuclear families of any size, with or without parental information. Variance components are used to construct a test that utilizes information from all available offspring but that is not biased in the presence of linkage or familiality. A permutation test is described for situations in which maximum-likelihood estimates of the variance components are biased. Simulation studies are used to investigate power and error rates of this approach and to highlight situations in which violations of multivariate normality assumptions warrant the permutation test. The relationship between power and the level of linkage disequilibrium for this test suggests that the method is well suited to the analysis of dense maps. The relationship between power and family structure is investigated, and these results are applicable to study design in complex disease, especially for late-onset conditions for which parents are usually not available. When parental genotypes are available, power does not depend greatly on the number of offspring in each family. Power decreases when parental genotypes are not available, but the loss in power is negligible when four or more offspring per family are genotyped. Finally, it is shown that, when siblings are available, the total number of genotypes required in order to achieve comparable power is smaller if parents are not genotyped.

1,173 citations

Journal Article
TL;DR: The results show that trio-based exome sequencing is a powerful approach for identifying new candidate genes for ASDs and suggest that de novo mutations may contribute substantially to the genetic etiology of ASDs.
Abstract: Evidence for the etiology of autism spectrum disorders (ASDs) has consistently pointed to a strong genetic component complicated by substantial locus heterogeneity. We sequenced the exomes of 20 individuals with sporadic ASD (cases) and their parents, reasoning that these families would be enriched for de novo mutations of major effect. We identified 21 de novo mutations, 11 of which were protein altering. Protein-altering mutations were significantly enriched for changes at highly conserved residues. We identified potentially causative de novo events in 4 out of 20 probands, particularly among more severely affected individuals, in FOXP1, GRIN2B, SCN1A and LAMC3. In the FOXP1 mutation carrier, we also observed a rare inherited CNTNAP2 missense variant, and we provide functional support for a multi-hit model for disease risk. Our results show that trio-based exome sequencing is a powerful approach for identifying new candidate genes for ASDs and suggest that de novo mutations may contribute substantially to the genetic etiology of ASDs.

1,116 citations

Journal Article
TL;DR: This protocol details the steps for data quality assessment and control that are typically carried out during case-control association studies, including the identification and removal of DNA samples and markers that introduce bias.
Abstract: This protocol details the steps for data quality assessment and control that are typically carried out during case-control association studies. The steps described involve the identification and removal of DNA samples and markers that introduce bias. These critical steps are paramount to the success of a case-control study and are necessary before statistically testing for association. We describe how to use PLINK, a tool for handling SNP data, to perform assessments of failure rate per individual and per SNP and to assess the degree of relatedness between individuals. We also detail other quality-control procedures, including the use of SMARTPCA software for the identification of ancestral outliers. These platforms were selected because they are user-friendly, widely used and computationally efficient. Steps needed to detect and establish a disease association using case-control data are not discussed here. Issues concerning study design and marker selection in case-control studies have been discussed in our earlier protocols. This protocol, which is routinely used in our labs, should take approximately 8 h to complete.

1,106 citations
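
The protocol abstract above mentions assessing the failure rate per individual and per SNP. As a rough illustration of that idea only, here is a minimal Python sketch that computes missing-call rates from a small in-memory genotype matrix. It is not the protocol's PLINK workflow: the function names, the `None`-for-missing encoding, the toy data and the thresholds are all assumptions made for this example.

```python
from typing import List, Optional

# Assumed encoding: 0, 1 or 2 copies of an allele; None marks a failed (missing) call.
Genotype = Optional[int]


def missing_rate_per_sample(genotypes: List[List[Genotype]]) -> List[float]:
    """Fraction of missing calls for each sample (one row per sample)."""
    return [sum(g is None for g in row) / len(row) for row in genotypes]


def missing_rate_per_snp(genotypes: List[List[Genotype]]) -> List[float]:
    """Fraction of missing calls for each SNP (one column per SNP)."""
    n_samples = len(genotypes)
    n_snps = len(genotypes[0])
    return [
        sum(genotypes[i][j] is None for i in range(n_samples)) / n_samples
        for j in range(n_snps)
    ]


def flag_above(rates: List[float], threshold: float) -> List[int]:
    """Indices whose missing rate exceeds the (illustrative) threshold."""
    return [i for i, rate in enumerate(rates) if rate > threshold]


# Toy data: 4 samples (rows) by 5 SNPs (columns), invented for this sketch.
toy_genotypes: List[List[Genotype]] = [
    [0, 1, 2, 0, 1],
    [0, None, 2, None, 1],
    [1, 1, None, 0, 0],
    [2, 0, None, 0, 1],
]

sample_rates = missing_rate_per_sample(toy_genotypes)
snp_rates = missing_rate_per_snp(toy_genotypes)

# Thresholds are arbitrary here; real studies choose them per dataset.
bad_samples = flag_above(sample_rates, threshold=0.3)
bad_snps = flag_above(snp_rates, threshold=0.4)

print("per-sample missing rate:", sample_rates)
print("per-SNP missing rate:", snp_rates)
print("samples flagged:", bad_samples, "SNPs flagged:", bad_snps)
```

In practice these summaries would come from dedicated software operating on full genotype files; the sketch only shows what "failure rate per individual and per SNP" means as a calculation.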

Journal Article
Aysu Okbay, Jonathan P. Beauchamp, Mark Alan Fontana, James J. Lee, +293 more · Institutions (81)
26 May 2016 · Nature
TL;DR: Reports the results of a genome-wide association study (GWAS) for educational attainment, showing that single-nucleotide polymorphisms associated with educational attainment disproportionately occur in genomic regions regulating gene expression in the fetal brain.
Abstract: Educational attainment is strongly influenced by social and other environmental factors, but genetic factors are estimated to account for at least 20% of the variation across individuals. Here we report the results of a genome-wide association study (GWAS) for educational attainment that extends our earlier discovery sample of 101,069 individuals to 293,723 individuals, and a replication study in an independent sample of 111,349 individuals from the UK Biobank. We identify 74 genome-wide significant loci associated with the number of years of schooling completed. Single-nucleotide polymorphisms associated with educational attainment are disproportionately found in genomic regions regulating gene expression in the fetal brain. Candidate genes are preferentially expressed in neural tissue, especially during the prenatal period, and enriched for biological pathways involved in neural development. Our findings demonstrate that, even for a behavioural phenotype that is mostly environmentally determined, a well-powered GWAS identifies replicable associated genetic variants that suggest biologically relevant pathways. Because educational attainment is measured in large numbers of individuals, it will continue to be useful as a proxy phenotype in efforts to characterize the genetic influences of related phenotypes, including cognition and neuropsychiatric diseases.

1,102 citations


Authors

Showing all 2127 results

Name                    H-index  Papers  Citations
Mark I. McCarthy        200      1028    187898
John P. A. Ioannidis    185      1311    193612
Gonçalo R. Abecasis     179      595     230323
Simon I. Hay            165      557     153307
Robert Plomin           151      1104    88588
Ashok Kumar             151      5654    164086
Julian Parkhill         149      759     104736
James F. Wilson         146      677     101883
Jeremy K. Nicholson     141      773     80275
Hugh Watkins            128      524     91317
Erik Ingelsson          124      538     85407
Claudia Langenberg      124      452     67326
Adrian V. S. Hill       122      589     64613
John A. Todd            121      515     67413
Elaine Holmes           119      560     58975

Network Information
Related Institutions (5)
Howard Hughes Medical Institute
34.6K papers, 5.2M citations

94% related

National Institutes of Health
297.8K papers, 21.3M citations

94% related

University of Massachusetts Medical School
31.8K papers, 1.9M citations

93% related

Laboratory of Molecular Biology
24.2K papers, 2.1M citations

93% related

Fred Hutchinson Cancer Research Center
30.9K papers, 2.2M citations

92% related

Performance
Metrics
No. of papers from the Institution in previous years
Year  Papers
2022  21
2021  83
2020  74
2019  134
2018  182
2017  323