Showing papers by "Daniel G. MacArthur" published in 2010


Journal Article
01 Apr 2010 - Nature
TL;DR: It is concluded that the heritability void left by genome-wide association studies will not be accounted for by common CNVs, and 30 loci with CNVs that are candidates for influencing disease susceptibility are identified.
Abstract: Structural variations of DNA greater than 1 kilobase in size account for most bases that vary among human genomes, but are still relatively under-ascertained. Here we use tiling oligonucleotide microarrays, comprising 42 million probes, to generate a comprehensive map of 11,700 copy number variations (CNVs) greater than 443 base pairs, of which most (8,599) have been validated independently. For 4,978 of these CNVs, we generated reference genotypes from 450 individuals of European, African or East Asian ancestry. The predominant mutational mechanisms differ among CNV size classes. Retrotransposition has duplicated and inserted some coding and non-coding DNA segments randomly around the genome. Furthermore, by correlation with known trait-associated single nucleotide polymorphisms (SNPs), we identified 30 loci with CNVs that are candidates for influencing disease susceptibility. Despite this, having assessed the completeness of our map and the patterns of linkage disequilibrium between CNVs and SNPs, we conclude that, for complex traits, the heritability void left by genome-wide association studies will not be accounted for by common CNVs.

1,892 citations


01 Oct 2010
TL;DR: The pilot phase of the 1000 Genomes Project is presented, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms, and the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants are described.

599 citations


Journal Article
TL;DR: Systematic, high-quality catalogues of LOF variants present in the genomes of healthy individuals, built from the output of large-scale sequencing studies such as the 1000 Genomes Project, will help to distinguish between benign and disease-causing LOF variants, and will provide valuable resources for clinical genomics.
Abstract: Genetic variants predicted to seriously disrupt the function of human protein-coding genes—so-called loss-of-function (LOF) variants—have traditionally been viewed in the context of severe Mendelian disease. However, recent large-scale sequencing and genotyping projects have revealed a surprisingly large number of these variants in the genomes of apparently healthy individuals—at least 100 per genome, including more than 30 in a homozygous state—suggesting a previously unappreciated level of variation in functional gene content between humans. These variants are mostly found at low frequency, suggesting that they are enriched for mildly deleterious polymorphisms suppressed by negative natural selection, and thus represent an attractive set of candidate variants for complex disease susceptibility. However, they are also enriched for sequencing and annotation artefacts, so overall present serious challenges for clinical sequencing projects seeking to identify severe disease genes amidst the ‘noise’ of technical error and benign genetic polymorphism. Systematic, high-quality catalogues of LOF variants present in the genomes of healthy individuals, built from the output of large-scale sequencing studies such as the 1000 Genomes Project, will help to distinguish between benign and disease-causing LOF variants, and will provide valuable resources for clinical genomics.

191 citations


Journal Article
TL;DR: A link between alpha-actinin-3 and glycogen metabolism, which may underlie the metabolic changes seen in the KO mouse, is demonstrated, and it is proposed that the alteration in GPh activity in the absence of alpha-actinin-3 is a fundamental mechanistic link in the association between ACTN3 genotype and human performance.
Abstract: Approximately one billion people worldwide are homozygous for a stop codon polymorphism in the ACTN3 gene (R577X) which results in complete deficiency of the fast fibre muscle protein alpha-actinin-3. ACTN3 genotype is associated with human athletic performance, and alpha-actinin-3 deficient mice [Actn3 knockout (KO) mice] have a shift in the properties of fast muscle fibres towards slower fibre properties, with increased activity of multiple enzymes in the aerobic metabolic pathway and slower contractile properties. alpha-Actinins have been shown to interact with a number of muscle proteins including the key metabolic regulator glycogen phosphorylase (GPh). In this study, we demonstrated a link between alpha-actinin-3 and glycogen metabolism which may underlie the metabolic changes seen in the KO mouse. Actn3 KO mice have higher muscle glycogen content and a 50% reduction in the activity of GPh. The reduction in enzyme activity is accompanied by altered post-translational modification of GPh, suggesting that alpha-actinin-3 regulates GPh activity by altering its level of phosphorylation. We propose that the changes in glycogen metabolism underlie the downstream metabolic consequences of alpha-actinin-3 deficiency. Finally, as GPh has been shown to regulate calcium handling, we examined calcium handling in primary myoblasts from KO mice and found changes that may explain the slower contractile properties previously observed in these mice. We propose that the alteration in GPh activity in the absence of alpha-actinin-3 is a fundamental mechanistic link in the association between ACTN3 genotype and human performance.

81 citations


Journal Article
TL;DR: The completed genome sequence of over 32 invertebrate species has allowed the analysis of gene structure and exon-gene duplication over a diverse range of phyla to show that relative to early branching metazoans, there has been considerable intron loss especially in arthropods with few cases of intron gains.
Abstract: The α-actinins are an important family of actin-binding proteins with the ability to cross-link actin filaments when in dimer form. Members of the α-actinin family share a domain topology composed of highly conserved actin-binding and EF-hand domains separated by a rod domain composed of spectrin-like repeats. Functional diversity within this family has arisen through exon duplication and the formation of alternate splice isoforms as well as gene duplications during the evolution of vertebrates. In addition to the known functional domains, α-actinins also contain a consensus PDZ-binding site. The completed genome sequence of over 32 invertebrate species has allowed the analysis of gene structure and exon–gene duplication over a diverse range of phyla. Our analysis shows that relative to early branching metazoans, there has been considerable intron loss especially in arthropods with few cases of intron gains. The C-terminal PDZ-binding site is conserved in nearly all invertebrates but is missing in some nematodes and platyhelminths. Alternative splicing in the actin-binding domain is conserved in chordates, arthropods, and some nematodes and platyhelminths. In contrast, alternative splicing of the EF-hand domain is only observed in chordates. Finally, given the prevalence of exon duplications seen in the actin-binding domain, this may act as a significant mechanism in the modification of actin-binding properties.

20 citations



Journal Article
TL;DR: A crackdown on firms selling gene tests direct to the consumer would come at a cost, argue Daniel MacArthur and Caroline Wright.

1 citation


Journal Article
TL;DR: A proposal to define the human reference gene set that takes into account the inter-individual differences in gene numbers arising from gene inactivation events, such as premature termination or aberrant splicing due to nonsense SNPs or SNPs at essential splice sites respectively is presented.
Abstract: The number of coding genes in the human genome is still under debate [1]. Here, we present a proposal to define the human reference gene set that takes into account the inter-individual differences in gene numbers arising from gene inactivation events, such as premature termination or aberrant splicing due to nonsense SNPs or SNPs at essential splice sites respectively. We have analyzed SNPs (specifically nonsense SNPs and SNPs affecting essential splice sites) from 23 personal genomes and exomes. We see a wide range in numbers of SNPs in each of the categories surveyed. A large fraction of these SNPs are singletons. Using a data set of high-confidence SNPs obtained by intersecting SNPs from dbSNP and the personal genomes, we identify a common set of 279 genes predicted to be pseudogenic (non-functional) in some individuals and functional in others. We focused on two key questions arising from these considerations: (i) Which criteria should be used for inclusion and exclusion of genes from the reference set? (ii) What sequence should be used as the reference for genes that are non-functional in some humans? For the first question, we propose to include all genes that are functional even in one individual to produce a maximally-inclusive set of genes. For the second, we propose the use of the ancestral allele as the reference allele. This will provide a uniform basis for gene annotation and ensure that the reference gene set and sequence will be relatively stable as more individual genomes are sequenced. In the few cases where an ancestral state assignment is unavailable or ambiguous, we propose that genes be annotated as the functional allele.