Showing papers in "PLOS Genetics in 2009"

Open Access

Journal Article

A flexible and accurate genotype imputation method for the next generation of genome-wide association studies.

Bryan Howie¹, Peter Donnelly¹,², Jonathan Marchini¹ • Institutions (2)

University of Oxford¹, Wellcome Trust Centre for Human Genetics²

19 Jun 2009-PLOS Genetics

TL;DR: It is found that imputation accuracy can be greatly enhanced by expanding the reference panel to contain thousands of chromosomes and that IMPUTE v2 outperforms other methods in this setting at both rare and common SNPs, with overall error rates that are 15%–20% lower than those of the closest competing method.

Abstract: Genotype imputation methods are now being widely used in the analysis of genome-wide association studies. Most imputation analyses to date have used the HapMap as a reference dataset, but new reference panels (such as controls genotyped on multiple SNP chips and densely typed samples from the 1,000 Genomes Project) will soon allow a broader range of SNPs to be imputed with higher accuracy, thereby increasing power. We describe a genotype imputation method (IMPUTE version 2) that is designed to address the challenges presented by these new datasets. The main innovation of our approach is a flexible modelling framework that increases accuracy and combines information across multiple reference panels while remaining computationally feasible. We find that IMPUTE v2 attains higher accuracy than other methods when the HapMap provides the sole reference panel, but that the size of the panel constrains the improvements that can be made. We also find that imputation accuracy can be greatly enhanced by expanding the reference panel to contain thousands of chromosomes and that IMPUTE v2 outperforms other methods in this setting at both rare and common SNPs, with overall error rates that are 15%–20% lower than those of the closest competing method. One particularly challenging aspect of next-generation association studies is to integrate information across multiple reference panels genotyped on different sets of SNPs; we show that our approach to this problem has practical advantages over other suggested solutions.

3,902 citations

Journal Article

Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data.

Ryan N. Gutenkunst¹, Ryan D. Hernandez², Scott Williamson³, Carlos Bustamante³ • Institutions (3)

Los Alamos National Laboratory¹, University of Chicago², Cornell University³

23 Oct 2009-PLOS Genetics

TL;DR: Combining the demographic model with a previously estimated distribution of selective effects among newly arising amino acid mutations accurately predicts the frequency spectrum of nonsynonymous variants across three continental populations (YRI, CHB, CEU).

Abstract: Demographic models built from genetic data play important roles in illuminating prehistorical events and serving as null models in genome scans for selection. We introduce an inference method based on the joint frequency spectrum of genetic variants within and between populations. For candidate models we numerically compute the expected spectrum using a diffusion approximation to the one-locus, two-allele Wright-Fisher process, involving up to three simultaneous populations. Our approach is a composite likelihood scheme, since linkage between neutral loci alters the variance but not the expectation of the frequency spectrum. We thus use bootstraps incorporating linkage to estimate uncertainties for parameters and significance values for hypothesis tests. Our method can also incorporate selection on single sites, predicting the joint distribution of selected alleles among populations experiencing a bevy of evolutionary forces, including expansions, contractions, migrations, and admixture. We model human expansion out of Africa and the settlement of the New World, using 5 Mb of noncoding DNA resequenced in 68 individuals from 4 populations (YRI, CHB, CEU, and MXL) by the Environmental Genome Project. We infer divergence between West African and Eurasian populations 140 thousand years ago (95% confidence interval: 40–270 kya). This is earlier than other genetic studies, in part because we incorporate migration. We estimate the European (CEU) and East Asian (CHB) divergence time to be 23 kya (95% c.i.: 17–43 kya), long after archeological evidence places modern humans in Europe. Finally, we estimate divergence between East Asians (CHB) and Mexican-Americans (MXL) of 22 kya (95% c.i.: 16.3–26.9 kya), and our analysis yields no evidence for subsequent migration. 
Furthermore, combining our demographic model with a previously estimated distribution of selective effects among newly arising amino acid mutations accurately predicts the frequency spectrum of nonsynonymous variants across three continental populations (YRI, CHB, CEU).
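
The joint frequency spectrum that this abstract builds on can be illustrated with a small sketch. It only tabulates an observed two-population spectrum from per-SNP derived-allele counts; it does not attempt the diffusion approximation or the likelihood inference described in the paper. The function name `joint_sfs` and the input layout are assumptions made for illustration.

```python
def joint_sfs(derived_counts, n1, n2):
    """Tabulate a two-population joint site frequency spectrum.

    derived_counts: iterable of (c1, c2) pairs, one per SNP, giving the number
        of derived alleles seen in population 1 and population 2.
    n1, n2: number of sampled chromosomes in each population.
    Returns a (n1 + 1) x (n2 + 1) table where entry [i][j] is the number of
    SNPs with i derived alleles in population 1 and j in population 2.
    """
    sfs = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for c1, c2 in derived_counts:
        if not (0 <= c1 <= n1 and 0 <= c2 <= n2):
            raise ValueError(f"allele counts out of range: {(c1, c2)}")
        sfs[c1][c2] += 1
    return sfs


# Toy input (hypothetical data): four SNPs, 2 and 4 sampled chromosomes.
example = joint_sfs([(1, 0), (1, 0), (2, 3), (0, 4)], n1=2, n2=4)
```

Entries along the edges of the table correspond to variants private to one population, which is one reason the joint spectrum is informative about divergence and migration.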

1,636 citations

Journal Article

Organised Genome Dynamics in the Escherichia coli Species Results in Highly Diverse Adaptive Paths

Marie Touchon¹,², Claire Hoede³, Olivier Tenaillon³, Valérie Barbe, Simon Baeriswyl³, Philippe Bidet³, Edouard Bingen³, Stéphane Bonacorsi³, Christiane Bouchier¹, Odile Bouvet³, Alexandra Calteau⁴, Hélène Chiapello⁵, Olivier Clermont³, Stéphane Cruveiller⁴, Antoine Danchin¹, Médéric Diard³, Carole Dossat, Meriem El Karoui⁵, Eric Frapy³, Louis Garry³, Jean Marc Ghigo¹, Anne-Marie Gilles¹, James R. Johnson⁶,⁷, Chantal Le Bouguénec¹, Mathilde Lescat³, Sophie Mangenot, Vanessa Martinez-Jéhanne¹, Ivan Matic³, Xavier Nassif³, Sophie Oztas, Marie-Agnès Petit⁵, Christophe Pichon¹, Zoé Rouy⁴, Claude Saint Ruf³, Dominique Schneider⁴, Jérôme Tourret³, Benoit Vacherie, David Vallenet⁴, Claudine Médigue⁴, Eduardo P. C. Rocha¹,², Erick Denamur³ • Institutions (7)

Pasteur Institute¹, Pierre-and-Marie-Curie University², University of Paris³, Centre national de la recherche scientifique⁴, Institut national de la recherche agronomique⁵, Veterans Health Administration⁶, University of Minnesota⁷

23 Jan 2009-PLOS Genetics

TL;DR: An important adaptive role for metabolism diversification within group B2 and Shigella strains is found, but few or no extraintestinal virulence-specific genes are identified, which could render difficult the development of a vaccine against extraintestinal infections.

Abstract: The Escherichia coli species represents one of the best-studied model organisms, but also encompasses a variety of commensal and pathogenic strains that diversify by high rates of genetic change. We uniformly (re-) annotated the genomes of 20 commensal and pathogenic E. coli strains and one strain of E. fergusonii (the closest E. coli related species), including seven that we sequenced to completion. Within the approximately 18,000 families of orthologous genes, we found approximately 2,000 common to all strains. Although recombination rates are much higher than mutation rates, we show, both theoretically and using phylogenetic inference, that this does not obscure the phylogenetic signal, which places the B2 phylogenetic group and one group D strain at the basal position. Based on this phylogeny, we inferred past evolutionary events of gain and loss of genes, identifying functional classes under opposite selection pressures. We found an important adaptive role for metabolism diversification within group B2 and Shigella strains, but identified few or no extraintestinal virulence-specific genes, which could render difficult the development of a vaccine against extraintestinal infections. Genome flux in E. coli is confined to a small number of conserved positions in the chromosome, which most often are not associated with integrases or tRNA genes. Core genes flanking some of these regions show higher rates of recombination, suggesting that a gene, once acquired by a strain, spreads within the species by homologous recombination at the flanking genes. Finally, the genome's long-scale structure of recombination indicates lower recombination rates, but not higher mutation rates, at the terminus of replication. The ensuing effect of background selection and biased gene conversion may thus explain why this region is A+T-rich and shows high sequence divergence but low sequence polymorphism. 
Overall, despite a very high gene flow, genes co-exist in an organised genome.

1,213 citations

Journal Article

A groupwise association test for rare mutations using a weighted sum statistic.

Bo Eskerod Madsen¹, Sharon R. Browning² • Institutions (2)

Aarhus University¹, University of Auckland²

13 Feb 2009-PLOS Genetics

TL;DR: It is demonstrated that resequencing studies can identify important genetic associations, provided that specialised analysis methods, such as the weighted-sum method, are used.

Abstract: Resequencing is an emerging tool for identification of rare disease-associated mutations. Rare mutations are difficult to tag with SNP genotyping, as genotyping studies are designed to detect common variants. However, studies have shown that genetic heterogeneity is a probable scenario for common diseases, in which multiple rare mutations together explain a large proportion of the genetic basis for the disease. Thus, we propose a weighted-sum method to jointly analyse a group of mutations in order to test for groupwise association with disease status. For example, such a group of mutations may result from resequencing a gene. We compare the proposed weighted-sum method to alternative methods and show that it is powerful for identifying disease-associated genes, both on simulated and Encode data. Using the weighted-sum method, a resequencing study can identify a disease-associated gene with an overall population attributable risk (PAR) of 2%, even when each individual mutation has much lower PAR, using 1,000 to 7,000 affected and unaffected individuals, depending on the underlying genetic model. This study thus demonstrates that resequencing studies can identify important genetic associations, provided that specialised analysis methods, such as the weighted-sum method, are used.

1,092 citations

Journal Article

Aging and Environmental Exposures Alter Tissue-Specific DNA Methylation Dependent upon CpG Island Context

Brock C. Christensen¹, E. Andres Houseman¹,², Carmen J. Marsit¹, Shichun Zheng³, Margaret Wrensch³, Joseph L. Wiemels³, Heather H. Nelson⁴, Margaret R. Karagas⁵, James F. Padbury¹, Raphael Bueno⁶, David J. Sugarbaker⁶, Ru Fang Yeh³, John K. Wiencke³, Karl T. Kelsey¹ • Institutions (6)

Brown University¹, Harvard University², University of California, San Francisco³, University of Minnesota⁴, Dartmouth College⁵, Brigham and Women's Hospital⁶

14 Aug 2009-PLOS Genetics

TL;DR: This work provides novel insight into the role of aging and the environment in susceptibility to diseases such as cancer and critically informs the field of epigenomics by providing evidence of epigenetic dysregulation by age-related methylation alterations.

Abstract: Epigenetic control of gene transcription is critical for normal human development and cellular differentiation. While alterations of epigenetic marks such as DNA methylation have been linked to cancers and many other human diseases, interindividual epigenetic variations in normal tissues due to aging, environmental factors, or innate susceptibility are poorly characterized. The plasticity, tissue-specific nature, and variability of gene expression are related to epigenomic states that vary across individuals. Thus, population-based investigations are needed to further our understanding of the fundamental dynamics of normal individual epigenomes. We analyzed 217 non-pathologic human tissues from 10 anatomic sites at 1,413 autosomal CpG loci associated with 773 genes to investigate tissue-specific differences in DNA methylation and to discern how aging and exposures contribute to normal variation in methylation. Methylation profile classes derived from unsupervised modeling were significantly associated with age (P<0.0001) and were significant predictors of tissue origin (P<0.0001). In solid tissues (n = 119) we found striking, highly significant CpG island-dependent correlations between age and methylation; loci in CpG islands gained methylation with age, loci not in CpG islands lost methylation with age (P<0.001), and this pattern was consistent across tissues and in an analysis of blood-derived DNA. Our data clearly demonstrate age- and exposure-related differences in tissue-specific methylation and significant age-associated methylation patterns which are CpG island context-dependent. This work provides novel insight into the role of aging and the environment in susceptibility to diseases such as cancer and critically informs the field of epigenomics by providing evidence of epigenetic dysregulation by age-related methylation alterations. 
Collectively we reveal key issues to consider both in the construction of reference and disease-related epigenomes and in the interpretation of potentially pathologically important alterations.

1,005 citations

Journal Article

A Microhomology-Mediated Break-Induced Replication Model for the Origin of Human Copy Number Variation

P. J. Hastings¹, Grzegorz Ira¹, James R. Lupski¹,² • Institutions (2)

Baylor College of Medicine¹, Boston Children's Hospital²

30 Jan 2009-PLOS Genetics

TL;DR: It is proposed that breakage of replication forks in stressed cells that are deficient in homologous recombination induces an aberrant repair process with features of break-induced replication (BIR), in which single-strand 3′ tails from broken replication forks anneal with microhomology on any single-stranded DNA nearby, priming low-processivity polymerization with multiple template switches that generate complex rearrangements before processive replication is eventually re-established.

Abstract: Chromosome structural changes with nonrecurrent endpoints associated with genomic disorders offer windows into the mechanism of origin of copy number variation (CNV). A recent report of nonrecurrent duplications associated with Pelizaeus-Merzbacher disease identified three distinctive characteristics. First, the majority of events can be seen to be complex, showing discontinuous duplications mixed with deletions, inverted duplications, and triplications. Second, junctions at endpoints show microhomology of 2–5 base pairs (bp). Third, endpoints occur near pre-existing low copy repeats (LCRs). Using these observations and evidence from DNA repair in other organisms, we derive a model of microhomology-mediated break-induced replication (MMBIR) for the origin of CNV and, ultimately, of LCRs. We propose that breakage of replication forks in stressed cells that are deficient in homologous recombination induces an aberrant repair process with features of break-induced replication (BIR). Under these circumstances, single-strand 3′ tails from broken replication forks will anneal with microhomology on any single-stranded DNA nearby, priming low-processivity polymerization with multiple template switches generating complex rearrangements, and eventual re-establishment of processive replication.

763 citations

Journal Article

Assessing the Impact of Transgenerational Epigenetic Variation on Complex Traits

Frank Johannes, Emmanuelle Porcher¹,², Felipe Karam Teixeira²,³, Vera Saliba-Colombani¹,², Matthieu Simon², Nicolas Agier², Agnès Bulski²,³, Juliette Albuisson², Fabiana Heredia², Pascal Audigier², David Bouchez², Christine Dillmann¹, Philippe Guerche², Vincent Colot²,³ • Institutions (3)

University of Paris-Sud¹, Institut national de la recherche agronomique², École Normale Supérieure³

26 Jun 2009-PLOS Genetics

TL;DR: The demonstration that numerous epialleles across the genome can be stable over many generations in the absence of selection or extensive DNA sequence variation highlights the need to integrate epigenetic information into population genetics studies.

Abstract: Loss or gain of DNA methylation can affect gene expression and is sometimes transmitted across generations. Such epigenetic alterations are thus a possible source of heritable phenotypic variation in the absence of DNA sequence change. However, attempts to assess the prevalence of stable epigenetic variation in natural and experimental populations and to quantify its impact on complex traits have been hampered by the confounding effects of DNA sequence polymorphisms. To overcome this problem as much as possible, two parents with little DNA sequence differences, but contrasting DNA methylation profiles, were used to derive a panel of epigenetic Recombinant Inbred Lines (epiRILs) in the reference plant Arabidopsis thaliana. The epiRILs showed variation and high heritability for flowering time and plant height (~30%), as well as stable inheritance of multiple parental DNA methylation variants (epialleles) over at least eight generations. These findings provide a first rationale to identify epiallelic variants that contribute to heritable variation in complex traits using linkage or association studies. More generally, the demonstration that numerous epialleles across the genome can be stable over many generations in the absence of selection or extensive DNA sequence variation highlights the need to integrate epigenetic information into population genetics studies.

743 citations

Journal Article

A Genome-Wide Association Study in Chronic Obstructive Pulmonary Disease (COPD): Identification of Two Major Susceptibility Loci

Sreekumar G. Pillai¹, Dongliang Ge², Guohua Zhu¹, Xiangyang Kong¹, Kevin V. Shianna², Anna C. Need², Sheng Feng², Craig P. Hersh³, Per Bakke⁴, Amund Gulsvik⁴, Andreas Ruppert, Karin C. Lødrup Carlsen⁵, Allen D. Roses²,⁶, Wayne Anderson¹, Stephen I. Rennard⁷, David A. Lomas⁸, Edwin K. Silverman³, David Goldstein² • Institutions (8)

Research Triangle Park¹, Duke University², Brigham and Women's Hospital³, University of Bergen⁴, University of Oslo⁵, Discovery Institute⁶, University of Nebraska Medical Center⁷, University of Cambridge⁸

20 Mar 2009-PLOS Genetics

TL;DR: A genome-wide association study in a homogenous case-control cohort from Bergen, Norway, with follow-up of the top 100 single nucleotide polymorphisms (SNPs) in the family-based International COPD Genetics Network (ICGN), found two SNPs at the α-nicotinic acetylcholine receptor (CHRNA3/5) locus that showed unambiguous replication and were significantly associated with lung function in both the ICGN and Boston Early-Onset COPD populations.

Abstract: There is considerable variability in the susceptibility of smokers to develop chronic obstructive pulmonary disease (COPD). The only known genetic risk factor is severe deficiency of α1-antitrypsin, which is present in 1–2% of individuals with COPD. We conducted a genome-wide association study (GWAS) in a homogenous case-control cohort from Bergen, Norway (823 COPD cases and 810 smoking controls) and evaluated the top 100 single nucleotide polymorphisms (SNPs) in the family-based International COPD Genetics Network (ICGN; 1891 Caucasian individuals from 606 pedigrees) study. The polymorphisms that showed replication were further evaluated in 389 subjects from the US National Emphysema Treatment Trial (NETT) and 472 controls from the Normative Aging Study (NAS) and then in a fourth cohort of 949 individuals from 127 extended pedigrees from the Boston Early-Onset COPD population. Logistic regression models with adjustments of covariates were used to analyze the case-control populations. Family-based association analyses were conducted for a diagnosis of COPD and lung function in the family populations. Two SNPs at the alpha-nicotinic acetylcholine receptor (CHRNA3/5) locus were identified in the genome-wide association study. They showed unambiguous replication in the ICGN family-based analysis and in the NETT case-control analysis with combined p-values of 1.48 × 10⁻¹⁰ (rs8034191) and 5.74 × 10⁻¹⁰ (rs1051730). Furthermore, these SNPs were significantly associated with lung function in both the ICGN and Boston Early-Onset COPD populations. The C allele of the rs8034191 SNP was estimated to have a population attributable risk for COPD of 12.2%. The association of hedgehog interacting protein (HHIP) locus on chromosome 4 was also consistently replicated, but did not reach genome-wide significance levels.
Genome-wide significant association of the HHIP locus with lung function was identified in the Framingham Heart study (Wilk et al., companion article in this issue of PLoS Genetics; doi:10.1371/journal.pgen.1000429). The CHRNA3/5 and the HHIP loci make a significant contribution to the risk of COPD. CHRNA3/5 is the same locus that has been implicated in the risk of lung cancer.

723 citations

Journal Article

The Genetic Signatures of Noncoding RNAs

John S. Mattick¹ • Institutions (1)

Australian Research Council¹

24 Apr 2009-PLOS Genetics

TL;DR: It is shown that the scarcity of noncoding RNAs identified in genetic screens is explicable by an historic emphasis, both phenotypically and technically, on mutations in protein-coding sequences, and by presumptions about the nature of regulatory mutations; most variations in regulatory sequences produce relatively subtle phenotypic changes, in contrast to mutations in protein-coding sequences that frequently cause catastrophic component failure.

Abstract: The majority of the genome in animals and plants is transcribed in a developmentally regulated manner to produce large numbers of non–protein-coding RNAs (ncRNAs), whose incidence increases with developmental complexity. There is growing evidence that these transcripts are functional, particularly in the regulation of epigenetic processes, leading to the suggestion that they compose a hitherto hidden layer of genomic programming in humans and other complex organisms. However, to date, very few have been identified in genetic screens. Here I show that this is explicable by an historic emphasis, both phenotypically and technically, on mutations in protein-coding sequences, and by presumptions about the nature of regulatory mutations. Most variations in regulatory sequences produce relatively subtle phenotypic changes, in contrast to mutations in protein-coding sequences that frequently cause catastrophic component failure. Until recently, most mapping projects have focused on protein-coding sequences, and the limited number of identified regulatory mutations have been interpreted as affecting conventional cis-acting promoter and enhancer elements, although these regions are often themselves transcribed. Moreover, ncRNA-directed regulatory circuits underpin most, if not all, complex genetic phenomena in eukaryotes, including RNA interference-related processes such as transcriptional and post-transcriptional gene silencing, position effect variegation, hybrid dysgenesis, chromosome dosage compensation, parental imprinting and allelic exclusion, paramutation, and possibly transvection and transinduction. The next frontier is the identification and functional characterization of the myriad sequence variations that influence quantitative traits, disease susceptibility, and other complex characteristics, which are being shown by genome-wide association studies to lie mostly in noncoding, presumably regulatory, regions. 
There is every possibility that many of these variations will alter the interactions between regulatory RNAs and their targets, a prospect that should be borne in mind in future functional analyses.

687 citations

Journal Article

Genome-wide association scan meta-analysis identifies three loci influencing adiposity and fat distribution

Cecilia M. Lindgren¹, Iris M. Heid², Joshua C. Randall¹, Claudia Lamina³, +152 more • Institutions (36)

01 Jun 2009-PLOS Genetics

TL;DR: By focusing on anthropometric measures of central obesity and fat distribution, a meta-analysis of 16 genome-wide association studies informative for adult waist circumference and waist–hip ratio identified three loci implicated in the regulation of human adiposity.

Abstract: To identify genetic loci influencing central obesity and fat distribution, we performed a meta-analysis of 16 genome-wide association studies (GWAS, N = 38,580) informative for adult waist circumference (WC) and waist-hip ratio (WHR). We selected 26 SNPs for follow-up, for which the evidence of association with measures of central adiposity (WC and/or WHR) was strong and disproportionate to that for overall adiposity or height. Follow-up studies in a maximum of 70,689 individuals identified two loci strongly associated with measures of central adiposity; these map near TFAP2B (WC, P = 1.9 × 10⁻¹¹) and MSRA (WC, P = 8.9 × 10⁻⁹). A third locus, near LYPLAL1, was associated with WHR in women only (P = 2.6 × 10⁻⁸). The variants near TFAP2B appear to influence central adiposity through an effect on overall obesity/fat-mass, whereas LYPLAL1 displays a strong female-only association with fat distribution. By focusing on anthropometric measures of central obesity and fat distribution, we have identified three loci implicated in the regulation of human adiposity.

648 citations

Journal Article

Harmonics of Circadian Gene Transcription in Mammals

Michael E. Hughes¹, Luciano DiTacchio², Kevin R. Hayes¹, Christopher Vollmers², S. Pulivarthy², Julie E. Baggs¹, Satchidananda Panda², John B. Hogenesch¹ • Institutions (2)

University of Pennsylvania¹, Salk Institute for Biological Studies²

03 Apr 2009-PLOS Genetics

TL;DR: These studies illustrate the importance of time sampling with respect to multiple testing, suggest caution in use of autonomous cellular models to study clock output, and demonstrate the existence of harmonics of circadian gene expression in the mouse.

Abstract: The circadian clock is a molecular and cellular oscillator found in most mammalian tissues that regulates rhythmic physiology and behavior. Numerous investigations have addressed the contribution of circadian rhythmicity to cellular, organ, and organismal physiology. We recently developed a method to look at transcriptional oscillations with unprecedented precision and accuracy using high-density time sampling. Here, we report a comparison of oscillating transcription from mouse liver, NIH3T3, and U2OS cells. Several surprising observations resulted from this study, including a 100-fold difference in the number of cycling transcripts in autonomous cellular models of the oscillator versus tissues harvested from intact mice. Strikingly, we found two clusters of genes that cycle at the second and third harmonic of circadian rhythmicity in liver, but not cultured cells. Validation experiments show that 12-hour oscillatory transcripts occur in several other peripheral tissues as well including heart, kidney, and lungs. These harmonics are lost ex vivo, as well as under restricted feeding conditions. Taken in sum, these studies illustrate the importance of time sampling with respect to multiple testing, suggest caution in use of autonomous cellular models to study clock output, and demonstrate the existence of harmonics of circadian gene expression in the mouse.

Journal Article

Designing Genome-Wide Association Studies: Sample Size, Power, Imputation, and the Choice of Genotyping Chip

Chris C. A. Spencer¹, Zhan Su¹, Peter Donnelly¹, Jonathan Marchini¹ • Institutions (1)

University of Oxford¹

15 May 2009-PLOS Genetics

TL;DR: It is argued that the statistical power to detect a causative variant should be the major criterion in study design and that, when taking budgetary considerations into account, the most powerful design may not always correspond to the chip with the highest coverage.

Abstract: Genome-wide association studies are revolutionizing the search for the genes underlying human complex diseases. The main decisions to be made at the design stage of these studies are the choice of the commercial genotyping chip to be used and the numbers of case and control samples to be genotyped. The most common method of comparing different chips is using a measure of coverage, but this fails to properly account for the effects of sample size, the genetic model of the disease, and linkage disequilibrium between SNPs. In this paper, we argue that the statistical power to detect a causative variant should be the major criterion in study design. Because of the complicated pattern of linkage disequilibrium (LD) in the human genome, power cannot be calculated analytically and must instead be assessed by simulation. We describe in detail a method of simulating case-control samples at a set of linked SNPs that replicates the patterns of LD in human populations, and we used it to assess power for a comprehensive set of available genotyping chips. Our results allow us to compare the performance of the chips to detect variants with different effect sizes and allele frequencies, look at how power changes with sample size in different populations or when using multi-marker tags and genotype imputation approaches, and how performance compares to a hypothetical chip that contains every SNP in HapMap. A main conclusion of this study is that marked differences in genome coverage may not translate into appreciable differences in power and that, when taking budgetary considerations into account, the most powerful design may not always correspond to the chip with the highest coverage. We also show that genotype imputation can be used to boost the power of many chips up to the level obtained from a hypothetical “complete” chip containing all the SNPs in HapMap. 
Our results have been encapsulated into an R software package that allows users to design future association studies and our methods provide a framework with which new chip sets can be evaluated.
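
The idea of assessing power by simulation can be sketched in a few lines. This is a deliberately simplified, single-SNP illustration and not the method described in the abstract, which simulates linked SNPs that reproduce human LD patterns. The function names, the allele-level odds-ratio model, and the default significance threshold are all assumptions chosen for the example.

```python
import math
import random


def allelic_test_p(case_alt, case_n, ctrl_alt, ctrl_n):
    """P-value of a 1-df chi-square test on a 2x2 table of allele counts."""
    total = case_n + ctrl_n
    total_alt = case_alt + ctrl_alt
    total_ref = total - total_alt
    if total_alt == 0 or total_ref == 0:
        return 1.0  # monomorphic sample: no evidence of association
    chi2 = 0.0
    for obs_alt, n in ((case_alt, case_n), (ctrl_alt, ctrl_n)):
        exp_alt = n * total_alt / total
        exp_ref = n * total_ref / total
        chi2 += (obs_alt - exp_alt) ** 2 / exp_alt
        chi2 += ((n - obs_alt) - exp_ref) ** 2 / exp_ref
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def simulate_power(ctrl_freq, odds_ratio, n_cases, n_controls,
                   alpha=5e-7, n_reps=200, seed=1):
    """Fraction of simulated studies in which the SNP reaches p < alpha."""
    rng = random.Random(seed)
    # Assumed model: the odds ratio acts on the allele frequency odds.
    odds = odds_ratio * ctrl_freq / (1.0 - ctrl_freq)
    case_freq = odds / (1.0 + odds)
    hits = 0
    for _ in range(n_reps):
        # Each individual contributes two alleles.
        case_alt = sum(rng.random() < case_freq for _ in range(2 * n_cases))
        ctrl_alt = sum(rng.random() < ctrl_freq for _ in range(2 * n_controls))
        p = allelic_test_p(case_alt, 2 * n_cases, ctrl_alt, 2 * n_controls)
        if p < alpha:
            hits += 1
    return hits / n_reps


if __name__ == "__main__":
    print(simulate_power(ctrl_freq=0.3, odds_ratio=1.3,
                         n_cases=2000, n_controls=2000))
```

Repeating the call over a grid of sample sizes, effect sizes, and allele frequencies gives the kind of power comparison the abstract describes, although a realistic comparison of genotyping chips would also need to model tagging through LD.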

Journal Article

A genome-wide association study confirms VKORC1, CYP2C9, and CYP4F2 as principal genetic determinants of warfarin dose

Fumihiko Takeuchi¹, Ralph McGinnis¹, Stephane Bourgeois¹, Chris P. Barnes¹, Niclas Eriksson, Nicole Soranzo¹, Pamela Whittaker¹, Venkatesh Ranganath¹, Vasudev Kumanduri¹, William M. McLaren¹, Lennart Holm², Jonatan D. Lindh², Anders Rane², Mia Wadelius³, Panos Deloukas¹ • Institutions (3)

Wellcome Trust Sanger Institute¹, Karolinska University Hospital², Uppsala University Hospital³

20 Mar 2009-PLOS Genetics

TL;DR: The first genome-wide association study (GWAS) whose sample size (1,053 Swedish subjects) is sufficiently powered to detect genome-wide significance (p < 1.5 × 10⁻⁷) for polymorphisms that modestly alter therapeutic warfarin dose is reported.

Abstract: We report the first genome-wide association study (GWAS) whose sample size (1,053 Swedish subjects) is sufficiently powered to detect genome-wide significance (p < 1.5 × 10⁻⁷) for polymorphisms that modestly alter therapeutic warfarin dose. The anticoagulant drug warfarin is widely prescribed for reducing the risk of stroke, thrombosis, pulmonary embolism, and coronary malfunction. However, Caucasians vary widely (20-fold) in the dose needed for therapeutic anticoagulation, and hence prescribed doses may be too low (risking serious illness) or too high (risking severe bleeding). Prior work established that approximately 30% of the dose variance is explained by single nucleotide polymorphisms (SNPs) in the warfarin drug target VKORC1 and another approximately 12% by two non-synonymous SNPs (*2, *3) in the cytochrome P450 warfarin-metabolizing gene CYP2C9. We initially tested each of 325,997 GWAS SNPs for association with warfarin dose by univariate regression and found the strongest statistical signals (p < 10⁻⁷⁸) at SNPs clustering near VKORC1 and the second lowest p-values (p < 10⁻³¹) emanating from CYP2C9. No other SNPs approached genome-wide significance. To enhance detection of weaker effects, we conducted multiple regression adjusting for known influences on warfarin dose (VKORC1, CYP2C9, age, gender) and identified a single SNP (rs2108622) with genome-wide significance (p = 8.3 × 10⁻¹⁰) that alters protein coding of the CYP4F2 gene. We confirmed this result in 588 additional Swedish patients (p < 0.0029) and, during our investigation, a second group provided independent confirmation from a scan of warfarin-metabolizing genes. We also thoroughly investigated copy number variations, haplotypes, and imputed SNPs, but found no additional highly significant warfarin associations.
We present power analysis of our GWAS that is generalizable to other studies, and conclude we had 80% power to detect genome-wide significance for common causative variants or markers explaining at least 1.5% of dose variance. These GWAS results provide further impetus for conducting large-scale trials assessing patient benefit from genotype-based forecasting of warfarin dose.
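
Note: the abstract does not say how the genome-wide threshold was derived. As a rough check, and assuming a Bonferroni correction at a family-wise error rate of 0.05 (an assumption, not something the abstract states), dividing 0.05 by the 325,997 SNPs tested gives a value close to the reported 1.5 × 10⁻⁷. The function and variable names below are illustrative only.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Figures taken from the abstract.
N_SNPS = 325_997       # SNPs tested by univariate regression
CYP4F2_P = 8.3e-10     # reported p-value for rs2108622

# Assumption: family-wise error rate of 0.05 (not stated in the abstract).
threshold = bonferroni_threshold(0.05, N_SNPS)

print(f"Bonferroni threshold: {threshold:.2e}")
print(f"rs2108622 passes: {CYP4F2_P < threshold}")
```

Under that assumption the computed threshold rounds to the figure quoted in the abstract, and the CYP4F2 signal clears it by more than two orders of magnitude.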

Journal Article•DOI•

Meta-analysis of 28,141 individuals identifies common variants within five new loci that influence uric acid concentrations.

Melanie Kolz, Toby Johnson¹, Toby Johnson², Toby Johnson³, Serena Sanna, Alexander Teumer⁴, Veronique Vitart⁵, M. Perola⁶, Massimo Mangino⁷, Eva Albrecht, Chris Wallace⁸, Martin Farrall⁹, Martin Farrall¹⁰, Åsa Johansson¹¹, Dale R. Nyholt¹², Yurii S. Aulchenko¹³, Jacques S. Beckmann¹, Jacques S. Beckmann², Sven Bergmann², Sven Bergmann³, Murielle Bochud¹, Morris J. Brown⁸, Harry Campbell¹⁴, John M. C. Connell¹⁵, Anna F. Dominiczak¹⁵, Georg Homuth⁴, Claudia Lamina¹⁶, Mark I. McCarthy⁹, Mark I. McCarthy¹⁷, Mark I. McCarthy¹⁰, Thomas Meitinger, Vincent Mooser¹⁸, Patricia B. Munroe¹⁹, Matthias Nauck⁴, John F. Peden⁹, John F. Peden¹⁰, Holger Prokisch, Perttu Salo⁶, Veikko Salomaa⁶, Nilesh J. Samani²⁰, David Schlessinger²¹, Manuela Uda, Uwe Völker⁴, Gérard Waeber¹, Dawn M. Waterworth¹⁸, Rui Wang-Sattler, Alan F. Wright⁵, Jerzy Adamski, John Whitfield¹², Ulf Gyllensten¹¹, James F. Wilson¹⁴, Igor Rudan¹⁴, Peter P. Pramstaller, Hugh Watkins¹⁰, Hugh Watkins⁹, Angela Doering, H.-Erich Wichmann²², Tim D. Spector⁷, Leena Peltonen²³, Henry Völzke⁴, Ramaiah Nagaraja²¹, Peter Vollenweider¹, Mark J. Caulfield¹⁹, Thomas Illig, Christian Gieger•Institutions (23)

05 Jun 2009-PLOS Genetics

TL;DR: A meta-analysis of genome-wide association scans from 14 studies with 28,141 participants of European descent identified 954 SNPs, distributed across nine loci, that exceeded the threshold of genome-wide significance; five of the loci are novel.

Abstract: Elevated serum uric acid levels cause gout and are a risk factor for cardiovascular disease and diabetes. To investigate the polygenetic basis of serum uric acid levels, we conducted a meta-analysis of genome-wide association scans from 14 studies totalling 28,141 participants of European descent, resulting in identification of 954 SNPs distributed across nine loci that exceeded the threshold of genome-wide significance, five of which are novel. Overall, the common variants associated with serum uric acid levels fall in the following nine regions: SLC2A9 (p = 5.2 × 10⁻²⁰¹), ABCG2 (p = 3.1 × 10⁻²⁶), SLC17A1 (p = 3.0 × 10⁻¹⁴), SLC22A11 (p = 6.7 × 10⁻¹⁴), SLC22A12 (p = 2.0 × 10⁻⁹), SLC16A9 (p = 1.1 × 10⁻⁸), GCKR (p = 1.4 × 10⁻⁹), LRRC16A (p = 8.5 × 10⁻⁹), and near PDZK1 (p = 2.7 × 10⁻⁹). Identified variants were analyzed for gender differences. We found that the minor allele for rs734553 in SLC2A9 has greater influence in lowering uric acid levels in women and the minor allele of rs2231142 in ABCG2 elevates uric acid levels more strongly in men compared to women. To further characterize the identified variants, we analyzed their association with a panel of metabolites. rs12356193 within SLC16A9 was associated with DL-carnitine (p = 4.0 × 10⁻²⁶) and propionyl-L-carnitine (p = 5.0 × 10⁻⁸) concentrations, which in turn were associated with serum UA levels (p = 1.4 × 10⁻⁵⁷ and p = 8.1 × 10⁻⁵⁴, respectively), forming a triangle between SNP, metabolites, and UA levels. Taken together, these associations highlight additional pathways that are important in the regulation of serum uric acid levels and point toward novel potential targets for pharmacological intervention to prevent or treat hyperuricemia. In addition, these findings strongly support the hypothesis that transport proteins are key in regulating serum uric acid levels.

Journal Article•DOI•

A Multiparent Advanced Generation Inter-Cross to Fine-Map Quantitative Traits in Arabidopsis thaliana

Paula X. Kover¹, Paula X. Kover², William Valdar³, Joseph Trakalo³, Nora Scarcelli¹, Ian M. Ehrenreich⁴, Michael D. Purugganan⁴, Caroline Durrant³, Richard Mott³•Institutions (4)

University of Manchester¹, University of Bath², University of Oxford³, New York University⁴

10 Jul 2009-PLOS Genetics

TL;DR: The first panel of MAGIC lines developed is presented, a set of 527 recombinant inbred lines descended from a heterogeneous stock of 19 intermated accessions of the plant Arabidopsis thaliana, and it is shown how the power to detect a QTL and the mapping accuracy vary, depending on QTL location.

Abstract: Identifying natural allelic variation that underlies quantitative trait variation remains a fundamental problem in genetics. Most studies have employed either simple synthetic populations with restricted allelic variation or performed association mapping on a sample of naturally occurring haplotypes. Both of these approaches have some limitations, therefore alternative resources for the genetic dissection of complex traits continue to be sought. Here we describe one such alternative, the Multiparent Advanced Generation Inter-Cross (MAGIC). This approach is expected to improve the precision with which QTL can be mapped, improving the outlook for QTL cloning. Here, we present the first panel of MAGIC lines developed: a set of 527 recombinant inbred lines (RILs) descended from a heterogeneous stock of 19 intermated accessions of the plant Arabidopsis thaliana. These lines and the 19 founders were genotyped with 1,260 single nucleotide polymorphisms and phenotyped for development-related traits. Analytical methods were developed to fine-map quantitative trait loci (QTL) in the MAGIC lines by reconstructing the genome of each line as a mosaic of the founders. We show by simulation that QTL explaining 10% of the phenotypic variance will be detected in most situations with an average mapping error of about 300 kb, and that if the number of lines were doubled the mapping error would be under 200 kb. We also show how the power to detect a QTL and the mapping accuracy vary, depending on QTL location. We demonstrate the utility of this new mapping population by mapping several known QTL with high precision and by finding novel QTL for germination data and bolting time. Our results provide strong support for similar ongoing efforts to produce MAGIC lines in other organisms.

Journal Article•DOI•

Sensitive detection of chromosomal segments of distinct ancestry in admixed populations.

Alkes L. Price¹, Arti Tandon², Arti Tandon¹, Nick Patterson², Kathleen C. Barnes³, Nicholas Rafaels³, Ingo Ruczinski³, Terri H. Beaty³, Rasika A. Mathias⁴, David Reich¹, David Reich², Simon Myers⁵, Simon Myers⁶, Simon Myers²•Institutions (6)

Harvard University¹, Broad Institute², Johns Hopkins University³, National Institutes of Health⁴, Wellcome Trust Centre for Human Genetics⁵, University of Oxford⁶

19 Jun 2009-PLOS Genetics

TL;DR: HAPMIX will be of particular utility for mapping disease genes in recently admixed populations, as its accurate estimates of local ancestry permit admixture and case-control association signals to be combined, enabling more powerful tests of association than with either signal alone.

Abstract: Identifying the ancestry of chromosomal segments of distinct ancestry has a wide range of applications from disease mapping to learning about history. Most methods require the use of unlinked markers; but, using all markers from genome-wide scanning arrays, it should in principle be possible to infer the ancestry of even very small segments with exquisite accuracy. We describe a method, HAPMIX, which employs an explicit population genetic model to perform such local ancestry inference based on fine-scale variation data. We show that HAPMIX outperforms other methods, and we explore its utility for inferring ancestry, learning about ancestral populations, and inferring dates of admixture. We validate the method empirically by applying it to populations that have experienced recent and ancient admixture: 935 African Americans from the United States and 29 Mozabites from North Africa. HAPMIX will be of particular utility for mapping disease genes in recently admixed populations, as its accurate estimates of local ancestry permit admixture and case-control association signals to be combined, enabling more powerful tests of association than with either signal alone.

Journal Article•DOI•

Network properties of robust immunity in plants.

Kenichi Tsuda¹, Masanao Sato¹, Masanao Sato², Thomas Stoddard¹, Jane Glazebrook¹, Fumiaki Katagiri¹•Institutions (2)

University of Minnesota¹, University of Tokyo²

11 Dec 2009-PLOS Genetics

TL;DR: Signaling allocation analysis showed that, contrary to current ideas, each of the JA, ET, and SA signaling sectors can positively contribute to immunity against both biotrophic and necrotrophic pathogens.

Abstract: Two modes of plant immunity against biotrophic pathogens, Effector Triggered Immunity (ETI) and Pattern-Triggered Immunity (PTI), are triggered by recognition of pathogen effectors and Microbe-Associated Molecular Patterns (MAMPs), respectively. Although the jasmonic acid (JA)/ethylene (ET) and salicylic acid (SA) signaling sectors are generally antagonistic and important for immunity against necrotrophic and biotrophic pathogens, respectively, their precise roles and interactions in ETI and PTI have not been clear. We constructed an Arabidopsis dde2/ein2/pad4/sid2-quadruple mutant. DDE2, EIN2, and SID2 are essential components of the JA, ET, and SA sectors, respectively. The pad4 mutation affects the SA sector and a poorly characterized sector. Although the ETI triggered by the bacterial effector AvrRpt2 (AvrRpt2-ETI) and the PTI triggered by the bacterial MAMP flg22 (flg22-PTI) were largely intact in plants with mutations in any one of these genes, they were mostly abolished in the quadruple mutant. For the purposes of this study, AvrRpt2-ETI and flg22-PTI were measured as relative growth of Pseudomonas syringae bacteria within leaves. Immunity to the necrotrophic fungal pathogen Alternaria brassicicola was also severely compromised in the quadruple mutant. Quantitative measurements of the immunity levels in all combinatorial mutants and wild type allowed us to estimate the effects of the wild-type genes and their interactions on the immunity by fitting a mixed general linear model. This signaling allocation analysis showed that, contrary to current ideas, each of the JA, ET, and SA signaling sectors can positively contribute to immunity against both biotrophic and necrotrophic pathogens. 
The analysis also revealed that while flg22-PTI and AvrRpt2-ETI use a highly overlapping signaling network, the way they use the common network is very different: synergistic relationships among the signaling sectors are evident in PTI, which may amplify the signal; compensatory relationships among the sectors dominate in ETI, explaining the robustness of ETI against genetic and pathogenic perturbations.

Journal Article•DOI•

Maize Inbreds Exhibit High Levels of Copy Number Variation (CNV) and Presence/Absence Variation (PAV) in Genome Content

Nathan M. Springer¹, Kai Ying², Yan-Yan Fu², Tieming Ji², Cheng Ting Yeh², Yi Jia², Wei-Wei Wu², Todd Richmond³, Jacob O. Kitzman³, Heidi Rosenbaum³, A. Leonardo Iniguez³, W. Brad Barbazuk⁴, Jeffrey A. Jeddeloh³, Dan Nettleton², Patrick S. Schnable²•Institutions (4)

University of Minnesota¹, Iowa State University², Hoffmann-La Roche³, University of Florida⁴

20 Nov 2009-PLOS Genetics

TL;DR: A level of structural diversity between the inbred lines B73 and Mo17 that is unprecedented among higher eukaryotes is revealed, and hundreds of single-copy, expressed genes may contribute to heterosis and to the extraordinary phenotypic diversity of this important crop.

Abstract: Following the domestication of maize over the past ∼10,000 years, breeders have exploited the extensive genetic diversity of this species to mold its phenotype to meet human needs. The extent of structural variation, including copy number variation (CNV) and presence/absence variation (PAV), which are thought to contribute to the extraordinary phenotypic diversity and plasticity of this important crop, have not been elucidated. Whole-genome, array-based, comparative genomic hybridization (CGH) revealed a level of structural diversity between the inbred lines B73 and Mo17 that is unprecedented among higher eukaryotes. A detailed analysis of altered segments of DNA conservatively estimates that there are several hundred CNV sequences among the two genotypes, as well as several thousand PAV sequences that are present in B73 but not Mo17. Haplotype-specific PAVs contain hundreds of single-copy, expressed genes that may contribute to heterosis and to the extraordinary phenotypic diversity of this important crop.

Journal Article•DOI•

A Genealogical Interpretation of Principal Components Analysis

Gil McVean¹•Institutions (1)

University of Oxford¹

16 Oct 2009-PLOS Genetics

TL;DR: For SNP data the projection of samples onto the principal components can be obtained directly from considering the average coalescent times between pairs of haploid genomes, which provides a framework for interpreting PCA projections in terms of underlying processes, including migration, geographical isolation, and admixture.

Abstract: Principal components analysis, PCA, is a statistical method commonly used in population genetics to identify structure in the distribution of genetic variation across geographical location and ethnic background. However, while the method is often used to inform about historical demographic processes, little is known about the relationship between fundamental demographic parameters and the projection of samples onto the primary axes. Here I show that for SNP data the projection of samples onto the principal components can be obtained directly from considering the average coalescent times between pairs of haploid genomes. The result provides a framework for interpreting PCA projections in terms of underlying processes, including migration, geographical isolation, and admixture. I also demonstrate a link between PCA and Wright's F_ST and show that SNP ascertainment has a largely simple and predictable effect on the projection of samples. Using examples from human genetics, I discuss the application of these results to empirical data and the implications for inference.

Journal Article•DOI•

Expression of the Multiple Sclerosis-Associated MHC Class II Allele HLA-DRB1*1501 Is Regulated by Vitamin D

Sreeram V. Ramagopalan¹, Narelle J. Maugeri¹, Lahiru Handunnetthi², Lahiru Handunnetthi¹, Matthew R. Lincoln², Matthew R. Lincoln¹, S M Orton¹, S M Orton², David A. Dyment¹, David A. Dyment², Gabriele C. DeLuca², Gabriele C. DeLuca¹, Blanca M. Herrera², Blanca M. Herrera¹, Michael J. Chao², Michael J. Chao¹, A. Dessa Sadovnick³, George C. Ebers², George C. Ebers¹, Julian C. Knight¹•Institutions (3)

Wellcome Trust Centre for Human Genetics¹, John Radcliffe Hospital², University of British Columbia³

06 Feb 2009-PLOS Genetics

TL;DR: Sequence analysis localised a single MHC vitamin D response element (VDRE) to the promoter region of HLA-DRB1, and functional assays showed that vitamin D increases expression of the MS-associated HLA-DRB1*15 allele through this element.

Abstract: Multiple sclerosis (MS) is a complex trait in which allelic variation in the MHC class II region exerts the single strongest effect on genetic risk. Epidemiological data in MS provide strong evidence that environmental factors act at a population level to influence the unusual geographical distribution of this disease. Growing evidence implicates sunlight or vitamin D as a key environmental factor in aetiology. We hypothesised that this environmental candidate might interact with inherited factors and sought responsive regulatory elements in the MHC class II region. Sequence analysis localised a single MHC vitamin D response element (VDRE) to the promoter region of HLA-DRB1. Sequencing of this promoter in greater than 1,000 chromosomes from HLA-DRB1 homozygotes showed absolute conservation of this putative VDRE on HLA-DRB1*15 haplotypes. In contrast, there was striking variation among non-MS-associated haplotypes. Electrophoretic mobility shift assays showed specific recruitment of vitamin D receptor to the VDRE in the HLA-DRB1*15 promoter, confirmed by chromatin immunoprecipitation experiments using lymphoblastoid cells homozygous for HLA-DRB1*15. Transient transfection using a luciferase reporter assay showed a functional role for this VDRE. B cells transiently transfected with the HLA-DRB1*15 gene promoter showed increased expression on stimulation with 1,25-dihydroxyvitamin D3 (P = 0.002) that was lost both on deletion of the VDRE or with the homologous "VDRE" sequence found in non-MS-associated HLA-DRB1 haplotypes. Flow cytometric analysis showed a specific increase in the cell surface expression of HLA-DRB1 upon addition of vitamin D only in HLA-DRB1*15 bearing lymphoblastoid cells. This study further implicates vitamin D as a strong environmental candidate in MS by demonstrating direct functional interaction with the major locus determining genetic susceptibility. 
These findings support a connection between the main epidemiological and genetic features of this disease with major practical implications for studies of disease mechanism and prevention.

Journal Article•DOI•

Deletion of the Mitochondrial Superoxide Dismutase sod-2 Extends Lifespan in Caenorhabditis elegans

Jeremy M. Van Raamsdonk¹, Siegfried Hekimi¹•Institutions (1)

McGill University¹

06 Feb 2009-PLOS Genetics

TL;DR: It is shown that increased oxidative stress caused by deletion of sod genes does not result in decreased lifespan in C. elegans and that deletion of sod-2 extends worm lifespan by altering mitochondrial function, a conclusion supported by decreased oxygen consumption in sod-2 mutant worms.

Abstract: The oxidative stress theory of aging postulates that aging results from the accumulation of molecular damage caused by reactive oxygen species (ROS) generated during normal metabolism. Superoxide dismutases (SODs) counteract this process by detoxifying superoxide. It has previously been shown that elimination of either cytoplasmic or mitochondrial SOD in yeast, flies, and mice results in decreased lifespan. In this experiment, we examine the effect of eliminating each of the five individual sod genes present in Caenorhabditis elegans. In contrast to what is observed in other model organisms, none of the sod deletion mutants shows decreased lifespan compared to wild-type worms, despite a clear increase in sensitivity to paraquat- and juglone-induced oxidative stress. In fact, even mutants lacking combinations of two or three sod genes survive at least as long as wild-type worms. Examination of gene expression in these mutants reveals mild compensatory up-regulation of other sod genes. Interestingly, we find that sod-2 mutants are long-lived despite a significant increase in oxidatively damaged proteins. Testing the effect of sod-2 deletion on known pathways of lifespan extension reveals a clear interaction with genes that affect mitochondrial function: sod-2 deletion markedly increases lifespan in clk-1 worms while clearly decreasing the lifespan of isp-1 worms. Combined with the mitochondrial localization of SOD-2 and the fact that sod-2 mutant worms exhibit phenotypes that are characteristic of long-lived mitochondrial mutants—including slow development, low brood size, and slow defecation—this suggests that deletion of sod-2 extends lifespan through a similar mechanism. This conclusion is supported by our demonstration of decreased oxygen consumption in sod-2 mutant worms. Overall, we show that increased oxidative stress caused by deletion of sod genes does not result in decreased lifespan in C. elegans and that deletion of sod-2 extends worm lifespan by altering mitochondrial function.

Journal Article•DOI•

SOS Response Induces Persistence to Fluoroquinolones in Escherichia coli

Tobias Dörr¹, Kim Lewis¹, Marin Vulić¹•Institutions (1)

Northeastern University¹

11 Dec 2009-PLOS Genetics

TL;DR: Findings reveal an active and inducible mechanism of persister formation mediated by the SOS response, challenging the prevailing view that persisters are pre-existing and formed purely by stochastic means.

Abstract: Bacteria can survive antibiotic treatment without acquiring heritable antibiotic resistance. We investigated persistence to the fluoroquinolone ciprofloxacin in Escherichia coli. Our data show that a majority of persisters to ciprofloxacin were formed upon exposure to the antibiotic, in a manner dependent on the SOS gene network. These findings reveal an active and inducible mechanism of persister formation mediated by the SOS response, challenging the prevailing view that persisters are pre-existing and formed purely by stochastic means. SOS-induced persistence is a novel mechanism by which cells can counteract DNA damage and promote survival to fluoroquinolones. This unique survival mechanism may be an important factor influencing the outcome of antibiotic therapy in vivo.

Journal Article•DOI•

Identifying Relationships among Genomic Disease Regions: Predicting Genes at Pathogenic SNP Associations and Rare Deletions

Soumya Raychaudhuri¹, Robert M. Plenge², Robert M. Plenge¹, Robert M. Plenge³, Elizabeth J. Rossin¹, Elizabeth J. Rossin², Elizabeth J. Rossin⁴, Aylwin Ng², Shaun Purcell¹, Shaun Purcell², Pamela Sklar, Edward M. Scolnick¹, Edward M. Scolnick², Ramnik J. Xavier², David Altshuler, Mark J. Daly², Mark J. Daly¹•Institutions (4)

Broad Institute¹, Harvard University², Brigham and Women's Hospital³, Massachusetts Institute of Technology⁴

26 Jun 2009-PLOS Genetics

TL;DR: A statistical method that takes a list of disease regions and automatically assesses the degree of relatedness of implicated genes using 250,000 PubMed abstracts, and offers a statistically robust approach to identifying functionally related genes from across multiple disease regions—that likely represent key disease pathways.

Abstract: Translating a set of disease regions into insight about pathogenic mechanisms requires not only the ability to identify the key disease genes within them, but also the biological relationships among those key genes. Here we describe a statistical method, Gene Relationships Among Implicated Loci (GRAIL), that takes a list of disease regions and automatically assesses the degree of relatedness of implicated genes using 250,000 PubMed abstracts. We first evaluated GRAIL by assessing its ability to identify subsets of highly related genes in common pathways from validated lipid and height SNP associations from recent genome-wide studies. We then tested GRAIL, by assessing its ability to separate true disease regions from many false positive disease regions in two separate practical applications in human genetics. First, we took 74 nominally associated Crohn's disease SNPs and applied GRAIL to identify a subset of 13 SNPs with highly related genes. Of these, ten convincingly validated in follow-up genotyping; genotyping results for the remaining three were inconclusive. Next, we applied GRAIL to 165 rare deletion events seen in schizophrenia cases (less than one-third of which are contributing to disease risk). We demonstrate that GRAIL is able to identify a subset of 16 deletions containing highly related genes; many of these genes are expressed in the central nervous system and play a role in neuronal synapses. GRAIL offers a statistically robust approach to identifying functionally related genes from across multiple disease regions—that likely represent key disease pathways. An online version of this method is available for public use (http://www.broad.mit.edu/mpg/grail/).

Journal Article•DOI•

A genome-wide investigation of SNPs and CNVs in schizophrenia

Anna C. Need¹, Dongliang Ge¹, Michael E. Weale², Jessica M. Maia¹, Sheng Feng¹, Erin L. Heinzen¹, Kevin V. Shianna¹, Woohyun Yoon¹, Dalia Kasperavičiūtė³, Massimo Gennarelli⁴, Warren J. Strittmatter¹, Cristian Bonvicini, Giuseppe Rossi, Karu Jayathilake⁵, Philip A. Cola, Joseph P. McEvoy¹, Richard S.E. Keefe¹, Elizabeth M. C. Fisher³, Pamela L. St. Jean⁶, Ina Giegling⁷, Annette M. Hartmann⁷, Hans-Jürgen Möller⁷, Andreas Ruppert, Gillian Fraser⁸, Caroline Crombie⁸, Lefkos T. Middleton⁹, David St Clair⁸, Allen D. Roses¹, Pierandrea Muglia¹⁰, Clyde Francks¹⁰, Dan Rujescu⁷, Herbert Y. Meltzer⁵, David Goldstein¹•Institutions (10)

Duke University¹, King's College London², University College London³, University of Brescia⁴, Vanderbilt University⁵, Research Triangle Park⁶, Ludwig Maximilian University of Munich⁷, University of Aberdeen⁸, Hammersmith Hospital⁹, GlaxoSmithKline¹⁰

06 Feb 2009-PLOS Genetics

TL;DR: These data suggest that very few schizophrenia patients share identical genomic causation, potentially complicating efforts to personalize treatment regimens and support the emerging view that rare deleterious variants may be more important in schizophrenia predisposition than common polymorphisms.

Abstract: We report a genome-wide assessment of single nucleotide polymorphisms (SNPs) and copy number variants (CNVs) in schizophrenia. We investigated SNPs using 871 patients and 863 controls, following up the top hits in four independent cohorts comprising 1,460 patients and 12,995 controls, all of European origin. We found no genome-wide significant associations, nor could we provide support for any previously reported candidate gene or genome-wide associations. We went on to examine CNVs using a subset of 1,013 cases and 1,084 controls of European ancestry, and a further set of 60 cases and 64 controls of African ancestry. We found that eight cases and zero controls carried deletions greater than 2 Mb, of which two, at 8p22 and 16p13.11-p12.4, are newly reported here. A further evaluation of 1,378 controls identified no deletions greater than 2 Mb, suggesting a high prior probability of disease involvement when such deletions are observed in cases. We also provide further evidence for some smaller, previously reported, schizophrenia-associated CNVs, such as those in NRXN1 and APBA2. We could not provide strong support for the hypothesis that schizophrenia patients have a significantly greater “load” of large (>100 kb), rare CNVs, nor could we find common CNVs that associate with schizophrenia. Finally, we did not provide support for the suggestion that schizophrenia-associated CNVs may preferentially disrupt genes in neurodevelopmental pathways. Collectively, these analyses provide the first integrated study of SNPs and CNVs in schizophrenia and support the emerging view that rare deleterious variants may be more important in schizophrenia predisposition than common polymorphisms. While our analyses do not suggest that implicated CNVs impinge on particular key pathways, we do support the contribution of specific genomic regions in schizophrenia, presumably due to recurrent mutation. 
On balance, these data suggest that very few schizophrenia patients share identical genomic causation, potentially complicating efforts to personalize treatment regimens.

Journal Article•DOI•

Comprehensive Functional Analysis of Mycobacterium tuberculosis Toxin-Antitoxin Systems: Implications for Pathogenesis, Stress Responses, and Evolution

Holly Ramage¹, Lynn E. Connolly¹, Jeffery S. Cox¹•Institutions (1)

University of California, San Francisco¹

11 Dec 2009-PLOS Genetics

TL;DR: It is shown that 88 putative TA system candidates are present in M. tuberculosis, considerably more than previously thought, and that four systems are specifically activated during stresses likely encountered in vivo, including hypoxia and phagocytosis by macrophages.

Abstract: Toxin-antitoxin (TA) systems, stress-responsive genetic elements ubiquitous in microbial genomes, are unusually abundant in the major human pathogen Mycobacterium tuberculosis. Why M. tuberculosis has so many TA systems and what role they play in the unique biology of the pathogen is unknown. To address these questions, we have taken a comprehensive approach to identify and functionally characterize all the TA systems encoded in the M. tuberculosis genome. Here we show that 88 putative TA system candidates are present in M. tuberculosis, considerably more than previously thought. Comparative genomic analysis revealed that the vast majority of these systems are conserved in the M. tuberculosis complex (MTBC), but largely absent from other mycobacteria, including close relatives of M. tuberculosis. We found that many of the M. tuberculosis TA systems are located within discernable genomic islands and were thus likely acquired recently via horizontal gene transfer. We discovered a novel TA system located in the core genome that is conserved across the genus, suggesting that it may fulfill a role common to all mycobacteria. By expressing each of the putative TA systems in M. smegmatis, we demonstrate that 30 encode a functional toxin and its cognate antitoxin. We show that the toxins of the largest family of TA systems, VapBC, act by inhibiting translation via mRNA cleavage. Expression profiling demonstrated that four systems are specifically activated during stresses likely encountered in vivo, including hypoxia and phagocytosis by macrophages. The expansion and maintenance of TA genes in the MTBC, coupled with the finding that a subset is transcriptionally activated by stress, suggests that TA systems are important for M. tuberculosis pathogenesis.

Journal Article•DOI•

Widespread Genomic Signatures of Natural Selection in Hominid Evolution

Graham McVicker¹, David Gordon¹, Colleen Davis¹, Philip Green¹•Institutions (1)

University of Washington¹

08 May 2009-PLOS Genetics

TL;DR: The analyses reveal a dominant role for selection in shaping genomic diversity and divergence patterns, show that long-term selection explains the large intragenomic variation in human/chimpanzee divergence, and provide a baseline for investigating specific selective events.

Abstract: Selection acting on genomic functional elements can be detected by its indirect effects on population diversity at linked neutral sites. To illuminate the selective forces that shaped hominid evolution, we analyzed the genomic distributions of human polymorphisms and sequence differences among five primate species relative to the locations of conserved sequence features. Neutral sequence diversity in human and ancestral hominid populations is substantially reduced near such features, resulting in a surprisingly large genome average diversity reduction due to selection of 19–26% on the autosomes and 12–40% on the X chromosome. The overall trends are broadly consistent with “background selection” or hitchhiking in ancestral populations acting to remove deleterious variants. Average selection is much stronger on exonic (both protein-coding and untranslated) conserved features than non-exonic features. Long term selection, rather than complex speciation scenarios, explains the large intragenomic variation in human/chimpanzee divergence. Our analyses reveal a dominant role for selection in shaping genomic diversity and divergence patterns, clarify hominid evolution, and provide a baseline for investigating specific selective events.

Journal Article•DOI•

Bacterial toxin-antitoxin systems: more than selfish entities?

Laurence Van Melderen¹, Manuel Saavedra De Bast¹•Institutions (1)

Université libre de Bruxelles¹

27 Mar 2009-PLOS Genetics

TL;DR: Current hypotheses regarding the biological roles of these evolutionarily successful small operons are discussed and the various selective forces that could drive the maintenance of TA systems in bacterial genomes are considered.

Abstract: Bacterial toxin–antitoxin (TA) systems are diverse and widespread in the prokaryotic kingdom. They are composed of closely linked genes encoding a stable toxin that can harm the host cell and its cognate labile antitoxin, which protects the host from the toxin's deleterious effect. TA systems are thought to invade bacterial genomes through horizontal gene transfer. Some TA systems might behave as selfish elements and favour their own maintenance at the expense of their host. As a consequence, they may contribute to the maintenance of plasmids or genomic islands, such as super-integrons, by post-segregational killing of the cell that loses these genes and so suffers the stable toxin's destructive effect. The function of the chromosomally encoded TA systems is less clear and still open to debate. This Review discusses current hypotheses regarding the biological roles of these evolutionarily successful small operons. We consider the various selective forces that could drive the maintenance of TA systems in bacterial genomes.

Journal Article•DOI•

The role of geography in human adaptation.

Graham Coop¹, Joseph K. Pickrell¹, John Novembre¹, Sridhar Kudaravalli¹, Jun Li², Devin Absher, Richard M. Myers, Luigi Luca Cavalli-Sforza³, Marcus W. Feldman³, Jonathan K. Pritchard¹, Jonathan K. Pritchard⁴•Institutions (4)

University of Chicago¹, University of Michigan², Stanford University³, Howard Hughes Medical Institute⁴

05 Jun 2009-PLOS Genetics

TL;DR: This paper found that the average allele frequency divergence is highly predictive of the most extreme FST values across the whole genome and that the geographic distribution of putatively selected alleles almost invariably conforms to population clusters identified using randomly chosen genetic markers.

Abstract: Various observations argue for a role of adaptation in recent human evolution, including results from genome-wide studies and analyses of selection signals at candidate genes. Here, we use genome-wide SNP data from the HapMap and CEPH-Human Genome Diversity Panel samples to study the geographic distributions of putatively selected alleles at a range of geographic scales. We find that the average allele frequency divergence is highly predictive of the most extreme FST values across the whole genome. On a broad scale, the geographic distribution of putatively selected alleles almost invariably conforms to population clusters identified using randomly chosen genetic markers. Given this structure, there are surprisingly few fixed or nearly fixed differences between human populations. Among the nearly fixed differences that do exist, nearly all are due to fixation events that occurred outside of Africa, and most appear in East Asia. These patterns suggest that selection is often weak enough that neutral processes—especially population history, migration, and drift—exert powerful influences over the fate and geographic distribution of selected alleles.

Journal Article•DOI•

A phenotypic profile of the Candida albicans regulatory network.

Oliver R. Homann¹, Jeanselle Dea¹, Suzanne M. Noble¹, Alexander D. Johnson¹•Institutions (1)

University of California, San Francisco¹

24 Dec 2009-PLOS Genetics

TL;DR: It is found that, despite the many specific wiring changes documented between these species, the general phenotypes of orthologous transcriptional regulator knockouts are largely conserved, supporting the idea that many wiring changes affect the detailed architecture of the circuit, but not its overall output.

Abstract: Candida albicans is a normal resident of the gastrointestinal tract and also the most prevalent fungal pathogen of humans. It last shared a common ancestor with the model yeast Saccharomyces cerevisiae over 300 million years ago. We describe a collection of 143 genetically matched strains of C. albicans, each of which has been deleted for a specific transcriptional regulator. This collection represents a large fraction of the non-essential transcription circuitry. A phenotypic profile for each mutant was developed using a screen of 55 growth conditions. The results identify the biological roles of many individual transcriptional regulators; for many, this work represents the first description of their functions. For example, a quarter of the strains showed altered colony formation, a phenotype reflecting transitions among yeast, pseudohyphal, and hyphal cell forms. These transitions, which have been closely linked to pathogenesis, have been extensively studied, yet our work nearly doubles the number of transcriptional regulators known to influence them. As a second example, nearly a quarter of the knockout strains affected sensitivity to commonly used antifungal drugs; although a few transcriptional regulators have previously been implicated in susceptibility to these drugs, our work indicates many additional mechanisms of sensitivity and resistance. Finally, our results inform how transcriptional networks evolve. Comparison with the existing S. cerevisiae data (supplemented by additional S. cerevisiae experiments reported here) allows the first systematic analysis of phenotypic conservation by orthologous transcriptional regulators over a large evolutionary distance. We find that, despite the many specific wiring changes documented between these species, the general phenotypes of orthologous transcriptional regulator knockouts are largely conserved. These observations support the idea that many wiring changes affect the detailed architecture of the circuit, but not its overall output.

Journal Article•DOI•

Genome-wide analyses of exonic copy number variants in a family-based study point to novel autism susceptibility genes

Maja Bucan¹, Brett S. Abrahams², Kai Wang¹, Joseph T. Glessner¹, Edward I. Herman², Lisa I. Sonnenblick², Ana I. Alvarez Retuerto², Marcin Imielinski¹, Dexter Hadley¹, Jonathan P. Bradfield¹, Cecilia Kim¹, Nicole B. Gidaya¹, Ingrid E. Lindquist¹, Ted Hutman², Marian Sigman², Vlad Kustanovich, Clara Lajonchere³, Andrew B. Singleton⁴, Junhyong Kim¹, Thomas H. Wassink⁵, William M. McMahon⁶, Thomas Owley⁷, John A. Sweeney⁷, Hilary Coon⁶, John I. Nurnberger⁸, Mingyao Li¹, Rita M. Cantor², Nancy J. Minshew⁹, James S. Sutcliffe¹⁰, Edwin H. Cook⁷, Geraldine Dawson¹¹, Joseph D. Buxbaum¹², Struan F.A. Grant¹, Gerard D. Schellenberg¹, Daniel H. Geschwind², Hakon Hakonarson¹•Institutions (12)

University of Pennsylvania¹, University of California, Los Angeles², University of Southern California³, National Institutes of Health⁴, University of Iowa⁵, University of Utah⁶, University of Illinois at Chicago⁷, Indiana University⁸, University of Pittsburgh⁹, Vanderbilt University¹⁰, University of North Carolina at Chapel Hill¹¹, Icahn School of Medicine at Mount Sinai¹²

26 Jun 2009-PLOS Genetics

TL;DR: To pinpoint genes likely to contribute to ASD etiology, high-density genotyping was performed in 912 multiplex families from the Autism Genetics Resource Exchange collection, and the results were contrasted with those obtained for 1,488 healthy controls.

Abstract: The genetics underlying the autism spectrum disorders (ASDs) is complex and remains poorly understood. Previous work has demonstrated an important role for structural variation in a subset of cases, but has lacked the resolution necessary to move beyond detection of large regions of potential interest to identification of individual genes. To pinpoint genes likely to contribute to ASD etiology, we performed high-density genotyping in 912 multiplex families from the Autism Genetics Resource Exchange (AGRE) collection and contrasted results to those obtained for 1,488 healthy controls. Through prioritization of exonic deletions (eDels), exonic duplications (eDups), and whole gene duplication events (gDups), we identified more than 150 loci harboring rare variants in multiple unrelated probands, but no controls. Importantly, 27 of these were confirmed on examination of an independent replication cohort comprising 859 cases and an additional 1,051 controls. Rare variants at known loci, including exonic deletions at NRXN1 and whole gene duplications encompassing UBE3A and several other genes in the 15q11-q13 region, were observed in the course of these analyses. Strong support was likewise observed for previously unreported genes such as BZRAP1, an adaptor molecule known to regulate synaptic transmission, with eDels or eDups observed in twelve unrelated cases but no controls (p = 2.3×10⁻⁵). Less is known about MDGA2, likewise observed to be case-specific (p = 1.3×10⁻⁴), but it is notable that the encoded protein shows an unexpectedly high similarity to Contactin 4 (BLAST E-value = 3×10⁻³⁹), which has also been linked to disease. That hundreds of distinct rare variants were each seen only once further highlights complexity in the ASDs and points to the continued need for larger cohorts.
