
Showing papers in "American Journal of Human Genetics in 2007"


Journal ArticleDOI
TL;DR: This work introduces PLINK, an open-source C/C++ WGAS tool set, and describes its five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. It focuses on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies.
Abstract: Whole-genome association studies (WGAS) bring new computational, as well as analytic, challenges to researchers. Many existing genetic-analysis tools are not designed to handle such large data sets in a convenient manner and do not necessarily exploit the new opportunities that whole-genome data bring. To address these issues, we developed PLINK, an open-source C/C++ WGAS tool set. With PLINK, large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals can be rapidly manipulated and analyzed in their entirety. As well as providing tools to make the basic analytic steps computationally efficient, PLINK also supports some novel approaches to whole-genome data that take advantage of whole-genome coverage. We introduce PLINK and describe the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. In particular, we focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis.

26,280 citations
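
The identity-by-state (IBS) measure mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes genotypes coded as allele counts (0, 1, 2) with `None` for missing calls; the helper name `ibs_proportion` is hypothetical, and this is not PLINK's own implementation.

```python
from typing import Optional, Sequence


def ibs_proportion(g1: Sequence[Optional[int]], g2: Sequence[Optional[int]]) -> float:
    """Proportion of alleles shared identical by state (IBS) by two individuals.

    Assumption: genotypes are allele counts 0/1/2 and None marks a missing call.
    At each SNP the pair shares 2 - |g1 - g2| alleles IBS (0, 1, or 2).
    """
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must have equal length")
    shared = 0  # running count of alleles shared IBS
    typed = 0   # number of SNPs genotyped in both individuals
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip SNPs missing in either individual
        shared += 2 - abs(a - b)
        typed += 1
    if typed == 0:
        raise ValueError("no SNPs genotyped in both individuals")
    return shared / (2 * typed)


# Hypothetical genotypes for two individuals at four SNPs.
print(ibs_proportion([0, 1, None, 2], [1, 1, 2, 0]))
```

Averaging this quantity over many markers gives a simple pairwise similarity. Estimating identity by descent from it requires allele frequencies and further modelling that this sketch does not attempt.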


Journal ArticleDOI
TL;DR: This work presents a new method and software for inference of haplotype phase and missing data that can accurately phase data from whole-genome association studies, and presents the first comparison of haplotype-inference methods for real and simulated data sets with thousands of genotyped individuals.
Abstract: Whole-genome association studies present many new statistical and computational challenges due to the large quantity of data obtained. One of these challenges is haplotype inference; methods for haplotype inference designed for small data sets from candidate-gene studies do not scale well to the large number of individuals genotyped in whole-genome association studies. We present a new method and software for inference of haplotype phase and missing data that can accurately phase data from whole-genome association studies, and we present the first comparison of haplotype-inference methods for real and simulated data sets with thousands of genotyped individuals. We find that our method outperforms existing methods in terms of both speed and accuracy for large data sets with thousands of individuals and densely spaced genetic markers, and we use our method to phase a real data set of 3,002 individuals genotyped for 490,032 markers in 3.1 days of computing time, with 99% of masked alleles imputed correctly. Our method is implemented in the Beagle software package, which is freely available.

2,849 citations


Journal ArticleDOI
TL;DR: A multitiered, case-control association study of psoriasis in three independent sample sets of white North American individuals found a highly significant association with an IL12B 3'-untranslated-region SNP (rs3212227), confirming the results of a small Japanese study.
Abstract: We performed a multitiered, case-control association study of psoriasis in three independent sample sets of white North American individuals (1,446 cases and 1,432 controls) with 25,215 genecentric single-nucleotide polymorphisms (SNPs) and found a highly significant association with an IL12B 3'-untranslated-region SNP (rs3212227), confirming the results of a small Japanese study. This SNP was significant in all three sample sets (odds ratio [OR_common] 0.64, combined P [P_comb] = 7.85 × 10^-10). A Monte Carlo simulation to address multiple testing suggests that this association is not a type I error. The coding regions of IL12B were resequenced in 96 individuals with psoriasis, and 30 additional IL12B-region SNPs were genotyped. Haplotypes were estimated, and genotype-conditioned analyses identified a second risk allele (rs6887695) located approximately 60 kb upstream of the IL12B coding region that exhibited association with psoriasis after adjustment for rs3212227. Together, these two SNPs mark a common IL12B risk haplotype (OR_common 1.40, P_comb = 8.11 × 10^-9) and a less frequent protective haplotype (OR_common 0.58, P_comb = 5.65 × 10^-12), which were statistically significant in all three studies. Since IL12B encodes the common IL-12p40 subunit of IL-12 and IL-23, we individually genotyped 17 SNPs in the genes encoding the other chains of these cytokines (IL12A and IL23A) and their receptors (IL12RB1, IL12RB2, and IL23R). Haplotype analyses identified two IL23R missense SNPs that together mark a common psoriasis-associated haplotype in all three studies (OR_common 1.44, P_comb = 3.13 × 10^-6). Individuals homozygous for both the IL12B and the IL23R predisposing haplotypes have an increased risk of disease (OR_common 1.66, P_comb = 1.33 × 10^-8). These data, and the previous observation that administration of an antibody specific for the IL-12p40 subunit to patients with psoriasis is highly efficacious, suggest that these genes play a fundamental role in psoriasis pathogenesis.

1,078 citations


Journal ArticleDOI
TL;DR: It is demonstrated that pathway-based approaches, which jointly consider multiple contributing factors in the same pathway, might complement the most-significant SNPs/genes approach and provide additional insights into interpretation of GWA data on complex diseases.
Abstract: Published genomewide association (GWA) studies typically analyze and report single-nucleotide polymorphisms (SNPs) and their neighboring genes with the strongest evidence of association (the "most-significant SNPs/genes" approach), while paying little attention to the rest. Borrowing ideas from microarray data analysis, we demonstrate that pathway-based approaches, which jointly consider multiple contributing factors in the same pathway, might complement the most-significant SNPs/genes approach and provide additional insights into interpretation of GWA data on complex diseases.

889 citations


Journal ArticleDOI
TL;DR: The rapid progress in an important part of medical genetics and genomics is reviewed, as chronicled in MIM/OMIM over these 40 years, and the future challenges of OMIM are contemplated.
Abstract: Last year marked the 40th anniversary of the publication of the first print edition of Mendelian Inheritance in Man (MIM). This seems an appropriate juncture at which to review its origins, evolution, and present status, including and particularly those of its online version, OMIM (Online Mendelian Inheritance in Man). This is an opportunity, at the same time, to review in brief the rapid progress in an important part of medical genetics and genomics, as chronicled in MIM/OMIM over these 40 years, and to contemplate the future challenges of OMIM.

679 citations


Journal ArticleDOI
TL;DR: The combined data provide support that haploinsufficiency of SHANK3 can cause a monogenic form of autism in sufficient frequency to warrant consideration in clinical diagnostic testing.
Abstract: Mutations in SHANK3, which encodes a synaptic scaffolding protein, have been described in subjects with an autism spectrum disorder (ASD). To assess the quantitative contribution of SHANK3 to the pathogenesis of autism, we determined the frequency of DNA sequence and copy-number variants in this gene in 400 ASD-affected subjects ascertained in Canada. One de novo mutation and two gene deletions were discovered, indicating a contribution of 0.75% in this cohort. One additional SHANK3 deletion was characterized in two ASD-affected siblings from another collection, which brings the total number of published mutations in unrelated ASD-affected families to seven. The combined data provide support that haploinsufficiency of SHANK3 can cause a monogenic form of autism in sufficient frequency to warrant consideration in clinical diagnostic testing.

622 citations


Journal ArticleDOI
TL;DR: It is found that ~20% of new missense mutations in humans result in a loss of function, whereas ~27% are effectively neutral, implying that mutation-selection balance may be a feasible evolutionary mechanism underlying some common diseases.
Abstract: The accumulation of mildly deleterious missense mutations in individual human genomes has been proposed to be a genetic basis for complex diseases. The plausibility of this hypothesis depends on quantitative estimates of the prevalence of mildly deleterious de novo mutations and polymorphic variants in humans and on the intensity of selective pressure against them. We combined analysis of mutations causing human Mendelian diseases, of human-chimpanzee divergence, and of systematic data on human genetic variation and found that ∼20% of new missense mutations in humans result in a loss of function, whereas ∼27% are effectively neutral. Thus, the remaining 53% of new missense mutations have mildly deleterious effects. These mutations give rise to many low-frequency deleterious allelic variants in the human population, as is evident from a new data set of 37 genes sequenced in >1,500 individual human chromosomes. Surprisingly, up to 70% of low-frequency missense alleles are mildly deleterious and are associated with a heterozygous fitness loss in the range 0.001–0.003. Thus, the low allele frequency of an amino acid variant can, by itself, serve as a predictor of its functional significance. Several recent studies have reported a significant excess of rare missense variants in candidate genes or pathways in individuals with extreme values of quantitative phenotypes. These studies would be unlikely to yield results if most rare variants were neutral or if rare variants were not a significant contributor to the genetic component of phenotypic inheritance. Our results provide a justification for these types of candidate-gene (pathway) association studies and imply that mutation-selection balance may be a feasible evolutionary mechanism underlying some common diseases.

614 citations


Journal ArticleDOI
TL;DR: A generalized MDR (GMDR) method is reported that permits adjustment for discrete and quantitative covariates and is applicable to both dichotomous and continuous phenotypes in various population-based study designs and serves the purpose of identifying contributors to population variation better than do the other existing methods.
Abstract: The determination of gene-by-gene and gene-by-environment interactions has long been one of the greatest challenges in genetics. The traditional methods are typically inadequate because of the problem referred to as the "curse of dimensionality." Recent combinatorial approaches, such as the multifactor dimensionality reduction (MDR) method, the combinatorial partitioning method, and the restricted partition method, have a straightforward correspondence to the concept of the phenotypic landscape that unifies biological, statistical genetics, and evolutionary theories. However, the existing approaches have several limitations, such as not allowing for covariates, that restrict their practical use. In this study, we report a generalized MDR (GMDR) method that permits adjustment for discrete and quantitative covariates and is applicable to both dichotomous and continuous phenotypes in various population-based study designs. Computer simulations indicated that the GMDR method has superior performance in its ability to identify epistatic loci, compared with current methods in the literature. We applied our proposed method to a genetics study of four genes that were reported to be associated with nicotine dependence and found significant joint action between CHRNB4 and NTRK2. Moreover, our example illustrates that the newly proposed GMDR approach can increase prediction ability, suggesting that its use is justified in practice. In summary, GMDR serves the purpose of identifying contributors to population variation better than do the other existing methods.

527 citations


Journal ArticleDOI
TL;DR: An in-depth survey of CNVs across the human genome provides a valuable baseline for studies involving human genetics and raises the possibility of the contribution of microRNAs to phenotypic diversity in humans.
Abstract: Segmental copy-number variations (CNVs) in the human genome are associated with developmental disorders and susceptibility to diseases. More importantly, CNVs may represent a major genetic component of our phenotypic diversity. In this study, using a whole-genome array comparative genomic hybridization assay, we identified 3,654 autosomal segmental CNVs, 800 of which appeared at a frequency of at least 3%. Of these frequent CNVs, 77% are novel. In the 95 individuals analyzed, the two most diverse genomes differed by at least 9 Mb in size or varied by at least 266 loci in content. Approximately 68% of the 800 polymorphic regions overlap with genes, which may reflect human diversity in senses (smell, hearing, taste, and sight), rhesus phenotype, metabolism, and disease susceptibility. Intriguingly, 14 polymorphic regions harbor 21 of the known human microRNAs, raising the possibility of the contribution of microRNAs to phenotypic diversity in humans. This in-depth survey of CNVs across the human genome provides a valuable baseline for studies involving human genetics.

517 citations


Journal ArticleDOI
TL;DR: This paper investigates the correlation between the fibrillin-1 (FBN1) genotype and the nature and severity of the clinical phenotype across skeletal, cardiovascular, ophthalmologic, skin, pulmonary, and dural parameters.
Abstract: Mutations in the fibrillin-1 (FBN1) gene cause Marfan syndrome (MFS) and have been associated with a wide range of overlapping phenotypes. Clinical care is complicated by variable age at onset and the wide range of severity of aortic features. The factors that modulate phenotypical severity, both among and within families, remain to be determined. The availability of the international FBN1 mutation Universal Mutation Database (UMD-FBN1) has allowed us to perform the largest collaborative study ever reported, to investigate the correlation between the FBN1 genotype and the nature and severity of the clinical phenotype. A range of qualitative and quantitative clinical parameters (skeletal, cardiovascular, ophthalmologic, skin, pulmonary, and dural) was compared for different classes of mutation (types and locations) in 1,013 probands with a pathogenic FBN1 mutation. A higher probability of ectopia lentis was found for patients with a missense mutation substituting or producing a cysteine, when compared with other missense mutations. Patients with an FBN1 premature termination codon had a more severe skeletal and skin phenotype than did patients with an inframe mutation. Mutations in exons 24-32 were associated with a more severe and complete phenotype, including younger age at diagnosis of type I fibrillinopathy and higher probability of developing ectopia lentis, ascending aortic dilatation, aortic surgery, mitral valve abnormalities, scoliosis, and shorter survival; the majority of these results were replicated even when cases of neonatal MFS were excluded. These correlations, found between different mutation types and clinical manifestations, might be explained by different underlying genetic mechanisms (dominant negative versus haploinsufficiency) and by consideration of the two main physiological functions of fibrillin-1 (structural versus mediator of TGF beta signalling). Exon 24-32 mutations define a high-risk group for cardiac manifestations associated with severe prognosis at all ages.

491 citations


Journal ArticleDOI
TL;DR: The authors' data indicate that SMC3 and SMC1A mutations contribute to approximately 5% of cases of CdLS, result in a consistently mild phenotype with absence of major structural anomalies typically associated with CdLS, and, in some instances, result in a phenotype that approaches that of apparently nonsyndromic mental retardation.
Abstract: Mutations in the cohesin regulators NIPBL and ESCO2 are causative of the Cornelia de Lange syndrome (CdLS) and Roberts or SC phocomelia syndrome, respectively. Recently, mutations in the cohesin complex structural component SMC1A have been identified in two probands with features of CdLS. Here, we report the identification of a mutation in the gene encoding the complementary subunit of the cohesin heterodimer, SMC3, and 14 additional SMC1A mutations. All mutations are predicted to retain an open reading frame, and no truncating mutations were identified. Structural analysis of the mutant SMC3 and SMC1A proteins indicates that all are likely to produce functional cohesin complexes, but we posit that they may alter their chromosome binding dynamics. Our data indicate that SMC3 and SMC1A mutations (1) contribute to ∼5% of cases of CdLS, (2) result in a consistently mild phenotype with absence of major structural anomalies typically associated with CdLS, and (3) in some instances, result in a phenotype that approaches that of apparently nonsyndromic mental retardation.

Journal ArticleDOI
TL;DR: Theoretical modeling is used to demonstrate that flip-flop associations can occur when the investigated variant is correlated, through interactive effects or linkage disequilibrium, with a causal variant at another locus, and it is shown how these findings could explain previous reports of flip-flop associations.
Abstract: An increasing number of publications are replicating a previously reported disease-marker association but with the risk allele reversed from the previous report. Do such "flip-flop" associations confirm or refute the previous association findings? We hypothesized that these associations may indeed be confirmations but that multilocus effects and variation in interlocus correlations contribute to this flip-flop phenomenon. We used theoretical modeling to demonstrate that flip-flop associations can occur when the investigated variant is correlated, through interactive effects or linkage disequilibrium, with a causal variant at another locus, and we show how these findings could explain previous reports of flip-flop associations.

Journal ArticleDOI
TL;DR: A new pattern of interactions is described between the two major known genetic risk factors and the major environmental risk factor concerning the risk of developing anti-CCP-positive RA, extending the basis for a pathogenetic hypothesis for RA involving genetic and environmental factors.
Abstract: Gene-gene and gene-environment interactions are key features in the development of rheumatoid arthritis (RA) and other complex diseases. The aim of this study was to use and compare three different definitions of interaction between the two major genetic risk factors of RA—the HLA-DRB1 shared epitope (SE) alleles and the PTPN22 R620W allele—in three large case-control studies: the Swedish Epidemiological Investigation of Rheumatoid Arthritis (EIRA) study, the North American RA Consortium (NARAC) study, and the Dutch Leiden Early Arthritis Clinic study (in total, 1,977 cases and 2,405 controls). The EIRA study was also used to analyze interactions between smoking and the two genes. “Interaction” was defined either as a departure from additivity, as interaction in a multiplicative model, or in terms of linkage disequilibrium—for example, deviation from independence of penetrance of two unlinked loci. Consistent interaction, defined as departure from additivity, between HLA-DRB1 SE alleles and the A allele of PTPN22 R620W was seen in all three studies regarding anti-CCP–positive RA. Testing for multiplicative interactions demonstrated an interaction between the two genes only when the three studies were pooled. The linkage disequilibrium approach indicated a gene-gene interaction in EIRA and NARAC, as well as in the pooled analysis. No interaction was seen between smoking and PTPN22 R620W. A new pattern of interactions is described between the two major known genetic risk factors and the major environmental risk factor concerning the risk of developing anti-CCP–positive RA. The data extend the basis for a pathogenetic hypothesis for RA involving genetic and environmental factors. The study also raises and illustrates principal questions concerning ways to define interactions in complex diseases.
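
The difference between the additive and multiplicative definitions of interaction in the abstract above can be made concrete with a small Python sketch. The function names are hypothetical, the relative risks are invented for illustration and are not taken from the study, and in a case-control setting odds ratios would stand in for relative risks.

```python
def additive_interaction(rr_both: float, rr_a_only: float, rr_b_only: float) -> float:
    """Relative excess risk due to interaction (RERI).

    All relative risks are measured against the group exposed to neither factor.
    A value of 0 means the joint effect is exactly additive; a value above 0
    is a positive departure from additivity.
    """
    return rr_both - rr_a_only - rr_b_only + 1.0


def multiplicative_interaction(rr_both: float, rr_a_only: float, rr_b_only: float) -> float:
    """Ratio of the joint relative risk to the product of the single-factor risks.

    A value of 1 means the joint effect is exactly multiplicative.
    """
    return rr_both / (rr_a_only * rr_b_only)


# Hypothetical relative risks, chosen only for illustration.
rr_a, rr_b, rr_ab = 2.0, 3.0, 6.0
print(additive_interaction(rr_ab, rr_a, rr_b))        # 2.0: departure from additivity
print(multiplicative_interaction(rr_ab, rr_a, rr_b))  # 1.0: no multiplicative interaction
```

With these invented numbers the same data show interaction on the additive scale but none on the multiplicative scale, which is the kind of definitional dependence the abstract raises.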

Journal ArticleDOI
TL;DR: The Bayesian false-discovery probability (BFDP) shares the ease of calculation of the recently proposed false-positive report probability (FPRP) but uses more information, has a noteworthy threshold defined naturally in terms of the costs of false discovery and nondiscovery, and has a sound methodological foundation.
Abstract: In light of the vast amounts of genomic data that are now being generated, we propose a new measure, the Bayesian false-discovery probability (BFDP), for assessing the noteworthiness of an observed association. BFDP shares the ease of calculation of the recently proposed false-positive report probability (FPRP) but uses more information, has a noteworthy threshold defined naturally in terms of the costs of false discovery and nondiscovery, and has a sound methodological foundation. In addition, in a multiple-testing situation, it is straightforward to estimate the expected numbers of false discoveries and false nondiscoveries. We provide an in-depth discussion of FPRP, including a comparison with the q value, and examine the empirical behavior of these measures, along with BFDP, via simulation. Finally, we use BFDP to assess the association between 131 single-nucleotide polymorphisms and lung cancer in a case-control study.
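
A minimal Python sketch of how a quantity like BFDP might be computed is given below. It assumes a normal approximation in which the effect estimate is distributed as N(beta, se^2) and the prior under the alternative is N(0, prior_var). The abstract does not spell out these formulas, so the approximate Bayes factor, the function names, and the parameter names should be read as assumptions rather than as the paper's exact definitions.

```python
import math


def approximate_bayes_factor(beta_hat: float, se: float, prior_var: float) -> float:
    """Bayes factor Pr(data | H0) / Pr(data | H1) under a normal approximation.

    Assumptions: beta_hat ~ N(0, se^2) under H0 and beta_hat ~ N(0, se^2 + prior_var)
    under H1, which follows from a N(0, prior_var) prior on the true effect.
    """
    v = se ** 2
    z_squared = (beta_hat / se) ** 2
    shrink = prior_var / (v + prior_var)
    return math.sqrt((v + prior_var) / v) * math.exp(-0.5 * z_squared * shrink)


def bfdp(beta_hat: float, se: float, prior_var: float, prior_null: float) -> float:
    """Posterior probability of the null given the estimate (the BFDP)."""
    prior_odds = prior_null / (1.0 - prior_null)
    weighted = approximate_bayes_factor(beta_hat, se, prior_var) * prior_odds
    return weighted / (1.0 + weighted)


def noteworthy_threshold(cost_ratio: float) -> float:
    """BFDP threshold when a false nondiscovery costs `cost_ratio` times a false discovery."""
    return cost_ratio / (1.0 + cost_ratio)


# Hypothetical inputs: log odds ratio 0.4 with standard error 0.1.
value = bfdp(beta_hat=0.4, se=0.1, prior_var=0.04, prior_null=0.99)
print(value < noteworthy_threshold(4.0))
```

An association would be called noteworthy when its BFDP falls below the threshold, so a higher relative cost of missing a true signal makes the rule more permissive.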

Journal ArticleDOI
TL;DR: Family-based association tests suggested that a specific haplotype with a single short C4B in tight linkage disequilibrium with the -308A allele of TNFA was more likely to be transmitted to patients with SLE.
Abstract: Interindividual gene copy-number variation (CNV) of complement component C4 and its associated polymorphisms in gene size (long and short) and protein isotypes (C4A and C4B) probably lead to different susceptibilities to autoimmune disease. We investigated the C4 gene CNV in 1,241 European Americans, including patients with systemic lupus erythematosus (SLE), their first-degree relatives, and unrelated healthy subjects, by definitive genotyping and phenotyping techniques. The gene copy number (GCN) varied from 2 to 6 for total C4, from 0 to 5 for C4A, and from 0 to 4 for C4B. Four copies of total C4, two copies of C4A, and two copies of C4B were the most common GCN counts, but each constituted only between one-half and three-quarters of the study populations. Long C4 genes were strongly correlated with C4A (R=0.695; P […]) […] C4 genes were correlated with C4B (R=0.437; P […]) […] C4 and C4A shifting to the lower side. The risk of SLE disease susceptibility significantly increased among subjects with only two copies of total C4 (patients 9.3%; unrelated controls 1.5%; odds ratio [OR]=6.514; P=.00002) but decreased in those with ≥5 copies of C4 (patients 5.79%; controls 12%; OR=0.466; P=.016). Both zero copies (OR=5.267; P=.001) and one copy (OR=1.613; P=.022) of C4A were risk factors for SLE, whereas ≥3 copies of C4A appeared to be protective (OR=0.574; P=.012). Family-based association tests suggested that a specific haplotype with a single short C4B in tight linkage disequilibrium with the −308A allele of TNFA was more likely to be transmitted to patients with SLE. This work demonstrates how gene CNV and its related polymorphisms are associated with the susceptibility to a human complex disease.

Journal ArticleDOI
TL;DR: A method of computing P values adjusted for correlated tests (P_ACT) that attains the accuracy of permutation or simulation-based tests in much less computation time is presented, and it is shown that the method applies to many common association tests that are based on multiple traits, markers, and genetic models.
Abstract: Contemporary genetic association studies may test hundreds of thousands of genetic variants for association, often with multiple binary and continuous traits or under more than one model of inheritance. Many of these association tests may be correlated with one another because of linkage disequilibrium between nearby markers and correlation between traits and models. Permutation tests and simulation-based methods are often employed to adjust groups of correlated tests for multiple testing, since conventional methods such as Bonferroni correction are overly conservative when tests are correlated. We present here a method of computing P values adjusted for correlated tests (P_ACT) that attains the accuracy of permutation or simulation-based tests in much less computation time, and we show that our method applies to many common association tests that are based on multiple traits, markers, and genetic models. Simulation demonstrates that P_ACT attains the power of permutation testing and provides a valid adjustment for hundreds of correlated association tests. In data analyzed as part of the Finland–United States Investigation of NIDDM Genetics (FUSION) study, we observe a near one-to-one relationship (r^2 > .999) between P_ACT and the corresponding permutation-based P values, achieving the same precision as permutation testing but thousands of times faster.
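
To show what quantity an adjustment for correlated tests targets, the following Python sketch estimates the probability that the smallest of several correlated two-sided P values falls at or below the observed minimum under the global null. It uses plain Monte Carlo sampling from a multivariate normal distribution, which is the slower style of approach that the abstract contrasts with its method, so it is a stand-in and not the authors' algorithm. The function name and the assumption of a known correlation matrix for the test statistics are illustrative.

```python
from statistics import NormalDist

import numpy as np


def monte_carlo_min_p(p_min: float, corr: np.ndarray,
                      n_draws: int = 200_000, seed: int = 0) -> float:
    """Estimate P(smallest two-sided P value <= p_min) under the global null.

    Assumption: the test statistics are jointly normal with unit variances and
    correlation matrix `corr` when no association exists.
    """
    rng = np.random.default_rng(seed)
    # |z| cutoff that corresponds to a two-sided P value of p_min.
    z_cut = NormalDist().inv_cdf(1.0 - p_min / 2.0)
    draws = rng.multivariate_normal(np.zeros(corr.shape[0]), corr, size=n_draws)
    # The minimum P value is <= p_min exactly when the largest |z| reaches z_cut.
    return float((np.abs(draws).max(axis=1) >= z_cut).mean())


# Five hypothetical tests with pairwise correlation 0.99.
corr = np.full((5, 5), 0.99)
np.fill_diagonal(corr, 1.0)
print(monte_carlo_min_p(0.01, corr), "versus Bonferroni", min(1.0, 5 * 0.01))
```

For strongly correlated tests the estimate sits well below the Bonferroni bound, which illustrates why Bonferroni correction is described as overly conservative in that setting.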

Journal ArticleDOI
TL;DR: A computationally efficient approach to testing association between SNPs and quantitative phenotypes, which can be applied to whole-genome association scans and allows estimation of missing genotypes, resulting in substantial increases in power when genotyping resources are limited.
Abstract: With millions of single-nucleotide polymorphisms (SNPs) identified and characterized, genomewide association studies have begun to identify susceptibility genes for complex traits and diseases. These studies involve the characterization and analysis of very-high-resolution SNP genotype data for hundreds or thousands of individuals. We describe a computationally efficient approach to testing association between SNPs and quantitative phenotypes, which can be applied to whole-genome association scans. In addition to observed genotypes, our approach allows estimation of missing genotypes, resulting in substantial increases in power when genotyping resources are limited. We estimate missing genotypes probabilistically using the Lander-Green or Elston-Stewart algorithms and combine high-resolution SNP genotypes for a subset of individuals in each pedigree with sparser marker data for the remaining individuals. We show that power is increased whenever phenotype information for ungenotyped individuals is included in analyses and that high-density genotyping of just three carefully selected individuals in a nuclear family can recover >90% of the information available if every individual were genotyped, for a fraction of the cost and experimental effort. To aid in study design, we evaluate the power of strategies that genotype different subsets of individuals in each pedigree and make recommendations about which individuals should be genotyped at a high density. To illustrate our method, we performed genomewide association analysis for 27 gene-expression phenotypes in 3-generation families (Centre d'Etude du Polymorphisme Humain pedigrees), in which genotypes for ∼860,000 SNPs in 90 grandparents and parents are complemented by genotypes for ∼6,700 SNPs in a total of 168 individuals. In addition to increasing the evidence of association at 15 previously identified cis-acting associated alleles, our genotype-inference algorithm allowed us to identify associated alleles at 4 cis-acting loci that were missed when analysis was restricted to individuals with the high-density SNP data. Our genotype-inference algorithm and the proposed association tests are implemented in software that is available for free.

Journal ArticleDOI
TL;DR: The global assessment of 1,433 sequence variants of unknown significance in the BRCA genes will be invaluable for validation of functional assays, structural models, and in silico analyses.
Abstract: Mutation screening of the breast and ovarian cancer–predisposition genes BRCA1 and BRCA2 is becoming an increasingly important part of clinical practice. Classification of rare nontruncating sequence variants in these genes is problematic, because it is not known whether these subtle changes alter function sufficiently to predispose cells to cancer development. Using data from the Myriad Genetic Laboratories database of nearly 70,000 full-sequence tests, we assessed the clinical significance of 1,433 sequence variants of unknown significance (VUSs) in the BRCA genes. Three independent measures were employed in the assessment: co-occurrence in trans of a VUS with known deleterious mutations; detailed analysis, by logistic regression, of personal and family history of cancer in VUS-carrying probands; and, in a subset of probands, an analysis of cosegregation with disease in pedigrees. For each of these factors, a likelihood ratio was computed under the hypothesis that the VUSs were equivalent to an “average” deleterious mutation, compared with neutral, with respect to risk. The likelihood ratios derived from each component were combined to provide an overall assessment for each VUS. A total of 133 VUSs had odds of at least 100:1 in favor of neutrality with respect to risk, whereas 43 had odds of at least 20:1 in favor of being deleterious. VUSs with evidence in favor of causality were those that were predicted to affect splicing, fell at positions that are highly conserved among BRCA orthologs, and were more likely to be located in specific domains of the proteins. In addition to their utility for improved genetics counseling of patients and their families, the global assessment reported here will be invaluable for validation of functional assays, structural models, and in silico analyses.
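
The step of combining per-component likelihood ratios into an overall assessment can be sketched in Python as follows. The sketch assumes the components are independent, so that their likelihood ratios multiply, and it treats the product directly as the odds of being deleterious. Both are simplifying assumptions; the function names are hypothetical, and the default cutoffs simply mirror the 100:1 and 20:1 odds quoted in the abstract.

```python
import math
from typing import Iterable


def combined_odds(likelihood_ratios: Iterable[float]) -> float:
    """Multiply likelihood ratios (deleterious versus neutral) across components.

    Assumption: the components are independent, so their ratios multiply.
    """
    return math.prod(likelihood_ratios)


def classify(odds_deleterious: float,
             neutral_cutoff: float = 100.0,
             deleterious_cutoff: float = 20.0) -> str:
    """Label a variant from its combined odds of being deleterious."""
    if odds_deleterious >= deleterious_cutoff:
        return "likely deleterious"
    if odds_deleterious <= 1.0 / neutral_cutoff:
        return "likely neutral"
    return "unclassified"


# Hypothetical ratios from co-occurrence, family history, and cosegregation.
print(classify(combined_odds([0.1, 0.2, 0.4])))
```

Variants whose evidence points in different directions across components end up between the two cutoffs and remain unclassified under this rule.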

Journal ArticleDOI
TL;DR: This work presents an approach that corrects for the ascertainment bias and generates an estimate of the frequency of a variant and its penetrance parameters and shows that application of the method to case-control data can improve the design of replication studies considerably.
Abstract: Genomewide association studies are now a widely used approach in the search for loci that affect complex traits. After detection of significant association, estimates of penetrance and allele-frequency parameters for the associated variant indicate the importance of that variant and facilitate the planning of replication studies. However, when these estimates are based on the original data used to detect the variant, the results are affected by an ascertainment bias known as the "winner's curse." The actual genetic effect is typically smaller than its estimate. This overestimation of the genetic effect may cause replication studies to fail because the necessary sample size is underestimated. Here, we present an approach that corrects for the ascertainment bias and generates an estimate of the frequency of a variant and its penetrance parameters. The method produces a point estimate and confidence region for the parameter estimates. We study the performance of this method using simulated data sets and show that it is possible to greatly reduce the bias in the parameter estimates, even when the original association study had low power. The uncertainty of the estimate decreases with increasing sample size, independent of the power of the original test for association. Finally, we show that application of the method to case-control data can improve the design of replication studies considerably.
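
The ascertainment bias described above can be reproduced with a short simulation. The Python sketch below only demonstrates the bias and does not implement the correction method that the abstract proposes. It assumes a normally distributed effect estimate with a known standard error, and every number in it is hypothetical.

```python
from statistics import NormalDist

import numpy as np


def winners_curse_demo(true_beta: float = 0.1, se: float = 0.05,
                       alpha: float = 1e-4, n_studies: int = 200_000,
                       seed: int = 1) -> tuple[float, float]:
    """Simulate many studies of one variant and average only the significant estimates.

    Returns (power, mean estimate among significant studies). The second value
    exceeds `true_beta` when power is low, which is the winner's curse.
    """
    rng = np.random.default_rng(seed)
    estimates = rng.normal(true_beta, se, size=n_studies)
    # Two-sided significance cutoff on the z statistic.
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    significant = np.abs(estimates / se) >= z_crit
    return float(significant.mean()), float(estimates[significant].mean())


power, mean_significant = winners_curse_demo()
print(f"power={power:.3f}, mean significant estimate={mean_significant:.3f}, true effect=0.100")
```

Because only estimates that clear the cutoff are retained, their average must be at least `z_crit * se`, which in this low-power setting is nearly twice the true effect. A replication study sized from that inflated estimate would be underpowered.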

Journal ArticleDOI
TL;DR: A SNP in the 3' untranslated region of HLA-G is reported that influences the targeting of three microRNAs (miRNAs) to this gene, and it is suggested that allele-specific targeting of these miRNAs accounts, at least in part, for earlier observations on HLA-G and the risk of asthma.
Abstract: HLA-G is a nonclassic, class I HLA molecule that has important immunomodulatory properties. Previously, we identified HLA-G as an asthma-susceptibility gene and discovered that the risk of asthma in a child was determined by both the child's HLA-G genotype and the mother's affection status. Here we report a SNP in the 3' untranslated region of HLA-G that influences the targeting of three microRNAs (miRNAs) to this gene, and we suggest that allele-specific targeting of these miRNAs accounts, at least in part, for our earlier observations on HLA-G and the risk of asthma.

Journal ArticleDOI
Gillian I. Rice, Teresa Patrick, Rekha Parmar, Claire F Taylor, Alec Aeby, Jean Aicardi, Rafael Artuch, Simon Attard Montalto, Carlos A. Bacino, Bruno Barroso, Peter Baxter, Willam S Benko, Carsten Bergmann, Enrico Bertini, Roberta Biancheri, Edward Blair, Nenad Blau, David T. Bonthron, Tracy A Briggs, Louise Brueton, Han G. Brunner, Christopher J. Burke, Ian M. Carr, Daniel R. Carvalho, Kate Chandler, Hans-Jurgen Christen, Peter Corry, Frances M. Cowan, Helen Cox, Stefano D'Arrigo, John Dean, Corinne De Laet, Claudine De Praeter, Catherine Dery, Colin D. Ferrie, Kim Flintoff, Suzanna G.M. Frints, Angels García-Cazorla, Blanca Gener, Cyril Goizet, Francoise Goutieres, Andrew Green, Agnes Guet, Ben C.J. Hamel, Bruce E. Hayward, Arvid Heiberg, Raoul C.M. Hennekam, Marie Husson, Andrew P. Jackson, Rasieka Jayatunga, Yong-hui Jiang, Sarina G. Kant, Amy Kao, Mary D. King, Helen Kingston, Joerg Klepper, Marjo S. van der Knaap, Andrew J. Kornberg, Dieter Kotzot, Wilfried Kratzer, Didier Lacombe, Lieven Lagae, Pierre Landrieu, Giovanni Lanzi, Andrea Leitch, Ming K. Lim, John H. Livingston, Charles Marques Lourenço, E G Hermione Lyall, Sally Ann Lynch, Michael J. Lyons, Daphna Marom, John P McClure, Robert McWilliam, Serge B. Melançon, Leena D Mewasingh, Marie-Laure Moutard, Ken K. Nischal, John R. Østergaard, Julie S. Prendiville, Magnhild Rasmussen, R. Curtis Rogers, Dominique Roland, Elisabeth Rosser, Kevin Rostasy, Agathe Roubertie, Amparo Sanchis, Raphael Schiffmann, Sabine Scholl-Bürgi, Sunita Seal, Stavit A. Shalev, C Sierra Corcoles, Gyan P Sinha, Doriette Soler, Ronen Spiegel, John B.P. Stephenson, Uta Tacke, Tiong Yang Tan, Marianne Till, John Tolmie, Pam Tomlin, Federica Vagnarelli, Enza Maria Valente, Rudy Van Coster, Nathalie Van der Aa, Adeline Vanderver, Johannes S H Vles, Thomas Voit, Evangeline Wassmer, Bernhard Weschke, Margo L. Whiteford, Michèl A.A.P. Willemsen, Andreas Zankl, Sameer M. Zuberi, Simona Orcesi, Elisa Fazzi, Pierre Lebon, Yanick J. Crow 
TL;DR: The analysis defines the phenotypic spectrum of AGS and suggests a coherent mutation-screening strategy in this heterogeneous disorder, and indicates that at least one further AGS-causing gene remains to be identified.
Abstract: Aicardi-Goutieres syndrome (AGS) is a genetic encephalopathy whose clinical features mimic those of acquired in utero viral infection. AGS exhibits locus heterogeneity, with mutations identified in genes encoding the 3′→5′ exonuclease TREX1 and the three subunits of the RNASEH2 endonuclease complex. To define the molecular spectrum of AGS, we performed mutation screening in patients, from 127 pedigrees, with a clinical diagnosis of the disease. Biallelic mutations in TREX1, RNASEH2A, RNASEH2B, and RNASEH2C were observed in 31, 3, 47, and 18 families, respectively. In five families, we identified an RNASEH2A or RNASEH2B mutation on one allele only. In one child, the disease occurred because of a de novo heterozygous TREX1 mutation. In 22 families, no mutations were found. Null mutations were common in TREX1, although a specific missense mutation was observed frequently in patients from northern Europe. Almost all mutations in RNASEH2A, RNASEH2B, and RNASEH2C were missense. We identified an RNASEH2C founder mutation in 13 Pakistani families. We also collected clinical data from 123 mutation-positive patients. Two clinical presentations could be delineated: an early-onset neonatal form, highly reminiscent of congenital infection seen particularly with TREX1 mutations, and a later-onset presentation, sometimes occurring after several months of normal development and occasionally associated with remarkably preserved neurological function, most frequently due to RNASEH2B mutations. Mortality was correlated with genotype; 34.3% of patients with TREX1, RNASEH2A, and RNASEH2C mutations versus 8.0% of RNASEH2B mutation–positive patients were known to have died (P=.001). Our analysis defines the phenotypic spectrum of AGS and suggests a coherent mutation-screening strategy in this heterogeneous disorder. Additionally, our data indicate that at least one further AGS-causing gene remains to be identified.
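
The family counts in the abstract can be tallied as a quick consistency check. The sketch below assumes the listed categories are mutually exclusive and that the single child with a de novo TREX1 mutation counts as one pedigree; the dictionary labels are shorthand introduced here.

```python
# Family counts as quoted in the abstract (labels are informal shorthand).
families = {
    "TREX1, biallelic": 31,
    "RNASEH2A, biallelic": 3,
    "RNASEH2B, biallelic": 47,
    "RNASEH2C, biallelic": 18,
    "RNASEH2A or RNASEH2B, one allele only": 5,
    "TREX1, de novo heterozygous": 1,
    "no mutation found": 22,
}

total = sum(families.values())
biallelic = sum(n for label, n in families.items() if label.endswith("biallelic"))

print(f"total pedigrees: {total}")  # matches the 127 pedigrees screened
print(f"biallelic share: {biallelic / total:.1%}")
print(f"unsolved share:  {families['no mutation found'] / total:.1%}")
```

Under these assumptions the categories account for all 127 pedigrees, and the 22 families without a detected mutation are the basis for the statement that at least one further AGS-causing gene remains to be found.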

Journal ArticleDOI
TL;DR: Mapping annotated SNPs onto a collection of experimentally supported human miRNA targets identified a target site, within the 3′ UTR of AGTR1, that contains SNP rs5186; the 1166C allele may be functionally associated with hypertension by abrogating regulation by hsa-miR-155, thereby elevating AGTR1 levels.
Abstract: Animal microRNAs (miRNAs) regulate gene expression through base pairing to their targets within the 3′ untranslated region (UTR) of protein-coding genes. Single-nucleotide polymorphisms (SNPs) located within such target sites can affect miRNA regulation. We mapped annotated SNPs onto a collection of experimentally supported human miRNA targets. Of the 143 experimentally supported human target sites, 9 contain 12 SNPs. We further experimentally investigated one of these target sites for hsa-miR-155, within the 3′ UTR of the human AGTR1 gene that contains SNP rs5186. Using reporter silencing assays, we show that hsa-miR-155 down-regulates the expression of only the 1166A, and not the 1166C, allele of rs5186. Remarkably, the 1166C allele has been associated with hypertension in many studies. Thus, the 1166C allele may be functionally associated with hypertension by abrogating regulation by hsa-miR-155, thereby elevating AGTR1 levels. Since hsa-miR-155 is on chromosome 21, we hypothesize that the observed lower blood pressure in trisomy 21 is partially caused by the overexpression of hsa-miR-155 leading to allele-specific underexpression of AGTR1. Indeed, we have shown in fibroblasts from monozygotic twins discordant for trisomy 21 that levels of AGTR1 protein are lower in trisomy 21.
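
How a single-nucleotide change inside a target site can abolish miRNA pairing can be sketched with a simple seed-match test. The sequences below are toy sequences invented for illustration; they are not the real hsa-miR-155 or AGTR1 sequences. The rule used, perfect complementarity to miRNA positions 2 to 8, is a common simplification of target recognition and is an assumption here rather than the method of the study.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def has_seed_site(utr, mirna):
    """True if the UTR contains a perfect match to the miRNA seed
    (nucleotides 2-8 of the miRNA, counted from its 5' end)."""
    seed = mirna[1:8]
    return reverse_complement(seed) in utr


# Toy sequences (hypothetical, for illustration only).
toy_mirna = "UAGCAUUAACGGAUCCGAUGCU"
utr_allele_a = "GGCACUAAUGCUCCGA"  # carries an intact seed-match site
utr_allele_c = "GGCACUCAUGCUCCGA"  # same UTR with an A->C change in the site

print(has_seed_site(utr_allele_a, toy_mirna))
print(has_seed_site(utr_allele_c, toy_mirna))
```

In this toy case only the A allele retains the seed match, mirroring the abstract's finding that the miRNA down-regulates one allele and not the other.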

Journal ArticleDOI
TL;DR: A de novo heterozygous mutation, affecting a critical catalytic residue in TREX1, that results in typical Aicardi-Goutieres syndrome is described.
Abstract: TREX1 constitutes the major 3′→5′ DNA exonuclease activity measured in mammalian cells. Recently, biallelic mutations in TREX1 have been shown to cause Aicardi-Goutieres syndrome at the AGS1 locus. Interestingly, Aicardi-Goutieres syndrome shows overlap with systemic lupus erythematosus at both clinical and pathological levels. Here, we report a heterozygous TREX1 mutation causing familial chilblain lupus. Additionally, we describe a de novo heterozygous mutation, affecting a critical catalytic residue in TREX1, that results in typical Aicardi-Goutieres syndrome.

Journal ArticleDOI
TL;DR: The critical region for Potocki-Lupski syndrome is refined to a 1.3-Mb genomic interval that contains the dosage-sensitive RAI1 gene and lends further support for the concept that genomic architecture incites genomic instability.
Abstract: The duplication 17p11.2 syndrome, associated with dup(17)(p11.2p11.2), is a recently recognized syndrome of multiple congenital anomalies and mental retardation and is the first predicted reciprocal microduplication syndrome described—the homologous recombination reciprocal of the Smith-Magenis syndrome (SMS) microdeletion (del(17)(p11.2p11.2)). We previously described seven subjects with dup(17)(p11.2p11.2) and noted their relatively mild phenotype compared with that of individuals with SMS. Here, we molecularly analyzed 28 additional patients, using multiple independent assays, and also report the phenotypic characteristics obtained from extensive multidisciplinary clinical study of a subset of these patients. Whereas the majority of subjects (22 of 35) harbor the homologous recombination reciprocal product of the common SMS microdeletion (∼3.7 Mb), 13 subjects (∼37%) have nonrecurrent duplications ranging in size from 1.3 to 15.2 Mb. Molecular studies suggest potential mechanistic differences between nonrecurrent duplications and nonrecurrent genomic deletions. Clinical features observed in patients with the common dup(17)(p11.2p11.2) are distinct from those seen with SMS and include infantile hypotonia, failure to thrive, mental retardation, autistic features, sleep apnea, and structural cardiovascular anomalies. We narrow the critical region to a 1.3-Mb genomic interval that contains the dosage-sensitive RAI1 gene. Our results refine the critical region for Potocki-Lupski syndrome, provide information to assist in clinical diagnosis and management, and lend further support for the concept that genomic architecture incites genomic instability.

Journal ArticleDOI
TL;DR: The findings suggest that INI1 is the predisposing gene in familial schwannomatosis, although the exact oncogenetic mechanism in these schwannomas remains to be elucidated.
Abstract: Patients with schwannomatosis develop multiple schwannomas but no vestibular schwannomas diagnostic of neurofibromatosis type 2. We report an inactivating germline mutation in exon 1 of the tumor-suppressor gene INI1 in a father and daughter who both had schwannomatosis. Inactivation of the wild-type INI1 allele, by a second mutation in exon 5 or by clear loss, was found in two of four investigated schwannomas from these patients. All four schwannomas displayed complete loss of nuclear INI1 protein expression in part of the cells. Although the exact oncogenetic mechanism in these schwannomas remains to be elucidated, our findings suggest that INI1 is the predisposing gene in familial schwannomatosis.

Journal ArticleDOI
TL;DR: STRA6 mutations define a pleiotropic malformation syndrome representing the first human phenotype associated with mutations in a gene from the "STRA" group.
Abstract: We observed two unrelated consanguineous families with malformation syndromes sharing anophthalmia and distinct eyebrows as common signs, but differing for alveolar capillary dysplasia or complex congenital heart defect in one and diaphragmatic hernia in the other family. Homozygosity mapping revealed linkage to a common locus on chromosome 15, and pathogenic homozygous mutations were identified in STRA6, a member of a large group of "stimulated by retinoic acid" genes encoding novel transmembrane proteins, transcription factors, and secreted signaling molecules or proteins of largely unknown function. Subsequently, homozygous STRA6 mutations were also demonstrated in 3 of 13 patients chosen on the basis of significant phenotypic overlap to the original cases. While a homozygous deletion generating a premature stop codon (p.G50AfsX22) led to absence of the immunoreactive protein in patient's fibroblast culture, structural analysis of three missense mutations (P90L, P293L, and T321P) suggested significant effects on the geometry of the loops connecting the transmembrane helices of STRA6. Two further variations in the C-terminus (T644M and R655C) alter specific functional sites, an SH2-binding motif and a phosphorylation site, respectively. STRA6 mutations thus define a pleiotropic malformation syndrome representing the first human phenotype associated with mutations in a gene from the "STRA" group.

Journal ArticleDOI
TL;DR: It is shown that the risk of visual failure is greater when the 11778G→A or 14484T→C mutations are present in specific subgroups of haplogroup J and when the 3460G→A mutation is present in haplogroup K, and significantly less when 11778G→A occurs in haplogroup H.
Abstract: Leber hereditary optic neuropathy (LHON) is due primarily to one of three common point mutations of mitochondrial DNA (mtDNA), but the incomplete penetrance implicates additional genetic or environmental factors in the pathophysiology of the disorder. Both the 11778G→A and 14484T→C LHON mutations are preferentially found on a specific mtDNA genetic background, but 3460G→A is not. However, there is no clear evidence that any background influences clinical penetrance in any of these mutations. By studying 3,613 subjects from 159 LHON-affected pedigrees, we show that the risk of visual failure is greater when the 11778G→A or 14484T→C mutations are present in specific subgroups of haplogroup J (J2 for 11778G→A and J1 for 14484T→C) and when the 3460G→A mutation is present in haplogroup K. By contrast, the risk of visual failure is significantly less when 11778G→A occurs in haplogroup H. Substitutions on MTCYB provide an explanation for these findings, which demonstrate that common genetic variants have a marked effect on the expression of an ostensibly monogenic mtDNA disorder.

Journal ArticleDOI
TL;DR: It is speculated that missplicing mutations in mitochondrial aminoacyl-tRNA synthetase genes preferentially affect the brain because of a tissue-specific vulnerability of the splicing machinery.
Abstract: Homozygosity mapping was performed in a consanguineous Sephardic Jewish family with three patients who presented with severe infantile encephalopathy associated with pontocerebellar hypoplasia and multiple mitochondrial respiratory-chain defects. This resulted in the identification of an intronic mutation in RARS2, the gene encoding mitochondrial arginine–transfer RNA (tRNA) synthetase. The mutation was associated with the production of an abnormally short RARS2 transcript and a marked reduction of the mitochondrial tRNA(Arg) transcript in the patients’ fibroblasts. We speculate that missplicing mutations in mitochondrial aminoacyl-tRNA synthetase genes preferentially affect the brain because of a tissue-specific vulnerability of the splicing machinery.

Journal ArticleDOI
TL;DR: These data represent results from the first study to correlate a specific small mutation of the NF1 gene to the expression of a particular clinical phenotype, and the biological mechanism that relates this specific mutation to the suppression of cutaneous neurofibroma development is unknown.
Abstract: Neurofibromatosis type 1 (NF1) is characterized by café-au-lait spots, skinfold freckling, and cutaneous neurofibromas. No obvious relationships between small mutations (<20 bp) of the NF1 gene and a specific phenotype have previously been demonstrated, which suggests that interaction with unlinked modifying genes and/or the normal NF1 allele may be involved in the development of the particular clinical features associated with NF1. We identified 21 unrelated probands with NF1 (14 familial and 7 sporadic cases) who were all found to have the same c.2970-2972 delAAT (p.990delM) mutation but no cutaneous neurofibromas or clinically obvious plexiform neurofibromas. Molecular analysis identified the same 3-bp in-frame deletion (c.2970-2972 delAAT) in exon 17 of the NF1 gene in all affected subjects. The ΔAAT mutation is predicted to result in the loss of one of two adjacent methionines (codon 991 or 992) (ΔMet991), in conjunction with a silent ACA→ACG change of codon 990. These two methionine residues are located in a highly conserved region of neurofibromin and are expected, therefore, to have a functional role in the protein. Our data represent results from the first study to correlate a specific small mutation of the NF1 gene to the expression of a particular clinical phenotype. The biological mechanism that relates this specific mutation to the suppression of cutaneous neurofibroma development is unknown.

Journal ArticleDOI
TL;DR: In this article, mutations in the T-box family transcription factor gene TBX20 were linked to a family history of congenital heart disease (CHD) and a complex spectrum of developmental anomalies, including defects in septation, chamber growth, and valvulogenesis.
Abstract: The T-box family transcription factor gene TBX20 acts in a conserved regulatory network, guiding heart formation and patterning in diverse species. Mouse Tbx20 is expressed in cardiac progenitor cells, differentiating cardiomyocytes, and developing valvular tissue, and its deletion or RNA interference–mediated knockdown is catastrophic for heart development. TBX20 interacts physically, functionally, and genetically with other cardiac transcription factors, including NKX2-5, GATA4, and TBX5, mutations of which cause congenital heart disease (CHD). Here, we report nonsense (Q195X) and missense (I152M) germline mutations within the T-box DNA-binding domain of human TBX20 that were associated with a family history of CHD and a complex spectrum of developmental anomalies, including defects in septation, chamber growth, and valvulogenesis. Biophysical characterization of wild-type and mutant proteins indicated how the missense mutation disrupts the structure and function of the TBX20 T-box. Dilated cardiomyopathy was a feature of the TBX20 mutant phenotype in humans and mice, suggesting that mutations in developmental transcription factors can provide a sensitized template for adult-onset heart disease. Our findings are the first to link TBX20 mutations to human pathology. They provide insights into how mutation of different genes in an interactive regulatory circuit leads to diverse clinical phenotypes, with implications for diagnosis, genetic screening, and patient follow-up.