Showing papers by "Gonçalo R. Abecasis" published in 2005


Journal ArticleDOI
John W. Belmont, Andrew Boudreau, Suzanne M. Leal, Paul Hardenbol, and 229 more authors (40 institutions)
27 Oct 2005
TL;DR: A public database of common variation in the human genome: more than one million single nucleotide polymorphisms for which accurate and complete genotypes have been obtained in 269 DNA samples from four populations, including ten 500-kilobase regions in which essentially all information about common DNA variation has been extracted.
Abstract: Inherited genetic variation has a critical but as yet largely uncharacterized role in human disease. Here we report a public database of common variation in the human genome: more than one million single nucleotide polymorphisms (SNPs) for which accurate and complete genotypes have been obtained in 269 DNA samples from four populations, including ten 500-kilobase regions in which essentially all information about common DNA variation has been extracted. These data document the generality of recombination hotspots, a block-like structure of linkage disequilibrium and low haplotype diversity, leading to substantial correlations of SNPs with many of their neighbours. We show how the HapMap resource can guide the design and analysis of genetic association studies, shed light on structural variation and recombination, and identify loci that may have been subject to natural selection during human evolution.

5,479 citations


Journal ArticleDOI
TL;DR: In this paper, a genome-wide association scan was conducted to identify genetic variants associated with obesity-related quantitative traits in the genetically isolated population of Sardinia, and the results showed that common genetic variants in the FTO gene are associated with substantial changes in body weight.
Abstract: The obesity epidemic is responsible for a substantial economic burden in developed countries and is a major risk factor for type 2 diabetes and cardiovascular disease. The disease is the result not only of several environmental risk factors, but also of genetic predisposition. To take advantage of recent advances in gene-mapping technology, we executed a genome-wide association scan to identify genetic variants associated with obesity-related quantitative traits in the genetically isolated population of Sardinia. Initial analysis suggested that several SNPs in the FTO and PFKP genes were associated with increased BMI, hip circumference, and weight. Within the FTO gene, rs9930506 showed the strongest association with BMI (p = 8.6 × 10⁻⁷), hip circumference (p = 3.4 × 10⁻⁸), and weight (p = 9.1 × 10⁻⁷). In Sardinia, homozygotes for the rare "G" allele of this SNP (minor allele frequency = 0.46) were 1.3 BMI units heavier than homozygotes for the common "A" allele. Within the PFKP gene, rs6602024 showed very strong association with BMI (p = 4.9 × 10⁻⁶). Homozygotes for the rare "A" allele of this SNP (minor allele frequency = 0.12) were 1.8 BMI units heavier than homozygotes for the common "G" allele. To replicate our findings, we genotyped these two SNPs in the GenNet study. In European Americans (N = 1,496) and in Hispanic Americans (N = 839), we replicated significant association between rs9930506 in the FTO gene and BMI (p-value for meta-analysis of European American and Hispanic American follow-up samples, p = 0.001), weight (p = 0.001), and hip circumference (p = 0.0005). We did not replicate association between rs6602024 and obesity-related traits in the GenNet sample, although we found that in European Americans, Hispanic Americans, and African Americans, homozygotes for the rare "A" allele were, on average, 1.0–3.0 BMI units heavier than homozygotes for the more common "G" allele.
In summary, we have completed a whole-genome association scan for three obesity-related quantitative traits and report that common genetic variants in the FTO gene are associated with substantial changes in BMI, hip circumference, and body weight. These changes could have a significant impact on the risk of obesity-related morbidity in the general population.

1,619 citations


Journal ArticleDOI
TL;DR: These methods adequately control type I error in large and small samples, are computationally efficient, and will be useful for quality assessment of genotype data and for the detection of genetic association or population stratification in very large data sets.
Abstract: Deviations from Hardy-Weinberg equilibrium (HWE) can indicate inbreeding, population stratification, and even problems in genotyping. In samples of affected individuals, these deviations can also provide evidence for association. Tests of HWE are commonly performed using a simple χ² goodness-of-fit test. We show that this χ² test can have inflated type I error rates, even in relatively large samples (e.g., samples of 1,000 individuals that include ∼100 copies of the minor allele). On the basis of previous work, we describe exact tests of HWE together with efficient computational methods for their implementation. Our methods adequately control type I error in large and small samples and are computationally efficient. They have been implemented in freely available code that will be useful for quality assessment of genotype data and for the detection of genetic association or population stratification in very large data sets.

1,374 citations
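
The exact HWE test in the entry above can be illustrated with a short sketch. It assumes the standard formulation: condition on the observed allele counts, enumerate every possible heterozygote count, and sum the probabilities of all configurations that are no more likely than the observed one. Probabilities are built up with a recurrence that starts near the expected heterozygote count. The function name `hwe_exact_pvalue` and its arguments are hypothetical, and this is not claimed to reproduce the authors' released code.

```python
def hwe_exact_pvalue(obs_hets, obs_hom1, obs_hom2):
    """Two-sided exact HWE p-value, conditional on the observed allele counts.

    Illustrative sketch: the name and argument order are assumptions.
    """
    if min(obs_hets, obs_hom1, obs_hom2) < 0:
        raise ValueError("genotype counts must be non-negative")
    n = obs_hets + obs_hom1 + obs_hom2            # number of individuals
    if n == 0:
        return 1.0
    rare = 2 * min(obs_hom1, obs_hom2) + obs_hets  # copies of the minor allele

    # probs[h] will hold the (unnormalised) probability of h heterozygotes.
    probs = [0.0] * (rare + 1)

    # Start near the expected heterozygote count; h must share parity with `rare`.
    mid = rare * (2 * n - rare) // (2 * n)
    if mid % 2 != rare % 2:
        mid += 1
    probs[mid] = 1.0

    # Walk down: each step converts two heterozygotes into one of each homozygote.
    hets, hom_r = mid, (rare - mid) // 2
    hom_c = n - hets - hom_r
    while hets > 1:
        probs[hets - 2] = probs[hets] * hets * (hets - 1) / (4.0 * (hom_r + 1) * (hom_c + 1))
        hets, hom_r, hom_c = hets - 2, hom_r + 1, hom_c + 1

    # Walk up: each step converts one of each homozygote into two heterozygotes.
    hets, hom_r = mid, (rare - mid) // 2
    hom_c = n - hets - hom_r
    while hets <= rare - 2:
        probs[hets + 2] = probs[hets] * 4.0 * hom_r * hom_c / ((hets + 2) * (hets + 1))
        hets, hom_r, hom_c = hets + 2, hom_r - 1, hom_c - 1

    total = sum(probs)
    p_obs = probs[obs_hets]
    # Small tolerance so configurations tied with the observed one are included.
    tail = sum(p for p in probs if p <= p_obs * (1.0 + 1e-9))
    return min(1.0, tail / total)
```

For example, `hwe_exact_pvalue(50, 25, 25)` describes a sample sitting at the most likely heterozygote count, while `hwe_exact_pvalue(0, 50, 50)` describes a sample with no heterozygotes at all.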


Journal ArticleDOI
TL;DR: Significant evidence for heritability of many medically important traits, including cardiovascular function and personality, is found, and evidence for heterogeneity by age and sex suggests that models allowing for these differences will be important in mapping quantitative traits.
Abstract: In family studies, phenotypic similarities between relatives yield information on the overall contribution of genes to trait variation. Large samples are important for these family studies, especially when comparing heritability between subgroups such as young and old, or males and females. We recruited a cohort of 6,148 participants, aged 14–102 y, from four clustered towns in Sardinia. The cohort includes 34,469 relative pairs. To extract genetic information, we implemented software for variance components heritability analysis, designed to handle large pedigrees, analyze multiple traits simultaneously, and model heterogeneity. Here, we report heritability analyses for 98 quantitative traits, focusing on facets of personality and cardiovascular function. We also summarize results of bivariate analyses for all pairs of traits and of heterogeneity analyses for each trait. We found a significant genetic component for every trait. On average, genetic effects explained 40% of the variance for 38 blood tests, 51% for five anthropometric measures, 25% for 20 measures of cardiovascular function, and 19% for 35 personality traits. Four traits showed significant evidence for an X-linked component. Bivariate analyses suggested overlapping genetic determinants for many traits, including multiple personality facets and several traits related to the metabolic syndrome; but we found no evidence for shared genetic determinants that might underlie the reported association of some personality traits and cardiovascular risk factors. Models allowing for heterogeneity suggested that, in this cohort, the genetic variance was typically larger in females and in younger individuals, but interesting exceptions were observed. For example, narrow heritability of blood pressure was approximately 26% in individuals more than 42 y old, but only approximately 8% in younger individuals. 
Despite the heterogeneity in effect sizes, the same loci appear to contribute to variance in young and old, and in males and females. In summary, we find significant evidence for heritability of many medically important traits, including cardiovascular function and personality. Evidence for heterogeneity by age and sex suggests that models allowing for these differences will be important in mapping quantitative traits.

547 citations


Journal ArticleDOI
TL;DR: A tool that produces summary statistics and basic quality assessments for gene-mapping data, accommodating either pedigree or case-control datasets, and can also produce graphic output in the PDF format is described.
Abstract: Summary: We describe a tool that produces summary statistics and basic quality assessments for gene-mapping data, accommodating either pedigree or case-control datasets. Our tool can also produce graphic output in the PDF format. Availability: http://www.sph.umich.edu/csg/abecasis/Pedstats/download/ Contact: [email protected] Supplementary information: http://www.sph.umich.edu/csg/abecasis/Pedstats/

432 citations


Journal ArticleDOI
TL;DR: Using a large sample of cases and controls from a single center, it is shown that a T→C substitution in exon 9 of the complement factor H gene is strongly associated with susceptibility to age-related macular degeneration, the most common cause of blindness in the elderly.
Abstract: Using a large sample of cases and controls from a single center, we show that a T→C substitution in exon 9 (Y402H) of the complement factor H gene is strongly associated with susceptibility to age-related macular degeneration, the most common cause of blindness in the elderly. Frequency of the C allele was 0.61 in cases, versus 0.34 in age-matched controls (P < 10⁻²⁴). Genotype frequencies also differ markedly between cases and controls (χ² = 112.68 [2 degrees of freedom]; P < 10⁻²⁴). A multiplicative model fits the data well, and we estimate the population frequency of the high-risk C allele to be 0.39 (95% confidence interval 0.36–0.42) and the genotype relative risk to be 2.44 (95% confidence interval 2.08–2.83) for TC heterozygotes and 5.93 (95% confidence interval 4.33–8.02) for CC homozygotes.

398 citations
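
The multiplicative model in the entry above can be checked with a small calculation. The sketch below assumes Hardy-Weinberg proportions in the general population and that each copy of the risk allele multiplies risk by the same factor, so the homozygote relative risk is the square of the heterozygote value. Under those assumptions the genotype frequencies among cases are proportional to the population frequencies times the relative risks. The function names are hypothetical.

```python
def case_genotype_frequencies(pop_freq, allele_rr):
    """Genotype frequencies among cases as (non-carrier, heterozygote, risk homozygote).

    Assumes HWE in the population and a multiplicative risk model
    (relative risks 1, r, r*r). Names are illustrative only.
    """
    p, q, r = pop_freq, 1.0 - pop_freq, allele_rr
    weights = (q * q, 2.0 * p * q * r, p * p * r * r)
    total = sum(weights)  # equals (p*r + q)**2
    return tuple(w / total for w in weights)


def case_allele_frequency(pop_freq, allele_rr):
    """Risk-allele frequency among cases under the same assumptions."""
    p, q, r = pop_freq, 1.0 - pop_freq, allele_rr
    return p * r / (p * r + q)


# Point estimates quoted in the abstract: population frequency 0.39, heterozygote RR 2.44.
print(round(2.44 ** 2, 2))                          # implied homozygote relative risk
print(round(case_allele_frequency(0.39, 2.44), 3))  # implied C-allele frequency in cases
```

With the abstract's point estimates, the squared heterozygote risk is close to the reported homozygote estimate of 5.93, and the implied case allele frequency is close to the reported 0.61, which is what a well-fitting multiplicative model would predict.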


Journal ArticleDOI
TL;DR: The results suggest that polymorphisms in GLUT9 could affect glucose metabolism and uric acid synthesis and/or renal reabsorption, influencing serum uric acid levels over a wide range of values.
Abstract: High serum uric acid levels elevate pro-inflammatory–state gout crystal arthropathy and place individuals at high risk for cardiovascular morbidity and mortality. Genome-wide scans in the genetically isolated Sardinian population identified variants associated with serum uric acid levels as a quantitative trait. They mapped within GLUT9, a Chromosome 4 glucose transporter gene predominantly expressed in liver and kidney. SNP rs6855911 showed the strongest association (p = 1.84 × 10⁻¹⁶), along with eight others (p = 7.75 × 10⁻¹⁶ to 6.05 × 10⁻¹¹). Individuals homozygous for the rare allele of rs6855911 (minor allele frequency = 0.26) had 0.6 mg/dl less uric acid than those homozygous for the common allele; the results were replicated in an unrelated cohort from Tuscany. Our results suggest that polymorphisms in GLUT9 could affect glucose metabolism and uric acid synthesis and/or renal reabsorption, influencing serum uric acid levels over a wide range of values.

282 citations


Journal ArticleDOI
TL;DR: An efficient approach for modeling linkage disequilibrium (LD) between markers during multipoint analysis of human pedigrees using a gene-counting algorithm suitable for pedigree data and with the use of a hidden Markov model.
Abstract: Single-nucleotide polymorphisms (SNPs) are rapidly replacing microsatellites as the markers of choice for genetic linkage studies and many other studies of human pedigrees. Here, we describe an efficient approach for modeling linkage disequilibrium (LD) between markers during multipoint analysis of human pedigrees. Using a gene-counting algorithm suitable for pedigree data, our approach enables rapid estimation of allele and haplotype frequencies within clusters of tightly linked markers. In addition, with the use of a hidden Markov model, our approach allows for multipoint pedigree analysis with large numbers of SNP markers organized into clusters of markers in LD. Simulation results show that our approach resolves previously described biases in multipoint linkage analysis with SNPs that are in LD. An updated version of the freely available Merlin software package uses the approach described here to perform many common pedigree analyses, including haplotyping and haplotype frequency estimation, parametric and nonparametric multipoint linkage analysis of discrete traits, variance-components and regression-based analysis of quantitative traits, calculation of identity-by-descent or kinship coefficients, and case selection for follow-up association studies. To illustrate the possibilities, we examine a data set that provides evidence of linkage of psoriasis to chromosome 17.

274 citations


Journal ArticleDOI
TL;DR: A meta-analysis of six AMD genome screens is performed using the genome-scan meta-analysis method, which allows linkage results from several studies to be combined, providing greater power to identify regions that show only weak evidence for linkage in individual studies.
Abstract: A genetic contribution to the development of age-related macular degeneration (AMD) is well established. Several genome-wide linkage studies have identified a number of putative susceptibility loci for AMD but only a few of these regions have been replicated in independent studies. Here, we perform a meta-analysis of six AMD genome screens using the genome-scan meta-analysis method, which allows linkage results from several studies to be combined, providing greater power to identify regions that show only weak evidence for linkage in individual studies. Results from non-parametric analysis for a broad AMD clinical phenotype (including two studies with quantitative traits) were extracted. For each study, 120 genomic bins of approximately 30 cM were defined and ranked according to maximum evidence for linkage within each bin. Bin ranks were weighted according to study size and summed across all studies; the summed rank (SR) for each bin was assessed empirically for significance using permutation methods. A high SR indicates a region with consistent evidence for linkage across studies. The strongest evidence for an AMD susceptibility locus was found on chromosome 10q26 where genome-wide significant linkage was observed (P=0.00025). Several other regions met the empirical significance criteria for bins likely to contain linked loci including adjacent pairs of bins on chromosomes 1q, 2p, 3p and 16. Several of the regions identified here showed only weak evidence for linkage in the individual studies. These results will help prioritize regions for future positional and functional candidate gene studies in AMD.

246 citations
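
The rank-based procedure in the entry above (rank bins within each study, weight the ranks, sum them across studies, then assess the summed rank by permutation) can be sketched as follows. This is a simplified illustration: ties are broken arbitrarily, the weights are supplied by the caller rather than derived from study size, and the function names, the number of permutations, and the seed are all assumptions.

```python
import random


def summed_ranks(bin_scores_by_study, weights):
    """Weighted summed rank per bin; the bin with the strongest evidence gets the highest rank."""
    n_bins = len(bin_scores_by_study[0])
    totals = [0.0] * n_bins
    for scores, weight in zip(bin_scores_by_study, weights):
        # Sort bin indices from weakest to strongest evidence within this study.
        order = sorted(range(n_bins), key=lambda b: scores[b])
        for rank, b in enumerate(order, start=1):
            totals[b] += weight * rank
    return totals


def permutation_pvalues(bin_scores_by_study, weights, n_perm=2000, seed=0):
    """Empirical p-value per bin: how often a random rank assignment does at least as well."""
    rng = random.Random(seed)
    observed = summed_ranks(bin_scores_by_study, weights)
    n_bins = len(observed)
    exceed = [0] * n_bins
    ranks = list(range(1, n_bins + 1))
    for _ in range(n_perm):
        perm_totals = [0.0] * n_bins
        for weight in weights:
            rng.shuffle(ranks)  # independent random ranking for each study
            for b, r in enumerate(ranks):
                perm_totals[b] += weight * r
        for b in range(n_bins):
            if perm_totals[b] >= observed[b]:
                exceed[b] += 1
    # Add-one correction keeps the estimate strictly above zero.
    return [(e + 1) / (n_perm + 1) for e in exceed]
```

A bin that ranks near the top in every study accumulates a high summed rank and a small empirical p-value, even when no single study shows strong evidence on its own.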


Journal ArticleDOI
TL;DR: Data provide evidence of a link between multiple diverse mechanisms underlying AMD pathogenesis, and the effect of TLR4, APOE and ABCA1 variants on AMD susceptibility was opposite to that of association with atherosclerosis risk.
Abstract: Age-related macular degeneration (AMD) is a genetically heterogeneous disease that leads to progressive and irreversible vision loss among the elderly. Inflammation, oxidative damage, cholesterol metabolism and/or impaired function of retinal pigment epithelium (RPE) have been implicated in AMD pathogenesis. We examined toll-like receptor 4 (TLR4) as a candidate gene for AMD susceptibility because: (i) the TLR4 gene is located on chromosome 9q32–33, a region exhibiting evidence of linkage to AMD in three independent reports; (ii) the TLR4-D299G variant is associated with reduced risk of atherosclerosis, a chronic inflammatory disease with subendothelial accumulation; (iii) the TLR4 is not only a key mediator of proinflammatory signaling pathways but also linked to regulation of cholesterol efflux and (iv) the TLR4 participates in phagocytosis of photoreceptor outer segments by the RPE. We examined D299G and T399I variants of TLR4 in a sample of 667 unrelated AMD patients and 439 unrelated controls, all of Caucasian ancestry. Multiple logistic regression demonstrated an increased risk of AMD in carriers of the G allele at TLR4 residue 299 (odds ratio = 2.65, P = 0.025), but lack of an independent effect by T399I variant. TLR4-D299G showed an additive effect on AMD risk (odds ratio = 4.13, P = 0.002) with allelic variants of apolipoprotein E (APOE) and ATP-binding cassette transporter-1 (ABCA1), two genes involved in cholesterol efflux. Interestingly, the effect of TLR4, APOE and ABCA1 variants on AMD susceptibility was opposite to that of association with atherosclerosis risk. Our data provide evidence of a link between multiple diverse mechanisms underlying AMD pathogenesis.
Introduction: Age-related macular degeneration (AMD) is the leading cause of irreversible vision loss in the elderly population of developed countries (1–3). This progressive degenerative disease primarily involves the retina and the retinal pigment epithelium (RPE) in the macular region. As in atherosclerosis and Alzheimer's disease, extracellular deposits of proteins and lipids (called 'drusen') are clinical hallmarks of AMD (4,5). Pathological characteristics of advanced disease include the presence of large macular drusen (LMD), geographic atrophy (GA) and/or choroidal neovascularization (CNV) (6,7). AMD is believed to result from interactions between multiple genetic variants and environmental factors (8). The strongest identified risk factors are advanced age and family history, though smoking, hypertension and many other risk factors have also been implicated (9,10). Recent studies have begun to establish the importance of genetic variations in the development of AMD. An association between AMD and allelic variants of apolipoprotein E (APOE) has been widely documented; specifically, the APOE-ε4 allele is

194 citations


Journal ArticleDOI
TL;DR: A novel approach is proposed that quantifies the degree of linkage disequilibrium (LD) between the candidate SNP and the putative disease locus through joint modeling of linkage and association and can be extended to study designs that include unaffected family members.
Abstract: Once genetic linkage has been identified for a complex disease, the next step is often association analysis, in which single-nucleotide polymorphisms (SNPs) within the linkage region are genotyped and tested for association with the disease. If a SNP shows evidence of association, it is useful to know whether the linkage result can be explained, in part or in full, by the candidate SNP. We propose a novel approach that quantifies the degree of linkage disequilibrium (LD) between the candidate SNP and the putative disease locus through joint modeling of linkage and association. We describe a simple likelihood of the marker data conditional on the trait data for a sample of affected sib pairs, with disease penetrances and disease-SNP haplotype frequencies as parameters. We estimate model parameters by maximum likelihood and propose two likelihood-ratio tests to characterize the relationship of the candidate SNP and the disease locus. The first test assesses whether the candidate SNP and the disease locus are in linkage equilibrium so that the SNP plays no causal role in the linkage signal. The second test assesses whether the candidate SNP and the disease locus are in complete LD so that the SNP or a marker in complete LD with it may account fully for the linkage signal. Our method also yields a genetic model that includes parameter estimates for disease-SNP haplotype frequencies and the degree of disease-SNP LD. Our method provides a new tool for detecting linkage and association and can be extended to study designs that include unaffected family members.

Journal ArticleDOI
TL;DR: Genotype data generated by the International HapMap Project is used to dissect the relationship between sequence features and the degree of linkage disequilibrium in the genome and suggest an evolutionary justification for the heterogeneity in linkage disequilibrium.
Abstract: We use genotype data generated by the International HapMap Project to dissect the relationship between sequence features and the degree of linkage disequilibrium in the genome. We show that variation in linkage disequilibrium is broadly similar across populations and examine sequence landscape in regions of strong and weak disequilibrium. Linkage disequilibrium is generally low within approximately 15 Mb of the telomeres of each chromosome and noticeably elevated in large, duplicated regions of the genome as well as within approximately 5 Mb of centromeres and other heterochromatic regions. At a broad scale (100-1000 kb resolution), our results show that regions of strong linkage disequilibrium are typically GC poor and have reduced polymorphism. In addition, these regions are enriched for LINE repeats, but have fewer SINE, DNA, and simple repeats than the rest of the genome. At a fine scale, we examine the sequence composition of "hotspots" for the rapid breakdown of linkage disequilibrium and show that they are enriched in SINEs, in simple repeats, and in sequences that are conserved between species. Regions of high and low linkage disequilibrium (the top and bottom quartiles of the genome) have a higher density of genes and coding bases than the rest of the genome. Closer examination of the data shows that whereas some types of genes (including genes involved in immune response and sensory perception) are typically located in regions of low linkage disequilibrium, other genes (including those involved in DNA and RNA metabolism, response to DNA damage, and the cell cycle) are preferentially located in regions of strong linkage disequilibrium. Our results provide a detailed analysis of the relationship between sequence features and linkage disequilibrium and suggest an evolutionary justification for the heterogeneity in linkage disequilibrium in the genome.

Journal ArticleDOI
TL;DR: The importance of using appropriate statistical procedures, such as the false discovery rate, to maximize the chances of success in large scale association studies is highlighted.
Abstract: This brief review provides a summary of the biological causes of genetic association between tightly linked markers (termed linkage disequilibrium) and unlinked markers (termed population structure). We also review the utility of linkage disequilibrium data in gene mapping in isolated populations, in the estimation of recombination rates and in studying the history of particular alleles, including the detection of natural selection. We discuss current understanding of the extent and patterns of linkage disequilibrium in the genome, and its promise for genetic association studies in complex disease. Finally, we highlight the importance of using appropriate statistical procedures, such as the false discovery rate, to maximize the chances of success in large scale association studies.
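
As an illustration of the false discovery rate mentioned in the abstract above, the sketch below uses the Benjamini-Hochberg step-up procedure. That choice is an assumption: the abstract does not say which FDR procedure the review discusses. The function name and the default level `q` are hypothetical.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return a list of booleans marking which tests are rejected at FDR level q.

    Benjamini-Hochberg step-up rule: find the largest k such that the k-th
    smallest p-value is <= q * k / m, then reject the k smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p-value
    k_max = 0
    for k, i in enumerate(order, start=1):
        if pvalues[i] <= q * k / m:
            k_max = k
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected
```

In a large-scale association study the input would be one p-value per tested marker. Unlike a Bonferroni threshold, the cutoff adapts to how many small p-values are actually observed.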


Journal ArticleDOI
TL;DR: Evidence for linkage of psoriasis to distal chromosome 17q is reported, with a linkage peak mapping 1.7 cM distal to the RUNX1 binding site, but there is no evidence for association to individual SNPs or haplotypes in either of the previously identified peaks of association.
Abstract: Background: A previous study identified two peaks of allelic association between psoriasis and single nucleotide polymorphisms (SNPs) mapping to distal chromosome 17q, including a disease associated SNP that leads to loss of a RUNX1 transcription factor binding site, and additional SNPs in the third intron of the RAPTOR gene. Another study found an association with SNPs in the RAPTOR gene, but not with the RUNX1 binding site polymorphism. Methods: In an effort to confirm these observations, we genotyped 579 pedigrees containing 1285 affected individuals for three SNPs immediately flanking and including the RUNX1 binding site, and for three SNPs in the RAPTOR gene. Results: Here we report further evidence for linkage to distal chromosome 17q, with a linkage peak mapping 1.7 cM distal to the RUNX1 binding site (logarithm of the odds 2.26 to 2.73, depending upon statistic used). However, we found no evidence for association to individual SNPs or haplotypes in either of the previously identified peaks of association. Power analysis demonstrated 80% power to detect significant association at genotype relative risks of 1.2 (additive and multiplicative models) to 1.5 (dominant and recessive models) for the RUNX1 binding site, and 1.3 to 1.4 for the RAPTOR locus under all models except dominant. Conclusions: Our data provide no support for the previously identified RUNX1 binding site or for the RAPTOR locus as genetic determinants of psoriasis, despite evidence for linkage of psoriasis to distal chromosome 17q.

Journal ArticleDOI
01 May 2005
TL;DR: Results indicate that cluster 17 does not carry a psoriasis-susceptibility allele, and expand the PSORS1 risk interval to approximately 300 kb.
Abstract: Human leukocyte antigen (HLA)-Cw6 has long been associated with psoriasis, and PSORS1 (psoriasis susceptibility 1), a major gene for psoriasis susceptibility, has been mapped to its vicinity. A previous analysis identified multiple risk haplotypes carrying HLA-Cw6 and one haplotype (cluster 17, HLA-Cw8-B65) that appeared to carry risk for psoriasis but did not carry HLA-Cw6. This haplotype was very similar to other risk haplotypes for at least 60 kb telomeric to HLA-C, suggesting identity by descent with the remaining risk chromosomes. The association, however, between psoriasis and this haplotype as assessed by the transmission/disequilibrium test (TDT) was of borderline significance (p-value 0.048). In order to better assess the risk associated with cluster 17, a multicenter collaboration typed additional subjects for a single marker (M6S161) for which one allele (249 bp) was found only on cluster 17. The new sample included 1275 pedigrees as well as 300 cases and 913 controls. Transmission of this allele to affected individuals was examined using the TDT and the pedigree disequilibrium test (PDT), and case-control samples were analyzed by a trend test across genotype categories. By all methods, the newly acquired genotypes failed to confirm the association originally reported, despite adequate power. In contrast, the 248 bp allele, which is found on all HLA-Cw6-positive risk haplotypes as well as several non-risk haplotypes, shows significant excess transmission for all cohorts. Taken together, these results indicate that cluster 17 does not carry a psoriasis-susceptibility allele, and expand the PSORS1 risk interval to approximately 300 kb.
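
The transmission/disequilibrium test used in the abstract above can be sketched in its basic biallelic form: among heterozygous parents, compare how often the allele of interest is transmitted to an affected child against how often it is not, using a McNemar-type statistic with one degree of freedom. The function name is hypothetical, and any counts passed to it here are invented for illustration rather than taken from the study.

```python
import math


def tdt(transmitted, untransmitted):
    """Basic TDT for one allele.

    transmitted / untransmitted: number of heterozygous parents who did / did not
    pass the allele to an affected child. Returns (chi-square, p-value), using the
    1-df chi-square survival function written in terms of erfc.
    """
    b, c = transmitted, untransmitted
    if b < 0 or c < 0 or b + c == 0:
        raise ValueError("need at least one informative transmission")
    chi2 = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Equal transmission and non-transmission counts give a statistic of zero, and the statistic grows as transmissions become more lopsided in either direction.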

Book ChapterDOI
15 Jul 2005
TL;DR: A short history of the software packages that are now in widespread use in human gene mapping studies is presented, and some of the more important developments are summarized.
Abstract: The analysis of human gene mapping data presents daunting computational challenges. Improvements in laboratory technology have motivated the development and refinement of increasingly sophisticated algorithms for the analysis of human pedigrees. We review and summarize some of the more important developments, and present a short history of the software packages that are now in widespread use in human gene mapping studies. Keywords: gene-mapping algorithms; computational methods; human pedigrees; Elston–Stewart; Lander–Green; stochastic methods