Journal ArticleDOI

Estimating African American admixture proportions by use of population-specific alleles.

TL;DR: A significant nonrandom association is detected between two markers located 22 cM apart (FY-null and AT3), most likely due to admixture linkage disequilibrium created by the interbreeding of the two parental populations. This emphasizes the importance of admixed populations as a useful resource for mapping traits with different prevalence in the two parental populations.
Abstract: We analyzed the European genetic contribution to 10 populations of African descent in the United States (Maywood, Illinois; Detroit; New York; Philadelphia; Pittsburgh; Baltimore; Charleston, South Carolina; New Orleans; and Houston) and in Jamaica, using nine autosomal DNA markers. These markers either are population-specific or show frequency differences >45% between the parental populations and are thus especially informative for admixture. European genetic ancestry ranged from 6.8% (Jamaica) to 22.5% (New Orleans). The unique utility of these markers is reflected in the low variance associated with these admixture estimates (SEM 1.3%-2.7%). We also estimated the male and female European contribution to African Americans, on the basis of informative mtDNA (haplogroups H and L) and Y Alu polymorphic markers. Results indicate a sex-biased gene flow from Europeans, the male contribution being substantially greater than the female contribution. mtDNA haplogroups analysis shows no evidence of a significant maternal Amerindian contribution to any of the 10 populations. We detected significant nonrandom association between two markers located 22 cM apart (FY-null and AT3), most likely due to admixture linkage disequilibrium created in the interbreeding of the two parental populations. The strength of this association and the substantial genetic distance between FY and AT3 emphasize the importance of admixed populations as a useful resource for mapping traits with different prevalence in two parental populations.
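
To make concrete why markers with large parental frequency differences are informative, the following is a minimal sketch of the classical single-locus admixture estimate, m = (p_admixed - p_african) / (p_european - p_african). The abstract does not state which estimator the authors used, so this should not be read as their method. The function names and all allele frequencies below are hypothetical and chosen only for illustration.

```python
def admixture_proportion(p_admixed, p_african, p_european):
    """Single-locus estimate of the European contribution m.

    Solves p_admixed = m * p_european + (1 - m) * p_african for m.
    """
    delta = p_european - p_african
    if delta == 0:
        raise ValueError("marker is uninformative: parental frequencies are equal")
    return (p_admixed - p_african) / delta


def mean_admixture(markers):
    """Unweighted mean of per-locus estimates.

    `markers` is a list of (p_admixed, p_african, p_european) tuples.
    The unweighted mean is a simplifying assumption of this sketch.
    """
    estimates = [admixture_proportion(*m) for m in markers]
    return sum(estimates) / len(estimates)


# Hypothetical frequencies (not data from the paper).
markers = [
    (0.10, 0.00, 1.00),  # allele absent in the African parental sample
    (0.80, 0.90, 0.10),  # large frequency difference between parental samples
]
print(mean_admixture(markers))
```

Because the denominator is the parental frequency difference, a small difference inflates the sampling error of the estimate, which is consistent with the abstract's emphasis on markers with differences above 45%.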


Citations
Journal ArticleDOI
01 Jun 2000-Genetics
TL;DR: A model-based clustering method is described for using multilocus genotype data to infer population structure and assign individuals to populations; it can be applied to most of the commonly used genetic markers, provided that they are not closely linked.
Abstract: We describe a model-based clustering method for using multilocus genotype data to infer population structure and assign individuals to populations. We assume a model in which there are K populations (where K may be unknown), each of which is characterized by a set of allele frequencies at each locus. Individuals in the sample are assigned (probabilistically) to populations, or jointly to two or more populations if their genotypes indicate that they are admixed. Our model does not assume a particular mutation process, and it can be applied to most of the commonly used genetic markers, provided that they are not closely linked. Applications of our method include demonstrating the presence of population structure, assigning individuals to populations, studying hybrid zones, and identifying migrants and admixed individuals. We show that the method can produce highly accurate assignments using modest numbers of loci (e.g., seven microsatellite loci in an example using genotype data from an endangered bird species). The software used for this article is available from http://www.stats.ox.ac.uk/~pritch/home.html.

27,454 citations

Journal ArticleDOI
01 Aug 2003-Genetics
TL;DR: Extensions to the method of Pritchard et al. for inferring population structure from multilocus genotype data are described and methods that allow for linkage between loci are developed, which allows identification of subtle population subdivisions that were not detectable using the existing method.
Abstract: We describe extensions to the method of Pritchard et al. for inferring population structure from multilocus genotype data. Most importantly, we develop methods that allow for linkage between loci. The new model accounts for the correlations between linked loci that arise in admixed populations (“admixture linkage disequilibrium”). This modification has several advantages, allowing (1) detection of admixture events farther back into the past, (2) inference of the population of origin of chromosomal regions, and (3) more accurate estimates of statistical uncertainty when linked loci are used. It is also of potential use for admixture mapping. In addition, we describe a new prior model for the allele frequencies within each population, which allows identification of subtle population subdivisions that were not detectable using the existing method. We present results applying the new methods to study admixture in African-Americans, recombination in Helicobacter pylori, and drift in populations of Drosophila melanogaster. The methods are implemented in a program, structure, version 2.0, which is available at http://pritch.bsd.uchicago.edu.

7,615 citations


Cites background or methods from "Estimating African American admixtu..."

  • ...Further, a regression of the correlations have been chosen specifically to have large frequency differences between the putative parental populations (Parra et al. 1998), these loci may be more informative for studying admixture than microsatellites were here....

    [...]

  • ...Since the admixture is quite recent (primarily in the last 200 years or so; Parra et al. 1998; Pfaff et al. 2001), it is likely that admixture LD extends over… …information provided by linkage, it is also possible to reconstruct ancestral populations in the absence of pure individuals, using the…...

    [...]

  • ...…estimate of 17.8% European ancestry is very similar to the estimate of 18.8% obtained by Parra et al. (1998) for their sample from this population. …as long as the chromosome chunks are typically larger than the intermarker distances, then even highly inaccurate maps do not lead to biases in r. The…...

    [...]

  • ...Indeed, Parra et al. (1998) detected admixture LD across 22 cM between two markers that have extremely large frequency differences between Africans and Europeans, in several African-American populations....

    [...]

Journal ArticleDOI
TL;DR: This article describes a novel, statistically valid, method for case-control association studies in structured populations that uses a set of unlinked genetic markers to infer details of population structure, and to estimate the ancestry of sampled individuals, before using this information to test for associations within subpopulations.
Abstract: The use, in association studies, of the forthcoming dense genomewide collection of single-nucleotide polymorphisms (SNPs) has been heralded as a potential breakthrough in the study of the genetic basis of common complex disorders. A serious problem with association mapping is that population structure can lead to spurious associations between a candidate marker and a phenotype. One common solution has been to abandon case-control studies in favor of family-based tests of association, such as the transmission/disequilibrium test (TDT), but this comes at a considerable cost in the need to collect DNA from close relatives of affected individuals. In this article we describe a novel, statistically valid, method for case-control association studies in structured populations. Our method uses a set of unlinked genetic markers to infer details of population structure, and to estimate the ancestry of sampled individuals, before using this information to test for associations within subpopulations. It provides power comparable with the TDT in many settings and may substantially outperform it if there are conflicting associations in different subpopulations.

1,904 citations


Cites background or result from "Estimating African American admixtu..."

  • ...These data are intended to approximate a population of African Americans, with an average of 20% European admixture (this is consistent with estimates given by Parra et al. [1998])....

    [...]

  • ...For example, in a sample of African Americans, a typical individual might have 5%–20% European admixture (Parra et al. 1998), whereas some individuals may have substantially more or less. Such a sample could be modeled using K = 2 subpopulations (African and European), with typical individuals having q1 in the range (.8,.95) and q2 in the range (.05,.2), but with some individuals having more extreme values. The challenge is to infer this kind of pattern using genetic data. Pritchard et al. (2000) provide a method of performing such inference, even when little is known about either the number of subpopulations that have contributed to the sample or the allele frequencies in these putative subpopulations. They use a Markov chain Monte Carlo method to estimate the number of subpopulations, the allele frequencies in each subpopulation, and the value of q for each sampled individual. The method can be applied to most of the commonly used genetic markers, including microsatellites and SNPs, and can produce accurate results using modest numbers of loci, even when popular clustering algorithms such as Neighbor-Joining are relatively uninformative. The accuracy of the inference depends on the sample size, the number of loci used, and on the magnitude of allele-frequency differences between the subpopulations. Examples of applications of this method are given later. See Pritchard et al. (2000) for further details....

    [...]

Journal ArticleDOI
22 May 2009-Science
TL;DR: A detailed genetic analysis of most major groups of African populations is provided, suggesting that Africans represent 14 ancestral populations that correlate with self-described ethnicity and shared cultural and/or linguistic properties.
Abstract: Africa is the source of all modern humans, but characterization of genetic variation and of relationships among populations across the continent has been enigmatic. We studied 121 African populations, four African American populations, and 60 non-African populations for patterns of variation at 1327 nuclear microsatellite and insertion/deletion markers. We identified 14 ancestral population clusters in Africa that correlate with self-described ethnicity and shared cultural and/or linguistic properties. We observed high levels of mixed ancestry in most populations, reflecting historical migration events across the continent. Our data also provide evidence for shared ancestry among geographically diverse hunter-gatherer populations (Khoesan speakers and Pygmies). The ancestry of African Americans is predominantly from Niger-Kordofanian (approximately 71%), European (approximately 13%), and other African (approximately 8%) populations, although admixture levels varied considerably among individuals. This study helps tease apart the complex evolutionary history of Africans and African Americans, aiding both anthropological and genetic epidemiologic studies.

1,376 citations

Journal ArticleDOI
18 Feb 2005-Science
TL;DR: This work has characterized whole-genome patterns of common human DNA variation by genotyping 1,586,383 single-nucleotide polymorphisms (SNPs) in 71 Americans of European, African, and Asian ancestry and indicates that these SNPs capture most common genetic variation as a result of linkage disequilibrium.
Abstract: Individual differences in DNA sequence are the genetic basis of human variability. We have characterized whole-genome patterns of common human DNA variation by genotyping 1,586,383 single-nucleotide polymorphisms (SNPs) in 71 Americans of European, African, and Asian ancestry. Our results indicate that these SNPs capture most common genetic variation as a result of linkage disequilibrium, the correlation among common SNP alleles. We observe a strong correlation between extended regions of linkage disequilibrium and functional genomic elements. Our data provide a tool for exploring many questions that remain regarding the causal role of common human DNA variation in complex human traits and for investigating the nature of genetic variation within and between human populations.

1,197 citations

References
Journal ArticleDOI
TL;DR: Two algorithms are proposed to estimate the significance level of a test of Hardy-Weinberg proportions (HWP); both are easily applicable to loci with multiple alleles, where complete enumeration for the exact test is impractical.
Abstract: The Hardy-Weinberg law plays an important role in the field of population genetics and often serves as a basis for genetic inference. Because of its importance, much attention has been devoted to tests of Hardy-Weinberg proportions (HWP) over the decades. It has long been recognized that large-sample goodness-of-fit tests can sometimes lead to spurious results when the sample size and/or some genotypic frequencies are small. Although a complete enumeration algorithm for the exact test has been proposed, it is not of practical use for loci with more than a few alleles due to the amount of computation required. We propose two algorithms to estimate the significance level for a test of HWP. The algorithms are easily applicable to loci with multiple alleles. Both are remarkably simple and computationally fast. Relative efficiency and merits of the two algorithms are compared. Guidelines regarding their usage are given. Numerical examples are given to illustrate the practicality of the algorithms.
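
The abstract above does not spell out its two algorithms, so the following is only a rough sketch of one way to estimate an exact-test significance level by Monte Carlo: shuffle the sampled alleles, pair them into genotypes, and count how often the resampled genotype table is no more probable than the observed one. The table probability used is the standard conditional probability of genotype counts given allele counts under HWP. The function names, the default number of shuffles, the tolerance, and the add-one correction to the p-value are assumptions of this sketch, not details taken from the paper.

```python
import math
import random
from collections import Counter


def log_prob_genotypes(genotypes):
    """Log probability of the genotype counts given the allele counts under HWP.

    P = n! * prod(n_i!) * 2^H / ((2n)! * prod(f_ij!)), where n is the number of
    individuals, n_i are allele counts, f_ij genotype counts, H heterozygotes.
    """
    n = len(genotypes)
    allele_counts = Counter(a for g in genotypes for a in g)
    geno_counts = Counter(tuple(sorted(g)) for g in genotypes)
    hets = sum(c for (a, b), c in geno_counts.items() if a != b)
    logp = math.lgamma(n + 1) - math.lgamma(2 * n + 1) + hets * math.log(2)
    logp += sum(math.lgamma(c + 1) for c in allele_counts.values())
    logp -= sum(math.lgamma(c + 1) for c in geno_counts.values())
    return logp


def hwp_monte_carlo_pvalue(genotypes, n_shuffles=2000, seed=0):
    """Estimate the exact-test p-value by random re-pairing of alleles."""
    rng = random.Random(seed)
    observed = log_prob_genotypes(genotypes)
    alleles = [a for g in genotypes for a in g]
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(alleles)
        resampled = list(zip(alleles[0::2], alleles[1::2]))
        # Small tolerance guards against floating-point ties.
        if log_prob_genotypes(resampled) <= observed + 1e-9:
            hits += 1
    # Add-one correction (an assumption here) keeps the estimate above zero.
    return (hits + 1) / (n_shuffles + 1)


# Hypothetical sample with no heterozygotes at all.
sample = [("A", "A")] * 10 + [("B", "B")] * 10
print(hwp_monte_carlo_pvalue(sample))
```

Each shuffle costs time proportional to the sample size regardless of the number of alleles, which is the practical advantage over complete enumeration for multiallelic loci.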

5,075 citations

Journal Article
TL;DR: The statistical basis for this "transmission test for linkage disequilibrium" (transmission/disequilibrium test [TDT]) is described and the relationship of this test to tests of cosegregation that are based on the proportion of haplotypes or genes identical by descent in affected sibs is shown.
Abstract: A population association has consistently been observed between insulin-dependent diabetes mellitus (IDDM) and the "class 1" alleles of the region of tandem-repeat DNA (5' flanking polymorphism [5'FP]) adjacent to the insulin gene on chromosome 11p. This finding suggests that the insulin gene region contains a gene or genes contributing to IDDM susceptibility. However, several studies that have sought to show linkage with IDDM by testing for cosegregation in affected sib pairs have failed to find evidence for linkage. As means for identifying genes for complex diseases, both the association and the affected-sib-pairs approaches have limitations. It is well known that population association between a disease and a genetic marker can arise as an artifact of population structure, even in the absence of linkage. On the other hand, linkage studies with modest numbers of affected sib pairs may fail to detect linkage, especially if there is linkage heterogeneity. We consider an alternative method to test for linkage with a genetic marker when population association has been found. Using data from families with at least one affected child, we evaluate the transmission of the associated marker allele from a heterozygous parent to an affected offspring. This approach has been used by several investigators, but the statistical properties of the method as a test for linkage have not been investigated. In the present paper we describe the statistical basis for this "transmission test for linkage disequilibrium" (transmission/disequilibrium test [TDT]). We then show the relationship of this test to tests of cosegregation that are based on the proportion of haplotypes or genes identical by descent in affected sibs. The TDT provides strong evidence for linkage between the 5'FP and susceptibility to IDDM. The conclusions from this analysis apply in general to the study of disease associations, where genetic markers are usually closely linked to candidate genes. When a disease is found to be associated with such a marker, the TDT may detect linkage even when haplotype-sharing tests do not.
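
As an illustration of the transmission counting described above, the TDT is commonly written as a McNemar-type statistic, (b - c)^2 / (b + c), where b and c are the numbers of heterozygous parents who transmit, respectively do not transmit, the associated allele to an affected child. The sketch below uses that form with a 1-degree-of-freedom chi-square p-value; the function names and the counts are hypothetical.

```python
import math


def tdt_statistic(transmitted, untransmitted):
    """McNemar-type TDT chi-square from heterozygous-parent transmissions."""
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    return (transmitted - untransmitted) ** 2 / total


def chi2_1df_pvalue(statistic):
    """Upper-tail p-value of a chi-square with 1 df: erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical counts: 60 transmissions of the associated allele, 40 of the other.
stat = tdt_statistic(60, 40)
print(stat, chi2_1df_pvalue(stat))
```

Only heterozygous parents enter the counts, which is why the test is not misled by population structure in the way a plain case-control comparison can be.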

3,791 citations


"Estimating African American admixtu..." refers background in this paper

  • ...…also been proposed by McKeigue (1997) and Kaplan et al. (1997) that the linkage disequilibrium that results from recent admixture could also be used to detect disease genes for qualitative or quantitative traits by means of the transmission disequilibrium test (Spielman et al. 1993; Allison 1997)....

    [...]

Journal ArticleDOI
10 Jan 1964-Genetics
TL;DR: The results of these investigations were sufficient to show that even for relatively simple cases (two loci, simple symmetrical selective values) linkage might have profound effects on the course of natural selection and, pari passu, natural selection may have major effects on the distribution of coupling and repulsion linkage in a population.
Abstract: While the theory of the genetic changes in a population due to selection is quite well understood for single loci, our theory for multiple-gene characters is in a rudimentary stage. Most of the formulations for multiple-gene characters are simply extensions of single-locus models, extensions which ignore the problem of linkage. There are, however, a few papers in which the role of linkage has been investigated for more or less special cases of selection (KIMURA 1956; LEWONTIN and KOJIMA 1960; BODMER and PARSONS 1962). The results of these investigations were sufficient to show that even for relatively simple cases (two loci, simple symmetrical selective values) linkage might have profound effects on the course of natural selection and, pari passu, natural selection may have major effects on the distribution of coupling and repulsion linkage in a population. The results of the investigations of LEWONTIN and KOJIMA (1960) of the two-locus model can be summarized as follows: (1) If the fitnesses are additive between loci (no epistasis), linkage does not affect the final equilibrium state of the population. (2) If linkage is tighter than the value demanded by the magnitude of the epistasis (the greater the epistasis the greater the value) there may be permanent linkage disequilibrium and alteration of equilibrium gene frequencies. (3) The rate of genetic change with time is affected by the tightness of the linkage. (4) In some cases stable gene frequency equilibria are possible only if linkage is tight enough. Although these conclusions were based only on a two-locus model and for selective values of a fairly restricted sort, they point clearly to the importance of taking linkage into account in understanding the changes of gene frequencies in populations. In fact, some experimental results (an example of which will be given below) can be understood only if the interaction of selection and linkage is taken into account. The equations describing the interaction between selection and linkage (see below) do not usually have general literal solutions. It is for this reason that the authors cited above have restricted themselves to relatively simple cases. In view of the interesting findings of those previous papers, however, it is worthwhile to explore the subject more intensively. To do so requires the numerical rather than general literal solutions to the equations, but such numerical solutions apply, obviously, only to the particular parameter values chosen. To make such a nu-

1,913 citations


"Estimating African American admixtu..." refers background or methods in this paper

  • ...…regions were Senegambia (Gambia and Senegal), Sierra Leone (Guinea and Sierra Leone), Windward Coast (Ivory Coast and Liberia), Gold Coast (Ghana), Bight of Benin (from the Volta River to the Benin River), Bight of Biafra (east of the Benin River to Gabon), and Angola (southwest Africa,…...

    [...]

  • ...D′ coefficients, in which the gametic disequilibrium (D) is standardized by the theoretical maximum disequilibrium (Dmax), were calculated on the basis of the estimated haplotype frequencies (Lewontin 1964, 1988; Thomson et al. 1988)....

    [...]
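
The D′ standardization mentioned in the excerpt above can be sketched for two biallelic loci as follows: D = p_AB - p_A p_B is divided by the largest magnitude D could take given the allele frequencies. The function name and the frequencies are hypothetical, and the step of estimating haplotype frequencies from genotype data, which the excerpt refers to, is not shown.

```python
def d_prime(p_ab, p_a, p_b):
    """Lewontin's D' for two biallelic loci.

    p_ab is the frequency of the haplotype carrying allele A at the first
    locus and allele B at the second; p_a and p_b are the allele frequencies.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        raise ValueError("D' is undefined when a locus is monomorphic")
    return d / d_max


# Hypothetical haplotype and allele frequencies.
print(d_prime(0.40, 0.50, 0.50))
```

With this sign convention D′ ranges from -1 to 1, and a magnitude of 1 means at least one of the four possible haplotypes is absent.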

Journal ArticleDOI
01 Dec 1996-Genetics
TL;DR: The conclusion that most haplogroups observed in Europe are Caucasoid-specific, and that at least some of them occur at varying frequencies in different Caucasoid populations, is supported.
Abstract: Mitochondrial DNA (mtDNA) sequence variation was examined in Finns, Swedes and Tuscans by PCR amplification and restriction analysis. About 99% of the mtDNAs were subsumed within 10 mtDNA haplogroups (H, I, J, K, M, T, U, V, W, and X) suggesting that the identified haplogroups could encompass virtually all European mtDNAs. Because both hypervariable segments of the mtDNA control region were previously sequenced in the Tuscan samples, the mtDNA haplogroups and control region sequences could be compared. Using a combination of haplogroup-specific restriction site changes and control region nucleotide substitutions, the distribution of the haplogroups was surveyed through the published restriction site polymorphism and control region sequence data of Caucasoids. This supported the conclusion that most haplogroups observed in Europe are Caucasoid-specific, and that at least some of them occur at varying frequencies in different Caucasoid populations. The classification of almost all European mtDNA variation in a number of well defined haplogroups could provide additional insights about the origin and relationships of Caucasoid populations and the process of human colonization of Europe, and is valuable for the definition of the role played by mtDNA backgrounds in the expression of pathological mtDNA mutations.

904 citations


"Estimating African American admixtu..." refers background in this paper

  • ...L and H are the most common haplogroups that are unique to African and European populations, respectively (Torroni et al. 1994, 1996; Chen et al. 1995), and can be used to test the relative African and European maternal contribution to African Americans and Jamaicans....

    [...]
