Inferring genome-wide patterns of admixture in Qataris using fifty-five ancestral populations.

Abstract
Populations of the Arabian Peninsula have a complex genetic structure that reflects successive waves of migration, including the earliest human migrations from Africa and eastern Asia, migrations along ancient trade routes, and the colonization history of recent centuries. Here, we present a study of genome-wide admixture in this region using 156 genotyped individuals from Qatar, a country located at the crossroads of these migration patterns. Because the haplotypes of these individuals could have originated from many different populations across the world, we developed a machine learning method, "SupportMix", to infer locus-specific genomic ancestry when many possible ancestral populations are analyzed simultaneously. Simulations show that SupportMix is not only more accurate than other popular admixture discovery tools but is also the first admixture inference method that scales efficiently to the simultaneous analysis of 50-100 putative ancestral populations while remaining independent of prior demographic information. Using all 55 world populations from the Human Genome Diversity Panel at once, SupportMix extracted the fine-scale ancestry of the Qatari population and provided many new observations about the ancestry of the region. For example, in addition to recapitulating the three major subpopulations in Qatar, of mainly Arabic, Persian, and African ancestry, SupportMix traces the ancestry of the Persian group to populations sampled in Greater Persia rather than China, and the ancestry of the African group to a sub-Saharan origin rather than the Southern African Bantu origin previously thought.
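The abstract does not describe how SupportMix works internally, so the following is only a toy sketch of the general idea of locus-specific ancestry inference against several reference populations at once. It assumes phased 0/1 haplotypes, fixed non-overlapping windows, and a simple nearest-allele-frequency rule; the function names, the window size, and the populations `pop_A` and `pop_B` are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence

Haplotype = Sequence[int]  # 0/1 alleles along one phased chromosome


def window_frequencies(haps: List[Haplotype], start: int, end: int) -> List[float]:
    """Mean allele frequency per site in [start, end) across reference haplotypes."""
    n = len(haps)
    return [sum(h[i] for h in haps) / n for i in range(start, end)]


def infer_local_ancestry(
    query: Haplotype,
    references: Dict[str, List[Haplotype]],
    window_size: int,
) -> List[str]:
    """Label each non-overlapping window of `query` with the closest population.

    Assumption: "closest" means the smallest squared distance between the query
    alleles and the population's allele frequencies in that window. This is a
    stand-in classifier for illustration, not the SupportMix algorithm.
    """
    labels = []
    for start in range(0, len(query), window_size):
        end = min(start + window_size, len(query))
        best_pop, best_dist = None, float("inf")
        for pop, haps in references.items():
            freqs = window_frequencies(haps, start, end)
            dist = sum((query[i] - f) ** 2 for i, f in zip(range(start, end), freqs))
            if dist < best_dist:
                best_pop, best_dist = pop, dist
        labels.append(best_pop)
    return labels


# Hypothetical toy data: two reference populations, eight sites each.
references = {
    "pop_A": [[0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0, 0]],
    "pop_B": [[1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 0, 1, 1, 1, 1, 1]],
}
# An admixed query haplotype: first half resembles pop_A, second half pop_B.
query = [0, 0, 0, 0, 1, 1, 1, 1]

labels = infer_local_ancestry(query, references, window_size=4)
print(labels)
```

Because every window is scored against every reference population independently, adding more candidate populations in this sketch only lengthens the inner loop; no prior demographic model is needed to choose among them.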

