Interrogating a High-Density SNP Map for Signatures of Natural Selection
TLDR
An analysis of single nucleotide polymorphisms with allele frequencies determined in three populations provides a first-generation natural selection map of the human genome and compelling evidence that selection has shaped extant patterns of human genomic variation.
Abstract:
Natural selection, which can be defined as the differential contribution of genetic variants to future generations (Aquadro et al. 2001), is the driving force of Darwinian evolution. Despite intense research, only a relatively small number of regions and genes have been directly implicated as targets of selection in the human genome (Kitano and Saitou 1999; Rana et al. 1999; Huttley et al. 2000; Hollox et al. 2001; Hull et al. 2001; Hurst and Pal 2001; Koda et al. 2001; Sullivan et al. 2001; Tishkoff et al. 2001; Baum et al. 2002; Fullerton et al. 2002; Gilad et al. 2002; Hamblin et al. 2002). A more comprehensive and genomic understanding of how and where natural selection has shaped patterns of genetic variation may provide important insights into the mechanisms of evolutionary change (Otto 2000), guide selection of loci for inclusion in population genetic studies (Vitalis et al. 2001), facilitate the annotation of functionally significant genomic regions (Nielsen 2001), and help elucidate genotype-phenotype correlations in complex diseases (Przeworski et al. 2000; Nielsen 2001).
Detecting unambiguous evidence for natural selection remains challenging because the effect of selection on the distribution of genetic variation can be mimicked by population demographic history (i.e., the size, structure, and mating pattern of a population). For instance, both adaptive hitchhiking and population expansion can cause an excess of rare variants observed in DNA sequence data compared with what is expected under a standard neutral model (Tajima 1989; Przeworski et al. 2000). Despite these difficulties, the recent deluge of publicly available single nucleotide polymorphisms (SNPs) provides an exciting opportunity to identify genome-wide signatures of selection (Sunyaev et al. 2000; Fay et al. 2001; Sachidanandam et al. 2001).
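To make the "excess of rare variants" comparison concrete, the following is a minimal sketch of Tajima's D (Tajima 1989), which contrasts two estimates of diversity that should agree under the standard neutral model. It is offered only as an illustration of the statistic cited above, not as part of the analysis described here; the function name `tajimas_d` and its input layout (one allele count per segregating site) are assumptions made for this example.

```python
import math


def tajimas_d(n, allele_counts):
    """Tajima's D for n sampled sequences.

    allele_counts holds, for each segregating site, the number of sampled
    sequences (between 1 and n - 1) that carry one of the two alleles.
    """
    s = len(allele_counts)  # number of segregating sites
    if s == 0:
        return float("nan")  # D is undefined without segregating sites

    # Constants from Tajima (1989), functions of sample size only.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # Mean number of pairwise differences, summed over sites.
    pi = sum(2 * k * (n - k) / (n * (n - 1)) for k in allele_counts)

    # Difference between the two diversity estimates, scaled by its
    # estimated standard deviation.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))
```

An excess of rare variants pushes D below zero and an excess of intermediate-frequency variants pushes it above zero. Because both hitchhiking and population expansion produce negative values, a negative D on its own cannot distinguish selection from demography.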
To this end, examining the variation in SNP allele frequencies between populations, which can be quantified by the statistic FST, is a promising strategy for detecting signatures of natural selection (Lewontin and Krakauer 1973; Rana et al. 1999; Hollox et al. 2001; Fullerton et al. 2002; Gilad et al. 2002; Hamblin et al. 2002). Under selective neutrality, FST is determined by genetic drift, which will affect all loci across the genome in a similar and predictable fashion. On the other hand, natural selection is a locus-specific force that can cause systematic deviations in FST values for a selected gene and nearby genetic markers. For example, geographically restricted directional selection may lead to an increase in FST of a selected locus, whereas balancing or species-wide directional selection may lead to a decrease in FST compared with neutrally evolving loci (Cavalli-Sforza 1966; Bowcock et al. 1991; Andolfatto 2001). Previous studies that have attempted to identify natural selection based on patterns of population differentiation relied on simulations to obtain the expected distribution of FST under selective neutrality (Lewontin and Krakauer 1973; Bowcock et al. 1991; Beaumont and Nichols 1996). However, the simulated distribution of FST strongly depends on the assumed population demographic history, which is rarely known with any degree of certainty.
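As a rough illustration of how FST summarizes allele frequency differences between populations, the sketch below computes a simple heterozygosity-based FST, (HT - HS) / HT, for one biallelic SNP. It assumes equal population weights and ignores sampling error, so it is not the Weir and Cockerham estimator listed in the references; the function name `fst_biallelic` is hypothetical.

```python
def fst_biallelic(freqs):
    """Heterozygosity-based FST for one biallelic SNP.

    freqs is a list with the frequency of one allele in each population.
    Populations are weighted equally (an assumption of this sketch).
    """
    k = len(freqs)
    p_bar = sum(freqs) / k

    # Expected heterozygosity in the pooled (total) population.
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        # Monomorphic everywhere: FST is undefined; 0.0 is returned here
        # purely as a convention for this example.
        return 0.0

    # Mean expected heterozygosity within populations.
    h_within = sum(2 * p * (1 - p) for p in freqs) / k

    return (h_total - h_within) / h_total
```

Identical frequencies in every population give 0, and fixation of alternative alleles in two populations gives 1. Values between these extremes reflect the degree of differentiation at that SNP.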
As an expanding number of SNPs are genotyped across multiple populations, a complementary approach that does not require tenuous assumptions about population demographic history is now becoming feasible. Specifically, by sampling a large number of SNPs throughout the genome, loci that have been affected by natural selection can simply be identified as outliers in the extreme tails of the empirical distribution of FST (Cavalli-Sforza 1966; Black et al. 2001; Goldstein and Chikhi 2002). Recently, this strategy has been used to infer natural selection in the CAPN10 gene; however, the empirical distribution of FST contained <100 loci (Fullerton et al. 2002).
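The outlier strategy can be sketched in a few lines: rank all SNPs by FST and flag those in the lowest and highest fractions of the empirical distribution. The function name `empirical_tail_outliers`, the dictionary input, and the default tail fraction are assumptions for this example and are not taken from the study.

```python
def empirical_tail_outliers(fst_by_snp, tail=0.025):
    """Return (low_tail, high_tail) SNP identifiers.

    fst_by_snp maps a SNP identifier to its FST value. The tail argument
    is the fraction of SNPs flagged in each tail of the empirical
    distribution (a hypothetical default, chosen for illustration).
    """
    # Rank SNPs from lowest to highest FST.
    ranked = sorted(fst_by_snp.items(), key=lambda item: item[1])
    n = len(ranked)
    k = int(n * tail)  # number of SNPs flagged per tail

    low_tail = [snp for snp, _ in ranked[:k]]
    high_tail = [snp for snp, _ in ranked[n - k:]]
    return low_tail, high_tail
```

No neutral model is simulated here. The genome-wide distribution itself serves as the reference, which is why the approach depends on having a large number of loci.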
In this work, we describe an analysis of 26,530 SNPs with allele frequencies that were determined in three populations: African-American, East Asian, and European-American. The density of this SNP allele frequency map provides a unique and powerful opportunity to interrogate the genome for signatures of natural selection. Through a variety of analyses, we have found statistically significant evidence supporting the hypothesis that selection has influenced extant patterns of human genetic variation. Furthermore, we have identified 174 candidate genes that demonstrate signatures of selection when contrasted with the empirical genome-wide distribution of FST. This analysis provides the conceptual foundation for constructing a high-resolution natural selection map, which will be an important resource in understanding the recent evolutionary history of our species, and will facilitate detailed studies on the identified candidate genes.
Citations
Journal Article
A second generation human haplotype map of over 3.1 million SNPs
Kelly A. Frazer,Dennis G. Ballinger,David R. Cox,David A. Hinds,Laura L. Stuve,Richard A. Gibbs,John W. Belmont,Andrew Boudreau,Paul Hardenbol,Suzanne M. Leal,Shiran Pasternak,David A. Wheeler,Thomas D. Willis,Fuli Yu,Huanming Yang,Changqing Zeng,Gao Yang,H. B. Hu,Weitao Hu,Chaohua Li,Wei Lin,Siqi Liu,Hao Pan,Xiaoli Tang,Jian Wang,Wei Wang,Jun Yu,Bo Zhang,Qingrun Zhang,Hongbin Zhao,Hui Zhao,Jun Zhou,Stacey Gabriel,Rachel Barry,Brendan Blumenstiel,Amy L. Camargo,Matthew Defelice,Maura Faggart,Mary Goyette,Supriya Gupta,Jamie Moore,Huy Nguyen,Robert C. Onofrio,Melissa Parkin,Jessica Roy,Erich Stahl,Ellen Winchester,Liuda Ziaugra,David Altshuler,Yan Shen,Zhijian Yao,Wei Huang,Xun Chu,Yungang He,Li Jin,Yangfan Liu,Yayun Shen,Weiwei Sun,Haifeng Wang,Yi Wang,Ying Wang,Xiaoyan Xiong,Liang Xu,Mary M.Y. Waye,Stephen Kwok-Wing Tsui,Hong Xue,J. Tze Fei Wong,Luana Galver,Jian-Bing Fan,Kevin L. Gunderson,Sarah S. Murray,Arnold Oliphant,Mark S. Chee,Alexandre Montpetit,Fanny Chagnon,Vincent Ferretti,Martin Leboeuf,Jean François Olivier,Michael S. Phillips,Stéphanie Roumy,Clémentine Sallée,Andrei Verner,Thomas J. Hudson,Pui-Yan Kwok,Dongmei Cai,Daniel C. Koboldt,Raymond D. Miller,Ludmila Pawlikowska,Patricia Taillon-Miller,Ming Xiao,Lap-Chee Tsui,William Mak,Qiang Song You,Paul K.H. Tam,Yusuke Nakamura,Takahisa Kawaguchi,Takuya Kitamoto,Takashi Morizono,Atsushi Nagashima,Yozo Ohnishi,Akihiro Sekine,Toshihiro Tanaka,Tatsuhiko Tsunoda,Panos Deloukas,Christine P. Bird,Marcos Delgado,Emmanouil T. Dermitzakis,Rhian Gwilliam,Sarah E. Hunt,Jonathan J. Morrison,Don Powell,Barbara E. Stranger,Pamela Whittaker,David R. Bentley,Mark J. Daly,Paul I.W. de Bakker,Jeffrey C. Barrett,Yves Chretien,Julian Maller,Steve McCarroll,Nick Patterson,Itsik Pe'er,Alkes L. Price,Shaun Purcell,Daniel J. Richter,Pardis C. Sabeti,Richa Saxena,Stephen F. Schaffner,Pak C. Sham,Patrick Varilly,Lincoln Stein,Lalitha Krishnan,Albert V. Smith,Marcela K. Tello-Ruiz,Gudmundur A. 
Thorisson,Aravinda Chakravarti,Peter E. Chen,David J. Cutler,Carl S. Kashuk,Shin Lin,Gonçalo R. Abecasis,Weihua Guan,Yun Li,Heather M. Munro,Zhaohui S. Qin,Daryl J. Thomas,Gilean McVean,Adam Auton,Leonardo Bottolo,Niall Cardin,Susana Eyheramendy,Colin Freeman,Jonathan Marchini,Simon Myers,Chris C. A. Spencer,Matthew Stephens,Peter Donnelly,Lon R. Cardon,Geraldine M. Clarke,David M. Evans,Andrew P. Morris,Bruce S. Weir,Todd A. Johnson,James C. Mullikin,Stephen T. Sherry,Michael Feolo,Andrew D. Skol,Houcan Zhang,Ichiro Matsuda,Yoshimitsu Fukushima,Darryl Macer,Eiko Suda,Charles N. Rotimi,Clement Adebamowo,Ike Ajayi,Toyin Aniagwu,Patricia A. Marshall,Chibuzor Nkwodimmah,Charmaine D.M. Royal,Mark Leppert,Missy Dixon,Andy Peiffer,Renzong Qiu,Alastair Kent,Kazuto Kato,Norio Niikawa,Isaac F. Adewole,Bartha Maria Knoppers,Morris W. Foster,Ellen Wright Clayton,Jessica Watkin,Donna M. Muzny,Lynne V. Nazareth,Erica Sodergren,George M. Weinstock,Imtaz Yakub,Bruce W. Birren,Richard K. Wilson,Lucinda Fulton,Jane Rogers,John Burton,Nigel P. Carter,C M Clee,Mark Griffiths,Matthew C. Jones,Kirsten McLay,Robert W. Plumb,Mark T. Ross,Sarah Sims,David Willey,Zhu Chen,Hua Han,Le Kang,Martin Godbout,John C. Wallenburg,Paul L'Archevêque,Guy Bellemare,Koji Saeki,Hongguang Wang,Daochang An,Hongbo Fu,Qing Li,Zhen Wang,Renwu Wang,Arthur L. Holden,Lisa D. Brooks,Jean E. McEwen,Mark S. Guyer,Vivian Ota Wang,Jane Peterson,Michael Shi,Jack Spiegel,Lawrence M. Sung,Lynn F. Zacharia,Francis S. Collins,Karen Kennedy,Ruth Jamieson,John Stewart +237 more
TL;DR: The Phase II HapMap is described, which characterizes over 3.1 million human single nucleotide polymorphisms genotyped in 270 individuals from four geographically diverse populations and includes 25–35% of common SNP variation in the populations surveyed, and increased differentiation at non-synonymous, compared to synonymous, SNPs is demonstrated.
Journal Article
Molecular Signatures of Natural Selection
TL;DR: This review provides a nonmathematical description of the issues involved in detecting selection from DNA sequences and SNP data and is intended for readers who are not familiar with population genetic theory.
Journal Article
How to track and assess genotyping errors in population genetics studies.
Aurélie Bonin,Eva Bellemain,P. Bronken Eidesen,François Pompanon,Christian Brochmann,Pierre Taberlet +5 more
TL;DR: Four case studies are considered, representing a large variety of population genetics investigations that differ in their sampling strategies, in the type of organism studied (plant or animal), and in the molecular markers used [microsatellites or amplified fragment length polymorphisms (AFLPs)], together with the estimated genotyping error rate.
Journal Article
The power and promise of population genomics: from genotyping to genome typing
TL;DR: The most useful contribution of the genomics model to population genetics will be improving inferences about population demography and evolutionary history.
Journal Article
Whole-Genome Patterns of Common DNA Variation in Three Human Populations
David A. Hinds,Laura L. Stuve,Geoffrey B. Nilsen,Eran Halperin,Eleazar Eskin,Dennis G. Ballinger,Kelly A. Frazer,David R. Cox
TL;DR: This work has characterized whole-genome patterns of common human DNA variation by genotyping 1,586,383 single-nucleotide polymorphisms (SNPs) in 71 Americans of European, African, and Asian ancestry and indicates that these SNPs capture most common genetic variation as a result of linkage disequilibrium.
References
Journal Article
Gene Ontology: tool for the unification of biology
M Ashburner,Catherine A. Ball,Judith A. Blake,David Botstein,Heather Butler,J. M. Cherry,Allan Peter Davis,Kara Dolinski,Selina S. Dwight,J.T. Eppig,Midori A. Harris,David P. Hill,Laurie Issel-Tarver,Andrew Kasarskis,Suzanna E. Lewis,John C. Matese,Joel E. Richardson,M. Ringwald,Gerald M. Rubin,Gavin Sherlock +19 more
TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
Journal Article
Estimating F-statistics for the analysis of population structure.
Bruce S. Weir,C. Clark Cockerham
TL;DR: The purpose of this discussion is to offer some unity to various estimation formulae and to point out that correlations of genes in structured populations, with which F-statistics are concerned, are expressed very conveniently with a set of parameters treated by Cockerham (1969, 1973).
Book
Molecular Evolutionary Genetics
TL;DR: Recent developments of statistical methods in molecular phylogenetics are reviewed and it is shown that the mathematical foundations of these methods are not well established, but computer simulations and empirical data indicate that currently used methods produce reasonably good phylogenetic trees when a sufficiently large number of nucleotides or amino acids are used.
Journal Article
A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms
Ravi Sachidanandam,David Weissman,Steven Schmidt,Jerzy M. Kakol,Lincoln Stein,Gabor T. Marth,Steve Sherry,James C. Mullikin,Beverley J. Mortimore,David Willey,Sarah E. Hunt,Charlotte G. Cole,Penny Coggill,Catherine M. Rice,Zemin Ning,Jane Rogers,David R. Bentley,Pui-Yan Kwok,Elaine R. Mardis,Raymond T. Yeh,Brian Schultz,Lisa Cook,Ruth Davenport,Michael Dante,Lucinda Fulton,LaDeana W. Hillier,Robert H. Waterston,John Douglas Mcpherson,Brian Gilman,Stephen F. Schaffner,William J. Van Etten,David Reich,John M. Higgins,Mark J. Daly,Brendan Blumenstiel,Jennifer Baldwin,Nicole Stange-Thomann,Michael C. Zody,Lauren Linton,Eric S. Lander,David Altshuler +40 more
TL;DR: This high-density SNP map provides a public resource for defining haplotype variation across the genome, and should help to identify biomedically important genes for diagnosis and therapy.