
Showing papers by "Carlos Bustamante" published in 2010


Journal ArticleDOI
08 Apr 2010-Nature
TL;DR: It is shown that dog breeds share a higher proportion of multi-locus haplotypes unique to grey wolves from the Middle East, indicating that they are a dominant source of genetic diversity for dogs rather than wolves from east Asia, as suggested by mitochondrial DNA sequence data.
Abstract: Advances in genome technology have facilitated a new understanding of the historical and genetic processes crucial to rapid phenotypic evolution under domestication. To understand the process of dog diversification better, we conducted an extensive genome-wide survey of more than 48,000 single nucleotide polymorphisms in dogs and their wild progenitor, the grey wolf. Here we show that dog breeds share a higher proportion of multi-locus haplotypes unique to grey wolves from the Middle East, indicating that they are a dominant source of genetic diversity for dogs rather than wolves from east Asia, as suggested by mitochondrial DNA sequence data. Furthermore, we find a surprising correspondence between genetic and phenotypic/functional breed groupings but there are exceptions that suggest phenotypic diversification depended in part on the repeated crossing of individuals with novel phenotypes. Our results show that Middle Eastern wolves were a critical source of genome diversity, although interbreeding with local wolf populations clearly occurred elsewhere in the early history of specific lineages. More recently, the evolution of modern dog breeds seems to have been an iterative process that drew on a limited genetic toolkit to create remarkable phenotypic diversity.

692 citations


01 Oct 2010
TL;DR: The pilot phase of the 1000 Genomes Project is presented, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms, and the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants are described.

599 citations


Journal ArticleDOI
TL;DR: This paper analyzes Affymetrix GeneChip 500K genotype data from African Americans (n = 365) and individuals with ancestry from West Africa (n = 203 from 12 populations) and Europe (n = 400 from 42 countries) to obtain a fine-scale genome-wide perspective of ancestry.
Abstract: Quantifying patterns of population structure in Africans and African Americans illuminates the history of human populations and is critical for undertaking medical genomic studies on a global scale. To obtain a fine-scale genome-wide perspective of ancestry, we analyze Affymetrix GeneChip 500K genotype data from African Americans (n = 365) and individuals with ancestry from West Africa (n = 203 from 12 populations) and Europe (n = 400 from 42 countries). We find that population structure within the West African sample reflects primarily language and secondarily geographical distance, echoing the Bantu expansion. Among African Americans, analysis of genomic admixture by a principal component-based approach indicates that the median proportion of European ancestry is 18.5% (25th–75th percentiles: 11.6–27.7%), with very large variation among individuals. In the African-American sample as a whole, few autosomal regions showed exceptionally high or low mean African ancestry, but the X chromosome showed elevated levels of African ancestry, consistent with a sex-biased pattern of gene flow with an excess of European male and African female ancestry. We also find that genomic profiles of individual African Americans afford personalized ancestry reconstructions differentiating ancient vs. recent European and African ancestry. Finally, patterns of genetic similarity among inferred African segments of African-American genomes and genomes of contemporary African populations included in this study suggest African ancestry is most similar to non-Bantu Niger-Kordofanian-speaking populations, consistent with historical documents of the African Diaspora and trans-Atlantic slave trade.

454 citations


Journal ArticleDOI
TL;DR: The largest genetic study to date of morphology in domestic dogs identifies genes controlling nearly 100 morphological traits and identifies important trends in phenotypic variation within this species.
Abstract: Domestic dogs exhibit tremendous phenotypic diversity, including a greater variation in body size than any other terrestrial mammal. Here, we generate a high density map of canine genetic variation by genotyping 915 dogs from 80 domestic dog breeds, 83 wild canids, and 10 outbred African shelter dogs across 60,968 single-nucleotide polymorphisms (SNPs). Coupling this genomic resource with external measurements from breed standards and individuals as well as skeletal measurements from museum specimens, we identify 51 regions of the dog genome associated with phenotypic variation among breeds in 57 traits. The complex traits include average breed body size and external body dimensions and cranial, dental, and long bone shape and size with and without allometric scaling. In contrast to the results from association mapping of quantitative traits in humans and domesticated plants, we find that across dog breeds, a small number of quantitative trait loci (≤3) explain the majority of phenotypic variation for most of the traits we studied. In addition, many genomic regions show signatures of recent selection, with most of the highly differentiated regions being associated with breed-defining traits such as body size, coat characteristics, and ear floppiness. Our results demonstrate the efficacy of mapping multiple traits in the domestic dog using a database of genotyped individuals and highlight the important role human-directed selection has played in altering the genetic architecture of key traits in this important species.

431 citations


Journal ArticleDOI
TL;DR: The results suggest future genome-wide association scans in Hispanic/Latino populations may require correction for local genomic ancestry at a subcontinental scale when associating differences in the genome with disease risk, progression, and drug efficacy, as well as for admixture mapping.
Abstract: Hispanic/Latino populations possess a complex genetic structure that reflects recent admixture among and potentially ancient substructure within Native American, European, and West African source populations. Here, we quantify genome-wide patterns of SNP and haplotype variation among 100 individuals with ancestry from Ecuador, Colombia, Puerto Rico, and the Dominican Republic genotyped on the Illumina 610-Quad arrays and 112 Mexicans genotyped on Affymetrix 500K platform. Intersecting these data with previously collected high-density SNP data from 4,305 individuals, we use principal component analysis and clustering methods FRAPPE and STRUCTURE to investigate genome-wide patterns of African, European, and Native American population structure within and among Hispanic/Latino populations. Comparing autosomal, X and Y chromosome, and mtDNA variation, we find evidence of a significant sex bias in admixture proportions consistent with disproportionate contribution of European male and Native American female ancestry to present-day populations. We also find that patterns of linkage-disequilibria in admixed Hispanic/Latino populations are largely affected by the admixture dynamics of the populations, with faster decay of LD in populations of higher African ancestry. Finally, using the locus-specific ancestry inference method LAMP, we reconstruct fine-scale chromosomal patterns of admixture. We document moderate power to differentiate among potential subcontinental source populations within the Native American, European, and African segments of the admixed Hispanic/Latino genomes. Our results suggest future genome-wide association scans in Hispanic/Latino populations may require correction for local genomic ancestry at a subcontinental scale when associating differences in the genome with disease risk, progression, and drug efficacy, as well as for admixture mapping.

384 citations


Journal ArticleDOI
24 May 2010-PLOS ONE
TL;DR: These analyses highlight the power of population genomics in agricultural systems to identify functionally important regions of the genome and to decipher the role of human-directed breeding in refashioning the genomes of a domesticated species.
Abstract: Background: The domestication of Asian rice (Oryza sativa) was a complex process punctuated by episodes of introgressive hybridization among and between subpopulations. Deep genetic divergence between the two main varietal groups (Indica and Japonica) suggests domestication from at least two distinct wild populations. However, genetic uniformity surrounding key domestication genes across divergent subpopulations suggests cultural exchange of genetic material among ancient farmers. Methodology/Principal Findings: In this study, we utilize a novel 1,536 SNP panel genotyped across 395 diverse accessions of O. sativa to study genome-wide patterns of polymorphism, to characterize population structure, and to infer the introgression history of domesticated Asian rice. Our population structure analyses support the existence of five major subpopulations (indica, aus, tropical japonica, temperate japonica and GroupV) consistent with previous analyses. Our introgression analysis shows that most accessions exhibit some degree of admixture, with many individuals within a population sharing the same introgressed segment due to artificial selection. Admixture mapping and association analysis of amylose content and grain length illustrate the potential for dissecting the genetic basis of complex traits in domesticated plant populations. Conclusions/Significance: Genes in these regions control a myriad of traits including plant stature, blast resistance, and amylose content. These analyses highlight the power of population genomics in agricultural systems to identify functionally important regions of the genome and to decipher the role of human-directed breeding in refashioning the genomes of a domesticated species.

264 citations


Journal ArticleDOI
TL;DR: The authors' unique energy values and salt corrections improve predictions of DNA unzipping forces and are fully compatible with melting temperatures for oligos, and should make it possible to obtain free energies, enthalpies, and entropies in conditions not accessible by bulk methodologies.
Abstract: Accurate knowledge of the thermodynamic properties of nucleic acids is crucial to predicting their structure and stability. To date most measurements of base-pair free energies in DNA are obtained in thermal denaturation experiments, which depend on several assumptions. Here we report measurements of the DNA base-pair free energies based on a simplified system, the mechanical unzipping of single DNA molecules. By combining experimental data with a physical model and an optimization algorithm for analysis, we measure the 10 unique nearest-neighbor base-pair free energies with 0.1 kcal mol⁻¹ precision over two orders of magnitude of monovalent salt concentration. We find an improved set of standard energy values compared with Unified Oligonucleotide energies and a unique set of 10 base-pair-specific salt-correction values. The latter are found to be strongest for AA/TT and weakest for CC/GG. Our unique energy values and salt corrections improve predictions of DNA unzipping forces and are fully compatible with melting temperatures for oligos. The method should make it possible to obtain free energies, enthalpies, and entropies in conditions not accessible by bulk methodologies.
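
The count of 10 unique nearest-neighbor parameters follows from strand symmetry: of the 16 possible dinucleotide steps, each step and its reverse complement describe the same stacked pair of base pairs (for example AA/TT). The following is a minimal sketch that enumerates those classes. The function names are illustrative and do not come from the paper.

```python
# Enumerate the unique nearest-neighbor (NN) dinucleotide classes in duplex DNA.
# A step read 5'->3' on one strand is equivalent to its reverse complement
# read 5'->3' on the other strand, so 16 steps collapse into 10 classes.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(step):
    """Return the reverse complement of a short DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(step))


def unique_nn_steps():
    """Return (step, reverse_complement) pairs, one per equivalence class."""
    seen = set()
    classes = []
    for first in "ACGT":
        for second in "ACGT":
            step = first + second
            # Use the alphabetically smaller of the two equivalent steps as the key.
            key = min(step, reverse_complement(step))
            if key not in seen:
                seen.add(key)
                classes.append((key, reverse_complement(key)))
    return classes


if __name__ == "__main__":
    for step, partner in unique_nn_steps():
        print(f"{step}/{partner}")
    print(len(unique_nn_steps()), "unique nearest-neighbor classes")
```

Four of the classes (AT, TA, CG, GC) are their own reverse complement, which is why the total is 10 rather than 8.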

247 citations


Journal ArticleDOI
TL;DR: Genetic and functional evidence is presented that another gene with a fundamental role in MHC class I presentation, endoplasmic reticulum aminopeptidase 2 (ERAP2), has also evolved under balancing selection and contains a variant that affects antigen presentation.
Abstract: A remarkable characteristic of the human major histocompatibility complex (MHC) is its extreme genetic diversity, which is maintained by balancing selection. In fact, the MHC complex remains one of the best-known examples of natural selection in humans, with well-established genetic signatures and biological mechanisms for the action of selection. Here, we present genetic and functional evidence that another gene with a fundamental role in MHC class I presentation, endoplasmic reticulum aminopeptidase 2 (ERAP2), has also evolved under balancing selection and contains a variant that affects antigen presentation. Specifically, genetic analyses of six human populations revealed strong and consistent signatures of balancing selection affecting ERAP2. This selection maintains two highly differentiated haplotypes (Haplotype A and Haplotype B), with frequencies 0.44 and 0.56, respectively. We found that ERAP2 expressed from Haplotype B undergoes differential splicing and encodes a truncated protein, leading to nonsense-mediated decay of the mRNA. To investigate the consequences of ERAP2 deficiency on MHC presentation, we correlated surface MHC class I expression with ERAP2 genotypes in primary lymphocytes. Haplotype B homozygotes had lower levels of MHC class I expressed on the surface of B cells, suggesting that naturally occurring ERAP2 deficiency affects MHC presentation and immune response. Interestingly, an ERAP2 paralog, endoplasmic reticulum aminopeptidase 1 (ERAP1), also shows genetic signatures of balancing selection. Together, our findings link the genetic signatures of selection with an effect on splicing and a cellular phenotype. Although the precise selective pressure that maintains polymorphism is unknown, the demonstrated differences between the ERAP2 splice forms provide important insights into the potential mechanism for the action of selection.

236 citations


Journal ArticleDOI
03 Jun 2010-Nature
TL;DR: It is speculated that proteins may have evolved to select certain topologies that increase coupling between regions to avoid areas of the landscape that lead to kinetic trapping and misfolding.
Abstract: The three-dimensional structures of proteins often show a modular architecture comprised of discrete structural regions or domains. Cooperative communication between these regions is important for catalysis, regulation and efficient folding; lack of coupling has been implicated in the formation of fibrils and other misfolding pathologies. How different structural regions of a protein communicate and contribute to a protein's overall energetics and folding, however, is still poorly understood. Here we use a single-molecule optical tweezers approach to induce the selective unfolding of particular regions of T4 lysozyme and monitor the effect on other regions not directly acted on by force. We investigate how the topological organization of a protein (the order of structural elements along the sequence) affects the coupling and folding cooperativity between its domains. To probe the status of the regions not directly subjected to force, we determine the free energy changes during mechanical unfolding using Crooks' fluctuation theorem. We pull on topological variants (circular permutants) and find that the topological organization of the polypeptide chain critically determines the folding cooperativity between domains and thus what parts of the folding/unfolding landscape are explored. We speculate that proteins may have evolved to select certain topologies that increase coupling between regions to avoid areas of the landscape that lead to kinetic trapping and misfolding.
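
Crooks' fluctuation theorem relates the distribution of work performed during unfolding, P_U(W), to the distribution recovered during refolding, P_R(-W), through P_U(W)/P_R(-W) = exp((W - ΔG)/kT). Integrating this relation yields the Jarzynski equality, ΔG = -kT ln⟨exp(-W/kT)⟩. The sketch below uses that simpler consequence, not the paper's own Crooks-based analysis, to estimate a free energy from a list of work values. The function name, the `kT` default, and the sample numbers are assumptions for illustration only.

```python
import math


def jarzynski_free_energy(work_values, kT=1.0):
    """Estimate a free-energy difference from nonequilibrium work values.

    Uses the Jarzynski equality, dG = -kT * ln(mean(exp(-W / kT))),
    which follows from Crooks' fluctuation theorem. Work values and kT
    must share the same energy units.
    """
    scaled = [-w / kT for w in work_values]
    # Log-sum-exp trick: factor out the largest exponent for numerical stability.
    largest = max(scaled)
    log_mean = largest + math.log(
        sum(math.exp(s - largest) for s in scaled) / len(scaled)
    )
    return -kT * log_mean


if __name__ == "__main__":
    # Hypothetical work values in units of kT (not data from the paper).
    work = [4.0, 5.0, 6.0]
    print(jarzynski_free_energy(work))
```

Because the exponential average is dominated by low-work trajectories, the estimate never exceeds the mean work; in practice it converges slowly when the work distribution is broad, which is one reason bidirectional (Crooks-type) analyses are often preferred.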

235 citations


Journal ArticleDOI
TL;DR: A direct measure of the de novo mutation rate and selective constraints from DNMs estimated from a deep resequencing data set generated from a large cohort of ASD and SCZ cases and population control individuals with available parental DNA is presented.
Abstract: The role of de novo mutations (DNMs) in common diseases remains largely unknown. Nonetheless, the rate of de novo deleterious mutations and the strength of selection against de novo mutations are critical to understanding the genetic architecture of a disease. Discovery of high-impact DNMs requires substantial high-resolution interrogation of partial or complete genomes of families via resequencing. We hypothesized that deleterious DNMs may play a role in cases of autism spectrum disorders (ASD) and schizophrenia (SCZ), two etiologically heterogeneous disorders with significantly reduced reproductive fitness. We present a direct measure of the de novo mutation rate (μ) and selective constraints from DNMs estimated from a deep resequencing data set generated from a large cohort of ASD and SCZ cases (n = 285) and population control individuals (n = 285) with available parental DNA. A survey of ∼430 Mb of DNA from 401 synapse-expressed genes across all cases and 25 Mb of DNA in controls found 28 candidate DNMs, 13 of which were cell line artifacts. Our calculated direct neutral mutation rate (1.36 × 10⁻⁸) is similar to previous indirect estimates, but we observed a significant excess of potentially deleterious DNMs in ASD and SCZ individuals. Our results emphasize the importance of DNMs as genetic mechanisms in ASD and SCZ and the limitations of using DNA from archived cell lines to identify functional variants.

233 citations


Journal ArticleDOI
TL;DR: A suite of SNP-based resources has been developed and made publicly available for broad application in rice research, including large SNP datasets, tools for identifying informative SNPs for targeted applications, and a suite of custom-designed SNP assays for use in marker-assisted and genomic selection, association and QTL mapping, positional cloning, pedigree analysis, variety identification and seed purity testing.
Abstract: Single nucleotide polymorphisms (SNPs) are the most abundant form of genetic variation in eukaryotic genomes. SNPs may be functionally responsible for specific traits or phenotypes, or they may be informative for tracing the evolutionary history of a species or the pedigree of a variety. As genetic markers, SNPs are rapidly replacing simple sequence repeats (SSRs) because they are more abundant, stable, amenable to automation, efficient, and increasingly cost-effective. The integration of high throughput SNP genotyping capability promises to accelerate genetic gain in a breeding program, but also imposes a series of economic, organizational and technical hurdles. To begin to address these challenges, SNP-based resources are being developed and made publicly available for broad application in rice research. These resources include large SNP datasets, tools for identifying informative SNPs for targeted applications, and a suite of custom-designed SNP assays for use in marker-assisted and genomic selection, association and QTL mapping, positional cloning, pedigree analysis, variety identification and seed purity testing. SNP resources also make it possible for breeders to more efficiently evaluate and utilize the wealth of natural variation that exists in both wild and cultivated germplasm with the aim of improving the productivity and sustainability of agriculture.

Journal ArticleDOI
TL;DR: The results indicate that the diploid yeast nuclear genome is remarkably stable during the vegetative and meiotic cell cycles and support the hypothesis that peripheral regions of chromosomes are more dynamic than gene-rich central sections where structural rearrangements could be deleterious.
Abstract: Accurate estimates of mutation rates provide critical information to analyze genome evolution and organism fitness. We used whole-genome DNA sequencing, pulse-field gel electrophoresis, and compara ...

Journal ArticleDOI
TL;DR: A push-and-roll mechanism is proposed to explain how the packaging motor translocates the DNA in bursts of four 2.5 bp power strokes while rotating the DNA; the mechanism might apply to other ring motors as well.

Journal ArticleDOI
12 Oct 2010-Rice
TL;DR: An overview of a research platform that provides essential germplasm, genotypic and phenotypic data and analytical tools for dissecting phenotype–genotype associations in rice is presented, to empower basic research discoveries in rice by linking sequence diversity with physiological, morphological, and agronomic variation.
Abstract: We present an overview of a research platform that provides essential germplasm, genotypic and phenotypic data and analytical tools for dissecting phenotype–genotype associations in rice. These resources include a diversity panel of 400 Oryza sativa and 100 Oryza rufipogon accessions that have been purified by single seed descent, a custom-designed Affymetrix array consisting of 44,100 SNPs, an Illumina GoldenGate assay consisting of 1,536 SNPs, and a suite of low-resolution 384-SNP assays for the Illumina BeadXpress Reader that are designed for applications in breeding, genetics and germplasm management. Our long-term goal is to empower basic research discoveries in rice by linking sequence diversity with physiological, morphological, and agronomic variation. This research platform will also help increase breeding efficiency by providing a database of diversity information that will enable researchers to identify useful DNA polymorphisms in genes and germplasm of interest and convert that information into cost-effective tools for applied plant improvement.

Journal ArticleDOI
TL;DR: The utility of 454 sequencing and MassARRAY genotyping for population genetics in natural populations of the teleost, Fundulus heteroclitus as well as closely related Fundulus species is investigated.
Abstract: Background: By targeting SNPs contained in both coding and non-coding areas of the genome, we are able to identify genetic differences and characterize genome-wide patterns of variation among individuals, populations and species. We investigated the utility of 454 sequencing and MassARRAY genotyping for population genetics in natural populations of the teleost, Fundulus heteroclitus as well as closely related Fundulus species (F. grandis, F. majalis and F. similis). Results: We used 454 pyrosequencing and MassARRAY genotyping technology to identify and type 458 genome-wide SNPs and determine genetic differentiation within and between populations and species of Fundulus. Specifically, pyrosequencing identified 96 putative SNPs across coding and non-coding regions of the F. heteroclitus genome: 88.8% were verified as true SNPs with MassARRAY. Additionally, putative SNPs identified in F. heteroclitus EST sequences were verified in most (86.5%) F. heteroclitus individuals; fewer were genotyped in F. grandis (74.4%), F. majalis (72.9%), and F. similis (60.7%) individuals. SNPs were polymorphic and showed latitudinal clinal variation separating northern and southern populations and established isolation by distance in F. heteroclitus populations. In F. grandis, SNPs were less polymorphic but still established isolation by distance. Markers differentiated species and populations. Conclusions: In total, these approaches were used to quickly determine differences within the Fundulus genome and provide markers for population genetic studies.

Journal ArticleDOI
TL;DR: A new model for the role of SpoIIIE assembly in septal membrane fission is proposed that has strong implications for how the chromosome terminus crosses the septum; the transmembrane domain alone assembles unstable foci that appear to transition between disassembled and assembled oligomeric states.
Abstract: SpoIIIE is an FtsK-related protein that transports the forespore chromosome across the Bacillus subtilis sporulation septum. We use membrane photobleaching and protoplast assays to demonstrate that SpoIIIE is required for septal membrane fission in the presence of trapped DNA, and that DNA is transported across separate daughter cell membranes, suggesting that SpoIIIE forms a channel that partitions the daughter cell membranes. Our results reveal a close correlation between septal membrane fission and the assembly of a stable SpoIIIE translocation complex at the septal midpoint. Time-lapse epifluorescence, total internal reflection fluorescence (TIRF) microscopy, and live-cell photoactivation localization microscopy (PALM) demonstrate that the SpoIIIE transmembrane domain mediates dynamic localization to active division sites, whereas the assembly of a stable focus also requires the cytoplasmic domain. The transmembrane domain fails to completely separate the membrane, and it assembles unstable foci. TIRF microscopy and biophysical modeling of fluorescence recovery after photobleaching (FRAP) data suggest that this unstable protein transitions between disassembled and assembled oligomeric states. We propose a new model for the role of SpoIIIE assembly in septal membrane fission that has strong implications for how the chromosome terminus crosses the septum.

Book ChapterDOI
TL;DR: This chapter formalizes the connection between the hidden fluctuations in the kinetic states that compose a full kinetic cycle and the measured fluctuations in the time to complete that cycle, and shows that properly sorting events by enzymatic outcome provides a first level of constraint on possible kinetic mechanisms.
Abstract: A variety of recent advances in single-molecule methods are now making possible the routine measurement of the distinct catalytic trajectories of individual enzymes. Unlike their bulk counterparts, these measurements directly reveal the fluctuations inherent to enzymatic dynamics, and statistical measures of these fluctuations promise to greatly constrain possible kinetic mechanisms. In this chapter, we discuss a variety of advances, ranging from theoretical to practical, in the new and growing field of statistical kinetics. In particular, we formalize the connection between the hidden fluctuations in the kinetic states that compose a full kinetic cycle and the measured fluctuations in the time to complete this cycle. We then discuss the characterization of fluctuations in a fashion that permits kinetic constraints to be easily extracted. When there are multiple observable enzymatic outcomes, we provide the proper way to sort events so as not to bias the final statistics, and we show that these classifications provide a first level of constraint on possible kinetic mechanisms. Finally, we discuss the basic substrate dependence of an important function of the statistical moments. The new kinetic parameters of this expression, akin to the Michaelis–Menten parameters, provide model-independent constraints on the kinetic mechanism.

Journal Article
TL;DR: Identification of this ADAM9 deletion in crd3-affected dogs establishes this canine disease as orthologous to CORD9 in humans, and offers opportunities for further characterization of the disease process, and potential for genetic therapeutic intervention.
Abstract: Purpose To identify the causative mutation in a canine cone-rod dystrophy (crd3) that segregates as an adult onset disorder in the Glen of Imaal Terrier breed of dog.

Journal ArticleDOI
TL;DR: Here it is shown that there is a single general expression for the substrate dependence of n_min for a wide range of kinetic models, governed by three kinetic parameters which have simple geometric interpretations and provide clear constraints on possible kinetic mechanisms.
Abstract: The time it takes an enzyme to complete its reaction is a stochastic quantity governed by thermal fluctuations. With the advent of high-resolution methods of single-molecule manipulation and detection, it is now possible to observe directly this natural variation in the enzymatic cycle completion time and extract kinetic information from the statistics of its fluctuations. To this end, the inverse of the squared coefficient of variation, which we term n_min, is a useful measure of fluctuations because it places a strict lower limit on the number of kinetic states in the enzymatic mechanism. Here we show that there is a single general expression for the substrate dependence of n_min for a wide range of kinetic models. This expression is governed by three kinetic parameters, which we term N_L, N_S, and α. These parameters have simple geometric interpretations and provide clear constraints on possible kinetic mechanisms. As a demonstration of this analysis, we fit the fluctuations in the dwell times of the packaging motor of the bacteriophage φ29, revealing additional features of the nucleotide loading process in this motor. Because a diverse set of kinetic models display the same substrate dependence for their fluctuations, the expression for this general dependence may prove of use in the characterization and study of the dynamics of a wide range of enzymes.
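
The quantity defined in the abstract, the inverse of the squared coefficient of variation of the cycle completion time, can be computed directly from a list of dwell times as mean² / variance. The sketch below is a minimal illustration and not the authors' analysis code. The simulation helper, its parameters, and the choice of the sample (n - 1) variance are assumptions. It checks the lower-bound property on a toy mechanism of sequential, identical, irreversible steps, for which the statistic should come out close to the number of steps.

```python
import random
import statistics


def n_min(dwell_times):
    """Inverse squared coefficient of variation: mean^2 / variance."""
    mean = statistics.fmean(dwell_times)
    variance = statistics.variance(dwell_times)  # sample variance (n - 1)
    return mean * mean / variance


def simulate_dwell_times(num_steps, rate, num_cycles, seed=0):
    """Toy model: each cycle is num_steps sequential exponential waits."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(rate) for _ in range(num_steps))
        for _ in range(num_cycles)
    ]


if __name__ == "__main__":
    # Hypothetical four-step cycle; values are illustrative, not measured data.
    dwells = simulate_dwell_times(num_steps=4, rate=2.0, num_cycles=20000)
    print(f"estimated n_min = {n_min(dwells):.2f}")
```

For mechanisms whose steps have unequal rates or are reversible, the statistic falls below the true number of states, which is why it serves only as a lower limit.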

Journal ArticleDOI
TL;DR: It is shown that specific epigenetic features in mouse cells correlate with imprinting status in mice, and hundreds of additional genes predicted to be imprinted in the mouse are identified.
Abstract: Approximately 100 mouse genes undergo genomic imprinting, whereby one of the two parental alleles is epigenetically silenced. Imprinted genes influence processes including development, X chromosome inactivation, obesity, schizophrenia, and diabetes, motivating the identification of all imprinted loci. Local sequence features have been used to predict candidate imprinted genes, but rigorous testing using reciprocal crosses validated only three, one of which resided in previously identified imprinting clusters. Here we show that specific epigenetic features in mouse cells correlate with imprinting status in mice, and we identify hundreds of additional genes predicted to be imprinted in the mouse. We used a multitiered approach to validate imprinted expression, including use of a custom single nucleotide polymorphism array and traditional molecular methods. Of 65 candidates subjected to molecular assays for allele-specific expression, we found 10 novel imprinted genes that were maternally expressed in the placenta.

Journal ArticleDOI
TL;DR: A new genotype calling algorithm called ‘ALCHEMY’ is developed based on statistical modeling of the raw intensity data rather than modelless clustering allowing accurate genotyping of both inbred and heterozygous samples even when analyzed simultaneously.
Abstract: Motivation: The development of new high-throughput genotyping products requires a significant investment in testing and training samples to evaluate and optimize the product before it can be used reliably on new samples. One reason for this is current methods for automated calling of genotypes are based on clustering approaches which require a large number of samples to be analyzed simultaneously, or an extensive training dataset to seed clusters. In systems where inbred samples are of primary interest, current clustering approaches perform poorly due to the inability to clearly identify a heterozygote cluster. Results: As part of the development of two custom single nucleotide polymorphism genotyping products for Oryza sativa (domestic rice), we have developed a new genotype calling algorithm called ‘ALCHEMY’ based on statistical modeling of the raw intensity data rather than modelless clustering. A novel feature of the model is the ability to estimate and incorporate inbreeding information on a per sample basis allowing accurate genotyping of both inbred and heterozygous samples even when analyzed simultaneously. Since clustering is not used explicitly, ALCHEMY performs well on small sample sizes with accuracy exceeding 99% with as few as 18 samples.
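
To illustrate why per-sample inbreeding information helps when heterozygotes are rare, the sketch below computes genotype prior probabilities under the standard population-genetic model with inbreeding coefficient F. This is a textbook formula used here as an assumption. The abstract does not specify ALCHEMY's actual statistical model, and the function name and arguments are hypothetical.

```python
def genotype_priors(allele_freq, inbreeding):
    """Prior genotype probabilities for a biallelic SNP.

    allele_freq: frequency p of allele A (0..1).
    inbreeding:  inbreeding coefficient F (0 = random mating, 1 = fully inbred).
    Returns probabilities for genotypes AA, AB and BB (textbook model,
    not necessarily the one used by ALCHEMY).
    """
    p = allele_freq
    q = 1.0 - p
    return {
        "AA": p * p + inbreeding * p * q,
        "AB": 2.0 * p * q * (1.0 - inbreeding),
        "BB": q * q + inbreeding * p * q,
    }


if __name__ == "__main__":
    print(genotype_priors(0.5, 0.0))   # outbred sample
    print(genotype_priors(0.5, 0.95))  # nearly inbred line
```

As F approaches 1 the heterozygote prior shrinks toward zero, so a model that knows a sample is inbred needs no heterozygote cluster in the intensity data to call it correctly.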

Journal ArticleDOI
TL;DR: This work has applied DaDi to human data from Africa, Europe, and East Asia, building the most complex statistically well-characterized model of human migration out of Africa to date.
Abstract: Models of demographic history (population sizes, migration rates, and divergence times) inferred from genetic data complement archeology and serve as null models in genome scans for selection. Most current inference methods are computationally limited to considering simple models or non-recombining data. We introduce a method based on a diffusion approximation to the joint frequency spectrum of genetic variation between populations. Our implementation, DaDi, can model up to three interacting populations and scales well to genome-wide data. We have applied DaDi to human data from Africa, Europe, and East Asia, building the most complex statistically well-characterized model of human migration out of Africa to date.
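
The joint frequency spectrum that the method models is, for two populations, a table whose entry (i, j) counts the sites at which the derived allele appears i times in the first sample and j times in the second. The sketch below builds such a table from per-site derived-allele counts. It is plain Python written for illustration and does not use or reproduce DaDi's own interface. The function and argument names are assumptions.

```python
def joint_sfs(counts_pop1, counts_pop2, n1, n2):
    """Build a two-population joint site frequency spectrum.

    counts_pop1, counts_pop2: derived-allele counts per site in each sample.
    n1, n2: number of sampled chromosomes in each population.
    Returns an (n1 + 1) x (n2 + 1) nested list; entry [i][j] is the number
    of sites with i derived copies in population 1 and j in population 2.
    """
    if len(counts_pop1) != len(counts_pop2):
        raise ValueError("both populations must cover the same sites")
    spectrum = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for c1, c2 in zip(counts_pop1, counts_pop2):
        spectrum[c1][c2] += 1
    return spectrum


if __name__ == "__main__":
    # Four hypothetical sites, two chromosomes sampled per population.
    table = joint_sfs([0, 1, 2, 1], [1, 1, 0, 1], n1=2, n2=2)
    for row in table:
        print(row)
```

Demographic inference then amounts to finding model parameters whose expected table best matches the observed one.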

Journal ArticleDOI
TL;DR: The implications of population structure for the distribution and discovery of disease-causing genetic variants, in the light of the imminent availability of sequencing data for a multitude of diverse human genomes, are discussed.
Abstract: Fine-scale population structure characterizes most continents and is especially pronounced in non-cosmopolitan populations. Roughly half of the world's population remains non-cosmopolitan and even populations within cities often assort along ethnic and linguistic categories. Barriers to random mating can be ecologically extreme, such as the Sahara Desert, or cultural, such as the Indian caste system. In either case, subpopulations accumulate genetic differences if the barrier is maintained over multiple generations. Genome-wide polymorphism data, initially with only a few hundred autosomal microsatellites, have clearly established differences in allele frequency not only among continental regions, but also within continents and within countries. We review recent evidence from the analysis of genome-wide polymorphism data for genetic boundaries delineating human population structure and the main demographic and genomic processes shaping variation, and discuss the implications of population structure for the distribution and discovery of disease-causing genetic variants, in the light of the imminent availability of sequencing data for a multitude of diverse human genomes.

Journal ArticleDOI
01 Jun 2010-Genetics
TL;DR: Using simulations and single-nucleotide polymorphism data collected through direct resequencing and genotyping, this work finds that the current population size and magnitude of recent growth in an ancestral population can be estimated reasonably accurately from the site frequency spectrum (SFS) of samples drawn from an admixed population, under certain conditions.
Abstract: Despite the widespread study of genetic variation in admixed human populations, such as African-Americans, there has not been an evaluation of the effects of recent admixture on patterns of polymorphism or inferences about population demography. These issues are particularly relevant because estimates of the timing and magnitude of population growth in Africa have differed among previous studies, some of which examined African-American individuals. Here we use simulations and single-nucleotide polymorphism (SNP) data collected through direct resequencing and genotyping to investigate these issues. We find that when estimating the current population size and magnitude of recent growth in an ancestral population using the site frequency spectrum (SFS), it is possible to obtain reasonably accurate estimates of the parameters when using samples drawn from the admixed population under certain conditions. We also show that methods for demographic inference that use haplotype patterns are more sensitive to recent admixture than are methods based on the SFS. The analysis of human genetic variation data from the Yoruba people of Ibadan, Nigeria and African-Americans supports the predictions from the simulations. Our results have important implications for the evaluation of previous population genetic studies that have considered African-American individuals as a proxy for individuals from West Africa as well as for future population genetic studies of additional admixed populations.
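The site frequency spectrum mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code; the function names `site_frequency_spectrum` and `fold`, and the 0/1 haplotype encoding (1 marking the derived allele), are assumptions made purely for illustration.

```python
def site_frequency_spectrum(haplotypes):
    """Unfolded SFS from equal-length haplotypes encoded as 0/1 lists.

    Hypothetical encoding: 1 marks the derived allele at a site.
    Returns a list sfs where sfs[k] is the number of sites at which
    exactly k of the n haplotypes carry the derived allele.
    """
    n = len(haplotypes)
    if n == 0:
        return []
    n_sites = len(haplotypes[0])
    sfs = [0] * (n + 1)
    for site in range(n_sites):
        derived_count = sum(h[site] for h in haplotypes)
        sfs[derived_count] += 1
    return sfs


def fold(sfs):
    """Fold an unfolded SFS by minor-allele count.

    Useful when the ancestral state is unknown: classes k and n - k
    are merged into class min(k, n - k).
    """
    n = len(sfs) - 1
    folded = [0] * (n // 2 + 1)
    for k, count in enumerate(sfs):
        folded[min(k, n - k)] += count
    return folded


# Three haplotypes, four sites (toy data, not from the study).
toy = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
]
unfolded = site_frequency_spectrum(toy)
folded = fold(unfolded)
```

SFS-based inference summarizes each site only by its allele count and ignores linkage between sites, whereas haplotype-based methods use the arrangement of alleles along chromosomes.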

Journal ArticleDOI
TL;DR: Modified peptide nucleic acids (PNAs) are shown to be versatile and robust sequence-specific tethers, and are implemented in two model single-molecule experiments in which individual DNA molecules are manipulated via microfluidic flow and optical tweezers, respectively.
Abstract: The ability to strongly and sequence-specifically attach modifications such as fluorophores and haptens to individual double-stranded (ds) DNA molecules is critical to a variety of single-molecule experiments. We propose using modified peptide nucleic acids (PNAs) for this purpose and implement them in two model single-molecule experiments where individual DNA molecules are manipulated via microfluidic flow and optical tweezers, respectively. We demonstrate that PNAs are versatile and robust sequence-specific tethers.

Journal ArticleDOI
01 Oct 2010-Genetics
TL;DR: A genome-wide analysis of mutations that accumulate in MMR-deficient diploid lines of Saccharomyces cerevisiae demonstrates that the mutation pattern seen previously in mismatch repair defective strains using a limited number of reporters holds true for the entire genome.
Abstract: DNA replication errors that escape polymerase proofreading and mismatch repair (MMR) can lead to base substitution and frameshift mutations. Such mutations can disrupt gene function, reduce fitness, and promote diseases such as cancer and are also the raw material of molecular evolution. To analyze with limited bias genomic features associated with DNA polymerase errors, we performed a genome-wide analysis of mutations that accumulate in MMR-deficient diploid lines of Saccharomyces cerevisiae. These lines were derived from a common ancestor and were grown for 160 generations, with bottlenecks reducing the population to one cell every 20 generations. We sequenced to between 8- and 20-fold coverage one wild-type and three mutator lines using Illumina Solexa 36-bp reads. Using an experimentally aware Bayesian genotype caller developed to pool experimental data across sequencing runs for all strains, we detected 28 heterozygous single-nucleotide polymorphisms (SNPs) and 48 single-nt insertion/deletions (indels) from the data set. This method was evaluated on simulated data sets and found to have a very low false-positive rate (∼6 × 10⁻⁵) and a false-negative rate of 0.08 within the unique mapping regions of the genome that contained at least sevenfold coverage. The heterozygous mutations identified by the Bayesian genotype caller were confirmed by Sanger sequencing. All of the mutations were unique to a given line, except for a single-nt deletion mutation which occurred independently in two lines. All 48 indels, composed of 46 deletions and two insertions, occurred in homopolymer (HP) tracts [i.e., 47 poly(A) or (T) tracts, 1 poly(G) or (C) tract] between 5 and 13 bp long. Our findings are of interest because HP tracts are present at high levels in the yeast genome (>77,400 for 5- to 20-nt HP tracts), and frameshift mutations in these regions are likely to disrupt gene function. In addition, they demonstrate that the mutation pattern seen previously in mismatch repair defective strains using a limited number of reporters holds true for the entire genome.
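To make the notion of a homopolymer (HP) tract concrete, here is a small sketch that scans a DNA string for runs of a single nucleotide within a length window. It is an illustration only, not the pipeline used in the study; the function name `homopolymer_tracts` and the example sequence are hypothetical, and the default 5 to 20 nt window simply mirrors the range quoted in the abstract.

```python
import re

# One or more repeats of the same base; each match is a maximal run.
_RUN = re.compile(r"A+|C+|G+|T+")


def homopolymer_tracts(seq, min_len=5, max_len=20):
    """Return (start, base, length) for each single-base run in seq
    whose length lies within [min_len, max_len].

    start is a 0-based index into seq. Only one strand is scanned, so a
    poly(T) run here corresponds to poly(A) on the complementary strand.
    """
    tracts = []
    for match in _RUN.finditer(seq.upper()):
        length = match.end() - match.start()
        if min_len <= length <= max_len:
            tracts.append((match.start(), match.group()[0], length))
    return tracts


# Toy sequence (not real yeast data).
tracts = homopolymer_tracts("GGAAAAATTCCCCCCCG")
```

Because the abstract groups poly(A) with poly(T) and poly(G) with poly(C), a genome-wide count in that style would merge the A/T and G/C classes after scanning.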

Journal ArticleDOI
TL;DR: The external morphology of the egg capsule of Bythaelurus canescens and its fixation to the substratum are described, revealing the presence of longitudinal ridges and long coiled tendrils at both anterior and posterior ends of the capsule.
Abstract: The external morphology of the egg capsule of Bythaelurus canescens and its fixation to the substratum are described. Bythaelurus canescens egg capsules are typically vase-shaped, dorso-ventrally flattened, pale yellow in colour when fresh and covered by 12-15 longitudinal ridges. The anterior border of the capsule is straight, whereas the posterior border is semicircular. Two horns bearing long, coiled tendrils arise from the anterior and posterior ends of the capsule. The presence of longitudinal ridges and long coiled tendrils at both anterior and posterior ends of the capsule readily distinguish these egg capsules from those of other chondrichthyans occurring in the south-east Pacific Ocean.

Journal ArticleDOI
15 Jan 2010-Gene
TL;DR: Platelet function studies and linkage analyses were conducted in a pedigree of German shepherd dogs affected by canine Scott syndrome (CSS); the results provide the basis for fine-mapping studies to narrow the disease interval and target the evaluation of putative disease genes.

Journal ArticleDOI
TL;DR: The model predicts that the NS3 helicase actively unwinds duplex by reducing by more than 50% the free energy that stabilizes base pairing/stacking, which lowers the average unwinding efficiency to less than 1 bp per ATP cycle.

Journal ArticleDOI
23 Dec 2010-Nature
TL;DR: The genome of a female archaic hominin from Denisova Cave in southern Siberia has now been sequenced from DNA extracted from a finger bone and the morphology of a tooth with a mitochondrial genome very similar to that of the finger bone suggests that these hominins are evolutionarily distinct from both Neanderthals and modern humans.
Abstract: Analysis of ancient nuclear DNA, recovered from 40,000-year-old remains in the Denisova Cave, Siberia, hints at the multifaceted interaction of human populations following their migration out of Africa. Anatomically modern humans were in Africa from some point after 200,000 years ago and reached Eurasia rather later. Meanwhile, archaic hominins, including the Neanderthals, had been in Eurasia from at least 230,000 years ago and disappear from the fossil record only about 30,000 years ago. The genome of a female archaic hominin from Denisova Cave in southern Siberia has now been sequenced from DNA extracted from a finger bone. The group to which this 'Denisovan' individual belonged shares a common origin with Neanderthals and, although it was not involved in the putative gene flow from Neanderthals into Eurasians, it contributed 4–6% of the genomes of present-day Melanesians. In addition, the morphology of a tooth with a mitochondrial genome very similar to that of the finger bone suggests that these hominins are evolutionarily distinct from both Neanderthals and modern humans.