Showing papers by "Carlos Bustamante" published in 2012
••
TL;DR: The findings suggest that most human variation is rare, not shared between populations, and that rare variants are likely to play a role in human health, and show that large sample sizes will be required to associate rare variants with complex traits.
Abstract: As a first step toward understanding how rare variants contribute to risk for complex diseases, we sequenced 15,585 human protein-coding genes to an average median depth of 111× in 2440 individuals of European (n = 1351) and African (n = 1088) ancestry. We identified over 500,000 single-nucleotide variants (SNVs), the majority of which were rare (86% with a minor allele frequency less than 0.5%), previously unknown (82%), and population-specific (82%). On average, 2.3% of the 13,595 SNVs each person carried were predicted to affect protein function of ~313 genes per genome, and ~95.7% of SNVs predicted to be functionally important were rare. This excess of rare functional variants is due to the combined effects of explosive, recent accelerated population growth and weak purifying selection. Furthermore, we show that large sample sizes will be required to associate rare variants with complex traits.
1,680 citations
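As a quick consistency check on the per-person figures quoted in the abstract above, the following sketch multiplies the average SNV count by the fraction predicted to affect protein function. Only the two input numbers come from the abstract; the function name is illustrative.

```python
def predicted_functional_snvs(snvs_per_person: float, functional_fraction: float) -> float:
    """Expected number of SNVs per person predicted to affect protein function."""
    return snvs_per_person * functional_fraction


# 13,595 SNVs per person and 2.3% predicted functional, both from the abstract.
count = predicted_functional_snvs(13_595, 0.023)
print(round(count))  # 313
```

The result agrees with the figure of roughly 313 quoted in the abstract.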
••
Saarland University1, University of Tübingen2, University of Kiel3, Life Technologies4, Stanford University5, University of Pavia6, Centre national de la recherche scientifique7, University of Tartu8, Sorenson Molecular Genealogy Foundation9, University of La Laguna10, Heidelberg University11, Ontario Institute for Cancer Research12, Fred Hutchinson Cancer Research Center13, Wrocław Medical University14
TL;DR: The complete genome sequence of the Iceman is reported and 100% concordance between the previously reported mitochondrial genome sequence and the consensus sequence generated from the genomic data is shown.
Abstract: The Tyrolean Iceman, a 5,300-year-old Copper age individual, was discovered in 1991 on the Tisenjoch Pass in the Italian part of the Ötztal Alps. Here we report the complete genome sequence of the Iceman and show 100% concordance between the previously reported mitochondrial genome sequence and the consensus sequence generated from our genomic data. We present indications for recent common ancestry between the Iceman and present-day inhabitants of the Tyrrhenian Sea, that the Iceman probably had brown eyes, belonged to blood group O and was lactose intolerant. His genetic predisposition shows an increased risk for coronary heart disease and may have contributed to the development of previously reported vascular calcifications. Sequences corresponding to ~60% of the genome of Borrelia burgdorferi are indicative of the earliest human case of infection with the pathogen for Lyme borreliosis.
413 citations
••
TL;DR: A kinetic model is introduced to describe blinking and it is shown that Dendra2 photobleaches three times faster and blinks seven times less than mEos2, making Dendra2 a better photoactivated localization microscopy tag than mEos2 for molecular counting.
Abstract: We present a single molecule method for counting proteins within a diffraction-limited area when using photoactivated localization microscopy. The intrinsic blinking of photoactivatable fluorescent proteins mEos2 and Dendra2 leads to an overcounting error, which constitutes a major obstacle for their use as molecular counting tags. Here, we introduce a kinetic model to describe blinking and show that Dendra2 photobleaches three times faster and blinks seven times less than mEos2, making Dendra2 a better photoactivated localization microscopy tag than mEos2 for molecular counting. The simultaneous activation of multiple molecules is another source of error, but it leads to molecular undercounting instead. We propose a photoactivation scheme that maximally separates the activation of different molecules, thus helping to overcome undercounting. We also present a method that quantifies the total counting error and minimizes it by balancing over- and undercounting. This unique method establishes that Dendra2 is better for counting purposes than mEos2, allowing us to count in vitro up to 200 molecules in a diffraction-limited spot with a bias smaller than 2% and an uncertainty less than 6% within 10 min. Finally, we demonstrate that this counting method can be applied to protein quantification in vivo by counting the bacterial flagellar motor protein FliM fused to Dendra2.
363 citations
••
TL;DR: The genomes of seven North African populations reveal an extraordinarily complex history of migrations, involving at least five ancestral populations, into North Africa, and a gradient of likely autochthonous Maghrebi ancestry that increases from east to west across northern Africa.
Abstract: North African populations are distinct from sub-Saharan Africans based on cultural, linguistic, and phenotypic attributes; however, the time and the extent of genetic divergence between populations north and south of the Sahara remain poorly understood. Here, we interrogate the multilayered history of North Africa by characterizing the effect of hypothesized migrations from the Near East, Europe, and sub-Saharan Africa on current genetic diversity. We present dense, genome-wide SNP genotyping array data (730,000 sites) from seven North African populations, spanning from Egypt to Morocco, and one Spanish population. We identify a gradient of likely autochthonous Maghrebi ancestry that increases from east to west across northern Africa; this ancestry is likely derived from “back-to-Africa” gene flow more than 12,000 years ago (ya), prior to the Holocene. The indigenous North African ancestry is more frequent in populations with historical Berber ethnicity. In most North African populations we also see substantial shared ancestry with the Near East, and to a lesser extent sub-Saharan Africa and Europe. To estimate the time of migration from sub-Saharan populations into North Africa, we implement a maximum likelihood dating method based on the distribution of migrant tracts. In order to first identify migrant tracts, we assign local ancestry to haplotypes using a novel, principal component-based analysis of three ancestral populations. We estimate that a migration of western African origin into Morocco began about 40 generations ago (approximately 1,200 ya); a migration of individuals with Nilotic ancestry into Egypt occurred about 25 generations ago (approximately 750 ya). Our genomic data reveal an extraordinarily complex history of migrations, involving at least five ancestral populations, into North Africa.
299 citations
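The dating step described in the abstract above can be illustrated with a deliberately simplified sketch. It assumes a single pulse of admixture and migrant tract lengths (in Morgans) that are exponentially distributed with a rate equal to the number of generations since admixture, so the maximum likelihood estimate is the number of tracts divided by their total length. The paper's actual likelihood is not given in the abstract, the tract lengths below are hypothetical, and the 30 years per generation is the value implied by the abstract's own conversions (40 generations to about 1,200 years, 25 to about 750).

```python
def generations_since_admixture(tract_lengths_morgans):
    """MLE of an exponential rate (simplifying assumption): n / sum of lengths."""
    n = len(tract_lengths_morgans)
    total = sum(tract_lengths_morgans)
    if n == 0 or total <= 0:
        raise ValueError("need at least one tract with positive length")
    return n / total


def generations_to_years(generations, years_per_generation=30.0):
    """Convert generations to years; 30 years/generation is implied by the abstract."""
    return generations * years_per_generation


# Hypothetical tract lengths in Morgans, not data from the paper.
tracts = [0.02, 0.03, 0.025, 0.025]
t_hat = generations_since_admixture(tracts)
print(t_hat, generations_to_years(t_hat))
```

Shorter tracts imply an older event, because recombination has had more generations to break up the migrant haplotypes.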
••
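TL;DR: It is found that all individuals derive at least a few percent of their genomes from admixture with non-Khoisan populations that began ∼1,200 years ago, and that the East African Hadza and Sandawe derive a fraction of their ancestry from a population related to the Khoisan, supporting the hypothesis of an ancient link between southern and eastern Africa.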
Abstract: Southern and eastern African populations that speak non-Bantu languages with click consonants are known to harbour some of the most ancient genetic lineages in humans, but their relationships are poorly understood. Here, we report data from 23 populations analysed at over half a million single-nucleotide polymorphisms, using a genome-wide array designed for studying human history. The southern African Khoisan fall into two genetic groups, loosely corresponding to the northwestern and southeastern Kalahari, which we show separated within the last 30,000 years. We find that all individuals derive at least a few percent of their genomes from admixture with non-Khoisan populations that began ∼1,200 years ago. In addition, the East African Hadza and Sandawe derive a fraction of their ancestry from admixture with a population related to the Khoisan, supporting the hypothesis of an ancient link between southern and eastern Africa.
278 citations
••
University of California, San Francisco1, Case Western Reserve University2, University of Santiago de Compostela3, University of Barcelona4, University of Zulia5, University of Valle6, Industrial University of Santander7, University of Antioquia8, Technological University of Pereira9, University of Michigan10, University of Colorado Boulder11, Syracuse University12, Cayetano Heredia University13, Wake Forest University14, Higher University of San Andrés15, Mexican Social Security Institute16, Hospital General de México17, Johns Hopkins University18, Stanford University19, National Institutes of Health20, Pennsylvania State University21, University of Southern California22, University of Toronto23
TL;DR: A panel of 446 ancestry informative markers (AIMs) optimized to estimate ancestral proportions in individuals and populations throughout Latin America will be useful resources to explore population history of admixture in Latin America and to correct for the potential effects of population stratification in admixed samples in the region.
Abstract: Most individuals throughout the Americas are admixed descendants of Native American, European, and African ancestors. Complex historical factors have resulted in varying proportions of ancestral contributions between individuals within and among ethnic groups. We developed a panel of 446 ancestry informative markers (AIMs) optimized to estimate ancestral proportions in individuals and populations throughout Latin America. We used genome-wide data from 953 individuals from diverse African, European, and Native American populations to select AIMs optimized for each of the three main continental populations that form the basis of modern Latin American populations. We selected markers on the basis of locus-specific branch length to be informative, well distributed throughout the genome, capable of being genotyped on widely available commercial platforms, and applicable throughout the Americas by minimizing within-continent heterogeneity. We then validated the panel in samples from four admixed populations by comparing ancestry estimates based on the AIMs panel to estimates based on genome-wide association study (GWAS) data. The panel provided balanced discriminatory power among the three ancestral populations and accurate estimates of individual ancestry proportions (R² > 0.9 for ancestral components with significant between-subject variance). Finally, we genotyped samples from 18 populations from Latin America using the AIMs panel and estimated variability in ancestry within and between these populations. This panel and its reference genotype information will be useful resources to explore population history of admixture in Latin America and to correct for the potential effects of population stratification in admixed samples in the region.
239 citations
••
TL;DR: Each of these nucleosomal elements controls transcription elongation by distinctly affecting the density and duration of polymerase pauses, thus providing multiple and alternative mechanisms for control of gene expression by chromatin remodeling and transcription factors.
174 citations
••
TL;DR: PCAdmix is presented, a Principal Components-based algorithm for determining ancestry along each chromosome from a high-density, genome-wide set of phased single-nucleotide polymorphism (SNP) genotypes of admixed individuals, and it is found to have better accuracy than LAMP when applied to three-population admixture.
Abstract: Identifying ancestry along each chromosome in admixed individuals provides a wealth of information for understanding the population genetic history of admixture events and is valuable for admixture mapping and identifying recent targets of selection. We present PCAdmix (available at https://sites.google.com/site/pcadmix/home), a Principal Components-based algorithm for determining ancestry along each chromosome from a high-density, genome-wide set of phased single-nucleotide polymorphism (SNP) genotypes of admixed individuals. We compare our method to HAPMIX on simulated data from two ancestral populations, and we find high concordance between the methods. Our method also has better accuracy than LAMP when applied to three-population admixture, a situation as yet unaddressed by HAPMIX. Finally, we apply our method to a data set of four Latino populations with European, African, and Native American ancestry. We find evidence of assortative mating in each of the four populations, and we identify regions of shared ancestry that may be recent targets of selection and could serve as candidate regions for admixture-based association mapping.
173 citations
••
TL;DR: Pilot studies were successfully performed for diversity analysis, QTL mapping, marker-assisted backcrossing, and developing specialized genetic stocks, demonstrating that 384-plex SNP genotyping on the BeadXpress platform is a robust and efficient method for marker genotyping in rice.
Abstract: Multiplexed single nucleotide polymorphism (SNP) markers have the potential to increase the speed and cost-effectiveness of genotyping, provided that an optimal SNP density is used for each application. To test the efficiency of multiplexed SNP genotyping for diversity, mapping and breeding applications in rice (Oryza sativa L.), we designed seven GoldenGate VeraCode oligo pool assay (OPA) sets for the Illumina BeadXpress Reader. Validated markers from existing 1536 Illumina SNPs and 44 K Affymetrix SNP chips developed at Cornell University were used to select subsets of informative SNPs for different germplasm groups with even distribution across the genome. A 96-plex OPA was developed for quality control purposes and for assigning a sample into one of the five O. sativa population subgroups. Six 384-plex OPAs were designed for genetic diversity analysis, DNA fingerprinting, and to have evenly-spaced polymorphic markers for quantitative trait locus (QTL) mapping and background selection for crosses between different germplasm pools in rice: Indica/Indica, Indica/Japonica, Japonica/Japonica, Indica/O. rufipogon, and Japonica/O. rufipogon. After testing on a diverse set of rice varieties, two of the SNP sets were re-designed by replacing poor-performing SNPs. Pilot studies were successfully performed for diversity analysis, QTL mapping, marker-assisted backcrossing, and developing specialized genetic stocks, demonstrating that 384-plex SNP genotyping on the BeadXpress platform is a robust and efficient method for marker genotyping in rice.
151 citations
••
TL;DR: This study reveals the causal variant for a canine QTL contributing to a major morphologic trait and uncovers a missense mutation in BMP3.
Abstract: Since the beginnings of domestication, the craniofacial architecture of the domestic dog has morphed and radiated to human whims. By beginning to define the genetic underpinnings of breed skull shapes, we can elucidate mechanisms of morphological diversification while presenting a framework for understanding human cephalic disorders. Using intrabreed association mapping with museum specimen measurements, we show that skull shape is regulated by at least five quantitative trait loci (QTLs). Our detailed analysis using whole-genome sequencing uncovers a missense mutation in BMP3. Validation studies in zebrafish show that Bmp3 function in cranial development is ancient. Our study reveals the causal variant for a canine QTL contributing to a major morphologic trait.
150 citations
••
TL;DR: It is found that this small cooperatively folded protein shows an anisotropic response to force; the protein is more mechanically resistant to force applied along a longitudinal axis compared to force applied perpendicular to the terminal β strand.
Abstract: Many biological processes generate force, and proteins have evolved to resist and respond to tension along different force axes. Single-molecule force spectroscopy allows for molecular insight into the behavior of proteins under force and the mechanism of protein folding in general. Here, we have used src SH3 to investigate the effect of different pulling axes under the low-force regime afforded by an optical trap. We find that this small cooperatively folded protein shows an anisotropic response to force; the protein is more mechanically resistant to force applied along a longitudinal axis compared to force applied perpendicular to the terminal β strand. In the longitudinal axis, we observe an unusual biphasic behavior revealing a force-induced switch in the unfolding mechanism suggesting the existence of two parallel unfolding pathways. A site-specific variant can selectively affect one of these pathways. Thus, even this simple two-state protein demonstrates a complex mechanical unfolding trajectory, accessing multiple unfolding pathways under the low-force regime of the optical trap; the specific unfolding pathway depends on the perturbation axis and the applied force.
••
TL;DR: This research presents a meta-modelling architecture that automates the labor-intensive, and therefore time-consuming and expensive, process of designing and implementing nanofiltration systems.
Abstract: Nature Biotechnology, volume 30, number 3, March 2012. Liege, Belgium. 31 The Babraham Institute, Cambridge, UK. 32 Genomatix Software GmbH, Munich, Germany. 33 Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland. 34 Christian-Albrechts-Universitaet Zu Kiel, Kiel, Germany. 35 Cellzome AG, Heidelberg, Germany. 36 Institut National de la Sante et de la Recherche Medicale, Marseille, France. 37 Weizmann Institute of Science, Rehovot, Israel. 38 Barcelona Supercomputing Center, Barcelona, Spain. 39 Centro Nacional de Investigaciones Oncologicas, Madrid, Spain. 40 University Medical Centre Groningen, Groningen, The Netherlands. 41 University of Saarland, Saarbruecken, Germany. 42 Oxford Nanopore Technologies Ltd., Oxford, UK. E-mail: h.stunnenberg@ncmls.ru.nl
••
TL;DR: A comprehensive mechanochemical characterization is presented of a homomeric ring ATPase, the bacteriophage φ29 packaging motor, a homopentamer that translocates double-stranded DNA in cycles composed of alternating dwells and bursts; it shows that the motor displays an unexpected division of labor.
••
TL;DR: Type 2 diabetes (T2D) demonstrated extreme directional differentiation of risk allele frequencies across human populations, compared with null distributions of European-frequency matched control genomic alleles and risk alleles for other diseases, and these patterns contribute to disparities in predicted genetic risk.
Abstract: Many disease-susceptible SNPs exhibit significant disparity in ancestral and derived allele frequencies across worldwide populations. While previous studies have examined population differentiation of alleles at specific SNPs, global ethnic patterns of ensembles of disease risk alleles across human diseases are unexamined. To examine these patterns, we manually curated ethnic disease association data from 5,065 papers on human genetic studies representing 1,495 diseases, recording the precise risk alleles and their measured population frequencies and estimated effect sizes. We systematically compared the population frequencies of cross-ethnic risk alleles for each disease across 1,397 individuals from 11 HapMap populations, 1,064 individuals from 53 HGDP populations, and 49 individuals with whole-genome sequences from 10 populations. Type 2 diabetes (T2D) demonstrated extreme directional differentiation of risk allele frequencies across human populations, compared with null distributions of European-frequency matched control genomic alleles and risk alleles for other diseases. Most T2D risk alleles share a consistent pattern of decreasing frequencies along human migration into East Asia. Furthermore, we show that these patterns contribute to disparities in predicted genetic risk across 1,397 HapMap individuals, T2D genetic risk being consistently higher for individuals in the African populations and lower in the Asian populations, irrespective of the ethnicity considered in the initial discovery of risk alleles. We observed a similar pattern in the distribution of T2D Genetic Risk Scores, which are associated with an increased risk of developing diabetes in the Diabetes Prevention Program cohort, for the same individuals. This disparity may be attributable to the promotion of energy storage and usage appropriate to environments and inconsistent energy intake. 
Our results indicate that the differential frequencies of T2D risk alleles may contribute to the observed disparity in T2D incidence rates across ethnic populations.
••
TL;DR: A systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage identifies subtle variations across populations in the proportion of neutral versus deleterious variation and finds that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in.
Abstract: Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas—70% of the European ancestry in today’s African Americans dates back to European gene flow happening only 7–8 generations ago.
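The 1.75-fold range in heterozygosity quoted in the abstract above follows directly from its two extreme values, as this small check shows. The function name is illustrative; both inputs are taken from the abstract.

```python
def fold_range(highest: float, lowest: float) -> float:
    """Ratio between the highest and lowest value of a statistic."""
    return highest / lowest


# Heterozygous sites per kilobase: west African genomes versus segments
# with diploid Native American ancestry (both values from the abstract).
ratio = fold_range(1.0, 0.57)
print(f"{ratio:.2f}")  # 1.75
```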
••
TL;DR: An arginine-to-cysteine change at a highly conserved residue in tyrosinase-related protein 1 (TYRP1) is identified as a major determinant of blond hair in Solomon Islanders and represents a strong common genetic effect on a complex human phenotype.
Abstract: Naturally blond hair is rare in humans and found almost exclusively in Europe and Oceania. Here, we identify an arginine-to-cysteine change at a highly conserved residue in tyrosinase-related protein 1 (TYRP1) as a major determinant of blond hair in Solomon Islanders. This missense mutation is predicted to affect catalytic activity of TYRP1 and causes blond hair through a recessive mode of inheritance. The mutation is at a frequency of 26% in the Solomon Islands, is absent outside of Oceania, represents a strong common genetic effect on a complex human phenotype, and highlights the importance of examining genetic associations worldwide.
••
TL;DR: Results reveal that RNA secondary structures provide a cis-acting mechanism by which sequence modulates transcriptional elongation; the energy barriers extracted from the data correlate well with RNA folding energies obtained from cotranscriptional folding simulations.
Abstract: RNA polymerase pausing represents an important mechanism of transcriptional regulation. In this study, we use a single-molecule transcription assay to investigate the effect of template base-pair composition on pausing by RNA polymerase II and the evolutionarily distinct mitochondrial polymerase Rpo41. For both enzymes, pauses are shorter and less frequent on GC-rich templates. Significantly, incubation with RNase abolishes the template dependence of pausing. A kinetic model, wherein the secondary structure of the nascent RNA poses an energetic barrier to pausing by impeding backtracking along the template, quantitatively predicts the pause densities and durations observed. The energy barriers extracted from the data correlate well with RNA folding energies obtained from cotranscriptional folding simulations. These results reveal that RNA secondary structures provide a cis-acting mechanism by which sequence modulates transcriptional elongation.
••
TL;DR: The approach incorporates stochastic variation due to the evolutionary process and can be fit using standard statistical software, and universally outperforms existing methods for detecting genes subject to selection using polymorphism and divergence data.
Abstract: We present an approach for identifying genes under natural selection using polymorphism and divergence data from synonymous and non-synonymous sites within genes. A generalized linear mixed model is used to model the genome-wide variability among categories of mutations and estimate its functional consequence. We demonstrate how the model's estimated fixed and random effects can be used to identify genes under selection. The parameter estimates from our generalized linear model can be transformed to yield population genetic parameter estimates for quantities including the average selection coefficient for new mutations at a locus, the synonymous and non-synonymous mutation rates, and species divergence times. Furthermore, our approach incorporates stochastic variation due to the evolutionary process and can be fit using standard statistical software. The model is fit in both the empirical Bayes and Bayesian settings using the lme4 package in R, and Markov chain Monte Carlo methods in WinBUGS. Using simulated data we compare our method to existing approaches for detecting genes under selection: the McDonald-Kreitman test, and two versions of the Poisson random field based method MKprf. Overall, we find our method universally outperforms existing methods for detecting genes subject to selection using polymorphism and divergence data.
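For context, the McDonald-Kreitman test used as a baseline in the abstract above compares non-synonymous and synonymous counts among fixed differences (Dn, Ds) and polymorphisms (Pn, Ps) in a 2x2 table. The sketch below is not the paper's generalized linear mixed model; it only illustrates the baseline, using a neutrality index and a two-sided Fisher's exact test written with the standard library. The counts and function names are illustrative assumptions, not values from the paper.

```python
from math import comb


def neutrality_index(dn, ds, pn, ps):
    """(Pn/Ps) / (Dn/Ds); values below 1 suggest excess non-synonymous divergence."""
    return (pn / ps) / (dn / ds)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))


# Illustrative counts only: rows are fixed / polymorphic,
# columns are non-synonymous / synonymous.
dn, ds, pn, ps = 7, 17, 2, 42
print(neutrality_index(dn, ds, pn, ps))
print(fisher_exact_two_sided(dn, ds, pn, ps))
```

A single-gene table like this treats each gene in isolation, which is the limitation that a genome-wide mixed model, as described in the abstract, is meant to address.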
••
TL;DR: The studies suggest that the molten globule state of apomyoglobin is more deformable and the unfolding rate is more sensitive to force, which could have important implications for mechanical processes in the cell.
Abstract: Recently, the role of force in cellular processes has become more evident, and now with advances in force spectroscopy, the response of proteins to force can be directly studied. Such studies have found that native proteins are brittle, and thus not very deformable. Here, we examine the mechanical properties of a class of intermediates referred to as the molten globule state. Using optical trap force spectroscopy, we investigated the response to force of the native and molten globule states of apomyoglobin along different pulling axes. Unlike natively folded proteins, the molten globule state of apomyoglobin is compliant (large distance to the transition state); this large compliance means that the molten globule is more deformable and the unfolding rate is more sensitive to force (the application of force or tension will have a more dramatic effect on the unfolding rate). Our studies suggest that these are general properties of molten globules and could have important implications for mechanical processes in the cell.
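The link drawn in the abstract above between a large distance to the transition state and a force-sensitive unfolding rate can be illustrated with the Bell model, a common phenomenological description in force spectroscopy: k(F) = k0 · exp(F·Δx / kBT). The abstract does not state which model the authors fitted, so this is a sketch under that assumption, and the two Δx values below are hypothetical.

```python
import math

KBT_PN_NM = 4.11  # thermal energy near room temperature, in pN·nm (approximate)


def bell_rate(k0, force_pn, dx_nm, kbt=KBT_PN_NM):
    """Unfolding rate under force: k0 * exp(F * dx / kBT) (Bell model)."""
    return k0 * math.exp(force_pn * dx_nm / kbt)


def fold_acceleration(force_pn, dx_nm):
    """Factor by which force speeds up unfolding relative to zero force."""
    return bell_rate(1.0, force_pn, dx_nm)


# Hypothetical distances to the transition state, for illustration only.
brittle_dx_nm = 0.5    # short distance: rate barely responds to force
compliant_dx_nm = 5.0  # long distance: rate responds strongly to force

for label, dx in (("brittle", brittle_dx_nm), ("compliant", compliant_dx_nm)):
    print(label, fold_acceleration(5.0, dx))
```

Because Δx sits in the exponent, a tenfold larger distance to the transition state turns a modest acceleration at a given force into one that is orders of magnitude larger.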
••
TL;DR: Results are presented that prove the existence of a stretched, base-paired form of DNA; the extension observed in the reversible transition coincides with that produced on DNA by binding of bacterial RecA and human Rad51, pointing to its possible relevance in homologous recombination.
Abstract: Mixed-sequence DNA molecules undergo mechanical overstretching by approximately 70% at 60–70 pN. Since its initial discovery 15 y ago, a debate has arisen as to whether the molecule adopts a new form [Cluzel P, et al. (1996) Science 271:792–794; Smith SB, Cui Y, Bustamante C (1996) Science 271:795–799], or simply denatures under tension [van Mameren J, et al. (2009) Proc Natl Acad Sci USA 106:18231–18236]. Here, we resolve this controversy by using optical tweezers to extend small 60–64 bp single DNA duplex molecules whose base content can be designed at will. We show that when AT content is high (70%), a force-induced denaturation of the DNA helix ensues at 62 pN that is accompanied by an extension of the molecule of approximately 70%. By contrast, GC-rich sequences (60% GC) are found to undergo a reversible overstretching transition into a distinct form that is characterized by a 51% extension and that remains base-paired. For the first time, results proving the existence of a stretched basepaired form of DNA can be presented. The extension observed in the reversible transition coincides with that produced on DNA by binding of bacterial RecA and human Rad51, pointing to its possible relevance in homologous recombination.
••
TL;DR: It is found that positive selection does not appear to be a strong determinant of allele-frequency differentiation among these African populations, and a novel method is used to identify biological functions enriched among populations’ empirical tail genomic windows, such as immune response in agricultural groups.
Abstract: While hundreds of loci have been identified as reflecting strong-positive selection in human populations, connections between candidate loci and specific selective pressures often remain obscure. This study investigates broader patterns of selection in African populations, which are underrepresented despite their potential to offer key insights into human adaptation. We scan for hard selective sweeps using several haplotype and allele-frequency statistics with a data set of nearly 500,000 genome-wide single-nucleotide polymorphisms in 12 highly diverged African populations that span a range of environments and subsistence strategies. We find that positive selection does not appear to be a strong determinant of allele-frequency differentiation among these African populations. Haplotype statistics do identify putatively selected regions that are shared across African populations. However, as assessed by extensive simulations, patterns of haplotype sharing between African populations follow neutral expectations and suggest that tails of the empirical distributions contain false-positive signals. After highlighting several genomic regions where positive selection can be inferred with higher confidence, we use a novel method to identify biological functions enriched among populations’ empirical tail genomic windows, such as immune response in agricultural groups. In general, however, it seems that current methods for selection scans are poorly suited to populations that, like the African populations in this study, are affected by ascertainment bias and have low levels of linkage disequilibrium, possibly old selective sweeps, and potentially reduced phasing accuracy. Additionally, population history can confound the interpretation of selection statistics, suggesting that greater care is needed in attributing broad genetic patterns to human adaptation.
••
TL;DR: It is demonstrated that S1 promotes RNA unwinding by binding to the single-stranded RNA formed transiently during the thermal breathing of the RNA base pairs and that S1 dissociation results in RNA rezipping, and that a multistep scheme greatly expedites S1 unwinding of an RNA structure compared to a single-step mode.
Abstract: The sequence and secondary structure of the 5′-end of mRNAs regulate translation by controlling ribosome initiation on the mRNA. Ribosomal protein S1 is crucial for ribosome initiation on many natural mRNAs, particularly for those with structured 5′-ends, or with no or weak Shine-Dalgarno sequences. Besides a critical role in translation, S1 has been implicated in several other cellular processes, such as transcription recycling, and the rescuing of stalled ribosomes by tmRNA. The mechanisms of S1 functions are still elusive but have been widely considered to be linked to the affinity of S1 for single-stranded RNA and its corresponding destabilization of mRNA secondary structures. Here, using optical tweezers techniques, we demonstrate that S1 promotes RNA unwinding by binding to the single-stranded RNA formed transiently during the thermal breathing of the RNA base pairs and that S1 dissociation results in RNA rezipping. We measured the dependence of the RNA unwinding and rezipping rates on S1 concentration, and the force applied to the ends of the RNA. We found that each S1 binds 10 nucleotides of RNA in a multistep fashion implying that S1 can facilitate ribosome initiation on structured mRNA by first binding to the single strand next to an RNA duplex structure (“stand-by site”) before subsequent binding leads to RNA unwinding. Unwinding by multiple small substeps is much less rate limited by thermal breathing than unwinding in a single step. Thus, a multistep scheme greatly expedites S1 unwinding of an RNA structure compared to a single-step mode.
••
TL;DR: It is found that North African populations have a significant excess of derived alleles shared with Neandertals, when compared to sub-Saharan Africans, a fact that can be interpreted as a sign of Neandertal admixture.
Abstract: One of the main findings derived from the analysis of the Neandertal genome was the evidence for admixture between Neandertals and non-African modern humans. An alternative scenario is that the ancestral population of non-Africans was closer to Neandertals than to Africans because of ancient population substructure. Thus, the study of North African populations is crucial for testing both hypotheses. We analyzed a total of 780,000 SNPs in 125 individuals representing seven different North African locations and searched for their ancestral/derived state in comparison to different human populations and Neandertals. We found that North African populations have a significant excess of derived alleles shared with Neandertals, when compared to sub-Saharan Africans. This excess is similar to that found in non-African humans, a fact that can be interpreted as a sign of Neandertal admixture. Furthermore, the Neandertal's genetic signal is higher in populations with a local, pre-Neolithic North African ancestry. Therefore, the detected ancient admixture is not due to recent Near Eastern or European migrations. Sub-Saharan populations are the only ones not affected by the admixture event with Neandertals.
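The idea of counting derived alleles shared with Neandertals can be illustrated with a minimal Python sketch. This is not the statistic used in the study; the function name `shared_derived_fraction`, the one-allele-per-site string encoding, and the toy sequences are all assumptions made for illustration. It simply reports, among sites where the Neandertal allele differs from the ancestral allele, the fraction at which a sample carries the same derived allele.

```python
from typing import Sequence


def shared_derived_fraction(
    sample: Sequence[str],
    neandertal: Sequence[str],
    ancestral: Sequence[str],
) -> float:
    """Fraction of Neandertal-derived sites where the sample shares the derived allele.

    Hypothetical encoding: one allele (a single character) per site.
    A site is informative only if the Neandertal allele differs from
    the ancestral allele, i.e. the Neandertal carries a derived allele.
    """
    if not (len(sample) == len(neandertal) == len(ancestral)):
        raise ValueError("all sequences must cover the same sites")

    informative = 0
    shared = 0
    for s, n, a in zip(sample, neandertal, ancestral):
        if n == a:
            continue  # Neandertal is ancestral here: site is not informative
        informative += 1
        if s == n:
            shared += 1  # sample carries the same derived allele
    return shared / informative if informative else 0.0


# Toy data (invented): the Neandertal is derived at sites 0 and 2.
ancestral_alleles = "AAAA"
neandertal_alleles = "GAGA"
sample_alleles = "GAAA"
fraction = shared_derived_fraction(sample_alleles, neandertal_alleles, ancestral_alleles)
```

Comparing this fraction between two groups (for example, a North African sample and a sub-Saharan sample) conveys the sense of "excess sharing", although a real analysis would need population allele frequencies and a formal significance test.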
••
TL;DR: Simulations show that SupportMix is not only more accurate than other popular admixture discovery tools but is the first admixture inference method that can efficiently scale for simultaneous analysis of 50-100 putative ancestral populations while being independent of prior demographic information.
Abstract: Populations of the Arabian Peninsula have a complex genetic structure that reflects waves of migrations including the earliest human migrations from Africa and eastern Asia, migrations along ancient civilization trading routes and colonization history of recent centuries. Here, we present a study of genome-wide admixture in this region, using 156 genotyped individuals from Qatar, a country located at the crossroads of these migration patterns. Since haplotypes of these individuals could have originated from many different populations across the world, we have developed a machine learning method "SupportMix" to infer loci-specific genomic ancestry when simultaneously analyzing many possible ancestral populations. Simulations show that SupportMix is not only more accurate than other popular admixture discovery tools but is the first admixture inference method that can efficiently scale for simultaneous analysis of 50-100 putative ancestral populations while being independent of prior demographic information. By simultaneously using the 55 world populations from the Human Genome Diversity Panel, SupportMix was able to extract the fine-scale ancestry of the Qatar population, providing many new observations concerning the ancestry of the region. For example, as well as recapitulating the three major sub-populations in Qatar, composed of mainly Arabic, Persian, and African ancestry, SupportMix additionally identifies the specific ancestry of the Persian group to populations sampled in Greater Persia rather than from China and the ancestry of the African group to sub-Saharan origin and not Southern African Bantu origin as previously thought.
••
TL;DR: This study is compatible with the history of North African Jews—founding during Classical Antiquity with proselytism of local populations, followed by genetic isolation with the rise of Christianity and then Islam, and admixture following the emigration of Sephardic Jews during the Inquisition.
Abstract: North African Jews constitute the second largest Jewish Diaspora group. However, their relatedness to each other; to European, Middle Eastern, and other Jewish Diaspora groups; and to their former North African non-Jewish neighbors has not been well defined. Here, genome-wide analysis of five North African Jewish groups (Moroccan, Algerian, Tunisian, Djerban, and Libyan) and comparison with other Jewish and non-Jewish groups demonstrated distinctive North African Jewish population clusters with proximity to other Jewish populations and variable degrees of Middle Eastern, European, and North African admixture. Two major subgroups were identified by principal component, neighbor joining tree, and identity-by-descent analysis—Moroccan/Algerian and Djerban/Libyan—that varied in their degree of European admixture. These populations showed a high degree of endogamy and were part of a larger Ashkenazi and Sephardic Jewish group. By principal component analysis, these North African groups were orthogonal to contemporary populations from North and South Morocco, Western Sahara, Tunisia, Libya, and Egypt. Thus, this study is compatible with the history of North African Jews—founding during Classical Antiquity with proselytism of local populations, followed by genetic isolation with the rise of Christianity and then Islam, and admixture following the emigration of Sephardic Jews during the Inquisition.
••
TL;DR: It is demonstrated that the commonly used technique of force feedback has severe limitations when used to evaluate rapid macromolecular conformational transitions, and the causes are elucidated and a simple test is provided to identify and evaluate the magnitude of the effect.
••
TL;DR: This work uses high-coverage whole-genome sequencing of a conditional mismatch repair mutant line of diploid yeast to identify mutations that accumulated after 160 generations of growth and suggests that specific mutation hot spots can contribute disproportionately to the genetic variation that is introduced into populations.
01 Jan 2012
TL;DR: The design of an ongoing randomized trial is described to investigate whether CVD risk factor profiles can be improved by providing participants with knowledge related to their inherited risk of CVD in addition to information on their risk related to measured TRFs.
Abstract: Background: Genome-wide association studies (GWAS) have identified more than 1500 disease-associated single nucleotide polymorphisms (SNPs), including many related to atherosclerotic cardiovascular disease (CVD). Associations have been found for most traditional risk factors (TRFs), including lipids,1,2 blood pressure/hypertension,3,4 weight/body mass index,5,6 smoking behavior,7 and diabetes.8–13 GWAS have also identified susceptibility variants for coronary heart disease (CHD). The first and, so far, strongest of these signals was found in the 9p21.3 locus, where common variants in this region increase the relative risk of CVD by 15% to 30% per risk allele in most race/ethnic groups.13–20 Subsequent large-scale GWAS meta-analyses and replication studies in largely white/European populations have led to the reliable identification of an additional 26 loci conferring susceptibility to CHD,2,20–23 all with substantially lower effect sizes compared with the 9p21 locus. Many of these CVD susceptibility loci appear to be conferring risk independent of TRFs and thus cannot currently be assessed by surrogate clinical measures (Table 1). Among the 27 independent loci identified in the most recent large meta-analyses of CVD, 21 were reported not to be associated with any of the TRFs.20,21 Several studies have explored whether initial CVD-related genetic markers can improve risk prediction over standard models restricted to TRFs using a genetic risk score (GRS) constructed on the basis of the number of risk alleles inherited.24–26 Results to date have been mixed. Although all have shown that a GRS is strongly associated with the outcome of interest independent of TRFs, none were able to demonstrate a significant improvement in the c-statistic. 
Two of the 3 studies showed some modest improvement in newly defined discrimination indices, including the integrated discrimination index, the net reclassification index, and the clinical net reclassification index (net reclassification index in the intermediate-risk subjects). Thus, the use of these markers has not yet been shown to convincingly outperform models that include TRFs and family history alone. One important reason for the failure of these markers to demonstrate clinically meaningful improvement of risk prediction relates to the small proportion of the genetic variance explained by these markers, a phenomenon commonly referred to as the heritability gap. The basis for this heritability gap is the focus of intense investigation. Despite this gap, it is still possible that knowledge of genetic risk may improve patient outcomes through means other than enhanced risk reclassification. For instance, genetic testing may improve patient adherence and CVD risk factor reduction for Mendelian disorders related to CHD, such as familial hypercholesterolemia.27 This effect may be owing to an increase in patient motivation (eg, people who recognize and accept their high risk are more encouraged to reduce it); however, no clinical trial to date has demonstrated that newly discovered genetic markers improve risk factor profiles by improving adherence to prescribed therapy for complex (garden variety) CVD. Here, we describe the design of an ongoing randomized trial to investigate whether CVD risk factor profiles can be improved by providing participants with knowledge related to their inherited risk of CVD in addition to information on their risk related to measured TRFs. We also discuss some of the challenges that arise in the design and conduct of such a trial and how they were addressed.
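A genetic risk score "constructed on the basis of the number of risk alleles inherited" can be illustrated with a minimal Python sketch. This is an unweighted allele count under stated assumptions, not the scoring scheme of the trial or of the cited studies; the function name `genetic_risk_score` and the SNP labels are hypothetical.

```python
from typing import Mapping


def genetic_risk_score(risk_allele_counts: Mapping[str, int]) -> int:
    """Unweighted GRS: total number of risk alleles carried across SNPs.

    Each value is the number of copies of the risk allele at one SNP,
    so it must be 0, 1, or 2 for a diploid genotype.
    """
    for snp, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: risk-allele count must be 0, 1, or 2, got {count}")
    return sum(risk_allele_counts.values())


# Hypothetical participant genotyped at three illustrative SNPs.
participant = {"snp_a": 2, "snp_b": 0, "snp_c": 1}
score = genetic_risk_score(participant)
```

Published scores often weight each SNP by its estimated effect size instead of counting alleles equally; the plain count is shown here only because it matches the wording of the abstract.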
••
TL;DR: In this paper, the authors explored whether initial CVD-related genetic markers can improve risk prediction over standard models restricted to TRFs using a genetic risk score (GRS) constructed on the basis of the number of risk alleles inherited.
Abstract: Genome-wide association studies (GWAS) have identified more than 1500 disease-associated single nucleotide polymorphisms (SNPs), including many related to atherosclerotic cardiovascular disease (CVD). Associations have been found for most traditional risk factors (TRFs), including lipids,1,2 blood pressure/hypertension,3,4 weight/body mass index,5,6 smoking behavior,7 and diabetes.8–13 GWAS have also identified susceptibility variants for coronary heart disease (CHD). The first and, so far, strongest of these signals was found in the 9p21.3 locus, where common variants in this region increase the relative risk of CVD by 15% to 30% per risk allele in most race/ethnic groups.13–20 Subsequent large-scale GWAS meta-analyses and replication studies in largely white/European populations have led to the reliable identification of an additional 26 loci conferring susceptibility to CHD,2,20–23 all with substantially lower effect sizes compared with the 9p21 locus. Many of these CVD susceptibility loci appear to be conferring risk independent of TRFs and thus cannot currently be assessed by surrogate clinical measures (Table 1). Among the 27 independent loci identified in the most recent large meta-analyses of CVD, 21 were reported not to be associated with any of the TRFs.20,21
Table 1. SNPs Related to CVD That Are Independent of Traditional Risk Factors
Several studies have explored whether initial CVD-related genetic markers can improve risk prediction over standard models restricted to TRFs using a genetic risk score (GRS) constructed on the basis of the number of risk alleles inherited.24–26 Results to date have been mixed. Although all have shown that a GRS is strongly associated with the outcome of interest independent of TRFs, none were able to demonstrate a significant improvement in the c-statistic. Two of the 3 studies showed some modest improvement in …
••
TL;DR: This is the first report on the egg-laying rate of Z. chilensis and the first description of egg capsules of Z. chilensis and D. trachyderma from the south-eastern Pacific Ocean.
Abstract: Egg capsules of Zearaja chilensis were obtained from individuals kept in captivity and from dead specimens captured in Valparaiso Bay, central Chile. One female under laboratory conditions deposited three pairs of egg capsules in 6 days. The egg capsules of Dipturus trachyderma were obtained from a female captured in Valdivia, south Chile. Fresh egg capsules of both species were golden-brown and thick walled. Size of egg capsules of Z. chilensis ranged from 94 to 144 mm in capsule length and 64 to 76 mm in capsule width. Those of D. trachyderma ranged between 197 and 199 mm in capsule length and 110.0 and 129.0 mm in capsule width. Range and mean values of capsule length and width of egg capsules described in this study were smaller than those reported for the same species in the south-western Atlantic Ocean. This is the first report on egg-laying rate of Z. chilensis and the first description of egg capsules of Z. chilensis and D. trachyderma from the south-eastern Pacific Ocean.