
Showing papers by "Carlos Bustamante" published in 2005


Journal Article
TL;DR: This work shows that a new parametric test, based on composite likelihood, has a high power to detect selective sweeps and is surprisingly robust to assumptions regarding recombination rates and demography (i.e., has low Type I error).
Abstract: Detecting selective sweeps from genomic SNP data is complicated by the intricate ascertainment schemes used to discover SNPs, and by the confounding influence of the underlying complex demographics and varying mutation and recombination rates. Current methods for detecting selective sweeps have little or no robustness to the demographic assumptions and varying recombination rates, and provide no method for correcting for ascertainment biases. Here, we present several new tests aimed at detecting selective sweeps from genomic SNP data. Using extensive simulations, we show that a new parametric test, based on composite likelihood, has a high power to detect selective sweeps and is surprisingly robust to assumptions regarding recombination rates and demography (i.e., has low Type I error). Our new test also provides estimates of the location of the selective sweep(s) and the magnitude of the selection coefficient. To illustrate the method, we apply our approach to data from the Seattle SNP project and to Chromosome 2 data from the HapMap project. In Chromosome 2, the most extreme signal is found in the lactase gene, which previously has been shown to be undergoing positive selection. Evidence for selective sweeps is also found in many other regions, including genes known to be associated with disease risk such as DPP10 and COL4A3.

950 citations


Journal Article
08 Sep 2005-Nature
TL;DR: It is shown that the Crooks fluctuation theorem can be used to determine folding free energies for folding and unfolding processes occurring in weak as well as strong nonequilibrium regimes, thereby providing a test of its validity under such conditions.
Abstract: Atomic force microscopes and optical tweezers are widely used to probe the mechanical properties of individual molecules and molecular interactions, by exerting mechanical forces that induce transitions such as unfolding or dissociation. These transitions often occur under nonequilibrium conditions and are associated with hysteresis effects, features usually taken to preclude the extraction of equilibrium information from the experimental data. But fluctuation theorems allow us to relate the work along nonequilibrium trajectories to thermodynamic free-energy differences. They have been shown to be applicable to single-molecule force measurements and have already provided information on the folding free energy of an RNA hairpin. Here we show that the Crooks fluctuation theorem can be used to determine folding free energies for folding and unfolding processes occurring in weak as well as strong nonequilibrium regimes, thereby providing a test of its validity under such conditions. We use optical tweezers to measure repeatedly the mechanical work associated with the unfolding and refolding of a small RNA hairpin and an RNA three-helix junction. The resultant work distributions are then analysed according to the theorem and allow us to determine the difference in folding free energy between an RNA molecule and a mutant differing only by one base pair, and the thermodynamic stabilizing effect of magnesium ions on the RNA structure.
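A rough illustration of how the Crooks relation yields a free energy: the theorem states that P_F(W) / P_R(-W) = exp((W - ΔG) / k_B T), so the unfolding work distribution and the sign-flipped refolding work distribution cross at W = ΔG. The sketch below assumes both distributions are approximately Gaussian with equal variance, in which case the crossing point is the midpoint of the two means. The function name and the sample values are hypothetical, and this is not the analysis pipeline used in the paper.

```python
from statistics import mean


def estimate_delta_g(unfolding_work, refolding_work):
    """Estimate dG as the crossing point of P_F(W) and P_R(-W).

    Assumption (not from the paper): both work distributions are Gaussian
    with equal variance, so the crossing lies midway between their means.
    `refolding_work` holds the work of the reverse process, so its sign is
    flipped before comparison. The result has the units of the inputs.
    """
    forward_mean = mean(unfolding_work)
    reverse_mean = mean([-w for w in refolding_work])
    return 0.5 * (forward_mean + reverse_mean)


# Hypothetical work values in units of k_B T, for illustration only.
unfolding = [112.0, 114.5, 113.0, 115.5]
refolding = [-108.0, -106.5, -109.0, -107.5]
print(estimate_delta_g(unfolding, refolding))
```

When the two variances differ, or the distributions are clearly non-Gaussian, the crossing point would have to be located from the histograms directly or with a more careful estimator.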

938 citations


Journal Article
TL;DR: This work compares 13,731 annotated genes from humans to their chimpanzee orthologs to identify genes that show evidence of positive selection, and hypothesizes that positive selection in some of these genes may be driven by genomic conflict due to apoptosis during spermatogenesis.
Abstract: Since the divergence of humans and chimpanzees about 5 million years ago, these species have undergone a remarkable evolution with drastic divergence in anatomy and cognitive abilities. At the molecular level, despite the small overall magnitude of DNA sequence divergence, we might expect such evolutionary changes to leave a noticeable signature throughout the genome. We here compare 13,731 annotated genes from humans to their chimpanzee orthologs to identify genes that show evidence of positive selection. Many of the genes that present a signature of positive selection tend to be involved in sensory perception or immune defenses. However, the group of genes that show the strongest evidence for positive selection also includes a surprising number of genes involved in tumor suppression and apoptosis, and of genes involved in spermatogenesis. We hypothesize that positive selection in some of these genes may be driven by genomic conflict due to apoptosis during spermatogenesis. Genes with maximal expression in the brain show little or no evidence for positive selection, while genes with maximal expression in the testis tend to be enriched with positively selected genes. Genes on the X chromosome also tend to show an elevated tendency for positive selection. We also present polymorphism data from 20 Caucasian Americans and 19 African Americans for the 50 annotated genes showing the strongest evidence for positive selection. The polymorphism analysis further supports the presence of positive selection in these genes by showing an excess of high-frequency derived nonsynonymous mutations.

936 citations


Journal Article
20 Oct 2005-Nature
TL;DR: Contrasting coding sequence polymorphism in 39 humans for over 11,000 genes with divergence between humans and chimpanzees, this work finds strong evidence that natural selection has shaped the recent molecular evolution of our species.
Abstract: Comparisons of DNA polymorphism within species to divergence between species enable the discovery of molecular adaptation in evolutionarily constrained genes as well as the differentiation of weak from strong purifying selection. The extent to which weak negative and positive darwinian selection have driven the molecular evolution of different species varies greatly, with some species, such as Drosophila melanogaster, showing strong evidence of pervasive positive selection, and others, such as the selfing weed Arabidopsis thaliana, showing an excess of deleterious variation within local populations. Here we contrast patterns of coding sequence polymorphism identified by direct sequencing of 39 humans for over 11,000 genes to divergence between humans and chimpanzees, and find strong evidence that natural selection has shaped the recent molecular evolution of our species. Our analysis discovered 304 (9.0%) out of 3,377 potentially informative loci showing evidence of rapid amino acid evolution. Furthermore, 813 (13.5%) out of 6,033 potentially informative loci show a paucity of amino acid differences between humans and chimpanzees, indicating weak negative selection and/or balancing selection operating on mutations at these loci. We find that the distribution of negatively and positively selected genes varies greatly among biological processes and molecular functions, and that some classes, such as transcription factors, show an excess of rapidly evolving genes, whereas others, such as cytoskeletal proteins, show an excess of genes with extensive amino acid polymorphism within humans and yet little amino acid divergence between humans and chimpanzees.
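The polymorphism-versus-divergence contrast described above can be illustrated with the classic McDonald-Kreitman 2x2 table. The abstract does not spell out the paper's own statistical machinery, so the sketch below is only the textbook neutrality index, with hypothetical counts and a hypothetical function name.

```python
def neutrality_index(dn, ds, pn, ps):
    """Neutrality index NI = (Pn / Ps) / (Dn / Ds) for one locus.

    dn, ds: nonsynonymous and synonymous fixed differences between species.
    pn, ps: nonsynonymous and synonymous polymorphisms within a species.
    NI < 1 indicates an excess of amino acid divergence (rapid amino acid
    evolution); NI > 1 indicates an excess of amino acid polymorphism
    relative to divergence.
    """
    if ps == 0 or dn == 0:
        raise ValueError("NI is undefined when Ps or Dn is zero")
    return (pn * ds) / (ps * dn)


# Hypothetical counts for a single gene, for illustration only.
ni = neutrality_index(dn=7, ds=17, pn=2, ps=42)
print(f"NI = {ni:.3f}")
```

Loci with zero counts in a cell are uninformative for this ratio, which is one reason only a subset of genes counts as "potentially informative" in analyses of this kind.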

781 citations


Journal Article
TL;DR: A new mutation in MSTN found in the whippet dog breed that results in a double-muscled phenotype known as the “bully” whippet is described, marking the first time a mutation in the myostatin gene has been quantitatively linked to increased athletic performance.
Abstract: Double muscling is a trait previously described in several mammalian species including cattle and sheep and is caused by mutations in the myostatin (MSTN) gene (previously referred to as GDF8). Here we describe a new mutation in MSTN found in the whippet dog breed that results in a double-muscled phenotype known as the “bully” whippet. Individuals with this phenotype carry two copies of a two-base-pair deletion in the third exon of MSTN leading to a premature stop codon at amino acid 313. Individuals carrying only one copy of the mutation are, on average, more muscular than wild-type individuals (p = 7.43 × 10^-6; Kruskal-Wallis Test) and are significantly faster than individuals carrying the wild-type genotype in competitive racing events (Kendall's nonparametric measure, τ = 0.3619; p ≈ 0.00028). These results highlight the utility of performance-enhancing polymorphisms, marking the first time a mutation in MSTN has been quantitatively linked to increased athletic performance.

738 citations


Journal Article
TL;DR: These results explain how remodeling factors can be recruited to particular nucleosomes on a biologically relevant timescale, and imply that the major impediment to entry of RNA polymerase into a nucleosome is rewrapping of nucleosomal DNA, not unwrapping.
Abstract: DNA wrapped in nucleosomes is sterically occluded, creating obstacles for proteins that must bind it. How proteins gain access to DNA buried inside nucleosomes is not known. Here we report measurements of the rates of spontaneous nucleosome conformational changes in which a stretch of DNA transiently unwraps off the histone surface, starting from one end of the nucleosome, and then rewraps. The rates are rapid. Nucleosomal DNA remains fully wrapped for only approximately 250 ms before spontaneously unwrapping; unwrapped DNA rewraps within approximately 10-50 ms. Spontaneous unwrapping of nucleosomal DNA allows any protein rapid access even to buried stretches of the DNA. Our results explain how remodeling factors can be recruited to particular nucleosomes on a biologically relevant timescale, and they imply that the major impediment to entry of RNA polymerase into a nucleosome is rewrapping of nucleosomal DNA, not unwrapping.
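To put the quoted dwell times in perspective, a simple two-state (wrapped/unwrapped) picture gives the fraction of time a nucleosome end spends unwrapped. The two-state simplification and the function name are assumptions made for illustration; the input numbers are the approximate lifetimes quoted in the abstract.

```python
def fraction_unwrapped(wrapped_lifetime_ms, unwrapped_lifetime_ms):
    """Equilibrium fraction of time spent unwrapped in a two-state model,
    given the mean dwell time in each state."""
    return unwrapped_lifetime_ms / (wrapped_lifetime_ms + unwrapped_lifetime_ms)


# About 250 ms fully wrapped; rewrapping within about 10-50 ms.
for t_unwrapped in (10.0, 50.0):
    f = fraction_unwrapped(250.0, t_unwrapped)
    print(f"rewrap time {t_unwrapped:.0f} ms: unwrapped about {100 * f:.1f}% of the time")
```

Under this simplification the DNA end is unwrapped for roughly 4% to 17% of the time, which is consistent with the abstract's point that buried sites become accessible quickly.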

657 citations


Journal Article
23 Sep 2005-Science
TL;DR: Force-measuring optical tweezers were used to induce complete mechanical unfolding and refolding of individual Escherichia coli ribonuclease H (RNase H) molecules to map the energy landscape of RNase H.
Abstract: We used force-measuring optical tweezers to induce complete mechanical unfolding and refolding of individual Escherichia coli ribonuclease H (RNase H) molecules. The protein unfolds in a two-state manner and refolds through an intermediate that correlates with the transient molten globule–like intermediate observed in bulk studies. This intermediate displays unusual mechanical compliance and unfolds at substantially lower forces than the native state. In a narrow range of forces, the molecule hops between the unfolded and intermediate states in real time. Occasionally, hopping was observed to stop as the molecule crossed the folding barrier directly from the intermediate, demonstrating that the intermediate is on-pathway. These studies allow us to map the energy landscape of RNase H.

608 citations


Journal Article
TL;DR: The interactions of tiny objects with their environments are dominated by thermal fluctuations; guided by theory and assisted by new micromanipulation tools, scientists have begun to study such interactions in detail.
Abstract: The interactions of tiny objects with their environments are dominated by thermal fluctuations. Guided by theory and assisted by new micromanipulation tools, scientists have begun to study such interactions in detail.

605 citations


Journal Article
TL;DR: An ascertainment correction is performed and it is shown how the post-correction data are more consistent across these studies, suggesting that the heterogeneity in the SNP discovery process of the HapMap project resulted in a data set resistant to complete ascertainment correction.
Abstract: Large-scale SNP genotyping studies rely on an initial assessment of nucleotide variation to identify sites in the DNA sequence that harbor variation among individuals. This "SNP discovery" sample may be quite variable in size and composition, and it has been well established that properties of the SNPs that are found are influenced by the discovery sampling effort. The International HapMap project relied on nearly any piece of information available to identify SNPs (including BAC end sequences, shotgun reads, and differences between public and private sequences) and even made use of chimpanzee data to confirm human sequence differences. In addition, the ascertainment criteria shifted from using only SNPs that had been validated in population samples, to double-hit SNPs, to finally accepting SNPs that were singletons in small discovery samples. In contrast, Perlegen's primary discovery was a resequencing-by-hybridization effort using the 24 people of diverse origin in the Polymorphism Discovery Resource. Here we take these two data sets and contrast two basic summary statistics, heterozygosity and F_ST, as well as the site frequency spectra, for 500-kb windows spanning the genome. The magnitude of disparity between these samples in these measures of variability indicates that population genetic analysis on the raw genotype data is ill advised. Given the knowledge of the discovery samples, we perform an ascertainment correction and show how the post-correction data are more consistent across these studies. However, discrepancies persist, suggesting that the heterogeneity in the SNP discovery process of the HapMap project resulted in a data set resistant to complete ascertainment correction. Ascertainment bias will likely erode the power of tests of association between SNPs and complex disorders, but the effect will likely be small, and perhaps more importantly, it is unlikely that the bias will introduce false-positive inferences.
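For readers unfamiliar with the two summary statistics named above, here is a minimal sketch for a single biallelic SNP in two equally weighted populations. It uses the textbook definitions (expected heterozygosity 2p(1 - p) and F_ST = (H_T - H_S) / H_T) with no sample-size or ascertainment correction; the paper's actual estimators are not given in the abstract, and the function names and frequencies are hypothetical.

```python
def heterozygosity(p):
    """Expected heterozygosity of a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1, p2):
    """F_ST = (H_T - H_S) / H_T for one SNP, two equally weighted populations.

    H_S is the mean within-population heterozygosity and H_T is the
    heterozygosity at the pooled (average) allele frequency.
    """
    h_s = 0.5 * (heterozygosity(p1) + heterozygosity(p2))
    h_t = heterozygosity(0.5 * (p1 + p2))
    if h_t == 0.0:
        return 0.0  # site is monomorphic in the pooled sample
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies, for illustration only.
print(heterozygosity(0.2))
print(fst_two_populations(0.2, 0.8))
```

Because SNPs discovered in small samples are biased toward intermediate frequencies, both quantities shift when computed on raw genotype data, which is the disparity the abstract warns about.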

482 citations


Journal Article
TL;DR: It is found that recent adaptation is strikingly pervasive in the human genome, with as much as 10% of the genome affected by linkage to a selective sweep.
Abstract: Identifying genomic locations that have experienced selective sweeps is an important first step toward understanding the molecular basis of adaptive evolution. Using statistical methods that account for the confounding effects of population demography, recombination rate variation, and single-nucleotide polymorphism ascertainment, while also providing fine-scale estimates of the position of the selected site, we analyzed a genomic dataset of 1.2 million human single-nucleotide polymorphisms genotyped in African-American, European-American, and Chinese samples. We identify 101 regions of the human genome with very strong evidence (p < 10^-5) of a recent selective sweep and where our estimate of the position of the selective sweep falls within 100 kb of a known gene. Within these regions, genes of biological interest include genes in pigmentation pathways, components of the dystrophin protein complex, clusters of olfactory receptors, genes involved in nervous system development and function, immune system genes, and heat shock genes. We also observe consistent evidence of selective sweeps in centromeric regions. In general, we find that recent adaptation is strikingly pervasive in the human genome, with as much as 10% of the genome affected by linkage to a selective sweep.

477 citations


Journal Article
TL;DR: It is found that a simple bottleneck model cannot explain the derived nucleotide polymorphism site frequency spectrum in rice, and a bottleneck model that incorporates selective sweeps, or a more complex demographic model that includes subdivision and gene flow are more plausible explanations for patterns of variation in domesticated rice varieties.
Abstract: Domesticated Asian rice (Oryza sativa) is one of the oldest domesticated crop species in the world, having fed more people than any other plant in human history. We report the patterns of DNA sequence variation in rice and its wild ancestor, O. rufipogon, across 111 randomly chosen gene fragments, and use these to infer the evolutionary dynamics that led to the origins of rice. There is a genome-wide excess of high-frequency derived single nucleotide polymorphisms (SNPs) in O. sativa varieties, a pattern that has not been reported for other crop species. We developed several alternative models to explain contemporary patterns of polymorphisms in rice, including a (i) selectively neutral population bottleneck model, (ii) bottleneck plus migration model, (iii) multiple selective sweeps model, and (iv) bottleneck plus selective sweeps model. We find that a simple bottleneck model, which has been the dominant demographic model for domesticated species, cannot explain the derived nucleotide polymorphism site frequency spectrum in rice. Instead, a bottleneck model that incorporates selective sweeps, or a more complex demographic model that includes subdivision and gene flow, are more plausible explanations for patterns of variation in domesticated rice varieties. If selective sweeps are indeed the explanation for the observed nucleotide data of domesticated rice, it suggests that strong selection can leave its imprint on genome-wide polymorphism patterns, contrary to expectations that selection results only in a local signature of variation.
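The "excess of high-frequency derived SNPs" is a statement about the unfolded site frequency spectrum (SFS). The sketch below tallies an SFS from per-site derived allele counts and compares it with the standard neutral, constant-size expectation, in which the number of sites with i derived copies is proportional to 1/i. The data and function names are hypothetical, and the bottleneck and sweep models fitted in the paper are not reproduced here.

```python
from collections import Counter


def unfolded_sfs(derived_counts, n):
    """Number of segregating sites with i derived alleles, for i = 1..n-1,
    in a sample of n chromosomes. Fixed or absent sites are ignored."""
    tally = Counter(c for c in derived_counts if 0 < c < n)
    return [tally.get(i, 0) for i in range(1, n)]


def neutral_expected_proportions(n):
    """Expected SFS proportions under the standard neutral model with
    constant population size: class i has weight proportional to 1/i."""
    weights = [1.0 / i for i in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical derived allele counts at 8 sites in a sample of n = 6.
counts = [1, 1, 2, 5, 5, 5, 0, 6]
observed = unfolded_sfs(counts, n=6)
expected = neutral_expected_proportions(6)
segregating = sum(observed)
for i, (obs, exp) in enumerate(zip(observed, expected), start=1):
    print(f"i={i}: observed {obs / segregating:.2f}, neutral expectation {exp:.2f}")
```

In this toy sample the highest-frequency class is over-represented relative to the 1/i expectation, which is the qualitative pattern the abstract reports for O. sativa.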

Journal Article
TL;DR: This work takes a model-based approach to inferring selection by developing predictions for patterns of polymorphism in the presence of both population size change and natural selection, a procedure that is more robust to assumptions regarding the true underlying demography than previous approaches to detecting and analyzing selection.
Abstract: Natural selection and demographic forces can have similar effects on patterns of DNA polymorphism. Therefore, to infer selection from samples of DNA sequences, one must simultaneously account for demographic effects. Here we take a model-based approach to this problem by developing predictions for patterns of polymorphism in the presence of both population size change and natural selection. If data are available from different functional classes of variation, and a priori information suggests that mutations in one of those classes are selectively neutral, then the putatively neutral class can be used to infer demographic parameters, and inferences regarding selection on other classes can be performed given demographic parameter estimates. This procedure is more robust to assumptions regarding the true underlying demography than previous approaches to detecting and analyzing selection. We apply this method to a large polymorphism data set from 301 human genes and find (i) widespread negative selection acting on standing nonsynonymous variation, (ii) that the fitness effects of nonsynonymous mutations are well predicted by several measures of amino acid exchangeability, especially site-specific methods, and (iii) strong evidence for very recent population growth.

Journal Article
09 Sep 2005-Cell
TL;DR: The kinetic parameters of the packaging motor and their dependence on external load are determined to show that DNA translocation does not occur during ATP binding but is likely triggered by phosphate release, and a minimal mechanochemical cycle of this DNA-translocating ATPase is proposed.

Journal Article
01 Jul 2005-Genetics
TL;DR: It is demonstrated that the composite-likelihood-ratio test comparing selective and neutral hypotheses is not robust to undetected population structure or a recent bottleneck, with some parameter combinations resulting in a false positive rate of nearly 90%.
Abstract: In 2002 Kim and Stephan proposed a promising composite-likelihood method for localizing and estimating the fitness advantage of a recently fixed beneficial mutation. Here, we demonstrate that their composite-likelihood-ratio (CLR) test comparing selective and neutral hypotheses is not robust to undetected population structure or a recent bottleneck, with some parameter combinations resulting in a false positive rate of nearly 90%. We also propose a goodness-of-fit test for discriminating rejections due to directional selection (true positive) from those due to population and demographic forces (false positives) and demonstrate that the new method has high sensitivity to differentiate the two classes of rejections.

Journal Article
TL;DR: The change from the red seeds of wild rice to the white seeds of cultivated rice (Oryza sativa) resulted from the strong selective sweep of a single mutation, a frame-shift deletion within the Rc gene that is found in 97.9% of white rice varieties today.
Abstract: Here we report that the change from the red seeds of wild rice to the white seeds of cultivated rice (Oryza sativa) resulted from the strong selective sweep of a single mutation, a frame-shift deletion within the Rc gene that is found in 97.9% of white rice varieties today. A second mutation, also within Rc, is present in less than 3% of white accessions surveyed. Haplotype analysis revealed that the predominant mutation originated in the japonica subspecies and crossed both geographic and sterility barriers to move into the indica subspecies. A little less than one Mb of japonica DNA hitchhiked with the rc allele into most indica varieties, suggesting that other linked domestication alleles may have been transferred from japonica to indica along with white pericarp color. Our finding provides evidence of active cultural exchange among ancient farmers over the course of rice domestication coupled with very strong, positive selection for a single white allele in both subspecies of O. sativa.

Journal Article
TL;DR: Examination of mtDNA replicative intermediates from mouse liver using atomic force microscopy and 2D agarose gel electrophoresis provides evidence for only the orthodox, strand-displacement mode of replication and reveals the presence of additional, alternative origins of lagging light-strand mtDNA synthesis.
Abstract: The established strand-displacement model for mammalian mitochondrial DNA (mtDNA) replication has recently been questioned in light of new data using two-dimensional (2D) agarose gel electrophoresis. It has been proposed that a synchronous, strand-coupled mode of replication occurs in tissues, thereby casting doubt on the general validity of the “orthodox,” or strand-displacement model. We have examined mtDNA replicative intermediates from mouse liver using atomic force microscopy and 2D agarose gel electrophoresis in order to resolve this issue. The data provide evidence for only the orthodox, strand-displacement mode of replication and reveal the presence of additional, alternative origins of lagging light-strand mtDNA synthesis. The conditions used for 2D agarose gel analysis are favorable for branch migration of asymmetrically replicating nascent strands. These data reconcile the original displacement mode of replication with the data obtained from 2D gel analyses.

Journal Article
28 Jan 2005-Science
TL;DR: The results imply that FtsK is a bidirectional motor that changes direction in response to short, asymmetric directing DNA sequences.
Abstract: DNA translocases are molecular motors that move rapidly along DNA using adenosine triphosphate as the source of energy. We directly observed the movement of purified FtsK, an Escherichia coli translocase, on single DNA molecules. The protein moves at 5 kilobases per second and against forces up to 60 piconewtons, and locally reverses direction without dissociation. On three natural substrates, independent of its initial binding position, FtsK efficiently translocates over long distances to the terminal region of the E. coli chromosome, as it does in vivo. Our results imply that FtsK is a bidirectional motor that changes direction in response to short, asymmetric directing DNA sequences.

Journal Article
TL;DR: Two methods of temperature control of a dual-beam optical-tweezers system are compared and force was measured directly by sensors of the momentum flux of light, independent of environmental disturbances including refractive index changes that vary with temperature.

Journal Article
TL;DR: GNGNAGGG, its complement, or both are identified as a sequence motif that controls translocation directionality and it is shown that the FtsK translocase is a powerful motor that is able to displace a triplex-forming oligo from a DNA substrate.
Abstract: FtsK from Escherichia coli is a fast and sequence-directed DNA translocase with roles in chromosome dimer resolution, segregation, and decatenation. From the movement of single FtsK particles on defined DNA substrates and an analysis of skewed DNA sequences in bacteria, we identify GNGNAGGG, its complement, or both as a sequence motif that controls translocation directionality. GNGNAGGG is skewed so that it is predominantly on the leading strand of chromosomal replication. Translocation across this octamer from the 3′ side of the G-rich strand causes FtsK to pause, turn around, and translocate in the opposite direction. Only 39 ± 4% of the encounters between FtsK and the octamer result in a turnaround, congruent with our optimum turnaround probability prediction of 30%. The probability that the observed skew of GNGNAGGG within 1 megabase of dif occurred by chance in E. coli is 1.7 × 10^-57, and similarly dramatic skews are found in the five other bacterial genomes we examined. The fact that FtsK acts only in the terminus region and the octamer skew extends from origin to terminus implies that this skew is also important in other basic cellular processes that are common among bacteria. Finally, we show that the FtsK translocase is a powerful motor that is able to displace a triplex-forming oligo from a DNA substrate.

Journal Article
TL;DR: This work models the process of breed formation and shows that the probability of two or three adjacent marker loci showing a spurious signal of selection within at least one breed is low if highly variable and moderately spaced markers are utilized; results from the Dachshund suggest that the causative mutation is a gene or regulatory region closely linked to FGFR3.
Abstract: Many domestic dog breeds have originated through fixation of discrete mutations by intense artificial selection. As a result of this process, markers in the proximity of genes influencing breed-defining traits will have reduced variation (a selective sweep) and will show divergence in allele frequency. Consequently, low-resolution genomic scans can potentially be used to identify regions containing genes that have a major influence on breed-defining traits. We model the process of breed formation and show that the probability of two or three adjacent marker loci showing a spurious signal of selection within at least one breed (i.e., Type I error or false-positive rate) is low if highly variable and moderately spaced markers are utilized. We also use simulations with selection to demonstrate that even a moderately spaced set of highly polymorphic markers (e.g., one every 0.8 cM) has high power to detect regions targeted by strong artificial selection in dogs. Further, we show that a gene responsible for black coat color in the Large Munsterlander has a 40-Mb region surrounding the gene that is very low in heterozygosity for microsatellite markers. Similarly, we survey 302 microsatellite markers in the Dachshund and find three linked monomorphic microsatellite markers all within a 10-Mb region on chromosome 3. This region contains the FGFR3 gene, which is responsible for achondroplasia in humans, but not in dogs. Consequently, our results suggest that the causative mutation is a gene or regulatory region closely linked to FGFR3.

Journal Article
TL;DR: Protein unfolding during import resembles mechanical unfolding, and the specificity of import is determined by the resistance of the mature domain to unfolding as well as by the properties of the targeting sequence.
Abstract: Most proteins that are to be imported into the mitochondrial matrix are synthesized as precursors, each composed of an N-terminal targeting sequence followed by a mature domain. Precursors are recognized through their targeting sequences by receptors at the mitochondrial surface and are then threaded through import channels into the matrix. Both the targeting sequence and the mature domain contribute to the efficiency with which proteins are imported into mitochondria. Precursors must be in an unfolded conformation during translocation. Mitochondria can unfold some proteins by changing their unfolding pathways. The effectiveness of this unfolding mechanism depends on the local structure of the mature domain adjacent to the targeting sequence. This local structure determines the extent to which the unfolding pathway can be changed and, therefore, the unfolding rate increased. Atomic force microscopy studies find that the local structures of proteins near their N and C termini also influence their resistance to mechanical unfolding. Thus, protein unfolding during import resembles mechanical unfolding, and the specificity of import is determined by the resistance of the mature domain to unfolding as well as by the properties of the targeting sequence.

Journal Article
TL;DR: The precise positioning of a carbon nanotube on an atomic force microscope (AFM) tip is reported, and the performance of these tips for AFM imaging is demonstrated by improved lateral resolution of DNA molecules.
Abstract: We report on the precise positioning of a carbon nanotube on an atomic force microscope (AFM) tip. By using a nanomanipulator inside a scanning electron microscope, an individual nanotube was retrieved from a metal foil by the AFM tip. The electron beam allows us to control the nanotube length and to sharpen its end. The performance of these tips for AFM imaging is demonstrated by improved lateral resolution of DNA molecules.

Journal Article
TL;DR: Results on simulated coevolutionary data indicate that the BMM method can successfully detect nearly all coevolving sites when the model has been correctly specified, and that non-parametric statistics such as mutual information are generally less powerful than parametric statistics.
Abstract: Motivation: The evolution of protein sequences is constrained by complex interactions between amino acid residues. Because harmful substitutions may be compensated for by other substitutions at neighboring sites, residues can coevolve. We describe a Bayesian phylogenetic approach to the detection of coevolving residues in protein families. This method, Bayesian mutational mapping (BMM), assigns mutations to the branches of the evolutionary tree stochastically, and then test statistics are calculated to determine whether a coevolutionary signal exists in the mapping. Posterior predictive P-values provide an estimate of significance, and specificity is maintained by integrating over uncertainty in the estimation of the tree topology, branch lengths and substitution rates. A coevolutionary Markov model for codon substitution is also described, and this model is used as the basis of several test statistics. Results: Results on simulated coevolutionary data indicate that the BMM method can successfully detect nearly all coevolving sites when the model has been correctly specified, and that non-parametric statistics such as mutual information are generally less powerful than parametric statistics. On a dataset of eukaryotic proteins from the phosphoglycerate kinase (PGK) family, interdomain site contacts yield a significantly greater coevolutionary signal than interdomain non-contacts, an indication that the method provides information about interacting sites. Failure to account for the heterogeneity in rates across sites in PGK resulted in a less discriminating test, yielding a marked increase in the number of reported positives at both contact and non-contact sites. Supplementary information: http://www.dimmic.net/supplement/
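The non-parametric baseline mentioned above, mutual information between two alignment columns, can be sketched in a few lines. This is only the plain column-pair statistic, computed directly from the alignment without any phylogeny; it is not the BMM method, and the function name and toy columns are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(column_a, column_b):
    """Mutual information (in bits) between two alignment columns.

    Each column is a sequence of residues, one per aligned protein, in the
    same sequence order. Frequencies are plain empirical proportions.
    """
    if len(column_a) != len(column_b):
        raise ValueError("columns must come from the same set of sequences")
    n = len(column_a)
    count_a = Counter(column_a)
    count_b = Counter(column_b)
    count_ab = Counter(zip(column_a, column_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        # p_ab * log2(p_ab / (p_a * p_b)), written with raw counts
        mi += (c / n) * log2(c * n / (count_a[a] * count_b[b]))
    return mi


# Toy columns: the first pair covaries perfectly, the second not at all.
print(mutual_information("AACC", "GGTT"))
print(mutual_information("AACC", "GTGT"))
```

Because closely related sequences share residues by descent, this statistic can be inflated by phylogeny alone, which is one motivation for tree-aware approaches such as the one described in the abstract.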

Journal ArticleDOI
01 Jul 2005-Genetics
TL;DR: The test has excellent power to detect weak negative selection and moderate power to detect positive selection and is quite robust to bias in the estimate of local recombination rate, but not to certain demographic scenarios such as population growth or a recent bottleneck.
Abstract: We present a novel composite-likelihood-ratio test (CLRT) for detecting genes and genomic regions that are subject to recurrent natural selection (either positive or negative). The method uses the likelihood functions of Hartl et al. (1994) for inference in a Wright-Fisher genic selection model and corrects for nonindependence among sites by application of coalescent simulations with recombination. Here, we (1) characterize the distribution of the CLRT statistic (Λ) as a function of the population recombination rate (R = 4Ner); (2) explore the effects of bias in estimation of R on the size (type I error) of the CLRT; (3) explore the robustness of the model to population growth, bottlenecks, and migration; (4) explore the power of the CLRT under varying levels of mutation, selection, and recombination; (5) explore the discriminatory power of the test in distinguishing negative selection from population growth; and (6) evaluate the performance of maximum composite-likelihood estimation (MCLE) of the selection coefficient. We find that the test has excellent power to detect weak negative selection and moderate power to detect positive selection. Moreover, the test is quite robust to bias in the estimate of local recombination rate, but not to certain demographic scenarios such as population growth or a recent bottleneck. Last, we demonstrate that the MCLE of the selection parameter has little bias for weak negative selection and has downward bias for positively selected mutations.

Journal ArticleDOI
TL;DR: These analyses statistically confirm that an evolutionary slowdown occurs late in infection, strongly support the immune-relaxation hypothesis, and indicate that the cessation of nonsynonymous evolution is associated with disease progression.
Abstract: Within-patient HIV populations evolve rapidly because of a high mutation rate, short generation time, and strong positive selection pressures. Previous studies have identified "consistent patterns" of viral sequence evolution. Just before HIV infection progresses to AIDS, evolution seems to slow markedly, and the genetic diversity of the viral population drops. This evolutionary slowdown could be caused either by a reduction in the average viral replication rate or because selection pressures weaken with the collapse of the immune system. The former hypothesis (which we denote "cellular exhaustion") predicts a simultaneous reduction in both synonymous and nonsynonymous evolution, whereas the latter hypothesis (denoted "immune relaxation") predicts that only nonsynonymous evolution will slow. In this paper, we present a set of statistical procedures for distinguishing between these alternative hypotheses using DNA sequences sampled over the course of infection. The first component is a new method for estimating evolutionary rates that takes advantage of the temporal information in longitudinal DNA sequence samples. Second, we develop a set of probability models for the analysis of evolutionary rates in HIV populations in vivo. Application of these models to both synonymous and nonsynonymous evolution affords a comparison of the cellular-exhaustion and immune-relaxation hypotheses. We apply the procedures to longitudinal data sets in which sequences of the env gene were sampled over the entire course of infection. Our analyses (1) statistically confirm that an evolutionary slowdown occurs late in infection, (2) strongly support the immune-relaxation hypothesis, and (3) indicate that the cessation of nonsynonymous evolution is associated with disease progression.

Journal ArticleDOI
TL;DR: If the unfolding process occurs irreversibly, it is shown here that single-molecule experiments can still provide equilibrium, thermodynamic information from non-equilibrium data by using recently discovered fluctuation theorems, which represent a bridge between equilibrium and non-equilibrium statistical mechanics.
Abstract: During the last 15 years, scientists have developed methods that permit the direct mechanical manipulation of individual molecules. Using this approach, they have begun to investigate the effect of force and torque in chemical and biochemical reactions. These studies span from the study of the mechanical properties of macromolecules, to the characterization of molecular motors, to the mechanical unfolding of individual proteins and RNA. Here I present a review of some of our most recent results using mechanical force to unfold individual molecules of RNA. These studies make it possible to follow in real time the trajectory of each molecule as it unfolds and characterize the various intermediates of the reaction. Moreover, if the process takes place reversibly it is possible to extract both kinetic and thermodynamic information from these experiments at the same time that we characterize the forces that maintain the three-dimensional structure of the molecule in solution. These studies bring us closer to the biological unfolding processes in the cell as they simulate in vitro, the mechanical unfolding of RNAs carried out in the cell by helicases. If the unfolding process occurs irreversibly, I show here that single-molecule experiments can still provide equilibrium, thermodynamic information from non-equilibrium data by using recently discovered fluctuation theorems. Such theorems represent a bridge between equilibrium and non-equilibrium statistical mechanics. In fact, first derived in 1997, the first experimental demonstration of the validity of fluctuation theorems was obtained by unfolding mechanically a single molecule of RNA. It is perhaps a sign of the times that important physical results are these days used to extract information about biological systems and that biological systems are being used to test and confirm fundamental new laws in physics.
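
The abstract does not name the fluctuation theorem it refers to. Assuming it is the Jarzynski equality, ΔF = -kT ln⟨exp(-W/kT)⟩, the sketch below shows how an equilibrium free energy difference could be estimated from work values measured in repeated irreversible pulls. The function name, the default kT, and the sample values are hypothetical, and this is not the analysis used in the paper.

```python
import math


def jarzynski_free_energy(work_values, kT=1.0):
    """Estimate the free energy difference from non-equilibrium work values.

    Implements dF = -kT * ln( mean( exp(-W / kT) ) ), with a log-sum-exp
    shift for numerical stability. work_values and kT must share units.
    Hypothetical helper for illustration only.
    """
    if not work_values:
        raise ValueError("need at least one work measurement")
    scaled = [-w / kT for w in work_values]
    shift = max(scaled)
    log_mean = shift + math.log(
        sum(math.exp(s - shift) for s in scaled) / len(scaled)
    )
    return -kT * log_mean


# Toy work values in units of kT, made up for illustration.
works = [1.0, 2.0, 3.0]
estimate = jarzynski_free_energy(works)
mean_work = sum(works) / len(works)
```

The exponential average is dominated by the rare low-work trajectories, so the estimate never exceeds the mean work, and the gap between the two reflects the work dissipated on average.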

Journal ArticleDOI
TL;DR: Atomic force microscopy is used to visualize protein-DNA complexes formed during the initial stages of P-element transposition and reveals that GTP acts to promote assembly of the first detectable noncovalent precleavage synaptic complex.
Abstract: P transposable elements in Drosophila are members of a larger class of mobile elements that move using a cut-and-paste mechanism. P-element transposase uses guanosine triphosphate (GTP) as a cofactor for transposition. Here, we use atomic force microscopy (AFM) to visualize protein–DNA complexes formed during the initial stages of P-element transposition. These studies reveal that GTP acts to promote assembly of the first detectable noncovalent precleavage synaptic complex. This initial complex then randomly and independently cleaves each P-element end. These data show that GTP acts to promote protein–DNA assembly, and may explain why P-element excision often leads to unidirectional deletions.

Book ChapterDOI
01 Jan 2005


Book ChapterDOI
01 Jan 2005
TL;DR: The rapid evolution of this novel gene, Sdic, is presented as a fascinating example of male-driven evolution incurred by recurrent selective sweeps.
Abstract: The Sdic gene cluster at the base of the X-chromosome is unique to the lineage of Drosophila melanogaster. The repeating unit in the cluster was formed from a duplication and fusion of the genes, AnnX and Cdic, which juxtaposed the 3′ untranslated region of AnnX to the third intron of Cdic. AnnX encodes Annexin 10 and Cdic encodes a cytoplasmic dynein intermediate chain. The 3′ untranslated region of AnnX contains two promoter elements, including a testis-specific element, and Cdic intron 3 contains a third promoter element; together these elements result in testis-specific transcription of Sdic. The Sdic protein features a novel amino terminus derived in part from Cdic intron 3 which contains motifs similar to those in axonemal dyneins. It has been demonstrated that the Sdic protein becomes incorporated into the tails of mature sperm. The evolution of the Sdic cluster required several deletions, at least one insertion, at least eleven nucleotide substitutions, and an estimated tenfold tandem duplication, all of which took place in the 1–3 million years since the divergence of D. melanogaster from D. simulans. Evidence for the ongoing evolution of Sdic including a recent selective sweep is found in the low levels of polymorphism across neighboring genes in the region, a large number of fixed amino acid replacements relative to fixed synonymous nucleotide substitutions, and a frequency spectrum of polymorphic nucleotides skewed toward rare variants. The analysis of polymorphism and divergence in the Sdic region, however, is complicated by the possible effects of background selection caused by deleterious new mutations, owing to the reduced amount of recombination in the region associated with its proximity to centromeric heterochromatin. We present the rapid evolution of this novel gene as a fascinating example of male-driven evolution incurred by recurrent selective sweeps.