Showing papers by "Carlos Bustamante" published in 2014


Journal Article
TL;DR: It is found that none of the extant wolf lineages from putative domestication centers is more closely related to dogs, and, instead, the sampled wolves form a sister monophyletic clade, suggesting that a re-evaluation of past hypotheses regarding dog origins is necessary.
Abstract: To identify genetic changes underlying dog domestication and reconstruct their early evolutionary history, we generated high-quality genome sequences from three gray wolves, one from each of the three putative centers of dog domestication, two basal dog lineages (Basenji and Dingo) and a golden jackal as an outgroup. Analysis of these sequences supports a demographic model in which dogs and wolves diverged through a dynamic process involving population bottlenecks in both lineages and post-divergence gene flow. In dogs, the domestication bottleneck involved at least a 16-fold reduction in population size, a much more severe bottleneck than estimated previously. A sharp bottleneck in wolves occurred soon after their divergence from dogs, implying that the pool of diversity from which dogs arose was substantially larger than represented by modern wolf populations. We narrow the plausible range for the date of initial dog domestication to an interval spanning 11–16 thousand years ago, predating the rise of agriculture. In light of this finding, we expand upon previous work regarding the increase in copy number of the amylase gene (AMY2B) in dogs, which is believed to have aided digestion of starch in agricultural refuse. We find standing variation for amylase copy number variation in wolves and little or no copy number increase in the Dingo and Husky lineages. In conjunction with the estimated timing of dog origins, these results provide additional support to archaeological finds, suggesting the earliest dogs arose alongside hunter-gatherers rather than agriculturists. Regarding the geographic origin of dogs, we find that, surprisingly, none of the extant wolf lineages from putative domestication centers is more closely related to dogs, and, instead, the sampled wolves form a sister monophyletic clade. This result, in combination with dog-wolf admixture during the process of domestication, suggests that a re-evaluation of past hypotheses regarding dog origins is necessary.

504 citations


Journal Article
13 Feb 2014-Nature
TL;DR: The genome of a male infant (Anzick-1) recovered from the Anzick burial site in western Montana is sequenced, and it is shown that the gene flow from the Siberian Upper Palaeolithic Mal'ta population into Native American ancestors is also shared by the Anzick-1 individual and thus happened before 12,600 years BP.
Abstract: Clovis, with its distinctive biface, blade and osseous technologies, is the oldest widespread archaeological complex defined in North America, dating from 11,100 to 10,700 C-14 years before present (BP) (13,000 to 12,600 calendar years BP)(1,2). Nearly 50 years of archaeological research point to the Clovis complex as having developed south of the North American ice sheets from an ancestral technology(3). However, both the origins and the genetic legacy of the people who manufactured Clovis tools remain under debate. It is generally believed that these people ultimately derived from Asia and were directly related to contemporary Native Americans(2). An alternative, Solutrean, hypothesis posits that the Clovis predecessors emigrated from southwestern Europe during the Last Glacial Maximum(4). Here we report the genome sequence of a male infant (Anzick-1) recovered from the Anzick burial site in western Montana. The human bones date to 10,705 +/- 35 C-14 years BP (approximately 12,707-12,556 calendar years BP) and were directly associated with Clovis tools. We sequenced the genome to an average depth of 14.4x and show that the gene flow from the Siberian Upper Palaeolithic Mal'ta population(5) into Native American ancestors is also shared by the Anzick-1 individual and thus happened before 12,600 years BP. We also show that the Anzick-1 individual is more closely related to all indigenous American populations than to any other group. Our data are compatible with the hypothesis that Anzick-1 belonged to a population directly ancestral to many contemporary Native Americans. Finally, we find evidence of a deep divergence in Native American populations that predates the Anzick-1 individual.

464 citations


Journal Article
13 Jun 2014-Science
TL;DR: Pre-Columbian genetic substructure is recapitulated in the indigenous ancestry of admixed mestizo individuals across the country, and two independently phenotyped cohorts of Mexicans and Mexican Americans showed a significant association between subcontinental ancestry and lung function.
Abstract: Mexico harbors great cultural and ethnic diversity, yet fine-scale patterns of human genome-wide variation from this region remain largely uncharacterized. We studied genomic variation within Mexico from over 1000 individuals representing 20 indigenous and 11 mestizo populations. We found striking genetic stratification among indigenous populations within Mexico at varying degrees of geographic isolation. Some groups were as differentiated as Europeans are from East Asians. Pre-Columbian genetic substructure is recapitulated in the indigenous ancestry of admixed mestizo individuals across the country. Furthermore, two independently phenotyped cohorts of Mexicans and Mexican Americans showed a significant association between subcontinental ancestry and lung function. Thus, accounting for fine-scale ancestry patterns is critical for medical and population genetic studies within Mexico, in Mexican-descent populations, and likely in many other populations worldwide.

416 citations


Journal Article
29 Aug 2014-Science
TL;DR: In this paper, the authors present genome-wide sequence data from ancient and present-day humans from Greenland, Arctic Canada, Alaska, Aleutian Islands, and Siberia, and show that a single Paleo-Eskimo metapopulation likely survived in near-isolation for more than 4000 years, only to vanish around 700 years ago.
Abstract: The New World Arctic, the last region of the Americas to be populated by humans, has a relatively well-researched archaeology, but an understanding of its genetic history is lacking. We present genome-wide sequence data from ancient and present-day humans from Greenland, Arctic Canada, Alaska, Aleutian Islands, and Siberia. We show that Paleo-Eskimos (~3000 BCE to 1300 CE) represent a migration pulse into the Americas independent of both Native American and Inuit expansions. Furthermore, the genetic continuity characterizing the Paleo-Eskimo period was interrupted by the arrival of a new population, representing the ancestors of present-day Inuit, with evidence of past gene flow between these lineages. Despite periodic abandonment of major Arctic regions, a single Paleo-Eskimo metapopulation likely survived in near-isolation for more than 4000 years, only to vanish around 700 years ago.

266 citations


Journal Article
24 Apr 2014-Cell
TL;DR: This study investigates a viral packaging machine as it fills the capsid with DNA and encounters increasing internal pressure and finds that the motor rotates the DNA during packaging and that the rotation per base pair increases with filling.

133 citations


Journal Article
TL;DR: The first genome assembly of an extremophile, the first dipteran in the family Chironomidae, and the first Antarctic eukaryote to be sequenced are presented.
Abstract: The midge, Belgica antarctica, is the only insect endemic to Antarctica, and thus it offers a powerful model for probing responses to extreme temperatures, freeze tolerance, dehydration, osmotic stress, ultraviolet radiation and other forms of environmental stress. Here we present the first genome assembly of an extremophile, the first dipteran in the family Chironomidae, and the first Antarctic eukaryote to be sequenced. At 99 megabases, B. antarctica has the smallest insect genome sequenced thus far. Although it has a similar number of genes as other Diptera, the midge genome has very low repeat density and a reduction in intron length. Environmental extremes appear to constrain genome architecture, not gene content. The few transposable elements present are mainly ancient, inactive retroelements. An abundance of genes associated with development, regulation of metabolism and responses to external stimuli may reflect adaptations for surviving in this harsh environment.

131 citations


Journal Article
TL;DR: It is demonstrated that GBS is a cost-effective method for generating genome-wide SNP data suitable for genetic mapping in a highly diverse and heterozygous agricultural species and future improvements to the GBS analysis pipeline presented here will enhance the utility of next-generation DNA sequence data for the purposes of genetic mapping across diverse species.
Abstract: Next-generation DNA sequencing (NGS) produces vast amounts of DNA sequence data, but it is not specifically designed to generate data suitable for genetic mapping. Recently developed DNA library preparation methods for NGS have helped solve this problem, however, by combining the use of reduced representation libraries with DNA sample barcoding to generate genome-wide genotype data from a common set of genetic markers across a large number of samples. Here we use such a method, called genotyping-by-sequencing (GBS), to produce a data set for genetic mapping in an F1 population of apples (Malus × domestica) segregating for skin color. We show that GBS produces a relatively large, but extremely sparse, genotype matrix: over 270,000 SNPs were discovered but most SNPs have too much missing data across samples to be useful for genetic mapping. After filtering for genotype quality and missing data, only 6% of the 85 million DNA sequence reads contributed to useful genotype calls. Despite this limitation, using existing software and a set of simple heuristics, we generated a final genotype matrix containing 3967 SNPs from 89 DNA samples from a single lane of Illumina HiSeq and used it to create a saturated genetic linkage map and to identify a known QTL underlying apple skin color. We therefore demonstrate that GBS is a cost-effective method for generating genome-wide SNP data suitable for genetic mapping in a highly diverse and heterozygous agricultural species. We anticipate future improvements to the GBS analysis pipeline presented here that will enhance the utility of next-generation DNA sequence data for the purposes of genetic mapping across diverse species.

127 citations
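
The missing-data filtering step described in the GBS abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the genotype encoding (0/1/2 alternate-allele counts, `None` for a missing call), and the 20% missingness cutoff are all assumptions chosen for illustration.

```python
from typing import List, Optional, Sequence

# Hypothetical encoding: 0, 1, or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def missing_fraction(calls: Sequence[Genotype]) -> float:
    """Fraction of samples with no genotype call at one SNP."""
    if not calls:
        return 1.0  # treat an empty row as entirely missing
    return sum(call is None for call in calls) / len(calls)


def filter_snps(matrix: List[List[Genotype]], max_missing: float = 0.2) -> List[int]:
    """Return indices of SNP rows whose missing fraction is at most max_missing.

    The 0.2 default is an illustrative assumption, not a value from the paper.
    """
    return [i for i, calls in enumerate(matrix) if missing_fraction(calls) <= max_missing]


if __name__ == "__main__":
    toy = [
        [0, 1, None, 2, 1],        # 1 of 5 calls missing -> kept
        [None, None, 1, None, 0],  # 3 of 5 calls missing -> dropped
        [2, 2, 1, 0, 1],           # no calls missing -> kept
    ]
    print(filter_snps(toy))
```

A per-SNP threshold like this is the simplest way to turn a large, sparse genotype matrix into a smaller, denser one; real pipelines typically also filter on genotype quality and per-sample missingness.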


Journal Article
TL;DR: The results categorically demonstrate that electron transport and proton circuitry in this model bacterium are spatially delocalized over the cell membrane, in stark contrast to mitochondrial bioenergetic supercomplexes.

126 citations


Journal Article
TL;DR: It is proposed that the downstream structure traps ribosomal complexes in the fluctuating conformational states of the translocation process and thus allows more opportunities for frameshifting, which would allow the PRE complex to explore alternative translocation pathways such as −1PRF.
Abstract: Ribosomal frameshifting occurs when a ribosome slips a few nucleotides on an mRNA and generates a new sequence of amino acids. Programmed -1 ribosomal frameshifting (-1PRF) is used in various systems to express two or more proteins from a single mRNA at precisely regulated levels. We used single-molecule fluorescence resonance energy transfer (smFRET) to study the dynamics of -1PRF in the Escherichia coli dnaX gene. The frameshifting mRNA (FSmRNA) contained the frameshifting signals: a Shine-Dalgarno sequence, a slippery sequence, and a downstream stem loop. The dynamics of ribosomal complexes translating through the slippery sequence were characterized using smFRET between the Cy3-labeled L1 stalk of the large ribosomal subunit and a Cy5-labeled tRNA(Lys) in the ribosomal peptidyl-tRNA-binding (P) site. We observed significantly slower elongation factor G (EF-G)-catalyzed translocation through the slippery sequence of FSmRNA in comparison with an mRNA lacking the stem loop, ΔSL. Furthermore, the P-site tRNA/L1 stalk of FSmRNA-programmed pretranslocation (PRE) ribosomal complexes exhibited multiple fluctuations between the classical/open and hybrid/closed states, respectively, in the presence of EF-G before translocation, in contrast with ΔSL-programmed PRE complexes, which sampled the hybrid/closed state approximately once before undergoing translocation. Quantitative analysis showed that the stimulatory stem loop destabilizes the hybrid state and elevates the energy barriers corresponding to subsequent substeps of translocation. The shift of the FSmRNA-programmed PRE complex equilibrium toward the classical/open state and toward states that favor EF-G dissociation apparently allows the PRE complex to explore alternative translocation pathways such as -1PRF.

99 citations


Journal Article
TL;DR: The results suggest that rare variation contributes to individual differences in response to albuterol in Latinos, notably in SLC genes that include membrane transport proteins involved in the transport of endogenous metabolites and xenobiotics.
Abstract: Background: The primary rescue medication to treat acute asthma exacerbation is the short-acting β2-adrenergic receptor agonist; however, there is variation in how well a patient responds to treatment. Although these differences might be due to environmental factors, there is mounting evidence for a genetic contribution to variability in bronchodilator response (BDR). Objective: To identify genetic variation associated with bronchodilator drug response in Latino children with asthma. Methods: We performed a genome-wide association study (GWAS) for BDR in 1782 Latino children with asthma using standard linear regression, adjusting for genetic ancestry and ethnicity, and performed replication studies in an additional 531 Latinos. We also performed admixture mapping across the genome by testing for an association between local European, African, and Native American ancestry and BDR, adjusting for genomic ancestry and ethnicity. Results: We identified 7 genetic variants associated with BDR at a genome-wide significant threshold (P < 5 × 10⁻⁸), all of which had frequencies of less than 5%. Furthermore, we observed an excess of small P values driven by rare variants (frequency, […] SLC22A15 as being associated with increased BDR in Mexicans. Quantitative PCR and immunohistochemistry identified SLC22A15 as being expressed in the lung and bronchial epithelial cells. Conclusion: Our results suggest that rare variation contributes to individual differences in response to albuterol in Latinos, notably in SLC genes that include membrane transport proteins involved in the transport of endogenous metabolites and xenobiotics. Resequencing in larger, multiethnic population samples and additional functional studies are required to further understand the role of rare variation in BDR.

96 citations


Journal Article
TL;DR: This review focuses on a widespread and important statistical measure known as the randomness parameter, the squared coefficient of variation of the cycle completion times, which is remarkably simple yet places significant limits on the minimal complexity of possible enzymatic mechanisms.
Abstract: Enzyme-catalyzed reactions are naturally stochastic, and precision measurements of these fluctuations, made possible by single-molecule methods, promise to provide fundamentally new constraints on the possible mechanisms underlying these reactions. We review some aspects of statistical kinetics: a new field with the goal of extracting mechanistic information from statistical measures of fluctuations in chemical reactions. We focus on a widespread and important statistical measure known as the randomness parameter. This parameter is remarkably simple in that it is the squared coefficient of variation of the cycle completion times, although it places significant limits on the minimal complexity of possible enzymatic mechanisms. Recently, a general expression has been introduced for the substrate dependence of the randomness parameter that is for rate fluctuations what the Michaelis–Menten expression is for the mean rate of product generation. We discuss the information provided by the new kinetic parameters introduced by this expression and demonstrate that this expression can simplify the vast majority of published models.
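
As a rough illustration of the definition in the abstract above, the randomness parameter can be estimated from a list of measured cycle completion times as r = (⟨τ²⟩ − ⟨τ⟩²) / ⟨τ⟩². The sketch below is a minimal example under stated assumptions: the function names are hypothetical, the population variance is used as the estimator, and the interpretation of 1/r as a lower bound on the number of kinetic steps is the standard result for sequential schemes rather than something specific to this review.

```python
from statistics import fmean, pvariance
from typing import Sequence


def randomness_parameter(cycle_times: Sequence[float]) -> float:
    """Squared coefficient of variation of the cycle completion times."""
    mean = fmean(cycle_times)
    if mean <= 0:
        raise ValueError("cycle completion times must have a positive mean")
    return pvariance(cycle_times, mu=mean) / mean**2


def min_kinetic_steps(cycle_times: Sequence[float]) -> float:
    """Lower bound 1/r on the number of steps in a sequential kinetic scheme."""
    r = randomness_parameter(cycle_times)
    if r == 0:
        return float("inf")  # perfectly regular timing: no finite bound
    return 1.0 / r


if __name__ == "__main__":
    toy_times = [1.0, 3.0]  # illustrative values, not experimental data
    print(randomness_parameter(toy_times))
    print(min_kinetic_steps(toy_times))
```

A single rate-limiting step gives exponentially distributed cycle times and r = 1, so values of r well below 1 indicate that more than one step contributes to the cycle.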

Journal Article
TL;DR: Using whole-genome sequencing data, it is confirmed that the Iceman is, indeed, most closely related to Sardinians and it is shown that this relationship extends to other individuals from cultural contexts associated with the spread of agriculture during the Neolithic transition, in contrast to individuals from a hunter-gatherer context.
Abstract: Genome sequencing of the 5,300-year-old mummy of the Tyrolean Iceman, found in 1991 on a glacier near the border of Italy and Austria, has yielded new insights into his origin and relationship to modern European populations. A key finding of that study was an apparent recent common ancestry with individuals from Sardinia, based largely on the Y chromosome haplogroup and common autosomal SNP variation. Here, we compiled and analyzed genomic datasets from both modern and ancient Europeans, including genome sequence data from over 400 Sardinians and two ancient Thracians from Bulgaria, to investigate this result in greater detail and determine its implications for the genetic structure of Neolithic Europe. Using whole-genome sequencing data, we confirm that the Iceman is, indeed, most closely related to Sardinians. Furthermore, we show that this relationship extends to other individuals from cultural contexts associated with the spread of agriculture during the Neolithic transition, in contrast to individuals from a hunter-gatherer context. We hypothesize that this genetic affinity of ancient samples from different parts of Europe with Sardinians represents a common genetic component that was geographically widespread across Europe during the Neolithic, likely related to migrations and population expansions associated with the spread of agriculture.

Journal Article
TL;DR: Admixture mapping and genome-wide association are complementary techniques that provide evidence for multiple asthma-associated loci in Latinos; admixture mapping identifies a novel locus on 6p21 that replicates in a meta-analysis of several Latino populations, whereas genome-wide association confirms the previously identified locus on 17q21.
Abstract: Background: Asthma is a complex disease with both genetic and environmental causes. Genome-wide association studies of asthma have mostly involved European populations, and replication of positive associations has been inconsistent. Objective: We sought to identify asthma-associated genes in a large Latino population with genome-wide association analysis and admixture mapping. Methods: Latino children with asthma (n = 1893) and healthy control subjects (n = 1881) were recruited from 5 sites in the United States: Puerto Rico, New York, Chicago, Houston, and the San Francisco Bay Area. Subjects were genotyped on an Affymetrix World Array IV chip. We performed genome-wide association and admixture mapping to identify asthma-associated loci. Results: We identified a significant association between ancestry and asthma at 6p21 (lowest P value: rs2523924, P […] 10⁻⁶). This association replicates in a meta-analysis of the EVE Asthma Consortium (P = .01). Fine mapping of the region in this study and the EVE Asthma Consortium suggests an association between PSORS1C1 and asthma. We confirmed the strong allelic association between SNPs in the 17q21 region and asthma in Latinos (IKZF3, lowest P value: rs90792, odds ratio, 0.67; 95% CI, 0.61-0.75; P = 6 × 10⁻¹³) and replicated associations in several genes that had previously been associated with asthma in genome-wide association studies. Conclusions: Admixture mapping and genome-wide association are complementary techniques that provide evidence for multiple asthma-associated loci in Latinos. Admixture mapping identifies a novel locus on 6p21 that replicates in a meta-analysis of several Latino populations, whereas genome-wide association confirms the previously identified locus on 17q21.

Journal Article
11 Aug 2014-eLife
TL;DR: It is found that translocation rates depend exponentially on the force, with a characteristic distance close to the one-codon step, ruling out the existence of sub-steps and showing that the ribosome likely functions as a Brownian ratchet; the maximum force it generates is only just enough to unwind mRNA structures, which provides a basis for the regulatory role of those structures.
Abstract: Producing a protein first requires its gene to be transcribed into a long molecule called a messenger RNA (mRNA). A complex molecular machine called the ribosome then translates the mRNA code by reading it three letters at a time. Each triplet of letters, known as a codon, tells the ribosome which amino acid to add next into the protein. After adding an amino acid, the ribosome moves along the mRNA molecule to read the next codon and add another amino acid into the protein chain. While researchers understand how protein chains are formed, how the ribosome shifts along the mRNA strand, a process called translocation, is still unclear. It is known that this process involves many force-generating movements and changes to the shape of the ribosome. However, it is only recently that researchers have been able to measure these forces. Using optical tweezers (an instrument that uses a highly focused laser beam to hold and manipulate microscopic objects), Liu, Kaplan et al. followed individual ribosomes as they translated an mRNA and measured the effect that applying an opposing force has on the rate of translation. The results shed new light on the mechanism of translocation. First, Liu, Kaplan et al. found that ribosomes jump directly from one triplet to the next in the mRNA sequence, rather than moving there in a series of smaller steps. Next, the results indicate that translocation occurs spontaneously, driven by thermal energy, while chemical reactions prevent the reverse movement, in a mechanism known as a 'Brownian Ratchet'. Measurements of the maximum force generated by the ribosome also give insights into how translation is regulated. Strands of mRNA can fold into certain structures that slow down translation, because the mRNA must first be unfolded before the ribosome can translate it. Liu, Kaplan et al. found that the maximum force generated by a ribosome is only just enough to unwind these mRNA structures, making the translation rate highly sensitive to the existence of such structures, and the structures themselves of high importance for regulating transcription. Given its importance as the ultimate decoder of the genetic information, understanding the ribosome's function and regulation has broad implications. The work of Liu, Kaplan et al. opens the way for a full characterization of the role of mechanical forces in the translation process.
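
The exponential force dependence mentioned in the summary above is commonly written in an Arrhenius-like (Bell) form, k(F) = k0 · exp(−F·d / k_B·T), where d is the characteristic distance. The sketch below is a minimal illustration of that general form, not the authors' fitted model: the function name, the zero-force rate, and the distance used in the example are hypothetical placeholders, and k_B·T is taken as roughly 4.11 pN·nm near room temperature.

```python
import math

# Approximate thermal energy k_B*T near room temperature, in pN*nm.
KBT_PN_NM = 4.11


def rate_under_force(k0: float, force_pn: float, distance_nm: float,
                     kbt: float = KBT_PN_NM) -> float:
    """Rate of a step opposed by a force, assuming exponential force dependence.

    k0          -- rate at zero force (any rate unit; hypothetical input)
    force_pn    -- opposing force in piconewtons
    distance_nm -- characteristic distance in nanometres
    """
    return k0 * math.exp(-force_pn * distance_nm / kbt)


if __name__ == "__main__":
    # Illustrative numbers only: a 1 nm characteristic distance and unit zero-force rate.
    for force in (0.0, 5.0, 10.0):
        print(force, rate_under_force(k0=1.0, force_pn=force, distance_nm=1.0))
```

In this form, fitting the slope of log(rate) against force yields the characteristic distance, which is how a measured distance can be compared with the size of a full step.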

Journal Article
TL;DR: During transcription, a DNA molecule is copied into RNA molecules that are then used to translate the genetic information into proteins; this logical pattern has been conserved throughout all three kingdoms of life, making it an essential and fundamental cellular process.
Abstract: Transcription represents the first step in gene expression. It is therefore not surprising that transcription is a highly regulated process and its control is essential to understand the flow and processing of information required by the cell to maintain its homeostasis. During transcription, a DNA molecule is copied into RNA molecules that are then used to translate the genetic information into proteins; this logical pattern has been conserved throughout all three kingdoms of life, from Archaea to Eukarya, making it an essential and fundamental cellular process. Even though some viruses that encode their genome in an RNA molecule use it as a template to make mRNA, others synthesize an intermediate DNA molecule from the RNA, a process known as reverse transcription, from which regular transcription of viral genes can then proceed in the host cells.

Journal Article
TL;DR: A slow-switching Dronpa variant, rsKame, featuring a V157L amino acid substitution proximal to the chromophore, reduces excitation light-induced photoactivation from the dark to the fluorescent state; two-color imaging with rsKame shows that Drp1 helical rings decrease in diameter during mitochondrial fission, supporting the twistase model of Drp1 constriction, with potential loss of subunits at the helical ends.
Abstract: We studied the single-molecule photo-switching properties of Dronpa, a green photo-switchable fluorescent protein and a popular marker for photoactivated localization microscopy. We found the excitation light photoactivates as well as deactivates Dronpa single molecules, hindering temporal separation and limiting super resolution. To resolve this limitation, we have developed a slow-switching Dronpa variant, rsKame, featuring a V157L amino acid substitution proximal to the chromophore. The increased steric hindrance generated by the substitution reduced the excitation light-induced photoactivation from the dark to fluorescent state. To demonstrate applicability, we paired rsKame with PAmCherry1 in a two-color photoactivated localization microscopy imaging method to observe the inner and outer mitochondrial membrane structures and selectively labeled dynamin related protein 1 (Drp1), responsible for membrane scission during mitochondrial fission. We determined the diameter and length of Drp1 helical rings encircling mitochondria during fission and showed that, whereas their lengths along mitochondria were not significantly changed, their diameters decreased significantly. These results suggest support for the twistase model of Drp1 constriction, with potential loss of subunits at the helical ends.

Journal Article
02 Oct 2014-Blood
TL;DR: The results provide the first evidence linking genetic variation in folate homeostasis to warfarin response and highlight a population-specific regulatory variant, rs7856096, which overlaps functional elements and was associated with FPGS gene expression in lymphoblastoid cell lines derived from combined HapMap African populations.

Journal Article
TL;DR: The various mechanisms by which ring motors convert chemical energy to mechanical force or torque and coordinate the activities of individual subunits that constitute the ring are described.

Journal Article
TL;DR: It is found that these transcription factors enhance overall transcription elongation by reducing the lifetime of transcriptional pauses and that TFIIF also decreases the probability of pause entry, and it is observed that both factors enhance the processivity of RNA polymerase II through the nucleosomal barrier.
Abstract: Transcription factors IIS (TFIIS) and IIF (TFIIF) are known to stimulate transcription elongation. Here, we use a single-molecule transcription elongation assay to study the effects of both factors. We find that these transcription factors enhance overall transcription elongation by reducing the lifetime of transcriptional pauses and that TFIIF also decreases the probability of pause entry. Furthermore, we observe that both factors enhance the processivity of RNA polymerase II through the nucleosomal barrier. The effects of TFIIS and TFIIF are quantitatively described using the linear Brownian ratchet kinetic model for transcription elongation and the backtracking model for transcriptional pauses, modified to account for the effects of the transcription factors. Our findings help elucidate the molecular mechanisms by which transcription factors modulate gene expression.

Journal Article
TL;DR: Together, these results provide a resource for studies analyzing functional differences across populations by estimating the degree of shared gene expression, alternative splicing, and regulatory genetics across populations from the broadest points of human migration history yet sampled.
Abstract: Large-scale sequencing efforts have documented extensive genetic variation within the human genome. However, our understanding of the origins, global distribution, and functional consequences of this variation is far from complete. While regulatory variation influencing gene expression has been studied within a handful of populations, the breadth of transcriptome differences across diverse human populations has not been systematically analyzed. To better understand the spectrum of gene expression variation, alternative splicing, and the population genetics of regulatory variation in humans, we have sequenced the genomes, exomes, and transcriptomes of EBV transformed lymphoblastoid cell lines derived from 45 individuals in the Human Genome Diversity Panel (HGDP). The populations sampled span the geographic breadth of human migration history and include Namibian San, Mbuti Pygmies of the Democratic Republic of Congo, Algerian Mozabites, Pathan of Pakistan, Cambodians of East Asia, Yakut of Siberia, and Mayans of Mexico. We discover that approximately 25.0% of the variation in gene expression found amongst individuals can be attributed to population differences. However, we find few genes that are systematically differentially expressed among populations. Of this population-specific variation, 75.5% is due to expression rather than splicing variability, and we find few genes with strong evidence for differential splicing across populations. Allelic expression analyses indicate that previously mapped common regulatory variants identified in eight populations from the International Haplotype Map Phase 3 project have similar effects in our seven sampled HGDP populations, suggesting that the cellular effects of common variants are shared across diverse populations. Together, these results provide a resource for studies analyzing functional differences across populations by estimating the degree of shared gene expression, alternative splicing, and regulatory genetics across populations from the broadest points of human migration history yet sampled.

Journal Article
TL;DR: This review describes how applications of single-molecule manipulation methods, in particular optical tweezers, are shedding new light on the molecular mechanisms of quality control during the life cycles of proteins.
Abstract: Cells employ a variety of strategies to maintain proteome homeostasis. Beginning during protein biogenesis, the translation machinery and a number of molecular chaperones promote correct de novo folding of nascent proteins even before synthesis is complete. Another set of molecular chaperones helps to maintain proteins in their functional, native state. Polypeptides that are no longer needed or pose a threat to the cell, such as misfolded proteins and aggregates, are removed in an efficient and timely fashion by ATP-dependent proteases. In this review, we describe how applications of single-molecule manipulation methods, in particular optical tweezers, are shedding new light on the molecular mechanisms of quality control during the life cycles of proteins.

Journal Article
TL;DR: It is shown that exome capture of saliva-derived DNA yields sufficient non-human sequences to characterize oral microbial communities, including detection of bacteria linked to oral disease (e.g. Prevotella melaninogenica).
Abstract: Targeted capture of genomic regions reduces sequencing cost while generating higher coverage by allowing biomedical researchers to focus on specific loci of interest, such as exons. Targeted capture also has the potential to facilitate the generation of genomic data from DNA collected via saliva or buccal cells. DNA samples derived from these cell types tend to have a lower human DNA yield, may be degraded from age and/or have contamination from bacteria or other ambient oral microbiota. However, thousands of samples have been previously collected from these cell types, and saliva collection has the advantage that it is non-invasive and appropriate for a wide variety of research. We demonstrate successful enrichment and sequencing of 15 South African KhoeSan exomes and 2 full genomes with samples initially derived from saliva. The expanded exome dataset enables us to characterize genetic diversity free from ascertainment bias for multiple KhoeSan populations, including new exome data from six HGDP Namibian San, revealing substantial population structure across the Kalahari Desert region. Additionally, we discover and independently verify thirty-one previously unknown KIR alleles using methods we developed to accurately map and call the highly polymorphic HLA and KIR loci from exome capture data. Finally, we show that exome capture of saliva-derived DNA yields sufficient non-human sequences to characterize oral microbial communities, including detection of bacteria linked to oral disease (e.g. Prevotella melaninogenica). For comparison, two samples were sequenced using standard full genome library preparation without exome capture and we found no systematic bias of metagenomic information between exome-captured and non-captured data. DNA from human saliva samples, collected and extracted using standard procedures, can be used to successfully sequence high quality human exomes, and metagenomic data can be derived from non-human reads. We find that individuals from the Kalahari carry a higher oral pathogenic microbial load than samples surveyed in the Human Microbiome Project. Additionally, rare variants present in the exomes suggest strong population structure across different KhoeSan populations.

Journal ArticleDOI
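TL;DR: The authors show that kinetic models can be extracted from single molecule force spectroscopy data without assuming either (i) a prespecified number of discrete states or (ii) Markovian transitions between states, focusing their analysis on the zipping/unzipping transitions of an RNA hairpin.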
Abstract: Extracting kinetic models from single molecule data is an important route to mechanistic insight in biophysics, chemistry, and biology. Data collected from force spectroscopy can probe discrete hops of a single molecule between different conformational states. Model extraction from such data is a challenging inverse problem because single molecule data are noisy and rich in structure. Standard modeling methods normally assume (i) a prespecified number of discrete states and (ii) that transitions between states are Markovian. The data set is then fit to this predetermined model to find a handful of rates describing the transitions between states. We show that it is unnecessary to assume either (i) or (ii) and focus our analysis on the zipping/unzipping transitions of an RNA hairpin. The key is in starting with a very broad class of non-Markov models in order to let the data guide us toward the best model from this very broad class. Our method suggests that there exists a folding intermediate for the P5ab RNA hairpin whose zipping/unzipping is monitored by force spectroscopy experiments. This intermediate would not have been resolved if a Markov model had been assumed from the onset. We compare the merits of our method with those of others.
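As a rough illustration of assumption (ii), and not the authors' method: in a discrete-state Markov model, the time spent in a state before leaving it is exponentially distributed, so its coefficient of variation (standard deviation divided by mean) is 1. If an observed state actually hides a sequential intermediate, the observed dwell time is a sum of exponential steps and the coefficient of variation drops below 1. The sketch below simulates both cases; the rates, sample size, and function names are hypothetical choices made for this example only.

```python
import random
import statistics


def coefficient_of_variation(samples):
    """Standard deviation divided by mean; close to 1 for exponential data."""
    return statistics.pstdev(samples) / statistics.mean(samples)


def simulate_dwells(n, rates, rng):
    """Simulate n dwell times for an observed state that hides
    len(rates) sequential exponential steps (one rate per step)."""
    return [sum(rng.expovariate(k) for k in rates) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is reproducible

# Case 1: a single Markovian exit step (hypothetical rate 1.0).
markov_dwells = simulate_dwells(20000, [1.0], rng)

# Case 2: two hidden sequential steps (hypothetical rates 2.0 each),
# chosen so the mean dwell time matches case 1.
hidden_dwells = simulate_dwells(20000, [2.0, 2.0], rng)

cv_markov = coefficient_of_variation(markov_dwells)
cv_hidden = coefficient_of_variation(hidden_dwells)

print(f"single-step state CV:        {cv_markov:.2f}")
print(f"hidden-intermediate state CV: {cv_hidden:.2f}")
```

With these settings the first value should come out close to 1 and the second close to 1/sqrt(2), which is the kind of departure from exponential dwell times that a model fixed in advance to be Markov over the observed states cannot represent.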

Journal ArticleDOI
09 Jul 2014-PLOS ONE
TL;DR: Analysis of the local organization of PSII supercomplexes in ordered and disordered phases found evidence that interactions among light-harvesting antenna complexes are weakened in the absence of SOQ1, inducing protein rearrangements that favor larger separations between PSII complexes in the majority (disordered) phase and reshaping the PSII crystallization landscape.
Abstract: Photoautotrophic organisms efficiently regulate absorption of light energy to sustain photochemistry while promoting photoprotection. Photoprotection is achieved in part by triggering a series of dissipative processes termed non-photochemical quenching (NPQ), which depend on the re-organization of photosystem (PS) II supercomplexes in thylakoid membranes. Using atomic force microscopy, we characterized the structural attributes of grana thylakoids from Arabidopsis thaliana to correlate differences in PSII organization with the role of SOQ1, a recently discovered thylakoid protein that prevents formation of a slowly reversible NPQ state. We developed a statistical image analysis suite to discriminate disordered from crystalline particles and classify crystalline arrays according to their unit cell properties. Through detailed analysis of the local organization of PSII supercomplexes in ordered and disordered phases, we found evidence that interactions among light-harvesting antenna complexes are weakened in the absence of SOQ1, inducing protein rearrangements that favor larger separations between PSII complexes in the majority (disordered) phase and reshaping the PSII crystallization landscape. The features we observe are distinct from known protein rearrangements associated with NPQ, providing further support for a role of SOQ1 in a novel NPQ pathway. The particle clustering and unit cell methodology developed here is generalizable to multiple types of microscopy and will enable unbiased analysis and comparison of large data sets.

Journal ArticleDOI
TL;DR: The presence of 93 species was confirmed through validated catch records and sightings made in artisanal and commercial fisheries and on specific research cruises, although 30 of these species were encountered rarely.
Abstract: A review of the primary literature on the cartilaginous fishes (sharks, skates, rays and chimaeras), together with new information suggests that 106 species occur in Chilean waters, comprising 58 sharks, 30 skates, 13 rays and five chimaeras. The presence of 93 species was confirmed, although 30 species were encountered rarely, through validated catch records and sightings made in artisanal and commercial fisheries and on specific research cruises. Overall, only 63 species appear to have a range distribution that normally includes Chilean waters. Actual reliable records of occurrence are lacking for 13 species. Chile has a cartilaginous fish fauna that is relatively impoverished compared with the global species inventory, but conservative compared with countries in South America with warm-temperate waters. The region of highest species richness occurs in the mid-Chilean latitudes of c. 30–40° S. This region represents a transition zone with a mix of species related to both the warm-temperate Peruvian province to the north and cold-temperate Magellan province to the south. This study provides clarification of species occurrence and the functional biodiversity of Chile's cartilaginous fish fauna.

Journal ArticleDOI
TL;DR: This special issue of Chemical Reviews has assembled some of the leading experts in the field with the goal of having them present the current state-of-the-art of these single molecule methods and their applications.
Abstract: One of the first concepts of molecules dates back to the pre-Socratic Greek philosopher Empedocles, who intuited around 450 B.C. that the four “elements” (fire, earth, air, and water) experience “forces” of attraction (“love”) and repulsion (“strife”) that allow them to mix and separate in ways that induced the origin and development of life. These ideas were refined by Democritus, who conceived the existence of atoms or indivisible particles and whose ideas were preserved for posterity by the Latin poet Lucretius. This knowledge was lost with the end of the Roman Empire, and reemerged only after almost two millennia in the 1661 treatise “The Sceptical Chymist”, wherein Robert Boyle hypothesized that matter is composed of clusters of “corpuscles” capable of arranging themselves into groups. By the 1770s, the theory of molecules as particles endowed with hooks and barbs holding them together was generally referred to as “Cartesian Chemistry” in honor of Descartes’ work on the subject. It would take another 200 years to develop sophisticated microscopes that could visualize single molecules based on either their optical absorption or fluorescence, and mechanically manipulate and detect them through the use of magnetic tweezers, optical tweezers, and atomic force microscopes. In the years since the publication of these pioneering studies, the initial skepticism about the power of single molecule methods has been replaced with their enthusiastic adoption by a growing number of scientists. This special issue of Chemical Reviews has assembled some of the leading experts in the field with the goal of having them present the current state-of-the-art of these single molecule methods and their applications. As will become apparent throughout these contributions, the ability to interrogate individual molecules and follow their molecular trajectories provides direct access to the detailed molecular mechanisms underlying biomolecular processes.
In particular, single molecule studies naturally avoid the ensemble or population average that characterizes bulk methods, and can provide direct access not only to the mean values but also the higher moments of the kinetic coefficients that characterize a dynamic process. Furthermore, single molecule methods can give access to nonuniform kinetic behavior as well as transient and rare or unlikely molecular states that are all but impossible to study in bulk. Similarly, complex spatiotemporal distributions and properties of ensembles of single molecules in an individual cell or specimen can be revealed. Constant refinement of the microscopy tools available often has led and will continue to lead to the rapid emergence of entirely new areas of inquiry. Accomplishments have come in rapid succession over the past two decades, and the future of single molecule approaches appears assured as they are applied to samples of increasing complexity, including whole organisms. In this thematic issue, the assembled 11 reviews represent a snapshot of many of the techniques and insights emerging from the field. Naturally, the selection of articles is limited by our own biases and the availability of authors. While we focus on current applications in the molecular biosciences, single molecule methods can equally be applied to scientific problems in, for example, materials science. It is therefore in a spirit of humble awe of what is already possible and what may still become feasible in years to come across all fields of science that we offer this inherently limited snapshot in hopes that it may help inspire new applications, technical advances, and recruitment of the brightest young minds. First, Greene and colleagues remind us that DNA often plays an active, dynamic role in directing the biological functions of DNA/protein complexes. 
After establishing a conceptual framework for the interactions between DNA and its effector proteins, the authors review how their fluorescence microscopy based single molecule DNA curtains have opened a window onto biologically important aspects of this framework. Next, Wuite and colleagues introduce us to optical tweezers as a complementary tool to probe DNA/protein interactions. They highlight the plethora of assays that have been developed to perform single molecule force measurements that quantify the mechanical properties of DNA. Upon binding protein partners, these forces are changed in a way that often provides unique mechanistic insight into the mechanics and energetics of DNA/protein complexes. On occasion, optical tweezers observations are combined with fluorescence microscopy to more completely monitor force-induced changes. This set of tools has led to insights across a whole variety of biological processes involving the replication, maintenance, and expression of genetic information. Ando and colleagues then introduce us to recent advances in atomic force microscopy (AFM) that have allowed them to directly image any biopolymer at near video rates without the need for attaching fluorophore labels or handles for tweezing. High-speed AFM relies on a tiny cantilever tip bouncing over an atomically flat surface that is coated with the single molecules of interest, typically bound to a scaffold or membrane. The range of dynamic samples visualized so far spans from motor proteins like myosin walking on actin and membrane-bound energy-conversion enzymes such as F1ATPase and bacteriorhodopsin to self-aggregating proteins and cellulose-digesting enzymes, and even includes live cells. In an outlook, the authors share their perspective of how this nascent imaging tool may further evolve. Direct observation of the dynamic spatiotemporal distributions of proteins in the cell is the topic of the review by Lippincott-Schwartz and colleagues. 
These and other authors in the field have developed a set of superresolution techniques

Posted ContentDOI
24 Jul 2014-bioRxiv
TL;DR: It is shown that bait length has an important influence on library enrichment and WGC biases against the shorter molecules that are enriched in SSL preparation protocols, and application of WGC to such samples is not recommended without future optimization.
Abstract: 1. The application of whole genome capture (WGC) methods to ancient DNA (aDNA) promises to increase the efficiency of ancient genome sequencing. 2. We compared the performance of two recently developed WGC methods in enriching human aDNA within Illumina libraries built using both double-stranded (DSL) and single-stranded (SSL) build protocols. Although both methods effectively enriched aDNA, one consistently produced marginally better results, giving us the opportunity to further explore the parameters influencing WGC experiments. 3. Our results suggest that bait length has an important influence on library enrichment. Moreover, we show that WGC biases against the shorter molecules that are enriched in SSL preparation protocols. Therefore application of WGC to such samples is not recommended without future optimization. Lastly, we document the effect of WGC on other features including clonality, GC composition and repetitive DNA content of captured libraries. 4. Our findings provide insights for researchers planning to perform WGC on aDNA, and suggest future tests and optimization to improve WGC efficiency.

19 Jun 2014
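TL;DR: The authors show that kinetic models can be extracted from single molecule force spectroscopy data without assuming either (i) a prespecified number of discrete states or (ii) Markovian transitions between states, focusing their analysis on the zipping/unzipping transitions of an RNA hairpin.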
Abstract: Extracting kinetic models from single molecule data is an important route to mechanistic insight in biophysics, chemistry, and biology. Data collected from force spectroscopy can probe discrete hops of a single molecule between different conformational states. Model extraction from such data is a challenging inverse problem because single molecule data are noisy and rich in structure. Standard modeling methods normally assume (i) a prespecified number of discrete states and (ii) that transitions between states are Markovian. The data set is then fit to this predetermined model to find a handful of rates describing the transitions between states. We show that it is unnecessary to assume either (i) or (ii) and focus our analysis on the zipping/unzipping transitions of an RNA hairpin. The key is in starting with a very broad class of non-Markov models in order to let the data guide us toward the best model from this very broad class. Our method suggests that there exists a folding intermediate for the P5ab RNA hairpin whose zipping/unzipping is monitored by force spectroscopy experiments. This intermediate would not have been resolved if a Markov model had been assumed from the onset. We compare the merits of our method with those of others.


Patent
02 May 2014
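TL;DR: A method is provided for capturing DNA molecules in solution, in which adaptor-ligated DNA extracted from a sample containing endogenous and environmental DNA is hybridized with affinity-tagged RNA probes transcribed in vitro from a fragmented reference genomic DNA library, captured on a substrate, washed, and released.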
Abstract: Provided herein is a method for capturing DNA molecules in solution. The method may comprise: extracting DNA from a sample that comprises endogenous DNA and environmental DNA to produce extracted DNA; ligating universal adaptors to the extracted DNA; hybridizing the extracted DNA, in solution, with affinity-tagged RNA probes generated by: in vitro transcribing a library of fragmented reference genomic DNA that has been ligated to an RNA promoter adaptor, in the presence of an affinity-tagged ribonucleotide; binding the product with a capture agent that is tethered to a substrate in the presence of RNA oligonucleotides that are complementary to the adaptors, thereby capturing the hybridized DNA molecules on the substrate; washing the substrate to remove any unbound DNA molecules; and releasing the captured DNA molecules. A kit for performing the method is also provided.