Other affiliations: Lawrence Berkeley National Laboratory, University of California, University of Miami
Bio: Carlos Bustamante is an academic researcher at Stanford University. The author has contributed to research on the topics Population and Genome-wide association study. The author has an h-index of 161 and has co-authored 770 publications receiving 106,053 citations. Previous affiliations of Carlos Bustamante include Lawrence Berkeley National Laboratory and University of California.
Adam Auton, Gonçalo R. Abecasis, David Altshuler, Richard Durbin, and 514 more authors (90 institutions)
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. It reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.
TL;DR: Under a longitudinal stress of approximately 65 pN, dsDNA molecules in aqueous buffer undergo a highly cooperative transition into a stable form with a rise of 5.8 angstroms per base pair, that is, 70% longer than B-form dsDNA. This overstretched form may play a significant role in the energetics of DNA recombination.
Abstract: Single molecules of double-stranded DNA (dsDNA) were stretched with force-measuring laser tweezers. Under a longitudinal stress of approximately 65 piconewtons (pN), dsDNA molecules in aqueous buffer undergo a highly cooperative transition into a stable form with 5.8 angstroms rise per base pair, that is, 70% longer than B form dsDNA. When the stress was relaxed below 65 pN, the molecules rapidly and reversibly contracted to their normal contour lengths. This transition was affected by changes in the ionic strength of the medium and the water activity or by cross-linking of the two strands of dsDNA. Individual molecules of single-stranded DNA were also stretched giving a persistence length of 7.5 angstroms and a stretch modulus of 800 pN. The overstretched form may play a significant role in the energetics of DNA recombination.
TL;DR: Deviations from the force curves predicted by the freely jointed chain model suggest that DNA has significant local curvature in solution. The large effect of bend-inducing cis-diamminedichloroplatinum (II) supports the hypothesis of natural curvature in DNA.
Abstract: Single DNA molecules were chemically attached by one end to a glass surface and by their other end to a magnetic bead. Equilibrium positions of the beads were observed in an optical microscope while the beads were acted on by known magnetic and hydrodynamic forces. Extension versus force curves were obtained for individual DNA molecules at three different salt concentrations with forces between 10^-14 and 10^-11 newtons. Deviations from the force curves predicted by the freely jointed chain model suggest that DNA has significant local curvature in solution. Ethidium bromide and 4',6-diamidino-2-phenylindole had little effect on the elastic response of the molecules, but their extent of intercalation was directly measured. Conversely, the effect of bend-inducing cis-diamminedichloroplatinum (II) was large and supports the hypothesis of natural curvature in DNA.
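The freely jointed chain model that this abstract uses as its baseline has a closed-form force-extension relation, x/L = coth(Fb/kT) - kT/(Fb), where b is the Kuhn segment length. The following is a minimal Python sketch of that textbook relation, not the authors' analysis code. The Kuhn length of 100 nm, the thermal energy of 4.11 pN·nm (roughly room temperature), and the function name are all assumptions chosen for illustration.

```python
import math

# Assumed thermal energy k_B*T near room temperature, in pN*nm.
K_B_T_PN_NM = 4.11


def fjc_relative_extension(force_pn, kuhn_length_nm):
    """Relative extension x/L of a freely jointed chain under tension.

    Implements the Langevin function L(a) = coth(a) - 1/a,
    with a = F * b / (k_B * T).
    """
    if force_pn <= 0:
        return 0.0
    a = force_pn * kuhn_length_nm / K_B_T_PN_NM
    if a < 1e-4:
        # Leading term of the series expansion; avoids cancellation error.
        return a / 3.0
    return 1.0 / math.tanh(a) - 1.0 / a


if __name__ == "__main__":
    # Forces spanning the range quoted in the abstract:
    # 10^-14 N to 10^-11 N is 0.01 pN to 10 pN.
    for force in (0.01, 0.1, 1.0, 10.0):
        x = fjc_relative_extension(force, kuhn_length_nm=100.0)
        print(f"F = {force:5.2f} pN  ->  x/L = {x:.4f}")
```

Comparing measured extensions against a curve like this one is how a deviation from freely jointed chain behavior would show up; the abstract reports such deviations but does not give the fitted parameters.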
TL;DR: The findings suggest that most human variation is rare, not shared between populations, and that rare variants are likely to play a role in human health, and show that large sample sizes will be required to associate rare variants with complex traits.
Abstract: As a first step toward understanding how rare variants contribute to risk for complex diseases, we sequenced 15,585 human protein-coding genes to an average median depth of 111× in 2440 individuals of European (n = 1351) and African (n = 1088) ancestry. We identified over 500,000 single-nucleotide variants (SNVs), the majority of which were rare (86% with a minor allele frequency less than 0.5%), previously unknown (82%), and population-specific (82%). On average, 2.3% of the 13,595 SNVs each person carried were predicted to affect protein function of ~313 genes per genome, and ~95.7% of SNVs predicted to be functionally important were rare. This excess of rare functional variants is due to the combined effects of explosive, recent accelerated population growth and weak purifying selection. Furthermore, we show that large sample sizes will be required to associate rare variants with complex traits.
28 Jul 2005
TL;DR: NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. It scales to hundreds of processors on high-end parallel platforms and tens of processors on low-cost commodity clusters, and also runs on individual desktop and laptop computers.
Abstract: NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD scales to hundreds of processors on high-end parallel platforms, as well as tens of processors on low-cost commodity clusters, and also runs on individual desktop and laptop computers. NAMD works with AMBER and CHARMM potential functions, parameters, and file formats. This article, directed to novices as well as experts, first introduces concepts and methods used in the NAMD program, describing the classical molecular dynamics force field, equations of motion, and integration methods along with the efficient electrostatics evaluation algorithms employed and temperature and pressure controls used. Features for steering the simulation across barriers and for calculating both alchemical and conformational free energy differences are presented. The motivations for and a roadmap to the internal design of NAMD, implemented in C++ and based on Charm++ parallel objects, are outlined. The factors affecting the serial and parallel performance of a simulation are discussed. Finally, typical NAMD use is illustrated with representative applications to a small, a medium, and a large biomolecular system, highlighting particular features of NAMD, for example, the Tcl scripting language. The article also provides a list of the key features of NAMD and discusses the benefits of combining NAMD with the molecular graphics/sequence analysis software VMD and the grid computing/collaboratory software BioCoRE. NAMD is distributed free of charge with source code at www.ks.uiuc.edu.
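The abstract mentions equations of motion and integration methods without naming a specific integrator. As a generic illustration of how a molecular dynamics time step advances positions and velocities, here is a minimal Python sketch of the velocity Verlet scheme, a common choice in molecular dynamics codes, applied to a one-dimensional harmonic oscillator. It is not NAMD code, and the function name, spring constant, and step size are hypothetical.

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Integrate Newton's equations in 1D with the velocity Verlet scheme.

    `force` is a callable returning the force at position x.
    Returns the list of (x, v) pairs, including the initial state.
    """
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # finish the velocity update
        trajectory.append((x, v))
    return trajectory


if __name__ == "__main__":
    k = 1.0  # assumed spring constant for the toy system

    def spring(x):
        return -k * x

    traj = velocity_verlet(x=1.0, v=0.0, force=spring,
                           mass=1.0, dt=0.01, steps=1000)
    x_end, v_end = traj[-1]
    energy = 0.5 * v_end ** 2 + 0.5 * k * x_end ** 2
    print(f"final x = {x_end:.4f}, total energy = {energy:.6f}")
```

The scheme evaluates the force once per step and keeps the total energy close to its initial value over long runs, which is why integrators of this family are popular for biomolecular simulation.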
14 Dec 2012
TL;DR: It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.
Abstract: The relationship between the two estimates of genetic variation at the DNA level, namely the number of segregating sites and the average number of nucleotide differences estimated from pairwise comparison, is investigated. It is found that the correlation between these two estimates is large when the sample size is small, and decreases slowly as the sample size increases. Using the relationship obtained, a statistical method for testing the neutral mutation hypothesis is developed. This method needs only the data of DNA polymorphism, namely the genetic variation within population at the DNA level. A simple method of computer simulation, that was used in order to obtain the distribution of a new statistic developed, is also presented. Applying this statistical method to the five regions of DNA sequences in Drosophila melanogaster, it is found that large insertion/deletion (greater than 100 bp) is deleterious. It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.
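The test described in this abstract compares two estimates of genetic variation: one based on the number of segregating sites and one based on the average number of pairwise nucleotide differences. The statistic is widely known as Tajima's D. Below is a minimal Python sketch using the standard published constants; the function name and the toy alignment are hypothetical, and the sketch assumes equal-length, gap-free sequences.

```python
from itertools import combinations
import math


def tajimas_d(sequences):
    """Return (S, pi, D) for a list of aligned, equal-length sequences.

    S  : number of segregating (polymorphic) sites
    pi : average number of pairwise nucleotide differences
    D  : normalized difference between pi and S / a1
    """
    n = len(sequences)
    if n < 4:
        # For n < 4 the variance term is zero, so D is undefined.
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must have equal length")

    # Count columns containing more than one distinct character.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)

    # Average Hamming distance over all unordered pairs.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)

    if seg_sites == 0:
        return 0, pi, math.nan  # no variation, D is undefined

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    d = (pi - seg_sites / a1) / math.sqrt(variance)
    return seg_sites, pi, d


if __name__ == "__main__":
    # Toy alignment in which every variant is carried by one sequence.
    toy = ["AAAAA", "AAAAT", "AAATA", "AATAA"]
    s, pi, d = tajimas_d(toy)
    print(f"S = {s}, pi = {pi:.3f}, D = {d:.3f}")
```

An excess of rare variants, as in the toy alignment, pulls the pairwise estimate below the segregating-sites estimate and makes D negative. Significance would normally be judged against a simulated or tabulated null distribution, which this sketch does not include.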
01 Jan 2006
TL;DR: This work covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.