Author

Carlos Bustamante

Bio: Carlos Bustamante is an academic researcher from Stanford University. The author has contributed to research in topics: Population & Optical tweezers. The author has an h-index of 161 and has co-authored 770 publications receiving 106,053 citations. Previous affiliations of Carlos Bustamante include Lawrence Berkeley National Laboratory & University of California.


Papers
Journal Article
TL;DR: Phylogenetic analysis based on mtDNA revealed low genetic divergence among longnose skates, in particular those dwelling on the continental shelf and slope off the coasts of Chile and Argentina.
Abstract: The complete mitochondrial genome of the roughskin skate Dipturus trachyderma is described from 1 455 724 sequences obtained using Illumina NGS technology. Total length of the mitogenome was 16 909 base pairs, comprising 2 rRNAs, 13 protein-coding genes, 22 tRNAs and 2 non-coding regions. Phylogenetic analysis based on mtDNA revealed low genetic divergence among longnose skates, in particular those dwelling on the continental shelf and slope off the coasts of Chile and Argentina.

9 citations

Posted Content
13 Apr 2019-bioRxiv
TL;DR: Digitization of human and veterinary health information will continue to be a reality, particularly in the form of unstructured narratives, and the use of LSTM-RNN models represents a scalable structure that could prove useful in cohort identification for comparative oncology studies.
Abstract: Objective: Unstructured clinical narratives are continuously being recorded as part of delivery of care in electronic health records, and dedicated tagging staff spend considerable effort manually assigning clinical codes for billing purposes; despite these efforts, label availability and accuracy are both suboptimal. Materials and Methods: In this retrospective study, we trained long short-term memory (LSTM) recurrent neural networks (RNNs) on 52,722 human and 89,591 veterinary records. We investigated the accuracy of both separate-domain and combined-domain models and probed model portability. We established relevant baselines by training Decision Trees (DT) and Random Forests (RF), and using MetaMap Lite, a clinical natural language processing tool. Results: We show that the LSTM-RNNs accurately classify veterinary and human text narratives into top-level categories with an average weighted macro F1 score of 0.74 and 0.68 respectively. In the “neoplasia” category, the model built with veterinary data has a high accuracy in veterinary data, and moderate accuracy in human data, with F1 scores of 0.91 and 0.70 respectively. Our LSTM method scored slightly higher than that of the DT and RF models. Discussion: The use of LSTM-RNN models represents a scalable structure that could prove useful in cohort identification for comparative oncology studies. Conclusion: Digitization of human and veterinary health information will continue to be a reality, particularly in the form of unstructured narratives. Our approach is a step forward for these two domains to learn from, and inform, one another.

9 citations

Journal Article
TL;DR: The conditions under which circular intensity differential scattering can be measurable in the soft X‐ray region of the spectrum are established and the parameter which determines the strength of the preferential interaction of chiral molecules with opposite circular polarizations at high energies is the anisotropy of the atomic polarizabilities in the molecule.
Abstract: In this paper we present a general review of some of the new branches in the field of optical activity that have been developed during the last five years. Also, the conditions under which circular intensity differential scattering can be measurable in the soft X-ray region of the spectrum are established. It is found that the parameter which determines the strength of the preferential interaction of chiral molecules with opposite circular polarizations at these high energies is the anisotropy of the atomic polarizabilities in the molecule. The possibility of extending the other techniques discussed here to shorter wavelengths is also considered.

9 citations

Book Chapter
01 Jan 2005
TL;DR: The rapid evolution of this novel gene, Sdic, is presented as a fascinating example of male-driven evolution incurred by recurrent selective sweeps.
Abstract: The Sdic gene cluster at the base of the X-chromosome is unique to the lineage of Drosophila melanogaster. The repeating unit in the cluster was formed from a duplication and fusion of the genes, AnnX and Cdic, which juxtaposed the 3′ untranslated region of AnnX to the third intron of Cdic. AnnX encodes Annexin 10 and Cdic encodes a cytoplasmic dynein intermediate chain. The 3′ untranslated region of AnnX contains two promoter elements, including a testis-specific element, and Cdic intron 3 contains a third promoter element; together these elements result in testis-specific transcription of Sdic. The Sdic protein features a novel amino terminus derived in part from Cdic intron 3 which contains motifs similar to those in axonemal dyneins. It has been demonstrated that the Sdic protein becomes incorporated into the tails of mature sperm. The evolution of the Sdic cluster required several deletions, at least one insertion, at least eleven nucleotide substitutions, and an estimated tenfold tandem duplication, all of which took place in the 1–3 million years since the divergence of D. melanogaster from D. simulans. Evidence for the ongoing evolution of Sdic including a recent selective sweep is found in the low levels of polymorphism across neighboring genes in the region, a large number of fixed amino acid replacements relative to fixed synonymous nucleotide substitutions, and a frequency spectrum of polymorphic nucleotides skewed toward rare variants. The analysis of polymorphism and divergence in the Sdic region, however, is complicated by the possible effects of background selection caused by deleterious new mutations, owing to the reduced amount of recombination in the region associated with its proximity to centromeric heterochromatin. We present the rapid evolution of this novel gene as a fascinating example of male-driven evolution incurred by recurrent selective sweeps.

9 citations


Cited by
28 Jul 2005
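TL;DR: Plasmodium falciparum erythrocyte surface protein 1 (PfPMP1) interacts with single or multiple receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.
Abstract: Antigenic variation makes it easier for many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte surface protein 1 (PfPMP1), expressed on the surface of infected erythrocytes, interacts with single or multiple receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. The var gene family in each haploid genome encodes about 60 members, and switching transcription among different var gene variants provides the molecular basis for antigenic variation.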

18,940 citations

Journal Article
TL;DR: NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. It scales to hundreds of processors on high-end parallel platforms and tens of processors on low-cost commodity clusters, and also runs on individual desktop and laptop computers.
Abstract: NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD scales to hundreds of processors on high-end parallel platforms, as well as tens of processors on low-cost commodity clusters, and also runs on individual desktop and laptop computers. NAMD works with AMBER and CHARMM potential functions, parameters, and file formats. This article, directed to novices as well as experts, first introduces concepts and methods used in the NAMD program, describing the classical molecular dynamics force field, equations of motion, and integration methods along with the efficient electrostatics evaluation algorithms employed and temperature and pressure controls used. Features for steering the simulation across barriers and for calculating both alchemical and conformational free energy differences are presented. The motivations for and a roadmap to the internal design of NAMD, implemented in C++ and based on Charm++ parallel objects, are outlined. The factors affecting the serial and parallel performance of a simulation are discussed. Finally, typical NAMD use is illustrated with representative applications to a small, a medium, and a large biomolecular system, highlighting particular features of NAMD, for example, the Tcl scripting language. The article also provides a list of the key features of NAMD and discusses the benefits of combining NAMD with the molecular graphics/sequence analysis software VMD and the grid computing/collaboratory software BioCoRE. NAMD is distributed free of charge with source code at www.ks.uiuc.edu.

14,558 citations

Journal Article
Adam Auton, Gonçalo R. Abecasis, David Altshuler, Richard Durbin, and 514 more authors (90 institutions)
01 Oct 2015-Nature
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. It reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

12,661 citations

Journal Article
Fumio Tajima
30 Oct 1989-Genomics
TL;DR: It is suggested that natural selection against large insertions/deletions is so weak that a large amount of variation is maintained in a population.

11,521 citations