scispace - formally typeset
Author

Bruce W. Birren

Bio: Bruce W. Birren is an academic researcher at the Broad Institute. The author has contributed to research on topics including Genome and Gene. The author has an h-index of 103 and has co-authored 205 publications receiving 113,491 citations. Previous affiliations of Bruce W. Birren include the Massachusetts Institute of Technology and the California Institute of Technology.
Topics: Genome, Gene, Genomics, Population, Human genome


Papers
Journal ArticleDOI
TL;DR: The complex dynamic of dengue in Cambodia in the last ten years has been associated with a combination of stochastic climatic events, cocirculation, coevolution, adaptation to different vector populations, and with the human population immunological landscape.

15 citations

Journal ArticleDOI
TL;DR: In this article, the full coding region from so far poorly characterized variants of HCV genotype 4 was amplified and sequenced using a long range PCR technique and analyzed with respect to phylogenetic relationship, possible recombination and prominent sequence characteristics compared to other known HCV strains.
Abstract: Infection with genotype 4 of the Hepatitis C virus is common in Africa and the Mediterranean area, but has also been found at increasing frequencies in injection drug users in Europe and North America. Full length viral sequences to characterize viral diversity and structure have recently become available mostly for subtype 4a, and studies in Egypt and Saudi Arabia, where high proportions of subtype 4a infected patients exist, have begun to establish optimized treatment regimens. However knowledge about other subtype variants of genotype 4 present in less developed African states is lacking. In this study the full coding region from so far poorly characterized variants of HCV genotype 4 was amplified and sequenced using a long range PCR technique. Sequences were analyzed with respect to phylogenetic relationship, possible recombination and prominent sequence characteristics compared to other known HCV strains. We present for the first time two full-length sequences from the HCV genotype 4k, in addition to five strains from HCV genotypes 4d and 4f. Reference sequences for accurate HCV genotyping are required for optimized treatment, and a better knowledge of the global viral sequence diversity is needed to guide vaccines or new drugs effective in the world wide epidemic.

14 citations

Journal ArticleDOI
15 Jul 1994-Genomics
TL;DR: Screening a human bacterial artificial chromosome (BAC) library using the total pool of clones from a chromosome 22-specific cosmid library as a composite probe to rapidly identify chromosome-specific subsets of clones from a total human genomic library is described.

13 citations

Journal ArticleDOI
Patrick F. Sullivan, Jennifer R. S. Meadows, Steven Gazal, BaDoi N. Phan, Gregory R. Andrews, Sharadha Sakthikumar, Jessika Nordin, Ananya Roy, Chao Wang, James Xue, Shuyang Yao, Quan Sun, Jin P. Szatkiewicz, Jia Wen, Laura M. Huckins, Zhili Zheng, Jian Zeng, Naomi R. Wray, Yun Li, Jessica S. Johnson, Jiawen Chen, Benedict Paten, Zhiping Weng, Andreas R. Pfenning, Elinor K. Karlsson, Joel C. Armstrong, Matteo Bianchi, Bruce W. Birren, Kevin R. Bredemeyer, Ana M Breit, Matthew J. Christmas, Hiram Clawson, Joana Damas, Federica Di Palma, Mark Diekhans, Michael X. Dong, Eduardo Eizirik, Kaili Fan, Cornelia E. Fanter, Nicole M. Foley, Karin Forsberg-Nilsson, John Gatesy, Diane P. Genereux, Linda Goodman, Jenna R. Grimshaw, Michaela K. Halsey, Andrew J. Harris, Glenn Hickey, Michael Hiller, Allyson Hindle, Robert Hubley, Graham M. Hughes, Jeremy A. Johnson, David Juan, Irene M. Kaplow, Kathleen C. Keough, Bogdan M. Kirilenko, Klaus-Peter Koepfli, Jennifer M. Korstian, Amanda Kowalczyk, Sergey V. Kozyrev, Alyssa J. Lawler, Colleen Lawless, Thomas Lehmann, Daniel Lévesque, Harris A. Lewin, Xue Li, Abigail L. Lind, Kerstin Lindblad-Toh, Ava Mackay-Smith, Voichita D. Marinescu, Tomas Marques-Bonet, Victor C. Mason, Wynn K. Meyer, Jill Moore, Lucas R. Moreira, Diana D. Moreno-Santillán, Kathleen Morrill, Gerard Muntané, William J. Murphy, Arcadi Navarro, Martin T. Nweeia, Sylvia Ortmann, Austin B. Osmanski, Nicole Paulat, Katherine S. Pollard, Henry Pratt, David A. Ray, Steven K. Reilly, Jeb Rosen, Irina Ruf, Louise Ryan, Oliver A. Ryder, Pardis C. Sabeti, Daniel E. Schäffer, Aitor Serres, Beth Shapiro, Arian F.A. Smit, Mark S. Springer, Chaitanya Srinivasan, Cynthia C. Steiner, Jessica M. Storer, Kevin A.M. Sullivan, Elisabeth Sundström, Megan A. Supple, Ross Swofford, Joy-El R B Talbot, Emma C. Teeling, Jason Turner-Maier, Alejandro Valenzuela, Franziska Wagner, Ola Wallerman, Juehan Wang, Aryn P. Wilder, Morgan Wirthlin, Xiaomeng Zhang 
10 Mar 2023-Science
TL;DR: In this article, single base phyloP scores from the whole genome alignment of 240 placental mammals identified 3.5% of the human genome as significantly constrained, and likely functional.
Abstract: Although thousands of genomic regions have been associated with heritable human diseases, attempts to elucidate biological mechanisms are impeded by a general inability to discern which genomic positions are functionally important. Evolutionary constraint is a powerful predictor of function that is agnostic to cell type or disease mechanism. Here, single base phyloP scores from the whole genome alignment of 240 placental mammals identified 3.5% of the human genome as significantly constrained, and likely functional. We compared these scores to large-scale genome annotation, genome-wide association studies (GWAS), copy number variation, clinical genetics findings, and cancer data sets. Evolutionarily constrained positions are enriched for variants explaining common disease heritability (more than any other functional annotation). Our results improve variant annotation but also highlight that the regulatory landscape of the human genome still needs to be further explored and linked to disease.

13 citations

01 Jan 2003
TL;DR: The mathematical and algorithmic results underpinning the analysis of the genome sequences of S. paradoxus, S. mikatae and S. bayanus are described and the methods for the automatic determination of genome correspondence are presented, demonstrating the power of comparative genomics to further the understanding of any species.
Abstract: In Kellis et al. (2003), we reported the genome sequences of S. paradoxus, S. mikatae and S. bayanus and compared these three yeast species to their close relative, S. cerevisiae. Genome-wide comparative analysis allowed the identification of functionally important sequences, both coding and non-coding. In this companion paper we describe the mathematical and algorithmic results underpinning the analysis of these genomes. We present methods for the automatic determination of genome correspondence. The algorithms enabled the automatic identification of orthologs for more than 90% of genes and intergenic regions across the four species despite the large number of duplicated genes in the yeast genome. The remaining ambiguities in the gene correspondence revealed recent gene family expansions in regions of rapid genomic change. We present methods for the identification of protein-coding genes based on their patterns of nucleotide conservation across related species. We observed the pressure to conserve the reading frame of functional proteins and developed a test for gene identification with high sensitivity and specificity. We used this test to revisit the genome of S. cerevisiae, reducing the overall gene count by 500 genes (10% of previously annotated genes) and refining the gene structure of hundreds of genes. We present novel methods for the systematic de novo identification of regulatory motifs. The methods do not rely on previous knowledge of gene function and in that way differ from the current literature on computational motif discovery. Based on the genome-wide conservation patterns of known motifs, we developed three conservation criteria that we used to discover novel motifs. We used an enumeration approach to select strongly conserved motif cores, which we extended and collapsed into a small number of candidate regulatory motifs. These include most previously known regulatory motifs as well as several noteworthy novel motifs.
The majority of discovered motifs are enriched in functionally related genes, allowing us to infer a candidate function for novel motifs. Our results demonstrate the power of comparative genomics to further our understanding of any species. Our methods are validated by the extensive experimental knowledge in yeast, and will be invaluable in the study of complex genomes like that of human.

12 citations


Cited by
Journal ArticleDOI
Eric S. Lander, Lauren Linton, Bruce W. Birren, Chad Nusbaum +245 more; Institutions (29)
15 Feb 2001-Nature
TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.
Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal ArticleDOI
TL;DR: The GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.
Abstract: Next-generation DNA sequencing (NGS) projects, such as the 1000 Genomes Project, are already revolutionizing our understanding of genetic variation among individuals. However, the massive data sets generated by NGS—the 1000 Genome pilot alone includes nearly five terabases—make writing feature-rich, efficient, and robust analysis tools difficult for even computationally sophisticated individuals. Indeed, many professionals are limited in the scope and the ease with which they can answer scientific questions by the complexity of accessing and manipulating the data produced by these machines. Here, we discuss our Genome Analysis Toolkit (GATK), a structured programming framework designed to ease the development of efficient and robust analysis tools for next-generation DNA sequencers using the functional programming philosophy of MapReduce. The GATK provides a small but rich set of data access patterns that encompass the majority of analysis tool needs. Separating specific analysis calculations from common data management infrastructure enables us to optimize the GATK framework for correctness, stability, and CPU and memory efficiency and to enable distributed and shared memory parallelization. We highlight the capabilities of the GATK by describing the implementation and application of robust, scale-tolerant tools like coverage calculators and single nucleotide polymorphism (SNP) calling. We conclude that the GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.

20,557 citations

28 Jul 2005
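TL;DR: PfPMP1 interacts with single or multiple receptors on infected erythrocytes, dendritic cells and the placenta, and plays a key role in adhesion and immune evasion.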
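Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte surface protein 1 (PfPMP1), expressed on the surface of infected erythrocytes, interacts with single or multiple receptors on infected erythrocytes, endothelial cells, dendritic cells and the placenta, and plays a key role in adhesion and immune evasion. The var gene family of each haploid genome encodes about 60 members, and initiating transcription of different var gene variants provides the molecular basis for antigenic variation.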

18,940 citations

Journal ArticleDOI
TL;DR: SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies.
Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V−SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online (http://bioinf.spbau.ru/spades). It is distributed as open source software.

16,859 citations

Journal ArticleDOI
TL;DR: The Trinity method for de novo assembly of full-length transcripts is presented and evaluated on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available, providing a unified solution for transcriptome reconstruction in any sample.
Abstract: Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.

15,665 citations