Author

Sam Rash

Bio: Sam Rash is an academic researcher from the Joint Genome Institute. The author has contributed to research on the topics Gene and Genome, has an h-index of 9, and has co-authored 10 publications receiving 4,756 citations. Previous affiliations of Sam Rash include the Agency for Science, Technology and Research and the United States Department of Energy.
Topics: Gene, Genome, Chromosome 19, Chromosome 16, Autosome

Papers
Journal Article
Paramvir S. Dehal, Yutaka Satou, Robert K. Campbell, Jarrod Chapman, Bernard M. Degnan, Anthony W. De Tomaso, Brad Davidson, Anna Di Gregorio, Maarten D. Sollewijn Gelpke, David Goodstein, Naoe Harafuji, Kenneth E. M. Hastings, Isaac Ho, Kohji Hotta, Wayne Huang, Takeshi Kawashima, Patrick Lemaire, Diego Martinez, Ian A. Meinertzhagen, Simona Necula, Masaru Nonaka, Nik Putnam, Sam Rash, Hidetoshi Saiga, Masanobu Satake, Astrid Terry, Lixy Yamada, Hong Gang Wang, Satoko Awazu, Kaoru Azumi, Jeffrey L. Boore, Margherita Branno, Stephen T. Chin-Bow, Rosaria DeSantis, Sharon A. Doyle, Pilar Francino, David N. Keys, Shinobu Haga, Hiroko Hayashi, Kyosuke Hino, Kaoru S. Imai, Kazuo Inaba, Shungo Kano, Kenji Kobayashi, Mari Kobayashi, Byung In Lee, Kazuhiro W. Makabe, Chitra Manohar, Giorgio Matassi, Mónica Medina, Yasuaki Mochizuki, Steve Mount, Tomomi Morishita, Sachiko Miura, Akie Nakayama, Satoko Nishizaka, Hisayo Nomoto, Fumiko Ohta, Kazuko Oishi, Isidore Rigoutsos, Masako Sano, Akane Sasaki, Yasunori Sasakura, Eiichi Shoguchi, Tadasu Shin-I, Antoinetta Spagnuolo, Didier Y.R. Stainier, Miho Suzuki, Olivier Tassy, Naohito Takatori, Miki Tokuoka, Kasumi Yagi, Fumiko Yoshizaki, Shuichi Wada, Cindy Zhang, P. Douglas Hyatt, Frank W. Larimer, Chris Detter, Norman A. Doggett, Tijana Glavina, Trevor Hawkins, Paul G. Richardson, Susan Lucas, Yuji Kohara, Michael Levine, Nori Satoh, Daniel S. Rokhsar
13 Dec 2002-Science
TL;DR: A draft of the protein-coding portion of the genome of the most studied ascidian, Ciona intestinalis, is generated, suggesting that ascidians contain the basic ancestral complement of genes involved in cell signaling and development.
Abstract: The first chordates appear in the fossil record at the time of the Cambrian explosion, nearly 550 million years ago. The modern ascidian tadpole represents a plausible approximation to these ancestral chordates. To illuminate the origins of chordate and vertebrates, we generated a draft of the protein-coding portion of the genome of the most studied ascidian, Ciona intestinalis. The Ciona genome contains approximately 16,000 protein-coding genes, similar to the number in other invertebrates, but only half that found in vertebrates. Vertebrate gene families are typically found in simplified form in Ciona, suggesting that ascidians contain the basic ancestral complement of genes involved in cell signaling and development. The ascidian genome has also acquired a number of lineage-specific innovations, including a group of genes engaged in cellulose metabolism that are related to those in bacteria and fungi.

1,582 citations

Journal Article
23 Aug 2002-Science
TL;DR: The Fugu rubripes genome has been sequenced to over 95% coverage, and more than 80% of the assembly is in multigene-sized scaffolds.
Abstract: The compact genome of Fugu rubripes has been sequenced to over 95% coverage, and more than 80% of the assembly is in multigene-sized scaffolds. In this 365-megabase vertebrate genome, repetitive DNA accounts for less than one-sixth of the sequence, and gene loci occupy about one-third of the genome. As with the human genome, gene loci are not evenly distributed, but are clustered into sparse and dense regions. Some “giant” genes were observed that had average coding sequence sizes but were spread over genomic lengths significantly larger than those of their human orthologs. Although three-quarters of predicted human proteins have a strong match to Fugu, approximately a quarter of the human proteins had highly diverged from or had no pufferfish homologs, highlighting the extent of protein evolution in the 450 million years since teleosts and mammals diverged. Conserved linkages between Fugu and human genes indicate the preservation of chromosomal segments from the common vertebrate ancestor, but with considerable scrambling of gene order.

1,446 citations

Journal Article
01 Sep 2006-Science
TL;DR: Comparison of the two species' genomes reveals a rapid expansion and diversification of many protein families associated with plant infection such as hydrolases, ABC transporters, protein toxins, proteinase inhibitors, and, in particular, a superfamily of 700 proteins with similarity to known oomycete avirulence genes.
Abstract: Draft genome sequences have been determined for the soybean pathogen Phytophthora sojae and the sudden oak death pathogen Phytophthora ramorum. Oomycetes such as these Phytophthora species share the kingdom Stramenopila with photosynthetic algae such as diatoms, and the presence of many Phytophthora genes of probable phototroph origin supports a photosynthetic ancestry for the stramenopiles. Comparison of the two species' genomes reveals a rapid expansion and diversification of many protein families associated with plant infection such as hydrolases, ABC transporters, protein toxins, proteinase inhibitors, and, in particular, a superfamily of 700 proteins with similarity to known oomycete avirulence genes.

1,016 citations

Journal Article
Jane Grimwood, Laurie Gordon, Anne S. Olsen, Astrid Terry, Jeremy Schmutz, Jane Lamerdin, Uffe Hellsten, David Goodstein, Olivier Couronne, Mary Bao Tran-Gyamfi, Andrea Aerts, Michael R. Altherr, Linda K. Ashworth, Eva Bajorek, Stacey Black, Elbert Branscomb, Sean Caenepeel, Anthony V. Carrano, Chenier Caoile, Yee Man Chan, Mari Christensen, Catherine A. Cleland, Alex Copeland, Eileen Dalin, Paramvir S. Dehal, Mirian Denys, John C. Detter, Julio Escobar, Dave Flowers, Dea Fotopulos, Carmen Rosa Albacete García, Anca M. Georgescu, Tijana Glavina, Maria Gomez, Eidelyn Gonzales, Matthew Groza, Nancy Hammon, Trevor Hawkins, Lauren Haydu, Isaac Ho, Wayne Huang, Sanjay Israni, Jamie Jett, Kristen Kadner, Heather Kimball, Arthur Kobayashi, Vladimer Larionov, Sun-Hee Leem, Frederick Lopez, Yunian Lou, Steve Lowry, Stephanie Malfatti, Diego Martinez, Paula McCready, Catherine Medina, Jenna Morgan, Kathryn Nelson, Matt Nolan, Ivan Ovcharenko, Sam Pitluck, Martin Pollard, Anthony P. Popkie, Paul Predki, Glenda Quan, Lucía Ramírez, Sam Rash, James Retterer, Alex Rodriguez, Stephanine Rogers, Asaf Salamov, Angelica Salazar, Xinwei She, Doug Smith, Tom Slezak, Victor V. Solovyev, Nina Thayer, Hope Tice, Ming Tsai, Anna Ustaszewska, Nu Vo, Mark C. Wagner, Jeremy Wheeler, Kevin Wu, Gary Xie, Joan Yang, Inna Dubchak, Terrence S. Furey, Pieter J. deJong, Mark Dickson, David Gordon, Evan E. Eichler, Len A. Pennacchio, Paul G. Richardson, Lisa Stubbs, Daniel S. Rokhsar, Richard M. Myers, Edward M. Rubin, Susan Lucas
01 Apr 2004-Nature
TL;DR: Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.
Abstract: Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high G + C content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9% of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in mendelian disorders, including familial hypercholesterolaemia and insulin-resistant diabetes. Nearly one-quarter of these genes belong to tandemly arranged families, encompassing more than 25% of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.

307 citations

Journal Article
06 Jul 2001-Science
TL;DR: To illuminate the function and evolutionary history of both genomes, mouse DNA related to human chromosome 19 was sequenced, along with the breakpoints of all 15 evolutionary rearrangements, providing a view of the forces that drive chromosome evolution in mammals.
Abstract: To illuminate the function and evolutionary history of both genomes, we sequenced mouse DNA related to human chromosome 19. Comparative sequence alignments yielded confirmatory evidence for hypothetical genes and identified exons, regulatory elements, and candidate genes that were missed by other predictive methods. Chromosome-wide comparisons revealed a difference between single-copy HSA19 genes, which are overwhelmingly conserved in mouse, and genes residing in tandem familial clusters, which differ extensively in number, coding capacity, and organization between the two species. Finally, we sequenced breakpoints of all 15 evolutionary rearrangements, providing a view of the forces that drive chromosome evolution in mammals.

218 citations


Cited by
Journal Article
Robert H. Waterston, Kerstin Lindblad-Toh, Ewan Birney, Jane Rogers, and 219 more authors (26 institutions)
05 Dec 2002-Nature
TL;DR: The results of an international collaboration to produce a high-quality draft sequence of the mouse genome are reported, and an initial comparative analysis of the mouse and human genomes is presented, describing some of the insights that can be gleaned from the two sequences.
Abstract: The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

6,643 citations

Journal Article
14 Jun 2007-Nature
TL;DR: Functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project are reported, providing convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts.
Abstract: We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.

5,091 citations

01 Aug 2000
TL;DR: A bioentrepreneur course on the assessment of medical technology in the context of commercialization, covering many issues unique to biomedical products.
Abstract: BIOE 402. Medical Technology Assessment. 2 or 3 hours. Bioentrepreneur course. Assessment of medical technology in the context of commercialization. Objectives, competition, market share, funding, pricing, manufacturing, growth, and intellectual property; many issues unique to biomedical products. Course Information: 2 undergraduate hours. 3 graduate hours. Prerequisite(s): Junior standing or above and consent of the instructor.

4,833 citations

Journal Article
TL;DR: A major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes is described and is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and genome-wide evolutionary studies.
Abstract: The availability of multiple, essentially complete genome sequences of prokaryotes and eukaryotes spurred both the demand and the opportunity for the construction of an evolutionary classification of genes from these genomes. Such a classification system based on orthologous relationships between genes appears to be a natural framework for comparative genomics and should facilitate both functional annotation of genomes and large-scale evolutionary studies. We describe here a major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes and the construction of clusters of predicted orthologs for 7 eukaryotic genomes, which we named KOGs after eukaryotic orthologous groups. The COG collection currently consists of 138,458 proteins, which form 4873 COGs and comprise 75% of the 185,505 (predicted) proteins encoded in 66 genomes of unicellular organisms. The eukaryotic orthologous groups (KOGs) include proteins from 7 eukaryotic genomes: three animals (the nematode Caenorhabditis elegans, the fruit fly Drosophila melanogaster and Homo sapiens), one plant, Arabidopsis thaliana, two fungi (Saccharomyces cerevisiae and Schizosaccharomyces pombe), and the intracellular microsporidian parasite Encephalitozoon cuniculi. The current KOG set consists of 4852 clusters of orthologs, which include 59,838 proteins, or ~54% of the analyzed eukaryotic 110,655 gene products. Compared to the coverage of the prokaryotic genomes with COGs, a considerably smaller fraction of eukaryotic genes could be included into the KOGs; addition of new eukaryotic genomes is expected to result in substantial increase in the coverage of eukaryotic genomes with KOGs. Examination of the phyletic patterns of KOGs reveals a conserved core represented in all analyzed species and consisting of ~20% of the KOG set. This conserved portion of the KOG set is much greater than the ubiquitous portion of the COG set (~1% of the COGs). In part, this difference is probably due to the small number of included eukaryotic genomes, but it could also reflect the relative compactness of eukaryotes as a clade and the greater evolutionary stability of eukaryotic genomes. The updated collection of orthologous protein sets for prokaryotes and eukaryotes is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and genome-wide evolutionary studies.

4,167 citations

Journal Article
21 Oct 2004-Nature
TL;DR: The current human genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps and is accurate to an error rate of approximately 1 event per 100,000 bases.
Abstract: The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers approximately 99% of the euchromatic genome and is accurate to an error rate of approximately 1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

3,989 citations