Author

James E. Galagan

Bio: James E. Galagan is an academic researcher from Boston University. The author has contributed to research in topics: Genome & Gene. The author has an h-index of 57 and has co-authored 102 publications receiving 42,515 citations. Previous affiliations of James E. Galagan include Broad Institute & Massachusetts Institute of Technology.


Papers
Journal Article
Eric S. Lander1, Lauren Linton1, Bruce W. Birren1, Chad Nusbaum1, +245 more (29 institutions)
15 Feb 2001-Nature
TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.
Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal Article
24 Apr 2003-Nature
TL;DR: A high-quality draft sequence of the N. crassa genome is reported; genome analysis suggests that repeat-induced point mutation (RIP) has had a profound impact on genome evolution, greatly slowing the creation of new genes through genomic duplication and resulting in a genome with an unusually low proportion of closely related genes.
Abstract: Neurospora crassa is a central organism in the history of twentieth-century genetics, biochemistry and molecular biology. Here, we report a high-quality draft sequence of the N. crassa genome. The approximately 40-megabase genome encodes about 10,000 protein-coding genes, more than twice as many as in the fission yeast Schizosaccharomyces pombe and only about 25% fewer than in the fruitfly Drosophila melanogaster. Analysis of the gene set yields insights into unexpected aspects of Neurospora biology including the identification of genes potentially associated with red light photobiology, genes implicated in secondary metabolism, and important differences in Ca2+ signalling as compared with plants and animals. Neurospora possesses the widest array of genome defence mechanisms known for any eukaryotic organism, including a process unique to fungi called repeat-induced point mutation (RIP). Genome analysis suggests that RIP has had a profound impact on genome evolution, greatly slowing the creation of new genes through genomic duplication and resulting in a genome with an unusually low proportion of closely related genes.

1,659 citations

Journal Article
21 Apr 2005-Nature
TL;DR: The draft sequence of the M. grisea genome is reported. Analysis of the gene set provides an insight into the adaptations required by a fungus to cause disease, and the genome has been subject to invasion and proliferation of active transposable elements, reflecting the clonal nature of this fungus imposed by widespread rice cultivation.
Abstract: Magnaporthe grisea is the most destructive pathogen of rice worldwide and the principal model organism for elucidating the molecular basis of fungal disease of plants. Here, we report the draft sequence of the M. grisea genome. Analysis of the gene set provides an insight into the adaptations required by a fungus to cause disease. The genome encodes a large and diverse set of secreted proteins, including those defined by unusual carbohydrate-binding domains. This fungus also possesses an expanded family of G-protein-coupled receptors, several new virulence-associated genes and large suites of enzymes involved in secondary metabolism. Consistent with a role in fungal pathogenesis, the expression of several of these genes is upregulated during the early stages of infection-related development. The M. grisea genome has been subject to invasion and proliferation of active transposable elements, reflecting the clonal nature of this fungus imposed by widespread rice cultivation.

1,520 citations

Journal Article
18 Mar 2010-Nature
TL;DR: Comparison of genomes of three phenotypically diverse Fusarium species revealed lineage-specific genomic regions in F. oxysporum that include four entire chromosomes and account for more than one-quarter of the genome, putting the evolution of fungal pathogenicity into a new perspective.
Abstract: Fusarium species are among the most important phytopathogenic and toxigenic fungi. To understand the molecular underpinnings of pathogenicity in the genus Fusarium, we compared the genomes of three phenotypically diverse species: Fusarium graminearum, Fusarium verticillioides and Fusarium oxysporum f. sp. lycopersici. Our analysis revealed lineage-specific (LS) genomic regions in F. oxysporum that include four entire chromosomes and account for more than one-quarter of the genome. LS regions are rich in transposons and genes with distinct evolutionary profiles but related to pathogenicity, indicative of horizontal acquisition. Experimentally, we demonstrate the transfer of two LS chromosomes between strains of F. oxysporum, converting a non-pathogenic strain into a pathogen. Transfer of LS chromosomes between otherwise genetically isolated strains explains the polyphyletic origin of host specificity and the emergence of new pathogenic lineages in F. oxysporum. These findings put the evolution of fungal pathogenicity into a new perspective.

1,386 citations

Journal Article
William C. Nierman1, William C. Nierman2, Arnab Pain3, Michael J. Anderson4, Jennifer R. Wortman1, Jennifer R. Wortman2, H. Stanley Kim2, H. Stanley Kim1, Javier Arroyo5, Matthew Berriman3, Keietsu Abe6, David B. Archer7, Clara Bermejo5, Joan W. Bennett8, Paul Bowyer4, Dan Chen1, Dan Chen2, Matthew Collins3, Richard Coulsen, Robert L. Davies3, Paul S. Dyer7, Mark L. Farman9, Nadia Fedorova2, Nadia Fedorova1, Natalie D. Fedorova2, Natalie D. Fedorova1, T. Feldblyum1, T. Feldblyum2, Reinhard Fischer10, Nigel Fosker3, Audrey Fraser3, José Luis García11, María Josefa Marcos García12, Ariette Goble3, Gustavo H. Goldman13, Katsuya Gomi6, Sam Griffith-Jones3, R. Gwilliam3, Brian J. Haas1, Brian J. Haas2, Hubertus Haas14, David Harris3, H. Horiuchi15, Jiaqi Huang1, Jiaqi Huang2, Sean Humphray3, Javier Jiménez12, Nancy P. Keller15, H. Khouri2, H. Khouri1, Katsuhiko Kitamoto16, Tetsuo Kobayashi17, Sven Konzack10, Resham Kulkarni1, Resham Kulkarni2, Toshitaka Kumagai18, Anne Lafton19, Jean-Paul Latgé19, Weixi Li9, Angela Lord3, Charles Lu1, Charles Lu2, William H. Majoros1, William H. Majoros2, Gregory S. May20, Bruce L. Miller21, Yasmin Ali Mohamoud1, Yasmin Ali Mohamoud2, María Molina5, Michel Monod22, Isabelle Mouyna19, Stephanie Mulligan2, Stephanie Mulligan1, Lee Murphy3, Susan O'Neil3, Ian T. Paulsen2, Ian T. Paulsen1, Miguel A. Peñalva11, Mihaela Pertea1, Mihaela Pertea2, Claire Price3, Bethan L. Pritchard4, Michael A. Quail3, Ester Rabbinowitsch3, Neil Rawlins3, Marie Adele Rajandream3, Utz Reichard23, Hubert Renauld3, Geoffrey D. Robson4, Santiago Rodríguez de Córdoba11, José Manuel Rodríguez-Peña5, Catherine M. Ronning2, Catherine M. Ronning1, Simon Rutter3, Steven L. Salzberg1, Steven L. Salzberg2, Miguel del Nogal Sánchez12, Juan C. Sánchez-Ferrero11, David L. Saunders3, Kathy Seeger3, Rob Squares3, S. Squares3, Michio Takeuchi24, Fredj Tekaia19, Geoffrey Turner25, Carlos R. Vázquez de Aldana12, J. Weidman1, J. Weidman2, Owen White1, Owen White2, John Woodward3, Jae-Hyuk Yu15, Claire M. Fraser1, Claire M. Fraser2, James E. Galagan26, Kiyoshi Asai18, Masayuki Machida18, Neil Hall1, Neil Hall3, Bart Barrell3, David W. Denning4
22 Dec 2005-Nature
TL;DR: The complete 29.4-megabase genome sequence of the A. fumigatus clinical isolate Af293 is reported. Microarray analysis revealed temperature-dependent expression of distinct sets of genes, as well as 700 A. fumigatus genes not present or significantly diverged in the closely related sexual species Neosartorya fischeri, many of which may have roles in the pathogenicity phenotype.
Abstract: Aspergillus fumigatus is exceptional among microorganisms in being both a primary and opportunistic pathogen as well as a major allergen. Its conidia production is prolific, and so human respiratory tract exposure is almost constant. A. fumigatus is isolated from human habitats and vegetable compost heaps. In immunocompromised individuals, the incidence of invasive infection can be as high as 50% and the mortality rate is often about 50% (ref. 2). The interaction of A. fumigatus and other airborne fungi with the immune system is increasingly linked to severe asthma and sinusitis. Although the burden of invasive disease caused by A. fumigatus is substantial, the basic biology of the organism is mostly obscure. Here we show the complete 29.4-megabase genome sequence of the clinical isolate Af293, which consists of eight chromosomes containing 9,926 predicted genes. Microarray analysis revealed temperature-dependent expression of distinct sets of genes, as well as 700 A. fumigatus genes not present or significantly diverged in the closely related sexual species Neosartorya fischeri, many of which may have roles in the pathogenicity phenotype. The Af293 genome sequence provides an unparalleled resource for the future understanding of this remarkable fungus.

1,356 citations


Cited by
Journal Article
TL;DR: The definition and use of family-specific, manually curated gathering thresholds are explained and some of the features of domains of unknown function (also known as DUFs) are discussed, which constitute a rapidly growing class of families within Pfam.
Abstract: Pfam is a widely used database of protein families and domains. This article describes a set of major updates that we have implemented in the latest release (version 24.0). The most important change is that we now use HMMER3, the latest version of the popular profile hidden Markov model package. This software is approximately 100 times faster than HMMER2 and is more sensitive due to the routine use of the forward algorithm. The move to HMMER3 has necessitated numerous changes to Pfam that are described in detail. Pfam release 24.0 contains 11,912 families, of which a large number have been significantly updated during the past two years. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/).

14,075 citations

Journal Article
J. Craig Venter1, Mark Raymond Adams1, Eugene W. Myers1, Peter W. Li1, +269 more (12 institutions)
16 Feb 2001-Science
TL;DR: Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems.
Abstract: A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies-a whole-genome assembly and a regional chromosome assembly-were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional approximately 12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. 
Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

12,098 citations

Journal Article
14 Jan 2005-Cell
TL;DR: In a four-genome analysis of 3' UTRs, approximately 13,000 regulatory relationships were detected above the estimate of false-positive predictions, thereby implicating as miRNA targets more than 5300 human genes, which represented 30% of the gene set.

11,624 citations

Journal Article
TL;DR: A mature web tool for rapid and reliable display of any requested portion of the genome at any scale, together with several dozen aligned annotation tracks, is provided at http://genome.ucsc.edu.
Abstract: As vertebrate genome sequences near completion and research refocuses to their analysis, the issue of effective genome annotation display becomes critical. A mature web tool for rapid and reliable display of any requested portion of the genome at any scale, together with several dozen aligned annotation tracks, is provided at http://genome.ucsc.edu. This browser displays assembly contigs and gaps, mRNA and expressed sequence tag alignments, multiple gene predictions, cross-species homologies, single nucleotide polymorphisms, sequence-tagged sites, radiation hybrid data, transposon repeats, and more as a stack of coregistered tracks. Text and sequence-based searches provide quick and precise access to any region of specific interest. Secondary links from individual features lead to sequence details and supplementary off-site databases. One-half of the annotation tracks are computed at the University of California, Santa Cruz from publicly available sequence data; collaborators worldwide provide the rest. Users can stably add their own custom tracks to the browser for educational or research purposes. The conceptual and technical framework of the browser, its underlying MYSQL database, and overall use are described. The web site currently serves over 50,000 pages per day to over 3000 different users.

9,605 citations

Journal Article
TL;DR: Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies; on real Solexa data sets without read pairs, the contig lengths it generated were in close agreement with simulated results without read-pair information.
Abstract: We have developed a new set of algorithms, collectively called "Velvet," to manipulate de Bruijn graphs for genomic sequence assembly. A de Bruijn graph is a compact representation based on short words (k-mers) that is ideal for high coverage, very short read (25-50 bp) data sets. Applying Velvet to very short reads and paired-ends information only, one can produce contigs of significant length, up to 50-kb N50 length in simulations of prokaryotic data and 3-kb N50 on simulated mammalian BACs. When applied to real Solexa data sets without read pairs, Velvet generated contigs of approximately 8 kb in a prokaryote and 2 kb in a mammalian BAC, in close agreement with our simulated results without read-pair information. Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies.
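The two ideas the Velvet abstract relies on, a de Bruijn graph built from k-mers and the N50 contig statistic, can be illustrated with a minimal sketch. This is a textbook-style illustration, not Velvet's implementation: the function names `de_bruijn_graph` and `n50` are hypothetical, the graph uses the common convention of (k-1)-mer nodes joined by k-mer edges, and error correction, reverse complements and read pairs are all ignored.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a simple de Bruijn graph (illustrative convention, not Velvet's).

    Each k-mer found in a read becomes an edge from its (k-1)-mer prefix
    to its (k-1)-mer suffix. Repeated k-mers yield repeated edges.
    """
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


def n50(contig_lengths):
    """Return the N50 of a set of contig lengths.

    N50 is the length L of the contig at which, walking from the longest
    contig downwards, the running total first reaches at least half of
    the summed length of all contigs. Returns 0 for an empty input.
    """
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


if __name__ == "__main__":
    # Toy reads, assumed error-free, chosen only for illustration.
    print(de_bruijn_graph(["ACGT", "CGTA"], k=3))
    print(n50([2, 3, 4, 5, 6]))
```

Under these assumptions, a contig corresponds to an unbranched path through the graph, and N50 summarises how long such contigs are once an assembly is finished.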

9,389 citations