Author
Kathy Seeger
Bio: Kathy Seeger is an academic researcher from the Wellcome Trust Sanger Institute. The author has contributed to research on the topics Genome and Gene, has an h-index of 19, and has co-authored 23 publications receiving 10,352 citations.
Topics: Genome, Gene, Schizosaccharomyces pombe, Synteny, Conserved sequence
Papers
••
TL;DR: The 8,667,507 base pair linear chromosome of Streptomyces coelicolor is reported, containing the largest number of genes so far discovered in a bacterium.
Abstract: Streptomyces coelicolor is a representative of the group of soil-dwelling, filamentous bacteria responsible for producing most natural antibiotics used in human and veterinary medicine. Here we report the 8,667,507 base pair linear chromosome of this organism, containing the largest number of genes so far discovered in a bacterium. The 7,825 predicted genes include more than 20 clusters coding for known or predicted secondary metabolites. The genome contains an unprecedented proportion of regulatory genes, predominantly those likely to be involved in responses to external stimuli and stresses, and many duplicated gene sets that may represent 'tissue-specific' isoforms operating in different phases of colonial development, a unique situation for a bacterium. An ancient synteny was revealed between the central 'core' of the chromosome and the whole chromosome of pathogens Mycobacterium tuberculosis and Corynebacterium diphtheriae. The genome sequence will greatly increase our understanding of microbial life in the soil as well as aiding the generation of new drug candidates by genetic engineering.
3,077 citations
••
Wellcome Trust Sanger Institute, London Research Institute, Katholieke Universiteit Leuven, Max Planck Society, GATC Biotech, Université catholique de Louvain, Centre national de la recherche scientifique, University of Exeter, Institut national agronomique Paris Grignon, Pablo de Olavide University, University of Málaga, University of Salamanca, University of Sussex, Salk Institute for Biological Studies, Stanford University, Cold Spring Harbor Laboratory, TigerLogic, Rosalind Franklin University of Medicine and Science, Russian Academy of Sciences, Technical University of Denmark
TL;DR: The genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote, is sequenced, and highly conserved genes important for eukaryotic cell organization, including those required for the cytoskeleton, compartmentation, cell-cycle control, proteolysis, protein phosphorylation and RNA splicing, are identified.
Abstract: We have sequenced and annotated the genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote: 4,824. The centromeres are between 35 and 110 kilobases (kb) and contain related repeats including a highly conserved 1.8-kb element. Regions upstream of genes are longer than in budding yeast (Saccharomyces cerevisiae), possibly reflecting more-extended control regions. Some 43% of the genes contain introns, of which there are 4,730. Fifty genes have significant similarity with human disease genes; half of these are cancer related. We identify highly conserved genes important for eukaryotic cell organization including those required for the cytoskeleton, compartmentation, cell-cycle control, proteolysis, protein phosphorylation and RNA splicing. These genes may have originated with the appearance of eukaryotic life. Few similarly conserved genes that are important for multicellular organization were identified, suggesting that the transition from prokaryotes to eukaryotes required more new genes than did the transition from unicellular to multicellular organization.
1,686 citations
••
Wellcome Trust Sanger Institute, Seattle Biomed, Katholieke Universiteit Leuven, GATC Biotech, Max Planck Society, Washington University in St. Louis, University of Trieste, International Centre for Genetic Engineering and Biotechnology, European Bioinformatics Institute, University of São Paulo, National Scientific and Technical Research Council, Université catholique de Louvain, University of London, University of Edinburgh, University of Glasgow, University of Wisconsin-Madison, University of York, University of Cambridge, University of Washington
TL;DR: The organization of protein-coding genes into long, strand-specific, polycistronic clusters and the lack of general transcription factors in the L. major, Trypanosoma brucei, and Trypanosoma cruzi (Tritryp) genomes suggest that the mechanisms regulating RNA polymerase II-directed transcription are distinct from those operating in other eukaryotes, although the trypanosomatids appear capable of chromatin remodeling.
Abstract: Leishmania species cause a spectrum of human diseases in tropical and subtropical regions of the world. We have sequenced the 36 chromosomes of the 32.8-megabase haploid genome of Leishmania major (Friedlin strain) and predict 911 RNA genes, 39 pseudogenes, and 8272 protein-coding genes, of which 36% can be ascribed a putative function. These include genes involved in host-pathogen interactions, such as proteolytic enzymes, and extensive machinery for synthesis of complex surface glycoconjugates. The organization of protein-coding genes into long, strand-specific, polycistronic clusters and lack of general transcription factors in the L. major, Trypanosoma brucei, and Trypanosoma cruzi (Tritryp) genomes suggest that the mechanisms regulating RNA polymerase II-directed transcription are distinct from those operating in other eukaryotes, although the trypanosomatids appear capable of chromatin remodeling. Abundant RNA-binding proteins are encoded in the Tritryp genomes, consistent with active posttranscriptional regulation of gene expression.
1,357 citations
••
Washington University in St. Louis, J. Craig Venter Institute, Wellcome Trust Sanger Institute, University of Manchester, Complutense University of Madrid, Tohoku University, University of Nottingham, Tulane University, University of Kentucky, Max Planck Society, Spanish National Research Council, University of Salamanca, University of São Paulo, Innsbruck Medical University, University of Wisconsin-Madison, University of Tokyo, Nagoya University, National Institute of Advanced Industrial Science and Technology, Pasteur Institute, University of Texas MD Anderson Cancer Center, University of Idaho, University of Lausanne, University of Göttingen, Tokyo University of Agriculture and Technology, University of Sheffield, Broad Institute
TL;DR: The Af293 genome sequence provides an unparalleled resource for the future understanding of this remarkable fungus and revealed temperature-dependent expression of distinct sets of genes, as well as 700 A. fumigatus genes not present or significantly diverged in the closely related sexual species Neosartorya fischeri, many of which may have roles in the pathogenicity phenotype.
Abstract: Aspergillus fumigatus is exceptional among microorganisms in being both a primary and opportunistic pathogen as well as a major allergen. Its conidia production is prolific, and so human respiratory tract exposure is almost constant. A. fumigatus is isolated from human habitats and vegetable compost heaps. In immunocompromised individuals, the incidence of invasive infection can be as high as 50% and the mortality rate is often about 50% (ref. 2). The interaction of A. fumigatus and other airborne fungi with the immune system is increasingly linked to severe asthma and sinusitis. Although the burden of invasive disease caused by A. fumigatus is substantial, the basic biology of the organism is mostly obscure. Here we show the complete 29.4-megabase genome sequence of the clinical isolate Af293, which consists of eight chromosomes containing 9,926 predicted genes. Microarray analysis revealed temperature-dependent expression of distinct sets of genes, as well as 700 A. fumigatus genes not present or significantly diverged in the closely related sexual species Neosartorya fischeri, many of which may have roles in the pathogenicity phenotype. The Af293 genome sequence provides an unparalleled resource for the future understanding of this remarkable fungus.
1,356 citations
••
TL;DR: It is shown that pseudogene formation and gene loss are the principal forces shaping the different genomes of Leishmania, and genes that are differentially distributed between the species encode proteins implicated in host-pathogen interactions and parasite survival in the macrophage.
Abstract: Leishmania parasites cause a broad spectrum of clinical disease. Here we report the sequencing of the genomes of two species of Leishmania: Leishmania infantum and Leishmania braziliensis. The comparison of these sequences with the published genome of Leishmania major reveals marked conservation of synteny and identifies only 200 genes with a differential distribution between the three species. L. braziliensis, contrary to Leishmania species examined so far, possesses components of a putative RNA-mediated interference pathway, telomere-associated transposable elements and spliced leader–associated SLACS retrotransposons. We show that pseudogene formation and gene loss are the principal forces shaping the different genomes. Genes that are differentially distributed between the species encode proteins implicated in host-pathogen interactions and parasite survival in the macrophage.
721 citations
Cited by
••
TL;DR: The genome sequence of P. falciparum clone 3D7 is reported, which is the most (A + T)-rich genome sequenced to date and is being exploited in the search for new drugs and vaccines to fight malaria.
Abstract: The parasite Plasmodium falciparum is responsible for hundreds of millions of cases of malaria, and kills more than one million African children annually. Here we report an analysis of the genome sequence of P. falciparum clone 3D7. The 23-megabase nuclear genome consists of 14 chromosomes, encodes about 5,300 genes, and is the most (A + T)-rich genome sequenced to date. Genes involved in antigenic variation are concentrated in the subtelomeric regions of the chromosomes. Compared to the genomes of free-living eukaryotic microbes, the genome of this intracellular parasite encodes fewer enzymes and transporters, but a large proportion of genes are devoted to immune evasion and host-parasite interactions. Many nuclear-encoded proteins are targeted to the apicoplast, an organelle involved in fatty-acid and isoprenoid metabolism. The genome sequence provides the foundation for future studies of this organism, and is being exploited in the search for new drugs and vaccines to fight malaria.
4,312 citations
••
TL;DR: A major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes is described; it is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and for genome-wide evolutionary studies.
Abstract: The availability of multiple, essentially complete genome sequences of prokaryotes and eukaryotes spurred both the demand and the opportunity for the construction of an evolutionary classification of genes from these genomes. Such a classification system based on orthologous relationships between genes appears to be a natural framework for comparative genomics and should facilitate both functional annotation of genomes and large-scale evolutionary studies. We describe here a major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes and the construction of clusters of predicted orthologs for 7 eukaryotic genomes, which we named KOGs after eukaryotic orthologous groups. The COG collection currently consists of 138,458 proteins, which form 4873 COGs and comprise 75% of the 185,505 (predicted) proteins encoded in 66 genomes of unicellular organisms. The eukaryotic orthologous groups (KOGs) include proteins from 7 eukaryotic genomes: three animals (the nematode Caenorhabditis elegans, the fruit fly Drosophila melanogaster and Homo sapiens), one plant, Arabidopsis thaliana, two fungi (Saccharomyces cerevisiae and Schizosaccharomyces pombe), and the intracellular microsporidian parasite Encephalitozoon cuniculi. The current KOG set consists of 4852 clusters of orthologs, which include 59,838 proteins, or ~54% of the analyzed eukaryotic 110,655 gene products. Compared to the coverage of the prokaryotic genomes with COGs, a considerably smaller fraction of eukaryotic genes could be included into the KOGs; addition of new eukaryotic genomes is expected to result in substantial increase in the coverage of eukaryotic genomes with KOGs. Examination of the phyletic patterns of KOGs reveals a conserved core represented in all analyzed species and consisting of ~20% of the KOG set. This conserved portion of the KOG set is much greater than the ubiquitous portion of the COG set (~1% of the COGs). In part, this difference is probably due to the small number of included eukaryotic genomes, but it could also reflect the relative compactness of eukaryotes as a clade and the greater evolutionary stability of eukaryotic genomes. The updated collection of orthologous protein sets for prokaryotes and eukaryotes is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and genome-wide evolutionary studies.
4,167 citations
••
TL;DR: It is found that genes of similar functions are clustered in distinct, multi-megabase regions of individual chromosomes; genes in these regions tend to share transcriptional profiles.
Abstract: A principal challenge currently facing biologists is how to connect the complete DNA sequence of an organism to its development and behaviour. Large-scale targeted-deletions have been successful in defining gene functions in the single-celled yeast Saccharomyces cerevisiae, but comparable analyses have yet to be performed in an animal. Here we describe the use of RNA interference to inhibit the function of ∼86% of the 19,427 predicted genes of C. elegans. We identified mutant phenotypes for 1,722 genes, about two-thirds of which were not previously associated with a phenotype. We find that genes of similar functions are clustered in distinct, multi-megabase regions of individual chromosomes; genes in these regions tend to share transcriptional profiles. Our resulting data set and reusable RNAi library of 16,757 bacterial clones will facilitate systematic analyses of the connections among gene sequence, chromosomal location and gene function in C. elegans.
3,529 citations
••
TL;DR: In this paper, the authors characterized Cpf1, a putative class 2 CRISPR effector, which is a single RNA-guided endonuclease lacking tracrRNA and utilizes a T-rich protospacer-adjacent motif.
3,436 citations
••
TL;DR: The short history, specific features and future prospects of research of microbial metabolites, including antibiotics and other bioactive metabolites, are summarized.
Abstract: The short history, specific features and future prospects of research of microbial metabolites, including antibiotics and other bioactive metabolites, are summarized. The microbial origin, diversity of producing species, functions and various bioactivities of metabolites, unique features of their chemical structures are discussed, mainly on the basis of statistical data. The possible numbers of metabolites may be discovered in the future, the problems of dereplication of newly isolated compounds as well as the new trends and prospects of the research are also discussed.
2,706 citations