Author

André Goffeau

Bio: André Goffeau is an academic researcher from Université catholique de Louvain. The author has contributed to research in topics: Saccharomyces cerevisiae & ATPase. The author has an h-index of 72 and has co-authored 244 publications that have received 31,166 citations. Previous affiliations of André Goffeau include Centre national de la recherche scientifique & Spanish National Research Council.
Topics: Saccharomyces cerevisiae, ATPase, Gene, Mutant, Yeast


Papers
Journal Article
25 Oct 1996-Science
TL;DR: The genome of the yeast Saccharomyces cerevisiae has been completely sequenced through a worldwide collaboration; the complete sequence provides information about the higher order organization of yeast's 16 chromosomes and allows some insight into their evolutionary history.
Abstract: The genome of the yeast Saccharomyces cerevisiae has been completely sequenced through a worldwide collaboration. The sequence of 12,068 kilobases defines 5885 potential protein-encoding genes, approximately 140 genes specifying ribosomal RNA, 40 genes for small nuclear RNA molecules, and 275 transfer RNA genes. In addition, the complete sequence provides information about the higher order organization of yeast's 16 chromosomes and allows some insight into their evolutionary history. The genome shows a considerable amount of apparent genetic redundancy, and one of the major problems to be tackled during the next stage of the yeast genome project is to elucidate the biological functions of all of these genes.

4,254 citations

Journal Article
F. Kunst1, Naotake Ogasawara2, Ivan Moszer1, Alessandra M. Albertini3, +151 more (30 institutions)
20 Nov 1997-Nature
TL;DR: Bacillus subtilis is the best-characterized member of the Gram-positive bacteria; its genome contains at least ten prophages or remnants of prophages, indicating that bacteriophage infection has played an important evolutionary role in horizontal gene transfer, in particular in the propagation of bacterial pathogenesis.
Abstract: Bacillus subtilis is the best-characterized member of the Gram-positive bacteria. Its genome of 4,214,810 base pairs comprises 4,100 protein-coding genes. Of these protein-coding genes, 53% are represented once, while a quarter of the genome corresponds to several gene families that have been greatly expanded by gene duplication, the largest family containing 77 putative ATP-binding transport proteins. In addition, a large proportion of the genetic capacity is devoted to the utilization of a variety of carbon sources, including many plant-derived molecules. The identification of five signal peptidase genes, as well as several genes for components of the secretion apparatus, is important given the capacity of Bacillus strains to secrete large amounts of industrially important enzymes. Many of the genes are involved in the synthesis of secondary metabolites, including antibiotics, that are more typically associated with Streptomyces species. The genome contains at least ten prophages or remnants of prophages, indicating that bacteriophage infection has played an important evolutionary role in horizontal gene transfer, in particular in the propagation of bacterial pathogenesis.

3,753 citations

Journal Article
TL;DR: Evidence is presented substantiating the proposal that an internal tandem gene duplication event gave rise to a primordial MFS protein before divergence of the family members.
Abstract: In 1998 we updated earlier descriptions of the largest family of secondary transport carriers found in living organisms, the major facilitator superfamily (MFS). Seventeen families of transport proteins were shown to comprise this superfamily. We here report expansion of the MFS to include 29 established families as well as five probable families. Structural, functional, and mechanistic features of the constituent permeases are described, and each newly identified family is shown to exhibit specificity for a single class of substrates. Phylogenetic analyses define the evolutionary relationships of the members of each family to each other, and multiple alignments allow definition of family-specific signature sequences as well as all well-conserved sequence motifs. The work described serves to update previous publications and allows extrapolation of structural, functional and mechanistic information obtained with any one member of the superfamily to other members with limitations determined by the degrees of sequence divergence.

1,996 citations

Journal Article
Valerie Wood1, R. Gwilliam1, Marie-Adèle Rajandream1, M. Lyne1, Rachel Lyne1, A. Stewart2, J. Sgouros2, N. Peat2, Jacqueline Hayles2, Stephen Baker1, D. Basham1, Sharen Bowman1, Karen Brooks1, D. Brown1, Steve D.M. Brown1, Tracey Chillingworth1, Carol Churcher1, Mark O. Collins1, R. Connor1, Ann Cronin1, P. Davis1, Theresa Feltwell1, Andrew G. Fraser1, S. Gentles1, Arlette Goble1, N. Hamlin1, David Harris1, J. Hidalgo1, Geoffrey M. Hodgson1, S. Holroyd1, T. Hornsby1, S. Howarth1, Elizabeth J. Huckle1, Sarah E. Hunt1, Kay Jagels1, Kylie R. James1, L. Jones1, Matthew Jones1, S. Leather1, S. McDonald1, J. McLean1, P. Mooney1, Sharon Moule1, Karen Mungall1, Lee Murphy1, D. Niblett1, C. Odell1, Karen Oliver1, Susan O'Neil1, D. Pearson1, Michael A. Quail1, Ester Rabbinowitsch1, Kim Rutherford1, Simon Rutter1, David L. Saunders1, Kathy Seeger1, Sarah Sharp1, Jason Skelton1, Mark Simmonds1, R. Squares1, S. Squares1, K. Stevens1, K. Taylor1, Ruth Taylor1, Adrian Tivey1, S. Walsh1, T. Warren1, S. Whitehead1, John Woodward1, Guido Volckaert3, Rita Aert3, Johan Robben3, B. Grymonprez3, I. Weltjens3, E. Vanstreels3, Michael A. Rieger, M. Schafer, S. Muller-Auer, C. Gabel, M. Fuchs, C. Fritzc, E. Holzer, D. Moestl, H. Hilbert, K. Borzym4, I. Langer4, Alfred Beck4, Hans Lehrach4, Richard Reinhardt4, Thomas M. Pohl5, P. Eger5, Wolfgang Zimmermann, H. Wedler, R. Wambutt, Bénédicte Purnelle6, André Goffeau6, Edouard Cadieu7, Stéphane Dréano7, Stéphanie Gloux7, Valerie Lelaure7, Stéphanie Mottier7, Francis Galibert7, Stephen J. Aves8, Z. Xiang8, Cherryl Hunt8, Karen Moore8, S. M. Hurst8, M. Lucas9, M. Rochet9, Claude Gaillardin9, Victor A. Tallada10, Victor A. Tallada11, Andrés Garzón11, Andrés Garzón10, G. Thode10, Rafael R. Daga10, Rafael R. Daga11, L. Cruzado10, Juan Jimenez10, Juan Jimenez11, Miguel del Nogal Sánchez12, F. del Rey12, J. Benito12, Angel Domínguez12, José L. Revuelta12, Sergio Moreno12, John Armstrong13, Susan L. Forsburg14, L. Cerrutti1, Todd M. Lowe15, W. R. McCombie16, Ian T. Paulsen17, Judith A. Potashkin18, G. V. Shpakovski19, David W. Ussery20, Bart Barrell1, Paul Nurse2
21 Feb 2002-Nature
TL;DR: The genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote, is sequenced and highly conserved genes important for eukaryotic cell organization including those required for the cytoskeleton, compartmentation, cell-cycle control, proteolysis, protein phosphorylation and RNA splicing are identified.
Abstract: We have sequenced and annotated the genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote: 4,824. The centromeres are between 35 and 110 kilobases (kb) and contain related repeats including a highly conserved 1.8-kb element. Regions upstream of genes are longer than in budding yeast (Saccharomyces cerevisiae), possibly reflecting more-extended control regions. Some 43% of the genes contain introns, of which there are 4,730. Fifty genes have significant similarity with human disease genes; half of these are cancer related. We identify highly conserved genes important for eukaryotic cell organization including those required for the cytoskeleton, compartmentation, cell-cycle control, proteolysis, protein phosphorylation and RNA splicing. These genes may have originated with the appearance of eukaryotic life. Few similarly conserved genes that are important for multicellular organization were identified, suggesting that the transition from prokaryotes to eukaryotes required more new genes than did the transition from unicellular to multicellular organization.

1,686 citations

Journal Article
Alasdair Ivens1, Christopher S. Peacock1, Elizabeth A. Worthey2, Lee Murphy1, Gautam Aggarwal2, Matthew Berriman1, Ellen Sisk2, Marie-Adèle Rajandream1, Ellen Adlem1, Rita Aert3, Atashi Anupama2, Zina Apostolou, Philip Attipoe2, Nathalie Bason1, Christopher Bauser4, Alfred Beck5, Stephen M. Beverley6, Gabriella Bianchettin7, K. Borzym5, G. Bothe4, Carlo V. Bruschi8, Carlo V. Bruschi7, Matt Collins1, Eithon Cadag2, Laura Ciarloni7, Christine Clayton, Richard M.R. Coulson9, Ann Cronin1, Angela K. Cruz10, Robert L. Davies1, Javier G. De Gaudenzi11, Deborah E. Dobson6, Andreas Duesterhoeft, Gholam Fazelina2, Nigel Fosker1, Alberto C.C. Frasch11, Audrey Fraser1, Monika Fuchs, Claudia Gabel, Arlette Goble1, André Goffeau12, David Harris1, Christiane Hertz-Fowler1, Helmut Hilbert, David Horn13, Yiting Huang2, Sven Klages5, Andrew J Knights1, Michael Kube5, Natasha Larke1, Lyudmila Litvin2, Angela Lord1, Tin Louie2, Marco A. Marra, David Masuy12, Keith R. Matthews14, Shulamit Michaeli, Jeremy C. Mottram15, Silke Müller-Auer, Heather Munden2, Siri Nelson2, Halina Norbertczak1, Karen Oliver1, Susan O'Neil1, Martin Pentony2, Thomas M. Pohl4, Claire Price1, Bénédicte Purnelle12, Michael A. Quail1, Ester Rabbinowitsch1, Richard Reinhardt5, Michael A. Rieger, Joel Rinta2, Johan Robben3, Laura Robertson2, Jeronimo C. Ruiz10, Simon Rutter1, David L. Saunders1, Melanie Schäfer, Jacquie Schein, David C. Schwartz16, Kathy Seeger1, Amber Seyler2, Sarah Sharp1, Heesun Shin, Dhileep Sivam2, Rob Squares1, Steve Squares1, Valentina Tosato7, Christy Vogt2, Guido Volckaert3, Rolf Wambutt, T. Warren1, Holger Wedler, John Woodward1, Shiguo Zhou16, Wolfgang Zimmermann, Deborah F. Smith17, Jenefer M. Blackwell18, Kenneth Stuart2, Kenneth Stuart19, Bart Barrell1, Peter J. Myler19, Peter J. Myler2 
15 Jul 2005-Science
TL;DR: The organization of protein-coding genes into long, strand-specific, polycistronic clusters and lack of general transcription factors in the L. major, Trypanosoma brucei, and Trypanosoma cruzi (Tritryp) genomes suggest that the mechanisms regulating RNA polymerase II-directed transcription are distinct from those operating in other eukaryotes, although the trypanosomatids appear capable of chromatin remodeling.
Abstract: Leishmania species cause a spectrum of human diseases in tropical and subtropical regions of the world. We have sequenced the 36 chromosomes of the 32.8-megabase haploid genome of Leishmania major (Friedlin strain) and predict 911 RNA genes, 39 pseudogenes, and 8272 protein-coding genes, of which 36% can be ascribed a putative function. These include genes involved in host-pathogen interactions, such as proteolytic enzymes, and extensive machinery for synthesis of complex surface glycoconjugates. The organization of protein-coding genes into long, strand-specific, polycistronic clusters and lack of general transcription factors in the L. major, Trypanosoma brucei, and Trypanosoma cruzi (Tritryp) genomes suggest that the mechanisms regulating RNA polymerase II-directed transcription are distinct from those operating in other eukaryotes, although the trypanosomatids appear capable of chromatin remodeling. Abundant RNA-binding proteins are encoded in the Tritryp genomes, consistent with active posttranscriptional regulation of gene expression.

1,357 citations


Cited by
Journal Article
TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
Abstract: Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

35,225 citations

Journal Article
Eric S. Lander1, Lauren Linton1, Bruce W. Birren1, Chad Nusbaum1, +245 more (29 institutions)
15 Feb 2001-Nature
TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.
Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal Article
J. Craig Venter1, Mark Raymond Adams1, Eugene W. Myers1, Peter W. Li1, +269 more (12 institutions)
16 Feb 2001-Science
TL;DR: Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems.
Abstract: A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies (a whole-genome assembly and a regional chromosome assembly) were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional approximately 12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

12,098 citations

Journal Article
14 Dec 2000-Nature
TL;DR: This is the first complete genome sequence of a plant and provides the foundations for more comprehensive comparison of conserved processes in all eukaryotes, identifying a wide range of plant-specific gene functions and establishing rapid systematic ways to identify genes for crop improvement.
Abstract: The flowering plant Arabidopsis thaliana is an important model system for identifying genes and determining their functions. Here we report the analysis of the genomic sequence of Arabidopsis. The sequenced regions cover 115.4 megabases of the 125-megabase genome and extend into centromeric regions. The evolution of Arabidopsis involved a whole-genome duplication, followed by subsequent gene loss and extensive local gene duplications, giving rise to a dynamic genome enriched by lateral gene transfer from a cyanobacterial-like ancestor of the plastid. The genome contains 25,498 genes encoding proteins from 11,000 families, similar to the functional diversity of Drosophila and Caenorhabditis elegans, the other sequenced multicellular eukaryotes. Arabidopsis has many families of new proteins but also lacks several common protein families, indicating that the sets of common proteins have undergone differential expansion and contraction in the three multicellular eukaryotes. This is the first complete genome sequence of a plant and provides the foundations for more comprehensive comparison of conserved processes in all eukaryotes, identifying a wide range of plant-specific gene functions and establishing rapid systematic ways to identify genes for crop improvement.

8,742 citations

Journal Article
11 Jun 1998-Nature
TL;DR: The complete genome sequence of the best-characterized strain of Mycobacterium tuberculosis, H37Rv, has been determined and analysed in order to improve the understanding of the biology of this slow-growing pathogen and to help the conception of new prophylactic and therapeutic interventions.
Abstract: Countless millions of people have died from tuberculosis, a chronic infectious disease caused by the tubercle bacillus. The complete genome sequence of the best-characterized strain of Mycobacterium tuberculosis, H37Rv, has been determined and analysed in order to improve our understanding of the biology of this slow-growing pathogen and to help the conception of new prophylactic and therapeutic interventions. The genome comprises 4,411,529 base pairs, contains around 4,000 genes, and has a very high guanine + cytosine content that is reflected in the biased amino-acid content of the proteins. M. tuberculosis differs radically from other bacteria in that a very large portion of its coding capacity is devoted to the production of enzymes involved in lipogenesis and lipolysis, and to two new families of glycine-rich proteins with a repetitive structure that may represent a source of antigenic variation.

7,779 citations