
Showing papers on "Genome published in 1999"


Journal ArticleDOI
TL;DR: The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a knowledge base for systematic analysis of gene functions in terms of the networks of genes and molecules.
Abstract: Kyoto Encyclopedia of Genes and Genomes (KEGG) is a knowledge base for systematic analysis of gene functions in terms of the networks of genes and molecules. The major component of KEGG is the PATHWAY database that consists of graphical diagrams of biochemical pathways including most of the known metabolic pathways and some of the known regulatory pathways. The pathway information is also represented by the ortholog group tables summarizing orthologous and paralogous gene groups among different organisms. KEGG maintains the GENES database for the gene catalogs of all organisms with complete genomes and selected organisms with partial genomes, which are continuously re-annotated, as well as the LIGAND database for chemical compounds and enzymes. Each gene catalog is associated with the graphical genome map for chromosomal locations that is represented by a Java applet. In addition to the data collection efforts, KEGG develops and provides various computational tools, such as for reconstructing biochemical pathways from the complete genome sequence and for predicting gene regulatory networks from the gene expression profiles. The KEGG databases are updated daily and made freely available (http://www.genome.ad.jp/kegg/).

24,024 citations


Journal ArticleDOI
06 Aug 1999-Science
TL;DR: A total of 6925 Saccharomyces cerevisiae strains were constructed, by a high-throughput strategy, each with a precise deletion of one of 2026 ORFs (more than one-third of the ORFs in the genome), finding that 17 percent were essential for viability in rich medium.
Abstract: The functions of many open reading frames (ORFs) identified in genome-sequencing projects are unknown. New, whole-genome approaches are required to systematically determine their function. A total of 6925 Saccharomyces cerevisiae strains were constructed, by a high-throughput strategy, each with a precise deletion of one of 2026 ORFs (more than one-third of the ORFs in the genome). Of the deleted ORFs, 17 percent were essential for viability in rich medium. The phenotypes of more than 500 deletion strains were assayed in parallel. Of the deletion strains, 40 percent showed quantitative growth defects in either rich or minimal medium.

4,051 citations


Journal ArticleDOI
TL;DR: The comparison of animal mitochondrial gene arrangements has become a very powerful means for inferring ancient evolutionary relationships, since rearrangements appear to be unique, generally rare events that are unlikely to arise independently in separate evolutionary lineages.
Abstract: Animal mitochondrial DNA is a small, extrachromosomal genome, typically ~16 kb in size. With few exceptions, all animal mitochondrial genomes contain the same 37 genes: two for rRNAs, 13 for proteins and 22 for tRNAs. The products of these genes, along with RNAs and proteins imported from the cytoplasm, endow mitochondria with their own systems for DNA replication, transcription, mRNA processing and translation of proteins. The study of these genomes as they function in mitochondrial systems—‘mitochondrial genomics’— serves as a model for genome evolution. Furthermore, the comparison of animal mitochondrial gene arrangements has become a very powerful means for inferring ancient evolutionary relationships, since rearrangements appear to be unique, generally rare events that are unlikely to arise independently in separate evolutionary lineages. Complete mitochondrial gene arrangements have been published for 58 chordate species and 29 non-chordate species, and partial arrangements for hundreds of other taxa. This review compares and summarizes these gene arrangements and points out some of the questions that may be addressed by comparing mitochondrial systems.

2,923 citations


Journal ArticleDOI
TL;DR: Exploration of the genome using DNA microarrays and other genome-scale technologies should narrow the gap in the knowledge of gene function and molecular biology between the currently-favoured model organisms and other species.
Abstract: Thousands of genes are being discovered for the first time by sequencing the genomes of model organisms, an exhilarating reminder that much of the natural world remains to be explored at the molecular level. DNA microarrays provide a natural vehicle for this exploration. The model organisms are the first for which comprehensive genome-wide surveys of gene expression patterns or function are possible. The results can be viewed as maps that reflect the order and logic of the genetic program, rather than the physical order of genes on chromosomes. Exploration of the genome using DNA microarrays and other genome-scale technologies should narrow the gap in our knowledge of gene function and molecular biology between the currently-favoured model organisms and other species.

2,289 citations


Journal ArticleDOI
30 Jul 1999-Science
TL;DR: Searching sequences from many genomes revealed 6809 putative protein-protein interactions in Escherichia coli and 45,502 in yeast, and many members of these pairs were confirmed as functionally related; computational filtering further enriches for interactions.
Abstract: A computational method is proposed for inferring protein interactions from genome sequences on the basis of the observation that some pairs of interacting proteins have homologs in another organism fused into a single protein chain. Searching sequences from many genomes revealed 6809 such putative protein-protein interactions in Escherichia coli and 45,502 in yeast. Many members of these pairs were confirmed as functionally related; computational filtering further enriches for interactions. Some proteins have links to several other proteins; these coupled links appear to represent functional interactions such as complexes or pathways. Experimentally confirmed interacting pairs are documented in a Database of Interacting Proteins.

1,691 citations
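
The fusion-based inference described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Hit` record, the `max_overlap` threshold and the example coordinates are all hypothetical, and homology hits are assumed to have been computed beforehand by some sequence-search tool. Two query proteins are linked when both align to the same third protein over essentially non-overlapping regions.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Hit(NamedTuple):
    """One homology hit: `query` aligns to `subject` over [start, end]."""
    query: str
    subject: str
    start: int  # 1-based, inclusive, on the subject protein
    end: int


def overlap(a: Hit, b: Hit) -> int:
    """Number of subject residues covered by both hits."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def predict_links(hits, max_overlap=10):
    """Link query proteins whose hits cover different parts of one subject.

    `max_overlap` is an illustrative tolerance, not a value from the paper.
    """
    by_subject = defaultdict(list)
    for hit in hits:
        by_subject[hit.subject].append(hit)

    links = set()
    for subject_hits in by_subject.values():
        for a, b in combinations(subject_hits, 2):
            if a.query != b.query and overlap(a, b) <= max_overlap:
                links.add(tuple(sorted((a.query, b.query))))
    return links


# Hypothetical hits: A and B cover separate parts of fused protein F.
# C covers nearly the same region as A, so A and C are not linked,
# but C and B occupy different regions and are linked.
hits = [
    Hit("A", "F", 1, 200),
    Hit("B", "F", 210, 450),
    Hit("C", "F", 20, 190),
]
links = predict_links(hits)
```

In practice such raw links would still need the kind of filtering the abstract mentions, since promiscuous domains can link many unrelated proteins.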


Journal ArticleDOI
TL;DR: This work describes a cDNA microarray-based CGH method, and its application to DNA copy-number variation analysis in breast cancer cell lines and tumours, and identifies gene amplifications and deletions genome-wide and with high resolution.
Abstract: Gene amplifications and deletions frequently contribute to tumorigenesis. Characterization of these DNA copy-number changes is important for both the basic understanding of cancer and its diagnosis. Comparative genomic hybridization (CGH) was developed to survey DNA copy-number variations across a whole genome (ref. 1). With CGH, differentially labelled test and reference genomic DNAs are co-hybridized to normal metaphase chromosomes, and fluorescence ratios along the length of chromosomes provide a cytogenetic representation of DNA copy-number variation. CGH, however, has a limited (∼20 Mb) mapping resolution, and higher-resolution techniques, such as fluorescence in situ hybridization (FISH), are prohibitively labour-intensive on a genomic scale. Array-based CGH, in which fluorescence ratios at arrayed DNA elements provide a locus-by-locus measure of DNA copy-number variation, represents another means of achieving increased mapping resolution (refs 2-4). Published array CGH methods have relied on large genomic clone (for example BAC) array targets and have covered only a small fraction of the human genome. cDNAs representing over 30,000 radiation-hybrid (RH)-mapped human genes (refs 5,6) provide

1,558 citations


Journal ArticleDOI
27 May 1999-Nature
TL;DR: Genome analysis reveals numerous pathways involved in degradation of sugars and plant polysaccharides, and 108 genes that have orthologues only in the genomes of other thermophilic Eubacteria and Archaea.
Abstract: The 1,860,725-base-pair genome of Thermotoga maritima MSB8 contains 1,877 predicted coding regions, 1,014 (54%) of which have functional assignments and 863 (46%) of which are of unknown function. Genome analysis reveals numerous pathways involved in degradation of sugars and plant polysaccharides, and 108 genes that have orthologues only in the genomes of other thermophilic Eubacteria and Archaea. Of the Eubacteria sequenced to date, T. maritima has the highest percentage (24%) of genes that are most similar to archaeal genes. Eighty-one archaeal-like genes are clustered in 15 regions of the T. maritima genome that range in size from 4 to 20 kilobases. Conservation of gene order between T. maritima and Archaea in many of the clustered regions suggests that lateral gene transfer may have occurred between thermophilic Eubacteria and Archaea.

1,486 citations


Journal ArticleDOI
TL;DR: It is expected that cell cultures of patients with mitochondrial diseases will increasingly be used to address fundamental questions about mtDNA expression, and several key enzymes involved in mtDNA replication, transcription and protein synthesis have now been biochemically identified and some have been cloned.

1,337 citations


Journal ArticleDOI
TL;DR: This report describes the systematic and up-to-date analysis of genomes (PEDANT), a comprehensive database of the yeast genome (MYGD), a database reflecting the progress in sequencing the Arabidopsis thaliana genome (MATD), the database of assembled, annotated human EST clusters (MEST), and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume).
Abstract: The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the databases for the comprehensive set of genomes (PEDANT genomes), the database of annotated human EST clusters (HIB), the database of complete cDNAs from the DHGP (German Human Genome Project), as well as the project specific databases for the GABI (Genome Analysis in Plants) and HNB (Helmholtz-Netzwerk Bioinformatik) networks. The Arabidopsis thaliana database (MATDB), the database of mitochondrial proteins (MITOP) and our contribution to the PIR International Protein Sequence Database have been described elsewhere [Schoof et al. (2002) Nucleic Acids Res., 30, 91-93; Scharfe et al. (2000) Nucleic Acids Res., 28, 155-158; Barker et al. (2001) Nucleic Acids Res., 29, 29-32]. All databases described, the protein analysis tools provided and the detailed descriptions of our projects can be accessed through the MIPS World Wide Web server (http://mips.gsf.de).

1,314 citations


Journal ArticleDOI
TL;DR: It is proposed that a major factor in the more frequent horizontal transfer of operational genes is that informational genes are typically members of large, complex systems, whereas operational genes are not, thereby making horizontal transfer of informational gene products less probable (the complexity hypothesis).
Abstract: Increasingly, studies of genes and genomes are indicating that considerable horizontal transfer has occurred between prokaryotes. Extensive horizontal transfer has occurred for operational genes (those involved in housekeeping), whereas informational genes (those involved in transcription, translation, and related processes) are seldom horizontally transferred. Through phylogenetic analysis of six complete prokaryotic genomes and the identification of 312 sets of orthologous genes present in all six genomes, we tested two theories describing the temporal flow of horizontal transfer. We show that operational genes have been horizontally transferred continuously since the divergence of the prokaryotes, rather than having been exchanged in one, or a few, massive events that occurred early in the evolution of prokaryotes. In agreement with earlier studies, we found that differences in rates of evolution between operational and informational genes are minimal, suggesting that factors other than rate of evolution are responsible for the observed differences in horizontal transfer. We propose that a major factor in the more frequent horizontal transfer of operational genes is that informational genes are typically members of large, complex systems, whereas operational genes are not, thereby making horizontal transfer of informational gene products less probable (the complexity hypothesis).

1,149 citations


Journal ArticleDOI
04 Nov 1999-Nature
TL;DR: It is shown that 215 genes or proteins in the complete genomes of Escherichia coli, Haemophilus influenzae and Methanococcus jannaschii are involved in 64 unique fusion events, from which functional associations of proteins can be predicted.
Abstract: A large-scale effort to measure, detect and analyse protein-protein interactions using experimental methods is under way. These include biochemistry such as co-immunoprecipitation or crosslinking, molecular biology such as the two-hybrid system or phage display, and genetics such as unlinked noncomplementing mutant detection. Using the two-hybrid system, an international effort to analyse the complete yeast genome is in progress. Evidently, all these approaches are tedious, labour intensive and inaccurate. From a computational perspective, the question is how can we predict that two proteins interact from structure or sequence alone. Here we present a method that identifies gene-fusion events in complete genomes, solely based on sequence comparison. Because there must be selective pressure for certain genes to be fused over the course of evolution, we are able to predict functional associations of proteins. We show that 215 genes or proteins in the complete genomes of Escherichia coli, Haemophilus influenzae and Methanococcus jannaschii are involved in 64 unique fusion events. The approach is general, and can be applied even to genes of unknown function.

Journal ArticleDOI
Ian Dunham, Nobuyoshi Shimizu, Bruce A. Roe, S. Chissoe and 220 more authors (15 institutions)
02 Dec 1999-Nature
TL;DR: The sequence of the euchromatic part of human chromosome 22 is reported, which consists of 12 contiguous segments spanning 33.4 megabases, contains at least 545 genes and 134 pseudogenes, and provides the first view of the complex chromosomal landscapes that will be found in the rest of the genome.
Abstract: Knowledge of the complete genomic DNA sequence of an organism allows a systematic approach to defining its genetic components. The genomic sequence provides access to the complete structures of all genes, including those without known function, their control elements, and, by inference, the proteins they encode, as well as all other biologically important sequences. Furthermore, the sequence is a rich and permanent source of information for the design of further biological studies of the organism and for the study of evolution through cross-species sequence comparison. The power of this approach has been amply demonstrated by the determination of the sequences of a number of microbial and model organisms. The next step is to obtain the complete sequence of the entire human genome. Here we report the sequence of the euchromatic part of human chromosome 22. The sequence obtained consists of 12 contiguous segments spanning 33.4 megabases, contains at least 545 genes and 134 pseudogenes, and provides the first view of the complex chromosomal landscapes that will be found in the rest of the genome.

Journal ArticleDOI
TL;DR: The many new examples of human genes derived from single transposon insertions highlight the large contribution of selfish DNA to genomic evolution.

Journal ArticleDOI
TL;DR: Using an efficient data structure called a suffix tree, the system is able to rapidly align sequences containing millions of nucleotides and should facilitate analysis of syntenic chromosomal regions, strain-to-strain comparisons, evolutionary comparisons and genomic duplications.
Abstract: A new system for aligning whole genome sequences is described. Using an efficient data structure called a suffix tree, the system is able to rapidly align sequences containing millions of nucleotides. Its use is demonstrated on two strains of Mycobacterium tuberculosis, on two less similar species of Mycoplasma bacteria and on two syntenic sequences from human chromosome 12 and mouse chromosome 6. In each case it found an alignment of the input sequences, using between 30 s and 2 min of computation time. From the system output, information on single nucleotide changes, translocations and homologous genes can easily be extracted. Use of the algorithm should facilitate analysis of syntenic chromosomal regions, strain-to-strain comparisons, evolutionary comparisons and genomic duplications.

Journal ArticleDOI
04 Nov 1999-Nature
TL;DR: Proteins are grouped by correlated evolution, correlated messenger RNA expression patterns and patterns of domain fusion to determine functional relationships among the 6,217 proteins of the yeast Saccharomyces cerevisiae to discover pairwise links between functionally related yeast proteins.
Abstract: The availability of over 20 fully sequenced genomes has driven the development of new methods to find protein function and interactions. Here we group proteins by correlated evolution, correlated messenger RNA expression patterns and patterns of domain fusion to determine functional relationships among the 6,217 proteins of the yeast Saccharomyces cerevisiae. Using these methods, we discover over 93,000 pairwise links between functionally related yeast proteins. Links between characterized and uncharacterized proteins allow a general function to be assigned to more than half of the 2,557 previously uncharacterized yeast proteins. Examples of functional links are given for a protein family of previously unknown function, a protein whose human homologues are implicated in colon cancer and the yeast prion Sup35.

Journal ArticleDOI
19 Nov 1999-Science
TL;DR: Deinococcus radiodurans represents an organism in which all systems for DNA repair, DNA damage export, desiccation and starvation recovery, and genetic redundancy are present in one cell.
Abstract: The complete genome sequence of the radiation-resistant bacterium Deinococcus radiodurans R1 is composed of two chromosomes (2,648,638 and 412,348 base pairs), a megaplasmid (177,466 base pairs), and a small plasmid (45,704 base pairs), yielding a total genome of 3,284,156 base pairs. Multiple components distributed on the chromosomes and megaplasmid that contribute to the ability of D. radiodurans to survive under conditions of starvation, oxidative stress, and high amounts of DNA damage were identified. Deinococcus radiodurans represents an organism in which all systems for DNA repair, DNA damage export, desiccation and starvation recovery, and genetic redundancy are present in one cell.
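
As a quick consistency check on the figures quoted in the abstract above, the four replicon sizes can be summed and compared with the stated genome total. The dictionary keys are illustrative labels only; the sizes are taken directly from the abstract.

```python
# Replicon sizes in base pairs, as quoted in the abstract.
# The key names are illustrative labels, not official replicon names.
replicons_bp = {
    "larger chromosome": 2_648_638,
    "smaller chromosome": 412_348,
    "megaplasmid": 177_466,
    "small plasmid": 45_704,
}

total_bp = sum(replicons_bp.values())

# Share of the genome carried by each replicon, as a percentage.
share_percent = {
    name: 100 * size / total_bp for name, size in replicons_bp.items()
}
```

The sum agrees with the total of 3,284,156 base pairs given in the abstract.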

Journal ArticleDOI
TL;DR: DNA and predicted protein sequence similarities, implying homology, are reported, among genes of double-stranded DNA (dsDNA) bacteriophages and prophages spanning a broad phylogenetic range of host bacteria, suggesting common ancestry among these phage genes.
Abstract: We report DNA and predicted protein sequence similarities, implying homology, among genes of double-stranded DNA (dsDNA) bacteriophages and prophages spanning a broad phylogenetic range of host bacteria. The sequence matches reported here establish genetic connections, not always direct, among the lambdoid phages of Escherichia coli, phage phiC31 of Streptomyces, phages of Mycobacterium, a previously unrecognized cryptic prophage, phiflu, in the Haemophilus influenzae genome, and two small prophage-like elements, phiRv1 and phiRv2, in the genome of Mycobacterium tuberculosis. The results imply that these phage genes, and very possibly all of the dsDNA tailed phages, share common ancestry. We propose a model for the genetic structure and dynamics of the global phage population in which all dsDNA phage genomes are mosaics with access, by horizontal exchange, to a large common genetic pool but in which access to the gene pool is not uniform for all phage.

Journal ArticleDOI
TL;DR: Between these different mechanisms, Alu elements have not only contributed a great deal to the evolution of the genome but also continue to contribute to a significant portion of human genetic diseases.

Journal ArticleDOI
TL;DR: Two-dimensional gel electrophoresis and mass spectrometry have, coupled with searches in protein and EST databases, transformed the protein-identification process, and proteomics is functional genomics at the protein level.

Journal ArticleDOI
10 Dec 1999-Science
TL;DR: Global transposon mutagenesis was used to identify nonessential genes in an effort to learn whether the naturally occurring gene complement is a true minimal genome under laboratory growth conditions, and suggests that 265 to 350 of the 480 protein-coding genes of M. genitalium are essential under laboratory growth conditions.
Abstract: Mycoplasma genitalium with 517 genes has the smallest gene complement of any independently replicating cell so far identified. Global transposon mutagenesis was used to identify nonessential genes in an effort to learn whether the naturally occurring gene complement is a true minimal genome under laboratory growth conditions. The positions of 2209 transposon insertions in the completely sequenced genomes of M. genitalium and its close relative M. pneumoniae were determined by sequencing across the junction of the transposon and the genomic DNA. These junctions defined 1354 distinct sites of insertion that were not lethal. The analysis suggests that 265 to 350 of the 480 protein-coding genes of M. genitalium are essential under laboratory growth conditions, including about 100 genes of unknown function.

Journal ArticleDOI
01 Sep 1999-Genetics
TL;DR: The results show that Drosophila genes have a wide range of sensitivity to inactivation by P elements, and provide a rationale for greatly expanding the BDGP primary collection based entirely on insertion site sequencing, and predict that this approach can bring >85% of all Drosophila open reading frames under experimental control.
Abstract: A fundamental goal of genetics and functional genomics is to identify and mutate every gene in model organisms such as Drosophila melanogaster. The Berkeley Drosophila Genome Project (BDGP) gene disruption project generates single P-element insertion strains that each mutate unique genomic open reading frames. Such strains strongly facilitate further genetic and molecular studies of the disrupted loci, but it has remained unclear if P elements can be used to mutate all Drosophila genes. We now report that the primary collection has grown to contain 1045 strains that disrupt more than 25% of the estimated 3600 Drosophila genes that are essential for adult viability. Of these P insertions, 67% have been verified by genetic tests to cause the associated recessive mutant phenotypes, and the validity of most of the remaining lines is predicted on statistical grounds. Sequences flanking >920 insertions have been determined to exactly position them in the genome and to identify 376 potentially affected transcripts from collections of EST sequences. Strains in the BDGP collection are available from the Bloomington Stock Center and have already assisted the research community in characterizing >250 Drosophila genes. The likely identity of 131 additional genes in the collection is reported here. Our results show that Drosophila genes have a wide range of sensitivity to inactivation by P elements, and provide a rationale for greatly expanding the BDGP primary collection based entirely on insertion site sequencing. We predict that this approach can bring >85% of all Drosophila open reading frames under experimental control.

Journal ArticleDOI
16 Dec 1999-Nature
TL;DR: The sequence of chromosome 2 from the Columbia ecotype is reported in two gap-free assemblies (contigs) of 3.6 and 16 megabases, which represents the longest published stretch of uninterrupted DNA sequence assembled from any organism to date.
Abstract: Arabidopsis thaliana (Arabidopsis) is unique among plant model organisms in having a small genome (130-140 Mb), excellent physical and genetic maps, and little repetitive DNA. Here we report the sequence of chromosome 2 from the Columbia ecotype in two gap-free assemblies (contigs) of 3.6 and 16 megabases (Mb). The latter represents the longest published stretch of uninterrupted DNA sequence assembled from any organism to date. Chromosome 2 represents 15% of the genome and encodes 4,037 genes, 49% of which have no predicted function. Roughly 250 tandem gene duplications were found in addition to large-scale duplications of about 0.5 and 4.5 Mb between chromosomes 2 and 1 and between chromosomes 2 and 4, respectively. Sequencing of nearly 2 Mb within the genetically defined centromere revealed a low density of recognizable genes, and a high density and diverse range of vestigial and presumably inactive mobile elements. More unexpected is what appears to be a recent insertion of a continuous stretch of 75% of the mitochondrial genome into chromosome 2.

Journal ArticleDOI
TL;DR: The increased genetic complexity of fish might reflect their evolutionary success and diversity, and many others evolved new functions particularly during development.

Journal ArticleDOI
TL;DR: It is shown that three independent hrm101/hrm101 mutants and two independent enx3/enx3 mutants are defective in filamentation on Spider medium, arguing that HRM101 and ENX3 sequences are indeed portions of genes and that the respective gene products have related functions.
Abstract: Candida albicans is an opportunistic fungal pathogen. It exists as a benign commensal organism in healthy individuals but causes infections in susceptible individuals, such as those with diminished immune function (14). Molecular genetic analysis of C. albicans has permitted evaluation of antifungal drug targets and elucidation of requirements for infection and pathogenesis (16). New C. albicans genes have been identified frequently through sequence homology to known genes or gene families. Gene discovery has been facilitated greatly by access to much of the C. albicans genomic sequence (11). Now, the rate-limiting step in analysis of gene function in this diploid organism is the creation of a homozygous disruption mutant. Gene disruption has been accomplished through successive transformations with insertion/deletion alleles that are constructed in vitro (2, 7, 12). These methods have thus far required isolation of substantial DNA segments, and yet new genes of interest are often identified through DNA sequences of 400 to 600 bp (3a). We report here a rapid method for disruption of C. albicans genes with PCR products that contain short regions of homology to the genome.

Journal ArticleDOI
TL;DR: The nucleotide binding site (NBS) is a characteristic domain of many plant resistance gene products; the wide distribution of NBS-encoding sequences in the plant kingdom and their prevalence in the Arabidopsis and rice genomes indicate that they are ancient, diverse and common in plants.
Abstract: The nucleotide binding site (NBS) is a characteristic domain of many plant resistance gene products. An increasing number of NBS-encoding sequences are being identified through gene cloning, PCR amplification with degenerate primers, and genome sequencing projects. The NBS domain was analyzed from 14 known plant resistance genes and more than 400 homologs, representing 26 genera of monocotyledonous, dicotyledonous and one coniferous species. Two distinct groups of diverse sequences were identified, indicating divergence during evolution and an ancient origin for these sequences. One group was comprised of sequences encoding an N-terminal domain with Toll/Interleukin-1 receptor homology (TIR), including the known resistance genes, N, M, L6, RPP1 and RPP5. Surprisingly, this group was entirely absent from monocot species in searches of both random genomic sequences and large collections of ESTs. A second group contained monocot and dicot sequences, including the known resistance genes, RPS2, RPM1, I2, Mi, Dm3, Pi-B, Xa1, RPP8, RPS5 and Prf. Amino acid signatures in the conserved motifs comprising the NBS domain clearly distinguished these two groups. The Arabidopsis genome is estimated to contain approximately 200 genes that encode related NBS motifs; TIR sequences were more abundant and outnumber non-TIR sequences threefold. The Arabidopsis NBS sequences currently in the databases are located in approximately 21 genomic clusters and 14 isolated loci. NBS-encoding sequences may be more prevalent in rice. The wide distribution of these sequences in the plant kingdom and their prevalence in the Arabidopsis and rice genomes indicate that they are ancient, diverse and common in plants. Sequence inferences suggest that these genes encode a novel class of nucleotide-binding proteins.

Journal ArticleDOI
TL;DR: Molecular biologists ought to respect the original definition of synteny and its etymological derivation, especially as this term is still needed to refer to genes located on the same chromosome.
Abstract: (Nature Genetics, volume 23, December 1999, p. 387) The term ‘synteny’ (or syntenic) refers to gene loci on the same chromosome regardless of whether or not they are genetically linked by classic linkage analysis (ref. 1). This term was introduced in 1971 by John H. Renwick, of the London School of Hygiene and Tropical Medicine, at the 4th International Congress of Human Genetics in Paris with one of us (E.P.) in attendance. The need for such a term was suggested to J.H. Renwick by E.A. Murphy, of Johns Hopkins University (ref. 2). It arose as a consequence of the new methods in gene mapping using somatic cell hybrid cells. Human genes located on the same chromosome with a genetic distance that could not be determined by the frequency of recombination lacked a term of reference. ‘Synteny’ means ‘same thread’ (or ribbon), a state of being together in location, as synchrony would be together in time. Although several textbooks (refs 3-10) and other reference works (refs 11-15) give a correct definition, the term synteny nowadays is often used to refer to gene loci in different organisms located on a chromosomal region of common evolutionary ancestry. This new usage of the term synteny does not correspond to its original definition and correct language derivation. A survey of 11 articles in Nature Genetics since 1992 using the term syntenic or synteny in either the title or the abstract revealed usage incorrect in 8 and ambiguous in 3. We believe molecular biologists ought to respect the original definition of synteny and its etymological derivation, especially as this term is still needed to refer to genes located on the same chromosome. We recognize the need to refer to gene loci of common ancestry. Correct terms exist: ‘paralogous’ for genes that arose from a common ancestor gene within one species and ‘orthologous’ for the same gene in different species.
Authors: Eberhard Passarge (1), Bernhard Horsthemke (1) & Rosann A. Farber (2). Affiliations: (1) Institut für Humangenetik, Universitätsklinikum Essen, Essen, Germany; (2) Department of Pathology and Laboratory Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA. Correspondence should be addressed to E.P. (e-mail: eberhard.passarge@uni-essen.de).

Journal ArticleDOI
01 Sep 1999-Plasmid
TL;DR: Understanding of the role of horizontal gene transfer in the environment is essential for the evaluation of the possible consequences of the deliberate environmental release of natural or recombinant bacteria for agricultural and bioremediation purposes.

Journal ArticleDOI
TL;DR: A PCR-based approach to sequencing complete mitochondrial genomes is described along with a set of 86 primers designed primarily for avian mitochondrial DNA, which should make available a wider variety of mitochondrial genes for studies based on smaller data sets.

Journal ArticleDOI
TL;DR: This comprehensive genome phylogeny is independent of phylogenies based on the level of sequence identity of individual genes, and correlates with the standard reference of prokaryotic phylogeny based on sequence similarity of 16S rRNA (ref. 4).
Abstract: Species phylogenies derived from comparisons of single genes are rarely consistent with each other, due to horizontal gene transfer, unrecognized paralogy and highly variable rates of evolution. The advent of completely sequenced genomes allows the construction of a phylogeny that is less sensitive to such inconsistencies and more representative of whole-genomes than are single-gene trees. Here, we present a distance-based phylogeny constructed on the basis of gene content, rather than on sequence identity, of 13 completely sequenced genomes of unicellular species. The similarity between two species is defined as the number of genes that they have in common divided by their total number of genes. In this type of phylogenetic analysis, evolutionary distance can be interpreted in terms of evolutionary events such as the acquisition and loss of genes, whereas the underlying properties (the gene content) can be interpreted in terms of function. As such, it takes a position intermediate to phylogenies based on single genes and phylogenies based on phenotypic characteristics. Although our comprehensive genome phylogeny is independent of phylogenies based on the level of sequence identity of individual genes, it correlates with the standard reference of prokaryotic phylogeny based on sequence similarity of 16S rRNA. Thus, shared gene content between genomes is quantitatively determined by phylogeny, rather than by phenotype, and horizontal gene transfer has only a limited role in determining the gene content of genomes.
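
The gene-content similarity defined in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: each genome is assumed to be already reduced to a set of orthologous-group identifiers, the denominator is assumed here to be the gene count of the smaller genome (the abstract says only "their total number of genes"), and distance is taken as one minus similarity. The genome names and gene identifiers are hypothetical.

```python
from itertools import combinations


def gene_content_similarity(genes_a, genes_b):
    """Shared genes divided by the gene count of the smaller genome.

    The choice of denominator is an assumption; the abstract does not
    specify how the "total number of genes" is computed.
    """
    if not genes_a or not genes_b:
        raise ValueError("each genome needs at least one gene")
    shared = len(genes_a & genes_b)
    return shared / min(len(genes_a), len(genes_b))


def distance_matrix(genomes):
    """Pairwise distances (1 - similarity), keyed by sorted name pairs."""
    distances = {}
    for name_a, name_b in combinations(sorted(genomes), 2):
        sim = gene_content_similarity(genomes[name_a], genomes[name_b])
        distances[(name_a, name_b)] = 1.0 - sim
    return distances


# Hypothetical genomes described by orthologous-group identifiers.
genomes = {
    "sp1": {"g1", "g2", "g3", "g4"},
    "sp2": {"g3", "g4", "g5", "g6"},
    "sp3": {"g1", "g7"},
}
distances = distance_matrix(genomes)
```

A matrix of this kind could then be passed to any distance-based tree-building method, such as neighbor joining, to obtain a gene-content phylogeny.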

Journal ArticleDOI
TL;DR: Defining more precisely the alpha-proteobacterial ancestry of the mitochondrial genome, and the contribution of the endosymbiotic event to the nuclear genome, will be essential for a full understanding of the origin and evolution of the eukaryotic cell as a whole.
Abstract: Recent results from ancestral (minimally derived) protists testify to the tremendous diversity of the mitochondrial genome in various eukaryotic lineages, but also reinforce the view that mitochondria, descendants of an endosymbiotic α-Proteobacterium, arose only once in evolution. The serial endosymbiosis theory, currently the most popular hypothesis to explain the origin of mitochondria, postulates the capture of an α-proteobacterial endosymbiont by a nucleus-containing eukaryotic host resembling extant amitochondriate protists. New sequence data have challenged this scenario, instead raising the possibility that the origin of the mitochondrion was coincident with, and contributed substantially to, the origin of the nuclear genome of the eukaryotic cell. Defining more precisely the α-proteobacterial ancestry of the mitochondrial genome, and the contribution of the endosymbiotic event to the nuclear genome, will be essential for a full understanding of the origin and evolution of the eukaryoti...