Showing papers by "Kenneth H. Wolfe" published in 2002


Journal Article
TL;DR: A systematic and objective analysis of the draft human genome sequence is reported to identify paralogous chromosomal regions (paralogons) formed during chordate evolution and to estimate the ages of duplicate genes.
Abstract: Opinions on the hypothesis that ancient genome duplications contributed to the vertebrate genome range from strong skepticism to strong credence. Previous studies concentrated on small numbers of gene families or chromosomal regions that might not have been representative of the whole genome, or used subjective methods to identify paralogous genes and regions. Here we report a systematic and objective analysis of the draft human genome sequence to identify paralogous chromosomal regions (paralogons) formed during chordate evolution and to estimate the ages of duplicate genes. We found that the human genome contains many more paralogons than would be expected by chance. Molecular clock analysis of all protein families in humans that have orthologs in the fly and nematode indicated that a burst of gene duplication activity took place in the period 350–650 Myr ago and that many of the duplicate genes formed at this time are located within paralogons. Our results support the contention that many of the gene families in vertebrates were formed or expanded by large-scale DNA duplications in an early chordate. Considering the incompleteness of the sequence data and the antiquity of the event, the results are compatible with at least one round of polyploidy.

532 citations


Journal Article
TL;DR: It is estimated that Caenorhabditis elegans and C. briggsae diverged 50-120 million years ago and that there have since been 4030 rearrangements between their whole genomes, implying a rearrangement rate at least four times that of Drosophila, which was previously reported to have the fastest rate among eukaryotes.
Abstract: The genes of Caenorhabditis elegans appear to have an unusually rapid rate of evolution. The substitution rates of many C. elegans genes are twice those of their orthologs in non-nematode metazoans (Aguinaldo et al. 1997; see Fig. 3 in Mushegian et al. 1998). Even among nematodes, the C. elegans small subunit ribosomal RNA gene evolves faster than its orthologs in most of the major clades (see Fig. 1 in Blaxter et al. 1998). It has been estimated that two-thirds of C. elegans protein-coding genes evolve more rapidly than their Drosophila orthologs (Mushegian et al. 1998). In vertebrates at least, the rate of nucleotide substitution is correlated with that of chromosomal rearrangement (Burt et al. 1999). Ranz et al. (2001) reported that Drosophila chromosomes rearrange at least 175 times faster than those of other metazoans, and at a rate at least five times greater than the rate of the fastest plant genomes. However, no Caenorhabditis rate data existed to compare with the Drosophila data. Given their fast rate of nucleotide substitution, we guessed that Caenorhabditis genomes might have a fast rate of rearrangement. Here, we have estimated the rate of rearrangement since the divergence of C. elegans from its sister species Caenorhabditis briggsae, using the complete C. elegans genome sequence (The C. elegans Sequencing Consortium 1998) and 13 Mb of sequence from C. briggsae released by the Washington University Genome Sequencing Center (http://genome.wustl.edu/gsc/). Previous studies have shown that C. elegans and C. briggsae have conservation of gene order over stretches of chromosome that can be up to six genes long (Kuwabara and Shah 1994; Thacker et al. 1999). To calculate the rate, we estimated the number of chromosomal rearrangements since the speciation of C. elegans and C. briggsae. 
Because both species have six chromosomes (Nigon and Dougherty 1949), we assumed that there have not been any fusions or fissions of whole chromosomes since they diverged. Kececioglu and Ravi (1995) and Hannenhalli (1996) have developed computer algorithms that deduce the historical order and sizes of the reciprocal translocations (whereby two nonhomologous chromosomes exchange chunks of DNA by recombination) and/or inversions that have occurred since the divergence of two multichromosomal genomes. However, the C. elegans genome evolves not only by reciprocal translocations and inversions, but also by transpositions (whereby a chunk of DNA excises from one chromosome and inserts into a nonhomologous chromosome) and duplications (Robertson 2001). We designed a simple algorithm to calculate the number and sizes of such mutations, although not the order in which they occurred. Our method starts by finding all perfectly conserved segments between two species, in which gene content, order, and orientation are conserved. Next, these segments are fused into larger segments that have been splintered by duplications, inversions, or transpositions. When no more segments can be merged, the final fused segments are assumed to have resulted from fissure of chromosomes by reciprocal translocations.
To convert the observed number of rearrangements into a rate, it is necessary to have an accurate estimate of the briggsae–elegans divergence date. Emmons et al. (1979) were the first to estimate this date, using restriction fragment data, venturing that it must be “tens of millions of years” ago. Butler et al. (1981) speculated that the date was 10–100 million years ago (Mya), judging from 5S rRNA sequences, anatomical differences, and protein electrophoretic mobilities. Subsequent estimates based on sequence data were 30–60 Mya (Prasad and Baillie 1989; one gene), 23–32 Mya (Heschl and Baillie 1990; one gene), 54–58 Mya (Lee et al. 1992; two genes), and 40 Mya (Kennedy et al. 1993; seven genes). Nematode fossils are extremely scarce (Poinar 1983). Therefore, to calibrate the molecular clock, these studies either assumed that all organisms have the same silent substitution rate (Prasad and Baillie 1989; Heschl and Baillie 1990) or nonsilent substitution rate (Lee et al. 1992), or that C. elegans has the same silent rate as Drosophila (Kennedy et al. 1993). These are dubious assumptions; for example, Mushegian et al. (1998) showed that about two-thirds of C. elegans genes have a higher rate of nonsilent substitution than their orthologs in Drosophila. To gain a more reliable interval estimate of the briggsae–elegans speciation date, we used phylogenetic analysis of all genes for which orthologous sequences were available from C. elegans, C. briggsae, Drosophila, and human. Only those genes that did not have a significantly different amino acid substitution rate in the four taxa were used to produce date estimates.
The briggsae–elegans sequence data set is the largest available for any pair of congeneric eukaryotes. Such a big sample has a high power for detecting genome-wide trends. For example, the breakpoints of reciprocal translocations and inversions are frequently near repetitive DNA. This has been observed in bacteria (Romero et al. 1999), yeast (Fischer et al. 2000), insects (Caceres et al. 1999), mammals (Dehal et al. 2001), and plants (Zhang and Peterson 1999), but not yet in nematodes. Rearrangements near transposable elements may happen when the element is transposing (Zhang and Peterson 1999), but most rearrangements are hypothesized to occur by homologous recombination between nontransposing transposable elements, dispersed repeats, or gene family members. We find that translocation and transposition breakpoints are strongly associated with repeats in the C. elegans genome.
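
The first step described above, finding perfectly conserved segments in which gene content, order, and orientation are all preserved, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `conserved_segments`, the input layout (a dict mapping chromosome names to ordered lists of `(gene, strand)` pairs), and the assumption of one-to-one orthologs that share an identifier are all hypothetical choices made for illustration. The later step of fusing segments is not shown.

```python
def conserved_segments(genome_a, genome_b):
    """Return maximal runs of adjacent genes in genome_a that are also
    adjacent, in the same relative order and orientation, in genome_b.

    Assumed input (hypothetical layout): dict of chromosome name ->
    ordered list of (gene_id, strand), strand being +1 or -1, with
    orthologs sharing the same gene_id in both genomes.
    """
    # Index every gene of genome_b by chromosome, position, and strand.
    pos_b = {gene: (chrom, i, strand)
             for chrom, genes in genome_b.items()
             for i, (gene, strand) in enumerate(genes)}

    segments = []
    for genes in genome_a.values():
        run, prev = [], None
        for gene, strand in genes:
            hit = pos_b.get(gene)
            if hit is None:
                cur = None  # no ortholog: treated as breaking the run
            else:
                chrom_b, idx_b, strand_b = hit
                # relative orientation: +1 same strand, -1 flipped
                cur = (chrom_b, idx_b, strand * strand_b)
            # A gene extends the run if it lies on the same genome_b
            # chromosome, has the same relative orientation, and sits one
            # step forward (+1) or backward (-1, inverted run) in genome_b.
            extends = (prev is not None and cur is not None
                       and cur[0] == prev[0]
                       and cur[2] == prev[2]
                       and cur[1] - prev[1] == cur[2])
            if not extends:
                if run:
                    segments.append(run)
                run = []
            if cur is not None:
                run.append(gene)
            prev = cur
        if run:
            segments.append(run)
    return segments


# Toy data (invented): genes c, d, e form an inverted block in genome B.
example_a = {"I": [("a", 1), ("b", 1), ("c", -1), ("d", 1), ("e", 1)]}
example_b = {"X": [("a", 1), ("b", 1)],
             "Y": [("e", -1), ("d", -1), ("c", 1)]}
example_segments = conserved_segments(example_a, example_b)
print(example_segments)
```

In this toy case the inverted block still counts as one conserved segment, because reversing the run also flips every strand, so order and orientation are preserved relative to each other.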

216 citations


Journal Article
TL;DR: It is found that the Génolevures data strongly support the hypothesis that S. cerevisiae is a degenerate polyploid, and that gene order information confirms and extends the map of sister regions constructed previously from duplicated genes, an independent source of information.
Abstract: The wealth of comparative genomics data from yeast species allows the molecular evolution of these eukaryotes to be studied in great detail. We used “proximity plots” to visually compare chromosomal gene order information from 14 hemiascomycetes, including the recent Génolevures survey, to Saccharomyces cerevisiae. Contrary to the original reports, we find that the Génolevures data strongly support the hypothesis that S. cerevisiae is a degenerate polyploid. Using gene order information alone, 70% of the S. cerevisiae genome can be mapped into “sister” regions that tile together with almost no overlap. This map confirms and extends the map of sister regions that we constructed previously by using duplicated genes, an independent source of information. Combining gene order and gene duplication data assigns essentially the whole genome into sister regions, the largest gap being only 36 genes long. The 16 centromere regions of S. cerevisiae form eight pairs, indicating that an ancestor with eight chromosomes underwent complete doubling; alternatives such as segmental duplications can be ruled out. Gene arrangements in Kluyveromyces lactis and four other species agree quantitatively with what would be expected if they diverged from S. cerevisiae before its polyploidization. In contrast, Saccharomyces exiguus, Saccharomyces servazzii, and Candida glabrata show higher levels of gene adjacency conservation, and more cases of imperfect conservation, suggesting that they split from the S. cerevisiae lineage after polyploidization. This finding is confirmed by sequences around the C. glabrata TRP1 and IPP1 loci, which show that it contains sister regions derived from the same duplication event as that of S. cerevisiae.
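
The notion of gene adjacency conservation used above can be made concrete with a small sketch. It is only an illustration, not the method of the paper: the function name `adjacency_conservation`, the single-chromosome gene lists, and the `max_gap` parameter are assumptions introduced here. It reports the fraction of neighbouring gene pairs in a query species whose orthologs also lie close together in a reference order.

```python
def adjacency_conservation(query_order, reference_order, max_gap=1):
    """Fraction of adjacent gene pairs in query_order whose orthologs lie
    within max_gap positions of each other in reference_order.

    Hypothetical simplification: each genome is one ordered list of gene
    identifiers, and orthologs share the same identifier.
    """
    ref_pos = {gene: i for i, gene in enumerate(reference_order)}
    # Only pairs where both genes have an ortholog in the reference count.
    pairs = [(a, b) for a, b in zip(query_order, query_order[1:])
             if a in ref_pos and b in ref_pos]
    if not pairs:
        return 0.0
    conserved = sum(1 for a, b in pairs
                    if abs(ref_pos[a] - ref_pos[b]) <= max_gap)
    return conserved / len(pairs)


# Toy data (invented): gene X sits between B and C in the reference.
query = ["A", "B", "C", "D"]
reference = ["A", "B", "X", "C", "D"]
strict = adjacency_conservation(query, reference)             # exact neighbours only
relaxed = adjacency_conservation(query, reference, max_gap=2)  # tolerate one intervening gene
print(strict, relaxed)
```

Raising `max_gap` is one simple way to count imperfect conservation, where query neighbours are separated by an intervening gene in the reference.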

139 citations


Journal Article
TL;DR: This work reports a uniform substitution rate in IR-less genomes, and finds this rate to be at the level otherwise reserved for SC genes, and proposes that this acceleration is a direct result of the decrease in the copy number of the sequence.
Abstract: The chloroplast genomes of some species of legumes lack the large inverted repeat (IR) that is a trademark of most land-plant chloroplasts. Our analysis of chloroplast genes in legume species that have an IR shows that the synonymous (silent) substitution rate in IR genes is 2.3-fold lower than in single-copy (SC) genes, which is largely in agreement with earlier findings. Given that all genes in species that lack the IR are single-copy, what level of synonymous substitution exists in these genes? We report a uniform substitution rate in IR-less genomes, and moreover, we find this rate to be at the level otherwise reserved for SC genes. In other words, the synonymous substitution rate has accelerated in the remaining copy of the duplicate region. We propose that this acceleration is a direct result of the decrease in the copy number of the sequence, rather than an intrinsic property of the genes normally located in the IR.

135 citations


Journal Article
TL;DR: Analysis of the endpoints of the rearrangement indicates that it probably occurred by means of a two-step process of expansion and contraction of the IR and not by a 78-kb inversion.
Abstract: We have sequenced two sections of chloroplast DNA from adzuki bean (Vigna angularis), containing the junctions between the inverted repeat (IR) and large single copy (LSC) regions of the genome. The gene order at both junctions is different from that described for other members of the legume family, such as Lotus japonicus and soybean. These differences have been attributed to an apparent 78-kb inversion that spans nearly the entire LSC region and which is present in adzuki and its close relative, the common bean. This 78-kb rearrangement broke the large S10 operon of ribosomal proteins into two smaller operons, one at each end of the LSC, without affecting the gene content of the genome. It disrupted the physical and transcriptional relationship between the six-gene rpl23-rpl14 cluster and the four-gene rps8-rpoA cluster that is conserved in most land plants. Analysis of the endpoints of the rearrangement indicates that it probably occurred by means of a two-step process of expansion and contraction of the IR and not by a 78-kb inversion.

40 citations


Journal Article
01 Aug 2002 - Yeast
TL;DR: The sequences of two genomic regions from the pathogenic yeast Candida glabrata and their comparison to Saccharomyces cerevisiae are reported and a small‐scale rearrangement of gene order has occurred in the chromosome XI‐like section.
Abstract: We report the sequences of two genomic regions from the pathogenic yeast Candida glabrata and their comparison to Saccharomyces cerevisiae. A 3 kb region from C. glabrata was sequenced that contains homologues of the S. cerevisiae genes TFB3, MRPL28 and STP1. The equivalent region in S. cerevisiae includes a fourth gene, MFA1, coding for mating factor a. The absence of MFA1 is consistent with C. glabrata's asexual life cycle, although we cannot exclude the possibility that a-factor gene(s) are located somewhere else in its genome. We also report the sequence of a 16 kb region from C. glabrata that contains a five-gene cluster similar to S. cerevisiae chromosome XI (including GCN3) followed by a four-gene cluster similar to chromosome XV (including HIS3). A small-scale rearrangement of gene order has occurred in the chromosome XI-like section. The sequences have been deposited in the GenBank database with Accession Nos AY083606 and AY083607. Copyright © 2002 John Wiley & Sons, Ltd.

6 citations