Topic

Pseudogene

About: Pseudogene is a research topic. Over its lifetime, 5,528 publications have appeared within this topic, receiving 336,634 citations. The topic is also known as Ψ and pseudogenes.


Papers
Journal Article
TL;DR: The mobilization and dispersal of this gene-rich 27 kb element to the pericentromeric regions of primate chromosomes defines an unprecedented form of recent genome evolution and a novel mechanism for the generation of genetic diversity among closely related species.
Abstract: We have identified a 26.5 kb gene-rich duplication shared by human Xq28 and 16p11.1. Complete comparative sequence analysis of cosmids from both loci has revealed identical Xq28 and 16p11.1 genomic structures for both the human creatine transporter gene (SLC6A8) and five exons of the CDM gene (DXS1357E). Overall nucleotide similarity within the duplication was found to be 94.6%, suggesting that this interchromosomal duplication occurred within recent evolutionary time (7-10 mya). Based on comparisons between genomic and cDNA sequence, both the Xq28 creatine transporter and DXS1357E genes are transcriptionally active. Predicted translation of exons and RT-PCR analysis reveal that chromosome 16 paralogs likely represent pseudogenes. Comparative fluorescent in situ hybridization (FISH) analyses of chromosomes from various primates indicate that this gene-rich segment has undergone several duplications. In gorilla and chimpanzee, multiple pericentromeric localizations on a variety of chromosomes were found using probes from the duplicated region. In other species, such as the orangutan and gibbon, FISH signals were only identified at the distal end of the X chromosome, suggesting that the Xq28 locus represents the ancestral copy. Sequencing of the 16p11.1/Xq28 duplication breakpoints has revealed the presence of repetitive immunoglobulin-like CAGGG pentamer sequences at or near the paralogy boundaries. The mobilization and dispersal of this gene-rich 27 kb element to the pericentromeric regions of primate chromosomes defines an unprecedented form of recent genome evolution and a novel mechanism for the generation of genetic diversity among closely related species.

154 citations

Journal Article
TL;DR: The study infers the frequency of functional divergence from the size distribution of gene families produced by two successive genome duplications early in vertebrate evolution and reasons for this unexpectedly high frequency are discussed.
Abstract: Gene duplication events are important sources of novel gene functions. However, more often than not, a duplicate gene may lose its function and become a pseudogene. What is the relative frequency of these two scenarios: functional divergence versus gene loss? Given that most non-neutral mutations are deleterious, gene loss should be far more frequent than divergence. However, a recent empirical study suggests that about 50% of all gene duplications will lead to functional divergence. The study infers the frequency of functional divergence from the size distribution of gene families produced by two successive genome duplications early in vertebrate evolution. Reasons for this unexpectedly high frequency of functional divergence are discussed.

154 citations

Journal Article
TL;DR: PhyOP is a fast and robust approach to orthology prediction that will be applicable to whole genomes from multiple closely related species, and will be particularly useful in predicting orthology for mammalian genomes that have been incompletely sequenced, and for large families of rapidly duplicating genes.
Abstract: Accurate predictions of orthology and paralogy relationships are necessary to infer human molecular function from experiments in model organisms. Previous genome-scale approaches to predicting these relationships have been limited by their use of protein similarity and their failure to take into account multiple splicing events and gene prediction errors. We have developed PhyOP, a new phylogenetic orthology prediction pipeline based on synonymous rate estimates, which accurately predicts orthology and paralogy relationships for transcripts, genes, exons, or genomic segments between closely related genomes. We were able to identify orthologue relationships to human genes for 93% of all dog genes from Ensembl. Among 1:1 orthologues, the alignments covered a median of 97.4% of protein sequences, and 92% of orthologues shared essentially identical gene structures. PhyOP accurately recapitulated genomic maps of conserved synteny. Benchmarking against predictions from Ensembl and Inparanoid showed that PhyOP is more accurate, especially in its predictions of paralogy. Nearly half (46%) of PhyOP paralogy predictions are unique. Using PhyOP to investigate orthologues and paralogues in the human and dog genomes, we found that the human assembly contains 3-fold more gene duplications than the dog. Species-specific duplicate genes, or “in-paralogues,” are generally shorter and have fewer exons than 1:1 orthologues, which is consistent with selective constraints and mutation biases based on the sizes of duplicated genes. In-paralogues have experienced elevated amino acid and synonymous nucleotide substitution rates. Duplicates possess similar biological functions for either the dog or human lineages. Having accounted for 2,954 likely pseudogenes and gene fragments, and after separating 346 erroneously merged genes, we estimated that the human genome encodes a minimum of 19,700 protein-coding genes, similar to the gene count of nematode worms. 
PhyOP is a fast and robust approach to orthology prediction that will be applicable to whole genomes from multiple closely related species. PhyOP will be particularly useful in predicting orthology for mammalian genomes that have been incompletely sequenced, and for large families of rapidly duplicating genes.

153 citations

Journal Article
TL;DR: The DNA sequences of wheat Acc-1 and Acc-2 loci, encoding the plastid and cytosolic forms of the enzyme acetyl-CoA carboxylase, were analyzed with a view to understanding the evolution of these genes and the origin of the three genomes in modern hexaploid wheat.
Abstract: The DNA sequences of wheat Acc-1 and Acc-2 loci, encoding the plastid and cytosolic forms of the enzyme acetyl-CoA carboxylase, were analyzed with a view to understanding the evolution of these genes and the origin of the three genomes in modern hexaploid wheat. Acc-1 and Acc-2 loci from each of the wheats Triticum urartu (A genome), Aegilops tauschii (D genome), Triticum turgidum (AB genome), and Triticum aestivum (ABD genome), as well as two Acc-2-related pseudogenes from T. urartu were sequenced. The 2.3-2.4 Mya divergence time calculated here for the three homoeologous chromosomes, on the basis of coding and intron sequences of the Acc-1 genes, is at the low end of other estimates. Our clock was calibrated by using 60 Mya for the divergence between wheat and maize. On the same time scale, wheat and barley diverged 11.6 Mya, based on sequences of Acc and other genes. The regions flanking the Acc genes are not conserved among the A, B, and D genomes. They are conserved when comparing homoeologous genomes of diploid, tetraploid, and hexaploid wheats. Substitution rates in intergenic regions consisting primarily of repetitive sequences vary substantially along the loci and on average are 3.5-fold higher than the Acc intron substitution rates. The composition of the Acc homoeoloci suggests haplotype divergence exceeding in some cases 0.5 Mya. Such variation might result in a significant overestimate of the time since tetraploid wheat formation, which occurred no more than 0.5 Mya.

153 citations

Journal Article
TL;DR: In this article, the authors performed a comparative genomic analysis of Bp K96243 and B. thailandensis (Bt) E264, a closely related but avirulent relative, and found that the acquisition of a capsular polysaccharide gene cluster in Bp, a key virulence component, is likely to have occurred non-randomly via replacement of an ancestral polysaccharide cluster.
Abstract: Background: The Gram-negative bacterium Burkholderia pseudomallei (Bp) is the causative agent of the human disease melioidosis. To understand the evolutionary mechanisms contributing to Bp virulence, we performed a comparative genomic analysis of Bp K96243 and B. thailandensis (Bt) E264, a closely related but avirulent relative. Results: We found the Bp and Bt genomes to be broadly similar, comprising two highly syntenic chromosomes with comparable numbers of coding regions (CDs), protein family distributions, and horizontally acquired genomic islands, which we experimentally validated to be differentially present in multiple Bt isolates. By examining species-specific genomic regions, we derived molecular explanations for previously-known metabolic differences, discovered potentially new ones, and found that the acquisition of a capsular polysaccharide gene cluster in Bp, a key virulence component, is likely to have occurred non-randomly via replacement of an ancestral polysaccharide cluster. Virulence related genes, in particular members of the Type III secretion needle complex, were collectively more divergent between Bp and Bt compared to the rest of the genome, possibly contributing towards the ability of Bp to infect mammalian hosts. An analysis of pseudogenes between the two species revealed that protein inactivation events were significantly biased towards membrane-associated proteins in Bt and transcription factors in Bp. Conclusion: Our results suggest that a limited number of horizontal-acquisition events, coupled with the fine-scale functional modulation of existing proteins, are likely to be the major drivers underlying Bp virulence. The extensive genomic similarity between Bp and Bt suggests that, in some cases, Bt could be used as a possible model system for studying certain aspects of Bp behavior.

153 citations


Network Information
Related Topics (5)
Gene: 211.7K papers, 10.3M citations, 95% related
Genome: 74.2K papers, 3.8M citations, 93% related
Regulation of gene expression: 85.4K papers, 5.8M citations, 91% related
Gene expression: 113.3K papers, 5.5M citations, 90% related
Transcription factor: 82.8K papers, 5.4M citations, 89% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    120
2022    250
2021    123
2020    160
2019    119
2018    127