Showing papers on "Pseudogene" published in 2003

Open Access

Journal Article

Genome-Wide Analysis of NBS-LRR–Encoding Genes in Arabidopsis

Blake C. Meyers¹, Alexander Kozik¹, Alyssa Griego¹, Hanhui Kuang¹, Richard W Michelmore¹

University of California, Davis¹

01 Apr 2003-The Plant Cell

TL;DR: The observed diversity of these NBS-LRR proteins indicates the variety of recognition molecules available in an individual genotype to detect diverse biotic challenges.

Abstract: The Arabidopsis genome contains ∼200 genes that encode proteins with similarity to the nucleotide binding site and other domains characteristic of plant resistance proteins. Through a reiterative process of sequence analysis and reannotation, we identified 149 NBS-LRR–encoding genes in the Arabidopsis (ecotype Columbia) genomic sequence. Fifty-six of these genes were corrected from earlier annotations. At least 12 are predicted to be pseudogenes. As described previously, two distinct groups of sequences were identified: those that encoded an N-terminal domain with Toll/Interleukin-1 Receptor homology (TIR-NBS-LRR, or TNL), and those that encoded an N-terminal coiled-coil motif (CC-NBS-LRR, or CNL). The encoded proteins are distinct from the 58 predicted adapter proteins in the previously described TIR-X, TIR-NBS, and CC-NBS groups. Classification based on protein domains, intron positions, sequence conservation, and genome distribution defined four subgroups of CNL proteins, eight subgroups of TNL proteins, and a pair of divergent NL proteins that lack a defined N-terminal motif. CNL proteins generally were encoded in single exons, although two subclasses were identified that contained introns in unique positions. TNL proteins were encoded in modular exons, with conserved intron positions separating distinct protein domains. Conserved motifs were identified in the LRRs of both CNL and TNL proteins. In contrast to CNL proteins, TNL proteins contained large and variable C-terminal domains. The extant distribution and diversity of the NBS-LRR sequences has been generated by extensive duplication and ectopic rearrangements that involved segmental duplications as well as microscale events. The observed diversity of these NBS-LRR proteins indicates the variety of recognition molecules available in an individual genotype to detect diverse biotic challenges.

1,503 citations

Journal Article

Evolution's cauldron: Duplication, deletion, and rearrangement in the mouse and human genomes

W. James Kent¹, Robert Baertsch, Angie S. Hinrichs, Webb Miller, David Haussler

University of California, Santa Cruz¹

30 Sep 2003-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: New alignment techniques that can handle large gaps in a robust fashion and discriminate between orthologous and paralogous alignments are developed and provide evidence that ≈2% of the genes in the human/mouse common ancestor have been deleted or partially deleted in the mouse.

Abstract: This study examines genomic duplications, deletions, and rearrangements that have happened at scales ranging from a single base to complete chromosomes by comparing the mouse and human genomes. From whole-genome sequence alignments, 344 large (>100-kb) blocks of conserved synteny are evident, but these are further fragmented by smaller-scale evolutionary events. Excluding transposon insertions, on average in each megabase of genomic alignment we observe two inversions, 17 duplications (five tandem or nearly tandem), seven transpositions, and 200 deletions of 100 bases or more. This includes 160 inversions and 75 duplications or transpositions of length >100 kb. The frequencies of these smaller events are not substantially higher in finished portions in the assembly. Many of the smaller transpositions are processed pseudogenes; we define a “syntenic” subset of the alignments that excludes these and other small-scale transpositions. These alignments provide evidence that ≈2% of the genes in the human/mouse common ancestor have been deleted or partially deleted in the mouse. There also appears to be slightly less nontransposon-induced genome duplication in the mouse than in the human lineage. Although some of the events we detect are possibly due to misassemblies or missing data in the current genome sequence or to the limitations of our methods, most are likely to represent genuine evolutionary events. To make these observations, we developed new alignment techniques that can handle large gaps in a robust fashion and discriminate between orthologous and paralogous alignments.

813 citations

Journal Article

Molecular evolution of the insect chemoreceptor gene superfamily in Drosophila melanogaster.

Hugh M. Robertson¹, Coral G. Warr²,³, John R. Carlson²

University of Illinois at Urbana–Champaign¹, Yale University², Monash University³

25 Nov 2003-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: The insect chemoreceptor superfamily in Drosophila melanogaster is predicted to consist of 62 odorant receptor (Or) and 68 gustatory receptor (Gr) proteins, encoded by families of 60 Or and 60 Gr genes through alternative splicing.

Abstract: The insect chemoreceptor superfamily in Drosophila melanogaster is predicted to consist of 62 odorant receptor (Or) and 68 gustatory receptor (Gr) proteins, encoded by families of 60 Or and 60 Gr genes through alternative splicing. We include two previously undescribed Or genes and two previously undescribed Gr genes; two previously predicted Or genes are shown to be alternative splice forms. Three polymorphic pseudogenes and one highly defective pseudogene are recognized. Phylogenetic analysis reveals deep branches connecting multiple highly divergent clades within the Gr family, and the Or family appears to be a single highly expanded lineage within the superfamily. The genes are spread throughout the Drosophila genome, with some relatively recently diverged genes still clustered in the genome. The Gr5a gene on the X chromosome, which encodes a receptor for the sugar trehalose, has transposed from one such tandem cluster of six genes at cytological location 64, as has Gr61a, and all eight of these receptors might bind sugars. Analysis of intron evolution suggests that the common ancestor consisted of a long N-terminal exon encoding transmembrane domains 1-5 followed by three exons encoding transmembrane domains 6-7. As many as 57 additional introns have been acquired idiosyncratically during the evolution of the superfamily, whereas the ancestral introns and some of the older idiosyncratic introns have been lost at least 48 times independently. Altogether, these patterns of molecular evolution suggest that this is an ancient superfamily of chemoreceptors, probably dating back at least to the origin of the arthropods.

745 citations

Journal Article

Mutations in SBDS are associated with Shwachman–Diamond syndrome

Graeme R.B. Boocock, Jodi Morrison, Maja Popovic¹, Nicole Richards, Lynda Ellis¹, Peter R. Durie¹, Johanna M. Rommens¹

University of Toronto¹

01 Jan 2003-Nature Genetics

TL;DR: Identification of disease-associated mutations in an uncharacterized gene, SBDS, in the interval of 1.9 cM at 7q11 is reported, suggesting that SDS may be caused by a deficiency in an aspect of RNA metabolism essential for development of the exocrine pancreas, hematopoiesis and chondrogenesis.

Abstract: Shwachman-Diamond syndrome (SDS; OMIM 260400) is an autosomal recessive disorder with clinical features that include pancreatic exocrine insufficiency, hematological dysfunction and skeletal abnormalities. Here, we report identification of disease-associated mutations in an uncharacterized gene, SBDS, in the interval of 1.9 cM at 7q11 previously shown to be associated with the disease. We report that SBDS has a 1.6-kb transcript and encodes a predicted protein of 250 amino acids. A pseudogene copy (SBDSP) with 97% nucleotide sequence identity resides in a locally duplicated genomic segment of 305 kb. We found recurring mutations resulting from gene conversion in 89% of unrelated individuals with SDS (141 of 158), with 60% (95 of 158) carrying two converted alleles. Converted segments consistently included at least one of two pseudogene-like sequence changes that result in protein truncation. SBDS is a member of a highly conserved protein family of unknown function with putative orthologs in diverse species including archaea and eukaryotes. Archaeal orthologs are located within highly conserved operons that include homologs of RNA-processing genes, suggesting that SDS may be caused by a deficiency in an aspect of RNA metabolism that is essential for development of the exocrine pancreas, hematopoiesis and chondrogenesis.
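
The percentages quoted in this abstract can be checked against the stated counts. The following is only a minimal sketch of that arithmetic; the helper name `percent` and the variable names are illustrative assumptions and do not come from the paper.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts as stated in the abstract (unrelated individuals with SDS).
unrelated_individuals = 158
with_gene_conversion = 141        # carry a recurring conversion mutation
with_two_converted_alleles = 95   # carry two converted alleles

pct_conversion = percent(with_gene_conversion, unrelated_individuals)
pct_two_alleles = percent(with_two_converted_alleles, unrelated_individuals)

# Rounded to whole percentages, as the abstract reports them.
print(round(pct_conversion), round(pct_two_alleles))
```

Both values round to the whole-number percentages given in the abstract (89% and 60%).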

672 citations

Journal Article

Inferring Nonneutral Evolution from Human-Chimp-Mouse Orthologous Gene Trios

Andrew G. Clark¹, Stephen Glanowski², Rasmus Nielsen¹, Paul Thomas³, Anish Kejariwal³, Melissa A. Todd¹, David M. Tanenbaum³, Daniel Civello³, Fu Lu³, Brian Murphy², Steve Ferriera², Gary Wang², Xiangqun Zheng³, Thomas J. White³, John J. Sninsky³, Mark Raymond Adams³, Michele Cargill³

Cornell University¹, Applied Biosystems², Celera Corporation³

12 Dec 2003-Science

TL;DR: Partitions of genes into inferred biological classes identified accelerated evolution in several functional classes, including olfaction and nuclear transport, and human-accelerated genes are significantly more likely to underlie major known Mendelian disorders.

Abstract: Even though human and chimpanzee gene sequences are nearly 99% identical, sequence comparisons can nevertheless be highly informative in identifying biologically important changes that have occurred since our ancestral lineages diverged. We analyzed alignments of 7645 chimpanzee gene sequences to their human and mouse orthologs. These three-species sequence alignments allowed us to identify genes undergoing natural selection along the human and chimp lineage by fitting models that include parameters specifying rates of synonymous and nonsynonymous nucleotide substitution. This evolutionary approach revealed an informative set of genes with significantly different patterns of substitution on the human lineage compared with the chimpanzee and mouse lineages. Partitions of genes into inferred biological classes identified accelerated evolution in several functional classes, including olfaction and nuclear transport. In addition to suggesting adaptive physiological differences between chimps and humans, human-accelerated genes are significantly more likely to underlie major known Mendelian disorders.

648 citations

Journal Article

Negative feedback regulation ensures the one receptor-one olfactory neuron rule in mouse.

Shou Serizawa¹, Kazunari Miyamichi, Hiroko Nakatani, Misao Suzuki, Michiko Saito, Yoshihiro Yoshihara, Hitoshi Sakano

University of Tokyo¹

19 Dec 2003-Science

TL;DR: It is proposed that stochastic activation of only one OR gene within the cluster and negative feedback regulation by that OR gene product are necessary to ensure the one receptor–one neuron rule.

Abstract: In the mouse olfactory system, each olfactory sensory neuron (OSN) expresses only one odorant receptor (OR) gene in a monoallelic and mutually exclusive manner. Such expression forms the genetic basis for OR-instructed axonal projection of OSNs to the olfactory bulb of the brain during development. Here, we identify an upstream cis-acting DNA region that activates the OR gene cluster in mouse and allows the expression of only one OR gene within the cluster. Deletion of the coding region of the expressed OR gene or a naturally occurring frame-shift mutation allows a second OR gene to be expressed. We propose that stochastic activation of only one OR gene within the cluster and negative feedback regulation by that OR gene product are necessary to ensure the one receptor-one neuron rule.

527 citations

Journal Article

Complete genome sequence of the Q-fever pathogen Coxiella burnetii

Rekha Seshadri, Ian T. Paulsen¹,², Jonathan A. Eisen¹,², Timothy D. Read², Karen E. Nelson², William C. Nelson², Naomi L. Ward²,³, Hervé Tettelin², Tanja M. Davidsen², Maureen J. Beanan², Robert T. DeBoy², Sean C. Daugherty², Lauren M. Brinkac², Ramana Madupu², Robert J. Dodson², Hoda Khouri², K. Lee², Heather A. Carty², David J. Scanlan², Robert A. Heinzen⁴, Herbert A. Thompson⁵, James E. Samuel⁶, Claire M. Fraser²,⁷, John F. Heidelberg²,³

Johns Hopkins University¹, J. Craig Venter Institute², University of Maryland, Baltimore County³, University of Wyoming⁴, Centers for Disease Control and Prevention⁵, Texas A&M University⁶, George Washington University⁷

29 Apr 2003-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: The genome of Coxiella burnetii Nine Mile phase I RSA493, a highly virulent zoonotic pathogen and category B bioterrorism agent, was sequenced by the random shotgun method; the analyses suggest that the obligate intracellular lifestyle of C. burnetii may be a relatively recent innovation.

Abstract: The 1,995,275-bp genome of Coxiella burnetii, Nine Mile phase I RSA493, a highly virulent zoonotic pathogen and category B bioterrorism agent, was sequenced by the random shotgun method. This bacterium is an obligate intracellular acidophile that is highly adapted for life within the eukaryotic phagolysosome. Genome analysis revealed many genes with potential roles in adhesion, invasion, intracellular trafficking, host-cell modulation, and detoxification. A previously uncharacterized 13-member family of ankyrin repeat-containing proteins is implicated in the pathogenesis of this organism. Although the lifestyle and parasitic strategies of C. burnetii resemble that of Rickettsiae and Chlamydiae, their genome architectures differ considerably in terms of presence of mobile elements, extent of genome reduction, metabolic capabilities, and transporter profiles. The presence of 83 pseudogenes displays an ongoing process of gene degradation. Unlike other obligate intracellular bacteria, 32 insertion sequences are found dispersed in the chromosome, indicating some plasticity in the C. burnetii genome. These analyses suggest that the obligate intracellular lifestyle of C. burnetii may be a relatively recent innovation.

516 citations

Journal Article

The genome of Nanoarchaeum equitans: Insights into early archaeal evolution and derived parasitism

28 Oct 2003-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: The hyperthermophile Nanoarchaeum equitans is an obligate symbiont growing in coculture with the crenarchaeon Ignicoccus, and represents a basal archaeal lineage and has a highly reduced genome.

Abstract: The hyperthermophile Nanoarchaeum equitans is an obligate symbiont growing in coculture with the crenarchaeon Ignicoccus. Ribosomal protein and rRNA-based phylogenies place its branching point early in the archaeal lineage, representing the new archaeal kingdom Nanoarchaeota. The N. equitans genome (490,885 base pairs) encodes the machinery for information processing and repair, but lacks genes for lipid, cofactor, amino acid, or nucleotide biosyntheses. It is the smallest microbial genome sequenced to date, and also one of the most compact, with 95% of the DNA predicted to encode proteins or stable RNAs. Its limited biosynthetic and catabolic capacity indicates that N. equitans' symbiotic relationship to Ignicoccus is parasitic, making it the only known archaeal parasite. Unlike the small genomes of bacterial parasites that are undergoing reductive evolution, N. equitans has few pseudogenes or extensive regions of noncoding DNA. This organism represents a basal archaeal lineage and has a highly reduced genome.

506 citations

Journal Article

Pseudogenes: are they "junk" or functional DNA?

Evgeniy S. Balakirev¹, Francisco J. Ayala

University of California, Irvine¹

28 Nov 2003-Annual Review of Genetics

TL;DR: The Drosophila literature is reviewed, and the authors agree with the proposal that pseudogenes be considered "potogenes", i.e., DNA sequences with a potentiality for becoming new genes.

Abstract: Pseudogenes have been defined as nonfunctional sequences of genomic DNA originally derived from functional genes. It is therefore assumed that all pseudogene mutations are selectively neutral and have equal probability to become fixed in the population. Rather, pseudogenes that have been suitably investigated often exhibit functional roles, such as gene expression, gene regulation, generation of genetic (antibody, antigenic, and other) diversity. Pseudogenes are involved in gene conversion or recombination with functional genes. Pseudogenes exhibit evolutionary conservation of gene sequence, reduced nucleotide variability, excess synonymous over nonsynonymous nucleotide polymorphism, and other features that are expected in genes or DNA sequences that have functional roles. We first review the Drosophila literature and then extend the discussion to the various functional features identified in the pseudogenes of other organisms. A pseudogene that has arisen by duplication or retroposition may, a...

460 citations

Journal Article

Complete Genome Sequence and Comparative Genomics of Shigella flexneri Serotype 2a Strain 2457T

J. Wei, Marcia B. Goldberg¹, Valerie Burland, Malabi M. Venkatesan², W. Deng, Gregory P. Fournier¹, George F. Mayhew, Guy Plunkett, Debra J. Rose, Aaron E. Darling, Bob Mau, Nicole T. Perna, Shelley M. Payne³, L. J. Runyen-Janecky³, Shiguo Zhou⁴, David C. Schwartz⁴, Frederick R. Blattner

Harvard University¹, Walter Reed Army Institute of Research², University of Texas at Austin³, University of Wisconsin-Madison⁴

01 May 2003-Infection and Immunity

TL;DR: The complete genome sequence of Shigella flexneri serotype 2a strain 2457T (4,599,354 bp) was determined and it was found that the strain is distinctive in its large complement of insertion sequences, with several genomic rearrangements mediated by insertion sequences.

Abstract: We determined the complete genome sequence of Shigella flexneri serotype 2a strain 2457T (4,599,354 bp). Shigella species cause >1 million deaths per year from dysentery and diarrhea and have a lifestyle that is markedly different from those of closely related bacteria, including Escherichia coli. The genome exhibits the backbone and island mosaic structure of E. coli pathogens, albeit with much less horizontally transferred DNA and lacking 357 genes present in E. coli. The strain is distinctive in its large complement of insertion sequences, with several genomic rearrangements mediated by insertion sequences, 12 cryptic prophages, 372 pseudogenes, and 195 S. flexneri-specific genes. The 2457T genome was also compared with that of a recently sequenced S. flexneri 2a strain, 301. Our data are consistent with Shigella being phylogenetically indistinguishable from E. coli. The S. flexneri-specific regions contain many genes that could encode proteins with roles in virulence. Analysis of these will reveal the genetic basis for aspects of this pathogenic organism's distinctive lifestyle that have yet to be explained.

419 citations

Journal Article

Characterization of angiosperm nrDNA polymorphism, paralogy, and pseudogenes.

C. Donovan Bailey¹, Timothy G. Carr², Stephen A. Harris¹, Colin E. Hughes¹

University of Oxford¹, Cornell University²

01 Dec 2003-Molecular Phylogenetics and Evolution

TL;DR: It is concluded that a priori determinations of orthology and paralogy of nrDNA sequences should not be made based on the functionality or lack of functionality of those sequences, and the advantages of a tree-based approach to identifying pseudogenes, based on comparisons of sequence substitution patterns from putatively conserved regions, are described.

Journal Article

Millions of Years of Evolution Preserved: A Comprehensive Catalog of the Processed Pseudogenes in the Human Genome

Zhaolei Zhang¹, Paul M. Harrison¹, Yin Liu¹, Mark Gerstein¹

Yale University¹

01 Dec 2003-Genome Research

TL;DR: Overall, processed pseudogenes are very similar to their closest corresponding human gene, being 94% complete in coding regions, with sequence similarity of 75% for amino acids and 86% for nucleotides. Their chromosomal distribution varies with GC content: processed pseudogenes occur mostly in intermediate GC-content regions.

Abstract: Processed pseudogenes were created by reverse-transcription of mRNAs; they provide snapshots of ancient genes existing millions of years ago in the genome. To find them in the present-day human, we developed a pipeline using features such as intron-absence, frame-disruption, polyadenylation, and truncation. This has enabled us to identify in recent genome drafts approximately 8000 processed pseudogenes (distributed from http://pseudogene.org). Overall, processed pseudogenes are very similar to their closest corresponding human gene, being 94% complete in coding regions, with sequence similarity of 75% for amino acids and 86% for nucleotides. Their chromosomal distribution appears random and dispersed, with the numbers on chromosomes proportional to length, suggesting sustained "bombardment" over evolution. However, it does vary with GC-content: Processed pseudogenes occur mostly in intermediate GC-content regions. This is similar to Alus but contrasts with functional genes and L1-repeats. Pseudogenes, moreover, have age profiles similar to Alus. The number of pseudogenes associated with a given gene follows a power-law relationship, with a few genes giving rise to many pseudogenes and most giving rise to few. The prevalence of processed pseudogenes agrees well with germ-line gene expression. Highly expressed ribosomal proteins account for approximately 20% of the total. Other notables include cyclophilin-A, keratin, GAPDH, and cytochrome c.

Journal Article

An expressed pseudogene regulates the messenger-RNA stability of its homologous coding gene.

Shinji Hirotsune¹, Noriyuki Yoshida, Amy Chen², Lisa Garrett², Fumihiro Sugiyama³, Satoru Takahashi³, Ken-ichi Yagami³, Anthony Wynshaw-Boris²,⁴, Atsushi Yoshiki

National Presto Industries¹, National Institutes of Health², University of Tsukuba³, University of California, San Diego⁴

01 May 2003-Nature

TL;DR: The role of an expressed pseudogene (regulation of messenger-RNA stability) in a transgene-insertion mouse mutant exhibiting polycystic kidneys and bone deformity is reported, and the findings point to the functional significance of non-coding RNAs.

Abstract: A pseudogene is a gene copy that does not produce a functional, full-length protein. The human genome is estimated to contain up to 20,000 pseudogenes. Although much effort has been devoted to understanding the function of pseudogenes, their biological roles remain largely unknown. Here we report the role of an expressed pseudogene (regulation of messenger-RNA stability) in a transgene-insertion mouse mutant exhibiting polycystic kidneys and bone deformity. The transgene was integrated into the vicinity of the expressing pseudogene of Makorin1, called Makorin1-p1. This insertion reduced transcription of Makorin1-p1, resulting in destabilization of Makorin1 mRNA in trans by way of a cis-acting RNA decay element within the 5' region of Makorin1 that is homologous between Makorin1 and Makorin1-p1. Either Makorin1 or Makorin1-p1 transgenes could rescue these phenotypes. Our findings demonstrate a specific regulatory role of an expressed pseudogene, and point to the functional significance of non-coding RNAs.

Journal Article

Comparative Genomics of Salmonella enterica Serovar Typhi Strains Ty2 and CT18

Wen Deng, Shian-Ren Liou, Guy Plunkett, George F. Mayhew, Debra J. Rose, Valerie Burland, Voula Kodoyianni¹, David C. Schwartz¹, Frederick R. Blattner

University of Wisconsin-Madison¹

01 Apr 2003-Journal of Bacteriology

TL;DR: The 4.8-Mb complete genome sequence of Salmonella enterica serovar Typhi strain Ty2, a human-specific pathogen causing typhoid fever, is presented, and a half-genome interreplichore inversion in Ty2 relative to CT18 is confirmed.

Abstract: We present the 4.8-Mb complete genome sequence of Salmonella enterica serovar Typhi strain Ty2, a human-specific pathogen causing typhoid fever. A comparison with the genome sequence of recently isolated S. enterica serovar Typhi strain CT18 showed that 29 of the 4,646 predicted genes in Ty2 are unique to this strain, while 84 genes are unique to CT18. Both genomes contain more than 200 pseudogenes; 9 of these genes in CT18 are intact in Ty2, while 11 intact CT18 genes are pseudogenes in Ty2. A half-genome interreplichore inversion in Ty2 relative to CT18 was confirmed. The two strains exhibit differences in prophages, insertion sequences, and island structures. While CT18 carries two plasmids, one conferring multiple drug resistance, Ty2 has no plasmids and is sensitive to antibiotics.

Journal Article

Sub-grouping of Plasmodium falciparum 3D7 var genes based on sequence analysis of coding and non-coding regions.

Thomas Lavstsen¹, Ali Salanti¹, Anja T. R. Jensen¹, David E. Arnot², Thor G. Theander¹

University of Copenhagen¹, University of Edinburgh²

10 Sep 2003-Malaria Journal

TL;DR: The grouping of var genes implies that var gene recombination preferentially occurs within var gene groups and it is speculated that the groups reflect a functional diversification evolved to cope with the varying conditions of transmission and host immune response met by the parasite.

Abstract: Background: The variant surface antigen family Plasmodium falciparum erythrocyte membrane protein-1 (PfEMP1) is an important target for protective immunity and is implicated in the pathology of malaria through its ability to adhere to host endothelial receptors. The sequence diversity and organization of the 3D7 PfEMP1 repertoire was investigated on the basis of the complete genome sequence. Methods: Using two tree-building methods we analysed the coding and non-coding sequences of 3D7 var and rif genes as well as var genes of other parasite strains. Results: var genes can be sub-grouped into three major groups (group A, B and C) and two intermediate groups B/A and B/C representing transitions between the three major groups. The best defined var group, group A, comprises telomeric genes transcribed towards the telomere encoding PfEMP1s with complex domain structures different from the 4-domain type dominant of groups B and C. Two sequences belonging to the var1 and var2 subfamilies formed independent groups. A rif subgroup transcribed towards the centromere was found neighbouring var genes of group A such that the rif and var 5' regions merged. This organization appeared to be unique for the group A var genes. Conclusion: The grouping of var genes implies that var gene recombination preferentially occurs within var gene groups and it is speculated that the groups reflect a functional diversification evolved to cope with the varying conditions of transmission and host immune response met by the parasite.

Journal Article

Aldehyde dehydrogenase gene superfamily: the 2002 update.

Nickolas A. Sophos¹, Vasilis Vasiliou¹

Anschutz Medical Campus¹

01 Feb 2003-Chemico-Biological Interactions

TL;DR: A complete listing of all ALDH sequences known to date is presented, along with an evolutionary analysis of the eukaryotic ALDHs.

Journal Article

The transcriptional activity of human Chromosome 22

John L. Rinn¹, Ghia Euskirchen, Paul Bertone, Rebecca Martone, Nicholas M. Luscombe, Stephen Hartman, Paul M. Harrison, F. Kenneth Nelson, Perry L. Miller, Mark Gerstein, Sherman M. Weissman, Michael Snyder

Yale University¹

15 Feb 2003-Genes & Development

TL;DR: A DNA microarray representing nearly all of the unique sequences of human Chromosome 22 was constructed and used to measure global-transcriptional activity in placental poly(A)+ RNA and revealed twice as many transcribed bases as have been reported previously.

Abstract: A DNA microarray representing nearly all of the unique sequences of human Chromosome 22 was constructed and used to measure global-transcriptional activity in placental poly(A)+ RNA. We found that many of the known, related and predicted genes are expressed. More importantly, our study reveals twice as many transcribed bases as have been reported previously. Many of the newly discovered expressed fragments were verified by RNA blot analysis and a novel technique called differential hybridization mapping (DHM). Interestingly, a significant fraction of these novel fragments are expressed antisense to previously annotated introns. The coding potential of these novel expressed regions is supported by their sequence conservation in the mouse genome. This study has greatly increased our understanding of the biological information encoded on a human chromosome. To facilitate the dissemination of these results to the scientific community, we have developed a comprehensive Web resource to present the findings of this study and other features of human Chromosome 22 at http://array.mbb.yale.edu/chr22.

Journal Article

Genomic Organization and Expression Analysis of B7-H4, an Immune Inhibitory Molecule of the B7 Family

In Hak Choi¹, Gefeng Zhu, Gabriel Sica, Scott E. Strome, John C. Cheville², Julie S. Lau, Yuwen Zhu, Dallas B. Flies, Koji Tamada, Lieping Chen

Inje University¹, Mayo Clinic²

01 Nov 2003-Journal of Immunology

TL;DR: It is reported that the genomic DNA of human B7-H4 is mapped on chromosome 1 comprised of six exons and five introns spanning 66 kb, of which exon 6 is used for alternative splicing to generate two different transcripts.

Abstract: B7-H4 is a recently identified B7 family member that negatively regulates T cell immunity by the inhibition of T cell proliferation, cytokine production, and cell cycle progression. In this study, we report that the genomic DNA of human B7-H4 is mapped on chromosome 1 comprised of six exons and five introns spanning 66 kb, of which exon 6 is used for alternative splicing to generate two different transcripts. Similar B7-H4 structure is also found in mouse genomic DNA in chromosome 3. A human B7-H4 pseudogene is identified in chromosome 20p11.1 with a single exon and two stop codons in the coding region. Immunohistochemistry analysis using B7-H4-specific mAb demonstrates that B7-H4 is not expressed on the majority of normal human tissues. In contrast, up to 85% (22 of 26) of ovarian cancer and 31% (5 of 16) of lung cancer tissues constitutively express B7-H4. Our results indicate a tight regulation of B7-H4 expression in the translational level in normal peripheral tissues and a potential role of B7-H4 in the evasion of tumor immunity.

Journal Article•DOI•

Human specific loss of olfactory receptor genes

Yoav Gilad¹, Orna Man, Svante Pääbo¹, Doron Lancet•Institutions (1)

Max Planck Society¹

18 Mar 2003-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: It is found that humans have accumulated mutations that disrupt OR coding regions roughly 4-fold faster than any other species sampled, suggesting a human-specific process of OR gene disruption, likely due to a reduced chemosensory dependence relative to apes.

Abstract: Olfactory receptor (OR) genes constitute the basis for the sense of smell and are encoded by the largest mammalian gene superfamily of >1,000 genes. In humans, >60% of these are pseudogenes. In contrast, the mouse OR repertoire, although of roughly equal size, contains only ≈20% pseudogenes. We asked whether the high fraction of nonfunctional OR genes is specific to humans or is a common feature of all primates. To this end, we have compared the sequences of 50 human OR coding regions, regardless of their functional annotations, to those of their putative orthologs in chimpanzees, gorillas, orangutans, and rhesus macaques. We found that humans have accumulated mutations that disrupt OR coding regions roughly 4-fold faster than any other species sampled. As a consequence, the fraction of OR pseudogenes in humans is almost twice as high as in the non-human primates, suggesting a human-specific process of OR gene disruption, likely due to a reduced chemosensory dependence relative to apes.

Journal Article•DOI•

On the origin of family 1 plant glycosyltransferases.

Suzanne Michelle Paquette¹, Birger Lindberg Møller², Søren Bak²•Institutions (2)

University of Washington¹, University of Copenhagen²

01 Feb 2003-Phytochemistry

TL;DR: The phylogeny of plant glycosyltransferases is substantiated with complete phylogenetic analysis of the A. thaliana UGT multigene family, including intron-exon organization and chromosomal localization.

Journal Article•DOI•

Patterns of nucleotide substitution, insertion and deletion in the human genome inferred from pseudogenes

Zhaolei Zhang¹, Mark Gerstein¹•Institutions (1)

Yale University¹

15 Sep 2003-Nucleic Acids Research

TL;DR: It is found that deletions are about three times more common than insertions, and the frequencies of both these events follow characteristic power-law behavior associated with the size of the indel, but unexpectedly, the frequency of 3 bp deletions violates this trend.

Abstract: Nucleotide substitution, insertion and deletion (indel) events are the major driving forces that have shaped genomes. Using the recently identified human ribosomal protein (RP) pseudogene sequences, we have thoroughly studied DNA mutation patterns in the human genome. We analyzed a total of 1726 processed RP pseudogene sequences, comprising more than 700 000 bases. To be sure to differentiate the sequence changes occurring in the functional genes during evolution from those occurring in pseudogenes after they were fixed in the genome, we used only pseudogene sequences originating from parts of RP genes that are identical in human and mouse. Overall, we found that nucleotide transitions are more common than transversions, by roughly a factor of two. Moreover, the substitution rates amongst the 12 possible nucleotide pairs are not homogeneous as they are affected by the type of immediately neighboring nucleotides and the overall local G+C content. Finally, our dataset is large enough that it has many indels, thus allowing for the first time statistically robust analysis of these events. Overall, we found that deletions are about three times more common than insertions (3740 versus 1291). The frequencies of both these events follow characteristic power-law behavior associated with the size of the indel. However, unexpectedly, the frequency of 3 bp deletions (in contrast to 3 bp insertions) violates this trend, being considerably higher than that of 2 bp deletions. The possible biological implications of such a 3 bp bias are discussed.
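
The substitution classes named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names and the two short example sequences are hypothetical, and the code assumes the parent gene and pseudogene are already aligned to equal length, with gap or ambiguous characters skipped. The only figures taken from the abstract are the deletion and insertion counts (3740 and 1291).

```python
# Illustrative sketch only; names and example sequences are hypothetical.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Return 'transition', 'transversion', or None if the bases match."""
    if ref == alt:
        return None
    pair = {ref, alt}
    # A transition stays within one chemical class (purine or pyrimidine).
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def count_substitutions(parent, pseudogene):
    """Count transitions and transversions between two aligned sequences."""
    if len(parent) != len(pseudogene):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"transition": 0, "transversion": 0}
    for ref, alt in zip(parent.upper(), pseudogene.upper()):
        # Skip gap or ambiguous characters (assumed alignment convention).
        if ref not in "ACGT" or alt not in "ACGT":
            continue
        kind = classify_substitution(ref, alt)
        if kind is not None:
            counts[kind] += 1
    return counts


# Hypothetical toy alignment: A->G is a transition, C->A a transversion.
example_counts = count_substitutions("ACGTACGT", "GCGTAAGT")

# Deletion and insertion counts as reported in the abstract.
deletions, insertions = 3740, 1291
deletion_to_insertion_ratio = deletions / insertions
```

With the reported counts, the ratio evaluates to roughly 2.9, which matches the abstract's statement that deletions are about three times more common than insertions.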

Journal Article•DOI•

A Genome-Wide Survey of Human Pseudogenes

David Torrents¹, Mikita Suyama, Evgeny M. Zdobnov, Peer Bork•Institutions (1)

European Bioinformatics Institute¹

01 Dec 2003-Genome Research

TL;DR: All intergenic regions in the human genome are screened with a combination of homology searches and a functionality test using the ratio of silent to replacement nucleotide substitutions (KA/KS), and nonprocessed pseudogenes appear to be enriched in regions with high gene density.

Abstract: We screened all intergenic regions in the human genome to identify pseudogenes with a combination of homology searches and a functionality test using the ratio of silent to replacement nucleotide substitutions (KA/KS). We identified 19,724 regions of which 95% +/- 3% are estimated to evolve neutrally and thus are likely to encode pseudogenes. Half of these have no detectable truncation in their pseudocoding regions and therefore are not identifiable by methods that require the presence of truncations to prove nonfunctionality. A comparative analysis with the mouse genome showed that 70% of these pseudogenes have a retrotranspositional origin (processed), and the rest arose by segmental duplication (nonprocessed). Although the spread of both types of pseudogenes correlates with chromosome size, nonprocessed pseudogenes appear to be enriched in regions with high gene density. It is likely that the human pseudogenes identified here represent only a small fraction of the total, which probably exceeds the number of genes.

Journal Article•DOI•

Evolution of olfactory receptor genes in the human genome

Yoshihito Niimura¹, Masatoshi Nei•Institutions (1)

Pennsylvania State University¹

14 Oct 2003-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: The complete set of OR genes and their chromosomal locations from the latest human genome sequences are identified and it is shown that the class II OR genes can further be classified into 19 phylogenetic clades supported by high bootstrap values.

Abstract: Olfactory receptor (OR) genes form the largest known multigene family in the human genome. To obtain some insight into their evolutionary history, we have identified the complete set of OR genes and their chromosomal locations from the latest human genome sequences. We detected 388 potentially functional genes that have intact ORFs and 414 apparent pseudogenes. The number and the fraction (48%) of functional genes are considerably larger than the ones previously reported. The human OR genes can clearly be divided into class I and class II genes, as was previously noted. Our phylogenetic analysis has shown that the class II OR genes can further be classified into 19 phylogenetic clades supported by high bootstrap values. We have also found that there are many tandem arrays of OR genes that are phylogenetically closely related. These genes appear to have been generated by tandem gene duplication. However, the relationships between genomic clusters and phylogenetic clades are very complicated. There are a substantial number of cases in which the genes in the same phylogenetic clade are located on different chromosomal regions. In addition, OR genes belonging to distantly related phylogenetic clades are sometimes located very closely in a chromosomal region and form a tight genomic cluster. These observations can be explained by the assumption that several chromosomal rearrangements have occurred at the regions of OR gene clusters and the OR genes contained in different genomic clusters are shuffled.

Journal Article•DOI•

The DNA sequence of human chromosome 7

LaDeana W. Hillier¹, Robert S. Fulton¹, Lucinda Fulton¹, Tina Graves¹, Kymberlie H. Pepin¹, Caryn Wagner-McPherson¹, Dan Layman¹, Jason Maas¹, Sara Jaeger¹, Rebecca S. Walker¹, Kristine M. Wylie¹, Mandeep Sekhon¹, Michael C. Becker¹, Michelle O'Laughlin¹, Mark E. Schaller¹, Ginger A. Fewell¹, Kimberly D. Delehaunty¹, Tracie L. Miner¹, William E. Nash¹, Matt Cordes¹, Hui Du¹, Hui Sun¹, Jennifer Edwards¹, Holland Bradshaw-Cordum¹, Johar Ali¹, Stephanie Andrews¹, Amber Isak¹, Andrew Vanbrunt¹, Christine Nguyen¹, Feiyu Du¹, Betty Lamar¹, Laura Courtney¹, Joelle Kalicki¹, Philip Ozersky¹, Lauren Bielicki¹, Kelsi Scott¹, Andrea Holmes¹, Richard Harkins¹, Anthony R. Harris¹, Cindy Strong¹, Shunfang Hou¹, Chad Tomlinson¹, Sara Dauphin-Kohlberg¹, Amy Kozlowicz-Reilly¹, Shawn Leonard¹, Theresa Rohlfing¹, Susan M. Rock¹, Aye-Mon Tin-Wollam¹, Amanda Abbott¹, Patrick Minx¹, Rachel Maupin¹, Catrina Strowmatt¹, Phil Latreille¹, Nancy Miller¹, Doug Johnson¹, Jennifer Murray¹, Jeffrey Woessner¹, Michael C. Wendl¹, Shiaw-Pyng Yang¹, Brian Schultz¹, John W. Wallis¹, John Spieth¹, Tamberlyn Bieri¹, Joanne O. Nelson¹, Nicolas Berkowicz¹, Patricia Wohldmann¹, Lisa Cook¹, Matthew T. Hickenbotham¹, James M. Eldred¹, Donald Williams¹, Joseph A. Bedell¹, Elaine R. Mardis¹, Sandra W. Clifton¹, Stephanie L. Chissoe¹, Marco A. Marra², Marco A. Marra¹, Christopher K. Raymond³, Eric Haugen³, Will Gillett³, Yang Zhou³, R. James³, Karen A. Phelps³, Shawn Iadanoto³, Kerry L. Bubb³, Elizabeth Simms³, Ruth Levy³, James B. Clendenning³, Rajinder Kaul³, W. James Kent⁴, Terrence S. Furey⁴, Robert Baertsch⁴, Michael R. Brent¹, Evan Keibler¹, Paul Flicek¹, Peer Bork⁵, Mikita Suyama⁵, Jeffrey A. Bailey⁶, Matthew E. Portnoy⁷, David Torrents⁵, Asif T. Chinwalla¹, Warren Gish¹, Sean R. Eddy¹, John Douglas Mcpherson¹, John Douglas Mcpherson⁸, Maynard V. Olson³, Evan E. Eichler⁶, Eric D. Green⁷, Robert H. Waterston³, Robert H. Waterston¹, Richard K. Wilson¹•Institutions (8)

Washington University in St. Louis¹, BC Cancer Agency², University of Washington³, University of California, Santa Cruz⁴, European Bioinformatics Institute⁵, Case Western Reserve University⁶, National Institutes of Health⁷, Human Genome Sequencing Center⁸

10 Jul 2003-Nature

TL;DR: The euchromatic sequence of chromosome 7, the first metacentric chromosome completed so far, has excellent concordance with previously established physical and genetic maps, and it exhibits an unusual amount of segmentally duplicated sequence.

Abstract: Human chromosome 7 has historically received prominent attention in the human genetics community, primarily related to the search for the cystic fibrosis gene and the frequent cytogenetic changes associated with various forms of cancer. Here we present more than 153 million base pairs representing 99.4% of the euchromatic sequence of chromosome 7, the first metacentric chromosome completed so far. The sequence has excellent concordance with previously established physical and genetic maps, and it exhibits an unusual amount of segmentally duplicated sequence (8.2%), with marked differences between the two arms. Our initial analyses have identified 1,150 protein-coding genes, 605 of which have been confirmed by complementary DNA sequences, and an additional 941 pseudogenes. Of genes confirmed by transcript sequences, some are polymorphic for mutations that disrupt the reading frame.

Journal Article•DOI•

Rapid Genome Divergence at Orthologous Low Molecular Weight Glutenin Loci of the A and Am Genomes of Wheat

Thomas Wicker¹, Nabila Yahiaoui¹, Romain Guyot¹, Edith Schlagenhauf¹, Zhong-Da Liu¹, Jorge Dubcovsky², Beat Keller¹•Institutions (2)

University of Zurich¹, University of California, Davis²

01 May 2003-The Plant Cell

TL;DR: The striking differences in the intergenic landscape between the A and Am genomes that diverged 1 to 3 million years ago provide evidence for a dynamic and rapid genome evolution in wheat species.

Abstract: To study genome evolution in wheat, we have sequenced and compared two large physical contigs of 285 and 142 kb covering orthologous low molecular weight (LMW) glutenin loci on chromosome 1AS of a diploid wheat species (Triticum monococcum subsp monococcum) and a tetraploid wheat species (Triticum turgidum subsp durum). Sequence conservation between the two species was restricted to small regions containing the orthologous LMW glutenin genes, whereas >90% of the compared sequences were not conserved. Dramatic sequence rearrangements occurred in the regions rich in repetitive elements. Dating of long terminal repeat retrotransposon insertions revealed different insertion events occurring during the last 5.5 million years in both species. These insertions are partially responsible for the lack of homology between the intergenic regions. In addition, the gene space was conserved only partially, because different predicted genes were identified on both contigs. Duplications and deletions of large fragments that might be attributable to illegitimate recombination also have contributed to the differentiation of this region in both species. The striking differences in the intergenic landscape between the A and Am genomes that diverged 1 to 3 million years ago provide evidence for a dynamic and rapid genome evolution in wheat species.

Journal Article•DOI•

Different noses for different people.

Idan Menashe¹, Orna Man¹, Doron Lancet¹, Yoav Gilad¹•Institutions (1)

Weizmann Institute of Science¹

01 Jun 2003-Nature Genetics

TL;DR: Genotyping 51 candidate genes in 189 ethnically diverse humans shows an unprecedented prevalence of segregating pseudogenes, identifying one of the most pronounced cases of functional population diversity in the human genome.

Abstract: Of more than 1,000 human olfactory receptor genes, more than half seem to be pseudogenes. We investigated whether the most recent of these disruptions might still segregate with the intact form by genotyping 51 candidate genes in 189 ethnically diverse humans. The results show an unprecedented prevalence of segregating pseudogenes, identifying one of the most pronounced cases of functional population diversity in the human genome.

Journal Article•DOI•

Comparison of the genomes of human and mouse lays the foundation of genome zoology

Richard D. Emes¹, Leo Goodstadt¹, Eitan E. Winter¹, Chris P. Ponting¹•Institutions (1)

University of Oxford¹

01 Apr 2003-Human Molecular Genetics

TL;DR: It is predicted that the availability of numerous animal genomes will give rise to a new field of genome zoology in which differences in animal physiology and ethology are illuminated by the study of genomic sequence variations.

Abstract: The extensive similarities between the genomes of human and model organisms are the foundation of much of modern biology, with model organism experimentation permitting valuable insights into biological function and the aetiology of human disease. In contrast, differences among genomes have received less attention. Yet these can be expected to govern the physiological and morphological distinctions apparent among species, especially if such differences are the result of evolutionary adaptation. A recent comparison of the draft sequences of mouse and human genomes has shed light on the selective forces that have predominated in their recent evolutionary histories. In particular, mouse-specific clusters of homologues associated with roles in reproduction, immunity and host defence appear to be under diversifying positive selective pressure, as indicated by high ratios of non-synonymous to synonymous substitution rates. These clusters are also frequently punctuated by homologous pseudogenes. They thus have experienced numerous gene death, as well as gene birth, events. These regions appear, therefore, to have borne the brunt of adaptive evolution that underlies physiological and behavioural innovation in mice. We predict that the availability of numerous animal genomes will give rise to a new field of genome zoology in which differences in animal physiology and ethology are illuminated by the study of genomic sequence variations.

Journal Article•DOI•

Whole-genome screening indicates a possible burst of formation of processed pseudogenes and Alu repeats by particular L1 subfamilies in ancestral primates

Kazuhiko Ohshima¹, Masahira Hattori², Tetsusi Yada³, Takashi Gojobori⁴, Yoshiyuki Sakaki³, Norihiro Okada¹•Institutions (4)

Tokyo Institute of Technology¹, Kitasato University², University of Tokyo³, National Institute of Genetics⁴

28 Oct 2003-Genome Biology

TL;DR: It is suggested that a burst of formation of PPs and Alus occurred in the genome of ancestral primates and one possible mechanism is that proteins encoded by members of particular L1 subfamilies acquired an enhanced ability to recognize cytosolic RNAs in trans.

Abstract: Background: Abundant pseudogenes are a feature of mammalian genomes. Processed pseudogenes (PPs) are reverse transcribed from mRNAs. Recent molecular biological studies show that mammalian long interspersed element 1 (L1)-encoded proteins may have been involved in PP reverse transcription. Here, we present the first comprehensive analysis of human PPs using all known human genes as queries. Results: The human genome was queried and 3,664 candidate PPs were identified. The most abundant were copies of genes encoding keratin 18, glyceraldehyde-3-phosphate dehydrogenase and ribosomal protein L21. A simple method was developed to estimate the level of nucleotide substitutions (and therefore the age) of PPs. A Poisson-like age distribution was obtained with a mean age close to that of the Alu repeats, the predominant human short interspersed elements. These data suggest a nearly simultaneous burst of PP and Alu formation in the genomes of ancestral primates. The peak period of amplification of these two distinct retrotransposons was estimated to be 40-50 million years ago. Concordant amplification of certain L1 subfamilies with PPs and Alus was observed. Conclusions: We suggest that a burst of formation of PPs and Alus occurred in the genome of ancestral primates. One possible mechanism is that proteins encoded by members of particular L1 subfamilies acquired an enhanced ability to recognize cytosolic RNAs in trans.

Journal Article•DOI•

Identification of farnesoid X receptor beta as a novel mammalian nuclear receptor sensing lanosterol.

Kerstin Otte, Harald Kranz, Ingo Kober, Paul Thompson, Michael Hoefer, Bernhard Haubold, Bettina Remmel, H. Voss, Carmen Kaiser, Michael Albers, Zaccharias Cheruvallath, David A. Jackson, Georg Casari, Manfred Koegl, Svante Pääbo¹, Jan Mous, Claus Kremoser, Ulrich Deuschle•Institutions (1)

Max Planck Society¹

01 Feb 2003-Molecular and Cellular Biology

TL;DR: The identification of FXRβ as a novel functional receptor in nonprimate animals sheds new light on the species differences in cholesterol metabolism and has strong implications for the interpretation of genetic and pharmacological studies of FXR-directed physiologies and drug discovery programs.

Abstract: Nuclear receptors are ligand-modulated transcription factors. On the basis of the completed human genome sequence, this family was thought to contain 48 functional members. However, by mining human and mouse genomic sequences, we identified FXRβ as a novel family member. It is a functional receptor in mice, rats, rabbits, and dogs but constitutes a pseudogene in humans and primates. Murine FXRβ is widely coexpressed with FXR in embryonic and adult tissues. It heterodimerizes with RXRα and stimulates transcription through specific DNA response elements upon addition of 9-cis-retinoic acid. Finally, we identified lanosterol as a candidate endogenous ligand that induces coactivator recruitment and transcriptional activation by mFXRβ. Lanosterol is an intermediate of cholesterol biosynthesis, which suggests a direct role in the control of cholesterol biosynthesis in nonprimates. The identification of FXRβ as a novel functional receptor in nonprimate animals sheds new light on the species differences in cholesterol metabolism and has strong implications for the interpretation of genetic and pharmacological studies of FXR-directed physiologies and drug discovery programs.

Journal Article•DOI•

Comparison of P450s from human and fugu: 420 million years of vertebrate P450 evolution.

David R. Nelson¹•Institutions (1)

University of Tennessee¹

01 Jan 2003-Archives of Biochemistry and Biophysics

TL;DR: The fugu (pufferfish) genome has been sequenced, and a second genome assembly was released 17 May 2002, and all P450 genes and pseudogenes in the available fugu sequence data have been identified, compared to human P450s, and assigned names.
