Showing papers on "Genomics" published in 2004


Journal ArticleDOI
02 Apr 2004-Science
TL;DR: Over 1.2 million previously unknown genes, including more than 782 new rhodopsin-like photoreceptors, are identified in these samples, suggesting substantial oceanic microbial diversity.
Abstract: We have applied “whole-genome shotgun sequencing” to microbial populations collected en masse on tangential flow and impact filters from seawater samples collected from the Sargasso Sea near Bermuda. A total of 1.045 billion base pairs of nonredundant sequence was generated, annotated, and analyzed to elucidate the gene content, diversity, and relative abundance of the organisms within these environmental samples. These data are estimated to derive from at least 1800 genomic species based on sequence relatedness, including 148 previously unknown bacterial phylotypes. We have identified over 1.2 million previously unknown genes represented in these samples, including more than 782 new rhodopsin-like photoreceptors. Variation in species present and stoichiometry suggests substantial oceanic microbial diversity. Microorganisms are responsible for most of the biogeochemical cycles that shape the environment of Earth and its oceans. Yet, these organisms are the least well understood on Earth, as the ability to study and understand the metabolic potential of microorganisms has been hampered by the inability to generate pure cultures. Recent studies have begun to explore environ

4,210 citations


Journal ArticleDOI
LaDeana W. Hillier, Webb Miller, Ewan Birney, Wesley C. Warren, and 171 more authors (39 institutions)
09 Dec 2004-Nature
TL;DR: A draft genome sequence of the red jungle fowl, Gallus gallus, provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes.
Abstract: We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome, composed of approximately one billion base pairs of sequence and an estimated 20,000-23,000 genes, provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes. For example, the evolutionary distance between chicken and human provides high specificity in detecting functional elements, both non-coding and coding. Notably, many conserved non-coding sequences are far from genes and cannot be assigned to defined functional classes. In coding regions the evolutionary dynamics of protein domains and orthologous groups illustrate processes that distinguish the lineages leading to birds and mammals. The distinctive properties of avian microchromosomes, together with the inferred patterns of conserved synteny, provide additional insights into vertebrate chromosome architecture.

2,579 citations


Journal ArticleDOI
02 Sep 2004-Nature
TL;DR: An initial map of yeast's transcriptional regulatory code is constructed by identifying the sequence elements that are bound by regulators under various conditions and that are conserved among Saccharomyces species.
Abstract: DNA-binding transcriptional regulators interpret the genome's regulatory code by binding to specific sequences to induce or repress gene expression. Comparative genomics has recently been used to identify potential cis-regulatory sequences within the yeast genome on the basis of phylogenetic conservation, but this information alone does not reveal if or when transcriptional regulators occupy these binding sites. We have constructed an initial map of yeast's transcriptional regulatory code by identifying the sequence elements that are bound by regulators under various conditions and that are conserved among Saccharomyces species. The organization of regulatory elements in promoters and the environment-dependent use of these elements by regulators are discussed. We find that environment-specific use of regulatory elements predicts mechanistic models for the function of a large population of yeast's transcriptional regulators.

2,304 citations


Journal ArticleDOI
TL;DR: Reassembly of multiple genomes has provided insight into energy and nutrient cycling within the community, genome structure, gene function, population genetics and microheterogeneity, and lateral gene transfer among members of an uncultured community.
Abstract: Metagenomics (also referred to as environmental and community genomics) is the genomic analysis of microorganisms by direct extraction and cloning of DNA from an assemblage of microorganisms. The development of metagenomics stemmed from the ineluctable evidence that as-yet-uncultured microorganisms represent the vast majority of organisms in most environments on earth. This evidence was derived from analyses of 16S rRNA gene sequences amplified directly from the environment, an approach that avoided the bias imposed by culturing and led to the discovery of vast new lineages of microbial life. Although the portrait of the microbial world was revolutionized by analysis of 16S rRNA genes, such studies yielded only a phylogenetic description of community membership, providing little insight into the genetics, physiology, and biochemistry of the members. Metagenomics provides a second tier of technical innovation that facilitates study of the physiology and ecology of environmental microorganisms. Novel genes and gene products discovered through metagenomics include the first bacteriorhodopsin of bacterial origin; novel small molecules with antimicrobial activity; and new members of families of known proteins, such as an Na+(Li+)/H+ antiporter, RecA, DNA polymerase, and antibiotic resistance determinants. Reassembly of multiple genomes has provided insight into energy and nutrient cycling within the community, genome structure, gene function, population genetics and microheterogeneity, and lateral gene transfer among members of an uncultured community. The application of metagenomic sequence information will facilitate the design of better culturing strategies to link genomic analysis with pure culture studies.

2,224 citations


Journal ArticleDOI
04 Mar 2004-Nature
TL;DR: Reconstruction of near-complete genomes of Leptospirillum group II and Ferroplasma type II and analysis of the gene complement for each organism revealed the pathways for carbon and nitrogen fixation and energy generation, and provided insights into survival strategies in an extreme environment.
Abstract: Microbial communities are vital in the functioning of all ecosystems; however, most microorganisms are uncultivated, and their roles in natural systems are unclear. Here, using random shotgun sequencing of DNA from a natural acidophilic biofilm, we report reconstruction of near-complete genomes of Leptospirillum group II and Ferroplasma type II, and partial recovery of three other genomes. This was possible because the biofilm was dominated by a small number of species populations and the frequency of genomic rearrangements and gene insertions or deletions was relatively low. Because each sequence read came from a different individual, we could determine that single-nucleotide polymorphisms are the predominant form of heterogeneity at the strain level. The Leptospirillum group II genome had remarkably few nucleotide polymorphisms, despite the existence of low-abundance variants. The Ferroplasma type II genome seems to be a composite from three ancestral strains that have undergone homologous recombination to form a large population of mosaic genomes. Analysis of the gene complement for each organism revealed the pathways for carbon and nitrogen fixation and energy generation, and provided insights into survival strategies in an extreme environment.

2,213 citations


Journal ArticleDOI
01 Apr 2004-Nature
TL;DR: The genome sequence of the Brown Norway (BN) rat strain is reported; it is the third complete mammalian genome to be deciphered, and three-way comparisons with the human and mouse genomes resolve details of mammalian evolution.
Abstract: The laboratory rat (Rattus norvegicus) is an indispensable tool in experimental medicine and drug development, having made inestimable contributions to human health. We report here the genome sequence of the Brown Norway (BN) rat strain. The sequence represents a high-quality 'draft' covering over 90% of the genome. The BN rat sequence is the third complete mammalian genome to be deciphered, and three-way comparisons with the human and mouse genomes resolve details of mammalian evolution. This first comprehensive analysis includes genes and proteins and their relation to human disease, repeated sequences, comparative genome-wide studies of mammalian orthologous chromosomal regions and rearrangement breakpoints, reconstruction of ancestral karyotypes and the events leading to existing species, rates of variation, and lineage-specific and lineage-independent evolutionary events such as expansion of gene families, orthology relations and protein evolution.

1,964 citations


Journal ArticleDOI
TL;DR: Changes over the past year include the removal of the sequence length limit, the launch of the EMBLCDSs dataset, extension of the Sequence Version Archive functionality and the revision of quality rules for TPA data.
Abstract: The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl.html) constitutes Europe's primary nucleotide sequence resource. Main sources for DNA and RNA sequences are direct submissions from individual researchers, genome sequencing projects and patent applications. While automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO), the preferred submission tool for individual submitters is Webin (WWW). Through all stages, dataflow is monitored by EBI biologists communicating with the sequencing groups. In collaboration with DDBJ and GenBank the database is produced, maintained and distributed at the European Bioinformatics Institute (EBI). Database releases are produced quarterly and are distributed on CD-ROM. Network services allow access to the most up-to-date data collection via Internet and World Wide Web interface. EBI's Sequence Retrieval System (SRS) is a Network Browser for Databanks in Molecular Biology, integrating and linking the main nucleotide and protein databases, plus many specialised databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, Blast etc) are available for external users to compare their own sequences against the most currently available data in the EMBL Nucleotide Sequence Database and SWISS-PROT.

1,187 citations


Journal ArticleDOI
TL;DR: This work presents a www server for AUGUSTUS, a novel software program for ab initio gene prediction in eukaryotic genomic sequences based on a generalized Hidden Markov Model with a new method for modeling the intron length distribution.
Abstract: We present a www server for AUGUSTUS, a novel software program for ab initio gene prediction in eukaryotic genomic sequences. Our method is based on a generalized Hidden Markov Model with a new method for modeling the intron length distribution. This method allows approximation of the true intron length distribution more accurately than do existing programs. For genomic sequence data from human and Drosophila melanogaster, the accuracy of AUGUSTUS is superior to existing gene-finding approaches. The advantage of our program becomes apparent especially for larger input sequences containing more than one gene. The server is available at http://augustus.gobics.de.

1,027 citations
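
As background to the intron length modeling mentioned in the AUGUSTUS abstract above: in a plain HMM, a state that returns to itself with a fixed probability implies a geometric length distribution, which is a common motivation for generalized HMMs that allow explicit length distributions. The sketch below only illustrates that textbook property. It is not AUGUSTUS's method, and the function names and the self-transition probability of 0.9 are hypothetical choices for illustration.

```python
def geometric_length_pmf(stay_prob: float, max_len: int) -> list[float]:
    """Return P(L = k) for k = 1..max_len.

    A plain HMM state that loops back to itself with probability
    `stay_prob` emits exactly k symbols with probability
    stay_prob**(k - 1) * (1 - stay_prob), i.e. a geometric distribution.
    """
    if not 0.0 <= stay_prob < 1.0:
        raise ValueError("stay_prob must be in [0, 1)")
    return [stay_prob ** (k - 1) * (1.0 - stay_prob) for k in range(1, max_len + 1)]


def mean_length(stay_prob: float) -> float:
    """Expected number of symbols emitted before leaving the state."""
    return 1.0 / (1.0 - stay_prob)


# Hypothetical self-transition probability, chosen only for illustration.
pmf = geometric_length_pmf(0.9, 2000)
```

Under these assumptions the probability is highest at length 1 and decays monotonically, so a single self-looping state cannot place its mode at a longer length. A generalized HMM can instead attach an arbitrary length distribution to a state.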


Journal ArticleDOI
TL;DR: In this review, the principles, potential power, requirements, advantages, and disadvantages of the various marker types are discussed, along with their applications in a variety of aquaculture studies.

980 citations


Journal ArticleDOI
TL;DR: The crucial role that accessory elements play in the rapid evolution of S. aureus is clearly illustrated by comparing the MSSA476 genome with that of an extremely closely related MRSA community-acquired strain; the differential distribution of large mobile elements carrying virulence and drug-resistance determinants may be responsible for the clinically important phenotypic differences in these strains.
Abstract: Staphylococcus aureus is an important nosocomial and community-acquired pathogen. Its genetic plasticity has facilitated the evolution of many virulent and drug-resistant strains, presenting a major and constantly changing clinical challenge. We sequenced the ≈2.8-Mbp genomes of two disease-causing S. aureus strains isolated from distinct clinical settings: a recent hospital-acquired representative of the epidemic methicillin-resistant S. aureus EMRSA-16 clone (MRSA252), a clinically important and globally prevalent lineage; and a representative of an invasive community-acquired methicillin-susceptible S. aureus clone (MSSA476). A comparative-genomics approach was used to explore the mechanisms of evolution of clinically important S. aureus genomes and to identify regions affecting virulence and drug resistance. The genome sequences of MRSA252 and MSSA476 have a well conserved core region but differ markedly in their accessory genetic elements. MRSA252 is the most genetically diverse S. aureus strain sequenced to date: ≈6% of the genome is novel compared with other published genomes, and it contains several unique genetic elements. MSSA476 is methicillin-susceptible, but it contains a novel Staphylococcal chromosomal cassette (SCC) mec-like element (designated SCC476), which is integrated at the same site on the chromosome as SCCmec elements in MRSA strains but encodes a putative fusidic acid resistance protein. The crucial role that accessory elements play in the rapid evolution of S. aureus is clearly illustrated by comparing the MSSA476 genome with that of an extremely closely related MRSA community-acquired strain; the differential distribution of large mobile elements carrying virulence and drug-resistance determinants may be responsible for the clinically important phenotypic differences in these strains.

950 citations


Journal ArticleDOI
TL;DR: Genetic analysis of the wMel genome further supports the hypothesis that mitochondria share a common ancestor with the α-Proteobacteria, but shows little support for the grouping of mitochondria with species in the order Rickettsiales.
Abstract: The complete sequence of the 1,267,782 bp genome of Wolbachia pipientis wMel, an obligate intracellular bacterium of Drosophila melanogaster, has been determined. Wolbachia, which are found in a variety of invertebrate species, are of great interest due to their diverse interactions with different hosts, which range from many forms of reproductive parasitism to mutualistic symbioses. Analysis of the wMel genome, in particular phylogenomic comparisons with other intracellular bacteria, has revealed many insights into the biology and evolution of wMel and Wolbachia in general. For example, the wMel genome is unique among sequenced obligate intracellular species in both being highly streamlined and containing very high levels of repetitive DNA and mobile DNA elements. This observation, coupled with multiple evolutionary reconstructions, suggests that natural selection is somewhat inefficient in wMel, most likely owing to the occurrence of repeated population bottlenecks. Genome analysis predicts many metabolic differences with the closely related Rickettsia species, including the presence of intact glycolysis and purine synthesis, which may compensate for an inability to obtain ATP directly from its host, as Rickettsia can. Other discoveries include the apparent inability of wMel to synthesize lipopolysaccharide and the presence of the most genes encoding proteins with ankyrin repeat domains of any prokaryotic genome yet sequenced. Despite the ability of wMel to infect the germline of its host, we find no evidence for either recent lateral gene transfer between wMel and D. melanogaster or older transfers between Wolbachia and any host. Evolutionary analysis further supports the hypothesis that mitochondria share a common ancestor with the alpha-Proteobacteria, but shows little support for the grouping of mitochondria with species in the order Rickettsiales. With the availability of the complete genomes of both species and excellent genetic tools for the host, the wMel-D. melanogaster symbiosis is now an ideal system for studying the biology and evolution of Wolbachia infections.

Journal ArticleDOI
TL;DR: The constructed tiling resolution array allows comprehensive assessment of genomic integrity and thereby the identification of new genes associated with disease, and shows the need to move beyond conventional marker-based genome comparison approaches that rely on inference of continuity between interval markers.
Abstract: We constructed a tiling resolution array consisting of 32,433 overlapping BAC clones covering the entire human genome. This increases our ability to identify genetic alterations and their boundaries throughout the genome in a single comparative genomic hybridization (CGH) experiment. At this tiling resolution, we identified minute DNA alterations not previously reported. These alterations include microamplifications and deletions containing oncogenes, tumor-suppressor genes and new genes that may be associated with multiple tumor types. Our findings show the need to move beyond conventional marker-based genome comparison approaches that rely on inference of continuity between interval markers. Our submegabase resolution tiling set for array CGH (SMRT array) allows comprehensive assessment of genomic integrity and thereby the identification of new genes associated with disease.

Journal ArticleDOI
TL;DR: Virus-induced gene silencing is a recently developed gene transcript suppression technique for characterizing the function of plant genes that is rapid, does not require development of stable transformants, allows characterization of phenotypes that might be lethal in stable lines, and offers the potential to silence either individual or multiple members of a gene family.
Abstract: Virus-induced gene silencing (VIGS) is a recently developed gene transcript suppression technique for characterizing the function of plant genes. The approach involves cloning a short sequence of a targeted plant gene into a viral delivery vector. The vector is used to infect a young plant, and in a few weeks natural defense mechanisms of the plant directed at suppressing virus replication also result in specific degradation of mRNAs from the endogenous plant gene that is targeted for silencing. VIGS is rapid (3-4 weeks from infection to silencing), does not require development of stable transformants, allows characterization of phenotypes that might be lethal in stable lines, and offers the potential to silence either individual or multiple members of a gene family. Here we briefly review the discoveries that led to the development of VIGS and what is known about the experimental requirements for effective silencing. We describe the methodology of VIGS and how it can be optimized and used for both forward and reverse genetics studies. Advantages and disadvantages of VIGS compared with other loss-of-function approaches available for plants are discussed, along with how the limitations of VIGS might be overcome. Examples are reviewed where VIGS has been used to provide important new insights into the roles of specific genes in plant development and plant defense responses. Finally, we examine the future prospects for VIGS as a powerful tool for assessing and characterizing the function of plant genes.

Journal ArticleDOI
01 Mar 2004-Genomics
TL;DR: The design of DNA microarrays, the selection of controls, the level of repetition required, and other critical parameters for success in the design and analysis of ChIP-chip experiments, especially those conducted in the context of mammalian or other relatively large genomes are reviewed.

Journal ArticleDOI
TL;DR: An analysis of over 1,100 of the ∼10,000 predicted proteins encoded by the genome sequence of the filamentous fungus Neurospora crassa reveals potential new targets for antifungals as well as loci implicated in human and plant physiology and disease.
Abstract: We present an analysis of over 1,100 of the approximately 10,000 predicted proteins encoded by the genome sequence of the filamentous fungus Neurospora crassa. Seven major areas of Neurospora genomics and biology are covered. First, the basic features of the genome, including the automated assembly, gene calls, and global gene analyses are summarized. The second section covers components of the centromere and kinetochore complexes, chromatin assembly and modification, and transcription and translation initiation factors. The third area discusses genome defense mechanisms, including repeat induced point mutation, quelling and meiotic silencing, and DNA repair and recombination. In the fourth section, topics relevant to metabolism and transport include extracellular digestion; membrane transporters; aspects of carbon, sulfur, nitrogen, and lipid metabolism; the mitochondrion and energy metabolism; the proteasome; and protein glycosylation, secretion, and endocytosis. Environmental sensing is the focus of the fifth section with a treatment of two-component systems; GTP-binding proteins; mitogen-activated protein, p21-activated, and germinal center kinases; calcium signaling; protein phosphatases; photobiology; circadian rhythms; and heat shock and stress responses. The sixth area of analysis is growth and development; it encompasses cell wall synthesis, proteins important for hyphal polarity, cytoskeletal components, the cyclin/cyclin-dependent kinase machinery, macroconidiation, meiosis, and the sexual cycle. The seventh section covers topics relevant to animal and plant pathogenesis and human disease. The results demonstrate that a large proportion of Neurospora genes do not have homologues in the yeasts Saccharomyces cerevisiae and Schizosaccharomyces pombe. The group of unshared genes includes potential new targets for antifungals as well as loci implicated in human and plant physiology and disease.

Journal ArticleDOI
TL;DR: Comparative genomics is beginning to identify the functional components of the chromosome and that in turn will set the stage for the functional characterization of the sequences.
Abstract: The sequence of chromosome 21 was a turning point for the understanding of Down syndrome. Comparative genomics is beginning to identify the functional components of the chromosome and that in turn will set the stage for the functional characterization of the sequences. Animal models combined with genome-wide analytical methods have proved indispensable for unravelling the mysteries of gene dosage imbalance.

Journal ArticleDOI
TL;DR: An increasing amount of evidence indicates that genomic variants in both coding and non-coding sequences can have unexpected deleterious effects on the splicing of the gene transcript.
Abstract: When genome variants are identified in genomic DNA, especially during routine analysis of disease-associated genes, their functional implications might not be immediately evident. Distinguishing between a genomic variant that changes the phenotype and one that does not is a difficult task. An increasing amount of evidence indicates that genomic variants in both coding and non-coding sequences can have unexpected deleterious effects on the splicing of the gene transcript. So how can benign polymorphisms be distinguished from disease-associated splicing mutations?

Journal ArticleDOI
TL;DR: When the genome sequences of domestic animals become available the identification of the mutations that underlie the transformation from a wild to a domestic species will be a realistic and important target.
Abstract: One of the 'grand challenges' in modern biology is to understand the genetic basis of phenotypic diversity within and among species. Thousands of years of selective breeding of domestic animals has created a diversity of phenotypes among breeds that is only matched by that observed among species in nature. Domestic animals therefore constitute a unique resource for understanding the genetic basis of phenotypic variation. When the genome sequences of domestic animals become available the identification of the mutations that underlie the transformation from a wild to a domestic species will be a realistic and important target.

Journal ArticleDOI
30 Sep 2004-Nature
TL;DR: It is reported that there are over 3,000 Pack-MULEs in rice containing fragments derived from more than 1,000 cellular genes, which indicates that fragments of genomic DNA have been captured, rearranged and amplified over millions of years.
Abstract: Mutator-like transposable elements (MULEs) are found in many eukaryotic genomes and are especially prevalent in higher plants. In maize, rice and Arabidopsis a few MULEs were shown to carry fragments of cellular genes. These chimaeric elements are called Pack-MULEs in this study. The abundance of MULEs in rice and the availability of most of the genome sequence permitted a systematic analysis of the prevalence and nature of Pack-MULEs in an entire genome. Here we report that there are over 3,000 Pack-MULEs in rice containing fragments derived from more than 1,000 cellular genes. Pack-MULEs frequently contain fragments from multiple chromosomal loci that are fused to form new open reading frames, some of which are expressed as chimaeric transcripts. About 5% of the Pack-MULEs are represented in collections of complementary DNA. Functional analysis of amino acid sequences and proteomic data indicate that some captured gene fragments might be functional. Comparison of the cellular genes and Pack-MULE counterparts indicates that fragments of genomic DNA have been captured, rearranged and amplified over millions of years. Given the abundance of Pack-MULEs in rice and the widespread occurrence of MULEs in all characterized plant genomes, gene fragment acquisition by Pack-MULEs might represent an important new mechanism for the evolution of genes in higher plants.

Journal ArticleDOI
TL;DR: The results suggest that the migration of modern humans out of Africa into new environments was accompanied by genetic adaptations to emergent selective forces, and a region containing four contiguous genes on Chromosome 7 showed striking evidence of a recent selective sweep in European-Americans.
Abstract: Identifying regions of the human genome that have been targets of natural selection will provide important insights into human evolutionary history and may facilitate the identification of complex disease genes. Although the signature that natural selection imparts on DNA sequence variation is difficult to disentangle from the effects of neutral processes such as population demographic history, selective and demographic forces can be distinguished by analyzing multiple loci dispersed throughout the genome. We studied the molecular evolution of 132 genes by comprehensively resequencing them in 24 African-Americans and 23 European-Americans. We developed a rigorous computational approach for taking into account multiple hypothesis tests and demographic history and found that while many apparent selective events can instead be explained by demography, there is also strong evidence for positive or balancing selection at eight genes in the European-American population, but none in the African-American population. Our results suggest that the migration of modern humans out of Africa into new environments was accompanied by genetic adaptations to emergent selective forces. In addition, a region containing four contiguous genes on Chromosome 7 showed striking evidence of a recent selective sweep in European-Americans. More generally, our results have important implications for mapping genes underlying complex human diseases.

Journal ArticleDOI
TL;DR: A more global concept of genomic disorders emerges in which susceptibility to rearrangements occurs due to underlying complex genomic architecture, and this architecture plays a role not only in disease etiology, but also in primate genome evolution.
Abstract: The term 'genomic disorder' refers to a disease that is caused by an alteration of the genome that results in complete loss, gain or disruption of the structural integrity of a dosage sensitive gene(s). In most of the common chromosome deletion/duplication syndromes, the rearranged genomic segments are flanked by large (usually >10 kb), highly homologous low copy repeat (LCR) structures that can act as recombination substrates. Recombination between non-allelic LCR copies, also known as non-allelic homologous recombination, can result in deletion or duplication of the intervening segment. Recent findings suggest that other chromosomal rearrangements, including reciprocal, Robertsonian and jumping translocations, inversions, isochromosomes and small marker chromosomes, may also involve susceptibility to rearrangement related to genome structure or architecture. In several cases, LCRs, AT-rich palindromes and pericentromeric repeats are located at such rearrangement breakpoints. Analysis of the products of recombination at the junctions of the rearrangements reveals both homologous recombination and non-homologous end joining as causative mechanisms. Thus, a more global concept of genomic disorders emerges in which susceptibility to rearrangements occurs due to underlying complex genomic architecture. Interestingly, this architecture plays a role not only in disease etiology, but also in primate genome evolution. In this review, we discuss recent advances regarding general mechanisms for the various rearrangements of our genome, and potential models for rearrangements with non-homologous breakpoint regions.

Journal ArticleDOI
TL;DR: It is found that two highly diverged animals, the nematode Caenorhabditis elegans and the fruit fly Drosophila melanogaster, implement a shared adult-onset expression program of genes involved in mitochondrial metabolism, DNA repair, catabolism, peptidolysis and cellular transport.
Abstract: We developed a method for systematically comparing gene expression patterns across organisms using genome-wide comparative analysis of DNA microarray experiments. We identified analogous gene expression programs comprising shared patterns of regulation across orthologous genes. Biological features of these patterns could be identified as highly conserved subpatterns that correspond to Gene Ontology categories. Here, we demonstrate these methods by analyzing a specific biological process, aging, and show that similar analysis can be applied to a range of biological processes. We found that two highly diverged animals, the nematode Caenorhabditis elegans and the fruit fly Drosophila melanogaster, implement a shared adult-onset expression program of genes involved in mitochondrial metabolism, DNA repair, catabolism, peptidolysis and cellular transport. Most of these changes were implemented early in adulthood. Using this approach to search databases of gene expression data, we found conserved transcriptional signatures in larval development, embryogenesis, gametogenesis and mRNA degradation.

Journal ArticleDOI
TL;DR: In this paper, data mining methods have been used to identify 356 Cyt P450 genes and 99 related pseudogenes in the rice (Oryza sativa) genome using sequence information available from both the indica and japonica strains.
Abstract: Data mining methods have been used to identify 356 Cyt P450 genes and 99 related pseudogenes in the rice (Oryza sativa) genome using sequence information available from both the indica and japonica strains. Because neither of these genomes is completely available, some genes have been identified in only one strain, and 28 genes remain incomplete. Comparison of these rice genes with the 246 P450 genes and 26 pseudogenes in the Arabidopsis genome has indicated that most of the known plant P450 families existed before the monocot-dicot divergence that occurred approximately 200 million years ago. Comparative analysis of P450s in the Pinus expressed sequence tag collections has identified P450 families that predated the separation of gymnosperms and flowering plants. Complete mapping of all available plant P450s onto the Deep Green consensus plant phylogeny highlights certain lineage-specific families maintained (CYP80 in Ranunculales) and lineage-specific families lost (CYP92 in Arabidopsis) in the course of evolution.

Journal ArticleDOI
TL;DR: TILLING was shown to be an effective reverse genetic strategy by the establishment of a high-throughput TILLING facility and the delivery of thousands of point mutations in hundreds of Arabidopsis genes to members of the plant biology community.
Abstract: Background: Going from a gene sequence to its function in the context of a whole organism requires a strategy for targeting mutations, referred to as reverse genetics. Reverse genetics is highly desirable in the modern genomics era; however, the most powerful methods are generally restricted to a few model organisms. Previously, we introduced a reverse-genetic strategy with the potential for general applicability to organisms that lack well-developed genetic tools. Our TILLING (Targeting Induced Local Lesions IN Genomes) method uses chemical mutagenesis followed by screening for single-base changes to discover induced mutations that alter protein function. TILLING was shown to be an effective reverse genetic strategy by the establishment of a high-throughput TILLING facility and the delivery of thousands of point mutations in hundreds of Arabidopsis genes to members of the plant biology community.

Journal ArticleDOI
TL;DR: Evidence is examined for an ancient whole-genome duplication (tetraploidization) event that probably occurred just before the teleost radiation, which could have contributed to the genetic isolation of populations, to lineage-specific diversification of developmental programs, and ultimately to phenotypic variation among teleost fish.

Journal ArticleDOI
TL;DR: The impact of Drosophila genetics on the field of insect resistance and the current and future impact of genomics is reviewed and three fundamental questions in the evolution of resistance are addressed.

Journal ArticleDOI
TL;DR: To better understand genome function and evolution in Mycobacterium tuberculosis, the genomes of 100 epidemiologically well characterized clinical isolates were interrogated by DNA microarrays and sequencing and 224 genes were found to be partially or completely deleted.
Abstract: To better understand genome function and evolution in Mycobacterium tuberculosis, the genomes of 100 epidemiologically well characterized clinical isolates were interrogated by DNA microarrays and sequencing. We identified 68 different large-sequence polymorphisms (comprising 186,137 bp, or 4.2% of the genome) that are present in H37Rv, but absent from one or more clinical isolates. A total of 224 genes (5.5%), including genes in all major functional categories, were found to be partially or completely deleted. Deletions are not distributed randomly throughout the genome but instead tend to be aggregated. The distinct deletions in some aggregations appear in closely related isolates, suggesting a genomically disruptive process specific to an individual mycobacterial lineage. Other genomic aggregations include distinct deletions that appear in phylogenetically unrelated isolates, suggesting that a genomic region is vulnerable throughout the species. Although the deletions identified here are evidently inessential to the causation of disease (they are found in active clinical cases), their frequency spectrum suggests that most are weakly deleterious to the pathogen. For some deletions, short-term evolutionary pressure due to the host immune system or antibiotics may favor the elimination of genes, whereas longer-term physiological requirements maintain the genes in the population.

Journal ArticleDOI
TL;DR: The statistical solutions that help to overcome the problems with data-set complexity are described, in anticipation of the imminent wealth of data that will be generated by new genome-wide epigenetic profiling and DNA sequence analysis techniques.
Abstract: Epigenomic studies aim to define the location and nature of the genomic sequences that are epigenetically modified. Much progress has been made towards whole-genome epigenetic profiling using molecular techniques, but the analysis of such large and complex data sets is far from trivial given the correlated nature of sequence and functional characteristics within the genome. We describe the statistical solutions that help to overcome the problems with data-set complexity, in anticipation of the imminent wealth of data that will be generated by new genome-wide epigenetic profiling and DNA sequence analysis techniques. So far, epigenomic studies have succeeded in identifying CpG islands, but recent evidence points towards a role for transposable elements in epigenetic regulation, causing the fields of study of epigenetics and transposable element biology to converge.

Journal ArticleDOI
TL;DR: The most exciting direction for genetic research on intelligence is to harness the power of the Human Genome Project to identify some of the specific genes responsible for the heritability of intelligence.
Abstract: More is known about the genetics of intelligence than about any other trait, behavioral or biological; this article selectively reviews that literature. Two of the most interesting genetic findings are that heritability of intelligence increases throughout the life span and that the same genes affect diverse cognitive abilities. The most exciting direction for genetic research on intelligence is to harness the power of the Human Genome Project to identify some of the specific genes responsible for the heritability of intelligence. The next research direction will be functional genomics, for example, understanding the brain pathways between genes and intelligence. Deoxyribonucleic acid (DNA) will integrate life sciences research on intelligence; bottom-up molecular biological research will meet top-down psychological research in the brain.

Journal ArticleDOI
TL;DR: The coding content of Populus and Arabidopsis genomes shows very high similarity, indicating that differences between these annual and perennial angiosperm life forms result primarily from differences in gene regulation.
Abstract: Trees present a life form of paramount importance for terrestrial ecosystems and human societies because of their ecological structure and physiological function and provision of energy and industrial materials. The genus Populus is the internationally accepted model for molecular tree biology. We have analyzed 102,019 Populus ESTs that clustered into 11,885 clusters and 12,759 singletons. We also provide >4,000 assembled full clone sequences to serve as a basis for the upcoming annotation of the Populus genome sequence. A public web-based EST database (populusdb) provides digital expression profiles for 18 tissues that comprise the majority of differentiated organs. The coding content of Populus and Arabidopsis genomes shows very high similarity, indicating that differences between these annual and perennial angiosperm life forms result primarily from differences in gene regulation. The high similarity between Populus and Arabidopsis will allow studies of Populus to directly benefit from the detailed functional genomic information generated for Arabidopsis, enabling detailed insights into tree development and adaptation. These data will also be valuable for functional genomic efforts in Arabidopsis.