Author
J. Gielen
Bio: J. Gielen is an academic researcher from Ghent University. The author has contributed to research in topics: Chromosome 4 & Genome. The author has an h-index of 3 and has co-authored 3 publications receiving 1231 citations.
Topics: Chromosome 4, Genome, Gene density, Genome project, Gene
Papers
••
John Innes Centre, Harvard University, Ghent University, University of Paris, Trinity College, Dublin, University of East Anglia, Spanish National Research Council, Agricultural University of Athens, MediGene, Centre national de la recherche scientifique, Katholieke Universiteit Leuven, Max Planck Society
TL;DR: Analysis of the sequence revealed an average gene density of one gene every 4.8 kilobases, and 54% of the predicted genes had significant similarity to known genes. Other notable features include the sequence of a disease-resistance gene locus, the distribution of retroelements, and the frequent occurrence of clustered gene families.
Abstract: The plant Arabidopsis thaliana (Arabidopsis) has become an important model species for the study of many aspects of plant biology. The relatively small size of the nuclear genome and the availability of extensive physical maps of the five chromosomes provide a feasible basis for initiating sequencing of the five chromosomes. The YAC (yeast artificial chromosome)-based physical map of chromosome 4 was used to construct a sequence-ready map of cosmid and BAC (bacterial artificial chromosome) clones covering a 1.9-megabase (Mb) contiguous region, and the sequence of this region is reported here. Analysis of the sequence revealed an average gene density of one gene every 4.8 kilobases (kb), and 54% of the predicted genes had significant similarity to known genes. Other interesting features were found, such as the sequence of a disease-resistance gene locus, the distribution of retroelements, the frequent occurrence of clustered gene families, and the sequence of several classes of genes not previously encountered in plants.
832 citations
••
TL;DR: Analysis of 17.38 megabases of unique sequence, representing about 17% of the Arabidopsis genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements.
Abstract: The higher plant Arabidopsis thaliana (Arabidopsis) is an important model for identifying plant genes and determining their function. To assist biological investigations and to define chromosome structure, a coordinated effort to sequence the Arabidopsis genome was initiated in late 1996. Here we report one of the first milestones of this project, the sequence of chromosome 4. Analysis of 17.38 megabases of unique sequence, representing about 17% of the genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements. Heterochromatic regions surrounding the putative centromere, which has not yet been completely sequenced, are characterized by an increased frequency of a variety of repeats, new repeats, reduced recombination, lowered gene density and lowered gene expression. Roughly 60% of the predicted protein-coding genes have been functionally characterized on the basis of their homology to known genes. Many genes encode predicted proteins that are homologous to human and Caenorhabditis elegans proteins.
411 citations
••
TL;DR: The clustering of highly repetitive elements is a striking feature of the A. thaliana genome emerging from sequence and other analyses, indicating that local sequence duplication and subsequent divergence generates a significant proportion of gene families.
15 citations
Cited by
••
TL;DR: This is the first complete genome sequence of a plant and provides the foundations for more comprehensive comparison of conserved processes in all eukaryotes, identifying a wide range of plant-specific gene functions and establishing rapid systematic ways to identify genes for crop improvement.
Abstract: The flowering plant Arabidopsis thaliana is an important model system for identifying genes and determining their functions. Here we report the analysis of the genomic sequence of Arabidopsis. The sequenced regions cover 115.4 megabases of the 125-megabase genome and extend into centromeric regions. The evolution of Arabidopsis involved a whole-genome duplication, followed by subsequent gene loss and extensive local gene duplications, giving rise to a dynamic genome enriched by lateral gene transfer from a cyanobacterial-like ancestor of the plastid. The genome contains 25,498 genes encoding proteins from 11,000 families, similar to the functional diversity of Drosophila and Caenorhabditis elegans, the other sequenced multicellular eukaryotes. Arabidopsis has many families of new proteins but also lacks several common protein families, indicating that the sets of common proteins have undergone differential expansion and contraction in the three multicellular eukaryotes. This is the first complete genome sequence of a plant and provides the foundations for more comprehensive comparison of conserved processes in all eukaryotes, identifying a wide range of plant-specific gene functions and establishing rapid systematic ways to identify genes for crop improvement.
8,742 citations
••
TL;DR: A neural network-based tool, TargetP, for large-scale subcellular location prediction of newly identified proteins has been developed and it is estimated that 10% of all plant proteins are mitochondrial and 14% chloroplastic, and that the abundance of secretory proteins, in both Arabidopsis and Homo, is around 10%.
4,268 citations
••
TL;DR: The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms and reveals the evolutionary generation of diversity in the regulation of transcription.
Abstract: The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms. Arabidopsis dedicates over 5% of its genome to code for more than 1500 transcription factors, about 45% of which are from families specific to plants. Arabidopsis transcription factors that belong to families common to all eukaryotes do not share significant similarity with those of the other kingdoms beyond the conserved DNA binding domains, many of which have been arranged in combinations specific to each lineage. The genome-wide comparison reveals the evolutionary generation of diversity in the regulation of transcription.
2,582 citations
••
TL;DR: The WRKY proteins are a superfamily of transcription factors with up to 100 representatives in Arabidopsis that appear to be involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development.
2,447 citations
••
Agricultural Research Service, Oregon State University, University of California, Berkeley, John Innes Centre, United States Department of Energy, United States Department of Agriculture, University of California, Davis, University of Silesia in Katowice, China Agricultural University, Iowa State University, Washington State University, University of Florida, University of Massachusetts Amherst, University of Wisconsin-Madison, Technische Universität München, Cornell University, University of Zurich, University of Helsinki, Universidade Federal de Pelotas, Purdue University, University of Texas at Arlington, National Center for Genome Resources, University of Delaware, Joint BioEnergy Institute, University of Copenhagen, Kyung Hee University, Ghent University, Centre national de la recherche scientifique, Oak Ridge National Laboratory, Ohio State University, Institut national de la recherche agronomique, University of Picardie Jules Verne, Illinois State University, Sabancı University, Donald Danforth Plant Science Center
TL;DR: The high-quality genome sequence will help Brachypodium reach its potential as an important model system for developing new energy and food crops and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat.
Abstract: Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops.
1,603 citations