Author

R. Wambutt

Bio: R. Wambutt is an academic researcher who has contributed to research on the topics Genome project and Gene. The author has an h-index of 6 and has co-authored 6 publications receiving 1550 citations.

Topics: Genome project, Gene, Genome, Gene density, Chromosome 4

Papers

Journal Article

Analysis of 1.9 Mb of contiguous sequence from chromosome 4 of Arabidopsis thaliana

Michael W. Bevan¹, Ian Bancroft¹, E. Bent¹, K. Love¹, Howard M. Goodman², Caroline Dean¹, R. Bergkamp, Wim G. Dirkse, M. van Staveren, Willem J. Stiekema, L. Drost¹, P. Ridley¹, S.-A. Hudson¹, K. Patel¹, George Murphy¹, Pietro Piffanelli¹, H. Wedler, E. Wedler, R. Wambutt, T. Weitzenegger, Thomas Pohl, Nancy Terryn³, J. Gielen³, R. Villarroel³, R. De Clerck³, M. Van Montagu³, A. Lecharny⁴, S. Auborg⁴, I. Gy⁴, M. Kreis⁴, N. Lao⁵, Tony A. Kavanagh⁵, S. Hempel, P. Kötter, K.-D. Entian, M. Rieger, M. Schaeffer, B. Funk, S. Mueller-Auer, M. Silvey⁶, Richard James⁶, A. Montfort⁷, A. Pons⁷, Pere Puigdomènech⁷, A. Douka⁸, E. Voukelatou⁸, Dimitra Milioni⁸, Polydefkis Hatzopoulos⁸, E. Piravandi⁹, B. Obermaier⁹, H. Hilbert, A. Düsterhöft, T. Moores¹, Jonathan D. G. Jones¹, T. Eneva, Klaus Palme, Vladimir Benes, S. Rechman, W. Ansorge, R. Cooke¹⁰, Claire Berger¹⁰, Michel Delseny¹⁰, Marleen Voet¹¹, Guido Volckaert¹¹, Hans-Werner Mewes¹², S. Klosterman¹², C. Schueller¹², N. Chalwatzis¹²

Institutions (12): John Innes Centre¹, Harvard University², Ghent University³, University of Paris⁴, Trinity College, Dublin⁵, University of East Anglia⁶, Spanish National Research Council⁷, Agricultural University of Athens⁸, MediGene⁹, Centre national de la recherche scientifique¹⁰, Katholieke Universiteit Leuven¹¹, Max Planck Society¹²

29 Jan 1998-Nature

TL;DR: Analysis of the sequence revealed an average gene density of one gene every 4.8 kilobases, and 54% of the predicted genes had significant similarity to known genes, and other interesting features were found, such as the sequence of a disease-resistance gene locus, the distribution of retroelements, and the frequent occurrence of clustered gene families.

Abstract: The plant Arabidopsis thaliana (Arabidopsis) has become an important model species for the study of many aspects of plant biology. The relatively small size of the nuclear genome and the availability of extensive physical maps of the five chromosomes provide a feasible basis for initiating sequencing of the five chromosomes. The YAC (yeast artificial chromosome)-based physical map of chromosome 4 was used to construct a sequence-ready map of cosmid and BAC (bacterial artificial chromosome) clones covering a 1.9-megabase (Mb) contiguous region, and the sequence of this region is reported here. Analysis of the sequence revealed an average gene density of one gene every 4.8 kilobases (kb), and 54% of the predicted genes had significant similarity to known genes. Other interesting features were found, such as the sequence of a disease-resistance gene locus, the distribution of retroelements, the frequent occurrence of clustered gene families, and the sequence of several classes of genes not previously encountered in plants.

832 citations

Journal Article

Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana

Klaus F. X. Mayer, C. Schüller, R. Wambutt, George Murphy, and 230 more authors (21 institutions)

16 Dec 1999-Nature

TL;DR: Analysis of 17.38 megabases of unique sequence, representing about 17% of the Arabidopsis genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements.

Abstract: The higher plant Arabidopsis thaliana (Arabidopsis) is an important model for identifying plant genes and determining their function. To assist biological investigations and to define chromosome structure, a coordinated effort to sequence the Arabidopsis genome was initiated in late 1996. Here we report one of the first milestones of this project, the sequence of chromosome 4. Analysis of 17.38 megabases of unique sequence, representing about 17% of the genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements. Heterochromatic regions surrounding the putative centromere, which has not yet been completely sequenced, are characterized by an increased frequency of a variety of repeats, new repeats, reduced recombination, lowered gene density and lowered gene expression. Roughly 60% of the predicted protein-coding genes have been functionally characterized on the basis of their homology to known genes. Many genes encode predicted proteins that are homologous to human and Caenorhabditis elegans proteins.

411 citations

Journal Article

Toward a Catalog of Human Genes and Proteins: Sequencing and Analysis of 500 Novel Complete Protein Coding Human cDNAs

Stefan Wiemann¹, Bernd Weil¹, Ruth Wellenreuther¹, Johannes Gassenhuber¹, Sabine Glassl, Wilhelm Ansorge, Michael Böcher, Helmut Blöcker, Stefan Bauersachs², Helmut Blum², Jürgen Lauber, A. Düsterhöft, Andreas Beyer³, Karl Köhrer³, Normann Strack, Hans-Werner Mewes, Birgit Ottenwälder, Brigitte Obermaier, Jens Tampe, Dagmar Heubner, R. Wambutt, Bernhard Korn¹, Michaela Klein¹, Annemarie Poustka¹

Institutions (3): German Cancer Research Center¹, Ludwig Maximilian University of Munich², University of Düsseldorf³

01 Mar 2001-Genome Research

TL;DR: The sequencing and analysis of 500 novel human cDNAs containing the complete protein coding frame were reported, finding a number of genes that either had been completely missed in the analysis of the genomic sequences or had been wrongly predicted.

Abstract: With the complete human genomic sequence being unraveled, the focus will shift to gene identification and to the functional analysis of gene products. The generation of a set of cDNAs, both sequences and physical clones, which contains the complete and noninterrupted protein coding regions of all human genes will provide the indispensable tools for the systematic and comprehensive analysis of protein function to eventually understand the molecular basis of man. Here we report the sequencing and analysis of 500 novel human cDNAs containing the complete protein coding frame. Assignment to functional categories was possible for 52% (259) of the encoded proteins, the remaining fraction having no similarities with known proteins. By aligning the cDNA sequences with the sequences of the finished chromosomes 21 and 22 we identified a number of genes that either had been completely missed in the analysis of the genomic sequences or had been wrongly predicted. Three of these genes appear to be present in several copies. We conclude that full-length cDNA sequencing continues to be crucial also for the accurate identification of genes. The set of 500 novel cDNAs, and another 1000 full-coding cDNAs of known transcripts we have identified, adds up to cDNA representations covering 2%–5% of all human genes. We thus substantially contribute to the generation of a gene catalog, consisting of both full-coding cDNA sequences and clones, which should be made freely available and will become an invaluable tool for detailed functional studies. [The sequence data described in this paper have been submitted to the EMBL database under the accession nos. given in Table 2.]

Table 2. Functional Classification of Individual cDNAs

The recent past has witnessed major advances in the determination of the sequence of the human genome (Dunham et al. 1999; Hattori et al. 2000). Although the whole genomic sequence will be completely unraveled in the near future (Collins et al. 1998), the identification of genes and the deciphering of gene structures will extend for a prolonged time, and cDNA sequences will continue to be invaluable tools for this adventure, especially in view of alternative splicing. The primary focus will shift to the functional analysis of the genes and their protein products to finally understand the molecular basis of human life.

Current estimates vary between 29,000 and >70,000 genes to constitute the protein coding repertoire of the human genome (Fields et al. 1994; Ewing and Green 2000; Liang et al. 2000; Roest Crollius et al. 2000). However, thus far only some 11,000 cDNA sequences have been deposited in public databases, which are supposed to contain the complete protein coding open reading frame (ORF). The majority of the respective cDNA clones are most likely not accessible. The generation of a physical clone set representing all human genes that should be made freely accessible is consequently regarded to have an extremely high impact (Schuler 1997; Pruitt et al. 2000). This would permit the establishment of a catalog of clones to provide the resources needed in the proteomics era where the functions of proteins, their action in pathways, and the possible disease relation are deciphered.

Until recently, the long-cDNA sequencing project carried out at the Kazusa Institute (Nomura et al. 1994; Nagase et al. 2000) had been the only systematic full-length cDNA sequencing project with a significant output of novel sequence information. The initiation of a new large-scale cDNA sequencing project has been announced lately that is coordinated by the National Institute of Health (Strausberg et al. 1999). We founded a cDNA Consortium in 1997 as part of the German Genome Project and aim at the characterization of the complete sequences of novel human transcripts at the cDNA level. Here, we report the sequences and analysis of 500 novel human cDNAs that all contain the complete protein coding region. These cDNAs constitute the most valuable essence of 30,000 clones that have been EST sequenced and 3630 fully sequenced cDNAs. Over 1000 cDNAs that cover the complete coding sequence of already known transcripts have been identified in the EST-sequenced clone set. All clones are made available through the Resource Center of the German Genome Project (RZPD).

185 citations

Journal Article

Conservation of Microstructure between a Sequenced Region of the Genome of Rice and Multiple Segments of the Genome of Arabidopsis thaliana

Klaus F. X. Mayer, George Murphy, Renato Tarchini, R. Wambutt, Guido Volckaert, Thomas Pohl, A. Düsterhöft, Willem J. Stiekema, Karl-Dieter Entian, Nancy Terryn, Kai Lemcke, Dirk Haase, Caroline R. Hall, Anne-Marie van Dodeweerd, Scott V. Tingey, Hans-Werner Mewes, Michael W. Bevan, Ian Bancroft

01 Jul 2001-Genome Research

TL;DR: The results demonstrate that conservation of the genome microstructure can be identified even between monocot and dicot species, and are consistent with the hypothesis that the Arabidopsis genome has undergone multiple duplication events.

Abstract: The nucleotide sequence was determined for a 340-kb segment of rice chromosome 2, revealing 56 putative protein-coding genes. This represents a density of one gene per 6.1 kb, which is higher than was reported for a previously sequenced segment of the rice genome. Sixteen of the putative genes were supported by matches to ESTs. The predicted products of 29 of the putative genes showed similarity to known proteins, and a further 17 genes showed similarity only to predicted or hypothetical proteins identified in genome sequence data. The region contains a few transposable elements: one retrotransposon, and one transposon. The segment of the rice genome studied had previously been identified as representing a part of rice chromosome 2 that may be homologous to a segment of Arabidopsis chromosome 4. We confirmed the conservation of gene content and order between the two genome segments. In addition, we identified a further four segments of the Arabidopsis genome that contain conserved gene content and order. In total, 22 of the 56 genes identified in the rice genome segment were represented in this set of Arabidopsis genome segments, with at least five genes present, in conserved order, in each segment. These data are consistent with the hypothesis that the Arabidopsis genome has undergone multiple duplication events. Our results demonstrate that conservation of the genome microstructure can be identified even between monocot and dicot species. However, the frequent occurrence of duplication, and subsequent microstructure divergence, within plant genomes may necessitate the integration of subsets of genes present in multiple redundant segments to deduce evolutionary relationships and identify orthologous genes.

73 citations

Patent

Human DNA sequences

Stefan Wiemann, Annemarie Poustka, Ruth Wellenreuther, Helmut Blum, Brigitte Obermaier, Birgit Ottenwaelder, Andre Bahr, Andreas Duesterhoeft, Christoph Koenig, Juergen Lauber, Dagmar Heubner, R. Wambutt, Karl Koehrer, Andreas Beyer, Johann Gassenhuber, Christian Gruber, Norman Strack, H.W. Mewes, Wilhelm Ansorge, Sabine Glassl, Claudia Rittmueller, Thomas Regiert, Helmut Bloecker, Michael Boecher, Klaus Hornischer, Gabriele Nordsiek, Jens Tampe

18 Aug 2000

TL;DR: Novel human cDNA sequences, the protein sequences they encode, and antibodies and variants thereof are provided; the disclosed sequences find application in a number of ways, including use in profiling assays.

Abstract: Novel human cDNA sequences, the encoded protein sequences, antibodies and variants thereof are provided. The disclosed sequences find application in a number of ways, including use in profiling assays. In this regard, various assemblages of nucleic acids or proteins are provided that are useful in providing large arrays of human material for implementing large-scale screening strategies. The disclosed sequences may also be used in formulating medicaments, treating various disorders and in certain diagnostic applications.

62 citations

Cited by

Journal Article

Analysis of the genome sequence of the flowering plant Arabidopsis thaliana.

Arabidopsis Genome Initiative¹

Institutions (1): J. Craig Venter Institute¹

14 Dec 2000-Nature

TL;DR: This is the first complete genome sequence of a plant and provides the foundations for more comprehensive comparison of conserved processes in all eukaryotes, identifying a wide range of plant-specific gene functions and establishing rapid systematic ways to identify genes for crop improvement.

Abstract: The flowering plant Arabidopsis thaliana is an important model system for identifying genes and determining their functions. Here we report the analysis of the genomic sequence of Arabidopsis. The sequenced regions cover 115.4 megabases of the 125-megabase genome and extend into centromeric regions. The evolution of Arabidopsis involved a whole-genome duplication, followed by subsequent gene loss and extensive local gene duplications, giving rise to a dynamic genome enriched by lateral gene transfer from a cyanobacterial-like ancestor of the plastid. The genome contains 25,498 genes encoding proteins from 11,000 families, similar to the functional diversity of Drosophila and Caenorhabditis elegans--the other sequenced multicellular eukaryotes. Arabidopsis has many families of new proteins but also lacks several common protein families, indicating that the sets of common proteins have undergone differential expansion and contraction in the three multicellular eukaryotes. This is the first complete genome sequence of a plant and provides the foundations for more comprehensive comparison of conserved processes in all eukaryotes, identifying a wide range of plant-specific gene functions and establishing rapid systematic ways to identify genes for crop improvement.

8,742 citations

Journal Article

Predicting subcellular localization of proteins based on their N-terminal amino acid sequence.

Olof Emanuelsson¹, Henrik Nielsen², Søren Brunak², Gunnar von Heijne¹

Institutions (2): Stockholm University¹, Technical University of Denmark²

21 Jul 2000-Journal of Molecular Biology

TL;DR: A neural network-based tool, TargetP, for large-scale subcellular location prediction of newly identified proteins has been developed and it is estimated that 10% of all plant proteins are mitochondrial and 14% chloroplastic, and that the abundance of secretory proteins, in both Arabidopsis and Homo, is around 10%.

4,268 citations

Journal Article

A draft sequence of the rice genome (Oryza sativa L. ssp indica)

Stephen A. Goff¹, Darrell O. Ricke¹, Tien-Hung Lan¹, Gernot G. Presting¹, Ronglin Wang¹, Molly Dunn¹, Jane Glazebrook¹, Allen Sessions¹, Paul Oeller¹, Hemant Varma¹, David Hadley¹, Don Hutchison¹, Christopher M. Martin¹, Fumiaki Katagiri¹, B. Markus Lange¹, Todd Moughamer¹, Yu Xia¹, Paul Budworth¹, Jingping Zhong¹, Trini Miguel¹, Uta Paszkowski¹, Shiping Zhang¹, Michelle Colbert¹, Wei-lin Sun¹, Lili Chen¹, Bret Cooper¹, Sylvia Park¹, Todd Charles Wood², Long Mao³, Peter H. Quail⁴, Rod A. Wing⁵, Ralph A. Dean⁵, Yeisoo Yu⁵, Andrey Zharkikh⁶, Richard Shen⁶, Sudhir Sahasrabudhe⁶, Alun Thomas⁶, Rob Cannings⁶, Alexander Gutin⁶, Dmitry Pruss⁶, Julia Reid⁶, Sean V. Tavtigian⁶, J.T. Mitchell⁶, Glenn Eldredge⁶, Terri Scholl⁶, Rose Mary Miller⁶, Satish Bhatnagar⁶, Nils Adey⁶, Todd Rubano⁶, Nadeem Tusneem⁶, Rosann Robinson⁶, Jane Feldhaus⁶, Teresita Macalma⁶, Arnold R. Oliphant⁶, Steven P. Briggs¹

Institutions (6): Syngenta¹, Bryan College², Northern Illinois University³, University of California, Berkeley⁴, Clemson University⁵, Myriad Genetics⁶

05 Apr 2002-Science

TL;DR: A draft sequence of the rice genome for the most widely cultivated subspecies in China, Oryza sativa L. ssp. indica, was produced by whole-genome shotgun sequencing; a large proportion of rice genes have no recognizable homologs, owing to a gradient in the GC content of rice coding sequences.

Abstract: We have produced a draft sequence of the rice genome for the most widely cultivated subspecies in China, Oryza sativa L. ssp. indica, by whole-genome shotgun sequencing. The genome was 466 megabases in size, with an estimated 46,022 to 55,615 genes. Functional coverage in the assembled sequences was 92.0%. About 42.2% of the genome was in exact 20-nucleotide oligomer repeats, and most of the transposons were in the intergenic regions between genes. Although 80.6% of predicted Arabidopsis thaliana genes had a homolog in rice, only 49.4% of predicted rice genes had a homolog in A. thaliana. The large proportion of rice genes with no recognizable homologs is due to a gradient in the GC-content of rice coding sequences.

4,064 citations

Journal Article

Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes.

José Luis Riechmann¹, Jacqueline E. Heard¹, George M. Martin¹, T. Lynne Reuber¹, Cai-Zhong Jiang¹, James S. Keddie¹, Luc Adam¹, Omaira Pineda¹, Oliver J. Ratcliffe¹, Raymond Samaha¹, Robert A. Creelman¹, Marsha Pilgrim¹, Pierre Broun¹, James Zhang¹, D. Ghandehari¹, Bradley K. Sherman¹, Guo-Liang Yu¹

Institutions (1): Mendel Biotechnology, Inc.¹

15 Dec 2000-Science

TL;DR: The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms and reveals the evolutionary generation of diversity in the regulation of transcription.

Abstract: The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms. Arabidopsis dedicates over 5% of its genome to code for more than 1500 transcription factors, about 45% of which are from families specific to plants. Arabidopsis transcription factors that belong to families common to all eukaryotes do not share significant similarity with those of the other kingdoms beyond the conserved DNA binding domains, many of which have been arranged in combinations specific to each lineage. The genome-wide comparison reveals the evolutionary generation of diversity in the regulation of transcription.

2,582 citations

Journal Article

The WRKY superfamily of plant transcription factors

Thomas Eulgem¹, Paul J. Rushton¹, Silke Robatzek¹, Imre E. Somssich¹

Institutions (1): Max Planck Society¹

01 May 2000-Trends in Plant Science

TL;DR: The WRKY proteins are a superfamily of transcription factors with up to 100 representatives in Arabidopsis that appear to be involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development.

2,447 citations
