Author

Claude Scarpelli

Other affiliations: Centre national de la recherche scientifique, Université Paris-Saclay

Bio: Claude Scarpelli is an academic researcher from University of Évry Val d'Essonne. The author has contributed to research on the topics Genome and Gene, has an h-index of 19, and has co-authored 22 publications that have received 14,047 citations. Previous affiliations of Claude Scarpelli include Centre national de la recherche scientifique and Université Paris-Saclay.

Topics: Genome, Gene, Comparative genomics, Genomics, Genome project ...

Papers

Journal Article

The map-based sequence of the rice genome

Takashi Matsumoto¹, Jianzhong Wu¹, Hiroyuki Kanamori¹, Yuichi Katayose¹ and 262 more authors (25 institutions)

11 Aug 2005-Nature

TL;DR: A map-based, finished quality sequence that covers 95% of the 389 Mb rice genome, including virtually all of the euchromatin and two complete centromeres, and finds evidence for widespread and recurrent gene transfer from the organelles to the nuclear chromosomes.

Abstract: Rice, one of the world's most important food plants, has important syntenic relationships with the other cereal species and is a model plant for the grasses. Here we present a map-based, finished quality sequence that covers 95% of the 389 Mb genome, including virtually all of the euchromatin and two complete centromeres. A total of 37,544 non-transposable-element-related protein-coding genes were identified, of which 71% had a putative homologue in Arabidopsis. In a reciprocal analysis, 90% of the Arabidopsis proteins had a putative homologue in the predicted rice proteome. Twenty-nine per cent of the 37,544 predicted genes appear in clustered gene families. The number and classes of transposable elements found in the rice genome are consistent with the expansion of syntenic regions in the maize and sorghum genomes. We find evidence for widespread and recurrent gene transfer from the organelles to the nuclear chromosomes. The map-based sequence has proven useful for the identification of genes underlying agronomic traits. The additional single-nucleotide polymorphisms and simple sequence repeats identified in our study should accelerate improvements in rice production.

3,423 citations

Journal Article

The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla.

Olivier Jaillon¹, Jean-Marc Aury, Benjamin Noel, Alberto Policriti, Christian Clepet, Alberto Casagrande, Nathalie Choisne, Sébastien Aubourg, Nicola Vitulo, Claire Jubin, Alessandro Vezzi, Fabrice Legeai, Philippe Hugueney, Corinne Dasilva, David S. Horner, Erica Mica, Delphine Jublot, Julie Poulain, Clémence Bruyère, Alain Billault, Béatrice Segurens, Michel Gouyvenoux, Edgardo Ugarte, Federica Cattonaro, Véronique Anthouard, Virginie Vico, Cristian Del Fabbro, Michael Alaux, Gabriele Di Gaspero, Vincent Dumas, Nicoletta Felice, Sophie Paillard, Irena Juman, Marco Moroldo, Simone Scalabrin, Aurélie Canaguier, Isabelle Le Clainche, G Malacrida, Eléonore Durand, Graziano Pesole, Valérie Laucou, Philippe Chatelet, Didier Merdinoglu, Massimo Delledonne, Mario Pezzotti, Alain Lecharny, Claude Scarpelli, François Artiguenave, M. Enrico Pè, Giorgio Valle, Michele Morgante, Michel Caboche, Anne-Françoise Adam-Blondon, Jean Weissenbach, Francis Quetier, Patrick Wincker

University of Évry Val d'Essonne¹

26 Aug 2007-Nature

TL;DR: A high-quality draft of the genome sequence of grapevine is obtained from a highly homozygous genotype, revealing the contribution of three ancestral genomes to the grapevine haploid content and explaining the chronology of previously described whole-genome duplication events in the evolution of flowering plants.

Abstract: The analysis of the first plant genomes provided unexpected evidence for genome duplication events in species that had previously been considered as true diploids on the basis of their genetics. These polyploidization events may have had important consequences in plant evolution, in particular for species radiation and adaptation and for the modulation of functional capacities. Here we report a high-quality draft of the genome sequence of grapevine (Vitis vinifera) obtained from a highly homozygous genotype. The draft sequence of the grapevine genome is the fourth one produced so far for flowering plants, the second for a woody species and the first for a fruit crop (cultivated for both fruit and beverage). Grapevine was selected because of its important place in the cultural heritage of humanity beginning during the Neolithic period. Several large expansions of gene families with roles in aromatic features are observed. The grapevine genome has not undergone recent genome duplication, thus enabling the discovery of ancestral traits and features of the genetic organization of flowering plants. This analysis reveals the contribution of three ancestral genomes to the grapevine haploid content. This ancestral arrangement is common to many dicotyledonous plants but is absent from the genome of rice, which is a monocotyledon. Furthermore, we explain the chronology of previously described whole-genome duplication events in the evolution of flowering plants.

3,311 citations

Journal Article

Genome duplication in the teleost fish Tetraodon nigroviridis reveals the early vertebrate proto-karyotype

Olivier Jaillon¹, Jean-Marc Aury¹, Frédéric Brunet², Jean-Louis Petit¹, Nicole Stange-Thomann³, Evan Mauceli³, Laurence Bouneau¹, Cécile Fischer¹, Catherine Ozouf-Costaz⁴, Alain Bernot¹, Sophie Nicaud¹, David M. Jaffe³, Sheila Fisher³, Georges Lutfalla⁴, Carole Dossat¹, Béatrice Segurens¹, Corinne Dasilva¹, Marcel Salanoubat¹, Michael Levy¹, Nathalie Boudet¹, Sergi Castellano, Véronique Anthouard¹, Claire Jubin¹, Vanina Castelli¹, Michael Katinka¹, Benoit Vacherie¹, Christian Biémont⁵, Zineb Skalli¹, Laurence Cattolico¹, Julie Poulain¹, Véronique de Berardinis¹, Corinne Cruaud¹, Simone Duprat¹, Philippe Brottier¹, Jean-Pierre Coutanceau⁴, Jérôme Gouzy⁴, Genís Parra, Guillaume Lardier¹, Charles Chapple, Kevin McKernan, Paul A. McEwan, Stephanie Bosak, Manolis Kellis³, Jean-Nicolas Volff⁶, Roderic Guigó, Michael C. Zody³, Jill P. Mesirov³, Kerstin Lindblad-Toh³, Bruce W. Birren³, Chad Nusbaum³, Daniel Kahn⁴, Marc Robinson-Rechavi², Vincent Laudet², Vincent Schächter¹, Francis Quetier¹, William Saurin¹, Claude Scarpelli¹, Patrick Wincker¹, Eric S. Lander⁷, Eric S. Lander³, Jean Weissenbach¹, Hugues Roest Crollius¹, Hugues Roest Crollius⁸

University of Évry Val d'Essonne¹, École normale supérieure de Lyon², Broad Institute³, Centre national de la recherche scientifique⁴, University of Lyon⁵, University of Würzburg⁶, Massachusetts Institute of Technology⁷, École Normale Supérieure⁸

21 Oct 2004-Nature

TL;DR: Genome analysis provides a greatly improved fish gene catalogue, including identifying key genes previously thought to be absent in fish, and reconstructs much of the evolutionary history of ancient and recent chromosome rearrangements leading to the modern human karyotype.

Abstract: Tetraodon nigroviridis is a freshwater puffer fish with the smallest known vertebrate genome. Here, we report a draft genome sequence with long-range linkage and substantial anchoring to the 21 Tetraodon chromosomes. Genome analysis provides a greatly improved fish gene catalogue, including identifying key genes previously thought to be absent in fish. Comparison with other vertebrates and a urochordate indicates that fish proteins have diverged markedly faster than their mammalian homologues. Comparison with the human genome suggests ∼900 previously unannotated human genes. Analysis of the Tetraodon and human genomes shows that whole-genome duplication occurred in the teleost fish lineage, subsequent to its divergence from mammals. The analysis also makes it possible to infer the basic structure of the ancestral bony vertebrate genome, which was composed of 12 chromosomes, and to reconstruct much of the evolutionary history of ancient and recent chromosome rearrangements leading to the modern human karyotype.

1,889 citations

Journal Article

Genome evolution in yeasts

Bernard Dujon¹, David James Sherman², Gilles Fischer¹, Pascal Durrens³, Serge Casaregola³, Ingrid Lafontaine¹, Jacky de Montigny³, Christian Marck, Cécile Neuvéglise³, Emmanuel Talla¹, Nicolas Goffard, Lionel Frangeul, Michel Aigle³, Véronique Anthouard³, Anna Babour³, Valérie Barbe³, Stéphanie Barnay³, Sylvie Blanchin³, Jean-Marie Beckerich³, Emmanuelle Beyne², Claudine Bleykasten³, Anita Boisramé³, Jeanne Boyer¹, Laurence Cattolico³, Fabrice Confanioleri⁴, Antoine de Daruvar, Laurence Despons³, Emmanuelle Fabre¹, Cécile Fairhead¹, Hélène Ferry-Dumazet, Alexis Groppi, Florence Hantraye⁵, Christophe Hennequin¹, Nicolas Jauniaux³, Philippe Joyet³, Rym Kachouri³, Alix Kerrest¹, Romain Koszul¹, Marc Lemaire³, Isabelle Lesur², Laurence Ma, Héloïse Muller¹, Jean-Marc Nicaud³, Macha Nikolski², Sophie Oztas³, Odile Ozier-Kalogeropoulos¹, Stefan Pellenz¹, Serge Potier³, Guy-Franck Richard¹, Marie-Laure Straub³, Audrey Suleau³, Dominique Swennen³, Fredj Tekaia¹, Micheline Wésolowski-Louvel³, Eric Westhof³, Bénédicte Wirth³, Maria Zeniou-Meyer³, Ivan Zivanovic⁴, Monique Bolotin-Fukuhara⁴, Agnès Thierry¹, Christiane Bouchier, Bernard Caudron⁵, Claude Scarpelli³, Claude Gaillardin³, Jean Weissenbach³, Patrick Wincker³, Jean-Luc Souciet³

Pierre-and-Marie-Curie University¹, L'Abri², Centre national de la recherche scientifique³, University of Paris⁴, Pasteur Institute⁵

01 Jul 2004-Nature

TL;DR: Analysis of chromosome maps and genome redundancies reveal that the different yeast lineages have evolved through a marked interplay between several distinct molecular mechanisms, including tandem gene repeat formation, segmental duplication, a massive genome duplication and extensive gene loss.

Abstract: Identifying the mechanisms of eukaryotic genome evolution by comparative genomics is often complicated by the multiplicity of events that have taken place throughout the history of individual lineages, leaving only distorted and superimposed traces in the genome of each living organism. The hemiascomycete yeasts, with their compact genomes, similar lifestyle and distinct sexual and physiological properties, provide a unique opportunity to explore such mechanisms. We present here the complete, assembled genome sequences of four yeast species, selected to represent a broad evolutionary range within a single eukaryotic phylum, that after analysis proved to be molecularly as diverse as the entire phylum of chordates. A total of approximately 24,200 novel genes were identified, the translation products of which were classified together with Saccharomyces cerevisiae proteins into about 4,700 families, forming the basis for interspecific comparisons. Analysis of chromosome maps and genome redundancies reveal that the different yeast lineages have evolved through a marked interplay between several distinct molecular mechanisms, including tandem gene repeat formation, segmental duplication, a massive genome duplication and extensive gene loss.

1,604 citations

Journal Article

The Medicago genome provides insight into the evolution of rhizobial symbioses

Nevin D. Young¹, Frédéric Debellé², Frédéric Debellé³, Giles E. D. Oldroyd⁴, René Geurts⁵, Steven B. Cannon⁶, Steven B. Cannon⁷, Michael K. Udvardi, Vagner A. Benedito⁸, Klaus F. X. Mayer, Jérôme Gouzy², Jérôme Gouzy³, Heiko Schoof⁹, Yves Van de Peer¹⁰, Sebastian Proost¹⁰, Douglas R. Cook¹¹, Blake C. Meyers¹², Manuel Spannagl, Foo Cheung¹³, Stéphane De Mita⁵, Vivek Krishnakumar¹³, Heidrun Gundlach, Shiguo Zhou¹⁴, Joann Mudge¹⁵, Arvind K. Bharti¹⁵, Jeremy D. Murray⁴, Marina Naoumkina, Benjamin D. Rosen¹¹, Kevin A. T. Silverstein¹, Haibao Tang¹³, Stephane Rombauts¹⁰, Patrick X. Zhao, Peng Zhou¹, Valérie Barbe, Philippe Bardou², Philippe Bardou³, Michael Bechner¹⁴, Arnaud Bellec³, Anne Berger, Hélène Bergès³, Shelby L. Bidwell¹³, Ton Bisseling¹⁶, Ton Bisseling⁵, Nathalie Choisne, Arnaud Couloux, Roxanne Denny¹, Shweta Deshpande¹⁷, Xinbin Dai, Jeff J. Doyle¹⁸, Anne Marie Dudez², Anne Marie Dudez³, Andrew Farmer¹⁵, Stéphanie Fouteau, Carolien Franken⁵, Chrystel Gibelin², Chrystel Gibelin³, John Gish¹¹, Steven A. Goldstein¹⁴, Alvaro J. González¹², Pamela J. Green¹², Asis Hallab¹⁹, Marijke Hartog⁵, Axin Hua¹⁷, Sean Humphray²⁰, Dong-Hoon Jeong¹², Yi Jing¹⁷, Anika Jöcker¹⁹, Steve Kenton¹⁷, Dong-Jin Kim²¹, Dong-Jin Kim¹¹, Kathrin Klee¹⁹, Hongshing Lai¹⁷, Chunting Lang⁵, Shaoping Lin¹⁷, Simone L. Macmil¹⁷, Ghislaine Magdelenat, Lucy Matthews²⁰, Jamison McCorrison¹³, Erin L. Monaghan¹³, Jeong Hwan Mun¹¹, Jeong Hwan Mun²², Fares Z. Najar¹⁷, Christine Nicholson²⁰, Céline Noirot³, Majesta O'Bleness¹⁷, Charles Paule¹, Julie Poulain, Florent Prion², Florent Prion³, Baifang Qin¹⁷, Chunmei Qu¹⁷, Ernest F. Retzel¹⁵, Claire Riddle²⁰, Erika Sallet², Erika Sallet³, Sylvie Samain, Nicolas Samson³, Nicolas Samson², Iryna Sanders¹⁷, Olivier Saurat³, Olivier Saurat², Claude Scarpelli, Thomas Schiex³, Béatrice Segurens, Andrew J. Severin⁷, D. Janine Sherrier¹², Ruihua Shi¹⁷, Sarah Sims²⁰, Susan R. Singer²³, Senjuti Sinharoy, Lieven Sterck¹⁰, Agnès Viollet, Bing Bing Wang¹, Keqin Wang¹⁷, Mingyi Wang, Xiaohong Wang¹, Jens Warfsmann¹⁹, Jean Weissenbach, Doug White¹⁷, James D. White¹⁷, Graham B. Wiley¹⁷, Patrick Wincker, Yanbo Xing¹⁷, Limei Yang¹⁷, Ziyun Yao¹⁷, Fu Ying¹⁷, Jixian Zhai¹², Liping Zhou¹⁷, Antoine Zuber³, Antoine Zuber², Jean Dénarié², Jean Dénarié³, Richard A. Dixon, Gregory D. May¹⁵, David C. Schwartz¹⁴, Jane Rogers²⁴, Francis Quetier, Christopher D. Town¹³, Bruce A. Roe¹⁷

University of Minnesota¹, Centre national de la recherche scientifique², Institut national de la recherche agronomique³, John Innes Centre⁴, Laboratory of Molecular Biology⁵, Agricultural Research Service⁶, Iowa State University⁷, West Virginia University⁸, University of Bonn⁹, Ghent University¹⁰, University of California, Davis¹¹, Delaware Biotechnology Institute¹², J. Craig Venter Institute¹³, University of Wisconsin-Madison¹⁴, National Center for Genome Resources¹⁵, King Saud University¹⁶, University of Oklahoma¹⁷, Cornell University¹⁸, Max Planck Society¹⁹, Wellcome Trust²⁰, International Institute of Minnesota²¹, Rural Development Administration²², Carleton College²³, Norwich Research Park²⁴

22 Dec 2011-Nature

TL;DR: The draft sequence of the M. truncatula genome is described. M. truncatula is a close relative of alfalfa (Medicago sativa), a widely cultivated crop with limited genomics tools and complex autotetraploid genetics, so its genome sequence provides significant opportunities to expand alfalfa's genomic toolbox.

Abstract: Legumes (Fabaceae or Leguminosae) are unique among cultivated plants for their ability to carry out endosymbiotic nitrogen fixation with rhizobial bacteria, a process that takes place in a specialized structure known as the nodule. Legumes belong to one of the two main groups of eurosids, the Fabidae, which includes most species capable of endosymbiotic nitrogen fixation. Legumes comprise several evolutionary lineages derived from a common ancestor 60 million years ago (Myr ago). Papilionoids are the largest clade, dating nearly to the origin of legumes and containing most cultivated species. Medicago truncatula is a long-established model for the study of legume biology. Here we describe the draft sequence of the M. truncatula euchromatin based on a recently completed BAC assembly supplemented with Illumina shotgun sequence, together capturing ∼94% of all M. truncatula genes. A whole-genome duplication (WGD) approximately 58 Myr ago had a major role in shaping the M. truncatula genome and thereby contributed to the evolution of endosymbiotic nitrogen fixation. Subsequent to the WGD, the M. truncatula genome experienced higher levels of rearrangement than two other sequenced legumes, Glycine max and Lotus japonicus. M. truncatula is a close relative of alfalfa (Medicago sativa), a widely cultivated crop with limited genomics tools and complex autotetraploid genetics. As such, the M. truncatula genome sequence provides significant opportunities to expand alfalfa's genomic toolbox.

1,153 citations

Cited by

Journal Article

Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation

Cole Trapnell¹, Cole Trapnell², Brian A. Williams³, Geo Pertea², Ali Mortazavi³, Gordon Kwan³, Marijke J. van Baren⁴, Steven L. Salzberg², Barbara J. Wold³, Lior Pachter¹

University of California, Berkeley¹, University of Maryland, College Park², California Institute of Technology³, Washington University in St. Louis⁴

01 May 2010-Nature Biotechnology

TL;DR: The results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.

Abstract: High-throughput mRNA sequencing (RNA-Seq) promises simultaneous transcript discovery and abundance estimation. However, this would require algorithms that are not restricted by prior gene annotations and that account for alternative transcription and splicing. Here we introduce such algorithms in an open-source software program called Cufflinks. To test Cufflinks, we sequenced and analyzed >430 million paired 75-bp RNA-Seq reads from a mouse myoblast cell line over a differentiation time series. We detected 13,692 known transcripts and 3,724 previously unannotated ones, 62% of which are supported by independent expression data or by homologous genes in other species. Over the time series, 330 genes showed complete switches in the dominant transcription start site (TSS) or splice isoform, and we observed more subtle shifts in 1,304 other genes. These results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.

13,337 citations

Journal Article

The RAST Server: Rapid Annotations using Subsystems Technology

Ramy K. Aziz¹, Ramy K. Aziz², Daniela Bartels³, Aaron A. Best⁴, Matthew DeJongh⁴, Terrence Disz³, Terrence Disz⁵, Robert Edwards⁵, Kevin Formsma⁴, Svetlana Gerdes, Elizabeth M. Glass⁵, Michael Kubal³, Folker Meyer³, Folker Meyer⁵, Gary J. Olsen⁶, Gary J. Olsen⁵, Robert Olson⁵, Robert Olson³, Andrei L. Osterman⁷, Ross Overbeek, Leslie Klis McNeil⁶, Daniel Paarmann³, Tobias Paczian³, Bruce Parrello, Gordon D. Pusch³, Claudia I. Reich⁶, Rick Stevens⁵, Rick Stevens³, Olga Vassieva, Veronika Vonstein, Andreas Wilke³, Olga Zagnitko

Cairo University¹, University of Tennessee Health Science Center², University of Chicago³, Hope College⁴, Argonne National Laboratory⁵, University of Illinois at Urbana–Champaign⁶, Sanford-Burnham Institute for Medical Research⁷

08 Feb 2008-BMC Genomics

TL;DR: A fully automated service for annotating bacterial and archaeal genomes that identifies protein-encoding, rRNA and tRNA genes, assigns functions to the genes, predicts which subsystems are represented in the genome, uses this information to reconstruct the metabolic network and makes the output easily downloadable for the user.

Abstract: The number of prokaryotic genome sequences becoming available is growing steadily and is growing faster than our ability to accurately annotate them. We describe a fully automated service for annotating bacterial and archaeal genomes. The service identifies protein-encoding, rRNA and tRNA genes, assigns functions to the genes, predicts which subsystems are represented in the genome, uses this information to reconstruct the metabolic network and makes the output easily downloadable for the user. In addition, the annotated genome can be browsed in an environment that supports comparative analysis with the annotated genomes maintained in the SEED environment. The service normally makes the annotated genome available within 12–24 hours of submission, but ultimately the quality of such a service will be judged in terms of accuracy, consistency, and completeness of the produced annotations. We summarize our attempts to address these issues and discuss plans for incrementally enhancing the service. By providing accurate, rapid annotation freely to the community we have created an important community resource. The service has now been utilized by over 120 external users annotating over 350 distinct genomes.

9,397 citations

Journal Article

Circos: An information aesthetic for comparative genomics

Martin Krzywinski, Jacqueline E. Schein, Inanc Birol, Joseph M. Connors, Randy D. Gascoyne, Doug Horsman, Steven J.M. Jones, Marco A. Marra

01 Sep 2009-Genome Research

TL;DR: Circos uses a circular ideogram layout to facilitate the display of relationships between pairs of positions by the use of ribbons, which encode the position, size, and orientation of related genomic elements.

Abstract: We created a visualization tool called Circos to facilitate the identification and analysis of similarities and differences arising from comparisons of genomes. Our tool is effective in displaying variation in genome structure and, generally, any other kind of positional relationships between genomic intervals. Such data are routinely produced by sequence alignments, hybridization arrays, genome mapping, and genotyping studies. Circos uses a circular ideogram layout to facilitate the display of relationships between pairs of positions by the use of ribbons, which encode the position, size, and orientation of related genomic elements. Circos is capable of displaying data as scatter, line, and histogram plots, heat maps, tiles, connectors, and text. Bitmap or vector images can be created from GFF-style data inputs and hierarchical configuration files, which can be easily generated by automated tools, making Circos suitable for rapid deployment in data analysis and reporting pipelines.

8,315 citations

Journal Article

Sequencing technologies - the next generation

Michael L. Metzker¹

Baylor College of Medicine¹

01 Jan 2010-Nature Reviews Genetics

TL;DR: A technical review of template preparation, sequencing and imaging, genome alignment and assembly approaches, and recent advances in current and near-term commercially available NGS instruments is presented.

Abstract: Demand has never been greater for revolutionary technologies that deliver fast, inexpensive and accurate genome information. This challenge has catalysed the development of next-generation sequencing (NGS) technologies. The inexpensive production of large volumes of sequence data is the primary advantage over conventional methods. Here, I present a technical review of template preparation, sequencing and imaging, genome alignment and assembly approaches, and recent advances in current and near-term commercially available NGS instruments. I also outline the broad range of applications for NGS technologies, in addition to providing guidelines for platform selection to address biological questions of interest.

7,023 citations

Journal Article

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project

Ewan Birney, John A. Stamatoyannopoulos¹, Anindya Dutta², Roderic Guigó³ and 317 more authors (44 institutions)

14 Jun 2007-Nature

TL;DR: Functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project are reported, providing convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts.

Abstract: We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.

5,091 citations
