Author
Marco Pietrella
Other affiliations: Canadian Real Estate Association, Sant'Anna School of Advanced Studies
Bio: Marco Pietrella is an academic researcher from ENEA. The author has contributed to research on topics including Genome and Whole genome sequencing. The author has an h-index of 11 and has co-authored 22 publications receiving 5,076 citations. Previous affiliations of Marco Pietrella include Canadian Real Estate Association and Sant'Anna School of Advanced Studies.
Topics: Genome, Whole genome sequencing, Solanum, Silybum marianum, Biology
Papers
••
TL;DR: A high-quality genome sequence of domesticated tomato and a draft sequence of its closest wild relative, Solanum pimpinellifolium, are presented and compared to each other and to the potato genome.
Abstract: Tomato (Solanum lycopersicum) is a major crop plant and a model system for fruit development. Solanum is one of the largest angiosperm genera and includes annual and perennial plants from diverse habitats. Here we present a high-quality genome sequence of domesticated tomato, a draft sequence of its closest wild relative, Solanum pimpinellifolium, and compare them to each other and to the potato genome (Solanum tuberosum). The two tomato genomes show only 0.6% nucleotide divergence and signs of recent admixture, but show more than 8% divergence from potato, with nine large and several smaller inversions. In contrast to Arabidopsis, but similar to soybean, tomato and potato small RNAs map predominantly to gene-rich chromosomal regions, including gene promoters. The Solanum lineage has experienced two consecutive genome triplications: one that is ancient and shared with rosids, and a more recent one. These triplications set the stage for the neofunctionalization of genes controlling fruit characteristics, such as colour and fleshiness.
2,687 citations
••
Beijing Institute of Genomics1, Cayetano Heredia University2, Indian Council of Agricultural Research3, Russian Academy of Sciences4, University of Dundee5, Huazhong Agricultural University6, Hunan Agricultural University7, Imperial College London8, Polish Academy of Sciences9, International Potato Center10, J. Craig Venter Institute11, National University of La Plata12, Michigan State University13, James Hutton Institute14, Teagasc15, Plant & Food Research16, Aalborg University17, University of Wisconsin-Madison18, Virginia Tech19, Wageningen University and Research Centre20
TL;DR: The potato genome sequence, with 39,031 predicted protein-coding genes and evidence for at least two genome duplication events indicative of a palaeopolyploid origin, provides a platform for genetic improvement of this vital crop.
Abstract: Potato (Solanum tuberosum L.) is the world's most important non-grain food crop and is central to global food security. It is clonally propagated, highly heterozygous, autotetraploid, and suffers acute inbreeding depression. Here we use a homozygous doubled-monoploid potato clone to sequence and assemble 86% of the 844-megabase genome. We predict 39,031 protein-coding genes and present evidence for at least two genome duplication events indicative of a palaeopolyploid origin. As the first genome sequence of an asterid, the potato genome reveals 2,642 genes specific to this large angiosperm clade. We also sequenced a heterozygous diploid clone and show that gene presence/absence variants and other potentially deleterious mutations occur frequently and are a likely cause of inbreeding depression. Gene family expansion, tissue-specific expression and recruitment of genes to new pathways contributed to the evolution of tuber development. The potato genome sequence provides a platform for genetic improvement of this vital crop.
1,813 citations
••
State University of New York System1, Centre de coopération internationale en recherche agronomique pour le développement2, ENEA3, University of Ottawa4, French Alternative Energies and Atomic Energy Commission5, Bioversity International6, Nestlé7, Bielefeld University8, Institut national de la recherche agronomique9, Chongqing University of Science and Technology10, University of Illinois at Urbana–Champaign11, University of Barcelona12, University of Maryland, College Park13, University of Trieste14, Empresa Brasileira de Pesquisa Agropecuária15, Analytica16, University of Queensland17, Coffee production in India18, University of Arizona19, University of Évry Val d'Essonne20, Centre national de la recherche scientifique21
TL;DR: A high-quality draft genome of Coffea canephora (coffee) displays a conserved chromosomal gene order among asterid angiosperms, and comparative analyses of caffeine NMTs demonstrate that these genes expanded through sequential tandem duplications independently of genes from cacao and tea, suggesting that caffeine in eudicots is of polyphyletic origin.
Abstract: Coffee is a valuable beverage crop due to its characteristic flavor, aroma, and the stimulating effects of caffeine. We generated a high-quality draft genome of the species Coffea canephora, which displays a conserved chromosomal gene order among asterid angiosperms. Although it shows no sign of the whole-genome triplication identified in Solanaceae species such as tomato, the genome includes several species-specific gene family expansions, among them N-methyltransferases (NMTs) involved in caffeine production, defense-related genes, and alkaloid and flavonoid enzymes involved in secondary compound synthesis. Comparative analyses of caffeine NMTs demonstrate that these genes expanded through sequential tandem duplications independently of genes from cacao and tea, suggesting that caffeine in eudicots is of polyphyletic origin.
513 citations
••
TL;DR: Massively parallel sequencing of olive fruit cDNA libraries provided large-scale information about the structure and putative function of gene transcripts accumulated during fruit development, and comparative transcript profiling identified differentially expressed genes with potential relevance in regulating fruit metabolism and phenolic content during ripening.
Abstract: Despite its primary economic importance, genomic information on olive tree is still lacking. 454 pyrosequencing was used to enrich the very few sequence data currently available for the Olea europaea species and to identify genes involved in expression of fruit quality traits. Fruits of Coratina, a widely cultivated variety characterized by a very high phenolic content, and Tendellone, an oleuropein-lacking natural variant, were used as starting material for monitoring the transcriptome. Four different cDNA libraries were sequenced, respectively at the beginning and at the end of drupe development. A total of 261,485 reads were obtained, for an output of about 58 Mb. Raw sequence data were processed using a four step pipeline procedure and data were stored in a relational database with a web interface. Massively parallel sequencing of different fruit cDNA collections has provided large scale information about the structure and putative function of gene transcripts accumulated during fruit development. Comparative transcript profiling allowed the identification of differentially expressed genes with potential relevance in regulating the fruit metabolism and phenolic content during ripening.
246 citations
••
TL;DR: Using in bacterio, in vitro, and in planta functional assays, it is demonstrated that CCD2 is the dioxygenase catalyzing the first dedicated step in saffron crocetin biosynthesis starting from the carotenoid zeaxanthin.
Abstract: Crocus sativus stigmas are the source of the saffron spice and accumulate the apocarotenoids crocetin, crocins, picrocrocin, and safranal, responsible for its color, taste, and aroma. Through deep transcriptome sequencing, we identified a novel dioxygenase, carotenoid cleavage dioxygenase 2 (CCD2), expressed early during stigma development and closely related to, but distinct from, the CCD1 dioxygenase family. CCD2 is the only identified member of a novel CCD clade, presents the structural features of a bona fide CCD, and is able to cleave zeaxanthin, the presumed precursor of saffron apocarotenoids, both in Escherichia coli and in maize endosperm. The cleavage products, identified through high-resolution mass spectrometry and comigration with authentic standards, are crocetin dialdehyde and crocetin, respectively. In vitro assays show that CCD2 cleaves sequentially the 7,8 and 7′,8′ double bonds adjacent to a 3-OH-β-ionone ring and that the conversion of zeaxanthin to crocetin dialdehyde proceeds via the C30 intermediate 3-OH-β-apo-8′-carotenal. In contrast, zeaxanthin cleavage dioxygenase (ZCD), an enzyme previously claimed to mediate crocetin formation, did not cleave zeaxanthin or 3-OH-β-apo-8′-carotenal in the test systems used. Sequence comparison and structure prediction suggest that ZCD is an N-truncated CCD4 form, lacking one blade of the β-propeller structure conserved in all CCDs. These results constitute strong evidence that CCD2 catalyzes the first dedicated step in crocin biosynthesis. Similar to CCD1, CCD2 has a cytoplasmic localization, suggesting that it may cleave carotenoids localized in the chromoplast outer envelope.
224 citations
Cited by
28 Jul 2005
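TL;DR: Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1) interacts with single or multiple receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.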
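Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1), expressed on the surface of infected erythrocytes, interacts with single or multiple receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. The var gene family of each haploid genome encodes about 60 members, and switching transcription among different var gene variants provides the molecular basis for antigenic variation.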
18,940 citations
••
TL;DR: Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number of complete plant genomes.
Abstract: The number of sequenced plant genomes and associated genomic resources is growing rapidly with the advent of both an increased focus on plant genomics from funding agencies, and the application of inexpensive next generation sequencing. To interact with this increasing body of data, we have developed Phytozome (http://www.phytozome.net), a comparative hub for plant genome and gene family data and analysis. Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number (currently 25) of complete plant genomes, including all the land plants and selected algae sequenced at the Joint Genome Institute, as well as selected species sequenced elsewhere. Through a comprehensive plant genome database and web portal, these data and analyses are available to the broader plant science research community, providing powerful comparative genomics tools that help to link model systems with other plants of economic and ecological importance.
3,728 citations
••
TL;DR: A high-quality genome sequence of domesticated tomato and a draft sequence of its closest wild relative, Solanum pimpinellifolium, are presented and compared to each other and to the potato genome.
Abstract: Tomato (Solanum lycopersicum) is a major crop plant and a model system for fruit development. Solanum is one of the largest angiosperm genera and includes annual and perennial plants from diverse habitats. Here we present a high-quality genome sequence of domesticated tomato, a draft sequence of its closest wild relative, Solanum pimpinellifolium, and compare them to each other and to the potato genome (Solanum tuberosum). The two tomato genomes show only 0.6% nucleotide divergence and signs of recent admixture, but show more than 8% divergence from potato, with nine large and several smaller inversions. In contrast to Arabidopsis, but similar to soybean, tomato and potato small RNAs map predominantly to gene-rich chromosomal regions, including gene promoters. The Solanum lineage has experienced two consecutive genome triplications: one that is ancient and shared with rosids, and a more recent one. These triplications set the stage for the neofunctionalization of genes controlling fruit characteristics, such as colour and fleshiness.
2,687 citations
••
TL;DR: The sequencing and assembly of the oyster genome using short reads and a fosmid-pooling strategy are reported, along with transcriptomes of development and stress response and the proteome of the shell; the analyses show that shell formation in molluscs is more complex than currently understood and involves extensive participation of cells and their exosomes.
Abstract: The Pacific oyster Crassostrea gigas belongs to one of the most species-rich but genomically poorly explored phyla, the Mollusca. Here we report the sequencing and assembly of the oyster genome using short reads and a fosmid-pooling strategy, along with transcriptomes of development and stress response and the proteome of the shell. The oyster genome is highly polymorphic and rich in repetitive sequences, with some transposable elements still actively shaping variation. Transcriptome studies reveal an extensive set of genes responding to environmental stress. The expansion of genes coding for heat shock protein 70 and inhibitors of apoptosis is probably central to the oyster's adaptation to sessile life in the highly stressful intertidal zone. Our analyses also show that shell formation in molluscs is more complex than currently understood and involves extensive participation of cells and their exosomes. The oyster genome sequence fills a void in our understanding of the Lophotrochozoa.
1,806 citations
••
TL;DR: The computational problems surrounding repeats are discussed and strategies used by current bioinformatics systems to solve them are described.
Abstract: Repetitive DNA sequences are abundant in a broad range of species, from bacteria to mammals, and they cover nearly half of the human genome. Repeats have always presented technical challenges for sequence alignment and assembly programs. Next-generation sequencing projects, with their short read lengths and high data volumes, have made these challenges more difficult. From a computational perspective, repeats create ambiguities in alignment and assembly, which, in turn, can produce biases and errors when interpreting results. Simply ignoring repeats is not an option, as this creates problems of its own and may mean that important biological phenomena are missed. We discuss the computational problems surrounding repeats and describe strategies used by current bioinformatics systems to solve them.
1,451 citations