Author

Jan Gawor

Bio: Jan Gawor is an academic researcher from the Polish Academy of Sciences. The author has contributed to research in topics: Genome & Gene. The author has an h-index of 11 and has co-authored 35 publications receiving 1964 citations.
Topics: Genome, Gene, Prototheca, Medicine, Biology

Papers
Journal ArticleDOI
Xun Xu1, Shengkai Pan1, Shifeng Cheng1, Bo Zhang1, Mu D1, Peixiang Ni1, Gengyun Zhang1, Shuang Yang1, Ruiqiang Li1, Jun Wang1, Gisella Orjeda2, Frank Guzman2, Torres M2, Roberto Lozano2, Olga Ponce2, Diana Martinez2, De la Cruz G3, Chakrabarti Sk3, Patil Vu3, Konstantin G. Skryabin4, Boris B. Kuznetsov4, Nikolai V. Ravin4, Tatjana V. Kolganova4, Alexey V. Beletsky4, Andrey V. Mardanov4, Di Genova A5, Dan Bolser5, David M. A. Martin5, Li G, Yang Y, Hanhui Kuang6, Hu Q6, Xiong X7, Gerard J. Bishop8, Boris Sagredo, Nilo Mejía, Zagorski W9, Robert Gromadka9, Jan Gawor9, Pawel Szczesny9, Sanwen Huang, Zhang Z, Liang C, He J, Li Y, He Y, Xu J, Youjun Zhang, Xie B, Du Y, Qu D, Merideth Bonierbale10, Marc Ghislain10, Herrera Mdel R, Giovanni Giuliano, Marco Pietrella, Gaetano Perrotta, Paolo Facella, O'Brien K11, Sergio Enrique Feingold, Barreiro Le, Massa Ga, Luis Aníbal Diambra12, Brett R Whitty13, Brieanne Vaillancourt13, Lin H13, Alicia N. Massa13, Geoffroy M13, Lundback S13, Dean DellaPenna13, Buell Cr14, Sanjeev Kumar Sharma14, David Marshall14, Robbie Waugh14, Glenn J. Bryan14, Destefanis M15, Istvan Nagy15, Dan Milbourne15, Susan Thomson16, Mark Fiers16, Jeanne M. E. Jacobs16, Kåre Lehmann Nielsen17, Mads Sønderkær17, Marina Iovene18, Giovana Augusta Torres18, Jiming Jiang18, Richard E. Veilleux19, Christian W. B. Bachem20, de Boer J20, Theo Borm20, Bjorn Kloosterman20, van Eck H20, Erwin Datema20, Hekkert Bt20, Aska Goverse20, van Ham Rc20, Richard G. F. Visser20 
10 Jul 2011-Nature
TL;DR: The potato genome sequence, with 39,031 predicted protein-coding genes and evidence for at least two genome duplication events indicative of a palaeopolyploid origin, provides a platform for genetic improvement of this vital crop.
Abstract: Potato (Solanum tuberosum L.) is the world's most important non-grain food crop and is central to global food security. It is clonally propagated, highly heterozygous, autotetraploid, and suffers acute inbreeding depression. Here we use a homozygous doubled-monoploid potato clone to sequence and assemble 86% of the 844-megabase genome. We predict 39,031 protein-coding genes and present evidence for at least two genome duplication events indicative of a palaeopolyploid origin. As the first genome sequence of an asterid, the potato genome reveals 2,642 genes specific to this large angiosperm clade. We also sequenced a heterozygous diploid clone and show that gene presence/absence variants and other potentially deleterious mutations occur frequently and are a likely cause of inbreeding depression. Gene family expansion, tissue-specific expression and recruitment of genes to new pathways contributed to the evolution of tuber development. The potato genome sequence provides a platform for genetic improvement of this vital crop.

1,813 citations

Book ChapterDOI
TL;DR: Polyvalent bacteriophages of the genus Twort-like that infect clinically relevant Staphylococcus strains may be among the most promising phages with potential therapeutic applications, and functions of certain conserved genes can be predicted based on their homology to prototypical genes of model spounavirus SPO1.
Abstract: Polyvalent bacteriophages of the genus Twort-like that infect clinically relevant Staphylococcus strains may be among the most promising phages with potential therapeutic applications. They are obligatorily lytic, infect the majority of Staphylococcus strains in clinical strain collections, propagate efficiently and do not transfer foreign DNA by transduction. Comparative genomic analysis of 11 S. aureus/S. epidermidis Twort-like phages, as presented in this chapter, emphasizes their strikingly high similarity and clear divergence from phage Twort of the same genus, which might have evolved in hosts of a different species group. Genetically, these phages form a relatively isolated group, which minimizes the risk of acquiring potentially harmful genes. The order of genes in core parts of their 127 to 140-kb genomes is conserved and resembles that found in related representatives of the Spounavirinae subfamily of myoviruses. Functions of certain conserved genes can be predicted based on their homology to prototypical genes of model spounavirus SPO1. Deletions in the genomes of certain phages mark genes that are dispensable for phage development. Nearly half of the genes of these phages have no known homologues. Unique genes are mostly located near termini of the virion DNA molecule and are expressed early in phage development as implied by analysis of their potential transcriptional signals. Thus, many of them are likely to play a role in host takeover. Single genes encode homologues of bacterial virulence-associated proteins. They were apparently acquired by a common ancestor of these phages by horizontal gene transfer but presumably evolved towards gaining functions that increase phage infectivity for bacteria or facilitate mature phage release. Major differences between the genomes of S. aureus/S. epidermidis Twort-like phages consist of single nucleotide polymorphisms and insertions/deletions of short stretches of nucleotides, single genes, or introns of group I. Although the number and location of introns may vary between particular phages, intron shuffling is unlikely to be a major factor responsible for specificity differences.

106 citations

Journal ArticleDOI
TL;DR: It is suggested that seeding of glacial surfaces with Polaromonas cells transported by various means is of greater efficiency on local than global scales, and that interactions with other supraglacial microbiota, such as algal cells, may drive postselectional niche separation and microevolution within the Polaromonas genus.
Abstract: Polaromonas is one of the most abundant genera found on glacier surfaces, yet its ecology remains poorly described. Investigations made to date point towards a uniform distribution of Polaromonas phylotypes across the globe. We compared 43 Polaromonas isolates obtained from surfaces of Arctic and Antarctic glaciers to address this issue. 16S rRNA gene sequences, intergenic transcribed spacers (ITS) and metabolic fingerprinting showed great differences between hemispheres but also between neighboring glaciers. Phylogenetic distance between Arctic and Antarctic isolates indicated separate species. The Arctic group clustered similarly in dendrograms based on 16S rRNA gene and ITS sequences, as well as on metabolic traits. The Antarctic strains, although almost identical in their 16S rRNA genes, diverged into 2 groups based on the ITS sequences and metabolic traits, suggesting recent niche separation. Certain phenotypic traits pointed towards cell adaptation to specific conditions on a particular glacier, such as varying pH levels. The collected data suggest that seeding of glacial surfaces with Polaromonas cells transported by various means is of greater efficiency on local than global scales. Selection mechanisms present on glacial surfaces reduce the deposited Polaromonas diversity, causing subsequent adaptation to prevailing environmental conditions. Furthermore, interactions with other supraglacial microbiota, such as algal cells, may drive postselectional niche separation and microevolution within the Polaromonas genus.

40 citations

Journal ArticleDOI
TL;DR: Analysis of "natural" transconjugants showed that pSinA is functional (expresses arsenite oxidase) and is stably maintained in their cells after approximately 60 generations of growth under nonselective conditions, demonstrating that pSinA is a self-transferable, broad-host-range plasmid, which plays an important role in horizontal transfer of arsenic metabolism genes.

39 citations

Journal ArticleDOI
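TL;DR: For circumscription and differentiation of Prototheca spp., the use of phenotypic characters, and morphology in particular, is of limited value and should rather be auxiliary to molecular marker-based approaches.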
Abstract: The only algae which are able to inflict disease on humans and other mammals through active invasion and spread within the host tissues belong to either of two genera: Chlorella and Prototheca. Whereas Chlorella infections are extremely rare, with only two human cases reported in the literature, protothecosis is an emerging disease of humans and domestic animals, especially dairy cows. The genus Prototheca, erected by Kruger in 1894, has undergone several significant revisions, as more phenotypic, chemotaxonomic, and molecular data have become available. Due to this, a large number of Prototheca strains have been accumulated in public culture collections, over the years, where they still exist under outdated or invalid infraspecific or species names. In this study, the partial cytb gene was used as a marker to revise the taxonomy and nomenclature of a set of Prototheca strains, preserved in major algae culture repositories worldwide. Within the genus, two main lineages were observed, with a dominance of typically dairy cattle-associated (i.e. P. ciferrii, formerly P. zopfii gen. 1, the here validated P. blaschkeae, and one newly erected species, namely P. bovis, formerly P. zopfii gen. 2) and human-associated (i.e. P. wickerhamii, P. cutis, P. miyajii) species, respectively. In the former lineage, three newly described species were allocated, namely P. cookei sp. nov., P. cerasi sp. nov., and P. pringsheimii sp. nov., and the lecto- and epitypified P. zopfii species. The second, or so-called P. wickerhamii lineage, incorporated a newly proposed species of P. xanthoriae sp. nov. These protothecans were shown as the closest relatives of the photosynthetic genera, Chlorella and Auxenochlorella. The environmental species P. ulmea was synonymized with the lecto- and epitypified P. moriformis species. 
For circumscription and differentiation of Prototheca spp., the use of phenotypic characters, and morphology in particular, is of limited value and should rather be auxiliary to molecular marker-based approaches. As demonstrated in our previous study and corroborated in the present one, the cytb gene provides higher resolution than the conventional rDNA markers, and currently represents the most efficient barcode for the Prototheca algae.

32 citations


Cited by
Journal ArticleDOI
TL;DR: Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number of complete plant genomes.
Abstract: The number of sequenced plant genomes and associated genomic resources is growing rapidly with the advent of both an increased focus on plant genomics from funding agencies, and the application of inexpensive next generation sequencing. To interact with this increasing body of data, we have developed Phytozome (http://www.phytozome.net), a comparative hub for plant genome and gene family data and analysis. Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number (currently 25) of complete plant genomes, including all the land plants and selected algae sequenced at the Joint Genome Institute, as well as selected species sequenced elsewhere. Through a comprehensive plant genome database and web portal, these data and analyses are available to the broader plant science research community, providing powerful comparative genomics tools that help to link model systems with other plants of economic and ecological importance.

3,728 citations

Journal ArticleDOI
Shusei Sato, Satoshi Tabata, Hideki Hirakawa, Erika Asamizu, and 320 more authors (51 institutions)
31 May 2012-Nature
TL;DR: A high-quality genome sequence of domesticated tomato and a draft sequence of its closest wild relative, Solanum pimpinellifolium, are presented and compared to each other and to the potato genome.
Abstract: Tomato (Solanum lycopersicum) is a major crop plant and a model system for fruit development. Solanum is one of the largest angiosperm genera and includes annual and perennial plants from diverse habitats. Here we present a high-quality genome sequence of domesticated tomato, a draft sequence of its closest wild relative, Solanum pimpinellifolium, and compare them to each other and to the potato genome (Solanum tuberosum). The two tomato genomes show only 0.6% nucleotide divergence and signs of recent admixture, but show more than 8% divergence from potato, with nine large and several smaller inversions. In contrast to Arabidopsis, but similar to soybean, tomato and potato small RNAs map predominantly to gene-rich chromosomal regions, including gene promoters. The Solanum lineage has experienced two consecutive genome triplications: one that is ancient and shared with rosids, and a more recent one. These triplications set the stage for the neofunctionalization of genes controlling fruit characteristics, such as colour and fleshiness.

2,687 citations

01 Jan 2011
TL;DR: The sheer volume and scope of diverse genome-wide data pose a significant challenge to the development of efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data.
Abstract: Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools.

2,187 citations

Journal ArticleDOI
04 Oct 2012-Nature
TL;DR: The sequencing and assembly of the oyster genome using short reads and a fosmid-pooling strategy and transcriptomes of development and stress response and the proteome of the shell are reported, showing that shell formation in molluscs is more complex than currently understood and involves extensive participation of cells and their exosomes.
Abstract: The Pacific oyster Crassostrea gigas belongs to one of the most species-rich but genomically poorly explored phyla, the Mollusca. Here we report the sequencing and assembly of the oyster genome using short reads and a fosmid-pooling strategy, along with transcriptomes of development and stress response and the proteome of the shell. The oyster genome is highly polymorphic and rich in repetitive sequences, with some transposable elements still actively shaping variation. Transcriptome studies reveal an extensive set of genes responding to environmental stress. The expansion of genes coding for heat shock protein 70 and inhibitors of apoptosis is probably central to the oyster's adaptation to sessile life in the highly stressful intertidal zone. Our analyses also show that shell formation in molluscs is more complex than currently understood and involves extensive participation of cells and their exosomes. The oyster genome sequence fills a void in our understanding of the Lophotrochozoa.

1,806 citations

Journal ArticleDOI
TL;DR: The computational problems surrounding repeats are discussed and strategies used by current bioinformatics systems to solve them are described.
Abstract: Repetitive DNA sequences are abundant in a broad range of species, from bacteria to mammals, and they cover nearly half of the human genome. Repeats have always presented technical challenges for sequence alignment and assembly programs. Next-generation sequencing projects, with their short read lengths and high data volumes, have made these challenges more difficult. From a computational perspective, repeats create ambiguities in alignment and assembly, which, in turn, can produce biases and errors when interpreting results. Simply ignoring repeats is not an option, as this creates problems of its own and may mean that important biological phenomena are missed. We discuss the computational problems surrounding repeats and describe strategies used by current bioinformatics systems to solve them.

1,451 citations