Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803. II. Sequence determination of the entire genome and assignment of potential protein-coding regions.
Takakazu Kaneko, Shusei Sato, Hirokazu Kotani, Ayako Tanaka, Erika Asamizu, Yasukazu Nakamura, Nobuyuki Miyajima, Makoto Hirosawa, Masahiro Sugiura, Shigemi Sasamoto, Takaharu Kimura, Tsutomu Hosouchi, Ai Matsuno, Akiko Muraki, Naomi Nakazaki, Kaoru Naruo, Satomi Okumura, Sayaka Shimpo, Chie Takeuchi, Tsuyuko Wada, Akiko Watanabe, Manabu Yamada, Miho Yasuda, Satoshi Tabata
TLDR
The sequence determination of the entire genome of the Synechocystis sp.
Abstract:
The sequence determination of the entire genome of the Synechocystis sp. strain PCC6803 was completed. The total length of the genome finally confirmed was 3,573,470 bp, including the previously reported sequence of 1,003,450 bp from map position 64% to 92% of the genome. The entire sequence was assembled from the sequences of the physical map-based contigs of cosmid clones and of lambda clones and long PCR products which were used for gap-filling. The accuracy of the sequence was guaranteed by analysis of both strands of DNA through the entire genome. The authenticity of the assembled sequence was supported by restriction analysis of long PCR products, which were directly amplified from the genomic DNA using the assembled sequence data. To predict the potential protein-coding regions, analysis of open reading frames (ORFs), analysis by the GeneMark program and similarity search to databases were performed. As a result, a total of 3,168 potential protein genes were assigned on the genome, of which 145 (4.6%) were identical to reported genes and 1,257 (39.6%) and 340 (10.8%) showed similarity to reported and hypothetical genes, respectively. The remaining 1,426 (45.0%) had no apparent similarity to any genes in databases. Among the potential protein genes assigned, 128 were related to the genes participating in photosynthetic reactions. The sum of the sequences coding for potential protein genes occupies 87% of the genome length. By adding rRNA and tRNA genes, therefore, the genome has a very compact arrangement of protein- and RNA-coding regions. A notable feature of the gene organization of the genome was that 99 ORFs, which showed similarity to transposase genes and could be classified into 6 groups, were found spread all over the genome, and at least 26 of them appeared to remain intact. The result implies that rearrangement of the genome occurred frequently during and after establishment of this species.
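The ORF-analysis step named in the abstract can be illustrated with a minimal sketch. The abstract gives no parameters for that step, so everything below is an assumption made for illustration: ATG as the only start codon, the three standard stop codons, a hypothetical `min_codons` cutoff, and the function names themselves. It scans all six reading frames of a sequence and is not the authors' pipeline.

```python
# Illustrative six-frame ORF scan (assumed rules, not the paper's settings).
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_codons):
    """Return (start, end) pairs, 0-based half-open, for ATG..stop ORFs.

    Coordinates refer to the strand that was passed in. The codon count
    includes both the start codon and the stop codon.
    """
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens the ORF
            elif start is not None and codon in STOPS:
                if (i + 3 - start) // 3 >= min_codons:
                    found.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return found


def find_orfs(seq, min_codons=3):
    """Scan both strands; report (strand, start, end) in forward coordinates."""
    seq = seq.upper()
    n = len(seq)
    result = [("+", s, e) for s, e in orfs_in_strand(seq, min_codons)]
    for s, e in orfs_in_strand(reverse_complement(seq), min_codons):
        # Map reverse-strand coordinates back onto the forward strand.
        result.append(("-", n - e, n - s))
    return sorted(result, key=lambda r: (r[1], r[0]))


if __name__ == "__main__":
    print(find_orfs("CCATGAAATAGGG"))
```

A real annotation run would use a far larger minimum length and would combine such candidates with the gene-prediction and similarity-search evidence the abstract describes; the sketch only shows the frame-by-frame scan.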
Citations
Journal Article
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, David J. Lipman
TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Journal Article
The Complete Genome Sequence of Escherichia coli K-12
Frederick R. Blattner, Guy Plunkett, Craig A. Bloch, Nicole T. Perna, Valerie Burland, Monica Riley, Julio Collado-Vides, Jeremy D. Glasner, Christopher K. Rode, George F. Mayhew, Jason Gregor, Nelson Wayne Davis, Heather A. Kirkpatrick, Michael A. Goeden, Debra J. Rose, Bob Mau, Ying Shao
TL;DR: The 4,639,221-base pair sequence of Escherichia coli K-12 is presented and reveals ubiquitous as well as narrowly distributed gene families; many families of similar genes within E. coli are also evident.
Journal Article
Complete genome sequence of Pseudomonas aeruginosa PAO1, an opportunistic pathogen.
Charles K. Stover, X. Q. Pham, A. L. Erwin, S. D. Mizoguchi, Paul Warrener, Mark J. Hickey, Fiona S. L. Brinkman, W. O. Hufnagle, D. J. Kowalik, M. J. Lagrou, R. L. Garber, L. Goltry, E. Tolentino, S. Westbrock-Wadman, Ying Yuan, L. L. Brody, S. N. Coulter, K. R. Folger, Arnold Kas, K. Larbig, R. Lim, Kelly D. Smith, David H. Spencer, Gane Ka-Shu Wong, Z. Wu, Ian T. Paulsen, Jonathan Reizer, Milton H. Saier, Robert E. W. Hancock, Stephen Lory, Maynard V. Olson
TL;DR: It is proposed that the size and complexity of the P. aeruginosa genome reflect an evolutionary adaptation permitting it to thrive in diverse environments and resist the effects of a variety of antimicrobial substances.
Journal Article
The complete genome sequence of the Gram-positive bacterium Bacillus subtilis
F. Kunst, Naotake Ogasawara, Ivan Moszer, Alessandra M. Albertini, G. Alloni, Vasco Azevedo, M. G. Bertero, Philippe Bessières, A. P. Bolotin, S. Borchert, Rainer Borriss, L. Boursier, Alain Brans, M. Braun, S. C. Brignell, Sierd Bron, S. Brouillet, Carlo V. Bruschi, B. Caldwell, V. Capuano, Noel Carter, Soo Keun Choi, J.-J. Codani, Ian F. Connerton, Nicola J. Cummings, Richard A. Daniel, François Denizot, Kevin M. Devine, A. Düsterhöft, Stanislav Dusko Ehrlich, P. T. Emmerson, K. D. Entian, Jeff Errington, C. Fabret, Eugenio Ferrari, D. Foulger, C. Fritz, Masaya Fujita, Yasutaro Fujita, S. Fuma, Alessandro Galizzi, Nathalie Galleron, Sa Youl Ghim, Philippe Glaser, André Goffeau, E. J. Golightly, Guido Grandi, G. Guiseppi, B. J. Guy, Kazuko Haga, Jacques Haiech, Colin R. Harwood, Alain Hénaut, H. Hilbert, Siger Holsappel, S. Hosono, Marie-Françoise Hullo, Mitsuhiro Itaya, Louis M. Jones, Bernard Joris, Dimitri Karamata, Y. Kasahara, M. Klaerr-Blanchard, Carsten Klein, Y. Kobayashi, P. Koetter, G. Koningstein, Susanne Krogh, Miyuki Kumano, Kanako Kurita, Alla Lapidus, S. Lardinois, J. Lauber, Vladimir Lazarevic, Simon Ming-Yuen Lee, Alain Levine, H. Liu, S. Masuda, Catherine Mauël, Claudine Médigue, N. Medina, Rafael P. Mellado, Motoki Mizuno, D. Moestl, S. Nakai, Michiel A. Noback, David Noone, Mary O'Reilly, K. Ogawa, A. Ogiwara, B. Oudega, S.-H. Park, Victor Parro, Thomas Pohl, Daniel Portetelle, Steffen Porwollik, A. M. Prescott, E. Presecan, Petar Pujic, Bénédicte Purnelle, Georges Rapoport, M. Rey, Stacey Reynolds, Michael A. Rieger, Carlo Rivolta, Eduardo P. C. Rocha, B. Roche, Matthias Rose, Yoshito Sadaie, Toshitada Sato, E. Scanlan, S. Schleich, R. Schroeter, F. Scoffone, Junichi Sekiguchi, Agnieszka Sekowska, Simone J. Séror, Pascale Serror, B.-S. Shin, Blazenka Soldo, Alexei Sorokin, E. Tacconi, T. Takagi, Hideyuki Takahashi, Ken-Ichi Takemaru, Michio Takeuchi, A. Tamakoshi, Tetsu Tanaka, Peter Terpstra, Angelo Tognoni, Valentina Tosato, Shigeki Uchiyama, Micheline Vandenbol, Françoise Vannier, A. Vassarotti, Alain Viari, R. Wambutt, E. Wedler, H. Wedler, T. Weitzenegger, P. Winters, Anil Wipat, Hiroki Yamamoto, Kunio Yamane, K. Yasumoto, Katsunori Yata, K. Yoshida, Hisashi Yoshikawa, Emmanuelle Zumstein, Hiroshi Yoshikawa, Antoine Danchin
TL;DR: Bacillus subtilis is the best-characterized member of the Gram-positive bacteria. Its genome contains at least ten prophages or remnants of prophages, indicating that bacteriophage infection has played an important evolutionary role in horizontal gene transfer, in particular in the propagation of bacterial pathogenesis.
Journal Article
A genomic perspective on protein families
TL;DR: Comparison of proteins encoded in seven complete genomes from five major phylogenetic lineages and elucidation of consistent patterns of sequence similarities allowed the delineation of 720 clusters of orthologous groups (COGs), which comprise a framework for functional and evolutionary genome analysis.
References
Journal Article
Whole-genome random sequencing and assembly of Haemophilus influenzae Rd.
R. D. Fleischmann, M. D. Adams, Owen White, Rebecca A. Clayton, Ewen F. Kirkness, Anthony R. Kerlavage, Carol J. Bult, J. F. Tomb, Brian Dougherty, J. M. Merrick
TL;DR: An approach for genome analysis based on sequencing and assembly of unselected pieces of DNA from the whole chromosome has been applied to obtain the complete nucleotide sequence of the genome from the bacterium Haemophilus influenzae Rd.
Journal Article
The minimal gene complement of Mycoplasma genitalium
Claire M. Fraser, Jeannine D. Gocayne, Owen White, Mark Raymond Adams, Rebecca A. Clayton, Robert D. Fleischmann, Carol J. Bult, Anthony R. Kerlavage, Granger G. Sutton, Jenny M. Kelley, Janice L. Fritchman, Janice Weidman, Keith V. Small, Mina Sandusky, Joyce Fuhrmann, David Nguyen, Teresa Utterback, D. Saudek, Cheryl Phillips, Joseph M. Merrick, J. F. Tomb, Brian Dougherty, Kenneth F. Bott, Ping Chuan Hu, Thomas Lucier, Scott N. Peterson, Hamilton O. Smith, Clyde A. Hutchison, J. Craig Venter
TL;DR: Comparison of the Mycoplasma genitalium genome to that of Haemophilus influenzae suggests that differences in genome content are reflected as profound differences in physiology and metabolic capacity between these two organisms.
Journal Article
GENMARK: Parallel gene recognition for both DNA strands
TL;DR: The method combines specific Markov models of coding and non-coding regions with a Bayesian decision function, and it generalizes easily to higher-order Markov chain models.
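The idea summarized above, scoring a sequence under a coding model and a non-coding model and then applying Bayes' rule, can be sketched in a few lines. This is a deliberately simplified toy, not the GENMARK program: it assumes first-order homogeneous Markov chains, a uniform distribution for the first base, add-one pseudocounts, and a hypothetical `prior` for the coding class. All function names are invented for the example.

```python
import math

BASES = "ACGT"


def train_markov(seqs, pseudo=1.0):
    """Estimate first-order transition probabilities P(next | current)."""
    counts = {a: {b: pseudo for b in BASES} for a in BASES}
    for s in seqs:
        for a, b in zip(s, s[1:]):
            counts[a][b] += 1
    return {
        a: {b: counts[a][b] / sum(counts[a].values()) for b in BASES}
        for a in BASES
    }


def log_likelihood(seq, model):
    """Log-probability of seq under the chain, first base taken as uniform."""
    ll = math.log(0.25)
    for a, b in zip(seq, seq[1:]):
        ll += math.log(model[a][b])
    return ll


def posterior_coding(seq, coding, noncoding, prior=0.5):
    """Bayes' rule: P(coding | seq) from the two model likelihoods."""
    lc = log_likelihood(seq, coding) + math.log(prior)
    ln = log_likelihood(seq, noncoding) + math.log(1.0 - prior)
    m = max(lc, ln)  # subtract the maximum to avoid underflow
    ec, en = math.exp(lc - m), math.exp(ln - m)
    return ec / (ec + en)


if __name__ == "__main__":
    # Toy training data, chosen only to make the two models differ.
    coding = train_markov(["GCGCGCGCGC"])
    noncoding = train_markov(["ATATATATAT"])
    print(posterior_coding("GCGCGC", coding, noncoding))
    print(posterior_coding("ATATAT", coding, noncoding))
```

Moving to a higher-order chain would mean conditioning each base on the preceding k bases rather than one, which changes only the shape of the count table.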
Journal Article
Sequence Analysis of the Genome of the Unicellular Cyanobacterium Synechocystis Sp. Strain PCC6803. I. Sequence Features in the 1 Mb Region From Map Positions 64% to 92% of the Genome
Takakazu Kaneko, Ayako Tanaka, Shusei Sato, Hirokazu Kotani, Takashi Sazuka, Nobuyuki Miyajima, Masahiro Sugiura, Satoshi Tabata
TL;DR: The contiguous sequence of 1,003,450 bp spanning map positions 64% to 92% of the genome of Synechocystis sp.