Journal ISSN: 1340-2838

DNA Research 

University of Oxford
About: DNA Research is an academic journal published by University of Oxford. The journal publishes mainly in the area of Gene & Genome. Its ISSN is 1340-2838, and it is open access. Over its lifetime, the journal has published 1174 publications, which have received 70554 citations.


Papers
Journal Article
TL;DR: The sequence determination of the entire genome of Synechocystis sp. strain PCC6803 was completed.
Abstract: The sequence determination of the entire genome of the Synechocystis sp. strain PCC6803 was completed. The total length of the genome finally confirmed was 3,573,470 bp, including the previously reported sequence of 1,003,450 bp from map position 64% to 92% of the genome. The entire sequence was assembled from the sequences of the physical map-based contigs of cosmid clones and of lambda clones and long PCR products which were used for gap-filling. The accuracy of the sequence was guaranteed by analysis of both strands of DNA through the entire genome. The authenticity of the assembled sequence was supported by restriction analysis of long PCR products, which were directly amplified from the genomic DNA using the assembled sequence data. To predict the potential protein-coding regions, analysis of open reading frames (ORFs), analysis by the GeneMark program and similarity search to databases were performed. As a result, a total of 3,168 potential protein genes were assigned on the genome, in which 145 (4.6%) were identical to reported genes and 1,257 (39.6%) and 340 (10.8%) showed similarity to reported and hypothetical genes, respectively. The remaining 1,426 (45.0%) had no apparent similarity to any genes in databases. Among the potential protein genes assigned, 128 were related to the genes participating in photosynthetic reactions. The sum of the sequences coding for potential protein genes occupies 87% of the genome length. By adding rRNA and tRNA genes, therefore, the genome has a very compact arrangement of protein- and RNA-coding regions. A notable feature on the gene organization of the genome was that 99 ORFs, which showed similarity to transposase genes and could be classified into 6 groups, were found spread all over the genome, and at least 26 of them appeared to remain intact. The result implies that rearrangement of the genome occurred frequently during and after establishment of this species.

2,523 citations

Journal Article
TL;DR: A total of 2740 experimentally confirmed SSR markers for rice are made available, or approximately one SSR every 157 kb; AT-rich microsatellites had the longest average repeat tracts, while GC-rich motifs had the shortest.
Abstract: A total of 2414 new di-, tri- and tetra-nucleotide non-redundant SSR primer pairs, representing 2240 unique marker loci, have been developed and experimentally validated for rice (Oryza sativa L.). Duplicate primer pairs are reported for 7% (174) of the loci. The majority (92%) of primer pairs were developed in regions flanking perfect repeats ≥ 24 bp in length. Using electronic PCR (e-PCR) to align primer pairs against 3284 publicly sequenced rice BAC and PAC clones (representing about 83% of the total rice genome), 65% of the SSR markers hit a BAC or PAC clone containing at least one genetically mapped marker and could be mapped by proxy. Additional information based on genetic mapping and "nearest marker" information provided the basis for locating a total of 1825 (81%) of the newly designed markers along rice chromosomes. Fifty-six SSR markers (2.8%) hit BAC clones on two or more different chromosomes and appeared to be multiple copy. The largest proportion of SSRs in this data set correspond to poly(GA) motifs (36%), followed by poly(AT) (15%) and poly(CCG) (8%) motifs. AT-rich microsatellites had the longest average repeat tracts, while GC-rich motifs were the shortest. In combination with the pool of 500 previously mapped SSR markers, this release makes available a total of 2740 experimentally confirmed SSR markers for rice, or approximately one SSR every 157 kb.

1,493 citations

Journal Article
TL;DR: A complete set of cloned individual genes encoding Histidine-tagged proteins with or without GFP fused for functional genomic analysis of Escherichia coli K-12 strain should provide unique resources for systematic functional genomic approaches.
Abstract: Based on the genomic sequence data of Escherichia coli K-12 strain, we have constructed a complete set of cloned individual genes encoding Histidine-tagged proteins with or without GFP fused for functional genomic analysis. Each clone encodes a protein of predicted ORF attached by Histidines and seven spacer amino acids at the N-terminal end, and five spacer amino acids and GFP at the C-terminal end. SfiI restriction sites are generated at both the N- and C-terminal boundaries of ORF upon cloning, which enables easy transfer of ORF to other vector systems by cutting with SfiI. Expression of cloned ORF is under the control of an IPTG-inducible promoter, which is strictly repressed by lacI(q) repressor gene product. The set of cloned ORFs described here should provide unique resources for systematic functional genomic approaches including (i) construction of DNA microarray, (ii) production and purification of proteins, (iii) analysis of protein localization by monitoring GFP fluorescence and (iv) analysis of protein-protein interaction.

1,337 citations

Journal Article
TL;DR: The complete chromosome sequence of an O157:H7 strain isolated from the Sakai outbreak is reported and compared with a benign laboratory strain, K-12 MG1655; a 4.1-Mb sequence highly conserved between the two strains is identified, which may represent the fundamental backbone of the E. coli chromosome.
Abstract: Escherichia coli O157:H7 is a major food-borne infectious pathogen that causes diarrhea, hemorrhagic colitis, and hemolytic uremic syndrome. Here we report the complete chromosome sequence of an O157:H7 strain isolated from the Sakai outbreak, and the results of genomic comparison with a benign laboratory strain, K-12 MG1655. The chromosome is 5.5 Mb in size, 859 Kb larger than that of K-12. We identified a 4.1-Mb sequence highly conserved between the two strains, which may represent the fundamental backbone of the E. coli chromosome. The remaining 1.4-Mb sequence comprises O157:H7-specific sequences, most of which are horizontally transferred foreign DNAs. The predominant role of bacteriophages in the emergence of O157:H7 is evident from the presence of 24 prophages and prophage-like elements that occupy more than half of the O157:H7-specific sequences. The O157:H7 chromosome encodes 1632 proteins and 20 tRNAs that are not present in K-12. Among these, at least 131 proteins are assumed to have virulence-related functions. Genome-wide codon usage analysis suggested that the O157:H7-specific tRNAs are involved in the efficient expression of the strain-specific genes. The complete set of genes specific to O157:H7 presented here sheds new light on the pathogenicity and the physiology of O157:H7, and will open a way to fully understand the molecular mechanisms underlying O157:H7 infection.

1,265 citations

Journal Article
TL;DR: The entire sequences of 100 cDNA clones which were screened on the basis of the potentiality of coding for large proteins in vitro were determined and the expression profiles in a variety of tissues and chromosomal locations of the sequenced clones have been determined.
Abstract: In this series of projects of sequencing human cDNA clones which correspond to relatively long transcripts, we newly determined the entire sequences of 100 cDNA clones which were screened on the basis of the potentiality of coding for large proteins in vitro. The cDNA libraries used were the fractions with average insert sizes from 5.3 to 7.0 kb of the size-fractionated cDNA libraries from human brain. The randomly sampled clones were single-pass sequenced from both the ends to select clones that are not registered in the public database. Then their protein-coding potentialities were examined by an in vitro transcription/translation system, and the clones that generated proteins larger than 60 kDa were entirely sequenced. Each clone gave a distinct open reading frame (ORF), and the length of the ORF was roughly coincident with the approximate molecular mass of the in vitro product estimated from its mobility on SDS-polyacrylamide gel electrophoresis. The average size of the cDNA clones sequenced was 6.1 kb, and that of the ORFs corresponded to 1200 amino acid residues. By computer-assisted analysis of the sequences with DNA and protein-motif databases (GenBank and PROSITE databases), the functions of at least 73% of the gene products could be anticipated, and 88% of them (the products of 64 clones) were assigned to the functional categories of proteins relating to cell signaling/communication, nucleic acid managing, and cell structure/motility. The expression profiles in a variety of tissues and chromosomal locations of the sequenced clones have been determined. According to the expression spectra, approximately 11 genes appeared to be predominantly expressed in brain. Most of the remaining genes were categorized into one of the following classes: either the expression occurs in a limited number of tissues (31 genes) or the expression occurs ubiquitously in all but a few tissues (47 genes).

971 citations

Performance
Metrics
No. of papers from the Journal in previous years
Year    Papers
2023    18
2022    60
2021    30
2020    25
2019    41
2018    55