Showing papers in "Nucleic Acids Research in 1993"


Journal ArticleDOI
TL;DR: A new approach for generating null alleles of a yeast gene is described, in which a single PCR step amplifies the HIS3 selectable marker with oligonucleotides that carry short flanking sequences of the target ORF.
Abstract: The systematic sequencing of the yeast S.cerevisiae genome has revealed a profusion of Open Reading Frames (ORFs). Although some of them have been previously studied, a large majority represents new genes (1, 2). Deletion of these ORFs is a convenient tool for their functional analysis. Here we describe a new approach for generating null alleles of a gene. Usually, a gene inactivation requires the in vitro creation of a construction in which a selectable marker is sandwiched by the 5' and the 3' flanking sequences of the target ORF. Classically, this strategy requires several cloning steps. In contrast, our approach generates such a construction by one step PCR amplification. Each oligodeoxynucleotide used contains two distinct regions: the first, which allows homologous recombination at the target locus, will be named the deleting sequence; the second permits the PCR amplification of a selectable marker. The deleting sequences, which are respectively the 5' (oligopro) and 3' (oligoterm) flanking sequences of the ORF, range from 35 to 51 nucleotides in length and are followed by a short stretch of 17 nucleotides homologous to the HIS3 selectable marker. Table 1 shows the composition of the deleting sequences, the sequence used for HIS3 amplification always being the same (5'-TCGTTCAGAATGACACG-3' for oligoterm and 5'-CTCTTGGCCTCCTCTAG-3' for oligopro). Following the PCR amplification, the crude mix was directly used to transform yeast by standard procedures (3). In a first set of experiments, we used W303-1B (MATa, ura3-1, trp1-1, ade2-1, leu2-3,112, his3-11,15) as recipient strain. All the His+ transformants tested (a total of 30) presented the same pattern when analysed by Southern blot, corresponding to the insertion of the wild-type HIS3 gene at its own locus (data not shown). We thus used a recipient strain carrying a complete deletion of the HIS3 gene.
With the diploid strain BMA 1 (a diploid from cross FY 1679-18B and FY 1679-28C, kindly provided by B.Dujon) containing the his3Δ200 allele (4), we routinely obtained more than 10 His+ transformants per plate. As the procedure appeared to be efficient enough, we tested by

1,357 citations
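
The primer composition described in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_deletion_oligo` and the example flank are hypothetical, and the sketch assumes each deleting sequence is already written 5' to 3' on the strand the oligonucleotide should carry. Only the two 17-nt HIS3 priming sequences and the 35 to 51 nt length range come from the abstract.

```python
# 17-nt HIS3 priming sequences quoted in the abstract (3' end of each oligo).
HIS3_PRIMING = {
    "oligopro": "CTCTTGGCCTCCTCTAG",
    "oligoterm": "TCGTTCAGAATGACACG",
}


def build_deletion_oligo(deleting_seq: str, kind: str) -> str:
    """Return the deleting sequence (5') followed by the HIS3 priming sequence (3').

    Hypothetical helper: it assumes `deleting_seq` is already given 5'->3'
    on the strand the oligonucleotide should carry.
    """
    deleting_seq = deleting_seq.upper()
    if set(deleting_seq) - set("ACGT"):
        raise ValueError("deleting sequence must contain only A, C, G, T")
    if not 35 <= len(deleting_seq) <= 51:
        raise ValueError("the abstract reports deleting sequences of 35 to 51 nt")
    return deleting_seq + HIS3_PRIMING[kind]


# Made-up 40-nt flank, for illustration only (not a real yeast sequence).
example_flank = "ATGC" * 10
oligo = build_deletion_oligo(example_flank, "oligopro")
print(len(oligo), oligo)
```

With a 40-nt flank the resulting oligonucleotide is 57 nt long: the flank drives recombination at the target locus and the last 17 nt prime amplification of the marker.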


Journal ArticleDOI
TL;DR: Nick-translation PCR was performed with fluorogenic probes, each of which generated fluorescence from its indicator dye only when the sequence between the indicator and quencher dyes was perfectly complementary to target.
Abstract: Nick-translation PCR was performed with fluorogenic probes. Two probes were used: one complementary to a sequence containing the F508 codon of the normal human cystic fibrosis (CF) gene (wt DNA) and one complementary to a sequence containing the delta F508 three base pair deletion (mut DNA). Each probe contained a unique and spectrally resolvable fluorescent indicator dye at the 5' end and a common quencher dye attached to the seventh nucleotide from the 5' end. The F508/delta F508 site was located between the indicator and quencher. The probes were added at the start of a PCR containing mut DNA, wt DNA or heterozygous DNA and were degraded during thermal cycling. Although both probes were degraded, each probe generated fluorescence from its indicator dye only when the sequence between the indicator and quencher dyes was perfectly complementary to target. The identity of the target DNA could be determined from the post-PCR fluorescence emission spectrum.

957 citations


Journal ArticleDOI
TL;DR: It is shown that the number of anchored oligo-dT primers can be reduced from twelve to four that are degenerate at the penultimate base from the 3' end, which enables further streamlining of the technique and makes it readily applicable to a broad spectrum of biological systems.
Abstract: Differential display has been developed as a tool to detect and characterize altered gene expression in eukaryotic cells. The basic principle is to systematically amplify messenger RNAs and then distribute their 3' termini on a denaturing polyacrylamide gel. Here we provide methodological details and examine in depth the specificity, sensitivity and reproducibility of the method. We show that the number of anchored oligo-dT primers can be reduced from twelve to four that are degenerate at the penultimate base from the 3' end. We also demonstrate that using optimized conditions described here, multiple RNA samples from related cells can be displayed simultaneously. Therefore process-specific rather than cell-specific genes could be more accurately identified. These results enable further streamlining of the technique and make it readily applicable to a broad spectrum of biological systems.

924 citations
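
The reduction from twelve anchored primers to four can be illustrated with a short enumeration. This is a sketch under assumptions that the abstract does not state: the oligo-dT run is taken to be 12 T's, the penultimate base is taken to be A, C or G (T is excluded because it would only extend the dT run), and the IUPAC code V is used to denote that three-base mix.

```python
from collections import defaultdict
from itertools import product

OLIGO_DT = "T" * 12  # assumed length; not given in the abstract

# Twelve fully specified anchors: penultimate base in {A, C, G},
# 3'-terminal base in {A, C, G, T}.
twelve = [OLIGO_DT + m + n for m, n in product("ACG", "ACGT")]

# Pool the primers that share a 3'-terminal base; within each pool the
# penultimate base is degenerate.
pools = defaultdict(list)
for primer in twelve:
    pools[primer[-1]].append(primer)

# One degenerate primer per pool, written with IUPAC V = A, C or G.
four = [OLIGO_DT + "V" + last for last in sorted(pools)]

print(len(twelve), len(four))
for primer in four:
    print(primer, "covers", pools[primer[-1]])
```

Each of the four degenerate primers stands in for three of the original twelve, so coverage of the 3' termini is unchanged while the number of reverse transcription reactions drops to a third.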


Journal ArticleDOI
TL;DR: It is established that expression of the zebrafish gene (krx-20) first appears at 100% epiboly as a single anterior domain of the prospective neuroepithelium, followed very soon after by a second more posterior domain.
Abstract: To begin to examine the function of genes that control early development in the hindbrain, we have screened an embryonic zebrafish cDNA library with a murine krox-20 gene probe that contained the conserved zinc finger regions. We have isolated two overlapping cDNAs, zf187 and zf201 which are homologues of the murine krox-20 gene. The N-terminal of the longest cDNA (zf201) contains two acidic regions identical to those of the murine krox-20. This indicates that the functional organisation of these proteins is probably conserved. Northern Blot analysis identified a single transcript of 2.0 kb. Wholemount in situ hybridisation established that expression of the zebrafish gene (krx-20) first appears at 100% epiboly as a single anterior domain of the prospective neuroepithelium, followed very soon after by a second more posterior domain. The alternating pattern of expression of this gene in rhombomeres(r) r3 and r5 is apparent by 12 hr post-fertilisation, that is prior to the morphological appearance of the rhombomeres. Around 14 hr neural crest migration begins from the dorsal surface of r5, moving caudally into r6 and then ventrally towards the pharyngeal arches. Crest migration is not apparent at or after 16 hr. No neural crest migration was observed from r3. Expression of krx-20 is down regulated firstly in r3 around 26 hr and later in r5 around 30 hr.

725 citations


Journal ArticleDOI
TL;DR: Analysis of sequence motifs found in metazoan protein factors involved in constitutive pre-mRNA splicing and in alternative splicing regulation indicates that the RRM is an ancient conserved region (ACR) that has diversified by duplication of genes and intragenic domains.
Abstract: We present a systematic analysis of sequence motifs found in metazoan protein factors involved in constitutive pre-mRNA splicing and in alternative splicing regulation. Using profile analysis we constructed a database enriched in protein sequences containing one or more presumptive copies of the RNA-recognition motif (RRM). We provide an accurate alignment of RRMs and structure-based criteria for identifying new RRMs, including many that lack the prototype RNP-1 submotif. We present a comprehensive table of 125 sequences containing 252 RRMs, including 22 previously unreported RRMs in 17 proteins. The presence of a putative RRM in these proteins, which are implicated in a variety of cellular processes, strongly suggests that their function involves binding to RNA. Unreported homologies in the RRM-enriched database to the metazoan SR family of splicing factors are described for an Arg-rich human nuclear protein and two yeast proteins (S. pombe mei2 and S. cerevisiae Npl3). We have rigorously tested the phylogenetic relationships of a large sample of RRMs. This analysis indicates that the RRM is an ancient conserved region (ACR) that has diversified by duplication of genes and intragenic domains. Statistical analyses and classification of repeated Arg-Ser (RS) and RGG domains in various protein splicing factors are presented.

663 citations


Journal ArticleDOI
TL;DR: This is an update of an earlier compilation and alignment of DNA polymerase sequences (Ito and Braithwaite, 1991) that attempted to compile complete sequences, to facilitate the identification of conserved and variable regions of the DNA polymerases.
Abstract: This is an update of an earlier compilation and alignment of DNA polymerase sequences (Ito and Braithwaite, 1991). As in the previous compilation, we attempted to compile complete sequences, to facilitate the identification of conserved and variable regions of the DNA polymerases (1). This update includes, for the first time, three DNA polymerase sequences from Archaea (2); two new members of the Family A DNA polymerases; and 19 new members of the Family B DNA polymerases. In addition, we included nucleases that have related amino acid sequences to E.coli DNA polymerase I, and the sequence of E.coli DNA polymerase III (ε-subunit) was aligned to Family C due to its homology to Bacillus subtilis DNA polymerase III. As in the previous compilation (1), Family A DNA polymerases are named for their homology to the product of the polA gene specifying E.coli DNA polymerase I; Family B DNA polymerases are named for their homology to the product of the polB gene encoding E.coli DNA polymerase II; and Family C DNA polymerases are named for their homology to the product of the polC gene encoding E.coli DNA polymerase III alpha subunit. Table 1 summarizes the molecular weights and isoelectric points of each DNA polymerase and nuclease. Table 1 also serves as a reference guide to the sequences shown in Figures 1A, 1B, and 1C. Since no new sequences were published for the Family X DNA polymerases (β-like), we have excluded them from this compilation.

630 citations



Journal ArticleDOI
TL;DR: The MUST package is a phylogenetically oriented set of programs for data management and display, allowing one to handle both raw data (sequences) and results (trees, number of steps, bootstrap proportions).
Abstract: The MUST package is a phylogenetically oriented set of programs for data management and display, allowing one to handle both raw data (sequences) and results (trees, number of steps, bootstrap proportions). It is complementary to the main available software for phylogenetic analysis (PHYLIP, PAUP, HENNIG86, CLUSTAL) with which it is fully compatible. The first part of MUST consists of the acquisition of new sequences, their storage, modification, and checking of sequence integrity in files of aligned sequences. In order to improve alignment, an editor function for aligned sequences offers numerous options, such as selection of subsets of sequences, display of consensus sequences, and search for similarities over small sequence fragments. For phylogenetic reconstruction, the choice of species and portions of sequences to be analyzed is easy and very rapid, permitting fast testing of numerous combinations of sequences and taxa. The resulting files can be formatted for most programs of tree construction. An interactive tree-display program recovers the output of all these programs. Finally, various modules allow an in-depth analysis of results, such as comparison of distance matrices, variation of bootstrap proportions with respect to various parameters or comparison of the number of steps per position. All presently available complete sequences of 28S rRNA are furnished aligned in the package. MUST therefore allows the management of all the operations required for phylogenetic reconstructions.

618 citations


Journal ArticleDOI
TL;DR: By applying the method to regenerating mouse liver, the usefulness and power of the RNA display technique is confirmed, which is named differential display reverse transcription PCR (DDRT-PCR), and the range of its application is extended.
Abstract: We have significantly improved a method originally developed by Liang and Pardee [Science 257 (1992) 967-971] to display a broad spectrum of expressed genes and to detect differences in expression between different cell types. We have analysed various aspects of the technique and have modified it both for fast and efficient identification of genes and for use with automatic analysis systems. Based on the mathematical background we have devised the appropriate number of optimal PCR primers. We have also introduced nondenaturing gels for separating double stranded fragments as single bands. By applying the method to regenerating mouse liver, we have identified, out of a total of 38,000 bands, about 70 fragments where the expression of the corresponding genes seems to be differentially regulated at different time points. The method was also applied successfully on an automatic DNA sequencer. Thus, we have confirmed the usefulness and increased the power of the RNA display technique, which we named differential display reverse transcription PCR (DDRT-PCR), and have extended the range of its application.

581 citations


Journal ArticleDOI
TL;DR: The abundance of different simple sequence motifs in plants was assessed through database searches of DNA sequences and quantitative hybridization with synthetic dinucleotide repeats, and the GT/CA motif, the most abundant dinucleotide repeat in mammals, was found to be considerably less frequent in plants.
Abstract: The abundance of different simple sequence motifs in plants was assessed through database searches of DNA sequences and quantitative hybridization with synthetic dinucleotide repeats. Database searches indicated that microsatellites are five times less abundant in the genomes of plants than in mammals. The most common plant repeat motif was AA/TT followed by AT/TA and CT/GA. This group comprised about 75% of all microsatellites with a length of more than 6 repeats. The GT/CA motif, the most abundant dinucleotide repeat in mammals, was found to be considerably less frequent in plants. To address the question of whether plant simple repeat sequences are variable as in mammals, (GT)n and (CT)n microsatellites were isolated from B.napus. Five loci were investigated by PCR analysis, and amplified products were obtained for all microsatellites from B.oleracea, B.napus and B.rapa DNA, but only for one primer pair from B.nigra. Polymorphism was detected for all microsatellites.

565 citations
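
A database scan for dinucleotide microsatellites of the kind summarised above could look roughly like the following. This is a minimal sketch, not the authors' procedure: the function name and the toy sequence are hypothetical, and the only detail taken from the abstract is the "more than 6 repeats" cutoff, implemented here as seven or more tandem copies of a two-base unit (homopolymer units such as AA are included, since the abstract lists AA/TT as a motif).

```python
import re

# A two-base unit followed by at least six further copies of itself,
# i.e. seven or more tandem repeats in total.
DINUC_REPEAT = re.compile(r"([ACGT]{2})\1{6,}")


def find_dinucleotide_repeats(seq):
    """Yield (start, unit, repeat_count) for each run of more than 6 repeats."""
    for match in DINUC_REPEAT.finditer(seq.upper()):
        unit = match.group(1)
        yield match.start(), unit, len(match.group(0)) // 2


# Toy sequence: a (GT)8 run that qualifies and a (CT)6 run that does not.
toy = "AACC" + "GT" * 8 + "CCA" + "CT" * 6 + "G"
print(list(find_dinucleotide_repeats(toy)))
```

The sketch reports each run in the phase in which it is first encountered; grouping a unit with its complement (for example GT with CA) would be a further step that is left out here.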


Journal ArticleDOI
TL;DR: The sequence of hnRNP K contains a 45 amino acid repeated motif which is almost completely conserved between the X.laevis and human proteins and shows significant homology to several proteins, some of which are known nucleic acid binding proteins.
Abstract: The K protein is among the major pre-mRNA-binding proteins (hnRNPs) in vertebrate cell nuclei. It binds tenaciously to cytidine-rich sequences and is the major oligo(rC/dC)-binding protein in vertebrate cells. We have cloned a cDNA of the Xenopus laevis hnRNP K and determined its sequence. The X.laevis hnRNP K is a 47 kD protein that is remarkably similar to its human 66 kD counterpart except for two large internal deletions. The sequence of hnRNP K contains a 45 amino acid repeated motif which is almost completely conserved between the X.laevis and human proteins. We found that this repeated motif, the KH motif (for K homology), shows significant homology to several proteins, some of which are known nucleic acid binding proteins. The homology is particularly strong with the archaebacterial ribosomal protein S3 and with the Saccharomyces cerevisiae protein MER1, which is required for meiosis-specific splicing of the MER2 transcript. As several of the proteins that contain the KH motif are known to bind RNA, this domain may be involved in RNA binding.

Journal ArticleDOI
TL;DR: This set of rRNAs contains representative structures from all of the major phylogenetic groupings--Archaea, (eu)Bacteria, and the nucleus, mitochondrion, and chloroplast of Eucarya.
Abstract: A collection of diverse 16S and 16S-like rRNA secondary structure diagrams is available. This set of rRNAs contains representative structures from all of the major phylogenetic groupings--Archaea, (eu)Bacteria, and the nucleus, mitochondrion, and chloroplast of Eucarya. Within this broad phylogenetic sampling are examples of the major forms of structural diversity currently known for this class of rRNAs. These structure diagrams are available online through our computer-network WWW server and anonymous ftp, as well as from the author in hardcopy format.

Journal ArticleDOI
TL;DR: In this paper, a minimal methyl-CpG binding domain (MBD) was isolated from MeCP2 and shown to have negligible non-specific affinity for DNA, confirming that the non-specific and methyl-CpG specific binding domains are distinct.
Abstract: MeCP2 is a chromosomal protein which binds to DNA that is methylated at CpG. In situ immunofluorescence in mouse cells has shown that the protein is most concentrated in pericentromeric heterochromatin, suggesting that MeCP2 may play a role in the formation of inert chromatin. Here we have isolated a minimal methyl-CpG binding domain (MBD) from MeCP2. MBD is 85 amino acids in length, and binds exclusively to DNA that contains one or more symmetrically methylated CpGs. MBD has negligible non-specific affinity for DNA, confirming that non-specific and methyl-CpG specific binding domains of MeCP2 are distinct. In vitro footprinting indicates that MBD binding can protect a 12 nucleotide region surrounding a methyl-CpG pair, with an approximate dissociation constant of 10^-9 M.

Journal ArticleDOI
TL;DR: This methodology can be used in combination with time-dependent degradation of oligonucleotides by exonucleases as a powerful tool to determine sequence compositions.
Abstract: We report the analysis and characterization of natural and modified oligonucleotides by matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS). The present technology was highly improved for this class of compounds by using a new matrix, 2,4,6-trihydroxy acetophenone, together with di- and triammonium salts of organic or inorganic acids to suppress peak broadening due to multiple ion adducts. This methodology can be used in combination with time-dependent degradation of oligonucleotides by exonucleases as a powerful tool to determine sequence compositions.

Journal ArticleDOI
TL;DR: A silica-based DNA extraction method for ancient bone, modified from a protocol published by Boom et al., is described that helps when remains yield little amplifiable DNA or when extracts inhibit the Taq polymerase.
Abstract: The polymerase chain reaction has made it possible to include extinct species and past populations in molecular studies of phylogeny and evolution (1). This emerging field, however, is marred by problems, mainly because archaeological remains often yield no amplifiable DNA, extracts often contain components which inhibit the Taq polymerase, and contamination of trace amounts of contemporary DNA can yield misleading results (2, 3). We have found that the following method, which is a modification of a protocol published by Boom et al. (4), is highly useful in alleviating the former two types of problems and in several cases allows the study of late Pleistocene animal remains that often are not amenable to other extraction procedures. A layer of approximately 1 mm is removed from the surface of the bone samples by grinding with a drilling machine in order to reduce contamination from previous handling. The sample is ground to a fine powder under liquid nitrogen in a freezer mill (Spex Industries Inc., Edison, NJ). About 0.5 g of bone powder is added to 1 ml of extraction buffer consisting of 10 M guanidinium thiocyanate (GuSCN), 0.1 M Tris-HCl pH 6.4, 0.02 M EDTA pH 8.0 and 1.3% Triton X-100. This is then incubated at 60°C for one to several hours with sporadic agitation. After centrifugation for 5 min at 5,000 rpm about 500 µl of the supernatant is recovered and added to a mixture of 500 µl of extraction buffer and 40 µl silica suspension prepared as in ref. 4. The mixture is incubated for 10 min at room temperature. Subsequently, the silica pellet is washed twice with a buffer consisting of 10 M GuSCN and 0.1 M Tris-HCl, pH 6.4, twice with 70% ethanol and once with acetone. After drying the pellet at 56°C, nucleic acids are eluted at 56°C in two aliquots of 65 µl water or TE and stored at -20°C.

Journal ArticleDOI
TL;DR: The database on small ribosomal subunit RNA structure contained 1804 nucleotide sequences on April 23, 1993, which comprises 365 eukaryotic, 65 archaeal, 1260 bacterial, 30 plastidial, and 84 mitochondrial sequences.
Abstract: The database on small ribosomal subunit RNA structure contained 1804 nucleotide sequences on April 23, 1993. This number comprises 365 eukaryotic, 65 archaeal, 1260 bacterial, 30 plastidial, and 84 mitochondrial sequences. These are stored in the form of an alignment in order to facilitate the use of the database as input for comparative studies on higher-order structure and for reconstruction of phylogenetic trees. The elements of the postulated secondary structure for each molecule are indicated by special symbols. The database is available on-line directly from the authors by ftp and can also be obtained from the EMBL nucleotide sequence library by electronic mail, ftp, and on CD ROM disk.
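
The per-group counts quoted in this abstract can be checked against the stated total with a couple of lines of Python. The dictionary keys are just labels chosen for this sketch; the numbers are those given in the abstract.

```python
# Sequence counts per group, as quoted in the abstract (April 23, 1993).
counts = {
    "eukaryotic": 365,
    "archaeal": 65,
    "bacterial": 1260,
    "plastidial": 30,
    "mitochondrial": 84,
}

total = sum(counts.values())
print(total)  # 1804, matching the total stated in the abstract
```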


Journal ArticleDOI
TL;DR: This work compared the activity of several inducible and constitutive expression systems in fission yeast by using a beta-galactosidase reporter gene to allow a more informed choice of appropriate expression systems for analysis of gene function in S.pombe.
Abstract: Analysis of a variety of problems in yeast relies on ectopic expression of the protein of interest under control of a heterologous promoter. In the fission yeast S.pombe, there are relatively few expression plasmids and most reports of their activity have not been comparative. It is thus difficult to determine the comparability of experiments carried out using different expression vectors. When new promoters are characterised, it is likewise difficult to determine their activity relative to those previously identified. In order to better assess relative promoter strengths, I have compared the activity of several inducible and constitutive expression systems in fission yeast by using a beta-galactosidase reporter gene. This allows a more informed choice of appropriate expression systems for analysis of gene function in S.pombe. Four regulatable and two constitutive promoters were compared. Three regulated promoters were derived from the powerful nmt1 promoter, first described by Maundrell (1) and subsequently attenuated with mutations in the TATA box by Basi et al. (2). The other regulated system was constructed from the tetracycline-inducible system described by Faryar and Gatz (3). The two constitutive promoters were the previously described vectors pART1 (containing the adh promoter; 4) and pSM1 (containing the SV40 promoter; 5). The expression vector REP3, containing the thiamine-inducible nmt1 promoter (1), and its derivatives REP41 and REP81 (2), which have lower levels of activity due to mutation, all contain an ATG within their polylinker. This was destroyed by insertion of a Xho linker; these derivatives are called REP3X (full strength nmt1), REP41X (slightly weaker; nmt1*) and REP81X (much


Journal ArticleDOI
TL;DR: The DNA sequence of a 225.4 kilobase segment of the Escherichia coli K-12 genome is described here, from 76.0 to 81.5 minutes on the genetic map, which brings the total of contiguous sequence from the E.coli genome project to 725.1 kb.
Abstract: The DNA sequence of a 225.4 kilobase segment of the Escherichia coli K-12 genome is described here, from 76.0 to 81.5 minutes on the genetic map. This brings the total of contiguous sequence from the E.coli genome project to 725.1 kb (76.0 to 92.8 minutes). We found 191 putative coding genes (ORFs) of which 72 genes were previously known, and 110 of which remain unidentified despite literature and similarity searches. Seven new genes--arsE, arsF, arsG, treF, xylR, xylG, and xylH--were identified as well as the previously mapped pit and dctA genes. The arrangement of proposed genes relative to possible promoters and terminators suggests 90 potential transcription units. Other features include 19 REP elements, 95 computer-predicted bends, 50 Chi sites, and one grey hole. Thirty-one putative signal peptides were found, including those of thirteen known membrane or periplasmic proteins. One tRNA gene (proK) and two insertion sequences (IS5 and IS150) are located in this segment. The genes in this region are organized with equal numbers oriented with or against replication.

Journal ArticleDOI
TL;DR: A model system, involving the lox-Cre site-specific recombination system of bacteriophage P1, to lock together the heavy and light chain genes from two different replicons within an infected bacterium is described.
Abstract: Antibody fragments, comprising paired heavy (VH) and light (VL) chain variable domains, can be displayed on the surface of filamentous bacteriophage, and rare phage (encoding antigen binding activities) selected by binding to antigen (1). The process mimics immune selection and has been used to make human antibody fragments in bacteria, without immunisation, by random combinatorial linkage (2) of diverse repertoires of VH and VL genes from lymphocytes (3, 4). Fragments with a range of binding specificities have been isolated with binding affinities in the range 10 M~'-10 M\" (for reviews see (5, 6)). However larger 'primary' repertoires of phage antibodies should allow higher affinity fragments to be isolated (7, 8). The size of phage antibody repertoires (10) is limited by the efficiency of transformation of E.coli. In principle, larger repertoires could be made by combinatorial infection, for example by transforming E.coli with a repertoire of heavy chains (encoded on plasmids) then infecting with a repertoire of light chains (encoded on phage) (9). Since infection is extremely efficient, and most E.coli cells in an exponential culture can be infected, the combinatorial diversity of Fab fragments displayed on phage could be as large as the number of E.coli in culture (10 per litre). However the heavy and light chain genes would not be packaged together within the same phage particle, and so could not be simultaneously co-selected. Here we describe a model system, involving the lox-Cre site-specific recombination system of bacteriophage P1, to lock together the heavy and light chain genes from two different replicons within an infected bacterium.

Journal ArticleDOI
TL;DR: It is shown that a PNA/DNA complex can effectively block the formation of a PCR product when the PNA is targeted against one of the PCR primer sites and this blockage allows selective amplification/suppression of target sequences that differ by only one base pair.
Abstract: A novel method that allows direct analysis of single base mutation by the polymerase chain reaction (PCR) is described. The method utilizes the finding that PNAs (peptide nucleic acids) recognize and bind to their complementary nucleic acid sequences with higher thermal stability and specificity than the corresponding deoxyribooligonucleotides and that they cannot function as primers for DNA polymerases. We show that a PNA/DNA complex can effectively block the formation of a PCR product when the PNA is targeted against one of the PCR primer sites. Furthermore, we demonstrate that this blockage allows selective amplification/suppression of target sequences that differ by only one base pair. Finally we show that PNAs can be designed in such a way that blockage can be accomplished when the PNA target sequence is located between the PCR primers.

Journal ArticleDOI
TL;DR: A new superfamily of (putative) DNA-dependent ATPases is described that includes the ATPase domains of prokaryotic NtrC-related transcription regulators, MCM proteins involved in the initiation of eukaryotic DNA replication, and a group of uncharacterized bacterial and chloroplast proteins.
Abstract: A new superfamily of (putative) DNA-dependent ATPases is described that includes the ATPase domains of prokaryotic NtrC-related transcription regulators, MCM proteins involved in the initiation of eukaryotic DNA replication, and a group of uncharacterized bacterial and chloroplast proteins. MCM proteins are shown to contain a modified form of the ATP-binding motif and are predicted to mediate ATP-dependent opening of double-stranded DNA in the replication origins. In a second line of investigation, it is demonstrated that the products of unidentified open reading frames from Marchantia mitochondria and from yeast, and a domain of a baculovirus protein involved in viral DNA replication are related to the superfamily III of DNA and RNA helicases that previously has been known to include only proteins of small viruses. Comparison of the multiple alignments showed that the proteins of the NtrC superfamily and the helicases of superfamily III share three related sequence motifs tightly packed in the ATPase domain that consists of 100-150 amino acid residues. A similar array of conserved motifs is found in the family of DnaA-related ATPases. It is hypothesized that the three large groups of nucleic acid-dependent ATPases have similar structure of the core ATPase domain and have evolved from a common ancestor.


Journal ArticleDOI
TL;DR: REBASE is a comprehensive database of information about restriction enzymes and their associated methylases, including their recognition and cleavage sites and their commercial availability, and a listing of homing endonucleases.
Abstract: REBASE is a comprehensive database of information about restriction enzymes and their associated methylases, including their recognition and cleavage sites and their commercial availability. Also included is a listing of homing endonucleases. Information from REBASE is available via monthly electronic mailings as well as via anonymous ftp and through the World Wide Web. The REBASE web site, http://www.neb.com/rebase, is where we maintain a web page for every enzyme, reference and supplier. Additionally, there is a search facility, help and NEWS pages, and a complete description of our various services. Specialized files are available that can be used directly by many software packages.

Journal ArticleDOI
TL;DR: An updated compilation of 300 E. coli mRNA promoter sequences is presented and the most recent relevant paper was checked, to verify the location of the transcriptional start position as identified experimentally.
Abstract: An updated compilation of 300 E. coli mRNA promoter sequences is presented. For each sequence the most recent relevant paper was checked, to verify the location of the transcriptional start position as identified experimentally. We comment on the reliability of the sequence databanks and analyze the conservation of known promoter features in the current compilation. This database is available by E-mail.

Journal ArticleDOI
TL;DR: A database of patterns found in protein sequences which would be used to search against sequences of unknown function, and contains some patterns which have been published in the literature, but the majority have been developed in the last four years by the author.
Abstract: PROSITE is a compilation of sites and patterns found in protein sequences; it can be used as a method of determining the function of uncharacterized proteins translated from genomic or cDNA sequences. In some cases the sequence of an unknown protein is too distantly related to any protein of known structure to detect its resemblance by overall sequence alignment, but relationships can be revealed by the occurrence in its sequence of a particular cluster of residue types which is variously known as a pattern, motif, signature, or fingerprint. These motifs arise because specific region(s) of a protein which may be important, for example, for their binding properties or for their enzymatic activity are conserved in both structure and sequence. These structural requirements impose very tight constraints on the evolution of these small but important portion(s) of a protein sequence. The use of protein sequence patterns to determine the function of proteins is becoming very rapidly one of the essential tools of sequence analysis. This reality has been recognized by many authors [1,2]. While there have been a number of reviews of published patterns [3,4,5], no attempt had been made until very recently [6,7] to systematically collect biologically significant patterns or to discover new ones. Based on these observations, we decided in 1988, to actively pursue the development of a database of patterns which would be used to search against sequences of unknown function. This database, called PROSITE, contains some patterns which have been published in the literature, but the majority have been developed in the last four years by the author.
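
Searching a sequence against a pattern of this kind can be sketched by translating the pattern into a regular expression. The abstract does not spell out the pattern notation, so the sketch below assumes the commonly documented PROSITE conventions (elements separated by `-`, `x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` for repetition, `<` and `>` for the sequence ends). The function name and the toy peptide are hypothetical, and rarer syntax such as `>` inside square brackets is not handled.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Simplified sketch: covers x, [..], {..}, (n), (n,m), < and > only.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # pattern anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # pattern anchored at the C-terminus
            suffix, element = "$", element[:-1]
        repeat = ""
        counted = re.fullmatch(r"(.+?)\((\d+(?:,\d+)?)\)", element)
        if counted:                      # e.g. x(2,4) -> .{2,4}
            element, repeat = counted.group(1), "{" + counted.group(2) + "}"
        if element == "x":
            core = "."                   # any residue
        elif element.startswith("{") and element.endswith("}"):
            core = "[^" + element[1:-1] + "]"   # excluded residues
        else:
            core = element               # single residue or [..] class
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


# N-glycosylation site pattern, applied to a made-up peptide.
regex = prosite_to_regex("N-{P}-[ST]-{P}")
peptide = "MKNGSAANPTQ"
hits = [(m.start(), m.group(0)) for m in re.finditer(regex, peptide)]
print(regex, hits)
```

In the toy peptide only the first asparagine matches, because the second is followed by proline, which the `{P}` element excludes. Note that `re.finditer` reports non-overlapping matches, so a production scanner would need a lookahead to recover overlapping sites.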

Journal ArticleDOI
TL;DR: A rapid (< 2.5 hrs) method for single-strand conformation polymorphism (SSCP) analysis of PCR products that allows the use of ethidium bromide staining is described and has additional advantages of dramatically increased speed, precise temperature control, reproducibility, and easily and inexpensively obtainable reagents and equipment.
Abstract: A rapid (< 2.5 hrs) method for single-strand conformation polymorphism (SSCP) analysis of PCR products that allows the use of ethidium bromide staining is described. PCR products ranging in size from 117 to 256 bp were evaluated for point mutations and polymorphisms by 'cold SSCP' in commercially available pre-cast polyacrylamide mini-gels. Several electrophoretic parameters (running temperature, buffers, denaturants, DNA concentration, and gel polyacrylamide concentration) were found to influence the degree of strand separation and appeared to be PCR fragment specific. Use of the 'cold' SSCP technique and the mini-gel format allowed us to readily optimize the electrophoretic conditions for each PCR fragment. This greatly increased our ability to detect polymorphisms compared to conventional, radioisotope-labeled 'hot' SSCP, typically run under two standard temperature conditions. Excellent results have been obtained in resolving mutant PCR fragments from human p53 exons 5 through 8, human HLA-DQA, human K-ras exons 1 and 2, and rat K-ras exon 3. Polymorphisms could be detected when mutant DNA comprised as little as 3% of the total gene copies in a PCR mixture. Compared to standard 'hot' SSCP, this novel non-isotopic method has additional advantages of dramatically increased speed, precise temperature control, reproducibility, and easily and inexpensively obtainable reagents and equipment. This new method also lacks the safety and hazardous waste management concerns associated with radioactive methods.

Journal ArticleDOI
TL;DR: The complete DNA sequence of the Euglena gracilis, Pringsheim strain Z chloroplast genome is reported; the circular DNA is 143,170 bp, counting only one copy of a 54 bp tandem repeat sequence that is present in variable copy number within a single culture.
Abstract: We report the complete DNA sequence of the Euglena gracilis, Pringsheim strain Z chloroplast genome. This circular DNA is 143,170 bp, counting only one copy of a 54 bp tandem repeat sequence that is present in variable copy number within a single culture. The overall organization of the genome involves a tandem array of three complete and one partial ribosomal RNA operons, and a large single copy region. There are genes for the 16S, 5S, and 23S rRNAs of the 70S chloroplast ribosomes, 27 different tRNA species, 21 ribosomal proteins plus the gene for elongation factor EF-Tu, three RNA polymerase subunits, and 27 known photosynthesis-related polypeptides. Several putative genes of unknown function have also been identified, including five within large introns, and five with amino acid sequence similarity to genes in other organisms. This genome contains at least 149 introns. There are 72 individual group II introns, 46 individual group III introns, 10 group II introns and 18 group III introns that are components of twintrons (introns-within-introns), and three additional introns suspected to be twintrons composed of multiple group II and/or group III introns, but not yet characterized. At least 54,804 bp, or 38.3% of the total DNA content is represented by introns.
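The intron tally and the intron fraction quoted in this abstract can be checked with a few lines of arithmetic. The short Python sketch below uses only the figures stated in the abstract; the variable names are illustrative.

```python
# Intron categories as listed in the abstract.
intron_counts = {
    "individual group II": 72,
    "individual group III": 46,
    "group II within twintrons": 10,
    "group III within twintrons": 18,
    "suspected twintrons (uncharacterized)": 3,
}

genome_bp = 143_170  # genome size, counting one copy of the 54 bp repeat
intron_bp = 54_804   # minimum intron content reported

total_introns = sum(intron_counts.values())
intron_percent = round(100 * intron_bp / genome_bp, 1)

print(total_introns)   # sum of the five categories
print(intron_percent)  # intron share of the genome, in percent
```

The five categories sum to the stated minimum of 149 introns, and 54,804 bp out of 143,170 bp rounds to the stated 38.3%.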

Journal ArticleDOI
TL;DR: The ENZYME data bank is a repository of information relative to the nomenclature of enzymes; it contains data for each type of characterized enzyme for which an EC (Enzyme Commission) number has been provided.
Abstract: SWISS-PROT is an annotated protein sequence database established in 1986 and maintained collaboratively, since 1988, by the Department of Medical Biochemistry of the University of Geneva and the EMBL Data Library. The SWISS-PROT protein sequence data bank consists of sequence entries. Sequence entries are composed of different line types, each with its own format. For standardization purposes the format of SWISS-PROT follows as closely as possible that of the EMBL Nucleotide Sequence Database. A sample SWISS-PROT entry is shown in Figure 1.