
Showing papers in "Protein Engineering in 1995"


Journal ArticleDOI
TL;DR: The LIGPLOT program automatically generates schematic 2-D representations of protein-ligand complexes from standard Protein Data Bank file input giving a simple and informative representation of the intermolecular interactions and their strengths, including hydrogen bonds, hydrophobic interactions and atom accessibilities.
Abstract: The LIGPLOT program automatically generates schematic 2-D representations of protein-ligand complexes from standard Protein Data Bank file input. The output is a colour, or black-and-white, PostScript file giving a simple and informative representation of the intermolecular interactions and their strengths, including hydrogen bonds, hydrophobic interactions and atom accessibilities. The program is completely general for any ligand and can also be used to show other types of interaction in proteins and nucleic acids. It was designed to facilitate the rapid inspection of many enzyme complexes, but has found many other applications.

4,745 citations


Journal ArticleDOI
TL;DR: Results proved that members of this alpha-helical receptor library with multiple substitutions in the solvent-exposed surface remain stable and soluble in E. coli.
Abstract: The construction and characterization of a combinatorial library of a solvent-exposed surface of an alpha-helical domain derived from a bacterial receptor is described. Using a novel solid-phase approach, the library was assembled in a directed and successive manner utilizing single-stranded oligonucleotides containing multiple random substitutions for the variegated segments of the gene fragment. The simultaneous substitution of 13 residues to all 20 possible amino acids was carried out in a region spanning 81 nucleotides. The randomization was made in codons for amino acids that were modelled to be solvent accessible at a surface made up from two of the three alpha-helices of a monovalent Fc-binding domain of staphylococcal protein A. After cloning of the PCR-amplified library into a phagemid vector adapted for phage display of the mutants, DNA sequencing analysis suggested a random distribution of codons in the mutagenized positions. Four members of the library with multiple substitutions were produced in Escherichia coli as fusions to an albumin-binding affinity tag derived from streptococcal protein G. The fusion proteins were purified by human serum albumin affinity chromatography and subsequently characterized by SDS-electrophoresis, CD spectroscopy and biosensor analysis. The analyses showed that the mutant protein A derivatives could all be secreted as soluble full-length proteins. Furthermore, the CD analysis showed that all mutants, except one with a proline introduced into helix 2, have secondary structures in close agreement with the wild-type domain. These results proved that members of this alpha-helical receptor library with multiple substitutions in the solvent-exposed surface remain stable and soluble in E. coli.(ABSTRACT TRUNCATED AT 250 WORDS)

307 citations


Journal ArticleDOI
TL;DR: Linear and thioether-linked F(ab')2 have very similar pharmacokinetic properties in normal mice, and their serum permanence times are respectively 7- and 8-fold longer than the corresponding Fab fragment.
Abstract: We developed a novel bivalent antibody fragment, the linear (L-) F(ab')2, comprising tandem repeats of a heavy chain fragment VH-CH1-VH-CH1 cosecreted with a light chain. Functional humanized L-F(ab')2 directed against p185HER2 was secreted from Escherichia coli at high titer (≥ 100 mg/l) and purified to homogeneity. The L-F(ab')2 binds two equivalents of antigen with an apparent affinity (Kd = 0.46 nM) that is within 3-fold of the corresponding thioether-linked F(ab')2 fragment. The N-terminal site binds antigen with an affinity (Kd = 1.2 nM) that is approximately 4-fold greater than that for the C-terminal site, as shown by the comparison of L-F(ab')2 variants containing a single functional binding site. L-F(ab')2 has greater antiproliferative activity than the thioether-linked F(ab')2 against the p185HER2-overexpressing tumor cell line BT474. Linear and thioether-linked F(ab')2 have very similar pharmacokinetic properties in normal mice, and their serum permanence times are respectively 7- and 8-fold longer than the corresponding Fab fragment. L-F(ab')2 offers a facile route to bivalent antibody fragments that are potentially suitable for clinical applications, and that may have improved biological activity compared with thioether-linked F(ab')2 fragments.

271 citations


Journal ArticleDOI
TL;DR: This analysis shows that it is possible to engineer improved frameworks for semi-synthetic antibody libraries which may be important in maintaining library diversity and that limitations in recombinant protein expression can be overcome by single amino acid substitutions.
Abstract: Using recombinant antibodies functionally expressed by secretion to the periplasm in Escherichia coli as a model system, we identified mutations located in turns of the protein which reduce the formation of aggregates during in vivo folding or which influence cell stability during expression. Unexpectedly, the two effects are based on different mutations and could be separated, but both mutations act synergistically in vivo. Neither mutation increases the thermodynamic stability in vitro. However, the in vivo folding mutation correlates with the yield of oxidative folding in vitro, which is limited by the side reaction of aggregation. The in vivo folding data also correlate with the rate and activation entropy of thermally induced aggregation. This analysis shows that it is possible to engineer improved frameworks for semi-synthetic antibody libraries which may be important in maintaining library diversity. Moreover, limitations in recombinant protein expression can be overcome by single amino acid substitutions.

270 citations


Journal ArticleDOI
TL;DR: The results strongly suggest the use of the recognition procedure for docking studies where the detailed structures of the molecules are lacking, and a pronounced trend towards the correct structure of the molecular complex was clearly indicated.
Abstract: A typical problem for a docking procedure is how to match two molecules with known 3-D structure so as to predict the configuration of their complex. A very serious obstacle to docking is an inherent inaccuracy in the 3-D structures of the molecules. In general, existing molecular recognition techniques are not designed for cases where (i) conformational changes upon macromolecular complex formation are substantial or (ii) the X-ray data on one or both (macro) molecules are not available, and the structures, based on alternative sources (NMR, modeling), are not well defined. We designed a direct computer experiment using molecules totally deprived of any structural features smaller than 7 Å. This was performed on the basis of a previously developed docking algorithm. The modified procedure was applied to a number of known protein complexes taken from the Brookhaven Protein Data Bank. In most cases, a pronounced trend towards the correct structure of the molecular complex was clearly indicated and the real binding sites were predicted. The distinction between the prediction of the antigen-antibody complex and other molecular pairs may reflect important differences in the principles of complex formation. The results strongly suggest the use of our recognition procedure for docking studies where the detailed structures of the molecules are lacking.

248 citations


Journal ArticleDOI
TL;DR: The essential dynamics method was used to study differences in dynamics between the apo and holo forms of CRBP, and showed inhibition of essential motions upon ligand binding, and revealed large correlated motions of retinol with regions of the protein, pointing to a possible retinol entry/exit site.
Abstract: The cellular retinol-binding protein (CRBP) is an intracellular retinol carrier protein belonging to a family of hydrophobic ligand-binding proteins. It transports retinol to specific locations in the cell where, for instance, it is esterified for storage. Recently solved crystallographic structures of CRBP homologues with and without bound ligand do not provide evidence for a ligand-induced conformational change. However, it has been shown that there is a difference in binding of holo-CRBP and apo-CRBP to lecithin-retinol acyltransferase. Moreover, proteolysis of holo-CRBP and apo-CRBP yields different products, indicating a difference in structure or dynamics between the two forms. Here, we present the results of molecular dynamics simulations of holo-CRBP and apo-CRBP. The simulations show a significant difference in conformation, in agreement with experimental results. The essential dynamics method was used to study differences in dynamics between the apo and holo forms of CRBP, and showed inhibition of essential motions upon ligand binding. It also revealed large correlated motions of retinol with regions of the protein, pointing to a possible retinol entry/exit site.

168 citations


Journal ArticleDOI
TL;DR: The proposed approach provides reasonable estimates of distinctions in binding affinity and gives an insight into the nature of enthalpy-entropy compensation factors detected in the binding process.
Abstract: The steadily increasing number of high-resolution human immunodeficiency virus (HIV) 1 protease complexes has been the impetus for the elaboration of knowledge-based mean field ligand-protein interaction potentials. These potentials have been linked with the hydrophobicity and conformational entropy scales developed originally to explain protein folding and stability. Empirical free energy calculations of a diverse set of HIV-1 protease crystallographic complexes have enabled a detailed analysis of binding thermodynamics. The thermodynamic consequences of conformational changes that HIV-1 protease undergoes upon binding to all inhibitors, and a substantial concomitant loss of conformational entropy by the part of HIV-1 protease that forms the ligand-protein interface, have been examined. The quantitative breakdown of the entropy-driven changes occurring during ligand-protein association, such as the hydrophobic contribution, the conformational entropy term and the entropy loss due to a reduction of rotational and translational degrees of freedom, of a system composed of ligand, protein and crystallographic water molecules at the ligand-protein interface has been carried out. The proposed approach provides reasonable estimates of distinctions in binding affinity and gives an insight into the nature of enthalpy-entropy compensation factors detected in the binding process.

152 citations


Journal ArticleDOI
TL;DR: Absolute binding free energies for three inhibitors of HIV-1 proteinase were estimated from molecular dynamics simulations by a recently reported linear approximation procedure, in fairly good agreement with experimental binding data.
Abstract: Estimation of Binding Free Energies for HIV Proteinase Inhibitors by Molecular Dynamics Simulations

150 citations


Journal ArticleDOI
TL;DR: The extramembraneous segments in a large collection of G-protein coupled receptors (GPCRs) have been analysed in terms of amino acid composition and length and it is shown that this family of multi-spanning integral membrane proteins conforms well to the 'positive inside' rule.
Abstract: The extramembraneous segments in a large collection of G-protein coupled receptors (GPCRs) have been analysed in terms of amino acid composition and length. It is shown that this family of multi-spanning integral membrane proteins conforms well to the 'positive inside' rule. Further, the extracellular N-terminal tails of GPCRs lacking a cleavable signal peptide are shown to be considerably shorter and to have a reduced content of positively charged amino acids compared with the N-terminal tails of GPCRs endowed with a signal peptide. This suggests that extracellular N-terminal tails of eukaryotic plasma membrane proteins may be translocated by different mechanisms depending on whether or not they are preceded by a signal peptide.

131 citations


Journal ArticleDOI
TL;DR: An automatic algorithm based on inter-residue contacts is presented to identify domains in proteins and agrees with the authors' assignment for 78% of the 284 non-redundant chains considered.
Abstract: An automatic algorithm based on inter-residue contacts is presented to identify domains in proteins. The results of the algorithm are compared to an assignment performed by inspection that was guided by the authors' description in the literature. The authors' and the algorithm's assignments for a chain were considered to agree if the same number of domains were identified and if the assignments were the same for at least 95% of the residues. With this criterion, the algorithm agreed with the authors' assignment for 78% of the 284 non-redundant chains considered. When some of the authors' assignments were re-evaluated based on the results of the algorithm, an agreement of 84% was obtained. The algorithm is therefore a useful tool for data validation in domain assignment. The authors' assignments of domains were analysed for structural principles of domains. The numbers of chains forming one, two, three, four and five domains are 197, 67, 13, 6 and 1 respectively. Most domains in multidomain proteins are formed from continuous segments and adopt the same structural class. Distributions of the number of residues and the ellipticity of domains and chains are presented. The relationship between accessible surface area and molecular weight for domains and chains is examined.

129 citations
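
The agreement criterion in the domain-assignment abstract above (same number of domains, and identical assignment for at least 95% of residues) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function name `assignments_agree`, the per-residue label lists, and the step that matches arbitrary domain labels by trying every one-to-one relabelling are all assumptions made for the example.

```python
from itertools import permutations

def assignments_agree(author, algorithm, min_fraction=0.95):
    """Return True if two per-residue domain assignments agree.

    `author` and `algorithm` are equal-length sequences giving a domain
    label for each residue. Labels are arbitrary, so every one-to-one
    relabelling is tried (assumption: only a handful of domains per chain).
    """
    if len(author) != len(algorithm) or not author:
        raise ValueError("assignments must be non-empty and of equal length")
    author_labels = sorted(set(author))
    algorithm_labels = sorted(set(algorithm))
    # Criterion 1: the same number of domains must be identified.
    if len(author_labels) != len(algorithm_labels):
        return False
    # Criterion 2: under the best label matching, at least
    # `min_fraction` of the residues must carry the same assignment.
    best = 0
    for perm in permutations(algorithm_labels):
        mapping = dict(zip(perm, author_labels))
        same = sum(1 for a, b in zip(author, algorithm) if mapping[b] == a)
        best = max(best, same)
    return best / len(author) >= min_fraction

# Hypothetical 100-residue chain with two domains.
author = [1] * 60 + [2] * 40
close = ["x"] * 62 + ["y"] * 38   # 98 of 100 residues match
far = ["x"] * 70 + ["y"] * 30     # 90 of 100 residues match
print(assignments_agree(author, close), assignments_agree(author, far))
```

The exhaustive relabelling is only practical because chains in the abstract have at most five domains; a larger label set would call for a proper assignment solver.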


Journal ArticleDOI
TL;DR: Analysis of the binding and turnover of natural and synthetic substrates by the wild-type and mutant enzymes shows that the primary role of Gln11 is to prevent the non-productive binding of substrate.
Abstract: Bovine pancreatic ribonuclease A (RNase A) has been the object of much landmark work in biological chemistry. Yet the application of the techniques of protein engineering to RNase A has been limited by problems inherent in the isolation and heterologous expression of its gene. A cDNA library was prepared from cow pancreas, and from this library the cDNA that codes for RNase A was isolated. This cDNA was inserted into expression plasmids that then directed the production of RNase A in Saccharomyces cerevisiae (fused to a modified alpha-factor leader sequence) or Escherichia coli (fused to the pelB signal sequence). RNase A secreted into the medium by S.cerevisiae was an active but highly glycosylated enzyme that was recoverable at 1 mg/l of culture. RNase A produced by E.coli was in an insoluble fraction of the cell lysate. Oxidation of the reduced and denatured protein produced active enzyme which was isolated at 50 mg/l of culture. The bacterial expression system is ideal for the large-scale production of mutants of RNase A. This system was used to substitute alanine, asparagine or histidine for Gln11, a conserved residue that donates a hydrogen bond to the reactive phosphoryl group of bound substrate. Analysis of the binding and turnover of natural and synthetic substrates by the wild-type and mutant enzymes shows that the primary role of Gln11 is to prevent the non-productive binding of substrate.

Journal ArticleDOI
TL;DR: The interdomain linker peptide improved the hapten binding properties of the antibody fragment when compared with Fv fragment, but slightly increased its susceptibility to proteases.
Abstract: Single-chain antibodies were constructed using six different linker peptides to join the VH and VL domains of an anti-2-phenyloxazolone (Ox) antibody. Four of the linker peptides originated from the interdomain linker region of the fungal cellulase CBHI and consisted of 28, 11, six and two amino acid residues. The two other linker peptides used were the (GGGGS)3 linker with 15 amino acid residues and a modified IgG2b hinge peptide with 22 residues. Proteolytic stability and Ox binding properties of the six different scFv derivatives produced in Escherichia coli were investigated and compared with those of the corresponding Fv fragment containing no joining peptide between the V domains. The hapten binding properties of different antibody fragments were studied by ELISA and BIAcoreTM. The interdomain linker peptide improved the hapten binding properties of the antibody fragment when compared with Fv fragment, but slightly increased its susceptibility to proteases. Single-chain antibodies with short CBHI linkers of 11, six and two residues had a tendency to form multimers which led to a higher apparent affinity. The fragments with linkers longer than 11 residues remained monomeric.

Journal ArticleDOI
TL;DR: It is shown that it is possible to engineer a protein of enhanced thermostability by combining a series of rationally designed point mutations, and that this stabilization is achieved with only minor, localized changes in the structure of the protein.
Abstract: A number of mutations have been shown previously to stabilize T4 lysozyme. By combining up to seven such mutations in the same protein, the melting temperature was incrementally increased by up to 8.3 °C at pH 5.4 (ΔΔG = 3.6 kcal/mol). This shows that it is possible to engineer a protein of enhanced thermostability by combining a series of rationally designed point mutations. It is also shown that this stabilization is achieved with only minor, localized changes in the structure of the protein. This is consistent with the observation that the change in stability of each of the multiple mutants is, in each case, additive, i.e. equal to the sum of the stability changes associated with the constituent single mutants. One of the seven substitutions, Asn116→Asp, changes a residue that participates in substrate binding; not surprisingly, it causes a significant loss in activity. Ignoring this mutation, there is a gradual reduction in activity as successively more mutations are combined.

Journal ArticleDOI
TL;DR: An algorithm is described that predicts tertiary structures of small proteins using an exceedingly simple potential function based only on a single type of favorable interaction between hydrophobic residues, an unfavorable excluded volume term of spatial overlaps and an interstrand hydrogen bond interaction.
Abstract: We describe an algorithm to predict tertiary structures of small proteins. In contrast to most current folding algorithms, it uses very few energy parameters. Given the secondary structural elements in the sequence--alpha-helices and beta-strands--the algorithm searches the remaining conformational space of a simplified real-space representation of chains to find a minimum energy of an exceedingly simple potential function. The potential is based only on a single type of favorable interaction between hydrophobic residues, an unfavorable excluded volume term of spatial overlaps and, for sheet proteins, an interstrand hydrogen bond interaction. Where appropriate, the known disulfide bonds are constrained by a square-law potential. Conformations are searched by a genetic algorithm. The model predicts reasonably well the known tertiary folds of seven out of the 10 small proteins we consider. We draw two conclusions. First, for the proteins we tested, this exceedingly simple potential function is no worse than others having hundreds of energy parameters in finding the right general tertiary structures. Second, despite its simplicity, the potential function is not the weak link in this algorithm. Differences between our predicted structures and the correct targets can be ascribed to shortcomings in our search strategy. This potential function may be useful for testing other conformational search strategies.

Journal ArticleDOI
TL;DR: Two methods are presented for designing amino acid sequences of proteins that will fold to have good hydrophobic cores: one designs hydrophobic inside, polar outside; the other minimizes an energy function in a sequence evolution process.
Abstract: We present two methods for designing amino acid sequences of proteins that will fold to have good hydrophobic cores. Given the coordinates of the desired target protein or polymer structure, the methods generate sequences of hydrophobic (H) and polar (P) monomers that are intended to fold to these structures. One method designs hydrophobic inside, polar outside; the other minimizes an energy function in a sequence evolution process. The sequences generated by these methods agree at the level of 60-80% of the sequence positions in 20 proteins in the Protein Data Bank. A major challenge in protein design is to create sequences that can fold uniquely, i.e. to a single conformation rather than to many. While an earlier lattice-based sequence evolution method was shown not to design unique folders, our method generates unique folders in lattice model tests. These methods may also be useful in designing other types of foldable polymer not based on amino acids.
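
As a rough illustration of the "hydrophobic inside, polar outside" idea in the abstract above, the sketch below labels the residues closest to the centroid of the target coordinates as H and the rest as P. The burial measure (distance from the centroid), the `hydrophobic_fraction` parameter and the function name are assumptions made for this example; the abstract does not say how the authors define "inside".

```python
import math

def design_hp_sequence(coords, hydrophobic_fraction=0.5):
    """Assign H to the most buried residues and P to the rest.

    `coords` is a list of (x, y, z) tuples, one per residue of the target
    structure. Burial is approximated by distance from the centroid, which
    is an assumption of this sketch rather than the published method.
    """
    n = len(coords)
    if n == 0:
        return ""
    centroid = tuple(sum(c[k] for c in coords) / n for k in range(3))
    distance = [math.dist(c, centroid) for c in coords]
    n_hydrophobic = int(round(hydrophobic_fraction * n))
    # Indices of the residues nearest the centroid become hydrophobic.
    buried = set(sorted(range(n), key=lambda i: distance[i])[:n_hydrophobic])
    return "".join("H" if i in buried else "P" for i in range(n))

# Hypothetical five-residue target whose centroid is the origin.
target = [(0, 0, 0), (1, 0, 0), (-2, 0, 0), (5, 0, 0), (-4, 0, 0)]
print(design_hp_sequence(target, hydrophobic_fraction=0.4))  # HHPPP
```

A centroid-distance ranking only makes sense for compact, roughly globular targets; a solvent-accessibility or neighbour-count measure would be the natural replacement for anything else.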

Journal ArticleDOI
TL;DR: New developments to the dead-end elimination method are presented that allow us to handle larger proteins and more extensive rotamer libraries and it now becomes feasible to use extremely detailed libraries.
Abstract: Although the conformational states of protein side chains can be described using a library of rotamers, the determination of the global minimum energy conformation (GMEC) of a large collection of side chains, given fixed backbone coordinates, represents a challenging combinatorial problem with important applications in the field of homology modelling. Recently, we have developed a theoretical framework, called the dead-end elimination method, which allows us to identify efficiently rotamers that cannot be members of the GMEC. Such dead-ending rotamers can be iteratively removed from the system under study thereby tracking down the size of the combinatorial problem. Here we present new developments to the dead-end elimination method that allow us to handle larger proteins and more extensive rotamer libraries. These developments encompass (i) a procedure to determine weight factors in the generalized dead-end elimination theorem thereby enhancing the elimination of dead-ending rotamers and (ii) a novel strategy, mainly based on logical arguments derived from the logic pairs theorem, to use dead-ending rotamer pairs in the efficient elimination of single rotamers. These developments are illustrated for proteins of various sizes and the flow of the current method is discussed in detail. The effectiveness of dead-end elimination is increased by two orders of magnitude as compared with previous work. In addition, it now becomes feasible to use extremely detailed libraries. We also provide an appendix in which the validity of the generalized dead-end criterion is shown. Finally, perspectives for further applications which may now become within reach are discussed.
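
For readers unfamiliar with dead-end elimination, the following minimal sketch applies the original single-rotamer criterion iteratively: rotamer r at position i is removed when its best-case energy still exceeds the worst-case energy of some alternative rotamer t at the same position. It does not implement the weight factors of the generalized theorem or the pair-based strategy described in the abstract above. The data layout (`self_energy` lists, a `pair` callable) and the toy energies are assumptions made for the example.

```python
def dead_end_eliminate(self_energy, pair):
    """Iteratively remove rotamers that cannot be part of the GMEC.

    self_energy[i][r] is the energy of rotamer r at position i with the
    fixed backbone; pair(i, r, j, s) is the interaction energy between
    rotamer r at position i and rotamer s at position j.
    Returns a list of sets with the surviving rotamer indices.
    """
    n = len(self_energy)
    alive = [set(range(len(e))) for e in self_energy]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            others = [j for j in range(n) if j != i]
            for r in sorted(alive[i]):
                # Best case for r: most favourable partner at every position.
                low = self_energy[i][r] + sum(
                    min(pair(i, r, j, s) for s in alive[j]) for j in others)
                for t in sorted(alive[i]):
                    if t == r:
                        continue
                    # Worst case for t: least favourable partner everywhere.
                    high = self_energy[i][t] + sum(
                        max(pair(i, t, j, s) for s in alive[j]) for j in others)
                    if low > high:
                        alive[i].discard(r)  # r is dead-ending
                        changed = True
                        break
    return alive

# Hypothetical three-position system with two rotamers per position.
self_energy = [[0.0, 5.0], [0.0, 1.0], [0.0, 10.0]]
table = {(0, 0, 1, 0): 0.5, (0, 1, 1, 1): -0.5}  # unlisted pairs are 0.0

def pair(i, r, j, s):
    key = (i, r, j, s) if i < j else (j, s, i, r)
    return table.get(key, 0.0)

alive = dead_end_eliminate(self_energy, pair)
print(alive)  # [{0}, {0}, {0}]
```

In this toy system the criterion alone reduces every position to a single rotamer. In realistic cases it usually leaves several candidates per position, and the survivors are passed to a combinatorial search.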

Journal ArticleDOI
TL;DR: Alignment of the sequences of the four papaya enzymes shows that there is a highly variable section towards the C-terminal of the pro-region, which may confer selectivity to the pro-regions for the individual proteolytic enzymes.
Abstract: Proteolytic enzymes require the presence of their pro-regions for correct folding. Of the four proteolytic enzymes from Carica papaya, papain and papaya proteinase IV (PPIV) have 68% sequence identity. We find that their pro-regions are even more similar, exhibiting 73.6% identity. cDNAs encoding the pro-regions of these two proteinases have been expressed in Escherichia coli independently from their mature enzymes. The recombinant pro-regions of papain and PPIV have been shown to be high affinity inhibitors of all four of the mature native papaya cysteine proteinases. Their inhibition constants are in the range 10^-6 to 10^-9 M. PPIV was inhibited two to three orders of magnitude less effectively than papain, chymopapain and caricain. The pro-region of PPIV, however, inhibited its own mature enzyme more effectively than did the pro-region of papain. Alignment of the sequences of the four papaya enzymes shows that there is a highly variable section towards the C-terminal of the pro-region. This region may therefore confer selectivity to the pro-regions for the individual proteolytic enzymes.

Journal ArticleDOI
TL;DR: The key to this method is the use of a Bayesian formalism to calculate the probability that a given substitution matrix fits the tree structures and multiple sequence alignment data.
Abstract: Substitution matrices are a key tool in important applications such as identifying sequence homologies, creating sequence alignments and more recently using evolutionary patterns for the prediction of protein structure. We have derived a novel approach to the derivation of these matrices that utilizes not only multiple sequence alignments, but also the associated evolutionary trees. The key to our method is the use of a Bayesian formalism to calculate the probability that a given substitution matrix fits the tree structures and multiple sequence alignment data. Using this procedure, we can determine optimal substitution matrices for various local environments, depending on parameters such as secondary structure and surface accessibility.

Journal ArticleDOI
TL;DR: The probability alignment method is applied to a few protein pairs, and results indicate that highly probable correspondences in the probability alignments are probably correct correspondences that agree with the structural alignments, and that incorrect correspondences in the maximum similarity alignments are usually insignificant correspondences in the probability alignments.
Abstract: Probabilities of all possible correspondences of residues in aligning two proteins are evaluated by assuming that the statistical weight of each alignment is proportional to the exponent of its total similarity score. Based on such probabilities, a probability alignment that includes the most probable correspondences is proposed. In the case of highly similar sequence pairs, the probability alignments agree with the maximum similarity alignments that correspond to the alignments with the maximum similarity score. Significant correspondences in the probability alignments are those whose probabilities are > 0.5. The probability alignment method is applied to a few protein pairs, and results indicate that such highly probable correspondences in the probability alignments are probably correct correspondences that agree with the structural alignments and that incorrect correspondences in the maximum similarity alignments are usually insignificant correspondences in the probability alignments. The root mean square deviations in superimposition of corresponding residues tend to be smaller for significant correspondences in the probability alignments than for all correspondences in the maximum similarity alignments, indicating that incorrect correspondences in the maximum similarity alignments tend to be insignificant correspondences in probability alignments. This fact is also confirmed in 109 protein pairs that are similar to each other with sequence identities between 90 and 35%. In addition, the probability alignment method may better predict correct correspondences than the maximum similarity alignment method. Probability alignments do, of course, depend on a scoring scheme but are less sensitive to the value of parameters such as gap penalties. The present probability alignment method is useful for constructing reliable alignments based on the probabilities of correspondences and can be used with any scoring scheme.
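
The central idea of the abstract above, weighting every alignment by the exponential of its score and reading off the probability of each residue correspondence, can be sketched with a forward-backward recursion over global alignments. Several things here are simplifying assumptions rather than details from the paper: the match/mismatch scores, the linear gap penalty, the `temperature` scaling factor, and the fact that adjacent insertions and deletions in different orders are counted as distinct alignments.

```python
import math

def correspondence_probabilities(a, b, match=2.0, mismatch=-1.0,
                                 gap=2.0, temperature=1.0):
    """Return P[i][j], the probability that a[i] is aligned with b[j].

    Each global alignment gets statistical weight exp(score / temperature);
    the scoring scheme is a toy one chosen for illustration.
    """
    n, m = len(a), len(b)
    g = math.exp(-gap / temperature)  # weight of one gapped residue

    def w(i, j):
        score = match if a[i] == b[j] else mismatch
        return math.exp(score / temperature)

    # Forward sums: F[i][j] = total weight of alignments of a[:i] with b[:j].
    F = [[0.0] * (m + 1) for _ in range(n + 1)]
    F[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            total = 0.0
            if i > 0 and j > 0:
                total += F[i - 1][j - 1] * w(i - 1, j - 1)
            if i > 0:
                total += F[i - 1][j] * g
            if j > 0:
                total += F[i][j - 1] * g
            F[i][j] = total

    # Backward sums: B[i][j] = total weight of alignments of a[i:] with b[j:].
    B = [[0.0] * (m + 1) for _ in range(n + 1)]
    B[n][m] = 1.0
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n and j == m:
                continue
            total = 0.0
            if i < n and j < m:
                total += w(i, j) * B[i + 1][j + 1]
            if i < n:
                total += g * B[i + 1][j]
            if j < m:
                total += g * B[i][j + 1]
            B[i][j] = total

    z = F[n][m]  # partition function over all global alignments
    return [[F[i][j] * w(i, j) * B[i + 1][j + 1] / z for j in range(m)]
            for i in range(n)]

def significant_pairs(P, threshold=0.5):
    """Correspondences whose probability exceeds the threshold."""
    return [(i, j) for i, row in enumerate(P)
            for j, p in enumerate(row) if p > threshold]

P = correspondence_probabilities("ACDE", "ACDE")
print(significant_pairs(P))  # [(0, 0), (1, 1), (2, 2), (3, 3)]
```

Because each residue is aligned to at most one partner in any single alignment, every row of `P` sums to at most 1, so at most one partner per residue can pass the 0.5 threshold mentioned in the abstract.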

Journal ArticleDOI
TL;DR: Differences were found between the structure reported here and the previously reported 2.7 Å 4-4-20 Fab structure, which could be explained by differences in interpreting the electron density maps at the various resolutions.
Abstract: The crystal complex of fluorescein bound to the high-affinity anti-fluorescein 4-4-20 Fab (Ka = 10^10 M^-1 at 2 °C) has been determined at 1.85 Å. Isomorphous crystals of two isoelectric forms (pI = 7.5 and 7.9) of the anti-fluorescein 4-4-20 Fab, an IgG2A [Gibson et al. (1988) Proteins: Struct. Funct. Genet., 3, 155-160], have been grown. Both complexes crystallize with one molecule in the asymmetric unit in space group P1, with a = 42.75 Å, b = 43.87 Å, c = 58.17 Å, alpha = 95.15°, beta = 86.85° and gamma = 98.01°. The final structure has an R value of 0.188 at 1.85 Å resolution. Interactions between bound fluorescein, the complementarity-determining regions (CDRs) of the Fab and the active-site mutants of the 4-4-20 single-chain Fv will be discussed. Differences were found between the structure reported here and the previously reported 2.7 Å 4-4-20 Fab structure [Herron et al. (1989) Proteins: Struct. Funct. Genet., 5, 271-280]. Our structure determination was based on 26,328 unique reflections, four times the amount of data used in the previous report. Differences in the two structures could be explained by differences in interpreting the electron density maps at the various resolutions. The r.m.s. deviations between the variable and constant domains of the two structures were 0.77 and 1.54 Å, respectively. Four regions of the light chain and four regions of the heavy chain had r.m.s. backbone deviations of > 4 Å. The most significant of these was the conformation of the light chain CDR 1.

Journal ArticleDOI
TL;DR: A new method of protein structure comparison based on spatial arrangements of secondary structural elements (SSEs) that has a flexible target function that can be adjusted depending upon particular levels or definitions of structural similarity, and it is fast enough to allow structural comparisons for many pairs of proteins.
Abstract: We have developed a new method of protein structure comparison based on spatial arrangements of secondary structural elements (SSEs). Each SSE is represented by a single vector, and common spatial arrangements of vectors in a pair of proteins are detected. The method allows not only insertions and deletions of SSEs, but also topological permutations. It has a flexible target function that can be adjusted depending upon particular levels or definitions of structural similarity, and it is fast enough to allow structural comparisons for many pairs of proteins. The parameters for the target function are determined based on distributions of the geometrical variables for the spatial arrangements of the equivalent SSEs in well-known structural motifs. The obtained parameter set is tuned for detecting relatively strong structural similarity. We report several tests on examples including comparisons of known structural similarity and database searches for a target structure, and examine the results when this parameter set is used for the comparison of distantly related structures.

Journal ArticleDOI
TL;DR: A simple code for DNA recognition by transcription factors does seem to exist, and the recognition rules allow us to predict DNA-protein interactions, to change the binding specificity of an existing transcription factor, and probably even to design in a rational way a new protein which binds to a particular DNA sequence.
Abstract: Introduction Over 35 years have passed since the 'central dogma' of molecular biology (DNA makes RNA makes protein) was proposed (Crick, 1958). Despite its remarkable verification, it is being seen increasingly as limited, for if the whole flow of information in a cell were unidirectional, all cells with the same complement of genetic material would have identical function and morphology. The truth is manifestly otherwise. A group of proteins, transcription factors, selects the information used in cells by specifically binding to 'regulatory' DNA sequences. Among other effects, this causes the differentiation of cells. These factors act as the final messenger in a transduction pathway of signals which come from outside the cell. Thus, gene expression can be regulated by the environment. Recognition between a transcription factor and its target DNA is achieved through the physical interaction of the two molecules. Since the structures of both DNA and proteins are determined by their primary sequences, there must be a set of rules to describe DNA-protein interactions entirely on the basis of sequences. The fundamental question is whether these rules are simple and comprehensible, such that the DNA recognition code can be compared with the triplet code which summarizes the rules of how DNA and protein sequences are related in the central dogma. As we review in this paper, a simple code for DNA recognition by transcription factors does seem to exist. In fact, the recognition rules allow us (i) to predict DNA-protein interactions, (ii) to change the binding specificity of an existing transcription factor, and (iii) probably even to design in a rational way a new protein which binds to a particular DNA sequence. The code has been derived from crystal structures of transcription factor-DNA complexes (Table I) and the vast body of biochemical, genetic and statistical information about the binding specificity of transcription factors. 
Most of the transcription factors discussed here use an α-helix, which binds to the DNA major groove, for recognition. Those proteins which have a 'recognition helix' discussed here fall mainly into four families: probe helix (PH), helix-turn-helix (HTH), zinc finger (ZnF) and C4 Zn binding proteins (C4). There is, in addition, one transcription factor family described that uses a β-sheet, the MetJ repressor-like (MR) family. [See Table I for members of these and other families. Note that (i) individual Zn fingers are further subdivided into A and B fingers, AF and BF (Suzuki et al., 1994a), (ii) the PH family includes homeodomain and basic-zipper proteins (Suzuki, 1993) and (iii) the C4 family includes the hormone receptors and the GATA proteins (Suzuki and Chothia, 1994).]

Journal ArticleDOI
TL;DR: This work constructs dsFv immunotoxins in which the Fv moiety is fused to a truncated form of Pseudomonas exotoxin, and demonstrates that position H44-L105 is the only one which gives high production yields of active dsFv; all other positions gave either low yields and activity or completely failed to produce active dsFv.
Abstract: Using molecular modeling technology we have recently identified positions in conserved framework regions of Fvs which can be used to stabilize antibody Fvs by an interchain disulfide bond engineered in between the structurally conserved framework positions of the variable domains of heavy (VH) and light (VL) immunoglobulin chains (disulfide-stabilized Fv; dsFv). The computer model indicated the existence of other potential sites in the framework regions that might be suitable for disulfide bond formation between VH and VL. The possibility of obtaining dsFvs using these positions is evaluated here experimentally by constructing dsFv immunotoxins in which the Fv moiety is fused to a truncated form of Pseudomonas exotoxin. We analyzed the extent of dsFv formation and the activity of the resulting dsFv immunotoxins, and compared various dsFv molecules with the scFv immunotoxin. Our results demonstrate that position H44-L105 is the only one which gives high production yields of active dsFv. All other positions gave either low yields and activity or completely failed to produce active dsFv. With one exception, the formation and activities of the dsFvs corresponded to the C alpha-C alpha distance between the VH and VL positions, with an optimal distance of 5.7 Å producing the best dsFv. Distances of 6.0-6.9 Å resulted in a low yield of protein that was still capable of binding antigen, whereas distances > 7.0 Å resulted in molecules in which dsFv formation was not obtained.
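The distance ranges reported above can be written down as a small lookup. The following is only an illustrative sketch: the function names, the 0.05 tolerance around the optimal distance and the "not covered" fallback are assumptions made here, not part of the published work, and the abstract itself notes one exception to the trend.

```python
import math

def ca_distance(ca_vh, ca_vl):
    """Euclidean distance between two C alpha positions given as (x, y, z)."""
    return math.dist(ca_vh, ca_vl)

def expected_outcome(distance):
    """Map a C alpha-C alpha distance to the outcome reported in the abstract.

    The tolerance around 5.7 is an assumption for illustration; distances
    the abstract does not mention fall through to the last branch.
    """
    if math.isclose(distance, 5.7, abs_tol=0.05):
        return "best dsFv"
    if 6.0 <= distance <= 6.9:
        return "low yield, still binds antigen"
    if distance > 7.0:
        return "no dsFv formation"
    return "not covered by the reported ranges"

# Hypothetical coordinates, chosen only so the distance is easy to check by hand.
d = ca_distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
outcome = expected_outcome(6.5)
```

Distances between the reported ranges are deliberately left unclassified, since the abstract gives no result for them.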

Journal ArticleDOI
TL;DR: It was concluded that gentle removal of urea from denatured proteins, dissolved in concentrated urea solution, by means of dialysis should be useful to renature denatured proteins effectively.
Abstract: To increase the folding yield of concentrated reduced lysozyme, we developed a renaturation method by means of dialysis from concentrated urea with redox agents. After lysozyme was incubated in the reducing buffer (8 M urea solution) with oxidized glutathione, renaturation of reduced lysozyme was started by dialysis against the dialyzing buffer containing 8 M urea with redox agents. The urea in the dialyzing bottle was gradually diluted with dialyzing buffer without urea at a flow rate of 0.1 ml/min by a high-pressure pump. Using this systematic dialysis, a concentration as high as 5 mg/ml of reduced lysozyme could be renatured in 80% yield, while the folding yield was < 5% even at a concentration of 1 mg/ml using a conventional rapid dilution method [Goldberg et al. (1991) Biochemistry, 30, 2790-2797]. Therefore, it was concluded that gentle removal of urea from denatured proteins, dissolved in concentrated urea solution, by means of dialysis should be useful to renature denatured proteins effectively.

Journal ArticleDOI
TL;DR: The success of the present preliminary work on protein structure class prediction suggests that further refinements of method may lead to improved predictions and this is currently being investigated.
Abstract: Most globular proteins can be classified into one of four structural classes--all-alpha, all-beta, alpha + beta and alpha/beta--depending upon the type, amount and arrangement of secondary structures present. In this work a new method, based upon fuzzy clustering, is proposed for predicting the structural class of a protein from its amino acid composition. Here, each of the structural classes is described by a fuzzy cluster and each protein is characterized by its membership degree, a number between zero and one in each of the four clusters, with the constraint that the sum of the membership degrees equals unity. A given protein is then classified as belonging to that structural class corresponding to the fuzzy cluster with maximum membership degree. Calculation of membership degrees is carried out using the fuzzy c-means algorithm on a training set of 64 proteins. Results obtained for the training set show that the fuzzy clustering approach produces results comparable with or better than those obtained by other methods. A test set of 27 proteins also produced comparable results to those obtained with the training set. The success of the present preliminary work on protein structure class prediction suggests that further refinements of method may lead to improved predictions and this is currently being investigated.
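The classification rule described above (membership degrees between zero and one that sum to unity, with the protein assigned to the cluster of maximum membership) can be sketched as follows. This is a minimal sketch of the standard fuzzy c-means membership formula only, not the authors' implementation: the iterative update of cluster centres on the training set is omitted, the fuzzifier m = 2 is an assumed default, and the three-component vectors are hypothetical stand-ins for 20-component amino acid compositions.

```python
import math

CLASSES = ["all-alpha", "all-beta", "alpha + beta", "alpha/beta"]

def memberships(x, centres, m=2.0):
    """Fuzzy c-means membership degrees of point x in each cluster.

    u_j = 1 / sum_k (d_j / d_k) ** (2 / (m - 1)), where d_j is the distance
    from x to centre j. The degrees lie in [0, 1] and sum to one.
    """
    dists = [math.dist(x, c) for c in centres]
    # A point lying exactly on a centre belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]

def classify(x, centres, m=2.0):
    """Return the class with maximum membership, plus all membership degrees."""
    u = memberships(x, centres, m)
    best = max(range(len(u)), key=u.__getitem__)
    return CLASSES[best], u

# Hypothetical cluster centres, one per structural class (toy values).
toy_centres = [
    [0.6, 0.2, 0.2],
    [0.2, 0.6, 0.2],
    [0.4, 0.4, 0.2],
    [0.3, 0.3, 0.4],
]
label, degrees = classify([0.55, 0.25, 0.20], toy_centres)
```

With these toy values the query point lies closest to the first centre, so it is assigned to the first class.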

Journal ArticleDOI
TL;DR: Results demonstrate that the fourth and fifth epidermal growth factor (EGF)-like domains together comprise the smallest active fragment of human thrombomodulin (TM).
Abstract: Fragments of human thrombomodulin (TM) have been expressed in large quantities in the Pichia pastoris yeast expression system and purified to homogeneity. Fermentation of P. pastoris resulted in yields of 170 mg/l TM. Purification to homogeneity resulted in an overall 10% yield, so that quantities of approximately 20 mg purified fragments can be readily obtained. Smaller fragments of TM, such as the individual fourth or fifth domains, were not active, nor were equimolar mixtures of the two domains. These results demonstrate that the fourth and fifth epidermal growth factor (EGF)-like domains together comprise the smallest active fragment of TM. The fragment containing the fourth and fifth EGF-like domains [TMEGF(4-5)] had 10% the specific activity of rabbit TM. Comparison of the M388L mutant TMEGF(4-5) fragment with the same mutant TMEGF(4-5-6) fragment showed that the fragment with the sixth domain had a 10-fold better Km value for thrombin than the fragment that did not contain the sixth domain; this factor completely accounts for the higher specific activity of the fragments containing the sixth domain. Comparison of the wild-type and M388L mutants showed that the M388L mutation resulted in a 2-fold increase in kcat for the activation of protein C by the thrombin-TM fragment complex, completely accounting for the 2-fold increase in specific activity of these mutant fragments.

Journal ArticleDOI
TL;DR: Kinetic analysis of electron transfer between mutated pseudoazurins and NIR reveals that the lysine mutations have very little effect on the rate of electron transfer to NIR, but substitution at residues 10, 38, 57 and 77, all close to the copper site, substantially decreases the affinity of pseudoazurin for NIR.
Abstract: Pseudoazurin, a low molecular weight protein containing a single type I copper, functions as an electron donor to a copper-containing nitrite reductase (NIR) in a denitrifying bacterium Alcaligenes faecalis S-6. To elucidate the protein-protein interaction between these two copper-containing proteins, nine of the 13 lysine residues on the surface of pseudoazurin were each independently replaced by alanine or aspartate, and the effects of the mutations on the interaction with NIR, as well as the physicochemical properties of pseudoazurin, were analyzed. All of the mutated pseudoazurins showed optical spectra and oxidation-reduction potentials almost identical to those of wild-type pseudoazurin, suggesting that none of the replacements of these lysine residues affected the environment around the type I copper site. Kinetic analysis of electron transfer between mutated pseudoazurins and NIR reveals that the lysine mutations have very little effect on the rate of electron transfer to NIR, but substitution at residues 10, 38, 57 and 77, all close to the copper site, substantially decreases the affinity of pseudoazurin for NIR. This suggests that pseudoazurin interacts with NIR through the region close to the type I copper site. The refined X-ray structures of Lys38Asp and Lys10Asp/Lys38Asp show that the molecular structure has indeed changed little. A new space group is observed for the Lys109Ala mutant crystal. Crystal packing interactions change for the Lys10Asp/Lys38Asp mutant but remain the same for Lys38Asp and Lys59Ala mutants.

Journal ArticleDOI
TL;DR: The molecular evolution of both P450-containing systems and of each particular component does not follow phylogeny in general; recent studies using genetic and protein engineering techniques to investigate the separate domains and their interaction are described.
Abstract: All known P450-containing monooxygenase systems share common structural and functional domain architecture. Apart from P450 itself, these systems can comprise several fundamentally different protein components or domains, all of which are shared by other multicomponent/multidomain enzyme systems with various functions: FAD flavoprotein or domain, FMN domain, Fe2S2 ferredoxin, Fe3S4 ferredoxin, and cytochrome b5. Either FMN domain, ferredoxins or cytochrome b5 serve as the electron transport intermediate between the FAD domain and P450. The molecular evolution of both P450-containing systems and of each particular component does not follow phylogeny in general. Gene fusion and horizontal gene transfer events can lead to the appearance of novel redox chains in the same manner that artificial chimeric proteins can be constructed by humans. Recent studies using genetic and protein engineering techniques to investigate the separate domains and their interaction are described.

Journal ArticleDOI
TL;DR: A fully automatic procedure for aligning two protein structures is presented, which uses as sole structural similarity measure the root mean square deviation of superimposed backbone atoms and is designed to yield optimal solutions with respect to this measure.
Abstract: A fully automatic procedure for aligning two protein structures is presented. It uses as sole structural similarity measure the root mean square (rms) deviation of superimposed backbone atoms (N, C alpha, C and O) and is designed to yield optimal solutions with respect to this measure. In a first step, the procedure identifies protein segments with similar conformations in both proteins. In a second step, a novel multiple linkage clustering algorithm is used to identify segment combinations which yield optimal global structure alignments. Several structure alignments can usually be obtained for a given pair of proteins, which are exploited here to define automatically the common structural core of a protein family. Furthermore, an automatic analysis of the clustering trees is described which enables detection of rigid-body movements between structure elements. To illustrate the performance of our procedure, we apply it to families of distantly related proteins. One groups the three alpha + beta proteins ubiquitin, ferredoxin and the B1-domain of protein G. Their common structure motif consists of four beta-strands and the only alpha-helix, with one strand and the helix being displaced as a rigid body relative to the remaining three beta-strands. The other family consists of beta-proteins from the Greek key group, in particular actinoxanthin, the immunoglobulin variable domain and plastocyanin. Their consensus motif, composed of five beta-strands and a turn, is identified, mostly intact, in all Greek key proteins except the trypsins, and interestingly also in three other beta-protein families, the lipocalins, the neuraminidases and the lectins. This result provides new insights into the evolutionary relationships in the very diverse group of all beta-proteins.
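The similarity measure named above, the rms deviation of superimposed backbone atoms, can be sketched as below. The sketch assumes the two coordinate sets are already paired atom by atom and already superimposed; the superposition step, the segment search and the clustering algorithm of the paper are not reproduced, and the function name and toy coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired sets of (x, y, z) points.

    Assumes the sets are already superimposed and listed in matching order,
    e.g. the N, C alpha, C and O atoms of aligned residues.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

# Toy example: every atom in the second set is shifted by 1 unit along z.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
value = rmsd(set_a, set_b)
```

In practice an optimal rigid-body superposition would be computed first, since the rms deviation of unsuperimposed coordinates mostly reflects their arbitrary placement rather than structural difference.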

Journal ArticleDOI
TL;DR: A mean standard geometry for the GlcNAc moiety, along with a rationalization of its conformational behavior, can be proposed from the statistical analysis of 44 different glycosylation sites belonging to 26 glycoproteins of the Brookhaven Protein Data Bank.
Abstract: The stereochemical features displayed by the N-glycosidic linkage in crystalline N-linked glycoproteins are analyzed. From the statistical analysis of 44 different glycosylation sites belonging to 26 glycoproteins of the Brookhaven Protein Data Bank, a mean standard geometry for the GlcNAc moiety, along with a rationalization of its conformational behavior, can be proposed. As for the glycopeptide linkage, the distribution of observed conformations has been analyzed on the basis of molecular mechanics calculations. The rotamer distribution of the Asn side chains conforms to that observed on non-glycosylated structures, and it agrees with the pattern of flexible conformations gathered from NMR measurements. In characterizing the protein-glycan interactions, some hydrogen bonds occur. Stacking between the amphiphilic moiety of the glycan and some surrounding aromatic, or at least hydrophobic, amino acid residues is also found. When looking at the secondary structure of the glycosylated peptide, only 25% of the glycosylation sites correspond to situations where Asn is located at the top of a beta-turn. Other types of secondary structure exist which fulfill the spatial requirement of having the glycan exposed at the surface of the protein. These data can be compared with the most recent studies on the peptide conformation which would be required for glycosylation.