Showing papers by "Chris Sander" published in 1995

Open Access

Journal Article•DOI•

Dali: a network tool for protein structure comparison

Liisa Holm, Chris Sander

01 Nov 1995-Trends in Biochemical Sciences

1,444 citations

Journal Article•DOI•

The double cubic lattice method: Efficient approaches to numerical integration of surface area and volume and to dot surface contouring of molecular assemblies

Frank Eisenhaber¹, Philip Lijnzaad, Patrick Argos, Chris Sander, Michael Scharf•Institutions (1)

Humboldt University of Berlin¹

01 Mar 1995-Journal of Computational Chemistry

TL;DR: The double cubic lattice method (DCLM) is an accurate and rapid approach for computing numerically molecular surface areas and the volume and compactness of molecular assemblies and for generating dot surfaces, and is the method of choice, especially for large molecular complexes and high point densities.

Abstract: The double cubic lattice method (DCLM) is an accurate and rapid approach for computing numerically molecular surface areas (such as the solvent accessible or van der Waals surface) and the volume and compactness of molecular assemblies and for generating dot surfaces. The algorithm has no special memory requirements and can be easily implemented. The computation speed is extremely high, making interactive calculation of surfaces, volumes, and dot surfaces for systems of 1000 and more atoms possible on single-processor workstations. The algorithm can be easily parallelized. The DCLM is an algorithmic variant of the approach proposed by Shrake and Rupley (J. Mol. Biol., 79, 351–371, 1973). However, the application of two cubic lattices (one for grouping neighboring atomic centers and the other for grouping neighboring surface dots of an atom) results in a drastic reduction of central processing unit (CPU) time consumption by avoiding redundant distance checks. This is most noticeable for compact conformations. For instance, the calculation of the solvent accessible surface area of the crystal conformation of bovine pancreatic trypsin inhibitor (entry 4PTI of the Brookhaven Protein Data Bank, 362-point sphere for all 454 nonhydrogen atoms) takes less than 1 second (on a single R3000 processor of an SGI 4D/480, about 5 MFLOP). The DCLM does not depend on the spherical point distribution applied. The quality of unit sphere tessellations is discussed. We propose new ways of subdivision based on the icosahedron and dodecahedron, which achieve constantly low ratios of longest to shortest arcs over the whole frequency range. The DCLM is the method of choice, especially for large molecular complexes and high point densities. Its speed has been compared to the fastest techniques known to the authors, and it was found to be superior, especially when also taking into account the small memory requirement and the flexibility of the algorithm.
The program text may be obtained on request. © 1995 by John Wiley & Sons, Inc.
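
The Shrake and Rupley scheme that the DCLM builds on can be sketched briefly. The following is an illustration only, not the authors' program: it uses a single cubic lattice to find neighboring atom centers (the DCLM's second lattice, which groups the surface dots of each atom, is omitted), a golden-spiral point distribution rather than the icosahedron- or dodecahedron-based subdivisions discussed in the paper, and hypothetical names (`sphere_points`, `accessible_area`) with an assumed 1.4 Å probe radius.

```python
import math
from collections import defaultdict


def sphere_points(n):
    """Roughly uniform points on the unit sphere (golden-spiral layout, an assumption)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        phi = golden_angle * i
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points


def accessible_area(atoms, probe=1.4, n_points=362):
    """Solvent accessible area for atoms given as (x, y, z, radius) tuples."""
    radii = [a[3] + probe for a in atoms]  # probe-expanded radii
    cell = 2.0 * max(radii)  # overlapping spheres then sit in adjacent cells

    def cell_of(a):
        return tuple(math.floor(c / cell) for c in a[:3])

    # Cubic lattice over atom centers: each cell lists the atoms inside it.
    grid = defaultdict(list)
    for i, a in enumerate(atoms):
        grid[cell_of(a)].append(i)

    offsets = [(dx, dy, dz) for dx in (-1, 0, 1)
               for dy in (-1, 0, 1) for dz in (-1, 0, 1)]
    unit = sphere_points(n_points)
    total = 0.0
    for i, (x, y, z, _) in enumerate(atoms):
        ri = radii[i]
        cx, cy, cz = cell_of(atoms[i])
        # Only atoms in the 27 surrounding cells can overlap atom i.
        neighbours = []
        for dx, dy, dz in offsets:
            for j in grid.get((cx + dx, cy + dy, cz + dz), ()):
                if j == i:
                    continue
                xj, yj, zj, _ = atoms[j]
                d2 = (x - xj) ** 2 + (y - yj) ** 2 + (z - zj) ** 2
                if d2 < (ri + radii[j]) ** 2:
                    neighbours.append((xj, yj, zj, radii[j] ** 2))
        # A surface dot is exposed if it lies outside every neighbouring sphere.
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = x + ri * ux, y + ri * uy, z + ri * uz
            if all((px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 >= r2
                   for xj, yj, zj, r2 in neighbours):
                exposed += 1
        total += 4.0 * math.pi * ri * ri * exposed / n_points
    return total
```

For an isolated atom every dot is exposed, so the result reduces to the area of the probe-expanded sphere; two overlapping atoms give less than twice that value.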

805 citations

Journal Article•DOI•

A method to predict functional residues in proteins

Georg Casari¹, Chris Sander¹, Alfonso Valencia¹•Institutions (1)

European Bioinformatics Institute¹

01 Feb 1995-Nature Structural & Molecular Biology

TL;DR: A novel method is presented that exploits conservation patterns for the prediction of functional residues in SH2 domains and in the conserved box of cyclins, using a simple but powerful representation of entire proteins, as well as sequence residues as vectors in a generalised ‘sequence space’.

Abstract: The biological activity of a protein typically depends on the presence of a small number of functional residues. Identifying these residues from the amino acid sequences alone would be useful. Classically, strictly conserved residues are predicted to be functional but often conservation patterns are more complicated. Here, we present a novel method that exploits such patterns for the prediction of functional residues. The method uses a simple but powerful representation of entire proteins, as well as sequence residues as vectors in a generalised 'sequence space'. Projection of these vectors onto a lower-dimensional space reveals groups of residues specific for particular subfamilies that are predicted to be directly involved in protein function. Based on the method we present testable predictions for sets of functional residues in SH2 domains and in the conserved box of cyclins.
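
The idea of treating sequences as vectors and projecting them onto a lower-dimensional space can be illustrated with a toy example. This is a sketch under stated assumptions, not the published method: it uses a one-hot encoding of aligned, gap-free sequences and plain principal component analysis (via power iteration) as a stand-in for the paper's projection, and the six sequences and all function names are invented for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def encode(seq):
    """One-hot encode an aligned, gap-free sequence as a flat 0/1 vector."""
    vec = [0.0] * (len(seq) * len(AMINO_ACIDS))
    for pos, aa in enumerate(seq):
        vec[pos * len(AMINO_ACIDS) + AMINO_ACIDS.index(aa)] = 1.0
    return vec


def center(rows):
    """Subtract the column means so the projection captures variation only."""
    n = len(rows)
    mean = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, mean)] for row in rows]


def first_axis(rows, iters=500, seed=0):
    """Leading principal axis of the centered rows, by power iteration."""
    rng = random.Random(seed)
    axis = [rng.uniform(-1.0, 1.0) for _ in rows[0]]
    for _ in range(iters):
        scores = [sum(x * a for x, a in zip(row, axis)) for row in rows]
        axis = [sum(s * row[j] for s, row in zip(scores, rows))
                for j in range(len(axis))]
        norm = math.sqrt(sum(a * a for a in axis))
        axis = [a / norm for a in axis]
    return axis


def analyse(seqs):
    """Return per-sequence scores and the two residues with the largest loadings."""
    rows = center([encode(s) for s in seqs])
    axis = first_axis(rows)
    scores = [sum(x * a for x, a in zip(row, axis)) for row in rows]
    top = sorted(range(len(axis)), key=lambda j: -abs(axis[j]))[:2]
    residues = sorted((j // len(AMINO_ACIDS), AMINO_ACIDS[j % len(AMINO_ACIDS)])
                      for j in top)
    return scores, residues


# Invented toy family: subfamily A has K at position 2, subfamily B has E.
toy = ["ACKDG", "ACKNG", "ACKDG", "ACEDG", "ACEDG", "ACENG"]
scores, residues = analyse(toy)
```

In this toy case the first axis separates the two subfamilies by sign, and the residues with the largest loadings are the K and E at the position that distinguishes them, which is the kind of subfamily-specific signal the abstract describes.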

428 citations

Journal Article•DOI•

DNA polymerase β belongs to an ancient nucleotidyltransferase superfamily

Liisa Holm, Chris Sander

01 Sep 1995-Trends in Biochemical Sciences

248 citations

Journal Article•DOI•

A Sequence Property Approach to Searching Protein Databases

Uwe Hobohm¹, Chris Sander¹•Institutions (1)

European Bioinformatics Institute¹

18 Aug 1995-Journal of Molecular Biology

TL;DR: This work shows that members of structural protein families have a low mutual PropSearch distance when the weights are optimized to discriminate maximally between structural families, and demonstrates the results of database searches using the PropSearch method.

181 citations

Journal Article•DOI•

The use of position-specific rotamers in model building by homology.

Glay Chinea¹, Gabriel Padrón¹, Rob Hooft², Chris Sander², Gerrit Vriend²•Institutions (2)

Biotec¹, European Bioinformatics Institute²

01 Nov 1995-Proteins

TL;DR: This study focuses on replacing side chains as a subtask of model building by homology by choosing position‐specific rather than generalized rotamers and by sorting the residues that have to be modelled as a function of their freedom in rotamer space.

Abstract: In this study we concentrate on replacing side chains as a subtask of model building by homology. Two problems arise. How to determine potential low energy rotamers? And how to avoid the combinatorial explosion that results from the combination of many residues for which multiple good rotamers are predicted? We attempt to solve these problems by choosing position-specific rather than generalized rotamers and by sorting the residues that have to be modelled as a function of their freedom in rotamer space. The practical advantages of our method are the quality of the models for cases of high backbone similarity, the small amount of human intervention needed, and the fact that the method automatically estimates the reliability with which each residue has been modeled. Other methods described in this issue are probably more suitable if large backbone rearrangements or loop insertions and deletions need to be modeled. © 1995 Wiley-Liss, Inc.

142 citations

Journal Article•DOI•

Progress of 1D protein structure prediction at last.

Burkhard Rost¹, Chris Sander¹•Institutions (1)

European Bioinformatics Institute¹

01 Nov 1995-Proteins

TL;DR: Accuracy of predicting protein secondary structure and solvent accessibility from sequence information has been improved significantly by using information contained in multiple sequence alignments as input to a neural network system.

Abstract: Accuracy of predicting protein secondary structure and solvent accessibility from sequence information has been improved significantly by using information contained in multiple sequence alignments as input to a neural network system. For the Asilomar meeting, predictions for 13 proteins were generated automatically using the publicly available prediction method PHD. The results confirm the estimate of 72% three-state prediction accuracy. The fairly accurate predictions of secondary structure segments made the tool useful as a starting point for modeling of higher dimensional aspects of protein structure. © 1995 Wiley-Liss, Inc.

98 citations

Journal Article•DOI•

Exploring the Mycoplasma capricolum genome: a minimal cell reveals its physiology.

Peer Bork¹, Christos A. Ouzounis², Georg Casari², Reinhard Schneider³, Chris Sander², Maureen Dolan⁴, Walter Gilbert⁴, P. Gillevet⁵•Institutions (5)

Max Delbrück Center for Molecular Medicine¹, European Bioinformatics Institute², University of Luxembourg³, Harvard University⁴, George Mason University⁵

01 Jun 1995-Molecular Microbiology

TL;DR: This survey is beginning to provide a detailed view of how M. capricolum manages to maintain essential cellular processes with a genome much smaller than that of its bacterial relatives.

Abstract: We report on the analysis of 214 kb of the parasitic eubacterium Mycoplasma capricolum sequenced by genomic walking techniques. The 287 putative proteins detected to date represent about half of the estimated total number of 500 predicted for this organism. A large fraction of these (75%) can be assigned a likely function as a result of similarity searches. Several important features of the functional organization of this small genome are already apparent. Among these are (i) the expected relatively large number of enzymes involved in metabolic transport and activation, for efficient use of host cell nutrients; (ii) the presence of anabolic enzymes; (iii) the unexpected diversity of enzymes involved in DNA replication and repair; and (iv) a sizeable number of orthologues (82 so far) in Escherichia coli. This survey is beginning to provide a detailed view of how M. capricolum manages to maintain essential cellular processes with a genome much smaller than that of its bacterial relatives.

98 citations

Journal Article•DOI•

Challenging times for bioinformatics

Georg Casari, Miguel A. Andrade, Peer Bork¹, John Boyle, Antoine de Daruvar, Christos A. Ouzounis², Reinhard Schneider³, Javier Tamames⁴, Alfonso Valencia⁴, Chris Sander•Institutions (4)

Max Delbrück Center for Molecular Medicine¹, SRI International², European Bioinformatics Institute³, Spanish National Research Council⁴

24 Aug 1995-Nature

91 citations

Journal Article•DOI•

The cytidylyltransferase superfamily: Identification of the nucleotide‐binding site and fold prediction

Peer Bork¹, Liisa Holm¹, Eugene V. Koonin², Chris Sander¹•Institutions (2)

European Bioinformatics Institute¹, National Institutes of Health²

01 Jul 1995-Proteins

TL;DR: The proposed 3D model of TagD is plausible both structurally, with a well packed hydrophobic core, and functionally, as the most conserved residues cluster around the putative nucleotide binding site.

Abstract: The crystal structure of glycerol-3-phosphate cytidylyltransferase from B. subtilis (TagD) is about to be solved. Here, we report a testable structure prediction based on the identification by sequence analysis of a superfamily of functionally diverse but structurally similar nucleotide-binding enzymes. We predict that TagD is a member of this family. The most conserved region in this superfamily resembles the ATP-binding HiGH motif of class I aminoacyl-tRNA synthetases. The predicted secondary structure of cytidylyltransferase and its homologues is compatible with the alpha/beta topography of the class I aminoacyl-tRNA synthetases. The hypothesis of similarity of fold is strengthened by sequence-structure alignment and 3D model building using the known structure of tyrosyl tRNA synthetase as template. The proposed 3D model of TagD is plausible both structurally, with a well packed hydrophobic core, and functionally, as the most conserved residues cluster around the putative nucleotide binding site. If correct, the model would imply a very ancient evolutionary link between class I tRNA synthetases and the novel cytidylyltransferase superfamily.

89 citations

3-D lookup: Fast protein structure database searches

Liisa Holm, Chris Sander

31 Dec 1995

TL;DR: This work presents a novel heuristic for identifying 3-D similarities between a query structure and the database of known protein structures, which is useful as a rapid preprocessor to a comprehensive protein structure database search system.

Abstract: There are far fewer classes of three-dimensional protein folds than sequence families but the problem of detecting three-dimensional similarities is NP-complete. We present a novel heuristic for identifying 3-D similarities between a query structure and the database of known protein structures. Many methods for structure alignment use a bottom-up approach, identifying first local matches and then solving a combinatorial problem in building up larger clusters of matching substructures. Here, the top-down approach is to start with the global comparison and select a rough superimposition using a fast 3-D lookup of secondary structure motifs. The superimposition is then extended to an alignment of C alpha atoms by an iterative dynamic programming step. An all-against-all comparison of 385 representative proteins (150,000 pair comparisons) took 1 day of computer time on a single R8000 processor. In other words, one query structure is scanned against the database in a matter of minutes. The method is rated at 90% reliability at capturing statistically significant similarities. It is useful as a rapid preprocessor to a comprehensive protein structure database search system.
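
The "matter of minutes" figure follows from the numbers in the abstract by simple arithmetic. The sketch below assumes that the quoted 150,000 comparisons correspond to the roughly 385 × 385 ordered pairs and that "1 day" means 24 hours; both are readings of the abstract rather than figures it states, and the variable names are illustrative.

```python
# Back-of-the-envelope check of the per-query scan time.
n_proteins = 385
ordered_pairs = n_proteins ** 2           # 148,225, close to the quoted 150,000
seconds_per_day = 24 * 60 * 60            # assumed meaning of "1 day"

seconds_per_pair = seconds_per_day / ordered_pairs
minutes_per_query = seconds_per_pair * n_proteins / 60

print(f"{ordered_pairs} pairs, {seconds_per_pair:.2f} s per pair, "
      f"{minutes_per_query:.1f} min per query")
```

Under these assumptions one query against all 385 representatives takes a little under four minutes.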

Proceedings Article•

3-D lookup: fast protein structure database searches at 90% reliability.

Liisa Holm, Chris Sander

01 Jan 1995

TL;DR: A top-down heuristic is proposed for identifying 3-D similarities between a query structure and the database of known protein structures, a problem whose exact solution is NP-complete.

Abstract: There are far fewer classes of three-dimensional protein folds than sequence families but the problem of detecting three-dimensional similarities is NP-complete. We present a novel heuristic for identifying 3-D similarities between a query structure and the database of known protein structures. Many methods for structure alignment use a bottom-up approach, identifying first local matches and then solving a combinatorial problem in building up larger clusters of matching substructures. Here, the top-down approach is to start with the global comparison and select a rough superimposition using a fast 3-D lookup of secondary structure motifs. The superimposition is then extended to an alignment of C alpha atoms by an iterative dynamic programming step. An all-against-all comparison of 385 representative proteins (150,000 pair comparisons) took 1 day of computer time on a single R8000 processor. In other words, one query structure is scanned against the database in a matter of minutes. The method is rated at 90% reliability at capturing statistically significant similarities. It is useful as a rapid preprocessor to a comprehensive protein structure database search system.

Journal Article•DOI•

Evolutionary link between glycogen phosphorylase and a DNA modifying enzyme.

Liisa Holm¹, Chris Sander¹•Institutions (1)

European Bioinformatics Institute¹

03 Apr 1995-The EMBO Journal

TL;DR: An unexpected similarity in three‐dimensional structure is reported between glucosyltransferases involved in very different biochemical pathways, with interesting evolutionary and functional implications; the two enzymes are proposed to derive from a common ancient evolutionary ancestor.

Abstract: We report here an unexpected similarity in three-dimensional structure between glucosyltransferases involved in very different biochemical pathways, with interesting evolutionary and functional implications. One is the DNA modifying enzyme beta-glucosyltransferase from bacteriophage T4, alias UDP-glucose:5-hydroxymethyl-cytosine beta-glucosyltransferase. The other is the metabolic enzyme glycogen phosphorylase, alias 1,4-alpha-D-glucan:orthophosphate alpha-glucosyltransferase. Structural alignment revealed that the entire structure of beta-glucosyltransferase is topographically equivalent to the catalytic core of the much larger glycogen phosphorylase. The match includes two domains in similar relative orientation and connecting helices, with a positional root-mean-square deviation of only 3.4 Å for 256 C alpha atoms. An interdomain rotation seen in the R- to T-state transition of glycogen phosphorylase is similar to that observed in beta-glucosyltransferase on substrate binding. Although not a single functional residue is identical, there are striking similarities in the spatial arrangement and in the chemical nature of the substrates. The functional analogies are (beta-glucosyltransferase / glycogen phosphorylase): ribose ring of UDP / pyridoxal ring of pyridoxal phosphate co-enzyme; phosphates of UDP / phosphate of co-enzyme and reactive orthophosphate; glucose unit transferred to DNA / terminal glucose unit extracted from glycogen. We anticipate the discovery of additional structurally conserved members of the emerging glucosyltransferase superfamily derived from a common ancient evolutionary ancestor of the two enzymes.

Journal Article•DOI•

Novel protein families in archaean genomes

Christos A. Ouzounis, Nikos C. Kyrpides, Chris Sander

25 Feb 1995-Nucleic Acids Research

TL;DR: It is shown that the putative laminin receptor family of eukaryotes and an archaean homologue belong to the previously characterized ribosomal protein family S2 from eubacteria, suggesting that archaea seem to have a mode of expression of genetic information rather similar to that of eukaryotes, while eubacteria may have proceeded into unique ways of transcription and translation.

Abstract: In a quest for novel functions in archaea, all archaean hypothetical open reading frames (ORFs), as annotated in the Swiss-Prot protein sequence database, were used to search the latest databases for the identification of characterized homologues. Of the 95 hypothetical archaean ORFs, 25 were found to be homologous to another hypothetical archaean ORF, while 36 were homologous to non-archaean proteins, of which as many as 30 were homologous to a characterized protein family. Thus the level of sequence similarity in this set reaches 64%, while the level of function assignment is only 32%. Of the ORFs with predicted functions, 12 homologies are reported here for the first time and represent nine new functions and one gene duplication at an acetyl-coA synthetase locus. The novel functions include components of the transcriptional and translational apparatus, such as ribosomal proteins, modification enzymes and a translation initiation factor. In addition, new enzymes are identified in archaea, such as cobyric acid synthase, dCTP deaminase and the first archaean homologues of a new subclass of ATP binding proteins found in fungi. Finally, it is shown that the putative laminin receptor family of eukaryotes and an archaean homologue belong to the previously characterized ribosomal protein family S2 from eubacteria. From the present and previous work, the major implication is that archaea seem to have a mode of expression of genetic information rather similar to eukaryotes, while eubacteria may have proceeded into unique ways of transcription and translation. In addition, with the detection of proteins in various metabolic and genetic processes in archaea, we can further predict the presence of additional proteins involved in these processes.
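
The two percentages quoted in the abstract can be reproduced from its ORF counts. The short check below is only a reading of those numbers (the variable names are illustrative): the similarity level counts ORFs with either an archaean or a non-archaean homologue, and the function-assignment level counts those matching a characterized protein family.

```python
# ORF counts taken from the abstract.
total_orfs = 95
archaean_hypothetical_homologues = 25
non_archaean_homologues = 36
characterized_family_homologues = 30

with_any_homologue = archaean_hypothetical_homologues + non_archaean_homologues
similarity_level = 100 * with_any_homologue / total_orfs
function_level = 100 * characterized_family_homologues / total_orfs

print(f"similarity: {similarity_level:.0f}%, function assignment: {function_level:.0f}%")
```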

Journal Article•DOI•

New protein functions in yeast chromosome VIII

Christos A. Ouzounis, Peer Bork, Georg Casari, Chris Sander

01 Nov 1995-Protein Science

TL;DR: The analysis of the 269 open reading frames of yeast chromosome VIII by computational methods has yielded 24 new significant sequence similarities to proteins of known function, including peptidyl‐tRNA hydrolase, a ribosome recycling factor homologue, and a protein similar to cytochrome b translational activator CBS2.

Abstract: The analysis of the 269 open reading frames of yeast chromosome VIII by computational methods has yielded 24 new significant sequence similarities to proteins of known function. The resulting predicted functions include three particularly interesting cases of translation-associated proteins: peptidyl-tRNA hydrolase, a ribosome recycling factor homologue, and a protein similar to cytochrome b translational activator CBS2. The methodological limits of the meaningful transfer of functional information between distant homologues are discussed.

Journal Article•DOI•

A Drosophila hsp70 gene contains long, antiparallel, coupled open reading frames (LAC ORFs) conserved in homologous loci.

Irene Konstantopoulou¹, Christos A. Ouzounis, Elena Drosopoulou¹, Minas Yiangou¹, Paschalis Sideras², Chris Sander, Zacharias G. Scouras¹•Institutions (2)

Aristotle University of Thessaloniki¹, Umeå University²

01 Oct 1995-Journal of Molecular Evolution

TL;DR: Computational analysis shows that this LAC ORF arrangement is conserved in other hsp70 loci in a wide range of organisms, raising questions about possible evolutionary benefits of such a peculiar genomic organization.

Abstract: A clone isolated from a Drosophila auraria heat-shock cDNA library presents two long, antiparallel, coupled (LAC) open reading frames (ORFs). One strand ORF is 1,929 nucleotides long and exhibits great identity (87.5% at the nucleotide level and 94% at the amino acid level) with the hsp70 gene copies of D. melanogaster, while the second strand ORF, in antiparallel in-frame register arrangement, is 1,839 nucleotides long and exhibits 32% identity with a putative, recently identified, NAD+-dependent glutamate dehydrogenase (NAD+-GDH). The overlap of the two ORFs is 1,824 nucleotides long. Computational analysis shows that this LAC ORF arrangement is conserved in other hsp70 loci in a wide range of organisms, raising questions about possible evolutionary benefits of such a peculiar genomic organization.

Journal Article•DOI•

Nucleotide sequence and analysis of the centromeric region of yeast chromosome IX

H. Voss, Javier Tamames, C. Teodoru, A. Valencia, Christoph Wilhelm Sensen, Stefan Wiemann, Christian Schwager, Jürgen Zimmermann, Chris Sander, W. Ansorge

01 Jan 1995-Yeast

TL;DR: The nucleotide sequence of a cosmid containing the centromere region of yeast (Saccharomyces cerevisiae) chromosome IX is determined by using an efficient directed sequencing strategy in combination with automated DNA sequencing on the A.L.F. DNA sequencer.

Abstract: We have determined the nucleotide sequence of a cosmid (pIX338) containing the centromere region of yeast (Saccharomyces cerevisiae) chromosome IX. The complete nucleotide sequence of 33.8 kb was obtained by using an efficient directed sequencing strategy in combination with automated DNA sequencing on the A.L.F. DNA sequencer. Sequence analysis revealed the presence of 17 open reading frames (ORFs), four of them previously known yeast genes (sly12, pan1, sts1 and prl1), a tRNA gene and the centromere motif. Exhaustive database searches detected sequence homologues of known function for as many as 14 of the 17 ORFs. These include a mammalian tyrosine kinase substrate; the Escherichia coli cell cycle protein MinD; the human inositol polyphosphate-5-phosphatase (gene OCRL) involved in Lowe's syndrome, a developmental disorder; and helicases, for which the new yeast member defines a distinct DEAD/H-box subfamily. A surprisingly large fraction of the ORFs (at least six out of 17) in the centromeric region are apparently involved in RNA or DNA binding. The nucleotide sequence reported here has been submitted to the EMBL data library under the accession number X79743.

Journal Article•DOI•

Investigating the Structural Determinants of the p21-like Triphosphate and Mg2+ Binding Site

Philippe Cronet, Lluís Bellsolell¹, Lluís Bellsolell², Chris Sander, Miquel Coll¹, Miquel Coll², Luis Serrano•Institutions (2)

Polytechnic University of Catalonia¹, Spanish National Research Council²

09 Jun 1995-Journal of Molecular Biology

TL;DR: This work has engineered the Kinase 1 and 2 motifs into CheY, the chemotactic protein from Escherichia coli, which has the classical mononucleotide-binding fold (CMBF) but no nucleotide binding activity; the results demonstrate that the native structure of the P-loop requires external interactions with the rest of the protein.

Book Chapter•DOI•

The Functional Composition of Living Machines as a Design Principle for Artificial Organisms

Christos A. Ouzounis¹, Alfonso Valencia², Javier Tamames², Peer Bork³, Chris Sander⁴•Institutions (4)

SRI International¹, Autonomous University of Madrid², Max Delbrück Center for Molecular Medicine³, European Bioinformatics Institute⁴

04 Jun 1995

TL;DR: This subdivision of the genomes of four best known model organisms can form a design principle for the construction of computational models of genomes and organisms and, ultimately, the design and fabrication of artificial organisms.

Abstract: How similar are the engineering principles of artificial and natural machines? One way to approach this question is to compare in detail the basic functional components of living cells and human-made machines. Here, we provide some basic material for such a comparison, based on the analysis of functions for a few thousand protein molecules, the most versatile functional components of living cells. The composition of the genomes of four best known model organisms is analyzed and three major classes of molecular functions are defined: energy-, information- and communication-related. It is interesting that at the expense of the other two categories, communication-related coding potential has increased in relative numbers during evolution, and the progression from prokaryotes to eukaryotes and from unicellular to multi-cellular organisms. Based on the currently available data, 42% of the four genomes codes for energy-related proteins, 37% for information-related proteins, and the remaining 21% for communication-related proteins, on average. This subdivision, and future refinements thereof, can form a design principle for the construction of computational models of genomes and organisms and, ultimately, the design and fabrication of artificial organisms.
