Journal Article

On the Occasion of the Launch of J. Appl. Cryst.

10 Mar 1970-Vol. 12, Iss: 1, pp 1-1
About: The article was published on 1970-03-10 and is currently open access. It has received 8159 citations to date.
Citations
01 Jan 1998
TL;DR: This article presents an approach for assessing the significance of sequence and structure comparisons by using nearly identical statistical formalisms for both sequence and structure.
Abstract: We present an approach for assessing the significance of sequence and structure comparisons by using nearly identical statistical formalisms for both sequence and structure. Doing so involves an all-vs.-all comparison of protein domains (taken here from the Structural Classification of Proteins (scop) database) and then fitting a simple distribution function to the observed scores. By using this distribution, we can attach a statistical significance to each comparison score in the form of a P value, the probability that a better score would occur by chance. As expected, we find that the scores for sequence matching follow an extreme-value distribution. The agreement, moreover, between the P values that we derive from this distribution and those reported by standard programs (e.g., BLAST and FASTA) validates our approach. Structure comparison scores also follow an extreme-value distribution when the statistics are expressed in terms of a structural alignment score (essentially the sum of reciprocated distances between aligned atoms minus gap penalties). We find that the traditional metric of structural similarity, the rms deviation in atom positions after fitting aligned atoms, follows a different distribution of scores and does not perform as well as the structural alignment score. Comparison of the sequence and structure statistics for pairs of proteins known to be related distantly shows that structural comparison is able to detect approximately twice as many distant relationships as sequence comparison at the same error rate. The comparison also indicates that there are very few pairs with significant similarity in terms of sequence but not structure whereas many pairs have significant similarity in terms of structure but not sequence. Comparison is a most fundamental operation in biology.
Measuring the similarities between "things" enables us to group them in families, cluster them in trees, and infer common ancestors and an evolutionary progression. Biological comparisons can take place at many levels, from that of whole organisms to that of individual molecules. We are concerned here with the comparison on the latter level, specifically, with comparisons of individual protein sequences and structures.

295 citations

Journal Article
TL;DR: In this article, a procedure for the determination of all reference intensities of interest simultaneously is presented, and the maximum standard deviation of the matrix-flushing method has been estimated to be 8% relative.
Abstract: A set of reference intensities, k_i, are required for the quantitative interpretation of X-ray diffraction patterns of mixtures. Each k_i was heretofore determined individually from binary mixtures of a one-to-one weight ratio. A procedure for the determination of all k_i's of interest simultaneously is presented. The X-ray diffraction patterns of multicomponent mixtures usually contain overlapping peaks. This overlapping problem can be avoided by choosing an arbitrary reference material already present in the mixture and/or using the strongest resolved reflections directly. These concepts are substantiated by ten examples. The maximum standard deviation of the matrix-flushing method has been estimated to be 8% relative.

295 citations

Journal Article
TL;DR: The main-chain conformations of 237 384 amino acids in 1042 protein subunits from the PDB were analyzed with Ramachandran plots; the results may be useful for checking secondary-structure assignments in the PDB and for predicting protein folding.
Abstract: The main-chain conformations of 237 384 amino acids in 1042 protein subunits from the PDB were analyzed with Ramachandran plots. The populated areas of the empirical Ramachandran plot differed markedly from the classical plot in all regions. All amino acids in α-helices are found within a very narrow range of φ, ψ angles. As many as 40% of all amino acids are found in this most populated region, covering only 2% of the Ramachandran plot. The β-sheet region is clearly subdivided into two distinct regions. These do not arise from the parallel and antiparallel β-strands, which have quite similar conformations. One β region is mainly from amino acids in random coil. The third and smallest populated area of the Ramachandran plot, often denoted left-handed α-helix, has a different position than that originally suggested by Ramachandran. Each of the 20 amino acids has its own very characteristic Ramachandran plot. Most of the glycines have conformations that were considered to be less favoured. These results may be useful for checking secondary-structure assignments in the PDB and for predicting protein folding.

294 citations


Cites background or methods from "On the Occasion of the Launch of J. Appl. Cryst."

  • ...These do not arise from the parallel and antiparallel β-strands, which have quite similar conformations....

    [...]

  • ...Today, the Ramachandran plot is not only a basic diagram in textbooks on protein structures, but also a useful tool for assessing the correctness of a protein structure determination (Laskowski et al., 1993; Morris et al., 1992)....

    [...]

Journal Article
TL;DR: The difference in stability between recombinant molecules with and without the N cap sequences suggests that additional free energy for membrane fusion may become available after the formation of the central triple-stranded coiled coil and insertion of the fusion peptide into the target membrane.
Abstract: The structure of a stable recombinant ectodomain of influenza hemagglutinin HA2 subunit, EHA2 (23–185), defined by proteolysis studies of the intact bacterial-expressed ectodomain, was determined to 1.9-Å resolution by using x-ray crystallography. The structure reveals a domain composed of N- and C-terminal residues that form an N cap terminating both the N-terminal α-helix and the central coiled coil. The N cap is formed by a conserved sequence, and part of it is found in the neutral pH conformation of HA. The C-terminal 23 residues of the ectodomain form a 72-Å-long nonhelical structure ordered to within 7 residues of the transmembrane anchor. The structure implies that continuous α helices are not required for membrane fusion at either the N or C termini. The difference in stability between recombinant molecules with and without the N cap sequences suggests that additional free energy for membrane fusion may become available after the formation of the central triple-stranded coiled coil and insertion of the fusion peptide into the target membrane.

292 citations

Journal Article
TL;DR: The fast Fourier transform autoindexing routines written by the Rossmann group at Purdue University have been incorporated in MOSFLM, providing a rapid and reliable method of indexing oscillation images.
Abstract: The fast Fourier transform (FFT) autoindexing routines written by the Rossmann group at Purdue University have been incorporated in MOSFLM, providing a rapid and reliable method of indexing oscillation images. This is a procedure which extracts direct-space information about the unit cell from the FFT. The method and its implementation in MOSFLM are discussed.

290 citations


Cites methods from "On the Occasion of the Launch of J. Appl. Cryst."

  • ...This choice is repeated for different combinations of the real-space vectors and a test is applied to determine the deviation from integer values of the indices calculated from (4); the solution with the smallest number of reflections which have deviations on any one axis greater than a chosen cut-off (0.2 in this case) is chosen and passed back to MOSFLM. $h = A^{-1} U^{-1} x$ (4), where $h$ = approximate Miller indices of a reflection with reciprocal lattice constants $x$; $U$ = matrix for rotation of $\varphi$ around the crystal oscillation axis. The Bravais lattice routines are then called from MOSFLM; solutions for each of the 44 lattice characters are calculated from the basis triclinic lattice in order to provide a list of possible solutions of higher symmetry together with their discrepancy indices....

    [...]

  • ...The method and its implementation in MOSFLM are discussed....

    [...]

  • ...The mechanics of some of these black boxes have been discussed at length elsewhere (Duisenberg, 1992; Higashi, 1990; Kim, 1989; Macíček & Yordanov, 1992); for example, the existing autoindexing method distributed with MOSFLM (Leslie, 1992) relies on the calculation of many difference vectors between diffraction maxima in reciprocal space (Kabsch, 1993)....

    [...]

  • ...The autoindexing is run from the X-windows interface to MOSFLM (Campbell, 1995; Meyer, 1998)....

    [...]

  • ...The underlying code (the `DPS indexing routine') has been made available to the author and has been implemented in MOSFLM....

    [...]

References
Journal Article
TL;DR: This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and plans for the future development of the resource.
Abstract: The Protein Data Bank (PDB; http://www.rcsb.org/pdb/ ) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.

34,239 citations

Journal Article
TL;DR: New features added to the refinement program SHELXL since 2008 are described and explained.
Abstract: The improvements in the crystal structure refinement program SHELXL have been closely coupled with the development and increasing importance of the CIF (Crystallographic Information Framework) format for validating and archiving crystal structures. An important simplification is that now only one file in CIF format (for convenience, referred to simply as `a CIF') containing embedded reflection data and SHELXL instructions is needed for a complete structure archive; the program SHREDCIF can be used to extract the .hkl and .ins files required for further refinement with SHELXL. Recent developments in SHELXL facilitate refinement against neutron diffraction data, the treatment of H atoms, the determination of absolute structure, the input of partial structure factors and the refinement of twinned and disordered structures. SHELXL is available free to academics for the Windows, Linux and Mac OS X operating systems, and is particularly suitable for multiple-core processors.

28,425 citations

Journal Article
TL;DR: CCP4mg is a project that aims to provide a general-purpose tool for structural biologists, providing tools for X-ray structure solution, structure comparison and analysis, and publication-quality graphics.
Abstract: CCP4mg is a project that aims to provide a general-purpose tool for structural biologists, providing tools for X-ray structure solution, structure comparison and analysis, and publication-quality graphics. The map-fitting tools are available as a stand-alone package, distributed as `Coot'.

27,505 citations

Journal Article
TL;DR: The PHENIX software for macromolecular structure determination is described, along with its uses and benefits.
Abstract: Macromolecular X-ray crystallography is routinely applied to understand biological processes at a molecular level. However, significant time and effort are still required to solve and complete many of these structures because of the need for manual interpretation of complex numerical data using many software packages and the repeated use of interactive three-dimensional graphics. PHENIX has been developed to provide a comprehensive system for macromolecular crystallographic structure solution with an emphasis on the automation of all procedures. This has relied on the development of algorithms that minimize or eliminate subjective input, the development of algorithms that automate procedures that are traditionally performed by hand and, finally, the development of a framework that allows a tight integration between the algorithms.

18,531 citations

Journal Article
TL;DR: A description is given of Phaser-2.1: software for phasing macromolecular crystal structures by molecular replacement and single-wavelength anomalous dispersion phasing.
Abstract: Phaser is a program for phasing macromolecular crystal structures by both molecular replacement and experimental phasing methods. The novel phasing algorithms implemented in Phaser have been developed using maximum likelihood and multivariate statistics. For molecular replacement, the new algorithms have proved to be significantly better than traditional methods in discriminating correct solutions from noise, and for single-wavelength anomalous dispersion experimental phasing, the new algorithms, which account for correlations between F+ and F−, give better phases (lower mean phase error with respect to the phases given by the refined structure) than those that use mean F and anomalous differences ΔF. One of the design concepts of Phaser was that it be capable of a high degree of automation. To this end, Phaser (written in C++) can be called directly from Python, although it can also be called using traditional CCP4 keyword-style input. Phaser is a platform for future development of improved phasing methods and their release, including source code, to the crystallographic community.

17,755 citations