Journal ArticleDOI

MEROPS: the peptidase database

01 Jan 1999-Nucleic Acids Research (Oxford University Press)-Vol. 32, Iss: 1, pp 325-331
TL;DR: The MEROPS database has added an analysis tool to the relevant species pages to show significant gains and losses of peptidase genes relative to related species, and has collected over 39 000 known cleavage sites in proteins, peptides and synthetic substrates.
Abstract: Peptidases (proteolytic enzymes) are of great relevance to biology, medicine and biotechnology. This practical importance creates a need for an integrated source of information about them, and also about their natural inhibitors. The MEROPS database (http://merops.sanger.ac.uk) aims to fill this need. The organizational principle of the database is a hierarchical classification in which homologous sets of the proteins of interest are grouped in families and the homologous families are grouped in clans. Each peptidase, family and clan has a unique identifier. The database has recently been expanded to include the protein inhibitors of peptidases, and these are classified in much the same way as the peptidases. Forms of information recently added include new links to other databases, summary alignments for peptidase clans, displays to show the distribution of peptidases and inhibitors among organisms, substrate cleavage sites and indexes for expressed sequence tag libraries containing peptidases. A new way of making hyperlinks to the database has been devised and a BlastP search of our library of peptidase and inhibitor sequences has been added.


Citations
Journal ArticleDOI
TL;DR: Improvement in accuracy was generally observed for most methods, but remarkably large for the new options of MAFFT proposed here, which showed higher accuracy than currently available methods including TCoffee version 2 and CLUSTAL W in benchmark tests consisting of alignments of >50 sequences.
Abstract: The accuracy of the multiple sequence alignment program MAFFT has been improved. The new version (5.3) of MAFFT offers new iterative refinement options, H-INS-i, F-INS-i and G-INS-i, in which pairwise alignment information is incorporated into the objective function. These new options of MAFFT showed higher accuracy than currently available methods including TCoffee version 2 and CLUSTAL W in benchmark tests consisting of alignments of >50 sequences. Like the previously available options, the new options of MAFFT can handle hundreds of sequences on a standard desktop computer. We also examined the effect of the number of homologues included in an alignment. For a multiple alignment consisting of ∼8 sequences with low similarity, the accuracy was improved (2–10 percentage points) when the sequences were aligned together with dozens of their close homologues (E-value < 10^−5–10^−20) collected from a database. Such improvement was generally observed for most methods, but remarkably large for the new options of MAFFT proposed here. Thus, we made a Ruby script, mafftE.rb, which aligns the input sequences together with their close homologues collected from SwissProt using NCBI-BLAST.

4,528 citations

Journal ArticleDOI
TL;DR: Recent studies in mice and flies point to essential roles of MMPs as mediators of change and physical adaptation in tissues, whether developmentally regulated, environmentally induced or disease associated.
Abstract: Matrix metalloproteinases (MMPs) were discovered because of their role in amphibian metamorphosis, yet they have attracted more attention because of their roles in disease. Despite intensive scrutiny in vitro, in cell culture and in animal models, the normal physiological roles of these extracellular proteases have been elusive. Recent studies in mice and flies point to essential roles of MMPs as mediators of change and physical adaptation in tissues, whether developmentally regulated, environmentally induced or disease associated.

2,634 citations

Journal ArticleDOI
TL;DR: The InterPro database integrates together predictive models or ‘signatures’ representing protein domains, families and functional sites from multiple, diverse source databases: Gene3D, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY and TIGRFAMs.
Abstract: The InterPro database (http://www.ebi.ac.uk/interpro/) integrates together predictive models or 'signatures' representing protein domains, families and functional sites from multiple, diverse source databases: Gene3D, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY and TIGRFAMs. Integration is performed manually and approximately half of the total approximately 58,000 signatures available in the source databases belong to an InterPro entry. Recently, we have started to also display the remaining un-integrated signatures via our web interface. Other developments include the provision of non-signature data, such as structural data, in new XML files on our FTP site, as well as the inclusion of matchless UniProtKB proteins in the existing match XML files. The web interface has been extended and now links out to the ADAN predicted protein-protein interaction database and the SPICE and Dasty viewers. The latest public release (v18.0) covers 79.8% of UniProtKB (v14.1) and consists of 16 549 entries. InterPro data may be accessed either via the web address above, via web services, by downloading files by anonymous FTP or by using the InterProScan search software (http://www.ebi.ac.uk/Tools/InterProScan/).

1,834 citations


Cites background from "MEROPS: the peptidase database"

  • ...Semi-automatic procedures create and maintain links to an array of other databases, including the protease resource MEROPS (18), the protein interaction database IntAct (19), the protein sequence clusters in CluSTr (20) and the 3D protein structure database PDB (21)....


Journal ArticleDOI
20 May 2011-Science
TL;DR: The value of characterizing vertebrate gut microbiomes to understand host evolutionary histories at a supraorganismal level is illustrated by shotgun sequencing of microbial community DNA and targeted sequencing of bacterial 16S ribosomal RNA genes.
Abstract: Coevolution of mammals and their gut microbiota has profoundly affected their radiation into myriad habitats. We used shotgun sequencing of microbial community DNA and targeted sequencing of bacterial 16S ribosomal RNA genes to gain an understanding of how microbial communities adapt to extremes of diet. We sampled fecal DNA from 33 mammalian species and 18 humans who kept detailed diet records, and we found that the adaptation of the microbiota to diet is similar across different mammalian lineages. Functional repertoires of microbiome genes, such as those encoding carbohydrate-active enzymes and proteases, can be predicted from bacterial species assemblages. These results illustrate the value of characterizing vertebrate gut microbiomes to understand host evolutionary histories at a supraorganismal level.

1,585 citations

Journal ArticleDOI
TL;DR: The MEROPS database has been expanded to include proteolytic enzymes other than peptidases, and small-molecule inhibitors are now included in the tables of peptidase–inhibitor interactions.
Abstract: Peptidases, their substrates and inhibitors are of great relevance to biology, medicine and biotechnology. The MEROPS database (http://merops.sanger.ac.uk) aims to fulfill the need for an integrated source of information about these. The database has hierarchical classifications in which homologous sets of peptidases and protein inhibitors are grouped into protein species, which are grouped into families, which are in turn grouped into clans. Recent developments include the following. A community annotation project has been instigated in which acknowledged experts are invited to contribute summaries for peptidases. Software has been written to provide an Internet-based data entry form. Contributors are acknowledged on the relevant web page. A new display showing the intron/exon structures of eukaryote peptidase genes and the phasing of the junctions has been implemented. It is now possible to filter the list of peptidases from a completely sequenced bacterial genome for a particular strain of the organism. The MEROPS filing pipeline has been altered to circumvent the restrictions imposed on non-interactive blastp searches, and a HMMER search using specially generated alignments to maximize the distribution of organisms returned in the search results has been added.

1,443 citations

References
Journal ArticleDOI
TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.

88,255 citations


"MEROPS: the peptidase database" refers methods in this paper

  • ...We have implemented a BlastP search (11) that allows a user to search the MEROPS sequence database with an unknown query sequence....


Journal ArticleDOI
TL;DR: The neighbor-joining method and Sattath and Tversky's method are shown to be generally better than the other methods for reconstructing phylogenetic trees from evolutionary distance data.
Abstract: A new method called the neighbor-joining method is proposed for reconstructing phylogenetic trees from evolutionary distance data. The principle of this method is to find pairs of operational taxonomic units (OTUs [= neighbors]) that minimize the total branch length at each stage of clustering of OTUs starting with a starlike tree. The branch lengths as well as the topology of a parsimonious tree can quickly be obtained by using this method. Using computer simulation, we studied the efficiency of this method in obtaining the correct unrooted tree in comparison with that of five other tree-making methods: the unweighted pair group method of analysis, Farris's method, Sattath and Tversky's method, Li's method, and Tateno et al.'s modified Farris method. The new, neighbor-joining method and Sattath and Tversky's method are shown to be generally better than the other methods.
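The clustering principle described in this abstract can be illustrated with a minimal Python sketch. It uses the commonly taught Q-matrix formulation of neighbor joining rather than the paper's original notation, and the function name `neighbor_joining`, the internal node labels (`node0`, `node1`, ...) and the example distances are all hypothetical choices made for illustration. It is not the implementation used by QuickTree or by MEROPS.

```python
from itertools import combinations


def neighbor_joining(names, pair_dist):
    """Build an unrooted tree from pairwise distances.

    names: list of taxon labels (OTUs).
    pair_dist: dict mapping (a, b) tuples to distances, one entry per pair.
    Returns a list of (parent, child, branch_length) edges.
    """
    # Symmetric working copy of the distance matrix.
    d = {a: {} for a in names}
    for (a, b), value in pair_dist.items():
        d[a][b] = value
        d[b][a] = value

    nodes = list(names)
    edges = []
    next_id = 0

    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node: sum of distances to all others.
        r = {a: sum(d[a].values()) for a in nodes}

        # Join the pair with the smallest Q value (assumed Q-matrix criterion).
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]],
        )
        dij = d[i][j]

        # Branch lengths from the new internal node to i and j.
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        u = f"node{next_id}"  # hypothetical label for the internal node
        next_id += 1
        edges.append((u, i, li))
        edges.append((u, j, lj))

        # Distances from the new node to every remaining node.
        d[u] = {}
        for k in nodes:
            if k in (i, j):
                continue
            d[u][k] = d[k][u] = 0.5 * (d[i][k] + d[j][k] - dij)
            del d[k][i]
            del d[k][j]
        del d[i]
        del d[j]
        nodes = [k for k in nodes if k not in (i, j)] + [u]

    # Connect the last two nodes directly.
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges


# Illustrative distances from a made-up additive tree with four taxa.
example = {
    ("A", "B"): 5, ("A", "C"): 7, ("A", "D"): 8,
    ("B", "C"): 8, ("B", "D"): 9, ("C", "D"): 9,
}
for parent, child, length in neighbor_joining(["A", "B", "C", "D"], example):
    print(parent, child, length)
```

Because the example distances are additive, the sketch recovers the branch lengths of the tree they were generated from. Real alignment-derived distances are rarely additive, so the resulting tree is then only an estimate.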

57,055 citations


"MEROPS: the peptidase database" refers methods in this paper

  • ...An alignment at the identifier level is generated by MUSCLE (5); this alignment is used to generate a Neighbor-joining tree (6) using QuickTree (7) which is displayed using the Clustal-Tree Java applet written by Rodrigo Lopez and Stephen Robinson at the European Bioinformatics Institute....


Book
01 Feb 1987
TL;DR: Recent developments of statistical methods in molecular phylogenetics are reviewed and it is shown that the mathematical foundations of these methods are not well established, but computer simulations and empirical data indicate that currently used methods produce reasonably good phylogenetic trees when a sufficiently large number of nucleotides or amino acids are used.
Abstract: Recent developments of statistical methods in molecular phylogenetics are reviewed. It is shown that the mathematical foundations of these methods are not well established, but computer simulations and empirical data indicate that currently used methods such as neighbor joining, minimum evolution, likelihood, and parsimony methods produce reasonably good phylogenetic trees when a sufficiently large number of nucleotides or amino acids are used. However, when the rate of evolution varies extensively from branch to branch, many methods may fail to recover the true topology. Solid statistical tests for examining the accuracy of trees obtained by neighbor joining, minimum evolution, and least-squares method are available, but the methods for likelihood and parsimony trees are yet to be refined. Parsimony, likelihood, and distance methods can all be used for inferring amino acid sequences of the proteins of ancestral organisms that have become extinct.

15,840 citations

Journal ArticleDOI
TL;DR: A set of simple and physically motivated criteria for secondary structure, programmed as a pattern‐recognition process of hydrogen‐bonded and geometrical features extracted from x‐ray coordinates is developed.
Abstract: For a successful analysis of the relation between amino acid sequence and protein structure, an unambiguous and physically meaningful definition of secondary structure is essential. We have developed a set of simple and physically motivated criteria for secondary structure, programmed as a pattern-recognition process of hydrogen-bonded and geometrical features extracted from x-ray coordinates. Cooperative secondary structure is recognized as repeats of the elementary hydrogen-bonding patterns “turn” and “bridge.” Repeating turns are “helices,” repeating bridges are “ladders,” connected ladders are “sheets.” Geometric structure is defined in terms of the concepts torsion and curvature of differential geometry. Local chain “chirality” is the torsional handedness of four consecutive Cα positions and is positive for right-handed helices and negative for ideal twisted β-sheets. Curved pieces are defined as “bends.” Solvent “exposure” is given as the number of water molecules in possible contact with a residue. The end result is a compilation of the primary structure, including SS bonds, secondary structure, and solvent exposure of 62 different globular proteins. The presentation is in linear form: strip graphs for an overall view and strip tables for the details of each of 10,925 residues. The dictionary is also available in computer-readable form for protein structure prediction work.

14,077 citations

Journal ArticleDOI
TL;DR: The definition and use of family-specific, manually curated gathering thresholds are explained and some of the features of domains of unknown function (also known as DUFs) are discussed, which constitute a rapidly growing class of families within Pfam.
Abstract: Pfam is a widely used database of protein families and domains. This article describes a set of major updates that we have implemented in the latest release (version 24.0). The most important change is that we now use HMMER3, the latest version of the popular profile hidden Markov model package. This software is approximately 100 times faster than HMMER2 and is more sensitive due to the routine use of the forward algorithm. The move to HMMER3 has necessitated numerous changes to Pfam that are described in detail. Pfam release 24.0 contains 11,912 families, of which a large number have been significantly updated during the past two years. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/).

14,075 citations


"MEROPS: the peptidase database" refers background in this paper

  • ...From the family summary pages there are now links to the HOMSTRAD database of structural alignments (6), and the Pfam (7) and InterPro (8) databases of protein domains....
