Journal Article

SWISS-MODEL: homology modelling of protein structures and complexes.

TL;DR: An update to the SWISS-MODEL server is presented, which includes the implementation of a new modelling engine, ProMod3, and the introduction of a new local model quality estimation method, QMEANDisCo.
Abstract: Homology modelling has matured into an important technique in structural biology, significantly contributing to narrowing the gap between known protein sequences and experimentally determined structures. Fully automated workflows and servers simplify and streamline the homology modelling process, also allowing users without specific computational expertise to generate reliable protein models and have easy access to modelling results, their visualization and interpretation. Here, we present an update to the SWISS-MODEL server, which pioneered the field of automated modelling 25 years ago and has been continuously developed since. Recently, its functionality has been extended to the modelling of homo- and heteromeric complexes. Starting from the amino acid sequences of the interacting proteins, both the stoichiometry and the overall structure of the complex are inferred by homology modelling. Other major improvements include the implementation of a new modelling engine, ProMod3, and the introduction of a new local model quality estimation method, QMEANDisCo. SWISS-MODEL is freely available at https://swissmodel.expasy.org.


Citations
Journal Article
TL;DR: The phylogenetic analysis suggests that bats might be the original host of this virus, and that an animal sold at the seafood market in Wuhan might represent an intermediate host facilitating the emergence of the virus in humans.

9,474 citations

Journal Article
03 Feb 2020-Nature
TL;DR: Phylogenetic and metagenomic analyses of the complete viral genome of a new coronavirus from the family Coronaviridae reveal that the virus is closely related to a group of SARS-like coronaviruses found in bats in China.
Abstract: Emerging infectious diseases, such as severe acute respiratory syndrome (SARS) and Zika virus disease, present a major threat to public health [1–3]. Despite intense research efforts, how, when and where new diseases appear are still a source of considerable uncertainty. A severe respiratory disease was recently reported in Wuhan, Hubei province, China. As of 25 January 2020, at least 1,975 cases had been reported since the first patient was hospitalized on 12 December 2019. Epidemiological investigations have suggested that the outbreak was associated with a seafood market in Wuhan. Here we study a single patient who was a worker at the market and who was admitted to the Central Hospital of Wuhan on 26 December 2019 while experiencing a severe respiratory syndrome that included fever, dizziness and a cough. Metagenomic RNA sequencing [4] of a sample of bronchoalveolar lavage fluid from the patient identified a new RNA virus strain from the family Coronaviridae, which is designated here ‘WH-Human 1’ coronavirus (and has also been referred to as ‘2019-nCoV’). Phylogenetic analysis of the complete viral genome (29,903 nucleotides) revealed that the virus was most closely related (89.1% nucleotide similarity) to a group of SARS-like coronaviruses (genus Betacoronavirus, subgenus Sarbecovirus) that had previously been found in bats in China [5]. This outbreak highlights the ongoing ability of viral spill-over from animals to cause severe disease in humans.

9,231 citations

Journal Article
20 Aug 2021-Science
TL;DR: In this article, a three-track network is proposed to combine information at the one-dimensional (1D) sequence level, the 2D distance map level, and the 3D coordinate level.
Abstract: DeepMind presented notably accurate predictions at the recent 14th Critical Assessment of Structure Prediction (CASP14) conference. We explored network architectures that incorporate related ideas and obtained the best performance with a three-track network in which information at the one-dimensional (1D) sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and integrated. The three-track network produces structure predictions with accuracies approaching those of DeepMind in CASP14, enables the rapid solution of challenging x-ray crystallography and cryo-electron microscopy structure modeling problems, and provides insights into the functions of proteins of currently unknown structure. The network also enables rapid generation of accurate protein-protein complex models from sequence information alone, short-circuiting traditional approaches that require modeling of individual subunits followed by docking. We make the method available to the scientific community to speed biological research.

1,907 citations

Journal Article
22 Jul 2021-Nature
TL;DR: AlphaFold2 is applied at scale to almost the entire human proteome (98.5% of human proteins), yielding a dataset that covers 58% of residues with a confident prediction and is made freely available through a public database.
Abstract: Protein structures can provide invaluable information, both for reasoning about biological processes and for enabling interventions such as structure-based drug development or targeted mutagenesis. After decades of effort, 17% of the total residues in human protein sequences are covered by an experimentally-determined structure [1]. Here we dramatically expand structural coverage by applying the state-of-the-art machine learning method, AlphaFold2, at scale to almost the entire human proteome (98.5% of human proteins). The resulting dataset covers 58% of residues with a confident prediction, of which a subset (36% of all residues) have very high confidence. We introduce several metrics developed by building on the AlphaFold model, and use them to interpret the dataset, identifying strong multi-domain predictions as well as regions likely to be disordered. Finally, we provide some case studies illustrating how high-quality predictions may be used to generate biological hypotheses. Importantly, we are making our predictions freely available to the community via a public database (hosted by the European Bioinformatics Institute at https://alphafold.ebi.ac.uk/). We anticipate that routine large-scale and high-accuracy structure prediction will become an important tool, allowing new questions to be addressed from a structural perspective.

1,238 citations

Journal Article
TL;DR: EK1C4 was the most potent fusion inhibitor against SARS-CoV-2 S protein-mediated membrane fusion and pseudovirus infection with IC50s of 1.3 and 15.8 nM, about 241- and 149-fold more potent than the original EK1 peptide, respectively.
Abstract: The recent outbreak of coronavirus disease (COVID-19) caused by SARS-CoV-2 infection in Wuhan, China has posed a serious threat to global public health. To develop specific anti-coronavirus therapeutics and prophylactics, the molecular mechanism that underlies viral infection must first be defined. Therefore, we herein established a SARS-CoV-2 spike (S) protein-mediated cell–cell fusion assay and found that SARS-CoV-2 showed a superior plasma membrane fusion capacity compared to that of SARS-CoV. We solved the X-ray crystal structure of six-helical bundle (6-HB) core of the HR1 and HR2 domains in the SARS-CoV-2 S protein S2 subunit, revealing that several mutated amino acid residues in the HR1 domain may be associated with enhanced interactions with the HR2 domain. We previously developed a pan-coronavirus fusion inhibitor, EK1, which targeted the HR1 domain and could inhibit infection by divergent human coronaviruses tested, including SARS-CoV and MERS-CoV. Here we generated a series of lipopeptides derived from EK1 and found that EK1C4 was the most potent fusion inhibitor against SARS-CoV-2 S protein-mediated membrane fusion and pseudovirus infection with IC50s of 1.3 and 15.8 nM, about 241- and 149-fold more potent than the original EK1 peptide, respectively. EK1C4 was also highly effective against membrane fusion and infection of other human coronavirus pseudoviruses tested, including SARS-CoV and MERS-CoV, as well as SARSr-CoVs, and potently inhibited the replication of 5 live human coronaviruses examined, including SARS-CoV-2. Intranasal application of EK1C4 before or after challenge with HCoV-OC43 protected mice from infection, suggesting that EK1C4 could be used for prevention and treatment of infection by the currently circulating SARS-CoV-2 and other emerging SARSr-CoVs.

1,026 citations

References
Journal Article
TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Abstract: The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.
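The idea of a position-specific score matrix mentioned in this abstract can be illustrated with a minimal sketch. It assumes a uniform background distribution over the 20 amino acids and a flat pseudocount, and it skips sequence weighting entirely, so it is not PSI-BLAST's actual scoring scheme; the names `build_pssm` and `score` are hypothetical helpers chosen for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned_seqs, pseudocount=1.0):
    """Build per-column log-odds scores from equal-length aligned sequences.

    Assumptions (not PSI-BLAST's real scheme): uniform background
    frequencies, a flat pseudocount, and no sequence weighting.
    """
    if not aligned_seqs:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("aligned sequences must have the same length")

    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        # Count residues in this column; gaps and unknown symbols are ignored.
        counts = Counter(s[col] for s in aligned_seqs if s[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm, seq):
    """Sum the column scores for an ungapped sequence as long as the matrix."""
    if len(seq) != len(pssm):
        raise ValueError("sequence length must match the matrix length")
    return sum(column.get(aa, 0.0) for column, aa in zip(pssm, seq))


pssm = build_pssm(["ACD", "ACE", "ACD"])
print(score(pssm, "ACD"), score(pssm, "ACE"), score(pssm, "WWW"))
```

Residues that are frequent in a column receive positive scores and rare ones negative scores, which is what lets a profile search reward matches at conserved positions more strongly than a single generic substitution matrix would.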

70,111 citations

Journal Article
TL;DR: Describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and plans for the future development of the resource.
Abstract: The Protein Data Bank (PDB; http://www.rcsb.org/pdb/ ) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.

34,239 citations


"SWISS-MODEL: homology modelling of ..." refers background or methods in this paper

  • ...To this aim, we are actively participating to the CAMEO project (Continuous Automated Model Evaluation, https://cameo3d.org) (32), a fully automated blind prediction assessment based on weekly prerelease of sequences from the PDB (33), allowing us to constantly monitor and improve the performance of the server....

    [...]

  • ...The resulting model is shown in Figure 2D, where it has been superimposed onto the experimental structure of the complex (PDB ID: 5H5J (68), shown in light gray)....

    [...]

  • ...The SWISS-MODEL Template Library (SMTL), available at https://swissmodel.expasy.org/templates/, is a curated template library, which is updated on a weekly basis according to the new PDB release (33)....

    [...]

  • ...With the new version of SWISS-MODEL presented here, we aimed at extending the scope of automated homology modelling to address the modelling of protein assemblies by efficiently using the information on quaternary structures available in the PDB....

    [...]

  • ...Currently, the SMR contains 1 067 355 models from SWISS-MODEL and 129 416 structures from PDB with mapping to UniProtKB....

    [...]

Journal Article
TL;DR: The new BLAST command-line applications, compared to the current BLAST tools, demonstrate substantial speed improvements for long queries as well as chromosome length database sequences.
Abstract: Sequence similarity searching is a very important bioinformatics task. While Basic Local Alignment Search Tool (BLAST) outperforms exact methods through its use of heuristics, the speed of the current BLAST software is suboptimal for very long queries or database sequences. There are also some shortcomings in the user-interface of the current command-line applications. We describe features and improvements of rewritten BLAST software and introduce new command-line applications. Long query sequences are broken into chunks for processing, in some cases leading to dramatically shorter run times. For long database sequences, it is possible to retrieve only the relevant parts of the sequence, reducing CPU time and memory usage for searches of short queries against databases of contigs or chromosomes. The program can now retrieve masking information for database sequences from the BLAST databases. A new modular software library can now access subject sequence data from arbitrary data sources. We introduce several new features, including strategy files that allow a user to save and reuse their favorite set of options. The strategy files can be uploaded to and downloaded from the NCBI BLAST web site. The new BLAST command-line applications, compared to the current BLAST tools, demonstrate substantial speed improvements for long queries as well as chromosome length database sequences. We have also improved the user interface of the command-line applications.

13,223 citations


"SWISS-MODEL: homology modelling of ..." refers methods in this paper

  • ...SWISS-MODEL performs this task by using two database search methods: BLAST (35,36), which is fast and sufficiently accurate for closely related templates, and HHblits (37), which adds sensitivity in case of remote homology....

    [...]

  • ...Each leaf of the tree corresponds to a template and target-template alignment (based on HHblits, BLAST or both); templates are labelled with their SMTL ID; bars indicate sequence identity and coverage to the target (darker shades of blue indicate higher sequence identity)....

    [...]
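The sequence identity and coverage mentioned in the excerpt above can be computed from a pairwise target-template alignment. The sketch below assumes one common pair of definitions: identity as the fraction of identical residues among aligned (gap-free) columns, and coverage as the fraction of target residues aligned to a template residue. SWISS-MODEL may define these quantities differently, and `identity_and_coverage` is a hypothetical helper name.

```python
def identity_and_coverage(target_aln: str, template_aln: str, gap: str = "-") -> tuple[float, float]:
    """Return (sequence identity, coverage) for a pairwise alignment.

    Both strings are rows of the same alignment, so they must have equal
    length. The definitions used here are assumptions for illustration.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")

    # Number of real residues in the target (gap characters excluded).
    target_len = sum(1 for t in target_aln if t != gap)

    aligned = 0    # columns where both rows contain a residue
    identical = 0  # aligned columns where the two residues match
    for t, s in zip(target_aln, template_aln):
        if t != gap and s != gap:
            aligned += 1
            if t.upper() == s.upper():
                identical += 1

    identity = identical / aligned if aligned else 0.0
    coverage = aligned / target_len if target_len else 0.0
    return identity, coverage


identity, coverage = identity_and_coverage("MKT-AYIAKQR", "MKTLAY--RQR")
print(f"identity={identity:.3f} coverage={coverage:.3f}")
```

In this toy alignment, 8 of the 10 target residues are aligned to a template residue and 7 of those 8 columns are identical, so the two numbers capture different things: how similar the aligned region is, and how much of the target it spans.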

Journal Article
TL;DR: A comparative protein modelling method designed to find the most probable structure for a sequence given its alignment with related structures, which is automated and illustrated by the modelling of trypsin from two other serine proteinases.

12,386 citations


"SWISS-MODEL: homology modelling of ..." refers methods in this paper

  • ...Until recently, the software package ProMod-II (26), using MODELLER (51) as a fall-back, was in use to perform this task....

    [...]

Journal Article
TL;DR: An environment for comparative protein modeling is developed that consists of SWISS-MODEL, a server for automated comparative protein modeling, and the Swiss-PdbViewer, a sequence to structure workbench that provides a large selection of structure analysis and display tools.
Abstract: Comparative protein modeling is increasingly gaining interest since it is of great assistance during the rational design of mutagenesis experiments. The availability of this method, and the resulting models, has however been restricted by the availability of expensive computer hardware and software. To overcome these limitations, we have developed an environment for comparative protein modeling that consists of SWISS-MODEL, a server for automated comparative protein modeling and of the SWISS-PdbViewer, a sequence to structure workbench. The Swiss-PdbViewer not only acts as a client for SWISS-MODEL, but also provides a large selection of structure analysis and display tools. In addition, we provide the SWISS-MODEL Repository, a database containing more than 3500 automatically generated protein models. By making such tools freely available to the scientific community, we hope to increase the use of protein structures and models in the process of experiment design.

10,713 citations