Protein-Sol: a web tool for predicting protein solubility from sequence.
TLDR
The model returns a predicted solubility and an indication of the features which deviate most from average values, and the utility of these additional features is demonstrated with the example of thioredoxin.
Abstract:
Motivation: Protein solubility is an important property in industrial and therapeutic applications. Prediction is a challenge, despite a growing understanding of the relevant physicochemical properties.
Results: Protein-Sol is a web server for predicting protein solubility. Using available data for Escherichia coli protein solubility in a cell-free expression system, 35 sequence-based properties are calculated. Feature weights are determined from separation of low and high solubility subsets. The model returns a predicted solubility and an indication of the features which deviate most from average values. Two other properties are profiled in windowed calculation along the sequence: fold propensity, and net segment charge. The utility of these additional features is demonstrated with the example of thioredoxin.
Availability and implementation: The Protein-Sol webserver is available at http://protein-sol.manchester.ac.uk
Contact: jim.warwicker@manchester.ac.uk
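The windowed net segment charge mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the Protein-Sol implementation: the function name `windowed_net_charge`, the default window of 21 residues, the simple charge model (+1 for K and R, -1 for D and E, histidine and termini ignored) and the truncation of windows at the sequence ends are all assumptions made for illustration.

```python
def windowed_net_charge(sequence, window=21):
    """Return the net charge of a sliding window centred on each residue.

    Assumptions (not taken from the Protein-Sol paper): K and R count as +1,
    D and E as -1, all other residues as 0, and windows are truncated at the
    sequence ends rather than padded.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")

    charges = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}
    seq = sequence.upper()
    half = window // 2

    profile = []
    for i in range(len(seq)):
        # Clip the window so it never runs past either end of the sequence.
        start = max(0, i - half)
        end = min(len(seq), i + half + 1)
        profile.append(sum(charges.get(aa, 0.0) for aa in seq[start:end]))
    return profile


if __name__ == "__main__":
    # Toy sequence, chosen only to show the shape of the output.
    print(windowed_net_charge("KKDAE", window=3))
```

Plotting the returned profile against residue index gives a per-position view of charged segments, in the spirit of the profiles the abstract describes; a fold propensity profile could be computed with the same windowing over a different per-residue scale.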
Citations
Journal Article
Computational approaches to therapeutic antibody design: established methods and emerging trends
Richard A. Norman, Francesco Ambrosetti, Alexandre M. J. J. Bonvin, Lucy J. Colwell, Sebastian Kelm, Sandeep Kumar, Konrad Krawczyk
TL;DR: A structured overview of available databases, methods and emerging trends in computational antibody analysis is presented and contextualized towards the engineering of candidate antibody therapeutics.
Journal Article
An in silico deep learning approach to multi-epitope vaccine design: a SARS-CoV-2 case study.
TL;DR: This paper proposes an in silico deep learning approach for prediction and design of a multi-epitope vaccine (DeepVacPred), which directly predicts 26 potential vaccine subunits from the available SARS-CoV-2 spike protein sequence.
Journal Article
Machine learning with physicochemical relationships: solubility prediction in organic solvents and water
TL;DR: A successful approach to solubility prediction in organic solvents and water is reported using a combination of machine learning (ANN, SVM, RF, ExtraTrees, Bagging and GP) and computational chemistry.
Journal Article
Generating functional protein variants with variational autoencoders.
Alex Hawkins-Hooker, Florence Depardieu, Sebastien Baur, Guillaume Couairon, Arthur Chen, David Bikard
TL;DR: It is shown that variational autoencoders trained on a dataset of almost 70,000 luciferase-like oxidoreductases can be used to generate novel, functional variants of the luxA bacterial luciferase.
Posted Content
Generating functional protein variants with variational autoencoders
Alex Hawkins-Hooker, Florence Depardieu, Sebastien Baur, Guillaume Couairon, Arthur Chen, David Bikard
TL;DR: The feasibility of using deep generative models to explore the space of possible protein sequences and generate useful variants is demonstrated, providing a method complementary to rational design and directed evolution approaches.
References
Journal Article
A simple method for displaying the hydropathic character of a protein
Jack Kyte, Russell F. Doolittle
TL;DR: A computer program that progressively evaluates the hydrophilicity and hydrophobicity of a protein along its amino acid sequence has been devised; its simplicity and graphic nature make it a very useful tool for the evaluation of protein structures.
Journal Article
UniProt: the Universal Protein knowledgebase
Rolf Apweiler, Amos Marc Bairoch, Cathy H. Wu, Winona C. Barker, Brigitte Boeckmann, Serenella Ferro, Elisabeth Gasteiger, Hongzhan Huang, Rodrigo Lopez, Michele Magrane, Maria Jesus Martin, Darren A. Natale, Claire O'Donovan, Nicole Redaschi, Lai-Su L. Yeh
TL;DR: The Swiss-Prot, TrEMBL and PIR protein database activities have united to form the Universal Protein Knowledgebase (UniProt), which aims to provide a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and query interfaces.
Journal Article
Why are "natively unfolded" proteins unstructured under physiologic conditions?
TL;DR: Analysis of amino acid sequences, based on the normalized net charge and mean hydrophobicity, has been applied to two sets of proteins and shows that "natively unfolded" proteins are specifically localized within a unique region of charge-hydrophobicity phase space.
Journal Article
GlobPlot: exploring protein sequences for globularity and disorder
TL;DR: A new tool for discovering unstructured or disordered regions within proteins is presented, with examples from known proteins where it successfully identifies inter-domain segments containing linear motifs, as well as apparently ordered regions that do not contain any recognised domain.
Journal Article
Crystal structure of thioredoxin from Escherichia coli at 1.68 Å resolution.
TL;DR: The crystal structure of thioredoxin from Escherichia coli has been refined by the stereochemically restrained least-squares procedure to a crystallographic R-factor of 0.165 at 1.68 Å resolution.