A multitask transfer learning framework for novel virus-human protein interactions
Citations
A multitask transfer learning framework for the prediction of virus-human protein-protein interactions
Application of Sequence Embedding in Protein Sequence-Based Predictions
Sharing to learn and learning to share - Fitting together Meta-Learning, Multi-Task Learning, and Transfer Learning : A meta review.
References
STRING v10: protein–protein interaction networks, integrated over the tree of life
Distributed Representations of Sentences and Documents
Unified rational protein engineering with sequence-based deep representation learning
Synthesis of a Vocal Sound from the 3,000 year old Mummy, Nesyamun ‘True of Voice’
Understanding eukaryotic linear motifs and their role in cell signaling and regulation.
Frequently Asked Questions (17)
Q2. What are the future works mentioned in the paper "A multitask transfer learning framework for novel virus-human protein interactions" ?
As future work
Q3. What is the input to the LR module?
The input to the LR module is the element-wise product of the fine-tuned representations (the outputs of the MLPs) of the virus and human proteins.
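The scoring step described in this answer can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy vectors, and the weights are all hypothetical, and "LR" is assumed here to mean a logistic regression (a weighted sum passed through a sigmoid).

```python
import math

def elementwise_product(u, v):
    """Multiply two equal-length vectors component by component."""
    if len(u) != len(v):
        raise ValueError("representations must have the same dimension")
    return [a * b for a, b in zip(u, v)]

def sigmoid(x):
    """Map a real-valued logit to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def lr_score(virus_repr, human_repr, weights, bias=0.0):
    """Logistic-regression score on the element-wise product (assumed form)."""
    features = elementwise_product(virus_repr, human_repr)
    logit = sum(w * f for w, f in zip(weights, features)) + bias
    return sigmoid(logit)

# Toy 3-dimensional fine-tuned representations (made-up numbers).
virus_repr = [0.5, 0.0, 2.0]
human_repr = [1.0, 3.0, 0.5]
weights = [1.0, 1.0, -1.0]  # hypothetical LR weights

score = lr_score(virus_repr, human_repr, weights)
```

Because the product is symmetric in its two arguments, swapping the virus and human vectors gives the same feature vector; any asymmetry between the two proteins has to come from the separate MLPs that produce the representations.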
Q4. What is the name of the SVM model?
It used an SVM model trained on a feature set consisting of protein domain-domain associations and the methionine, serine, and valine amino acid composition of viral proteins.
Q5. What is the main limitation of the proposed model?
Noting that viruses tend to mimic human proteins when building interactions with host proteins, the authors use the prediction of human PPI as a side task to further regularize their model and improve generalization.
Q6. What is the rationale behind using human PPI?
The rationale behind using the human PPI task is that viruses have been shown to mimic and compete with human proteins in their binding and interaction patterns with other human proteins (Mei & Zhang, 2020).
Q7. What is the purpose of the paper?
The authors will enhance their multitask approach by incorporating more domain information and by exploiting more sophisticated multitask model architectures.
Q8. What are the main limitations of the proposed model?
Heuristics such as k-mer composition, which are commonly used for protein representations, are bound to fail because viral proteins with completely different sequences are known to sometimes show similar interaction patterns.
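To make the k-mer argument concrete, the sketch below computes a k-mer composition profile and checks how many k-mers two sequences share. The helper names and both toy sequences are hypothetical and chosen only for illustration; they are not taken from the paper's data.

```python
from collections import Counter

def kmer_composition(sequence, k=3):
    """Return the relative frequency of each overlapping k-mer in a sequence."""
    total = len(sequence) - k + 1
    if total <= 0:
        return {}
    counts = Counter(sequence[i:i + k] for i in range(total))
    return {kmer: n / total for kmer, n in counts.items()}

def shared_kmers(seq_a, seq_b, k=3):
    """Return the set of k-mers that occur in both sequences."""
    return set(kmer_composition(seq_a, k)) & set(kmer_composition(seq_b, k))

# Two made-up peptide fragments with no 3-mer in common.
seq_a = "MKTAYIAK"
seq_b = "GSHWLPEV"

overlap = shared_kmers(seq_a, seq_b)
```

When the overlap is empty, the two composition vectors have disjoint support, so any similarity measure built on them treats the proteins as unrelated regardless of how they actually interact.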
Q9. What are the limitations of the UNIREP model?
The protein representations extracted from the UNIREP model are empirically shown to preserve fundamental properties of the proteins and are hypothesized to be statistically more powerful and generalizable than hand-crafted sequence features.
Q10. What is the zhh′ for human PPI?
For human PPI, the target variables ($z_{hh'}$) are the normalized confidence scores, which can be interpreted as the probability of observing an interaction.
Q11. What is the name of the dataset?
DeNovo’s SLIM dataset encapsulated viral proteins based on the presence of Short Linear Motifs (SLiMs), which are short recurring protein sequences with a specific biological function. [Paper footer: To be presented at the ICLR Workshop on AI for Public Health 2021.]
Q12. What is the corresponding binary cross entropy loss function for the virus-human P?
$$L_2 = \sum_{(h,h') \in MP} -z_{hh'} \log y_{hh'}(\Phi, W_2) - (1 - z_{hh'}) \log\bigl(1 - y_{hh'}(\Phi, W_2)\bigr) \tag{2}$$
The authors use a linear combination of the two loss functions to train their model, i.e., $L = L_1 + \alpha \cdot L_2$, where $\alpha$ is the human PPI weight factor.
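The combined objective can be sketched numerically as below. This is an illustration under stated assumptions: the source text here gives only the human PPI loss explicitly, so the virus-human loss is assumed to be the same summed binary cross entropy over virus-human pairs; the function names and the clipping constant `eps` are hypothetical additions for numerical safety.

```python
import math

def bce(target, prob, eps=1e-12):
    """Binary cross entropy for one pair; target may be a soft label in [0, 1]."""
    p = min(max(prob, eps), 1.0 - eps)  # clip to avoid log(0)
    return -target * math.log(p) - (1.0 - target) * math.log(1.0 - p)

def summed_bce(targets, probs):
    """Sum the per-pair cross entropy over all pairs in a set."""
    return sum(bce(z, y) for z, y in zip(targets, probs))

def multitask_loss(vh_targets, vh_probs, hh_targets, hh_probs, alpha):
    """Linear combination of the main loss and the alpha-weighted side loss."""
    l1 = summed_bce(vh_targets, vh_probs)  # virus-human PPI (assumed BCE form)
    l2 = summed_bce(hh_targets, hh_probs)  # human PPI with soft targets
    return l1 + alpha * l2

# One virus-human pair with a binary label, one human pair with a soft label.
loss = multitask_loss([1.0], [0.5], [0.5], [0.5], alpha=2.0)
```

Setting `alpha` to 0 removes the side task entirely, while larger values let the human PPI signal dominate the gradient.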
Q13. What are the interactions between the viral coat and the host?
These interactions range from the initial binding of viral coat proteins to the host membrane receptor to the hijacking of the host transcription machinery by viral proteins.
Q14. What is the main objective of the proposed model?
The authors further fine-tune these representations by training two simple neural networks (single-layer MLPs with ReLU activation), with the prediction of human PPI as an additional objective alongside the main task.
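A single-layer MLP with ReLU activation amounts to one affine map followed by a ReLU. The sketch below shows that forward pass in plain Python; the layer sizes, weights, and input are made-up values, and the function names are hypothetical. In the setup described here there would be two such networks with separate parameters, one for virus proteins and one for human proteins.

```python
def relu(x):
    """Rectified linear unit: pass positive values, zero out the rest."""
    return x if x > 0.0 else 0.0

def single_layer_mlp(x, weight, bias):
    """Apply one dense layer followed by ReLU.

    weight holds one row of input weights per output unit.
    """
    return [
        relu(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weight, bias)
    ]

# Toy pre-trained embedding of dimension 2, tuned to a 2-unit output.
embedding = [1.0, -2.0]
weight = [[1.0, 0.5], [-1.0, 0.0]]  # hypothetical parameters
bias = [0.5, 0.0]

tuned = single_layer_mlp(embedding, weight, bias)
```

Only `weight` and `bias` would be updated during training; the input embedding stays fixed, which keeps the number of trainable parameters small.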
Q15. What are the learnable parameters for the human PPI?
Let Θ,Φ denote the set of learnable parameters corresponding to representation tuning components, i.e., the Multilayer Perceptrons (MLP) corresponding to the virus and human proteins, respectively.
Q16. What are the limitations of the proposed model?
In this work, the authors tackle the above limitations by exploiting powerful statistical protein representations derived from a corpus of around 24 million protein sequences in a multitask framework.
Q17. What is the rationale behind using human interactome to guide their virus-human PPI task?
The authors believe that the patterns learned from the human interactome (or human PPI network) should be a rich source of knowledge to guide their virus-human PPI task and should further help to regularize their model.