Posted Content

Human mitochondrial protein complexes revealed by large-scale coevolution analysis and deep learning-based structure modeling

14 Sep 2021-bioRxiv (Cold Spring Harbor Laboratory)
TL;DR: RoseTTAFold was used to screen about 95% of human mitochondrial protein pairs for coevolution, and AlphaFold2 was then used to model the complex structures of the top-ranked pairs.
Abstract: Recent developments in deep-learning methods have led to a breakthrough in the prediction accuracy of 3-dimensional protein structures. Extending these methods to protein pairs is expected to allow large-scale detection of protein-protein interactions (PPIs) and modeling of protein complexes at the proteome level. We applied RoseTTAFold and AlphaFold2, two of the latest deep-learning methods for structure prediction, to analyze the coevolution of human proteins residing in mitochondria, an organelle of vital importance in many cellular processes including energy production, metabolism, cell death, and antiviral response. Variations in mitochondrial proteins have been linked to a plethora of human diseases and genetic conditions. RoseTTAFold, with its high computational speed, was used to predict the coevolution of about 95% of mitochondrial protein pairs. Top-ranked pairs were further subjected to modeling of the complex structures by AlphaFold2, which also produced contact probabilities with high precision that were in many cases consistent with RoseTTAFold. Most of the top-ranked pairs with high contact probability were supported by known protein-protein interactions and/or similarities to experimental structural complexes. For high-scoring pairs without experimental complex structures, including CHCHD4-AIFM1, MTERF3-TRUB2, FMC1-ATPAF2, ECSIT-NDUFAF1 and COQ7-COQ9, among others, our coevolution analyses and structural models shed light on the details of their interfaces. We also identified novel PPIs (PYURF-NDUFAF5, LYRM1-MTRF1L and COA8-COX10) for several proteins without experimentally characterized interaction partners, leading to predictions of their molecular functions and the biological processes they are involved in.
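The two-stage workflow described in the abstract (a fast screen over nearly all pairs, followed by slower modeling of only the top-ranked pairs) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: `two_stage_screen`, `fast_score`, `slow_score`, and `top_k` are hypothetical names, the two scoring callables merely stand in for RoseTTAFold and AlphaFold2 contact probabilities, and the protein labels and numbers are made-up toy values.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]


def two_stage_screen(
    pairs: List[Pair],
    fast_score: Callable[[Pair], float],
    slow_score: Callable[[Pair], float],
    top_k: int,
) -> List[Tuple[Pair, float, float]]:
    """Score every pair with the fast method, then re-score only the top_k pairs."""
    # Stage 1: cheap screen over all candidate pairs, best score first.
    screened = sorted(((fast_score(p), p) for p in pairs), reverse=True)
    # Stage 2: run the expensive method only on the best-ranked pairs.
    results = [(p, s, slow_score(p)) for s, p in screened[:top_k]]
    # Report the survivors ordered by the second-stage score.
    results.sort(key=lambda r: r[2], reverse=True)
    return results


# Toy scores (invented for illustration; not results from the paper).
toy_fast = {("P1", "P2"): 0.9, ("P1", "P3"): 0.2, ("P2", "P3"): 0.7, ("P3", "P4"): 0.4}
toy_slow = {("P1", "P2"): 0.8, ("P2", "P3"): 0.95}

ranked = two_stage_screen(
    list(toy_fast), toy_fast.__getitem__, toy_slow.__getitem__, top_k=2
)
for pair, fast, slow in ranked:
    print(pair, fast, slow)
```

The design point is that the expensive scorer is called only `top_k` times, so the overall cost is dominated by the fast screen even when the number of candidate pairs is very large.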
Citations
Journal Article
TL;DR: In this paper, the authors report that in human cells the CcO copper chaperones form macromolecular assemblies and cooperate with several twin CX9C proteins to control heme a biosynthesis and coordinate copper transfer sequentially to the CuA and CuB sites.
Abstract: Mitochondrial cytochrome c oxidase (CcO), or respiratory chain complex IV, is a heme aa3-copper oxygen reductase containing metal centers essential for holo-complex biogenesis and enzymatic function that are assembled by subunit-specific metallochaperones. The enzyme has two copper sites located in the catalytic core subunits. The COX1 subunit harbors the CuB site that tightly associates with heme a3, while the COX2 subunit contains the binuclear CuA site. Here, we report that in human cells the CcO copper chaperones form macromolecular assemblies and cooperate with several twin CX9C proteins to control heme a biosynthesis and coordinate copper transfer sequentially to the CuA and CuB sites. These data on CcO illustrate a mechanism that regulates the biogenesis of macromolecular enzymatic assemblies with several catalytic metal redox centers and prevents the accumulation of cytotoxic reactive assembly intermediates.

17 citations

Journal Article
TL;DR: In this article, evolutionary information deciphered from thousands of homologous sequences that coevolve in interacting partners was used to predict protein-protein interactions (PPIs) at a proteome-wide scale.

1 citation

Journal Article
TL;DR: In this paper, the authors found that the complex I assembly factor NDUFAF8 follows a two-step import pathway linking IMS and matrix import systems, which allows exposure to the IMS disulfide relay.
Abstract: Mitochondria critically rely on protein import and its tight regulation. Here, we found that the complex I assembly factor NDUFAF8 follows a two-step import pathway linking IMS and matrix import systems. A weak targeting sequence drives TIM23-dependent NDUFAF8 matrix import, and en route, allows exposure to the IMS disulfide relay, which oxidizes NDUFAF8. Import is closely surveyed by proteases: YME1L prevents accumulation of excess NDUFAF8 in the IMS, while CLPP degrades reduced NDUFAF8 in the matrix. Therefore, NDUFAF8 can only fulfil its function in complex I biogenesis if both oxidation in the IMS and subsequent matrix import work efficiently. We propose that the two-step import pathway for NDUFAF8 allows integration of the activity of matrix complex I biogenesis pathways with the activity of the mitochondrial disulfide relay system in the IMS. Such coordination might not be limited to NDUFAF8 as we identified further proteins that can follow such a two-step import pathway.

1 citation

Journal Article
TL;DR: This review analyzes the literature from 2021 to 2023 on deep learning methods for the analysis of protein-protein interactions (PPIs), giving an overview of recent studies and of how the techniques and the field have evolved together.
Abstract: Deep learning, a potent branch of artificial intelligence, is steadily leaving its transformative imprint across multiple disciplines. Within computational biology, it is expediting progress in the understanding of Protein–Protein Interactions (PPIs), key components governing a wide array of biological functionalities. Hence, an in-depth exploration of PPIs is crucial for decoding the intricate biological system dynamics and unveiling potential avenues for therapeutic interventions. As the deployment of deep learning techniques in PPI analysis proliferates at an accelerated pace, there exists an immediate demand for an exhaustive review that encapsulates and critically assesses these novel developments. Addressing this requirement, this review offers a detailed analysis of the literature from 2021 to 2023, highlighting the cutting-edge deep learning methodologies harnessed for PPI analysis. Thus, this review stands as a crucial reference for researchers in the discipline, presenting an overview of the recent studies in the field. This consolidation helps elucidate the dynamic paradigm of PPI analysis, the evolution of deep learning techniques, and their interdependent dynamics. This scrutiny is expected to serve as a vital aid for researchers, both well-established and newcomers, assisting them in maneuvering the rapidly shifting terrain of deep learning applications in PPI analysis.
Posted Content
14 Jun 2023-bioRxiv
TL;DR: In this paper, three-dimensional models of five human receptor complexes with cytokines and JAK2 were generated using AlphaFold Multimer, and the binding mode of two eltrombopag molecules to TM α-helices of the active TPOR dimer was proposed.
Abstract: Homodimeric class 1 cytokine receptors include the erythropoietin (EPOR), thrombopoietin (TPOR), granulocyte colony-stimulating factor 3 (CSF3R), growth hormone (GHR), and prolactin receptors (PRLR). They are cell-surface single-pass transmembrane (TM) glycoproteins that regulate cell growth, proliferation, and differentiation and induce oncogenesis. An active TM signaling complex consists of a receptor homodimer, one or two ligands bound to the receptor extracellular domains, and two molecules of Janus Kinase 2 (JAK2) constitutively associated with the receptor intracellular domains. Although crystal structures of soluble extracellular domains with ligands have been obtained for all the receptors except TPOR, little is known about the structure and dynamics of the complete TM complexes that activate the downstream JAK-STAT signaling pathway. Three-dimensional models of five human receptor complexes with cytokines and JAK2 were generated using AlphaFold Multimer. Given the large size of the complexes (from 3220 to 4074 residues), the modeling required a stepwise assembly from smaller parts with selection and validation of the models through comparisons with published experimental data. The modeling of active and inactive complexes supports a general activation mechanism that involves ligand binding to a monomeric receptor followed by receptor dimerization and rotational movement of the receptor TM α-helices, causing proximity, dimerization, and activation of associated JAK2 subunits. The binding mode of two eltrombopag molecules to TM α-helices of the active TPOR dimer was proposed. The models also help elucidate the molecular basis of oncogenic mutations that may involve a non-canonical activation route. Models equilibrated in explicit lipids of the plasma membrane are publicly available.
References
Journal Article
TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Abstract: The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.
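The idea of a position-specific score matrix mentioned in this abstract can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the PSI-BLAST procedure: it assumes a uniform background frequency and a flat pseudocount, omits sequence weighting entirely, and the names `build_pssm` and `score_sequence` as well as the toy alignment are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment: List[str], pseudocount: float = 1.0) -> List[Dict[str, float]]:
    """Return one log-odds score table per alignment column."""
    if not alignment or len({len(row) for row in alignment}) != 1:
        raise ValueError("alignment must be non-empty with equal-length rows")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    pssm = []
    for column in zip(*alignment):
        # Count residues in this column, ignoring gaps and unknown symbols.
        counts = Counter(c for c in column if c in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        pssm.append(
            {
                aa: math.log2(((counts[aa] + pseudocount) / total) / background)
                for aa in AMINO_ACIDS
            }
        )
    return pssm


def score_sequence(pssm: List[Dict[str, float]], sequence: str) -> float:
    """Sum the column-specific scores of an ungapped sequence."""
    return sum(column[aa] for column, aa in zip(pssm, sequence))


toy_alignment = ["ACD", "ACE", "ACD"]  # toy data for illustration only
toy_pssm = build_pssm(toy_alignment)
print(score_sequence(toy_pssm, "ACD"), score_sequence(toy_pssm, "WWW"))
```

Residues that are frequent in a column receive positive scores and unseen residues receive negative ones, which is what lets a profile of this kind pick up weak similarities that a single-sequence query would miss.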

70,111 citations

Journal Article
TL;DR: This version of MAFFT has several new features, including options for adding unaligned sequences into an existing alignment, adjustment of direction in nucleotide alignment, constrained alignment and parallel processing, which were implemented after the previous major update.
Abstract: We report a major update of the MAFFT multiple sequence alignment program. This version has several new features, including options for adding unaligned sequences into an existing alignment, adjustment of direction in nucleotide alignment, constrained alignment and parallel processing, which were implemented after the previous major update. This report shows actual examples to explain how these features work, alone and in combination. Some examples incorrectly aligned by MAFFT are also shown to clarify its limitations. We discuss how to avoid misalignments, and our ongoing efforts to overcome such limitations.

27,771 citations

Journal Article
15 Jul 2021-Nature
TL;DR: AlphaFold uses a novel deep learning architecture to predict protein structures with an accuracy competitive with experimental structures in the majority of cases, including cases in which no homologous structure is available.
Abstract: Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous experimental effort [1–4], the structures of around 100,000 unique proteins have been determined [5], but this represents a small fraction of the billions of known protein sequences [6,7]. Structural coverage is bottlenecked by the months to years of painstaking effort required to determine a single protein structure. Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence (the structure prediction component of the ‘protein folding problem’ [8]) has been an important open research problem for more than 50 years [9]. Despite recent progress [10–14], existing methods fall far short of atomic accuracy, especially when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Critical Assessment of protein Structure Prediction (CASP14) [15], demonstrating accuracy competitive with experimental structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates physical and biological knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm. AlphaFold predicts protein structures with an accuracy competitive with experimental structures in the majority of cases using a novel deep learning architecture.

10,601 citations

Journal Article

TL;DR: A new CD-HIT program, accelerated with a novel parallelization strategy and other techniques, was developed to efficiently cluster the large datasets produced by next-generation sequencing technologies.
Abstract: Summary: CD-HIT is a widely used program for clustering biological sequences to reduce sequence redundancy and improve the performance of other sequence analyses. In response to the rapid increase in the amount of sequencing data produced by the next-generation sequencing technologies, we have developed a new CD-HIT program accelerated with a novel parallelization strategy and some other techniques to allow efficient clustering of such datasets. Our tests demonstrated very good speedup derived from the parallelization for up to ~24 cores and a quasi-linear speedup for up to ~8 cores. The enhanced CD-HIT is capable of handling very large datasets in much shorter time than previous versions. Availability: http://cd-hit.org. Supplementary information: Supplementary data are available at Bioinformatics online.

5,959 citations

Journal Article
04 Feb 1999-Nature
TL;DR: The identification and cloning of an apoptosis-inducing factor, AIF, which is sufficient to induce apoptosis of isolated nuclei is reported, indicating that AIF is a mitochondrial effector of apoptotic cell death.
Abstract: Mitochondria play a key part in the regulation of apoptosis (cell death). Their intermembrane space contains several proteins that are liberated through the outer membrane in order to participate in the degradation phase of apoptosis. Here we report the identification and cloning of an apoptosis-inducing factor, AIF, which is sufficient to induce apoptosis of isolated nuclei. AIF is a flavoprotein of relative molecular mass 57,000 which shares homology with the bacterial oxidoreductases; it is normally confined to mitochondria but translocates to the nucleus when apoptosis is induced. Recombinant AIF causes chromatin condensation in isolated nuclei and large-scale fragmentation of DNA. It induces purified mitochondria to release the apoptogenic proteins cytochrome c and caspase-9. Microinjection of AIF into the cytoplasm of intact cells induces condensation of chromatin, dissipation of the mitochondrial transmembrane potential, and exposure of phosphatidylserine in the plasma membrane. None of these effects is prevented by the wide-ranging caspase inhibitor known as Z-VAD.fmk. Overexpression of Bcl-2, which controls the opening of mitochondrial permeability transition pores, prevents the release of AIF from the mitochondrion but does not affect its apoptogenic activity. These results indicate that AIF is a mitochondrial effector of apoptotic cell death.

4,095 citations