Journal Article

Identification of Oxa1 Homologs Operating in the Eukaryotic Endoplasmic Reticulum

TL;DR: Findings suggest a specific biochemical function for TMCO1 and define a superfamily of proteins—the “Oxa1 superfamily”—whose shared function is to facilitate membrane protein biogenesis.
About: This article is published in Cell Reports. The article was published on 2017-12-26 and is currently open access. It has received 91 citations to date. The article focuses on the topics: ER membrane protein complex & Chloroplast thylakoid membrane.
Citations
01 Jan 2011
TL;DR: This paper reported a genome-wide association study for open-angle glaucoma (OAG) blindness using a discovery cohort of 590 individuals with severe visual field loss (cases) and 3,956 controls.
Abstract: We report a genome-wide association study for open-angle glaucoma (OAG) blindness using a discovery cohort of 590 individuals with severe visual field loss (cases) and 3,956 controls. We identified associated loci at TMCO1 (rs4656461[G] odds ratio (OR) = 1.68, P = 6.1 × 10⁻¹⁰) and CDKN2B-AS1 (rs4977756[A] OR = 1.50, P = 4.7 × 10⁻⁹). We replicated these associations in an independent cohort of cases with advanced OAG (rs4656461 P = 0.010; rs4977756 P = 0.042) and two additional cohorts of less severe OAG (rs4656461 combined discovery and replication P = 6.00 × 10⁻¹⁴, OR = 1.51, 95% CI 1.35-1.68; rs4977756 combined P = 1.35 × 10⁻¹⁴, OR = 1.39, 95% CI 1.28-1.51). We show retinal expression of genes at both loci in human ocular tissues. We also show that CDKN2A and CDKN2B are upregulated in the retina of a rat model of glaucoma. © 2011 Nature America, Inc. All rights reserved.

347 citations

Journal Article
26 Jan 2018-Science
TL;DR: It is found that known membrane insertion pathways fail to effectively engage tail-anchored membrane proteins with moderately hydrophobic transmembrane domains, and these proteins are instead shielded in the cytosol by calmodulin.
Abstract: Insertion of proteins into membranes is an essential cellular process. The extensive biophysical and topological diversity of membrane proteins necessitates multiple insertion pathways that remain incompletely defined. Here we found that known membrane insertion pathways fail to effectively engage tail-anchored membrane proteins with moderately hydrophobic transmembrane domains. These proteins are instead shielded in the cytosol by calmodulin. Dynamic release from calmodulin allowed sampling of the endoplasmic reticulum (ER), where the conserved ER membrane protein complex (EMC) was shown to be essential for efficient insertion in vitro and in cells. Purified EMC in synthetic liposomes catalyzed the insertion of its substrates in a reconstituted system. Thus, EMC is a transmembrane domain insertase, a function that may explain its widely pleiotropic membrane-associated phenotypes across organisms.

204 citations

Journal Article
TL;DR: This review examines the molecular biology of flaviviruses touching on the structure and function of viral components and how these interact with host factors, and highlights the role of a noncoding RNA produced by flaviviruses to impair antiviral host immune responses.
Abstract: Flaviviruses, such as dengue, Japanese encephalitis, tick-borne encephalitis, West Nile, yellow fever, and Zika viruses, are critically important human pathogens that sicken a staggeringly high number of humans every year. Most of these pathogens are transmitted by mosquitos, and not surprisingly, as the earth warms and human populations grow and move, their geographic reach is increasing. Flaviviruses are simple RNA–protein machines that carry out protein synthesis, genome replication, and virion packaging in close association with cellular lipid membranes. In this review, we examine the molecular biology of flaviviruses touching on the structure and function of viral components and how these interact with host factors. The latter are functionally divided into pro-viral and antiviral factors, both of which, not surprisingly, include many RNA binding proteins. In the interface between the virus and the hosts we highlight the role of a noncoding RNA produced by flaviviruses to impair antiviral host immune ...

184 citations

Journal Article
29 Nov 2018-Cell
TL;DR: It is found that efficient biogenesis of β1-adrenergic receptor (β1AR) and other G protein-coupled receptors (GPCRs) requires the conserved ER membrane protein complex (EMC), which inserts TMDs co-translationally and cooperates with the Sec61 translocon to ensure accurate topogenesis of many membrane proteins.

145 citations

Journal Article
29 May 2018-eLife
TL;DR: The systematic proteomic approaches revealed that the ER membrane protein complex (EMC) binds to and promotes the biogenesis of a range of multipass transmembrane proteins, with a particular enrichment for transporters.
Abstract: The endoplasmic reticulum (ER) supports biosynthesis of proteins with diverse transmembrane domain (TMD) lengths and hydrophobicity. Features in transmembrane domains such as charged residues in ion channels are often functionally important, but could pose a challenge during cotranslational membrane insertion and folding. Our systematic proteomic approaches in both yeast and human cells revealed that the ER membrane protein complex (EMC) binds to and promotes the biogenesis of a range of multipass transmembrane proteins, with a particular enrichment for transporters. Proximity-specific ribosome profiling demonstrates that the EMC engages clients cotranslationally and immediately following clusters of TMDs enriched for charged residues. The EMC can remain associated after completion of translation, which both protects clients from premature degradation and allows recruitment of substrate-specific and general chaperones. Thus, the EMC broadly enables the biogenesis of multipass transmembrane proteins containing destabilizing features, thereby mitigating the trade-off between function and stability.

145 citations


Cites background from "Identification of Oxa1 Homologs Ope..."

  • ...Interestingly, EMC3 may share a common ancestry with the universally conserved YidC/Oxa1/Alb3 protein family in bacteria and mitochondria (38)....

    [...]

  • ...…wide diversity of membrane spanning sequences by directly interacting with select membrane proteins with destabilizing features in TMDs. Interestingly, EMC3 may share a common ancestry with the universally conserved YidC/Oxa1/Alb3 protein family in bacteria and mitochondria (Anghel et al., 2017)....

    [...]

References
Journal Article
TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Abstract: The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.
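The idea of combining aligned hits into a position-specific score matrix can be illustrated with a minimal Python sketch. It is not the PSI-BLAST implementation: the uniform background frequencies, the flat pseudocount, the absence of sequence weighting, and the names `build_pssm` and `score_sequence` are all simplifying assumptions made for illustration.

```python
import math
from collections import Counter

# Assumption: uniform background over the 20 standard amino acids.
# PSI-BLAST itself uses empirical frequencies and sequence weighting.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}


def build_pssm(aligned_seqs, pseudocount=1.0):
    """Return one {residue: log2-odds score} dict per alignment column."""
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    pssm = []
    for col in range(length):
        # Count only standard residues; gaps and unknowns are ignored.
        counts = Counter(s[col] for s in aligned_seqs if s[col] in BACKGROUND)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        column_scores = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            column_scores[aa] = math.log2(freq / BACKGROUND[aa])
        pssm.append(column_scores)
    return pssm


def score_sequence(pssm, seq):
    """Sum per-column scores for an ungapped sequence of matching length."""
    return sum(col[aa] for col, aa in zip(pssm, seq))


if __name__ == "__main__":
    hits = ["MKVL", "MRVL", "MKIL", "MKVL"]  # toy alignment, not real data
    pssm = build_pssm(hits)
    print(score_sequence(pssm, "MKVL"), score_sequence(pssm, "WWWW"))
```

In an iterated search, the matrix built from one round of hits would be used to score database sequences in the next round, which is how weak similarities missed by a single-sequence query can become detectable.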

70,111 citations


"Identification of Oxa1 Homologs Ope..." refers methods in this paper

  • ...For each of these protein families, homologs were retrieved using PSI-Blast (Altschul et al., 1997) with an expected threshold cutoff of 10⁻¹....

    [...]

Journal Article
TL;DR: MUSCLE is a new computer program for creating multiple alignments of protein sequences that includes fast distance estimation using kmer counting, progressive alignment using a new profile function the authors call the log-expectation score, and refinement using tree-dependent restricted partitioning.
Abstract: We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5.com/muscle.
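Distance estimation by k-mer counting can be sketched in a few lines of Python. The measure below (one minus the fraction of shared k-mers, normalized by the shorter sequence) is an assumed, simplified stand-in rather than MUSCLE's exact formula, and the function names are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k=3):
    """Count every overlapping k-mer in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(seq_a, seq_b, k=3):
    """Alignment-free distance: 1 - shared k-mers / k-mers in the shorter sequence."""
    possible = min(len(seq_a), len(seq_b)) - k + 1
    if possible <= 0:
        raise ValueError("sequences must be at least k residues long")
    counts_a, counts_b = kmer_counts(seq_a, k), kmer_counts(seq_b, k)
    # Counter intersection keeps the minimum count of each shared k-mer.
    shared = sum((counts_a & counts_b).values())
    return 1.0 - shared / possible


if __name__ == "__main__":
    print(kmer_distance("ACDEFGHIK", "ACDEFGHIR"))
```

Because no alignment is required, a measure of this kind is cheap to compute for every pair of sequences, which is what makes it useful for building a first guide tree before progressive alignment.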

37,524 citations


"Identification of Oxa1 Homologs Ope..." refers methods in this paper

  • ...[PubMed: 9184221] Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput....

    [...]

  • ...Proteins in this list were then aligned using MUSCLE (Edgar, 2004)....

    [...]

Journal Article
TL;DR: This work has used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches.
Abstract: The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximum-likelihood principle, which clearly satisfies these requirements. The core of this method is a simple hill-climbing algorithm that adjusts tree topology and branch lengths simultaneously. This algorithm starts from an initial tree built by a fast distance-based method and modifies this tree to improve its likelihood at each iteration. Due to this simultaneous adjustment of the topology and branch lengths, only a few iterations are sufficient to reach an optimum. We used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches. The reduction of computing time is dramatic in comparison with other maximum-likelihood packages, while the likelihood maximization ability tends to be higher. For example, only 12 min were required on a standard personal computer to analyze a data set consisting of 500 rbcL sequences with 1,428 base pairs from plant plastids, thus reaching a speed of the same order as some popular distance-based and parsimony algorithms. This new method is implemented in the PHYML program, which is freely available on our web page: http://www.lirmm.fr/w3ifa/MAAS/. (Algorithm; computer simulations; maximum likelihood; phylogeny; rbcL; RDPII project.)
The size of homologous sequence data sets has increased dramatically in recent years, and many of these data sets now involve several hundreds of taxa. Moreover, current probabilistic sequence evolution models (Swofford et al., 1996; Page and Holmes, 1998), notably those including rate variation among sites (Uzzell and Corbin, 1971; Jin and Nei, 1990; Yang, 1996), require an increasing number of calculations. Therefore, the speed of phylogeny reconstruction methods is becoming a significant requirement and good compromises between speed and accuracy must be found. The maximum likelihood (ML) approach is especially accurate for building molecular phylogenies. Felsenstein (1981) brought this framework to nucleotide-based phylogenetic inference, and it was later also applied to amino acid sequences (Kishino et al., 1990). Several variants were proposed, most notably the Bayesian methods (Rannala and Yang 1996; and see below), and the discrete Fourier analysis of Hendy et al. (1994), for example. Numerous computer studies (Huelsenbeck and Hillis, 1993; Kuhner and Felsenstein, 1994; Huelsenbeck, 1995; Rosenberg and Kumar, 2001; Ranwez and Gascuel, 2002) have shown that ML programs can recover the correct tree from simulated data sets more frequently than other methods can. Another important advantage of the ML approach is the ability to compare different trees and evolutionary models within a statistical framework (see Whelan et al., 2001, for a review). However, like all optimality criterion-based phylogenetic reconstruction approaches, ML is hampered by computational difficulties, making it impossible to obtain the optimal tree with certainty from even moderate data sets (Swofford et al., 1996). Therefore, all practical methods rely on heuristics that obtain near-optimal trees in reasonable computing time. Moreover, the computation problem is especially difficult with ML, because the tree likelihood not only depends on the tree topology but also on numerical parameters, including branch lengths. Even computing the optimal values of these parameters on a single tree is not an easy task, particularly because of possible local optima (Chor et al., 2000).
The usual heuristic method, implemented in the popular PHYLIP (Felsenstein, 1993) and PAUP* (Swofford, 1999) packages, is based on hill climbing. It combines stepwise insertion of taxa in a growing tree and topological rearrangement. For each possible insertion position and rearrangement, the branch lengths of the resulting tree are optimized and the tree likelihood is computed. When the rearrangement improves the current tree or when the position insertion is the best among all possible positions, the corresponding tree becomes the new current tree. Simple rearrangements are used during tree growing, namely "nearest neighbor interchanges" (see below), while more intense rearrangements can be used once all taxa have been inserted. The procedure stops when no rearrangement improves the current best tree. Despite significant decreases in computing times, notably in fastDNAml (Olsen et al., 1994), this heuristic becomes impracticable with several hundreds of taxa. This is mainly due to the two-level strategy, which separates branch lengths and tree topology optimization. Indeed, most calculations are done to optimize the branch lengths and evaluate the likelihood of trees that are finally rejected.
New methods have thus been proposed. Strimmer and von Haeseler (1996) and others have assembled four-taxon (quartet) trees inferred by ML, in order to reconstruct a complete tree. However, the results of this approach have not been very satisfactory to date (Ranwez and Gascuel, 2001). Ota and Li (2000, 2001) described

16,261 citations


"Identification of Oxa1 Homologs Ope..." refers methods in this paper

  • ...A maximum-likelihood phylogenetic tree was built using PhyML-SMS (Guindon and Gascuel, 2003) using nearest-neighbor interchange (NNI) and the Akaike information criterion....

    [...]
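Model selection by the Akaike information criterion, mentioned in the excerpt above, reduces to a small calculation: AIC = 2k - 2 ln L, where k is the number of free parameters and ln L is the maximized log-likelihood, and the model with the lowest AIC is preferred. The Python sketch below illustrates that arithmetic only; the function names and the log-likelihood and parameter values are hypothetical and are not taken from the paper or from PhyML-SMS output.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def select_by_aic(candidates):
    """Pick the model name with the lowest AIC.

    candidates maps a model name to (maximized log-likelihood, free parameters).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


if __name__ == "__main__":
    # Hypothetical fits of three substitution models to the same alignment.
    fits = {
        "model_A": (-10250.0, 40),
        "model_B": (-10235.0, 41),
        "model_C": (-10234.5, 60),
    }
    for name, (lnl, k) in fits.items():
        print(name, aic(lnl, k))
    print("selected:", select_by_aic(fits))
```

The penalty term 2k is what keeps a parameter-rich model from winning on likelihood alone: in the toy values, model_C has the best likelihood but its extra parameters cost more than the improvement is worth.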

Journal Article
TL;DR: The Crystallography & NMR System (CNS) is a software suite for macromolecular structure determination by X-ray crystallography or solution nuclear magnetic resonance (NMR) spectroscopy.
Abstract: A new software suite, called Crystallography & NMR System (CNS), has been developed for macromolecular structure determination by X-ray crystallography or solution nuclear magnetic resonance (NMR) spectroscopy. In contrast to existing structure-determination programs the architecture of CNS is highly flexible, allowing for extension to other structure-determination methods, such as electron microscopy and solid-state NMR spectroscopy. CNS has a hierarchical structure: a high-level hypertext markup language (HTML) user interface, task-oriented user input files, module files, a symbolic structure-determination language (CNS language), and low-level source code. Each layer is accessible to the user. The novice user may just use the HTML interface, while the more advanced user may use any of the other layers. The source code will be distributed, thus source-code modification is possible. The CNS language is sufficiently powerful and flexible that many new algorithms can be easily implemented in the CNS language without changes to the source code. The CNS language allows the user to perform operations on data structures, such as structure factors, electron-density maps, and atomic properties. The power of the CNS language has been demonstrated by the implementation of a comprehensive set of crystallographic procedures for phasing, density modification and refinement. User-friendly task-oriented input files are available for nearly all aspects of macromolecular structure determination by X-ray crystallography and solution NMR.

15,182 citations

Journal Article
TL;DR: TrimAl is a tool for automated alignment trimming, which is especially suited for large-scale phylogenetic analyses and can automatically select the parameters to be used in each specific alignment so that the signal-to-noise ratio is optimized.
Abstract: Multiple sequence alignments (MSA) are central to many areas of bioinformatics, including phylogenetics, homology modeling, database searches and motif finding. Recently, such MSA-based techniques have been incorporated in high-throughput pipelines such as genome annotation and phylogenomics analyses. In all these applications, the reliability and accuracy of the analyses depend critically on the quality of the underlying alignments. A plethora of computer programs and algorithms for MSA are currently available (Notredame, 2007), which implement different heuristics to find mathematically optimal solutions to the MSA problem. Accuracies of 80–90% have been reported for the best algorithms, but even the best scoring alignment algorithms may fail with certain protein families or at specific regions in the alignment. The situation worsens in large-scale analyses, where faster but less reliable algorithms and large numbers of automatically selected sequences are used. It is therefore generally assumed that trimming the alignment, so that poorly aligned regions are eliminated, increases the accuracy of the resulting MSA-based applications (Talavera and Castresana, 2007). Some programs such as G-blocks (Castresana, 2000) have been developed to assist in the MSA trimming phase by selecting blocks of conserved regions. They have become very popular and are extensively used, with good performance, in small-to-medium scale datasets, where several parameters can be tested manually (Talavera and Castresana, 2007). However, their use over larger datasets is hampered by the need for defining, prior to the analysis, the set of parameters that will be used for all sequence families. Here, we present trimAl, a tool for automated alignment trimming. 
Its speed and the possibility for automatically adjusting the parameters to improve the phylogenetic signal-to-noise ratio, makes trimAl especially suited for large-scale phylogenomic analyses, involving thousands of large alignments. trimAl has been developed in a GNU/Linux environment using C++ programming language and has been tested on various UNIX, Mac and Windows platforms. Moreover, we have developed a web server to run trimAl online (http://phylemon2.bioinfo.cipf.es/), which has been included in the Phylemon suite for phylogenetic and phylogenomic tools (Tarraga et al., 2007). The documentation, source files and additional information for trimAl are available through a wiki page (http://trimal.cgenomics.org). trimAl reads and renders protein or nucleotide alignments in several standard formats. trimAl starts by reading all columns in an alignment and computes a score (Sx) for each of them. This score can be a gap score (Sg), a similarity score (Ss) or a consistency score (Sc). The score for each column can be computed based only on the information from that column or, if a window size of w is specified, it corresponds to the average value of w columns around the position considered. The gap score (Sg) for a column is the fraction of sequences without a gap in that position. The residue similarity score (Ss) consists of mean distance (MD) scores as described in Thompson et al. (2001) and Supplementary Material. This score uses the MD between pairs of residues, as defined by a given scoring matrix. Finally, the consistency score (Sc) can only be computed when more than one alignment for the same set of sequences is provided. Details on how these scores are computed are provided in the Supplementary Material. In brief, Sc measures the level of consistency of all the residue pairs found in a column as compared with the other alignments. 
The alignment with the highest consistency is chosen and then trimmed to remove the columns that are less conserved, according to Sc or other thresholds set by the user. Once all column scores have been computed trimAl can proceed in two ways. If both a score and a minimum conservation threshold are provided, trimAl renders a trimmed alignment in which only the columns with scores above the score threshold are included, as far as the number of selected columns is above a conservation threshold defined by the user. If this number is below the conservation threshold, trimAl will add more columns to the trimmed alignment in a decreasing order of scores until the conservation threshold is reached. The conservation threshold corresponds to the minimum percentage of columns, from the original alignment, which the user wants to include in the trimmed alignment. Alternatively, if the automatic selection of parameters options is selected, trimAl will compute specific score thresholds depending on the inherent characteristics of each alignment. So far, trimAl incorporates three modes for the automated selection of parameters, gappyout, strict and strictplus, which are based on the different use of gap and similarity scores. Moreover, the option automated1 implements a heuristic to decide the most appropriate mode depending on the alignment characteristics. The heuristics to define such parameters have been designed based on the results of a benchmark. Details on the heuristics and the benchmark can be found in the online documentation of the program. In brief, the automatic selection of parameters approximate optimal cutoffs by plotting, internally, the cumulative graphs of gap and similarity scores of the columns in the alignment (see online documentation). 
We expanded, using ROSE simulations (Stoye et al., 1998) a benchmark set that has been used previously to test the improvement in phylogenetic performance after an alignment trimming phase (Talavera and Castresana, 2007). This dataset simulates several evolutionary scenarios varying in the number and length of the sequences, the topology of the underlying tree and the level of sequence divergence considered. We compared the results obtained from MUSCLE alignments before and after trimming with trimAl using automated selection of parameters. The accuracy of the resulting trees was measured by comparing them with the original trees used to generate the sequence sets, and measuring the Robinson Foulds distance (Robinson and Foulds, 1981). We observed an overall improvement of the phylogenetic accuracy after trimming. Using -automated1 option of trimAl, the trimmed alignment always produced Maximum Likelihood trees that were of equal (36%) or significantly better (64%) quality as compared with the tree derived from the complete alignment. For Neighbor Joining reconstruction the -strictplus option of trimAl worked best, improving the phylogenetic accuracy in 89% of the scenarios. In most scenarios (90%), trimAl outperformed Gblocks v0.91b with default parameters. Most importantly, the use of Gblocks default parameters diminished the accuracy of the subsequent tree reconstruction in half of the scenarios considered. In contrast, the use of trimAl automated methods rarely (1.5%) undermined the topological accuracy of the resulting phylogenetic tree (see Supplementary Material for more details). To test the applicability of trimAl on real datasets as well as its suitability for large-scale phylogenetic datasets, we ran trimAl on the complete set of MUSCLE alignments generated for the Human Phylome project (Huerta-Cepas et al., 2007). This includes a total of 31 182 alignments, containing, on average, 67 sequences of 1472 positions of length. 
Trimming these alignments using the -gappyout and -automated1 options took 5 min 45 s and 125 min 2 s, respectively, on a computer with an Intel QuadCore XEON E5410 processor and 8 GB of RAM. trimAl has been used previously in a pipeline to reconstruct complete collections of gene trees. In this case, the parameter sets used were a minimum conservation threshold of 60% and a gap threshold of 90% (-cons 60 -gt 0.9). Complete and trimmed alignments used to generate the phylomes included in PhylomeDB (Huerta-Cepas et al., 2008) can be viewed through this database.
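The gap-score trimming described in this abstract can be sketched in Python as follows. This is an illustration of the stated rule, not trimAl's code: the function names are hypothetical, the gap character is assumed to be "-", columns at exactly the threshold are assumed to be kept, and window averaging, similarity scores and consistency scores are left out.

```python
import math


def gap_scores(alignment):
    """Gap score per column: fraction of sequences without a gap there."""
    n_seqs = len(alignment)
    length = len(alignment[0])
    return [sum(seq[c] != "-" for seq in alignment) / n_seqs
            for c in range(length)]


def trim_columns(alignment, gap_threshold=0.9, min_conserved=0.6):
    """Keep columns scoring at or above gap_threshold.

    If fewer than min_conserved of the original columns survive, add the
    best-scoring remaining columns until that fraction is reached.
    """
    scores = gap_scores(alignment)
    keep = {c for c, s in enumerate(scores) if s >= gap_threshold}
    required = math.ceil(min_conserved * len(scores))
    if len(keep) < required:
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        for c in ranked:
            if len(keep) >= required:
                break
            keep.add(c)
    cols = sorted(keep)  # preserve the original column order
    return ["".join(seq[c] for c in cols) for seq in alignment]


if __name__ == "__main__":
    toy = ["AC-DE", "A--DE", "ACGDE", "AC-D-"]  # toy alignment, not real data
    print(gap_scores(toy))
    print(trim_columns(toy, gap_threshold=0.9, min_conserved=0.5))
```

The defaults mirror the values quoted in the abstract (a gap threshold of 0.9 and a 60% conservation floor). Calling `trim_columns(aln, gap_threshold=0.4, min_conserved=0.0)` would apply a single 0.4 gap cutoff with no conservation floor.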

6,807 citations


"Identification of Oxa1 Homologs Ope..." refers methods in this paper

  • ...Gaps in the alignment were trimmed using TrimAl (Capella-Gutiérrez et al., 2009) with a cutoff of 0.4....

    [...]
