
A homozygous pathogenic missense variant broadens the phenotypic and mutational spectrum of CREB3L1-related osteogenesis imperfecta

01 Jun 2019-Human Molecular Genetics (Oxford University Press (OUP))-Vol. 28, Iss: 11, pp 1801-1809
TL;DR: The first homozygous pathogenic missense variant is identified in a patient with lethal OI, which is located within the highly conserved basic leucine zipper domain, four amino acids upstream of the DNA binding domain, and affects a critical residue in this functional domain, thereby decreasing the type I collagen transcriptional binding ability.
Abstract: The cyclic adenosine monophosphate responsive element binding protein 3-like 1 (CREB3L1) gene codes for the endoplasmic reticulum stress transducer old astrocyte specifically induced substance (OASIS), which has an important role in osteoblast differentiation during bone development. Deficiency of OASIS is linked to a severe form of autosomal recessive osteogenesis imperfecta (OI), but only few patients have been reported. We identified the first homozygous pathogenic missense variant [p.(Ala304Val)] in a patient with lethal OI, which is located within the highly conserved basic leucine zipper domain, four amino acids upstream of the DNA binding domain. In vitro structural modeling and luciferase assays demonstrate that this missense variant affects a critical residue in this functional domain, thereby decreasing the type I collagen transcriptional binding ability. In addition, overexpression of the mutant OASIS protein leads to decreased transcription of the SEC23A and SEC24D genes, which code for components of the coat protein complex type II (COPII), and aberrant OASIS signaling also results in decreased protein levels of SEC24D. Our findings therefore provide additional proof of the potential involvement of the COPII secretory complex in the context of bone-associated disease.

Summary

Introduction

  • Osteogenesis imperfecta (OI) is a clinically and genetically heterogeneous group of heritable bone dysplasias, with the severity of symptoms ranging from perinatal lethality to generalized osteopenia (1).
  • This brittle bone disease affects one in 15,000-20,000 births and is characterized by typical clinical manifestations such as bone fragility, skeletal deformities, low bone mass and short stature.
  • The CREB3L1 gene (cAMP Responsive Element Binding Protein 3 Like 1) encodes the endoplasmic reticulum (ER)-stress transducer ‘old astrocyte specifically induced substance’ (OASIS), a basic leucine zipper (bZIP) transcription factor which belongs to the well-conserved family of the cyclic AMP responsive element binding protein/activating transcription factor (CREB/ATF) genes.

Clinical phenotype

  • The authors report a consanguineous Turkish family of second cousins, who had a medically terminated pregnancy at 19 weeks of gestation due to skeletal changes highly suggestive of severe OI.
  • Antenatal ultrasound findings of the female fetus (IV-3, Fig. 1) included short tubular bones, multiple rib fractures with beaded appearance, and a narrow thorax circumference of 81 mm (2.5-5th percentile).
  • The parents (III-7 and III-8, Fig. 1) did not show any overt clinical signs of OI and had no history of fractures.
  • The affected alanine (Ala) residue is located within a highly conserved bipartite nuclear localization sequence (NLS) within the bZIP domain, only 4 amino acids (AA) upstream of the DNA binding domain, in which the earlier reported in-frame deletion p.(Lys312del) is located (Fig. 2, Fig. 3A) (22).
  • Biochemical assays using a luciferase reporter were performed in order to validate the direct impact of the p.(Ala304Val) variant on the regulation of the expression of the downstream target genes of OASIS, using type I collagen expression as a representative example.

Discussion

  • This is the first report linking a pathogenic missense variant to the CREB3L1-related AR form of OI, and the 4th case in total implicating this gene.
  • In contrast to the two earlier reports, the child presented by Lindahl et al. survived infancy.
  • Together, these findings suggest that both p.(Ala304Val) and p.(Lys312del) have similar working mechanisms; they both form stable mutant proteins, which subsequently might accumulate in the cytosol.
  • Recent studies have shown that monoubiquitylation of SEC31A helps to regulate COPII size, that glycosylation of both SEC24 and SEC23 is important for organization and regulation of COPII vesicles, and that phosphorylation of SEC23 and SEC24 confers directionality on COPII vesicles from ER to Golgi (34, 38).
  • Keller et al. first proposed that mutations in OASIS can lead to OI due to disruption of the important role this protein plays in the secretion of type I collagen and other bone matrix proteins from osteoblasts during osteogenesis (22, 23, 33).

Ethical considerations

  • Written and signed informed consent was obtained from the parents of the patient participating in this study.
  • Genomic DNA (gDNA) from the proband, siblings or parents was isolated from whole blood according to the standard procedures.

Molecular studies

  • The authors used conventional Sanger sequencing and next generation panel sequencing (MiSeq platform – Illumina) for molecular screening of the COL1A1, COL1A2, CRTAP, LEPRE1, PPIB, CREB3L1, WNT1, PLS3, BMP1, FKBP10, IFITM5, PLOD2, SERPINF1, SERPINH1, SP7 and TMEM38B genes.
  • For NGS, every base of the coding exons, including up to 20 flanking intronic bases, was covered at a minimum depth of 30x.
  • Confirmatory Sanger sequencing and segregation analysis were performed using the BigDye Terminator Cycle Sequencing Kit (Life Technologies, Carlsbad, CA, USA) and run on an ABI 3730XL DNA Analyzer (Life Technologies).
  • Nucleotide numbering reflects cDNA numbering, with +1 corresponding to the A nucleotide of the ATG translation initiation codon in the reference sequence of CREB3L1 (NM_052854.2).
  • Variant nomenclature follows the Human Genome Variation Society (HGVS) guidelines (http://www.hgvs.org/mutnomen), and variant classification was done by using the Alamut Visual software (version 2.10) and according to the American College of Medical Genetics (ACMG) standards and guidelines (Genome Aggregation Database, http://gnomad.broadinstitute.org) (24, 39).
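
The cDNA numbering convention described above (c.1 is the A of the ATG start codon) makes it possible to check by simple arithmetic that each reported coding position falls in the codon named in the protein-level description. The following is a minimal sketch of that arithmetic only; the helper name `codon_position` is hypothetical, and the script does not consult the NM_052854.2 reference sequence.

```python
from __future__ import annotations


def codon_position(cdna_pos: int) -> tuple[int, int]:
    """Map a 1-based coding position (c.1 = A of ATG) to
    (codon number, base within the codon as 1, 2 or 3)."""
    if cdna_pos < 1:
        raise ValueError("coding positions start at 1")
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


# Coding positions of the CREB3L1 variants named in the article.
variants = [
    ("c.911C>T, p.(Ala304Val)", 911),
    ("c.934_936delAAG, first deleted base, p.(Lys312del)", 934),
    ("c.934_936delAAG, last deleted base, p.(Lys312del)", 936),
    ("c.1284C>A, p.(Tyr428*)", 1284),
]

for label, pos in variants:
    codon, base = codon_position(pos)
    print(f"{label}: codon {codon}, base {base} of 3")
```

Under this convention, c.911 is the second base of codon 304, c.934 to c.936 span exactly codon 312, and c.1284 is the third base of codon 428, in agreement with the protein-level names.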

Structural modeling of the variant

  • By means of the I-TASSER server, which is an integrated platform for automated protein structure and function prediction based on the sequence-to-structure-to-function paradigm, 5 different three dimensional structural protein models were generated of the full length WT, p.(Ala304Val), and p.(Lys312del) protein sequences (26-28).
  • The homology model of the CREB bZIP-CRE complex (PDB: 1DH3 – Mus musculus – generated in the expression system of Escherichia coli) was used as a template (30).
  • The UCSF Chimera software package (version 1.13, build 41780) was used to visualize, study the localization, and model the effect of the specific protein variant (Dunbrack rotamers and FindHBond function), respectively (32, 40).

Expression vectors

  • The primers for site-directed mutagenesis were designed using the QuikChange Primer Design tool and were purchased as HPLC-purified primers (primer sequences are listed in the Supplementary Table S1) (Integrated DNA Technologies).
  • A pCMV-3Tag-2 empty vector (cat240196, Agilent) was purchased to use as a transfection control in their experiments (‘Empty’).
  • Final constructs were sequenced, and a control digestion was performed to confirm correct vector structure (data not shown).

Luciferase reporter assay

  • For the luciferase experiments, 20,000 HEK293 cells were seeded in clear bottom 96 well plates (CLS3603-48EA, Sigma-Aldrich) in triplicate at day 1 and transiently co-transfected at day 3 using FuGene HD transfection reagent (E2311, Promega).
  • Twenty-four hours post transfection, cells were lysed according to the manufacturer's guidelines (Dual-Glo Luciferase Assay System, Promega) and luciferase activity was measured using a GloMax-Multi Detection System (E7031, Promega).
  • Graphs display data-points normalized to WT values.
  • In brief, 200,000 HEK293 cells were seeded in 6-well plates in triplicate at day 1 and transiently transfected at day 2 using FuGene HD transfection reagent (E2311, Promega) at a 3:1 ratio (3 µl reagent : 1 µg plasmid) per well and incubated for 48 hours before harvesting.
  • These cells were subsequently processed for quantitative reverse-transcription PCR (RT-qPCR) or immunoblotting.
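
The luciferase readout described above is reported relative to WT. A minimal sketch of one common way to compute such values is given below. It assumes that each well yields a firefly and a Renilla signal (as a dual-reporter assay provides) and that the firefly/Renilla ratio is used as the per-well measure; the article summary does not state this step explicitly, and the readings shown are hypothetical, not data from the study.

```python
from statistics import mean


def relative_luciferase(readings, reference="WT"):
    """readings maps condition -> list of (firefly, renilla) tuples, one per well.
    Returns condition -> mean firefly/Renilla ratio divided by the reference mean."""
    ratios = {
        condition: [firefly / renilla for firefly, renilla in wells]
        for condition, wells in readings.items()
    }
    reference_mean = mean(ratios[reference])
    return {condition: mean(vals) / reference_mean for condition, vals in ratios.items()}


# Hypothetical triplicate readings in arbitrary units (assumption, for illustration only).
example = {
    "WT": [(1000.0, 100.0), (1100.0, 100.0), (900.0, 100.0)],
    "p.(Ala304Val)": [(500.0, 100.0), (450.0, 100.0), (550.0, 100.0)],
    "Empty": [(100.0, 100.0), (120.0, 100.0), (80.0, 100.0)],
}

for condition, value in relative_luciferase(example).items():
    print(f"{condition}: {value:.2f} relative to WT")
```

Dividing by the co-transfected control signal before scaling to WT corrects for well-to-well differences in transfection efficiency and cell number, which is why the ratio is taken per well rather than on the averaged signals.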

Quantitative reverse transcription PCR

  • Total RNA was extracted from transfected HEK293 cells using the RNeasy Kit.
  • Starting from 2 µg of RNA, cDNA was subsequently synthesized with the iScript cDNA Synthesis Kit (Bio-Rad Laboratories).
  • RT-qPCR reactions were prepared with the addition of RealTime ready DNA Probes Master mix and ResoLight Dye and were run in duplicate on a LightCycler 480 System.
  • Data were analyzed with qbase+ software (version 3.0, Biogazelle) (42), and expression was normalized to the housekeeping genes HPRT1, RPL13A and YWHAZ.
  • Graphs display data-points normalized to WT values.
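
Normalizing a target gene to several housekeeping genes and then to WT, as described above, can be sketched as follows. This is an illustration of the general multi-reference approach (relative quantities divided by the geometric mean of the reference-gene quantities), not a reproduction of the qbase+ software. It assumes a uniform amplification efficiency of 2, and all Cq values are hypothetical.

```python
from math import prod


def relative_quantity(cq, efficiency=2.0):
    """Convert a Cq value to a relative quantity, assuming a fixed efficiency."""
    return efficiency ** (-cq)


def normalized_expression(target_cq, reference_cqs, efficiency=2.0):
    """Target quantity divided by the geometric mean of the reference-gene quantities."""
    ref_rq = [relative_quantity(cq, efficiency) for cq in reference_cqs.values()]
    norm_factor = prod(ref_rq) ** (1.0 / len(ref_rq))
    return relative_quantity(target_cq, efficiency) / norm_factor


def fold_vs_reference(samples, reference="WT"):
    """Scale each sample's normalized expression to the reference sample."""
    levels = {
        name: normalized_expression(s["target"], s["refs"]) for name, s in samples.items()
    }
    return {name: level / levels[reference] for name, level in levels.items()}


# Hypothetical Cq values for one target gene (assumption, not study data).
example = {
    "WT": {"target": 24.0, "refs": {"HPRT1": 22.0, "RPL13A": 18.0, "YWHAZ": 20.0}},
    "p.(Ala304Val)": {"target": 25.0, "refs": {"HPRT1": 22.0, "RPL13A": 18.0, "YWHAZ": 20.0}},
}

for sample, fold in fold_vs_reference(example).items():
    print(f"{sample}: {fold:.2f} relative to WT")
```

With identical reference-gene Cq values in both samples, a one-cycle delay in the target gene corresponds to half the WT expression under the efficiency-of-2 assumption.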

Legends to Figures

  • (A): Pedigree of the Turkish CREB3L1 OI family.
  • The proband is indicated with an arrow, asterisks denote family members available for molecular testing.
  • (B): Postmortem examination of fetus IV:3 showed bowed extremities with bilateral angulation of the forearms due to fractures, bilateral femoral and tibial bowing.
  • Figure 2 Protein structure and function of OASIS.


biblio.ugent.be
The UGent Institutional Repository is the electronic archiving and dissemination platform for all
UGent research publications. Ghent University has implemented a mandate stipulating that all
academic publications of UGent researchers should be deposited and archived in this repository.
Except for items where current copyright restrictions apply, these papers are available in Open
Access.
This item is the archived peer-reviewed author-version of:
A homozygous pathogenic missense variant broadens the phenotypic and mutational
spectrum of CREB3L1-related osteogenesis imperfecta
Guillemyn B, Kayserili H, Demuynck L, Sips P, De Paepe A, Syx D, Coucke PJ, Malfait F,
Symoens S
Human Molecular Genetics, 28 (11), 1801-1809, 2019.
This is a pre-copyedited, author-produced version of an article accepted for publication in
Human Molecular Genetics following peer review. The version of record is available online
at: https://doi.org/10.1093/hmg/ddz017.
To refer to or to cite this work, please use the citation to the published version:
Guillemyn B, Kayserili H, Demuynck L, Sips P, De Paepe A, Syx D, Coucke PJ, Malfait F, and
Symoens S (2019). A homozygous pathogenic missense variant broadens the phenotypic
and mutational spectrum of CREB3L1-related osteogenesis imperfecta. Hum Mol Genet
28(11) 1801-1809. doi: 10.1093/hmg/ddz017

A homozygous pathogenic missense variant broadens the
phenotypic and mutational spectrum of CREB3L1-related
osteogenesis imperfecta
Brecht Guillemyn1, Hülya Kayserili2, Lynn Demuynck1, Patrick Sips1, Anne De Paepe1, Delfien Syx1, Paul J. Coucke1, Fransiska Malfait1, Sofie Symoens1,*
1 Center for Medical Genetics Ghent, Ghent University Hospital, Department of Biomolecular
Medicine, Ghent 9000, Belgium
2 KOÇ University School of Medicine (KUSoM) Medical Genetics Department, Topkapi
Zeytinburnu, 34010 Istanbul, Turkey
*To whom correspondence should be addressed: Department of Biomolecular Medicine,
Center for Medical Genetics Ghent, Ghent University Hospital, Corneel Heymanslaan 10,
Medical Research Building 1, 9000 Ghent, Belgium. Tel: 0032/9 332 02 33; Email:
Sofie.Symoens@UGent.be

Abstract
The cyclic AMP responsive element binding protein 3-like 1 (CREB3L1) gene codes for the
endoplasmic reticulum stress transducer old astrocyte specifically induced substance (OASIS),
which has an important role in osteoblast differentiation during bone development. Deficiency
of OASIS is linked to a severe form of autosomal recessive osteogenesis imperfecta (OI), but
only few patients have been reported. We identified the first homozygous pathogenic missense
variant (p.(Ala304Val)) in a patient with lethal OI, which is located within the highly conserved
basic leucine zipper domain, four amino acids upstream of the DNA binding domain. In vitro
structural modeling and luciferase assays demonstrate that this missense variant affects a
critical residue in this functional domain, thereby decreasing the type I collagen transcriptional
binding ability. In addition, overexpression of the mutant OASIS protein leads to decreased
transcription of the SEC23A and SEC24D genes, which code for components of the coat protein
complex type II (COPII), and aberrant OASIS signaling also results in decreased protein levels
of SEC24D. Our findings therefore provide additional proof of the potential involvement of the
COPII secretory complex in the context of bone-associated disease.

Introduction
Osteogenesis imperfecta (OI) is a clinically and genetically heterogeneous group of heritable
bone dysplasias, with the severity of symptoms ranging from perinatal lethality to generalized
osteopenia (1). This brittle bone disease affects one in 15,000-20,000 births and is characterized
by typical clinical manifestations such as bone fragility, skeletal deformities, low bone mass
and short stature. Extraskeletal features, including blue sclerae, dentinogenesis imperfecta,
adult-onset hearing loss, joint hypermobility, restrictive pulmonary disease, cardiovascular
abnormalities and easy bruising, contribute to the multisystemic disorder (1-3). The
predominant autosomal dominant (AD) forms are caused by mutations in either COL1A1 (MIM
120150) or COL1A2 (MIM 120160), encoding the α1- and α2-chains of type I procollagen
respectively. Another rare AD OI subtype is associated with mutations in interferon–induced
transmembrane protein 5 (IFITM5, MIM 614757), which is involved in bone mineralization. In
approximately 10% of OI cases, the disease has an autosomal recessive (AR) inheritance.
Several genes have been associated with these AR forms of OI, and they are classified according
to their mechanism and pathophysiology: collagen post-translational modification (CRTAP,
MIM 605497; P3H1, MIM 610339; PPIB, MIM 123841), collagen processing and crosslinking
(SERPINH1, MIM 600943; FKBP10, MIM 607063; PLOD2, MIM 601865; BMP1, MIM
112264), bone mineralization (SERPINF1, MIM 172860) and osteoblast
differentiation/function (SP7, MIM 606633; TMEM38B, MIM 611236; WNT1, MIM 164820;
CREB3L1, MIM 616215; SPARC, MIM 182120; MBTPS2, MIM 300294; TAPT1, MIM
616897) (1, 2, 4-18).
The CREB3L1 gene (cAMP Responsive Element Binding Protein 3 Like 1) encodes the
endoplasmic reticulum (ER)-stress transducer ‘old astrocyte specifically induced substance’
(OASIS), a basic leucine zipper (bZIP) transcription factor which belongs to the well-conserved
family of the cyclic AMP responsive element binding protein/activating transcription factor
(CREB/ATF) genes. OASIS is processed by regulated intramembrane proteolysis (RIP) in
response to ER stress, and is highly expressed in osteoblasts (19, 20). OASIS-/- mice exhibit
severe osteopenia and spontaneous fractures, resulting from a decrease in type I collagen in the
bone matrix and a decline in the activity of osteoblasts. More recently, Col1a1 was identified
as a target of OASIS, and Murakami et al. demonstrated with murine studies that OASIS
activates the transcription of Col1a1 through an unfolded protein response element (UPRE)-like
sequence in the Col1a1 promoter region, thereby revealing its critical role in bone
formation (19-21).
Hitherto, only 3 reports have associated homozygous CREB3L1 defects with an AR form of OI,
currently classified as OI type XVI: a whole gene deletion, the in-frame deletion
(c.934_936delAAG, p.(Lys312del)) and the nonsense variant (c.1284C>A, p.(Tyr428*)) (2,
15, 22, 23).
Here, we present a Turkish family, in which molecular analysis of the proband revealed a
previously unreported homozygous missense variant (c.911C>T, p.(Ala304Val)).
We applied structural modeling to study the effects of this missense variant on the OASIS
protein. We then performed further in vitro studies to investigate the functional consequences
regarding regulation of type I collagen and COPII component gene expression.
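
The nucleotide-level and protein-level descriptions of the variants named in this introduction can be cross-checked against the standard genetic code without access to the reference sequence. The sketch below is such a consistency check under stated assumptions: the function name `substitution_outcomes` is hypothetical, the within-codon index is derived from the coding position, and the code enumerates every codon that could encode the reference amino acid rather than reading the actual CREB3L1 codon.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def substitution_outcomes(ref_aa, index_in_codon, ref_base, alt_base):
    """Amino acids obtained by changing ref_base to alt_base at the given
    0-based codon index, over all codons that encode ref_aa and carry
    ref_base at that index."""
    outcomes = set()
    for codon, aa in CODON_TABLE.items():
        if aa == ref_aa and codon[index_in_codon] == ref_base:
            mutated = codon[:index_in_codon] + alt_base + codon[index_in_codon + 1:]
            outcomes.add(CODON_TABLE[mutated])
    return outcomes


# c.911C>T: second base of an alanine codon (GCN) becomes T.
print(substitution_outcomes("A", (911 - 1) % 3, "C", "T"))
# c.1284C>A: third base of a tyrosine codon becomes A.
print(substitution_outcomes("Y", (1284 - 1) % 3, "C", "A"))
# c.934_936delAAG removes one complete AAG codon.
print(CODON_TABLE["AAG"])
```

Every alanine codon has C in its second position, and a C>T change there always yields valine, so c.911C>T is consistent with p.(Ala304Val) whichever alanine codon is present. Likewise, the only tyrosine codon ending in C becomes a stop codon after C>A, and AAG encodes lysine.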

Citations
Journal ArticleDOI
TL;DR: An overview of the diverse functions of each member of the CREB3 family of transcription factors with special focus on their role in the central nervous system is provided.
Abstract: CREB3 family of transcription factors are ER localized proteins that belong to the bZIP family. They are transported from the ER to the Golgi, cleaved by S1P and S2P proteases and the released N-terminal domains act as transcription factors. CREB3 family members regulate the expression of a large variety of genes and according to their tissue-specific expression profiles they play, among others, roles in acute phase response, lipid metabolism, development, survival, differentiation, organelle autoregulation, and protein secretion. They have been implicated in the ER and Golgi stress responses as regulators of the cell secretory capacity and cell specific cargos. In this review we provide an overview of the diverse functions of each member of the family (CREB3, CREB3L1, CREB3L2, CREB3L3, CREB3L4) with special focus on their role in the central nervous system.

61 citations


Cites background from "A homozygous pathogenic missense va..."

  • ...CREB3L1 critical contribution in bone formation was also confirmed by its role as a genetic cause of autosomal recessive osteogenesis imperfecta in humans (Symoens et al., 2013; Keller et al., 2018; Guillemyn et al., 2019)....

    [...]

Journal ArticleDOI
TL;DR: Current models describing the dynamics and mechanisms of ER-Golgi transport are discussed, challenging long-held models of vesicular transport of large matrix proteins and are implicating less well-defined carriers and direct interconnections between organelles.

45 citations

Journal ArticleDOI
TL;DR: Osteogenesis imperfecta (OI) is a phenotypically and genetically heterogeneous skeletal dysplasia characterized by bone fragility, growth deficiency and skeletal deformity as discussed by the authors.
Abstract: Osteogenesis imperfecta (OI) is a phenotypically and genetically heterogeneous skeletal dysplasia characterized by bone fragility, growth deficiency and skeletal deformity. Previously known to be caused by defects in type I collagen, the major protein of extracellular matrix, it is now also understood to be a collagen-related disorder caused by defects in collagen folding, post-translational modification and processing, bone mineralization and osteoblast differentiation, with inheritance of OI types spanning autosomal dominant and recessive as well as X-linked recessive. This review provides the latest updates on OI, encompassing both classical OI and rare forms, their mechanism and the signaling pathways involved in their pathophysiology. There is a special emphasis on mutations in type I procollagen C-propeptide structure and processing, the later causing OI with strikingly high bone mass. Types V and VI OI, while notably different, are shown to be interrelated by the IFITM5 p.S40L mutation that reveals the connection between the BRIL and PEDF pathways. The function of Regulated Intramembrane Proteolysis has been extended beyond cholesterol metabolism to bone formation by defects in RIP components S2P and OASIS. Several recently proposed candidate genes for new types of OI are also presented. Discoveries of new OI genes add complexity to already-challenging OI management; current and potential approaches are summarized.

38 citations

Journal ArticleDOI
TL;DR: The most recent advances in the understanding of processes involved in abnormal bone mineralization, collagen processing and osteoblast function are described, as illustrated by the characterization of new causative genes for OI and OI‐related fragility syndromes.
Abstract: The limited accessibility of bone and its mineralized nature have restricted deep investigation of its biology. Recent breakthroughs in identification of mutant proteins affecting bone tissue homeostasis in rare skeletal diseases have revealed novel pathways involved in skeletal development and maintenance. The characterization of new dominant, recessive and X-linked forms of the rare brittle bone disease osteogenesis imperfecta (OI) and other OI-related bone fragility disorders was a key player in this advance. The development of in vitro models for these diseases along with the generation and characterization of murine and zebrafish models contributed to dissecting previously unknown pathways. Here, we describe the most recent advances in the understanding of processes involved in abnormal bone mineralization, collagen processing and osteoblast function, as illustrated by the characterization of new causative genes for OI and OI-related fragility syndromes. The coordinated role of the integral membrane protein BRIL and of the secreted protein PEDF in modulating bone mineralization as well as the function and cross-talk of the collagen-specific chaperones HSP47 and FKBP65 in collagen processing and secretion are discussed. We address the significance of WNT ligand, the importance of maintaining endoplasmic reticulum membrane potential and of regulating intramembrane proteolysis in osteoblast homeostasis. Moreover, we also examine the relevance of the cytoskeletal protein plastin-3 and of the nucleotidyltransferase FAM46A. Thanks to these advances, new targets for the development of novel therapies for currently incurable rare bone diseases have been and, likely, will be identified, supporting the important role of basic science for translational approaches.

33 citations

References
Journal ArticleDOI
TL;DR: Two unusual extensions are presented: Multiscale, which adds the ability to visualize large‐scale molecular assemblies such as viral coats, and Collaboratory, which allows researchers to share a Chimera session interactively despite being at separate locales.
Abstract: The design, implementation, and capabilities of an extensible visualization system, UCSF Chimera, are discussed. Chimera is segmented into a core that provides basic services and visualization, and extensions that provide most higher level functionality. This architecture ensures that the extension mechanism satisfies the demands of outside developers who wish to incorporate new features. Two unusual extensions are presented: Multiscale, which adds the ability to visualize large-scale molecular assemblies such as viral coats, and Collaboratory, which allows researchers to share a Chimera session interactively despite being at separate locales. Other extensions include Multalign Viewer, for showing multiple sequence alignments and associated structures; ViewDock, for screening docked ligand orientations; Movie, for replaying molecular dynamics trajectories; and Volume Viewer, for display and analysis of volumetric data. A discussion of the usage of Chimera in real-world situations is given, along with anticipated future directions. Chimera includes full user documentation, is free to academic and nonprofit users, and is available for Microsoft Windows, Linux, Apple Mac OS X, SGI IRIX, and HP Tru64 Unix from http://www.cgl.ucsf.edu/chimera/.

35,698 citations

Journal ArticleDOI
TL;DR: Because of the increased complexity of analysis and interpretation of clinical genetic testing described in this report, the ACMG strongly recommends that clinical molecular genetic testing should be performed in a Clinical Laboratory Improvement Amendments–approved laboratory, with results interpreted by a board-certified clinical molecular geneticist or molecular genetic pathologist or the equivalent.

17,834 citations

Journal ArticleDOI
TL;DR: The iterative threading assembly refinement (I-TASSER) server is an integrated platform for automated protein structure and function prediction based on the sequence-to-structure-to-function paradigm.
Abstract: The iterative threading assembly refinement (I-TASSER) server is an integrated platform for automated protein structure and function prediction based on the sequence-to-structure-to-function paradigm. Starting from an amino acid sequence, I-TASSER first generates three-dimensional (3D) atomic models from multiple threading alignments and iterative structural assembly simulations. The function of the protein is then inferred by structurally matching the 3D models with other known proteins. The output from a typical server run contains full-length secondary and tertiary structure predictions, and functional annotations on ligand-binding sites, Enzyme Commission numbers and Gene Ontology terms. An estimate of accuracy of the predictions is provided based on the confidence score of the modeling. This protocol provides new insights and guidelines for designing of online server systems for the state-of-the-art protein structure and function predictions. The server is available at http://zhanglab.ccmb.med.umich.edu/I-TASSER.

5,792 citations

Journal ArticleDOI
Yang Zhang1
TL;DR: The I-TASSER server has been developed to generate automated full-length 3D protein structural predictions where the benchmarked scoring system helps users to obtain quantitative assessments of the I-TASSER models.
Abstract: Prediction of 3-dimensional protein structures from amino acid sequences represents one of the most important problems in computational structural biology. The community-wide Critical Assessment of Structure Prediction (CASP) experiments have been designed to obtain an objective assessment of the state-of-the-art of the field, where I-TASSER was ranked as the best method in the server section of the recent 7th CASP experiment. Our laboratory has since then received numerous requests about the public availability of the I-TASSER algorithm and the usage of the I-TASSER predictions. An on-line version of I-TASSER is developed at the KU Center for Bioinformatics which has generated protein structure predictions for thousands of modeling requests from more than 35 countries. A scoring function (C-score) based on the relative clustering structural density and the consensus significance score of multiple threading templates is introduced to estimate the accuracy of the I-TASSER predictions. A large-scale benchmark test demonstrates a strong correlation between the C-score and the TM-score (a structural similarity measurement with values in [0, 1]) of the first models with a correlation coefficient of 0.91. Using a C-score cutoff > -1.5 for the models of correct topology, both false positive and false negative rates are below 0.1. Combining C-score and protein length, the accuracy of the I-TASSER models can be predicted with an average error of 0.08 for TM-score and 2 Å for RMSD. The I-TASSER server has been developed to generate automated full-length 3D protein structural predictions where the benchmarked scoring system helps users to obtain quantitative assessments of the I-TASSER models. The output of the I-TASSER server for each query includes up to five full-length models, the confidence score, the estimated TM-score and RMSD, and the standard deviation of the estimations.
The I-TASSER server is freely available to the academic community at http://zhang.bioinformatics.ku.edu/I-TASSER .

4,754 citations

Journal ArticleDOI
TL;DR: A stand-alone I-TASSER Suite that can be used for off-line protein structure and function prediction and three complementary algorithms to enhance function inferences are developed, the consensus of which is derived by COACH4 using support vector machines.
Abstract: The lowest free-energy conformations are identified by structure clustering. A second round of assembly simulation is conducted, starting from the centroid models, to remove steric clashes and refine global topology. Final atomic structure models are constructed from the low-energy conformations by a two-step atomic-level energy minimization approach. The correctness of the global model is assessed by the confidence score, which is based on the significance of threading alignments and the density of structure clustering; the residue-level local quality of the structural models and B factor of the target protein are evaluated by a newly developed method, ResQ, built on the variation of modeling simulations and the uncertainty of homologous alignments through support vector regression training. For function annotation, the structure models with the highest confidence scores are matched against the BioLiP database of ligand-protein interactions to detect homologous function templates. Functional insights on ligand-binding site (LBS), Enzyme Commission (EC) and Gene Ontology (GO) are deduced from the functional templates. We developed three complementary algorithms (COFACTOR, TM-SITE and S-SITE) to enhance function inferences, the consensus of which is derived by COACH using support vector machines. Detailed instructions for installation, implementation and result interpretation of the Suite can be found in the Supplementary Methods and Supplementary Tables 1 and 2. The I-TASSER Suite pipeline was tested in recent communitywide structure and function prediction experiments, including CASP10 and CAMEO. Overall, I-TASSER generated the correct fold with a template modeling score (TM-score) >0.5 for 10 out of 36 “New Fold” (NF) targets in the CASP10, which have no homologous templates in the Protein Data Bank (PDB).
Of the 110 template-based modeling targets, 92 had a TM-score >0.5, and 89 had the templates drawn closer to the native with an average r.m.s. deviation improvement of 1.05 Å in the same threading-aligned regions. In CAMEO, COACH generated LBS predictions for 4,271 targets with an average accuracy 0.86, which was 20% higher than that of the second-best method in the experiment. Here we illustrate I-TASSER Suite–based structure and function modeling using six examples (Fig. 1b–g) from the communitywide blind tests. R0006 and R0007 are two NF targets from CASP10, and I-TASSER constructed models of correct fold with a TM-score of 0.62 for both targets (Fig. 1b,c). An illustration of local quality estimation by ResQ is shown for T0652, which has an average error 0.75 Å compared to the actual deviation of the model from the native (Fig. 1h). The four LBS prediction examples (Fig. 1d–g) are from CASP10 and CAMEO; COACH generated ligand models all with a ligand r.m.s. deviation below 2 Å. COACH also correctly assigned the three- and four-digit EC numbers to the enzyme targets C0050 and C0046 (Supplementary Table 3). In summary, we developed a stand-alone I-TASSER Suite that can be used for off-line protein structure and function prediction. The I-TASSER Suite: protein structure and function prediction

4,693 citations
