Author

Min-Yi Shen

Other affiliations: University of Chicago, Life Technologies, National Tsing Hua University

Bio: Min-Yi Shen is an academic researcher at the University of California, San Francisco. The author has contributed to research on protein structure and Langevin dynamics, has an h-index of 20, and has co-authored 30 publications that have received 9,048 citations. Previous affiliations of Min-Yi Shen include the University of Chicago and Life Technologies.

Papers

Journal Article

Comparative Protein Structure Modeling Using MODELLER

Narayanan Eswar¹, Ben Webb¹, Marc A. Marti-Renom, Mallur S. Madhusudhan¹, David Eramian¹, Min-Yi Shen¹, Ursula Pieper¹, Andrej Sali¹

University of California, San Francisco¹

01 Nov 2007-Current protocols in protein science

TL;DR: This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications.

Abstract: Functional characterization of a protein sequence is a common goal in biology, and is usually facilitated by having an accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described.

3,495 citations

Journal Article

Comparative protein structure modeling using Modeller.

Narayanan Eswar¹, Ben Webb¹, Marc A. Marti-Renom¹, Mallur S. Madhusudhan¹, David Eramian¹, Min-Yi Shen¹, Ursula Pieper¹, Andrej Sali¹

University of California, San Francisco¹

01 Sep 2006-Current protocols in human genetics

TL;DR: This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications.

Abstract: Functional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described.

3,006 citations

Journal Article

Statistical potential for assessment and prediction of protein structures

Min-Yi Shen¹, Andrej Sali¹

University of California, San Francisco¹

01 Nov 2006-Protein Science

TL;DR: To facilitate its use in various applications, such as model assessment, loop modeling, and fitting into cryo‐electron microscopy mass density maps combined with comparative protein structure modeling, DOPE was incorporated into the modeling package MODELLER‐8.

Abstract: Protein structures in the Protein Data Bank provide a wealth of data about the interactions that determine the native states of proteins. Using the probability theory, we derive an atomic distance-dependent statistical potential from a sample of native structures that does not depend on any adjustable parameters (Discrete Optimized Protein Energy, or DOPE). DOPE is based on an improved reference state that corresponds to noninteracting atoms in a homogeneous sphere with the radius dependent on a sample native structure; it thus accounts for the finite and spherical shape of the native structures. The DOPE potential was extracted from a nonredundant set of 1472 crystallographic structures. We tested DOPE and five other scoring functions by the detection of the native state among six multiple target decoy sets, the correlation between the score and model error, and the identification of the most accurate non-native structure in the decoy set. For all decoy sets, DOPE is the best performing function in terms of all criteria, except for a tie in one criterion for one decoy set. To facilitate its use in various applications, such as model assessment, loop modeling, and fitting into cryo-electron microscopy mass density maps combined with comparative protein structure modeling, DOPE was incorporated into the modeling package MODELLER-8.

2,160 citations
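
As a rough illustration of how a distance-dependent statistical potential can be derived from observed atom-pair distances, the sketch below applies the generic inverse-Boltzmann relation, energy = -ln(observed / reference), per distance bin. It is not the DOPE implementation: it swaps in a simpler reference state (noninteracting points at uniform density, so the expected count in a bin grows with the shell volume) in place of the finite-sphere reference state described in the abstract. The function name, bin width, and cutoff are hypothetical choices.

```python
import math


def statistical_potential(distances, bin_width=1.0, cutoff=5.0):
    """Return one energy (in units of kT) per distance bin.

    Assumption: the reference state is a uniform-density ideal gas, so the
    expected fraction of pairs in a bin is proportional to its shell volume.
    """
    n_bins = int(round(cutoff / bin_width))
    upper = n_bins * bin_width

    # Histogram the observed pair distances.
    counts = [0] * n_bins
    for d in distances:
        if 0 <= d < upper:
            index = min(int(d // bin_width), n_bins - 1)
            counts[index] += 1
    total = sum(counts)

    energies = []
    for i, count in enumerate(counts):
        lo, hi = i * bin_width, (i + 1) * bin_width
        p_ref = (hi ** 3 - lo ** 3) / upper ** 3  # shell-volume fraction
        if count == 0:
            energies.append(math.inf)  # never observed: treated as forbidden
        else:
            energies.append(-math.log((count / total) / p_ref))
    return energies


# Hypothetical distances (in Å): bins observed more often than the
# reference predicts receive negative (favorable) energies.
example_energies = statistical_potential([0.5, 1.5, 1.6, 1.7], cutoff=2.0)
```

A bin whose observed frequency equals the reference frequency scores zero, which is why the choice of reference state largely determines the behavior of a potential of this kind.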

Journal Article

A composite score for predicting errors in protein structure models

David Eramian¹, Min-Yi Shen¹, Damien P. Devos¹, Francisco Melo², Andrej Sali¹, Marc A. Marti-Renom¹

University of California, San Francisco¹, Pontifical Catholic University of Chile²

01 Jul 2006-Protein Science

TL;DR: The most accurate score is based on a combination of the DOPE non‐hydrogen atom statistical potential; surface, contact, and combined statistical potentials from MODPIPE; and two PSIPRED/DSSP scores, which can be applied to select the final model in various modeling problems, including fold assignment, target–template alignment, and loop modeling.

Abstract: Reliable prediction of model accuracy is an important unsolved problem in protein structure modeling. To address this problem, we studied 24 individual assessment scores, including physics-based energy functions, statistical potentials, and machine learning–based scoring functions. Individual scores were also used to construct ∼85,000 composite scoring functions using support vector machine (SVM) regression. The scores were tested for their abilities to identify the most native-like models from a set of 6000 comparative models of 20 representative protein structures. Each of the 20 targets was modeled using a template of <30% sequence identity, corresponding to challenging comparative modeling cases. The best SVM score outperformed all individual scores by decreasing the average RMSD difference between the model identified as the best of the set and the model with the lowest RMSD (ΔRMSD) from 0.63 Å to 0.45 Å, while having a higher Pearson correlation coefficient to RMSD (r = 0.87) than any other tested score. The most accurate score is based on a combination of the DOPE non-hydrogen atom statistical potential; surface, contact, and combined statistical potentials from MODPIPE; and two PSIPRED/DSSP scores. It was implemented in the SVMod program, which can now be applied to select the final model in various modeling problems, including fold assignment, target–template alignment, and loop modeling.

171 citations
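
The ΔRMSD measure used in the abstract above (the RMSD of the model a score picks as best, minus the lowest RMSD in the set) can be sketched in a few lines. This is a minimal illustration, assuming that lower scores are better, as with an energy-like score; the function name and the example numbers are hypothetical and are not data from the paper.

```python
def delta_rmsd(scores, rmsds):
    """RMSD of the top-scored model minus the lowest RMSD in the set.

    Assumption: lower scores are better. Both lists are indexed by model.
    """
    if len(scores) != len(rmsds) or not scores:
        raise ValueError("scores and rmsds must be non-empty and equal length")
    # Index of the model the score would select.
    picked = min(range(len(scores)), key=lambda i: scores[i])
    return rmsds[picked] - min(rmsds)


# Hypothetical values for four models of one target.
example_scores = [-1.2, -3.4, -2.8, -0.5]
example_rmsds = [4.1, 2.9, 2.5, 6.0]
example_delta = delta_rmsd(example_scores, example_rmsds)
```

A ΔRMSD of zero means the score selected the most accurate model in the set; averaging the value over many targets gives the kind of summary figure the abstract reports.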

Journal Article

How well can the accuracy of comparative protein structure models be predicted?

David Eramian¹, Narayanan Eswar¹,², Min-Yi Shen¹,², Andrej Sali¹,²

University of California, San Francisco¹, California Institute for Quantitative Biosciences²

01 Nov 2008-Protein Science

TL;DR: A protocol optimized specifically for predicting the RMSD and NO3.5Å errors of a model in the absence of its native structure, which quantifies the error in an absolute sense, thus helping to determine whether or not the model is suitable for intended applications.

Abstract: Comparative structure models are available for two orders of magnitude more protein sequences than are experimentally determined structures. These models, however, suffer from two limitations that experimentally determined structures do not: They frequently contain significant errors, and their accuracy cannot be readily assessed. We have addressed the latter limitation by developing a protocol optimized specifically for predicting the Cα root-mean-squared deviation (RMSD) and native overlap (NO3.5Å) errors of a model in the absence of its native structure. In contrast to most traditional assessment scores that merely predict one model is more accurate than others, this approach quantifies the error in an absolute sense, thus helping to determine whether or not the model is suitable for intended applications. The assessment relies on a model-specific scoring function constructed by a support vector machine. This regression optimizes the weights of up to nine features, including various sequence similarity measures and statistical potentials, extracted from a tailored training set of models unique to the model being assessed: If possible, we use similarly sized models with the same fold; otherwise, we use similarly sized models with the same secondary structure composition. This protocol predicts the RMSD and NO3.5Å errors for a diverse set of 580,317 comparative models of 6174 sequences with correlation coefficients (r) of 0.84 and 0.86, respectively, to the actual errors. This scoring function achieves the best correlation compared to 13 other tested assessment criteria that achieved correlations ranging from 0.35 to 0.71.

135 citations
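
For readers unfamiliar with the two error measures named in the abstract above, the sketch below computes them for a model when the native structure is known. It assumes the two Cα coordinate lists are already superposed and in the same residue order, and it assumes native overlap is the fraction of Cα atoms lying within 3.5 Å of their native positions; the function name and coordinates are hypothetical.

```python
import math


def ca_errors(model, native, cutoff=3.5):
    """Return (Cα RMSD, native overlap) for two superposed coordinate lists.

    Assumptions: coordinates are (x, y, z) tuples in Å, already superposed,
    and native overlap is the fraction of atoms within `cutoff` of native.
    """
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and equal length")
    # Per-residue distance between equivalent Cα atoms (Python 3.8+).
    dists = [math.dist(a, b) for a, b in zip(model, native)]
    rmsd = math.sqrt(sum(d * d for d in dists) / len(dists))
    overlap = sum(d <= cutoff for d in dists) / len(dists)
    return rmsd, overlap


# Hypothetical four-residue example: three atoms match, one is 4 Å away.
example_model = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
example_native = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 4, 0)]
example_rmsd, example_overlap = ca_errors(example_model, example_native)
```

The protocol in the abstract predicts these quantities without access to the native structure; the sketch only shows what the predicted targets are.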

Cited by

Journal Article

Comparison of multiple Amber force fields and development of improved protein backbone parameters.

Viktor Hornak¹, Robert Abel², Asim Okur¹, Bentley Strockbine¹, Adrian E. Roitberg², Carlos Simmerling¹,³

Stony Brook University¹, University of Florida², Brookhaven National Laboratory³

15 Nov 2006-Proteins

TL;DR: An effort to improve the φ/ψ dihedral terms in the ff99 energy function achieves a better balance of secondary structure elements as judged by improved distribution of backbone dihedrals for glycine and alanine with respect to PDB survey data.

Abstract: The ff94 force field that is commonly associated with the Amber simulation package is one of the most widely used parameter sets for biomolecular simulation. After a decade of extensive use and testing, limitations in this force field, such as over-stabilization of alpha-helices, were reported by us and other researchers. This led to a number of attempts to improve these parameters, resulting in a variety of "Amber" force fields and significant difficulty in determining which should be used for a particular application. We show that several of these continue to suffer from inadequate balance between different secondary structure elements. In addition, the approach used in most of these studies neglected to account for the existence in Amber of two sets of backbone phi/psi dihedral terms. This led to parameter sets that provide unreasonable conformational preferences for glycine. We report here an effort to improve the phi/psi dihedral terms in the ff99 energy function. Dihedral term parameters are based on fitting the energies of multiple conformations of glycine and alanine tetrapeptides from high level ab initio quantum mechanical calculations. The new parameters for backbone dihedrals replace those in the existing ff99 force field. This parameter set, which we denote ff99SB, achieves a better balance of secondary structure elements as judged by improved distribution of backbone dihedrals for glycine and alanine with respect to PDB survey data. It also accomplishes improved agreement with published experimental data for conformational preferences of short alanine peptides and better accord with experimental NMR relaxation data of test protein systems.

6,146 citations

Journal Article

Comparative Protein Structure Modeling Using MODELLER

Narayanan Eswar¹, Ben Webb¹, Marc A. Marti-Renom, Mallur S. Madhusudhan¹, David Eramian¹, Min-Yi Shen¹, Ursula Pieper¹, Andrej Sali¹

University of California, San Francisco¹

01 Nov 2007-Current protocols in protein science

TL;DR: This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications.

3,495 citations

Journal Article

Comparative protein structure modeling using Modeller.

Narayanan Eswar¹, Ben Webb¹, Marc A. Marti-Renom¹, Mallur S. Madhusudhan¹, David Eramian¹, Min-Yi Shen¹, Ursula Pieper¹, Andrej Sali¹

University of California, San Francisco¹

01 Sep 2006-Current protocols in human genetics

TL;DR: This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications.

3,006 citations

Journal Article

Statistical potential for assessment and prediction of protein structures

Min-Yi Shen¹, Andrej Sali¹

University of California, San Francisco¹

01 Nov 2006-Protein Science

2,160 citations

Journal Article

Toward the estimation of the absolute quality of individual protein structure models

Pascal Benkert¹, Marco Biasini², Torsten Schwede²

University of Basel¹, Swiss Institute of Bioinformatics²

01 Feb 2011-Bioinformatics

TL;DR: The ability of the newly introduced QMEAN Z-score to detect experimentally solved protein structures containing significant errors, as well as to evaluate theoretical protein models is demonstrated.

Abstract: Motivation: Quality assessment of protein structures is an important part of experimental structure validation and plays a crucial role in protein structure prediction, where the predicted models may contain substantial errors. Most current scoring functions are primarily designed to rank alternative models of the same sequence supporting model selection, whereas the prediction of the absolute quality of an individual protein model has received little attention in the field. However, reliable absolute quality estimates are crucial to assess the suitability of a model for specific biomedical applications. Results: In this work, we present a new absolute measure for the quality of protein models, which provides an estimate of the ‘degree of nativeness’ of the structural features observed in a model and describes the likelihood that a given model is of comparable quality to experimental structures. Model quality estimates based on the QMEAN scoring function were normalized with respect to the number of interactions. The resulting scoring function is independent of the size of the protein and may therefore be used to assess both monomers and entire oligomeric assemblies. Model quality scores for individual models are then expressed as ‘Z-scores’ in comparison to scores obtained for high-resolution crystal structures. We demonstrate the ability of the newly introduced QMEAN Z-score to detect experimentally solved protein structures containing significant errors, as well as to evaluate theoretical protein models. In a comprehensive QMEAN Z-score analysis of all experimental structures in the PDB, membrane proteins accumulate on one side of the score spectrum and thermostable proteins on the other. Proteins from the thermophilic organism Thermotoga maritima received significantly higher QMEAN Z-scores in a pairwise comparison with their homologous mesophilic counterparts, underlining the significance of the QMEAN Z-score as an estimate of protein stability.
Availability: The Z-score calculation has been integrated in the QMEAN server available at: http://swissmodel.expasy.org/qmean. Contact: torsten.schwede@unibas.ch Supplementary information: Supplementary data are available at Bioinformatics online.

1,844 citations
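
The Z-score step described in the abstract above, expressing a model's quality score relative to scores obtained for high-resolution crystal structures, can be sketched as follows. This is a minimal illustration only: the function name and all numbers are hypothetical, the use of the sample standard deviation is an assumption, and the normalization by number of interactions mentioned in the abstract is not reproduced.

```python
import statistics


def z_score(model_score, reference_scores):
    """Number of standard deviations between a model score and the mean
    score of a reference set of experimental structures.

    Assumption: the sample standard deviation is used for the spread.
    """
    if len(reference_scores) < 2:
        raise ValueError("need at least two reference scores")
    mean = statistics.mean(reference_scores)
    spread = statistics.stdev(reference_scores)
    return (model_score - mean) / spread


# Hypothetical normalized scores for five reference crystal structures.
example_reference = [0.70, 0.75, 0.80, 0.85, 0.90]
example_z = z_score(0.60, example_reference)
```

A strongly negative value indicates a model that scores well below what is typical for the reference structures, which is the sense in which such a Z-score serves as an absolute rather than a relative quality estimate.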
