Author

Jianlin Cheng

Other affiliations: University of Central Florida, University of Missouri–St. Louis, University of California, Irvine

Bio: Jianlin Cheng is an academic researcher at the University of Missouri. The author has contributed to research on topics including protein structure prediction and computer science. The author has an h-index of 55 and has co-authored 240 publications receiving 13,909 citations. Previous affiliations of Jianlin Cheng include the University of Central Florida and the University of Missouri–St. Louis.

Topics: Protein structure prediction, Computer science, Deep learning, Genome, CASP ...

Papers published on a yearly basis (chart covering 2004 to 2023)

Papers

Journal Article

Genome sequence of the palaeopolyploid soybean

Jeremy Schmutz, Steven B. Cannon¹, Jessica A. Schlueter²,³, Jianxin Ma², Therese Mitros⁴, William Nelson⁵, David L. Hyten¹, Qijian Song¹,⁶, Jay J. Thelen⁷, Jianlin Cheng⁷, Dong Xu⁷, Uffe Hellsten⁸, Gregory D. May⁹, Yeisoo Yu⁵, Tetsuya Sakurai, Taishi Umezawa, Madan K. Bhattacharyya¹⁰, Devinder Sandhu¹¹, Babu Valliyodan⁷, Erika Lindquist⁸, Myron Peto¹, David Grant¹, Shengqiang Shu⁸, David Goodstein⁸, Kerrie Barry⁸, Montona Futrell-Griggs², Brian Abernathy², Jianchang Du², Zhixi Tian², Liucun Zhu², Navdeep Gill², Trupti Joshi⁷, Marc Libault⁷, Anand Sethuraman, Xue-Cheng Zhang⁷, Kazuo Shinozaki, Henry T. Nguyen⁷, Rod A. Wing⁵, Perry B. Cregan¹, James E. Specht¹², Jane Grimwood⁸, Daniel S. Rokhsar⁸, Gary Stacey⁷, Randy C. Shoemaker¹, Scott A. Jackson²•Institutions (12)

Agricultural Research Service¹, Purdue University², University of North Carolina at Charlotte³, University of California, Berkeley⁴, University of Arizona⁵, University of Maryland, College Park⁶, University of Missouri⁷, Joint Genome Institute⁸, National Center for Genome Resources⁹, Iowa State University¹⁰, University of Wisconsin–Stevens Point¹¹, University of Nebraska–Lincoln¹²

14 Jan 2010-Nature

TL;DR: An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

Abstract: Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

3,743 citations

Journal Article

SCRATCH: a protein structure and structural feature prediction server

Jianlin Cheng¹, Arlo Randall¹, Michael J. Sweredoski¹, Pierre Baldi¹•Institutions (1)

University of California, Irvine¹

01 Jul 2005-Nucleic Acids Research

TL;DR: SCRATCH is a server for predicting protein tertiary structure and structural features. It includes predictors for secondary structure, relative solvent accessibility, disordered regions, domains, disulfide bridges, single mutation stability, residue contacts versus average, individual residue contacts and tertiary structure.

Abstract: SCRATCH is a server for predicting protein tertiary structure and structural features. The SCRATCH software suite includes predictors for secondary structure, relative solvent accessibility, disordered regions, domains, disulfide bridges, single mutation stability, residue contacts versus average, individual residue contacts and tertiary structure. The user simply provides an amino acid sequence and selects the desired predictions, then submits to the server. Results are emailed to the user. The server is available at http://www.igb.uci.edu/servers/psss.html.

914 citations

Journal Article

A large-scale evaluation of computational protein function prediction

Predrag Radivojac¹, Wyatt T. Clark¹, Tal Ronnen Oron², Alexandra M. Schnoes³, Tobias Wittkop², Artem Sokolov⁴,⁵, Kiley Graim⁴, Christopher S. Funk⁶, Karin Verspoor⁶, Asa Ben-Hur⁴, Gaurav Pandey⁷,⁸, Jeffrey M. Yunes⁸, Ameet Talwalkar⁸, Susanna Repo⁸,⁹, Michael L Souza⁸, Damiano Piovesan¹⁰, Rita Casadio¹⁰, Zheng Wang¹¹, Jianlin Cheng¹¹, Hai Fang, Julian Gough¹², Patrik Koskinen¹³, Petri Törönen¹³, Jussi Nokso-Koivisto¹³, Liisa Holm¹³, Domenico Cozzetto¹⁴, Daniel W. A. Buchan¹⁴, Kevin Bryson¹⁴, David T. Jones¹⁴, Bhakti Limaye¹⁵, Harshal Inamdar¹⁵, Avik Datta¹⁵, Sunitha K Manjari¹⁵, Rajendra Joshi¹⁵, Meghana Chitale¹⁶, Daisuke Kihara¹⁶, Andreas Martin Lisewski¹⁷, Serkan Erdin¹⁷, Eric Venner¹⁷, Olivier Lichtarge¹⁷, Robert Rentzsch¹⁴, Haixuan Yang¹⁸, Alfonso E. Romero¹⁸, Prajwal Bhat¹⁸, Alberto Paccanaro¹⁸, Tobias Hamp¹⁹, Rebecca Kaßner¹⁹, Stefan Seemayer¹⁹, Esmeralda Vicedo¹⁹, Christian Schaefer¹⁹, Dominik Achten¹⁹, Florian Auer¹⁹, Ariane Boehm¹⁹, Tatjana Braun¹⁹, Maximilian Hecht¹⁹, Mark Heron¹⁹, Peter Hönigschmid¹⁹, Thomas A. Hopf¹⁹, Stefanie Kaufmann¹⁹, Michael Kiening¹⁹, Denis Krompass¹⁹, Cedric Landerer¹⁹, Yannick Mahlich¹⁹, Manfred Roos¹⁹, Jari Björne²⁰, Tapio Salakoski²⁰, Andrew Wong²¹, Hagit Shatkay²¹,²², Fanny Gatzmann²³, Ingolf Sommer²³, Mark N. Wass²⁴, Michael J.E. Sternberg²⁴, Nives Škunca, Fran Supek, Matko Bošnjak, Panče Panov, Sašo Džeroski, Tomislav Šmuc, Yiannis A. I. Kourmpetis²⁵,²⁶, Aalt D. J. van Dijk²⁶, Cajo J. F. ter Braak²⁶, Yuanpeng Zhou²⁷, Qingtian Gong²⁷, Xinran Dong²⁷, Weidong Tian²⁷, Marco Falda²⁸, Paolo Fontana, Enrico Lavezzo²⁸, Barbara Di Camillo²⁸, Stefano Toppo²⁸, Liang Lan²⁹, Nemanja Djuric²⁹, Yuhong Guo²⁹, Slobodan Vucetic²⁹, Amos Marc Bairoch³⁰,³¹, Michal Linial³², Patricia C. Babbitt³, Steven E. Brenner⁸, Christine A. Orengo¹⁴, Burkhard Rost¹⁹, Sean D. Mooney², Iddo Friedberg³³•Institutions (33)

Indiana University¹, Buck Institute for Research on Aging², University of California, San Francisco³, Colorado State University⁴, University of California, Santa Cruz⁵, University of Colorado Denver⁶, Icahn School of Medicine at Mount Sinai⁷, University of California, Berkeley⁸, European Bioinformatics Institute⁹, University of Bologna¹⁰, University of Missouri¹¹, University of Bristol¹², University of Helsinki¹³, University College London¹⁴, Centre for Development of Advanced Computing¹⁵, Purdue University¹⁶, Baylor College of Medicine¹⁷, Royal Holloway, University of London¹⁸, Technische Universität München¹⁹, University of Turku²⁰, Queen's University²¹, University UCINF²², Max Planck Society²³, Imperial College London²⁴, Nestlé²⁵, Wageningen University and Research Centre²⁶, Fudan University²⁷, University of Padua²⁸, Temple University²⁹, University of Geneva³⁰, Swiss Institute of Bioinformatics³¹, Hebrew University of Jerusalem³², Miami University³³

01 Mar 2013-Nature Methods

TL;DR: Today's best protein function prediction algorithms substantially outperform widely used first-generation methods, with large gains on all types of targets, and there is considerable need for improvement of currently available tools.

Abstract: Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment. Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand out: (i) today's best protein function prediction algorithms substantially outperform widely used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is considerable need for improvement of currently available tools.

859 citations

Journal Article

Prediction of protein stability changes for single-site mutations using support vector machines.

Jianlin Cheng¹, Arlo Randall¹, Pierre Baldi¹•Institutions (1)

University of California, Irvine¹

21 Dec 2005-Proteins

TL;DR: Because the method can accurately predict protein stability changes using primary sequence information only, it is applicable to many situations where the tertiary structure is unknown, overcoming a major limitation of previous methods, which require tertiary information.

Abstract: Accurate prediction of protein stability changes resulting from single amino acid mutations is important for understanding protein structures and designing new proteins. We use support vector machines to predict protein stability changes for single amino acid mutations leveraging both sequence and structural information. We evaluate our approach using cross-validation methods on a large dataset of single amino acid mutations. When only the sign of the stability changes is considered, the predictive method achieves 84% accuracy, a significant improvement over previously published results. Moreover, the experimental results show that the prediction accuracy obtained using sequence alone is close to the accuracy obtained using tertiary structure information. Because our method can accurately predict protein stability changes using primary sequence information only, it is applicable to many situations where the tertiary structure is unknown, overcoming a major limitation of previous methods which require tertiary information. The web server for predictions of protein stability changes upon mutations (MUpro), software, and datasets are available at http://www.igb.uci.edu/servers/servers.html.

801 citations

Journal Article

An expanded evaluation of protein function prediction methods shows an improvement in accuracy

Yuxiang Jiang¹, Tal Ronnen Oron², Wyatt T. Clark³, Asma R. Bankapur⁴ +153 more•Institutions (59)

07 Sep 2016-Genome Biology

TL;DR: The second critical assessment of functional annotation (CAFA2), a timed challenge to assess computational methods that automatically assign protein function, evaluated 126 methods from 56 research groups. The top-performing methods in CAFA2 outperformed those from CAFA1, although the interpretation of results and the usefulness of individual methods remain context-dependent.

Abstract: BACKGROUND: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. RESULTS: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regards to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. CONCLUSIONS: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent.

330 citations

Cited by

Journal Article

Deep learning in neural networks

Jürgen Schmidhuber¹•Institutions (1)

University of Lugano¹

01 Jan 2015-Neural Networks

TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, reviewing deep supervised learning, unsupervised learning, reinforcement learning and evolutionary computation, and indirect search for short programs encoding deep and large networks.

14,635 citations

Journal Article

Highly accurate protein structure prediction with AlphaFold

John M. Jumper, Richard O. Evans, Alexander Pritzel, Tim Green, Michael Figurnov, Olaf Ronneberger, Kathryn Tunyasuvunakool, Russell Bates, Augustin Žídek, Anna Potapenko, Alex Bridgland, Clemens Meyer, Simon A. A. Kohl, Andrew J. Ballard, Andrew Cowie, Bernardino Romera-Paredes, Stanislav Nikolov, R. D. Jain, Jonas Adler, Trevor Back, Stig Petersen, David Reiman, Ellen Clancy, Michal Zielinski, Martin Steinegger¹, Michalina Pacholska, Tamas Berghammer, Sebastian Bodenstein, David L. Silver, Oriol Vinyals, Andrew W. Senior, Koray Kavukcuoglu, Pushmeet Kohli, Demis Hassabis•Institutions (1)

Seoul National University¹

15 Jul 2021-Nature

TL;DR: AlphaFold uses a novel deep learning architecture to predict protein structures with an accuracy competitive with experimental structures in the majority of cases, even when no similar structure is known.

Abstract: Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous experimental effort¹⁻⁴, the structures of around 100,000 unique proteins have been determined⁵, but this represents a small fraction of the billions of known protein sequences⁶,⁷. Structural coverage is bottlenecked by the months to years of painstaking effort required to determine a single protein structure. Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence, the structure prediction component of the ‘protein folding problem’⁸, has been an important open research problem for more than 50 years⁹. Despite recent progress¹⁰⁻¹⁴, existing methods fall far short of atomic accuracy, especially when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Critical Assessment of protein Structure Prediction (CASP14)¹⁵, demonstrating accuracy competitive with experimental structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates physical and biological knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm. AlphaFold predicts protein structures with an accuracy competitive with experimental structures in the majority of cases using a novel deep learning architecture.

10,601 citations

Data Mining - Concepts and Techniques.

Petra Perner

01 Jan 2002

9,314 citations

Proceedings Article

node2vec: Scalable Feature Learning for Networks

Aditya Grover¹, Jure Leskovec¹•Institutions (1)

Stanford University¹

13 Aug 2016

TL;DR: node2vec learns a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes, using a biased random walk procedure to explore diverse neighborhoods efficiently.

Abstract: Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node's network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks.

7,072 citations

On Publishing the “Bioinformatics” Special Issue

장병탁, 김삼묘, 허철구

01 Aug 2000

TL;DR: A bioentrepreneur course on the assessment of medical technology in the context of commercialization, addressing many issues unique to biomedical products.

Abstract: BIOE 402. Medical Technology Assessment. 2 or 3 hours. Bioentrepreneur course. Assessment of medical technology in the context of commercialization. Objectives, competition, market share, funding, pricing, manufacturing, growth, and intellectual property; many issues unique to biomedical products. Course Information: 2 undergraduate hours. 3 graduate hours. Prerequisite(s): Junior standing or above and consent of the instructor.

4,833 citations
