Author

Benjamin J. Lengerich

Bio: Benjamin J. Lengerich is an academic researcher from Carnegie Mellon University. The author has contributed to research on deep learning and artificial neural networks, has an h-index of 8, and has co-authored 25 publications receiving 1,248 citations. Previous affiliations of Benjamin J. Lengerich include Pennsylvania State University and Massachusetts Institute of Technology.

Papers
Journal ArticleDOI
TL;DR: It is found that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art.
Abstract: Deep learning describes a class of machine learning algorithms that are capable of combining raw inputs into layers of intermediate features. These algorithms have recently shown impressive results across a variety of domains. Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood. Hence, deep learning techniques may be particularly well suited to solve problems in these fields. We examine applications of deep learning to a variety of biomedical problems (patient classification, fundamental biological processes, and treatment of patients) and discuss whether deep learning will be able to transform these tasks or if the biomedical sphere poses unique challenges. Following from an extensive literature review, we find that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art. Even though improvements over previous baselines have been modest in general, the recent progress indicates that deep learning methods will provide valuable means for speeding up or aiding human investigation. Though progress has been made linking a specific neural network's prediction to input features, understanding how users should interpret these models to make testable hypotheses about the system under study remains an open challenge. Furthermore, the limited amount of labelled data for training presents problems in some domains, as do legal and privacy constraints on work with sensitive health records. Nonetheless, we foresee deep learning enabling changes at both bench and bedside with the potential to transform several areas of biology and medicine.
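As a concrete picture of what "combining raw inputs into layers of intermediate features" means, here is a minimal sketch of a feed-forward network for a binary patient-classification task; the layer sizes, synthetic data, and task are illustrative assumptions, not details from the review.

```python
import torch
import torch.nn as nn

# Hypothetical example: raw clinical features -> intermediate feature
# layers -> binary patient classification. Sizes are illustrative only.
model = nn.Sequential(
    nn.Linear(64, 32),   # raw inputs combined into first-level features
    nn.ReLU(),
    nn.Linear(32, 16),   # intermediate features built on earlier layers
    nn.ReLU(),
    nn.Linear(16, 1),    # logit for a binary patient-classification task
)

x = torch.randn(8, 64)            # a batch of 8 synthetic patient records
probs = torch.sigmoid(model(x))   # predicted class probabilities
print(probs.shape)                # torch.Size([8, 1])
```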

1,491 citations

Journal ArticleDOI
TL;DR: The Precision Lasso is a Lasso variant that promotes sparse variable selection through regularization governed by the covariance and inverse covariance matrices of explanatory variables; it outperforms popular methods of variable selection such as the Lasso, the Elastic Net, and Minimax Concave Penalty (MCP) regression.
Abstract: Motivation: Association studies to discover links between genetic markers and phenotypes are central to bioinformatics. Methods of regularized regression, such as variants of the Lasso, are popular for this task. Despite the good predictive performance of these methods in the average case, they suffer from unstable selections of correlated variables and inconsistent selections of linearly dependent variables. Unfortunately, as we demonstrate empirically, such problematic situations of correlated and linearly dependent variables often exist in genomic datasets and lead to under-performance of classical methods of variable selection. Results: To address these challenges, we propose the Precision Lasso, a Lasso variant that promotes sparse variable selection by regularization governed by the covariance and inverse covariance matrices of explanatory variables. We illustrate its capacity for stable and consistent variable selection in simulated data with highly correlated and linearly dependent variables. We then demonstrate the effectiveness of the Precision Lasso in selecting meaningful variables from transcriptomic profiles of breast cancer patients. Our results indicate that in settings with correlated and linearly dependent variables, the Precision Lasso outperforms popular methods of variable selection such as the Lasso, the Elastic Net and Minimax Concave Penalty (MCP) regression. Availability and implementation: Software is available at https://github.com/HaohanWang/thePrecisionLasso. Supplementary information: Supplementary data are available at Bioinformatics online.
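To make covariance-governed regularization concrete, below is a minimal sketch of a weighted-L1 Lasso whose per-feature penalty weights are derived from the covariance matrix and its (stabilized) inverse. The specific weighting rule, the gamma trade-off, and the alpha value are illustrative assumptions, not the exact Precision Lasso penalty; for that, see the authors' repository linked above.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=n)   # strongly correlated pair
y = X[:, 0] + rng.normal(scale=0.1, size=n)

# Illustrative per-feature weights from the covariance matrix S and its
# (ridge-stabilized) inverse; gamma trades off the two influences.
S = np.cov(X, rowvar=False)
P = np.linalg.inv(S + 1e-3 * np.eye(p))          # precision matrix
gamma = 0.5                                       # assumed trade-off parameter
w = gamma * np.sqrt(np.diag(S)) + (1 - gamma) * np.sqrt(np.diag(P))

# A weighted L1 penalty sum_j w_j |b_j| is equivalent to a standard Lasso
# on rescaled columns X_j / w_j, with coefficients mapped back via 1 / w_j.
model = Lasso(alpha=0.05).fit(X / w, y)
beta = model.coef_ / w
print(np.nonzero(beta)[0])                        # indices of selected variables
```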

110 citations

Posted ContentDOI
28 May 2017 - bioRxiv
TL;DR: This work examines applications of deep learning to a variety of biomedical problems -- patient classification, fundamental biological processes, and treatment of patients -- to predict whether deep learning will transform these tasks or if the biomedical sphere poses unique challenges.
Abstract: Deep learning, a class of machine learning algorithms, has recently shown impressive results across a variety of domains. Biology and medicine are data-rich, but the data are complex and often ill-understood. Problems of this nature may be particularly well-suited to deep learning techniques. We examine applications of deep learning to a variety of biomedical problems -- patient classification, fundamental biological processes, and treatment of patients -- to predict whether deep learning will transform these tasks or if the biomedical sphere poses unique challenges. We find that deep learning has yet to revolutionize or definitively resolve any of these problems, but promising advances have been made on the prior state of the art. Even when improvement over a previous baseline has been modest, we have seen signs that deep learning methods may speed or aid human investigation. More work is needed to address concerns related to interpretability and how best to model each problem. Furthermore, the limited amount of labeled data for training presents problems in some domains, as can legal and privacy constraints on work with sensitive health records. Nonetheless, we foresee deep learning powering changes at the bench and bedside with the potential to transform several areas of biology and medicine.

70 citations

Journal ArticleDOI
TL;DR: Recognizing the extent, type, and energetic interconnectivity of interactions that contribute to positioning catalytic groups has implications for enzyme evolution and may help reveal the nature and extent of interactions required to design enzymes that rival those found in biology.
Abstract: The positioning of catalytic groups within proteins plays an important role in enzyme catalysis, and here we investigate the positioning of the general base in the enzyme ketosteroid isomerase (KSI). The oxygen atoms of Asp38, the general base in KSI, were previously shown to be involved in anion–aromatic interactions with two neighboring Phe residues. Here we ask whether those interactions are sufficient, within the overall protein architecture, to position Asp38 for catalysis or whether the side chains that pack against Asp38 and/or the residues of the structured loop that is capped by Asp38 are necessary to achieve optimal positioning for catalysis. To test positioning, we mutated each of the aforementioned residues, alone and in combinations, in a background with the native Asp general base and in a D38E mutant background, as Glu at position 38 was previously shown to be mispositioned for general base catalysis. These double-mutant cycles reveal positioning effects as large as 10³-fold, indicating that...
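For readers unfamiliar with double-mutant cycles, the generic arithmetic (stated here with a standard sign convention, not with values from this paper) compares the effect of the double mutation against the sum of the single-mutation effects:

```latex
% Interaction (coupling) energy from a double-mutant cycle, generic form:
% if the two sites contribute independently, \Delta\Delta G_{\mathrm{int}} = 0.
\Delta\Delta G_{\mathrm{int}}
  = \Delta G_{\mathrm{WT}\to\mathrm{AB}}
  - \left( \Delta G_{\mathrm{WT}\to\mathrm{A}}
         + \Delta G_{\mathrm{WT}\to\mathrm{B}} \right)
```

A nonzero interaction energy indicates that the two positions do not act independently, which is how such cycles reveal energetic interconnectivity between positioning interactions.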

23 citations

Proceedings Article
01 Aug 2018
TL;DR: Functional Retrofitting generalizes current retrofitting methods by explicitly modeling pairwise relations; it can directly incorporate a variety of pairwise penalty functions previously developed for knowledge graph completion and allows users to encode, learn, and extract information about relation semantics.
Abstract: Knowledge graphs are a versatile framework to encode richly structured data relationships, but it can be challenging to combine these graphs with unstructured data. Methods for retrofitting pre-trained entity representations to the structure of a knowledge graph typically assume that entities are embedded in a connected space and that relations imply similarity. However, useful knowledge graphs often contain diverse entities and relations (with potentially disjoint underlying corpora) which do not accord with these assumptions. To overcome these limitations, we present Functional Retrofitting, a framework that generalizes current retrofitting methods by explicitly modeling pairwise relations. Our framework can directly incorporate a variety of pairwise penalty functions previously developed for knowledge graph completion. Further, it allows users to encode, learn, and extract information about relation semantics. We present both linear and neural instantiations of the framework. Functional Retrofitting significantly outperforms existing retrofitting methods on complex knowledge graphs and loses no accuracy on simpler graphs (in which relations do imply similarity). Finally, we demonstrate the utility of the framework by predicting new drug–disease treatment pairs in a large, complex health knowledge graph.
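The following is an illustrative reconstruction (not the authors' released code) of the linear instantiation described above: retrofitted entity vectors are pulled toward their pretrained values while a learned linear map scores each relation edge. The toy graph, the trade-off weight, and the optimizer settings are assumptions.

```python
import torch

# Hypothetical toy graph: 5 entities, one relation type, edges as (head, tail).
pretrained = torch.randn(5, 16)                  # pretrained entity embeddings
edges = torch.tensor([[0, 1], [1, 2], [3, 4]])   # (head, tail) pairs

# Retrofitted embeddings V and a linear relation function f_r(v) = A v + b,
# learned jointly (the "linear instantiation" of the framework).
V = pretrained.clone().requires_grad_(True)
A = torch.eye(16, requires_grad=True)
b = torch.zeros(16, requires_grad=True)

opt = torch.optim.Adam([V, A, b], lr=0.01)
for step in range(200):
    opt.zero_grad()
    fidelity = ((V - pretrained) ** 2).sum()           # stay near pretrained vectors
    heads, tails = V[edges[:, 0]], V[edges[:, 1]]
    relation = ((heads @ A.T + b - tails) ** 2).sum()  # pairwise relation penalty
    loss = fidelity + 0.5 * relation                   # 0.5: assumed trade-off weight
    loss.backward()
    opt.step()

print(loss.item())   # objective after retrofitting
```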

23 citations


Cited by
Journal ArticleDOI
TL;DR: This review describes how these computational techniques can impact a few key areas of medicine and explores how to build end-to-end systems.
Abstract: Here we present deep-learning techniques for healthcare, centering our discussion on deep learning in computer vision, natural language processing, reinforcement learning, and generalized methods. We describe how these computational techniques can impact a few key areas of medicine and explore how to build end-to-end systems. Our discussion of computer vision focuses largely on medical imaging, and we describe the application of natural language processing to domains such as electronic health record data. Similarly, reinforcement learning is discussed in the context of robotic-assisted surgery, and generalized deep-learning methods for genomics are reviewed.

1,843 citations

Journal ArticleDOI
TL;DR: CellProfiler 3.0 is described, a new version of the software supporting both whole-volume and plane-wise analysis of three-dimensional image stacks, increasingly common in biomedical research.
Abstract: CellProfiler has enabled the scientific research community to create flexible, modular image analysis pipelines since its release in 2005. Here, we describe CellProfiler 3.0, a new version of the software supporting both whole-volume and plane-wise analysis of three-dimensional (3D) image stacks, increasingly common in biomedical research. CellProfiler's infrastructure is greatly improved, and we provide a protocol for cloud-based, large-scale image processing. New plugins enable running pretrained deep learning models on images. Designed by and for biologists, CellProfiler equips researchers with powerful computational tools via a well-documented user interface, empowering biologists in all fields to create quantitative, reproducible image analysis workflows.

1,466 citations

Proceedings ArticleDOI
12 Mar 2018
TL;DR: This paper proposes Grad-CAM++, which uses a weighted combination of the positive partial derivatives of the last convolutional layer's feature maps with respect to a specific class score to generate a visual explanation for the class label under consideration, providing better visual explanations of CNN model predictions than Grad-CAM.
Abstract: Over the last decade, Convolutional Neural Network (CNN) models have been highly successful in solving complex vision-based problems. However, deep models are perceived as "black box" methods given the lack of understanding of their internal functioning. There has been significant recent interest in developing explainable deep learning models, and this paper is an effort in this direction. Building on a recently proposed method called Grad-CAM, we propose Grad-CAM++ to provide better visual explanations of CNN model predictions (when compared to Grad-CAM), in terms of better localization of objects as well as explaining occurrences of multiple objects of a class in a single image. We provide a mathematical explanation for the proposed method, Grad-CAM++, which uses a weighted combination of the positive partial derivatives of the last convolutional layer's feature maps with respect to a specific class score as weights to generate a visual explanation for the class label under consideration. Our extensive experiments and evaluations, both subjective and objective, on standard datasets showed that Grad-CAM++ indeed provides better visual explanations for a given CNN architecture when compared to Grad-CAM.
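To make the weighting scheme concrete, here is a minimal PyTorch sketch of Grad-CAM++-style saliency using the closed-form alpha commonly derived by taking the class score as the exponential of the pre-softmax logit; the tiny CNN, layer choice, and epsilon stabilizer are illustrative assumptions rather than the paper's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny illustrative CNN; any model with a final conv layer works the same way.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),   # "last conv layer" is model[2]
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
)
target_layer = model[2]

acts = {}
target_layer.register_forward_hook(lambda m, i, o: acts.update(out=o))

x = torch.randn(1, 3, 32, 32)      # synthetic input image
score = model(x)[0, 5]             # pre-softmax logit for class 5

A = acts["out"]                                       # feature maps, (1, K, H, W)
grads = torch.autograd.grad(score, A)[0]              # dS/dA

# Grad-CAM++ alpha under the exp(score) convention: higher derivatives of
# exp(S) reduce to powers of the first-order gradient g, giving
#   alpha = g^2 / (2 g^2 + sum_{i,j} A * g^3)
g2, g3 = grads ** 2, grads ** 3
alpha = g2 / (2 * g2 + (A * g3).sum(dim=(2, 3), keepdim=True) + 1e-8)

# Per-map weights: alpha-weighted sum of the *positive* partial derivatives.
weights = (alpha * F.relu(grads)).sum(dim=(2, 3), keepdim=True)
cam = F.relu((weights * A).sum(dim=1))                # (1, H, W) saliency map
print(cam.shape)
```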

1,451 citations

Journal ArticleDOI
TL;DR: Recent breakthroughs in AI technologies and their biomedical applications are outlined, the challenges for further progress in medical AI systems are identified, and the economic, legal and social implications of AI in healthcare are summarized.
Abstract: Artificial intelligence (AI) is gradually changing medical practice. With recent progress in digitized data acquisition, machine learning and computing infrastructure, AI applications are expanding into areas that were previously thought to be only the province of human experts. In this Review Article, we outline recent breakthroughs in AI technologies and their biomedical applications, identify the challenges for further progress in medical AI systems, and summarize the economic, legal and social implications of AI in healthcare.

1,315 citations

Journal ArticleDOI
TL;DR: This paper provides a comprehensive survey of the most important aspects of DL, including those enhancements recently added to the field, and presents the challenges and suggested solutions to help researchers understand the existing research gaps.
Abstract: In the last few years, the deep learning (DL) computing paradigm has been deemed the gold standard in the machine learning (ML) community. It has gradually become the most widely used computational approach in the field of ML, achieving outstanding results on several complex cognitive tasks and matching or even beating human performance. One of the benefits of DL is the ability to learn from massive amounts of data. The DL field has grown quickly in recent years and has been used to successfully address a wide range of traditional applications. More importantly, DL has outperformed well-known ML techniques in many domains, e.g., cybersecurity, natural language processing, bioinformatics, robotics and control, and medical information processing, among many others. Although several works have reviewed the state of the art in DL, each of them tackles only one aspect of the field, which leads to an overall lack of knowledge about it. Therefore, in this contribution, we propose a more holistic approach in order to provide a more suitable starting point from which to develop a full understanding of DL. Specifically, this review attempts to provide a comprehensive survey of the most important aspects of DL, including those enhancements recently added to the field. In particular, this paper outlines the importance of DL and presents the types of DL techniques and networks. It then presents convolutional neural networks (CNNs), the most widely used type of DL network, and describes the development of CNN architectures together with their main features, starting with the AlexNet network and closing with the High-Resolution network (HR.Net). We then present the challenges and suggested solutions to help researchers understand the existing research gaps, followed by a list of the major DL applications. Computational tools, including FPGAs, GPUs, and CPUs, are summarized along with a description of their influence on DL. The paper ends with an evolution matrix, benchmark datasets, and a summary and conclusion.

1,084 citations