Author

Subramani Mani

Bio: Subramani Mani is an academic researcher at the University of New Mexico. His research focuses on topics including Bayesian networks and Markov blankets. He has an h-index of 23 and has co-authored 48 publications receiving 3,717 citations. His previous affiliations include the University of Wisconsin–Milwaukee and the University of South Carolina.

Papers
01 Jul 2012
TL;DR: A comprehensive blind assessment of over 30 network inference methods on Escherichia coli, Staphylococcus aureus, Saccharomyces cerevisiae and in silico microarray data defines the performance, data requirements and inherent biases of different inference approaches, and provides guidelines for algorithm application and development.
Abstract: Reconstructing gene regulatory networks from high-throughput data is a long-standing challenge. Through the Dialogue on Reverse Engineering Assessment and Methods (DREAM) project, we performed a comprehensive blind assessment of over 30 network inference methods on Escherichia coli, Staphylococcus aureus, Saccharomyces cerevisiae and in silico microarray data. We characterize the performance, data requirements and inherent biases of different inference approaches, and we provide guidelines for algorithm application and development. We observed that no single inference method performs optimally across all data sets. In contrast, integration of predictions from multiple inference methods shows robust and high performance across diverse data sets. We thereby constructed high-confidence networks for E. coli and S. aureus, each comprising ∼1,700 transcriptional interactions at a precision of ∼50%. We experimentally tested 53 previously unobserved regulatory interactions in E. coli, of which 23 (43%) were supported. Our results establish community-based methods as a powerful and robust tool for the inference of transcriptional gene regulatory networks.

1,355 citations
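
The core integration step in the study above, combining the outputs of many inference methods into one "community" prediction, can be sketched as a simple rank average over candidate edges. The edge scores and gene names below are invented placeholders, not data from the study.

```python
# Minimal "wisdom of crowds" sketch: average the rank that each inference
# method assigns to every candidate regulator -> target edge.
import numpy as np

def rank_average(predictions):
    """predictions: list of dicts mapping (regulator, target) -> confidence.
    Returns edges sorted by average rank across methods (lower = more confident)."""
    edges = sorted(set().union(*predictions))
    avg_rank = {}
    for e in edges:
        ranks = []
        for scores in predictions:
            ordered = sorted(scores, key=scores.get, reverse=True)
            # Edges a method did not score are tied at the bottom of its list.
            ranks.append(ordered.index(e) + 1 if e in scores else len(scores) + 1)
        avg_rank[e] = np.mean(ranks)
    return sorted(edges, key=avg_rank.get)

# Toy usage: three "methods" scoring edges from two regulators to one target gene.
m1 = {("crp", "lacZ"): 0.9, ("fnr", "lacZ"): 0.2}
m2 = {("crp", "lacZ"): 0.7, ("fnr", "lacZ"): 0.6}
m3 = {("crp", "lacZ"): 0.8}
print(rank_average([m1, m2, m3]))   # ('crp', 'lacZ') ranks ahead of ('fnr', 'lacZ')
```

The published integration is more careful about normalizing ranks across methods and data sets, but the robustness reported in the paper comes from exactly this kind of averaging.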

Journal ArticleDOI
TL;DR: It is found that non-causal feature selection methods cannot be interpreted causally even when they achieve excellent predictivity, so only local causal techniques should be used when insight into causal structure is sought.
Abstract: We present an algorithmic framework for learning local causal structure around target variables of interest in the form of direct causes/effects and Markov blankets applicable to very large data sets with relatively small samples. The selected feature sets can be used for causal discovery and classification. The framework (Generalized Local Learning, or GLL) can be instantiated in numerous ways, giving rise to both existing state-of-the-art as well as novel algorithms. The resulting algorithms are sound under well-defined sufficient conditions. In a first set of experiments we evaluate several algorithms derived from this framework in terms of predictivity and feature set parsimony and compare to other local causal discovery methods and to state-of-the-art non-causal feature selection methods using real data. A second set of experimental evaluations compares the algorithms in terms of ability to induce local causal neighborhoods using simulated and resimulated data and examines the relation of predictivity with causal induction performance. Our experiments demonstrate, consistently with causal feature selection theory, that local causal feature selection methods (under broad assumptions encompassing appropriate family of distributions, types of classifiers, and loss functions) exhibit strong feature set parsimony, high predictivity and local causal interpretability. Although non-causal feature selection methods are often used in practice to shed light on causal relationships, we find that they cannot be interpreted causally even when they achieve excellent predictivity. Therefore we conclude that only local causal techniques should be used when insight into causal structure is sought. In a companion paper we examine in depth the behavior of GLL algorithms, provide extensions, and show how local techniques can be used for scalable and accurate global causal graph learning.

521 citations
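
To make the Markov blanket idea concrete, here is a minimal IAMB-style sketch rather than the GLL algorithms themselves: variables are greedily added to a candidate blanket while they remain conditionally dependent on the target, then pruned. It assumes roughly Gaussian data and uses a Fisher-z partial-correlation test; the function names, test, and threshold are illustrative choices, not the paper's.

```python
# IAMB-style Markov blanket sketch (illustrative; not the GLL framework itself).
import numpy as np
from scipy import stats

def partial_corr(x, y, Z):
    """Correlation of x and y after regressing out the columns of Z."""
    if Z.shape[1] == 0:
        return np.corrcoef(x, y)[0, 1]
    Z1 = np.column_stack([Z, np.ones(len(x))])
    rx = x - Z1 @ np.linalg.lstsq(Z1, x, rcond=None)[0]
    ry = y - Z1 @ np.linalg.lstsq(Z1, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

def independent(x, y, Z, alpha=0.05):
    """Fisher-z test of conditional independence (Gaussian assumption)."""
    n, k = len(x), Z.shape[1]
    r = np.clip(partial_corr(x, y, Z), -0.9999, 0.9999)
    z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(n - k - 3)
    return 2 * (1 - stats.norm.cdf(abs(z))) > alpha

def iamb(data, target, alpha=0.05):
    """data: (n, p) array; target: column index. Returns a candidate Markov blanket."""
    p, mb, changed = data.shape[1], [], True
    while changed:                      # forward phase: grow the blanket
        changed = False
        rest = [j for j in range(p) if j != target and j not in mb]
        scores = [(abs(partial_corr(data[:, j], data[:, target], data[:, mb])), j)
                  for j in rest]
        if not scores:
            break
        _, best = max(scores)
        if not independent(data[:, best], data[:, target], data[:, mb], alpha):
            mb.append(best)
            changed = True
    for j in list(mb):                  # backward phase: prune false members
        others = [k for k in mb if k != j]
        if independent(data[:, j], data[:, target], data[:, others], alpha):
            mb.remove(j)
    return mb

# Toy usage: column 2 is the only direct cause of the target column 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
X[:, 0] = 2.0 * X[:, 2] + rng.normal(scale=0.1, size=500)
print(iamb(X, target=0))   # expected: [2]
```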

Journal ArticleDOI
TL;DR: Discusses how the systematic collection and processing of a wide array of genomic, proteomic, chemical and disease-related resource data by the IDG Knowledge Management Center has enabled the development of evidence-based criteria for tracking the target development level (TDL) of human proteins.
Abstract: A large proportion of biomedical research and the development of therapeutics is focused on a small fraction of the human genome. In a strategic effort to map the knowledge gaps around proteins encoded by the human genome and to promote the exploration of currently understudied, but potentially druggable, proteins, the US National Institutes of Health launched the Illuminating the Druggable Genome (IDG) initiative in 2014. In this article, we discuss how the systematic collection and processing of a wide array of genomic, proteomic, chemical and disease-related resource data by the IDG Knowledge Management Center have enabled the development of evidence-based criteria for tracking the target development level (TDL) of human proteins, which indicates a substantial knowledge deficit for approximately one out of three proteins in the human proteome. We then present spotlights on the TDL categories as well as key drug target classes, including G protein-coupled receptors, protein kinases and ion channels, which illustrate the nature of the unexplored opportunities for biomedical research and therapeutic development.

274 citations
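
As a rough illustration of what "evidence-based criteria for tracking the target development level" can look like when encoded, here is a hedged sketch of a rule-based TDL classifier. Only the four-level scheme (Tclin, Tchem, Tbio, Tdark) follows the paper; the field names and thresholds below are invented for illustration and do not reproduce the official IDG criteria.

```python
# Hedged sketch of TDL assignment rules; keys and thresholds are illustrative.
def classify_tdl(target):
    """target: dict of simple evidence counts for one protein."""
    if target.get("approved_drugs_with_moa", 0) > 0:
        return "Tclin"   # targeted by an approved drug with a known mechanism of action
    if target.get("potent_small_molecules", 0) > 0:
        return "Tchem"   # has small-molecule activity above a potency cutoff
    if (target.get("experimental_go_terms", 0) > 0
            or target.get("gene_rifs", 0) > 3
            or target.get("literature_score", 0.0) > 5.0):
        return "Tbio"    # functional or literature evidence, but no chemistry
    return "Tdark"       # understudied: little or no evidence in any category

print(classify_tdl({"potent_small_molecules": 2}))   # -> "Tchem"
```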

Journal ArticleDOI
TL;DR: Systematic evaluation on the training set showed that Conditional Random Fields outperformed Support Vector Machines, and semantic information from existing natural-language-processing systems largely improved performance, although contributions from different types of features varied.

262 citations
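
The result above concerns sequence labeling for clinical concept extraction. A hedged sketch of that kind of setup follows: a linear-chain CRF over per-token features, with a "semantic" feature standing in for output from an existing NLP system. The features, tags, and tiny training example are invented, and this is not the system evaluated in the paper; it assumes the sklearn-crfsuite package.

```python
# Linear-chain CRF sketch for concept extraction (illustrative only).
import sklearn_crfsuite

def token_features(tokens, semantic_tags, i):
    """Simple per-token features; 'sem' mimics semantic input from another NLP system."""
    return {
        "word.lower": tokens[i].lower(),
        "is_title": tokens[i].istitle(),
        "sem": semantic_tags[i],                               # assumed semantic-type feature
        "prev_word": tokens[i - 1].lower() if i > 0 else "<BOS>",
    }

tokens = ["Patient", "denies", "chest", "pain", "."]
sem    = ["person",  "O",      "body",  "finding", "O"]        # invented semantic tags
labels = ["O",       "O",      "B-problem", "I-problem", "O"]  # BIO concept labels

X = [[token_features(tokens, sem, i) for i in range(len(tokens))]]
y = [labels]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, y)
print(crf.predict(X)[0])   # on its own training sentence it should recover the BIO tags
```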

Journal ArticleDOI
TL;DR: Describes two resources developed by the IDG Knowledge Management Center: the Target Central Resource Database (TCRD), which collates many heterogeneous gene/protein datasets, and Pharos (https://pharos.nih.gov), a multimodal web interface that presents the data from TCRD.
Abstract: The 'druggable genome' encompasses several protein families, but only a subset of targets within them have attracted significant research attention and thus have information about them publicly available. The Illuminating the Druggable Genome (IDG) program, initiated in 2014, has the goal of developing experimental techniques and a Knowledge Management Center (KMC) that would collect and organize information about protein targets from four families, representing the most common druggable targets with an emphasis on understudied proteins. Here, we describe two resources developed by the KMC: the Target Central Resource Database (TCRD), which collates many heterogeneous gene/protein datasets, and Pharos (https://pharos.nih.gov), a multimodal web interface that presents the data from TCRD. We briefly describe the types and sources of data considered by the KMC and then highlight features of the Pharos interface designed to enable intuitive access to the IDG knowledgebase. The aim of Pharos is to encourage 'serendipitous browsing', whereby related, relevant information is made easily discoverable. We conclude by describing two use cases that highlight the utility of Pharos and TCRD.

222 citations
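
For programmatic access, Pharos also exposes a GraphQL API on top of TCRD. The endpoint URL, query shape, and field names in the sketch below are assumptions written for illustration and should be checked against the current Pharos API documentation before use.

```python
# Hedged sketch of querying the Pharos GraphQL API (endpoint and fields assumed).
import requests

PHAROS_GRAPHQL = "https://pharos-api.ncats.io/graphql"   # assumed endpoint URL

QUERY = """
query {
  targets { count }
}
"""  # field names are illustrative and unverified

resp = requests.post(PHAROS_GRAPHQL, json={"query": QUERY}, timeout=30)
resp.raise_for_status()
print(resp.json())   # expected shape: {"data": {"targets": {"count": ...}}}
```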


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations
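
The mail-filtering example in the abstract, learning a per-user filter from messages the user keeps or rejects instead of hand-coding rules, can be sketched in a few lines. The toy messages, labels, and the choice of a naive Bayes text classifier are illustrative, not anything prescribed by the article.

```python
# Minimal learned spam filter: the "rules" are induced from labeled examples.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

messages = [
    "limited time offer, claim your prize now",
    "meeting moved to 3pm, see agenda attached",
    "you have won a free cruise, click here",
    "draft of the quarterly report for review",
]
labels = ["spam", "ham", "spam", "ham"]          # what this user rejected or kept

filter_model = make_pipeline(CountVectorizer(), MultinomialNB())
filter_model.fit(messages, labels)               # the learned prediction rules

print(filter_model.predict(["claim your free prize today"]))   # -> ['spam']
```

As the abstract notes, the same pipeline can simply be refit as the user labels more mail, which is what keeps the filter up to date without a programmer rewriting rules.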

Journal ArticleDOI
TL;DR: In this paper, the authors provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box decision support system. Given a problem definition, a black box type, and a desired explanation, the survey should help researchers find the proposals most useful for their own work.
Abstract: In recent years, many accurate decision support systems have been constructed as black boxes, that is, as systems that hide their internal logic from the user. This lack of explanation constitutes both a practical and an ethical issue. The literature reports many approaches aimed at overcoming this crucial weakness, sometimes at the cost of sacrificing accuracy for interpretability. The applications in which black box decision systems can be used are various, and each approach is typically developed to provide a solution for a specific problem; as a consequence, it explicitly or implicitly delineates its own definition of interpretability and explanation. The aim of this article is to provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box system. Given a problem definition, a black box type, and a desired explanation, this survey should help the researcher find the proposals most useful for their own work. The proposed classification of approaches to opening black box models should also be useful for putting the many open research questions in perspective.

2,805 citations
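
One widely used strategy from this literature (the survey classifies such approaches; it does not propose this one specifically) is the global surrogate: fit an interpretable model to mimic the black box's predictions and read the surrogate as the explanation. A minimal sketch with synthetic data:

```python
# Global surrogate sketch: explain a black-box model with a shallow decision tree.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=500, n_features=6, random_state=0)

black_box = RandomForestClassifier(random_state=0).fit(X, y)
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))                 # train on the black box's outputs

fidelity = surrogate.score(X, black_box.predict(X))    # how well the tree mimics it
print(f"surrogate fidelity: {fidelity:.2f}")
print(export_text(surrogate))                          # human-readable rules
```

The fidelity score indicates how faithfully the simple tree reproduces the black box; when it is low, the printed rules should not be trusted as an explanation, which is one of the trade-offs the survey catalogues.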

Book
01 Jan 2004
TL;DR: This book discusses Bayesian networks, Bayesian structure learning, and some of the algorithms used in this framework.
Abstract: Preface. I. BASICS. 1. Introduction to Bayesian Networks. 2. More DAG/Probability Relationships. II. INFERENCE. 3. Inference: Discrete Variables. 4. More Inference Algorithms. 5. Influence Diagrams. III. LEARNING. 6. Parameter Learning: Binary Variables. 7. More Parameter Learning. 8. Bayesian Structure Learning. 9. Approximate Bayesian Structure Learning. 10. Constraint-Based Learning. 11. More Structure Learning. IV. APPLICATIONS. 12. Applications. Bibliography. Index.

2,575 citations
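
In the spirit of the book's early inference chapters, here is a minimal sketch of exact inference by enumeration on a two-node discrete network (Cloudy → Rain); the structure and conditional probability table values are made up for illustration.

```python
# Exact inference on a tiny discrete Bayesian network via enumeration.
p_cloudy = {True: 0.5, False: 0.5}
p_rain_given_cloudy = {True: {True: 0.8, False: 0.2},
                       False: {True: 0.1, False: 0.9}}

def posterior_cloudy(rain_observed):
    """Enumerate the joint and normalize: Bayes' rule on a two-node DAG."""
    joint = {c: p_cloudy[c] * p_rain_given_cloudy[c][rain_observed]
             for c in (True, False)}
    total = sum(joint.values())
    return {c: v / total for c, v in joint.items()}

print(posterior_cloudy(True))   # P(Cloudy=True | Rain=True) = 0.4 / 0.45 ≈ 0.89
```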

Proceedings Article
27 Aug 1998
TL;DR: The integration is done by focusing on mining a special subset of association rules, called class association rules (CARs), and shows that the classifier built this way is more accurate than that produced by the state-of-the-art classification system C4.5.
Abstract: Classification rule mining aims to discover a small set of rules in the database that forms an accurate classifier. Association rule mining finds all the rules existing in the database that satisfy some minimum support and minimum confidence constraints. For association rule mining, the target of discovery is not pre-determined, while for classification rule mining there is one and only one predetermined target. In this paper, we propose to integrate these two mining techniques. The integration is done by focusing on mining a special subset of association rules, called class association rules (CARs). An efficient algorithm is also given for building a classifier based on the set of discovered CARs. Experimental results show that the classifier built this way is, in general, more accurate than that produced by the state-of-the-art classification system C4.5. In addition, this integration helps to solve a number of problems that exist in the current classification systems.

2,479 citations
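
A minimal sketch of the class association rule (CAR) idea: enumerate small itemsets over attribute-value pairs and keep rules of the form itemset → class that satisfy minimum support and confidence. This is a brute-force illustration of the concept, not the efficient algorithm given in the paper, and the toy transactions are invented.

```python
# Brute-force mining of class association rules (CARs) from toy transactions.
from itertools import combinations

transactions = [
    ({"outlook=sunny", "windy=no"},  "play"),
    ({"outlook=sunny", "windy=yes"}, "no_play"),
    ({"outlook=rain",  "windy=no"},  "play"),
    ({"outlook=rain",  "windy=yes"}, "no_play"),
    ({"outlook=sunny", "windy=no"},  "play"),
]

def mine_cars(data, min_support=0.2, min_conf=0.8, max_len=2):
    items, n, rules = sorted(set().union(*(t for t, _ in data))), len(data), []
    for k in range(1, max_len + 1):
        for itemset in combinations(items, k):
            covered = [c for t, c in data if set(itemset) <= t]
            if not covered:
                continue
            for cls in set(covered):
                support = covered.count(cls) / n              # rule support
                confidence = covered.count(cls) / len(covered)
                if support >= min_support and confidence >= min_conf:
                    rules.append((itemset, cls, support, confidence))
    return rules

for rule in mine_cars(transactions):
    print(rule)   # e.g. (('windy=no',), 'play', 0.6, 1.0)
```

A classifier can then be built by sorting the surviving rules (by confidence, then support) and applying the first rule that covers a new case, which is roughly the role the classifier-building step in the paper plays.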

Journal ArticleDOI
TL;DR: On a compendium of single-cell data from tumors and brain, it is demonstrated that cis-regulatory analysis can be exploited to guide the identification of transcription factors and cell states.
Abstract: We present SCENIC, a computational method for simultaneous gene regulatory network reconstruction and cell-state identification from single-cell RNA-seq data (http://scenic.aertslab.org). On a compendium of single-cell data from tumors and brain, we demonstrate that cis-regulatory analysis can be exploited to guide the identification of transcription factors and cell states. SCENIC provides critical biological insights into the mechanisms driving cellular heterogeneity.

2,277 citations
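
SCENIC-style pipelines begin with a co-expression step (GENIE3/GRNBoost-like) before the cis-regulatory pruning and regulon scoring the abstract emphasizes. The sketch below shows only that first step, under the assumption that tree-ensemble feature importances over candidate transcription factors approximate regulatory links; the expression matrix is random placeholder data, and the motif analysis and cell-state scoring are not shown.

```python
# GENIE3-style co-expression sketch: per-target-gene feature importances of TFs.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
tfs, targets, n_cells = ["TF1", "TF2", "TF3"], ["GeneA", "GeneB"], 200
expr_tf = rng.poisson(2.0, size=(n_cells, len(tfs))).astype(float)
# Synthetic target genes driven mostly by TF1, plus noise.
expr_target = expr_tf[:, [0]] * 1.5 + rng.normal(0.0, 0.5, size=(n_cells, len(targets)))

links = []
for j, gene in enumerate(targets):
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(expr_tf, expr_target[:, j])
    links += [(tf, gene, w) for tf, w in zip(tfs, model.feature_importances_)]

# Highest-importance TF -> gene pairs are the candidate regulatory interactions.
for tf, gene, w in sorted(links, key=lambda x: -x[2])[:3]:
    print(f"{tf} -> {gene}: importance {w:.2f}")
```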