Author

Sorin Draghici

Other affiliations: University of Cincinnati, Los Alamos National Laboratory, The Microsoft Research - University of Trento Centre for Computational and Systems Biology ...

Bio: Sorin Draghici is an academic researcher at Wayne State University. The author has contributed to research on topics including artificial neural networks and gene expression profiling. The author has an h-index of 49 and has co-authored 182 publications that have received 11,895 citations. Previous affiliations of Sorin Draghici include the University of Cincinnati and Los Alamos National Laboratory.

Papers published on a yearly basis (chart, 1996 to 2023)

Papers

Journal Article•DOI•

A systems biology approach for pathway level analysis

Sorin Draghici¹, Purvesh Khatri¹, Adi L. Tarca, Kashyap Amin¹, Arina Done¹, Calin Voichita¹, Constantin Georgescu¹, Roberto Romero

Wayne State University¹

01 Oct 2007-Genome Research

TL;DR: An impact analysis is developed that includes the classical statistics but also considers other crucial factors such as the magnitude of each gene's expression change, their type and position in the given pathways, their interactions, etc.

Abstract: A common challenge in the analysis of genomics data is trying to understand the underlying phenomenon in the context of all complex interactions taking place on various signaling pathways. A statistical approach using various models is universally used to identify the most relevant pathways in a given experiment. Here, we show that the existing pathway analysis methods fail to take into consideration important biological aspects and may provide incorrect results in certain situations. By using a systems biology approach, we developed an impact analysis that includes the classical statistics but also considers other crucial factors such as the magnitude of each gene’s expression change, their type and position in the given pathways, their interactions, etc. The impact analysis is an attempt to a deeper level of statistical analysis, informed by more pathway-specific biology than the existing techniques. On several illustrative data sets, the classical analysis produces both false positives and false negatives, while the impact analysis provides biologically meaningful results. This analysis method has been implemented as a Web-based tool, Pathway-Express, freely available as part of the Onto-Tools (http://vortex.cs.wayne.edu).

1,069 citations

Journal Article•DOI•

A novel signaling pathway impact analysis

Adi L. Tarca¹, Sorin Draghici¹, Purvesh Khatri¹, Sonia S. Hassan¹, Pooja Mittal¹, Jung-Sun Kim¹, Chong Jai Kim¹, Juan Pedro Kusanovic¹, Roberto Romero¹

Wayne State University¹

01 Jan 2009-Bioinformatics

TL;DR: A novel signaling pathway impact analysis (SPIA) that combines the evidence obtained from the classical enrichment analysis with a novel type of evidence, which measures the actual perturbation on a given pathway under a given condition.

Abstract: Motivation: Gene expression class comparison studies may identify hundreds or thousands of genes as differentially expressed (DE) between sample groups. Gaining biological insight from the result of such experiments can be approached, for instance, by identifying the signaling pathways impacted by the observed changes. Most of the existing pathway analysis methods focus on either the number of DE genes observed in a given pathway (enrichment analysis methods), or on the correlation between the pathway genes and the class of the samples (functional class scoring methods). Both approaches treat the pathways as simple sets of genes, disregarding the complex gene interactions that these pathways are built to describe. Results: We describe a novel signaling pathway impact analysis (SPIA) that combines the evidence obtained from the classical enrichment analysis with a novel type of evidence, which measures the actual perturbation on a given pathway under a given condition. A bootstrap procedure is used to assess the significance of the observed total pathway perturbation. Using simulations we show that the evidence derived from perturbations is independent of the pathway enrichment evidence. This allows us to calculate a global pathway significance P-value, which combines the enrichment and perturbation P-values. We illustrate the capabilities of the novel method on four real datasets. The results obtained on these data show that SPIA has better specificity and more sensitivity than several widely used pathway analysis methods. Availability: SPIA was implemented as an R package available at http://vortex.cs.wayne.edu/ontoexpress/ Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
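
The abstract says that a global pathway P-value combines the enrichment and perturbation P-values, but it does not state the formula. As a minimal sketch, assuming the two P-values are independent and uniformly distributed under the null hypothesis, the probability of observing a product at least as small as c is c - c ln(c). The function name `combine_pvalues_product` and the example inputs are hypothetical and are not taken from the SPIA package.

```python
import math


def combine_pvalues_product(p_enrichment: float, p_perturbation: float) -> float:
    """Combine two independent p-values through the distribution of their product.

    For independent U1, U2 ~ Uniform(0, 1): P(U1 * U2 <= c) = c - c * ln(c).
    """
    for p in (p_enrichment, p_perturbation):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie in [0, 1]")
    c = p_enrichment * p_perturbation
    if c == 0.0:
        return 0.0  # limit of c - c * ln(c) as c approaches 0
    return c - c * math.log(c)


if __name__ == "__main__":
    # Hypothetical inputs, not values reported in the paper.
    print(combine_pvalues_product(0.01, 0.20))
```

The combined value is always at least as large as the raw product, which reflects that a small product can arise by chance from two unremarkable P-values.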

952 citations

Journal Article•DOI•

Ontological analysis of gene expression data: current tools, limitations, and open problems

Purvesh Khatri¹, Sorin Draghici¹

Wayne State University¹

15 Sep 2005-Bioinformatics

TL;DR: A detailed comparison of the capabilities of 14 ontological analysis tools is presented using the following criteria: scope of the analysis, visualization capabilities, statistical model used, correction for multiple comparisons, reference microarrays available, installation issues and sources of annotation data.

Abstract: Summary: Independent of the platform and the analysis methods used, the result of a microarray experiment is, in most cases, a list of differentially expressed genes. An automatic ontological analysis approach has been recently proposed to help with the biological interpretation of such results. Currently, this approach is the de facto standard for the secondary analysis of high throughput experiments and a large number of tools have been developed for this purpose. We present a detailed comparison of 14 such tools using the following criteria: scope of the analysis, visualization capabilities, statistical model(s) used, correction for multiple comparisons, reference microarrays available, installation issues and sources of annotation data. This detailed analysis of the capabilities of these tools will help researchers choose the most appropriate tool for a given type of analysis. More importantly, in spite of the fact that this type of analysis has been generally adopted, this approach has several important intrinsic drawbacks. These drawbacks are associated with all tools discussed and represent conceptual limitations of the current state-of-the-art in ontological analysis. We propose these as challenges for the next generation of secondary data analysis tools. Contact: [email protected]
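
The criteria listed above include the statistical model used and the correction for multiple comparisons. The abstract does not prescribe a particular model, so the following is only a sketch of one commonly used choice: a one-sided hypergeometric test for over-representation of an annotation category in a list of differentially expressed genes, followed by a Bonferroni correction. All function names, parameter names, and numbers are hypothetical.

```python
from math import comb
from typing import List


def enrichment_pvalue(population: int, annotated: int, drawn: int, hits: int) -> float:
    """One-sided hypergeometric p-value P(X >= hits).

    population: genes in the reference set (for example, all genes on the array)
    annotated:  reference genes annotated to the category
    drawn:      genes in the differentially expressed list
    hits:       differentially expressed genes annotated to the category
    """
    if not (0 <= annotated <= population and 0 <= drawn <= population):
        raise ValueError("annotated and drawn must lie in [0, population]")
    upper = min(annotated, drawn)
    if not 0 <= hits <= upper:
        raise ValueError("hits must lie in [0, min(annotated, drawn)]")
    total = comb(population, drawn)
    # Sum the upper tail; comb() returns 0 for impossible outcomes.
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


def bonferroni(pvalues: List[float]) -> List[float]:
    """Bonferroni correction: multiply by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


if __name__ == "__main__":
    # Hypothetical counts for two categories tested against the same gene list.
    raw = [
        enrichment_pvalue(population=10000, annotated=200, drawn=300, hits=15),
        enrichment_pvalue(population=10000, annotated=500, drawn=300, hits=16),
    ]
    print(bonferroni(raw))
```

Bonferroni is the most conservative of the usual corrections; it is used here only because it fits in one line.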

881 citations

Journal Article•DOI•

Reliability and reproducibility issues in DNA microarray measurements

Sorin Draghici¹, Purvesh Khatri¹, Aron Charles Eklund², Zoltan Szallasi³

Wayne State University¹, Brigham and Women's Hospital², Harvard University³

01 Feb 2006-Trends in Genetics

TL;DR: DNA microarrays enable researchers to monitor the expression of thousands of genes simultaneously but the current technology has several limitations, which need to be addressed.

619 citations

Journal Article•DOI•

Machine learning and its applications to biology.

Adi L. Tarca, Vincent J. Carey, Xue-wen Chen, Roberto Romero, Sorin Draghici

29 Jun 2007-PLOS Computational Biology

TL;DR: This tutorial discusses machine learning, that is, the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction based on models derived from existing data. It covers supervised and unsupervised learning, with methods and examples implemented in R, the open source data analysis and visualization language.

Abstract: The term machine learning refers to a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data. Two facets of mechanization should be acknowledged when considering machine learning in broad terms. Firstly, it is intended that the classification and prediction tasks can be accomplished by a suitably programmed computing machine. That is, the product of machine learning is a classifier that can be feasibly used on available hardware. Secondly, it is intended that the creation of the classifier should itself be highly mechanized, and should not involve too much human input. This second facet is inevitably vague, but the basic objective is that the use of automatic algorithm construction methods can minimize the possibility that human biases could affect the selection and performance of the algorithm. Both the creation of the algorithm and its operation to classify objects or predict events are to be based on concrete, observable data. The history of relations between biology and the field of machine learning is long and complex. An early technique [1] for machine learning called the perceptron constituted an attempt to model actual neuronal behavior, and the field of artificial neural network (ANN) design emerged from this attempt. Early work on the analysis of translation initiation sequences [2] employed the perceptron to define criteria for start sites in Escherichia coli. Further artificial neural network architectures such as the adaptive resonance theory (ART) [3] and neocognitron [4] were inspired from the organization of the visual nervous system. 
In the intervening years, the flexibility of machine learning techniques has grown along with mathematical frameworks for measuring their reliability, and it is natural to hope that machine learning methods will improve the efficiency of discovery and understanding in the mounting volume and complexity of biological data. This tutorial is structured in four main components. Firstly, a brief section reviews definitions and mathematical prerequisites. Secondly, the field of supervised learning is described. Thirdly, methods of unsupervised learning are reviewed. Finally, a section reviews methods and examples as implemented in the open source data analysis and visualization language R (http://www.r-project.org).

523 citations

Cited by

Journal Article•DOI•

Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources.

Da-Wei Huang¹, Brad T. Sherman¹, Richard A. Lempicki¹

Science Applications International Corporation¹

01 Jan 2009-Nature Protocols

TL;DR: By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.

Abstract: DAVID bioinformatics resources consists of an integrated biological knowledgebase and analytic tools aimed at systematically extracting biological meaning from large gene/protein lists. This protocol explains how to use DAVID, a high-throughput and integrated data-mining environment, to analyze gene lists derived from high-throughput genomic experiments. The procedure first requires uploading a gene list containing any number of common gene identifiers followed by analysis using one or more text and pathway-mining tools such as gene functional classification, functional annotation chart or clustering and functional annotation table. By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.

31,015 citations

Journal Article•

The Design and Analysis of Experiments

Margaret J. Robertson

01 Jun 1953-Yale Journal of Biology and Medicine

TL;DR: This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment.

Abstract: THE DESIGN AND ANALYSIS OF EXPERIMENTS. By Oscar Kempthorne. New York, John Wiley and Sons, Inc., 1952. 631 pp. $8.50. This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment. It is necessary to have some facility with algebraic notation and manipulation to be able to use the volume intelligently. The problems are presented from the theoretical point of view, without such practical examples as would be helpful for those not acquainted with mathematics. The mathematical justification for the techniques is given. As a somewhat advanced treatment of the design and analysis of experiments, this volume will be interesting and helpful for many who approach statistics theoretically as well as practically. With emphasis on the "why," and with description given broadly, the author relates the subject matter to the general theory of statistics and to the general problem of experimental inference. MARGARET J. ROBERTSON

13,333 citations

Journal Article•DOI•

Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists

Da-Wei Huang¹, Brad T. Sherman¹, Richard A. Lempicki¹

Science Applications International Corporation¹

01 Jan 2009-Nucleic Acids Research

TL;DR: The survey will help tool designers/developers and experienced end users understand the underlying algorithms and pertinent details of particular tool categories/tools, enabling them to make the best choices for their particular research interests.

Abstract: Functional analysis of large gene lists, derived in most cases from emerging high-throughput genomic, proteomic and bioinformatics scanning approaches, is still a challenging and daunting task. The gene-annotation enrichment analysis is a promising high-throughput strategy that increases the likelihood for investigators to identify biological processes most pertinent to their study. Approximately 68 bioinformatics enrichment tools that are currently available in the community are collected in this survey. Tools are uniquely categorized into three major classes, according to their underlying enrichment algorithms. The comprehensive collections, unique tool classifications and associated questions/issues will provide a more comprehensive and up-to-date view regarding the advantages, pitfalls and recent trends in a simpler tool-class level rather than by a tool-by-tool approach. Thus, the survey will help tool designers/developers and experienced end users understand the underlying algorithms and pertinent details of particular tool categories/tools, enabling them to make the best choices for their particular research interests.

13,102 citations

SPAdes, a new genome assembly algorithm and its applications to single-cell sequencing (7th Annual SFAF Meeting, 2012)

Glenn Tesler

01 Jun 2012

TL;DR: SPAdes is a new assembler for both single-cell and standard (multicell) assembly; the authors demonstrate that it improves on the recently released E+V-SC assembler and on the popular assemblers Velvet and SoapDeNovo (for multicell data).

Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online ( http://bioinf.spbau.ru/spades ). It is distributed as open source software.

10,124 citations

Journal Article•DOI•

Metascape provides a biologist-oriented resource for the analysis of systems-level datasets.

Yingyao Zhou¹, Bin Zhou¹, Lars Pache², Max W. Chang³, Alireza Hadj Khodabakhshi¹, Olga Tanaseichuk¹, Christopher Benner³, Sumit K. Chanda²

Genomics Institute of the Novartis Research Foundation¹, Discovery Institute², University of California, San Diego³

03 Apr 2019-Nature Communications

TL;DR: A biologist-oriented portal that provides a gene list annotation, enrichment and interactome resource and enables integrated analysis of multi-OMICs datasets, Metascape is an effective and efficient tool for experimental biologists to comprehensively analyze and interpret OMICs-based studies in the big data era.

Abstract: A critical component in the interpretation of systems-level studies is the inference of enriched biological pathways and protein complexes contained within OMICs datasets. Successful analysis requires the integration of a broad set of current biological databases and the application of a robust analytical pipeline to produce readily interpretable results. Metascape is a web-based portal designed to provide a comprehensive gene list annotation and analysis resource for experimental biologists. In terms of design features, Metascape combines functional enrichment, interactome analysis, gene annotation, and membership search to leverage over 40 independent knowledgebases within one integrated portal. Additionally, it facilitates comparative analyses of datasets across multiple independent and orthogonal experiments. Metascape provides a significantly simplified user experience through a one-click Express Analysis interface to generate interpretable outputs. Taken together, Metascape is an effective and efficient tool for experimental biologists to comprehensively analyze and interpret OMICs-based studies in the big data era.

6,282 citations
