Author

Shailesh Tripathi

Bio: Shailesh Tripathi is an academic researcher from Steyr Mannlicher. The author has contributed to research in the topics of computer science and network science. The author has an h-index of 11 and has co-authored 45 publications receiving 451 citations. Previous affiliations of Shailesh Tripathi include the Indian Institutes of Technology and the University of Arkansas for Medical Sciences.

Papers
Journal Article
28 Feb 2020
TL;DR: This paper presents an introductory review of deep learning approaches including Deep Feedforward Neural Networks (D-FFNN), Convolutional Neural networks (CNNs), Deep Belief Networks (DBNs), Autoencoders (AEs), and Long Short-Term Memory (LSTM) networks.
Abstract: Deep learning models stand for a new learning paradigm in artificial intelligence (AI) and machine learning. Recent breakthrough results in image analysis and speech recognition have generated a massive interest in this field because applications in many other domains that provide big data also seem possible. On the downside, the mathematical and computational methodology underlying deep learning models is very challenging, especially for interdisciplinary scientists. For this reason, we present in this paper an introductory review of deep learning approaches including Deep Feedforward Neural Networks (D-FFNN), Convolutional Neural Networks (CNNs), Deep Belief Networks (DBNs), Autoencoders (AEs), and Long Short-Term Memory (LSTM) networks. These models form the major core architectures of deep learning currently in use and should belong in any data scientist's toolbox. Importantly, these core architectural building blocks can be composed flexibly, in an almost Lego-like manner, to build new application-specific network architectures. Hence, a basic understanding of these network architectures is important to be prepared for future developments in AI.
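As a rough illustration of the Lego-like composition described above (not code from the paper), two of the named building blocks, a convolutional feature extractor and a feedforward classifier head, can be stacked in a few lines; the framework (PyTorch) and all layer sizes are assumptions chosen only for the sketch.

```python
# Hypothetical sketch: composing core building blocks (convolutional feature
# extractor + deep feedforward classifier) in a "Lego-like" way.
# Layer sizes and the 28x28 input are arbitrary placeholders.
import torch
import torch.nn as nn

conv_block = nn.Sequential(            # CNN-style feature extractor
    nn.Conv2d(1, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),                   # 28x28 -> 14x14
)

ffnn_block = nn.Sequential(            # feedforward classifier head
    nn.Flatten(),
    nn.Linear(16 * 14 * 14, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

model = nn.Sequential(conv_block, ffnn_block)   # compose the blocks

x = torch.randn(8, 1, 28, 28)          # dummy batch of 8 grayscale images
print(model(x).shape)                  # torch.Size([8, 10])
```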

296 citations

Journal Article
TL;DR: NetBioV (Network Biology Visualization) is an R package for visualizing large network data in biology and medicine, enabling an organized and reproducible visualization of networks by emphasizing structural properties of biological relevance.
Abstract: Summary: NetBioV (Network Biology Visualization) is an R package that allows the visualization of large network data in biology and medicine. The purpose of NetBioV is to enable an organized and reproducible visualization of networks by emphasizing or highlighting specific structural properties that are of biological relevance.
Availability and implementation: NetBioV is freely available for academic use. The package has been tested for R 2.14.2 under Linux, Windows and Mac OS X. It is available from Bioconductor.
Contact: v@bio-complexity.com
Supplementary information: Supplementary data are available at Bioinformatics online.
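NetBioV itself is an R/Bioconductor package, so the following is only a loose Python analogue of the idea it automates: drawing a large network while emphasizing a structural property (here, node degree). The use of networkx, matplotlib, and a generated toy graph are all assumptions of the sketch, not the package's API.

```python
# Illustrative only: highlight a structural property (node degree) while
# drawing a large network, in the spirit of an organized, reproducible layout.
import networkx as nx
import matplotlib.pyplot as plt

G = nx.barabasi_albert_graph(200, 2, seed=1)   # stand-in for a biological network
degrees = [G.degree(n) for n in G.nodes()]

pos = nx.spring_layout(G, seed=1)              # fixed seed for reproducibility
nx.draw_networkx_edges(G, pos, alpha=0.2)
nx.draw_networkx_nodes(G, pos, node_size=20,
                       node_color=degrees, cmap=plt.cm.viridis)
plt.axis("off")
plt.savefig("network_degree_highlight.png", dpi=150)
```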

47 citations

Journal Article
TL;DR: By integrating high-throughput mutational, gene expression and DNA methylation data, this study reveals for the first time the distinct molecular profile of small bowel adenocarcinoma and highlights potential clinically exploitable markers.
Abstract: Small bowel accounts for only 0.5% of cancer cases in the US, but incidence rates have been rising at 2.4% per year over the past decade. One-third of these are adenocarcinomas, yet little is known about their molecular pathology and no molecular markers are available for clinical use. Using a retrospective 28-patient matched normal-tumor cohort, next-generation sequencing, gene expression arrays and CpG methylation arrays were used for molecular profiling. Next-generation sequencing identified novel mutations in IDH1, CDH1, KIT, FGFR2, FLT3, NPM1, PTEN, MET, AKT1, RET, NOTCH1 and ERBB4. Array data revealed 17% of CpGs and 5% of RNA transcripts assayed to be differentially methylated and expressed, respectively (p < 0.01). Merging gene expression and DNA methylation data revealed CHN2 as consistently hypermethylated and downregulated in this disease (Spearman −0.71, p < 0.001). Mutations in TP53, which were found in more than half of the cohort (15/28), and Kazald1 hypomethylation were both indicative of poor survival (p = 0.03, HR = 3.2 and p = 0.01, HR = 4.9, respectively). By integrating high-throughput mutational, gene expression and DNA methylation data, this study reveals for the first time the distinct molecular profile of small bowel adenocarcinoma and highlights potential clinically exploitable markers.
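The methylation-expression integration behind the CHN2 finding comes down to a per-gene rank correlation across matched samples; a minimal sketch of that single step (not the study's pipeline), with fabricated values and assuming SciPy, is shown below.

```python
# Minimal sketch: correlate one gene's CpG methylation level with its
# expression across matched samples, as in the reported CHN2 result.
# All values below are fabricated placeholders.
import numpy as np
from scipy.stats import spearmanr

methylation = np.array([0.82, 0.75, 0.91, 0.60, 0.88, 0.70, 0.95, 0.65])  # beta values
expression  = np.array([2.1,  3.0,  1.4,  4.2,  1.8,  3.5,  1.1,  4.0])   # log2 intensity

rho, p = spearmanr(methylation, expression)
# A strongly negative rho indicates hypermethylated and downregulated.
print(f"Spearman rho = {rho:.2f}, p = {p:.3g}")
```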

42 citations

Journal Article
01 Jan 2013
TL;DR: The construction, the application, the meaning and the interpretation of the Diseasome network, which enables a systematic connection between the molecular and the phenotype level, and derived models like the human disease network are reviewed.
Abstract: In this paper, we review the construction, the application, the meaning and the interpretation of the Diseasome network, which enables a systematic connection between the molecular and the phenotype level, as well as derived models such as the human disease network. Further, we survey recent conceptual and methodological enhancements that integrate data from diverse sources, e.g., from protein databases or genome-wide association studies. For our review, we assume a “data-centric” view that allows us to distinguish different approaches based on the types of data used in a model. In addition, we discuss the need for network-based approaches in medicine.

35 citations

Journal Article
TL;DR: A comparative analysis of five popular and four novel module detection algorithms reveals a large heterogeneity among the different module prediction algorithms when one zooms in to the level of biological processes in the form of GO terms, and shows that all methods are severely affected by even a slight perturbation of the networks.
Abstract: It is generally acknowledged that a functional understanding of a biological system can only be obtained by an understanding of the collective of molecular interactions in the form of biological networks. Protein networks are one particular network type of special importance, because proteins form the functional base units of every biological cell. On a mesoscopic level of protein networks, modules are of significant importance because these building blocks may be the next elementary functional level above individual proteins, allowing one to gain insight into fundamental organizational principles of biological cells. In this paper, we provide a comparative analysis of five popular and four novel module detection algorithms. We study these module prediction methods for simulated benchmark networks as well as 10 biological protein interaction networks (PINs). A particular focus of our analysis is placed on the biological meaning of the predicted modules by utilizing the Gene Ontology (GO) database as a gold standard for the definition of biological processes. Furthermore, we investigate the robustness of the results by perturbing the PINs, simulating in this way our incomplete knowledge of protein networks. Overall, our study reveals that there is a large heterogeneity among the different module prediction algorithms when one zooms in to the level of biological processes in the form of GO terms, and that all methods are severely affected by a slight perturbation of the networks. However, we also find pathways that are enriched in multiple modules, which could provide important information about the hierarchical organization of the system.
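A toy sketch of this study design, assuming networkx and its greedy modularity algorithm as a stand-in for the nine evaluated methods: detect modules, rewire a small fraction of edges to mimic incomplete knowledge of the network, and detect them again.

```python
# Rough illustration (not the paper's code): detect modules in a
# protein-interaction-like network, perturb the network by rewiring ~5% of
# its edges, and compare the number of detected modules before and after.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.connected_caveman_graph(10, 8)          # toy network with built-in modules
modules = greedy_modularity_communities(G)
print("modules before perturbation:", len(modules))

H = G.copy()                                    # simulate incomplete knowledge
nx.double_edge_swap(H, nswap=int(0.05 * H.number_of_edges()),
                    max_tries=10_000, seed=0)
perturbed = greedy_modularity_communities(H)
print("modules after perturbation:", len(perturbed))
```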

29 citations


Cited by
Journal Article
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).
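The mail-filtering example in the fourth category is essentially supervised text classification; a toy sketch of that idea, assuming scikit-learn and a handful of fabricated messages, is shown below.

```python
# Toy illustration of the per-user mail filter: learn from messages the user
# kept (label 0) or rejected (label 1). Messages and labels are fabricated;
# assumes scikit-learn is installed.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

messages = [
    "meeting agenda for tomorrow",
    "limited time offer, claim your prize now",
    "quarterly report attached",
    "you have won a free cruise, click here",
]
rejected = [0, 1, 0, 1]                       # labels the user implicitly provides

filter_model = make_pipeline(CountVectorizer(), MultinomialNB())
filter_model.fit(messages, rejected)          # rules are learned, not hand-coded

print(filter_model.predict(["claim your free prize today"]))   # likely [1]
```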

13,246 citations

Proceedings Article
22 Jan 2006
TL;DR: Some of the major results in random graphs and some of the more challenging open problems are reviewed, covering algorithmic and structural questions and touching on newer models, including those related to the WWW.
Abstract: We will review some of the major results in random graphs and some of the more challenging open problems. We will cover algorithmic and structural questions. We will touch on newer models, including those related to the WWW.
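As a small illustration of the two families of models contrasted here (not material from the talk itself), the classical G(n, p) model can be compared with a preferential-attachment model of the kind used for the WWW; networkx and the chosen parameters are assumptions.

```python
# Compare the classical Erdos-Renyi G(n, p) model with a preferential-
# attachment (Barabasi-Albert) model; parameters are arbitrary.
import networkx as nx

n = 10_000
er = nx.gnp_random_graph(n, p=4 / n, seed=0)   # G(n, p), mean degree ~4
ba = nx.barabasi_albert_graph(n, 2, seed=0)    # preferential attachment

for name, g in [("G(n,p)", er), ("Barabasi-Albert", ba)]:
    degs = [d for _, d in g.degree()]
    print(f"{name}: max degree {max(degs)}, mean degree {sum(degs) / n:.2f}")
# The heavy-tailed degree distribution of the second model shows up as a
# much larger maximum degree at a similar mean degree.
```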

7,116 citations

01 Jan 2012

3,692 citations

Journal Article
TL;DR: Why interactome networks are important to consider in biology, how they can be mapped and integrated with each other, what global properties are starting to emerge from interactome network models, and how these properties may relate to human disease are detailed.
Abstract: Complex biological systems and cellular networks may underlie most genotype to phenotype relationships. Here, we review basic concepts in network biology, discussing different types of interactome networks and the insights that can come from analyzing them. We elaborate on why interactome networks are important to consider in biology, how they can be mapped and integrated with each other, what global properties are starting to emerge from interactome network models, and how these properties may relate to human disease.

1,323 citations

Book Chapter
E.R. Davies
01 Jan 1990
TL;DR: This chapter introduces the subject of statistical pattern recognition (SPR) by considering how features are defined and emphasizes that the nearest neighbor algorithm achieves error rates comparable with those of an ideal Bayes’ classifier.
Abstract: This chapter introduces the subject of statistical pattern recognition (SPR). It starts by considering how features are defined and emphasizes that the nearest neighbor algorithm achieves error rates comparable with those of an ideal Bayes’ classifier. The concepts of an optimal number of features, representativeness of the training data, and the need to avoid overfitting to the training data are stressed. The chapter shows that methods such as the support vector machine and artificial neural networks are subject to these same training limitations, although each has its advantages. For neural networks, the multilayer perceptron architecture and back-propagation algorithm are described. The chapter distinguishes between supervised and unsupervised learning, demonstrating the advantages of the latter and showing how methods such as clustering and principal components analysis fit into the SPR framework. The chapter also defines the receiver operating characteristic, which allows an optimum balance between false positives and false negatives to be achieved.
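A compact sketch of two of the chapter's concepts, nearest-neighbor classification and the receiver operating characteristic; scikit-learn and its bundled breast-cancer dataset are assumptions made only for the example.

```python
# Minimal sketch: a k-nearest-neighbor classifier plus an ROC curve trading
# off false positives against false negatives. Dataset choice is arbitrary.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import roc_curve, roc_auc_score

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
scores = knn.predict_proba(X_te)[:, 1]          # score for the positive class

fpr, tpr, _ = roc_curve(y_te, scores)           # points along the ROC curve
print("test accuracy:", knn.score(X_te, y_te))
print("ROC AUC:", roc_auc_score(y_te, scores))
print("ROC curve points:", len(fpr))
```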

1,189 citations