Author

Kunal Malhotra

Bio: Kunal Malhotra is an academic researcher from Georgia Institute of Technology. The author has contributed to research in topics: Database design & Information schema. The author has an h-index of 3, co-authored 5 publications receiving 45 citations.

Papers
Journal ArticleDOI
TL;DR: The predictive model presented in this study is a preliminary step in a long-term plan of developing personalized treatment plans for GBM patients that can later be extended to other types of cancers.

29 citations

Proceedings ArticleDOI
15 Jul 2012
TL;DR: The results of the evaluation show that the technique developed and presented in this paper is both efficient and effective in selecting tests to rerun and in reducing the overall time required to perform regression testing.
Abstract: To manage and integrate information gathered from heterogeneous databases, an ontology is often used. Like all systems, ontology-driven systems evolve over time and must be regression tested to gain confidence in the behavior of the modified system. Because rerunning all existing tests can be extremely expensive, researchers have developed regression-test-selection (RTS) techniques that select a subset of the available tests that are affected by the changes, and use this subset to test the modified system. Existing RTS techniques have been shown to be effective, but they operate on the code and are unable to handle changes that involve ontologies. To address this limitation, we developed and present in this paper a novel RTS technique that targets ontology-driven systems. Our technique creates representations of the old and new ontologies, compares them to identify entities affected by the changes, and uses this information to select the subset of tests to rerun. We also describe in this paper OntoRetest, a tool that implements our technique and that we used to empirically evaluate our approach on two biomedical ontology-driven database systems. The results of our evaluation show that our technique is both efficient and effective in selecting tests to rerun and in reducing the overall time required to perform regression testing.
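The selection step described above — diff the old and new ontologies, then rerun only the tests that touch affected entities — can be sketched as follows. This is an illustrative simplification, not the OntoRetest implementation: the entity names, test-to-entity mapping, and flat-dictionary ontology representation are all assumptions.

```python
# Hypothetical sketch of ontology-aware regression test selection (RTS).
# Ontologies are modeled as dicts mapping entity -> definition; real systems
# would compare richer graph representations.

def changed_entities(old_ontology, new_ontology):
    """Return entities added, removed, or redefined between two ontology versions."""
    affected = set()
    for entity in old_ontology.keys() | new_ontology.keys():
        if old_ontology.get(entity) != new_ontology.get(entity):
            affected.add(entity)
    return affected

def select_tests(test_to_entities, affected):
    """Select only the tests that exercise at least one affected entity."""
    return [t for t, ents in test_to_entities.items() if ents & affected]

old = {"Patient": ("hasDiagnosis",), "Drug": ("hasDose",)}
new = {"Patient": ("hasDiagnosis", "hasAllergy"), "Drug": ("hasDose",)}

tests = {
    "test_patient_query": {"Patient"},
    "test_drug_lookup":   {"Drug"},
}

affected = changed_entities(old, new)
print(sorted(affected))               # ['Patient']
print(select_tests(tests, affected))  # ['test_patient_query']
```

Only the test covering the redefined `Patient` entity is rerun; the unaffected `Drug` test is safely skipped, which is the source of the time savings the evaluation reports.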

10 citations

Proceedings ArticleDOI
29 Oct 2015
TL;DR: Analysis of electronic healthcare reimbursement claims across the United States reveals that, in contrast to HD and BC, clinical procedures for ASD are highly varied both leading up to and after the ASD diagnosis.
Abstract: We examine the use of electronic healthcare reimbursement claims (EHRC) for analyzing healthcare delivery and practice patterns across the United States (US). We show that EHRCs are correlated with disease incidence estimates published by the Centers for Disease Control. Further, by analyzing over 1 billion EHRCs, we track patterns of clinical procedures administered to patients with autism spectrum disorder (ASD), heart disease (HD) and breast cancer (BC) using sequential pattern mining algorithms. Our analyses reveal that in contrast to treating HD and BC, clinical procedures for ASD diagnoses are highly varied leading up to and after the ASD diagnoses. The discovered clinical procedure sequences also reveal significant differences in the overall costs incurred across different parts of the US, indicating a lack of consensus amongst practitioners in treating ASD patients. We show that a data-driven approach to understand clinical trajectories using EHRC can provide quantitative insights into how to better manage and treat patients. Based on our experience, we also discuss emerging challenges in using EHRC datasets for gaining insights into the state of contemporary healthcare delivery and practice in the US.
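The sequential-pattern-mining step described above can be illustrated with a minimal frequent-pair counter: for each patient's ordered sequence of procedures, count ordered pairs (procedure a before procedure b) and keep those meeting a support threshold. The procedure names and data are invented, and real algorithms (e.g., PrefixSpan-style miners) handle longer patterns and far larger data.

```python
# Illustrative sketch of mining frequent ordered procedure pairs from claims.
from collections import Counter

def frequent_pairs(patient_sequences, min_support):
    """Count ordered procedure pairs (a before b), once per patient; keep frequent ones."""
    counts = Counter()
    for seq in patient_sequences:
        seen_pairs = set()
        for i in range(len(seq)):
            for j in range(i + 1, len(seq)):
                seen_pairs.add((seq[i], seq[j]))
        counts.update(seen_pairs)  # each pair counted at most once per patient
    return {p: c for p, c in counts.items() if c >= min_support}

claims = [
    ["screening", "hearing_test", "behavior_eval"],
    ["screening", "behavior_eval"],
    ["screening", "hearing_test"],
]
print(frequent_pairs(claims, min_support=2))
```

On this toy data, "screening before hearing_test" and "screening before behavior_eval" each appear in two of three patients and survive the support cutoff; the rarer "hearing_test before behavior_eval" pair is pruned.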

8 citations

Book ChapterDOI
16 Jun 2014
TL;DR: A form-based approach to schema creation and modification, motivated by the needs of healthcare IT, where small group practices currently need systems that cater to their dynamic requirements without depending on EMR (Electronic Medical Record) systems.
Abstract: The traditional approach to relational database design starts with the conceptual design of an application-based schema in a model like the Entity-Relationship model, then maps that to a logical design, eventually representing it as a set of related normalized tables. The project we present is motivated by the needs of healthcare IT, where small group practices currently need systems that cater to their dynamic requirements without depending on EMR (Electronic Medical Record) systems. It is also relevant for researchers mining huge repositories of data, such as social networks, who need to create extracts of data on the fly for data analytics. Because the data is likely to vary based on user characteristics and needs, a dynamic back-end database must be created. This paper addresses a form-based approach to schema creation and modification.
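The idea of deriving a back-end schema from a user-defined form can be sketched as below. The form field types, type mapping, and table layout are assumptions for illustration, not the paper's actual system; the point is that the DDL is generated at runtime from the form definition rather than designed up front.

```python
# Hypothetical sketch: generating a backend table on the fly from a form definition.
import sqlite3

TYPE_MAP = {"text": "TEXT", "number": "REAL", "date": "TEXT"}  # assumed form field types

def ddl_from_form(table, fields):
    """Build a CREATE TABLE statement from (field_name, field_type) pairs."""
    cols = ", ".join(f"{name} {TYPE_MAP[ftype]}" for name, ftype in fields)
    return f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, {cols})"

form = [("patient_name", "text"), ("visit_date", "date"), ("weight_kg", "number")]
ddl = ddl_from_form("visit", form)

conn = sqlite3.connect(":memory:")
conn.execute(ddl)  # the dynamic table now exists
conn.execute(
    "INSERT INTO visit (patient_name, visit_date, weight_kg) VALUES (?, ?, ?)",
    ("A. Smith", "2014-06-16", 61.5),
)
print(conn.execute("SELECT patient_name FROM visit").fetchone())  # ('A. Smith',)
```

Adding a field to the form would translate to an `ALTER TABLE ... ADD COLUMN`, which is how such a system could track a practice's evolving requirements without a fixed EMR schema.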

3 citations

01 Jan 2013
TL;DR: This paper describes ONSTR, an application ontology for translational research in newborn screening, along with a hybrid model of enzyme-mediated reactions intended to serve as a prototype for modelling PKU and other enzymatic processes, and the disambiguation of terms central to PKU.
Abstract: Translational research in the field of newborn screening requires integration of data generated during various phases of lifelong treatment of patients identified and diagnosed through newborn dried blood spot screening (NDBS). In this paper, we describe the Ontology for Newborn Screening Follow-up and Translational Research (ONSTR). ONSTR is an application ontology for representing data entities, practices and knowledge in the domain of newborn screening short- and long-term follow-up of patients diagnosed with inheritable and congenital disorders. It will serve as the core of the data integration framework, the Newborn Screening Follow-up Data Integration Collaborative (NBSDC), designed to support Semantic Web tools and applications with the goal of helping clinicians involved in translational research. Here, we describe the ONSTR domain, our top-down, bottom-up methodological approach to ontology modelling using phenylketonuria (PKU) as an exemplar, and lessons learned. We provide an illustration of our ontological model of three important aspects of PKU: 1) the etiology, 2) the phenylalanine hydroxylase (PAH) enzyme dysfunction underlying PKU, and 3) the disambiguation of terms central to PKU appearing in the literature. In modelling the mechanism of PAH enzyme dysfunction, we encountered limitations in using GO process classes, in terms of their over-granularity and the lack of representations of process participants. As a solution to this problem, and to accurately represent this process, we created a hybrid model of enzyme-mediated biochemical reactions. This model of PKU and enzymatic reactions will serve as a prototype for modelling other IMDs and enzymatic processes of importance to clinical and translational research in the NDBS long-term follow-up domain. This initial work provides an ontological foundation for automated reasoning and for the integration and annotation of data collected through the newborn screening system.
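The "automated reasoning" such an ontology enables can be illustrated with the simplest case: inferring all superclasses of a term by following subclass-of links transitively. The class names and hierarchy below are invented stand-ins, not ONSTR's actual terms, and real ontologies use OWL reasoners rather than hand-rolled traversal.

```python
# Minimal sketch of subsumption reasoning over an assumed subclass hierarchy.
# Class names are illustrative only, not taken from ONSTR.

SUBCLASS_OF = {
    "PKU": "InheritedMetabolicDisorder",
    "InheritedMetabolicDisorder": "CongenitalDisorder",
}

def ancestors(cls):
    """All superclasses reachable by following subclass-of links transitively."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result

print(ancestors("PKU"))  # ['InheritedMetabolicDisorder', 'CongenitalDisorder']
```

This is why annotating patient data with a single ontology term is enough for a query about "congenital disorders" to retrieve PKU cases: the reasoner supplies the implied superclasses.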

1 citation


Cited by
01 Jan 2002

9,314 citations

Journal ArticleDOI
TL;DR: An architecture for an Analytic Information Warehouse that supports transforming data represented in different physical schemas into a common data model, specifying derived variables in terms of the common model to enable their reuse and computing derived variables while enforcing invariants is developed.
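The idea of defining a derived variable once against a common data model, and enforcing invariants when computing it, can be sketched as below. The variable (BMI), field names, and invariant bounds are illustrative assumptions, not the warehouse's actual specification.

```python
# Illustrative sketch: a derived variable defined against a common data model,
# with an invariant enforced at computation time. Names and bounds are assumed.

def derive_bmi(record):
    """Compute BMI from a common-model record; reject records violating the invariant."""
    weight, height = record["weight_kg"], record["height_m"]
    if not (0 < weight < 500 and 0 < height < 3):  # invariant on plausible inputs
        raise ValueError(f"record violates invariant: {record}")
    return round(weight / height ** 2, 1)

common_model_record = {"weight_kg": 70.0, "height_m": 1.75}
print(derive_bmi(common_model_record))  # 22.9
```

Because the derivation is written once against the common model, any source whose physical schema has been mapped into that model can reuse it unchanged, and the invariant check catches mapping errors at the boundary rather than downstream in analysis.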

45 citations

Journal ArticleDOI
24 Jan 2019
TL;DR: This study aimed to determine whether ML could achieve accurate prognostication of 2-year mortality in a small, highly dimensional database of patients with glioma, and achieved reasonable performance compared with similar studies in the literature.
Abstract: Background Machine learning (ML) is the application of specialized algorithms to datasets for trend delineation, categorization, or prediction. ML techniques have been traditionally applied to large, highly dimensional databases. Gliomas are a heterogeneous group of primary brain tumors, traditionally graded using histopathologic features. Recently, the World Health Organization proposed a novel grading system for gliomas incorporating molecular characteristics. We aimed to study whether ML could achieve accurate prognostication of 2-year mortality in a small, highly dimensional database of patients with glioma. Methods We applied 3 ML techniques (artificial neural networks [ANNs], decision trees [DTs], and support vector machines [SVMs]) and classical logistic regression (LR) to a dataset consisting of 76 patients with glioma of all grades. We compared the effect of applying the algorithms to the raw database versus a database where only statistically significant features were included into the algorithmic inputs (feature selection). Results Raw input consisted of 21 variables and achieved accuracy / area under the curve (C.I.) of 70.7%/0.70 (49.9–88.5) for ANN, 68%/0.72 (53.4–90.4) for SVM, 66.7%/0.64 (43.6–85.0) for LR, and 65%/0.70 (51.6–89.5) for DT. Feature-selected input consisted of 14 variables and achieved 73.4%/0.75 (62.9–87.9) for ANN, 73.3%/0.74 (62.1–87.4) for SVM, 69.3%/0.73 (60.0–85.8) for LR, and 65.2%/0.63 (49.1–76.9) for DT. Conclusions We demonstrate that these techniques can also be applied to small, highly dimensional datasets. Our ML techniques achieved reasonable performance compared with similar studies in the literature. Although local databases may be small versus larger cancer repositories, we demonstrate that ML techniques can still be applied to their analysis; however, traditional statistical methods are of similar benefit.
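The feature-selection step that improved most of the models above can be sketched with a simple stand-in: keep only features whose group means differ notably between outcomes. This is not the study's actual statistical test (which used significance testing); the data, threshold, and criterion here are invented for illustration.

```python
# Sketch of a feature-selection step: keep features whose outcome-group means
# differ by more than a threshold (a crude stand-in for significance testing).
from statistics import mean

def select_features(rows, labels, threshold):
    """Return indices of features whose group-mean gap exceeds the threshold."""
    kept = []
    for j in range(len(rows[0])):
        g0 = [r[j] for r, y in zip(rows, labels) if y == 0]
        g1 = [r[j] for r, y in zip(rows, labels) if y == 1]
        if abs(mean(g0) - mean(g1)) > threshold:
            kept.append(j)
    return kept

# Toy data: feature 0 is uninformative, feature 1 separates the classes.
X = [[1.0, 5.0], [1.1, 9.0], [0.9, 5.2], [1.0, 8.8]]
y = [0, 1, 0, 1]
print(select_features(X, y, threshold=0.5))  # [1]
```

Pruning uninformative features this way shrinks the input (21 to 14 variables in the study) and, on small samples, tends to reduce overfitting, which is consistent with the accuracy gains reported for ANN, SVM, and LR.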

44 citations

Journal ArticleDOI
TL;DR: This systematic review aimed to assemble the current neurosurgical literature in which machine learning has been utilized, and to inform neurosurgeons about this novel method of data analysis.
Abstract: Current practice of neurosurgery depends on clinical practice guidelines and evidence-based research publications that derive results using statistical methods. However, statistical analysis methods have limitations, such as the inability to analyze nonlinear variables, the requirement to set a level of significance, impracticality for analyzing large amounts of data, and the possibility of human bias. Machine learning is an emerging method for analyzing massive amounts of complex data that relies on algorithms allowing computers to learn and make accurate predictions. During the past decade, machine learning has been increasingly implemented in medical research as well as neurosurgical publications. This systematic review aimed to assemble the current neurosurgical literature in which machine learning has been utilized, and to inform neurosurgeons about this novel method of data analysis.

38 citations

Journal ArticleDOI
TL;DR: The proposed method can effectively extract typical treatment processes from treatment records, and also has great potential to improve treatment outcomes by personalizing the treatment process for patients with different conditions.

32 citations