Author

Ali Azari

Other affiliations: Tarbiat Modares University
Bio: Ali Azari is an academic researcher from University of Maryland, Baltimore County. The author has contributed to research in topics: Information technology & Cluster analysis. The author has an h-index of 6 and has co-authored 10 publications receiving 108 citations. Previous affiliations of Ali Azari include Tarbiat Modares University.

Papers
Proceedings Article
10 Dec 2012
TL;DR: This paper proposes a methodology that employs clustering to create the training sets to train different classification algorithms and consistently found that using clustering as a precursor to form the training set gives better prediction results as compared to non-clustering based training sets.
Abstract: A model to predict the Length of Stay (LOS) for hospitalized patients can be an effective tool for healthcare providers. Such a model will enable early interventions to prevent complications and prolonged LOS and also enable more efficient utilization of manpower and facilities in hospitals. In this paper, we propose an approach for Predicting Hospital Length of Stay (PHLOS) using a multi-tiered data mining approach. In this paper we propose a methodology that employs clustering to create the training sets to train different classification algorithms. We compared the performance of different classifiers along several different performance measures and consistently found that using clustering as a precursor to form the training set gives better prediction results as compared to non-clustering based training sets. We have also found the accuracies to be consistently higher than some reported in the current literature for predicting individual patient LOS. The classification techniques used in this study are interpretable, enabling us to examine the details of the classification rules learned from the data. As a result, this study provides insight into the underlying factors that influence hospital length of stay. We also examine our results with domain expert insights.

48 citations

Proceedings Article
01 Oct 2014
TL;DR: The architecture of a Big-distributed Intrusion Detection System (B-dIDS) to discover multi-pronged attacks which are anomalies existing across multiple subnets in a distributed network.
Abstract: The focus of this paper is to present the architecture of a Big-distributed Intrusion Detection System (B-dIDS) to discover multi-pronged attacks which are anomalies existing across multiple subnets in a distributed network. The B-dIDS is composed of two key components: a big data processing engine and an analytics engine. The big data processing is done through HAMR, which is a next generation in-memory MapReduce engine. HAMR has reported high speedups over existing big data solutions across several analytics algorithms. The analytics engine comprises a novel ensemble algorithm, which extracts training data from clusters of the multiple IDS alarms. The clustering is utilized as a preprocessing step to re-label the datasets based on their high similarity to known potential attacks. The overall aim is to predict multi-pronged attacks that are spread across multiple subnets but can be missed if not evaluated in an integrated manner.

18 citations

Journal Article
TL;DR: This study provides insight into the underlying factors that influence hospital length of stay, using a multi-tiered data mining approach to form training sets and identifying patients who need aggressive or moderate early interventions to prevent prolonged stays.
Abstract: A model to predict the Length of Stay (LOS) for hospitalized patients can be an effective tool for measuring the consumption of hospital resources. Such a model will enable early interventions to prevent complications and prolonged LOS and also enable more efficient utilization of manpower and facilities in hospitals. In this paper, the authors propose an approach for Predicting Hospital Length of Stay (PHLOS) using a multi-tiered data mining approach. In their approach, the authors form training sets using groups of similar claims identified by k-means clustering and perform classification using ten different classifiers. The authors provide a combined measure of performance to statistically evaluate and rank the classifiers for different levels of clustering. They consistently found that using clustering as a precursor to form the training set gives better prediction results as compared to non-clustering based training sets. The authors have also found the accuracies to be consistently higher than some reported in the current literature for predicting individual patient LOS. Binning the LOS to three groups of short, medium and long stays, their method identifies patients who need aggressive or moderate early interventions to prevent prolonged stays. The classification techniques used in this study are interpretable, enabling them to examine the details of the classification rules learned from the data. As a result, this study provides insight into the underlying factors that influence hospital length of stay. They also examine their prediction results for three randomly selected conditions with domain expert insights.

18 citations
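
The cluster-then-classify idea described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the two features (age, number of diagnoses), the toy records, the choice of k = 2, and the use of a 1-nearest-neighbour rule as the per-cluster classifier are all hypothetical stand-ins (the paper reports ten classifiers, which are not named here). Features are left unscaled for brevity.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, candidates):
    """Index of the candidate vector closest to point."""
    return min(range(len(candidates)), key=lambda i: sq_dist(point, candidates[i]))

def kmeans(points, k, iters=20):
    """Plain k-means with deterministic, evenly spaced initial centroids."""
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def fit(X, y, k):
    """Cluster the records, then keep one training set per cluster."""
    centroids = kmeans(X, k)
    training_sets = [[] for _ in range(k)]
    for features, label in zip(X, y):
        training_sets[nearest(features, centroids)].append((features, label))
    return centroids, training_sets

def predict(model, features):
    """Route to the nearest cluster, then apply 1-NN inside that cluster.
    Assumes the chosen cluster is non-empty."""
    centroids, training_sets = model
    cluster = training_sets[nearest(features, centroids)]
    best = min(cluster, key=lambda row: sq_dist(features, row[0]))
    return best[1]

# Hypothetical toy records: [age, number of diagnoses] with a binned LOS label.
X = [[30, 1], [35, 2], [32, 1], [70, 6], [75, 7], [72, 5]]
y = ["short", "short", "short", "medium", "long", "medium"]
model = fit(X, y, k=2)
print(predict(model, [74, 7]))
print(predict(model, [33, 2]))
```

Each new record is scored only against training examples from its own cluster, which is the sense in which clustering acts as a precursor to forming the training set.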

Proceedings Article
09 Nov 2015
TL;DR: A framework that predicts patients with prolonged ED stays (> 14 hours) from data available at triage; integrating a class imbalance learning ensemble method into the framework produces much better results for prolonged stays than using traditional logistic regression methods alone.
Abstract: A major contributor to Emergency Department (ED) crowding is patients with prolonged length of stay (LOS). Patients with long stays (i.e., those with LOS longer than 14 hours) comprise 10% of ED visits, but utilize 30% of the total ED bed hours. Accurately predicting patients' LOS can be used to improve resource management both in the ED and the hospital. A prediction model that can identify this minority group of prolonged-stay patients early, at presentation, may be effective in addressing barriers to expedited treatment and ED disposition. However, this is a challenging task because regular classification techniques are biased toward the majority group of examples and tend to overlook the minority class examples. This problem can be alleviated by using class imbalance learning methods. In this paper, we present a framework that predicts patients with prolonged ED stays (> 14 hours) from data available at triage (i.e., presentation). The framework also enables extraction of independent variables that capture the current state of the resources in the ED. Predictions combine patient information (e.g., demographics, complaints, and vital signs) with a snapshot of resources and queuing metrics in the ED which can substantially impact the LOS. The prediction models in our framework are developed from over one hundred thousand ED encounters retrospectively collected at an urban hospital. Our experimental results demonstrate that we accurately predict prolonged ED length of stay and provide a clear interpretation of the factors that influence it. We also found that integrating a class imbalance learning ensemble method into our framework produces much better results for prolonged stays than only using traditional logistic regression methods.

14 citations
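
The abstract above does not name the class imbalance learning ensemble it uses. As a hedged illustration of one common family of such methods, the sketch below trains several simple models, each on all minority examples plus an equally sized random sample of the majority class, and combines them by majority vote. Everything specific here is an assumption for demonstration: the single hypothetical "risk score" feature, the toy data (10% minority, echoing the share of prolonged stays quoted in the abstract), the nearest-centroid base learner, and the function names.

```python
import random
from collections import Counter

def centroid(rows):
    """Column-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_balanced_ensemble(X, y, minority_label, n_members=5, seed=0):
    """Train n_members nearest-centroid models on balanced subsets:
    all minority rows plus an equal-sized random draw of majority rows."""
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == minority_label]
    majority = [x for x, label in zip(X, y) if label != minority_label]
    members = []
    for _ in range(n_members):
        subset = rng.sample(majority, len(minority))  # undersample the majority
        members.append((centroid(minority), centroid(subset)))
    return members

def predict(members, x, minority_label="prolonged", majority_label="regular"):
    """Each member votes for the class with the closer centroid; majority wins."""
    votes = Counter(
        minority_label if sq_dist(x, min_c) < sq_dist(x, maj_c) else majority_label
        for min_c, maj_c in members
    )
    return votes.most_common(1)[0][0]

# Hypothetical data: 18 regular stays with low scores, 2 prolonged stays with high scores.
X = [[0.1 * i] for i in range(18)] + [[5.0], [5.5]]
y = ["regular"] * 18 + ["prolonged"] * 2
members = train_balanced_ensemble(X, y, "prolonged")
print(predict(members, [4.0]))
print(predict(members, [0.5]))
```

Because every member sees the two classes in equal proportion, no single model can reach high accuracy simply by always predicting the majority class, which is the bias the abstract attributes to regular classification techniques.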

Posted Content
TL;DR: In this article, the authors analyze worldwide information technology (IT) policy trends development using technology-diffusion, policy-making models and identify a framework based on the approach of "Hanna".
Abstract: This paper analyzes worldwide information technology (IT) policy trends development using technology-diffusion, policy-making models. After reviewing various types of technology transfer, the authors identified a framework based on the approach of 'Hanna'. The policy trends of 55 IT programs in 11 countries were positioned in the framework using a bottom-up approach. The authors analyzed IT policy-making initiative trends, and their findings show a sharp increase in the number of hands-off programs in the past ten years and a considerable increase in the number of bridging programs since 1990. The findings also show that before 1990, IT was not yet considered an independent program in the policy-making arena. Programs that use a hands-on method prefer the 'generation' policy group rather than the 'diffusion' policy group, but hands-off methods are neutral. The authors make recommendations and offer guidelines for IT policy makers. The study opens up new lines of research possibilities for IT researchers.

10 citations


Cited by
01 Jan 2002

9,314 citations

Journal Article
TL;DR: This paper provides a list of criteria for selecting machine learning tools for big data, an analysis of the advantages and drawbacks of three different processing paradigms, and a comparison of engines that implement them, including MapReduce, Spark, Flink, Storm, and H2O.
Abstract: With an ever-increasing amount of options, the task of selecting machine learning tools for big data can be difficult. The available tools have advantages and drawbacks, and many have overlapping uses. The world’s data is growing rapidly, and traditional tools for machine learning are becoming insufficient as we move towards distributed and real-time processing. This paper is intended to aid the researcher or professional who understands machine learning but is inexperienced with big data. In order to evaluate tools, one should have a thorough understanding of what to look for. To that end, this paper provides a list of criteria for making selections along with an analysis of the advantages and drawbacks of each. We do this by starting from the beginning, and looking at what exactly the term “big data” means. From there, we go on to the Hadoop ecosystem for a look at many of the projects that are part of a typical machine learning architecture and an understanding of how everything might fit together. We discuss the advantages and disadvantages of three different processing paradigms along with a comparison of engines that implement them, including MapReduce, Spark, Flink, Storm, and H2O. We then look at machine learning libraries and frameworks including Mahout, MLlib, SAMOA, and evaluate them based on criteria such as scalability, ease of use, and extensibility. There is no single toolkit that truly embodies a one-size-fits-all solution, so this paper aims to help make decisions smoother by providing as much information as possible and quantifying what the tradeoffs will be. Additionally, throughout this paper, we review recent research in the field using these tools and talk about possible future directions for toolkit-based learning.

379 citations

Journal Article
TL;DR: A comprehensive survey on state-of-the-art deep learning, IoT security, and big data technologies is conducted and a thematic taxonomy is derived from the comparative analysis of technical studies of the three aforementioned domains.

193 citations

Journal Article
TL;DR: This is the first survey that addresses the use of big data analytics techniques for the design of a broad range of networks, and identifies the challenges confronting the utilization of big data analytics in network design.

110 citations