Journal ArticleDOI

Real-time analysis of healthcare using big data analytics

01 Nov 2017-Vol. 263, Iss: 4, pp 042056
TL;DR: In this project, datasets such as Electronic Medical Records (EMR) produced by numerous medical devices and mobile applications are loaded into MongoDB using the Hadoop framework, with an improvised processing technique intended to improve the outcome of processing patient records.
Abstract: Big Data Analytics (BDA) provides a tremendous advantage where high performance is needed in handling large amounts of data that exhibit four characteristics: Volume, Velocity, Variety, and Veracity. BDA can handle such dynamic data, providing operational effectiveness and highly beneficial output in day-to-day applications for various organizations. Healthcare is one of the sectors that generates data constantly, covering all four characteristics, with outstanding growth. Processing patient records poses several challenges because the records come in a variety of structured and unstructured formats. Introducing BDA into healthcare (HBDA) means dealing with sensitive, patient-driven information, mostly in unstructured formats such as prescriptions, reports, and data from imaging systems; big data techniques can overcome these challenges with enhanced efficiency in fetching and storing data. In this project, datasets such as Electronic Medical Records (EMR) produced by numerous medical devices and mobile applications are loaded into MongoDB using the Hadoop framework, with an improvised processing technique intended to improve the outcome of processing patient records.
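As a rough illustration of the ingestion idea in this abstract, the sketch below wraps structured and unstructured records in one common document envelope of the kind a document store such as MongoDB could accept. It is not the paper's method: the function name `to_emr_document`, the field names, and the sample values are all hypothetical, and no Hadoop or MongoDB call is made.

```python
import json
from datetime import datetime, timezone


def to_emr_document(source, patient_id, payload):
    """Wrap one raw record in a common envelope (hypothetical schema)."""
    structured = isinstance(payload, dict)
    return {
        "patient_id": patient_id,
        "source": source,  # e.g. "device", "mobile_app", "prescription"
        "kind": "structured" if structured else "unstructured",
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        # Structured payloads are kept as-is; free text is stored under "text".
        "payload": payload if structured else {"text": str(payload)},
    }


# Made-up sample inputs: one device reading and one free-text prescription.
records = [
    ("device", "p-001", {"heart_rate": 88, "spo2": 97}),
    ("prescription", "p-001", "Paracetamol 500 mg twice daily"),
]
documents = [to_emr_document(*record) for record in records]
print(json.dumps(documents, indent=2))
```

Keeping the raw payload intact inside a uniform envelope is one way to cope with the "Variety" characteristic: queries can filter on the shared fields while source-specific detail stays available.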
Citations
Journal ArticleDOI
TL;DR: Proposes a new architecture for a real-time health status prediction and analytics system using big data technologies, and measures the performance of Spark DT against traditional machine learning tools, including Weka, to show the effectiveness of the proposed architecture.
Abstract: A number of technologies enabled by the Internet of Things (IoT) have been used for the prevention of various chronic diseases; continuous, real-time tracking systems are a particularly important one. Wearable medical devices with sensors, health clouds, and mobile applications continuously generate a huge amount of data, often called streaming big data. Because of the high speed of data generation, it is difficult to collect, process, and analyze such massive data in real time, in order to act in emergencies and extract hidden value, using traditional methods, which are limited and time-consuming. Therefore, there is a significant need for real-time big data stream processing to ensure an effective and scalable solution. To overcome this issue, this work proposes a new architecture for a real-time health status prediction and analytics system using big data technologies. The system focuses on applying a distributed machine learning model to streaming health data events ingested into Spark Streaming through Kafka topics. First, we transform the standard decision tree (DT) (C4.5) algorithm into a parallel, distributed, scalable, and fast DT using Spark instead of Hadoop MapReduce, which is limited for real-time computing. Second, this model is applied to streaming data coming from distributed sources for various diseases to predict health status. Based on several input attributes, the system predicts health status, sends an alert message to care providers, and stores the details in a distributed database to support health data analytics and stream reporting. We measure the performance of Spark DT against traditional machine learning tools, including Weka. Finally, performance evaluation parameters such as throughput and execution time are calculated to show the effectiveness of the proposed architecture. The experimental results show that the proposed system can effectively process and predict on massive, real-time medical data enabled by IoT from distributed sources and various diseases.
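The flow described in this abstract (events arrive, a decision tree predicts a status, alerts are raised, throughput and execution time are measured) can be sketched in plain Python. This is a single-process stand-in, not the paper's Kafka and Spark pipeline: the generator replaces the message queue, the tree is hand-written rather than learned with C4.5, and the thresholds and field names are arbitrary assumptions with no clinical meaning.

```python
import time


def predict_status(event):
    """Hand-written two-level decision tree; thresholds are illustrative only."""
    if event["heart_rate"] > 120:
        return "alert"
    if event["systolic_bp"] > 160:
        return "alert"
    return "normal"


def event_stream(n):
    """Stand-in for a message queue: yields n synthetic readings."""
    for i in range(n):
        yield {
            "patient_id": i % 5,
            "heart_rate": 60 + (i * 7) % 80,
            "systolic_bp": 110 + (i * 11) % 70,
        }


def process(stream):
    """Classify every event, collect alerts, and time the whole run."""
    alerts, count = [], 0
    start = time.perf_counter()
    for event in stream:
        count += 1
        if predict_status(event) == "alert":
            alerts.append(event)
    elapsed = time.perf_counter() - start
    throughput = count / elapsed if elapsed > 0 else float("inf")
    return count, alerts, elapsed, throughput


count, alerts, elapsed, throughput = process(event_stream(1000))
print(f"{count} events, {len(alerts)} alerts, {elapsed:.4f}s, {throughput:.0f} events/s")
```

Throughput here is simply events processed divided by wall-clock execution time, the two evaluation parameters the abstract names; a distributed system would measure them across workers rather than in one loop.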

54 citations


Cites methods from "Real-time analysis of healthcare us..."

  • ...Real-time analysis focused on electronic medical records produced from many sources such as medical devices and mobile applications is described in [19]....

    [...]

Journal ArticleDOI
01 Jan 2018
TL;DR: In Malaysia, the focus on big data has started, and some initiatives have been put in place to share patients' medical record information and knowledge among the general public, private hospitals, and clinics.
Abstract: Big data in healthcare is important because it can be used for the prediction of disease outcomes, the prevention of co-morbidities and mortality, and saving the cost of medical treatment. In many countries, big data has become an important database whose information can be used for the treatment and management of diseases. In Malaysia, the focus on big data has started, and some initiatives have been put in place to share patients' medical record information and knowledge among the general public, private hospitals, and clinics. Nevertheless, there are many challenges in implementing big data in healthcare, especially in relation to privacy, security, standards, governance, integration of data, data accommodation, data classification, incorporation of technology, etc. It is imperative that these challenges be overcome before big data can be implemented successfully in healthcare.

32 citations

Journal ArticleDOI
TL;DR: In this work, brain images are used for screening individuals who are at high risk of dyslexia, and the proposed predictive model uses the machine-learning algorithm Support Vector Machine (SVM), designed in the Apache Spark framework to support voluminous data.
Abstract: Dyslexia is a learning disorder characterized by a lack of reading and/or writing skills, difficulty in rapid word naming, and poor spelling. Dyslexic individuals have great difficulty reading and interpreting words or letters. Research has been carried out to distinguish dyslexics from non-dyslexics by various approaches, such as machine learning, image processing, understanding brain behavior through psychology, and studying differences in brain anatomy. In addition, several assistive tools have been developed to support dyslexics. In this work, brain images are used for screening individuals who are at high risk of dyslexia. This work also motivates the application of machine learning in a distributed environment. The proposed predictive model uses the machine-learning algorithm Support Vector Machine (SVM). The model is designed in the Apache Spark framework to support voluminous data. A prediction accuracy of 92.5% is achieved using SVM.

6 citations

Proceedings ArticleDOI
24 Apr 2019
TL;DR: The model analyzes patients' previous history for any side effects of the drug to be recommended, and also uses weather and the maps API from Google so that patients can easily locate nearby stores where the medicines are available.
Abstract: In health care systems, taking medicines has become a day-to-day activity for people suffering from diseases. Most people are also not aware of the medication prescribed by doctors or pharmacies. Sometimes patients develop other complications as well from taking the medicines prescribed by medical practitioners. To counter these challenges, the authors propose a drug prediction model that will help patients take the right medicines for a particular disease. The MLlib library of Apache Spark is to be used for initial data analysis for drug suggestions related to symptoms gathered from a particular user. The model will analyze patients' previous history for any side effects of the drug to be recommended, and also uses weather and the maps API from Google so that patients can easily locate nearby stores where the medicines are available.

5 citations


Cites background from "Real-time analysis of healthcare us..."

  • ...The real time processing of health care data has been projected in [7] where data from different medical related applications and mobile applications stored in Electronic Medical Records is brought in to hadoop and MongoDB environments....

    [...]

References
Journal ArticleDOI
05 Aug 2016-PLOS ONE
TL;DR: Model-free Big Data machine learning-based classification methods can outperform model-based techniques in terms of predictive precision and reliability, and it is observed that statistical rebalancing of cohort sizes yields better discrimination of group differences, specifically for predictive analytics based on heterogeneous and incomplete PPMI data.
Abstract: Background A unique archive of Big Data on Parkinson’s Disease is collected, managed and disseminated by the Parkinson’s Progression Markers Initiative (PPMI). The integration of such complex and heterogeneous Big Data from multiple sources offers unparalleled opportunities to study the early stages of prevalent neurodegenerative processes, track their progression and quickly identify the efficacies of alternative treatments. Many previous human and animal studies have examined the relationship of Parkinson’s disease (PD) risk to trauma, genetics, environment, co-morbidities, or life style. The defining characteristics of Big Data–large size, incongruency, incompleteness, complexity, multiplicity of scales, and heterogeneity of information-generating sources–all pose challenges to the classical techniques for data management, processing, visualization and interpretation. We propose, implement, test and validate complementary model-based and model-free approaches for PD classification and prediction. To explore PD risk using Big Data methodology, we jointly processed complex PPMI imaging, genetics, clinical and demographic data. Methods and Findings Collective representation of the multi-source data facilitates the aggregation and harmonization of complex data elements. This enables joint modeling of the complete data, leading to the development of Big Data analytics, predictive synthesis, and statistical validation. Using heterogeneous PPMI data, we developed a comprehensive protocol for end-to-end data characterization, manipulation, processing, cleaning, analysis and validation. Specifically, we (i) introduce methods for rebalancing imbalanced cohorts, (ii) utilize a wide spectrum of classification methods to generate consistent and powerful phenotypic predictions, and (iii) generate reproducible machine-learning based classification that enables the reporting of model parameters and diagnostic forecasting based on new data. 
We evaluated several complementary model-based predictive approaches, which failed to generate accurate and reliable diagnostic predictions. However, the results of several machine-learning based classification methods indicated significant power to predict Parkinson’s disease in the PPMI subjects (consistent accuracy, sensitivity, and specificity exceeding 96%, confirmed using statistical n-fold cross-validation). Clinical (e.g., Unified Parkinson's Disease Rating Scale (UPDRS) scores), demographic (e.g., age), genetics (e.g., rs34637584, chr12), and derived neuroimaging biomarker (e.g., cerebellum shape index) data all contributed to the predictive analytics and diagnostic forecasting. Conclusions Model-free Big Data machine learning-based classification methods (e.g., adaptive boosting, support vector machines) can outperform model-based techniques in terms of predictive precision and reliability (e.g., forecasting patient diagnosis). We observed that statistical rebalancing of cohort sizes yields better discrimination of group differences, specifically for predictive analytics based on heterogeneous and incomplete PPMI data. UPDRS scores play a critical role in predicting diagnosis, which is expected based on the clinical definition of Parkinson’s disease. Even without longitudinal UPDRS data, however, the accuracy of model-free machine learning based classification is over 80%. The methods, software and protocols developed here are openly shared and can be employed to study other neurodegenerative disorders (e.g., Alzheimer’s, Huntington’s, amyotrophic lateral sclerosis), as well as for other predictive Big Data analytics applications.
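The abstract mentions rebalancing imbalanced cohorts without saying how. One common generic approach is random oversampling of the smaller groups, sketched below. This is an assumed technique for illustration, not necessarily what the authors used, and the function name `oversample`, the `dx` label key, and the toy cohort are all hypothetical.

```python
import random
from collections import Counter


def oversample(rows, label_key, seed=0):
    """Randomly duplicate minority-class rows until all classes match the largest."""
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Draw with replacement to fill the gap up to the largest class size.
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


# Toy cohort: 8 rows labelled "PD" and 2 labelled "control" (made-up data).
cohort = [{"id": i, "dx": "PD"} for i in range(8)]
cohort += [{"id": i, "dx": "control"} for i in range(8, 10)]
balanced = oversample(cohort, "dx")
print(Counter(row["dx"] for row in balanced))
```

Oversampling should be applied only to training folds, since duplicating rows before a train/test split would leak copies of the same subject into both sides of the evaluation.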

109 citations

Proceedings ArticleDOI
02 Jun 2015
TL;DR: Explains the potential benefits of big data to healthcare, explores how it improves treatment and empowers patients, providers, and researchers, and describes the ability of reality mining to collect large amounts of data to understand people's habits and to detect and predict outcomes.
Abstract: Mobile phones, sensors, patients, hospitals, researchers, providers, and organizations are nowadays generating huge amounts of healthcare data. The real challenge in healthcare systems is how to find, collect, analyze, and manage information to make people's lives healthier and easier, by contributing not only to understanding new diseases and therapies but also to predicting outcomes at earlier stages and making real-time decisions. In this paper, we explain the potential benefits of big data to healthcare and explore how it improves treatment and empowers patients, providers, and researchers. We also describe the ability of reality mining to collect large amounts of data to understand people's habits and to detect and predict outcomes, and we illustrate the benefits of big data analytics through five effective new pathways that could be adopted to promote patients' health, enhance medicine, reduce cost, and improve healthcare value and quality. We cover some big data solutions in healthcare and shed light on implementations, such as Electronic Healthcare Record (EHR) and Electronic Healthcare Predictive Analytics (e-HPA), in US hospitals. Furthermore, we complete the picture by highlighting some challenges that big data analytics faces in healthcare.

72 citations

Proceedings ArticleDOI
19 Mar 2015
TL;DR: Big data solutions often come with a set of innovative data management solutions and analytical tools that, when effectively implemented, can transform healthcare outcomes.
Abstract: Data in the healthcare sector is growing beyond the handling capacity of health care organizations and is expected to increase significantly in the coming years. The majority of healthcare data is unstructured, exists in silos, and resides in imaging systems, medical prescription notes, insurance claims data, EPR (Electronic Patient Records), etc. Integrating these heterogeneous data and factoring them into advanced analytics is critical to improving healthcare outcomes. Either because data are isolated in disparate or incompatible formats, or because of a lack of processing capability to load and query large datasets in a timely fashion, healthcare organizations are not in a position to leverage the benefits of the vast data they have. With the convergence of advanced computing and numerous Big Data technology options, such as commercial solutions, open source, and cloud, it is now possible to attain high performance and scalability at a relatively low cost. Big data solutions often come with a set of innovative data management solutions and analytical tools that, when effectively implemented, can transform healthcare outcomes.

55 citations

Proceedings ArticleDOI
25 Nov 2015
TL;DR: This paper illustrates how a problem solved using MySQL performs when MongoDB is used instead on a big data dataset; the results are encouraging and clearly showcase the comparisons made.
Abstract: Databases can accommodate a very large number of users on an on-demand basis. The main limitations of conventional relational database management systems (RDBMS) are that they are hard to scale with data warehousing, grid, Web 2.0, and cloud applications, have non-linear query execution time, have unstable query plans, and have a static schema. Even though RDBMSs have provided database users with the best mix of simplicity, robustness, flexibility, performance, scalability, and compatibility, they are not able to satisfy present-day users and applications for the reasons mentioned above. The next generation NonSQL (NoSQL) databases are mostly non-relational, distributed, and horizontally scalable, and are able to satisfy most of the needs of present-day applications. The main characteristics of these databases are that they are schema-free, join-free, and non-relational, with easy replication support, a simple API, and eventual consistency. The aim of this paper is to illustrate how a problem solved using MySQL performs when MongoDB is used on a big data dataset. The results are encouraging and clearly showcase the comparisons made. Queries are executed on a big data airlines database using both MongoDB and MySQL. Select, update, delete, and insert queries are executed and performance is evaluated.

37 citations

Proceedings ArticleDOI
01 Feb 2016
TL;DR: Gives an overview of storage and retrieval methods, Big Data tools and techniques used in healthcare clouds, and the role of Big Data Analytics in healthcare, and discusses the benefits, the outlook in nascent fields of predictive analytics, and the challenges faced along with solutions.
Abstract: In today's world, massive sets of data are generated by different organizations throughout the world. This huge and heterogeneous data is called Big Data. Big Data Analytics offers tremendous insights to different organizations, especially in healthcare. Traditional database architectures are not up to the challenge of the huge data that is pouring into organizations today, and this creates havoc. Big Data plays an important role in achieving predictive analysis in the healthcare domain. Big Data can handle the huge explosion of data found in many medical organizations. Big Data Analytics plays a major role in solving the issues and challenges that arise in the healthcare domain. This paper gives an overview of storage and retrieval methods, Big Data tools and techniques used in healthcare clouds, and the role of Big Data Analytics in healthcare, and discusses the benefits, the outlook in nascent fields of predictive analytics, and the challenges faced along with solutions. The results also show the astronomical role of Big Data Analytics in healthcare.

36 citations