An Efficient, Ensemble-Based Classification Framework for Big Medical Data.

doi:10.1089/BIG.2021.0132


Journal Article

Firoz Khan¹, Balusupati Veera Venkata Siva Prasad, Salman Ali Syed, Imran Ashraf², Lakshmana Kumar Ramasamy

Higher Colleges of Technology¹, Yeungnam University²

23 Sep 2021

TL;DR: The authors propose an efficient, ensemble-based classification framework for big medical data, addressing the inadequacy of existing classification algorithms for datasets of this scale.

Abstract: Fetching useful information from big medical datasets is a complicated task in the big data age. Various classification algorithms are used in the data mining process to analyze information from the big medical dataset. Nevertheless, these classification algorithms are insufficient to handle big medical data. This work proposes an efficient, ensemble-based classification framework for big medical data to deal with this problem. The proposed work involves initially applying the preprocessing technique to remove noise, missing values, and unwanted features from big medical data. The process selects a subset of classifiers from a pool of classifiers. The selected classifiers are combined to form a hybrid system for efficient classification. The methodology further involves incremental learning from data samples, explaining the predicted outputs, and achieving high classification performance. Java is used for simulation, and the Cleveland Heart Disease big dataset and Diabetes big dataset are used for classification. The experimental result shows that the proposed ensemble algorithm provides an efficient classification compared with existing algorithms based on accuracy, precision, F-measure, recall, and execution time.
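The selection-and-combination step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm (the paper's simulation is in Java, and its selection criterion is not given here): it assumes that classifiers are ranked by validation accuracy, that the top k are kept, and that they are combined by majority vote. The toy data, the threshold rules standing in for trained models, and all function names are hypothetical.

```python
from collections import Counter

# Toy validation set of (features, label) pairs; only feature 0 is informative.
VALIDATION = [
    ((1.0, 5.0), 0), ((2.0, 4.0), 0), ((3.0, 6.0), 0),
    ((6.0, 5.0), 1), ((7.0, 4.0), 1), ((8.0, 6.0), 1),
]

# Hypothetical classifier pool: threshold rules standing in for trained models.
def rule_a(x): return int(x[0] > 4.5)
def rule_b(x): return int(x[0] > 2.5)
def rule_c(x): return int(x[0] > 6.5)
def rule_d(x): return int(x[1] > 4.5)  # uses the uninformative feature
def rule_e(x): return 1                # always predicts the positive class

POOL = [rule_a, rule_b, rule_c, rule_d, rule_e]


def accuracy(clf, data):
    """Fraction of labelled samples that clf predicts correctly."""
    return sum(1 for x, y in data if clf(x) == y) / len(data)


def select_classifiers(pool, data, k):
    """Assumed selection rule: keep the k classifiers with the best accuracy."""
    return sorted(pool, key=lambda clf: accuracy(clf, data), reverse=True)[:k]


def majority_vote(ensemble, x):
    """Combine the selected classifiers by taking the most common prediction."""
    return Counter(clf(x) for clf in ensemble).most_common(1)[0][0]


def precision_recall_f1(clf, data, positive=1):
    """Precision, recall, and F-measure of clf for the positive class."""
    tp = sum(1 for x, y in data if clf(x) == positive and y == positive)
    fp = sum(1 for x, y in data if clf(x) == positive and y != positive)
    fn = sum(1 for x, y in data if clf(x) != positive and y == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


selected = select_classifiers(POOL, VALIDATION, k=3)

def ensemble(x):
    return majority_vote(selected, x)

print([clf.__name__ for clf in selected])
print(accuracy(ensemble, VALIDATION), precision_recall_f1(rule_b, VALIDATION))
```

In this toy setting the two weakest rules are discarded and the vote of the remaining three corrects the individual errors of `rule_b` and `rule_c`. A real evaluation would score the ensemble on held-out data rather than on the set used for selection.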


Citations


Journal Article

Magnetic Force Classifier: A Novel Method for Big Data Classification

01 Jan 2022 - IEEE Access

TL;DR: The proposed magnetic force (MF) classifier calculates the magnetic force at each discrete point in the feature space, based on the number of points belonging to each class (magnet).

Abstract: There are a plethora of invented classifiers in Machine learning literature, however, there is no optimal classifier in terms of accuracy and time taken to build the trained model, especially with the tremendous development and growth of Big data. Hence, there is still room for improvement. In this paper, we propose a new classification method that is based on the well-known magnetic force. Based on the number of points belonging to a specific class/magnet, the proposed magnetic force (MF) classifier calculates the magnetic force at each discrete point in the feature space. Unknown examples are classified using the magnetic forces recorded in the trained model by various magnets/classes. When compared to existing classifiers, the proposed MF classifier achieves comparable classification accuracy, according to the experimental results utilizing 28 different datasets. More importantly, we found that the proposed MF classifier is significantly faster than all other classifiers tested, particularly when applied to Big datasets and hence could be a viable option for structured Big data classification with some optimization.


3 citations

Journal Article

Security and privacy issues in federated healthcare – An overview

Jansi Rani Amalraj, Robert Lourdusamy

01 Jan 2022 - Open Computer Science

TL;DR: The importance of federated learning in healthcare is highlighted and the privacy and security issues in communicating the e-health data are discussed.

Abstract: Securing medical records is a significant task in Healthcare communication. The major setback during the transfer of medical data in the electronic medium is the inherent difficulty in preserving data confidentiality and patients’ privacy. The innovation in technology and improvisation in the medical field has given numerous advancements in transferring the medical data with foolproof security. In today’s healthcare industry, federated network operation is gaining significance to deal with distributed network resources due to the efficient handling of privacy issues. The design of a federated security system for healthcare services is one of the intense research topics. This article highlights the importance of federated learning in healthcare. Also, the article discusses the privacy and security issues in communicating the e-health data.

1 citation

Journal Article

Clinical Uncertainty Influences Antibiotic Prescribing for Upper Respiratory Tract Infections: A Qualitative Study of Township Hospital Physicians and Village Doctors in Rural Shandong Province, China

Liyan Shen, Ting Wang, Jia Yin, Qiang-san Sun, Oliver J. Dyar

01 Jun 2023 - Antibiotics

TL;DR: This study explores how clinical uncertainty influences antibiotic prescribing among township hospital physicians and village doctors in rural Shandong Province, China, and suggests that interventions to reduce clinical uncertainty may help minimize the unnecessary use of antibiotics in these settings.

Abstract: Objective: This study aimed to explore how clinical uncertainty influences antibiotic prescribing practices among township hospital physicians and village doctors in rural Shandong Province, China. Methods: Qualitative semi-structured interviews were conducted with 30 township hospital physicians and 6 village doctors from rural Shandong Province, China. A multi-stage random sampling method was used to identify respondents. Conceptual content analysis together with Colaizzi’s method were used to generate qualitative codes and identify themes. Results: Three final thematic categories emerged during the data analysis: (1) Incidence and treatment of Upper Respiratory Tract Infections (URTIs) in township hospitals and village clinics; (2) Antibiotic prescribing practices based on the clinical experience of clinicians; (3) Influence of clinical uncertainty on antibiotic prescribing. Respondents from both township hospitals and village clinics reported that URTIs were the most common reason for antibiotic prescriptions at their facilities and that clinical uncertainty appears to be an important driver for the overuse of antibiotics for URTIs. Clinical uncertainty was primarily due to: (1) Diagnostic uncertainty (establishing a relevant diagnosis is hindered by limited diagnostic resources and capacities, as well as limited willingness of patients to pay for investigations), and (2) Insufficient prognostic evidence. As a consequence of the clinical uncertainty caused by both diagnostic and prognostic uncertainty, respondents stated that antibiotics are frequently prescribed for URTIs to prevent both prolonged courses or recurrence of the disease, as well as clinical worsening, hospital admission, or complications. 
Conclusion: Our study suggests that clinical uncertainty is a key driver for the overuse and misuse of prescribing antibiotics for URTIs in both rural township hospitals and village clinics in Shandong province, China, and that interventions to reduce clinical uncertainty may help minimize the unnecessary use of antibiotics in these settings. Interventions that use clinical rules to identify patients at low risk of complications or hospitalization may be more feasible in the near-future than laboratory-based interventions aimed at reducing diagnostic uncertainty.


Journal Article

A Novel Ensemble of Support Vector Machines for Improving Medical Data Classification

Phuoc-Hai Huynh, Van Hoa Nguyen

15 Feb 2023 - Engineering Innovations

TL;DR: The article proposes ensemble approaches based on support vector machines for classifying medical data; the models predict diseases with accuracy rates of 82.82 and 81.76 percent without feature selection in the preprocessing stage.

Abstract: In recent years, the increasing volume and availability of healthcare and biomedical data are opening up new opportunities for computational methods to enhance healthcare in many hospitals. Medical data classification is regarded as the challenging task to develop intelligent medical decision support systems in hospitals. In this paper, the ensemble approaches based on support vector machines are proposed for classifying medical data. This research’s key contribution is that the ensemble multiple support vector machines use the function kernel in the style of gradient boosting and bagging to produce a more accurate fusion model than the mono-modality models. Extensive experiments have been conducted on forty benchmark medical datasets from the University of California at Irvine machine learning repository. The classification results show that there is a statistically significant difference (p-values < 0.05) between the proposed approaches and the best classification models. In addition, the empirical analysis of forty medical datasets indicated that our models can predict diseases with an accuracy rate of 82.82 and 81.76 percent without feature selection in the preprocessing data stage.
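The bagging idea mentioned in this abstract can be illustrated with a short Python sketch. It makes two loudly stated substitutions: the base learner is a nearest-centroid rule rather than a support vector machine, and the bootstrap is stratified by class so every resample contains all classes. Neither choice comes from the paper; the function names and toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def train_centroid(samples):
    """Stand-in base learner (not an SVM): one mean vector per class."""
    groups = defaultdict(list)
    for x, y in samples:
        groups[y].append(x)
    centroids = {
        y: [sum(col) / len(xs) for col in zip(*xs)] for y, xs in groups.items()
    }

    def predict(x):
        # Return the class whose centroid is closest in squared distance.
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    return predict


def stratified_bootstrap(samples, rng):
    """Resample with replacement within each class (an assumed variant)."""
    groups = defaultdict(list)
    for x, y in samples:
        groups[y].append((x, y))
    resampled = []
    for members in groups.values():
        resampled.extend(rng.choices(members, k=len(members)))
    return resampled


def bagging(samples, n_models, seed=0):
    """Train n_models base learners, each on its own bootstrap resample."""
    rng = random.Random(seed)
    return [
        train_centroid(stratified_bootstrap(samples, rng)) for _ in range(n_models)
    ]


def bagged_predict(models, x):
    """Aggregate the base learners by majority vote."""
    return Counter(model(x) for model in models).most_common(1)[0][0]


TRAIN = [
    ((1.0, 1.0), 0), ((1.5, 2.0), 0), ((2.0, 1.5), 0),
    ((8.0, 8.0), 1), ((8.5, 9.0), 1), ((9.0, 8.5), 1),
]
models = bagging(TRAIN, n_models=15)
print(bagged_predict(models, (2.0, 2.0)), bagged_predict(models, (7.5, 8.0)))
```

Swapping `train_centroid` for a function that fits a support vector machine would move the sketch closer to the setting the abstract describes, at the cost of an external dependency.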


References


Journal Article

The WEKA data mining software: an update

Mark Hall, Eibe Frank¹, Geoffrey Holmes¹, Bernhard Pfahringer¹, Peter Reutemann¹, Ian H. Witten¹

University of Waikato¹

16 Nov 2009 - SIGKDD Explorations

TL;DR: This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.

Abstract: More than twelve years have elapsed since the first public release of WEKA. In that time, the software has been rewritten entirely from scratch, evolved substantially and now accompanies a text on data mining [35]. These days, WEKA enjoys widespread acceptance in both academia and business, has an active community, and has been downloaded more than 1.4 million times since being placed on SourceForge in April 2000. This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.

19,603 citations

UCI Machine Learning Repository


A. Asuncion

01 Jan 2007

17,341 citations

Journal Article

A review of feature selection techniques in bioinformatics

Yvan Saeys¹, Iñaki Inza¹, Pedro Larrañaga¹

University of the Basque Country¹

10 Sep 2007 - Bioinformatics

TL;DR: A basic taxonomy of feature selection techniques is provided, along with a discussion of their use, variety, and potential in a number of common and upcoming bioinformatics applications.

Abstract: Feature selection techniques have become an apparent need in many bioinformatics applications. In addition to the large pool of techniques that have already been developed in the machine learning and data mining fields, specific applications in bioinformatics have led to a wealth of newly proposed techniques. In this article, we make the interested reader aware of the possibilities of feature selection, providing a basic taxonomy of feature selection techniques, and discussing their use, variety and potential in a number of both common as well as upcoming bioinformatics applications. Contact: yvan.saeys@psb.ugent.be Supplementary information: http://bioinformatics.psb.ugent.be/supplementary_data/yvsae/fsreview


4,706 citations

Journal Article

Selection of relevant features and examples in machine learning

Avrim Blum¹, Pat Langley²

Carnegie Mellon University¹, Daimler AG²

01 Dec 1997 - Artificial Intelligence

TL;DR: This survey reviews work in machine learning on methods for handling data sets containing large amounts of irrelevant information and describes the advances that have been made in both empirical and theoretical work in this area.

2,869 citations

Journal Article

Feature Extraction, Construction and Selection: A Data Mining Perspective

Huan Liu, Hiroshi Motoda

01 Jul 1998 - Journal of the American Statistical Association

TL;DR: This book can be used by researchers and graduate students in machine learning, data mining, and knowledge discovery, who wish to understand techniques of feature extraction, construction and selection for data pre-processing and to solve large size, real-world problems.

Abstract: From the Publisher: The book can be used by researchers and graduate students in machine learning, data mining, and knowledge discovery, who wish to understand techniques of feature extraction, construction and selection for data pre-processing and to solve large size, real-world problems. The book can also serve as a reference book for those who are conducting research about feature extraction, construction and selection, and are ready to meet the exciting challenges ahead of us.


953 citations