Comparison of machine-learning algorithms to build a predictive model for detecting undiagnosed diabetes - ELSA-Brasil: accuracy study

doi:10.1590/1516-3180.2016.0309010217

Open Access, Journal Article

André Rodrigues Olivera, +6 more

01 May 2017

Sao Paulo Medical Journal

Vol. 135, Iss. 3, pp. 234-246

TLDR

Most of the predictive models produced similar results and demonstrated the feasibility of identifying individuals with the highest probability of having undiagnosed diabetes using easily obtained clinical data.

Abstract:

CONTEXT AND OBJECTIVE: Type 2 diabetes is a chronic disease associated with a wide range of serious health complications that have a major impact on overall health. The aims here were to develop and validate predictive models for detecting undiagnosed diabetes using data from the Longitudinal Study of Adult Health (ELSA-Brasil) and to compare the performance of different machine-learning algorithms in this task.

DESIGN AND SETTING: Comparison of machine-learning algorithms to develop predictive models using data from ELSA-Brasil.

METHODS: After selecting a subset of 27 candidate variables from the literature, models were built and validated in four sequential steps: (i) parameter tuning with tenfold cross-validation, repeated three times; (ii) automatic variable selection using forward selection, a wrapper strategy with four different machine-learning algorithms and tenfold cross-validation (repeated three times), to evaluate each subset of variables; (iii) error estimation of model parameters with tenfold cross-validation, repeated ten times; and (iv) generalization testing on an independent dataset. The models were created with the following machine-learning algorithms: logistic regression, artificial neural network, naive Bayes, K-nearest neighbor and random forest.

RESULTS: The best models were created using artificial neural networks and logistic regression. These achieved mean areas under the curve of, respectively, 75.24% and 74.98% in the error estimation step and 74.17% and 74.41% in the generalization testing step.

CONCLUSION: Most of the predictive models produced similar results and demonstrated the feasibility of identifying individuals with the highest probability of having undiagnosed diabetes using easily obtained clinical data.
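The validation steps in the abstract (forward selection as a wrapper, scored by repeated tenfold cross-validation and summarized as a mean area under the curve) can be illustrated with a minimal standard-library sketch. This is not the authors' code. The synthetic data, the feature names `signal`, `weak` and `noise`, the stratified fold assignment, and the mean-difference linear scorer standing in for the paper's five algorithms are all assumptions made for illustration.

```python
import random
from statistics import mean


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case from each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_stratified_kfold(labels, k=10, repeats=3, seed=0):
    """Yield (train, test) index lists for `repeats` reshuffled k-fold splits.

    Stratification is an assumption here; it keeps both classes in every fold.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        pos = [i for i, y in enumerate(labels) if y == 1]
        neg = [i for i, y in enumerate(labels) if y == 0]
        rng.shuffle(pos)
        rng.shuffle(neg)
        folds = [pos[i::k] + neg[i::k] for i in range(k)]
        for i, test in enumerate(folds):
            train = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield train, test


def fit_mean_difference(rows, labels, feats):
    """Toy stand-in model: weight = class-mean difference per feature."""
    weights = {}
    for f in feats:
        pos = [r[f] for r, y in zip(rows, labels) if y == 1]
        neg = [r[f] for r, y in zip(rows, labels) if y == 0]
        weights[f] = mean(pos) - mean(neg)
    return weights


def cv_auc(rows, labels, feats, k=10, repeats=3, seed=0):
    """Mean test-fold AUC over repeated k-fold cross-validation."""
    aucs = []
    for train, test in repeated_stratified_kfold(labels, k, repeats, seed):
        w = fit_mean_difference(
            [rows[i] for i in train], [labels[i] for i in train], feats
        )
        scores = [sum(w[f] * rows[i][f] for f in feats) for i in test]
        aucs.append(auc([labels[i] for i in test], scores))
    return mean(aucs)


def forward_selection(candidates, evaluate):
    """Greedy wrapper: keep adding the feature that most improves the score."""
    selected, best = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        score, feat = max((evaluate(selected + [f]), f) for f in remaining)
        if score <= best:
            break  # no remaining feature improves the cross-validated AUC
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best


# Synthetic data (hypothetical): one informative, one weak, one noise feature.
rng = random.Random(42)
labels = [1] * 100 + [0] * 200
rows = [
    {
        "signal": rng.gauss(1.0 * y, 1.0),
        "weak": rng.gauss(0.3 * y, 1.0),
        "noise": rng.gauss(0.0, 1.0),
    }
    for y in labels
]

# Variable selection with tenfold CV repeated three times, as in step (ii).
selected, selection_auc = forward_selection(
    ["signal", "weak", "noise"],
    lambda feats: cv_auc(rows, labels, feats, k=10, repeats=3),
)
# Error estimation with tenfold CV repeated ten times, as in step (iii).
final_auc = cv_auc(rows, labels, selected, k=10, repeats=10, seed=1)
print(selected, round(selection_auc, 3), round(final_auc, 3))
```

A real replication would swap the toy scorer for the algorithms named in the abstract and add the parameter-tuning and independent-test steps, which this sketch leaves out.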

Citations

A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models

Early detection of type 2 diabetes mellitus using machine learning-based prediction models.

Applications of Machine Learning Predictive Models in the Chronic Disease Diagnosis

Artificial intelligence in medicine: What is it doing for us today?

Current State of Diabetes Mellitus Prevalence, Awareness, Treatment, and Control in Latin America: Challenges and Innovative Solutions to Improve Health Outcomes Across the Continent
