Open Access Journal Article

Comparison of machine-learning algorithms to build a predictive model for detecting undiagnosed diabetes - ELSA-Brasil: accuracy study

Abstract
CONTEXT AND OBJECTIVE: Type 2 diabetes is a chronic disease associated with a wide range of serious health complications that have a major impact on overall health. The aims here were to develop and validate predictive models for detecting undiagnosed diabetes using data from the Longitudinal Study of Adult Health (ELSA-Brasil) and to compare the performance of different machine-learning algorithms in this task.

DESIGN AND SETTING: Comparison of machine-learning algorithms to develop predictive models using data from ELSA-Brasil.

METHODS: After selecting a subset of 27 candidate variables from the literature, models were built and validated in four sequential steps: (i) parameter tuning with tenfold cross-validation, repeated three times; (ii) automatic variable selection using forward selection, a wrapper strategy with four different machine-learning algorithms and tenfold cross-validation (repeated three times), to evaluate each subset of variables; (iii) error estimation of model parameters with tenfold cross-validation, repeated ten times; and (iv) generalization testing on an independent dataset. The models were created with the following machine-learning algorithms: logistic regression, artificial neural network, naive Bayes, K-nearest neighbor and random forest.

RESULTS: The best models were created using artificial neural networks and logistic regression. These achieved mean areas under the curve (AUC) of, respectively, 75.24% and 74.98% in the error estimation step and, respectively, 74.17% and 74.41% in the generalization testing step.

CONCLUSION: Most of the predictive models produced similar results and demonstrated the feasibility of identifying individuals with the highest probability of having undiagnosed diabetes through easily obtained clinical data.
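The validation steps described in METHODS can be illustrated with a small, self-contained sketch. This is not the study's code, and several things in it are assumptions made only for illustration: the data are synthetic (two informative variables, two noise variables), a simple centroid-difference linear scorer stands in for the study's algorithms, the parameter tuning step (i) is skipped because the stand-in scorer has no hyperparameters, and all function names are hypothetical. It shows forward selection scored by repeated stratified tenfold cross-validation (step ii), error estimation with tenfold cross-validation repeated ten times (step iii), and a final check on an independent dataset (step iv).

```python
import random
from statistics import mean


def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(y, k, rng):
    """Split row indices into k folds, dealing each class out round-robin."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        members = [i for i, label in enumerate(y) if label == cls]
        rng.shuffle(members)
        for rank, i in enumerate(members):
            folds[rank % k].append(i)
    return folds


def fit_centroid_scorer(X, y, cols):
    """Stand-in model (assumption): weight each column by the class-mean gap."""
    weights = []
    for c in cols:
        pos_mean = mean(row[c] for row, label in zip(X, y) if label == 1)
        neg_mean = mean(row[c] for row, label in zip(X, y) if label == 0)
        weights.append(pos_mean - neg_mean)
    return lambda row: sum(w * row[c] for w, c in zip(weights, cols))


def cv_auc(X, y, cols, k=10, repeats=3, seed=0):
    """Mean AUC over `repeats` rounds of stratified k-fold cross-validation."""
    rng = random.Random(seed)
    results = []
    for _ in range(repeats):
        for test in stratified_folds(y, k, rng):
            held_out = set(test)
            train = [i for i in range(len(y)) if i not in held_out]
            score = fit_centroid_scorer([X[i] for i in train],
                                        [y[i] for i in train], cols)
            results.append(auc([y[i] for i in test],
                               [score(X[i]) for i in test]))
    return mean(results)


def forward_selection(X, y, k=10, repeats=3):
    """Wrapper strategy: add the variable that most improves CV AUC, else stop."""
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        trials = [(cv_auc(X, y, selected + [c], k, repeats), c) for c in remaining]
        top_auc, top_col = max(trials)
        if top_auc <= best:
            break
        selected.append(top_col)
        remaining.remove(top_col)
        best = top_auc
    return selected, best


def make_data(n, seed):
    """Synthetic data (assumption): columns 0 and 1 carry signal, 2 and 3 are noise."""
    rng = random.Random(seed)
    y = [1 if i % 3 == 0 else 0 for i in range(n)]
    X = [[2.0 * label + rng.gauss(0, 1), 1.0 * label + rng.gauss(0, 1),
          rng.gauss(0, 1), rng.gauss(0, 1)] for label in y]
    return X, y


X_dev, y_dev = make_data(300, seed=1)   # development data
X_new, y_new = make_data(150, seed=2)   # independent data for step (iv)

selected, selection_auc = forward_selection(X_dev, y_dev)               # step (ii)
estimate_auc = cv_auc(X_dev, y_dev, selected, k=10, repeats=10, seed=42)  # step (iii)
final_scorer = fit_centroid_scorer(X_dev, y_dev, selected)
holdout_auc = auc(y_new, [final_scorer(row) for row in X_new])          # step (iv)
print(selected, round(estimate_auc, 3), round(holdout_auc, 3))
```

Every candidate subset is scored on the same folds (the seed is fixed inside `cv_auc`), so differences in mean AUC reflect the variables rather than the partitioning. Comparing the cross-validated estimate with the AUC on the independent data mirrors the comparison reported in RESULTS between the error estimation and generalization testing steps.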


Citations
Journal Article

A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models

TL;DR: This systematic review found no evidence of superior performance of ML over LR for clinical prediction modeling, and concludes that improvements in methodology and reporting are needed for studies that compare modeling algorithms.
Journal Article

Early detection of type 2 diabetes mellitus using machine learning-based prediction models.

TL;DR: This study compares machine learning-based prediction models to commonly used regression models for prediction of undiagnosed T2DM and shows no clinically relevant improvement when more sophisticated prediction models were used.
Journal Article

Applications of Machine Learning Predictive Models in the Chronic Disease Diagnosis

TL;DR: There are no standard methods to determine the best approach in real-time clinical practice since each method has its advantages and disadvantages, and these models are expected to become more important in medical practice in the near future.
Journal Article

Artificial intelligence in medicine: What is it doing for us today?

TL;DR: Some of the potential drawbacks, concerns, and uncertainties surrounding the use of AI in medicine are clarified and some of the efforts being made to prepare the health care industry for the implementation of AI are discussed.
Journal Article

Current State of Diabetes Mellitus Prevalence, Awareness, Treatment, and Control in Latin America: Challenges and Innovative Solutions to Improve Health Outcomes Across the Continent

TL;DR: The prevalence of diabetes mellitus continues to rise across Latin America, and the number of those with the disease may be underestimated; however, some local governments are embedding more comprehensive diabetes assessments in their national surveys.
References
Journal Article

Random Forests

TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Journal Article

Applied Logistic Regression.

TL;DR: Applied Logistic Regression, Third Edition provides an easily accessible introduction to the logistic regression model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables.
Journal Article

An introduction to variable and feature selection

TL;DR: The contributions of this special issue cover a wide range of aspects of variable selection: providing a better definition of the objective function, feature construction, feature ranking, multivariate feature selection, efficient search methods, and feature validity assessment methods.
Journal Article

Nearest neighbor pattern classification

TL;DR: The nearest neighbor decision rule assigns to an unclassified sample point the classification of the nearest of a set of previously classified points, so it may be said that half the classification information in an infinite sample set is contained in the nearest neighbor.
Book

Neural Networks And Learning Machines

Simon Haykin
TL;DR: Refocused, revised and renamed to reflect the duality of neural networks and learning machines, this edition recognizes that the subject matter is richer when these topics are studied together.