Application of machine learning models for early detection and accurate classification of type 2 Diabetes
Article Description
Early detection of diabetes is essential to prevent serious complications in patients. The purpose of this work is to detect and classify type 2 diabetes in patients using machine learning (ML) models, and to select the most optimal model to predict the risk of diabetes. In this paper, five ML model...
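The abstract in this record reports K-nearest neighbor (K-NN) as the most accurate of the five models, but the record does not include the authors' code. As a rough illustration only, the sketch below shows how a K-NN classifier labels a patient by majority vote among the closest training rows and how accuracy is computed. The function names `knn_predict` and `accuracy`, the choice of k = 3, the two features, and every data value are assumptions made for this example; the rows are toy values, not records from the Pima Indian dataset.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows."""
    # Euclidean distance from the query to every training row.
    distances = [
        (math.dist(row, query), label)
        for row, label in zip(train_X, train_y)
    ]
    # Keep the k closest rows and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_X, train_y, x, k) == y
        for x, y in zip(test_X, test_y)
    )
    return hits / len(test_y)


# Toy, made-up rows of (glucose, BMI); 1 = diabetes, 0 = no diabetes.
# These are NOT values from the Pima Indian dataset.
train_X = [(85, 22.0), (90, 24.5), (95, 23.0),
           (160, 33.0), (170, 35.5), (155, 31.0)]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [(88, 23.5), (165, 34.0)]
test_y = [0, 1]

print(accuracy(train_X, train_y, test_X, test_y, k=3))  # prints 1.0
```

In this sketch the features are left unscaled, so glucose dominates the distance. A real pipeline would normally standardize each feature first and would evaluate on a held-out split far larger than two rows.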
Authors: | Espinola Linares, Karina; Iparraguirre-Villanueva, Orlando; Flores Castañeda, Rosalynn Ornella; Cabanillas-Carbonell, Michael |
---|---|
Format: | article |
Publication Date: | 2023 |
Institution: | Universidad Tecnológica del Perú |
Repository: | UTP-Institucional |
Language: | English |
OAI Identifier: | oai:repositorio.utp.edu.pe:20.500.12867/7776 |
Resource link: | https://hdl.handle.net/20.500.12867/7776 https://doi.org/10.3390/diagnostics13142383 |
Access level: | open access |
Subject: | Diabetes Machine learning Predictive modelling https://purl.org/pe-repo/ocde/ford#3.00.00 https://purl.org/pe-repo/ocde/ford#1.02.00 |
id |
UTPD_46ddc32aadacc85c0a79ab300af42cdf |
---|---|
oai_identifier_str |
oai:repositorio.utp.edu.pe:20.500.12867/7776 |
network_acronym_str |
UTPD |
network_name_str |
UTP-Institucional |
repository_id_str |
4782 |
dc.title.es_PE.fl_str_mv |
Application of machine learning models for early detection and accurate classification of type 2 Diabetes |
title |
Application of machine learning models for early detection and accurate classification of type 2 Diabetes |
spellingShingle |
Application of machine learning models for early detection and accurate classification of type 2 Diabetes Espinola Linares, Karina Diabetes Machine learning Predictive modelling https://purl.org/pe-repo/ocde/ford#3.00.00 https://purl.org/pe-repo/ocde/ford#1.02.00 |
title_short |
Application of machine learning models for early detection and accurate classification of type 2 Diabetes |
title_full |
Application of machine learning models for early detection and accurate classification of type 2 Diabetes |
title_fullStr |
Application of machine learning models for early detection and accurate classification of type 2 Diabetes |
title_full_unstemmed |
Application of machine learning models for early detection and accurate classification of type 2 Diabetes |
title_sort |
Application of machine learning models for early detection and accurate classification of type 2 Diabetes |
author |
Espinola Linares, Karina |
author_facet |
Espinola Linares, Karina Iparraguirre-Villanueva, Orlando Flores Castañeda, Rosalynn Ornella Cabanillas-Carbonell, Michael |
author_role |
author |
author2 |
Iparraguirre-Villanueva, Orlando Flores Castañeda, Rosalynn Ornella Cabanillas-Carbonell, Michael |
author2_role |
author author author |
dc.contributor.author.fl_str_mv |
Espinola Linares, Karina Iparraguirre-Villanueva, Orlando Flores Castañeda, Rosalynn Ornella Cabanillas-Carbonell, Michael |
dc.subject.es_PE.fl_str_mv |
Diabetes Machine learning Predictive modelling |
topic |
Diabetes Machine learning Predictive modelling https://purl.org/pe-repo/ocde/ford#3.00.00 https://purl.org/pe-repo/ocde/ford#1.02.00 |
dc.subject.ocde.es_PE.fl_str_mv |
https://purl.org/pe-repo/ocde/ford#3.00.00 https://purl.org/pe-repo/ocde/ford#1.02.00 |
description |
Early detection of diabetes is essential to prevent serious complications in patients. The purpose of this work is to detect and classify type 2 diabetes in patients using machine learning (ML) models and to select the best-performing model for predicting diabetes risk. Five ML models are investigated for predicting diabetes in patients: K-nearest neighbor (K-NN), Bernoulli Naïve Bayes (BNB), decision tree (DT), logistic regression (LR), and support vector machine (SVM). A Kaggle-hosted Pima Indian dataset of 768 patients with and without diabetes was used; its variables include the number of pregnancies the patient has had, blood glucose concentration, diastolic blood pressure, skinfold thickness, body insulin levels, body mass index (BMI), genetic background, diabetes in the family tree, age, and outcome (with/without diabetes). The results show that the K-NN and BNB models outperform the other models: K-NN achieved the best accuracy in detecting diabetes, 79.6%, while BNB achieved 77.2%. These results suggest that ML models are very promising for the early detection of diabetes. |
publishDate |
2023 |
dc.date.accessioned.none.fl_str_mv |
2023-10-25T20:55:22Z |
dc.date.available.none.fl_str_mv |
2023-10-25T20:55:22Z |
dc.date.issued.fl_str_mv |
2023 |
dc.type.es_PE.fl_str_mv |
info:eu-repo/semantics/article |
dc.type.version.es_PE.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
format |
article |
status_str |
publishedVersion |
dc.identifier.issn.none.fl_str_mv |
2075-4418 |
dc.identifier.uri.none.fl_str_mv |
https://hdl.handle.net/20.500.12867/7776 |
dc.identifier.journal.es_PE.fl_str_mv |
Diagnostics |
dc.identifier.doi.none.fl_str_mv |
https://doi.org/10.3390/diagnostics13142383 |
identifier_str_mv |
2075-4418 Diagnostics |
url |
https://hdl.handle.net/20.500.12867/7776 https://doi.org/10.3390/diagnostics13142383 |
dc.language.iso.es_PE.fl_str_mv |
eng |
language |
eng |
dc.relation.ispartofseries.none.fl_str_mv |
Diagnostics; vol. 13, no. 14 |
dc.rights.es_PE.fl_str_mv |
info:eu-repo/semantics/openAccess |
dc.rights.uri.es_PE.fl_str_mv |
http://creativecommons.org/licenses/by/4.0/ |
eu_rights_str_mv |
openAccess |
rights_invalid_str_mv |
http://creativecommons.org/licenses/by/4.0/ |
dc.format.es_PE.fl_str_mv |
application/pdf |
dc.publisher.es_PE.fl_str_mv |
Multidisciplinary Digital Publishing Institute |
dc.publisher.country.es_PE.fl_str_mv |
CH |
dc.source.es_PE.fl_str_mv |
Repositorio Institucional - UTP Universidad Tecnológica del Perú |
dc.source.none.fl_str_mv |
reponame:UTP-Institucional instname:Universidad Tecnológica del Perú instacron:UTP |
instname_str |
Universidad Tecnológica del Perú |
instacron_str |
UTP |
institution |
UTP |
reponame_str |
UTP-Institucional |
collection |
UTP-Institucional |
bitstream.url.fl_str_mv |
http://repositorio.utp.edu.pe/bitstream/20.500.12867/7776/2/license.txt http://repositorio.utp.edu.pe/bitstream/20.500.12867/7776/1/K.Espinoza_Articulo_2023.pdf http://repositorio.utp.edu.pe/bitstream/20.500.12867/7776/3/K.Espinoza_Articulo_2023.pdf.txt http://repositorio.utp.edu.pe/bitstream/20.500.12867/7776/4/K.Espinoza_Articulo_2023.pdf.jpg |
bitstream.checksum.fl_str_mv |
8a4605be74aa9ea9d79846c1fba20a33 e86dc57d02b18ce4868a623f35cd1417 29f244577ab41c279b60d5690a8442b4 6433d8d07a0d8faf40a095c9778b39fb |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositorio Institucional de la Universidad Tecnológica del Perú |
repository.mail.fl_str_mv |
repositorio@utp.edu.pe |
_version_ |
1817984957621993472 |
score |
13.882472 |
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository holding this document or dataset. CONCYTEC is not responsible for the content (publications and/or data) accessible through the National Open Access Digital Repository of Science, Technology and Innovation (ALICIA).