Predictive analysis of vector-borne diseases through tabular classification of epidemiological data

Article Description

Vector-borne diseases (VBDs) are major threats to human health. They are estimated to cause more than 700,000 deaths each year. This presents serious health problems for CBD. In recent years, the incidence of VBDs has increased globally, affecting approximately one billion people and accounting for...


Bibliographic Details
Authors: Iparraguirre-Villanueva, Orlando; Cabanillas-Carbonell, Michael
Format: article
Publication Date: 2024
Institution: Universidad Tecnológica del Perú
Repository: UTP-Institucional
Language: English
OAI Identifier: oai:repositorio.utp.edu.pe:20.500.12867/14482
Resource link: https://hdl.handle.net/20.500.12867/14482
https://doi.org/10.3991/ijoe.v20i13.50437
Access level: open access
Subject: Prediction
Machine learning
Epidemiological data
Models
https://purl.org/pe-repo/ocde/ford#2.02.04
id UTPD_1a2f589a1a53ef99eb0b37df161b4a79
oai_identifier_str oai:repositorio.utp.edu.pe:20.500.12867/14482
network_acronym_str UTPD
network_name_str UTP-Institucional
repository_id_str 4782
dc.title.es_PE.fl_str_mv Predictive analysis of vector-borne diseases through tabular classification of epidemiological data
author Iparraguirre-Villanueva, Orlando
author_facet Iparraguirre-Villanueva, Orlando
Cabanillas-Carbonell, Michael
author_role author
author2 Cabanillas-Carbonell, Michael
author2_role author
dc.contributor.author.fl_str_mv Iparraguirre-Villanueva, Orlando
Cabanillas-Carbonell, Michael
dc.subject.es_PE.fl_str_mv Prediction
Machine learning
Epidemiological data
Models
dc.subject.ocde.es_PE.fl_str_mv https://purl.org/pe-repo/ocde/ford#2.02.04
description Vector-borne diseases (VBDs) are major threats to human health. They are estimated to cause more than 700,000 deaths each year. This presents serious health problems for CBD. In recent years, the incidence of VBDs has increased globally, affecting approximately one billion people and accounting for 17% of all infectious diseases. Globally, disease rates have risen at an alarming rate, with more than 3.9 billion people at risk of infection. Therefore, it is essential to find approaches to detect these diseases; this is where machine learning (ML) models come into play. The purpose of this study was to predict VBDs using tabular epidemiological data. For this purpose, a set of ML models was used: support vector classifier (SVC), extreme gradient boosting (XGBoost), LightGBM, CatBoost, random forest (RF), and balanced random forest (BRF). A dataset consisting of 65 features and 1262 records was used during the training stage. The results highlighted the successful integration of the different models, SVC, XGBoost, LightGBM, CatBoost, BRF, and RF, with weights of 0.49959 ± 0.27112, 0.58496 ± 0.22619, 0.48482 ± 0.29971, 0.54992 ± 0.27982, 0.24924 ± 0.22654, and 0.45592 ± 0.25849, respectively. In addition, the BRF model stood out for having the lowest log loss, evaluated through the ensemble log-loss metric, with an average of 0.24924 and a standard deviation of 0.22654.
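The abstract above reports each model's log loss as an average with a standard deviation. As a rough illustration of how such a summary can be computed, the sketch below defines multiclass log loss in plain Python and aggregates it over evaluation folds. The toy labels, the probabilities, the two-fold split, and the use of the population standard deviation are all assumptions made for this example; the record does not state how the authors computed their figures.

```python
import math
from statistics import mean, pstdev


def log_loss(y_true, probs, eps=1e-15):
    """Multiclass log loss: mean negative log-probability of the true class."""
    total = 0.0
    for label, row in zip(y_true, probs):
        # Clip the predicted probability to avoid log(0).
        p = min(max(row[label], eps), 1.0 - eps)
        total += -math.log(p)
    return total / len(y_true)


# Hypothetical toy data (not from the study): 3 classes, 2 evaluation folds.
# Each fold is (true class indices, predicted class probabilities per record).
folds = [
    ([0, 1, 2], [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]),
    ([2, 0], [[0.3, 0.3, 0.4], [0.5, 0.25, 0.25]]),
]

# One log-loss score per fold, then the "mean ± std" style summary.
scores = [log_loss(y, p) for y, p in folds]
summary = (mean(scores), pstdev(scores))
print(f"log loss: {summary[0]:.5f} ± {summary[1]:.5f}")
```

Lower values indicate better-calibrated probabilities, which is why the model with the smallest average is described as standing out.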
publishDate 2024
dc.date.accessioned.none.fl_str_mv 2025-11-07T17:26:35Z
dc.date.available.none.fl_str_mv 2025-11-07T17:26:35Z
dc.date.issued.fl_str_mv 2024
dc.type.es_PE.fl_str_mv info:eu-repo/semantics/article
dc.type.version.es_PE.fl_str_mv info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.issn.none.fl_str_mv 2626-8493
dc.identifier.uri.none.fl_str_mv https://hdl.handle.net/20.500.12867/14482
dc.identifier.journal.es_PE.fl_str_mv International Journal of Online and Biomedical Engineering
dc.identifier.doi.none.fl_str_mv https://doi.org/10.3991/ijoe.v20i13.50437
identifier_str_mv 2626-8493
International Journal of Online and Biomedical Engineering
url https://hdl.handle.net/20.500.12867/14482
https://doi.org/10.3991/ijoe.v20i13.50437
dc.language.iso.es_PE.fl_str_mv eng
language eng
dc.rights.es_PE.fl_str_mv info:eu-repo/semantics/openAccess
dc.rights.uri.es_PE.fl_str_mv https://creativecommons.org/licenses/by/4.0/
eu_rights_str_mv openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by/4.0/
dc.format.es_PE.fl_str_mv application/pdf
dc.publisher.es_PE.fl_str_mv International Federation of Engineering Education Societies (IFEES)
dc.source.es_PE.fl_str_mv Repositorio Institucional - UTP
Universidad Tecnológica del Perú
dc.source.none.fl_str_mv reponame:UTP-Institucional
instname:Universidad Tecnológica del Perú
instacron:UTP
instname_str Universidad Tecnológica del Perú
instacron_str UTP
institution UTP
reponame_str UTP-Institucional
collection UTP-Institucional
bitstream.url.fl_str_mv https://repositorio.utp.edu.pe/backend/api/core/bitstreams/0093a3c3-9950-4364-b167-b12119f16066/download
https://repositorio.utp.edu.pe/backend/api/core/bitstreams/591e64c7-034c-4594-8a2a-7c5586cf7f60/download
https://repositorio.utp.edu.pe/backend/api/core/bitstreams/c62252a4-b7b1-4bec-9fb1-462d46b39d7f/download
https://repositorio.utp.edu.pe/backend/api/core/bitstreams/bc4cefe3-90ac-4f34-9b25-54f3bf226032/download
https://repositorio.utp.edu.pe/backend/api/core/bitstreams/4254dc03-fd9b-4204-bf63-89986eaaa771/download
https://repositorio.utp.edu.pe/backend/api/core/bitstreams/073f1639-cce9-4552-a64c-98efed143e5a/download
bitstream.checksum.fl_str_mv 775748cd251c0c4f5f41435f0080b757
8a4605be74aa9ea9d79846c1fba20a33
b734ae05b67eeb6ff9dd9cdfa641943e
0d816d60124fdd29e4c803497e63048d
f632b5dbf8ea6aa167f4087aefd1c614
b5e3db6b8ead7d86c850b683ad613c5a
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositorio de la Universidad Tecnológica del Perú
repository.mail.fl_str_mv repositorio@utp.edu.pe
_version_ 1852231535613181952
score 13.918286
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA).