Predictive analysis of vector-borne diseases through tabular classification of epidemiological data
Article Description
Vector-borne diseases (VBDs) are major threats to human health. They are estimated to cause more than 700,000 deaths each year. This presents serious health problems for CBD. In recent years, the incidence of VBDs has increased globally, affecting one billion people approximately and accounting for...
| Authors: | Iparraguirre-Villanueva, Orlando; Cabanillas-Carbonell, Michael |
|---|---|
| Format: | article |
| Publication date: | 2024 |
| Institution: | Universidad Tecnológica del Perú |
| Repository: | UTP-Institucional |
| Language: | English |
| OAI Identifier: | oai:repositorio.utp.edu.pe:20.500.12867/14482 |
| Resource link: | https://hdl.handle.net/20.500.12867/14482 https://doi.org/10.3991/ijoe.v20i13.50437 |
| Access level: | open access |
| Subject: | Prediction Machine learning Epidemiological data Models https://purl.org/pe-repo/ocde/ford#2.02.04 |
| id | UTPD_1a2f589a1a53ef99eb0b37df161b4a79 |
|---|---|
| oai_identifier_str | oai:repositorio.utp.edu.pe:20.500.12867/14482 |
| network_acronym_str | UTPD |
| network_name_str | UTP-Institucional |
| repository_id_str | 4782 |
| dc.title.es_PE.fl_str_mv | Predictive analysis of vector-borne diseases through tabular classification of epidemiological data |
| title | Predictive analysis of vector-borne diseases through tabular classification of epidemiological data |
| spellingShingle | Predictive analysis of vector-borne diseases through tabular classification of epidemiological data Iparraguirre-Villanueva, Orlando Prediction Machine learning Epidemiological data Models https://purl.org/pe-repo/ocde/ford#2.02.04 |
| title_short | Predictive analysis of vector-borne diseases through tabular classification of epidemiological data |
| title_full | Predictive analysis of vector-borne diseases through tabular classification of epidemiological data |
| title_fullStr | Predictive analysis of vector-borne diseases through tabular classification of epidemiological data |
| title_full_unstemmed | Predictive analysis of vector-borne diseases through tabular classification of epidemiological data |
| title_sort | Predictive analysis of vector-borne diseases through tabular classification of epidemiological data |
| author | Iparraguirre-Villanueva, Orlando |
| author_facet | Iparraguirre-Villanueva, Orlando Cabanillas-Carbonell, Michael |
| author_role | author |
| author2 | Cabanillas-Carbonell, Michael |
| author2_role | author |
| dc.contributor.author.fl_str_mv | Iparraguirre-Villanueva, Orlando Cabanillas-Carbonell, Michael |
| dc.subject.es_PE.fl_str_mv | Prediction Machine learning Epidemiological data Models |
| topic | Prediction Machine learning Epidemiological data Models https://purl.org/pe-repo/ocde/ford#2.02.04 |
| dc.subject.ocde.es_PE.fl_str_mv | https://purl.org/pe-repo/ocde/ford#2.02.04 |
| description | Vector-borne diseases (VBDs) are major threats to human health. They are estimated to cause more than 700,000 deaths each year. This presents serious health problems for CBD. In recent years, the incidence of VBDs has increased globally, affecting one billion people approximately and accounting for 17% of all infectious diseases. Globally, disease rates have risen at an alarming rate, with more than 3.9 billion people at risk of infection. Therefore, it is essential to find approaches to detect these diseases; this is where machine learning (ML) models come into play. The purpose of this study was to predict VBDs using tabular epidemiological data. For this purpose, a set of ML models was used, such as support vector classifier (SVC), extreme gradient boosting (XGBoost), LightGBM, CatBoost, random forest (RF), and balanced random forest (BRF). A dataset consisting of 65 features and 1262 records was used during the training stage. The results highlighted the successful integration of the different models, such as SVC, XGBoost, LightGBM, CatBoost, BRF, and RF, with weights of 0.49959 ± 0.27112, 0.58496 ± 0.22619, 0.48482 ± 0.29971, 0.54992 ± 0.27982, 0.24924 ± 0.22654, and 0.45592 ± 0.25849. In addition, the BRF model stood out for having the lowest log loss, evaluated through the ensemble log-loss metric, with an average of 0.24924 and a standard deviation of 0.22654. |
| publishDate | 2024 |
| dc.date.accessioned.none.fl_str_mv | 2025-11-07T17:26:35Z |
| dc.date.available.none.fl_str_mv | 2025-11-07T17:26:35Z |
| dc.date.issued.fl_str_mv | 2024 |
| dc.type.es_PE.fl_str_mv | info:eu-repo/semantics/article |
| dc.type.version.es_PE.fl_str_mv | info:eu-repo/semantics/publishedVersion |
| format | article |
| status_str | publishedVersion |
| dc.identifier.issn.none.fl_str_mv | 2626-8493 |
| dc.identifier.uri.none.fl_str_mv | https://hdl.handle.net/20.500.12867/14482 |
| dc.identifier.journal.es_PE.fl_str_mv | International Journal of Online and Biomedical Engineering |
| dc.identifier.doi.none.fl_str_mv | https://doi.org/10.3991/ijoe.v20i13.50437 |
| identifier_str_mv | 2626-8493 International Journal of Online and Biomedical Engineering |
| url | https://hdl.handle.net/20.500.12867/14482 https://doi.org/10.3991/ijoe.v20i13.50437 |
| dc.language.iso.es_PE.fl_str_mv | eng |
| language | eng |
| dc.rights.es_PE.fl_str_mv | info:eu-repo/semantics/openAccess |
| dc.rights.uri.es_PE.fl_str_mv | https://creativecommons.org/licenses/by/4.0/ |
| eu_rights_str_mv | openAccess |
| rights_invalid_str_mv | https://creativecommons.org/licenses/by/4.0/ |
| dc.format.es_PE.fl_str_mv | application/pdf |
| dc.publisher.es_PE.fl_str_mv | International Federation of Engineering Education Societies (IFEES) |
| dc.source.es_PE.fl_str_mv | Repositorio Institucional - UTP Universidad Tecnológica del Perú |
| dc.source.none.fl_str_mv | reponame:UTP-Institucional instname:Universidad Tecnológica del Perú instacron:UTP |
| instname_str | Universidad Tecnológica del Perú |
| instacron_str | UTP |
| institution | UTP |
| reponame_str | UTP-Institucional |
| collection | UTP-Institucional |
| bitstream.url.fl_str_mv | https://repositorio.utp.edu.pe/backend/api/core/bitstreams/0093a3c3-9950-4364-b167-b12119f16066/download https://repositorio.utp.edu.pe/backend/api/core/bitstreams/591e64c7-034c-4594-8a2a-7c5586cf7f60/download https://repositorio.utp.edu.pe/backend/api/core/bitstreams/c62252a4-b7b1-4bec-9fb1-462d46b39d7f/download https://repositorio.utp.edu.pe/backend/api/core/bitstreams/bc4cefe3-90ac-4f34-9b25-54f3bf226032/download https://repositorio.utp.edu.pe/backend/api/core/bitstreams/4254dc03-fd9b-4204-bf63-89986eaaa771/download https://repositorio.utp.edu.pe/backend/api/core/bitstreams/073f1639-cce9-4552-a64c-98efed143e5a/download |
| bitstream.checksum.fl_str_mv | 775748cd251c0c4f5f41435f0080b757 8a4605be74aa9ea9d79846c1fba20a33 b734ae05b67eeb6ff9dd9cdfa641943e 0d816d60124fdd29e4c803497e63048d f632b5dbf8ea6aa167f4087aefd1c614 b5e3db6b8ead7d86c850b683ad613c5a |
| bitstream.checksumAlgorithm.fl_str_mv | MD5 MD5 MD5 MD5 MD5 MD5 |
| repository.name.fl_str_mv | Repositorio de la Universidad Tecnológica del Perú |
| repository.mail.fl_str_mv | repositorio@utp.edu.pe |
| _version_ | 1852231535613181952 |
| score | 13.918286 |
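
The abstract in the `description` field reports each model's log loss as an average and a standard deviation. The following is a minimal sketch of how such a summary could be computed. It assumes, since the record does not say so, that the spread is taken across cross-validation folds and that a population standard deviation is used. The function names `log_loss` and `summarize` and the toy probabilities are hypothetical and do not come from the article.

```python
import math
from statistics import mean, pstdev


def log_loss(y_true, probs, eps=1e-15):
    """Multiclass log loss: average of -log(p) for the probability given to the true class."""
    losses = []
    for label, row in zip(y_true, probs):
        p = min(max(row[label], eps), 1.0 - eps)  # clip to avoid log(0)
        losses.append(-math.log(p))
    return sum(losses) / len(losses)


def summarize(fold_losses):
    """Return (mean, population standard deviation) of per-fold log-loss values."""
    return mean(fold_losses), pstdev(fold_losses)


# Hypothetical toy data: two folds, three classes, made-up probabilities.
fold_1 = log_loss([0, 2], [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])
fold_2 = log_loss([1, 1], [[0.2, 0.5, 0.3], [0.3, 0.4, 0.3]])
avg, std = summarize([fold_1, fold_2])
print(f"log loss: {avg:.5f} ± {std:.5f}")
```

Lower log loss means the predicted probabilities sit closer to the true labels, which is why the abstract singles out the model with the smallest average.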
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA).