Optimizing credit risk prediction in the financial sector using boosting algorithms: a comparative study with financial datasets
Article Description
        This article analyzes credit risk in the financial sector and proposes a methodology to improve its prediction accuracy using boosting algorithms such as XGBoost, LightGBM, and Boosted Random Forest. Datasets from the UCI Machine Learning Repository were used, including Statlog German Credit Data, A...
              
            
    
| Author: | Villanueva Mora, Renzo Orlando |
|---|---|
| Format: | undergraduate thesis | 
| Publication date: | 2025 | 
| Institution: | Universidad de Lima | 
| Repository: | ULIMA-Institucional | 
| Language: | English | 
| OAI Identifier: | oai:repositorio.ulima.edu.pe:20.500.12724/23390 | 
| Resource link: | https://hdl.handle.net/20.500.12724/23390 | 
| Access level: | open access | 
| Subject: | Pending https://purl.org/pe-repo/ocde/ford#2.02.04 | 

| id | RULI_224cb19cc3b21f61ecd0edcfa678c485 | 
|---|---|
| oai_identifier_str | oai:repositorio.ulima.edu.pe:20.500.12724/23390 | 
| network_acronym_str | RULI | 
| network_name_str | ULIMA-Institucional | 
| repository_id_str | 3883 | 
| dc.title.en_EN.fl_str_mv | Optimizing credit risk prediction in the financial sector using boosting algorithms: a comparative study with financial datasets | 
| dc.title.alternative.en_EN.fl_str_mv | Optimización de la predicción del riesgo crediticio en el sector financiero mediante algoritmos de boosting: un estudio comparativo con conjuntos de datos financieros | 
| title | Optimizing credit risk prediction in the financial sector using boosting algorithms: a comparative study with financial datasets | 
| spellingShingle | Optimizing credit risk prediction in the financial sector using boosting algorithms: a comparative study with financial datasets Villanueva Mora, Renzo Orlando Pendiente https://purl.org/pe-repo/ocde/ford#2.02.04 | 
| title_short | Optimizing credit risk prediction in the financial sector using boosting algorithms: a comparative study with financial datasets | 
| title_full | Optimizing credit risk prediction in the financial sector using boosting algorithms: a comparative study with financial datasets | 
| title_fullStr | Optimizing credit risk prediction in the financial sector using boosting algorithms: a comparative study with financial datasets | 
| title_full_unstemmed | Optimizing credit risk prediction in the financial sector using boosting algorithms: a comparative study with financial datasets | 
| title_sort | Optimizing credit risk prediction in the financial sector using boosting algorithms: a comparative study with financial datasets | 
| author | Villanueva Mora, Renzo Orlando | 
| author_facet | Villanueva Mora, Renzo Orlando | 
| author_role | author | 
| dc.contributor.advisor.fl_str_mv | Escobedo Cardenas, Edwin Jonathan | 
| dc.contributor.author.fl_str_mv | Villanueva Mora, Renzo Orlando | 
| dc.subject.es_PE.fl_str_mv | Pendiente | 
| topic | Pendiente https://purl.org/pe-repo/ocde/ford#2.02.04 | 
| dc.subject.ocde.none.fl_str_mv | https://purl.org/pe-repo/ocde/ford#2.02.04 | 
| description | This article analyzes credit risk in the financial sector and proposes a methodology to improve its prediction accuracy using boosting algorithms such as XGBoost, LightGBM, and Boosted Random Forest. Datasets from the UCI Machine Learning Repository were used, including Statlog German Credit Data, Australian Credit Approval, and Bank Marketing. The methodology involved feature engineering, exploratory data analysis, and hyperparameter tuning. Additionally, a complementary strategy using K-means clustering was implemented to enhance the data. The results show that XGBoost outperforms the other models in various scenarios, and boosting-based methods deliver better performance than traditional approaches like decision trees and factorization machines—offering valuable insights for financial institutions. | 
| publishDate | 2025 | 
| dc.date.accessioned.none.fl_str_mv | 2025-09-23T16:37:07Z | 
| dc.date.available.none.fl_str_mv | 2025-09-23T16:37:07Z | 
| dc.date.issued.fl_str_mv | 2025 | 
| dc.type.none.fl_str_mv | info:eu-repo/semantics/bachelorThesis | 
| dc.type.other.none.fl_str_mv | Tesis | 
| format | bachelorThesis | 
| dc.identifier.uri.none.fl_str_mv | https://hdl.handle.net/20.500.12724/23390 | 
| dc.identifier.isni.none.fl_str_mv | 0000000121541816 | 
| url | https://hdl.handle.net/20.500.12724/23390 | 
| identifier_str_mv | 0000000121541816 | 
| dc.language.iso.none.fl_str_mv | eng | 
| language | eng | 
| dc.relation.ispartof.fl_str_mv | SUNEDU | 
| dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess | 
| dc.rights.uri.*.fl_str_mv | https://creativecommons.org/licenses/by-nc-sa/4.0/ | 
| eu_rights_str_mv | openAccess | 
| rights_invalid_str_mv | https://creativecommons.org/licenses/by-nc-sa/4.0/ | 
| dc.format.none.fl_str_mv | application/pdf | 
| dc.publisher.none.fl_str_mv | Universidad de Lima | 
| dc.publisher.country.none.fl_str_mv | PE | 
| publisher.none.fl_str_mv | Universidad de Lima | 
| dc.source.none.fl_str_mv | reponame:ULIMA-Institucional instname:Universidad de Lima instacron:ULIMA | 
| instname_str | Universidad de Lima | 
| instacron_str | ULIMA | 
| institution | ULIMA | 
| reponame_str | ULIMA-Institucional | 
| collection | ULIMA-Institucional | 
| bitstream.url.fl_str_mv | https://repositorio.ulima.edu.pe/bitstream/20.500.12724/23390/1/T018_72754378_T.pdf https://repositorio.ulima.edu.pe/bitstream/20.500.12724/23390/2/FA_72754378.pdf https://repositorio.ulima.edu.pe/bitstream/20.500.12724/23390/3/TURNITIN_DNI_72754378%20-%2020193654.pdf https://repositorio.ulima.edu.pe/bitstream/20.500.12724/23390/4/T018_72754378_T.pdf.txt https://repositorio.ulima.edu.pe/bitstream/20.500.12724/23390/6/FA_72754378.pdf.txt https://repositorio.ulima.edu.pe/bitstream/20.500.12724/23390/8/TURNITIN_DNI_72754378%20-%2020193654.pdf.txt https://repositorio.ulima.edu.pe/bitstream/20.500.12724/23390/5/T018_72754378_T.pdf.jpg https://repositorio.ulima.edu.pe/bitstream/20.500.12724/23390/7/FA_72754378.pdf.jpg https://repositorio.ulima.edu.pe/bitstream/20.500.12724/23390/9/TURNITIN_DNI_72754378%20-%2020193654.pdf.jpg | 
| bitstream.checksum.fl_str_mv | 6a43d8a3015b9618d459066420174139 48045e3f2fadcd218ebd1e666007d40b 6b600cd4250817d28c3f7b0d44ed0466 55947bb6c038eaf58ed2ec6c7c147c47 8006eaadf78be844006ebd108415fc0d 090bfce0a50eae8e44caf6b8f2fc1ca7 b43aff4f5cac5d73eed02d5af83a19c7 0e2b72a978d2d37bb2458b5278c1e44b 5622317d554cd63d791ec83479f15732 | 
| bitstream.checksumAlgorithm.fl_str_mv | MD5 MD5 MD5 MD5 MD5 MD5 MD5 MD5 MD5 | 
| repository.name.fl_str_mv | Repositorio Universidad de Lima | 
| repository.mail.fl_str_mv | repositorio@ulima.edu.pe | 
| _version_ | 1847246238755323904 | 
| spelling | Escobedo Cardenas, Edwin Jonathan; Villanueva Mora, Renzo Orlando; 2025-09-23T16:37:07Z; 2025-09-23T16:37:07Z; 2025; https://hdl.handle.net/20.500.12724/23390; 0000000121541816; application/pdf; eng; Universidad de Lima; PE; info:eu-repo/semantics/openAccess; https://creativecommons.org/licenses/by-nc-sa/4.0/; Pendiente; https://purl.org/pe-repo/ocde/ford#2.02.04; info:eu-repo/semantics/bachelorThesis; Tesis; reponame:ULIMA-Institucional; instname:Universidad de Lima; instacron:ULIMA; SUNEDU; Professional degree; Systems Engineering; Universidad de Lima. Facultad de Ingeniería; Systems Engineer; https://orcid.org/0000-0003-2034-513X; 4521175561207672754378; https://purl.org/pe-repo/renati/level#tituloProfesional; Guzman Jimenez, Rosario Marybel; Escobedo Cardenas, Edwin Jonathan; Quintana Cruz, Hernan Alejandro; https://purl.org/pe-repo/renati/type#tesis; OI; ORIGINAL: T018_72754378_T.pdf (Download, application/pdf, 319855, sequence 1), FA_72754378.pdf (Authorization, application/pdf, 248503, sequence 2), TURNITIN_DNI_72754378 - 20193654.pdf (Similarity report, application/pdf, 544088, sequence 3); TEXT: T018_72754378_T.pdf.txt (Extracted text, text/plain, 13745, sequence 4), FA_72754378.pdf.txt (Extracted text, text/plain, 4323, sequence 6), TURNITIN_DNI_72754378 - 20193654.pdf.txt (Extracted text, text/plain, 17147, sequence 8); THUMBNAIL: T018_72754378_T.pdf.jpg (Generated Thumbnail, image/jpeg, 12120, sequence 5), FA_72754378.pdf.jpg (Generated Thumbnail, image/jpeg, 21287, sequence 7), TURNITIN_DNI_72754378 - 20193654.pdf.jpg (Generated Thumbnail, image/jpeg, 8897, sequence 9); 20.500.12724/23390; oai:repositorio.ulima.edu.pe:20.500.12724/23390; 2025-09-29 12:38:56.747; Repositorio Universidad de Lima; repositorio@ulima.edu.pe | 
| score | 13.932078 | 
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository hosting this document or dataset. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA), the national open-access digital repository for science, technology, and innovation.
 
   
   
             
            