Breast cancer prediction using machine learning models

Article Description

Breast cancer is a type of cancer that develops in the cells of the breast. Treatment for breast cancer usually involves X-ray, chemotherapy, or a combination of both treatments. Detecting cancer at an early stage can save a person's life. Artificial intelligence (AI) plays a very important rol...

Bibliographic Details
Authors: Ruíz Alvarado, John Fernando; Iparraguirre-Villanueva, Orlando; Epifanía-Huerta, Andrés; Torres-Ceclén, Carmen; Cabanillas-Carbonell, Michael
Format: article
Publication date: 2023
Institution: Universidad Tecnológica del Perú
Repository: UTP-Institucional
Language: Spanish
OAI Identifier: oai:repositorio.utp.edu.pe:20.500.12867/6960
Resource link: https://hdl.handle.net/20.500.12867/6960
https://doi.org/10.14569/IJACSA.2023.0140272
Access level: open access
Subject: Breast cancer
Machine learning
Predictive modelling
https://purl.org/pe-repo/ocde/ford#3.02.21
https://purl.org/pe-repo/ocde/ford#1.02.00
id UTPD_bfe7bd00410487c4c96507ecb038a4cb
oai_identifier_str oai:repositorio.utp.edu.pe:20.500.12867/6960
network_acronym_str UTPD
network_name_str UTP-Institucional
repository_id_str 4782
description Breast cancer is a type of cancer that develops in the cells of the breast. Treatment usually involves X-ray, chemotherapy, or a combination of both. Detecting the cancer at an early stage can save a person's life. Artificial intelligence (AI) plays a very important role in this area, yet predicting breast cancer remains a very challenging problem for clinicians and researchers. This work aims to predict the probability of breast cancer in patients using machine learning (ML) models: Multilayer Perceptron (MLP), K-Nearest Neighbors (KNN), AdaBoost (AB), Bagging, Gradient Boosting (GB), and Random Forest (RF). The breast cancer diagnostic dataset from the Wisconsin repository was used; it includes 569 observations and 32 features. Following the data analysis methodology, data cleaning, exploratory analysis, training, testing, and validation were performed. Model performance was evaluated with the following metrics: classification accuracy, specificity, sensitivity, F1 score, and precision. The results indicate that all six trained models can provide optimal classification and prediction results. The RF, GB, and AB models achieved 100% accuracy, outperforming the other models, and are therefore the models suggested for breast cancer identification, classification, and prediction. The Bagging, KNN, and MLP models achieved a performance of 99.56%, 95.82%, and 96.92%, respectively, which is also close to 100%. Overall, the results show a clear advantage for the RF, GB, and AB models, which achieve the most accurate results in breast cancer prediction.
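The abstract names five evaluation metrics without defining them. As a minimal sketch, assuming the usual confusion-matrix definitions with the malignant class treated as positive, they could be computed as shown here. The labels "M" and "B", the function names, and the toy predictions are illustrative assumptions, not data or code from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts; the positive class is malignant."""
    tp: int  # malignant cases predicted malignant
    fp: int  # benign cases predicted malignant
    tn: int  # benign cases predicted benign
    fn: int  # malignant cases predicted benign


def count_outcomes(y_true, y_pred, positive="M"):
    """Tally the four outcomes from paired true and predicted labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def ratio(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def evaluate(c):
    """Return the five metrics listed in the abstract as a dict."""
    precision = ratio(c.tp, c.tp + c.fp)
    sensitivity = ratio(c.tp, c.tp + c.fn)  # also called recall
    return {
        "accuracy": ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "sensitivity": sensitivity,
        "specificity": ratio(c.tn, c.tn + c.fp),
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


if __name__ == "__main__":
    # Toy labels only (hypothetical): 4 malignant and 6 benign cases.
    y_true = list("MMMMBBBBBB")
    y_pred = list("MMMBBBBBBM")
    for name, value in evaluate(count_outcomes(y_true, y_pred)).items():
        print(f"{name}: {value:.4f}")
```

On this toy input the tally is 3 true positives, 1 false positive, 5 true negatives, and 1 false negative, so accuracy is 8/10 and sensitivity is 3/4. A model that reports 100% accuracy would have both error counts equal to zero on the evaluated split.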
publishDate 2023
dc.date.accessioned.none.fl_str_mv 2023-05-15T21:57:52Z
dc.date.available.none.fl_str_mv 2023-05-15T21:57:52Z
dc.date.issued.fl_str_mv 2023
dc.type.es_PE.fl_str_mv info:eu-repo/semantics/article
dc.type.version.es_PE.fl_str_mv info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.issn.none.fl_str_mv 2156-5570
dc.identifier.uri.none.fl_str_mv https://hdl.handle.net/20.500.12867/6960
dc.identifier.journal.es_PE.fl_str_mv International Journal of Advanced Computer Science and Applications
dc.identifier.doi.none.fl_str_mv https://doi.org/10.14569/IJACSA.2023.0140272
dc.language.iso.es_PE.fl_str_mv spa
dc.relation.ispartofseries.none.fl_str_mv International Journal of Advanced Computer Science and Applications;vol. 14, n° 2
dc.rights.es_PE.fl_str_mv info:eu-repo/semantics/openAccess
dc.rights.uri.es_PE.fl_str_mv http://creativecommons.org/licenses/by/4.0/
dc.format.es_PE.fl_str_mv application/pdf
dc.publisher.es_PE.fl_str_mv The Science and Information Organization
dc.publisher.country.es_PE.fl_str_mv GB
dc.source.es_PE.fl_str_mv Repositorio Institucional - UTP
Universidad Tecnológica del Perú
bitstream.url.fl_str_mv http://repositorio.utp.edu.pe/bitstream/20.500.12867/6960/1/J.Ruiz_Articulo_2023.pdf
http://repositorio.utp.edu.pe/bitstream/20.500.12867/6960/2/license.txt
http://repositorio.utp.edu.pe/bitstream/20.500.12867/6960/3/J.Ruiz_Articulo_2023.pdf.txt
http://repositorio.utp.edu.pe/bitstream/20.500.12867/6960/4/J.Ruiz_Articulo_2023.pdf.jpg
bitstream.checksum.fl_str_mv 9206319d1b143bcdee8e50ba3c538951
8a4605be74aa9ea9d79846c1fba20a33
99639d93a88e6df238344fb6f8d585b3
893bf6e6f771a753ef7b456bcaa45e16
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
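The four MD5 values above are listed in the same order as the four bitstream URLs. As a minimal sketch of how a locally saved copy could be checked against its listed checksum, assuming the file has already been downloaded, one could hash it in chunks with Python's standard `hashlib`. The function names and the local filename in the comment are hypothetical and not part of the repository record.

```python
import hashlib


def md5_of_chunks(chunks):
    """Return the hex MD5 digest of an iterable of bytes objects."""
    digest = hashlib.md5()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path, chunk_size=65536):
    """Hash a file in fixed-size chunks so large files need little memory."""
    with open(path, "rb") as fh:
        return md5_of_chunks(iter(lambda: fh.read(chunk_size), b""))


def matches_checksum(path, expected_hex):
    """Compare a file's MD5 with a checksum string from the record."""
    return md5_of_file(path) == expected_hex.strip().lower()


# Hypothetical usage, assuming the PDF was saved under this local name:
# matches_checksum("J.Ruiz_Articulo_2023.pdf",
#                  "9206319d1b143bcdee8e50ba3c538951")
```

MD5 is adequate here for detecting accidental corruption during transfer; it is not a safeguard against deliberate tampering.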
repository.name.fl_str_mv Repositorio Institucional de la Universidad Tecnológica del Perú
repository.mail.fl_str_mv repositorio@utp.edu.pe
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the National Digital Repository of Science, Technology and Innovation of Open Access (ALICIA).