Breast cancer prediction using machine learning models
Article Description
Breast cancer is a type of cancer that develops in the cells of the breast. Treatment for breast cancer usually involves X-ray, chemotherapy, or a combination of both treatments. Detecting cancer at an early stage can save a person's life. Artificial intelligence (AI) plays a very important rol...
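The full abstract (the `description` field of this record) names five evaluation measures: classification accuracy, specificity, sensitivity, F1, and precision. As a rough illustration of how those measures relate to one another, the sketch below derives all five from the four confusion-matrix counts of a binary classifier. The function names, the `ConfusionCounts` type, and the toy labels are hypothetical and are not taken from the article; treating label `1` as the positive (malignant) class is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier (hypothetical helper type)."""
    tp: int  # positive cases predicted positive
    fp: int  # negative cases predicted positive
    tn: int  # negative cases predicted negative
    fn: int  # positive cases predicted negative


def count_outcomes(y_true, y_pred, positive=1):
    """Tally the confusion-matrix counts from two equal-length label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(c):
    """Compute the five measures named in the abstract from the counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # also called recall
    specificity = _ratio(c.tn, c.tn + c.fp)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "f1": f1,
        "precision": precision,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    for name, value in classification_metrics(count_outcomes(y_true, y_pred)).items():
        print(f"{name}: {value:.4f}")
```

Reporting specificity and sensitivity alongside accuracy matters for a diagnostic task because a single accuracy figure cannot distinguish missed malignant cases from false alarms.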
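Authors: | Ruíz Alvarado, John Fernando; Iparraguirre-Villanueva, Orlando; Epifanía-Huerta, Andrés; Torres-Ceclén, Carmen; Cabanillas-Carbonell, Michael |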
---|---|
Format: | article |
Publication Date: | 2023 |
Institution: | Universidad Tecnológica del Perú |
Repository: | UTP-Institucional |
Language: | Spanish |
OAI Identifier: | oai:repositorio.utp.edu.pe:20.500.12867/6960 |
Resource link: | https://hdl.handle.net/20.500.12867/6960 https://doi.org/10.14569/IJACSA.2023.0140272 |
Access level: | open access |
Subject: | Breast cancer Machine learning Predictive modelling https://purl.org/pe-repo/ocde/ford#3.02.21 https://purl.org/pe-repo/ocde/ford#1.02.00 |
id |
UTPD_bfe7bd00410487c4c96507ecb038a4cb |
---|---|
oai_identifier_str |
oai:repositorio.utp.edu.pe:20.500.12867/6960 |
network_acronym_str |
UTPD |
network_name_str |
UTP-Institucional |
repository_id_str |
4782 |
dc.title.es_PE.fl_str_mv |
Breast cancer prediction using machine learning models |
title |
Breast cancer prediction using machine learning models |
spellingShingle |
Breast cancer prediction using machine learning models Ruíz Alvarado, John Fernando Breast cancer Machine learning Predictive modelling https://purl.org/pe-repo/ocde/ford#3.02.21 https://purl.org/pe-repo/ocde/ford#1.02.00 |
title_short |
Breast cancer prediction using machine learning models |
title_full |
Breast cancer prediction using machine learning models |
title_fullStr |
Breast cancer prediction using machine learning models |
title_full_unstemmed |
Breast cancer prediction using machine learning models |
title_sort |
Breast cancer prediction using machine learning models |
author |
Ruíz Alvarado, John Fernando |
author_facet |
Ruíz Alvarado, John Fernando Iparraguirre-Villanueva, Orlando Epifanía-Huerta, Andrés Torres-Ceclén, Carmen Cabanillas-Carbonell, Michael |
author_role |
author |
author2 |
Iparraguirre-Villanueva, Orlando Epifanía-Huerta, Andrés Torres-Ceclén, Carmen Cabanillas-Carbonell, Michael |
author2_role |
author author author author |
dc.contributor.author.fl_str_mv |
Ruíz Alvarado, John Fernando Iparraguirre-Villanueva, Orlando Epifanía-Huerta, Andrés Torres-Ceclén, Carmen Cabanillas-Carbonell, Michael |
dc.subject.es_PE.fl_str_mv |
Breast cancer Machine learning Predictive modelling |
topic |
Breast cancer Machine learning Predictive modelling https://purl.org/pe-repo/ocde/ford#3.02.21 https://purl.org/pe-repo/ocde/ford#1.02.00 |
dc.subject.ocde.es_PE.fl_str_mv |
https://purl.org/pe-repo/ocde/ford#3.02.21 https://purl.org/pe-repo/ocde/ford#1.02.00 |
description |
Breast cancer is a type of cancer that develops in the cells of the breast. Treatment for breast cancer usually involves X-ray, chemotherapy, or a combination of both. Detecting cancer at an early stage can save a person's life. Artificial intelligence (AI) plays a very important role in this area, yet predicting breast cancer remains a very challenging issue for clinicians and researchers. This work aims to predict the probability of breast cancer in patients using six machine learning (ML) models: Multilayer Perceptron (MLP), K-Nearest Neighbors (KNN), AdaBoost (AB), Bagging, Gradient Boosting (GB), and Random Forest (RF). The breast cancer diagnostic dataset from the Wisconsin repository was used; it includes 569 observations and 32 features. Following the data analysis methodology, data cleaning, exploratory analysis, training, testing, and validation were performed. Model performance was evaluated with five metrics: classification accuracy, specificity, sensitivity, F1 score, and precision. The results indicate that all six trained models can provide strong classification and prediction results. The RF, GB, and AB models achieved 100% accuracy, outperforming the other models, and are therefore the suggested models for breast cancer identification, classification, and prediction. The Bagging, KNN, and MLP models achieved 99.56%, 95.82%, and 96.92%, respectively, which is also close to 100%. |
publishDate |
2023 |
dc.date.accessioned.none.fl_str_mv |
2023-05-15T21:57:52Z |
dc.date.available.none.fl_str_mv |
2023-05-15T21:57:52Z |
dc.date.issued.fl_str_mv |
2023 |
dc.type.es_PE.fl_str_mv |
info:eu-repo/semantics/article |
dc.type.version.es_PE.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
format |
article |
status_str |
publishedVersion |
dc.identifier.issn.none.fl_str_mv |
2156-5570 |
dc.identifier.uri.none.fl_str_mv |
https://hdl.handle.net/20.500.12867/6960 |
dc.identifier.journal.es_PE.fl_str_mv |
International Journal of Advanced Computer Science and Applications |
dc.identifier.doi.none.fl_str_mv |
https://doi.org/10.14569/IJACSA.2023.0140272 |
identifier_str_mv |
2156-5570 International Journal of Advanced Computer Science and Applications |
url |
https://hdl.handle.net/20.500.12867/6960 https://doi.org/10.14569/IJACSA.2023.0140272 |
dc.language.iso.es_PE.fl_str_mv |
spa |
language |
spa |
dc.relation.ispartofseries.none.fl_str_mv |
International Journal of Advanced Computer Science and Applications;vol. 14, n° 2 |
dc.rights.es_PE.fl_str_mv |
info:eu-repo/semantics/openAccess |
dc.rights.uri.es_PE.fl_str_mv |
http://creativecommons.org/licenses/by/4.0/ |
eu_rights_str_mv |
openAccess |
rights_invalid_str_mv |
http://creativecommons.org/licenses/by/4.0/ |
dc.format.es_PE.fl_str_mv |
application/pdf |
dc.publisher.es_PE.fl_str_mv |
The Science and Information Organization |
dc.publisher.country.es_PE.fl_str_mv |
GB |
dc.source.es_PE.fl_str_mv |
Repositorio Institucional - UTP Universidad Tecnológica del Perú |
dc.source.none.fl_str_mv |
reponame:UTP-Institucional instname:Universidad Tecnológica del Perú instacron:UTP |
instname_str |
Universidad Tecnológica del Perú |
instacron_str |
UTP |
institution |
UTP |
reponame_str |
UTP-Institucional |
collection |
UTP-Institucional |
bitstream.url.fl_str_mv |
http://repositorio.utp.edu.pe/bitstream/20.500.12867/6960/1/J.Ruiz_Articulo_2023.pdf http://repositorio.utp.edu.pe/bitstream/20.500.12867/6960/2/license.txt http://repositorio.utp.edu.pe/bitstream/20.500.12867/6960/3/J.Ruiz_Articulo_2023.pdf.txt http://repositorio.utp.edu.pe/bitstream/20.500.12867/6960/4/J.Ruiz_Articulo_2023.pdf.jpg |
bitstream.checksum.fl_str_mv |
9206319d1b143bcdee8e50ba3c538951 8a4605be74aa9ea9d79846c1fba20a33 99639d93a88e6df238344fb6f8d585b3 893bf6e6f771a753ef7b456bcaa45e16 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositorio Institucional de la Universidad Tecnológica del Perú |
repository.mail.fl_str_mv |
repositorio@utp.edu.pe |
_version_ |
1817984912142106624 |
score |
13.7211075 |
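The `bitstream.checksum.fl_str_mv` and `bitstream.checksumAlgorithm.fl_str_mv` fields above list one MD5 digest per bitstream, in the same order as the URLs in `bitstream.url.fl_str_mv`. A downloaded file can be compared against its listed digest along the following lines. This is a minimal sketch using Python's standard `hashlib`; the function names are hypothetical, and the local file path in the closing comment is an assumption about where a copy has been saved.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of a byte string as lowercase hexadecimal."""
    return hashlib.md5(data).hexdigest()


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in fixed-size chunks so large PDFs need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(data: bytes, expected_hex: str) -> bool:
    """Compare data against a digest copied from the record (case-insensitive)."""
    return md5_hex(data) == expected_hex.strip().lower()


# Hypothetical usage, assuming the PDF was saved locally under this name:
#   md5_of_file("J.Ruiz_Articulo_2023.pdf") == "9206319d1b143bcdee8e50ba3c538951"
```

MD5 is adequate here for detecting a corrupted or truncated download, which is how the record uses it; it should not be relied on as protection against deliberate tampering.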
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the National Digital Repository of Science, Technology and Innovation of Open Access (ALICIA).