A comparison of machine learning techniques for detection of phishing websites

Article Description

Phishing is the theft of personal data through fake websites. Victims of this type of theft are directed to a fake website, where they are asked to enter their data to validate their identity. At that moment, theft is carried out, since entered data are stored and used by the hacker responsible for...


Bibliographic Details
Author: Moncada Vargas, Andrés Eduardo
Format: article
Publication Date: 2020
Institution: Universidad de Lima
Repository: Revistas - Universidad de Lima
Language: Spanish
OAI Identifier: oai:revistas.ulima.edu.pe:article/4886
Resource link: https://revistas.ulima.edu.pe/index.php/Interfases/article/view/4886
Access level: open access
Subject: Anti-Phishing
Machine Learning
Cybersecurity
Phishing Warning
Phishing
Cyberattack
id REVULIMA_778976e3101066ba9e2ae4de0aaf2ded
oai_identifier_str oai:revistas.ulima.edu.pe:article/4886
network_acronym_str REVULIMA
network_name_str Revistas - Universidad de Lima
repository_id_str
dc.title.none.fl_str_mv A comparison of machine learning techniques for detection of phishing websites
Comparación de técnicas de machine learning para detección de sitios web de phishing
dc.creator.none.fl_str_mv Moncada Vargas, Andrés Eduardo
author_role author
dc.subject.none.fl_str_mv Anti-Phishing
Machine Learning
Cybersecurity
Phishing Warning
Phishing
Cyberattack
description Phishing is the theft of personal data through fake websites. Victims are directed to a fake website, where they are asked to enter their data to validate their identity. The theft occurs at that moment: the entered data are stored and used by the attacker, either to sell them or to access websites and commit fraud or scams. For this work, we researched different methods for detecting phishing websites using machine learning techniques. The purpose of this work is to compare the machine learning techniques that have proven most effective at detecting phishing websites. The results show that tree-based classifiers, namely Decision Tree and Random Forest, achieved the highest accuracy and efficacy rates, with values between 97% and 99%, in detecting these types of websites.
publishDate 2020
dc.date.none.fl_str_mv 2020-12-22
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv https://revistas.ulima.edu.pe/index.php/Interfases/article/view/4886
10.26439/interfases2020.n013.4886
url https://revistas.ulima.edu.pe/index.php/Interfases/article/view/4886
identifier_str_mv 10.26439/interfases2020.n013.4886
dc.language.none.fl_str_mv spa
language spa
dc.relation.none.fl_str_mv https://revistas.ulima.edu.pe/index.php/Interfases/article/view/4886/4873
dc.rights.none.fl_str_mv Copyright 2020 Revista Interfases
https://creativecommons.org/licenses/by-nd/4.0
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Copyright 2020 Revista Interfases
https://creativecommons.org/licenses/by-nd/4.0
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidad de Lima
publisher.none.fl_str_mv Universidad de Lima
dc.source.none.fl_str_mv Interfases; No. 013 (2020); 77-103
1993-4912
10.26439/interfases2020.n013
reponame:Revistas - Universidad de Lima
instname:Universidad de Lima
instacron:ULIMA
instname_str Universidad de Lima
instacron_str ULIMA
institution ULIMA
reponame_str Revistas - Universidad de Lima
collection Revistas - Universidad de Lima
repository.name.fl_str_mv
repository.mail.fl_str_mv
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the National Digital Repository of Science, Technology and Innovation of Open Access (ALICIA).