Sentiment Analysis of Tweets using Unsupervised Learning Techniques and the K-Means Algorithm

Article Description

Abstract: Today, web content such as images, text, speeches, and videos are user-generated, and social networks have become increasingly popular as a means for people to share their ideas and opinions. One of the most popular social media for expressing their feelings towards events that occur is Tw...


Bibliographic Details
Authors: Iparraguirre-Villanueva, Orlando; Guevara-Ponce, Victor; Sierra-Liñan, Fernando; Beltozar-Clemente, Saul; Cabanillas-Carbonell, Michael
Format: article
Publication Date: 2022
Institution: Universidad Autónoma del Perú
Repository: AUTONOMA-Institucional
Language: English
OAI Identifier: oai:repositorio.autonoma.edu.pe:20.500.13067/1983
Resource Link: https://hdl.handle.net/20.500.13067/1983
http://dx.doi.org/10.14569/IJACSA.2022.0130669
Access Level: open access
Subject: Techniques
Machine learning
Classification
Twitter
https://purl.org/pe-repo/ocde/ford#2.02.04
id AUTO_42c03cda3d46d83ea3587e78023a7b85
oai_identifier_str oai:repositorio.autonoma.edu.pe:20.500.13067/1983
network_acronym_str AUTO
network_name_str AUTONOMA-Institucional
repository_id_str 4774
dc.title.es_PE.fl_str_mv Sentiment Analysis of Tweets using Unsupervised Learning Techniques and the K-Means Algorithm
title Sentiment Analysis of Tweets using Unsupervised Learning Techniques and the K-Means Algorithm
spellingShingle Sentiment Analysis of Tweets using Unsupervised Learning Techniques and the K-Means Algorithm
Iparraguirre-Villanueva, Orlando
Techniques
Machine learning
Classification
Twitter
https://purl.org/pe-repo/ocde/ford#2.02.04
author Iparraguirre-Villanueva, Orlando
author_facet Iparraguirre-Villanueva, Orlando
Guevara-Ponce, Victor
Sierra-Liñan, Fernando
Beltozar-Clemente, Saul
Cabanillas-Carbonell, Michael
author_role author
author2 Guevara-Ponce, Victor
Sierra-Liñan, Fernando
Beltozar-Clemente, Saul
Cabanillas-Carbonell, Michael
author2_role author
author
author
author
dc.contributor.author.fl_str_mv Iparraguirre-Villanueva, Orlando
Guevara-Ponce, Victor
Sierra-Liñan, Fernando
Beltozar-Clemente, Saul
Cabanillas-Carbonell, Michael
dc.subject.es_PE.fl_str_mv Techniques
Machine learning
Classification
Twitter
topic Techniques
Machine learning
Classification
Twitter
https://purl.org/pe-repo/ocde/ford#2.02.04
dc.subject.ocde.es_PE.fl_str_mv https://purl.org/pe-repo/ocde/ford#2.02.04
description Abstract: Today, web content such as images, text, speech, and video is user-generated, and social networks have become increasingly popular as a means for people to share their ideas and opinions. Twitter is one of the most popular social media platforms for expressing feelings about current events. The main objective of this study is to classify and analyze the content published on Twitter by affiliates of the Pension and Funds Administration (AFP). The study incorporates machine learning techniques for data mining, cleaning, tokenization, exploratory analysis, classification, and sentiment analysis. Data were gathered from Twitter using the hashtag #afp, followed by descriptive and exploratory analysis, including metrics of the tweets. A content analysis was then carried out, including word frequency calculation, lemmatization, classification of words by sentiment and emotion, and a word cloud. The study uses tweets published in May 2022. Sentiment was distributed across three polarity classes, positive, neutral, and negative, representing 22%, 4%, and 74% respectively. Using unsupervised learning with the K-Means algorithm, the number of clusters was determined with the elbow method. Finally, the sentiment analysis and the resulting clusters indicate a very pronounced dispersion: the distances are not very similar, even though the data were standardized.
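The abstract mentions standardizing the data and choosing the number of K-Means clusters with the elbow method. The following is only a minimal sketch of that general technique, not the authors' implementation: the toy 2-D points, the function names (`standardize`, `kmeans`, `elbow_curve`), and the restart count are all assumptions made for illustration, and the record does not say which features or libraries the study used.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Scale every column to zero mean and unit variance (z-scores)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((v - m) / s for v, m, s in zip(r, mus, sds)) for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, inertia)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members; keep it if empty.
        updated = [
            tuple(mean(dim) for dim in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    inertia = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, inertia


def elbow_curve(points, k_values, restarts=5):
    """Lowest inertia found for each k over several random starts."""
    return {
        k: min(kmeans(points, k, seed=s)[1] for s in range(restarts))
        for k in k_values
    }


# Hypothetical toy data: three synthetic blobs standing in for tweet features.
rng = random.Random(42)
toy = [
    (cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3))
    for cx, cy in [(0, 0), (5, 5), (0, 5)]
    for _ in range(30)
]
scaled = standardize(toy)
curve = elbow_curve(scaled, range(1, 7))
for k, inertia in curve.items():
    print(k, round(inertia, 2))
```

Reading the printed inertia values from k = 1 upward, the "elbow" is the value of k after which further clusters reduce the inertia only marginally. If the curve flattens slowly with no clear bend, that is consistent with the kind of pronounced dispersion the abstract reports.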
publishDate 2022
dc.date.accessioned.none.fl_str_mv 2022-07-21T17:21:26Z
dc.date.available.none.fl_str_mv 2022-07-21T17:21:26Z
dc.date.issued.fl_str_mv 2022
dc.type.es_PE.fl_str_mv info:eu-repo/semantics/article
format article
dc.identifier.uri.none.fl_str_mv https://hdl.handle.net/20.500.13067/1983
dc.identifier.journal.es_PE.fl_str_mv (IJACSA) International Journal of Advanced Computer Science and Applications
dc.identifier.doi.none.fl_str_mv http://dx.doi.org/10.14569/IJACSA.2022.0130669
url https://hdl.handle.net/20.500.13067/1983
http://dx.doi.org/10.14569/IJACSA.2022.0130669
identifier_str_mv (IJACSA) International Journal of Advanced Computer Science and Applications
dc.language.iso.es_PE.fl_str_mv eng
language eng
dc.rights.es_PE.fl_str_mv info:eu-repo/semantics/openAccess
dc.rights.uri.es_PE.fl_str_mv https://creativecommons.org/licenses/by-nc-nd/4.0/
eu_rights_str_mv openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.format.es_PE.fl_str_mv application/pdf
dc.publisher.es_PE.fl_str_mv SAI The Science and Information Organization
dc.publisher.country.es_PE.fl_str_mv US
dc.source.none.fl_str_mv reponame:AUTONOMA-Institucional
instname:Universidad Autónoma del Perú
instacron:AUTONOMA
instname_str Universidad Autónoma del Perú
instacron_str AUTONOMA
institution AUTONOMA
reponame_str AUTONOMA-Institucional
collection AUTONOMA-Institucional
dc.source.volume.es_PE.fl_str_mv 13
dc.source.issue.es_PE.fl_str_mv 16
dc.source.beginpage.es_PE.fl_str_mv 571
dc.source.endpage.es_PE.fl_str_mv 578
bitstream.url.fl_str_mv http://repositorio.autonoma.edu.pe/bitstream/20.500.13067/1983/1/Paper_69-Sentiment_Analysis_of_Tweets_using_Unsupervised_Learning_Techniques.pdf
http://repositorio.autonoma.edu.pe/bitstream/20.500.13067/1983/2/license.txt
http://repositorio.autonoma.edu.pe/bitstream/20.500.13067/1983/3/Paper_69-Sentiment_Analysis_of_Tweets_using_Unsupervised_Learning_Techniques.pdf.txt
http://repositorio.autonoma.edu.pe/bitstream/20.500.13067/1983/4/Paper_69-Sentiment_Analysis_of_Tweets_using_Unsupervised_Learning_Techniques.pdf.jpg
bitstream.checksum.fl_str_mv 22393ed499a8c8baa98a5ab35ca6334e
9243398ff393db1861c890baeaeee5f9
02808a0598530eff6c0d4a297b502f66
c9962ea9dde7c0b73943d13d94ceeeb9
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositorio de la Universidad Autonoma del Perú
repository.mail.fl_str_mv repositorio@autonoma.pe
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the National Digital Repository of Science, Technology and Innovation of Open Access (ALICIA).