Sentiment Analysis of Tweets using Unsupervised Learning Techniques and the K-Means Algorithm

Article Description

Today, much web content, including images, text, speech, and video, is user-generated, and social networks have become increasingly popular as a means for people to share their ideas and opinions. Twitter is one of the most popular social media platforms for expressing feelings about current events. The...

Bibliographic Details
Authors: Iparraguirre-Villanueva, Orlando; Guevara-Ponce, Victor; Sierra-Liñan, Fernando; Beltozar-Clemente, Saul; Cabanillas-Carbonel, Michael
Format: article
Publication Date: 2022
Institution: Universidad Privada Norbert Wiener
Repository: UWIENER-Institucional
Language: English
OAI Identifier: oai:repositorio.uwiener.edu.pe:20.500.13053/6918
Resource Link: https://hdl.handle.net/20.500.13053/6918
Access Level: open access
Subject: Techniques; machine learning; classification; twitter
http://purl.org/pe-repo/ocde/ford#3.03.00
id UWIE_5822dd9b3efc10d69e415f0a583a72fa
oai_identifier_str oai:repositorio.uwiener.edu.pe:20.500.13053/6918
network_acronym_str UWIE
network_name_str UWIENER-Institucional
repository_id_str 9398
dc.title.es_ES.fl_str_mv Sentiment Analysis of Tweets using Unsupervised Learning Techniques and the K-Means Algorithm
dc.contributor.author.fl_str_mv Iparraguirre-Villanueva, Orlando
Guevara-Ponce, Victor
Sierra-Liñan, Fernando
Beltozar-Clemente, Saul
Cabanillas-Carbonel, Michael
dc.subject.es_ES.fl_str_mv Techniques; machine learning; classification; twitter
dc.subject.ocde.es_ES.fl_str_mv http://purl.org/pe-repo/ocde/ford#3.03.00
description Today, much web content, including images, text, speech, and video, is user-generated, and social networks have become increasingly popular as a means for people to share their ideas and opinions. Twitter is one of the most popular social media platforms for expressing feelings about current events. The main objective of this study is to classify and analyze the content published on Twitter by affiliates of the Pension and Funds Administration (AFP). The study applies machine learning techniques for data mining, cleaning, tokenization, exploratory analysis, classification, and sentiment analysis. Data were collected from Twitter using the hashtag #afp, followed by descriptive and exploratory analysis, including tweet metrics. A content analysis was then carried out, including word frequency calculation, lemmatization, classification of words by sentiment and emotion, and a word cloud. The study uses tweets published in May 2022. Sentiment was distributed across three polarity classes, positive, neutral, and negative, representing 22%, 4%, and 74% respectively. Using unsupervised learning with the K-Means algorithm, the number of clusters was determined with the elbow method. Finally, the sentiment analysis and the resulting clusters indicate a very pronounced dispersion: the distances are not very similar, even though the data were standardized.
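The abstract mentions standardizing the data, clustering with K-Means, and choosing the number of clusters with the elbow method. The following is a minimal sketch of that sequence in plain Python, not the authors' implementation. The toy two-column features, the function names, the number of restarts, and the "largest second difference" rule for locating the elbow are all assumptions made for illustration; the record does not state which features or which elbow criterion the study used.

```python
import random

def standardize(rows):
    """Z-score each column using the population standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(r, means, stds)) for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points]

def kmeans(points, k, n_init=30, max_iter=100, seed=0):
    """Lloyd's algorithm with random restarts.
    Returns (inertia, centroids, labels) for the best restart."""
    rng = random.Random(seed)
    best = None
    for _restart in range(n_init):
        centroids = rng.sample(points, k)  # initial centroids drawn from the data
        for _step in range(max_iter):
            labels = assign(points, centroids)
            updated = []
            for c in range(k):
                members = [p for p, l in zip(points, labels) if l == c]
                # Keep the old centroid if a cluster ends up empty.
                updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                               if members else centroids[c])
            if updated == centroids:
                break
            centroids = updated
        labels = assign(points, centroids)
        inertia = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
        if best is None or inertia < best[0]:
            best = (inertia, centroids, labels)
    return best

def elbow_k(inertias):
    """inertias[i] holds the inertia for k = i + 1.
    Heuristic (an assumption): pick the k with the largest second difference."""
    bends = {k: (inertias[k - 2] - inertias[k - 1]) - (inertias[k - 1] - inertias[k])
             for k in range(2, len(inertias))}
    return max(bends, key=bends.get)

# Hypothetical per-tweet features: three tight groups of four points each.
centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
offsets = [(-0.2, -0.2), (0.2, -0.2), (-0.2, 0.2), (0.2, 0.2)]
features = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

scaled = standardize(features)
inertias = [kmeans(scaled, k)[0] for k in range(1, 7)]  # k = 1..6
best_k = elbow_k(inertias)
print("inertia per k:", [round(v, 3) for v in inertias])
print("elbow at k =", best_k)
```

In practice the elbow is often read off a plot of inertia against k rather than computed. When the inertia curve falls gradually with no sharp bend, the clusters are not well separated, which may correspond to the dispersion the abstract reports.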
publishDate 2022
dc.date.accessioned.none.fl_str_mv 2022-10-24T20:39:00Z
dc.date.available.none.fl_str_mv 2022-10-24T20:39:00Z
dc.date.issued.fl_str_mv 2022
dc.type.es_ES.fl_str_mv info:eu-repo/semantics/article
dc.type.version.es_ES.fl_str_mv info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.none.fl_str_mv https://hdl.handle.net/20.500.13053/6918
dc.identifier.doi.es_ES.fl_str_mv 10.14569/IJACSA.2022.0130669
url https://hdl.handle.net/20.500.13053/6918
identifier_str_mv 10.14569/IJACSA.2022.0130669
dc.language.iso.es_ES.fl_str_mv eng
language eng
dc.rights.es_ES.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.es_ES.fl_str_mv application/pdf
dc.publisher.es_ES.fl_str_mv Science and Information Organization
dc.publisher.country.es_ES.fl_str_mv GB
dc.source.none.fl_str_mv reponame:UWIENER-Institucional
instname:Universidad Privada Norbert Wiener
instacron:UWIENER
instname_str Universidad Privada Norbert Wiener
instacron_str UWIENER
institution UWIENER
reponame_str UWIENER-Institucional
collection UWIENER-Institucional
bitstream.url.fl_str_mv https://dspace-uwiener.metabuscador.org/bitstreams/4717ff3d-6d7b-4dda-bf26-42c567292cf7/download
https://dspace-uwiener.metabuscador.org/bitstreams/89d03da0-df98-4c1f-8ebe-384c15857a78/download
https://dspace-uwiener.metabuscador.org/bitstreams/545a8ca3-26c6-4f48-a869-19c4f4759059/download
https://dspace-uwiener.metabuscador.org/bitstreams/f4e76c40-610c-4ab4-83ff-2c03296277ca/download
bitstream.checksum.fl_str_mv 8a4605be74aa9ea9d79846c1fba20a33
2b6ee5d29a1a6a3f2612308f5d746d67
22393ed499a8c8baa98a5ab35ca6334e
5a0c7c3bd6982e9362bd9b3de55bb7fc
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositorio Institucional de la Universidad de Wiener
repository.mail.fl_str_mv bdigital@metabiblioteca.com
_version_ 1835828820777631744
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the National Digital Repository of Science, Technology and Innovation of Open Access (ALICIA).