Sentiment Analysis of Tweets using Unsupervised Learning Techniques and the K-Means Algorithm
Article Description
Today, web content such as images, text, speeches, and videos are user-generated, and social networks have become increasingly popular as a means for people to share their ideas and opinions. One of the most popular social media for expressing their feelings towards events that occur is Twitter. The...
| Authors: | Iparraguirre-Villanueva, Orlando; Guevara-Ponce, Victor; Sierra-Liñan, Fernando; Beltozar-Clemente, Saul; Cabanillas-Carbonel, Michael |
|---|---|
| Format: | article |
| Publication date: | 2022 |
| Institution: | Universidad Privada Norbert Wiener |
| Repository: | UWIENER-Institucional |
| Language: | English |
| OAI Identifier: | oai:repositorio.uwiener.edu.pe:20.500.13053/6918 |
| Resource link: | https://hdl.handle.net/20.500.13053/6918 |
| Access level: | open access |
| Subject: | Techniques; machine learning; classification; twitter http://purl.org/pe-repo/ocde/ford#3.03.00 |
| id | UWIE_5822dd9b3efc10d69e415f0a583a72fa |
|---|---|
| oai_identifier_str | oai:repositorio.uwiener.edu.pe:20.500.13053/6918 |
| network_acronym_str | UWIE |
| network_name_str | UWIENER-Institucional |
| repository_id_str | 9398 |
| dc.title.es_ES.fl_str_mv | Sentiment Analysis of Tweets using Unsupervised Learning Techniques and the K-Means Algorithm |
| title | Sentiment Analysis of Tweets using Unsupervised Learning Techniques and the K-Means Algorithm |
| spellingShingle | Sentiment Analysis of Tweets using Unsupervised Learning Techniques and the K-Means Algorithm Iparraguirre-Villanueva, Orlando Techniques; machine learning; classification; twitter http://purl.org/pe-repo/ocde/ford#3.03.00 |
| title_short | Sentiment Analysis of Tweets using Unsupervised Learning Techniques and the K-Means Algorithm |
| title_full | Sentiment Analysis of Tweets using Unsupervised Learning Techniques and the K-Means Algorithm |
| title_fullStr | Sentiment Analysis of Tweets using Unsupervised Learning Techniques and the K-Means Algorithm |
| title_full_unstemmed | Sentiment Analysis of Tweets using Unsupervised Learning Techniques and the K-Means Algorithm |
| title_sort | Sentiment Analysis of Tweets using Unsupervised Learning Techniques and the K-Means Algorithm |
| author | Iparraguirre-Villanueva, Orlando |
| author_facet | Iparraguirre-Villanueva, Orlando; Guevara-Ponce, Victor; Sierra-Liñan, Fernando; Beltozar-Clemente, Saul; Cabanillas-Carbonel, Michael |
| author_role | author |
| author2 | Guevara-Ponce, Victor; Sierra-Liñan, Fernando; Beltozar-Clemente, Saul; Cabanillas-Carbonel, Michael |
| author2_role | author; author; author; author |
| dc.contributor.author.fl_str_mv | Iparraguirre-Villanueva, Orlando; Guevara-Ponce, Victor; Sierra-Liñan, Fernando; Beltozar-Clemente, Saul; Cabanillas-Carbonel, Michael |
| dc.subject.es_ES.fl_str_mv | Techniques; machine learning; classification; twitter |
| topic | Techniques; machine learning; classification; twitter http://purl.org/pe-repo/ocde/ford#3.03.00 |
| dc.subject.ocde.es_ES.fl_str_mv | http://purl.org/pe-repo/ocde/ford#3.03.00 |
| description | Today, web content such as images, text, speech, and video is user-generated, and social networks have become increasingly popular as a means for people to share their ideas and opinions. Twitter is one of the most popular social media platforms for expressing feelings about events as they occur. The main objective of this study is to classify and analyze the content published on Twitter by affiliates of the Pension and Funds Administration (AFP). The study incorporates machine learning techniques for data mining, cleaning, tokenization, exploratory analysis, classification, and sentiment analysis. Data were collected from Twitter using the hashtag #afp, followed by descriptive and exploratory analysis, including tweet metrics. A content analysis was then carried out, including word frequency calculation, lemmatization, classification of words by sentiment and emotion, and a word cloud. The study uses tweets published in May 2022. Sentiment was distributed across three polarity classes: positive, neutral, and negative, representing 22%, 4%, and 74%, respectively. Using unsupervised learning with the K-Means algorithm, the number of clusters was determined with the elbow method. Finally, the sentiment analysis and the resulting clusters show a very pronounced dispersion: the distances are not very similar, even though the data were standardized. |
| publishDate | 2022 |
| dc.date.accessioned.none.fl_str_mv | 2022-10-24T20:39:00Z |
| dc.date.available.none.fl_str_mv | 2022-10-24T20:39:00Z |
| dc.date.issued.fl_str_mv | 2022 |
| dc.type.es_ES.fl_str_mv | info:eu-repo/semantics/article |
| dc.type.version.es_ES.fl_str_mv | info:eu-repo/semantics/publishedVersion |
| format | article |
| status_str | publishedVersion |
| dc.identifier.uri.none.fl_str_mv | https://hdl.handle.net/20.500.13053/6918 |
| dc.identifier.doi.es_ES.fl_str_mv | 10.14569/IJACSA.2022.0130669 |
| url | https://hdl.handle.net/20.500.13053/6918 |
| identifier_str_mv | 10.14569/IJACSA.2022.0130669 |
| dc.language.iso.es_ES.fl_str_mv | eng |
| language | eng |
| dc.rights.es_ES.fl_str_mv | info:eu-repo/semantics/openAccess |
| eu_rights_str_mv | openAccess |
| dc.format.es_ES.fl_str_mv | application/pdf |
| dc.publisher.es_ES.fl_str_mv | Science and Information Organization |
| dc.publisher.country.es_ES.fl_str_mv | GB |
| dc.source.none.fl_str_mv | reponame:UWIENER-Institucional instname:Universidad Privada Norbert Wiener instacron:UWIENER |
| instname_str | Universidad Privada Norbert Wiener |
| instacron_str | UWIENER |
| institution | UWIENER |
| reponame_str | UWIENER-Institucional |
| collection | UWIENER-Institucional |
| bitstream.url.fl_str_mv | https://dspace-uwiener.metabuscador.org/bitstreams/4717ff3d-6d7b-4dda-bf26-42c567292cf7/download https://dspace-uwiener.metabuscador.org/bitstreams/89d03da0-df98-4c1f-8ebe-384c15857a78/download https://dspace-uwiener.metabuscador.org/bitstreams/545a8ca3-26c6-4f48-a869-19c4f4759059/download https://dspace-uwiener.metabuscador.org/bitstreams/f4e76c40-610c-4ab4-83ff-2c03296277ca/download |
| bitstream.checksum.fl_str_mv | 8a4605be74aa9ea9d79846c1fba20a33 2b6ee5d29a1a6a3f2612308f5d746d67 22393ed499a8c8baa98a5ab35ca6334e 5a0c7c3bd6982e9362bd9b3de55bb7fc |
| bitstream.checksumAlgorithm.fl_str_mv | MD5 MD5 MD5 MD5 |
| repository.name.fl_str_mv | Repositorio Institucional de la Universidad de Wiener |
| repository.mail.fl_str_mv | bdigital@metabiblioteca.com |
| _version_ | 1835828820777631744 |
| score | 13.924246 |
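
The abstract in this record describes standardizing the data, clustering with K-Means, and choosing the number of clusters with the elbow method. The following is a minimal sketch of that workflow, not the authors' code: the toy features (a polarity score and a tweet length), the function names, and the pure-Python Lloyd's iteration are all assumptions made for illustration.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Scale each column to zero mean and unit variance (z-scores)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels, inertia)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [mean(c) for c in zip(*members)]
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia


# Hypothetical toy data: (polarity score, tweet length) for nine tweets.
tweets = [
    (0.8, 10), (0.9, 12), (0.7, 11),
    (-0.8, 40), (-0.9, 42), (-0.7, 41),
    (0.0, 80), (0.1, 82), (-0.1, 81),
]
scaled = standardize(tweets)

# Elbow method: record the within-cluster sum of squares for each k.
inertias = {k: kmeans(scaled, k)[2] for k in range(1, 6)}
for k, value in inertias.items():
    print(k, round(value, 3))
```

Under these assumptions, the "elbow" is the value of k after which the printed inertia stops dropping sharply. Because Lloyd's algorithm only finds a local optimum, a real analysis would normally repeat each k with several random seeds and keep the lowest inertia.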
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the National Open Access Digital Repository of Science, Technology and Innovation (ALICIA).