A deep learning approach for sentiment analysis in Spanish Tweets
Article Description
The present work was supported by grant 234-2015-FONDECYT (Master Program) from Cienciactiva of the National Council for Science, Technology and Technological Innovation (CONCYTEC-PERU) and the Research Office of Universidad Nacional de Ingeniería (VRI - UNI).
Authors: | Vizcarra G., Mauricio A., Mauricio L. |
---|---|
Format: | conference object |
Publication Date: | 2018 |
Institution: | Consejo Nacional de Ciencia Tecnología e Innovación |
Repository: | CONCYTEC-Institucional |
Language: | English |
OAI Identifier: | oai:repositorio.concytec.gob.pe:20.500.12390/485 |
Resource link: | https://hdl.handle.net/20.500.12390/485 https://doi.org/10.1007/978-3-030-01424-7_61 |
Access level: | open access |
Subject: | Spanish tweets; Convolution; Data mining; Learning algorithms; Matrix algebra; Natural language processing systems; Network architecture; Neural networks; Sentiment analysis; Architecture designs; Architectures and models; Convolutional Neural Networks (CNN); English languages; Learning approach; Pre-processing algorithms; Deep learning; https://purl.org/pe-repo/ocde/ford#1.02.01 |
id |
CONC_aa7c16151378a974d5935ea2353d8df7 |
---|---|
oai_identifier_str |
oai:repositorio.concytec.gob.pe:20.500.12390/485 |
network_acronym_str |
CONC |
network_name_str |
CONCYTEC-Institucional |
repository_id_str |
4689 |
dc.title.none.fl_str_mv |
A deep learning approach for sentiment analysis in Spanish Tweets |
author |
Vizcarra G. |
author_facet |
Vizcarra G. Mauricio A. Mauricio L. |
author_role |
author |
author2 |
Mauricio A. Mauricio L. |
author2_role |
author author |
dc.contributor.author.fl_str_mv |
Vizcarra G. Mauricio A. Mauricio L. |
dc.subject.none.fl_str_mv |
Spanish tweets |
topic |
Spanish tweets; Convolution; Data mining; Learning algorithms; Matrix algebra; Natural language processing systems; Network architecture; Neural networks; Sentiment analysis; Architecture designs; Architectures and models; Convolutional Neural Networks (CNN); English languages; Learning approach; Pre-processing algorithms; Deep learning; https://purl.org/pe-repo/ocde/ford#1.02.01 |
dc.subject.es_PE.fl_str_mv |
Convolution; Data mining; Learning algorithms; Matrix algebra; Natural language processing systems; Network architecture; Neural networks; Sentiment analysis; Architecture designs; Architectures and models; Convolutional Neural Networks (CNN); English languages; Learning approach; Pre-processing algorithms; Deep learning |
dc.subject.ocde.none.fl_str_mv |
https://purl.org/pe-repo/ocde/ford#1.02.01 |
description |
The present work was supported by grant 234-2015-FONDECYT (Master Program) from Cienciactiva of the National Council for Science, Technology and Technological Innovation (CONCYTEC-PERU) and the Research Office of Universidad Nacional de Ingeniería (VRI - UNI). |
publishDate |
2018 |
dc.date.accessioned.none.fl_str_mv |
2024-05-30T23:13:38Z |
dc.date.available.none.fl_str_mv |
2024-05-30T23:13:38Z |
dc.date.issued.fl_str_mv |
2018 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/conferenceObject |
format |
conferenceObject |
dc.identifier.isbn.none.fl_str_mv |
9783030014230 |
dc.identifier.uri.none.fl_str_mv |
https://hdl.handle.net/20.500.12390/485 |
dc.identifier.doi.none.fl_str_mv |
https://doi.org/10.1007/978-3-030-01424-7_61 |
dc.identifier.scopus.none.fl_str_mv |
2-s2.0-85054825905 |
identifier_str_mv |
9783030014230 2-s2.0-85054825905 |
url |
https://hdl.handle.net/20.500.12390/485 https://doi.org/10.1007/978-3-030-01424-7_61 |
dc.language.iso.none.fl_str_mv |
eng |
language |
eng |
dc.relation.ispartof.none.fl_str_mv |
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Springer Verlag |
publisher.none.fl_str_mv |
Springer Verlag |
dc.source.none.fl_str_mv |
reponame:CONCYTEC-Institucional instname:Consejo Nacional de Ciencia Tecnología e Innovación instacron:CONCYTEC |
instname_str |
Consejo Nacional de Ciencia Tecnología e Innovación |
instacron_str |
CONCYTEC |
institution |
CONCYTEC |
reponame_str |
CONCYTEC-Institucional |
collection |
CONCYTEC-Institucional |
repository.name.fl_str_mv |
Repositorio Institucional CONCYTEC |
repository.mail.fl_str_mv |
repositorio@concytec.gob.pe |
_version_ |
1839175385909035008 |
abstract |
Document-level sentiment analysis is a well-known problem in Natural Language Processing (NLP) and serves as a benchmark on which new architectures and models are tested, using metrics that are also reference points for other tasks. The problem is solved reasonably well for English, but its metrics remain quite low for other languages. In addition, architectures that succeed in one language do not necessarily work in another. For Spanish, data quantity and quality become a problem during data preparation and architecture design, because little labeled data is available and it includes non-textual elements (such as emoticons or expressions). This work presents an approach to sentiment analysis in Spanish tweets and compares it with the state of the art. A preprocessing algorithm first interprets colloquial expressions and emoticons and removes trivial words. Processed sentences are turned into matrices using the three most successful word-embedding methods (GloVe, FastText and Word2Vec); the three matrices are then merged into a 3-channel matrix that feeds our CNN-based model. The proposed architecture uses parallel convolution layers as k-grams, so that the value of each word and its context is weighted, to predict the sentiment polarity among 4 possible classes. After several tests, the optimal tuple, which gave the best accuracy, was <1, 2>. Finally, our model achieves 61.58% and 71.14% accuracy on InterTASS and General Corpus, respectively. |
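The pipeline summarized in the abstract (three word-embedding matrices merged into a 3-channel input, parallel convolution branches acting as k-grams, and a 4-class output) can be illustrated with a minimal sketch. This is not the authors' implementation: the dimensions, the random stand-in embeddings and weights, the ReLU activation, the max-over-time pooling, the dense layer, and the reading of the reported tuple <1, 2> as convolution widths are all assumptions made for illustration.

```python
import math
import random

random.seed(0)

SEQ_LEN = 6             # tokens per tweet after preprocessing (hypothetical)
EMB_DIM = 4             # embedding size shared by all three methods (hypothetical)
N_FILTERS = 3           # filters per convolution branch (hypothetical)
N_CLASSES = 4           # four polarity classes, as stated in the abstract
KERNEL_WIDTHS = (1, 2)  # the <1, 2> tuple, assumed here to mean k-gram widths


def random_matrix(rows, cols):
    """Stand-in for a real embedding lookup or a trained weight matrix."""
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def stack_channels(glove, fasttext, word2vec):
    """Merge three (SEQ_LEN x EMB_DIM) matrices into one 3-channel tensor."""
    return [glove, fasttext, word2vec]  # shape: 3 x SEQ_LEN x EMB_DIM


def conv_branch(tensor, filters, width):
    """Slide each filter over `width` consecutive words, then max-pool over time."""
    n_positions = len(tensor[0]) - width + 1
    pooled = []
    for filt in filters:  # each filter has shape 3 x width x EMB_DIM
        activations = []
        for start in range(n_positions):
            total = 0.0
            for c, channel in enumerate(tensor):
                for offset in range(width):
                    row = channel[start + offset]
                    total += sum(w * x for w, x in zip(filt[c][offset], row))
            activations.append(max(0.0, total))  # ReLU (assumed activation)
        pooled.append(max(activations))  # max-over-time pooling (assumed)
    return pooled


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(tensor, branch_filters, dense):
    """Run the parallel branches, concatenate their features, and classify."""
    features = []
    for width, filters in branch_filters.items():
        features.extend(conv_branch(tensor, filters, width))
    logits = [sum(w * f for w, f in zip(row, features)) for row in dense]
    return softmax(logits)


# One tweet represented by three random "embedding" matrices.
tweet = stack_channels(*(random_matrix(SEQ_LEN, EMB_DIM) for _ in range(3)))
# One set of filters per kernel width; every filter spans all 3 channels.
branch_filters = {
    k: [[random_matrix(k, EMB_DIM) for _ in range(3)] for _ in range(N_FILTERS)]
    for k in KERNEL_WIDTHS
}
dense = random_matrix(N_CLASSES, N_FILTERS * len(KERNEL_WIDTHS))
probs = predict(tweet, branch_filters, dense)
print(probs)
```

With untrained random weights the printed probabilities carry no meaning; the sketch only shows how the three channels, the width-1 and width-2 branches, and the 4-way output fit together.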
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository containing this document or data set. CONCYTEC is not responsible for the content (publications and/or data) accessible through the National Digital Repository of Science, Technology and Innovation of Open Access (ALICIA).