A deep learning approach for sentiment analysis in Spanish Tweets

Article Description

The present work was supported by grant 234-2015-FONDECYT (Master Program) from Cienciactiva of the National Council for Science, Technology and Technological Innovation (CONCYTEC-PERU) and the Office Research of Universidad Nacional de Ingeniería (VRI - UNI).
Bibliographic Details
Authors: Vizcarra G., Mauricio A., Mauricio L.
Format: conference object
Publication Date: 2018
Institution: Consejo Nacional de Ciencia Tecnología e Innovación
Repository: CONCYTEC-Institucional
Language: English
OAI Identifier: oai:repositorio.concytec.gob.pe:20.500.12390/485
Resource link: https://hdl.handle.net/20.500.12390/485
https://doi.org/10.1007/978-3-030-01424-7_61
Access level: open access
Subject: Spanish tweets
Convolution
Data mining
Learning algorithms
Matrix algebra
Natural language processing systems
Network architecture
Neural networks
Sentiment analysis
Architecture designs
Architectures and models
Convolutional Neural Networks (CNN)
English languages
Learning approach
Pre-processing algorithms
Deep learning
https://purl.org/pe-repo/ocde/ford#1.02.01
id CONC_aa7c16151378a974d5935ea2353d8df7
oai_identifier_str oai:repositorio.concytec.gob.pe:20.500.12390/485
network_acronym_str CONC
network_name_str CONCYTEC-Institucional
repository_id_str 4689
dc.title.none.fl_str_mv A deep learning approach for sentiment analysis in Spanish Tweets
dc.contributor.author.fl_str_mv Vizcarra G.
Mauricio A.
Mauricio L.
dc.subject.ocde.none.fl_str_mv https://purl.org/pe-repo/ocde/ford#1.02.01
publishDate 2018
dc.date.accessioned.none.fl_str_mv 2024-05-30T23:13:38Z
dc.date.available.none.fl_str_mv 2024-05-30T23:13:38Z
dc.date.issued.fl_str_mv 2018
dc.type.none.fl_str_mv info:eu-repo/semantics/conferenceObject
format conferenceObject
dc.identifier.isbn.none.fl_str_mv 9783030014230
dc.identifier.uri.none.fl_str_mv https://hdl.handle.net/20.500.12390/485
dc.identifier.doi.none.fl_str_mv https://doi.org/10.1007/978-3-030-01424-7_61
dc.identifier.scopus.none.fl_str_mv 2-s2.0-85054825905
dc.language.iso.none.fl_str_mv eng
language eng
dc.relation.ispartof.none.fl_str_mv Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Springer Verlag
publisher.none.fl_str_mv Springer Verlag
dc.source.none.fl_str_mv reponame:CONCYTEC-Institucional
instname:Consejo Nacional de Ciencia Tecnología e Innovación
instacron:CONCYTEC
repository.name.fl_str_mv Repositorio Institucional CONCYTEC
repository.mail.fl_str_mv repositorio@concytec.gob.pe
Abstract: Sentiment analysis at document level is a well-known problem in Natural Language Processing (NLP) and is considered a reference task on which new architectures and models are tested in order to compare metrics that also serve as references for other problems. This problem has been solved reasonably well for English, but its metrics are still quite low in other languages. In addition, architectures that are successful in one language do not necessarily work in another. In the case of Spanish, data quantity and quality become a problem during data preparation and architecture design, due to the small amount of labeled data available, including non-textual elements (such as emoticons or expressions). This work presents an approach to the sentiment analysis problem in Spanish tweets and compares it with the state of the art. To do so, a preprocessing algorithm is applied, based on the interpretation of colloquial expressions and emoticons and the elimination of trivial words. Processed sentences are turned into matrices using the three most successful word embedding methods (GloVe, FastText and Word2Vec); the three matrices are then merged into a 3-channel matrix that is fed to our CNN-based model. The proposed architecture uses parallel convolution layers as k-grams, so that the value of each word and its context is weighted, to predict the sentiment polarity among 4 possible classes. After several tests, the optimal tuple, which improves the accuracy, was <1, 2>. Finally, our model achieves 61.58% and 71.14% accuracy on InterTASS and the General Corpus, respectively.
COAR resource type: http://purl.org/coar/resource_type/c_1843
COAR access right: http://purl.org/coar/access_right/c_14cb (metadata only access)
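The architecture outlined in the abstract (three embedding matrices stacked into a 3-channel input, parallel convolutions acting as k-grams with widths 1 and 2, and a 4-class output) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the random untrained weights, the zero vector for unknown words, the ReLU activation and the max-over-time pooling are all assumptions made for illustration, and the random lookup tables merely stand in for GloVe, FastText and Word2Vec vectors.

```python
import numpy as np

def embed(tokens, tables, dim):
    """Stack one matrix per embedding table into a (channels, words, dim) array."""
    channels = []
    for table in tables:
        # Assumption of this sketch: unknown words map to a zero vector.
        rows = [table.get(tok, np.zeros(dim)) for tok in tokens]
        channels.append(np.stack(rows))
    return np.stack(channels)

def conv_kgram(x, kernel):
    """Convolve over k consecutive words, then apply ReLU and max-over-time pooling.

    x: (channels, words, dim); kernel: (filters, channels, k, dim).
    Returns one pooled feature per filter.
    """
    k = kernel.shape[2]
    n_windows = x.shape[1] - k + 1
    # Every window covers k consecutive words across all channels.
    windows = np.stack([x[:, i:i + k, :] for i in range(n_windows)])
    feats = np.einsum("wckd,fckd->wf", windows, kernel)
    return np.maximum(feats, 0.0).max(axis=0)

def predict(x, kernels, w_out, b_out):
    """Run the parallel k-gram branches, concatenate them, and apply a softmax."""
    pooled = np.concatenate([conv_kgram(x, ker) for ker in kernels])
    logits = pooled @ w_out + b_out
    exp = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return exp / exp.sum()

# Hypothetical toy setup: random vectors stand in for the three pretrained embeddings.
rng = np.random.default_rng(0)
DIM, FILTERS, CLASSES = 8, 5, 4
tokens = ["me", "encanta", "este", "dia"]
tables = [{tok: rng.normal(size=DIM) for tok in tokens} for _ in range(3)]

x = embed(tokens, tables, DIM)  # shape (3, 4, 8): channels, words, dim
kernels = [rng.normal(size=(FILTERS, 3, k, DIM)) for k in (1, 2)]  # the <1, 2> tuple
w_out = rng.normal(size=(FILTERS * len(kernels), CLASSES))
b_out = np.zeros(CLASSES)

probs = predict(x, kernels, w_out, b_out)  # one probability per polarity class
```

Because the weights are random, the resulting probabilities carry no meaning; the sketch only shows how the tensor shapes flow from the stacked embeddings through the parallel branches to the four class scores.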
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or data set is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA).