Classification of Tweets Related to Natural Disasters Using Machine Learning Algorithms
Article Description
Abstract: In recent years, computer science has advanced exponentially, helping significantly to identify and classify text extracted from social networks, specifically Twitter. This work identifies, classifies, and analyzes tweets related to real natural disasters through tweets with the hashtag #Na...
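
To make the classification step concrete, here is a minimal, dependency-free sketch of a Bernoulli Naive Bayes text classifier, one of the model families the article evaluates. It is not the authors' implementation: this record does not include their code, so the `tokenize` helper, the `BernoulliNB` class, the Laplace smoothing value, and the four toy tweets are all illustrative assumptions, and the simple regex tokenizer stands in for the stemming and lemmatization the article applies.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and return its set of alphabetic tokens.

    Hypothetical stand-in for the article's cleaning step; no stemming
    or lemmatization is performed here.
    """
    return set(re.findall(r"[a-z]+", text.lower()))


class BernoulliNB:
    """Toy Bernoulli Naive Bayes over word presence/absence features."""

    def fit(self, docs, labels, alpha=1.0):
        self.classes = sorted(set(labels))
        token_sets = [tokenize(d) for d in docs]
        self.vocab = sorted(set().union(*token_sets))
        n_docs = Counter(labels)
        self.log_prior = {c: math.log(n_docs[c] / len(docs)) for c in self.classes}
        self.p = {}
        for c in self.classes:
            # Document frequency of each word within class c.
            df = Counter()
            for toks, y in zip(token_sets, labels):
                if y == c:
                    df.update(toks)
            # Laplace-smoothed P(word present | class).
            self.p[c] = {
                w: (df[w] + alpha) / (n_docs[c] + 2 * alpha) for w in self.vocab
            }
        return self

    def predict(self, doc):
        toks = tokenize(doc)

        def score(c):
            s = self.log_prior[c]
            for w in self.vocab:
                p = self.p[c][w]
                # Absent words contribute too; that is the Bernoulli variant.
                s += math.log(p if w in toks else 1.0 - p)
            return s

        return max(self.classes, key=score)


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label equals the true label."""
    hits = sum(model.predict(d) == y for d, y in zip(docs, labels))
    return hits / len(docs)


# Toy data invented for illustration; not taken from the article's dataset.
TRAIN_DOCS = [
    "earthquake destroys homes",
    "flood warning issued",
    "great pizza tonight",
    "love this song",
]
TRAIN_LABELS = ["disaster", "disaster", "other", "other"]

model = BernoulliNB().fit(TRAIN_DOCS, TRAIN_LABELS)
print(model.predict("flood destroys homes"))
print(accuracy(model, TRAIN_DOCS, TRAIN_LABELS))
```

On a dataset of realistic size one would normally use an established library and a held-out test split; the sketch only shows how word presence and absence both feed into the class score.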
              
            
    
| Authors: | Iparraguirre-Villanueva, Orlando; Melgarejo-Graciano, Melquiades; Castro-Leon, Gloria; Olaya-Cotera, Sandro; John, Ruiz-Alvarado; Epifanía-Huerta, Andrés; Cabanillas-Carbonell, Michael; Zapata-Paulini, Joselyn |
|---|---|
| Format: | article |
| Publication date: | 2023 |
| Institution: | Universidad Autónoma del Perú |
| Repository: | AUTONOMA-Institucional |
| Language: | English |
| OAI Identifier: | oai:repositorio.autonoma.edu.pe:20.500.13067/2875 |
| Resource link: | https://hdl.handle.net/20.500.13067/2875 https://doi.org/10.3991/ijim.v17i14.39907 |
| Access level: | open access |
| Subject: | Classification; Tweets; Disasters; Machine learning; Natural; https://purl.org/pe-repo/ocde/ford#2.02.04 |
| id | AUTO_9d998482f225b9265dd9036e0821e758 | 
|---|---|
| oai_identifier_str | oai:repositorio.autonoma.edu.pe:20.500.13067/2875 | 
| network_acronym_str | AUTO | 
| network_name_str | AUTONOMA-Institucional | 
| repository_id_str | 4774 | 
| dc.title.es_PE.fl_str_mv | Classification of Tweets Related to Natural Disasters Using Machine Learning Algorithms | 
| title | Classification of Tweets Related to Natural Disasters Using Machine Learning Algorithms | 
| spellingShingle | Classification of Tweets Related to Natural Disasters Using Machine Learning Algorithms Iparraguirre-Villanueva, Orlando Classification Tweets Disasters Machine learning Natural https://purl.org/pe-repo/ocde/ford#2.02.04 | 
| title_short | Classification of Tweets Related to Natural Disasters Using Machine Learning Algorithms | 
| title_full | Classification of Tweets Related to Natural Disasters Using Machine Learning Algorithms | 
| title_fullStr | Classification of Tweets Related to Natural Disasters Using Machine Learning Algorithms | 
| title_full_unstemmed | Classification of Tweets Related to Natural Disasters Using Machine Learning Algorithms | 
| title_sort | Classification of Tweets Related to Natural Disasters Using Machine Learning Algorithms | 
| author | Iparraguirre-Villanueva, Orlando | 
| author_facet | Iparraguirre-Villanueva, Orlando Melgarejo-Graciano, Melquiades Castro-Leon, Gloria Olaya-Cotera, Sandro John, Ruiz-Alvarado Epifanía-Huerta, Andrés Cabanillas-Carbonell, Michael Zapata-Paulini, Joselyn | 
| author_role | author | 
| author2 | Melgarejo-Graciano, Melquiades Castro-Leon, Gloria Olaya-Cotera, Sandro John, Ruiz-Alvarado Epifanía-Huerta, Andrés Cabanillas-Carbonell, Michael Zapata-Paulini, Joselyn | 
| author2_role | author author author author author author author | 
| dc.contributor.author.fl_str_mv | Iparraguirre-Villanueva, Orlando Melgarejo-Graciano, Melquiades Castro-Leon, Gloria Olaya-Cotera, Sandro John, Ruiz-Alvarado Epifanía-Huerta, Andrés Cabanillas-Carbonell, Michael Zapata-Paulini, Joselyn | 
| dc.subject.es_PE.fl_str_mv | Classification Tweets Disasters Machine learning Natural | 
| topic | Classification Tweets Disasters Machine learning Natural https://purl.org/pe-repo/ocde/ford#2.02.04 | 
| dc.subject.ocde.es_PE.fl_str_mv | https://purl.org/pe-repo/ocde/ford#2.02.04 | 
| description | Abstract: In recent years, advances in computer science have made it significantly easier to identify and classify text extracted from social networks, specifically Twitter. This work identifies, classifies, and analyzes tweets related to real natural disasters, collected through the hashtag #NaturalDisasters, using machine learning (ML) algorithms: Bernoulli Naive Bayes (BNB), Multinomial Naive Bayes (MNB), Logistic Regression (LR), K-Nearest Neighbors (KNN), Decision Tree (DT), and Random Forest (RF). First, tweets related to natural disasters were identified, creating a dataset of 122k geo-located tweets for training. Second, the data were cleaned by applying stemming and lemmatization techniques. Third, exploratory data analysis (EDA) was performed to gain an initial understanding of the data. Fourth, the BNB, MNB, LR, KNN, DT, and RF models were trained and tested using tools and libraries for this type of task. The trained models performed well: BNB, MNB, and LR each achieved 87% accuracy, while KNN, DT, and RF achieved 82%, 75%, and 86%, respectively. BNB, MNB, and LR also performed better on the other reported metrics, such as processing time, test accuracy, precision, and F1 score, which indicates that, in this context and with this dataset, they are the best text classifiers among those evaluated. |
| publishDate | 2023 | 
| dc.date.accessioned.none.fl_str_mv | 2023-12-20T15:11:40Z | 
| dc.date.available.none.fl_str_mv | 2023-12-20T15:11:40Z | 
| dc.date.issued.fl_str_mv | 2023 | 
| dc.type.es_PE.fl_str_mv | info:eu-repo/semantics/article | 
| format | article | 
| dc.identifier.uri.none.fl_str_mv | https://hdl.handle.net/20.500.13067/2875 | 
| dc.identifier.journal.es_PE.fl_str_mv | International Journal of Interactive Mobile Technologies (iJIM) | 
| dc.identifier.doi.none.fl_str_mv | https://doi.org/10.3991/ijim.v17i14.39907 | 
| url | https://hdl.handle.net/20.500.13067/2875 https://doi.org/10.3991/ijim.v17i14.39907 | 
| identifier_str_mv | International Journal of Interactive Mobile Technologies (iJIM) | 
| dc.language.iso.es_PE.fl_str_mv | eng | 
| language | eng | 
| dc.relation.url.es_PE.fl_str_mv | https://online-journals.org/index.php/i-jim/article/view/39907 | 
| dc.rights.es_PE.fl_str_mv | info:eu-repo/semantics/openAccess | 
| dc.rights.uri.es_PE.fl_str_mv | https://creativecommons.org/licenses/by/4.0/ | 
| eu_rights_str_mv | openAccess | 
| rights_invalid_str_mv | https://creativecommons.org/licenses/by/4.0/ | 
| dc.format.es_PE.fl_str_mv | application/pdf | 
| dc.publisher.es_PE.fl_str_mv | International Journal of Interactive Mobile Technologies (iJIM) | 
| dc.source.none.fl_str_mv | reponame:AUTONOMA-Institucional instname:Universidad Autónoma del Perú instacron:AUTONOMA | 
| instname_str | Universidad Autónoma del Perú | 
| instacron_str | AUTONOMA | 
| institution | AUTONOMA | 
| reponame_str | AUTONOMA-Institucional | 
| collection | AUTONOMA-Institucional | 
| dc.source.volume.es_PE.fl_str_mv | 17 | 
| dc.source.issue.es_PE.fl_str_mv | 14 | 
| dc.source.beginpage.es_PE.fl_str_mv | 144 | 
| dc.source.endpage.es_PE.fl_str_mv | 162 | 
| bitstream.url.fl_str_mv | http://repositorio.autonoma.edu.pe/bitstream/20.500.13067/2875/1/42_2023.pdf http://repositorio.autonoma.edu.pe/bitstream/20.500.13067/2875/2/license.txt http://repositorio.autonoma.edu.pe/bitstream/20.500.13067/2875/3/42_2023.pdf.txt http://repositorio.autonoma.edu.pe/bitstream/20.500.13067/2875/4/42_2023.pdf.jpg | 
| bitstream.checksum.fl_str_mv | e0d55dbf66537ed3fa182725eead7f86 9243398ff393db1861c890baeaeee5f9 248445ed94dba32b2fdd5dc977ca46d8 4e69e60a267125ad0087445a6f9b1575 | 
| bitstream.checksumAlgorithm.fl_str_mv | MD5 MD5 MD5 MD5 | 
| repository.name.fl_str_mv | Repositorio de la Universidad Autonoma del Perú | 
| repository.mail.fl_str_mv | repositorio@autonoma.pe | 
| _version_ | 1835915259026604032 | 
| score | 13.924177 | 
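
The `bitstream.checksum.fl_str_mv` and `bitstream.checksumAlgorithm.fl_str_mv` fields above allow a downloaded file to be checked against the record. The following is a minimal sketch of such a check. It assumes that the checksums are listed in the same order as the URLs in `bitstream.url.fl_str_mv`; the names `EXPECTED_MD5`, `md5_hex`, and `matches_record` are hypothetical helpers introduced only for this illustration.

```python
import hashlib

# Checksums copied from bitstream.checksum.fl_str_mv, paired with file names
# from bitstream.url.fl_str_mv under the assumption that both lists share
# the same order.
EXPECTED_MD5 = {
    "42_2023.pdf": "e0d55dbf66537ed3fa182725eead7f86",
    "license.txt": "9243398ff393db1861c890baeaeee5f9",
    "42_2023.pdf.txt": "248445ed94dba32b2fdd5dc977ca46d8",
    "42_2023.pdf.jpg": "4e69e60a267125ad0087445a6f9b1575",
}


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the given bytes as a lowercase hex string."""
    return hashlib.md5(data).hexdigest()


def matches_record(filename: str, data: bytes) -> bool:
    """Check downloaded bytes against the checksum listed for that file."""
    return md5_hex(data) == EXPECTED_MD5[filename]
```

A caller would read the downloaded file in binary mode and pass its contents to `matches_record`. MD5 is adequate for detecting accidental corruption during transfer, but it does not protect against deliberate tampering.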
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is hosted. CONCYTEC is not responsible for the content (publications and/or data) accessible through the National Digital Repository of Science, Technology and Innovation of Open Access (ALICIA).