Multilingual Detection of Cyberbullying on Social Networks Using a Fine-Tuned GPT-3.5 Model
Article Description
Cyberbullying on social networks has emerged as a global problem with serious consequences for the mental health of victims, mainly children and adolescents. Although there are AI-based solutions to address this issue, they face limitations such as a lack of multilingual datasets, detecting sarcasm,...
| Authors: | Nina-Gutiérrez, Elizabeth Adriana; Pacheco-Alanya, Jesús Emerson; Morales-Arevalo, Juan Carlos |
|---|---|
| Format: | article |
| Publication date: | 2024 |
| Institution: | Universidad Peruana de Ciencias Aplicadas |
| Repository: | UPC-Institucional |
| Language: | English |
| OAI Identifier: | oai:repositorioacademico.upc.edu.pe:10757/676024 |
| Resource link: | http://hdl.handle.net/10757/676024 |
| Access level: | embargoed access |
| Subject: | Artificial intelligence; Cyberbullying; GPT; Hate detection; Offensive language; Social media |
| id |
UUPC_110eb20a951ec4092776805280724493 |
|---|---|
| oai_identifier_str |
oai:repositorioacademico.upc.edu.pe:10757/676024 |
| network_acronym_str |
UUPC |
| network_name_str |
UPC-Institucional |
| repository_id_str |
2670 |
| dc.title.es_PE.fl_str_mv |
Multilingual Detection of Cyberbullying on Social Networks Using a Fine-Tuned GPT-3.5 Model |
| author |
Nina-Gutiérrez, Elizabeth Adriana |
| author_facet |
Nina-Gutiérrez, Elizabeth Adriana Pacheco-Alanya, Jesús Emerson Morales-Arevalo, Juan Carlos |
| author_role |
author |
| author2 |
Pacheco-Alanya, Jesús Emerson Morales-Arevalo, Juan Carlos |
| author2_role |
author author |
| dc.contributor.author.fl_str_mv |
Nina-Gutiérrez, Elizabeth Adriana Pacheco-Alanya, Jesús Emerson Morales-Arevalo, Juan Carlos |
| dc.subject.es_PE.fl_str_mv |
Artificial intelligence Cyberbullying GPT Hate detection Offensive language Social media |
| topic |
Artificial intelligence Cyberbullying GPT Hate detection Offensive language Social media |
| description |
Cyberbullying on social networks has emerged as a global problem with serious consequences for the mental health of victims, mainly children and adolescents. Although AI-based solutions to this issue exist, they face limitations such as a lack of multilingual datasets and difficulty detecting sarcasm and idioms. This research presents an innovative approach to effective cyberbullying detection using a fine-tuned GPT-3.5 model. Our main contribution is an extensive multi-label dataset of approximately 60,000 entries in English and Spanish, spanning diverse dialects. The dataset was obtained by combining and processing multiple datasets from reliable sources. In addition, we developed a fine-tuned model based on GPT-3.5, capable of identifying hate speech and offensive language in textual content on social networks. We conducted a thorough evaluation comparing our model to specialized solutions such as Perspective API, Moderation, Content Safety, Toxic Bert, and Gemini. The results show that our approach outperforms these existing models in precision, F1-score, and accuracy, making it the most suitable choice among the solutions compared for effective cyberbullying detection. This research lays the groundwork for a future app that alerts users to harmful content online. |
| publishDate |
2024 |
| dc.date.accessioned.none.fl_str_mv |
2024-10-06T11:17:57Z |
| dc.date.available.none.fl_str_mv |
2024-10-06T11:17:57Z |
| dc.date.issued.fl_str_mv |
2024-01-01 |
| dc.type.es_PE.fl_str_mv |
info:eu-repo/semantics/article |
| format |
article |
| dc.identifier.issn.none.fl_str_mv |
18650929 |
| dc.identifier.doi.none.fl_str_mv |
10.1007/978-3-031-66705-3_17 |
| dc.identifier.uri.none.fl_str_mv |
http://hdl.handle.net/10757/676024 |
| dc.identifier.eissn.none.fl_str_mv |
18650937 |
| dc.identifier.journal.es_PE.fl_str_mv |
Communications in Computer and Information Science |
| dc.identifier.eid.none.fl_str_mv |
2-s2.0-85202606320 |
| dc.identifier.scopusid.none.fl_str_mv |
SCOPUS_ID:85202606320 |
| identifier_str_mv |
18650929 10.1007/978-3-031-66705-3_17 18650937 Communications in Computer and Information Science 2-s2.0-85202606320 SCOPUS_ID:85202606320 |
| url |
http://hdl.handle.net/10757/676024 |
| dc.language.iso.es_PE.fl_str_mv |
eng |
| language |
eng |
| dc.rights.es_PE.fl_str_mv |
info:eu-repo/semantics/embargoedAccess |
| eu_rights_str_mv |
embargoedAccess |
| dc.format.es_PE.fl_str_mv |
application/html |
| dc.publisher.es_PE.fl_str_mv |
Springer Science and Business Media Deutschland GmbH |
| dc.source.none.fl_str_mv |
reponame:UPC-Institucional instname:Universidad Peruana de Ciencias Aplicadas instacron:UPC |
| instname_str |
Universidad Peruana de Ciencias Aplicadas |
| instacron_str |
UPC |
| institution |
UPC |
| reponame_str |
UPC-Institucional |
| collection |
UPC-Institucional |
| dc.source.journaltitle.none.fl_str_mv |
Communications in Computer and Information Science |
| dc.source.volume.none.fl_str_mv |
2172 CCIS |
| dc.source.beginpage.none.fl_str_mv |
252 |
| dc.source.endpage.none.fl_str_mv |
263 |
| bitstream.url.fl_str_mv |
https://repositorioacademico.upc.edu.pe/bitstream/10757/676024/1/license.txt |
| bitstream.checksum.fl_str_mv |
8a4605be74aa9ea9d79846c1fba20a33 |
| bitstream.checksumAlgorithm.fl_str_mv |
MD5 |
| repository.name.fl_str_mv |
Repositorio académico upc |
| repository.mail.fl_str_mv |
upc@openrepository.com |
| _version_ |
1846066052391239680 |
| score |
13.905282 |
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA).