A novel hybrid approach of gravitational search algorithm and decision tree for twitter spammer detection

Article Description

With the increasing popularity of online social networking platforms, the amount of social data has grown exponentially. Social data analysis is essential as spamming activities and spammers are escalating over online social networking platforms. This paper focuses on spammer detection on the Twitte...


Bibliographic Details
Authors: Vives, Luis; Tuteja, Gurpreet Singh; Manideep, A. Sai; Jindal, Sonika; Sidhu, Navjot; Jindal, Richa; Bhatt, Abhishek
Format: article
Publication Date: 2022
Institution: Universidad Peruana de Ciencias Aplicadas
Repository: UPC-Institucional
Language: English
OAI Identifier: oai:repositorioacademico.upc.edu.pe:10757/660274
Resource link: http://hdl.handle.net/10757/660274
Access level: embargoed access
Subject: computational classification
decision tree
Gravitation
gravitational search algorithm
social communication
Twitter spammer detection
id UUPC_b933757adb08e728b46c5d6379c2de23
oai_identifier_str oai:repositorioacademico.upc.edu.pe:10757/660274
network_acronym_str UUPC
network_name_str UPC-Institucional
repository_id_str 2670
dc.title.es_PE.fl_str_mv A novel hybrid approach of gravitational search algorithm and decision tree for twitter spammer detection
author Vives, Luis
author_facet Vives, Luis
Tuteja, Gurpreet Singh
Manideep, A. Sai
Jindal, Sonika
Sidhu, Navjot
Jindal, Richa
Bhatt, Abhishek
author_role author
author2 Tuteja, Gurpreet Singh
Manideep, A. Sai
Jindal, Sonika
Sidhu, Navjot
Jindal, Richa
Bhatt, Abhishek
author2_role author
author
author
author
author
author
dc.contributor.author.fl_str_mv Vives, Luis
Tuteja, Gurpreet Singh
Manideep, A. Sai
Jindal, Sonika
Sidhu, Navjot
Jindal, Richa
Bhatt, Abhishek
dc.subject.es_PE.fl_str_mv computational classification
decision tree
Gravitation
gravitational search algorithm
social communication
Twitter spammer detection
description With the increasing popularity of online social networking platforms, the amount of social data has grown exponentially. Social data analysis is essential because spamming activities and spammers are escalating on these platforms. This paper focuses on spammer detection on the Twitter social networking platform. Although researchers have developed numerous machine learning methods to detect spammers, these methods are inefficient at detecting spammers on Twitter because of the imbalanced distribution of spam and nonspam data, the involvement of diverse features and the data mechanisms that spammers apply to avoid detection. This research work proposes a novel hybrid approach of the gravitational search algorithm and the decision tree (HGSDT) for detecting Twitter spammers. The individual decision tree (DT) algorithm cannot address these challenges, as it is unstable and ineffective for the higher level of favorable data for a particular attribute. The gravitational search algorithm (GSA) constructs DTs with improved performance, as the gravitational forces act as information-transferring agents through mass agents. Moreover, the GSA is efficient in handling data in a higher-dimensional search space. In the HGSDT approach, the construction of the DT and the splitting of nodes are performed with the heuristic function and Newton's laws. The performance of the proposed HGSDT approach is evaluated on the Social Honeypot dataset and the 1KS-10KN dataset through three experiments that analyze the impact of training data size, features and spammer ratio. The first experiment shows the need for a higher proportion of training data, the second indicates that textual content-based features are more important than the other feature categories, and the third indicates that balanced data are required for the proposed approach to perform effectively. The overall performance comparison indicates that the proposed HGSDT approach is superior to the compared machine learning methods (DT, support vector machine and back propagation neural network) for detecting Twitter spammers.
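The description above mentions the GSA only at a high level. As a rough illustration of how mass agents exchange information through gravitational forces, the following is a minimal sketch of a generic GSA minimizing a toy function. It is not the HGSDT method of the paper: the function name `gsa_minimize`, the parameters `g0` and `alpha`, and the sphere objective are assumptions made for illustration, and the decision-tree construction and node-splitting steps are not shown.

```python
import math
import random


def gsa_minimize(objective, dim, bounds, n_agents=20, n_iters=100,
                 g0=100.0, alpha=20.0, seed=0):
    """Generic gravitational search sketch (assumed parameters, minimization)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Each agent is a candidate solution with a position and a velocity.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    vel = [[0.0] * dim for _ in range(n_agents)]
    best_pos, best_val = None, math.inf
    eps = 1e-12  # avoids division by zero when two agents coincide

    for t in range(n_iters):
        fit = [objective(p) for p in pos]
        for p, f in zip(pos, fit):
            if f < best_val:  # remember the best solution seen so far
                best_pos, best_val = list(p), f

        # Better (lower) fitness gives a heavier mass; masses sum to 1.
        best, worst = min(fit), max(fit)
        if worst == best:
            masses = [1.0 / n_agents] * n_agents
        else:
            raw = [(worst - f) / (worst - best) for f in fit]
            total = sum(raw)
            masses = [m / total for m in raw]

        # The gravitational "constant" decays so the search settles over time.
        g = g0 * math.exp(-alpha * t / n_iters)

        new_pos = []
        for i in range(n_agents):
            # Acceleration of agent i: randomly weighted pull from all others.
            # The agent's own mass cancels (a = F / M_i), Newton's second law.
            acc = [0.0] * dim
            for j in range(n_agents):
                if i == j:
                    continue
                dist = math.dist(pos[i], pos[j])
                w = rng.random() * g * masses[j] / (dist + eps)
                for d in range(dim):
                    acc[d] += w * (pos[j][d] - pos[i][d])
            for d in range(dim):
                vel[i][d] = rng.random() * vel[i][d] + acc[d]
            # Move the agent and clamp it to the search bounds.
            new_pos.append([min(hi, max(lo, pos[i][d] + vel[i][d]))
                            for d in range(dim)])
        pos = new_pos

    # Evaluate the final positions as well.
    for p in pos:
        f = objective(p)
        if f < best_val:
            best_pos, best_val = list(p), f
    return best_pos, best_val


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    print(gsa_minimize(sphere, dim=3, bounds=(-5.0, 5.0)))
```

For brevity the sketch lets every agent attract every other agent; GSA variants often restrict the pull to a shrinking set of the heaviest agents.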
publishDate 2022
dc.date.accessioned.none.fl_str_mv 2022-07-10T16:14:49Z
dc.date.available.none.fl_str_mv 2022-07-10T16:14:49Z
dc.date.issued.fl_str_mv 2022-05-01
dc.type.es_PE.fl_str_mv info:eu-repo/semantics/article
format article
dc.identifier.issn.none.fl_str_mv 01291831
dc.identifier.doi.none.fl_str_mv 10.1142/S0129183122500607
dc.identifier.uri.none.fl_str_mv http://hdl.handle.net/10757/660274
dc.identifier.journal.es_PE.fl_str_mv International Journal of Modern Physics C
dc.identifier.eid.none.fl_str_mv 2-s2.0-85119660069
dc.identifier.scopusid.none.fl_str_mv SCOPUS_ID:85119660069
dc.identifier.isni.none.fl_str_mv 0000 0001 2196 144X
dc.language.iso.es_PE.fl_str_mv eng
language eng
dc.relation.url.es_PE.fl_str_mv https://www.worldscientific.com/doi/10.1142/S0129183122500607
dc.rights.es_PE.fl_str_mv info:eu-repo/semantics/embargoedAccess
eu_rights_str_mv embargoedAccess
dc.format.es_PE.fl_str_mv application/html
dc.publisher.es_PE.fl_str_mv World Scientific
dc.source.es_PE.fl_str_mv Repositorio Academico - UPC
Universidad Peruana de Ciencias Aplicadas (UPC)
dc.source.none.fl_str_mv reponame:UPC-Institucional
instname:Universidad Peruana de Ciencias Aplicadas
instacron:UPC
instname_str Universidad Peruana de Ciencias Aplicadas
instacron_str UPC
institution UPC
reponame_str UPC-Institucional
collection UPC-Institucional
dc.source.journaltitle.none.fl_str_mv International Journal of Modern Physics C
dc.source.volume.none.fl_str_mv 33
dc.source.issue.none.fl_str_mv 5
bitstream.url.fl_str_mv https://repositorioacademico.upc.edu.pe/bitstream/10757/660274/1/license.txt
bitstream.checksum.fl_str_mv 8a4605be74aa9ea9d79846c1fba20a33
bitstream.checksumAlgorithm.fl_str_mv MD5
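The record lists an MD5 checksum for the `license.txt` bitstream. A minimal sketch of how a downloaded copy could be checked against it with Python's standard `hashlib` follows; the helper name `md5_matches` and the local file name in the usage comment are hypothetical.

```python
import hashlib


def md5_matches(data: bytes, expected_hex: str) -> bool:
    """Return True if the MD5 digest of `data` equals `expected_hex`."""
    return hashlib.md5(data).hexdigest() == expected_hex.strip().lower()


# Hypothetical usage, assuming the bitstream was saved locally as license.txt:
# with open("license.txt", "rb") as fh:
#     ok = md5_matches(fh.read(), "8a4605be74aa9ea9d79846c1fba20a33")
```

MD5 is adequate here for detecting accidental corruption of a download; it is not suitable as a defense against deliberate tampering.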
repository.name.fl_str_mv Repositorio académico upc
repository.mail.fl_str_mv upc@openrepository.com
_version_ 1837188479449038848
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA).