A novel hybrid approach of gravitational search algorithm and decision tree for twitter spammer detection
Article Description
With the increasing popularity of online social networking platforms, the amount of social data has grown exponentially. Social data analysis is essential as spamming activities and spammers are escalating over online social networking platforms. This paper focuses on spammer detection on the Twitte...
Authors: | Vives, Luis; Tuteja, Gurpreet Singh; Manideep, A. Sai; Jindal, Sonika; Sidhu, Navjot; Jindal, Richa; Bhatt, Abhishek |
---|---|
Format: | article |
Publication date: | 2022 |
Institution: | Universidad Peruana de Ciencias Aplicadas |
Repository: | UPC-Institucional |
Language: | English |
OAI Identifier: | oai:repositorioacademico.upc.edu.pe:10757/660274 |
Resource link: | http://hdl.handle.net/10757/660274 |
Access level: | embargoed access |
Subject: | computational classification; decision tree; Gravitation; gravitational search algorithm; social communication; Twitter spammer detection |
id |
UUPC_b933757adb08e728b46c5d6379c2de23 |
---|---|
oai_identifier_str |
oai:repositorioacademico.upc.edu.pe:10757/660274 |
network_acronym_str |
UUPC |
network_name_str |
UPC-Institucional |
repository_id_str |
2670 |
dc.title.es_PE.fl_str_mv |
A novel hybrid approach of gravitational search algorithm and decision tree for twitter spammer detection |
title |
A novel hybrid approach of gravitational search algorithm and decision tree for twitter spammer detection |
spellingShingle |
A novel hybrid approach of gravitational search algorithm and decision tree for twitter spammer detection Vives, Luis computational classification decision tree Gravitation gravitational search algorithm social communication Twitter spammer detection |
title_short |
A novel hybrid approach of gravitational search algorithm and decision tree for twitter spammer detection |
title_full |
A novel hybrid approach of gravitational search algorithm and decision tree for twitter spammer detection |
title_fullStr |
A novel hybrid approach of gravitational search algorithm and decision tree for twitter spammer detection |
title_full_unstemmed |
A novel hybrid approach of gravitational search algorithm and decision tree for twitter spammer detection |
title_sort |
A novel hybrid approach of gravitational search algorithm and decision tree for twitter spammer detection |
author |
Vives, Luis |
author_facet |
Vives, Luis; Tuteja, Gurpreet Singh; Manideep, A. Sai; Jindal, Sonika; Sidhu, Navjot; Jindal, Richa; Bhatt, Abhishek |
author_role |
author |
author2 |
Tuteja, Gurpreet Singh; Manideep, A. Sai; Jindal, Sonika; Sidhu, Navjot; Jindal, Richa; Bhatt, Abhishek |
author2_role |
author author author author author author |
dc.contributor.author.fl_str_mv |
Vives, Luis; Tuteja, Gurpreet Singh; Manideep, A. Sai; Jindal, Sonika; Sidhu, Navjot; Jindal, Richa; Bhatt, Abhishek |
dc.subject.es_PE.fl_str_mv |
computational classification; decision tree; Gravitation; gravitational search algorithm; social communication; Twitter spammer detection |
topic |
computational classification; decision tree; Gravitation; gravitational search algorithm; social communication; Twitter spammer detection |
description |
With the increasing popularity of online social networking platforms, the amount of social data has grown exponentially. Analyzing social data is essential because spamming activity and the number of spammers on these platforms are escalating. This paper focuses on spammer detection on the Twitter social networking platform. Although researchers have developed numerous machine learning methods to detect spammers, these methods detect Twitter spammers poorly because of the imbalanced distribution of spam and nonspam data, the diversity of the features involved and the mechanisms spammers apply to their data to avoid detection. This research work proposes a novel hybrid approach of the gravitational search algorithm and the decision tree (HGSDT) for detecting Twitter spammers. The decision tree (DT) algorithm on its own cannot address these challenges, as it is unstable and ineffective when the data strongly favor a particular attribute. The gravitational search algorithm (GSA) constructs DTs with improved performance, as the gravitational forces act as information-transferring agents through mass agents. Moreover, the GSA handles high-dimensional search spaces efficiently. In the HGSDT approach, the DT is constructed and its nodes are split using the heuristic function and Newton's laws. The performance of the proposed HGSDT approach is evaluated on the Social Honeypot dataset and the 1KS-10KN dataset through three experiments that analyze the impact of training data size, features and spammer ratio. The first experiment shows that a higher proportion of training data is needed, the second shows that textual content-based features matter more than the other feature categories, and the third indicates that balanced data are required for the proposed approach to perform effectively. The overall performance comparison indicates that the proposed HGSDT approach is superior to the compared machine learning methods (DT, support vector machine and back propagation neural network) for detecting Twitter spammers. |
publishDate |
2022 |
dc.date.accessioned.none.fl_str_mv |
2022-07-10T16:14:49Z |
dc.date.available.none.fl_str_mv |
2022-07-10T16:14:49Z |
dc.date.issued.fl_str_mv |
2022-05-01 |
dc.type.es_PE.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
dc.identifier.issn.none.fl_str_mv |
01291831 |
dc.identifier.doi.none.fl_str_mv |
10.1142/S0129183122500607 |
dc.identifier.uri.none.fl_str_mv |
http://hdl.handle.net/10757/660274 |
dc.identifier.journal.es_PE.fl_str_mv |
International Journal of Modern Physics C |
dc.identifier.eid.none.fl_str_mv |
2-s2.0-85119660069 |
dc.identifier.scopusid.none.fl_str_mv |
SCOPUS_ID:85119660069 |
dc.identifier.isni.none.fl_str_mv |
0000 0001 2196 144X |
identifier_str_mv |
01291831; 10.1142/S0129183122500607; International Journal of Modern Physics C; 2-s2.0-85119660069; SCOPUS_ID:85119660069; 0000 0001 2196 144X |
url |
http://hdl.handle.net/10757/660274 |
dc.language.iso.es_PE.fl_str_mv |
eng |
language |
eng |
dc.relation.url.es_PE.fl_str_mv |
https://www.worldscientific.com/doi/10.1142/S0129183122500607 |
dc.rights.es_PE.fl_str_mv |
info:eu-repo/semantics/embargoedAccess |
eu_rights_str_mv |
embargoedAccess |
dc.format.es_PE.fl_str_mv |
application/html |
dc.publisher.es_PE.fl_str_mv |
World Scientific |
dc.source.es_PE.fl_str_mv |
Repositorio Academico - UPC Universidad Peruana de Ciencias Aplicadas (UPC) |
dc.source.none.fl_str_mv |
reponame:UPC-Institucional instname:Universidad Peruana de Ciencias Aplicadas instacron:UPC |
instname_str |
Universidad Peruana de Ciencias Aplicadas |
instacron_str |
UPC |
institution |
UPC |
reponame_str |
UPC-Institucional |
collection |
UPC-Institucional |
dc.source.journaltitle.none.fl_str_mv |
International Journal of Modern Physics C |
dc.source.volume.none.fl_str_mv |
33 |
dc.source.issue.none.fl_str_mv |
5 |
bitstream.url.fl_str_mv |
https://repositorioacademico.upc.edu.pe/bitstream/10757/660274/1/license.txt |
bitstream.checksum.fl_str_mv |
8a4605be74aa9ea9d79846c1fba20a33 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 |
repository.name.fl_str_mv |
Repositorio académico upc |
repository.mail.fl_str_mv |
upc@openrepository.com |
_version_ |
1837188479449038848 |
score |
13.7211075 |
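The description field above says that the GSA lets gravitational forces between mass agents carry information through a high-dimensional search space. The record does not say how HGSDT couples this search with decision-tree construction, so the following is only a minimal sketch of a generic, textbook-style GSA minimizing a toy function. The function name `gsa_minimize`, the `sphere` objective and all parameter values (`g0`, `alpha`, agent count, bounds) are illustrative assumptions, not taken from the paper.

```python
import math
import random


def gsa_minimize(fitness, dim, n_agents=15, iters=50, g0=100.0, alpha=20.0,
                 lower=-5.0, upper=5.0, seed=0):
    """Generic GSA sketch (assumed parameters; NOT the paper's HGSDT)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lower, upper) for _ in range(dim)]
           for _ in range(n_agents)]
    vel = [[0.0] * dim for _ in range(n_agents)]
    best_x, best_f = None, math.inf
    eps = 1e-12  # avoids division by zero when two agents coincide

    for t in range(iters):
        fit = [fitness(x) for x in pos]
        # Track the best position evaluated so far.
        for x, f in zip(pos, fit):
            if f < best_f:
                best_x, best_f = list(x), f

        # Lower fitness gives a larger normalized mass.
        lo, hi = min(fit), max(fit)
        raw = [1.0] * n_agents if hi == lo else [(hi - f) / (hi - lo) for f in fit]
        total = sum(raw)
        mass = [m / total for m in raw]

        # The gravitational "constant" decays so the search narrows over time.
        g = g0 * math.exp(-alpha * t / iters)

        # Acceleration of agent i: randomly weighted pull of every other agent.
        # (The agent's own mass cancels out of force / mass.)
        acc = []
        for i in range(n_agents):
            a = [0.0] * dim
            for j in range(n_agents):
                if i == j:
                    continue
                dist = math.dist(pos[i], pos[j])
                for d in range(dim):
                    a[d] += (rng.random() * g * mass[j]
                             * (pos[j][d] - pos[i][d]) / (dist + eps))
            acc.append(a)

        # Update velocities and positions, clipping positions to the bounds.
        for i in range(n_agents):
            for d in range(dim):
                vel[i][d] = rng.random() * vel[i][d] + acc[i][d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lower), upper)

    return best_x, best_f


def sphere(x):
    """Toy objective (assumption): sum of squares, minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_f = gsa_minimize(sphere, dim=2)
print(best_x, best_f)
```

In a hybrid such as the one the abstract outlines, the fitness function would presumably score candidate trees or split choices rather than a toy objective. That mapping is not given in this record and is left out of the sketch.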
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the National Digital Repository of Science, Technology and Innovation of Open Access (ALICIA).