A qualitative and quantitative comparison between Web scraping and API methods for Twitter credibility analysis

Dongo I.; Cardinale Y.; Aguilera A.; Martinez F.; Quintero Y.; Robayo G.; Cabeza D.

Article Description

This research was supported by the FONDO NACIONAL DE DESARROLLO CIENTÍFICO, TECNOLÓGICO Y DE INNOVACIÓN TECNOLÓGICA – FONDECYT as executing entity of CONCYTEC under grant agreement no. 01–2019-FONDECYT-BM-INC.INV in the project RUTAS: Robots para centros Urbanos Turísticos Autónomos y basados en Sem...

Bibliographic Details
Authors:	Dongo I.; Cardinale Y.; Aguilera A.; Martinez F.; Quintero Y.; Robayo G.; Cabeza D.
Format:	article
Publication date:	2021
Institution:	Consejo Nacional de Ciencia Tecnología e Innovación
Repository:	CONCYTEC-Institucional
Language:	English
OAI Identifier:	oai:repositorio.concytec.gob.pe:20.500.12390/2961
Resource links:	https://hdl.handle.net/20.500.12390/2961 ; https://doi.org/10.1108/IJWIS-03-2021-0037
Access level:	open access
Subjects:	Web scraping; API; Credibility; Qualitative analysis; Twitter; https://purl.org/pe-repo/ocde/ford#2.02.04

id	CONC_5d764fcd42988a8ecb18b73950b65d53
oai_identifier_str	oai:repositorio.concytec.gob.pe:20.500.12390/2961
network_acronym_str	CONC
network_name_str	CONCYTEC-Institucional
repository_id_str	4689
dc.title.none.fl_str_mv	A qualitative and quantitative comparison between Web scraping and API methods for Twitter credibility analysis
title	A qualitative and quantitative comparison between Web scraping and API methods for Twitter credibility analysis
author	Dongo I.
author_facet	Dongo I. Cardinale Y. Aguilera A. Martinez F. Quintero Y. Robayo G. Cabeza D.
author_role	author
author2	Cardinale Y. Aguilera A. Martinez F. Quintero Y. Robayo G. Cabeza D.
author2_role	author author author author author author
dc.contributor.author.fl_str_mv	Dongo I. Cardinale Y. Aguilera A. Martinez F. Quintero Y. Robayo G. Cabeza D.
dc.subject.none.fl_str_mv	Web scraping
topic	Web scraping API Credibility Qualitative analysis Twitter https://purl.org/pe-repo/ocde/ford#2.02.04
dc.subject.es_PE.fl_str_mv	API Credibility Qualitative analysis Twitter
dc.subject.ocde.none.fl_str_mv	https://purl.org/pe-repo/ocde/ford#2.02.04
description	This research was supported by the FONDO NACIONAL DE DESARROLLO CIENTÍFICO, TECNOLÓGICO Y DE INNOVACIÓN TECNOLÓGICA – FONDECYT as executing entity of CONCYTEC under grant agreement no. 01–2019-FONDECYT-BM-INC.INV in the project RUTAS: Robots para centros Urbanos Turísticos Autónomos y basados en Semántica.
publishDate	2021
dc.date.accessioned.none.fl_str_mv	2024-05-30T23:13:38Z
dc.date.available.none.fl_str_mv	2024-05-30T23:13:38Z
dc.date.issued.fl_str_mv	2021
dc.type.none.fl_str_mv	info:eu-repo/semantics/article
format	article
dc.identifier.uri.none.fl_str_mv	https://hdl.handle.net/20.500.12390/2961
dc.identifier.doi.none.fl_str_mv	https://doi.org/10.1108/IJWIS-03-2021-0037
dc.identifier.scopus.none.fl_str_mv	2-s2.0-85111661872
url	https://hdl.handle.net/20.500.12390/2961 https://doi.org/10.1108/IJWIS-03-2021-0037
identifier_str_mv	2-s2.0-85111661872
dc.language.iso.none.fl_str_mv	eng
language	eng
dc.relation.ispartof.none.fl_str_mv	International Journal of Web Information Systems
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.publisher.none.fl_str_mv	Emerald Group Holdings Ltd.
publisher.none.fl_str_mv	Emerald Group Holdings Ltd.
dc.source.none.fl_str_mv	reponame:CONCYTEC-Institucional instname:Consejo Nacional de Ciencia Tecnología e Innovación instacron:CONCYTEC
instname_str	Consejo Nacional de Ciencia Tecnología e Innovación
instacron_str	CONCYTEC
institution	CONCYTEC
reponame_str	CONCYTEC-Institucional
collection	CONCYTEC-Institucional
repository.name.fl_str_mv	Repositorio Institucional CONCYTEC
repository.mail.fl_str_mv	repositorio@concytec.gob.pe
_version_	1870084308199276544
abstract	Purpose: This paper aims to perform an exhaustive review of relevant and recent related studies, which reveals that both extraction methods are currently used to analyze credibility on Twitter. Thus, there is clear evidence of the need for different options to extract different data for this purpose. Nevertheless, none of these studies performs a comparative evaluation of both extraction techniques. Moreover, the authors extend a previous comparison, which uses a recently developed framework that offers both alternatives for data extraction and implements a previously proposed credibility model, by adding a qualitative evaluation and a Twitter Application Programming Interface (API) performance analysis from different locations.
Design/methodology/approach: As one of the most popular social platforms, Twitter has been the focus of recent research aimed at analyzing the credibility of the shared information. To do so, several proposals use either the Twitter API or Web scraping to extract the data for the analysis. Qualitative and quantitative evaluations are performed to discover the advantages and disadvantages of both extraction methods.
Findings: The study demonstrates the differences in accuracy and efficiency between the two extraction methods and gives relevance to many more problems related to this area in order to pursue true transparency and legitimacy of information on the Web.
Originality/value: Results report that some Twitter attributes cannot be retrieved by Web scraping. Both methods produce identical credibility values when a robust normalization process is applied to the text (i.e. the tweet). Moreover, concerning time performance, Web scraping is faster than the Twitter API and is more flexible in terms of obtaining data; however, Web scraping is very sensitive to website changes. Additionally, the response time of the Twitter API is proportional to the distance from the central server in San Francisco.
© 2021, Emerald Publishing Limited.
score	13.421615

Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA).
