Understanding stance classification of BERT models: an attention-based mechanism

Bibliographic Details
Author: Córdova Sáenz, Carlos Abel
Format: master's thesis
Publication date: 2022
Institution: Superintendencia Nacional de Educación Superior Universitaria
Repository: Registro Nacional de Trabajos conducentes a Grados y Títulos - RENATI
Language: English
OAI identifier: oai:renati.sunedu.gob.pe:renati/9260
Resource links: https://renati.sunedu.gob.pe/handle/sunedu/3693338
http://hdl.handle.net/10183/247549
Access level: open access
Subjects: Bidirectional encoder representations from transformers
Natural language processing (Computer science)
Interpretability (Artificial intelligence)
COVID-19 (Disease) - Political aspects
Political polarization
https://purl.org/pe-repo/ocde/ford#1.02.01
Description
Summary: BERT produces state-of-the-art solutions for many natural language processing tasks at the cost of interpretability. Since prior work debates the value of BERT's attention weights for this purpose, we contribute an attention-based interpretability framework that identifies the most influential words for stance classification with BERT-based models. Unlike related work, we develop a broader level of interpretability focused on the overall model behavior rather than single instances. We aggregate tokens' attentions into word-level attention weights, which are more meaningful and can be semantically related to the domain. We propose attention metrics to assess words' influence on the correct classification of stances. We use three case studies related to COVID-19 to assess the proposed framework in a broad experimental setting encompassing six datasets and four BERT pre-trained models for Portuguese and English, resulting in sixteen stance classification models. By posing five research questions, we obtained valuable insights into the usefulness of attention weights for interpreting stance classification, which allowed us to generalize our findings. Our results are independent of any particular pre-trained BERT model and are comparable to those obtained with an alternative baseline method. High attention scores increase the probability of finding words that positively impact model performance and influence the correct classification (up to 82% of the identified influential words contribute to correct predictions). The influential words represent the domain and can be used to identify how the model leverages the expressed arguments to predict a stance.
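To make the token-to-word aggregation step concrete, the sketch below shows one way such an aggregation could be implemented; it is an illustration, not the thesis's actual method, and it assumes the Hugging Face transformers library and the bert-base-uncased checkpoint. It averages attention over all layers and heads, reads the attention that each token receives from [CLS], and sums WordPiece sub-token scores back into whole-word scores.

```python
# Minimal sketch (assumed setup, not the thesis's exact pipeline):
# aggregate BERT's token-level attentions into word-level weights.
from collections import defaultdict

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

text = "Vaccines remain our best defense against COVID-19."
encoded = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**encoded)

# outputs.attentions is a tuple with one (batch, heads, seq, seq) tensor
# per layer. Average over layers (dim 0) and heads (dim 2), then take the
# attention each token receives from [CLS] as its raw token score.
attn = torch.stack(outputs.attentions).mean(dim=(0, 2))  # (batch, seq, seq)
cls_to_tokens = attn[0, 0]  # attention from [CLS] to every token

# Sum WordPiece sub-token scores into whole-word scores via the fast
# tokenizer's word_ids() mapping; None marks special tokens like [CLS].
word_scores = defaultdict(float)
for idx, word_id in enumerate(encoded.word_ids()):
    if word_id is not None:
        word_scores[word_id] += cls_to_tokens[idx].item()

# Print words ranked by aggregated attention, recovering each word's
# surface form from its character span in the original text.
for word_id, score in sorted(word_scores.items(), key=lambda kv: -kv[1]):
    span = encoded.word_to_chars(word_id)
    print(f"{text[span.start:span.end]:<12} {score:.4f}")
```

Ranking words by such aggregated scores is one plausible way to surface candidate influential words; the thesis's attention metrics for measuring their contribution to correct stance predictions would build on top of a mapping like this.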