A Comparative Analysis on the Summarization of Legal Texts Using Transformer Models

Article Description

Transformer models have advanced natural language processing tasks in machine learning and set a new standard for the state of the art. Thanks to the self-attention component, these models have achieved significant improvements in text generation tasks (such as extractive and abstractive text summari...

Bibliographic Details
Authors: Núñez-Robinson, Daniel; Talavera-Montalto, Jose; Ugarte, Willy
Format: article
Publication Date: 2022
Institution: Universidad Peruana de Ciencias Aplicadas
Repository: UPC-Institucional
Language: English
OAI Identifier: oai:repositorioacademico.upc.edu.pe:10757/669595
Resource link: http://hdl.handle.net/10757/669595
Access level: embargoed access
Subject: Abstractive text summarization
Benchmark
Deep learning
Natural language processing
Transformers
id UUPC_86b98157dd98a66e3ce1fe0b3bd0c4f4
oai_identifier_str oai:repositorioacademico.upc.edu.pe:10757/669595
network_acronym_str UUPC
network_name_str UPC-Institucional
repository_id_str 2670
dc.title.es_PE.fl_str_mv A Comparative Analysis on the Summarization of Legal Texts Using Transformer Models
author Núñez-Robinson, Daniel
author_facet Núñez-Robinson, Daniel
Talavera-Montalto, Jose
Ugarte, Willy
author_role author
author2 Talavera-Montalto, Jose
Ugarte, Willy
author2_role author
author
dc.contributor.author.fl_str_mv Núñez-Robinson, Daniel
Talavera-Montalto, Jose
Ugarte, Willy
dc.subject.es_PE.fl_str_mv Abstractive text summarization
Benchmark
Deep learning
Natural language processing
Transformers
description Transformer models have advanced natural language processing tasks in machine learning and set a new standard for the state of the art. Thanks to the self-attention component, these models have achieved significant improvements in text generation tasks (such as extractive and abstractive text summarization). However, research on text summarization in the legal domain is still in its infancy, and as such, benchmarks and a comparative analysis of these state-of-the-art models are important for the future of text summarization for this highly specialized task. To contribute to this research, the researchers propose a comparative analysis of different fine-tuned Transformer models and datasets to provide a better understanding of the task at hand and the challenges ahead. The results show that Transformer models have improved upon the text summarization task; however, consistent and generalized learning remains a challenge when training the models with large text dimensions. Finally, after analyzing the correlation between objective results and human opinion, the team concludes that the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) [13] metrics used in the current state of the art are limited and do not precisely reflect the quality of a generated summary.
publishDate 2022
dc.date.accessioned.none.fl_str_mv 2023-12-08T01:27:49Z
dc.date.available.none.fl_str_mv 2023-12-08T01:27:49Z
dc.date.issued.fl_str_mv 2022-01-01
dc.type.es_PE.fl_str_mv info:eu-repo/semantics/article
format article
dc.identifier.issn.none.fl_str_mv 18650929
dc.identifier.doi.none.fl_str_mv 10.1007/978-3-031-20319-0_28
dc.identifier.uri.none.fl_str_mv http://hdl.handle.net/10757/669595
dc.identifier.eissn.none.fl_str_mv 18650937
dc.identifier.journal.es_PE.fl_str_mv Communications in Computer and Information Science
dc.identifier.eid.none.fl_str_mv 2-s2.0-85144232675
dc.identifier.scopusid.none.fl_str_mv SCOPUS_ID:85144232675
dc.identifier.isni.none.fl_str_mv 0000 0001 2196 144X
url http://hdl.handle.net/10757/669595
dc.language.iso.es_PE.fl_str_mv eng
language eng
dc.relation.url.es_PE.fl_str_mv https://www.springerprofessional.de/en/a-comparative-analysis-on-the-summarization-of-legal-texts-using/23752634
dc.rights.es_PE.fl_str_mv info:eu-repo/semantics/embargoedAccess
dc.rights.*.fl_str_mv Attribution-NonCommercial-ShareAlike 4.0 International
dc.rights.uri.*.fl_str_mv http://creativecommons.org/licenses/by-nc-sa/4.0/
eu_rights_str_mv embargoedAccess
rights_invalid_str_mv Attribution-NonCommercial-ShareAlike 4.0 International
http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.publisher.es_PE.fl_str_mv Springer Science and Business Media Deutschland GmbH
dc.source.none.fl_str_mv reponame:UPC-Institucional
instname:Universidad Peruana de Ciencias Aplicadas
instacron:UPC
instname_str Universidad Peruana de Ciencias Aplicadas
instacron_str UPC
institution UPC
reponame_str UPC-Institucional
collection UPC-Institucional
dc.source.journaltitle.none.fl_str_mv Communications in Computer and Information Science
dc.source.volume.none.fl_str_mv 1675 CCIS
dc.source.beginpage.none.fl_str_mv 372
dc.source.endpage.none.fl_str_mv 386
bitstream.url.fl_str_mv https://repositorioacademico.upc.edu.pe/bitstream/10757/669595/2/license.txt
https://repositorioacademico.upc.edu.pe/bitstream/10757/669595/1/license_rdf
bitstream.checksum.fl_str_mv 8a4605be74aa9ea9d79846c1fba20a33
934f4ca17e109e0a05eaeaba504d7ce4
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Repositorio académico upc
repository.mail.fl_str_mv upc@openrepository.com
_version_ 1837186818053767168
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA).