A Comparative Analysis on the Summarization of Legal Texts Using Transformer Models
Article Description
Transformer models have advanced natural language processing in machine learning and set a new standard for the state of the art. Thanks to the self-attention component, these models have achieved significant improvements in text generation tasks (such as extractive and abstractive text summari...
Authors: | Núñez-Robinson, Daniel; Talavera-Montalto, Jose; Ugarte, Willy |
---|---|
Format: | article |
Publication date: | 2022 |
Institution: | Universidad Peruana de Ciencias Aplicadas |
Repository: | UPC-Institucional |
Language: | English |
OAI Identifier: | oai:repositorioacademico.upc.edu.pe:10757/669595 |
Resource link: | http://hdl.handle.net/10757/669595 |
Access level: | embargoed access |
Subject: | Abstractive text summarization; Benchmark; Deep learning; Natural language processing; Transformers |
id |
UUPC_86b98157dd98a66e3ce1fe0b3bd0c4f4 |
---|---|
oai_identifier_str |
oai:repositorioacademico.upc.edu.pe:10757/669595 |
network_acronym_str |
UUPC |
network_name_str |
UPC-Institucional |
repository_id_str |
2670 |
dc.title.es_PE.fl_str_mv |
A Comparative Analysis on the Summarization of Legal Texts Using Transformer Models |
author |
Núñez-Robinson, Daniel |
author_facet |
Núñez-Robinson, Daniel; Talavera-Montalto, Jose; Ugarte, Willy |
author_role |
author |
author2 |
Talavera-Montalto, Jose; Ugarte, Willy |
author2_role |
author; author |
dc.contributor.author.fl_str_mv |
Núñez-Robinson, Daniel; Talavera-Montalto, Jose; Ugarte, Willy |
dc.subject.es_PE.fl_str_mv |
Abstractive text summarization; Benchmark; Deep learning; Natural language processing; Transformers |
topic |
Abstractive text summarization; Benchmark; Deep learning; Natural language processing; Transformers |
description |
Transformer models have advanced natural language processing in machine learning and set a new standard for the state of the art. Thanks to the self-attention component, these models have achieved significant improvements in text generation tasks (such as extractive and abstractive text summarization). However, research on text summarization in the legal domain is still in its infancy, so benchmarks and a comparative analysis of these state-of-the-art models are important for the future of this highly specialized task. To contribute to this research, the researchers propose a comparative analysis of different fine-tuned Transformer models and datasets to provide a better understanding of the task at hand and the challenges ahead. The results show that Transformer models have improved upon the text summarization task; however, consistent and generalized learning remains a challenge when training the models on large texts. Finally, after analyzing the correlation between objective results and human opinion, the team concludes that the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) [13] metrics used in the current state of the art are limited and do not precisely reflect the quality of a generated summary. |
publishDate |
2022 |
dc.date.accessioned.none.fl_str_mv |
2023-12-08T01:27:49Z |
dc.date.available.none.fl_str_mv |
2023-12-08T01:27:49Z |
dc.date.issued.fl_str_mv |
2022-01-01 |
dc.type.es_PE.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
dc.identifier.issn.none.fl_str_mv |
18650929 |
dc.identifier.doi.none.fl_str_mv |
10.1007/978-3-031-20319-0_28 |
dc.identifier.uri.none.fl_str_mv |
http://hdl.handle.net/10757/669595 |
dc.identifier.eissn.none.fl_str_mv |
18650937 |
dc.identifier.journal.es_PE.fl_str_mv |
Communications in Computer and Information Science |
dc.identifier.eid.none.fl_str_mv |
2-s2.0-85144232675 |
dc.identifier.scopusid.none.fl_str_mv |
SCOPUS_ID:85144232675 |
dc.identifier.isni.none.fl_str_mv |
0000 0001 2196 144X |
url |
http://hdl.handle.net/10757/669595 |
dc.language.iso.es_PE.fl_str_mv |
eng |
language |
eng |
dc.relation.url.es_PE.fl_str_mv |
https://www.springerprofessional.de/en/a-comparative-analysis-on-the-summarization-of-legal-texts-using/23752634 |
dc.rights.es_PE.fl_str_mv |
info:eu-repo/semantics/embargoedAccess |
dc.rights.*.fl_str_mv |
Attribution-NonCommercial-ShareAlike 4.0 International |
dc.rights.uri.*.fl_str_mv |
http://creativecommons.org/licenses/by-nc-sa/4.0/ |
eu_rights_str_mv |
embargoedAccess |
dc.publisher.es_PE.fl_str_mv |
Springer Science and Business Media Deutschland GmbH |
dc.source.none.fl_str_mv |
reponame:UPC-Institucional instname:Universidad Peruana de Ciencias Aplicadas instacron:UPC |
instname_str |
Universidad Peruana de Ciencias Aplicadas |
instacron_str |
UPC |
institution |
UPC |
reponame_str |
UPC-Institucional |
collection |
UPC-Institucional |
dc.source.journaltitle.none.fl_str_mv |
Communications in Computer and Information Science |
dc.source.volume.none.fl_str_mv |
1675 CCIS |
dc.source.beginpage.none.fl_str_mv |
372 |
dc.source.endpage.none.fl_str_mv |
386 |
bitstream.url.fl_str_mv |
https://repositorioacademico.upc.edu.pe/bitstream/10757/669595/2/license.txt https://repositorioacademico.upc.edu.pe/bitstream/10757/669595/1/license_rdf |
bitstream.checksum.fl_str_mv |
8a4605be74aa9ea9d79846c1fba20a33 934f4ca17e109e0a05eaeaba504d7ce4 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 |
repository.name.fl_str_mv |
Repositorio académico upc |
repository.mail.fl_str_mv |
upc@openrepository.com |
_version_ |
1837186818053767168 |
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA).