Fast algorithms for the optimum-path forest-based classifier

Article Description

Pattern recognition applications deal with ever-increasing datasets, in both size and complexity. In this work, we propose and analyze efficient algorithms for the Optimum-Path Forest (OPF) supervised classifier. This classifier has been shown to provide results comparable to most well-known pattern reco...


Bibliographic Details
Author: Culquicondor Ruiz, Aldo Paolo
Format: undergraduate thesis
Publication Date: 2018
Institution: Universidad Católica San Pablo
Repository: UCSP-Institucional
Language: Spanish
OAI Identifier: oai:repositorio.ucsp.edu.pe:20.500.12590/15589
Resource link: https://hdl.handle.net/20.500.12590/15589
Access level: open access
Subject: Algorithm
Optimum Path Forest (OPF)
Image Foresting Transform
https://purl.org/pe-repo/ocde/ford#1.02.01
id UCSP_7b0386f4315967f75371de191a6f6fa4
oai_identifier_str oai:repositorio.ucsp.edu.pe:20.500.12590/15589
network_acronym_str UCSP
network_name_str UCSP-Institucional
repository_id_str 3854
dc.title.es_PE.fl_str_mv Fast algorithms for the optimum-path forest-based classifier
author Culquicondor Ruiz, Aldo Paolo
author_facet Culquicondor Ruiz, Aldo Paolo
author_role author
dc.contributor.advisor.fl_str_mv Ochoa Luna, José Eduardo
Castelo Fernández, César Christian
dc.contributor.author.fl_str_mv Culquicondor Ruiz, Aldo Paolo
dc.subject.es_PE.fl_str_mv Algorithm
Optimum Path Forest (OPF)
Image Foresting Transform
dc.subject.ocde.es_PE.fl_str_mv https://purl.org/pe-repo/ocde/ford#1.02.01
description Pattern recognition applications deal with ever-increasing datasets, in both size and complexity. In this work, we propose and analyze efficient algorithms for the Optimum-Path Forest (OPF) supervised classifier. This classifier has been shown to provide results comparable to most well-known pattern recognition techniques, but with a much faster training phase. However, there is still room for improvement. The contribution of this work is the introduction of spatial indexing and parallel algorithms in the training and classification phases of the OPF supervised classifier. First, we propose a simple parallelization approach for the training phase. Following the traditional sequential training for the OPF, it maintains a priority queue to select the best samples at each iteration. We then replace this priority queue with an array and a linear search, with the aim of using a more parallel-friendly data structure. We show that this approach yields better temporal and spatial locality than the former, providing better speedups. Additionally, we show how vectorizing the distance calculations affects the overall speedup, and we provide guidance on when to use it. For the classification phase, we first aim to reduce the number of distance calculations against the classifier samples and then introduce parallelization. For this purpose, we develop a novel theory to index the OPF classifier in a metric space. We then use it to build an efficient data structure that reduces the number of comparisons with classifier samples. Finally, we propose its parallelization, leading to very fast classification of new samples.
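To make the array-and-linear-search idea in the abstract concrete, the following is a minimal sequential sketch in Python. It is not the thesis's implementation: it assumes the usual OPF formulation from the literature (a complete graph over the training samples, Euclidean distance, and the f_max path cost), it takes the prototype indices as given instead of deriving them from a minimum spanning tree, and it omits both the parallelization and the metric-space index. The names opf_train and opf_classify are hypothetical.

```python
import math


def opf_train(samples, labels, prototypes):
    """Sequential OPF training sketch: flat arrays and a linear search
    stand in for the priority queue (assumed f_max path cost)."""
    n = len(samples)
    cost = [math.inf] * n   # best path cost found so far for each sample
    label = list(labels)    # prototypes keep their true label
    done = [False] * n      # samples already removed from the "queue"
    for p in prototypes:
        cost[p] = 0.0
    for _ in range(n):
        # Linear search for the cheapest unprocessed sample.
        s = min((i for i in range(n) if not done[i]), key=cost.__getitem__)
        done[s] = True
        for t in range(n):
            if done[t]:
                continue
            # f_max: a path costs as much as its largest edge.
            offered = max(cost[s], math.dist(samples[s], samples[t]))
            if offered < cost[t]:
                cost[t] = offered
                label[t] = label[s]  # t is conquered by the tree of s
    return cost, label


def opf_classify(samples, cost, label, query):
    """Brute-force classification: one distance per training sample."""
    best = min(
        range(len(samples)),
        key=lambda i: max(cost[i], math.dist(samples[i], query)),
    )
    return label[best]


# Toy data: two classes on a line; samples 1 and 2 are assumed prototypes.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
classes = [0, 0, 1, 1]
costs, tree_labels = opf_train(points, classes, prototypes=[1, 2])
predicted = opf_classify(points, costs, tree_labels, (0.5, 0.0))
```

In this sketch the inner loop over t touches contiguous arrays and its iterations do not depend on each other, which is the kind of loop that can be split across threads or vectorized. The brute-force opf_classify computes one distance per training sample, which is the cost the abstract's indexing structure aims to reduce.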
publishDate 2018
dc.date.accessioned.none.fl_str_mv 2018-05-07T17:21:40Z
dc.date.available.none.fl_str_mv 2018-05-07T17:21:40Z
dc.date.issued.fl_str_mv 2018
dc.type.none.fl_str_mv info:eu-repo/semantics/bachelorThesis
format bachelorThesis
dc.identifier.other.none.fl_str_mv 1060274
dc.identifier.uri.none.fl_str_mv https://hdl.handle.net/20.500.12590/15589
identifier_str_mv 1060274
url https://hdl.handle.net/20.500.12590/15589
dc.language.iso.es_PE.fl_str_mv spa
language spa
dc.relation.ispartof.fl_str_mv SUNEDU
dc.rights.es_PE.fl_str_mv info:eu-repo/semantics/openAccess
dc.rights.uri.es_PE.fl_str_mv https://creativecommons.org/licenses/by/4.0/
eu_rights_str_mv openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by/4.0/
dc.format.es_PE.fl_str_mv application/pdf
dc.publisher.es_PE.fl_str_mv Universidad Católica San Pablo
dc.publisher.country.es_PE.fl_str_mv PE
dc.source.es_PE.fl_str_mv Universidad Católica San Pablo
Repositorio Institucional - UCSP
dc.source.none.fl_str_mv reponame:UCSP-Institucional
instname:Universidad Católica San Pablo
instacron:UCSP
instname_str Universidad Católica San Pablo
instacron_str UCSP
institution UCSP
reponame_str UCSP-Institucional
collection UCSP-Institucional
bitstream.url.fl_str_mv https://repositorio.ucsp.edu.pe/backend/api/core/bitstreams/e75d83c9-e953-4d58-80d5-5510f439268f/download
https://repositorio.ucsp.edu.pe/backend/api/core/bitstreams/ced9ba7b-acfb-434a-ba32-5c127b334265/download
https://repositorio.ucsp.edu.pe/backend/api/core/bitstreams/1af77852-37bc-4339-8650-403a9625aa86/download
https://repositorio.ucsp.edu.pe/backend/api/core/bitstreams/e98766e8-974d-41af-b9b7-3333a559200d/download
bitstream.checksum.fl_str_mv 1b289c5f9a21632fde46b202905ea693
654e2212989dd654cbfc23543efe7ebf
8a4605be74aa9ea9d79846c1fba20a33
181ca6c86bfc0bc111732f9173f9d0ea
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositorio Institucional de la Universidad Católica San Pablo
repository.mail.fl_str_mv dspace@ucsp.edu.pe
_version_ 1851053034220552192
Degree: Licenciado en Ciencia de la Computación (Título Profesional)
Degree grantor: Universidad Católica San Pablo. Facultad de Ingeniería y Computación
Discipline: Ciencia de la Computación
Program: Escuela Profesional de Ciencia de la Computación
Files (bundle, name, description, MIME type, size in bytes):
TEXT, CULQUICONDOR_RUIZ_ALD_FAS.pdf.txt, Extracted text, text/plain, 90430
ORIGINAL, CULQUICONDOR_RUIZ_ALD_FAS.pdf, application/pdf, 756116
LICENSE, license.txt, text/plain; charset=utf-8, 1748
THUMBNAIL, CULQUICONDOR_RUIZ_ALD_FAS.pdf.jpg, Generated Thumbnail, image/jpeg, 3889
Record timestamp: 2023-10-30 12:38:34.861
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the National Digital Repository of Science, Technology and Innovation of Open Access (ALICIA).