Fast algorithms for the optimum-path forest-based classifier

Culquicondor Ruiz, Aldo Paolo

Bibliographic Details
Author:	Culquicondor Ruiz, Aldo Paolo
Format:	undergraduate thesis
Publication Date:	2018
Institution:	Universidad Católica San Pablo
Repository:	UCSP-Institucional
Language:	Spanish
OAI Identifier:	oai:repositorio.ucsp.edu.pe:20.500.12590/15589
Resource Link:	https://hdl.handle.net/20.500.12590/15589
Access Level:	open access
Subject:	Algorithm; Optimum-Path Forest (OPF); Image Foresting Transform; https://purl.org/pe-repo/ocde/ford#1.02.01

Description
Summary:	Pattern recognition applications deal with ever-increasing datasets, in both size and complexity. In this work, we propose and analyze efficient algorithms for the Optimum-Path Forest (OPF) supervised classifier. This classifier has been shown to provide results comparable to most well-known pattern recognition techniques, but with a much faster training phase. However, there is still room for improvement. The contribution of this work is the introduction of spatial indexing and parallel algorithms in the training and classification phases of the OPF supervised classifier. First, we propose a simple parallelization approach for the training phase. Following the traditional sequential OPF training, it maintains a priority queue to select the best samples at each iteration. We then replace this priority queue with an array and a linear search, with the aim of using a more parallel-friendly data structure. We show that this approach yields better temporal and spatial locality than the former, providing better speedups. Additionally, we show how vectorizing the distance calculations affects the overall speedup, and we provide guidance on when to use it. For the classification phase, we first aim to reduce the number of distance calculations against the classifier samples and then introduce parallelization. For this purpose, we develop a novel theory for indexing the OPF classifier in a metric space. We then use it to build an efficient data structure that reduces the number of comparisons with classifier samples. Finally, we propose its parallelization, leading to very fast classification of new samples.

Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA).