The Peruvian Amazon forestry dataset: A leaf image classification corpus

Article Description

Bibliographic Details
Authors: Vizcarra G., Bermejo D., Mauricio A., Zarate Gomez R., Dianderas E.
Format: article
Publication Date: 2021
Institution: Consejo Nacional de Ciencia Tecnología e Innovación
Repository: CONCYTEC-Institucional
Language: English
OAI Identifier: oai:repositorio.concytec.gob.pe:20.500.12390/2335
Resource Link: https://hdl.handle.net/20.500.12390/2335
https://doi.org/10.1016/j.ecoinf.2021.101268
Access Level: open access
Subject: Visual interpretation
Deep learning
Interpretation
Leaves dataset
Peruvian Amazon
http://purl.org/pe-repo/ocde/ford#4.01.02
Description
Summary: A forest census provides precise data for logging planning and for drawing up the forest management plan. Species misidentification leads to inadequate forest management plans and high risks within forest concessions. Hence, an identification protocol helps prevent the exploitation of non-commercial or endangered timber species. Current Peruvian legislation allows non-technical experts, called “materos”, to take part in the identification. Materos use common names drawn from the folklore and traditions of their communities instead of formal ones, which generally leads to misclassifications. In practice, logging companies hire materos instead of botanists because of cost and time limitations. With this motivation, we explore an end-to-end software solution to automate species identification. This paper introduces the Peruvian Amazon Forestry Dataset, which includes 59,441 leaf samples from ten of the most profitable and endangered timber-tree species. The proposal includes a background removal algorithm whose output feeds a CNN pre-trained on the ImageNet dataset. We evaluate the quantitative (accuracy metric) and qualitative (visual interpretation) impact of each stage through ablation experiments. The results show a 96.64% training accuracy and 96.52% testing accuracy with the VGG-19 model. Furthermore, the visual interpretation of the model indicates that leaf venation has the highest correlation with the plant recognition task. © 2021
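Illustrative sketch (not part of the original record): the summary mentions a background removal stage and an accuracy metric, but this record does not describe how either is implemented. The Python below is only a minimal stand-in under stated assumptions: the leaf is darker than a bright, uniform backdrop, images are plain grayscale grids, and the function names, the threshold value, and the labels "species_a" and "species_b" are hypothetical. The CNN stage itself is omitted.

```python
from typing import List, Sequence

# Grayscale image as rows of pixel intensities, 0 (dark) to 255 (bright).
Image = List[List[int]]


def remove_background(image: Image, threshold: int = 200, fill: int = 0) -> Image:
    """Replace every pixel brighter than `threshold` with `fill`.

    Assumption: a bright backdrop behind a darker leaf. This is a simple
    thresholding stand-in, not the algorithm used in the paper.
    """
    return [[fill if px > threshold else px for px in row] for row in image]


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


if __name__ == "__main__":
    # Toy 3x3 "leaf": bright backdrop (250) around a few darker leaf pixels.
    leaf = [
        [250, 250, 250],
        [250, 90, 250],
        [250, 80, 70],
    ]
    print(remove_background(leaf))

    # Hypothetical labels, used only to exercise the metric.
    predicted = ["species_a", "species_b", "species_b", "species_a"]
    actual = ["species_a", "species_b", "species_a", "species_a"]
    print(accuracy(predicted, actual))
```

In a pipeline like the one the summary outlines, the masked image would then be passed to the pre-trained CNN, and the metric would be computed separately on the training and testing splits.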
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or data set is held. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA).