Random Forests as an extension of the classification trees with the R and Python programs


Bibliographic Details
Authors: Medina-Merino, Rosa Fátima; Ñique-Chacón, Carmen Ismelda
Format: article
Publication Date: 2017
Institution: Universidad de Lima
Repository: Revistas - Universidad de Lima
Language: Spanish
OAI Identifier: oai:revistas.ulima.edu.pe:article/1775
Resource link: https://revistas.ulima.edu.pe/index.php/Interfases/article/view/1775
Access level: open access
Subject: Random Forest
classification trees
non-parametric classification models
supervised learning
R language
Python language
Bosques aleatorios
árboles de clasificación
modelos no paramétricos de clasificación
aprendizaje supervisado
lenguaje R
lenguaje Python
Description
Summary: This article applies Random Forest, a non-parametric supervised learning method, as an extension of classification trees. The Random Forest algorithm arises from combining several classification trees. In essence, it randomly selects a number of variables with which each individual tree is built; each tree makes its prediction with those variables, and the Random Forest prediction is the class that receives the most votes among the generated trees. For the application, we worked with 3168 recorded voices, presenting the results of an acoustic analysis that records variables such as frequency, spectrum, and modulation, among others, in order to obtain a pattern for identifying and classifying gender through a voice identifier. The dataset is open access and can be downloaded from the Kaggle web platform at <https://www.kaggle.com/primaryobjects/voicegender>. The model was developed in the statistical program R. Additional applications were carried out in Python by developing classification algorithms.
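The voting scheme described in the summary can be illustrated with a minimal Python sketch. It is not the authors' R or Python code. It makes several simplifying assumptions: each "tree" is a one-split stump, each stump is trained on a bootstrap resample (a standard Random Forest ingredient that the summary does not mention), and the two-feature toy data are invented for illustration rather than taken from the Kaggle voice dataset. All function names are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def fit_stump(rows, labels, features):
    """Fit a one-split tree, searching only the given feature indices."""
    majority = Counter(labels).most_common(1)[0][0]
    best = None  # (weighted impurity, feature, threshold, left label, right label)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    if best is None:  # no valid split: always predict the majority class
        return (None, None, majority, majority)
    return best[1:]


def predict_stump(stump, row):
    """Predict the class of one row with a fitted stump."""
    f, t, left_label, right_label = stump
    if f is None:
        return left_label
    return left_label if row[f] <= t else right_label


def fit_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    """Build n_trees stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]          # bootstrap resample (assumption)
        feats = rng.sample(range(len(rows[0])), n_features)  # random variable selection
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feats))
    return forest


def predict_forest(forest, row):
    """Return the class that receives the most votes across all trees."""
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]


# Invented toy data: two acoustic-style features per voice (not the real dataset).
rows = [(0.10, 0.20), (0.12, 0.25), (0.11, 0.22), (0.13, 0.21),
        (0.20, 0.40), (0.22, 0.45), (0.21, 0.42), (0.23, 0.41)]
labels = ["male"] * 4 + ["female"] * 4

forest = fit_forest(rows, labels)
print(predict_forest(forest, (0.05, 0.10)))
print(predict_forest(forest, (0.30, 0.50)))
```

In practice one would more likely rely on an established implementation, such as the randomForest package in R or RandomForestClassifier in scikit-learn, which grow full trees and typically resample the candidate variables at every split rather than once per tree.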
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or dataset is hosted. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA).