Essay on Data Mining Predictive Models for Early-Stage Diabetes

Bibliographic Details
Author: Leiva Quispe, José Enrique
Format: undergraduate thesis
Publication Date: 2023
Institution: Universidad Nacional de Trujillo
Repository: UNITRU-Tesis
Language: Spanish
OAI Identifier: oai:dspace.unitru.edu.pe:20.500.14414/18558
Resource Link: https://hdl.handle.net/20.500.14414/18558
Access Level: open access
Subject: Diabetes risk
Early stage
Data mining
Logistic regression
Support vector machine
Evaluation metrics
Description
Abstract: Diabetes has become a common yet deadly chronic health problem, and its prevalence has increased dramatically in recent years. About 50% of all people with diabetes are undiagnosed because of the disease's long asymptomatic phase, which is why detecting diabetes at an early stage is vitally important. Health science has advanced to the point that data mining classification techniques are well accepted by the scientific community for predictive modeling of disease risk. This study used a data set of 520 records collected through a direct survey of patients at the Sylhet Diabetes Hospital in Bangladesh. The analysis was carried out with two classification algorithms: Logistic Regression (a classical statistical technique) and Support Vector Machine (a machine learning technique). After fitting the models and evaluating them with accuracy, sensitivity, and AUC (in that order), the Support Vector Machine model was found to have better fit and predictive power (0.98, 0.98, 0.99) than the logistic regression model (0.92, 0.94, 0.97). Finally, recommendations are offered for controlling risk factors.
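As an illustration of the three evaluation metrics named in the abstract, the following is a minimal sketch in plain Python. It is not the author's code and does not use the thesis data: the labels, the scores, the 0.5 decision threshold, and the function names are all hypothetical and were chosen only to show how accuracy, sensitivity, and AUC are computed from a classifier's output.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of cases whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True positive rate: correctly flagged positives over all actual positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the thesis): 1 = diabetic, 0 = non-diabetic.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical model scores, e.g. predicted probabilities of class 1.
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(y_true, y_pred), sensitivity(y_true, y_pred), auc(y_true, scores))
```

Accuracy and sensitivity depend on the chosen threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is one reason the three metrics are often reported together.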
Important note:
The information contained in this record is the sole responsibility of the institution that manages the institutional repository where this document or data set is hosted. CONCYTEC is not responsible for the content (publications and/or data) accessible through the Repositorio Nacional Digital de Ciencia, Tecnología e Innovación de Acceso Abierto (ALICIA).